From: John Mills
To: eCos Users
cc: Alok Singh
Date: Fri, 28 Sep 2007 13:31:00 -0000
Subject: RE: [ECOS] Re: Network TCP Handler: stale socket disposal
Mailing-List: contact ecos-discuss-help@ecos.sourceware.org; run by ezmlm
X-SW-Source: 2007-09/txt/msg00152.txt.bz2

Alok -

Thanks for your feedback. That makes the success rate 50:50 (2 of 4 respondents) for the patch.

The web server in our product is a somewhat secondary administrative function, and we serve simple content that we control 100%. That allows me to summarily close many browser inquiries. I have also added and fixed the POST code so it handles our binary firmware images. Either of these changes may have closed some vulnerabilities that may be affecting you - I don't know.

Operationally, our problem was triggered by vulnerability scanners used by our customers' SysAdmins. These locked up our product in the course of their overall test scenarios. In principle that meant we were also vulnerable to the corresponding hacker exploits.
That's what I meant in an earlier post about a specific, observed functional problem.

The lock-up is broader than just web service. When the socket-descriptor pool ('zone') is depleted, no new network sockets can be allocated. This affected other, primary functions in our product, making it a critical issue.

I traced the problem by putting 'diag_printf' lines at the points where sockets were created and deallocated, working down to find "what didn't happen" when a socket was lost. Sounds like you have the same road ahead of you.

Thanks again for your response.

 - John Mills

On Fri, 28 Sep 2007, Alok Singh wrote:

> Hi John/everybody,
>
> The patch didn't work for me. I still had all the sockets exhausted, so the web server hangs and doesn't accept any new connections. The number of sockets I configure in eCos is 32. Please see the dump of "cyg_kmem_print_stats()" below, taken when the problem occurs.
>
> Test case: I have a script that opens and closes a connection to the web server every second. It takes around 2 hours to exhaust the SOCKETS zone. The free count in the TCP zone also drops. Even if I stop the script, the sockets never recover. I'm currently debugging it (trying to understand TCP by reading Comer and Stevens).
>
> Any ideas are welcome.
>
> ***************
> cyg_kmem_print_stats() -
> Network stack mbuf stats:
>   mbufs 32, clusters 6, free clusters 6
>   Failed to get 0 times
>   Waited to get 0 times
>   Drained queues to get 0 times
> VM zone 'ripcb':
>   Total: 32, Free: 32, Allocs: 0, Frees: 0, Fails: 0
> VM zone 'tcpcb':
>   Total: 32, Free: 1, Allocs: 3989, Frees: 3958, Fails: 0
> VM zone 'udpcb':
>   Total: 32, Free: 31, Allocs: 14, Frees: 13, Fails: 0
> VM zone 'socket':
>   Total: 32, Free: 0, Allocs: 10319, Frees: 3971, Fails: 6316
> Misc mpool: total 131056, free 79008, max free block 77972
> Mbufs pool: total 130944, free 128768, blocksize 128
> Clust pool: total 262144, free 247808, blocksize 2048
> ***********************************************************
>
> -Alok

--
Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos
and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss
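
The zone counters in the quoted cyg_kmem_print_stats() dump can be cross-checked with a little arithmetic. What follows is only a hosted C sketch, not eCos code: the struct `zone_stats` and the helpers `zone_in_use`, `zone_is_consistent` and `print_zone_report` are hypothetical names introduced for illustration. It also assumes that 'Allocs' counts allocation attempts, including the failed ones; that is inferred from the figures in the dump and has not been checked against the stack source.

```c
#include <stdio.h>

/* Hypothetical container for one "VM zone" line of the dump. */
struct zone_stats {
    const char *name;
    long total, free_count, allocs, frees, fails;
};

/* Elements currently held, derived from the counters.
 * Assumption: 'allocs' counts attempts, so failed attempts are subtracted. */
long zone_in_use(const struct zone_stats *z)
{
    return z->allocs - z->fails - z->frees;
}

/* Returns 1 when the derived in-use count agrees with Total - Free. */
int zone_is_consistent(const struct zone_stats *z)
{
    return zone_in_use(z) == z->total - z->free_count;
}

/* Figures copied from the dump: total, free, allocs, frees, fails. */
const struct zone_stats dump_zones[] = {
    { "ripcb",  32, 32,     0,    0,    0 },
    { "tcpcb",  32,  1,  3989, 3958,    0 },
    { "udpcb",  32, 31,    14,   13,    0 },
    { "socket", 32,  0, 10319, 3971, 6316 },
};

/* Print one summary line per zone. */
void print_zone_report(void)
{
    size_t i;

    for (i = 0; i < sizeof dump_zones / sizeof dump_zones[0]; i++) {
        const struct zone_stats *z = &dump_zones[i];

        printf("%-6s in use: %2ld of %ld (%s)\n",
               z->name, zone_in_use(z), z->total,
               zone_is_consistent(z) ? "consistent" : "inconsistent");
    }
}
```

Under that assumption the 'socket' zone holds 10319 - 6316 - 3971 = 32 elements, 'tcpcb' holds 3989 - 3958 = 31 and 'udpcb' holds 14 - 13 = 1. Since 31 + 1 = 32, every held socket appears to be matched by a protocol control block, which suggests (but does not prove) that the TCP control blocks are not being released either, so the search may need to cover the TCP close path as well as socket deallocation.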