From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 11660 invoked by alias); 12 Jun 2007 11:19:08 -0000
Received: (qmail 11650 invoked by uid 22791); 12 Jun 2007 11:19:07 -0000
X-Spam-Check-By: sourceware.org
Received: from mx-dk.vsc.vitesse.com (HELO mx-dk.vsc.vitesse.com) (217.74.214.45) by sourceware.org (qpsmtpd/0.31) with ESMTP; Tue, 12 Jun 2007 11:19:02 +0000
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Date: Tue, 12 Jun 2007 15:49:00 -0000
Message-ID: <376637F07F8A9242AD11921B15FA17DC1E6450@mx-dk.vsc.vitesse.com>
In-Reply-To: <466E8C04.90307@ds3switch.com>
References: <466DDDCF.1040506@ds3switch.com> <20070611230045.GI26816@lunn.ch> <466DE365.1090800@ds3switch.com> <20070612034851.GJ26816@lunn.ch> <466E8C04.90307@ds3switch.com>
From: "Lars Povlsen"
To: "eCos Disuss"
X-IsSubscribed: yes
Mailing-List: contact ecos-discuss-help@ecos.sourceware.org; run by ezmlm
Precedence: bulk
List-Id:
List-Subscribe:
List-Archive:
List-Post:
List-Help: ,
Sender: ecos-discuss-owner@ecos.sourceware.org
Subject: RE: [ECOS] Re: accept() FreeBSD hangs when out of resources
X-SW-Source: 2007-06/txt/msg00126.txt.bz2

This seems a lot like the problem I've seen - and reported on 17/4-07.
I've been able to occasionally reproduce it manually with a browser
(MSIE), but enabling TCP debug logging causes the problem to go away
(not occur). AFAICS, it is a race condition in the TCP stack causing
socket buffers to be leaked (forever).

Calling cyg_kmem_print_stats() displays the problem (but you need a
reset to recover :-( ):

Network stack mbuf stats:
   mbufs 97, clusters 60, free clusters 1
   Failed to get 0 times
   Waited to get 0 times
   Drained queues to get 0 times
VM zone 'ripcb': Total: 64, Free: 64, Allocs: 0, Frees: 0, Fails: 0
VM zone 'tcpcb': Total: 64, Free: 61, Allocs: 353, Frees: 350, Fails: 0
VM zone 'udpcb': Total: 64, Free: 63, Allocs: 4, Frees: 3, Fails: 0
VM zone 'socket': Total: 64, *Free: 0*, Allocs: 365, Frees: 293, Fails: 8
Misc mpool: total 98304, free 4192, max free block 3748
Mbufs pool: total 81792, free 69248, blocksize 128
Clust pool: total 163840, free 38912, blocksize 2048

FWIW, I have not had time to dig into this (as my attempts to produce a
test bench have failed...)

---Lars

-----Original Message-----
From: ecos-discuss-owner@ecos.sourceware.org [mailto:ecos-discuss-owner@ecos.sourceware.org] On Behalf Of Tad
Sent: 12. juni 2007 14:05
To: eCos Disuss
Subject: Re: [ECOS] Re: accept() FreeBSD hangs when out of resources

Andrew Lunn wrote:
> On Mon, Jun 11, 2007 at 04:05:57PM -0800, Tad wrote:
>
>> Andrew Lunn wrote:
>>
>>> On Mon, Jun 11, 2007 at 03:42:07PM -0800, Tad wrote:
>>>
>>>>>> accept() won't return and won't timeout (>12hrs) when listen()
>>>>>> indicates a new connection, if out of sockets/file-descriptors and all
>>>>>> TCP connections are in ESTABLISHED state.
>>>>>>
>>>>> Where exactly is it blocked. Please could you provide a call stack.
>>>>>
more info. seems to be dependent on CYGNUM_FILEIO_NFILE rather than
CYGPKG_NET_MAXSOCKETS. reducing NFILE < MAXSOCKETS causes accept to
hang with fewer established connections than before reduction.
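
For anyone wanting to check a captured dump for the same symptom, here
is a rough sketch in Python (run on a host, not on the target) that
reads the "VM zone" lines from the cyg_kmem_print_stats() output quoted
above and reports which zones have no free entries. The function names
are made up for illustration. The last helper rests on an assumption
that is not confirmed anywhere in this thread: that every socket in use
should be backed by one entry in the ripcb, tcpcb or udpcb zones, so a
large surplus of sockets over control blocks would hint at leaked
sockets.

```python
import re

# Zone lines copied from the cyg_kmem_print_stats() output in the message.
STATS = """\
VM zone 'ripcb': Total: 64, Free: 64, Allocs: 0, Frees: 0, Fails: 0
VM zone 'tcpcb': Total: 64, Free: 61, Allocs: 353, Frees: 350, Fails: 0
VM zone 'udpcb': Total: 64, Free: 63, Allocs: 4, Frees: 3, Fails: 0
VM zone 'socket': Total: 64, *Free: 0*, Allocs: 365, Frees: 293, Fails: 8
"""

ZONE_RE = re.compile(
    r"VM zone '(?P<name>[^']+)':\s*Total:\s*(?P<total>\d+),"
    r"\s*Free:\s*(?P<free>\d+),\s*Allocs:\s*(?P<allocs>\d+),"
    r"\s*Frees:\s*(?P<frees>\d+),\s*Fails:\s*(?P<fails>\d+)"
)


def parse_zones(text):
    """Return {zone name: counters}, adding in_use = total - free."""
    zones = {}
    # Drop the '*' emphasis markers that were added by hand in the mail.
    for match in ZONE_RE.finditer(text.replace("*", "")):
        fields = {k: int(v) for k, v in match.groupdict().items() if k != "name"}
        fields["in_use"] = fields["total"] - fields["free"]
        zones[match.group("name")] = fields
    return zones


def exhausted(zones):
    """Names of zones that have no free entries left."""
    return [name for name, z in zones.items() if z["free"] == 0]


def sockets_without_pcb(zones, pcb_zones=("ripcb", "tcpcb", "udpcb")):
    """Sockets in use minus protocol control blocks in use.

    ASSUMPTION: one control block per live socket; see the note above.
    """
    pcbs = sum(zones[n]["in_use"] for n in pcb_zones if n in zones)
    return zones["socket"]["in_use"] - pcbs


if __name__ == "__main__":
    zones = parse_zones(STATS)
    print("exhausted zones:", exhausted(zones))
    print("sockets in use:", zones["socket"]["in_use"])
    print("sockets without a control block:", sockets_without_pcb(zones))
```

On the figures above the socket zone is the only exhausted one, with all
64 entries in use while only 4 control blocks are allocated, which at
least fits the leak theory. It does not say where the leak happens.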

-- 
Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos
and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss