Date: Tue, 28 Aug 2007 18:42:00 -0000
From: John Mills
To: eCos Users
cc: Rick Davis
Subject: RE: [ECOS] network problem

Rick -

You could try drastically reducing the capacity of the pool you think
you're depleting. That way you might generate the lockup in a few minutes
or an hour, instead of 12 hours.

 - John Mills

On Tue, 28 Aug 2007, John Mills wrote:

> Rick -
>
> I have just run to ground a problem with very similar symptoms. It turned
> out that the 'socket' pool ("zone") was depleted by unrecoverable, stale
> TCP connection records. I tracked this down by adding a counter for
> allocated/deallocated data structures from that pool and diagnostic
> printouts of the count as sockets were allocated or freed. Though we
> noticed the problem with web inquiries, it turned out to have other
> effects - like your inability to open a 'telnet' connection.
>
> As I understand eCos 'zones', each is initially allocated a fixed memory
> block based on the size of a specific data structure and the number of
> such structures it is expected to provide. Thus a simple counter will
> reflect where you stand with respect to a particular pool's capacity and
> you probably don't have to dig into the zone alloc/dealloc mechanism.
>
> HTH.
>
>  - John Mills
>
> DISCLAIMER: I'm a relative beginner with eCos.
>
> On Tue, 28 Aug 2007, Rick Davis wrote:
>
> > Andrew,
> >
> > After I sent the e-mail this morning, it stopped working in another way.
> > http stopped responding but pings still worked. I have a simple telnet
> > server on my application and it was failing trying to bind with "Try again
> > later". In_pcbbind was failing because in_pcbinshash was failing indicating
> > it couldn't MALLOC memory. I turned on fancy asserts and tracing and am
> > testing again. It usually takes 12 hours or more to fail. In_pcbhash MALLOCs
> > from the network pool so I am monitoring that. Any ideas why the network
> > pool would run out of memory? Can it get fragmented?
> >
> > Thanks,
> > Rick Davis

-- 
 - John Mills
   john.m.mills@alum.mit.edu

-- 
Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos
and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss
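
For illustration only, here is a rough sketch of the two ideas discussed in this thread: an in-use counter with diagnostic printouts on a fixed-size pool, and a deliberately tiny pool capacity so that a leak shows up quickly. This is not the eCos zone allocator or any part of the eCos network stack. Every name here (POOL_CAPACITY, conn_record, pool_alloc, pool_free, simulate_leak) is hypothetical, and the "leak" is simulated by simply never freeing every other record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical capacity, kept tiny on purpose so exhaustion shows up fast. */
#define POOL_CAPACITY 4

/* Stand-in for whatever fixed-size record the real pool hands out. */
struct conn_record {
    int id;
    struct conn_record *next_free;
};

static struct conn_record pool_storage[POOL_CAPACITY];
static struct conn_record *free_list;
static int pool_in_use;            /* the diagnostic counter */

/* Thread every record onto the free list and reset the counter. */
static void pool_init(void)
{
    free_list = NULL;
    pool_in_use = 0;
    for (int i = POOL_CAPACITY - 1; i >= 0; i--) {
        pool_storage[i].next_free = free_list;
        free_list = &pool_storage[i];
    }
}

/* Take one record; count it and print where the pool stands. */
static struct conn_record *pool_alloc(void)
{
    struct conn_record *r = free_list;
    if (r == NULL) {
        printf("pool: alloc FAILED, in use %d/%d\n", pool_in_use, POOL_CAPACITY);
        return NULL;
    }
    free_list = r->next_free;
    pool_in_use++;
    printf("pool: alloc ok, in use %d/%d\n", pool_in_use, POOL_CAPACITY);
    return r;
}

/* Return one record; a count below zero would mean a double free. */
static void pool_free(struct conn_record *r)
{
    assert(pool_in_use > 0);
    r->next_free = free_list;
    free_list = r;
    pool_in_use--;
    printf("pool: free, in use %d/%d\n", pool_in_use, POOL_CAPACITY);
}

/*
 * Simulated workload: every odd-numbered "connection" is never freed,
 * standing in for a stale record. Returns how many requests succeeded
 * before the pool ran dry.
 */
static int simulate_leak(int connections)
{
    for (int i = 0; i < connections; i++) {
        struct conn_record *r = pool_alloc();
        if (r == NULL)
            return i;
        r->id = i;
        if (i % 2 == 0)
            pool_free(r);
    }
    return connections;
}
```

Calling pool_init() followed by simulate_leak(100) with the capacity above, the printed count climbs by one for each leaked record and the ninth request fails. With a realistic capacity the same leak would take proportionally longer to surface, which is the point of shrinking the pool while hunting for it.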