Date: Mon, 27 Aug 2007 08:12:00 -0000
From: Andrew Lunn
To: Rick Davis
Cc: ecos-discuss@ecos.sourceware.org
Message-ID: <20070827081225.GZ31057@lunn.ch>
References: <003601c7e876$4e9ee690$ebdcb3b0$@net>
In-Reply-To: <003601c7e876$4e9ee690$ebdcb3b0$@net>
Subject: Re: [ECOS] network problem

On Mon, Aug 27, 2007 at 02:48:42AM -0400, Rick Davis wrote:
> I have a device using the MPC859T processor that has a small web server
> running using the standard eCos web server. I have a status page that
> auto-refreshes every 15 seconds and I am pinging the unit every second (Yes,
> I have a customer that is actually doing this). I don't really know what
> other network activity is occurring at the customer's site but my test lab
> has Windows network chatter going on. After about 12 or so hours the web
> stops responding and the unit can no longer be pinged. The FEC Ethernet
> driver is receiving packets and is calling the eth_drv_dsr but the deliver
> function is never called.
>
> I have been tracking this down for some time and have noticed the
> following...
>
> 1. The alarm thread in timeout.c is getting blocked when calling
> splx_internal() just before the call to eth_drv_run_deliveries().
> 2. The current value of spl_state in sync.c is 4 (SPL_NET)
>
> Any ideas why the network would not release the splx_mutex?
> Any suggestion on how to further track this down?
> I don't have a GDB interface on my platform. :(

What vintage of eCos are you using? If you go back far enough into the
mists of time, there was at least one bug fix for alarms, but that was
a long time ago.

Do you have asserts enabled? They might give some clues.

You could also enable CYGIMPL_TRACE_SPLX and call show_sched_events()
when you hit the deadlock. That should tell you which function is
holding the mutex. You might want to add __builtin_return_address(0)
to the log structure, so you can see one more level up the call stack.
Otherwise I think you will just get spi_slpnet, which is not much use.

        Andrew
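
As a rough illustration of the __builtin_return_address(0) suggestion, here is a minimal, self-contained sketch in plain C for GCC or Clang. It is not the eCos code: the log structure, the ring size, and the names trace_record(), trace_count(), trace_get(), trace_dump() and demo_splnet() are all hypothetical, and the real CYGIMPL_TRACE_SPLX log may be laid out differently. The only point it shows is that a return address captured inside the spl function identifies the code that called it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define TRACE_LOG_SIZE 32u   /* hypothetical ring size */
#define DEMO_SPL_NET   4u    /* spl_state value reported in the thread */

/* Hypothetical log entry; the real eCos structure may differ. */
struct trace_event {
    const char *func;       /* function that changed the spl level */
    const char *file;
    int         line;
    unsigned    spl_state;  /* level recorded with the event */
    void       *caller;     /* return address: one level up the stack */
};

static struct trace_event trace_log[TRACE_LOG_SIZE];
static unsigned trace_next;  /* total number of events recorded so far */

/* Store one event, overwriting the oldest entry once the ring is full. */
void trace_record(const char *func, const char *file, int line,
                  unsigned spl_state, void *caller)
{
    struct trace_event *ev = &trace_log[trace_next % TRACE_LOG_SIZE];

    ev->func = func;
    ev->file = file;
    ev->line = line;
    ev->spl_state = spl_state;
    ev->caller = caller;
    trace_next++;
}

/* Number of events currently retained in the ring. */
unsigned trace_count(void)
{
    return trace_next < TRACE_LOG_SIZE ? trace_next : TRACE_LOG_SIZE;
}

/* Fetch a retained event; index 0 is the oldest one still in the ring. */
const struct trace_event *trace_get(unsigned i)
{
    unsigned first;

    assert(i < trace_count());
    first = trace_next - trace_count();
    return &trace_log[(first + i) % TRACE_LOG_SIZE];
}

/* Print the retained events, oldest first. */
void trace_dump(void)
{
    unsigned i;

    for (i = 0; i < trace_count(); i++) {
        const struct trace_event *ev = trace_get(i);
        printf("%s:%d %s spl=%u caller=%p\n",
               ev->file, ev->line, ev->func, ev->spl_state, ev->caller);
    }
}

/* Stand-in for an spl function. noinline keeps the return address
 * pointing into whichever function called demo_splnet(). */
__attribute__((noinline))
void demo_splnet(void)
{
    trace_record(__func__, __FILE__, __LINE__, DEMO_SPL_NET,
                 __builtin_return_address(0));
}
```

On a target without GDB, the printed caller address can be matched against the linker map file, or passed to addr2line together with the ELF image, to find the function that took the lock.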