From: Wayne Visser
Date: Tue, 14 Aug 2007 13:18:00 -0000
To: ecos-discuss@sourceware.org
Subject: Re: [ECOS] High priority thread versus network - FOLLOW-UP
Message-ID: <46C1AB86.6080008@lszpaper.com>
In-Reply-To: <463F213D.6040907@lszpaper.com>

Wayne Visser wrote:
> Hello all,
>
> We're having a problem with an eCos app that has a relatively
> long-running, high priority thread (runs at priority 2 every 10 ms and
> takes about 4ms to complete). Under high network loads, the app will
> crash with no asserts or panics. If the high priority thread is
> disabled, the app will run fine for days without problem under high net
> loads. Conversely, without any networking activity, the app runs fine
> for days.
>

Hello all,

This is a follow-up to some mysterious crashes we were seeing related to network activity.
Related posts are here:

https://bugzilla.ecoscentric.com/show_bug.cgi?id=1000403
http://sourceware.org/ml/ecos-discuss/2007-03/msg00024.html
http://sourceware.org/ml/ecos-discuss/2007-05/msg00046.html

This was seen on an i386 platform (Advantech PCM3370). We noticed that when the AGP aperture was reduced, the observed problem apparently 'disappeared', leading me to think we had some type of memory conflict between the aperture and the ethernet card. Whether or not a high-priority thread was running and stealing time away from the networking threads turned out not to be causal: network activity on its own was enough to cause crashes. No asserts or panics were raised at a crash (apart from me perhaps :-0).

It's not entirely clear why a change (specifically a reduction) in AGP aperture eliminates the assumed memory conflict. We also observed that crashing was most frequent when the aperture was set to 1/2 of the main memory size.

The board uses a Via VT8606 Northbridge (ProSavage PN133T). On our board's BIOS it's possible to reduce but not completely disable the aperture, so we did some research and found a method to disable it programmatically, which is probably better anyway. Since this Northbridge is fairly common in PC104 boards, maybe someone else is seeing similar crashes, so here's how we ended up disabling the aperture.

NOTE: This is not a totally satisfying fix, since we don't completely understand the problem, but in the 2 months since we started doing this we have not recorded a single crash on our test boards.

// *******************************************************************
#include <cyg/io/pci.h>       // cyg_pci_* calls used below
#include <cyg/infra/diag.h>   // diag_printf (debug output only)
//
// ...
// ...
// ...
// *******************************************************************
#define DEBUG_VT8606_SETUP 0
//
// ...
// ...
// ...
// *******************************************************************
// device/vendor id matching function for VT8606 Northbridge
//
static cyg_bool pci_find_match_func(cyg_uint16 v, cyg_uint16 d,
                                    cyg_uint32 c, void *p)
{
    // vendor ID for Via Technologies = 0x1106
    // device ID for VT8606 = 0x0605
    return ((v == 0x1106) && (d == 0x0605));
}

// *******************************************************************
// Disable graphics aperture feature in Via VT8606 Northbridge. Call
// this function as soon after startup as possible (i.e. before setting
// up PCI devices). This function is benign if no VT8606 exists.
//
static void chipset_init(void)
{
    cyg_pci_device_id pci_device_id;
    cyg_pci_device pci_device_info;
#if DEBUG_VT8606_SETUP > 0
    cyg_uint8 b;
#endif
    cyg_uint32 dw;

    pci_device_id = CYG_PCI_NULL_DEVID;
    cyg_pci_init();
    if (cyg_pci_find_matching(&pci_find_match_func, NULL, &pci_device_id)) {
        cyg_pci_get_device_info(pci_device_id, &pci_device_info);
        if (cyg_pci_configure_device(&pci_device_info)) {
            // read GA base (debug only)
#if DEBUG_VT8606_SETUP > 0
            cyg_pci_read_config_uint32(pci_device_info.devid, 0x10, &dw);
            diag_printf(" GA BASE (0x10): 0x%08x\n", dw);
#endif
            // read TLB register and clear bit 1 to disable the aperture
            cyg_pci_read_config_uint32(pci_device_info.devid, 0x88, &dw);
#if DEBUG_VT8606_SETUP > 0
            diag_printf(" GA TLB (0x88): 0x%08x\n", dw);
#endif
            dw &= ~2;
            cyg_pci_write_config_uint32(pci_device_info.devid, 0x88, dw);
#if DEBUG_VT8606_SETUP > 0
            cyg_pci_read_config_uint32(pci_device_info.devid, 0x88, &dw);
            diag_printf(" GA TLB (after disabling aperture) (0x88): 0x%08x\n", dw);
#endif
            // read aperture size and set to 0
#if DEBUG_VT8606_SETUP > 0
            cyg_pci_read_config_uint8(pci_device_info.devid, 0x84, &b);
            diag_printf(" Aperture Size (0x84): 0x%02x\n", b);
#endif
            cyg_pci_write_config_uint8(pci_device_info.devid, 0x84, 0);
#if DEBUG_VT8606_SETUP > 0
            cyg_pci_read_config_uint8(pci_device_info.devid, 0x84, &b);
            diag_printf(" Aperture Size (after setting to 0) (0x84): 0x%02x\n", b);
#endif
        }
    }
}
// *******************************************************************

--
Wayne Visser
LSZ PaperTech Inc.
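
The two register updates made by chipset_init() can be checked off-target before trying them on hardware. What follows is only a minimal host-side sketch, not the code we run on the board: it replaces PCI configuration space with a plain byte array, and the names fake_cfg_space, cfg_read32, cfg_write32, disable_aperture and the VT8606_* macros are hypothetical, made up for illustration. The offsets (0x88, 0x84) and the bit that is cleared (bit 1) are taken from the code above; their meaning is as described in its comments, not verified here against a datasheet.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Offsets and bit value copied from the posted chipset_init();
 * the symbolic names are hypothetical, not from any VIA or eCos header. */
#define VT8606_REG_GA_SIZE   0x84u  /* 8-bit register, written with 0 */
#define VT8606_REG_GA_TLB    0x88u  /* 32-bit register, bit 1 cleared */
#define VT8606_GA_ENABLE_BIT 0x2u

/* Stand-in for one device's 256-byte PCI configuration space. */
typedef struct {
    uint8_t bytes[256];
} fake_cfg_space;

/* Read a 32-bit value from the simulated configuration space. */
static uint32_t cfg_read32(const fake_cfg_space *c, unsigned off)
{
    uint32_t v;
    memcpy(&v, &c->bytes[off], sizeof v);
    return v;
}

/* Write a 32-bit value to the simulated configuration space. */
static void cfg_write32(fake_cfg_space *c, unsigned off, uint32_t v)
{
    memcpy(&c->bytes[off], &v, sizeof v);
}

/* Same read-modify-write sequence as chipset_init(), minus the eCos calls:
 * clear bit 1 of the register at 0x88, then zero the byte at 0x84. */
static void disable_aperture(fake_cfg_space *c)
{
    uint32_t dw = cfg_read32(c, VT8606_REG_GA_TLB);
    dw &= ~(uint32_t)VT8606_GA_ENABLE_BIT;
    cfg_write32(c, VT8606_REG_GA_TLB, dw);
    c->bytes[VT8606_REG_GA_SIZE] = 0;
}
```

A read-modify-write is used for the register at 0x88 so that every bit other than bit 1 keeps whatever value the BIOS programmed; only the byte at 0x84 is overwritten outright.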