From: Grant Edwards
To: ecos-discuss@ecos.sourceware.org
Date: Fri, 25 Jan 2013 22:07:00 -0000
Subject: [ECOS] ARP-related problem
Message-ID: <20130125220711.GA14802@grante>

I've been examining Ethernet traces from a customer site where there appear to be problems associated with the eCos ARP table entry timeouts. The application involves an eCos app that exchanges blocks of data with a Windows PC every 60-80 milliseconds.
Everything runs fine for hours at a time, but there are occasional failures where the network traffic just sort of "stops" and the TCP connection has to be re-established. These failures only happen when the eCos network stack sends out an ARP request asking about the Windows PC's IP address. The ARP requests usually don't cause problems, but sometimes all traffic between the two endpoints of the TCP/IP connection stops for 5-6 seconds -- then the applications time out, close the TCP connection and establish a new one.

Upon examining the network trace, I see ARP requests for the Windows PC's IP address from the eCos network stack every 20 minutes exactly. This happens even though there is a constant stream of packets being received from and sent to that IP address for the entire 20 minutes between ARP requests.

From what I can tell looking at src/sys/if_ether.c, when an ARP table entry gets to be 20 minutes old it is deleted without any attempt to refresh it. Then, a few milliseconds later, a packet is to be sent to that IP address, so an ARP request is sent. Any further attempts to send packets before the ARP reply has been processed result in tx packets being discarded.

Why not attempt to refresh ARP table entries _before_ they expire and we start discarding tx packets? If we received an Ethernet packet from an IP address 10ms ago, why do we even _need_ to send an ARP request? The Ethernet address from 10ms ago is probably still valid...

-- 
Grant Edwards (grant.b.edwards at gmail.com)
Yow! I am a traffic light, and Alan Ginzberg kidnapped my laundry in 1927!

-- 
Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos
and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss
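
For illustration, the two suggestions in the message (refresh an entry before it expires, and treat recently received traffic as confirmation) could be sketched roughly as below. This is a standalone toy model, not the eCos code in src/sys/if_ether.c. Every name here (arp_entry, arp_confirm, arp_on_transmit, ARP_REFRESH_S) is hypothetical, and the 30-second refresh window is an arbitrary assumption; only the 20-minute lifetime comes from the trace described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical constants: only the 20-minute lifetime is taken from the post. */
#define ARP_LIFETIME_S (20u * 60u) /* entry lifetime seen in the trace */
#define ARP_REFRESH_S  30u         /* assumed: re-ARP this long before expiry */

enum arp_action {
    ARP_USE,             /* entry is fresh: just transmit */
    ARP_USE_AND_REFRESH, /* transmit with the cached MAC and also send an ARP request */
    ARP_RESOLVE          /* no usable entry: must resolve before transmitting */
};

/* Hypothetical, simplified ARP table entry. */
struct arp_entry {
    uint8_t  mac[6];
    uint32_t confirmed_at; /* time in seconds of the last confirmation */
    bool     valid;
};

/* Record that the IP-to-MAC mapping was confirmed at time 'now', either by
 * an ARP reply or (the second suggestion) by a frame received from the peer. */
void arp_confirm(struct arp_entry *e, const uint8_t mac[6], uint32_t now)
{
    assert(e != NULL);
    memcpy(e->mac, mac, 6);
    e->confirmed_at = now;
    e->valid = true;
}

/* Decide what to do when a packet is about to be sent to this entry's IP. */
enum arp_action arp_on_transmit(const struct arp_entry *e, uint32_t now)
{
    assert(e != NULL);
    if (!e->valid)
        return ARP_RESOLVE;

    uint32_t age = now - e->confirmed_at; /* unsigned math tolerates wraparound */

    if (age >= ARP_LIFETIME_S)
        return ARP_RESOLVE;
    if (age >= ARP_LIFETIME_S - ARP_REFRESH_S)
        return ARP_USE_AND_REFRESH;
    return ARP_USE;
}
```

In this model a busy connection never reaches ARP_RESOLVE: either received traffic keeps resetting confirmed_at, or the request sent during the refresh window is answered while the cached address is still in use. Whether it is acceptable to confirm an entry from ordinary received frames (rather than ARP replies only) is a policy decision the sketch does not settle.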