public inbox for ecos-discuss@sourceware.org
* [ECOS] bug in ARP FSM with fragmented packets?
@ 2008-01-21 11:18 Jürgen Lambrecht
  2008-01-21 11:48 ` Andrew Lunn
  0 siblings, 1 reply; 5+ messages in thread
From: Jürgen Lambrecht @ 2008-01-21 11:18 UTC (permalink / raw)
  To: eCos Discussion

Hello,

I'm investigating a problem where UDP packets get lost when there is an 
ARP table timeout.
- It only happens with UDP, not with TCP; IPv4.
- It only happens with fragmented IP packets, so with UDP packets bigger 
than the Ethernet MTU of 1500 bytes (1518 bytes is the maximum frame size).
- I use the FreeBSD stack, version from CVS as of the end of 2007.
This is what I see with Ethereal: The application receives a UDP request 
packet, and must send back a UDP reply packet. Instead an ARP 
request/reply happens, and the packet gets lost.
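
For reference, the number of IP fragments a given UDP payload needs can
be estimated as follows. This is only a sketch in C under stated
assumptions: a standard Ethernet IP MTU of 1500 bytes (1518 bytes being
the maximum frame size with the 14-byte header and 4-byte FCS), a
20-byte IPv4 header without options, and an 8-byte UDP header. The
helper name udp_fragment_count is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values, not taken from the thread: standard Ethernet IP MTU,
 * IPv4 header without options, fixed-size UDP header. */
enum { IP_MTU = 1500, IPV4_HDR_LEN = 20, UDP_HDR_LEN = 8 };

/* Number of IP fragments needed to carry a UDP datagram with
 * udp_payload_len bytes of application data (hypothetical helper). */
static size_t udp_fragment_count(size_t udp_payload_len, size_t mtu)
{
    assert(mtu > (size_t)(IPV4_HDR_LEN + 8));

    /* The IP payload is the UDP header plus the application data. */
    size_t ip_payload = UDP_HDR_LEN + udp_payload_len;

    /* Every fragment except the last carries a multiple of 8 bytes. */
    size_t per_fragment = ((mtu - IPV4_HDR_LEN) / 8) * 8;

    /* Round up: a partially filled last fragment still counts. */
    return (ip_payload + per_fragment - 1) / per_fragment;
}
```

With these assumptions a payload of up to 1472 bytes goes out as one
packet; anything larger is split into two or more fragments.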

Using static ARP table entries solves the problem of course, but that is 
not acceptable.

Has anybody seen such a problem before?
Or does anybody have a clue where to start looking?

I will do new and more detailed tests next week, and start looking into 
the bsd_tcpip code.

Kind regards,

-- 
Jürgen Lambrecht
R&D Engineer
Televic Transport Systems
http://www.televic.com
Televic NV / SA (main office)  	
Leo Bekaertlaan 1
B-8870 Izegem
Tel: +32 (0)51 303045
Fax: +32 (0)51 310670


-- 
Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos
and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss


* Re: [ECOS] bug in ARP FSM with fragmented packets?
  2008-01-21 11:18 [ECOS] bug in ARP FSM with fragmented packets? Jürgen Lambrecht
@ 2008-01-21 11:48 ` Andrew Lunn
  2008-01-22 15:42   ` Jürgen Lambrecht
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Lunn @ 2008-01-21 11:48 UTC (permalink / raw)
  To: Jürgen Lambrecht; +Cc: eCos Discussion

On Mon, Jan 21, 2008 at 12:18:16PM +0100, Jürgen Lambrecht wrote:
> Hello,
>
> I'm investigating a problem where UDP packets get lost when there is an  
> ARP table timeout.
> - It only happens with UDP, not with TCP; IPv4.
> - It only happens with fragmented IP packets, so with UDP packets bigger  
> than the Ethernet MTU of 1518 Bytes.
> - I use freeBSD stack, version from CVS end 2007
> This is what I see with Ethereal: The application receives a UDP request  
> packet, and must send back a UDP reply packet. Instead an ARP  
> request/reply happens, and the packet gets lost.
>
> Using static ARP table entries solves the problem of course, but that is  
> not acceptable.
>
> Has anybody seen such a problem before?
> Or has anybody a clue where to start looking?

First off, is this really a bug? UDP is unreliable. It is allowed to
drop packets. If the application requires reliable packet transfer, it
must perform retries at the application layer.
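
A rough sketch of what such an application-layer retry could look like,
in C. All names here (request_with_retry, the two hook types, the
lossy_link toy) are made up for illustration; a real application would
put sendto() and a recvfrom() with a timeout behind the hooks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical transport hooks, so the retry logic stays independent
 * of the socket API. */
typedef bool (*send_fn)(void *ctx, const void *buf, size_t len);
typedef bool (*wait_reply_fn)(void *ctx, unsigned timeout_ms);

/* Send a request and retransmit until a reply arrives or the attempts
 * run out. Returns the attempt number that succeeded, or 0 on failure. */
static unsigned request_with_retry(void *ctx, send_fn do_send,
                                   wait_reply_fn wait_reply,
                                   const void *buf, size_t len,
                                   unsigned max_attempts,
                                   unsigned timeout_ms)
{
    assert(do_send != NULL && wait_reply != NULL);
    for (unsigned attempt = 1; attempt <= max_attempts; attempt++) {
        if (do_send(ctx, buf, len) && wait_reply(ctx, timeout_ms))
            return attempt;
    }
    return 0;
}

/* Toy link for illustration: the first `drops` transmissions are lost. */
struct lossy_link {
    unsigned drops;  /* how many transmissions get lost */
    unsigned sent;   /* how many transmissions were made so far */
};

static bool lossy_send(void *ctx, const void *buf, size_t len)
{
    struct lossy_link *link = ctx;
    (void)buf;
    (void)len;
    link->sent++;
    return true;
}

static bool lossy_wait_reply(void *ctx, unsigned timeout_ms)
{
    struct lossy_link *link = ctx;
    (void)timeout_ms;
    /* A reply only comes back once a transmission got through. */
    return link->sent > link->drops;
}
```

With one lost transmission, as in the ARP timeout case, the second
attempt gets through because the ARP entry has been resolved by then.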

It has been a while since I looked at the ARP code. However, I think it
will hold onto one IP packet when it needs to make an ARP request. Once
the ARP reply comes back it will send the packet it held. If more
transmit requests are made before it has the ARP reply, it discards
packets. This fits your description. Your big UDP packet is being
fragmented, causing two or more packets to be sent, of which all but
one get discarded.
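
The behaviour described above can be modelled in a few lines of C. This
is a toy model, not the stack code: packets are plain fragment ids and
all names are hypothetical. In the BSD-derived code the single slot is,
as far as I remember, the la_hold field of the ARP entry; treat that and
the "newest packet wins" rule as assumptions to verify in if_ether.c.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of an unresolved ARP entry with a single hold slot. */
struct arp_entry_model {
    int resolved;      /* non-zero once the ARP reply has arrived */
    int held;          /* fragment id waiting for resolution, -1 if none */
    unsigned dropped;  /* fragments discarded while waiting */
    unsigned sent;     /* fragments handed to the driver */
};

static void arp_model_init(struct arp_entry_model *e)
{
    e->resolved = 0;
    e->held = -1;
    e->dropped = 0;
    e->sent = 0;
}

/* Output path: transmit at once if resolved, otherwise park the packet
 * in the single slot and discard whatever was parked before
 * (assumption: the newest packet wins). */
static void arp_model_output(struct arp_entry_model *e, int frag_id)
{
    assert(frag_id >= 0);
    if (e->resolved) {
        e->sent++;
        return;
    }
    if (e->held >= 0)
        e->dropped++;
    e->held = frag_id;
}

/* ARP reply arrives: mark the entry resolved and flush the held packet. */
static void arp_model_reply(struct arp_entry_model *e)
{
    e->resolved = 1;
    if (e->held >= 0) {
        e->sent++;
        e->held = -1;
    }
}
```

Feeding three fragments into the model before the reply arrives leaves
one fragment sent and two dropped, so the receiver can never reassemble
the datagram.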

If you really must change this, you need to implement a list of
packets, not a single packet.
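
A minimal sketch of such a list, in C, assuming a small fixed cap per
ARP entry so that a host which never answers cannot tie up all network
buffers. HOLD_QUEUE_MAX, the struct and the function names are all
hypothetical, and packets are stand-in integers rather than mbufs.

```c
#include <assert.h>
#include <stddef.h>

#define HOLD_QUEUE_MAX 4  /* hypothetical cap on held packets per entry */

struct hold_queue {
    int pkts[HOLD_QUEUE_MAX];  /* stand-ins for packet buffers */
    size_t count;              /* packets currently held */
    unsigned dropped;          /* packets refused because the queue was full */
};

/* Queue a packet while the address is unresolved. Returns 1 if queued,
 * 0 if the queue was full and the packet was dropped. */
static int hold_queue_push(struct hold_queue *q, int pkt)
{
    if (q->count == HOLD_QUEUE_MAX) {
        q->dropped++;
        return 0;
    }
    q->pkts[q->count++] = pkt;
    return 1;
}

/* On ARP reply: hand every held packet to xmit in arrival order and
 * empty the queue. Returns the number of packets flushed. */
static size_t hold_queue_flush(struct hold_queue *q,
                               void (*xmit)(int pkt, void *ctx), void *ctx)
{
    size_t flushed = q->count;
    assert(xmit != NULL);
    for (size_t i = 0; i < q->count; i++)
        xmit(q->pkts[i], ctx);
    q->count = 0;
    return flushed;
}

/* Example transmit hook that just records the order of transmission. */
struct xmit_log {
    int order[HOLD_QUEUE_MAX];
    size_t n;
};

static void log_xmit(int pkt, void *ctx)
{
    struct xmit_log *log = ctx;
    assert(log->n < HOLD_QUEUE_MAX);
    log->order[log->n++] = pkt;
}
```

The cap is a design choice: it keeps the worst case bounded, at the
price of still dropping fragments of datagrams that need more than
HOLD_QUEUE_MAX packets.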

However, in my view, your application is broken, not ARP.

         Andrew


* Re: [ECOS] bug in ARP FSM with fragmented packets?
  2008-01-21 11:48 ` Andrew Lunn
@ 2008-01-22 15:42   ` Jürgen Lambrecht
  0 siblings, 0 replies; 5+ messages in thread
From: Jürgen Lambrecht @ 2008-01-22 15:42 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: eCos Discussion

Andrew Lunn wrote:

>On Mon, Jan 21, 2008 at 12:18:16PM +0100, Jürgen Lambrecht wrote:
>  
>
>>Hello,
>>
>>I'm investigating a problem where UDP packets get lost when there is an  
>>ARP table timeout.
>>- It only happens with UDP, not with TCP; IPv4.
>>- It only happens with fragmented IP packets, so with UDP packets bigger  
>>than the Ethernet MTU of 1518 Bytes.
>>- I use freeBSD stack, version from CVS end 2007
>>This is what I see with Ethereal: The application receives a UDP request  
>>packet, and must send back a UDP reply packet. Instead an ARP  
>>request/reply happens, and the packet gets lost.
>>
>>Using static ARP table entries solves the problem of course, but that is  
>>not acceptable.
>>
>>Has anybody seen such a problem before?
>>Or has anybody a clue where to start looking?
>>    
>>
>
>First off, is this really a bug? UDP is unreliable. It is allowed to
>drop packets. If the application requires reliable packet transfer, it
>must perform retries at the application layer.
>  
>
I agree that UDP is unreliable, but this is only caused by an 
unreliable network, not by unreliable SW.
I know what you mean, but I don't agree.
In a certain way you are saying: UDP-related SW can contain bugs or have 
strange behavior, because it does not need to be reliable.
In our case, we "own" the complete network path from sender to receiver, 
and the tests are done point-to-point with our dedicated HW & SW. So 
UDP should be completely reliable there.

>It is a while since i looked at the ARP code. However, i think it will
>hold onto one IP packet when it needs to make an ARP request. Once the
>ARP reply comes back it will send the packet it held. If more transmit
>requests are made before it has the ARP reply it discards
>packets. This fits your description. Your big UDP packet is being
>fragmented, causing two or more packets to be sent, of which all but
>one gets discarded.
>  
>
ok

>If you really must change this, you need to implement a list of
>packets, not a single packet.
>  
>
I agree that fragmentation is not a good idea; we have already mailed 
about this. But we have to do it.

>However, in my view, your application is broken, not ARP.
>
>         Andrew
>
>  
>
So I will look at the ARP code next week, after having done detailed 
tests to be sure that ARP is the problem.
If ARP indeed only stores 1 packet, I propose to modify the code so that 
ARP stores as many packets as fit in the networking buffer. The size of 
the static network buffers is a configuration option. I know there are 
several networking buffers; I will have to find out which buffers are 
used for what.

kind regards,
Jürgen


* Re: [ECOS] bug in ARP FSM with fragmented packets?
  2008-01-21 13:31 Emmanuel Coullien
@ 2008-01-21 15:53 ` Andrew Lunn
  0 siblings, 0 replies; 5+ messages in thread
From: Andrew Lunn @ 2008-01-21 15:53 UTC (permalink / raw)
  To: Emmanuel Coullien; +Cc: ecos-discuss

> We met this same problem and we solved it temporary by changing the
> arpt_keep to 24h in if_ether.c (just for the test). But we don't know
> if we can really do it.
> Why isn't it acceptable and do you know what are the consequences if
> we use a such long timeout ?

I would not recommend it. If you have any dynamic nature in your
network, it is going to break. The IP to MAC address mapping does
change at times, e.g. if your DHCP server gives the IP address to a
different client, the ARP entry will be wrong and you won't be able to
talk to the new node.
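
To illustrate the stale entry problem, here is a small C model of the
expiry check. It is a sketch, not the stack code: the struct and
function names are hypothetical, and the 20 minute figure is what I
believe the stock arpt_keep value in the BSD if_ether.c to be, so check
your own tree.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed lifetimes in seconds: the believed stock value and the 24h
 * value tried in this thread. */
enum { ARPT_KEEP_DEFAULT_S = 20 * 60, ARPT_KEEP_24H_S = 24 * 60 * 60 };

/* Toy ARP cache entry: one IP to MAC mapping plus the time it was learned. */
struct arp_cache_model {
    unsigned ip;            /* IPv4 address, host byte order */
    unsigned char mac[6];   /* MAC address learned for that IP */
    long learned_at_s;      /* time the mapping was learned, in seconds */
};

/* True if the cached mapping may still be used at time now_s, given an
 * entry lifetime of keep_s seconds. Once this returns false the stack
 * has to send a fresh ARP request. */
static bool arp_entry_valid(const struct arp_cache_model *e,
                            long now_s, long keep_s)
{
    assert(now_s >= e->learned_at_s);
    return now_s - e->learned_at_s < keep_s;
}
```

If the IP address is handed to a different node one hour after the
entry was learned, the 20 minute lifetime has already forced a new ARP
request, whereas the 24h lifetime keeps sending to the old MAC address
for up to 23 more hours.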

    Andrew


* Re: [ECOS] bug in ARP FSM with fragmented packets?
@ 2008-01-21 13:31 Emmanuel Coullien
  2008-01-21 15:53 ` Andrew Lunn
  0 siblings, 1 reply; 5+ messages in thread
From: Emmanuel Coullien @ 2008-01-21 13:31 UTC (permalink / raw)
  To: ecos-discuss

On Mon, Jan 21, 2008 at 12:18:16PM +0100, Jürgen Lambrecht wrote:
> Hello,
>
> I'm investigating a problem where UDP packets get lost when there is an
> ARP table timeout.
> - It only happens with UDP, not with TCP; IPv4.
> - It only happens with fragmented IP packets, so with UDP packets bigger
> than the Ethernet MTU of 1518 Bytes.
> - I use freeBSD stack, version from CVS end 2007
> This is what I see with Ethereal: The application receives a UDP request
> packet, and must send back a UDP reply packet. Instead an ARP
> request/reply happens, and the packet gets lost.
>
> Using static ARP table entries solves the problem of course, but that is
> not acceptable.
>
> Has anybody seen such a problem before?
> Or has anybody a clue where to start looking?

Hi,

We ran into the same problem and worked around it temporarily by
changing arpt_keep to 24h in if_ether.c (just for the test). But we
don't know if we can really do that.
Why isn't it acceptable, and do you know what the consequences are if
we use such a long timeout?

-- 
Emmanuel Coullien

