public inbox for ecos-bugs@sourceware.org
* [Bug 1000738] New: Redboot networking problem
@ 2009-04-06 17:35 bugzilla-daemon
  2009-04-16 15:12 ` [Bug 1000738] " bugzilla-daemon
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: bugzilla-daemon @ 2009-04-06 17:35 UTC (permalink / raw)
  To: ecos-bugs

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1000738

           Summary: Redboot networking problem
           Product: eCos
           Version: CVS
          Platform: snds (Samsung SNDS)
        OS/Version: HostOS: Linux
            Status: UNCONFIRMED
          Severity: normal
          Priority: normal
         Component: RedBoot
        AssignedTo: gary@mlbassoc.com
        ReportedBy: iliev@ronetix.at
         QAContact: ecos-bugs@ecos.sourceware.org
             Class: ---


Performing two http loads one after the other fails: the second load puts
RedBoot into an endless loop.

If there is a small delay between the two loads, it sometimes works.
If the first load uses http and the second uses tftp, the problem does not
occur.

The endless loop is in tcp.c, __tcp_handler():
   for (prev = NULL, s = tcp_list; s; prev = s, s = s->next) {
       if (s->our_port == ntohs(tcp->dest_port)) {
           if (s->his_port == 0)

It loops forever because *s == s->next*, and because *s->our_port* and
*ntohs(tcp->dest_port)* differ by one, so the port comparison never matches.
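
To illustrate why the traversal cannot terminate, here is a minimal sketch of a
port lookup over a singly linked list that guards against a self-linked node.
It is not RedBoot code: struct sock, find_port() and the max_steps limit are
hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for RedBoot's tcp_socket_t. */
struct sock {
    unsigned short our_port;
    struct sock *next;
};

/*
 * Walk the list looking for a port, much like the loop quoted above,
 * but stop when a node points at itself or after max_steps nodes.
 * Returns the matching node, or NULL if none was found.
 * *looped is set to 1 when a self-linked node was detected.
 */
static struct sock *find_port(struct sock *list, unsigned short port,
                              int max_steps, int *looped)
{
    struct sock *s;
    int steps = 0;

    assert(looped != NULL);
    *looped = 0;
    for (s = list; s != NULL; s = s->next) {
        if (s->our_port == port)
            return s;
        if (s->next == s) {         /* corrupted list: s == s->next */
            *looped = 1;
            return NULL;
        }
        if (++steps >= max_steps)   /* defensive bound on the walk */
            return NULL;
    }
    return NULL;
}
```

Without the two guards, a node with s->next == s and a non-matching port keeps
the for loop on the same node forever, which is the behaviour reported here.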


-- 
Configure bugmail: http://bugs.ecos.sourceware.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.



* [Bug 1000738] Redboot networking problem
  2009-04-06 17:35 [Bug 1000738] New: Redboot networking problem bugzilla-daemon
@ 2009-04-16 15:12 ` bugzilla-daemon
  2009-04-17  8:09 ` bugzilla-daemon
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2009-04-16 15:12 UTC (permalink / raw)
  To: ecos-bugs

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1000738


Andrew Lunn <andrew.lunn@ascom.ch> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andrew.lunn@ascom.ch




--- Comment #1 from Andrew Lunn <andrew.lunn@ascom.ch>  2009-04-16 16:12:27 ---
There is a race condition between closing one socket and opening the next.

The normal code path is:

http_client.c opens the first socket and transfers data. Once finished it calls
http_stream_close() which calls __tcp_abort(). __tcp_abort() starts a timer
with a delay of 1ms. After that 1ms delay the function do_abort() is called
which sends a TCP ACK and RST packet and then unlinks the socket structure from
the linked list of sockets.

The race happens because the socket structure is a member of the static
singleton http_stream in http_client.c. What I think is happening is that
you start a second http transfer after http_stream_close() but before the
1ms delay has expired. This results in the http_stream->sock structure being
added to the linked list a second time, which corrupts the list pointers and
gives your endless loop. When you delay your next http transfer for a short
while, longer than 1ms, the socket is removed from the list before it is
added again and everybody is happy.
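
The effect of adding the same structure twice can be sketched as follows. This
is a simplified model, not the RedBoot source: it assumes the socket is linked
in at the head of the list, and struct sock, sock_list, list_push() and
list_unlink() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the socket structure. */
struct sock {
    struct sock *next;
};

static struct sock *sock_list = NULL;   /* stand-in for tcp_list */

/* Push a node at the head of the list without checking for duplicates. */
static void list_push(struct sock *s)
{
    assert(s != NULL);
    s->next = sock_list;
    sock_list = s;
}

/* Unlink a node, as the delayed abort handler eventually would. */
static void list_unlink(struct sock *s)
{
    struct sock **pp;

    for (pp = &sock_list; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == s) {
            *pp = s->next;
            s->next = NULL;
            return;
        }
    }
}
```

Calling list_push() twice on the same static structure, with no list_unlink()
in between, leaves s->next pointing at s itself. Calling list_unlink() before
the second list_push() keeps the list intact.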

How to solve this problem? _tcp_open has code like:

    // Send off the SYN packet to open the connection
    tcp_send(s, TCP_FLAG_SYN, 0);
    // Wait for connection to establish
    while (s->state != _ESTABLISHED) {
        if (s->state == _CLOSED) {
            diag_printf("TCP open - host closed connection\n");
            return -1;
        }
        if (--timeout <= 0) {
            diag_printf("TCP open - connection timed out\n");
            return -1;
        }
        MS_TICKS_DELAY();
        __tcp_poll();
    }
    return 0;

Maybe abort needs something similar:

void
__tcp_abort(tcp_socket_t *s, unsigned long delay)
{
    int timeout = 10;

    // Schedule do_abort() to run after the delay
    __timer_set(&abort_timer, delay, do_abort, s);

    // Wait for the socket to close before returning to the caller
    while (s->state != _CLOSED) {
        if (--timeout <= 0) {
            diag_printf("TCP close - connection failed to close\n");
            return;
        }
        MS_TICKS_DELAY();
        __tcp_poll();
    }
}


It also looks like there could be a second similar race condition when the
connection breaks. The code calls __tcp_close(&s->sock) and returns. Maybe a
call to __tcp_close_wait() is needed?



* [Bug 1000738] Redboot networking problem
  2009-04-06 17:35 [Bug 1000738] New: Redboot networking problem bugzilla-daemon
  2009-04-16 15:12 ` [Bug 1000738] " bugzilla-daemon
@ 2009-04-17  8:09 ` bugzilla-daemon
  2009-04-17  8:11 ` bugzilla-daemon
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2009-04-17  8:09 UTC (permalink / raw)
  To: ecos-bugs

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1000738


John Dallaway <john@dallaway.org.uk> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
         Extra Info|---                         |REQUESTED
     Ever Confirmed|0                           |1




--- Comment #2 from John Dallaway <john@dallaway.org.uk>  2009-04-17 09:09:00 ---
Ilko, can you verify that the proposed change described in comment #1 resolves
the problem you are experiencing please?



* [Bug 1000738] Redboot networking problem
  2009-04-06 17:35 [Bug 1000738] New: Redboot networking problem bugzilla-daemon
  2009-04-16 15:12 ` [Bug 1000738] " bugzilla-daemon
  2009-04-17  8:09 ` bugzilla-daemon
@ 2009-04-17  8:11 ` bugzilla-daemon
  2009-04-17 16:46 ` bugzilla-daemon
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2009-04-17  8:11 UTC (permalink / raw)
  To: ecos-bugs

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1000738





--- Comment #3 from Andrew Lunn <andrew.lunn@ascom.ch>  2009-04-17 09:11:02 ---
And if you need a real patch, let me know.



* [Bug 1000738] Redboot networking problem
  2009-04-06 17:35 [Bug 1000738] New: Redboot networking problem bugzilla-daemon
                   ` (2 preceding siblings ...)
  2009-04-17  8:11 ` bugzilla-daemon
@ 2009-04-17 16:46 ` bugzilla-daemon
  2009-04-17 16:57 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2009-04-17 16:46 UTC (permalink / raw)
  To: ecos-bugs

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1000738





--- Comment #4 from Ilko Iliev <iliev@ronetix.at>  2009-04-17 17:46:40 ---
Andrew, thank you for the patch - it resolved the problem.
However, I changed the timeout in __tcp_abort() from 10 to 1000 because
it sometimes takes up to 500 ticks until the connection is closed.

Could you make a patch?



* [Bug 1000738] Redboot networking problem
  2009-04-06 17:35 [Bug 1000738] New: Redboot networking problem bugzilla-daemon
                   ` (3 preceding siblings ...)
  2009-04-17 16:46 ` bugzilla-daemon
@ 2009-04-17 16:57 ` bugzilla-daemon
  2009-04-19 20:41 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2009-04-17 16:57 UTC (permalink / raw)
  To: ecos-bugs

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1000738





--- Comment #5 from Andrew Lunn <andrew.lunn@ascom.ch>  2009-04-17 17:57:08 ---
Good to hear it works. I'm kind of surprised that the timeout needs to be
bigger. I will look closer.

As for a patch, could you do a

cvs diff packages/redboot/current/src/net

We would be 90% done with that as a starting point.



* [Bug 1000738] Redboot networking problem
  2009-04-06 17:35 [Bug 1000738] New: Redboot networking problem bugzilla-daemon
                   ` (4 preceding siblings ...)
  2009-04-17 16:57 ` bugzilla-daemon
@ 2009-04-19 20:41 ` bugzilla-daemon
  2009-04-20 11:02 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2009-04-19 20:41 UTC (permalink / raw)
  To: ecos-bugs

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1000738





--- Comment #6 from Ilko Iliev <iliev@ronetix.at>  2009-04-19 21:41:39 ---
Andrew,
I don't need the patch for myself.
Maybe you can make a patch and commit it into the eCos mainline.



* [Bug 1000738] Redboot networking problem
  2009-04-06 17:35 [Bug 1000738] New: Redboot networking problem bugzilla-daemon
                   ` (5 preceding siblings ...)
  2009-04-19 20:41 ` bugzilla-daemon
@ 2009-04-20 11:02 ` bugzilla-daemon
  2009-04-20 11:06 ` bugzilla-daemon
  2009-04-20 11:19 ` bugzilla-daemon
  8 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2009-04-20 11:02 UTC (permalink / raw)
  To: ecos-bugs

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1000738





--- Comment #7 from Andrew Lunn <andrew.lunn@ascom.ch>  2009-04-20 12:02:14 ---
Created an attachment (id=699)
 --> (http://bugs.ecos.sourceware.org/attachment.cgi?id=699)
Patch to fix race condition

Hi Ilko

Could you please test this patch? If it is OK, I will commit it.



* [Bug 1000738] Redboot networking problem
  2009-04-06 17:35 [Bug 1000738] New: Redboot networking problem bugzilla-daemon
                   ` (6 preceding siblings ...)
  2009-04-20 11:02 ` bugzilla-daemon
@ 2009-04-20 11:06 ` bugzilla-daemon
  2009-04-20 11:19 ` bugzilla-daemon
  8 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2009-04-20 11:06 UTC (permalink / raw)
  To: ecos-bugs

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1000738





--- Comment #8 from Ilko Iliev <iliev@ronetix.at>  2009-04-20 12:06:16 ---
The patch is OK.



* [Bug 1000738] Redboot networking problem
  2009-04-06 17:35 [Bug 1000738] New: Redboot networking problem bugzilla-daemon
                   ` (7 preceding siblings ...)
  2009-04-20 11:06 ` bugzilla-daemon
@ 2009-04-20 11:19 ` bugzilla-daemon
  8 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2009-04-20 11:19 UTC (permalink / raw)
  To: ecos-bugs

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1000738


Andrew Lunn <andrew.lunn@ascom.ch> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |CURRENTRELEASE




--- Comment #9 from Andrew Lunn <andrew.lunn@ascom.ch>  2009-04-20 12:19:48 ---
Patch committed.


