* [Bug 1000738] Redboot networking problem
2009-04-06 17:35 [Bug 1000738] New: Redboot networking problem bugzilla-daemon
@ 2009-04-16 15:12 ` bugzilla-daemon
2009-04-17 8:09 ` bugzilla-daemon
` (7 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2009-04-16 15:12 UTC (permalink / raw)
To: ecos-bugs
http://bugs.ecos.sourceware.org/show_bug.cgi?id=1000738
Andrew Lunn <andrew.lunn@ascom.ch> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |andrew.lunn@ascom.ch
--- Comment #1 from Andrew Lunn <andrew.lunn@ascom.ch> 2009-04-16 16:12:27 ---
There is a race condition with closing the socket and opening the next socket.
The normal code path is:
http_client.c opens the first socket and transfers data. Once finished it calls
http_stream_close() which calls __tcp_abort(). __tcp_abort() starts a timer
with a delay of 1ms. After that 1ms delay the function do_abort() is called
which sends a TCP ACK and RST packet and then unlinks the socket structure from
the linked list of sockets.
The race happens because the socket structure is a member of the static
singleton http_stream in http_client.c. What I think is happening is that after
the http_stream_close(), you are starting a second http transfer before the
1ms delay has expired. This results in the http_stream->sock structure being
added to the linked list a second time, messing up the list pointers and so
giving your endless loop. When you delay your next http transfer for a short
while, longer than 1ms, the socket gets removed from the list before it is
added to the list again and everybody is happy.
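To illustrate the failure mode described above, here is a minimal standalone sketch. It assumes the socket list is a singly linked list with insertion at the head; struct sock, sock_list and the three helper functions are hypothetical names for illustration and are not taken from the RedBoot sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket structure; not RedBoot's tcp_socket_t. */
struct sock {
    struct sock *next;
};

/* Head of the (assumed) singly linked list of active sockets. */
static struct sock *sock_list = NULL;

/* Push a socket onto the head of the list, as an open routine might. */
static void list_add(struct sock *s)
{
    assert(s != NULL);
    s->next = sock_list;
    sock_list = s;
}

/* Unlink a socket from the list, as a deferred abort handler might. */
static void list_remove(struct sock *s)
{
    struct sock **pp;

    for (pp = &sock_list; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == s) {
            *pp = s->next;
            s->next = NULL;
            return;
        }
    }
}

/* Return 1 if walking the list would never terminate (Floyd cycle check). */
static int list_has_cycle(void)
{
    struct sock *slow = sock_list;
    struct sock *fast = sock_list;

    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast)
            return 1;
    }
    return 0;
}
```

Calling list_add() twice on the same structure without an intervening list_remove() leaves the node pointing at itself, so any loop that walks the list until it reaches NULL never ends. With the removal in between, the list stays well formed.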
How to solve this problem? _tcp_open has code like:
    // Send off the SYN packet to open the connection
    tcp_send(s, TCP_FLAG_SYN, 0);
    // Wait for connection to establish
    while (s->state != _ESTABLISHED) {
        if (s->state == _CLOSED) {
            diag_printf("TCP open - host closed connection\n");
            return -1;
        }
        if (--timeout <= 0) {
            diag_printf("TCP open - connection timed out\n");
            return -1;
        }
        MS_TICKS_DELAY();
        __tcp_poll();
    }
    return 0;
Maybe abort needs something similar:
void
__tcp_abort(tcp_socket_t *s, unsigned long delay)
{
    int timeout = 10;
    __timer_set(&abort_timer, delay, do_abort, s);
    while (s->state != _CLOSED) {
        if (--timeout <= 0) {
            diag_printf("TCP close - connection failed to close\n");
            return;
        }
        MS_TICKS_DELAY();
        __tcp_poll();
    }
}
It also looks like there could be a second similar race condition when the
connection breaks. The code calls __tcp_close(&s->sock) and returns. Maybe a
call to __tcp_close_wait() is needed?
--
Configure bugmail: http://bugs.ecos.sourceware.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.
* [Bug 1000738] Redboot networking problem
2009-04-06 17:35 [Bug 1000738] New: Redboot networking problem bugzilla-daemon
2009-04-16 15:12 ` [Bug 1000738] " bugzilla-daemon
@ 2009-04-17 8:09 ` bugzilla-daemon
2009-04-17 8:11 ` bugzilla-daemon
` (6 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2009-04-17 8:09 UTC (permalink / raw)
To: ecos-bugs
http://bugs.ecos.sourceware.org/show_bug.cgi?id=1000738
John Dallaway <john@dallaway.org.uk> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Extra Info|--- |REQUESTED
Ever Confirmed|0 |1
--- Comment #2 from John Dallaway <john@dallaway.org.uk> 2009-04-17 09:09:00 ---
Ilko, can you verify that the proposed change described in comment #1 resolves
the problem you are experiencing please?
* [Bug 1000738] Redboot networking problem
2009-04-06 17:35 [Bug 1000738] New: Redboot networking problem bugzilla-daemon
2009-04-16 15:12 ` [Bug 1000738] " bugzilla-daemon
2009-04-17 8:09 ` bugzilla-daemon
@ 2009-04-17 8:11 ` bugzilla-daemon
2009-04-17 16:46 ` bugzilla-daemon
` (5 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2009-04-17 8:11 UTC (permalink / raw)
To: ecos-bugs
http://bugs.ecos.sourceware.org/show_bug.cgi?id=1000738
--- Comment #3 from Andrew Lunn <andrew.lunn@ascom.ch> 2009-04-17 09:11:02 ---
And if you need a real patch, let me know.
* [Bug 1000738] Redboot networking problem
2009-04-06 17:35 [Bug 1000738] New: Redboot networking problem bugzilla-daemon
` (2 preceding siblings ...)
2009-04-17 8:11 ` bugzilla-daemon
@ 2009-04-17 16:46 ` bugzilla-daemon
2009-04-17 16:57 ` bugzilla-daemon
` (4 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2009-04-17 16:46 UTC (permalink / raw)
To: ecos-bugs
http://bugs.ecos.sourceware.org/show_bug.cgi?id=1000738
--- Comment #4 from Ilko Iliev <iliev@ronetix.at> 2009-04-17 17:46:40 ---
Andrew, thank you for the patch - it resolved the problem.
However I changed the timeout in the __tcp_abort() from 10 to 1000 because
sometimes it takes up to 500 ticks until the connection is closed.
Could you make a patch?
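The effect of the iteration budget can be sketched with a small simulation. Everything here is an assumption made for illustration: sim_conn, sim_poll and wait_closed are hypothetical names, and sim_poll merely stands in for one delay-and-poll iteration of the real stack.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical connection states; names are illustrative only. */
enum conn_state { CONN_CLOSING, CONN_CLOSED };

/* Simulated connection that reaches CLOSED after a given number of polls. */
struct sim_conn {
    enum conn_state state;
    int polls_until_closed;
};

/* Stand-in for one delay-and-poll iteration of the network stack. */
static void sim_poll(struct sim_conn *c)
{
    if (c->polls_until_closed > 0 && --c->polls_until_closed == 0)
        c->state = CONN_CLOSED;
}

/*
 * Wait for the connection to reach CLOSED, giving up once the
 * iteration budget is used. Returns 0 on success, -1 on timeout.
 */
static int wait_closed(struct sim_conn *c, int timeout)
{
    assert(c != NULL);
    while (c->state != CONN_CLOSED) {
        if (--timeout <= 0)
            return -1;
        sim_poll(c);
    }
    return 0;
}
```

For a simulated connection that needs 500 polls to close, wait_closed() with a budget of 10 gives up early and returns -1, while a budget of 1000 returns 0. This mirrors the observation above, under the assumption that each loop iteration corresponds to one tick.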
* [Bug 1000738] Redboot networking problem
2009-04-06 17:35 [Bug 1000738] New: Redboot networking problem bugzilla-daemon
` (3 preceding siblings ...)
2009-04-17 16:46 ` bugzilla-daemon
@ 2009-04-17 16:57 ` bugzilla-daemon
2009-04-19 20:41 ` bugzilla-daemon
` (3 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2009-04-17 16:57 UTC (permalink / raw)
To: ecos-bugs
http://bugs.ecos.sourceware.org/show_bug.cgi?id=1000738
--- Comment #5 from Andrew Lunn <andrew.lunn@ascom.ch> 2009-04-17 17:57:08 ---
Good to hear it works. I'm kind of surprised about the timeout needing to be
bigger. I will look closer.
As for a patch, could you do a
cvs diff packages/redboot/current/src/net
We would be 90% done with that as a starting point.
* [Bug 1000738] Redboot networking problem
2009-04-06 17:35 [Bug 1000738] New: Redboot networking problem bugzilla-daemon
` (4 preceding siblings ...)
2009-04-17 16:57 ` bugzilla-daemon
@ 2009-04-19 20:41 ` bugzilla-daemon
2009-04-20 11:02 ` bugzilla-daemon
` (2 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2009-04-19 20:41 UTC (permalink / raw)
To: ecos-bugs
http://bugs.ecos.sourceware.org/show_bug.cgi?id=1000738
--- Comment #6 from Ilko Iliev <iliev@ronetix.at> 2009-04-19 21:41:39 ---
Andrew,
I don't need the patch for myself.
Maybe you can make a patch and commit it into the eCos mainline.
* [Bug 1000738] Redboot networking problem
2009-04-06 17:35 [Bug 1000738] New: Redboot networking problem bugzilla-daemon
` (5 preceding siblings ...)
2009-04-19 20:41 ` bugzilla-daemon
@ 2009-04-20 11:02 ` bugzilla-daemon
2009-04-20 11:06 ` bugzilla-daemon
2009-04-20 11:19 ` bugzilla-daemon
8 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2009-04-20 11:02 UTC (permalink / raw)
To: ecos-bugs
http://bugs.ecos.sourceware.org/show_bug.cgi?id=1000738
--- Comment #7 from Andrew Lunn <andrew.lunn@ascom.ch> 2009-04-20 12:02:14 ---
Created an attachment (id=699)
--> (http://bugs.ecos.sourceware.org/attachment.cgi?id=699)
Patch to fix race condition
Hi Ilko
Could you please test this patch? If it is O.K., I will commit it.
* [Bug 1000738] Redboot networking problem
2009-04-06 17:35 [Bug 1000738] New: Redboot networking problem bugzilla-daemon
` (6 preceding siblings ...)
2009-04-20 11:02 ` bugzilla-daemon
@ 2009-04-20 11:06 ` bugzilla-daemon
2009-04-20 11:19 ` bugzilla-daemon
8 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2009-04-20 11:06 UTC (permalink / raw)
To: ecos-bugs
http://bugs.ecos.sourceware.org/show_bug.cgi?id=1000738
--- Comment #8 from Ilko Iliev <iliev@ronetix.at> 2009-04-20 12:06:16 ---
The patch is OK.
* [Bug 1000738] Redboot networking problem
2009-04-06 17:35 [Bug 1000738] New: Redboot networking problem bugzilla-daemon
` (7 preceding siblings ...)
2009-04-20 11:06 ` bugzilla-daemon
@ 2009-04-20 11:19 ` bugzilla-daemon
8 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2009-04-20 11:19 UTC (permalink / raw)
To: ecos-bugs
http://bugs.ecos.sourceware.org/show_bug.cgi?id=1000738
Andrew Lunn <andrew.lunn@ascom.ch> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |CURRENTRELEASE
--- Comment #9 from Andrew Lunn <andrew.lunn@ascom.ch> 2009-04-20 12:19:48 ---
Patch committed.