* Race condition in 19990922 snapshot [PATCH]
@ 1999-09-24 13:24 Patrick J. LoPresti
1999-09-24 14:01 ` Patrick J. LoPresti
1999-09-30 23:42 ` Patrick J. LoPresti
0 siblings, 2 replies; 4+ messages in thread
From: Patrick J. LoPresti @ 1999-09-24 13:24 UTC (permalink / raw)
To: cygwin
The appended patch fixes a race condition in select.cc.
This code creates a Win32 socket ("exitsock") which it uses to let the
master thread notify a slave thread that it should exit a particular
call to Winsock select(). (I know, technically, that threads don't
really have a master/slave relationship; but look at the code and you
will see what I mean.)
The problem here is twofold:
1) Doing select() on something created by socket(), when that socket
has not been connected our bound to anything, does not have
defined behavior in any documentation I can find. The code in
select.cc seems to rely on some weird Winsock behavior; namely,
that calling closesocket() on a newly-created socket will trigger
an "exceptional condition" for any select() waiting on it.
2) There is a race condition anyway. It is possible for the master
thread to call closesocket() before the slave thread even gets to
the select(). The result is to select() on a value which is no
longer a legitimate socket at all. I have seen this race happen
under strace. The result is (usually) for the select() to return
with an error code, which Cygwin pretty much ignores and
continues plowing through. Sometimes the result is a complete
hang on our SMP box. (I do not have details of the failures, but
I suspect some kind of corruption in the internals of Winsock or
Cygwin.)
The enclosed patch uses a socketpair() instead of a single socket.
The slave thread now selects for reading on one member of the pair;
the master can close the other end to notify the slave it should fall
out of select(). This fixes both of the problems: The behavior is now
well-defined, and the race condition is avoided.
After installing this patch, most of the Cygwin hangs we were
experiencing are now gone. At least, I can't reproduce them in 20
seconds like I could before :-).
- Pat
P.S. You will get a compile-time warning because socketpair() is not
declared. I was unable to find it in any .h file or I would have
#included it...
======================================================================
--- cygwin-src/winsup/select.cc Wed Sep 22 23:57:31 1999
+++ cygwin-src-patl/winsup/select.cc Fri Sep 24 15:59:42 1999
@@ -991,7 +991,7 @@ struct socketinf
{
HANDLE thread;
winsock_fd_set readfds, writefds, exceptfds;
- SOCKET exitsock;
+ int exitsockpair[2];
select_record *start;
};
@@ -1090,8 +1090,9 @@ thread_socket (void *arg)
}
}
- if (WINSOCK_FD_ISSET (si->exitsock, &si->exceptfds))
- select_printf ("exitsock got an exception");
+ if (WINSOCK_FD_ISSET (dtable[si->exitsockpair[1]]->get_handle (),
+ &si->readfds))
+ select_printf ("exitsockpair was closed");
return 0;
}
@@ -1133,15 +1134,16 @@ start_thread_socket (select_record *me,
}
}
- if ((si->exitsock = socket (PF_INET, SOCK_STREAM, 0)) == INVALID_SOCKET)
+ if (socketpair (PF_INET, SOCK_STREAM, 0, si->exitsockpair) != 0)
{
- __seterrno ();
select_printf ("cannot create socket");
return -1;
}
- select_printf ("exitsock %p", si->exitsock);
- WINSOCK_FD_SET ((HANDLE)si->exitsock, &si->exceptfds);
+ select_printf ("exitsocks %d %d", si->exitsockpair[0],
+ si->exitsockpair[1]);
+ WINSOCK_FD_SET (dtable[si->exitsockpair[1]]->get_handle (),
+ &si->readfds);
stuff->device_specific[FHDEVN(FH_SOCKET)] = (void *) si;
si->start = &stuff->start;
select_printf ("stuff_start %p", &stuff->start);
@@ -1157,8 +1159,9 @@ socket_cleanup (select_record *me, selec
select_printf ("si %p si->thread %p", si, si ? si->thread : NULL);
if (si && si->thread)
{
- closesocket (si->exitsock);
+ _close (si->exitsockpair[0]);
WaitForSingleObject (si->thread, INFINITE);
+ _close (si->exitsockpair[1]);
CloseHandle (si->thread);
stuff->device_specific[FHDEVN(FH_SOCKET)] = NULL;
delete si;
--
Want to unsubscribe from this list?
Send a message to cygwin-unsubscribe@sourceware.cygnus.com
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Race condition in 19990922 snapshot [PATCH]
1999-09-24 13:24 Race condition in 19990922 snapshot [PATCH] Patrick J. LoPresti
@ 1999-09-24 14:01 ` Patrick J. LoPresti
1999-09-30 23:42 ` Patrick J. LoPresti
1999-09-30 23:42 ` Patrick J. LoPresti
1 sibling, 1 reply; 4+ messages in thread
From: Patrick J. LoPresti @ 1999-09-24 14:01 UTC (permalink / raw)
To: cygwin
patl> This code creates a Win32 socket ("exitsock") which it uses to
Er, "this code" as in "the original Cygwin code". Sorry for the
ambiguity.
- Pat
--
Want to unsubscribe from this list?
Send a message to cygwin-unsubscribe@sourceware.cygnus.com
^ permalink raw reply [flat|nested] 4+ messages in thread
* Race condition in 19990922 snapshot [PATCH]
1999-09-24 13:24 Race condition in 19990922 snapshot [PATCH] Patrick J. LoPresti
1999-09-24 14:01 ` Patrick J. LoPresti
@ 1999-09-30 23:42 ` Patrick J. LoPresti
1 sibling, 0 replies; 4+ messages in thread
From: Patrick J. LoPresti @ 1999-09-30 23:42 UTC (permalink / raw)
To: cygwin
The appended patch fixes a race condition in select.cc.
This code creates a Win32 socket ("exitsock") which it uses to let the
master thread notify a slave thread that it should exit a particular
call to Winsock select(). (I know, technically, that threads don't
really have a master/slave relationship; but look at the code and you
will see what I mean.)
The problem here is twofold:
1) Doing select() on something created by socket(), when that socket
has not been connected our bound to anything, does not have
defined behavior in any documentation I can find. The code in
select.cc seems to rely on some weird Winsock behavior; namely,
that calling closesocket() on a newly-created socket will trigger
an "exceptional condition" for any select() waiting on it.
2) There is a race condition anyway. It is possible for the master
thread to call closesocket() before the slave thread even gets to
the select(). The result is to select() on a value which is no
longer a legitimate socket at all. I have seen this race happen
under strace. The result is (usually) for the select() to return
with an error code, which Cygwin pretty much ignores and
continues plowing through. Sometimes the result is a complete
hang on our SMP box. (I do not have details of the failures, but
I suspect some kind of corruption in the internals of Winsock or
Cygwin.)
The enclosed patch uses a socketpair() instead of a single socket.
The slave thread now selects for reading on one member of the pair;
the master can close the other end to notify the slave it should fall
out of select(). This fixes both of the problems: The behavior is now
well-defined, and the race condition is avoided.
After installing this patch, most of the Cygwin hangs we were
experiencing are now gone. At least, I can't reproduce them in 20
seconds like I could before :-).
- Pat
P.S. You will get a compile-time warning because socketpair() is not
declared. I was unable to find it in any .h file or I would have
#included it...
======================================================================
--- cygwin-src/winsup/select.cc Wed Sep 22 23:57:31 1999
+++ cygwin-src-patl/winsup/select.cc Fri Sep 24 15:59:42 1999
@@ -991,7 +991,7 @@ struct socketinf
{
HANDLE thread;
winsock_fd_set readfds, writefds, exceptfds;
- SOCKET exitsock;
+ int exitsockpair[2];
select_record *start;
};
@@ -1090,8 +1090,9 @@ thread_socket (void *arg)
}
}
- if (WINSOCK_FD_ISSET (si->exitsock, &si->exceptfds))
- select_printf ("exitsock got an exception");
+ if (WINSOCK_FD_ISSET (dtable[si->exitsockpair[1]]->get_handle (),
+ &si->readfds))
+ select_printf ("exitsockpair was closed");
return 0;
}
@@ -1133,15 +1134,16 @@ start_thread_socket (select_record *me,
}
}
- if ((si->exitsock = socket (PF_INET, SOCK_STREAM, 0)) == INVALID_SOCKET)
+ if (socketpair (PF_INET, SOCK_STREAM, 0, si->exitsockpair) != 0)
{
- __seterrno ();
select_printf ("cannot create socket");
return -1;
}
- select_printf ("exitsock %p", si->exitsock);
- WINSOCK_FD_SET ((HANDLE)si->exitsock, &si->exceptfds);
+ select_printf ("exitsocks %d %d", si->exitsockpair[0],
+ si->exitsockpair[1]);
+ WINSOCK_FD_SET (dtable[si->exitsockpair[1]]->get_handle (),
+ &si->readfds);
stuff->device_specific[FHDEVN(FH_SOCKET)] = (void *) si;
si->start = &stuff->start;
select_printf ("stuff_start %p", &stuff->start);
@@ -1157,8 +1159,9 @@ socket_cleanup (select_record *me, selec
select_printf ("si %p si->thread %p", si, si ? si->thread : NULL);
if (si && si->thread)
{
- closesocket (si->exitsock);
+ _close (si->exitsockpair[0]);
WaitForSingleObject (si->thread, INFINITE);
+ _close (si->exitsockpair[1]);
CloseHandle (si->thread);
stuff->device_specific[FHDEVN(FH_SOCKET)] = NULL;
delete si;
--
Want to unsubscribe from this list?
Send a message to cygwin-unsubscribe@sourceware.cygnus.com
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Race condition in 19990922 snapshot [PATCH]
1999-09-24 14:01 ` Patrick J. LoPresti
@ 1999-09-30 23:42 ` Patrick J. LoPresti
0 siblings, 0 replies; 4+ messages in thread
From: Patrick J. LoPresti @ 1999-09-30 23:42 UTC (permalink / raw)
To: cygwin
patl> This code creates a Win32 socket ("exitsock") which it uses to
Er, "this code" as in "the original Cygwin code". Sorry for the
ambiguity.
- Pat
--
Want to unsubscribe from this list?
Send a message to cygwin-unsubscribe@sourceware.cygnus.com
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~1999-09-30 23:42 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1999-09-24 13:24 Race condition in 19990922 snapshot [PATCH] Patrick J. LoPresti
1999-09-24 14:01 ` Patrick J. LoPresti
1999-09-30 23:42 ` Patrick J. LoPresti
1999-09-30 23:42 ` Patrick J. LoPresti
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).