* SIGILL with pthreads and sockets
@ 2001-08-04 18:37 Wesel
2001-08-04 18:41 ` Christopher Faylor
2001-08-04 19:26 ` Robert Collins
0 siblings, 2 replies; 4+ messages in thread
From: Wesel @ 2001-08-04 18:37 UTC (permalink / raw)
To: cygwin
And bears, oh my.
I've been trying to make some very simple proxy server software (just an
Advertisement filter) and not having much luck. I managed to track the problem
down to the function select() and those for resolving host names. When used in
child threads, these functions somehow render the system unstable so that later,
during an innocuous program statement, a SIGILL is raised.
I looked up SIGILL and it said either it was caused by either a hardware problem or
a problem with the compiler itself. I'm hoping desperately that it's not a hardware
error as the only thing that sucks worse than my modem is my financial standing.
We're talking student level poor, folks. :) My only solace lies in a compiler error,
and the generous aid of you all.
Anyway, here's the code that produces a SIGILL. By commenting out the "#define
THREADED" line, I made all errors go away (as well as any semblance of speed or
efficiency). Could somebody wiser in the ways of cygwin please tell me if I'm
running up against some unforseen or little-known compiler problem? I'm really at a
loss why it doesn't work.
Some notes before the code.
1) Most of the time my threads run synchronously, only one running at a time. Code
I put between the UNLOCK and the LOCK macros consists of blocking functions, select,
gethostname, and such. I repeat, most of the time the thread is LOCKed. Outside of
between the UNLOCK and LOCK macros, shared resources can not be accessed at the
same time.
2) My mysterious lock_function is a rather lame cludge that checks to make sure I'm
not calling pthread_mutex_lock twice in a row in a thread.
3) threadcounter is a local variable for each thread. It is initialized to a
constantly incrementing g_threadcounter, so every thread has a unique number
starting from 0 which is the main thread, and continuing 1 through TEST_SET which
are the child threads. I could have used pthread_self(), but where's the fun in
reading hexadecimal anyway? :)
4) The hosts string array is not intended to infringe upon any copyrights, being
that it's as many URLs as I could think up in 5 minutes. Please don't sue me,
Disney.
File: test.cpp
---
#include <arpa/inet.h>
#include <sys/socket.h>
#include <netdb.h>
#include <unistd.h>
#include <errno.h>//for all our possible error message
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <map>//for watching thread locks
#define THREADED
#define TRUE 1
struct SocketInfo
{
SocketInfo() {}
SocketInfo(const SocketInfo& sp)
{
address = sp.address;
socket = sp.socket;
}
sockaddr_in address;
int socket;
const char* host;
};
void Test4(void);
void* Test4Thread(void*);
int main(int argc, char* argv[])
{
try
{
Test4();
}
catch(int error)
{
printf("Feep! %s\n",strerror(error));
}
catch(...)
{
puts("Feeperific!");
throw;
}
return 0;
}
#ifdef THREADED
pthread_mutex_t popcorn = PTHREAD_MUTEX_INITIALIZER;
void lock_function(int which, int inc)
{
static map<int,int> lock_test;
lock_test[which] += inc;
if(lock_test[which]>1)
{
printf("Feep! %d thread locked itself twice!\n", which);
exit(0);
}
if(lock_test[which]<0)
{
printf("Feep! %d thread unlocked itself twice!\n", which);
exit(0);
}
}
#define LOCK { \
pthread_mutex_lock(&popcorn); \
lock_function(threadcounter, 1); \
printf("%d> Lock\n",threadcounter); \
}
#define UNLOCK { \
printf("%d> Unlock\n",threadcounter);\
lock_function(threadcounter, -1);\
pthread_mutex_unlock(&popcorn); \
}
#else
#define LOCK
#define UNLOCK
#endif
#define DESTPORT 80
int g_threadcounter = 0;
int sock_size = sizeof(sockaddr_in);
#define TEST_SET 15
//Make 10 connections.
SocketInfo dest[TEST_SET];
pthread_t t_id[TEST_SET];
char* hosts[TEST_SET] = {
"transform.to",
"integral.org",
"www.google.com",
"altavista.com",
"208.180.232.33",
"www.gamefaqs.com",
"204.71.200.74",
"www.pokemon.com",
"www.disney.com",
"216.218.194.6",
"www.ucdavis.edu",
"www.cnet.com",
"www.gnu.org",
"www.landfield.com",
"216.200.16.61"};
int yes = 1;
void* Test4thread(void* arg) {
int threadcounter = ++g_threadcounter;
LOCK;
SocketInfo& dest = *((SocketInfo*) arg);
hostent* hp = NULL;
try
{
printf("Thread #%d starting!\n",threadcounter);
bzero((char*) dest.address.sin_zero, 8);
// zero the rest of the struct
dest.address.sin_family = AF_INET;
// host byte order
dest.address.sin_port = htons(DESTPORT);
// short, network byte order
UNLOCK;
printf("Thread #%d resolving!\n",threadcounter);
printf("Resolving %s...\n", dest.host);
dest.address.sin_addr.s_addr = inet_addr(dest.host);
if(dest.address.sin_addr.s_addr == (unsigned)-1)
{
//host is not an IP address. Attempt to resolve...
hp = gethostbyname(dest.host);
if (hp)
{
printf("%d> Host! %s\n", threadcounter, hp->h_name);
dest.address.sin_family = hp->h_addrtype;
bcopy(hp->h_addr, (caddr_t)&dest.address.sin_addr, hp->h_length);
}
else
{
printf("Unknown host %s\n", dest.host);
return arg;
}
}
printf("Thread #%d done resolving.\n",threadcounter);
LOCK;
int right_fd = connect(dest.socket,(sockaddr*) &dest.address,sock_size);
if(right_fd == -1)
{
puts("Destination would not connect!");
printf("%s %d\n", _sys_errlist[errno], threadcounter);
return arg;
}
fd_set sockfd;
timeval timeout;
timeout.tv_sec = 1;
timeout.tv_usec = 0;
FD_ZERO(&sockfd);
FD_SET(dest.socket,&sockfd);
printf("Thread #%d selecting!\n",threadcounter);
UNLOCK;
if(select(dest.socket+1, &sockfd, NULL, NULL, &timeout)<0)
{
printf("%d> Feep! %s", threadcounter, strerror(errno));
return arg;
}
LOCK;
printf("Thread #%d done selecting!\n",threadcounter);
}
catch(...) { puts("Feeperdeep"); }
close(dest.socket);
UNLOCK;
return arg;
}
void Test4(void) {
setbuf(stdout,NULL);
int threadcounter = g_threadcounter;
int i = 0;
for(i = 0; i < TEST_SET; i++)
{
dest[i].host = hosts[i];
if ((dest[i].socket = socket(AF_INET, SOCK_STREAM, 0)) == -1) {
perror("socket");
exit(1);
}
if (setsockopt(dest[i].socket,SOL_SOCKET,SO_REUSEADDR,&yes,sizeof(int)) == -1) {
perror("setsockopt");
exit(1);
}
}
LOCK; //Wait for it...
for(i = 0; i < TEST_SET; i++)
{
t_id[i] = new pthread_t;
#ifdef THREADED
pthread_create (t_id + i, NULL, Test4thread, (void*) (dest + i));
#else
Test4thread((void*) (dest + i));
#endif
}
UNLOCK; //Go!
for(i = 0; i < TEST_SET; i++)
{
#ifdef THREADED
pthread_join(t_id[i],NULL);
#endif
LOCK;
printf("Thread %d joined.\n",i+1);
UNLOCK;
delete t_id[i];
}
threadcounter--;
puts("This is after threads.");
return;
}
---
Get the picture? Basically it creates a bunch of sockets, pairs them up with a
host name in my SocketInfo structure, then has 10 baby threads resolve the host
names, connect the sockets and wait for readable data. Since I'm connecting at
the HTTP port, there will never be readable data until I send "GET somefile.html"
or something. Therefore, the select functions all timeout, then the threads clean
up and exit harmlessly.
Well maybe not so harmlessly. After thread #6 prints "Thread #6 Done selecting"
and UNLOCKS, a SIGILL happens. It's always thread #6 in GDB. Thread #9 when
not using GDB. I don't know why. It happens sometime while returning from the
lock_function function, according to GDB.
I'd appreciate it if someone would tell me if this is a problem with cygwin
itself, or with the nut on the end of the keyboard.
Wesel
--
Please do not feed me Twinkies
--
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
Bug reporting: http://cygwin.com/bugs.html
Documentation: http://cygwin.com/docs.html
FAQ: http://cygwin.com/faq/
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: SIGILL with pthreads and sockets
2001-08-04 18:37 SIGILL with pthreads and sockets Wesel
@ 2001-08-04 18:41 ` Christopher Faylor
2001-08-04 19:14 ` Wesel
2001-08-04 19:26 ` Robert Collins
1 sibling, 1 reply; 4+ messages in thread
From: Christopher Faylor @ 2001-08-04 18:41 UTC (permalink / raw)
To: cygwin; +Cc: wassdamo
On Sat, Aug 04, 2001 at 06:50:32PM -0700, Wesel wrote:
>I'd appreciate it if someone would tell me if this is a problem with cygwin
>itself, or with the nut on the end of the keyboard.
I may be wrong but I think it is unlikely that anyone is going to be willing
to debug a program of this size.
If you can get this down to a simpler test case it will be much more likely to
be investigated.
cgf
--
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
Bug reporting: http://cygwin.com/bugs.html
Documentation: http://cygwin.com/docs.html
FAQ: http://cygwin.com/faq/
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: SIGILL with pthreads and sockets
2001-08-04 18:41 ` Christopher Faylor
@ 2001-08-04 19:14 ` Wesel
0 siblings, 0 replies; 4+ messages in thread
From: Wesel @ 2001-08-04 19:14 UTC (permalink / raw)
To: Christopher Faylor
Christopher Faylor wrote:
>
> I may be wrong but I think it is unlikely that anyone is going to be willing
> to debug a program of this size.
Yes, I was afraid of that. I've been pruning it for a month now, trying
to get it down to a reasonable size.
>
> If you can get this down to a simpler test case it will be much more likely to
> be investigated.
Would you believe it's about a fourth the size of the full program? :/
I'll try to cut it down more then... the problem only comes up when I
combine sockets and pthreads, both of which take a lot of code to
handle. I'm not sure exactly what I can eliminate yet.
Thanks for the speedy reply!
Wesel
Who still isn't completely set up for this mailing list, so please
ignore any transient header manipulations.
--
Please do not feed me Twinkies
--
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
Bug reporting: http://cygwin.com/bugs.html
Documentation: http://cygwin.com/docs.html
FAQ: http://cygwin.com/faq/
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: SIGILL with pthreads and sockets
2001-08-04 18:37 SIGILL with pthreads and sockets Wesel
2001-08-04 18:41 ` Christopher Faylor
@ 2001-08-04 19:26 ` Robert Collins
1 sibling, 0 replies; 4+ messages in thread
From: Robert Collins @ 2001-08-04 19:26 UTC (permalink / raw)
To: Wesel; +Cc: cygwin
On 04 Aug 2001 18:50:32 -0700, Wesel wrote:
> And bears, oh my.
>
> I've been trying to make some very simple proxy server software (just an
> Advertisement filter) and not having much luck. I managed to track the problem
> down to the function select() and those for resolving host names. When used in
> child threads, these functions somehow render the system unstable so that later,
> during an innocuous program statement, a SIGILL is raised.
Ok, pragmatic solution: just use the precompiled squid proxy server or
modify the squid source to do what you need. Onto the less pragmatic
solution...
> Anyway, here's the code that produces a SIGILL. By commenting out the "#define
> THREADED" line, I made all errors go away (as well as any semblance of speed or
> efficiency). Could somebody wiser in the ways of cygwin please tell me if I'm
> running up against some unforseen or little-known compiler problem? I'm really at a
> loss why it doesn't work.
What version of cygwin are you using (uname -a)?
> Some notes before the code.
> 1) Most of the time my threads run synchronously, only one running at a time. Code
> I put between the UNLOCK and the LOCK macros consists of blocking functions, select,
> gethostname, and such. I repeat, most of the time the thread is LOCKed. Outside of
> between the UNLOCK and LOCK macros, shared resources can not be accessed at the
> same time.
Good.
> 2) My mysterious lock_function is a rather lame cludge that checks to make sure I'm
> not calling pthread_mutex_lock twice in a row in a thread.
If you need a kludge to prevent double locking, there is likely a design
issue with your program.
> 3) threadcounter is a local variable for each thread. It is initialized to a
> constantly incrementing g_threadcounter, so every thread has a unique number
> starting from 0 which is the main thread, and continuing 1 through TEST_SET which
> are the child threads. I could have used pthread_self(), but where's the fun in
> reading hexadecimal anyway? :)
<shrug>.
> 4) The hosts string array is not intended to infringe upon any copyrights, being
> that it's as many URLs as I could think up in 5 minutes. Please don't sue me,
> Disney.
>
> File: test.cpp
> ---
> #include <arpa/inet.h>
> #include <sys/socket.h>
> #include <netdb.h>
> #include <unistd.h>
> #include <errno.h>//for all our possible error message
>
> #include <stdio.h>
> #include <stdlib.h>
>
> #include <pthread.h>
> #include <map>//for watching thread locks
Whats in <map> ? My ignorance is showing here.
> #define THREADED
>
> #define TRUE 1
>
> struct SocketInfo
> {
> SocketInfo() {}
>
> SocketInfo(const SocketInfo& sp)
> {
> address = sp.address;
> socket = sp.socket;
> }
> sockaddr_in address;
> int socket;
> const char* host;
> };
>
> void Test4(void);
> void* Test4Thread(void*);
>
> int main(int argc, char* argv[])
> {
> try
> {
> Test4();
> }
> catch(int error)
> {
> printf("Feep! %s\n",strerror(error));
> }
> catch(...)
> {
> puts("Feeperific!");
> throw;
> }
>
> return 0;
> }
>
> #ifdef THREADED
>
> pthread_mutex_t popcorn = PTHREAD_MUTEX_INITIALIZER;
>
> void lock_function(int which, int inc)
> {
> static map<int,int> lock_test;
>
> lock_test[which] += inc;
>
> if(lock_test[which]>1)
> {
> printf("Feep! %d thread locked itself twice!\n", which);
> exit(0);
> }
>
> if(lock_test[which]<0)
> {
> printf("Feep! %d thread unlocked itself twice!\n", which);
> exit(0);
> }
> }
>
> #define LOCK { \
> pthread_mutex_lock(&popcorn); \
> lock_function(threadcounter, 1); \
> printf("%d> Lock\n",threadcounter); \
> }
> #define UNLOCK { \
> printf("%d> Unlock\n",threadcounter);\
> lock_function(threadcounter, -1);\
> pthread_mutex_unlock(&popcorn); \
> }
>
> #else
>
> #define LOCK
> #define UNLOCK
>
> #endif
>
> #define DESTPORT 80
>
> int g_threadcounter = 0;
> int sock_size = sizeof(sockaddr_in);
>
>
> #define TEST_SET 15
> //Make 10 connections.
>
> SocketInfo dest[TEST_SET];
> pthread_t t_id[TEST_SET];
> char* hosts[TEST_SET] = {
> "transform.to",
> "integral.org",
> "www.google.com",
> "altavista.com",
> "208.180.232.33",
> "www.gamefaqs.com",
> "204.71.200.74",
> "www.pokemon.com",
> "www.disney.com",
> "216.218.194.6",
> "www.ucdavis.edu",
> "www.cnet.com",
> "www.gnu.org",
> "www.landfield.com",
> "216.200.16.61"};
>
> int yes = 1;
>
> void* Test4thread(void* arg) {
>
> int threadcounter = ++g_threadcounter;
You have a race here. Every thread _can_ end up with the same
threadcounter.
> LOCK;
>
> SocketInfo& dest = *((SocketInfo*) arg);
I may be off my rocker here, but shouldn't this be
SocketInfo dest * = (SocketInfo *) arg;
otherwise there is no point having a globally declared dest[] anyway.
also relevant is the duplicate name that can lead to scope confusion
BTCATK.
> hostent* hp = NULL;
>
> try
> {
>
> printf("Thread #%d starting!\n",threadcounter);
>
>
> bzero((char*) dest.address.sin_zero, 8);
> // zero the rest of the struct
> dest.address.sin_family = AF_INET;
> // host byte order
> dest.address.sin_port = htons(DESTPORT);
> // short, network byte order
>
>
> UNLOCK;
> printf("Thread #%d resolving!\n",threadcounter);
> printf("Resolving %s...\n", dest.host);
You have races here: printf to console is not guaranteed threadsafe.
> dest.address.sin_addr.s_addr = inet_addr(dest.host);
> if(dest.address.sin_addr.s_addr == (unsigned)-1)
> {
> //host is not an IP address. Attempt to resolve...
> hp = gethostbyname(dest.host);
> if (hp)
> {
> printf("%d> Host! %s\n", threadcounter, hp->h_name);
> dest.address.sin_family = hp->h_addrtype;
> bcopy(hp->h_addr, (caddr_t)&dest.address.sin_addr, hp->h_length);
> }
> else
> {
> printf("Unknown host %s\n", dest.host);
> return arg;
> }
> }
> printf("Thread #%d done resolving.\n",threadcounter);
> LOCK;
>
> int right_fd = connect(dest.socket,(sockaddr*) &dest.address,sock_size);
>
> if(right_fd == -1)
> {
> puts("Destination would not connect!");
> printf("%s %d\n", _sys_errlist[errno], threadcounter);
> return arg;
Where's your UNLOCK?
> }
>
> fd_set sockfd;
> timeval timeout;
> timeout.tv_sec = 1;
> timeout.tv_usec = 0;
>
>
> FD_ZERO(&sockfd);
> FD_SET(dest.socket,&sockfd);
>
> printf("Thread #%d selecting!\n",threadcounter);
> UNLOCK;
> if(select(dest.socket+1, &sockfd, NULL, NULL, &timeout)<0)
> {
> printf("%d> Feep! %s", threadcounter, strerror(errno));
> return arg;
> }
> LOCK;
> printf("Thread #%d done selecting!\n",threadcounter);
>
> }
> catch(...) { puts("Feeperdeep"); }
>
> close(dest.socket);
>
> UNLOCK;
>
> return arg;
> }
>
> void Test4(void) {
>
> setbuf(stdout,NULL);
>
> int threadcounter = g_threadcounter;
if g_threadcounter is your global tool to id threads, then this is
suspect: you need int *threadcounter = &g_threadcounter;
> int i = 0;
>
> for(i = 0; i < TEST_SET; i++)
> {
> dest[i].host = hosts[i];
>
> if ((dest[i].socket = socket(AF_INET, SOCK_STREAM, 0)) == -1) {
> perror("socket");
> exit(1);
> }
>
> if (setsockopt(dest[i].socket,SOL_SOCKET,SO_REUSEADDR,&yes,sizeof(int)) == -1) {
> perror("setsockopt");
> exit(1);
> }
>
> }
>
> LOCK; //Wait for it...
> for(i = 0; i < TEST_SET; i++)
> {
> t_id[i] = new pthread_t;
This is wrong. pthreads is a C interface, not a C++ interface. If you
want to initialize a pthread_t variable before pthread_create, use
memset (&t_id[i], '\0', sizeof(pthread_t)); It shouldn't be causing your
bug, but it will lead to memory leaks at the very least.
> #ifdef THREADED
> pthread_create (t_id + i, NULL, Test4thread, (void*) (dest + i));
This is not intuitive to read: even though it is functionally correct.
I'd suggest
pthread_create (&t_id[i], NULL, Test4thread, (void *) (&dest[i]));
> #else
> Test4thread((void*) (dest + i));
> #endif
> }
> UNLOCK; //Go!
>
> for(i = 0; i < TEST_SET; i++)
> {
> #ifdef THREADED
> pthread_join(t_id[i],NULL);
> #endif
> LOCK;
> printf("Thread %d joined.\n",i+1);
> UNLOCK;
> delete t_id[i];
again, this is wrong, new and delete aren't valid for pthread
operations. I have _no idea_ what sideeffects you could be creating by
doing this. You are trying to free an opaque pointer. Your compiler
*should* be spitting the dummy at this.
> }
>
> threadcounter--;
>
> puts("This is after threads.");
> return;
> }
> ---
>
> Get the picture? Basically it creates a bunch of sockets, pairs them up with a
> host name in my SocketInfo structure, then has 10 baby threads resolve the host
> names, connect the sockets and wait for readable data. Since I'm connecting at
> the HTTP port, there will never be readable data until I send "GET somefile.html"
> or something. Therefore, the select functions all timeout, then the threads clean
> up and exit harmlessly.
>
> Well maybe not so harmlessly. After thread #6 prints "Thread #6 Done selecting"
> and UNLOCKS, a SIGILL happens. It's always thread #6 in GDB. Thread #9 when
> not using GDB. I don't know why. It happens sometime while returning from the
> lock_function function, according to GDB.
>
> I'd appreciate it if someone would tell me if this is a problem with cygwin
> itself, or with the nut on the end of the keyboard.
#0 tell us what version of cygwin this happens on.
#1 build yourself a debug version of cygwin.
#2 break point at the line before failing, and then breakpoint the
fialing system call, that will get you debugging intop the cygwin1.dll
itself.
#3 I'd do the following, if the code where mine.
1) rather than arrays of data for per thread stuff use pthread_key
2) LOCK and UNLOCK sparingly rather than mostly LOCKED. Just lock around
printf's and other global behaviour.
Rob
>
> Wesel
> --
>
> Please do not feed me Twinkies
>
> --
> Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
> Bug reporting: http://cygwin.com/bugs.html
> Documentation: http://cygwin.com/docs.html
> FAQ: http://cygwin.com/faq/
>
--
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
Bug reporting: http://cygwin.com/bugs.html
Documentation: http://cygwin.com/docs.html
FAQ: http://cygwin.com/faq/
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2001-08-04 19:26 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2001-08-04 18:37 SIGILL with pthreads and sockets Wesel
2001-08-04 18:41 ` Christopher Faylor
2001-08-04 19:14 ` Wesel
2001-08-04 19:26 ` Robert Collins
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).