Hi all.

I'm seeing some peculiar behaviour with pthreads-win32 and suspect there's a bug in the library.

Here's my example code:

    #include <pthread.h>
    #include <semaphore.h>
    #include <stdio.h>
    #include <errno.h>
    #include <assert.h>

    /* Child thread: post the semaphore once and exit. */
    void * thr(void * arg)
    {
        sem_post((sem_t*)arg);
        return 0;
    }

    int main()
    {
        sem_t sem;
        int error = 0;
        error = sem_init(&sem, 0, 0); // OK
        assert(!error);

        pthread_t thread;
        error = pthread_create(&thread, 0, thr, &sem); // OK
        assert(!error);

        /* Returns once the child has posted. */
        sem_wait(&sem);
        error = sem_destroy(&sem);
        if (error != 0)
        {
            error = errno; // errno == 16 (EBUSY)
            printf("errno = %d\n", error);
        }

        pthread_join(thread, 0);

        return error;
    }

So, here we get error 16 (0x10) in almost all runs, regardless of compile options (at least I couldn't find a working combination). The compiler is msvc 7.1 sp1.

I've read about some troubles with it in the BUGS file, but first, this one is unrelated to those as far as I can see, and second, all tests from the `tests' directory run with no errors, while this one fails even with the same compile options.

When I tried to debug it, it turned out that when main() is executing sem_destroy(), the child thread is still in sem_post(). I couldn't find out what's going on there and supposed this is some kind of race, which is why the subject is about races.

Google didn't find similar issues with the pthreads-win32 library.

I really hope this is my fault, but I have no idea where I'm wrong.

Thanks in advance.

PS I'll also try gcc for win32 and msvc8.0 later - I don't have them on this computer.

--
eof
Can I be removed from the mail list?

Thanks,
Liu Ye

-----Original Message-----
From: pthreads-win32-owner@sourceware.org [mailto:pthreads-win32-owner@sourceware.org] On Behalf Of Sergey Fokin
Sent: Tuesday, December 12, 2006 4:27 AM
To: pthreads-win32@sourceware.org
Subject: Pthread-win32 races?
Hi Sergey,

The library is working correctly since sem_destroy() is returning the error EBUSY as required and documented at:

http://sourceware.org/pthreads-win32/manual/sem_init.html

This is also in accordance with the Single Unix Specification. If it was hanging your program rather than returning the error then that would be a problem.

By the way, in your sample code you don't check the return code from the sem_post(), but the semaphore could already be destroyed at that point.

It would be better in this and similar cases to call sem_destroy() after the call to pthread_join(), or at least after you can guarantee that the semaphore is no longer required by any child threads.

A sem_t "handle" is not required to be unique in time, so it's possible to destroy a semaphore and init a new one having another purpose altogether, which then by chance occupies the same physical memory location, i.e. has the same "handle" (in pthreads-win32 this is just the pointer to the struct in memory), so a sema op somewhere may not fail even though, logically, it is no longer accessing the semaphore it should be, and the application may now be mysteriously badly behaved and difficult to debug.

Regards.
Ross
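The ordering suggested above, joining the child before destroying the semaphore and checking the result of sem_post(), could look roughly like the following. This is only a sketch: the struct post_job type and the function names post_once and run_join_then_destroy are made up for illustration and are not part of pthreads-win32 or of the original sample.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

/* Hypothetical bundle: the semaphore plus a slot for the worker's result. */
struct post_job {
    sem_t sem;
    int post_errno;   /* 0 on success, otherwise errno from sem_post() */
};

/* Worker: post once and record whether the post succeeded. */
static void *post_once(void *arg)
{
    struct post_job *job = arg;

    job->post_errno = (sem_post(&job->sem) == 0) ? 0 : errno;
    return NULL;
}

/* Returns 0 on success, otherwise an errno-style code. */
int run_join_then_destroy(void)
{
    struct post_job job;
    pthread_t thread;
    int rc;

    job.post_errno = 0;
    if (sem_init(&job.sem, 0, 0) != 0)
        return errno;

    rc = pthread_create(&thread, NULL, post_once, &job);
    if (rc != 0) {
        sem_destroy(&job.sem);
        return rc;
    }

    /* Wait for the post; retry if a signal interrupts the wait. */
    while (sem_wait(&job.sem) != 0) {
        if (errno != EINTR) {
            rc = errno;
            break;
        }
    }

    /* Join first: once this returns, the worker has left sem_post(). */
    pthread_join(thread, NULL);
    if (rc == 0)
        rc = job.post_errno;

    /* Only now is the semaphore destroyed. */
    if (sem_destroy(&job.sem) != 0 && rc == 0)
        rc = errno;
    return rc;
}
```

Calling run_join_then_destroy() from main() and returning its result gives the same program shape as the original sample, with the destroy moved after the join.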
Hello.

> The library is working correctly since sem_destroy() is returning the
> error EBUSY as required and documented at:
>
> http://sourceware.org/pthreads-win32/manual/sem_init.html
>
> This is also in accordance with the Single Unix Specification. If it was
> hanging your program rather than returning the error then that would be
> a problem.

The manual says:

    The sem_destroy function sets errno to the following error code on error:
    EBUSY if some threads are currently blocked waiting on the semaphore.

But there are obviously no threads waiting on the semaphore, are there?

> By the way, in your sample code you don't check the return code from the
> sem_post(), but the semaphore could already be destroyed at that point.

It couldn't (or rather shouldn't, because actually it does). The semaphore is destroyed only after sem_wait(), and sem_wait() returns (should return) only after sem_post() succeeds. Did I understand that correctly?

> It would be better in this and similar cases to call sem_destroy() after
> the call to pthread_join(), or at least after you can guarantee that the
> semaphore is no longer required by any child threads.

In this example I can destroy the semaphore after pthread_join(). But in my program the logic is more complicated and the sem_post()'ing thread doesn't finish after sem_post(). And again the same question: does sem_post() perform atomic access to the semaphore, or should I perform additional synchronisation to access the semaphore? Synchronizing access to a semaphore looks strange, don't you think?

This quotation is from the Linux sem_post manual:

    sem_post atomically increases the count of the semaphore pointed to by
    sem. This function never blocks and can safely be used in asynchronous
    signal handlers.

So I think the supplied code must be correct according to the manual.

> A sem_t "handle" is not required to be unique in time, so it's possible
> to destroy a semaphore and init a new one having another purpose
> altogether, which then by chance occupies the same physical memory
> location, i.e. has the same "handle" (in pthreads-win32 this is just the
> pointer to the struct in memory), so a sema op somewhere may not fail
> even though, logically, it is no longer accessing the semaphore it
> should be, and the application may now be mysteriously badly behaved and
> difficult to debug.

Yes, I understand this. And there's no chance of accidentally accessing a destroyed semaphore.

--
eof
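For the case described above, where the posting thread keeps running after sem_post(), one way to "guarantee that the semaphore is no longer required" is an explicit handshake: the worker sets a flag under a mutex once sem_post() has returned, and the owner waits for that flag before calling sem_destroy(). The sketch below is an assumption-laden illustration, not something taken from the thread or from pthreads-win32: struct shared, worker and run_handshake_then_destroy are hypothetical names, and a conforming implementation arguably should not need this extra step at all, which is exactly the point being argued.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

/* Hypothetical shared state; none of these names come from the thread. */
struct shared {
    sem_t sem;
    pthread_mutex_t lock;
    pthread_cond_t cond;
    int post_returned;  /* 1 once the worker is back from sem_post() */
    int keep_running;   /* worker stays alive while this is 1 */
};

static void *worker(void *arg)
{
    struct shared *s = arg;

    sem_post(&s->sem);

    pthread_mutex_lock(&s->lock);
    s->post_returned = 1;               /* worker will not touch sem again */
    pthread_cond_broadcast(&s->cond);
    while (s->keep_running)             /* stand-in for further work */
        pthread_cond_wait(&s->cond, &s->lock);
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

/* Returns 0 if sem_destroy() succeeded, otherwise an errno-style code. */
int run_handshake_then_destroy(void)
{
    struct shared s;
    pthread_t thread;
    int rc;

    s.post_returned = 0;
    s.keep_running = 1;
    pthread_mutex_init(&s.lock, NULL);
    pthread_cond_init(&s.cond, NULL);
    if (sem_init(&s.sem, 0, 0) != 0) {
        rc = errno;
        goto out;
    }
    rc = pthread_create(&thread, NULL, worker, &s);
    if (rc != 0) {
        sem_destroy(&s.sem);
        goto out;
    }

    while (sem_wait(&s.sem) != 0 && errno == EINTR)
        ;                               /* retry if interrupted by a signal */

    /* Wait until the worker confirms it has returned from sem_post(). */
    pthread_mutex_lock(&s.lock);
    while (!s.post_returned)
        pthread_cond_wait(&s.cond, &s.lock);
    pthread_mutex_unlock(&s.lock);

    /* The worker is still running, but it is done with the semaphore. */
    rc = (sem_destroy(&s.sem) == 0) ? 0 : errno;

    /* Let the worker finish and collect it. */
    pthread_mutex_lock(&s.lock);
    s.keep_running = 0;
    pthread_cond_broadcast(&s.cond);
    pthread_mutex_unlock(&s.lock);
    pthread_join(thread, NULL);
out:
    pthread_cond_destroy(&s.cond);
    pthread_mutex_destroy(&s.lock);
    return rc;
}
```

The cost is an extra mutex and condition variable per semaphore, which is the "synchronizing access to a semaphore" that the message above calls strange; it is shown here only as a defensive workaround.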
Sergey Fokin wrote:
> Hello.
>
>> The library is working correctly since sem_destroy() is returning the
>> error EBUSY as required and documented at:
>>
>> http://sourceware.org/pthreads-win32/manual/sem_init.html
>>
>> This is also in accordance with the Single Unix Specification. If it was
>> hanging your program rather than returning the error then that would be
>> a problem.
>
> The sem_destroy function sets errno to the following error code on error:
> EBUSY if some threads are currently blocked waiting on the semaphore.
>
> But there's obviously no threads waiting on semaphore, is there?

My apologies - I didn't even see the sem_wait() even though it's right before the sem_destroy(), and regardless, as you say, it should only return EBUSY if someone is waiting, not posting. It's been a while but the code looks like it tries to ensure this. I'll have to look closer when I'm more awake.

>> By the way, in your sample code you don't check the return code from the
>> sem_post(), but the semaphore could already be destroyed at that point.
>
> It couldn't (shouldn't, because actually it does). Because semaphore
> is destroyed only after sem_wait(), but sem_wait() returns (should
> return) only after sem_post() succeeds. Did I understood right?

As above - my mistake.

Ross
Sergey Fokin wrote:
> Hi all.
>
> I have some peculiarities with pthread-win32 and suppose there's a bug
> in library.

Sergey,

Although I haven't tested the fix yet, it is possible in the current version for sem_destroy to incorrectly return EBUSY.

The problem is in sem_destroy.c. In the sample case, sem_post releases the waiting thread which can then enter sem_destroy while sem_post still holds the mutex guarding the semaphore's state. The EBUSY error results from sem_destroy's "trylock" attempt to acquire that mutex. It should block instead.

I'm about to test a fix for this, which if successful will also remove another race that occurs around invalidating the destroyed semaphore.

Ross
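The effect described in the diagnosis can be reproduced in miniature with a plain pthread mutex. The sketch below is a toy model only: struct toy_sem and the three functions are invented for illustration and do not reflect the actual contents of sem_destroy.c. It assumes a default (non-recursive) mutex, for which pthread_mutex_trylock() returns EBUSY whenever the mutex is already locked. To keep the sketch deterministic, a single thread plays both the poster and the destroyer.

```c
#include <errno.h>
#include <pthread.h>

/* Toy stand-in for a semaphore whose state is guarded by a mutex. */
struct toy_sem {
    pthread_mutex_t lock;
    int value;
};

/* Destroy variant that gives up if the guard mutex is currently held. */
int toy_destroy_trylock(struct toy_sem *s)
{
    int rc = pthread_mutex_trylock(&s->lock);

    if (rc != 0)
        return rc;              /* EBUSY while a poster holds the lock */
    pthread_mutex_unlock(&s->lock);
    return pthread_mutex_destroy(&s->lock);
}

/* Destroy variant that waits for the poster to leave its critical section. */
int toy_destroy_blocking(struct toy_sem *s)
{
    int rc = pthread_mutex_lock(&s->lock);

    if (rc != 0)
        return rc;
    pthread_mutex_unlock(&s->lock);
    return pthread_mutex_destroy(&s->lock);
}

/* Returns 0 if the toy model behaves as described, -1 otherwise. */
int demo_trylock_race(void)
{
    struct toy_sem s;
    int rc_busy, rc_ok;

    pthread_mutex_init(&s.lock, NULL);
    s.value = 0;

    pthread_mutex_lock(&s.lock);        /* "poster" enters its critical section */
    s.value++;                          /* the waiter counts as released here */
    rc_busy = toy_destroy_trylock(&s);  /* "waiter" races ahead and tries to destroy */
    pthread_mutex_unlock(&s.lock);      /* "poster" finally leaves */

    rc_ok = toy_destroy_blocking(&s);   /* lock is free now, so destroy succeeds */

    return (rc_busy == EBUSY && rc_ok == 0) ? 0 : -1;
}
```

In a real two-thread run the blocking variant would simply wait for the few instructions the poster still needs, which matches the remark above that sem_destroy "should block instead".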