Message-ID: <46091181.4070402@homemail.com.au>
Date: Sat, 31 Mar 2007 07:15:00 -0000
From: Ross Johnson
To: Stefan Eilemann
CC: Pthreads-Win32 list
Subject: Re: pthread_join problem

Stefan Eilemann wrote:
> Hello,
>
> I am in the situation that a pthread_join does not return, even
> though the thread has called pthread_exit.
>
> I read the cleanup notes, but I think it does not apply here.
>
> I am using the C cleanup code. One thread calls pthread_exit,
> the other pthread_join. I've verified that the thread calling
> pthread_exit does the longjmp to the thread start code, which
> calls _endthreadex.
>
> The main thread calling pthread_join does hang in
> WaitForMultipleObjects.
>
> The problem only occurs when I am using some unrelated(?)
> external code (the Mellanox SDP Infiniband implementation),
> so it could be caused by that, or just be a race appearing
> with this code.
>
> There are other pthreads in my application, which terminate
> correctly with pthread_exit/pthread_join.
> Only one thread -
> the network receive thread ;)- does exhibit the problem.
>
> Do you have an idea what could be the cause of this problem?
> Anything else I could try to find the problem?

One possibility that fits the evidence: if the external code you
mentioned is a DLL with its own DllMain routine, that routine may be
interfering with pthreads-win32's thread exit cleanup. This would
happen after _endthreadex() is called, and you've verified that it is
called.

I don't know how Win32 decides which DllMain routines are called and in
what order (is it the order in which the DLLs were loaded?), but
pthreads-win32 relies on this mechanism to do some final cleanup and
status setting for each 'POSIX' thread. If that doesn't get done, I
imagine you could see symptoms like this. Pthreads-win32's DllMain()
calls pthread_win32_thread_detach_np() in
pthread_win32_attach_detach_np.c.

To check whether this cleanup is actually running, you could set up a
thread-specific data key with a destructor routine, have your problem
thread set the key to a non-null value, and then see whether the
destructor routine is called.

Regards.
Ross

>
> Best Regards,
>
> Stefan.
>
> PS: I've tested the Win64 version, and it works like a charm.
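
The thread-specific data check suggested in the reply could look
roughly like the sketch below. It uses only standard pthreads calls
(pthread_key_create, pthread_setspecific, pthread_join). The names
probe_key, probe_value, probe_destructor, problem_thread and
run_tsd_probe are hypothetical, and problem_thread is only a stand-in
for the real network receive thread, so treat this as a starting point
rather than a drop-in diagnostic.

```c
#include <pthread.h>
#include <stdio.h>

/* All names below are hypothetical, chosen for illustration only. */
static pthread_key_t probe_key;
static int destructor_ran = 0;
static int probe_value = 1;   /* its address serves as the non-null key value */

/* Run by the pthreads library at thread exit when the key value is non-null. */
static void probe_destructor(void *value)
{
    (void)value;
    destructor_ran = 1;
    fprintf(stderr, "probe: TSD destructor ran\n");
}

/* Stand-in for the problem thread (assumption: the real thread would
   set the key somewhere early in its own start routine). */
static void *problem_thread(void *arg)
{
    (void)arg;
    if (pthread_setspecific(probe_key, &probe_value) != 0)
        fprintf(stderr, "probe: pthread_setspecific failed\n");
    pthread_exit(NULL);
    return NULL;   /* not reached */
}

/* Returns 1 if the destructor ran, 0 if it did not, -1 on setup failure. */
int run_tsd_probe(void)
{
    pthread_t tid;

    destructor_ran = 0;
    if (pthread_key_create(&probe_key, probe_destructor) != 0)
        return -1;
    if (pthread_create(&tid, NULL, problem_thread, NULL) != 0) {
        pthread_key_delete(probe_key);
        return -1;
    }
    /* In the failing case this join may never return; the stderr message
       from the destructor (or its absence) is then the thing to watch. */
    if (pthread_join(tid, NULL) != 0) {
        pthread_key_delete(probe_key);
        return -1;
    }
    pthread_key_delete(probe_key);
    return destructor_ran;
}
```

Calling run_tsd_probe() from main and printing its result gives a quick
baseline. In the real application the pthread_setspecific call would go
into the problem thread itself, and if pthread_join hangs without the
destructor message ever appearing, that would be consistent with the
thread-detach cleanup not running.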