From: "Robert Collins"
To: "Michael Beach"
Subject: RE: 1.1.3 and upwards: apparent bug with pthread_cond_wait() and/or signal()
Date: Wed, 01 May 2002 08:37:00 -0000

> -----Original Message-----
> From: Michael Beach [mailto:michaelb@ieee.org]
> Sent: Thursday, May 02, 2002 12:21 AM

> Thanks for taking the time to look at this issue, but I must
> disagree that this is the problem.

You're going to have to debug this yourself. I've given you my opinion :].

> If the test thread locks the mutex first, sure it will probably
> signal before the main thread is waiting, but that doesn't matter
> because the main thread

Does this sequence look plausible to you? I don't claim it is what's
happening, because the string output doesn't fit, but it illustrates
the race. On a dual-processor machine this is much more likely than on
a single-processor one.

thread - lock
thread - state=run
thread - signal
main - lock
main - test state (passes)
thread - test state (fails)
main - state = acknowledged
main - signal
thread wait
main - unlock
main - join
thread is hung.
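
For illustration only, here is a minimal sketch of a handshake of the kind traced above, written with the usual predicate loop around pthread_cond_wait() so that it does not depend on which thread takes the mutex first. The states "run" and "acknowledged" come from the trace; everything else (the names check, test_thread and run_handshake, and the use of a single condition variable) is assumed and is not taken from Michael's test program.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; only the "run" and "acknowledged"
 * states are taken from the trace in the message. */
enum state { STATE_INIT, STATE_RUN, STATE_ACKNOWLEDGED };

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static enum state state = STATE_INIT;

/* pthreads calls report errors through their return value, so check it. */
static void check(int rc, const char *what)
{
    if (rc != 0) {
        fprintf(stderr, "%s: %s\n", what, strerror(rc));
        abort();
    }
}

static void *test_thread(void *arg)
{
    (void)arg;
    check(pthread_mutex_lock(&lock), "thread lock");
    state = STATE_RUN;
    check(pthread_cond_signal(&cond), "thread signal");
    /* Re-test the predicate after every wakeup; a signal that was sent
     * before this wait started is caught by the test, not lost. */
    while (state != STATE_ACKNOWLEDGED)
        check(pthread_cond_wait(&cond, &lock), "thread wait");
    check(pthread_mutex_unlock(&lock), "thread unlock");
    return NULL;
}

/* Runs one handshake; returns 0 once both sides have completed. */
int run_handshake(void)
{
    pthread_t tid;
    void *result = NULL;

    state = STATE_INIT; /* no other thread exists yet */
    check(pthread_create(&tid, NULL, test_thread, NULL), "create");

    check(pthread_mutex_lock(&lock), "main lock");
    /* If the thread already set STATE_RUN, the wait is skipped. */
    while (state != STATE_RUN)
        check(pthread_cond_wait(&cond, &lock), "main wait");
    state = STATE_ACKNOWLEDGED;
    check(pthread_cond_signal(&cond), "main signal");
    check(pthread_mutex_unlock(&lock), "main unlock");

    /* A non-null result pointer is passed here, given the caution in
     * this thread about handing 0 to pthread_join. */
    check(pthread_join(tid, &result), "join");
    assert(state == STATE_ACKNOWLEDGED);
    return 0;
}
```

Calling run_handshake() from main() repeatedly is a cheap way to look for ordering-dependent hangs. One condition variable is enough in this sketch because at most one thread is waiting at any moment.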
What are we seeing:

main - lock
main - test state fails
main - wait
thread - lock
thread - state=run
thread - signal
-- test thread has signal()ed
thread - test state (fails)
-- test thread about to wait()...
thread wait
-- main thread wakes!
main - state = acknowledged
-- main thread about to signal()
main - signal
main - unlock
-- main thread waiting for exit...
thread should wake here.

> If the above hand-wavy explanation does not seem convincing,
...
> the different platforms does not seem to hold much water...

Without a few more output statements, I'll not buy into that. However,
I do accept your hand-waving, particularly since I've noticed something
useful out of this: pthread_join's argument should not be 0. I have to
dig up the spec to confirm this, but our code will segfault like crazy
on you as it stands.

> However, that said, I will be trying 1.3.10 to see if it makes a
> difference. If not, then I guess I will just have to make the move
> to the Win32 threading and synchronization APIs. Blech!

You could always help us debug the pthreads code... I wonder if the
recent patches I haven't reviewed properly yet address this. If you had
time, you could try them and see...

> > You should also _always_ test for the return value when using
> > pthreads calls. They don't throw exceptions and they don't set
> > errno, so the only way you can tell an error has occurred is to
> > record the return value.
>
> Yes I know. The reason for this sloppy coding is that this test
> program is
> ...

Please don't remove error handling. If I were to run this program I'd
expect to have error handling so I don't have to add it in. And running
the code without error handling won't help me identify anything
non-trivial.

Rob (Cygwin pthreads maintainer).
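
On the point about return values, a minimal sketch of what "record the return value" can look like in practice. The helper names report and trylock_locked_mutex are hypothetical. The example relies only on the POSIX behaviour that pthread_mutex_trylock() on an already-locked default mutex hands back EBUSY as its return value.

```c
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: log a failed pthreads call and pass its code on.
 * The code is the function's return value, so strerror() is applied to
 * rc rather than to errno. */
static int report(int rc, const char *what)
{
    if (rc != 0)
        fprintf(stderr, "%s failed: %s\n", what, strerror(rc));
    return rc;
}

/* Locks a fresh mutex, then tries to lock it again with trylock and
 * returns whatever code that second attempt produced. */
int trylock_locked_mutex(void)
{
    pthread_mutex_t m;
    int rc;

    rc = report(pthread_mutex_init(&m, NULL), "pthread_mutex_init");
    if (rc != 0)
        return rc;

    rc = report(pthread_mutex_lock(&m), "pthread_mutex_lock");
    if (rc != 0) {
        report(pthread_mutex_destroy(&m), "pthread_mutex_destroy");
        return rc;
    }

    /* The mutex is already held, so this call fails immediately and
     * the only evidence of that is the value it returns. */
    rc = report(pthread_mutex_trylock(&m), "pthread_mutex_trylock");

    report(pthread_mutex_unlock(&m), "pthread_mutex_unlock");
    report(pthread_mutex_destroy(&m), "pthread_mutex_destroy");
    return rc;
}
```

Code that discards these return values would carry on as though it held the lock, which is the sort of silent failure that makes a test program hard to debug.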