From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 5580 invoked by alias); 26 May 2005 16:22:20 -0000 Mailing-List: contact pthreads-win32-help@sources.redhat.com; run by ezmlm Precedence: bulk List-Subscribe: List-Archive: List-Post: List-Help: , Sender: pthreads-win32-owner@sources.redhat.com Received: (qmail 5532 invoked by uid 22791); 26 May 2005 16:22:07 -0000 Received: from wproxy.gmail.com (HELO wproxy.gmail.com) (64.233.184.204) by sourceware.org (qpsmtpd/0.30-dev) with ESMTP; Thu, 26 May 2005 16:22:07 +0000 Received: by wproxy.gmail.com with SMTP id 69so765220wra for ; Thu, 26 May 2005 09:22:04 -0700 (PDT) Received: by 10.54.26.33 with SMTP id 33mr1058249wrz; Thu, 26 May 2005 09:22:03 -0700 (PDT) Received: by 10.54.7.59 with HTTP; Thu, 26 May 2005 09:22:03 -0700 (PDT) Message-ID: <97ffb310505260922fb3fa59@mail.gmail.com> Date: Thu, 26 May 2005 16:22:00 -0000 From: Gottlob Frege Reply-To: Gottlob Frege To: Ross Johnson Subject: Re: pthread_once suggestions Cc: Pthreads-Win32 list In-Reply-To: <1117074885.9478.98.camel@desk.home> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Content-Disposition: inline References: <0IH100J9YK9N5G@mta3.srv.hcvlny.cv.net> <1117035766.9478.44.camel@desk.home> <1117074885.9478.98.camel@desk.home> X-SW-Source: 2005/txt/msg00093.txt.bz2 To get me back into the loop, can someone post a version with these latest ideas (pseudocode or whatever)? On 5/25/05, Ross Johnson wrote: > On Thu, 2005-05-26 at 01:42 +1000, Ross Johnson wrote: > > On Wed, 2005-05-25 at 06:40 -0400, Vladimir Kliatchko wrote: > > > Hi, > > > I did not have much time to look into it, but from the top of my head= it > > > appears that the problem you pointed out is similar to the problem wi= th the > > > initializing thread not being canceled but completing while the threa= d B is > > > suspended before creating the semaphore. This last problem was addres= sed by > > > simply rechecking the 'done' flag. Can't we similarly recheck 'starte= d' > > > flag? > > > > I don't see why this won't work, but it's really late here so I will > > continue to look at it tomorrow. If it does work then I may owe Gottlob > > an apology because he also used this flag in this way, and I'm hoping I > > had a good reason not to do the same, but I'm too tired to look tonight. >=20 > Good morning, >=20 > Sorry if I seem to be unnecessarily confusing this, but I need to > satisfy myself that all the older issues can truly be set aside. >=20 > Here's my explanation of why 'started' wasn't sufficient in the current > version (but is now in your version): >=20 > The current event based version uses a manual-reset event, which the new > initter must reset (the alternative using auto-reset and daisy chaining > resets was not acceptable) and priority manipulation was needed because > of this. The cancelled state flag is useful here and elsewhere because > it meant all of this extra complexity could be kept out of the no- > cancellation paths efficiently. >=20 > Your semaphore version removes the need for any extra post-cancellation > processing by the new initter thread, and I think I can agree - it's now > possible to simply recheck 'started' as you've now done. >=20 > I think it's looking much much better now and I'm very exited at the > prospects. >=20 > Thanks. > Ross >=20 > > I would love to have been wrong all along. > > > > > See the attached code. > > > > > > --vlad > > > > > > PS. > > > I did rerun all the regression tests but only on a single CPU system.= The > > > condition above I tested by sticking a 'sleep' call before creating t= he > > > semaphore. > > > Do you have a multi-CPU box around? > > > > Not I, but Tim Theisen runs the tests on his 4-way before I release a > > new version. He was certainly picking up problems that ran fine on my > > uni-processor. > > > > Regards. > > Ross > > > > > > > > -----Original Message----- > > > From: Ross Johnson [mailto:Ross.Johnson@homemail.com.au] > > > Sent: Wednesday, May 25, 2005 5:41 AM > > > To: Vladimir Kliatchko > > > Cc: Gottlob Frege > > > Subject: Re: pthread_once suggestions > > > > > > On Wed, 2005-05-25 at 17:48 +1000, Ross Johnson wrote: > > > > I think there is a problem after all. > > > > > > > > If thread A is running the init_routine and thread B is suspended > > > > immediately before it is about to increment numSemaphoreUsers, and > > > > thread A is cancelled, then: there may not be a semaphore yet for t= he > > > > cancelled thread A to release; thread B will eventually resume, cre= ate > > > > the semaphore, and then wait on it until a new thread C arrives to > > > > finally complete the init_routine and release thread B. > > > > > > > > If there are only ever 2 threads then thread B will wait forever. B= ut > > > > even if there are more than 2 threads, thread B should not have to = wait > > > > until thread C arrives. > > > > > > > > So I think it's still necessary to have a flag to tell potential wa= iters > > > > that the init_routine has been cancelled. > > > > > > > > Even so, I think the semaphore can still eliminate a lot of the cur= rent > > > > complexity, especially the priority management. > > > > > > Hi Vlad, > > > > > > Sorry, it's all coming back to me - but slowly. If I add the cancelled > > > state flag back in to prevent thread B waiting on the semaphore that = it > > > has just created (see above), then I seem to be returning to the > > > situation where starvation occurs. > > > > > > If thread C arrives and sets 'started' to true first, but suspends > > > before it can clear the cancelled flag, then if thread B (which has b= een > > > woken) has a higher priority than thread C, it could busy loop and > > > starve thread C of CPU, preventing it from ever clearing the cancelled > > > flag. This is a problem that Gottlob noted during the previous effort= to > > > tame this routine using events. It's what ultimately forced the > > > inclusion of the priority promotion code. > > > > > > Regards. > > > Ross > > > > > > > Is this correct? > > > > > > > > Regards. > > > > Ross > > > > > > > > On Wed, 2005-05-25 at 16:55 +1000, Ross Johnson wrote: > > > > > On Tue, 2005-05-24 at 20:56 -0400, Vladimir Kliatchko wrote: > > > > > > Dear Ross, > > > > > > I was looking at the implementation of the pthread_once and cam= e up > > > > > > with what seemed a simpler and more robust implementation. I wo= uld > > > > > > like to contribute this implementation. Pls, let me know what t= hink. > > > > > > > > > > > > Thx, > > > > > > --vlad > > > > > > > > > > Hi Vlad, > > > > > > > > > > This is great. It seems to have everything and I can't find fault= with > > > > > it - I'm still looking since I think there were some very obscure > > > > > scenarios, but you seem to have fundamentally eliminated them. > > > > > > > > > > This is actually much closer to the simple [working] version post= ed by > > > > > Gottlob Frege initially:- > > > > > http://sources.redhat.com/ml/pthreads-win32/2005/msg00029.html > > > > > which uses events but didn't include cancelation. Your's is even a > > > > > little simpler I think. > > > > > > > > > > The advantage your semaphore has over the event is that you can r= elease > > > > > just one waiter after a cancellation but still release all waiter= s when > > > > > done (PulseEvent is deprecated and dangerous and wasn't an option= ). If > > > > > there are really no problems with it then I think you've legitimi= sed the > > > > > whole approach, as opposed to the named or reference counting mut= ex > > > > > trick etc. > > > > > > > > > > And you've done it without changing pthread_once_t_ or PTHREAD_ON= CE_INIT > > > > > - so the ABI is unaffected. Fantastic! > > > > > > > > > > I presume you've already run the test suite, and once4.c in parti= cular > > > > > has passed? I'll let you know if any problems arise. Did you test= it on > > > > > a multi-processor system? > > > > > > > > > > Thank you. > > > > > Ross > > > > > > > > > > > The implementation: > > > > > > #define PTHREAD_ONCE_INIT { PTW32_FALSE, PTW32_FALSE, 0, = 0} > > > > > > > > > > > > > > > > > > > > > > > > struct pthread_once_t_ > > > > > > { > > > > > > int done; > > > > > > int started; > > > > > > int numSemaphoreUsers; > > > > > > HANDLE semaphore; > > > > > > }; > > > > > > > > > > > > static void PTW32_CDECL > > > > > > ptw32_once_init_routine_cleanup(void * arg) > > > > > > { > > > > > > pthread_once_t * once_control =3D (pthread_once_t *) arg; > > > > > > > > > > > > (void) PTW32_INTERLOCKED_EXCHANGE((LPLONG)&once_control->star= ted, > > > (LONG)PTW32_FALSE); > > > > > > > > > > > > if (InterlockedExchangeAdd((LPLONG)&once_control->semaphore, = 0L)) /* > > > MBR fence */ > > > > > > { > > > > > > ReleaseSemaphore(once_control->semaphore, 1, NULL); > > > > > > } > > > > > > } > > > > > > > > > > > > int > > > > > > pthread_once (pthread_once_t * once_control, void (*init_routin= e) > > > (void)) > > > > > > { > > > > > > int result; > > > > > > HANDLE sema; > > > > > > > > > > > > if (once_control =3D=3D NULL || init_routine =3D=3D NULL) > > > > > > { > > > > > > result =3D EINVAL; > > > > > > goto FAIL0; > > > > > > } > > > > > > else > > > > > > { > > > > > > result =3D 0; > > > > > > } > > > > > > > > > > > > while (!InterlockedExchangeAdd((LPLONG)&once_control->done, 0= L)) /* > > > Atomic Read */ > > > > > > { > > > > > > if (!PTW32_INTERLOCKED_EXCHANGE((LPLONG)&once_control->st= arted, > > > (LONG)PTW32_TRUE)) > > > > > > { > > > > > > > > > > > > #ifdef _MSC_VER > > > > > > #pragma inline_depth(0) > > > > > > #endif > > > > > > > > > > > > pthread_cleanup_push(ptw32_once_init_routine_clea= nup, > > > (void *) once_control); > > > > > > (*init_routine)(); > > > > > > pthread_cleanup_pop(0); > > > > > > > > > > > > #ifdef _MSC_VER > > > > > > #pragma inline_depth() > > > > > > #endif > > > > > > > > > > > > (void) > > > PTW32_INTERLOCKED_EXCHANGE((LPLONG)&once_control->done, (LONG)PTW32_T= RUE); > > > > > > > > > > > > /* > > > > > > * we didn't create the event. > > > > > > * it is only there if there is someone waiting. > > > > > > * Avoid using the global event_lock but still pr= event > > > SetEvent > > > > > > * from overwriting any 'lasterror' if the event = is > > > closed before we > > > > > > * are done with it. > > > > > > */ > > > > > > if > > > (InterlockedExchangeAdd((LPLONG)&once_control->semaphore, 0L)) /* MBR= fence > > > */ > > > > > > { > > > > > > ReleaseSemaphore(once_control->semaphore, > > > once_control->numSemaphoreUsers, NULL); > > > > > > } > > > > > > } > > > > > > else > > > > > > { > > > > > > > > > InterlockedIncrement((LPLONG)&once_control->numSemaphoreUsers); > > > > > > if > > > (!InterlockedExchangeAdd((LPLONG)&once_control->semaphore, 0L)) /* MB= R fence > > > */ > > > > > > { > > > > > > sema =3D CreateSemaphore(NULL, 0, INT_MAX, NU= LL); > > > > > > if > > > (PTW32_INTERLOCKED_COMPARE_EXCHANGE((PTW32_INTERLOCKED_LPLONG)&once_c= ontrol- > > > >semaphore, > > > > > > > > > (PTW32_INTERLOCKED_LONG)sema, > > > > > > > > > (PTW32_INTERLOCKED_LONG)0)) > > > > > > { > > > > > > CloseHandle(sema); > > > > > > } > > > > > > } > > > > > > > > > > > > /* > > > > > > * Check 'state' again in case the initting threa= d has > > > finished or cancelled > > > > > > * and left before seeing that there was an event= to > > > trigger. > > > > > > */ > > > > > > if (InterlockedExchangeAdd((LPLONG)&once_control-= >done, > > > 0L) || > > > > > > WaitForSingleObject(once_control->semaphore, > > > INFINITE)) > > > > > > { > > > > > > if (0 =3D=3D > > > InterlockedDecrement((LPLONG)&once_control->numSemaphoreUsers)) > > > > > > { > > > > > > /* we were last */ > > > > > > if ((sema =3D (HANDLE) > > > PTW32_INTERLOCKED_EXCHANGE((LPLONG)&once_control->semaphore, > > > > > > > > > (LONG)0))) > > > > > > { > > > > > > CloseHandle(sema); > > > > > > } > > > > > > } > > > > > > } > > > > > > } > > > > > > } > > > > > > > > > > > > /* > > > > > > * ------------ > > > > > > * Failure Code > > > > > > * ------------ > > > > > > */ > > > > > > FAIL0: > > > > > > return (result); > > > > > > > > > > > > } /* pthread_once= */ > > > > > > > > > >=20 >