From mboxrd@z Thu Jan 1 00:00:00 1970
From: "frankbarrus_sw at shaggy dot cc"
To: glibc-bugs@sourceware.org
Subject: [Bug nptl/25847] pthread_cond_signal failed to wake up
 pthread_cond_wait due to a bug in undoing stealing
Date: Wed, 10 May 2023 21:29:15 +0000

https://sourceware.org/bugzilla/show_bug.cgi?id=25847

--- Comment #60 from Frank Barrus ---
Comment on attachment 13419
  --> https://sourceware.org/bugzilla/attachment.cgi?id=13419
proposed lost wakeup fix with g_signals relative to g1_start and no signal
stealing

> pthreads NPTL: lost wakeup fix 2
>
> This fixes the lost wakeup (from a bug in signal stealing) with a change
> in the usage of g_signals[] in the condition variable internal state.
> It also completely eliminates the concept and handling of signal stealing,
> as well as the need for signalers to block to wait for waiters to wake
> up every time there is a G1/G2 switch.  This greatly reduces the average
> and maximum latency for pthread_cond_signal.
>
> The g_signals[] field now contains a signal count that is relative to
> the current g1_start value.  Since it is a 32-bit field, and the LSB is
> still reserved (though no longer used), it has a 31-bit value that
> corresponds to the low 31 bits of the sequence number in g1_start.
> (Since g1_start also has an LSB flag, this means bits 31:1 in g_signals
> correspond to bits 31:1 in g1_start, plus the current signal count.)
>
> By making the signal count relative to g1_start, there is no longer
> any ambiguity or A/B/A issue, and thus any checks before blocking,
> including the futex call itself, are guaranteed not to block if the G1/G2
> switch occurs, even if the signal count remains the same.  This allows
> safely blocking in G2 until the switch to G1 occurs, and then
> transitioning from G1 to a new G1 or G2, while always being able to
> distinguish the state change.  It removes the race condition and A/B/A
> problems that otherwise occurred when a late (pre-empted) waiter resumed
> just as the futex call attempted to block on g_signals, at which point
> there was no longer any opportunity to re-check things like whether the
> current G1 group was already closed.
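
To make the new invariant concrete before diving into the patch: a waiter
decides whether it may consume a signal by comparing the g_signals futex
word against the low bits of g1_start.  Here is a minimal standalone sketch
of that check; the helper name and parameter list are mine, not part of the
patch (which performs this computation inline in __pthread_cond_wait_common):

  #include <stdint.h>

  /* Decide whether group G currently has a consumable signal.
     G_SIGNALS is the group's 32-bit futex word; its LSB is reserved, so
     one signal is worth 2.  G1_START is the 64-bit group-start sequence;
     its LSB holds the index of the current G2.  */
  static int
  group_has_signal (uint32_t g_signals, uint64_t g1_start, unsigned int g)
  {
    /* "lowseq" is what the futex word holds when the group's signal count
       is zero: the low 31 bits of g1_start placed in bits 31:1.  If G is
       the current G2 (G2 never receives signals), compare the word against
       itself so the count reads as zero.  */
    uint32_t lowseq = ((g1_start & 1) == g)
                      ? g_signals : ((uint32_t) g1_start & ~1U);

    /* The signed difference tolerates 31-bit wrap-around.  Every G1/G2
       switch advances lowseq, so a stale waiter computes a mismatch here
       instead of seeing an aliased "same" value (the old A/B/A problem).  */
    return (int) (g_signals - lowseq) >= 2;
  }
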
>
> By fixing these issues, the signal stealing code can be eliminated,
> since there is no concept of signal stealing anymore.  The code that
> blocks until all waiters have exited g_refs can also be removed, since
> any waiters still in the g_refs region are now guaranteed to safely wake
> up and exit.  If there are any still left at this time, they are all
> sent one final futex wakeup to ensure that they are not blocked any
> longer, but there is no need for the signaller to block and wait for
> them to wake up and exit the g_refs region.
>
> The signal count is then effectively "zeroed", but since it is now
> relative to g1_start, this is done by advancing it to a new value that
> can be observed by any pending blocking waiters.  Any late waiters can
> always tell the difference, and can thus just cleanly exit if they are
> in a stale G1 or G2.  They can never steal a signal from the current
> G1 if they are not in the current G1, since the signal value that has
> to match in the cmpxchg has the low 31 bits of the g1_start value
> contained in it, which is checked first, so it won't match if there
> has been a G1/G2 change.
>
> Note: the 31-bit sequence number used in g_signals is designed to
> handle wrap-around when checking the signal count.  But if an entire
> 31-bit wraparound (2 billion signals) occurs while there is still a
> late waiter that has not yet resumed, and the count happens to then
> match the current g1_start low bits, and the pre-emption occurred after
> the normal "closed group" checks (which are 64-bit) but before the
> futex syscall and signal-consuming code, then an A/B/A issue could
> still result and cause an incorrect assumption about whether the waiter
> should block.  This particular scenario seems unlikely in practice.
> Note that once awake from the futex, the waiter would notice the closed
> group before consuming the signal (since that check is still 64-bit and
> would not be aliased by the wrap-around in g_signals), so the biggest
> impact would be blocking on the futex until the next full wakeup from a
> G1/G2 switch.
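
One more illustration before the diff itself, of the "zeroing" described
above: the G1/G2 switch never stores an absolute 0 into g_signals; it
advances the word to the new group's base value.  A sketch with
illustrative names (the patch computes this inline in
__condvar_quiesce_and_switch_g1):

  #include <stdint.h>

  /* Compute the "zero signals" futex-word value for a group whose G1
     started at OLD_G1_START and originally held OLD_ORIG_SIZE waiters.
     The new g1_start is their sum; shifting it left by one mirrors the
     condvar's sequence encoding, and clearing the reserved LSB yields
     the word that means "no signals available" for the reopened slot.  */
  static uint32_t
  group_zero_value (uint64_t old_g1_start, uint64_t old_orig_size)
  {
    return (uint32_t) ((old_g1_start + old_orig_size) << 1) & ~1U;
  }

Because this value is new to the slot (modulo the 31-bit wrap discussed in
the note above), a late waiter whose pending futex wait still expects the
old word value fails with EAGAIN instead of blocking on a stale group.
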
>
>Signed-off-by: Frank Barrus
>---
> nptl/pthread_cond_common.c | 105 +++++++++++----------------------
> nptl/pthread_cond_wait.c   | 144 +++++++++++++------------------------------
> 2 files changed, 81 insertions(+), 168 deletions(-)
>
>diff --git a/nptl/pthread_cond_common.c b/nptl/pthread_cond_common.c
>index c35b9ef..eae7f4d 100644
>--- a/nptl/pthread_cond_common.c
>+++ b/nptl/pthread_cond_common.c
>@@ -341,7 +341,6 @@ static bool __attribute__ ((unused))
> __condvar_quiesce_and_switch_g1 (pthread_cond_t *cond, uint64_t wseq,
>                                  unsigned int *g1index, int private)
> {
>-  const unsigned int maxspin = 0;
>   unsigned int g1 = *g1index;
>
>   /* If there is no waiter in G2, we don't do anything.  The expression may
>@@ -362,84 +361,46 @@ __condvar_quiesce_and_switch_g1 (pthread_cond_t *cond, uint64_t wseq,
>    * New waiters arriving concurrently with the group switching will all go
>      into G2 until we atomically make the switch.  Waiters existing in G2
>      are not affected.
>-   * Waiters in G1 will be closed out immediately by setting a flag in
>-     __g_signals, which will prevent waiters from blocking using a futex on
>-     __g_signals and also notifies them that the group is closed.  As a
>-     result, they will eventually remove their group reference, allowing us
>-     to close switch group roles.  */
>-
>-  /* First, set the closed flag on __g_signals.  This tells waiters that are
>-     about to wait that they shouldn't do that anymore.  This basically
>-     serves as an advance notificaton of the upcoming change to __g1_start;
>-     waiters interpret it as if __g1_start was larger than their waiter
>-     sequence position.  This allows us to change __g1_start after waiting
>-     for all existing waiters with group references to leave, which in turn
>-     makes recovery after stealing a signal simpler because it then can be
>-     skipped if __g1_start indicates that the group is closed (otherwise,
>-     we would have to recover always because waiters don't know how big their
>-     groups are).  Relaxed MO is fine.  */
>-  atomic_fetch_or_relaxed (cond->__data.__g_signals + g1, 1);
>-
>-  /* Wait until there are no group references anymore.  The fetch-or operation
>-     injects us into the modification order of __g_refs; release MO ensures
>-     that waiters incrementing __g_refs after our fetch-or see the previous
>-     changes to __g_signals and to __g1_start that had to happen before we can
>-     switch this G1 and alias with an older group (we have two groups, so
>-     aliasing requires switching group roles twice).  Note that nobody else
>-     can have set the wake-request flag, so we do not have to act upon it.
>-
>-     Also note that it is harmless if older waiters or waiters from this G1
>-     get a group reference after we have quiesced the group because it will
>-     remain closed for them either because of the closed flag in __g_signals
>-     or the later update to __g1_start.  New waiters will never arrive here
>-     but instead continue to go into the still current G2.  */
>-  unsigned r = atomic_fetch_or_release (cond->__data.__g_refs + g1, 0);
>-  while ((r >> 1) > 0)
>-    {
>-      for (unsigned int spin = maxspin; ((r >> 1) > 0) && (spin > 0); spin--)
>-        {
>-          /* TODO Back off.  */
>-          r = atomic_load_relaxed (cond->__data.__g_refs + g1);
>-        }
>-      if ((r >> 1) > 0)
>-        {
>-          /* There is still a waiter after spinning.  Set the wake-request
>-             flag and block.  Relaxed MO is fine because this is just about
>-             this futex word.
>-
>-             Update r to include the set wake-request flag so that the upcoming
>-             futex_wait only blocks if the flag is still set (otherwise, we'd
>-             violate the basic client-side futex protocol).  */
>-          r = atomic_fetch_or_relaxed (cond->__data.__g_refs + g1, 1) | 1;
>-
>-          if ((r >> 1) > 0)
>-            futex_wait_simple (cond->__data.__g_refs + g1, r, private);
>-          /* Reload here so we eventually see the most recent value even if we
>-             do not spin.  */
>-          r = atomic_load_relaxed (cond->__data.__g_refs + g1);
>-        }
>-    }
>-  /* Acquire MO so that we synchronize with the release operation that waiters
>-     use to decrement __g_refs and thus happen after the waiters we waited
>-     for.  */
>-  atomic_thread_fence_acquire ();
>+   * Waiters in G1 will be closed out immediately by the advancing of
>+     __g_signals to the next "lowseq" (low 31 bits of the new g1_start),
>+     which will prevent waiters from blocking using a futex on
>+     __g_signals since it provides enough signals for all possible
>+     remaining waiters.  As a result, they can each consume a signal
>+     and they will eventually remove their group reference.  */
>
>   /* Update __g1_start, which finishes closing this group.  The value we add
>      will never be negative because old_orig_size can only be zero when we
>      switch groups the first time after a condvar was initialized, in which
>-     case G1 will be at index 1 and we will add a value of 1.  See above for
>-     why this takes place after waiting for quiescence of the group.
>+     case G1 will be at index 1 and we will add a value of 1.
>      Relaxed MO is fine because the change comes with no additional
>      constraints that others would have to observe.  */
>   __condvar_add_g1_start_relaxed (cond,
>       (old_orig_size << 1) + (g1 == 1 ? 1 : - 1));
>
>-  /* Now reopen the group, thus enabling waiters to again block using the
>-     futex controlled by __g_signals.  Release MO so that observers that see
>-     no signals (and thus can block) also see the write __g1_start and thus
>-     that this is now a new group (see __pthread_cond_wait_common for the
>-     matching acquire MO loads).  */
>-  atomic_store_release (cond->__data.__g_signals + g1, 0);
>+  unsigned int lowseq = ((old_g1_start + old_orig_size) << 1) & ~1U;
>+
>+  /* If any waiters still hold group references (and thus could be blocked),
>+     then wake them all up now and prevent any running ones from blocking.
>+     This is effectively a catch-all for any possible current or future
>+     bugs that can allow the group size to reach 0 before all G1 waiters
>+     have been awakened or at least given signals to consume, or any
>+     other case that can leave blocked (or about to block) older waiters.  */
>+  if ((atomic_fetch_or_release (cond->__data.__g_refs + g1, 0) >> 1) > 0)
>+    {
>+      /* First advance signals to the end of the group (i.e. enough signals
>+         for the entire G1 group) to ensure that waiters which have not
>+         yet blocked in the futex will not block.
>+         Note that in the vast majority of cases, this should never
>+         actually be necessary, since __g_signals will have enough
>+         signals for the remaining g_refs waiters.  As an optimization,
>+         we could check this first before proceeding, although that
>+         could still leave the potential for futex lost wakeup bugs
>+         if the signal count was non-zero but the futex wakeup
>+         was somehow lost.  */
>+      atomic_store_relaxed (cond->__data.__g_signals + g1, lowseq);
>+
>+      futex_wake (cond->__data.__g_signals + g1, INT_MAX, private);
>+    }
>
>   /* At this point, the old G1 is now a valid new G2 (but not in use yet).
>      No old waiter can neither grab a signal nor acquire a reference without
>@@ -451,6 +412,10 @@ __condvar_quiesce_and_switch_g1 (pthread_cond_t *cond, uint64_t wseq,
>   g1 ^= 1;
>   *g1index ^= 1;
>
>+  /* Now advance the new G1 g_signals to the new lowseq, giving it
>+     an effective signal count of 0 to start.  */
>+  atomic_store_relaxed (cond->__data.__g_signals + g1, lowseq);
>+
>   /* These values are just observed by signalers, and thus protected by the
>      lock.  */
>   unsigned int orig_size = wseq - (old_g1_start + old_orig_size);
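
The pthread_cond_wait.c half below makes the waiter pass its last observed
g_signals value to the futex instead of the constant 0.  Roughly, in an
illustrative wrapper (the real call in the patch is
__futex_abstimed_wait_cancelable64; the futex_wait declaration here is an
assumption for the sketch, not a quoted glibc signature):

  #include <stdint.h>

  /* Assumed helper with the usual futex-wait shape: sleep only while
     *WORD still equals EXPECTED, else fail immediately with EAGAIN.  */
  extern int futex_wait (uint32_t *word, uint32_t expected, int private);

  static int
  block_on_group (uint32_t *g_signals_word, uint32_t observed, int private)
  {
    /* The kernel atomically re-checks *g_signals_word == OBSERVED before
       sleeping.  Any concurrent G1/G2 switch advances the word (it embeds
       the g1_start low bits), so a stale waiter cannot block here; under
       the old scheme an aliased 0 could slip past this check.  */
    return futex_wait (g_signals_word, observed, private);
  }
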
>diff --git a/nptl/pthread_cond_wait.c b/nptl/pthread_cond_wait.c
>index 54e504a..69cba15 100644
>--- a/nptl/pthread_cond_wait.c
>+++ b/nptl/pthread_cond_wait.c
>@@ -239,9 +239,7 @@ __condvar_cleanup_waiting (void *arg)
>      signaled), and a reference count.
>
>    The group reference count is used to maintain the number of waiters that
>-   are using the group's futex.  Before a group can change its role, the
>-   reference count must show that no waiters are using the futex anymore; this
>-   prevents ABA issues on the futex word.
>+   are using the group's futex.
>
>    To represent which intervals in the waiter sequence the groups cover (and
>    thus also which group slot contains G1 or G2), we use a 64b counter to
>@@ -301,11 +299,12 @@ __condvar_cleanup_waiting (void *arg)
>      last reference.
>    * Reference count used by waiters concurrently with signalers that have
>      acquired the condvar-internal lock.
>-   __g_signals: The number of signals that can still be consumed.
>+   __g_signals: The number of signals that can still be consumed, relative to
>+     the current g1_start.  (i.e. bits 31 to 1 of __g_signals are bits
>+     31 to 1 of g1_start with the signal count added)
>    * Used as a futex word by waiters.  Used concurrently by waiters and
>      signalers.
>-   * LSB is true iff this group has been completely signaled (i.e., it is
>-     closed).
>+   * LSB is currently reserved and 0.
>    __g_size: Waiters remaining in this group (i.e., which have not been
>      signaled yet.
>    * Accessed by signalers and waiters that cancel waiting (both do so only
>@@ -329,18 +328,6 @@ __condvar_cleanup_waiting (void *arg)
>    sufficient because if a waiter can see a sufficiently large value, it could
>    have also consume a signal in the waiters group.
>
>-   Waiters try to grab a signal from __g_signals without holding a reference
>-   count, which can lead to stealing a signal from a more recent group after
>-   their own group was already closed.  They cannot always detect whether they
>-   in fact did because they do not know when they stole, but they can
>-   conservatively add a signal back to the group they stole from; if they
>-   did so unnecessarily, all that happens is a spurious wake-up.  To make this
>-   even less likely, __g1_start contains the index of the current g2 too,
>-   which allows waiters to check if there aliasing on the group slots; if
>-   there wasn't, they didn't steal from the current G1, which means that the
>-   G1 they stole from must have been already closed and they do not need to
>-   fix anything.
>-
>    It is essential that the last field in pthread_cond_t is __g_signals[1]:
>    The previous condvar used a pointer-sized field in pthread_cond_t, so a
>    PTHREAD_COND_INITIALIZER from that condvar implementation might only
>@@ -436,6 +423,9 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
>     {
>       while (1)
>         {
>+          uint64_t g1_start = __condvar_load_g1_start_relaxed (cond);
>+          unsigned int lowseq = (g1_start & 1) == g ? signals : g1_start & ~1U;
>+
>           /* Spin-wait first.
>              Note that spinning first without checking whether a timeout
>              passed might lead to what looks like a spurious wake-up even
>@@ -447,35 +437,45 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
>              having to compare against the current time seems to be the right
>              choice from a performance perspective for most use cases.  */
>           unsigned int spin = maxspin;
>-          while (signals == 0 && spin > 0)
>+          while (spin > 0 && ((int)(signals - lowseq) < 2))
>             {
>               /* Check that we are not spinning on a group that's already
>                  closed.  */
>-              if (seq < (__condvar_load_g1_start_relaxed (cond) >> 1))
>-                goto done;
>+              if (seq < (g1_start >> 1))
>+                break;
>
>               /* TODO Back off.  */
>
>               /* Reload signals.  See above for MO.  */
>               signals = atomic_load_acquire (cond->__data.__g_signals + g);
>+              g1_start = __condvar_load_g1_start_relaxed (cond);
>+              lowseq = (g1_start & 1) == g ? signals : g1_start & ~1U;
>               spin--;
>             }
>
>-          /* If our group will be closed as indicated by the flag on signals,
>-             don't bother grabbing a signal.  */
>-          if (signals & 1)
>-            goto done;
>-
>-          /* If there is an available signal, don't block.  */
>-          if (signals != 0)
>+          if (seq < (g1_start >> 1))
>+            {
>+              /* If the group is closed already,
>+                 then this waiter originally had enough extra signals to
>+                 consume, up until the time its group was closed.  */
>+              goto done;
>+            }
>+
>+          /* If there is an available signal, don't block.
>+             If __g1_start has advanced at all, then we must be in G1
>+             by now, perhaps in the process of switching back to an older
>+             G2, but in either case we're allowed to consume the available
>+             signal and should not block anymore.  */
>+          if ((int)(signals - lowseq) >= 2)
>             break;
>
>           /* No signals available after spinning, so prepare to block.
>              We first acquire a group reference and use acquire MO for that so
>              that we synchronize with the dummy read-modify-write in
>              __condvar_quiesce_and_switch_g1 if we read from that.  In turn,
>-             in this case this will make us see the closed flag on __g_signals
>-             that designates a concurrent attempt to reuse the group's slot.
>+             in this case this will make us see the advancement of __g_signals
>+             to the upcoming new g1_start that occurs with a concurrent
>+             attempt to reuse the group's slot.
>              We use acquire MO for the __g_signals check to make the
>              __g1_start check work (see spinning above).
>              Note that the group reference acquisition will not mask the
>@@ -483,15 +483,24 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
>              an atomic read-modify-write operation and thus extend the release
>              sequence.  */
>           atomic_fetch_add_acquire (cond->__data.__g_refs + g, 2);
>-          if (((atomic_load_acquire (cond->__data.__g_signals + g) & 1) != 0)
>-              || (seq < (__condvar_load_g1_start_relaxed (cond) >> 1)))
>+          signals = atomic_load_acquire (cond->__data.__g_signals + g);
>+          g1_start = __condvar_load_g1_start_relaxed (cond);
>+          lowseq = (g1_start & 1) == g ? signals : g1_start & ~1U;
>+
>+          if (seq < (g1_start >> 1))
>             {
>-              /* Our group is closed.  Wake up any signalers that might be
>-                 waiting.  */
>+              /* group is closed already, so don't block */
>               __condvar_dec_grefs (cond, g, private);
>               goto done;
>             }
>
>+          if ((int)(signals - lowseq) >= 2)
>+            {
>+              /* a signal showed up or G1/G2 switched after we grabbed the refcount */
>+              __condvar_dec_grefs (cond, g, private);
>+              break;
>+            }
>+
>           // Now block.
>           struct _pthread_cleanup_buffer buffer;
>           struct _condvar_cleanup_buffer cbuffer;
>@@ -502,7 +511,7 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
>           __pthread_cleanup_push (&buffer, __condvar_cleanup_waiting, &cbuffer);
>
>           err = __futex_abstimed_wait_cancelable64 (
>-            cond->__data.__g_signals + g, 0, clockid, abstime, private);
>+            cond->__data.__g_signals + g, signals, clockid, abstime, private);
>
>           __pthread_cleanup_pop (&buffer, 0);
>
>@@ -525,6 +534,8 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
>           signals = atomic_load_acquire (cond->__data.__g_signals + g);
>         }
>
>+      if (seq < (__condvar_load_g1_start_relaxed (cond) >> 1))
>+        goto done;
>     }
>   /* Try to grab a signal.  Use acquire MO so that we see an up-to-date value
>      of __g1_start below (see spinning above for a similar case).  In
>@@ -533,69 +544,6 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
>   while (!atomic_compare_exchange_weak_acquire (cond->__data.__g_signals + g,
>                                                 &signals, signals - 2));
>
>-  /* We consumed a signal but we could have consumed from a more recent group
>-     that aliased with ours due to being in the same group slot.  If this
>-     might be the case our group must be closed as visible through
>-     __g1_start.  */
>-  uint64_t g1_start = __condvar_load_g1_start_relaxed (cond);
>-  if (seq < (g1_start >> 1))
>-    {
>-      /* We potentially stole a signal from a more recent group but we do not
>-         know which group we really consumed from.
>-         We do not care about groups older than current G1 because they are
>-         closed; we could have stolen from these, but then we just add a
>-         spurious wake-up for the current groups.
>-         We will never steal a signal from current G2 that was really intended
>-         for G2 because G2 never receives signals (until it becomes G1).  We
>-         could have stolen a signal from G2 that was conservatively added by a
>-         previous waiter that also thought it stole a signal -- but given that
>-         that signal was added unnecessarily, it's not a problem if we steal
>-         it.
>-         Thus, the remaining case is that we could have stolen from the current
>-         G1, where "current" means the __g1_start value we observed.  However,
>-         if the current G1 does not have the same slot index as we do, we did
>-         not steal from it and do not need to undo that.  This is the reason
>-         for putting a bit with G2's index into __g1_start as well.  */
>-      if (((g1_start & 1) ^ 1) == g)
>-        {
>-          /* We have to conservatively undo our potential mistake of stealing
>-             a signal.  We can stop trying to do that when the current G1
>-             changes because other spinning waiters will notice this too and
>-             __condvar_quiesce_and_switch_g1 has checked that there are no
>-             futex waiters anymore before switching G1.
>-             Relaxed MO is fine for the __g1_start load because we need to
>-             merely be able to observe this fact and not have to observe
>-             something else as well.
>-             ??? Would it help to spin for a little while to see whether the
>-             current G1 gets closed?  This might be worthwhile if the group is
>-             small or close to being closed.  */
>-          unsigned int s = atomic_load_relaxed (cond->__data.__g_signals + g);
>-          while (__condvar_load_g1_start_relaxed (cond) == g1_start)
>-            {
>-              /* Try to add a signal.  We don't need to acquire the lock
>-                 because at worst we can cause a spurious wake-up.  If the
>-                 group is in the process of being closed (LSB is true), this
>-                 has an effect similar to us adding a signal.  */
>-              if (((s & 1) != 0)
>-                  || atomic_compare_exchange_weak_relaxed
>-                     (cond->__data.__g_signals + g, &s, s + 2))
>-                {
>-                  /* If we added a signal, we also need to add a wake-up on
>-                     the futex.  We also need to do that if we skipped adding
>-                     a signal because the group is being closed because
>-                     while __condvar_quiesce_and_switch_g1 could have closed
>-                     the group, it might stil be waiting for futex waiters to
>-                     leave (and one of those waiters might be the one we stole
>-                     the signal from, which cause it to block using the
>-                     futex).  */
>-                  futex_wake (cond->__data.__g_signals + g, 1, private);
>-                  break;
>-                }
>-              /* TODO Back off.  */
>-            }
>-        }
>-    }
>-
>  done:
>
>   /* Confirm that we have been woken.  We do that before acquiring the mutex

-- 
You are receiving this mail because:
You are on the CC list for the bug.