public inbox for libc-alpha@sourceware.org
* [PATCH] nptl: Add backoff mechanism to spinlock loop
@ 2022-03-25  1:50 Wangyang Guo
  2022-03-25  1:58 ` Noah Goldstein
  0 siblings, 1 reply; 7+ messages in thread
From: Wangyang Guo @ 2022-03-25  1:50 UTC (permalink / raw)
  To: libc-alpha; +Cc: Wangyang Guo, hjl.tools

When multiple threads are waiting for the lock at the same time, once the
lock owner releases it, all waiters see the lock become available and try
to acquire it at once, which may cause an expensive CAS storm.

Binary exponential backoff with random jitter is introduced.  As the
number of try-lock attempts grows, it becomes more likely that many
threads are competing for the adaptive mutex, so the wait time is
increased exponentially.  Random jitter is also added so that waiters do
not retry the lock in lockstep.

Signed-off-by: Wangyang Guo <wangyang.guo@intel.com>
---
 nptl/pthread_mutex_lock.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
index d2e652d151..ec57dc3627 100644
--- a/nptl/pthread_mutex_lock.c
+++ b/nptl/pthread_mutex_lock.c
@@ -26,6 +26,7 @@
 #include <futex-internal.h>
 #include <stap-probe.h>
 #include <shlib-compat.h>
+#include <random-bits.h>
 
 /* Some of the following definitions differ when pthread_mutex_cond_lock.c
    includes this file.  */
@@ -138,14 +139,26 @@ PTHREAD_MUTEX_LOCK (pthread_mutex_t *mutex)
 	  int cnt = 0;
 	  int max_cnt = MIN (max_adaptive_count (),
 			     mutex->__data.__spins * 2 + 10);
+	  int spin_count, exp_backoff = 1;
+	  unsigned int jitter = random_bits ();
 	  do
 	    {
-	      if (cnt++ >= max_cnt)
+	      /* In each loop, spin count is exponential backoff plus
+	         random jitter, random range is [0, exp_backoff-1].  */
+	      spin_count = exp_backoff + (jitter & (exp_backoff - 1));
+	      cnt += spin_count;
+	      if (cnt >= max_cnt)
 		{
+		  /* If cnt exceeds max spin count, just go to wait
+		     queue.  */
 		  LLL_MUTEX_LOCK (mutex);
 		  break;
 		}
-	      atomic_spin_nop ();
+	      do
+		  atomic_spin_nop ();
+	      while (--spin_count > 0);
+	      /* Binary exponential backoff, prepare for next loop.  */
+	      exp_backoff <<= 1;
 	    }
 	  while (LLL_MUTEX_READ_LOCK (mutex) != 0
 		 || LLL_MUTEX_TRYLOCK (mutex) != 0);
-- 
2.35.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] nptl: Add backoff mechanism to spinlock loop
  2022-03-25  1:50 [PATCH] nptl: Add backoff mechanism to spinlock loop Wangyang Guo
@ 2022-03-25  1:58 ` Noah Goldstein
  2022-03-25  3:24   ` Re: " Guo, Wangyang
  0 siblings, 1 reply; 7+ messages in thread
From: Noah Goldstein @ 2022-03-25  1:58 UTC (permalink / raw)
  To: Wangyang Guo; +Cc: GNU C Library

On Thu, Mar 24, 2022 at 8:51 PM Wangyang Guo via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> When multiple threads are waiting for the lock at the same time, once the
> lock owner releases it, all waiters see the lock become available and try
> to acquire it at once, which may cause an expensive CAS storm.
>
> Binary exponential backoff with random jitter is introduced.  As the
> number of try-lock attempts grows, it becomes more likely that many
> threads are competing for the adaptive mutex, so the wait time is
> increased exponentially.  Random jitter is also added so that waiters do
> not retry the lock in lockstep.
>
> Signed-off-by: Wangyang Guo <wangyang.guo@intel.com>
> ---
>  nptl/pthread_mutex_lock.c | 17 +++++++++++++++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
> index d2e652d151..ec57dc3627 100644
> --- a/nptl/pthread_mutex_lock.c
> +++ b/nptl/pthread_mutex_lock.c
> @@ -26,6 +26,7 @@
>  #include <futex-internal.h>
>  #include <stap-probe.h>
>  #include <shlib-compat.h>
> +#include <random-bits.h>
>
>  /* Some of the following definitions differ when pthread_mutex_cond_lock.c
>     includes this file.  */
> @@ -138,14 +139,26 @@ PTHREAD_MUTEX_LOCK (pthread_mutex_t *mutex)
>           int cnt = 0;
>           int max_cnt = MIN (max_adaptive_count (),
>                              mutex->__data.__spins * 2 + 10);
> +         int spin_count, exp_backoff = 1;
> +         unsigned int jitter = random_bits ();
>           do
>             {
> -             if (cnt++ >= max_cnt)
> +             /* In each loop, spin count is exponential backoff plus
> +                random jitter, random range is [0, exp_backoff-1].  */
> +             spin_count = exp_backoff + (jitter & (exp_backoff - 1));
> +             cnt += spin_count;
> +             if (cnt >= max_cnt)
>                 {
> +                 /* If cnt exceeds max spin count, just go to wait
> +                    queue.  */
>                   LLL_MUTEX_LOCK (mutex);
>                   break;
>                 }
> -             atomic_spin_nop ();
> +             do
> +                 atomic_spin_nop ();
> +             while (--spin_count > 0);
> +             /* Binary exponential backoff, prepare for next loop.  */
> +             exp_backoff <<= 1;
>             }
>           while (LLL_MUTEX_READ_LOCK (mutex) != 0
Does this load not already prevent against the 'CAS storm'?

>                  || LLL_MUTEX_TRYLOCK (mutex) != 0);
> --
> 2.35.1
>


* Re: [PATCH] nptl: Add backoff mechanism to spinlock loop
  2022-03-25  1:58 ` Noah Goldstein
@ 2022-03-25  3:24   ` Guo, Wangyang
  2022-03-25  3:42     ` Noah Goldstein
  0 siblings, 1 reply; 7+ messages in thread
From: Guo, Wangyang @ 2022-03-25  3:24 UTC (permalink / raw)
  To: Noah Goldstein; +Cc: GNU C Library


>> +                 atomic_spin_nop ();
>> +             while (--spin_count > 0);
>> +             /* Binary exponential backoff, prepare for next loop.  */
>> +             exp_backoff <<= 1;
>>             }
>>           while (LLL_MUTEX_READ_LOCK (mutex) != 0
>Does this load not already prevent against the 'CAS storm'?
This just prevents CAS attempts while the lock is held for a long time.
But if multiple threads are waiting for the lock at the same time, suppose many of them are spinning on the lock-state read;
once the lock owner releases the lock at that point, those waiting threads will all see the lock as unlocked.
In the next step, they will all issue a try-lock at the same time.  Backoff is introduced to solve this problem.

>>                  || LLL_MUTEX_TRYLOCK (mutex) != 0);



* Re: [PATCH] nptl: Add backoff mechanism to spinlock loop
  2022-03-25  3:24   ` Re: " Guo, Wangyang
@ 2022-03-25  3:42     ` Noah Goldstein
  2022-03-25  4:32       ` Guo, Wangyang
  0 siblings, 1 reply; 7+ messages in thread
From: Noah Goldstein @ 2022-03-25  3:42 UTC (permalink / raw)
  To: Guo, Wangyang; +Cc: GNU C Library

On Thu, Mar 24, 2022 at 10:24 PM Guo, Wangyang <wangyang.guo@intel.com> wrote:
>
>
> >> +                 atomic_spin_nop ();
> >> +             while (--spin_count > 0);
> >> +             /* Binary exponential backoff, prepare for next loop.  */
> >> +             exp_backoff <<= 1;
> >>             }
> >>           while (LLL_MUTEX_READ_LOCK (mutex) != 0
> >Does this load not already prevent against the 'CAS storm'?
> This just prevents CAS attempts while the lock is held for a long time.
> But if multiple threads are waiting for the lock at the same time, suppose many of them are spinning on the lock-state read;
> once the lock owner releases the lock at that point, those waiting threads will all see the lock as unlocked.
> In the next step, they will all issue a try-lock at the same time.  Backoff is introduced to solve this problem.

The loop isn't spinning on CAS failure which is typically where
you see the poor performance.
I get that there can still be some contention on the CAS, but
shouldn't the read check limit the evict ping-ponging?



>
> >>                  || LLL_MUTEX_TRYLOCK (mutex) != 0);
>


* Re: [PATCH] nptl: Add backoff mechanism to spinlock loop
  2022-03-25  3:42     ` Noah Goldstein
@ 2022-03-25  4:32       ` Guo, Wangyang
  2022-03-25 15:25         ` Noah Goldstein
  0 siblings, 1 reply; 7+ messages in thread
From: Guo, Wangyang @ 2022-03-25  4:32 UTC (permalink / raw)
  To: Noah Goldstein; +Cc: GNU C Library

Noah Goldstein wrote:
> > >> +                 atomic_spin_nop ();
> > >> +             while (--spin_count > 0);
> > >> +             /* Binary exponential backoff, prepare for next loop.  */
> > >> +             exp_backoff <<= 1;
> > >>             }
> > >>           while (LLL_MUTEX_READ_LOCK (mutex) != 0
> > >Does this load not already prevent against the 'CAS storm'?
> > This just prevents CAS attempts while the lock is held for a long time.
> > But if multiple threads are waiting for the lock at the same time, suppose
> > many of them are spinning on the lock-state read; once the lock owner releases the lock at that point, those waiting threads will all see the lock as unlocked.
> > In the next step, they will all issue a try-lock at the same time.  Backoff is introduced to solve this problem.
> 
> The loop isn't spinning on CAS failure which is typically where you see the poor performance.
> I get that there can still be some contention on the CAS, but shouldn't the read check limit the evict ping-ponging?

Yes, the read check can help.
But under very high contention, it becomes much easier to run into the problem described above that backoff solves.
> >
> > >>                  || LLL_MUTEX_TRYLOCK (mutex) != 0);
> >


* Re: [PATCH] nptl: Add backoff mechanism to spinlock loop
  2022-03-25  4:32       ` Guo, Wangyang
@ 2022-03-25 15:25         ` Noah Goldstein
  2022-03-28  8:35           ` Guo, Wangyang
  0 siblings, 1 reply; 7+ messages in thread
From: Noah Goldstein @ 2022-03-25 15:25 UTC (permalink / raw)
  To: Guo, Wangyang; +Cc: GNU C Library

On Thu, Mar 24, 2022 at 11:32 PM Guo, Wangyang <wangyang.guo@intel.com> wrote:
>
> Noah Goldstein wrote:
> > > >> +                 atomic_spin_nop ();
> > > >> +             while (--spin_count > 0);
> > > >> +             /* Binary exponential backoff, prepare for next loop.  */
> > > >> +             exp_backoff <<= 1;
> > > >>             }
> > > >>           while (LLL_MUTEX_READ_LOCK (mutex) != 0
> > > >Does this load not already prevent against the 'CAS storm'?
> > > This just prevents CAS attempts while the lock is held for a long time.
> > > But if multiple threads are waiting for the lock at the same time, suppose
> > > many of them are spinning on the lock-state read; once the lock owner releases the lock at that point, those waiting threads will all see the lock as unlocked.
> > > In the next step, they will all issue a try-lock at the same time.  Backoff is introduced to solve this problem.
> >
> > The loop isn't spinning on CAS failure which is typically where you see the poor performance.
> > I get that there can still be some contention on the CAS, but shouldn't the read check limit the evict ping-ponging?
>
> Yes, the read check can help.
> But under very high contention, it becomes much easier to run into the problem described above that backoff solves.

Do we need both then?

I made a quick benchmark to test this out and you're right: using the
lock for a tiny critical section at least (incrementing an int), I see
fewer failed CAS attempts w/ this patch and better performance :)


> > >
> > > >>                  || LLL_MUTEX_TRYLOCK (mutex) != 0);
> > >


* RE: [PATCH] nptl: Add backoff mechanism to spinlock loop
  2022-03-25 15:25         ` Noah Goldstein
@ 2022-03-28  8:35           ` Guo, Wangyang
  0 siblings, 0 replies; 7+ messages in thread
From: Guo, Wangyang @ 2022-03-28  8:35 UTC (permalink / raw)
  To: Noah Goldstein; +Cc: GNU C Library

> On Thu, Mar 24, 2022 at 11:32 PM Guo, Wangyang <wangyang.guo@intel.com> wrote:
> >
> > Noah Goldstein wrote:
> > > > >> +                 atomic_spin_nop ();
> > > > >> +             while (--spin_count > 0);
> > > > >> +             /* Binary exponential backoff, prepare for next loop.  */
> > > > >> +             exp_backoff <<= 1;
> > > > >>             }
> > > > >>           while (LLL_MUTEX_READ_LOCK (mutex) != 0
> > > > >Does this load not already prevent against the 'CAS storm'?
> > > > This just prevents CAS attempts while the lock is held for a long time.
> > > > But if multiple threads are waiting for the lock at the same time, suppose
> > > > many of them are spinning on the lock-state read; once the lock owner releases the lock at that point, those waiting threads will all see the lock as unlocked.
> > > > In the next step, they will all issue a try-lock at the same time.  Backoff is introduced to solve this problem.
> > >
> > > The loop isn't spinning on CAS failure which is typically where you see the poor performance.
> > > I get that there can still be some contention on the CAS, but shouldn't the read check limit the evict ping-ponging?
> >
> > Yes, the read check can help.
> > But under very high contention, it becomes much easier to run into the problem described above that backoff solves.
> Do we need both then?
> I made a quick benchmark to test this out and you're right: using the lock for a tiny critical section at least (incrementing an int), I see fewer failed CAS attempts w/ this patch and better performance :)

In theory, the read check still helps when the lock is held for a long time, but it adds extra overhead when the lock can be acquired immediately.
From my testing, the version without the read check performs better.

> > > >
> > > > >>                  || LLL_MUTEX_TRYLOCK (mutex) != 0);
> > > >


end of thread, other threads:[~2022-03-28  8:35 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-25  1:50 [PATCH] nptl: Add backoff mechanism to spinlock loop Wangyang Guo
2022-03-25  1:58 ` Noah Goldstein
2022-03-25  3:24   ` Re: " Guo, Wangyang
2022-03-25  3:42     ` Noah Goldstein
2022-03-25  4:32       ` Guo, Wangyang
2022-03-25 15:25         ` Noah Goldstein
2022-03-28  8:35           ` Guo, Wangyang
