public inbox for libc-alpha@sourceware.org
From: Stefan Liebler <stli@linux.vnet.ibm.com>
To: libc-alpha@sourceware.org
Cc: Stefan Liebler <stli@linux.vnet.ibm.com>
Subject: [PATCH 1/2] Optimize generic spinlock code and use C11 like atomic macros.
Date: Fri, 16 Dec 2016 16:32:00 -0000	[thread overview]
Message-ID: <1481905917-15654-1-git-send-email-stli@linux.vnet.ibm.com> (raw)

This patch optimizes the generic spinlock code.
The type pthread_spinlock_t is a typedef to volatile int on all archs.
Passing a volatile pointer to the atomic macros can lead to extra stores
and loads to the stack if such a macro creates a temporary variable by using
__typeof (*(mem)).  Thus the patch passes an int pointer to the atomic macros.
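To illustrate the volatile issue, here is a hypothetical, simplified sketch using GCC's __atomic builtins in place of glibc's internal atomic macros (the names exchange_acquire and my_spin_trylock are made up for this example):

```c
#include <errno.h>

typedef volatile int pthread_spinlock_t;

/* A simplified exchange macro in the style of glibc's
   atomic_exchange_acquire.  If MEM is a volatile int *, the temporary
   created via __typeof (*(mem)) is volatile as well, which can force
   the compiler to spill it to the stack.  */
#define exchange_acquire(mem, newval)                           \
  ({ __typeof (*(mem)) __val = (newval);                        \
     __atomic_exchange_n ((mem), __val, __ATOMIC_ACQUIRE); })

int
my_spin_trylock (pthread_spinlock_t *lock_volatile)
{
  /* Cast away the volatile qualifier so that temporaries derived from
     *lock are plain ints, as the patch does.  */
  int *lock = (int *) lock_volatile;
  return exchange_acquire (lock, 1) == 0 ? 0 : EBUSY;
}
```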

The old-style atomic macros are replaced by the C11-like atomic macros, and
the code is thus aligned with them.

I've added a __glibc_likely hint to the first atomic exchange in
pthread_spin_lock in order to return immediately to the caller if the lock
is free.  Without the hint, there is an additional jump if the lock is free.

I've added the atomic_spin_nop macro within the loop of plain reads.
The plain reads are realized by dereferencing the volatile pointer,
because with atomic_load_relaxed the for loop was optimized away.
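The lock path described above can be sketched as follows, again as a hypothetical stand-in on top of GCC __atomic builtins rather than glibc's internal macros; spin_nop here is an empty placeholder for atomic_spin_nop, which on real hardware would be a pause-type instruction:

```c
static inline void
spin_nop (void)
{
  /* Placeholder for atomic_spin_nop, e.g. a pause instruction.  */
}

int
my_spin_lock (volatile int *lock_volatile)
{
  /* Drop the volatile qualifier for the atomic operations.  */
  int *lock = (int *) lock_volatile;

  /* First try: assumed to mostly succeed, hence the likely hint.  */
  if (__builtin_expect (__atomic_exchange_n (lock, 1, __ATOMIC_ACQUIRE) == 0,
			1))
    return 0;

  int val;
  do
    {
      /* Spin with plain reads; the volatile dereference keeps the
	 loop from being optimized away.  */
      do
	{
	  spin_nop ();
	  val = *lock_volatile;
	}
      while (val != 0);
      /* val is 0 here; retry the atomic RMW with acquire ordering.  */
    }
  while (!__atomic_compare_exchange_n (lock, &val, 1, 1 /* weak */,
				       __ATOMIC_ACQUIRE, __ATOMIC_RELAXED));
  return 0;
}
```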

For pthread_spin_trylock, a machine-specific version can define
SPIN_TRYLOCK_USE_CMPXCHG_INSTEAD_OF_XCHG to 1 if an explicit test whether
the lock is free is optimal.
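The two trylock strategies can be sketched like this (a hypothetical version using GCC __atomic builtins; the function names are made up for illustration):

```c
#include <errno.h>

/* Variant for SPIN_TRYLOCK_USE_CMPXCHG_INSTEAD_OF_XCHG == 1: test
   first and only attempt the atomic read-modify-write if the lock
   appears free, which can avoid needless cache-line invalidation on
   a contended lock.  */
int
trylock_cmpxchg (volatile int *lock_volatile)
{
  int *lock = (int *) lock_volatile;
  int val = __atomic_load_n (lock, __ATOMIC_RELAXED);
  if (val == 0
      && __atomic_compare_exchange_n (lock, &val, 1, 1 /* weak */,
				      __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
    return 0;
  return EBUSY;
}

/* Default variant: unconditionally exchange and test the old value.  */
int
trylock_xchg (volatile int *lock_volatile)
{
  int *lock = (int *) lock_volatile;
  return __atomic_exchange_n (lock, 1, __ATOMIC_ACQUIRE) == 0 ? 0 : EBUSY;
}
```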

ChangeLog:

	* nptl/pthread_spin_init.c (pthread_spin_init):
	Use atomic_store_relaxed.
	* nptl/pthread_spin_lock.c (pthread_spin_lock):
	Use C11-like atomic macros and pass int pointers instead of
	volatile int pointers.
	* nptl/pthread_spin_trylock.c
	(pthread_spin_trylock): Likewise.  Use an explicit test whether
	the lock is free if the new SPIN_TRYLOCK_USE_CMPXCHG_INSTEAD_OF_XCHG
	macro is set to one.
	(SPIN_TRYLOCK_USE_CMPXCHG_INSTEAD_OF_XCHG): New define.
	* nptl/pthread_spin_unlock.c (pthread_spin_unlock):
	Use atomic_store_release.
---
 nptl/pthread_spin_init.c    |  4 +++-
 nptl/pthread_spin_lock.c    | 54 +++++++++++++++++++++++++++++++++++----------
 nptl/pthread_spin_trylock.c | 29 ++++++++++++++++++++++--
 nptl/pthread_spin_unlock.c  |  6 +++--
 4 files changed, 76 insertions(+), 17 deletions(-)

diff --git a/nptl/pthread_spin_init.c b/nptl/pthread_spin_init.c
index 8ed4772..65d05b8 100644
--- a/nptl/pthread_spin_init.c
+++ b/nptl/pthread_spin_init.c
@@ -22,6 +22,8 @@
 int
 pthread_spin_init (pthread_spinlock_t *lock, int pshared)
 {
-  *lock = 0;
+  /* The atomic_store_relaxed is enough as we only initialize the spinlock here
+     and we are not in a critical region.  */
+  atomic_store_relaxed (lock, 0);
   return 0;
 }
diff --git a/nptl/pthread_spin_lock.c b/nptl/pthread_spin_lock.c
index fb9bcc1..adbc9d7 100644
--- a/nptl/pthread_spin_lock.c
+++ b/nptl/pthread_spin_lock.c
@@ -21,7 +21,7 @@
 
 /* A machine-specific version can define SPIN_LOCK_READS_BETWEEN_CMPXCHG
   to the number of plain reads that it's optimal to spin on between uses
-  of atomic_compare_and_exchange_val_acq.  If spinning forever is optimal
+  of atomic_compare_exchange_weak_acquire.  If spinning forever is optimal
   then use -1.  If no plain reads here would ever be optimal, use 0.  */
 #ifndef SPIN_LOCK_READS_BETWEEN_CMPXCHG
 # warning machine-dependent file should define SPIN_LOCK_READS_BETWEEN_CMPXCHG
@@ -29,18 +29,27 @@
 #endif
 
 int
-pthread_spin_lock (pthread_spinlock_t *lock)
+pthread_spin_lock (pthread_spinlock_t *lock_volatile)
 {
+  /* The type pthread_spinlock_t is a typedef to volatile int on all archs.
+     Passing a volatile pointer to the atomic macros can lead to extra stores
+     and loads to stack if such a macro creates a temporary variable by using
+     __typeof (*(mem)).  */
+  int *lock = (int *) lock_volatile;
+
   /* atomic_exchange usually takes less instructions than
      atomic_compare_and_exchange.  On the other hand,
      atomic_compare_and_exchange potentially generates less bus traffic
      when the lock is locked.
      We assume that the first try mostly will be successful, and we use
      atomic_exchange.  For the subsequent tries we use
-     atomic_compare_and_exchange.  */
-  if (atomic_exchange_acq (lock, 1) == 0)
+     atomic_compare_and_exchange.
+     We need acquire memory order here as we need to see if another thread has
+     locked / unlocked this spinlock.  */
+  if (__glibc_likely (atomic_exchange_acquire (lock, 1) == 0))
     return 0;
 
+  int val;
   do
     {
       /* The lock is contended and we need to wait.  Going straight back
@@ -50,20 +59,41 @@ pthread_spin_lock (pthread_spinlock_t *lock)
 	 On the other hand, we do want to update memory state on the local core
 	 once in a while to avoid spinning indefinitely until some event that
 	 will happen to update local memory as a side-effect.  */
-      if (SPIN_LOCK_READS_BETWEEN_CMPXCHG >= 0)
+
+#if SPIN_LOCK_READS_BETWEEN_CMPXCHG >= 0
+      /* Use at most SPIN_LOCK_READS_BETWEEN_CMPXCHG plain reads between the
+	 atomic compare and exchanges.  */
+      int wait;
+      for (wait = 0; wait < SPIN_LOCK_READS_BETWEEN_CMPXCHG;  wait ++)
 	{
-	  int wait = SPIN_LOCK_READS_BETWEEN_CMPXCHG;
+	  atomic_spin_nop ();
 
-	  while (*lock != 0 && wait > 0)
-	    --wait;
+	  /* Use a plain read every round.  */
+	  val = *lock_volatile;
+	  if (val == 0)
+	    break;
 	}
-      else
+
+      /* Set expected value to zero for the next compare and exchange.  */
+      val = 0;
+
+#else /* SPIN_LOCK_READS_BETWEEN_CMPXCHG < 0  */
+      /* Use plain reads until spinlock is free and then try a further atomic
+	 compare and exchange the next time.  */
+      do
 	{
-	  while (*lock != 0)
-	    ;
+	  atomic_spin_nop ();
+
+	  /* Use a plain read every round.  */
+	  val = *lock_volatile;
 	}
+      while (val != 0);
+
+#endif
+      /* We need acquire memory order here for the same reason as mentioned
+	 for the first try to lock the spinlock.  */
     }
-  while (atomic_compare_and_exchange_val_acq (lock, 1, 0) != 0);
+  while (!atomic_compare_exchange_weak_acquire (lock, &val, 1));
 
   return 0;
 }
diff --git a/nptl/pthread_spin_trylock.c b/nptl/pthread_spin_trylock.c
index 4e1a96c..8e9c76f 100644
--- a/nptl/pthread_spin_trylock.c
+++ b/nptl/pthread_spin_trylock.c
@@ -20,8 +20,33 @@
 #include <atomic.h>
 #include "pthreadP.h"
 
+/* A machine-specific version can define
+   SPIN_TRYLOCK_USE_CMPXCHG_INSTEAD_OF_XCHG to 1 if an explicit test if
+   lock is free is optimal.  */
+#ifndef SPIN_TRYLOCK_USE_CMPXCHG_INSTEAD_OF_XCHG
+# define SPIN_TRYLOCK_USE_CMPXCHG_INSTEAD_OF_XCHG 0
+#endif
+
 int
-pthread_spin_trylock (pthread_spinlock_t *lock)
+pthread_spin_trylock (pthread_spinlock_t *lock_volatile)
 {
-  return atomic_exchange_acq (lock, 1) ? EBUSY : 0;
+  /* See comment in pthread_spin_lock.c.  */
+  int *lock = (int *) lock_volatile;
+
+  /* We need acquire memory order here as we need to see if another
+     thread has locked / unlocked this spinlock.  */
+#if SPIN_TRYLOCK_USE_CMPXCHG_INSTEAD_OF_XCHG == 1
+  /* Load and test the spinlock and only try to lock the spinlock if it is
+     free.  */
+  int val = atomic_load_relaxed (lock);
+  if (__glibc_likely (val == 0
+		      && atomic_compare_exchange_weak_acquire (lock, &val, 1)))
+    return 0;
+#else
+  /* Set spinlock to locked and test if we have locked it.  */
+  if (__glibc_likely (atomic_exchange_acquire (lock, 1) == 0))
+    return 0;
+#endif
+
+  return EBUSY;
 }
diff --git a/nptl/pthread_spin_unlock.c b/nptl/pthread_spin_unlock.c
index d4b63ac..014f295 100644
--- a/nptl/pthread_spin_unlock.c
+++ b/nptl/pthread_spin_unlock.c
@@ -23,7 +23,9 @@
 int
 pthread_spin_unlock (pthread_spinlock_t *lock)
 {
-  atomic_full_barrier ();
-  *lock = 0;
+  /* The atomic_store_release synchronizes-with the atomic_exchange_acquire
+     or atomic_compare_exchange_weak_acquire in pthread_spin_lock /
+     pthread_spin_trylock.  */
+  atomic_store_release (lock, 0);
   return 0;
 }
-- 
2.3.0

Thread overview: 63+ messages
2016-12-16 16:32 Stefan Liebler [this message]
2016-12-16 16:32 ` [PATCH 2/2] S390: Use generic spinlock code Stefan Liebler
2017-02-08 14:49   ` Stefan Liebler
2017-02-13 20:39     ` Torvald Riegel
2017-02-15 16:26       ` Stefan Liebler
2017-02-18 17:05         ` Torvald Riegel
2017-03-14 15:55           ` Stefan Liebler
2017-03-21 15:43             ` Stefan Liebler
2017-04-06 12:27             ` Torvald Riegel
2016-12-19 12:14 ` [PATCH 1/2] Optimize generic spinlock code and use C11 like atomic macros Szabolcs Nagy
2017-02-08 14:49   ` Stefan Liebler
2017-02-13 20:29     ` Torvald Riegel
2017-02-15  9:36       ` Stefan Liebler
2017-02-18 16:57         ` Torvald Riegel
2017-02-19  9:20           ` Florian Weimer
2017-02-20 13:11             ` Torvald Riegel
2017-02-26  7:55               ` Florian Weimer
2017-02-26 20:06                 ` Torvald Riegel
2017-02-26 20:29                   ` Florian Weimer
2017-02-26 20:35                     ` Torvald Riegel
2017-02-27 17:57                       ` Szabolcs Nagy
2017-02-28  7:15                         ` Torvald Riegel
2017-03-14 15:55                           ` Stefan Liebler
2017-02-20 12:15           ` Stefan Liebler
2017-02-20 13:51             ` Torvald Riegel
2017-03-14 15:55               ` Stefan Liebler
2017-03-21 15:43                 ` Stefan Liebler
2017-03-22 12:56                   ` Szabolcs Nagy
2017-03-23 16:16                     ` Stefan Liebler
2017-03-23 17:52                       ` Szabolcs Nagy
2017-04-06 12:04                     ` Torvald Riegel
2017-03-27 13:08                   ` Stefan Liebler
2017-04-04 10:29                     ` [PING] " Stefan Liebler
2017-03-29 14:16                 ` Stefan Liebler
2017-04-06 14:00                 ` Torvald Riegel
2017-04-07 16:23                   ` Stefan Liebler
2017-04-09 13:51                     ` Torvald Riegel
2017-04-10 12:00                       ` Stefan Liebler
2017-04-18 13:09                         ` Stefan Liebler
2017-04-25  6:47                           ` Stefan Liebler
2017-05-03 11:38                             ` [PATCH 1/2] [PING] " Stefan Liebler
2017-05-10 13:00                               ` Stefan Liebler
2017-05-17 13:09                                 ` Stefan Liebler
2017-05-24  6:37                                   ` Stefan Liebler
2017-05-30  7:18                                 ` Torvald Riegel
2017-05-31  8:29                                   ` Stefan Liebler
2017-05-31 16:48                                     ` Torvald Riegel
2017-06-01 13:40                                       ` Joseph Myers
2017-06-01 14:33                                         ` Torvald Riegel
2017-06-06  7:51                                       ` [PATCH 1/2] [COMMITTED] " Stefan Liebler
2017-04-10  8:17                     ` [PATCH 1/2] " Andreas Schwab
2017-04-10 12:00                       ` Stefan Liebler
2017-04-10 13:36                         ` Andreas Schwab
2017-04-11  7:06                           ` Stefan Liebler
2017-04-11  8:45                             ` Andreas Schwab
2017-04-11 10:15                               ` Stefan Liebler
2017-04-11 12:05                                 ` Andreas Schwab
2017-04-11 12:19                                   ` Stefan Liebler
2017-04-11 13:08                                   ` Zack Weinberg
2017-04-13 16:36                             ` Torvald Riegel
2017-05-30 21:00                     ` Tulio Magno Quites Machado Filho
2017-04-18 21:17                   ` Joseph Myers
2017-04-19  8:27                     ` Stefan Liebler
