public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex
@ 2018-05-23  9:25 Kemi Wang
  2018-05-23  9:25 ` [PATCH v3 2/3] benchtests: Add pthread adaptive spin mutex microbenchmark Kemi Wang
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Kemi Wang @ 2018-05-23  9:25 UTC (permalink / raw)
  To: Adhemerval Zanella, Florian Weimer, Rical Jason, Carlos Donell,
	Glibc alpha
  Cc: Dave Hansen, Tim Chen, Andi Kleen, Ying Huang, Aaron Lu,
	Lu Aubrey, Kemi Wang

This patch does not have any functionality change, we only provide a spin
count tunes for pthread adaptive spin mutex. The tunable
glibc.mutex.spin_count tunes can be used by system administrator to squeeze
system performance according to different hardware capabilities and
workload characteristics.

The maximum value of spin count is limited to 30000 to avoid the overflow
of mutex->__data.__spins variable with the possible type of short in
pthread_mutex_lock ().

The default value of spin count is set to 100 with the reference to the
previous number of times of spinning via trylock. This value would be
architecture-specific and can be tuned with kinds of benchmarks to fit most
cases in future.

This is the preparation work for the next patch, in which the way of
adaptive spin would be changed from an expensive cmpxchg to read while
spinning.

   * elf/dl-tunables.list: Add glibc.mutex.spin_count entry.
   * manual/tunables.texi: Add glibc.mutex.spin_count description.
   * nptl/Makefile: Add pthread_mutex_conf.c for compilation.
   * nptl/pthread_mutex_conf.h: New file.
   * nptl/pthread_mutex_conf.c: New file.

ChangeLog:
    V2->V3
    a) Polish the description of glibc.mutex.spin_count tunable with the
    help from Rical Jasan.
    b) Get rid of the TUNABLE_CALLBACK_FNDECL macros in
    pthread_mutex_conf.c, as suggested by Florian Weimer.
    c) Adjust the default value of spin count to 100 with the reference of
    the previous spinning way via trylock.

    V1->V2
    a) Renamed nptl/mutex-conf.h -> nptl/pthread_mutex_conf.h
    b) Renamed nptl/mutex-conf.c -> nptl/pthread_mutex_conf.c
    c) Change the Makefile to compile pthread_mutex_conf.c
    d) Modify the copyright "2013-2018" -> "2018" for new added files
    e) Fix the indentation issue (tab -> double space) in
    elf/dl-tunables.list
    f) Remove the env alias LD_SPIN_COUNT in elf/dl-tunables.list
    g) Fix the typo errors and refresh glibc.mutex.spin_count tunable
    description in manual/tunables.texi.
    h) Fix the indentation issue in nptl/pthread_mutex_conf.c
    i) Fix the indentation issue for nested preprocessor (add one space for
    each level)

Suggested-by: Andi Kleen <andi.kleen@intel.com>
Signed-off-by: Kemi Wang <kemi.wang@intel.com>
---
 ChangeLog                 |  8 +++++
 elf/dl-tunables.list      |  9 ++++++
 manual/tunables.texi      | 22 ++++++++++++++
 nptl/Makefile             |  3 +-
 nptl/pthread_mutex_conf.c | 74 +++++++++++++++++++++++++++++++++++++++++++++++
 nptl/pthread_mutex_conf.h | 31 ++++++++++++++++++++
 6 files changed, 146 insertions(+), 1 deletion(-)
 create mode 100644 nptl/pthread_mutex_conf.c
 create mode 100644 nptl/pthread_mutex_conf.h

diff --git a/ChangeLog b/ChangeLog
index 8032adf..c1e2a44 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,11 @@
+2018-05-23  Kemi Wang <kemi.wang@intel.com>
+
+	* elf/dl-tunables.list: Add glibc.mutex.spin_count entry.
+	* manual/tunables.texi: Add glibc.mutex.spin_count description.
+	* nptl/Makefile: Add pthread_mutex_conf.c for compilation.
+	* nptl/pthread_mutex_conf.h: New file.
+	* nptl/pthread_mutex_conf.c: New file.
+
 2018-05-23  Andreas Schwab  <schwab@suse.de>
 
 	[BZ #23196]
diff --git a/elf/dl-tunables.list b/elf/dl-tunables.list
index 1f8ecb8..f1f3546 100644
--- a/elf/dl-tunables.list
+++ b/elf/dl-tunables.list
@@ -121,4 +121,13 @@ glibc {
       default: 3
     }
   }
+
+  mutex {
+    spin_count {
+      type: INT_32
+      minval: 0
+      maxval: 30000
+      default: 100
+    }
+  }
 }
diff --git a/manual/tunables.texi b/manual/tunables.texi
index be33c9f..05c5e4a 100644
--- a/manual/tunables.texi
+++ b/manual/tunables.texi
@@ -281,6 +281,28 @@ of try lock attempts.
 The default value of this tunable is @samp{3}.
 @end deftp
 
+@node Pthread Mutex Tunables
+@section Pthread Mutex Tunables
+@cindex pthread mutex tunables
+
+@deftp {Tunable namespace} glibc.mutex
+The behavior of pthread mutexes can be tuned to gain performance improvements
+according to specific hardware capabilities and workload characteristics by
+setting the following tunables in the @code{mutex} namespace:
+@end deftp
+
+@deftp Tunable glibc.mutex.spin_count
+The @code{glibc.mutex.spin_count} tunable sets the maximum number of times
+a thread should spin on the lock before calling into the kernel to block.
+Adaptive spin is used for mutexes initialized with the PTHREAD_MUTEX_ADAPTIVE_NP
+GNU extension.  It affects both pthread_mutex_lock and pthread_mutex_timedlock.
+
+The spinning is done until either the maximum spin times is reached or
+the lock is acquired.
+
+The default value of this tunable is @samp{100}.
+@end deftp
+
 @node Hardware Capability Tunables
 @section Hardware Capability Tunables
 @cindex hardware capability tunables
diff --git a/nptl/Makefile b/nptl/Makefile
index 94be92c..bd1096f 100644
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -139,7 +139,8 @@ libpthread-routines = nptl-init vars events version pt-interp \
 		      pthread_mutex_getprioceiling \
 		      pthread_mutex_setprioceiling \
 		      pthread_setname pthread_getname \
-		      pthread_setattr_default_np pthread_getattr_default_np
+		      pthread_setattr_default_np pthread_getattr_default_np \
+		      pthread_mutex_conf
 #		      pthread_setuid pthread_seteuid pthread_setreuid \
 #		      pthread_setresuid \
 #		      pthread_setgid pthread_setegid pthread_setregid \
diff --git a/nptl/pthread_mutex_conf.c b/nptl/pthread_mutex_conf.c
new file mode 100644
index 0000000..7f9eb23
--- /dev/null
+++ b/nptl/pthread_mutex_conf.c
@@ -0,0 +1,74 @@
+/* pthread_mutex_conf.c: Pthread mutex tunable parameters.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include <pthreadP.h>
+#include <init-arch.h>
+#include <pthread_mutex_conf.h>
+#include <unistd.h>
+
+#if HAVE_TUNABLES
+# define TUNABLE_NAMESPACE mutex
+#endif
+#include <elf/dl-tunables.h>
+
+
+struct mutex_config __mutex_aconf =
+{
+  /* The maximum number of times a thread should spin on the lock before
+  calling into kernel to block.  */
+  .spin_count = 100,
+};
+
+#if HAVE_TUNABLES
+static inline void __always_inline
+do_set_mutex_spin_count (int32_t value)
+{
+  __mutex_aconf.spin_count = value;
+}
+
+void
+TUNABLE_CALLBACK (set_mutex_spin_count) (tunable_val_t *valp)
+{
+  int32_t value = (int32_t) (valp)->numval;
+  do_set_mutex_spin_count (value);
+}
+#endif
+
+static void
+mutex_tunables_init (int argc __attribute__ ((unused)),
+			      char **argv  __attribute__ ((unused)),
+					      char **environ)
+{
+#if HAVE_TUNABLES
+  TUNABLE_GET (spin_count, int32_t,
+               TUNABLE_CALLBACK (set_mutex_spin_count));
+#endif
+}
+
+#ifdef SHARED
+# define INIT_SECTION ".init_array"
+#else
+# define INIT_SECTION ".preinit_array"
+#endif
+
+void (*const __pthread_mutex_tunables_init_array []) (int, char **, char **)
+  __attribute__ ((section (INIT_SECTION), aligned (sizeof (void *)))) =
+{
+  &mutex_tunables_init
+};
diff --git a/nptl/pthread_mutex_conf.h b/nptl/pthread_mutex_conf.h
new file mode 100644
index 0000000..e5b027c
--- /dev/null
+++ b/nptl/pthread_mutex_conf.h
@@ -0,0 +1,31 @@
+/* pthread_mutex_conf.h: Pthread mutex tunable parameters.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+#ifndef _PTHREAD_MUTEX_CONF_H
+#define _PTHREAD_MUTEX_CONF_H 1
+
+#include <pthread.h>
+#include <time.h>
+
+struct mutex_config
+{
+  int spin_count;
+};
+
+extern struct mutex_config __mutex_aconf attribute_hidden;
+
+#endif
-- 
2.7.4

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v3 3/3] Mutex: Replace trylock by read only while spinning
  2018-05-23  9:25 [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex Kemi Wang
  2018-05-23  9:25 ` [PATCH v3 2/3] benchtests: Add pthread adaptive spin mutex microbenchmark Kemi Wang
@ 2018-05-23  9:25 ` Kemi Wang
  2018-05-25  8:52 ` [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex Wang, Kemi
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Kemi Wang @ 2018-05-23  9:25 UTC (permalink / raw)
  To: Adhemerval Zanella, Florian Weimer, Rical Jason, Carlos Donell,
	Glibc alpha
  Cc: Dave Hansen, Tim Chen, Andi Kleen, Ying Huang, Aaron Lu,
	Lu Aubrey, Kemi Wang

The pthread adaptive spin mutex spins on the lock for a while before
calling into the kernel to block. But, in the current implementation of
spinning, the spinners go straight back to LLL_MUTEX_TRYLOCK(cmpxchg) when
the lock is contended, it is not a good idea on many targets as that will
force expensive memory synchronization among processors and penalize other
running threads. For example, it constantly floods the system with "read
for ownership" requests, which are much more expensive to process than a
single read. Thus, we only use MO read until we observe the lock to not be
acquired anymore, as suggeusted by Andi Kleen.

Performance impact:
Significant mutex performance improvement is not expected for this patch,
though, it probably bring some benefit for the scenarios with severe lock
contention on many architectures, the whole system performance can benefit
from this modification because a number of unnecessary "read for ownership"
requests which stress the cache system via read and invalidate broadcast
are eliminated during spinning.
Meanwhile, it may have some tiny performance regression on the lock holder
transformation for the case of lock acquisition via spinning gets, because
the lock state is checked before acquiring the lock via trylock. In the
worst case, the extra latency of read and pause is added except that of
trylock when lock is available.

Similar mechanism has been implemented for pthread spin lock.

Test machine:
2-sockets Skylake platform, 112 cores with 62G RAM

Test case: mutex-adaptive-thread (Contended pthread adaptive spin mutex
with global update)
Usage: make bench BENCHSET=mutex-adaptive-thread
Test result:
+----------------+-----------------+-----------------+------------+
|  Configuration |      Base       |      Head       | % Change   |
|                | Total iteration | Total iteration | base->head |
+----------------+-----------------+-----------------+------------+
|                |           Critical section size: 1x            |
+----------------+------------------------------------------------+
|1 thread        |  2.76681e+07    |  2.7965e+07     |   +1.1%    |
|2 threads       |  3.29905e+07    |  3.55279e+07    |   +7.7%    |
|3 threads       |  4.38102e+07    |  3.98567e+07    |   -9.0%    |
|4 threads       |  1.72172e+07    |  2.09498e+07    |   +21.7%   |
|28 threads      |  1.03732e+07    |  1.05133e+07    |   +1.4%    |
|56 threads      |  1.06308e+07    |  5.06522e+07    |   +14.6%   |
|112 threads     |  8.55177e+06    |  1.02954e+07    |   +20.4%   |
+----------------+------------------------------------------------+
|                |           Critical section size: 10x           |
+----------------+------------------------------------------------+
|1 thread        |  1.57006e+07    |  1.54727e+07    |   -1.5%    |
|2 threads       |  1.8044e+07     |  1.75601e+07    |   -2.7%    |
|3 threads       |  1.35634e+07    |  1.46384e+07    |   +7.9%    |
|4 threads       |  1.21257e+07    |  1.32046e+07    |   +8.9%    |
|28 threads      |  8.09593e+06    |  1.02713e+07    |   +26.9%   |
|56 threads      |  9.09907e+06    |  4.16203e+07    |   +16.4%   |
|112 threads     |  7.09731e+06    |  8.62406e+06    |   +21.5%   |
+----------------+------------------------------------------------+
|                |           Critical section size: 100x          |
+----------------+------------------------------------------------+
|1 thread        |  2.87116e+06    | 2.89188e+06     |   +0.7%    |
|2 threads       |  2.23409e+06    | 2.24216e+06     |   +0.4%    |
|3 threads       |  2.29888e+06    | 2.29964e+06     |   +0.0%    |
|4 threads       |  2.26898e+06    | 2.21394e+06     |   -2.4%    |
|28 threads      |  1.03228e+06    | 1.0051e+06      |   -2.6%    |
|56 threads      |  1.02953 +06    | 1.6344e+07      |   -2.3%    |
|112 threads     |  1.01615e+06    | 1.00134e+06     |   -1.5%    |
+----------------+------------------------------------------------+
|                |           Critical section size: 1000x         |
+----------------+------------------------------------------------+
|1 thread        |  316392         |  315635         |   -0.2%    |
|2 threads       |  302806         |  303469         |   +0.2%    |
|3 threads       |  298506         |  294281         |   -1.4%    |
|4 threads       |  292037         |  289945         |   -0.7%    |
|28 threads      |  155188         |  155250         |   +0.0%    |
|56 threads      |  190657         |  183106         |   -4.0%    |
|112 threads     |  210818         |  220342         |   +4.5%    |
+----------------+-----------------+-----------------+------------+

    * nptl/pthread_mutex_lock.c: Optimize adaptive spin mutex

ChangLog:
    V2->V3:
    a) Drop the idea of blocking spinners if fail to acquire a lock, since
       this idea would not be an universal winner. E.g. several threads
       contend for a lock which protects a small critical section, thus,
       probably any thread can acquire the lock via spinning.
    b) Fix the format issue AFAIC

    V1->V2: fix format issue

Suggested-by: Andi Kleen <andi.kleen@intel.com>
Signed-off-by: Kemi Wang <kemi.wang@intel.com>
---
 ChangeLog                 |  5 +++++
 nptl/pthread_mutex_lock.c | 41 +++++++++++++++++++++++++----------------
 2 files changed, 30 insertions(+), 16 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index e2991e9..3bafb0e 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,10 @@
 2018-05-23  Kemi Wang <kemi.wang@intel.com>
 
+	* nptl/pthread_mutex_lock.c: Replace trylock by read only while
+	spinning.
+
+2018-05-23  Kemi Wang <kemi.wang@intel.com>
+
 	* benchtests/bench-mutex-adaptive-thread.c: Microbenchmark for adaptive
 	spin mutex.
 	* benchmark/Makefile: Add adaptive spin mutex benchmark.
diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
index 1519c14..7ce50f6 100644
--- a/nptl/pthread_mutex_lock.c
+++ b/nptl/pthread_mutex_lock.c
@@ -26,6 +26,7 @@
 #include <atomic.h>
 #include <lowlevellock.h>
 #include <stap-probe.h>
+#include <pthread_mutex_conf.h>
 
 #ifndef lll_lock_elision
 #define lll_lock_elision(lock, try_lock, private)	({ \
@@ -123,22 +124,30 @@ __pthread_mutex_lock (pthread_mutex_t *mutex)
 
       if (LLL_MUTEX_TRYLOCK (mutex) != 0)
 	{
-	  int cnt = 0;
-	  int max_cnt = MIN (MAX_ADAPTIVE_COUNT,
-			     mutex->__data.__spins * 2 + 10);
-	  do
-	    {
-	      if (cnt++ >= max_cnt)
-		{
-		  LLL_MUTEX_LOCK (mutex);
-		  break;
-		}
-	      atomic_spin_nop ();
-	    }
-	  while (LLL_MUTEX_TRYLOCK (mutex) != 0);
-
-	  mutex->__data.__spins += (cnt - mutex->__data.__spins) / 8;
-	}
+      int val = 0;
+      int cnt = 0;
+      int max_cnt = MIN (__mutex_aconf.spin_count,
+                        mutex->__data.__spins * 2 + 10);
+
+      do
+        {
+          if (cnt >= max_cnt)
+            {
+              LLL_MUTEX_LOCK (mutex);
+              break;
+            }
+          /* Read only while spinning unless lock is available.  */
+          do
+            {
+              atomic_spin_nop ();
+              val = atomic_load_relaxed (&mutex->__data.__lock);
+            }
+          while (val != 0 && ++cnt < max_cnt);
+        }
+      while  (LLL_MUTEX_TRYLOCK (mutex) != 0);
+
+      mutex->__data.__spins += (cnt - mutex->__data.__spins) / 8;
+    }
       assert (mutex->__data.__owner == 0);
     }
   else
-- 
2.7.4

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v3 2/3] benchtests: Add pthread adaptive spin mutex microbenchmark
  2018-05-23  9:25 [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex Kemi Wang
@ 2018-05-23  9:25 ` Kemi Wang
  2018-05-23  9:25 ` [PATCH v3 3/3] Mutex: Replace trylock by read only while spinning Kemi Wang
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Kemi Wang @ 2018-05-23  9:25 UTC (permalink / raw)
  To: Adhemerval Zanella, Florian Weimer, Rical Jason, Carlos Donell,
	Glibc alpha
  Cc: Dave Hansen, Tim Chen, Andi Kleen, Ying Huang, Aaron Lu,
	Lu Aubrey, Kemi Wang

Add a microbenchmark for measuring mutex lock and unlock performance with
varying numbers of threads and varying size of a critical section. The
benchmark leverages the mutex lock and unlock operation for protecting the
critical section and measures the minimum iterations, the maximum
iterations, and the total iterations within a fixed duration. Variants of
benchmark are run with 1, 2, 3, 4, nproc/4, nproc/2, nproc threads.

The size of critical section is determined by the number of times of pause
(x86 only) and read which is intended to emulate the scenarios in real
applications. In this microbenchmark, the number 1, 10, 100, and 1000 are
used to represent different size of critical sections in the working set.

    * benchtests/bench-mutex-adaptive-thread.c: Microbenchmark for
    adaptive spin mutex
    * benchmark/Makefile: Add adaptive spin mutex benchmark

ChangLog:
    V2->V3:
    a) Add some delay after mutex unlock to reduce the possibility of lock
    acquisition by the previous lock holder, and that makes more sense
    for practical applications
    b) Sync threads to start simultaneously
    c) Set CPU affinity for threads

    V1->V2:
    New added microbenchmark, as requested by Adhemerval Zanella

Signed-off-by: Kemi Wang <kemi.wang@intel.com>
---
 ChangeLog                                |   6 +
 benchtests/Makefile                      |  36 ++++-
 benchtests/bench-mutex-adaptive-thread.c | 251 +++++++++++++++++++++++++++++++
 3 files changed, 286 insertions(+), 7 deletions(-)
 create mode 100644 benchtests/bench-mutex-adaptive-thread.c

diff --git a/ChangeLog b/ChangeLog
index c1e2a44..e2991e9 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,11 @@
 2018-05-23  Kemi Wang <kemi.wang@intel.com>
 
+	* benchtests/bench-mutex-adaptive-thread.c: Microbenchmark for adaptive
+	spin mutex.
+	* benchmark/Makefile: Add adaptive spin mutex benchmark.
+
+2018-05-23  Kemi Wang <kemi.wang@intel.com>
+
 	* elf/dl-tunables.list: Add glibc.mutex.spin_count entry.
 	* manual/tunables.texi: Add glibc.mutex.spin_count description.
 	* nptl/Makefile: Add pthread_mutex_conf.c for compilation.
diff --git a/benchtests/Makefile b/benchtests/Makefile
index bcd6a9c..fcc768f 100644
--- a/benchtests/Makefile
+++ b/benchtests/Makefile
@@ -95,10 +95,17 @@ else
 bench-malloc := $(filter malloc-%,${BENCHSET})
 endif
 
+ifeq (${BENCHSET},)
+bench-mutex := mutex-adaptive-thread
+else
+bench-mutex := $(filter mutex-%,${BENCHSET})
+endif
+
 $(addprefix $(objpfx)bench-,$(bench-math)): $(libm)
 $(addprefix $(objpfx)bench-,$(math-benchset)): $(libm)
 $(addprefix $(objpfx)bench-,$(bench-pthread)): $(shared-thread-library)
 $(objpfx)bench-malloc-thread: $(shared-thread-library)
+$(addprefix $(objpfx)bench-,$(bench-mutex)): $(shared-thread-library)
 
 \f
 
@@ -119,6 +126,7 @@ include ../Rules
 binaries-bench := $(addprefix $(objpfx)bench-,$(bench))
 binaries-benchset := $(addprefix $(objpfx)bench-,$(benchset))
 binaries-bench-malloc := $(addprefix $(objpfx)bench-,$(bench-malloc))
+binaries-bench-mutex := $(addprefix $(objpfx)bench-,$(bench-mutex))
 
 # The default duration: 10 seconds.
 ifndef BENCH_DURATION
@@ -142,7 +150,7 @@ endif
 # This makes sure CPPFLAGS-nonlib and CFLAGS-nonlib are passed
 # for all these modules.
 cpp-srcs-left := $(binaries-benchset:=.c) $(binaries-bench:=.c) \
-		 $(binaries-bench-malloc:=.c)
+	$(binaries-bench-malloc:=.c) $(binaries-bench-mutex:=.c)
 lib := nonlib
 include $(patsubst %,$(..)libof-iterator.mk,$(cpp-srcs-left))
 
@@ -158,6 +166,7 @@ bench-clean:
 	rm -f $(binaries-bench) $(addsuffix .o,$(binaries-bench))
 	rm -f $(binaries-benchset) $(addsuffix .o,$(binaries-benchset))
 	rm -f $(binaries-bench-malloc) $(addsuffix .o,$(binaries-bench-malloc))
+	rm -f $(binaries-bench-mutex) $(addsuffix .o,$(binaries-bench-mutex))
 	rm -f $(timing-type) $(addsuffix .o,$(timing-type))
 	rm -f $(addprefix $(objpfx),$(bench-extra-objs))
 
@@ -165,7 +174,7 @@ bench-clean:
 ifneq ($(strip ${BENCHSET}),)
 VALIDBENCHSETNAMES := bench-pthread bench-math bench-string string-benchset \
    wcsmbs-benchset stdlib-benchset stdio-common-benchset math-benchset \
-   malloc-thread
+   malloc-thread mutex-adaptive-thread
 INVALIDBENCHSETNAMES := $(filter-out ${VALIDBENCHSETNAMES},${BENCHSET})
 ifneq (${INVALIDBENCHSETNAMES},)
 $(info The following values in BENCHSET are invalid: ${INVALIDBENCHSETNAMES})
@@ -176,7 +185,7 @@ endif
 
 # Define the bench target only if the target has a usable python installation.
 ifdef PYTHON
-bench: bench-build bench-set bench-func bench-malloc
+bench: bench-build bench-set bench-func bench-malloc bench-mutex
 else
 bench:
 	@echo "The bench target needs python to run."
@@ -187,10 +196,10 @@ endif
 # only if we're building natively.
 ifeq (no,$(cross-compiling))
 bench-build: $(gen-locales) $(timing-type) $(binaries-bench) \
-	$(binaries-benchset) $(binaries-bench-malloc)
+	$(binaries-benchset) $(binaries-bench-malloc) $(binaries-bench-mutex)
 else
 bench-build: $(timing-type) $(binaries-bench) $(binaries-benchset) \
-	$(binaries-bench-malloc)
+	$(binaries-bench-malloc) $(binaries-bench-mutex)
 endif
 
 bench-set: $(binaries-benchset)
@@ -207,6 +216,19 @@ bench-malloc: $(binaries-bench-malloc)
 	  done;\
 	done
 
+# Run benchmark with 1, 2, 3, nproc/2, nproc threads
+bench-mutex: $(binaries-bench-mutex)
+	for run in $^; do \
+		prev=0; \
+		for thr in 1 2 3 4 $$((`nproc` / 4)) $$((`nproc` / 2)) `nproc`; do \
+			if [ $$thr -gt $$prev -a $$thr -lt `nproc` ]; then \
+			echo "Running $${run} $${thr}"; \
+			fi; \
+			prev=$$thr; \
+	  $(run-bench) $${thr} > $${run}-$${thr}.out; \
+	  done;\
+	done
+
 # Build and execute the benchmark functions.  This target generates JSON
 # formatted bench.out.  Each of the programs produce independent JSON output,
 # so one could even execute them individually and process it using any JSON
@@ -236,8 +258,8 @@ bench-func: $(binaries-bench)
 	fi
 
 $(timing-type) $(binaries-bench) $(binaries-benchset) \
-	$(binaries-bench-malloc): %: %.o $(objpfx)json-lib.o \
-	$(link-extra-libs-tests) \
+	$(binaries-bench-malloc) $(binaries-bench-mutex): \
+	%: %.o $(objpfx)json-lib.o $(link-extra-libs-tests) \
   $(sort $(filter $(common-objpfx)lib%,$(link-libc))) \
   $(addprefix $(csu-objpfx),start.o) $(+preinit) $(+postinit)
 	$(+link-tests)
diff --git a/benchtests/bench-mutex-adaptive-thread.c b/benchtests/bench-mutex-adaptive-thread.c
new file mode 100644
index 0000000..ce9c40e
--- /dev/null
+++ b/benchtests/bench-mutex-adaptive-thread.c
@@ -0,0 +1,251 @@
+/* Benchmark pthread adaptive spin mutex lock and unlock functions.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <errno.h>
+#include <pthread.h>
+#include <signal.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/time.h>
+#include <unistd.h>
+#include <sched.h>
+
+#include "bench-timing.h"
+#include "json-lib.h"
+
+/* Benchmark duration in seconds.  */
+#define BENCHMARK_DURATION	15
+#define TYPE PTHREAD_MUTEX_ADAPTIVE_NP
+#define mb() asm ("" ::: "memory")
+#define UNLOCK_DELAY 10
+
+#if defined (__i386__) || defined (__x86_64__)
+# define cpu_relax() asm ("rep; nop")
+#else
+# define cpu_relax() do { } while (0)
+#endif
+
+static volatile int start_thread;
+static unsigned long long val;
+static pthread_mutexattr_t attr;
+static pthread_mutex_t mutex;
+
+#define WORKING_SET_SIZE  4
+int working_set[] = {1, 10, 100, 1000};
+
+struct thread_args
+{
+  unsigned long long iters;
+  int working_set;
+  timing_t elapsed;
+};
+
+static void
+init_mutex (void)
+{
+  pthread_mutexattr_init (&attr);
+  pthread_mutexattr_settype (&attr, TYPE);
+  pthread_mutex_init (&mutex, &attr);
+}
+
+static void
+init_parameter (int size, struct thread_args *args, int num_thread)
+{
+  int i;
+  for (i = 0; i < num_thread; i++)
+    {
+      memset(&args[i], 0, sizeof(struct thread_args));
+      args[i].working_set = size;
+    }
+}
+
+static volatile bool timeout;
+
+static void
+alarm_handler (int signum)
+{
+  timeout = true;
+}
+
+static inline void
+delay (int number)
+{
+  while (number > 0)
+  {
+    cpu_relax ();
+    number--;
+  }
+}
+
+/* Lock and unlock for protecting the critical section.  */
+static unsigned long long
+mutex_benchmark_loop (int number)
+{
+  unsigned long long iters = 0;
+
+  while (!start_thread)
+   cpu_relax ();
+
+  while (!timeout)
+    {
+      pthread_mutex_lock (&mutex);
+      val++;
+      delay (number);
+      pthread_mutex_unlock (&mutex);
+      iters++;
+      delay (UNLOCK_DELAY);
+    }
+  return iters;
+}
+
+static void *
+benchmark_thread (void *arg)
+{
+  struct thread_args *args = (struct thread_args *) arg;
+  unsigned long long iters;
+  timing_t start, stop;
+
+  TIMING_NOW (start);
+  iters = mutex_benchmark_loop (args->working_set);
+  TIMING_NOW (stop);
+
+  TIMING_DIFF (args->elapsed, start, stop);
+  args->iters = iters;
+
+  return NULL;
+}
+
+static void
+do_benchmark (size_t num_thread, struct thread_args *args)
+{
+
+  pthread_t threads[num_thread];
+
+  for (size_t i = 0; i < num_thread; i++)
+    {
+      pthread_attr_t attr;
+      cpu_set_t set;
+
+      pthread_attr_init (&attr);
+      CPU_ZERO (&set);
+      CPU_SET (i, &set);
+      pthread_attr_setaffinity_np (&attr, sizeof(cpu_set_t), &set);
+      pthread_create (&threads[i], &attr, benchmark_thread, args + i);
+      pthread_attr_destroy (&attr);
+    }
+
+  mb ();
+  start_thread = 1;
+  mb ();
+  sched_yield ();
+  for (size_t i = 0; i < num_thread; i++)
+    pthread_join(threads[i], NULL);
+}
+
+static void
+usage(const char *name)
+{
+  fprintf (stderr, "%s: <num_thread>\n", name);
+  exit (1);
+}
+
+int
+main (int argc, char **argv)
+{
+  int i, j, num_thread = 1;
+  json_ctx_t json_ctx;
+  struct sigaction act;
+
+  if (argc == 1)
+    num_thread = 1;
+  else if (argc == 2)
+    {
+      long ret;
+
+      errno = 0;
+      ret = strtol(argv[1], NULL, 10);
+
+      if (errno || ret == 0)
+	    usage(argv[0]);
+
+      num_thread = ret;
+    }
+  else
+    usage(argv[0]);
+
+  /* Benchmark for different critical section size.  */
+  for (i = 0; i < WORKING_SET_SIZE; i++)
+    {
+      int size = working_set[i];
+      struct thread_args args[num_thread];
+      unsigned long long iters = 0, min_iters = -1ULL, max_iters = 0;
+      double d_total_s = 0, d_total_i = 0;
+
+      timeout = false;
+      init_mutex ();
+      init_parameter (size, args, num_thread);
+
+      json_init (&json_ctx, 0, stdout);
+
+      json_document_begin (&json_ctx);
+
+      json_attr_string (&json_ctx, "timing_type", TIMING_TYPE);
+
+      json_attr_object_begin (&json_ctx, "functions");
+
+      json_attr_object_begin (&json_ctx, "mutex");
+
+      json_attr_object_begin (&json_ctx, "");
+
+      memset (&act, 0, sizeof (act));
+      act.sa_handler = &alarm_handler;
+
+      sigaction (SIGALRM, &act, NULL);
+
+      alarm (BENCHMARK_DURATION);
+
+      do_benchmark (num_thread, args);
+
+      for (j = 0; j < num_thread; j++)
+        {
+          iters = args[j].iters;
+          if (iters < min_iters)
+            min_iters = iters;
+          if (iters >= max_iters)
+            max_iters = iters;
+          d_total_i += iters;
+          TIMING_ACCUM (d_total_s, args[j].elapsed);
+        }
+      json_attr_double (&json_ctx, "duration", d_total_s);
+      json_attr_double (&json_ctx, "total_iterations", d_total_i);
+      json_attr_double (&json_ctx, "min_iteration", min_iters);
+      json_attr_double (&json_ctx, "max_iteration", max_iters);
+      json_attr_double (&json_ctx, "time_per_iteration", d_total_s / d_total_i);
+      json_attr_double (&json_ctx, "threads", num_thread);
+      json_attr_double (&json_ctx, "critical_section_size", size);
+
+      json_attr_object_end (&json_ctx);
+      json_attr_object_end (&json_ctx);
+      json_attr_object_end (&json_ctx);
+
+      json_document_end (&json_ctx);
+      fputs("\n", (&json_ctx)->fp);
+    }
+  return 0;
+}
-- 
2.7.4

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex
  2018-05-23  9:25 [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex Kemi Wang
  2018-05-23  9:25 ` [PATCH v3 2/3] benchtests: Add pthread adaptive spin mutex microbenchmark Kemi Wang
  2018-05-23  9:25 ` [PATCH v3 3/3] Mutex: Replace trylock by read only while spinning Kemi Wang
@ 2018-05-25  8:52 ` Wang, Kemi
  2018-05-30  3:03   ` kemi
  2018-06-07 13:07 ` Szabolcs Nagy
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Wang, Kemi @ 2018-05-25  8:52 UTC (permalink / raw)
  To: Wang, Kemi, Adhemerval Zanella, Florian Weimer, Rical Jason,
	Carlos Donell, Glibc alpha
  Cc: Dave Hansen, Chen, Tim C, Kleen, Andi, Huang, Ying, Lu, Aaron,
	Li, Aubrey

Hi,  Maintainers
    OK for master branch for this updated V3 patch series?

-----Original Message-----
From: libc-alpha-owner@sourceware.org [mailto:libc-alpha-owner@sourceware.org] On Behalf Of Kemi Wang
Sent: Wednesday, May 23, 2018 5:23 PM
To: Adhemerval Zanella <adhemerval.zanella@linaro.org>; Florian Weimer <fweimer@redhat.com>; Rical Jason <rj@2c3t.io>; Carlos Donell <carlos@redhat.com>; Glibc alpha <libc-alpha@sourceware.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>; Chen, Tim C <tim.c.chen@intel.com>; Kleen, Andi <andi.kleen@intel.com>; Huang, Ying <ying.huang@intel.com>; Lu, Aaron <aaron.lu@intel.com>; Li, Aubrey <aubrey.li@intel.com>; Wang, Kemi <kemi.wang@intel.com>
Subject: [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex

This patch does not have any functionality change, we only provide a spin count tunes for pthread adaptive spin mutex. The tunable glibc.mutex.spin_count tunes can be used by system administrator to squeeze system performance according to different hardware capabilities and workload characteristics.

The maximum value of spin count is limited to 30000 to avoid the overflow of mutex->__data.__spins variable with the possible type of short in pthread_mutex_lock ().

The default value of spin count is set to 100 with the reference to the previous number of times of spinning via trylock. This value would be architecture-specific and can be tuned with kinds of benchmarks to fit most cases in future.

This is the preparation work for the next patch, in which the way of adaptive spin would be changed from an expensive cmpxchg to read while spinning.

   * elf/dl-tunables.list: Add glibc.mutex.spin_count entry.
   * manual/tunables.texi: Add glibc.mutex.spin_count description.
   * nptl/Makefile: Add pthread_mutex_conf.c for compilation.
   * nptl/pthread_mutex_conf.h: New file.
   * nptl/pthread_mutex_conf.c: New file.

ChangeLog:
    V2->V3
    a) Polish the description of glibc.mutex.spin_count tunable with the
    help from Rical Jasan.
    b) Get rid of the TUNABLE_CALLBACK_FNDECL macros in
    pthread_mutex_conf.c, as suggested by Florian Weimer.
    c) Adjust the default value of spin count to 100 with the reference of
    the previous spinning way via trylock.

    V1->V2
    a) Renamed nptl/mutex-conf.h -> nptl/pthread_mutex_conf.h
    b) Renamed nptl/mutex-conf.c -> nptl/pthread_mutex_conf.c
    c) Change the Makefile to compile pthread_mutex_conf.c
    d) Modify the copyright "2013-2018" -> "2018" for new added files
    e) Fix the indentation issue (tab -> double space) in
    elf/dl-tunables.list
    f) Remove the env alias LD_SPIN_COUNT in elf/dl-tunables.list
    g) Fix the typo errors and refresh glibc.mutex.spin_count tunable
    description in manual/tunables.texi.
    h) Fix the indentation issue in nptl/pthread_mutex_conf.c
    i) Fix the indentation issue for nested preprocessor (add one space for
    each level)

Suggested-by: Andi Kleen <andi.kleen@intel.com>
Signed-off-by: Kemi Wang <kemi.wang@intel.com>
---
 ChangeLog                 |  8 +++++
 elf/dl-tunables.list      |  9 ++++++
 manual/tunables.texi      | 22 ++++++++++++++
 nptl/Makefile             |  3 +-
 nptl/pthread_mutex_conf.c | 74 +++++++++++++++++++++++++++++++++++++++++++++++
 nptl/pthread_mutex_conf.h | 31 ++++++++++++++++++++
 6 files changed, 146 insertions(+), 1 deletion(-)  create mode 100644 nptl/pthread_mutex_conf.c  create mode 100644 nptl/pthread_mutex_conf.h

diff --git a/ChangeLog b/ChangeLog
index 8032adf..c1e2a44 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,11 @@
+2018-05-23  Kemi Wang <kemi.wang@intel.com>
+
+	* elf/dl-tunables.list: Add glibc.mutex.spin_count entry.
+	* manual/tunables.texi: Add glibc.mutex.spin_count description.
+	* nptl/Makefile: Add pthread_mutex_conf.c for compilation.
+	* nptl/pthread_mutex_conf.h: New file.
+	* nptl/pthread_mutex_conf.c: New file.
+
 2018-05-23  Andreas Schwab  <schwab@suse.de>
 
 	[BZ #23196]
diff --git a/elf/dl-tunables.list b/elf/dl-tunables.list index 1f8ecb8..f1f3546 100644
--- a/elf/dl-tunables.list
+++ b/elf/dl-tunables.list
@@ -121,4 +121,13 @@ glibc {
       default: 3
     }
   }
+
+  mutex {
+    spin_count {
+      type: INT_32
+      minval: 0
+      maxval: 30000
+      default: 100
+    }
+  }
 }
diff --git a/manual/tunables.texi b/manual/tunables.texi index be33c9f..05c5e4a 100644
--- a/manual/tunables.texi
+++ b/manual/tunables.texi
@@ -281,6 +281,28 @@ of try lock attempts.
 The default value of this tunable is @samp{3}.
 @end deftp
 
+@node Pthread Mutex Tunables
+@section Pthread Mutex Tunables
+@cindex pthread mutex tunables
+
+@deftp {Tunable namespace} glibc.mutex
+The behavior of pthread mutexes can be tuned to gain performance 
+improvements according to specific hardware capabilities and workload 
+characteristics by setting the following tunables in the @code{mutex} namespace:
+@end deftp
+
+@deftp Tunable glibc.mutex.spin_count
+The @code{glibc.mutex.spin_count} tunable sets the maximum number of 
+times a thread should spin on the lock before calling into the kernel to block.
+Adaptive spin is used for mutexes initialized with the 
+PTHREAD_MUTEX_ADAPTIVE_NP GNU extension.  It affects both pthread_mutex_lock and pthread_mutex_timedlock.
+
+The spinning is done until either the maximum spin times is reached or 
+the lock is acquired.
+
+The default value of this tunable is @samp{100}.
+@end deftp
+
 @node Hardware Capability Tunables
 @section Hardware Capability Tunables
 @cindex hardware capability tunables
diff --git a/nptl/Makefile b/nptl/Makefile index 94be92c..bd1096f 100644
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -139,7 +139,8 @@ libpthread-routines = nptl-init vars events version pt-interp \
 		      pthread_mutex_getprioceiling \
 		      pthread_mutex_setprioceiling \
 		      pthread_setname pthread_getname \
-		      pthread_setattr_default_np pthread_getattr_default_np
+		      pthread_setattr_default_np pthread_getattr_default_np \
+		      pthread_mutex_conf
 #		      pthread_setuid pthread_seteuid pthread_setreuid \
 #		      pthread_setresuid \
 #		      pthread_setgid pthread_setegid pthread_setregid \
diff --git a/nptl/pthread_mutex_conf.c b/nptl/pthread_mutex_conf.c new file mode 100644 index 0000000..7f9eb23
--- /dev/null
+++ b/nptl/pthread_mutex_conf.c
@@ -0,0 +1,74 @@
+/* pthread_mutex_conf.c: Pthread mutex tunable parameters.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include <pthreadP.h>
+#include <init-arch.h>
+#include <pthread_mutex_conf.h>
+#include <unistd.h>
+
+#if HAVE_TUNABLES
+# define TUNABLE_NAMESPACE mutex
+#endif
+#include <elf/dl-tunables.h>
+
+
+struct mutex_config __mutex_aconf =
+{
+  /* The maximum number of times a thread should spin on the lock 
+before
+  calling into kernel to block.  */
+  .spin_count = 100,
+};
+
+#if HAVE_TUNABLES
+static inline void __always_inline
+do_set_mutex_spin_count (int32_t value) {
+  __mutex_aconf.spin_count = value;
+}
+
+void
+TUNABLE_CALLBACK (set_mutex_spin_count) (tunable_val_t *valp) {
+  int32_t value = (int32_t) (valp)->numval;
+  do_set_mutex_spin_count (value);
+}
+#endif
+
+static void
+mutex_tunables_init (int argc __attribute__ ((unused)),
+			      char **argv  __attribute__ ((unused)),
+					      char **environ)
+{
+#if HAVE_TUNABLES
+  TUNABLE_GET (spin_count, int32_t,
+               TUNABLE_CALLBACK (set_mutex_spin_count)); #endif }
+
+#ifdef SHARED
+# define INIT_SECTION ".init_array"
+#else
+# define INIT_SECTION ".preinit_array"
+#endif
+
+void (*const __pthread_mutex_tunables_init_array []) (int, char **, 
+char **)
+  __attribute__ ((section (INIT_SECTION), aligned (sizeof (void *)))) = 
+{
+  &mutex_tunables_init
+};
diff --git a/nptl/pthread_mutex_conf.h b/nptl/pthread_mutex_conf.h new file mode 100644 index 0000000..e5b027c
--- /dev/null
+++ b/nptl/pthread_mutex_conf.h
@@ -0,0 +1,31 @@
+/* pthread_mutex_conf.h: Pthread mutex tunable parameters.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+#ifndef _PTHREAD_MUTEX_CONF_H
+#define _PTHREAD_MUTEX_CONF_H 1
+
+#include <pthread.h>
+#include <time.h>
+
+struct mutex_config
+{
+  int spin_count;
+};
+
+extern struct mutex_config __mutex_aconf attribute_hidden;
+
+#endif
--
2.7.4

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex
  2018-05-25  8:52 ` [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex Wang, Kemi
@ 2018-05-30  3:03   ` kemi
  2018-06-07  4:57     ` kemi
  0 siblings, 1 reply; 15+ messages in thread
From: kemi @ 2018-05-30  3:03 UTC (permalink / raw)
  To: Adhemerval Zanella, Florian Weimer, Rical Jason, Carlos Donell,
	Glibc alpha
  Cc: Dave Hansen, Chen, Tim C, Kleen, Andi, Huang, Ying, Lu, Aaron,
	Li, Aubrey

Just soft PING. Anything required at my side(e.g. make this change opt-in for x86 architecture).
Thanks for comments.

https://sourceware.org/ml/libc-alpha/2018-05/msg00750.html
https://sourceware.org/ml/libc-alpha/2018-05/msg00752.html
https://sourceware.org/ml/libc-alpha/2018-05/msg00751.html


On 2018年05月25日 16:51, Wang, Kemi wrote:
> Hi,  Maintainers
>     OK for master branch for this updated V3 patch series?
> 
> -----Original Message-----
> From: libc-alpha-owner@sourceware.org [mailto:libc-alpha-owner@sourceware.org] On Behalf Of Kemi Wang
> Sent: Wednesday, May 23, 2018 5:23 PM
> To: Adhemerval Zanella <adhemerval.zanella@linaro.org>; Florian Weimer <fweimer@redhat.com>; Rical Jason <rj@2c3t.io>; Carlos Donell <carlos@redhat.com>; Glibc alpha <libc-alpha@sourceware.org>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>; Chen, Tim C <tim.c.chen@intel.com>; Kleen, Andi <andi.kleen@intel.com>; Huang, Ying <ying.huang@intel.com>; Lu, Aaron <aaron.lu@intel.com>; Li, Aubrey <aubrey.li@intel.com>; Wang, Kemi <kemi.wang@intel.com>
> Subject: [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex
> 
> This patch does not have any functionality change, we only provide a spin count tunes for pthread adaptive spin mutex. The tunable glibc.mutex.spin_count tunes can be used by system administrator to squeeze system performance according to different hardware capabilities and workload characteristics.
> 
> The maximum value of spin count is limited to 30000 to avoid the overflow of mutex->__data.__spins variable with the possible type of short in pthread_mutex_lock ().
> 
> The default value of spin count is set to 100 with the reference to the previous number of times of spinning via trylock. This value would be architecture-specific and can be tuned with kinds of benchmarks to fit most cases in future.
> 
> This is the preparation work for the next patch, in which the way of adaptive spin would be changed from an expensive cmpxchg to read while spinning.
> 
>    * elf/dl-tunables.list: Add glibc.mutex.spin_count entry.
>    * manual/tunables.texi: Add glibc.mutex.spin_count description.
>    * nptl/Makefile: Add pthread_mutex_conf.c for compilation.
>    * nptl/pthread_mutex_conf.h: New file.
>    * nptl/pthread_mutex_conf.c: New file.
> 
> ChangeLog:
>     V2->V3
>     a) Polish the description of glibc.mutex.spin_count tunable with the
>     help from Rical Jasan.
>     b) Get rid of the TUNABLE_CALLBACK_FNDECL macros in
>     pthread_mutex_conf.c, as suggested by Florian Weimer.
>     c) Adjust the default value of spin count to 100 with the reference of
>     the previous spinning way via trylock.
> 
>     V1->V2
>     a) Renamed nptl/mutex-conf.h -> nptl/pthread_mutex_conf.h
>     b) Renamed nptl/mutex-conf.c -> nptl/pthread_mutex_conf.c
>     c) Change the Makefile to compile pthread_mutex_conf.c
>     d) Modify the copyright "2013-2018" -> "2018" for new added files
>     e) Fix the indentation issue (tab -> double space) in
>     elf/dl-tunables.list
>     f) Remove the env alias LD_SPIN_COUNT in elf/dl-tunables.list
>     g) Fix the typo errors and refresh glibc.mutex.spin_count tunable
>     description in manual/tunables.texi.
>     h) Fix the indentation issue in nptl/pthread_mutex_conf.c
>     i) Fix the indentation issue for nested preprocessor (add one space for
>     each level)
> 
> Suggested-by: Andi Kleen <andi.kleen@intel.com>
> Signed-off-by: Kemi Wang <kemi.wang@intel.com>
> ---
>  ChangeLog                 |  8 +++++
>  elf/dl-tunables.list      |  9 ++++++
>  manual/tunables.texi      | 22 ++++++++++++++
>  nptl/Makefile             |  3 +-
>  nptl/pthread_mutex_conf.c | 74 +++++++++++++++++++++++++++++++++++++++++++++++
>  nptl/pthread_mutex_conf.h | 31 ++++++++++++++++++++
>  6 files changed, 146 insertions(+), 1 deletion(-)  create mode 100644 nptl/pthread_mutex_conf.c  create mode 100644 nptl/pthread_mutex_conf.h
> 
> diff --git a/ChangeLog b/ChangeLog
> index 8032adf..c1e2a44 100644
> --- a/ChangeLog
> +++ b/ChangeLog
> @@ -1,3 +1,11 @@
> +2018-05-23  Kemi Wang <kemi.wang@intel.com>
> +
> +	* elf/dl-tunables.list: Add glibc.mutex.spin_count entry.
> +	* manual/tunables.texi: Add glibc.mutex.spin_count description.
> +	* nptl/Makefile: Add pthread_mutex_conf.c for compilation.
> +	* nptl/pthread_mutex_conf.h: New file.
> +	* nptl/pthread_mutex_conf.c: New file.
> +
>  2018-05-23  Andreas Schwab  <schwab@suse.de>
>  
>  	[BZ #23196]
> diff --git a/elf/dl-tunables.list b/elf/dl-tunables.list index 1f8ecb8..f1f3546 100644
> --- a/elf/dl-tunables.list
> +++ b/elf/dl-tunables.list
> @@ -121,4 +121,13 @@ glibc {
>        default: 3
>      }
>    }
> +
> +  mutex {
> +    spin_count {
> +      type: INT_32
> +      minval: 0
> +      maxval: 30000
> +      default: 100
> +    }
> +  }
>  }
> diff --git a/manual/tunables.texi b/manual/tunables.texi index be33c9f..05c5e4a 100644
> --- a/manual/tunables.texi
> +++ b/manual/tunables.texi
> @@ -281,6 +281,28 @@ of try lock attempts.
>  The default value of this tunable is @samp{3}.
>  @end deftp
>  
> +@node Pthread Mutex Tunables
> +@section Pthread Mutex Tunables
> +@cindex pthread mutex tunables
> +
> +@deftp {Tunable namespace} glibc.mutex
> +The behavior of pthread mutexes can be tuned to gain performance 
> +improvements according to specific hardware capabilities and workload 
> +characteristics by setting the following tunables in the @code{mutex} namespace:
> +@end deftp
> +
> +@deftp Tunable glibc.mutex.spin_count
> +The @code{glibc.mutex.spin_count} tunable sets the maximum number of 
> +times a thread should spin on the lock before calling into the kernel to block.
> +Adaptive spin is used for mutexes initialized with the 
> +PTHREAD_MUTEX_ADAPTIVE_NP GNU extension.  It affects both pthread_mutex_lock and pthread_mutex_timedlock.
> +
> +The spinning is done until either the maximum spin times is reached or 
> +the lock is acquired.
> +
> +The default value of this tunable is @samp{100}.
> +@end deftp
> +
>  @node Hardware Capability Tunables
>  @section Hardware Capability Tunables
>  @cindex hardware capability tunables
> diff --git a/nptl/Makefile b/nptl/Makefile index 94be92c..bd1096f 100644
> --- a/nptl/Makefile
> +++ b/nptl/Makefile
> @@ -139,7 +139,8 @@ libpthread-routines = nptl-init vars events version pt-interp \
>  		      pthread_mutex_getprioceiling \
>  		      pthread_mutex_setprioceiling \
>  		      pthread_setname pthread_getname \
> -		      pthread_setattr_default_np pthread_getattr_default_np
> +		      pthread_setattr_default_np pthread_getattr_default_np \
> +		      pthread_mutex_conf
>  #		      pthread_setuid pthread_seteuid pthread_setreuid \
>  #		      pthread_setresuid \
>  #		      pthread_setgid pthread_setegid pthread_setregid \
> diff --git a/nptl/pthread_mutex_conf.c b/nptl/pthread_mutex_conf.c new file mode 100644 index 0000000..7f9eb23
> --- /dev/null
> +++ b/nptl/pthread_mutex_conf.c
> @@ -0,0 +1,74 @@
> +/* pthread_mutex_conf.c: Pthread mutex tunable parameters.
> +   Copyright (C) 2018 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#include "config.h"
> +#include <pthreadP.h>
> +#include <init-arch.h>
> +#include <pthread_mutex_conf.h>
> +#include <unistd.h>
> +
> +#if HAVE_TUNABLES
> +# define TUNABLE_NAMESPACE mutex
> +#endif
> +#include <elf/dl-tunables.h>
> +
> +
> +struct mutex_config __mutex_aconf =
> +{
> +  /* The maximum number of times a thread should spin on the lock 
> +before
> +  calling into kernel to block.  */
> +  .spin_count = 100,
> +};
> +
> +#if HAVE_TUNABLES
> +static inline void __always_inline
> +do_set_mutex_spin_count (int32_t value) {
> +  __mutex_aconf.spin_count = value;
> +}
> +
> +void
> +TUNABLE_CALLBACK (set_mutex_spin_count) (tunable_val_t *valp) {
> +  int32_t value = (int32_t) (valp)->numval;
> +  do_set_mutex_spin_count (value);
> +}
> +#endif
> +
> +static void
> +mutex_tunables_init (int argc __attribute__ ((unused)),
> +			      char **argv  __attribute__ ((unused)),
> +					      char **environ)
> +{
> +#if HAVE_TUNABLES
> +  TUNABLE_GET (spin_count, int32_t,
> +               TUNABLE_CALLBACK (set_mutex_spin_count)); #endif }
> +
> +#ifdef SHARED
> +# define INIT_SECTION ".init_array"
> +#else
> +# define INIT_SECTION ".preinit_array"
> +#endif
> +
> +void (*const __pthread_mutex_tunables_init_array []) (int, char **, 
> +char **)
> +  __attribute__ ((section (INIT_SECTION), aligned (sizeof (void *)))) = 
> +{
> +  &mutex_tunables_init
> +};
> diff --git a/nptl/pthread_mutex_conf.h b/nptl/pthread_mutex_conf.h new file mode 100644 index 0000000..e5b027c
> --- /dev/null
> +++ b/nptl/pthread_mutex_conf.h
> @@ -0,0 +1,31 @@
> +/* pthread_mutex_conf.h: Pthread mutex tunable parameters.
> +   Copyright (C) 2018 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +#ifndef _PTHREAD_MUTEX_CONF_H
> +#define _PTHREAD_MUTEX_CONF_H 1
> +
> +#include <pthread.h>
> +#include <time.h>
> +
> +struct mutex_config
> +{
> +  int spin_count;
> +};
> +
> +extern struct mutex_config __mutex_aconf attribute_hidden;
> +
> +#endif
> --
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex
  2018-05-30  3:03   ` kemi
@ 2018-06-07  4:57     ` kemi
  0 siblings, 0 replies; 15+ messages in thread
From: kemi @ 2018-06-07  4:57 UTC (permalink / raw)
  To: Adhemerval Zanella, Florian Weimer, Rical Jason, Carlos Donell,
	Glibc alpha
  Cc: Dave Hansen, Chen, Tim C, Kleen, Andi, Huang, Ying, Lu, Aaron,
	Li, Aubrey

3^PING

On 2018年05月30日 11:01, kemi wrote:
> Just soft PING. Anything required at my side(e.g. make this change opt-in for x86 architecture).
> Thanks for comments.
> 
> https://sourceware.org/ml/libc-alpha/2018-05/msg00750.html
> https://sourceware.org/ml/libc-alpha/2018-05/msg00752.html
> https://sourceware.org/ml/libc-alpha/2018-05/msg00751.html
> 
> 
> On 2018年05月25日 16:51, Wang, Kemi wrote:
>> Hi,  Maintainers
>>     OK for master branch for this updated V3 patch series?
>> --
>> 2.7.4
>>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex
  2018-05-23  9:25 [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex Kemi Wang
                   ` (2 preceding siblings ...)
  2018-05-25  8:52 ` [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex Wang, Kemi
@ 2018-06-07 13:07 ` Szabolcs Nagy
  2018-06-07 14:58   ` Wang, Kemi
  2018-06-07 13:09 ` Florian Weimer
  2018-06-07 13:10 ` Florian Weimer
  5 siblings, 1 reply; 15+ messages in thread
From: Szabolcs Nagy @ 2018-06-07 13:07 UTC (permalink / raw)
  To: Kemi Wang, Adhemerval Zanella, Florian Weimer, Rical Jason,
	Carlos Donell, Glibc alpha
  Cc: nd, Dave Hansen, Tim Chen, Andi Kleen, Ying Huang, Aaron Lu, Lu Aubrey

On 23/05/18 10:22, Kemi Wang wrote:
> This patch does not have any functionality change, we only provide a spin
> count tunes for pthread adaptive spin mutex. The tunable
> glibc.mutex.spin_count tunes can be used by system administrator to squeeze
> system performance according to different hardware capabilities and
> workload characteristics.
> 
> The maximum value of spin count is limited to 30000 to avoid the overflow
> of mutex->__data.__spins variable with the possible type of short in
> pthread_mutex_lock ().
> 
> The default value of spin count is set to 100 with the reference to the
> previous number of times of spinning via trylock. This value would be
> architecture-specific and can be tuned with kinds of benchmarks to fit most
> cases in future.
> 

i'm not against this tunable, but do ppl use
PTHREAD_MUTEX_ADAPTIVE_NP in practice?

it's a non-standard extension, if the normal mutex is not
good enough that should be fixed (e.g. it should do some
spinning on SMP systems if that's useful in general, the
futex syscall overhead is way bigger than a few atomic loads).

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex
  2018-05-23  9:25 [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex Kemi Wang
                   ` (3 preceding siblings ...)
  2018-06-07 13:07 ` Szabolcs Nagy
@ 2018-06-07 13:09 ` Florian Weimer
  2018-06-07 15:00   ` Wang, Kemi
  2018-06-08  8:02   ` kemi
  2018-06-07 13:10 ` Florian Weimer
  5 siblings, 2 replies; 15+ messages in thread
From: Florian Weimer @ 2018-06-07 13:09 UTC (permalink / raw)
  To: Kemi Wang, Adhemerval Zanella, Rical Jason, Carlos Donell, Glibc alpha
  Cc: Dave Hansen, Tim Chen, Andi Kleen, Ying Huang, Aaron Lu, Lu Aubrey

On 05/23/2018 11:22 AM, Kemi Wang wrote:
> +#ifdef SHARED
> +# define INIT_SECTION ".init_array"
> +#else
> +# define INIT_SECTION ".preinit_array"
> +#endif
> +
> +void (*const __pthread_mutex_tunables_init_array []) (int, char **, char **)
> +  __attribute__ ((section (INIT_SECTION), aligned (sizeof (void *)))) =
> +{
> +  &mutex_tunables_init
> +};


Did you try to get rid of this?

Thanks,
Florian

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex
  2018-05-23  9:25 [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex Kemi Wang
                   ` (4 preceding siblings ...)
  2018-06-07 13:09 ` Florian Weimer
@ 2018-06-07 13:10 ` Florian Weimer
  2018-06-07 15:02   ` Wang, Kemi
  5 siblings, 1 reply; 15+ messages in thread
From: Florian Weimer @ 2018-06-07 13:10 UTC (permalink / raw)
  To: Kemi Wang, Adhemerval Zanella, Rical Jason, Carlos Donell, Glibc alpha
  Cc: Dave Hansen, Tim Chen, Andi Kleen, Ying Huang, Aaron Lu, Lu Aubrey

On 05/23/2018 11:22 AM, Kemi Wang wrote:
> +  mutex {
> +    spin_count {
> +      type: INT_32
> +      minval: 0
> +      maxval: 30000
> +      default: 100
> +    }
> +  }

I think the applicable comments from the commit message should be 
included in here.

Thanks,
Florian

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex
  2018-06-07 13:07 ` Szabolcs Nagy
@ 2018-06-07 14:58   ` Wang, Kemi
  0 siblings, 0 replies; 15+ messages in thread
From: Wang, Kemi @ 2018-06-07 14:58 UTC (permalink / raw)
  To: Szabolcs Nagy, Adhemerval Zanella, Florian Weimer, Rical Jason,
	Carlos Donell, Glibc alpha
  Cc: nd, Dave Hansen, Chen, Tim C, Kleen, Andi, Huang, Ying, Lu,
	Aaron, Li, Aubrey

> i'm not against this tunable, but do ppl use PTHREAD_MUTEX_ADAPTIVE_NP in practice?

Absolutely!  Our customers complained to us that  adaptive mutex does not run well on Intel Skylake platform.
I am not a fan of adding code that no one cares. The tuning work is on going with kinds of benchmarks I can get.
If we can find a suitable threshold of spin count to fit most cases in future, I don't mind to remove this tunable. 
But now, it indeed need.

> it's a non-standard extension, if the normal mutex is not good enough that should be fixed (e.g. it should do some spinning on SMP systems if that's useful in general, the futex syscall overhead is way bigger than a few atomic loads).

That is what adaptive mutex does now, spinning for a while before going to block. 
If the adaptive mutex is optimized good enough, maybe we should make it as a default option (e.g. the mutex used in kernel spins for a while before sleep, and it also use mcs lock to queue spinners)

-----Original Message-----
From: libc-alpha-owner@sourceware.org [mailto:libc-alpha-owner@sourceware.org] On Behalf Of Szabolcs Nagy
Sent: Thursday, June 7, 2018 9:07 PM
To: Wang, Kemi <kemi.wang@intel.com>; Adhemerval Zanella <adhemerval.zanella@linaro.org>; Florian Weimer <fweimer@redhat.com>; Rical Jason <rj@2c3t.io>; Carlos Donell <carlos@redhat.com>; Glibc alpha <libc-alpha@sourceware.org>
Cc: nd@arm.com; Dave Hansen <dave.hansen@linux.intel.com>; Chen, Tim C <tim.c.chen@intel.com>; Kleen, Andi <andi.kleen@intel.com>; Huang, Ying <ying.huang@intel.com>; Lu, Aaron <aaron.lu@intel.com>; Li, Aubrey <aubrey.li@intel.com>
Subject: Re: [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex

On 23/05/18 10:22, Kemi Wang wrote:
> This patch does not have any functionality change, we only provide a 
> spin count tunes for pthread adaptive spin mutex. The tunable 
> glibc.mutex.spin_count tunes can be used by system administrator to 
> squeeze system performance according to different hardware 
> capabilities and workload characteristics.
> 
> The maximum value of spin count is limited to 30000 to avoid the 
> overflow of mutex->__data.__spins variable with the possible type of 
> short in pthread_mutex_lock ().
> 
> The default value of spin count is set to 100 with the reference to 
> the previous number of times of spinning via trylock. This value would 
> be architecture-specific and can be tuned with kinds of benchmarks to 
> fit most cases in future.
> 

i'm not against this tunable, but do ppl use PTHREAD_MUTEX_ADAPTIVE_NP in practice?

it's a non-standard extension, if the normal mutex is not good enough that should be fixed (e.g. it should do some spinning on SMP systems if that's useful in general, the futex syscall overhead is way bigger than a few atomic loads).

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex
  2018-06-07 13:09 ` Florian Weimer
@ 2018-06-07 15:00   ` Wang, Kemi
  2018-06-08  8:02   ` kemi
  1 sibling, 0 replies; 15+ messages in thread
From: Wang, Kemi @ 2018-06-07 15:00 UTC (permalink / raw)
  To: Florian Weimer, Adhemerval Zanella, Rical Jason, Carlos Donell,
	Glibc alpha
  Cc: Dave Hansen, Chen, Tim C, Kleen, Andi, Huang, Ying, Lu, Aaron,
	Li, Aubrey

> Did you try to get rid of this?

I didn't try to get rid of this.
Will  try soon and let you know the result

-----Original Message-----
From: Florian Weimer [mailto:fweimer@redhat.com] 
Sent: Thursday, June 7, 2018 9:09 PM
To: Wang, Kemi <kemi.wang@intel.com>; Adhemerval Zanella <adhemerval.zanella@linaro.org>; Rical Jason <rj@2c3t.io>; Carlos Donell <carlos@redhat.com>; Glibc alpha <libc-alpha@sourceware.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>; Chen, Tim C <tim.c.chen@intel.com>; Kleen, Andi <andi.kleen@intel.com>; Huang, Ying <ying.huang@intel.com>; Lu, Aaron <aaron.lu@intel.com>; Li, Aubrey <aubrey.li@intel.com>
Subject: Re: [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex

On 05/23/2018 11:22 AM, Kemi Wang wrote:
> +#ifdef SHARED
> +# define INIT_SECTION ".init_array"
> +#else
> +# define INIT_SECTION ".preinit_array"
> +#endif
> +
> +void (*const __pthread_mutex_tunables_init_array []) (int, char **, 
> +char **)
> +  __attribute__ ((section (INIT_SECTION), aligned (sizeof (void *)))) 
> += {
> +  &mutex_tunables_init
> +};


Did you try to get rid of this?

Thanks,
Florian

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex
  2018-06-07 13:10 ` Florian Weimer
@ 2018-06-07 15:02   ` Wang, Kemi
  0 siblings, 0 replies; 15+ messages in thread
From: Wang, Kemi @ 2018-06-07 15:02 UTC (permalink / raw)
  To: Florian Weimer, Adhemerval Zanella, Rical Jason, Carlos Donell,
	Glibc alpha
  Cc: Dave Hansen, Chen, Tim C, Kleen, Andi, Huang, Ying, Lu, Aaron,
	Li, Aubrey

> I think the applicable comments from the commit message should be included in here.

Sure.

BTW,  I plan to make this optimization opt-in, and enable it for x86 architecture as default.
As you suggested before. 

-----Original Message-----
From: Florian Weimer [mailto:fweimer@redhat.com] 
Sent: Thursday, June 7, 2018 9:10 PM
To: Wang, Kemi <kemi.wang@intel.com>; Adhemerval Zanella <adhemerval.zanella@linaro.org>; Rical Jason <rj@2c3t.io>; Carlos Donell <carlos@redhat.com>; Glibc alpha <libc-alpha@sourceware.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>; Chen, Tim C <tim.c.chen@intel.com>; Kleen, Andi <andi.kleen@intel.com>; Huang, Ying <ying.huang@intel.com>; Lu, Aaron <aaron.lu@intel.com>; Li, Aubrey <aubrey.li@intel.com>
Subject: Re: [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex

On 05/23/2018 11:22 AM, Kemi Wang wrote:
> +  mutex {
> +    spin_count {
> +      type: INT_32
> +      minval: 0
> +      maxval: 30000
> +      default: 100
> +    }
> +  }

I think the applicable comments from the commit message should be included in here.

Thanks,
Florian

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex
  2018-06-07 13:09 ` Florian Weimer
  2018-06-07 15:00   ` Wang, Kemi
@ 2018-06-08  8:02   ` kemi
  2018-06-08 14:54     ` Florian Weimer
  1 sibling, 1 reply; 15+ messages in thread
From: kemi @ 2018-06-08  8:02 UTC (permalink / raw)
  To: Florian Weimer, Adhemerval Zanella, Rical Jason, Carlos Donell,
	Glibc alpha
  Cc: Dave Hansen, Tim Chen, Andi Kleen, Ying Huang, Aaron Lu, Lu Aubrey



On 2018年06月07日 21:09, Florian Weimer wrote:
> On 05/23/2018 11:22 AM, Kemi Wang wrote:
>> +#ifdef SHARED
>> +# define INIT_SECTION ".init_array"
>> +#else
>> +# define INIT_SECTION ".preinit_array"
>> +#endif
>> +
>> +void (*const __pthread_mutex_tunables_init_array []) (int, char **, char **)
>> +  __attribute__ ((section (INIT_SECTION), aligned (sizeof (void *)))) =
>> +{
>> +  &mutex_tunables_init
>> +};
> 
> 
> Did you try to get rid of this?
> 

I remembered you suggested me to put this initialization as part of pthread initialization.

https://sourceware.org/ml/libc-alpha/2018-05/msg00289.html

But, mutex_tunables_init need three parameters such as argc, argv, and env from libc_start_main, 
while the pthread initialization function __pthread_initialize_minimal does not include any parameter.
I don't think we can simply put it as part of overall thread initialization unless we change the API 
of __pthread_initialize_minimal.

Maybe you have better idea.

> Thanks,
> Florian

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex
  2018-06-08  8:02   ` kemi
@ 2018-06-08 14:54     ` Florian Weimer
  2018-06-14  1:39       ` kemi
  0 siblings, 1 reply; 15+ messages in thread
From: Florian Weimer @ 2018-06-08 14:54 UTC (permalink / raw)
  To: kemi, Adhemerval Zanella, Rical Jason, Carlos Donell, Glibc alpha
  Cc: Dave Hansen, Tim Chen, Andi Kleen, Ying Huang, Aaron Lu, Lu Aubrey

On 06/08/2018 09:57 AM, kemi wrote:
> 
> 
> On 2018年06月07日 21:09, Florian Weimer wrote:
>> On 05/23/2018 11:22 AM, Kemi Wang wrote:
>>> +#ifdef SHARED
>>> +# define INIT_SECTION ".init_array"
>>> +#else
>>> +# define INIT_SECTION ".preinit_array"
>>> +#endif
>>> +
>>> +void (*const __pthread_mutex_tunables_init_array []) (int, char **, char **)
>>> +  __attribute__ ((section (INIT_SECTION), aligned (sizeof (void *)))) =
>>> +{
>>> +  &mutex_tunables_init
>>> +};
>>
>>
>> Did you try to get rid of this?
>>
> 
> I remembered you suggested me to put this initialization as part of pthread initialization.
> 
> https://sourceware.org/ml/libc-alpha/2018-05/msg00289.html
> 
> But, mutex_tunables_init need three parameters such as argc, argv, and env from libc_start_main,
> while the pthread initialization function __pthread_initialize_minimal does not include any parameter.
> I don't think we can simply put it as part of overall thread initialization unless we change the API
> of __pthread_initialize_minimal.
> 
> Maybe you have better idea.

I will have a lok at it next week.

We need to do this as part of the main initialization of the library, so 
that the tunable is stripped from the environment if necessary.

Thanks,
Florian

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex
  2018-06-08 14:54     ` Florian Weimer
@ 2018-06-14  1:39       ` kemi
  0 siblings, 0 replies; 15+ messages in thread
From: kemi @ 2018-06-14  1:39 UTC (permalink / raw)
  To: Florian Weimer, Adhemerval Zanella, Rical Jason, Carlos Donell,
	Glibc alpha
  Cc: Dave Hansen, Tim Chen, Andi Kleen, Ying Huang, Aaron Lu, Lu Aubrey



On 2018年06月08日 22:54, Florian Weimer wrote:
> On 06/08/2018 09:57 AM, kemi wrote:
>>
>>
>> On 2018年06月07日 21:09, Florian Weimer wrote:
>>> On 05/23/2018 11:22 AM, Kemi Wang wrote:
>>>> +#ifdef SHARED
>>>> +# define INIT_SECTION ".init_array"
>>>> +#else
>>>> +# define INIT_SECTION ".preinit_array"
>>>> +#endif
>>>> +
>>>> +void (*const __pthread_mutex_tunables_init_array []) (int, char **, char **)
>>>> +  __attribute__ ((section (INIT_SECTION), aligned (sizeof (void *)))) =
>>>> +{
>>>> +  &mutex_tunables_init
>>>> +};
>>>
>>>
>>> Did you try to get rid of this?
>>>
>>
>> I remembered you suggested me to put this initialization as part of pthread initialization.
>>
>> https://sourceware.org/ml/libc-alpha/2018-05/msg00289.html
>>
>> But, mutex_tunables_init need three parameters such as argc, argv, and env from libc_start_main,
>> while the pthread initialization function __pthread_initialize_minimal does not include any parameter.
>> I don't think we can simply put it as part of overall thread initialization unless we change the API
>> of __pthread_initialize_minimal.
>>
>> Maybe you have better idea.
> 
> I will have a lok at it next week.
> 
> We need to do this as part of the main initialization of the library, so that the tunable is stripped from the environment if necessary.
> 

Do you have any idea for this issue? How about doing it as part of tunable initialization. (__tunables_init).
And could you explain a bit more on the value of this change. Thanks.

> Thanks,
> Florian

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-06-14  1:39 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-23  9:25 [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex Kemi Wang
2018-05-23  9:25 ` [PATCH v3 2/3] benchtests: Add pthread adaptive spin mutex microbenchmark Kemi Wang
2018-05-23  9:25 ` [PATCH v3 3/3] Mutex: Replace trylock by read only while spinning Kemi Wang
2018-05-25  8:52 ` [PATCH v3 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex Wang, Kemi
2018-05-30  3:03   ` kemi
2018-06-07  4:57     ` kemi
2018-06-07 13:07 ` Szabolcs Nagy
2018-06-07 14:58   ` Wang, Kemi
2018-06-07 13:09 ` Florian Weimer
2018-06-07 15:00   ` Wang, Kemi
2018-06-08  8:02   ` kemi
2018-06-08 14:54     ` Florian Weimer
2018-06-14  1:39       ` kemi
2018-06-07 13:10 ` Florian Weimer
2018-06-07 15:02   ` Wang, Kemi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).