[PATCH] Improve performance of libc locks

public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed

* [PATCH] Improve performance of libc locks
@ 2022-08-11 16:22 Wilco Dijkstra
  2022-08-15 14:07 ` Carlos O'Donell
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Wilco Dijkstra @ 2022-08-11 16:22 UTC (permalink / raw)
  To: 'GNU C Library'

Improve performance of libc locks by adding a fast path for the
single-threaded case.

On Neoverse V1, a loop using rand() improved 3.6 times. Multithreaded
performance is unchanged.

Passes regress on AArch64, OK for commit?

---

diff --git a/sysdeps/nptl/libc-lockP.h b/sysdeps/nptl/libc-lockP.h
index d3a6837fd212f3f5dfd80f46d0e9ce365042ae0c..ccdb11fee6f14069d0b936be93d0f0fa6d8bc30b 100644
--- a/sysdeps/nptl/libc-lockP.h
+++ b/sysdeps/nptl/libc-lockP.h
@@ -108,7 +108,14 @@ _Static_assert (LLL_LOCK_INITIALIZER == 0, "LLL_LOCK_INITIALIZER != 0");
 #define __libc_rwlock_fini(NAME) ((void) 0)
 
 /* Lock the named lock variable.  */
-#define __libc_lock_lock(NAME) ({ lll_lock (NAME, LLL_PRIVATE); 0; })
+#define __libc_lock_lock(NAME)						\
+ ({									\
+    if (SINGLE_THREAD_P)						\
+      (NAME) = LLL_LOCK_INITIALIZER_LOCKED;				\
+    else								\
+      lll_lock (NAME, LLL_PRIVATE);					\
+    0;									\
+  })
 #define __libc_rwlock_rdlock(NAME) __pthread_rwlock_rdlock (&(NAME))
 #define __libc_rwlock_wrlock(NAME) __pthread_rwlock_wrlock (&(NAME))
 
@@ -116,7 +123,14 @@ _Static_assert (LLL_LOCK_INITIALIZER == 0, "LLL_LOCK_INITIALIZER != 0");
 #define __libc_lock_trylock(NAME) lll_trylock (NAME)
 
 /* Unlock the named lock variable.  */
-#define __libc_lock_unlock(NAME) lll_unlock (NAME, LLL_PRIVATE)
+#define __libc_lock_unlock(NAME)					\
+ ({									\
+    if (SINGLE_THREAD_P)						\
+      (NAME) = LLL_LOCK_INITIALIZER;					\
+    else								\
+      lll_unlock (NAME, LLL_PRIVATE);					\
+    0;									\
+ })
 #define __libc_rwlock_unlock(NAME) __pthread_rwlock_unlock (&(NAME))
 
 #if IS_IN (rtld)


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Improve performance of libc locks
  2022-08-11 16:22 [PATCH] Improve performance of libc locks Wilco Dijkstra
@ 2022-08-15 14:07 ` Carlos O'Donell
  2022-08-15 17:35   ` Wilco Dijkstra
  2022-11-15 20:17 ` Cristian Rodríguez
  2023-11-23 18:17 ` Florian Weimer
  2 siblings, 1 reply; 15+ messages in thread
From: Carlos O'Donell @ 2022-08-15 14:07 UTC (permalink / raw)
  To: Wilco Dijkstra, 'GNU C Library'

On 8/11/22 12:22, Wilco Dijkstra via Libc-alpha wrote:
> Improve performance of libc locks by adding a fast path for the
> single-threaded case.
> 
> On Neoverse V1, a loop using rand() improved 3.6 times. Multithreaded
> performance is unchanged.
> 
> Passes regress on AArch64, OK for commit?

This impacts all architectures.

Are you able to run the microbenchmarks to show that this improves one of them?

If you can, then we can at least ask the other machine maintainers to test the
patch shows gains there too. Conceptually I don't see why it wouldn't improve
the performance of all architectures, but having a baseline at the point of the
patch is good for recording the performance for future discussions.

If we don't have a benchmark that shows this specific base of ST vs MT and
internal __libc_lock_lock-locks then we should add one. Improving the internal
locking for our algorithms is always going to be a point of interest for IHVs.

Thanks.
 
> ---
> 
> diff --git a/sysdeps/nptl/libc-lockP.h b/sysdeps/nptl/libc-lockP.h
> index d3a6837fd212f3f5dfd80f46d0e9ce365042ae0c..ccdb11fee6f14069d0b936be93d0f0fa6d8bc30b 100644
> --- a/sysdeps/nptl/libc-lockP.h
> +++ b/sysdeps/nptl/libc-lockP.h
> @@ -108,7 +108,14 @@ _Static_assert (LLL_LOCK_INITIALIZER == 0, "LLL_LOCK_INITIALIZER != 0");
>  #define __libc_rwlock_fini(NAME) ((void) 0)
>  
>  /* Lock the named lock variable.  */
> -#define __libc_lock_lock(NAME) ({ lll_lock (NAME, LLL_PRIVATE); 0; })
> +#define __libc_lock_lock(NAME)						\
> + ({									\
> +    if (SINGLE_THREAD_P)						\
> +      (NAME) = LLL_LOCK_INITIALIZER_LOCKED;				\
> +    else								\
> +      lll_lock (NAME, LLL_PRIVATE);					\
> +    0;									\
> +  })
>  #define __libc_rwlock_rdlock(NAME) __pthread_rwlock_rdlock (&(NAME))
>  #define __libc_rwlock_wrlock(NAME) __pthread_rwlock_wrlock (&(NAME))
>  
> @@ -116,7 +123,14 @@ _Static_assert (LLL_LOCK_INITIALIZER == 0, "LLL_LOCK_INITIALIZER != 0");
>  #define __libc_lock_trylock(NAME) lll_trylock (NAME)
>  
>  /* Unlock the named lock variable.  */
> -#define __libc_lock_unlock(NAME) lll_unlock (NAME, LLL_PRIVATE)
> +#define __libc_lock_unlock(NAME)					\
> + ({									\
> +    if (SINGLE_THREAD_P)						\
> +      (NAME) = LLL_LOCK_INITIALIZER;					\
> +    else								\
> +      lll_unlock (NAME, LLL_PRIVATE);					\
> +    0;									\
> + })
>  #define __libc_rwlock_unlock(NAME) __pthread_rwlock_unlock (&(NAME))
>  
>  #if IS_IN (rtld)
> 
> 
> 


-- 
Cheers,
Carlos.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Improve performance of libc locks
  2022-08-15 14:07 ` Carlos O'Donell
@ 2022-08-15 17:35   ` Wilco Dijkstra
  2022-08-16  7:26     ` Noah Goldstein
  0 siblings, 1 reply; 15+ messages in thread
From: Wilco Dijkstra @ 2022-08-15 17:35 UTC (permalink / raw)
  To: Carlos O'Donell, 'GNU C Library'

 Hi Carlos,

> This impacts all architectures.

That was the goal indeed - we should add single-threaded optimizations in a
generic way.

> If we don't have a benchmark that shows this specific base of ST vs MT and
> internal __libc_lock_lock-locks then we should add one. Improving the internal
> locking for our algorithms is always going to be a point of interest for IHVs.

I can easily wrap my rand() microbench in json and add it to the benchtests.
I think it would be harder to do more tests on internal locks/headers since they
are not easily usable from benchtest infrastructure (just including libc-lock.h
results in lots of errors...).

Cheers,
Wilco

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Improve performance of libc locks
  2022-08-15 17:35   ` Wilco Dijkstra
@ 2022-08-16  7:26     ` Noah Goldstein
  0 siblings, 0 replies; 15+ messages in thread
From: Noah Goldstein @ 2022-08-16  7:26 UTC (permalink / raw)
  To: Wilco Dijkstra; +Cc: Carlos O'Donell, GNU C Library

On Tue, Aug 16, 2022 at 1:35 AM Wilco Dijkstra via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
>  Hi Carlos,
>
> > This impacts all architectures.
>
> That was the goal indeed - we should add single-threaded optimizations in a
> generic way.
>
> > If we don't have a benchmark that shows this specific base of ST vs MT and
> > internal __libc_lock_lock-locks then we should add one. Improving the internal
> > locking for our algorithms is always going to be a point of interest for IHVs.
>
> I can easily wrap my rand() microbench in json and add it to the benchtests.

Think that would be good so we can easily measure on other architectures.

> I think it would be harder to do more tests on internal locks/headers since they
> are not easily usable from benchtest infrastructure (just including libc-lock.h
> results in lots of errors...).
>
> Cheers,
> Wilco

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Improve performance of libc locks
  2022-08-11 16:22 [PATCH] Improve performance of libc locks Wilco Dijkstra
  2022-08-15 14:07 ` Carlos O'Donell
@ 2022-11-15 20:17 ` Cristian Rodríguez
  2022-12-09 14:10   ` Wilco Dijkstra
  2023-11-23 18:17 ` Florian Weimer
  2 siblings, 1 reply; 15+ messages in thread
From: Cristian Rodríguez @ 2022-11-15 20:17 UTC (permalink / raw)
  To: Wilco Dijkstra; +Cc: GNU C Library

On Thu, Aug 11, 2022 at 12:23 PM Wilco Dijkstra via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> Improve performance of libc locks by adding a fast path for the
> single-threaded case.
>
> On Neoverse V1, a loop using rand() improved 3.6 times. Multithreaded
> performance is unchanged.
>
> Passes regress on AArch64, OK for commit?
>
> ---
>

Ping ? I saw the stdio one was committed but what happened with this one ?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Improve performance of libc locks
  2022-11-15 20:17 ` Cristian Rodríguez
@ 2022-12-09 14:10   ` Wilco Dijkstra
  2023-11-23 17:16     ` Cristian Rodríguez
  0 siblings, 1 reply; 15+ messages in thread
From: Wilco Dijkstra @ 2022-12-09 14:10 UTC (permalink / raw)
  To: Cristian Rodríguez; +Cc: GNU C Library

Hi Cristian,

>> Improve performance of libc locks by adding a fast path for the
>> single-threaded case.
>>
>> On Neoverse V1, a loop using rand() improved 3.6 times. Multithreaded
>> performance is unchanged.
>>
>> Passes regress on AArch64, OK for commit?
>
> Ping ? I saw the stdio one was committed but what happened with this one ?

It is waiting on the locking benchmark being accepted. I've pinged that
(https://sourceware.org/pipermail/libc-alpha/2022-December/143944.html)
since it would be great to get all this in the next GLIBC.

Cheers,
Wilco

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Improve performance of libc locks
  2022-12-09 14:10   ` Wilco Dijkstra
@ 2023-11-23 17:16     ` Cristian Rodríguez
  0 siblings, 0 replies; 15+ messages in thread
From: Cristian Rodríguez @ 2023-11-23 17:16 UTC (permalink / raw)
  To: Wilco Dijkstra; +Cc: GNU C Library

[-- Attachment #1: Type: text/plain, Size: 731 bytes --]

On Fri, Dec 9, 2022 at 11:10 AM Wilco Dijkstra <Wilco.Dijkstra@arm.com>
wrote:

> Hi Cristian,
>
> >> Improve performance of libc locks by adding a fast path for the
> >> single-threaded case.
> >>
> >> On Neoverse V1, a loop using rand() improved 3.6 times. Multithreaded
> >> performance is unchanged.
> >>
> >> Passes regress on AArch64, OK for commit?
> >
> > Ping ? I saw the stdio one was committed but what happened with this one
> ?
>
> It is waiting on the locking benchmark being accepted. I've pinged that
> (https://sourceware.org/pipermail/libc-alpha/2022-December/143944.html)
> since it would be great to get all this in the next GLIBC.
>
> Cheers,
> Wilco


Ping ? did this move forward?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Improve performance of libc locks
  2022-08-11 16:22 [PATCH] Improve performance of libc locks Wilco Dijkstra
  2022-08-15 14:07 ` Carlos O'Donell
  2022-11-15 20:17 ` Cristian Rodríguez
@ 2023-11-23 18:17 ` Florian Weimer
  2023-11-24 13:47   ` Wilco Dijkstra
  2 siblings, 1 reply; 15+ messages in thread
From: Florian Weimer @ 2023-11-23 18:17 UTC (permalink / raw)
  To: Wilco Dijkstra via Libc-alpha; +Cc: Wilco Dijkstra

* Wilco Dijkstra via Libc-alpha:

> Improve performance of libc locks by adding a fast path for the
> single-threaded case.
>
> On Neoverse V1, a loop using rand() improved 3.6 times. Multithreaded
> performance is unchanged.
>
> Passes regress on AArch64, OK for commit?
>
> ---
>
> diff --git a/sysdeps/nptl/libc-lockP.h b/sysdeps/nptl/libc-lockP.h
> index d3a6837fd212f3f5dfd80f46d0e9ce365042ae0c..ccdb11fee6f14069d0b936be93d0f0fa6d8bc30b 100644
> --- a/sysdeps/nptl/libc-lockP.h
> +++ b/sysdeps/nptl/libc-lockP.h
> @@ -108,7 +108,14 @@ _Static_assert (LLL_LOCK_INITIALIZER == 0, "LLL_LOCK_INITIALIZER != 0");
>  #define __libc_rwlock_fini(NAME) ((void) 0)
>  
>  /* Lock the named lock variable.  */
> -#define __libc_lock_lock(NAME) ({ lll_lock (NAME, LLL_PRIVATE); 0; })
> +#define __libc_lock_lock(NAME)						\
> + ({									\
> +    if (SINGLE_THREAD_P)						\
> +      (NAME) = LLL_LOCK_INITIALIZER_LOCKED;				\
> +    else								\
> +      lll_lock (NAME, LLL_PRIVATE);					\
> +    0;									\
> +  })

We already have SINGLE_THREAD_P checks around locking in several places.
This makes the __libc_lock_lock check redudant in those cases.  I
believe this was done deliberately because in many cases, we can also to
skip cancellation handling at the same time.

The rand performance issue could be addressed with a similar localized
change.  I believe that would be far less controversial.

Thanks,
Florian


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Improve performance of libc locks
  2023-11-23 18:17 ` Florian Weimer
@ 2023-11-24 13:47   ` Wilco Dijkstra
  2023-11-24 16:29     ` Carlos O'Donell
  0 siblings, 1 reply; 15+ messages in thread
From: Wilco Dijkstra @ 2023-11-24 13:47 UTC (permalink / raw)
  To: Florian Weimer, Wilco Dijkstra via Libc-alpha

Hi Florian, 

> We already have SINGLE_THREAD_P checks around locking in several places.
> This makes the __libc_lock_lock check redudant in those cases.  I
> believe this was done deliberately because in many cases, we can also to
> skip cancellation handling at the same time.

Yes, this is best for the most performance critical cases. However there are lots of
functions that use locks and many will improve with a single thread check at a 
higher level. You're right that would add extra checks in cases that are not
performance critical.

Maybe a solution would be to introduce __libc_lock_fast() or similar? That way one
can improve performance critical code easily without having to add special fast
paths. Today it would use SINGLE_THREAD_P, but it could potentially use RSEQ in
the future.

> The rand performance issue could be addressed with a similar localized
> change.  I believe that would be far less controversial.

I can send a patch that adds fast paths to rand() if that helps unblocking this.

Cheers,
Wilco

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Improve performance of libc locks
  2023-11-24 13:47   ` Wilco Dijkstra
@ 2023-11-24 16:29     ` Carlos O'Donell
  2024-06-26 15:45       ` [PATCH v2] " Wilco Dijkstra
  0 siblings, 1 reply; 15+ messages in thread
From: Carlos O'Donell @ 2023-11-24 16:29 UTC (permalink / raw)
  To: Wilco Dijkstra, Florian Weimer, Wilco Dijkstra via Libc-alpha

On 11/24/23 08:47, Wilco Dijkstra wrote:
> Hi Florian, 
> 
>> We already have SINGLE_THREAD_P checks around locking in several places.
>> This makes the __libc_lock_lock check redudant in those cases.  I
>> believe this was done deliberately because in many cases, we can also to
>> skip cancellation handling at the same time.
> 
> Yes, this is best for the most performance critical cases. However there are lots of
> functions that use locks and many will improve with a single thread check at a 
> higher level. You're right that would add extra checks in cases that are not
> performance critical.

Right.

> Maybe a solution would be to introduce __libc_lock_fast() or similar? That way one
> can improve performance critical code easily without having to add special fast
> paths. Today it would use SINGLE_THREAD_P, but it could potentially use RSEQ in
> the future.

I would prefer a solution like this because you can actively audit, and migrate
the callers rather than adding a hidden (to the developer) change in the macro.

>> The rand performance issue could be addressed with a similar localized
>> change.  I believe that would be far less controversial.
> 
> I can send a patch that adds fast paths to rand() if that helps unblocking this.

I think it would. Add the fast path to rand(), and a microbenchmark to show that this 
is good for performance on your machine, that way we don't regress this change
in the future when we work on rand(). I'm amenable to not having a microbenchmark, but
every time we talk about performance adding a little bit to the corpus helps ensure
we don't loose track of the gains.

-- 
Cheers,
Carlos.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] Improve performance of libc locks
  2023-11-24 16:29     ` Carlos O'Donell
@ 2024-06-26 15:45       ` Wilco Dijkstra
  2024-06-26 17:42         ` Florian Weimer
  2024-06-27  3:54         ` Noah Goldstein
  0 siblings, 2 replies; 15+ messages in thread
From: Wilco Dijkstra @ 2024-06-26 15:45 UTC (permalink / raw)
  To: Carlos O'Donell, Florian Weimer; +Cc: Wilco Dijkstra via Libc-alpha


Here is v2 which adds new locking primitives rather than changing the
existing ones. This allows finegrained changing of existing uses of locks
as discussed in https://sourceware.org/pipermail/libc-alpha/2023-November/152945.html


Add a new __libc_fast_(un)lock with fast path for the single-threaded case.
This can be used to replace of existing __libc_lock_(un)lock that are performance
critical.

On Neoverse V1, a loop using rand() modified to use the new locks improved
3.6 times.  Multithreaded performance is unchanged.

---

diff --git a/sysdeps/generic/libc-lock.h b/sysdeps/generic/libc-lock.h
index ea6291291009ae21e71a4bb3ca805e422168739c..06b2501d6ba46c662fdf9866e440ce1fc9c0c6da 100644
--- a/sysdeps/generic/libc-lock.h
+++ b/sysdeps/generic/libc-lock.h
@@ -61,6 +61,7 @@
 
 /* Lock the named lock variable.  */
 #define __libc_lock_lock(NAME)
+#define __libc_fast_lock(NAME)
 #define __libc_rwlock_rdlock(NAME)
 #define __libc_rwlock_wrlock(NAME)
 
@@ -78,6 +79,7 @@
 
 /* Unlock the named lock variable.  */
 #define __libc_lock_unlock(NAME)
+#define __libc_fast_unlock(NAME)
 #define __libc_rwlock_unlock(NAME)
 
 /* Unlock the recursive named lock variable.  */
diff --git a/sysdeps/mach/libc-lock.h b/sysdeps/mach/libc-lock.h
index 63427f7767f403b38c97e16b541984816fd2362e..9d5f4ec6bb6a7d8d75f37ab9faf6af1a14ab7cd7 100644
--- a/sysdeps/mach/libc-lock.h
+++ b/sysdeps/mach/libc-lock.h
@@ -71,6 +71,7 @@ typedef struct __libc_lock_recursive_opaque__ __libc_lock_recursive_t;
 /* Lock the named lock variable.  */
 #define __libc_lock_lock(NAME)   \
   ({ lll_lock ((NAME), LLL_PRIVATE); 0; })
+#define __libc_fast_lock(NAME) __libc_lock_lock(NAME)
 
 /* Lock the named lock variable.  */
 #define __libc_lock_trylock(NAME) lll_trylock (NAME)
@@ -78,6 +79,7 @@ typedef struct __libc_lock_recursive_opaque__ __libc_lock_recursive_t;
 /* Unlock the named lock variable.  */
 #define __libc_lock_unlock(NAME)   \
   ({ lll_unlock ((NAME), LLL_PRIVATE); 0; })
+#define __libc_fast_unlock(NAME) __libc_lock_unlock(NAME)
 
 #define __libc_lock_define_recursive(CLASS,NAME) \
   CLASS __libc_lock_recursive_t NAME;
diff --git a/sysdeps/nptl/libc-lockP.h b/sysdeps/nptl/libc-lockP.h
index 348999ca23f330a952068496a4442646903d1edc..eb8a7a4e863ba2a2a012d241c84920e82f37bf3d 100644
--- a/sysdeps/nptl/libc-lockP.h
+++ b/sysdeps/nptl/libc-lockP.h
@@ -110,6 +110,14 @@ _Static_assert (LLL_LOCK_INITIALIZER == 0, "LLL_LOCK_INITIALIZER != 0");
 
 /* Lock the named lock variable.  */
 #define __libc_lock_lock(NAME) ({ lll_lock (NAME, LLL_PRIVATE); 0; })
+#define __libc_fast_lock(NAME)						\
+ ({									\
+    if (SINGLE_THREAD_P)						\
+      (NAME) = LLL_LOCK_INITIALIZER_LOCKED;				\
+    else								\
+      lll_lock (NAME, LLL_PRIVATE);					\
+    0;									\
+  })
 #define __libc_rwlock_rdlock(NAME) __pthread_rwlock_rdlock (&(NAME))
 #define __libc_rwlock_wrlock(NAME) __pthread_rwlock_wrlock (&(NAME))
 
@@ -118,6 +126,14 @@ _Static_assert (LLL_LOCK_INITIALIZER == 0, "LLL_LOCK_INITIALIZER != 0");
 
 /* Unlock the named lock variable.  */
 #define __libc_lock_unlock(NAME) lll_unlock (NAME, LLL_PRIVATE)
+#define __libc_fast_unlock(NAME)					\
+ ({									\
+    if (SINGLE_THREAD_P)						\
+      (NAME) = LLL_LOCK_INITIALIZER;					\
+    else								\
+      lll_unlock (NAME, LLL_PRIVATE);					\
+    0;									\
+ })
 #define __libc_rwlock_unlock(NAME) __pthread_rwlock_unlock (&(NAME))
 
 #if IS_IN (rtld)





^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] Improve performance of libc locks
  2024-06-26 15:45       ` [PATCH v2] " Wilco Dijkstra
@ 2024-06-26 17:42         ` Florian Weimer
  2024-06-27  3:52           ` Noah Goldstein
  2024-06-27  3:54         ` Noah Goldstein
  1 sibling, 1 reply; 15+ messages in thread
From: Florian Weimer @ 2024-06-26 17:42 UTC (permalink / raw)
  To: Wilco Dijkstra; +Cc: Carlos O'Donell, Wilco Dijkstra via Libc-alpha

* Wilco Dijkstra:

> diff --git a/sysdeps/mach/libc-lock.h b/sysdeps/mach/libc-lock.h
> index 63427f7767f403b38c97e16b541984816fd2362e..9d5f4ec6bb6a7d8d75f37ab9faf6af1a14ab7cd7 100644
> --- a/sysdeps/mach/libc-lock.h
> +++ b/sysdeps/mach/libc-lock.h
> @@ -71,6 +71,7 @@ typedef struct __libc_lock_recursive_opaque__ __libc_lock_recursive_t;
>  /* Lock the named lock variable.  */
>  #define __libc_lock_lock(NAME)   \
>    ({ lll_lock ((NAME), LLL_PRIVATE); 0; })
> +#define __libc_fast_lock(NAME) __libc_lock_lock(NAME)

I find the missing & in the existing __libc_lock_lock macros quite
hideous, and I suspect I'm not the only one.  I wonder what's more
important: making incremental progress in this area, or consistency with
the existing locking macros?

Thanks,
Florian


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] Improve performance of libc locks
  2024-06-26 17:42         ` Florian Weimer
@ 2024-06-27  3:52           ` Noah Goldstein
  2024-06-27 15:47             ` Wilco Dijkstra
  0 siblings, 1 reply; 15+ messages in thread
From: Noah Goldstein @ 2024-06-27  3:52 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Wilco Dijkstra, Carlos O'Donell, Wilco Dijkstra via Libc-alpha

On Thu, Jun 27, 2024 at 1:43 AM Florian Weimer <fweimer@redhat.com> wrote:
>
> * Wilco Dijkstra:
>
> > diff --git a/sysdeps/mach/libc-lock.h b/sysdeps/mach/libc-lock.h
> > index 63427f7767f403b38c97e16b541984816fd2362e..9d5f4ec6bb6a7d8d75f37ab9faf6af1a14ab7cd7 100644
> > --- a/sysdeps/mach/libc-lock.h
> > +++ b/sysdeps/mach/libc-lock.h
> > @@ -71,6 +71,7 @@ typedef struct __libc_lock_recursive_opaque__ __libc_lock_recursive_t;
> >  /* Lock the named lock variable.  */
> >  #define __libc_lock_lock(NAME)   \
> >    ({ lll_lock ((NAME), LLL_PRIVATE); 0; })
> > +#define __libc_fast_lock(NAME) __libc_lock_lock(NAME)
>
> I find the missing & in the existing __libc_lock_lock macros quite
> hideous, and I suspect I'm not the only one.  I wonder what's more
> important: making incremental progress in this area, or consistency with
> the existing locking macros?

If the plan is to update all(ish) uses of `__libc_lock_*` places with the new
variant, I was be in favor of dropping the &.

If these will only be used in a select few places, I think a different API would
cause more confusion than its worth.

>
> Thanks,
> Florian
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] Improve performance of libc locks
  2024-06-26 15:45       ` [PATCH v2] " Wilco Dijkstra
  2024-06-26 17:42         ` Florian Weimer
@ 2024-06-27  3:54         ` Noah Goldstein
  1 sibling, 0 replies; 15+ messages in thread
From: Noah Goldstein @ 2024-06-27  3:54 UTC (permalink / raw)
  To: Wilco Dijkstra
  Cc: Carlos O'Donell, Florian Weimer, Wilco Dijkstra via Libc-alpha

On Wed, Jun 26, 2024 at 11:46 PM Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
>
>
> Here is v2 which adds new locking primitives rather than changing the
> existing ones. This allows finegrained changing of existing uses of locks
> as discussed in https://sourceware.org/pipermail/libc-alpha/2023-November/152945.html
>
>
> Add a new __libc_fast_(un)lock with fast path for the single-threaded case.
> This can be used to replace of existing __libc_lock_(un)lock that are performance
> critical.
>
> On Neoverse V1, a loop using rand() modified to use the new locks improved
> 3.6 times.  Multithreaded performance is unchanged.
>
> ---
>
> diff --git a/sysdeps/generic/libc-lock.h b/sysdeps/generic/libc-lock.h
> index ea6291291009ae21e71a4bb3ca805e422168739c..06b2501d6ba46c662fdf9866e440ce1fc9c0c6da 100644
> --- a/sysdeps/generic/libc-lock.h
> +++ b/sysdeps/generic/libc-lock.h
> @@ -61,6 +61,7 @@
>
>  /* Lock the named lock variable.  */
>  #define __libc_lock_lock(NAME)
> +#define __libc_fast_lock(NAME)
>  #define __libc_rwlock_rdlock(NAME)
>  #define __libc_rwlock_wrlock(NAME)
>
> @@ -78,6 +79,7 @@
>
>  /* Unlock the named lock variable.  */
>  #define __libc_lock_unlock(NAME)
> +#define __libc_fast_unlock(NAME)
>  #define __libc_rwlock_unlock(NAME)

Nit: I would change "fast" -> "single_thread_opt" given that there are some
caveats to using them.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] Improve performance of libc locks
  2024-06-27  3:52           ` Noah Goldstein
@ 2024-06-27 15:47             ` Wilco Dijkstra
  0 siblings, 0 replies; 15+ messages in thread
From: Wilco Dijkstra @ 2024-06-27 15:47 UTC (permalink / raw)
  To: Noah Goldstein, Florian Weimer
  Cc: Carlos O'Donell, Wilco Dijkstra via Libc-alpha

Hi,

>> I find the missing & in the existing __libc_lock_lock macros quite
>> hideous, and I suspect I'm not the only one.  I wonder what's more
>> important: making incremental progress in this area, or consistency with
>> the existing locking macros?
>
> If the plan is to update all(ish) uses of `__libc_lock_*` places with the new
> variant, I was be in favor of dropping the &.
> 
> If these will only be used in a select few places, I think a different API would
> cause more confusion than its worth.

The goal is to use the new fast variant only in places where there is a benefit
(since there was resistance in doing it everywhere like we did for file IO).

I agree it would make more sense to have a pointer argument for locks.
I think it would be best to first define a new set of properly named lock primitives
that don't repeat 'lock' 2 or 3 times or have unused arguments. They would be
inline functions to get type checking and avoid multiple indirections via tons of
complex macros. Then we can replace existing uses one subdir at a time.

Cheers,
Wilco

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2024-06-27 15:48 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-11 16:22 [PATCH] Improve performance of libc locks Wilco Dijkstra
2022-08-15 14:07 ` Carlos O'Donell
2022-08-15 17:35   ` Wilco Dijkstra
2022-08-16  7:26     ` Noah Goldstein
2022-11-15 20:17 ` Cristian Rodríguez
2022-12-09 14:10   ` Wilco Dijkstra
2023-11-23 17:16     ` Cristian Rodríguez
2023-11-23 18:17 ` Florian Weimer
2023-11-24 13:47   ` Wilco Dijkstra
2023-11-24 16:29     ` Carlos O'Donell
2024-06-26 15:45       ` [PATCH v2] " Wilco Dijkstra
2024-06-26 17:42         ` Florian Weimer
2024-06-27  3:52           ` Noah Goldstein
2024-06-27 15:47             ` Wilco Dijkstra
2024-06-27  3:54         ` Noah Goldstein

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).