From mboxrd@z Thu Jan 1 00:00:00 1970
References: <20230419011722.1154501-1-goldstein.w.n@gmail.com>
From: Noah Goldstein
Date: Wed, 19 Apr 2023 11:57:15 -0500
Subject: Re: [PATCH v1] nptl: Add single-threaded optimization to ADAPTIVE_NP mutex
To: "H.J. Lu"
Cc: libc-alpha@sourceware.org, carlos@systemhalted.org
Content-Type: text/plain; charset="UTF-8"

On Wed, Apr 19, 2023 at 10:42 AM H.J. Lu wrote:
>
> On Tue, Apr 18, 2023 at 6:17 PM Noah Goldstein wrote:
> >
> > ADAPTIVE_NP mutex is generally the most performant mutex option and
> > is commonly used so it seems preferable for it to also benefit from
> > the optimization.
>
> Adaptive mutex works better than normal mutex when futex syscall is the main
> bottleneck. For single threaded applications or machines with fewer cores,
> adaptive mutex may not improve performance. If it is the case, shouldn't we
> focus on improving performance when futex syscall is the bottleneck?

I think adaptive mutex is more general than just avoiding futex syscalls.
It seems to be the most general purpose lock (adaptive between spin/futex
based on its usage) and is being pitched as the true default for things
like std::mutex.
The single-threaded optimization seems to fit with that mentality.
>
> > make check passes on linux-x86_64
> > ---
> >  nptl/pthread_mutex_cond_lock.c |  1 +
> >  nptl/pthread_mutex_lock.c      | 40 ++++++++++++++++++++++++++--------
> >  2 files changed, 32 insertions(+), 9 deletions(-)
> >
> > diff --git a/nptl/pthread_mutex_cond_lock.c b/nptl/pthread_mutex_cond_lock.c
> > index f3af514305..fcc90bb16c 100644
> > --- a/nptl/pthread_mutex_cond_lock.c
> > +++ b/nptl/pthread_mutex_cond_lock.c
> > @@ -3,6 +3,7 @@
> >  #define LLL_MUTEX_LOCK(mutex) \
> >    lll_cond_lock ((mutex)->__data.__lock, PTHREAD_MUTEX_PSHARED (mutex))
> >  #define LLL_MUTEX_LOCK_OPTIMIZED(mutex) LLL_MUTEX_LOCK (mutex)
> > +#define LLL_MUTEX_TRYLOCK_OPTIMIZED(mutex) LLL_MUTEX_TRYLOCK (mutex)
> >
> >  /* Not actually elided so far. Needed? */
> >  #define LLL_MUTEX_LOCK_ELISION(mutex) \
> > diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
> > index d4f96c70ef..011bd7488d 100644
> > --- a/nptl/pthread_mutex_lock.c
> > +++ b/nptl/pthread_mutex_lock.c
> > @@ -30,9 +30,9 @@
> >  /* Some of the following definitions differ when pthread_mutex_cond_lock.c
> >     includes this file. */
> >  #ifndef LLL_MUTEX_LOCK
> > -/* lll_lock with single-thread optimization. */
> > -static inline void
> > -lll_mutex_lock_optimized (pthread_mutex_t *mutex)
> > +
> > +static inline int
> > +lll_mutex_try_singlethreaded_opt (pthread_mutex_t *mutex, int private)
> >  {
> >    /* The single-threaded optimization is only valid for private
> >       mutexes. For process-shared mutexes, the mutex could be in a
> > @@ -41,16 +41,38 @@ lll_mutex_lock_optimized (pthread_mutex_t *mutex)
> >       acquired, POSIX requires that pthread_mutex_lock deadlocks for
> >       normal mutexes, so skip the optimization in that case as
> >       well. */
> > -  int private = PTHREAD_MUTEX_PSHARED (mutex);
> >    if (private == LLL_PRIVATE && SINGLE_THREAD_P && mutex->__data.__lock == 0)
> > -    mutex->__data.__lock = 1;
> > -  else
> > +    {
> > +      mutex->__data.__lock = 1;
> > +      return 0;
> > +    }
> > +  return 1;
> > +}
> > +
> > +/* lll_trylock with single-thread optimization. */
> > +static inline int
> > +lll_mutex_trylock_optimized (pthread_mutex_t *mutex)
> > +{
> > +  if (lll_mutex_try_singlethreaded_opt (mutex, PTHREAD_MUTEX_PSHARED (mutex))
> > +      == 0)
> > +    return 0;
> > +  return lll_trylock (mutex->__data.__lock);
> > +}
> > +
> > +/* lll_lock with single-thread optimization. */
> > +static inline void
> > +lll_mutex_lock_optimized (pthread_mutex_t *mutex)
> > +{
> > +  int private = PTHREAD_MUTEX_PSHARED (mutex);
> > +  if (lll_mutex_try_singlethreaded_opt (mutex, private))
> >      lll_lock (mutex->__data.__lock, private);
> >  }
> >
> >  # define LLL_MUTEX_LOCK(mutex) \
> >    lll_lock ((mutex)->__data.__lock, PTHREAD_MUTEX_PSHARED (mutex))
> >  # define LLL_MUTEX_LOCK_OPTIMIZED(mutex) lll_mutex_lock_optimized (mutex)
> > +# define LLL_MUTEX_TRYLOCK_OPTIMIZED(mutex) \
> > +  lll_mutex_trylock_optimized (mutex)
> >  # define LLL_MUTEX_TRYLOCK(mutex) \
> >    lll_trylock ((mutex)->__data.__lock)
> >  # define LLL_ROBUST_MUTEX_LOCK_MODIFIER 0
> > @@ -133,11 +155,11 @@ PTHREAD_MUTEX_LOCK (pthread_mutex_t *mutex)
> >    else if (__builtin_expect (PTHREAD_MUTEX_TYPE (mutex)
> >                               == PTHREAD_MUTEX_ADAPTIVE_NP, 1))
> >      {
> > -      if (LLL_MUTEX_TRYLOCK (mutex) != 0)
> > +      if (LLL_MUTEX_TRYLOCK_OPTIMIZED (mutex) != 0)
> >          {
> >            int cnt = 0;
> > -          int max_cnt = MIN (max_adaptive_count (),
> > -                             mutex->__data.__spins * 2 + 10);
> > +          int max_cnt
> > +              = MIN (max_adaptive_count (), mutex->__data.__spins * 2 + 10);
> >            int spin_count, exp_backoff = 1;
> >            unsigned int jitter = get_jitter ();
> >            do
> > --
> > 2.34.1
> >
>
>
> --
> H.J.
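
For readers following the thread without the glibc tree at hand, the control flow the patch gives the trylock path can be sketched outside of nptl roughly as below. This is only an illustration under stated assumptions, not the glibc code: `struct toy_mutex`, `toy_single_threaded`, and the `toy_*` functions are hypothetical names, the flag stands in for SINGLE_THREAD_P, a C11 compare-and-swap stands in for lll_trylock, and the private/process-shared check is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for glibc's SINGLE_THREAD_P.  A real program
   would not maintain this flag by hand.  */
static bool toy_single_threaded = true;

/* Hypothetical toy lock: 0 means free, 1 means held.  */
struct toy_mutex
{
  atomic_int lock;
};

/* Returns 0 if the lock was taken on the single-threaded fast path,
   1 if the caller has to fall back to the atomic path.  */
static int
toy_try_singlethreaded_opt (struct toy_mutex *m)
{
  if (toy_single_threaded
      && atomic_load_explicit (&m->lock, memory_order_relaxed) == 0)
    {
      /* No other thread exists to race with, so a relaxed store is
         enough and no read-modify-write instruction is needed.  */
      atomic_store_explicit (&m->lock, 1, memory_order_relaxed);
      return 0;
    }
  return 1;
}

/* Returns 0 on success, nonzero if the lock is already held.  */
static int
toy_trylock_optimized (struct toy_mutex *m)
{
  if (toy_try_singlethreaded_opt (m) == 0)
    return 0;
  /* Compare-and-swap fallback, standing in for lll_trylock.  */
  int expected = 0;
  return !atomic_compare_exchange_strong_explicit (&m->lock, &expected, 1,
                                                   memory_order_acquire,
                                                   memory_order_relaxed);
}

/* Releases a lock that the caller currently holds.  */
static void
toy_unlock (struct toy_mutex *m)
{
  assert (atomic_load_explicit (&m->lock, memory_order_relaxed) == 1);
  atomic_store_explicit (&m->lock, 0, memory_order_release);
}
```

As in the patch, a held lock in the single-threaded case still falls through to the regular trylock, which then reports failure, so the adaptive spin loop that follows behaves as it did before.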