From: "H.J. Lu"
Date: Wed, 19 Apr 2023 08:41:33 -0700
Subject: Re: [PATCH v1] nptl: Add single-threaded optimization to ADAPTIVE_NP mutex
To: Noah Goldstein
Cc: libc-alpha@sourceware.org, carlos@systemhalted.org
In-Reply-To: <20230419011722.1154501-1-goldstein.w.n@gmail.com>

On Tue, Apr 18, 2023 at 6:17 PM Noah Goldstein wrote:
>
> ADAPTIVE_NP mutex is generally the most performant mutex option and
> is commonly used so it seems preferable for it to also benefit from
> the optimization.

An adaptive mutex works better than a normal mutex when the futex
syscall is the main bottleneck. For single-threaded applications, or
on machines with few cores, an adaptive mutex may not improve
performance. If that is the case, shouldn't we focus on improving
performance when the futex syscall is the bottleneck?

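For context, an application opts into this mutex type through the
mutex attribute API. A minimal sketch, assuming glibc on Linux;
init_adaptive_mutex is a hypothetical helper name used only for
illustration:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>

/* Hypothetical helper: initialize *m as a PTHREAD_MUTEX_ADAPTIVE_NP
   mutex.  Returns 0 on success or an error number from the pthread
   calls.  */
int
init_adaptive_mutex (pthread_mutex_t *m)
{
  pthread_mutexattr_t attr;
  int rc;

  assert (m != NULL);

  rc = pthread_mutexattr_init (&attr);
  if (rc != 0)
    return rc;

  /* Request the adaptive type; this is a GNU extension.  */
  rc = pthread_mutexattr_settype (&attr, PTHREAD_MUTEX_ADAPTIVE_NP);
  if (rc == 0)
    rc = pthread_mutex_init (m, &attr);

  /* The attribute object is no longer needed once the mutex exists.  */
  pthread_mutexattr_destroy (&attr);
  return rc;
}
```

A mutex set up this way is locked and unlocked with the usual
pthread_mutex_lock and pthread_mutex_unlock calls; only the contended
path differs, since it spins for a bounded number of attempts before
blocking.
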
> make check passes on linux-x86_64
> ---
>  nptl/pthread_mutex_cond_lock.c |  1 +
>  nptl/pthread_mutex_lock.c      | 40 ++++++++++++++++++++++++++--------
>  2 files changed, 32 insertions(+), 9 deletions(-)
>
> diff --git a/nptl/pthread_mutex_cond_lock.c b/nptl/pthread_mutex_cond_lock.c
> index f3af514305..fcc90bb16c 100644
> --- a/nptl/pthread_mutex_cond_lock.c
> +++ b/nptl/pthread_mutex_cond_lock.c
> @@ -3,6 +3,7 @@
>  #define LLL_MUTEX_LOCK(mutex) \
>    lll_cond_lock ((mutex)->__data.__lock, PTHREAD_MUTEX_PSHARED (mutex))
>  #define LLL_MUTEX_LOCK_OPTIMIZED(mutex) LLL_MUTEX_LOCK (mutex)
> +#define LLL_MUTEX_TRYLOCK_OPTIMIZED(mutex) LLL_MUTEX_TRYLOCK (mutex)
>
>  /* Not actually elided so far. Needed? */
>  #define LLL_MUTEX_LOCK_ELISION(mutex) \
> diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
> index d4f96c70ef..011bd7488d 100644
> --- a/nptl/pthread_mutex_lock.c
> +++ b/nptl/pthread_mutex_lock.c
> @@ -30,9 +30,9 @@
>  /* Some of the following definitions differ when pthread_mutex_cond_lock.c
>     includes this file. */
>  #ifndef LLL_MUTEX_LOCK
> -/* lll_lock with single-thread optimization. */
> -static inline void
> -lll_mutex_lock_optimized (pthread_mutex_t *mutex)
> +
> +static inline int
> +lll_mutex_try_singlethreaded_opt (pthread_mutex_t *mutex, int private)
>  {
>    /* The single-threaded optimization is only valid for private
>       mutexes. For process-shared mutexes, the mutex could be in a
> @@ -41,16 +41,38 @@ lll_mutex_lock_optimized (pthread_mutex_t *mutex)
>       acquired, POSIX requires that pthread_mutex_lock deadlocks for
>       normal mutexes, so skip the optimization in that case as
>       well. */
> -  int private = PTHREAD_MUTEX_PSHARED (mutex);
>    if (private == LLL_PRIVATE && SINGLE_THREAD_P && mutex->__data.__lock == 0)
> -    mutex->__data.__lock = 1;
> -  else
> +    {
> +      mutex->__data.__lock = 1;
> +      return 0;
> +    }
> +  return 1;
> +}
> +
> +/* lll_trylock with single-thread optimization. */
> +static inline int
> +lll_mutex_trylock_optimized (pthread_mutex_t *mutex)
> +{
> +  if (lll_mutex_try_singlethreaded_opt (mutex, PTHREAD_MUTEX_PSHARED (mutex))
> +      == 0)
> +    return 0;
> +  return lll_trylock (mutex->__data.__lock);
> +}
> +
> +/* lll_lock with single-thread optimization. */
> +static inline void
> +lll_mutex_lock_optimized (pthread_mutex_t *mutex)
> +{
> +  int private = PTHREAD_MUTEX_PSHARED (mutex);
> +  if (lll_mutex_try_singlethreaded_opt (mutex, private))
>      lll_lock (mutex->__data.__lock, private);
>  }
>
>  # define LLL_MUTEX_LOCK(mutex) \
>    lll_lock ((mutex)->__data.__lock, PTHREAD_MUTEX_PSHARED (mutex))
>  # define LLL_MUTEX_LOCK_OPTIMIZED(mutex) lll_mutex_lock_optimized (mutex)
> +# define LLL_MUTEX_TRYLOCK_OPTIMIZED(mutex) \
> +  lll_mutex_trylock_optimized (mutex)
>  # define LLL_MUTEX_TRYLOCK(mutex) \
>    lll_trylock ((mutex)->__data.__lock)
>  # define LLL_ROBUST_MUTEX_LOCK_MODIFIER 0
> @@ -133,11 +155,11 @@ PTHREAD_MUTEX_LOCK (pthread_mutex_t *mutex)
>    else if (__builtin_expect (PTHREAD_MUTEX_TYPE (mutex)
>                               == PTHREAD_MUTEX_ADAPTIVE_NP, 1))
>      {
> -      if (LLL_MUTEX_TRYLOCK (mutex) != 0)
> +      if (LLL_MUTEX_TRYLOCK_OPTIMIZED (mutex) != 0)
>          {
>            int cnt = 0;
> -          int max_cnt = MIN (max_adaptive_count (),
> -                             mutex->__data.__spins * 2 + 10);
> +          int max_cnt
> +            = MIN (max_adaptive_count (), mutex->__data.__spins * 2 + 10);
>            int spin_count, exp_backoff = 1;
>            unsigned int jitter = get_jitter ();
>            do
> --
> 2.34.1
>

-- 
H.J.
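
The trylock fast path in the quoted patch can be modelled outside
glibc roughly as follows. This is only a sketch under stated
assumptions: struct toy_mutex, toy_single_thread, and the toy_*
functions are hypothetical stand-ins for the mutex word,
SINGLE_THREAD_P, and the lll_* helpers, and C11 atomics replace the
real low-level lock primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the mutex state; not the glibc layout.  */
struct toy_mutex
{
  atomic_int lock;       /* 0 = free, 1 = held.  */
  bool process_shared;   /* Models a non-private (pshared) mutex.  */
};

/* Hypothetical stand-in for SINGLE_THREAD_P.  */
bool toy_single_thread = true;

/* Return 0 if the lock was taken without an atomic read-modify-write,
   1 if the caller must fall back to the regular path.  */
int
toy_try_singlethreaded_opt (struct toy_mutex *m)
{
  if (!m->process_shared && toy_single_thread
      && atomic_load_explicit (&m->lock, memory_order_relaxed) == 0)
    {
      atomic_store_explicit (&m->lock, 1, memory_order_relaxed);
      return 0;
    }
  return 1;
}

/* Return 0 if the lock was acquired, 1 if it is already held.  */
int
toy_trylock_optimized (struct toy_mutex *m)
{
  int expected = 0;

  if (toy_try_singlethreaded_opt (m) == 0)
    return 0;

  /* Regular path: one compare-and-swap attempt.  */
  return atomic_compare_exchange_strong_explicit (&m->lock, &expected, 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed)
         ? 0 : 1;
}

void
toy_unlock (struct toy_mutex *m)
{
  atomic_store_explicit (&m->lock, 0, memory_order_release);
}
```

In this model the fast path only saves the compare-and-swap; a lock
that is already held still goes through the regular path, so the
failure result is unchanged.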