From mboxrd@z Thu Jan 1 00:00:00 1970
From: Noah Goldstein
Date: Wed, 30 Mar 2022 12:07:05 -0500
Subject: Re: [PATCH v2] nptl: Add backoff mechanism to spinlock loop
To: Adhemerval Zanella
Cc: Wangyang Guo, GNU C Library
List-Id: Libc-alpha mailing list
On Wed, Mar 30, 2022 at 6:54 AM Adhemerval Zanella via Libc-alpha wrote:
>
>
> On 28/03/2022 05:47, Wangyang Guo via Libc-alpha wrote:
> > When multiple threads are waiting for a lock at the same time, once
> > the lock owner releases the lock, the waiters all see the lock as
> > available and try to lock it at once, which may cause an expensive
> > CAS storm.
> >
> > Binary exponential backoff with random jitter is introduced.  As
> > try-lock attempts increase, it becomes more likely that a large
> > number of threads are competing for the adaptive mutex lock, so the
> > wait time grows exponentially.  Random jitter is also added to avoid
> > synchronized try-lock attempts from other threads.
> >
> > v2: Remove the read-check before try-lock for performance.
> >
> > Signed-off-by: Wangyang Guo
> > ---
> >  nptl/pthread_mutex_lock.c | 25 ++++++++++++++++---------
> >  1 file changed, 16 insertions(+), 9 deletions(-)
> >
> > diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
> > index d2e652d151..7e75ec1cba 100644
> > --- a/nptl/pthread_mutex_lock.c
> > +++ b/nptl/pthread_mutex_lock.c
> > @@ -26,6 +26,7 @@
> >  #include
> >  #include
> >  #include
> > +#include
> >
> >  /* Some of the following definitions differ when pthread_mutex_cond_lock.c
> >     includes this file.  */
> >
> > @@ -64,11 +65,6 @@ lll_mutex_lock_optimized (pthread_mutex_t *mutex)
> >  # define PTHREAD_MUTEX_VERSIONS 1
> >  #endif
> >
> > -#ifndef LLL_MUTEX_READ_LOCK
> > -# define LLL_MUTEX_READ_LOCK(mutex) \
> > -  atomic_load_relaxed (&(mutex)->__data.__lock)
> > -#endif
> > -
> >  static int __pthread_mutex_lock_full (pthread_mutex_t *mutex)
> >      __attribute_noinline__;
> >
> > @@ -138,17 +134,28 @@ PTHREAD_MUTEX_LOCK (pthread_mutex_t *mutex)
> >        int cnt = 0;
> >        int max_cnt = MIN (max_adaptive_count (),
> >                           mutex->__data.__spins * 2 + 10);
> > +      int spin_count, exp_backoff = 1;
> > +      unsigned int jitter = random_bits ();
>
> This will issue a syscall for architectures that do not have
> clock_gettime on the vDSO, which is a performance regression.  You
> will need to move the jitter setup to be arch-specific, with the
> generic interface setting no random jitter.

What would be the best initial jitter for an arch with only syscall
timers?  The TID?  Or something else?

>
> >        do
> >          {
> > -          if (cnt++ >= max_cnt)
> > +          /* In each loop, the spin count is the exponential backoff
> > +             plus random jitter; the jitter range is
> > +             [0, exp_backoff-1].  */
> > +          spin_count = exp_backoff + (jitter & (exp_backoff - 1));
> > +          cnt += spin_count;
> > +          if (cnt >= max_cnt)
> >              {
> > +              /* If cnt exceeds the max spin count, just go to the
> > +                 wait queue.  */
> >                LLL_MUTEX_LOCK (mutex);
> >                break;
> >              }
> > -          atomic_spin_nop ();
> > +          do
> > +            atomic_spin_nop ();
> > +          while (--spin_count > 0);
> > +          /* Binary exponential backoff; prepare for the next
> > +             loop.  */
> > +          exp_backoff <<= 1;
> >          }
> > -      while (LLL_MUTEX_READ_LOCK (mutex) != 0
> > -             || LLL_MUTEX_TRYLOCK (mutex) != 0);
> > +      while (LLL_MUTEX_TRYLOCK (mutex) != 0);
> >
> >        mutex->__data.__spins += (cnt - mutex->__data.__spins) / 8;
> >      }