From mboxrd@z Thu Jan  1 00:00:00 1970
From: "H.J. Lu"
To: libc-alpha@sourceware.org
Cc: Florian Weimer, Oleh Derevenko, Arjan van de Ven, Andreas Schwab,
 "Paul A . Clarke", Noah Goldstein
Subject: [PATCH v6 1/4] Add LLL_MUTEX_READ_LOCK [BZ #28537]
Date: Thu, 11 Nov 2021 08:24:25 -0800
Message-Id: <20211111162428.2286605-2-hjl.tools@gmail.com>
In-Reply-To: <20211111162428.2286605-1-hjl.tools@gmail.com>
References: <20211111162428.2286605-1-hjl.tools@gmail.com>

The CAS instruction is expensive.  From the x86 CPU's point of view,
getting a cache line for writing is more expensive than getting it for
reading.  See Appendix A.2 Spinlock in:

https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/xeon-lock-scaling-analysis-paper.pdf

A full compare-and-swap grabs the cache line in exclusive state and
causes excessive cache line bouncing.  Add LLL_MUTEX_READ_LOCK to do an
atomic load and skip the CAS in the spinlock loop when the compare would
likely fail, to reduce cache line bouncing on contended locks.
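
The read-before-CAS idea described above can be sketched in portable
C11.  This is a minimal illustration under stated assumptions, not the
glibc implementation: the `demo_lock` type and the `demo_*` functions are
hypothetical names, the lock word simply holds 0 (free) or 1 (held), and
a real mutex would block in the kernel instead of returning false when
the spin budget runs out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type for illustration: 0 = free, 1 = held.
   This is not glibc's pthread_mutex_t.  */
typedef struct { atomic_int word; } demo_lock;

/* One CAS attempt.  The CAS needs the cache line in exclusive state
   even when it fails.  */
static bool
demo_trylock (demo_lock *l)
{
  int expected = 0;
  return atomic_compare_exchange_strong_explicit (&l->word, &expected, 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed);
}

/* Spin with a relaxed load first and attempt the CAS only after the
   load has seen the lock free.  Returns false once max_spins is used
   up; a real mutex would fall back to blocking at that point.  */
static bool
demo_spin_lock (demo_lock *l, int max_spins)
{
  for (int i = 0; i < max_spins; i++)
    {
      if (atomic_load_explicit (&l->word, memory_order_relaxed) != 0)
        continue;               /* Held: keep reading, no CAS.  */
      if (demo_trylock (l))
        return true;
    }
  return false;
}

static void
demo_unlock (demo_lock *l)
{
  /* Unlocking a free lock would be a caller bug.  */
  assert (atomic_load_explicit (&l->word, memory_order_relaxed) != 0);
  atomic_store_explicit (&l->word, 0, memory_order_release);
}
```

While the lock is held, waiters in this sketch only issue loads, so the
cache line can stay shared among them; the exclusive request happens
only once a load has observed the lock free.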
---
 nptl/pthread_mutex_lock.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
index 2bd41767e0..72058c719c 100644
--- a/nptl/pthread_mutex_lock.c
+++ b/nptl/pthread_mutex_lock.c
@@ -64,6 +64,11 @@ lll_mutex_lock_optimized (pthread_mutex_t *mutex)
 # define PTHREAD_MUTEX_VERSIONS 1
 #endif
 
+#ifndef LLL_MUTEX_READ_LOCK
+# define LLL_MUTEX_READ_LOCK(mutex) \
+  atomic_load_relaxed (&(mutex)->__data.__lock)
+#endif
+
 static int __pthread_mutex_lock_full (pthread_mutex_t *mutex)
      __attribute_noinline__;
 
@@ -141,6 +146,8 @@ PTHREAD_MUTEX_LOCK (pthread_mutex_t *mutex)
                   break;
                 }
               atomic_spin_nop ();
+              if (LLL_MUTEX_READ_LOCK (mutex) != 0)
+                continue;
             }
           while (LLL_MUTEX_TRYLOCK (mutex) != 0);
 
-- 
2.33.1
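
A point worth checking when reading the second hunk: in C, a `continue`
inside a `do ... while` body transfers control to the loop's controlling
expression, so the `LLL_MUTEX_TRYLOCK` call in the `while` condition is
still evaluated after a failed read.  The sketch below contrasts that
loop shape with one where the read sits in the condition and `||`
short-circuits past the CAS.  All `demo_*` and `spin_*` names, and the
CAS counter, are hypothetical stand-ins rather than glibc code; this
illustrates the control flow only and is not a proposed change to the
patch.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins; these are not the glibc macros.  */
static atomic_int demo_word;    /* 0 = free, nonzero = held.  */
static int demo_cas_calls;      /* Counts CAS attempts.  */

static int
demo_read (void)
{
  return atomic_load_explicit (&demo_word, memory_order_relaxed);
}

/* Returns 0 on success and nonzero on failure, like a trylock.  */
static int
demo_trylock (void)
{
  int expected = 0;
  demo_cas_calls++;
  return !atomic_compare_exchange_strong (&demo_word, &expected, 1);
}

/* Shape A: `continue` in the body.  Control still reaches the while
   condition, so the CAS runs on every iteration.  */
static int
spin_with_continue (int max_cnt)
{
  assert (max_cnt >= 0);
  int cnt = 0;
  do
    {
      if (cnt++ >= max_cnt)
        return -1;              /* Give up; a real mutex would block.  */
      if (demo_read () != 0)
        continue;               /* Jumps to the condition below.  */
    }
  while (demo_trylock () != 0);
  return 0;
}

/* Shape B: the read is part of the condition, so `||` short-circuits
   and the CAS is not attempted while the lock reads as held.  */
static int
spin_with_short_circuit (int max_cnt)
{
  assert (max_cnt >= 0);
  int cnt = 0;
  do
    {
      if (cnt++ >= max_cnt)
        return -1;
    }
  while (demo_read () != 0 || demo_trylock () != 0);
  return 0;
}
```

With the lock word held for the whole call, shape A performs one CAS
per iteration, while shape B performs none until a read observes the
lock free.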