From: "H.J. Lu"
Date: Wed, 10 Nov 2021 08:03:32 -0800
Subject: Re: [PATCH v4 0/3] Optimize CAS [BZ #28537]
To: Florian Weimer
Cc: "Paul A.
 Clarke", GNU C Library, Arjan van de Ven, Andreas Schwab

On Wed, Nov 10, 2021 at 7:52 AM Florian Weimer wrote:
>
> * Paul A. Clarke:
>
> > On Tue, Nov 09, 2021 at 04:16:11PM -0800, H.J. Lu via Libc-alpha wrote:
> >> CAS instruction is expensive.  From the x86 CPU's point of view, getting
> >> a cache line for writing is more expensive than reading.  See Appendix
> >> A.2 Spinlock in:
> >>
> >> https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/xeon-lock-scaling-analysis-paper.pdf
> >>
> >> The full compare and swap will grab the cache line exclusive and cause
> >> excessive cache line bouncing.
> >>
> >> Optimize CAS in low level locks and pthread_mutex_lock.c:
> >>
> >> 1. Do an atomic load and skip CAS if compare may fail to reduce cache
> >> line bouncing on contended locks.
> >> 2. Replace atomic_compare_and_exchange_bool_acq with
> >> atomic_compare_and_exchange_val_acq to avoid the extra load.
> >> 3. Drop __glibc_unlikely in __lll_trylock and lll_cond_trylock since we
> >> don't know if it's actually rare; in the contended case it is clearly not
> >> rare.
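Item 1 in the list above (load first, CAS only if the load says it can
succeed) can be sketched in portable C11 roughly as follows.  This is
only an illustration and not the code from the patch: demo_lock,
demo_trylock and demo_unlock are hypothetical names, and the sketch uses
<stdatomic.h> rather than glibc's internal atomic macros.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word: 0 = unlocked, 1 = locked.  */
typedef struct { atomic_int word; } demo_lock;

/* Try to take the lock once.  Returns true if it was acquired.  */
static bool
demo_trylock (demo_lock *l)
{
  /* Plain load first.  If the lock is visibly held, the CAS would fail
     anyway, so skip it and avoid requesting the cache line for
     writing.  */
  if (atomic_load_explicit (&l->word, memory_order_relaxed) != 0)
    return false;

  /* The lock looked free: now pay for the CAS.  It can still fail if
     another thread took the lock in between.  */
  int expected = 0;
  return atomic_compare_exchange_strong_explicit (&l->word, &expected, 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed);
}

static void
demo_unlock (demo_lock *l)
{
  /* Debug check: only a held lock may be released.  */
  assert (atomic_load_explicit (&l->word, memory_order_relaxed) == 1);
  atomic_store_explicit (&l->word, 0, memory_order_release);
}
```

The early return is the whole point: on a contended lock most callers
take the read-only path and the line can stay shared between CPUs.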
> >
> > I see build errors:
> >
> > In file included from pthread_mutex_cond_lock.c:23:
> > ../nptl/pthread_mutex_lock.c: In function ‘__pthread_mutex_cond_lock_full’:
> > ../nptl/pthread_mutex_lock.c:442:6: error: a label can only be part of a statement and a declaration is not a statement
> >       int private = (robust
> >       ^~~
>
> The patch has:
>
> + locked_mutex:
>   /* The mutex is locked.  The kernel will now take care of
>      everything.  */
>   int private = (robust
>
> This is only supported in recent C versions, I think the workaround is
> to add an empty statement with a semicolon, like this:
>
> + locked_mutex:;
>   /* The mutex is locked.  The kernel will now take care of
>      everything.  */
>   int private = (robust
>
> Thanks,
> Florian
>

Can you try users/hjl/x86/atomic-nptl branch:

https://gitlab.com/x86-glibc/glibc/-/commits/users/hjl/x86/atomic-nptl

I fixed a couple issues and added more CAS optimizations.

-- 
H.J.
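
For reference, the label workaround Florian describes can be reproduced
in isolation with something like the following.  This is a minimal
sketch, not the code in pthread_mutex_lock.c: demo_private_flag and its
parameters are hypothetical stand-ins, and only the placement of the
label and the declaration mirrors the patch.

```c
#include <assert.h>

/* Hypothetical stand-in for the patched function.  ROBUST must be 0 or
   1; SKIP selects the path that jumps straight to the label.  */
static int
demo_private_flag (int robust, int skip)
{
  assert (robust == 0 || robust == 1);

  if (skip)
    goto locked_mutex;

  /* ... work that is bypassed when SKIP is nonzero ... */

 locked_mutex:;
  /* The ';' after the label is an empty statement, so the label is
     attached to a statement and the declaration below is accepted by
     compilers that reject a label directly followed by a
     declaration.  */
  int private = robust ? 1 : 0;
  return private;
}
```

Removing the ';' after the label should bring back the "a label can only
be part of a statement" error on compilers that follow the older rule.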