From: Jonathan Wakely
Date: Wed, 4 Aug 2021 16:52:02 +0100
Subject: Re: [PATCH] libstdc++: Skip atomic instructions in _Sp_counted_base::_M_release when both counts are 1
To: Maged Michael
Cc: libstdc++, gcc-patches

On Wed, 4 Aug 2021 at 16:47, Maged Michael wrote:
>
> Thanks, Jonathan!
>
> On Wed, Aug 4, 2021 at 11:32 AM Jonathan Wakely wrote:
>>
>> On Tue, 3 Aug 2021 at 21:59, Jonathan Wakely wrote:
>> >
>> > On Mon, 2 Aug 2021 at 14:29, Maged Michael wrote:
>> > >
>> > > This is the right patch. The previous one is missing noexcept. Sorry.
>> > >
>> > > On Mon, Aug 2, 2021 at 9:23 AM Maged Michael wrote:
>> > >>
>> > >> Please find attached an updated patch incorporating Jonathan's suggestions.
>> > >>
>> > >> Changes from the last patch include:
>> > >> - Add a TSAN macro to bits/c++config.
>> > >> - Use separate constexpr bools for the lock-freedom, double-width and alignment conditions.
>> > >> - Move the code in the optimized path to a separate function, _M_release_double_width_cas.
>> >
>> > Thanks for the updated patch. At a quick glance it looks great. I'll
>> > apply it locally and test it tomorrow.
>>
>> It occurs to me now that _M_release_double_width_cas is the wrong
>> name, because we're not doing a DWCAS, just a double-width load. What
>> do you think about renaming it to _M_release_combined instead, since
>> it does a combined load of the two counts?
>
> Yes, definitely _M_release_combined makes sense. Will do.
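For readers following along: a minimal sketch of the combined-load fast
path discussed here, illustrative only and not the committed patch, might
look like the code below. The struct and member names, the use of GCC's
__atomic builtins, and int standing in for _Atomic_word are all
assumptions made for the example.

// Sketch only, not the committed patch. One double-width load observes
// both reference counts at once; if use == 1 and weak == 1, this thread
// holds the only references, so both atomic decrements can be skipped.
struct counted_base_sketch
{
    virtual ~counted_base_sketch() = default;           // supplies the vptr
    virtual void _M_dispose() noexcept { }              // destroy managed object
    virtual void _M_destroy() noexcept { delete this; } // free control block

    // Stand-ins for the _Atomic_word members that follow the vptr.
    int _M_use_count  = 1;
    int _M_weak_count = 1;

    void _M_release() noexcept
    {
        constexpr bool __lock_free
            = __atomic_always_lock_free(sizeof(long long), 0);
        constexpr bool __double_word
            = sizeof(long long) == 2 * sizeof(int);
        constexpr bool __aligned
            = alignof(long long) <= alignof(void*)
              && sizeof(long long) == sizeof(void*);

        if (__lock_free && __double_word && __aligned)
        {
            // Value with 1 in each half. Since both halves are equal,
            // the comparison is endian-independent.
            constexpr long long __unique_ref
                = 1LL + (1LL << (8 * sizeof(int)));

            // One combined load of the two adjacent counts.
            if (__atomic_load_n(reinterpret_cast<long long*>(&_M_use_count),
                                __ATOMIC_ACQUIRE) == __unique_ref)
            {
                // Sole owner: no other thread can observe this object,
                // so skip the atomic read-modify-write operations.
                _M_dispose();
                _M_destroy();
                return;
            }
        }

        // Generic path: the usual pair of atomic decrements.
        if (__atomic_fetch_sub(&_M_use_count, 1, __ATOMIC_ACQ_REL) == 1)
        {
            _M_dispose();
            if (__atomic_fetch_sub(&_M_weak_count, 1, __ATOMIC_ACQ_REL) == 1)
                _M_destroy();
        }
    }
};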
See the patch I attached to my previous mail, which has a refactored
version that gets rid of that function entirely.

>
>> I'm no longer confident in the alignof suggestion I gave you.
>>
>> +      constexpr bool __double_word
>> +        = sizeof(long long) == 2 * sizeof(_Atomic_word);
>> +      // The ref-count members follow the vptr, so are aligned to
>> +      // alignof(void*).
>> +      constexpr bool __aligned = alignof(long long) <= alignof(void*);
>>
>> For IA32 (i.e. 32-bit x86) this constant will be true, because
>> alignof(long long) and alignof(void*) are both 4, even though
>> sizeof(long long) is 8. So in theory the _M_use_count and
>> _M_weak_count members could be in different cache lines, and the
>> double-width load would not be atomic. I think we want to use
>> __alignof(long long) here to get the "natural" alignment, not the
>> smaller 4-byte alignment allowed by the SysV ABI. That means we won't
>> do this optimization on IA32, but I think that's acceptable.
>>
>> Alternatively, we could keep using alignof but add a run-time check,
>> something like
>> (uintptr_t)&_M_use_count / 64 == (uintptr_t)&_M_weak_count / 64,
>> to confirm they're on the same cache line. I think for now we should
>> just use __alignof and live without the optimization on IA32.
>
> I'd rather avoid any runtime checks because they may negate the speed
> rationale for doing this optimization.
> I'd be OK with not doing this optimization for any 32-bit architecture.
>
> Is it OK to change the __aligned condition to the following?
>
>   constexpr bool __aligned =
>     (alignof(long long) <= alignof(void*))
>     && (sizeof(long long) == sizeof(void*));

Yes, that will work fine.
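To make the ABI difference concrete, here is a small standalone snippet
comparing the original and revised compile-time gates. Again an
illustration, not part of the patch; the variable names merely mirror the
patch's __aligned.

// Shows why the extra sizeof check matters on IA32, where the SysV ABI
// gives long long only 4-byte alignment despite its 8-byte size.
#include <cstdio>

int main()
{
    // Original gate: true on IA32 (4 <= 4), even though an 8-byte
    // long long may straddle a cache line there.
    constexpr bool aligned_old = alignof(long long) <= alignof(void*);

    // Revised gate: the added size check is false on any 32-bit target
    // (8 != 4), so the optimization is skipped there.
    constexpr bool aligned_new = alignof(long long) <= alignof(void*)
                                 && sizeof(long long) == sizeof(void*);

    std::printf("alignof(long long)=%zu alignof(void*)=%zu "
                "sizeof(long long)=%zu sizeof(void*)=%zu\n",
                alignof(long long), alignof(void*),
                sizeof(long long), sizeof(void*));
    std::printf("old gate: %d  new gate: %d\n", aligned_old, aligned_new);
}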