From: Jonathan Wakely
Date: Wed, 29 Sep 2021 13:13:46 +0100
Subject: Re: [PATCH] libstdc++: Clear padding bits in atomic compare_exchange
To: Thomas Rodgers
Cc: gcc Patches, "libstdc++", Thomas Rodgers, Thomas Rodgers
References: <20210923180837.633173-1-rodgert@appliantology.com> <20210927141031.651313-1-rodgert@appliantology.com>
In-Reply-To: <20210927141031.651313-1-rodgert@appliantology.com>
List-Id: Libstdc++ mailing list

On Mon, 27 Sept 2021 at 15:11, Thomas Rodgers wrote:
>
> From: Thomas Rodgers
>
> Now with checks for __has_builtin(__builtin_clear_padding)
>
> This change implements P0528 which requires that padding bits not
> participate in atomic compare exchange operations. All arguments to the
> generic template are 'sanitized' by the __builtin_clear_padding intrinsic
> before they are used in comparisons. This also requires that any stores
> also sanitize the incoming value.
>
> Signed-off-by: Thomas Rodgers
>
> libstdc++-v3/ChangeLog:
>
>         * include/std/atomic (atomic::atomic(_Tp) clear padding for
>         __cplusplus > 201703L.
>         (atomic::store()) Clear padding.
>         (atomic::exchange()) Likewise.
>         (atomic::compare_exchange_weak()) Likewise.
>         (atomic::compare_exchange_strong()) Likewise.

Don't we also need this for std::atomic_ref, i.e. for the
__atomic_impl free functions in ? There we don't have any
distinction between atomic_ref and atomic_ref, they both use the same
implementations.
But I think that's OK, as I think the built-in is smart enough to be a
no-op for types with no padding.

>         * testsuite/29_atomics/atomic/compare_exchange_padding.cc: New
>         test.
> ---
>  libstdc++-v3/include/std/atomic               | 41 +++++++++++++++++-
>  .../atomic/compare_exchange_padding.cc        | 42 +++++++++++++++++++
>  2 files changed, 81 insertions(+), 2 deletions(-)
>  create mode 100644 libstdc++-v3/testsuite/29_atomics/atomic/compare_exchange_padding.cc
>
> diff --git a/libstdc++-v3/include/std/atomic b/libstdc++-v3/include/std/atomic
> index 936dd50ba1c..4ac9ccdc1ab 100644
> --- a/libstdc++-v3/include/std/atomic
> +++ b/libstdc++-v3/include/std/atomic
> @@ -228,7 +228,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        atomic& operator=(const atomic&) = delete;
>        atomic& operator=(const atomic&) volatile = delete;
>
> -      constexpr atomic(_Tp __i) noexcept : _M_i(__i) { }
> +#if __cplusplus > 201703L && __has_builtin(__builtin_clear_padding)
> +      constexpr atomic(_Tp __i) noexcept : _M_i(__i)
> +      { __builtin_clear_padding(std::__addressof(_M_i)); }
> +#else
> +      constexpr atomic(_Tp __i) noexcept : _M_i(__i)
> +      { }
> +#endif

Please write this as a single function with the preprocessor
conditions in the body:

      constexpr atomic(_Tp __i) noexcept : _M_i(__i)
      {
#if __cplusplus > 201703L && __has_builtin(__builtin_clear_padding)
        __builtin_clear_padding(std::__addressof(_M_i));
#endif
      }

This not only avoids duplication of the identical parts, but it also
avoids warnings from ld.gold if you use --detect-odr-violations.
Otherwise, the linker can see a definition of that constructor on two
different lines (233 and 236), and so warns about possible ODR
violations, something like "warning: while linking foo: symbol
'std::atomic::atomic(int)' defined in multiple places (possible ODR
violation): ...atomic:233 ... atomic:236"

Can't we clear the padding for >= 201402L instead of only C++20? Only
C++11 has a problem with the built-in in a constexpr function, right?
So we can DTRT for C++14 upwards.
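For anyone following along who wants to see the underlying problem in isolation, here is a small standalone sketch (not part of the patch). `S` mirrors the struct in the quoted test; `Bytes`, `make_bytes`, `same_members` and `same_bytes` are hypothetical names made up for this illustration. It shows two object representations with equal members that still differ bytewise because of a padding byte, which is the case a bytewise compare-exchange gets wrong unless padding is cleared when the value is stored.

```cpp
#include <array>
#include <cstddef>
#include <cstring>

struct S { char c; short s; };

// Assumption: the usual ABI layout, with a padding byte between c and s.
static_assert(sizeof(S) > sizeof(char) + sizeof(short),
              "sketch assumes S has padding");

using Bytes = std::array<unsigned char, sizeof(S)>;

// Build an object representation of S{c, s} whose padding bytes are `fill`.
inline Bytes make_bytes(unsigned char fill, char c, short s)
{
  Bytes b;
  b.fill(fill);
  std::memcpy(b.data() + offsetof(S, c), &c, sizeof c);
  std::memcpy(b.data() + offsetof(S, s), &s, sizeof s);
  return b;
}

// Compare only the members, i.e. the value with padding excluded.
inline bool same_members(const Bytes& x, const Bytes& y)
{
  S a, b;
  std::memcpy(&a, x.data(), sizeof(S));
  std::memcpy(&b, y.data(), sizeof(S));
  return a.c == b.c && a.s == b.s;
}

// Compare whole object representations, padding included.
inline bool same_bytes(const Bytes& x, const Bytes& y)
{ return std::memcmp(x.data(), y.data(), sizeof(S)) == 0; }
```

With `make_bytes(0x00, 'a', 42)` and `make_bytes(0xff, 'a', 42)`, `same_members` is true while `same_bytes` is false.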
>
>        operator _Tp() const noexcept
>        { return load(); }
> @@ -268,12 +274,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        void
>        store(_Tp __i, memory_order __m = memory_order_seq_cst) noexcept
>        {
> +#if __has_builtin(__builtin_clear_padding)
> +       __builtin_clear_padding(std::__addressof(__i));
> +#endif

We repeat this *a lot*. When I started work on this I defined a
non-member function in the __atomic_impl namespace:

    template<typename _Tp>
      _GLIBCXX_ALWAYS_INLINE void
      __clear_padding(_Tp& __val) noexcept
      {
#if __has_builtin(__builtin_clear_padding)
        __builtin_clear_padding(std::__addressof(__val));
#endif
      }

Then you can just use that everywhere (except the constexpr
constructor), without all the #if checks.

>         __atomic_store(std::__addressof(_M_i), std::__addressof(__i), int(__m));
>        }
>
>        void
>        store(_Tp __i, memory_order __m = memory_order_seq_cst) volatile noexcept
>        {
> +#if __has_builtin(__builtin_clear_padding)
> +       __builtin_clear_padding(std::__addressof(__i));
> +#endif
>         __atomic_store(std::__addressof(_M_i), std::__addressof(__i), int(__m));
>        }
>
> @@ -300,6 +312,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        {
>          alignas(_Tp) unsigned char __buf[sizeof(_Tp)];
>         _Tp* __ptr = reinterpret_cast<_Tp*>(__buf);
> +#if __has_builtin(__builtin_clear_padding)
> +       __builtin_clear_padding(std::__addressof(__i));
> +#endif
>         __atomic_exchange(std::__addressof(_M_i), std::__addressof(__i),
>                           __ptr, int(__m));
>         return *__ptr;
> @@ -311,6 +326,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        {
>          alignas(_Tp) unsigned char __buf[sizeof(_Tp)];
>         _Tp* __ptr = reinterpret_cast<_Tp*>(__buf);
> +#if __has_builtin(__builtin_clear_padding)
> +       __builtin_clear_padding(std::__addressof(__i));
> +#endif
>         __atomic_exchange(std::__addressof(_M_i), std::__addressof(__i),
>                           __ptr, int(__m));
>         return *__ptr;
> @@ -322,6 +340,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        {
>         __glibcxx_assert(__is_valid_cmpexch_failure_order(__f));
>
> +#if __has_builtin(__builtin_clear_padding)
> +       __builtin_clear_padding(std::__addressof(__e));

This unconditionally clears the padding of __e, which I don't think is
allowed. It potentially introduces a data race if another thread is
doing the CAS at the same time, and the program assumes that only the
CAS that fails will update expected. See the thread I started at
https://lists.isocpp.org/parallel/2020/12/3443.php
("atomic compare_exchange and padding bits", 2020-12-03).

The conclusion was that writing to __e is not allowed in the failure
case, so you need to make a copy of it (into a buffer, using memcpy),
then clear the padding in the copy, then try the
__atomic_compare_exchange and if it fails, copy back from the buffer
to __e.

If all that extra work doesn't get inlined then we want to only do it
for types which might have padding bits, so I had
__atomic_impl::__maybe_has_padding in my unfinished patch:

    template<typename _Tp>
      constexpr bool
      __maybe_has_padding()
      {
#if __has_builtin(__has_unique_object_representations)
        return !__has_unique_object_representations(_Tp);
#else
        return true;
#endif
      }

The MSVC implementation uses !__has_unique_object_representations(_Tp)
&& !is_floating_point<_Tp>::value here, which is better than mine
above (FP types don't have unique object reps, but also don't have
padding bits).
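To make the difference between the two checks concrete, here is a rough standalone sketch of the MSVC-style variant. It is only an illustration: `maybe_has_padding`, `Padded` and `Packed` are hypothetical names, and it uses the C++17 library trait rather than the compiler built-in that the library code would use.

```cpp
#include <type_traits>

// Hypothetical stand-in for the __maybe_has_padding idea (requires C++17).
// A type "may have padding" if its object representation is not unique,
// unless it is a floating-point type: those lack unique representations
// for other reasons, not because of padding bits.
template<typename T>
constexpr bool maybe_has_padding()
{
  return !std::has_unique_object_representations<T>::value
         && !std::is_floating_point<T>::value;
}

struct Padded { char c; short s; };   // padding expected between c and s
struct Packed { short a; short b; };  // no padding expected on usual ABIs
```

Under the usual ABIs this reports true for `Padded` and false for `Packed`, `int` and `double`, so the copy-and-clear path would be skipped for the common cases.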
And then do something like this in compare_exchange_weak:

+      {
+#if __has_builtin(__builtin_clear_padding)
+       if _GLIBCXX_CONSTEXPR17 (__maybe_has_padding<_Tp>())
+       {
+         _Val<_Tp> __expected0 = __expected; // XXX should use memcpy
+         auto* __exp = __atomic_impl::__clear_padding(__expected0);
+         auto* __des = __atomic_impl::__clear_padding(__desired);
+         if (__atomic_compare_exchange(__ptr, __exp, __des, true,
+                                       int(__success), int(__failure)))
+           return true;
+         __builtin_memcpy(std::__addressof(__expected), __exp, sizeof(_Tp));
+         return false;
+       }
+#endif
        return __atomic_compare_exchange(__ptr, std::__addressof(__expected),

And similarly for compare_exchange_strong (or refactor them into one
function that takes a bool for weak/strong).

If you do all that in __atomic_impl::compare_exchange_weak (making it
take a bool for weak/strong) then you can reuse it from
__atomic_impl::compare_exchange_strong, and then change the generic
atomic::compare_exchange_{weak,strong} to use that as well.

> diff --git a/libstdc++-v3/testsuite/29_atomics/atomic/compare_exchange_padding.cc b/libstdc++-v3/testsuite/29_atomics/atomic/compare_exchange_padding.cc
> new file mode 100644
> index 00000000000..0875f168097
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/29_atomics/atomic/compare_exchange_padding.cc
> @@ -0,0 +1,42 @@
> +// { dg-options "-std=gnu++2a" }
> +// { dg-do run { target c++2a } }

We can (and should) use "20" not "2a".

Does it need to be C++20 though? Aren't all the clearings that are
being tested going to happen unconditionally? (Well ... as long as the
builtin exists, which is true for GCC.)
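Outside the library, the "one function that takes a bool" shape could look roughly like the sketch below. This is an illustration under assumptions, not the libstdc++ code: it relies on the GCC/Clang `__atomic_compare_exchange` built-in, always uses sequentially consistent ordering, skips the padding check for brevity, and `clear_padding`, `cas_impl`, `cas_weak`, `cas_strong`, `S` and `init` are hypothetical names.

```cpp
#include <cstring>
#include <type_traits>

#ifndef __has_builtin
# define __has_builtin(x) 0   // for compilers without __has_builtin
#endif

// Hypothetical helper: clears padding where the built-in exists and
// returns the address of its argument so it can feed the CAS directly.
template<typename T>
inline T* clear_padding(T& val) noexcept
{
#if __has_builtin(__builtin_clear_padding)
  __builtin_clear_padding(&val);
#endif
  return &val;
}

// One implementation for both flavours; `weak` selects which.
template<typename T>
inline bool cas_impl(T* obj, T& expected, T desired, bool weak) noexcept
{
  static_assert(std::is_trivially_copyable<T>::value
                && std::is_default_constructible<T>::value, "sketch only");
  T copy;                                   // work on a copy, never on `expected`
  std::memcpy(&copy, &expected, sizeof(T));
  T* exp = clear_padding(copy);
  T* des = clear_padding(desired);
  if (__atomic_compare_exchange(obj, exp, des, weak,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST))
    return true;
  std::memcpy(&expected, exp, sizeof(T));   // `expected` is written only on failure
  return false;
}

template<typename T>
inline bool cas_weak(T* obj, T& expected, T desired) noexcept
{ return cas_impl(obj, expected, desired, true); }

template<typename T>
inline bool cas_strong(T* obj, T& expected, T desired) noexcept
{ return cas_impl(obj, expected, desired, false); }

// Size and alignment of 4 keep the built-in lock-free on common targets.
struct alignas(4) S { char c; short s; };

// Zero the whole object (padding included), then set the members.
inline void init(S& v, char c, short s)
{
  std::memset(&v, 0, sizeof v);
  v.c = c;
  v.s = s;
}
```

A successful `cas_strong(&cur, exp, des)` leaves `exp` untouched, and a failed one copies the stored value back into it, which is the behaviour discussed in the thread linked above.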
> +// { dg-add-options libatomic }
> +
> +#include
> +
> +#include
> +
> +struct S { char c; short s; };
> +
> +void __attribute__((noinline,noipa))
> +fill_struct(S& s)
> +{ __builtin_memset(&s, 0xff, sizeof(S)); }
> +
> +bool
> +compare_struct(const S& a, const S& b)
> +{ return __builtin_memcmp(&a, &b, sizeof(S)) == 0; }
> +
> +int
> +main ()
> +{
> +  S s;
> +  fill_struct(s);
> +  s.c = 'a';
> +  s.s = 42;
> +
> +  std::atomic as{ s };
> +  auto ts = as.load();
> +  VERIFY( !compare_struct(s, ts) ); // padding cleared on construction
> +  as.exchange(s);
> +  auto es = as.load();
> +  VERIFY( compare_struct(ts, es) ); // padding cleared on exchange
> +
> +  S n;
> +  fill_struct(n);
> +  n.c = 'b';
> +  n.s = 71;
> +  // padding cleared on compexchg
> +  VERIFY( as.compare_exchange_weak(s, n) );

Is it safe to assume this won't fail spuriously? There is only one
thread doing the RMW operation; is that enough to avoid spurious
failures?

> +  VERIFY( as.compare_exchange_strong(n, s) );
> +  return 0;
> +}
> --
> 2.31.1
>
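If the answer to the spurious-failure question turns out to be "no", one way a test could sidestep it is to retry the weak CAS while it fails without reporting a different value. The sketch below is only a suggestion: `cas_weak_retry` is a hypothetical helper, it needs `operator==` on the value type, and it is shown with `int` rather than the padded struct from the test.

```cpp
#include <atomic>

// Retry compare_exchange_weak until it either succeeds or fails for a
// real reason. On failure the CAS stores the current value in
// `expected`, so if `expected` still equals what the caller asked for,
// the failure was spurious and it is safe to try again.
template<typename T>
bool cas_weak_retry(std::atomic<T>& a, T& expected, T desired)
{
  const T wanted = expected;
  while (!a.compare_exchange_weak(expected, desired))
    {
      if (!(expected == wanted))
        return false;   // genuine mismatch, reported via `expected`
    }
  return true;
}
```

With `std::atomic<int> a{5}` and `int e = 5`, `cas_weak_retry(a, e, 7)` returns true and leaves 7 in `a`; a second call with `e = 5` returns false and sets `e` to 7.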