[PATCH] libstdc++: atomic: Add missing clear_padding in __atomic

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor
@ 2024-01-08  1:01 xndcn
  2024-01-15  0:45 ` [PING][PATCH] " xndcn
  2024-01-15  3:46 ` [PATCH] " H.J. Lu
  0 siblings, 2 replies; 28+ messages in thread
From: xndcn @ 2024-01-08  1:01 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1541 bytes --]

Hi, I found __atomic_float constructor does not clear padding,
while __compare_exchange assumes it as zeroed padding. So it is easy to
reproducing a infinite loop in X86-64 with long double type like:
---
-O0 -std=c++23 -mlong-double-80
#include <stdio.h>
#include <atomic>

#define T long double
int main() {
    std::atomic<T> t(0.5);
    t.fetch_add(0.5);
    float x = t;
    printf("%f\n", x);
}
---

So we should add __builtin_clear_padding in __atomic_float constructor,
just like the generic atomic struct.

regtested on x86_64-linux. Is it OK for trunk?

---
libstdc++: atomic: Add missing clear_padding in __atomic_float constructor.

libstdc++-v3/ChangeLog:

* include/bits/atomic_base.h: add __builtin_clear_padding in __atomic_float
constructor.
---
 libstdc++-v3/include/bits/atomic_base.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/bits/atomic_base.h
b/libstdc++-v3/include/bits/atomic_base.h
index f4ce0fa53..d59c2209e 100644
--- a/libstdc++-v3/include/bits/atomic_base.h
+++ b/libstdc++-v3/include/bits/atomic_base.h
@@ -1283,7 +1283,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

       constexpr
       __atomic_float(_Fp __t) : _M_fp(__t)
-      { }
+      {
+#if __has_builtin(__builtin_clear_padding)
+ if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Fp>())
+  __builtin_clear_padding(std::__addressof(_M_fp));
+#endif
+      }

       __atomic_float(const __atomic_float&) = delete;
       __atomic_float& operator=(const __atomic_float&) = delete;
-- 
2.25.1

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PING][PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor
  2024-01-08  1:01 [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor xndcn
@ 2024-01-15  0:45 ` xndcn
  2024-01-15  3:46 ` [PATCH] " H.J. Lu
  1 sibling, 0 replies; 28+ messages in thread
From: xndcn @ 2024-01-15  0:45 UTC (permalink / raw)
  To: gcc-patches, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 1781 bytes --]

Ping. Thanks.

xndcn <xndchn@gmail.com> 于2024年1月8日周一 09:01写道：

> Hi, I found __atomic_float constructor does not clear padding,
> while __compare_exchange assumes it as zeroed padding. So it is easy to
> reproducing a infinite loop in X86-64 with long double type like:
> ---
> -O0 -std=c++23 -mlong-double-80
> #include <stdio.h>
> #include <atomic>
>
> #define T long double
> int main() {
>     std::atomic<T> t(0.5);
>     t.fetch_add(0.5);
>     float x = t;
>     printf("%f\n", x);
> }
> ---
>
> So we should add __builtin_clear_padding in __atomic_float constructor,
> just like the generic atomic struct.
>
> regtested on x86_64-linux. Is it OK for trunk?
>
> ---
> libstdc++: atomic: Add missing clear_padding in __atomic_float constructor.
>
> libstdc++-v3/ChangeLog:
>
> * include/bits/atomic_base.h: add __builtin_clear_padding in
> __atomic_float constructor.
> ---
>  libstdc++-v3/include/bits/atomic_base.h | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/libstdc++-v3/include/bits/atomic_base.h
> b/libstdc++-v3/include/bits/atomic_base.h
> index f4ce0fa53..d59c2209e 100644
> --- a/libstdc++-v3/include/bits/atomic_base.h
> +++ b/libstdc++-v3/include/bits/atomic_base.h
> @@ -1283,7 +1283,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
>        constexpr
>        __atomic_float(_Fp __t) : _M_fp(__t)
> -      { }
> +      {
> +#if __has_builtin(__builtin_clear_padding)
> + if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Fp>())
> +  __builtin_clear_padding(std::__addressof(_M_fp));
> +#endif
> +      }
>
>        __atomic_float(const __atomic_float&) = delete;
>        __atomic_float& operator=(const __atomic_float&) = delete;
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor
  2024-01-08  1:01 [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor xndcn
  2024-01-15  0:45 ` [PING][PATCH] " xndcn
@ 2024-01-15  3:46 ` H.J. Lu
  2024-01-16  9:53   ` xndcn
  1 sibling, 1 reply; 28+ messages in thread
From: H.J. Lu @ 2024-01-15  3:46 UTC (permalink / raw)
  To: xndcn; +Cc: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 1746 bytes --]

On Sun, Jan 7, 2024, 5:02 PM xndcn <xndchn@gmail.com> wrote:

> Hi, I found __atomic_float constructor does not clear padding,
> while __compare_exchange assumes it as zeroed padding. So it is easy to
> reproducing a infinite loop in X86-64 with long double type like:
> ---
> -O0 -std=c++23 -mlong-double-80
> #include <stdio.h>
> #include <atomic>
>
> #define T long double
> int main() {
>     std::atomic<T> t(0.5);
>     t.fetch_add(0.5);
>     float x = t;
>     printf("%f\n", x);
> }
> ---
>
> So we should add __builtin_clear_padding in __atomic_float constructor,
> just like the generic atomic struct.
>
> regtested on x86_64-linux. Is it OK for trunk?
>
> ---
> libstdc++: atomic: Add missing clear_padding in __atomic_float constructor.
>
> libstdc++-v3/ChangeLog:
>
> * include/bits/atomic_base.h: add __builtin_clear_padding in
> __atomic_float constructor.
> ---
>  libstdc++-v3/include/bits/atomic_base.h | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/libstdc++-v3/include/bits/atomic_base.h
> b/libstdc++-v3/include/bits/atomic_base.h
> index f4ce0fa53..d59c2209e 100644
> --- a/libstdc++-v3/include/bits/atomic_base.h
> +++ b/libstdc++-v3/include/bits/atomic_base.h
> @@ -1283,7 +1283,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
>        constexpr
>        __atomic_float(_Fp __t) : _M_fp(__t)
> -      { }
> +      {
> +#if __has_builtin(__builtin_clear_padding)
> + if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Fp>())
> +  __builtin_clear_padding(std::__addressof(_M_fp));
> +#endif
> +      }
>
>        __atomic_float(const __atomic_float&) = delete;
>        __atomic_float& operator=(const __atomic_float&) = delete;
> --
> 2.25.1
>

Can you add a testcase?

Thanks.

H.J.

>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor
  2024-01-15  3:46 ` [PATCH] " H.J. Lu
@ 2024-01-16  9:53   ` xndcn
  2024-01-16 10:12     ` Xi Ruoyao
  0 siblings, 1 reply; 28+ messages in thread
From: xndcn @ 2024-01-16  9:53 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GCC Patches, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 5543 bytes --]

Thanks, so I add a test: atomic_float/compare_exchange_padding.cc, which
will fail due to timeout without the patch.

---
libstdc++-v3/ChangeLog:

 * include/bits/atomic_base.h: add __builtin_clear_padding in
__atomic_float constructor.
 * testsuite/lib/dg-options.exp: enable libatomic for IA32 and X86-64.
 * testsuite/29_atomics/atomic_float/compare_exchange_padding.cc: New test.
---
 libstdc++-v3/include/bits/atomic_base.h | 7 ++-
 .../atomic_float/compare_exchange_padding.cc | 50 +++++++++++++++++++
 libstdc++-v3/testsuite/lib/dg-options.exp | 1 +
 3 files changed, 57 insertions(+), 1 deletion(-)
 create mode 100644
libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc

diff --git a/libstdc++-v3/include/bits/atomic_base.h
b/libstdc++-v3/include/bits/atomic_base.h
index d3a2c4f3805..cbe3749e17f 100644
--- a/libstdc++-v3/include/bits/atomic_base.h
+++ b/libstdc++-v3/include/bits/atomic_base.h
@@ -1283,7 +1283,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

       constexpr
       __atomic_float(_Fp __t) : _M_fp(__t)
- { }
+ {
+#if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding)
+ if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Fp>())
+ __builtin_clear_padding(std::__addressof(_M_fp));
+#endif
+ }

       __atomic_float(const __atomic_float&) = delete;
       __atomic_float& operator=(const __atomic_float&) = delete;
diff --git
a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
new file mode 100644
index 00000000000..9376ab22850
--- /dev/null
+++
b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
@@ -0,0 +1,50 @@
+// { dg-do run { target c++20 } }
+// { dg-options "-O0" }
+// { dg-timeout 10 }
+// { dg-additional-options "-mlong-double-80" { target x86_64-*-* } }
+// { dg-do run { target { ia32 || x86_64-*-* } } }
+// { dg-add-options libatomic }
+
+#include <atomic>
+#include <testsuite_hooks.h>
+
+template<typename T>
+void __attribute__((noinline,noipa))
+fill_padding(T& f)
+{
+ T mask;
+ __builtin_memset(&mask, 0xff, sizeof(T));
+ __builtin_clear_padding(&mask);
+ unsigned char* ptr_f = (unsigned char*)&f;
+ unsigned char* ptr_mask = (unsigned char*)&mask;
+ for (int i = 0; i < sizeof(T); i++)
+ {
+ if (ptr_mask[i] == 0x00)
+ {
+ ptr_f[i] = 0xff;
+ }
+ }
+}
+
+void
+test01()
+{
+ long double f = 0.5f; // long double may contains padding on X86
+ fill_padding(f);
+ std::atomic<long double> as{ f }; // padding cleared on constructor
+ long double t = 1.5;
+
+ as.fetch_add(t);
+ long double s = f + t;
+ t = as.load();
+ VERIFY(s == t); // padding ignored on float comparing
+ fill_padding(s);
+ VERIFY(as.compare_exchange_weak(s, f)); // padding cleared on cmpexchg
+ fill_padding(f);
+ VERIFY(as.compare_exchange_strong(f, t)); // padding cleared on cmpexchg
+}
+
+int main()
+{
+ test01();
+}
diff --git a/libstdc++-v3/testsuite/lib/dg-options.exp
b/libstdc++-v3/testsuite/lib/dg-options.exp
index bc387d17ed7..d9a19dadd7f 100644
--- a/libstdc++-v3/testsuite/lib/dg-options.exp
+++ b/libstdc++-v3/testsuite/lib/dg-options.exp
@@ -337,6 +337,7 @@ proc add_options_for_libatomic { flags } {
   || ([istarget powerpc*-*-*] && [check_effective_target_ilp32])
   || [istarget riscv*-*-*]
   || ([istarget sparc*-*-linux-gnu] && [check_effective_target_ilp32])
+ || ([istarget i?86-*-*] || [istarget x86_64-*-*])
        } {
  global TOOL_OPTIONS

-- 
2.25.1

H.J. Lu <hjl.tools@gmail.com> 于2024年1月15日周一 11:46写道：

> On Sun, Jan 7, 2024, 5:02 PM xndcn <xndchn@gmail.com> wrote:
>
>> Hi, I found __atomic_float constructor does not clear padding,
>> while __compare_exchange assumes it as zeroed padding. So it is easy to
>> reproducing a infinite loop in X86-64 with long double type like:
>> ---
>> -O0 -std=c++23 -mlong-double-80
>> #include <stdio.h>
>> #include <atomic>
>>
>> #define T long double
>> int main() {
>>     std::atomic<T> t(0.5);
>>     t.fetch_add(0.5);
>>     float x = t;
>>     printf("%f\n", x);
>> }
>> ---
>>
>> So we should add __builtin_clear_padding in __atomic_float constructor,
>> just like the generic atomic struct.
>>
>> regtested on x86_64-linux. Is it OK for trunk?
>>
>> ---
>> libstdc++: atomic: Add missing clear_padding in __atomic_float
>> constructor.
>>
>> libstdc++-v3/ChangeLog:
>>
>> * include/bits/atomic_base.h: add __builtin_clear_padding in
>> __atomic_float constructor.
>> ---
>>  libstdc++-v3/include/bits/atomic_base.h | 7 ++++++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/libstdc++-v3/include/bits/atomic_base.h
>> b/libstdc++-v3/include/bits/atomic_base.h
>> index f4ce0fa53..d59c2209e 100644
>> --- a/libstdc++-v3/include/bits/atomic_base.h
>> +++ b/libstdc++-v3/include/bits/atomic_base.h
>> @@ -1283,7 +1283,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>
>>        constexpr
>>        __atomic_float(_Fp __t) : _M_fp(__t)
>> -      { }
>> +      {
>> +#if __has_builtin(__builtin_clear_padding)
>> + if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Fp>())
>> +  __builtin_clear_padding(std::__addressof(_M_fp));
>> +#endif
>> +      }
>>
>>        __atomic_float(const __atomic_float&) = delete;
>>        __atomic_float& operator=(const __atomic_float&) = delete;
>> --
>> 2.25.1
>>
>
> Can you add a testcase?
>
> Thanks.
>
> H.J.
>
>>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor
  2024-01-16  9:53   ` xndcn
@ 2024-01-16 10:12     ` Xi Ruoyao
  2024-01-16 16:16       ` xndcn
  0 siblings, 1 reply; 28+ messages in thread
From: Xi Ruoyao @ 2024-01-16 10:12 UTC (permalink / raw)
  To: xndcn, H.J. Lu; +Cc: GCC Patches, Jakub Jelinek, libstdc++

On Tue, 2024-01-16 at 17:53 +0800, xndcn wrote:
> Thanks, so I add a test: atomic_float/compare_exchange_padding.cc,
> which will fail due to timeout without the patch.

Please resend in plain text instead of HTML.  Sending in HTML causes the
patch mangled.

And libstdc++ patches should CC libstdc++@gcc.gnu.org, in addition to
gcc-patches@gcc.gnu.org.

> ---
> libstdc++-v3/ChangeLog:
> 
>  * include/bits/atomic_base.h: add __builtin_clear_padding in __atomic_float constructor.
>  * testsuite/lib/dg-options.exp: enable libatomic for IA32 and X86-64.
>  * testsuite/29_atomics/atomic_float/compare_exchange_padding.cc: New test.
> ---
>  libstdc++-v3/include/bits/atomic_base.h       |  7 ++-
>  .../atomic_float/compare_exchange_padding.cc  | 50 +++++++++++++++++++
>  libstdc++-v3/testsuite/lib/dg-options.exp     |  1 +
>  3 files changed, 57 insertions(+), 1 deletion(-)
>  create mode 100644 libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> 
> diff --git a/libstdc++-v3/include/bits/atomic_base.h b/libstdc++-v3/include/bits/atomic_base.h
> index d3a2c4f3805..cbe3749e17f 100644
> --- a/libstdc++-v3/include/bits/atomic_base.h
> +++ b/libstdc++-v3/include/bits/atomic_base.h
> @@ -1283,7 +1283,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  
>        constexpr
>        __atomic_float(_Fp __t) : _M_fp(__t)
> -      { }
> +      {
> +#if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding)
> + if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Fp>())
> + __builtin_clear_padding(std::__addressof(_M_fp));
> +#endif
> +      }
>  
>        __atomic_float(const __atomic_float&) = delete;
>        __atomic_float& operator=(const __atomic_float&) = delete;
> diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> new file mode 100644
> index 00000000000..9376ab22850
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> @@ -0,0 +1,50 @@
> +// { dg-do run { target c++20 } }
> +// { dg-options "-O0" }
> +// { dg-timeout 10 }
> +// { dg-additional-options "-mlong-double-80" { target x86_64-*-* } }
> +// { dg-do run { target { ia32 || x86_64-*-* } } }
> +// { dg-add-options libatomic }
> +
> +#include <atomic>
> +#include <testsuite_hooks.h>
> +
> +template<typename T>
> +void __attribute__((noinline,noipa))
> +fill_padding(T& f)
> +{
> +  T mask;
> +  __builtin_memset(&mask, 0xff, sizeof(T));
> +  __builtin_clear_padding(&mask);
> +  unsigned char* ptr_f = (unsigned char*)&f;
> +  unsigned char* ptr_mask = (unsigned char*)&mask;
> +  for (int i = 0; i < sizeof(T); i++)
> +  {
> +    if (ptr_mask[i] == 0x00)
> +    {
> +      ptr_f[i] = 0xff;
> +    }
> +  }
> +}
> +
> +void
> +test01()
> +{
> +  long double f = 0.5f; // long double may contains padding on X86
> +  fill_padding(f);
> +  std::atomic<long double> as{ f }; // padding cleared on constructor
> +  long double t = 1.5;
> +
> +  as.fetch_add(t);
> +  long double s = f + t;
> +  t = as.load();
> +  VERIFY(s == t); // padding ignored on float comparing
> +  fill_padding(s);
> +  VERIFY(as.compare_exchange_weak(s, f)); // padding cleared on cmpexchg
> +  fill_padding(f);
> +  VERIFY(as.compare_exchange_strong(f, t)); // padding cleared on cmpexchg
> +}
> +
> +int main()
> +{
> +  test01();
> +}
> diff --git a/libstdc++-v3/testsuite/lib/dg-options.exp b/libstdc++-v3/testsuite/lib/dg-options.exp
> index bc387d17ed7..d9a19dadd7f 100644
> --- a/libstdc++-v3/testsuite/lib/dg-options.exp
> +++ b/libstdc++-v3/testsuite/lib/dg-options.exp
> @@ -337,6 +337,7 @@ proc add_options_for_libatomic { flags } {
>    || ([istarget powerpc*-*-*] && [check_effective_target_ilp32])
>    || [istarget riscv*-*-*]
>    || ([istarget sparc*-*-linux-gnu] && [check_effective_target_ilp32])
> + || ([istarget i?86-*-*] || [istarget x86_64-*-*])

This seems too overkill as "dg-add-options libatomic" is not intended to
handle 16-byte atomics.  Maybe we can fork this to a new dg-add-options
like "add_options_for_libatomic_16b"?

>         } {
>   global TOOL_OPTIONS
>  
> -- 
> 2.25.1

-- 
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor
  2024-01-16 10:12     ` Xi Ruoyao
@ 2024-01-16 16:16       ` xndcn
  2024-01-24 15:53         ` xndcn
                           ` (3 more replies)
  0 siblings, 4 replies; 28+ messages in thread
From: xndcn @ 2024-01-16 16:16 UTC (permalink / raw)
  To: Xi Ruoyao; +Cc: H.J. Lu, GCC Patches, Jakub Jelinek, libstdc++

Sorry about the mangled content...
So I add a new add-options for libatomic_16b:

---
libstdc++-v3/ChangeLog:

* include/bits/atomic_base.h: add __builtin_clear_padding in
__atomic_float constructor.
* testsuite/lib/dg-options.exp: add new add-options for libatomic_16b.
* testsuite/29_atomics/atomic_float/compare_exchange_padding.cc: New test.
---
 libstdc++-v3/include/bits/atomic_base.h       |  7 ++-
 .../atomic_float/compare_exchange_padding.cc  | 50 +++++++++++++++++++
 libstdc++-v3/testsuite/lib/dg-options.exp     | 22 ++++++++
 3 files changed, 78 insertions(+), 1 deletion(-)
 create mode 100644
libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc

diff --git a/libstdc++-v3/include/bits/atomic_base.h
b/libstdc++-v3/include/bits/atomic_base.h
index f4ce0fa53..104ddfdbe 100644
--- a/libstdc++-v3/include/bits/atomic_base.h
+++ b/libstdc++-v3/include/bits/atomic_base.h
@@ -1283,7 +1283,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

       constexpr
       __atomic_float(_Fp __t) : _M_fp(__t)
-      { }
+      {
+#if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding)
+ if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Fp>())
+   __builtin_clear_padding(std::__addressof(_M_fp));
+#endif
+      }

       __atomic_float(const __atomic_float&) = delete;
       __atomic_float& operator=(const __atomic_float&) = delete;
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
new file mode 100644
index 000000000..d538b3d55
--- /dev/null
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
@@ -0,0 +1,50 @@
+// { dg-do run { target c++20 } }
+// { dg-options "-O0" }
+// { dg-timeout 10 }
+// { dg-additional-options "-mlong-double-80" { target x86_64-*-* } }
+// { dg-do run { target { ia32 || x86_64-*-* } } }
+// { dg-add-options libatomic_16b }
+
+#include <atomic>
+#include <testsuite_hooks.h>
+
+template<typename T>
+void __attribute__((noinline,noipa))
+fill_padding(T& f)
+{
+  T mask;
+  __builtin_memset(&mask, 0xff, sizeof(T));
+  __builtin_clear_padding(&mask);
+  unsigned char* ptr_f = (unsigned char*)&f;
+  unsigned char* ptr_mask = (unsigned char*)&mask;
+  for (int i = 0; i < sizeof(T); i++)
+  {
+    if (ptr_mask[i] == 0x00)
+    {
+      ptr_f[i] = 0xff;
+    }
+  }
+}
+
+void
+test01()
+{
+  long double f = 0.5f; // long double may contains padding on X86
+  fill_padding(f);
+  std::atomic<long double> as{ f }; // padding cleared on constructor
+  long double t = 1.5;
+
+  as.fetch_add(t);
+  long double s = f + t;
+  t = as.load();
+  VERIFY(s == t); // padding ignored on float comparing
+  fill_padding(s);
+  VERIFY(as.compare_exchange_weak(s, f)); // padding cleared on cmpexchg
+  fill_padding(f);
+  VERIFY(as.compare_exchange_strong(f, t)); // padding cleared on cmpexchg
+}
+
+int main()
+{
+  test01();
+}
diff --git a/libstdc++-v3/testsuite/lib/dg-options.exp
b/libstdc++-v3/testsuite/lib/dg-options.exp
index 850442b6b..25da20d58 100644
--- a/libstdc++-v3/testsuite/lib/dg-options.exp
+++ b/libstdc++-v3/testsuite/lib/dg-options.exp
@@ -356,6 +356,28 @@ proc add_options_for_libatomic { flags } {
     return $flags
 }

+# Add option to link to libatomic, if required for atomics on 16-byte (128-bit)
+proc add_options_for_libatomic_16b { flags } {
+    if { ([istarget i?86-*-*] || [istarget x86_64-*-*])
+       } {
+ global TOOL_OPTIONS
+
+ set link_flags ""
+ if ![is_remote host] {
+     if [info exists TOOL_OPTIONS] {
+ set link_flags "[atomic_link_flags [get_multilibs ${TOOL_OPTIONS}]]"
+     } else {
+ set link_flags "[atomic_link_flags [get_multilibs]]"
+     }
+ }
+
+ append link_flags " -latomic "
+
+ return "$flags $link_flags"
+    }
+    return $flags
+}
+
 # Add options to enable use of deprecated features.
 proc add_options_for_using-deprecated { flags } {
     return "$flags -U_GLIBCXX_USE_DEPRECATED -D_GLIBCXX_USE_DEPRECATED=1"
-- 
2.25.1


Xi Ruoyao <xry111@xry111.site> 于2024年1月16日周二 18:12写道：
>
> On Tue, 2024-01-16 at 17:53 +0800, xndcn wrote:
> > Thanks, so I add a test: atomic_float/compare_exchange_padding.cc,
> > which will fail due to timeout without the patch.
>
> Please resend in plain text instead of HTML.  Sending in HTML causes the
> patch mangled.
>
> And libstdc++ patches should CC libstdc++@gcc.gnu.org, in addition to
> gcc-patches@gcc.gnu.org.
>
> > ---
> > libstdc++-v3/ChangeLog:
> >
> >  * include/bits/atomic_base.h: add __builtin_clear_padding in __atomic_float constructor.
> >  * testsuite/lib/dg-options.exp: enable libatomic for IA32 and X86-64.
> >  * testsuite/29_atomics/atomic_float/compare_exchange_padding.cc: New test.
> > ---
> >  libstdc++-v3/include/bits/atomic_base.h       |  7 ++-
> >  .../atomic_float/compare_exchange_padding.cc  | 50 +++++++++++++++++++
> >  libstdc++-v3/testsuite/lib/dg-options.exp     |  1 +
> >  3 files changed, 57 insertions(+), 1 deletion(-)
> >  create mode 100644 libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> >
> > diff --git a/libstdc++-v3/include/bits/atomic_base.h b/libstdc++-v3/include/bits/atomic_base.h
> > index d3a2c4f3805..cbe3749e17f 100644
> > --- a/libstdc++-v3/include/bits/atomic_base.h
> > +++ b/libstdc++-v3/include/bits/atomic_base.h
> > @@ -1283,7 +1283,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >
> >        constexpr
> >        __atomic_float(_Fp __t) : _M_fp(__t)
> > -      { }
> > +      {
> > +#if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding)
> > + if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Fp>())
> > + __builtin_clear_padding(std::__addressof(_M_fp));
> > +#endif
> > +      }
> >
> >        __atomic_float(const __atomic_float&) = delete;
> >        __atomic_float& operator=(const __atomic_float&) = delete;
> > diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> > new file mode 100644
> > index 00000000000..9376ab22850
> > --- /dev/null
> > +++ b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> > @@ -0,0 +1,50 @@
> > +// { dg-do run { target c++20 } }
> > +// { dg-options "-O0" }
> > +// { dg-timeout 10 }
> > +// { dg-additional-options "-mlong-double-80" { target x86_64-*-* } }
> > +// { dg-do run { target { ia32 || x86_64-*-* } } }
> > +// { dg-add-options libatomic }
> > +
> > +#include <atomic>
> > +#include <testsuite_hooks.h>
> > +
> > +template<typename T>
> > +void __attribute__((noinline,noipa))
> > +fill_padding(T& f)
> > +{
> > +  T mask;
> > +  __builtin_memset(&mask, 0xff, sizeof(T));
> > +  __builtin_clear_padding(&mask);
> > +  unsigned char* ptr_f = (unsigned char*)&f;
> > +  unsigned char* ptr_mask = (unsigned char*)&mask;
> > +  for (int i = 0; i < sizeof(T); i++)
> > +  {
> > +    if (ptr_mask[i] == 0x00)
> > +    {
> > +      ptr_f[i] = 0xff;
> > +    }
> > +  }
> > +}
> > +
> > +void
> > +test01()
> > +{
> > +  long double f = 0.5f; // long double may contains padding on X86
> > +  fill_padding(f);
> > +  std::atomic<long double> as{ f }; // padding cleared on constructor
> > +  long double t = 1.5;
> > +
> > +  as.fetch_add(t);
> > +  long double s = f + t;
> > +  t = as.load();
> > +  VERIFY(s == t); // padding ignored on float comparing
> > +  fill_padding(s);
> > +  VERIFY(as.compare_exchange_weak(s, f)); // padding cleared on cmpexchg
> > +  fill_padding(f);
> > +  VERIFY(as.compare_exchange_strong(f, t)); // padding cleared on cmpexchg
> > +}
> > +
> > +int main()
> > +{
> > +  test01();
> > +}
> > diff --git a/libstdc++-v3/testsuite/lib/dg-options.exp b/libstdc++-v3/testsuite/lib/dg-options.exp
> > index bc387d17ed7..d9a19dadd7f 100644
> > --- a/libstdc++-v3/testsuite/lib/dg-options.exp
> > +++ b/libstdc++-v3/testsuite/lib/dg-options.exp
> > @@ -337,6 +337,7 @@ proc add_options_for_libatomic { flags } {
> >    || ([istarget powerpc*-*-*] && [check_effective_target_ilp32])
> >    || [istarget riscv*-*-*]
> >    || ([istarget sparc*-*-linux-gnu] && [check_effective_target_ilp32])
> > + || ([istarget i?86-*-*] || [istarget x86_64-*-*])
>
> This seems too overkill as "dg-add-options libatomic" is not intended to
> handle 16-byte atomics.  Maybe we can fork this to a new dg-add-options
> like "add_options_for_libatomic_16b"?
>
> >         } {
> >   global TOOL_OPTIONS
> >
> > --
> > 2.25.1
>
> --
> Xi Ruoyao <xry111@xry111.site>
> School of Aerospace Science and Technology, Xidian University

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor
  2024-01-16 16:16       ` xndcn
@ 2024-01-24 15:53         ` xndcn
  2024-01-30 16:08         ` [PING][PATCH] " xndcn
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 28+ messages in thread
From: xndcn @ 2024-01-24 15:53 UTC (permalink / raw)
  To: Xi Ruoyao, Jakub Jelinek, H.J. Lu; +Cc: GCC Patches, libstdc++

Hi, is it OK for trunk? I do not have access to the repo, can you
please help me submit the patch? Thanks.

xndcn <xndchn@gmail.com> 于2024年1月17日周三 00:16写道：
>
> Sorry about the mangled content...
> So I add a new add-options for libatomic_16b:
>
> ---
> libstdc++-v3/ChangeLog:
>
> * include/bits/atomic_base.h: add __builtin_clear_padding in
> __atomic_float constructor.
> * testsuite/lib/dg-options.exp: add new add-options for libatomic_16b.
> * testsuite/29_atomics/atomic_float/compare_exchange_padding.cc: New test.
> ---
>  libstdc++-v3/include/bits/atomic_base.h       |  7 ++-
>  .../atomic_float/compare_exchange_padding.cc  | 50 +++++++++++++++++++
>  libstdc++-v3/testsuite/lib/dg-options.exp     | 22 ++++++++
>  3 files changed, 78 insertions(+), 1 deletion(-)
>  create mode 100644
> libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
>
> diff --git a/libstdc++-v3/include/bits/atomic_base.h
> b/libstdc++-v3/include/bits/atomic_base.h
> index f4ce0fa53..104ddfdbe 100644
> --- a/libstdc++-v3/include/bits/atomic_base.h
> +++ b/libstdc++-v3/include/bits/atomic_base.h
> @@ -1283,7 +1283,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
>        constexpr
>        __atomic_float(_Fp __t) : _M_fp(__t)
> -      { }
> +      {
> +#if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding)
> + if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Fp>())
> +   __builtin_clear_padding(std::__addressof(_M_fp));
> +#endif
> +      }
>
>        __atomic_float(const __atomic_float&) = delete;
>        __atomic_float& operator=(const __atomic_float&) = delete;
> diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> new file mode 100644
> index 000000000..d538b3d55
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> @@ -0,0 +1,50 @@
> +// { dg-do run { target c++20 } }
> +// { dg-options "-O0" }
> +// { dg-timeout 10 }
> +// { dg-additional-options "-mlong-double-80" { target x86_64-*-* } }
> +// { dg-do run { target { ia32 || x86_64-*-* } } }
> +// { dg-add-options libatomic_16b }
> +
> +#include <atomic>
> +#include <testsuite_hooks.h>
> +
> +template<typename T>
> +void __attribute__((noinline,noipa))
> +fill_padding(T& f)
> +{
> +  T mask;
> +  __builtin_memset(&mask, 0xff, sizeof(T));
> +  __builtin_clear_padding(&mask);
> +  unsigned char* ptr_f = (unsigned char*)&f;
> +  unsigned char* ptr_mask = (unsigned char*)&mask;
> +  for (int i = 0; i < sizeof(T); i++)
> +  {
> +    if (ptr_mask[i] == 0x00)
> +    {
> +      ptr_f[i] = 0xff;
> +    }
> +  }
> +}
> +
> +void
> +test01()
> +{
> +  long double f = 0.5f; // long double may contains padding on X86
> +  fill_padding(f);
> +  std::atomic<long double> as{ f }; // padding cleared on constructor
> +  long double t = 1.5;
> +
> +  as.fetch_add(t);
> +  long double s = f + t;
> +  t = as.load();
> +  VERIFY(s == t); // padding ignored on float comparing
> +  fill_padding(s);
> +  VERIFY(as.compare_exchange_weak(s, f)); // padding cleared on cmpexchg
> +  fill_padding(f);
> +  VERIFY(as.compare_exchange_strong(f, t)); // padding cleared on cmpexchg
> +}
> +
> +int main()
> +{
> +  test01();
> +}
> diff --git a/libstdc++-v3/testsuite/lib/dg-options.exp
> b/libstdc++-v3/testsuite/lib/dg-options.exp
> index 850442b6b..25da20d58 100644
> --- a/libstdc++-v3/testsuite/lib/dg-options.exp
> +++ b/libstdc++-v3/testsuite/lib/dg-options.exp
> @@ -356,6 +356,28 @@ proc add_options_for_libatomic { flags } {
>      return $flags
>  }
>
> +# Add option to link to libatomic, if required for atomics on 16-byte (128-bit)
> +proc add_options_for_libatomic_16b { flags } {
> +    if { ([istarget i?86-*-*] || [istarget x86_64-*-*])
> +       } {
> + global TOOL_OPTIONS
> +
> + set link_flags ""
> + if ![is_remote host] {
> +     if [info exists TOOL_OPTIONS] {
> + set link_flags "[atomic_link_flags [get_multilibs ${TOOL_OPTIONS}]]"
> +     } else {
> + set link_flags "[atomic_link_flags [get_multilibs]]"
> +     }
> + }
> +
> + append link_flags " -latomic "
> +
> + return "$flags $link_flags"
> +    }
> +    return $flags
> +}
> +
>  # Add options to enable use of deprecated features.
>  proc add_options_for_using-deprecated { flags } {
>      return "$flags -U_GLIBCXX_USE_DEPRECATED -D_GLIBCXX_USE_DEPRECATED=1"
> --
> 2.25.1
>
>
> Xi Ruoyao <xry111@xry111.site> 于2024年1月16日周二 18:12写道：
> >
> > On Tue, 2024-01-16 at 17:53 +0800, xndcn wrote:
> > > Thanks, so I add a test: atomic_float/compare_exchange_padding.cc,
> > > which will fail due to timeout without the patch.
> >
> > Please resend in plain text instead of HTML.  Sending in HTML causes the
> > patch mangled.
> >
> > And libstdc++ patches should CC libstdc++@gcc.gnu.org, in addition to
> > gcc-patches@gcc.gnu.org.
> >
> > > ---
> > > libstdc++-v3/ChangeLog:
> > >
> > >  * include/bits/atomic_base.h: add __builtin_clear_padding in __atomic_float constructor.
> > >  * testsuite/lib/dg-options.exp: enable libatomic for IA32 and X86-64.
> > >  * testsuite/29_atomics/atomic_float/compare_exchange_padding.cc: New test.
> > > ---
> > >  libstdc++-v3/include/bits/atomic_base.h       |  7 ++-
> > >  .../atomic_float/compare_exchange_padding.cc  | 50 +++++++++++++++++++
> > >  libstdc++-v3/testsuite/lib/dg-options.exp     |  1 +
> > >  3 files changed, 57 insertions(+), 1 deletion(-)
> > >  create mode 100644 libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> > >
> > > diff --git a/libstdc++-v3/include/bits/atomic_base.h b/libstdc++-v3/include/bits/atomic_base.h
> > > index d3a2c4f3805..cbe3749e17f 100644
> > > --- a/libstdc++-v3/include/bits/atomic_base.h
> > > +++ b/libstdc++-v3/include/bits/atomic_base.h
> > > @@ -1283,7 +1283,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> > >
> > >        constexpr
> > >        __atomic_float(_Fp __t) : _M_fp(__t)
> > > -      { }
> > > +      {
> > > +#if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding)
> > > + if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Fp>())
> > > + __builtin_clear_padding(std::__addressof(_M_fp));
> > > +#endif
> > > +      }
> > >
> > >        __atomic_float(const __atomic_float&) = delete;
> > >        __atomic_float& operator=(const __atomic_float&) = delete;
> > > diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> > > new file mode 100644
> > > index 00000000000..9376ab22850
> > > --- /dev/null
> > > +++ b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> > > @@ -0,0 +1,50 @@
> > > +// { dg-do run { target c++20 } }
> > > +// { dg-options "-O0" }
> > > +// { dg-timeout 10 }
> > > +// { dg-additional-options "-mlong-double-80" { target x86_64-*-* } }
> > > +// { dg-do run { target { ia32 || x86_64-*-* } } }
> > > +// { dg-add-options libatomic }
> > > +
> > > +#include <atomic>
> > > +#include <testsuite_hooks.h>
> > > +
> > > +template<typename T>
> > > +void __attribute__((noinline,noipa))
> > > +fill_padding(T& f)
> > > +{
> > > +  T mask;
> > > +  __builtin_memset(&mask, 0xff, sizeof(T));
> > > +  __builtin_clear_padding(&mask);
> > > +  unsigned char* ptr_f = (unsigned char*)&f;
> > > +  unsigned char* ptr_mask = (unsigned char*)&mask;
> > > +  for (int i = 0; i < sizeof(T); i++)
> > > +  {
> > > +    if (ptr_mask[i] == 0x00)
> > > +    {
> > > +      ptr_f[i] = 0xff;
> > > +    }
> > > +  }
> > > +}
> > > +
> > > +void
> > > +test01()
> > > +{
> > > +  long double f = 0.5f; // long double may contains padding on X86
> > > +  fill_padding(f);
> > > +  std::atomic<long double> as{ f }; // padding cleared on constructor
> > > +  long double t = 1.5;
> > > +
> > > +  as.fetch_add(t);
> > > +  long double s = f + t;
> > > +  t = as.load();
> > > +  VERIFY(s == t); // padding ignored on float comparing
> > > +  fill_padding(s);
> > > +  VERIFY(as.compare_exchange_weak(s, f)); // padding cleared on cmpexchg
> > > +  fill_padding(f);
> > > +  VERIFY(as.compare_exchange_strong(f, t)); // padding cleared on cmpexchg
> > > +}
> > > +
> > > +int main()
> > > +{
> > > +  test01();
> > > +}
> > > diff --git a/libstdc++-v3/testsuite/lib/dg-options.exp b/libstdc++-v3/testsuite/lib/dg-options.exp
> > > index bc387d17ed7..d9a19dadd7f 100644
> > > --- a/libstdc++-v3/testsuite/lib/dg-options.exp
> > > +++ b/libstdc++-v3/testsuite/lib/dg-options.exp
> > > @@ -337,6 +337,7 @@ proc add_options_for_libatomic { flags } {
> > >    || ([istarget powerpc*-*-*] && [check_effective_target_ilp32])
> > >    || [istarget riscv*-*-*]
> > >    || ([istarget sparc*-*-linux-gnu] && [check_effective_target_ilp32])
> > > + || ([istarget i?86-*-*] || [istarget x86_64-*-*])
> >
> > This seems too overkill as "dg-add-options libatomic" is not intended to
> > handle 16-byte atomics.  Maybe we can fork this to a new dg-add-options
> > like "add_options_for_libatomic_16b"?
> >
> > >         } {
> > >   global TOOL_OPTIONS
> > >
> > > --
> > > 2.25.1
> >
> > --
> > Xi Ruoyao <xry111@xry111.site>
> > School of Aerospace Science and Technology, Xidian University

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PING][PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor
  2024-01-16 16:16       ` xndcn
  2024-01-24 15:53         ` xndcn
@ 2024-01-30 16:08         ` xndcn
  2024-01-30 16:31         ` [PATCH] " Jonathan Wakely
  2024-01-30 16:34         ` Jonathan Wakely
  3 siblings, 0 replies; 28+ messages in thread
From: xndcn @ 2024-01-30 16:08 UTC (permalink / raw)
  To: Xi Ruoyao; +Cc: H.J. Lu, GCC Patches, Jakub Jelinek, libstdc++

Ping, thanks.
I do not have access to the repo, anyone can please help me submit the
patch? Thanks.

xndcn <xndchn@gmail.com> 于2024年1月17日周三 00:16写道：
>
> Sorry about the mangled content...
> So I add a new add-options for libatomic_16b:
>
> ---
> libstdc++-v3/ChangeLog:
>
> * include/bits/atomic_base.h: add __builtin_clear_padding in
> __atomic_float constructor.
> * testsuite/lib/dg-options.exp: add new add-options for libatomic_16b.
> * testsuite/29_atomics/atomic_float/compare_exchange_padding.cc: New test.
> ---
>  libstdc++-v3/include/bits/atomic_base.h       |  7 ++-
>  .../atomic_float/compare_exchange_padding.cc  | 50 +++++++++++++++++++
>  libstdc++-v3/testsuite/lib/dg-options.exp     | 22 ++++++++
>  3 files changed, 78 insertions(+), 1 deletion(-)
>  create mode 100644
> libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
>
> diff --git a/libstdc++-v3/include/bits/atomic_base.h
> b/libstdc++-v3/include/bits/atomic_base.h
> index f4ce0fa53..104ddfdbe 100644
> --- a/libstdc++-v3/include/bits/atomic_base.h
> +++ b/libstdc++-v3/include/bits/atomic_base.h
> @@ -1283,7 +1283,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
>        constexpr
>        __atomic_float(_Fp __t) : _M_fp(__t)
> -      { }
> +      {
> +#if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding)
> + if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Fp>())
> +   __builtin_clear_padding(std::__addressof(_M_fp));
> +#endif
> +      }
>
>        __atomic_float(const __atomic_float&) = delete;
>        __atomic_float& operator=(const __atomic_float&) = delete;
> diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> new file mode 100644
> index 000000000..d538b3d55
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> @@ -0,0 +1,50 @@
> +// { dg-do run { target c++20 } }
> +// { dg-options "-O0" }
> +// { dg-timeout 10 }
> +// { dg-additional-options "-mlong-double-80" { target x86_64-*-* } }
> +// { dg-do run { target { ia32 || x86_64-*-* } } }
> +// { dg-add-options libatomic_16b }
> +
> +#include <atomic>
> +#include <testsuite_hooks.h>
> +
> +template<typename T>
> +void __attribute__((noinline,noipa))
> +fill_padding(T& f)
> +{
> +  T mask;
> +  __builtin_memset(&mask, 0xff, sizeof(T));
> +  __builtin_clear_padding(&mask);
> +  unsigned char* ptr_f = (unsigned char*)&f;
> +  unsigned char* ptr_mask = (unsigned char*)&mask;
> +  for (int i = 0; i < sizeof(T); i++)
> +  {
> +    if (ptr_mask[i] == 0x00)
> +    {
> +      ptr_f[i] = 0xff;
> +    }
> +  }
> +}
> +
> +void
> +test01()
> +{
> +  long double f = 0.5f; // long double may contains padding on X86
> +  fill_padding(f);
> +  std::atomic<long double> as{ f }; // padding cleared on constructor
> +  long double t = 1.5;
> +
> +  as.fetch_add(t);
> +  long double s = f + t;
> +  t = as.load();
> +  VERIFY(s == t); // padding ignored on float comparing
> +  fill_padding(s);
> +  VERIFY(as.compare_exchange_weak(s, f)); // padding cleared on cmpexchg
> +  fill_padding(f);
> +  VERIFY(as.compare_exchange_strong(f, t)); // padding cleared on cmpexchg
> +}
> +
> +int main()
> +{
> +  test01();
> +}
> diff --git a/libstdc++-v3/testsuite/lib/dg-options.exp
> b/libstdc++-v3/testsuite/lib/dg-options.exp
> index 850442b6b..25da20d58 100644
> --- a/libstdc++-v3/testsuite/lib/dg-options.exp
> +++ b/libstdc++-v3/testsuite/lib/dg-options.exp
> @@ -356,6 +356,28 @@ proc add_options_for_libatomic { flags } {
>      return $flags
>  }
>
> +# Add option to link to libatomic, if required for atomics on 16-byte (128-bit)
> +proc add_options_for_libatomic_16b { flags } {
> +    if { ([istarget i?86-*-*] || [istarget x86_64-*-*])
> +       } {
> + global TOOL_OPTIONS
> +
> + set link_flags ""
> + if ![is_remote host] {
> +     if [info exists TOOL_OPTIONS] {
> + set link_flags "[atomic_link_flags [get_multilibs ${TOOL_OPTIONS}]]"
> +     } else {
> + set link_flags "[atomic_link_flags [get_multilibs]]"
> +     }
> + }
> +
> + append link_flags " -latomic "
> +
> + return "$flags $link_flags"
> +    }
> +    return $flags
> +}
> +
>  # Add options to enable use of deprecated features.
>  proc add_options_for_using-deprecated { flags } {
>      return "$flags -U_GLIBCXX_USE_DEPRECATED -D_GLIBCXX_USE_DEPRECATED=1"
> --
> 2.25.1
>
>
> Xi Ruoyao <xry111@xry111.site> 于2024年1月16日周二 18:12写道：
> >
> > On Tue, 2024-01-16 at 17:53 +0800, xndcn wrote:
> > > Thanks, so I add a test: atomic_float/compare_exchange_padding.cc,
> > > which will fail due to timeout without the patch.
> >
> > Please resend in plain text instead of HTML.  Sending in HTML causes the
> > patch mangled.
> >
> > And libstdc++ patches should CC libstdc++@gcc.gnu.org, in addition to
> > gcc-patches@gcc.gnu.org.
> >
> > > ---
> > > libstdc++-v3/ChangeLog:
> > >
> > >  * include/bits/atomic_base.h: add __builtin_clear_padding in __atomic_float constructor.
> > >  * testsuite/lib/dg-options.exp: enable libatomic for IA32 and X86-64.
> > >  * testsuite/29_atomics/atomic_float/compare_exchange_padding.cc: New test.
> > > ---
> > >  libstdc++-v3/include/bits/atomic_base.h       |  7 ++-
> > >  .../atomic_float/compare_exchange_padding.cc  | 50 +++++++++++++++++++
> > >  libstdc++-v3/testsuite/lib/dg-options.exp     |  1 +
> > >  3 files changed, 57 insertions(+), 1 deletion(-)
> > >  create mode 100644 libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> > >
> > > diff --git a/libstdc++-v3/include/bits/atomic_base.h b/libstdc++-v3/include/bits/atomic_base.h
> > > index d3a2c4f3805..cbe3749e17f 100644
> > > --- a/libstdc++-v3/include/bits/atomic_base.h
> > > +++ b/libstdc++-v3/include/bits/atomic_base.h
> > > @@ -1283,7 +1283,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> > >
> > >        constexpr
> > >        __atomic_float(_Fp __t) : _M_fp(__t)
> > > -      { }
> > > +      {
> > > +#if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding)
> > > + if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Fp>())
> > > + __builtin_clear_padding(std::__addressof(_M_fp));
> > > +#endif
> > > +      }
> > >
> > >        __atomic_float(const __atomic_float&) = delete;
> > >        __atomic_float& operator=(const __atomic_float&) = delete;
> > > diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> > > new file mode 100644
> > > index 00000000000..9376ab22850
> > > --- /dev/null
> > > +++ b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> > > @@ -0,0 +1,50 @@
> > > +// { dg-do run { target c++20 } }
> > > +// { dg-options "-O0" }
> > > +// { dg-timeout 10 }
> > > +// { dg-additional-options "-mlong-double-80" { target x86_64-*-* } }
> > > +// { dg-do run { target { ia32 || x86_64-*-* } } }
> > > +// { dg-add-options libatomic }
> > > +
> > > +#include <atomic>
> > > +#include <testsuite_hooks.h>
> > > +
> > > +template<typename T>
> > > +void __attribute__((noinline,noipa))
> > > +fill_padding(T& f)
> > > +{
> > > +  T mask;
> > > +  __builtin_memset(&mask, 0xff, sizeof(T));
> > > +  __builtin_clear_padding(&mask);
> > > +  unsigned char* ptr_f = (unsigned char*)&f;
> > > +  unsigned char* ptr_mask = (unsigned char*)&mask;
> > > +  for (int i = 0; i < sizeof(T); i++)
> > > +  {
> > > +    if (ptr_mask[i] == 0x00)
> > > +    {
> > > +      ptr_f[i] = 0xff;
> > > +    }
> > > +  }
> > > +}
> > > +
> > > +void
> > > +test01()
> > > +{
> > > +  long double f = 0.5f; // long double may contains padding on X86
> > > +  fill_padding(f);
> > > +  std::atomic<long double> as{ f }; // padding cleared on constructor
> > > +  long double t = 1.5;
> > > +
> > > +  as.fetch_add(t);
> > > +  long double s = f + t;
> > > +  t = as.load();
> > > +  VERIFY(s == t); // padding ignored on float comparing
> > > +  fill_padding(s);
> > > +  VERIFY(as.compare_exchange_weak(s, f)); // padding cleared on cmpexchg
> > > +  fill_padding(f);
> > > +  VERIFY(as.compare_exchange_strong(f, t)); // padding cleared on cmpexchg
> > > +}
> > > +
> > > +int main()
> > > +{
> > > +  test01();
> > > +}
> > > diff --git a/libstdc++-v3/testsuite/lib/dg-options.exp b/libstdc++-v3/testsuite/lib/dg-options.exp
> > > index bc387d17ed7..d9a19dadd7f 100644
> > > --- a/libstdc++-v3/testsuite/lib/dg-options.exp
> > > +++ b/libstdc++-v3/testsuite/lib/dg-options.exp
> > > @@ -337,6 +337,7 @@ proc add_options_for_libatomic { flags } {
> > >    || ([istarget powerpc*-*-*] && [check_effective_target_ilp32])
> > >    || [istarget riscv*-*-*]
> > >    || ([istarget sparc*-*-linux-gnu] && [check_effective_target_ilp32])
> > > + || ([istarget i?86-*-*] || [istarget x86_64-*-*])
> >
> > This seems too overkill as "dg-add-options libatomic" is not intended to
> > handle 16-byte atomics.  Maybe we can fork this to a new dg-add-options
> > like "add_options_for_libatomic_16b"?
> >
> > >         } {
> > >   global TOOL_OPTIONS
> > >
> > > --
> > > 2.25.1
> >
> > --
> > Xi Ruoyao <xry111@xry111.site>
> > School of Aerospace Science and Technology, Xidian University

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor
  2024-01-16 16:16       ` xndcn
  2024-01-24 15:53         ` xndcn
  2024-01-30 16:08         ` [PING][PATCH] " xndcn
@ 2024-01-30 16:31         ` Jonathan Wakely
  2024-01-30 16:34         ` Jonathan Wakely
  3 siblings, 0 replies; 28+ messages in thread
From: Jonathan Wakely @ 2024-01-30 16:31 UTC (permalink / raw)
  To: xndcn; +Cc: Xi Ruoyao, H.J. Lu, GCC Patches, Jakub Jelinek, libstdc++

On Tue, 16 Jan 2024 at 16:17, xndcn <xndchn@gmail.com> wrote:
>
> Sorry about the mangled content...
> So I add a new add-options for libatomic_16b:
>
> ---
> libstdc++-v3/ChangeLog:
>
> * include/bits/atomic_base.h: add __builtin_clear_padding in
> __atomic_float constructor.
> * testsuite/lib/dg-options.exp: add new add-options for libatomic_16b.
> * testsuite/29_atomics/atomic_float/compare_exchange_padding.cc: New test.
> ---
>  libstdc++-v3/include/bits/atomic_base.h       |  7 ++-
>  .../atomic_float/compare_exchange_padding.cc  | 50 +++++++++++++++++++
>  libstdc++-v3/testsuite/lib/dg-options.exp     | 22 ++++++++
>  3 files changed, 78 insertions(+), 1 deletion(-)
>  create mode 100644
> libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
>
> diff --git a/libstdc++-v3/include/bits/atomic_base.h
> b/libstdc++-v3/include/bits/atomic_base.h
> index f4ce0fa53..104ddfdbe 100644
> --- a/libstdc++-v3/include/bits/atomic_base.h
> +++ b/libstdc++-v3/include/bits/atomic_base.h
> @@ -1283,7 +1283,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
>        constexpr
>        __atomic_float(_Fp __t) : _M_fp(__t)
> -      { }
> +      {
> +#if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding)
> + if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Fp>())

This code is not compiled for C++11 and C++14, so this should just use
constexpr not _GLIBCXX17_CONSTEXPR.

Apart from that I think this patch looks OK.


> +   __builtin_clear_padding(std::__addressof(_M_fp));
> +#endif
> +      }
>
>        __atomic_float(const __atomic_float&) = delete;
>        __atomic_float& operator=(const __atomic_float&) = delete;
> diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> new file mode 100644
> index 000000000..d538b3d55
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> @@ -0,0 +1,50 @@
> +// { dg-do run { target c++20 } }
> +// { dg-options "-O0" }
> +// { dg-timeout 10 }
> +// { dg-additional-options "-mlong-double-80" { target x86_64-*-* } }
> +// { dg-do run { target { ia32 || x86_64-*-* } } }
> +// { dg-add-options libatomic_16b }
> +
> +#include <atomic>
> +#include <testsuite_hooks.h>
> +
> +template<typename T>
> +void __attribute__((noinline,noipa))
> +fill_padding(T& f)
> +{
> +  T mask;
> +  __builtin_memset(&mask, 0xff, sizeof(T));
> +  __builtin_clear_padding(&mask);
> +  unsigned char* ptr_f = (unsigned char*)&f;
> +  unsigned char* ptr_mask = (unsigned char*)&mask;
> +  for (int i = 0; i < sizeof(T); i++)
> +  {
> +    if (ptr_mask[i] == 0x00)
> +    {
> +      ptr_f[i] = 0xff;
> +    }
> +  }
> +}
> +
> +void
> +test01()
> +{
> +  long double f = 0.5f; // long double may contains padding on X86
> +  fill_padding(f);
> +  std::atomic<long double> as{ f }; // padding cleared on constructor
> +  long double t = 1.5;
> +
> +  as.fetch_add(t);
> +  long double s = f + t;
> +  t = as.load();
> +  VERIFY(s == t); // padding ignored on float comparing
> +  fill_padding(s);
> +  VERIFY(as.compare_exchange_weak(s, f)); // padding cleared on cmpexchg
> +  fill_padding(f);
> +  VERIFY(as.compare_exchange_strong(f, t)); // padding cleared on cmpexchg
> +}
> +
> +int main()
> +{
> +  test01();
> +}
> diff --git a/libstdc++-v3/testsuite/lib/dg-options.exp
> b/libstdc++-v3/testsuite/lib/dg-options.exp
> index 850442b6b..25da20d58 100644
> --- a/libstdc++-v3/testsuite/lib/dg-options.exp
> +++ b/libstdc++-v3/testsuite/lib/dg-options.exp
> @@ -356,6 +356,28 @@ proc add_options_for_libatomic { flags } {
>      return $flags
>  }
>
> +# Add option to link to libatomic, if required for atomics on 16-byte (128-bit)
> +proc add_options_for_libatomic_16b { flags } {
> +    if { ([istarget i?86-*-*] || [istarget x86_64-*-*])
> +       } {
> + global TOOL_OPTIONS
> +
> + set link_flags ""
> + if ![is_remote host] {
> +     if [info exists TOOL_OPTIONS] {
> + set link_flags "[atomic_link_flags [get_multilibs ${TOOL_OPTIONS}]]"
> +     } else {
> + set link_flags "[atomic_link_flags [get_multilibs]]"
> +     }
> + }
> +
> + append link_flags " -latomic "
> +
> + return "$flags $link_flags"
> +    }
> +    return $flags
> +}
> +
>  # Add options to enable use of deprecated features.
>  proc add_options_for_using-deprecated { flags } {
>      return "$flags -U_GLIBCXX_USE_DEPRECATED -D_GLIBCXX_USE_DEPRECATED=1"
> --
> 2.25.1
>
>
> Xi Ruoyao <xry111@xry111.site> 于2024年1月16日周二 18:12写道：
> >
> > On Tue, 2024-01-16 at 17:53 +0800, xndcn wrote:
> > > Thanks, so I add a test: atomic_float/compare_exchange_padding.cc,
> > > which will fail due to timeout without the patch.
> >
> > Please resend in plain text instead of HTML.  Sending in HTML causes the
> > patch mangled.
> >
> > And libstdc++ patches should CC libstdc++@gcc.gnu.org, in addition to
> > gcc-patches@gcc.gnu.org.
> >
> > > ---
> > > libstdc++-v3/ChangeLog:
> > >
> > >  * include/bits/atomic_base.h: add __builtin_clear_padding in __atomic_float constructor.
> > >  * testsuite/lib/dg-options.exp: enable libatomic for IA32 and X86-64.
> > >  * testsuite/29_atomics/atomic_float/compare_exchange_padding.cc: New test.
> > > ---
> > >  libstdc++-v3/include/bits/atomic_base.h       |  7 ++-
> > >  .../atomic_float/compare_exchange_padding.cc  | 50 +++++++++++++++++++
> > >  libstdc++-v3/testsuite/lib/dg-options.exp     |  1 +
> > >  3 files changed, 57 insertions(+), 1 deletion(-)
> > >  create mode 100644 libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> > >
> > > diff --git a/libstdc++-v3/include/bits/atomic_base.h b/libstdc++-v3/include/bits/atomic_base.h
> > > index d3a2c4f3805..cbe3749e17f 100644
> > > --- a/libstdc++-v3/include/bits/atomic_base.h
> > > +++ b/libstdc++-v3/include/bits/atomic_base.h
> > > @@ -1283,7 +1283,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> > >
> > >        constexpr
> > >        __atomic_float(_Fp __t) : _M_fp(__t)
> > > -      { }
> > > +      {
> > > +#if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding)
> > > + if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Fp>())
> > > + __builtin_clear_padding(std::__addressof(_M_fp));
> > > +#endif
> > > +      }
> > >
> > >        __atomic_float(const __atomic_float&) = delete;
> > >        __atomic_float& operator=(const __atomic_float&) = delete;
> > > diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> > > new file mode 100644
> > > index 00000000000..9376ab22850
> > > --- /dev/null
> > > +++ b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> > > @@ -0,0 +1,50 @@
> > > +// { dg-do run { target c++20 } }
> > > +// { dg-options "-O0" }
> > > +// { dg-timeout 10 }
> > > +// { dg-additional-options "-mlong-double-80" { target x86_64-*-* } }
> > > +// { dg-do run { target { ia32 || x86_64-*-* } } }
> > > +// { dg-add-options libatomic }
> > > +
> > > +#include <atomic>
> > > +#include <testsuite_hooks.h>
> > > +
> > > +template<typename T>
> > > +void __attribute__((noinline,noipa))
> > > +fill_padding(T& f)
> > > +{
> > > +  T mask;
> > > +  __builtin_memset(&mask, 0xff, sizeof(T));
> > > +  __builtin_clear_padding(&mask);
> > > +  unsigned char* ptr_f = (unsigned char*)&f;
> > > +  unsigned char* ptr_mask = (unsigned char*)&mask;
> > > +  for (int i = 0; i < sizeof(T); i++)
> > > +  {
> > > +    if (ptr_mask[i] == 0x00)
> > > +    {
> > > +      ptr_f[i] = 0xff;
> > > +    }
> > > +  }
> > > +}
> > > +
> > > +void
> > > +test01()
> > > +{
> > > +  long double f = 0.5f; // long double may contains padding on X86
> > > +  fill_padding(f);
> > > +  std::atomic<long double> as{ f }; // padding cleared on constructor
> > > +  long double t = 1.5;
> > > +
> > > +  as.fetch_add(t);
> > > +  long double s = f + t;
> > > +  t = as.load();
> > > +  VERIFY(s == t); // padding ignored on float comparing
> > > +  fill_padding(s);
> > > +  VERIFY(as.compare_exchange_weak(s, f)); // padding cleared on cmpexchg
> > > +  fill_padding(f);
> > > +  VERIFY(as.compare_exchange_strong(f, t)); // padding cleared on cmpexchg
> > > +}
> > > +
> > > +int main()
> > > +{
> > > +  test01();
> > > +}
> > > diff --git a/libstdc++-v3/testsuite/lib/dg-options.exp b/libstdc++-v3/testsuite/lib/dg-options.exp
> > > index bc387d17ed7..d9a19dadd7f 100644
> > > --- a/libstdc++-v3/testsuite/lib/dg-options.exp
> > > +++ b/libstdc++-v3/testsuite/lib/dg-options.exp
> > > @@ -337,6 +337,7 @@ proc add_options_for_libatomic { flags } {
> > >    || ([istarget powerpc*-*-*] && [check_effective_target_ilp32])
> > >    || [istarget riscv*-*-*]
> > >    || ([istarget sparc*-*-linux-gnu] && [check_effective_target_ilp32])
> > > + || ([istarget i?86-*-*] || [istarget x86_64-*-*])
> >
> > This seems too overkill as "dg-add-options libatomic" is not intended to
> > handle 16-byte atomics.  Maybe we can fork this to a new dg-add-options
> > like "add_options_for_libatomic_16b"?
> >
> > >         } {
> > >   global TOOL_OPTIONS
> > >
> > > --
> > > 2.25.1
> >
> > --
> > Xi Ruoyao <xry111@xry111.site>
> > School of Aerospace Science and Technology, Xidian University
>


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor
  2024-01-16 16:16       ` xndcn
                           ` (2 preceding siblings ...)
  2024-01-30 16:31         ` [PATCH] " Jonathan Wakely
@ 2024-01-30 16:34         ` Jonathan Wakely
  2024-01-30 16:50           ` Jakub Jelinek
  3 siblings, 1 reply; 28+ messages in thread
From: Jonathan Wakely @ 2024-01-30 16:34 UTC (permalink / raw)
  To: xndcn; +Cc: Xi Ruoyao, H.J. Lu, GCC Patches, Jakub Jelinek, libstdc++

On Tue, 16 Jan 2024 at 16:17, xndcn <xndchn@gmail.com> wrote:
>
> Sorry about the mangled content...
> So I add a new add-options for libatomic_16b:
>
> ---
> libstdc++-v3/ChangeLog:
>
> * include/bits/atomic_base.h: add __builtin_clear_padding in
> __atomic_float constructor.
> * testsuite/lib/dg-options.exp: add new add-options for libatomic_16b.
> * testsuite/29_atomics/atomic_float/compare_exchange_padding.cc: New test.
> ---
>  libstdc++-v3/include/bits/atomic_base.h       |  7 ++-
>  .../atomic_float/compare_exchange_padding.cc  | 50 +++++++++++++++++++
>  libstdc++-v3/testsuite/lib/dg-options.exp     | 22 ++++++++
>  3 files changed, 78 insertions(+), 1 deletion(-)
>  create mode 100644
> libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
>
> diff --git a/libstdc++-v3/include/bits/atomic_base.h
> b/libstdc++-v3/include/bits/atomic_base.h
> index f4ce0fa53..104ddfdbe 100644
> --- a/libstdc++-v3/include/bits/atomic_base.h
> +++ b/libstdc++-v3/include/bits/atomic_base.h
> @@ -1283,7 +1283,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
>        constexpr
>        __atomic_float(_Fp __t) : _M_fp(__t)
> -      { }
> +      {
> +#if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding)
> + if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Fp>())
> +   __builtin_clear_padding(std::__addressof(_M_fp));
> +#endif
> +      }
>
>        __atomic_float(const __atomic_float&) = delete;
>        __atomic_float& operator=(const __atomic_float&) = delete;
> diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> new file mode 100644
> index 000000000..d538b3d55
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> @@ -0,0 +1,50 @@
> +// { dg-do run { target c++20 } }
> +// { dg-options "-O0" }
> +// { dg-timeout 10 }
> +// { dg-additional-options "-mlong-double-80" { target x86_64-*-* } }
> +// { dg-do run { target { ia32 || x86_64-*-* } } }
> +// { dg-add-options libatomic_16b }

Why not just use -latomic unconditionally here?

You're adding a new libatomic_16b option that expands to -latomic for
x86 and nothing otherwise, and then using it in a test that only runs
for x86. So it's just adding -latomic to all targets that run the
test, right?

Also, this patch needs either a copyright assignment or a DCO sign-off:
https://gcc.gnu.org/contribute.html#legal


> +
> +#include <atomic>
> +#include <testsuite_hooks.h>
> +
> +template<typename T>
> +void __attribute__((noinline,noipa))
> +fill_padding(T& f)
> +{
> +  T mask;
> +  __builtin_memset(&mask, 0xff, sizeof(T));
> +  __builtin_clear_padding(&mask);
> +  unsigned char* ptr_f = (unsigned char*)&f;
> +  unsigned char* ptr_mask = (unsigned char*)&mask;
> +  for (int i = 0; i < sizeof(T); i++)
> +  {
> +    if (ptr_mask[i] == 0x00)
> +    {
> +      ptr_f[i] = 0xff;
> +    }
> +  }
> +}
> +
> +void
> +test01()
> +{
> +  long double f = 0.5f; // long double may contains padding on X86
> +  fill_padding(f);
> +  std::atomic<long double> as{ f }; // padding cleared on constructor
> +  long double t = 1.5;
> +
> +  as.fetch_add(t);
> +  long double s = f + t;
> +  t = as.load();
> +  VERIFY(s == t); // padding ignored on float comparing
> +  fill_padding(s);
> +  VERIFY(as.compare_exchange_weak(s, f)); // padding cleared on cmpexchg
> +  fill_padding(f);
> +  VERIFY(as.compare_exchange_strong(f, t)); // padding cleared on cmpexchg
> +}
> +
> +int main()
> +{
> +  test01();
> +}
> diff --git a/libstdc++-v3/testsuite/lib/dg-options.exp
> b/libstdc++-v3/testsuite/lib/dg-options.exp
> index 850442b6b..25da20d58 100644
> --- a/libstdc++-v3/testsuite/lib/dg-options.exp
> +++ b/libstdc++-v3/testsuite/lib/dg-options.exp
> @@ -356,6 +356,28 @@ proc add_options_for_libatomic { flags } {
>      return $flags
>  }
>
> +# Add option to link to libatomic, if required for atomics on 16-byte (128-bit)
> +proc add_options_for_libatomic_16b { flags } {
> +    if { ([istarget i?86-*-*] || [istarget x86_64-*-*])
> +       } {
> + global TOOL_OPTIONS
> +
> + set link_flags ""
> + if ![is_remote host] {
> +     if [info exists TOOL_OPTIONS] {
> + set link_flags "[atomic_link_flags [get_multilibs ${TOOL_OPTIONS}]]"
> +     } else {
> + set link_flags "[atomic_link_flags [get_multilibs]]"
> +     }
> + }
> +
> + append link_flags " -latomic "
> +
> + return "$flags $link_flags"
> +    }
> +    return $flags
> +}
> +
>  # Add options to enable use of deprecated features.
>  proc add_options_for_using-deprecated { flags } {
>      return "$flags -U_GLIBCXX_USE_DEPRECATED -D_GLIBCXX_USE_DEPRECATED=1"
> --
> 2.25.1
>
>
> Xi Ruoyao <xry111@xry111.site> 于2024年1月16日周二 18:12写道：
> >
> > On Tue, 2024-01-16 at 17:53 +0800, xndcn wrote:
> > > Thanks, so I add a test: atomic_float/compare_exchange_padding.cc,
> > > which will fail due to timeout without the patch.
> >
> > Please resend in plain text instead of HTML.  Sending in HTML causes the
> > patch mangled.
> >
> > And libstdc++ patches should CC libstdc++@gcc.gnu.org, in addition to
> > gcc-patches@gcc.gnu.org.
> >
> > > ---
> > > libstdc++-v3/ChangeLog:
> > >
> > >  * include/bits/atomic_base.h: add __builtin_clear_padding in __atomic_float constructor.
> > >  * testsuite/lib/dg-options.exp: enable libatomic for IA32 and X86-64.
> > >  * testsuite/29_atomics/atomic_float/compare_exchange_padding.cc: New test.
> > > ---
> > >  libstdc++-v3/include/bits/atomic_base.h       |  7 ++-
> > >  .../atomic_float/compare_exchange_padding.cc  | 50 +++++++++++++++++++
> > >  libstdc++-v3/testsuite/lib/dg-options.exp     |  1 +
> > >  3 files changed, 57 insertions(+), 1 deletion(-)
> > >  create mode 100644 libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> > >
> > > diff --git a/libstdc++-v3/include/bits/atomic_base.h b/libstdc++-v3/include/bits/atomic_base.h
> > > index d3a2c4f3805..cbe3749e17f 100644
> > > --- a/libstdc++-v3/include/bits/atomic_base.h
> > > +++ b/libstdc++-v3/include/bits/atomic_base.h
> > > @@ -1283,7 +1283,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> > >
> > >        constexpr
> > >        __atomic_float(_Fp __t) : _M_fp(__t)
> > > -      { }
> > > +      {
> > > +#if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding)
> > > + if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Fp>())
> > > + __builtin_clear_padding(std::__addressof(_M_fp));
> > > +#endif
> > > +      }
> > >
> > >        __atomic_float(const __atomic_float&) = delete;
> > >        __atomic_float& operator=(const __atomic_float&) = delete;
> > > diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> > > new file mode 100644
> > > index 00000000000..9376ab22850
> > > --- /dev/null
> > > +++ b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> > > @@ -0,0 +1,50 @@
> > > +// { dg-do run { target c++20 } }
> > > +// { dg-options "-O0" }
> > > +// { dg-timeout 10 }
> > > +// { dg-additional-options "-mlong-double-80" { target x86_64-*-* } }
> > > +// { dg-do run { target { ia32 || x86_64-*-* } } }
> > > +// { dg-add-options libatomic }
> > > +
> > > +#include <atomic>
> > > +#include <testsuite_hooks.h>
> > > +
> > > +template<typename T>
> > > +void __attribute__((noinline,noipa))
> > > +fill_padding(T& f)
> > > +{
> > > +  T mask;
> > > +  __builtin_memset(&mask, 0xff, sizeof(T));
> > > +  __builtin_clear_padding(&mask);
> > > +  unsigned char* ptr_f = (unsigned char*)&f;
> > > +  unsigned char* ptr_mask = (unsigned char*)&mask;
> > > +  for (int i = 0; i < sizeof(T); i++)
> > > +  {
> > > +    if (ptr_mask[i] == 0x00)
> > > +    {
> > > +      ptr_f[i] = 0xff;
> > > +    }
> > > +  }
> > > +}
> > > +
> > > +void
> > > +test01()
> > > +{
> > > +  long double f = 0.5f; // long double may contains padding on X86
> > > +  fill_padding(f);
> > > +  std::atomic<long double> as{ f }; // padding cleared on constructor
> > > +  long double t = 1.5;
> > > +
> > > +  as.fetch_add(t);
> > > +  long double s = f + t;
> > > +  t = as.load();
> > > +  VERIFY(s == t); // padding ignored on float comparing
> > > +  fill_padding(s);
> > > +  VERIFY(as.compare_exchange_weak(s, f)); // padding cleared on cmpexchg
> > > +  fill_padding(f);
> > > +  VERIFY(as.compare_exchange_strong(f, t)); // padding cleared on cmpexchg
> > > +}
> > > +
> > > +int main()
> > > +{
> > > +  test01();
> > > +}
> > > diff --git a/libstdc++-v3/testsuite/lib/dg-options.exp b/libstdc++-v3/testsuite/lib/dg-options.exp
> > > index bc387d17ed7..d9a19dadd7f 100644
> > > --- a/libstdc++-v3/testsuite/lib/dg-options.exp
> > > +++ b/libstdc++-v3/testsuite/lib/dg-options.exp
> > > @@ -337,6 +337,7 @@ proc add_options_for_libatomic { flags } {
> > >    || ([istarget powerpc*-*-*] && [check_effective_target_ilp32])
> > >    || [istarget riscv*-*-*]
> > >    || ([istarget sparc*-*-linux-gnu] && [check_effective_target_ilp32])
> > > + || ([istarget i?86-*-*] || [istarget x86_64-*-*])
> >
> > This seems too overkill as "dg-add-options libatomic" is not intended to
> > handle 16-byte atomics.  Maybe we can fork this to a new dg-add-options
> > like "add_options_for_libatomic_16b"?
> >
> > >         } {
> > >   global TOOL_OPTIONS
> > >
> > > --
> > > 2.25.1
> >
> > --
> > Xi Ruoyao <xry111@xry111.site>
> > School of Aerospace Science and Technology, Xidian University
>


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor
  2024-01-30 16:34         ` Jonathan Wakely
@ 2024-01-30 16:50           ` Jakub Jelinek
  2024-01-31 17:19             ` xndcn
  0 siblings, 1 reply; 28+ messages in thread
From: Jakub Jelinek @ 2024-01-30 16:50 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: xndcn, Xi Ruoyao, H.J. Lu, GCC Patches, libstdc++

On Tue, Jan 30, 2024 at 04:34:30PM +0000, Jonathan Wakely wrote:
> > --- /dev/null
> > +++ b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> > @@ -0,0 +1,50 @@
> > +// { dg-do run { target c++20 } }
> > +// { dg-options "-O0" }
> > +// { dg-timeout 10 }

Why dg-timeout?
I don't see it used in any of the libstdc++ tests and I don't see anything
expensive on this test.  Just leave it out.

> > +// { dg-additional-options "-mlong-double-80" { target x86_64-*-* } }
> > +// { dg-do run { target { ia32 || x86_64-*-* } } }

Also, the target selectors above look wrong.
Generally, one should use { i?86-*-* x86_64-*-* } because e.g. on some
OSes like Solaris, one builds i?86 compiler which supports -m32 or -m64
options.
I wouldn't add the -mlong-double-80 at all, it is an ABI changing option,
if you rely in the test on some 80-bit long double properties (but on which
ones?  Sure, if -mlong-double-64 or -mlong-double-128 is in effect, the
type won't have any padding, but __builtin_clear_padding should reflect
that), it is better to guard the part of the test with some checks, whether
std::numeric_limits<long double>::digits == 64, or
preprocessor __LDBL_MANT_DIG__ == 64.

> > +// { dg-add-options libatomic_16b }
> 
> Why not just use -latomic unconditionally here?
> 
> You're adding a new libatomic_16b option that expands to -latomic for
> x86 and nothing otherwise, and then using it in a test that only runs
> for x86. So it's just adding -latomic to all targets that run the
> test, right?
> 
> Also, this patch needs either a copyright assignment or a DCO sign-off:
> https://gcc.gnu.org/contribute.html#legal

And if somebody would need to commit it for you, also patch in non-garbled
state (e.g. sent as an attachment, or even compressed attachment if using
really bad MUAs).

	Jakub

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor
  2024-01-30 16:50           ` Jakub Jelinek
@ 2024-01-31 17:19             ` xndcn
  2024-02-01 13:54               ` Jonathan Wakely
  0 siblings, 1 reply; 28+ messages in thread
From: xndcn @ 2024-01-31 17:19 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Jonathan Wakely, Xi Ruoyao, H.J. Lu, GCC Patches, libstdc++

Thanks!

> Why not just use -latomic unconditionally here?
testsuites of libstdc++ seems to have some tricks to find and link
libatomic.a by using "dg-add-options libatomic", which do nothing for
x86 target. In previous email, we decide not to pollute the current
option, so we add a new libatomic_16b option here.

> Why dg-timeout?
without the patch, "as.fetch_add(t);" will result in an infinite loop,
so I add timeout to help it escape. Should I keep it or not?

> Also, the target selectors above look wrong.
Thanks, fixed with checking "std::numeric_limits<long double>::digits == 64".

> This code is not compiled for C++11 and C++14, so this should just use
> constexpr not _GLIBCXX17_CONSTEXPR.
Thanks, fixed with "constexpr".

So here is the fixed patch, please review it again, thanks:
---
libstdc++-v3/ChangeLog:

 * include/bits/atomic_base.h: add __builtin_clear_padding in
__atomic_float constructor.
 * testsuite/lib/dg-options.exp: add new add-options for libatomic_16b.
 * testsuite/29_atomics/atomic_float/compare_exchange_padding.cc: New test.

Signed-off-by: xndcn <xndchn@gmail.com>
---
 libstdc++-v3/include/bits/atomic_base.h       |  7 ++-
 .../atomic_float/compare_exchange_padding.cc  | 54 +++++++++++++++++++
 libstdc++-v3/testsuite/lib/dg-options.exp     | 22 ++++++++
 3 files changed, 82 insertions(+), 1 deletion(-)
 create mode 100644
libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc

diff --git a/libstdc++-v3/include/bits/atomic_base.h
b/libstdc++-v3/include/bits/atomic_base.h
index f4ce0fa53..90e3f0e4b 100644
--- a/libstdc++-v3/include/bits/atomic_base.h
+++ b/libstdc++-v3/include/bits/atomic_base.h
@@ -1283,7 +1283,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

       constexpr
       __atomic_float(_Fp __t) : _M_fp(__t)
-      { }
+      {
+#if __has_builtin(__builtin_clear_padding)
+ if constexpr (__atomic_impl::__maybe_has_padding<_Fp>())
+   __builtin_clear_padding(std::__addressof(_M_fp));
+#endif
+      }

       __atomic_float(const __atomic_float&) = delete;
       __atomic_float& operator=(const __atomic_float&) = delete;
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
new file mode 100644
index 000000000..e6e17e4b5
--- /dev/null
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
@@ -0,0 +1,54 @@
+// { dg-do run { target c++20 } }
+// { dg-options "-O0" }
+// { dg-timeout 10 }
+// { dg-do run { target { i?86-*-* x86_64-*-* } } }
+// { dg-add-options libatomic_16b }
+
+#include <atomic>
+#include <limits>
+#include <testsuite_hooks.h>
+
+template<typename T>
+void __attribute__((noinline,noipa))
+fill_padding(T& f)
+{
+  T mask;
+  __builtin_memset(&mask, 0xff, sizeof(T));
+  __builtin_clear_padding(&mask);
+  unsigned char* ptr_f = (unsigned char*)&f;
+  unsigned char* ptr_mask = (unsigned char*)&mask;
+  for (int i = 0; i < sizeof(T); i++)
+  {
+    if (ptr_mask[i] == 0x00)
+    {
+      ptr_f[i] = 0xff;
+    }
+  }
+}
+
+void
+test01()
+{
+  // test for long double with padding (float80)
+  if constexpr (std::numeric_limits<long double>::digits == 64)
+  {
+    long double f = 0.5f; // long double may contains padding on X86
+    fill_padding(f);
+    std::atomic<long double> as{ f }; // padding cleared on constructor
+    long double t = 1.5;
+
+    as.fetch_add(t);
+    long double s = f + t;
+    t = as.load();
+    VERIFY(s == t); // padding ignored on float comparing
+    fill_padding(s);
+    VERIFY(as.compare_exchange_weak(s, f)); // padding cleared on cmpexchg
+    fill_padding(f);
+    VERIFY(as.compare_exchange_strong(f, t)); // padding cleared on cmpexchg
+  }
+}
+
+int main()
+{
+  test01();
+}
diff --git a/libstdc++-v3/testsuite/lib/dg-options.exp
b/libstdc++-v3/testsuite/lib/dg-options.exp
index 850442b6b..75b27a136 100644
--- a/libstdc++-v3/testsuite/lib/dg-options.exp
+++ b/libstdc++-v3/testsuite/lib/dg-options.exp
@@ -356,6 +356,28 @@ proc add_options_for_libatomic { flags } {
     return $flags
 }

+# Add option to link to libatomic, if required for atomics on 16-byte (128-bit)
+proc add_options_for_libatomic_16b { flags } {
+    if { ([istarget i?86-*-*] || [istarget x86_64-*-*])
+       } {
+ global TOOL_OPTIONS
+
+ set link_flags ""
+ if ![is_remote host] {
+     if [info exists TOOL_OPTIONS] {
+ set link_flags "[atomic_link_flags [get_multilibs ${TOOL_OPTIONS}]]"
+     } else {
+ set link_flags "[atomic_link_flags [get_multilibs]]"
+     }
+ }
+
+ append link_flags " -latomic "
+
+ return "$flags $link_flags"
+    }
+    return $flags
+}
+
 # Add options to enable use of deprecated features.
 proc add_options_for_using-deprecated { flags } {
     return "$flags -U_GLIBCXX_USE_DEPRECATED -D_GLIBCXX_USE_DEPRECATED=1"
-- 
2.25.1

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor
  2024-01-31 17:19             ` xndcn
@ 2024-02-01 13:54               ` Jonathan Wakely
  2024-02-02 16:52                 ` xndcn
  0 siblings, 1 reply; 28+ messages in thread
From: Jonathan Wakely @ 2024-02-01 13:54 UTC (permalink / raw)
  To: xndcn; +Cc: Jakub Jelinek, Xi Ruoyao, H.J. Lu, GCC Patches, libstdc++

On Wed, 31 Jan 2024 at 17:20, xndcn <xndchn@gmail.com> wrote:
>
> Thanks!
>
> > Why not just use -latomic unconditionally here?
> testsuites of libstdc++ seems to have some tricks to find and link
> libatomic.a by using "dg-add-options libatomic", which do nothing for
> x86 target. In previous email, we decide not to pollute the current
> option, so we add a new libatomic_16b option here.

Right, we don't want to change the existing libatomic option. But we
don't need a new one if it's going to be used in exactly one test and
if the new option does the same thing for all targets that run the
test.

However, on that subject, does the test really need to be restricted
to x86? It shouldn't loop on any target, right?

>
> > Why dg-timeout?
> without the patch, "as.fetch_add(t);" will result in an infinite loop,
> so I add timeout to help it escape. Should I keep it or not?

No, because the patch is supposed to prevent the infinite loop, and so
there's no need to stop it looping after 10s. It won't loop at all.

>
> > Also, the target selectors above look wrong.
> Thanks, fixed with checking "std::numeric_limits<long double>::digits == 64".
>
> > This code is not compiled for C++11 and C++14, so this should just use
> > constexpr not _GLIBCXX17_CONSTEXPR.
> Thanks, fixed with "constexpr".

Actually, why are you using the built-in directly in the first place? See below.

>
> So here is the fixed patch, please review it again, thanks:
> ---
> libstdc++-v3/ChangeLog:
>
>  * include/bits/atomic_base.h: add __builtin_clear_padding in
> __atomic_float constructor.
>  * testsuite/lib/dg-options.exp: add new add-options for libatomic_16b.
>  * testsuite/29_atomics/atomic_float/compare_exchange_padding.cc: New test.
>
> Signed-off-by: xndcn <xndchn@gmail.com>
> ---
>  libstdc++-v3/include/bits/atomic_base.h       |  7 ++-
>  .../atomic_float/compare_exchange_padding.cc  | 54 +++++++++++++++++++
>  libstdc++-v3/testsuite/lib/dg-options.exp     | 22 ++++++++
>  3 files changed, 82 insertions(+), 1 deletion(-)
>  create mode 100644
> libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
>
> diff --git a/libstdc++-v3/include/bits/atomic_base.h
> b/libstdc++-v3/include/bits/atomic_base.h
> index f4ce0fa53..90e3f0e4b 100644
> --- a/libstdc++-v3/include/bits/atomic_base.h
> +++ b/libstdc++-v3/include/bits/atomic_base.h
> @@ -1283,7 +1283,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
>        constexpr
>        __atomic_float(_Fp __t) : _M_fp(__t)
> -      { }
> +      {
> +#if __has_builtin(__builtin_clear_padding)
> + if constexpr (__atomic_impl::__maybe_has_padding<_Fp>())
> +   __builtin_clear_padding(std::__addressof(_M_fp));
> +#endif
> +      }

Why can't this be simply:

        { __atomic_impl::__clear_padding(_M_fp); }

?

We only need to clear padding for long double, not float and double, right?


>
>        __atomic_float(const __atomic_float&) = delete;
>        __atomic_float& operator=(const __atomic_float&) = delete;
> diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> new file mode 100644
> index 000000000..e6e17e4b5
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> @@ -0,0 +1,54 @@
> +// { dg-do run { target c++20 } }
> +// { dg-options "-O0" }
> +// { dg-timeout 10 }
> +// { dg-do run { target { i?86-*-* x86_64-*-* } } }

Why can't we run this on all targets?

> +// { dg-add-options libatomic_16b }

Let's just add -latomic to the dg-options line for this one test. We
don't want that in general because we want to know that atomics work
without it in most cases. It should be fine to use it for one test.

> +
> +#include <atomic>
> +#include <limits>
> +#include <testsuite_hooks.h>
> +
> +template<typename T>
> +void __attribute__((noinline,noipa))
> +fill_padding(T& f)
> +{
> +  T mask;
> +  __builtin_memset(&mask, 0xff, sizeof(T));

There's no reason to use __builtin_memset here, just include <cstring>
and use std::memcpy.


> +  __builtin_clear_padding(&mask);
> +  unsigned char* ptr_f = (unsigned char*)&f;
> +  unsigned char* ptr_mask = (unsigned char*)&mask;
> +  for (int i = 0; i < sizeof(T); i++)
> +  {
> +    if (ptr_mask[i] == 0x00)
> +    {
> +      ptr_f[i] = 0xff;
> +    }
> +  }
> +}
> +
> +void
> +test01()
> +{
> +  // test for long double with padding (float80)
> +  if constexpr (std::numeric_limits<long double>::digits == 64)
> +  {
> +    long double f = 0.5f; // long double may contains padding on X86

It definitely does have padding, just say "long double has padding bits on x86"

> +    fill_padding(f);
> +    std::atomic<long double> as{ f }; // padding cleared on constructor
> +    long double t = 1.5;
> +
> +    as.fetch_add(t);
> +    long double s = f + t;
> +    t = as.load();
> +    VERIFY(s == t); // padding ignored on float comparing
> +    fill_padding(s);
> +    VERIFY(as.compare_exchange_weak(s, f)); // padding cleared on cmpexchg
> +    fill_padding(f);
> +    VERIFY(as.compare_exchange_strong(f, t)); // padding cleared on cmpexchg
> +  }
> +}
> +
> +int main()
> +{
> +  test01();
> +}
> diff --git a/libstdc++-v3/testsuite/lib/dg-options.exp
> b/libstdc++-v3/testsuite/lib/dg-options.exp
> index 850442b6b..75b27a136 100644
> --- a/libstdc++-v3/testsuite/lib/dg-options.exp
> +++ b/libstdc++-v3/testsuite/lib/dg-options.exp
> @@ -356,6 +356,28 @@ proc add_options_for_libatomic { flags } {
>      return $flags
>  }
>
> +# Add option to link to libatomic, if required for atomics on 16-byte (128-bit)
> +proc add_options_for_libatomic_16b { flags } {
> +    if { ([istarget i?86-*-*] || [istarget x86_64-*-*])
> +       } {
> + global TOOL_OPTIONS
> +
> + set link_flags ""
> + if ![is_remote host] {
> +     if [info exists TOOL_OPTIONS] {
> + set link_flags "[atomic_link_flags [get_multilibs ${TOOL_OPTIONS}]]"
> +     } else {
> + set link_flags "[atomic_link_flags [get_multilibs]]"
> +     }
> + }
> +
> + append link_flags " -latomic "
> +
> + return "$flags $link_flags"
> +    }
> +    return $flags
> +}
> +
>  # Add options to enable use of deprecated features.
>  proc add_options_for_using-deprecated { flags } {
>      return "$flags -U_GLIBCXX_USE_DEPRECATED -D_GLIBCXX_USE_DEPRECATED=1"
> --
> 2.25.1
>


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor
  2024-02-01 13:54               ` Jonathan Wakely
@ 2024-02-02 16:52                 ` xndcn
  2024-02-16 12:38                   ` Jonathan Wakely
  0 siblings, 1 reply; 28+ messages in thread
From: xndcn @ 2024-02-02 16:52 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: Jakub Jelinek, Xi Ruoyao, H.J. Lu, GCC Patches, libstdc++

Thank you for your careful review!

> But we don't need a new one if it's going to be used in exactly one test and if the new option does the same thing for all targets that run the test.
Got it, thanks. Now add option "-latomic" directly, but it still rely
on the trick "[atomic_link_flags [get_multilibs]]"

> No, because the patch is supposed to prevent the infinite loop, and so there's no need to stop it looping after 10s. It won't loop at all.
Thanks, deleted.

> We only need to clear padding for long double, not float and double, right?
Yes, actually there is a check "if constexpr
(__atomic_impl::__maybe_has_padding<_Fp>())".
But "__atomic_impl::__clear_padding(_M_fp); " is indeed simply, so fixed here.

> Why can't we run this on all targets?
Got it, now target option deleted.

> There's no reason to use __builtin_memset here, just include <cstring> and use std::memcpy.
Thanks, fixed.

> It definitely does have padding, just say "long double has padding bits on x86"
Thanks, fixed.

So here comes the latest patch:
---
libstdc++-v3/ChangeLog:

 * include/bits/atomic_base.h: add __builtin_clear_padding in
__atomic_float constructor.
 * testsuite/29_atomics/atomic_float/compare_exchange_padding.cc: New test.

Signed-off-by: xndcn <xndchn@gmail.com>
---
 libstdc++-v3/include/bits/atomic_base.h       |  2 +-
 .../atomic_float/compare_exchange_padding.cc  | 53 +++++++++++++++++++
 2 files changed, 54 insertions(+), 1 deletion(-)
 create mode 100644
libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc

diff --git a/libstdc++-v3/include/bits/atomic_base.h
b/libstdc++-v3/include/bits/atomic_base.h
index f4ce0fa53..cdd6f07da 100644
--- a/libstdc++-v3/include/bits/atomic_base.h
+++ b/libstdc++-v3/include/bits/atomic_base.h
@@ -1283,7 +1283,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

       constexpr
       __atomic_float(_Fp __t) : _M_fp(__t)
-      { }
+      { __atomic_impl::__clear_padding(_M_fp); }

       __atomic_float(const __atomic_float&) = delete;
       __atomic_float& operator=(const __atomic_float&) = delete;
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
new file mode 100644
index 000000000..eeace251c
--- /dev/null
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
@@ -0,0 +1,53 @@
+// { dg-do run { target c++20 } }
+// { dg-options "-O0" }
+// { dg-additional-options "[atomic_link_flags [get_multilibs]] -latomic" }
+
+#include <atomic>
+#include <cstring>
+#include <limits>
+#include <testsuite_hooks.h>
+
+template<typename T>
+void __attribute__((noinline,noipa))
+fill_padding(T& f)
+{
+  T mask;
+  std::memset(&mask, 0xff, sizeof(T));
+  __builtin_clear_padding(&mask);
+  unsigned char* ptr_f = (unsigned char*)&f;
+  unsigned char* ptr_mask = (unsigned char*)&mask;
+  for (int i = 0; i < sizeof(T); i++)
+  {
+    if (ptr_mask[i] == 0x00)
+    {
+      ptr_f[i] = 0xff;
+    }
+  }
+}
+
+void
+test01()
+{
+  // test for long double with padding (float80)
+  if constexpr (std::numeric_limits<long double>::digits == 64)
+  {
+    long double f = 0.5f; // long double has padding bits on x86
+    fill_padding(f);
+    std::atomic<long double> as{ f }; // padding cleared on constructor
+    long double t = 1.5;
+
+    as.fetch_add(t);
+    long double s = f + t;
+    t = as.load();
+    VERIFY(s == t); // padding ignored on float comparing
+    fill_padding(s);
+    VERIFY(as.compare_exchange_weak(s, f)); // padding cleared on cmpexchg
+    fill_padding(f);
+    VERIFY(as.compare_exchange_strong(f, t)); // padding cleared on cmpexchg
+  }
+}
+
+int main()
+{
+  test01();
+}
-- 
2.25.1

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor
  2024-02-02 16:52                 ` xndcn
@ 2024-02-16 12:38                   ` Jonathan Wakely
  2024-02-16 13:51                     ` Jonathan Wakely
  0 siblings, 1 reply; 28+ messages in thread
From: Jonathan Wakely @ 2024-02-16 12:38 UTC (permalink / raw)
  To: xndcn; +Cc: Jakub Jelinek, Xi Ruoyao, H.J. Lu, GCC Patches, libstdc++

On Fri, 2 Feb 2024 at 16:52, xndcn <xndchn@gmail.com> wrote:
>
> Thank you for your careful review!
>
> > But we don't need a new one if it's going to be used in exactly one test and if the new option does the same thing for all targets that run the test.
> Got it, thanks. Now add option "-latomic" directly, but it still rely
> on the trick "[atomic_link_flags [get_multilibs]]"
>
> > No, because the patch is supposed to prevent the infinite loop, and so there's no need to stop it looping after 10s. It won't loop at all.
> Thanks, deleted.
>
> > We only need to clear padding for long double, not float and double, right?
> Yes, actually there is a check "if constexpr
> (__atomic_impl::__maybe_has_padding<_Fp>())".
> But "__atomic_impl::__clear_padding(_M_fp); " is indeed simply, so fixed here.
>
> > Why can't we run this on all targets?
> Got it, now target option deleted.
>
> > There's no reason to use __builtin_memset here, just include <cstring> and use std::memcpy.
> Thanks, fixed.
>
> > It definitely does have padding, just say "long double has padding bits on x86"
> Thanks, fixed.
>
> So here comes the latest patch:


Thanks. I've applied the patch to my tree, but the new test fails
pretty reliably.

The infinite loop in std::atomic<long double>::fetch_add is fixed by
clearing padding in the constructor, but the test fails on the
compare_exchange_weak or compare_exchange_strong lines here:


> +    as.fetch_add(t);
> +    long double s = f + t;
> +    t = as.load();
> +    VERIFY(s == t); // padding ignored on float comparing
> +    fill_padding(s);
> +    VERIFY(as.compare_exchange_weak(s, f)); // padding cleared on cmpexchg
> +    fill_padding(f);
> +    VERIFY(as.compare_exchange_strong(f, t)); // padding cleared on cmpexchg
>


I think the problem is here in __atomic_impl::__compare_exchange:

   if (__atomic_compare_exchange(__pval, __pexp, __pi,
                                          __is_weak, int(__s), int(__f)))
     return true;

Even though padding in *__pexp and *__pi has been cleared, the value
of *__pval after a successful __atomic_compare_exchange has non-zero
padding. That means that the next compare_exchange will fail, because
we assume that the stored value always has zeroed padding bits.

Here's a gdb session showing that __atomic_compare_exchange stores a
value with non-zero padding:

Breakpoint 2, test01 () at compare_exchange_padding.cc:43
43        long double s2 = s;
(gdb) n
44        fill_padding(s2);
(gdb)
45        while (!as.compare_exchange_weak(s2, f)) // padding cleared
on compexchg
(gdb) p/x as._M_fp
$11 = 0x40008000000000000000
(gdb) step
std::__atomic_float<long double>::compare_exchange_weak
(this=0x7fffffffd8c0, __expected=@0x7fffffffd8a0: 2, __desired=0.5,
    __order=std::memory_order::seq_cst) at
/home/jwakely/gcc/14/include/c++/14.0.1/bits/atomic_base.h:1387
1387            return compare_exchange_weak(__expected, __desired, __order,
(gdb) step
std::__atomic_float<long double>::compare_exchange_weak
(this=0x7fffffffd8c0, __expected=@0x7fffffffd8a0: 2, __desired=0.5,
    __success=std::memory_order::seq_cst,
__failure=std::memory_order::seq_cst) at
/home/jwakely/gcc/14/include/c++/14.0.1/bits/atomic_base.h:1347
1347            return __atomic_impl::compare_exchange_weak(&_M_fp,
(gdb) step
std::__atomic_impl::compare_exchange_weak<false, long double>
(__check_padding=false, __failure=std::memory_order::seq_cst,
    __success=std::memory_order::seq_cst, __desired=0.5,
__expected=@0x7fffffffd8a0: 2, __ptr=0x7fffffffd8c0)
    at /home/jwakely/gcc/14/include/c++/14.0.1/bits/atomic_base.h:1123
1123            return __atomic_impl::__compare_exchange<_AtomicRef>(
(gdb)
std::__atomic_impl::__compare_exchange<false, long double>
(__f=std::memory_order::seq_cst, __s=std::memory_order::seq_cst,
__is_weak=true,
    __i=<optimized out>, __e=@0x7fffffffd8a0: 2,
__val=@0x7fffffffd8c0: 2) at
/home/jwakely/gcc/14/include/c++/14.0.1/bits/atomic_base.h:994
994             __glibcxx_assert(__is_valid_cmpexch_failure_order(__f));
(gdb) n
997             _Tp* const __pval = std::__addressof(__val);
(gdb)
1008                _Vp* const __pi = __atomic_impl::__clear_padding(__i);
(gdb)
1010                _Vp __exp = __e;
(gdb)
1012                _Vp* const __pexp = __atomic_impl::__clear_padding(__exp);
(gdb)
1016                if (__atomic_compare_exchange(__pval, __pexp, __pi,
(gdb) p/x *__pval
$12 = 0x40008000000000000000
(gdb) p/x *__pexp
$13 = 0x40008000000000000000
(gdb) p/x *__pi
$14 = 0x3ffe8000000000000000
(gdb) n
1018                  return true;
(gdb) p/x *__pval
$15 = 0x7ffff7bf3ffe8000000000000000
(gdb)

We stored *__pi which has zero padding, but the result in *__pval has
non-zero padding. This doesn't seem to be gdb being misleading by
loading *__pval into a FP register which doesn't preserve the zero
padding, because if I do this then it fails:

  as.fetch_add(t);
  VERIFY(as.load() == s);
  __builtin_clear_padding(&s);
  VERIFY( std::memcmp(&s, &as, sizeof(s)) == 0 );

So the value stored by fetch_add (which uses compare_exchange_weak in
a loop) really doesn't have zero padding bits.

I'm not sure what's going on here yet, maybe __atomic_compare_exchange
recognizes that the pointers are long double* and so only copies the
non-padding bits into the value?

I think I'll push the fix to the __atomic_float constructor, but with
a simplified test case that doesn't include the
compare_exchange_{weak,strong} calls, until I figure out how to fix
it.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor
  2024-02-16 12:38                   ` Jonathan Wakely
@ 2024-02-16 13:51                     ` Jonathan Wakely
  2024-02-16 14:10                       ` Jakub Jelinek
  2024-02-19  7:55                       ` [PATCH] c-family, c++: Fix up handling of types which may have padding in __atomic_{compare_}exchange Jakub Jelinek
  0 siblings, 2 replies; 28+ messages in thread
From: Jonathan Wakely @ 2024-02-16 13:51 UTC (permalink / raw)
  To: xndcn; +Cc: Jakub Jelinek, Xi Ruoyao, H.J. Lu, GCC Patches, libstdc++

On Fri, 16 Feb 2024 at 12:38, Jonathan Wakely <jwakely@redhat.com> wrote:
>
> On Fri, 2 Feb 2024 at 16:52, xndcn <xndchn@gmail.com> wrote:
> >
> > Thank you for your careful review!
> >
> > > But we don't need a new one if it's going to be used in exactly one test and if the new option does the same thing for all targets that run the test.
> > Got it, thanks. Now add option "-latomic" directly, but it still rely
> > on the trick "[atomic_link_flags [get_multilibs]]"
> >
> > > No, because the patch is supposed to prevent the infinite loop, and so there's no need to stop it looping after 10s. It won't loop at all.
> > Thanks, deleted.
> >
> > > We only need to clear padding for long double, not float and double, right?
> > Yes, actually there is a check "if constexpr
> > (__atomic_impl::__maybe_has_padding<_Fp>())".
> > But "__atomic_impl::__clear_padding(_M_fp); " is indeed simply, so fixed here.
> >
> > > Why can't we run this on all targets?
> > Got it, now target option deleted.
> >
> > > There's no reason to use __builtin_memset here, just include <cstring> and use std::memcpy.
> > Thanks, fixed.
> >
> > > It definitely does have padding, just say "long double has padding bits on x86"
> > Thanks, fixed.
> >
> > So here comes the latest patch:
>
>
> Thanks. I've applied the patch to my tree, but the new test fails
> pretty reliably.
>
> The infinite loop in std::atomic<long double>::fetch_add is fixed by
> clearing padding in the constructor, but the test fails on the
> compare_exchange_weak or compare_exchange_strong lines here:
>
>
> > +    as.fetch_add(t);
> > +    long double s = f + t;
> > +    t = as.load();
> > +    VERIFY(s == t); // padding ignored on float comparing
> > +    fill_padding(s);
> > +    VERIFY(as.compare_exchange_weak(s, f)); // padding cleared on cmpexchg
> > +    fill_padding(f);
> > +    VERIFY(as.compare_exchange_strong(f, t)); // padding cleared on cmpexchg
> >
>
>
> I think the problem is here in __atomic_impl::__compare_exchange:
>
>    if (__atomic_compare_exchange(__pval, __pexp, __pi,
>                                           __is_weak, int(__s), int(__f)))
>      return true;
>
> Even though padding in *__pexp and *__pi has been cleared, the value
> of *__pval after a successful __atomic_compare_exchange has non-zero
> padding. That means that the next compare_exchange will fail, because
> we assume that the stored value always has zeroed padding bits.
>
> Here's a gdb session showing that __atomic_compare_exchange stores a
> value with non-zero padding:
>
> Breakpoint 2, test01 () at compare_exchange_padding.cc:43
> 43        long double s2 = s;
> (gdb) n
> 44        fill_padding(s2);
> (gdb)
> 45        while (!as.compare_exchange_weak(s2, f)) // padding cleared
> on compexchg
> (gdb) p/x as._M_fp
> $11 = 0x40008000000000000000
> (gdb) step
> std::__atomic_float<long double>::compare_exchange_weak
> (this=0x7fffffffd8c0, __expected=@0x7fffffffd8a0: 2, __desired=0.5,
>     __order=std::memory_order::seq_cst) at
> /home/jwakely/gcc/14/include/c++/14.0.1/bits/atomic_base.h:1387
> 1387            return compare_exchange_weak(__expected, __desired, __order,
> (gdb) step
> std::__atomic_float<long double>::compare_exchange_weak
> (this=0x7fffffffd8c0, __expected=@0x7fffffffd8a0: 2, __desired=0.5,
>     __success=std::memory_order::seq_cst,
> __failure=std::memory_order::seq_cst) at
> /home/jwakely/gcc/14/include/c++/14.0.1/bits/atomic_base.h:1347
> 1347            return __atomic_impl::compare_exchange_weak(&_M_fp,
> (gdb) step
> std::__atomic_impl::compare_exchange_weak<false, long double>
> (__check_padding=false, __failure=std::memory_order::seq_cst,
>     __success=std::memory_order::seq_cst, __desired=0.5,
> __expected=@0x7fffffffd8a0: 2, __ptr=0x7fffffffd8c0)
>     at /home/jwakely/gcc/14/include/c++/14.0.1/bits/atomic_base.h:1123
> 1123            return __atomic_impl::__compare_exchange<_AtomicRef>(
> (gdb)
> std::__atomic_impl::__compare_exchange<false, long double>
> (__f=std::memory_order::seq_cst, __s=std::memory_order::seq_cst,
> __is_weak=true,
>     __i=<optimized out>, __e=@0x7fffffffd8a0: 2,
> __val=@0x7fffffffd8c0: 2) at
> /home/jwakely/gcc/14/include/c++/14.0.1/bits/atomic_base.h:994
> 994             __glibcxx_assert(__is_valid_cmpexch_failure_order(__f));
> (gdb) n
> 997             _Tp* const __pval = std::__addressof(__val);
> (gdb)
> 1008                _Vp* const __pi = __atomic_impl::__clear_padding(__i);
> (gdb)
> 1010                _Vp __exp = __e;
> (gdb)
> 1012                _Vp* const __pexp = __atomic_impl::__clear_padding(__exp);
> (gdb)
> 1016                if (__atomic_compare_exchange(__pval, __pexp, __pi,
> (gdb) p/x *__pval
> $12 = 0x40008000000000000000
> (gdb) p/x *__pexp
> $13 = 0x40008000000000000000
> (gdb) p/x *__pi
> $14 = 0x3ffe8000000000000000
> (gdb) n
> 1018                  return true;
> (gdb) p/x *__pval
> $15 = 0x7ffff7bf3ffe8000000000000000
> (gdb)
>
> We stored *__pi which has zero padding, but the result in *__pval has
> non-zero padding. This doesn't seem to be gdb being misleading by
> loading *__pval into a FP register which doesn't preserve the zero
> padding, because if I do this then it fails:
>
>   as.fetch_add(t);
>   VERIFY(as.load() == s);
>   __builtin_clear_padding(&s);
>   VERIFY( std::memcmp(&s, &as, sizeof(s)) == 0 );
>
> So the value stored by fetch_add (which uses compare_exchange_weak in
> a loop) really doesn't have zero padding bits.
>
> I'm not sure what's going on here yet, maybe __atomic_compare_exchange
> recognizes that the pointers are long double* and so only copies the
> non-padding bits into the value?

Ah, although __atomic_compare_exchange only takes pointers, the
compiler replaces that with a call to __atomic_compare_exchange_n
which takes the newval by value, which presumably uses an 80-bit FP
register and so the padding bits become indeterminate again.

Maybe we can make std::atomic<long double> actually store a 16-byte
struct containing an opaque char buffer, which we store a long double
into. That should stop FP registers being used for it. We never need
an actual long double, because we only access the value using the
__atomic_xxx built-ins.

Maybe we need that for every std::atomic<T> where T has padding bits,
because even for struct S { int i; char c; } passing S by value could
be scalarized and only copy the value representation, not preserve the
padding that we've carefully cleared.



>
> I think I'll push the fix to the __atomic_float constructor, but with
> a simplified test case that doesn't include the
> compare_exchange_{weak,strong} calls, until I figure out how to fix
> it.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor
  2024-02-16 13:51                     ` Jonathan Wakely
@ 2024-02-16 14:10                       ` Jakub Jelinek
  2024-02-16 15:15                         ` Jonathan Wakely
  2024-02-19  7:55                       ` [PATCH] c-family, c++: Fix up handling of types which may have padding in __atomic_{compare_}exchange Jakub Jelinek
  1 sibling, 1 reply; 28+ messages in thread
From: Jakub Jelinek @ 2024-02-16 14:10 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: xndcn, Xi Ruoyao, H.J. Lu, GCC Patches, libstdc++

On Fri, Feb 16, 2024 at 01:51:54PM +0000, Jonathan Wakely wrote:
> Ah, although __atomic_compare_exchange only takes pointers, the
> compiler replaces that with a call to __atomic_compare_exchange_n
> which takes the newval by value, which presumably uses an 80-bit FP
> register and so the padding bits become indeterminate again.

__atomic_compare_exchange_n only works with integers, so I guess
it is doing VIEW_CONVERT_EXPR (aka union-style type punning) on the
argument.

Do you have preprocessed source for the testcase?

	Jakub


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor
  2024-02-16 14:10                       ` Jakub Jelinek
@ 2024-02-16 15:15                         ` Jonathan Wakely
  2024-03-14 15:13                           ` Jonathan Wakely
  0 siblings, 1 reply; 28+ messages in thread
From: Jonathan Wakely @ 2024-02-16 15:15 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: xndcn, Xi Ruoyao, H.J. Lu, GCC Patches, libstdc++

On Fri, 16 Feb 2024 at 14:10, Jakub Jelinek wrote:
>
> On Fri, Feb 16, 2024 at 01:51:54PM +0000, Jonathan Wakely wrote:
> > Ah, although __atomic_compare_exchange only takes pointers, the
> > compiler replaces that with a call to __atomic_compare_exchange_n
> > which takes the newval by value, which presumably uses an 80-bit FP
> > register and so the padding bits become indeterminate again.
>
> __atomic_compare_exchange_n only works with integers, so I guess
> it is doing VIEW_CONVERT_EXPR (aka union-style type punning) on the
> argument.
>
> Do you have preprocessed source for the testcase?

Sent offlist.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH] c-family, c++: Fix up handling of types which may have padding in __atomic_{compare_}exchange
  2024-02-16 13:51                     ` Jonathan Wakely
  2024-02-16 14:10                       ` Jakub Jelinek
@ 2024-02-19  7:55                       ` Jakub Jelinek
  2024-02-20  0:12                         ` Jason Merrill
  1 sibling, 1 reply; 28+ messages in thread
From: Jakub Jelinek @ 2024-02-19  7:55 UTC (permalink / raw)
  To: Jason Merrill; +Cc: gcc-patches, Jonathan Wakely

On Fri, Feb 16, 2024 at 01:51:54PM +0000, Jonathan Wakely wrote:
> Ah, although __atomic_compare_exchange only takes pointers, the
> compiler replaces that with a call to __atomic_compare_exchange_n
> which takes the newval by value, which presumably uses an 80-bit FP
> register and so the padding bits become indeterminate again.

The problem is that __atomic_{,compare_}exchange lowering if it has
a supported atomic 1/2/4/8/16 size emits code like:
  _3 = *p2;
  _4 = VIEW_CONVERT_EXPR<I_type> (_3);
so if long double or some small struct etc. has some carefully filled
padding bits, those bits can be lost on the assignment.  The library call
for __atomic_{,compare_}exchange would actually work because it woiuld
load the value from memory using integral type or memcpy.
E.g. on
void
foo (long double *a, long double *b, long double *c)
{
  __atomic_compare_exchange (a, b, c, false, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
}
we end up with -O0 with:
	fldt	(%rax)
	fstpt	-48(%rbp)
	movq	-48(%rbp), %rax
	movq	-40(%rbp), %rdx
i.e. load *c from memory into 387 register, store it back to uninitialized
stack slot (the padding bits are now random in there) and then load a
__uint128_t (pair of GPR regs).  The problem is that we first load it using
whatever type the pointer points to and then VIEW_CONVERT_EXPR that value:
  p2 = build_indirect_ref (loc, p2, RO_UNARY_STAR);
  p2 = build1 (VIEW_CONVERT_EXPR, I_type, p2);
The following patch fixes that by creating a MEM_REF instead, with the
I_type type, but with the original pointer type on the second argument for
aliasing purposes, so we actually preserve the padding bits that way.
I've done that for types which may have padding and also for
non-integral/pointer types, because I fear even on floating point types
like double or float which don't have padding bits the copying through
floating point could misbehave in presence of sNaNs or unsupported bit
combinations.
With this patch instead of the above assembly we emit
	movq	8(%rax), %rdx
	movq	(%rax), %rax
I had to add support for MEM_REF in pt.cc, though with the assumption
that it has been already originally created with non-dependent
types/operands (which is the case here for the __atomic*exchange lowering).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-02-19  Jakub Jelinek  <jakub@redhat.com>

gcc/c-family/
	* c-common.cc (resolve_overloaded_atomic_exchange): For
	non-integral/pointer types or types which may need padding
	instead of setting p1 to VIEW_CONVERT_EXPR<I_type> (*p1), set it to
	MEM_REF with p1 and (typeof (p1)) 0 operands and I_type type.
	(resolve_overloaded_atomic_compare_exchange): Similarly for p2.
gcc/cp/
	* pt.cc (tsubst_expr): Handle MEM_REF.
gcc/testsuite/
	* g++.dg/ext/atomic-5.C: New test.

--- gcc/c-family/c-common.cc.jj	2024-02-16 17:33:43.995739790 +0100
+++ gcc/c-family/c-common.cc	2024-02-17 11:11:34.029474214 +0100
@@ -7794,8 +7794,23 @@ resolve_overloaded_atomic_exchange (loca
   p0 = build1 (VIEW_CONVERT_EXPR, I_type_ptr, p0);
   (*params)[0] = p0; 
   /* Convert new value to required type, and dereference it.  */
-  p1 = build_indirect_ref (loc, p1, RO_UNARY_STAR);
-  p1 = build1 (VIEW_CONVERT_EXPR, I_type, p1);
+  if ((!INTEGRAL_TYPE_P (TREE_TYPE (TREE_TYPE (p1)))
+       && !POINTER_TYPE_P (TREE_TYPE (TREE_TYPE (p1))))
+      || clear_padding_type_may_have_padding_p (TREE_TYPE (TREE_TYPE (p1))))
+    {
+      /* If *p1 type can have padding or may involve floating point which
+	 could e.g. be promoted to wider precision and demoted afterwards,
+	 state of padding bits might not be preserved.  */
+      build_indirect_ref (loc, p1, RO_UNARY_STAR);
+      p1 = build2_loc (loc, MEM_REF, I_type,
+		       build1 (VIEW_CONVERT_EXPR, I_type_ptr, p1),
+		       build_zero_cst (TREE_TYPE (p1)));
+    }
+  else
+    {
+      p1 = build_indirect_ref (loc, p1, RO_UNARY_STAR);
+      p1 = build1 (VIEW_CONVERT_EXPR, I_type, p1);
+    }
   (*params)[1] = p1;
 
   /* Move memory model to the 3rd position, and end param list.  */
@@ -7874,8 +7889,23 @@ resolve_overloaded_atomic_compare_exchan
   (*params)[1] = p1;
 
   /* Convert desired value to required type, and dereference it.  */
-  p2 = build_indirect_ref (loc, p2, RO_UNARY_STAR);
-  p2 = build1 (VIEW_CONVERT_EXPR, I_type, p2);
+  if ((!INTEGRAL_TYPE_P (TREE_TYPE (TREE_TYPE (p2)))
+       && !POINTER_TYPE_P (TREE_TYPE (TREE_TYPE (p2))))
+      || clear_padding_type_may_have_padding_p (TREE_TYPE (TREE_TYPE (p2))))
+    {
+      /* If *p2 type can have padding or may involve floating point which
+	 could e.g. be promoted to wider precision and demoted afterwards,
+	 state of padding bits might not be preserved.  */
+      build_indirect_ref (loc, p2, RO_UNARY_STAR);
+      p2 = build2_loc (loc, MEM_REF, I_type,
+		       build1 (VIEW_CONVERT_EXPR, I_type_ptr, p2),
+		       build_zero_cst (TREE_TYPE (p2)));
+    }
+  else
+    {
+      p2 = build_indirect_ref (loc, p2, RO_UNARY_STAR);
+      p2 = build1 (VIEW_CONVERT_EXPR, I_type, p2);
+    }
   (*params)[2] = p2;
 
   /* The rest of the parameters are fine. NULL means no special return value
--- gcc/cp/pt.cc.jj	2024-02-14 14:26:19.695811596 +0100
+++ gcc/cp/pt.cc	2024-02-17 11:05:47.988237152 +0100
@@ -20088,6 +20088,14 @@ tsubst_expr (tree t, tree args, tsubst_f
 	RETURN (r);
       }
 
+    case MEM_REF:
+      {
+	tree op0 = RECUR (TREE_OPERAND (t, 0));
+	tree op1 = RECUR (TREE_OPERAND (t, 0));
+	tree new_type = tsubst (TREE_TYPE (t), args, complain, in_decl);
+	RETURN (build2_loc (EXPR_LOCATION (t), MEM_REF, new_type, op0, op1));
+      }
+
     case NOP_EXPR:
       {
 	tree type = tsubst (TREE_TYPE (t), args, complain, in_decl);
--- gcc/testsuite/g++.dg/ext/atomic-5.C.jj	2024-02-17 11:13:37.824770288 +0100
+++ gcc/testsuite/g++.dg/ext/atomic-5.C	2024-02-17 11:14:54.077720732 +0100
@@ -0,0 +1,42 @@
+// { dg-do compile { target c++14 } }
+
+template <int N>
+void
+foo (long double *ptr, long double *val, long double *ret)
+{
+  __atomic_exchange (ptr, val, ret, __ATOMIC_RELAXED);
+}
+
+template <int N>
+bool
+bar (long double *ptr, long double *exp, long double *des)
+{
+  return __atomic_compare_exchange (ptr, exp, des, false,
+				    __ATOMIC_RELAXED, __ATOMIC_RELAXED);
+}
+
+bool
+baz (long double *p, long double *q, long double *r)
+{
+  foo<0> (p, q, r);
+  foo<1> (p + 1, q + 1, r + 1);
+  return bar<0> (p + 2, q + 2, r + 2) || bar<1> (p + 3, q + 3, r + 3);
+}
+
+constexpr int
+qux (long double *ptr, long double *val, long double *ret)
+{
+  __atomic_exchange (ptr, val, ret, __ATOMIC_RELAXED);
+  return 0;
+}
+
+constexpr bool
+corge (long double *ptr, long double *exp, long double *des)
+{
+  return __atomic_compare_exchange (ptr, exp, des, false,
+				    __ATOMIC_RELAXED, __ATOMIC_RELAXED);
+}
+
+long double a[6];
+const int b = qux (a, a + 1, a + 2);
+const bool c = corge (a + 3, a + 4, a + 5);

	Jakub


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] c-family, c++: Fix up handling of types which may have padding in __atomic_{compare_}exchange
  2024-02-19  7:55                       ` [PATCH] c-family, c++: Fix up handling of types which may have padding in __atomic_{compare_}exchange Jakub Jelinek
@ 2024-02-20  0:12                         ` Jason Merrill
  2024-02-20  0:51                           ` Jakub Jelinek
  0 siblings, 1 reply; 28+ messages in thread
From: Jason Merrill @ 2024-02-20  0:12 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches, Jonathan Wakely, Richard Biener

On 2/19/24 02:55, Jakub Jelinek wrote:
> On Fri, Feb 16, 2024 at 01:51:54PM +0000, Jonathan Wakely wrote:
>> Ah, although __atomic_compare_exchange only takes pointers, the
>> compiler replaces that with a call to __atomic_compare_exchange_n
>> which takes the newval by value, which presumably uses an 80-bit FP
>> register and so the padding bits become indeterminate again.
> 
> The problem is that __atomic_{,compare_}exchange lowering if it has
> a supported atomic 1/2/4/8/16 size emits code like:
>    _3 = *p2;
>    _4 = VIEW_CONVERT_EXPR<I_type> (_3);

Hmm, I find that surprising; I thought of VIEW_CONVERT_EXPR as working 
on lvalues, not rvalues, based on the documentation describing it as 
roughly *(type2 *)&X.

Now I see that gimplify_expr does what you describe, and I wonder what 
the advantage of that is.  That seems to go back to richi's r206420 for 
PR59471.  r270579 for PR limited it to cases where the caller wants an 
rvalue; perhaps it should also be avoided when the operand is an 
INDIRECT_REF?

> so if long double or some small struct etc. has some carefully filled
> padding bits, those bits can be lost on the assignment.  The library call
> for __atomic_{,compare_}exchange would actually work because it woiuld
> load the value from memory using integral type or memcpy.
> E.g. on
> void
> foo (long double *a, long double *b, long double *c)
> {
>    __atomic_compare_exchange (a, b, c, false, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
> }
> we end up with -O0 with:
> 	fldt	(%rax)
> 	fstpt	-48(%rbp)
> 	movq	-48(%rbp), %rax
> 	movq	-40(%rbp), %rdx
> i.e. load *c from memory into 387 register, store it back to uninitialized
> stack slot (the padding bits are now random in there) and then load a
> __uint128_t (pair of GPR regs).  The problem is that we first load it using
> whatever type the pointer points to and then VIEW_CONVERT_EXPR that value:
>    p2 = build_indirect_ref (loc, p2, RO_UNARY_STAR);
>    p2 = build1 (VIEW_CONVERT_EXPR, I_type, p2);
> The following patch fixes that by creating a MEM_REF instead, with the
> I_type type, but with the original pointer type on the second argument for
> aliasing purposes, so we actually preserve the padding bits that way.
> I've done that for types which may have padding and also for
> non-integral/pointer types, because I fear even on floating point types
> like double or float which don't have padding bits the copying through
> floating point could misbehave in presence of sNaNs or unsupported bit
> combinations.
> With this patch instead of the above assembly we emit
> 	movq	8(%rax), %rdx
> 	movq	(%rax), %rax
> I had to add support for MEM_REF in pt.cc, though with the assumption
> that it has been already originally created with non-dependent
> types/operands (which is the case here for the __atomic*exchange lowering).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2024-02-19  Jakub Jelinek  <jakub@redhat.com>
> 
> gcc/c-family/
> 	* c-common.cc (resolve_overloaded_atomic_exchange): For
> 	non-integral/pointer types or types which may need padding
> 	instead of setting p1 to VIEW_CONVERT_EXPR<I_type> (*p1), set it to
> 	MEM_REF with p1 and (typeof (p1)) 0 operands and I_type type.
> 	(resolve_overloaded_atomic_compare_exchange): Similarly for p2.
> gcc/cp/
> 	* pt.cc (tsubst_expr): Handle MEM_REF.
> gcc/testsuite/
> 	* g++.dg/ext/atomic-5.C: New test.
> 
> --- gcc/c-family/c-common.cc.jj	2024-02-16 17:33:43.995739790 +0100
> +++ gcc/c-family/c-common.cc	2024-02-17 11:11:34.029474214 +0100
> @@ -7794,8 +7794,23 @@ resolve_overloaded_atomic_exchange (loca
>     p0 = build1 (VIEW_CONVERT_EXPR, I_type_ptr, p0);
>     (*params)[0] = p0;
>     /* Convert new value to required type, and dereference it.  */
> -  p1 = build_indirect_ref (loc, p1, RO_UNARY_STAR);
> -  p1 = build1 (VIEW_CONVERT_EXPR, I_type, p1);
> +  if ((!INTEGRAL_TYPE_P (TREE_TYPE (TREE_TYPE (p1)))
> +       && !POINTER_TYPE_P (TREE_TYPE (TREE_TYPE (p1))))
> +      || clear_padding_type_may_have_padding_p (TREE_TYPE (TREE_TYPE (p1))))
> +    {
> +      /* If *p1 type can have padding or may involve floating point which
> +	 could e.g. be promoted to wider precision and demoted afterwards,
> +	 state of padding bits might not be preserved.  */
> +      build_indirect_ref (loc, p1, RO_UNARY_STAR);
> +      p1 = build2_loc (loc, MEM_REF, I_type,
> +		       build1 (VIEW_CONVERT_EXPR, I_type_ptr, p1),
> +		       build_zero_cst (TREE_TYPE (p1)));
> +    }
> +  else
> +    {
> +      p1 = build_indirect_ref (loc, p1, RO_UNARY_STAR);
> +      p1 = build1 (VIEW_CONVERT_EXPR, I_type, p1);
> +    }
>     (*params)[1] = p1;
>   
>     /* Move memory model to the 3rd position, and end param list.  */
> @@ -7874,8 +7889,23 @@ resolve_overloaded_atomic_compare_exchan
>     (*params)[1] = p1;
>   
>     /* Convert desired value to required type, and dereference it.  */
> -  p2 = build_indirect_ref (loc, p2, RO_UNARY_STAR);
> -  p2 = build1 (VIEW_CONVERT_EXPR, I_type, p2);
> +  if ((!INTEGRAL_TYPE_P (TREE_TYPE (TREE_TYPE (p2)))
> +       && !POINTER_TYPE_P (TREE_TYPE (TREE_TYPE (p2))))
> +      || clear_padding_type_may_have_padding_p (TREE_TYPE (TREE_TYPE (p2))))
> +    {
> +      /* If *p2 type can have padding or may involve floating point which
> +	 could e.g. be promoted to wider precision and demoted afterwards,
> +	 state of padding bits might not be preserved.  */
> +      build_indirect_ref (loc, p2, RO_UNARY_STAR);
> +      p2 = build2_loc (loc, MEM_REF, I_type,
> +		       build1 (VIEW_CONVERT_EXPR, I_type_ptr, p2),
> +		       build_zero_cst (TREE_TYPE (p2)));
> +    }
> +  else
> +    {
> +      p2 = build_indirect_ref (loc, p2, RO_UNARY_STAR);
> +      p2 = build1 (VIEW_CONVERT_EXPR, I_type, p2);
> +    }
>     (*params)[2] = p2;
>   
>     /* The rest of the parameters are fine. NULL means no special return value
> --- gcc/cp/pt.cc.jj	2024-02-14 14:26:19.695811596 +0100
> +++ gcc/cp/pt.cc	2024-02-17 11:05:47.988237152 +0100
> @@ -20088,6 +20088,14 @@ tsubst_expr (tree t, tree args, tsubst_f
>   	RETURN (r);
>         }
>   
> +    case MEM_REF:
> +      {
> +	tree op0 = RECUR (TREE_OPERAND (t, 0));
> +	tree op1 = RECUR (TREE_OPERAND (t, 0));
> +	tree new_type = tsubst (TREE_TYPE (t), args, complain, in_decl);
> +	RETURN (build2_loc (EXPR_LOCATION (t), MEM_REF, new_type, op0, op1));
> +      }
> +
>       case NOP_EXPR:
>         {
>   	tree type = tsubst (TREE_TYPE (t), args, complain, in_decl);
> --- gcc/testsuite/g++.dg/ext/atomic-5.C.jj	2024-02-17 11:13:37.824770288 +0100
> +++ gcc/testsuite/g++.dg/ext/atomic-5.C	2024-02-17 11:14:54.077720732 +0100
> @@ -0,0 +1,42 @@
> +// { dg-do compile { target c++14 } }
> +
> +template <int N>
> +void
> +foo (long double *ptr, long double *val, long double *ret)
> +{
> +  __atomic_exchange (ptr, val, ret, __ATOMIC_RELAXED);
> +}
> +
> +template <int N>
> +bool
> +bar (long double *ptr, long double *exp, long double *des)
> +{
> +  return __atomic_compare_exchange (ptr, exp, des, false,
> +				    __ATOMIC_RELAXED, __ATOMIC_RELAXED);
> +}
> +
> +bool
> +baz (long double *p, long double *q, long double *r)
> +{
> +  foo<0> (p, q, r);
> +  foo<1> (p + 1, q + 1, r + 1);
> +  return bar<0> (p + 2, q + 2, r + 2) || bar<1> (p + 3, q + 3, r + 3);
> +}
> +
> +constexpr int
> +qux (long double *ptr, long double *val, long double *ret)
> +{
> +  __atomic_exchange (ptr, val, ret, __ATOMIC_RELAXED);
> +  return 0;
> +}
> +
> +constexpr bool
> +corge (long double *ptr, long double *exp, long double *des)
> +{
> +  return __atomic_compare_exchange (ptr, exp, des, false,
> +				    __ATOMIC_RELAXED, __ATOMIC_RELAXED);
> +}
> +
> +long double a[6];
> +const int b = qux (a, a + 1, a + 2);
> +const bool c = corge (a + 3, a + 4, a + 5);
> 
> 	Jakub
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] c-family, c++: Fix up handling of types which may have padding in __atomic_{compare_}exchange
  2024-02-20  0:12                         ` Jason Merrill
@ 2024-02-20  0:51                           ` Jakub Jelinek
  2024-02-20  8:01                             ` Richard Biener
  0 siblings, 1 reply; 28+ messages in thread
From: Jakub Jelinek @ 2024-02-20  0:51 UTC (permalink / raw)
  To: Jason Merrill, Richard Biener; +Cc: gcc-patches, Jonathan Wakely

On Tue, Feb 20, 2024 at 12:12:11AM +0000, Jason Merrill wrote:
> On 2/19/24 02:55, Jakub Jelinek wrote:
> > On Fri, Feb 16, 2024 at 01:51:54PM +0000, Jonathan Wakely wrote:
> > > Ah, although __atomic_compare_exchange only takes pointers, the
> > > compiler replaces that with a call to __atomic_compare_exchange_n
> > > which takes the newval by value, which presumably uses an 80-bit FP
> > > register and so the padding bits become indeterminate again.
> > 
> > The problem is that __atomic_{,compare_}exchange lowering if it has
> > a supported atomic 1/2/4/8/16 size emits code like:
> >    _3 = *p2;
> >    _4 = VIEW_CONVERT_EXPR<I_type> (_3);
> 
> Hmm, I find that surprising; I thought of VIEW_CONVERT_EXPR as working on
> lvalues, not rvalues, based on the documentation describing it as roughly
> *(type2 *)&X.

It works on both, if it is on the lhs, it obviously needs to be on an lvalue
and is something like
VIEW_CONVERT_EXPR<I_type> (mem) = something;
and if it is on rhs, it can be on an rvalue, just reinterpret bits of
something as something else, so more like
((union { typeof (val) x; I_type y; }) (val)).y

> Now I see that gimplify_expr does what you describe, and I wonder what the
> advantage of that is.  That seems to go back to richi's r206420 for PR59471.
> r270579 for PR limited it to cases where the caller wants an rvalue; perhaps
> it should also be avoided when the operand is an INDIRECT_REF?

Strangely we don't even try to optimize it, even at -O2 that
  _3 = *p2_2(D);
  _4 = VIEW_CONVERT_EXPR<I_type> (_3);
stays around until optimized.  I believe VIEW_CONVERT_EXPR<I_type> is valid
around a memory reference, so it could be either
  _4 = VIEW_CONVERT_EXPR<I_type> (*p2_2(D));
or the MEM_REF with aliasing info from original pointer and type from VCE.
For optimizations, guess it is a matter of writing some match.pd rule, but
we can't rely on it for the atomics.

Doing it in the gimplifier is possible, but not sure how to do that easily,
given that the VIEW_CONVERT_EXPR argument can be either lvalue or rvalue
and we need to gimplify it first.
The first part is exactly what forces it into a separate SSA_NAME for the
load vs. VIEW_CONVERT_EXPR around it:

        case VIEW_CONVERT_EXPR:
          if ((fallback & fb_rvalue)
              && is_gimple_reg_type (TREE_TYPE (*expr_p))
              && is_gimple_reg_type (TREE_TYPE (TREE_OPERAND (*expr_p, 0))))
            {
              ret = gimplify_expr (&TREE_OPERAND (*expr_p, 0), pre_p,
                                   post_p, is_gimple_val, fb_rvalue);
              recalculate_side_effects (*expr_p);
              break;
            }
          /* Fallthru.  */

        case ARRAY_REF:
        case ARRAY_RANGE_REF:
        case REALPART_EXPR:
        case IMAGPART_EXPR:
        case COMPONENT_REF:
          ret = gimplify_compound_lval (expr_p, pre_p, post_p,
                                        fallback ? fallback : fb_rvalue);
          break;

but if we do the gimplify_compound_lval, we'd actually force it to be
addressable (with possible errors if that isn't possible) etc.
Having just a special-case of VIEW_CONVERT_EXPR of INDIRECT_REF and
do gimplify_compound_lval in that case mikght be wrong I think if
e.g. INDIRECT_REF operand is ADDR_EXPR, shouldn't we cancel *& in that case
and still not force it into memory?

The INDIRECT_REF: case is:
          {
            bool volatilep = TREE_THIS_VOLATILE (*expr_p);
            bool notrap = TREE_THIS_NOTRAP (*expr_p);
            tree saved_ptr_type = TREE_TYPE (TREE_OPERAND (*expr_p, 0));
      
            *expr_p = fold_indirect_ref_loc (input_location, *expr_p);
            if (*expr_p != save_expr)
              {
                ret = GS_OK;
                break;
              }
        
            ret = gimplify_expr (&TREE_OPERAND (*expr_p, 0), pre_p, post_p,
                                 is_gimple_reg, fb_rvalue);
            if (ret == GS_ERROR)
              break;
        
            recalculate_side_effects (*expr_p);
            *expr_p = fold_build2_loc (input_location, MEM_REF,
                                       TREE_TYPE (*expr_p),
                                       TREE_OPERAND (*expr_p, 0),
                                       build_int_cst (saved_ptr_type, 0));
            TREE_THIS_VOLATILE (*expr_p) = volatilep;
            TREE_THIS_NOTRAP (*expr_p) = notrap;
            ret = GS_OK;
            break;
          }
so I think if we want to special case VIEW_CONVERT_EXPR on INDIRECT_REF,
we should basically copy and adjust that to the start of the case
VIEW_CONVERT_EXPR:.  In particular, if fold_indirect_ref_loc did something
and it isn't a different INDIRECT_REF or something addressable, do what it
does now (i.e. possibly treat as lvalue), otherwise gimplify the INDIRECT_REF
operand and build a MEM_REF, but with the type of the VIEW_CONVERT_EXPR but
still saved_ptr_type of the INDIRECT_REF.

Though, that all still feels like an optimization rather than guarantee
which is what we need for the atomics.

	Jakub


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] c-family, c++: Fix up handling of types which may have padding in __atomic_{compare_}exchange
  2024-02-20  0:51                           ` Jakub Jelinek
@ 2024-02-20  8:01                             ` Richard Biener
  2024-02-20 10:02                               ` [PATCH] c-family, c++, v2: " Jakub Jelinek
  0 siblings, 1 reply; 28+ messages in thread
From: Richard Biener @ 2024-02-20  8:01 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Jason Merrill, gcc-patches, Jonathan Wakely

On Tue, 20 Feb 2024, Jakub Jelinek wrote:

> On Tue, Feb 20, 2024 at 12:12:11AM +0000, Jason Merrill wrote:
> > On 2/19/24 02:55, Jakub Jelinek wrote:
> > > On Fri, Feb 16, 2024 at 01:51:54PM +0000, Jonathan Wakely wrote:
> > > > Ah, although __atomic_compare_exchange only takes pointers, the
> > > > compiler replaces that with a call to __atomic_compare_exchange_n
> > > > which takes the newval by value, which presumably uses an 80-bit FP
> > > > register and so the padding bits become indeterminate again.
> > > 
> > > The problem is that __atomic_{,compare_}exchange lowering if it has
> > > a supported atomic 1/2/4/8/16 size emits code like:
> > >    _3 = *p2;
> > >    _4 = VIEW_CONVERT_EXPR<I_type> (_3);
> > 
> > Hmm, I find that surprising; I thought of VIEW_CONVERT_EXPR as working on
> > lvalues, not rvalues, based on the documentation describing it as roughly
> > *(type2 *)&X.
> 
> It works on both, if it is on the lhs, it obviously needs to be on an lvalue
> and is something like
> VIEW_CONVERT_EXPR<I_type> (mem) = something;
> and if it is on rhs, it can be on an rvalue, just reinterpret bits of
> something as something else, so more like
> ((union { typeof (val) x; I_type y; }) (val)).y
> 
> > Now I see that gimplify_expr does what you describe, and I wonder what the
> > advantage of that is.  That seems to go back to richi's r206420 for PR59471.
> > r270579 for PR limited it to cases where the caller wants an rvalue; perhaps
> > it should also be avoided when the operand is an INDIRECT_REF?
> 
> Strangely we don't even try to optimize it, even at -O2 that
>   _3 = *p2_2(D);
>   _4 = VIEW_CONVERT_EXPR<I_type> (_3);
> stays around until optimized.  I believe VIEW_CONVERT_EXPR<I_type> is valid
> around a memory reference, so it could be either
>   _4 = VIEW_CONVERT_EXPR<I_type> (*p2_2(D));
> or the MEM_REF with aliasing info from original pointer and type from VCE.
> For optimizations, guess it is a matter of writing some match.pd rule, but
> we can't rely on it for the atomics.

I'm not sure those would be really equivalent (MEM_REF vs. V_C_E
as well as combined vs. split).  It really depends how RTL expansion
handles this (as you can see padding can be fun here).

So I'd be nervous for a match.pd rule here (also we can't match
memory defs).

As for your patch I'd go with a MEM_REF unconditionally, I don't
think we want different behavior whether there's padding or not ...

> Doing it in the gimplifier is possible, but not sure how to do that easily,
> given that the VIEW_CONVERT_EXPR argument can be either lvalue or rvalue
> and we need to gimplify it first.
> The first part is exactly what forces it into a separate SSA_NAME for the
> load vs. VIEW_CONVERT_EXPR around it:
> 
>         case VIEW_CONVERT_EXPR:
>           if ((fallback & fb_rvalue)
>               && is_gimple_reg_type (TREE_TYPE (*expr_p))
>               && is_gimple_reg_type (TREE_TYPE (TREE_OPERAND (*expr_p, 0))))
>             {
>               ret = gimplify_expr (&TREE_OPERAND (*expr_p, 0), pre_p,
>                                    post_p, is_gimple_val, fb_rvalue);
>               recalculate_side_effects (*expr_p);
>               break;
>             }

that was done to help optimizing some cases (IIRC with vectors / int 
punning).

>           /* Fallthru.  */
> 
>         case ARRAY_REF:
>         case ARRAY_RANGE_REF:
>         case REALPART_EXPR:
>         case IMAGPART_EXPR:
>         case COMPONENT_REF:
>           ret = gimplify_compound_lval (expr_p, pre_p, post_p,
>                                         fallback ? fallback : fb_rvalue);
>           break;
> 
> but if we do the gimplify_compound_lval, we'd actually force it to be
> addressable (with possible errors if that isn't possible) etc.
> Having just a special-case of VIEW_CONVERT_EXPR of INDIRECT_REF and
> do gimplify_compound_lval in that case mikght be wrong I think if
> e.g. INDIRECT_REF operand is ADDR_EXPR, shouldn't we cancel *& in that case
> and still not force it into memory?
> 
> The INDIRECT_REF: case is:
>           {
>             bool volatilep = TREE_THIS_VOLATILE (*expr_p);
>             bool notrap = TREE_THIS_NOTRAP (*expr_p);
>             tree saved_ptr_type = TREE_TYPE (TREE_OPERAND (*expr_p, 0));
>       
>             *expr_p = fold_indirect_ref_loc (input_location, *expr_p);
>             if (*expr_p != save_expr)
>               {
>                 ret = GS_OK;
>                 break;
>               }
>         
>             ret = gimplify_expr (&TREE_OPERAND (*expr_p, 0), pre_p, post_p,
>                                  is_gimple_reg, fb_rvalue);
>             if (ret == GS_ERROR)
>               break;
>         
>             recalculate_side_effects (*expr_p);
>             *expr_p = fold_build2_loc (input_location, MEM_REF,
>                                        TREE_TYPE (*expr_p),
>                                        TREE_OPERAND (*expr_p, 0),
>                                        build_int_cst (saved_ptr_type, 0));
>             TREE_THIS_VOLATILE (*expr_p) = volatilep;
>             TREE_THIS_NOTRAP (*expr_p) = notrap;
>             ret = GS_OK;
>             break;
>           }
> so I think if we want to special case VIEW_CONVERT_EXPR on INDIRECT_REF,
> we should basically copy and adjust that to the start of the case
> VIEW_CONVERT_EXPR:.  In particular, if fold_indirect_ref_loc did something
> and it isn't a different INDIRECT_REF or something addressable, do what it
> does now (i.e. possibly treat as lvalue), otherwise gimplify the INDIRECT_REF
> operand and build a MEM_REF, but with the type of the VIEW_CONVERT_EXPR but
> still saved_ptr_type of the INDIRECT_REF.
> 
> Though, that all still feels like an optimization rather than guarantee
> which is what we need for the atomics.

Yes, and I think the frontend does it wrong - if it wants to do a load
of l_type it should do that, not re-interpret the result.  I suppose
the docs

@defbuiltin{void __atomic_exchange (@var{type} *@var{ptr}, @var{type} 
*@var{val}, @var{type} *@var{ret}, int @var{memorder})}
This is the generic version of an atomic exchange.  It stores the
contents of @code{*@var{val}} into @code{*@var{ptr}}. The original value
of @code{*@var{ptr}} is copied into @code{*@var{ret}}.

contain too little standards legalise to constrain 'type', it just
appears to use lvalues of *ptr, *val and *ret here which for floats
with padding possibly isn't well-defined and "contents" isn't a
standard term (it doesn't say bit representation or something similar),
it also says "stores" and "copied into".

That said, if the FE determines in l_type a type suitable for copying
then the access should use a MEM_REF, no further "conversion" required.

Richard.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH] c-family, c++, v2: Fix up handling of types which may have padding in __atomic_{compare_}exchange
  2024-02-20  8:01                             ` Richard Biener
@ 2024-02-20 10:02                               ` Jakub Jelinek
  2024-02-20 10:11                                 ` Richard Biener
  2024-03-07  0:00                                 ` Jason Merrill
  0 siblings, 2 replies; 28+ messages in thread
From: Jakub Jelinek @ 2024-02-20 10:02 UTC (permalink / raw)
  To: Richard Biener, Jason Merrill; +Cc: gcc-patches, Jonathan Wakely

On Tue, Feb 20, 2024 at 09:01:10AM +0100, Richard Biener wrote:
> I'm not sure those would be really equivalent (MEM_REF vs. V_C_E
> as well as combined vs. split).  It really depends how RTL expansion
> handles this (as you can see padding can be fun here).
> 
> So I'd be nervous for a match.pd rule here (also we can't match
> memory defs).

Ok.  Perhaps forwprop then; anyway, that would be an optimization.

> As for your patch I'd go with a MEM_REF unconditionally, I don't
> think we want different behavior whether there's padding or not ...

I've made it conditional so that the MEM_REFs don't appear that often in the
FE trees, but maybe that is fine.

The unconditional patch would then be:

2024-02-20  Jakub Jelinek  <jakub@redhat.com>

gcc/c-family/
	* c-common.cc (resolve_overloaded_atomic_exchange): Instead of setting
	p1 to VIEW_CONVERT_EXPR<I_type> (*p1), set it to MEM_REF with p1 and
	(typeof (p1)) 0 operands and I_type type.
	(resolve_overloaded_atomic_compare_exchange): Similarly for p2.
gcc/cp/
	* pt.cc (tsubst_expr): Handle MEM_REF.
gcc/testsuite/
	* g++.dg/ext/atomic-5.C: New test.

--- gcc/c-family/c-common.cc.jj	2024-02-17 16:40:42.831571693 +0100
+++ gcc/c-family/c-common.cc	2024-02-20 10:58:56.599865656 +0100
@@ -7793,9 +7793,14 @@ resolve_overloaded_atomic_exchange (loca
   /* Convert object pointer to required type.  */
   p0 = build1 (VIEW_CONVERT_EXPR, I_type_ptr, p0);
   (*params)[0] = p0; 
-  /* Convert new value to required type, and dereference it.  */
-  p1 = build_indirect_ref (loc, p1, RO_UNARY_STAR);
-  p1 = build1 (VIEW_CONVERT_EXPR, I_type, p1);
+  /* Convert new value to required type, and dereference it.
+     If *p1 type can have padding or may involve floating point which
+     could e.g. be promoted to wider precision and demoted afterwards,
+     state of padding bits might not be preserved.  */
+  build_indirect_ref (loc, p1, RO_UNARY_STAR);
+  p1 = build2_loc (loc, MEM_REF, I_type,
+		   build1 (VIEW_CONVERT_EXPR, I_type_ptr, p1),
+		   build_zero_cst (TREE_TYPE (p1)));
   (*params)[1] = p1;
 
   /* Move memory model to the 3rd position, and end param list.  */
@@ -7873,9 +7878,14 @@ resolve_overloaded_atomic_compare_exchan
   p1 = build1 (VIEW_CONVERT_EXPR, I_type_ptr, p1);
   (*params)[1] = p1;
 
-  /* Convert desired value to required type, and dereference it.  */
-  p2 = build_indirect_ref (loc, p2, RO_UNARY_STAR);
-  p2 = build1 (VIEW_CONVERT_EXPR, I_type, p2);
+  /* Convert desired value to required type, and dereference it.
+     If *p2 type can have padding or may involve floating point which
+     could e.g. be promoted to wider precision and demoted afterwards,
+     state of padding bits might not be preserved.  */
+  build_indirect_ref (loc, p2, RO_UNARY_STAR);
+  p2 = build2_loc (loc, MEM_REF, I_type,
+		   build1 (VIEW_CONVERT_EXPR, I_type_ptr, p2),
+		   build_zero_cst (TREE_TYPE (p2)));
   (*params)[2] = p2;
 
   /* The rest of the parameters are fine. NULL means no special return value
--- gcc/cp/pt.cc.jj	2024-02-17 16:40:42.868571182 +0100
+++ gcc/cp/pt.cc	2024-02-20 10:57:36.646973603 +0100
@@ -20088,6 +20088,14 @@ tsubst_expr (tree t, tree args, tsubst_f
 	RETURN (r);
       }
 
+    case MEM_REF:
+      {
+	tree op0 = RECUR (TREE_OPERAND (t, 0));
+	tree op1 = RECUR (TREE_OPERAND (t, 0));
+	tree new_type = tsubst (TREE_TYPE (t), args, complain, in_decl);
+	RETURN (build2_loc (EXPR_LOCATION (t), MEM_REF, new_type, op0, op1));
+      }
+
     case NOP_EXPR:
       {
 	tree type = tsubst (TREE_TYPE (t), args, complain, in_decl);
--- gcc/testsuite/g++.dg/ext/atomic-5.C.jj	2024-02-20 10:57:36.647973589 +0100
+++ gcc/testsuite/g++.dg/ext/atomic-5.C	2024-02-20 10:57:36.647973589 +0100
@@ -0,0 +1,42 @@
+// { dg-do compile { target c++14 } }
+
+template <int N>
+void
+foo (long double *ptr, long double *val, long double *ret)
+{
+  __atomic_exchange (ptr, val, ret, __ATOMIC_RELAXED);
+}
+
+template <int N>
+bool
+bar (long double *ptr, long double *exp, long double *des)
+{
+  return __atomic_compare_exchange (ptr, exp, des, false,
+				    __ATOMIC_RELAXED, __ATOMIC_RELAXED);
+}
+
+bool
+baz (long double *p, long double *q, long double *r)
+{
+  foo<0> (p, q, r);
+  foo<1> (p + 1, q + 1, r + 1);
+  return bar<0> (p + 2, q + 2, r + 2) || bar<1> (p + 3, q + 3, r + 3);
+}
+
+constexpr int
+qux (long double *ptr, long double *val, long double *ret)
+{
+  __atomic_exchange (ptr, val, ret, __ATOMIC_RELAXED);
+  return 0;
+}
+
+constexpr bool
+corge (long double *ptr, long double *exp, long double *des)
+{
+  return __atomic_compare_exchange (ptr, exp, des, false,
+				    __ATOMIC_RELAXED, __ATOMIC_RELAXED);
+}
+
+long double a[6];
+const int b = qux (a, a + 1, a + 2);
+const bool c = corge (a + 3, a + 4, a + 5);


	Jakub


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] c-family, c++, v2: Fix up handling of types which may have padding in __atomic_{compare_}exchange
  2024-02-20 10:02                               ` [PATCH] c-family, c++, v2: " Jakub Jelinek
@ 2024-02-20 10:11                                 ` Richard Biener
  2024-02-20 10:27                                   ` Jakub Jelinek
  2024-03-07  0:00                                 ` Jason Merrill
  1 sibling, 1 reply; 28+ messages in thread
From: Richard Biener @ 2024-02-20 10:11 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Jason Merrill, gcc-patches, Jonathan Wakely

On Tue, 20 Feb 2024, Jakub Jelinek wrote:

> On Tue, Feb 20, 2024 at 09:01:10AM +0100, Richard Biener wrote:
> > I'm not sure those would be really equivalent (MEM_REF vs. V_C_E
> > as well as combined vs. split).  It really depends how RTL expansion
> > handles this (as you can see padding can be fun here).
> > 
> > So I'd be nervous for a match.pd rule here (also we can't match
> > memory defs).
> 
> Ok.  Perhaps forwprop then; anyway, that would be an optimization.
> 
> > As for your patch I'd go with a MEM_REF unconditionally, I don't
> > think we want different behavior whether there's padding or not ...
> 
> I've made it conditional so that the MEM_REFs don't appear that often in the
> FE trees, but maybe that is fine.
> 
> The unconditional patch would then be:
> 
> 2024-02-20  Jakub Jelinek  <jakub@redhat.com>
> 
> gcc/c-family/
> 	* c-common.cc (resolve_overloaded_atomic_exchange): Instead of setting
> 	p1 to VIEW_CONVERT_EXPR<I_type> (*p1), set it to MEM_REF with p1 and
> 	(typeof (p1)) 0 operands and I_type type.
> 	(resolve_overloaded_atomic_compare_exchange): Similarly for p2.
> gcc/cp/
> 	* pt.cc (tsubst_expr): Handle MEM_REF.
> gcc/testsuite/
> 	* g++.dg/ext/atomic-5.C: New test.
> 
> --- gcc/c-family/c-common.cc.jj	2024-02-17 16:40:42.831571693 +0100
> +++ gcc/c-family/c-common.cc	2024-02-20 10:58:56.599865656 +0100
> @@ -7793,9 +7793,14 @@ resolve_overloaded_atomic_exchange (loca
>    /* Convert object pointer to required type.  */
>    p0 = build1 (VIEW_CONVERT_EXPR, I_type_ptr, p0);
>    (*params)[0] = p0; 
> -  /* Convert new value to required type, and dereference it.  */
> -  p1 = build_indirect_ref (loc, p1, RO_UNARY_STAR);
> -  p1 = build1 (VIEW_CONVERT_EXPR, I_type, p1);
> +  /* Convert new value to required type, and dereference it.
> +     If *p1 type can have padding or may involve floating point which
> +     could e.g. be promoted to wider precision and demoted afterwards,
> +     state of padding bits might not be preserved.  */
> +  build_indirect_ref (loc, p1, RO_UNARY_STAR);
> +  p1 = build2_loc (loc, MEM_REF, I_type,
> +		   build1 (VIEW_CONVERT_EXPR, I_type_ptr, p1),

Why the V_C_E to I_type_ptr?  The type of p1 doesn't
really matter (unless it could be a non-pointer).

Also note that I_type needs to be properly address-space qualified
in case the access should be to an address-space.  Formerly with
the INDIRECT_REF that would likely be automagic.

> +		   build_zero_cst (TREE_TYPE (p1)));
>    (*params)[1] = p1;
>  
>    /* Move memory model to the 3rd position, and end param list.  */
> @@ -7873,9 +7878,14 @@ resolve_overloaded_atomic_compare_exchan
>    p1 = build1 (VIEW_CONVERT_EXPR, I_type_ptr, p1);
>    (*params)[1] = p1;
>  
> -  /* Convert desired value to required type, and dereference it.  */
> -  p2 = build_indirect_ref (loc, p2, RO_UNARY_STAR);
> -  p2 = build1 (VIEW_CONVERT_EXPR, I_type, p2);
> +  /* Convert desired value to required type, and dereference it.
> +     If *p2 type can have padding or may involve floating point which
> +     could e.g. be promoted to wider precision and demoted afterwards,
> +     state of padding bits might not be preserved.  */
> +  build_indirect_ref (loc, p2, RO_UNARY_STAR);
> +  p2 = build2_loc (loc, MEM_REF, I_type,
> +		   build1 (VIEW_CONVERT_EXPR, I_type_ptr, p2),
> +		   build_zero_cst (TREE_TYPE (p2)));
>    (*params)[2] = p2;
>  
>    /* The rest of the parameters are fine. NULL means no special return value
> --- gcc/cp/pt.cc.jj	2024-02-17 16:40:42.868571182 +0100
> +++ gcc/cp/pt.cc	2024-02-20 10:57:36.646973603 +0100
> @@ -20088,6 +20088,14 @@ tsubst_expr (tree t, tree args, tsubst_f
>  	RETURN (r);
>        }
>  
> +    case MEM_REF:
> +      {
> +	tree op0 = RECUR (TREE_OPERAND (t, 0));
> +	tree op1 = RECUR (TREE_OPERAND (t, 0));
> +	tree new_type = tsubst (TREE_TYPE (t), args, complain, in_decl);
> +	RETURN (build2_loc (EXPR_LOCATION (t), MEM_REF, new_type, op0, op1));
> +      }
> +
>      case NOP_EXPR:
>        {
>  	tree type = tsubst (TREE_TYPE (t), args, complain, in_decl);
> --- gcc/testsuite/g++.dg/ext/atomic-5.C.jj	2024-02-20 10:57:36.647973589 +0100
> +++ gcc/testsuite/g++.dg/ext/atomic-5.C	2024-02-20 10:57:36.647973589 +0100
> @@ -0,0 +1,42 @@
> +// { dg-do compile { target c++14 } }
> +
> +template <int N>
> +void
> +foo (long double *ptr, long double *val, long double *ret)
> +{
> +  __atomic_exchange (ptr, val, ret, __ATOMIC_RELAXED);
> +}
> +
> +template <int N>
> +bool
> +bar (long double *ptr, long double *exp, long double *des)
> +{
> +  return __atomic_compare_exchange (ptr, exp, des, false,
> +				    __ATOMIC_RELAXED, __ATOMIC_RELAXED);
> +}
> +
> +bool
> +baz (long double *p, long double *q, long double *r)
> +{
> +  foo<0> (p, q, r);
> +  foo<1> (p + 1, q + 1, r + 1);
> +  return bar<0> (p + 2, q + 2, r + 2) || bar<1> (p + 3, q + 3, r + 3);
> +}
> +
> +constexpr int
> +qux (long double *ptr, long double *val, long double *ret)
> +{
> +  __atomic_exchange (ptr, val, ret, __ATOMIC_RELAXED);
> +  return 0;
> +}
> +
> +constexpr bool
> +corge (long double *ptr, long double *exp, long double *des)
> +{
> +  return __atomic_compare_exchange (ptr, exp, des, false,
> +				    __ATOMIC_RELAXED, __ATOMIC_RELAXED);
> +}
> +
> +long double a[6];
> +const int b = qux (a, a + 1, a + 2);
> +const bool c = corge (a + 3, a + 4, a + 5);
> 
> 
> 	Jakub
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] c-family, c++, v2: Fix up handling of types which may have padding in __atomic_{compare_}exchange
  2024-02-20 10:11                                 ` Richard Biener
@ 2024-02-20 10:27                                   ` Jakub Jelinek
  0 siblings, 0 replies; 28+ messages in thread
From: Jakub Jelinek @ 2024-02-20 10:27 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jason Merrill, gcc-patches, Jonathan Wakely

On Tue, Feb 20, 2024 at 11:11:22AM +0100, Richard Biener wrote:
> > --- gcc/c-family/c-common.cc.jj	2024-02-17 16:40:42.831571693 +0100
> > +++ gcc/c-family/c-common.cc	2024-02-20 10:58:56.599865656 +0100
> > @@ -7793,9 +7793,14 @@ resolve_overloaded_atomic_exchange (loca
> >    /* Convert object pointer to required type.  */
> >    p0 = build1 (VIEW_CONVERT_EXPR, I_type_ptr, p0);
> >    (*params)[0] = p0; 
> > -  /* Convert new value to required type, and dereference it.  */
> > -  p1 = build_indirect_ref (loc, p1, RO_UNARY_STAR);
> > -  p1 = build1 (VIEW_CONVERT_EXPR, I_type, p1);
> > +  /* Convert new value to required type, and dereference it.
> > +     If *p1 type can have padding or may involve floating point which
> > +     could e.g. be promoted to wider precision and demoted afterwards,
> > +     state of padding bits might not be preserved.  */
> > +  build_indirect_ref (loc, p1, RO_UNARY_STAR);
> > +  p1 = build2_loc (loc, MEM_REF, I_type,
> > +		   build1 (VIEW_CONVERT_EXPR, I_type_ptr, p1),
> 
> Why the V_C_E to I_type_ptr?  The type of p1 doesn't
> really matter (unless it could be a non-pointer).

Just to help the FE when trying to constexpr evaluate it, because
for constexpr evaluation it evaluates MEM_REF the same as INDIRECT_REF
(and punts on non-0 second argument).  The actual call is non-constexpr,
just wanted to avoid ICEs or something similar, constexpr evaluation can
try to process the arguments (and succeed or fail, doesn't matter,
but not ICE) and then the call will not be constant expression and so
everything won't be.

> Also note that I_type needs to be properly address-space qualified
> in case the access should be to an address-space.  Formerly with
> the INDIRECT_REF that would likely be automagic.

I don't think using __atomic_*exchange on non-default as is valid,
if one doesn't have the exact __atomic_*exchange_N, it is handled
as a library call which is passed pointers and that definitely will
not deal with non-default address spaces.
Furthermore, the type should be the same in all arguments, and
the first argument is just converted to I_type_ptr and dealt with later, so
I don't think it ever worked even for the supported sizes.
int                                                                                                                                                                                  
foo (__seg_gs int *a, __seg_gs int *b, __seg_gs int *c)                                                                                                                                  
{                                                                                                                                                                                     
  return __atomic_compare_exchange (a, b, c, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED);                                                                                                     
}                                                                                                                                                                                     
results in
	movl	(%rsi), %eax
	movl	%gs:(%rdx), %edx
	lock cmpxchgl	%edx, (%rdi)
	sete	%dl
	je	.L2
	movl	%eax, (%rsi)
.L2:
i.e. pretty much random what is loaded from a different AS and what is not.

	Jakub


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] c-family, c++, v2: Fix up handling of types which may have padding in __atomic_{compare_}exchange
  2024-02-20 10:02                               ` [PATCH] c-family, c++, v2: " Jakub Jelinek
  2024-02-20 10:11                                 ` Richard Biener
@ 2024-03-07  0:00                                 ` Jason Merrill
  1 sibling, 0 replies; 28+ messages in thread
From: Jason Merrill @ 2024-03-07  0:00 UTC (permalink / raw)
  To: Jakub Jelinek, Richard Biener; +Cc: gcc-patches, Jonathan Wakely

On 2/20/24 05:02, Jakub Jelinek wrote:
> On Tue, Feb 20, 2024 at 09:01:10AM +0100, Richard Biener wrote:
>> I'm not sure those would be really equivalent (MEM_REF vs. V_C_E
>> as well as combined vs. split).  It really depends how RTL expansion
>> handles this (as you can see padding can be fun here).
>>
>> So I'd be nervous for a match.pd rule here (also we can't match
>> memory defs).
> 
> Ok.  Perhaps forwprop then; anyway, that would be an optimization.
> 
>> As for your patch I'd go with a MEM_REF unconditionally, I don't
>> think we want different behavior whether there's padding or not ...
> 
> I've made it conditional so that the MEM_REFs don't appear that often in the
> FE trees, but maybe that is fine.
> 
> The unconditional patch would then be:

OK.

> 2024-02-20  Jakub Jelinek  <jakub@redhat.com>
> 
> gcc/c-family/
> 	* c-common.cc (resolve_overloaded_atomic_exchange): Instead of setting
> 	p1 to VIEW_CONVERT_EXPR<I_type> (*p1), set it to MEM_REF with p1 and
> 	(typeof (p1)) 0 operands and I_type type.
> 	(resolve_overloaded_atomic_compare_exchange): Similarly for p2.
> gcc/cp/
> 	* pt.cc (tsubst_expr): Handle MEM_REF.
> gcc/testsuite/
> 	* g++.dg/ext/atomic-5.C: New test.
> 
> --- gcc/c-family/c-common.cc.jj	2024-02-17 16:40:42.831571693 +0100
> +++ gcc/c-family/c-common.cc	2024-02-20 10:58:56.599865656 +0100
> @@ -7793,9 +7793,14 @@ resolve_overloaded_atomic_exchange (loca
>     /* Convert object pointer to required type.  */
>     p0 = build1 (VIEW_CONVERT_EXPR, I_type_ptr, p0);
>     (*params)[0] = p0;
> -  /* Convert new value to required type, and dereference it.  */
> -  p1 = build_indirect_ref (loc, p1, RO_UNARY_STAR);
> -  p1 = build1 (VIEW_CONVERT_EXPR, I_type, p1);
> +  /* Convert new value to required type, and dereference it.
> +     If *p1 type can have padding or may involve floating point which
> +     could e.g. be promoted to wider precision and demoted afterwards,
> +     state of padding bits might not be preserved.  */
> +  build_indirect_ref (loc, p1, RO_UNARY_STAR);
> +  p1 = build2_loc (loc, MEM_REF, I_type,
> +		   build1 (VIEW_CONVERT_EXPR, I_type_ptr, p1),
> +		   build_zero_cst (TREE_TYPE (p1)));
>     (*params)[1] = p1;
>   
>     /* Move memory model to the 3rd position, and end param list.  */
> @@ -7873,9 +7878,14 @@ resolve_overloaded_atomic_compare_exchan
>     p1 = build1 (VIEW_CONVERT_EXPR, I_type_ptr, p1);
>     (*params)[1] = p1;
>   
> -  /* Convert desired value to required type, and dereference it.  */
> -  p2 = build_indirect_ref (loc, p2, RO_UNARY_STAR);
> -  p2 = build1 (VIEW_CONVERT_EXPR, I_type, p2);
> +  /* Convert desired value to required type, and dereference it.
> +     If *p2 type can have padding or may involve floating point which
> +     could e.g. be promoted to wider precision and demoted afterwards,
> +     state of padding bits might not be preserved.  */
> +  build_indirect_ref (loc, p2, RO_UNARY_STAR);
> +  p2 = build2_loc (loc, MEM_REF, I_type,
> +		   build1 (VIEW_CONVERT_EXPR, I_type_ptr, p2),
> +		   build_zero_cst (TREE_TYPE (p2)));
>     (*params)[2] = p2;
>   
>     /* The rest of the parameters are fine. NULL means no special return value
> --- gcc/cp/pt.cc.jj	2024-02-17 16:40:42.868571182 +0100
> +++ gcc/cp/pt.cc	2024-02-20 10:57:36.646973603 +0100
> @@ -20088,6 +20088,14 @@ tsubst_expr (tree t, tree args, tsubst_f
>   	RETURN (r);
>         }
>   
> +    case MEM_REF:
> +      {
> +	tree op0 = RECUR (TREE_OPERAND (t, 0));
> +	tree op1 = RECUR (TREE_OPERAND (t, 0));
> +	tree new_type = tsubst (TREE_TYPE (t), args, complain, in_decl);
> +	RETURN (build2_loc (EXPR_LOCATION (t), MEM_REF, new_type, op0, op1));
> +      }
> +
>       case NOP_EXPR:
>         {
>   	tree type = tsubst (TREE_TYPE (t), args, complain, in_decl);
> --- gcc/testsuite/g++.dg/ext/atomic-5.C.jj	2024-02-20 10:57:36.647973589 +0100
> +++ gcc/testsuite/g++.dg/ext/atomic-5.C	2024-02-20 10:57:36.647973589 +0100
> @@ -0,0 +1,42 @@
> +// { dg-do compile { target c++14 } }
> +
> +template <int N>
> +void
> +foo (long double *ptr, long double *val, long double *ret)
> +{
> +  __atomic_exchange (ptr, val, ret, __ATOMIC_RELAXED);
> +}
> +
> +template <int N>
> +bool
> +bar (long double *ptr, long double *exp, long double *des)
> +{
> +  return __atomic_compare_exchange (ptr, exp, des, false,
> +				    __ATOMIC_RELAXED, __ATOMIC_RELAXED);
> +}
> +
> +bool
> +baz (long double *p, long double *q, long double *r)
> +{
> +  foo<0> (p, q, r);
> +  foo<1> (p + 1, q + 1, r + 1);
> +  return bar<0> (p + 2, q + 2, r + 2) || bar<1> (p + 3, q + 3, r + 3);
> +}
> +
> +constexpr int
> +qux (long double *ptr, long double *val, long double *ret)
> +{
> +  __atomic_exchange (ptr, val, ret, __ATOMIC_RELAXED);
> +  return 0;
> +}
> +
> +constexpr bool
> +corge (long double *ptr, long double *exp, long double *des)
> +{
> +  return __atomic_compare_exchange (ptr, exp, des, false,
> +				    __ATOMIC_RELAXED, __ATOMIC_RELAXED);
> +}
> +
> +long double a[6];
> +const int b = qux (a, a + 1, a + 2);
> +const bool c = corge (a + 3, a + 4, a + 5);
> 
> 
> 	Jakub
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor
  2024-02-16 15:15                         ` Jonathan Wakely
@ 2024-03-14 15:13                           ` Jonathan Wakely
  2024-03-25 15:42                             ` [PING][PATCH] " xndcn
  0 siblings, 1 reply; 28+ messages in thread
From: Jonathan Wakely @ 2024-03-14 15:13 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: xndcn, Xi Ruoyao, H.J. Lu, GCC Patches, libstdc++

[-- Attachment #1: Type: text/plain, Size: 787 bytes --]

On Fri, 16 Feb 2024 at 15:15, Jonathan Wakely wrote:
>
> On Fri, 16 Feb 2024 at 14:10, Jakub Jelinek wrote:
> >
> > On Fri, Feb 16, 2024 at 01:51:54PM +0000, Jonathan Wakely wrote:
> > > Ah, although __atomic_compare_exchange only takes pointers, the
> > > compiler replaces that with a call to __atomic_compare_exchange_n
> > > which takes the newval by value, which presumably uses an 80-bit FP
> > > register and so the padding bits become indeterminate again.
> >
> > __atomic_compare_exchange_n only works with integers, so I guess
> > it is doing VIEW_CONVERT_EXPR (aka union-style type punning) on the
> > argument.
> >
> > Do you have preprocessed source for the testcase?
>
> Sent offlist.

Jakub fixed the compiler, so I've pushed the attached patch now.

Tested x86_64-linux.

[-- Attachment #2: patch.txt --]
[-- Type: text/plain, Size: 2795 bytes --]

commit 0adc8c5f146b108f99c4df09e43276e3a2419262
Author: xndcn <xndchn@gmail.com>
Date:   Fri Feb 16 11:00:13 2024

    libstdc++: Add missing clear_padding in __atomic_float constructor
    
    For 80-bit long double we need to clear the padding bits on
    construction.
    
    libstdc++-v3/ChangeLog:
    
            * include/bits/atomic_base.h (__atomic_float::__atomic_float(Fp)):
            Clear padding.
            * testsuite/29_atomics/atomic_float/compare_exchange_padding.cc:
            New test.
    
    Signed-off-by: xndcn <xndchn@gmail.com>
    
    Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

diff --git a/libstdc++-v3/include/bits/atomic_base.h b/libstdc++-v3/include/bits/atomic_base.h
index b857b441169..dd360302f80 100644
--- a/libstdc++-v3/include/bits/atomic_base.h
+++ b/libstdc++-v3/include/bits/atomic_base.h
@@ -1283,7 +1283,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
       constexpr
       __atomic_float(_Fp __t) : _M_fp(__t)
-      { }
+      { __atomic_impl::__clear_padding(_M_fp); }
 
       __atomic_float(const __atomic_float&) = delete;
       __atomic_float& operator=(const __atomic_float&) = delete;
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
new file mode 100644
index 00000000000..49626ac6651
--- /dev/null
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
@@ -0,0 +1,53 @@
+// { dg-do run { target c++20 } }
+// { dg-options "-O0" }
+// { dg-additional-options "[atomic_link_flags [get_multilibs]] -latomic" }
+
+#include <atomic>
+#include <cstring>
+#include <limits>
+#include <testsuite_hooks.h>
+
+template<typename T>
+void __attribute__((noinline,noipa))
+fill_padding(T& f)
+{
+  T mask;
+  std::memset(&mask, 0xff, sizeof(T));
+  __builtin_clear_padding(&mask);
+  unsigned char* ptr_f = (unsigned char*)&f;
+  unsigned char* ptr_mask = (unsigned char*)&mask;
+  for (unsigned i = 0; i < sizeof(T); i++)
+  {
+    if (ptr_mask[i] == 0x00)
+    {
+      ptr_f[i] = 0xff;
+    }
+  }
+}
+
+void
+test01()
+{
+  // test for long double with padding (float80)
+  if constexpr (std::numeric_limits<long double>::digits == 64)
+  {
+    long double f = 0.5f; // long double has padding bits on x86
+    fill_padding(f);
+    std::atomic<long double> as{ f }; // padding cleared on constructor
+    long double t = 1.5;
+
+    as.fetch_add(t);
+    long double s = f + t;
+    t = as.load();
+    VERIFY(s == t); // padding ignored on comparison
+    fill_padding(s);
+    VERIFY(as.compare_exchange_weak(s, f)); // padding cleared on cmpexchg
+    fill_padding(f);
+    VERIFY(as.compare_exchange_strong(f, t)); // padding cleared on cmpexchg
+  }
+}
+
+int main()
+{
+  test01();
+}

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PING][PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor
  2024-03-14 15:13                           ` Jonathan Wakely
@ 2024-03-25 15:42                             ` xndcn
  0 siblings, 0 replies; 28+ messages in thread
From: xndcn @ 2024-03-25 15:42 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: Jakub Jelinek, Xi Ruoyao, H.J. Lu, GCC Patches, libstdc++

[-- Attachment #1: Type: text/plain, Size: 966 bytes --]

Wow, thank you all, you guys!

在 2024年3月14日星期四，Jonathan Wakely <jwakely@redhat.com> 写道：

> On Fri, 16 Feb 2024 at 15:15, Jonathan Wakely wrote:
> >
> > On Fri, 16 Feb 2024 at 14:10, Jakub Jelinek wrote:
> > >
> > > On Fri, Feb 16, 2024 at 01:51:54PM +0000, Jonathan Wakely wrote:
> > > > Ah, although __atomic_compare_exchange only takes pointers, the
> > > > compiler replaces that with a call to __atomic_compare_exchange_n
> > > > which takes the newval by value, which presumably uses an 80-bit FP
> > > > register and so the padding bits become indeterminate again.
> > >
> > > __atomic_compare_exchange_n only works with integers, so I guess
> > > it is doing VIEW_CONVERT_EXPR (aka union-style type punning) on the
> > > argument.
> > >
> > > Do you have preprocessed source for the testcase?
> >
> > Sent offlist.
>
> Jakub fixed the compiler, so I've pushed the attached patch now.
>
> Tested x86_64-linux.
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2024-03-25 15:42 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-08  1:01 [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor xndcn
2024-01-15  0:45 ` [PING][PATCH] " xndcn
2024-01-15  3:46 ` [PATCH] " H.J. Lu
2024-01-16  9:53   ` xndcn
2024-01-16 10:12     ` Xi Ruoyao
2024-01-16 16:16       ` xndcn
2024-01-24 15:53         ` xndcn
2024-01-30 16:08         ` [PING][PATCH] " xndcn
2024-01-30 16:31         ` [PATCH] " Jonathan Wakely
2024-01-30 16:34         ` Jonathan Wakely
2024-01-30 16:50           ` Jakub Jelinek
2024-01-31 17:19             ` xndcn
2024-02-01 13:54               ` Jonathan Wakely
2024-02-02 16:52                 ` xndcn
2024-02-16 12:38                   ` Jonathan Wakely
2024-02-16 13:51                     ` Jonathan Wakely
2024-02-16 14:10                       ` Jakub Jelinek
2024-02-16 15:15                         ` Jonathan Wakely
2024-03-14 15:13                           ` Jonathan Wakely
2024-03-25 15:42                             ` [PING][PATCH] " xndcn
2024-02-19  7:55                       ` [PATCH] c-family, c++: Fix up handling of types which may have padding in __atomic_{compare_}exchange Jakub Jelinek
2024-02-20  0:12                         ` Jason Merrill
2024-02-20  0:51                           ` Jakub Jelinek
2024-02-20  8:01                             ` Richard Biener
2024-02-20 10:02                               ` [PATCH] c-family, c++, v2: " Jakub Jelinek
2024-02-20 10:11                                 ` Richard Biener
2024-02-20 10:27                                   ` Jakub Jelinek
2024-03-07  0:00                                 ` Jason Merrill

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).