* [PATCH] libgcc: Add a weak stub for __sync_synchronize
@ 2020-11-03 15:08 Bernd Edlinger
2020-11-17 5:43 ` [PING] " Bernd Edlinger
2020-11-17 12:44 ` Richard Earnshaw (lists)
0 siblings, 2 replies; 17+ messages in thread
From: Bernd Edlinger @ 2020-11-03 15:08 UTC (permalink / raw)
To: gcc-patches, Ramana Radhakrishnan, Richard Earnshaw
[-- Attachment #1: Type: text/plain, Size: 545 bytes --]
Hi,
this fixes a problem with a missing symbol __sync_synchronize
which happens when newlib is used together with libstdc++ for
the non-threaded simulator target arm-none-eabi.
There are several questions on stackoverflow about this issue.
I would like to add a weak symbol for this target, since this
is only a default implementation and not meant to override a
possibly more sophisticated synchronization function from the
c-runtime.
Regression tested successfully on arm-none-eabi with newlib-3.3.0.
Is it OK for trunk?
Thanks
Bernd.
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Add-a-weak-stub-for-__sync_synchronize.patch --]
[-- Type: text/x-patch; name="0001-Add-a-weak-stub-for-__sync_synchronize.patch", Size: 3334 bytes --]
From f8a3df552f4b98308096659c66b4c8ea68580f25 Mon Sep 17 00:00:00 2001
From: Bernd Edlinger <bernd.edlinger@hotmail.de>
Date: Mon, 2 Nov 2020 11:43:44 +0100
Subject: [PATCH] libgcc: Add a weak stub for __sync_synchronize
This patch adds a default implementation for __sync_synchronize,
which prevents many unresolved symbol errors on arm-none-eabi.
This happens often in C++ programs even without any threading.
libgcc:
2020-11-02 Bernd Edlinger <bernd.edlinger@hotmail.de>
* config.host: Use t-eabi for arm-none-eabi.
* config/arm/t-eabi: New.
* config/arm/eabi-sync.c: New.
---
libgcc/config.host | 2 +-
libgcc/config/arm/eabi-sync.c | 39 +++++++++++++++++++++++++++++++++++++++
libgcc/config/arm/t-eabi | 1 +
3 files changed, 41 insertions(+), 1 deletion(-)
create mode 100644 libgcc/config/arm/eabi-sync.c
create mode 100644 libgcc/config/arm/t-eabi
diff --git a/libgcc/config.host b/libgcc/config.host
index fd8e55e..e25abf4 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -495,7 +495,7 @@ arm*-*-eabi* | arm*-*-symbianelf* | arm*-*-rtems*)
tm_file="$tm_file arm/bpabi-lib.h"
case ${host} in
arm*-*-eabi* | arm*-*-rtems*)
- tmake_file="${tmake_file} arm/t-bpabi t-crtfm"
+ tmake_file="${tmake_file} arm/t-bpabi t-crtfm arm/t-eabi"
extra_parts="crtbegin.o crtend.o crti.o crtn.o"
;;
arm*-*-symbianelf*)
diff --git a/libgcc/config/arm/eabi-sync.c b/libgcc/config/arm/eabi-sync.c
new file mode 100644
index 0000000..bffdd4a
--- /dev/null
+++ b/libgcc/config/arm/eabi-sync.c
@@ -0,0 +1,39 @@
+/* Copyright (C) 2020 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+<http://www.gnu.org/licenses/>. */
+
+void __attribute__ ((weak))
+__sync_synchronize (void)
+{
+#if defined (__ARM_ARCH_6__) || defined (__ARM_ARCH_6J__) \
+ || defined (__ARM_ARCH_6K__) || defined (__ARM_ARCH_6T2__) \
+ || defined (__ARM_ARCH_6Z__) || defined (__ARM_ARCH_6ZK__) \
+ || defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)
+#if defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)
+ __asm __volatile ("dmb" : : : "memory");
+#else
+ __asm __volatile ("mcr p15, 0, r0, c7, c10, 5" : : : "memory");
+#endif
+#else
+ __asm __volatile ("nop" : : : "memory");
+#endif
+}
diff --git a/libgcc/config/arm/t-eabi b/libgcc/config/arm/t-eabi
new file mode 100644
index 0000000..556032f
--- /dev/null
+++ b/libgcc/config/arm/t-eabi
@@ -0,0 +1 @@
+LIB2ADD_ST += $(srcdir)/config/arm/eabi-sync.c
--
1.9.1
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PING] [PATCH] libgcc: Add a weak stub for __sync_synchronize
2020-11-03 15:08 [PATCH] libgcc: Add a weak stub for __sync_synchronize Bernd Edlinger
@ 2020-11-17 5:43 ` Bernd Edlinger
2020-11-17 12:44 ` Richard Earnshaw (lists)
1 sibling, 0 replies; 17+ messages in thread
From: Bernd Edlinger @ 2020-11-17 5:43 UTC (permalink / raw)
To: gcc-patches, Ramana Radhakrishnan, Richard Earnshaw
Ping...
I'd like to ping for this patch:
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/557886.html
Thanks
Bernd.
On 11/3/20 4:08 PM, Bernd Edlinger wrote:
> Hi,
>
> this fixes a problem with a missing symbol __sync_synchronize
> which happens when newlib is used together with libstdc++ for
> the non-threaded simulator target arm-none-eabi.
>
> There are several questions on stackoverflow about this issue.
>
> I would like to add a weak symbol for this target, since this
> is only a default implementation and not meant to override a
> possibly more sophisticated synchronization function from the
> c-runtime.
>
>
> Regression tested successfully on arm-none-eabi with newlib-3.3.0.
>
> Is it OK for trunk?
>
>
> Thanks
> Bernd.
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] libgcc: Add a weak stub for __sync_synchronize
2020-11-03 15:08 [PATCH] libgcc: Add a weak stub for __sync_synchronize Bernd Edlinger
2020-11-17 5:43 ` [PING] " Bernd Edlinger
@ 2020-11-17 12:44 ` Richard Earnshaw (lists)
2020-11-17 15:18 ` Bernd Edlinger
1 sibling, 1 reply; 17+ messages in thread
From: Richard Earnshaw (lists) @ 2020-11-17 12:44 UTC (permalink / raw)
To: Bernd Edlinger, gcc-patches, Ramana Radhakrishnan
On 03/11/2020 15:08, Bernd Edlinger wrote:
> Hi,
>
> this fixes a problem with a missing symbol __sync_synchronize
> which happens when newlib is used together with libstdc++ for
> the non-threaded simulator target arm-none-eabi.
>
> There are several questions on stackoverflow about this issue.
>
> I would like to add a weak symbol for this target, since this
> is only a default implementation and not meant to override a
> possibly more sophisticated synchronization function from the
> c-runtime.
>
>
> Regression tested successfully on arm-none-eabi with newlib-3.3.0.
>
> Is it OK for trunk?
>
>
> Thanks
> Bernd.
>
I seem to recall that this was a deliberate decision - you can't guess
this correctly, at least when trying to build portable code - you just
have to know which runtime you will be using.
I think Ramana had some changes in the works at one point to address
(some) of this, but I'm not sure what happened to them. Ramana?
+#if defined (__ARM_ARCH_6__) || defined (__ARM_ARCH_6J__) \
+ || defined (__ARM_ARCH_6K__) || defined (__ARM_ARCH_6T2__) \
+ || defined (__ARM_ARCH_6Z__) || defined (__ARM_ARCH_6ZK__) \
+ || defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)
+#if defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)
Ug, no! Use the ACLE macros to avoid this sort of mess.
R.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] libgcc: Add a weak stub for __sync_synchronize
2020-11-17 12:44 ` Richard Earnshaw (lists)
@ 2020-11-17 15:18 ` Bernd Edlinger
2020-11-17 15:41 ` Richard Earnshaw (lists)
0 siblings, 1 reply; 17+ messages in thread
From: Bernd Edlinger @ 2020-11-17 15:18 UTC (permalink / raw)
To: Richard Earnshaw (lists), gcc-patches, Ramana Radhakrishnan
[-- Attachment #1: Type: text/plain, Size: 2842 bytes --]
On 11/17/20 1:44 PM, Richard Earnshaw (lists) wrote:
> On 03/11/2020 15:08, Bernd Edlinger wrote:
>> Hi,
>>
>> this fixes a problem with a missing symbol __sync_synchronize
>> which happens when newlib is used together with libstdc++ for
>> the non-threaded simulator target arm-none-eabi.
>>
>> There are several questions on stackoverflow about this issue.
>>
>> I would like to add a weak symbol for this target, since this
>> is only a default implementation and not meant to override a
>> possibly more sophisticated synchronization function from the
>> c-runtime.
>>
>>
>> Regression tested successfully on arm-none-eabi with newlib-3.3.0.
>>
>> Is it OK for trunk?
>>
>>
>> Thanks
>> Bernd.
>>
>
> I seem to recall that this was a deliberate decision - you can't guess
> this correctly, at least when trying to build portable code - you just
> have to know which runtime you will be using.
>
Therefore I suggest to use the weak attribute. It is on purpose not
implementing all of the atomics.
The use case, is a C++ program which initializes a local static variable.
$ cat test.cc
#include <string>
main(int argc, char **argv)
{
static std::string x = "test";
return 0;
}
compiles to this:
sub sp, sp, #20
str r0, [fp, #-24]
str r1, [fp, #-28]
ldr r3, .L14
ldrb r4, [r3]
bl __sync_synchronize
and r3, r4, #255
and r3, r3, #1
cmp r3, #0
moveq r3, #1
movne r3, #0
and r3, r3, #255
cmp r3, #0
beq .L8
ldr r0, .L14
bl __cxa_guard_acquire
mov r3, r0
so __sync_synchronize is not defined in newlib since the target (arm-sim)
is known to be not multi-threaded,
but __cxa_guard_acquire is also not a thread safe function,
because __GTHREADS is not defined by libgcc, since it is known
at configure time, that the target does not support threads.
So libstdc++ does not try to use a mutex or any atomics either,
because it is not compiled with __GTHREADS.
I can further narrow down the patch by only defining this function when
__GTHREADS is not defined, to make it more clear.
> I think Ramana had some changes in the works at one point to address
> (some) of this, but I'm not sure what happened to them. Ramana?
>
>
> +#if defined (__ARM_ARCH_6__) || defined (__ARM_ARCH_6J__) \
> + || defined (__ARM_ARCH_6K__) || defined (__ARM_ARCH_6T2__) \
> + || defined (__ARM_ARCH_6Z__) || defined (__ARM_ARCH_6ZK__) \
> + || defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)
> +#if defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)
>
> Ug, no! Use the ACLE macros to avoid this sort of mess.
>
Ah, thanks, copy-paste from freebsd-atomic.c :)
I've attached the updated patch.
Is it OK?
Thanks
Bernd.
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-libgcc-Add-a-weak-stub-for-__sync_synchronize.patch --]
[-- Type: text/x-patch; name="0001-libgcc-Add-a-weak-stub-for-__sync_synchronize.patch", Size: 3087 bytes --]
From ca44e1fcb4b991306cbcde6293d20c77ce74ad68 Mon Sep 17 00:00:00 2001
From: Bernd Edlinger <bernd.edlinger@hotmail.de>
Date: Mon, 2 Nov 2020 11:43:44 +0100
Subject: [PATCH] libgcc: Add a weak stub for __sync_synchronize
This patch adds a default implementation for __sync_synchronize,
which prevents many unresolved symbol errors on arm-none-eabi.
This happens often in C++ programs even without any threading.
libgcc:
2020-11-17 Bernd Edlinger <bernd.edlinger@hotmail.de>
* config.host: Use t-eabi for arm-none-eabi.
* config/arm/t-eabi: New.
* config/arm/eabi-sync.c: New.
---
libgcc/config.host | 2 +-
libgcc/config/arm/eabi-sync.c | 38 ++++++++++++++++++++++++++++++++++++++
libgcc/config/arm/t-eabi | 1 +
3 files changed, 40 insertions(+), 1 deletion(-)
create mode 100644 libgcc/config/arm/eabi-sync.c
create mode 100644 libgcc/config/arm/t-eabi
diff --git a/libgcc/config.host b/libgcc/config.host
index 66af834..eaf74f1 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -496,7 +496,7 @@ arm*-*-eabi* | arm*-*-symbianelf* | arm*-*-rtems*)
tm_file="$tm_file arm/bpabi-lib.h"
case ${host} in
arm*-*-eabi* | arm*-*-rtems*)
- tmake_file="${tmake_file} arm/t-bpabi t-crtfm"
+ tmake_file="${tmake_file} arm/t-bpabi t-crtfm arm/t-eabi"
extra_parts="crtbegin.o crtend.o crti.o crtn.o"
;;
arm*-*-symbianelf*)
diff --git a/libgcc/config/arm/eabi-sync.c b/libgcc/config/arm/eabi-sync.c
new file mode 100644
index 0000000..c37bacf
--- /dev/null
+++ b/libgcc/config/arm/eabi-sync.c
@@ -0,0 +1,38 @@
+/* Copyright (C) 2020 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+<http://www.gnu.org/licenses/>. */
+
+#include "gthr.h"
+
+#ifndef __GTHREADS
+void __attribute__ ((weak))
+__sync_synchronize (void)
+{
+#if __ARM_ARCH >= 7
+ __asm __volatile ("dmb" : : : "memory");
+#elif ARM_ARCH == 6
+ __asm __volatile ("mcr p15, 0, r0, c7, c10, 5" : : : "memory");
+#else
+ __asm __volatile ("nop" : : : "memory");
+#endif
+}
+#endif
diff --git a/libgcc/config/arm/t-eabi b/libgcc/config/arm/t-eabi
new file mode 100644
index 0000000..556032f
--- /dev/null
+++ b/libgcc/config/arm/t-eabi
@@ -0,0 +1 @@
+LIB2ADD_ST += $(srcdir)/config/arm/eabi-sync.c
--
1.9.1
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] libgcc: Add a weak stub for __sync_synchronize
2020-11-17 15:18 ` Bernd Edlinger
@ 2020-11-17 15:41 ` Richard Earnshaw (lists)
2020-11-17 15:51 ` Christophe Lyon
` (2 more replies)
0 siblings, 3 replies; 17+ messages in thread
From: Richard Earnshaw (lists) @ 2020-11-17 15:41 UTC (permalink / raw)
To: Bernd Edlinger, gcc-patches, Ramana Radhakrishnan
On 17/11/2020 15:18, Bernd Edlinger wrote:
> On 11/17/20 1:44 PM, Richard Earnshaw (lists) wrote:
>> On 03/11/2020 15:08, Bernd Edlinger wrote:
>>> Hi,
>>>
>>> this fixes a problem with a missing symbol __sync_synchronize
>>> which happens when newlib is used together with libstdc++ for
>>> the non-threaded simulator target arm-none-eabi.
>>>
>>> There are several questions on stackoverflow about this issue.
>>>
>>> I would like to add a weak symbol for this target, since this
>>> is only a default implementation and not meant to override a
>>> possibly more sophisticated synchronization function from the
>>> c-runtime.
>>>
>>>
>>> Regression tested successfully on arm-none-eabi with newlib-3.3.0.
>>>
>>> Is it OK for trunk?
>>>
>>>
>>> Thanks
>>> Bernd.
>>>
>>
>> I seem to recall that this was a deliberate decision - you can't guess
>> this correctly, at least when trying to build portable code - you just
>> have to know which runtime you will be using.
>>
>
> Therefore I suggest to use the weak attribute. It is on purpose not
> implementing all of the atomics.
>
> The use case, is a C++ program which initializes a local static variable.
>
> $ cat test.cc
> #include <string>
> main(int argc, char **argv)
> {
> static std::string x = "test";
> return 0;
> }
>
> compiles to this:
> sub sp, sp, #20
> str r0, [fp, #-24]
> str r1, [fp, #-28]
> ldr r3, .L14
> ldrb r4, [r3]
> bl __sync_synchronize
> and r3, r4, #255
> and r3, r3, #1
> cmp r3, #0
> moveq r3, #1
> movne r3, #0
> and r3, r3, #255
> cmp r3, #0
> beq .L8
> ldr r0, .L14
> bl __cxa_guard_acquire
> mov r3, r0
>
> so __sync_synchronize is not defined in newlib since the target (arm-sim)
> is known to be not multi-threaded,
> but __cxa_guard_acquire is also not a thread safe function,
> because __GTHREADS is not defined by libgcc, since it is known
> at configure time, that the target does not support threads.
> So libstdc++ does not try to use a mutex or any atomics either,
> because it is not compiled with __GTHREADS.
>
> I can further narrow down the patch by only defining this function when
> __GTHREADS is not defined, to make it more clear.
>
>
>> I think Ramana had some changes in the works at one point to address
>> (some) of this, but I'm not sure what happened to them. Ramana?
>>
>>
>> +#if defined (__ARM_ARCH_6__) || defined (__ARM_ARCH_6J__) \
>> + || defined (__ARM_ARCH_6K__) || defined (__ARM_ARCH_6T2__) \
>> + || defined (__ARM_ARCH_6Z__) || defined (__ARM_ARCH_6ZK__) \
>> + || defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)
>> +#if defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)
>>
>> Ug, no! Use the ACLE macros to avoid this sort of mess.
>>
>
> Ah, thanks, copy-paste from freebsd-atomic.c :)
>
>
> I've attached the updated patch.
> Is it OK?
>
>
> Thanks
> Bernd.
>
libgcc is *still* the wrong place for this. It belongs in the system
library (eg newlib, or glibc, or whatever), which knows about the system
it's running on. (Sorry, I should have said this before, but I've
context-switched this out since it's been a long time since it came up).
This hack will just lead to silent code failure of the worst kind
(non-reproducable, racy) at runtime.
R.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] libgcc: Add a weak stub for __sync_synchronize
2020-11-17 15:41 ` Richard Earnshaw (lists)
@ 2020-11-17 15:51 ` Christophe Lyon
2020-11-17 17:17 ` Bernd Edlinger
2020-11-22 8:05 ` [PATCH] Avoid atomic for guard acquire when that is expensive Bernd Edlinger
2 siblings, 0 replies; 17+ messages in thread
From: Christophe Lyon @ 2020-11-17 15:51 UTC (permalink / raw)
To: Richard Earnshaw (lists)
Cc: Bernd Edlinger, gcc-patches, Ramana Radhakrishnan
On Tue, 17 Nov 2020 at 16:41, Richard Earnshaw (lists) via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On 17/11/2020 15:18, Bernd Edlinger wrote:
> > On 11/17/20 1:44 PM, Richard Earnshaw (lists) wrote:
> >> On 03/11/2020 15:08, Bernd Edlinger wrote:
> >>> Hi,
> >>>
> >>> this fixes a problem with a missing symbol __sync_synchronize
> >>> which happens when newlib is used together with libstdc++ for
> >>> the non-threaded simulator target arm-none-eabi.
> >>>
> >>> There are several questions on stackoverflow about this issue.
> >>>
> >>> I would like to add a weak symbol for this target, since this
> >>> is only a default implementation and not meant to override a
> >>> possibly more sophisticated synchronization function from the
> >>> c-runtime.
> >>>
> >>>
> >>> Regression tested successfully on arm-none-eabi with newlib-3.3.0.
> >>>
> >>> Is it OK for trunk?
> >>>
> >>>
> >>> Thanks
> >>> Bernd.
> >>>
> >>
> >> I seem to recall that this was a deliberate decision - you can't guess
> >> this correctly, at least when trying to build portable code - you just
> >> have to know which runtime you will be using.
> >>
> >
> > Therefore I suggest to use the weak attribute. It is on purpose not
> > implementing all of the atomics.
> >
> > The use case, is a C++ program which initializes a local static variable.
> >
> > $ cat test.cc
> > #include <string>
> > main(int argc, char **argv)
> > {
> > static std::string x = "test";
> > return 0;
> > }
> >
> > compiles to this:
> > sub sp, sp, #20
> > str r0, [fp, #-24]
> > str r1, [fp, #-28]
> > ldr r3, .L14
> > ldrb r4, [r3]
> > bl __sync_synchronize
> > and r3, r4, #255
> > and r3, r3, #1
> > cmp r3, #0
> > moveq r3, #1
> > movne r3, #0
> > and r3, r3, #255
> > cmp r3, #0
> > beq .L8
> > ldr r0, .L14
> > bl __cxa_guard_acquire
> > mov r3, r0
> >
> > so __sync_synchronize is not defined in newlib since the target (arm-sim)
> > is known to be not multi-threaded,
> > but __cxa_guard_acquire is also not a thread safe function,
> > because __GTHREADS is not defined by libgcc, since it is known
> > at configure time, that the target does not support threads.
> > So libstdc++ does not try to use a mutex or any atomics either,
> > because it is not compiled with __GTHREADS.
> >
> > I can further narrow down the patch by only defining this function when
> > __GTHREADS is not defined, to make it more clear.
> >
> >
> >> I think Ramana had some changes in the works at one point to address
> >> (some) of this, but I'm not sure what happened to them. Ramana?
> >>
> >>
> >> +#if defined (__ARM_ARCH_6__) || defined (__ARM_ARCH_6J__) \
> >> + || defined (__ARM_ARCH_6K__) || defined (__ARM_ARCH_6T2__) \
> >> + || defined (__ARM_ARCH_6Z__) || defined (__ARM_ARCH_6ZK__) \
> >> + || defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)
> >> +#if defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)
> >>
> >> Ug, no! Use the ACLE macros to avoid this sort of mess.
> >>
> >
> > Ah, thanks, copy-paste from freebsd-atomic.c :)
> >
> >
> > I've attached the updated patch.
> > Is it OK?
> >
> >
> > Thanks
> > Bernd.
> >
>
> libgcc is *still* the wrong place for this. It belongs in the system
> library (eg newlib, or glibc, or whatever), which knows about the system
> it's running on. (Sorry, I should have said this before, but I've
> context-switched this out since it's been a long time since it came up).
>
> This hack will just lead to silent code failure of the worst kind
> (non-reproducable, racy) at runtime.
>
I haven't fully re-read the thread, but I think this is related to an
old discussion,
not very well archived in:
https://gcc.gnu.org/pipermail/gcc-patches/2016-November/462299.html
There's a pointer to a newlib patch from Ramana.
> R.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] libgcc: Add a weak stub for __sync_synchronize
2020-11-17 15:41 ` Richard Earnshaw (lists)
2020-11-17 15:51 ` Christophe Lyon
@ 2020-11-17 17:17 ` Bernd Edlinger
2020-11-22 8:05 ` [PATCH] Avoid atomic for guard acquire when that is expensive Bernd Edlinger
2 siblings, 0 replies; 17+ messages in thread
From: Bernd Edlinger @ 2020-11-17 17:17 UTC (permalink / raw)
To: Richard Earnshaw (lists), gcc-patches, Ramana Radhakrishnan
On 11/17/20 4:41 PM, Richard Earnshaw (lists) wrote:
>
> libgcc is *still* the wrong place for this. It belongs in the system
> library (eg newlib, or glibc, or whatever), which knows about the system
> it's running on. (Sorry, I should have said this before, but I've
> context-switched this out since it's been a long time since it came up).
>
No problem. I just saw it from the other end.
It is odd that this problem does not go away even if gcc is configured
with --disable-threads, which should be the default for arm-none-eabi
anyway.
If we assume a threaded environment then it is still libgcc
which does not define __GTHREADS in libgcc/gthr.h, and libstdc++'s
__cxa_guard_acquire is not making use of functions like __gthread_mutex_lock.
But that appears to be this way by design.
Of course the race is not fixed if you ask newlib to implement just this
__sync_synchronize function.
> This hack will just lead to silent code failure of the worst kind
> (non-reproducable, racy) at runtime.
>
So in a arm-none-eabi system with armv6 or higher where the intrinsic
__sync_synchronize is not a library call but an instruction
we have exactly this worst kind scenario, already.
It is however possible that the default of -fthreadsafe_statics
is inappropriate for --disable-threads ?
Bernd.
> R.
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH] Avoid atomic for guard acquire when that is expensive
2020-11-17 15:41 ` Richard Earnshaw (lists)
2020-11-17 15:51 ` Christophe Lyon
2020-11-17 17:17 ` Bernd Edlinger
@ 2020-11-22 8:05 ` Bernd Edlinger
2020-11-24 22:10 ` Jason Merrill
2020-11-30 20:08 ` [PING] " Bernd Edlinger
2 siblings, 2 replies; 17+ messages in thread
From: Bernd Edlinger @ 2020-11-22 8:05 UTC (permalink / raw)
To: Richard Earnshaw (lists),
gcc-patches, Ramana Radhakrishnan, Jason Merrill, Nathan Sidwell,
Christophe Lyon
[-- Attachment #1: Type: text/plain, Size: 755 bytes --]
Hi,
this avoids the need to use -fno-threadsafe-statics on
arm-none-eabi or working around that problem by supplying
a dummy __sync_synchronize function which might
just lead to silent code failure of the worst kind
(non-reproducable, racy) at runtime, as was pointed out
on previous discussions here.
When the atomic access involves a call to __sync_synchronize
it is better to call __cxa_guard_acquire unconditionally,
since it handles the atomics too, or is a non-threaded
implementation when there is no gthread support for this target.
This fixes also a bug for the ARM EABI big-endian target,
that is, previously the wrong bit was checked.
Regression tested successfully on arm-none-eabi with newlib-3.3.0.
Is it OK for trunk?
Thanks
Bernd.
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Avoid-atomic-for-guard-acquire-when-that-is-expensiv.patch --]
[-- Type: text/x-patch; name="0001-Avoid-atomic-for-guard-acquire-when-that-is-expensiv.patch", Size: 5984 bytes --]
From 9fd070407408cf10789f5e9eb5ddda2fb3798e6f Mon Sep 17 00:00:00 2001
From: Bernd Edlinger <bernd.edlinger@hotmail.de>
Date: Sun, 22 Nov 2020 08:11:14 +0100
Subject: [PATCH] Avoid atomic for guard acquire when that is expensive
When the atomic access involves a call to __sync_synchronize
it is better to call __cxa_guard_acquire unconditionally,
since it handles the atomics too, or is a non-threaded
implementation when there is no gthread support for this target.
This fixes also a bug for the ARM EABI big-endian target,
that is, previously the wrong bit was checked.
gcc:
2020-11-22 Bernd Edlinger <bernd.edlinger@hotmail.de>
* target.def (guard_atomic_expensive): New hook.
* doc/tm.texi: Document TARGET_CXX_GUARD_ATOMIC_EXPENSIVE.
* doc/tm.texi.in: Likewise.
* config/arm/arm.c (arm_cxx_guard_atomic_expensive): New callback.
gcc/cp:
2020-11-22 Bernd Edlinger <bernd.edlinger@hotmail.de>
* decl2.c: (build_atomic_load_byte): Rename to...
(build_atomic_load_type): ... and add new parameter type.
(get_guard_cond): Skip the atomic here if that is expensive.
Use the correct type for the atomic load on certain targets.
---
gcc/config/arm/arm.c | 13 +++++++++++++
gcc/cp/decl2.c | 12 ++++++++----
gcc/doc/tm.texi | 7 +++++++
gcc/doc/tm.texi.in | 2 ++
gcc/target.def | 10 ++++++++++
5 files changed, 40 insertions(+), 4 deletions(-)
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 04190b1..04ca1fe 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -235,6 +235,7 @@ static rtx arm_dwarf_register_span (rtx);
static tree arm_cxx_guard_type (void);
static bool arm_cxx_guard_mask_bit (void);
+static bool arm_cxx_guard_atomic_expensive (void);
static tree arm_get_cookie_size (tree);
static bool arm_cookie_has_size (void);
static bool arm_cxx_cdtor_returns_this (void);
@@ -593,6 +594,9 @@ static const struct attribute_spec arm_attribute_table[] =
#undef TARGET_CXX_GUARD_MASK_BIT
#define TARGET_CXX_GUARD_MASK_BIT arm_cxx_guard_mask_bit
+#undef TARGET_CXX_GUARD_ATOMIC_EXPENSIVE
+#define TARGET_CXX_GUARD_ATOMIC_EXPENSIVE arm_cxx_guard_atomic_expensive
+
#undef TARGET_CXX_GET_COOKIE_SIZE
#define TARGET_CXX_GET_COOKIE_SIZE arm_get_cookie_size
@@ -28882,6 +28886,15 @@ arm_cxx_guard_mask_bit (void)
}
+/* Return true if atomics involve a call to __sync_synchronize. */
+
+static bool
+arm_cxx_guard_atomic_expensive (void)
+{
+ return TARGET_AAPCS_BASED && !TARGET_HAVE_DMB && !TARGET_HAVE_DMB_MCR;
+}
+
+
/* The EABI specifies that all array cookies are 8 bytes long. */
static tree
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 1bc7b7e..e2b29a6 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -3300,15 +3300,15 @@ get_guard (tree decl)
/* Return an atomic load of src with the appropriate memory model. */
static tree
-build_atomic_load_byte (tree src, HOST_WIDE_INT model)
+build_atomic_load_type (tree src, HOST_WIDE_INT model, tree type)
{
- tree ptr_type = build_pointer_type (char_type_node);
+ tree ptr_type = build_pointer_type (type);
tree mem_model = build_int_cst (integer_type_node, model);
tree t, addr, val;
unsigned int size;
int fncode;
- size = tree_to_uhwi (TYPE_SIZE_UNIT (char_type_node));
+ size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
fncode = BUILT_IN_ATOMIC_LOAD_N + exact_log2 (size) + 1;
t = builtin_decl_implicit ((enum built_in_function) fncode);
@@ -3350,8 +3350,12 @@ get_guard_cond (tree guard, bool thread_safe)
if (!thread_safe)
guard = get_guard_bits (guard);
+ else if (targetm.cxx.guard_atomic_expensive ())
+ guard = integer_zero_node;
else
- guard = build_atomic_load_byte (guard, MEMMODEL_ACQUIRE);
+ guard = build_atomic_load_type (guard, MEMMODEL_ACQUIRE,
+ targetm.cxx.guard_mask_bit ()
+ ? TREE_TYPE (guard) : char_type_node);
/* Mask off all but the low bit. */
if (targetm.cxx.guard_mask_bit ())
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 2b88f78..92215cf 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -10728,6 +10728,13 @@ This hook determines how guard variables are used. It should return
@code{true} indicates that only the least significant bit should be used.
@end deftypefn
+@deftypefn {Target Hook} bool TARGET_CXX_GUARD_ATOMIC_EXPENSIVE (void)
+This hook determines if the guard atomic is expensive. It should return
+@code{false} (the default) if the atomic is inexpensive. A return value of
+@code{true} indicates that the atomic is expensive i.e., involves a call to
+__sync_synchronize. In this case let __cxa_guard_acquire handle the atomics.
+@end deftypefn
+
@deftypefn {Target Hook} tree TARGET_CXX_GET_COOKIE_SIZE (tree @var{type})
This hook returns the size of the cookie to use when allocating an array
whose elements have the indicated @var{type}. Assumes that it is already
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 897f289..ce1d837 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -7321,6 +7321,8 @@ floating-point support; they are not included in this mechanism.
@hook TARGET_CXX_GUARD_MASK_BIT
+@hook TARGET_CXX_GUARD_ATOMIC_EXPENSIVE
+
@hook TARGET_CXX_GET_COOKIE_SIZE
@hook TARGET_CXX_COOKIE_HAS_SIZE
diff --git a/gcc/target.def b/gcc/target.def
index 810d554..0c02d5c 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -6160,6 +6160,16 @@ DEFHOOK
bool, (void),
hook_bool_void_false)
+/* Return true if the guard atomic is expensive. */
+DEFHOOK
+(guard_atomic_expensive,
+ "This hook determines if the guard atomic is expensive. It should return\n\
+@code{false} (the default) if the atomic is inexpensive. A return value of\n\
+@code{true} indicates that the atomic is expensive i.e., involves a call to\n\
+__sync_synchronize. In this case let __cxa_guard_acquire handle the atomics.",
+ bool, (void),
+ hook_bool_void_false)
+
/* Returns the size of the array cookie for an array of type. */
DEFHOOK
(get_cookie_size,
--
1.9.1
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] Avoid atomic for guard acquire when that is expensive
2020-11-22 8:05 ` [PATCH] Avoid atomic for guard acquire when that is expensive Bernd Edlinger
@ 2020-11-24 22:10 ` Jason Merrill
2020-12-01 18:28 ` Bernd Edlinger
2020-11-30 20:08 ` [PING] " Bernd Edlinger
1 sibling, 1 reply; 17+ messages in thread
From: Jason Merrill @ 2020-11-24 22:10 UTC (permalink / raw)
To: Bernd Edlinger, Richard Earnshaw (lists),
gcc-patches, Ramana Radhakrishnan, Nathan Sidwell,
Christophe Lyon
On 11/22/20 3:05 AM, Bernd Edlinger wrote:
> Hi,
>
> this avoids the need to use -fno-threadsafe-statics on
> arm-none-eabi or working around that problem by supplying
> a dummy __sync_synchronize function which might
> just lead to silent code failure of the worst kind
> (non-reproducable, racy) at runtime, as was pointed out
> on previous discussions here.
>
> When the atomic access involves a call to __sync_synchronize
> it is better to call __cxa_guard_acquire unconditionally,
> since it handles the atomics too, or is a non-threaded
> implementation when there is no gthread support for this target.
>
> This fixes also a bug for the ARM EABI big-endian target,
> that is, previously the wrong bit was checked.
Instead of a new target macro, can't you follow
fold_builtin_atomic_always_lock_free/can_atomic_load_p?
Jason
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PING] [PATCH] Avoid atomic for guard acquire when that is expensive
2020-11-22 8:05 ` [PATCH] Avoid atomic for guard acquire when that is expensive Bernd Edlinger
2020-11-24 22:10 ` Jason Merrill
@ 2020-11-30 20:08 ` Bernd Edlinger
2020-11-30 20:54 ` Jason Merrill
1 sibling, 1 reply; 17+ messages in thread
From: Bernd Edlinger @ 2020-11-30 20:08 UTC (permalink / raw)
To: Richard Earnshaw (lists),
gcc-patches, Ramana Radhakrishnan, Jason Merrill, Nathan Sidwell,
Christophe Lyon
Hi,
I'd like to ping for this patch:
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559882.html
Thanks
Bernd.
On 11/22/20 9:05 AM, Bernd Edlinger wrote:
> Hi,
>
> this avoids the need to use -fno-threadsafe-statics on
> arm-none-eabi or working around that problem by supplying
> a dummy __sync_synchronize function which might
> just lead to silent code failure of the worst kind
> (non-reproducable, racy) at runtime, as was pointed out
> on previous discussions here.
>
> When the atomic access involves a call to __sync_synchronize
> it is better to call __cxa_guard_acquire unconditionally,
> since it handles the atomics too, or is a non-threaded
> implementation when there is no gthread support for this target.
>
> This fixes also a bug for the ARM EABI big-endian target,
> that is, previously the wrong bit was checked.
>
>
> Regression tested successfully on arm-none-eabi with newlib-3.3.0.
>
> Is it OK for trunk?
>
>
> Thanks
> Bernd.
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PING] [PATCH] Avoid atomic for guard acquire when that is expensive
2020-11-30 20:08 ` [PING] " Bernd Edlinger
@ 2020-11-30 20:54 ` Jason Merrill
0 siblings, 0 replies; 17+ messages in thread
From: Jason Merrill @ 2020-11-30 20:54 UTC (permalink / raw)
To: Bernd Edlinger, Richard Earnshaw (lists),
gcc-patches, Ramana Radhakrishnan, Nathan Sidwell,
Christophe Lyon
On 11/30/20 3:08 PM, Bernd Edlinger wrote:
> Hi,
>
> I'd like to ping for this patch:
I reviewed it on the 24th:
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/560118.html
>
> https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559882.html
>
> Thanks
> Bernd.
>
> On 11/22/20 9:05 AM, Bernd Edlinger wrote:
>> Hi,
>>
>> this avoids the need to use -fno-threadsafe-statics on
>> arm-none-eabi or working around that problem by supplying
>> a dummy __sync_synchronize function which might
>> just lead to silent code failure of the worst kind
>> (non-reproducable, racy) at runtime, as was pointed out
>> on previous discussions here.
>>
>> When the atomic access involves a call to __sync_synchronize
>> it is better to call __cxa_guard_acquire unconditionally,
>> since it handles the atomics too, or is a non-threaded
>> implementation when there is no gthread support for this target.
>>
>> This fixes also a bug for the ARM EABI big-endian target,
>> that is, previously the wrong bit was checked.
>>
>>
>> Regression tested successfully on arm-none-eabi with newlib-3.3.0.
>>
>> Is it OK for trunk?
>>
>>
>> Thanks
>> Bernd.
>>
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] Avoid atomic for guard acquire when that is expensive
2020-11-24 22:10 ` Jason Merrill
@ 2020-12-01 18:28 ` Bernd Edlinger
2020-12-02 18:57 ` Jason Merrill
0 siblings, 1 reply; 17+ messages in thread
From: Bernd Edlinger @ 2020-12-01 18:28 UTC (permalink / raw)
To: Jason Merrill, Richard Earnshaw (lists),
gcc-patches, Ramana Radhakrishnan, Nathan Sidwell,
Christophe Lyon
[-- Attachment #1: Type: text/plain, Size: 974 bytes --]
On 11/24/20 11:10 PM, Jason Merrill wrote:
> On 11/22/20 3:05 AM, Bernd Edlinger wrote:
>> Hi,
>>
>> this avoids the need to use -fno-threadsafe-statics on
>> arm-none-eabi or working around that problem by supplying
>> a dummy __sync_synchronize function which might
>> just lead to silent code failure of the worst kind
>> (non-reproducable, racy) at runtime, as was pointed out
>> on previous discussions here.
>>
>> When the atomic access involves a call to __sync_synchronize
>> it is better to call __cxa_guard_acquire unconditionally,
>> since it handles the atomics too, or is a non-threaded
>> implementation when there is no gthread support for this target.
>>
>> This fixes also a bug for the ARM EABI big-endian target,
>> that is, previously the wrong bit was checked.
>
> Instead of a new target macro, can't you follow fold_builtin_atomic_always_lock_free/can_atomic_load_p?
>
Yes, thanks, that should work too.
Would you like this better?
Thanks
Bernd.
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Avoid-atomic-for-guard-acquire-when-that-is-expensiv.patch --]
[-- Type: text/x-patch; name="0001-Avoid-atomic-for-guard-acquire-when-that-is-expensiv.patch", Size: 3074 bytes --]
From 863b023786f7e8321979bcdb33c027d37d5e155b Mon Sep 17 00:00:00 2001
From: Bernd Edlinger <bernd.edlinger@hotmail.de>
Date: Tue, 1 Dec 2020 18:54:48 +0100
Subject: [PATCH] Avoid atomic for guard acquire when that is expensive
When the atomic access involves a call to __sync_synchronize
it is better to call __cxa_guard_acquire unconditionally,
since it handles the atomics too, or is a non-threaded
implementation when there is no gthread support for this target.
This fixes also a bug for the ARM EABI big-endian target,
that is, previously the wrong bit was checked.
2020-11-22 Bernd Edlinger <bernd.edlinger@hotmail.de>
* decl2.c: (is_atomic_expensive_p): New helper function.
(build_atomic_load_byte): Rename to...
(build_atomic_load_type): ... and add new parameter type.
(get_guard_cond): Skip the atomic here if that is expensive.
Use the correct type for the atomic load on certain targets.
---
gcc/cp/decl2.c | 33 +++++++++++++++++++++++++++++----
1 file changed, 29 insertions(+), 4 deletions(-)
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 1bc7b7e..5001e8f 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -48,6 +48,7 @@ along with GCC; see the file COPYING3. If not see
#include "intl.h"
#include "c-family/c-ada-spec.h"
#include "asan.h"
+#include "optabs-query.h"
/* Id for dumping the raw trees. */
int raw_dump_id;
@@ -3297,18 +3298,34 @@ get_guard (tree decl)
return guard;
}
+/* Returns true if accessing the GUARD atomic is expensive,
+ i.e. involves a call to __sync_synchronize or similar.
+ In this case let __cxa_guard_acquire handle the atomics. */
+
+static bool
+is_atomic_expensive_p (machine_mode mode)
+{
+ if (!flag_inline_atomics)
+ return false;
+
+ if (!can_compare_and_swap_p (mode, false) || !can_atomic_load_p (mode))
+ return false;
+
+ return true;
+}
+
/* Return an atomic load of src with the appropriate memory model. */
static tree
-build_atomic_load_byte (tree src, HOST_WIDE_INT model)
+build_atomic_load_type (tree src, HOST_WIDE_INT model, tree type)
{
- tree ptr_type = build_pointer_type (char_type_node);
+ tree ptr_type = build_pointer_type (type);
tree mem_model = build_int_cst (integer_type_node, model);
tree t, addr, val;
unsigned int size;
int fncode;
- size = tree_to_uhwi (TYPE_SIZE_UNIT (char_type_node));
+ size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
fncode = BUILT_IN_ATOMIC_LOAD_N + exact_log2 (size) + 1;
t = builtin_decl_implicit ((enum built_in_function) fncode);
@@ -3351,7 +3368,15 @@ get_guard_cond (tree guard, bool thread_safe)
if (!thread_safe)
guard = get_guard_bits (guard);
else
- guard = build_atomic_load_byte (guard, MEMMODEL_ACQUIRE);
+ {
+ tree type = targetm.cxx.guard_mask_bit ()
+ ? TREE_TYPE (guard) : char_type_node;
+
+ if (is_atomic_expensive_p (TYPE_MODE (type)))
+ guard = integer_zero_node;
+ else
+ guard = build_atomic_load_type (guard, MEMMODEL_ACQUIRE, type);
+ }
/* Mask off all but the low bit. */
if (targetm.cxx.guard_mask_bit ())
--
1.9.1
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] Avoid atomic for guard acquire when that is expensive
2020-12-01 18:28 ` Bernd Edlinger
@ 2020-12-02 18:57 ` Jason Merrill
2020-12-05 12:37 ` Bernd Edlinger
0 siblings, 1 reply; 17+ messages in thread
From: Jason Merrill @ 2020-12-02 18:57 UTC (permalink / raw)
To: Bernd Edlinger, Richard Earnshaw (lists),
gcc-patches, Ramana Radhakrishnan, Nathan Sidwell,
Christophe Lyon
On 12/1/20 1:28 PM, Bernd Edlinger wrote:
> On 11/24/20 11:10 PM, Jason Merrill wrote:
>> On 11/22/20 3:05 AM, Bernd Edlinger wrote:
>>> Hi,
>>>
>>> this avoids the need to use -fno-threadsafe-statics on
>>> arm-none-eabi or working around that problem by supplying
>>> a dummy __sync_synchronize function which might
>>> just lead to silent code failure of the worst kind
>>> (non-reproducable, racy) at runtime, as was pointed out
>>> on previous discussions here.
>>>
>>> When the atomic access involves a call to __sync_synchronize
>>> it is better to call __cxa_guard_acquire unconditionally,
>>> since it handles the atomics too, or is a non-threaded
>>> implementation when there is no gthread support for this target.
>>>
>>> This fixes also a bug for the ARM EABI big-endian target,
>>> that is, previously the wrong bit was checked.
>>
>> Instead of a new target macro, can't you follow fold_builtin_atomic_always_lock_free/can_atomic_load_p?
>
> Yes, thanks, that should work too.
> Would you like this better?
> +is_atomic_expensive_p (machine_mode mode)
> +{
> + if (!flag_inline_atomics)
> + return false;
Why not true?
> + if (!can_compare_and_swap_p (mode, false) || !can_atomic_load_p (mode))
> + return false;
This also seems backwards; I'd think we want to return false if either
of those tests are true. Or maybe just can_atomic_load_p, and not
bother about compare-and-swap.
> + tree type = targetm.cxx.guard_mask_bit ()
> + ? TREE_TYPE (guard) : char_type_node;
> +
> + if (is_atomic_expensive_p (TYPE_MODE (type)))
> + guard = integer_zero_node;
> + else
> + guard = build_atomic_load_type (guard, MEMMODEL_ACQUIRE, type);
It should still work to load a single byte, it just needs to be the
least-significant byte. And this isn't an EABI issue; it looks like the
non-EABI code is also broken for big-endian targets, both the atomic
load and the normal load in get_guard_bits.
Jason
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] Avoid atomic for guard acquire when that is expensive
2020-12-02 18:57 ` Jason Merrill
@ 2020-12-05 12:37 ` Bernd Edlinger
2020-12-07 15:04 ` Jason Merrill
0 siblings, 1 reply; 17+ messages in thread
From: Bernd Edlinger @ 2020-12-05 12:37 UTC (permalink / raw)
To: Jason Merrill, Richard Earnshaw (lists),
gcc-patches, Ramana Radhakrishnan, Nathan Sidwell,
Christophe Lyon
[-- Attachment #1: Type: text/plain, Size: 3096 bytes --]
On 12/2/20 7:57 PM, Jason Merrill wrote:
> On 12/1/20 1:28 PM, Bernd Edlinger wrote:
>> On 11/24/20 11:10 PM, Jason Merrill wrote:
>>> On 11/22/20 3:05 AM, Bernd Edlinger wrote:
>>>> Hi,
>>>>
>>>> this avoids the need to use -fno-threadsafe-statics on
>>>> arm-none-eabi or working around that problem by supplying
>>>> a dummy __sync_synchronize function which might
>>>> just lead to silent code failure of the worst kind
>>>> (non-reproducable, racy) at runtime, as was pointed out
>>>> on previous discussions here.
>>>>
>>>> When the atomic access involves a call to __sync_synchronize
>>>> it is better to call __cxa_guard_acquire unconditionally,
>>>> since it handles the atomics too, or is a non-threaded
>>>> implementation when there is no gthread support for this target.
>>>>
>>>> This fixes also a bug for the ARM EABI big-endian target,
>>>> that is, previously the wrong bit was checked.
>>>
>>> Instead of a new target macro, can't you follow fold_builtin_atomic_always_lock_free/can_atomic_load_p?
>>
>> Yes, thanks, that should work too.
>> Would you like this better?
>
>> +is_atomic_expensive_p (machine_mode mode)
>> +{
>> + if (!flag_inline_atomics)
>> + return false;
>
> Why not true?
>
Ooops...
Yes, I ought to return true here.
I must have made a mistake when I tested the last version of this patch,
sorry for the confusion.
>> + if (!can_compare_and_swap_p (mode, false) || !can_atomic_load_p (mode))
>> + return false;
>
> This also seems backwards; I'd think we want to return false if either of those tests are true. Or maybe just can_atomic_load_p, and not bother about compare-and-swap.
>
Yes, you are right.
Unfortuately can_atomic_load_p is too weak, since it does not cover
the memory barrier.
And can_compare_and_swap_p (..., false) is actually a bit too strong,
but if it returns true, we should be able to use any atomic without
need for a library call.
>> + tree type = targetm.cxx.guard_mask_bit ()
>> + ? TREE_TYPE (guard) : char_type_node;
>> +
>> + if (is_atomic_expensive_p (TYPE_MODE (type)))
>> + guard = integer_zero_node;
>> + else
>> + guard = build_atomic_load_type (guard, MEMMODEL_ACQUIRE, type);
>
> It should still work to load a single byte, it just needs to be the least-significant byte. And this isn't an EABI issue; it looks like the non-EABI code is also broken for big-endian targets, both the atomic load and the normal load in get_guard_bits.
>
I think the non-EABI code is always using bit 0 in the first byte,
by using the endian-neutral #define _GLIBCXX_GUARD_BIT __guard_test_bit (0, 1).
Only ARM EABI uses bit 0 in byte 3 if big-endian and bit 0 in byte 0 otherwise.
For all other targets when _GLIBCXX_USE_FUTEX is defined,
__cxa_guard_XXX accesses the value as int* while the memory
is a 64-bit long, so I could imagine that is an aliasing violation.
But nothing that needs to be fixed immediately.
Attached is the corrected patch.
Tested again on arm-none-eabi with arm-sim.
Is it OK for trunk?
Thanks
Bernd.
> Jason
>
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Avoid-atomic-for-guard-acquire-when-that-is-expensiv.patch --]
[-- Type: text/x-patch; name="0001-Avoid-atomic-for-guard-acquire-when-that-is-expensiv.patch", Size: 3073 bytes --]
From c1a6dcaa906113cba0dc88e36460a65aba35ec38 Mon Sep 17 00:00:00 2001
From: Bernd Edlinger <bernd.edlinger@hotmail.de>
Date: Tue, 1 Dec 2020 18:54:48 +0100
Subject: [PATCH] Avoid atomic for guard acquire when that is expensive
When the atomic access involves a call to __sync_synchronize
it is better to call __cxa_guard_acquire unconditionally,
since it handles the atomics too, or is a non-threaded
implementation when there is no gthread support for this target.
This fixes also a bug for the ARM EABI big-endian target,
that is, previously the wrong bit was checked.
2020-11-22 Bernd Edlinger <bernd.edlinger@hotmail.de>
* decl2.c: (is_atomic_expensive_p): New helper function.
(build_atomic_load_byte): Rename to...
(build_atomic_load_type): ... and add new parameter type.
(get_guard_cond): Skip the atomic here if that is expensive.
Use the correct type for the atomic load on certain targets.
---
gcc/cp/decl2.c | 33 +++++++++++++++++++++++++++++----
1 file changed, 29 insertions(+), 4 deletions(-)
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 1bc7b7e..c7ddcc9 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -48,6 +48,7 @@ along with GCC; see the file COPYING3. If not see
#include "intl.h"
#include "c-family/c-ada-spec.h"
#include "asan.h"
+#include "optabs-query.h"
/* Id for dumping the raw trees. */
int raw_dump_id;
@@ -3297,18 +3298,34 @@ get_guard (tree decl)
return guard;
}
+/* Returns true if accessing the GUARD atomic is expensive,
+ i.e. involves a call to __sync_synchronize or similar.
+ In this case let __cxa_guard_acquire handle the atomics. */
+
+static bool
+is_atomic_expensive_p (machine_mode mode)
+{
+ if (!flag_inline_atomics)
+ return true;
+
+ if (!can_compare_and_swap_p (mode, false) || !can_atomic_load_p (mode))
+ return true;
+
+ return false;
+}
+
/* Return an atomic load of src with the appropriate memory model. */
static tree
-build_atomic_load_byte (tree src, HOST_WIDE_INT model)
+build_atomic_load_type (tree src, HOST_WIDE_INT model, tree type)
{
- tree ptr_type = build_pointer_type (char_type_node);
+ tree ptr_type = build_pointer_type (type);
tree mem_model = build_int_cst (integer_type_node, model);
tree t, addr, val;
unsigned int size;
int fncode;
- size = tree_to_uhwi (TYPE_SIZE_UNIT (char_type_node));
+ size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
fncode = BUILT_IN_ATOMIC_LOAD_N + exact_log2 (size) + 1;
t = builtin_decl_implicit ((enum built_in_function) fncode);
@@ -3351,7 +3368,15 @@ get_guard_cond (tree guard, bool thread_safe)
if (!thread_safe)
guard = get_guard_bits (guard);
else
- guard = build_atomic_load_byte (guard, MEMMODEL_ACQUIRE);
+ {
+ tree type = targetm.cxx.guard_mask_bit ()
+ ? TREE_TYPE (guard) : char_type_node;
+
+ if (is_atomic_expensive_p (TYPE_MODE (type)))
+ guard = integer_zero_node;
+ else
+ guard = build_atomic_load_type (guard, MEMMODEL_ACQUIRE, type);
+ }
/* Mask off all but the low bit. */
if (targetm.cxx.guard_mask_bit ())
--
1.9.1
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] Avoid atomic for guard acquire when that is expensive
2020-12-05 12:37 ` Bernd Edlinger
@ 2020-12-07 15:04 ` Jason Merrill
2020-12-07 16:17 ` Bernd Edlinger
0 siblings, 1 reply; 17+ messages in thread
From: Jason Merrill @ 2020-12-07 15:04 UTC (permalink / raw)
To: Bernd Edlinger, Richard Earnshaw (lists),
gcc-patches, Ramana Radhakrishnan, Nathan Sidwell,
Christophe Lyon
On 12/5/20 7:37 AM, Bernd Edlinger wrote:
> On 12/2/20 7:57 PM, Jason Merrill wrote:
>> On 12/1/20 1:28 PM, Bernd Edlinger wrote:
>>> On 11/24/20 11:10 PM, Jason Merrill wrote:
>>>> On 11/22/20 3:05 AM, Bernd Edlinger wrote:
>>>>> Hi,
>>>>>
>>>>> this avoids the need to use -fno-threadsafe-statics on
>>>>> arm-none-eabi or working around that problem by supplying
>>>>> a dummy __sync_synchronize function which might
>>>>> just lead to silent code failure of the worst kind
>>>>> (non-reproducable, racy) at runtime, as was pointed out
>>>>> on previous discussions here.
>>>>>
>>>>> When the atomic access involves a call to __sync_synchronize
>>>>> it is better to call __cxa_guard_acquire unconditionally,
>>>>> since it handles the atomics too, or is a non-threaded
>>>>> implementation when there is no gthread support for this target.
>>>>>
>>>>> This fixes also a bug for the ARM EABI big-endian target,
>>>>> that is, previously the wrong bit was checked.
>>>>
>>>> Instead of a new target macro, can't you follow fold_builtin_atomic_always_lock_free/can_atomic_load_p?
>>>
>>> Yes, thanks, that should work too.
>>> Would you like this better?
>>
>>> +is_atomic_expensive_p (machine_mode mode)
>>> +{
>>> + if (!flag_inline_atomics)
>>> + return false;
>>
>> Why not true?
>
> Ooops...
> Yes, I ought to return true here.
> I must have made a mistake when I tested the last version of this patch,
> sorry for the confusion.
>
>>> + if (!can_compare_and_swap_p (mode, false) || !can_atomic_load_p (mode))
>>> + return false;
>>
>> This also seems backwards; I'd think we want to return false if either of those tests are true. Or maybe just can_atomic_load_p, and not bother about compare-and-swap.
>
> Yes, you are right.
> Unfortuately can_atomic_load_p is too weak, since it does not cover
> the memory barrier.
>
> And can_compare_and_swap_p (..., false) is actually a bit too strong,
> but if it returns true, we should be able to use any atomic without
> need for a library call.
>>> + tree type = targetm.cxx.guard_mask_bit ()
>>> + ? TREE_TYPE (guard) : char_type_node;
>>> +
>>> + if (is_atomic_expensive_p (TYPE_MODE (type)))
>>> + guard = integer_zero_node;
>>> + else
>>> + guard = build_atomic_load_type (guard, MEMMODEL_ACQUIRE, type);
>>
>> It should still work to load a single byte, it just needs to be the least-significant byte.
I still don't think you need to load the whole word to check the guard bit.
> And this isn't an EABI issue; it looks like the non-EABI code is also broken for big-endian targets, both the atomic load and the normal load in get_guard_bits.
>>
>
> I think the non-EABI code is always using bit 0 in the first byte,
> by using the endian-neutral #define _GLIBCXX_GUARD_BIT __guard_test_bit (0, 1).
Except that set_guard sets the least-significant bit on all targets.
> Only ARM EABI uses bit 0 in byte 3 if big-endian and bit 0 in byte 0 otherwise.
>
> For all other targets when _GLIBCXX_USE_FUTEX is defined,
> __cxa_guard_XXX accesses the value as int* while the memory
> is a 64-bit long, so I could imagine that is an aliasing violation.
>
>
> But nothing that needs to be fixed immediately.
Agreed.
> Attached is the corrected patch.
>
> Tested again on arm-none-eabi with arm-sim.
> Is it OK for trunk?
>
> Thanks
> Bernd.
>
>> Jason
>>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] Avoid atomic for guard acquire when that is expensive
2020-12-07 15:04 ` Jason Merrill
@ 2020-12-07 16:17 ` Bernd Edlinger
2020-12-08 19:50 ` Jason Merrill
0 siblings, 1 reply; 17+ messages in thread
From: Bernd Edlinger @ 2020-12-07 16:17 UTC (permalink / raw)
To: Jason Merrill, Richard Earnshaw (lists),
gcc-patches, Ramana Radhakrishnan, Nathan Sidwell,
Christophe Lyon
On 12/7/20 4:04 PM, Jason Merrill wrote:
> On 12/5/20 7:37 AM, Bernd Edlinger wrote:
>> On 12/2/20 7:57 PM, Jason Merrill wrote:
>>> On 12/1/20 1:28 PM, Bernd Edlinger wrote:>>>> + tree type = targetm.cxx.guard_mask_bit ()
>>>> + ? TREE_TYPE (guard) : char_type_node;
>>>> +
>>>> + if (is_atomic_expensive_p (TYPE_MODE (type)))
>>>> + guard = integer_zero_node;
>>>> + else
>>>> + guard = build_atomic_load_type (guard, MEMMODEL_ACQUIRE, type);
>>>
>>> It should still work to load a single byte, it just needs to be the least-significant byte.
>
> I still don't think you need to load the whole word to check the guard bit.
>
Of course that is also possible. But I would not expect an
address offset and a byte access to be cheaper than a word access.
I just saw that get_guard_bits does the same thing:
access the first byte if !targetm.cxx.guard_mask_bit ()
and access the whole word otherwise, which is only
there for ARM EABI.
>> And this isn't an EABI issue; it looks like the non-EABI code is also broken for big-endian targets, both the atomic load and the normal load in get_guard_bits.
>>>
>>
>> I think the non-EABI code is always using bit 0 in the first byte,
>> by using the endian-neutral #define _GLIBCXX_GUARD_BIT __guard_test_bit (0, 1).
>
> Except that set_guard sets the least-significant bit on all targets.
>
Hmm, as I said, get_guard_bits gets the guard as a word if !targetm.cxx.guard_mask_bit (),
and as the first byte otherwise. Of course it could get the third byte,
if !targetm.cxx.guard_mask_bit () && BYTES_BIG_ENDIAN, but it would be more complicated
this way, wouldn't it?
Bernd.
>> Only ARM EABI uses bit 0 in byte 3 if big-endian and bit 0 in byte 0 otherwise.
>>
>> For all other targets when _GLIBCXX_USE_FUTEX is defined,
>> __cxa_guard_XXX accesses the value as int* while the memory
>> is a 64-bit long, so I could imagine that is an aliasing violation.
>>
>>
>> But nothing that needs to be fixed immediately.
>
> Agreed.
>
>> Attached is the corrected patch.
>>
>> Tested again on arm-none-eabi with arm-sim.
>> Is it OK for trunk?
>>
>> Thanks
>> Bernd.
>>
>>> Jason
>>>
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] Avoid atomic for guard acquire when that is expensive
2020-12-07 16:17 ` Bernd Edlinger
@ 2020-12-08 19:50 ` Jason Merrill
0 siblings, 0 replies; 17+ messages in thread
From: Jason Merrill @ 2020-12-08 19:50 UTC (permalink / raw)
To: Bernd Edlinger, Richard Earnshaw (lists),
gcc-patches, Ramana Radhakrishnan, Nathan Sidwell,
Christophe Lyon
On 12/7/20 11:17 AM, Bernd Edlinger wrote:
> On 12/7/20 4:04 PM, Jason Merrill wrote:
>> On 12/5/20 7:37 AM, Bernd Edlinger wrote:
>>> On 12/2/20 7:57 PM, Jason Merrill wrote:
>>>> On 12/1/20 1:28 PM, Bernd Edlinger wrote:>>>> + tree type = targetm.cxx.guard_mask_bit ()
>>>>> + ? TREE_TYPE (guard) : char_type_node;
>>>>> +
>>>>> + if (is_atomic_expensive_p (TYPE_MODE (type)))
>>>>> + guard = integer_zero_node;
>>>>> + else
>>>>> + guard = build_atomic_load_type (guard, MEMMODEL_ACQUIRE, type);
>>>>
>>>> It should still work to load a single byte, it just needs to be the least-significant byte.
>>
>> I still don't think you need to load the whole word to check the guard bit.
>
> Of course that is also possible. But I would not expect an
> address offset and a byte access to be cheaper than a word access.
Fair point.
> I just saw that get_guard_bits does the same thing:
> access the first byte if !targetm.cxx.guard_mask_bit ()
> and access the whole word otherwise, which is only
> there for ARM EABI.
>>> And this isn't an EABI issue; it looks like the non-EABI code is also broken for big-endian targets, both the atomic load and the normal load in get_guard_bits.
>>>
>>> I think the non-EABI code is always using bit 0 in the first byte,
>>> by using the endian-neutral #define _GLIBCXX_GUARD_BIT __guard_test_bit (0, 1).
>>
>> Except that set_guard sets the least-significant bit on all targets.
>
> Hmm, as I said, get_guard_bits gets the guard as a word if !targetm.cxx.guard_mask_bit (),
> and as the first byte otherwise. Of course it could get the third byte,
> if !targetm.cxx.guard_mask_bit () && BYTES_BIG_ENDIAN, but it would be more complicated
> this way, wouldn't it?
Ah, yes, I was overlooking that set_guard uses get_guard_bits.
The patch is OK.
>>> Only ARM EABI uses bit 0 in byte 3 if big-endian and bit 0 in byte 0 otherwise.
>>>
>>> For all other targets when _GLIBCXX_USE_FUTEX is defined,
>>> __cxa_guard_XXX accesses the value as int* while the memory
>>> is a 64-bit long, so I could imagine that is an aliasing violation.
>>>
>>>
>>> But nothing that needs to be fixed immediately.
>>
>> Agreed.
>>
>>> Attached is the corrected patch.
>>>
>>> Tested again on arm-none-eabi with arm-sim.
>>> Is it OK for trunk?
>>>
>>> Thanks
>>> Bernd.
>>>
>>>> Jason
>>>>
>>
>
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2020-12-08 19:50 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-03 15:08 [PATCH] libgcc: Add a weak stub for __sync_synchronize Bernd Edlinger
2020-11-17 5:43 ` [PING] " Bernd Edlinger
2020-11-17 12:44 ` Richard Earnshaw (lists)
2020-11-17 15:18 ` Bernd Edlinger
2020-11-17 15:41 ` Richard Earnshaw (lists)
2020-11-17 15:51 ` Christophe Lyon
2020-11-17 17:17 ` Bernd Edlinger
2020-11-22 8:05 ` [PATCH] Avoid atomic for guard acquire when that is expensive Bernd Edlinger
2020-11-24 22:10 ` Jason Merrill
2020-12-01 18:28 ` Bernd Edlinger
2020-12-02 18:57 ` Jason Merrill
2020-12-05 12:37 ` Bernd Edlinger
2020-12-07 15:04 ` Jason Merrill
2020-12-07 16:17 ` Bernd Edlinger
2020-12-08 19:50 ` Jason Merrill
2020-11-30 20:08 ` [PING] " Bernd Edlinger
2020-11-30 20:54 ` Jason Merrill
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).