public inbox for gcc-bugs@sourceware.org
* [Bug c/105495] New: `__atomic_compare_exchange` prevents tail-call optimization
From: lh_mouse at 126 dot com @ 2022-05-05 14:00 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105495
Bug ID: 105495
Summary: `__atomic_compare_exchange` prevents tail-call
optimization
Product: gcc
Version: 11.3.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: lh_mouse at 126 dot com
Target Milestone: ---
Godbolt: https://gcc.godbolt.org/z/7ob6zc17P
Offending testcase:
```c
typedef struct { int b; } cond;
int
__MCF_batch_release_common(cond* p, int c);
int
_MCF_cond_signal_some(cond* p, int x)
{
cond c = {x}, n = {2};
__atomic_compare_exchange(p, &c, &n, 1, 0, 0);
return __MCF_batch_release_common(p, x);
}
```
GCC output:
```asm
_MCF_cond_signal_some:
sub rsp, 24
mov edx, 2
mov eax, esi
mov DWORD PTR [rsp+12], esi
lock cmpxchg DWORD PTR [rdi], edx
je .L2
mov DWORD PTR [rsp+12], eax  # <-- note this extra store, which clang doesn't generate
.L2:
call __MCF_batch_release_common
add rsp, 24
ret
```
Clang output:
```asm
_MCF_cond_signal_some: # @_MCF_cond_signal_some
mov ecx, 2
mov eax, esi
lock cmpxchg dword ptr [rdi], ecx
jmp __MCF_batch_release_common # TAILCALL
```
1. If `cond` is defined as a scalar type such as `long`, the issue does not
occur.
2. `__atomic_exchange` doesn't suffer from this issue.
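For side-by-side comparison on Godbolt, the two notes above can be sketched as follows. This is a minimal sketch, not part of the original report: `cond_scalar`, `release_scalar`, `release_struct`, `signal_some_scalar` and `signal_some_exchange` are hypothetical names, and the `noinline` stubs only stand in for the external `__MCF_batch_release_common`. The memory orders are spelled `__ATOMIC_RELAXED`, which is the value `0` used in the testcase.

```c
#include <assert.h>

typedef struct { int b; } cond;   /* aggregate, as in the report */
typedef long cond_scalar;         /* hypothetical scalar stand-in (note 1) */
static_assert(sizeof(cond) == sizeof(int), "the report's cond is int-sized");

/* Hypothetical callees; in the report the callee is an external function. */
__attribute__((noinline)) int release_scalar(cond_scalar *p, int c) { return (int)*p + c; }
__attribute__((noinline)) int release_struct(cond *p, int c) { return p->b + c; }

/* Note 1: same shape as the testcase, but with a scalar type. */
int signal_some_scalar(cond_scalar *p, int x)
{
    cond_scalar c = x, n = 2;
    __atomic_compare_exchange(p, &c, &n, 1, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
    return release_scalar(p, x);
}

/* Note 2: the struct type is kept, but __atomic_exchange replaces the CAS. */
int signal_some_exchange(cond *p, int x)
{
    cond n = {2}, old;
    __atomic_exchange(p, &n, &old, __ATOMIC_RELAXED);
    return release_struct(p, x);
}
```

Whether each function actually ends in a `jmp` depends on the compiler version and optimization level, so the generated assembly should be checked rather than assumed.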
* [Bug c/105495] `__atomic_compare_exchange` prevents tail-call optimization
From: lh_mouse at 126 dot com @ 2022-05-05 14:10 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105495
--- Comment #1 from LIU Hao <lh_mouse at 126 dot com> ---
A possible workaround is to use a scalar type to provide storage for local
variables, and cast them as needed:
Godbolt: https://gcc.godbolt.org/z/n7zq7Pn4G
```c
typedef struct { int b; } cond;
int
__MCF_batch_release_common(cond* p, int c);
int
_MCF_cond_signal_some(cond* p, int x)
{
int c = {x}, n = {2};
__atomic_compare_exchange((cond*)p, (cond*)&c, (cond*)&n, 1, 0, 0);
return __MCF_batch_release_common(p, x);
}
```
This makes GCC output the same assembly as Clang.
* [Bug c/105495] `__atomic_compare_exchange` prevents tail-call optimization
From: rguenth at gcc dot gnu.org @ 2022-05-05 14:17 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105495
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2022-05-05
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
The issue is that we pass the 2nd argument by reference, which causes a stack
slot to be allocated for 'c':
c.b = x_2(D);
__atomic_compare_exchange_4 (p_4(D), &c, 2, 1, 0, 0);
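The reason the 2nd argument has to be passed by reference is that a failed compare-exchange writes the value it observed back into the "expected" object, which is what the extra store in the GCC output corresponds to. A minimal sketch of that behaviour, assuming a hypothetical helper `expected_after_cas` and using the strong form (`weak = 0`) so the outcome is deterministic, unlike the weak form in the testcase:

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { int b; } cond;
static_assert(sizeof(cond) == sizeof(int), "cond must be int-sized for a 4-byte CAS");

/* Hypothetical helper: performs one strong CAS on *p and returns whatever
   is left in the local `expected` object afterwards. */
int expected_after_cas(cond *p, int expected_value, int desired_value, bool *ok)
{
    cond expected = { expected_value };
    cond desired  = { desired_value };
    *ok = __atomic_compare_exchange(p, &expected, &desired, 0 /* strong */,
                                    __ATOMIC_RELAXED, __ATOMIC_RELAXED);
    /* On success `expected` is unchanged; on failure it now holds the
       value that was found in *p. */
    return expected.b;
}
```

In the testcase `c` is never read after the call, so the write-back is dead, but the store still has to be proven dead before the stack slot can be dropped.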
* [Bug c/105495] `__atomic_compare_exchange` prevents tail-call optimization
From: lh_mouse at 126 dot com @ 2022-05-05 14:30 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105495
--- Comment #3 from LIU Hao <lh_mouse at 126 dot com> ---
Wouldn't that go away if the value in it is never read back?
* [Bug c/105495] `__atomic_compare_exchange` prevents tail-call optimization
From: jakub at gcc dot gnu.org @ 2022-05-05 14:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105495
--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The reason why it works in #c1 is that we replace the
c = x_4(D);
n_6 = 2;
n.0_1 = n_6;
n.1_2 = (unsigned int) n.0_1;
__atomic_compare_exchange_4 (p_7(D), &c, n.1_2, 1, 0, 0);
call in the IL with:
c_20 = x_4(D);
_15 = c_20;
_16 = VIEW_CONVERT_EXPR<unsigned int>(_15);
_17 = .ATOMIC_COMPARE_EXCHANGE (p_7(D), _16, 2, 260, 0, 0);
_18 = REALPART_EXPR <_17>;
_19 = VIEW_CONVERT_EXPR<int>(_18);
c_21 = _19;
during the ccp1 pass, which optimizes away the addressable variables.
We don't currently do that for aggregates whose size matches an integer type,
but presumably we could do that too.
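A rough source-level analogue of that rewrite, for illustration only: the struct is reinterpreted as an `unsigned int` (standing in for `VIEW_CONVERT_EXPR`), the desired value is passed by value, and the observed value is copied back afterwards. `cond_cas_by_value` is a hypothetical name, and the strong form is used here so the result is deterministic, whereas the IL above encodes a weak 4-byte operation. This is a sketch of the idea, not what GCC does internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

typedef struct { int b; } cond;
static_assert(sizeof(cond) == sizeof(unsigned int), "cond must be int-sized");

/* Hypothetical wrapper: compare-exchange on an int-sized struct, routed
   through scalar temporaries so that no struct object has its address
   passed to the atomic builtin. */
bool cond_cas_by_value(cond *p, cond *expected, cond desired)
{
    unsigned int exp_bits, des_bits;
    memcpy(&exp_bits, expected, sizeof exp_bits);  /* view struct as scalar */
    memcpy(&des_bits, &desired, sizeof des_bits);

    bool ok = __atomic_compare_exchange_n((unsigned int *)p, &exp_bits, des_bits,
                                          0 /* strong */,
                                          __ATOMIC_RELAXED, __ATOMIC_RELAXED);

    memcpy(expected, &exp_bits, sizeof exp_bits);  /* copy observed value back */
    return ok;
}
```

Because `exp_bits` is a scalar local, this is the shape the comment describes as already handled, in the same way as the workaround in comment #1.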
* [Bug middle-end/105495] `__atomic_compare_exchange` prevents tail-call optimization
From: pinskia at gcc dot gnu.org @ 2022-05-05 15:28 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105495
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
          Component|c                           |middle-end
* [Bug middle-end/105495] `__atomic_compare_exchange` prevents tail-call optimization
From: lh_mouse at 126 dot com @ 2022-05-06 7:40 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105495
--- Comment #5 from LIU Hao <lh_mouse at 126 dot com> ---
This does not trigger the issue:
```c
#define __atomic_compare_exchange(p,c,n,w,ms,mf) \
({ int __temp; \
__builtin_memcpy(&__temp, c, sizeof(*c)); \
_Bool __r = __atomic_compare_exchange(p, (__typeof__(*(c))*) &__temp, n, \
w, ms, mf); \
__builtin_memcpy(c, &__temp, sizeof(*c)); \
__r; });
typedef struct { int b; } cond;
int
__MCF_batch_release_common(cond* p, int c);
int
_MCF_cond_signal_some(cond* p, int x)
{
cond c = {x}, n = {2};
__atomic_compare_exchange(p, &c, &n, 1, 0, 0);
return __MCF_batch_release_common(p, x);
}
```
which results in (godbolt https://gcc.godbolt.org/z/n68T1c6oP):
```asm
_MCF_cond_signal_some:
mov edx, 2
mov eax, esi
lock cmpxchg DWORD PTR [rdi], edx
jmp __MCF_batch_release_common
```
Effectively, we are using an `int` to provide storage for a struct of
`sizeof(int)`.
But if we use a `long` to provide storage for the struct, the issue reappears:
```c
#define __atomic_compare_exchange(p,c,n,w,ms,mf) \
({ long __temp; \
__builtin_memcpy(&__temp, c, sizeof(*c)); \
_Bool __r = __atomic_compare_exchange(p, (__typeof__(*(c))*) &__temp, n, \
w, ms, mf); \
__builtin_memcpy(c, &__temp, sizeof(*c)); \
__r; });
typedef struct { int b; } cond;
int
__MCF_batch_release_common(cond* p, int c);
int
_MCF_cond_signal_some(cond* p, int x)
{
cond c = {x}, n = {2};
__atomic_compare_exchange(p, &c, &n, 1, 0, 0);
return __MCF_batch_release_common(p, x);
}
```
which results in (godbolt https://gcc.godbolt.org/z/PGof8nGd7):
```asm
_MCF_cond_signal_some:
mov edx, 2
mov eax, esi
mov DWORD PTR [rsp-16], esi
lock cmpxchg DWORD PTR [rdi], edx
je .L2
mov DWORD PTR [rsp-16], eax
.L2:
jmp __MCF_batch_release_common
```
It is also notable that with this kind of hack, GCC is finally able to perform
TCO on the return statement.
* [Bug tree-optimization/105495] `__atomic_compare_exchange` prevents tail-call optimization
From: pinskia at gcc dot gnu.org @ 2022-10-31 7:55 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105495
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|middle-end                  |tree-optimization
--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I wonder if we could add some access attribute such that the 2nd argument of
__atomic_compare_exchange_4 is marked as read only ...