public inbox for gcc-bugs@sourceware.org
* [Bug c/105495] New: `__atomic_compare_exchange` prevents tail-call optimization
From: lh_mouse at 126 dot com @ 2022-05-05 14:00 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105495

            Bug ID: 105495
           Summary: `__atomic_compare_exchange` prevents tail-call
                    optimization
           Product: gcc
           Version: 11.3.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: lh_mouse at 126 dot com
  Target Milestone: ---

Godbolt: https://gcc.godbolt.org/z/7ob6zc17P

Offending testcase:

```c
typedef struct { int b; } cond;
int __MCF_batch_release_common(cond* p, int c);

int _MCF_cond_signal_some(cond* p, int x) {
  cond c = {x}, n = {2};
  __atomic_compare_exchange(p, &c, &n, 1, 0, 0);
  return __MCF_batch_release_common(p, x);
}
```

GCC output:

```asm
_MCF_cond_signal_some:
        sub     rsp, 24
        mov     edx, 2
        mov     eax, esi
        mov     DWORD PTR [rsp+12], esi
        lock cmpxchg    DWORD PTR [rdi], edx
        je      .L2
        mov     DWORD PTR [rsp+12], eax   <------- note this extra store, which clang doesn't generate
.L2:
        call    __MCF_batch_release_common
        add     rsp, 24
        ret
```

Clang output:

```asm
_MCF_cond_signal_some:                  # @_MCF_cond_signal_some
        mov     ecx, 2
        mov     eax, esi
        lock cmpxchg    dword ptr [rdi], ecx
        jmp     __MCF_batch_release_common    # TAILCALL
```

1. If `cond` was defined as a scalar type such as `long`, there is no such issue.
2. `__atomic_exchange` doesn't suffer from this issue.

* [Bug c/105495] `__atomic_compare_exchange` prevents tail-call optimization
From: lh_mouse at 126 dot com @ 2022-05-05 14:10 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105495

--- Comment #1 from LIU Hao <lh_mouse at 126 dot com> ---
A possible workaround is to use a scalar type to provide storage for local
variables, and cast them as needed:

Godbolt: https://gcc.godbolt.org/z/n7zq7Pn4G

```c
typedef struct { int b; } cond;
int __MCF_batch_release_common(cond* p, int c);

int _MCF_cond_signal_some(cond* p, int x) {
  int c = {x}, n = {2};
  __atomic_compare_exchange((cond*)p, (cond*)&c, (cond*)&n, 1, 0, 0);
  return __MCF_batch_release_common(p, x);
}
```

This makes GCC output the same assembly as Clang.

* [Bug c/105495] `__atomic_compare_exchange` prevents tail-call optimization
From: rguenth at gcc dot gnu.org @ 2022-05-05 14:17 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105495

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2022-05-05
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
The issue is that we pass the 2nd argument by reference which causes
a stack slot to be allocated for 'c':

  c.b = x_2(D);
  __atomic_compare_exchange_4 (p_4(D), &c, 2, 1, 0, 0);
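
Editorial sketch for comment #2: the second argument is passed by reference because the builtin uses it as an in/out parameter. On failure it stores the value it observed in `*p` back into the object the second argument points to. The helper and demo names below (`cas_and_observe`, `demo_writeback`) are illustrative and not from the report, and the sketch uses a strong compare-exchange (4th argument 0) rather than the weak one in the testcase so that the checks are deterministic.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { int b; } cond;

/* Illustrative helper (not from the report): runs a strong CAS with the
   generic builtin and reports what ended up in the "expected" object. */
static bool cas_and_observe(cond *p, int want, int desired, int *seen)
{
    cond c = { want }, n = { desired };
    /* &c is in/out: on failure the builtin writes the current contents
       of *p into c, so c must have an address. */
    bool ok = __atomic_compare_exchange(p, &c, &n, 0,
                                        __ATOMIC_RELAXED, __ATOMIC_RELAXED);
    *seen = c.b;
    return ok;
}

/* Small self-check of the write-back behaviour. */
static void demo_writeback(void)
{
    cond v = { 5 };
    int seen = 0;

    assert(!cas_and_observe(&v, 1, 2, &seen)); /* 5 != 1: CAS fails      */
    assert(seen == 5 && v.b == 5);             /* c received the observed 5 */

    assert(cas_and_observe(&v, 5, 2, &seen));  /* 5 == 5: CAS succeeds   */
    assert(v.b == 2);
}
```

In the reported testcase `c` is never read after the call, so the write-back is dead; the thread is about GCC not eliminating it when `c` is a struct.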

* [Bug c/105495] `__atomic_compare_exchange` prevents tail-call optimization
From: lh_mouse at 126 dot com @ 2022-05-05 14:30 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105495

--- Comment #3 from LIU Hao <lh_mouse at 126 dot com> ---
Wouldn't that go away if the value in it is never read back?

* [Bug c/105495] `__atomic_compare_exchange` prevents tail-call optimization
From: jakub at gcc dot gnu.org @ 2022-05-05 14:33 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105495

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The reason why it works in #c1 is that we replace the

  c = x_4(D);
  n_6 = 2;
  n.0_1 = n_6;
  n.1_2 = (unsigned int) n.0_1;
  __atomic_compare_exchange_4 (p_7(D), &c, n.1_2, 1, 0, 0);

call in the IL with:

  c_20 = x_4(D);
  _15 = c_20;
  _16 = VIEW_CONVERT_EXPR<unsigned int>(_15);
  _17 = .ATOMIC_COMPARE_EXCHANGE (p_7(D), _16, 2, 260, 0, 0);
  _18 = REALPART_EXPR <_17>;
  _19 = VIEW_CONVERT_EXPR<int>(_18);
  c_21 = _19;

during ccp1 pass, optimizing away the addressables.
But we don't do that for aggregates with sizes of integer types, but supposedly
we could do that too.

* [Bug middle-end/105495] `__atomic_compare_exchange` prevents tail-call optimization
From: pinskia at gcc dot gnu.org @ 2022-05-05 15:28 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105495

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
          Component|c                           |middle-end

* [Bug middle-end/105495] `__atomic_compare_exchange` prevents tail-call optimization
From: lh_mouse at 126 dot com @ 2022-05-06 7:40 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105495

--- Comment #5 from LIU Hao <lh_mouse at 126 dot com> ---
This does not trigger the issue:

```c
#define __atomic_compare_exchange(p,c,n,w,ms,mf)  \
  ({ int __temp;  \
     __builtin_memcpy(&__temp, c, sizeof(*c));  \
     _Bool __r = __atomic_compare_exchange(p, (__typeof__(*(c))*) &__temp, n, w, ms, mf);  \
     __builtin_memcpy(c, &__temp, sizeof(*c));  \
     __r;  });

typedef struct { int b; } cond;
int __MCF_batch_release_common(cond* p, int c);

int _MCF_cond_signal_some(cond* p, int x) {
  cond c = {x}, n = {2};
  __atomic_compare_exchange(p, &c, &n, 1, 0, 0);
  return __MCF_batch_release_common(p, x);
}
```

which results in (Godbolt: https://gcc.godbolt.org/z/n68T1c6oP):

```asm
_MCF_cond_signal_some:
        mov     edx, 2
        mov     eax, esi
        lock cmpxchg    DWORD PTR [rdi], edx
        jmp     __MCF_batch_release_common
```

Effectively, we are using an `int` to provide storage for a struct of
`sizeof(int)`.

But if we use a `long` to provide storage for the struct, the issue reappears:

```c
#define __atomic_compare_exchange(p,c,n,w,ms,mf)  \
  ({ long __temp;  \
     __builtin_memcpy(&__temp, c, sizeof(*c));  \
     _Bool __r = __atomic_compare_exchange(p, (__typeof__(*(c))*) &__temp, n, w, ms, mf);  \
     __builtin_memcpy(c, &__temp, sizeof(*c));  \
     __r;  });

typedef struct { int b; } cond;
int __MCF_batch_release_common(cond* p, int c);

int _MCF_cond_signal_some(cond* p, int x) {
  cond c = {x}, n = {2};
  __atomic_compare_exchange(p, &c, &n, 1, 0, 0);
  return __MCF_batch_release_common(p, x);
}
```

which results in (Godbolt: https://gcc.godbolt.org/z/PGof8nGd7):

```asm
_MCF_cond_signal_some:
        mov     edx, 2
        mov     eax, esi
        mov     DWORD PTR [rsp-16], esi
        lock cmpxchg    DWORD PTR [rdi], edx
        je      .L2
        mov     DWORD PTR [rsp-16], eax
.L2:
        jmp     __MCF_batch_release_common
```

It is also notable that with this kind of hack, GCC is finally able to perform
TCO on the return statement, even in the `long` variant where the extra store
remains.
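
Editorial sketch for comment #5: the same idea can be written as an inline function instead of a macro that shadows the builtin. The wrapper name `cond_compare_exchange` is hypothetical, and this version swaps in `__atomic_compare_exchange_n` on the `int` member rather than casting an `int*` to `cond*`. The `static_assert` encodes the observation above that the scalar temporary has to match the struct's size exactly. It uses a strong compare-exchange so the behaviour is deterministic, and the code GCC generates for it has not been checked here.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

typedef struct { int b; } cond;

/* The temporary below is an int, so the struct must be exactly that size;
   comment #5 reports that an oversized temporary (long) brings the extra
   store back. */
static_assert(sizeof(cond) == sizeof(int), "cond must be int-sized");

/* Hypothetical wrapper (name is illustrative, not from the thread). */
static inline bool cond_compare_exchange(cond *p, cond *expected, cond desired)
{
    int tmp;

    /* Copy the expected value into scalar storage. */
    memcpy(&tmp, expected, sizeof tmp);

    /* Strong CAS on the int member; relaxed ordering as in the testcase. */
    bool ok = __atomic_compare_exchange_n(&p->b, &tmp, desired.b, 0,
                                          __ATOMIC_RELAXED, __ATOMIC_RELAXED);

    /* Propagate the observed value back to the caller's struct. */
    memcpy(expected, &tmp, sizeof tmp);
    return ok;
}
```

A caller that never reads `*expected` afterwards leaves both `memcpy` calls as candidates for elimination, which is the property the macro in comment #5 relies on.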

* [Bug tree-optimization/105495] `__atomic_compare_exchange` prevents tail-call optimization
From: pinskia at gcc dot gnu.org @ 2022-10-31 7:55 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105495

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|middle-end                  |tree-optimization

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I wonder if we could add some access attribute such that the 2nd argument of
__atomic_compare_exchange_4 is marked as read only ...
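
Editorial sketch for comment #6: GCC's user-visible `access` function attribute shows what a "read only" second argument would mean at the source level. This is not a patch to the builtin. The function `cas_discard_observed` and the macro `ARG_READ_ONLY` are hypothetical names, and the real builtin does write through its second argument on failure, which this sketch deliberately hides by working on a local copy.

```c
#include <assert.h>
#include <stdbool.h>

/* Use GCC's `access` attribute where the compiler knows it; otherwise
   expand to nothing so the code still compiles. */
#if defined(__has_attribute)
#  if __has_attribute(access)
#    define ARG_READ_ONLY(n) __attribute__((access(read_only, n)))
#  endif
#endif
#ifndef ARG_READ_ONLY
#  define ARG_READ_ONLY(n)
#endif

/* Hypothetical CAS variant that promises not to write through `expected`. */
static bool cas_discard_observed(int *p, const int *expected, int desired)
    ARG_READ_ONLY(2);

static bool cas_discard_observed(int *p, const int *expected, int desired)
{
    int tmp = *expected;  /* local copy absorbs the write-back on failure */
    return __atomic_compare_exchange_n(p, &tmp, desired, 0,
                                       __ATOMIC_RELAXED, __ATOMIC_RELAXED);
}

/* Self-check: the caller's expected value is never modified. */
static void demo_read_only(void)
{
    int v = 1, e = 1;
    assert(cas_discard_observed(&v, &e, 2));   /* 1 == 1: succeeds */
    assert(v == 2);

    e = 9;
    assert(!cas_discard_observed(&v, &e, 2));  /* 2 != 9: fails    */
    assert(e == 9 && v == 2);                  /* e is untouched   */
}
```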