* [Bug target/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
@ 2021-10-02 20:30 ` pinskia at gcc dot gnu.org
2021-10-03 12:56 ` hjl.tools at gmail dot com
` (37 subsequent siblings)
38 siblings, 0 replies; 40+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-10-02 20:30 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
           Severity|normal                      |enhancement
* [Bug target/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: hjl.tools at gmail dot com @ 2021-10-03 12:56 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
H.J. Lu <hjl.tools at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|unknown                     |12.0
   Last reconfirmed|                            |2021-10-03
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
--- Comment #1 from H.J. Lu <hjl.tools at gmail dot com> ---
The C code:
#include <stdatomic.h>
_Atomic int v;
int
foo ()
{
  return atomic_fetch_or_explicit (&v, 1, memory_order_relaxed) & 1;
}
* [Bug target/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: hjl.tools at gmail dot com @ 2021-10-03 13:06 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #2 from H.J. Lu <hjl.tools at gmail dot com> ---
This works:
[hjl@gnu-cfl-2 pr102566]$ cat y.c
#include <stdatomic.h>
_Atomic int v;
unsigned int
foo ()
{
  return atomic_fetch_or_explicit (&v, 1, memory_order_relaxed) & 1;
}
[hjl@gnu-cfl-2 pr102566]$ make y.s
/export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/ -O2 -S y.c
[hjl@gnu-cfl-2 pr102566]$ cat y.s
        .file "y.c"
        .text
        .p2align 4
        .globl foo
        .type foo, @function
foo:
.LFB0:
        .cfi_startproc
        xorl %eax, %eax
        lock btsl $0, v(%rip)
        setc %al
        ret
        .cfi_endproc
.LFE0:
        .size foo, .-foo
        .globl v
        .bss
        .align 4
        .type v, @object
        .size v, 4
v:
        .zero 4
        .ident "GCC: (GNU) 12.0.0 20211003 (experimental)"
        .section .note.GNU-stack,"",@progbits
[hjl@gnu-cfl-2 pr102566]$
* [Bug tree-optimization/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: hjl.tools at gmail dot com @ 2021-10-03 13:19 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
H.J. Lu <hjl.tools at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|target                      |tree-optimization
--- Comment #3 from H.J. Lu <hjl.tools at gmail dot com> ---
optimize_atomic_bit_test_and works on
_1 = __atomic_fetch_or_4 (&v, 1, 0);
_4 = _1 & 1;
but fails on
_1 = __atomic_fetch_or_4 (&v, 1, 0);
_2 = (int) _1;
_5 = _2 & 1;
* [Bug tree-optimization/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: hjl.tools at gmail dot com @ 2021-10-03 15:10 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #4 from H.J. Lu <hjl.tools at gmail dot com> ---
Can we convert
_1 = __atomic_fetch_or_4 (&v, 1, 0);
_2 = (int) _1;
_5 = _2 & 1;
to
_1 = __atomic_fetch_or_4 (&v, 1, 0);
_2 = _1 & 1;
_5 = (int) _2;
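The reordering asked about above preserves the value when the cast only changes signedness at the same width, because a bitwise AND acts on each bit independently. A minimal C++ sketch of that identity, assuming 32-bit operands and a modular unsigned-to-signed conversion (guaranteed since C++20, and what GCC does in earlier modes). The helper names `cast_then_mask` and `mask_then_cast` are hypothetical and are not part of GCC:

```cpp
#include <cstdint>

// Hypothetical helpers, for illustration only.

// Shape 1: convert the fetched value to a signed type, then mask.
std::int32_t cast_then_mask(std::uint32_t fetched, std::uint32_t mask) {
    return static_cast<std::int32_t>(fetched) & static_cast<std::int32_t>(mask);
}

// Shape 2: mask in the unsigned type, then convert the result.
std::int32_t mask_then_cast(std::uint32_t fetched, std::uint32_t mask) {
    return static_cast<std::int32_t>(fetched & mask);
}
```

Both helpers yield the same bit pattern for any input, so presumably moving the `& 1` ahead of the `(int)` conversion would expose the shape that comment #3 reports as already working.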
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: hjl.tools at gmail dot com @ 2021-10-03 16:45 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #5 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 51536
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51536&action=edit
A patch
Please try this.
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: hjl.tools at gmail dot com @ 2021-10-03 22:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
H.J. Lu <hjl.tools at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #51536|0                           |1
        is obsolete|                            |
--- Comment #6 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 51543
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51543&action=edit
The v2 patch
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: thiago at kde dot org @ 2021-10-04 15:58 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #7 from Thiago Macieira <thiago at kde dot org> ---
(In reply to H.J. Lu from comment #5)
> Created attachment 51536 [details]
> A patch
>
> Please try this.
Give me an hour (will try v2).
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: thiago at kde dot org @ 2021-10-04 16:10 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #8 from Thiago Macieira <thiago at kde dot org> ---
$ cat /tmp/test.cpp
#include <atomic>
bool tbit(std::atomic<int> &i)
{
    return i.fetch_or(1, std::memory_order_relaxed) & 1;
}
$ ~/dev/gcc/bin/gcc -S -o - -O2 /tmp/test.cpp
        .file "test.cpp"
        .text
        .p2align 4
        .globl _Z4tbitRSt6atomicIiE
        .type _Z4tbitRSt6atomicIiE, @function
_Z4tbitRSt6atomicIiE:
.LFB339:
        .cfi_startproc
        lock btsl $0, (%rdi)
        setc %al
        ret
        .cfi_endproc
.LFE339:
        .size _Z4tbitRSt6atomicIiE, .-_Z4tbitRSt6atomicIiE
        .ident "GCC: (GNU) 12.0.0 20211004 (experimental)"
        .section .note.GNU-stack,"",@progbits
+1
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: thiago at kde dot org @ 2021-10-04 16:13 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #9 from Thiago Macieira <thiago at kde dot org> ---
Looks like it doesn't work for the sign bit.
$ cat /tmp/test.cpp
#include <atomic>
bool tbit(std::atomic<int> &i)
{
    return i.fetch_or(CONSTANT, std::memory_order_relaxed) & CONSTANT;
}
$ ~/dev/gcc/bin/gcc -DCONSTANT='(1<<30)' -S -o - -O2 /tmp/test.cpp | sed -n '/startproc/,/endproc/p'
        .cfi_startproc
        lock btsl $30, (%rdi)
        setc %al
        ret
        .cfi_endproc
$ ~/dev/gcc/bin/gcc -DCONSTANT='(1<<31)' -S -o - -O2 /tmp/test.cpp | sed -n '/startproc/,/endproc/p'
        .cfi_startproc
        movl (%rdi), %eax
.L2:
        movl %eax, %ecx
        movl %eax, %edx
        orl $-2147483648, %ecx
        lock cmpxchgl %ecx, (%rdi)
        jne .L2
        shrl $31, %edx
        movl %edx, %eax
        ret
        .cfi_endproc
Changing to std::atomic<unsigned> makes no difference.
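For comparison, the longer sequence above has the shape of a compare-and-swap retry loop. A rough C++ equivalent, as a sketch only: `test_and_set_bit_cas` is a hypothetical name, and the loop illustrates the fallback shape rather than reproducing code from GCC or libstdc++:

```cpp
#include <atomic>

// Hypothetical helper: set one bit with a CAS retry loop and report
// whether that bit was already set beforehand.
bool test_and_set_bit_cas(std::atomic<unsigned>& word, unsigned bit) {
    const unsigned mask = 1u << bit;
    unsigned old = word.load(std::memory_order_relaxed);
    // On failure compare_exchange_weak reloads `old`, so the loop retries
    // with the freshly observed value (the `jne .L2` in the listing).
    while (!word.compare_exchange_weak(old, old | mask,
                                       std::memory_order_relaxed)) {
    }
    return (old & mask) != 0;
}
```

A single `lock bts` would do the same work without the retry loop, which is what the thread is asking the compiler to emit for bit 31 as well.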
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: hjl.tools at gmail dot com @ 2021-10-04 21:46 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
H.J. Lu <hjl.tools at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #51543|0                           |1
        is obsolete|                            |
--- Comment #10 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 51549
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51549&action=edit
The v3 patch
Please try the v3 patch.
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: thiago at kde dot org @ 2021-10-04 23:25 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #11 from Thiago Macieira <thiago at kde dot org> ---
$ for ((i=0;i<32;++i)); do ~/dev/gcc/bin/gcc "-DCONSTANT=(1<<$i)" -S -o - -O2 /tmp/test.cpp | grep bts; done
lock btsl $0, (%rdi)
lock btsl $1, (%rdi)
lock btsl $2, (%rdi)
lock btsl $3, (%rdi)
lock btsl $4, (%rdi)
lock btsl $5, (%rdi)
lock btsl $6, (%rdi)
lock btsl $7, (%rdi)
lock btsl $8, (%rdi)
lock btsl $9, (%rdi)
lock btsl $10, (%rdi)
lock btsl $11, (%rdi)
lock btsl $12, (%rdi)
lock btsl $13, (%rdi)
lock btsl $14, (%rdi)
lock btsl $15, (%rdi)
lock btsl $16, (%rdi)
lock btsl $17, (%rdi)
lock btsl $18, (%rdi)
lock btsl $19, (%rdi)
lock btsl $20, (%rdi)
lock btsl $21, (%rdi)
lock btsl $22, (%rdi)
lock btsl $23, (%rdi)
lock btsl $24, (%rdi)
lock btsl $25, (%rdi)
lock btsl $26, (%rdi)
lock btsl $27, (%rdi)
lock btsl $28, (%rdi)
lock btsl $29, (%rdi)
lock btsl $30, (%rdi)
lock btsl $31, (%rdi)
And after changing to long:
$ for ((i=32;i<64;++i)); do ~/dev/gcc/bin/gcc "-DCONSTANT=(1L<<$i)" -S -o - -O2 /tmp/test.cpp | grep bts; done
lock btsq $32, (%rdi)
lock btsq $33, (%rdi)
lock btsq $34, (%rdi)
lock btsq $35, (%rdi)
lock btsq $36, (%rdi)
lock btsq $37, (%rdi)
lock btsq $38, (%rdi)
lock btsq $39, (%rdi)
lock btsq $40, (%rdi)
lock btsq $41, (%rdi)
lock btsq $42, (%rdi)
lock btsq $43, (%rdi)
lock btsq $44, (%rdi)
lock btsq $45, (%rdi)
lock btsq $46, (%rdi)
lock btsq $47, (%rdi)
lock btsq $48, (%rdi)
lock btsq $49, (%rdi)
lock btsq $50, (%rdi)
lock btsq $51, (%rdi)
lock btsq $52, (%rdi)
lock btsq $53, (%rdi)
lock btsq $54, (%rdi)
lock btsq $55, (%rdi)
lock btsq $56, (%rdi)
lock btsq $57, (%rdi)
lock btsq $58, (%rdi)
lock btsq $59, (%rdi)
lock btsq $60, (%rdi)
lock btsq $61, (%rdi)
lock btsq $62, (%rdi)
lock btsq $63, (%rdi)
But:
$ cat /tmp/test2.cpp
#include <atomic>
bool tbit(std::atomic<long> &i)
{
    return i.fetch_or(1, std::memory_order_relaxed) & (~1);
}
$ ~/dev/gcc/bin/gcc -S -o - -O2 /tmp/test2.cpp
.file "test.cpp"
.text
/tmp/test.cpp: In function ‘bool tbit(std::atomic<long int>&)’:
/tmp/test.cpp:2:6: error: type mismatch in binary expression
2 | bool tbit(std::atomic<long> &i)
| ^~~~
long int
long unsigned int
__int_type
_9 = _6 & -2;
during GIMPLE pass: fab
/tmp/test.cpp:2:6: internal compiler error: verify_gimple failed
0x119fbba verify_gimple_in_cfg(function*, bool)
/home/tjmaciei/src/gcc/gcc/tree-cfg.c:5576
0x106ced7 execute_function_todo
/home/tjmaciei/src/gcc/gcc/passes.c:2042
0x106d8fb execute_todo
/home/tjmaciei/src/gcc/gcc/passes.c:2096
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: thiago at kde dot org @ 2021-10-04 23:26 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #12 from Thiago Macieira <thiago at kde dot org> ---
Commit 7e0c0500808d58bca5b8e23cbd474022c32234e4 + your patch.
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: hjl.tools at gmail dot com @ 2021-10-05 4:40 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
H.J. Lu <hjl.tools at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #51549|0                           |1
        is obsolete|                            |
--- Comment #13 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 51551
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51551&action=edit
The v4 patch
Please try the v4 patch.
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: hjl.tools at gmail dot com @ 2021-10-05 15:23 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
H.J. Lu <hjl.tools at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #51551|0                           |1
        is obsolete|                            |
--- Comment #14 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 51556
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51556&action=edit
The v5 patch
Changes in v5:
1. Check SSA_NAME before SSA_NAME_OCCURS_IN_ABNORMAL_PHI.
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: thiago at kde dot org @ 2021-10-05 15:57 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #15 from Thiago Macieira <thiago at kde dot org> ---
Works now for the failing case. Additionally:
bool tbit(std::atomic<long> &i)
{
    return i.fetch_and(~CONSTANT, std::memory_order_relaxed) & (CONSTANT);
}
Will properly produce LOCK BTR (CONSTANT=2):
        lock btrq $1, (%rdi)
        setc %al
        ret
CONSTANT=(1L<<62):
        lock btrq $62, (%rdi)
        setc %al
        ret
But not for CONSTANT=1 or CONSTANT=(1L<<63):
        movq (%rdi), %rax
.L2:
        movq %rax, %rcx
        movq %rax, %rdx
        andq $-2, %rcx
        lock cmpxchgq %rcx, (%rdi)
        jne .L2
        movl %edx, %eax
        andl $1, %eax
        ret
Same applies to 1<<31 for atomic<int>.
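The pattern tested here clears one bit and reports whether it was set beforehand. A small runtime sketch of those semantics, assuming a 64-bit word; `test_and_reset_bit` is a hypothetical helper name, and whether a given compiler turns it into `lock btr` depends on the compiler version and on the mask being recognizable:

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper: atomically clear `bit` and return its previous state.
bool test_and_reset_bit(std::atomic<std::uint64_t>& word, unsigned bit) {
    const std::uint64_t mask = std::uint64_t{1} << bit;
    // fetch_and returns the value held before the AND was applied.
    return (word.fetch_and(~mask, std::memory_order_relaxed) & mask) != 0;
}
```

The lowest and highest bits, which the message above reports as not yet converted, behave no differently at the source level; only the constants the compiler sees (`-2`, or a value with the sign bit set) differ.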
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: pinskia at gcc dot gnu.org @ 2021-10-05 16:02 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #16 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to H.J. Lu from comment #14)
> Created attachment 51556 [details]
> The v5 patch
>
> Changes in v5:
>
> 1. Check SSA_NAME before SSA_NAME_OCCURS_IN_ABNORMAL_PHI.
Why don't you just move this to match.pd instead as suggested by Richard B. on
the mailing list? Then you get the check for SSA_NAME_OCCURS_IN_ABNORMAL_PHI
for free and such. Plus other passes will do the optimization too ....
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: hjl.tools at gmail dot com @ 2021-10-05 19:26 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
H.J. Lu <hjl.tools at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #51556|0                           |1
        is obsolete|                            |
--- Comment #17 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 51558
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51558&action=edit
The v6 patch
Please try this.
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: hjl.tools at gmail dot com @ 2021-10-05 19:30 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #18 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Andrew Pinski from comment #16)
> (In reply to H.J. Lu from comment #14)
> > Created attachment 51556 [details]
> > The v5 patch
> >
> > Changes in v5:
> >
> > 1. Check SSA_NAME before SSA_NAME_OCCURS_IN_ABNORMAL_PHI.
>
> Why don't you just move this to match.pd instead as suggested by Richard B.
> on the mailing list? Then you get the check for
> SSA_NAME_OCCURS_IN_ABNORMAL_PHI for free and such. Plus other passes will
> do the optimization too ....
Without __atomic_fetch_or_* or __atomic_fetch_and_*, the conversion isn't
needed. We also need to check the mask of the atomic builtin.
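One way to read the mask requirement: rewriting to a single bit instruction only makes sense when the atomic operation touches exactly one bit and the following AND tests that same bit. A simplified sketch for constant masks only; the function names are hypothetical, and this is not the logic of the actual GCC pass, which may accept other forms:

```cpp
#include <cstdint>

// True when exactly one bit of m is set.
constexpr bool is_single_bit(std::uint64_t m) {
    return m != 0 && (m & (m - 1)) == 0;
}

// fetch_or / fetch_xor shape: `fetch_op(op_mask) & test_mask`.
constexpr bool or_xor_is_bit_test(std::uint64_t op_mask, std::uint64_t test_mask) {
    return is_single_bit(op_mask) && op_mask == test_mask;
}

// fetch_and shape: `fetch_and(op_mask) & test_mask`, where op_mask must be
// the complement of the single tested bit.
constexpr bool and_is_bit_test(std::uint64_t op_mask, std::uint64_t test_mask) {
    return is_single_bit(test_mask) && op_mask == ~test_mask;
}
```

Under this reading, `fetch_or(1) & (~1)` from comment #11 is not a bit test at all, since its result depends on every bit except the one being set.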
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: thiago at kde dot org @ 2021-10-05 19:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #19 from Thiago Macieira <thiago at kde dot org> ---
(In reply to H.J. Lu from comment #17)
> Created attachment 51558 [details]
> The v6 patch
>
> Please try this.
Confirmed for all inputs.
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: jakub at gcc dot gnu.org @ 2021-10-06 8:00 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
Bug 102566 depends on bug 49244, which changed state.
Bug 49244 Summary: __sync or __atomic builtins will not emit 'lock bts/btr/btc'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49244
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: thiago at kde dot org @ 2021-10-06 15:40 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #20 from Thiago Macieira <thiago at kde dot org> ---
And:
$ cat /tmp/test.cpp
#include <atomic>
bool tbit(std::atomic<long> &i)
{
    return i.fetch_xor(CONSTANT, std::memory_order_relaxed) & (CONSTANT);
}
$ ~/dev/gcc/bin/gcc "-DCONSTANT=(1LL<<63)" -S -o - -O2 /tmp/test.cpp | sed '1,/startproc/d;/endproc/,$d'
        lock btcq $63, (%rdi)
        setc %al
        ret
Nice!
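The `fetch_xor` form toggles one bit and reports its earlier state, matching what `lock btc` does in one instruction. A runtime sketch of the semantics, assuming a 64-bit word; `test_and_complement_bit` is a hypothetical helper name:

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper: atomically flip `bit` and return its previous state.
bool test_and_complement_bit(std::atomic<std::uint64_t>& word, unsigned bit) {
    const std::uint64_t mask = std::uint64_t{1} << bit;
    // fetch_xor returns the value held before the XOR was applied.
    return (word.fetch_xor(mask, std::memory_order_relaxed) & mask) != 0;
}
```

Calling it twice on the same bit restores the original word, with the first call returning the old state and the second its opposite.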
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: hjl.tools at gmail dot com @ 2021-10-06 15:47 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
H.J. Lu <hjl.tools at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #51558|0                           |1
        is obsolete|                            |
--- Comment #21 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 51559
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51559&action=edit
The new v3 patch
The new v3 patch to check invalid mask.
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: thiago at kde dot org @ 2021-10-06 15:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #22 from Thiago Macieira <thiago at kde dot org> ---
(In reply to H.J. Lu from comment #21)
> Created attachment 51559 [details]
> The new v3 patch
>
> The new v3 patch to check invalid mask.
v3? We were already up to v6.
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: hjl.tools at gmail dot com @ 2021-10-06 16:05 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #23 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Thiago Macieira from comment #22)
> (In reply to H.J. Lu from comment #21)
> > Created attachment 51559 [details]
> > The new v3 patch
> >
> > The new v3 patch to check invalid mask.
>
> v3? We were already up to v6.
I renamed the commit title. The new v3 is the v6 + fixes.
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: thiago at kde dot org @ 2021-10-07 15:18 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #24 from Thiago Macieira <thiago at kde dot org> ---
(In reply to H.J. Lu from comment #23)
> I renamed the commit title. The new v3 is the v6 + fixes.
Got it. Still no issues.
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: hjl.tools at gmail dot com @ 2021-10-07 15:19 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #25 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Thiago Macieira from comment #24)
> (In reply to H.J. Lu from comment #23)
> > I renamed the commit title. The new v3 is the v6 + fixes.
>
> Got it. Still no issues.
Can you get some performance improvement data on real workloads?
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: thiago at kde dot org @ 2021-10-07 15:35 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #26 from Thiago Macieira <thiago at kde dot org> ---
(In reply to H.J. Lu from comment #25)
> Can you get some performance improvement data on real workloads?
Will ask.
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: hjl.tools at gmail dot com @ 2021-10-10 13:51 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
H.J. Lu <hjl.tools at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Attachment #51559|0 |1
is obsolete| |
--- Comment #27 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 51580
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51580&action=edit
The new v4 patch
Changes in v4:
1. Bypass redundant check when inputs have been transformed to the
equivalent canonical form with valid bit operation.
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: crazylht at gmail dot com @ 2021-10-22 3:31 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #28 from Hongtao.liu <crazylht at gmail dot com> ---
Can the following be optimized
int gomp_futex_wake = FUTEX_WAKE | FUTEX_PRIVATE_FLAG;
int gomp_futex_wait = FUTEX_WAIT | FUTEX_PRIVATE_FLAG;
void
gomp_mutex_lock_slow (gomp_mutex_t *mutex, int oldval)
{
/* First loop spins a while. */
while (oldval == 1)
{
if (do_spin (mutex, 1))
{
/* Spin timeout, nothing changed. Set waiting flag. */
oldval = __atomic_exchange_n (mutex, -1, MEMMODEL_ACQUIRE);
if (oldval == 0)
return;
futex_wait (mutex, -1);
break;
}
else
{
/* Something changed. If now unlocked, we're good to go. */
oldval = 0;
if (__atomic_compare_exchange_n (mutex, &oldval, 1, false,
MEMMODEL_ACQUIRE, MEMMODEL_RELAXED))
return;
}
}
/* Second loop waits until mutex is unlocked. We always exit this
loop with wait flag set, so next unlock will awaken a thread. */
while ((oldval = __atomic_exchange_n (mutex, -1, MEMMODEL_ACQUIRE)))
do_wait (mutex, -1);
}
with __atomic_fetch_or/and/xor?
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: thiago at kde dot org @ 2021-11-04 21:24 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #29 from Thiago Macieira <thiago at kde dot org> ---
New suggestion in bug 103090
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: cvs-commit at gcc dot gnu.org @ 2021-11-10 9:17 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #30 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:
https://gcc.gnu.org/g:fb161782545224f55ba26ba663889c5e6e9a04d1
commit r12-5102-gfb161782545224f55ba26ba663889c5e6e9a04d1
Author: liuhongt <hongtao.liu@intel.com>
Date: Mon Oct 25 13:59:51 2021 +0800
Improve integer bit test on __atomic_fetch_[or|and]_* returns
commit adedd5c173388ae505470df152b9cb3947339566
Author: Jakub Jelinek <jakub@redhat.com>
Date: Tue May 3 13:37:25 2016 +0200
re PR target/49244 (__sync or __atomic builtins will not emit 'lock
bts/btr/btc')
optimized bit test on __atomic_fetch_or_* and __atomic_fetch_and_* returns
with lock bts/btr/btc by turning
mask_2 = 1 << cnt_1;
_4 = __atomic_fetch_or_* (ptr_6, mask_2, _3);
_5 = _4 & mask_2;
into
_4 = ATOMIC_BIT_TEST_AND_SET (ptr_6, cnt_1, 0, _3);
_5 = _4;
and
mask_6 = 1 << bit_5(D);
_1 = ~mask_6;
_2 = __atomic_fetch_and_4 (v_8(D), _1, 0);
_3 = _2 & mask_6;
_4 = _3 != 0;
into
mask_6 = 1 << bit_5(D);
_1 = ~mask_6;
_11 = .ATOMIC_BIT_TEST_AND_RESET (v_8(D), bit_5(D), 1, 0);
_4 = _11 != 0;
But it failed to optimize many equivalent, but slightly different cases:
1.
_1 = __atomic_fetch_or_4 (ptr_6, 1, _3);
_4 = (_Bool) _1;
2.
_1 = __atomic_fetch_and_4 (ptr_6, ~1, _3);
_4 = (_Bool) _1;
3.
_1 = __atomic_fetch_or_4 (ptr_6, 1, _3);
_7 = ~_1;
_5 = (_Bool) _7;
4.
_1 = __atomic_fetch_and_4 (ptr_6, ~1, _3);
_7 = ~_1;
_5 = (_Bool) _7;
5.
_1 = __atomic_fetch_or_4 (ptr_6, 1, _3);
_2 = (int) _1;
_7 = ~_2;
_5 = (_Bool) _7;
6.
_1 = __atomic_fetch_and_4 (ptr_6, ~1, _3);
_2 = (int) _1;
_7 = ~_2;
_5 = (_Bool) _7;
7.
_1 = __atomic_fetch_or_4 (ptr_6, 0x80000000, _3);
_5 = (signed int) _1;
_4 = _5 < 0;
8.
_1 = __atomic_fetch_and_4 (ptr_6, 0x7fffffff, _3);
_5 = (signed int) _1;
_4 = _5 < 0;
9.
_1 = 1 << bit_4(D);
mask_5 = (unsigned int) _1;
_2 = __atomic_fetch_or_4 (v_7(D), mask_5, 0);
_3 = _2 & mask_5;
10.
mask_7 = 1 << bit_6(D);
_1 = ~mask_7;
_2 = (unsigned int) _1;
_3 = __atomic_fetch_and_4 (v_9(D), _2, 0);
_4 = (int) _3;
_5 = _4 & mask_7;
We make
mask_2 = 1 << cnt_1;
_4 = __atomic_fetch_or_* (ptr_6, mask_2, _3);
_5 = _4 & mask_2;
and
mask_6 = 1 << bit_5(D);
_1 = ~mask_6;
_2 = __atomic_fetch_and_4 (v_8(D), _1, 0);
_3 = _2 & mask_6;
_4 = _3 != 0;
the canonical forms for this optimization and transform cases 1-8 to the
equivalent canonical form. For cases 9 and 10, we simply remove the cast
before __atomic_fetch_or_4/__atomic_fetch_and_4 with
_1 = 1 << bit_4(D);
_2 = __atomic_fetch_or_4 (v_7(D), _1, 0);
_3 = _2 & _1;
and
mask_7 = 1 << bit_6(D);
_1 = ~mask_7;
_3 = __atomic_fetch_and_4 (v_9(D), _1, 0);
_6 = _3 & mask_7;
_5 = (int) _6;
2021-11-04 H.J. Lu <hongjiu.lu@intel.com>
Hongtao Liu <hongtao.liu@intel.com>
gcc/
PR middle-end/102566
* match.pd (nop_atomic_bit_test_and_p): New match.
* tree-ssa-ccp.c (convert_atomic_bit_not): New function.
(gimple_nop_atomic_bit_test_and_p): New prototype.
(optimize_atomic_bit_test_and): Transform equivalent, but slightly
different cases to their canonical forms.
gcc/testsuite/
PR middle-end/102566
* g++.target/i386/pr102566-1.C: New test.
* g++.target/i386/pr102566-2.C: Likewise.
* g++.target/i386/pr102566-3.C: Likewise.
* g++.target/i386/pr102566-4.C: Likewise.
* g++.target/i386/pr102566-5a.C: Likewise.
* g++.target/i386/pr102566-5b.C: Likewise.
* g++.target/i386/pr102566-6a.C: Likewise.
* g++.target/i386/pr102566-6b.C: Likewise.
* gcc.target/i386/pr102566-1a.c: Likewise.
* gcc.target/i386/pr102566-1b.c: Likewise.
* gcc.target/i386/pr102566-2.c: Likewise.
* gcc.target/i386/pr102566-3a.c: Likewise.
* gcc.target/i386/pr102566-3b.c: Likewise.
* gcc.target/i386/pr102566-4.c: Likewise.
* gcc.target/i386/pr102566-5.c: Likewise.
* gcc.target/i386/pr102566-6.c: Likewise.
* gcc.target/i386/pr102566-7.c: Likewise.
* gcc.target/i386/pr102566-8a.c: Likewise.
* gcc.target/i386/pr102566-8b.c: Likewise.
* gcc.target/i386/pr102566-9a.c: Likewise.
* gcc.target/i386/pr102566-9b.c: Likewise.
* gcc.target/i386/pr102566-10a.c: Likewise.
* gcc.target/i386/pr102566-10b.c: Likewise.
* gcc.target/i386/pr102566-11.c: Likewise.
* gcc.target/i386/pr102566-12.c: Likewise.
* gcc.target/i386/pr102566-13.c: New test.
* gcc.target/i386/pr102566-14.c: New test.
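For readers who want source-level counterparts to the GIMPLE in the commit message above, here is a minimal sketch in C. The function names are hypothetical and are not taken from the testsuite files listed above. The exact GIMPLE each function lowers to, and whether lock bts/btr is emitted, depends on the GCC version and flags.

```c
#include <stdbool.h>

/* Constant mask, result tested against the same bit (compare cases 1-2). */
bool test_and_set_bit0(unsigned int *p)
{
    return (__atomic_fetch_or(p, 1u, __ATOMIC_RELAXED) & 1u) != 0;
}

bool test_and_reset_bit0(unsigned int *p)
{
    return (__atomic_fetch_and(p, ~1u, __ATOMIC_RELAXED) & 1u) != 0;
}

/* Variable bit number: the "mask = 1 << cnt" canonical forms. */
bool test_and_set_bit(unsigned int *p, unsigned int bit)
{
    unsigned int mask = 1u << bit;
    return (__atomic_fetch_or(p, mask, __ATOMIC_RELAXED) & mask) != 0;
}

bool test_and_reset_bit(unsigned int *p, unsigned int bit)
{
    unsigned int mask = 1u << bit;
    return (__atomic_fetch_and(p, ~mask, __ATOMIC_RELAXED) & mask) != 0;
}
```

Each function returns whether the bit was set before the operation, which is the information lock bts/btr leaves in the carry flag.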
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: marko.makela at mariadb dot com @ 2022-10-29 10:53 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
Marko Mäkelä <marko.makela at mariadb dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |marko.makela at mariadb dot com
--- Comment #31 from Marko Mäkelä <marko.makela at mariadb dot com> ---
Much of this seems to work in GCC 12.2.0 as well as in clang++-15. For clang
there is a related ticket https://github.com/llvm/llvm-project/issues/37322
I noticed a missed optimization in both g++-12 and clang++-15: Some operations
involving bit 31 degrade to loops around lock cmpxchg. I compiled it with "-c
-O2" (AMD64) or "-c -O2 -m32 -march=i686" (IA-32).
#include <atomic>
#include <cstdint>
template<uint32_t b>
void lock_bts(std::atomic<uint32_t> &a) { while (!(a.fetch_or(b) & b)); }
template<uint32_t b>
void lock_btr(std::atomic<uint32_t> &a) { while (a.fetch_and(~b) & b); }
template<uint32_t b>
void lock_btc(std::atomic<uint32_t> &a) { while (a.fetch_xor(b) & b); }
template void lock_bts<1U<<30>(std::atomic<uint32_t> &a);
template void lock_btr<1U<<30>(std::atomic<uint32_t> &a);
template void lock_btc<1U<<30>(std::atomic<uint32_t> &a);
// bug: uses lock cmpxchg
template void lock_bts<1U<<31>(std::atomic<uint32_t> &a);
template void lock_btr<1U<<31>(std::atomic<uint32_t> &a);
template void lock_btc<1U<<31>(std::atomic<uint32_t> &a);
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: crazylht at gmail dot com @ 2022-10-31 2:42 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #32 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Marko Mäkelä from comment #31)
> Much of this seems to work in GCC 12.2.0 as well as in clang++-15. For clang
> there is a related ticket https://github.com/llvm/llvm-project/issues/37322
>
> I noticed a missed optimization in both g++-12 and clang++-15: Some
> operations involving bit 31 degrade to loops around lock cmpxchg. I compiled
Bit 31 is the sign bit, and c = a & 1U << 31; c == 0 is optimized to
(signed int) a >= 0. The optimization in optimize_atomic_bit_test_and is
supposed to match a & 1U << 31, so it fails here. I guess it could be extended
to match (signed int) a >= 0 when the mask is 1U << 31.
<D.2055>:
<D.2054>:
_1 = __atomic_fetch_or_4 (v, 2147483648, 0);
_2 = (signed int) _1;
if (_2 >= 0) goto <D.2055>; else goto <D.2053>;
<D.2053>:
return;
}
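The rewrite described above (a mask test on bit 31 becoming a signed comparison) can be checked in isolation. This is a minimal sketch with hypothetical helper names. It assumes GCC's documented modulo behaviour when converting an out-of-range unsigned value to a signed type.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 31 tested with an explicit mask. */
bool bit31_clear_by_mask(uint32_t a)
{
    return (a & (UINT32_C(1) << 31)) == 0;
}

/* Same predicate written as a sign test.  The conversion to int32_t is
   implementation-defined in ISO C; GCC reduces the value modulo 2^32. */
bool bit31_clear_by_sign(uint32_t a)
{
    return (int32_t) a >= 0;
}
```

Both predicates agree for every 32-bit input, so a pattern matcher that only recognizes the mask form misses code the compiler has already folded into the sign form.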
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: marko.makela at mariadb dot com @ 2022-11-01 8:41 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #33 from Marko Mäkelä <marko.makela at mariadb dot com> ---
When it comes to toggling the most significant bit, std::atomic::fetch_xor()
could be translated to LOCK XADD which would be able to return all bits:
#include <atomic>
#include <cstdint>
uint32_t toggle_by_add(std::atomic<uint32_t>& a)
{
return a.fetch_add(1U<<31);
}
uint32_t toggle_by_xor(std::atomic<uint32_t>& a)
{
return a.fetch_xor(1U<<31);
}
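The two functions above are interchangeable because adding 2^31 modulo 2^32 flips only bit 31: the carry out of the top bit is discarded. A minimal C sketch of the same idea with the GCC builtins follows. The function names are hypothetical, and which instruction the compiler selects is not shown here.

```c
#include <stdint.h>

/* Toggle the most significant bit with an atomic add; returns the old value. */
uint32_t toggle_msb_by_add(uint32_t *a)
{
    return __atomic_fetch_add(a, UINT32_C(1) << 31, __ATOMIC_SEQ_CST);
}

/* Toggle the most significant bit with an atomic xor; returns the old value. */
uint32_t toggle_msb_by_xor(uint32_t *a)
{
    return __atomic_fetch_xor(a, UINT32_C(1) << 31, __ATOMIC_SEQ_CST);
}
```

The equivalence holds only for the most significant bit. For any lower bit the add can carry into the bits above it.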
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: hjl.tools at gmail dot com @ 2022-11-01 16:46 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #34 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 53813
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53813&action=edit
A patch to handle if (_5 < 0)
A patch to extend optimization for
_1 = __atomic_fetch_or_4 (ptr_6, 0x80000000, _3);
_5 = (signed int) _1;
_4 = _5 >= 0;
to
_1 = __atomic_fetch_or_4 (ptr_6, 0x80000000, _3);
_5 = (signed int) _1;
if (_5 >= 0)
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: cvs-commit at gcc dot gnu.org @ 2022-11-07 19:20 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #35 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by H.J. Lu <hjl@gcc.gnu.org>:
https://gcc.gnu.org/g:03ed4e57e3d46a61513b3d1ab1720997aec8cf71
commit r13-3760-g03ed4e57e3d46a61513b3d1ab1720997aec8cf71
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Tue Nov 1 09:49:18 2022 -0700
Extend optimization for integer bit test on __atomic_fetch_[or|and]_*
Extend optimization for
_1 = __atomic_fetch_or_4 (ptr_6, 0x80000000, _3);
_5 = (signed int) _1;
_4 = _5 >= 0;
to
_1 = __atomic_fetch_or_4 (ptr_6, 0x80000000, _3);
_5 = (signed int) _1;
if (_5 >= 0)
gcc/
PR middle-end/102566
* tree-ssa-ccp.cc (optimize_atomic_bit_test_and): Also handle
if (_5 < 0) and if (_5 >= 0).
gcc/testsuite/
PR middle-end/102566
* g++.target/i386/pr102566-7.C: New test.
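The difference between the two shapes in the commit message above is whether the sign test is stored in a value or feeds a branch directly. A minimal sketch in C follows. The names are hypothetical and this is not the content of pr102566-7.C.

```c
#include <stdbool.h>

/* Value use: the comparison result is returned ("_4 = _5 >= 0"). */
bool set_bit31_was_clear(unsigned int *p)
{
    return (int) __atomic_fetch_or(p, 0x80000000u, __ATOMIC_RELAXED) >= 0;
}

/* Branch use: the comparison controls a loop ("if (_5 < 0)").
   Spins until the fetch_or observes bit 31 clear, i.e. until this
   caller is the one that set it. */
void acquire_bit31(unsigned int *p)
{
    while ((int) __atomic_fetch_or(p, 0x80000000u, __ATOMIC_ACQUIRE) < 0)
        ;
}
```

The second function has the same spin-loop shape as the lock_bts example in comment #31.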
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: hjl.tools at gmail dot com @ 2023-01-18 18:49 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #36 from H.J. Lu <hjl.tools at gmail dot com> ---
(1 << (x)) works, but (((unsigned int) 1) << (x)) doesn't work:
[hjl@gnu-skx-1 gcc]$ cat bar.c
void bar (void);
#define MASK1(x) (1 << (x))
void
f1 (unsigned int *a, unsigned int bit)
{
if ((__atomic_fetch_xor (a, MASK1 (bit), __ATOMIC_RELAXED) & MASK1 (bit)))
bar ();
}
#define MASK2(x) (((unsigned int) 1) << (x))
void
f2 (unsigned int *a, unsigned int bit)
{
if ((__atomic_fetch_xor (a, MASK2 (bit), __ATOMIC_RELAXED) & MASK2 (bit)))
bar ();
}
[hjl@gnu-skx-1 gcc]$ ./xgcc -B./ -S -O2 bar.c
[hjl@gnu-skx-1 gcc]$ cat bar.s
.file "bar.c"
.text
.p2align 4
.globl f1
.type f1, @function
f1:
.LFB0:
.cfi_startproc
lock btcl %esi, (%rdi)
jc .L4
ret
.p2align 4,,10
.p2align 3
.L4:
jmp bar
.cfi_endproc
.LFE0:
.size f1, .-f1
.p2align 4
.globl f2
.type f2, @function
f2:
.LFB1:
.cfi_startproc
movl %esi, %ecx
movl $1, %edx
movl (%rdi), %eax
sall %cl, %edx
.L6:
movl %eax, %r8d
movl %eax, %esi
xorl %edx, %r8d
lock cmpxchgl %r8d, (%rdi)
jne .L6
btl %ecx, %esi
jc .L10
ret
.p2align 4,,10
.p2align 3
.L10:
jmp bar
.cfi_endproc
.LFE1:
.size f2, .-f2
.ident "GCC: (GNU) 13.0.1 20230118 (experimental)"
.section .note.GNU-stack,"",@progbits
[hjl@gnu-skx-1 gcc]$
* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
From: hjl.tools at gmail dot com @ 2023-01-18 22:30 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
--- Comment #37 from H.J. Lu <hjl.tools at gmail dot com> ---
It is
if ((__atomic_fetch_xor_4 ((volatile void *) a, (unsigned int) (1 << bit), 0)
& (unsigned int) (1 << bit)) != 0)
vs
if ((__atomic_fetch_xor_4 ((volatile void *) a, 1 << bit, 0) >> bit & 1) != 0)
Why does GCC generate the second one?
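Both expressions above test the same bit of the value returned by the atomic xor; they differ only in whether the bit is isolated by a mask or by a shift. A minimal sketch with hypothetical function names follows. It assumes bit < 32 and uses an unsigned 1u in both functions to avoid shifting into the sign bit of int.

```c
#include <stdbool.h>

/* Mask form: (old & (1 << bit)) != 0. */
bool toggled_bit_was_set_mask(unsigned int *a, unsigned int bit)
{
    unsigned int mask = 1u << bit;
    return (__atomic_fetch_xor(a, mask, __ATOMIC_RELAXED) & mask) != 0;
}

/* Shift form: ((old >> bit) & 1) != 0. */
bool toggled_bit_was_set_shift(unsigned int *a, unsigned int bit)
{
    return ((__atomic_fetch_xor(a, 1u << bit, __ATOMIC_RELAXED) >> bit) & 1u) != 0;
}
```

Since the two forms compute the same result, a matcher that recognizes only the mask form leaves the shift form on the cmpxchg loop seen in f2 above.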