public inbox for gcc-bugs@sourceware.org
* [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
@ 2021-10-02 16:08 thiago at kde dot org
  2021-10-02 20:30 ` [Bug target/102566] " pinskia at gcc dot gnu.org
                   ` (38 more replies)
  0 siblings, 39 replies; 40+ messages in thread
From: thiago at kde dot org @ 2021-10-02 16:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

            Bug ID: 102566
           Summary: [i386] GCC should emit LOCK BTS for simple
                    bit-test-and-set operations with std::atomic
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: thiago at kde dot org
  Target Milestone: ---

Simple test:

$ cat test.cpp
#include <atomic>
bool tbit(std::atomic<int> &i)
{
    return i.fetch_or(1, std::memory_order_relaxed) & 1;
}

The sequence x.fetch_or(singlebit_constant) & singlebit_constant can be
implemented with a single LOCK BTS instruction. The above should emit:

    lock btsl $0, (%rdi)
    setb %al
    ret

But instead it emits a cmpxchg loop - see https://gcc.godbolt.org/z/99enKaffa.
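
As a minimal sketch of the idiom itself (not taken from the report; the helper
and demo names are hypothetical), the pattern below atomically sets one bit and
reports whether it was already set. Which instructions a given compiler emits
for it is exactly what this bug is about, so the code only illustrates the
semantics:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: atomically set bit `Bit` and report whether it was
// already set. This is the fetch_or(mask) & mask shape from the report.
template <unsigned Bit>
bool test_and_set_bit(std::atomic<unsigned> &word)
{
    static_assert(Bit < 32, "sketch assumes a 32-bit unsigned");
    constexpr unsigned mask = 1u << Bit;
    return (word.fetch_or(mask, std::memory_order_relaxed) & mask) != 0;
}

// Hypothetical usage check.
void demo_test_and_set()
{
    std::atomic<unsigned> word{0};
    assert(!test_and_set_bit<0>(word)); // bit 0 was clear and is now set
    assert(test_and_set_bit<0>(word));  // a second call sees it set
    assert(word.load() == 1u);
}
```

The mask 1 corresponds to bit index 0, which is the immediate operand a BTS
instruction would take for this case.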

This was found while reviewing MariaDB's lightweight-mutex code, which uses the
sign bit to indicate a contended mutex. See this commit[1] by one of their
maintainers, which removes fetch_or because it emits an extra loop.

Bonus: LOCK BTR can be used for the sequence
x.fetch_and(~single_bit_constant) & single_bit_constant.

[1]
https://github.com/dr-m/atomic_sync/commit/d5e22b2d42cdbac7a15d242bf1446377555c4041

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [Bug target/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
@ 2021-10-02 20:30 ` pinskia at gcc dot gnu.org
  2021-10-03 12:56 ` hjl.tools at gmail dot com
                   ` (37 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-10-02 20:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
           Severity|normal                      |enhancement


* [Bug target/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
  2021-10-02 20:30 ` [Bug target/102566] " pinskia at gcc dot gnu.org
@ 2021-10-03 12:56 ` hjl.tools at gmail dot com
  2021-10-03 13:06 ` hjl.tools at gmail dot com
                   ` (36 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: hjl.tools at gmail dot com @ 2021-10-03 12:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|unknown                     |12.0
   Last reconfirmed|                            |2021-10-03
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #1 from H.J. Lu <hjl.tools at gmail dot com> ---
The C code:

#include <stdatomic.h>

_Atomic int v;

int
foo ()
{
  return atomic_fetch_or_explicit (&v, 1, memory_order_relaxed) & 1;
}


* [Bug target/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
  2021-10-02 20:30 ` [Bug target/102566] " pinskia at gcc dot gnu.org
  2021-10-03 12:56 ` hjl.tools at gmail dot com
@ 2021-10-03 13:06 ` hjl.tools at gmail dot com
  2021-10-03 13:19 ` [Bug tree-optimization/102566] " hjl.tools at gmail dot com
                   ` (35 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: hjl.tools at gmail dot com @ 2021-10-03 13:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #2 from H.J. Lu <hjl.tools at gmail dot com> ---
This works:

[hjl@gnu-cfl-2 pr102566]$ cat y.c 
#include <stdatomic.h>

_Atomic int v;

unsigned int
foo ()
{
  return atomic_fetch_or_explicit (&v, 1, memory_order_relaxed) & 1;
}
[hjl@gnu-cfl-2 pr102566]$ make y.s
/export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/ -O2 -S y.c
[hjl@gnu-cfl-2 pr102566]$ cat y.s
        .file   "y.c"
        .text
        .p2align 4
        .globl  foo
        .type   foo, @function
foo:
.LFB0:
        .cfi_startproc
        xorl    %eax, %eax
        lock btsl       $0, v(%rip)
        setc    %al
        ret
        .cfi_endproc
.LFE0:
        .size   foo, .-foo
        .globl  v
        .bss
        .align 4
        .type   v, @object
        .size   v, 4
v:
        .zero   4
        .ident  "GCC: (GNU) 12.0.0 20211003 (experimental)"
        .section        .note.GNU-stack,"",@progbits
[hjl@gnu-cfl-2 pr102566]$


* [Bug tree-optimization/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (2 preceding siblings ...)
  2021-10-03 13:06 ` hjl.tools at gmail dot com
@ 2021-10-03 13:19 ` hjl.tools at gmail dot com
  2021-10-03 15:10 ` hjl.tools at gmail dot com
                   ` (34 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: hjl.tools at gmail dot com @ 2021-10-03 13:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|target                      |tree-optimization

--- Comment #3 from H.J. Lu <hjl.tools at gmail dot com> ---
optimize_atomic_bit_test_and works on

  _1 = __atomic_fetch_or_4 (&v, 1, 0);
  _4 = _1 & 1;

but fails on

  _1 = __atomic_fetch_or_4 (&v, 1, 0);
  _2 = (int) _1;
  _5 = _2 & 1;
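
A source-level sketch of the two shapes, assuming they arise as in comments #1
and #2 (the int-returning function keeps a cast before the AND, the unsigned
one does not). The function names are hypothetical, and whether a given
compiler produces exactly this GIMPLE is an assumption; -fdump-tree-optimized
shows what it actually generates:

```cpp
#include <atomic>
#include <cassert>

std::atomic<int> v{0};

// Result stays int. Assumed to correspond to the shape with
// "_2 = (int) _1" between the builtin call and the AND.
int foo_int()
{
    return v.fetch_or(1, std::memory_order_relaxed) & 1;
}

// Result is converted to unsigned. Assumed to correspond to the shape
// where the AND applies directly to the builtin's result.
unsigned foo_unsigned()
{
    return v.fetch_or(1, std::memory_order_relaxed) & 1;
}

// Both functions compute the same value; only the intermediate form differs.
void demo_shapes()
{
    v.store(0);
    assert(foo_int() == 0);       // bit 0 was clear
    assert(foo_unsigned() == 1u); // bit 0 was set by the previous call
    assert(foo_int() == 1);
}
```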


* [Bug tree-optimization/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (3 preceding siblings ...)
  2021-10-03 13:19 ` [Bug tree-optimization/102566] " hjl.tools at gmail dot com
@ 2021-10-03 15:10 ` hjl.tools at gmail dot com
  2021-10-03 16:45 ` [Bug middle-end/102566] " hjl.tools at gmail dot com
                   ` (33 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: hjl.tools at gmail dot com @ 2021-10-03 15:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #4 from H.J. Lu <hjl.tools at gmail dot com> ---
Can we convert

  _1 = __atomic_fetch_or_4 (&v, 1, 0);
  _2 = (int) _1;
  _5 = _2 & 1;

to

  _1 = __atomic_fetch_or_4 (&v, 1, 0);
  _2 = _1 & 1;
  _5 = (int) _2;
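
The two orders give the same value for any single-bit mask, which a quick
check can illustrate. This sketch assumes a 32-bit int and two's-complement
conversion (guaranteed since C++20, implementation-defined but universal in
practice before that); the function names are hypothetical:

```cpp
#include <cassert>
#include <cstdint>

// Convert to int first, then apply the mask (the form that was not matched).
constexpr int cast_then_mask(std::uint32_t old, std::uint32_t mask)
{
    return static_cast<int>(old) & static_cast<int>(mask);
}

// Apply the mask first, then convert to int (the proposed form).
constexpr int mask_then_cast(std::uint32_t old, std::uint32_t mask)
{
    return static_cast<int>(old & mask);
}

// Compare both forms for every single-bit mask of a 32-bit word.
bool forms_agree(std::uint32_t old)
{
    for (unsigned bit = 0; bit < 32; ++bit) {
        const std::uint32_t mask = std::uint32_t{1} << bit;
        if (cast_then_mask(old, mask) != mask_then_cast(old, mask))
            return false;
    }
    return true;
}

void demo_cast_order()
{
    assert(forms_agree(0u));
    assert(forms_agree(1u));
    assert(forms_agree(0x80000000u)); // includes the sign-bit case
    assert(forms_agree(0xFFFFFFFFu));
}
```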


* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (4 preceding siblings ...)
  2021-10-03 15:10 ` hjl.tools at gmail dot com
@ 2021-10-03 16:45 ` hjl.tools at gmail dot com
  2021-10-03 22:36 ` hjl.tools at gmail dot com
                   ` (32 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: hjl.tools at gmail dot com @ 2021-10-03 16:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #5 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 51536
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51536&action=edit
A patch

Please try this.


* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (5 preceding siblings ...)
  2021-10-03 16:45 ` [Bug middle-end/102566] " hjl.tools at gmail dot com
@ 2021-10-03 22:36 ` hjl.tools at gmail dot com
  2021-10-04 15:58 ` thiago at kde dot org
                   ` (31 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: hjl.tools at gmail dot com @ 2021-10-03 22:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #51536|0                           |1
        is obsolete|                            |

--- Comment #6 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 51543
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51543&action=edit
The v2 patch


* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (6 preceding siblings ...)
  2021-10-03 22:36 ` hjl.tools at gmail dot com
@ 2021-10-04 15:58 ` thiago at kde dot org
  2021-10-04 16:10 ` thiago at kde dot org
                   ` (30 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: thiago at kde dot org @ 2021-10-04 15:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #7 from Thiago Macieira <thiago at kde dot org> ---
(In reply to H.J. Lu from comment #5)
> Created attachment 51536 [details]
> A patch
> 
> Please try this.

Give me an hour (will try v2).


* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (7 preceding siblings ...)
  2021-10-04 15:58 ` thiago at kde dot org
@ 2021-10-04 16:10 ` thiago at kde dot org
  2021-10-04 16:13 ` thiago at kde dot org
                   ` (29 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: thiago at kde dot org @ 2021-10-04 16:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #8 from Thiago Macieira <thiago at kde dot org> ---
$ cat /tmp/test.cpp  
#include <atomic>
bool tbit(std::atomic<int> &i)
{
   return i.fetch_or(1, std::memory_order_relaxed) & 1;
}
$ ~/dev/gcc/bin/gcc -S -o - -O2 /tmp/test.cpp  
       .file   "test.cpp"
       .text
       .p2align 4
       .globl  _Z4tbitRSt6atomicIiE
       .type   _Z4tbitRSt6atomicIiE, @function
_Z4tbitRSt6atomicIiE:
.LFB339:
       .cfi_startproc
       lock btsl       $0, (%rdi)
       setc    %al
       ret
       .cfi_endproc
.LFE339:
       .size   _Z4tbitRSt6atomicIiE, .-_Z4tbitRSt6atomicIiE
       .ident  "GCC: (GNU) 12.0.0 20211004 (experimental)"
       .section        .note.GNU-stack,"",@progbits

+1


* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (8 preceding siblings ...)
  2021-10-04 16:10 ` thiago at kde dot org
@ 2021-10-04 16:13 ` thiago at kde dot org
  2021-10-04 21:46 ` hjl.tools at gmail dot com
                   ` (28 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: thiago at kde dot org @ 2021-10-04 16:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #9 from Thiago Macieira <thiago at kde dot org> ---
Looks like it doesn't work for the sign bit.

$ cat /tmp/test.cpp 
#include <atomic>
bool tbit(std::atomic<int> &i)
{
    return i.fetch_or(CONSTANT, std::memory_order_relaxed) & CONSTANT;
}
$ ~/dev/gcc/bin/gcc -DCONSTANT='(1<<30)' -S -o - -O2 /tmp/test.cpp | sed -n '/startproc/,/endproc/p'
        .cfi_startproc
        lock btsl       $30, (%rdi)
        setc    %al
        ret
        .cfi_endproc
$ ~/dev/gcc/bin/gcc -DCONSTANT='(1<<31)' -S -o - -O2 /tmp/test.cpp | sed -n '/startproc/,/endproc/p'
        .cfi_startproc
        movl    (%rdi), %eax
.L2:
        movl    %eax, %ecx
        movl    %eax, %edx
        orl     $-2147483648, %ecx
        lock cmpxchgl   %ecx, (%rdi)
        jne     .L2
        shrl    $31, %edx
        movl    %edx, %eax
        ret
        .cfi_endproc

Changing to std::atomic<unsigned> makes no difference.
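
For reference, a sketch of the sign-bit case written without shifting a 1 into
the sign position of an int. The names kSignBit and set_sign_bit are
hypothetical; the mask has the same value as the (1<<31) used above on targets
with a 32-bit int:

```cpp
#include <atomic>
#include <cassert>
#include <limits>

// Hypothetical flag: the sign bit of an int, taken from numeric_limits
// rather than computed with a shift.
constexpr int kSignBit = std::numeric_limits<int>::min();

// Atomically set the sign bit and report whether it was already set.
bool set_sign_bit(std::atomic<int> &word)
{
    return (word.fetch_or(kSignBit, std::memory_order_relaxed) & kSignBit) != 0;
}

void demo_sign_bit()
{
    std::atomic<int> word{0};
    assert(!set_sign_bit(word)); // sign bit was clear
    assert(set_sign_bit(word));  // now it is set
    assert(word.load() < 0);     // value is negative once the sign bit is set
}
```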


* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (9 preceding siblings ...)
  2021-10-04 16:13 ` thiago at kde dot org
@ 2021-10-04 21:46 ` hjl.tools at gmail dot com
  2021-10-04 23:25 ` thiago at kde dot org
                   ` (27 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: hjl.tools at gmail dot com @ 2021-10-04 21:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #51543|0                           |1
        is obsolete|                            |

--- Comment #10 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 51549
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51549&action=edit
The v3 patch

Please try the v3 patch.


* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (10 preceding siblings ...)
  2021-10-04 21:46 ` hjl.tools at gmail dot com
@ 2021-10-04 23:25 ` thiago at kde dot org
  2021-10-04 23:26 ` thiago at kde dot org
                   ` (26 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: thiago at kde dot org @ 2021-10-04 23:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #11 from Thiago Macieira <thiago at kde dot org> ---
$ for ((i=0;i<32;++i)); do ~/dev/gcc/bin/gcc "-DCONSTANT=(1<<$i)" -S -o - -O2 /tmp/test.cpp | grep bts; done
        lock btsl       $0, (%rdi)
        lock btsl       $1, (%rdi)
        lock btsl       $2, (%rdi)
        lock btsl       $3, (%rdi)
        lock btsl       $4, (%rdi)
        lock btsl       $5, (%rdi)
        lock btsl       $6, (%rdi)
        lock btsl       $7, (%rdi)
        lock btsl       $8, (%rdi)
        lock btsl       $9, (%rdi)
        lock btsl       $10, (%rdi)
        lock btsl       $11, (%rdi)
        lock btsl       $12, (%rdi)
        lock btsl       $13, (%rdi)
        lock btsl       $14, (%rdi)
        lock btsl       $15, (%rdi)
        lock btsl       $16, (%rdi)
        lock btsl       $17, (%rdi)
        lock btsl       $18, (%rdi)
        lock btsl       $19, (%rdi)
        lock btsl       $20, (%rdi)
        lock btsl       $21, (%rdi)
        lock btsl       $22, (%rdi)
        lock btsl       $23, (%rdi)
        lock btsl       $24, (%rdi)
        lock btsl       $25, (%rdi)
        lock btsl       $26, (%rdi)
        lock btsl       $27, (%rdi)
        lock btsl       $28, (%rdi)
        lock btsl       $29, (%rdi)
        lock btsl       $30, (%rdi)
        lock btsl       $31, (%rdi)

And after changing to long:

$ for ((i=32;i<64;++i)); do ~/dev/gcc/bin/gcc "-DCONSTANT=(1L<<$i)" -S -o - -O2 /tmp/test.cpp | grep bts; done
        lock btsq       $32, (%rdi)
        lock btsq       $33, (%rdi)
        lock btsq       $34, (%rdi)
        lock btsq       $35, (%rdi)
        lock btsq       $36, (%rdi)
        lock btsq       $37, (%rdi)
        lock btsq       $38, (%rdi)
        lock btsq       $39, (%rdi)
        lock btsq       $40, (%rdi)
        lock btsq       $41, (%rdi)
        lock btsq       $42, (%rdi)
        lock btsq       $43, (%rdi)
        lock btsq       $44, (%rdi)
        lock btsq       $45, (%rdi)
        lock btsq       $46, (%rdi)
        lock btsq       $47, (%rdi)
        lock btsq       $48, (%rdi)
        lock btsq       $49, (%rdi)
        lock btsq       $50, (%rdi)
        lock btsq       $51, (%rdi)
        lock btsq       $52, (%rdi)
        lock btsq       $53, (%rdi)
        lock btsq       $54, (%rdi)
        lock btsq       $55, (%rdi)
        lock btsq       $56, (%rdi)
        lock btsq       $57, (%rdi)
        lock btsq       $58, (%rdi)
        lock btsq       $59, (%rdi)
        lock btsq       $60, (%rdi)
        lock btsq       $61, (%rdi)
        lock btsq       $62, (%rdi)
        lock btsq       $63, (%rdi)

But:

$ cat /tmp/test2.cpp 
#include <atomic>
bool tbit(std::atomic<long> &i)
{
  return i.fetch_or(1, std::memory_order_relaxed) & (~1);
}
$ ~/dev/gcc/bin/gcc -S -o - -O2 /tmp/test2.cpp
        .file   "test.cpp"
        .text
/tmp/test.cpp: In function ‘bool tbit(std::atomic<long int>&)’:
/tmp/test.cpp:2:6: error: type mismatch in binary expression
    2 | bool tbit(std::atomic<long> &i)
      |      ^~~~
long int

long unsigned int

__int_type

_9 = _6 & -2;
during GIMPLE pass: fab
/tmp/test.cpp:2:6: internal compiler error: verify_gimple failed
0x119fbba verify_gimple_in_cfg(function*, bool)
        /home/tjmaciei/src/gcc/gcc/tree-cfg.c:5576
0x106ced7 execute_function_todo
        /home/tjmaciei/src/gcc/gcc/passes.c:2042
0x106d8fb execute_todo
        /home/tjmaciei/src/gcc/gcc/passes.c:2096
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
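
One reading of why this input needs care (an interpretation, not something
stated in the thread): fetch_or(1) & ~1 does not ask about the bit being set at
all. It asks whether any other bit was set, so it cannot be expressed as a
single bit-test-and-set. A sketch with hypothetical names:

```cpp
#include <atomic>
#include <cassert>

// The test2.cpp shape: sets bit 0 but tests every bit except bit 0.
bool other_bits_were_set(std::atomic<long> &word)
{
    return (word.fetch_or(1, std::memory_order_relaxed) & ~1L) != 0;
}

// The matching-mask shape: sets bit 0 and tests bit 0.
bool bit0_was_set(std::atomic<long> &word)
{
    return (word.fetch_or(1, std::memory_order_relaxed) & 1L) != 0;
}

void demo_mask_mismatch()
{
    std::atomic<long> a{1};
    assert(!other_bits_were_set(a)); // old value 1: no bit other than bit 0
    assert(bit0_was_set(a));         // the same state answers "true" here

    std::atomic<long> b{2};
    assert(other_bits_were_set(b));  // old value 2: bit 1 was set
    assert(b.load() == 3);           // bit 0 is set either way

    std::atomic<long> c{2};
    assert(!bit0_was_set(c));        // old value 2: bit 0 was clear
}
```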


* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (11 preceding siblings ...)
  2021-10-04 23:25 ` thiago at kde dot org
@ 2021-10-04 23:26 ` thiago at kde dot org
  2021-10-05  4:40 ` hjl.tools at gmail dot com
                   ` (25 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: thiago at kde dot org @ 2021-10-04 23:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #12 from Thiago Macieira <thiago at kde dot org> ---
Commit 7e0c0500808d58bca5b8e23cbd474022c32234e4 + your patch.


* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (12 preceding siblings ...)
  2021-10-04 23:26 ` thiago at kde dot org
@ 2021-10-05  4:40 ` hjl.tools at gmail dot com
  2021-10-05 15:23 ` hjl.tools at gmail dot com
                   ` (24 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: hjl.tools at gmail dot com @ 2021-10-05  4:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #51549|0                           |1
        is obsolete|                            |

--- Comment #13 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 51551
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51551&action=edit
The v4 patch

Please try the v4 patch.


* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (13 preceding siblings ...)
  2021-10-05  4:40 ` hjl.tools at gmail dot com
@ 2021-10-05 15:23 ` hjl.tools at gmail dot com
  2021-10-05 15:57 ` thiago at kde dot org
                   ` (23 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: hjl.tools at gmail dot com @ 2021-10-05 15:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #51551|0                           |1
        is obsolete|                            |

--- Comment #14 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 51556
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51556&action=edit
The v5 patch

Changes in v5:

1. Check SSA_NAME before SSA_NAME_OCCURS_IN_ABNORMAL_PHI.


* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (14 preceding siblings ...)
  2021-10-05 15:23 ` hjl.tools at gmail dot com
@ 2021-10-05 15:57 ` thiago at kde dot org
  2021-10-05 16:02 ` pinskia at gcc dot gnu.org
                   ` (22 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: thiago at kde dot org @ 2021-10-05 15:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #15 from Thiago Macieira <thiago at kde dot org> ---
Works now for the failing case. Additionally:

bool tbit(std::atomic<long> &i)
{
  return i.fetch_and(~CONSTANT, std::memory_order_relaxed) & (CONSTANT);
}

This properly produces LOCK BTR with CONSTANT=2:

        lock btrq       $1, (%rdi)
        setc    %al
        ret

CONSTANT=(1L<<62):

        lock btrq       $62, (%rdi)
        setc    %al
        ret

But not for CONSTANT=1 or CONSTANT=(1L<<63):
        movq    (%rdi), %rax
.L2:
        movq    %rax, %rcx
        movq    %rax, %rdx
        andq    $-2, %rcx
        lock cmpxchgq   %rcx, (%rdi)
        jne     .L2
        movl    %edx, %eax
        andl    $1, %eax
        ret

Same applies to 1<<31 for atomic<int>.
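
A sketch of the test-and-clear idiom covering both edge bits mentioned above
(bit 0 and the top bit). It uses an unsigned 64-bit word so the top-bit mask
can be built without shifting into a sign bit; the helper and demo names are
hypothetical:

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper: atomically clear bit `Bit` and report whether it was
// set beforehand. This is the fetch_and(~mask) & mask shape.
template <unsigned Bit>
bool test_and_clear_bit(std::atomic<std::uint64_t> &word)
{
    static_assert(Bit < 64, "bit index out of range");
    constexpr std::uint64_t mask = std::uint64_t{1} << Bit;
    return (word.fetch_and(~mask, std::memory_order_relaxed) & mask) != 0;
}

void demo_test_and_clear()
{
    // Start with bit 63 and bit 0 set.
    std::atomic<std::uint64_t> word{(std::uint64_t{1} << 63) | 1u};
    assert(test_and_clear_bit<63>(word)); // top bit was set
    assert(test_and_clear_bit<0>(word));  // lowest bit was set
    assert(!test_and_clear_bit<0>(word)); // already cleared
    assert(word.load() == 0);
}
```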


* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (15 preceding siblings ...)
  2021-10-05 15:57 ` thiago at kde dot org
@ 2021-10-05 16:02 ` pinskia at gcc dot gnu.org
  2021-10-05 19:26 ` hjl.tools at gmail dot com
                   ` (21 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-10-05 16:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #16 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to H.J. Lu from comment #14)
> Created attachment 51556 [details]
> The v5 patch
> 
> Changes in v5:
> 
> 1. Check SSA_NAME before SSA_NAME_OCCURS_IN_ABNORMAL_PHI.

Why don't you just move this to match.pd instead as suggested by Richard B. on
the mailing list?  Then you get the check for SSA_NAME_OCCURS_IN_ABNORMAL_PHI
for free and such.  Plus other passes will do the optimization too ....


* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (16 preceding siblings ...)
  2021-10-05 16:02 ` pinskia at gcc dot gnu.org
@ 2021-10-05 19:26 ` hjl.tools at gmail dot com
  2021-10-05 19:30 ` hjl.tools at gmail dot com
                   ` (20 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: hjl.tools at gmail dot com @ 2021-10-05 19:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #51556|0                           |1
        is obsolete|                            |

--- Comment #17 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 51558
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51558&action=edit
The v6 patch

Please try this.


* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (17 preceding siblings ...)
  2021-10-05 19:26 ` hjl.tools at gmail dot com
@ 2021-10-05 19:30 ` hjl.tools at gmail dot com
  2021-10-05 19:36 ` thiago at kde dot org
                   ` (19 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: hjl.tools at gmail dot com @ 2021-10-05 19:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #18 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Andrew Pinski from comment #16)
> (In reply to H.J. Lu from comment #14)
> > Created attachment 51556 [details]
> > The v5 patch
> > 
> > Changes in v5:
> > 
> > 1. Check SSA_NAME before SSA_NAME_OCCURS_IN_ABNORMAL_PHI.
> 
> Why don't you just move this to match.pd instead as suggested by Richard B.
> on the mailing list?  Then you get the check for
> SSA_NAME_OCCURS_IN_ABNORMAL_PHI for free and such.  Plus other passes will
> do the optimization too ....

Without __atomic_fetch_or_* or __atomic_fetch_and_*, the conversion isn't
needed.  We also need to check the mask of the atomic builtin.


* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (18 preceding siblings ...)
  2021-10-05 19:30 ` hjl.tools at gmail dot com
@ 2021-10-05 19:36 ` thiago at kde dot org
  2021-10-06  8:00 ` jakub at gcc dot gnu.org
                   ` (18 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: thiago at kde dot org @ 2021-10-05 19:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #19 from Thiago Macieira <thiago at kde dot org> ---
(In reply to H.J. Lu from comment #17)
> Created attachment 51558 [details]
> The v6 patch
> 
> Please try this.

Confirmed for all inputs.


* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (19 preceding siblings ...)
  2021-10-05 19:36 ` thiago at kde dot org
@ 2021-10-06  8:00 ` jakub at gcc dot gnu.org
  2021-10-06 15:40 ` thiago at kde dot org
                   ` (17 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-10-06  8:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
Bug 102566 depends on bug 49244, which changed state.

Bug 49244 Summary: __sync or __atomic builtins will not emit 'lock bts/btr/btc'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49244

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED


* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (20 preceding siblings ...)
  2021-10-06  8:00 ` jakub at gcc dot gnu.org
@ 2021-10-06 15:40 ` thiago at kde dot org
  2021-10-06 15:47 ` hjl.tools at gmail dot com
                   ` (16 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: thiago at kde dot org @ 2021-10-06 15:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #20 from Thiago Macieira <thiago at kde dot org> ---
And:

$ cat /tmp/test.cpp 
#include <atomic>
bool tbit(std::atomic<long> &i)
{
  return i.fetch_xor(CONSTANT, std::memory_order_relaxed) & (CONSTANT);
}
$ ~/dev/gcc/bin/gcc "-DCONSTANT=(1LL<<63)" -S -o - -O2 /tmp/test.cpp | sed '1,/startproc/d;/endproc/,$d'
        lock btcq       $63, (%rdi)
        setc    %al
        ret

Nice!
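
For reference, the three idioms discussed in this bug can be sketched as
follows. This is a minimal illustration and not code from the patch; the
function names and the kBit constant are hypothetical. With a compiler that
carries this optimization, the three bodies are candidates for lock bts,
lock btr and lock btc respectively, although the code actually generated
depends on the compiler version and flags.

```cpp
#include <atomic>

// Hypothetical single-bit mask chosen for illustration (bit 3).
constexpr long kBit = 1L << 3;

// fetch_or followed by a test of the same bit: candidate for LOCK BTS.
bool test_and_set_bit(std::atomic<long> &v)
{
    return v.fetch_or(kBit, std::memory_order_relaxed) & kBit;
}

// fetch_and with the inverted mask: candidate for LOCK BTR.
bool test_and_reset_bit(std::atomic<long> &v)
{
    return v.fetch_and(~kBit, std::memory_order_relaxed) & kBit;
}

// fetch_xor with the mask: candidate for LOCK BTC.
bool test_and_complement_bit(std::atomic<long> &v)
{
    return v.fetch_xor(kBit, std::memory_order_relaxed) & kBit;
}
```

Each function returns the previous state of the bit, which is what the carry
flag holds after the corresponding bit-test instruction.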

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (21 preceding siblings ...)
  2021-10-06 15:40 ` thiago at kde dot org
@ 2021-10-06 15:47 ` hjl.tools at gmail dot com
  2021-10-06 15:54 ` thiago at kde dot org
                   ` (15 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: hjl.tools at gmail dot com @ 2021-10-06 15:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #51558|0                           |1
        is obsolete|                            |

--- Comment #21 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 51559
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51559&action=edit
The new v3 patch

The new v3 patch to check invalid mask.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (22 preceding siblings ...)
  2021-10-06 15:47 ` hjl.tools at gmail dot com
@ 2021-10-06 15:54 ` thiago at kde dot org
  2021-10-06 16:05 ` hjl.tools at gmail dot com
                   ` (14 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: thiago at kde dot org @ 2021-10-06 15:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #22 from Thiago Macieira <thiago at kde dot org> ---
(In reply to H.J. Lu from comment #21)
> Created attachment 51559 [details]
> The new v3 patch
> 
> The new v3 patch to check invalid mask.

v3? We were already up to v6.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (23 preceding siblings ...)
  2021-10-06 15:54 ` thiago at kde dot org
@ 2021-10-06 16:05 ` hjl.tools at gmail dot com
  2021-10-07 15:18 ` thiago at kde dot org
                   ` (13 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: hjl.tools at gmail dot com @ 2021-10-06 16:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #23 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Thiago Macieira from comment #22)
> (In reply to H.J. Lu from comment #21)
> > Created attachment 51559 [details]
> > The new v3 patch
> > 
> > The new v3 patch to check invalid mask.
> 
> v3? We were already up to v6.

I renamed the commit title.  The new v3 is the v6 + fixes.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (24 preceding siblings ...)
  2021-10-06 16:05 ` hjl.tools at gmail dot com
@ 2021-10-07 15:18 ` thiago at kde dot org
  2021-10-07 15:19 ` hjl.tools at gmail dot com
                   ` (12 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: thiago at kde dot org @ 2021-10-07 15:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #24 from Thiago Macieira <thiago at kde dot org> ---
(In reply to H.J. Lu from comment #23)
> I renamed the commit title.  The new v3 is the v6 + fixes.

Got it. Still no issues.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (25 preceding siblings ...)
  2021-10-07 15:18 ` thiago at kde dot org
@ 2021-10-07 15:19 ` hjl.tools at gmail dot com
  2021-10-07 15:35 ` thiago at kde dot org
                   ` (11 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: hjl.tools at gmail dot com @ 2021-10-07 15:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #25 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Thiago Macieira from comment #24)
> (In reply to H.J. Lu from comment #23)
> > I renamed the commit title.  The new v3 is the v6 + fixes.
> 
> Got it. Still no issues.

Can you get some performance improvement data on real workloads?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (26 preceding siblings ...)
  2021-10-07 15:19 ` hjl.tools at gmail dot com
@ 2021-10-07 15:35 ` thiago at kde dot org
  2021-10-10 13:51 ` hjl.tools at gmail dot com
                   ` (10 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: thiago at kde dot org @ 2021-10-07 15:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #26 from Thiago Macieira <thiago at kde dot org> ---
(In reply to H.J. Lu from comment #25)
> Can you get some performance improvement data on real workloads?

Will ask.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (27 preceding siblings ...)
  2021-10-07 15:35 ` thiago at kde dot org
@ 2021-10-10 13:51 ` hjl.tools at gmail dot com
  2021-10-22  3:31 ` crazylht at gmail dot com
                   ` (9 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: hjl.tools at gmail dot com @ 2021-10-10 13:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #51559|0                           |1
        is obsolete|                            |

--- Comment #27 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 51580
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51580&action=edit
The new v4 patch

Changes in v4:

1. Bypass redundant check when inputs have been transformed to the
equivalent canonical form with valid bit operation.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (28 preceding siblings ...)
  2021-10-10 13:51 ` hjl.tools at gmail dot com
@ 2021-10-22  3:31 ` crazylht at gmail dot com
  2021-11-04 21:24 ` thiago at kde dot org
                   ` (8 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: crazylht at gmail dot com @ 2021-10-22  3:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #28 from Hongtao.liu <crazylht at gmail dot com> ---
Can we optimize

int gomp_futex_wake = FUTEX_WAKE | FUTEX_PRIVATE_FLAG;
int gomp_futex_wait = FUTEX_WAIT | FUTEX_PRIVATE_FLAG;

void
gomp_mutex_lock_slow (gomp_mutex_t *mutex, int oldval)
{
  /* First loop spins a while.  */
  while (oldval == 1)
    {
      if (do_spin (mutex, 1))
        {
          /* Spin timeout, nothing changed.  Set waiting flag.  */
          oldval = __atomic_exchange_n (mutex, -1, MEMMODEL_ACQUIRE);
          if (oldval == 0)
            return;
          futex_wait (mutex, -1);
          break;
        }
      else
        {
          /* Something changed.  If now unlocked, we're good to go.  */
          oldval = 0;
          if (__atomic_compare_exchange_n (mutex, &oldval, 1, false,
                                           MEMMODEL_ACQUIRE, MEMMODEL_RELAXED))
            return;
        }
    }

  /* Second loop waits until mutex is unlocked.  We always exit this
     loop with wait flag set, so next unlock will awaken a thread.  */
  while ((oldval = __atomic_exchange_n (mutex, -1, MEMMODEL_ACQUIRE)))
    do_wait (mutex, -1);
}

with __atomic_fetch_or/and/xor?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (29 preceding siblings ...)
  2021-10-22  3:31 ` crazylht at gmail dot com
@ 2021-11-04 21:24 ` thiago at kde dot org
  2021-11-10  9:17 ` cvs-commit at gcc dot gnu.org
                   ` (7 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: thiago at kde dot org @ 2021-11-04 21:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #29 from Thiago Macieira <thiago at kde dot org> ---
New suggestion in bug 103090

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (30 preceding siblings ...)
  2021-11-04 21:24 ` thiago at kde dot org
@ 2021-11-10  9:17 ` cvs-commit at gcc dot gnu.org
  2022-10-29 10:53 ` marko.makela at mariadb dot com
                   ` (6 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-11-10  9:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #30 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:fb161782545224f55ba26ba663889c5e6e9a04d1

commit r12-5102-gfb161782545224f55ba26ba663889c5e6e9a04d1
Author: liuhongt <hongtao.liu@intel.com>
Date:   Mon Oct 25 13:59:51 2021 +0800

    Improve integer bit test on __atomic_fetch_[or|and]_* returns

    commit adedd5c173388ae505470df152b9cb3947339566
    Author: Jakub Jelinek <jakub@redhat.com>
    Date:   Tue May 3 13:37:25 2016 +0200

        re PR target/49244 (__sync or __atomic builtins will not emit 'lock bts/btr/btc')

    optimized bit test on __atomic_fetch_or_* and __atomic_fetch_and_* returns
    with lock bts/btr/btc by turning

      mask_2 = 1 << cnt_1;
      _4 = __atomic_fetch_or_* (ptr_6, mask_2, _3);
      _5 = _4 & mask_2;

    into

      _4 = ATOMIC_BIT_TEST_AND_SET (ptr_6, cnt_1, 0, _3);
      _5 = _4;

    and

      mask_6 = 1 << bit_5(D);
      _1 = ~mask_6;
      _2 = __atomic_fetch_and_4 (v_8(D), _1, 0);
      _3 = _2 & mask_6;
      _4 = _3 != 0;

    into

      mask_6 = 1 << bit_5(D);
      _1 = ~mask_6;
      _11 = .ATOMIC_BIT_TEST_AND_RESET (v_8(D), bit_5(D), 1, 0);
      _4 = _11 != 0;

    But it failed to optimize many equivalent, but slightly different cases:

    1.
      _1 = __atomic_fetch_or_4 (ptr_6, 1, _3);
      _4 = (_Bool) _1;
    2.
      _1 = __atomic_fetch_and_4 (ptr_6, ~1, _3);
      _4 = (_Bool) _1;
    3.
      _1 = __atomic_fetch_or_4 (ptr_6, 1, _3);
      _7 = ~_1;
      _5 = (_Bool) _7;
    4.
      _1 = __atomic_fetch_and_4 (ptr_6, ~1, _3);
      _7 = ~_1;
      _5 = (_Bool) _7;
    5.
      _1 = __atomic_fetch_or_4 (ptr_6, 1, _3);
      _2 = (int) _1;
      _7 = ~_2;
      _5 = (_Bool) _7;
    6.
      _1 = __atomic_fetch_and_4 (ptr_6, ~1, _3);
      _2 = (int) _1;
      _7 = ~_2;
      _5 = (_Bool) _7;
    7.
      _1 = __atomic_fetch_or_4 (ptr_6, 0x80000000, _3);
      _5 = (signed int) _1;
      _4 = _5 < 0;
    8.
      _1 = __atomic_fetch_and_4 (ptr_6, 0x7fffffff, _3);
      _5 = (signed int) _1;
      _4 = _5 < 0;
    9.
      _1 = 1 << bit_4(D);
      mask_5 = (unsigned int) _1;
      _2 = __atomic_fetch_or_4 (v_7(D), mask_5, 0);
      _3 = _2 & mask_5;
    10.
      mask_7 = 1 << bit_6(D);
      _1 = ~mask_7;
      _2 = (unsigned int) _1;
      _3 = __atomic_fetch_and_4 (v_9(D), _2, 0);
      _4 = (int) _3;
      _5 = _4 & mask_7;

    We make

      mask_2 = 1 << cnt_1;
      _4 = __atomic_fetch_or_* (ptr_6, mask_2, _3);
      _5 = _4 & mask_2;

    and

      mask_6 = 1 << bit_5(D);
      _1 = ~mask_6;
      _2 = __atomic_fetch_and_4 (v_8(D), _1, 0);
      _3 = _2 & mask_6;
      _4 = _3 != 0;

    the canonical forms for this optimization and transform cases 1-8 to the
    equivalent canonical form.  For cases 9 and 10, we simply remove the cast
    before __atomic_fetch_or_4/__atomic_fetch_and_4 with

      _1 = 1 << bit_4(D);
      _2 = __atomic_fetch_or_4 (v_7(D), _1, 0);
      _3 = _2 & _1;

    and

      mask_7 = 1 << bit_6(D);
      _1 = ~mask_7;
      _3 = __atomic_fetch_and_4 (v_9(D), _1, 0);
      _6 = _3 & mask_7;
      _5 = (int) _6;

    2021-11-04  H.J. Lu  <hongjiu.lu@intel.com>
                Hongtao Liu  <hongtao.liu@intel.com>
    gcc/

            PR middle-end/102566
            * match.pd (nop_atomic_bit_test_and_p): New match.
            * tree-ssa-ccp.c (convert_atomic_bit_not): New function.
            (gimple_nop_atomic_bit_test_and_p): New prototype.
            (optimize_atomic_bit_test_and): Transform equivalent, but slightly
            different cases to their canonical forms.

    gcc/testsuite/

            PR middle-end/102566
            * g++.target/i386/pr102566-1.C: New test.
            * g++.target/i386/pr102566-2.C: Likewise.
            * g++.target/i386/pr102566-3.C: Likewise.
            * g++.target/i386/pr102566-4.C: Likewise.
            * g++.target/i386/pr102566-5a.C: Likewise.
            * g++.target/i386/pr102566-5b.C: Likewise.
            * g++.target/i386/pr102566-6a.C: Likewise.
            * g++.target/i386/pr102566-6b.C: Likewise.
            * gcc.target/i386/pr102566-1a.c: Likewise.
            * gcc.target/i386/pr102566-1b.c: Likewise.
            * gcc.target/i386/pr102566-2.c: Likewise.
            * gcc.target/i386/pr102566-3a.c: Likewise.
            * gcc.target/i386/pr102566-3b.c: Likewise.
            * gcc.target/i386/pr102566-4.c: Likewise.
            * gcc.target/i386/pr102566-5.c: Likewise.
            * gcc.target/i386/pr102566-6.c: Likewise.
            * gcc.target/i386/pr102566-7.c: Likewise.
            * gcc.target/i386/pr102566-8a.c: Likewise.
            * gcc.target/i386/pr102566-8b.c: Likewise.
            * gcc.target/i386/pr102566-9a.c: Likewise.
            * gcc.target/i386/pr102566-9b.c: Likewise.
            * gcc.target/i386/pr102566-10a.c: Likewise.
            * gcc.target/i386/pr102566-10b.c: Likewise.
            * gcc.target/i386/pr102566-11.c: Likewise.
            * gcc.target/i386/pr102566-12.c: Likewise.
            * gcc.target/i386/pr102566-13.c: New test.
            * gcc.target/i386/pr102566-14.c: New test.
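
At source level, some of the shapes listed in the commit message can be
sketched as below. This is a hedged illustration only: the function names are
hypothetical, and the assumption that each function lowers to the GIMPLE of
the case named in its comment has not been verified here. What the sketch
does show is that all of them return the previous state of one bit, which is
why they can share a canonical form.

```cpp
#include <cstdint>

// Canonical shape: the old value is masked with the same single-bit mask.
bool canonical_or(uint32_t *p, unsigned bit)
{
    uint32_t mask = 1u << bit;
    return (__atomic_fetch_or(p, mask, __ATOMIC_RELAXED) & mask) != 0;
}

// Similar to case 1: constant mask 1, result narrowed to the low bit.
bool low_bit_or(uint32_t *p)
{
    return __atomic_fetch_or(p, 1u, __ATOMIC_RELAXED) & 1u;
}

// Similar to case 3: the old value is inverted before the low bit is tested.
bool low_bit_or_inverted(uint32_t *p)
{
    return ~__atomic_fetch_or(p, 1u, __ATOMIC_RELAXED) & 1u;
}

// Similar to case 7: sign-bit mask, tested with a signed comparison.
// The unsigned-to-signed conversion is modular on GCC and Clang.
bool sign_bit_or(uint32_t *p)
{
    return static_cast<int32_t>(
               __atomic_fetch_or(p, 0x80000000u, __ATOMIC_RELAXED)) < 0;
}
```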

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (31 preceding siblings ...)
  2021-11-10  9:17 ` cvs-commit at gcc dot gnu.org
@ 2022-10-29 10:53 ` marko.makela at mariadb dot com
  2022-10-31  2:42 ` crazylht at gmail dot com
                   ` (5 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: marko.makela at mariadb dot com @ 2022-10-29 10:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

Marko Mäkelä <marko.makela at mariadb dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marko.makela at mariadb dot com

--- Comment #31 from Marko Mäkelä <marko.makela at mariadb dot com> ---
Much of this seems to work in GCC 12.2.0 as well as in clang++-15. For clang
there is a related ticket https://github.com/llvm/llvm-project/issues/37322

I noticed a missed optimization in both g++-12 and clang++-15: Some operations
involving bit 31 degrade to loops around lock cmpxchg. I compiled it with "-c
-O2" (AMD64) or "-c -O2 -m32 -march=i686" (IA-32).

#include <atomic>
template<uint32_t b>
void lock_bts(std::atomic<uint32_t> &a) { while (!(a.fetch_or(b) & b)); }
template<uint32_t b>
void lock_btr(std::atomic<uint32_t> &a) { while (a.fetch_and(~b) & b); }
template<uint32_t b>
void lock_btc(std::atomic<uint32_t> &a) { while (a.fetch_xor(b) & b); }
template void lock_bts<1U<<30>(std::atomic<uint32_t> &a);
template void lock_btr<1U<<30>(std::atomic<uint32_t> &a);
template void lock_btc<1U<<30>(std::atomic<uint32_t> &a);
// bug: uses lock cmpxchg
template void lock_bts<1U<<31>(std::atomic<uint32_t> &a);
template void lock_btr<1U<<31>(std::atomic<uint32_t> &a);
template void lock_btc<1U<<31>(std::atomic<uint32_t> &a);

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (32 preceding siblings ...)
  2022-10-29 10:53 ` marko.makela at mariadb dot com
@ 2022-10-31  2:42 ` crazylht at gmail dot com
  2022-11-01  8:41 ` marko.makela at mariadb dot com
                   ` (4 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: crazylht at gmail dot com @ 2022-10-31  2:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #32 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Marko Mäkelä from comment #31)
> Much of this seems to work in GCC 12.2.0 as well as in clang++-15. For clang
> there is a related ticket https://github.com/llvm/llvm-project/issues/37322
> 
> I noticed a missed optimization in both g++-12 and clang++-15: Some
> operations involving bit 31 degrade to loops around lock cmpxchg. I compiled
Bit 31 is the sign bit, so c = a & 1U << 31; c == 0 is optimized to
(signed int) a >= 0. The matching in optimize_atomic_bit_test_and expects
a & 1U << 31 and therefore fails. I guess it could be extended to match
(signed int) a >= 0 when the mask is 1U << 31.

<D.2055>:
<D.2054>:
_1 = __atomic_fetch_or_4 (v, 2147483648, 0);
_2 = (signed int) _1;
if (_2 >= 0) goto <D.2055>; else goto <D.2053>;
<D.2053>:
return;
}
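
The folding described above can be checked with a small sketch. The function
names are hypothetical and the code only demonstrates that the two tests
agree for 32-bit values; it says nothing about which shape a given compiler
version recognizes.

```cpp
#include <cstdint>

// Mask test, as written in the source.
bool bit31_clear_by_mask(uint32_t a)
{
    return (a & (1u << 31)) == 0;
}

// Sign test, the shape the mask test is folded into.
// The unsigned-to-signed conversion is modular on GCC and Clang.
bool bit31_clear_by_sign(uint32_t a)
{
    return static_cast<int32_t>(a) >= 0;
}
```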

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (33 preceding siblings ...)
  2022-10-31  2:42 ` crazylht at gmail dot com
@ 2022-11-01  8:41 ` marko.makela at mariadb dot com
  2022-11-01 16:46 ` hjl.tools at gmail dot com
                   ` (3 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: marko.makela at mariadb dot com @ 2022-11-01  8:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #33 from Marko Mäkelä <marko.makela at mariadb dot com> ---
When it comes to toggling the most significant bit, std::atomic::fetch_xor()
could be translated to LOCK XADD which would be able to return all bits:

#include <atomic>
uint32_t toggle_by_add(std::atomic<uint32_t>& a)
{
  return a.fetch_add(1U<<31);
}
uint32_t toggle_by_xor(std::atomic<uint32_t>& a)
{
  return a.fetch_xor(1U<<31);
}
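
The reason the two functions are interchangeable is that adding 2^31 modulo
2^32 can only change bit 31: no lower bit is touched and the carry out of the
top bit is discarded. A non-atomic sketch of that arithmetic, with
hypothetical helper names:

```cpp
#include <cstdint>

constexpr uint32_t kTopBit = UINT32_C(1) << 31;

// Unsigned addition wraps modulo 2^32, so only bit 31 changes.
uint32_t toggle_top_by_add(uint32_t x)
{
    return x + kTopBit;
}

// XOR flips exactly bit 31.
uint32_t toggle_top_by_xor(uint32_t x)
{
    return x ^ kTopBit;
}
```

This equivalence holds only for the most significant bit; for any lower bit
an addition may carry into the bits above it.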

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (34 preceding siblings ...)
  2022-11-01  8:41 ` marko.makela at mariadb dot com
@ 2022-11-01 16:46 ` hjl.tools at gmail dot com
  2022-11-07 19:20 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: hjl.tools at gmail dot com @ 2022-11-01 16:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #34 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 53813
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53813&action=edit
A patch to handle if (_5 < 0)

A patch to extend optimization for

_1 = __atomic_fetch_or_4 (ptr_6, 0x80000000, _3);
_5 = (signed int) _1;
_4 = _5 >= 0;

to

_1 = __atomic_fetch_or_4 (ptr_6, 0x80000000, _3);
_5 = (signed int) _1;
if (_5 >= 0)

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (35 preceding siblings ...)
  2022-11-01 16:46 ` hjl.tools at gmail dot com
@ 2022-11-07 19:20 ` cvs-commit at gcc dot gnu.org
  2023-01-18 18:49 ` hjl.tools at gmail dot com
  2023-01-18 22:30 ` hjl.tools at gmail dot com
  38 siblings, 0 replies; 40+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-11-07 19:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #35 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by H.J. Lu <hjl@gcc.gnu.org>:

https://gcc.gnu.org/g:03ed4e57e3d46a61513b3d1ab1720997aec8cf71

commit r13-3760-g03ed4e57e3d46a61513b3d1ab1720997aec8cf71
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Nov 1 09:49:18 2022 -0700

    Extend optimization for integer bit test on __atomic_fetch_[or|and]_*

    Extend optimization for

    _1 = __atomic_fetch_or_4 (ptr_6, 0x80000000, _3);
    _5 = (signed int) _1;
    _4 = _5 >= 0;

    to

    _1 = __atomic_fetch_or_4 (ptr_6, 0x80000000, _3);
    _5 = (signed int) _1;
    if (_5 >= 0)

    gcc/

            PR middle-end/102566
            * tree-ssa-ccp.cc (optimize_atomic_bit_test_and): Also handle
            if (_5 < 0) and if (_5 >= 0).

    gcc/testsuite/

            PR middle-end/102566
            * g++.target/i386/pr102566-7.C

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (36 preceding siblings ...)
  2022-11-07 19:20 ` cvs-commit at gcc dot gnu.org
@ 2023-01-18 18:49 ` hjl.tools at gmail dot com
  2023-01-18 22:30 ` hjl.tools at gmail dot com
  38 siblings, 0 replies; 40+ messages in thread
From: hjl.tools at gmail dot com @ 2023-01-18 18:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #36 from H.J. Lu <hjl.tools at gmail dot com> ---
(1 << (x)) works, but (((unsigned int) 1) << (x)) doesn't work:

[hjl@gnu-skx-1 gcc]$ cat bar.c
void bar (void);

#define MASK1(x) (1 << (x))

void
f1 (unsigned int *a, unsigned int bit)
{
  if ((__atomic_fetch_xor (a, MASK1 (bit), __ATOMIC_RELAXED) & MASK1 (bit)))
    bar ();
}

#define MASK2(x) (((unsigned int) 1) << (x))

void
f2 (unsigned int *a, unsigned int bit)
{
  if ((__atomic_fetch_xor (a, MASK2 (bit), __ATOMIC_RELAXED) & MASK2 (bit)))
    bar ();
}
[hjl@gnu-skx-1 gcc]$ ./xgcc -B./ -S -O2 bar.c
[hjl@gnu-skx-1 gcc]$ cat bar.s
        .file   "bar.c"
        .text
        .p2align 4
        .globl  f1
        .type   f1, @function
f1:
.LFB0:
        .cfi_startproc
        lock btcl       %esi, (%rdi)
        jc      .L4
        ret
        .p2align 4,,10
        .p2align 3
.L4:
        jmp     bar
        .cfi_endproc
.LFE0:
        .size   f1, .-f1
        .p2align 4
        .globl  f2
        .type   f2, @function
f2:
.LFB1:
        .cfi_startproc
        movl    %esi, %ecx
        movl    $1, %edx
        movl    (%rdi), %eax
        sall    %cl, %edx
.L6:
        movl    %eax, %r8d
        movl    %eax, %esi
        xorl    %edx, %r8d
        lock cmpxchgl   %r8d, (%rdi)
        jne     .L6
        btl     %ecx, %esi
        jc      .L10
        ret
        .p2align 4,,10
        .p2align 3
.L10:
        jmp     bar
        .cfi_endproc
.LFE1:
        .size   f2, .-f2
        .ident  "GCC: (GNU) 13.0.1 20230118 (experimental)"
        .section        .note.GNU-stack,"",@progbits
[hjl@gnu-skx-1 gcc]$

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic
  2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
                   ` (37 preceding siblings ...)
  2023-01-18 18:49 ` hjl.tools at gmail dot com
@ 2023-01-18 22:30 ` hjl.tools at gmail dot com
  38 siblings, 0 replies; 40+ messages in thread
From: hjl.tools at gmail dot com @ 2023-01-18 22:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566

--- Comment #37 from H.J. Lu <hjl.tools at gmail dot com> ---
It is

if ((__atomic_fetch_xor_4 ((volatile void *) a, (unsigned int) (1 << bit), 0) & (unsigned int) (1 << bit)) != 0)

vs

if ((__atomic_fetch_xor_4 ((volatile void *) a, 1 << bit, 0) >> bit & 1) != 0)

Why does GCC generate the second one?
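
The two expression shapes compute the same result for any bit index below
the width of the type, so the difference is only in what the pattern matcher
sees. A minimal sketch with hypothetical helper names, using 1u instead of 1
so that bit 31 is well defined in every C++ revision:

```cpp
// First shape: mask the old value with the single-bit mask.
bool old_bit_by_mask(unsigned old_value, unsigned bit)
{
    return (old_value & (1u << bit)) != 0;
}

// Second shape: shift the old value down and test the low bit.
bool old_bit_by_shift(unsigned old_value, unsigned bit)
{
    return ((old_value >> bit) & 1u) != 0;
}
```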

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2023-01-18 22:30 UTC | newest]

Thread overview: 40+ messages
2021-10-02 16:08 [Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic thiago at kde dot org
2021-10-02 20:30 ` [Bug target/102566] " pinskia at gcc dot gnu.org
2021-10-03 12:56 ` hjl.tools at gmail dot com
2021-10-03 13:06 ` hjl.tools at gmail dot com
2021-10-03 13:19 ` [Bug tree-optimization/102566] " hjl.tools at gmail dot com
2021-10-03 15:10 ` hjl.tools at gmail dot com
2021-10-03 16:45 ` [Bug middle-end/102566] " hjl.tools at gmail dot com
2021-10-03 22:36 ` hjl.tools at gmail dot com
2021-10-04 15:58 ` thiago at kde dot org
2021-10-04 16:10 ` thiago at kde dot org
2021-10-04 16:13 ` thiago at kde dot org
2021-10-04 21:46 ` hjl.tools at gmail dot com
2021-10-04 23:25 ` thiago at kde dot org
2021-10-04 23:26 ` thiago at kde dot org
2021-10-05  4:40 ` hjl.tools at gmail dot com
2021-10-05 15:23 ` hjl.tools at gmail dot com
2021-10-05 15:57 ` thiago at kde dot org
2021-10-05 16:02 ` pinskia at gcc dot gnu.org
2021-10-05 19:26 ` hjl.tools at gmail dot com
2021-10-05 19:30 ` hjl.tools at gmail dot com
2021-10-05 19:36 ` thiago at kde dot org
2021-10-06  8:00 ` jakub at gcc dot gnu.org
2021-10-06 15:40 ` thiago at kde dot org
2021-10-06 15:47 ` hjl.tools at gmail dot com
2021-10-06 15:54 ` thiago at kde dot org
2021-10-06 16:05 ` hjl.tools at gmail dot com
2021-10-07 15:18 ` thiago at kde dot org
2021-10-07 15:19 ` hjl.tools at gmail dot com
2021-10-07 15:35 ` thiago at kde dot org
2021-10-10 13:51 ` hjl.tools at gmail dot com
2021-10-22  3:31 ` crazylht at gmail dot com
2021-11-04 21:24 ` thiago at kde dot org
2021-11-10  9:17 ` cvs-commit at gcc dot gnu.org
2022-10-29 10:53 ` marko.makela at mariadb dot com
2022-10-31  2:42 ` crazylht at gmail dot com
2022-11-01  8:41 ` marko.makela at mariadb dot com
2022-11-01 16:46 ` hjl.tools at gmail dot com
2022-11-07 19:20 ` cvs-commit at gcc dot gnu.org
2023-01-18 18:49 ` hjl.tools at gmail dot com
2023-01-18 22:30 ` hjl.tools at gmail dot com
