[Bug target/95750] New: [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq

public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed

* [Bug target/95750] New: [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst)
@ 2020-06-18 19:32 andysem at mail dot ru
  2020-06-18 19:36 ` [Bug target/95750] " pinskia at gcc dot gnu.org
                   ` (16 more replies)
  0 siblings, 17 replies; 18+ messages in thread
From: andysem at mail dot ru @ 2020-06-18 19:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95750

            Bug ID: 95750
           Summary: [x86] Use dummy atomic insn instead of mfence in
                    __atomic_thread_fence(seq_cst)
           Product: gcc
           Version: 10.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: andysem at mail dot ru
  Target Milestone: ---

Currently, __atomic_thread_fence(seq_cst) on x86 and x86-64 generates mfence
instruction. A dummy atomic instruction (a lock-prefixed instruction or xchg
with a memory operand) would provide the same sequential consistency guarantees
while being more efficient on most current CPUs. The mfence instruction
additionally orders non-temporal stores, which is not relevant for atomic
operations and are not ordered by seq_cst atomic operations anyway.

Regarding performance, some data is available in Agner Fog's instruction
tables:

https://www.agner.org/optimize/

Also, there is this article:

https://shipilev.net/blog/2014/on-the-fence-with-dependencies/

TL;DR: There is benefit on every CPU except Atom; on Atom there is no
difference.

Regarding the dummy instruction and target memory location, here are some
considerations:

- The lock-prefixed instruction should preferably not alter flags or registers
and should require minimum number of registers.
- The memory location should not be shared with other threads.
- The memory location should likely be in cache.
- The memory location should not alias existing data on the stack, so that we
don't introduce a false data dependency on previous or subsequent instructions.

Based on the above, a good candidate is "lock not" on a dummy variable on the
top of the stack. Note that the variable would be accessible through esp/rsp,
it is likely to be in hot memory and is likely thread-private.

I've implemented this optimization in Boost.Atomic, and a similar optimization
is done in MSVC:

https://github.com/microsoft/STL/pull/740

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug target/95750] [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst)
  2020-06-18 19:32 [Bug target/95750] New: [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst) andysem at mail dot ru
@ 2020-06-18 19:36 ` pinskia at gcc dot gnu.org
  2020-06-18 19:39 ` andysem at mail dot ru
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: pinskia at gcc dot gnu.org @ 2020-06-18 19:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95750

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |91719

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
PR 91719 says this was fixed in GCC 10.1.0


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91719
[Bug 91719] gcc compiles seq_cst store on x86-64 differently from clang/icc

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug target/95750] [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst)
  2020-06-18 19:32 [Bug target/95750] New: [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst) andysem at mail dot ru
  2020-06-18 19:36 ` [Bug target/95750] " pinskia at gcc dot gnu.org
@ 2020-06-18 19:39 ` andysem at mail dot ru
  2020-06-18 21:08 ` ubizjak at gmail dot com
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: andysem at mail dot ru @ 2020-06-18 19:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95750

--- Comment #2 from andysem at mail dot ru ---
gcc 10.1 only optimizes __atomic_store_n(seq_cst) to xchg, but not the fence.

Also, consider applying the same optimization to __sync_synchronize().

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug target/95750] [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst)
  2020-06-18 19:32 [Bug target/95750] New: [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst) andysem at mail dot ru
  2020-06-18 19:36 ` [Bug target/95750] " pinskia at gcc dot gnu.org
  2020-06-18 19:39 ` andysem at mail dot ru
@ 2020-06-18 21:08 ` ubizjak at gmail dot com
  2020-06-18 21:16 ` ubizjak at gmail dot com
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: ubizjak at gmail dot com @ 2020-06-18 21:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95750

--- Comment #3 from Uroš Bizjak <ubizjak at gmail dot com> ---
How about the following patch:

--cut here--
diff --git a/gcc/config/i386/sync.md b/gcc/config/i386/sync.md
index 9ab5456b227..7d9442d45b7 100644
--- a/gcc/config/i386/sync.md
+++ b/gcc/config/i386/sync.md
@@ -117,10 +117,11 @@
       rtx (*mfence_insn)(rtx);
       rtx mem;

-      if (TARGET_64BIT || TARGET_SSE2)
-       mfence_insn = gen_mfence_sse2;
-      else
+      if (!(TARGET_64BIT || TARGET_SSE2)
+         || TARGET_USE_XCHG_FOR_ATOMIC_STORE)
        mfence_insn = gen_mfence_nosse;
+      else
+       mfence_insn = gen_mfence_sse2;

       mem = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode));
       MEM_VOLATILE_P (mem) = 1;
--cut here--

This will generate "lock orl $0, (%rsp)" instead of mfence.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug target/95750] [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst)
  2020-06-18 19:32 [Bug target/95750] New: [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst) andysem at mail dot ru
                   ` (2 preceding siblings ...)
  2020-06-18 21:08 ` ubizjak at gmail dot com
@ 2020-06-18 21:16 ` ubizjak at gmail dot com
  2020-06-18 22:01 ` andysem at mail dot ru
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: ubizjak at gmail dot com @ 2020-06-18 21:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95750

--- Comment #4 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Uroš Bizjak from comment #3)
> How about the following patch:

Surely, mfence_nosse needs to be enabled also for
TARGET_USE_XCHG_FOR_ATOMIC_STORE.

> This will generate "lock orl $0, (%rsp)" instead of mfence.

Please also read [1] why we avoid -4(%%esp).

[1] https://gcc.gnu.org/pipermail/gcc-patches/2017-February/469630.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug target/95750] [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst)
  2020-06-18 19:32 [Bug target/95750] New: [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst) andysem at mail dot ru
                   ` (3 preceding siblings ...)
  2020-06-18 21:16 ` ubizjak at gmail dot com
@ 2020-06-18 22:01 ` andysem at mail dot ru
  2020-06-18 22:04 ` andysem at mail dot ru
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: andysem at mail dot ru @ 2020-06-18 22:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95750

--- Comment #5 from andysem at mail dot ru ---
> Please also read [1] why we avoid -4(%%esp).

I believe, memcheck would complain because the value at -4(%%rsp) would be
uninitialized. This is unfortunate. The workaround could be to initialize it
prior to the atomic operation. This would avoid data dependencies with the
surrounding code.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug target/95750] [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst)
  2020-06-18 19:32 [Bug target/95750] New: [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst) andysem at mail dot ru
                   ` (4 preceding siblings ...)
  2020-06-18 22:01 ` andysem at mail dot ru
@ 2020-06-18 22:04 ` andysem at mail dot ru
  2020-06-19  9:14 ` ubizjak at gmail dot com
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: andysem at mail dot ru @ 2020-06-18 22:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95750

--- Comment #6 from andysem at mail dot ru ---
Also, I think (%%rsp) can be rather far from the top of the stack if the stack
frame is large. This may mean it's less likely to be in data cache. Having a
dummy variable ensures that it is close to the top.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug target/95750] [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst)
  2020-06-18 19:32 [Bug target/95750] New: [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst) andysem at mail dot ru
                   ` (5 preceding siblings ...)
  2020-06-18 22:04 ` andysem at mail dot ru
@ 2020-06-19  9:14 ` ubizjak at gmail dot com
  2020-06-19  9:35 ` jakub at gcc dot gnu.org
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: ubizjak at gmail dot com @ 2020-06-19  9:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95750

--- Comment #7 from Uroš Bizjak <ubizjak at gmail dot com> ---
Actually, x86_64 (at least my Fedora 32) does not like operations on stack:

Starting program: /sdd/uros/git/gcc/gcc/testsuite/gcc.dg/atomic/a.out 

Program received signal SIGSEGV, Segmentation fault.
0x000000000040110a in main ()
(gdb) disass
Dump of assembler code for function main:
   0x0000000000401106 <+0>:     push   %rbp
   0x0000000000401107 <+1>:     mov    %rsp,%rbp
=> 0x000000000040110a <+4>:     lock orq $0x0,(%esp)
   0x0000000000401111 <+11>:    mov    $0x0,%eax
   0x0000000000401116 <+16>:    pop    %rbp
   0x0000000000401117 <+17>:    retq   
End of assembler dump.

I didn't investigate further, but 32bit executable works OK.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug target/95750] [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst)
  2020-06-18 19:32 [Bug target/95750] New: [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst) andysem at mail dot ru
                   ` (6 preceding siblings ...)
  2020-06-19  9:14 ` ubizjak at gmail dot com
@ 2020-06-19  9:35 ` jakub at gcc dot gnu.org
  2020-06-19  9:49 ` ubizjak at gmail dot com
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-06-19  9:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95750

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Uroš Bizjak from comment #7)
> Actually, x86_64 (at least my Fedora 32) does not like operations on stack:
> 
> Starting program: /sdd/uros/git/gcc/gcc/testsuite/gcc.dg/atomic/a.out 
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x000000000040110a in main ()
> (gdb) disass
> Dump of assembler code for function main:
>    0x0000000000401106 <+0>:     push   %rbp
>    0x0000000000401107 <+1>:     mov    %rsp,%rbp
> => 0x000000000040110a <+4>:     lock orq $0x0,(%esp)

The culprit is the %esp here, that adds the 0x67 prefix to the insn and
will only work if %rsp is below 4GB.

>    0x0000000000401111 <+11>:    mov    $0x0,%eax
>    0x0000000000401116 <+16>:    pop    %rbp
>    0x0000000000401117 <+17>:    retq   
> End of assembler dump.
> 
> I didn't investigate further, but 32bit executable works OK.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug target/95750] [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst)
  2020-06-18 19:32 [Bug target/95750] New: [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst) andysem at mail dot ru
                   ` (7 preceding siblings ...)
  2020-06-19  9:35 ` jakub at gcc dot gnu.org
@ 2020-06-19  9:49 ` ubizjak at gmail dot com
  2020-06-19 12:47 ` ubizjak at gmail dot com
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: ubizjak at gmail dot com @ 2020-06-19  9:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95750

--- Comment #9 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Jakub Jelinek from comment #8)

> The culprit is the %esp here, that adds the 0x67 prefix to the insn and
> will only work if %rsp is below 4GB.

Ah, indeed... I was in a bit of hurry and didn't notice.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug target/95750] [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst)
  2020-06-18 19:32 [Bug target/95750] New: [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst) andysem at mail dot ru
                   ` (8 preceding siblings ...)
  2020-06-19  9:49 ` ubizjak at gmail dot com
@ 2020-06-19 12:47 ` ubizjak at gmail dot com
  2020-07-20  9:55 ` ubizjak at gmail dot com
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: ubizjak at gmail dot com @ 2020-06-19 12:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95750

--- Comment #10 from Uroš Bizjak <ubizjak at gmail dot com> ---
Created attachment 48756
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48756&action=edit
Proposed patch

Patch in testing, survives GOMP testcases.

On a related note, the patch uses TARGET_USE_XCHG_FOR_ATOMIC_STORE, which
should probably be renamed to something more appropriate.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug target/95750] [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst)
  2020-06-18 19:32 [Bug target/95750] New: [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst) andysem at mail dot ru
                   ` (9 preceding siblings ...)
  2020-06-19 12:47 ` ubizjak at gmail dot com
@ 2020-07-20  9:55 ` ubizjak at gmail dot com
  2020-07-20 18:37 ` cvs-commit at gcc dot gnu.org
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: ubizjak at gmail dot com @ 2020-07-20  9:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95750

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #48756|0                           |1
        is obsolete|                            |
             Status|UNCONFIRMED                 |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |ubizjak at gmail dot com
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2020-07-20

--- Comment #11 from Uroš Bizjak <ubizjak at gmail dot com> ---
Created attachment 48897
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48897&action=edit
Proposed patch

Patch in testing.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug target/95750] [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst)
  2020-06-18 19:32 [Bug target/95750] New: [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst) andysem at mail dot ru
                   ` (10 preceding siblings ...)
  2020-07-20  9:55 ` ubizjak at gmail dot com
@ 2020-07-20 18:37 ` cvs-commit at gcc dot gnu.org
  2020-07-21 18:22 ` cvs-commit at gcc dot gnu.org
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-07-20 18:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95750

--- Comment #12 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:

https://gcc.gnu.org/g:3c5e83d5b32c31b11cf1684bf5d1ab3e7174685c

commit r11-2232-g3c5e83d5b32c31b11cf1684bf5d1ab3e7174685c
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Mon Jul 20 20:34:46 2020 +0200

    i386: Use lock prefixed insn instead of MFENCE [PR95750]

    Currently, __atomic_thread_fence(seq_cst) on x86 and x86-64 generates
    mfence instruction. A dummy atomic instruction (a lock-prefixed instruction
    or xchg with a memory operand) would provide the same sequential
consistency
    guarantees while being more efficient on most current CPUs. The mfence
    instruction additionally orders non-temporal stores, which is not relevant
    for atomic operations and are not ordered by seq_cst atomic operations
anyway.

    2020-07-20  UroÅ¡ Bizjak  <ubizjak@gmail.com>

    gcc/ChangeLog:
            PR target/95750
            * config/i386/i386.h (TARGET_AVOID_MFENCE):
            Rename from TARGET_USE_XCHG_FOR_ATOMIC_STORE.
            * config/i386/sync.md (mfence_sse2): Disable for
TARGET_AVOID_MFENCE.
            (mfence_nosse): Enable also for TARGET_AVOID_MFENCE. Emit stack
            referred memory in word_mode.
            (mem_thread_fence): Do not generate mfence_sse2 pattern when
            TARGET_AVOID_MFENCE is true.
            (atomic_store<mode>): Update for rename.
            * config/i386/x86-tune.def (X86_TUNE_AVOID_MFENCE):
            Rename from X86_TUNE_USE_XCHG_FOR_ATOMIC_STORE.

    gcc/testsuite/ChangeLog:
            PR target/95750
            * gcc.target/i386/pr95750.c: New test.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug target/95750] [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst)
  2020-06-18 19:32 [Bug target/95750] New: [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst) andysem at mail dot ru
                   ` (11 preceding siblings ...)
  2020-07-20 18:37 ` cvs-commit at gcc dot gnu.org
@ 2020-07-21 18:22 ` cvs-commit at gcc dot gnu.org
  2020-07-23 17:22 ` josephcsible at gmail dot com
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-07-21 18:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95750

--- Comment #13 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:

https://gcc.gnu.org/g:02363d5fdb862a11e6e65ac5b0d1f5ee0c422dc3

commit r11-2255-g02363d5fdb862a11e6e65ac5b0d1f5ee0c422dc3
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Tue Jul 21 20:22:05 2020 +0200

    i386: Fix insn conditions of mfence patterns [PR95750]

    2020-07-21  UroÅ¡ Bizjak  <ubizjak@gmail.com>

    gcc/ChangeLog:
            PR target/95750
            * config/i386/sync.md (mfence_sse2): Enable for
            TARGET_64BIT and TARGET_SSE2.
            (mfence_nosse): Always enable.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug target/95750] [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst)
  2020-06-18 19:32 [Bug target/95750] New: [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst) andysem at mail dot ru
                   ` (12 preceding siblings ...)
  2020-07-21 18:22 ` cvs-commit at gcc dot gnu.org
@ 2020-07-23 17:22 ` josephcsible at gmail dot com
  2020-07-23 20:42 ` ubizjak at gmail dot com
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: josephcsible at gmail dot com @ 2020-07-23 17:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95750

Joseph C. Sible <josephcsible at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |josephcsible at gmail dot com

--- Comment #14 from Joseph C. Sible <josephcsible at gmail dot com> ---
I notice this change affects -Os too, even though "lock orq $0,(%rsp)" is 6
bytes and "mfence" is only 3 bytes.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug target/95750] [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst)
  2020-06-18 19:32 [Bug target/95750] New: [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst) andysem at mail dot ru
                   ` (13 preceding siblings ...)
  2020-07-23 17:22 ` josephcsible at gmail dot com
@ 2020-07-23 20:42 ` ubizjak at gmail dot com
  2020-07-24 14:00 ` cvs-commit at gcc dot gnu.org
  2020-12-10  9:42 ` ubizjak at gmail dot com
  16 siblings, 0 replies; 18+ messages in thread
From: ubizjak at gmail dot com @ 2020-07-23 20:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95750

--- Comment #15 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Joseph C. Sible from comment #14)
> I notice this change affects -Os too, even though "lock orq $0,(%rsp)" is 6
> bytes and "mfence" is only 3 bytes.

Yes, we can emit mfence for -Os. I'm testing the following patch:

--cut here--
diff --git a/gcc/config/i386/sync.md b/gcc/config/i386/sync.md
index c88750d3664..ed17bb00205 100644
--- a/gcc/config/i386/sync.md
+++ b/gcc/config/i386/sync.md
@@ -123,7 +123,8 @@
       rtx mem;

       if ((TARGET_64BIT || TARGET_SSE2)
-         && !TARGET_AVOID_MFENCE)
+         && (optimize_function_for_size_p (cfun)
+             || !TARGET_AVOID_MFENCE))
        mfence_insn = gen_mfence_sse2;
       else
        mfence_insn = gen_mfence_nosse;
--cut here--

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug target/95750] [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst)
  2020-06-18 19:32 [Bug target/95750] New: [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst) andysem at mail dot ru
                   ` (14 preceding siblings ...)
  2020-07-23 20:42 ` ubizjak at gmail dot com
@ 2020-07-24 14:00 ` cvs-commit at gcc dot gnu.org
  2020-12-10  9:42 ` ubizjak at gmail dot com
  16 siblings, 0 replies; 18+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-07-24 14:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95750

--- Comment #16 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:

https://gcc.gnu.org/g:8c7bb540803e8bff9998bc86806e8a83acc75370

commit r11-2306-g8c7bb540803e8bff9998bc86806e8a83acc75370
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Fri Jul 24 15:59:38 2020 +0200

    i386: Emit mfence_sse2 for -Os [PR95750]

    2020-07-24  UroÅ¡ Bizjak  <ubizjak@gmail.com>

    gcc/ChangeLog:
            PR target/95750
            * config/i386/sync.md (mmem_thread_fence): Emit mfence_sse2 for
-Os.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug target/95750] [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst)
  2020-06-18 19:32 [Bug target/95750] New: [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst) andysem at mail dot ru
                   ` (15 preceding siblings ...)
  2020-07-24 14:00 ` cvs-commit at gcc dot gnu.org
@ 2020-12-10  9:42 ` ubizjak at gmail dot com
  16 siblings, 0 replies; 18+ messages in thread
From: ubizjak at gmail dot com @ 2020-12-10  9:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95750

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |11.0
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #17 from Uroš Bizjak <ubizjak at gmail dot com> ---
Fixed.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2020-12-10  9:42 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-18 19:32 [Bug target/95750] New: [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst) andysem at mail dot ru
2020-06-18 19:36 ` [Bug target/95750] " pinskia at gcc dot gnu.org
2020-06-18 19:39 ` andysem at mail dot ru
2020-06-18 21:08 ` ubizjak at gmail dot com
2020-06-18 21:16 ` ubizjak at gmail dot com
2020-06-18 22:01 ` andysem at mail dot ru
2020-06-18 22:04 ` andysem at mail dot ru
2020-06-19  9:14 ` ubizjak at gmail dot com
2020-06-19  9:35 ` jakub at gcc dot gnu.org
2020-06-19  9:49 ` ubizjak at gmail dot com
2020-06-19 12:47 ` ubizjak at gmail dot com
2020-07-20  9:55 ` ubizjak at gmail dot com
2020-07-20 18:37 ` cvs-commit at gcc dot gnu.org
2020-07-21 18:22 ` cvs-commit at gcc dot gnu.org
2020-07-23 17:22 ` josephcsible at gmail dot com
2020-07-23 20:42 ` ubizjak at gmail dot com
2020-07-24 14:00 ` cvs-commit at gcc dot gnu.org
2020-12-10  9:42 ` ubizjak at gmail dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).