public inbox for gcc-bugs@sourceware.org
* [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686
@ 2021-04-21 12:35 jakub at gcc dot gnu.org
  2021-04-21 12:35 ` [Bug target/100182] " jakub at gcc dot gnu.org
                   ` (41 more replies)
  0 siblings, 42 replies; 43+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-04-21 12:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

            Bug ID: 100182
           Summary: [8/9/10/11/12 Regression] Miscompilation of
                    atomic_float/1.cc on i686
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jakub at gcc dot gnu.org
  Target Milestone: ---

Since r7-1112-gbeed3701c796842abbfb27d7484b35bd82818740
the following testcase distilled from 29_atomics/atomic_float/1.cc with -O2
-march=i686 -m32 aborts on i686-linux:
struct __attribute__((aligned (8))) S { double _M_fp; };
union U { double d; unsigned long long int l; };

__attribute__((noipa)) void
foo (void)
{
  struct S a0, a1;
  union U u;
  double d0, d1;
  a0._M_fp = 0.0;
  a1._M_fp = 1.0;
  __atomic_store_8 (&a0._M_fp, __atomic_load_8 (&a1._M_fp, __ATOMIC_SEQ_CST),
                    __ATOMIC_SEQ_CST);
  u.l = __atomic_load_8 (&a0._M_fp, __ATOMIC_SEQ_CST);
  d0 = u.d;
  u.l = __atomic_load_8 (&a1._M_fp, __ATOMIC_SEQ_CST);
  d1 = u.d;
  if (d0 != d1)
    __builtin_abort ();
}

int
main ()
{
  foo ();
  return 0;
}

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
@ 2021-04-21 12:35 ` jakub at gcc dot gnu.org
  2021-04-21 14:23 ` jakub at gcc dot gnu.org
                   ` (40 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-04-21 12:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2021-04-21
     Ever confirmed|0                           |1
           Assignee|unassigned at gcc dot gnu.org      |jakub at gcc dot gnu.org
           Priority|P3                          |P2
   Target Milestone|---                         |8.5
             Status|UNCONFIRMED                 |ASSIGNED

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
  2021-04-21 12:35 ` [Bug target/100182] " jakub at gcc dot gnu.org
@ 2021-04-21 14:23 ` jakub at gcc dot gnu.org
  2021-04-21 15:35 ` jakub at gcc dot gnu.org
                   ` (39 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-04-21 14:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
In this particular case it is the sync.md:398 peephole2:
(define_peephole2
  [(set (match_operand:DF 0 "memory_operand")
        (match_operand:DF 1 "any_fp_register_operand"))
   (set (mem:BLK (scratch:SI))
        (unspec:BLK [(mem:BLK (scratch:SI))] UNSPEC_MEMORY_BLOCKAGE))
   (set (match_operand:DF 2 "fp_register_operand")
        (unspec:DF [(match_operand:DI 3 "memory_operand")]
                   UNSPEC_FILD_ATOMIC))
   (set (match_operand:DI 4 "memory_operand")
        (unspec:DI [(match_dup 2)]
                   UNSPEC_FIST_ATOMIC))]
  "!TARGET_64BIT
   && peep2_reg_dead_p (4, operands[2])
   && rtx_equal_p (XEXP (operands[0], 0), XEXP (operands[3], 0))"
  [(const_int 0)]
{
  emit_insn (gen_memory_blockage ());
  emit_move_insn (gen_lowpart (DFmode, operands[4]), operands[1]);
  DONE;
})
that triggers here, but from what I can read, all the r7-1112 peephole2s
optimize away stores to some memory on the assumption that the memory is read
only once (in another insn matched by the same peephole2).
I'm not 100% sure we can rely on that even for spill slots, for which r7-1112
seems to have been written, but for other memory we'd need to prove that the
memory is dead.
Rather than removing those peephole2s altogether, I wonder if we shouldn't just
check that the memory_operands whose stores we'd optimize away are spill
slots.
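
A minimal C sketch of the effect described above, assuming a hypothetical
two-slot model of memory.  The names slots, original_sequence,
peepholed_sequence and self_check are illustrative only and are not GCC code;
x stands for the address shared by operands 0 and 3, y for operand 4.

```c
#include <assert.h>

/* Hypothetical model: x is the memory behind operands 0 and 3,
   y is the memory behind operand 4.  */
struct slots { double x, y; };

/* The four matched insns, seen as memory effects.  */
void original_sequence (struct slots *s, double v)
{
  s->x = v;     /* DFmode store to operand 0 */
  s->y = s->x;  /* atomic reload of the same address, stored to operand 4 */
}

/* The replacement: the value is forwarded straight to operand 4 and
   the store to operand 0 is no longer emitted.  */
void peepholed_sequence (struct slots *s, double v)
{
  s->y = v;
}

void self_check (void)
{
  struct slots a = { 0.0, 0.0 }, b = { 0.0, 0.0 };
  original_sequence (&a, 1.0);
  peepholed_sequence (&b, 1.0);
  assert (a.y == b.y);  /* the destination inside the window agrees */
  assert (a.x == 1.0);  /* original: x holds the new value */
  assert (b.x == 0.0);  /* peepholed: a later read of x sees the old value */
}
```

Both sequences agree on y, so the rewrite is only valid if nothing reads x
afterwards; in the testcase the later __atomic_load_8 does exactly that.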

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
  2021-04-21 12:35 ` [Bug target/100182] " jakub at gcc dot gnu.org
  2021-04-21 14:23 ` jakub at gcc dot gnu.org
@ 2021-04-21 15:35 ` jakub at gcc dot gnu.org
  2021-04-21 15:41 ` jakub at gcc dot gnu.org
                   ` (38 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-04-21 15:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
In addition to r7-1112, r8-3856 also added some similar peephole2s.
I'm afraid I'm getting lost in them; in several other peephole2s there, the
store that is optimized away is an atomic store, and that is quite certainly
not a spill slot.

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2021-04-21 15:35 ` jakub at gcc dot gnu.org
@ 2021-04-21 15:41 ` jakub at gcc dot gnu.org
  2021-04-21 15:42 ` jakub at gcc dot gnu.org
                   ` (37 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-04-21 15:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 50649
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50649&action=edit
gcc12-pr100182.patch

Untested patch for this particular peephole2.  But 1) I'm not sure it is 100%
safe even for spill slots, and 2) I don't know what to do with the remaining 7
peephole2s.
I'm afraid that during the peephole2 pass we don't have anything comparable to
the RTL DSE infrastructure, and unfortunately the last RTL DSE pass (DSE2) runs
5 passes before peephole2.  So these insn sequences could e.g. be matched in
some machine specific pass before RTL DSE2 instead of in peephole2, letting RTL
DSE2 optimize away what seems unnecessary, or we could use some patterns that
RTL DSE2 would recognize on its own.

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2021-04-21 15:41 ` jakub at gcc dot gnu.org
@ 2021-04-21 15:42 ` jakub at gcc dot gnu.org
  2021-04-22  8:28 ` jakub at gcc dot gnu.org
                   ` (36 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-04-21 15:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEW
           Assignee|jakub at gcc dot gnu.org           |unassigned at gcc dot gnu.org

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2021-04-21 15:42 ` jakub at gcc dot gnu.org
@ 2021-04-22  8:28 ` jakub at gcc dot gnu.org
  2021-04-22 13:10 ` [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc " cvs-commit at gcc dot gnu.org
                   ` (35 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-04-22  8:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmakarov at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The 29_atomics/atomic_float/wait_notify.cc hangs, which unfortunately cause
regtest hangs (the timeout stuff doesn't seem to work here), seem to be caused
by this too; at least if I 0 && out those 8 peephole2s in sync.md, the hang is
gone.

Vlad, can spill slots (MEMs with MEM_EXPR equal to get_spill_slot_decl (false))
be read in multiple instructions (one store multiple reads)?

Unfortunately the patterns do use peep2_reg_dead_p and so it isn't something
that can be done in the split2 pass (reload_completed && !epilogue_completed).

Maybe emit the stores always and if those peephole2s ever trigger, schedule an
extra RTL DSE pass after peephole2?
I'm not sure it is safe to emit the stores as normal DFmode stores though (at
least not in all the cases), because while one atomic read (the one seen in the
peephole2) can be DFmode-ish, further atomic reads (the ones the peephole2
doesn't see) could be DImode-ish.

Uros, can you please have a look?  Thanks.

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2021-04-22  8:28 ` jakub at gcc dot gnu.org
@ 2021-04-22 13:10 ` cvs-commit at gcc dot gnu.org
  2021-04-22 13:10 ` cvs-commit at gcc dot gnu.org
                   ` (34 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-04-22 13:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:0f4588141fcbe4e0f1fa12776b47200870f6c621

commit r12-60-g0f4588141fcbe4e0f1fa12776b47200870f6c621
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Apr 22 15:08:21 2021 +0200

    libstdc++: Add workaround for ia32 floating atomics miscompilations [PR100184]

    gcc on ia32 miscompiles various atomics involving floating point,
    unfortunately I'm afraid it is too late to fix that for 11.1 and
    as I'm quite lost on it, it might take a while for 12 too
    (disabling all the 8 peephole2s would be easiest, but then we'd
    run into optimization regressions).

    While 1.cc just FAILs, with dejagnu 1.6.1 wait_notify.cc hangs the
    make check even after the timeout fires.  The following patch therefore
    xfails the former and skips the latter.

    Tested on x86_64-linux where
    make check RUNTESTFLAGS='conformance.exp=atomic_float/*.cc'
    is still
                    === libstdc++ Summary ===

     # of expected passes            8
    and on i686-linux, where it is now
                    === libstdc++ Summary ===

     # of expected passes            5
     # of expected failures          1
     # of unsupported tests          1

    2021-04-22  Jakub Jelinek  <jakub@redhat.com>

            PR target/100182
            * testsuite/29_atomics/atomic_float/1.cc: Add dg-xfail-run-if for
            ia32.
            * testsuite/29_atomics/atomic_float/wait_notify.cc: Add dg-skip-if
            for ia32.

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2021-04-22 13:10 ` [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc " cvs-commit at gcc dot gnu.org
@ 2021-04-22 13:10 ` cvs-commit at gcc dot gnu.org
  2021-04-22 17:53 ` ubizjak at gmail dot com
                   ` (33 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-04-22 13:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:a21f3b38c3b9a5c28c79be37b040e7d06d827d76

commit r11-8281-ga21f3b38c3b9a5c28c79be37b040e7d06d827d76
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Apr 22 15:08:21 2021 +0200

    libstdc++: Add workaround for ia32 floating atomics miscompilations [PR100184]

    gcc on ia32 miscompiles various atomics involving floating point,
    unfortunately I'm afraid it is too late to fix that for 11.1 and
    as I'm quite lost on it, it might take a while for 12 too
    (disabling all the 8 peephole2s would be easiest, but then we'd
    run into optimization regressions).

    While 1.cc just FAILs, with dejagnu 1.6.1 wait_notify.cc hangs the
    make check even after the timeout fires.  The following patch therefore
    xfails the former and skips the latter.

    Tested on x86_64-linux where
    make check RUNTESTFLAGS='conformance.exp=atomic_float/*.cc'
    is still
                    === libstdc++ Summary ===

     # of expected passes            8
    and on i686-linux, where it is now
                    === libstdc++ Summary ===

     # of expected passes            5
     # of expected failures          1
     # of unsupported tests          1

    2021-04-22  Jakub Jelinek  <jakub@redhat.com>

            PR target/100182
            * testsuite/29_atomics/atomic_float/1.cc: Add dg-xfail-run-if for
            ia32.
            * testsuite/29_atomics/atomic_float/wait_notify.cc: Add dg-skip-if
            for ia32.

    (cherry picked from commit 0f4588141fcbe4e0f1fa12776b47200870f6c621)

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2021-04-22 13:10 ` cvs-commit at gcc dot gnu.org
@ 2021-04-22 17:53 ` ubizjak at gmail dot com
  2021-04-22 17:58 ` jakub at gcc dot gnu.org
                   ` (32 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: ubizjak at gmail dot com @ 2021-04-22 17:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
                 CC|uros at gcc dot gnu.org            |
           Assignee|unassigned at gcc dot gnu.org      |ubizjak at gmail dot com

--- Comment #7 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Jakub Jelinek from comment #1)
> In this particular case it is the sync.md:398 peephole2:
> (define_peephole2
>   [(set (match_operand:DF 0 "memory_operand")
>         (match_operand:DF 1 "any_fp_register_operand"))
>    (set (mem:BLK (scratch:SI))
>         (unspec:BLK [(mem:BLK (scratch:SI))] UNSPEC_MEMORY_BLOCKAGE))
>    (set (match_operand:DF 2 "fp_register_operand")
>         (unspec:DF [(match_operand:DI 3 "memory_operand")]
>                    UNSPEC_FILD_ATOMIC))
>    (set (match_operand:DI 4 "memory_operand")
>         (unspec:DI [(match_dup 2)]
>                    UNSPEC_FIST_ATOMIC))]
>   "!TARGET_64BIT
>    && peep2_reg_dead_p (4, operands[2])
>    && rtx_equal_p (XEXP (operands[0], 0), XEXP (operands[3], 0))"
>   [(const_int 0)]
> {
>   emit_insn (gen_memory_blockage ());
>   emit_move_insn (gen_lowpart (DFmode, operands[4]), operands[1]);
>   DONE;
> })
> that triggers here, but from what I can read, all the r7-1112 peephole2s
> optimize away stores to some memory on the assumption that the memory is
> read only once (in another insn matched by the same peephole2).
> I'm not 100% sure we can rely on that even for spill slots, for which
> r7-1112 seems to have been written, but for other memory we'd need to prove
> that the memory is dead.
> Rather than removing those peephole2s altogether, I wonder if we shouldn't
> just check that the memory_operands whose stores we'd optimize away are
> spill slots.

Actually, these peepholes are too eager: they also remove the store to memory
operand 0 on the assumption that the operand is used only in the peephole2
sequence. As shown in the testcase, this is not always true, and operand 0 can
also be accessed after the peephole2'd sequence.

The solution is to not remove the store to operand 0. Probably there will be
some unneeded stores left in the code, but IMO this is a small price to pay
for correctness. And we still remove the fild/fistp pair.

I'm testing the following patch:

--cut here--
diff --git a/gcc/config/i386/sync.md b/gcc/config/i386/sync.md
index c7c508c8de8..538d1f89497 100644
--- a/gcc/config/i386/sync.md
+++ b/gcc/config/i386/sync.md
@@ -392,7 +392,8 @@
   "!TARGET_64BIT
    && peep2_reg_dead_p (3, operands[2])
    && rtx_equal_p (XEXP (operands[0], 0), XEXP (operands[3], 0))"
-  [(set (match_dup 5) (match_dup 1))]
+  [(set (match_dup 0) (match_dup 1))
+   (set (match_dup 5) (match_dup 1))]
   "operands[5] = gen_lowpart (DFmode, operands[4]);")

 (define_peephole2
@@ -411,6 +412,7 @@
    && rtx_equal_p (XEXP (operands[0], 0), XEXP (operands[3], 0))"
   [(const_int 0)]
 {
+  emit_move_insn (operands[0], operands[1]);
   emit_insn (gen_memory_blockage ());
   emit_move_insn (gen_lowpart (DFmode, operands[4]), operands[1]);
   DONE;
@@ -428,7 +430,8 @@
   "!TARGET_64BIT
    && peep2_reg_dead_p (3, operands[2])
    && rtx_equal_p (XEXP (operands[0], 0), XEXP (operands[3], 0))"
-  [(set (match_dup 5) (match_dup 1))]
+  [(set (match_dup 0) (match_dup 1))
+   (set (match_dup 5) (match_dup 1))]
   "operands[5] = gen_lowpart (DFmode, operands[4]);")

 (define_peephole2
@@ -447,6 +450,7 @@
    && rtx_equal_p (XEXP (operands[0], 0), XEXP (operands[3], 0))"
   [(const_int 0)]
 {
+  emit_move_insn (operands[0], operands[1]);
   emit_insn (gen_memory_blockage ());
   emit_move_insn (gen_lowpart (DFmode, operands[4]), operands[1]);
   DONE;
--cut here--

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2021-04-22 17:53 ` ubizjak at gmail dot com
@ 2021-04-22 17:58 ` jakub at gcc dot gnu.org
  2021-04-22 18:09 ` ubizjak at gmail dot com
                   ` (31 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-04-22 17:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I think there are 8 of those peephole2s rather than just 4 (I've been looking
for rtx_equal_p (XEXP.*, 0) in sync.md).

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2021-04-22 17:58 ` jakub at gcc dot gnu.org
@ 2021-04-22 18:09 ` ubizjak at gmail dot com
  2021-04-22 18:41 ` ubizjak at gmail dot com
                   ` (30 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: ubizjak at gmail dot com @ 2021-04-22 18:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #9 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Jakub Jelinek from comment #8)
> I think there are 8 of those peephole2s rather than just 4 (I've been
> looking for rtx_equal_p (XEXP.*, 0) in sync.md).

No, the others are not problematic.

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (10 preceding siblings ...)
  2021-04-22 18:09 ` ubizjak at gmail dot com
@ 2021-04-22 18:41 ` ubizjak at gmail dot com
  2021-04-23  6:13 ` ubizjak at gmail dot com
                   ` (29 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: ubizjak at gmail dot com @ 2021-04-22 18:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #10 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Uroš Bizjak from comment #9)
> (In reply to Jakub Jelinek from comment #8)
> > I think there are 8 of those peephole2s rather than just 4 (I've been
> > looking for rtx_equal_p (XEXP.*, 0) in sync.md).
> 
> No, the others are not problematic.

Actually, you are right. Those other peephole2 sequences also write to the
memory, and it is assumed that the memory is not accessed outside the sequence.

Additional patch follows:

--cut here--
diff --git a/gcc/config/i386/sync.md b/gcc/config/i386/sync.md
index c7c508c8de8..c95cf50970e 100644
--- a/gcc/config/i386/sync.md
+++ b/gcc/config/i386/sync.md
@@ -231,7 +231,8 @@
   "!TARGET_64BIT
    && peep2_reg_dead_p (2, operands[0])
    && rtx_equal_p (XEXP (operands[4], 0), XEXP (operands[2], 0))"
-  [(set (match_dup 3) (match_dup 5))]
+  [(set (match_dup 3) (match_dup 5))
+   (set (match_dup 4) (match_dup 3))]
   "operands[5] = gen_lowpart (DFmode, operands[1]);")

 (define_peephole2
@@ -251,6 +252,7 @@
   [(const_int 0)]
 {
   emit_move_insn (operands[3], gen_lowpart (DFmode, operands[1]));
+  emit_move_insn (operands[4], operands[3]);
   emit_insn (gen_memory_blockage ());
   DONE;
 })
@@ -267,7 +269,8 @@
   "!TARGET_64BIT
    && peep2_reg_dead_p (2, operands[0])
    && rtx_equal_p (XEXP (operands[4], 0), XEXP (operands[2], 0))"
-  [(set (match_dup 3) (match_dup 5))]
+  [(set (match_dup 3) (match_dup 5))
+   (set (match_dup 4) (match_dup 3))]
   "operands[5] = gen_lowpart (DFmode, operands[1]);")

 (define_peephole2
@@ -287,6 +290,7 @@
   [(const_int 0)]
 {
   emit_move_insn (operands[3], gen_lowpart (DFmode, operands[1]));
+  emit_move_insn (operands[4], operands[3]);
   emit_insn (gen_memory_blockage ());
   DONE;
 })
--cut here--

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (11 preceding siblings ...)
  2021-04-22 18:41 ` ubizjak at gmail dot com
@ 2021-04-23  6:13 ` ubizjak at gmail dot com
  2021-04-23  7:40 ` jakub at gcc dot gnu.org
                   ` (28 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: ubizjak at gmail dot com @ 2021-04-23  6:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #11 from Uroš Bizjak <ubizjak at gmail dot com> ---
Jakub, do these two patches fix your failures?

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (12 preceding siblings ...)
  2021-04-23  6:13 ` ubizjak at gmail dot com
@ 2021-04-23  7:40 ` jakub at gcc dot gnu.org
  2021-04-23  7:52 ` ubizjak at gmail dot com
                   ` (27 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-04-23  7:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #12 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
They do.  Though in the combined patch I'm still a little bit worried about
the first 4 modified peephole2s; the last 4 look good to me.
The last 4 are where the original insn did a normal DFmode store and your patch
restores those DFmode stores.
But the first 4 had an atomic store followed by a DFmode read; shouldn't those
preserve an atomic store instead of the DFmode store?  A non-atomic DFmode read
is one thing, but it could be followed later by atomic loads, both into DFmode
and ones into DImode that would check the whole bit pattern.
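
To illustrate why a DImode read is stricter than a DFmode one, here is a small
C sketch.  The helper names bits_of, same_value, same_bits and self_check are
hypothetical and only stand in for "compare as double" versus "compare the
whole 64-bit pattern"; nothing here is taken from sync.md.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a double as its 64-bit pattern, i.e. what an 8-byte
   integer read of the same memory would observe.  */
uint64_t bits_of (double d)
{
  uint64_t u;
  memcpy (&u, &d, sizeof u);
  return u;
}

/* DFmode-style check: compare by floating point value.  */
int same_value (double a, double b) { return a == b; }

/* DImode-style check: compare every bit.  */
int same_bits (double a, double b) { return bits_of (a) == bits_of (b); }

void self_check (void)
{
  double n = NAN;  /* quiet NaN from <math.h> */

  /* Equal as values, different as bit patterns.  */
  assert (same_value (0.0, -0.0));
  assert (!same_bits (0.0, -0.0));

  /* Unequal as values, identical as bit patterns.  */
  assert (!same_value (n, n));
  assert (same_bits (n, n));
}
```

A store that keeps the value but alters the bits would therefore pass a later
DFmode comparison and still be visible to a DImode one.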

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (13 preceding siblings ...)
  2021-04-23  7:40 ` jakub at gcc dot gnu.org
@ 2021-04-23  7:52 ` ubizjak at gmail dot com
  2021-04-23  7:54 ` ubizjak at gmail dot com
                   ` (26 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: ubizjak at gmail dot com @ 2021-04-23  7:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #13 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Jakub Jelinek from comment #12)
> They do.  Though in the combined patch I'm still a little bit worried about
> the first 4 modified peephole2s; the last 4 look good to me.
> The last 4 are where the original insn did a normal DFmode store and your
> patch restores those DFmode stores.
> But the first 4 had an atomic store followed by a DFmode read; shouldn't
> those preserve an atomic store instead of the DFmode store?  A non-atomic
> DFmode read is one thing, but it could be followed later by atomic loads,
> both into DFmode and ones into DImode that would check the whole bit pattern.

DFmode loads and stores *are* atomic, this is what the optimization is based
on.

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (14 preceding siblings ...)
  2021-04-23  7:52 ` ubizjak at gmail dot com
@ 2021-04-23  7:54 ` ubizjak at gmail dot com
  2021-04-23  7:56 ` jakub at gcc dot gnu.org
                   ` (25 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: ubizjak at gmail dot com @ 2021-04-23  7:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #14 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Uroš Bizjak from comment #13)

> DFmode loads and stores *are* atomic, this is what the optimization is based
> on.

Loads and stores to/from x87 and SSE registers, to be clear.

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (15 preceding siblings ...)
  2021-04-23  7:54 ` ubizjak at gmail dot com
@ 2021-04-23  7:56 ` jakub at gcc dot gnu.org
  2021-04-23  8:02 ` ubizjak at gmail dot com
                   ` (24 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-04-23  7:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #15 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Yes, but do they preserve all the bits and never modify any bit patterns,
including qNaNs and sNaNs?  I thought the point of using the fistp was that it
preserves everything.

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (16 preceding siblings ...)
  2021-04-23  7:56 ` jakub at gcc dot gnu.org
@ 2021-04-23  8:02 ` ubizjak at gmail dot com
  2021-04-23  8:13 ` ubizjak at gmail dot com
                   ` (23 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: ubizjak at gmail dot com @ 2021-04-23  8:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #16 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Jakub Jelinek from comment #15)
> Yes, but do they preserve all the bits and never modify any bit patterns,
> including qNaNs and sNaNs?  I thought the point of using the fistp was that
> it preserves everything.

Hm, they don't...

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (17 preceding siblings ...)
  2021-04-23  8:02 ` ubizjak at gmail dot com
@ 2021-04-23  8:13 ` ubizjak at gmail dot com
  2021-04-23  8:25 ` jakub at gcc dot gnu.org
                   ` (22 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: ubizjak at gmail dot com @ 2021-04-23  8:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #17 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Uroš Bizjak from comment #16)
> (In reply to Jakub Jelinek from comment #15)
> > Yes, but do they preserve all the bits and never modify any bit patterns,
> > including qNaNs and sNaNs?  I thought the point of using the fistp was that
> > it preserves everything.
> 
> Hm, they don't...

This probably means we have to remove the x87 peepholes where an atomic store
is followed by a DFmode read. x87 can't load and store a DFmode value with all
bits untouched without a fild/fistp pair.
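
To make the difference concrete, here is a minimal sketch in C. It does not
execute any x87 instruction; model_x87_fld_fst is a hypothetical model that
assumes the invalid-operation exception is masked, in which case loading a
double-precision sNaN yields the corresponding qNaN (quiet bit set). All
function names are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Quiet bit of an IEEE 754 binary64 NaN (top bit of the fraction).  */
#define QUIET_BIT (UINT64_C (1) << 51)

/* Return 1 if BITS encodes a signaling NaN: exponent all ones,
   nonzero fraction, quiet bit clear.  */
int
is_snan_bits (uint64_t bits)
{
  uint64_t exp = (bits >> 52) & 0x7ff;
  uint64_t frac = bits & ((UINT64_C (1) << 52) - 1);
  return exp == 0x7ff && frac != 0 && (bits & QUIET_BIT) == 0;
}

/* Hypothetical model of an fld/fst round trip with the invalid
   exception masked: an sNaN comes back as a qNaN, everything else
   is returned unchanged.  */
uint64_t
model_x87_fld_fst (uint64_t bits)
{
  return is_snan_bits (bits) ? (bits | QUIET_BIT) : bits;
}

/* Model of a 64-bit integer-style move (the role fild/fistp plays
   here): every bit is kept.  */
uint64_t
model_integer_move (uint64_t bits)
{
  uint64_t copy;
  memcpy (&copy, &bits, sizeof copy);
  return copy;
}
```

Under this model the pattern 0x7ff0000000000001 survives the integer-style
move but not the fld/fst round trip, so a later comparison of the raw bits
would see a difference.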

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (18 preceding siblings ...)
  2021-04-23  8:13 ` ubizjak at gmail dot com
@ 2021-04-23  8:25 ` jakub at gcc dot gnu.org
  2021-04-23  8:36 ` jakub at gcc dot gnu.org
                   ` (21 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-04-23  8:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #18 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Indeed.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (19 preceding siblings ...)
  2021-04-23  8:25 ` jakub at gcc dot gnu.org
@ 2021-04-23  8:36 ` jakub at gcc dot gnu.org
  2021-04-23  8:41 ` jakub at gcc dot gnu.org
                   ` (20 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-04-23  8:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #19 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Perhaps the best approach would be to construct a testcase for each of the
peephole2s, use some bit pattern that isn't preserved through the FPU except by
fild/fistp, and see what enabling/disabling each of the peephole2s does to it.
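
A rough sketch of what such a probe could look like, assuming a signaling NaN
bit pattern is used as the value that plain FP moves may alter. The names
SNAN_BITS and snan_survives_atomic_copy are hypothetical, and this is not
tailored to any particular peephole2; it only checks that an 8-byte atomic
copy keeps every bit.

```c
#include <stdint.h>

/* A double-precision signaling NaN: exponent all ones, quiet bit
   clear, lowest fraction bit set.  Chosen for illustration only.  */
#define SNAN_BITS UINT64_C (0x7ff0000000000001)

union U
{
  double d;
  uint64_t l;
} __attribute__ ((aligned (8)));

/* Copy SNAN_BITS from one 8-byte slot to another using atomic
   load/store builtins and return 1 if the bits read back from the
   destination are identical.  */
__attribute__ ((noinline)) int
snan_survives_atomic_copy (void)
{
  union U src, dst;
  src.l = SNAN_BITS;
  dst.l = 0;
  __atomic_store_n (&dst.l, __atomic_load_n (&src.l, __ATOMIC_SEQ_CST),
                    __ATOMIC_SEQ_CST);
  return __atomic_load_n (&dst.l, __ATOMIC_SEQ_CST) == SNAN_BITS;
}
```

Building this with -O2 -march=i686 -m32, as in the original report, and
aborting when the function returns 0 would turn it into a runtime test.
Whether a given peephole2 actually triggers on it would still have to be
checked in the generated assembly.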

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (20 preceding siblings ...)
  2021-04-23  8:36 ` jakub at gcc dot gnu.org
@ 2021-04-23  8:41 ` jakub at gcc dot gnu.org
  2021-04-23  9:20 ` ubizjak at gmail dot com
                   ` (19 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-04-23  8:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |aoliva at gcc dot gnu.org

--- Comment #20 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
*** Bug 100228 has been marked as a duplicate of this bug. ***

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (21 preceding siblings ...)
  2021-04-23  8:41 ` jakub at gcc dot gnu.org
@ 2021-04-23  9:20 ` ubizjak at gmail dot com
  2021-04-23 15:30 ` cvs-commit at gcc dot gnu.org
                   ` (18 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: ubizjak at gmail dot com @ 2021-04-23  9:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #50649|0                           |1
        is obsolete|                            |

--- Comment #21 from Uroš Bizjak <ubizjak at gmail dot com> ---
Created attachment 50659
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50659&action=edit
Proposed patch

Here is the complete proposed patch.

We can retain the problematic peepholes (an atomic store followed by a DFmode
load) as long as the load goes to an SSE register. A load to an SSE register
uses movlps/movq moves that preserve all bits, so we can be sure that the value
stored to the memory location is unchanged from the original.

However, the "load to an SSE register" requirement makes the peephole
ineffective for -mfpmath=387, so XFAILs are added to the affected testcases.
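
As a small illustration of the bit-preserving property, the following sketch
copies 8 bytes through an SSE register using SSE2 intrinsics, which compilers
typically implement with movq. The function name copy_via_sse is hypothetical,
and the memcpy fallback only exists so that the example still builds on
targets without SSE2.

```c
#include <stdint.h>
#include <string.h>
#if defined (__SSE2__)
#include <emmintrin.h>
#endif

/* Copy a 64-bit pattern through the low half of an SSE register.
   The move is a plain 64-bit data move, so NaN payloads and the
   quiet bit are left alone.  */
uint64_t
copy_via_sse (uint64_t bits)
{
  uint64_t out;
#if defined (__SSE2__)
  __m128i v = _mm_loadl_epi64 ((const __m128i *) &bits); /* 64-bit load.  */
  _mm_storel_epi64 ((__m128i *) &out, v);                /* 64-bit store.  */
#else
  memcpy (&out, &bits, sizeof out); /* Fallback without SSE2.  */
#endif
  return out;
}
```

Passing a signaling NaN pattern such as 0x7ff0000000000001 and comparing the
result with the input is enough to check that no bit was changed on the way.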

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (22 preceding siblings ...)
  2021-04-23  9:20 ` ubizjak at gmail dot com
@ 2021-04-23 15:30 ` cvs-commit at gcc dot gnu.org
  2021-04-28 10:44 ` cvs-commit at gcc dot gnu.org
                   ` (17 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-04-23 15:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #22 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:

https://gcc.gnu.org/g:d2324a5ab3ff097864ae6828cb1db4dd013c70d1

commit r12-91-gd2324a5ab3ff097864ae6828cb1db4dd013c70d1
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Fri Apr 23 17:29:29 2021 +0200

    i386: Fix atomic FP peepholes [PR100182]

    64bit loads to/stores from x87 and SSE registers are atomic also on 32-bit
    targets, so there is no need for additional atomic moves to a temporary
    register.

    Introduced load peephole2 patterns assume that there won't be any additional
    loads from the load location outside the peepholed sequence and wrongly
    removed the source location initialization.

    OTOH, introduced store peephole2 patterns assume there won't be any additional
    loads from the stored location outside the peepholed sequence and wrongly
    removed the destination location initialization.  Note that we can't use plain
    x87 FST instruction to initialize destination location because FST converts
    the value to the double-precision format, changing bits during move.

    The patch restores removed initializations in load and store patterns.
    Additionally, plain x87 FST in store peephole2 patterns is prevented by
    limiting the store operand source to SSE registers.

    2021-04-23  Uroš Bizjak  <ubizjak@gmail.com>

    gcc/
            PR target/100182
            * config/i386/sync.md (FILD_ATOMIC/FIST_ATOMIC FP load peephole2):
            Copy operand 3 to operand 4.  Use sse_reg_operand
            as operand 3 predicate.
            (FILD_ATOMIC/FIST_ATOMIC FP load peephole2 with mem blockage): Ditto.
            (LDX_ATOMIC/STX_ATOMIC FP load peephole2): Ditto.
            (LDX_ATOMIC/LDX_ATOMIC FP load peephole2 with mem blockage): Ditto.
            (FILD_ATOMIC/FIST_ATOMIC FP store peephole2):
            Copy operand 1 to operand 0.
            (FILD_ATOMIC/FIST_ATOMIC FP store peephole2 with mem blockage): Ditto.
            (LDX_ATOMIC/STX_ATOMIC FP store peephole2): Ditto.
            (LDX_ATOMIC/LDX_ATOMIC FP store peephole2 with mem blockage): Ditto.

    gcc/testsuite/

            PR target/100182
            * gcc.target/i386/pr100182.c: New test.
            * gcc.target/i386/pr71245-1.c (dg-final): Xfail scan-assembler-not.
            * gcc.target/i386/pr71245-2.c (dg-final): Ditto.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (23 preceding siblings ...)
  2021-04-23 15:30 ` cvs-commit at gcc dot gnu.org
@ 2021-04-28 10:44 ` cvs-commit at gcc dot gnu.org
  2021-04-28 13:33 ` cvs-commit at gcc dot gnu.org
                   ` (16 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-04-28 10:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #23 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:

https://gcc.gnu.org/g:c03f3077b1517a01c917f75179100f9d10b39156

commit r11-8313-gc03f3077b1517a01c917f75179100f9d10b39156
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Wed Apr 28 12:30:04 2021 +0200

    i386: Fix atomic FP peepholes [PR100182]

    64bit loads to/stores from x87 and SSE registers are atomic also on 32-bit
    targets, so there is no need for additional atomic moves to a temporary
    register.

    Introduced load peephole2 patterns assume that there won't be any additional
    loads from the load location outside the peepholed sequence and wrongly
    removed the source location initialization.

    OTOH, introduced store peephole2 patterns assume there won't be any additional
    loads from the stored location outside the peepholed sequence and wrongly
    removed the destination location initialization.  Note that we can't use plain
    x87 FST instruction to initialize destination location because FST converts
    the value to the double-precision format, changing bits during move.

    The patch restores removed initializations in load and store patterns.
    Additionally, plain x87 FST in store peephole2 patterns is prevented by
    limiting the store operand source to SSE registers.

    2021-04-27  Uroš Bizjak  <ubizjak@gmail.com>

    gcc/
            PR target/100182
            * config/i386/sync.md (FILD_ATOMIC/FIST_ATOMIC FP load peephole2):
            Copy operand 3 to operand 4.  Use sse_reg_operand
            as operand 3 predicate.
            (FILD_ATOMIC/FIST_ATOMIC FP load peephole2 with mem blockage): Ditto.
            (LDX_ATOMIC/STX_ATOMIC FP load peephole2): Ditto.
            (LDX_ATOMIC/LDX_ATOMIC FP load peephole2 with mem blockage): Ditto.
            (FILD_ATOMIC/FIST_ATOMIC FP store peephole2):
            Copy operand 1 to operand 0.
            (FILD_ATOMIC/FIST_ATOMIC FP store peephole2 with mem blockage): Ditto.
            (LDX_ATOMIC/STX_ATOMIC FP store peephole2): Ditto.
            (LDX_ATOMIC/LDX_ATOMIC FP store peephole2 with mem blockage): Ditto.

    gcc/testsuite/
            PR target/100182
            * gcc.target/i386/pr100182.c: New test.
            * gcc.target/i386/pr71245-1.c (dg-final): Xfail scan-assembler-not.
            * gcc.target/i386/pr71245-2.c (dg-final): Ditto.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (24 preceding siblings ...)
  2021-04-28 10:44 ` cvs-commit at gcc dot gnu.org
@ 2021-04-28 13:33 ` cvs-commit at gcc dot gnu.org
  2021-04-28 18:02 ` cvs-commit at gcc dot gnu.org
                   ` (15 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-04-28 13:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #24 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-10 branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:

https://gcc.gnu.org/g:39e8bfe7217898e8d21bcc55efe6992fbde262f1

commit r10-9775-g39e8bfe7217898e8d21bcc55efe6992fbde262f1
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Fri Apr 23 17:29:29 2021 +0200

    i386: Fix atomic FP peepholes [PR100182]

    64bit loads to/stores from x87 and SSE registers are atomic also on 32-bit
    targets, so there is no need for additional atomic moves to a temporary
    register.

    Introduced load peephole2 patterns assume that there won't be any additional
    loads from the load location outside the peepholed sequence and wrongly
    removed the source location initialization.

    OTOH, introduced store peephole2 patterns assume there won't be any additional
    loads from the stored location outside the peepholed sequence and wrongly
    removed the destination location initialization.  Note that we can't use plain
    x87 FST instruction to initialize destination location because FST converts
    the value to the double-precision format, changing bits during move.

    The patch restores removed initializations in load and store patterns.
    Additionally, plain x87 FST in store peephole2 patterns is prevented by
    limiting the store operand source to SSE registers.

    2021-04-23  Uroš Bizjak  <ubizjak@gmail.com>

    gcc/
            PR target/100182
            * config/i386/sync.md (FILD_ATOMIC/FIST_ATOMIC FP load peephole2):
            Copy operand 3 to operand 4.  Use sse_reg_operand
            as operand 3 predicate.
            (FILD_ATOMIC/FIST_ATOMIC FP load peephole2 with mem blockage): Ditto.
            (LDX_ATOMIC/STX_ATOMIC FP load peephole2): Ditto.
            (LDX_ATOMIC/LDX_ATOMIC FP load peephole2 with mem blockage): Ditto.
            (FILD_ATOMIC/FIST_ATOMIC FP store peephole2):
            Copy operand 1 to operand 0.
            (FILD_ATOMIC/FIST_ATOMIC FP store peephole2 with mem blockage): Ditto.
            (LDX_ATOMIC/STX_ATOMIC FP store peephole2): Ditto.
            (LDX_ATOMIC/LDX_ATOMIC FP store peephole2 with mem blockage): Ditto.

    gcc/testsuite/

            PR target/100182
            * gcc.target/i386/pr100182.c: New test.
            * gcc.target/i386/pr71245-1.c (dg-final): Xfail scan-assembler-not.
            * gcc.target/i386/pr71245-2.c (dg-final): Ditto.

    (cherry picked from commit d2324a5ab3ff097864ae6828cb1db4dd013c70d1)

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (25 preceding siblings ...)
  2021-04-28 13:33 ` cvs-commit at gcc dot gnu.org
@ 2021-04-28 18:02 ` cvs-commit at gcc dot gnu.org
  2021-04-28 18:02 ` cvs-commit at gcc dot gnu.org
                   ` (14 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-04-28 18:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #25 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-9 branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:

https://gcc.gnu.org/g:be20ca1d4ff79baf7425a48bb887495e1ea8f788

commit r9-9472-gbe20ca1d4ff79baf7425a48bb887495e1ea8f788
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Fri Apr 23 17:29:29 2021 +0200

    i386: Fix atomic FP peepholes [PR100182]

    64bit loads to/stores from x87 and SSE registers are atomic also on 32-bit
    targets, so there is no need for additional atomic moves to a temporary
    register.

    Introduced load peephole2 patterns assume that there won't be any additional
    loads from the load location outside the peepholed sequence and wrongly
    removed the source location initialization.

    OTOH, introduced store peephole2 patterns assume there won't be any additional
    loads from the stored location outside the peepholed sequence and wrongly
    removed the destination location initialization.  Note that we can't use plain
    x87 FST instruction to initialize destination location because FST converts
    the value to the double-precision format, changing bits during move.

    The patch restores removed initializations in load and store patterns.
    Additionally, plain x87 FST in store peephole2 patterns is prevented by
    limiting the store operand source to SSE registers.

    2021-04-23  Uroš Bizjak  <ubizjak@gmail.com>

    gcc/
            PR target/100182
            * config/i386/sync.md (FILD_ATOMIC/FIST_ATOMIC FP load peephole2):
            Copy operand 3 to operand 4.  Use sse_reg_operand
            as operand 3 predicate.
            (FILD_ATOMIC/FIST_ATOMIC FP load peephole2 with mem blockage): Ditto.
            (LDX_ATOMIC/STX_ATOMIC FP load peephole2): Ditto.
            (LDX_ATOMIC/LDX_ATOMIC FP load peephole2 with mem blockage): Ditto.
            (FILD_ATOMIC/FIST_ATOMIC FP store peephole2):
            Copy operand 1 to operand 0.
            (FILD_ATOMIC/FIST_ATOMIC FP store peephole2 with mem blockage): Ditto.
            (LDX_ATOMIC/STX_ATOMIC FP store peephole2): Ditto.
            (LDX_ATOMIC/LDX_ATOMIC FP store peephole2 with mem blockage): Ditto.

    gcc/testsuite/

            PR target/100182
            * gcc.target/i386/pr100182.c: New test.
            * gcc.target/i386/pr71245-1.c (dg-final): Xfail scan-assembler-not.
            * gcc.target/i386/pr71245-2.c (dg-final): Ditto.

    (cherry picked from commit d2324a5ab3ff097864ae6828cb1db4dd013c70d1)

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (26 preceding siblings ...)
  2021-04-28 18:02 ` cvs-commit at gcc dot gnu.org
@ 2021-04-28 18:02 ` cvs-commit at gcc dot gnu.org
  2021-04-28 18:09 ` ubizjak at gmail dot com
                   ` (13 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-04-28 18:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #26 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-8 branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:

https://gcc.gnu.org/g:0d277114b4b2d0cb386c7abe409a81ca29d9d61d

commit r8-10926-g0d277114b4b2d0cb386c7abe409a81ca29d9d61d
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Fri Apr 23 17:29:29 2021 +0200

    i386: Fix atomic FP peepholes [PR100182]

    64bit loads to/stores from x87 and SSE registers are atomic also on 32-bit
    targets, so there is no need for additional atomic moves to a temporary
    register.

    Introduced load peephole2 patterns assume that there won't be any additional
    loads from the load location outside the peepholed sequence and wrongly
    removed the source location initialization.

    OTOH, introduced store peephole2 patterns assume there won't be any additional
    loads from the stored location outside the peepholed sequence and wrongly
    removed the destination location initialization.  Note that we can't use plain
    x87 FST instruction to initialize destination location because FST converts
    the value to the double-precision format, changing bits during move.

    The patch restores removed initializations in load and store patterns.
    Additionally, plain x87 FST in store peephole2 patterns is prevented by
    limiting the store operand source to SSE registers.

    2021-04-23  Uroš Bizjak  <ubizjak@gmail.com>

    gcc/
            PR target/100182
            * config/i386/sync.md (FILD_ATOMIC/FIST_ATOMIC FP load peephole2):
            Copy operand 3 to operand 4.  Use sse_reg_operand
            as operand 3 predicate.
            (FILD_ATOMIC/FIST_ATOMIC FP load peephole2 with mem blockage): Ditto.
            (LDX_ATOMIC/STX_ATOMIC FP load peephole2): Ditto.
            (LDX_ATOMIC/LDX_ATOMIC FP load peephole2 with mem blockage): Ditto.
            (FILD_ATOMIC/FIST_ATOMIC FP store peephole2):
            Copy operand 1 to operand 0.
            (FILD_ATOMIC/FIST_ATOMIC FP store peephole2 with mem blockage): Ditto.
            (LDX_ATOMIC/STX_ATOMIC FP store peephole2): Ditto.
            (LDX_ATOMIC/LDX_ATOMIC FP store peephole2 with mem blockage): Ditto.

    gcc/testsuite/

            PR target/100182
            * gcc.target/i386/pr100182.c: New test.
            * gcc.target/i386/pr71245-1.c (dg-final): Xfail scan-assembler-not.
            * gcc.target/i386/pr71245-2.c (dg-final): Ditto.

    (cherry picked from commit d2324a5ab3ff097864ae6828cb1db4dd013c70d1)

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (27 preceding siblings ...)
  2021-04-28 18:02 ` cvs-commit at gcc dot gnu.org
@ 2021-04-28 18:09 ` ubizjak at gmail dot com
  2021-07-19 13:08 ` hjl.tools at gmail dot com
                   ` (12 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: ubizjak at gmail dot com @ 2021-04-28 18:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #27 from Uroš Bizjak <ubizjak at gmail dot com> ---
Fixed everywhere.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (28 preceding siblings ...)
  2021-04-28 18:09 ` ubizjak at gmail dot com
@ 2021-07-19 13:08 ` hjl.tools at gmail dot com
  2021-07-19 14:40 ` ubizjak at gmail dot com
                   ` (11 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: hjl.tools at gmail dot com @ 2021-07-19 13:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---

--- Comment #28 from H.J. Lu <hjl.tools at gmail dot com> ---
29_atomics/atomic_ref/wait_notify.cc has the same issue on Linux/x86-64 with
-m32:

(gdb) bt
#0  0xf7f5455d in __kernel_vsyscall ()
#1  0xf7b3a46b in syscall () from /lib/libc.so.6
#2  0x0804995d in std::__detail::__platform_wait<int> (
    __addr=0x804d480 <std::__detail::__waiter_pool_base::_S_for(void
const*)::__w+960>, __val=1)
    at
/export/users/hjl/build/gnu/tools-build/gcc-gitlab-native/build-x86_64-linux/x86_64-pc-linux-gnu/32/libstdc++-v3/include/bits/atomic_wait.h:104
#3  0x08049afa in std::__detail::__waiter_pool::_M_do_wait (__old=1, 
    __addr=0x804d480 <std::__detail::__waiter_pool_base::_S_for(void
const*)::__w+960>, this=<optimized out>)
    at
/export/users/hjl/build/gnu/tools-build/gcc-gitlab-native/build-x86_64-linux/x86_64-pc-linux-gnu/32/libstdc++-v3/include/bits/atomic_wait.h:261
#4  std::__detail::__waiter<std::integral_constant<bool, true>
>::_M_do_wait_v<int, std::__atomic_impl::wait<int>(int const*,
std::remove_volatile<int>::type, std::memory_order)::{lambda()#1}>(int,
std::__atomic_impl::wait<int>(int const*, std::remove_volatile<int>::type,
std::memory_order)::{lambda()#1}) (__vfn=..., 
    __old=42, this=<synthetic pointer>)
    at
/export/users/hjl/build/gnu/tools-build/gcc-gitlab-native/build-x86_64-linux/x86_64-pc-linux-gnu/32/libstdc++-v3/include/bits/atomic_wait.h:400
#5  std::__atomic_wait_address_v<int, std::__atomic_impl::wait<int>(int const*,
std::remove_volatile<int>::type, std::memory_order)::{lambda()#1}>(int const*,
int, std::__atomic_impl::wait<int>(int const*, std::remove_volatile<int>::type,
std::memory_order)::{lambda()#1}) (__addr=0xffaf629c, __old=42, __vfn=...)
    at
/export/users/hjl/build/gnu/tools-build/gcc-gitlab-native/build-x86_64-linux/x86_64-pc-linux-gnu/32/libstdc++-v3/include/bits/atomic_wait.h:430
#6  0x08049bb2 in std::__atomic_impl::wait<int> (
    __m=std::memory_order::seq_cst, __old=<optimized out>, 
    __ptr=<optimized out>)
    at
/export/users/hjl/build/gnu/tools-build/gcc-gitlab-native/build-x86_64-linux/x86_64-pc-linux-gnu/32/libstdc++-v3/include/bits/atomic_base.h:1018
#7  std::__atomic_ref<int, true, false>::wait (__m=std::memory_order::seq_cst, 
    __old=<optimized out>, this=0xffaf62a4)
    at
/export/users/hjl/build/gnu/tools-build/gcc-gitlab-native/build-x86_64-linux/x86_64-pc-linux-gnu/32/libstdc++-v3/include/bits/atomic_base.h:1570
#8  test<int> (va=0, vb=42)
    at
/export/gnu/import/git/gitlab/x86-gcc/libstdc++-v3/testsuite/29_atomics/atomic_ref/wait_notify.cc:44
#9  0x08049250 in main ()
    at
/export/gnu/import/git/gitlab/x86-gcc/libstdc++-v3/testsuite/29_atomics/atomic_ref/wait_notify.cc:52
(gdb) 

when GCC is configured with

--enable-cet --with-demangler-in-ld --prefix=/usr/gcc-12.0.0-native
--with-local-prefix=/usr/local --enable-gnu-indirect-function
--enable-clocale=gnu --with-system-zlib --with-target-system-zlib
--with-fpmath=sse --with-arch=native --with-cpu=native
--enable-languages=c,c++,fortran,lto,objc,ada,obj-c++,go

where native == skylake-avx512.  It happens in roughly one out of 10 runs.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (29 preceding siblings ...)
  2021-07-19 13:08 ` hjl.tools at gmail dot com
@ 2021-07-19 14:40 ` ubizjak at gmail dot com
  2021-07-19 22:06 ` hjl.tools at gmail dot com
                   ` (10 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: ubizjak at gmail dot com @ 2021-07-19 14:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #29 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to H.J. Lu from comment #28)
> 29_atomics/atomic_ref/wait_notify.cc has the same issue on Linux/x86-64 with
> -m32:

Are you sure? The mentioned peephole2 patterns now emit only x87 or SSE DFmode
loads/stores that are guaranteed to be atomic.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (30 preceding siblings ...)
  2021-07-19 14:40 ` ubizjak at gmail dot com
@ 2021-07-19 22:06 ` hjl.tools at gmail dot com
  2021-07-19 22:18 ` ubizjak at gmail dot com
                   ` (9 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: hjl.tools at gmail dot com @ 2021-07-19 22:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #30 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Uroš Bizjak from comment #29)
> (In reply to H.J. Lu from comment #28)
> > 29_atomics/atomic_ref/wait_notify.cc has the same issue on Linux/x86-64 with
> > -m32:
> 
> Are you sure? The mentioned peephole2 patterns now emit only x87 or SSE
> DFmode loads/stores that are guaranteed to be atomic.

It does happen, but not very often.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (31 preceding siblings ...)
  2021-07-19 22:06 ` hjl.tools at gmail dot com
@ 2021-07-19 22:18 ` ubizjak at gmail dot com
  2021-07-20  4:23 ` cvs-commit at gcc dot gnu.org
                   ` (8 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: ubizjak at gmail dot com @ 2021-07-19 22:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #31 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to H.J. Lu from comment #30)
> (In reply to Uroš Bizjak from comment #29)
> > (In reply to H.J. Lu from comment #28)
> > > 29_atomics/atomic_ref/wait_notify.cc has the same issue on Linux/x86-64 with
> > > -m32:
> > 
> > Are you sure? The mentioned peephole2 patterns now emit only x87 or SSE
> > DFmode loads/stores that are guaranteed to be atomic.
> 
> It does happen, but not very often.

OK, I will simply remove *all* these peephole2s.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (32 preceding siblings ...)
  2021-07-19 22:18 ` ubizjak at gmail dot com
@ 2021-07-20  4:23 ` cvs-commit at gcc dot gnu.org
  2021-07-20  4:30 ` cvs-commit at gcc dot gnu.org
                   ` (7 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-07-20  4:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #32 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:

https://gcc.gnu.org/g:6d4da4aeef5b20f7f9693ddc27d26740d0dbe36c

commit r12-2407-g6d4da4aeef5b20f7f9693ddc27d26740d0dbe36c
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Tue Jul 20 06:15:16 2021 +0200

    i386: Remove atomic_storedi_fpu and atomic_loaddi_fpu peepholes [PR100182]

    These patterns result in a non-atomic sequence.

    2021-07-21  Uroš Bizjak  <ubizjak@gmail.com>

    gcc/
            PR target/100182
            * config/i386/sync.md (define_peephole2 atomic_storedi_fpu):
            Remove.
            (define_peephole2 atomic_loaddi_fpu): Ditto.

    gcc/testsuite/
            PR target/100182
            * gcc.target/i386/pr71245-1.c: Remove.
            * gcc.target/i386/pr71245-2.c: Ditto.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (33 preceding siblings ...)
  2021-07-20  4:23 ` cvs-commit at gcc dot gnu.org
@ 2021-07-20  4:30 ` cvs-commit at gcc dot gnu.org
  2021-07-20  4:36 ` cvs-commit at gcc dot gnu.org
                   ` (6 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-07-20  4:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #33 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:

https://gcc.gnu.org/g:f2060ae92f22e4877af21018cf3b8cc2eca4745e

commit r11-8783-gf2060ae92f22e4877af21018cf3b8cc2eca4745e
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Tue Jul 20 06:29:39 2021 +0200

    i386: Remove atomic_storedi_fpu and atomic_loaddi_fpu peepholes [PR100182]

    These patterns result in non-atomic sequence.

    2021-07-21  Uroš Bizjak  <ubizjak@gmail.com>

    gcc/
            PR target/100182
            * config/i386/sync.md (define_peephole2 atomic_storedi_fpu):
            Remove.
            (define_peephole2 atomic_loaddi_fpu): Ditto.

    gcc/testsuite/
            PR target/100182
            * gcc.target/i386/pr71245-1.c: Remove.
            * gcc.target/i386/pr71245-2.c: Ditto.

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (34 preceding siblings ...)
  2021-07-20  4:30 ` cvs-commit at gcc dot gnu.org
@ 2021-07-20  4:36 ` cvs-commit at gcc dot gnu.org
  2021-07-20  4:39 ` cvs-commit at gcc dot gnu.org
                   ` (5 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-07-20  4:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #34 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-10 branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:

https://gcc.gnu.org/g:ff3a8cd277752fd6167c39c00391662e47f1d1c0

commit r10-9992-gff3a8cd277752fd6167c39c00391662e47f1d1c0
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Tue Jul 20 06:36:02 2021 +0200

    i386: Remove atomic_storedi_fpu and atomic_loaddi_fpu peepholes [PR100182]

    These patterns result in non-atomic sequence.

    2021-07-21  Uroš Bizjak  <ubizjak@gmail.com>

    gcc/
            PR target/100182
            * config/i386/sync.md (define_peephole2 atomic_storedi_fpu):
            Remove.
            (define_peephole2 atomic_loaddi_fpu): Ditto.

    gcc/testsuite/
            PR target/100182
            * gcc.target/i386/pr71245-1.c: Remove.
            * gcc.target/i386/pr71245-2.c: Ditto.

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (35 preceding siblings ...)
  2021-07-20  4:36 ` cvs-commit at gcc dot gnu.org
@ 2021-07-20  4:39 ` cvs-commit at gcc dot gnu.org
  2021-07-20  4:41 ` ubizjak at gmail dot com
                   ` (4 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-07-20  4:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #35 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-9 branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:

https://gcc.gnu.org/g:5972907bd0528a04eb8fa4f9897714724da53f04

commit r9-9634-g5972907bd0528a04eb8fa4f9897714724da53f04
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Tue Jul 20 06:38:46 2021 +0200

    i386: Remove atomic_storedi_fpu and atomic_loaddi_fpu peepholes [PR100182]

    These patterns result in non-atomic sequence.

    2021-07-21  Uroš Bizjak  <ubizjak@gmail.com>

    gcc/
            PR target/100182
            * config/i386/sync.md (define_peephole2 atomic_storedi_fpu):
            Remove.
            (define_peephole2 atomic_loaddi_fpu): Ditto.

    gcc/testsuite/
            PR target/100182
            * gcc.target/i386/pr71245-1.c: Remove.
            * gcc.target/i386/pr71245-2.c: Ditto.

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (36 preceding siblings ...)
  2021-07-20  4:39 ` cvs-commit at gcc dot gnu.org
@ 2021-07-20  4:41 ` ubizjak at gmail dot com
  2021-07-31 19:24 ` hjl.tools at gmail dot com
                   ` (3 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: ubizjak at gmail dot com @ 2021-07-20  4:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|REOPENED                    |RESOLVED

--- Comment #36 from Uroš Bizjak <ubizjak at gmail dot com> ---
Fixed by reverting PR71245.
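
For anyone re-checking a given compiler build, the following is a minimal
sketch in the spirit of the distilled testcase at the top of this thread. It
is not part of the GCC testsuite; the names slot, atomic_copy_matches and
roundtrip_ok are made up for illustration. It assumes only the documented
__atomic_load_n and __atomic_store_n builtins and copies a double through
8-byte atomic accesses, then compares the result:

```c
/* 8-byte aligned storage viewed either as a double or as its raw bits.
   The type and function names in this sketch are illustrative only.  */
union slot
{
  double d;
  unsigned long long l;
} __attribute__ ((aligned (8)));

/* Copy SRC to DST with a sequentially consistent 8-byte atomic load and
   store, then return 1 if both slots hold the same bit pattern.  */
static int
atomic_copy_matches (union slot *dst, union slot *src)
{
  unsigned long long v = __atomic_load_n (&src->l, __ATOMIC_SEQ_CST);
  __atomic_store_n (&dst->l, v, __ATOMIC_SEQ_CST);
  return __atomic_load_n (&dst->l, __ATOMIC_SEQ_CST)
	 == __atomic_load_n (&src->l, __ATOMIC_SEQ_CST);
}

/* Start with INITIAL in one slot and VALUE in the other, copy VALUE over
   INITIAL atomically, and return 1 if the copy reads back as VALUE.
   The final comparison is only meaningful for non-NaN VALUE.  */
int
roundtrip_ok (double initial, double value)
{
  union slot a0, a1;
  a0.d = initial;
  a1.d = value;
  return atomic_copy_matches (&a0, &a1) && a0.d == value;
}
```

Calling roundtrip_ok (0.0, 1.0) mirrors the values used in the original
report. To exercise the affected code path the file would need to be built
with -O2 -march=i686 -m32, as in the report; a single-threaded check like
this can only catch a wrong value, not a torn access under contention.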

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (37 preceding siblings ...)
  2021-07-20  4:41 ` ubizjak at gmail dot com
@ 2021-07-31 19:24 ` hjl.tools at gmail dot com
  2021-08-03 18:14 ` hjl.tools at gmail dot com
                   ` (2 subsequent siblings)
  41 siblings, 0 replies; 43+ messages in thread
From: hjl.tools at gmail dot com @ 2021-07-31 19:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #37 from H.J. Lu <hjl.tools at gmail dot com> ---
I still see the 32-bit test hang at random on a Skylake server:

(gdb) bt
#0  0xf7fc655d in __kernel_vsyscall ()
#1  0xf7bac46b in syscall () from /lib/libc.so.6
#2  0x0804995d in std::__detail::__platform_wait<int> (
    __addr=0x804d680 <std::__detail::__waiter_pool_base::_S_for(void
const*)::__w+1472>, __val=3)
    at
/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/32/libstdc++-v3/include/bits/atomic_wait.h:104
#3  0x08049e3a in std::__detail::__waiter_pool::_M_do_wait (__old=3, 
    __addr=0x804d680 <std::__detail::__waiter_pool_base::_S_for(void
const*)::__w+1472>, this=<optimized out>)
    at
/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/32/libstdc++-v3/include/bits/atomic_wait.h:261
#4  std::__detail::__waiter<std::integral_constant<bool, true>
>::_M_do_wait_v<unsigned int, std::__atomic_impl::wait<unsigned int>(unsigned
int const*, std::remove_volatile<unsigned int>::type,
std::memory_order)::{lambda()#1}>(unsigned int,
std::__atomic_impl::wait<unsigned int>(unsigned int const*,
std::remove_volatile<unsigned int>::type, std::memory_order)::{lambda()#1})
(__vfn=..., 
    __old=42, this=<synthetic pointer>)
    at
/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/32/libstdc++-v3/include/bits/atomic_wait.h:400
#5  std::__atomic_wait_address_v<unsigned int,
std::__atomic_impl::wait<unsigned int>(unsigned int const*,
std::remove_volatile<unsigned int>::type,
std::memory_order)::{lambda()#1}>(unsigned int const*, unsigned int,
std::__atomic_impl::wait<unsigned int>(unsigned int const*,
std::remove_volatile<unsigned int>::type,
std::memory_order)::{lambda()#1}) (__addr=0xffeb402c, __old=42, __vfn=...)
    at
/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/32/libstdc++-v3/include/bits/atomic_wait.h:430
#6  0x08049ef2 in std::__atomic_impl::wait<unsigned int> (
    __m=std::memory_order::seq_cst, __old=<optimized out>, 
    __ptr=<optimized out>)
    at
/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/32/libstdc++-v3/include/bits/atomic_base.h:1018
#7  std::__atomic_ref<unsigned int, true, false>::wait (
    __m=std::memory_order::seq_cst, __old=<optimized out>, this=0xffeb4034)
    at
/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/32/libstdc++-v3/include/bits/atomic_base.h:1570
#8  test<unsigned int> (va=0, vb=42)
    at
/export/gnu/import/git/gcc-test-master-intel64-native/src-master/libstdc++-v3/testsuite/29_atomics/atomic_ref/wait_notify.cc:44
#9  0x0804926b in main ()
    at
/export/gnu/import/git/gcc-test-master-intel64-native/src-master/libstdc++-v3/testsuite/29_atomics/atomic_ref/wait_notify.cc:54
(gdb) 

GCC was configured with

--with-arch=native --with-cpu=native --prefix=/usr/12.0.0 --enable-clocale=gnu
--with-system-zlib --enable-shared --enable-cet --with-demangler-in-ld
--enable-libmpx --with-multilib-list=m32,m64,mx32 --with-fpmath=sse

It happens about once every few weeks.

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (38 preceding siblings ...)
  2021-07-31 19:24 ` hjl.tools at gmail dot com
@ 2021-08-03 18:14 ` hjl.tools at gmail dot com
  2021-08-03 18:24 ` ubizjak at gmail dot com
  2021-08-03 18:30 ` hjl.tools at gmail dot com
  41 siblings, 0 replies; 43+ messages in thread
From: hjl.tools at gmail dot com @ 2021-08-03 18:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #38 from H.J. Lu <hjl.tools at gmail dot com> ---
This time it is 29_atomics/atomic_ref/wait_notify.cc in 64-bit mode on a
Skylake server:

(gdb) bt
#0  0x00007f897288cc1d in syscall () from /lib64/libc.so.6
#1  0x00000000004018be in std::__detail::__platform_wait<int> (
    __addr=__addr@entry=0x405400
<std::__detail::__waiter_pool_base::_S_for(void const*)::__w+832>,
__val=__val@entry=3)
    at
/export/users/hjl/build/gnu/tools-build/gcc-gitlab-native/build-x86_64-linux/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/atomic_wait.h:104
#2  0x00000000004022a3 in std::__detail::__waiter_pool::_M_do_wait (__old=3, 
    __addr=0x405400 <std::__detail::__waiter_pool_base::_S_for(void
const*)::__w+832>, this=<optimized out>)
    at
/export/users/hjl/build/gnu/tools-build/gcc-gitlab-native/build-x86_64-linux/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/atomic_wait.h:261
#3  std::__detail::__waiter<std::integral_constant<bool, true>
>::_M_do_wait_v<void*, std::__atomic_impl::wait<void*>(void* const*,
std::remove_volatile<void*>::type, std::memory_order)::{lambda()#1}>(void*,
std::__atomic_impl::wait<void*>(void* const*,
std::remove_volatile<void*>::type, std::memory_order)::{lambda()#1})
(__vfn=..., __old=0x2a, this=<synthetic pointer>)
    at
/export/users/hjl/build/gnu/tools-build/gcc-gitlab-native/build-x86_64-linux/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/atomic_wait.h:400
#4  std::__atomic_wait_address_v<void*, std::__atomic_impl::wait<void*>(void*
const*, std::remove_volatile<void*>::type,
std::memory_order)::{lambda()#1}>(void* const*, void*,
std::__atomic_impl::wait<void*>(void* const*,
std::remove_volatile<void*>::type, std::memory_order)::{lambda()#1})
(__addr=<optimized out>, 
    __old=0x2a, __vfn=...)
    at
/export/users/hjl/build/gnu/tools-build/gcc-gitlab-native/build-x86_64-linux/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/atomic_wait.h:430
#5  0x0000000000402366 in std::__atomic_impl::wait<void*> (
    __m=std::memory_order::seq_cst, __old=<optimized out>, 
    __ptr=<optimized out>)
    at
/export/users/hjl/build/gnu/tools-build/gcc-gitlab-native/build-x86_64-linux/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/atomic_base.h:1018
#6  std::__atomic_ref<void*, false, false>::wait (
    __m=std::memory_order::seq_cst, __old=<optimized out>, this=0x7ffed19864e8)
    at
/export/users/hjl/build/gnu/tools-build/gcc-gitlab-native/build-x86_64-linux/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/atomic_base.h:1874
#7  test<void*> (va=va@entry=0x0, vb=vb@entry=0x2a)
    at
/export/gnu/import/git/gitlab/x86-gcc/libstdc++-v3/testsuite/29_atomics/atomic_ref/wait_notify.cc:44
#8  0x000000000040125b in main ()
    at
/export/gnu/import/git/gitlab/x86-gcc/libstdc++-v3/testsuite/29_atomics/atomic_ref/wait_notify.cc:57
(gdb) 

GCC was configured with

--with-arch=native --with-cpu=native --prefix=/usr/12.0.0 --enable-clocale=gnu
--with-system-zlib --enable-shared --enable-cet --with-demangler-in-ld
--enable-libmpx --with-multilib-list=m32,m64,mx32 --with-fpmath=sse

This is very rare.

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (39 preceding siblings ...)
  2021-08-03 18:14 ` hjl.tools at gmail dot com
@ 2021-08-03 18:24 ` ubizjak at gmail dot com
  2021-08-03 18:30 ` hjl.tools at gmail dot com
  41 siblings, 0 replies; 43+ messages in thread
From: ubizjak at gmail dot com @ 2021-08-03 18:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #39 from Uroš Bizjak <ubizjak at gmail dot com> ---
Please open a new bug report; the failures in Comment 37 and Comment 38 have
nothing to do with r7-1112-gbeed3701c796842abbfb27d7484b35bd82818740, which was
fully reverted.

(FTR, the fixed peepholes were ineffective, so there was no loss in removing
them.)

* [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
  2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
                   ` (40 preceding siblings ...)
  2021-08-03 18:24 ` ubizjak at gmail dot com
@ 2021-08-03 18:30 ` hjl.tools at gmail dot com
  41 siblings, 0 replies; 43+ messages in thread
From: hjl.tools at gmail dot com @ 2021-08-03 18:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

--- Comment #40 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Uroš Bizjak from comment #39)
> Please open a new bug report; the failures in Comment 37 and Comment 38 have
> nothing to do with r7-1112-gbeed3701c796842abbfb27d7484b35bd82818740, which
> was fully reverted.
> 
> (FTR, the fixed peepholes were ineffective, so there was no loss in removing
> them.)

I opened PR 101761.

end of thread, other threads:[~2021-08-03 18:30 UTC | newest]

Thread overview: 43+ messages
2021-04-21 12:35 [Bug target/100182] New: [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc on i686 jakub at gcc dot gnu.org
2021-04-21 12:35 ` [Bug target/100182] " jakub at gcc dot gnu.org
2021-04-21 14:23 ` jakub at gcc dot gnu.org
2021-04-21 15:35 ` jakub at gcc dot gnu.org
2021-04-21 15:41 ` jakub at gcc dot gnu.org
2021-04-21 15:42 ` jakub at gcc dot gnu.org
2021-04-22  8:28 ` jakub at gcc dot gnu.org
2021-04-22 13:10 ` [Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc " cvs-commit at gcc dot gnu.org
2021-04-22 13:10 ` cvs-commit at gcc dot gnu.org
2021-04-22 17:53 ` ubizjak at gmail dot com
2021-04-22 17:58 ` jakub at gcc dot gnu.org
2021-04-22 18:09 ` ubizjak at gmail dot com
2021-04-22 18:41 ` ubizjak at gmail dot com
2021-04-23  6:13 ` ubizjak at gmail dot com
2021-04-23  7:40 ` jakub at gcc dot gnu.org
2021-04-23  7:52 ` ubizjak at gmail dot com
2021-04-23  7:54 ` ubizjak at gmail dot com
2021-04-23  7:56 ` jakub at gcc dot gnu.org
2021-04-23  8:02 ` ubizjak at gmail dot com
2021-04-23  8:13 ` ubizjak at gmail dot com
2021-04-23  8:25 ` jakub at gcc dot gnu.org
2021-04-23  8:36 ` jakub at gcc dot gnu.org
2021-04-23  8:41 ` jakub at gcc dot gnu.org
2021-04-23  9:20 ` ubizjak at gmail dot com
2021-04-23 15:30 ` cvs-commit at gcc dot gnu.org
2021-04-28 10:44 ` cvs-commit at gcc dot gnu.org
2021-04-28 13:33 ` cvs-commit at gcc dot gnu.org
2021-04-28 18:02 ` cvs-commit at gcc dot gnu.org
2021-04-28 18:02 ` cvs-commit at gcc dot gnu.org
2021-04-28 18:09 ` ubizjak at gmail dot com
2021-07-19 13:08 ` hjl.tools at gmail dot com
2021-07-19 14:40 ` ubizjak at gmail dot com
2021-07-19 22:06 ` hjl.tools at gmail dot com
2021-07-19 22:18 ` ubizjak at gmail dot com
2021-07-20  4:23 ` cvs-commit at gcc dot gnu.org
2021-07-20  4:30 ` cvs-commit at gcc dot gnu.org
2021-07-20  4:36 ` cvs-commit at gcc dot gnu.org
2021-07-20  4:39 ` cvs-commit at gcc dot gnu.org
2021-07-20  4:41 ` ubizjak at gmail dot com
2021-07-31 19:24 ` hjl.tools at gmail dot com
2021-08-03 18:14 ` hjl.tools at gmail dot com
2021-08-03 18:24 ` ubizjak at gmail dot com
2021-08-03 18:30 ` hjl.tools at gmail dot com
