public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)`
@ 2021-09-07  2:56 gabravier at gmail dot com
  2021-09-07  2:58 ` [Bug tree-optimization/102224] " gabravier at gmail dot com
                   ` (19 more replies)
  0 siblings, 20 replies; 21+ messages in thread
From: gabravier at gmail dot com @ 2021-09-07  2:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224

            Bug ID: 102224
           Summary: Incorrect compile on `x * copysign(1.0, x)`
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

float f(float x, float y)
{
    return x * copysignf(1.0f, x);
}

On latest trunk with -O3 on x86, this compiles to:

f(float):
  andps xmm0, XMMWORD PTR .LC0[rip]
  xorps xmm0, xmm0
  ret
.LC0:
  .long -2147483648
  .long 0
  .long 0
  .long 0

This is likely because the XOR_SIGN pattern emitted by
convert_expand_mult_copysign fails to check that x and y are not the same
(note: it can be optimized to ABS_EXPR in that case).

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/102224] Incorrect compile on `x * copysign(1.0, x)`
  2021-09-07  2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
@ 2021-09-07  2:58 ` gabravier at gmail dot com
  2021-09-07  2:59 ` gabravier at gmail dot com
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: gabravier at gmail dot com @ 2021-09-07  2:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224

--- Comment #1 from Gabriel Ravier <gabravier at gmail dot com> ---
(Note: this is a miscompile because it compiles as equivalent to `return 0;` as
that's what `xorps xmm0, xmm0` will do)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/102224] Incorrect compile on `x * copysign(1.0, x)`
  2021-09-07  2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
  2021-09-07  2:58 ` [Bug tree-optimization/102224] " gabravier at gmail dot com
@ 2021-09-07  2:59 ` gabravier at gmail dot com
  2021-09-07  3:01 ` gabravier at gmail dot com
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: gabravier at gmail dot com @ 2021-09-07  2:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224

--- Comment #2 from Gabriel Ravier <gabravier at gmail dot com> ---
(PS: by "x and y" I mean "the two arguments". If they're the same, GCC should
obviously just optimize this to an abs as that's what it ends up being)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/102224] Incorrect compile on `x * copysign(1.0, x)`
  2021-09-07  2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
  2021-09-07  2:58 ` [Bug tree-optimization/102224] " gabravier at gmail dot com
  2021-09-07  2:59 ` gabravier at gmail dot com
@ 2021-09-07  3:01 ` gabravier at gmail dot com
  2021-09-07  3:04 ` [Bug target/102224] " pinskia at gcc dot gnu.org
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: gabravier at gmail dot com @ 2021-09-07  3:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224

--- Comment #3 from Gabriel Ravier <gabravier at gmail dot com> ---
Also seems like this might be unique to x86 as this compiles fine on Aarch64
(though while it doesn't try to do anything stupid like xoring the result with
itself, it does still not optimize the XOR_SIGN to an ABS_EXPR at the GIMPLE
level).

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/102224] Incorrect compile on `x * copysign(1.0, x)`
  2021-09-07  2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
                   ` (2 preceding siblings ...)
  2021-09-07  3:01 ` gabravier at gmail dot com
@ 2021-09-07  3:04 ` pinskia at gcc dot gnu.org
  2021-09-07  3:06 ` [Bug target/102224] [12 regession] wrong code for " pinskia at gcc dot gnu.org
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-07  3:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |12.0
          Component|tree-optimization           |target
           Keywords|                            |wrong-code

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/102224] [12 regession] wrong code for `x * copysign(1.0,  x)`
  2021-09-07  2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
                   ` (3 preceding siblings ...)
  2021-09-07  3:04 ` [Bug target/102224] " pinskia at gcc dot gnu.org
@ 2021-09-07  3:06 ` pinskia at gcc dot gnu.org
  2021-09-07  3:07 ` gabravier at gmail dot com
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-07  3:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
There was just a recent patch which touched this in the x86 backend.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/102224] [12 regession] wrong code for `x * copysign(1.0,  x)`
  2021-09-07  2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
                   ` (4 preceding siblings ...)
  2021-09-07  3:06 ` [Bug target/102224] [12 regession] wrong code for " pinskia at gcc dot gnu.org
@ 2021-09-07  3:07 ` gabravier at gmail dot com
  2021-09-07  3:18 ` [Bug target/102224] [9/10/11/12 " gabravier at gmail dot com
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: gabravier at gmail dot com @ 2021-09-07  3:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224

--- Comment #5 from Gabriel Ravier <gabravier at gmail dot com> ---
Actually it seems to me like this is a GCC 9 regression, ever since this
pattern exists: GCC 9, 10 and 11 emit the exact same faulty code.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/102224] [9/10/11/12 regession] wrong code for `x * copysign(1.0, x)`
  2021-09-07  2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
                   ` (5 preceding siblings ...)
  2021-09-07  3:07 ` gabravier at gmail dot com
@ 2021-09-07  3:18 ` gabravier at gmail dot com
  2021-09-07  3:25 ` gabravier at gmail dot com
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: gabravier at gmail dot com @ 2021-09-07  3:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224

Gabriel Ravier <gabravier at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[12 regession] wrong code   |[9/10/11/12 regession]
                   |for `x * copysign(1.0, x)`  |wrong code for `x *
                   |                            |copysign(1.0, x)`

--- Comment #6 from Gabriel Ravier <gabravier at gmail dot com> ---
Actually, I've only gotten a snapshot from the 5th, which does not appear to
include HJ's patch from the 4th (which seems rather odd). Does it happen to fix
this ? I'd assume it does not, since that patch just seems to care about not
destructing the source and not about the emission of the xor that breaks this,
but I can't know for sure rn.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/102224] [9/10/11/12 regession] wrong code for `x * copysign(1.0, x)`
  2021-09-07  2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
                   ` (6 preceding siblings ...)
  2021-09-07  3:18 ` [Bug target/102224] [9/10/11/12 " gabravier at gmail dot com
@ 2021-09-07  3:25 ` gabravier at gmail dot com
  2021-09-07  6:37 ` rguenth at gcc dot gnu.org
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: gabravier at gmail dot com @ 2021-09-07  3:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224

--- Comment #7 from Gabriel Ravier <gabravier at gmail dot com> ---
Also, `-ffast-math` seems to "fix" this, since in that case the code is
recognized as an ABS_EXPR pattern and as such results in the same code being
emitted without the xor. Is there any reason this isn't the case without fast
math ? I'd assume the xor wouldn't do anything w.r.t. conformance, personally.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/102224] [9/10/11/12 regession] wrong code for `x * copysign(1.0, x)`
  2021-09-07  2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
                   ` (7 preceding siblings ...)
  2021-09-07  3:25 ` gabravier at gmail dot com
@ 2021-09-07  6:37 ` rguenth at gcc dot gnu.org
  2021-09-07  9:33 ` [Bug target/102224] [9/10/11/12 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f marxin at gcc dot gnu.org
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-09-07  6:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|12.0                        |9.5
           Priority|P3                          |P2
             Target|X86                         |x86_64-*-* i?86-*-*
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2021-09-07
     Ever confirmed|0                           |1

--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
Miscompiled with -O2, expanding from

  _3 = .XORSIGN (x_2(D), x_2(D)); [tail call]

and the issue is in ix86_split_xorsign which does

  dest = lowpart_subreg (vmode, dest, mode);
  x = gen_rtx_AND (vmode, dest, mask);
  emit_insn (gen_rtx_SET (dest, x));

  op0 = lowpart_subreg (vmode, op0, mode);
  x = gen_rtx_XOR (vmode, dest, op0);
  emit_insn (gen_rtx_SET (dest, x));

not catering for op0 == dest

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/102224] [9/10/11/12 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f
  2021-09-07  2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
                   ` (8 preceding siblings ...)
  2021-09-07  6:37 ` rguenth at gcc dot gnu.org
@ 2021-09-07  9:33 ` marxin at gcc dot gnu.org
  2021-09-07  9:51 ` jakub at gcc dot gnu.org
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: marxin at gcc dot gnu.org @ 2021-09-07  9:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[9/10/11/12 regession]      |[9/10/11/12 regession]
                   |wrong code for `x *         |wrong code for `x *
                   |copysign(1.0, x)`           |copysign(1.0, x)` since
                   |                            |r9-5298-g33142cf9cf82aa1f
                 CC|                            |hjl at gcc dot gnu.org,
                   |                            |marxin at gcc dot gnu.org

--- Comment #9 from Martin Liška <marxin at gcc dot gnu.org> ---
(In reply to Gabriel Ravier from comment #5)
> Actually it seems to me like this is a GCC 9 regression, ever since this
> pattern exists: GCC 9, 10 and 11 emit the exact same faulty code.

I can confirm that:

$ cat pr102224.c
float
__attribute__((noipa))
f(float x, float y)
{
    return x * __builtin_copysignf(1.0f, x);
}

int main()
{
  float a = 1.23f;
  volatile float b = f(a, a);

  __builtin_printf ("a: %.2f, b: %.2f\n", a, b);
  if (b != a)
    __builtin_abort ();

  return 0;
}

$ gcc pr102224.c -g -O3 && ./a.out
a: 1.23, b: 0.00
Aborted (core dumped)

Started with r9-5298-g33142cf9cf82aa1f.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/102224] [9/10/11/12 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f
  2021-09-07  2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
                   ` (9 preceding siblings ...)
  2021-09-07  9:33 ` [Bug target/102224] [9/10/11/12 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f marxin at gcc dot gnu.org
@ 2021-09-07  9:51 ` jakub at gcc dot gnu.org
  2021-09-07 11:25 ` jakub at gcc dot gnu.org
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-09-07  9:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The problem is actually about not catering to op0 == op1.
In that case what we really want is x & 0x80000000 rather than x ^ (x &
0x80000000).
Unfortunately after RA the mask needs to be already emitted, I think e.g.
because of PIC etc. we can't easily use a different constant at that point, and
for non-AVX pandn insn the negation is on the "0" constrained operand.
So
--- gcc/config/i386/i386-expand.c.jj    2021-09-06 14:47:43.184050185 +0200
+++ gcc/config/i386/i386-expand.c       2021-09-07 11:35:30.798849245 +0200
@@ -2289,12 +2289,20 @@ ix86_split_xorsign (rtx operands[])
   mode = GET_MODE (dest);
   vmode = GET_MODE (mask);

-  op1 = lowpart_subreg (vmode, op1, mode);
-  x = gen_rtx_AND (vmode, op1, mask);
-  emit_insn (gen_rtx_SET (op1, x));
+  if (rtx_equal_p (op0, op1))
+    {
+      op0 = lowpart_subreg (vmode, op0, mode);
+      x = gen_rtx_AND (vmode, op0, gen_rtx_NOT (vmode, mask));
+    }
+  else
+    {
+      op1 = lowpart_subreg (vmode, op1, mode);
+      x = gen_rtx_AND (vmode, op1, mask);
+      emit_insn (gen_rtx_SET (op1, x));

-  op0 = lowpart_subreg (vmode, op0, mode);
-  x = gen_rtx_XOR (vmode, op1, op0);
+      op0 = lowpart_subreg (vmode, op0, mode);
+      x = gen_rtx_XOR (vmode, op1, op0);
+    }

   dest = lowpart_subreg (vmode, dest, mode);
   emit_insn (gen_rtx_SET (dest, x));
doesn't work without -mavx.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/102224] [9/10/11/12 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f
  2021-09-07  2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
                   ` (10 preceding siblings ...)
  2021-09-07  9:51 ` jakub at gcc dot gnu.org
@ 2021-09-07 11:25 ` jakub at gcc dot gnu.org
  2021-09-07 18:53 ` pinskia at gcc dot gnu.org
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-09-07 11:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |jakub at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #11 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 51422
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51422&action=edit
gcc12-pr102224.patch

Untested fix.  It is actually a mess.
While we can during expansion emit efficient fabs code if the two input
operands are equal, we still need to ensure correctness for the case where the
operand equality is discovered only after expansion.
This patch does that by ensuring through early-clobber and constraints that
for pre-AVX dest==op1 is different from op0, because we want to overwrite op1
first with op1 & mask before using op0.  For AVX, the constraints ensure that
either all of dest, op0 and op1 are different, or any two of them are the same
but the third one is different.  If op0 and op1 are different, then it is ok
without further changes, the i386-expand.c changes are for the op0 == op1 case
where we've ensured that dest is different from the inputs - we can emit vpandn
but if the mask is in memory, we need to force it into a register and can use
dest for that.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/102224] [9/10/11/12 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f
  2021-09-07  2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
                   ` (11 preceding siblings ...)
  2021-09-07 11:25 ` jakub at gcc dot gnu.org
@ 2021-09-07 18:53 ` pinskia at gcc dot gnu.org
  2021-09-08  9:26 ` cvs-commit at gcc dot gnu.org
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-07 18:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224

--- Comment #12 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
*** Bug 102227 has been marked as a duplicate of this bug. ***

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/102224] [9/10/11/12 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f
  2021-09-07  2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
                   ` (12 preceding siblings ...)
  2021-09-07 18:53 ` pinskia at gcc dot gnu.org
@ 2021-09-08  9:26 ` cvs-commit at gcc dot gnu.org
  2021-09-08 10:08 ` [Bug target/102224] [9/10/11 " jakub at gcc dot gnu.org
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-09-08  9:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224

--- Comment #13 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:a7b626d98a9a821ffb33466818d6aa86cac1d6fd

commit r12-3413-ga7b626d98a9a821ffb33466818d6aa86cac1d6fd
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Sep 8 11:25:31 2021 +0200

    i386: Fix up @xorsign<mode>3_1 [PR102224]

    As the testcase shows, we miscompile @xorsign<mode>3_1 if both input
    operands are in the same register, because the splitter overwrites op1
    before with op1 & mask before using op0.

    For dest = xorsign op0, op0 we can actually simplify it from
    dest = (op0 & mask) ^ op0 to dest = op0 & ~mask (aka abs).

    The expander change is an optimization improvement, if we at expansion
    time know it is xorsign op0, op0, we can emit abs right away and get better
    code through that.

    The @xorsign<mode>3_1 is a fix for the case where xorsign wouldn't be known
    to have same operands during expansion, but during RTL optimizations they
    would appear.  For non-AVX we need to use earlyclobber, we require
    dest and op1 to be the same but op0 must be different because we overwrite
    op1 first.  For AVX the constraints ensure that at most 2 of the 3 operands
    may be the same register and if both inputs are the same, handles that
case.
    This case can be easily tested with the xorsign<mode>3 expander change
    reverted.

    Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

    Thinking about it more this morning, while this patch fixes the problems
    revealed in the testcase, the recent PR89984 change was buggy too, but
    perhaps that can be fixed incrementally.  Because for AVX the new code
    destructively modifies op1.  If that is different from dest, say on:
    float
    foo (float x, float y)
    {
      return x * __builtin_copysignf (1.0f, y) + y;
    }
    then we get after RA:
    (insn 8 7 9 2 (set (reg:SF 20 xmm0 [orig:82 _2 ] [82])
            (unspec:SF [
                    (reg:SF 20 xmm0 [88])
                    (reg:SF 21 xmm1 [89])
                    (mem/u/c:V4SF (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0 
S16 A128])
                ] UNSPEC_XORSIGN)) "hohoho.c":4:12 649 {xorsignsf3_1}
         (nil))
    (insn 9 8 15 2 (set (reg:SF 20 xmm0 [87])
            (plus:SF (reg:SF 20 xmm0 [orig:82 _2 ] [82])
                (reg:SF 21 xmm1 [89]))) "hohoho.c":4:44 1021 {*fop_sf_comm}
         (nil))
    but split the xorsign into:
            vandps  .LC0(%rip), %xmm1, %xmm1
            vxorps  %xmm0, %xmm1, %xmm0
    and then the addition:
            vaddss  %xmm1, %xmm0, %xmm0
    which means we miscompile it - instead of adding y in the end we add
    __builtin_copysignf (0.0f, y).
    So, wonder if we don't want instead in addition to the &Yv <- Yv, 0
    alternative (enabled for both pre-AVX and AVX as in this patch) the
    &Yv <- Yv, Yv where destination must be different from inputs and another
    Yv <- Yv, Yv where it can be the same but then need a match_scratch
    (with X for the other alternatives and =Yv for the last one).
    That way we'd always have a safe register we can store the op1 & mask
    value into, either the destination (in the first alternative known to
    be equal to op1 which is needed for non-AVX but ok for AVX too), in the
    second alternative known to be different from both inputs and in the third
    which could be used for those
    float bar (float x, float y) { return x * __builtin_copysignf (1.0f, y); }
    cases where op1 is naturally xmm1 and dest == op0 naturally xmm0 we'd use
    some other register like xmm2.

    2021-09-08  Jakub Jelinek  <jakub@redhat.com>

            PR target/102224
            * config/i386/i386.md (xorsign<mode>3): If operands[1] is equal to
            operands[2], emit abs<mode>2 instead.
            (@xorsign<mode>3_1): Add early-clobbers for output operand, enable
            first alternative even for avx, add another alternative with
            =&Yv <- 0, Yv, Yvm constraints.
            * config/i386/i386-expand.c (ix86_split_xorsign): If op0 is equal
            to op1, emit vpandn instead.

            * gcc.dg/pr102224.c: New test.
            * gcc.target/i386/avx-pr102224.c: New test.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/102224] [9/10/11 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f
  2021-09-07  2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
                   ` (13 preceding siblings ...)
  2021-09-08  9:26 ` cvs-commit at gcc dot gnu.org
@ 2021-09-08 10:08 ` jakub at gcc dot gnu.org
  2021-09-08 18:02 ` cvs-commit at gcc dot gnu.org
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-09-08 10:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[9/10/11/12 regession]      |[9/10/11 regession] wrong
                   |wrong code for `x *         |code for `x * copysign(1.0,
                   |copysign(1.0, x)` since     |x)` since
                   |r9-5298-g33142cf9cf82aa1f   |r9-5298-g33142cf9cf82aa1f

--- Comment #14 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed on the trunk.  The 11 etc. backports will be somewhat different, because
the PR89984 change aren't and won't be on those branches.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/102224] [9/10/11 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f
  2021-09-07  2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
                   ` (14 preceding siblings ...)
  2021-09-08 10:08 ` [Bug target/102224] [9/10/11 " jakub at gcc dot gnu.org
@ 2021-09-08 18:02 ` cvs-commit at gcc dot gnu.org
  2021-09-08 18:03 ` [Bug target/102224] [9/10 " jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-09-08 18:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224

--- Comment #15 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:cb5690b8d2ce84fb943535bea0d587863cf57753

commit r11-8973-gcb5690b8d2ce84fb943535bea0d587863cf57753
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Sep 8 11:25:31 2021 +0200

    i386: Fix up @xorsign<mode>3_1 [PR102224]

    As the testcase shows, we miscompile @xorsign<mode>3_1 if both input
    operands are in the same register, because the splitter overwrites op1
    before with op1 & mask before using op0.

    For dest = xorsign op0, op0 we can actually simplify it from
    dest = (op0 & mask) ^ op0 to dest = op0 & ~mask (aka abs).

    The expander change is an optimization improvement, if we at expansion
    time know it is xorsign op0, op0, we can emit abs right away and get better
    code through that.

    The @xorsign<mode>3_1 is a fix for the case where xorsign wouldn't be known
    to have same operands during expansion, but during RTL optimizations they
    would appear.  We need to use earlyclobber, we require dest and op1 to be
    the same but op0 must be different because we overwrite
    op1 first.

    2021-09-08  Jakub Jelinek  <jakub@redhat.com>

            PR target/102224
            * config/i386/i386.md (xorsign<mode>3): If operands[1] is equal to
            operands[2], emit abs<mode>2 instead.
            (@xorsign<mode>3_1): Add early-clobber for output operand.

            * gcc.dg/pr102224.c: New test.
            * gcc.target/i386/avx-pr102224.c: New test.

    (cherry picked from commit a7b626d98a9a821ffb33466818d6aa86cac1d6fd)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/102224] [9/10 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f
  2021-09-07  2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
                   ` (15 preceding siblings ...)
  2021-09-08 18:02 ` cvs-commit at gcc dot gnu.org
@ 2021-09-08 18:03 ` jakub at gcc dot gnu.org
  2022-05-10  8:20 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-09-08 18:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[9/10/11 regession] wrong   |[9/10 regession] wrong code
                   |code for `x * copysign(1.0, |for `x * copysign(1.0, x)`
                   |x)` since                   |since
                   |r9-5298-g33142cf9cf82aa1f   |r9-5298-g33142cf9cf82aa1f

--- Comment #16 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed for 11.3 too.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/102224] [9/10 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f
  2021-09-07  2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
                   ` (16 preceding siblings ...)
  2021-09-08 18:03 ` [Bug target/102224] [9/10 " jakub at gcc dot gnu.org
@ 2022-05-10  8:20 ` cvs-commit at gcc dot gnu.org
  2022-05-11  6:22 ` cvs-commit at gcc dot gnu.org
  2022-05-11  6:36 ` jakub at gcc dot gnu.org
  19 siblings, 0 replies; 21+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-05-10  8:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224

--- Comment #17 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-10 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:c7b00fbc469bc9c0a486b48bc349dba13881241e

commit r10-10641-gc7b00fbc469bc9c0a486b48bc349dba13881241e
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Sep 8 11:25:31 2021 +0200

    i386: Fix up @xorsign<mode>3_1 [PR102224]

    As the testcase shows, we miscompile @xorsign<mode>3_1 if both input
    operands are in the same register, because the splitter overwrites op1
    before with op1 & mask before using op0.

    For dest = xorsign op0, op0 we can actually simplify it from
    dest = (op0 & mask) ^ op0 to dest = op0 & ~mask (aka abs).

    The expander change is an optimization improvement, if we at expansion
    time know it is xorsign op0, op0, we can emit abs right away and get better
    code through that.

    The @xorsign<mode>3_1 is a fix for the case where xorsign wouldn't be known
    to have same operands during expansion, but during RTL optimizations they
    would appear.  We need to use earlyclobber, we require dest and op1 to be
    the same but op0 must be different because we overwrite
    op1 first.

    2021-09-08  Jakub Jelinek  <jakub@redhat.com>

            PR target/102224
            * config/i386/i386.md (xorsign<mode>3): If operands[1] is equal to
            operands[2], emit abs<mode>2 instead.
            (@xorsign<mode>3_1): Add early-clobber for output operand.

            * gcc.dg/pr102224.c: New test.
            * gcc.target/i386/avx-pr102224.c: New test.

    (cherry picked from commit a7b626d98a9a821ffb33466818d6aa86cac1d6fd)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/102224] [9/10 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f
  2021-09-07  2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
                   ` (17 preceding siblings ...)
  2022-05-10  8:20 ` cvs-commit at gcc dot gnu.org
@ 2022-05-11  6:22 ` cvs-commit at gcc dot gnu.org
  2022-05-11  6:36 ` jakub at gcc dot gnu.org
  19 siblings, 0 replies; 21+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-05-11  6:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224

--- Comment #18 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-9 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:45579f290157b62a36752ed6df4fbcb7c742fb08

commit r9-10098-g45579f290157b62a36752ed6df4fbcb7c742fb08
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Sep 8 11:25:31 2021 +0200

    i386: Fix up @xorsign<mode>3_1 [PR102224]

    As the testcase shows, we miscompile @xorsign<mode>3_1 if both input
    operands are in the same register, because the splitter overwrites op1
    before with op1 & mask before using op0.

    For dest = xorsign op0, op0 we can actually simplify it from
    dest = (op0 & mask) ^ op0 to dest = op0 & ~mask (aka abs).

    The expander change is an optimization improvement, if we at expansion
    time know it is xorsign op0, op0, we can emit abs right away and get better
    code through that.

    The @xorsign<mode>3_1 is a fix for the case where xorsign wouldn't be known
    to have same operands during expansion, but during RTL optimizations they
    would appear.  We need to use earlyclobber, we require dest and op1 to be
    the same but op0 must be different because we overwrite
    op1 first.

    2021-09-08  Jakub Jelinek  <jakub@redhat.com>

            PR target/102224
            * config/i386/i386.md (xorsign<mode>3): If operands[1] is equal to
            operands[2], emit abs<mode>2 instead.
            (@xorsign<mode>3_1): Add early-clobber for output operand.

            * gcc.dg/pr102224.c: New test.
            * gcc.target/i386/avx-pr102224.c: New test.

    (cherry picked from commit a7b626d98a9a821ffb33466818d6aa86cac1d6fd)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/102224] [9/10 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f
  2021-09-07  2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
                   ` (18 preceding siblings ...)
  2022-05-11  6:22 ` cvs-commit at gcc dot gnu.org
@ 2022-05-11  6:36 ` jakub at gcc dot gnu.org
  19 siblings, 0 replies; 21+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-05-11  6:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #19 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed.

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2022-05-11  6:36 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-07  2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
2021-09-07  2:58 ` [Bug tree-optimization/102224] " gabravier at gmail dot com
2021-09-07  2:59 ` gabravier at gmail dot com
2021-09-07  3:01 ` gabravier at gmail dot com
2021-09-07  3:04 ` [Bug target/102224] " pinskia at gcc dot gnu.org
2021-09-07  3:06 ` [Bug target/102224] [12 regession] wrong code for " pinskia at gcc dot gnu.org
2021-09-07  3:07 ` gabravier at gmail dot com
2021-09-07  3:18 ` [Bug target/102224] [9/10/11/12 " gabravier at gmail dot com
2021-09-07  3:25 ` gabravier at gmail dot com
2021-09-07  6:37 ` rguenth at gcc dot gnu.org
2021-09-07  9:33 ` [Bug target/102224] [9/10/11/12 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f marxin at gcc dot gnu.org
2021-09-07  9:51 ` jakub at gcc dot gnu.org
2021-09-07 11:25 ` jakub at gcc dot gnu.org
2021-09-07 18:53 ` pinskia at gcc dot gnu.org
2021-09-08  9:26 ` cvs-commit at gcc dot gnu.org
2021-09-08 10:08 ` [Bug target/102224] [9/10/11 " jakub at gcc dot gnu.org
2021-09-08 18:02 ` cvs-commit at gcc dot gnu.org
2021-09-08 18:03 ` [Bug target/102224] [9/10 " jakub at gcc dot gnu.org
2022-05-10  8:20 ` cvs-commit at gcc dot gnu.org
2022-05-11  6:22 ` cvs-commit at gcc dot gnu.org
2022-05-11  6:36 ` jakub at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).