public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)`
@ 2021-09-07 2:56 gabravier at gmail dot com
2021-09-07 2:58 ` [Bug tree-optimization/102224] " gabravier at gmail dot com
` (19 more replies)
0 siblings, 20 replies; 21+ messages in thread
From: gabravier at gmail dot com @ 2021-09-07 2:56 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224
Bug ID: 102224
Summary: Incorrect compile on `x * copysign(1.0, x)`
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: gabravier at gmail dot com
Target Milestone: ---
float f(float x, float y)
{
return x * copysignf(1.0f, x);
}
On latest trunk with -O3 on x86, this compiles to:
f(float):
andps xmm0, XMMWORD PTR .LC0[rip]
xorps xmm0, xmm0
ret
.LC0:
.long -2147483648
.long 0
.long 0
.long 0
This is likely because the XOR_SIGN pattern emitted by
convert_expand_mult_copysign fails to check that x and y are not the same
(note: it can be optimized to ABS_EXPR in that case).
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug tree-optimization/102224] Incorrect compile on `x * copysign(1.0, x)`
2021-09-07 2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
@ 2021-09-07 2:58 ` gabravier at gmail dot com
2021-09-07 2:59 ` gabravier at gmail dot com
` (18 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: gabravier at gmail dot com @ 2021-09-07 2:58 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224
--- Comment #1 from Gabriel Ravier <gabravier at gmail dot com> ---
(Note: this is a miscompile because it compiles as equivalent to `return 0;` as
that's what `xorps xmm0, xmm0` will do)
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug tree-optimization/102224] Incorrect compile on `x * copysign(1.0, x)`
2021-09-07 2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
2021-09-07 2:58 ` [Bug tree-optimization/102224] " gabravier at gmail dot com
@ 2021-09-07 2:59 ` gabravier at gmail dot com
2021-09-07 3:01 ` gabravier at gmail dot com
` (17 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: gabravier at gmail dot com @ 2021-09-07 2:59 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224
--- Comment #2 from Gabriel Ravier <gabravier at gmail dot com> ---
(PS: by "x and y" I mean "the two arguments". If they're the same, GCC should
obviously just optimize this to an abs as that's what it ends up being)
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug tree-optimization/102224] Incorrect compile on `x * copysign(1.0, x)`
2021-09-07 2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
2021-09-07 2:58 ` [Bug tree-optimization/102224] " gabravier at gmail dot com
2021-09-07 2:59 ` gabravier at gmail dot com
@ 2021-09-07 3:01 ` gabravier at gmail dot com
2021-09-07 3:04 ` [Bug target/102224] " pinskia at gcc dot gnu.org
` (16 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: gabravier at gmail dot com @ 2021-09-07 3:01 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224
--- Comment #3 from Gabriel Ravier <gabravier at gmail dot com> ---
Also seems like this might be unique to x86 as this compiles fine on Aarch64
(though while it doesn't try to do anything stupid like xoring the result with
itself, it does still not optimize the XOR_SIGN to an ABS_EXPR at the GIMPLE
level).
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug target/102224] Incorrect compile on `x * copysign(1.0, x)`
2021-09-07 2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
` (2 preceding siblings ...)
2021-09-07 3:01 ` gabravier at gmail dot com
@ 2021-09-07 3:04 ` pinskia at gcc dot gnu.org
2021-09-07 3:06 ` [Bug target/102224] [12 regession] wrong code for " pinskia at gcc dot gnu.org
` (15 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-07 3:04 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|--- |12.0
Component|tree-optimization |target
Keywords| |wrong-code
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug target/102224] [12 regession] wrong code for `x * copysign(1.0, x)`
2021-09-07 2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
` (3 preceding siblings ...)
2021-09-07 3:04 ` [Bug target/102224] " pinskia at gcc dot gnu.org
@ 2021-09-07 3:06 ` pinskia at gcc dot gnu.org
2021-09-07 3:07 ` gabravier at gmail dot com
` (14 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-07 3:06 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
There was just a recent patch which touched this in the x86 backend.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug target/102224] [12 regession] wrong code for `x * copysign(1.0, x)`
2021-09-07 2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
` (4 preceding siblings ...)
2021-09-07 3:06 ` [Bug target/102224] [12 regession] wrong code for " pinskia at gcc dot gnu.org
@ 2021-09-07 3:07 ` gabravier at gmail dot com
2021-09-07 3:18 ` [Bug target/102224] [9/10/11/12 " gabravier at gmail dot com
` (13 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: gabravier at gmail dot com @ 2021-09-07 3:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224
--- Comment #5 from Gabriel Ravier <gabravier at gmail dot com> ---
Actually it seems to me like this is a GCC 9 regression, ever since this
pattern exists: GCC 9, 10 and 11 emit the exact same faulty code.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug target/102224] [9/10/11/12 regession] wrong code for `x * copysign(1.0, x)`
2021-09-07 2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
` (5 preceding siblings ...)
2021-09-07 3:07 ` gabravier at gmail dot com
@ 2021-09-07 3:18 ` gabravier at gmail dot com
2021-09-07 3:25 ` gabravier at gmail dot com
` (12 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: gabravier at gmail dot com @ 2021-09-07 3:18 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224
Gabriel Ravier <gabravier at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|[12 regession] wrong code |[9/10/11/12 regession]
|for `x * copysign(1.0, x)` |wrong code for `x *
| |copysign(1.0, x)`
--- Comment #6 from Gabriel Ravier <gabravier at gmail dot com> ---
Actually, I've only gotten a snapshot from the 5th, which does not appear to
include HJ's patch from the 4th (which seems rather odd). Does it happen to fix
this ? I'd assume it does not, since that patch just seems to care about not
destructing the source and not about the emission of the xor that breaks this,
but I can't know for sure rn.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug target/102224] [9/10/11/12 regession] wrong code for `x * copysign(1.0, x)`
2021-09-07 2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
` (6 preceding siblings ...)
2021-09-07 3:18 ` [Bug target/102224] [9/10/11/12 " gabravier at gmail dot com
@ 2021-09-07 3:25 ` gabravier at gmail dot com
2021-09-07 6:37 ` rguenth at gcc dot gnu.org
` (11 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: gabravier at gmail dot com @ 2021-09-07 3:25 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224
--- Comment #7 from Gabriel Ravier <gabravier at gmail dot com> ---
Also, `-ffast-math` seems to "fix" this, since in that case the code is
recognized as an ABS_EXPR pattern and as such results in the same code being
emitted without the xor. Is there any reason this isn't the case without fast
math ? I'd assume the xor wouldn't do anything w.r.t. conformance, personally.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug target/102224] [9/10/11/12 regession] wrong code for `x * copysign(1.0, x)`
2021-09-07 2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
` (7 preceding siblings ...)
2021-09-07 3:25 ` gabravier at gmail dot com
@ 2021-09-07 6:37 ` rguenth at gcc dot gnu.org
2021-09-07 9:33 ` [Bug target/102224] [9/10/11/12 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f marxin at gcc dot gnu.org
` (10 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-09-07 6:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|12.0 |9.5
Priority|P3 |P2
Target|X86 |x86_64-*-* i?86-*-*
Status|UNCONFIRMED |NEW
Last reconfirmed| |2021-09-07
Ever confirmed|0 |1
--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
Miscompiled with -O2, expanding from
_3 = .XORSIGN (x_2(D), x_2(D)); [tail call]
and the issue is in ix86_split_xorsign which does
dest = lowpart_subreg (vmode, dest, mode);
x = gen_rtx_AND (vmode, dest, mask);
emit_insn (gen_rtx_SET (dest, x));
op0 = lowpart_subreg (vmode, op0, mode);
x = gen_rtx_XOR (vmode, dest, op0);
emit_insn (gen_rtx_SET (dest, x));
not catering for op0 == dest
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug target/102224] [9/10/11/12 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f
2021-09-07 2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
` (8 preceding siblings ...)
2021-09-07 6:37 ` rguenth at gcc dot gnu.org
@ 2021-09-07 9:33 ` marxin at gcc dot gnu.org
2021-09-07 9:51 ` jakub at gcc dot gnu.org
` (9 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: marxin at gcc dot gnu.org @ 2021-09-07 9:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224
Martin Liška <marxin at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|[9/10/11/12 regession] |[9/10/11/12 regession]
|wrong code for `x * |wrong code for `x *
|copysign(1.0, x)` |copysign(1.0, x)` since
| |r9-5298-g33142cf9cf82aa1f
CC| |hjl at gcc dot gnu.org,
| |marxin at gcc dot gnu.org
--- Comment #9 from Martin Liška <marxin at gcc dot gnu.org> ---
(In reply to Gabriel Ravier from comment #5)
> Actually it seems to me like this is a GCC 9 regression, ever since this
> pattern exists: GCC 9, 10 and 11 emit the exact same faulty code.
I can confirm that:
$ cat pr102224.c
float
__attribute__((noipa))
f(float x, float y)
{
return x * __builtin_copysignf(1.0f, x);
}
int main()
{
float a = 1.23f;
volatile float b = f(a, a);
__builtin_printf ("a: %.2f, b: %.2f\n", a, b);
if (b != a)
__builtin_abort ();
return 0;
}
$ gcc pr102224.c -g -O3 && ./a.out
a: 1.23, b: 0.00
Aborted (core dumped)
Started with r9-5298-g33142cf9cf82aa1f.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug target/102224] [9/10/11/12 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f
2021-09-07 2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
` (9 preceding siblings ...)
2021-09-07 9:33 ` [Bug target/102224] [9/10/11/12 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f marxin at gcc dot gnu.org
@ 2021-09-07 9:51 ` jakub at gcc dot gnu.org
2021-09-07 11:25 ` jakub at gcc dot gnu.org
` (8 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-09-07 9:51 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jakub at gcc dot gnu.org
--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The problem is actually about not catering to op0 == op1.
In that case what we really want is x & 0x80000000 rather than x ^ (x &
0x80000000).
Unfortunately after RA the mask needs to be already emitted, I think e.g.
because of PIC etc. we can't easily use a different constant at that point, and
for non-AVX pandn insn the negation is on the "0" constrained operand.
So
--- gcc/config/i386/i386-expand.c.jj 2021-09-06 14:47:43.184050185 +0200
+++ gcc/config/i386/i386-expand.c 2021-09-07 11:35:30.798849245 +0200
@@ -2289,12 +2289,20 @@ ix86_split_xorsign (rtx operands[])
mode = GET_MODE (dest);
vmode = GET_MODE (mask);
- op1 = lowpart_subreg (vmode, op1, mode);
- x = gen_rtx_AND (vmode, op1, mask);
- emit_insn (gen_rtx_SET (op1, x));
+ if (rtx_equal_p (op0, op1))
+ {
+ op0 = lowpart_subreg (vmode, op0, mode);
+ x = gen_rtx_AND (vmode, op0, gen_rtx_NOT (vmode, mask));
+ }
+ else
+ {
+ op1 = lowpart_subreg (vmode, op1, mode);
+ x = gen_rtx_AND (vmode, op1, mask);
+ emit_insn (gen_rtx_SET (op1, x));
- op0 = lowpart_subreg (vmode, op0, mode);
- x = gen_rtx_XOR (vmode, op1, op0);
+ op0 = lowpart_subreg (vmode, op0, mode);
+ x = gen_rtx_XOR (vmode, op1, op0);
+ }
dest = lowpart_subreg (vmode, dest, mode);
emit_insn (gen_rtx_SET (dest, x));
doesn't work without -mavx.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug target/102224] [9/10/11/12 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f
2021-09-07 2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
` (10 preceding siblings ...)
2021-09-07 9:51 ` jakub at gcc dot gnu.org
@ 2021-09-07 11:25 ` jakub at gcc dot gnu.org
2021-09-07 18:53 ` pinskia at gcc dot gnu.org
` (7 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-09-07 11:25 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org
Status|NEW |ASSIGNED
--- Comment #11 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 51422
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51422&action=edit
gcc12-pr102224.patch
Untested fix. It is actually a mess.
While we can during expansion emit efficient fabs code if the two input
operands are equal, we still need to ensure correctness for the case where the
operand equality is discovered only after expansion.
This patch does that by ensuring through early-clobber and constraints that
for pre-AVX dest==op1 is different from op0, because we want to overwrite op1
first with op1 & mask before using op0. For AVX, the constraints ensure that
either all of dest, op0 and op1 are different, or any two of them are the same
but the third one is different. If op0 and op1 are different, then it is ok
without further changes, the i386-expand.c changes are for the op0 == op1 case
where we've ensured that dest is different from the inputs - we can emit vpandn
but if the mask is in memory, we need to force it into a register and can use
dest for that.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug target/102224] [9/10/11/12 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f
2021-09-07 2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
` (11 preceding siblings ...)
2021-09-07 11:25 ` jakub at gcc dot gnu.org
@ 2021-09-07 18:53 ` pinskia at gcc dot gnu.org
2021-09-08 9:26 ` cvs-commit at gcc dot gnu.org
` (6 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-07 18:53 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224
--- Comment #12 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
*** Bug 102227 has been marked as a duplicate of this bug. ***
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug target/102224] [9/10/11/12 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f
2021-09-07 2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
` (12 preceding siblings ...)
2021-09-07 18:53 ` pinskia at gcc dot gnu.org
@ 2021-09-08 9:26 ` cvs-commit at gcc dot gnu.org
2021-09-08 10:08 ` [Bug target/102224] [9/10/11 " jakub at gcc dot gnu.org
` (5 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-09-08 9:26 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224
--- Comment #13 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:a7b626d98a9a821ffb33466818d6aa86cac1d6fd
commit r12-3413-ga7b626d98a9a821ffb33466818d6aa86cac1d6fd
Author: Jakub Jelinek <jakub@redhat.com>
Date: Wed Sep 8 11:25:31 2021 +0200
i386: Fix up @xorsign<mode>3_1 [PR102224]
As the testcase shows, we miscompile @xorsign<mode>3_1 if both input
operands are in the same register, because the splitter overwrites op1
before with op1 & mask before using op0.
For dest = xorsign op0, op0 we can actually simplify it from
dest = (op0 & mask) ^ op0 to dest = op0 & ~mask (aka abs).
The expander change is an optimization improvement, if we at expansion
time know it is xorsign op0, op0, we can emit abs right away and get better
code through that.
The @xorsign<mode>3_1 is a fix for the case where xorsign wouldn't be known
to have same operands during expansion, but during RTL optimizations they
would appear. For non-AVX we need to use earlyclobber, we require
dest and op1 to be the same but op0 must be different because we overwrite
op1 first. For AVX the constraints ensure that at most 2 of the 3 operands
may be the same register and if both inputs are the same, handles that
case.
This case can be easily tested with the xorsign<mode>3 expander change
reverted.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
Thinking about it more this morning, while this patch fixes the problems
revealed in the testcase, the recent PR89984 change was buggy too, but
perhaps that can be fixed incrementally. Because for AVX the new code
destructively modifies op1. If that is different from dest, say on:
float
foo (float x, float y)
{
return x * __builtin_copysignf (1.0f, y) + y;
}
then we get after RA:
(insn 8 7 9 2 (set (reg:SF 20 xmm0 [orig:82 _2 ] [82])
(unspec:SF [
(reg:SF 20 xmm0 [88])
(reg:SF 21 xmm1 [89])
(mem/u/c:V4SF (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0
S16 A128])
] UNSPEC_XORSIGN)) "hohoho.c":4:12 649 {xorsignsf3_1}
(nil))
(insn 9 8 15 2 (set (reg:SF 20 xmm0 [87])
(plus:SF (reg:SF 20 xmm0 [orig:82 _2 ] [82])
(reg:SF 21 xmm1 [89]))) "hohoho.c":4:44 1021 {*fop_sf_comm}
(nil))
but split the xorsign into:
vandps .LC0(%rip), %xmm1, %xmm1
vxorps %xmm0, %xmm1, %xmm0
and then the addition:
vaddss %xmm1, %xmm0, %xmm0
which means we miscompile it - instead of adding y in the end we add
__builtin_copysignf (0.0f, y).
So, wonder if we don't want instead in addition to the &Yv <- Yv, 0
alternative (enabled for both pre-AVX and AVX as in this patch) the
&Yv <- Yv, Yv where destination must be different from inputs and another
Yv <- Yv, Yv where it can be the same but then need a match_scratch
(with X for the other alternatives and =Yv for the last one).
That way we'd always have a safe register we can store the op1 & mask
value into, either the destination (in the first alternative known to
be equal to op1 which is needed for non-AVX but ok for AVX too), in the
second alternative known to be different from both inputs and in the third
which could be used for those
float bar (float x, float y) { return x * __builtin_copysignf (1.0f, y); }
cases where op1 is naturally xmm1 and dest == op0 naturally xmm0 we'd use
some other register like xmm2.
2021-09-08 Jakub Jelinek <jakub@redhat.com>
PR target/102224
* config/i386/i386.md (xorsign<mode>3): If operands[1] is equal to
operands[2], emit abs<mode>2 instead.
(@xorsign<mode>3_1): Add early-clobbers for output operand, enable
first alternative even for avx, add another alternative with
=&Yv <- 0, Yv, Yvm constraints.
* config/i386/i386-expand.c (ix86_split_xorsign): If op0 is equal
to op1, emit vpandn instead.
* gcc.dg/pr102224.c: New test.
* gcc.target/i386/avx-pr102224.c: New test.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug target/102224] [9/10/11 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f
2021-09-07 2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
` (13 preceding siblings ...)
2021-09-08 9:26 ` cvs-commit at gcc dot gnu.org
@ 2021-09-08 10:08 ` jakub at gcc dot gnu.org
2021-09-08 18:02 ` cvs-commit at gcc dot gnu.org
` (4 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-09-08 10:08 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|[9/10/11/12 regession] |[9/10/11 regession] wrong
|wrong code for `x * |code for `x * copysign(1.0,
|copysign(1.0, x)` since |x)` since
|r9-5298-g33142cf9cf82aa1f |r9-5298-g33142cf9cf82aa1f
--- Comment #14 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed on the trunk. The 11 etc. backports will be somewhat different, because
the PR89984 change aren't and won't be on those branches.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug target/102224] [9/10/11 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f
2021-09-07 2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
` (14 preceding siblings ...)
2021-09-08 10:08 ` [Bug target/102224] [9/10/11 " jakub at gcc dot gnu.org
@ 2021-09-08 18:02 ` cvs-commit at gcc dot gnu.org
2021-09-08 18:03 ` [Bug target/102224] [9/10 " jakub at gcc dot gnu.org
` (3 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-09-08 18:02 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224
--- Comment #15 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:cb5690b8d2ce84fb943535bea0d587863cf57753
commit r11-8973-gcb5690b8d2ce84fb943535bea0d587863cf57753
Author: Jakub Jelinek <jakub@redhat.com>
Date: Wed Sep 8 11:25:31 2021 +0200
i386: Fix up @xorsign<mode>3_1 [PR102224]
As the testcase shows, we miscompile @xorsign<mode>3_1 if both input
operands are in the same register, because the splitter overwrites op1
before with op1 & mask before using op0.
For dest = xorsign op0, op0 we can actually simplify it from
dest = (op0 & mask) ^ op0 to dest = op0 & ~mask (aka abs).
The expander change is an optimization improvement, if we at expansion
time know it is xorsign op0, op0, we can emit abs right away and get better
code through that.
The @xorsign<mode>3_1 is a fix for the case where xorsign wouldn't be known
to have same operands during expansion, but during RTL optimizations they
would appear. We need to use earlyclobber, we require dest and op1 to be
the same but op0 must be different because we overwrite
op1 first.
2021-09-08 Jakub Jelinek <jakub@redhat.com>
PR target/102224
* config/i386/i386.md (xorsign<mode>3): If operands[1] is equal to
operands[2], emit abs<mode>2 instead.
(@xorsign<mode>3_1): Add early-clobber for output operand.
* gcc.dg/pr102224.c: New test.
* gcc.target/i386/avx-pr102224.c: New test.
(cherry picked from commit a7b626d98a9a821ffb33466818d6aa86cac1d6fd)
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug target/102224] [9/10 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f
2021-09-07 2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
` (15 preceding siblings ...)
2021-09-08 18:02 ` cvs-commit at gcc dot gnu.org
@ 2021-09-08 18:03 ` jakub at gcc dot gnu.org
2022-05-10 8:20 ` cvs-commit at gcc dot gnu.org
` (2 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-09-08 18:03 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|[9/10/11 regession] wrong |[9/10 regession] wrong code
|code for `x * copysign(1.0, |for `x * copysign(1.0, x)`
|x)` since |since
|r9-5298-g33142cf9cf82aa1f |r9-5298-g33142cf9cf82aa1f
--- Comment #16 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed for 11.3 too.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug target/102224] [9/10 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f
2021-09-07 2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
` (16 preceding siblings ...)
2021-09-08 18:03 ` [Bug target/102224] [9/10 " jakub at gcc dot gnu.org
@ 2022-05-10 8:20 ` cvs-commit at gcc dot gnu.org
2022-05-11 6:22 ` cvs-commit at gcc dot gnu.org
2022-05-11 6:36 ` jakub at gcc dot gnu.org
19 siblings, 0 replies; 21+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-05-10 8:20 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224
--- Comment #17 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-10 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:c7b00fbc469bc9c0a486b48bc349dba13881241e
commit r10-10641-gc7b00fbc469bc9c0a486b48bc349dba13881241e
Author: Jakub Jelinek <jakub@redhat.com>
Date: Wed Sep 8 11:25:31 2021 +0200
i386: Fix up @xorsign<mode>3_1 [PR102224]
As the testcase shows, we miscompile @xorsign<mode>3_1 if both input
operands are in the same register, because the splitter overwrites op1
before with op1 & mask before using op0.
For dest = xorsign op0, op0 we can actually simplify it from
dest = (op0 & mask) ^ op0 to dest = op0 & ~mask (aka abs).
The expander change is an optimization improvement, if we at expansion
time know it is xorsign op0, op0, we can emit abs right away and get better
code through that.
The @xorsign<mode>3_1 is a fix for the case where xorsign wouldn't be known
to have same operands during expansion, but during RTL optimizations they
would appear. We need to use earlyclobber, we require dest and op1 to be
the same but op0 must be different because we overwrite
op1 first.
2021-09-08 Jakub Jelinek <jakub@redhat.com>
PR target/102224
* config/i386/i386.md (xorsign<mode>3): If operands[1] is equal to
operands[2], emit abs<mode>2 instead.
(@xorsign<mode>3_1): Add early-clobber for output operand.
* gcc.dg/pr102224.c: New test.
* gcc.target/i386/avx-pr102224.c: New test.
(cherry picked from commit a7b626d98a9a821ffb33466818d6aa86cac1d6fd)
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug target/102224] [9/10 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f
2021-09-07 2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
` (17 preceding siblings ...)
2022-05-10 8:20 ` cvs-commit at gcc dot gnu.org
@ 2022-05-11 6:22 ` cvs-commit at gcc dot gnu.org
2022-05-11 6:36 ` jakub at gcc dot gnu.org
19 siblings, 0 replies; 21+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-05-11 6:22 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224
--- Comment #18 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-9 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:45579f290157b62a36752ed6df4fbcb7c742fb08
commit r9-10098-g45579f290157b62a36752ed6df4fbcb7c742fb08
Author: Jakub Jelinek <jakub@redhat.com>
Date: Wed Sep 8 11:25:31 2021 +0200
i386: Fix up @xorsign<mode>3_1 [PR102224]
As the testcase shows, we miscompile @xorsign<mode>3_1 if both input
operands are in the same register, because the splitter overwrites op1
before with op1 & mask before using op0.
For dest = xorsign op0, op0 we can actually simplify it from
dest = (op0 & mask) ^ op0 to dest = op0 & ~mask (aka abs).
The expander change is an optimization improvement, if we at expansion
time know it is xorsign op0, op0, we can emit abs right away and get better
code through that.
The @xorsign<mode>3_1 is a fix for the case where xorsign wouldn't be known
to have same operands during expansion, but during RTL optimizations they
would appear. We need to use earlyclobber, we require dest and op1 to be
the same but op0 must be different because we overwrite
op1 first.
2021-09-08 Jakub Jelinek <jakub@redhat.com>
PR target/102224
* config/i386/i386.md (xorsign<mode>3): If operands[1] is equal to
operands[2], emit abs<mode>2 instead.
(@xorsign<mode>3_1): Add early-clobber for output operand.
* gcc.dg/pr102224.c: New test.
* gcc.target/i386/avx-pr102224.c: New test.
(cherry picked from commit a7b626d98a9a821ffb33466818d6aa86cac1d6fd)
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug target/102224] [9/10 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f
2021-09-07 2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
` (18 preceding siblings ...)
2022-05-11 6:22 ` cvs-commit at gcc dot gnu.org
@ 2022-05-11 6:36 ` jakub at gcc dot gnu.org
19 siblings, 0 replies; 21+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-05-11 6:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102224
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution|--- |FIXED
--- Comment #19 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed.
^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2022-05-11 6:36 UTC | newest]
Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-07 2:56 [Bug tree-optimization/102224] New: Incorrect compile on `x * copysign(1.0, x)` gabravier at gmail dot com
2021-09-07 2:58 ` [Bug tree-optimization/102224] " gabravier at gmail dot com
2021-09-07 2:59 ` gabravier at gmail dot com
2021-09-07 3:01 ` gabravier at gmail dot com
2021-09-07 3:04 ` [Bug target/102224] " pinskia at gcc dot gnu.org
2021-09-07 3:06 ` [Bug target/102224] [12 regession] wrong code for " pinskia at gcc dot gnu.org
2021-09-07 3:07 ` gabravier at gmail dot com
2021-09-07 3:18 ` [Bug target/102224] [9/10/11/12 " gabravier at gmail dot com
2021-09-07 3:25 ` gabravier at gmail dot com
2021-09-07 6:37 ` rguenth at gcc dot gnu.org
2021-09-07 9:33 ` [Bug target/102224] [9/10/11/12 regession] wrong code for `x * copysign(1.0, x)` since r9-5298-g33142cf9cf82aa1f marxin at gcc dot gnu.org
2021-09-07 9:51 ` jakub at gcc dot gnu.org
2021-09-07 11:25 ` jakub at gcc dot gnu.org
2021-09-07 18:53 ` pinskia at gcc dot gnu.org
2021-09-08 9:26 ` cvs-commit at gcc dot gnu.org
2021-09-08 10:08 ` [Bug target/102224] [9/10/11 " jakub at gcc dot gnu.org
2021-09-08 18:02 ` cvs-commit at gcc dot gnu.org
2021-09-08 18:03 ` [Bug target/102224] [9/10 " jakub at gcc dot gnu.org
2022-05-10 8:20 ` cvs-commit at gcc dot gnu.org
2022-05-11 6:22 ` cvs-commit at gcc dot gnu.org
2022-05-11 6:36 ` jakub at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).