public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/95060] New: vfnmsub132ps is not generated with -ffast-math
@ 2020-05-11 15:07 ubizjak at gmail dot com
2020-05-11 15:12 ` [Bug tree-optimization/95060] " ubizjak at gmail dot com
` (6 more replies)
0 siblings, 7 replies; 8+ messages in thread
From: ubizjak at gmail dot com @ 2020-05-11 15:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95060
Bug ID: 95060
Summary: vfnmsub132ps is not generated with -ffast-math
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---
The following testcase:
--cut here--
float r[8], a[8], b[8], c[8];

void
test_fnms (void)
{
  for (int i = 0; i < 8; i++)
    r[i] = -(a[i] * b[i]) - c[i];
}
--cut here--
compiles on x86_64 with "-O3 -mfma" to
        vmovaps b(%rip), %ymm0
        vmovaps c(%rip), %ymm1
        vfnmsub132ps a(%rip), %ymm1, %ymm0
        vmovaps %ymm0, r(%rip)
        vzeroupper
        ret
However, when -ffast-math is added, negation gets moved out of the insn:
        vmovaps b(%rip), %ymm0
        vmovaps c(%rip), %ymm1
        vfmadd132ps a(%rip), %ymm1, %ymm0
     -> vxorps .LC0(%rip), %ymm0, %ymm0
        vmovaps %ymm0, r(%rip)
        vzeroupper
        ret
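For illustration, the two instruction sequences above correspond to two source-level shapes of the same computation. The following is a minimal sketch in C, not part of the original report: `test_neg_fma`, `forms_agree`, the `N` macro and the input values are assumptions chosen so that every intermediate result is exact in float, which makes the comparison independent of whether the compiler fuses the multiply and add.

```c
#include <assert.h>  /* only needed for the assert-based check in the usage note */

#define N 8

static float a[N], b[N], c[N];
static float r_fnms[N], r_neg_fma[N];

/* Form from the report: negated product minus the addend.  */
static void
test_fnms (void)
{
  for (int i = 0; i < N; i++)
    r_fnms[i] = -(a[i] * b[i]) - c[i];
}

/* Shape of the -ffast-math output: multiply-add, then flip the sign
   (the sign flip corresponds to the vxorps).  */
static void
test_neg_fma (void)
{
  for (int i = 0; i < N; i++)
    r_neg_fma[i] = -((a[i] * b[i]) + c[i]);
}

/* Hypothetical helper: fill the inputs with values that are exact in
   float, run both forms, and return 1 if every element matches.  */
static int
forms_agree (void)
{
  for (int i = 0; i < N; i++)
    {
      a[i] = i + 0.5f;
      b[i] = 2.0f;
      c[i] = 0.25f * i;
    }
  test_fnms ();
  test_neg_fma ();
  for (int i = 0; i < N; i++)
    if (r_fnms[i] != r_neg_fma[i])
      return 0;
  return 1;
}
```

A caller could check the equivalence with `assert (forms_agree ());`. The sketch only shows that both shapes compute the same values for these inputs; it says nothing about which instructions a given compiler version emits.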
* [Bug tree-optimization/95060] vfnmsub132ps is not generated with -ffast-math
From: ubizjak at gmail dot com @ 2020-05-11 15:12 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95060
Uroš Bizjak <ubizjak at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |rguenth at gcc dot gnu.org
--- Comment #1 from Uroš Bizjak <ubizjak at gmail dot com> ---
Related to PR86999.
* [Bug tree-optimization/95060] vfnmsub132ps is not generated with -ffast-math
From: jakub at gcc dot gnu.org @ 2020-05-11 19:49 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95060
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jakub at gcc dot gnu.org
--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I see the needed simplifiers in match.pd:
 (simplify
  (negate (fmas@3 @0 @1 @2))
  (if (single_use (@3))
   (IFN_FNMS @0 @1 @2))))
but perhaps the problem is that there is no forwprop after widening_mul that
would perform that optimization?
So, should widening_mul itself, whenever it matches an FMA, check whether the
result of the IFN_{FMA,FMS,FNMA,FNMS} call it created is used by a negation
and, if so, try to fold that negation?
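As a sketch of the arithmetic behind those simplifiers: negating any of the four fused forms yields another one of the four. The C below is an illustration only; the `*_ref` helper names and `negation_identities_hold` are hypothetical, and the helpers use plain float arithmetic rather than any GCC internal function.

```c
#include <assert.h>  /* only needed for the assert-based check in the usage note */

/* Reference semantics of the four variants (hypothetical helper names).  */
static float fma_ref  (float a, float b, float c) { return a * b + c; }
static float fms_ref  (float a, float b, float c) { return a * b - c; }
static float fnma_ref (float a, float b, float c) { return -(a * b) + c; }
static float fnms_ref (float a, float b, float c) { return -(a * b) - c; }

/* Return 1 if negating each variant gives the expected sibling:
   -FMA == FNMS, -FMS == FNMA, -FNMA == FMS, -FNMS == FMA.  */
static int
negation_identities_hold (float a, float b, float c)
{
  return -fma_ref (a, b, c) == fnms_ref (a, b, c)
	 && -fms_ref (a, b, c) == fnma_ref (a, b, c)
	 && -fnma_ref (a, b, c) == fms_ref (a, b, c)
	 && -fnms_ref (a, b, c) == fma_ref (a, b, c);
}
```

For example, `assert (negation_identities_hold (1.5f, 2.0f, 0.25f));` passes; those inputs are chosen so that all intermediate values are exact in float.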
* [Bug tree-optimization/95060] vfnmsub132ps is not generated with -ffast-math
From: rguenth at gcc dot gnu.org @ 2020-05-12 6:56 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95060
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Version|unknown |11.0
Status|UNCONFIRMED |NEW
Last reconfirmed| |2020-05-12
Keywords| |missed-optimization
Ever confirmed|0 |1
Target| |x86_64-*-* i?86-*-*
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
FMA generation already folds the FMA stmt:
  if (cond)
    fma_stmt = gimple_build_call_internal (IFN_COND_FMA, 5, cond, mulop1,
                                           op2, addop, else_value);
  else
    fma_stmt = gimple_build_call_internal (IFN_FMA, 3, mulop1, op2, addop);
  gimple_set_lhs (fma_stmt, gimple_get_lhs (use_stmt));
  gimple_call_set_nothrow (fma_stmt, !stmt_can_throw_internal (cfun,
                                                               use_stmt));
  gsi_replace (&gsi, fma_stmt, true);
  /* Follow all SSA edges so that we generate FMS, FNMA and FNMS
     regardless of where the negation occurs.  */
  gimple *orig_stmt = gsi_stmt (gsi);
  if (fold_stmt (&gsi, follow_all_ssa_edges))
    {
      if (maybe_clean_or_replace_eh_stmt (orig_stmt, gsi_stmt (gsi)))
        gcc_unreachable ();
      update_stmt (gsi_stmt (gsi));
but not the negate it feeds, since with -ffast-math the canonical form seems
to be -((a[i] * b[i]) + c[i]) (reassoc does this).
float r[8], a[8], b[8], c[8];

void
test_fnms (void)
{
  for (int i = 0; i < 8; i++)
    r[i] = -((a[i] * b[i]) + c[i]);
}
would be an alternative testcase, which is not handled without -ffast-math
either.
I'd suggest folding the single-use stmt of the fma_stmt's lhs, if there is
one [and if it is a negate].
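The suggestion can be sketched on a toy statement model. This is not GCC's GIMPLE API: `enum op`, `struct stmt` and `fold_negated_fma` are hypothetical names invented for this illustration, and operands other than the negated value are not modeled.

```c
#include <assert.h>  /* only needed for the assert-based check in the usage note */
#include <stddef.h>

/* Toy statement kinds; not GCC's tree codes or internal functions.  */
enum op { OP_FMA, OP_FNMS, OP_NEGATE, OP_OTHER };

struct stmt
{
  enum op code;
  struct stmt *operand;	/* Statement defining the negated value (OP_NEGATE only).  */
  int num_uses;		/* Number of statements using this statement's result.  */
};

/* If NEG is a negate whose operand is an FMA with exactly one use,
   turn NEG into an FNMS and leave the FMA without uses.
   Return 1 if the fold happened, 0 otherwise.  */
static int
fold_negated_fma (struct stmt *neg)
{
  if (neg->code != OP_NEGATE)
    return 0;
  struct stmt *def = neg->operand;
  if (def == NULL || def->code != OP_FMA || def->num_uses != 1)
    return 0;
  neg->code = OP_FNMS;	/* Would take over the FMA's operands (not modeled).  */
  neg->operand = NULL;
  def->num_uses = 0;	/* The FMA is now dead and could be removed.  */
  return 1;
}
```

With `struct stmt fma = { OP_FMA, NULL, 1 }` and `struct stmt neg = { OP_NEGATE, &fma, 1 }`, calling `fold_negated_fma (&neg)` returns 1 and leaves `neg.code == OP_FNMS`. If the FMA result had a second use, the fold is skipped, which mirrors the single-use condition in the suggestion.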
* [Bug tree-optimization/95060] vfnmsub132ps is not generated with -ffast-math
From: jakub at gcc dot gnu.org @ 2020-05-12 10:21 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95060
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |ASSIGNED
Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org
--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 48515
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48515&action=edit
gcc11-pr95060.patch
Untested fix.
* [Bug tree-optimization/95060] vfnmsub132ps is not generated with -ffast-math
From: cvs-commit at gcc dot gnu.org @ 2020-05-13 9:21 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95060
--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:c0c39a765b0714aed36fced6fbba452a6619acb0
commit r11-350-gc0c39a765b0714aed36fced6fbba452a6619acb0
Author: Jakub Jelinek <jakub@redhat.com>
Date: Wed May 13 11:21:02 2020 +0200
Fold single imm use of a FMA if it is a negation [PR95060]
match.pd already has simplifications for negation of a FMA (FMS, FNMA, FNMS)
call if it is single use, but when the widening_mul pass discovers FMAs,
nothing folds the statements anymore.
So, the following patch adjusts the widening_mul pass to handle that.
I had to adjust quite a lot of tests, because they have in them nested FMAs
(one FMA feeding another one) and the patch results in some (equivalent)
changes in the chosen instructions, previously the negation of one FMA's
result would result in the dependent FMA being adjusted for the negation,
but now instead the first FMA is adjusted.
2020-05-13 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/95060
* tree-ssa-math-opts.c (convert_mult_to_fma_1): Fold a NEGATE_EXPR
if it is the single use of the FMA internal builtin.
* gcc.target/i386/avx512f-pr95060.c: New test.
* gcc.target/i386/fma_double_1.c: Adjust expected insn counts.
* gcc.target/i386/fma_double_2.c: Likewise.
* gcc.target/i386/fma_double_3.c: Likewise.
* gcc.target/i386/fma_double_4.c: Likewise.
* gcc.target/i386/fma_double_5.c: Likewise.
* gcc.target/i386/fma_double_6.c: Likewise.
* gcc.target/i386/fma_float_1.c: Likewise.
* gcc.target/i386/fma_float_2.c: Likewise.
* gcc.target/i386/fma_float_3.c: Likewise.
* gcc.target/i386/fma_float_4.c: Likewise.
* gcc.target/i386/fma_float_5.c: Likewise.
* gcc.target/i386/fma_float_6.c: Likewise.
* gcc.target/i386/l_fma_double_1.c: Likewise.
* gcc.target/i386/l_fma_double_2.c: Likewise.
* gcc.target/i386/l_fma_double_3.c: Likewise.
* gcc.target/i386/l_fma_double_4.c: Likewise.
* gcc.target/i386/l_fma_double_5.c: Likewise.
* gcc.target/i386/l_fma_double_6.c: Likewise.
* gcc.target/i386/l_fma_float_1.c: Likewise.
* gcc.target/i386/l_fma_float_2.c: Likewise.
* gcc.target/i386/l_fma_float_3.c: Likewise.
* gcc.target/i386/l_fma_float_4.c: Likewise.
* gcc.target/i386/l_fma_float_5.c: Likewise.
* gcc.target/i386/l_fma_float_6.c: Likewise.
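One reading of the nested-FMA remark in the commit message, sketched in C: when the negated result of one multiply-add feeds a second one, the negation can be absorbed by either of the two operations. The function names and input values below are assumptions for illustration and are not taken from the adjusted tests.

```c
#include <assert.h>  /* only needed for the assert-based check in the usage note */

/* Negation absorbed by the dependent operation:
   t = a*b + c (FMA shape), then -(t*d) + e (FNMA shape).  */
static float
nested_negate_in_second (float a, float b, float c, float d, float e)
{
  float t = a * b + c;
  return -(t * d) + e;
}

/* Negation absorbed by the first operation:
   t = -(a*b) - c (FNMS shape), then t*d + e (FMA shape).  */
static float
nested_negate_in_first (float a, float b, float c, float d, float e)
{
  float t = -(a * b) - c;
  return t * d + e;
}
```

For inputs that are exact in float, such as `(1.5f, 2.0f, 0.25f, 4.0f, 1.0f)`, both functions return `-12.0f`, which is why the changed instruction selection is described as equivalent.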
* [Bug tree-optimization/95060] vfnmsub132ps is not generated with -ffast-math
From: jakub at gcc dot gnu.org @ 2020-05-13 9:23 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95060
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
Status|ASSIGNED |RESOLVED
--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed.
* [Bug tree-optimization/95060] vfnmsub132ps is not generated with -ffast-math
From: pinskia at gcc dot gnu.org @ 2021-08-10 23:13 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95060
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |zamazan4ik at tut dot by
--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
*** Bug 91250 has been marked as a duplicate of this bug. ***