public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/95060] New: vfnmsub132ps is not generated with -ffast-math
@ 2020-05-11 15:07 ubizjak at gmail dot com
  2020-05-11 15:12 ` [Bug tree-optimization/95060] " ubizjak at gmail dot com
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: ubizjak at gmail dot com @ 2020-05-11 15:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95060

            Bug ID: 95060
           Summary: vfnmsub132ps is not generated with -ffast-math
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ubizjak at gmail dot com
  Target Milestone: ---

The following testcase:

--cut here--
float r[8], a[8], b[8], c[8];

void
test_fnms (void)
{
  for (int i = 0; i < 8; i++)
    r[i] = -(a[i] * b[i]) - c[i];
}
--cut here--

compiles on x86_64 with "-O3 -mfma" to

        vmovaps b(%rip), %ymm0
        vmovaps c(%rip), %ymm1
        vfnmsub132ps    a(%rip), %ymm1, %ymm0
        vmovaps %ymm0, r(%rip)
        vzeroupper
        ret

However, when -ffast-math is added, the negation is moved out of the insn and
emitted as a separate vxorps:

        vmovaps b(%rip), %ymm0
        vmovaps c(%rip), %ymm1
        vfmadd132ps     a(%rip), %ymm1, %ymm0
->      vxorps  .LC0(%rip), %ymm0, %ymm0
        vmovaps %ymm0, r(%rip)
        vzeroupper
        ret
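
A scalar sketch of what one lane of each sequence above computes. The function
names are illustrative only, and it is an assumption that .LC0 holds a sign-bit
mask (0x80000000 per lane); the listing does not show its contents. The real
FMA instructions round once, which makes no difference for the small integer
inputs used in the check.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* One lane of the -O3 -mfma sequence: -(a * b) - c in a single operation.  */
static float
fnms_lane (float a, float b, float c)
{
  return -(a * b) - c;
}

/* Flip the sign bit of X with an integer XOR.  This is what a vxorps
   against a sign-bit mask does (assumed content of .LC0).  */
static float
negate_via_xor (float x)
{
  uint32_t bits;
  memcpy (&bits, &x, sizeof bits);
  bits ^= UINT32_C (0x80000000);
  memcpy (&x, &bits, sizeof bits);
  return x;
}

/* One lane of the -ffast-math sequence: multiply-add, then a separate
   negation of the result.  */
static float
fma_then_negate_lane (float a, float b, float c)
{
  return negate_via_xor (a * b + c);
}

/* Both sequences give the same value; the second needs an extra insn.  */
static void
check_lanes_agree (void)
{
  assert (fnms_lane (2.0f, 3.0f, 1.0f)
          == fma_then_negate_lane (2.0f, 3.0f, 1.0f));
}
```

Calling check_lanes_agree () from a main function is enough to exercise it.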


* [Bug tree-optimization/95060] vfnmsub132ps is not generated with -ffast-math
From: ubizjak at gmail dot com @ 2020-05-11 15:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95060

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #1 from Uroš Bizjak <ubizjak at gmail dot com> ---
Related to PR86999.


* [Bug tree-optimization/95060] vfnmsub132ps is not generated with -ffast-math
From: jakub at gcc dot gnu.org @ 2020-05-11 19:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95060

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I see the needed simplifiers in match.pd:
  (simplify
   (negate (fmas@3 @0 @1 @2))
   (if (single_use (@3))
    (IFN_FNMS @0 @1 @2))))
but perhaps the problem is that there is no forwprop pass after widening_mul
that would perform that optimization?
So, when widening_mul matches an FMA, should it check whether the result of
the IFN_{FMA,FMS,FNMA,FNMS} it created is negated and, if so, try to
gimple_fold that negation?
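
The rule quoted above turns the negation of an FMA into an FNMS. As a hedged
sketch, the same idea can be tabulated for all four kinds. The enum and
function names below are toy stand-ins, not GCC's own types, and the three
pairings other than FMA -> FNMS are derived here from the arithmetic rather
than quoted from match.pd.

```c
#include <assert.h>

/* Toy stand-ins for the four internal functions; not GCC's own types.  */
enum fma_kind { FMA, FMS, FNMA, FNMS };

/* Reference semantics of each kind, written without fusing.  */
static double
eval_kind (enum fma_kind kind, double a, double b, double c)
{
  switch (kind)
    {
    case FMA:  return a * b + c;
    case FMS:  return a * b - c;
    case FNMA: return -(a * b) + c;
    case FNMS: return -(a * b) - c;
    }
  return 0.0;
}

/* Kind that computes the negation of KIND on the same operands.  */
static enum fma_kind
negated_kind (enum fma_kind kind)
{
  switch (kind)
    {
    case FMA:  return FNMS;
    case FMS:  return FNMA;
    case FNMA: return FMS;
    case FNMS: return FMA;
    }
  return kind;
}

/* Negating the result and switching to the negated kind must agree.  */
static void
check_negated_kinds (void)
{
  for (int k = FMA; k <= FNMS; k++)
    assert (eval_kind (negated_kind ((enum fma_kind) k), 2.0, 3.0, 5.0)
            == -eval_kind ((enum fma_kind) k, 2.0, 3.0, 5.0));
}
```

Negating flips the sign of both the product and the addend, which is why each
kind maps to the one with both signs reversed.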


* [Bug tree-optimization/95060] vfnmsub132ps is not generated with -ffast-math
From: rguenth at gcc dot gnu.org @ 2020-05-12  6:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95060

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|unknown                     |11.0
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2020-05-12
           Keywords|                            |missed-optimization
     Ever confirmed|0                           |1
             Target|                            |x86_64-*-* i?86-*-*

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
FMA generation already folds the FMA stmt:

      if (cond)
        fma_stmt = gimple_build_call_internal (IFN_COND_FMA, 5, cond, mulop1,
                                               op2, addop, else_value);
      else
        fma_stmt = gimple_build_call_internal (IFN_FMA, 3, mulop1, op2, addop);
      gimple_set_lhs (fma_stmt, gimple_get_lhs (use_stmt));
      gimple_call_set_nothrow (fma_stmt, !stmt_can_throw_internal (cfun,
                                                                   use_stmt));
      gsi_replace (&gsi, fma_stmt, true);
      /* Follow all SSA edges so that we generate FMS, FNMA and FNMS
         regardless of where the negation occurs.  */
      gimple *orig_stmt = gsi_stmt (gsi);
      if (fold_stmt (&gsi, follow_all_ssa_edges))
        {
          if (maybe_clean_or_replace_eh_stmt (orig_stmt, gsi_stmt (gsi)))
            gcc_unreachable ();
          update_stmt (gsi_stmt (gsi));

but it does not fold the negate that the FMA feeds. With -ffast-math the
canonical form appears to be -((a[i] * b[i]) + c[i]) (reassoc does this), so
the negation ends up as a separate statement applied to the FMA result.

float r[8], a[8], b[8], c[8];

void
test_fnms (void)
{
  for (int i = 0; i < 8; i++)
    r[i] = -((a[i] * b[i]) + c[i]);
}

would be an alternative testcase, not handled without -ffast-math either.

I'd suggest folding the single-use stmt of the fma_stmt's lhs, if there is one
and it is a negate.
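
A toy model of that suggestion, assuming a made-up statement representation
(struct toy_stmt and every name below are hypothetical and unrelated to GCC's
gimple API): after an FMA is built, look for the single use of its result and,
if that use is a negate, replace the pair with one oppositely-signed FMA.

```c
#include <assert.h>
#include <stddef.h>

/* Toy statement: lhs = +/-(op[0] * op[1]) +/- op[2], or lhs = -op[0].
   Operands and results are small positive integer "SSA names".  */
enum toy_code { TOY_FMA, TOY_NEGATE, TOY_DELETED };

struct toy_stmt
{
  enum toy_code code;
  int lhs;
  int op[3];
  int neg_mul;  /* product is negated (FNMA, FNMS) */
  int neg_add;  /* addend is negated (FMS, FNMS) */
};

/* Return the only statement using NAME, or NULL if NAME has zero or
   several uses.  */
static struct toy_stmt *
single_use_stmt (struct toy_stmt *stmts, size_t n, int name)
{
  struct toy_stmt *use = NULL;
  int count = 0;
  for (size_t i = 0; i < n; i++)
    {
      int nops = 0;
      if (stmts[i].code == TOY_FMA)
        nops = 3;
      else if (stmts[i].code == TOY_NEGATE)
        nops = 1;
      for (int j = 0; j < nops; j++)
        if (stmts[i].op[j] == name)
          {
            use = &stmts[i];
            count++;
          }
    }
  return count == 1 ? use : NULL;
}

/* If the result of FMA is used exactly once and that use is a negate,
   turn the negate into the oppositely-signed FMA and delete FMA.
   Return 1 if the fold happened, 0 otherwise.  */
static int
fold_negate_of_fma (struct toy_stmt *stmts, size_t n, struct toy_stmt *fma)
{
  struct toy_stmt *use = single_use_stmt (stmts, n, fma->lhs);
  if (use == NULL || use->code != TOY_NEGATE)
    return 0;
  int lhs = use->lhs;
  *use = *fma;
  use->lhs = lhs;
  use->neg_mul = !fma->neg_mul;
  use->neg_add = !fma->neg_add;
  fma->code = TOY_DELETED;
  return 1;
}

static void
check_fold (void)
{
  /* _4 = _1 * _2 + _3;  _5 = -_4;  */
  struct toy_stmt stmts[2] = {
    { TOY_FMA, 4, { 1, 2, 3 }, 0, 0 },
    { TOY_NEGATE, 5, { 4, 0, 0 }, 0, 0 },
  };
  assert (fold_negate_of_fma (stmts, 2, &stmts[0]));
  assert (stmts[0].code == TOY_DELETED);
  assert (stmts[1].code == TOY_FMA && stmts[1].lhs == 5);
  assert (stmts[1].neg_mul && stmts[1].neg_add);  /* i.e. an FNMS */
}
```

The single-use check matters: if the FMA result had a second use, the original
FMA would still be needed and the fold would only add work.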


* [Bug tree-optimization/95060] vfnmsub132ps is not generated with -ffast-math
From: jakub at gcc dot gnu.org @ 2020-05-12 10:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95060

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 48515
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48515&action=edit
gcc11-pr95060.patch

Untested fix.


* [Bug tree-optimization/95060] vfnmsub132ps is not generated with -ffast-math
From: cvs-commit at gcc dot gnu.org @ 2020-05-13  9:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95060

--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:c0c39a765b0714aed36fced6fbba452a6619acb0

commit r11-350-gc0c39a765b0714aed36fced6fbba452a6619acb0
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed May 13 11:21:02 2020 +0200

    Fold single imm use of a FMA if it is a negation [PR95060]

    match.pd already has simplifications for negation of a FMA (FMS, FNMA,
    FNMS) call if it is single use, but when the widening_mul pass discovers
    FMAs, nothing folds the statements anymore.

    So, the following patch adjusts the widening_mul pass to handle that.

    I had to adjust quite a lot of tests, because they have in them nested
    FMAs (one FMA feeding another one) and the patch results in some
    (equivalent) changes in the chosen instructions, previously the negation
    of one FMA's result would result in the dependent FMA being adjusted for
    the negation, but now instead the first FMA is adjusted.

    2020-05-13  Jakub Jelinek  <jakub@redhat.com>

            PR tree-optimization/95060
            * tree-ssa-math-opts.c (convert_mult_to_fma_1): Fold a NEGATE_EXPR
            if it is the single use of the FMA internal builtin.

            * gcc.target/i386/avx512f-pr95060.c: New test.
            * gcc.target/i386/fma_double_1.c: Adjust expected insn counts.
            * gcc.target/i386/fma_double_2.c: Likewise.
            * gcc.target/i386/fma_double_3.c: Likewise.
            * gcc.target/i386/fma_double_4.c: Likewise.
            * gcc.target/i386/fma_double_5.c: Likewise.
            * gcc.target/i386/fma_double_6.c: Likewise.
            * gcc.target/i386/fma_float_1.c: Likewise.
            * gcc.target/i386/fma_float_2.c: Likewise.
            * gcc.target/i386/fma_float_3.c: Likewise.
            * gcc.target/i386/fma_float_4.c: Likewise.
            * gcc.target/i386/fma_float_5.c: Likewise.
            * gcc.target/i386/fma_float_6.c: Likewise.
            * gcc.target/i386/l_fma_double_1.c: Likewise.
            * gcc.target/i386/l_fma_double_2.c: Likewise.
            * gcc.target/i386/l_fma_double_3.c: Likewise.
            * gcc.target/i386/l_fma_double_4.c: Likewise.
            * gcc.target/i386/l_fma_double_5.c: Likewise.
            * gcc.target/i386/l_fma_double_6.c: Likewise.
            * gcc.target/i386/l_fma_float_1.c: Likewise.
            * gcc.target/i386/l_fma_float_2.c: Likewise.
            * gcc.target/i386/l_fma_float_3.c: Likewise.
            * gcc.target/i386/l_fma_float_4.c: Likewise.
            * gcc.target/i386/l_fma_float_5.c: Likewise.
            * gcc.target/i386/l_fma_float_6.c: Likewise.
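
A hedged sketch of the "equivalent changes" the commit message describes for
nested FMAs. The expression shape is a guess, since the adjusted test sources
are not shown in this thread, and the helper names are made up. Absorbing the
negation into the dependent FMA or into the first FMA yields the same value.

```c
#include <assert.h>

/* Unfused reference semantics.  The inputs used below are small integers,
   so the missing single rounding of a real FMA does not matter.  */
static double fma_ref  (double a, double b, double c) { return a * b + c; }
static double fnma_ref (double a, double b, double c) { return -(a * b) + c; }
static double fnms_ref (double a, double b, double c) { return -(a * b) - c; }

/* r = -(a * b + c) * d + e, with the negation absorbed by the dependent
   (second) FMA, which becomes an FNMA.  */
static double
nested_adjust_second (double a, double b, double c, double d, double e)
{
  double t = fma_ref (a, b, c);
  return fnma_ref (t, d, e);
}

/* Same value, with the negation absorbed by the first FMA instead, which
   becomes an FNMS; the dependent one stays a plain FMA.  */
static double
nested_adjust_first (double a, double b, double c, double d, double e)
{
  double t = fnms_ref (a, b, c);
  return fma_ref (t, d, e);
}

static void
check_nested (void)
{
  assert (nested_adjust_second (2.0, 3.0, 1.0, 4.0, 5.0)
          == nested_adjust_first (2.0, 3.0, 1.0, 4.0, 5.0));
}
```

Either way two FMA-family instructions are used, but the mix of mnemonics
changes, which is consistent with the expected insn counts in the tests
needing adjustment.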


* [Bug tree-optimization/95060] vfnmsub132ps is not generated with -ffast-math
From: jakub at gcc dot gnu.org @ 2020-05-13  9:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95060

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed.


* [Bug tree-optimization/95060] vfnmsub132ps is not generated with -ffast-math
From: pinskia at gcc dot gnu.org @ 2021-08-10 23:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95060

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |zamazan4ik at tut dot by

--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
*** Bug 91250 has been marked as a duplicate of this bug. ***
