* [Bug tree-optimization/84114] global reassociation pass prevents fma usage, generates slower code
From: pinskia at gcc dot gnu.org @ 2024-01-26  5:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84114

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |12.1.0

--- Comment #12 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Starting in GCC 12 we get on arm64 (with -Ofast):
```
mult_su3_na:
        ldp     q3, q1, [x1, 16]
        ldr     q0, [x0, 32]
        ldp     q2, q4, [x0]
        fmul    v0.2d, v0.2d, v1.2d
        ldr     q1, [x1]
        fmla    v0.2d, v4.2d, v3.2d
        fmla    v0.2d, v2.2d, v1.2d
        faddp   d0, v0.2d
        ret
```

This is even better than before, because SLP vectorization now happens
(similarly on x86_64 with -mfma).
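
For readers without the testcase at hand, here is a minimal C sketch of a
computation with the same shape as the assembly above: six doubles are loaded
from each pointer, multiplied pairwise, accumulated in two vector lanes, and
then summed horizontally. The name `dot6` and its signature are assumptions
made for illustration; this is not the testcase attached to the bug, and it is
not guaranteed to produce exactly the code shown.

```c
#include <stddef.h>

/* Hypothetical stand-in for the reduced kernel (not the bug's testcase):
 * a six-term dot product. With reassociation allowed (e.g. -Ofast), a
 * compiler may split the sum into two lanes, use one multiply plus two
 * fused multiply-adds, and finish with a pairwise add. */
double dot6(const double *a, const double *b)
{
    double sum = 0.0;
    for (size_t i = 0; i < 6; ++i)
        sum += a[i] * b[i];   /* candidate for fmul/fmla after SLP */
    return sum;
}
```

Compiling such a function with and without -fno-tree-vectorize is one way to
compare the vectorized and scalar FMA sequences discussed here.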

With -fno-tree-vectorize, -Ofast on x86_64 is slightly better than 13, by one
instruction.

I am not sure if this matters any more due to the SLP improvement ...
