public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/84114] global reassociation pass prevents fma usage, generates slower code
[not found] <bug-84114-4@http.gcc.gnu.org/bugzilla/>
@ 2024-01-26 5:49 ` pinskia at gcc dot gnu.org
0 siblings, 0 replies; only message in thread
From: pinskia at gcc dot gnu.org @ 2024-01-26 5:49 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84114
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |12.1.0
--- Comment #12 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Starting with GCC 12, we get the following on arm64 (with -Ofast):
```
mult_su3_na:
        ldp     q3, q1, [x1, 16]
        ldr     q0, [x0, 32]
        ldp     q2, q4, [x0]
        fmul    v0.2d, v0.2d, v1.2d
        ldr     q1, [x1]
        fmla    v0.2d, v4.2d, v3.2d
        fmla    v0.2d, v2.2d, v1.2d
        faddp   d0, v0.2d
        ret
```
This is even better than before, because SLP vectorization now happens (the
same is true on x86_64 with -mfma).
With -fno-tree-vectorize, -Ofast on x86_64 is slightly better than 13, by one
instruction.
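
For readers following the assembly, here is a rough scalar sketch in C of what
the arm64 sequence above appears to compute: a six-element dot product kept in
two lanes that are summed at the end. The name dot6_sketch and the flat
double[6] argument layout are assumptions read off the loads from x0 and x1;
this is not the testcase from the bug and the real mult_su3_na source may look
different.

```c
/* Hypothetical scalar model of the vectorized sequence; not the
   original mult_su3_na source. a plays the role of x0, b of x1. */
double dot6_sketch(const double *a, const double *b)
{
    /* fmul v0.2d: each lane starts from one product of elements 4..5 */
    double lane0 = a[4] * b[4];
    double lane1 = a[5] * b[5];

    /* first fmla: accumulate the products of elements 2..3 */
    lane0 += a[2] * b[2];
    lane1 += a[3] * b[3];

    /* second fmla: accumulate the products of elements 0..1 */
    lane0 += a[0] * b[0];
    lane1 += a[1] * b[1];

    /* faddp: add the two lanes to form the scalar result */
    return lane0 + lane1;
}
```

Each `+=` line corresponds to one lane of a fused multiply-add, so in this
model the whole computation needs one multiply, two FMAs and one pairwise add
per pair of lanes, which matches the instruction count in the listing.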
I am not sure if this matters any more due to the SLP improvement ...