public inbox for gcc-bugs@sourceware.org
* [Bug target/99551] New: aarch64: csel is used for cold scalar computation which affects performance
@ 2021-03-11 13:19 nsz at gcc dot gnu.org
2021-03-11 16:36 ` [Bug target/99551] " rearnsha at gcc dot gnu.org
2021-12-23 21:31 ` [Bug rtl-optimization/99551] " pinskia at gcc dot gnu.org
0 siblings, 2 replies; 3+ messages in thread
From: nsz at gcc dot gnu.org @ 2021-03-11 13:19 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99551
Bug ID: 99551
Summary: aarch64: csel is used for cold scalar computation
which affects performance
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: nsz at gcc dot gnu.org
Target Milestone: ---
This is a missed optimization. I don't know in which layer it should
be fixed, so I am reporting it as a target bug.
The cold path affects the performance of the hot code because csel is used,
so the add/and pair is executed even when c is 0:
long foo(long x, int c)
{
    if (__builtin_expect(c,0))
        x = (x + 15) & ~15;
    return x;
}

compiles to

foo:
        cmp     w1, 0
        add     x1, x0, 15
        and     x1, x1, -16
        csel    x0, x1, x0, ne
        ret
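
For readers less familiar with AArch64, the sequence above behaves roughly
like the branch-free C below. This is only an illustrative sketch and not
part of the original report; the name foo_csel_model is hypothetical. It
shows that the add and the and run on every call, and that only the final
select depends on c.

```c
/* Hypothetical C model of the add/and/csel sequence (not from the report). */
long foo_csel_model(long x, int c)
{
    long rounded = (x + 15) & ~15L;  /* add + and: executed unconditionally */
    return c != 0 ? rounded : x;     /* csel x0, x1, x0, ne */
}
```

With c == 0 the result is just x, yet it still has to wait for the rounded
value to be computed before the select can complete.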
I think it would be better to use a branch when the user has
explicitly marked the computation as cold.
For example, this version is faster if c is always 0 (the empty asm is only
there to stop the conversion to csel):
long foo(long x, int c)
{
    if (__builtin_expect(c,0)) {
        asm ("");
        x = (x + 15) & ~15;
    }
    return x;
}

foo:
        cbnz    w1, .L7
        ret
.L7:
        add     x0, x0, 15
        and     x0, x0, -16
        ret
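
One way to compare the two variants is a loop in which each call consumes
the previous result, so that any extra latency on the c == 0 path adds up.
The following is a minimal sketch under stated assumptions, not the
reporter's benchmark: the names foo_plain, foo_barrier and time_chain are
hypothetical, and noinline plus a volatile c are used only to keep the
compiler from folding the calls away.

```c
#include <time.h>

/* Variant 1 from the report: GCC may if-convert this into add/and/csel. */
__attribute__((noinline)) long foo_plain(long x, int c)
{
    if (__builtin_expect(c, 0))
        x = (x + 15) & ~15;
    return x;
}

/* Variant 2 from the report: the empty asm keeps the cold path in a branch. */
__attribute__((noinline)) long foo_barrier(long x, int c)
{
    if (__builtin_expect(c, 0)) {
        __asm__ ("");
        x = (x + 15) & ~15;
    }
    return x;
}

typedef long (*foo_fn)(long, int);

/* Hypothetical helper: run `iters` dependent calls with c == 0 and return
   the elapsed CPU time in seconds; the final value is stored in *result. */
double time_chain(foo_fn f, long iters, long *result)
{
    volatile int c = 0;   /* read on every iteration so c stays opaque */
    long x = 1;
    clock_t start = clock();
    for (long i = 0; i < iters; i++)
        x = f(x, c) + 1;  /* next input depends on the previous output */
    clock_t end = clock();
    *result = x;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling time_chain(foo_plain, n, &r) and time_chain(foo_barrier, n, &r)
with a large n gives two timings to compare. No measurements are claimed
here, and the call overhead may hide part of the difference.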
* [Bug target/99551] aarch64: csel is used for cold scalar computation which affects performance
From: rearnsha at gcc dot gnu.org @ 2021-03-11 16:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99551
--- Comment #1 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
This is probably done by one of the if-conversion passes.
* [Bug rtl-optimization/99551] aarch64: csel is used for cold scalar computation which affects performance
From: pinskia at gcc dot gnu.org @ 2021-12-23 21:31 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99551
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
if-conversion succeeded through noce_try_cmove_arith
Removing jump 8.
deleting insn with uid = 8.
deleting insn with uid = 11.
deleting insn with uid = 10.
deleting block 3
Merging block 4 into block 2...
changing bb of uid 13
changing bb of uid 18
from 4 to 2
changing bb of uid 19
from 4 to 2
Merged blocks 2 and 4.
Conversion succeeded on pass 1.