public inbox for gcc-bugs@sourceware.org
* [Bug target/99551] New: aarch64: csel is used for cold scalar computation which affects performance
@ 2021-03-11 13:19 nsz at gcc dot gnu.org
  2021-03-11 16:36 ` [Bug target/99551] " rearnsha at gcc dot gnu.org
  2021-12-23 21:31 ` [Bug rtl-optimization/99551] " pinskia at gcc dot gnu.org
  0 siblings, 2 replies; 3+ messages in thread
From: nsz at gcc dot gnu.org @ 2021-03-11 13:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99551

            Bug ID: 99551
           Summary: aarch64: csel is used for cold scalar computation
                    which affects performance
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: nsz at gcc dot gnu.org
  Target Milestone: ---

This is an optimization bug. I don't know at which layer it should
be fixed, so I am reporting it as a target bug.

The cold path affects the performance of the hot code because csel is used,
so the cold computation is executed unconditionally:

long foo(long x, int c)
{
    if (__builtin_expect(c,0))
        x = (x + 15) & ~15;
    return x;
}


compiles to

foo:
        cmp     w1, 0
        add     x1, x0, 15
        and     x1, x1, -16
        csel    x0, x1, x0, ne
        ret
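
For readers less familiar with A64, the sequence above can be pictured at the C level roughly as follows. This is only an illustrative sketch: foo_select is a hypothetical name that does not appear in the report, and unsigned arithmetic is used so that the unconditional add wraps like the machine instruction instead of overflowing.

```c
/* Hypothetical C-level picture of the cmp/add/and/csel sequence.
   foo_select is an illustrative name, not part of the report. */
long foo_select(long x, int c)
{
    /* add + and: executed on every call, even when c == 0.
       Unsigned arithmetic is an assumption made here so the add wraps. */
    unsigned long rounded = ((unsigned long)x + 15UL) & ~15UL;

    /* cmp + csel: both candidate values already exist, pick one. */
    return c != 0 ? (long)rounded : x;
}
```

With c == 0 the function returns x unchanged, yet the rounding work has still been done; for example foo_select(17, 1) rounds up to 32 while foo_select(17, 0) stays 17.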

I think it would be better to use a branch if the user
explicitly marked the computation cold.
E.g. this variant, where an empty asm in the cold block results in a
branch, is faster if c is always 0:

long foo(long x, int c)
{
    if (__builtin_expect(c,0)) {
        asm ("");
        x = (x + 15) & ~15;
    }
    return x;
}

foo:
        cbnz    w1, .L7
        ret
.L7:
        add     x0, x0, 15
        and     x0, x0, -16
        ret
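
The two source forms differ only in the generated code, not in their results. A minimal sketch of a check for that, assuming a GCC-compatible compiler (for __builtin_expect and the empty asm): the names foo_plain, foo_barrier and forms_agree are hypothetical and were chosen here only so that both variants fit in one file.

```c
/* The two source forms from the report, renamed (hypothetical names)
   so that both can live in one translation unit. */
long foo_plain(long x, int c)
{
    if (__builtin_expect(c, 0))
        x = (x + 15) & ~15;
    return x;
}

long foo_barrier(long x, int c)
{
    if (__builtin_expect(c, 0)) {
        __asm__ ("");          /* empty asm: emits no instructions */
        x = (x + 15) & ~15;
    }
    return x;
}

/* Returns 1 if both forms agree for every x in [lo, hi] and c in {0, 1}. */
int forms_agree(long lo, long hi)
{
    for (long x = lo; x <= hi; x++)
        for (int c = 0; c <= 1; c++)
            if (foo_plain(x, c) != foo_barrier(x, c))
                return 0;
    return 1;
}
```

Calling forms_agree(-64, 64) is expected to return 1, so any timing difference between the two variants comes from the csel versus branch code generation and not from a change in behaviour.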

* [Bug target/99551] aarch64: csel is used for cold scalar computation which affects performance
  2021-03-11 13:19 [Bug target/99551] New: aarch64: csel is used for cold scalar computation which affects performance nsz at gcc dot gnu.org
@ 2021-03-11 16:36 ` rearnsha at gcc dot gnu.org
  2021-12-23 21:31 ` [Bug rtl-optimization/99551] " pinskia at gcc dot gnu.org
  1 sibling, 0 replies; 3+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2021-03-11 16:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99551

--- Comment #1 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
Probably one of the if-conversion passes.

* [Bug rtl-optimization/99551] aarch64: csel is used for cold scalar computation which affects performance
  2021-03-11 13:19 [Bug target/99551] New: aarch64: csel is used for cold scalar computation which affects performance nsz at gcc dot gnu.org
  2021-03-11 16:36 ` [Bug target/99551] " rearnsha at gcc dot gnu.org
@ 2021-12-23 21:31 ` pinskia at gcc dot gnu.org
  1 sibling, 0 replies; 3+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-12-23 21:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99551

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
if-conversion succeeded through noce_try_cmove_arith
Removing jump 8.
deleting insn with uid = 8.
deleting insn with uid = 11.
deleting insn with uid = 10.
deleting block 3
Merging block 4 into block 2...
changing bb of uid 13
changing bb of uid 18
  from 4 to 2
changing bb of uid 19
  from 4 to 2
Merged blocks 2 and 4.
Conversion succeeded on pass 1.
