From: "peter at cordes dot ca" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/105904] New: Predicated mov r0, #1 with opposite conditions could be hoisted, between 1 and 1<<n in opposite sides of a branch
Date: Thu, 09 Jun 2022 07:33:26 +0000
Message-ID: <bug-105904-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105904

            Bug ID: 105904
           Summary: Predicated mov r0, #1 with opposite conditions could
                    be hoisted, between 1 and 1<<n in opposite sides of a
                    branch
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: peter at cordes dot ca
  Target Milestone: ---
            Target: arm-*-*

#include <bit>  // using the libstdc++ header
unsigned roundup(unsigned x){
    return std::bit_ceil(x);
}

https://godbolt.org/z/Px1fvWaex

GCC's version is somewhat clunky, with a MOV r0, #1 on each "side" of the
if-converted branch:

roundup(unsigned int):
        cmp     r0, #1
        itttt   hi
        addhi   r3, r0, #-1
        movhi   r0, #1            @@ here
        clzhi   r3, r3
        rsbhi   r3, r3, #32
        ite     hi
        lslhi   r0, r0, r3
        movls   r0, #1            @@ here
        bx      lr

Even without spotting the other optimizations that clang finds, we can combine
these into a single unconditional MOV r0, #1.  It must not set flags, because
the later IT block still needs the result of the CMP, so in Thumb-2 it requires
the 4-byte encoding rather than the 2-byte MOVS.  Still, it's one fewer
instruction to execute.

This is not totally trivial: it requires seeing that the MOV can be hoisted
above the conditional LSL that consumes it.  So it's really a matter of
recognizing the common constant 1 in 1<<n and 1 on opposite sides of an
if-converted branch.

        cmp     r0, #1
        ittt    hi
        addhi   r3, r0, #-1
        clzhi   r3, r3
        rsbhi   r3, r3, #32
        mov     r0, #1            @@ now unconditional
        it      hi
        lslhi   r0, r0, r3
        bx      lr
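
For reference, a source-level sketch of the same transformation.  The function
names are made up for illustration, __builtin_clz is the GCC/clang builtin, and
the sketch assumes 32-bit unsigned and x <= 0x80000000 (larger inputs would
shift by 32, which is undefined in C++ even though ARM's LSL handles it):

```cpp
static_assert(sizeof(unsigned) == 4, "sketch assumes 32-bit unsigned");

// Shape of GCC's current output: the constant 1 is materialized on both paths.
unsigned roundup_both_sides(unsigned x) {
    unsigned r;
    if (x > 1) {                                  // HI after cmp r0, #1
        unsigned n = 32 - __builtin_clz(x - 1);   // add / clz / rsb
        r = 1;                                    // movhi r0, #1
        r <<= n;                                  // lslhi r0, r0, r3
    } else {
        r = 1;                                    // movls r0, #1
    }
    return r;
}

// Shape of the proposed output: one unconditional 1, only the shift is guarded.
unsigned roundup_hoisted(unsigned x) {
    unsigned r = 1;                               // mov r0, #1
    if (x > 1)
        r <<= 32 - __builtin_clz(x - 1);          // lslhi r0, r0, r3
    return r;
}
```

Both functions return the same value for every input in the stated range; the
second just makes it visible that the 1 is common to both arms.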



clang makes rather nice asm for ARMv7 -mcpu=cortex-a53, as discussed in
PR104773, which covers a different missed optimization in the same asm.

roundup(unsigned int):                @@ clang's version.
        subs    r0, r0, #1
        clz     r0, r0
        rsb     r1, r0, #32         @ 32-clz
        mov     r0, #1
        lslhi   r0, r0, r1          @ using flags set by SUBS
        bx      lr                  @ 1<<(32-clz) or just 1

Folding the mov r0, #1 from both sides is only a couple of steps away from
making the clz and rsb unconditional and keeping only the LSL conditional.
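
A rough C++ model of why executing the clz and rsb unconditionally is harmless.
arm_clz is a hypothetical helper standing in for the ARM CLZ instruction, which
(unlike __builtin_clz) is defined for a zero input and returns 32; as above,
this assumes 32-bit unsigned and x <= 0x80000000:

```cpp
static_assert(sizeof(unsigned) == 4, "sketch assumes 32-bit unsigned");

// Hypothetical helper modelling ARM CLZ: defined for 0, where it yields 32.
static unsigned arm_clz(unsigned v) {
    return v ? static_cast<unsigned>(__builtin_clz(v)) : 32u;
}

// Mirrors the clang sequence: everything but the shift runs unconditionally.
unsigned roundup_speculated(unsigned x) {
    unsigned d = x - 1;            // subs r0, r0, #1 (wraps to 0xFFFFFFFF for x == 0)
    unsigned n = 32 - arm_clz(d);  // clz + rsb, computed even when unused
    unsigned r = 1;                // mov r0, #1
    if (x > 1)                     // HI after SUBS x, #1 means x > 1
        r <<= n;                   // lslhi r0, r0, r1
    return r;
}
```

For x == 0 and x == 1 the shift count comes out as 32 and 0 respectively, but
it is simply discarded, so the speculated work cannot change the result.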

Thread overview: 2+ messages
2022-06-09  7:33 peter at cordes dot ca [this message]
2023-05-17 20:48 ` [Bug rtl-optimization/105904] " pinskia at gcc dot gnu.org
