public inbox for gcc-bugs@sourceware.org
From: "peter at cordes dot ca" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/105904] New: Predicated mov r0, #1 with opposite conditions could be hoisted, between 1 and 1<<n in opposite sides of a branch
Date: Thu, 09 Jun 2022 07:33:26 +0000
Message-ID: <bug-105904-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105904

            Bug ID: 105904
           Summary: Predicated mov r0, #1 with opposite conditions could be
                    hoisted, between 1 and 1<<n in opposite sides of a branch
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: peter at cordes dot ca
  Target Milestone: ---
            Target: arm-*-*

#include <bit>    // using the libstdc++ header
unsigned roundup(unsigned x){
    return std::bit_ceil(x);
}

https://godbolt.org/z/Px1fvWaex

GCC's version is somewhat clunky, including MOV r0, #1 in either "side":

roundup(unsigned int):
        cmp     r0, #1
        itttt   hi
        addhi   r3, r0, #-1
        movhi   r0, #1        @@ here
        clzhi   r3, r3
        rsbhi   r3, r3, #32
        ite     hi
        lslhi   r0, r0, r3
        movls   r0, #1        @@ here
        bx      lr

Even without spotting the other optimizations that clang finds, we can combine
to a single unconditional MOV r0, #1. But only if we avoid setting flags, so
it requires a 4-byte encoding, not MOVS. Still, it's one fewer instruction to
execute.

This is not totally trivial: it requires seeing that we can move it across the
conditional LSL. So it's really a matter of folding the 1s between 1<<n and 1
in opposite sides of an if-converted branch.

        cmp     r0, #1
        ittt    hi
        addhi   r3, r0, #-1
        clzhi   r3, r3
        rsbhi   r3, r3, #32
        mov     r0, #1        @@ now unconditional
        it      hi
        lslhi   r0, r0, r3
        bx      lr

clang makes rather nice asm for ARMv7 -mcpu=cortex-a53 as discussed in
PR104773 which covers a different missed optimization in the same asm.

roundup(unsigned int):        @@ clang's version.
        subs    r0, r0, #1
        clz     r0, r0
        rsb     r1, r0, #32   @ 32-clz
        mov     r0, #1
        lslhi   r0, r0, r1    @ using flags set by SUBS
        bx      lr            @ 1<<(32-clz) or just 1

Folding the mov r0, #1 from either side is only a couple steps away from
making the clz and rsb unconditional, and keeping only the LSL conditional.
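The folding described above can also be sketched at the source level. This is only an illustration under stated assumptions, not GCC's internal representation or libstdc++'s implementation: the names roundup_branchy and roundup_hoisted are hypothetical, __builtin_clz is the GCC/clang builtin, and unsigned is assumed to be 32 bits wide. The first function has the constant 1 on both sides of the branch, like GCC's output; the second materializes it once and keeps only the shift conditional.

```cpp
#include <cassert>

static_assert(sizeof(unsigned) == 4, "sketch assumes 32-bit unsigned");

// Shape of GCC's output: the constant 1 is produced on both sides of the branch.
// Precondition (as for std::bit_ceil): the result must be representable,
// so x must not exceed 0x80000000.
unsigned roundup_branchy(unsigned x) {
    assert(x <= 0x80000000u);
    if (x > 1) {
        unsigned n = 32 - __builtin_clz(x - 1);  // x - 1 >= 1, so clz is defined
        return 1u << n;                          // "hi" side: 1 << n
    }
    return 1u;                                   // "ls" side: plain 1
}

// Shape after the proposed folding: the 1 is hoisted above the branch and
// only the shift stays conditional.
unsigned roundup_hoisted(unsigned x) {
    assert(x <= 0x80000000u);
    unsigned r = 1u;                             // unconditional, like mov r0, #1
    if (x > 1) {
        r <<= 32 - __builtin_clz(x - 1);         // conditional, like lslhi
    }
    return r;
}
```

Both functions compute the same value for every input that satisfies the precondition; the only difference is where the constant is materialized.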
Thread overview: 2+ messages

2022-06-09  7:33 peter at cordes dot ca [this message]
2023-05-17 20:48 ` [Bug rtl-optimization/105904] " pinskia at gcc dot gnu.org