public inbox for gcc-bugs@sourceware.org
From: "jl1184 at duke dot edu" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/106277] New: missed-optimization: redundant movzx
Date: Wed, 13 Jul 2022 02:16:52 +0000
Message-ID: <bug-106277-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106277

            Bug ID: 106277
           Summary: missed-optimization: redundant movzx
           Product: gcc
           Version: 12.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jl1184 at duke dot edu
  Target Milestone: ---

I came across this when examining a loop that runs slower than I expected. It
involves explicit and implicit conversions between 8-bit and 32/64-bit values,
and as I looked through the generated assembly using the Godbolt compiler
explorer, I found lots of movzx instructions that don't seem to break a
dependency or play a role in correctness. On top of that, many of them use the
same register for source and destination, as in "movzx eax, al", which cannot
be mov-eliminated.

I then tried some simple examples on Godbolt with x86-64 GCC 12.1, and found
that this behavior is persistent and easily reproducible, even when I specify
"-march=skylake".

Here's an example:

    #include <stdint.h>

    int add2bytes(uint8_t* a, uint8_t* b) {
        return uint8_t(*a + *b);
    }

gcc -O3 gives:

    add2bytes(unsigned char*, unsigned char*):
            movzx   eax, BYTE PTR [rsi]
            add     al, BYTE PTR [rdi]
            movzx   eax, al
            ret

The first movzx here breaks the dependency on the old eax value, but what is
the second movzx doing? I don't think there's any dependency it can break, and
it shouldn't affect the result either.

I also asked this on Stack Overflow, and Peter Cordes has a great response
(https://stackoverflow.com/a/72953035/14730360) explaining how this extra movzx
is bad for the vast majority of x86-64 processors.

IMHO newer versions of GCC should give newer processors more weight in
performance tradeoffs. Probably -mtune=generic in a later GCC shouldn't care
about P6-family partial-register stalls.
In practice, very few people should still be running newly compiled software
on those CPUs.

Godbolt link with code for the examples: https://godbolt.org/z/4n6ezaav7

Here's another example, closer to what I was originally examining:

    int foo(uint8_t* a, uint8_t i, uint8_t j) {
        return a[a[i] | a[j]];
    }

gcc -O3 gives:

    foo(unsigned char*, unsigned char, unsigned char):
            movzx   esi, sil
            movzx   edx, dl
            movzx   eax, BYTE PTR [rdi+rsi]
            or      al, BYTE PTR [rdi+rdx]
            movzx   eax, al
            movzx   eax, BYTE PTR [rdi+rax]
            ret

As was discussed in the Stack Overflow post, the first two movzx instructions
should use a destination register different from the source, so that CPUs with
mov elimination can benefit from it.

The "movzx eax, al" just seems unnecessary. The upper bits of RAX should
already be cleared, and the dependency of RAX on the "or" is not something
that "movzx eax, al" can break. So I think it's better to just do
"movzx eax, byte ptr [rdi + rax]" right after the "or".

Or maybe even better, just use "mov al, byte ptr [rdi + rax]", since the upper
bits of EAX should already be clear at this point.
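The correctness half of the argument above (that the second movzx cannot change the value in EAX) can be checked exhaustively, since add2bytes has only 65536 possible input pairs. The sketch below is not part of the original report: the helper names eax_before_second_movzx, movzx_eax_al and check_second_movzx_is_noop are hypothetical, and the bit operations are a hand-written model of the three instructions rather than anything generated by GCC. It says nothing about performance or partial-register behavior on any particular CPU.

```cpp
#include <cassert>
#include <cstdint>

// The function from the report, unchanged.
int add2bytes(uint8_t* a, uint8_t* b) {
    return uint8_t(*a + *b);
}

// Hypothetical model of the first two instructions, tracking all 32 bits of EAX:
//   movzx eax, BYTE PTR [rsi]   ; EAX = zero-extended *b
//   add   al,  BYTE PTR [rdi]   ; only the low 8 bits (AL) are written
uint32_t eax_before_second_movzx(uint8_t a, uint8_t b) {
    uint32_t eax = b;  // movzx: bits 8..31 are zero
    // 8-bit add: the sum wraps modulo 256 and the carry does not reach bit 8.
    uint8_t al = static_cast<uint8_t>(static_cast<uint8_t>(eax) + a);
    eax = (eax & 0xFFFFFF00u) | al;  // write AL back, keep bits 8..31 as they were
    return eax;
}

// Hypothetical model of "movzx eax, al": keep the low byte, clear bits 8..31.
uint32_t movzx_eax_al(uint32_t eax) {
    return eax & 0xFFu;
}

// Asserts, for every pair of byte inputs, that the second movzx leaves EAX
// unchanged and that EAX already equals the value add2bytes returns.
void check_second_movzx_is_noop() {
    for (unsigned a = 0; a < 256; ++a) {
        for (unsigned b = 0; b < 256; ++b) {
            uint8_t x = static_cast<uint8_t>(a);
            uint8_t y = static_cast<uint8_t>(b);
            uint32_t eax = eax_before_second_movzx(x, y);
            assert(movzx_eax_al(eax) == eax);
            assert(static_cast<int>(eax) == add2bytes(&x, &y));
        }
    }
}
```

Calling check_second_movzx_is_noop() from a small main, in a build with assertions enabled, exercises all input pairs. Under this model the upper 24 bits of EAX are zero after the first movzx and an 8-bit add never touches them, which is the reasoning the report relies on.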
Thread overview: 3+ messages
2022-07-13  2:16 jl1184 at duke dot edu [this message]
2022-07-13 15:49 ` [Bug target/106277] " pinskia at gcc dot gnu.org
2022-07-13 16:06 ` amonakov at gcc dot gnu.org