public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/111835] New: Suboptimal codegen: zero extended load instead of sign extended one
From: lis8215 at gmail dot com @ 2023-10-16 14:46 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111835

            Bug ID: 111835
           Summary: Suboptimal codegen: zero extended load instead of sign
                    extended one
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: lis8215 at gmail dot com
  Target Milestone: ---

In this simplified example:

int test (const uint8_t * src, uint8_t * dst)
{
    int8_t tmp = (int8_t)*src;
    *dst = tmp;
    return tmp;
}

GCC prefers a zero-extending load over the more natural sign-extending load.
It then needs an explicit sign extension to produce the return value.

I know there are a lot of bugs related to zero/sign extension, but I guess this
is a rare special case. It reproduces with every GCC version available on
godbolt and on every architecture except x86-64.
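To make the expected behaviour of the reported function concrete, here is a minimal, self-contained sketch. The function `test` is taken from the report; the includes and the helper `check_byte` are additions for illustration only and are not part of the bug report. The sketch assumes GCC's behaviour of wrapping modulo 256 when an out-of-range value is converted to `int8_t` (implementation-defined before C23). It says nothing about which instructions the compiler picks; it only pins down the values that any correct code sequence must produce.

```c
#include <assert.h>
#include <stdint.h>

/* The function from the report, unchanged apart from the includes. */
int test(const uint8_t *src, uint8_t *dst)
{
    int8_t tmp = (int8_t)*src;  /* assumed to wrap modulo 256, as GCC does */
    *dst = tmp;                 /* the store only needs the low 8 bits */
    return tmp;                 /* promotion to int needs the sign-extended value */
}

/* Hypothetical helper (not from the report): run test() on one byte and
   report whether the return value matches the expected signed value. */
int check_byte(uint8_t in, int expected)
{
    uint8_t out = 0;
    int ret = test(&in, &out);
    assert(out == in);          /* the stored byte is bit-identical to the input */
    return ret == expected;
}
```

For example, `check_byte(0x80, -128)` and `check_byte(0xff, -1)` should both return 1: the store sees the raw byte, while the return value is its sign-extended form. A single sign-extending load can serve both uses.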
* [Bug rtl-optimization/111835] Suboptimal codegen: zero extended load instead of sign extended one
From: pinskia at gcc dot gnu.org @ 2023-10-16 17:35 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111835

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2023-10-16
          Component|middle-end                  |rtl-optimization
     Ever confirmed|0                           |1
           Severity|normal                      |enhancement
             Status|UNCONFIRMED                 |NEW
             Target|                            |aarch64
           Keywords|                            |missed-optimization

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
So as you said, it depends on the target. Most non-x86 targets have:

/* Define if loading from memory in MODE, an integral mode narrower than
   BITS_PER_WORD will either zero-extend or sign-extend.  The value of this
   macro should be the code that says which one of the two operations is
   implicitly done, or UNKNOWN if none.  */
#define LOAD_EXTEND_OP(MODE) ZERO_EXTEND

defined, which causes REE to be confused beforehand.

Before REE:

(insn 7 10 9 2 (set (reg:SI 0 x0 [orig:92 _1 ] [92])
        (zero_extend:SI (mem:QI (reg:DI 0 x0 [99]) [0 *src_3(D)+0 S1 A8])))
"/app/example.cpp":4:39 146 {*zero_extendqisi2_aarch64}
     (nil))
(insn 9 7 15 2 (set (mem:QI (reg:DI 1 x1 [100]) [0 *dst_5(D)+0 S1 A8])
        (reg:QI 0 x0 [orig:92 _1 ] [92])) "/app/example.cpp":5:10 62
{*movqi_aarch64}
     (nil))
(insn 15 9 16 2 (set (reg/i:SI 0 x0)
        (sign_extend:SI (reg:QI 0 x0 [orig:92 _1 ] [92])))
"/app/example.cpp":7:1 142 {*extendqisi2_aarch64}
     (nil))

Which means that REE does not eliminate it.

Note that on x86 we get, before REE:

(insn 7 4 8 2 (set (reg:QI 0 ax [orig:98 _1 ] [98])
        (mem:QI (reg:DI 5 di [104]) [0 *src_3(D)+0 S1 A8]))
"/app/example.cpp":4:39 93 {*movqi_internal}
     (nil))
(insn 8 7 9 2 (set (mem:QI (reg:DI 4 si [105]) [0 *dst_5(D)+0 S1 A8])
        (reg:QI 0 ax [orig:98 _1 ] [98])) "/app/example.cpp":5:10 93
{*movqi_internal}
     (nil))
(insn 9 8 15 2 (set (reg:SI 0 ax [orig:103 _1 ] [103])
        (sign_extend:SI (reg:QI 0 ax [orig:98 _1 ] [98])))
"/app/example.cpp":6:12 183 {extendqisi2}
     (nil))

So REE is able to move that sign extend back to the original load.
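The two RTL shapes above can be modelled in plain C to see why folding the sign extension into the load would be safe here. This is only a sketch: the function names are made up for illustration, the comments map each statement to the insn it is meant to mimic, and the model again assumes GCC's modulo-256 wrapping on conversion to `int8_t`. It is not how REE itself works; it only checks that both sequences agree for every possible input byte.

```c
#include <stdint.h>

/* Model of the aarch64 sequence before REE (illustrative name). */
int32_t model_zext_then_sext(const uint8_t *src, uint8_t *dst)
{
    uint32_t x0 = *src;                    /* insn 7: zero_extend:SI (mem:QI) */
    *dst = (uint8_t)x0;                    /* insn 9: store the low QI part */
    return (int32_t)(int8_t)(uint8_t)x0;   /* insn 15: sign_extend:SI (reg:QI) */
}

/* Model of the desired sequence: a single sign-extending load. */
int32_t model_sext_load(const uint8_t *src, uint8_t *dst)
{
    int32_t x0 = (int8_t)*src;             /* sign_extend:SI (mem:QI) */
    *dst = (uint8_t)x0;                    /* low 8 bits are the same either way */
    return x0;                             /* no further extension needed */
}

/* Compare both models over all 256 byte values; returns the number of
   inputs where the return value or the stored byte differs. */
int count_mismatches(void)
{
    int bad = 0;
    for (int v = 0; v < 256; v++) {
        uint8_t in = (uint8_t)v, d1 = 0, d2 = 0;
        int32_t r1 = model_zext_then_sext(&in, &d1);
        int32_t r2 = model_sext_load(&in, &d2);
        if (r1 != r2 || d1 != d2 || d1 != in)
            bad++;
    }
    return bad;
}
```

`count_mismatches()` returning 0 reflects the point of the report: the byte store only consumes the low 8 bits, so it does not care which extension the load performed, and the later sign extension becomes redundant once the load itself sign-extends.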
* [Bug rtl-optimization/111835] Suboptimal codegen: zero extended load instead of sign extended one
From: pinskia at gcc dot gnu.org @ 2023-10-31 22:07 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111835

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |DUPLICATE
             Status|NEW                         |RESOLVED

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Dup of bug 104387.

*** This bug has been marked as a duplicate of bug 104387 ***
* [Bug rtl-optimization/111835] Suboptimal codegen: zero extended load instead of sign extended one
From: lis8215 at gmail dot com @ 2023-11-01 6:36 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111835

--- Comment #3 from Siarhei Volkau <lis8215 at gmail dot com> ---
I don't think this is a duplicate of bug 104387, because there's only one
store.
Also, this bug simply disappears if we change the source code a bit, e.g.:
 - change (int8_t)*src; to *(int8_t*)src;
 - or change the argument uint8_t * dst to int8_t * dst

But if we have multiple stores, the extension remains under any conditions.
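For reference, the two source-level changes mentioned in comment #3 can be written out next to the original as follows. This is a sketch only: the function names are invented here, `const` is kept on the pointer cast (the comment writes `*(int8_t*)src`), and the helper `variants_agree` is hypothetical. The code only confirms that all three forms compute the same values; whether they avoid the extra extension on a given target is the reporter's observation and is not verified here.

```c
#include <stdint.h>

/* Original form from the report. */
int test_orig(const uint8_t *src, uint8_t *dst)
{
    int8_t tmp = (int8_t)*src;          /* load as unsigned, then convert */
    *dst = tmp;
    return tmp;
}

/* Variant 1: load through a signed pointer instead of converting the value. */
int test_cast_ptr(const uint8_t *src, uint8_t *dst)
{
    int8_t tmp = *(const int8_t *)src;  /* signed/unsigned char aliasing is allowed */
    *dst = tmp;
    return tmp;
}

/* Variant 2: the destination pointer is int8_t * rather than uint8_t *. */
int test_signed_dst(const uint8_t *src, int8_t *dst)
{
    int8_t tmp = (int8_t)*src;
    *dst = tmp;                         /* no conversion at the store */
    return tmp;
}

/* Hypothetical helper: returns 1 if all three forms agree on every byte. */
int variants_agree(void)
{
    for (int v = 0; v < 256; v++) {
        uint8_t in = (uint8_t)v, d0 = 0, d1 = 0;
        int8_t d2 = 0;
        int r0 = test_orig(&in, &d0);
        int r1 = test_cast_ptr(&in, &d1);
        int r2 = test_signed_dst(&in, &d2);
        if (r0 != r1 || r0 != r2 || d0 != d1 || d0 != (uint8_t)d2)
            return 0;
    }
    return 1;
}
```

Since the three forms are semantically equivalent, any difference in the generated code comes from how the types reach the RTL passes, not from the values being computed.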
* [Bug rtl-optimization/111835] Suboptimal codegen: zero extended load instead of sign extended one
From: pinskia at gcc dot gnu.org @ 2023-11-01 15:40 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111835

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Siarhei Volkau from comment #3)
> I don't think that it is duplicate of the bug 104387 because there's only
> one store.
> And this bug is simply disappears if we change the source code a bit.
> e.g.
>  - change (int8_t)*src; to *(int8_t*)src;
> or change argument uint8_t * dst to int8_t * dst
>
> But if we have multiple stores, extension will remain in any condition.

One store but 2 uses.
Thread overview: 5+ messages
2023-10-16 14:46 [Bug middle-end/111835] New: Suboptimal codegen: zero extended load instead of sign extended one lis8215 at gmail dot com
2023-10-16 17:35 ` [Bug rtl-optimization/111835] " pinskia at gcc dot gnu.org
2023-10-31 22:07 ` pinskia at gcc dot gnu.org
2023-11-01  6:36 ` lis8215 at gmail dot com
2023-11-01 15:40 ` pinskia at gcc dot gnu.org