public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/111835] New: Suboptimal codegen: zero extended load instead of sign extended one
From: lis8215 at gmail dot com @ 2023-10-16 14:46 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111835
Bug ID: 111835
Summary: Suboptimal codegen: zero extended load instead of sign extended one
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: lis8215 at gmail dot com
Target Milestone: ---
In this simplified example:
#include <stdint.h>
int test (const uint8_t * src, uint8_t * dst)
{
  int8_t tmp = (int8_t)*src;
  *dst = tmp;
  return tmp;
}
GCC prefers a zero-extending load over the more sensible sign-extending load,
and then needs an explicit sign extension to produce the return value.
I know there are a lot of bugs related to zero/sign extension, but I suspect
this is a rare special case. It reproduces with every GCC version available on
godbolt and on every architecture except x86-64.
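To make the point above concrete, here is a minimal C sketch that models the two code sequences as plain functions. The names test_zext_then_sext, test_sext_load and check_equivalence are hypothetical and exist only for this illustration; they are not taken from GCC. The sketch only shows that both sequences store the same byte and return the same value for every input, so a sign-extending load would be a valid replacement. It says nothing about how any pass would perform the transformation.

```c
#include <assert.h>
#include <stdint.h>

/* Models the sequence described in the report: a zero-extending byte
   load, a byte store, then an explicit sign extension for the return. */
int test_zext_then_sext(const uint8_t *src, uint8_t *dst)
{
    uint32_t reg = *src;               /* zero-extending load: 0..255 */
    *dst = (uint8_t)reg;               /* store the low 8 bits */
    return (int)(reg ^ 0x80u) - 0x80;  /* portable 8-bit sign extension */
}

/* Models the preferred sequence: a sign-extending byte load, a byte
   store, and no further extension. */
int test_sext_load(const uint8_t *src, uint8_t *dst)
{
    int32_t reg = (int)(*src ^ 0x80) - 0x80;  /* sign-extending load */
    *dst = (uint8_t)reg;   /* conversion to unsigned keeps the low 8 bits */
    return reg;
}

/* Compares both models over every possible input byte. */
void check_equivalence(void)
{
    for (int v = 0; v < 256; v++) {
        uint8_t in = (uint8_t)v, out_a = 0, out_b = 0;
        int ret_a = test_zext_then_sext(&in, &out_a);
        int ret_b = test_sext_load(&in, &out_b);
        assert(ret_a == ret_b);
        assert(out_a == out_b);
    }
}
```

Calling check_equivalence() from a test harness exercises all 256 byte values. The byte store only looks at the low 8 bits of the register, which is why it does not care which kind of extension the load used.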
* [Bug rtl-optimization/111835] Suboptimal codegen: zero extended load instead of sign extended one
From: pinskia at gcc dot gnu.org @ 2023-10-16 17:35 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111835
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2023-10-16
          Component|middle-end                  |rtl-optimization
     Ever confirmed|0                           |1
           Severity|normal                      |enhancement
             Status|UNCONFIRMED                 |NEW
             Target|                            |aarch64
           Keywords|                            |missed-optimization
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
As you said, it depends on the target.
Most non-x86 targets define:
/* Define if loading from memory in MODE, an integral mode narrower than
BITS_PER_WORD will either zero-extend or sign-extend. The value of this
macro should be the code that says which one of the two operations is
implicitly done, or UNKNOWN if none. */
#define LOAD_EXTEND_OP(MODE) ZERO_EXTEND
As a result, the RTL is already in a form that confuses REE (the redundant
extension elimination pass) by the time it runs.
Before REE:
(insn 7 10 9 2 (set (reg:SI 0 x0 [orig:92 _1 ] [92])
(zero_extend:SI (mem:QI (reg:DI 0 x0 [99]) [0 *src_3(D)+0 S1 A8])))
"/app/example.cpp":4:39 146 {*zero_extendqisi2_aarch64}
(nil))
(insn 9 7 15 2 (set (mem:QI (reg:DI 1 x1 [100]) [0 *dst_5(D)+0 S1 A8])
(reg:QI 0 x0 [orig:92 _1 ] [92])) "/app/example.cpp":5:10 62
{*movqi_aarch64}
(nil))
(insn 15 9 16 2 (set (reg/i:SI 0 x0)
(sign_extend:SI (reg:QI 0 x0 [orig:92 _1 ] [92])))
"/app/example.cpp":7:1 142 {*extendqisi2_aarch64}
(nil))
This means that REE does not eliminate the sign extension.
Note that on x86 we get the following before REE:
(insn 7 4 8 2 (set (reg:QI 0 ax [orig:98 _1 ] [98])
(mem:QI (reg:DI 5 di [104]) [0 *src_3(D)+0 S1 A8]))
"/app/example.cpp":4:39 93 {*movqi_internal}
(nil))
(insn 8 7 9 2 (set (mem:QI (reg:DI 4 si [105]) [0 *dst_5(D)+0 S1 A8])
(reg:QI 0 ax [orig:98 _1 ] [98])) "/app/example.cpp":5:10 93
{*movqi_internal}
(nil))
(insn 9 8 15 2 (set (reg:SI 0 ax [orig:103 _1 ] [103])
(sign_extend:SI (reg:QI 0 ax [orig:98 _1 ] [98])))
"/app/example.cpp":6:12 183 {extendqisi2}
(nil))
So REE is able to move that sign extension back into the original load.
* [Bug rtl-optimization/111835] Suboptimal codegen: zero extended load instead of sign extended one
From: pinskia at gcc dot gnu.org @ 2023-10-31 22:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111835
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |DUPLICATE
             Status|NEW                         |RESOLVED
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Dup of bug 104387.
*** This bug has been marked as a duplicate of bug 104387 ***
* [Bug rtl-optimization/111835] Suboptimal codegen: zero extended load instead of sign extended one
From: lis8215 at gmail dot com @ 2023-11-01 6:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111835
--- Comment #3 from Siarhei Volkau <lis8215 at gmail dot com> ---
I don't think this is a duplicate of bug 104387, because there is only one
store here.
Also, this bug simply disappears if we change the source code a bit, e.g.:
- change (int8_t)*src; to *(int8_t*)src;
- or change the argument uint8_t * dst to int8_t * dst
But with multiple stores, the extension remains in every case.
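For reference, the original function and the two source-level variations mentioned in this comment can be written out side by side as below. The function names test_orig, test_ptr_cast, test_signed_dst and check_variants are hypothetical, chosen only to tell the variants apart. This is a sketch for comparing the generated code in a compiler explorer, not a statement about what any particular GCC version emits.

```c
#include <assert.h>
#include <stdint.h>

/* Original form from the report: the value is cast after an
   unsigned byte load. */
int test_orig(const uint8_t *src, uint8_t *dst)
{
    int8_t tmp = (int8_t)*src;
    *dst = tmp;
    return tmp;
}

/* Variation 1: cast the pointer instead, so the load itself reads a
   signed byte. */
int test_ptr_cast(const uint8_t *src, uint8_t *dst)
{
    int8_t tmp = *(const int8_t *)src;
    *dst = tmp;
    return tmp;
}

/* Variation 2: keep the value cast but make the destination pointer
   signed. */
int test_signed_dst(const uint8_t *src, int8_t *dst)
{
    int8_t tmp = (int8_t)*src;
    *dst = tmp;
    return tmp;
}

/* All three forms agree for every input byte. Values above 127 rely
   on the out-of-range conversion to int8_t wrapping modulo 256, which
   is implementation-defined in C but is what GCC documents. */
void check_variants(void)
{
    for (int v = 0; v < 256; v++) {
        uint8_t in = (uint8_t)v, out_a = 0, out_b = 0;
        int8_t out_c = 0;
        int ret_a = test_orig(&in, &out_a);
        int ret_b = test_ptr_cast(&in, &out_b);
        int ret_c = test_signed_dst(&in, &out_c);
        assert(ret_a == ret_b && ret_b == ret_c);
        assert(out_a == out_b && out_b == (uint8_t)out_c);
    }
}
```

Since the three functions are semantically identical, any difference in the emitted load and extension instructions comes from how the source is expressed rather than from what it computes.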
* [Bug rtl-optimization/111835] Suboptimal codegen: zero extended load instead of sign extended one
From: pinskia at gcc dot gnu.org @ 2023-11-01 15:40 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111835
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Siarhei Volkau from comment #3)
> I don't think this is a duplicate of bug 104387, because there is only one
> store here.
> Also, this bug simply disappears if we change the source code a bit, e.g.:
> - change (int8_t)*src; to *(int8_t*)src;
> - or change the argument uint8_t * dst to int8_t * dst
>
> But with multiple stores, the extension remains in every case.
There is one store, but two uses of the loaded value.