public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/111835] New: Suboptimal codegen: zero extended load instead of sign extended one
@ 2023-10-16 14:46 lis8215 at gmail dot com
  2023-10-16 17:35 ` [Bug rtl-optimization/111835] " pinskia at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: lis8215 at gmail dot com @ 2023-10-16 14:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111835

            Bug ID: 111835
           Summary: Suboptimal codegen: zero extended load instead of sign
                    extended one
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: lis8215 at gmail dot com
  Target Milestone: ---

In this simplified example:

#include <stdint.h>

int test (const uint8_t * src, uint8_t * dst)
{
    int8_t tmp = (int8_t)*src;
    *dst = tmp;
    return tmp;
}

GCC prefers a zero-extending load over the more sensible sign-extending
load, and then needs an explicit sign extension to produce the return value.

I know there are a lot of bugs related to zero/sign extension, but I suspect
this is a rare special case. It reproduces with every GCC version available on
godbolt and on every architecture except x86-64.
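
For reference, a minimal sketch of what the function has to compute, written
as a self-checking C file. The helper name check_test_semantics is
hypothetical and not part of the report; test is the function from the
report. It assumes GCC's wrap-around behaviour for the out-of-range
uint8_t to int8_t conversion. The store only consumes the low 8 bits and the
return value is the sign-extended byte, so a single sign-extending byte load
(such as aarch64's ldrsb) would in principle serve both uses.

```c
#include <assert.h>
#include <stdint.h>

/* The function from the report. */
int test (const uint8_t * src, uint8_t * dst)
{
    int8_t tmp = (int8_t)*src;
    *dst = tmp;
    return tmp;
}

/* Hypothetical helper (not from the report): checks both uses of tmp
   for every possible input byte.  */
void check_test_semantics (void)
{
    for (int v = 0; v < 256; v++)
      {
        uint8_t in = (uint8_t) v;
        uint8_t out = 0;
        int ret = test (&in, &out);

        /* The store only needs the low 8 bits, so the byte is copied as-is.  */
        assert (out == in);
        /* The return value is the byte reinterpreted as signed and widened.  */
        assert (ret == (v < 128 ? v : v - 256));
      }
}
```

Calling check_test_semantics () from main exercises all 256 input values.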


* [Bug rtl-optimization/111835] Suboptimal codegen: zero extended load instead of sign extended one
From: pinskia at gcc dot gnu.org @ 2023-10-16 17:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111835

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2023-10-16
          Component|middle-end                  |rtl-optimization
     Ever confirmed|0                           |1
           Severity|normal                      |enhancement
             Status|UNCONFIRMED                 |NEW
             Target|                            |aarch64
           Keywords|                            |missed-optimization

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
So, as you said, it depends on the target.
Most non-x86 targets define:
/* Define if loading from memory in MODE, an integral mode narrower than
   BITS_PER_WORD will either zero-extend or sign-extend.  The value of this
   macro should be the code that says which one of the two operations is
   implicitly done, or UNKNOWN if none.  */
#define LOAD_EXTEND_OP(MODE) ZERO_EXTEND

As a result, the load is already a zero-extending one by the time REE runs,
which confuses REE.
Before REE:
(insn 7 10 9 2 (set (reg:SI 0 x0 [orig:92 _1 ] [92])
        (zero_extend:SI (mem:QI (reg:DI 0 x0 [99]) [0 *src_3(D)+0 S1 A8]))) "/app/example.cpp":4:39 146 {*zero_extendqisi2_aarch64}
     (nil))
(insn 9 7 15 2 (set (mem:QI (reg:DI 1 x1 [100]) [0 *dst_5(D)+0 S1 A8])
        (reg:QI 0 x0 [orig:92 _1 ] [92])) "/app/example.cpp":5:10 62 {*movqi_aarch64}
     (nil))
(insn 15 9 16 2 (set (reg/i:SI 0 x0)
        (sign_extend:SI (reg:QI 0 x0 [orig:92 _1 ] [92]))) "/app/example.cpp":7:1 142 {*extendqisi2_aarch64}
     (nil))

This means that REE does not eliminate the sign extension.


Note that on x86 we get the following before REE:
(insn 7 4 8 2 (set (reg:QI 0 ax [orig:98 _1 ] [98])
        (mem:QI (reg:DI 5 di [104]) [0 *src_3(D)+0 S1 A8])) "/app/example.cpp":4:39 93 {*movqi_internal}
     (nil))
(insn 8 7 9 2 (set (mem:QI (reg:DI 4 si [105]) [0 *dst_5(D)+0 S1 A8])
        (reg:QI 0 ax [orig:98 _1 ] [98])) "/app/example.cpp":5:10 93 {*movqi_internal}
     (nil))
(insn 9 8 15 2 (set (reg:SI 0 ax [orig:103 _1 ] [103])
        (sign_extend:SI (reg:QI 0 ax [orig:98 _1 ] [98]))) "/app/example.cpp":6:12 183 {extendqisi2}
     (nil))

Here the load is a plain QImode move with no extension attached, so REE is
able to move that sign extension back into the original load.
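
To illustrate why folding the sign extension into the load would be valid
here, a small sketch in plain C follows. It does not model RTL or the REE pass
itself; load_zero_extend, load_sign_extend and check_low_byte_agrees are
hypothetical names introduced only for this illustration, and the narrowing
casts rely on GCC's wrap-around conversion. It checks that the two kinds of
load agree in the low byte (all the QImode store reads) and that
sign-extending after a zero-extending load gives the same value as a
sign-extending load.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a zero-extending byte load.  */
uint32_t load_zero_extend (const uint8_t *p)
{
    return (uint32_t) *p;                     /* upper 24 bits are zero */
}

/* Hypothetical stand-in for a sign-extending byte load.  */
uint32_t load_sign_extend (const uint8_t *p)
{
    return (uint32_t) (int32_t) (int8_t) *p;  /* upper 24 bits copy bit 7 */
}

void check_low_byte_agrees (void)
{
    for (int v = 0; v < 256; v++)
      {
        uint8_t b = (uint8_t) v;
        uint32_t z = load_zero_extend (&b);
        uint32_t s = load_sign_extend (&b);

        /* A byte store reads only the low 8 bits: identical either way.  */
        assert ((uint8_t) z == (uint8_t) s);
        /* Sign-extending the zero-extended value afterwards yields exactly
           what the sign-extending load produced in one step.  */
        assert ((uint32_t) (int32_t) (int8_t) z == s);
      }
}
```

Both uses in the example (the byte store and the sign-extended return value)
are therefore satisfied by the sign-extended register.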


* [Bug rtl-optimization/111835] Suboptimal codegen: zero extended load instead of sign extended one
From: pinskia at gcc dot gnu.org @ 2023-10-31 22:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111835

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |DUPLICATE
             Status|NEW                         |RESOLVED

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Dup of bug 104387.

*** This bug has been marked as a duplicate of bug 104387 ***


* [Bug rtl-optimization/111835] Suboptimal codegen: zero extended load instead of sign extended one
From: lis8215 at gmail dot com @ 2023-11-01  6:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111835

--- Comment #3 from Siarhei Volkau <lis8215 at gmail dot com> ---
I don't think this is a duplicate of bug 104387, because there's only one
store here.
Also, this bug simply disappears if we change the source code a bit, e.g.:
 - change (int8_t)*src; to *(int8_t*)src;
 - or change the argument uint8_t * dst to int8_t * dst

But if we have multiple stores, the extension remains in every case.
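
Spelled out, the source variants mentioned above look roughly like this. The
function names and the check_variants_agree helper are hypothetical, chosen
only to tell the variants apart, and the pointer-cast variant keeps the const
qualifier, which the one-line description above omits. The sketch only
confirms that all three compute the same results; which of them gets the
sign-extending load has to be checked in the generated assembly.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: convert the loaded value.  */
int test_cast_value (const uint8_t * src, uint8_t * dst)
{
    int8_t tmp = (int8_t)*src;
    *dst = tmp;
    return tmp;
}

/* Variant 1: load through an int8_t pointer instead.  */
int test_cast_pointer (const uint8_t * src, uint8_t * dst)
{
    int8_t tmp = *(const int8_t *)src;
    *dst = tmp;
    return tmp;
}

/* Variant 2: the destination argument is int8_t *.  */
int test_signed_dst (const uint8_t * src, int8_t * dst)
{
    int8_t tmp = (int8_t)*src;
    *dst = tmp;
    return tmp;
}

/* Hypothetical helper: all variants must agree for every input byte.  */
void check_variants_agree (void)
{
    for (int v = 0; v < 256; v++)
      {
        uint8_t in = (uint8_t) v;
        uint8_t out1 = 0, out2 = 0;
        int8_t out3 = 0;

        int r1 = test_cast_value (&in, &out1);
        int r2 = test_cast_pointer (&in, &out2);
        int r3 = test_signed_dst (&in, &out3);

        assert (r1 == r2 && r2 == r3);
        assert (out1 == out2 && out1 == (uint8_t) out3);
      }
}
```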


* [Bug rtl-optimization/111835] Suboptimal codegen: zero extended load instead of sign extended one
From: pinskia at gcc dot gnu.org @ 2023-11-01 15:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111835

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Siarhei Volkau from comment #3)
> I don't think that it is duplicate of the bug 104387 because there's only
> one store.
> And this bug is simply disappears if we change the source code a bit.
> e.g.
>  - change (int8_t)*src; to *(int8_t*)src;
> or change argument uint8_t * dst to int8_t * dst
> 
> But if we have multiple stores, extension will remain in any condition.

One store but 2 uses.
