public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/64537] New: Aarch64 redundant sxth instruction gets generated
@ 2015-01-08 10:51 vekumar at gcc dot gnu.org
  2015-01-08 12:24 ` [Bug rtl-optimization/64537] " kugan at gcc dot gnu.org
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: vekumar at gcc dot gnu.org @ 2015-01-08 10:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64537

            Bug ID: 64537
           Summary: Aarch64 redundant sxth instruction gets generated
           Product: gcc
           Version: 5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vekumar at gcc dot gnu.org

For the test case below, a redundant sxth instruction gets generated.

int
adds_shift_ext (long long a, short b, int c)
{
  long long d = (a - ((long long)b << 3));

  if (d == 0)
    return a + c;
  else
    return b + d + c;
}


adds_shift_ext:
        sxth    w1, w1  // 3    *extendhisi2_aarch64/1  [length = 4] <==1
        subs    x3, x0, x1, sxth 3      // 11   *subs_extvdi_multp2     [length = 4] <==2
        beq     .L5     // 12   *condjump       [length = 4]
        add     w0, w1, w2      // 19   *addsi3_aarch64/2       [length = 4]
        add     w0, w0, w3      // 20   *addsi3_aarch64/2       [length = 4]
        ret     // 57   simple_return   [length = 4]
        .p2align 2
.L5:
        add     w0, w2, w0      // 14   *addsi3_aarch64/2       [length = 4]
        ret     // 55   simple_return   [length = 4]

The sxth instruction marked <==1 is not needed.



* [Bug rtl-optimization/64537] Aarch64 redundant sxth instruction gets generated
From: kugan at gcc dot gnu.org @ 2015-01-08 12:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64537

kugan at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kugan at gcc dot gnu.org

--- Comment #1 from kugan at gcc dot gnu.org ---
According to AAPCS64
(http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055c/IHI0055C_beta_aapcs64.pdf),
the unused parm register bits have "unspecified value". So I think it is
needed.



* [Bug rtl-optimization/64537] Aarch64 redundant sxth instruction gets generated
From: pinskia at gcc dot gnu.org @ 2015-01-08 12:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64537

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to kugan from comment #1)
> According to AAPCS64
> (http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055c/IHI0055C_beta_aapcs64.pdf),
> the unused parm register bits have "unspecified value". So I think it is
> needed.

It is not needed because the next instruction performs a sign extend itself,
and the other uses of the sign-extended result only use the lower 32 bits of
the register.



* [Bug rtl-optimization/64537] Aarch64 redundant sxth instruction gets generated
From: kugan at gcc dot gnu.org @ 2015-01-08 12:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64537

--- Comment #3 from kugan at gcc dot gnu.org ---
But isn't w1 passed with a 16-bit value (short b) here? Am I missing something
here?
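
One way to look at this question is to model the register contents in C. The
sketch below is only an illustration, not GCC code: the helper names
(sxth_model, sub_ext_model, add_plain_model, check_unspecified_bits) and the
sample bit patterns are assumptions made up for this example. It treats x1 as
a 64-bit image whose bits above bit 15 may hold arbitrary values, which is what
"unspecified value" in AAPCS64 allows for a short argument.

```c
#include <assert.h>
#include <stdint.h>

/* Model of sxth: sign-extend bits 0..15 of a register image to 64 bits.
   Written with wrapping unsigned arithmetic so that nothing depends on
   implementation-defined signed conversions. */
uint64_t sxth_model(uint64_t reg)
{
    uint64_t low = reg & 0xffffu;
    return (low ^ 0x8000u) - 0x8000u;   /* wraps modulo 2^64 */
}

/* Model of "subs x3, x0, x1, sxth 3": reads only bits 0..15 of x1. */
uint64_t sub_ext_model(uint64_t x0, uint64_t x1)
{
    return x0 - (sxth_model(x1) << 3);
}

/* Model of "add w0, w1, w2": reads bits 0..31 of x1. */
uint32_t add_plain_model(uint64_t x1, uint32_t w2)
{
    return (uint32_t)x1 + w2;
}

/* b == -2 passed in x1, once fully sign-extended and once with made-up
   values in the bits that the ABI leaves unspecified. */
void check_unspecified_bits(void)
{
    const uint64_t clean = 0xFFFFFFFFFFFFFFFEull;
    const uint64_t dirty = 0x12345678ABCDFFFEull;

    /* The extended-register subtract ignores the upper bits. */
    assert(sub_ext_model(100, clean) == sub_ext_model(100, dirty));

    /* The plain 32-bit add does not, unless b is extended first. */
    assert(add_plain_model(dirty, 5) != add_plain_model(clean, 5));
    assert(add_plain_model(sxth_model(dirty), 5) == add_plain_model(clean, 5));
}
```

Under this model the subtract with the sxth operand is unaffected by the upper
bits, while the plain add also reads bits 16..31, so that use still needs b to
be extended in some form.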



* [Bug rtl-optimization/64537] Aarch64 redundant sxth instruction gets generated
From: rearnsha at gcc dot gnu.org @ 2015-01-08 14:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64537

--- Comment #4 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
b is used twice, once shifted left by 3 and once directly.

We could write this as

        subs    x3, x0, x1, sxth 3 
        beq     .L5
        add     w0, w2, w1, sxth          <= Now extended
        add     w0, w0, w3
        ret
        .p2align 2
.L5:
        add     w0, w2, w0
        ret

which in this specific case would perhaps be more efficient, but in practice
it's quite hard to get this sort of multiple-use right.

I think this is a special case, however, of the more common 'un-cse' type of
problem, where multiple uses of an extended (or shifted) value are always
commoned up.

Note that modern CPUs may take an extra cycle to perform an ALU-with-shift type
operation, eliminating the benefit of sinking multiple uses down into the ALU
operations themselves.
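
The equivalence of the two sequences can be checked with a small C model. This
is a sketch under stated assumptions, not anything taken from GCC: registers
are modelled as unsigned integers with wraparound, and the function names
(sxth16, seq_reported, seq_folded, check_sequences) are hypothetical.
seq_reported follows the assembly from the original report; seq_folded follows
the sequence suggested above, with the extension folded into the add.

```c
#include <assert.h>
#include <stdint.h>

/* sxth: sign-extend bits 0..15 to 64 bits, using wrapping unsigned math. */
uint64_t sxth16(uint64_t reg)
{
    uint64_t low = reg & 0xffffu;
    return (low ^ 0x8000u) - 0x8000u;
}

/* Sequence from the report: a separate sxth, then plain adds. */
uint32_t seq_reported(uint64_t x0, uint64_t x1, uint32_t w2)
{
    uint32_t w1 = (uint32_t)sxth16(x1);        /* sxth w1, w1             */
    uint64_t x3 = x0 - (sxth16(w1) << 3);      /* subs x3, x0, x1, sxth 3 */
    if (x3 == 0)                               /* beq  .L5                */
        return w2 + (uint32_t)x0;              /* add  w0, w2, w0         */
    uint32_t w0 = w1 + w2;                     /* add  w0, w1, w2         */
    return w0 + (uint32_t)x3;                  /* add  w0, w0, w3         */
}

/* Suggested sequence: no separate sxth, extension folded into the add. */
uint32_t seq_folded(uint64_t x0, uint64_t x1, uint32_t w2)
{
    uint64_t x3 = x0 - (sxth16(x1) << 3);      /* subs x3, x0, x1, sxth 3 */
    if (x3 == 0)                               /* beq  .L5                */
        return w2 + (uint32_t)x0;              /* add  w0, w2, w0         */
    uint32_t w0 = w2 + (uint32_t)sxth16(x1);   /* add  w0, w2, w1, sxth   */
    return w0 + (uint32_t)x3;                  /* add  w0, w0, w3         */
}

/* Compare both models on inputs whose upper x1 bits are made-up garbage. */
void check_sequences(void)
{
    /* a = 100, b = -2, c = 5: d = 116, result = -2 + 116 + 5 */
    assert(seq_reported(100, 0x12345678ABCDFFFEull, 5) == 119);
    assert(seq_folded(100, 0x12345678ABCDFFFEull, 5) == 119);

    /* a = 24, b = 3, c = 7: d = 0, result = a + c */
    assert(seq_reported(24, 0xFFFF000000000003ull, 7) == 31);
    assert(seq_folded(24, 0xFFFF000000000003ull, 7) == 31);
}
```

The model says nothing about cost; it only shows that both uses of b can take
the extension from the instruction operand rather than from a separate sxth.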



* [Bug rtl-optimization/64537] Aarch64 redundant sxth instruction gets generated
From: kugan at gcc dot gnu.org @ 2015-01-19  2:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64537

--- Comment #5 from kugan at gcc dot gnu.org ---
Is this sort of multiple use a potential candidate for the ree pass? I haven't
looked at ree in detail yet.



* [Bug rtl-optimization/64537] Aarch64 redundant sxth instruction gets generated
From: pinskia at gcc dot gnu.org @ 2021-08-22  9:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64537

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
   Target Milestone|---                         |9.0
             Status|NEW                         |RESOLVED

--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
GCC 9+ produces:
        subs    x3, x0, x1, sxth 3
        add     w1, w2, w1, sxth
        add     w1, w1, w3
        add     w0, w2, w0
        csel    w0, w1, w0, ne
        ret

GCC 8 produced:
        sxth    w1, w1
        subs    x3, x0, x1, sxth 3
        add     w1, w1, w2
        add     w1, w1, w3
        add     w0, w2, w0
        csel    w0, w1, w0, ne
        ret

GCC 9's combine is able to do this:
Trying 3 -> 8:
    3: r99:SI=sign_extend(x1:HI)
      REG_DEAD x1:HI
    8: r101:DI=sign_extend(r99:SI#0)
Failed to match this instruction:
(parallel [
        (set (reg:DI 101 [ b ])
            (sign_extend:DI (reg:HI 1 x1 [ b ])))
        (set (reg/v:SI 99 [ b ])
            (sign_extend:SI (reg:HI 1 x1 [ b ])))
    ])
Failed to match this instruction:
(parallel [
        (set (reg:DI 101 [ b ])
            (sign_extend:DI (reg:HI 1 x1 [ b ])))
        (set (reg/v:SI 99 [ b ])
            (sign_extend:SI (reg:HI 1 x1 [ b ])))
    ])
Successfully matched this instruction:
(set (reg/v:SI 99 [ b ])
    (sign_extend:SI (reg:HI 1 x1 [ b ])))
Successfully matched this instruction:
(set (reg:DI 101 [ b ])
    (sign_extend:DI (reg:HI 1 x1 [ b ])))
allowing combination of insns 3 and 8
original costs 4 + 4 = 8
replacement costs 4 + 4 = 8

So fixed by r9-2064.


* [Bug rtl-optimization/64537] Aarch64 redundant sxth instruction gets generated
From: pinskia at gcc dot gnu.org @ 2021-08-22  9:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64537

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |NEW
         Resolution|FIXED                       |---
   Target Milestone|9.0                         |---

--- Comment #8 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Well, it was just this case that was fixed.
Here is another one which is still broken:
unsigned int
adds_shift_ext (unsigned long long a, unsigned short b, unsigned c)
{
  unsigned long long d = (a - ((unsigned long long)b << 3));

  if (d == 0)
    return a + c + b;
  else
    return b + d + c;
}

Note I think there is a missed reassociation/code hoisting too.

  <bb 3> [local count: 536870913]:
  _3 = (unsigned int) a_11(D);
  _4 = _3 + c_13(D);
  _15 = _4 + _8;
  goto <bb 5>; [100.00%]

  <bb 4> [local count: 536870913]:
  _7 = (unsigned int) d_12;
  _17 = _8 + c_13(D);
  _14 = _7 + _17;

c_13(D) + _8 is fully redundant here.
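
The redundancy can be illustrated at the source level. The sketch below is an
illustration under assumptions, not a proposed patch: adds_shift_ext_orig is
the test case above under a different name, and adds_shift_ext_hoisted and
check_hoisting are hypothetical names. The hoisted version computes the common
c + b once (assumed here to correspond to c_13(D) + _8 in the dump) before
selecting between a and d. Unsigned arithmetic wraps and truncation to
unsigned int commutes with addition, so both functions should return the same
value.

```c
#include <assert.h>

/* The test case from this comment, unchanged apart from its name. */
unsigned int
adds_shift_ext_orig (unsigned long long a, unsigned short b, unsigned c)
{
  unsigned long long d = a - ((unsigned long long) b << 3);

  if (d == 0)
    return a + c + b;
  else
    return b + d + c;
}

/* Hand-hoisted variant: the shared addend is computed once, before the
   choice between a and d.  */
unsigned int
adds_shift_ext_hoisted (unsigned long long a, unsigned short b, unsigned c)
{
  unsigned long long d = a - ((unsigned long long) b << 3);
  unsigned int common = c + b;

  return (unsigned int) (d == 0 ? a : d) + common;
}

/* Spot checks covering both branches and a wrapping case.  */
void
check_hoisting (void)
{
  /* d != 0: 2 + 84 + 5 */
  assert (adds_shift_ext_orig (100, 2, 5) == 91);
  assert (adds_shift_ext_hoisted (100, 2, 5) == 91);

  /* d == 0: 16 + 5 + 2 */
  assert (adds_shift_ext_orig (16, 2, 5) == 23);
  assert (adds_shift_ext_hoisted (16, 2, 5) == 23);

  /* Large values that wrap modulo 2^32.  */
  assert (adds_shift_ext_orig (0xFFFFFFFFFFFFFFFFull, 0xFFFF, 0xFFFFFFFFu)
          == adds_shift_ext_hoisted (0xFFFFFFFFFFFFFFFFull, 0xFFFF, 0xFFFFFFFFu));
}
```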

