public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/64537] New: Aarch64 redundant sxth instruction gets generated
From: vekumar at gcc dot gnu.org @ 2015-01-08 10:51 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64537
Bug ID: 64537
Summary: Aarch64 redundant sxth instruction gets generated
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: vekumar at gcc dot gnu.org
For the test case below, a redundant sxth instruction gets generated.
int
adds_shift_ext (long long a, short b, int c)
{
  long long d = (a - ((long long)b << 3));
  if (d == 0)
    return a + c;
  else
    return b + d + c;
}
adds_shift_ext:
        sxth    w1, w1                  // 3 *extendhisi2_aarch64/1 [length = 4] <==1
        subs    x3, x0, x1, sxth 3      // 11 *subs_extvdi_multp2 [length = 4] <==2
        beq     .L5                     // 12 *condjump [length = 4]
        add     w0, w1, w2              // 19 *addsi3_aarch64/2 [length = 4]
        add     w0, w0, w3              // 20 *addsi3_aarch64/2 [length = 4]
        ret                             // 57 simple_return [length = 4]
        .p2align 2
.L5:
        add     w0, w2, w0              // 14 *addsi3_aarch64/2 [length = 4]
        ret                             // 55 simple_return [length = 4]
The instruction marked <==1 is not needed.
* [Bug rtl-optimization/64537] Aarch64 redundant sxth instruction gets generated
From: kugan at gcc dot gnu.org @ 2015-01-08 12:24 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64537
kugan at gcc dot gnu.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kugan at gcc dot gnu.org
--- Comment #1 from kugan at gcc dot gnu.org ---
According to AAPCS64
(http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055c/IHI0055C_beta_aapcs64.pdf),
the unused parameter register bits have an "unspecified value", so I think it
is needed.
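The ABI point can be illustrated with a small C model. This is only a sketch under stated assumptions: the 64-bit argument register is modeled as a plain uint64_t, and the names callee_extend_short and check_examples are hypothetical. It shows that a callee which extends from bits [15:0] gets the same value of b no matter what the caller left in the upper bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: x1 holds a `short` argument in bits [15:0];
   bits [63:16] are unspecified and may contain anything.  */
static int64_t callee_extend_short(uint64_t x1)
{
    /* Keep only the low 16 bits, then sign-extend, as sxth would.
       The uint16_t -> int16_t conversion is implementation-defined in C;
       GCC defines it as reduction modulo 2^16.  */
    return (int64_t)(int16_t)(uint16_t)x1;
}

/* The upper bits of the register never influence the extended value.  */
static void check_examples(void)
{
    assert(callee_extend_short(0xDEADBEEF0000FFFFull) == -1);
    assert(callee_extend_short(0x000000000000FFFFull) == -1);
    assert(callee_extend_short(0xFFFFFFFFFFFF0005ull) == 5);
}
```

Whether that extension has to be a separate instruction, or can be folded into the instructions that consume b, is the question the rest of the thread discusses.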
* [Bug rtl-optimization/64537] Aarch64 redundant sxth instruction gets generated
From: pinskia at gcc dot gnu.org @ 2015-01-08 12:33 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64537
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to kugan from comment #1)
> According to AAPCS64
> (http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055c/IHI0055C_beta_aapcs64.pdf),
> the unused parameter register bits have an "unspecified value", so I think it
> is needed.
It is not needed because the next instruction already performs a sign
extension, and the other uses of the sign-extended result only use the lower
32 bits of the register.
* [Bug rtl-optimization/64537] Aarch64 redundant sxth instruction gets generated
From: kugan at gcc dot gnu.org @ 2015-01-08 12:51 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64537
--- Comment #3 from kugan at gcc dot gnu.org ---
But isn't w1 passed with a 16-bit value (short b) here? Am I missing something?
* [Bug rtl-optimization/64537] Aarch64 redundant sxth instruction gets generated
From: rearnsha at gcc dot gnu.org @ 2015-01-08 14:03 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64537
--- Comment #4 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
b is used twice, once shifted left by 3 and once directly.
We could write this as
        subs    x3, x0, x1, sxth 3
        beq     .L5
        add     w0, w2, w1, sxth        <= Now extended
        add     w0, w0, w3
        ret
        .p2align 2
.L5:
        add     w0, w2, w0
        ret
which in this specific case would perhaps be more efficient, but in practice
it's quite hard to get this sort of multiple-use right.
I think this is a special case, however, of the more common 'un-cse' type of
problem, where multiple uses of an extended (or shifted) value are always
commoned up.
Note that modern CPUs may take an extra cycle to perform an ALU-with-shift type
operation, eliminating the benefit of sinking multiple uses down into the ALU
operations themselves.
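The equivalence of the two sequences can be checked with a small C model. This is only a sketch: the names sxth64, seq_separate, seq_folded and check_sequences are hypothetical, registers are modeled as plain integers, and the flag set by subs is reduced to a comparison with zero. It says nothing about the cycle counts mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend bits [15:0] of a register, as the sxth operand modifier does.  */
static uint64_t sxth64(uint64_t r)
{
    return (uint64_t)(int64_t)(int16_t)(uint16_t)r;
}

/* Sequence from the report: a separate sxth feeds both uses of b.  */
static uint32_t seq_separate(uint64_t x0, uint64_t x1, uint32_t w2)
{
    uint32_t w1 = (uint32_t)sxth64(x1);         /* sxth w1, w1 */
    uint64_t x3 = x0 - (sxth64(w1) << 3);       /* subs x3, x0, x1, sxth 3 */
    if (x3 == 0)                                /* beq .L5 */
        return w2 + (uint32_t)x0;               /* add w0, w2, w0 */
    return (w1 + w2) + (uint32_t)x3;            /* add w0, w1, w2; add w0, w0, w3 */
}

/* Rewritten sequence: the extension is folded into each consumer.  */
static uint32_t seq_folded(uint64_t x0, uint64_t x1, uint32_t w2)
{
    uint64_t x3 = x0 - (sxth64(x1) << 3);       /* subs x3, x0, x1, sxth 3 */
    if (x3 == 0)                                /* beq .L5 */
        return w2 + (uint32_t)x0;               /* add w0, w2, w0 */
    uint32_t w0 = w2 + (uint32_t)sxth64(x1);    /* add w0, w2, w1, sxth */
    return w0 + (uint32_t)x3;                   /* add w0, w0, w3 */
}

/* Both sequences agree, even with garbage in the upper bits of x1.  */
static void check_sequences(void)
{
    assert(seq_separate(40, 0xABCD00000005ull, 7) == 47u);   /* b = 5, d == 0 */
    assert(seq_folded(40, 0xABCD00000005ull, 7) == 47u);
    assert(seq_separate(100, 0x12340000FFFEull, 7) == 121u); /* b = -2, d = 116 */
    assert(seq_folded(100, 0x12340000FFFEull, 7) == 121u);
}
```

In this model the folded form drops one instruction on the fall-through path and leaves the .L5 path unchanged.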
* [Bug rtl-optimization/64537] Aarch64 redundant sxth instruction gets generated
From: kugan at gcc dot gnu.org @ 2015-01-19 2:29 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64537
--- Comment #5 from kugan at gcc dot gnu.org ---
Is this sort of multiple use a potential candidate for the ree pass? I haven't
looked at ree in detail yet.
* [Bug rtl-optimization/64537] Aarch64 redundant sxth instruction gets generated
From: pinskia at gcc dot gnu.org @ 2021-08-22 9:15 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64537
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
   Target Milestone|---                         |9.0
             Status|NEW                         |RESOLVED
--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
GCC 9+ produces:
        subs    x3, x0, x1, sxth 3
        add     w1, w2, w1, sxth
        add     w1, w1, w3
        add     w0, w2, w0
        csel    w0, w1, w0, ne
        ret
GCC 8 produced:
        sxth    w1, w1
        subs    x3, x0, x1, sxth 3
        add     w1, w1, w2
        add     w1, w1, w3
        add     w0, w2, w0
        csel    w0, w1, w0, ne
        ret
GCC 9's combine is able to do this:
Trying 3 -> 8:
3: r99:SI=sign_extend(x1:HI)
REG_DEAD x1:HI
8: r101:DI=sign_extend(r99:SI#0)
Failed to match this instruction:
(parallel [
(set (reg:DI 101 [ b ])
(sign_extend:DI (reg:HI 1 x1 [ b ])))
(set (reg/v:SI 99 [ b ])
(sign_extend:SI (reg:HI 1 x1 [ b ])))
])
Failed to match this instruction:
(parallel [
(set (reg:DI 101 [ b ])
(sign_extend:DI (reg:HI 1 x1 [ b ])))
(set (reg/v:SI 99 [ b ])
(sign_extend:SI (reg:HI 1 x1 [ b ])))
])
Successfully matched this instruction:
(set (reg/v:SI 99 [ b ])
(sign_extend:SI (reg:HI 1 x1 [ b ])))
Successfully matched this instruction:
(set (reg:DI 101 [ b ])
(sign_extend:DI (reg:HI 1 x1 [ b ])))
allowing combination of insns 3 and 8
original costs 4 + 4 = 8
replacement costs 4 + 4 = 8
So fixed by r9-2064.
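The identity this combination relies on can be checked exhaustively in C. This is a sketch, assuming insn 8 extends the low 16 bits of r99 (my reading of the `r99:SI#0` notation in the dump); the function names count_mismatches and self_check are hypothetical. It compares the chained extension (HI to SI, then to DI) with the direct HI to DI extension for every 16-bit input.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the number of 16-bit inputs for which the chained and the
   direct sign extension disagree.  */
static unsigned count_mismatches(void)
{
    unsigned bad = 0;
    for (uint32_t bits = 0; bits <= 0xFFFFu; ++bits) {
        int16_t hi = (int16_t)(uint16_t)bits;   /* x1:HI */
        int32_t r99 = (int32_t)hi;              /* insn 3: r99:SI = sign_extend (x1:HI) */

        /* insn 8 before combine: extend the low half of r99 to DI.  */
        int64_t chained = (int64_t)(int16_t)(uint16_t)(uint32_t)r99;
        /* insn 8 after combine: extend x1:HI to DI directly.  */
        int64_t direct = (int64_t)hi;

        if (chained != direct)
            ++bad;
    }
    return bad;
}

/* Aborts if any input disagrees.  */
static void self_check(void)
{
    assert(count_mismatches() == 0);
}
```

Once insn 8 reads x1 directly, it no longer depends on insn 3, which appears to be what lets each extension be merged into its own consumer in the GCC 9 output above.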
* [Bug rtl-optimization/64537] Aarch64 redundant sxth instruction gets generated
From: pinskia at gcc dot gnu.org @ 2021-08-22 9:20 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64537
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |NEW
         Resolution|FIXED                       |---
   Target Milestone|9.0                         |---
--- Comment #8 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Well, only this specific case was fixed.
Here is another one that is still broken:
unsigned int
adds_shift_ext (unsigned long long a, unsigned short b, unsigned c)
{
  unsigned long long d = (a - ((unsigned long long)b << 3));
  if (d == 0)
    return a + c + b;
  else
    return b + d + c;
}
Note: I think there is a missed reassociation/code-hoisting opportunity too.
<bb 3> [local count: 536870913]:
_3 = (unsigned int) a_11(D);
_4 = _3 + c_13(D);
_15 = _4 + _8;
goto <bb 5>; [100.00%]
<bb 4> [local count: 536870913]:
_7 = (unsigned int) d_12;
_17 = _8 + c_13(D);
_14 = _7 + _17;
c_13(D) + _8 is fully redundant here.
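At the source level, the hoisting would look roughly like the sketch below. The function names adds_shift_ext_orig, adds_shift_ext_hoisted and check_hoisting are hypothetical, and this is a hand-written illustration of the transformation, not something GCC is known to produce. Because all the arithmetic is unsigned and therefore wraps, the sum c + b can be computed once before the branch without changing the result.

```c
#include <assert.h>

/* The test case from this comment, unchanged.  */
static unsigned int
adds_shift_ext_orig(unsigned long long a, unsigned short b, unsigned c)
{
    unsigned long long d = (a - ((unsigned long long)b << 3));
    if (d == 0)
        return a + c + b;
    else
        return b + d + c;
}

/* Same function with the common term c + b hoisted above the branch.  */
static unsigned int
adds_shift_ext_hoisted(unsigned long long a, unsigned short b, unsigned c)
{
    unsigned long long d = (a - ((unsigned long long)b << 3));
    unsigned int t = c + b;     /* computed once for both arms */
    return (unsigned int)(d == 0 ? a : d) + t;
}

/* Spot checks covering both arms and a wrapping case.  */
static void check_hoisting(void)
{
    assert(adds_shift_ext_orig(40, 5, 7) == 52u);       /* d == 0 */
    assert(adds_shift_ext_hoisted(40, 5, 7) == 52u);
    assert(adds_shift_ext_orig(100, 2, 7) == 93u);      /* d == 84 */
    assert(adds_shift_ext_hoisted(100, 2, 7) == 93u);
    assert(adds_shift_ext_orig(0, 1, 0) == adds_shift_ext_hoisted(0, 1, 0));
}
```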