public inbox for gcc-bugs@sourceware.org
* [Bug target/52628] New: SH Target: Inefficient shift by T bit result
From: olegendo at gcc dot gnu.org @ 2012-03-19 23:42 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52628
Bug #: 52628
Summary: SH Target: Inefficient shift by T bit result
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: olegendo@gcc.gnu.org
Target: sh*-*-*
int test_01 (int a, int b, int c)
{
return c << (a > b ? 1 : 0);
}
-m4 -O2:
cmp/gt r5,r4
mov r6,r0
movt r1
rts
shld r1,r0
better:
cmp/gt r5,r4
bf 0f
add r6,r6 ! do not use shll because of T bit usage in shll
0:
rts
mov r6,r0
int test_02 (int a, int b, int c)
{
return c << (a > b ? 2 : 0);
}
-m4 -O2:
cmp/gt r5,r4
mov r6,r0
movt r1
add r1,r1
rts
shld r1,r0
better:
cmp/gt r5,r4
bf 0f
shll2 r6
0:
rts
mov r6,r0
The same goes for other shift amounts like 8 and 16.
On SH4 the zero-displacement conditional branch code should be faster.
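The equivalence that the proposed sequences rely on can be sketched in C. This
is only an illustration, not GCC code: the function names are hypothetical, and
unsigned operands are assumed so that the shifts are well defined for every
input.

```c
#include <assert.h>

/* Original form: the shift count is selected by the comparison result.  */
static unsigned shift_by_flag (int a, int b, unsigned c, unsigned n)
{
  return c << (a > b ? n : 0);
}

/* Branchy form: jump around a constant shift when the comparison is false.  */
static unsigned shift_or_skip (int a, int b, unsigned c, unsigned n)
{
  if (a > b)
    c <<= n;
  return c;
}

/* Compare both forms for the shift amounts mentioned in the report.  */
static void check_forms_agree (void)
{
  static const unsigned counts[] = { 1, 2, 8, 16 };
  for (unsigned i = 0; i < sizeof counts / sizeof counts[0]; i++)
    for (int a = -2; a <= 2; a++)
      for (int b = -2; b <= 2; b++)
        assert (shift_by_flag (a, b, 0x1234u, counts[i])
                == shift_or_skip (a, b, 0x1234u, counts[i]));
}
```

Calling check_forms_agree () exercises both forms over a small range of inputs.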
* [Bug target/52628] SH Target: Inefficient shift by T bit result
From: olegendo at gcc dot gnu.org @ 2014-05-10 13:49 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52628
--- Comment #1 from Oleg Endo <olegendo at gcc dot gnu.org> ---
If conditional execution is implemented via zero-displacement branches (PR
54762), then I'd expect this to work automatically. However, on SH variants
that do not have dynamic shifts (SH1, SH2), it should always be better to jump
around a constant shift than to invoke the dynamic shift library function.
* [Bug target/52628] SH Target: Inefficient shift by T bit result
From: olegendo at gcc dot gnu.org @ 2015-09-27 4:29 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52628
--- Comment #2 from Oleg Endo <olegendo at gcc dot gnu.org> ---
To catch cases such as
int test_01 (int a, int b, int c)
{
return c << (a > b ? 1 : 0);
}
a shift with treg_set_expr can be implemented. Combine is looking for this
pattern:
Failed to match this instruction:
(set (reg:SI 169)
(ashift:SI (reg:SI 6 r6 [ c ])
(gt:SI (reg:SI 4 r4 [ a ])
(reg:SI 5 r5 [ b ]))))
However, this will only be tried when hardware dynamic shifts are available.
If software dynamic shifts are used, the library call is expanded too early and
combine never tries the pattern. The early expansion is done to get constant
sharing for the library address. It would be better to keep dynamic shifts as
such throughout combine, expand the library calls in split1, and then do a
constant optimization afterwards.
For cases such as
int test_01 (int a, int b, int c)
{
return c << (a > b ? 3 : 2);
}
with dynamic shifts we currently get:
cmp/gt r5,r4
mov r6,r0
movt r1
add #2,r1
rts
shld r1,r0
where the expected code would be:
cmp/gt r5,r4
shll2 r6
bf 0f
add r6,r6
0:
rts
mov r6,r0
or
cmp/gt r5,r4
mov #1,r2
mov r6,r0
addc r2,r2
rts
shld r2,r0
It fails to use the addc insn because of PR 65317 and PR 67057.
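As a rough illustration of why the addc variant works: with the register
preloaded with 1, adding it to itself together with the T bit yields a shift
count of 2 + T. The C model below is a sketch under that assumption; the
helper names are hypothetical and the carry-out that the real instruction
writes back to T is deliberately ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model of "addc Rm,Rn": Rn + Rm + T (carry-out not modelled).  */
static uint32_t addc_model (uint32_t rn, uint32_t rm, unsigned t)
{
  return rn + rm + (t & 1u);
}

/* c << (a > b ? 3 : 2), computed the way the addc sequence would.  */
static uint32_t shift_3_or_2 (int a, int b, uint32_t c)
{
  unsigned t = a > b;                     /* cmp/gt r5,r4 */
  uint32_t count = 1;                     /* mov #1,r2 */
  count = addc_model (count, count, t);   /* addc r2,r2 gives 2 + T */
  return c << count;                      /* shld */
}

static void check_addc_count (void)
{
  assert (shift_3_or_2 (2, 1, 1u) == 8u);   /* a > b: shift by 3 */
  assert (shift_3_or_2 (1, 2, 1u) == 4u);   /* otherwise: shift by 2 */
}
```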
Then, the actual shift is tried as:
Failed to match this instruction:
(set (reg:SI 168)
(ashift:SI (reg:SI 6 r6 [ c ])
(plus:SI (reg:SI 169)
(const_int 2 [0x2]))))
and as:
Failed to match this instruction:
(set (reg:SI 168)
(ashift:SI (reg:SI 6 r6 [ c ])
(plus:SI (gt:SI (reg:SI 4 r4 [ a ])
(reg:SI 5 r5 [ b ]))
(const_int 2 [0x2]))))
These patterns need to be implemented in order to split the shift into the
common constant shift count and the dynamic 0/1 shift count.
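The split of a (T + 2) shift count into a constant part and a 0/1 part can be
sketched in C as follows. This only illustrates the arithmetic identity, not
the combine patterns themselves; the function names are hypothetical and
unsigned 32-bit operands are assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Single dynamic shift by (t + 2), as in the RTL combine tries to match.  */
static uint32_t shift_combined (uint32_t c, unsigned t)
{
  return c << (t + 2u);
}

/* Split form: constant shift by 2, then an optional doubling for t == 1.  */
static uint32_t shift_split (uint32_t c, unsigned t)
{
  c <<= 2;        /* shll2 r6 */
  if (t)          /* bf 0f */
    c += c;       /* add r6,r6 */
  return c;
}

static void check_split_agrees (void)
{
  for (uint32_t c = 0; c < 64; c++)
    for (unsigned t = 0; t <= 1; t++)
      assert (shift_combined (c, t) == shift_split (c, t));
}
```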
For
int test_02 (int a, int b, int c)
{
return c << (a > b ? 2 : 0);
}
combine tries:
Failed to match this instruction:
(set (reg:SI 168)
(ashift:SI (reg:SI 6 r6 [ c ])
(ashift:SI (gt:SI (reg:SI 4 r4 [ a ])
(reg:SI 5 r5 [ b ]))
(const_int 1 [0x1]))))
However, for
int test_02 (int a, int b, int c)
{
return c << (a > b ? 3 : 0);
}
it doesn't try anything like that. This is probably a missed case in ifcvt.
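For reference, the two shift counts can both be written as branch-free
expressions of the 0/1 comparison result. The sketch below only shows the
arithmetic; it does not describe what ifcvt or combine actually generate, and
the function names are hypothetical.

```c
#include <assert.h>

/* a > b ? 2 : 0, matching the (ashift (gt ...) (const_int 1)) form above.  */
static unsigned count_2_or_0 (int a, int b)
{
  return (unsigned) (a > b) << 1;
}

/* a > b ? 3 : 0, written as a mask: 0 - flag is all-ones or zero.  */
static unsigned count_3_or_0 (int a, int b)
{
  return -(unsigned) (a > b) & 3u;
}

static void check_counts (void)
{
  for (int a = -2; a <= 2; a++)
    for (int b = -2; b <= 2; b++)
      {
        assert (count_2_or_0 (a, b) == (a > b ? 2u : 0u));
        assert (count_3_or_0 (a, b) == (a > b ? 3u : 0u));
      }
}
```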
* [Bug target/52628] SH Target: Inefficient shift by T bit result
From: pinskia at gcc dot gnu.org @ 2023-07-22 3:06 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52628
Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement