* [Bug target/53976] [SH] Unnecessary clrt after bt
2012-07-15 20:43 [Bug target/53976] New: [SH] Unnecessary clrt after bt olegendo at gcc dot gnu.org
@ 2012-09-23 20:50 ` olegendo at gcc dot gnu.org
2012-09-23 21:27 ` olegendo at gcc dot gnu.org
` (8 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-09-23 20:50 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53976
--- Comment #1 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-09-23 20:50:23 UTC ---
The clrt insn gets placed into another basic block, thus using a peephole will
not work in this case. In order to be able to eliminate the clrt (or any sett)
the value of the T bit must be tracked not only inside a basic block but also
across basic blocks.
Another case, which shows that the T bit value is lost and has to be
recalculated:
int test_2 (volatile int* a, int b, int c)
{
a[1] = b != 0;
if (b == 0)
a[10] = c;
return b == 0;
}
compiled with -O2 -m4:
tst r5,r5
mov #-1,r1
negc r1,r1
mov.l r1,@(4,r4)
tst r5,r5 !! OK, negc above clobbers T bit
bf .L2
mov.l r6,@(40,r4)
.L2:
tst r5,r5 !! T bit lost in new BB
rts
movt r0
compiled with -O2 -m2a:
tst r5,r5
movrt r1
mov.l r1,@(4,r4)
bf.s .L4
tst r5,r5 !! T bit lost in new BB
mov.l r6,@(40,r4)
tst r5,r5 !! T bit lost in new BB
.L4:
rts
movt r0
^ permalink raw reply [flat|nested] 11+ messages in thread
* [Bug target/53976] [SH] Unnecessary clrt after bt
2012-07-15 20:43 [Bug target/53976] New: [SH] Unnecessary clrt after bt olegendo at gcc dot gnu.org
2012-09-23 20:50 ` [Bug target/53976] " olegendo at gcc dot gnu.org
@ 2012-09-23 21:27 ` olegendo at gcc dot gnu.org
2013-08-03 8:08 ` olegendo at gcc dot gnu.org
` (7 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-09-23 21:27 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53976
--- Comment #2 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-09-23 21:26:54 UTC ---
Interestingly, the following function shows some improved behavior (notice the
removed volatile mem store):
int test_2_1 (int* a, int b, int c)
{
a[1] = b != 0;
if (b == 0)
a[10] = c;
return b == 0;
}
-O2 -m2a:
tst r5,r5
movrt r1
mov.l r1,@(4,r4)
bf .L4
mov.l r6,@(40,r4)
.L4:
rts
movt r0
This is already minimal.
However, for non-SH2A it's still the same:
tst r5,r5
mov #-1,r1
negc r1,r1
tst r5,r5
bf/s .L4
mov.l r1,@(4,r4)
mov.l r6,@(40,r4)
tst r5,r5
.L4:
rts
movt r0
^ permalink raw reply [flat|nested] 11+ messages in thread
* [Bug target/53976] [SH] Unnecessary clrt after bt
2012-07-15 20:43 [Bug target/53976] New: [SH] Unnecessary clrt after bt olegendo at gcc dot gnu.org
2012-09-23 20:50 ` [Bug target/53976] " olegendo at gcc dot gnu.org
2012-09-23 21:27 ` olegendo at gcc dot gnu.org
@ 2013-08-03 8:08 ` olegendo at gcc dot gnu.org
2013-11-21 8:19 ` olegendo at gcc dot gnu.org
` (6 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: olegendo at gcc dot gnu.org @ 2013-08-03 8:08 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53976
--- Comment #3 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Oleg Endo from comment #2)
> Interestingly, the following function shows some improved behavior (notice
> the removed volatile mem store):
>
> int test_2_1 (int* a, int b, int c)
> {
> a[1] = b != 0;
>
> if (b == 0)
> a[10] = c;
>
> return b == 0;
> }
>
> -O2 -m2a:
> tst r5,r5
> movrt r1
> mov.l r1,@(4,r4)
> bf .L4
> mov.l r6,@(40,r4)
> .L4:
> rts
> movt r0
>
>
> This is already minimal.
> However, for non-SH2A it's still the same:
> tst r5,r5
> mov #-1,r1
> negc r1,r1
> tst r5,r5
> bf/s .L4
> mov.l r1,@(4,r4)
> mov.l r6,@(40,r4)
> tst r5,r5
> .L4:
> rts
> movt r0
One of the problems in this case is that negc clobbers the T bit. Another
alternative
movt r0
xor #1,r0
should be selected here. This could be done by looking at the insns around the
negc-movrt and check whether some insn after negc-movrt sets the T bit in the
same way as it was set before the negc-movrt. In this case not clobbering the
T bit would eliminate the redundant test. However, if this pattern occurs in a
loop or pressure on R0 is high, using negc and the redundant test is probably
going to be better.
^ permalink raw reply [flat|nested] 11+ messages in thread
* [Bug target/53976] [SH] Unnecessary clrt after bt
2012-07-15 20:43 [Bug target/53976] New: [SH] Unnecessary clrt after bt olegendo at gcc dot gnu.org
` (2 preceding siblings ...)
2013-08-03 8:08 ` olegendo at gcc dot gnu.org
@ 2013-11-21 8:19 ` olegendo at gcc dot gnu.org
2013-11-21 9:03 ` olegendo at gcc dot gnu.org
` (5 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: olegendo at gcc dot gnu.org @ 2013-11-21 8:19 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53976
--- Comment #5 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Author: olegendo
Date: Thu Nov 21 08:19:38 2013
New Revision: 205191
URL: http://gcc.gnu.org/viewcvs?rev=205191&root=gcc&view=rev
Log:
PR target/53976
* config/sh/sh_optimize_sett_clrt.cc: New SH specific RTL pass.
* config/sh/sh.c (register_sh_passes): Add sh_optimize_sett_clrt pass.
* config/sh/sh/t-sh (sh_optimize_sett_clrt pass.o): New entry.
* config.gcc (sh[123456789lbe]*-*-* | sh-*-*): Add
sh_optimize_sett_clrt pass.o to extra_objs.
PR target/53976
* gcc.target/sh/pr53976-1.c: New.
Added:
trunk/gcc/config/sh/sh_optimize_sett_clrt.cc
trunk/gcc/testsuite/gcc.target/sh/pr53976-1.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config.gcc
trunk/gcc/config/sh/sh.c
trunk/gcc/config/sh/t-sh
trunk/gcc/testsuite/ChangeLog
^ permalink raw reply [flat|nested] 11+ messages in thread
* [Bug target/53976] [SH] Unnecessary clrt after bt
2012-07-15 20:43 [Bug target/53976] New: [SH] Unnecessary clrt after bt olegendo at gcc dot gnu.org
` (3 preceding siblings ...)
2013-11-21 8:19 ` olegendo at gcc dot gnu.org
@ 2013-11-21 9:03 ` olegendo at gcc dot gnu.org
2013-11-25 16:47 ` olegendo at gcc dot gnu.org
` (4 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: olegendo at gcc dot gnu.org @ 2013-11-21 9:03 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53976
--- Comment #6 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Oleg Endo from comment #4)
> One option to get rid of the redundant clrt and sett in BBs that are reached
> with a conditional branch would be to add an SH specific RTL pass that
> analyses the BBs and eliminates the insns in question.
>
> Another option could be to try and inject artificial sett / clrt insns at
> the start of BBs that are reached by conditional branches, and then split
> them away to nops or output empty asm with insn length 0. The idea would be
> to let other already existing RTL passes figure out the redundant T bit sets.
I've decided to do it with an RTL pass, as it's easier and less obscure.
The initial version committed in r205191 only eliminates redundant sett / clrt
insns. However, there are also some opportunities to e.g. hoist sett / clrt
insns out of loops:
long long test0 (long long* a, unsigned int c)
{
long long s = 0;
do s += *a++; while (--c);
return s;
}
Currently compiles to:
_test0:
mov #0,r0
mov #0,r1
.align 2
.L3:
mov.l @r4+,r2
mov.l @r4+,r3
clrt
addc r3,r1
addc r2,r0
add #-1,r5
tst r5,r5
bf .L3
rts
nop
The previous T bit value at the clrt insn in the loop basic block is currently
detected to have an unknown value from the first basic block and value = 0
after the end of the loop.
In this case the clrt insn can be removed from the loop and put into the first
basic block:
_test0:
mov #0,r0
mov #0,r1
clrt
.align 2
.L3:
mov.l @r4+,r2
mov.l @r4+,r3
addc r3,r1
addc r2,r0
add #-1,r5
tst r5,r5
bf .L3
rts
nop
^ permalink raw reply [flat|nested] 11+ messages in thread
* [Bug target/53976] [SH] Unnecessary clrt after bt
2012-07-15 20:43 [Bug target/53976] New: [SH] Unnecessary clrt after bt olegendo at gcc dot gnu.org
` (4 preceding siblings ...)
2013-11-21 9:03 ` olegendo at gcc dot gnu.org
@ 2013-11-25 16:47 ` olegendo at gcc dot gnu.org
2014-08-13 22:02 ` [Bug target/53976] [SH] Unnecessary clrt/sett after bt/bf olegendo at gcc dot gnu.org
` (3 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: olegendo at gcc dot gnu.org @ 2013-11-25 16:47 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53976
--- Comment #7 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Author: olegendo
Date: Mon Nov 25 16:47:16 2013
New Revision: 205358
URL: http://gcc.gnu.org/viewcvs?rev=205358&root=gcc&view=rev
Log:
PR target/53976
PR target/59243
* config/sh/sh_optimize_sett_clrt.cc (struct ccreg_value): Update
comments.
(sh_optimize_sett_clrt::find_last_ccreg_values): Check stack of
previously visited basic blocks before recursing instead of only one
basic block.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/sh/sh_optimize_sett_clrt.cc
^ permalink raw reply [flat|nested] 11+ messages in thread
* [Bug target/53976] [SH] Unnecessary clrt/sett after bt/bf
2012-07-15 20:43 [Bug target/53976] New: [SH] Unnecessary clrt after bt olegendo at gcc dot gnu.org
` (5 preceding siblings ...)
2013-11-25 16:47 ` olegendo at gcc dot gnu.org
@ 2014-08-13 22:02 ` olegendo at gcc dot gnu.org
2014-11-23 17:37 ` olegendo at gcc dot gnu.org
` (2 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-08-13 22:02 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53976
Oleg Endo <olegendo at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |ASSIGNED
Last reconfirmed| |2014-08-13
Summary|[SH] Unnecessary clrt after |[SH] Unnecessary clrt/sett
|bt |after bt/bf
Ever confirmed|0 |1
--- Comment #8 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Another case that could be handled is trying to invert comparisons so that
conditional branches from multiple basic blocks have the same T bit value.
The following code:
static const int table[] = { 0, 1, 2, 3, 4, 9, 10, 42 };
int test (int n)
{
return *std::lower_bound (std::begin (table), std::end (table), n);
}
compiled with -m2 -ml -Os:
mov.l .L37,r3
mov #8,r1
.L33:
cmp/pl r1
bf.s .L36
mov r1,r7
shar r7
mov r7,r2
shll2 r2
add r3,r2
mov.l @r2,r6
cmp/ge r4,r6 <<<
bt .L35 <<<
mov r2,r3
sett <<<
add #4,r3
bra .L33
subc r7,r1
.L35:
bra .L33
mov r7,r1
.L36:
rts
mov.l @r3,r0
.L38:
.align 2
.L37:
.long __ZN5test5L5tableE
In this case 'cmp/ge r4,r6; bt .L35' can be changed to 'cmp/gt r6,r4; bf .L35'
and the sett insn can be eliminated.
^ permalink raw reply [flat|nested] 11+ messages in thread
* [Bug target/53976] [SH] Unnecessary clrt/sett after bt/bf
2012-07-15 20:43 [Bug target/53976] New: [SH] Unnecessary clrt after bt olegendo at gcc dot gnu.org
` (6 preceding siblings ...)
2014-08-13 22:02 ` [Bug target/53976] [SH] Unnecessary clrt/sett after bt/bf olegendo at gcc dot gnu.org
@ 2014-11-23 17:37 ` olegendo at gcc dot gnu.org
2014-11-23 21:17 ` olegendo at gcc dot gnu.org
2014-11-23 21:45 ` olegendo at gcc dot gnu.org
9 siblings, 0 replies; 11+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-11-23 17:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53976
--- Comment #9 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Created attachment 34078
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34078&action=edit
Reduced problem from linux kernel
This one exposes a problem of the sh_optimize_sett_clrt RTL pass, where it will
try to visit/evaluate too many basic blocks/edges and eventually run out of
memory.
^ permalink raw reply [flat|nested] 11+ messages in thread
* [Bug target/53976] [SH] Unnecessary clrt/sett after bt/bf
2012-07-15 20:43 [Bug target/53976] New: [SH] Unnecessary clrt after bt olegendo at gcc dot gnu.org
` (7 preceding siblings ...)
2014-11-23 17:37 ` olegendo at gcc dot gnu.org
@ 2014-11-23 21:17 ` olegendo at gcc dot gnu.org
2014-11-23 21:45 ` olegendo at gcc dot gnu.org
9 siblings, 0 replies; 11+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-11-23 21:17 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53976
--- Comment #10 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Author: olegendo
Date: Sun Nov 23 21:16:26 2014
New Revision: 217987
URL: https://gcc.gnu.org/viewcvs?rev=217987&root=gcc&view=rev
Log:
gcc/
PR target/53976
* config/sh/sh_optimize_sett_clrt.cc
(sh_optimize_sett_clrt::find_last_ccreg_values): Return bool instead
of void. Abort at complex edges.
(sh_optimize_sett_clrt::execute): Do nothing if find_last_ccreg_values
returned false.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/sh/sh_optimize_sett_clrt.cc
^ permalink raw reply [flat|nested] 11+ messages in thread
* [Bug target/53976] [SH] Unnecessary clrt/sett after bt/bf
2012-07-15 20:43 [Bug target/53976] New: [SH] Unnecessary clrt after bt olegendo at gcc dot gnu.org
` (8 preceding siblings ...)
2014-11-23 21:17 ` olegendo at gcc dot gnu.org
@ 2014-11-23 21:45 ` olegendo at gcc dot gnu.org
9 siblings, 0 replies; 11+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-11-23 21:45 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53976
--- Comment #11 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Author: olegendo
Date: Sun Nov 23 21:45:18 2014
New Revision: 217989
URL: https://gcc.gnu.org/viewcvs?rev=217989&root=gcc&view=rev
Log:
gcc/
Backport from mainline
2014-11-23 Oleg Endo <olegendo@gcc.gnu.org>
PR target/53976
* config/sh/sh_optimize_sett_clrt.cc
(sh_optimize_sett_clrt::find_last_ccreg_values): Return bool instead
of void. Abort at complex edges.
(sh_optimize_sett_clrt::execute): Do nothing if find_last_ccreg_values
returned false.
Modified:
branches/gcc-4_9-branch/gcc/ChangeLog
branches/gcc-4_9-branch/gcc/config/sh/sh_optimize_sett_clrt.cc
^ permalink raw reply [flat|nested] 11+ messages in thread