* [patch passes.def]: Fix regression on ARM PR/61608
@ 2014-06-25 13:35 Kai Tietz
2014-06-25 13:56 ` Richard Biener
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Kai Tietz @ 2014-06-25 13:35 UTC (permalink / raw)
To: GCC Patches; +Cc: Richard Henderson, Jeff Law
Hello,
so there seems to be a fallout caused by moving peephole2 pass. See PR/61608.
So we need indeed 2 peephole2 passes.
ChangeLog
2014-06-25 Kai Tietz <ktietz@redhat.com>
PR rtl-optimization/61608
* passes.def (peephole2): Readd peephole2 pass
before if-after-reload pass.
Tested for arm*-none-*, i686-w64-cygwin, x86_64-unknown-linux-gnu. Ok
for apply?
Regards,
Kai
Index: passes.def
===================================================================
--- passes.def (Revision 211971)
+++ passes.def (Arbeitskopie)
@@ -396,6 +396,7 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_rtl_dse2);
NEXT_PASS (pass_stack_adjustments);
NEXT_PASS (pass_jump2);
+ NEXT_PASS (pass_peephole2);
NEXT_PASS (pass_if_after_reload);
NEXT_PASS (pass_regrename);
NEXT_PASS (pass_cprop_hardreg);
@@ -407,8 +408,7 @@ along with GCC; see the file COPYING3. If not see
We have a single indirect branch in the entire function
before duplicate-compute-gotos pass. This vastly reduces
the size of the CFG.
- For preventing to run peephole2 pass twice, its run after
- the jump2 got removed. */
+ We need to run peephole2 pass twice. See PR/61608. */
NEXT_PASS (pass_peephole2);
NEXT_PASS (pass_branch_target_load_optimize2);
NEXT_PASS (pass_leaf_regs);
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch passes.def]: Fix regression on ARM PR/61608
2014-06-25 13:35 [patch passes.def]: Fix regression on ARM PR/61608 Kai Tietz
@ 2014-06-25 13:56 ` Richard Biener
2014-06-25 14:04 ` Jeff Law
2014-06-25 18:12 ` Richard Henderson
2 siblings, 0 replies; 13+ messages in thread
From: Richard Biener @ 2014-06-25 13:56 UTC (permalink / raw)
To: Kai Tietz; +Cc: GCC Patches, Richard Henderson, Jeff Law
On Wed, Jun 25, 2014 at 3:35 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
> Hello,
>
> so there seems to be a fallout caused by moving peephole2 pass. See PR/61608.
> So we need indeed 2 peephole2 passes.
>
> ChangeLog
>
> 2014-06-25 Kai Tietz <ktietz@redhat.com>
>
> PR rtl-optimization/61608
> * passes.def (peephole2): Readd peephole2 pass
> before if-after-reload pass.
>
> Tested for arm*-none-*, i686-w64-cygwin, x86_64-unknown-linux-gnu. Ok
> for apply?
Uh, every time I try to improve compile-time by removing passes somebody
else adds other passes back.
Please try to avoid this.
Thanks,
Richard.
> Regards,
> Kai
>
> Index: passes.def
> ===================================================================
> --- passes.def (Revision 211971)
> +++ passes.def (Arbeitskopie)
> @@ -396,6 +396,7 @@ along with GCC; see the file COPYING3. If not see
> NEXT_PASS (pass_rtl_dse2);
> NEXT_PASS (pass_stack_adjustments);
> NEXT_PASS (pass_jump2);
> + NEXT_PASS (pass_peephole2);
> NEXT_PASS (pass_if_after_reload);
> NEXT_PASS (pass_regrename);
> NEXT_PASS (pass_cprop_hardreg);
> @@ -407,8 +408,7 @@ along with GCC; see the file COPYING3. If not see
> We have a single indirect branch in the entire function
> before duplicate-compute-gotos pass. This vastly reduces
> the size of the CFG.
> - For preventing to run peephole2 pass twice, its run after
> - the jump2 got removed. */
> + We need to run peephole2 pass twice. See PR/61608. */
> NEXT_PASS (pass_peephole2);
> NEXT_PASS (pass_branch_target_load_optimize2);
> NEXT_PASS (pass_leaf_regs);
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch passes.def]: Fix regression on ARM PR/61608
2014-06-25 13:35 [patch passes.def]: Fix regression on ARM PR/61608 Kai Tietz
2014-06-25 13:56 ` Richard Biener
@ 2014-06-25 14:04 ` Jeff Law
2014-06-25 15:04 ` Kai Tietz
2014-06-25 18:12 ` Richard Henderson
2 siblings, 1 reply; 13+ messages in thread
From: Jeff Law @ 2014-06-25 14:04 UTC (permalink / raw)
To: Kai Tietz, GCC Patches; +Cc: Richard Henderson
On 06/25/14 07:35, Kai Tietz wrote:
> Hello,
>
> so there seems to be a fallout caused by moving peephole2 pass. See PR/61608.
> So we need indeed 2 peephole2 passes.
>
> ChangeLog
>
> 2014-06-25 Kai Tietz <ktietz@redhat.com>
>
> PR rtl-optimization/61608
> * passes.def (peephole2): Readd peephole2 pass
> before if-after-reload pass.
So why is the peephole not working in its current location?
Jeff
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch passes.def]: Fix regression on ARM PR/61608
2014-06-25 14:04 ` Jeff Law
@ 2014-06-25 15:04 ` Kai Tietz
2014-06-25 15:28 ` Jeff Law
0 siblings, 1 reply; 13+ messages in thread
From: Kai Tietz @ 2014-06-25 15:04 UTC (permalink / raw)
To: Jeff Law; +Cc: GCC Patches, Richard Henderson
2014-06-25 16:04 GMT+02:00 Jeff Law <law@redhat.com>:
> So why is the peephole not working in its current location?
>
> Jeff
Hi Jeff,
that is what I read out of dumps:
If peephole2 is executed early we see following pattern transformation:
(insn 12 11 13 2 (set (reg:CC_NOOV 100 cc)
(compare:CC_NOOV (zero_extract:SI (reg:SI 4 r4 [orig:110 D.1414 ] [110]
)
(const_int 1 [0x1])
(const_int 2 [0x2]))
(const_int 0 [0]))) arm_epilog-1.c:11 84
{*zeroextractsi_compare0_scratch}
(expr_list:REG_DEAD (reg:SI 4 r4 [orig:110 D.1414 ] [110])
(nil)))
(jump_insn 13 12 14 2 (set (pc)
(if_then_else (eq (reg:CC_NOOV 100 cc)
(const_int 0 [0]))
(label_ref:SI 16)
(pc))) arm_epilog-1.c:11 236 {arm_cond_branch}
(expr_list:REG_DEAD (reg:CC_NOOV 100 cc)
(int_list:REG_BR_PROB 3900 (nil)))
-> 16)
which gets replaced to:
(insn 62 11 63 2 (parallel [
(set (reg:CC_NOOV 100 cc)
(compare:CC_NOOV (ashift:SI (reg:SI 4 r4 [orig:110
D.1414 ] [110])
(const_int 29 [0x1d]))
(const_int 0 [0])))
(clobber (reg:SI 4 r4))
]) arm_epilog-1.c:11 -1
(nil))
(jump_insn 63 62 14 2 (set (pc)
(if_then_else (ge (reg:CC_NOOV 100 cc)
(const_int 0 [0]))
(label_ref:SI 16)
(pc))) arm_epilog-1.c:11 -1
(nil)
-> 16)
If we run peephole2 pass late we see instead the following pattern,
which isn't handled by peephole2 pass anymore.
(insn 12 11 15 2 (set (reg:CC_NOOV 100 cc)
(compare:CC_NOOV (zero_extract:SI (reg:SI 4 r4 [orig:110 D.1414 ] [110])
(const_int 1 [0x1])
(const_int 2 [0x2]))
(const_int 0 [0]))) arm_epilog-1_.c:11 84 {*zeroextractsi_compare0_s
cratch}
(expr_list:REG_DEAD (reg:SI 4 r4 [orig:110 D.1414 ] [110])
(nil)))
(insn 15 12 22 2 (cond_exec (ne (reg:CC_NOOV 100 cc)
(const_int 0 [0]))
(set (reg/v:SI 2 r2 [orig:115 c ] [115])
(plus:SI (reg/v:SI 2 r2 [orig:115 c ] [115])
(const_int 1 [0x1])))) arm_epilog-1_.c:11 3229 {*p *arm_addsi3}
(expr_list:REG_DEAD (reg:CC_NOOV 100 cc)
(nil)))
So this issue might be solvable by extending ARM's peephole2 patterns?
An ARM maintainer might be able to tell.
Kai
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch passes.def]: Fix regression on ARM PR/61608
2014-06-25 15:04 ` Kai Tietz
@ 2014-06-25 15:28 ` Jeff Law
2014-06-25 15:47 ` Ramana Radhakrishnan
2014-06-25 15:52 ` Richard Henderson
0 siblings, 2 replies; 13+ messages in thread
From: Jeff Law @ 2014-06-25 15:28 UTC (permalink / raw)
To: Kai Tietz; +Cc: GCC Patches, Richard Henderson
On 06/25/14 09:04, Kai Tietz wrote:
> 2014-06-25 16:04 GMT+02:00 Jeff Law <law@redhat.com>:
>> So why is the peephole not working in its current location?
>>
>> Jeff
>
> Hi Jeff,
>
> that is what I read out of dumps:
>
> If peephole2 is executed early we see following pattern transformation:
[ ... ]
Ask an ARM maintainer if the new code is actually better than the old code.
It appears that with the peep2 pass moved that we actually if-convert
the fall-thru path of the conditional and eliminate the conditional.
Which, on the surface seems like a good thing. It may be the case that
we need to twiddle the test. Not sure yet.
Jeff
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch passes.def]: Fix regression on ARM PR/61608
2014-06-25 15:28 ` Jeff Law
@ 2014-06-25 15:47 ` Ramana Radhakrishnan
2014-06-25 15:52 ` Richard Henderson
1 sibling, 0 replies; 13+ messages in thread
From: Ramana Radhakrishnan @ 2014-06-25 15:47 UTC (permalink / raw)
To: Jeff Law, Kai Tietz; +Cc: GCC Patches, Richard Henderson
[Apologies about the duplicates to folks - resending to make this hit
the lists]
On 25/06/14 16:28, Jeff Law wrote:
> On 06/25/14 09:04, Kai Tietz wrote:
>> 2014-06-25 16:04 GMT+02:00 Jeff Law <law@redhat.com>:
>>> So why is the peephole not working in its current location?
>>>
>>> Jeff
>>
>> Hi Jeff,
>>
>> that is what I read out of dumps:
>>
>> If peephole2 is executed early we see following pattern transformation:
> [ ... ]
> Ask an ARM maintainer if the new code is actually better than the old code.
>
> It appears that with the peep2 pass moved that we actually if-convert
> the fall-thru path of the conditional and eliminate the conditional.
> Which, on the surface seems like a good thing. It may be the case that
> we need to twiddle the test. Not sure yet.
No, we don't need to twiddle the test. The test is good as it is today
as we caught a potential size regression. (Prima-facie the test itself
doesn't show an actual size regression because of padding but you can
see the replacement of a 2 byte instruction with a 4 byte instruction)
These peepholes that were in there gave 0.1% in CSiBe for Thumb2 as per
the original patch
(https://gcc.gnu.org/ml/gcc-patches/2010-06/msg02518.html). The
if-conversion happened anyway afterwards regardless of where the pattern
was but the key here was the conversion of the 4 byte instruction to a 2
byte instruction.
Looks like the all the peephole2's that have if_then_else's (5 patterns
today) also need to develop a cond_exec variant where possible in the
ARM backend.
It seems unfortunate that the backend has to handle each of the possible
pattern classes that may be conditionalized by the cond-exec pass and we
potentially maintain a list of patterns in sync with anything that can
be conditionalized.
At first glance this looks ugly to "work around" in the backend.
regards
Ramana
>
>
> Jeff
>
>
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch passes.def]: Fix regression on ARM PR/61608
2014-06-25 15:28 ` Jeff Law
2014-06-25 15:47 ` Ramana Radhakrishnan
@ 2014-06-25 15:52 ` Richard Henderson
2014-06-25 16:02 ` Kai Tietz
2014-06-25 17:17 ` Jeff Law
1 sibling, 2 replies; 13+ messages in thread
From: Richard Henderson @ 2014-06-25 15:52 UTC (permalink / raw)
To: Jeff Law, Kai Tietz; +Cc: GCC Patches
On 06/25/2014 08:28 AM, Jeff Law wrote:
> Ask an ARM maintainer if the new code is actually better than the old code.
It isn't.
> It appears that with the peep2 pass moved that we actually if-convert the
> fall-thru path of the conditional and eliminate the conditional. Which, on the
> surface seems like a good thing. It may be the case that we need to twiddle
> the test. Not sure yet.
Looking at the final code in the pr, it looks like we if-convert both ways,
just with changed condition on the test.
It looks like there are at least 3 peepholes in the arm backend that match
compare+branch with a scratch register that are affected by this. I don't
think it's reasonable to expect a peephole to match compare + n cond_exec
insns, so running peep2 before if-conversion would seem to be called for.
Kai, why does your indirect jump peephole require pass_reorder_blocks? It
seems to me that instead of moving peephole2 down, we could move
duplicate_computed_gotos up.
r~
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch passes.def]: Fix regression on ARM PR/61608
2014-06-25 15:52 ` Richard Henderson
@ 2014-06-25 16:02 ` Kai Tietz
2014-06-25 17:15 ` Jeff Law
2014-06-25 17:17 ` Jeff Law
1 sibling, 1 reply; 13+ messages in thread
From: Kai Tietz @ 2014-06-25 16:02 UTC (permalink / raw)
To: Richard Henderson; +Cc: Jeff Law, GCC Patches
2014-06-25 17:50 GMT+02:00 Richard Henderson <rth@redhat.com>:
> On 06/25/2014 08:28 AM, Jeff Law wrote:
>> Ask an ARM maintainer if the new code is actually better than the old code.
>
> It isn't.
>
>> It appears that with the peep2 pass moved that we actually if-convert the
>> fall-thru path of the conditional and eliminate the conditional. Which, on the
>> surface seems like a good thing. It may be the case that we need to twiddle
>> the test. Not sure yet.
>
> Looking at the final code in the pr, it looks like we if-convert both ways,
> just with changed condition on the test.
>
> It looks like there are at least 3 peepholes in the arm backend that match
> compare+branch with a scratch register that are affected by this. I don't
> think it's reasonable to expect a peephole to match compare + n cond_exec
> insns, so running peep2 before if-conversion would seem to be called for.
>
> Kai, why does your indirect jump peephole require pass_reorder_blocks? It
> seems to me that instead of moving peephole2 down, we could move
> duplicate_computed_gotos up.
>
>
> r~
We need the peephole2 pass after reorder_blocks due we move in that
pass the jump
...
(code_label 54 52 55 5 8 "" [0 uses])
(note 55 54 56 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
(jump_insn 56 55 57 5 (set (pc)
(reg/f:DI 0 ax [orig:83 D.3469 ] [83])) 638 {*indirect_jump}
(expr_list:REG_DEAD (reg/f:DI 0 ax [orig:83 D.3469 ] [83])
(nil)))
(barrier 57 56 58)
....
into each bb, so it can be resolved to an indirect jump on memory.
Testcase to reproduce that is:
// PR 51840/
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
enum {
S_atop = 0,
S_atop_f = 1,
S_atop_t = 2,
S_limit = 3
};
enum {
I_a_dec = 0,
I_a_non_zero_p = 1,
I_jmp_if_true = 2,
I_exit = 3,
I_limit = 4
};
typedef struct {
uint64_t a0;
} vm_state_t;
__attribute__((noinline, noclone)) void exec_code(vm_state_t *state,
uint8_t *code) {
static void (* volatile const vm_dispatch[S_limit][I_limit]) = {
//dispatch for [S_atop = 0][..] with four unique destination labels
{&&atop__a_dec,
&&atop__a_non_zero_p,
&&fixme,
&&atop__exit},
//dispatch for [S_atop_f = 1][..] with two unique destination labels
{&&fixme,
&&fixme,
&&atop_f__jmp_if_true,
&&fixme},
//dispatch for [S_atop_t = 2][..] with two unique destination labels
{&&fixme,
&&fixme,
&&atop_t__jmp_if_true,
&&fixme}
};
printf("atop__a_dec: %p\n", &&atop__a_dec);
printf("atop__a_non_zero_p: %p\n", &&atop__a_non_zero_p);
printf("atop__exit: %p\n", &&atop__exit);
printf("atop_f__jmp_if_true: %p\n", &&atop_f__jmp_if_true);
printf("atop_t__jmp_if_true: %p\n", &&atop_t__jmp_if_true);
printf("fixme: %p\n", &&fixme);
volatile uint64_t atop = state->a0;
volatile int64_t inst_offset=0;
goto *(vm_dispatch[S_atop][code[inst_offset]]);
atop__a_dec: {
printf("atop = %ld\n", atop);
--atop;
volatile uint64_t next_inst = code[inst_offset + 1];
inst_offset += 1;
goto *(vm_dispatch[S_atop][next_inst]);
}
atop__a_non_zero_p: {
volatile uint64_t next_inst = code[inst_offset + 1];
inst_offset += 1;
if (atop != 0) {
goto *(vm_dispatch[S_atop_t][next_inst]);
} else {
goto *(vm_dispatch[S_atop_f][next_inst]);
}
}
atop_f__jmp_if_true: {
volatile uint64_t next_inst = code[inst_offset + 2];
inst_offset += 2;
goto *(vm_dispatch[S_atop][next_inst]);
}
atop_t__jmp_if_true: {
int64_t offset = (int8_t) code[inst_offset + 1];
uint64_t next_inst = code[inst_offset + offset];
inst_offset += offset;
goto *(vm_dispatch[S_atop][next_inst]);
}
atop__exit: {
state->a0 = atop;
return;
}
fixme: {
printf("Dispatch logic ERROR\n");
exit(EXIT_FAILURE);
}
}
int main(void)
{
vm_state_t state;
state.a0 = 10;
uint8_t code[] = {I_a_dec, //decrement top of stack
I_a_non_zero_p, //true if top of stack is non-zero
I_jmp_if_true, -2, //if true jump back to decrement
I_exit}; //otherwise exit
exec_code(&state, code);
return EXIT_SUCCESS;
}
Regards,
Kai
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch passes.def]: Fix regression on ARM PR/61608
2014-06-25 16:02 ` Kai Tietz
@ 2014-06-25 17:15 ` Jeff Law
2014-06-25 17:53 ` Kai Tietz
0 siblings, 1 reply; 13+ messages in thread
From: Jeff Law @ 2014-06-25 17:15 UTC (permalink / raw)
To: Kai Tietz, Richard Henderson; +Cc: GCC Patches
On 06/25/14 10:02, Kai Tietz wrote:
> 2014-06-25 17:50 GMT+02:00 Richard Henderson <rth@redhat.com>:
>> On 06/25/2014 08:28 AM, Jeff Law wrote:
>>> Ask an ARM maintainer if the new code is actually better than the old code.
>>
>> It isn't.
>>
>>> It appears that with the peep2 pass moved that we actually if-convert the
>>> fall-thru path of the conditional and eliminate the conditional. Which, on the
>>> surface seems like a good thing. It may be the case that we need to twiddle
>>> the test. Not sure yet.
>>
>> Looking at the final code in the pr, it looks like we if-convert both ways,
>> just with changed condition on the test.
>>
>> It looks like there are at least 3 peepholes in the arm backend that match
>> compare+branch with a scratch register that are affected by this. I don't
>> think it's reasonable to expect a peephole to match compare + n cond_exec
>> insns, so running peep2 before if-conversion would seem to be called for.
>>
>> Kai, why does your indirect jump peephole require pass_reorder_blocks? It
>> seems to me that instead of moving peephole2 down, we could move
>> duplicate_computed_gotos up.
>>
>>
>> r~
>
> We need the peephole2 pass after reorder_blocks due we move in that
> pass the jump
>
> ...
> (code_label 54 52 55 5 8 "" [0 uses])
> (note 55 54 56 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
> (jump_insn 56 55 57 5 (set (pc)
> (reg/f:DI 0 ax [orig:83 D.3469 ] [83])) 638 {*indirect_jump}
> (expr_list:REG_DEAD (reg/f:DI 0 ax [orig:83 D.3469 ] [83])
> (nil)))
> (barrier 57 56 58)
> ....
>
> into each bb, so it can be resolved to an indirect jump on memory.
But I think Richard's question was can we move up
duplicate_computed_gotos since that's the code which I'd expect to make
this transformation. Then you run peep2 immediately after
duplicate_computed_gotos.
Jeff
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch passes.def]: Fix regression on ARM PR/61608
2014-06-25 15:52 ` Richard Henderson
2014-06-25 16:02 ` Kai Tietz
@ 2014-06-25 17:17 ` Jeff Law
1 sibling, 0 replies; 13+ messages in thread
From: Jeff Law @ 2014-06-25 17:17 UTC (permalink / raw)
To: Richard Henderson, Kai Tietz; +Cc: GCC Patches
On 06/25/14 09:50, Richard Henderson wrote:
> On 06/25/2014 08:28 AM, Jeff Law wrote:
>> Ask an ARM maintainer if the new code is actually better than the old code.
>
> It isn't.
>
>> It appears that with the peep2 pass moved that we actually if-convert the
>> fall-thru path of the conditional and eliminate the conditional. Which, on the
>> surface seems like a good thing. It may be the case that we need to twiddle
>> the test. Not sure yet.
>
> Looking at the final code in the pr, it looks like we if-convert both ways,
> just with changed condition on the test.
Yea, I came to the same conclusion when I looked back at the code in the
BZ. I was too focused on Kai's message and the RTL contained within
when I wrote my comment.
>
> It looks like there are at least 3 peepholes in the arm backend that match
> compare+branch with a scratch register that are affected by this. I don't
> think it's reasonable to expect a peephole to match compare + n cond_exec
> insns, so running peep2 before if-conversion would seem to be called for.
Agreed.
jeff
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch passes.def]: Fix regression on ARM PR/61608
2014-06-25 17:15 ` Jeff Law
@ 2014-06-25 17:53 ` Kai Tietz
0 siblings, 0 replies; 13+ messages in thread
From: Kai Tietz @ 2014-06-25 17:53 UTC (permalink / raw)
To: Jeff Law; +Cc: Richard Henderson, GCC Patches
2014-06-25 19:15 GMT+02:00 Jeff Law <law@redhat.com>:
> On 06/25/14 10:02, Kai Tietz wrote:
>>
>> 2014-06-25 17:50 GMT+02:00 Richard Henderson <rth@redhat.com>:
>>>
>>> On 06/25/2014 08:28 AM, Jeff Law wrote:
>>>>
>>>> Ask an ARM maintainer if the new code is actually better than the old
>>>> code.
>>>
>>>
>>> It isn't.
>>>
>>>> It appears that with the peep2 pass moved that we actually if-convert
>>>> the
>>>> fall-thru path of the conditional and eliminate the conditional. Which,
>>>> on the
>>>> surface seems like a good thing. It may be the case that we need to
>>>> twiddle
>>>> the test. Not sure yet.
>>>
>>>
>>> Looking at the final code in the pr, it looks like we if-convert both
>>> ways,
>>> just with changed condition on the test.
>>>
>>> It looks like there are at least 3 peepholes in the arm backend that
>>> match
>>> compare+branch with a scratch register that are affected by this. I
>>> don't
>>> think it's reasonable to expect a peephole to match compare + n cond_exec
>>> insns, so running peep2 before if-conversion would seem to be called for.
>>>
>>> Kai, why does your indirect jump peephole require pass_reorder_blocks?
>>> It
>>> seems to me that instead of moving peephole2 down, we could move
>>> duplicate_computed_gotos up.
>>>
>>>
>>> r~
>>
>>
>> We need the peephole2 pass after reorder_blocks due we move in that
>> pass the jump
>>
>> ...
>> (code_label 54 52 55 5 8 "" [0 uses])
>> (note 55 54 56 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
>> (jump_insn 56 55 57 5 (set (pc)
>> (reg/f:DI 0 ax [orig:83 D.3469 ] [83])) 638 {*indirect_jump}
>> (expr_list:REG_DEAD (reg/f:DI 0 ax [orig:83 D.3469 ] [83])
>> (nil)))
>> (barrier 57 56 58)
>> ....
>>
>> into each bb, so it can be resolved to an indirect jump on memory.
>
> But I think Richard's question was can we move up duplicate_computed_gotos
> since that's the code which I'd expect to make this transformation. Then
> you run peep2 immediately after duplicate_computed_gotos.
>
>
> Jeff
>
The pass of question on ARM isn't the duplicate_compute_gotos pass.
It is the if-after-reload pass.
I tested it by moving it.
Kai
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch passes.def]: Fix regression on ARM PR/61608
2014-06-25 13:35 [patch passes.def]: Fix regression on ARM PR/61608 Kai Tietz
2014-06-25 13:56 ` Richard Biener
2014-06-25 14:04 ` Jeff Law
@ 2014-06-25 18:12 ` Richard Henderson
2014-06-25 18:29 ` Kai Tietz
2 siblings, 1 reply; 13+ messages in thread
From: Richard Henderson @ 2014-06-25 18:12 UTC (permalink / raw)
To: Kai Tietz, GCC Patches; +Cc: Jeff Law
[-- Attachment #1: Type: text/plain, Size: 337 bytes --]
On 06/25/2014 06:35 AM, Kai Tietz wrote:
> Hello,
>
> so there seems to be a fallout caused by moving peephole2 pass. See PR/61608.
> So we need indeed 2 peephole2 passes.
We don't need a second peephole pass. Please try this.
I think there's room for cleanup here, depending on when we leave cfglayout
mode for the last time.
r~
[-- Attachment #2: z --]
[-- Type: text/plain, Size: 2272 bytes --]
* bb-reorder.c (pass_duplicate_computed_gotos::execute): Cleanup
the cfg if there were any changes.
* passes.def: Revert move of peephole2 after reorder_blocks;
move duplicate_computed_gotos before peephole2.
diff --git a/gcc/bb-reorder.c b/gcc/bb-reorder.c
index 61b0cab..cd68fee 100644
--- a/gcc/bb-reorder.c
+++ b/gcc/bb-reorder.c
@@ -2520,13 +2520,20 @@ pass_duplicate_computed_gotos::execute (function *fun)
changed = true;
}
-done:
- /* Duplicating blocks above will redirect edges and may cause hot blocks
- previously reached by both hot and cold blocks to become dominated only
- by cold blocks. */
+ done:
if (changed)
- fixup_partitions ();
- cfg_layout_finalize ();
+ {
+ /* Duplicating blocks above will redirect edges and may cause hot
+ blocks previously reached by both hot and cold blocks to become
+ dominated only by cold blocks. */
+ fixup_partitions ();
+
+ /* Merge the duplicated blocks into predecessors, when possible. */
+ cfg_layout_finalize ();
+ cleanup_cfg (0);
+ }
+ else
+ cfg_layout_finalize ();
BITMAP_FREE (candidates);
return 0;
diff --git a/gcc/passes.def b/gcc/passes.def
index 280cf16..f13df6c 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -396,20 +396,13 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_rtl_dse2);
NEXT_PASS (pass_stack_adjustments);
NEXT_PASS (pass_jump2);
+ NEXT_PASS (pass_duplicate_computed_gotos);
+ NEXT_PASS (pass_peephole2);
NEXT_PASS (pass_if_after_reload);
NEXT_PASS (pass_regrename);
NEXT_PASS (pass_cprop_hardreg);
NEXT_PASS (pass_fast_rtl_dce);
- NEXT_PASS (pass_duplicate_computed_gotos);
NEXT_PASS (pass_reorder_blocks);
- /* We need to run peephole2 pass after the duplicate-
- compute-gotos and the reorder-blocks pass (PR/39284).
- We have a single indirect branch in the entire function
- before duplicate-compute-gotos pass. This vastly reduces
- the size of the CFG.
- For preventing to run peephole2 pass twice, its run after
- the jump2 got removed. */
- NEXT_PASS (pass_peephole2);
NEXT_PASS (pass_branch_target_load_optimize2);
NEXT_PASS (pass_leaf_regs);
NEXT_PASS (pass_split_before_sched2);
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch passes.def]: Fix regression on ARM PR/61608
2014-06-25 18:12 ` Richard Henderson
@ 2014-06-25 18:29 ` Kai Tietz
0 siblings, 0 replies; 13+ messages in thread
From: Kai Tietz @ 2014-06-25 18:29 UTC (permalink / raw)
To: Richard Henderson; +Cc: GCC Patches, Jeff Law
2014-06-25 20:12 GMT+02:00 Richard Henderson <rth@redhat.com>:
> On 06/25/2014 06:35 AM, Kai Tietz wrote:
>> Hello,
>>
>> so there seems to be a fallout caused by moving peephole2 pass. See PR/61608.
>> So we need indeed 2 peephole2 passes.
>
> We don't need a second peephole pass. Please try this.
>
> I think there's room for cleanup here, depending on when we leave cfglayout
> mode for the last time.
>
>
> r~
>
My tests shown that it works for ARM and i386 archs. Doing right now a
bootstrap for x86_64
Kai
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2014-06-25 18:29 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-25 13:35 [patch passes.def]: Fix regression on ARM PR/61608 Kai Tietz
2014-06-25 13:56 ` Richard Biener
2014-06-25 14:04 ` Jeff Law
2014-06-25 15:04 ` Kai Tietz
2014-06-25 15:28 ` Jeff Law
2014-06-25 15:47 ` Ramana Radhakrishnan
2014-06-25 15:52 ` Richard Henderson
2014-06-25 16:02 ` Kai Tietz
2014-06-25 17:15 ` Jeff Law
2014-06-25 17:53 ` Kai Tietz
2014-06-25 17:17 ` Jeff Law
2014-06-25 18:12 ` Richard Henderson
2014-06-25 18:29 ` Kai Tietz
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).