* [PATCH, GCC/LRA] Teach LRA to not use same register value for multiple output operands of an insn
@ 2016-07-08 15:08 Thomas Preudhomme
2016-07-08 16:06 ` Mike Stump
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Thomas Preudhomme @ 2016-07-08 15:08 UTC (permalink / raw)
To: gcc-patches, Vladimir Makarov
[-- Attachment #1: Type: text/plain, Size: 1948 bytes --]
Hi,
While investigating the root cause a testsuite regression for the
ARM/embedded-5-branch GCC in gcc.dg/vect/slp-perm-5.c, we found that the bug
seems to also affect trunk. The bug manifests itself as an ICE in cselib due to
a parallel insn with two SET to the same register. When processing the second
SET in cselib_record_set (), the assert gcc_assert (REG_VALUES (dreg)->elt ==
0) fails because the field was already set when processing the first SET. The
root cause seems to be a register allocation issue in lra-constraints.
When considering an output operand with matching input operand(s),
match_reload does a number of checks to see if it can reuse the first matching
input operand register value or if a new unique value should be given to the
output operand. The current check ignores the case of multiple output operands
(as in neon_vtrn<mode>_insn insn pattern in config/arm/arm.md). This can lead
to cases where multiple output operands share a same register value when the
first matching input operand has the same value as another output operand,
leading to the ICE in cselib. This patch changes match_reload to get
information about other output operands and check whether this case is met or
not.
ChangeLog entry is as follows:
*** gcc/ChangeLog ***
2016-07-01 Thomas Preud'homme <thomas.preudhomme@arm.com>
* lra-constraints.c (match_reload): Pass information about other
output operands. Create new unique register value if matching input
operand shares same register value as output operand being considered.
(curr_insn_transform): Record output operands already processed.
Patch passed bootstrap under arm-none-linux-gnueabihf (Cortex-A57 in ARM mode
as well as Thumb mode), aarch64-linux-gnu (Cortex-A57) and x86_64-linux-gnu
and testsuite results does not regress for these and for arm-none-eabi
targeting Cortex-A8.
Is this ok for trunk?
Best regards,
Thomas
[-- Attachment #2: lra_constraint_reg_mult_output.patch --]
[-- Type: text/x-patch, Size: 5323 bytes --]
diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index bf08dce2e0a4c2ef4c339aedbda4dba47cba1645..a3fd6c93c648050f3479dc8aca359a819d24863e 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -871,15 +871,18 @@ regno_val_use_in (unsigned int regno, rtx x)
}
/* Generate reloads for matching OUT and INS (array of input operand
- numbers with end marker -1) with reg class GOAL_CLASS. Add input
- and output reloads correspondingly to the lists *BEFORE and *AFTER.
- OUT might be negative. In this case we generate input reloads for
- matched input operands INS. EARLY_CLOBBER_P is a flag that the
- output operand is early clobbered for chosen alternative. */
+ numbers with end marker -1) with reg class GOAL_CLASS, considering
+ output operands OUTS (similar array to INS) needing to be in different
+ registers. Add input and output reloads correspondingly to the lists
+ *BEFORE and *AFTER. OUT might be negative. In this case we generate
+ input reloads for matched input operands INS. EARLY_CLOBBER_P is a flag
+ that the output operand is early clobbered for chosen alternative. */
static void
-match_reload (signed char out, signed char *ins, enum reg_class goal_class,
- rtx_insn **before, rtx_insn **after, bool early_clobber_p)
+match_reload (signed char out, signed char *ins, signed char *outs,
+ enum reg_class goal_class, rtx_insn **before,
+ rtx_insn **after, bool early_clobber_p)
{
+ bool out_conflict;
int i, in;
rtx new_in_reg, new_out_reg, reg;
machine_mode inmode, outmode;
@@ -968,12 +971,32 @@ match_reload (signed char out, signed char *ins, enum reg_class goal_class,
We don't care about eliminable hard regs here as we are
interesting only in pseudos. */
+ /* Matching input's register value is the same as one of the other
+ output operand. Output operands in a parallel insn must be in
+ different registers. */
+ out_conflict = false;
+ if (REG_P (in_rtx))
+ {
+ for (i = 0; outs[i] >= 0; i++)
+ {
+ rtx other_out_rtx = *curr_id->operand_loc[outs[i]];
+ if (REG_P (other_out_rtx)
+ && (regno_val_use_in (REGNO (in_rtx), other_out_rtx)
+ != NULL_RTX))
+ {
+ out_conflict = true;
+ break;
+ }
+ }
+ }
+
new_in_reg = new_out_reg
= (! early_clobber_p && ins[1] < 0 && REG_P (in_rtx)
&& (int) REGNO (in_rtx) < lra_new_regno_start
&& find_regno_note (curr_insn, REG_DEAD, REGNO (in_rtx))
&& (out < 0
|| regno_val_use_in (REGNO (in_rtx), out_rtx) == NULL_RTX)
+ && !out_conflict
? lra_create_new_reg (inmode, in_rtx, goal_class, "")
: lra_create_new_reg_with_unique_value (outmode, out_rtx,
goal_class, ""));
@@ -3432,9 +3455,11 @@ curr_insn_transform (bool check_only_p)
int i, j, k;
int n_operands;
int n_alternatives;
+ int n_outputs;
int commutative;
signed char goal_alt_matched[MAX_RECOG_OPERANDS][MAX_RECOG_OPERANDS];
signed char match_inputs[MAX_RECOG_OPERANDS + 1];
+ signed char outputs[MAX_RECOG_OPERANDS + 1];
rtx_insn *before, *after;
bool alt_p = false;
/* Flag that the insn has been changed through a transformation. */
@@ -3844,6 +3869,8 @@ curr_insn_transform (bool check_only_p)
}
}
+ n_outputs = 0;
+ outputs[0] = -1;
for (i = 0; i < n_operands; i++)
{
int regno;
@@ -4001,7 +4028,7 @@ curr_insn_transform (bool check_only_p)
/* generate reloads for input and matched outputs. */
match_inputs[0] = i;
match_inputs[1] = -1;
- match_reload (goal_alt_matched[i][0], match_inputs,
+ match_reload (goal_alt_matched[i][0], match_inputs, outputs,
goal_alt[i], &before, &after,
curr_static_id->operand_alternative
[goal_alt_number * n_operands + goal_alt_matched[i][0]]
@@ -4011,9 +4038,9 @@ curr_insn_transform (bool check_only_p)
&& (curr_static_id->operand[goal_alt_matched[i][0]].type
== OP_IN))
/* Generate reloads for output and matched inputs. */
- match_reload (i, goal_alt_matched[i], goal_alt[i], &before, &after,
- curr_static_id->operand_alternative
- [goal_alt_number * n_operands + i].earlyclobber);
+ match_reload (i, goal_alt_matched[i], outputs, goal_alt[i], &before,
+ &after, curr_static_id->operand_alternative
+ [goal_alt_number * n_operands + i].earlyclobber);
else if (curr_static_id->operand[i].type == OP_IN
&& (curr_static_id->operand[goal_alt_matched[i][0]].type
== OP_IN))
@@ -4023,12 +4050,22 @@ curr_insn_transform (bool check_only_p)
for (j = 0; (k = goal_alt_matched[i][j]) >= 0; j++)
match_inputs[j + 1] = k;
match_inputs[j + 1] = -1;
- match_reload (-1, match_inputs, goal_alt[i], &before, &after, false);
+ match_reload (-1, match_inputs, outputs, goal_alt[i], &before,
+ &after, false);
}
else
/* We must generate code in any case when function
process_alt_operands decides that it is possible. */
gcc_unreachable ();
+
+ /* Memorise processed outputs so that output remaining to be processed
+ can avoid using the same register value (see match_reload). */
+ if (curr_static_id->operand[i].type == OP_OUT)
+ {
+ outputs[n_outputs++] = i;
+ outputs[n_outputs] = -1;
+ }
+
if (optional_p)
{
lra_assert (REG_P (op));
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, GCC/LRA] Teach LRA to not use same register value for multiple output operands of an insn
2016-07-08 15:08 [PATCH, GCC/LRA] Teach LRA to not use same register value for multiple output operands of an insn Thomas Preudhomme
@ 2016-07-08 16:06 ` Mike Stump
2016-07-14 9:33 ` Thomas Preudhomme
2016-07-14 16:26 ` Vladimir Makarov
2016-09-19 15:50 ` Vladimir N Makarov
2 siblings, 1 reply; 8+ messages in thread
From: Mike Stump @ 2016-07-08 16:06 UTC (permalink / raw)
To: Thomas Preudhomme; +Cc: gcc-patches, Vladimir Makarov
On Jul 8, 2016, at 8:07 AM, Thomas Preudhomme <thomas.preudhomme@foss.arm.com> wrote:
> While investigating the root cause a testsuite regression for the
> ARM/embedded-5-branch GCC in gcc.dg/vect/slp-perm-5.c, we found that the bug
> seems to also affect trunk.
Hum... If in 6.x, and safe to back port to 6, a back port would be nice... I use LRA in 6.x, and seems like I'd be susceptible to this sort of thing, but, I didn't test it.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, GCC/LRA] Teach LRA to not use same register value for multiple output operands of an insn
2016-07-08 16:06 ` Mike Stump
@ 2016-07-14 9:33 ` Thomas Preudhomme
2016-07-14 12:23 ` Thomas Preudhomme
0 siblings, 1 reply; 8+ messages in thread
From: Thomas Preudhomme @ 2016-07-14 9:33 UTC (permalink / raw)
To: Mike Stump; +Cc: gcc-patches, Vladimir Makarov
On Friday 08 July 2016 09:05:55 Mike Stump wrote:
> On Jul 8, 2016, at 8:07 AM, Thomas Preudhomme
<thomas.preudhomme@foss.arm.com> wrote:
> > While investigating the root cause a testsuite regression for the
> > ARM/embedded-5-branch GCC in gcc.dg/vect/slp-perm-5.c, we found that the
> > bug seems to also affect trunk.
>
> Hum... If in 6.x, and safe to back port to 6, a back port would be nice...
> I use LRA in 6.x, and seems like I'd be susceptible to this sort of thing,
> but, I didn't test it.
It should affect 6.x as well yes since there is no special protection against
it in the code being modified. However I only ever managed to reproduce this on
the ARM/embedded-5-branch.
Best regards,
Thomas
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, GCC/LRA] Teach LRA to not use same register value for multiple output operands of an insn
2016-07-14 9:33 ` Thomas Preudhomme
@ 2016-07-14 12:23 ` Thomas Preudhomme
0 siblings, 0 replies; 8+ messages in thread
From: Thomas Preudhomme @ 2016-07-14 12:23 UTC (permalink / raw)
To: Mike Stump; +Cc: gcc-patches, Vladimir Makarov
On Thursday 14 July 2016 10:32:48 Thomas Preudhomme wrote:
> On Friday 08 July 2016 09:05:55 Mike Stump wrote:
> > On Jul 8, 2016, at 8:07 AM, Thomas Preudhomme
>
> <thomas.preudhomme@foss.arm.com> wrote:
> > > While investigating the root cause a testsuite regression for the
> > > ARM/embedded-5-branch GCC in gcc.dg/vect/slp-perm-5.c, we found that the
> > > bug seems to also affect trunk.
> >
> > Hum... If in 6.x, and safe to back port to 6, a back port would be
> > nice...
> > I use LRA in 6.x, and seems like I'd be susceptible to this sort of thing,
> > but, I didn't test it.
>
> It should affect 6.x as well yes since there is no special protection
> against it in the code being modified. However I only ever managed to
> reproduce this on the ARM/embedded-5-branch.
I've created PR71878 for this problem with more details on how to reproduce
the issue. it only requires to backport a single commit to gcc-5-branch, build
gcc for arm-none-eabi (make all-gcc is enough) and compile the testcase
attached in the PR with the given options.
Please let me know if you need any more details.
Best regards,
Thomas
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, GCC/LRA] Teach LRA to not use same register value for multiple output operands of an insn
2016-07-08 15:08 [PATCH, GCC/LRA] Teach LRA to not use same register value for multiple output operands of an insn Thomas Preudhomme
2016-07-08 16:06 ` Mike Stump
@ 2016-07-14 16:26 ` Vladimir Makarov
2016-09-15 14:56 ` Thomas Preudhomme
2016-09-19 15:50 ` Vladimir N Makarov
2 siblings, 1 reply; 8+ messages in thread
From: Vladimir Makarov @ 2016-07-14 16:26 UTC (permalink / raw)
To: Thomas Preudhomme, gcc-patches
On 07/08/2016 11:07 AM, Thomas Preudhomme wrote:
> Hi,
>
> While investigating the root cause a testsuite regression for the
> ARM/embedded-5-branch GCC in gcc.dg/vect/slp-perm-5.c, we found that the bug
> seems to also affect trunk. The bug manifests itself as an ICE in cselib due to
> a parallel insn with two SET to the same register. When processing the second
> SET in cselib_record_set (), the assert gcc_assert (REG_VALUES (dreg)->elt ==
> 0) fails because the field was already set when processing the first SET. The
> root cause seems to be a register allocation issue in lra-constraints.
>
> When considering an output operand with matching input operand(s),
> match_reload does a number of checks to see if it can reuse the first matching
> input operand register value or if a new unique value should be given to the
> output operand. The current check ignores the case of multiple output operands
> (as in neon_vtrn<mode>_insn insn pattern in config/arm/arm.md). This can lead
> to cases where multiple output operands share a same register value when the
> first matching input operand has the same value as another output operand,
> leading to the ICE in cselib. This patch changes match_reload to get
> information about other output operands and check whether this case is met or
> not.
>
> ChangeLog entry is as follows:
>
> *** gcc/ChangeLog ***
>
> 2016-07-01 Thomas Preud'homme <thomas.preudhomme@arm.com>
>
> * lra-constraints.c (match_reload): Pass information about other
> output operands. Create new unique register value if matching input
> operand shares same register value as output operand being considered.
> (curr_insn_transform): Record output operands already processed.
>
>
> Patch passed bootstrap under arm-none-linux-gnueabihf (Cortex-A57 in ARM mode
> as well as Thumb mode), aarch64-linux-gnu (Cortex-A57) and x86_64-linux-gnu
> and testsuite results does not regress for these and for arm-none-eabi
> targeting Cortex-A8.
Sorry, it took sometime to think about the problem. It is a nontrivial
problem with a lot of possible scenarios in general case. The patch is
safe in any case (it can not create a wrong code if LRA w/o the patch
does not create a wrong code).
> Is this ok for trunk?
>
>
Yes. Thank you, Thomas.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, GCC/LRA] Teach LRA to not use same register value for multiple output operands of an insn
2016-07-14 16:26 ` Vladimir Makarov
@ 2016-09-15 14:56 ` Thomas Preudhomme
0 siblings, 0 replies; 8+ messages in thread
From: Thomas Preudhomme @ 2016-09-15 14:56 UTC (permalink / raw)
To: Vladimir Makarov, gcc-patches, Richard Biener, Jakub Jelinek
Hi Vladimir & release managers,
The patch applies cleanly to gcc-6-branch. Ok to backport?
Best regards,
Thomas
On 14/07/16 17:25, Vladimir Makarov wrote:
> On 07/08/2016 11:07 AM, Thomas Preudhomme wrote:
>> Hi,
>>
>> While investigating the root cause a testsuite regression for the
>> ARM/embedded-5-branch GCC in gcc.dg/vect/slp-perm-5.c, we found that the bug
>> seems to also affect trunk. The bug manifests itself as an ICE in cselib due to
>> a parallel insn with two SET to the same register. When processing the second
>> SET in cselib_record_set (), the assert gcc_assert (REG_VALUES (dreg)->elt ==
>> 0) fails because the field was already set when processing the first SET. The
>> root cause seems to be a register allocation issue in lra-constraints.
>>
>> When considering an output operand with matching input operand(s),
>> match_reload does a number of checks to see if it can reuse the first matching
>> input operand register value or if a new unique value should be given to the
>> output operand. The current check ignores the case of multiple output operands
>> (as in neon_vtrn<mode>_insn insn pattern in config/arm/arm.md). This can lead
>> to cases where multiple output operands share a same register value when the
>> first matching input operand has the same value as another output operand,
>> leading to the ICE in cselib. This patch changes match_reload to get
>> information about other output operands and check whether this case is met or
>> not.
>>
>> ChangeLog entry is as follows:
>>
>> *** gcc/ChangeLog ***
>>
>> 2016-07-01 Thomas Preud'homme <thomas.preudhomme@arm.com>
>>
>> * lra-constraints.c (match_reload): Pass information about other
>> output operands. Create new unique register value if matching input
>> operand shares same register value as output operand being considered.
>> (curr_insn_transform): Record output operands already processed.
>>
>>
>> Patch passed bootstrap under arm-none-linux-gnueabihf (Cortex-A57 in ARM mode
>> as well as Thumb mode), aarch64-linux-gnu (Cortex-A57) and x86_64-linux-gnu
>> and testsuite results does not regress for these and for arm-none-eabi
>> targeting Cortex-A8.
> Sorry, it took sometime to think about the problem. It is a nontrivial problem
> with a lot of possible scenarios in general case. The patch is safe in any case
> (it can not create a wrong code if LRA w/o the patch does not create a wrong code).
>> Is this ok for trunk?
>>
>>
>
> Yes. Thank you, Thomas.
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, GCC/LRA] Teach LRA to not use same register value for multiple output operands of an insn
2016-07-08 15:08 [PATCH, GCC/LRA] Teach LRA to not use same register value for multiple output operands of an insn Thomas Preudhomme
2016-07-08 16:06 ` Mike Stump
2016-07-14 16:26 ` Vladimir Makarov
@ 2016-09-19 15:50 ` Vladimir N Makarov
2016-09-20 9:26 ` Thomas Preudhomme
2 siblings, 1 reply; 8+ messages in thread
From: Vladimir N Makarov @ 2016-09-19 15:50 UTC (permalink / raw)
To: Thomas Preudhomme, gcc-patches
On 07/08/2016 11:07 AM, Thomas Preudhomme wrote:
> Hi,
>
> While investigating the root cause a testsuite regression for the
> ARM/embedded-5-branch GCC in gcc.dg/vect/slp-perm-5.c, we found that the bug
> seems to also affect trunk. The bug manifests itself as an ICE in cselib due to
> a parallel insn with two SET to the same register. When processing the second
> SET in cselib_record_set (), the assert gcc_assert (REG_VALUES (dreg)->elt ==
> 0) fails because the field was already set when processing the first SET. The
> root cause seems to be a register allocation issue in lra-constraints.
>
> When considering an output operand with matching input operand(s),
> match_reload does a number of checks to see if it can reuse the first matching
> input operand register value or if a new unique value should be given to the
> output operand. The current check ignores the case of multiple output operands
> (as in neon_vtrn<mode>_insn insn pattern in config/arm/arm.md). This can lead
> to cases where multiple output operands share a same register value when the
> first matching input operand has the same value as another output operand,
> leading to the ICE in cselib. This patch changes match_reload to get
> information about other output operands and check whether this case is met or
> not.
>
> ChangeLog entry is as follows:
>
> *** gcc/ChangeLog ***
>
> 2016-07-01 Thomas Preud'homme <thomas.preudhomme@arm.com>
>
> * lra-constraints.c (match_reload): Pass information about other
> output operands. Create new unique register value if matching input
> operand shares same register value as output operand being considered.
> (curr_insn_transform): Record output operands already processed.
>
>
> Patch passed bootstrap under arm-none-linux-gnueabihf (Cortex-A57 in ARM mode
> as well as Thumb mode), aarch64-linux-gnu (Cortex-A57) and x86_64-linux-gnu
> and testsuite results does not regress for these and for arm-none-eabi
> targeting Cortex-A8.
>
> Is this ok for trunk?
>
>
Yes, Thomas. I tried to find an alternative solution but your approach
is probably better. Your patch is also ok for gcc-6 branch. I'd delay
its commit to gcc-6 branch for a few days to see how it behaves on the
trunk. Although I don't expect any troubles as the patch is quite safe
with my point of view.
Thank you for the patch and sorry for the delay with the answer.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, GCC/LRA] Teach LRA to not use same register value for multiple output operands of an insn
2016-09-19 15:50 ` Vladimir N Makarov
@ 2016-09-20 9:26 ` Thomas Preudhomme
0 siblings, 0 replies; 8+ messages in thread
From: Thomas Preudhomme @ 2016-09-20 9:26 UTC (permalink / raw)
To: Vladimir N Makarov, gcc-patches
Hi Vladimir,
I'm sorry I failed to give a proper amount of context in that email. You already
approved that patch for trunk one or two months ago and I was just replying to
your approval to ask if you are now ok to commit it to gcc-6-branch.
I've committed it now then since it has been on trunk for long enough ;-)
Best regards.
Thomas
On 19/09/16 16:05, Vladimir N Makarov wrote:
>
>
> On 07/08/2016 11:07 AM, Thomas Preudhomme wrote:
>> Hi,
>>
>> While investigating the root cause a testsuite regression for the
>> ARM/embedded-5-branch GCC in gcc.dg/vect/slp-perm-5.c, we found that the bug
>> seems to also affect trunk. The bug manifests itself as an ICE in cselib due to
>> a parallel insn with two SET to the same register. When processing the second
>> SET in cselib_record_set (), the assert gcc_assert (REG_VALUES (dreg)->elt ==
>> 0) fails because the field was already set when processing the first SET. The
>> root cause seems to be a register allocation issue in lra-constraints.
>>
>> When considering an output operand with matching input operand(s),
>> match_reload does a number of checks to see if it can reuse the first matching
>> input operand register value or if a new unique value should be given to the
>> output operand. The current check ignores the case of multiple output operands
>> (as in neon_vtrn<mode>_insn insn pattern in config/arm/arm.md). This can lead
>> to cases where multiple output operands share a same register value when the
>> first matching input operand has the same value as another output operand,
>> leading to the ICE in cselib. This patch changes match_reload to get
>> information about other output operands and check whether this case is met or
>> not.
>>
>> ChangeLog entry is as follows:
>>
>> *** gcc/ChangeLog ***
>>
>> 2016-07-01 Thomas Preud'homme <thomas.preudhomme@arm.com>
>>
>> * lra-constraints.c (match_reload): Pass information about other
>> output operands. Create new unique register value if matching input
>> operand shares same register value as output operand being considered.
>> (curr_insn_transform): Record output operands already processed.
>>
>>
>> Patch passed bootstrap under arm-none-linux-gnueabihf (Cortex-A57 in ARM mode
>> as well as Thumb mode), aarch64-linux-gnu (Cortex-A57) and x86_64-linux-gnu
>> and testsuite results does not regress for these and for arm-none-eabi
>> targeting Cortex-A8.
>>
>> Is this ok for trunk?
>>
>>
> Yes, Thomas. I tried to find an alternative solution but your approach is
> probably better. Your patch is also ok for gcc-6 branch. I'd delay its commit
> to gcc-6 branch for a few days to see how it behaves on the trunk. Although I
> don't expect any troubles as the patch is quite safe with my point of view.
>
> Thank you for the patch and sorry for the delay with the answer.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2016-09-20 9:20 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-08 15:08 [PATCH, GCC/LRA] Teach LRA to not use same register value for multiple output operands of an insn Thomas Preudhomme
2016-07-08 16:06 ` Mike Stump
2016-07-14 9:33 ` Thomas Preudhomme
2016-07-14 12:23 ` Thomas Preudhomme
2016-07-14 16:26 ` Vladimir Makarov
2016-09-15 14:56 ` Thomas Preudhomme
2016-09-19 15:50 ` Vladimir N Makarov
2016-09-20 9:26 ` Thomas Preudhomme
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).