public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/96796] New: aarch64: ICE during RTL pass: reload
From: yangyang305 at huawei dot com @ 2020-08-26 10:25 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96796
Bug ID: 96796
Summary: aarch64: ICE during RTL pass: reload
Product: gcc
Version: 9.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: yangyang305 at huawei dot com
Target Milestone: ---
Created attachment 49129
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49129&action=edit
ICE testcase
Hi, gcc-9.3.0 ICEs when compiling the attached testcase with -Os on aarch64.
gcc -Os test.i
during RTL pass: reload
test.c: In function ‘func_50.isra.0.constprop’:
test.c:1852:1: internal compiler error: Max. number of generated reload insns per insn is achieved (90)
0x936ddf lra_constraints(bool)
../../gcc-9.3.0/gcc/lra-constraints.c:4901
0x92144f lra(_IO_FILE*)
../../gcc-9.3.0/gcc/lra.c:2472
0x8e083f do_reload
../../gcc-9.3.0/gcc/ira.c:5523
0x8e083f execute
../../gcc-9.3.0/gcc/ira.c:5707
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
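The "(90)" in the diagnostic reads as a budget on how many reload insns may be generated for a single insn, with the ICE raised once that budget is exceeded instead of looping forever. A minimal sketch of that kind of guard is below. It is a toy model and not GCC's code: `RELOAD_BUDGET`, `run_with_budget` and the two step callbacks are hypothetical names, and only the value 90 is taken from the message above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Budget taken from the "(90)" in the diagnostic; the name is hypothetical.  */
enum { RELOAD_BUDGET = 90 };

/* A step callback models one round of constraint processing.  It returns
   true if the round generated another reload insn that still needs work.  */
typedef bool (*step_fn) (int round);

/* Run rounds until no new reload insn is generated.  Returns the number of
   reload insns generated, or -1 once the budget is exceeded (the toy
   analogue of the internal compiler error).  */
int
run_with_budget (step_fn step)
{
  int generated = 0;

  assert (step != NULL);
  for (int round = 0; step (round); round++)
    if (++generated > RELOAD_BUDGET)
      return -1;
  return generated;
}

/* Hypothetical well-behaved case: three reloads, then done.  */
bool
settles_after_three (int round)
{
  return round < 3;
}

/* Hypothetical cycling case: every round creates yet another reload.  */
bool
never_settles (int round)
{
  (void) round;
  return true;
}
```

Calling `run_with_budget (never_settles)` gives up with -1 rather than spinning, which is the situation the report describes.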
* [Bug rtl-optimization/96796] [9 Regression] aarch64: ICE during RTL pass: reload
From: ktkachov at gcc dot gnu.org @ 2020-08-26 10:39 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96796
ktkachov at gcc dot gnu.org changed:
What |Removed |Added
----------------------------------------------------------------------------
Known to work| |10.1.1, 11.0, 8.4.1
CC| |ktkachov at gcc dot gnu.org
Last reconfirmed| |2020-08-26
Ever confirmed|0 |1
Status|UNCONFIRMED |NEW
Known to fail| |9.3.1
Summary|aarch64: ICE during RTL |[9 Regression] aarch64: ICE
|pass: reload |during RTL pass: reload
--- Comment #1 from ktkachov at gcc dot gnu.org ---
Confirmed on the GCC 9 branch. Other branches don't ICE for me.
* [Bug rtl-optimization/96796] [9 Regression] aarch64: ICE during RTL pass: reload
From: rguenth at gcc dot gnu.org @ 2020-08-26 10:51 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96796
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target| |aarch64
Target Milestone|--- |9.4
* [Bug rtl-optimization/96796] [9 Regression] aarch64: ICE during RTL pass: reload
From: marxin at gcc dot gnu.org @ 2020-08-26 11:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96796
Martin Liška <marxin at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |marxin at gcc dot gnu.org
--- Comment #2 from Martin Liška <marxin at gcc dot gnu.org> ---
There's a reduced test-case:
cat pr96796.c
struct S0 {
  signed f0 : 8;
  unsigned f1;
  unsigned f4;
};
struct S1 {
  long f3;
  char f4;
} g_3_4;
int g_5, func_1_l_32, func_50___trans_tmp_31;
static struct S0 g_144, g_834, g_1255, g_1261;
int g_273[120] = {};
int *g_555;
char **g_979;
static int g_1092_0;
static int g_1193;
int safe_mul_func_int16_t_s_s(int si1, int si2) { return si1 * si2; }
static struct S0 *func_50();
int func_1() { func_50(g_3_4, g_5, func_1_l_32, 8, 3); }
void safe_div_func_int64_t_s_s(int *);
void safe_mod_func_uint32_t_u_u(struct S0);
struct S0 *func_50(int p_51, struct S0 p_52, struct S1 p_53, int p_54,
                   int p_55) {
  int __trans_tmp_30;
  char __trans_tmp_22;
  short __trans_tmp_19;
  long l_985_1;
  long l_1191[8];
  safe_div_func_int64_t_s_s(g_273);
  __builtin_printf((char*)g_1261.f4);
  safe_mod_func_uint32_t_u_u(g_834);
  g_144.f0 += 1;
  for (;;) {
    struct S1 l_1350 = {&l_1350};
    for (; p_53.f3; p_53.f3 -= 1)
      for (; g_1193 <= 2; g_1193 += 1) {
        __trans_tmp_19 = safe_mul_func_int16_t_s_s(l_1191[l_985_1 + p_53.f3],
                                                   p_55 % (**g_979 = 10));
        __trans_tmp_22 = g_1255.f1 * p_53.f4;
        __trans_tmp_30 = __trans_tmp_19 + __trans_tmp_22;
        if (__trans_tmp_30)
          g_1261.f0 = p_51;
        else {
          g_1255.f0 = p_53.f3;
          int *l_1422 = g_834.f0 = g_144.f4 != (*l_1422)++ > 0 < 0 ^ 51;
          g_555 = ~0;
          g_1092_0 |= func_50___trans_tmp_31;
        }
      }
  }
}
* [Bug rtl-optimization/96796] [9 Regression] aarch64: ICE during RTL pass: reload
From: acoplan at gcc dot gnu.org @ 2020-08-26 15:56 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96796
--- Comment #3 from Alex Coplan <acoplan at gcc dot gnu.org> ---
With -fcommon added, I can reproduce this ICE on trunk. The default changed to
-fno-common in GCC 10 (as of 6271dd984d7f920d4fb17ad37af6a1f8e6b796dc).
* [Bug rtl-optimization/96796] [9/10/11 Regression] aarch64: ICE during RTL pass: reload
From: ktkachov at gcc dot gnu.org @ 2020-08-26 16:06 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96796
ktkachov at gcc dot gnu.org changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |ice-on-valid-code
Known to fail| |10.1.1, 11.0
Summary|[9 Regression] aarch64: ICE |[9/10/11 Regression]
|during RTL pass: reload |aarch64: ICE during RTL
| |pass: reload
Known to work|10.1.1, 11.0 |
--- Comment #4 from ktkachov at gcc dot gnu.org ---
Updating the regression markers, then.
* [Bug rtl-optimization/96796] [9/10/11 Regression] aarch64: ICE during RTL pass: reload
From: acoplan at gcc dot gnu.org @ 2020-08-26 20:17 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96796
--- Comment #5 from Alex Coplan <acoplan at gcc dot gnu.org> ---
Started with this change:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=8eaff6ef97836100801f7b40dc03f77fbebe03ac
* [Bug rtl-optimization/96796] [9/10/11 Regression] aarch64: ICE during RTL pass: reload
From: rsandifo at gcc dot gnu.org @ 2020-08-27 11:44 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96796
rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |rsandifo at gcc dot gnu.org
Assignee|unassigned at gcc dot gnu.org |rsandifo at gcc dot gnu.org
Status|NEW |ASSIGNED
--- Comment #6 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
(In reply to Alex Coplan from comment #5)
> Started with this change:
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=8eaff6ef97836100801f7b40dc03f77fbebe03ac
Ah, yeah. What the patch does looks good, but it seems to be
exposing a latent problem with subreg reloads.
The cycling starts with:
----------------------------------------------------------------------------
Changing pseudo 196 in operand 1 of insn 103 on equiv [r105:DI*0x8+r140:DI]
Creating newreg=287, assigning class ALL_REGS to slow/invalid mem r287
Creating newreg=288, assigning class ALL_REGS to slow/invalid mem r288
103: r203:SI=r288:SI<<0x1+r196:DI#0
REG_DEAD r196:DI
Inserting slow/invalid mem reload before:
316: r287:DI=[r105:DI*0x8+r140:DI]
317: r288:SI=r287:DI#0
----------------------------------------------------------------------------
where we now (IMO justifiably) have two reload moves, one for the
memory load and one for the subreg. Next we have:
----------------------------------------------------------------------------
Changing pseudo 196 in operand 3 of insn 103 on equiv [r105:DI*0x8+r140:DI]
Reuse r287 for reload [r105:DI*0x8+r140:DI], change to class POINTER_AND_FP_REGS for r287
Reuse r288 for reload r287:DI#0, change to class POINTER_AND_FP_REGS for r288
1 Non pseudo reload: reject++
3 Non pseudo reload: reject++
alt=0,overall=2,losers=0,rld_nregs=0
Choosing alt 0 in insn 103: (0) =r (1) r (2) n (3) r {*add_lsl_si}
Change to class GENERAL_REGS for r288
----------------------------------------------------------------------------
POINTER_AND_FP_REGS is the class that aarch64 prefers for the reload,
again IMO justifiably. This then gets narrowed to GENERAL_REGS for
the main reload register (r288) because of the use in the *add_lsl_si
instruction. But we're then left with a situation in which r287 has
class POINTER_AND_FP_REGS and is only used in moves. In practice,
each move alternative will require either POINTER_REGS or FP_REGS,
but there's nothing to pin r287 down to a particular one, and we end
up oscillating between them.
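To make the "nothing pins r287 down" point concrete, here is a small sketch that treats register classes as bitmasks of register banks. It is a simplified model under stated assumptions: the `TOY_*` names and both helper functions are hypothetical, and the layout (pointer registers as general registers plus the stack pointer) is only a rough picture of the aarch64 classes named in the dump.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy register classes as bitmasks of banks.  The names are hypothetical
   and only loosely mirror the classes mentioned in the dump.  */
enum toy_class {
  TOY_NO_REGS = 0,
  TOY_GENERAL_REGS = 1 << 0,
  TOY_STACK_REG = 1 << 1,
  TOY_POINTER_REGS = TOY_GENERAL_REGS | TOY_STACK_REG,
  TOY_FP_REGS = 1 << 2,
  TOY_POINTER_AND_FP_REGS = TOY_POINTER_REGS | TOY_FP_REGS
};

/* True if every register in A is also in B.  */
bool
toy_subset_p (unsigned a, unsigned b)
{
  return (a & ~b) == 0;
}

/* Count how many of the two banks (pointer, FP) a move could still pick
   for a pseudo of class CLS.  A result of 2 means the class alone does
   not decide between them.  */
int
toy_bank_choices (unsigned cls)
{
  int choices = 0;

  assert (toy_subset_p (cls, TOY_POINTER_AND_FP_REGS));
  if (cls & TOY_POINTER_REGS)
    choices++;
  if (cls & TOY_FP_REGS)
    choices++;
  return choices;
}
```

In this model a pseudo of class `TOY_POINTER_AND_FP_REGS` leaves two choices open, while one narrowed to `TOY_GENERAL_REGS` leaves exactly one.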
More specifically, we reload insn 316 as follows:
----------------------------------------------------------------------------
Choosing alt 7 in insn 316: (0) r (1) m {*movdi_aarch64}
Creating newreg=289 from oldreg=287, assigning class GENERAL_REGS to r289
316: r289:DI=[r105:DI*0x8+r140:DI]
Inserting insn reload after:
318: r287:DI=r289:DI
----------------------------------------------------------------------------
Here we've effectively chosen to use GENERAL_REGS for the r287 reload,
but made the choice via a new reload register (r289). Next we do:
----------------------------------------------------------------------------
Choosing alt 13 in insn 318: (0) w (1) rZ {*movdi_aarch64}
Creating newreg=290 from oldreg=287, assigning class FP_REGS to r290
318: r290:DI=r289:DI
Inserting insn reload after:
319: r287:DI=r290:DI
----------------------------------------------------------------------------
Here we've eschewed the r<-r alternative because of the risk of cycling,
so this time we've effectively chosen to use FP_REGS for r287 (instead
of GENERAL_REGS as above). This choice too is made via a new reload
register (r290). We manage to break a potential cycle here, but we've
still left r287 as POINTER_AND_FP_REGS.
Next we move on to the second of the original two reload instructions:
----------------------------------------------------------------------------
Choosing alt 13 in insn 317: (0) r (1) w {*movsi_aarch64}
Creating newreg=291, assigning class FP_REGS to r291
317: r288:SI=r291:SI
Inserting insn reload before:
320: r291:SI=r287:DI#0
----------------------------------------------------------------------------
Here too we've rejected r<-r because of potential cycling, and
so have effectively chosen to put r287 in FP_REGS. The “problem”
is that this time we've reloaded the subreg input rather than the
register output, and so we have the same problem when reloading
the subreg the next time round.
IMO the handling of the first reload shows that it would be better
to restrict the class of r287 rather than generate a new reload
register r289. Doing that might then require a reload in the uses
of r287, but that might happen anyway, since the new class would
still be a subset of the old class, and so any register chosen
for the new class could also have been chosen for the old class.
At least we'd be making forward progress by restricting the class,
and we'd avoid unnecessary moves via the FP register bank.
I'm testing a patch.
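The forward-progress argument can be illustrated with a toy simulation. This is a sketch under loud assumptions, not LRA: `simulate_reloads`, the `BANK_*` constants and the alternating pick are all hypothetical, chosen only to mimic the GENERAL_REGS/FP_REGS back-and-forth seen in the dump.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical banks; BANK_BOTH plays the role of the wide class.  */
enum { BANK_GENERAL = 1, BANK_FP = 2, BANK_BOTH = BANK_GENERAL | BANK_FP };

/* Simulate reloading moves that involve one pseudo whose class starts as
   BANK_BOTH.  Each round picks a bank for the move, alternating as in the
   dump.  With NARROW_IN_PLACE the pick is installed on the pseudo itself.
   Otherwise the pick goes to a fresh pseudo plus an extra move, and the
   original class is left unchanged.
   Returns the number of extra moves, or -1 if BUDGET is exceeded.  */
int
simulate_reloads (bool narrow_in_place, int budget)
{
  unsigned cls = BANK_BOTH;
  unsigned pick = BANK_GENERAL;
  int extra_moves = 0;

  assert (budget >= 0);
  while (cls == BANK_BOTH)
    {
      if (narrow_in_place)
        cls &= pick;            /* forward progress: the class shrinks */
      else
        {
          if (++extra_moves > budget)
            return -1;          /* still BANK_BOTH after too many moves */
          pick = (pick == BANK_GENERAL) ? BANK_FP : BANK_GENERAL;
        }
    }
  return extra_moves;
}
```

With narrowing the loop ends immediately and adds no moves. Without it the class never changes, so any finite budget is eventually exhausted.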
* [Bug rtl-optimization/96796] [9/10/11 Regression] aarch64: ICE during RTL pass: reload
From: rsandifo at gcc dot gnu.org @ 2020-08-28 13:56 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96796
--- Comment #7 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
Created attachment 49149
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49149&action=edit
Posted patch
* [Bug rtl-optimization/96796] [9/10/11 Regression] aarch64: ICE during RTL pass: reload
From: cvs-commit at gcc dot gnu.org @ 2020-09-07 19:16 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96796
--- Comment #8 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Sandiford <rsandifo@gcc.gnu.org>:
https://gcc.gnu.org/g:6001db79c477b03eacc7e7049560921fb54b7845
commit r11-3041-g6001db79c477b03eacc7e7049560921fb54b7845
Author: Richard Sandiford <richard.sandiford@arm.com>
Date: Mon Sep 7 20:15:36 2020 +0100
lra: Avoid cycling on certain subreg reloads [PR96796]
This PR is about LRA cycling for a reload of the form:
----------------------------------------------------------------------------
Changing pseudo 196 in operand 1 of insn 103 on equiv [r105:DI*0x8+r140:DI]
Creating newreg=287, assigning class ALL_REGS to slow/invalid mem r287
Creating newreg=288, assigning class ALL_REGS to slow/invalid mem r288
103: r203:SI=r288:SI<<0x1+r196:DI#0
REG_DEAD r196:DI
Inserting slow/invalid mem reload before:
316: r287:DI=[r105:DI*0x8+r140:DI]
317: r288:SI=r287:DI#0
----------------------------------------------------------------------------
The problem is with r287. We rightly give it a broad starting class of
POINTER_AND_FP_REGS (reduced from ALL_REGS by preferred_reload_class).
However, we never make forward progress towards narrowing it down to
a specific choice of class (POINTER_REGS or FP_REGS).
I think in practice we rely on two things to narrow a reload pseudo's
class down to a specific choice:
(1) a restricted class is specified when the pseudo is created
This happens for input address reloads, where the class is taken
from the target's chosen base register class. It also happens
for simple REG reloads, where the class is taken from the chosen
alternative's constraints.
(2) uses of the reload pseudo as a direct input operand
In this case get_reload_reg tries to reuse the existing register
and narrow its class, instead of creating a new reload pseudo.
However, neither occurs here. As described above, r287 rightly
starts out with a wide choice of class, ultimately derived from
ALL_REGS, so we don't get (1). And as the comments in the PR
explain, r287 is never used as an input reload, only the subreg is,
so we don't get (2):
----------------------------------------------------------------------------
Choosing alt 13 in insn 317: (0) r (1) w {*movsi_aarch64}
Creating newreg=291, assigning class FP_REGS to r291
317: r288:SI=r291:SI
Inserting insn reload before:
320: r291:SI=r287:DI#0
----------------------------------------------------------------------------
IMO, in this case we should rely on the reload of insn 316 to narrow
down the class of r287. Currently we do:
----------------------------------------------------------------------------
Choosing alt 7 in insn 316: (0) r (1) m {*movdi_aarch64}
Creating newreg=289 from oldreg=287, assigning class GENERAL_REGS to r289
316: r289:DI=[r105:DI*0x8+r140:DI]
Inserting insn reload after:
318: r287:DI=r289:DI
----------------------------------------------------------------------------
i.e. we create a new pseudo register r289 and give *that* pseudo
GENERAL_REGS instead. This is because get_reload_reg only narrows
down the existing class for OP_IN and OP_INOUT, not OP_OUT.
But if we have a reload pseudo in a reload instruction and have chosen
a specific class for the reload pseudo, I think we should simply install
it for OP_OUT reloads too, if the class is a subset of the existing class.
We will need to pick such a register whatever happens (for r289 in the
example above). And as explained in the PR, doing this actually avoids
an unnecessary move via the FP registers too.
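The rule described above (narrow an existing reload pseudo for output operands too, provided the chosen class is a subset of its current class) might be modelled roughly as follows. This is a simplified sketch of the behaviour the commit message describes, not the code in lra-constraints.c. Every name in it (`toy_try_reuse`, `struct toy_pseudo`, the `out_narrowing` switch) is hypothetical, and classes are plain bitmasks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_op_type { TOY_OP_IN, TOY_OP_OUT, TOY_OP_INOUT };

struct toy_pseudo {
  unsigned cls;           /* bitmask of allowed register banks */
  bool is_reload_pseudo;  /* created by the reload pass itself */
};

/* Try to install class WANTED on P instead of creating a new pseudo.
   Returns true if P was reused and its class narrowed.  OUT_NARROWING
   selects the proposed behaviour for output operands; without it, output
   operands are never narrowed in place.  */
bool
toy_try_reuse (struct toy_pseudo *p, enum toy_op_type type, unsigned wanted,
               bool in_reload_insn, bool out_narrowing)
{
  assert (p != NULL);
  /* Only narrowing is allowed: WANTED must be a nonempty subset.  */
  if (wanted == 0 || (wanted & ~p->cls) != 0)
    return false;
  if (type == TOY_OP_OUT
      && !(out_narrowing && p->is_reload_pseudo && in_reload_insn))
    return false;
  p->cls = wanted;
  return true;
}
```

For a pseudo standing in for r287 with both banks allowed, the call with `out_narrowing` false leaves the class untouched (the old behaviour, which then needs a new pseudo), while the call with it true installs the narrower class directly.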
The patch is quite aggressive in that it does this for all reload
pseudos in all reload instructions. I wondered about reusing the
condition for a reload move in in_class_p:
INSN_UID (curr_insn) >= new_insn_uid_start
&& curr_insn_set != NULL
&& ((OBJECT_P (SET_SRC (curr_insn_set))
     && ! CONSTANT_P (SET_SRC (curr_insn_set)))
    || (GET_CODE (SET_SRC (curr_insn_set)) == SUBREG
        && OBJECT_P (SUBREG_REG (SET_SRC (curr_insn_set)))
        && ! CONSTANT_P (SUBREG_REG (SET_SRC (curr_insn_set)))))
but I can't really justify that on first principles. I think we
should apply the rule consistently until we have a specific reason
for doing otherwise.
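For readers unfamiliar with the RTL macros, the shape of the "reload move" condition quoted above can be mirrored on a toy insn record. This is only an illustration: `struct toy_insn`, `toy_reload_move_p` and the `TOY_*` kinds are hypothetical, and treating "non-constant object" as "register or memory" is an assumption of the model rather than a statement about OBJECT_P.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical source kinds for a single-set insn.  */
enum toy_kind { TOY_REG, TOY_MEM, TOY_CONST, TOY_SUBREG, TOY_ARITH };

struct toy_insn {
  int uid;                     /* insn number, as printed in the dump */
  bool single_set;             /* models curr_insn_set != NULL */
  enum toy_kind src;           /* kind of the source operand */
  enum toy_kind subreg_inner;  /* meaningful only when src is TOY_SUBREG */
};

/* Assumption: a "non-constant object" is a register or a memory reference.  */
static bool
toy_plain_object_p (enum toy_kind kind)
{
  return kind == TOY_REG || kind == TOY_MEM;
}

/* True if INSN was created at or after NEW_INSN_UID_START and is a simple
   move from a register or memory, possibly through a subreg.  */
bool
toy_reload_move_p (const struct toy_insn *insn, int new_insn_uid_start)
{
  assert (insn != NULL);
  if (insn->uid < new_insn_uid_start || !insn->single_set)
    return false;
  if (toy_plain_object_p (insn->src))
    return true;
  return insn->src == TOY_SUBREG && toy_plain_object_p (insn->subreg_inner);
}
```

Assuming, purely for illustration, that generated insns start at uid 316, the load in insn 316 and the subreg move in insn 317 both qualify, while the original add in insn 103 does not.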
gcc/
PR rtl-optimization/96796
* lra-constraints.c (in_class_p): Add a default-false
allow_all_reload_class_changes_p parameter. Do not treat
reload moves specially when the parameter is true.
(get_reload_reg): Try to narrow the class of an existing OP_OUT
reload if we're reloading a reload pseudo in a reload instruction.
gcc/testsuite/
PR rtl-optimization/96796
* gcc.c-torture/compile/pr96796.c: New test.
* [Bug rtl-optimization/96796] [9/10 Regression] aarch64: ICE during RTL pass: reload
From: rsandifo at gcc dot gnu.org @ 2020-09-11 12:24 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96796
rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|[9/10/11 Regression] |[9/10 Regression] aarch64:
|aarch64: ICE during RTL |ICE during RTL pass: reload
|pass: reload |
--- Comment #9 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
Fixed on trunk so far.
* [Bug rtl-optimization/96796] [9/10 Regression] aarch64: ICE during RTL pass: reload
From: cvs-commit at gcc dot gnu.org @ 2021-04-24 8:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96796
--- Comment #10 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-10 branch has been updated by Richard Sandiford
<rsandifo@gcc.gnu.org>:
https://gcc.gnu.org/g:e720d3033f84802147d2da9e923bd862cdb73164
commit r10-9764-ge720d3033f84802147d2da9e923bd862cdb73164
Author: Richard Sandiford <richard.sandiford@arm.com>
Date: Sat Apr 24 09:37:26 2021 +0100
lra: Avoid cycling on certain subreg reloads [PR96796]
This backport is less aggressive than the trunk version, in that the new
code reuses the test for a reload move from in_class_p. We will therefore
only narrow OP_OUT classes if the instruction is a register move or memory
load that was generated by LRA itself.
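The difference between the two variants can be summarised as a small decision function. This is a sketch of the policy as worded in the commit messages, not of the actual patch. `toy_may_narrow_output` and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_policy { TOY_TRUNK, TOY_BACKPORT };

/* Decide whether an output reload may narrow the class of an existing
   pseudo in place.  REG_MOVE_OR_MEM_LOAD says whether the insn is a plain
   register move or memory load.  */
bool
toy_may_narrow_output (enum toy_policy policy, bool reload_pseudo,
                       bool lra_generated_insn, bool reg_move_or_mem_load)
{
  /* Both variants only consider reload pseudos in insns made by LRA.  */
  if (!reload_pseudo || !lra_generated_insn)
    return false;
  /* The backport additionally insists on a simple move or load.  */
  if (policy == TOY_BACKPORT)
    return reg_move_or_mem_load;
  assert (policy == TOY_TRUNK);
  return true;
}
```

The two policies agree on insns like 316 in the dump (a generated memory load) and differ only for generated insns that are not simple moves or loads.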
gcc/
PR rtl-optimization/96796
* lra-constraints.c (in_class_p): Add a default-false
allow_all_reload_class_changes_p parameter. Do not treat
reload moves specially when the parameter is true.
(get_reload_reg): Try to narrow the class of an existing OP_OUT
reload if we're reloading a reload pseudo in a reload instruction.
gcc/testsuite/
PR rtl-optimization/96796
* gcc.c-torture/compile/pr96796.c: New test.
* [Bug rtl-optimization/96796] [9 Regression] aarch64: ICE during RTL pass: reload
From: cvs-commit at gcc dot gnu.org @ 2021-04-25 13:51 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96796
--- Comment #11 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-9 branch has been updated by Richard Sandiford
<rsandifo@gcc.gnu.org>:
https://gcc.gnu.org/g:49cc1253d079bbefc18275f29adc526679422176
commit r9-9463-g49cc1253d079bbefc18275f29adc526679422176
Author: Richard Sandiford <richard.sandiford@arm.com>
Date: Sun Apr 25 14:51:14 2021 +0100
lra: Avoid cycling on certain subreg reloads [PR96796]
This backport is less aggressive than the trunk version, in that the new
code reuses the test for a reload move from in_class_p. We will therefore
only narrow OP_OUT classes if the instruction is a register move or memory
load that was generated by LRA itself.
gcc/
PR rtl-optimization/96796
* lra-constraints.c (in_class_p): Add a default-false
allow_all_reload_class_changes_p parameter. Do not treat
reload moves specially when the parameter is true.
(get_reload_reg): Try to narrow the class of an existing OP_OUT
reload if we're reloading a reload pseudo in a reload instruction.
gcc/testsuite/
PR rtl-optimization/96796
* gcc.c-torture/compile/pr96796.c: New test.
* [Bug rtl-optimization/96796] [9 Regression] aarch64: ICE during RTL pass: reload
From: rsandifo at gcc dot gnu.org @ 2021-04-25 13:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96796
rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
Status|ASSIGNED |RESOLVED
--- Comment #12 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
Fixed for GCC 9 and above. Thanks for the bug report.