* [Bug rtl-optimization/116238] [15 Regression] ICE building 526.blender_r on aarch64 SVE after 3b9b8d6cfdf59337f4b7ce10ce92a98044b2657b
2024-08-05 13:14 [Bug rtl-optimization/116238] New: [15 Regression] ICE building 526.blender_r on aarch64 SVE after 3b9b8d6cfdf59337f4b7ce10ce92a98044b2657b ktkachov at gcc dot gnu.org
@ 2024-08-05 13:16 ` ktkachov at gcc dot gnu.org
2024-08-07 21:36 ` [Bug rtl-optimization/116238] [15 Regression] ICE building 526.blender_r on aarch64 SVE after r15-1619-g3b9b8d6cfdf593 pinskia at gcc dot gnu.org
` (7 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: ktkachov at gcc dot gnu.org @ 2024-08-05 13:16 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116238
ktkachov at gcc dot gnu.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |aarch64
      Known to fail|                            |15.0
   Target Milestone|---                         |15.0
      Known to work|                            |14.1.0
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug rtl-optimization/116238] [15 Regression] ICE building 526.blender_r on aarch64 SVE after r15-1619-g3b9b8d6cfdf593
From: pinskia at gcc dot gnu.org @ 2024-08-07 21:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116238
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Created attachment 58863
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58863&action=edit
Just this file with `-Ofast -msve-vector-bits=128 -march=armv9-a`
[apinski@xeond2 t]$ ../xgcc -B.. t2.c -Ofast -msve-vector-bits=128 -mcpu=neoverse-v2
during RTL pass: reload
t2.c: In function ‘BKE_mask_calc_handle_point_auto’:
t2.c:24:1: internal compiler error: maximum number of generated reload insns per insn achieved (90)
   24 | }
      | ^
0x31e8dd8 internal_error(char const*, ...)
../../gcc/diagnostic-global-context.cc:491
0x14c6f8f lra_constraints(bool)
../../gcc/lra-constraints.cc:5402
0x14af895 lra(_IO_FILE*, int)
../../gcc/lra.cc:2442
0x145779f do_reload
../../gcc/ira.cc:5976
0x1457c3c execute
../../gcc/ira.cc:6164
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
* [Bug rtl-optimization/116238] [15 Regression] ICE building 526.blender_r on aarch64 SVE after r15-1619-g3b9b8d6cfdf593
From: pinskia at gcc dot gnu.org @ 2024-08-07 21:46 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116238
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Created attachment 58864
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58864&action=edit
Reduced testcase
`-Ofast -msve-vector-bits=128 -march=armv9-a -fno-vect-cost-model `
* [Bug rtl-optimization/116238] [15 Regression] ICE building 526.blender_r on aarch64 SVE after r15-1619-g3b9b8d6cfdf593
From: pinskia at gcc dot gnu.org @ 2024-08-07 21:50 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116238
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2024-08-07
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed.
>I've bisected it to the commit g:3b9b8d6cfdf59337f4b7ce10ce92a98044b2657b but I don't know if that is the cause or uncovered a latent problem
I think it was a latent bug.
We are trying to spill `(reg:VNx2QI 102 [ vect_p_6.5 ])` to the stack, but no
instruction pattern matches that spill, so it goes into an infinite loop.
* [Bug rtl-optimization/116238] [15 Regression] ICE building 526.blender_r on aarch64 SVE after r15-1619-g3b9b8d6cfdf593
From: ktkachov at gcc dot gnu.org @ 2024-08-20 8:45 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116238
ktkachov at gcc dot gnu.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rsandifo at gcc dot gnu.org
--- Comment #4 from ktkachov at gcc dot gnu.org ---
CCing Richard about the partial-mode reloads, as he added many of those patterns.
* [Bug rtl-optimization/116238] [15 Regression] ICE building 526.blender_r on aarch64 SVE after r15-1619-g3b9b8d6cfdf593
From: rsandifo at gcc dot gnu.org @ 2024-08-20 15:19 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116238
Richard Sandiford <rsandifo at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |rsandifo at gcc dot gnu.org
--- Comment #5 from Richard Sandiford <rsandifo at gcc dot gnu.org> ---
Yeah, seems to be a latent bug in aarch64_hard_regno_caller_save_mode. A
brute-force reproducer is:
void foo();
typedef unsigned char v2qi __attribute__((vector_size(2)));
void f(v2qi *ptr)
{
  v2qi x = *ptr;
  asm volatile ("" :: "w" (x));
  asm volatile ("" ::: "d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15");
  foo();
  asm volatile ("" :: "w" (x));
  *ptr = x;
}
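
One reading of why this reproducer forces a caller save, sketched as a toy C model rather than anything taken from GCC: `x` must sit in an FP/SIMD register because of the "w" constraint and is live across `foo()`, while the asm clobber of d8-d15 removes the registers whose low 64 bits AAPCS64 preserves across calls. The type `toy_regset`, the two masks and `toy_needs_caller_save` are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: bit N stands for FP/SIMD register vN (v0..v31).
   None of these names exist in GCC.  */
typedef uint32_t toy_regset;

/* AAPCS64 preserves the low 64 bits of v8..v15 across calls.  */
#define TOY_CALL_PRESERVED ((toy_regset) 0x0000ff00u)
#define TOY_ALL_FP_REGS    ((toy_regset) 0xffffffffu)

/* Return true if a value that must stay in an FP register across a call
   has no call-preserved register left, given the registers clobbered by
   inline asm inside its live range.  */
static bool toy_needs_caller_save(toy_regset asm_clobbers)
{
    toy_regset usable = TOY_ALL_FP_REGS & ~asm_clobbers;
    return (usable & TOY_CALL_PRESERVED) == 0;
}

/* Self-check mirroring the reproducer: clobbering d8..d15 leaves only
   call-clobbered registers, so the value has to be saved around foo().  */
static void toy_check_reproducer(void)
{
    assert(toy_needs_caller_save(0x0000ff00u));   /* d8..d15 clobbered */
    assert(!toy_needs_caller_save(0u));           /* nothing clobbered */
    assert(!toy_needs_caller_save(0x00007f00u));  /* d15 still free */
}
```

Under this model, dropping any one of the eight clobbers leaves a call-preserved register free, which is presumably why the reproducer lists all of d8 through d15.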
* [Bug rtl-optimization/116238] [12/13/14/15 Regression] ICE building 526.blender_r on aarch64 SVE after r15-1619-g3b9b8d6cfdf593
From: ktkachov at gcc dot gnu.org @ 2024-08-20 15:50 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116238
ktkachov at gcc dot gnu.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[15 Regression] ICE         |[12/13/14/15 Regression]
                   |building 526.blender_r on   |ICE building 526.blender_r
                   |aarch64 SVE after           |on aarch64 SVE after
                   |r15-1619-g3b9b8d6cfdf593    |r15-1619-g3b9b8d6cfdf593
      Known to work|                            |9.5.0
      Known to fail|                            |10.5.0
--- Comment #6 from ktkachov at gcc dot gnu.org ---
(In reply to Richard Sandiford from comment #5)
> Yeah, seems to be a latent bug in aarch64_hard_regno_caller_save_mode. A
> brute-force reproducer is:
>
> void foo();
> typedef unsigned char v2qi __attribute__((vector_size(2)));
> void f(v2qi *ptr)
> {
> v2qi x = *ptr;
> asm volatile ("" :: "w" (x));
> asm volatile ("" ::: "d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15");
> foo();
> asm volatile ("" :: "w" (x));
> *ptr = x;
> }
Interesting. With this reproducer we get the ICE from GCC 10 onwards when
compiled with -O3 -march=armv8.2-a+sve -msve-vector-bits=128
* [Bug rtl-optimization/116238] [12/13/14/15 Regression] ICE building 526.blender_r on aarch64 SVE after r15-1619-g3b9b8d6cfdf593
From: cvs-commit at gcc dot gnu.org @ 2024-08-21 16:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116238
--- Comment #7 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The trunk branch has been updated by Richard Sandiford <rsandifo@gcc.gnu.org>:
https://gcc.gnu.org/g:ec9d6d45191f639482344362d048294e74587ca3
commit r15-3073-gec9d6d45191f639482344362d048294e74587ca3
Author: Richard Sandiford <richard.sandiford@arm.com>
Date: Wed Aug 21 17:35:47 2024 +0100
aarch64: Fix caller saves of VNx2QI [PR116238]
The testcase contains a VNx2QImode pseudo that is live across a call
and that cannot be allocated a call-preserved register. LRA quite
reasonably tried to save it before the call and restore it afterwards.
Unfortunately, the target told it to do that in SImode, even though
punning between SImode and VNx2QImode is disallowed by both
TARGET_CAN_CHANGE_MODE_CLASS and TARGET_MODES_TIEABLE_P.
The natural class to use for SImode is GENERAL_REGS, so this led
to an unsalvageable situation in which we had:
(set (subreg:VNx2QI (reg:SI A) 0) (reg:VNx2QI B))
where A needed GENERAL_REGS and B needed FP_REGS. We therefore ended
up in a reload loop.
The hooks above should ensure that this situation can never occur
for incoming subregs. It only happened here because the target
explicitly forced it.
The decision to use SImode for modes smaller than 4 bytes dates
back to the beginning of the port, before 16-bit floating-point
modes existed. I'm not sure whether promoting to SImode really
makes sense for any FPR, but that's a separate performance/QoI
discussion. For now, this patch just disallows using SImode
when it is wrong for correctness reasons, since that should be
safer to backport.
gcc/
PR testsuite/116238
* config/aarch64/aarch64.cc (aarch64_hard_regno_caller_save_mode):
Only return SImode if we can convert to and from it.
gcc/testsuite/
PR testsuite/116238
* gcc.target/aarch64/sve/pr116238.c: New test.
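
The decision described in the commit message can be sketched as a small self-contained C model. This is an illustration under stated assumptions, not the code in config/aarch64/aarch64.cc: `toy_mode`, `toy_mode_size`, `toy_can_pun` and both `toy_save_mode_*` functions are hypothetical names, `toy_can_pun` merely stands in for the TARGET_CAN_CHANGE_MODE_CLASS and TARGET_MODES_TIEABLE_P checks, and the partial SVE mode is taken to be 2 bytes, as with the 128-bit vector length used in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a few GCC machine modes; not GCC's enum.  */
enum toy_mode { TOY_HI, TOY_SI, TOY_DI, TOY_VNX2QI };

/* Size in bytes.  TOY_VNX2QI is assumed to be 2 bytes (128-bit SVE).  */
static unsigned toy_mode_size(enum toy_mode m)
{
    switch (m) {
    case TOY_HI:     return 2;
    case TOY_SI:     return 4;
    case TOY_DI:     return 8;
    case TOY_VNX2QI: return 2;
    }
    return 0;
}

/* Assumed stand-in for the target's mode-punning hooks: punning to or
   from the partial SVE mode is refused, everything else is allowed.  */
static bool toy_can_pun(enum toy_mode from, enum toy_mode to)
{
    if (from == to)
        return true;
    return from != TOY_VNX2QI && to != TOY_VNX2QI;
}

/* Old behaviour as described: anything under 4 bytes is saved as SImode.  */
static enum toy_mode toy_save_mode_old(enum toy_mode m)
{
    return toy_mode_size(m) >= 4 ? m : TOY_SI;
}

/* New behaviour as the ChangeLog words it: only return SImode if we can
   convert to and from it; otherwise keep the original mode.  */
static enum toy_mode toy_save_mode_new(enum toy_mode m)
{
    if (toy_mode_size(m) < 4
        && toy_can_pun(m, TOY_SI)
        && toy_can_pun(TOY_SI, m))
        return TOY_SI;
    return m;
}

/* Self-check: the partial SVE mode is no longer forced into SImode,
   while an ordinary 2-byte mode still is.  */
static void toy_check_save_modes(void)
{
    assert(toy_save_mode_old(TOY_VNX2QI) == TOY_SI);
    assert(toy_save_mode_new(TOY_VNX2QI) == TOY_VNX2QI);
    assert(toy_save_mode_new(TOY_HI) == TOY_SI);
    assert(toy_save_mode_new(TOY_DI) == TOY_DI);
}
```

In this model the old function hands LRA a save mode that cannot be punned with the value's own mode, which corresponds to the subreg described above; the new function avoids that case and leaves every other mode's save mode unchanged.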
* [Bug rtl-optimization/116238] [12/13/14 Regression] ICE building 526.blender_r on aarch64 SVE after r15-1619-g3b9b8d6cfdf593
From: rsandifo at gcc dot gnu.org @ 2024-08-21 16:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116238
Richard Sandiford <rsandifo at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to fail|15.0                        |14.2.1
            Summary|[12/13/14/15 Regression]    |[12/13/14 Regression] ICE
                   |ICE building 526.blender_r  |building 526.blender_r on
                   |on aarch64 SVE after        |aarch64 SVE after
                   |r15-1619-g3b9b8d6cfdf593    |r15-1619-g3b9b8d6cfdf593
      Known to work|                            |15.0
--- Comment #8 from Richard Sandiford <rsandifo at gcc dot gnu.org> ---
Fixed on trunk. Will backport after a while if there is no fallout.