public inbox for gcc-bugs@sourceware.org
* [Bug target/115487] New: -march=cascadelake causes spilling
@ 2024-06-14 9:31 rguenth at gcc dot gnu.org
2024-06-14 9:42 ` [Bug target/115487] " rguenth at gcc dot gnu.org
2024-06-14 9:59 ` rguenth at gcc dot gnu.org
0 siblings, 2 replies; 3+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-06-14 9:31 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115487
Bug ID: 115487
Summary: -march=cascadelake causes spilling
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Target Milestone: ---
When looking at the report of gcc.target/i386/vect-strided-3.c FAILing with
-march=cascadelake I arrived at using -mno-sse4 instead of just -mno-avx
to keep pextrq from being used instead of movhps.
But then adding -m32 produces
movq %xmm3, 8(%esp)
movd 12(%esp), %xmm1
movd 8(%esp), %xmm0
..
punpckldq %xmm1, %xmm0
but only with -march=cascadelake, not with -march=x86-64. So this confuses
the -march=cascadelake -m32 test result. The same happens with -march=znver2
but not with -mavx2 -mno-sse4.
diff --git a/gcc/testsuite/gcc.target/i386/vect-strided-3.c b/gcc/testsuite/gcc.target/i386/vect-strided-3.c
index b462701a0b2..f9c54a6f715 100644
--- a/gcc/testsuite/gcc.target/i386/vect-strided-3.c
+++ b/gcc/testsuite/gcc.target/i386/vect-strided-3.c
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -msse2 -mno-avx -fno-tree-slp-vectorize" } */
+/* { dg-options "-O2 -msse2 -mno-sse4 -fno-tree-slp-vectorize" } */
void foo (int * __restrict a, int *b, int s)
{
* [Bug target/115487] -march=cascadelake causes spilling
From: rguenth at gcc dot gnu.org @ 2024-06-14 9:42 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115487
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
It looks like it's STV2 doing
28: r131:V4SI=[r102:SI]
30: r132:V4SI=[r102:SI+0x10]
33: r133:DI=vec_select(r131:V4SI#0,parallel)
- 34: [r103:SI]=r133:DI
+ 92: r143:DI#0=vec_merge(vec_duplicate(r133:DI#0),const_vector,0x1)
+ 93: r144:DI#0=vec_merge(vec_duplicate(r133:DI#4),const_vector,0x1)
+ 94: r143:DI#0=vec_select(vec_concat(r143:DI#0,r144:DI#0),parallel)
+ 89: r141:DI#0=vec_merge(vec_duplicate(r133:DI#0),const_vector,0x1)
+ 90: r142:DI#0=vec_merge(vec_duplicate(r133:DI#4),const_vector,0x1)
+ 91: r141:DI#0=vec_select(vec_concat(r141:DI#0,r142:DI#0),parallel)
+ 34: [r103:SI]=r141:DI
36: [r103:SI+0x8]=vec_select(r131:V4SI#0,parallel)
REG_DEAD r131:V4SI
- 38: [r103:SI+0x10]=r133:DI
- REG_DEAD r133:DI
+ 38: [r103:SI+0x10]=r143:DI
+ REG_DEAD r143:DI
for some weird reason, and that later causes spilling.
I'll also note that STV2 doesn't seem to recog any of the created insns.
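
To make the added insns easier to read, here is a rough scalar model in C of what insns 89-91 (and the identical 92-94) appear to compute. It is only a sketch under assumptions: the const_vector is taken to be all zeros, the vec_select indices (abbreviated as "parallel" in the dump) are taken to be the low-lane interleave that punpckldq performs, and all type and function names are hypothetical, not anything from GCC.

```c
#include <stdint.h>

/* Hypothetical model of a V4SI register: four 32-bit lanes. */
typedef struct { uint32_t lane[4]; } v4si_model;

/* Models vec_merge(vec_duplicate(x), const_vector, 0x1), assuming the
   const_vector is zero: lane 0 receives x, the other lanes are 0 (movd). */
v4si_model model_movd(uint32_t x)
{
    v4si_model v = { { x, 0, 0, 0 } };
    return v;
}

/* Models vec_select(vec_concat(a, b), parallel), assuming the selection
   is the low-lane interleave {a0, b0, a1, b1} (punpckldq). */
v4si_model model_punpckldq(v4si_model a, v4si_model b)
{
    v4si_model v = { { a.lane[0], b.lane[0], a.lane[1], b.lane[1] } };
    return v;
}

/* Rebuilds the DImode value from its two SImode halves, mirroring the
   r133:DI#0 / r133:DI#4 subregs used in insns 89-91. */
uint64_t model_rebuild_di(uint64_t r133)
{
    v4si_model lo = model_movd((uint32_t) r133);          /* r133:DI#0 */
    v4si_model hi = model_movd((uint32_t) (r133 >> 32));  /* r133:DI#4 */
    v4si_model v = model_punpckldq(lo, hi);
    /* The low 64 bits of the result are lanes 0 and 1. */
    return ((uint64_t) v.lane[1] << 32) | v.lane[0];
}
```

Under these assumptions model_rebuild_di() returns its argument unchanged, which is the point: the value was already sitting in a vector register (r131), so splitting it into halves and reassembling it is pure overhead, and it is done twice (once for r141, once for r143).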
* [Bug target/115487] -march=cascadelake causes spilling
From: rguenth at gcc dot gnu.org @ 2024-06-14 9:59 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115487
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Building chain #7...
Adding insn 34 to chain #7
r133 def in insn 33 isn't convertible
Mark r133 def in insn 33 as requiring both modes in chain #7
Collected chain #7...
insns: 34
defs to convert: r133
Computing gain for chain #7...
Instruction gain 8 for 34: [r103:SI]=r133:DI
Instruction conversion gain: 8
Registers conversion cost: 6
Total gain: 2
Converting chain #7...
deferring rescan insn with uid = 89.
deferring rescan insn with uid = 90.
deferring rescan insn with uid = 91.
Copied r133 to a vector register r141 for insn 33
The question is why we start the chain at
33: r133:DI=vec_select(r131:V4SI#0,parallel)
rather than at
28: r131:V4SI=[r102:SI]
or why we are working with SImode regs/vectors in the lowering. The code
before STV2 looks perfectly OK, and most definitely the costs are totally
off here.
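
For reference, the arithmetic in the dump for chain #7 can be written out as a small C sketch. The function names are hypothetical and this is not GCC's actual code; it assumes the total is simply the instruction conversion gain minus the register conversion cost, and that a chain is converted whenever the total is positive, which is consistent with "Total gain: 2" being followed by "Converting chain #7...".

```c
#include <stdbool.h>

/* Hypothetical helper: net gain as printed in the dump
   ("Instruction conversion gain" minus "Registers conversion cost"). */
int stv_total_gain(int insn_conversion_gain, int reg_conversion_cost)
{
    return insn_conversion_gain - reg_conversion_cost;
}

/* Assumption: the chain is converted only when the net gain is positive. */
bool stv_would_convert(int total_gain)
{
    return total_gain > 0;
}
```

With the numbers above, stv_total_gain(8, 6) is 2, so the single-insn chain is converted. The margin is small: a register conversion cost of 8 or more for r133 would have left the chain alone.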