* [Bug c/111970] [tree-optimization] SLP for non-IFN gathers result in RISC-V test failure on gather
2023-10-25 2:02 [Bug c/111970] New: [tree-optimization] SLP for non-IFN gathers result in RISC-V test failure on gather pan2.li at intel dot com
@ 2023-10-25 2:03 ` pan2.li at intel dot com
2023-10-25 2:53 ` [Bug tree-optimization/111970] [14 regression] SLP for non-IFN gathers result in RISC-V test failure on gather since r14-4745-gbeab5b95c58145 pan2.li at intel dot com
` (21 subsequent siblings)
22 siblings, 0 replies; 24+ messages in thread
From: pan2.li at intel dot com @ 2023-10-25 2:03 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111970
--- Comment #1 from Li Pan <pan2.li at intel dot com> ---
Created attachment 56198
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56198&action=edit
Without this commit
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Bug tree-optimization/111970] [14 regression] SLP for non-IFN gathers result in RISC-V test failure on gather since r14-4745-gbeab5b95c58145
From: pan2.li at intel dot com @ 2023-10-25 2:53 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111970
--- Comment #2 from Li Pan <pan2.li at intel dot com> ---
Add more information about how to build and run the test cases.
Build:
../__RISC-V_INSTALL___RV64/bin/riscv64-unknown-elf-gcc -march=rv64imafdcv -mabi=lp64d \
-ftree-vectorize -O3 --param riscv-autovec-preference=fixed-vlmax \
--param riscv-autovec-lmul=dynamic -ffast-math -lm \
gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c \
-o test.elf
Run:
qemu-riscv64 -cpu rv64,v=true,vlen=128,elen=64,vext_spec=v1.0 test.elf
assertion "dest_float_uint8_t[i * 2] == (src_float_uint8_t [index_float_uint8_t[i * 2]] + 1)" failed: file "gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c", line 106, function: main
* [Bug tree-optimization/111970] [14 regression] SLP for non-IFN gathers result in RISC-V test failure on gather since r14-4745-gbeab5b95c58145
From: pan2.li at intel dot com @ 2023-10-25 8:05 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111970
--- Comment #3 from Li Pan <pan2.li at intel dot com> ---
Double-checked: GCC trunk still has this issue.
* [Bug tree-optimization/111970] [14 regression] SLP for non-IFN gathers result in RISC-V test failure on gather since r14-4745-gbeab5b95c58145
From: rguenth at gcc dot gnu.org @ 2023-10-27 12:29 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111970
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|--- |14.0
Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
Status|UNCONFIRMED |ASSIGNED
Priority|P3 |P1
Ever confirmed|0 |1
Last reconfirmed| |2023-10-27
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
I will have a look. The patch should have been a noop for IFN gather loads
since those are processed differently via pattern recog.
* [Bug tree-optimization/111970] [14 regression] SLP for non-IFN gathers result in RISC-V test failure on gather since r14-4745-gbeab5b95c58145
From: pan2.li at intel dot com @ 2023-10-27 13:03 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111970
--- Comment #5 from Li Pan <pan2.li at intel dot com> ---
Thank you. If there is anything I can help with, please feel free to let me know.
* [Bug tree-optimization/111970] [14 regression] SLP for non-IFN gathers result in RISC-V test failure on gather since r14-4745-gbeab5b95c58145
From: rguenth at gcc dot gnu.org @ 2023-10-31 12:49 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111970
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
So I can see we don't recognize a gather IFN during pattern recog here.
t.c:15:1: note: Final SLP tree for instance 0x502e9a0:
t.c:15:1: note: node 0x4f84700 (max_nunits=128, refcnt=2) vector(32) float
t.c:15:1: note: op template: *_10 = _11;
t.c:15:1: note: stmt 0 *_10 = _11;
t.c:15:1: note: stmt 1 *_20 = _21;
t.c:15:1: note: children 0x4f84790
t.c:15:1: note: node 0x4f84790 (max_nunits=128, refcnt=2) vector(32) float
t.c:15:1: note: op template: _11 = _8 + 1.0e+0;
t.c:15:1: note: stmt 0 _11 = _8 + 1.0e+0;
t.c:15:1: note: stmt 1 _21 = _18 + 2.0e+0;
t.c:15:1: note: children 0x4f84820 0x4f84940
t.c:15:1: note: node 0x4f84820 (max_nunits=128, refcnt=2) vector(32) float
t.c:15:1: note: op template: _8 = *_7;
t.c:15:1: note: stmt 0 _8 = *_7;
t.c:15:1: note: stmt 1 _18 = *_17;
t.c:15:1: note: children 0x4f848b0
t.c:15:1: note: node 0x4f848b0 (max_nunits=128, refcnt=2) vector(128) unsigned char
t.c:15:1: note: op template: _4 = *_3;
t.c:15:1: note: stmt 0 _4 = *_3;
t.c:15:1: note: stmt 1 _14 = *_13;
t.c:15:1: note: load permutation { 0 1 }
t.c:15:1: note: node (constant) 0x4f84940 (max_nunits=1, refcnt=1)
t.c:15:1: note: { 1.0e+0, 2.0e+0 }
t.c:15:1: note: === vect_match_slp_patterns ===
t.c:15:1: note: Analyzing SLP tree 0x4f84700 for patterns
t.c:15:1: note: === vect_make_slp_decision ===
t.c:15:1: note: Decided to SLP 1 instances. Unrolling factor 64
It tries a few other modes, one even having .MASK_LEN_GATHER_LOAD, but that
fails to build SLP. In the end we choose
t.c:15:1: note: ***** Choosing vector mode RVVM4QI
t.c:15:1: note: ***** Choosing epilogue vector mode RVVMF4QI
The main loop instance is
t.c:15:1: note: Vectorizing SLP tree:
t.c:15:1: note: node 0x4f849d0 (max_nunits=64, refcnt=1) vector(32) float
t.c:15:1: note: op template: *_10 = _11;
t.c:15:1: note: stmt 0 *_10 = _11;
t.c:15:1: note: stmt 1 *_20 = _21;
t.c:15:1: note: children 0x4f84a60
t.c:15:1: note: node 0x4f84a60 (max_nunits=64, refcnt=1) vector(32) float
t.c:15:1: note: op template: _11 = _8 + 1.0e+0;
t.c:15:1: note: stmt 0 _11 = _8 + 1.0e+0;
t.c:15:1: note: stmt 1 _21 = _18 + 2.0e+0;
t.c:15:1: note: children 0x4f84af0 0x4f84c10
t.c:15:1: note: node 0x4f84af0 (max_nunits=64, refcnt=1) vector(32) float
t.c:15:1: note: op template: _8 = *_7;
t.c:15:1: note: stmt 0 _8 = *_7;
t.c:15:1: note: stmt 1 _18 = *_17;
t.c:15:1: note: children 0x4f84b80
t.c:15:1: note: node 0x4f84b80 (max_nunits=64, refcnt=1) vector(64) unsigned char
t.c:15:1: note: op template: _4 = *_3;
t.c:15:1: note: stmt 0 _4 = *_3;
t.c:15:1: note: stmt 1 _14 = *_13;
t.c:15:1: note: node (constant) 0x4f84c10 (max_nunits=1, refcnt=1) vector(32) float
t.c:15:1: note: { 1.0e+0, 2.0e+0 }
So the main loop uses an emulated gather, whereas the epilogue is not
SLP-vectorized but uses gathers here.
# vectp_index.6_209 = PHI <vectp_index.6_210(5), index_25(D)(2)>
# vectp_y.12_601 = PHI <vectp_y.12_602(5), y_27(D)(2)>
vect__4.8_211 = MEM <vector(64) unsigned char> [(uint8_t *)vectp_index.6_209];
...
MEM <vector(32) float> [(float *)vectp_y.12_601] = vect__11.11_599;
vectp_y.12_604 = vectp_y.12_601 + 128;
MEM <vector(32) float> [(float *)vectp_y.12_604] = vect__11.11_599;
...
vectp_index.6_210 = vectp_index.6_209 + 64;
vectp_y.12_602 = vectp_y.12_604 + 128;
ivtmp_607 = ivtmp_606 + 1;
if (ivtmp_607 < 3)
Those IV updates look OK to me.
So I'm not sure what to do. Does the testcase execute correctly with
--param vect-epilogues-nomask=0?
* [Bug tree-optimization/111970] [14 regression] SLP for non-IFN gathers result in RISC-V test failure on gather since r14-4745-gbeab5b95c58145
From: pan2.li at intel dot com @ 2023-10-31 14:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111970
--- Comment #7 from Li Pan <pan2.li at intel dot com> ---
No luck with --param vect-epilogues-nomask=0 either. I will retry with the
newest upstream and keep you posted.
../__RISC-V_INSTALL___RV64/bin/riscv64-unknown-elf-gcc -march=rv64imafdcv -mabi=lp64d \
-ftree-vectorize -O3 --param riscv-autovec-preference=fixed-vlmax \
--param riscv-autovec-lmul=dynamic --param vect-epilogues-nomask=0 \
-ffast-math -lm \
gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c \
-o test.elf
../build-qemu/qemu-riscv64 -cpu rv64,v=true,vlen=128,elen=64,vext_spec=v1.0 test.elf
assertion "dest_int32_t_int8_t[i * 2] == (src_int32_t_int8_t [index_int32_t_int8_t[i * 2]] + 1)" failed: file "gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c", line 45, function: main
../__RISC-V_INSTALL___RV64/bin/riscv64-unknown-elf-gcc --version
riscv64-unknown-elf-gcc (GCC) 14.0.0 20231019 (experimental)
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
* [Bug tree-optimization/111970] [14 regression] SLP for non-IFN gathers result in RISC-V test failure on gather since r14-4745-gbeab5b95c58145
From: pan2.li at intel dot com @ 2023-10-31 14:16 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111970
--- Comment #8 from Li Pan <pan2.li at intel dot com> ---
It still fails with upstream.
../__RISC-V_INSTALL___RV64/bin/riscv64-unknown-elf-gcc -march=rv64imafdcv -mabi=lp64d \
-ftree-vectorize -O3 --param riscv-autovec-preference=fixed-vlmax \
--param riscv-autovec-lmul=dynamic --param vect-epilogues-nomask=0 \
-ffast-math -lm \
gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c \
-o test.elf
../build-qemu/qemu-riscv64 -cpu rv64,v=true,vlen=128,elen=64,vext_spec=v1.0 test.elf
assertion "dest_int32_t_int8_t[i * 2] == (src_int32_t_int8_t [index_int32_t_int8_t[i * 2]] + 1)" failed: file "gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c", line 45, function: main
../__RISC-V_INSTALL___RV64/bin/riscv64-unknown-elf-gcc --version
riscv64-unknown-elf-gcc (GCC) 14.0.0 20231021 (experimental)
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
* [Bug tree-optimization/111970] [14 regression] SLP for non-IFN gathers result in RISC-V test failure on gather since r14-4745-gbeab5b95c58145
From: rguenth at gcc dot gnu.org @ 2023-11-13 12:42 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111970
--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
Does it still occur after the last round of fixes?
* [Bug tree-optimization/111970] [14 regression] SLP for non-IFN gathers result in RISC-V test failure on gather since r14-4745-gbeab5b95c58145
From: juzhe.zhong at rivai dot ai @ 2023-11-13 22:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111970
--- Comment #10 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
(In reply to Richard Biener from comment #9)
> Does it still occur after the last round of fixes?
Hi, Richard. The FAIL still exists. We will revisit it later to see whether it
is a RISC-V backend issue.
It may not be a middle-end issue.
* [Bug tree-optimization/111970] [14 regression] SLP for non-IFN gathers result in RISC-V test failure on gather since r14-4745-gbeab5b95c58145
From: juzhe.zhong at rivai dot ai @ 2023-11-20 6:49 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111970
--- Comment #11 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
Hi, Richard.
I came back to revisit this bug.
I found that if I apply this change:
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 4a09b3c2aca..2fd128672b9 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -1434,7 +1434,6 @@ vect_build_slp_tree_1 (vec_info *vinfo, unsigned char *swap,
&& rhs_code != CFN_GATHER_LOAD
&& rhs_code != CFN_MASK_GATHER_LOAD
&& rhs_code != CFN_MASK_LEN_GATHER_LOAD
- && !STMT_VINFO_GATHER_SCATTER_P (stmt_info)
/* Not grouped loads are handled as externals for BB
vectorization. For loop vectorization we can handle
splats the same we handle single element interleaving. */
With this change the bug goes away, but I am not sure whether it is the correct fix.
Compile options to reproduce the bug on RISC-V:
-O3 --param=riscv-autovec-preference=fixed-vlmax --param riscv-autovec-lmul=m4 -fno-vect-cost-model -ffast-math
* [Bug tree-optimization/111970] [14 regression] SLP for non-IFN gathers result in RISC-V test failure on gather since r14-4745-gbeab5b95c58145
From: juzhe.zhong at rivai dot ai @ 2023-11-20 6:51 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111970
--- Comment #12 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
Created attachment 56648
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56648&action=edit
Optimized dump which is buggy
This is the buggy vectorized dump IR
* [Bug tree-optimization/111970] [14 regression] SLP for non-IFN gathers result in RISC-V test failure on gather since r14-4745-gbeab5b95c58145
From: juzhe.zhong at rivai dot ai @ 2023-11-20 6:52 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111970
--- Comment #13 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
Created attachment 56649
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56649&action=edit
Correct vectorized optimized dump
This is the optimized dump IR that runs correctly.
* [Bug tree-optimization/111970] [14 regression] SLP for non-IFN gathers result in RISC-V test failure on gather since r14-4745-gbeab5b95c58145
From: juzhe.zhong at rivai dot ai @ 2023-11-20 8:10 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111970
--- Comment #14 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
Just confirmed on aarch64 QEMU: it seems that ARM SVE has the same issue as RVV.
This is the test:
#include <stdint-gcc.h>

#define TEST_LOOP(DATA_TYPE, INDEX_TYPE) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE##_##INDEX_TYPE (DATA_TYPE *restrict y, DATA_TYPE *restrict x, \
                                INDEX_TYPE *restrict index) \
  { \
    for (int i = 0; i < 100; ++i) \
      { \
        y[i * 2] = x[index[i * 2]] + 1; \
        y[i * 2 + 1] = x[index[i * 2 + 1]] + 2; \
      } \
  }

TEST_LOOP (int16_t, int8_t)

#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, INDEX_TYPE) \
  DATA_TYPE dest_##DATA_TYPE##_##INDEX_TYPE[202] = {0}; \
  DATA_TYPE src_##DATA_TYPE##_##INDEX_TYPE[202] = {0}; \
  INDEX_TYPE index_##DATA_TYPE##_##INDEX_TYPE[202] = {0}; \
  for (int i = 0; i < 202; i++) \
    { \
      src_##DATA_TYPE##_##INDEX_TYPE[i] \
        = (DATA_TYPE) ((i * 19 + 735) & (sizeof (DATA_TYPE) * 7 - 1)); \
      index_##DATA_TYPE##_##INDEX_TYPE[i] = (i * 7) % (55); \
    } \
  f_##DATA_TYPE##_##INDEX_TYPE (dest_##DATA_TYPE##_##INDEX_TYPE, \
                                src_##DATA_TYPE##_##INDEX_TYPE, \
                                index_##DATA_TYPE##_##INDEX_TYPE); \
  for (int i = 0; i < 100; i++) \
    { \
      assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2] \
              == (src_##DATA_TYPE##_##INDEX_TYPE \
                    [index_##DATA_TYPE##_##INDEX_TYPE[i * 2]] \
                  + 1)); \
      assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1] \
              == (src_##DATA_TYPE##_##INDEX_TYPE \
                    [index_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]] \
                  + 2)); \
    }

  RUN_LOOP (int16_t, int8_t)
  return 0;
}
Compile options: -march=armv8-a+sve -O3 -msve-vector-bits=256 -specs=rdimon.specs
QEMU: sve-default-vector-length=256
The configuration above passes.
However, I also tried -march=armv8-a+sve -O3 -msve-vector-bits=512 -fno-vect-cost-model -specs=rdimon.specs
QEMU: sve-default-vector-length=512
This configuration fails just as on RVV:
assertion "dest_int16_t_int8_t[i * 2] == (src_int16_t_int8_t [index_int16_t_int8_t[i * 2]] + 1)" failed: file "tmp.c", line 52, function: main
I experimented on ARM SVE with a 512-bit vector length because the dump IR
I checked on ARM SVE is similar to the RVV one:
https://godbolt.org/z/x74z7obYT
Hi, @Tamar. Could you double-check whether my analysis (that this bug happens
not only on RVV but also on ARM SVE) is correct?
* [Bug tree-optimization/111970] [14 regression] SLP for non-IFN gathers result in RISC-V test failure on gather since r14-4745-gbeab5b95c58145
From: rguenther at suse dot de @ 2023-11-20 8:22 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111970
--- Comment #15 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 20 Nov 2023, juzhe.zhong at rivai dot ai wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111970
>
> --- Comment #11 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
> Hi, Richard.
>
> I come back to revisit this bug.
>
> I found if I do this:
>
> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> index 4a09b3c2aca..2fd128672b9 100644
> --- a/gcc/tree-vect-slp.cc
> +++ b/gcc/tree-vect-slp.cc
> @@ -1434,7 +1434,6 @@ vect_build_slp_tree_1 (vec_info *vinfo, unsigned char
> *swap,
> && rhs_code != CFN_GATHER_LOAD
> && rhs_code != CFN_MASK_GATHER_LOAD
> && rhs_code != CFN_MASK_LEN_GATHER_LOAD
> - && !STMT_VINFO_GATHER_SCATTER_P (stmt_info)
> /* Not grouped loads are handled as externals for BB
> vectorization. For loop vectorization we can handle
> splats the same we handle single element interleaving. */
>
>
> The bug is fixed. But I am not sure whether it is the correct fix.
That will simply disable SLP recognition for the case in question.
* [Bug tree-optimization/111970] [14 regression] SLP for non-IFN gathers result in RISC-V test failure on gather since r14-4745-gbeab5b95c58145
From: rguenth at gcc dot gnu.org @ 2023-11-20 8:30 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111970
--- Comment #16 from Richard Biener <rguenth at gcc dot gnu.org> ---
The IL I see for f_int16_t_int8_t on aarch64 looks OK to me.
* [Bug tree-optimization/111970] [14 regression] SLP for non-IFN gathers result in RISC-V test failure on gather since r14-4745-gbeab5b95c58145
From: rguenth at gcc dot gnu.org @ 2023-11-20 14:20 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111970
--- Comment #17 from Richard Biener <rguenth at gcc dot gnu.org> ---
OK, so for RISC-V with the testcase from the description there's the
following issue:
_179 = &MEM <vector(64) unsigned char> [(uint8_t *)_618];
_225 = BIT_FIELD_REF <MEM <vector(64) unsigned char> [(uint8_t *)_179], 8, 16>;
...
vect__8.9_405 = {_218, _224, _230, _236, _242, _248, _254, _260, _266, _272,
_278, _284, _290, _296, _302, _308, _314, _320, _326, _332, _338, _344, _350,
_356, _362, _368, _374, _380, _386, _392, _398, _404};
vect__11.11_599 = vect__8.9_405 + { 1.0e+0, 2.0e+0, 1.0e+0, 2.0e+0, 1.0e+0,
2.0e+0, 1.0e+0, 2.0e+0, 1.0e+0, 2.0e+0, 1.0e+0, 2.0e+0, 1.0e+0, 2.0e+0, 1.0e+0,
2.0e+0, 1.0e+0, 2.0e+0, 1.0e+0, 2.0e+0, 1.0e+0, 2.0e+0, 1.0e+0, 2.0e+0, 1.0e+0,
2.0e+0, 1.0e+0, 2.0e+0, 1.0e+0, 2.0e+0, 1.0e+0, 2.0e+0 };
_609 = (void *) ivtmp.31_622;
MEM <vector(32) float> [(float *)_609] = vect__11.11_599;
_1 = _609 + 128;
MEM <vector(32) float> [(float *)_1] = vect__11.11_599;
I think the following fixes it:
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 96e4a6cffad..bf8c99779ae 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -9188,7 +9188,8 @@ vectorizable_store (vec_info *vinfo,
unsigned HOST_WIDE_INT factor
= const_offset_nunits / const_nunits;
vec_offset = vec_offsets[(vec_num * j + i) / factor];
- unsigned elt_offset = (j % factor) * const_nunits;
+ unsigned elt_offset
+ = ((vec_num * j + i) % factor) * const_nunits;
tree idx_type = TREE_TYPE (TREE_TYPE (vec_offset));
tree scale = size_int (gs_info.scale);
align = get_object_alignment (DR_REF (first_dr_info->dr));
@@ -11150,7 +11151,8 @@ vectorizable_load (vec_info *vinfo,
unsigned HOST_WIDE_INT factor
= const_offset_nunits / const_nunits;
vec_offset = vec_offsets[(vec_num * j + i) / factor];
- unsigned elt_offset = (j % factor) * const_nunits;
+ unsigned elt_offset
+ = ((vec_num * j + i) % factor) * const_nunits;
tree idx_type = TREE_TYPE (TREE_TYPE (vec_offset));
tree scale = size_int (gs_info.scale);
align = get_object_alignment (DR_REF (first_dr_info->dr));
* [Bug tree-optimization/111970] [14 regression] SLP for non-IFN gathers result in RISC-V test failure on gather since r14-4745-gbeab5b95c58145
From: rdapp at gcc dot gnu.org @ 2023-11-20 14:58 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111970
--- Comment #18 from Robin Dapp <rdapp at gcc dot gnu.org> ---
I did a quick testsuite run on rv32 and can confirm that this fixes the issue
for me.
* [Bug tree-optimization/111970] [14 regression] SLP for non-IFN gathers result in RISC-V test failure on gather since r14-4745-gbeab5b95c58145
From: tnfchris at gcc dot gnu.org @ 2023-11-20 22:24 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111970
Tamar Christina <tnfchris at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |tnfchris at gcc dot gnu.org
--- Comment #19 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to JuzheZhong from comment #14)
>
> Hi, @Tamar. Could you double-check whether my analysis (This bug not only
> happens on RVV, but also on ARM SVE) is correct or not ?
Hi, indeed it does:
test@sve-1:~/temp$ ./gcc/bin/gcc -march=armv8-a+sve -O3 -msve-vector-bits=256 sve.c -o sve.exe
test@sve-1:~/temp$ ./sve.exe
test@sve-1:~/temp$ ./gcc/bin/gcc -march=armv8-a+sve -O3 -msve-vector-bits=256 -fno-vect-cost-model sve.c -o sve-no-cost.exe
test@sve-1:~/temp$ ./sve-no-cost.exe
sve-no-cost.exe: sve.c:46: main: Assertion `dest_int16_t_int8_t[i * 2] == (src_int16_t_int8_t [index_int16_t_int8_t[i * 2]] + 1)' failed.
Aborted (core dumped)
I have noticed some other gather related failures but haven't had time to
triage them to file bugs. Hoping to get to that soon.
* [Bug tree-optimization/111970] [14 regression] SLP for non-IFN gathers result in RISC-V test failure on gather since r14-4745-gbeab5b95c58145
From: pinskia at gcc dot gnu.org @ 2023-11-20 22:30 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111970
--- Comment #20 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Tamar Christina from comment #19)
> I have noticed some other gather related failures but haven't had time to
> triage them to file bugs. Hoping to get to that soon.
I had noticed the following failures on aarch64 (SVE), which may be related:
FAIL: gcc.target/aarch64/sve/mask_struct_load_3_run.c execution test
FAIL: gcc.target/aarch64/sve/mask_struct_store_1_run.c execution test
FAIL: gcc.target/aarch64/sve/mask_struct_store_2_run.c execution test
FAIL: gcc.target/aarch64/sve/mask_struct_store_3_run.c execution test
FAIL: gcc.target/aarch64/sve/mask_struct_store_4.c scan-assembler-not \\\\tst2b\\\\t.z[0-9]
FAIL: gcc.target/aarch64/sve/mask_struct_store_4.c scan-assembler-not \\\\tst2d\\\\t.z[0-9]
FAIL: gcc.target/aarch64/sve/mask_struct_store_4.c scan-assembler-not \\\\tst2h\\\\t.z[0-9]
FAIL: gcc.target/aarch64/sve/mask_struct_store_4.c scan-assembler-not \\\\tst2w\\\\t.z[0-9]
This was done using qemu, and it was only recently that I ran a cross build
with qemu for the first time.
* [Bug tree-optimization/111970] [14 regression] SLP for non-IFN gathers result in RISC-V test failure on gather since r14-4745-gbeab5b95c58145
From: pan2.li at intel dot com @ 2023-11-20 23:56 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111970
--- Comment #21 from Li Pan <pan2.li at intel dot com> ---
(In reply to Robin Dapp from comment #18)
> I did a quick testsuite run on rv32 and can confirm that this fixes the
> issue for me.
Confirmed that this fixes the issue on RV64 too.
* [Bug tree-optimization/111970] [14 regression] SLP for non-IFN gathers result in RISC-V test failure on gather since r14-4745-gbeab5b95c58145
From: cvs-commit at gcc dot gnu.org @ 2023-11-21 7:20 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111970
--- Comment #22 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:
https://gcc.gnu.org/g:c656d268c9dac8b6f710b9bbd399214cb11b3287
commit r14-5635-gc656d268c9dac8b6f710b9bbd399214cb11b3287
Author: Richard Biener <rguenther@suse.de>
Date: Mon Nov 20 15:16:44 2023 +0100
tree-optimization/111970 - fix issue with SLP of emulated gather/scatter
There's a missed index adjustment for the SLP vector number when
computing the index/data vectors for emulated gather/scatter with SLP.
The following fixes this.
PR tree-optimization/111970
* tree-vect-stmts.cc (vectorizable_load): Fix offset calculation
for SLP gather load.
(vectorizable_store): Likewise for SLP scatter store.
* [Bug tree-optimization/111970] [14 regression] SLP for non-IFN gathers result in RISC-V test failure on gather since r14-4745-gbeab5b95c58145
From: rguenth at gcc dot gnu.org @ 2023-11-21 7:24 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111970
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
Status|ASSIGNED |RESOLVED
--- Comment #23 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed then.