public inbox for gcc-bugs@sourceware.org
* [Bug c/111848] New: RISC-V: RVV cost model pick unexpected big LMUL
@ 2023-10-17 11:03 juzhe.zhong at rivai dot ai
2023-10-17 11:07 ` [Bug c/111848] " juzhe.zhong at rivai dot ai
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: juzhe.zhong at rivai dot ai @ 2023-10-17 11:03 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111848
Bug ID: 111848
Summary: RISC-V: RVV cost model pick unexpected big LMUL
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: juzhe.zhong at rivai dot ai
Target Milestone: ---
#include <stdint.h>

void
f3 (uint8_t *restrict a, uint8_t *restrict b,
    uint8_t *restrict c, uint8_t *restrict d,
    int n)
{
  for (int i = 0; i < n; ++i)
    {
      a[i * 8] = c[i * 8] + d[i * 8];
      a[i * 8 + 1] = c[i * 8] + d[i * 8 + 1];
      a[i * 8 + 2] = c[i * 8 + 2] + d[i * 8 + 2];
      a[i * 8 + 3] = c[i * 8 + 2] + d[i * 8 + 3];
      a[i * 8 + 4] = c[i * 8 + 4] + d[i * 8 + 4];
      a[i * 8 + 5] = c[i * 8 + 4] + d[i * 8 + 5];
      a[i * 8 + 6] = c[i * 8 + 6] + d[i * 8 + 6];
      a[i * 8 + 7] = c[i * 8 + 6] + d[i * 8 + 7];
      b[i * 8] = c[i * 8 + 1] + d[i * 8];
      b[i * 8 + 1] = c[i * 8 + 1] + d[i * 8 + 1];
      b[i * 8 + 2] = c[i * 8 + 3] + d[i * 8 + 2];
      b[i * 8 + 3] = c[i * 8 + 3] + d[i * 8 + 3];
      b[i * 8 + 4] = c[i * 8 + 5] + d[i * 8 + 4];
      b[i * 8 + 5] = c[i * 8 + 5] + d[i * 8 + 5];
      b[i * 8 + 6] = c[i * 8 + 7] + d[i * 8 + 6];
      b[i * 8 + 7] = c[i * 8 + 7] + d[i * 8 + 7];
    }
}
This case picks LMUL = 8, which causes horrible vector register spilling.
Experiments show that the ideal LMUL is 2.
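For context, a rough sketch of why a larger LMUL raises register pressure. RVV has 32 vector registers, and a value at LMUL = m occupies a group of m of them, so at most 32 / m such values can be live at once. For byte data gathered with vrgatherei16.vv, the 16-bit index operand uses EMUL = (16 / SEW) * LMUL, i.e. twice the data LMUL. The helper names below are made up for illustration and are not GCC internals.

```c
/* RVV has 32 architectural vector registers, v0 through v31.  */
enum { NUM_VREGS = 32 };

/* Hypothetical helper: how many register groups exist at a given LMUL
   (1, 2, 4 or 8).  Each live vector value occupies one whole group.  */
static int
groups_available (int lmul)
{
  return NUM_VREGS / lmul;
}

/* Hypothetical helper: EMUL of the index operand of vrgatherei16.vv,
   i.e. (16 / SEW) * LMUL.  Returns 0 when the result is not an integer
   group size of at most 8 (fractional and reserved EMUL values are not
   modelled in this sketch).  */
static int
ei16_index_emul (int data_sew_bits, int data_lmul)
{
  int emul = 16 * data_lmul / data_sew_bits;
  return emul <= 8 ? emul : 0;
}
```

With 8-bit data, LMUL = 4 puts every index vector in an 8-register group, of which only four exist; LMUL = 2 puts the indices in 4-register groups, of which there are eight.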
* [Bug c/111848] RISC-V: RVV cost model pick unexpected big LMUL
From: juzhe.zhong at rivai dot ai @ 2023-10-17 11:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111848
--- Comment #1 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
Sorry, it picks LMUL = 4:
f3:
ble a4,zero,.L11
csrr t0,vlenb
slli t1,t0,4
csrr a6,vlenb
sub sp,sp,t1
csrr a5,vlenb
slli a6,a6,3
slli a5,a5,2
add a6,a6,sp
vsetvli a7,zero,e16,m8,ta,ma
slli a4,a4,3
vid.v v8
addi t6,a5,-1
vand.vi v8,v8,-2
neg t5,a5
vs8r.v v8,0(sp)
vadd.vi v8,v8,1
vs8r.v v8,0(a6)
j .L7
.L14:
vsetvli a7,zero,e16,m8,ta,ma
.L7:
csrr t0,vlenb
slli t0,t0,3
vl8re16.v v16,0(sp)
add t0,t0,sp
vmv.v.x v8,t6
mv t1,a4
vand.vv v24,v16,v8
mv a6,a4
vl8re16.v v16,0(t0)
vand.vv v8,v16,v8
bleu a4,a5,.L6
mv a6,a5
.L6:
vsetvli zero,a6,e8,m4,ta,ma
vle8.v v20,0(a2)
vle8.v v16,0(a3)
vsetvli a7,zero,e8,m4,ta,ma
vrgatherei16.vv v4,v20,v24
vadd.vv v4,v16,v4
vsetvli zero,a6,e8,m4,ta,ma
vse8.v v4,0(a0)
vle8.v v20,0(a2)
vsetvli a7,zero,e8,m4,ta,ma
vrgatherei16.vv v4,v20,v8
vadd.vv v4,v4,v16
vsetvli zero,a6,e8,m4,ta,ma
vse8.v v4,0(a1)
Ideally LMUL should be 2:
f3:
ble a4,zero,.L9
csrr a5,vlenb
slli a5,a5,1
vsetvli a7,zero,e16,m4,ta,ma
slli a4,a4,3
vid.v v12
addi t6,a5,-1
vand.vi v12,v12,-2
neg t5,a5
vadd.vi v16,v12,1
j .L7
.L10:
vsetvli a7,zero,e16,m4,ta,ma
.L7:
vmv.v.x v4,t6
mv t1,a4
vand.vv v20,v12,v4
mv a6,a4
vand.vv v4,v16,v4
bleu a4,a5,.L6
mv a6,a5
.L6:
vsetvli zero,a6,e8,m2,ta,ma
vle8.v v10,0(a2)
vle8.v v8,0(a3)
vsetvli a7,zero,e8,m2,ta,ma
vrgatherei16.vv v2,v10,v20
vadd.vv v2,v8,v2
vsetvli zero,a6,e8,m2,ta,ma
vse8.v v2,0(a0)
vle8.v v10,0(a2)
vsetvli a7,zero,e8,m2,ta,ma
vrgatherei16.vv v2,v10,v4
vadd.vv v2,v2,v8
vsetvli zero,a6,e8,m2,ta,ma
vse8.v v2,0(a1)
add a4,a4,t5
add a0,a0,a5
add a3,a3,a5
add a1,a1,a5
add a2,a2,a5
bgtu t1,a5,.L10
.L9:
ret
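As a reading aid, the index computation in both listings (vid.v, then vand.vi with -2, then vadd.vi with 1, feeding vrgatherei16.vv) appears to correspond to the scalar form below: each output element j reads c at the even index j & ~1 for a and at the following odd index for b. This is a sketch of the access pattern only, with a made-up function name; it is not the vectorizer's actual output.

```c
#include <stddef.h>
#include <stdint.h>

/* Flat scalar equivalent of f3: j runs over all 8 * n byte elements.  */
void
f3_flat (uint8_t *restrict a, uint8_t *restrict b,
         const uint8_t *restrict c, const uint8_t *restrict d,
         int n)
{
  if (n <= 0)
    return;

  for (size_t j = 0; j < (size_t) n * 8; ++j)
    {
      size_t even = j & ~(size_t) 1;  /* vid.v + vand.vi ..., -2 */
      size_t odd = even + 1;          /* vadd.vi ..., 1 */

      a[j] = (uint8_t) (c[even] + d[j]);  /* first gather + vadd.vv */
      b[j] = (uint8_t) (c[odd] + d[j]);   /* second gather + vadd.vv */
    }
}
```

The two index vectors depend only on the element position, not on the data, which is why they are natural candidates for being computed once outside the loop.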
* [Bug target/111848] RISC-V: RVV cost model pick unexpected big LMUL
From: cvs-commit at gcc dot gnu.org @ 2023-10-20 3:57 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111848
--- Comment #2 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The trunk branch has been updated by Lehua Ding <lhtin@gcc.gnu.org>:
https://gcc.gnu.org/g:f0e28d8c13713f509fde26fbe7dd13280b67fb87
commit r14-4774-gf0e28d8c13713f509fde26fbe7dd13280b67fb87
Author: Juzhe-Zhong <juzhe.zhong@rivai.ai>
Date: Wed Oct 18 18:25:33 2023 +0800
RISC-V: Fix failed hoist in LICM of vmv.v.x instruction
Confirmed that the dynamic LMUL algorithm works as intended and chooses LMUL = 4 for the PR:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111848
However, the generated code has horrible register spilling.
The root cause is that we didn't hoist the vmv.v.x out of the loop, which
increases the register pressure of the SLP loop.
So, change the CONST_VECTOR move into a vec_duplicate splitter so that we
gain better optimizations:
1. Better LICM.
2. More opportunities for transforming 'vv' into 'vx' in the future.
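A back-of-the-envelope count of the register pressure, read off the before/after listings in this message. It is only a hand tally of which values appear to be live around the gathers, assuming each e16,m8 value takes 8 registers and each e8,m4 value takes 4; it is not how GCC's register allocator or cost model computes pressure, and the struct and function names are made up for illustration.

```c
#include <stddef.h>

enum { NUM_VREGS = 32 };

/* One vector value that wants to stay in registers, with its LMUL.  */
struct live_value
{
  const char *what;
  int lmul;
};

/* Before: the unmasked index vectors are loop-invariant, but the masked
   copies are recomputed in the loop because vmv.v.x was not hoisted.  */
static const struct live_value before_patch[] = {
  { "even index (loop-invariant)", 8 },
  { "odd index (loop-invariant)", 8 },
  { "masked even index", 8 },
  { "masked odd index", 8 },
  { "loaded c chunk", 4 },
  { "loaded d chunk", 4 },
  { "sum", 4 },
};

/* After: only the masked index vectors survive into the loop.  */
static const struct live_value after_patch[] = {
  { "masked even index (hoisted)", 8 },
  { "masked odd index (hoisted)", 8 },
  { "loaded c chunk", 4 },
  { "loaded d chunk", 4 },
  { "sum", 4 },
};

#define COUNT(arr) (sizeof (arr) / sizeof ((arr)[0]))

/* Total number of vector registers the listed values occupy.  */
static int
regs_needed (const struct live_value *v, size_t count)
{
  int total = 0;
  for (size_t i = 0; i < count; ++i)
    total += v[i].lmul;
  return total;
}
```

Under these assumptions the tally is 44 registers before the patch, which exceeds NUM_VREGS and matches the vs8r.v/vl8re16.v spill traffic, and 28 after, which fits.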
Before this patch:
f3:
ble a4,zero,.L8
csrr t0,vlenb
slli t1,t0,4
csrr a6,vlenb
sub sp,sp,t1
csrr a5,vlenb
slli a6,a6,3
slli a5,a5,2
add a6,a6,sp
vsetvli a7,zero,e16,m8,ta,ma
slli a4,a4,3
vid.v v8
addi t6,a5,-1
vand.vi v8,v8,-2
neg t5,a5
vs8r.v v8,0(sp)
vadd.vi v8,v8,1
vs8r.v v8,0(a6)
j .L4
.L12:
vsetvli a7,zero,e16,m8,ta,ma
.L4:
csrr t0,vlenb
slli t0,t0,3
vl8re16.v v16,0(sp)
add t0,t0,sp
vmv.v.x v8,t6
mv t1,a4
vand.vv v24,v16,v8
mv a6,a4
vl8re16.v v16,0(t0)
vand.vv v8,v16,v8
bleu a4,a5,.L3
mv a6,a5
.L3:
vsetvli zero,a6,e8,m4,ta,ma
vle8.v v20,0(a2)
vle8.v v16,0(a3)
vsetvli a7,zero,e8,m4,ta,ma
vrgatherei16.vv v4,v20,v24
vadd.vv v4,v16,v4
vsetvli zero,a6,e8,m4,ta,ma
vse8.v v4,0(a0)
vle8.v v20,0(a2)
vsetvli a7,zero,e8,m4,ta,ma
vrgatherei16.vv v4,v20,v8
vadd.vv v4,v4,v16
vsetvli zero,a6,e8,m4,ta,ma
vse8.v v4,0(a1)
add a4,a4,t5
add a0,a0,a5
add a3,a3,a5
add a1,a1,a5
add a2,a2,a5
bgtu t1,a5,.L12
csrr t0,vlenb
slli t1,t0,4
add sp,sp,t1
jr ra
.L8:
ret
After this patch:
f3:
ble a4,zero,.L6
csrr a6,vlenb
csrr a5,vlenb
slli a6,a6,2
slli a5,a5,2
addi a6,a6,-1
slli a4,a4,3
neg t5,a5
vsetvli t1,zero,e16,m8,ta,ma
vmv.v.x v24,a6
vid.v v8
vand.vi v8,v8,-2
vadd.vi v16,v8,1
vand.vv v8,v8,v24
vand.vv v16,v16,v24
.L4:
mv t1,a4
mv a6,a4
bleu a4,a5,.L3
mv a6,a5
.L3:
vsetvli zero,a6,e8,m4,ta,ma
vle8.v v28,0(a2)
vle8.v v24,0(a3)
vsetvli a7,zero,e8,m4,ta,ma
vrgatherei16.vv v4,v28,v8
vadd.vv v4,v24,v4
vsetvli zero,a6,e8,m4,ta,ma
vse8.v v4,0(a0)
vle8.v v28,0(a2)
vsetvli a7,zero,e8,m4,ta,ma
vrgatherei16.vv v4,v28,v16
vadd.vv v4,v4,v24
vsetvli zero,a6,e8,m4,ta,ma
vse8.v v4,0(a1)
add a4,a4,t5
add a0,a0,a5
add a3,a3,a5
add a1,a1,a5
add a2,a2,a5
bgtu t1,a5,.L4
.L6:
ret
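For readers less used to RVV loops, the control flow of the listing above can be modelled in scalar C roughly as follows. This is a sketch under assumptions: vlenb is passed in as a parameter rather than read with csrr, an offset replaces the four pointer increments, and the function name is invented. Everything before .L4 is loop-invariant setup; the loop itself only clamps the vector length, loads, gathers, adds and stores.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of the strip-mined loop; assumes vlenb > 0.  */
void
f3_stripmined (uint8_t *restrict a, uint8_t *restrict b,
               const uint8_t *restrict c, const uint8_t *restrict d,
               int n, size_t vlenb)
{
  if (n <= 0)
    return;                               /* ble a4,zero,.L6 */

  size_t step = vlenb * 4;                /* a5: elements per e8,m4 chunk */
  size_t remaining = (size_t) n * 8;      /* a4: elements still to process */
  size_t off = 0;                         /* stands in for the pointer bumps */
  size_t before;

  do
    {
      before = remaining;                               /* mv t1,a4 */
      size_t vl = remaining < step ? remaining : step;  /* bleu / mv a6,a5 */

      for (size_t k = 0; k < vl; ++k)
        {
          size_t even = k & ~(size_t) 1;  /* chunk-relative gather index */
          a[off + k] = (uint8_t) (c[off + even] + d[off + k]);
          b[off + k] = (uint8_t) (c[off + even + 1] + d[off + k]);
        }

      off += step;
      /* Wraps on the final pass, like add a4,a4,t5; the wrapped value
         is never used because the exit test looks at 'before'.  */
      remaining -= step;
    }
  while (before > step);                  /* bgtu t1,a5,.L4 */
}
```

Because the total element count is a multiple of 8 and the step is a multiple of 4, every chunk has an even length, so the odd gather index never reads past the chunk.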
Note that this patch triggers multiple FAILs:
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-3.c execution test
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-3.c execution test
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-4.c execution test
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-4.c execution test
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-8.c execution test
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-8.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c execution test
They all fail because of bugs in the VSETVL pass:
10dd4: 0c707057 vsetvli zero,zero,e8,mf2,ta,ma
10dd8: 5e06b8d7 vmv.v.i v17,13
10ddc: 9ed030d7 vmv1r.v v1,v13
10de0: b21040d7 vncvt.x.x.w v1,v1
----> raises an illegal instruction exception since there is no SEW = 8 -> SEW = 4 narrowing.
10de4: 5e0785d7 vmv.v.v v11,v15
Confirmed that the recent VSETVL refactor patch
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633231.html fixes all of
them.
So this patch should be committed after the VSETVL refactor patch.
PR target/111848
gcc/ChangeLog:
* config/riscv/riscv-selftests.cc (run_const_vector_selftests):
Adapt selftest.
* config/riscv/riscv-v.cc (expand_const_vector): Change it into
vec_duplicate splitter.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-7.c: Adapt test.
* gcc.dg/vect/costmodel/riscv/rvv/pr111848.c: New test.
* [Bug target/111848] RISC-V: RVV cost model pick unexpected big LMUL
From: juzhe.zhong at rivai dot ai @ 2023-10-20 6:39 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111848
JuzheZhong <juzhe.zhong at rivai dot ai> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
Status|UNCONFIRMED |RESOLVED
--- Comment #3 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
Fixed