public inbox for gcc-bugs@sourceware.org
* [Bug c/112776] New: RISC-V Regression: Missed optimization of VSETVL PASS
@ 2023-11-30 10:08 juzhe.zhong at rivai dot ai
2023-12-01 6:50 ` [Bug target/112776] " cvs-commit at gcc dot gnu.org
2023-12-01 6:52 ` juzhe.zhong at rivai dot ai
0 siblings, 2 replies; 3+ messages in thread
From: juzhe.zhong at rivai dot ai @ 2023-11-30 10:08 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112776
Bug ID: 112776
Summary: RISC-V Regression: Missed optimization of VSETVL PASS
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: juzhe.zhong at rivai dot ai
Target Milestone: ---
#include "riscv_vector.h"

void
foo_vec(float *r, const float *x)
{
    int i, k;
    vfloat32m4_t x_vec;
    vfloat32m4_t x_forward_vec;
    vfloat32m4_t temp_vec;
    /**
     * I have to use m1 to complicate the intrinsics
     */
    vfloat32m1_t dst_vec;
    vfloat32m1_t src_vec;
    float result = 0.0f;
    float shift_prev = 0.0f;
    size_t n = 64;
    for (size_t vl; n > 0; n -= vl) {
        vl = __riscv_vsetvl_e32m4(n); // LMUL=4
        x_vec = __riscv_vle32_v_f32m4(&x[0], vl);
        x_forward_vec = __riscv_vle32_v_f32m4(&x[0], vl);
        temp_vec = __riscv_vfmul_vv_f32m4(x_vec, x_forward_vec, vl);
        /**
         * I have to use m1 to complicate the intrinsics
         */
        // vfloat32m1_t __riscv_vfmv_s_tu(vfloat32m1_t vd, float rs1, size_t vl);
        src_vec = __riscv_vfmv_s_tu(src_vec, 0.0f, vl); // initialize src_vec
        // dst_vec = __riscv_vfmv_s_f_f32m1_tu(dst_vec, 0.0f, vl); // clean for vfredosum
        dst_vec = __riscv_vfmv_s_tu(dst_vec, 0.0f, vl); // clean for vfredosum
        dst_vec = __riscv_vfredosum_tu(dst_vec, temp_vec, src_vec, vl);
        r[0] = __riscv_vfmv_f_s_f32m1_f32(dst_vec);
    }
}
ASM:
GCC-14:
foo_vec:
li a4,64
.L2:
vsetvli a5,a4,e8,m1,ta,ma --->
vsetvli zero,a5,e32,m1,tu,ma
vmv.s.x v2,zero
vmv.s.x v1,zero
vsetvli zero,a5,e32,m4,tu,ma
vle32.v v4,0(a1)
vfmul.vv v4,v4,v4
vfredosum.vs v1,v4,v2
vfmv.f.s fa5,v1
fsw fa5,0(a0)
sub a4,a4,a5
bne a4,zero,.L2
ret
GCC-13:
foo_vec(float*, float const*):
fmv.s.x fa5,zero
li a4,64
.L2:
vsetvli a5,a4,e32,m4,ta,ma
vle32.v v28,0(a1)
vfmv.s.f v25,fa5
vfmul.vv v28,v28,v28
vfmv.s.f v24,fa5
sub a4,a4,a5
vfredosum.vs v24,v28,v25
vfmv.f.s fa4,v24
fsw fa4,0(a0)
bne a4,zero,.L2
ret
* [Bug target/112776] RISC-V Regression: Missed optimization of VSETVL PASS
From: cvs-commit at gcc dot gnu.org @ 2023-12-01 6:50 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112776
--- Comment #1 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The trunk branch has been updated by Lehua Ding <lhtin@gcc.gnu.org>:
https://gcc.gnu.org/g:923a67f17badcbe6e2b2e5d3570a265443258c8e
commit r14-6027-g923a67f17badcbe6e2b2e5d3570a265443258c8e
Author: Juzhe-Zhong <juzhe.zhong@rivai.ai>
Date: Fri Dec 1 08:39:57 2023 +0800
RISC-V: Fix VSETVL PASS regression
This patch fixes 2 regressions (one is a bug regression, the other is a
performance regression).
Both regressions have the same cause: we compare the ratio for the same
AVL in the wrong place.
1. BUG regression:
avl_single-84.c:
f0:
li a5,999424
add a1,a1,a5
li a4,299008
add a5,a0,a5
addi a3,a4,992
addi a5,a5,576
addi a1,a1,576
vsetvli a4,zero,e8,m2,ta,ma
add a0,a0,a3
vlm.v v1,0(a5)
vsm.v v1,0(a1)
vl1re64.v v1,0(a0)
beq a2,zero,.L10
li a5,0
vsetvli zero,zero,e64,m1,tu,ma ---> This is totally incorrect, since the ratio above is 4 whereas ratio = 64 is demanded here.
.L3:
fcvt.d.lu fa5,a5
addi a5,a5,1
fadd.d fa5,fa5,fa0
vfmv.s.f v1,fa5
bne a5,a2,.L3
vfmv.f.s fa0,v1
ret
.L10:
vsetvli zero,zero,e64,m1,ta,ma
vfmv.f.s fa0,v1
ret
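The ratio referred to above is SEW/LMUL. The `vsetvli zero,zero` form keeps the current VL, so it is only usable when the new vtype has the same SEW/LMUL ratio (and hence the same VLMAX) as the previous one. A minimal sketch of that check in plain C follows; the struct and function names are hypothetical and are not taken from riscv-vsetvl.cc.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper type, not GCC's: a vtype setting reduced to SEW and LMUL.
   LMUL is stored in eighths so that fractional LMUL (mf2 = 4, mf4 = 2, mf8 = 1)
   fits in an integer. */
struct vtype_cfg {
    int sew;     /* element width in bits: 8, 16, 32 or 64 */
    int lmul_x8; /* LMUL * 8: m1 = 8, m2 = 16, m4 = 32, m8 = 64 */
};

/* SEW/LMUL ratio; exact for every legal SEW/LMUL combination. */
static int sew_lmul_ratio(struct vtype_cfg c)
{
    assert(c.sew > 0 && c.lmul_x8 > 0);
    return c.sew * 8 / c.lmul_x8;
}

/* True when "vsetvli zero,zero,<next>" may follow <prev>, i.e. the ratio
   (and therefore VLMAX) does not change. */
static bool can_keep_vl(struct vtype_cfg prev, struct vtype_cfg next)
{
    return sew_lmul_ratio(prev) == sew_lmul_ratio(next);
}
```

With these definitions, e8,m2 gives ratio 4 and e64,m1 gives ratio 64, so `can_keep_vl` rejects the pair seen in avl_single-84.c.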
2. Performance regression:
Before this patch:
vsetvli a5,a4,e8,m1,ta,ma
vsetvli zero,a5,e32,m1,tu,ma
vmv.s.x v2,zero
vmv.s.x v1,zero
vsetvli zero,a5,e32,m4,tu,ma
vle32.v v4,0(a1)
vfmul.vv v4,v4,v4
vfredosum.vs v1,v4,v2
vfmv.f.s fa5,v1
fsw fa5,0(a0)
sub a4,a4,a5
bne a4,zero,.L2
ret
After this patch:
vsetvli a5,a4,e32,m4,tu,ma
vle32.v v4,0(a1)
vmv.s.x v2,zero
vmv.s.x v1,zero
vfmul.vv v4,v4,v4
vfredosum.vs v1,v4,v2
vfmv.f.s fa5,v1
fsw fa5,0(a0)
sub a4,a4,a5
bne a4,zero,.L2
ret
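Why the leading e8,m1 vsetvli can be folded into the e32,m4 one: both settings have SEW/LMUL = 8, so they yield the same VLMAX and therefore the same VL for a given AVL. The sketch below models this in plain C. The function names are hypothetical, the VLEN value in the usage note is an assumption, and the model takes vl = min(AVL, VLMAX), which ignores the freedom the specification gives implementations when AVL lies between VLMAX and 2*VLMAX.

```c
#include <assert.h>
#include <stddef.h>

/* VLMAX = VLEN / SEW * LMUL, with LMUL given in eighths (m1 = 8, m4 = 32). */
static size_t vlmax(size_t vlen_bits, size_t sew, size_t lmul_x8)
{
    assert(sew > 0 && lmul_x8 > 0);
    return vlen_bits * lmul_x8 / (sew * 8);
}

/* Simplified model of the VL that vsetvli returns: min(avl, VLMAX).
   This is an assumption of the sketch, not the full rule from the spec. */
static size_t model_vsetvl(size_t avl, size_t vlen_bits, size_t sew,
                           size_t lmul_x8)
{
    size_t m = vlmax(vlen_bits, sew, lmul_x8);
    return avl < m ? avl : m;
}
```

Assuming VLEN = 128 and AVL = 64 as in the test case, `model_vsetvl(64, 128, 8, 8)` and `model_vsetvl(64, 128, 32, 32)` both evaluate to 16, which is why a single `vsetvli a5,a4,e32,m4,tu,ma` is enough.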
Tested on rv64gcv_zvfh_zfh with no regressions.
Testing of zvl256b/zvl512b/zvl1024b/zve64d is running.
PR target/112776
gcc/ChangeLog:
* config/riscv/riscv-vsetvl.cc
(pre_vsetvl::pre_global_vsetvl_info): Fix ratio.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/vsetvl/avl_single-84.c: Adapt test.
* gcc.target/riscv/rvv/vsetvl/pr111037-3.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/pr112776.c: New test.
* [Bug target/112776] RISC-V Regression: Missed optimization of VSETVL PASS
From: juzhe.zhong at rivai dot ai @ 2023-12-01 6:52 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112776
JuzheZhong <juzhe.zhong at rivai dot ai> changed:
What       |Removed     |Added
----------------------------------------------------------------------------
Status     |UNCONFIRMED |RESOLVED
Resolution |---         |FIXED
--- Comment #2 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
Fixed.