public inbox for gcc-bugs@sourceware.org
* [Bug c/112776] New: RISC-V Regression: Missed optimization of VSETVL PASS
@ 2023-11-30 10:08 juzhe.zhong at rivai dot ai
  2023-12-01  6:50 ` [Bug target/112776] " cvs-commit at gcc dot gnu.org
  2023-12-01  6:52 ` juzhe.zhong at rivai dot ai
  0 siblings, 2 replies; 3+ messages in thread
From: juzhe.zhong at rivai dot ai @ 2023-11-30 10:08 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112776

            Bug ID: 112776
           Summary: RISC-V Regression: Missed optimization of VSETVL PASS
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: juzhe.zhong at rivai dot ai
  Target Milestone: ---

#include "riscv_vector.h"

void foo_vec(float *r, const float *x)
{
    int i, k;

    vfloat32m4_t x_vec;
    vfloat32m4_t x_forward_vec;
    vfloat32m4_t temp_vec;
    /**
     * I have to use m1 to complicat intrisic
     */
    vfloat32m1_t dst_vec;
    vfloat32m1_t src_vec;

    float result = 0.0f;
    float shift_prev = 0.0f;

    size_t n = 64;
    for(size_t vl; n>0; n -=vl){
        vl = __riscv_vsetvl_e32m4(n); //LMUL=4
        x_vec = __riscv_vle32_v_f32m4(&x[0], vl);
        x_forward_vec = __riscv_vle32_v_f32m4(&x[0], vl);
        temp_vec = __riscv_vfmul_vv_f32m4(x_vec, x_forward_vec, vl);
        /**
         * I have to use m1 to complicat intrisic
         */
        //vfloat32m1_t __riscv_vfmv_s_tu(vfloat32m1_t vd, float rs1, size_t vl);
        src_vec = __riscv_vfmv_s_tu(src_vec, 0.0f, vl); //initial src_vec
        //dst_vec = __riscv_vfmv_s_f_f32m1_tu(dst_vec, 0.0f, vl); //clean for vfredosum
        dst_vec = __riscv_vfmv_s_tu(dst_vec, 0.0f, vl); //clean for vfredosum
        dst_vec = __riscv_vfredosum_tu(dst_vec, temp_vec, src_vec, vl);
        r[0] = __riscv_vfmv_f_s_f32m1_f32(dst_vec);
    }
}

ASM:

GCC-14

foo_vec:
        li      a4,64
.L2:
        vsetvli a5,a4,e8,m1,ta,ma       --->
        vsetvli zero,a5,e32,m1,tu,ma
        vmv.s.x v2,zero
        vmv.s.x v1,zero
        vsetvli zero,a5,e32,m4,tu,ma
        vle32.v v4,0(a1)
        vfmul.vv        v4,v4,v4
        vfredosum.vs    v1,v4,v2
        vfmv.f.s        fa5,v1
        fsw     fa5,0(a0)
        sub     a4,a4,a5
        bne     a4,zero,.L2
        ret

GCC-13:

foo_vec(float*, float const*):
        fmv.s.x fa5,zero
        li      a4,64
.L2:
        vsetvli a5,a4,e32,m4,ta,ma
        vle32.v v28,0(a1)
        vfmv.s.f        v25,fa5
        vfmul.vv        v28,v28,v28
        vfmv.s.f        v24,fa5
        sub     a4,a4,a5
        vfredosum.vs    v24,v28,v25
        vfmv.f.s        fa4,v24
        fsw     fa4,0(a0)
        bne     a4,zero,.L2
        ret
* [Bug target/112776] RISC-V Regression: Missed optimization of VSETVL PASS
From: cvs-commit at gcc dot gnu.org @ 2023-12-01  6:50 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112776

--- Comment #1 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The trunk branch has been updated by Lehua Ding <lhtin@gcc.gnu.org>:

https://gcc.gnu.org/g:923a67f17badcbe6e2b2e5d3570a265443258c8e

commit r14-6027-g923a67f17badcbe6e2b2e5d3570a265443258c8e
Author: Juzhe-Zhong <juzhe.zhong@rivai.ai>
Date:   Fri Dec 1 08:39:57 2023 +0800

    RISC-V: Fix VSETVL PASS regression

    This patch fixes 2 regressions (one is a bug regression, the other is a
    performance regression). Both regressions have the same cause: we are
    comparing the ratio for the same AVL in the wrong place.

    1. BUG regression:
    avl_single-84.c:

    f0:
            li      a5,999424
            add     a1,a1,a5
            li      a4,299008
            add     a5,a0,a5
            addi    a3,a4,992
            addi    a5,a5,576
            addi    a1,a1,576
            vsetvli a4,zero,e8,m2,ta,ma
            add     a0,a0,a3
            vlm.v   v1,0(a5)
            vsm.v   v1,0(a1)
            vl1re64.v       v1,0(a0)
            beq     a2,zero,.L10
            li      a5,0
            vsetvli zero,zero,e64,m1,tu,ma   ---> This is totally incorrect since the ratio above is 4, whereas it is demanding ratio = 64 here.
    .L3:
            fcvt.d.lu       fa5,a5
            addi    a5,a5,1
            fadd.d  fa5,fa5,fa0
            vfmv.s.f        v1,fa5
            bne     a5,a2,.L3
            vfmv.f.s        fa0,v1
            ret
    .L10:
            vsetvli zero,zero,e64,m1,ta,ma
            vfmv.f.s        fa0,v1
            ret

    2. Performance regression:

    before this patch:

            vsetvli a5,a4,e8,m1,ta,ma
            vsetvli zero,a5,e32,m1,tu,ma
            vmv.s.x v2,zero
            vmv.s.x v1,zero
            vsetvli zero,a5,e32,m4,tu,ma
            vle32.v v4,0(a1)
            vfmul.vv        v4,v4,v4
            vfredosum.vs    v1,v4,v2
            vfmv.f.s        fa5,v1
            fsw     fa5,0(a0)
            sub     a4,a4,a5
            bne     a4,zero,.L2
            ret

    After this patch:

            vsetvli a5,a4,e32,m4,tu,ma
            vle32.v v4,0(a1)
            vmv.s.x v2,zero
            vmv.s.x v1,zero
            vfmul.vv        v4,v4,v4
            vfredosum.vs    v1,v4,v2
            vfmv.f.s        fa5,v1
            fsw     fa5,0(a0)
            sub     a4,a4,a5
            bne     a4,zero,.L2
            ret

    Tested on rv64gcv_zvfh_zfh with no regressions.
    Testing of zvl256b/zvl512b/zvl1024b/zve64d is running.

            PR target/112776

    gcc/ChangeLog:

            * config/riscv/riscv-vsetvl.cc (pre_vsetvl::pre_global_vsetvl_info): Fix ratio.

    gcc/testsuite/ChangeLog:

            * gcc.target/riscv/rvv/vsetvl/avl_single-84.c: Adapt test.
            * gcc.target/riscv/rvv/vsetvl/pr111037-3.c: Ditto.
            * gcc.target/riscv/rvv/vsetvl/pr112776.c: New test.
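The "ratio" in the commit message can be read as the SEW/LMUL ratio of a vtype setting. The RISC-V vector specification only permits the `vsetvli zero,zero,...` form when the new vtype keeps that ratio (and hence VLMAX) unchanged, which is why switching from e8,m2 to e64,m1 that way is incorrect. The following is a minimal sketch of that arithmetic, assuming integer LMUL only; the helper names are hypothetical and are not taken from riscv-vsetvl.cc.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of GCC: SEW/LMUL ratio of a vtype.
   Assumes integer LMUL (m1, m2, m4, m8); fractional LMUL is not modelled. */
unsigned sew_lmul_ratio(unsigned sew, unsigned lmul)
{
    return sew / lmul;
}

/* "vsetvli zero,zero,..." keeps vl as it is, so it is only usable when the
   new vtype has the same SEW/LMUL ratio (same VLMAX) as the previous one. */
bool same_ratio(unsigned old_sew, unsigned old_lmul,
                unsigned new_sew, unsigned new_lmul)
{
    return sew_lmul_ratio(old_sew, old_lmul)
           == sew_lmul_ratio(new_sew, new_lmul);
}

/* Self-check using the vtype settings quoted in the thread. */
void check_thread_examples(void)
{
    /* avl_single-84.c: e8,m2 followed by vsetvli zero,zero,e64,m1. */
    assert(sew_lmul_ratio(8, 2) == 4);
    assert(sew_lmul_ratio(64, 1) == 64);
    assert(!same_ratio(8, 2, 64, 1));

    /* pr112776: e8,m1 and e32,m4 share a ratio, e32,m1 does not. */
    assert(same_ratio(8, 1, 32, 4));
    assert(!same_ratio(32, 1, 32, 4));
}
```

Calling `check_thread_examples()` reproduces the numbers from the commit message: ratio 4 for e8,m2 against ratio 64 for e64,m1. It also shows that the e8,m1 and e32,m4 settings in the "before this patch" sequence share ratio 8, while the intermediate e32,m1 setting has ratio 32.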
* [Bug target/112776] RISC-V Regression: Missed optimization of VSETVL PASS
From: juzhe.zhong at rivai dot ai @ 2023-12-01  6:52 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112776

JuzheZhong <juzhe.zhong at rivai dot ai> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #2 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
Fixed.