public inbox for gcc-bugs@sourceware.org
* [Bug c/112776] New: RISC-V Regression: Missed optimization of VSETVL PASS
@ 2023-11-30 10:08 juzhe.zhong at rivai dot ai
  2023-12-01  6:50 ` [Bug target/112776] " cvs-commit at gcc dot gnu.org
  2023-12-01  6:52 ` juzhe.zhong at rivai dot ai
  0 siblings, 2 replies; 3+ messages in thread
From: juzhe.zhong at rivai dot ai @ 2023-11-30 10:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112776

            Bug ID: 112776
           Summary: RISC-V Regression: Missed optimization of VSETVL PASS
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: juzhe.zhong at rivai dot ai
  Target Milestone: ---

#include "riscv_vector.h"

void
foo_vec(float *r, const float *x)
{
int i, k;

vfloat32m4_t x_vec;
vfloat32m4_t x_forward_vec;
vfloat32m4_t temp_vec;

/**
 * I have to use m1 to complicate the intrinsics
 */
vfloat32m1_t dst_vec;
vfloat32m1_t src_vec;

float result = 0.0f;
float shift_prev = 0.0f;

size_t n = 64;
for(size_t vl; n>0; n -=vl){
    vl = __riscv_vsetvl_e32m4(n); //LMUL=4
    x_vec = __riscv_vle32_v_f32m4(&x[0], vl);
    x_forward_vec = __riscv_vle32_v_f32m4(&x[0], vl);
    temp_vec = __riscv_vfmul_vv_f32m4(x_vec, x_forward_vec, vl);

    /**
     * I have to use m1 to complicate the intrinsics
     */
    //vfloat32m1_t __riscv_vfmv_s_tu(vfloat32m1_t vd, float rs1, size_t vl);
    src_vec = __riscv_vfmv_s_tu(src_vec, 0.0f, vl);                 //initial src_vec
    //dst_vec = __riscv_vfmv_s_f_f32m1_tu(dst_vec, 0.0f, vl);         //clean for vfredosum
    dst_vec = __riscv_vfmv_s_tu(dst_vec, 0.0f, vl);         //clean for vfredosum
    dst_vec = __riscv_vfredosum_tu(dst_vec, temp_vec, src_vec, vl);

    r[0] = __riscv_vfmv_f_s_f32m1_f32(dst_vec);

   }
}

ASM:

GCC-14 foo_vec:
        li      a4,64
.L2:
        vsetvli a5,a4,e8,m1,ta,ma   ---> 
        vsetvli zero,a5,e32,m1,tu,ma
        vmv.s.x v2,zero
        vmv.s.x v1,zero
        vsetvli zero,a5,e32,m4,tu,ma
        vle32.v v4,0(a1)
        vfmul.vv        v4,v4,v4
        vfredosum.vs    v1,v4,v2
        vfmv.f.s        fa5,v1
        fsw     fa5,0(a0)
        sub     a4,a4,a5
        bne     a4,zero,.L2
        ret

GCC-13:

foo_vec(float*, float const*):
        fmv.s.x fa5,zero
        li      a4,64
.L2:
        vsetvli a5,a4,e32,m4,ta,ma
        vle32.v v28,0(a1)
        vfmv.s.f        v25,fa5
        vfmul.vv        v28,v28,v28
        vfmv.s.f        v24,fa5
        sub     a4,a4,a5
        vfredosum.vs    v24,v28,v25
        vfmv.f.s        fa4,v24
        fsw     fa4,0(a0)
        bne     a4,zero,.L2
        ret


* [Bug target/112776] RISC-V Regression: Missed optimization of VSETVL PASS
  2023-11-30 10:08 [Bug c/112776] New: RISC-V Regression: Missed optimization of VSETVL PASS juzhe.zhong at rivai dot ai
@ 2023-12-01  6:50 ` cvs-commit at gcc dot gnu.org
  2023-12-01  6:52 ` juzhe.zhong at rivai dot ai
  1 sibling, 0 replies; 3+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-12-01  6:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112776

--- Comment #1 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The trunk branch has been updated by Lehua Ding <lhtin@gcc.gnu.org>:

https://gcc.gnu.org/g:923a67f17badcbe6e2b2e5d3570a265443258c8e

commit r14-6027-g923a67f17badcbe6e2b2e5d3570a265443258c8e
Author: Juzhe-Zhong <juzhe.zhong@rivai.ai>
Date:   Fri Dec 1 08:39:57 2023 +0800

    RISC-V: Fix VSETVL PASS regression

    This patch fixes 2 regressions (one is a bug regression, the other is a
    performance regression).
    Both regressions have the same cause: we compare the ratio for the same
    AVL in the wrong place.

    1. BUG regression:
    avl_single-84.c:

    f0:
            li      a5,999424
            add     a1,a1,a5
            li      a4,299008
            add     a5,a0,a5
            addi    a3,a4,992
            addi    a5,a5,576
            addi    a1,a1,576
            vsetvli a4,zero,e8,m2,ta,ma
            add     a0,a0,a3
            vlm.v   v1,0(a5)
            vsm.v   v1,0(a1)
            vl1re64.v       v1,0(a0)
            beq     a2,zero,.L10
            li      a5,0
            vsetvli zero,zero,e64,m1,tu,ma   --->  This is incorrect: the ratio above is 4, whereas ratio = 64 is demanded here.
    .L3:
            fcvt.d.lu       fa5,a5
            addi    a5,a5,1
            fadd.d  fa5,fa5,fa0
            vfmv.s.f        v1,fa5
            bne     a5,a2,.L3
            vfmv.f.s        fa0,v1
            ret
    .L10:
            vsetvli zero,zero,e64,m1,ta,ma
            vfmv.f.s        fa0,v1
            ret

    2. Performance regression:

    before this patch:

            vsetvli a5,a4,e8,m1,ta,ma
            vsetvli zero,a5,e32,m1,tu,ma
            vmv.s.x v2,zero
            vmv.s.x v1,zero
            vsetvli zero,a5,e32,m4,tu,ma
            vle32.v v4,0(a1)
            vfmul.vv        v4,v4,v4
            vfredosum.vs    v1,v4,v2
            vfmv.f.s        fa5,v1
            fsw     fa5,0(a0)
            sub     a4,a4,a5
            bne     a4,zero,.L2
            ret

    After this patch:

            vsetvli a5,a4,e32,m4,tu,ma
            vle32.v v4,0(a1)
            vmv.s.x v2,zero
            vmv.s.x v1,zero
            vfmul.vv        v4,v4,v4
            vfredosum.vs    v1,v4,v2
            vfmv.f.s        fa5,v1
            fsw     fa5,0(a0)
            sub     a4,a4,a5
            bne     a4,zero,.L2
            ret

    Tested on rv64gcv_zvfh_zfh: passed with no regressions.

    Testing of zvl256b/zvl512b/zvl1024b/zve64d is still running.

            PR target/112776

    gcc/ChangeLog:

            * config/riscv/riscv-vsetvl.cc (pre_vsetvl::pre_global_vsetvl_info): Fix ratio.

    gcc/testsuite/ChangeLog:

            * gcc.target/riscv/rvv/vsetvl/avl_single-84.c: Adapt test.
            * gcc.target/riscv/rvv/vsetvl/pr111037-3.c: Ditto.
            * gcc.target/riscv/rvv/vsetvl/pr112776.c: New test.
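
For readers following the "ratio" argument in the commit message, here is a minimal sketch of the SEW/LMUL arithmetic behind it. It is not code from riscv-vsetvl.cc; the struct and the function names (vtype, sew_lmul_ratio, vlmax, same_ratio, check_thread_examples) are hypothetical and chosen only for illustration. It relies on the RVV definition VLMAX = LMUL * VLEN / SEW, which means two vtype settings with the same SEW/LMUL ratio produce the same VL for the same AVL, and that a "vsetvli zero,zero,..." form is only valid when the ratio is unchanged.

```c
#include <assert.h>

/* Hypothetical model of the vtype fields that matter for the ratio.
   LMUL is kept as a fraction so fractional settings (mf2, mf4, mf8)
   can be expressed as well. */
struct vtype {
    unsigned sew;      /* element width in bits: 8, 16, 32 or 64 */
    unsigned lmul_num; /* e.g. 4 for m4, 1 for m1 and for mf2 */
    unsigned lmul_den; /* 1 for integer LMUL, 2 for mf2, ... */
};

/* SEW/LMUL ratio, e.g. e32,m4 -> 32 / 4 = 8. */
static unsigned sew_lmul_ratio(struct vtype t)
{
    return t.sew * t.lmul_den / t.lmul_num;
}

/* VLMAX = LMUL * VLEN / SEW = VLEN / ratio. */
static unsigned vlmax(struct vtype t, unsigned vlen_bits)
{
    return vlen_bits / sew_lmul_ratio(t);
}

/* Two settings yield the same VL for the same AVL only if this holds. */
static int same_ratio(struct vtype a, struct vtype b)
{
    return sew_lmul_ratio(a) == sew_lmul_ratio(b);
}

/* The vtype settings that appear in the listings of this thread. */
static void check_thread_examples(void)
{
    struct vtype e8m2  = { 8, 2, 1 };
    struct vtype e64m1 = { 64, 1, 1 };
    struct vtype e8m1  = { 8, 1, 1 };
    struct vtype e32m4 = { 32, 4, 1 };

    /* avl_single-84.c: ratio 4 followed by a demand for ratio 64, so
       "vsetvli zero,zero,e64,m1" must not be emitted there. */
    assert(sew_lmul_ratio(e8m2) == 4);
    assert(sew_lmul_ratio(e64m1) == 64);
    assert(!same_ratio(e8m2, e64m1));

    /* pr112776.c: e8,m1 and e32,m4 both have ratio 8, so they compute
       the same VL from the same AVL. */
    assert(same_ratio(e8m1, e32m4));
}
```

Calling check_thread_examples() from a test driver exercises the values quoted above. As an example of the VLMAX relation, assuming VLEN = 128, vlmax() gives 16 elements for e32,m4 and 4 elements for e32,m1.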


* [Bug target/112776] RISC-V Regression: Missed optimization of VSETVL PASS
  2023-11-30 10:08 [Bug c/112776] New: RISC-V Regression: Missed optimization of VSETVL PASS juzhe.zhong at rivai dot ai
  2023-12-01  6:50 ` [Bug target/112776] " cvs-commit at gcc dot gnu.org
@ 2023-12-01  6:52 ` juzhe.zhong at rivai dot ai
  1 sibling, 0 replies; 3+ messages in thread
From: juzhe.zhong at rivai dot ai @ 2023-12-01  6:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112776

JuzheZhong <juzhe.zhong at rivai dot ai> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #2 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
Fixed.
