From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id DDF963857835; Thu, 26 Oct 2023 01:57:27 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org DDF963857835
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1698285447;
	bh=eL2yFuhLdXbxtnOq6NFd3I8Or0+y8mIrgsg7RKHdvW8=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=JXjNdlJeKCH8cfVa+FQ5cM6WyZbLDuwZWOwRIp1M6KQtqij+BfId4qiz6lnLEF/LS
	 eLS0XVFkTtbwZ017rS95CvQjgrTnKx7FSctjihUzEHitHueIDItelH/pBel1OQq0Es
	 bgppVBBc5RrVTluAChWMCiIka6DaOMuo53BC66Zg=
From: "juzhe.zhong at rivai dot ai" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/112092] RISC-V: Wrong RVV code produced for vsetvl-11.c
 and vsetvlmax-8.c
Date: Thu, 26 Oct 2023 01:57:27 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Who: juzhe.zhong at rivai dot ai
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-112092-4-amEyiqV0CK@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-112092-4@http.gcc.gnu.org/bugzilla/>
References: <bug-112092-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112092

--- Comment #2 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
To demonstrate the idea, here is a simple example that should make it easier
to understand:

https://godbolt.org/z/Gxzjv48Ec

#include "riscv_vector.h"

void foo(int32_t *in1, int32_t *in2, int32_t *in3, int32_t *out, size_t n,
         int cond, int avl) {
    size_t vl = __riscv_vsetvl_e16mf2(avl >> 2);
    vint32m1_t a = __riscv_vle32_v_i32m1(in1, vl);
    vint32m1_t b = __riscv_vle32_v_i32m1_tu(a, in2, vl);
    vint32m1_t c = __riscv_vle32_v_i32m1_tu(b, in3, vl);
    __riscv_vse32_v_i32m1(out, c, vl);
}

LLVM:

        srai    a4, a6, 2
        vsetvli zero, a4, e16, mf2, ta, ma
        vle32.v v8, (a0)
        vsetvli zero, zero, e32, m1, tu, ma
        vle32.v v8, (a1)
        vle32.v v8, (a2)
        vse32.v v8, (a3)
        ret

LLVM generates the naive code that follows the intrinsics directly: as you
said, the first vsetvli keeps e16, mf2 unchanged.

Here is the codegen from GCC:

        srai    a6,a6,2
        vsetvli a6,a6,e32,m1,tu,ma
        vle32.v v1,0(a0)
        vle32.v v1,0(a1)
        vle32.v v1,0(a2)
        vse32.v v1,0(a3)
        ret

Since e16, mf2 has the same SEW/LMUL ratio as e32, m1, we change the first
vsetvl from e16, mf2 to e32, m1, TU.
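
The ratio argument above can be checked with a small model in plain C. This
is only a sketch and not GCC's implementation: the struct and function names
are made up for illustration, and model_vsetvl assumes the simple
vl = min(AVL, VLMAX) rule, whereas the RVV spec gives implementations some
freedom when AVL lies between VLMAX and 2*VLMAX. It relies on
VLMAX = VLEN * LMUL / SEW, so two configurations with the same SEW/LMUL
ratio have the same VLMAX and yield the same vl for the same AVL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a vtype setting. LMUL is stored as a fraction
   lmul_num/lmul_den, e.g. mf2 = 1/2 and m1 = 1/1. */
struct vtype {
    unsigned sew;      /* element width in bits */
    unsigned lmul_num; /* LMUL numerator */
    unsigned lmul_den; /* LMUL denominator */
};

/* SEW/LMUL ratio, computed as sew * lmul_den / lmul_num. */
static unsigned sew_lmul_ratio(struct vtype t) {
    assert(t.lmul_num > 0 && t.lmul_den > 0);
    return t.sew * t.lmul_den / t.lmul_num;
}

/* VLMAX = VLEN * LMUL / SEW, which equals VLEN / (SEW/LMUL). */
static size_t vlmax(unsigned vlen, struct vtype t) {
    return vlen / sew_lmul_ratio(t);
}

/* Assumed vl rule for this sketch: vl = min(avl, VLMAX). */
static size_t model_vsetvl(size_t avl, unsigned vlen, struct vtype t) {
    size_t m = vlmax(vlen, t);
    return avl < m ? avl : m;
}
```

With a hypothetical VLEN of 128, both {16, 1, 2} (e16, mf2) and {32, 1, 1}
(e32, m1) have a ratio of 32 and a VLMAX of 4, so model_vsetvl returns the
same vl for either configuration.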

Then we can eliminate the second vsetvl.

That is what we call "local fusion" here.

The case you mentioned is "global fusion", but they are the same thing.

Both fuse vsetvl instructions according to the RVV ISA.

So for the example you mentioned, GCC is generating correct code.