From: "juzhe.zhong at rivai dot ai"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/112092] RISC-V: Wrong RVV code produced for vsetvl-11.c and vsetvlmax-8.c
Date: Thu, 26 Oct 2023 01:57:27 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112092

--- Comment #2 from JuzheZhong ---
To demonstrate the idea, here is a simple example that should make it easier
to understand:

https://godbolt.org/z/Gxzjv48Ec

#include "riscv_vector.h"

void foo(int32_t *in1, int32_t *in2, int32_t *in3, int32_t *out, size_t n,
         int cond, int avl) {
  size_t vl = __riscv_vsetvl_e16mf2(avl >> 2);
  vint32m1_t a = __riscv_vle32_v_i32m1(in1, vl);
  vint32m1_t b = __riscv_vle32_v_i32m1_tu(a, in2, vl);
  vint32m1_t c = __riscv_vle32_v_i32m1_tu(b, in3, vl);
  __riscv_vse32_v_i32m1(out, c, vl);
}

LLVM:

        srai    a4, a6, 2
        vsetvli zero, a4, e16, mf2, ta, ma
        vle32.v v8, (a0)
        vsetvli zero, zero, e32, m1, tu, ma
        vle32.v v8, (a1)
        vle32.v v8, (a2)
        vse32.v v8, (a3)
        ret

LLVM generates the naive code that follows the intrinsics directly: as you
said, the first vsetvli keeps e16, mf2 unchanged.

Here is the codegen of GCC:

GCC:

        srai    a6,a6,2
        vsetvli a6,a6,e32,m1,tu,ma
        vle32.v v1,0(a0)
        vle32.v v1,0(a1)
        vle32.v v1,0(a2)
        vse32.v v1,0(a3)
        ret

Since e16, mf2 has the same SEW/LMUL ratio as e32, m1, we change the first
vsetvl from e16, mf2 to e32, m1 with TU. Then we can eliminate the second
vsetvl.

This is what we call "local fusion" here. The case you mentioned is "global
fusion", but they are the same thing: fusing vsetvl instructions according to
the RVV ISA.

So, for the example you mentioned, GCC is generating correct code.
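
The "same ratio" argument above can be checked with a little arithmetic. The
following is only a minimal sketch, not GCC's actual vsetvl pass: the struct
vtype_cfg and the helpers sew_lmul_ratio, vlmax and same_ratio are
hypothetical names made up for illustration. It relies on the RVV rule that
VLMAX = VLEN * LMUL / SEW, so two settings with equal SEW/LMUL yield the same
VLMAX and therefore the same vl for a given AVL.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper type (not from GCC): one vtype setting, with LMUL
   stored as the fraction lmul_num / lmul_den so that mf2 is 1/2. */
struct vtype_cfg {
    unsigned sew;      /* element width in bits: 8, 16, 32 or 64 */
    unsigned lmul_num; /* LMUL numerator: 1, 2, 4 or 8 */
    unsigned lmul_den; /* LMUL denominator: 1 for integer LMUL, 2/4/8 for mf2/mf4/mf8 */
};

/* SEW/LMUL ratio, computed as sew * den / num to stay in integers. */
unsigned sew_lmul_ratio(struct vtype_cfg c)
{
    assert(c.lmul_num > 0 && c.lmul_den > 0);
    return c.sew * c.lmul_den / c.lmul_num;
}

/* VLMAX = VLEN * LMUL / SEW, which equals VLEN / (SEW/LMUL). */
unsigned vlmax(unsigned vlen_bits, struct vtype_cfg c)
{
    return vlen_bits / sew_lmul_ratio(c);
}

/* Two settings with equal ratio give equal VLMAX, hence the same vl
   for the same AVL; this is the precondition for fusing them. */
bool same_ratio(struct vtype_cfg a, struct vtype_cfg b)
{
    return sew_lmul_ratio(a) == sew_lmul_ratio(b);
}
```

For e16, mf2 ({16, 1, 2}) and e32, m1 ({32, 1, 1}) both ratios are 32, so
with an assumed VLEN of 128 both settings have VLMAX = 4. That is why the
first vsetvl can be rewritten to e32, m1 without changing vl.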