From: "juzhe.zhong at rivai dot ai"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/112092] RISC-V: Wrong RVV code produced for vsetvl-11.c and vsetvlmax-8.c
Date: Thu, 26 Oct 2023 06:51:48 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112092

--- Comment #5 from JuzheZhong ---
Yes. I agree that some architectures prefer agnostic over undisturbed, even
at the cost of more vsetvls. That's why I posted a PR asking whether we can
have an option like -mprefer-agnostic:

https://github.com/riscv-non-isa/riscv-toolchain-conventions/issues/37

But I think Maciej is worried about why GCC fuses the vsetvl and changes the
e16mf2 vsetvl into e32m1.

For example:

https://godbolt.org/z/6G9G7Pbe9

No 'TU' included.

I think LLVM codegen looks more reasonable:

        beqz    a5, .LBB0_4
        vsetvli a1, a6, e32, m1, ta, ma
        beqz    a4, .LBB0_3
.LBB0_2:                                # =>This Inner Loop Header: Depth=1
        vsetvli zero, a1, e32, m1, ta, ma
        vle32.v v8, (a0)
        vadd.vv v8, v8, v8
        addi    a4, a4, -1
        vse32.v v8, (a3)
        bnez    a4, .LBB0_2
.LBB0_3:
        ret
.LBB0_4:
        srai    a1, a6, 2
        vsetvli a1, a1, e16, mf2, ta, ma
        bnez    a4, .LBB0_2
        j       .LBB0_3

But GCC is correct with optimizations:

foo(int*, int*, int*, int*, unsigned long, int, int):
        beq     a5,zero,.L2
        vsetvli a5,a6,e32,m1,ta,ma
.L3:
        beq     a4,zero,.L10
        li      a2,0
.L5:
        vle32.v v1,0(a0)
        addi    a2,a2,1
        vadd.vv v1,v1,v1
        vse32.v v1,0(a3)
        bne     a4,a2,.L5
.L10:
        ret
.L2:
        sraiw   a5,a6,2
        vsetvli zero,a5,e32,m1,ta,ma
        j       .L3
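
As a side note on why rewriting e16,mf2 as e32,m1 can leave the resulting VL unchanged: the RVV specification defines VLMAX = VLEN * LMUL / SEW, and both settings have the same SEW/LMUL ratio (32), so they share the same VLMAX. Below is a minimal C sketch of that arithmetic only. The names `struct vtype`, `vlmax` and `same_vlmax` are hypothetical helpers made up for illustration, not anything from GCC, and the sketch says nothing about tail or mask policy.

```c
#include <assert.h>

/* Hypothetical model of the SEW/LMUL part of vtype.
   LMUL is kept as a fraction num/den, so mf2 is 1/2 and m1 is 1/1. */
struct vtype {
    unsigned sew;       /* element width in bits */
    unsigned lmul_num;  /* LMUL numerator */
    unsigned lmul_den;  /* LMUL denominator */
};

/* VLMAX = VLEN * LMUL / SEW, per the RVV specification. */
static unsigned vlmax(unsigned vlen, struct vtype t)
{
    assert(t.sew != 0 && t.lmul_den != 0);
    return vlen * t.lmul_num / (t.lmul_den * t.sew);
}

/* Nonzero when two settings give the same VLMAX for a given VLEN,
   i.e. when they have the same SEW/LMUL ratio. */
static int same_vlmax(unsigned vlen, struct vtype a, struct vtype b)
{
    return vlmax(vlen, a) == vlmax(vlen, b);
}
```

For instance, with an assumed VLEN of 128, `vlmax` gives 4 for both `{16, 1, 2}` (e16,mf2) and `{32, 1, 1}` (e32,m1), while `{8, 1, 1}` (e8,m1) gives 16, so only the first pair is interchangeable as far as VL is concerned.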