From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48)
	id 696E03857C46; Mon, 18 Dec 2023 13:19:41 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 696E03857C46
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1702905581;
	bh=6t2Uit/XR/H9zt6dZMlwpBIoB0+uPQXcdz46tBH/c0s=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=ncUBpx2NCE3bHVTKkN99BrLsc8O8CfTjQyPSMHjO0i6HqjZBbAYHlh8Lq2aNfLyOs
	 dqtv7b4AgBElyNtZMdtaGwG3wGcluTtFY9BmSsp5NnwBvbdu+usoAQdqzjeiw+axKr
	 FZVTRWQ+ZQbKMhyTP3q/UpppNj0EK14L8VwdKKZM=
From: "cvs-commit at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c/112432] Internal-fn: The [i|l|ll]rint family don't support FLOATN
Date: Mon, 18 Dec 2023 13:19:39 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: c
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: cvs-commit at gcc dot gnu.org
X-Bugzilla-Status: RESOLVED
X-Bugzilla-Resolution: FIXED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112432

--- Comment #8 from GCC Commits ---
The master branch has been updated by Pan Li:

https://gcc.gnu.org/g:b3b2799b872bc4c1944629af9dfc8472c8ca5fe6

commit r14-6659-gb3b2799b872bc4c1944629af9dfc8472c8ca5fe6
Author: Juzhe-Zhong
Date:   Mon Dec 18 19:35:21 2023 +0800

    RISC-V: Support one more overlap for wv instructions

    For 'wv' instructions, e.g. vwadd.wv vd,vs2,vs1:

    vs2 has the same EEW as vd.
    vs1 has a smaller EEW than vd.
    So vs2 can overlap with vd, but vs1 can only overlap the highest-numbered
    part of vd, and only when the LMUL of vs1 is at least 1.

    We already supported overlap for vs1 LMUL >= 1.
    But I forgot the vs1 LMUL < 1 case: vs2 can still overlap vd even though
    vs1 cannot overlap vd at all.

    Consider the reduction auto-vectorization:

    int64_t
    reduc_plus_int (int *__restrict a, int n)
    {
      int64_t r = 0;
      for (int i = 0; i < n; ++i)
        r += a[i];
      return r;
    }

    When we use --param=riscv-autovec-lmul=m2, the codegen is good
    because we already support overlap for source EEW32 LMUL1 -> dest EEW64 LMUL2.

    --param=riscv-autovec-lmul=m2:

    reduc_plus_int:
            ble a1,zero,.L4
            vsetvli a5,zero,e64,m2,ta,ma
            vmv.v.i v2,0
    .L3:
            vsetvli a5,a1,e32,m1,tu,ma
            slli a4,a5,2
            sub a1,a1,a5
            vle32.v v1,0(a0)
            add a0,a0,a4
            vwadd.wv v2,v2,v1
            bne a1,zero,.L3
            li a5,0
            vsetivli zero,1,e64,m1,ta,ma
            vmv.s.x v1,a5
            vsetvli a5,zero,e64,m2,ta,ma
            vredsum.vs v2,v2,v1
            vmv.x.s a0,v2
            ret
    .L4:
            li a0,0
            ret

    However, the default LMUL (--param=riscv-autovec-lmul=m1) generates a
    redundant vmv1r, since it is source EEW32 LMUL=MF2 -> dest EEW64 LMUL=1.

    Before this patch:

    reduc_plus_int:
            ble a1,zero,.L4
            vsetvli a5,zero,e64,m1,ta,ma
            vmv.v.i v1,0
    .L3:
            vsetvli a5,a1,e32,mf2,tu,ma
            slli a4,a5,2
            sub a1,a1,a5
            vle32.v v2,0(a0)
            vmv1r.v v3,v1  ----> This should be removed.
            add a0,a0,a4
            vwadd.wv v1,v3,v2  ----> vs2 should be v1
            bne a1,zero,.L3
            li a5,0
            vsetivli zero,1,e64,m1,ta,ma
            vmv.s.x v2,a5
            vsetvli a5,zero,e64,m1,ta,ma
            vredsum.vs v1,v1,v2
            vmv.x.s a0,v1
            ret
    .L4:
            li a0,0
            ret

    After this patch:

    reduc_plus_int:
            ble a1,zero,.L4
            vsetvli a5,zero,e64,m1,ta,ma
            vmv.v.i v1,0
    .L3:
            vsetvli a5,a1,e32,mf2,tu,ma
            slli a4,a5,2
            sub a1,a1,a5
            vle32.v v2,0(a0)
            add a0,a0,a4
            vwadd.wv v1,v1,v2
            bne a1,zero,.L3
            li a5,0
            vsetivli zero,1,e64,m1,ta,ma
            vmv.s.x v2,a5
            vsetvli a5,zero,e64,m1,ta,ma
            vredsum.vs v1,v1,v2
            vmv.x.s a0,v1
            ret
    .L4:
            li a0,0
            ret

            PR target/112432

    gcc/ChangeLog:

            * config/riscv/riscv.md (none,W21,W42,W84,W43,W86,W87): Add W0.
            (none,W21,W42,W84,W43,W86,W87,W0): Ditto.
            * config/riscv/vector.md: Ditto.

    gcc/testsuite/ChangeLog:

            * gcc.target/riscv/rvv/base/pr112432-42.c: New test.
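
The overlap rule the commit message relies on for the narrower vs1 operand can be sketched in C. This is only an illustration of the rule as the RISC-V vector specification states it (a narrower source may share registers with a wider destination only if the source EMUL is at least 1 and the source sits in the highest-numbered part of the destination group). It is not the GCC implementation, which encodes the rule through constraints in the machine description. The helper names and the "LMUL in eighths" encoding are assumptions made for this example, and register-group alignment is not checked.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed encoding: LMUL/EMUL in eighths so fractions stay integral.
   mf8 = 1, mf4 = 2, mf2 = 4, m1 = 8, m2 = 16, m4 = 32, m8 = 64.  */

/* Number of whole vector registers a group occupies (hypothetical helper).
   A fractional LMUL still occupies one register.  */
static int group_regs(int emul_x8)
{
    return emul_x8 <= 8 ? 1 : emul_x8 / 8;
}

/* Hypothetical check: is it legal to place a narrower-EEW source group
   starting at register vs, given a wider-EEW destination group starting
   at register vd?  */
static bool narrow_src_placement_ok(int vd, int vd_emul_x8,
                                    int vs, int vs_emul_x8)
{
    int vd_n = group_regs(vd_emul_x8);
    int vs_n = group_regs(vs_emul_x8);

    assert(vs_emul_x8 < vd_emul_x8);        /* source is the narrower one */
    assert(vd >= 0 && vd + vd_n <= 32);
    assert(vs >= 0 && vs + vs_n <= 32);

    /* Disjoint register groups are always fine.  */
    if (vs + vs_n <= vd || vd + vd_n <= vs)
        return true;

    /* Any overlap requires the source EMUL to be at least 1 ...  */
    if (vs_emul_x8 < 8)
        return false;

    /* ... and the source must end exactly where the destination ends,
       i.e. occupy the highest-numbered part of the destination group.  */
    return vs + vs_n == vd + vd_n;
}
```

In this sketch, narrow_src_placement_ok(1, 8, 1, 4) models the default-LMUL loop (vd = v1 at m1, vs1 at mf2 placed in v1) and returns false, which matches vs1 staying in v2. The vs2 operand of a .wv instruction has the same EEW as vd, so it is not subject to this check and may reuse v1.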