From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48)
	id 76A493858C54; Mon, 5 Feb 2024 06:59:46 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 76A493858C54
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1707116386;
	bh=tjllls/A1tfWINjzRVsGmQPNgmDIuTOgHEcH4/ndJd8=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=xNc8LA4m+MqJdy+/STNXUSHBJbXZVVO9pMLV4Yg0XdMIVyw1CBTUisfzgnJV8mGZl
	 iF+4KHbxr3Hij4LG3sm59x6RpM5eOIfVxhrY0re9528chhswWQF5/m1+RLxzuXX7xb
	 8rs1dIKSLWHXcMKK3E3dpjQs6rr3ohsguoY3rmzA=
From: "juzhe.zhong at rivai dot ai"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/113583] Main loop in 519.lbm not vectorized.
Date: Mon, 05 Feb 2024 06:59:44 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: juzhe.zhong at rivai dot ai
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113583

--- Comment #11 from JuzheZhong ---
Hi. I think this RVV compiler codegen is the optimal codegen we want for RVV:

https://repo.hca.bsc.es/epic/z/P6QXCc

.LBB0_5:                                # %vector.body
	sub	a4, t0, a3
	vsetvli	t1, a4, e64, m1, ta, mu
	mul	a2, a3, t2
	add	a5, t3, a2
	vlse64.v	v8, (a5), t2
	add	a4, a6, a2
	vlse64.v	v9, (a4), t2
	add	a4, a0, a2
	vlse64.v	v10, (a4), t2
	vfadd.vv	v8, v8, v9
	vfmul.vf	v8, v8, fa5
	vfadd.vf	v9, v10, fa4
	vfmadd.vf	v9, fa3, v10
	vlse64.v	v10, (a5), t2
	add	a4, a1, a2
	vsse64.v	v9, (a4), t2
	vfadd.vf	v8, v8, fa2
	vfmadd.vf	v8, fa3, v10
	vfadd.vf	v8, v8, fa1
	add	a2, a2, a7
	add	a3, a3, t1
	vsse64.v	v8, (a2), t2
	bne	a3, t0, .LBB0_5
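
For reference, here is a hedged C sketch of the scalar loop shape that assembly implements with strided loads/stores (vlse64.v/vsse64.v) and scalar-operand FMAs (vfmadd.vf computes vd = f[rs1] * vd + vs2). The array names, the stride, and the coefficient names f1..f5 (mirroring fa1..fa5) are illustrative stand-ins, not taken from 519.lbm's source:

```c
#include <assert.h>

/* Hypothetical C shape of the vectorized loop above. Each statement is
 * annotated with the RVV instruction it corresponds to in the listing. */
void kernel(const double *a, const double *b, const double *c,
            double *d, double *e,
            double f5, double f4, double f3, double f2, double f1,
            int n, int stride)
{
    for (int i = 0; i < n; ++i) {
        double v8  = a[i * stride];      /* vlse64.v v8, (a5), t2       */
        double v9  = b[i * stride];      /* vlse64.v v9, (a4), t2       */
        double v10 = c[i * stride];      /* vlse64.v v10, (a4), t2      */

        v8 = (v8 + v9) * f5;             /* vfadd.vv + vfmul.vf (fa5)   */
        v9 = f3 * (v10 + f4) + v10;      /* vfadd.vf (fa4) + vfmadd.vf  */
        d[i * stride] = v9;              /* vsse64.v v9, (a4), t2       */

        double r = a[i * stride];        /* second vlse64.v from (a5)   */
        v8 = f3 * (v8 + f2) + r;         /* vfadd.vf (fa2) + vfmadd.vf  */
        e[i * stride] = v8 + f1;         /* vfadd.vf (fa1) + vsse64.v   */
    }
}
```

Note the reload of a[i * stride] from the same address (a5) for the second vfmadd.vf, exactly as in the asm.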