From: "juzhe.zhong at rivai dot ai"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c/111721] New: RISC-V: Failed to SLP for gather_load in RVV
Date: Sat, 07 Oct 2023 23:18:05 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111721

            Bug ID: 111721
           Summary: RISC-V: Failed to SLP for gather_load in RVV
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: juzhe.zhong at rivai dot ai
  Target Milestone: ---

https://godbolt.org/z/d5TPa5e5s

void __attribute__((noipa))
f (int *restrict y, int *restrict x, int *restrict indices, int n)
{
  for (int i = 0; i < n; ++i)
    {
      y[i * 2] = x[indices[i * 2]] + 1;
      y[i * 2 + 1] = x[indices[i * 2 + 1]] + 2;
    }
}

RVV ASM:

f:
        ble     a3,zero,.L5
.L3:
        vsetvli a5,a3,e32,m1,ta,ma
        vlseg2e32.v     v2,(a2)        ----> VEC_LOAD_LANES
        vsetivli        zero,4,e32,m1,ta,ma
        vsll.vi v4,v2,2
        vsll.vi v1,v3,2
        vsetvli zero,a5,e32,m1,ta,ma
        vluxei32.v      v4,(a1),v4
        vluxei32.v      v1,(a1),v1
        vsetivli        zero,4,e32,m1,ta,ma
        slli    a4,a5,3
        vadd.vi v2,v4,1
        vadd.vi v3,v1,2
        sub     a3,a3,a5
        vsetvli zero,a5,e32,m1,ta,ma
        vsseg2e32.v     v2,(a0)        ----> VEC_STORE_LANES
        add     a2,a2,a4
        add     a0,a0,a4
        bne     a3,zero,.L3
.L5:
        ret

Compared to aarch64, which can SLP this loop, RVV generates expensive
load_lanes/store_lanes (the vlseg2e32.v and vsseg2e32.v above) plus two
separate indexed loads.

This is because RVV uses MASK_LEN_GATHER_LOAD, for which SLP is not
currently supported.
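
For illustration, here is a rough scalar model of what the SLP form of this loop would presumably compute. It is a sketch under assumptions, not what GCC emits: the two statements are treated as one group over 2 * n contiguous elements, so indices is read without de-interleaving, a single gather from x replaces the two vluxei32.v, and the addend becomes the repeating constant {1, 2}. The names f_orig, f_slp_model and count_mismatches are hypothetical and do not come from the bug report; the noipa attribute is dropped so the code stays portable C.

```c
#include <assert.h>
#include <stddef.h>

/* Loop from the report, unchanged apart from the name and the dropped
   GCC-specific noipa attribute. */
void f_orig(int *restrict y, int *restrict x, int *restrict indices, int n)
{
  for (int i = 0; i < n; ++i) {
    y[i * 2] = x[indices[i * 2]] + 1;
    y[i * 2 + 1] = x[indices[i * 2 + 1]] + 2;
  }
}

/* Hypothetical scalar model of the SLP form: one pass over 2 * n
   contiguous elements, one indexed load per element, and an addend
   that alternates between 1 (even lanes) and 2 (odd lanes). */
void f_slp_model(int *restrict y, int *restrict x, int *restrict indices, int n)
{
  static const int addend[2] = {1, 2};
  for (int j = 0; j < 2 * n; ++j)
    y[j] = x[indices[j]] + addend[j & 1];
}

/* Runs both versions on the same synthetic data and returns the number
   of differing output elements, or -1 if n is out of range. */
int count_mismatches(int n)
{
  enum { MAX_N = 64 };
  int x[2 * MAX_N], idx[2 * MAX_N], y1[2 * MAX_N], y2[2 * MAX_N];

  if (n < 0 || n > MAX_N)
    return -1;

  for (int j = 0; j < 2 * MAX_N; ++j) {
    x[j] = j * 7 + 3;
    idx[j] = (j * 5 + 1) % (2 * MAX_N);  /* always a valid index into x */
    y1[j] = 0;
    y2[j] = 0;
  }

  f_orig(y1, x, idx, n);
  f_slp_model(y2, x, idx, n);

  int bad = 0;
  for (int j = 0; j < 2 * n; ++j)
    bad += (y1[j] != y2[j]);
  return bad;
}
```

Calling count_mismatches(n) for a few values of n should return 0, which only shows that the contiguous form is equivalent to the original loop. Whether the vectorizer can actually reach that form for MASK_LEN_GATHER_LOAD is the subject of this bug.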