From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id AD2943856948; Sat,  7 Oct 2023 23:18:06 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org AD2943856948
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1696720686;
	bh=BFC+XefpUXenxic6PkueDIczRlo20LOXnDQOmG4IBoU=;
	h=From:To:Subject:Date:From;
	b=Y0PgpA6UJqrOBibawSmPwLHJfvyGg4OKotnFA7wACA5bNJyGzUcqN/83j+1pV06os
	 pfaSc06bPtBagEhI1qqKAAUH75A3BdK2yUMRjouU2V/4ayP4aoyscyZEGTl3EJSGdt
	 HHMURjEgJdieP4OkJii3k+TgjK4dZYbvDoa9PmCA=
From: "juzhe.zhong at rivai dot ai" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c/111721] New: RISC-V: Failed to SLP for gather_load in RVV
Date: Sat, 07 Oct 2023 23:18:05 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: c
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: 
X-Bugzilla-Severity: normal
X-Bugzilla-Who: juzhe.zhong at rivai dot ai
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status
 bug_severity priority component assigned_to reporter target_milestone
Message-ID: <bug-111721-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111721

            Bug ID: 111721
           Summary: RISC-V: Failed to SLP for gather_load in RVV
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: juzhe.zhong at rivai dot ai
  Target Milestone: ---

https://godbolt.org/z/d5TPa5e5s

void __attribute__((noipa))
f (int *restrict y, int *restrict x, int *restrict indices, int n)
{
  for (int i = 0; i < n; ++i)
    {
      y[i * 2] = x[indices[i * 2]] + 1;
      y[i * 2 + 1] = x[indices[i * 2 + 1]] + 2;
    }
}

RVV ASM:

f:
        ble     a3,zero,.L5
.L3:
        vsetvli a5,a3,e32,m1,ta,ma
        vlseg2e32.v     v2,(a2)                 ----> VEC_LOAD_LANES
        vsetivli        zero,4,e32,m1,ta,ma
        vsll.vi v4,v2,2
        vsll.vi v1,v3,2
        vsetvli zero,a5,e32,m1,ta,ma
        vluxei32.v      v4,(a1),v4
        vluxei32.v      v1,(a1),v1
        vsetivli        zero,4,e32,m1,ta,ma
        slli    a4,a5,3
        vadd.vi v2,v4,1
        vadd.vi v3,v1,2
        sub     a3,a3,a5
        vsetvli zero,a5,e32,m1,ta,ma
        vsseg2e32.v     v2,(a0)                  ----> VEC_STORE_LANES
        add     a2,a2,a4
        add     a0,a0,a4
        bne     a3,zero,.L3
.L5:
        ret

Compared to aarch64, which can SLP this loop, RVV generates expensive
load_lanes/store_lanes.

This is because RVV uses MASK_LEN_GATHER_LOAD, for which SLP is not currently
supported.
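
For illustration, here is a scalar C model of the two strategies. It is only a
sketch, not what the vectorizer actually emits: the names f_scalar, f_lanes,
f_slp, strategies_agree and the lane count VF are hypothetical, and the SLP
shape (one contiguous index load, one gather, one add of {1,2,1,2,...}, one
contiguous store) is my assumption about what an SLP'ed version would look
like. f_lanes mirrors the RVV ASM above (de-interleave, two gathers,
re-interleave).

```c
#include <string.h>

enum { VF = 4 };  /* hypothetical number of 32-bit lanes per vector */

/* Scalar reference: the loop from the report.  */
void
f_scalar (int *restrict y, const int *restrict x,
          const int *restrict indices, int n)
{
  for (int i = 0; i < n; ++i)
    {
      y[i * 2] = x[indices[i * 2]] + 1;
      y[i * 2 + 1] = x[indices[i * 2 + 1]] + 2;
    }
}

/* Model of the load_lanes/store_lanes strategy: de-interleave the
   indices into even and odd vectors, gather twice, add twice, then
   re-interleave on the store.  */
void
f_lanes (int *restrict y, const int *restrict x,
         const int *restrict indices, int n)
{
  for (int i = 0; i < n; i += VF)
    {
      int len = n - i < VF ? n - i : VF;
      int even[VF], odd[VF];
      for (int l = 0; l < len; ++l)      /* like vlseg2e32.v */
        {
          even[l] = indices[(i + l) * 2];
          odd[l] = indices[(i + l) * 2 + 1];
        }
      for (int l = 0; l < len; ++l)      /* two gathers, two adds */
        {
          even[l] = x[even[l]] + 1;
          odd[l] = x[odd[l]] + 2;
        }
      for (int l = 0; l < len; ++l)      /* like vsseg2e32.v */
        {
          y[(i + l) * 2] = even[l];
          y[(i + l) * 2 + 1] = odd[l];
        }
    }
}

/* Model of the assumed SLP strategy: treat y and indices as flat
   arrays of 2 * n elements, so a single contiguous load, a single
   gather and a single contiguous store suffice.  */
void
f_slp (int *restrict y, const int *restrict x,
       const int *restrict indices, int n)
{
  int total = n * 2;
  for (int j = 0; j < total; j += VF)
    {
      int len = total - j < VF ? total - j : VF;
      int v[VF];
      for (int l = 0; l < len; ++l)
        v[l] = indices[j + l];           /* contiguous load */
      for (int l = 0; l < len; ++l)
        v[l] = x[v[l]];                  /* single gather */
      for (int l = 0; l < len; ++l)
        v[l] += 1 + ((j + l) & 1);       /* add {1, 2, 1, 2, ...} */
      for (int l = 0; l < len; ++l)
        y[j + l] = v[l];                 /* contiguous store */
    }
}

/* Returns 1 if both models match the scalar loop on a small input
   whose length is not a multiple of VF, so the tails are exercised.  */
int
strategies_agree (void)
{
  enum { N = 5 };
  const int x[8] = { 10, 20, 30, 40, 50, 60, 70, 80 };
  const int indices[2 * N] = { 7, 0, 3, 3, 1, 6, 2, 5, 4, 0 };
  int ref[2 * N], lanes[2 * N], slp[2 * N];

  f_scalar (ref, x, indices, N);
  f_lanes (lanes, x, indices, N);
  f_slp (slp, x, indices, N);
  return memcmp (ref, lanes, sizeof ref) == 0
         && memcmp (ref, slp, sizeof ref) == 0;
}
```

Both models compute the same result as the original loop. The difference is
only in the memory access shape: the lanes version needs segment
loads/stores plus two gathers per iteration, while the SLP version needs one
of each.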