From: Prathamesh Kulkarni
Date: Tue, 14 Dec 2021 14:04:17 +0530
Subject: Re: [SVE] PR96463 - Optimise svld1rq from vectors
To: Prathamesh Kulkarni, gcc Patches, rguenther@suse.de, richard.sandiford@arm.com

On Tue, 7 Dec 2021 at 19:08, Richard Sandiford wrote:
>
> Prathamesh Kulkarni writes:
> > On Thu, 2 Dec 2021 at 23:11, Richard Sandiford
> > wrote:
> >>
> >> Prathamesh Kulkarni writes:
> >> > Hi Richard,
> >> > I have attached a WIP untested patch for PR96463.
> >> > IIUC, the PR suggests to transform
> >> > lhs = svld1rq ({-1, -1, ...}, &v[0])
> >> > into:
> >> > lhs = vec_perm_expr
> >> > if v is vector of 4 elements, and each element is 32 bits on little
> >> > endian target ?
> >> >
> >> > I am sorry if this sounds like a silly question, but I am not sure how
> >> > to convert a vector of type int32x4_t into svint32_t ? In the patch, I
> >> > simply used NOP_EXPR (which I expected to fail), and gave type error
> >> > during gimple verification:
> >>
> >> It should be possible in principle to have a VEC_PERM_EXPR in which
> >> the operands are Advanced SIMD vectors and the result is an SVE vector.
> >>
> >> E.g., the dup in the PR would be something like this:
> >>
> >> foo (int32x4_t a)
> >> {
> >>   svint32_t _2;
> >>
> >>   _2 = VEC_PERM_EXPR ;
> >>   return _2;
> >> }
> >>
> >> where the final operand can be built using:
> >>
> >>   int source_nelts = TYPE_VECTOR_SUBPARTS (…rhs type…).to_constant ();
> >>   vec_perm_builder sel (TYPE_VECTOR_SUBPARTS (…lhs type…), source_nelts, 1);
> >>   for (int i = 0; i < source_nelts; ++i)
> >>     sel.quick_push (i);
> >>
> >> I'm not sure how well-tested that combination is though.  It might need
> >> changes to target-independent code.
> > Hi Richard,
> > Thanks for the suggestions.
> > I tried the above approach in attached patch, but it still results in
> > ICE due to type mismatch:
> >
> > pr96463.c: In function ‘foo’:
> > pr96463.c:8:1: error: type mismatch in ‘vec_perm_expr’
> >     8 | }
> >       | ^
> > svint32_t
> > int32x4_t
> > int32x4_t
> > svint32_t
> > _3 = VEC_PERM_EXPR ;
> > during GIMPLE pass: ccp
> > dump file: pr96463.c.032t.ccp1
> > pr96463.c:8:1: internal compiler error: verify_gimple failed
> >
> > Should we perhaps add another tree code, that "extends" a fixed-width
> > vector into its VLA equivalent ?
>
> No, I think this is just an extreme example of the combination not being
> well-tested. :-)  Obviously it's worse than I thought.
>
> I think accepting this kind of VEC_PERM_EXPR is still the way to go.
> Richi, WDYT?

Hi Richi, ping ?

Thanks,
Prathamesh
>
> Thanks,
> Richard
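
For readers following the selector encoding quoted above: the builder is
created with source_nelts patterns and one element per pattern, which, as I
understand the encoding, means each pattern simply repeats its single element.
The sketch below models only that case in standalone C++. It does not use GCC
internals; RepeatingSelector, build_dup_selector and permute are hypothetical
names invented for illustration, and result_nelts stands in for one concrete
SVE vector length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the one case used in the thread: npatterns
// interleaved patterns with a single encoded element each.  Under that
// assumption every pattern repeats its element, so element i of the full
// selector is encoded[i % npatterns].
struct RepeatingSelector {
  std::vector<int> encoded;

  int operator[](std::size_t i) const { return encoded[i % encoded.size()]; }
};

// Mirrors the loop from the thread: push 0 .. source_nelts - 1.
RepeatingSelector build_dup_selector(int source_nelts) {
  RepeatingSelector sel;
  for (int i = 0; i < source_nelts; ++i)
    sel.encoded.push_back(i);
  return sel;
}

// Apply the selector to a fixed-width source to fill a longer result.
// Every index is below src.size(), so only the first permute input is
// ever read; the second input is omitted from this model.
std::vector<int> permute(const std::vector<int>& src,
                         const RepeatingSelector& sel,
                         std::size_t result_nelts) {
  std::vector<int> out(result_nelts);
  for (std::size_t i = 0; i < result_nelts; ++i)
    out[i] = src[sel[i]];
  return out;
}
```

Calling permute on a 4-element source with build_dup_selector (4) and a
result length of 8 repeats the four source elements twice, which is the
replicate-a-quadword behaviour the PR wants from svld1rq.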