From: Kito Cheng
Date: Tue, 30 May 2023 17:11:08 +0800
Subject: Re: [PATCH] RISC-V: Basic VLS code gen for RISC-V
To: Robin Dapp
Cc: "juzhe.zhong@rivai.ai", Richard Biener, gcc-patches, palmer, "kito.cheng", jeffreyalaw, "pan2.li"
In-Reply-To: <5873dbb5-ef8f-6411-1841-d849030554e3@gmail.com>

(I am still stuck in meeting hell and will not be free until much later; apologies for the short and incomplete reply - I will send a complete one later.)

One point in favour of adding VLS mode support is SLP, especially for SLP candidates that are not inside a loop: those cases are better served by VLS types. Of course using a larger, safe VLA type can be optimized too, but that runs into an issue we found with RISC-V in LLVM - it spills/reloads the whole register instead of the exact size, e.g.

  int32x4_t a;
  // def a
  // spill a
  foo ()
  // reload a
  // use a

If we use a VLA mode for a, the spill and reload cover the whole VLA-mode register. Online demo here: https://godbolt.org/z/Y1fThbxE6
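For completeness, here is a compilable version of that sketch (the typedef and function names are just illustrative, not taken from the patch). Assuming the value ends up in a caller-saved vector register, it has to be spilled before the call and reloaded afterwards; with a VLS mode that spill/reload covers exactly 16 bytes, while with a VLA mode it would cover the whole vector register.

  /* Illustrative fixed 16-byte vector type (GNU vector extension). */
  typedef int int32x4_t __attribute__ ((vector_size (16)));

  void foo (void);

  int32x4_t
  bar (int32x4_t a)
  {
    a = a + a;   /* def a */
    foo ();      /* a is live across the call, so it gets spilled/reloaded */
    return a;    /* use a */
  }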
On Tue, May 30, 2023 at 5:05 PM Robin Dapp wrote:
>
> >>> but ideally the user would be able to specify -mrvv-size=32 for an
> >>> implementation with 32-byte vectors, and then vector lowering would make use
> >>> of vectors up to 32 bytes?
> >
> > Actually, we don't want to specify -mrvv-size=32 to enable vectorization on GNU vectors.
> > You can take a look at this example:
> > https://godbolt.org/z/3jYqoM84h
> >
> > GCC needs -mrvv-size to be specified to enable GNU vectors, and the resulting codegen can only run on a CPU with vector length = 128 bits.
> > However, LLVM doesn't need the vector length to be specified, and its codegen can run on any CPU with RVV vector length >= 128 bits.
> >
> > This is what this patch wants to do.
> >
> > Thanks.
>
> I think Richard's question was rather whether it wouldn't be better to do it more
> generically and lower vectors to what either the current CPU or the user
> specified, rather than just 16-byte vectors (i.e. indeed a fixed
> vlmin and not a fixed vlmin == fixed vlmax).
>
> This patch assumes everything is fixed for optimization purposes and then
> switches over to variable-length when nothing can be changed anymore. That
> is, we would work on "vlmin"-sized chunks in a VLA fashion at runtime?
> We would need to make sure that no pass after reload makes use of VLA
> properties at all.
>
> In general I don't have a good overview of which optimizations we gain by
> such an approach, or rather which ones are prevented by VLA altogether.
> What's the idea for the future? Still use LEN_LOAD et al. (and masking)
> with "fixed vlmin"? Wouldn't we select different IVs with this patch than
> what we would have for pure VLA?
>
> Regards
>  Robin
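(For reference, the "GNU vectors" that Juzhe refers to above are GCC's generic vector extension types; the snippet below is only a sketch of the shape of such code, not the exact example behind the godbolt link.)

  typedef int v4si __attribute__ ((vector_size (16)));

  v4si
  vadd (v4si a, v4si b)
  {
    return a + b;   /* element-wise add on a fixed 16-byte vector */
  }

The comparison above is about whether code like this can be vectorized without passing -mrvv-size, so that the same binary runs on any CPU whose vector length is at least 128 bits.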