From: Kito Cheng
Date: Tue, 30 May 2023 17:16:59 +0800
Subject: Re: [PATCH] RISC-V: Basic VLS code gen for RISC-V
To: Robin Dapp
Cc: "juzhe.zhong@rivai.ai", Richard Biener, gcc-patches, palmer, "kito.cheng", jeffreyalaw, "pan2.li"
References: <20230530060621.31449-1-kito.cheng@sifive.com> <87B2E2DEA59DF7D1+20230530154530505119343@rivai.ai> <5873dbb5-ef8f-6411-1841-d849030554e3@gmail.com>

One more note: we found a real case in SPEC 2006. SLP converts two
8-bit values into an int8x2_t, and that value is live across a function
call. Only 16 bits need to be saved and restored, but because the
backend uses a VLA mode, the save/restore covers VLEN bits instead. You
can imagine that the larger VLEN is, the larger the performance penalty
becomes, which is the opposite of what we expect: a larger VLEN should
give better performance.

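To make the size mismatch concrete, here is a minimal sketch in C, not
the SPEC 2006 code itself. It assumes the GNU vector extension
(vector_size) as accepted by GCC and Clang. The names foo,
live_across_call, vls_spill_bytes and vla_spill_bytes are hypothetical,
and the VLA figure assumes a whole-register save at LMUL = 1. Whether
the value is actually spilled depends on the ABI and the register
allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* GNU vector extension: two 8-bit lanes, 16 bits in total. */
typedef int8_t int8x2_t __attribute__((vector_size(2)));

int foo_calls = 0;

/* Hypothetical opaque callee; noinline so the call is really emitted. */
__attribute__((noinline)) void foo(void) { foo_calls++; }

/* v is defined before the call and used after it, so it is live
   across foo() and may have to be saved and restored around it. */
int live_across_call(int8_t a, int8_t b)
{
    int8x2_t v = { a, b };  /* def v */
    foo();                  /* v must survive this call */
    return v[0] + v[1];     /* use v */
}

/* Bytes to save/restore when the value keeps its fixed (VLS) size. */
size_t vls_spill_bytes(size_t elem_bits, size_t lanes)
{
    assert(elem_bits % 8 == 0);
    return elem_bits / 8 * lanes;
}

/* Bytes to save/restore when a whole vector register of VLEN bits is
   spilled (assumption: LMUL = 1, one register). */
size_t vla_spill_bytes(size_t vlen_bits)
{
    assert(vlen_bits % 8 == 0);
    return vlen_bits / 8;
}
```

Under these assumptions the int8x2_t needs 2 bytes, while a
whole-register save needs 16 bytes at VLEN = 128 and 128 bytes at
VLEN = 1024, so the overhead grows linearly with VLEN.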
On Tue, May 30, 2023 at 5:11 PM Kito Cheng wrote:
>
> (I am still in meeting hell and will be released much later; apologies
> for the short and incomplete reply, I will reply in full later.)
>
> One reason for adding VLS mode support is SLP, especially for those
> SLP candidates that are not in a loop. Those cases can do better with
> a VLS type. Of course, using a larger safe VLA type can optimize them
> too, but that will cause one issue we found with RISC-V in LLVM: it
> will spill/reload the whole register instead of the exact size.
>
> e.g.
>
> int32x4_t a;
> // def a
> // spill a
> foo ()
> // reload a
> // use a
>
> If we use a VLA mode for a, it will be spilled and reloaded with the
> whole-register VLA mode.
> Online demo here: https://godbolt.org/z/Y1fThbxE6
>
> On Tue, May 30, 2023 at 5:05 PM Robin Dapp wrote:
> >
> > >>> but ideally the user would be able to specify -mrvv-size=32 for an
> > >>> implementation with 32 byte vectors and then vector lowering would make use
> > >>> of vectors up to 32 bytes?
> > >
> > > Actually, we don't want to specify -mrvv-size=32 to enable vectorization on GNU vectors.
> > > You can take a look at this example:
> > > https://godbolt.org/z/3jYqoM84h
> > >
> > > GCC needs the mrvv size to be specified to enable GNU vectors, and the codegen can only run on a CPU with vector-length = 128 bits.
> > > However, LLVM doesn't need the vector length to be specified, and the codegen can run on any CPU with RVV vector-length >= 128 bits.
> > >
> > > This is what this patch wants to do.
> > >
> > > Thanks.
> > I think Richard's question was rather if it wasn't better to do it more
> > generically and lower vectors to what either the current cpu or what the
> > user specified rather than just 16-byte vectors (i.e. indeed a fixed
> > vlmin and not a fixed vlmin == fixed vlmax).
> >
> > This patch assumes everything is fixed for optimization purposes and then
> > switches over to variable-length when nothing can be changed anymore.  That
> > is, we would work on "vlmin"-sized chunks in a VLA fashion at runtime?
> > We would need to make sure that no pass after reload makes use of VLA
> > properties at all.
> >
> > In general I don't have a good overview of which optimizations we gain by
> > such an approach or rather which ones are prevented by VLA altogether?
> > What's the idea for the future?  Still use LEN_LOAD et al. (and masking)
> > with "fixed vlmin"?  Wouldn't we select different IVs with this patch than
> > what we would have for pure VLA?
> >
> > Regards
> > Robin
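
For readers following the quoted question about working on
"vlmin"-sized chunks with length control, the idea can be pictured with
a scalar emulation in C. This is only a sketch of the loop shape, not
what GCC generates and not the LEN_LOAD internal function itself. The
name add_len_controlled and the lanes parameter (standing in for the
number of elements in a vlmin-sized vector) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Element-wise add, processed in chunks of at most `lanes` elements.
   Each chunk handles len = min(remaining, lanes) elements, which is
   the role a length operand plays in a length-controlled load/store.
   Returns the number of chunks executed. */
size_t add_len_controlled(int32_t *dst, const int32_t *a,
                          const int32_t *b, size_t n, size_t lanes)
{
    size_t chunks = 0;

    assert(lanes > 0);
    for (size_t i = 0; i < n; ) {
        size_t remaining = n - i;
        size_t len = remaining < lanes ? remaining : lanes;

        /* Stand-in for one vector operation on `len` active lanes. */
        for (size_t j = 0; j < len; j++)
            dst[i + j] = a[i + j] + b[i + j];

        i += len;
        chunks++;
    }
    return chunks;
}
```

With n = 7 and lanes = 4, the loop runs two chunks: a full one of 4
elements and a tail of 3, with no separate scalar epilogue.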