From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Wed, 12 Apr 2023 09:21:19 +0000 (UTC)
From: Richard Biener <rguenther@suse.de>
To: Kito Cheng <kito.cheng@gmail.com>
cc: "juzhe.zhong@rivai.ai" <juzhe.zhong@rivai.ai>, 
    "richard.sandiford" <richard.sandiford@arm.com>, 
    jeffreyalaw <jeffreyalaw@gmail.com>, gcc-patches <gcc-patches@gcc.gnu.org>, 
    palmer <palmer@dabbelt.com>, jakub <jakub@redhat.com>
Subject: Re: Re: [PATCH] machine_mode type size: Extend enum size from 8-bit
 to 16-bit
In-Reply-To: <CA+yXCZCVd3c3c6KWf_t5x714T0brKu7Swh3AXXy+Z3PQ+1Y02A@mail.gmail.com>
Message-ID: <nycvar.YFH.7.77.849.2304120920111.4466@jbgna.fhfr.qr>
References: <20230410144808.324346-1-juzhe.zhong@rivai.ai> <89f088ec-8692-01f5-0395-5a66ddf085d7@gmail.com> <47D962C7C724E3A2+20230410231445834316202@rivai.ai> <mpto7nulpji.fsf@arm.com> <nycvar.YFH.7.77.849.2304111054120.4466@jbgna.fhfr.qr> <mptv8i2k714.fsf@arm.com>
 <0AEFD2378C3DF89B+202304111919556577872@rivai.ai> <CA+yXCZCUcChS_GbuHETPy6R3rgJAAMRHGz1LXYcwnoS-EOFXZg@mail.gmail.com> <nycvar.YFH.7.77.849.2304120743360.4466@jbgna.fhfr.qr> <CA+yXCZCVd3c3c6KWf_t5x714T0brKu7Swh3AXXy+Z3PQ+1Y02A@mail.gmail.com>
User-Agent: Alpine 2.22 (LSU 394 2020-01-19)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
List-Id: <gcc-patches.gcc.gnu.org>

On Wed, 12 Apr 2023, Kito Cheng wrote:

> Hi Richard:
> 
> > > In order to model LMUL in the backend, we need the combination of
> > > scalar type and LMUL; possible LMUL values are 1, 2, 4, 8, 1/2, 1/4,
> > > 1/8 - 7 different types of LMUL, and we'll have QI, HI, SI, DI, HF, SF
> > > and DF, so basically we'll have 7 (LMUL type) * 7 (scalar type) here.
> >
> > Other archs have load/store-multiple instructions, IIRC those
> > are modeled with the appropriate set of operands. Do RVV LMUL
> > group inputs/outputs overlap with the non-LMUL grouped registers
> > and can they be used as aliases or is this supposed to be
> > implemented transparently on the register file level only?
> 
> LMUL and non-LMUL (or LMUL=1) modes use the same vector register file.
> 
> Reg for LMUL=1/2 : { {v0, v1, ...v31} }
> Reg for LMUL=1 : { {v0, v1, ...v31} }
> Reg for LMUL=2 : { {v0, v1}, {v2, v3}, ... {v30, v31} } // reg. must
> align to multiple of 2.
> Reg for LMUL=4 : { {v0, v1, v2, v3}, {v4, v5, v6, v7}, ... {v28, v29,
> v30, v31} } // reg. must align to multiple of 4.
> ...
> Reg for 2-tuples of LMUL=1 : { {v0, v1}, {v1, v2}, ... {v29, v30}, {v30, v31} }
> Reg for 2-tuples of LMUL=2 : { {v0, v1, v2, v3}, {v2, v3, v4, v5}, ...
> {v26, v27, v28, v29}, {v28, v29, v30, v31} } // reg. must align to
> multiple of 2.
> ...
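
The grouping rule quoted above can be sketched in a few lines of Python.
This is only an illustration, not code from the patch: the helper name
legal_groups and its parameters are made up here, lmul is the number of
whole registers per field (fractional LMUL occupies one register, so it
behaves like lmul=1 for this purpose), and nf is the tuple size.

```python
NUM_VREGS = 32  # RVV vector registers v0..v31


def legal_groups(lmul, nf=1):
    """Enumerate register groups for nf-tuples of lmul-register fields.

    Hypothetical helper: a group must start at a register number that
    is a multiple of lmul and must fit entirely within v0..v31.
    """
    size = lmul * nf
    return [tuple(range(first, first + size))
            for first in range(0, NUM_VREGS - size + 1, lmul)]


for lmul, nf in [(1, 1), (2, 1), (4, 1), (1, 2), (2, 2)]:
    groups = legal_groups(lmul, nf)
    print(f"LMUL={lmul} NF={nf}: {len(groups)} groups, "
          f"first={groups[0]}, last={groups[-1]}")
```

Under these assumptions the tuple groups overlap (for example v2 and v3
appear in two different 2-tuples of LMUL=2), while the plain LMUL groups
partition the register file.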
> 
> > But yes, implementing this as operations on multi-register
> > ops with large modes is probably the only sensible approach.
> >
> > I don't see how LMUL of 1/2, 1/4 or 1/8 is useful though? Can you
> > explain? Is that supposed to virtually increase the number of
> > registers? How do you represent r0:1/8:0 vs r0:1/8:3 (the first
> > and the third "virtual" register decomposed from r0) in GCC? To
> > me the natural way would be a subreg of r0?
> >
> > Somehow RVV seems to have more knobs than necessary for tuning
> > the actual vector register layout (aka N axes but only N-1 dimensions
> > thus the axes are
> 
> Fractional LMUL is the same concept as AArch64's partial SVE vectors:
> such modes can only access the lowest part of a register, like SVE's
> partial vectors.
> 
> We want to spill/restore the exact size of those modes (1/2, 1/4,
> 1/8), so adding dedicated modes for those partial vector modes should
> be unavoidable IMO.
> 
> And even if we use sub-vector, we still need to define those partial
> vector types.

Could you use integer modes for the fractional vectors?  For computation
you can always appropriately limit the LEN?
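
A rough sketch of what the integer-mode question amounts to, in Python
and purely for illustration. It assumes a fixed VLEN of 128 bits, which
is an assumption made here and not something the thread states; the
helper name int_mode_for is likewise hypothetical. The mode names and
widths are the usual 8-bit-unit integer modes.

```python
from fractions import Fraction

# Assumption for illustration only: a fixed vector register width.
VLEN_BITS = 128

# Widths in bits of the common scalar integer modes.
INT_MODE_BY_BITS = {8: "QI", 16: "HI", 32: "SI", 64: "DI", 128: "TI"}


def int_mode_for(lmul, vlen_bits=VLEN_BITS):
    """Return the integer mode whose size matches LMUL * VLEN, or None."""
    bits = Fraction(lmul) * vlen_bits
    if bits.denominator != 1:
        return None  # not a whole number of bits
    return INT_MODE_BY_BITS.get(int(bits))


for lmul in (Fraction(1, 2), Fraction(1, 4), Fraction(1, 8)):
    print(f"LMUL={lmul}: {lmul * VLEN_BITS} bits -> {int_mode_for(lmul)}")
```

With this assumed VLEN the fractional sizes line up with DI, SI and HI;
with a different VLEN the same LMUL maps to a different size, so the
correspondence only holds per register width.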