From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 12 Apr 2023 09:21:19 +0000 (UTC)
From: Richard Biener
To: Kito Cheng
cc: "juzhe.zhong@rivai.ai", "richard.sandiford", jeffreyalaw, gcc-patches, palmer, jakub
Subject: Re: Re: [PATCH] machine_mode type size: Extend enum size from 8-bit to 16-bit
References: <20230410144808.324346-1-juzhe.zhong@rivai.ai>
 <89f088ec-8692-01f5-0395-5a66ddf085d7@gmail.com>
 <47D962C7C724E3A2+20230410231445834316202@rivai.ai>
 <0AEFD2378C3DF89B+202304111919556577872@rivai.ai>

On Wed, 12 Apr 2023, Kito Cheng wrote:

> Hi Richard:
>
> > > In order to model LMUL in backend, we have to represent the
> > > combination of scalar type and LMUL; possible LMUL is 1, 2, 4, 8,
> > > 1/2, 1/4, 1/8 - 7 different types of LMUL, and we'll have QI, HI,
> > > SI, DI, HF, SF and DF, so basically we'll have 7 (LMUL type) * 7
> > > (scalar type) here.
> >
> > Other archs have load/store-multiple instructions, IIRC those
> > are modeled with the appropriate set of operands. Do RVV LMUL
> > group inputs/outputs overlap with the non-LMUL grouped registers
> > and can they be used as aliases or is this supposed to be
> > implemented transparently on the register file level only?
>
> LMUL and non-LMUL (or LMUL=1) modes use the same vector register file.
>
> Reg for LMUL=1/2 : { {v0, v1, ...v31} }
> Reg for LMUL=1 : { {v0, v1, ...v31} }
> Reg for LMUL=2 : { {v0, v1}, {v2, v3}, ... {v30, v31} } // reg. must
> align to multiple of 2.
> Reg for LMUL=4 : { {v0, v1, v2, v3}, {v4, v5, v6, v7}, ... {v28, v29,
> v30, v31} } // reg. must align to multiple of 4.
> ..
> Reg for 2-tuples of LMUL=1 : { {v0, v1}, {v1, v2}, ... {v29, v30}, {v30, v31} }
> Reg for 2-tuples of LMUL=2 : { {v0, v1, v2, v3}, {v2, v3, v4, v5}, ...
> {v26, v27, v28, v29}, {v28, v29, v30, v31} } // reg. must align to
> multiple of 2.
> ...
>
> > But yes, implementing this as operations on multi-register
> > ops with large modes is probably the only sensible approach.
> >
> > I don't see how LMUL of 1/2, 1/4 or 1/8 is useful though? Can you
> > explain? Is that supposed to virtually increase the number of
> > registers? How do you represent r0:1/8:0 vs r0:1/8:3 (the first
> > and the third "virtual" register decomposed from r0) in GCC? To
> > me the natural way would be a subreg of r0?
> >
> > Somehow RVV seems to have more knobs than necessary for tuning
> > the actual vector register layout (aka N axes but only N-1 dimensions
> > thus the axes are
>
> The concept of fractional LMUL is the same as the concept of AArch64's
> partial SVE vectors, so they can only access the lowest part, like
> SVE's partial vector.
>
> We want to spill/restore the exact size of those modes (1/2, 1/4,
> 1/8), so adding dedicated modes for those partial vector modes should
> be unavoidable IMO.
>
> And even if we use sub-vector, we still need to define those partial
> vector types.

Could you use integer modes for the fractional vectors? For computation
you can always appropriately limit the LEN?
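
For reference, the register grouping quoted above can be enumerated with
a small sketch. This is illustrative Python, not GCC code: the names
lmul_groups and tuple_groups are hypothetical, and it assumes 32 vector
registers (v0..v31) and the alignment rules exactly as stated in the
quoted list (a group of LMUL=n starts at a multiple of n; an N-tuple is N
consecutive groups whose start is aligned to LMUL).

```python
NUM_VREGS = 32  # assumption: v0..v31, as in the quoted register list


def lmul_groups(lmul):
    """Register groups for an integer LMUL.

    Each group holds `lmul` consecutive registers and starts at a
    register number that is a multiple of `lmul`.
    """
    return [tuple(range(base, base + lmul))
            for base in range(0, NUM_VREGS, lmul)]


def tuple_groups(nfields, lmul):
    """Register groups for tuples of `nfields` consecutive LMUL groups.

    The start only has to be aligned to `lmul`, so neighbouring tuples
    overlap (e.g. {v0, v1} and {v1, v2} for 2-tuples of LMUL=1).
    """
    span = nfields * lmul
    return [tuple(range(base, base + span))
            for base in range(0, NUM_VREGS - span + 1, lmul)]


def fmt(group):
    """Render a group of register numbers as {vA, vB, ...}."""
    return "{" + ", ".join(f"v{r}" for r in group) + "}"


if __name__ == "__main__":
    for lmul in (1, 2, 4, 8):
        groups = lmul_groups(lmul)
        print(f"LMUL={lmul}: {len(groups)} groups,",
              f"first {fmt(groups[0])}, last {fmt(groups[-1])}")
    for lmul in (1, 2):
        groups = tuple_groups(2, lmul)
        print(f"2-tuples of LMUL={lmul}: {len(groups)} groups,",
              f"first {fmt(groups[0])}, last {fmt(groups[-1])}")
```

Fractional LMUL (1/2, 1/4, 1/8) is not modelled separately here: per the
quoted text it uses the same single registers as LMUL=1 and only touches
the low part of each one.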