Date: Mon, 13 Feb 2023 09:48:09 +0000 (UTC)
From: Richard Biener
To: Richard Sandiford
cc: "juzhe.zhong@rivai.ai", "incarnation.p.lee", gcc-patches, "Kito.cheng", ams
Subject: Re: [PATCH] RISC-V: Bugfix for mode tieable of the rvv bool types
References: <869B9CDC6210FDAD+20230213161951079936108@rivai.ai>

On Mon, 13 Feb 2023, Richard Sandiford wrote:

> Richard Biener writes:
> > On Mon, 13 Feb 2023, juzhe.zhong@rivai.ai wrote:
> >
> >> >> But then GET_MODE_PRECISION (GET_MODE_INNER (..)) should always be 1?
> >> Yes, I think so.
> >>
> >> Let's explain RVV more clearly.
> >> Let's suppose we have vector-length = 64 bits in an RVV CPU.
> >> VNx1BI is exactly 1 consecutive bit.
> >> VNx2BI is exactly 2 consecutive bits.
> >> VNx4BI is exactly 4 consecutive bits.
> >> VNx8BI is exactly 8 consecutive bits.
> >>
> >> For VNx1BI (vbool64_t), we load it with this asm:
> >> vsetvl e8mf8
> >> vlm.v
> >>
> >> For VNx2BI (vbool32_t), we load it with this asm:
> >> vsetvl e8mf4
> >> vlm.v
> >>
> >> For VNx4BI (vbool16_t), we load it with this asm:
> >> vsetvl e8mf2
> >> vlm.v
> >>
> >> For VNx8BI (vbool8_t), we load it with this asm:
> >> vsetvl e8m1
> >> vlm.v
> >>
> >> In case of this code sequence:
> >> vbool16_t v4 = *(vbool16_t *)in;
> >> vbool8_t v3 = *(vbool8_t *)in;
> >>
> >> Since VNx4BI (vbool16_t) is smaller than VNx8BI (vbool8_t),
> >> we can't just use the data loaded by VNx4BI (vbool16_t) in VNx8BI (vbool8_t).
> >> But we can use the data loaded by VNx8BI (vbool8_t) in VNx4BI (vbool16_t).
> >>
> >> In this example, GCC thinks the data loaded for vbool8_t v3 can be
> >> replaced by vbool16_t v4, which is already loaded.
> >> It's incorrect for RVV.
> >
> > OK, so the 'vlm.v' instruction will zero the padding bits (according to
> > vsetvl), but I doubt the memory subsystem will load anything less than
> > a whole byte.
> >
> > Then GET_MODE_PRECISION of VNx4BI has to be smaller than
> > GET_MODE_PRECISION of VNx8BI, even if their size is the same.
> >
> > I suppose that ADJUST_NUNITS should be able to do this, but then we
> > have in aarch64-modes.def
> >
> >   VECTOR_BOOL_MODE (VNx16BI, 16, BI, 2);
> >   VECTOR_BOOL_MODE (VNx8BI, 8, BI, 2);
> >   VECTOR_BOOL_MODE (VNx4BI, 4, BI, 2);
> >   VECTOR_BOOL_MODE (VNx2BI, 2, BI, 2);
> >
> >   ADJUST_NUNITS (VNx16BI, aarch64_sve_vg * 8);
> >   ADJUST_NUNITS (VNx8BI, aarch64_sve_vg * 4);
> >   ADJUST_NUNITS (VNx4BI, aarch64_sve_vg * 2);
> >   ADJUST_NUNITS (VNx2BI, aarch64_sve_vg);
> >
> > so all VNxMBI modes are 2 bytes in size and their component is always
> > BImode, but IIRC the elements of VNx2BImode occupy 4 bits each?
>
> Yeah.  Only the low bit is significant, so it's still a 1-bit element.
> But the padding is distributed evenly across the elements rather than
> being grouped at one end of the predicate.

I wonder what we'd do for a target that makes the high bit significant ;)

> > For riscv we have
> >
> >   VECTOR_BOOL_MODE (VNx1BI, 1, BI, 1);
> >   ADJUST_NUNITS (VNx1BI, riscv_v_adjust_nunits (VNx1BImode, 1));
> >
> > so here it would be natural to set the mode precision to
> > a poly-int computed by the component precision times nunits?  OTOH
> > we have to look at the component precision vs. size as well and
> >
> >   /* Single bit mode used for booleans.  */
> >   BOOL_MODE (BI, 1, 1);
> >
> > BOOL_MODE is not documented, but its arguments are the precision and
> > the size, so BImode has a size of 1.  That makes VECTOR_BOOL_MODE very
> > special since the layout isn't derived from the component mode.
> > Deriving the layout from the precision would make aarch64 incorrect
> > and would need BI2 and BI4 modes at least.
>
> I think the elements have to stay BI for AArch64.  Using BI2 (with a
> precision of 2) would make both bits significant.

I think what's "wrong" with a BImode component mode is not the precision
but the size.  We don't support bit-precision component types on the
GENERIC side, but for bool vector modes we pack the components to a bit
size, and aarch64 has varying bit sizes here (and thus components with
padding).  I don't think we support modes with sizes less than a unit,
but since bool modes are special we could re-purpose their precision to
mean bitsize.

> I'm not sure the RVV case fits into the existing mode layout scheme.
> AFAIK we don't currently support vector modes with padding at one end.
> If that's right, the fix is likely to involve more than just tweaking
> the mode parameters.
>
> What's the byte size of VNx1BI, expressed as a function of N?
> If it's CEIL (N, 8) then we don't have a way of representing that yet.

PARTIAL_VECTOR_MODE?  (ick)

Richard.
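
P.S. To make the size-versus-precision point concrete, here is a minimal C
sketch of the RVV layout as it is described in this thread: N mask bits
packed from bit 0, with the padding grouped at the high end.  The helper
names are hypothetical (none of them exist in GCC), and the reuse rule only
restates juzhe's observation that a vbool8_t load covers a vbool16_t view
but not the other way around.

```c
#include <assert.h>

/* Hypothetical helper, not a GCC function: bytes needed to store NBITS
   mask bits packed contiguously from bit 0, i.e. CEIL (NBITS, 8).  */
unsigned
packed_mask_bytes (unsigned nbits)
{
  assert (nbits > 0);
  return (nbits + 7) / 8;
}

/* Hypothetical helper: padding bits left at the high end of that
   storage once the NBITS significant bits are accounted for.  */
unsigned
tail_padding_bits (unsigned nbits)
{
  return packed_mask_bytes (nbits) * 8 - nbits;
}

/* Hypothetical predicate: may a value loaded as FROM_BITS mask bits be
   reused as TO_BITS mask bits?  This assumes the rule stated in the
   thread: the load must have covered every bit the new view needs.
   Comparing byte sizes instead would wrongly accept 4 -> 8.  */
int
mask_reusable_p (unsigned from_bits, unsigned to_bits)
{
  return from_bits >= to_bits;
}
```

With a 64-bit vector length, VNx1BI through VNx8BI all come out at one
byte here, so the byte size cannot tell them apart; only the bit count
(the precision) does.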