From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by sourceware.org (Postfix) with ESMTPS id F0D983857360 for ; Wed, 27 Jul 2022 08:12:12 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org F0D983857360 Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out1.suse.de (Postfix) with ESMTP id D3A3534D61; Wed, 27 Jul 2022 08:12:11 +0000 (UTC) Received: from wotan.suse.de (wotan.suse.de [10.160.0.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id B7EA22C141; Wed, 27 Jul 2022 08:12:11 +0000 (UTC) Date: Wed, 27 Jul 2022 08:12:11 +0000 (UTC) From: Richard Biener To: "juzhe.zhong@rivai.ai" cc: gcc-patches Subject: Re: Re: [PATCH 1/1] Fix bit-position comparison In-Reply-To: <2022072716005642628531@rivai.ai>+297F71E4F3A41880 Message-ID: References: <20220727034437.154625-1-juzhe.zhong@rivai.ai>, <20220727034437.154625-2-juzhe.zhong@rivai.ai>, , <2022072715091952120116@rivai.ai>+0E606AA70DF6FCA5, <2022072716005642628531@rivai.ai>+297F71E4F3A41880 User-Agent: Alpine 2.22 (LSU 394 2020-01-19) MIME-Version: 1.0 X-Spam-Status: No, score=-11.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8BIT X-Content-Filtered-By: Mailman/MimeDel 2.1.29 X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Jul 2022 08:12:14 -0000 On Wed, 27 Jul 2022, juzhe.zhong@rivai.ai wrote: > Let's take look at these 2 cases: https://godbolt.org/z/zP16frPnb. In > RVV, we have vle8 and vsetvli to specify loading vint8mf2 (vsetvli a1, > zero + vle8.v). You can see it in foo function. In this case we don't > need to confuse compiler the size of vint8mf2. However, The second case > is I write assembly to occupy the vector register to generate register > spilling cases. You can see LLVM implementation: First use vsetvli + > vle8.v load (vint8mf2_t) data from the base pointer, then spill the > vector to memory using vs1r (this is the whole register store which > store the whole vector to the memory and then use vl1r load the whole > register and finally return it. In LLVM implementation, it insert > vsetvli instructions for RVV instruction using LLVM PASS before RA > (register allocation). So after RA, compiler generate the spilling > loads/stores. We can either choose to use vsetvli + vle/vse (load/store > fractional vector) or vl1r/vs1r (load/store whole vector which enlarge > the spill size). > > Because implementing insert vsetvli PASS after RA (register allocation) > is so complicated, LLVM choose to use vl1r/vs1r. Frankly, I implement > RVV GCC reference to LLVM. So that why I want to define the machine_mode > for `vint8mf2` with smaller element-size but same byte-size from > `vint8m1'. The LLVM strathegy might not be the best one for GCC. I still don't quite understand what issue you face with the code you try to patch. How does the machmode.def portion of the port look like for vint8m1 and vint8mf2? What are the corresponding modes? > Thank you for your reply. > > > > juzhe.zhong@rivai.ai > > From: Richard Biener > Date: 2022-07-27 15:35 > To: juzhe.zhong@rivai.ai > CC: gcc-patches > Subject: Re: Re: [PATCH 1/1] Fix bit-position comparison > On Wed, 27 Jul 2022, juzhe.zhong@rivai.ai wrote: > > > Thank you so much for the fast reply. Ok, it is true that I didn't think about it carefully. Can you help me with the following the issue? > > > > For RVV (RISC-V 'V' Extension), we have full vector type 'vint8m1_t' (LMUL = 1) and fractional vector type 'vint8mf2_t' (LMUL = 1/2). > > Can you explain in terms of GCCs generic vectors what vint8m1_t and > vint8mf2_t are? > > > Because in the ISA, we don't have whole register load/store for fractional vector. I reference the LLVM implementation and I adjust BITSIZE of > > fractional vector same as full vector (It will confuse GCC the bytesize of fractional vector and consider the spill size of a fractional vector is same as LMUL = 1) > > so that I can use whole register load/store directly during the register spilling. (Even though it will enlarge the spill size). According to the machine_mode definition, > > The machine_mode PRECISION is calculate by component size which is different from BITSIZE > > > > Now, here is the question. For array type: vint8mf2x4_t, if I want to access vint8mf2x4_t[2], because the PRECISION and BITSIZE are different. Because bitops is calculated by > > bitsize and compare to precision in the codes that the patch mentioned. It will make a out-of-bounds access to small array. > > > > Can you help me with this? This is important for the following RVV upstream support. Thanks. > > So you have that vint8mf2_t type and registers + instructions to operate > on them but no way to load/store? How do you implement > > vint8mf2_t foo (vint8mf2_t *p) > { > return *p; > } > > ? > > > > > > > > > > > juzhe.zhong@rivai.ai > > > > From: Richard Biener > > Date: 2022-07-27 14:46 > > To: zhongjuzhe > > CC: gcc-patches; richard.earnshaw; jakub; kenner; jlaw; gnu; jason; davem; joseph; richard.sandiford; bernds_cb1; ian; wilson > > Subject: Re: [PATCH 1/1] Fix bit-position comparison > > On Wed, 27 Jul 2022, juzhe.zhong@rivai.ai wrote: > > > > > From: zhongjuzhe > > > > > > gcc/ChangeLog: > > > > > > * expr.cc (expand_assignment): Change GET_MODE_PRECISION to GET_MODE_BITSIZE > > > > > > --- > > > gcc/expr.cc | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/gcc/expr.cc b/gcc/expr.cc > > > index 80bb1b8a4c5..ac2b3c07df6 100644 > > > --- a/gcc/expr.cc > > > +++ b/gcc/expr.cc > > > @@ -5574,7 +5574,7 @@ expand_assignment (tree to, tree from, bool nontemporal) > > > code contains an out-of-bounds access to a small array. */ > > > if (!MEM_P (to_rtx) > > > && GET_MODE (to_rtx) != BLKmode > > > - && known_ge (bitpos, GET_MODE_PRECISION (GET_MODE (to_rtx)))) > > > + && known_ge (bitpos, GET_MODE_BITSIZE (GET_MODE (to_rtx)))) > > > > I think this has the chance to go wrong with regard to endianess. > > Consider to_rtx with 32bit mode size but 12bit mode precision. bitpos > > is relative to the start of to_rtx as if it were in memory and bitsize > > determines the contiguous region affected. But since we are actually > > storing into a non-memory endianess comes into play. > > > > So no, I don't think the patch is correct, it would be way more > > complicated to get the desired improvement. > > > > Richard. > > > > > { > > > expand_normal (from); > > > result = NULL; > > > > > > > > > -- Richard Biener SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 36809 (AG Nuernberg)