From: "Andre Vieira (lists)"
Date: Thu, 8 Sep 2022 10:07:47 +0100
Subject: Re: [PATCH] Teach vectorizer to deal with bitfield accesses (was: [RFC] Teach vectorizer to deal with bitfield reads)
To: Richard Biener
Cc: Jakub Jelinek, Richard Sandiford, gcc-patches@gcc.gnu.org

Ping.

On 25/08/2022 10:09, Andre Vieira (lists) via Gcc-patches wrote:
>
> On 17/08/2022 13:49, Richard Biener wrote:
>> Yes, of course.  What you need to do is subtract DECL_FIELD_BIT_OFFSET
>> of the representative from DECL_FIELD_BIT_OFFSET of the original
>> bitfield access - that's the offset within the representative (by
>> construction both fields share DECL_FIELD_OFFSET).
> Doh! That makes sense...
>>> So instead I change bitpos such that:
>>> align_of_representative = TYPE_ALIGN (TREE_TYPE (representative));
>>> bitpos -= bitpos.to_constant () / align_of_representative
>>>           * align_of_representative;
>> ?  Not sure why alignment comes into play here?
> Yeah, just forget about this... it was my ill attempt at basically
> doing what you described above.
>> Not sure what you are saying but "yes", all shifting and masking should
>> happen in the type of the representative.
>>
>> +  tree bitpos_tree = build_int_cst (bitsizetype, bitpos);
>>
>> for your convenience there's bitsize_int (bitpos) you can use.
>>
>> I don't think you are using the correct bitpos though, you fail to
>> adjust it for the BIT_FIELD_REF/BIT_INSERT_EXPR.
> Not sure I understand what you mean? I do adjust it; I've changed it
> now so it should hopefully be clearer.
>>
>> +                        build_int_cst (bitsizetype, TYPE_PRECISION (bf_type)),
>>
>> the size of the bitfield reference is DECL_SIZE of the original
>> FIELD_DECL - it might be bigger than the precision of its type.
>> You probably want to double-check it's equal to the precision
>> (because of the insert but also because of all the masking) and
>> refuse to lower if not.
> I added a check for this but, out of curiosity, how can the DECL_SIZE
> of a bitfield FIELD_DECL be different from its type's precision?
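> The only cases I could come up with myself are C++ ones, something
> like the following (my own guess at what you mean, not something from
> the patch):
>
>   struct A { bool a : 8; };  /* DECL_SIZE is 8 bits, but
>                                 TYPE_PRECISION (bool) is 1.  */
>   struct B { int b : 40; };  /* Over-wide bitfields are allowed in
>                                 C++; the excess bits are padding.  */
>
> Is that the sort of case the check is guarding against?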
>>
>> +/* Return TRUE if there are bitfields to lower in this LOOP.  Fill TO_LOWER
>> +   with data structures representing these bitfields.  */
>> +
>> +static bool
>> +bitfields_to_lower_p (class loop *loop,
>> +                      vec &reads_to_lower,
>> +                      vec &writes_to_lower)
>> +{
>> +  basic_block *bbs = get_loop_body (loop);
>> +  gimple_stmt_iterator gsi;
>>
>> as said I'd prefer to do this walk as part of the other walks we
>> already do - if only because get_loop_body () is a DFS
>> walk over the loop body (you should at least share that).
> I'm now sharing the use of ifc_bbs. The reason why I'd rather not
> share the walk over them is that it becomes quite complex to separate
> the decision not to lower ifs because there are none (in which case
> we still want to lower bitfields) from the decision not to lower ifs
> when they are present but aren't lowerable (at which point we forgo
> lowering bitfields, since we will not vectorize this loop anyway).
>>
>> +      value = fold_build1 (NOP_EXPR, load_type, value);
>>
>> fold_convert (load_type, value)
>>
>> +      if (!CONSTANT_CLASS_P (value))
>> +       {
>> +         pattern_stmt
>> +           = gimple_build_assign (vect_recog_temp_ssa_var (load_type, NULL),
>> +                                  value);
>> +         value = gimple_get_lhs (pattern_stmt);
>>
>> there's in principle
>>
>>       gimple_seq stmts = NULL;
>>       value = gimple_convert (&stmts, load_type, value);
>>       if (!gimple_seq_empty_p (stmts))
>>         {
>>           pattern_stmt = gimple_seq_first_stmt (stmts);
>>           append_pattern_def_seq (vinfo, stmt_info, pattern_stmt);
>>         }
>>
>> though an append_pattern_def_seq helper to add a convenience sequence
>> would be nice to have here.
> Ended up using the existing 'vect_convert_input'; it seems to do
> nicely here.
>> You probably want to double-check your lowering code by
>> bootstrapping / testing with -ftree-loop-if-convert.
> Done. This led me to find a new failure mode, where the type of the
> first operand of BIT_FIELD_REF was an FP type (TF mode), which then
> led to failures when constructing the masking and shifting. I ended
> up adding a nop-conversion to an INTEGER type of the same width first
> where necessary. Also did a follow-up bootstrap with the addition of
> `-ftree-vectorize` and `-fno-vect-cost-model` to further test the
> codegen. All seems to be working on aarch64-linux-gnu.
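>
> For the record, the offset adjustment suggested at the top now boils
> down to something along these lines (a sketch rather than the exact
> hunk; variable names are approximate):
>
>   /* Offset of the bitfield within its representative, in bits;
>      both fields share DECL_FIELD_OFFSET by construction, so the
>      difference of the bit offsets is the position to shift by.  */
>   tree bf_pos = size_binop (MINUS_EXPR,
>                             DECL_FIELD_BIT_OFFSET (field_decl),
>                             DECL_FIELD_BIT_OFFSET (rep_decl));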
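>
> Finally, for reference, the kind of loop this patch is after looks
> something like this (paraphrasing the testcases rather than quoting
> one verbatim):
>
>   struct s { unsigned i : 31; };
>
>   unsigned
>   f (struct s *ptr, unsigned n)
>   {
>     unsigned res = 0;
>     for (unsigned i = 0; i < n; ++i)
>       res += ptr[i].i;  /* bitfield read to be lowered */
>     return res;
>   }
>
> where the bitfield read gets lowered to a load of the representative
> followed by the appropriate shifting and masking, after which the
> loop vectorizes as usual.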