From: Richard Biener <rguenther@suse.de>
To: "Andre Vieira (lists)" <andre.simoesdiasvieira@arm.com>
Cc: Jakub Jelinek <jakub@redhat.com>,
Richard Sandiford <richard.sandiford@arm.com>,
"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] Teach vectorizer to deal with bitfield accesses (was: [RFC] Teach vectorizer to deal with bitfield reads)
Date: Thu, 8 Sep 2022 11:51:20 +0000 (UTC) [thread overview]
Message-ID: <nycvar.YFH.7.77.849.2209081104310.20505@jbgna.fhfr.qr> (raw)
In-Reply-To: <c985486d-ce6d-59e2-adaa-50d1c5ee06f6@arm.com>
On Thu, 25 Aug 2022, Andre Vieira (lists) wrote:
>
> On 17/08/2022 13:49, Richard Biener wrote:
> > Yes, of course. What you need to do is subtract DECL_FIELD_BIT_OFFSET
> > of the representative from DECL_FIELD_BIT_OFFSET of the original bitfield
> > access - that's the offset within the representative (by construction
> > both fields share DECL_FIELD_OFFSET).
> Doh! That makes sense...
> >> So instead I change bitpos such that:
> >> align_of_representative = TYPE_ALIGN (TREE_TYPE (representative));
> >> bitpos -= bitpos.to_constant () / align_of_representative *
> >> align_of_representative;
> > ? Not sure why alignment comes into play here?
> Yeah just forget about this... it was my ill attempt at basically doing what
> you described above.
> > Not sure what you are saying but "yes", all shifting and masking should
> > happen in the type of the representative.
> >
> > + tree bitpos_tree = build_int_cst (bitsizetype, bitpos);
> >
> > for your convenience there's bitsize_int (bitpos) you can use.
> >
> > I don't think you are using the correct bitpos though, you fail to
> > adjust it for the BIT_FIELD_REF/BIT_INSERT_EXPR.
> Not sure I understand what you mean? I do adjust it; I've changed it now so it
> should hopefully be clearer.
> >
> > +                      build_int_cst (bitsizetype, TYPE_PRECISION (bf_type)),
> >
> > the size of the bitfield reference is DECL_SIZE of the original
> > FIELD_DECL - it might be bigger than the precision of its type.
> > You probably want to double-check it's equal to the precision
> > (because of the insert but also because of all the masking) and
> > refuse to lower if not.
> I added a check for this but out of curiosity, how can the DECL_SIZE of a
> bitfield FIELD_DECL be different from its type's precision?
It's probably not possible to create a C testcase, but I don't see
what makes it impossible in general for a bitfield object to have
padding.
> >
> > +/* Return TRUE if there are bitfields to lower in this LOOP.  Fill TO_LOWER
> > +   with data structures representing these bitfields.  */
> > +
> > +static bool
> > +bitfields_to_lower_p (class loop *loop,
> > + vec <gassign *> &reads_to_lower,
> > + vec <gassign *> &writes_to_lower)
> > +{
> > + basic_block *bbs = get_loop_body (loop);
> > + gimple_stmt_iterator gsi;
> >
> > as said I'd prefer to do this walk as part of the other walks we
> > already do - if and if only because get_loop_body () is a DFS
> > walk over the loop body (you should at least share that).
> I'm now sharing the use of ifc_bbs. The reason I'd rather not share the
> walk over them is that it becomes quite complex to split out the decision
> not to lower ifs because there are none (in which case we still want to
> lower bitfields) versus not lowering ifs when they are present but not
> lowerable (at which point we forgo lowering bitfields, since we will not
> vectorize this loop anyway).
> >
> > + value = fold_build1 (NOP_EXPR, load_type, value);
> >
> > fold_convert (load_type, value)
> >
> > + if (!CONSTANT_CLASS_P (value))
> > + {
> > +      pattern_stmt
> > +        = gimple_build_assign (vect_recog_temp_ssa_var (load_type, NULL),
> > +                               value);
> > +      value = gimple_get_lhs (pattern_stmt);
> >
> > there's in principle
> >
> > gimple_seq stmts = NULL;
> > value = gimple_convert (&stmts, load_type, value);
> > if (!gimple_seq_empty_p (stmts))
> > {
> > pattern_stmt = gimple_seq_first_stmt (stmts);
> > append_pattern_def_seq (vinfo, stmt_info, pattern_stmt);
> > }
> >
> > though a append_pattern_def_seq helper to add a convenience sequence
> > would be nice to have here.
> Ended up using the existing 'vect_convert_input', seems to do nicely here.
> > You probably want to double-check your lowering code by
> > bootstrapping / testing with -ftree-loop-if-convert.
> Done, this led me to find a new failure mode, where the type of the first
> operand of BIT_FIELD_REF was an FP type (TF mode), which then led to failures
> when constructing the masking and shifting. I ended up adding a nop-conversion
> to an INTEGER type of the same width first if necessary.
You want a VIEW_CONVERT (aka bit-cast) here.
> Also did a follow-up
> bootstrap with the addition of `-ftree-vectorize` and `-fno-vect-cost-model`
> to further test the codegen. All seems to be working on aarch64-linux-gnu.
+static tree
+get_bitfield_rep (gassign *stmt, bool write, tree *bitpos,
+ tree *struct_expr)
...
+  /* Bail out if the DECL_SIZE of the field_decl isn't the same as the BF's
+     precision.  */
+  unsigned HOST_WIDE_INT decl_size = tree_to_uhwi (DECL_SIZE (field_decl));
+  if (TYPE_PRECISION (TREE_TYPE (gimple_assign_lhs (stmt))) != decl_size)
+    return NULL_TREE;
You can use compare_tree_int (DECL_SIZE (field_decl), TYPE_PRECISION (...)) != 0,
which avoids caring about the case where the size isn't a uhwi ...
+      gimple *new_stmt = gimple_build_assign (unshare_expr (rep_comp_ref),
+                                              new_val);
+ gimple_set_vuse (new_stmt, gimple_vuse (stmt));
+ tree vdef = gimple_vdef (stmt);
+ gimple_set_vdef (new_stmt, vdef);
+ SSA_NAME_DEF_STMT (vdef) = new_stmt;
you can use gimple_move_vops (new_stmt, stmt); here
+      tree bfr = build3 (BIT_FIELD_REF, bf_type, new_val,
+                         build_int_cst (bitsizetype, TYPE_PRECISION (bf_type)),
+                         bitpos);
+      new_val = ifc_temp_var (bf_type, bfr, &gsi);
+      redundant_ssa_names.safe_push (std::make_pair (gimple_assign_lhs (stmt),
+                                                     new_val));
I'm curious, why the push to redundant_ssa_names? That could use
a comment ...
+ need_to_lower_bitfields = bitfields_to_lower_p (loop, reads_to_lower,
+ writes_to_lower);
do we want to conditionalize this on flag_tree_loop_vectorize? That is,
I think the lowering should for now happen only on the loop version
guarded by .IFN_VECTORIZED. There's
if ((need_to_predicate || any_complicated_phi)
&& ((!flag_tree_loop_vectorize && !loop->force_vectorize)
|| loop->dont_vectorize))
goto cleanup;
for the cases that will force versioning, but I think we should
simply not lower bitfields in the
((!flag_tree_loop_vectorize && !loop->force_vectorize)
|| loop->dont_vectorize)
case?
+ if (!second_stmt || gimple_code (second_stmt) != GIMPLE_ASSIGN
+ || gimple_assign_rhs_code (second_stmt) != BIT_FIELD_REF)
+ return NULL;
the first || should go on a new line.
+ bf_stmt = static_cast <gassign *> (second_stmt);
"nicer" and shorter is
bf_stmt = dyn_cast <gassign *> (second_stmt);
if (!bf_stmt || gimple_assign_rhs_code (bf_stmt) != BIT_FIELD_REF)
return NULL;
+ tree lhs = TREE_OPERAND (bf_ref, 0);
+
+ if (!INTEGRAL_TYPE_P (TREE_TYPE (bf_ref)))
+ return NULL;
+
+ gimple *use_stmt, *pattern_stmt;
+ use_operand_p use_p;
+ tree ret = gimple_assign_lhs (first_stmt);
Just when reading, generic variable names like 'lhs' are not helpful
(even less so when they are not an actual lhs ...).
You have nice docs on top of the function - if you use actual names
for the _2 = BIT_FIELD_REF (_1, ...) variables you can even use them
in the code so the docs and the code match up nicely.
+  /* If the first operand of the BIT_FIELD_REF is not an INTEGER type, convert
+     it to one of the same width so we can perform the necessary masking and
+     shifting.  */
+  if (!INTEGRAL_TYPE_P (TREE_TYPE (lhs)))
+    {
+      tree int_type
+        = build_nonstandard_integer_type (TYPE_PRECISION (TREE_TYPE (lhs)),
+                                          true);
So you probably run into this from code that's not lowered from
original bitfield reads? Note you should use TYPE_SIZE here,
definitely not TYPE_PRECISION on arbitrary types (if it's a vector
type that will yield the number of units, for example).
+  unsigned HOST_WIDE_INT shift_n = bit_field_offset (bf_ref).to_constant ();
+  unsigned HOST_WIDE_INT mask_width = bit_field_size (bf_ref).to_constant ();
Is there anything that prevents this from running on VLA vector extractions?
I think it would be nice to test constantness at the start of the
function.
+  pattern_stmt
+    = gimple_build_assign (vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL),
Eh - seeing this multiple times, vect_recog_temp_ssa_var needs
a defaulted NULL second argument ...
Note I fear we will have endianness issues when translating
bit-field accesses to BIT_FIELD_REF/INSERT and then to shifts. The rules
for memory and register operations do not match up (IIRC; I have
repeatedly run into issues here myself). The testcases all look like they
won't catch this - I think an example would be something like
struct X { unsigned a : 23; unsigned b : 9; }. Can you see to doing
testing on a big-endian target?
Otherwise the patch looks good, so there are only minor things to
fix up (in case the endianness issue turns out to be a non-issue).
Sorry for the delay in reviewing.
Thanks,
Richard.