From: Richard Sandiford <richard.sandiford@arm.com>
To: Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org>
Cc: Richard Biener <rguenther@suse.de>
Subject: Re: [PATCH] Maintain (mis-)alignment info in the first element of a group
Date: Wed, 15 Sep 2021 11:08:43 +0100 [thread overview]
Message-ID: <mptzgseb178.fsf@arm.com> (raw)
In-Reply-To: <s7p2933o-4058-28nq-351r-8224182oon96@fhfr.qr> (Richard Biener via Gcc-patches's message of "Wed, 15 Sep 2021 11:47:13 +0200 (CEST)")
Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> On Wed, 15 Sep 2021, Richard Sandiford wrote:
>> Richard Biener <rguenther@suse.de> writes:
>> > On Tue, 14 Sep 2021, Richard Sandiford wrote:
>> >
>> >> Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>> >> > This changes us to maintain and compute (mis-)alignment info for
>> >> > the first element of a group only rather than for each DR when
>> >> > doing interleaving and for the earliest, first, or first in the SLP
>> >> > node (or any pair or all three of those) when SLP vectorizing.
>> >> >
>> >> > For this to work out the easiest way I have changed the accessors
>> >> > DR_MISALIGNMENT and DR_TARGET_ALIGNMENT to do the indirection to
>> >> > the first element rather than adjusting all callers.
>> >> > dr_misalignment is moved out-of-line and I'm not too fond of the
>> >> > poly-int dances there (any hints?), but basically we are now
>> >> > adjusting the first elements misalignment based on the DR_INIT
>> >> > difference.
>> >> >
>> >> > Bootstrap & regtest running on x86_64-unknown-linux-gnu.
>> >> >
>> >> > Richard.
>> >> >
>> >> > 2021-09-13 Richard Biener <rguenther@suse.de>
>> >> >
>> >> > * tree-vectorizer.h (dr_misalignment): Move out of line.
>> >> > (dr_target_alignment): New.
>> >> > (DR_TARGET_ALIGNMENT): Wrap dr_target_alignment.
>> >> > (set_dr_target_alignment): New.
>> >> > (SET_DR_TARGET_ALIGNMENT): Wrap set_dr_target_alignment.
>> >> > * tree-vect-data-refs.c (dr_misalignment): Compute and
>> >> > return the group members misalignment.
>> >> > (vect_compute_data_ref_alignment): Use SET_DR_TARGET_ALIGNMENT.
>> >> > (vect_analyze_data_refs_alignment): Compute alignment only
>> >> > for the first element of a DR group.
>> >> > (vect_slp_analyze_node_alignment): Likewise.
>> >> > ---
>> >> > gcc/tree-vect-data-refs.c | 65 ++++++++++++++++++++++++---------------
>> >> > gcc/tree-vectorizer.h | 24 ++++++++++-----
>> >> > 2 files changed, 57 insertions(+), 32 deletions(-)
>> >> >
>> >> > diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
>> >> > index 66e76132d14..b53d6a0b3f1 100644
>> >> > --- a/gcc/tree-vect-data-refs.c
>> >> > +++ b/gcc/tree-vect-data-refs.c
>> >> > @@ -887,6 +887,36 @@ vect_slp_analyze_instance_dependence (vec_info *vinfo, slp_instance instance)
>> >> > return res;
>> >> > }
>> >> >
>> >> > +/* Return the misalignment of DR_INFO. */
>> >> > +
>> >> > +int
>> >> > +dr_misalignment (dr_vec_info *dr_info)
>> >> > +{
>> >> > + if (STMT_VINFO_GROUPED_ACCESS (dr_info->stmt))
>> >> > + {
>> >> > + dr_vec_info *first_dr
>> >> > + = STMT_VINFO_DR_INFO (DR_GROUP_FIRST_ELEMENT (dr_info->stmt));
>> >> > + int misalign = first_dr->misalignment;
>> >> > + gcc_assert (misalign != DR_MISALIGNMENT_UNINITIALIZED);
>> >> > + if (misalign == DR_MISALIGNMENT_UNKNOWN)
>> >> > + return misalign;
>> >> > + poly_offset_int diff = (wi::to_poly_offset (DR_INIT (dr_info->dr))
>> >> > + - wi::to_poly_offset (DR_INIT (first_dr->dr)));
>> >> > + poly_int64 mispoly = misalign + diff.to_constant ().to_shwi ();
>> >> > + bool res = known_misalignment (mispoly,
>> >> > + first_dr->target_alignment.to_constant (),
>> >> > + &misalign);
>> >> > + gcc_assert (res);
>> >> > + return misalign;
>> >>
>> >> Yeah, not too keen on the to_constants here. The one on diff looks
>> >> redundant -- you could just use diff.force_shwi () instead, and
>> >> keep everything poly_int.
>> >>
>> >> For the known_misalignment I think we should use:
>> >>
>> >> if (!can_div_trunc_p (mispoly, first_dr->target_alignment,
>> >> "ient, &misalign))
>> >> misalign = DR_MISALIGNMENT_UNKNOWN;
>> >> return misalign;
>> >>
>> >> There are then no to_constant assumptions.
>> >
>> > OK, note that group analysis does
>> >
>> > /* Check that the DR_INITs are compile-time constants. */
>> > if (TREE_CODE (DR_INIT (dra)) != INTEGER_CST
>> > || TREE_CODE (DR_INIT (drb)) != INTEGER_CST)
>> > break;
>> >
>> > /* Sorting has ensured that DR_INIT (dra) <= DR_INIT (drb). */
>> > HOST_WIDE_INT init_a = TREE_INT_CST_LOW (DR_INIT (dra));
>> > HOST_WIDE_INT init_b = TREE_INT_CST_LOW (DR_INIT (drb));
>> >
>> > so I'm confident my variant was "correct", but it still was ugly.
>>
>> Ah, OK. In that case I don't mind the original version, but it would be
>> good to have a comment above the to_constant saying where the condition
>> is enforced.
>>
>> I'm just trying to avoid to_constant calls with no comment to explain
>> them, and with no nearby is_constant call. Otherwise it could end up
>> a bit like tree_to_uhwi, where sometimes tree_fits_uhwi_p really has
>> been checked earlier (not always obvious where) and sometimes
>> tree_to_uhwi is just used out of hope, to avoid having to think about
>> the alternative.
>>
>> > There's also the issue that target_alignment is poly_uint64 but
>> > misalignment is signed int.
>> >
>> > Note that can_div_trunc_p seems to require a poly_uint64 remainder,
>> > I'm not sure what to do with that, so I used is_constant.
>>
>> Ah, yeah, forgot about that sorry. I guess in that case, using
>> is_constant on first_dr->target_alignment and sticking with
>> known_misalignment would make sense.
>>
>> > Btw, to what value do we want to align with variable sized vectors?
>> > We are aligning globals according to target_alignment but only
>> > support that for constant alignment. Most users only want to
>> > distinguish between aligned or not aligned and the actual misalignment
>> > is only used to seed the SSA alignment info, so I suppose being
>> > too conservative in dr_misalignment for variable size vectors isn't
>> > too bad if we correctly identify fully aligned accesses?
>>
>> In practice we don't require more than element alignment for VLA SVE
>> as things stand. However, there's support for using fully-predicated
>> loops in which the first loop iteration aligns by using leading
>> inactive lanes where possible (e.g. start with 1 inactive lane
>> for a misalignment of 1). In principle that would work for VLA too,
>> we just haven't implemented it yet.
>>
>> > I'm now testing a variant that looks like
>> >
>> > int
>> > dr_misalignment (dr_vec_info *dr_info)
>> > {
>> > if (STMT_VINFO_GROUPED_ACCESS (dr_info->stmt))
>> > {
>> > dr_vec_info *first_dr
>> > = STMT_VINFO_DR_INFO (DR_GROUP_FIRST_ELEMENT (dr_info->stmt));
>> > int misalign = first_dr->misalignment;
>> > gcc_assert (misalign != DR_MISALIGNMENT_UNINITIALIZED);
>> > if (misalign == DR_MISALIGNMENT_UNKNOWN)
>> > return misalign;
>> > /* vect_analyze_data_ref_accesses guarantees that DR_INIT are
>> > INTEGER_CSTs and the first element in the group has the lowest
>> > address. */
>> > poly_offset_int diff = (wi::to_poly_offset (DR_INIT (dr_info->dr))
>> > - wi::to_poly_offset (DR_INIT (first_dr->dr)));
>> > gcc_assert (known_ge (diff, 0));
>> > poly_uint64 mispoly = misalign + diff.force_uhwi ();
>> > unsigned HOST_WIDE_INT quotient;
>> > if (can_div_trunc_p (mispoly, first_dr->target_alignment, "ient,
>> > &mispoly)
>> > && mispoly.is_constant ())
>> > return mispoly.to_constant ();
>> > return DR_MISALIGNMENT_UNKNOWN;
>> >
>> > I wonder why a non-poly works for the quotient here. I suppose it's
>> > because the runtime factor is the same for all poly-ints and thus
>> > it cancels out? But then the remainder likely will never be constant
>> > unless it is zero?
>>
>> It's not so much that the quotient is guaranteed to be constant,
>> but that that particular overload is designed for the case in which
>> the caller only cares about constant quotients, and wants can_div_trunc_p
>> to fail otherwise. In other words, it's really just a way of cutting down
>> on is_constant calls.
>>
>> poly-int.h doesn't provide every possible overload of can_div_trunc_p.
>> In principle it could have a fully-general 4-poly_int version, but we
>> haven't needed that yet. It could also have a new overload for the
>> case we care about here, avoiding the separate is_constant.
>
> Hmm. I guess I'll see to somehow unify this with the logic in
> vect_compute_data_ref_alignment which at the moment forces
> DR_MISALIGNMENT to unknown if DR_TARGET_ALIGNMENT of the vector type
> isn't constant (so for SVE all accesses appear unaligned?). In the
> end vect_compute_data_ref_alignment does
>
> poly_int64 misalignment
> = base_misalignment + wi::to_poly_offset (drb->init).force_shwi ();
>
> /* If this is a backward running DR then first access in the larger
> vectype actually is N-1 elements before the address in the DR.
> Adjust misalign accordingly. */
> if (tree_int_cst_sgn (drb->step) < 0)
> /* PLUS because STEP is negative. */
> misalignment += ((TYPE_VECTOR_SUBPARTS (vectype) - 1)
> * -TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE
> (vectype))));
>
> unsigned int const_misalignment;
> if (!known_misalignment (misalignment, vect_align_c,
> &const_misalignment))
> {
> if (dump_enabled_p ())
> dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> "Non-constant misalignment for access: %T\n",
> ref);
> return;
> }
>
> SET_DR_MISALIGNMENT (dr_info, const_misalignment);
>
> ignoring the negative step offset (where poly-int TYPE_VECTOR_SUBPARTS
> sneaks in) this combines integer base_misaligned with DR_INIT
> (known INTEGER_CST) and then with the known constant vect_align_c.
>
> So the most simplest version is
>
> if (misalign == DR_MISALIGNMENT_UNKNOWN)
> return misalign;
> /* vect_analyze_data_ref_accesses guarantees that DR_INIT are
> INTEGER_CSTs and the first element in the group has the lowest
> address. Likewise vect_compute_data_ref_alignment will
> have ensured that target_alignment is constant and otherwise
> set misalign to DR_MISALIGNMENT_UNKNOWN. */
> HOST_WIDE_INT diff = (TREE_INT_CST_LOW (DR_INIT (dr_info->dr))
> - TREE_INT_CST_LOW (DR_INIT (first_dr->dr)));
> gcc_assert (diff >= 0);
> unsigned HOST_WIDE_INT target_alignment_c
> = first_dr->target_alignment.to_constant ();
> return (misalign + diff) % target_alignment_c;
LGTM, thanks.
Richard
> Note my intent is to lift the restriction of having only one
> vector type for vectorizing a DR group and alignment is where
> currently correctness issues pop up so in the end
> dr_misalignment () would get a vectype parameter (and the
> odd negative STEP handling would move to the caller, eventually
> with another 'offset' parameter to dr_misalignment () / aligned_access_p
> ()).
>
> So if you are fine with improving the situation for poly-ints
> after all this then I'd go with the most simplest variant above.
>
> OK?
>
> Thanks,
> Richard.
prev parent reply other threads:[~2021-09-15 10:08 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-09-13 14:37 Richard Biener
2021-09-14 11:10 ` Richard Sandiford
2021-09-15 6:55 ` Richard Biener
2021-09-15 7:32 ` Richard Sandiford
2021-09-15 9:47 ` Richard Biener
2021-09-15 10:08 ` Richard Sandiford [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=mptzgseb178.fsf@arm.com \
--to=richard.sandiford@arm.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=rguenther@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).