From: Stamatis Markianos-Wright <stam.markianos-wright@arm.com>
To: "Andre Vieira (lists)" <andre.simoesdiasvieira@arm.com>,
"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Cc: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>,
Richard Earnshaw <Richard.Earnshaw@arm.com>,
ramana.gcc@gmail.com, "nickc@redhat.com" <nickc@redhat.com>
Subject: Re: [PATCH 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops
Date: Wed, 5 Jul 2023 17:11:06 +0100 [thread overview]
Message-ID: <4086feb8-6048-8bbe-1ed4-6558c9b725e2@arm.com> (raw)
In-Reply-To: <6dba374d-0796-7957-5426-764177115135@arm.com>
Thank you Andre for reviewing! I'll attach the updated version of the
patch to the third review email (your final one thus far ;)
On 22/06/2023 16:54, Andre Vieira (lists) wrote:
> Some comments below, all quite minor. I'll continue to review
> tomorrow, I need a fresher brain for
> arm_mve_check_df_chain_back_for_implic_predic ;)
>
> +static int
> +arm_mve_get_vctp_lanes (rtx x)
> +{
> + if (GET_CODE (x) == SET && GET_CODE (XEXP (x, 1)) == UNSPEC
> + && (XINT (XEXP (x, 1), 1) == VCTP || XINT (XEXP (x, 1), 1) ==
> VCTP_M))
> + {
> + switch (GET_MODE (XEXP (x, 1)))
> + {
> + case V16BImode:
> + return 16;
> + case V8BImode:
> + return 8;
> + case V4BImode:
> + return 4;
> + case V2QImode:
> + return 2;
> + default:
> + break;
> + }
> + }
> + return 0;
> +}
>
> I think you can replace the switch with something along the lines of:
> machine_mode mode = GET_MODE (XEXP (x, 1));
> return VECTOR_MODE_P (mode) ? GET_MODE_NUNITS (mode) : 0;
>
Ah true, especially now that there are no HImode predicates!
I added an additional check of `&& VALID_MVE_PRED_MODE (mode)` as well,
just to make sure we could never pick up V4SImode, etc. (although I'd
never expect that to happen if `rtx x` came from a valid instruction)
>
> +/* Check if an insn requires the use of the VPR_REG, if it does,
> return the
> + sub-rtx of the VPR_REG. The `type` argument controls whether
> + this function should:
> + * For type == 0, check all operands, including the OUT operands,
> + and return the first occurance of the VPR_REG.
>
> s/occurance/occurrence/
Done
>
> + bool requires_vpr;
> + extract_constrain_insn (insn);
>
> indent of requires_vpr is off.
Done
>
> + if (type == 1 && (recog_data.operand_type[op] == OP_OUT
> + || recog_data.operand_type[op] == OP_INOUT))
> + continue;
> + else if (type == 2 && (recog_data.operand_type[op] == OP_IN
> + || recog_data.operand_type[op] == OP_INOUT))
> + continue;
>
> Why skip INOUT? I guess this will become clear when I see the uses,
> but I'm wondering whether 'only check the input operands.' is clear
> enough. Maybe 'check operands that are input only.' would be more
> accurate?
Oh! Thanks for spotting this. It also doesn't work with my comment at
the top:
`(INOUT operands are considered both as input and output operands)`
It's been a long time since I wrote this piece, but it might be that I
added this after realising that there are no insns with an OP_INOUT VPR
reg. Since I don't think it's functional, I changed the code to align
with the comment, instead.
>
> + /* Fetch the reg_class for each entry and check it against the
> + * VPR_REG reg_class. */
>
> Remove leading * on the second line.
Damn auto-formatters ;)
Done
>
> +
> +/* Wrapper function of arm_get_required_vpr_reg with type == 1, so
> return
> + something only if the VPR reg is an input operand to the insn. */
>
> When talking about a function parameter in comments capitalize (INSN)
> the name. Same for:
Done
>
> +/* Wrapper function of arm_get_required_vpr_reg with type == 2, so
> return
> + something only if the VPR reg is the retrurn value, an output of,
> or is
> + clobbered by the insn. */
>
> +/* Return true if an insn is an MVE instruction that VPT-predicable,
> but in
> + its unpredicated form, or if it is predicated, but on a predicate
> other
> + than vpr_reg. */
>
> In this one also 'is a MVE instruction that is VPT-predicable' would
> be better I think.
Oops, thanks for spotting. Done.
>
>
> On 15/06/2023 12:47, Stamatis Markianos-Wright via Gcc-patches wrote:
> > Hi all,
> >
> > This is the 2/2 patch that contains the functional changes needed
> > for MVE Tail Predicated Low Overhead Loops. See my previous email
> > for a general introduction of MVE LOLs.
> >
> > This support is added through the already existing loop-doloop
> > mechanisms that are used for non-MVE dls/le looping.
> >
> > Mid-end changes are:
> >
> > 1) Relax the loop-doloop mechanism in the mid-end to allow for
> > decrement numbers other that -1 and for `count` to be an
> > rtx containing a simple REG (which in this case will contain
> > the number of elements to be processed), rather
> > than an expression for calculating the number of iterations.
> > 2) Added a new df utility function: `df_bb_regno_only_def_find`
> that
> > will return the DEF of a REG only if it is DEF-ed once
> within the
> > basic block.
> >
> > And many things in the backend to implement the above
> optimisation:
> >
> > 3) Implement the `arm_predict_doloop_p` target hook to
> instruct the
> > mid-end about Low Overhead Loops (MVE or not), as well as
> > `arm_loop_unroll_adjust` which will prevent unrolling of
> any loops
> > that are valid for becoming MVE Tail_Predicated Low
> Overhead Loops
> > (unrolling can transform a loop in ways that invalidate the
> dlstp/
> > letp tranformation logic and the benefit of the dlstp/letp
> loop
> > would be considerably higher than that of unrolling)
> > 4) Appropriate changes to the define_expand of doloop_end, new
> > patterns for dlstp and letp, new iterators, unspecs, etc.
> > 5) `arm_mve_loop_valid_for_dlstp` and a number of checking
> functions:
> > * `arm_mve_dlstp_check_dec_counter`
> > * `arm_mve_dlstp_check_inc_counter`
> > * `arm_mve_check_reg_origin_is_num_elems`
> > * `arm_mve_check_df_chain_back_for_implic_predic`
> > * `arm_mve_check_df_chain_fwd_for_implic_predic_impact`
> > This all, in smoe way or another, are running checks on the
> loop
> > structure in order to determine if the loop is valid for
> dlstp/letp
> > transformation.
> > 6) `arm_attempt_dlstp_transform`: (called from the
> define_expand of
> > doloop_end) this function re-checks for the loop's
> suitability for
> > dlstp/letp transformation and then implements it, if possible.
> > 7) Various utility functions:
> > *`arm_mve_get_vctp_lanes` to map
> > from vctp unspecs to number of lanes, and
> > `arm_get_required_vpr_reg`
> > to check an insn to see if it requires the VPR or not.
> > * `arm_mve_get_loop_vctp`
> > * `arm_mve_get_vctp_lanes`
> > * `arm_emit_mve_unpredicated_insn_to_seq`
> > * `arm_get_required_vpr_reg`
> > * `arm_get_required_vpr_reg_param`
> > * `arm_get_required_vpr_reg_ret_val`
> > * `arm_mve_vec_insn_is_predicated_with_this_predicate`
> > * `arm_mve_vec_insn_is_unpredicated_or_uses_other_predicate`
> >
> > No regressions on arm-none-eabi with various targets and on
> > aarch64-none-elf. Thoughts on getting this into trunk?
> >
> > Thank you,
> > Stam Markianos-Wright
> >
> > gcc/ChangeLog:
> >
> > * config/arm/arm-protos.h (arm_target_insn_ok_for_lob):
> > Rename to...
> > (arm_target_bb_ok_for_lob): ...this
> > (arm_attempt_dlstp_transform): New.
> > * config/arm/arm.cc (TARGET_LOOP_UNROLL_ADJUST): New.
> > (TARGET_PREDICT_DOLOOP_P): New.
> > (arm_block_set_vect):
> > (arm_target_insn_ok_for_lob): Rename from
> > arm_target_insn_ok_for_lob.
> > (arm_target_bb_ok_for_lob): New.
> > (arm_mve_get_vctp_lanes): New.
> > (arm_get_required_vpr_reg): New.
> > (arm_get_required_vpr_reg_param): New.
> > (arm_get_required_vpr_reg_ret_val): New.
> > (arm_mve_get_loop_vctp): New.
> > (arm_mve_vec_insn_is_unpredicated_or_uses_other_predicate): New.
> > (arm_mve_vec_insn_is_predicated_with_this_predicate): New.
> > (arm_mve_check_df_chain_back_for_implic_predic): New.
> > (arm_mve_check_df_chain_fwd_for_implic_predic_impact): New.
> > (arm_mve_check_reg_origin_is_num_elems): New.
> > (arm_mve_dlstp_check_inc_counter): New.
> > (arm_mve_dlstp_check_dec_counter): New.
> > (arm_mve_loop_valid_for_dlstp): New.
> > (arm_predict_doloop_p): New.
> > (arm_loop_unroll_adjust): New.
> > (arm_emit_mve_unpredicated_insn_to_seq): New.
> > (arm_attempt_dlstp_transform): New.
> > * config/arm/iterators.md (DLSTP): New.
> > (mode1): Add DLSTP mappings.
> > * config/arm/mve.md (*predicated_doloop_end_internal):
> New.
> > (dlstp<mode1>_insn): New.
> > * config/arm/thumb2.md (doloop_end): Update for MVE LOLs.
> > * config/arm/unspecs.md: New unspecs.
> > * df-core.cc (df_bb_regno_only_def_find): New.
> > * df.h (df_bb_regno_only_def_find): New.
> > * loop-doloop.cc (doloop_condition_get): Relax conditions.
> > (doloop_optimize): Add support for elementwise LoLs.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/arm/lob.h: Update framework.
> > * gcc.target/arm/lob1.c: Likewise.
> > * gcc.target/arm/lob6.c: Likewise.
> > * gcc.target/arm/mve/dlstp-compile-asm.c: New test.
> > * gcc.target/arm/mve/dlstp-int16x8.c: New test.
> > * gcc.target/arm/mve/dlstp-int32x4.c: New test.
> > * gcc.target/arm/mve/dlstp-int64x2.c: New test.
> > * gcc.target/arm/mve/dlstp-int8x16.c: New test.
> > * gcc.target/arm/mve/dlstp-invalid-asm.c: New test.
next prev parent reply other threads:[~2023-07-05 16:11 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-06-15 11:47 Stamatis Markianos-Wright
2023-06-22 15:54 ` Andre Vieira (lists)
2023-07-05 16:11 ` Stamatis Markianos-Wright [this message]
2023-06-23 10:23 ` Andre Vieira (lists)
2023-07-05 16:13 ` Stamatis Markianos-Wright
2023-06-23 16:25 ` Andre Vieira (lists)
2023-07-05 16:41 ` Stamatis Markianos-Wright
-- strict thread matches above, loose matches on Subject: below --
2023-12-18 11:53 [PATCH 0/2] " Andre Vieira
2023-12-18 11:53 ` [PATCH 2/2] " Andre Vieira
2023-12-20 16:54 ` Andre Vieira (lists)
2023-11-06 11:20 Stamatis Markianos-Wright
2023-08-17 10:31 Stamatis Markianos-Wright
2022-11-11 17:40 Stam Markianos-Wright
2022-11-15 15:51 ` Andre Vieira (lists)
2022-11-28 12:13 ` Stam Markianos-Wright
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4086feb8-6048-8bbe-1ed4-6558c9b725e2@arm.com \
--to=stam.markianos-wright@arm.com \
--cc=Kyrylo.Tkachov@arm.com \
--cc=Richard.Earnshaw@arm.com \
--cc=andre.simoesdiasvieira@arm.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=nickc@redhat.com \
--cc=ramana.gcc@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).