Re: [PATCH 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: "Andre Vieira (lists)" <andre.simoesdiasvieira@arm.com>
To: Stamatis Markianos-Wright <stam.markianos-wright@arm.com>,
	"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Cc: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>,
	Richard Earnshaw <Richard.Earnshaw@arm.com>,
	ramana.gcc@gmail.com, "nickc@redhat.com" <nickc@redhat.com>
Subject: Re: [PATCH 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops
Date: Thu, 22 Jun 2023 16:54:57 +0100	[thread overview]
Message-ID: <6dba374d-0796-7957-5426-764177115135@arm.com> (raw)
In-Reply-To: <cada1fc1-dd8f-60b1-ecc2-f89262d458d4@arm.com>

Some comments below, all quite minor. I'll continue to review tomorrow, 
I need a fresher brain for arm_mve_check_df_chain_back_for_implic_predic 
  ;)

+static int
+arm_mve_get_vctp_lanes (rtx x)
+{
+  if (GET_CODE (x) == SET && GET_CODE (XEXP (x, 1)) == UNSPEC
+      && (XINT (XEXP (x, 1), 1) == VCTP || XINT (XEXP (x, 1), 1) == 
VCTP_M))
+    {
+      switch (GET_MODE (XEXP (x, 1)))
+	{
+	  case V16BImode:
+	    return 16;
+	  case V8BImode:
+	    return 8;
+	  case V4BImode:
+	    return 4;
+	  case V2QImode:
+	    return 2;
+	  default:
+	    break;
+	}
+    }
+  return 0;
+}

I think you can replace the switch with something along the lines of:
machine_mode mode = GET_MODE (XEXP (x, 1));
return VECTOR_MODE_P (mode) ? GET_MODE_NUNITS (mode) : 0;


+/* Check if an insn requires the use of the VPR_REG, if it does, return the
+   sub-rtx of the VPR_REG.  The `type` argument controls whether
+   this function should:
+   * For type == 0, check all operands, including the OUT operands,
+     and return the first occurance of the VPR_REG.

s/occurance/occurrence/

+	  bool requires_vpr;
+  extract_constrain_insn (insn);

indent of requires_vpr is off.

+      if (type == 1 && (recog_data.operand_type[op] == OP_OUT
+			|| recog_data.operand_type[op] == OP_INOUT))
+	continue;
+      else if (type == 2 && (recog_data.operand_type[op] == OP_IN
+			     || recog_data.operand_type[op] == OP_INOUT))
+	continue;

Why skip INOUT? I guess this will become clear when I see the uses, but 
I'm wondering whether 'only check the input operands.' is clear enough. 
Maybe 'check operands that are input only.' would be more accurate?

+	  /* Fetch the reg_class for each entry and check it against the
+	   * VPR_REG reg_class.  */

Remove leading * on the second line.

+
+/* Wrapper function of arm_get_required_vpr_reg with type == 1, so return
+   something only if the VPR reg is an input operand to the insn.  */

When talking about a function parameter in comments capitalize (INSN) 
the name. Same for:

+/* Wrapper function of arm_get_required_vpr_reg with type == 2, so return
+   something only if the VPR reg is the retrurn value, an output of, or is
+   clobbered by the insn.  */

+/* Return true if an insn is an MVE instruction that VPT-predicable, but in
+   its unpredicated form, or if it is predicated, but on a predicate other
+   than vpr_reg.  */

In this one also 'is a MVE instruction that is VPT-predicable' would be 
better I think.


On 15/06/2023 12:47, Stamatis Markianos-Wright via Gcc-patches wrote:
 >      Hi all,
 >
 >      This is the 2/2 patch that contains the functional changes needed
 >      for MVE Tail Predicated Low Overhead Loops.  See my previous email
 >      for a general introduction of MVE LOLs.
 >
 >      This support is added through the already existing loop-doloop
 >      mechanisms that are used for non-MVE dls/le looping.
 >
 >      Mid-end changes are:
 >
 >      1) Relax the loop-doloop mechanism in the mid-end to allow for
 >         decrement numbers other that -1 and for `count` to be an
 >         rtx containing a simple REG (which in this case will contain
 >         the number of elements to be processed), rather
 >         than an expression for calculating the number of iterations.
 >      2) Added a new df utility function: `df_bb_regno_only_def_find` that
 >         will return the DEF of a REG only if it is DEF-ed once within the
 >         basic block.
 >
 >      And many things in the backend to implement the above optimisation:
 >
 >      3)  Implement the `arm_predict_doloop_p` target hook to instruct the
 >          mid-end about Low Overhead Loops (MVE or not), as well as
 >          `arm_loop_unroll_adjust` which will prevent unrolling of any 
loops
 >          that are valid for becoming MVE Tail_Predicated Low Overhead 
Loops
 >          (unrolling can transform a loop in ways that invalidate the 
dlstp/
 >          letp tranformation logic and the benefit of the dlstp/letp loop
 >          would be considerably higher than that of unrolling)
 >      4)  Appropriate changes to the define_expand of doloop_end, new
 >          patterns for dlstp and letp, new iterators,  unspecs, etc.
 >      5) `arm_mve_loop_valid_for_dlstp` and a number of checking 
functions:
 >         * `arm_mve_dlstp_check_dec_counter`
 >         * `arm_mve_dlstp_check_inc_counter`
 >         * `arm_mve_check_reg_origin_is_num_elems`
 >         * `arm_mve_check_df_chain_back_for_implic_predic`
 >         * `arm_mve_check_df_chain_fwd_for_implic_predic_impact`
 >         This all, in smoe way or another, are running checks on the loop
 >         structure in order to determine if the loop is valid for 
dlstp/letp
 >         transformation.
 >      6) `arm_attempt_dlstp_transform`: (called from the define_expand of
 >          doloop_end) this function re-checks for the loop's 
suitability for
 >          dlstp/letp transformation and then implements it, if possible.
 >      7) Various utility functions:
 >         *`arm_mve_get_vctp_lanes` to map
 >         from vctp unspecs to number of lanes, and
 > `arm_get_required_vpr_reg`
 >         to check an insn to see if it requires the VPR or not.
 >         * `arm_mve_get_loop_vctp`
 >         * `arm_mve_get_vctp_lanes`
 >         * `arm_emit_mve_unpredicated_insn_to_seq`
 >         * `arm_get_required_vpr_reg`
 >         * `arm_get_required_vpr_reg_param`
 >         * `arm_get_required_vpr_reg_ret_val`
 >         * `arm_mve_vec_insn_is_predicated_with_this_predicate`
 >         * `arm_mve_vec_insn_is_unpredicated_or_uses_other_predicate`
 >
 >      No regressions on arm-none-eabi with various targets and on
 >      aarch64-none-elf. Thoughts on getting this into trunk?
 >
 >      Thank you,
 >      Stam Markianos-Wright
 >
 >      gcc/ChangeLog:
 >
 >              * config/arm/arm-protos.h (arm_target_insn_ok_for_lob):
 > Rename to...
 >              (arm_target_bb_ok_for_lob): ...this
 >              (arm_attempt_dlstp_transform): New.
 >              * config/arm/arm.cc (TARGET_LOOP_UNROLL_ADJUST): New.
 >              (TARGET_PREDICT_DOLOOP_P): New.
 >              (arm_block_set_vect):
 >              (arm_target_insn_ok_for_lob): Rename from
 > arm_target_insn_ok_for_lob.
 >              (arm_target_bb_ok_for_lob): New.
 >              (arm_mve_get_vctp_lanes): New.
 >              (arm_get_required_vpr_reg): New.
 >              (arm_get_required_vpr_reg_param): New.
 >              (arm_get_required_vpr_reg_ret_val): New.
 >              (arm_mve_get_loop_vctp): New.
 > (arm_mve_vec_insn_is_unpredicated_or_uses_other_predicate): New.
 >              (arm_mve_vec_insn_is_predicated_with_this_predicate): New.
 >              (arm_mve_check_df_chain_back_for_implic_predic): New.
 >              (arm_mve_check_df_chain_fwd_for_implic_predic_impact): New.
 >              (arm_mve_check_reg_origin_is_num_elems): New.
 >              (arm_mve_dlstp_check_inc_counter): New.
 >              (arm_mve_dlstp_check_dec_counter): New.
 >              (arm_mve_loop_valid_for_dlstp): New.
 >              (arm_predict_doloop_p): New.
 >              (arm_loop_unroll_adjust): New.
 >              (arm_emit_mve_unpredicated_insn_to_seq): New.
 >              (arm_attempt_dlstp_transform): New.
 >              * config/arm/iterators.md (DLSTP): New.
 >              (mode1): Add DLSTP mappings.
 >              * config/arm/mve.md (*predicated_doloop_end_internal): New.
 >              (dlstp<mode1>_insn): New.
 >              * config/arm/thumb2.md (doloop_end): Update for MVE LOLs.
 >              * config/arm/unspecs.md: New unspecs.
 >              * df-core.cc (df_bb_regno_only_def_find): New.
 >              * df.h (df_bb_regno_only_def_find): New.
 >              * loop-doloop.cc (doloop_condition_get): Relax conditions.
 >              (doloop_optimize): Add support for elementwise LoLs.
 >
 >      gcc/testsuite/ChangeLog:
 >
 >              * gcc.target/arm/lob.h: Update framework.
 >              * gcc.target/arm/lob1.c: Likewise.
 >              * gcc.target/arm/lob6.c: Likewise.
 >              * gcc.target/arm/mve/dlstp-compile-asm.c: New test.
 >              * gcc.target/arm/mve/dlstp-int16x8.c: New test.
 >              * gcc.target/arm/mve/dlstp-int32x4.c: New test.
 >              * gcc.target/arm/mve/dlstp-int64x2.c: New test.
 >              * gcc.target/arm/mve/dlstp-int8x16.c: New test.
 >              * gcc.target/arm/mve/dlstp-invalid-asm.c: New test.

next prev parent reply	other threads:[~2023-06-22 15:55 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-15 11:47 Stamatis Markianos-Wright
2023-06-22 15:54 ` Andre Vieira (lists) [this message]
2023-07-05 16:11   ` Stamatis Markianos-Wright
2023-06-23 10:23 ` Andre Vieira (lists)
2023-07-05 16:13   ` Stamatis Markianos-Wright
2023-06-23 16:25 ` Andre Vieira (lists)
2023-07-05 16:41   ` Stamatis Markianos-Wright
  -- strict thread matches above, loose matches on Subject: below --
2023-12-18 11:53 [PATCH 0/2] " Andre Vieira
2023-12-18 11:53 ` [PATCH 2/2] " Andre Vieira
2023-12-20 16:54   ` Andre Vieira (lists)
2023-11-06 11:20 Stamatis Markianos-Wright
2023-08-17 10:31 Stamatis Markianos-Wright
2022-11-11 17:40 Stam Markianos-Wright
2022-11-15 15:51 ` Andre Vieira (lists)
2022-11-28 12:13   ` Stam Markianos-Wright

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6dba374d-0796-7957-5426-764177115135@arm.com \
    --to=andre.simoesdiasvieira@arm.com \
    --cc=Kyrylo.Tkachov@arm.com \
    --cc=Richard.Earnshaw@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=nickc@redhat.com \
    --cc=ramana.gcc@gmail.com \
    --cc=stam.markianos-wright@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).