public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Andre Simoes Dias Vieira <Andre.SimoesDiasVieira@arm.com>
To: Richard Earnshaw <Richard.Earnshaw@arm.com>,
	"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Cc: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>,
	Stam Markianos-Wright <Stam.Markianos-Wright@arm.com>
Subject: Re: [PATCH v3 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops
Date: Tue, 30 Jan 2024 14:09:47 +0000	[thread overview]
Message-ID: <DB9PR08MB6553F6DAE79401C580705DE3887D2@DB9PR08MB6553.eurprd08.prod.outlook.com> (raw)
In-Reply-To: <df9838a2-e922-43f0-ace4-f65a25bdc41b@arm.com>

Hi Richard,

Thanks for the reviews, I'm making these changes but just a heads up.

When hardcoding LR_REGNUM like this we need to change the way we compare the register in doloop_condition_get. This function currently compares the rtx nodes by address, which I think happens to be fine before we assign hard registers, as I suspect we always share the rtx node for the same pseudo, but when assigning registers it seems like we create copies, so things like:
`XEXP (inc_src, 0) == reg` will fail for
inc_src: (plus (reg LR) (const_int -n)'
reg: (reg LR)

Instead I will substitute the operand '==' with calls to 'rtx_equal_p (op1, op2, NULL)'.

Sound good?

Kind regards,
Andre

________________________________________
From: Richard Earnshaw (lists) <Richard.Earnshaw@arm.com>
Sent: Tuesday, January 30, 2024 11:36 AM
To: Andre Simoes Dias Vieira; gcc-patches@gcc.gnu.org
Cc: Kyrylo Tkachov; Stam Markianos-Wright
Subject: Re: [PATCH v3 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

On 19/01/2024 14:40, Andre Vieira wrote:
>
> Respin after comments from Kyrill and rebase. I also removed an if-then-else
> construct in arm_mve_check_reg_origin_is_num_elems similar to the other functions
> Kyrill pointed out.
>
> After an earlier comment from Richard Sandiford I also added comments to the
> two tail predication patterns added to explain the need for the unspecs.

[missing ChangeLog]

I'm just going to focus on loop-doloop.c in this reply, I'll respond to the other bits in a follow-up.

      2)  (set (reg) (plus (reg) (const_int -1))
-         (set (pc) (if_then_else (reg != 0)
-                                (label_ref (label))
-                                (pc))).
+        (set (pc) (if_then_else (reg != 0)
+                                (label_ref (label))
+                                (pc))).

      Some targets (ARM) do the comparison before the branch, as in the
      following form:

-     3) (parallel [(set (cc) (compare ((plus (reg) (const_int -1), 0)))
-                   (set (reg) (plus (reg) (const_int -1)))])
-        (set (pc) (if_then_else (cc == NE)
...


This comment is becoming confusing.  Really the text leading up to 3)... should be inside 3.  Something like:

      3) Some targets (ARM) do the comparison before the branch, as in the
      following form:

      (parallel [(set (cc) (compare (plus (reg) (const_int -1)) 0))
                 (set (reg) (plus (reg) (const_int -1)))])
      (set (pc) (if_then_else (cc == NE)
                              (label_ref (label))
                              (pc)))])


The same issue on the comment structure also applies to the new point 4...

+      The ARM target also supports a special case of a counter that decrements
+      by `n` and terminating in a GTU condition.  In that case, the compare and
+      branch are all part of one insn, containing an UNSPEC:
+
+      4) (parallel [
+           (set (pc)
+               (if_then_else (gtu (unspec:SI [(plus:SI (reg:SI 14 lr)
+                                                       (const_int -n))])
+                                  (const_int n-1]))
+                   (label_ref)
+                   (pc)))
+           (set (reg:SI 14 lr)
+                (plus:SI (reg:SI 14 lr)
+                         (const_int -n)))
+     */

I think this needs a bit more clarification.  Specifically that this construct supports a predicated vectorized do loop.  Also, the placement of the unspec inside the comparison is ugnly and unnecessary.  It should be sufficient to have the unspec inside a USE expression, which the mid-end can then ignore entirely.  So

        (parallel
         [(set (pc) (if_then_else (gtu (plus (reg) (const_int -n))
                                       (const_int n-1))
                                  (label_ref) (pc)))
          (set (reg) (plus (reg) (const_int -n)))
          (additional clobbers and uses)])

For Arm, we then add a (use (unspec [(const_int 0)] N)) that is specific to this pattern to stop anything else from matching it.

Note that we don't need to mention that the register is 'LR' or the modes, those are specific to a particular backend, not the generic pattern we want to match.

+      || !CONST_INT_P (XEXP (inc_src, 1))
+      || INTVAL (XEXP (inc_src, 1)) >= 0)
     return 0;
+  int dec_num = abs (INTVAL (XEXP (inc_src, 1)));

We can just use '-INTVAL(...)' here, we've verified just above that the constant is negative.

-  if ((XEXP (condition, 0) == reg)
+  /* For the ARM special case of having a GTU: re-form the condition without
+     the unspec for the benefit of the middle-end.  */
+  if (GET_CODE (condition) == GTU)
+    {
+      condition = gen_rtx_fmt_ee (GTU, VOIDmode, inc_src,
+                                 GEN_INT (dec_num - 1));
+      return condition;
+    }

If you make the change I mentioned above, this re-forming isn't needed any more, so the arm-specific comment goes away

-   {
+    {
      if (GET_CODE (pattern) != PARALLEL)
      /*  For the second form we expect:

You've fixed the indentation of the brace (good), but the body of the braced expression needs re-indenting as well.

R.


  reply	other threads:[~2024-01-30 14:10 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-19 14:40 [PATCH v3 0/2] " Andre Vieira
2024-01-19 14:40 ` [PATCH v3 1/2] arm: Add define_attr to to create a mapping between MVE predicated and unpredicated insns Andre Vieira
2024-01-30 11:08   ` Richard Earnshaw (lists)
2024-01-19 14:40 ` [PATCH v3 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops Andre Vieira
2024-01-30 11:36   ` Richard Earnshaw (lists)
2024-01-30 14:09     ` Andre Simoes Dias Vieira [this message]
2024-01-31 13:45       ` Richard Earnshaw (lists)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DB9PR08MB6553F6DAE79401C580705DE3887D2@DB9PR08MB6553.eurprd08.prod.outlook.com \
    --to=andre.simoesdiasvieira@arm.com \
    --cc=Kyrylo.Tkachov@arm.com \
    --cc=Richard.Earnshaw@arm.com \
    --cc=Stam.Markianos-Wright@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).