public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Sandiford <richard.sandiford@arm.com>
To: Andrew Pinski <quic_apinski@quicinc.com>
Cc: <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] aarch64: Fix movv8di for overlapping register and memory load [PR113550]
Date: Thu, 25 Jan 2024 12:08:22 +0000	[thread overview]
Message-ID: <mpth6j1fweh.fsf@arm.com> (raw)
In-Reply-To: <20240124070103.3800874-1-quic_apinski@quicinc.com> (Andrew Pinski's message of "Tue, 23 Jan 2024 23:01:03 -0800")

Andrew Pinski <quic_apinski@quicinc.com> writes:
> The split for movv8di is not ready to handle the case where the setting
> register overlaps with the address of the memory that is being load.
> Fixing the split than just making the output constraint as an early clobber
> for this alternative. The split would first need to figure out which register
> is overlapping with the address and then only emit that move last.

I was curious how strained that detection would be in practice, and in
the end it didn't seem too bad.  I pushed the following variant after
testing on aarch64-linux-gnu.

Thanks,
Richard


The LS64 movv8di pattern didn't handle loads that overlapped with
the address register (unless the overlap happened to be in the
last subload).

gcc/
	PR target/113550
	* config/aarch64/aarch64-simd.md: In the movv8di splitter, check
	whether each split instruction is a load that clobbers the source
	address.  Emit that instruction last if so.

gcc/testsuite/
	PR target/113550
	* gcc.target/aarch64/pr113550.c: New test.
---
 gcc/config/aarch64/aarch64-simd.md          | 18 ++++++--
 gcc/testsuite/gcc.target/aarch64/pr113550.c | 48 +++++++++++++++++++++
 2 files changed, 62 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr113550.c

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 48f0741e7d0..f036f6ce997 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -8221,11 +8221,21 @@ (define_split
 	   || (memory_operand (operands[0], V8DImode)
 	       && register_operand (operands[1], V8DImode)))
     {
+      std::pair<rtx, rtx> last_pair = {};
       for (int offset = 0; offset < 64; offset += 16)
-	emit_move_insn (simplify_gen_subreg (TImode, operands[0],
-					     V8DImode, offset),
-			simplify_gen_subreg (TImode, operands[1],
-					     V8DImode, offset));
+        {
+	  std::pair<rtx, rtx> pair = {
+	    simplify_gen_subreg (TImode, operands[0], V8DImode, offset),
+	    simplify_gen_subreg (TImode, operands[1], V8DImode, offset)
+	  };
+	  if (register_operand (pair.first, TImode)
+	      && reg_overlap_mentioned_p (pair.first, pair.second))
+	    last_pair = pair;
+	  else
+	    emit_move_insn (pair.first, pair.second);
+        }
+      if (last_pair.first)
+	emit_move_insn (last_pair.first, last_pair.second);
       DONE;
     }
   else
diff --git a/gcc/testsuite/gcc.target/aarch64/pr113550.c b/gcc/testsuite/gcc.target/aarch64/pr113550.c
new file mode 100644
index 00000000000..0ff3c7b5c00
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr113550.c
@@ -0,0 +1,48 @@
+/* { dg-options "-O" } */
+/* { dg-do run } */
+
+#pragma GCC push_options
+#pragma GCC target "+ls64"
+#pragma GCC aarch64 "arm_acle.h"
+#pragma GCC pop_options
+
+#define DEF_FUNCTION(NAME, ARGS)		\
+  __attribute__((noipa))			\
+  __arm_data512_t				\
+  NAME ARGS					\
+  {						\
+    return *ptr;				\
+  }
+
+DEF_FUNCTION (f0, (__arm_data512_t *ptr))
+DEF_FUNCTION (f1, (int x0, __arm_data512_t *ptr))
+DEF_FUNCTION (f2, (int x0, int x1, __arm_data512_t *ptr))
+DEF_FUNCTION (f3, (int x0, int x1, int x2, __arm_data512_t *ptr))
+DEF_FUNCTION (f4, (int x0, int x1, int x2, int x3, __arm_data512_t *ptr))
+DEF_FUNCTION (f5, (int x0, int x1, int x2, int x3, int x4,
+		   __arm_data512_t *ptr))
+DEF_FUNCTION (f6, (int x0, int x1, int x2, int x3, int x4, int x5,
+		   __arm_data512_t *ptr))
+DEF_FUNCTION (f7, (int x0, int x1, int x2, int x3, int x4, int x5, int x6,
+		   __arm_data512_t *ptr))
+
+int
+main (void)
+{
+  __arm_data512_t x = { 0, 10, 20, 30, 40, 50, 60, 70 };
+  __arm_data512_t res[8] =
+  {
+    f0 (&x),
+    f1 (0, &x),
+    f2 (0, 1, &x),
+    f3 (0, 1, 2, &x),
+    f4 (0, 1, 2, 3, &x),
+    f5 (0, 1, 2, 3, 4, &x),
+    f6 (0, 1, 2, 3, 4, 5, &x),
+    f7 (0, 1, 2, 3, 4, 5, 6, &x)
+  };
+  for (int i = 0; i < 8; ++i)
+    if (__builtin_memcmp (&x, &res[i], sizeof (x)) != 0)
+      __builtin_abort ();
+  return 0;
+}
-- 
2.25.1


      reply	other threads:[~2024-01-25 12:08 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-24  7:01 Andrew Pinski
2024-01-25 12:08 ` Richard Sandiford [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=mpth6j1fweh.fsf@arm.com \
    --to=richard.sandiford@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=quic_apinski@quicinc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).