public inbox for gcc-patches@gcc.gnu.org
* [PATCH][committed] aarch64: Add mulv2di3 expander for TARGET_SVE
@ 2023-04-24 10:31 Kyrylo Tkachov
From: Kyrylo Tkachov @ 2023-04-24 10:31 UTC
  To: gcc-patches

Hi all,

Motivated by a recent LLVM patch I saw, we can use SVE for MUL on vectors of 64-bit integers (plain Advanced SIMD doesn't support it).
Since the Advanced SIMD regs are just the low 128 bits of the SVE regs, it all works transparently.
It's a reasonably straightforward implementation of the mulv2di3 optab that wires it up through the mulvnx2di3 expander,
using subregs to convert between the Advanced SIMD and SVE modes.
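
To illustrate the register overlap, here is a portable toy model in C (not GCC internals, and not part of the patch).
The model_* names and the 4-lane (256-bit) vector length are assumptions made up for this sketch; lane 0 is taken to be
the lowest lane. It only shows that a full-width lane-wise multiply followed by reading the low 128 bits gives the V2DI product.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical vector length for the model: 4 x 64-bit lanes (256 bits).  */
enum { MODEL_Z_LANES = 4 };

typedef int64_t v2di __attribute__((vector_size (16)));

/* Toy Z register: lanes 0 and 1 stand for the bits shared with the
   corresponding Advanced SIMD V register.  */
typedef struct { int64_t lane[MODEL_Z_LANES]; } model_zreg;

/* View a V2DI value as a full-width register.  The upper lanes are
   don't-care, so fill them with an arbitrary junk value.  */
model_zreg
model_widen (v2di v)
{
  model_zreg z;
  for (int i = 0; i < MODEL_Z_LANES; i++)
    z.lane[i] = 0x5a;
  z.lane[0] = v[0];
  z.lane[1] = v[1];
  return z;
}

/* Lane-wise multiply over every lane, standing in for an SVE MUL on .d
   elements.  Unsigned arithmetic avoids signed-overflow UB on junk lanes.  */
model_zreg
model_sve_mul (model_zreg a, model_zreg b)
{
  model_zreg r;
  for (int i = 0; i < MODEL_Z_LANES; i++)
    r.lane[i] = (int64_t) ((uint64_t) a.lane[i] * (uint64_t) b.lane[i]);
  return r;
}

/* Read the low 128 bits back as a V2DI value.  */
v2di
model_low128 (model_zreg z)
{
  return (v2di) { z.lane[0], z.lane[1] };
}

/* The whole round trip: widen, multiply at full width, take the low part.  */
v2di
model_mulv2di3 (v2di a, v2di b)
{
  return model_low128 (model_sve_mul (model_widen (a), model_widen (b)));
}

/* Helper for checks: compare both lanes against expected values.  */
int
model_lanes_equal (v2di x, int64_t lo, int64_t hi)
{
  return x[0] == lo && x[1] == hi;
}

/* Usage example: the junk in the upper lanes never reaches the result.  */
void
model_check (void)
{
  v2di a = { 6, -2 };
  v2di b = { 7, 9 };
  assert (model_lanes_equal (model_mulv2di3 (a, b), 42, -18));
}
```

In the real expander there is no copy in either direction: the subregs just reinterpret the same register in the wider mode.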

There are more such tricks possible with other operations (and we could do 64-bit multiply-add merged operations too), but for now
this self-contained patch improves the mul case. Without it, for the testcases in the patch we'd have scalarised the arguments,
moved them to GP regs, performed two GP MULs and moved the results back to SIMD regs.
Advertising a mulv2di3 optab from the backend should also allow for more flexible vectorisation opportunities.
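
As a rough source-level picture of the scalarised sequence described above, the sketch below spells out the per-lane
multiply next to the plain vector expression. The function names are made up for illustration, and int64_t is used
instead of the testcases' long so that the lane count does not depend on the ABI. It uses the GCC vector extensions
and is not tied to AArch64.

```c
#include <assert.h>
#include <stdint.h>

/* Two 64-bit lanes in a 128-bit vector, like the v2di typedef in the tests.  */
typedef int64_t v2di __attribute__((vector_size (16)));

/* What the source writes.  With a mulv2di3 optab available the compiler
   can keep this as one vector multiply.  */
v2di
mul_vector (v2di a, v2di b)
{
  return a * b;
}

/* Hypothetical hand-written form of the scalarised fallback: extract each
   lane, multiply as a scalar, and rebuild the vector.  */
v2di
mul_scalarised (v2di a, v2di b)
{
  int64_t lo = a[0] * b[0];
  int64_t hi = a[1] * b[1];
  return (v2di) { lo, hi };
}

/* Helper for checks: compare both lanes against expected values.  */
int
lanes_equal (v2di x, int64_t lo, int64_t hi)
{
  return x[0] == lo && x[1] == hi;
}

/* Usage example: both forms compute the same lanes.  */
void
check_equivalence (void)
{
  v2di a = { 3, -7 };
  v2di b = { 125, 11 };
  assert (lanes_equal (mul_vector (a, b), 375, -77));
  assert (lanes_equal (mul_scalarised (a, b), 375, -77));
}
```

Both functions are semantically identical; the difference the patch makes is only in which instruction sequence the
backend can pick for the first one.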

Bootstrapped and tested on aarch64-none-linux-gnu.
Pushing to trunk.
Thanks,
Kyrill

gcc/ChangeLog:

	* config/aarch64/aarch64-simd.md (mulv2di3): New expander.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/sve-neon-modes_1.c: New test.
	* gcc.target/aarch64/sve-neon-modes_2.c: New test.

[-- Attachment #2: mulv2di.patch --]
[-- Type: application/octet-stream, Size: 2822 bytes --]

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 3045711812578a891a35345c211d5f9322561643..b1453fd9c0e10d60fcd0923f605c9d6cd2a14295 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -433,6 +433,26 @@ (define_insn "mul<mode>3<vczle><vczbe>"
   [(set_attr "type" "neon_mul_<Vetype><q>")]
 )
 
+;; Advanced SIMD does not support vector DImode MUL, but SVE does.
+;; Make use of the overlap between Z and V registers to implement the V2DI
+;; optab for TARGET_SVE.  The mulvnx2di3 expander can
+;; handle the TARGET_SVE2 case transparently.
+(define_expand "mulv2di3"
+  [(set (match_operand:V2DI 0 "register_operand")
+        (mult:V2DI (match_operand:V2DI 1 "register_operand")
+		   (match_operand:V2DI 2 "aarch64_sve_vsm_operand")))]
+  "TARGET_SVE"
+  {
+    machine_mode sve_mode = VNx2DImode;
+    rtx sve_op0 = simplify_gen_subreg (sve_mode, operands[0], V2DImode, 0);
+    rtx sve_op1 = simplify_gen_subreg (sve_mode, operands[1], V2DImode, 0);
+    rtx sve_op2 = simplify_gen_subreg (sve_mode, operands[2], V2DImode, 0);
+
+    emit_insn (gen_mulvnx2di3 (sve_op0, sve_op1, sve_op2));
+    DONE;
+  }
+)
+
 (define_insn "bswap<mode>2"
   [(set (match_operand:VDQHSD 0 "register_operand" "=w")
         (bswap:VDQHSD (match_operand:VDQHSD 1 "register_operand" "w")))]
diff --git a/gcc/testsuite/gcc.target/aarch64/sve-neon-modes_1.c b/gcc/testsuite/gcc.target/aarch64/sve-neon-modes_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..ce4f1c70bccd07fda587cc98ea6b53d2463036bd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve-neon-modes_1.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-O -march=armv8.2-a+sve" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+typedef long v2di  __attribute__((vector_size (16)));
+
+/*
+** foo:
+**	ptrue	p0.b, all
+**	mul	z0.d, p0/m, z0.d, z1.d
+**	ret
+*/
+
+v2di
+foo (v2di a, v2di b)
+{
+  return a * b;
+}
+
+/*
+** foo_imm:
+**	mul	z0.d, z0.d, #125
+**	ret
+*/
+
+v2di
+foo_imm (v2di a)
+{
+  return a * 125;
+}
+
diff --git a/gcc/testsuite/gcc.target/aarch64/sve-neon-modes_2.c b/gcc/testsuite/gcc.target/aarch64/sve-neon-modes_2.c
new file mode 100644
index 0000000000000000000000000000000000000000..02bb9f34c38f18bb0e4c94bafe00d53b0423f9b0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve-neon-modes_2.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-O -march=armv8.5-a+sve2" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+typedef long v2di  __attribute__((vector_size (16)));
+
+/*
+** foo:
+**	mul	z0.d, z0.d, z1.d
+**	ret
+*/
+
+v2di
+foo (v2di a, v2di b)
+{
+  return a * b;
+}
+
+/*
+** foo_imm:
+**	mul	z0.d, z0.d, #125
+**	ret
+*/
+
+v2di
+foo_imm (v2di a)
+{
+  return a * 125;
+}
+
