public inbox for gcc-patches@gcc.gnu.org
From: Tamar Christina <tamar.christina@arm.com>
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, rguenther@suse.de
Subject: [PATCH 4/4]middle-end: Add tests middle end generic tests for sign differing dotproduct.
Date: Wed, 5 May 2021 18:39:49 +0100
Message-ID: <20210505173947.GA24190@arm.com>
In-Reply-To: <patch-14433-tamar@arm.com>

[-- Attachment #1: Type: text/plain, Size: 13192 bytes --]

Hi All,

This adds testcases that check auto-vectorizer detection of the new
sign-differing dot product.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* doc/sourcebuild.texi (arm_v8_2a_i8mm_neon_hw): Document.

gcc/testsuite/ChangeLog:

	* lib/target-supports.exp
	(check_effective_target_arm_v8_2a_imm8_neon_ok_nocache,
	check_effective_target_arm_v8_2a_i8mm_neon_hw,
	check_effective_target_vect_usdot_qi): New.
	* gcc.dg/vect/vect-reduc-dot-10.c: New test.
	* gcc.dg/vect/vect-reduc-dot-11.c: New test.
	* gcc.dg/vect/vect-reduc-dot-12.c: New test.
	* gcc.dg/vect/vect-reduc-dot-13.c: New test.
	* gcc.dg/vect/vect-reduc-dot-14.c: New test.
	* gcc.dg/vect/vect-reduc-dot-15.c: New test.
	* gcc.dg/vect/vect-reduc-dot-16.c: New test.
	* gcc.dg/vect/vect-reduc-dot-9.c: New test.

--- inline copy of patch -- 
diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index b0001247795947c9dcab1a14884ecd585976dfdd..0034ac9d86b26e6674d71090b9d04b6148f99e17 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1672,6 +1672,10 @@ Target supports a vector dot-product of @code{signed char}.
 @item vect_udot_qi
 Target supports a vector dot-product of @code{unsigned char}.
 
+@item vect_usdot_qi
+Target supports a vector dot-product where one operand of the multiply is
+@code{signed char} and the other is @code{unsigned char}.
+
 @item vect_sdot_hi
 Target supports a vector dot-product of @code{signed short}.
 
@@ -1947,6 +1951,11 @@ ARM target supports executing instructions from ARMv8.2-A with the Dot
 Product extension. Some multilibs may be incompatible with these options.
 Implies arm_v8_2a_dotprod_neon_ok.
 
+@item arm_v8_2a_i8mm_neon_hw
+ARM target supports executing instructions from ARMv8.2-A with the 8-bit
+Matrix Multiply extension.  Some multilibs may be incompatible with these
+options.  Implies arm_v8_2a_i8mm_ok.
+
 @item arm_fp16fml_neon_ok
 @anchor{arm_fp16fml_neon_ok}
 ARM target supports extensions to generate the @code{VFMAL} and @code{VFMLS}
diff --git a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-10.c b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-10.c
new file mode 100644
index 0000000000000000000000000000000000000000..7ce86965ea97d37c43d96b4d2271df667dcb2aae
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-10.c
@@ -0,0 +1,13 @@
+/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target arm_v8_2a_i8mm_neon_hw { target { aarch64*-*-* || arm*-*-* } } } */
+/* { dg-add-options arm_v8_2a_i8mm }  */
+
+#define SIGNEDNESS_1 unsigned
+#define SIGNEDNESS_2 unsigned
+#define SIGNEDNESS_3 unsigned
+#define SIGNEDNESS_4 signed
+
+#include "vect-reduc-dot-9.c"
+
+/* { dg-final { scan-tree-dump-not "vect_recog_dot_prod_pattern: detected" "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { target vect_usdot_qi } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-11.c b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-11.c
new file mode 100644
index 0000000000000000000000000000000000000000..0f7cbbb87ef028f166366aea55bc4ef49d2f8e9b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-11.c
@@ -0,0 +1,13 @@
+/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target arm_v8_2a_i8mm_neon_hw { target { aarch64*-*-* || arm*-*-* } } } */
+/* { dg-add-options arm_v8_2a_i8mm }  */
+
+#define SIGNEDNESS_1 unsigned
+#define SIGNEDNESS_2 signed
+#define SIGNEDNESS_3 unsigned
+#define SIGNEDNESS_4 signed
+
+#include "vect-reduc-dot-9.c"
+
+/* { dg-final { scan-tree-dump "vect_recog_dot_prod_pattern: detected" "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { target vect_usdot_qi } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-12.c b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-12.c
new file mode 100644
index 0000000000000000000000000000000000000000..08412614fc67045d3067b5b55ba032d297595237
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-12.c
@@ -0,0 +1,13 @@
+/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target arm_v8_2a_i8mm_neon_hw { target { aarch64*-*-* || arm*-*-* } } } */
+/* { dg-add-options arm_v8_2a_i8mm }  */
+
+#define SIGNEDNESS_1 unsigned
+#define SIGNEDNESS_2 signed
+#define SIGNEDNESS_3 signed
+#define SIGNEDNESS_4 unsigned
+
+#include "vect-reduc-dot-9.c"
+
+/* { dg-final { scan-tree-dump "vect_recog_dot_prod_pattern: detected" "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { target vect_usdot_qi } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-13.c b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-13.c
new file mode 100644
index 0000000000000000000000000000000000000000..7ee0f45f64296442204ee13d5f880f4b7716fb85
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-13.c
@@ -0,0 +1,13 @@
+/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target arm_v8_2a_i8mm_neon_hw { target { aarch64*-*-* || arm*-*-* } } } */
+/* { dg-add-options arm_v8_2a_i8mm }  */
+
+#define SIGNEDNESS_1 signed
+#define SIGNEDNESS_2 unsigned
+#define SIGNEDNESS_3 signed
+#define SIGNEDNESS_4 unsigned
+
+#include "vect-reduc-dot-9.c"
+
+/* { dg-final { scan-tree-dump-not "vect_recog_dot_prod_pattern: detected" "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { target vect_usdot_qi } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-14.c b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-14.c
new file mode 100644
index 0000000000000000000000000000000000000000..2de1434528b87f0c32c54150b16791f3f2a469b5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-14.c
@@ -0,0 +1,13 @@
+/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target arm_v8_2a_i8mm_neon_hw { target { aarch64*-*-* || arm*-*-* } } } */
+/* { dg-add-options arm_v8_2a_i8mm }  */
+
+#define SIGNEDNESS_1 signed
+#define SIGNEDNESS_2 unsigned
+#define SIGNEDNESS_3 unsigned
+#define SIGNEDNESS_4 signed
+
+#include "vect-reduc-dot-9.c"
+
+/* { dg-final { scan-tree-dump-not "vect_recog_dot_prod_pattern: detected" "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { target vect_usdot_qi } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-15.c b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-15.c
new file mode 100644
index 0000000000000000000000000000000000000000..dc48f95a32bf76c54a906ee81ddee99b16aea84a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-15.c
@@ -0,0 +1,13 @@
+/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target arm_v8_2a_i8mm_neon_hw { target { aarch64*-*-* || arm*-*-* } } } */
+/* { dg-add-options arm_v8_2a_i8mm }  */
+
+#define SIGNEDNESS_1 signed
+#define SIGNEDNESS_2 signed
+#define SIGNEDNESS_3 unsigned
+#define SIGNEDNESS_4 signed
+
+#include "vect-reduc-dot-9.c"
+
+/* { dg-final { scan-tree-dump "vect_recog_dot_prod_pattern: detected" "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { target vect_usdot_qi } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-16.c b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-16.c
new file mode 100644
index 0000000000000000000000000000000000000000..aec628789366673321aea88c60316a68fe16cbc5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-16.c
@@ -0,0 +1,13 @@
+/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target arm_v8_2a_i8mm_neon_hw { target { aarch64*-*-* || arm*-*-* } } } */
+/* { dg-add-options arm_v8_2a_i8mm }  */
+
+#define SIGNEDNESS_1 signed
+#define SIGNEDNESS_2 signed
+#define SIGNEDNESS_3 signed
+#define SIGNEDNESS_4 unsigned
+
+#include "vect-reduc-dot-9.c"
+
+/* { dg-final { scan-tree-dump "vect_recog_dot_prod_pattern: detected" "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { target vect_usdot_qi } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-9.c b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-9.c
new file mode 100644
index 0000000000000000000000000000000000000000..cbbeedec3bfd0810a8ce8036e6670585d9334924
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-9.c
@@ -0,0 +1,52 @@
+/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target arm_v8_2a_i8mm_neon_hw { target { aarch64*-*-* || arm*-*-* } } } */
+/* { dg-add-options arm_v8_2a_i8mm }  */
+
+#include "tree-vect.h"
+
+#define N 50
+
+#ifndef SIGNEDNESS_1
+#define SIGNEDNESS_1 unsigned
+#define SIGNEDNESS_2 unsigned
+#define SIGNEDNESS_3 signed
+#define SIGNEDNESS_4 unsigned
+#endif
+
+SIGNEDNESS_1 int __attribute__ ((noipa))
+f (SIGNEDNESS_1 int res, SIGNEDNESS_3 char *restrict a,
+   SIGNEDNESS_4 char *restrict b)
+{
+  for (__INTPTR_TYPE__ i = 0; i < N; ++i)
+    {
+      int av = a[i];
+      int bv = b[i];
+      SIGNEDNESS_2 short mult = av * bv;
+      res += mult;
+    }
+  return res;
+}
+
+#define BASE ((SIGNEDNESS_3 int) -1 < 0 ? -126 : 4)
+#define OFFSET 20
+
+int
+main (void)
+{
+  check_vect ();
+
+  SIGNEDNESS_3 char a[N]; SIGNEDNESS_4 char b[N];
+  int expected = 0x12345;
+  for (int i = 0; i < N; ++i)
+    {
+      a[i] = BASE + i * 5;
+      b[i] = BASE + OFFSET + i * 4;
+      asm volatile ("" ::: "memory");
+      expected += (SIGNEDNESS_2 short) (a[i] * b[i]);
+    }
+  if (f (0x12345, a, b) != expected)
+    __builtin_abort ();
+}
+
+/* { dg-final { scan-tree-dump-not "vect_recog_dot_prod_pattern: detected" "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { target vect_usdot_qi } } } */
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index ad323107f2ec5d55a77214beca5e4135643528b4..db9bd605ab4c838f65667fa616da334a171d9dfb 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -5240,6 +5240,36 @@ proc check_effective_target_arm_v8_2a_dotprod_neon_ok_nocache { } {
     return 0;
 }
 
+# Return 1 if the target supports ARMv8.2 Adv.SIMD i8mm
+# instructions, 0 otherwise.  The test is valid for ARM and for AArch64.
+# Record the command line options needed.
+
+proc check_effective_target_arm_v8_2a_imm8_neon_ok_nocache { } {
+    global et_arm_v8_2a_imm8_neon_flags
+    set et_arm_v8_2a_imm8_neon_flags ""
+
+    if { ![istarget arm*-*-*] && ![istarget aarch64*-*-*] } {
+        return 0;
+    }
+
+    # Iterate through sets of options to find the compiler flags that
+    # need to be added to the -march option.
+    foreach flags {"" "-mfloat-abi=softfp -mfpu=neon-fp-armv8" "-mfloat-abi=hard -mfpu=neon-fp-armv8"} {
+        if { [check_no_compiler_messages_nocache \
+                  arm_v8_2a_imm8_neon_ok object {
+	    #include <stdint.h>
+            #if !defined (__ARM_FEATURE_MATMUL_INT8)
+            #error "__ARM_FEATURE_MATMUL_INT8 not defined"
+            #endif
+        } "$flags -march=armv8.2-a+i8mm"] } {
+            set et_arm_v8_2a_imm8_neon_flags "$flags -march=armv8.2-a+i8mm"
+            return 1
+        }
+    }
+
+    return 0;
+}
+
 # Return 1 if the target supports ARMv8.1-M MVE
 # instructions, 0 otherwise.  The test is valid for ARM.
 # Record the command line options needed.
@@ -5667,6 +5697,43 @@ proc check_effective_target_arm_v8_2a_dotprod_neon_hw { } {
     } [add_options_for_arm_v8_2a_dotprod_neon ""]]
 }
 
+# Return 1 if the target supports executing AdvSIMD instructions from ARMv8.2
+# with the i8mm extension, 0 otherwise.  The test is valid for ARM and for
+# AArch64.
+
+proc check_effective_target_arm_v8_2a_i8mm_neon_hw { } {
+    if { ![check_effective_target_arm_v8_2a_i8mm_ok] } {
+        return 0;
+    }
+    return [check_runtime arm_v8_2a_i8mm_neon_hw_available {
+        #include "arm_neon.h"
+        int
+        main (void)
+        {
+
+	  uint32x2_t results = {0,0};
+	  uint8x8_t a = {1,1,1,1,2,2,2,2};
+	  int8x8_t b = {2,2,2,2,3,3,3,3};
+
+          #ifdef __ARM_ARCH_ISA_A64
+          asm ("usdot %0.2s, %1.8b, %2.8b"
+               : "=w"(results)
+               : "w"(a), "w"(b)
+               : /* No clobbers.  */);
+
+	  #else
+          asm ("vusdot.u8 %P0, %P1, %P2"
+               : "=w"(results)
+               : "w"(a), "w"(b)
+               : /* No clobbers.  */);
+          #endif
+
+          return (vget_lane_u32 (results, 0) == 8
+		  && vget_lane_u32 (results, 1) == 24) ? 1 : 0;
+        }
+    } [add_options_for_arm_v8_2a_i8mm ""]]
+}
+
 # Return 1 if this is a ARM target with NEON enabled.
 
 proc check_effective_target_arm_neon { } {
@@ -7022,6 +7089,19 @@ proc check_effective_target_vect_udot_qi { } {
 		 && [et-is-effective-target mips_msa]) }}]
 }
 
+# Return 1 if the target plus current options supports a vector
+# dot-product where one operand of the multiply is signed char
+# and the other is unsigned char, 0 otherwise.
+#
+# This won't change for different subtargets so cache the result.
+
+proc check_effective_target_vect_usdot_qi { } {
+    return [check_cached_effective_target_indexed vect_usdot_qi {
+      expr { [istarget aarch64*-*-*]
+	     || [istarget arm*-*-*] }}]
+}
+
+
 # Return 1 if the target plus current options supports a vector
 # dot-product of signed shorts, 0 otherwise.
 #



Thread overview: 35+ messages
2021-05-05 17:38 [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes Tamar Christina
2021-05-05 17:38 ` [PATCH 2/4]AArch64: Add support for sign differing dot-product usdot for NEON and SVE Tamar Christina
2021-05-10 16:49   ` Richard Sandiford
2021-05-25 14:57     ` Tamar Christina
2021-05-26  8:50       ` Richard Sandiford
2021-05-05 17:39 ` [PATCH 3/4][AArch32]: Add support for sign differing dot-product usdot for NEON Tamar Christina
2021-05-05 17:42   ` FW: " Tamar Christina
     [not found]     ` <VI1PR08MB5325B832EE3BB6139886C0E9FF259@VI1PR08MB5325.eurprd08.prod.outlook.com>
2021-05-25 15:02       ` Tamar Christina
2021-05-26 10:45         ` Kyrylo Tkachov
2021-05-06  9:23   ` Christophe Lyon
2021-05-06  9:27     ` Tamar Christina
2021-05-05 17:39 ` Tamar Christina [this message]
     [not found]   ` <VI1PR08MB532511701573C18A33AC6291FF259@VI1PR08MB5325.eurprd08.prod.outlook.com>
2021-05-25 15:01     ` FW: [PATCH 4/4]middle-end: Add tests middle end generic tests for sign differing dotproduct Tamar Christina
     [not found]     ` <11s2181-8856-30rq-26or-84q8o7qrr2o@fhfr.qr>
2021-05-26  8:48       ` Tamar Christina
2021-06-14 12:08       ` Tamar Christina
2021-05-07 11:45 ` [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes Richard Biener
2021-05-07 12:42   ` Tamar Christina
2021-05-10 11:39     ` Richard Biener
2021-05-10 12:58       ` Tamar Christina
2021-05-10 13:29         ` Richard Biener
2021-05-25 14:57           ` Tamar Christina
2021-05-26  8:56             ` Richard Biener
2021-06-02  9:28               ` Tamar Christina
2021-06-04 10:12                 ` Tamar Christina
2021-06-07 10:10                   ` Richard Sandiford
2021-06-14 12:06                     ` Tamar Christina
2021-06-21  8:11                       ` Tamar Christina
2021-06-22 10:56                       ` Richard Sandiford
2021-06-22 11:16                         ` Richard Sandiford
2021-07-12  9:18                           ` Tamar Christina
2021-07-12  9:39                             ` Richard Sandiford
2021-07-12  9:56                               ` Tamar Christina
2021-07-12 10:25                                 ` Richard Sandiford
2021-07-12 12:29                                   ` Tamar Christina
2021-07-12 14:55                                     ` Richard Sandiford
