public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Carl Love <cel@us.ibm.com>
To: gcc-patches@gcc.gnu.org,
	Segher Boessenkool <segher@kernel.crashing.org>,
	David Edelsohn <dje.gcc@gmail.com>,
	"Kewen.Lin" <linkw@linux.ibm.com>
Cc: cel@us.ibm.com, Peter Bergner <bergner@linux.ibm.com>
Subject: [PATCH v2] rs6000: Fix __builtin_altivec_vcmpne{b,h,w} implementation
Date: Tue, 01 Aug 2023 11:29:09 -0700	[thread overview]
Message-ID: <18319f27b2ef6cf19de41c3c3fc32c7680c62716.camel@us.ibm.com> (raw)


GCC maintainers:

Ver 2:  Re-worked the test vec-cmpne.c to create a compile only test
verify the instruction generation and a runnable test to verify the
built-in functionality.  Retested the patch on Power 8 LE/BE, Power 9LE/BE and Power 10 LE with no regressions.

The following patch cleans up the definition for the
__builtin_altivec_vcmpne{b,h,w}.  The current implementation implies
that the built-in is only supported on Power 9 since it is defined
under the Power 9 stanza.  However the built-in has no ISA restrictions
as stated in the Power Vector Intrinsic Programming Reference document.
The current built-in works because the built-in gets replaced during
GIMPLE folding by a simple not-equal operator so it doesn't get
expanded and checked for Power 9 code generation.

This patch moves the definition to the Altivec stanza in the built-in
definition file to make it clear the built-ins are valid for Power 8,
Power 9 and beyond.  

The patch has been tested on Power 8 LE/BE, Power 9 LE/BE and Power 10
LE with no regressions.

Please let me know if the patch is acceptable for mainline.  Thanks.

                      Carl 

------------------------------------------------
rs6000: Fix __builtin_altivec_vcmpne{b,h,w} implementation

The current built-in definitions for vcmpneb, vcmpneh, vcmpnew are defined
under the Power 9 section of r66000-builtins.  This implies they are only
supported on Power 9 and above when in fact they are defined and work with
Altivec as well with the appropriate Altivec instruction generation.

The vec_cmpne builtin should generate the vcmpequ{b,h,w} instruction with
Altivec enabled and generate the vcmpne{b,h,w} on Power 9 and newer
processors.

This patch moves the definitions to the Altivec stanza to make it clear
the built-ins are supported for all Altivec processors.  The patch
enables the vcmpequ{b,h,w} instruction to be generated on Altivec and
the vcmpne{b,h,w} instruction to be generated on Power 9 and beyond.

There is existing test coverage for the vec_cmpne built-in for
vector bool char, vector bool short, vector bool int,
vector bool long long in builtins-3-p9.c and p8vector-builtin-2.c.
Coverage for vector signed int, vector unsigned int is in
p8vector-builtin-2.c.

Test vec-cmpne.c is updated to check the generation of the vcmpequ{b,h,w}
instructions for Altivec.  A new test vec-cmpne-runnable.c is added to
verify the built-ins work as expected.

Patch has been tested on Power 8 LE/BE, Power 9 LE/BE and Power 10 LE
with no regressions.

gcc/ChangeLog:

	* config/rs6000/rs6000-builtins.def (vcmpneb, vcmpneh, vcmpnew):
	Move definitions to Altivec stanza.
	* config/rs6000/altivec.md (vcmpneb, vcmpneh, vcmpnew): New
	define_expand.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/vec-cmpne-runnable.c: New execution test.
	* gcc.target/powerpc/vec-cmpne.c (define_test_functions,
	execute_test_functions) moved to vec-cmpne.h.  Added
	scan-assembler-times for vcmpequb, vcmpequh, vcmpequw.
	* gcc.target/powerpc/vec-cmpne.h: New include file for vec-cmpne.c
	and vec-cmpne-runnable.c. Split define_test_functions definition
	into define_test_functions and define_init_verify_functions.
---
 gcc/config/rs6000/altivec.md                  |  12 ++
 gcc/config/rs6000/rs6000-builtins.def         |  18 +--
 .../gcc.target/powerpc/vec-cmpne-runnable.c   |  36 ++++++
 gcc/testsuite/gcc.target/powerpc/vec-cmpne.c  | 110 ++----------------
 gcc/testsuite/gcc.target/powerpc/vec-cmpne.h  |  86 ++++++++++++++
 5 files changed, 151 insertions(+), 111 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-cmpne-runnable.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-cmpne.h

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index ad1224e0b57..31f65aa1b7a 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -2631,6 +2631,18 @@ (define_insn "altivec_vcmpequt_p"
   "vcmpequq. %0,%1,%2"
   [(set_attr "type" "veccmpfx")])
 
+;; Expand for builtin vcmpne{b,h,w}
+(define_expand "altivec_vcmpne_<mode>"
+  [(set (match_operand:VSX_EXTRACT_I 3 "altivec_register_operand" "=v")
+	(eq:VSX_EXTRACT_I (match_operand:VSX_EXTRACT_I 1 "altivec_register_operand" "v")
+			  (match_operand:VSX_EXTRACT_I 2 "altivec_register_operand" "v")))
+   (set (match_operand:VSX_EXTRACT_I 0 "altivec_register_operand" "=v")
+        (not:VSX_EXTRACT_I (match_dup 3)))]
+  "TARGET_ALTIVEC"
+  {
+    operands[3] = gen_reg_rtx (GET_MODE (operands[0]));
+  });
+
 (define_insn "*altivec_vcmpgts<VI_char>_p"
   [(set (reg:CC CR6_REGNO)
 	(unspec:CC [(gt:CC (match_operand:VI2 1 "register_operand" "v")
diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 638d0bc72ca..6b06fa8b34d 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -641,6 +641,15 @@
   const int __builtin_altivec_vcmpgtuw_p (int, vsi, vsi);
     VCMPGTUW_P vector_gtu_v4si_p {pred}
 
+  const vsc __builtin_altivec_vcmpneb (vsc, vsc);
+    VCMPNEB altivec_vcmpne_v16qi {}
+
+  const vss __builtin_altivec_vcmpneh (vss, vss);
+    VCMPNEH altivec_vcmpne_v8hi {}
+
+  const vsi __builtin_altivec_vcmpnew (vsi, vsi);
+    VCMPNEW altivec_vcmpne_v4si {}
+
   const vsi __builtin_altivec_vctsxs (vf, const int<5>);
     VCTSXS altivec_vctsxs {}
 
@@ -2599,9 +2608,6 @@
   const signed int __builtin_altivec_vcmpaew_p (vsi, vsi);
     VCMPAEW_P vector_ae_v4si_p {}
 
-  const vsc __builtin_altivec_vcmpneb (vsc, vsc);
-    VCMPNEB vcmpneb {}
-
   const signed int __builtin_altivec_vcmpneb_p (vsc, vsc);
     VCMPNEB_P vector_ne_v16qi_p {}
 
@@ -2614,15 +2620,9 @@
   const signed int __builtin_altivec_vcmpnefp_p (vf, vf);
     VCMPNEFP_P vector_ne_v4sf_p {}
 
-  const vss __builtin_altivec_vcmpneh (vss, vss);
-    VCMPNEH vcmpneh {}
-
   const signed int __builtin_altivec_vcmpneh_p (vss, vss);
     VCMPNEH_P vector_ne_v8hi_p {}
 
-  const vsi __builtin_altivec_vcmpnew (vsi, vsi);
-    VCMPNEW vcmpnew {}
-
   const signed int __builtin_altivec_vcmpnew_p (vsi, vsi);
     VCMPNEW_P vector_ne_v4si_p {}
 
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-cmpne-runnable.c b/gcc/testsuite/gcc.target/powerpc/vec-cmpne-runnable.c
new file mode 100644
index 00000000000..be038639a6c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-cmpne-runnable.c
@@ -0,0 +1,36 @@
+/* { dg-do run } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-options "-mvsx -O3 " } */
+
+/* Test that the vec_cmpne builtin works as expected.  */
+
+#include "vec-cmpne.h"
+
+define_test_functions (int, signed int, signed int, si);
+define_test_functions (int, unsigned int, unsigned int, ui);
+define_test_functions (short, signed short, signed short, ss);
+define_test_functions (short, unsigned short, unsigned short, us);
+define_test_functions (char, signed char, signed char, sc);
+define_test_functions (char, unsigned char, unsigned char, uc);
+define_test_functions (int, signed int, float, ff);
+
+define_init_verify_functions (int, signed int, signed int, si);
+define_init_verify_functions (int, unsigned int, unsigned int, ui);
+define_init_verify_functions (short, signed short, signed short, ss);
+define_init_verify_functions (short, unsigned short, unsigned short, us);
+define_init_verify_functions (char, signed char, signed char, sc);
+define_init_verify_functions (char, unsigned char, unsigned char, uc);
+define_init_verify_functions (int, signed int, float, ff);
+
+int main ()
+{
+  execute_test_functions (int, signed int, signed int, si);
+  execute_test_functions (int, unsigned int, unsigned int, ui);
+  execute_test_functions (short, signed short, signed short, ss);
+  execute_test_functions (short, unsigned short, unsigned short, us);
+  execute_test_functions (char, signed char, signed char, sc);
+  execute_test_functions (char, unsigned char, unsigned char, uc);
+  execute_test_functions (int, signed int, float, ff);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-cmpne.c b/gcc/testsuite/gcc.target/powerpc/vec-cmpne.c
index edba9dece66..8ceddd5cc3f 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-cmpne.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-cmpne.c
@@ -1,94 +1,11 @@
-/* { dg-do run } */
-/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
 /* { dg-options "-mvsx -O3" } */
 
-/* Test that the vec_cmpne builtin works as expected.  */
-
-#include "altivec.h"
-
-#define N 4096
-
-void abort ();
-
-#define define_test_functions(VBTYPE, RTYPE, STYPE, NAME) \
-\
-RTYPE result_ne_##NAME[N] __attribute__((aligned(16))); \
-RTYPE result_eq_##NAME[N] __attribute__((aligned(16))); \
-STYPE operand1_##NAME[N] __attribute__((aligned(16))); \
-STYPE operand2_##NAME[N] __attribute__((aligned(16))); \
-RTYPE expected_##NAME[N] __attribute__((aligned(16))); \
-\
-__attribute__((noinline)) void vector_tests_##NAME () \
-{ \
-  vector STYPE v1_##NAME, v2_##NAME; \
-  vector bool VBTYPE tmp_##NAME; \
-  int i; \
-  for (i = 0; i < N; i+=16/sizeof (STYPE)) \
-    { \
-      /* result_ne = operand1!=operand2.  */ \
-      v1_##NAME = vec_vsx_ld (0, (const vector STYPE*)&operand1_##NAME[i]); \
-      v2_##NAME = vec_vsx_ld (0, (const vector STYPE*)&operand2_##NAME[i]); \
-\
-      tmp_##NAME = vec_cmpeq (v1_##NAME, v2_##NAME); \
-      vec_vsx_st (tmp_##NAME, 0, &result_eq_##NAME[i]); \
-\
-      tmp_##NAME = vec_cmpne (v1_##NAME, v2_##NAME); \
-      vec_vsx_st (tmp_##NAME, 0, &result_ne_##NAME[i]); \
-    } \
-} \
-\
-__attribute__((noinline)) void init_##NAME () \
-{ \
-  int i; \
-  for (i = 0; i < N; ++i) \
-    { \
-      result_ne_##NAME[i] = 7; \
-      result_eq_##NAME[i] = 15; \
-      if (i%3 == 0) \
-	{ \
-	  /* op1 < op2.  */ \
-	  operand1_##NAME[i] = 1; \
-	  operand2_##NAME[i] = 2; \
-	} \
-      else if (i%3 == 1) \
-	{ \
-	  /* op1 > op2.  */ \
-	  operand1_##NAME[i] = 2; \
-	  operand2_##NAME[i] = 1; \
-	} \
-      else if (i%3 == 2) \
-	{ \
-	  /* op1 == op2.  */ \
-	  operand1_##NAME[i] = 3; \
-	  operand2_##NAME[i] = 3; \
-	} \
-      /* For vector comparisons: "For each element of the result_ne, the \
-	  value of each bit is 1 if the corresponding elements of ARG1 and \
-	  ARG2 are equal." {or whatever the comparison is} "Otherwise, the \
-	  value of each bit is 0."  */ \
-    expected_##NAME[i] = -1 * (RTYPE)(operand1_##NAME[i] != operand2_##NAME[i]); \
-  } \
-} \
-\
-__attribute__((noinline)) void verify_results_##NAME () \
-{ \
-  int i; \
-  for (i = 0; i < N; ++i) \
-    { \
-      if ( ((result_ne_##NAME[i] != expected_##NAME[i]) || \
-	    (result_ne_##NAME[i] == result_eq_##NAME[i]))) \
-	abort (); \
-    } \
-}
-
-
-#define execute_test_functions(VBTYPE, RTYPE, STYPE, NAME) \
-{ \
-  init_##NAME (); \
-  vector_tests_##NAME (); \
-  verify_results_##NAME (); \
-}
+/* Test that the vec_cmpne builtin generates the expected Altive
+   instructions.  */
 
+#include "vec-cmpne.h"
 
 define_test_functions (int, signed int, signed int, si);
 define_test_functions (int, unsigned int, unsigned int, ui);
@@ -98,17 +15,6 @@ define_test_functions (char, signed char, signed char, sc);
 define_test_functions (char, unsigned char, unsigned char, uc);
 define_test_functions (int, signed int, float, ff);
 
-int main ()
-{
-  execute_test_functions (int, signed int, signed int, si);
-  execute_test_functions (int, unsigned int, unsigned int, ui);
-  execute_test_functions (short, signed short, signed short, ss);
-  execute_test_functions (short, unsigned short, unsigned short, us);
-  execute_test_functions (char, signed char, signed char, sc);
-  execute_test_functions (char, unsigned char, unsigned char, uc);
-  execute_test_functions (int, signed int, float, ff);
-
-  return 0;
-}
-
-
+/* { dg-final { scan-assembler-times {\mvcmpequb\M}  4 } } */
+/* { dg-final { scan-assembler-times {\mvcmpequh\M}  4 } } */
+/* { dg-final { scan-assembler-times {\mvcmpequw\M}  4 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-cmpne.h b/gcc/testsuite/gcc.target/powerpc/vec-cmpne.h
new file mode 100644
index 00000000000..45c34d31d0e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-cmpne.h
@@ -0,0 +1,86 @@
+#include "altivec.h"
+
+#define N 4096
+
+void abort ();
+
+#define define_test_functions(VBTYPE, RTYPE, STYPE, NAME) \
+\
+RTYPE result_ne_##NAME[N] __attribute__((aligned(16))); \
+RTYPE result_eq_##NAME[N] __attribute__((aligned(16))); \
+STYPE operand1_##NAME[N] __attribute__((aligned(16))); \
+STYPE operand2_##NAME[N] __attribute__((aligned(16))); \
+RTYPE expected_##NAME[N] __attribute__((aligned(16))); \
+\
+__attribute__((noinline)) void vector_tests_##NAME () \
+{ \
+  vector STYPE v1_##NAME, v2_##NAME; \
+  vector bool VBTYPE tmp_##NAME; \
+  int i; \
+  for (i = 0; i < N; i+=16/sizeof (STYPE)) \
+    { \
+      /* result_ne = operand1!=operand2.  */ \
+      v1_##NAME = vec_vsx_ld (0, (const vector STYPE*)&operand1_##NAME[i]); \
+      v2_##NAME = vec_vsx_ld (0, (const vector STYPE*)&operand2_##NAME[i]); \
+\
+      tmp_##NAME = vec_cmpeq (v1_##NAME, v2_##NAME); \
+      vec_vsx_st (tmp_##NAME, 0, &result_eq_##NAME[i]); \
+\
+      tmp_##NAME = vec_cmpne (v1_##NAME, v2_##NAME); \
+      vec_vsx_st (tmp_##NAME, 0, &result_ne_##NAME[i]); \
+    } \
+} \
+
+#define define_init_verify_functions(VBTYPE, RTYPE, STYPE, NAME) \
+__attribute__((noinline)) void init_##NAME () \
+{ \
+  int i; \
+  for (i = 0; i < N; ++i) \
+    { \
+      result_ne_##NAME[i] = 7; \
+      result_eq_##NAME[i] = 15; \
+      if (i%3 == 0) \
+	{ \
+	  /* op1 < op2.  */ \
+	  operand1_##NAME[i] = 1; \
+	  operand2_##NAME[i] = 2; \
+	} \
+      else if (i%3 == 1) \
+	{ \
+	  /* op1 > op2.  */ \
+	  operand1_##NAME[i] = 2; \
+	  operand2_##NAME[i] = 1; \
+	} \
+      else if (i%3 == 2) \
+	{ \
+	  /* op1 == op2.  */ \
+	  operand1_##NAME[i] = 3; \
+	  operand2_##NAME[i] = 3; \
+	} \
+      /* For vector comparisons: "For each element of the result_ne, the \
+	  value of each bit is 1 if the corresponding elements of ARG1 and \
+	  ARG2 are equal." {or whatever the comparison is} "Otherwise, the \
+	  value of each bit is 0."  */ \
+    expected_##NAME[i] = -1 * (RTYPE)(operand1_##NAME[i] != operand2_##NAME[i]); \
+  } \
+} \
+\
+__attribute__((noinline)) void verify_results_##NAME () \
+{ \
+  int i; \
+  for (i = 0; i < N; ++i) \
+    { \
+      if ( ((result_ne_##NAME[i] != expected_##NAME[i]) || \
+	    (result_ne_##NAME[i] == result_eq_##NAME[i]))) \
+	abort (); \
+    } \
+}
+
+
+#define execute_test_functions(VBTYPE, RTYPE, STYPE, NAME) \
+{ \
+  init_##NAME (); \
+  vector_tests_##NAME (); \
+  verify_results_##NAME (); \
+}
+
-- 
2.37.2



             reply	other threads:[~2023-08-01 18:29 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-01 18:29 Carl Love [this message]
2023-08-07  9:18 ` Kewen.Lin
2023-08-07 17:37   ` Carl Love

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=18319f27b2ef6cf19de41c3c3fc32c7680c62716.camel@us.ibm.com \
    --to=cel@us.ibm.com \
    --cc=bergner@linux.ibm.com \
    --cc=dje.gcc@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=linkw@linux.ibm.com \
    --cc=segher@kernel.crashing.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).