public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH, rs6000] Add builtin support for vec_insert4b, vec_extract4b
@ 2018-02-01 19:32 Carl Love
  2018-02-06 15:47 ` Segher Boessenkool
  0 siblings, 1 reply; 4+ messages in thread
From: Carl Love @ 2018-02-01 19:32 UTC (permalink / raw)
  To: gcc-patches, David Edelsohn, Segher Boessenkool; +Cc: Bill Schmidt, cel


GCC maintainers:

The following patch contains fixes for the builtins to add and extract
a word from a char vector.  The existing names for the builtins that do
this are not consistent with the ABI.  Additionally, the supported
arguments are not all consistent with the ABI.  This patch fixes the
support for the insert and extract word builtins so they are consistent
with the "64-Bit ELF V2 ABI Specification".

The patch has been tested on: 

  powerpc64le-unknown-linux-gnu (Power 8 LE)
  powerpc64le-unknown-linux-gnu (Power 9 LE)

Let me know if the patch looks OK or not.  Let me know if you want to
include it in stage 4 or wait for the next release.  Thanks.

                   Carl Love
------------------------------------------------------------------------------

gcc/ChangeLog:

2018-01-31  Carl Love  <cel@us.ibm.com>

	* config/rs6000/altivec.h: Change vec_vextract4b to	vec_extract4b
	and vec_vinsert4b to vec_insert4b.
	* config/rs6000/rs6000-builtin.def: Remove VEXTRACT4B, VINSERT4B
	and VINSERT4B_DI definitions. Add INSERT4B and	EXTRACT4B definitions.
	* config/rs6000/rs6000-c.c: Remove P9V_BUILTIN_VEC_VEXTRACT4B
	and P9V_BUILTIN_VEC_VINSERT4B definitions. Add the definitions
	for P9V_BUILTIN_VEC_EXTRACT4B and P9V_BUILTIN_VEC_INSERT4B.
	* config/rs6000/rs6000.c (altivec_expand_builtin): Remove
	P9V_BUILTIN_VEXTRACT4B, P9V_BUILTIN_VEC_VEXTRACT4B,
	P9V_BUILTIN_VINSERT4B, P9V_BUILTIN_VINSERT4B_DI,
	P9V_BUILTIN_VEC_VINSERT4B case statements. Add P9V_BUILTIN_EXTRACT4B
	and P9V_BUILTIN_INSERT4B case statements.
	* config/rs6000/vsx.md: Remove define_expand vextract4b and
	define_insn_and_split *vextract4b_internal. Add define_insn extract4b
	definition. Rename define_expand vinsert4b to define_expand insert4b and
	define_insn *vinsert4b_internal to define_insn *insert4b_internal.
	Remove definitions for define_expand vinsert4b_di and
	define_insn *vinsert4b_di_internal.
	*doc/extend.texi: Remove documentation for the vec_vextract4b. Fix
	documentation for vec_insert4b.  Add documentation for vec_extract4b.

gcc/testsuite/ChangeLog:

2018-01-31  Carl Love  <cel@us.ibm.com>
	* gcc.target/powerpc/builtins-7-p9-runnable.c: New runnable test file.
	* gcc.target/powerpc/p9_vinsert4b-1.c: Remove file.
	* gcc.target/powerpc/p9_vinsert4b-2.c: Remove file.
---
 gcc/config/rs6000/altivec.h                        |   4 +-
 gcc/config/rs6000/rs6000-builtin.def               |   9 +-
 gcc/config/rs6000/rs6000-c.c                       |  31 +---
 gcc/config/rs6000/rs6000.c                         |   7 +-
 gcc/config/rs6000/vsx.md                           |  65 ++------
 gcc/doc/extend.texi                                |  11 +-
 .../gcc.target/powerpc/builtins-7-p9-runnable.c    | 168 +++++++++++++++++++++
 gcc/testsuite/gcc.target/powerpc/p9-vinsert4b-1.c  |  39 -----
 gcc/testsuite/gcc.target/powerpc/p9-vinsert4b-2.c  |  30 ----
 9 files changed, 197 insertions(+), 167 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/builtins-7-p9-runnable.c
 delete mode 100644 gcc/testsuite/gcc.target/powerpc/p9-vinsert4b-1.c
 delete mode 100644 gcc/testsuite/gcc.target/powerpc/p9-vinsert4b-2.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 684cb1990..1e495e69c 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -433,8 +433,8 @@
 #define vec_vctzd __builtin_vec_vctzd
 #define vec_vctzh __builtin_vec_vctzh
 #define vec_vctzw __builtin_vec_vctzw
-#define vec_vextract4b __builtin_vec_vextract4b
-#define vec_vinsert4b __builtin_vec_vinsert4b
+#define vec_extract4b __builtin_vec_extract4b
+#define vec_insert4b __builtin_vec_insert4b
 #define vec_vprtyb __builtin_vec_vprtyb
 #define vec_vprtybd __builtin_vec_vprtybd
 #define vec_vprtybw __builtin_vec_vprtybw
diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index 86604da46..37595cc29 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -2226,9 +2226,8 @@ BU_P9V_AV_2 (VEXTUWLX, "vextuwlx",		CONST,	vextuwlx)
 BU_P9V_AV_2 (VEXTUWRX, "vextuwrx",		CONST,	vextuwrx)
 
 /* Insert/extract 4 byte word into a vector.  */
-BU_P9V_VSX_2 (VEXTRACT4B,   "vextract4b",	CONST,	vextract4b)
-BU_P9V_VSX_3 (VINSERT4B,    "vinsert4b",	CONST,	vinsert4b)
-BU_P9V_VSX_3 (VINSERT4B_DI, "vinsert4b_di",	CONST,	vinsert4b_di)
+BU_P9V_VSX_3 (INSERT4B,    "insert4b",		CONST,	insert4b)
+BU_P9V_VSX_2 (EXTRACT4B,   "extract4b", 	CONST,	extract4b)
 
 /* Hardware IEEE 128-bit floating point round to odd instrucitons added in ISA
    3.0 (power9).  */
@@ -2290,12 +2289,12 @@ BU_P9V_OVERLOAD_2 (LXVL,	"lxvl")
 BU_P9V_OVERLOAD_2 (XL_LEN_R,	"xl_len_r")
 BU_P9V_OVERLOAD_2 (VEXTULX,	"vextulx")
 BU_P9V_OVERLOAD_2 (VEXTURX,	"vexturx")
-BU_P9V_OVERLOAD_2 (VEXTRACT4B,	"vextract4b")
+BU_P9V_OVERLOAD_2 (EXTRACT4B,	"extract4b")
 
 /* ISA 3.0 Vector scalar overloaded 3 argument functions */
 BU_P9V_OVERLOAD_3 (STXVL,	"stxvl")
 BU_P9V_OVERLOAD_3 (XST_LEN_R,	"xst_len_r")
-BU_P9V_OVERLOAD_3 (VINSERT4B,	"vinsert4b")
+BU_P9V_OVERLOAD_3 (INSERT4B,	"insert4b")
 
 /* Overloaded CMPNE support was implemented prior to Power 9,
    so is not mentioned here.  */
diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index a68be511c..24675b12e 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -5429,10 +5429,8 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
   { P9V_BUILTIN_VEC_VCTZLSBB, P9V_BUILTIN_VCTZLSBB_V4SI,
     RS6000_BTI_INTSI, RS6000_BTI_V4SI, 0, 0 },
 
-  { P9V_BUILTIN_VEC_VEXTRACT4B, P9V_BUILTIN_VEXTRACT4B,
-    RS6000_BTI_INTDI, RS6000_BTI_V16QI, RS6000_BTI_UINTSI, 0 },
-  { P9V_BUILTIN_VEC_VEXTRACT4B, P9V_BUILTIN_VEXTRACT4B,
-    RS6000_BTI_INTDI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTSI, 0 },
+  { P9V_BUILTIN_VEC_EXTRACT4B, P9V_BUILTIN_EXTRACT4B,
+    RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, 0 },
 
   { P9V_BUILTIN_VEC_VEXTRACT_FP_FROM_SHORTH, P9V_BUILTIN_VEXTRACT_FP_FROM_SHORTH,
     RS6000_BTI_V4SF, RS6000_BTI_unsigned_V8HI, 0, 0 },
@@ -5492,27 +5490,12 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
   { P8V_BUILTIN_VEC_VGBBD, P8V_BUILTIN_VGBBD,
     RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0, 0 },
 
-  { P9V_BUILTIN_VEC_VINSERT4B, P9V_BUILTIN_VINSERT4B,
-    RS6000_BTI_V16QI, RS6000_BTI_V4SI,
-    RS6000_BTI_V16QI, RS6000_BTI_UINTSI },
-  { P9V_BUILTIN_VEC_VINSERT4B, P9V_BUILTIN_VINSERT4B,
-    RS6000_BTI_V16QI, RS6000_BTI_unsigned_V4SI,
-    RS6000_BTI_V16QI, RS6000_BTI_UINTSI },
-  { P9V_BUILTIN_VEC_VINSERT4B, P9V_BUILTIN_VINSERT4B,
+  { P9V_BUILTIN_VEC_INSERT4B, P9V_BUILTIN_INSERT4B,
+    RS6000_BTI_unsigned_V16QI, RS6000_BTI_V4SI,
+    RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI },
+  { P9V_BUILTIN_VEC_INSERT4B, P9V_BUILTIN_INSERT4B,
     RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V4SI,
-    RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTSI },
-  { P9V_BUILTIN_VEC_VINSERT4B, P9V_BUILTIN_VINSERT4B_DI,
-    RS6000_BTI_V16QI, RS6000_BTI_INTDI,
-    RS6000_BTI_V16QI, RS6000_BTI_UINTDI },
-  { P9V_BUILTIN_VEC_VINSERT4B, P9V_BUILTIN_VINSERT4B_DI,
-    RS6000_BTI_V16QI, RS6000_BTI_UINTDI,
-    RS6000_BTI_V16QI, RS6000_BTI_UINTDI },
-  { P9V_BUILTIN_VEC_VINSERT4B, P9V_BUILTIN_VINSERT4B_DI,
-    RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTDI,
-    RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTDI },
-  { P9V_BUILTIN_VEC_VINSERT4B, P9V_BUILTIN_VINSERT4B_DI,
-    RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTDI,
-    RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTDI },
+    RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI },
 
   { P8V_BUILTIN_VEC_VADDECUQ, P8V_BUILTIN_VADDECUQ,
     RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI },
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index a37ebd88c..cf6efaafb 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -15723,8 +15723,7 @@ altivec_expand_builtin (tree exp, rtx target, bool *expandedp)
     case VSX_BUILTIN_VEC_EXT_V1TI:
       return altivec_expand_vec_ext_builtin (exp, target);
 
-    case P9V_BUILTIN_VEXTRACT4B:
-    case P9V_BUILTIN_VEC_VEXTRACT4B:
+    case P9V_BUILTIN_EXTRACT4B:
       arg1 = CALL_EXPR_ARG (exp, 1);
       STRIP_NOPS (arg1);
 
@@ -15739,9 +15738,7 @@ altivec_expand_builtin (tree exp, rtx target, bool *expandedp)
 	}
       break;
 
-    case P9V_BUILTIN_VINSERT4B:
-    case P9V_BUILTIN_VINSERT4B_DI:
-    case P9V_BUILTIN_VEC_VINSERT4B:
+    case P9V_BUILTIN_INSERT4B:
       arg2 = CALL_EXPR_ARG (exp, 2);
       STRIP_NOPS (arg2);
 
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index c2016f1bf..693609246 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -5152,45 +5152,20 @@
 ;; Vector insert/extract word at arbitrary byte values.  Note, the little
 ;; endian version needs to adjust the byte number, and the V4SI element in
 ;; vinsert4b.
-(define_expand "vextract4b"
-  [(set (match_operand:DI 0 "gpc_reg_operand")
-	(unspec:DI [(match_operand:V16QI 1 "vsx_register_operand")
-		    (match_operand:QI 2 "const_0_to_12_operand")]
-		   UNSPEC_XXEXTRACTUW))]
+(define_insn "extract4b"
+  [(set (match_operand:V2DI 0 "vsx_register_operand")
+	(unspec:V2DI [(match_operand:V16QI 1 "vsx_register_operand" "wa")
+		      (match_operand:QI 2 "const_0_to_12_operand" "n")]
+		     UNSPEC_XXEXTRACTUW))]
   "TARGET_P9_VECTOR"
 {
   if (!VECTOR_ELT_ORDER_BIG)
     operands[2] = GEN_INT (12 - INTVAL (operands[2]));
-})
 
-(define_insn_and_split "*vextract4b_internal"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=wj,r")
-	(unspec:DI [(match_operand:V16QI 1 "vsx_register_operand" "wa,v")
-		    (match_operand:QI 2 "const_0_to_12_operand" "n,n")]
-		   UNSPEC_XXEXTRACTUW))]
-  "TARGET_P9_VECTOR"
-  "@
-   xxextractuw %x0,%x1,%2
-   #"
-  "&& reload_completed && int_reg_operand (operands[0], DImode)"
-  [(const_int 0)]
-{
-  rtx op0 = operands[0];
-  rtx op1 = operands[1];
-  rtx op2 = operands[2];
-  rtx op0_si = gen_rtx_REG (SImode, REGNO (op0));
-  rtx op1_v4si = gen_rtx_REG (V4SImode, REGNO (op1));
-
-  emit_move_insn (op0, op2);
-  if (VECTOR_ELT_ORDER_BIG)
-    emit_insn (gen_vextuwlx (op0_si, op0_si, op1_v4si));
-  else
-    emit_insn (gen_vextuwrx (op0_si, op0_si, op1_v4si));
-  DONE;
-}
-  [(set_attr "type" "vecperm")])
+  return "xxextractuw %0,%1,%2";
+})
 
-(define_expand "vinsert4b"
+(define_expand "insert4b"
   [(set (match_operand:V16QI 0 "vsx_register_operand")
 	(unspec:V16QI [(match_operand:V4SI 1 "vsx_register_operand")
 		       (match_operand:V16QI 2 "vsx_register_operand")
@@ -5208,7 +5183,7 @@
     }
 })
 
-(define_insn "*vinsert4b_internal"
+(define_insn "*insert4b_internal"
   [(set (match_operand:V16QI 0 "vsx_register_operand" "=wa")
 	(unspec:V16QI [(match_operand:V4SI 1 "vsx_register_operand" "wa")
 		       (match_operand:V16QI 2 "vsx_register_operand" "0")
@@ -5218,28 +5193,6 @@
   "xxinsertw %x0,%x1,%3"
   [(set_attr "type" "vecperm")])
 
-(define_expand "vinsert4b_di"
-  [(set (match_operand:V16QI 0 "vsx_register_operand")
-	(unspec:V16QI [(match_operand:DI 1 "vsx_register_operand")
-		       (match_operand:V16QI 2 "vsx_register_operand")
-		       (match_operand:QI 3 "const_0_to_12_operand")]
-		   UNSPEC_XXINSERTW))]
-  "TARGET_P9_VECTOR"
-{
-  if (!VECTOR_ELT_ORDER_BIG)
-    operands[3] = GEN_INT (12 - INTVAL (operands[3]));
-})
-
-(define_insn "*vinsert4b_di_internal"
-  [(set (match_operand:V16QI 0 "vsx_register_operand" "=wa")
-	(unspec:V16QI [(match_operand:DI 1 "vsx_register_operand" "wj")
-		       (match_operand:V16QI 2 "vsx_register_operand" "0")
-		       (match_operand:QI 3 "const_0_to_12_operand" "n")]
-		   UNSPEC_XXINSERTW))]
-  "TARGET_P9_VECTOR"
-  "xxinsertw %x0,%x1,%3"
-  [(set_attr "type" "vecperm")])
-
 ;; Generate vector extract four float 32 values from left four elements
 ;; of eight element vector of float 16 values.
 (define_expand "vextract_fp_from_shorth"
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index cb9df971a..d1735d2ff 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -19054,14 +19054,13 @@ vector unsigned short vec_vctzh (vector unsigned short);
 vector int vec_vctzw (vector int);
 vector unsigned int vec_vctzw (vector int);
 
-long long vec_vextract4b (const vector signed char, const int);
-long long vec_vextract4b (const vector unsigned char, const int);
+vector unsigned long long vec_extract4b (const vector unsigned char,
+                                         const signed int);
 
-vector signed char vec_insert4b (vector int, vector signed char, const int);
+vector unsigned char vec_insert4b (vector int, vector unsigned char,
+                                   const signed int);
 vector unsigned char vec_insert4b (vector unsigned int, vector unsigned char,
-                                   const int);
-vector signed char vec_insert4b (long long, vector signed char, const int);
-vector unsigned char vec_insert4b (long long, vector unsigned char, const int);
+                                   const signed int);
 
 vector unsigned int vec_parity_lsbb (vector signed int);
 vector unsigned int vec_parity_lsbb (vector unsigned int);
diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-7-p9-runnable.c b/gcc/testsuite/gcc.target/powerpc/builtins-7-p9-runnable.c
new file mode 100644
index 000000000..ed3e3ff0a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-7-p9-runnable.c
@@ -0,0 +1,168 @@
+/* { dg-do run { target { powerpc64*-*-* && p9vector_hw } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2" } */
+
+#include <altivec.h>
+#define TRUE 1
+#define FALSE 0
+
+#ifdef DEBUG
+#include <stdio.h>
+#endif
+
+#define EXTRACT 0
+
+void abort (void);
+
+int result_wrong_ull (vector unsigned long long vec_expected,
+		      vector unsigned long long vec_actual)
+{
+  int i;
+
+  for (i = 0; i < 2; i++)
+    if (vec_expected[i] != vec_actual[i])
+      return TRUE;
+
+  return FALSE;
+}
+
+void print_ull (vector unsigned long long vec_expected,
+		vector unsigned long long vec_actual)
+{
+  int i;
+
+  printf("expected unsigned long long data\n");
+  for (i = 0; i < 2; i++)
+    printf(" %lld,", vec_expected[i]);
+
+  printf("\nactual signed char data\n");
+  for (i = 0; i < 2; i++)
+    printf(" %lld,", vec_actual[i]);
+  printf("\n");
+}
+
+int result_wrong_uc (vector unsigned char vec_expected,
+		     vector unsigned char vec_actual)
+{
+  int i;
+
+  for (i = 0; i < 16; i++)
+    if (vec_expected[i] != vec_actual[i])
+      return TRUE;
+
+  return FALSE;
+}
+
+void print_uc (vector unsigned char vec_expected,
+	       vector unsigned char vec_actual)
+{
+  int i;
+
+  printf("expected unsigned char data\n");
+  for (i = 0; i < 16; i++)
+    printf(" %d,", vec_expected[i]);
+
+  printf("\nactual unsigned char data\n");
+  for (i = 0; i < 16; i++)
+    printf(" %d,", vec_actual[i]);
+  printf("\n");
+}
+
+
+#if EXTRACT
+vector unsigned long long
+vext (vector unsigned char *vc)
+{
+  return vextract_si_vchar (*vc, 5);
+}
+#endif
+
+int main()
+{
+   vector signed int vsi_arg;
+   vector unsigned char vec_uc_arg, vec_uc_result, vec_uc_expected;
+   vector unsigned long long vec_ull_result, vec_ull_expected;
+   unsigned long long ull_result, ull_expected;
+
+   vec_uc_arg = (vector unsigned char){1, 2, 3, 4,
+				       5, 6, 7, 8,
+				       9, 10, 11, 12,
+				       13, 14, 15, 16};
+
+   vsi_arg = (vector signed int){0xA, 0xB, 0xC, 0xD};
+
+   vec_uc_expected = (vector unsigned char){0xC, 0, 0, 0,
+					    5, 6, 7, 8,
+					    9, 10, 11, 12,
+					    13, 14, 15, 16};
+   /* Test vec_insert4b() */
+   /* Insert into char 0 location */
+   vec_uc_result = vec_insert4b (vsi_arg, vec_uc_arg, 0);
+   
+   if (result_wrong_uc(vec_uc_expected, vec_uc_result))
+     {
+#ifdef DEBUG
+        printf("Error: vec_insert4b pos 0, result does not match expected result\n");
+	print_uc (vec_uc_expected, vec_uc_result);
+#else
+        abort();
+#endif
+      }
+
+   /* insert into char 4 location */
+   vec_uc_expected = (vector unsigned char){1, 2, 3, 4,
+					    0xC, 0, 0, 0,
+					    9, 10, 11, 12,
+					    13, 14, 15, 16};
+   vec_uc_result = vec_insert4b (vsi_arg, vec_uc_arg, 4);
+   
+   if (result_wrong_uc(vec_uc_expected, vec_uc_result))
+     {
+#ifdef DEBUG
+        printf("Error: vec_insert4b pos 4, result does not match expected result\n");
+	print_uc (vec_uc_expected, vec_uc_result);
+#else
+        abort();
+#endif
+      }
+
+   /* Test vec_extract4b() */
+   /* Extract 4b, from char 0 location */
+   vec_uc_arg = (vector unsigned char){10, 0, 0, 0,
+				       20, 0, 0, 0,
+				       30, 0, 0, 0,
+				       40, 0, 0, 0};
+
+   vec_ull_expected = (vector unsigned long long){0, 10};
+   vec_ull_result = vec_extract4b(vec_uc_arg, 0);
+
+   if (result_wrong_ull(vec_ull_expected, vec_ull_result))
+     {
+#ifdef DEBUG
+        printf("Error: vec_extract4b pos 0, result does not match expected result\n");
+	print_ull (vec_ull_expected, vec_ull_result);
+#else
+        abort();
+#endif
+      }
+
+   /* Extract 4b, from char 12 location */
+   vec_uc_arg = (vector unsigned char){10, 0, 0, 0,
+				       20, 0, 0, 0,
+				       30, 0, 0, 0,
+				       40, 0, 0, 0};
+
+   vec_ull_expected = (vector unsigned long long){0, 40};
+   vec_ull_result = vec_extract4b(vec_uc_arg, 12);
+
+   if (result_wrong_ull(vec_ull_expected, vec_ull_result))
+     {
+#ifdef DEBUG
+        printf("Error: vec_extract4b pos 12, result does not match expected result\n");
+	print_ull (vec_ull_expected, vec_ull_result);
+#else
+        abort();
+#endif
+      }
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/p9-vinsert4b-1.c b/gcc/testsuite/gcc.target/powerpc/p9-vinsert4b-1.c
deleted file mode 100644
index fa1ba7547..000000000
--- a/gcc/testsuite/gcc.target/powerpc/p9-vinsert4b-1.c
+++ /dev/null
@@ -1,39 +0,0 @@
-/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
-/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
-/* { dg-require-effective-target powerpc_p9vector_ok } */
-/* { dg-options "-mcpu=power9 -O2" } */
-
-#include <altivec.h>
-
-vector signed char
-vins_v4si (vector int *vi, vector signed char *vc)
-{
-  return vec_vinsert4b (*vi, *vc, 1);
-}
-
-vector unsigned char
-vins_di (long di, vector unsigned char *vc)
-{
-  return vec_vinsert4b (di, *vc, 2);
-}
-
-vector char
-vins_di2 (long *p_di, vector char *vc)
-{
-  return vec_vinsert4b (*p_di, *vc, 3);
-}
-
-vector unsigned char
-vins_di0 (vector unsigned char *vc)
-{
-  return vec_vinsert4b (0, *vc, 4);
-}
-
-long
-vext (vector signed char *vc)
-{
-  return vec_vextract4b (*vc, 5);
-}
-
-/* { dg-final { scan-assembler "xxextractuw\|vextuw\[lr\]x" } } */
-/* { dg-final { scan-assembler "xxinsertw" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/p9-vinsert4b-2.c b/gcc/testsuite/gcc.target/powerpc/p9-vinsert4b-2.c
deleted file mode 100644
index 3b5872ebe..000000000
--- a/gcc/testsuite/gcc.target/powerpc/p9-vinsert4b-2.c
+++ /dev/null
@@ -1,30 +0,0 @@
-/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
-/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
-/* { dg-require-effective-target powerpc_p9vector_ok } */
-/* { dg-options "-mcpu=power9 -O2" } */
-
-#include <altivec.h>
-
-vector signed char
-ins_v4si (vector int vi, vector signed char vc)
-{
-  return vec_vinsert4b (vi, vc, 13);	/* { dg-error "vec_vinsert4b" } */
-}
-
-vector unsigned char
-ins_di (long di, vector unsigned char vc, long n)
-{
-  return vec_vinsert4b (di, vc, n);	/* { dg-error "vec_vinsert4b" } */
-}
-
-long
-vext1 (vector signed char vc)
-{
-  return vec_vextract4b (vc, 13);	/* { dg-error "vec_vextract4b" } */
-}
-
-long
-vextn (vector unsigned char vc, long n)
-{
-  return vec_vextract4b (vc, n);	/* { dg-error "vec_vextract4b" } */
-}
-- 
2.11.0

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH, rs6000] Add builtin support for vec_insert4b, vec_extract4b
  2018-02-01 19:32 [PATCH, rs6000] Add builtin support for vec_insert4b, vec_extract4b Carl Love
@ 2018-02-06 15:47 ` Segher Boessenkool
  2018-02-14 20:08   ` Carl Love
  0 siblings, 1 reply; 4+ messages in thread
From: Segher Boessenkool @ 2018-02-06 15:47 UTC (permalink / raw)
  To: Carl Love; +Cc: gcc-patches, David Edelsohn, Bill Schmidt

Hi  Carl,

On Thu, Feb 01, 2018 at 11:31:55AM -0800, Carl Love wrote:
> The following patch contains fixes for the builtins to add and extract
> a word from a char vector.  The existing names for the builtins that do
> this are not consistent with the ABI.  Additionally, the supported
> arguments are not all consistent with the ABI.  This patch fixes the
> support for the insert and extract word builtins so they are consistent
> with the "64-Bit ELF V2 ABI Specification".

The patch is hard to review because it does multiple things at once.
Would have been easier if you first add the new builtins and then delete
the old one.  But I'll try :-)

> Let me know if the patch looks OK or not.  Let me know if you want to
> include it in stage 4 or wait for the next release.  Thanks.

It should also go to GCC 7, right?

> 2018-01-31  Carl Love  <cel@us.ibm.com>
> 
> 	* config/rs6000/altivec.h: Change vec_vextract4b to	vec_extract4b

You have a tab before vec_extract4b there.

> 	and vec_vinsert4b to vec_insert4b.
> 	* config/rs6000/rs6000-builtin.def: Remove VEXTRACT4B, VINSERT4B
> 	and VINSERT4B_DI definitions. Add INSERT4B and	EXTRACT4B definitions.

Two spaces after dot (numerous times, this is the first one).  You have
a tab before EXTRACT4B.

	* config/rs6000/rs6000-builtin.def (VEXTRACT4B, VINSERT4B): Delete.
	(EXTRACT4B, INSERT4B): New.

(And similar to all below).

> 	*doc/extend.texi: Remove documentation for the vec_vextract4b. Fix
> 	documentation for vec_insert4b.  Add documentation for vec_extract4b.

Space after *.

> 2018-01-31  Carl Love  <cel@us.ibm.com>
> 	* gcc.target/powerpc/builtins-7-p9-runnable.c: New runnable test file.
> 	* gcc.target/powerpc/p9_vinsert4b-1.c: Remove file.
> 	* gcc.target/powerpc/p9_vinsert4b-2.c: Remove file.

The files are called p9-vinsert* in fact.  Is all what they tested now in
builtins-7-p9-runnable.c ?

> +(define_insn "extract4b"
> +  [(set (match_operand:V2DI 0 "vsx_register_operand")
> +	(unspec:V2DI [(match_operand:V16QI 1 "vsx_register_operand" "wa")
> +		      (match_operand:QI 2 "const_0_to_12_operand" "n")]
> +		     UNSPEC_XXEXTRACTUW))]
>    "TARGET_P9_VECTOR"
>  {
>    if (!VECTOR_ELT_ORDER_BIG)
>      operands[2] = GEN_INT (12 - INTVAL (operands[2]));
>  
> +  return "xxextractuw %0,%1,%2";
> +})

You need %x0 and %x1 here I think?

> -vector signed char vec_insert4b (vector int, vector signed char, const int);
> +vector unsigned char vec_insert4b (vector int, vector unsigned char,
> +                                   const signed int);

Maybe just write "int" instead of "signed int"?

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/builtins-7-p9-runnable.c
> @@ -0,0 +1,168 @@
> +/* { dg-do run { target { powerpc64*-*-* && p9vector_hw } } } */

As always: why powerpc64* instead of powerpc*?


Segher

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH, rs6000] Add builtin support for vec_insert4b, vec_extract4b
  2018-02-06 15:47 ` Segher Boessenkool
@ 2018-02-14 20:08   ` Carl Love
  2018-02-16 14:26     ` Segher Boessenkool
  0 siblings, 1 reply; 4+ messages in thread
From: Carl Love @ 2018-02-14 20:08 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, David Edelsohn, Bill Schmidt

GCC maintainers:

Per Segher's comments on the first version of the patch.  I split the
patch into two.  The first patch (this one) adds the ABI specified
vec_insert4b and vec_extract builtins.  It adds a runnable file to test
the ABI specified builtin instances.  Note, the runnable test file does
not test for illegal argument values such as the const int second
argument > 12 or of the wrong type.

Note, the rtl for vec_insert4b in vsx.md is a copy of the vec_vinsert4b
code with the name changed.  The rtl for vec_extract4b is new.

The second patch removes all of the non-ABI builtin support.

Additionally, I have addressed the other comments from Segher with
regards to formatting issues and rtl register specification.

This patch has been tested on:

  powerpc64le-unknown-linux-gnu (Power 8 LE)
  powerpc64le-unknown-linux-gnu (Power 9 LE)

with no regressions.

Let me know if the patch looks OK or not. Thanks.

The patch should also be ported to GCC 7 so we are in compliance with
the ABI.

                           Carl Love

-----------------------------------------------------------------------

    gcc/ChangeLog:

    2018-02-13  Carl Love  <cel@us.ibm.com>

        * config/rs6000/altivec.h: Add builtin names vec_extract4b
        vec_insert4b.
        * config/rs6000/rs6000-builtin.def: Add INSERT4B and EXTRACT4B
        definitions.
        * config/rs6000/rs6000-c.c: Add the definitions for
        P9V_BUILTIN_VEC_EXTRACT4B and P9V_BUILTIN_VEC_INSERT4B.
        * config/rs6000/rs6000.c (altivec_expand_builtin): Add
        P9V_BUILTIN_EXTRACT4B and P9V_BUILTIN_INSERT4B case statements.
        * config/rs6000/vsx.md: Add define_insn extract4b.  Add define_expand
	definition for insert4b and define insn *insert3b_internal.
        * doc/extend.texi: Add documentation for vec_extract4b.

    gcc/testsuite/ChangeLog:

    2018-02-13  Carl Love  <cel@us.ibm.com>
        * gcc.target/powerpc/builtins-7-p9-runnable.c: New runnable test file
        for the ABI definitions for vec_extract4b and vec_insert4b.
---
 gcc/config/rs6000/altivec.h                        |   2 +
 gcc/config/rs6000/rs6000-builtin.def               |   4 +
 gcc/config/rs6000/rs6000-c.c                       |   8 +
 gcc/config/rs6000/rs6000.c                         |   2 +
 gcc/config/rs6000/vsx.md                           |  41 +++++
 gcc/doc/extend.texi                                |   7 +
 .../gcc.target/powerpc/builtins-7-p9-runnable.c    | 169 +++++++++++++++++++++
 7 files changed, 233 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/builtins-7-p9-runnable.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 684cb1990..3bce2ae39 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -435,6 +435,8 @@
 #define vec_vctzw __builtin_vec_vctzw
 #define vec_vextract4b __builtin_vec_vextract4b
 #define vec_vinsert4b __builtin_vec_vinsert4b
+#define vec_extract4b __builtin_vec_extract4b
+#define vec_insert4b __builtin_vec_insert4b
 #define vec_vprtyb __builtin_vec_vprtyb
 #define vec_vprtybd __builtin_vec_vprtybd
 #define vec_vprtybw __builtin_vec_vprtybw
diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index 86604da46..420d12e29 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -2229,6 +2229,8 @@ BU_P9V_AV_2 (VEXTUWRX, "vextuwrx",		CONST,	vextuwrx)
 BU_P9V_VSX_2 (VEXTRACT4B,   "vextract4b",	CONST,	vextract4b)
 BU_P9V_VSX_3 (VINSERT4B,    "vinsert4b",	CONST,	vinsert4b)
 BU_P9V_VSX_3 (VINSERT4B_DI, "vinsert4b_di",	CONST,	vinsert4b_di)
+BU_P9V_VSX_3 (INSERT4B,    "insert4b",		CONST,  insert4b)
+BU_P9V_VSX_2 (EXTRACT4B,   "extract4b", 	CONST,  extract4b)
 
 /* Hardware IEEE 128-bit floating point round to odd instrucitons added in ISA
    3.0 (power9).  */
@@ -2291,11 +2293,13 @@ BU_P9V_OVERLOAD_2 (XL_LEN_R,	"xl_len_r")
 BU_P9V_OVERLOAD_2 (VEXTULX,	"vextulx")
 BU_P9V_OVERLOAD_2 (VEXTURX,	"vexturx")
 BU_P9V_OVERLOAD_2 (VEXTRACT4B,	"vextract4b")
+BU_P9V_OVERLOAD_2 (EXTRACT4B,  "extract4b")
 
 /* ISA 3.0 Vector scalar overloaded 3 argument functions */
 BU_P9V_OVERLOAD_3 (STXVL,	"stxvl")
 BU_P9V_OVERLOAD_3 (XST_LEN_R,	"xst_len_r")
 BU_P9V_OVERLOAD_3 (VINSERT4B,	"vinsert4b")
+BU_P9V_OVERLOAD_3 (INSERT4B,    "insert4b")
 
 /* Overloaded CMPNE support was implemented prior to Power 9,
    so is not mentioned here.  */
diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index a68be511c..56e66db98 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -5433,6 +5433,8 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
     RS6000_BTI_INTDI, RS6000_BTI_V16QI, RS6000_BTI_UINTSI, 0 },
   { P9V_BUILTIN_VEC_VEXTRACT4B, P9V_BUILTIN_VEXTRACT4B,
     RS6000_BTI_INTDI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTSI, 0 },
+  { P9V_BUILTIN_VEC_EXTRACT4B, P9V_BUILTIN_EXTRACT4B,
+    RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, 0 },
 
   { P9V_BUILTIN_VEC_VEXTRACT_FP_FROM_SHORTH, P9V_BUILTIN_VEXTRACT_FP_FROM_SHORTH,
     RS6000_BTI_V4SF, RS6000_BTI_unsigned_V8HI, 0, 0 },
@@ -5492,6 +5494,12 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
   { P8V_BUILTIN_VEC_VGBBD, P8V_BUILTIN_VGBBD,
     RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0, 0 },
 
+  { P9V_BUILTIN_VEC_INSERT4B, P9V_BUILTIN_INSERT4B,
+    RS6000_BTI_unsigned_V16QI, RS6000_BTI_V4SI,
+    RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI },
+  { P9V_BUILTIN_VEC_INSERT4B, P9V_BUILTIN_INSERT4B,
+    RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V4SI,
+    RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI },
   { P9V_BUILTIN_VEC_VINSERT4B, P9V_BUILTIN_VINSERT4B,
     RS6000_BTI_V16QI, RS6000_BTI_V4SI,
     RS6000_BTI_V16QI, RS6000_BTI_UINTSI },
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 6a6801aad..f8d8b9687 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -15730,6 +15730,7 @@ altivec_expand_builtin (tree exp, rtx target, bool *expandedp)
 
     case P9V_BUILTIN_VEXTRACT4B:
     case P9V_BUILTIN_VEC_VEXTRACT4B:
+    case P9V_BUILTIN_VEC_EXTRACT4B:
       arg1 = CALL_EXPR_ARG (exp, 1);
       STRIP_NOPS (arg1);
 
@@ -15747,6 +15748,7 @@ altivec_expand_builtin (tree exp, rtx target, bool *expandedp)
     case P9V_BUILTIN_VINSERT4B:
     case P9V_BUILTIN_VINSERT4B_DI:
     case P9V_BUILTIN_VEC_VINSERT4B:
+    case P9V_BUILTIN_VEC_INSERT4B:
       arg2 = CALL_EXPR_ARG (exp, 2);
       STRIP_NOPS (arg2);
 
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 86efdced2..266923f98 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -5204,6 +5204,47 @@
 ;; Vector insert/extract word at arbitrary byte values.  Note, the little
 ;; endian version needs to adjust the byte number, and the V4SI element in
 ;; vinsert4b.
+(define_insn "extract4b"
+  [(set (match_operand:V2DI 0 "vsx_register_operand")
+       (unspec:V2DI [(match_operand:V16QI 1 "vsx_register_operand" "wa")
+                     (match_operand:QI 2 "const_0_to_12_operand" "n")]
+                    UNSPEC_XXEXTRACTUW))]
+  "TARGET_P9_VECTOR"
+{
+  if (!VECTOR_ELT_ORDER_BIG)
+    operands[2] = GEN_INT (12 - INTVAL (operands[2]));
+
+  return "xxextractuw %x0,%x1,%2";
+})
+
+(define_expand "insert4b"
+  [(set (match_operand:V16QI 0 "vsx_register_operand")
+	(unspec:V16QI [(match_operand:V4SI 1 "vsx_register_operand")
+		       (match_operand:V16QI 2 "vsx_register_operand")
+		       (match_operand:QI 3 "const_0_to_12_operand")]
+		   UNSPEC_XXINSERTW))]
+  "TARGET_P9_VECTOR"
+{
+  if (!VECTOR_ELT_ORDER_BIG)
+    {
+      rtx op1 = operands[1];
+      rtx v4si_tmp = gen_reg_rtx (V4SImode);
+      emit_insn (gen_vsx_xxpermdi_v4si_be (v4si_tmp, op1, op1, const1_rtx));
+      operands[1] = v4si_tmp;
+      operands[3] = GEN_INT (12 - INTVAL (operands[3]));
+    }
+})
+
+(define_insn "*insert4b_internal"
+  [(set (match_operand:V16QI 0 "vsx_register_operand" "=wa")
+	(unspec:V16QI [(match_operand:V4SI 1 "vsx_register_operand" "wa")
+		       (match_operand:V16QI 2 "vsx_register_operand" "0")
+		       (match_operand:QI 3 "const_0_to_12_operand" "n")]
+		   UNSPEC_XXINSERTW))]
+  "TARGET_P9_VECTOR"
+  "xxinsertw %x0,%x1,%3"
+  [(set_attr "type" "vecperm")])
+
 (define_expand "vextract4b"
   [(set (match_operand:DI 0 "gpc_reg_operand")
 	(unspec:DI [(match_operand:V16QI 1 "vsx_register_operand")
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index cb9df971a..13dbac42e 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -19055,8 +19055,15 @@ vector int vec_vctzw (vector int);
 vector unsigned int vec_vctzw (vector int);
 
 long long vec_vextract4b (const vector signed char, const int);
+vector unsigned long long vec_extract4b (vector unsigned char,
+                                         const int);
+long long vec_extract4b (const vector signed char, const int);
 long long vec_vextract4b (const vector unsigned char, const int);
 
+vector unsigned char vec_insert4b (vector signed int, vector unsigned char,
+                                   const int);
+vector unsigned char vec_insert4b (vector unsigned int, vector unsigned char,
+                                   const int);
 vector signed char vec_insert4b (vector int, vector signed char, const int);
 vector unsigned char vec_insert4b (vector unsigned int, vector unsigned char,
                                    const int);
diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-7-p9-runnable.c b/gcc/testsuite/gcc.target/powerpc/builtins-7-p9-runnable.c
new file mode 100644
index 000000000..137b46b05
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-7-p9-runnable.c
@@ -0,0 +1,169 @@
+/* { dg-do run { target { powerpc*-*-* && p9vector_hw } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2" } */
+
+#include <altivec.h>
+#define TRUE 1
+#define FALSE 0
+
+#ifdef DEBUG
+#include <stdio.h>
+#endif
+
+#define EXTRACT 0
+
+void abort (void);
+
+int result_wrong_ull (vector unsigned long long vec_expected,
+		      vector unsigned long long vec_actual)
+{
+  int i;
+
+  for (i = 0; i < 2; i++)
+    if (vec_expected[i] != vec_actual[i])
+      return TRUE;
+
+  return FALSE;
+}
+
+int result_wrong_uc (vector unsigned char vec_expected,
+		     vector unsigned char vec_actual)
+{
+  int i;
+
+  for (i = 0; i < 16; i++)
+    if (vec_expected[i] != vec_actual[i])
+      return TRUE;
+
+  return FALSE;
+}
+
+#ifdef DEBUG
+void print_ull (vector unsigned long long vec_expected,
+		vector unsigned long long vec_actual)
+{
+  int i;
+
+  printf("expected unsigned long long data\n");
+  for (i = 0; i < 2; i++)
+    printf(" %lld,", vec_expected[i]);
+
+  printf("\nactual signed char data\n");
+  for (i = 0; i < 2; i++)
+    printf(" %lld,", vec_actual[i]);
+  printf("\n");
+}
+
+void print_uc (vector unsigned char vec_expected,
+	       vector unsigned char vec_actual)
+{
+  int i;
+
+  printf("expected unsigned char data\n");
+  for (i = 0; i < 16; i++)
+    printf(" %d,", vec_expected[i]);
+
+  printf("\nactual unsigned char data\n");
+  for (i = 0; i < 16; i++)
+    printf(" %d,", vec_actual[i]);
+  printf("\n");
+}
+#endif
+
+#if EXTRACT
+vector unsigned long long
+vext (vector unsigned char *vc)
+{
+  return vextract_si_vchar (*vc, 5);
+}
+#endif
+
+int main()
+{
+   vector signed int vsi_arg;
+   vector unsigned char vec_uc_arg, vec_uc_result, vec_uc_expected;
+   vector unsigned long long vec_ull_result, vec_ull_expected;
+   unsigned long long ull_result, ull_expected;
+
+   vec_uc_arg = (vector unsigned char){1, 2, 3, 4,
+				       5, 6, 7, 8,
+				       9, 10, 11, 12,
+				       13, 14, 15, 16};
+
+   vsi_arg = (vector signed int){0xA, 0xB, 0xC, 0xD};
+
+   vec_uc_expected = (vector unsigned char){0xC, 0, 0, 0,
+					    5, 6, 7, 8,
+					    9, 10, 11, 12,
+					    13, 14, 15, 16};
+   /* Test vec_insert4b() */
+   /* Insert into char 0 location */
+   vec_uc_result = vec_insert4b (vsi_arg, vec_uc_arg, 0);
+
+   if (result_wrong_uc(vec_uc_expected, vec_uc_result))
+     {
+#ifdef DEBUG
+        printf("Error: vec_insert4b pos 0, result does not match expected result\n");
+	print_uc (vec_uc_expected, vec_uc_result);
+#else
+        abort();
+#endif
+      }
+
+   /* insert into char 4 location */
+   vec_uc_expected = (vector unsigned char){1, 2, 3, 4,
+					    0xC, 0, 0, 0,
+					    9, 10, 11, 12,
+					    13, 14, 15, 16};
+   vec_uc_result = vec_insert4b (vsi_arg, vec_uc_arg, 4);
+
+   if (result_wrong_uc(vec_uc_expected, vec_uc_result))
+     {
+#ifdef DEBUG
+        printf("Error: vec_insert4b pos 4, result does not match expected result\n");
+	print_uc (vec_uc_expected, vec_uc_result);
+#else
+        abort();
+#endif
+      }
+
+   /* Test vec_extract4b() */
+   /* Extract 4b, from char 0 location */
+   vec_uc_arg = (vector unsigned char){10, 0, 0, 0,
+				       20, 0, 0, 0,
+				       30, 0, 0, 0,
+				       40, 0, 0, 0};
+
+   vec_ull_expected = (vector unsigned long long){0, 10};
+   vec_ull_result = vec_extract4b(vec_uc_arg, 0);
+
+   if (result_wrong_ull(vec_ull_expected, vec_ull_result))
+     {
+#ifdef DEBUG
+        printf("Error: vec_extract4b pos 0, result does not match expected result\n");
+	print_ull (vec_ull_expected, vec_ull_result);
+#else
+        abort();
+#endif
+      }
+
+   /* Extract 4b, from char 12 location */
+   vec_uc_arg = (vector unsigned char){10, 0, 0, 0,
+				       20, 0, 0, 0,
+				       30, 0, 0, 0,
+				       40, 0, 0, 0};
+
+   vec_ull_expected = (vector unsigned long long){0, 40};
+   vec_ull_result = vec_extract4b(vec_uc_arg, 12);
+
+   if (result_wrong_ull(vec_ull_expected, vec_ull_result))
+     {
+#ifdef DEBUG
+        printf("Error: vec_extract4b pos 12, result does not match expected result\n");
+	print_ull (vec_ull_expected, vec_ull_result);
+#else
+        abort();
+#endif
+      }
+}
-- 
2.11.0


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH, rs6000] Add builtin support for vec_insert4b, vec_extract4b
  2018-02-14 20:08   ` Carl Love
@ 2018-02-16 14:26     ` Segher Boessenkool
  0 siblings, 0 replies; 4+ messages in thread
From: Segher Boessenkool @ 2018-02-16 14:26 UTC (permalink / raw)
  To: Carl Love; +Cc: gcc-patches, David Edelsohn, Bill Schmidt

Hi!

On Wed, Feb 14, 2018 at 12:08:27PM -0800, Carl Love wrote:
> Per Segher's comments on the first version of the patch.  I split the
> patch into two.

Thanks, much easier to read.

>     2018-02-13  Carl Love  <cel@us.ibm.com>
> 
>         * config/rs6000/altivec.h: Add builtin names vec_extract4b
>         vec_insert4b.

	* config/rs6000/altivec.h (vec_extract4b, vec_insert4b): New.

(Similar for the rest of the changelog).

> --- a/gcc/config/rs6000/rs6000-c.c
> +++ b/gcc/config/rs6000/rs6000-c.c
> @@ -5433,6 +5433,8 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
>      RS6000_BTI_INTDI, RS6000_BTI_V16QI, RS6000_BTI_UINTSI, 0 },
>    { P9V_BUILTIN_VEC_VEXTRACT4B, P9V_BUILTIN_VEXTRACT4B,
>      RS6000_BTI_INTDI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTSI, 0 },
> +  { P9V_BUILTIN_VEC_EXTRACT4B, P9V_BUILTIN_EXTRACT4B,
> +    RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, 0 },

The old builtin use unsigned int for the element number (but signed is
correct, yes).

Looks good.  Okay for trunk.  Thanks!


Segher

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2018-02-16 14:26 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-01 19:32 [PATCH, rs6000] Add builtin support for vec_insert4b, vec_extract4b Carl Love
2018-02-06 15:47 ` Segher Boessenkool
2018-02-14 20:08   ` Carl Love
2018-02-16 14:26     ` Segher Boessenkool

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).