From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <meissner@sourceware.org>
Received: by sourceware.org (Postfix, from userid 1005)
 id 8672B3858D39; Thu, 21 Oct 2021 02:38:10 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 8672B3858D39
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: Michael Meissner <meissner@gcc.gnu.org>
To: gcc-cvs@gcc.gnu.org
Subject: [gcc(refs/users/meissner/heads/work071)] Generate XXSPLTIW on power10.
X-Act-Checkin: gcc
X-Git-Author: Michael Meissner <meissner@linux.ibm.com>
X-Git-Refname: refs/users/meissner/heads/work071
X-Git-Oldrev: 68023da085f00340d435111d31e024c38a6fc892
X-Git-Newrev: 3eb92582ccca03ed0b253ba3baadf8cbfc63d426
Message-Id: <20211021023810.8672B3858D39@sourceware.org>
Date: Thu, 21 Oct 2021 02:38:10 +0000 (GMT)
X-BeenThere: gcc-cvs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-cvs mailing list <gcc-cvs.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-cvs>,
 <mailto:gcc-cvs-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-cvs/>
List-Help: <mailto:gcc-cvs-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-cvs>,
 <mailto:gcc-cvs-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Thu, 21 Oct 2021 02:38:10 -0000

https://gcc.gnu.org/g:3eb92582ccca03ed0b253ba3baadf8cbfc63d426

commit 3eb92582ccca03ed0b253ba3baadf8cbfc63d426
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Wed Oct 20 22:37:36 2021 -0400

    Generate XXSPLTIW on power10.
    
    This patch adds support to automatically generate the ISA 3.1 XXSPLTIW
    instruction for V8HImode, V4SImode, and V4SFmode vectors.  It does this by
    adding support for vector constants that can be used, and adding a
    VEC_DUPLICATE pattern to generate the actual XXSPLTIW instruction.
    
    The eP constraint added with the XXSPLTIDP patch will now also recognize
    use of the XXSPLTIW instruction.
    
    I added 4 new tests to test loading up V16QI, V8HI, V4SI, and V4SF vector
    constants.
    
    2021-10-20  Michael Meissner  <meissner@linux.ibm.com>
    
    gcc/
    
            * config/rs6000/constraints.md (eP): Update comment.
            * config/rs6000/predicates.md (easy_fp_constant): Add support for
            generating XXSPLTIW.
            (vsx_prefixed_constant): Likewise.
            (easy_vector_constant): Likewise.
            * config/rs6000/rs6000-protos.h (constant_generates_xxspltiw): New
            declaration.
            * config/rs6000/rs6000.c (xxspltib_constant_p): If we can generate
            XXSPLTIW, don't do XXSPLTIB and sign extend.
            (output_vec_const_move): Add support for XXSPLTIW.
            (prefixed_xxsplti_p): Recognize XXSPLTIW instructions as
            prefixed.
            (constant_generates_xxspltiw): New function.
            * config/rs6000/rs6000.md (UNSPEC_XXSPLTIW_CONST): New unspec.
            (xxspltiw_<mode>_internal): New insns.
            (VSX prefixed constant splitter): Add XXSPLTIW support.
            * config/rs6000/rs6000.opt (-msplat-word-constant): New debug
            switch.
            * config/rs6000/vsx.md (vsx_mov<mode>_64bit): Update comment.
            (vsx_mov<mode>_32bit): Likewise.
    
    gcc/testsuite/
    
            * gcc.target/powerpc/vec-splat-constant-v16qi.c: New test.
            * gcc.target/powerpc/vec-splat-constant-v4sf.c: New test.
            * gcc.target/powerpc/vec-splat-constant-v4si.c: New test.
            * gcc.target/powerpc/vec-splat-constant-v8hi.c: New test.
            * gcc.target/powerpc/vec-splati-runnable.c: Update insn count.

Diff:
---
 gcc/config/rs6000/constraints.md                   |  2 +-
 gcc/config/rs6000/predicates.md                    | 11 +++-
 gcc/config/rs6000/rs6000-protos.h                  |  1 +
 gcc/config/rs6000/rs6000.c                         | 54 +++++++++++++++++
 gcc/config/rs6000/rs6000.md                        | 17 ++++++
 gcc/config/rs6000/rs6000.opt                       |  4 ++
 .../gcc.target/powerpc/vec-splat-constant-v16qi.c  | 27 +++++++++
 .../gcc.target/powerpc/vec-splat-constant-v4sf.c   | 67 ++++++++++++++++++++++
 .../gcc.target/powerpc/vec-splat-constant-v4si.c   | 51 ++++++++++++++++
 .../gcc.target/powerpc/vec-splat-constant-v8hi.c   | 62 ++++++++++++++++++++
 .../gcc.target/powerpc/vec-splati-runnable.c       |  2 +-
 11 files changed, 295 insertions(+), 3 deletions(-)
diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index 7d594872a78..0f0513f2171 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -214,7 +214,7 @@
   (match_operand 0 "cint34_operand"))
 
 ;; A SF/DF scalar constant or a vector constant that can be loaded into vector
-;; registers with one prefixed instruction such as XXSPLTIDP.
+;; registers with one prefixed instruction such as XXSPLTIDP or XXSPLTIW.
 (define_constraint "eP"
   "A constant that can be loaded into a VSX register with one prefixed insn."
   (match_operand 0 "vsx_prefixed_constant"))
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index fefa420ed67..4b07850eb64 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -608,6 +608,9 @@
     {
       if (constant_generates_xxspltidp (&vsx_const))
 	return true;
+
+      if (constant_generates_xxspltiw (&vsx_const))
+	return true;
     }
 
   /* Otherwise consider floating point constants hard, so that the
@@ -620,7 +623,7 @@
 
 ;; Return 1 if the operand is a 64-bit floating point scalar constant or a
 ;; vector constant that can be loaded to a VSX register with one prefixed
-;; instruction, such as XXSPLTIDP.
+;; instruction, such as XXSPLTIDP or XXSPLTIW.
 ;;
 ;; In addition regular constants, we also recognize constants formed with the
 ;; VEC_DUPLICATE insn from scalar constants.
@@ -651,6 +654,9 @@
   if (constant_generates_xxspltidp (&vsx_const))
     return true;
 
+  if (constant_generates_xxspltiw (&vsx_const))
+    return true;
+
   return false;
 })
 
@@ -706,6 +712,9 @@
 	{
 	  if (constant_generates_xxspltidp (&vsx_const))
 	    return true;
+
+	  if (constant_generates_xxspltiw (&vsx_const))
+	    return true;
 	}
 
       if (TARGET_P9_VECTOR
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index ec4f78d9241..0b93bc3cc0e 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -259,6 +259,7 @@ typedef struct {
 extern bool constant_to_bytes (rtx, machine_mode, rs6000_const *,
 			       rs6000_const_splat);
 extern unsigned constant_generates_xxspltidp (rs6000_const *);
+extern unsigned constant_generates_xxspltiw (rs6000_const *);
 #endif /* RTX_CODE */
 
 #ifdef TREE_CODE
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index b041db3c728..59b338085b1 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -6939,6 +6939,11 @@ xxspltib_constant_p (rtx op,
   else if (IN_RANGE (value, -1, 0))
     *num_insns_ptr = 1;
 
+  /* If we can generate XXSPLTIW or XXSPLTIDP, don't generate XXSPLTIB and a
+     sign extend operation.  */
+  else if (vsx_prefixed_constant (op, mode))
+    return false;
+
   else
     *num_insns_ptr = 2;
 
@@ -7002,6 +7007,13 @@ output_vec_const_move (rtx *operands)
 		  operands[2] = GEN_INT (imm);
 		  return "xxspltidp %x0,%2";
 		}
+
+	      imm = constant_generates_xxspltiw (&vsx_const);
+	      if (imm)
+		{
+		  operands[2] = GEN_INT (imm);
+		  return "xxspltiw %x0,%2";
+		}
 	    }
 	}
 
@@ -26769,6 +26781,9 @@ prefixed_xxsplti_p (rtx_insn *insn)
     {
       if (constant_generates_xxspltidp (&vsx_const))
 	return true;
+
+      if (constant_generates_xxspltiw (&vsx_const))
+	return true;
     }
 
   return false;
@@ -29005,6 +29020,45 @@ constant_generates_xxspltidp (rs6000_const *vsx_const)
   return sf_value;
 }
 
+/* Determine if a vector constant can be loaded with XXSPLTIW.  Return zero if
+   the XXSPLTIW instruction cannot be used.  Otherwise return the immediate
+   value to be used with the XXSPLTIW instruction.  */
+
+unsigned
+constant_generates_xxspltiw (rs6000_const *vsx_const)
+{
+  if (!TARGET_SPLAT_WORD_CONSTANT || !TARGET_PREFIXED || !TARGET_VSX)
+    return 0;
+
+  /* Only recognize XXSPLTIW for 16-byte vector constants (or 8-byte scalar
+     constants that have been splatted to 128-bits).  */
+  if (vsx_const->total_size != 16)
+    return 0;
+
+  if (!vsx_const->all_words_same)
+    return 0;
+
+  /* If we can use XXSPLTIB, don't generate XXSPLTIW.  */
+  if (vsx_const->all_bytes_same)
+    return 0;
+
+  /* See if we can use VSPLTISH or VSPLTISW.  */
+  if (vsx_const->all_half_words_same)
+    {
+      unsigned short h_word = vsx_const->half_words[0];
+      short sign_h_word = ((h_word & 0xffff) ^ 0x8000) - 0x8000;
+      if (EASY_VECTOR_15 (sign_h_word))
+	return 0;
+    }
+
+  unsigned int word = vsx_const->words[0];
+  int sign_word = ((word & 0xffffffff) ^ 0x80000000) - 0x80000000;
+  if (EASY_VECTOR_15 (sign_word))
+    return 0;
+
+  return vsx_const->words[0];
+}
+
 
 struct gcc_target targetm = TARGET_INITIALIZER;
 
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 2633ad9f815..3c94e547939 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -157,6 +157,7 @@
    UNSPEC_HASHST
    UNSPEC_HASHCHK
    UNSPEC_XXSPLTIDP_CONST
+   UNSPEC_XXSPLTIW_CONST
   ])
 
 ;;
@@ -8232,6 +8233,15 @@
   [(set_attr "type" "vecperm")
    (set_attr "prefixed" "yes")])
 
+(define_insn "xxspltiw_<mode>_internal"
+  [(set (match_operand:SFDF 0 "register_operand" "=wa")
+	(unspec:SFDF [(match_operand:SI 1 "c32bit_cint_operand" "n")]
+		     UNSPEC_XXSPLTIW_CONST))]
+  "TARGET_POWER10"
+  "xxspltiw %x0,%1"
+  [(set_attr "type" "vecperm")
+   (set_attr "prefixed" "yes")])
+
 (define_split
   [(set (match_operand:SFDF 0 "vsx_register_operand")
 	(match_operand:SFDF 1 "vsx_prefixed_constant"))]
@@ -8252,6 +8262,13 @@
       DONE;
     }
 
+  imm = constant_generates_xxspltiw (&vsx_const);
+  if (imm)
+    {
+      emit_insn (gen_xxspltiw_<mode>_internal (dest, GEN_INT (imm)));
+      DONE;
+    }
+
   else
     gcc_unreachable ();
 })
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 429da57d19d..ec607a7aee7 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -644,6 +644,10 @@ msplat-float-constant
 Target Var(TARGET_SPLAT_FLOAT_CONSTANT) Init(1) Save
 Generate (do not generate) code that uses the XXSPLTIDP instruction.
 
+msplat-word-constant
+Target Var(TARGET_SPLAT_WORD_CONSTANT) Init(1) Save
+Generate (do not generate) code that uses the XXSPLTIW instruction.
+
 -param=rs6000-density-pct-threshold=
 Target Undocumented Joined UInteger Var(rs6000_density_pct_threshold) Init(85) IntegerRange(0, 100) Param
 When costing for loop vectorization, we probably need to penalize the loop body
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v16qi.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v16qi.c
new file mode 100644
index 00000000000..2707d86e6fd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v16qi.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2 -mxxspltiw" } */
+
+#include <altivec.h>
+
+/* Test whether XXSPLTIW is generated for V16HI vector constants where the
+   first 4 elements are the same as the next 4 elements, etc.  */
+
+vector unsigned char
+v16qi_const_1 (void)
+{
+  return (vector unsigned char) { 1, 1, 1, 1, 1, 1, 1, 1,
+				  1, 1, 1, 1, 1, 1, 1, 1, }; /* VSLTPISB.  */
+}
+
+vector unsigned char
+v16qi_const_2 (void)
+{
+  return (vector unsigned char) { 1, 2, 3, 4, 1, 2, 3, 4,
+				  1, 2, 3, 4, 1, 2, 3, 4, }; /* XXSPLTIW.  */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltiw\M}              1 } } */
+/* { dg-final { scan-assembler-times {\mvspltisb\M|\mxxspltib\M} 1 } } */
+/* { dg-final { scan-assembler-not   {\mlxvx?\M}                   } } */
+/* { dg-final { scan-assembler-not   {\mplxv\M}                    } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v4sf.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v4sf.c
new file mode 100644
index 00000000000..05d4ee3f5cb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v4sf.c
@@ -0,0 +1,67 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2 -mxxspltiw" } */
+
+#include <altivec.h>
+
+/* Test whether XXSPLTIW is generated for V4SF vector constants.  */
+
+vector float
+v4sf_const_1 (void)
+{
+  return (vector float) { 1.0f, 1.0f, 1.0f, 1.0f };	/* XXSPLTIW.  */
+}
+
+vector float
+v4sf_const_nan (void)
+{
+  return (vector float) { __builtin_nanf (""),
+			  __builtin_nanf (""),
+			  __builtin_nanf (""),
+			  __builtin_nanf ("") };	/* XXSPLTIW.  */
+}
+
+vector float
+v4sf_const_inf (void)
+{
+  return (vector float) { __builtin_inff (),
+			  __builtin_inff (),
+			  __builtin_inff (),
+			  __builtin_inff () };		/* XXSPLTIW.  */
+}
+
+vector float
+v4sf_const_m0 (void)
+{
+  return (vector float) { -0.0f, -0.0f, -0.0f, -0.0f };	/* XXSPLTIB/VSLW.  */
+}
+
+vector float
+v4sf_splats_1 (void)
+{
+  return vec_splats (1.0f);				/* XXSPLTIW.  */
+}
+
+vector float
+v4sf_splats_nan (void)
+{
+  return vec_splats (__builtin_nanf (""));		/* XXSPLTIW.  */
+}
+
+vector float
+v4sf_splats_inf (void)
+{
+  return vec_splats (__builtin_inff ());		/* XXSPLTIW.  */
+}
+
+vector float
+v8hi_splats_m0 (void)
+{
+  return vec_splats (-0.0f);				/* XXSPLTIB/VSLW.  */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltiw\M} 6 } } */
+/* { dg-final { scan-assembler-times {\mxxspltib\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mvslw\M}     2 } } */
+/* { dg-final { scan-assembler-not   {\mlxvx?\M}      } } */
+/* { dg-final { scan-assembler-not   {\mplxv\M}       } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v4si.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v4si.c
new file mode 100644
index 00000000000..da909e948b2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v4si.c
@@ -0,0 +1,51 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2 -mxxspltiw" } */
+
+#include <altivec.h>
+
+/* Test whether XXSPLTIW is generated for V4SI vector constants.  We make sure
+   the power9 support (XXSPLTIB/VEXTSB2W) is not done.  */
+
+vector int
+v4si_const_1 (void)
+{
+  return (vector int) { 1, 1, 1, 1 };			/* VSLTPISW.  */
+}
+
+vector int
+v4si_const_126 (void)
+{
+  return (vector int) { 126, 126, 126, 126 };		/* XXSPLTIW.  */
+}
+
+vector int
+v4si_const_1023 (void)
+{
+  return (vector int) { 1023, 1023, 1023, 1023 };	/* XXSPLTIW.  */
+}
+
+vector int
+v4si_splats_1 (void)
+{
+  return vec_splats (1);				/* VSLTPISW.  */
+}
+
+vector int
+v4si_splats_126 (void)
+{
+  return vec_splats (126);				/* XXSPLTIW.  */
+}
+
+vector int
+v8hi_splats_1023 (void)
+{
+  return vec_splats (1023);				/* XXSPLTIW.  */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltiw\M}  4 } } */
+/* { dg-final { scan-assembler-times {\mvspltisw\M}  2 } } */
+/* { dg-final { scan-assembler-not   {\mxxspltib\M}    } } */
+/* { dg-final { scan-assembler-not   {\mvextsb2w\M}    } } */
+/* { dg-final { scan-assembler-not   {\mlxvx?\M}       } } */
+/* { dg-final { scan-assembler-not   {\mplxv\M}        } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v8hi.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v8hi.c
new file mode 100644
index 00000000000..290e05d4a64
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v8hi.c
@@ -0,0 +1,62 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2 -mxxspltiw" } */
+
+#include <altivec.h>
+
+/* Test whether XXSPLTIW is generated for V8HI vector constants.  We make sure
+   the power9 support (XXSPLTIB/VUPKLSB) is not done.  */
+
+vector short
+v8hi_const_1 (void)
+{
+  return (vector short) { 1, 1, 1, 1, 1, 1, 1, 1 };	/* VSLTPISH.  */
+}
+
+vector short
+v8hi_const_126 (void)
+{
+  return (vector short) { 126, 126, 126, 126,
+			  126, 126, 126, 126 };		/* XXSPLTIW.  */
+}
+
+vector short
+v8hi_const_1023 (void)
+{
+  return (vector short) { 1023, 1023, 1023, 1023,
+			  1023, 1023, 1023, 1023 };	/* XXSPLTIW.  */
+}
+
+vector short
+v8hi_splats_1 (void)
+{
+  return vec_splats ((short)1);				/* VSLTPISH.  */
+}
+
+vector short
+v8hi_splats_126 (void)
+{
+  return vec_splats ((short)126);			/* XXSPLTIW.  */
+}
+
+vector short
+v8hi_splats_1023 (void)
+{
+  return vec_splats ((short)1023);			/* XXSPLTIW.  */
+}
+
+/* Test that we can optimiza V8HI where all of the even elements are the same
+   and all of the odd elements are the same.  */
+vector short
+v8hi_const_1023_1000 (void)
+{
+  return (vector short) { 1023, 1000, 1023, 1000,
+			  1023, 1000, 1023, 1000 };	/* XXSPLTIW.  */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltiw\M}  5 } } */
+/* { dg-final { scan-assembler-times {\mvspltish\M}  2 } } */
+/* { dg-final { scan-assembler-not   {\mxxspltib\M}    } } */
+/* { dg-final { scan-assembler-not   {\mvupklsb\M}     } } */
+/* { dg-final { scan-assembler-not   {\mlxvx?\M}       } } */
+/* { dg-final { scan-assembler-not   {\mplxv\M}        } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
index 5f84930e1a7..6c01666b625 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
@@ -149,7 +149,7 @@ main (int argc, char *argv [])
   return 0;
 }
 
-/* { dg-final { scan-assembler-times {\mxxspltiw\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mxxspltiw\M} 3 } } */
 /* { dg-final { scan-assembler-times {\mxxspltidp\M} 3 } } */
 /* { dg-final { scan-assembler-times {\mxxsplti32dx\M} 3 } } */