[gcc(refs/users/meissner/heads/work046)] Implement XXSPLTIW support.

public inbox for gcc-cvs@sourceware.org

* [gcc(refs/users/meissner/heads/work046)] Implement XXSPLTIW support.
@ 2021-04-07 18:46 Michael Meissner
From: Michael Meissner @ 2021-04-07 18:46 UTC
  To: gcc-cvs

https://gcc.gnu.org/g:9ec2b7a1a16105ab16efed20302107b875c06026

commit 9ec2b7a1a16105ab16efed20302107b875c06026
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Wed Apr 7 14:46:39 2021 -0400

    Implement XXSPLTIW support.
    
    This patch implements XXSPLTIW support for V8HI, V4SI, and V4SF vector
    constants.
    
    A new constraint (eW) is added to match constants that can be loaded with
    the XXSPLTIW instruction.
    
    I have moved the XXSPLTIW built-in function support from altivec.md to
    vsx.md because the functions can load any VSX register, not just the
    ALTIVEC registers.  I have also re-implemented the built-in functions to
    load the vector constants, which will be optimized to generate the
    appropriate XXSPLTIW, VSPLTISH, VSPLTISW, or XXSPLTIB instruction.
    
    I have added a temporary switch (-mxxspltiw) to control whether or not the
    XXSPLTIW instruction is generated.
    
    This patch provides an xxspltiw_constant_p function that decodes both
    VEC_DUPLICATE and CONST_VECTOR expressions (similar to the existing
    xxspltib_constant_p function).
    
    The xxspltiw_constant_p function passes back the integer that will be used
    as the immediate of the XXSPLTIW instruction.  For V8HI constants, this is
    the 16-bit element duplicated into both halves of a 32-bit constant; for
    V4SF constants, it is the integer representation of the 32-bit
    floating-point value.
    
    gcc/
    2021-04-07  Michael Meissner  <meissner@linux.ibm.com>
    
            * config/rs6000/altivec.md (UNSPEC_XXSPLTIW): Move to vsx.md.
            (xxspltiw_v4si): Move to vsx.md and re-implement.
            (xxspltiw_v4sf): Move to vsx.md and re-implement.
            (xxspltiw_v4sf_inst): Delete.
            * config/rs6000/constraints.md (eW): New constraint.
            * config/rs6000/predicates.md (xxspltiw_operand): New predicate.
            (easy_vector_constant): If we can generate XXSPLTIW, mark the
            vector constant as easy.
            * config/rs6000/rs6000-cpus.def (OTHER_POWER10_MASKS): Add
            -mxxspltiw support.
            (POWERPC_MASKS): Add -mxxspltiw support.
            * config/rs6000/rs6000-protos.h (xxspltiw_constant_p): New
            declaration.
            * config/rs6000/rs6000.c (rs6000_option_override_internal): Add
            -mxxspltiw support.
            (xxspltib_constant_p): If we can generate XXSPLTIW, don't generate
            XXSPLTIB and a vector extend instruction.
            (xxspltiw_constant_p): New function.
            (output_vec_const_move): Add support for XXSPLTIW.
            (rs6000_opt_masks): Add -mxxspltiw support.
            * config/rs6000/rs6000.opt (-mxxspltiw): New switch.
            * config/rs6000/vsx.md (UNSPEC_XXSPLTIW): Move here from
            altivec.md.
            (vsx_mov<mode>_64bit): Add XXSPLTIW support.
            (vsx_mov<mode>_32bit): Add XXSPLTIW support.
            (XXSPLTIW): New mode iterator.
            (xxspltiw_<mode>_internal1): New define_insn_and_split.
            (xxspltiw_<mode>_internal2): New define_insn.
            (xxspltiw_v4si): Move to vsx.md from altivec.md.  Re-implement to
            use the new constant format.
            (xxspltiw_v4sf): Move to vsx.md from altivec.md.  Re-implement to
            use the new constant format.
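
The value encoding described in the commit message can be illustrated with a small standalone C sketch. It is not GCC code: the helper names `splat_word_from_hi` and `splat_word_from_sf` are hypothetical, and the float case assumes a 32-bit IEEE 754 `float` on the host, whereas the patch itself goes through `rs6000_const_f32_to_i32`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build the 32-bit word for a V8HI splat by copying
   one 16-bit element into both halves, as the patch does with
   "value &= 0xffff; value |= value << 16;".  */
static uint32_t
splat_word_from_hi (int16_t element)
{
  uint32_t half = (uint16_t) element;
  return half | (half << 16);
}

/* Hypothetical helper: reinterpret a float as its 32-bit pattern, which is
   the kind of value a V4SF splat would need as an integer immediate.
   Assumes float is 32 bits wide on the host.  */
static uint32_t
splat_word_from_sf (float element)
{
  uint32_t bits;
  assert (sizeof (float) == sizeof (uint32_t));
  memcpy (&bits, &element, sizeof bits);
  return bits;
}
```

Under these assumptions, `splat_word_from_hi (0x1234)` gives `0x12341234`, `splat_word_from_hi (-2)` gives `0xfffefffe`, and `splat_word_from_sf (1.0f)` gives `0x3f800000`.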

Diff:
---
 gcc/config/rs6000/altivec.md      | 30 -------------
 gcc/config/rs6000/constraints.md  |  5 +++
 gcc/config/rs6000/predicates.md   | 22 +++++++++
 gcc/config/rs6000/rs6000-cpus.def |  6 ++-
 gcc/config/rs6000/rs6000-protos.h |  1 +
 gcc/config/rs6000/rs6000.c        | 93 ++++++++++++++++++++++++++++++++++++++
 gcc/config/rs6000/rs6000.opt      |  3 ++
 gcc/config/rs6000/vsx.md          | 95 +++++++++++++++++++++++++++++++++------
 8 files changed, 209 insertions(+), 46 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 1351dafbc41..708296cb14d 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -176,7 +176,6 @@
    UNSPEC_VSTRIL
    UNSPEC_SLDB
    UNSPEC_SRDB
-   UNSPEC_XXSPLTIW
    UNSPEC_XXSPLTID
    UNSPEC_XXSPLTI32DX
    UNSPEC_XXBLEND
@@ -820,35 +819,6 @@
   "vs<SLDB_lr>dbi %0,%1,%2,%3"
   [(set_attr "type" "vecsimple")])
 
-(define_insn "xxspltiw_v4si"
-  [(set (match_operand:V4SI 0 "register_operand" "=wa")
-	(unspec:V4SI [(match_operand:SI 1 "s32bit_cint_operand" "n")]
-		     UNSPEC_XXSPLTIW))]
- "TARGET_POWER10"
- "xxspltiw %x0,%1"
- [(set_attr "type" "vecsimple")
-  (set_attr "prefixed" "yes")])
-
-(define_expand "xxspltiw_v4sf"
-  [(set (match_operand:V4SF 0 "register_operand" "=wa")
-	(unspec:V4SF [(match_operand:SF 1 "const_double_operand" "n")]
-		     UNSPEC_XXSPLTIW))]
- "TARGET_POWER10"
-{
-  long long value = rs6000_const_f32_to_i32 (operands[1]);
-  emit_insn (gen_xxspltiw_v4sf_inst (operands[0], GEN_INT (value)));
-  DONE;
-})
-
-(define_insn "xxspltiw_v4sf_inst"
-  [(set (match_operand:V4SF 0 "register_operand" "=wa")
-	(unspec:V4SF [(match_operand:SI 1 "c32bit_cint_operand" "n")]
-		     UNSPEC_XXSPLTIW))]
- "TARGET_POWER10"
- "xxspltiw %x0,%1"
- [(set_attr "type" "vecsimple")
-  (set_attr "prefixed" "yes")])
-
 (define_expand "xxspltidp_v2df"
   [(set (match_operand:V2DF 0 "register_operand" )
 	(unspec:V2DF [(match_operand:SF 1 "const_double_operand")]
diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index 561ce9797af..b3e36fbcfdf 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -213,6 +213,11 @@
   "A signed 34-bit integer constant if prefixed instructions are supported."
   (match_operand 0 "cint34_operand"))
 
+;; V4SI/V4SF/V8HI vector constant that can be loaded with XXSPLTIW
+(define_constraint "eW"
+  "A vector constant that can be loaded with the XXSPLTIW instruction."
+  (match_operand 0 "xxspltiw_operand"))
+
 ;; Floating-point constraints.  These two are defined so that insn
 ;; length attributes can be calculated exactly.
 
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index e21bc745f72..dc23f62a3af 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -640,6 +640,25 @@
   return num_insns == 1;
 })
 
+;; Return 1 if the operand is a CONST_VECTOR and can be loaded into a vector
+;; using the ISA 3.1 XXSPLTIW instruction.  Do not return 1 if the value can be
+;; loaded with a smaller XXSPLTIB or VSPLTISW instruction.
+(define_predicate "xxspltiw_operand"
+  (match_code "vec_duplicate,const_vector")
+{
+  HOST_WIDE_INT value = 0;
+
+  if (!xxspltiw_constant_p (op, mode, &value))
+    return false;
+
+  /* xxspltiw_constant_p returns V8HI as (element | (element << 16)).  Undo
+     this to see if the value is in the range -16..15.  */
+  if (mode == V8HImode)
+    value = ((value & 0xffff) ^ 0x8000) - 0x8000;
+
+  return !EASY_VECTOR_15 (value);
+})
+
 ;; Return 1 if the operand is a CONST_VECTOR and can be loaded into a
 ;; vector register without using memory.
 (define_predicate "easy_vector_constant"
@@ -653,6 +672,9 @@
       if (zero_constant (op, mode) || all_ones_constant (op, mode))
 	return true;
 
+      if (xxspltiw_operand (op, mode))
+	return true;
+
       if (TARGET_P9_VECTOR
           && xxspltib_constant_p (op, mode, &num_insns, &value))
 	return true;
diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
index cbbb42c1b3a..f7743374f26 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -78,7 +78,8 @@
 #define OTHER_POWER10_MASKS	(OPTION_MASK_MMA			\
 				 | OPTION_MASK_PCREL			\
 				 | OPTION_MASK_PCREL_OPT		\
-				 | OPTION_MASK_PREFIXED)
+				 | OPTION_MASK_PREFIXED			\
+				 | OPTION_MASK_XXSPLTIW)
 
 #define ISA_3_1_MASKS_SERVER	(ISA_3_0_MASKS_SERVER			\
 				 | OPTION_MASK_POWER10			\
@@ -160,7 +161,8 @@
 				 | OPTION_MASK_RECIP_PRECISION		\
 				 | OPTION_MASK_SOFT_FLOAT		\
 				 | OPTION_MASK_STRICT_ALIGN_OPTIONAL	\
-				 | OPTION_MASK_VSX)
+				 | OPTION_MASK_VSX			\
+				 | OPTION_MASK_XXSPLTIW)
 
 #endif
 
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 8ac30905013..eff72af8814 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -32,6 +32,7 @@ extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, int, int, int,
 
 extern bool easy_altivec_constant (rtx, machine_mode);
 extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *);
+extern bool xxspltiw_constant_p (rtx, machine_mode, HOST_WIDE_INT *);
 extern int vspltis_shifted (rtx);
 extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int);
 extern bool macho_lo_sum_memory_operand (rtx, machine_mode);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index dc38c093c53..54c338d73a5 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4476,6 +4476,10 @@ rs6000_option_override_internal (bool global_init_p)
       rs6000_isa_flags &= ~OPTION_MASK_MMA;
     }
 
+  if (TARGET_POWER10 && TARGET_VSX
+      && (rs6000_isa_flags_explicit & OPTION_MASK_XXSPLTIW) == 0)
+    rs6000_isa_flags |= OPTION_MASK_XXSPLTIW;
+
   if (!TARGET_PCREL && TARGET_PCREL_OPT)
     rs6000_isa_flags &= ~OPTION_MASK_PCREL_OPT;
 
@@ -6460,6 +6464,12 @@ xxspltib_constant_p (rtx op,
   else if (IN_RANGE (value, -1, 0))
     *num_insns_ptr = 1;
 
+  /* If XXSPLTIW is available, don't return true if we can use that
+     instruction instead of doing 2 instructions. */
+  else if (TARGET_XXSPLTIW
+	   && (mode == V4SImode || mode == V8HImode))
+    return false;
+
   else
     *num_insns_ptr = 2;
 
@@ -6467,6 +6477,66 @@ xxspltib_constant_p (rtx op,
   return true;
 }
 
+/* Return true if OP is of the given MODE and can be synthesized with ISA 3.1
+   XXSPLTIW instruction, possibly with a sign extension.
+
+   Return the constant that is being splatted via CONSTANT_PTR.  */
+
+bool
+xxspltiw_constant_p (rtx op,
+		     machine_mode mode,
+		     HOST_WIDE_INT *constant_ptr)
+{
+  *constant_ptr = 0;
+
+  if (!TARGET_XXSPLTIW)
+    return false;
+
+  if (mode == VOIDmode)
+    mode = GET_MODE (op);
+
+  if (mode != V8HImode && mode != V4SImode && mode != V4SFmode)
+    return false;
+
+  rtx element = op;
+  if (GET_CODE (op) == VEC_DUPLICATE)
+    element = XEXP (op, 0);
+
+  else if (GET_CODE (op) == CONST_VECTOR)
+    {
+      size_t nunits = GET_MODE_NUNITS (mode);
+      element = CONST_VECTOR_ELT (op, 0);
+
+      for (size_t i = 1; i < nunits; i++)
+	if (!rtx_equal_p (element, CONST_VECTOR_ELT (op, i)))
+	  return false;
+    }
+
+  HOST_WIDE_INT value;
+  if (CONST_INT_P (element))
+    {
+      value = INTVAL (element);
+      if (!SIGNED_INTEGER_NBIT_P (value, 32))
+	return false;
+
+      /* For V8HImode, return the value setting 2 elements of the constant.  */
+      if (mode == V8HImode)
+	{
+	  value &= 0xffff;
+	  value |= value << 16;
+	}
+    }
+
+  else if (CONST_DOUBLE_P (element))
+    value = rs6000_const_f32_to_i32 (element);
+
+  else
+    return false;
+
+  *constant_ptr = value;
+  return true;
+}
+
 const char *
 output_vec_const_move (rtx *operands)
 {
@@ -6511,6 +6581,28 @@ output_vec_const_move (rtx *operands)
 	    gcc_unreachable ();
 	}
 
+      HOST_WIDE_INT xxspltiw_value = 0;
+      if (xxspltiw_constant_p (vec, mode, &xxspltiw_value))
+	{
+	  /* Generate the smaller VSPLTIS{H,W} if we can.  */
+	  if (dest_vmx_p && mode == V8HImode)
+	    {
+	      long hi_value = ((xxspltiw_value & 0xffff) ^ 0x8000) - 0x8000;
+	      if (IN_RANGE (hi_value, -16, 15))
+		{
+		  operands[2] = GEN_INT (hi_value);
+		  return "vspltish %0,%2";
+		}
+	    }
+
+	  operands[2] = GEN_INT (xxspltiw_value);
+	  if (dest_vmx_p && mode == V4SImode
+	      && IN_RANGE (xxspltiw_value, -16, 15))
+	    return "vspltisw %0,%2";
+
+	  return "xxspltiw %x0,%2";
+	}
+
       if (TARGET_P9_VECTOR
 	  && xxspltib_constant_p (vec, mode, &num_insns, &xxspltib_value))
 	{
@@ -24008,6 +24100,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
   { "string",			0,				false, true  },
   { "update",			OPTION_MASK_NO_UPDATE,		true , true  },
   { "vsx",			OPTION_MASK_VSX,		false, true  },
+  { "xxspltiw",			OPTION_MASK_XXSPLTIW,		false, true  },
 #ifdef OPTION_MASK_64BIT
 #if TARGET_AIX_OS
   { "aix64",			OPTION_MASK_64BIT,		false, false },
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 0dbdf753673..06e7cdbbced 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -619,3 +619,6 @@ Generate (do not generate) MMA instructions.
 
 mrelative-jumptables
 Target Undocumented Var(rs6000_relative_jumptables) Init(1) Save
+
+mxxspltiw
+Target Undocumented Mask(XXSPLTIW) Var(rs6000_isa_flags)
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index bcb92be2f5c..e5c5e157d1d 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -369,6 +369,7 @@
    UNSPEC_REPLACE_UN
    UNSPEC_VDIVES
    UNSPEC_VDIVEU
+   UNSPEC_XXSPLTIW
   ])
 
 (define_int_iterator XVCVBF16	[UNSPEC_VSX_XVCVSPBF16
@@ -1167,17 +1168,17 @@
 
 ;;              VSX store  VSX load   VSX move  VSX->GPR   GPR->VSX    LQ (GPR)
 ;;              STQ (GPR)  GPR load   GPR store GPR move   XXSPLTIB    VSPLTISW
-;;              VSX 0/-1   VMX const  GPR const LVX (VMX)  STVX (VMX)
+;;              VSX 0/-1   VMX const  GPR const LVX (VMX)  STVX (VMX)  XXSPLTI*
 (define_insn "vsx_mov<mode>_64bit"
   [(set (match_operand:VSX_M 0 "nonimmediate_operand"
                "=ZwO,      wa,        wa,        r,         we,        ?wQ,
                 ?&r,       ??r,       ??Y,       <??r>,     wa,        v,
-                ?wa,       v,         <??r>,     wZ,        v")
+                ?wa,       v,         <??r>,     wZ,        v,         wa")
 
 	(match_operand:VSX_M 1 "input_operand" 
                "wa,        ZwO,       wa,        we,        r,         r,
                 wQ,        Y,         r,         r,         wE,        jwM,
-                ?jwM,      W,         <nW>,      v,         wZ"))]
+                ?jwM,      W,         <nW>,      v,         wZ,        eW"))]
 
   "TARGET_POWERPC64 && VECTOR_MEM_VSX_P (<MODE>mode)
    && (register_operand (operands[0], <MODE>mode) 
@@ -1188,36 +1189,40 @@
   [(set_attr "type"
                "vecstore,  vecload,   vecsimple, mtvsr,     mfvsr,     load,
                 store,     load,      store,     *,         vecsimple, vecsimple,
-                vecsimple, *,         *,         vecstore,  vecload")
+                vecsimple, *,         *,         vecstore,  vecload,   vecperm")
    (set_attr "num_insns"
                "*,         *,         *,         2,         *,         2,
                 2,         2,         2,         2,         *,         *,
-                *,         5,         2,         *,         *")
+                *,         5,         2,         *,         *,         *")
    (set_attr "max_prefixed_insns"
                "*,         *,         *,         *,         *,         2,
                 2,         2,         2,         2,         *,         *,
-                *,         *,         *,         *,         *")
+                *,         *,         *,         *,         *,         *")
    (set_attr "length"
                "*,         *,         *,         8,         *,         8,
                 8,         8,         8,         8,         *,         *,
-                *,         20,        8,         *,         *")
+                *,         20,        8,         *,         *,         *")
    (set_attr "isa"
                "<VSisa>,   <VSisa>,   <VSisa>,   *,         *,         *,
                 *,         *,         *,         *,         p9v,       *,
-                <VSisa>,   *,         *,         *,         *")])
+                <VSisa>,   *,         *,         *,         *,         p10")
+   (set_attr "prefixed"
+               "*,         *,         *,         *,         *,         *,
+                *,         *,         *,         *,         *,         *,
+                *,         *,         *,         *,         *,         yes")])
 
 ;;              VSX store  VSX load   VSX move   GPR load   GPR store  GPR move
-;;              XXSPLTIB   VSPLTISW   VSX 0/-1   VMX const  GPR const
+;;              XXSPLTIB   VSPLTISW   VSX 0/-1   VMX const  GPR const  XXSPLTI*
 ;;              LVX (VMX)  STVX (VMX)
 (define_insn "*vsx_mov<mode>_32bit"
   [(set (match_operand:VSX_M 0 "nonimmediate_operand"
                "=ZwO,      wa,        wa,        ??r,       ??Y,       <??r>,
-                wa,        v,         ?wa,       v,         <??r>,
+                wa,        v,         ?wa,       v,         <??r>,     wa,
                 wZ,        v")
 
 	(match_operand:VSX_M 1 "input_operand" 
                "wa,        ZwO,       wa,        Y,         r,         r,
-                wE,        jwM,       ?jwM,      W,         <nW>,
+                wE,        jwM,       ?jwM,      W,         <nW>,      eW,
                 v,         wZ"))]
 
   "!TARGET_POWERPC64 && VECTOR_MEM_VSX_P (<MODE>mode)
@@ -1228,15 +1233,19 @@
 }
   [(set_attr "type"
                "vecstore,  vecload,   vecsimple, load,      store,    *,
-                vecsimple, vecsimple, vecsimple, *,         *,
+                vecsimple, vecsimple, vecsimple, *,         *,        vecperm,
                 vecstore,  vecload")
    (set_attr "length"
                "*,         *,         *,         16,        16,        16,
-                *,         *,         *,         20,        16,
+                *,         *,         *,         20,        16,        *,
                 *,         *")
    (set_attr "isa"
                "<VSisa>,   <VSisa>,   <VSisa>,   *,         *,         *,
-                p9v,       *,         <VSisa>,   *,         *,
+                p9v,       *,         <VSisa>,   *,         *,         p10,
+                *,         *")
+   (set_attr "prefixed"
+               "*,         *,         *,         *,         *,         *,
+                *,         *,         *,         *,         *,         yes,
                 *,         *")])
 
 ;; Explicit  load/store expanders for the builtin functions
@@ -6216,3 +6225,61 @@
   "TARGET_POWER10"
   "vmulld %0,%1,%2"
   [(set_attr "type" "veccomplex")])
+
+\f
+;; XXSPLTIW support.
+(define_mode_iterator XXSPLTIW [V8HI V4SI V4SF])
+
+(define_insn_and_split "*xxspltiw_<mode>_internal1"
+  [(set (match_operand:XXSPLTIW 0 "vsx_register_operand" "=wa")
+	(match_operand:XXSPLTIW 1 "xxspltiw_operand"))]
+  "TARGET_XXSPLTIW"
+  "#"
+  "&& 1"
+  [(set (match_operand:XXSPLTIW 0 "vsx_register_operand")
+	(unspec:XXSPLTIW [(match_dup 2)] UNSPEC_XXSPLTIW))]
+{
+  HOST_WIDE_INT value = 0;
+
+  if (!xxspltiw_constant_p (operands[1], <MODE>mode, &value))
+    gcc_unreachable ();
+
+  operands[2] = GEN_INT (value);
+}
+ [(set_attr "type" "vecperm")
+  (set_attr "prefixed" "yes")])
+
+(define_insn "*xxspltiw_<mode>_internal2"
+  [(set (match_operand:XXSPLTIW 0 "vsx_register_operand" "=wa")
+	(unspec:XXSPLTIW [(match_operand 1 "const_int_operand" "n")]
+			 UNSPEC_XXSPLTIW))]
+  "TARGET_XXSPLTIW"
+  "xxspltiw %x0,%1"
+ [(set_attr "type" "vecperm")
+  (set_attr "prefixed" "yes")])
+
+;; Implement XXSPLTIW built-in functions as just loading up the appropriate
+;; constant vector.  The normal optimizations will generate XXSPLTIW.
+(define_expand "xxspltiw_v4si"
+  [(use (match_operand:V4SI 0 "register_operand"))
+   (use (match_operand:SI 1 "s32bit_cint_operand"))]
+  "TARGET_POWER10"
+{
+  rtx op1 = operands[1];
+  rtvec rv = gen_rtvec (4, op1, op1, op1, op1);
+  rtx cv = gen_rtx_CONST_VECTOR (V4SImode, rv);
+  emit_move_insn (operands[0], cv);
+  DONE;
+})
+
+(define_expand "xxspltiw_v4sf"
+  [(use (match_operand:V4SF 0 "register_operand"))
+   (use (match_operand:SF 1 "const_double_operand"))]
+  "TARGET_POWER10"
+{
+  rtx op1 = operands[1];
+  rtvec rv = gen_rtvec (4, op1, op1, op1, op1);
+  rtx cv = gen_rtx_CONST_VECTOR (V4SFmode, rv);
+  emit_move_insn (operands[0], cv);
+  DONE;
+})
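
The instruction choice made in the output_vec_const_move hunk can be sketched outside the compiler as follows. This is only an illustration under stated assumptions: the enums and the names `sign_extend_16` and `pick_splat_insn` are hypothetical stand-ins, and the real code works on RTL operands and returns assembler templates rather than an enum.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the machine modes and instructions involved.  */
enum elem_mode { MODE_V8HI, MODE_V4SI, MODE_V4SF };
enum splat_insn { INSN_VSPLTISH, INSN_VSPLTISW, INSN_XXSPLTIW };

/* Sign-extend the low 16 bits of VALUE using the same xor/subtract idiom
   that appears in the patch.  */
static int64_t
sign_extend_16 (int64_t value)
{
  return ((value & 0xffff) ^ 0x8000) - 0x8000;
}

/* Choose an instruction for a splat whose 32-bit word is SPLAT_WORD.
   DEST_VMX_P says whether the destination is an Altivec (VMX) register,
   which the shorter VSPLTIS{H,W} forms require.  */
static enum splat_insn
pick_splat_insn (enum elem_mode mode, bool dest_vmx_p, int64_t splat_word)
{
  if (dest_vmx_p && mode == MODE_V8HI)
    {
      /* Recover the single halfword element from the doubled word.  */
      int64_t hi_value = sign_extend_16 (splat_word);
      if (hi_value >= -16 && hi_value <= 15)
	return INSN_VSPLTISH;
    }

  if (dest_vmx_p && mode == MODE_V4SI
      && splat_word >= -16 && splat_word <= 15)
    return INSN_VSPLTISW;

  /* Otherwise fall back to the prefixed 32-bit immediate splat.  */
  return INSN_XXSPLTIW;
}
```

In this sketch a V8HI word of `0xfffefffe` (element -2) headed for a VMX register selects `INSN_VSPLTISH`, while `0x12341234` falls through to `INSN_XXSPLTIW` because 0x1234 is outside the range -16..15.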



* [gcc(refs/users/meissner/heads/work046)] Implement XXSPLTIW support.
@ 2021-04-07 17:13 Michael Meissner
From: Michael Meissner @ 2021-04-07 17:13 UTC
  To: gcc-cvs

https://gcc.gnu.org/g:b0bdb408113f6d1a6620d8bbbdd794dbddda95bc

commit b0bdb408113f6d1a6620d8bbbdd794dbddda95bc
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Wed Apr 7 13:13:29 2021 -0400

    Implement XXSPLTIW support.
    
    This patch implements XXSPLTIW support for V8HI, V4SI, and V4SF vector
    constants.
    
    A new constraint (eW) is added to match constants that can be loaded with
    the XXSPLTIW instruction.
    
    I have moved the XXSPLTIW built-in function support from altivec.md to
    vsx.md because the functions can load any VSX register, not just the
    ALTIVEC registers.  I have also re-implemented the built-in functions to
    load the vector constants, which will be optimized to generate the
    appropriate XXSPLTIW, VSPLTISH, VSPLTISW, or XXSPLTIB instruction.
    
    I have added a temporary switch (-mxxspltiw) to control whether or not the
    XXSPLTIW instruction is generated.
    
    This patch provides an xxspltiw_constant_p function that decodes both
    VEC_DUPLICATE and CONST_VECTOR expressions (similar to the existing
    xxspltib_constant_p function).
    
    The xxspltiw_constant_p function passes back the integer that will be used
    as the immediate of the XXSPLTIW instruction.  For V8HI constants, this is
    the 16-bit element duplicated into both halves of a 32-bit constant; for
    V4SF constants, it is the integer representation of the 32-bit
    floating-point value.
    
    gcc/
    2021-04-07  Michael Meissner  <meissner@linux.ibm.com>
    
            * config/rs6000/altivec.md (UNSPEC_XXSPLTIW): Move to vsx.md.
            (xxspltiw_v4si): Move to vsx.md and re-implement.
            (xxspltiw_v4sf): Move to vsx.md and re-implement.
            (xxspltiw_v4sf_inst): Delete.
            * config/rs6000/constraints.md (eW): New constraint.
            * config/rs6000/predicates.md (xxspltiw_operand): New predicate.
            (easy_vector_constant): If we can generate XXSPLTIW, mark the
            vector constant as easy.
            * config/rs6000/rs6000-cpus.def (OTHER_POWER10_MASKS): Add
            -mxxspltiw support.
            (POWERPC_MASKS): Add -mxxspltiw support.
            * config/rs6000/rs6000-protos.h (xxspltiw_constant_p): New
            declaration.
            * config/rs6000/rs6000.c (rs6000_option_override_internal): Add
            -mxxspltiw support.
            (xxspltib_constant_p): If we can generate XXSPLTIW, don't generate
            XXSPLTIB and a vector extend instruction.
            (xxspltiw_constant_p): New function.
            (output_vec_const_move): Add support for XXSPLTIW.
            (rs6000_opt_masks): Add -mxxspltiw support.
            * config/rs6000/rs6000.opt (-mxxspltiw): New switch.
            * config/rs6000/vsx.md (UNSPEC_XXSPLTIW): Move here from
            altivec.md.
            (vsx_mov<mode>_64bit): Add XXSPLTIW support.
            (vsx_mov<mode>_32bit): Add XXSPLTIW support.
            (XXSPLTIW): New mode iterator.
            (xxspltiw_<mode>_internal1): New define_insn_and_split.
            (xxspltiw_<mode>_internal2): New define_insn.
            (xxspltiw_v4si): Move to vsx.md from altivec.md.  Re-implement to
            use the new constant format.
            (xxspltiw_v4sf): Move to vsx.md from altivec.md.  Re-implement to
            use the new constant format.

Diff:
---
 gcc/config/rs6000/altivec.md      | 30 -------------
 gcc/config/rs6000/constraints.md  |  5 +++
 gcc/config/rs6000/predicates.md   | 22 +++++++++
 gcc/config/rs6000/rs6000-cpus.def |  6 ++-
 gcc/config/rs6000/rs6000-protos.h |  1 +
 gcc/config/rs6000/rs6000.c        | 93 ++++++++++++++++++++++++++++++++++++++
 gcc/config/rs6000/rs6000.opt      |  3 ++
 gcc/config/rs6000/vsx.md          | 95 +++++++++++++++++++++++++++++++++------
 8 files changed, 209 insertions(+), 46 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 1351dafbc41..708296cb14d 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -176,7 +176,6 @@
    UNSPEC_VSTRIL
    UNSPEC_SLDB
    UNSPEC_SRDB
-   UNSPEC_XXSPLTIW
    UNSPEC_XXSPLTID
    UNSPEC_XXSPLTI32DX
    UNSPEC_XXBLEND
@@ -820,35 +819,6 @@
   "vs<SLDB_lr>dbi %0,%1,%2,%3"
   [(set_attr "type" "vecsimple")])
 
-(define_insn "xxspltiw_v4si"
-  [(set (match_operand:V4SI 0 "register_operand" "=wa")
-	(unspec:V4SI [(match_operand:SI 1 "s32bit_cint_operand" "n")]
-		     UNSPEC_XXSPLTIW))]
- "TARGET_POWER10"
- "xxspltiw %x0,%1"
- [(set_attr "type" "vecsimple")
-  (set_attr "prefixed" "yes")])
-
-(define_expand "xxspltiw_v4sf"
-  [(set (match_operand:V4SF 0 "register_operand" "=wa")
-	(unspec:V4SF [(match_operand:SF 1 "const_double_operand" "n")]
-		     UNSPEC_XXSPLTIW))]
- "TARGET_POWER10"
-{
-  long long value = rs6000_const_f32_to_i32 (operands[1]);
-  emit_insn (gen_xxspltiw_v4sf_inst (operands[0], GEN_INT (value)));
-  DONE;
-})
-
-(define_insn "xxspltiw_v4sf_inst"
-  [(set (match_operand:V4SF 0 "register_operand" "=wa")
-	(unspec:V4SF [(match_operand:SI 1 "c32bit_cint_operand" "n")]
-		     UNSPEC_XXSPLTIW))]
- "TARGET_POWER10"
- "xxspltiw %x0,%1"
- [(set_attr "type" "vecsimple")
-  (set_attr "prefixed" "yes")])
-
 (define_expand "xxspltidp_v2df"
   [(set (match_operand:V2DF 0 "register_operand" )
 	(unspec:V2DF [(match_operand:SF 1 "const_double_operand")]
diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index 561ce9797af..b3e36fbcfdf 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -213,6 +213,11 @@
   "A signed 34-bit integer constant if prefixed instructions are supported."
   (match_operand 0 "cint34_operand"))
 
+;; V4SI/V4SF/V8HI vector constant that can be loaded with XXSPLTIW
+(define_constraint "eW"
+  "A vector constant that can be loaded with the XXSPLTIW instruction."
+  (match_operand 0 "xxspltiw_operand"))
+
 ;; Floating-point constraints.  These two are defined so that insn
 ;; length attributes can be calculated exactly.
 
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index e21bc745f72..dc23f62a3af 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -640,6 +640,25 @@
   return num_insns == 1;
 })
 
+;; Return 1 if the operand is a CONST_VECTOR and can be loaded into a vector
+;; using the ISA 3.1 XXSPLTIW instruction.  Do not return 1 if the value can be
+;; loaded with a smaller XXSPLTIB or VSPLTISW instruction.
+(define_predicate "xxspltiw_operand"
+  (match_code "vec_duplicate,const_vector")
+{
+  HOST_WIDE_INT value = 0;
+
+  if (!xxspltiw_constant_p (op, mode, &value))
+    return false;
+
+  /* xxspltiw_constant_p returns V8HI as (element | (element << 16)).  Undo
+     this to see if the value is in the range -16..15.  */
+  if (mode == V8HImode)
+    value = ((value & 0xffff) ^ 0x8000) - 0x8000;
+
+  return !EASY_VECTOR_15 (value);
+})
+
 ;; Return 1 if the operand is a CONST_VECTOR and can be loaded into a
 ;; vector register without using memory.
 (define_predicate "easy_vector_constant"
@@ -653,6 +672,9 @@
       if (zero_constant (op, mode) || all_ones_constant (op, mode))
 	return true;
 
+      if (xxspltiw_operand (op, mode))
+	return true;
+
       if (TARGET_P9_VECTOR
           && xxspltib_constant_p (op, mode, &num_insns, &value))
 	return true;
diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
index cbbb42c1b3a..f7743374f26 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -78,7 +78,8 @@
 #define OTHER_POWER10_MASKS	(OPTION_MASK_MMA			\
 				 | OPTION_MASK_PCREL			\
 				 | OPTION_MASK_PCREL_OPT		\
-				 | OPTION_MASK_PREFIXED)
+				 | OPTION_MASK_PREFIXED			\
+				 | OPTION_MASK_XXSPLTIW)
 
 #define ISA_3_1_MASKS_SERVER	(ISA_3_0_MASKS_SERVER			\
 				 | OPTION_MASK_POWER10			\
@@ -160,7 +161,8 @@
 				 | OPTION_MASK_RECIP_PRECISION		\
 				 | OPTION_MASK_SOFT_FLOAT		\
 				 | OPTION_MASK_STRICT_ALIGN_OPTIONAL	\
-				 | OPTION_MASK_VSX)
+				 | OPTION_MASK_VSX			\
+				 | OPTION_MASK_XXSPLTIW)
 
 #endif
 
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 8ac30905013..eff72af8814 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -32,6 +32,7 @@ extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, int, int, int,
 
 extern bool easy_altivec_constant (rtx, machine_mode);
 extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *);
+extern bool xxspltiw_constant_p (rtx, machine_mode, HOST_WIDE_INT *);
 extern int vspltis_shifted (rtx);
 extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int);
 extern bool macho_lo_sum_memory_operand (rtx, machine_mode);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index dc38c093c53..54c338d73a5 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4476,6 +4476,10 @@ rs6000_option_override_internal (bool global_init_p)
       rs6000_isa_flags &= ~OPTION_MASK_MMA;
     }
 
+  if (TARGET_POWER10 && TARGET_VSX
+      && (rs6000_isa_flags_explicit & OPTION_MASK_XXSPLTIW) == 0)
+    rs6000_isa_flags |= OPTION_MASK_XXSPLTIW;
+
   if (!TARGET_PCREL && TARGET_PCREL_OPT)
     rs6000_isa_flags &= ~OPTION_MASK_PCREL_OPT;
 
@@ -6460,6 +6464,12 @@ xxspltib_constant_p (rtx op,
   else if (IN_RANGE (value, -1, 0))
     *num_insns_ptr = 1;
 
+  /* If XXSPLTIW is available, don't return true if we can use that
+     instruction instead of doing 2 instructions. */
+  else if (TARGET_XXSPLTIW
+	   && (mode == V4SImode || mode == V8HImode))
+    return false;
+
   else
     *num_insns_ptr = 2;
 
@@ -6467,6 +6477,66 @@ xxspltib_constant_p (rtx op,
   return true;
 }
 
+/* Return true if OP is of the given MODE and can be synthesized with ISA 3.1
+   XXSPLTIW instruction, possibly with a sign extension.
+
+   Return the constant that is being splatted via CONSTANT_PTR.  */
+
+bool
+xxspltiw_constant_p (rtx op,
+		     machine_mode mode,
+		     HOST_WIDE_INT *constant_ptr)
+{
+  *constant_ptr = 0;
+
+  if (!TARGET_XXSPLTIW)
+    return false;
+
+  if (mode == VOIDmode)
+    mode = GET_MODE (op);
+
+  if (mode != V8HImode && mode != V4SImode && mode != V4SFmode)
+    return false;
+
+  rtx element = op;
+  if (GET_CODE (op) == VEC_DUPLICATE)
+    element = XEXP (op, 0);
+
+  else if (GET_CODE (op) == CONST_VECTOR)
+    {
+      size_t nunits = GET_MODE_NUNITS (mode);
+      element = CONST_VECTOR_ELT (op, 0);
+
+      for (size_t i = 1; i < nunits; i++)
+	if (!rtx_equal_p (element, CONST_VECTOR_ELT (op, i)))
+	  return false;
+    }
+
+  HOST_WIDE_INT value;
+  if (CONST_INT_P (element))
+    {
+      value = INTVAL (element);
+      if (!SIGNED_INTEGER_NBIT_P (value, 32))
+	return false;
+
+      /* For V8HImode, return the value setting 2 elements of the constant.  */
+      if (mode == V8HImode)
+	{
+	  value &= 0xffff;
+	  value |= value << 16;
+	}
+    }
+
+  else if (CONST_DOUBLE_P (element))
+    value = rs6000_const_f32_to_i32 (element);
+
+  else
+    return false;
+
+  *constant_ptr = value;
+  return true;
+}
+
 const char *
 output_vec_const_move (rtx *operands)
 {
@@ -6511,6 +6581,28 @@ output_vec_const_move (rtx *operands)
 	    gcc_unreachable ();
 	}
 
+      HOST_WIDE_INT xxspltiw_value = 0;
+      if (xxspltiw_constant_p (vec, mode, &xxspltiw_value))
+	{
+	  /* Generate the smaller VSPLTIS{H,W} if we can.  */
+	  if (dest_vmx_p && mode == V8HImode)
+	    {
+	      long hi_value = ((xxspltiw_value & 0xffff) ^ 0x8000) - 0x8000;
+	      if (IN_RANGE (hi_value, -16, 15))
+		{
+		  operands[2] = GEN_INT (hi_value);
+		  return "vspltish %0,%2";
+		}
+	    }
+
+	  operands[2] = GEN_INT (xxspltiw_value);
+	  if (dest_vmx_p && mode == V4SImode
+	      && IN_RANGE (xxspltiw_value, -16, 15))
+	    return "vspltisw %0,%2";
+
+	  return "xxspltiw %x0,%2";
+	}
+
       if (TARGET_P9_VECTOR
 	  && xxspltib_constant_p (vec, mode, &num_insns, &xxspltib_value))
 	{
@@ -24008,6 +24100,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
   { "string",			0,				false, true  },
   { "update",			OPTION_MASK_NO_UPDATE,		true , true  },
   { "vsx",			OPTION_MASK_VSX,		false, true  },
+  { "xxspltiw",			OPTION_MASK_XXSPLTIW,		false, true  },
 #ifdef OPTION_MASK_64BIT
 #if TARGET_AIX_OS
   { "aix64",			OPTION_MASK_64BIT,		false, false },
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 0dbdf753673..06e7cdbbced 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -619,3 +619,6 @@ Generate (do not generate) MMA instructions.
 
 mrelative-jumptables
 Target Undocumented Var(rs6000_relative_jumptables) Init(1) Save
+
+mxxspltiw
+Target Undocumented Mask(XXSPLTIW) Var(rs6000_isa_flags)
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index bcb92be2f5c..825e8b1480b 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -369,6 +369,7 @@
    UNSPEC_REPLACE_UN
    UNSPEC_VDIVES
    UNSPEC_VDIVEU
+   UNSPEC_XXSPLTIW
   ])
 
 (define_int_iterator XVCVBF16	[UNSPEC_VSX_XVCVSPBF16
@@ -1167,17 +1168,17 @@
 
 ;;              VSX store  VSX load   VSX move  VSX->GPR   GPR->VSX    LQ (GPR)
 ;;              STQ (GPR)  GPR load   GPR store GPR move   XXSPLTIB    VSPLTISW
-;;              VSX 0/-1   VMX const  GPR const LVX (VMX)  STVX (VMX)
+;;              VSX 0/-1   VMX const  GPR const LVX (VMX)  STVX (VMX)  XXSPLTI*
 (define_insn "vsx_mov<mode>_64bit"
   [(set (match_operand:VSX_M 0 "nonimmediate_operand"
                "=ZwO,      wa,        wa,        r,         we,        ?wQ,
                 ?&r,       ??r,       ??Y,       <??r>,     wa,        v,
-                ?wa,       v,         <??r>,     wZ,        v")
+                ?wa,       v,         <??r>,     wZ,        v,         wa")
 
 	(match_operand:VSX_M 1 "input_operand" 
                "wa,        ZwO,       wa,        we,        r,         r,
                 wQ,        Y,         r,         r,         wE,        jwM,
-                ?jwM,      W,         <nW>,      v,         wZ"))]
+                ?jwM,      W,         <nW>,      v,         wZ,        eW"))]
 
   "TARGET_POWERPC64 && VECTOR_MEM_VSX_P (<MODE>mode)
    && (register_operand (operands[0], <MODE>mode) 
@@ -1188,36 +1189,40 @@
   [(set_attr "type"
                "vecstore,  vecload,   vecsimple, mtvsr,     mfvsr,     load,
                 store,     load,      store,     *,         vecsimple, vecsimple,
-                vecsimple, *,         *,         vecstore,  vecload")
+                vecsimple, *,         *,         vecstore,  vecload,   vecperm")
    (set_attr "num_insns"
                "*,         *,         *,         2,         *,         2,
                 2,         2,         2,         2,         *,         *,
-                *,         5,         2,         *,         *")
+                *,         5,         2,         *,         *,         *")
    (set_attr "max_prefixed_insns"
                "*,         *,         *,         *,         *,         2,
                 2,         2,         2,         2,         *,         *,
-                *,         *,         *,         *,         *")
+                *,         *,         *,         *,         *,         *")
    (set_attr "length"
                "*,         *,         *,         8,         *,         8,
                 8,         8,         8,         8,         *,         *,
-                *,         20,        8,         *,         *")
+                *,         20,        8,         *,         *,         *")
    (set_attr "isa"
                "<VSisa>,   <VSisa>,   <VSisa>,   *,         *,         *,
                 *,         *,         *,         *,         p9v,       *,
-                <VSisa>,   *,         *,         *,         *")])
+                <VSisa>,   *,         *,         *,         *,         p10")
+   (set_attr "prefixed"
+               "*,         *,         *,         *,         *,         *,
+                *,         *,         *,         *,         *,         *,
+                *,         *,         *,         *,         *,         yes")])
 
 ;;              VSX store  VSX load   VSX move   GPR load   GPR store  GPR move
-;;              XXSPLTIB   VSPLTISW   VSX 0/-1   VMX const  GPR const
+;;              XXSPLTIB   VSPLTISW   VSX 0/-1   VMX const  GPR const  XXSPLTI*
 ;;              LVX (VMX)  STVX (VMX)
 (define_insn "*vsx_mov<mode>_32bit"
   [(set (match_operand:VSX_M 0 "nonimmediate_operand"
                "=ZwO,      wa,        wa,        ??r,       ??Y,       <??r>,
-                wa,        v,         ?wa,       v,         <??r>,
+                wa,        v,         ?wa,       v,         <??r>,     wa,
                 wZ,        v")
 
 	(match_operand:VSX_M 1 "input_operand" 
                "wa,        ZwO,       wa,        Y,         r,         r,
-                wE,        jwM,       ?jwM,      W,         <nW>,
+                wE,        jwM,       ?jwM,      W,         <nW>,      eW,
                 v,         wZ"))]
 
   "!TARGET_POWERPC64 && VECTOR_MEM_VSX_P (<MODE>mode)
@@ -1228,15 +1233,19 @@
 }
   [(set_attr "type"
                "vecstore,  vecload,   vecsimple, load,      store,    *,
-                vecsimple, vecsimple, vecsimple, *,         *,
+                vecsimple, vecsimple, vecsimple, *,         *,        vecperm,
                 vecstore,  vecload")
    (set_attr "length"
                "*,         *,         *,         16,        16,        16,
-                *,         *,         *,         20,        16,
+                *,         *,         *,         20,        16,        *,
                 *,         *")
    (set_attr "isa"
                "<VSisa>,   <VSisa>,   <VSisa>,   *,         *,         *,
-                p9v,       *,         <VSisa>,   *,         *,
+                p9v,       *,         <VSisa>,   *,         *,         p10,
+                *,         *")
+   (set_attr "prefixed"
+               "*,         *,         *,         *,         *,         *,
+                *,         *,         *,         *,         *,         yes,
                 *,         *")])
 
 ;; Explicit  load/store expanders for the builtin functions
@@ -6216,3 +6225,61 @@
   "TARGET_POWER10"
   "vmulld %0,%1,%2"
   [(set_attr "type" "veccomplex")])
+
+\f
+;; XXSPLTIW support.
+(define_mode_iterator XXSPLTIW [V8HI V4SI V4SF])
+
+(define_insn_and_split "*xxspltiw_<mode>_internal1"
+  [(set (match_operand:XXSPLTIW 0 "vsx_register_operand" "=wa")
+	(match_operand:XXSPLTIW 1 "xxspltiw_operand"))]
+  "TARGET_XXSPLTIW"
+  "#"
+  "&& 1"
+  [(set (match_operand:XXSPLTIW 0 "vsx_register_operand")
+	(unspec:XXSPLTIW [(match_dup 2)] UNSPEC_XXSPLTIW))]
+{
+  HOST_WIDE_INT value = 0;
+
+  if (!xxspltiw_constant_p (operands[1], <MODE>mode, &value))
+    gcc_unreachable ();
+
+  operands[2] = GEN_INT (value);
+}
+ [(set_attr "type" "vecsimple")
+  (set_attr "prefixed" "yes")])
+
+(define_insn "*xxspltiw_<mode>_internal2"
+  [(set (match_operand:XXSPLTIW 0 "vsx_register_operand" "=wa")
+	(unspec:XXSPLTIW [(match_operand 1 "const_int_operand" "n")]
+			 UNSPEC_XXSPLTIW))]
+  "TARGET_XXSPLTIW"
+  "xxspltiw %x0,%1"
+ [(set_attr "type" "vecsimple")
+  (set_attr "prefixed" "yes")])
+
+;; Implement XXSPLTIW built-in functions as just loading up the appropriate
+;; constant vector.  The normal optimizations will generate XXSPLTIW.
+(define_expand "xxspltiw_v4si"
+  [(use (match_operand:V4SI 0 "register_operand"))
+   (use (match_operand:SI 1 "s32bit_cint_operand"))]
+  "TARGET_POWER10"
+{
+  rtx op1 = operands[1];
+  rtvec rv = gen_rtvec (4, op1, op1, op1, op1);
+  rtx cv = gen_rtx_CONST_VECTOR (V4SImode, rv);
+  emit_move_insn (operands[0], cv);
+  DONE;
+})
+
+(define_expand "xxspltiw_v4sf"
+  [(use (match_operand:V4SF 0 "register_operand"))
+   (use (match_operand:SF 1 "const_double_operand"))]
+  "TARGET_POWER10"
+{
+  rtx op1 = operands[1];
+  rtvec rv = gen_rtvec (4, op1, op1, op1, op1);
+  rtx cv = gen_rtx_CONST_VECTOR (V4SFmode, rv);
+  emit_move_insn (operands[0], cv);
+  DONE;
+})
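
Note on the immediate encoding: the patch has xxspltiw_constant_p return a
single 32-bit value for all three modes. For V8HI it duplicates one halfword
into both halves of the word, and for V4SF it uses the bit pattern of the
32-bit float. output_vec_const_move then sign-extends the low halfword to
decide whether the shorter VSPLTISH can be used instead. The standalone C
sketch below mirrors only that arithmetic. It is not GCC code: the helper
names (splat_hi_to_word, f32_bits, sign_extend_hi, fits_vspltis) are
hypothetical, and f32_bits assumes the host float is IEEE 754 binary32.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build the 32-bit XXSPLTIW immediate for a V8HI
   splat by copying the 16-bit element into both halves of the word,
   as the patch does with "value &= 0xffff; value |= value << 16".  */
static uint32_t
splat_hi_to_word (int16_t elt)
{
  uint32_t v = (uint16_t) elt;
  return v | (v << 16);
}

/* Hypothetical helper: return the bit pattern of a 32-bit float.
   Assumes the host float is IEEE 754 binary32.  */
static uint32_t
f32_bits (float f)
{
  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);
  return bits;
}

/* Recover the signed halfword from the low 16 bits of the immediate,
   using the same xor/subtract idiom as output_vec_const_move.  */
static long
sign_extend_hi (uint32_t word)
{
  return (long) ((word & 0xffff) ^ 0x8000) - 0x8000;
}

/* True if VALUE fits the 5-bit signed immediate of VSPLTISH/VSPLTISW.  */
static int
fits_vspltis (long value)
{
  return value >= -16 && value <= 15;
}

/* Small self-check of the arithmetic above.  */
static void
self_check (void)
{
  assert (splat_hi_to_word (5) == 0x00050005u);
  assert (splat_hi_to_word (-3) == 0xfffdfffdu);
  assert (sign_extend_hi (0xfffdfffdu) == -3);
  assert (fits_vspltis (sign_extend_hi (0xfffdfffdu)));
  assert (f32_bits (1.0f) == 0x3f800000u);
}
```

With these helpers, a V8HI splat of -3 yields the word 0xfffdfffd, whose low
halfword sign-extends back to -3 and so falls within the -16..15 range that
the patch checks before emitting VSPLTISH for an Altivec destination register.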

