public inbox for gcc-cvs@sourceware.org
help / color / mirror / Atom feed
* [gcc(refs/users/meissner/heads/work056)] Generate XXSPLTIDP on power10.
@ 2021-06-18  4:37 Michael Meissner
  0 siblings, 0 replies; 2+ messages in thread
From: Michael Meissner @ 2021-06-18  4:37 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:a08c2622d8a8e37c7c22267c017dca75fdfe42ea

commit a08c2622d8a8e37c7c22267c017dca75fdfe42ea
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Fri Jun 18 00:37:32 2021 -0400

    Generate XXSPLTIDP on power10.
    
    This patch implements XXSPLTIDP support for SF and DF scalar constants and
    V2DF vector constants.
    
    A new constraint (eF) is added to match constants that can be loaded with
    the XXSPLTIDP instruction.
    
    I have added a temporary switch (-mxxspltidp) to control whether or not the
    XXSPLTIDP instruction is generated.
    
    This patch provides a xxspltidp_constant_p function which decodes both
    VEC_DUPLICATE and VECTOR_CONST insns (similar to the existing
    xxspltib_constant_p function).
    
    The xxspltidp_constant_p function returns the appropriate integer that will be
    used in the XXSPLTIDP instruction.  Note, because SFmode denormal values
    are undefined in the hardware, the xxspltidp_constant_p function returns
    false for these values.  Also xxspltidp_constant_p returns false for 0.0
    because is cheaper to implement without XXSPLTIDP.
    
    I added 3 new tests to test loading up SF/DF scalar and V2DF vector
    constants.
    
    gcc/
    2021-06-18  Michael Meissner  <meissner@linux.ibm.com>
    
            * config/rs6000/constraints.md (eF): New constraint.
            * config/rs6000/predicates.md (easy_fp_constant): If we can load
            the scalar constant with XXSPLTIDP, the floating point constant is
            easy.
            (xxspltidp_operand): New predicate.
            (easy_vector_constant): If we can generate XXSPLTIDP, mark the
            vector constant as easy.
            * config/rs6000/rs6000-cpus.def (OTHER_POWER10_MASKS): Add
            -mxxspltidp support.
            (POWERPC_MASKS): Add -mxxspltidp support.
            * config/rs6000/rs6000-protos.h (xxspltidp_constant_p): New
            declaration.
            * config/rs6000/rs6000.c (rs6000_option_override_internal): Add
            -mxxspltidp support.
            (const_vector_element_all_same): New function.
            (xxspltidp_constant_p): New function.
            (output_vec_const_move): Add support for XXSPLTIDP.
            (rs6000_opt_masks): Add -mxxspltidp support.
            (rs6000_emit_xxspltidp_v2df): Change function to implement the
            XXSPLTIDP instruction.
            * config/rs6000/rs6000.md (movsf_hardfloat): Add XXSPLTIDP
            support.
            (mov<mode>_hardfloat32, FMOVE64 iterator): Add XXSPLTIDP support.
            (mov<mode>_hardfloat64, FMOVE64 iterator): Add XXSPLTIDP support.
            * config/rs6000/rs6000.opt (-mxxspltidp): New switch.
            * config/rs6000/vsx.md (UNSPEC_XXSPLTIDP): Rename UNSPEC_XXSPLTID
            to UNSPEC_XXSPLTIDP to match the instruction.
            (xxspltidp_v2df): Use 'use' for the expand arguments, instead of
            writing out an insn.
            (xxspltidp_v2df_inst): Delete.
            (XXSPLTIDP): New mode iterator.
            (xxspltidp_<mode>_internal1): New define_insn_and_split.
            (xxspltidp_<mode>_internal2): New define_insn.
    
    gcc/testsuite/
    2021-06-18  Michael Meissner  <meissner@linux.ibm.com>
    
            * gcc.target/powerpc/vec-splat-constant-sf.c: New test.
            * gcc.target/powerpc/vec-splat-constant-df.c: New test.
            * gcc.target/powerpc/vec-splat-constant-v2df.c: New test.

Diff:
---
 gcc/config/rs6000/constraints.md                   |   5 +
 gcc/config/rs6000/predicates.md                    |  21 +++++
 gcc/config/rs6000/rs6000-cpus.def                  |   2 +
 gcc/config/rs6000/rs6000-protos.h                  |   1 +
 gcc/config/rs6000/rs6000.c                         | 101 ++++++++++++++++++++-
 gcc/config/rs6000/rs6000.md                        |  52 +++++++----
 gcc/config/rs6000/rs6000.opt                       |   4 +
 gcc/config/rs6000/vsx.md                           |  52 ++++++++---
 .../gcc.target/powerpc/vec-splat-constant-df.c     |  60 ++++++++++++
 .../gcc.target/powerpc/vec-splat-constant-sf.c     |  60 ++++++++++++
 .../gcc.target/powerpc/vec-splat-constant-v2df.c   |  64 +++++++++++++
 11 files changed, 388 insertions(+), 34 deletions(-)

diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index 561ce9797af..e1fadd63580 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -208,6 +208,11 @@
   (and (match_code "const_int")
        (match_test "((- (unsigned HOST_WIDE_INT) ival) + 0x8000) < 0x10000")))
 
+;; SF/DF/V2DF scalar or vector constant that can be loaded with XXSPLTIDP
+(define_constraint "eF"
+  "A vector constant that can be loaded with the XXSPLTIDP instruction."
+  (match_operand 0 "xxspltidp_operand"))
+
 ;; 34-bit signed integer constant
 (define_constraint "eI"
   "A signed 34-bit integer constant if prefixed instructions are supported."
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index aa17ddc94e5..1831ff8b6d9 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -601,6 +601,11 @@
   if (TARGET_VSX && op == CONST0_RTX (mode))
     return 1;
 
+  /* If we have the ISA 3.1 XXSPLTIDP instruction, see if the constant can
+     be loaded with that instruction.  */
+  if (xxspltidp_operand (op, mode))
+    return 1;
+
   /* Otherwise consider floating point constants hard, so that the
      constant gets pushed to memory during the early RTL phases.  This
      has the advantage that double precision constants that can be
@@ -666,6 +671,19 @@
   return true;
 })
 
+;; Return 1 if operand is a SF/DF CONST_DOUBLE or V2DF CONST_VECTOR that can be
+;; loaded via the ISA 3.1 XXSPLTIDP instruction.  Do not return true if the
+;; value is 0.0, since that is easy to generate without using XXSPLTIDP.
+(define_predicate "xxspltidp_operand"
+  (match_code "const_double,const_vector,vec_duplicate")
+{
+  if (op == CONST0_RTX (mode))
+    return false;
+
+  HOST_WIDE_INT value = 0;
+  return xxspltidp_constant_p (op, mode, &value);
+})
+
 ;; Return 1 if the operand is a CONST_VECTOR and can be loaded into a
 ;; vector register without using memory.
 (define_predicate "easy_vector_constant"
@@ -682,6 +700,9 @@
       if (xxspltiw_operand (op, mode))
 	return true;
 
+      if (xxspltidp_operand (op, mode))
+	return true;
+
       if (TARGET_P9_VECTOR
           && xxspltib_constant_p (op, mode, &num_insns, &value))
 	return true;
diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
index e6c5891d334..fcdab4cea4a 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -89,6 +89,7 @@
 				 | OPTION_MASK_P10_FUSION_LOGADD 	\
 				 | OPTION_MASK_P10_FUSION_ADDLOG	\
 				 | OPTION_MASK_P10_FUSION_2ADD		\
+				 | OPTION_MASK_XXSPLTIDP		\
 				 | OPTION_MASK_XXSPLTIW)
 
 /* Flags that need to be turned off if -mno-power9-vector.  */
@@ -168,6 +169,7 @@
 				 | OPTION_MASK_SOFT_FLOAT		\
 				 | OPTION_MASK_STRICT_ALIGN_OPTIONAL	\
 				 | OPTION_MASK_VSX			\
+				 | OPTION_MASK_XXSPLTIDP		\
 				 | OPTION_MASK_XXSPLTIW)
 #endif
 
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 94bf961c6b7..c381ddaed37 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -32,6 +32,7 @@ extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, int, int, int,
 
 extern bool easy_altivec_constant (rtx, machine_mode);
 extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *);
+extern bool xxspltidp_constant_p (rtx, machine_mode, HOST_WIDE_INT *);
 extern int vspltis_shifted (rtx);
 extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int);
 extern bool macho_lo_sum_memory_operand (rtx, machine_mode);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 3761a448ab6..ab5d9daf4f3 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4503,11 +4503,14 @@ rs6000_option_override_internal (bool global_init_p)
 
   if (TARGET_POWER10 && TARGET_VSX && TARGET_PREFIXED)
     {
+      if ((rs6000_isa_flags_explicit & OPTION_MASK_XXSPLTIDP) == 0)
+	rs6000_isa_flags |= OPTION_MASK_XXSPLTIW;
+
       if ((rs6000_isa_flags_explicit & OPTION_MASK_XXSPLTIW) == 0)
 	rs6000_isa_flags |= OPTION_MASK_XXSPLTIW;
     }
   else
-    rs6000_isa_flags &= ~OPTION_MASK_XXSPLTIW;
+    rs6000_isa_flags &= ~(OPTION_MASK_XXSPLTIDP | OPTION_MASK_XXSPLTIW);
 
   if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET)
     rs6000_print_isa_options (stderr, 0, "after subtarget", rs6000_isa_flags);
@@ -6511,6 +6514,96 @@ xxspltib_constant_p (rtx op,
   return true;
 }
 
+/* Return the element of a constant vector whose elements are all the same.  In
+   addition if VEC_DUPLICATE is used, return the element being duplicated.  If
+   neither is true, return NULL_RTX.  */
+
+static rtx
+const_vector_element_all_same (rtx op)
+{
+  if (GET_CODE (op) == VEC_DUPLICATE)
+    {
+      rtx element = XEXP (op, 0);
+      return (CONST_INT_P (element) || CONST_DOUBLE_P (element)
+	       ? element
+	       : NULL_RTX);
+    }
+
+  else if (GET_CODE (op) == CONST_VECTOR)
+    {
+      machine_mode mode = GET_MODE (op);
+      size_t n_elts = GET_MODE_NUNITS (mode);
+      rtx element = CONST_VECTOR_ELT (op, 0);
+
+      for (size_t i = 1; i < n_elts; i++)
+	if (!rtx_equal_p (element, CONST_VECTOR_ELT (op, 1)))
+	  return NULL_RTX;
+
+      return element;
+    }
+
+  return NULL_RTX;
+}
+
+/* Return true if OP is of the given MODE and can be synthesized with ISA 3.1
+   XXSPLTIDP instruction.
+
+   Return the constant that is being split via CONSTANT_PTR to use in the
+   XXSPLTIDP instruction.  */
+
+bool
+xxspltidp_constant_p (rtx op,
+		      machine_mode mode,
+		      HOST_WIDE_INT *constant_ptr)
+{
+  *constant_ptr = 0;
+
+  if (!TARGET_XXSPLTIDP)
+    return false;
+
+  if (mode == VOIDmode)
+    mode = GET_MODE (op);
+
+  rtx element = op;
+  if (mode == V2DFmode)
+    {
+      element = const_vector_element_all_same (op);
+      if (!element)
+	return false;
+
+      mode = DFmode;
+    }
+
+  if (mode != SFmode && mode != DFmode)
+    return false;
+
+  if (GET_MODE (element) != mode)
+    return false;
+
+  if (!CONST_DOUBLE_P (element))
+    return false;
+
+  /* Don't return true for 0.0 since that is easy to create without
+     XXSPLTIDP.  */
+  if (element == CONST0_RTX (mode))
+    return false;
+
+  /* If the value doesn't fit in a SFmode, exactly, we can't use XXSPLTIDP.  */
+  const struct real_value *rv = CONST_DOUBLE_REAL_VALUE (element);
+  if (!exact_real_truncate (SFmode, rv))
+    return 0;
+
+  long value;
+  REAL_VALUE_TO_TARGET_SINGLE (*rv, value);
+
+  /* Test for SFmode denormal (exponent is 0, mantissa field is non-zero).  */
+  if (((value & 0x7F800000) == 0) && ((value & 0x7FFFFF) != 0))
+    return false;
+
+  *constant_ptr = value;
+  return true;
+}
+
 const char *
 output_vec_const_move (rtx *operands)
 {
@@ -6555,7 +6648,8 @@ output_vec_const_move (rtx *operands)
 	    gcc_unreachable ();
 	}
 
-      if (xxspltiw_operand (vec, mode))
+      if (xxspltiw_operand (vec, mode)
+	  || xxspltidp_operand (vec, mode))
 	return "#";
 
       if (TARGET_P9_VECTOR
@@ -24133,6 +24227,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
   { "update",			OPTION_MASK_NO_UPDATE,		true , true  },
   { "vsx",			OPTION_MASK_VSX,		false, true  },
   { "xxspltiw",			OPTION_MASK_XXSPLTIW,		false, true  },
+  { "xxspltidp",		OPTION_MASK_XXSPLTIDP,		false, true  },
 #ifdef OPTION_MASK_64BIT
 #if TARGET_AIX_OS
   { "aix64",			OPTION_MASK_64BIT,		false, false },
@@ -27972,7 +28067,7 @@ rs6000_emit_xxspltidp_v2df (rtx dst, long value)
     inform (input_location,
 	    "the result for the xxspltidp instruction "
 	    "is undefined for subnormal input values");
-  emit_insn( gen_xxspltidp_v2df_inst (dst, GEN_INT (value)));
+  emit_insn (gen_xxspltidp_v2df_internal2 (dst, GEN_INT (value)));
 }
 
 /* Implement TARGET_ASM_GENERATE_PIC_ADDR_DIFF_VEC.  */
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 44d82c605c7..ebdc95a006a 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -7689,17 +7689,17 @@
 ;;
 ;;	LWZ          LFS        LXSSP       LXSSPX     STFS       STXSSP
 ;;	STXSSPX      STW        XXLXOR      LI         FMR        XSCPSGNDP
-;;	MR           MT<x>      MF<x>       NOP
+;;	MR           MT<x>      MF<x>       NOP        XXSPLTIDP
 
 (define_insn "movsf_hardfloat"
   [(set (match_operand:SF 0 "nonimmediate_operand"
 	 "=!r,       f,         v,          wa,        m,         wY,
 	  Z,         m,         wa,         !r,        f,         wa,
-	  !r,        *c*l,      !r,         *h")
+	  !r,        *c*l,      !r,         *h,        wa")
 	(match_operand:SF 1 "input_operand"
 	 "m,         m,         wY,         Z,         f,         v,
 	  wa,        r,         j,          j,         f,         wa,
-	  r,         r,         *h,         0"))]
+	  r,         r,         *h,         0,         eF"))]
   "(register_operand (operands[0], SFmode)
    || register_operand (operands[1], SFmode))
    && TARGET_HARD_FLOAT
@@ -7721,15 +7721,20 @@
    mr %0,%1
    mt%0 %1
    mf%1 %0
-   nop"
+   nop
+   #"
   [(set_attr "type"
 	"load,       fpload,    fpload,     fpload,    fpstore,   fpstore,
 	 fpstore,    store,     veclogical, integer,   fpsimple,  fpsimple,
-	 *,          mtjmpr,    mfjmpr,     *")
+	 *,          mtjmpr,    mfjmpr,     *,         vecperm")
    (set_attr "isa"
 	"*,          *,         p9v,        p8v,       *,         p9v,
 	 p8v,        *,         *,          *,         *,         *,
-	 *,          *,         *,          *")])
+	 *,          *,         *,          *,         p10")
+   (set_attr "prefixed"
+	"*,          *,         *,          *,         *,         *,
+	 *,          *,         *,          *,         *,         *,
+	 *,          *,         *,          *,         yes")])
 
 ;;	LWZ          LFIWZX     STW        STFIWX     MTVSRWZ    MFVSRWZ
 ;;	FMR          MR         MT%0       MF%1       NOP
@@ -7989,18 +7994,18 @@
 
 ;;           STFD         LFD         FMR         LXSD        STXSD
 ;;           LXSD         STXSD       XXLOR       XXLXOR      GPR<-0
-;;           LWZ          STW         MR
+;;           LWZ          STW         MR          XXSPLTIDP
 
 
 (define_insn "*mov<mode>_hardfloat32"
   [(set (match_operand:FMOVE64 0 "nonimmediate_operand"
             "=m,          d,          d,          <f64_p9>,   wY,
               <f64_av>,   Z,          <f64_vsx>,  <f64_vsx>,  !r,
-              Y,          r,          !r")
+              Y,          r,          !r,         wa")
 	(match_operand:FMOVE64 1 "input_operand"
              "d,          m,          d,          wY,         <f64_p9>,
               Z,          <f64_av>,   <f64_vsx>,  <zero_fp>,  <zero_fp>,
-              r,          Y,          r"))]
+              r,          Y,          r,          eF"))]
   "! TARGET_POWERPC64 && TARGET_HARD_FLOAT
    && (gpc_reg_operand (operands[0], <MODE>mode)
        || gpc_reg_operand (operands[1], <MODE>mode))"
@@ -8017,20 +8022,25 @@
    #
    #
    #
+   #
    #"
   [(set_attr "type"
             "fpstore,     fpload,     fpsimple,   fpload,     fpstore,
              fpload,      fpstore,    veclogical, veclogical, two,
-             store,       load,       two")
+             store,       load,       two,        vecperm")
    (set_attr "size" "64")
    (set_attr "length"
             "*,           *,          *,          *,          *,
              *,           *,          *,          *,          8,
-             8,           8,          8")
+             8,           8,          8,          *")
    (set_attr "isa"
             "*,           *,          *,          p9v,        p9v,
              p7v,         p7v,        *,          *,          *,
-             *,           *,          *")])
+             *,           *,          *,          p10")
+   (set_attr "prefixed"
+            "*,           *,          *,          *,          *,
+             *,           *,          *,          *,          *,
+             *,           *,          *,          yes")])
 
 ;;           STW      LWZ     MR      G-const H-const F-const
 
@@ -8057,19 +8067,19 @@
 ;;           STFD         LFD         FMR         LXSD        STXSD
 ;;           LXSDX        STXSDX      XXLOR       XXLXOR      LI 0
 ;;           STD          LD          MR          MT{CTR,LR}  MF{CTR,LR}
-;;           NOP          MFVSRD      MTVSRD
+;;           NOP          MFVSRD      MTVSRD      XXSPLTIDP
 
 (define_insn "*mov<mode>_hardfloat64"
   [(set (match_operand:FMOVE64 0 "nonimmediate_operand"
            "=m,           d,          d,          <f64_p9>,   wY,
              <f64_av>,    Z,          <f64_vsx>,  <f64_vsx>,  !r,
              YZ,          r,          !r,         *c*l,       !r,
-            *h,           r,          <f64_dm>")
+            *h,           r,          <f64_dm>,   wa")
 	(match_operand:FMOVE64 1 "input_operand"
             "d,           m,          d,          wY,         <f64_p9>,
              Z,           <f64_av>,   <f64_vsx>,  <zero_fp>,  <zero_fp>,
              r,           YZ,         r,          r,          *h,
-             0,           <f64_dm>,   r"))]
+             0,           <f64_dm>,   r,          eF"))]
   "TARGET_POWERPC64 && TARGET_HARD_FLOAT
    && (gpc_reg_operand (operands[0], <MODE>mode)
        || gpc_reg_operand (operands[1], <MODE>mode))"
@@ -8091,18 +8101,24 @@
    mf%1 %0
    nop
    mfvsrd %0,%x1
-   mtvsrd %x0,%1"
+   mtvsrd %x0,%1
+   #"
   [(set_attr "type"
             "fpstore,     fpload,     fpsimple,   fpload,     fpstore,
              fpload,      fpstore,    veclogical, veclogical, integer,
              store,       load,       *,          mtjmpr,     mfjmpr,
-             *,           mfvsr,      mtvsr")
+             *,           mfvsr,      mtvsr,      vecperm")
    (set_attr "size" "64")
    (set_attr "isa"
             "*,           *,          *,          p9v,        p9v,
              p7v,         p7v,        *,          *,          *,
              *,           *,          *,          *,          *,
-             *,           p8v,        p8v")])
+             *,           p8v,        p8v,        p10")
+   (set_attr "prefixed"
+            "*,           *,          *,          *,          *,
+             *,           *,          *,          *,          *,
+             *,           *,          *,          *,          *,
+             *,           *,          *,          yes")])
 
 ;;           STD      LD       MR      MT<SPR> MF<SPR> G-const
 ;;           H-const  F-const  Special
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 38eaa36d6d8..0333099cc8a 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -643,3 +643,7 @@ Generate code that will run in privileged state.
 mxxspltiw
 Target Undocumented Mask(XXSPLTIW) Var(rs6000_isa_flags)
 Generate (do not generate) XXSPLTIW instructions.
+
+mxxspltidp
+Target Undocumented Mask(XXSPLTIDP) Var(rs6000_isa_flags)
+Generate (do not generate) XXSPLTIDP instructions.
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 8f2a37d74d5..ce98f78e02a 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -389,7 +389,7 @@
    UNSPEC_VDIVES
    UNSPEC_VDIVEU
    UNSPEC_XXEVAL
-   UNSPEC_XXSPLTID
+   UNSPEC_XXSPLTIDP
    UNSPEC_XXSPLTI32DX
    UNSPEC_XXBLEND
    UNSPEC_XXPERMX
@@ -6407,9 +6407,8 @@
 
 ;; XXSPLTIDP built-in function support
 (define_expand "xxspltidp_v2df"
-  [(set (match_operand:V2DF 0 "register_operand" )
-	(unspec:V2DF [(match_operand:SF 1 "const_double_operand")]
-		     UNSPEC_XXSPLTID))]
+  [(use (match_operand:V2DF 0 "register_operand" ))
+   (use (match_operand:SF 1 "const_double_operand"))]
  "TARGET_POWER10"
 {
   long value = rs6000_const_f32_to_i32 (operands[1]);
@@ -6417,15 +6416,6 @@
   DONE;
 })
 
-(define_insn "xxspltidp_v2df_inst"
-  [(set (match_operand:V2DF 0 "register_operand" "=wa")
-	(unspec:V2DF [(match_operand:SI 1 "c32bit_cint_operand" "n")]
-		     UNSPEC_XXSPLTID))]
-  "TARGET_POWER10"
-  "xxspltidp %x0,%1"
-  [(set_attr "type" "vecsimple")
-   (set_attr "prefixed" "yes")])
-
 ;; XXSPLTI32DX built-in function support
 (define_expand "xxsplti32dx_v4si"
   [(set (match_operand:V4SI 0 "register_operand" "=wa")
@@ -6671,3 +6661,39 @@
 {
   operands[2] = CONST_VECTOR_ELT (operands[1], 0);
 })
+
+;; Generate the XXSPLTIDP instruction to support SFmode and DFmode scalar
+;; constants and V2DF vector constants where both elements are the same.  The
+;; constant has be expressible as a SFmode constant that is not a SFmode
+;; denormal value.
+(define_mode_iterator XXSPLTIDP [SF DF V2DF])
+
+(define_insn_and_split "*xxspltidp_<mode>_internal1"
+  [(set (match_operand:XXSPLTIDP 0 "vsx_register_operand" "=wa")
+	(match_operand:XXSPLTIDP 1 "xxspltidp_operand"))]
+  "TARGET_XXSPLTIDP"
+  "#"
+  "&& 1"
+  [(set (match_operand:XXSPLTIDP 0 "vsx_register_operand")
+	(unspec:XXSPLTIDP [(match_dup 2)] UNSPEC_XXSPLTIDP))]
+{
+  HOST_WIDE_INT value = 0;
+
+  if (!xxspltidp_constant_p (operands[1], <MODE>mode, &value))
+    gcc_unreachable ();
+
+  operands[2] = GEN_INT (value);
+}
+ [(set_attr "type" "vecperm")
+  (set_attr "prefixed" "yes")])
+
+;; Just in case the user issued -mno-xxspltidp, allow the built-in function
+;; even if the compiler does not automatically generate XXSPLTIDP.
+(define_insn "xxspltidp_<mode>_internal2"
+  [(set (match_operand:XXSPLTIDP 0 "vsx_register_operand" "=wa")
+	(unspec:XXSPLTIDP [(match_operand 1 "const_int_operand" "n")]
+			  UNSPEC_XXSPLTIDP))]
+  "TARGET_POWER10"
+  "xxspltidp %x0,%1"
+ [(set_attr "type" "vecperm")
+  (set_attr "prefixed" "yes")])
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
new file mode 100644
index 00000000000..8f6e176f9af
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating DFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+   instruction.  */
+
+double
+scalar_double_0 (void)
+{
+  return 0.0;			/* XXSPLTIB or XXLXOR.  */
+}
+
+double
+scalar_double_1 (void)
+{
+  return 1.0;			/* XXSPLTIDP.  */
+}
+
+#ifndef __FAST_MATH__
+double
+scalar_double_m0 (void)
+{
+  return -0.0;			/* XXSPLTIDP.  */
+}
+
+double
+scalar_double_nan (void)
+{
+  return __builtin_nan ("");	/* XXSPLTIDP.  */
+}
+
+double
+scalar_double_inf (void)
+{
+  return __builtin_inf ();	/* XXSPLTIDP.  */
+}
+
+double
+scalar_double_m_inf (void)	/* XXSPLTIDP.  */
+{
+  return - __builtin_inf ();
+}
+#endif
+
+double
+scalar_double_pi (void)
+{
+  return M_PI;			/* PLFD.  */
+}
+
+double
+scalar_double_denorm (void)
+{
+  return 0x1p-149f;		/* PLFD.  */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
new file mode 100644
index 00000000000..72504bdfbbd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating SFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+   instruction.  */
+
+float
+scalar_float_0 (void)
+{
+  return 0.0f;			/* XXSPLTIB or XXLXOR.  */
+}
+
+float
+scalar_float_1 (void)
+{
+  return 1.0f;			/* XXSPLTIDP.  */
+}
+
+#ifndef __FAST_MATH__
+float
+scalar_float_m0 (void)
+{
+  return -0.0f;			/* XXSPLTIDP.  */
+}
+
+float
+scalar_float_nan (void)
+{
+  return __builtin_nanf ("");	/* XXSPLTIDP.  */
+}
+
+float
+scalar_float_inf (void)
+{
+  return __builtin_inff ();	/* XXSPLTIDP.  */
+}
+
+float
+scalar_float_m_inf (void)	/* XXSPLTIDP.  */
+{
+  return - __builtin_inff ();
+}
+#endif
+
+float
+scalar_float_pi (void)
+{
+  return (float)M_PI;		/* XXSPLTIDP.  */
+}
+
+float
+scalar_float_denorm (void)
+{
+  return 0x1p-149f;		/* PLFS.  */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 6 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
new file mode 100644
index 00000000000..d509459292c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
@@ -0,0 +1,64 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating V2DFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+   instruction.  */
+
+vector double
+v2df_double_0 (void)
+{
+  return (vector double) { 0.0, 0.0 };			/* XXSPLTIB or XXLXOR.  */
+}
+
+vector double
+v2df_double_1 (void)
+{
+  return (vector double) { 1.0, 1.0 };			/* XXSPLTIDP.  */
+}
+
+#ifndef __FAST_MATH__
+vector double
+v2df_double_m0 (void)
+{
+  return (vector double) { -0.0, -0.0 };		/* XXSPLTIDP.  */
+}
+
+vector double
+v2df_double_nan (void)
+{
+  return (vector double) { __builtin_nan (""),
+			   __builtin_nan ("") };	/* XXSPLTIDP.  */
+}
+
+vector double
+v2df_double_inf (void)
+{
+  return (vector double) { __builtin_inf (),
+			   __builtin_inf () };		/* XXSPLTIDP.  */
+}
+
+vector double
+v2df_double_m_inf (void)
+{
+  return (vector double) { - __builtin_inf (),
+			   - __builtin_inf () };	/* XXSPLTIDP.  */
+}
+#endif
+
+vector double
+v2df_double_pi (void)
+{
+  return (vector double) { M_PI, M_PI };		/* PLFD.  */
+}
+
+vector double
+v2df_double_denorm (void)
+{
+  return (vector double) { (double)0x1p-149f,
+			   (double)0x1p-149f };		/* PLFD.  */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */


^ permalink raw reply	[flat|nested] 2+ messages in thread

* [gcc(refs/users/meissner/heads/work056)] Generate XXSPLTIDP on power10.
@ 2021-06-18  4:42 Michael Meissner
  0 siblings, 0 replies; 2+ messages in thread
From: Michael Meissner @ 2021-06-18  4:42 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:008194b5eddf5edd86e743dabd955657757fd94f

commit 008194b5eddf5edd86e743dabd955657757fd94f
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Fri Jun 18 00:42:20 2021 -0400

    Generate XXSPLTIDP on power10.
    
    This patch implements XXSPLTIDP support for SF and DF scalar constants and
    V2DF vector constants.
    
    A new constraint (eF) is added to match constants that can be loaded with
    the XXSPLTIDP instruction.
    
    I have added a temporary switch (-mxxspltidp) to control whether or not the
    XXSPLTIDP instruction is generated.
    
    This patch provides a xxspltidp_constant_p function which decodes both
    VEC_DUPLICATE and VECTOR_CONST insns (similar to the existing
    xxspltib_constant_p function).
    
    The xxspltidp_constant_p function returns the appropriate integer that will be
    used in the XXSPLTIDP instruction.  Note, because SFmode denormal values
    are undefined in the hardware, the xxspltidp_constant_p function returns
    false for these values.  Also xxspltidp_constant_p returns false for 0.0
    because is cheaper to implement without XXSPLTIDP.
    
    I added 3 new tests to test loading up SF/DF scalar and V2DF vector
    constants.
    
    gcc/
    2021-06-18  Michael Meissner  <meissner@linux.ibm.com>
    
            * config/rs6000/constraints.md (eF): New constraint.
            * config/rs6000/predicates.md (easy_fp_constant): If we can load
            the scalar constant with XXSPLTIDP, the floating point constant is
            easy.
            (xxspltidp_operand): New predicate.
            (easy_vector_constant): If we can generate XXSPLTIDP, mark the
            vector constant as easy.
            * config/rs6000/rs6000-cpus.def (OTHER_POWER10_MASKS): Add
            -mxxspltidp support.
            (POWERPC_MASKS): Add -mxxspltidp support.
            * config/rs6000/rs6000-protos.h (xxspltidp_constant_p): New
            declaration.
            * config/rs6000/rs6000.c (rs6000_option_override_internal): Add
            -mxxspltidp support.
            (const_vector_element_all_same): New function.
            (xxspltidp_constant_p): New function.
            (output_vec_const_move): Add support for XXSPLTIDP.
            (rs6000_opt_masks): Add -mxxspltidp support.
            (rs6000_emit_xxspltidp_v2df): Change function to implement the
            XXSPLTIDP instruction.
            * config/rs6000/rs6000.md (movsf_hardfloat): Add XXSPLTIDP
            support.
            (mov<mode>_hardfloat32, FMOVE64 iterator): Add XXSPLTIDP support.
            (mov<mode>_hardfloat64, FMOVE64 iterator): Add XXSPLTIDP support.
            * config/rs6000/rs6000.opt (-mxxspltidp): New switch.
            * config/rs6000/vsx.md (UNSPEC_XXSPLTIDP): Rename UNSPEC_XXSPLTID
            to UNSPEC_XXSPLTIDP to match the instruction.
            (xxspltidp_v2df): Use 'use' for the expand arguments, instead of
            writing out an insn.
            (xxspltidp_v2df_inst): Delete.
            (XXSPLTIDP): New mode iterator.
            (xxspltidp_<mode>_internal1): New define_insn_and_split.
            (xxspltidp_<mode>_internal2): New define_insn.
    
    gcc/testsuite/
    2021-06-18  Michael Meissner  <meissner@linux.ibm.com>
    
            * gcc.target/powerpc/vec-splat-constant-sf.c: New test.
            * gcc.target/powerpc/vec-splat-constant-df.c: New test.
            * gcc.target/powerpc/vec-splat-constant-v2df.c: New test.

Diff:
---
 gcc/config/rs6000/constraints.md                   |   5 +
 gcc/config/rs6000/predicates.md                    |  21 +++++
 gcc/config/rs6000/rs6000-cpus.def                  |   2 +
 gcc/config/rs6000/rs6000-protos.h                  |   1 +
 gcc/config/rs6000/rs6000.c                         | 101 ++++++++++++++++++++-
 gcc/config/rs6000/rs6000.md                        |  52 +++++++----
 gcc/config/rs6000/rs6000.opt                       |   4 +
 gcc/config/rs6000/vsx.md                           |  52 ++++++++---
 .../gcc.target/powerpc/vec-splat-constant-df.c     |  60 ++++++++++++
 .../gcc.target/powerpc/vec-splat-constant-sf.c     |  60 ++++++++++++
 .../gcc.target/powerpc/vec-splat-constant-v2df.c   |  64 +++++++++++++
 11 files changed, 388 insertions(+), 34 deletions(-)

diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index 561ce9797af..e1fadd63580 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -208,6 +208,11 @@
   (and (match_code "const_int")
        (match_test "((- (unsigned HOST_WIDE_INT) ival) + 0x8000) < 0x10000")))
 
+;; SF/DF/V2DF scalar or vector constant that can be loaded with XXSPLTIDP
+(define_constraint "eF"
+  "A vector constant that can be loaded with the XXSPLTIDP instruction."
+  (match_operand 0 "xxspltidp_operand"))
+
 ;; 34-bit signed integer constant
 (define_constraint "eI"
   "A signed 34-bit integer constant if prefixed instructions are supported."
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index aa17ddc94e5..1831ff8b6d9 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -601,6 +601,11 @@
   if (TARGET_VSX && op == CONST0_RTX (mode))
     return 1;
 
+  /* If we have the ISA 3.1 XXSPLTIDP instruction, see if the constant can
+     be loaded with that instruction.  */
+  if (xxspltidp_operand (op, mode))
+    return 1;
+
   /* Otherwise consider floating point constants hard, so that the
      constant gets pushed to memory during the early RTL phases.  This
      has the advantage that double precision constants that can be
@@ -666,6 +671,19 @@
   return true;
 })
 
+;; Return 1 if operand is a SF/DF CONST_DOUBLE or V2DF CONST_VECTOR that can be
+;; loaded via the ISA 3.1 XXSPLTIDP instruction.  Do not return true if the
+;; value is 0.0, since that is easy to generate without using XXSPLTIDP.
+(define_predicate "xxspltidp_operand"
+  (match_code "const_double,const_vector,vec_duplicate")
+{
+  if (op == CONST0_RTX (mode))
+    return false;
+
+  HOST_WIDE_INT value = 0;
+  return xxspltidp_constant_p (op, mode, &value);
+})
+
 ;; Return 1 if the operand is a CONST_VECTOR and can be loaded into a
 ;; vector register without using memory.
 (define_predicate "easy_vector_constant"
@@ -682,6 +700,9 @@
       if (xxspltiw_operand (op, mode))
 	return true;
 
+      if (xxspltidp_operand (op, mode))
+	return true;
+
       if (TARGET_P9_VECTOR
           && xxspltib_constant_p (op, mode, &num_insns, &value))
 	return true;
diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
index e6c5891d334..fcdab4cea4a 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -89,6 +89,7 @@
 				 | OPTION_MASK_P10_FUSION_LOGADD 	\
 				 | OPTION_MASK_P10_FUSION_ADDLOG	\
 				 | OPTION_MASK_P10_FUSION_2ADD		\
+				 | OPTION_MASK_XXSPLTIDP		\
 				 | OPTION_MASK_XXSPLTIW)
 
 /* Flags that need to be turned off if -mno-power9-vector.  */
@@ -168,6 +169,7 @@
 				 | OPTION_MASK_SOFT_FLOAT		\
 				 | OPTION_MASK_STRICT_ALIGN_OPTIONAL	\
 				 | OPTION_MASK_VSX			\
+				 | OPTION_MASK_XXSPLTIDP		\
 				 | OPTION_MASK_XXSPLTIW)
 #endif
 
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 94bf961c6b7..c381ddaed37 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -32,6 +32,7 @@ extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, int, int, int,
 
 extern bool easy_altivec_constant (rtx, machine_mode);
 extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *);
+extern bool xxspltidp_constant_p (rtx, machine_mode, HOST_WIDE_INT *);
 extern int vspltis_shifted (rtx);
 extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int);
 extern bool macho_lo_sum_memory_operand (rtx, machine_mode);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 3761a448ab6..e9ec81df360 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4503,11 +4503,14 @@ rs6000_option_override_internal (bool global_init_p)
 
   if (TARGET_POWER10 && TARGET_VSX && TARGET_PREFIXED)
     {
+      if ((rs6000_isa_flags_explicit & OPTION_MASK_XXSPLTIDP) == 0)
+	rs6000_isa_flags |= OPTION_MASK_XXSPLTIDP;
+
       if ((rs6000_isa_flags_explicit & OPTION_MASK_XXSPLTIW) == 0)
 	rs6000_isa_flags |= OPTION_MASK_XXSPLTIW;
     }
   else
-    rs6000_isa_flags &= ~OPTION_MASK_XXSPLTIW;
+    rs6000_isa_flags &= ~(OPTION_MASK_XXSPLTIDP | OPTION_MASK_XXSPLTIW);
 
   if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET)
     rs6000_print_isa_options (stderr, 0, "after subtarget", rs6000_isa_flags);
@@ -6511,6 +6514,96 @@ xxspltib_constant_p (rtx op,
   return true;
 }
 
+/* Return the element of a constant vector whose elements are all the same.  In
+   addition if VEC_DUPLICATE is used, return the element being duplicated.  If
+   neither is true, return NULL_RTX.  */
+
+static rtx
+const_vector_element_all_same (rtx op)
+{
+  if (GET_CODE (op) == VEC_DUPLICATE)
+    {
+      rtx element = XEXP (op, 0);
+      return (CONST_INT_P (element) || CONST_DOUBLE_P (element)
+	       ? element
+	       : NULL_RTX);
+    }
+
+  else if (GET_CODE (op) == CONST_VECTOR)
+    {
+      machine_mode mode = GET_MODE (op);
+      size_t n_elts = GET_MODE_NUNITS (mode);
+      rtx element = CONST_VECTOR_ELT (op, 0);
+
+      for (size_t i = 1; i < n_elts; i++)
+	if (!rtx_equal_p (element, CONST_VECTOR_ELT (op, 1)))
+	  return NULL_RTX;
+
+      return element;
+    }
+
+  return NULL_RTX;
+}
+
+/* Return true if OP is of the given MODE and can be synthesized with ISA 3.1
+   XXSPLTIDP instruction.
+
+   Return the constant that is being split via CONSTANT_PTR to use in the
+   XXSPLTIDP instruction.  */
+
+bool
+xxspltidp_constant_p (rtx op,
+		      machine_mode mode,
+		      HOST_WIDE_INT *constant_ptr)
+{
+  *constant_ptr = 0;
+
+  if (!TARGET_XXSPLTIDP)
+    return false;
+
+  if (mode == VOIDmode)
+    mode = GET_MODE (op);
+
+  rtx element = op;
+  if (mode == V2DFmode)
+    {
+      element = const_vector_element_all_same (op);
+      if (!element)
+	return false;
+
+      mode = DFmode;
+    }
+
+  if (mode != SFmode && mode != DFmode)
+    return false;
+
+  if (GET_MODE (element) != mode)
+    return false;
+
+  if (!CONST_DOUBLE_P (element))
+    return false;
+
+  /* Don't return true for 0.0 since that is easy to create without
+     XXSPLTIDP.  */
+  if (element == CONST0_RTX (mode))
+    return false;
+
+  /* If the value doesn't fit in a SFmode, exactly, we can't use XXSPLTIDP.  */
+  const struct real_value *rv = CONST_DOUBLE_REAL_VALUE (element);
+  if (!exact_real_truncate (SFmode, rv))
+    return 0;
+
+  long value;
+  REAL_VALUE_TO_TARGET_SINGLE (*rv, value);
+
+  /* Test for SFmode denormal (exponent is 0, mantissa field is non-zero).  */
+  if (((value & 0x7F800000) == 0) && ((value & 0x7FFFFF) != 0))
+    return false;
+
+  *constant_ptr = value;
+  return true;
+}
+
 const char *
 output_vec_const_move (rtx *operands)
 {
@@ -6555,7 +6648,8 @@ output_vec_const_move (rtx *operands)
 	    gcc_unreachable ();
 	}
 
-      if (xxspltiw_operand (vec, mode))
+      if (xxspltiw_operand (vec, mode)
+	  || xxspltidp_operand (vec, mode))
 	return "#";
 
       if (TARGET_P9_VECTOR
@@ -24133,6 +24227,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
   { "update",			OPTION_MASK_NO_UPDATE,		true , true  },
   { "vsx",			OPTION_MASK_VSX,		false, true  },
   { "xxspltiw",			OPTION_MASK_XXSPLTIW,		false, true  },
+  { "xxspltidp",		OPTION_MASK_XXSPLTIDP,		false, true  },
 #ifdef OPTION_MASK_64BIT
 #if TARGET_AIX_OS
   { "aix64",			OPTION_MASK_64BIT,		false, false },
@@ -27972,7 +28067,7 @@ rs6000_emit_xxspltidp_v2df (rtx dst, long value)
     inform (input_location,
 	    "the result for the xxspltidp instruction "
 	    "is undefined for subnormal input values");
-  emit_insn( gen_xxspltidp_v2df_inst (dst, GEN_INT (value)));
+  emit_insn (gen_xxspltidp_v2df_internal2 (dst, GEN_INT (value)));
 }
 
 /* Implement TARGET_ASM_GENERATE_PIC_ADDR_DIFF_VEC.  */
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 44d82c605c7..ebdc95a006a 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -7689,17 +7689,17 @@
 ;;
 ;;	LWZ          LFS        LXSSP       LXSSPX     STFS       STXSSP
 ;;	STXSSPX      STW        XXLXOR      LI         FMR        XSCPSGNDP
-;;	MR           MT<x>      MF<x>       NOP
+;;	MR           MT<x>      MF<x>       NOP        XXSPLTIDP
 
 (define_insn "movsf_hardfloat"
   [(set (match_operand:SF 0 "nonimmediate_operand"
 	 "=!r,       f,         v,          wa,        m,         wY,
 	  Z,         m,         wa,         !r,        f,         wa,
-	  !r,        *c*l,      !r,         *h")
+	  !r,        *c*l,      !r,         *h,        wa")
 	(match_operand:SF 1 "input_operand"
 	 "m,         m,         wY,         Z,         f,         v,
 	  wa,        r,         j,          j,         f,         wa,
-	  r,         r,         *h,         0"))]
+	  r,         r,         *h,         0,         eF"))]
   "(register_operand (operands[0], SFmode)
    || register_operand (operands[1], SFmode))
    && TARGET_HARD_FLOAT
@@ -7721,15 +7721,20 @@
    mr %0,%1
    mt%0 %1
    mf%1 %0
-   nop"
+   nop
+   #"
   [(set_attr "type"
 	"load,       fpload,    fpload,     fpload,    fpstore,   fpstore,
 	 fpstore,    store,     veclogical, integer,   fpsimple,  fpsimple,
-	 *,          mtjmpr,    mfjmpr,     *")
+	 *,          mtjmpr,    mfjmpr,     *,         vecperm")
    (set_attr "isa"
 	"*,          *,         p9v,        p8v,       *,         p9v,
 	 p8v,        *,         *,          *,         *,         *,
-	 *,          *,         *,          *")])
+	 *,          *,         *,          *,         p10")
+   (set_attr "prefixed"
+	"*,          *,         *,          *,         *,         *,
+	 *,          *,         *,          *,         *,         *,
+	 *,          *,         *,          *,         yes")])
 
 ;;	LWZ          LFIWZX     STW        STFIWX     MTVSRWZ    MFVSRWZ
 ;;	FMR          MR         MT%0       MF%1       NOP
@@ -7989,18 +7994,18 @@
 
 ;;           STFD         LFD         FMR         LXSD        STXSD
 ;;           LXSD         STXSD       XXLOR       XXLXOR      GPR<-0
-;;           LWZ          STW         MR
+;;           LWZ          STW         MR          XXSPLTIDP
 
 
 (define_insn "*mov<mode>_hardfloat32"
   [(set (match_operand:FMOVE64 0 "nonimmediate_operand"
             "=m,          d,          d,          <f64_p9>,   wY,
               <f64_av>,   Z,          <f64_vsx>,  <f64_vsx>,  !r,
-              Y,          r,          !r")
+              Y,          r,          !r,         wa")
 	(match_operand:FMOVE64 1 "input_operand"
              "d,          m,          d,          wY,         <f64_p9>,
               Z,          <f64_av>,   <f64_vsx>,  <zero_fp>,  <zero_fp>,
-              r,          Y,          r"))]
+              r,          Y,          r,          eF"))]
   "! TARGET_POWERPC64 && TARGET_HARD_FLOAT
    && (gpc_reg_operand (operands[0], <MODE>mode)
        || gpc_reg_operand (operands[1], <MODE>mode))"
@@ -8017,20 +8022,25 @@
    #
    #
    #
+   #
    #"
   [(set_attr "type"
             "fpstore,     fpload,     fpsimple,   fpload,     fpstore,
              fpload,      fpstore,    veclogical, veclogical, two,
-             store,       load,       two")
+             store,       load,       two,        vecperm")
    (set_attr "size" "64")
    (set_attr "length"
             "*,           *,          *,          *,          *,
              *,           *,          *,          *,          8,
-             8,           8,          8")
+             8,           8,          8,          *")
    (set_attr "isa"
             "*,           *,          *,          p9v,        p9v,
              p7v,         p7v,        *,          *,          *,
-             *,           *,          *")])
+             *,           *,          *,          p10")
+   (set_attr "prefixed"
+            "*,           *,          *,          *,          *,
+             *,           *,          *,          *,          *,
+             *,           *,          *,          yes")])
 
 ;;           STW      LWZ     MR      G-const H-const F-const
 
@@ -8057,19 +8067,19 @@
 ;;           STFD         LFD         FMR         LXSD        STXSD
 ;;           LXSDX        STXSDX      XXLOR       XXLXOR      LI 0
 ;;           STD          LD          MR          MT{CTR,LR}  MF{CTR,LR}
-;;           NOP          MFVSRD      MTVSRD
+;;           NOP          MFVSRD      MTVSRD      XXSPLTIDP
 
 (define_insn "*mov<mode>_hardfloat64"
   [(set (match_operand:FMOVE64 0 "nonimmediate_operand"
            "=m,           d,          d,          <f64_p9>,   wY,
              <f64_av>,    Z,          <f64_vsx>,  <f64_vsx>,  !r,
              YZ,          r,          !r,         *c*l,       !r,
-            *h,           r,          <f64_dm>")
+            *h,           r,          <f64_dm>,   wa")
 	(match_operand:FMOVE64 1 "input_operand"
             "d,           m,          d,          wY,         <f64_p9>,
              Z,           <f64_av>,   <f64_vsx>,  <zero_fp>,  <zero_fp>,
              r,           YZ,         r,          r,          *h,
-             0,           <f64_dm>,   r"))]
+             0,           <f64_dm>,   r,          eF"))]
   "TARGET_POWERPC64 && TARGET_HARD_FLOAT
    && (gpc_reg_operand (operands[0], <MODE>mode)
        || gpc_reg_operand (operands[1], <MODE>mode))"
@@ -8091,18 +8101,24 @@
    mf%1 %0
    nop
    mfvsrd %0,%x1
-   mtvsrd %x0,%1"
+   mtvsrd %x0,%1
+   #"
   [(set_attr "type"
             "fpstore,     fpload,     fpsimple,   fpload,     fpstore,
              fpload,      fpstore,    veclogical, veclogical, integer,
              store,       load,       *,          mtjmpr,     mfjmpr,
-             *,           mfvsr,      mtvsr")
+             *,           mfvsr,      mtvsr,      vecperm")
    (set_attr "size" "64")
    (set_attr "isa"
             "*,           *,          *,          p9v,        p9v,
              p7v,         p7v,        *,          *,          *,
              *,           *,          *,          *,          *,
-             *,           p8v,        p8v")])
+             *,           p8v,        p8v,        p10")
+   (set_attr "prefixed"
+            "*,           *,          *,          *,          *,
+             *,           *,          *,          *,          *,
+             *,           *,          *,          *,          *,
+             *,           *,          *,          yes")])
 
 ;;           STD      LD       MR      MT<SPR> MF<SPR> G-const
 ;;           H-const  F-const  Special
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 38eaa36d6d8..0333099cc8a 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -643,3 +643,7 @@ Generate code that will run in privileged state.
 mxxspltiw
 Target Undocumented Mask(XXSPLTIW) Var(rs6000_isa_flags)
 Generate (do not generate) XXSPLTIW instructions.
+
+mxxspltidp
+Target Undocumented Mask(XXSPLTIDP) Var(rs6000_isa_flags)
+Generate (do not generate) XXSPLTIDP instructions.
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 8f2a37d74d5..ce98f78e02a 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -389,7 +389,7 @@
    UNSPEC_VDIVES
    UNSPEC_VDIVEU
    UNSPEC_XXEVAL
-   UNSPEC_XXSPLTID
+   UNSPEC_XXSPLTIDP
    UNSPEC_XXSPLTI32DX
    UNSPEC_XXBLEND
    UNSPEC_XXPERMX
@@ -6407,9 +6407,8 @@
 
 ;; XXSPLTIDP built-in function support
 (define_expand "xxspltidp_v2df"
-  [(set (match_operand:V2DF 0 "register_operand" )
-	(unspec:V2DF [(match_operand:SF 1 "const_double_operand")]
-		     UNSPEC_XXSPLTID))]
+  [(use (match_operand:V2DF 0 "register_operand" ))
+   (use (match_operand:SF 1 "const_double_operand"))]
  "TARGET_POWER10"
 {
   long value = rs6000_const_f32_to_i32 (operands[1]);
@@ -6417,15 +6416,6 @@
   DONE;
 })
 
-(define_insn "xxspltidp_v2df_inst"
-  [(set (match_operand:V2DF 0 "register_operand" "=wa")
-	(unspec:V2DF [(match_operand:SI 1 "c32bit_cint_operand" "n")]
-		     UNSPEC_XXSPLTID))]
-  "TARGET_POWER10"
-  "xxspltidp %x0,%1"
-  [(set_attr "type" "vecsimple")
-   (set_attr "prefixed" "yes")])
-
 ;; XXSPLTI32DX built-in function support
 (define_expand "xxsplti32dx_v4si"
   [(set (match_operand:V4SI 0 "register_operand" "=wa")
@@ -6671,3 +6661,39 @@
 {
   operands[2] = CONST_VECTOR_ELT (operands[1], 0);
 })
+
+;; Generate the XXSPLTIDP instruction to support SFmode and DFmode scalar
+;; constants and V2DF vector constants where both elements are the same.  The
+;; constant has be expressible as a SFmode constant that is not a SFmode
+;; denormal value.
+(define_mode_iterator XXSPLTIDP [SF DF V2DF])
+
+(define_insn_and_split "*xxspltidp_<mode>_internal1"
+  [(set (match_operand:XXSPLTIDP 0 "vsx_register_operand" "=wa")
+	(match_operand:XXSPLTIDP 1 "xxspltidp_operand"))]
+  "TARGET_XXSPLTIDP"
+  "#"
+  "&& 1"
+  [(set (match_operand:XXSPLTIDP 0 "vsx_register_operand")
+	(unspec:XXSPLTIDP [(match_dup 2)] UNSPEC_XXSPLTIDP))]
+{
+  HOST_WIDE_INT value = 0;
+
+  if (!xxspltidp_constant_p (operands[1], <MODE>mode, &value))
+    gcc_unreachable ();
+
+  operands[2] = GEN_INT (value);
+}
+ [(set_attr "type" "vecperm")
+  (set_attr "prefixed" "yes")])
+
+;; Just in case the user issued -mno-xxspltidp, allow the built-in function
+;; even if the compiler does not automatically generate XXSPLTIDP.
+(define_insn "xxspltidp_<mode>_internal2"
+  [(set (match_operand:XXSPLTIDP 0 "vsx_register_operand" "=wa")
+	(unspec:XXSPLTIDP [(match_operand 1 "const_int_operand" "n")]
+			  UNSPEC_XXSPLTIDP))]
+  "TARGET_POWER10"
+  "xxspltidp %x0,%1"
+ [(set_attr "type" "vecperm")
+  (set_attr "prefixed" "yes")])
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
new file mode 100644
index 00000000000..8f6e176f9af
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating DFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+   instruction.  */
+
+double
+scalar_double_0 (void)
+{
+  return 0.0;			/* XXSPLTIB or XXLXOR.  */
+}
+
+double
+scalar_double_1 (void)
+{
+  return 1.0;			/* XXSPLTIDP.  */
+}
+
+#ifndef __FAST_MATH__
+double
+scalar_double_m0 (void)
+{
+  return -0.0;			/* XXSPLTIDP.  */
+}
+
+double
+scalar_double_nan (void)
+{
+  return __builtin_nan ("");	/* XXSPLTIDP.  */
+}
+
+double
+scalar_double_inf (void)
+{
+  return __builtin_inf ();	/* XXSPLTIDP.  */
+}
+
+double
+scalar_double_m_inf (void)	/* XXSPLTIDP.  */
+{
+  return - __builtin_inf ();
+}
+#endif
+
+double
+scalar_double_pi (void)
+{
+  return M_PI;			/* PLFD.  */
+}
+
+double
+scalar_double_denorm (void)
+{
+  return 0x1p-149f;		/* PLFD.  */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
new file mode 100644
index 00000000000..72504bdfbbd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating SFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+   instruction.  */
+
+float
+scalar_float_0 (void)
+{
+  return 0.0f;			/* XXSPLTIB or XXLXOR.  */
+}
+
+float
+scalar_float_1 (void)
+{
+  return 1.0f;			/* XXSPLTIDP.  */
+}
+
+#ifndef __FAST_MATH__
+float
+scalar_float_m0 (void)
+{
+  return -0.0f;			/* XXSPLTIDP.  */
+}
+
+float
+scalar_float_nan (void)
+{
+  return __builtin_nanf ("");	/* XXSPLTIDP.  */
+}
+
+float
+scalar_float_inf (void)
+{
+  return __builtin_inff ();	/* XXSPLTIDP.  */
+}
+
+float
+scalar_float_m_inf (void)	/* XXSPLTIDP.  */
+{
+  return - __builtin_inff ();
+}
+#endif
+
+float
+scalar_float_pi (void)
+{
+  return (float)M_PI;		/* XXSPLTIDP.  */
+}
+
+float
+scalar_float_denorm (void)
+{
+  return 0x1p-149f;		/* PLFS.  */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 6 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
new file mode 100644
index 00000000000..d509459292c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
@@ -0,0 +1,64 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating V2DFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+   instruction.  */
+
+vector double
+v2df_double_0 (void)
+{
+  return (vector double) { 0.0, 0.0 };			/* XXSPLTIB or XXLXOR.  */
+}
+
+vector double
+v2df_double_1 (void)
+{
+  return (vector double) { 1.0, 1.0 };			/* XXSPLTIDP.  */
+}
+
+#ifndef __FAST_MATH__
+vector double
+v2df_double_m0 (void)
+{
+  return (vector double) { -0.0, -0.0 };		/* XXSPLTIDP.  */
+}
+
+vector double
+v2df_double_nan (void)
+{
+  return (vector double) { __builtin_nan (""),
+			   __builtin_nan ("") };	/* XXSPLTIDP.  */
+}
+
+vector double
+v2df_double_inf (void)
+{
+  return (vector double) { __builtin_inf (),
+			   __builtin_inf () };		/* XXSPLTIDP.  */
+}
+
+vector double
+v2df_double_m_inf (void)
+{
+  return (vector double) { - __builtin_inf (),
+			   - __builtin_inf () };	/* XXSPLTIDP.  */
+}
+#endif
+
+vector double
+v2df_double_pi (void)
+{
+  return (vector double) { M_PI, M_PI };		/* PLFD.  */
+}
+
+vector double
+v2df_double_denorm (void)
+{
+  return (vector double) { (double)0x1p-149f,
+			   (double)0x1p-149f };		/* PLFD.  */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */


^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2021-06-18  4:42 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-18  4:37 [gcc(refs/users/meissner/heads/work056)] Generate XXSPLTIDP on power10 Michael Meissner
2021-06-18  4:42 Michael Meissner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).