[gcc(refs/users/meissner/heads/work046)] Use XXSPLTI32DX to generate some constants.

public inbox for gcc-cvs@sourceware.org
help / color / mirror / Atom feed

From: Michael Meissner <meissner@gcc.gnu.org>
To: gcc-cvs@gcc.gnu.org
Subject: [gcc(refs/users/meissner/heads/work046)] Use XXSPLTI32DX to generate some constants.
Date: Thu,  8 Apr 2021 20:08:09 +0000 (GMT)	[thread overview]
Message-ID: <20210408200809.E99AE39551C5@sourceware.org> (raw)

https://gcc.gnu.org/g:5d6e7dd0d1d1bbcf9cf9a52c54640825438ecc95

commit 5d6e7dd0d1d1bbcf9cf9a52c54640825438ecc95
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Thu Apr 8 16:07:50 2021 -0400

    Use XXSPLTI32DX to generate some constants.
    
    This patch generates a pair of XXSPLTI32DX instructions to load 64-bit
    scalar or 128-bit vector constants into the vector registers.
    
    I added a new constraint (eD) for constants that can be loaded with
    XXSPLTI32DX, but cannot be loaded with the XXSPLTIDP or XXSPLTIW
    instructions.
    
    I added a debug switch (-mxxsplti32dx) to control whether this behavior is
    on or off.
    
    gcc/
    2021-04-08  Michael Meissner  <meissner@linux.ibm.com>
    
            * config/rs6000/contraints.md (eD): New constraint.
            * config/rs6000/predicates.md (easy_fp_constant): If we can load
            the constant with a pair of XXSPLTI32DX instructions, it is easy.
            (xxsplti32dx_operand): New predicate.
            (easy_vector_constant): If we can load the constant with a pair of
            XXSPLTI32DX instructions, it is easy.
            * config/rs6000/rs6000-cpus.def (OTHER_POWER10_MASKS): Add
            -mxxsplti32dx.
            (POWERPC_MASKS): Add -mxxsplti32dx.
            * config/rs6000/rs6000-protos.h (xxsplti32dx_constant_p): New
            declaration.
            * config/rs6000/rs6000.c (rs6000_option_override_internal): Add
            -mxxsplti32dx support.
            (xxsplti32dx_constant_p): New helper function.
            (output_vec_const_move): Split constants that need XXSPLTI32DX.
            (rs6000_opt_masks): Add -mxxsplti32dx.
            * config/rs6000/rs6000.md (movsf_hardfloat): Add support for
            loading constants with XXSPLTI32DX.
            (mov<mode>_hardfloat32, FMOVE64 iterator): Add support for loading
            constants with XXSPLTI32DX.
            (mov<mode>_hardfloat64, FMOVE64 iterator): Add support for loading
            constants with XXSPLTI32DX.
            * config/rs6000/rs6000.opt (-mxxsplti32dx): New switch.
            * config/rs6000/vsx.md (UNSPEC_XXSPLTI32DX_CONST): New unspec.
            (vsx_mov<mode>_64bit): Add support for loading constants with
            XXSPLTI32DX.
            (vsx_mov<mode>_32bit): Add support for loading constants with
            XXSPLTI32DX.
            (XXSPLTI32DX): New mode iterator.
            (xxsplti32dx_<mode>): New insn and splits.
            (xxsplti32dx_<mode>_first): New insns.
            (xxsplti32dx_<mode>_second): New insns.
    
    gcc/testsuite/
    2021-04-08  Michael Meissner  <meissner@linux.ibm.com>
    
            * gcc.target/powerpc/vec-splati-runnable.c: Update insn count.
            * gcc.target/powerpc/vec-splat-constant-sf.c: Update insn count.
            * gcc.target/powerpc/vec-splat-constant-df.c: Update insn count.
            * gcc.target/powerpc/vec-splat-constant-v2df.c: Update insn
            count.

Diff:
---
 gcc/config/rs6000/constraints.md                   |   6 +
 gcc/config/rs6000/predicates.md                    |  22 ++++
 gcc/config/rs6000/rs6000-cpus.def                  |   2 +
 gcc/config/rs6000/rs6000-protos.h                  |   1 +
 gcc/config/rs6000/rs6000.c                         |  94 ++++++++++++++++
 gcc/config/rs6000/rs6000.md                        |  44 +++++---
 gcc/config/rs6000/rs6000.opt                       |   4 +
 gcc/config/rs6000/vsx.md                           | 121 ++++++++++++++++++---
 .../gcc.target/powerpc/vec-splat-constant-df.c     |   9 +-
 .../gcc.target/powerpc/vec-splat-constant-sf.c     |   5 +-
 .../gcc.target/powerpc/vec-splat-constant-v2df.c   |  10 +-
 .../gcc.target/powerpc/vec-splati-runnable.c       |   2 +-
 12 files changed, 283 insertions(+), 37 deletions(-)

diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index 70b1eb01770..c7137337f4c 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -208,6 +208,12 @@
   (and (match_code "const_int")
        (match_test "((- (unsigned HOST_WIDE_INT) ival) + 0x8000) < 0x10000")))
 
+;; SF/DF/V2DF/DI/V2DI scalar or vector constant that can be loaded with a pair
+;; of XXSPLTI32DX instructions.
+(define_constraint "eD"
+  "A vector constant that can be loaded with XXSPLTI32DX instructions."
+  (match_operand 0 "xxsplti32dx_operand"))
+
 ;; SF/DF/V2DF scalar or vector constant that can be loaded with XXSPLTIDP
 (define_constraint "eF"
   "A vector constant that can be loaded with the XXSPLTIDP instruction."
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 48d9c5509a2..281c6e835b9 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -606,6 +606,11 @@
   if (xxspltidp_operand (op, mode))
     return 1;
 
+  /* If we have the ISA 3.1 XXSPLTI32DX instruction, see if the constant can
+     be loaded with a pair of those instructions.  */
+  if (xxsplti32dx_operand (op, mode))
+    return 1;
+
   /* Otherwise consider floating point constants hard, so that the
      constant gets pushed to memory during the early RTL phases.  This
      has the advantage that double precision constants that can be
@@ -677,6 +682,20 @@
   return xxspltidp_constant_p (op, mode, &value);
 })
 
+;; Return 1 if operand is a SF/DF CONST_DOUBLE or V2DF CONST_VECTOR that can be
+;; loaded via a pair f ISA 3.1 XXSPLTI32DX instructions.  Do not return true if
+;; the value is 0.0 or it can be loaded with XXSPLTIDP, since that is easy to
+;; generate without using XXSPLTI32DX.
+(define_predicate "xxsplti32dx_operand"
+  (match_code "const_double,const_int,const_vector,vec_duplicate")
+{
+  if (op == CONST0_RTX (mode))
+    return false;
+
+  HOST_WIDE_INT value = 0;
+  return xxsplti32dx_constant_p (op, mode, &value);
+})
+
 ;; Return 1 if the operand is a CONST_VECTOR and can be loaded into a
 ;; vector register without using memory.
 (define_predicate "easy_vector_constant"
@@ -696,6 +715,9 @@
       if (xxspltidp_operand (op, mode))
 	return true;
 
+      if (xxsplti32dx_operand (op, mode))
+	return true;
+
       if (TARGET_P9_VECTOR
           && xxspltib_constant_p (op, mode, &num_insns, &value))
 	return true;
diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
index cf4044831f7..5a14191cc6c 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -79,6 +79,7 @@
 				 | OPTION_MASK_PCREL			\
 				 | OPTION_MASK_PCREL_OPT		\
 				 | OPTION_MASK_PREFIXED			\
+				 | OPTION_MASK_XXSPLTI32DX		\
 				 | OPTION_MASK_XXSPLTIDP		\
 				 | OPTION_MASK_XXSPLTIW)
 
@@ -163,6 +164,7 @@
 				 | OPTION_MASK_SOFT_FLOAT		\
 				 | OPTION_MASK_STRICT_ALIGN_OPTIONAL	\
 				 | OPTION_MASK_VSX			\
+				 | OPTION_MASK_XXSPLTI32DX		\
 				 | OPTION_MASK_XXSPLTIDP		\
 				 | OPTION_MASK_XXSPLTIW)
 
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 0fe1c176236..12bf60b043d 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -34,6 +34,7 @@ extern bool easy_altivec_constant (rtx, machine_mode);
 extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *);
 extern bool xxspltiw_constant_p (rtx, machine_mode, HOST_WIDE_INT *);
 extern bool xxspltidp_constant_p (rtx, machine_mode, HOST_WIDE_INT *);
+extern bool xxsplti32dx_constant_p (rtx, machine_mode, HOST_WIDE_INT *);
 extern int vspltis_shifted (rtx);
 extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int);
 extern bool macho_lo_sum_memory_operand (rtx, machine_mode);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 08a853f2e8f..659eb301c87 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4484,6 +4484,10 @@ rs6000_option_override_internal (bool global_init_p)
       && (rs6000_isa_flags_explicit & OPTION_MASK_XXSPLTIDP) == 0)
     rs6000_isa_flags |= OPTION_MASK_XXSPLTIDP;
 
+  if (TARGET_POWER10 && TARGET_VSX
+      && (rs6000_isa_flags_explicit & OPTION_MASK_XXSPLTI32DX) == 0)
+    rs6000_isa_flags |= OPTION_MASK_XXSPLTI32DX;
+
   if (!TARGET_PCREL && TARGET_PCREL_OPT)
     rs6000_isa_flags &= ~OPTION_MASK_PCREL_OPT;
 
@@ -6610,6 +6614,92 @@ xxspltidp_constant_p (rtx op,
   return true;
 }
 
+/* Return true if OP is of the given MODE and can be synthesized with ISA 3.1
+   XXSPLTI32DX instruction.  If the instruction can be synthesized with
+   XXSPLTIDP or is 0/-1, return false;
+
+   Return the 64-bit constant to use in the two XXSPLTI32DX instructions via
+   CONSTANT_PTR.  */
+
+bool
+xxsplti32dx_constant_p (rtx op,
+			machine_mode mode,
+			HOST_WIDE_INT *constant_ptr)
+{
+  *constant_ptr = 0;
+
+  if (!TARGET_XXSPLTI32DX)
+    return false;
+
+  if (mode == VOIDmode)
+    mode = GET_MODE (op);
+
+  if (op == CONST0_RTX (mode))
+    return false;
+
+  rtx element = op;
+  if (mode == V2DFmode || mode == V2DImode)
+    {
+      /* Handle VEC_DUPLICATE and CONST_VECTOR.  */
+      if (GET_CODE (op) == VEC_DUPLICATE)
+	element = XEXP (op, 0);
+
+      else if (GET_CODE (op) == CONST_VECTOR)
+       {
+         element = CONST_VECTOR_ELT (op, 0);
+         if (!rtx_equal_p (element, CONST_VECTOR_ELT (op, 1)))
+           return false;
+       }
+
+      else
+       return false;
+
+      mode = GET_MODE_INNER (mode);
+    }
+
+  if (GET_MODE (element) != mode)
+    return false;
+
+  /* Handle floating point constants.  */
+  if (mode == SFmode || mode == DFmode)
+    {
+      HOST_WIDE_INT xxspltidp_value = 0;
+
+      if (!CONST_DOUBLE_P (element))
+	return false;
+
+      if (xxspltidp_constant_p (element, mode, &xxspltidp_value))
+	return false;
+
+      long high_low[2];
+      const struct real_value *rv = CONST_DOUBLE_REAL_VALUE (element);
+      REAL_VALUE_TO_TARGET_DOUBLE (*rv, high_low);
+
+      if (!BYTES_BIG_ENDIAN)
+	std::swap (high_low[0], high_low[1]);
+
+      *constant_ptr = (high_low[0] << 32) | (high_low[1] & 0xffffffff);
+      return true;
+    }
+
+  /* Handle integer constants.  */
+  else if (mode == DImode)
+    {
+      if (!CONST_INT_P (element))
+	return false;
+
+      HOST_WIDE_INT value = INTVAL (element);
+      if (value == -1)
+	return false;
+
+      *constant_ptr = value;
+      return true;
+    }
+
+  else
+    return false;
+}
+
 const char *
 output_vec_const_move (rtx *operands)
 {
@@ -6683,6 +6773,9 @@ output_vec_const_move (rtx *operands)
 	  return "xxspltidp %x0,%2";
 	}
 
+      if (xxsplti32dx_operand (vec, mode))
+	return "#";
+
       if (TARGET_P9_VECTOR
 	  && xxspltib_constant_p (vec, mode, &num_insns, &xxspltib_value))
 	{
@@ -24182,6 +24275,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
   { "vsx",			OPTION_MASK_VSX,		false, true  },
   { "xxspltiw",			OPTION_MASK_XXSPLTIW,		false, true  },
   { "xxspltidp",		OPTION_MASK_XXSPLTIDP,		false, true  },
+  { "xxsplti32dx",		OPTION_MASK_XXSPLTI32DX,	false, true  },
 #ifdef OPTION_MASK_64BIT
 #if TARGET_AIX_OS
   { "aix64",			OPTION_MASK_64BIT,		false, false },
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 5569e0591e6..b5886d3ccf4 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -7564,17 +7564,17 @@
 ;;
 ;;	LWZ          LFS        LXSSP       LXSSPX     STFS       STXSSP
 ;;	STXSSPX      STW        XXLXOR      LI         FMR        XSCPSGNDP
-;;	MR           MT<x>      MF<x>       NOP        XXSPLTIDP
+;;	MR           MT<x>      MF<x>       NOP        XXSPLTIDP  XXSPLTI32DX
 
 (define_insn "movsf_hardfloat"
   [(set (match_operand:SF 0 "nonimmediate_operand"
 	 "=!r,       f,         v,          wa,        m,         wY,
 	  Z,         m,         wa,         !r,        f,         wa,
-	  !r,        *c*l,      !r,         *h,        wa")
+	  !r,        *c*l,      !r,         *h,        wa,        wa")
 	(match_operand:SF 1 "input_operand"
 	 "m,         m,         wY,         Z,         f,         v,
 	  wa,        r,         j,          j,         f,         wa,
-	  r,         r,         *h,         0,         eF"))]
+	  r,         r,         *h,         0,         eF,        eD"))]
   "(register_operand (operands[0], SFmode)
    || register_operand (operands[1], SFmode))
    && TARGET_HARD_FLOAT
@@ -7597,19 +7597,28 @@
    mt%0 %1
    mf%1 %0
    nop
+   #
    #"
   [(set_attr "type"
 	"load,       fpload,    fpload,     fpload,    fpstore,   fpstore,
 	 fpstore,    store,     veclogical, integer,   fpsimple,  fpsimple,
-	 *,          mtjmpr,    mfjmpr,     *,         vecperm")
+	 *,          mtjmpr,    mfjmpr,     *,         vecperm,   vecperm")
    (set_attr "isa"
 	"*,          *,         p9v,        p8v,       *,         p9v,
 	 p8v,        *,         *,          *,         *,         *,
-	 *,          *,         *,          *,         p10")
+	 *,          *,         *,          *,         p10,       p10")
    (set_attr "prefixed"
 	"*,          *,         *,          *,         *,         *,
 	 *,          *,         *,          *,         *,         *,
-	 *,          *,         *,          *,         yes")])
+	 *,          *,         *,          *,         yes,       yes")
+   (set_attr "max_prefixed_insns"
+	"*,          *,         *,          *,         *,         *,
+	 *,          *,         *,          *,         *,         *,
+	 *,          *,         *,          *,         *,         2")
+   (set_attr "num_insns"
+	"*,          *,         *,          *,         *,         *,
+	 *,          *,         *,          *,         *,         *,
+	 *,          *,         *,          *,         *,         2")])
 
 ;;	LWZ          LFIWZX     STW        STFIWX     MTVSRWZ    MFVSRWZ
 ;;	FMR          MR         MT%0       MF%1       NOP
@@ -7869,18 +7878,18 @@
 
 ;;           STFD         LFD         FMR         LXSD        STXSD
 ;;           LXSD         STXSD       XXLOR       XXLXOR      GPR<-0
-;;           LWZ          STW         MR          XXSPLTIDP
+;;           LWZ          STW         MR          XXSPLTIDP   XXSPLTI32DX
 
 
 (define_insn "*mov<mode>_hardfloat32"
   [(set (match_operand:FMOVE64 0 "nonimmediate_operand"
             "=m,          d,          d,          <f64_p9>,   wY,
               <f64_av>,   Z,          <f64_vsx>,  <f64_vsx>,  !r,
-              Y,          r,          !r,         wa")
+              Y,          r,          !r,         wa,         wa")
 	(match_operand:FMOVE64 1 "input_operand"
              "d,          m,          d,          wY,         <f64_p9>,
               Z,          <f64_av>,   <f64_vsx>,  <zero_fp>,  <zero_fp>,
-              r,          Y,          r,          eF"))]
+              r,          Y,          r,          eF,         eD"))]
   "! TARGET_POWERPC64 && TARGET_HARD_FLOAT
    && (gpc_reg_operand (operands[0], <MODE>mode)
        || gpc_reg_operand (operands[1], <MODE>mode))"
@@ -7898,24 +7907,33 @@
    #
    #
    #
+   #
    #"
   [(set_attr "type"
             "fpstore,     fpload,     fpsimple,   fpload,     fpstore,
              fpload,      fpstore,    veclogical, veclogical, two,
-             store,       load,       two,        vecperm")
+             store,       load,       two,        vecperm,    vecperm")
    (set_attr "size" "64")
    (set_attr "length"
             "*,           *,          *,          *,          *,
              *,           *,          *,          *,          8,
-             8,           8,          8,          *")
+             8,           8,          8,          *,          *")
    (set_attr "isa"
             "*,           *,          *,          p9v,        p9v,
              p7v,         p7v,        *,          *,          *,
-             *,           *,          *,          p10")
+             *,           *,          *,          p10,        p10")
    (set_attr "prefixed"
             "*,           *,          *,          *,          *,
              *,           *,          *,          *,          *,
-             *,           *,          *,          yes")])
+             *,           *,          *,          yes,        yes")
+   (set_attr "max_prefixed_insns"
+            "*,           *,          *,          *,          *,
+             *,           *,          *,          *,          *,
+             *,           *,          *,          *,          2")
+   (set_attr "num_insns"
+            "*,           *,          *,          *,          *,
+             *,           *,          *,          *,          *,
+             *,           *,          *,          *,          2")])
 
 ;;           STW      LWZ     MR      G-const H-const F-const
 
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 6620cdb7716..bd269369ca0 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -627,3 +627,7 @@ Generate (do not generate) the XXSPLTIW instruction.
 mxxspltidp
 Target Undocumented Mask(XXSPLTIDP) Var(rs6000_isa_flags)
 Generate (do not generate) the XXSPLTIDP instruction.
+
+mxxsplti32dx
+Target Undocumented Mask(XXSPLTI32DX) Var(rs6000_isa_flags)
+Generate (do not generate) the XXSPLTI32DX instruction.
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 4b0307d447e..ec2c148fb4d 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -372,6 +372,7 @@
    UNSPEC_XXSPLTIW
    UNSPEC_XXSPLTIDP
    UNSPEC_XXSPLTI32DX
+   UNSPEC_XXSPLTI32DX_CONST
    UNSPEC_XXPERMX
    UNSPEC_XXEVAL
   ])
@@ -1173,16 +1174,19 @@
 ;;              VSX store  VSX load   VSX move  VSX->GPR   GPR->VSX    LQ (GPR)
 ;;              STQ (GPR)  GPR load   GPR store GPR move   XXSPLTIB    VSPLTISW
 ;;              VSX 0/-1   VMX const  GPR const LVX (VMX)  STVX (VMX)  XXSPLTI*
+;;              XXSPLTI32DX
 (define_insn "vsx_mov<mode>_64bit"
   [(set (match_operand:VSX_M 0 "nonimmediate_operand"
                "=ZwO,      wa,        wa,        r,         we,        ?wQ,
                 ?&r,       ??r,       ??Y,       <??r>,     wa,        v,
-                ?wa,       v,         <??r>,     wZ,        v,         wa")
+                ?wa,       v,         <??r>,     wZ,        v,         wa,
+                wa")
 
 	(match_operand:VSX_M 1 "input_operand" 
                "wa,        ZwO,       wa,        we,        r,         r,
                 wQ,        Y,         r,         r,         wE,        jwM,
-                ?jwM,      W,         <nW>,      v,         wZ,        eWeF"))]
+                ?jwM,      W,         <nW>,      v,         wZ,        eWeF,
+                eD"))]
 
   "TARGET_POWERPC64 && VECTOR_MEM_VSX_P (<MODE>mode)
    && (register_operand (operands[0], <MODE>mode) 
@@ -1193,41 +1197,47 @@
   [(set_attr "type"
                "vecstore,  vecload,   vecsimple, mtvsr,     mfvsr,     load,
                 store,     load,      store,     *,         vecsimple, vecsimple,
-                vecsimple, *,         *,         vecstore,  vecload,   vecperm")
+                vecsimple, *,         *,         vecstore,  vecload,   vecperm,
+                vecperm")
    (set_attr "num_insns"
                "*,         *,         *,         2,         *,         2,
                 2,         2,         2,         2,         *,         *,
-                *,         5,         2,         *,         *,         *")
+                *,         5,         2,         *,         *,         *,
+                2")
    (set_attr "max_prefixed_insns"
                "*,         *,         *,         *,         *,         2,
                 2,         2,         2,         2,         *,         *,
-                *,         *,         *,         *,         *,         *")
+                *,         *,         *,         *,         *,         *,
+                2")
    (set_attr "length"
                "*,         *,         *,         8,         *,         8,
                 8,         8,         8,         8,         *,         *,
-                *,         20,        8,         *,         *,         *")
+                *,         20,        8,         *,         *,         *,
+                *")
    (set_attr "isa"
                "<VSisa>,   <VSisa>,   <VSisa>,   *,         *,         *,
                 *,         *,         *,         *,         p9v,       *,
-                <VSisa>,   *,         *,         *,         *,         p10")
+                <VSisa>,   *,         *,         *,         *,         p10,
+                p10")
    (set_attr "prefixed"
                "*,         *,         *,         *,         *,         *,
                 *,         *,         *,         *,         *,         *,
-                *,         *,         *,         *,         *,         yes")])
+                *,         *,         *,         *,         *,         yes,
+                yes")])
 
 ;;              VSX store  VSX load   VSX move   GPR load   GPR store  GPR move
 ;;              XXSPLTIB   VSPLTISW   VSX 0/-1   VMX const  GPR const  XXSPLTI*
-;;              LVX (VMX)  STVX (VMX)
+;;              LVX (VMX)  STVX (VMX) XXSPLTI32DX
 (define_insn "*vsx_mov<mode>_32bit"
   [(set (match_operand:VSX_M 0 "nonimmediate_operand"
                "=ZwO,      wa,        wa,        ??r,       ??Y,       <??r>,
                 wa,        v,         ?wa,       v,         <??r>,     wa,
-                wZ,        v")
+                wZ,        v,         wa")
 
 	(match_operand:VSX_M 1 "input_operand" 
                "wa,        ZwO,       wa,        Y,         r,         r,
                 wE,        jwM,       ?jwM,      W,         <nW>,      eWeF,
-                v,         wZ"))]
+                v,         wZ,        eD"))]
 
   "!TARGET_POWERPC64 && VECTOR_MEM_VSX_P (<MODE>mode)
    && (register_operand (operands[0], <MODE>mode) 
@@ -1238,19 +1248,27 @@
   [(set_attr "type"
                "vecstore,  vecload,   vecsimple, load,      store,    *,
                 vecsimple, vecsimple, vecsimple, *,         *,        vecperm,
-                vecstore,  vecload")
+                vecstore,  vecload,   vecperm")
    (set_attr "length"
                "*,         *,         *,         16,        16,        16,
                 *,         *,         *,         20,        16,        *,
-                *,         *")
+                *,         *,         *")
    (set_attr "isa"
                "<VSisa>,   <VSisa>,   <VSisa>,   *,         *,         *,
                 p9v,       *,         <VSisa>,   *,         *,         p10,
-                *,         *")
+                *,         *,         p10")
    (set_attr "prefixed"
                "*,         *,         *,         *,         *,         *,
                 *,         *,         *,         *,         *,         yes,
-                *,         *")])
+                *,         *,         yes")
+   (set_attr "max_prefixed_insns"
+               "*,         *,         *,         *,         *,         *,
+                *,         *,         *,         *,         *,         *,
+                *,         *,         2")
+   (set_attr "num_insns"
+               "*,         *,         *,         *,         *,         *,
+                *,         *,         *,         *,         *,         *,
+                *,         *,         *")])
 
 ;; Explicit  load/store expanders for the builtin functions
 (define_expand "vsx_load_<mode>"
@@ -6330,6 +6348,79 @@
   DONE;
 })
 
+;; XXSPLTI32DX used to create 64-bit constants
+(define_mode_iterator XXSPLTI32DX [SF DF V2DF V2DI])
+
+(define_insn_and_split "*xxsplti32dx_<mode>"
+  [(set (match_operand:XXSPLTI32DX 0 "vsx_register_operand" "=wa")
+	(match_operand:XXSPLTI32DX 1 "xxsplti32dx_operand"))]
+  "TARGET_XXSPLTI32DX"
+  "#"
+  "&& 1"
+  [(set (match_dup 0)
+	(unspec:XXSPLTI32DX [(match_dup 2)
+			     (match_dup 3)] UNSPEC_XXSPLTI32DX_CONST))
+   (set (match_dup 0)
+	(unspec:XXSPLTI32DX [(match_dup 0)
+			     (match_dup 4)
+			     (match_dup 5)] UNSPEC_XXSPLTI32DX_CONST))]
+{
+  HOST_WIDE_INT value = 0;
+
+  if (!xxsplti32dx_constant_p (operands[1], <MODE>mode, &value))
+    gcc_unreachable ();
+
+  HOST_WIDE_INT high = value >> 32;
+  HOST_WIDE_INT low = value & 0xffffffff;
+
+  /* If the low bits are 0/-1, initialize that word first.  This way we can
+     use a smaller XXSPLTIB instruction instead the first XXSPLTI32DX.  */
+  if (low == 0 || low == -1)
+    {
+      operands[2] = const1_rtx;
+      operands[3] = GEN_INT (low);
+      operands[4] = const0_rtx;
+      operands[5] = GEN_INT (high);
+    }
+  else
+    {
+      operands[2] = const0_rtx;
+      operands[3] = GEN_INT (high);
+      operands[4] = const1_rtx;
+      operands[5] = GEN_INT (low);
+    }
+}
+  [(set_attr "type" "vecperm")
+   (set_attr "prefixed" "yes")
+   (set_attr "num_insns" "2")
+   (set_attr "max_prefixed_insns" "2")])
+
+;; First word of XXSPLTI32DX
+(define_insn "*xxsplti32dx_<mode>_first"
+  [(set (match_operand:XXSPLTI32DX 0 "vsx_register_operand" "=wa,wa,wa")
+	(unspec:XXSPLTI32DX [(match_operand 1 "u1bit_cint_operand" "n,n,n")
+			     (match_operand 2 "const_int_operand" "O,wM,n")]
+			    UNSPEC_XXSPLTI32DX_CONST))]
+  "TARGET_XXSPLTI32DX"
+  "@
+   xxspltib %x0,0
+   xxspltib %x0,255
+   xxsplti32dx %x0,%1,%2"
+  [(set_attr "type" "vecperm")
+   (set_attr "prefixed" "*,*,yes")])
+
+;; Second word of XXSPLTI32DX
+(define_insn "*xxsplti32dx_<mode>_second"
+  [(set (match_operand:XXSPLTI32DX 0 "vsx_register_operand" "=wa")
+	(unspec:XXSPLTI32DX [(match_operand:XXSPLTI32DX 1 "vsx_register_operand" "0")
+			     (match_operand 2 "u1bit_cint_operand" "n")
+			     (match_operand 3 "const_int_operand" "n")]
+			    UNSPEC_XXSPLTI32DX_CONST))]
+  "TARGET_XXSPLTI32DX"
+  "xxsplti32dx %x0,%2,%3"
+  [(set_attr "type" "vecperm")
+   (set_attr "prefixed" "yes")])
+
 ;; XXSPLTI32DX built-in support.
 (define_expand "xxsplti32dx_v4si"
   [(set (match_operand:V4SI 0 "register_operand" "=wa")
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
index 8f6e176f9af..1435ef4ef4f 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
@@ -48,13 +48,16 @@ scalar_double_m_inf (void)	/* XXSPLTIDP.  */
 double
 scalar_double_pi (void)
 {
-  return M_PI;			/* PLFD.  */
+  return M_PI;			/* 2x XXSPLTI32DX.  */
 }
 
 double
 scalar_double_denorm (void)
 {
-  return 0x1p-149f;		/* PLFD.  */
+  return 0x1p-149f;		/* XXSPLTIB, XXSPLTI32DX.  */
 }
 
-/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
+/* { dg-final { scan-assembler-times {\mxxspltidp\M}   5 } } */
+/* { dg-final { scan-assembler-times {\mxxsplti32dx\M} 3 } } */
+/* { dg-final { scan-assembler-not   {\mplfd\M}          } } */
+/* { dg-final { scan-assembler-not   {\mplxsd\M}         } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
index 72504bdfbbd..e9a45d5159d 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
@@ -57,4 +57,7 @@ scalar_float_denorm (void)
   return 0x1p-149f;		/* PLFS.  */
 }
 
-/* { dg-final { scan-assembler-times {\mxxspltidp\M} 6 } } */
+/* { dg-final { scan-assembler-times {\mxxspltidp\M}   6 } } */
+/* { dg-final { scan-assembler-times {\mxxsplti32dx\M} 1 } } */
+/* { dg-final { scan-assembler-not   {\mplfs\M}          } } */
+/* { dg-final { scan-assembler-not   {\mplxssp\M}        } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
index d509459292c..d81198b163d 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
@@ -51,14 +51,16 @@ v2df_double_m_inf (void)
 vector double
 v2df_double_pi (void)
 {
-  return (vector double) { M_PI, M_PI };		/* PLFD.  */
+  return (vector double) { M_PI, M_PI };		/* 2x XXSPLTI32DX.  */
 }
 
 vector double
 v2df_double_denorm (void)
 {
-  return (vector double) { (double)0x1p-149f,
-			   (double)0x1p-149f };		/* PLFD.  */
+  return (vector double) { (double)0x1p-149f,		/* XXSPLTIB, */
+			   (double)0x1p-149f };		/* XXSPLTI32DX.  */
 }
 
-/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
+/* { dg-final { scan-assembler-times {\mxxspltidp\M}   5 } } */
+/* { dg-final { scan-assembler-times {\mxxsplti32dx\M} 3 } } */
+/* { dg-final { scan-assembler-not   {\mplxv\M}          } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
index 06a8289d09b..f0eb982eadf 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
@@ -162,4 +162,4 @@ main (int argc, char *argv [])
 
 /* { dg-final { scan-assembler-times {\mxxspltiw\M} 1 } } */
 /* { dg-final { scan-assembler-times {\mxxspltidp\M} 2 } } */
-/* { dg-final { scan-assembler-times {\mxxsplti32dx\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mxxsplti32dx\M} 4 } } */

next             reply	other threads:[~2021-04-08 20:08 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-08 20:08 Michael Meissner [this message]
  -- strict thread matches above, loose matches on Subject: below --
2021-04-12 17:46 Michael Meissner
2021-04-08  5:35 Michael Meissner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210408200809.E99AE39551C5@sourceware.org \
    --to=meissner@gcc.gnu.org \
    --cc=gcc-cvs@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).