public inbox for gcc-cvs@sourceware.org
* [gcc(refs/users/meissner/heads/work056)] Generate XXSPLTIDP on power10.
@ 2021-06-18 4:37 Michael Meissner
From: Michael Meissner @ 2021-06-18 4:37 UTC
To: gcc-cvs
https://gcc.gnu.org/g:a08c2622d8a8e37c7c22267c017dca75fdfe42ea
commit a08c2622d8a8e37c7c22267c017dca75fdfe42ea
Author: Michael Meissner <meissner@linux.ibm.com>
Date: Fri Jun 18 00:37:32 2021 -0400
Generate XXSPLTIDP on power10.
This patch implements XXSPLTIDP support for SF and DF scalar constants and
V2DF vector constants.
A new constraint (eF) is added to match constants that can be loaded with
the XXSPLTIDP instruction.
I have added a temporary switch (-mxxspltidp) to control whether or not the
XXSPLTIDP instruction is generated.
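As a rough illustration of how such a switch can default to on while still
honoring an explicit user choice, here is a minimal sketch. It is not the GCC
code: the mask values and the apply_splat_defaults name are hypothetical, and
the "supported" parameter stands in for the power10/VSX/prefixed check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mask bits.  The real OPTION_MASK_* values are generated by
   GCC's option machinery and are not reproduced here.  */
#define MASK_XXSPLTIW  (UINT64_C (1) << 0)
#define MASK_XXSPLTIDP (UINT64_C (1) << 1)

/* FLAGS holds the current option bits and EXPLICIT_FLAGS marks the bits the
   user set on the command line (either -mfoo or -mno-foo).  SUPPORTED says
   whether the target can use the prefixed splat instructions at all.  */
static uint64_t
apply_splat_defaults (uint64_t flags, uint64_t explicit_flags, bool supported)
{
  /* Without hardware support, force both switches off.  */
  if (!supported)
    return flags & ~(MASK_XXSPLTIDP | MASK_XXSPLTIW);

  /* Turn each switch on only if the user did not mention it, and test the
     same mask that is being set.  */
  if ((explicit_flags & MASK_XXSPLTIDP) == 0)
    flags |= MASK_XXSPLTIDP;

  if ((explicit_flags & MASK_XXSPLTIW) == 0)
    flags |= MASK_XXSPLTIW;

  return flags;
}
```

With this shape, an explicit -mno-xxspltidp leaves the XXSPLTIDP bit clear
while XXSPLTIW is still enabled by default.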
This patch provides a xxspltidp_constant_p function which decodes both
VEC_DUPLICATE and CONST_VECTOR expressions (similar to the existing
xxspltib_constant_p function).
The xxspltidp_constant_p function passes back, through its pointer argument,
the integer that will be used in the XXSPLTIDP instruction. Note, because the
result of XXSPLTIDP is undefined for SFmode denormal values, the
xxspltidp_constant_p function returns false for these values. Also,
xxspltidp_constant_p returns false for 0.0 because it is cheaper to load
without XXSPLTIDP.
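The acceptance rules above can be sketched outside the compiler as follows.
This is a minimal sketch, not the GCC implementation: the
splat_imm_for_double name is hypothetical, it works on host doubles rather
than on RTL constants, and it assumes IEEE 754 float and double on the host.

```c
#include <float.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: decide whether D can be represented by a 32-bit
   single-precision immediate under the rules described above, and if so
   store that bit pattern in *IMM.  */
static bool
splat_imm_for_double (double d, uint32_t *imm)
{
  float f;
  uint32_t bits;

  *imm = 0;

  /* Positive zero is rejected because it is cheaper to create another way.
     Negative zero is still allowed.  */
  if (d == 0.0 && !signbit (d))
    return false;

  if (!isnan (d))
    {
      /* Finite values beyond the float range cannot be exact.  Reject them
         before converting.  */
      if (isfinite (d) && (d > FLT_MAX || d < -FLT_MAX))
        return false;

      /* The value must survive a round trip through float unchanged.  */
      if ((double) (float) d != d)
        return false;
    }

  f = (float) d;
  memcpy (&bits, &f, sizeof bits);

  /* Single-precision denormal: exponent field is 0, mantissa is non-zero.  */
  if ((bits & 0x7F800000u) == 0 && (bits & 0x007FFFFFu) != 0)
    return false;

  *imm = bits;
  return true;
}
```

Under these rules 1.0 is accepted with immediate 0x3F800000, while pi (not
exact in single precision) and 0x1p-149 (a single-precision denormal) are
rejected, matching the expectations noted in the new tests.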
I added 3 new tests to test loading up SF/DF scalar and V2DF vector
constants.
gcc/
2021-06-18 Michael Meissner <meissner@linux.ibm.com>
* config/rs6000/constraints.md (eF): New constraint.
* config/rs6000/predicates.md (easy_fp_constant): If we can load
the scalar constant with XXSPLTIDP, the floating point constant is
easy.
(xxspltidp_operand): New predicate.
(easy_vector_constant): If we can generate XXSPLTIDP, mark the
vector constant as easy.
* config/rs6000/rs6000-cpus.def (OTHER_POWER10_MASKS): Add
-mxxspltidp support.
(POWERPC_MASKS): Add -mxxspltidp support.
* config/rs6000/rs6000-protos.h (xxspltidp_constant_p): New
declaration.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Add
-mxxspltidp support.
(const_vector_element_all_same): New function.
(xxspltidp_constant_p): New function.
(output_vec_const_move): Add support for XXSPLTIDP.
(rs6000_opt_masks): Add -mxxspltidp support.
(rs6000_emit_xxspltidp_v2df): Change function to implement the
XXSPLTIDP instruction.
* config/rs6000/rs6000.md (movsf_hardfloat): Add XXSPLTIDP
support.
(mov<mode>_hardfloat32, FMOVE64 iterator): Add XXSPLTIDP support.
(mov<mode>_hardfloat64, FMOVE64 iterator): Add XXSPLTIDP support.
* config/rs6000/rs6000.opt (-mxxspltidp): New switch.
* config/rs6000/vsx.md (UNSPEC_XXSPLTIDP): Rename UNSPEC_XXSPLTID
to UNSPEC_XXSPLTIDP to match the instruction.
(xxspltidp_v2df): Use 'use' for the expand arguments, instead of
writing out an insn.
(xxspltidp_v2df_inst): Delete.
(XXSPLTIDP): New mode iterator.
(xxspltidp_<mode>_internal1): New define_insn_and_split.
(xxspltidp_<mode>_internal2): New define_insn.
gcc/testsuite/
2021-06-18 Michael Meissner <meissner@linux.ibm.com>
* gcc.target/powerpc/vec-splat-constant-sf.c: New test.
* gcc.target/powerpc/vec-splat-constant-df.c: New test.
* gcc.target/powerpc/vec-splat-constant-v2df.c: New test.
Diff:
---
gcc/config/rs6000/constraints.md | 5 +
gcc/config/rs6000/predicates.md | 21 +++++
gcc/config/rs6000/rs6000-cpus.def | 2 +
gcc/config/rs6000/rs6000-protos.h | 1 +
gcc/config/rs6000/rs6000.c | 101 ++++++++++++++++++++-
gcc/config/rs6000/rs6000.md | 52 +++++++----
gcc/config/rs6000/rs6000.opt | 4 +
gcc/config/rs6000/vsx.md | 52 ++++++++---
.../gcc.target/powerpc/vec-splat-constant-df.c | 60 ++++++++++++
.../gcc.target/powerpc/vec-splat-constant-sf.c | 60 ++++++++++++
.../gcc.target/powerpc/vec-splat-constant-v2df.c | 64 +++++++++++++
11 files changed, 388 insertions(+), 34 deletions(-)
diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index 561ce9797af..e1fadd63580 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -208,6 +208,11 @@
(and (match_code "const_int")
(match_test "((- (unsigned HOST_WIDE_INT) ival) + 0x8000) < 0x10000")))
+;; SF/DF/V2DF scalar or vector constant that can be loaded with XXSPLTIDP
+(define_constraint "eF"
+ "A vector constant that can be loaded with the XXSPLTIDP instruction."
+ (match_operand 0 "xxspltidp_operand"))
+
;; 34-bit signed integer constant
(define_constraint "eI"
"A signed 34-bit integer constant if prefixed instructions are supported."
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index aa17ddc94e5..1831ff8b6d9 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -601,6 +601,11 @@
if (TARGET_VSX && op == CONST0_RTX (mode))
return 1;
+ /* If we have the ISA 3.1 XXSPLTIDP instruction, see if the constant can
+ be loaded with that instruction. */
+ if (xxspltidp_operand (op, mode))
+ return 1;
+
/* Otherwise consider floating point constants hard, so that the
constant gets pushed to memory during the early RTL phases. This
has the advantage that double precision constants that can be
@@ -666,6 +671,19 @@
return true;
})
+;; Return 1 if operand is a SF/DF CONST_DOUBLE or V2DF CONST_VECTOR that can be
+;; loaded via the ISA 3.1 XXSPLTIDP instruction. Do not return true if the
+;; value is 0.0, since that is easy to generate without using XXSPLTIDP.
+(define_predicate "xxspltidp_operand"
+ (match_code "const_double,const_vector,vec_duplicate")
+{
+ if (op == CONST0_RTX (mode))
+ return false;
+
+ HOST_WIDE_INT value = 0;
+ return xxspltidp_constant_p (op, mode, &value);
+})
+
;; Return 1 if the operand is a CONST_VECTOR and can be loaded into a
;; vector register without using memory.
(define_predicate "easy_vector_constant"
@@ -682,6 +700,9 @@
if (xxspltiw_operand (op, mode))
return true;
+ if (xxspltidp_operand (op, mode))
+ return true;
+
if (TARGET_P9_VECTOR
&& xxspltib_constant_p (op, mode, &num_insns, &value))
return true;
diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
index e6c5891d334..fcdab4cea4a 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -89,6 +89,7 @@
| OPTION_MASK_P10_FUSION_LOGADD \
| OPTION_MASK_P10_FUSION_ADDLOG \
| OPTION_MASK_P10_FUSION_2ADD \
+ | OPTION_MASK_XXSPLTIDP \
| OPTION_MASK_XXSPLTIW)
/* Flags that need to be turned off if -mno-power9-vector. */
@@ -168,6 +169,7 @@
| OPTION_MASK_SOFT_FLOAT \
| OPTION_MASK_STRICT_ALIGN_OPTIONAL \
| OPTION_MASK_VSX \
+ | OPTION_MASK_XXSPLTIDP \
| OPTION_MASK_XXSPLTIW)
#endif
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 94bf961c6b7..c381ddaed37 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -32,6 +32,7 @@ extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, int, int, int,
extern bool easy_altivec_constant (rtx, machine_mode);
extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *);
+extern bool xxspltidp_constant_p (rtx, machine_mode, HOST_WIDE_INT *);
extern int vspltis_shifted (rtx);
extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int);
extern bool macho_lo_sum_memory_operand (rtx, machine_mode);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 3761a448ab6..ab5d9daf4f3 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4503,11 +4503,14 @@ rs6000_option_override_internal (bool global_init_p)
if (TARGET_POWER10 && TARGET_VSX && TARGET_PREFIXED)
{
+ if ((rs6000_isa_flags_explicit & OPTION_MASK_XXSPLTIDP) == 0)
+ rs6000_isa_flags |= OPTION_MASK_XXSPLTIW;
+
if ((rs6000_isa_flags_explicit & OPTION_MASK_XXSPLTIW) == 0)
rs6000_isa_flags |= OPTION_MASK_XXSPLTIW;
}
else
- rs6000_isa_flags &= ~OPTION_MASK_XXSPLTIW;
+ rs6000_isa_flags &= ~(OPTION_MASK_XXSPLTIDP | OPTION_MASK_XXSPLTIW);
if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET)
rs6000_print_isa_options (stderr, 0, "after subtarget", rs6000_isa_flags);
@@ -6511,6 +6514,96 @@ xxspltib_constant_p (rtx op,
return true;
}
+/* Return the element of a constant vector whose elements are all the same. In
+ addition if VEC_DUPLICATE is used, return the element being duplicated. If
+ neither is true, return NULL_RTX. */
+
+static rtx
+const_vector_element_all_same (rtx op)
+{
+ if (GET_CODE (op) == VEC_DUPLICATE)
+ {
+ rtx element = XEXP (op, 0);
+ return (CONST_INT_P (element) || CONST_DOUBLE_P (element)
+ ? element
+ : NULL_RTX);
+ }
+
+ else if (GET_CODE (op) == CONST_VECTOR)
+ {
+ machine_mode mode = GET_MODE (op);
+ size_t n_elts = GET_MODE_NUNITS (mode);
+ rtx element = CONST_VECTOR_ELT (op, 0);
+
+ for (size_t i = 1; i < n_elts; i++)
+ if (!rtx_equal_p (element, CONST_VECTOR_ELT (op, 1)))
+ return NULL_RTX;
+
+ return element;
+ }
+
+ return NULL_RTX;
+}
+
+/* Return true if OP is of the given MODE and can be synthesized with ISA 3.1
+ XXSPLTIDP instruction.
+
+ Return the constant that is being split via CONSTANT_PTR to use in the
+ XXSPLTIDP instruction. */
+
+bool
+xxspltidp_constant_p (rtx op,
+ machine_mode mode,
+ HOST_WIDE_INT *constant_ptr)
+{
+ *constant_ptr = 0;
+
+ if (!TARGET_XXSPLTIDP)
+ return false;
+
+ if (mode == VOIDmode)
+ mode = GET_MODE (op);
+
+ rtx element = op;
+ if (mode == V2DFmode)
+ {
+ element = const_vector_element_all_same (op);
+ if (!element)
+ return false;
+
+ mode = DFmode;
+ }
+
+ if (mode != SFmode && mode != DFmode)
+ return false;
+
+ if (GET_MODE (element) != mode)
+ return false;
+
+ if (!CONST_DOUBLE_P (element))
+ return false;
+
+ /* Don't return true for 0.0 since that is easy to create without
+ XXSPLTIDP. */
+ if (element == CONST0_RTX (mode))
+ return false;
+
+ /* If the value doesn't fit in a SFmode, exactly, we can't use XXSPLTIDP. */
+ const struct real_value *rv = CONST_DOUBLE_REAL_VALUE (element);
+ if (!exact_real_truncate (SFmode, rv))
+ return 0;
+
+ long value;
+ REAL_VALUE_TO_TARGET_SINGLE (*rv, value);
+
+ /* Test for SFmode denormal (exponent is 0, mantissa field is non-zero). */
+ if (((value & 0x7F800000) == 0) && ((value & 0x7FFFFF) != 0))
+ return false;
+
+ *constant_ptr = value;
+ return true;
+}
+
const char *
output_vec_const_move (rtx *operands)
{
@@ -6555,7 +6648,8 @@ output_vec_const_move (rtx *operands)
gcc_unreachable ();
}
- if (xxspltiw_operand (vec, mode))
+ if (xxspltiw_operand (vec, mode)
+ || xxspltidp_operand (vec, mode))
return "#";
if (TARGET_P9_VECTOR
@@ -24133,6 +24227,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
{ "update", OPTION_MASK_NO_UPDATE, true , true },
{ "vsx", OPTION_MASK_VSX, false, true },
{ "xxspltiw", OPTION_MASK_XXSPLTIW, false, true },
+ { "xxspltidp", OPTION_MASK_XXSPLTIDP, false, true },
#ifdef OPTION_MASK_64BIT
#if TARGET_AIX_OS
{ "aix64", OPTION_MASK_64BIT, false, false },
@@ -27972,7 +28067,7 @@ rs6000_emit_xxspltidp_v2df (rtx dst, long value)
inform (input_location,
"the result for the xxspltidp instruction "
"is undefined for subnormal input values");
- emit_insn( gen_xxspltidp_v2df_inst (dst, GEN_INT (value)));
+ emit_insn (gen_xxspltidp_v2df_internal2 (dst, GEN_INT (value)));
}
/* Implement TARGET_ASM_GENERATE_PIC_ADDR_DIFF_VEC. */
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 44d82c605c7..ebdc95a006a 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -7689,17 +7689,17 @@
;;
;; LWZ LFS LXSSP LXSSPX STFS STXSSP
;; STXSSPX STW XXLXOR LI FMR XSCPSGNDP
-;; MR MT<x> MF<x> NOP
+;; MR MT<x> MF<x> NOP XXSPLTIDP
(define_insn "movsf_hardfloat"
[(set (match_operand:SF 0 "nonimmediate_operand"
"=!r, f, v, wa, m, wY,
Z, m, wa, !r, f, wa,
- !r, *c*l, !r, *h")
+ !r, *c*l, !r, *h, wa")
(match_operand:SF 1 "input_operand"
"m, m, wY, Z, f, v,
wa, r, j, j, f, wa,
- r, r, *h, 0"))]
+ r, r, *h, 0, eF"))]
"(register_operand (operands[0], SFmode)
|| register_operand (operands[1], SFmode))
&& TARGET_HARD_FLOAT
@@ -7721,15 +7721,20 @@
mr %0,%1
mt%0 %1
mf%1 %0
- nop"
+ nop
+ #"
[(set_attr "type"
"load, fpload, fpload, fpload, fpstore, fpstore,
fpstore, store, veclogical, integer, fpsimple, fpsimple,
- *, mtjmpr, mfjmpr, *")
+ *, mtjmpr, mfjmpr, *, vecperm")
(set_attr "isa"
"*, *, p9v, p8v, *, p9v,
p8v, *, *, *, *, *,
- *, *, *, *")])
+ *, *, *, *, p10")
+ (set_attr "prefixed"
+ "*, *, *, *, *, *,
+ *, *, *, *, *, *,
+ *, *, *, *, yes")])
;; LWZ LFIWZX STW STFIWX MTVSRWZ MFVSRWZ
;; FMR MR MT%0 MF%1 NOP
@@ -7989,18 +7994,18 @@
;; STFD LFD FMR LXSD STXSD
;; LXSD STXSD XXLOR XXLXOR GPR<-0
-;; LWZ STW MR
+;; LWZ STW MR XXSPLTIDP
(define_insn "*mov<mode>_hardfloat32"
[(set (match_operand:FMOVE64 0 "nonimmediate_operand"
"=m, d, d, <f64_p9>, wY,
<f64_av>, Z, <f64_vsx>, <f64_vsx>, !r,
- Y, r, !r")
+ Y, r, !r, wa")
(match_operand:FMOVE64 1 "input_operand"
"d, m, d, wY, <f64_p9>,
Z, <f64_av>, <f64_vsx>, <zero_fp>, <zero_fp>,
- r, Y, r"))]
+ r, Y, r, eF"))]
"! TARGET_POWERPC64 && TARGET_HARD_FLOAT
&& (gpc_reg_operand (operands[0], <MODE>mode)
|| gpc_reg_operand (operands[1], <MODE>mode))"
@@ -8017,20 +8022,25 @@
#
#
#
+ #
#"
[(set_attr "type"
"fpstore, fpload, fpsimple, fpload, fpstore,
fpload, fpstore, veclogical, veclogical, two,
- store, load, two")
+ store, load, two, vecperm")
(set_attr "size" "64")
(set_attr "length"
"*, *, *, *, *,
*, *, *, *, 8,
- 8, 8, 8")
+ 8, 8, 8, *")
(set_attr "isa"
"*, *, *, p9v, p9v,
p7v, p7v, *, *, *,
- *, *, *")])
+ *, *, *, p10")
+ (set_attr "prefixed"
+ "*, *, *, *, *,
+ *, *, *, *, *,
+ *, *, *, yes")])
;; STW LWZ MR G-const H-const F-const
@@ -8057,19 +8067,19 @@
;; STFD LFD FMR LXSD STXSD
;; LXSDX STXSDX XXLOR XXLXOR LI 0
;; STD LD MR MT{CTR,LR} MF{CTR,LR}
-;; NOP MFVSRD MTVSRD
+;; NOP MFVSRD MTVSRD XXSPLTIDP
(define_insn "*mov<mode>_hardfloat64"
[(set (match_operand:FMOVE64 0 "nonimmediate_operand"
"=m, d, d, <f64_p9>, wY,
<f64_av>, Z, <f64_vsx>, <f64_vsx>, !r,
YZ, r, !r, *c*l, !r,
- *h, r, <f64_dm>")
+ *h, r, <f64_dm>, wa")
(match_operand:FMOVE64 1 "input_operand"
"d, m, d, wY, <f64_p9>,
Z, <f64_av>, <f64_vsx>, <zero_fp>, <zero_fp>,
r, YZ, r, r, *h,
- 0, <f64_dm>, r"))]
+ 0, <f64_dm>, r, eF"))]
"TARGET_POWERPC64 && TARGET_HARD_FLOAT
&& (gpc_reg_operand (operands[0], <MODE>mode)
|| gpc_reg_operand (operands[1], <MODE>mode))"
@@ -8091,18 +8101,24 @@
mf%1 %0
nop
mfvsrd %0,%x1
- mtvsrd %x0,%1"
+ mtvsrd %x0,%1
+ #"
[(set_attr "type"
"fpstore, fpload, fpsimple, fpload, fpstore,
fpload, fpstore, veclogical, veclogical, integer,
store, load, *, mtjmpr, mfjmpr,
- *, mfvsr, mtvsr")
+ *, mfvsr, mtvsr, vecperm")
(set_attr "size" "64")
(set_attr "isa"
"*, *, *, p9v, p9v,
p7v, p7v, *, *, *,
*, *, *, *, *,
- *, p8v, p8v")])
+ *, p8v, p8v, p10")
+ (set_attr "prefixed"
+ "*, *, *, *, *,
+ *, *, *, *, *,
+ *, *, *, *, *,
+ *, *, *, yes")])
;; STD LD MR MT<SPR> MF<SPR> G-const
;; H-const F-const Special
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 38eaa36d6d8..0333099cc8a 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -643,3 +643,7 @@ Generate code that will run in privileged state.
mxxspltiw
Target Undocumented Mask(XXSPLTIW) Var(rs6000_isa_flags)
Generate (do not generate) XXSPLTIW instructions.
+
+mxxspltidp
+Target Undocumented Mask(XXSPLTIDP) Var(rs6000_isa_flags)
+Generate (do not generate) XXSPLTIDP instructions.
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 8f2a37d74d5..ce98f78e02a 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -389,7 +389,7 @@
UNSPEC_VDIVES
UNSPEC_VDIVEU
UNSPEC_XXEVAL
- UNSPEC_XXSPLTID
+ UNSPEC_XXSPLTIDP
UNSPEC_XXSPLTI32DX
UNSPEC_XXBLEND
UNSPEC_XXPERMX
@@ -6407,9 +6407,8 @@
;; XXSPLTIDP built-in function support
(define_expand "xxspltidp_v2df"
- [(set (match_operand:V2DF 0 "register_operand" )
- (unspec:V2DF [(match_operand:SF 1 "const_double_operand")]
- UNSPEC_XXSPLTID))]
+ [(use (match_operand:V2DF 0 "register_operand" ))
+ (use (match_operand:SF 1 "const_double_operand"))]
"TARGET_POWER10"
{
long value = rs6000_const_f32_to_i32 (operands[1]);
@@ -6417,15 +6416,6 @@
DONE;
})
-(define_insn "xxspltidp_v2df_inst"
- [(set (match_operand:V2DF 0 "register_operand" "=wa")
- (unspec:V2DF [(match_operand:SI 1 "c32bit_cint_operand" "n")]
- UNSPEC_XXSPLTID))]
- "TARGET_POWER10"
- "xxspltidp %x0,%1"
- [(set_attr "type" "vecsimple")
- (set_attr "prefixed" "yes")])
-
;; XXSPLTI32DX built-in function support
(define_expand "xxsplti32dx_v4si"
[(set (match_operand:V4SI 0 "register_operand" "=wa")
@@ -6671,3 +6661,39 @@
{
operands[2] = CONST_VECTOR_ELT (operands[1], 0);
})
+
+;; Generate the XXSPLTIDP instruction to support SFmode and DFmode scalar
+;; constants and V2DF vector constants where both elements are the same. The
+;; constant has be expressible as a SFmode constant that is not a SFmode
+;; denormal value.
+(define_mode_iterator XXSPLTIDP [SF DF V2DF])
+
+(define_insn_and_split "*xxspltidp_<mode>_internal1"
+ [(set (match_operand:XXSPLTIDP 0 "vsx_register_operand" "=wa")
+ (match_operand:XXSPLTIDP 1 "xxspltidp_operand"))]
+ "TARGET_XXSPLTIDP"
+ "#"
+ "&& 1"
+ [(set (match_operand:XXSPLTIDP 0 "vsx_register_operand")
+ (unspec:XXSPLTIDP [(match_dup 2)] UNSPEC_XXSPLTIDP))]
+{
+ HOST_WIDE_INT value = 0;
+
+ if (!xxspltidp_constant_p (operands[1], <MODE>mode, &value))
+ gcc_unreachable ();
+
+ operands[2] = GEN_INT (value);
+}
+ [(set_attr "type" "vecperm")
+ (set_attr "prefixed" "yes")])
+
+;; Just in case the user issued -mno-xxspltidp, allow the built-in function
+;; even if the compiler does not automatically generate XXSPLTIDP.
+(define_insn "xxspltidp_<mode>_internal2"
+ [(set (match_operand:XXSPLTIDP 0 "vsx_register_operand" "=wa")
+ (unspec:XXSPLTIDP [(match_operand 1 "const_int_operand" "n")]
+ UNSPEC_XXSPLTIDP))]
+ "TARGET_POWER10"
+ "xxspltidp %x0,%1"
+ [(set_attr "type" "vecperm")
+ (set_attr "prefixed" "yes")])
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
new file mode 100644
index 00000000000..8f6e176f9af
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating DFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+ instruction. */
+
+double
+scalar_double_0 (void)
+{
+ return 0.0; /* XXSPLTIB or XXLXOR. */
+}
+
+double
+scalar_double_1 (void)
+{
+ return 1.0; /* XXSPLTIDP. */
+}
+
+#ifndef __FAST_MATH__
+double
+scalar_double_m0 (void)
+{
+ return -0.0; /* XXSPLTIDP. */
+}
+
+double
+scalar_double_nan (void)
+{
+ return __builtin_nan (""); /* XXSPLTIDP. */
+}
+
+double
+scalar_double_inf (void)
+{
+ return __builtin_inf (); /* XXSPLTIDP. */
+}
+
+double
+scalar_double_m_inf (void) /* XXSPLTIDP. */
+{
+ return - __builtin_inf ();
+}
+#endif
+
+double
+scalar_double_pi (void)
+{
+ return M_PI; /* PLFD. */
+}
+
+double
+scalar_double_denorm (void)
+{
+ return 0x1p-149f; /* PLFD. */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
new file mode 100644
index 00000000000..72504bdfbbd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating SFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+ instruction. */
+
+float
+scalar_float_0 (void)
+{
+ return 0.0f; /* XXSPLTIB or XXLXOR. */
+}
+
+float
+scalar_float_1 (void)
+{
+ return 1.0f; /* XXSPLTIDP. */
+}
+
+#ifndef __FAST_MATH__
+float
+scalar_float_m0 (void)
+{
+ return -0.0f; /* XXSPLTIDP. */
+}
+
+float
+scalar_float_nan (void)
+{
+ return __builtin_nanf (""); /* XXSPLTIDP. */
+}
+
+float
+scalar_float_inf (void)
+{
+ return __builtin_inff (); /* XXSPLTIDP. */
+}
+
+float
+scalar_float_m_inf (void) /* XXSPLTIDP. */
+{
+ return - __builtin_inff ();
+}
+#endif
+
+float
+scalar_float_pi (void)
+{
+ return (float)M_PI; /* XXSPLTIDP. */
+}
+
+float
+scalar_float_denorm (void)
+{
+ return 0x1p-149f; /* PLFS. */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 6 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
new file mode 100644
index 00000000000..d509459292c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
@@ -0,0 +1,64 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating V2DFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+ instruction. */
+
+vector double
+v2df_double_0 (void)
+{
+ return (vector double) { 0.0, 0.0 }; /* XXSPLTIB or XXLXOR. */
+}
+
+vector double
+v2df_double_1 (void)
+{
+ return (vector double) { 1.0, 1.0 }; /* XXSPLTIDP. */
+}
+
+#ifndef __FAST_MATH__
+vector double
+v2df_double_m0 (void)
+{
+ return (vector double) { -0.0, -0.0 }; /* XXSPLTIDP. */
+}
+
+vector double
+v2df_double_nan (void)
+{
+ return (vector double) { __builtin_nan (""),
+ __builtin_nan ("") }; /* XXSPLTIDP. */
+}
+
+vector double
+v2df_double_inf (void)
+{
+ return (vector double) { __builtin_inf (),
+ __builtin_inf () }; /* XXSPLTIDP. */
+}
+
+vector double
+v2df_double_m_inf (void)
+{
+ return (vector double) { - __builtin_inf (),
+ - __builtin_inf () }; /* XXSPLTIDP. */
+}
+#endif
+
+vector double
+v2df_double_pi (void)
+{
+ return (vector double) { M_PI, M_PI }; /* PLFD. */
+}
+
+vector double
+v2df_double_denorm (void)
+{
+ return (vector double) { (double)0x1p-149f,
+ (double)0x1p-149f }; /* PLFD. */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
* [gcc(refs/users/meissner/heads/work056)] Generate XXSPLTIDP on power10.
@ 2021-06-18 4:42 Michael Meissner
From: Michael Meissner @ 2021-06-18 4:42 UTC
To: gcc-cvs
https://gcc.gnu.org/g:008194b5eddf5edd86e743dabd955657757fd94f
commit 008194b5eddf5edd86e743dabd955657757fd94f
Author: Michael Meissner <meissner@linux.ibm.com>
Date: Fri Jun 18 00:42:20 2021 -0400
Generate XXSPLTIDP on power10.
This patch implements XXSPLTIDP support for SF and DF scalar constants and
V2DF vector constants.
A new constraint (eF) is added to match constants that can be loaded with
the XXSPLTIDP instruction.
I have added a temporary switch (-mxxspltidp) to control whether or not the
XXSPLTIDP instruction is generated.
This patch provides a xxspltidp_constant_p function which decodes both
VEC_DUPLICATE and CONST_VECTOR expressions (similar to the existing
xxspltib_constant_p function).
The xxspltidp_constant_p function passes back, through its pointer argument,
the integer that will be used in the XXSPLTIDP instruction. Note, because the
result of XXSPLTIDP is undefined for SFmode denormal values, the
xxspltidp_constant_p function returns false for these values. Also,
xxspltidp_constant_p returns false for 0.0 because it is cheaper to load
without XXSPLTIDP.
I added 3 new tests to test loading up SF/DF scalar and V2DF vector
constants.
gcc/
2021-06-18 Michael Meissner <meissner@linux.ibm.com>
* config/rs6000/constraints.md (eF): New constraint.
* config/rs6000/predicates.md (easy_fp_constant): If we can load
the scalar constant with XXSPLTIDP, the floating point constant is
easy.
(xxspltidp_operand): New predicate.
(easy_vector_constant): If we can generate XXSPLTIDP, mark the
vector constant as easy.
* config/rs6000/rs6000-cpus.def (OTHER_POWER10_MASKS): Add
-mxxspltidp support.
(POWERPC_MASKS): Add -mxxspltidp support.
* config/rs6000/rs6000-protos.h (xxspltidp_constant_p): New
declaration.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Add
-mxxspltidp support.
(const_vector_element_all_same): New function.
(xxspltidp_constant_p): New function.
(output_vec_const_move): Add support for XXSPLTIDP.
(rs6000_opt_masks): Add -mxxspltidp support.
(rs6000_emit_xxspltidp_v2df): Change function to implement the
XXSPLTIDP instruction.
* config/rs6000/rs6000.md (movsf_hardfloat): Add XXSPLTIDP
support.
(mov<mode>_hardfloat32, FMOVE64 iterator): Add XXSPLTIDP support.
(mov<mode>_hardfloat64, FMOVE64 iterator): Add XXSPLTIDP support.
* config/rs6000/rs6000.opt (-mxxspltidp): New switch.
* config/rs6000/vsx.md (UNSPEC_XXSPLTIDP): Rename UNSPEC_XXSPLTID
to UNSPEC_XXSPLTIDP to match the instruction.
(xxspltidp_v2df): Use 'use' for the expand arguments, instead of
writing out an insn.
(xxspltidp_v2df_inst): Delete.
(XXSPLTIDP): New mode iterator.
(xxspltidp_<mode>_internal1): New define_insn_and_split.
(xxspltidp_<mode>_internal2): New define_insn.
gcc/testsuite/
2021-06-18 Michael Meissner <meissner@linux.ibm.com>
* gcc.target/powerpc/vec-splat-constant-sf.c: New test.
* gcc.target/powerpc/vec-splat-constant-df.c: New test.
* gcc.target/powerpc/vec-splat-constant-v2df.c: New test.
Diff:
---
gcc/config/rs6000/constraints.md | 5 +
gcc/config/rs6000/predicates.md | 21 +++++
gcc/config/rs6000/rs6000-cpus.def | 2 +
gcc/config/rs6000/rs6000-protos.h | 1 +
gcc/config/rs6000/rs6000.c | 101 ++++++++++++++++++++-
gcc/config/rs6000/rs6000.md | 52 +++++++----
gcc/config/rs6000/rs6000.opt | 4 +
gcc/config/rs6000/vsx.md | 52 ++++++++---
.../gcc.target/powerpc/vec-splat-constant-df.c | 60 ++++++++++++
.../gcc.target/powerpc/vec-splat-constant-sf.c | 60 ++++++++++++
.../gcc.target/powerpc/vec-splat-constant-v2df.c | 64 +++++++++++++
11 files changed, 388 insertions(+), 34 deletions(-)
diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index 561ce9797af..e1fadd63580 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -208,6 +208,11 @@
(and (match_code "const_int")
(match_test "((- (unsigned HOST_WIDE_INT) ival) + 0x8000) < 0x10000")))
+;; SF/DF/V2DF scalar or vector constant that can be loaded with XXSPLTIDP
+(define_constraint "eF"
+ "A vector constant that can be loaded with the XXSPLTIDP instruction."
+ (match_operand 0 "xxspltidp_operand"))
+
;; 34-bit signed integer constant
(define_constraint "eI"
"A signed 34-bit integer constant if prefixed instructions are supported."
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index aa17ddc94e5..1831ff8b6d9 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -601,6 +601,11 @@
if (TARGET_VSX && op == CONST0_RTX (mode))
return 1;
+ /* If we have the ISA 3.1 XXSPLTIDP instruction, see if the constant can
+ be loaded with that instruction. */
+ if (xxspltidp_operand (op, mode))
+ return 1;
+
/* Otherwise consider floating point constants hard, so that the
constant gets pushed to memory during the early RTL phases. This
has the advantage that double precision constants that can be
@@ -666,6 +671,19 @@
return true;
})
+;; Return 1 if operand is a SF/DF CONST_DOUBLE or V2DF CONST_VECTOR that can be
+;; loaded via the ISA 3.1 XXSPLTIDP instruction. Do not return true if the
+;; value is 0.0, since that is easy to generate without using XXSPLTIDP.
+(define_predicate "xxspltidp_operand"
+ (match_code "const_double,const_vector,vec_duplicate")
+{
+ if (op == CONST0_RTX (mode))
+ return false;
+
+ HOST_WIDE_INT value = 0;
+ return xxspltidp_constant_p (op, mode, &value);
+})
+
;; Return 1 if the operand is a CONST_VECTOR and can be loaded into a
;; vector register without using memory.
(define_predicate "easy_vector_constant"
@@ -682,6 +700,9 @@
if (xxspltiw_operand (op, mode))
return true;
+ if (xxspltidp_operand (op, mode))
+ return true;
+
if (TARGET_P9_VECTOR
&& xxspltib_constant_p (op, mode, &num_insns, &value))
return true;
diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
index e6c5891d334..fcdab4cea4a 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -89,6 +89,7 @@
| OPTION_MASK_P10_FUSION_LOGADD \
| OPTION_MASK_P10_FUSION_ADDLOG \
| OPTION_MASK_P10_FUSION_2ADD \
+ | OPTION_MASK_XXSPLTIDP \
| OPTION_MASK_XXSPLTIW)
/* Flags that need to be turned off if -mno-power9-vector. */
@@ -168,6 +169,7 @@
| OPTION_MASK_SOFT_FLOAT \
| OPTION_MASK_STRICT_ALIGN_OPTIONAL \
| OPTION_MASK_VSX \
+ | OPTION_MASK_XXSPLTIDP \
| OPTION_MASK_XXSPLTIW)
#endif
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 94bf961c6b7..c381ddaed37 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -32,6 +32,7 @@ extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, int, int, int,
extern bool easy_altivec_constant (rtx, machine_mode);
extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *);
+extern bool xxspltidp_constant_p (rtx, machine_mode, HOST_WIDE_INT *);
extern int vspltis_shifted (rtx);
extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int);
extern bool macho_lo_sum_memory_operand (rtx, machine_mode);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 3761a448ab6..e9ec81df360 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4503,11 +4503,14 @@ rs6000_option_override_internal (bool global_init_p)
if (TARGET_POWER10 && TARGET_VSX && TARGET_PREFIXED)
{
+ if ((rs6000_isa_flags_explicit & OPTION_MASK_XXSPLTIDP) == 0)
+ rs6000_isa_flags |= OPTION_MASK_XXSPLTIDP;
+
if ((rs6000_isa_flags_explicit & OPTION_MASK_XXSPLTIW) == 0)
rs6000_isa_flags |= OPTION_MASK_XXSPLTIW;
}
else
- rs6000_isa_flags &= ~OPTION_MASK_XXSPLTIW;
+ rs6000_isa_flags &= ~(OPTION_MASK_XXSPLTIDP | OPTION_MASK_XXSPLTIW);
if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET)
rs6000_print_isa_options (stderr, 0, "after subtarget", rs6000_isa_flags);
@@ -6511,6 +6514,96 @@ xxspltib_constant_p (rtx op,
return true;
}
+/* Return the element of a constant vector whose elements are all the same. In
+ addition if VEC_DUPLICATE is used, return the element being duplicated. If
+ neither is true, return NULL_RTX. */
+
+static rtx
+const_vector_element_all_same (rtx op)
+{
+ if (GET_CODE (op) == VEC_DUPLICATE)
+ {
+ rtx element = XEXP (op, 0);
+ return (CONST_INT_P (element) || CONST_DOUBLE_P (element)
+ ? element
+ : NULL_RTX);
+ }
+
+ else if (GET_CODE (op) == CONST_VECTOR)
+ {
+ machine_mode mode = GET_MODE (op);
+ size_t n_elts = GET_MODE_NUNITS (mode);
+ rtx element = CONST_VECTOR_ELT (op, 0);
+
+ for (size_t i = 1; i < n_elts; i++)
+ if (!rtx_equal_p (element, CONST_VECTOR_ELT (op, 1)))
+ return NULL_RTX;
+
+ return element;
+ }
+
+ return NULL_RTX;
+}
+
+/* Return true if OP is of the given MODE and can be synthesized with ISA 3.1
+ XXSPLTIDP instruction.
+
+ Return the constant that is being split via CONSTANT_PTR to use in the
+ XXSPLTIDP instruction. */
+
+bool
+xxspltidp_constant_p (rtx op,
+ machine_mode mode,
+ HOST_WIDE_INT *constant_ptr)
+{
+ *constant_ptr = 0;
+
+ if (!TARGET_XXSPLTIDP)
+ return false;
+
+ if (mode == VOIDmode)
+ mode = GET_MODE (op);
+
+ rtx element = op;
+ if (mode == V2DFmode)
+ {
+ element = const_vector_element_all_same (op);
+ if (!element)
+ return false;
+
+ mode = DFmode;
+ }
+
+ if (mode != SFmode && mode != DFmode)
+ return false;
+
+ if (GET_MODE (element) != mode)
+ return false;
+
+ if (!CONST_DOUBLE_P (element))
+ return false;
+
+ /* Don't return true for 0.0 since that is easy to create without
+ XXSPLTIDP. */
+ if (element == CONST0_RTX (mode))
+ return false;
+
+ /* If the value doesn't fit in a SFmode, exactly, we can't use XXSPLTIDP. */
+ const struct real_value *rv = CONST_DOUBLE_REAL_VALUE (element);
+ if (!exact_real_truncate (SFmode, rv))
+ return 0;
+
+ long value;
+ REAL_VALUE_TO_TARGET_SINGLE (*rv, value);
+
+ /* Test for SFmode denormal (exponent is 0, mantissa field is non-zero). */
+ if (((value & 0x7F800000) == 0) && ((value & 0x7FFFFF) != 0))
+ return false;
+
+ *constant_ptr = value;
+ return true;
+}
+
const char *
output_vec_const_move (rtx *operands)
{
@@ -6555,7 +6648,8 @@ output_vec_const_move (rtx *operands)
gcc_unreachable ();
}
- if (xxspltiw_operand (vec, mode))
+ if (xxspltiw_operand (vec, mode)
+ || xxspltidp_operand (vec, mode))
return "#";
if (TARGET_P9_VECTOR
@@ -24133,6 +24227,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
{ "update", OPTION_MASK_NO_UPDATE, true , true },
{ "vsx", OPTION_MASK_VSX, false, true },
{ "xxspltiw", OPTION_MASK_XXSPLTIW, false, true },
+ { "xxspltidp", OPTION_MASK_XXSPLTIDP, false, true },
#ifdef OPTION_MASK_64BIT
#if TARGET_AIX_OS
{ "aix64", OPTION_MASK_64BIT, false, false },
@@ -27972,7 +28067,7 @@ rs6000_emit_xxspltidp_v2df (rtx dst, long value)
inform (input_location,
"the result for the xxspltidp instruction "
"is undefined for subnormal input values");
- emit_insn( gen_xxspltidp_v2df_inst (dst, GEN_INT (value)));
+ emit_insn (gen_xxspltidp_v2df_internal2 (dst, GEN_INT (value)));
}
/* Implement TARGET_ASM_GENERATE_PIC_ADDR_DIFF_VEC. */
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 44d82c605c7..ebdc95a006a 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -7689,17 +7689,17 @@
;;
;; LWZ LFS LXSSP LXSSPX STFS STXSSP
;; STXSSPX STW XXLXOR LI FMR XSCPSGNDP
-;; MR MT<x> MF<x> NOP
+;; MR MT<x> MF<x> NOP XXSPLTIDP
(define_insn "movsf_hardfloat"
[(set (match_operand:SF 0 "nonimmediate_operand"
"=!r, f, v, wa, m, wY,
Z, m, wa, !r, f, wa,
- !r, *c*l, !r, *h")
+ !r, *c*l, !r, *h, wa")
(match_operand:SF 1 "input_operand"
"m, m, wY, Z, f, v,
wa, r, j, j, f, wa,
- r, r, *h, 0"))]
+ r, r, *h, 0, eF"))]
"(register_operand (operands[0], SFmode)
|| register_operand (operands[1], SFmode))
&& TARGET_HARD_FLOAT
@@ -7721,15 +7721,20 @@
mr %0,%1
mt%0 %1
mf%1 %0
- nop"
+ nop
+ #"
[(set_attr "type"
"load, fpload, fpload, fpload, fpstore, fpstore,
fpstore, store, veclogical, integer, fpsimple, fpsimple,
- *, mtjmpr, mfjmpr, *")
+ *, mtjmpr, mfjmpr, *, vecperm")
(set_attr "isa"
"*, *, p9v, p8v, *, p9v,
p8v, *, *, *, *, *,
- *, *, *, *")])
+ *, *, *, *, p10")
+ (set_attr "prefixed"
+ "*, *, *, *, *, *,
+ *, *, *, *, *, *,
+ *, *, *, *, yes")])
;; LWZ LFIWZX STW STFIWX MTVSRWZ MFVSRWZ
;; FMR MR MT%0 MF%1 NOP
@@ -7989,18 +7994,18 @@
;; STFD LFD FMR LXSD STXSD
;; LXSD STXSD XXLOR XXLXOR GPR<-0
-;; LWZ STW MR
+;; LWZ STW MR XXSPLTIDP
(define_insn "*mov<mode>_hardfloat32"
[(set (match_operand:FMOVE64 0 "nonimmediate_operand"
"=m, d, d, <f64_p9>, wY,
<f64_av>, Z, <f64_vsx>, <f64_vsx>, !r,
- Y, r, !r")
+ Y, r, !r, wa")
(match_operand:FMOVE64 1 "input_operand"
"d, m, d, wY, <f64_p9>,
Z, <f64_av>, <f64_vsx>, <zero_fp>, <zero_fp>,
- r, Y, r"))]
+ r, Y, r, eF"))]
"! TARGET_POWERPC64 && TARGET_HARD_FLOAT
&& (gpc_reg_operand (operands[0], <MODE>mode)
|| gpc_reg_operand (operands[1], <MODE>mode))"
@@ -8017,20 +8022,25 @@
#
#
#
+ #
#"
[(set_attr "type"
"fpstore, fpload, fpsimple, fpload, fpstore,
fpload, fpstore, veclogical, veclogical, two,
- store, load, two")
+ store, load, two, vecperm")
(set_attr "size" "64")
(set_attr "length"
"*, *, *, *, *,
*, *, *, *, 8,
- 8, 8, 8")
+ 8, 8, 8, *")
(set_attr "isa"
"*, *, *, p9v, p9v,
p7v, p7v, *, *, *,
- *, *, *")])
+ *, *, *, p10")
+ (set_attr "prefixed"
+ "*, *, *, *, *,
+ *, *, *, *, *,
+ *, *, *, yes")])
;; STW LWZ MR G-const H-const F-const
@@ -8057,19 +8067,19 @@
;; STFD LFD FMR LXSD STXSD
;; LXSDX STXSDX XXLOR XXLXOR LI 0
;; STD LD MR MT{CTR,LR} MF{CTR,LR}
-;; NOP MFVSRD MTVSRD
+;; NOP MFVSRD MTVSRD XXSPLTIDP
(define_insn "*mov<mode>_hardfloat64"
[(set (match_operand:FMOVE64 0 "nonimmediate_operand"
"=m, d, d, <f64_p9>, wY,
<f64_av>, Z, <f64_vsx>, <f64_vsx>, !r,
YZ, r, !r, *c*l, !r,
- *h, r, <f64_dm>")
+ *h, r, <f64_dm>, wa")
(match_operand:FMOVE64 1 "input_operand"
"d, m, d, wY, <f64_p9>,
Z, <f64_av>, <f64_vsx>, <zero_fp>, <zero_fp>,
r, YZ, r, r, *h,
- 0, <f64_dm>, r"))]
+ 0, <f64_dm>, r, eF"))]
"TARGET_POWERPC64 && TARGET_HARD_FLOAT
&& (gpc_reg_operand (operands[0], <MODE>mode)
|| gpc_reg_operand (operands[1], <MODE>mode))"
@@ -8091,18 +8101,24 @@
mf%1 %0
nop
mfvsrd %0,%x1
- mtvsrd %x0,%1"
+ mtvsrd %x0,%1
+ #"
[(set_attr "type"
"fpstore, fpload, fpsimple, fpload, fpstore,
fpload, fpstore, veclogical, veclogical, integer,
store, load, *, mtjmpr, mfjmpr,
- *, mfvsr, mtvsr")
+ *, mfvsr, mtvsr, vecperm")
(set_attr "size" "64")
(set_attr "isa"
"*, *, *, p9v, p9v,
p7v, p7v, *, *, *,
*, *, *, *, *,
- *, p8v, p8v")])
+ *, p8v, p8v, p10")
+ (set_attr "prefixed"
+ "*, *, *, *, *,
+ *, *, *, *, *,
+ *, *, *, *, *,
+ *, *, *, yes")])
;; STD LD MR MT<SPR> MF<SPR> G-const
;; H-const F-const Special
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 38eaa36d6d8..0333099cc8a 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -643,3 +643,7 @@ Generate code that will run in privileged state.
mxxspltiw
Target Undocumented Mask(XXSPLTIW) Var(rs6000_isa_flags)
Generate (do not generate) XXSPLTIW instructions.
+
+mxxspltidp
+Target Undocumented Mask(XXSPLTIDP) Var(rs6000_isa_flags)
+Generate (do not generate) XXSPLTIDP instructions.
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 8f2a37d74d5..ce98f78e02a 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -389,7 +389,7 @@
UNSPEC_VDIVES
UNSPEC_VDIVEU
UNSPEC_XXEVAL
- UNSPEC_XXSPLTID
+ UNSPEC_XXSPLTIDP
UNSPEC_XXSPLTI32DX
UNSPEC_XXBLEND
UNSPEC_XXPERMX
@@ -6407,9 +6407,8 @@
;; XXSPLTIDP built-in function support
(define_expand "xxspltidp_v2df"
- [(set (match_operand:V2DF 0 "register_operand" )
- (unspec:V2DF [(match_operand:SF 1 "const_double_operand")]
- UNSPEC_XXSPLTID))]
+ [(use (match_operand:V2DF 0 "register_operand" ))
+ (use (match_operand:SF 1 "const_double_operand"))]
"TARGET_POWER10"
{
long value = rs6000_const_f32_to_i32 (operands[1]);
@@ -6417,15 +6416,6 @@
DONE;
})
-(define_insn "xxspltidp_v2df_inst"
- [(set (match_operand:V2DF 0 "register_operand" "=wa")
- (unspec:V2DF [(match_operand:SI 1 "c32bit_cint_operand" "n")]
- UNSPEC_XXSPLTID))]
- "TARGET_POWER10"
- "xxspltidp %x0,%1"
- [(set_attr "type" "vecsimple")
- (set_attr "prefixed" "yes")])
-
;; XXSPLTI32DX built-in function support
(define_expand "xxsplti32dx_v4si"
[(set (match_operand:V4SI 0 "register_operand" "=wa")
@@ -6671,3 +6661,39 @@
{
operands[2] = CONST_VECTOR_ELT (operands[1], 0);
})
+
+;; Generate the XXSPLTIDP instruction to support SFmode and DFmode scalar
+;; constants and V2DF vector constants where both elements are the same. The
+;; constant has to be expressible as a SFmode constant that is not a SFmode
+;; denormal value.
+(define_mode_iterator XXSPLTIDP [SF DF V2DF])
+
+(define_insn_and_split "*xxspltidp_<mode>_internal1"
+ [(set (match_operand:XXSPLTIDP 0 "vsx_register_operand" "=wa")
+ (match_operand:XXSPLTIDP 1 "xxspltidp_operand"))]
+ "TARGET_XXSPLTIDP"
+ "#"
+ "&& 1"
+ [(set (match_operand:XXSPLTIDP 0 "vsx_register_operand")
+ (unspec:XXSPLTIDP [(match_dup 2)] UNSPEC_XXSPLTIDP))]
+{
+ HOST_WIDE_INT value = 0;
+
+ if (!xxspltidp_constant_p (operands[1], <MODE>mode, &value))
+ gcc_unreachable ();
+
+ operands[2] = GEN_INT (value);
+}
+ [(set_attr "type" "vecperm")
+ (set_attr "prefixed" "yes")])
+
+;; Just in case the user issued -mno-xxspltidp, allow the built-in function
+;; even if the compiler does not automatically generate XXSPLTIDP.
+(define_insn "xxspltidp_<mode>_internal2"
+ [(set (match_operand:XXSPLTIDP 0 "vsx_register_operand" "=wa")
+ (unspec:XXSPLTIDP [(match_operand 1 "const_int_operand" "n")]
+ UNSPEC_XXSPLTIDP))]
+ "TARGET_POWER10"
+ "xxspltidp %x0,%1"
+ [(set_attr "type" "vecperm")
+ (set_attr "prefixed" "yes")])
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
new file mode 100644
index 00000000000..8f6e176f9af
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating DFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+ instruction. */
+
+double
+scalar_double_0 (void)
+{
+ return 0.0; /* XXSPLTIB or XXLXOR. */
+}
+
+double
+scalar_double_1 (void)
+{
+ return 1.0; /* XXSPLTIDP. */
+}
+
+#ifndef __FAST_MATH__
+double
+scalar_double_m0 (void)
+{
+ return -0.0; /* XXSPLTIDP. */
+}
+
+double
+scalar_double_nan (void)
+{
+ return __builtin_nan (""); /* XXSPLTIDP. */
+}
+
+double
+scalar_double_inf (void)
+{
+ return __builtin_inf (); /* XXSPLTIDP. */
+}
+
+double
+scalar_double_m_inf (void) /* XXSPLTIDP. */
+{
+ return - __builtin_inf ();
+}
+#endif
+
+double
+scalar_double_pi (void)
+{
+ return M_PI; /* PLFD. */
+}
+
+double
+scalar_double_denorm (void)
+{
+ return 0x1p-149f; /* PLFD. */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
new file mode 100644
index 00000000000..72504bdfbbd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating SFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+ instruction. */
+
+float
+scalar_float_0 (void)
+{
+ return 0.0f; /* XXSPLTIB or XXLXOR. */
+}
+
+float
+scalar_float_1 (void)
+{
+ return 1.0f; /* XXSPLTIDP. */
+}
+
+#ifndef __FAST_MATH__
+float
+scalar_float_m0 (void)
+{
+ return -0.0f; /* XXSPLTIDP. */
+}
+
+float
+scalar_float_nan (void)
+{
+ return __builtin_nanf (""); /* XXSPLTIDP. */
+}
+
+float
+scalar_float_inf (void)
+{
+ return __builtin_inff (); /* XXSPLTIDP. */
+}
+
+float
+scalar_float_m_inf (void) /* XXSPLTIDP. */
+{
+ return - __builtin_inff ();
+}
+#endif
+
+float
+scalar_float_pi (void)
+{
+ return (float)M_PI; /* XXSPLTIDP. */
+}
+
+float
+scalar_float_denorm (void)
+{
+ return 0x1p-149f; /* PLFS. */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 6 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
new file mode 100644
index 00000000000..d509459292c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
@@ -0,0 +1,64 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating V2DFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+ instruction. */
+
+vector double
+v2df_double_0 (void)
+{
+ return (vector double) { 0.0, 0.0 }; /* XXSPLTIB or XXLXOR. */
+}
+
+vector double
+v2df_double_1 (void)
+{
+ return (vector double) { 1.0, 1.0 }; /* XXSPLTIDP. */
+}
+
+#ifndef __FAST_MATH__
+vector double
+v2df_double_m0 (void)
+{
+ return (vector double) { -0.0, -0.0 }; /* XXSPLTIDP. */
+}
+
+vector double
+v2df_double_nan (void)
+{
+ return (vector double) { __builtin_nan (""),
+ __builtin_nan ("") }; /* XXSPLTIDP. */
+}
+
+vector double
+v2df_double_inf (void)
+{
+ return (vector double) { __builtin_inf (),
+ __builtin_inf () }; /* XXSPLTIDP. */
+}
+
+vector double
+v2df_double_m_inf (void)
+{
+ return (vector double) { - __builtin_inf (),
+ - __builtin_inf () }; /* XXSPLTIDP. */
+}
+#endif
+
+vector double
+v2df_double_pi (void)
+{
+ return (vector double) { M_PI, M_PI }; /* PLFD. */
+}
+
+vector double
+v2df_double_denorm (void)
+{
+ return (vector double) { (double)0x1p-149f,
+ (double)0x1p-149f }; /* PLFD. */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
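
For readers who want to experiment with the acceptance rules outside the compiler, here is a minimal host-side sketch of the checks that xxspltidp_constant_p performs on a scalar value. It is not part of the patch. The function name xxspltidp_model is hypothetical, and it uses host float/double arithmetic in place of GCC's real_value routines (exact_real_truncate, REAL_VALUE_TO_TARGET_SINGLE), so it assumes an IEEE 754 host. NaN payload handling is simplified: any NaN is accepted here, which may differ from what the compiler does for unusual payloads.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <float.h>
#include <math.h>

/* Hypothetical model of the scalar checks in xxspltidp_constant_p.
   Returns true and stores the 32-bit immediate in *IMM if D could be
   loaded with XXSPLTIDP under the rules described in the patch.  */
static bool
xxspltidp_model (double d, uint32_t *imm)
{
  *imm = 0;

  /* +0.0 is cheaper to create without XXSPLTIDP.  -0.0 is still accepted,
     matching the pointer comparison against CONST0_RTX in the patch.  */
  if (d == 0.0 && !signbit (d))
    return false;

  /* Finite values outside the float range cannot be represented; reject
     them before converting so the cast below is always well defined.  */
  if ((d > FLT_MAX || d < -FLT_MAX) && !isinf (d))
    return false;

  /* The value must fit exactly in single precision (NaNs are let through;
     this is the simplification noted above).  */
  float f = (float) d;
  if (!isnan (d) && (double) f != d)
    return false;

  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);

  /* Reject SFmode denormals: exponent field is 0, mantissa is non-zero.  */
  if ((bits & 0x7F800000u) == 0 && (bits & 0x007FFFFFu) != 0)
    return false;

  *imm = bits;
  return true;
}
```

Under these assumptions the model agrees with the comments in the three new tests: 1.0, -0.0, the infinities and (float)M_PI are accepted, while 0.0, M_PI as a double, and 0x1p-149 are rejected.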