public inbox for gcc-cvs@sourceware.org
* [gcc(refs/users/meissner/heads/work070)] Generate XXSPLTIDP on power10.
@ 2021-10-05 21:36 Michael Meissner
From: Michael Meissner @ 2021-10-05 21:36 UTC (permalink / raw)
To: gcc-cvs
https://gcc.gnu.org/g:13f477e7d140c47817fe8c9518ef22bed63a0359
commit 13f477e7d140c47817fe8c9518ef22bed63a0359
Author: Michael Meissner <meissner@linux.ibm.com>
Date: Tue Oct 5 17:36:22 2021 -0400
Generate XXSPLTIDP on power10.
This patch implements XXSPLTIDP support for SF, DF, and DI scalar constants and
V2DF and V2DI vector constants. The XXSPLTIDP instruction is given a 32-bit
immediate that is converted to a vector of two DFmode constants. The immediate
is in SFmode format, so only constants that fit as SFmode values can be loaded
with XXSPLTIDP.
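
The representability rule above can be sketched in plain C. This is only an illustration under stated assumptions, not code from the patch: the helper name xxspltidp_imm_for_double is hypothetical, the host is assumed to use IEEE 754 float and double, and NaN handling depends on the host preserving the NaN payload across conversions. It mirrors the checks described here (exact conversion to single precision, no single-precision subnormals, +0.0 left to cheaper instructions).

```c
#include <float.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of the patch): return true and store the
   32-bit immediate in *imm if VALUE could be loaded with XXSPLTIDP under
   the rules described in the commit message.  Assumes IEEE 754 float and
   double on the host.  */
bool
xxspltidp_imm_for_double (double value, uint32_t *imm)
{
  uint64_t dbits;
  memcpy (&dbits, &value, sizeof dbits);

  /* +0.0 is left to cheaper instructions such as XXLXOR.  */
  if (dbits == 0)
    return false;

  /* Finite values outside the float range cannot be represented exactly;
     reject them before converting.  */
  bool is_inf = (dbits & UINT64_C (0x7fffffffffffffff))
                == UINT64_C (0x7ff0000000000000);
  if (!is_inf && (value > FLT_MAX || value < -FLT_MAX))
    return false;

  /* The value must survive a round trip through float unchanged.  Comparing
     bit patterns keeps -0.0 distinct from +0.0.  */
  float f = (float) value;
  double back = (double) f;
  uint64_t bbits;
  memcpy (&bbits, &back, sizeof bbits);
  if (bbits != dbits)
    return false;

  uint32_t fbits;
  memcpy (&fbits, &f, sizeof fbits);

  /* Reject single-precision subnormals: exponent field zero, mantissa
     field non-zero.  */
  if (((fbits >> 23) & 0xff) == 0 && (fbits & 0x7fffff) != 0)
    return false;

  *imm = fbits;
  return true;
}
```

Under these assumptions, 1.0 yields the immediate 0x3f800000, while M_PI is rejected because it is not exact in single precision.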
I added two new constraints (eF and eV) to match scalar and vector constants
that can be loaded with the XXSPLTIDP instruction.
I have added a temporary switch (-mxxspltidp) to control whether or not the
XXSPLTIDP instruction is generated.
I added five new tests that cover loading SF/DF/DI scalar constants and
V2DI/V2DF vector constants.
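
The DI and V2DI tests rely on integer constants whose bit pattern matches a loadable DFmode value. A minimal sketch of that classification follows. The function name di_bits_loadable_with_xxspltidp is hypothetical, the host is assumed to use IEEE 754 formats, and the sketch is an approximation of the predicate's intent rather than the code in the patch.

```c
#include <float.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of the patch): decide whether the 64-bit
   integer pattern BITS, viewed as an IEEE 754 double, is a candidate for
   XXSPLTIDP.  */
bool
di_bits_loadable_with_xxspltidp (uint64_t bits)
{
  /* Zero is easy to create without XXSPLTIDP.  */
  if (bits == 0)
    return false;

  /* IEEE 754 double: 1 sign bit, 11 exponent bits, 52 mantissa bits.  */
  unsigned exponent = (unsigned) ((bits >> 52) & 0x7ff);
  uint64_t mantissa = bits & UINT64_C (0xfffffffffffff);

  /* Integer patterns that look like a double NaN are rejected.  */
  if (exponent == 0x7ff && mantissa != 0)
    return false;

  /* Patterns that look like a double subnormal are rejected.  */
  if (exponent == 0 && mantissa != 0)
    return false;

  double d;
  memcpy (&d, &bits, sizeof d);

  /* Finite values outside the float range cannot be exact.  */
  if (exponent != 0x7ff && (d > FLT_MAX || d < -FLT_MAX))
    return false;

  /* The value must be exactly representable in single precision.  */
  float f = (float) d;
  if ((double) f != d)
    return false;

  /* Single-precision subnormals are rejected as well.  */
  uint32_t fbits;
  memcpy (&fbits, &f, sizeof fbits);
  if (((fbits >> 23) & 0xff) == 0 && (fbits & 0x7fffff) != 0)
    return false;

  return true;
}
```

With this sketch, 0x8000000000000000 (-0.0) and 0x3ff0000000000000 (1.0) are accepted, while the integer 1 (a double subnormal pattern) and 0x400921fb54442d18 (pi) are rejected, which matches the expectations stated in the DI tests.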
2021-10-05 Michael Meissner <meissner@linux.ibm.com>
gcc/
* config/rs6000/constraints.md (eF): New constraint.
(eV): New constraint.
* config/rs6000/predicates.md (easy_fp_constant): If we can load
the scalar constant with XXSPLTIDP, the constant is easy.
(easy_fp_constant_64bit_scalar): New predicate.
(easy_vector_constant_64bit_element): New predicate.
(easy_vector_constant): If we can generate XXSPLTIDP, mark the
vector constant as easy.
* config/rs6000/rs6000-protos.h (xxspltidp_constant_immediate): New
declaration.
(prefixed_xxsplti_p): Likewise.
* config/rs6000/rs6000.c (xxspltidp_constant_immediate): New function.
(output_vec_const_move): Add support for XXSPLTIDP.
(prefixed_xxsplti_p): New function.
* config/rs6000/rs6000.md (prefixed attribute): Add support for the
xxsplti* prefixed instructions.
(movsf_hardfloat): Add XXSPLTIDP support.
(mov<mode>_hardfloat32, FMOVE64 iterator): Likewise.
(mov<mode>_hardfloat64, FMOVE64 iterator): Likewise.
(movdi_internal32): Likewise.
(movdi_internal64): Likewise.
* config/rs6000/rs6000.opt (-mxxspltidp): New switch.
* config/rs6000/vsx.md (vsx_move<mode>_64bit): Add XXSPLTIDP
support.
(vsx_move<mode>_32bit): Likewise.
(XXSPLTIDP_S): New mode iterator.
(XXSPLTIDP_V): Likewise.
(XXSPLTIDP): Likewise.
(xxspltidp_<mode>_inst): Replace xxspltidp_v2df_inst with an
iterated form that also does SFmode, DFmode, DImode, and
V2DImode.
(xxspltidp_<mode>_internal): New insn and splits.
* doc/md.texi (PowerPC and IBM RS6000 constraints): Document the
eF and eV constraints.
gcc/testsuite/
* gcc.target/powerpc/pr86731-fwrapv-longlong.c: Update insn
regex for power10.
* gcc.target/powerpc/vec-splat-constant-df.c: New test.
* gcc.target/powerpc/vec-splat-constant-di.c: New test.
* gcc.target/powerpc/vec-splat-constant-sf.c: New test.
* gcc.target/powerpc/vec-splat-constant-v2df.c: New test.
* gcc.target/powerpc/vec-splat-constant-v2di.c: New test.
Diff:
---
gcc/config/rs6000/constraints.md | 10 ++
gcc/config/rs6000/predicates.md | 140 +++++++++++++++++++++
gcc/config/rs6000/rs6000-protos.h | 2 +
gcc/config/rs6000/rs6000.c | 96 ++++++++++++++
gcc/config/rs6000/rs6000.md | 58 ++++++---
gcc/config/rs6000/rs6000.opt | 4 +
gcc/config/rs6000/vsx.md | 60 ++++++++-
gcc/doc/md.texi | 6 +
.../gcc.target/powerpc/pr86731-fwrapv-longlong.c | 9 +-
.../gcc.target/powerpc/vec-splat-constant-df.c | 60 +++++++++
.../gcc.target/powerpc/vec-splat-constant-di.c | 70 +++++++++++
.../gcc.target/powerpc/vec-splat-constant-sf.c | 60 +++++++++
.../gcc.target/powerpc/vec-splat-constant-v2df.c | 64 ++++++++++
.../gcc.target/powerpc/vec-splat-constant-v2di.c | 50 ++++++++
14 files changed, 663 insertions(+), 26 deletions(-)
diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index c8cff1a3038..1ff46c9f4fc 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -208,11 +208,21 @@
(and (match_code "const_int")
(match_test "((- (unsigned HOST_WIDE_INT) ival) + 0x8000) < 0x10000")))
+;; DI/SF/DF scalar constant that can be loaded with the XXSPLTIDP instruction.
+(define_constraint "eF"
+ "A 64-bit scalar constant that can be loaded with the XXSPLTIDP instruction."
+ (match_operand 0 "easy_fp_constant_64bit_scalar"))
+
;; 34-bit signed integer constant
(define_constraint "eI"
"A signed 34-bit integer constant if prefixed instructions are supported."
(match_operand 0 "cint34_operand"))
+;; V2DI/V2DF vector constant that can be loaded with the XXSPLTIDP instruction.
+(define_constraint "eV"
+ "A 128-bit vector constant that can be loaded with the XXSPLTIDP instruction."
+ (match_operand 0 "easy_vector_constant_64bit_element"))
+
;; Floating-point constraints. These two are defined so that insn
;; length attributes can be calculated exactly.
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 956e42bc514..7544ac87700 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -601,6 +601,11 @@
if (TARGET_VSX && op == CONST0_RTX (mode))
return 1;
+ /* If we have the ISA 3.1 XXSPLTIDP instruction, see if the constant can
+ be loaded with that instruction. */
+ if (easy_fp_constant_64bit_scalar (op, mode))
+ return 1;
+
/* Otherwise consider floating point constants hard, so that the
constant gets pushed to memory during the early RTL phases. This
has the advantage that double precision constants that can be
@@ -609,6 +614,138 @@
return 0;
})
+;; Return 1 if the operand is a 64-bit scalar constant that can be loaded via
+;; the XXSPLTIDP instruction, which takes a SFmode value and produces a V2DF or
+;; V2DI mode result that is interpreted as a 64-bit scalar.
+(define_predicate "easy_fp_constant_64bit_scalar"
+ (match_code "const_int,const_double")
+{
+ const REAL_VALUE_TYPE *rv;
+ REAL_VALUE_TYPE rv_type;
+
+ /* Can we do the XXSPLTIDP instruction? */
+ if (!TARGET_XXSPLTIDP || !TARGET_PREFIXED || !TARGET_VSX)
+ return false;
+
+ if (mode == VOIDmode)
+ mode = GET_MODE (op);
+
+ /* Don't return true for 0.0 or 0 since that is easy to create without
+ XXSPLTIDP. */
+ if (op == CONST0_RTX (mode))
+ return false;
+
+ /* Handle DImode by creating a DF value from it. */
+ if (CONST_INT_P (op) && (mode == DImode || mode == VOIDmode))
+ {
+ HOST_WIDE_INT df_value = INTVAL (op);
+
+ /* Avoid values that look like DFmode NaN's. The IEEE 754 64-bit
+ floating format has 1 bit for sign, 11 bits for the exponent,
+ and 52 bits for the mantissa. NaN values have the exponent set
+ to all 1 bits, and the mantissa non-zero (mantissa == 0 is
+ infinity). */
+ int df_exponent = (df_value >> 52) & 0x7ff;
+ HOST_WIDE_INT df_mantissa = df_value & HOST_WIDE_INT_C (0xfffffffffffff);
+
+ if (df_exponent == 0x7ff && df_mantissa != 0) /* NaN. */
+ return false;
+
+ /* Avoid values that are DFmode subnormal values. Subnormal numbers
+ have the exponent all 0 bits, and the mantissa non-zero. If the
+ value is subnormal, then the hidden bit in the mantissa is not
+ set. */
+ if (df_exponent == 0 && df_mantissa != 0) /* subnormal. */
+ return false;
+
+ long df_words[2];
+ df_words[0] = (df_value >> 32) & 0xffffffff;
+ df_words[1] = df_value & 0xffffffff;
+
+ /* real_from_target takes the target words in target order. */
+ if (!BYTES_BIG_ENDIAN)
+ std::swap (df_words[0], df_words[1]);
+
+ real_from_target (&rv_type, df_words, DFmode);
+ rv = &rv_type;
+ }
+
+ /* Handle SFmode/DFmode constants. Don't allow decimal or IEEE 128-bit
+ binary constants. */
+ else if (CONST_DOUBLE_P (op) && (mode == SFmode || mode == DFmode))
+ rv = CONST_DOUBLE_REAL_VALUE (op);
+
+ /* We can't handle anything else with the XXSPLTIDP instruction. */
+ else
+ return false;
+
+ /* Validate that the number can be stored as a SFmode value. */
+ if (!exact_real_truncate (SFmode, rv))
+ return false;
+
+ /* Validate that the number is not a SFmode subnormal value (exponent is 0,
+ mantissa field is non-zero) which is undefined for the XXSPLTIDP
+ instruction. */
+ long sf_value;
+ real_to_target (&sf_value, rv, SFmode);
+
+ /* IEEE 754 32-bit values have 1 bit for the sign, 8 bits for the exponent,
+ and 23 bits for the mantissa. Subnormal numbers have the exponent all
+ 0 bits, and the mantissa non-zero. */
+ long sf_exponent = (sf_value >> 23) & 0xFF;
+ long sf_mantissa = sf_value & 0x7FFFFF;
+
+ if (sf_exponent == 0 && sf_mantissa != 0)
+ return false;
+
+ return true;
+})
+
+;; Return 1 if the operand is a vector constant with 64-bit elements that can
+;; be loaded via the XXSPLTIDP instruction, which takes a SFmode value and
+;; produces a V2DFmode or V2DImode result.
+;;
+;; We cannot combine the scalar and vector cases, because that would be
+;; problematic when an appropriate integer constant is assigned to a TImode
+;; value. For example:
+;;
+;; (set (reg:TI 32)
+;; (const_int 0x8000000000000000))
+;;
+;; If the cases were combined, the constant would be splatted into both 64-bit
+;; positions in the vector register, instead of being loaded with the upper
+;; 64 bits zero and the constant in the lower 64 bits.
+
+(define_predicate "easy_vector_constant_64bit_element"
+ (match_code "const_vector,vec_duplicate")
+{
+ /* Can we do the XXSPLTIDP instruction? */
+ if (!TARGET_XXSPLTIDP || !TARGET_PREFIXED || !TARGET_VSX)
+ return false;
+
+ if (mode == VOIDmode)
+ mode = GET_MODE (op);
+
+ if (mode != V2DFmode && mode != V2DImode)
+ return false;
+
+ if (CONST_VECTOR_P (op))
+ {
+ if (!CONST_VECTOR_DUPLICATE_P (op))
+ return false;
+
+ op = CONST_VECTOR_ELT (op, 0);
+ }
+
+ else if (GET_CODE (op) == VEC_DUPLICATE)
+ op = XEXP (op, 0);
+
+ else
+ return false;
+
+ return easy_fp_constant_64bit_scalar (op, GET_MODE_INNER (mode));
+})
+
;; Return 1 if the operand is a constant that can loaded with a XXSPLTIB
;; instruction and then a VUPKHSB, VECSB2W or VECSB2D instruction.
@@ -653,6 +790,9 @@
if (zero_constant (op, mode) || all_ones_constant (op, mode))
return true;
+ if (easy_vector_constant_64bit_element (op, mode))
+ return true;
+
if (TARGET_P9_VECTOR
&& xxspltib_constant_p (op, mode, &num_insns, &value))
return true;
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 14f6b313105..e9be9c4d99f 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -32,6 +32,7 @@ extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, int, int, int,
extern int easy_altivec_constant (rtx, machine_mode);
extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *);
+extern long xxspltidp_constant_immediate (rtx, machine_mode);
extern int vspltis_shifted (rtx);
extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int);
extern bool macho_lo_sum_memory_operand (rtx, machine_mode);
@@ -198,6 +199,7 @@ enum non_prefixed_form reg_to_non_prefixed (rtx reg, machine_mode mode);
extern bool prefixed_load_p (rtx_insn *);
extern bool prefixed_store_p (rtx_insn *);
extern bool prefixed_paddi_p (rtx_insn *);
+extern bool prefixed_xxsplti_p (rtx_insn *);
extern void rs6000_asm_output_opcode (FILE *);
extern void output_pcrel_opt_reloc (rtx);
extern void rs6000_final_prescan_insn (rtx_insn *, rtx [], int);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index ad860728169..83d243269e3 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -6946,6 +6946,60 @@ xxspltib_constant_p (rtx op,
return true;
}
+/* Return the immediate value used in the XXSPLTIDP instruction. */
+
+long
+xxspltidp_constant_immediate (rtx op, machine_mode mode)
+{
+ long ret;
+
+ /* Handle vectors. */
+ if (CONST_VECTOR_P (op))
+ {
+ op = CONST_VECTOR_ELT (op, 0);
+ mode = GET_MODE_INNER (mode);
+ }
+
+ else if (GET_CODE (op) == VEC_DUPLICATE)
+ {
+ op = XEXP (op, 0);
+ mode = GET_MODE (op);
+ }
+
+ gcc_assert (easy_fp_constant_64bit_scalar (op, mode));
+
+ /* Handle DImode/V2DImode by creating a DF value from it and then converting
+ the DFmode value to SFmode. */
+ if (CONST_INT_P (op))
+ {
+ HOST_WIDE_INT df_value = INTVAL (op);
+ long df_words[2];
+
+ df_words[0] = (df_value >> 32) & 0xffffffff;
+ df_words[1] = df_value & 0xffffffff;
+
+ /* real_from_target takes input in target endian order. */
+ if (!BYTES_BIG_ENDIAN)
+ std::swap (df_words[0], df_words[1]);
+
+ REAL_VALUE_TYPE r;
+ real_from_target (&r, &df_words[0], DFmode);
+ real_to_target (&ret, &r, SFmode);
+ }
+
+ /* For floating point constants, convert to SFmode. */
+ else if (CONST_DOUBLE_P (op) && (mode == SFmode || mode == DFmode))
+ {
+ const REAL_VALUE_TYPE *rv = CONST_DOUBLE_REAL_VALUE (op);
+ real_to_target (&ret, rv, SFmode);
+ }
+
+ else
+ gcc_unreachable ();
+
+ return ret;
+}
+
const char *
output_vec_const_move (rtx *operands)
{
@@ -6990,6 +7044,13 @@ output_vec_const_move (rtx *operands)
gcc_unreachable ();
}
+ if (easy_fp_constant_64bit_scalar (vec, mode)
+ || easy_vector_constant_64bit_element (vec, mode))
+ {
+ operands[2] = GEN_INT (xxspltidp_constant_immediate (vec, mode));
+ return "xxspltidp %x0,%2";
+ }
+
if (TARGET_P9_VECTOR
&& xxspltib_constant_p (vec, mode, &num_insns, &xxspltib_value))
{
@@ -26724,6 +26785,41 @@ prefixed_paddi_p (rtx_insn *insn)
return (iform == INSN_FORM_PCREL_EXTERNAL || iform == INSN_FORM_PCREL_LOCAL);
}
+/* Whether a permute type instruction is a prefixed XXSPLTI* instruction.
+ This is called from the prefixed attribute processing. */
+
+bool
+prefixed_xxsplti_p (rtx_insn *insn)
+{
+ rtx set = single_set (insn);
+ if (!set)
+ return false;
+
+ rtx dest = SET_DEST (set);
+ rtx src = SET_SRC (set);
+ machine_mode mode = GET_MODE (dest);
+
+ if (!REG_P (dest) && !SUBREG_P (dest))
+ return false;
+
+ switch (mode)
+ {
+ case E_DImode:
+ case E_DFmode:
+ case E_SFmode:
+ return easy_fp_constant_64bit_scalar (src, mode);
+
+ case E_V2DImode:
+ case E_V2DFmode:
+ return easy_vector_constant_64bit_element (src, mode);
+
+ default:
+ break;
+ }
+
+ return false;
+}
+
/* Whether the next instruction needs a 'p' prefix issued before the
instruction is printed out. */
static bool prepend_p_to_next_insn;
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 6bec2bddbde..8afc4b2756d 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -314,6 +314,11 @@
(eq_attr "type" "integer,add")
(if_then_else (match_test "prefixed_paddi_p (insn)")
+ (const_string "yes")
+ (const_string "no"))
+
+ (eq_attr "type" "vecperm")
+ (if_then_else (match_test "prefixed_xxsplti_p (insn)")
(const_string "yes")
(const_string "no"))]
@@ -7759,17 +7764,17 @@
;;
;; LWZ LFS LXSSP LXSSPX STFS STXSSP
;; STXSSPX STW XXLXOR LI FMR XSCPSGNDP
-;; MR MT<x> MF<x> NOP
+;; MR MT<x> MF<x> NOP XXSPLTIDP
(define_insn "movsf_hardfloat"
[(set (match_operand:SF 0 "nonimmediate_operand"
"=!r, f, v, wa, m, wY,
Z, m, wa, !r, f, wa,
- !r, *c*l, !r, *h")
+ !r, *c*l, !r, *h, wa")
(match_operand:SF 1 "input_operand"
"m, m, wY, Z, f, v,
wa, r, j, j, f, wa,
- r, r, *h, 0"))]
+ r, r, *h, 0, eF"))]
"(register_operand (operands[0], SFmode)
|| register_operand (operands[1], SFmode))
&& TARGET_HARD_FLOAT
@@ -7791,15 +7796,16 @@
mr %0,%1
mt%0 %1
mf%1 %0
- nop"
+ nop
+ #"
[(set_attr "type"
"load, fpload, fpload, fpload, fpstore, fpstore,
fpstore, store, veclogical, integer, fpsimple, fpsimple,
- *, mtjmpr, mfjmpr, *")
+ *, mtjmpr, mfjmpr, *, vecperm")
(set_attr "isa"
"*, *, p9v, p8v, *, p9v,
p8v, *, *, *, *, *,
- *, *, *, *")])
+ *, *, *, *, p10")])
;; LWZ LFIWZX STW STFIWX MTVSRWZ MFVSRWZ
;; FMR MR MT%0 MF%1 NOP
@@ -8059,18 +8065,18 @@
;; STFD LFD FMR LXSD STXSD
;; LXSD STXSD XXLOR XXLXOR GPR<-0
-;; LWZ STW MR
+;; LWZ STW MR XXSPLTIDP
(define_insn "*mov<mode>_hardfloat32"
[(set (match_operand:FMOVE64 0 "nonimmediate_operand"
"=m, d, d, <f64_p9>, wY,
<f64_av>, Z, <f64_vsx>, <f64_vsx>, !r,
- Y, r, !r")
+ Y, r, !r, wa")
(match_operand:FMOVE64 1 "input_operand"
"d, m, d, wY, <f64_p9>,
Z, <f64_av>, <f64_vsx>, <zero_fp>, <zero_fp>,
- r, Y, r"))]
+ r, Y, r, eF"))]
"! TARGET_POWERPC64 && TARGET_HARD_FLOAT
&& (gpc_reg_operand (operands[0], <MODE>mode)
|| gpc_reg_operand (operands[1], <MODE>mode))"
@@ -8087,20 +8093,21 @@
#
#
#
+ #
#"
[(set_attr "type"
"fpstore, fpload, fpsimple, fpload, fpstore,
fpload, fpstore, veclogical, veclogical, two,
- store, load, two")
+ store, load, two, vecperm")
(set_attr "size" "64")
(set_attr "length"
"*, *, *, *, *,
*, *, *, *, 8,
- 8, 8, 8")
+ 8, 8, 8, *")
(set_attr "isa"
"*, *, *, p9v, p9v,
p7v, p7v, *, *, *,
- *, *, *")])
+ *, *, *, p10")])
;; STW LWZ MR G-const H-const F-const
@@ -8127,19 +8134,19 @@
;; STFD LFD FMR LXSD STXSD
;; LXSDX STXSDX XXLOR XXLXOR LI 0
;; STD LD MR MT{CTR,LR} MF{CTR,LR}
-;; NOP MFVSRD MTVSRD
+;; NOP MFVSRD MTVSRD XXSPLTIDP
(define_insn "*mov<mode>_hardfloat64"
[(set (match_operand:FMOVE64 0 "nonimmediate_operand"
"=m, d, d, <f64_p9>, wY,
<f64_av>, Z, <f64_vsx>, <f64_vsx>, !r,
YZ, r, !r, *c*l, !r,
- *h, r, <f64_dm>")
+ *h, r, <f64_dm>, wa")
(match_operand:FMOVE64 1 "input_operand"
"d, m, d, wY, <f64_p9>,
Z, <f64_av>, <f64_vsx>, <zero_fp>, <zero_fp>,
r, YZ, r, r, *h,
- 0, <f64_dm>, r"))]
+ 0, <f64_dm>, r, eF"))]
"TARGET_POWERPC64 && TARGET_HARD_FLOAT
&& (gpc_reg_operand (operands[0], <MODE>mode)
|| gpc_reg_operand (operands[1], <MODE>mode))"
@@ -8161,18 +8168,19 @@
mf%1 %0
nop
mfvsrd %0,%x1
- mtvsrd %x0,%1"
+ mtvsrd %x0,%1
+ #"
[(set_attr "type"
"fpstore, fpload, fpsimple, fpload, fpstore,
fpload, fpstore, veclogical, veclogical, integer,
store, load, *, mtjmpr, mfjmpr,
- *, mfvsr, mtvsr")
+ *, mfvsr, mtvsr, vecperm")
(set_attr "size" "64")
(set_attr "isa"
"*, *, *, p9v, p9v,
p7v, p7v, *, *, *,
*, *, *, *, *,
- *, p8v, p8v")])
+ *, p8v, p8v, p10")])
;; STD LD MR MT<SPR> MF<SPR> G-const
;; H-const F-const Special
@@ -9220,6 +9228,7 @@
;; a gpr into a fpr instead of reloading an invalid 'Y' address
;; GPR store GPR load GPR move FPR store FPR load FPR move
+;; XXSPLTIDP
;; GPR const AVX store AVX store AVX load AVX load VSX move
;; P9 0 P9 -1 AVX 0/-1 VSX 0 VSX -1 P9 const
;; AVX const
@@ -9227,11 +9236,13 @@
(define_insn "*movdi_internal32"
[(set (match_operand:DI 0 "nonimmediate_operand"
"=Y, r, r, m, ^d, ^d,
+ ^wa,
r, wY, Z, ^v, $v, ^wa,
wa, wa, v, wa, *i, v,
v")
(match_operand:DI 1 "input_operand"
"r, Y, r, ^d, m, ^d,
+ eF,
IJKnF, ^v, $v, wY, Z, ^wa,
Oj, wM, OjwM, Oj, wM, wS,
wB"))]
@@ -9246,6 +9257,7 @@
lfd%U1%X1 %0,%1
fmr %0,%1
#
+ #
stxsd %1,%0
stxsdx %x1,%y0
lxsd %0,%1
@@ -9260,17 +9272,20 @@
#"
[(set_attr "type"
"store, load, *, fpstore, fpload, fpsimple,
+ vecperm,
*, fpstore, fpstore, fpload, fpload, veclogical,
vecsimple, vecsimple, vecsimple, veclogical,veclogical,vecsimple,
vecsimple")
(set_attr "size" "64")
(set_attr "length"
"8, 8, 8, *, *, *,
+ *,
16, *, *, *, *, *,
*, *, *, *, *, 8,
*")
(set_attr "isa"
"*, *, *, *, *, *,
+ p10,
*, p9v, p7v, p9v, p7v, *,
p9v, p9v, p7v, *, *, p7v,
p7v")])
@@ -9306,6 +9321,7 @@
})
;; GPR store GPR load GPR move
+;; XXSPLTIDP
;; GPR li GPR lis GPR pli GPR #
;; FPR store FPR load FPR move
;; AVX store AVX store AVX load AVX load VSX move
@@ -9316,6 +9332,7 @@
(define_insn "*movdi_internal64"
[(set (match_operand:DI 0 "nonimmediate_operand"
"=YZ, r, r,
+ ^wa,
r, r, r, r,
m, ^d, ^d,
wY, Z, $v, $v, ^wa,
@@ -9325,6 +9342,7 @@
?r, ?wa")
(match_operand:DI 1 "input_operand"
"r, YZ, r,
+ eF,
I, L, eI, nF,
^d, m, ^d,
^v, $v, wY, Z, ^wa,
@@ -9339,6 +9357,7 @@
std%U0%X0 %1,%0
ld%U1%X1 %0,%1
mr %0,%1
+ #
li %0,%1
lis %0,%v1
li %0,%1
@@ -9365,6 +9384,7 @@
mtvsrd %x0,%1"
[(set_attr "type"
"store, load, *,
+ vecperm,
*, *, *, *,
fpstore, fpload, fpsimple,
fpstore, fpstore, fpload, fpload, veclogical,
@@ -9375,6 +9395,7 @@
(set_attr "size" "64")
(set_attr "length"
"*, *, *,
+ *,
*, *, *, 20,
*, *, *,
*, *, *, *, *,
@@ -9384,6 +9405,7 @@
*, *")
(set_attr "isa"
"*, *, *,
+ p10,
*, *, p10, *,
*, *, *,
p9v, p7v, p9v, p7v, *,
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 9d7878f144a..1d7ce4cc94a 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -640,6 +640,10 @@ mprivileged
Target Var(rs6000_privileged) Init(0)
Generate code that will run in privileged state.
+mxxspltidp
+Target Undocumented Var(TARGET_XXSPLTIDP) Init(1) Save
+Generate (do not generate) XXSPLTIDP instructions.
+
-param=rs6000-density-pct-threshold=
Target Undocumented Joined UInteger Var(rs6000_density_pct_threshold) Init(85) IntegerRange(0, 100) Param
When costing for loop vectorization, we probably need to penalize the loop body
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index bf033e31c1c..fa33c9d9fbf 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -1191,16 +1191,19 @@
;; instruction). But generate XXLXOR/XXLORC if it will avoid a register move.
;; VSX store VSX load VSX move VSX->GPR GPR->VSX LQ (GPR)
+;; XXSPLTIDP
;; STQ (GPR) GPR load GPR store GPR move XXSPLTIB VSPLTISW
;; VSX 0/-1 VMX const GPR const LVX (VMX) STVX (VMX)
(define_insn "vsx_mov<mode>_64bit"
[(set (match_operand:VSX_M 0 "nonimmediate_operand"
"=ZwO, wa, wa, r, we, ?wQ,
+ wa,
?&r, ??r, ??Y, <??r>, wa, v,
?wa, v, <??r>, wZ, v")
(match_operand:VSX_M 1 "input_operand"
"wa, ZwO, wa, we, r, r,
+ eV,
wQ, Y, r, r, wE, jwM,
?jwM, W, <nW>, v, wZ"))]
@@ -1212,36 +1215,44 @@
}
[(set_attr "type"
"vecstore, vecload, vecsimple, mtvsr, mfvsr, load,
+ vecperm,
store, load, store, *, vecsimple, vecsimple,
vecsimple, *, *, vecstore, vecload")
(set_attr "num_insns"
"*, *, *, 2, *, 2,
+ *,
2, 2, 2, 2, *, *,
*, 5, 2, *, *")
(set_attr "max_prefixed_insns"
"*, *, *, *, *, 2,
+ *,
2, 2, 2, 2, *, *,
*, *, *, *, *")
(set_attr "length"
"*, *, *, 8, *, 8,
+ *,
8, 8, 8, 8, *, *,
*, 20, 8, *, *")
(set_attr "isa"
"<VSisa>, <VSisa>, <VSisa>, *, *, *,
+ p10,
*, *, *, *, p9v, *,
<VSisa>, *, *, *, *")])
;; VSX store VSX load VSX move GPR load GPR store GPR move
+;; XXSPLTIDP
;; XXSPLTIB VSPLTISW VSX 0/-1 VMX const GPR const
;; LVX (VMX) STVX (VMX)
(define_insn "*vsx_mov<mode>_32bit"
[(set (match_operand:VSX_M 0 "nonimmediate_operand"
"=ZwO, wa, wa, ??r, ??Y, <??r>,
+ wa,
wa, v, ?wa, v, <??r>,
wZ, v")
(match_operand:VSX_M 1 "input_operand"
"wa, ZwO, wa, Y, r, r,
+ eV,
wE, jwM, ?jwM, W, <nW>,
v, wZ"))]
@@ -1253,14 +1264,17 @@
}
[(set_attr "type"
"vecstore, vecload, vecsimple, load, store, *,
+ vecperm,
vecsimple, vecsimple, vecsimple, *, *,
vecstore, vecload")
(set_attr "length"
"*, *, *, 16, 16, 16,
+ *,
*, *, *, 20, 16,
*, *")
(set_attr "isa"
"<VSisa>, <VSisa>, <VSisa>, *, *, *,
+ p10,
p9v, *, <VSisa>, *, *,
*, *")])
@@ -6449,15 +6463,53 @@
DONE;
})
-(define_insn "xxspltidp_v2df_inst"
- [(set (match_operand:V2DF 0 "register_operand" "=wa")
- (unspec:V2DF [(match_operand:SI 1 "c32bit_cint_operand" "n")]
- UNSPEC_XXSPLTIDP))]
+(define_mode_iterator XXSPLTIDP_S [DI SF DF])
+(define_mode_iterator XXSPLTIDP_V [V2DF V2DI])
+(define_mode_iterator XXSPLTIDP [DI SF DF V2DF V2DI])
+
+(define_insn "xxspltidp_<mode>_inst"
+ [(set (match_operand:XXSPLTIDP 0 "register_operand" "=wa")
+ (unspec:XXSPLTIDP [(match_operand:SI 1 "c32bit_cint_operand" "n")]
+ UNSPEC_XXSPLTIDP))]
"TARGET_POWER10"
"xxspltidp %x0,%1"
[(set_attr "type" "vecperm")
(set_attr "prefixed" "yes")])
+;; Generate the XXSPLTIDP instruction to support SFmode, DFmode, and DImode
+;; scalar constants and V2DF and V2DI vector constants where both elements are
+;; the same. The constant has to be expressible as a SFmode constant that is
+;; not a SFmode denormal value.
+(define_insn_and_split "*xxspltidp_<mode>_internal"
+ [(set (match_operand:XXSPLTIDP_S 0 "vsx_register_operand" "=wa")
+ (match_operand:XXSPLTIDP_S 1 "easy_fp_constant_64bit_scalar" "eF"))]
+ "TARGET_POWER10"
+ "#"
+ "&& 1"
+ [(set (match_dup 0)
+ (unspec:XXSPLTIDP_S [(match_dup 2)] UNSPEC_XXSPLTIDP))]
+{
+ long immediate = xxspltidp_constant_immediate (operands[1], <MODE>mode);
+ operands[2] = GEN_INT (immediate);
+}
+ [(set_attr "type" "vecperm")
+ (set_attr "prefixed" "yes")])
+
+(define_insn_and_split "*xxspltidp_<mode>_internal"
+ [(set (match_operand:XXSPLTIDP_V 0 "vsx_register_operand" "=wa")
+ (match_operand:XXSPLTIDP_V 1 "easy_vector_constant_64bit_element" "eV"))]
+ "TARGET_POWER10"
+ "#"
+ "&& 1"
+ [(set (match_dup 0)
+ (unspec:XXSPLTIDP_V [(match_dup 2)] UNSPEC_XXSPLTIDP))]
+{
+ long immediate = xxspltidp_constant_immediate (operands[1], <MODE>mode);
+ operands[2] = GEN_INT (immediate);
+}
+ [(set_attr "type" "vecperm")
+ (set_attr "prefixed" "yes")])
+
;; XXSPLTI32DX built-in function support
(define_expand "xxsplti32dx_v4si"
[(set (match_operand:V4SI 0 "register_operand" "=wa")
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 2b41cb7fb7b..5035a3fd604 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -3333,9 +3333,15 @@ The integer constant zero.
A constant whose negation is a signed 16-bit constant.
@end ifset
+@item eF
+A 64-bit scalar constant that can be loaded with the XXSPLTIDP instruction.
+
@item eI
A signed 34-bit integer constant if prefixed instructions are supported.
+@item eV
+A 128-bit vector constant that can be loaded with the XXSPLTIDP instruction.
+
@ifset INTERNALS
@item G
A floating point constant that can be loaded into a register with one
diff --git a/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c b/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c
index bd1502bb30a..dcb30e1d886 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c
@@ -24,11 +24,12 @@ vector signed long long splats4(void)
return (vector signed long long) vec_sl(mzero, mzero);
}
-/* Codegen will consist of splat and shift instructions for most types.
- If folding is enabled, the vec_sl tests using vector long long type will
- generate a lvx instead of a vspltisw+vsld pair. */
+/* Codegen will consist of splat and shift instructions for most types. If
+ folding is enabled, the vec_sl tests using vector long long type will
+ generate a lvx instead of a vspltisw+vsld pair. On power10, it will
+ generate a xxspltidp instruction instead of the lvx. */
/* { dg-final { scan-assembler-times {\mvspltis[bhw]\M} 0 } } */
/* { dg-final { scan-assembler-times {\mvsl[bhwd]\M} 0 } } */
-/* { dg-final { scan-assembler-times {\mp?lxv\M|\mlxv\M|\mlxvd2x\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mp?lxv\M|\mlxv\M|\mlxvd2x\M|\mxxspltidp\M} 2 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
new file mode 100644
index 00000000000..8f6e176f9af
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating DFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+ instruction. */
+
+double
+scalar_double_0 (void)
+{
+ return 0.0; /* XXSPLTIB or XXLXOR. */
+}
+
+double
+scalar_double_1 (void)
+{
+ return 1.0; /* XXSPLTIDP. */
+}
+
+#ifndef __FAST_MATH__
+double
+scalar_double_m0 (void)
+{
+ return -0.0; /* XXSPLTIDP. */
+}
+
+double
+scalar_double_nan (void)
+{
+ return __builtin_nan (""); /* XXSPLTIDP. */
+}
+
+double
+scalar_double_inf (void)
+{
+ return __builtin_inf (); /* XXSPLTIDP. */
+}
+
+double
+scalar_double_m_inf (void) /* XXSPLTIDP. */
+{
+ return - __builtin_inf ();
+}
+#endif
+
+double
+scalar_double_pi (void)
+{
+ return M_PI; /* PLFD. */
+}
+
+double
+scalar_double_denorm (void)
+{
+ return 0x1p-149f; /* PLFD. */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-di.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-di.c
new file mode 100644
index 00000000000..75714d0b11d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-di.c
@@ -0,0 +1,70 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+/* Test generating DImode constants that have the same bit pattern as DFmode
+ constants that can be loaded with the XXSPLTIDP instruction with the ISA 3.1
+ (power10). We use asm to force the value into vector registers. */
+
+double
+scalar_0 (void)
+{
+ /* XXSPLTIB or XXLXOR. */
+ double d;
+ long long ll = 0;
+
+ __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll));
+ return d;
+}
+
+double
+scalar_1 (void)
+{
+ /* VSPLTISW/VUPKLSW or XXSPLTIB/VEXTSB2D. */
+ double d;
+ long long ll = 1;
+
+ __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll));
+ return d;
+}
+
+/* 0x8000000000000000LL is the bit pattern for -0.0, which can be generated
+ with XXSPLTIDP. */
+double
+scalar_float_neg_0 (void)
+{
+ /* XXSPLTIDP. */
+ double d;
+ long long ll = 0x8000000000000000LL;
+
+ __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll));
+ return d;
+}
+
+/* 0x3ff0000000000000LL is the bit pattern for 1.0 which can be generated with
+ XXSPLTIDP. */
+double
+scalar_float_1_0 (void)
+{
+ /* XXSPLTIDP. */
+ double d;
+ long long ll = 0x3ff0000000000000LL;
+
+ __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll));
+ return d;
+}
+
+/* 0x400921fb54442d18LL is the bit pattern for PI, which cannot be generated
+ with XXSPLTIDP. */
+double
+scalar_pi (void)
+{
+ /* PLXV. */
+ double d;
+ long long ll = 0x400921fb54442d18LL;
+
+ __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll));
+ return d;
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 2 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
new file mode 100644
index 00000000000..72504bdfbbd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating SFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+ instruction. */
+
+float
+scalar_float_0 (void)
+{
+ return 0.0f; /* XXSPLTIB or XXLXOR. */
+}
+
+float
+scalar_float_1 (void)
+{
+ return 1.0f; /* XXSPLTIDP. */
+}
+
+#ifndef __FAST_MATH__
+float
+scalar_float_m0 (void)
+{
+ return -0.0f; /* XXSPLTIDP. */
+}
+
+float
+scalar_float_nan (void)
+{
+ return __builtin_nanf (""); /* XXSPLTIDP. */
+}
+
+float
+scalar_float_inf (void)
+{
+ return __builtin_inff (); /* XXSPLTIDP. */
+}
+
+float
+scalar_float_m_inf (void) /* XXSPLTIDP. */
+{
+ return - __builtin_inff ();
+}
+#endif
+
+float
+scalar_float_pi (void)
+{
+ return (float)M_PI; /* XXSPLTIDP. */
+}
+
+float
+scalar_float_denorm (void)
+{
+ return 0x1p-149f; /* PLFS. */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 6 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
new file mode 100644
index 00000000000..82ffc86f8aa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
@@ -0,0 +1,64 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating V2DFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+ instruction. */
+
+vector double
+v2df_double_0 (void)
+{
+ return (vector double) { 0.0, 0.0 }; /* XXSPLTIB or XXLXOR. */
+}
+
+vector double
+v2df_double_1 (void)
+{
+ return (vector double) { 1.0, 1.0 }; /* XXSPLTIDP. */
+}
+
+#ifndef __FAST_MATH__
+vector double
+v2df_double_m0 (void)
+{
+ return (vector double) { -0.0, -0.0 }; /* XXSPLTIDP. */
+}
+
+vector double
+v2df_double_nan (void)
+{
+ return (vector double) { __builtin_nan (""),
+ __builtin_nan ("") }; /* XXSPLTIDP. */
+}
+
+vector double
+v2df_double_inf (void)
+{
+ return (vector double) { __builtin_inf (),
+ __builtin_inf () }; /* XXSPLTIDP. */
+}
+
+vector double
+v2df_double_m_inf (void)
+{
+ return (vector double) { - __builtin_inf (),
+ - __builtin_inf () }; /* XXSPLTIDP. */
+}
+#endif
+
+vector double
+v2df_double_pi (void)
+{
+ return (vector double) { M_PI, M_PI }; /* PLXV. */
+}
+
+vector double
+v2df_double_denorm (void)
+{
+ return (vector double) { (double)0x1p-149f,
+ (double)0x1p-149f }; /* PLXV. */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2di.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2di.c
new file mode 100644
index 00000000000..4d44f943d26
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2di.c
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+/* Test generating V2DImode constants that have the same bit pattern as
+ V2DFmode constants that can be loaded with the XXSPLTIDP instruction with
+ the ISA 3.1 (power10). */
+
+vector long long
+vector_0 (void)
+{
+ /* XXSPLTIB or XXLXOR. */
+ return (vector long long) { 0LL, 0LL };
+}
+
+vector long long
+vector_1 (void)
+{
+ /* XXSPLTIB and VEXTSB2D. */
+ return (vector long long) { 1LL, 1LL };
+}
+
+/* 0x8000000000000000LL is the bit pattern for -0.0, which can be generated
+ with XXSPLTIDP. */
+vector long long
+vector_float_neg_0 (void)
+{
+ /* XXSPLTIDP. */
+ return (vector long long) { 0x8000000000000000LL, 0x8000000000000000LL };
+}
+
+/* 0x3ff0000000000000LL is the bit pattern for 1.0 which can be generated with
+ XXSPLTIDP. */
+vector long long
+vector_float_1_0 (void)
+{
+ /* XXSPLTIDP. */
+ return (vector long long) { 0x3ff0000000000000LL, 0x3ff0000000000000LL };
+}
+
+/* 0x400921fb54442d18LL is the bit pattern for PI, which cannot be generated
+ with XXSPLTIDP. */
+vector long long
+scalar_pi (void)
+{
+ /* PLXV. */
+ return (vector long long) { 0x400921fb54442d18LL, 0x400921fb54442d18LL };
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 2 } } */
* [gcc(refs/users/meissner/heads/work070)] Generate XXSPLTIDP on power10.
@ 2021-10-05 21:59 Michael Meissner
0 siblings, 0 replies; 5+ messages in thread
From: Michael Meissner @ 2021-10-05 21:59 UTC (permalink / raw)
To: gcc-cvs
https://gcc.gnu.org/g:332c130e3294e22888d6163ed6904bf32b38cec4
commit 332c130e3294e22888d6163ed6904bf32b38cec4
Author: Michael Meissner <meissner@linux.ibm.com>
Date: Tue Oct 5 17:58:50 2021 -0400
Generate XXSPLTIDP on power10.
This patch implements XXSPLTIDP support for SF, DF, and DI scalar constants and
V2DF and V2DI vector constants. The XXSPLTIDP instruction is given a 32-bit
immediate that is converted to a vector of two DFmode constants. The immediate
is in SFmode format, so only constants that fit as SFmode values can be loaded
with XXSPLTIDP.
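
The "fits as an SFmode value" rule above can be illustrated at the bit level. The following is a minimal sketch in plain C, not code from the patch: the function name xxspltidp_imm_from_bits is hypothetical, and it works on the IEEE 754 binary64 bit pattern instead of GCC's REAL_VALUE_TYPE. It rejects +0.0 (cheaper to create by other means) and anything that would be a binary32 subnormal, and as a simplifying assumption it also rejects NaNs, which the patch itself does handle for SFmode/DFmode constants.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch.  BITS is the IEEE 754 binary64
   bit pattern of a constant.  Return true and store the binary32 immediate
   in *IMM if the constant is -0.0, an infinity, or exactly representable as
   a normal binary32 value.  */
bool
xxspltidp_imm_from_bits (uint64_t bits, uint32_t *imm)
{
  uint32_t sign = (uint32_t) (bits >> 63);
  uint32_t exponent = (uint32_t) (bits >> 52) & 0x7ff;	/* 11 bits.  */
  uint64_t mantissa = bits & UINT64_C (0xfffffffffffff);	/* 52 bits.  */

  /* +0.0 is easy to create without XXSPLTIDP, so do not claim it.  */
  if (bits == 0)
    return false;

  /* -0.0: only the sign bit is set.  */
  if (exponent == 0 && mantissa == 0)
    {
      *imm = sign << 31;
      return true;
    }

  /* binary64 subnormals are far below the binary32 range.  */
  if (exponent == 0)
    return false;

  if (exponent == 0x7ff)
    {
      /* Assumption of this sketch: NaNs are conservatively rejected.  */
      if (mantissa != 0)
	return false;

      *imm = (sign << 31) | 0x7f800000u;	/* +/- infinity.  */
      return true;
    }

  /* binary32 keeps 23 mantissa bits, so the low 29 bits must be zero.  */
  if ((mantissa & 0x1fffffff) != 0)
    return false;

  /* Rebias the exponent (1023 - 127 = 896).  The binary32 exponent field
     must be in 1..254; smaller values would be binary32 subnormals.  */
  if (exponent < 897 || exponent > 1150)
    return false;

  *imm = (sign << 31) | ((exponent - 896) << 23) | (uint32_t) (mantissa >> 29);
  return true;
}
```

With this sketch, 0x3ff0000000000000 (1.0) yields the immediate 0x3f800000 and 0x8000000000000000 (-0.0) yields 0x80000000, while 0x400921fb54442d18 (pi as a double) is rejected because its low mantissa bits are non-zero. These are the same bit patterns used in the new DImode and V2DImode tests.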
I added two new constraints (eF and eV) to match scalar and vector constants
that can be loaded with the XXSPLTIDP instruction.
I have added a temporary switch (-mxxspltidp) to control whether or not the
XXSPLTIDP instruction is generated.
I added 5 new tests to test loading up SF/DF/DI scalar and V2DI/V2DF vector
constants.
2021-10-05 Michael Meissner <meissner@linux.ibm.com>
gcc/
* config/rs6000/constraints.md (eF): New constraint.
(eV): New constraint.
* config/rs6000/predicates.md (easy_fp_constant): If we can load
the scalar constant with XXSPLTIDP, the constant is easy.
(easy_fp_constant_64bit_scalar): New predicate.
(easy_vector_constant_64bit_element): New predicate.
(easy_vector_constant): If we can generate XXSPLTIDP, mark the
vector constant as easy.
* config/rs6000/rs6000-protos.h (xxspltidp_constant_immediate): New
declaration.
(prefixed_xxsplti_p): Likewise.
* config/rs6000/rs6000.c (xxspltidp_constant_immediate): New function.
(output_vec_const_move): Add support for XXSPLTIDP.
(prefixed_xxsplti_p): New function.
* config/rs6000/rs6000.md (prefixed attribute): Add support for the
xxsplti* prefixed instructions.
(movsf_hardfloat): Add XXSPLTIDP support.
(mov<mode>_hardfloat32, FMOVE64 iterator): Likewise.
(mov<mode>_hardfloat64, FMOVE64 iterator): Likewise.
(movdi_internal32): Likewise.
(movdi_internal64): Likewise.
* config/rs6000/rs6000.opt (-mxxspltidp): New switch.
* config/rs6000/vsx.md (vsx_move<mode>_64bit): Add XXSPLTIDP
support.
(vsx_move<mode>_32bit): Likewise.
(XXSPLTIDP_S): New mode iterator.
(XXSPLTIDP_V): Likewise.
(XXSPLTIDP): Likewise.
(xxspltidp_<mode>_inst): Replace xxspltidp_v2df_inst with an
iterated form that also does SFmode, DFmode, DImode, and
V2DImode.
(xxspltidp_<mode>_internal): New insn and splits.
* doc/md.texi (PowerPC and IBM RS6000 constraints): Document the
eF and eV constraints.
gcc/testsuite/
* gcc.target/powerpc/pr86731-fwrapv-longlong.c: Update insn
regex for power10.
* gcc.target/powerpc/vec-splat-constant-df.c: New test.
* gcc.target/powerpc/vec-splat-constant-di.c: New test.
* gcc.target/powerpc/vec-splat-constant-sf.c: New test.
* gcc.target/powerpc/vec-splat-constant-v2df.c: New test.
* gcc.target/powerpc/vec-splat-constant-v2di.c: New test.
Diff:
---
gcc/config/rs6000/constraints.md | 10 ++
gcc/config/rs6000/predicates.md | 140 +++++++++++++++++++++
gcc/config/rs6000/rs6000-protos.h | 2 +
gcc/config/rs6000/rs6000.c | 96 ++++++++++++++
gcc/config/rs6000/rs6000.md | 58 ++++++---
gcc/config/rs6000/rs6000.opt | 4 +
gcc/config/rs6000/vsx.md | 60 ++++++++-
gcc/doc/md.texi | 6 +
.../gcc.target/powerpc/pr86731-fwrapv-longlong.c | 9 +-
.../gcc.target/powerpc/vec-splat-constant-df.c | 60 +++++++++
.../gcc.target/powerpc/vec-splat-constant-di.c | 70 +++++++++++
.../gcc.target/powerpc/vec-splat-constant-sf.c | 60 +++++++++
.../gcc.target/powerpc/vec-splat-constant-v2df.c | 64 ++++++++++
.../gcc.target/powerpc/vec-splat-constant-v2di.c | 50 ++++++++
14 files changed, 663 insertions(+), 26 deletions(-)
diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index c8cff1a3038..1ff46c9f4fc 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -208,11 +208,21 @@
(and (match_code "const_int")
(match_test "((- (unsigned HOST_WIDE_INT) ival) + 0x8000) < 0x10000")))
+;; DI/SF/DF scalar constant that can be loaded with the XXSPLTIDP instruction.
+(define_constraint "eF"
+ "A 64-bit scalar constant that can be loaded with the XXSPLTIDP instruction."
+ (match_operand 0 "easy_fp_constant_64bit_scalar"))
+
;; 34-bit signed integer constant
(define_constraint "eI"
"A signed 34-bit integer constant if prefixed instructions are supported."
(match_operand 0 "cint34_operand"))
+;; V2DI/V2DF vector constant that can be loaded with the XXSPLTIDP instruction.
+(define_constraint "eV"
+ "A 128-bit vector constant that can be loaded with the XXSPLTIDP instruction."
+ (match_operand 0 "easy_vector_constant_64bit_element"))
+
;; Floating-point constraints. These two are defined so that insn
;; length attributes can be calculated exactly.
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 956e42bc514..7544ac87700 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -601,6 +601,11 @@
if (TARGET_VSX && op == CONST0_RTX (mode))
return 1;
+ /* If we have the ISA 3.1 XXSPLTIDP instruction, see if the constant can
+ be loaded with that instruction. */
+ if (easy_fp_constant_64bit_scalar (op, mode))
+ return 1;
+
/* Otherwise consider floating point constants hard, so that the
constant gets pushed to memory during the early RTL phases. This
has the advantage that double precision constants that can be
@@ -609,6 +614,138 @@
return 0;
})
+;; Return 1 if the operand is a 64-bit scalar constant that can be loaded via
+;; the XXSPLTIDP instruction, which takes a SFmode value and produces a V2DF or
+;; V2DI mode result that is interpreted as a 64-bit scalar.
+(define_predicate "easy_fp_constant_64bit_scalar"
+ (match_code "const_int,const_double")
+{
+ const REAL_VALUE_TYPE *rv;
+ REAL_VALUE_TYPE rv_type;
+
+ /* Can we do the XXSPLTIDP instruction? */
+ if (!TARGET_XXSPLTIDP || !TARGET_PREFIXED || !TARGET_VSX)
+ return false;
+
+ if (mode == VOIDmode)
+ mode = GET_MODE (op);
+
+ /* Don't return true for 0.0 or 0 since that is easy to create without
+ XXSPLTIDP. */
+ if (op == CONST0_RTX (mode))
+ return false;
+
+ /* Handle DImode by creating a DF value from it. */
+ if (CONST_INT_P (op) && (mode == DImode || mode == VOIDmode))
+ {
+ HOST_WIDE_INT df_value = INTVAL (op);
+
+ /* Avoid values that look like DFmode NaN's. The IEEE 754 64-bit
+ floating format has 1 bit for sign, 11 bits for the exponent,
+ and 52 bits for the mantissa. NaN values have the exponent set
+ to all 1 bits, and the mantissa non-zero (mantissa == 0 is
+ infinity). */
+ int df_exponent = (df_value >> 52) & 0x7ff;
+ HOST_WIDE_INT df_mantissa = df_value & HOST_WIDE_INT_C (0xfffffffffffff);
+
+ if (df_exponent == 0x7ff && df_mantissa != 0) /* NaN. */
+ return false;
+
+ /* Avoid values that are DFmode subnormal values. Subnormal numbers
+ have the exponent all 0 bits, and the mantissa non-zero. If the
+ value is subnormal, then the hidden bit in the mantissa is not
+ set. */
+ if (df_exponent == 0 && df_mantissa != 0) /* subnormal. */
+ return false;
+
+ long df_words[2];
+ df_words[0] = (df_value >> 32) & 0xffffffff;
+ df_words[1] = df_value & 0xffffffff;
+
+ /* real_from_target takes the target words in target order. */
+ if (!BYTES_BIG_ENDIAN)
+ std::swap (df_words[0], df_words[1]);
+
+ real_from_target (&rv_type, df_words, DFmode);
+ rv = &rv_type;
+ }
+
+ /* Handle SFmode/DFmode constants. Don't allow decimal or IEEE 128-bit
+ binary constants. */
+ else if (CONST_DOUBLE_P (op) && (mode == SFmode || mode == DFmode))
+ rv = CONST_DOUBLE_REAL_VALUE (op);
+
+ /* We can't handle anything else with the XXSPLTIDP instruction. */
+ else
+ return false;
+
+ /* Validate that the number can be stored as a SFmode value. */
+ if (!exact_real_truncate (SFmode, rv))
+ return false;
+
+ /* Validate that the number is not a SFmode subnormal value (exponent is 0,
+ mantissa field is non-zero) which is undefined for the XXSPLTIDP
+ instruction. */
+ long sf_value;
+ real_to_target (&sf_value, rv, SFmode);
+
+ /* IEEE 754 32-bit values have 1 bit for the sign, 8 bits for the exponent,
+ and 23 bits for the mantissa. Subnormal numbers have the exponent all
+ 0 bits, and the mantissa non-zero. */
+ long sf_exponent = (sf_value >> 23) & 0xFF;
+ long sf_mantissa = sf_value & 0x7FFFFF;
+
+ if (sf_exponent == 0 && sf_mantissa != 0)
+ return false;
+
+ return true;
+})
+
+;; Return 1 if the operand is a vector constant with 64-bit elements that can
+;; be loaded via the XXSPLTIDP instruction, which takes a SFmode value and
+;; produces a V2DFmode or V2DImode result.
+;;
+;; We cannot combine the scalar and vector cases, because the combined
+;; predicate would mishandle an appropriate integer constant assigned to a
+;; TImode value, i.e.
+;;
+;; (set (reg:TI 32)
+;; (const_int 0x8000000000000000))
+;;
+;; If the cases were combined, the constant would be splatted into both 64-bit
+;; positions in the vector register, instead of being loaded with the upper
+;; 64 bits zero and the constant in the lower 64 bits.
+
+(define_predicate "easy_vector_constant_64bit_element"
+ (match_code "const_vector,vec_duplicate")
+{
+ /* Can we do the XXSPLTIDP instruction? */
+ if (!TARGET_XXSPLTIDP || !TARGET_PREFIXED || !TARGET_VSX)
+ return false;
+
+ if (mode == VOIDmode)
+ mode = GET_MODE (op);
+
+ if (mode != V2DFmode && mode != V2DImode)
+ return false;
+
+ if (CONST_VECTOR_P (op))
+ {
+ if (!CONST_VECTOR_DUPLICATE_P (op))
+ return false;
+
+ op = CONST_VECTOR_ELT (op, 0);
+ }
+
+ else if (GET_CODE (op) == VEC_DUPLICATE)
+ op = XEXP (op, 0);
+
+ else
+ return false;
+
+ return easy_fp_constant_64bit_scalar (op, GET_MODE_INNER (mode));
+})
+
;; Return 1 if the operand is a constant that can loaded with a XXSPLTIB
;; instruction and then a VUPKHSB, VECSB2W or VECSB2D instruction.
@@ -653,6 +790,9 @@
if (zero_constant (op, mode) || all_ones_constant (op, mode))
return true;
+ if (easy_vector_constant_64bit_element (op, mode))
+ return true;
+
if (TARGET_P9_VECTOR
&& xxspltib_constant_p (op, mode, &num_insns, &value))
return true;
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 14f6b313105..e9be9c4d99f 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -32,6 +32,7 @@ extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, int, int, int,
extern int easy_altivec_constant (rtx, machine_mode);
extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *);
+extern long xxspltidp_constant_immediate (rtx, machine_mode);
extern int vspltis_shifted (rtx);
extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int);
extern bool macho_lo_sum_memory_operand (rtx, machine_mode);
@@ -198,6 +199,7 @@ enum non_prefixed_form reg_to_non_prefixed (rtx reg, machine_mode mode);
extern bool prefixed_load_p (rtx_insn *);
extern bool prefixed_store_p (rtx_insn *);
extern bool prefixed_paddi_p (rtx_insn *);
+extern bool prefixed_xxsplti_p (rtx_insn *);
extern void rs6000_asm_output_opcode (FILE *);
extern void output_pcrel_opt_reloc (rtx);
extern void rs6000_final_prescan_insn (rtx_insn *, rtx [], int);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index ad860728169..83d243269e3 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -6946,6 +6946,60 @@ xxspltib_constant_p (rtx op,
return true;
}
+/* Return the immediate value used in the XXSPLTIDP instruction. */
+
+long
+xxspltidp_constant_immediate (rtx op, machine_mode mode)
+{
+ long ret;
+
+ /* Handle vectors. */
+ if (CONST_VECTOR_P (op))
+ {
+ op = CONST_VECTOR_ELT (op, 0);
+ mode = GET_MODE_INNER (mode);
+ }
+
+ else if (GET_CODE (op) == VEC_DUPLICATE)
+ {
+ op = XEXP (op, 0);
+ mode = GET_MODE (op);
+ }
+
+ gcc_assert (easy_fp_constant_64bit_scalar (op, mode));
+
+ /* Handle DImode/V2DImode by creating a DF value from it and then converting
+ the DFmode value to SFmode. */
+ if (CONST_INT_P (op))
+ {
+ HOST_WIDE_INT df_value = INTVAL (op);
+ long df_words[2];
+
+ df_words[0] = (df_value >> 32) & 0xffffffff;
+ df_words[1] = df_value & 0xffffffff;
+
+ /* real_from_target takes the target words in target order. */
+ if (!BYTES_BIG_ENDIAN)
+ std::swap (df_words[0], df_words[1]);
+
+ REAL_VALUE_TYPE r;
+ real_from_target (&r, &df_words[0], DFmode);
+ real_to_target (&ret, &r, SFmode);
+ }
+
+ /* For floating point constants, convert to SFmode. */
+ else if (CONST_DOUBLE_P (op) && (mode == SFmode || mode == DFmode))
+ {
+ const REAL_VALUE_TYPE *rv = CONST_DOUBLE_REAL_VALUE (op);
+ real_to_target (&ret, rv, SFmode);
+ }
+
+ else
+ gcc_unreachable ();
+
+ return ret;
+}
+
const char *
output_vec_const_move (rtx *operands)
{
@@ -6990,6 +7044,13 @@ output_vec_const_move (rtx *operands)
gcc_unreachable ();
}
+ if (easy_fp_constant_64bit_scalar (vec, mode)
+ || easy_vector_constant_64bit_element (vec, mode))
+ {
+ operands[2] = GEN_INT (xxspltidp_constant_immediate (vec, mode));
+ return "xxspltidp %x0,%2";
+ }
+
if (TARGET_P9_VECTOR
&& xxspltib_constant_p (vec, mode, &num_insns, &xxspltib_value))
{
@@ -26724,6 +26785,41 @@ prefixed_paddi_p (rtx_insn *insn)
return (iform == INSN_FORM_PCREL_EXTERNAL || iform == INSN_FORM_PCREL_LOCAL);
}
+/* Whether a permute type instruction is a prefixed XXSPLTI* instruction.
+ This is called from the prefixed attribute processing. */
+
+bool
+prefixed_xxsplti_p (rtx_insn *insn)
+{
+ rtx set = single_set (insn);
+ if (!set)
+ return false;
+
+ rtx dest = SET_DEST (set);
+ rtx src = SET_SRC (set);
+ machine_mode mode = GET_MODE (dest);
+
+ if (!REG_P (dest) && !SUBREG_P (dest))
+ return false;
+
+ switch (mode)
+ {
+ case E_DImode:
+ case E_DFmode:
+ case E_SFmode:
+ return easy_fp_constant_64bit_scalar (src, mode);
+
+ case E_V2DImode:
+ case E_V2DFmode:
+ return easy_vector_constant_64bit_element (src, mode);
+
+ default:
+ break;
+ }
+
+ return false;
+}
+
/* Whether the next instruction needs a 'p' prefix issued before the
instruction is printed out. */
static bool prepend_p_to_next_insn;
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 6bec2bddbde..8afc4b2756d 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -314,6 +314,11 @@
(eq_attr "type" "integer,add")
(if_then_else (match_test "prefixed_paddi_p (insn)")
+ (const_string "yes")
+ (const_string "no"))
+
+ (eq_attr "type" "vecperm")
+ (if_then_else (match_test "prefixed_xxsplti_p (insn)")
(const_string "yes")
(const_string "no"))]
@@ -7759,17 +7764,17 @@
;;
;; LWZ LFS LXSSP LXSSPX STFS STXSSP
;; STXSSPX STW XXLXOR LI FMR XSCPSGNDP
-;; MR MT<x> MF<x> NOP
+;; MR MT<x> MF<x> NOP XXSPLTIDP
(define_insn "movsf_hardfloat"
[(set (match_operand:SF 0 "nonimmediate_operand"
"=!r, f, v, wa, m, wY,
Z, m, wa, !r, f, wa,
- !r, *c*l, !r, *h")
+ !r, *c*l, !r, *h, wa")
(match_operand:SF 1 "input_operand"
"m, m, wY, Z, f, v,
wa, r, j, j, f, wa,
- r, r, *h, 0"))]
+ r, r, *h, 0, eF"))]
"(register_operand (operands[0], SFmode)
|| register_operand (operands[1], SFmode))
&& TARGET_HARD_FLOAT
@@ -7791,15 +7796,16 @@
mr %0,%1
mt%0 %1
mf%1 %0
- nop"
+ nop
+ #"
[(set_attr "type"
"load, fpload, fpload, fpload, fpstore, fpstore,
fpstore, store, veclogical, integer, fpsimple, fpsimple,
- *, mtjmpr, mfjmpr, *")
+ *, mtjmpr, mfjmpr, *, vecperm")
(set_attr "isa"
"*, *, p9v, p8v, *, p9v,
p8v, *, *, *, *, *,
- *, *, *, *")])
+ *, *, *, *, p10")])
;; LWZ LFIWZX STW STFIWX MTVSRWZ MFVSRWZ
;; FMR MR MT%0 MF%1 NOP
@@ -8059,18 +8065,18 @@
;; STFD LFD FMR LXSD STXSD
;; LXSD STXSD XXLOR XXLXOR GPR<-0
-;; LWZ STW MR
+;; LWZ STW MR XXSPLTIDP
(define_insn "*mov<mode>_hardfloat32"
[(set (match_operand:FMOVE64 0 "nonimmediate_operand"
"=m, d, d, <f64_p9>, wY,
<f64_av>, Z, <f64_vsx>, <f64_vsx>, !r,
- Y, r, !r")
+ Y, r, !r, wa")
(match_operand:FMOVE64 1 "input_operand"
"d, m, d, wY, <f64_p9>,
Z, <f64_av>, <f64_vsx>, <zero_fp>, <zero_fp>,
- r, Y, r"))]
+ r, Y, r, eF"))]
"! TARGET_POWERPC64 && TARGET_HARD_FLOAT
&& (gpc_reg_operand (operands[0], <MODE>mode)
|| gpc_reg_operand (operands[1], <MODE>mode))"
@@ -8087,20 +8093,21 @@
#
#
#
+ #
#"
[(set_attr "type"
"fpstore, fpload, fpsimple, fpload, fpstore,
fpload, fpstore, veclogical, veclogical, two,
- store, load, two")
+ store, load, two, vecperm")
(set_attr "size" "64")
(set_attr "length"
"*, *, *, *, *,
*, *, *, *, 8,
- 8, 8, 8")
+ 8, 8, 8, *")
(set_attr "isa"
"*, *, *, p9v, p9v,
p7v, p7v, *, *, *,
- *, *, *")])
+ *, *, *, p10")])
;; STW LWZ MR G-const H-const F-const
@@ -8127,19 +8134,19 @@
;; STFD LFD FMR LXSD STXSD
;; LXSDX STXSDX XXLOR XXLXOR LI 0
;; STD LD MR MT{CTR,LR} MF{CTR,LR}
-;; NOP MFVSRD MTVSRD
+;; NOP MFVSRD MTVSRD XXSPLTIDP
(define_insn "*mov<mode>_hardfloat64"
[(set (match_operand:FMOVE64 0 "nonimmediate_operand"
"=m, d, d, <f64_p9>, wY,
<f64_av>, Z, <f64_vsx>, <f64_vsx>, !r,
YZ, r, !r, *c*l, !r,
- *h, r, <f64_dm>")
+ *h, r, <f64_dm>, wa")
(match_operand:FMOVE64 1 "input_operand"
"d, m, d, wY, <f64_p9>,
Z, <f64_av>, <f64_vsx>, <zero_fp>, <zero_fp>,
r, YZ, r, r, *h,
- 0, <f64_dm>, r"))]
+ 0, <f64_dm>, r, eF"))]
"TARGET_POWERPC64 && TARGET_HARD_FLOAT
&& (gpc_reg_operand (operands[0], <MODE>mode)
|| gpc_reg_operand (operands[1], <MODE>mode))"
@@ -8161,18 +8168,19 @@
mf%1 %0
nop
mfvsrd %0,%x1
- mtvsrd %x0,%1"
+ mtvsrd %x0,%1
+ #"
[(set_attr "type"
"fpstore, fpload, fpsimple, fpload, fpstore,
fpload, fpstore, veclogical, veclogical, integer,
store, load, *, mtjmpr, mfjmpr,
- *, mfvsr, mtvsr")
+ *, mfvsr, mtvsr, vecperm")
(set_attr "size" "64")
(set_attr "isa"
"*, *, *, p9v, p9v,
p7v, p7v, *, *, *,
*, *, *, *, *,
- *, p8v, p8v")])
+ *, p8v, p8v, p10")])
;; STD LD MR MT<SPR> MF<SPR> G-const
;; H-const F-const Special
@@ -9220,6 +9228,7 @@
;; a gpr into a fpr instead of reloading an invalid 'Y' address
;; GPR store GPR load GPR move FPR store FPR load FPR move
+;; XXSPLTIDP
;; GPR const AVX store AVX store AVX load AVX load VSX move
;; P9 0 P9 -1 AVX 0/-1 VSX 0 VSX -1 P9 const
;; AVX const
@@ -9227,11 +9236,13 @@
(define_insn "*movdi_internal32"
[(set (match_operand:DI 0 "nonimmediate_operand"
"=Y, r, r, m, ^d, ^d,
+ ^wa,
r, wY, Z, ^v, $v, ^wa,
wa, wa, v, wa, *i, v,
v")
(match_operand:DI 1 "input_operand"
"r, Y, r, ^d, m, ^d,
+ eF,
IJKnF, ^v, $v, wY, Z, ^wa,
Oj, wM, OjwM, Oj, wM, wS,
wB"))]
@@ -9246,6 +9257,7 @@
lfd%U1%X1 %0,%1
fmr %0,%1
#
+ #
stxsd %1,%0
stxsdx %x1,%y0
lxsd %0,%1
@@ -9260,17 +9272,20 @@
#"
[(set_attr "type"
"store, load, *, fpstore, fpload, fpsimple,
+ vecperm,
*, fpstore, fpstore, fpload, fpload, veclogical,
vecsimple, vecsimple, vecsimple, veclogical,veclogical,vecsimple,
vecsimple")
(set_attr "size" "64")
(set_attr "length"
"8, 8, 8, *, *, *,
+ *,
16, *, *, *, *, *,
*, *, *, *, *, 8,
*")
(set_attr "isa"
"*, *, *, *, *, *,
+ p10,
*, p9v, p7v, p9v, p7v, *,
p9v, p9v, p7v, *, *, p7v,
p7v")])
@@ -9306,6 +9321,7 @@
})
;; GPR store GPR load GPR move
+;; XXSPLTIDP
;; GPR li GPR lis GPR pli GPR #
;; FPR store FPR load FPR move
;; AVX store AVX store AVX load AVX load VSX move
@@ -9316,6 +9332,7 @@
(define_insn "*movdi_internal64"
[(set (match_operand:DI 0 "nonimmediate_operand"
"=YZ, r, r,
+ ^wa,
r, r, r, r,
m, ^d, ^d,
wY, Z, $v, $v, ^wa,
@@ -9325,6 +9342,7 @@
?r, ?wa")
(match_operand:DI 1 "input_operand"
"r, YZ, r,
+ eF,
I, L, eI, nF,
^d, m, ^d,
^v, $v, wY, Z, ^wa,
@@ -9339,6 +9357,7 @@
std%U0%X0 %1,%0
ld%U1%X1 %0,%1
mr %0,%1
+ #
li %0,%1
lis %0,%v1
li %0,%1
@@ -9365,6 +9384,7 @@
mtvsrd %x0,%1"
[(set_attr "type"
"store, load, *,
+ vecperm,
*, *, *, *,
fpstore, fpload, fpsimple,
fpstore, fpstore, fpload, fpload, veclogical,
@@ -9375,6 +9395,7 @@
(set_attr "size" "64")
(set_attr "length"
"*, *, *,
+ *,
*, *, *, 20,
*, *, *,
*, *, *, *, *,
@@ -9384,6 +9405,7 @@
*, *")
(set_attr "isa"
"*, *, *,
+ p10,
*, *, p10, *,
*, *, *,
p9v, p7v, p9v, p7v, *,
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 9d7878f144a..1d7ce4cc94a 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -640,6 +640,10 @@ mprivileged
Target Var(rs6000_privileged) Init(0)
Generate code that will run in privileged state.
+mxxspltidp
+Target Undocumented Var(TARGET_XXSPLTIDP) Init(1) Save
+Generate (do not generate) XXSPLTIDP instructions.
+
-param=rs6000-density-pct-threshold=
Target Undocumented Joined UInteger Var(rs6000_density_pct_threshold) Init(85) IntegerRange(0, 100) Param
When costing for loop vectorization, we probably need to penalize the loop body
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index bf033e31c1c..fa33c9d9fbf 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -1191,16 +1191,19 @@
;; instruction). But generate XXLXOR/XXLORC if it will avoid a register move.
;; VSX store VSX load VSX move VSX->GPR GPR->VSX LQ (GPR)
+;; XXSPLTIDP
;; STQ (GPR) GPR load GPR store GPR move XXSPLTIB VSPLTISW
;; VSX 0/-1 VMX const GPR const LVX (VMX) STVX (VMX)
(define_insn "vsx_mov<mode>_64bit"
[(set (match_operand:VSX_M 0 "nonimmediate_operand"
"=ZwO, wa, wa, r, we, ?wQ,
+ wa,
?&r, ??r, ??Y, <??r>, wa, v,
?wa, v, <??r>, wZ, v")
(match_operand:VSX_M 1 "input_operand"
"wa, ZwO, wa, we, r, r,
+ eV,
wQ, Y, r, r, wE, jwM,
?jwM, W, <nW>, v, wZ"))]
@@ -1212,36 +1215,44 @@
}
[(set_attr "type"
"vecstore, vecload, vecsimple, mtvsr, mfvsr, load,
+ vecperm,
store, load, store, *, vecsimple, vecsimple,
vecsimple, *, *, vecstore, vecload")
(set_attr "num_insns"
"*, *, *, 2, *, 2,
+ *,
2, 2, 2, 2, *, *,
*, 5, 2, *, *")
(set_attr "max_prefixed_insns"
"*, *, *, *, *, 2,
+ *,
2, 2, 2, 2, *, *,
*, *, *, *, *")
(set_attr "length"
"*, *, *, 8, *, 8,
+ *,
8, 8, 8, 8, *, *,
*, 20, 8, *, *")
(set_attr "isa"
"<VSisa>, <VSisa>, <VSisa>, *, *, *,
+ p10,
*, *, *, *, p9v, *,
<VSisa>, *, *, *, *")])
;; VSX store VSX load VSX move GPR load GPR store GPR move
+;; XXSPLTIDP
;; XXSPLTIB VSPLTISW VSX 0/-1 VMX const GPR const
;; LVX (VMX) STVX (VMX)
(define_insn "*vsx_mov<mode>_32bit"
[(set (match_operand:VSX_M 0 "nonimmediate_operand"
"=ZwO, wa, wa, ??r, ??Y, <??r>,
+ wa,
wa, v, ?wa, v, <??r>,
wZ, v")
(match_operand:VSX_M 1 "input_operand"
"wa, ZwO, wa, Y, r, r,
+ eV,
wE, jwM, ?jwM, W, <nW>,
v, wZ"))]
@@ -1253,14 +1264,17 @@
}
[(set_attr "type"
"vecstore, vecload, vecsimple, load, store, *,
+ vecperm,
vecsimple, vecsimple, vecsimple, *, *,
vecstore, vecload")
(set_attr "length"
"*, *, *, 16, 16, 16,
+ *,
*, *, *, 20, 16,
*, *")
(set_attr "isa"
"<VSisa>, <VSisa>, <VSisa>, *, *, *,
+ p10,
p9v, *, <VSisa>, *, *,
*, *")])
@@ -6449,15 +6463,53 @@
DONE;
})
-(define_insn "xxspltidp_v2df_inst"
- [(set (match_operand:V2DF 0 "register_operand" "=wa")
- (unspec:V2DF [(match_operand:SI 1 "c32bit_cint_operand" "n")]
- UNSPEC_XXSPLTIDP))]
+(define_mode_iterator XXSPLTIDP_S [DI SF DF])
+(define_mode_iterator XXSPLTIDP_V [V2DF V2DI])
+(define_mode_iterator XXSPLTIDP [DI SF DF V2DF V2DI])
+
+(define_insn "xxspltidp_<mode>_inst"
+ [(set (match_operand:XXSPLTIDP 0 "register_operand" "=wa")
+ (unspec:XXSPLTIDP [(match_operand:SI 1 "c32bit_cint_operand" "n")]
+ UNSPEC_XXSPLTIDP))]
"TARGET_POWER10"
"xxspltidp %x0,%1"
[(set_attr "type" "vecperm")
(set_attr "prefixed" "yes")])
+;; Generate the XXSPLTIDP instruction to support SFmode, DFmode, and DImode
+;; scalar constants and V2DF and V2DI vector constants where both elements are
+;; the same. The constant has to be expressible as a SFmode constant that is
+;; not a SFmode denormal value.
+(define_insn_and_split "*xxspltidp_<mode>_internal"
+ [(set (match_operand:XXSPLTIDP_S 0 "vsx_register_operand" "=wa")
+ (match_operand:XXSPLTIDP_S 1 "easy_fp_constant_64bit_scalar" "eF"))]
+ "TARGET_POWER10"
+ "#"
+ "&& 1"
+ [(set (match_dup 0)
+ (unspec:XXSPLTIDP_S [(match_dup 2)] UNSPEC_XXSPLTIDP))]
+{
+ long immediate = xxspltidp_constant_immediate (operands[1], <MODE>mode);
+ operands[2] = GEN_INT (immediate);
+}
+ [(set_attr "type" "vecperm")
+ (set_attr "prefixed" "yes")])
+
+(define_insn_and_split "*xxspltidp_<mode>_internal"
+ [(set (match_operand:XXSPLTIDP_V 0 "vsx_register_operand" "=wa")
+ (match_operand:XXSPLTIDP_V 1 "easy_vector_constant_64bit_element" "eV"))]
+ "TARGET_POWER10"
+ "#"
+ "&& 1"
+ [(set (match_dup 0)
+ (unspec:XXSPLTIDP_V [(match_dup 2)] UNSPEC_XXSPLTIDP))]
+{
+ long immediate = xxspltidp_constant_immediate (operands[1], <MODE>mode);
+ operands[2] = GEN_INT (immediate);
+}
+ [(set_attr "type" "vecperm")
+ (set_attr "prefixed" "yes")])
+
;; XXSPLTI32DX built-in function support
(define_expand "xxsplti32dx_v4si"
[(set (match_operand:V4SI 0 "register_operand" "=wa")
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 2b41cb7fb7b..5035a3fd604 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -3333,9 +3333,15 @@ The integer constant zero.
A constant whose negation is a signed 16-bit constant.
@end ifset
+@item eF
+A 64-bit scalar constant that can be loaded with the XXSPLTIDP instruction.
+
@item eI
A signed 34-bit integer constant if prefixed instructions are supported.
+@item eV
+A 128-bit vector constant that can be loaded with the XXSPLTIDP instruction.
+
@ifset INTERNALS
@item G
A floating point constant that can be loaded into a register with one
diff --git a/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c b/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c
index bd1502bb30a..dcb30e1d886 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c
@@ -24,11 +24,12 @@ vector signed long long splats4(void)
return (vector signed long long) vec_sl(mzero, mzero);
}
-/* Codegen will consist of splat and shift instructions for most types.
- If folding is enabled, the vec_sl tests using vector long long type will
- generate a lvx instead of a vspltisw+vsld pair. */
+/* Codegen will consist of splat and shift instructions for most types. If
+ folding is enabled, the vec_sl tests using vector long long type will
+ generate a lvx instead of a vspltisw+vsld pair. On power10, it will
+ generate a xxspltidp instruction instead of the lvx. */
/* { dg-final { scan-assembler-times {\mvspltis[bhw]\M} 0 } } */
/* { dg-final { scan-assembler-times {\mvsl[bhwd]\M} 0 } } */
-/* { dg-final { scan-assembler-times {\mp?lxv\M|\mlxv\M|\mlxvd2x\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mp?lxv\M|\mlxv\M|\mlxvd2x\M|\mxxspltidp\M} 2 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
new file mode 100644
index 00000000000..8f6e176f9af
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating DFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+ instruction. */
+
+double
+scalar_double_0 (void)
+{
+ return 0.0; /* XXSPLTIB or XXLXOR. */
+}
+
+double
+scalar_double_1 (void)
+{
+ return 1.0; /* XXSPLTIDP. */
+}
+
+#ifndef __FAST_MATH__
+double
+scalar_double_m0 (void)
+{
+ return -0.0; /* XXSPLTIDP. */
+}
+
+double
+scalar_double_nan (void)
+{
+ return __builtin_nan (""); /* XXSPLTIDP. */
+}
+
+double
+scalar_double_inf (void)
+{
+ return __builtin_inf (); /* XXSPLTIDP. */
+}
+
+double
+scalar_double_m_inf (void) /* XXSPLTIDP. */
+{
+ return - __builtin_inf ();
+}
+#endif
+
+double
+scalar_double_pi (void)
+{
+ return M_PI; /* PLFD. */
+}
+
+double
+scalar_double_denorm (void)
+{
+ return 0x1p-149f; /* PLFD. */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-di.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-di.c
new file mode 100644
index 00000000000..75714d0b11d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-di.c
@@ -0,0 +1,70 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+/* Test generating DImode constants that have the same bit pattern as DFmode
+ constants that can be loaded with the XXSPLTIDP instruction with the ISA 3.1
+ (power10). We use asm to force the value into vector registers. */
+
+double
+scalar_0 (void)
+{
+ /* XXSPLTIB or XXLXOR. */
+ double d;
+ long long ll = 0;
+
+ __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll));
+ return d;
+}
+
+double
+scalar_1 (void)
+{
+ /* VSPLTISW/VUPKLSW or XXSPLTIB/VEXTSB2D. */
+ double d;
+ long long ll = 1;
+
+ __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll));
+ return d;
+}
+
+/* 0x8000000000000000LL is the bit pattern for -0.0, which can be generated
+ with XXSPLTIDP. */
+double
+scalar_float_neg_0 (void)
+{
+ /* XXSPLTIDP. */
+ double d;
+ long long ll = 0x8000000000000000LL;
+
+ __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll));
+ return d;
+}
+
+/* 0x3ff0000000000000LL is the bit pattern for 1.0 which can be generated with
+ XXSPLTIDP. */
+double
+scalar_float_1_0 (void)
+{
+ /* XXSPLTIDP. */
+ double d;
+ long long ll = 0x3ff0000000000000LL;
+
+ __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll));
+ return d;
+}
+
+/* 0x400921fb54442d18LL is the bit pattern for PI, which cannot be generated
+ with XXSPLTIDP. */
+double
+scalar_pi (void)
+{
+ /* PLXV. */
+ double d;
+ long long ll = 0x400921fb54442d18LL;
+
+ __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll));
+ return d;
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 2 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
new file mode 100644
index 00000000000..72504bdfbbd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating SFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+ instruction. */
+
+float
+scalar_float_0 (void)
+{
+ return 0.0f; /* XXSPLTIB or XXLXOR. */
+}
+
+float
+scalar_float_1 (void)
+{
+ return 1.0f; /* XXSPLTIDP. */
+}
+
+#ifndef __FAST_MATH__
+float
+scalar_float_m0 (void)
+{
+ return -0.0f; /* XXSPLTIDP. */
+}
+
+float
+scalar_float_nan (void)
+{
+ return __builtin_nanf (""); /* XXSPLTIDP. */
+}
+
+float
+scalar_float_inf (void)
+{
+ return __builtin_inff (); /* XXSPLTIDP. */
+}
+
+float
+scalar_float_m_inf (void) /* XXSPLTIDP. */
+{
+ return - __builtin_inff ();
+}
+#endif
+
+float
+scalar_float_pi (void)
+{
+ return (float)M_PI; /* XXSPLTIDP. */
+}
+
+float
+scalar_float_denorm (void)
+{
+ return 0x1p-149f; /* PLFS. */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 6 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
new file mode 100644
index 00000000000..82ffc86f8aa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
@@ -0,0 +1,64 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating V2DFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+ instruction. */
+
+vector double
+v2df_double_0 (void)
+{
+ return (vector double) { 0.0, 0.0 }; /* XXSPLTIB or XXLXOR. */
+}
+
+vector double
+v2df_double_1 (void)
+{
+ return (vector double) { 1.0, 1.0 }; /* XXSPLTIDP. */
+}
+
+#ifndef __FAST_MATH__
+vector double
+v2df_double_m0 (void)
+{
+ return (vector double) { -0.0, -0.0 }; /* XXSPLTIDP. */
+}
+
+vector double
+v2df_double_nan (void)
+{
+ return (vector double) { __builtin_nan (""),
+ __builtin_nan ("") }; /* XXSPLTIDP. */
+}
+
+vector double
+v2df_double_inf (void)
+{
+ return (vector double) { __builtin_inf (),
+ __builtin_inf () }; /* XXSPLTIDP. */
+}
+
+vector double
+v2df_double_m_inf (void)
+{
+ return (vector double) { - __builtin_inf (),
+ - __builtin_inf () }; /* XXSPLTIDP. */
+}
+#endif
+
+vector double
+v2df_double_pi (void)
+{
+ return (vector double) { M_PI, M_PI }; /* PLXV. */
+}
+
+vector double
+v2df_double_denorm (void)
+{
+ return (vector double) { (double)0x1p-149f,
+ (double)0x1p-149f }; /* PLXV. */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2di.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2di.c
new file mode 100644
index 00000000000..4d44f943d26
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2di.c
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+/* Test generating V2DImode constants that have the same bit pattern as
+ V2DFmode constants that can be loaded with the XXSPLTIDP instruction with
+ the ISA 3.1 (power10). */
+
+vector long long
+vector_0 (void)
+{
+ /* XXSPLTIB or XXLXOR. */
+ return (vector long long) { 0LL, 0LL };
+}
+
+vector long long
+vector_1 (void)
+{
+ /* XXSPLTIB and VEXTSB2D. */
+ return (vector long long) { 1LL, 1LL };
+}
+
+/* 0x8000000000000000LL is the bit pattern for -0.0, which can be generated
+ with XXSPLTIDP. */
+vector long long
+vector_float_neg_0 (void)
+{
+ /* XXSPLTIDP. */
+ return (vector long long) { 0x8000000000000000LL, 0x8000000000000000LL };
+}
+
+/* 0x3ff0000000000000LL is the bit pattern for 1.0 which can be generated with
+ XXSPLTIDP. */
+vector long long
+vector_float_1_0 (void)
+{
+ /* XXSPLTIDP. */
+ return (vector long long) { 0x3ff0000000000000LL, 0x3ff0000000000000LL };
+}
+
+/* 0x400921fb54442d18LL is the bit pattern for PI, which cannot be generated
+ with XXSPLTIDP. */
+vector long long
+scalar_pi (void)
+{
+ /* PLXV. */
+ return (vector long long) { 0x400921fb54442d18LL, 0x400921fb54442d18LL };
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 2 } } */
* [gcc(refs/users/meissner/heads/work070)] Generate XXSPLTIDP on power10.
@ 2021-10-05 21:51 Michael Meissner
From: Michael Meissner @ 2021-10-05 21:51 UTC (permalink / raw)
To: gcc-cvs
https://gcc.gnu.org/g:dc4b08de574f12f6eefb4eb6cd5f26151b688f6a
commit dc4b08de574f12f6eefb4eb6cd5f26151b688f6a
Author: Michael Meissner <meissner@linux.ibm.com>
Date: Tue Oct 5 17:50:20 2021 -0400
Generate XXSPLTIDP on power10.
This patch implements XXSPLTIDP support for SF, DF, and DI scalar constants and
V2DF and V2DI vector constants. The XXSPLTIDP instruction is given a 32-bit
immediate that is converted to a vector of two DFmode constants. The immediate
is in SFmode format, so only constants that fit as SFmode values can be loaded
with XXSPLTIDP.
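To make the paragraph above concrete, here is a minimal host-side sketch of what the instruction does with its immediate. It is not taken from the patch or from the ISA text: `model_xxspltidp` is a hypothetical name, and the sketch assumes the host uses IEEE 754 binary32 for `float` and binary64 for `double`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical software model of XXSPLTIDP (illustration only).  The 32-bit
   immediate is read as an SFmode bit pattern, widened to DFmode, and the
   resulting 64-bit pattern is copied into both doublewords of the vector.  */
static void
model_xxspltidp (uint32_t imm, uint64_t result[2])
{
  /* The patch treats SFmode subnormal immediates (exponent 0, mantissa
     non-zero) as undefined for the instruction, so refuse them here.  */
  assert (!(((imm >> 23) & 0xff) == 0 && (imm & 0x7fffff) != 0));

  float sf;
  memcpy (&sf, &imm, sizeof sf);	/* Reinterpret the immediate as float.  */

  double df = (double) sf;		/* Widening is exact.  */

  uint64_t df_bits;
  memcpy (&df_bits, &df, sizeof df_bits);

  result[0] = df_bits;
  result[1] = df_bits;
}
```

With this model, the immediate 0x3f800000 (1.0f) yields 0x3ff0000000000000 in both doublewords, which is the DImode pattern the vec-splat-constant-di.c test expects XXSPLTIDP to load.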
I added two new constraints (eF and eV) to match scalar and vector constants
that can be loaded with the XXSPLTIDP instruction.
I have added a temporary switch (-mxxspltidp) to control whether or not the
XXSPLTIDP instruction is generated.
I added 5 new tests that check loading SF/DF/DI scalar constants and V2DI/V2DF
vector constants.
2021-10-05 Michael Meissner <meissner@linux.ibm.com>
gcc/
* config/rs6000/constraints.md (eF): New constraint.
(eV): New constraint.
* config/rs6000/predicates.md (easy_fp_constant): If we can load
the scalar constant with XXSPLTIDP, the constant is easy.
(easy_fp_constant_64bit_scalar): New predicate.
(easy_vector_constant_64bit_element): New predicate.
(easy_vector_constant): If we can generate XXSPLTIDP, mark the
vector constant as easy.
* config/rs6000/rs6000-protos.h (xxspltidp_constant_immediate): New
declaration.
(prefixed_xxsplti_p): Likewise.
* config/rs6000/rs6000.c (xxspltidp_constant_immediate): New function.
(output_vec_const_move): Add support for XXSPLTIDP.
(prefixed_xxsplti_p): New function.
* config/rs6000/rs6000.md (prefixed attribute): Add support for the
xxsplti* prefixed instructions.
(movsf_hardfloat): Add XXSPLTIDP support.
(mov<mode>_hardfloat32, FMOVE64 iterator): Likewise.
(mov<mode>_hardfloat64, FMOVE64 iterator): Likewise.
(movdi_internal32): Likewise.
(movdi_internal64): Likewise.
* config/rs6000/rs6000.opt (-mxxspltidp): New switch.
* config/rs6000/vsx.md (vsx_move<mode>_64bit): Add XXSPLTIDP
support.
(vsx_move<mode>_32bit): Likewise.
(XXSPLTIDP_S): New mode iterator.
(XXSPLTIDP_V): Likewise.
(XXSPLTIDP): Likewise.
(xxspltidp_<mode>_inst): Replace xxspltidp_v2df_inst with an
iterated form that also does SFmode, DFmode, DImode, and
V2DImode.
(xxspltidp_<mode>_internal): New insn and splits.
* doc/md.texi (PowerPC and IBM RS6000 constraints): Document the
eF and eV constraints.
gcc/testsuite/
* gcc.target/powerpc/vec-splat-constant-df.c: New test.
* gcc.target/powerpc/vec-splat-constant-di.c: New test.
* gcc.target/powerpc/vec-splat-constant-sf.c: New test.
* gcc.target/powerpc/vec-splat-constant-v2df.c: New test.
* gcc.target/powerpc/vec-splat-constant-v2di.c: New test.
Diff:
---
gcc/config/rs6000/constraints.md | 10 ++
gcc/config/rs6000/predicates.md | 140 +++++++++++++++++++++
gcc/config/rs6000/rs6000-protos.h | 2 +
gcc/config/rs6000/rs6000.c | 96 ++++++++++++++
gcc/config/rs6000/rs6000.md | 58 ++++++---
gcc/config/rs6000/rs6000.opt | 4 +
gcc/config/rs6000/vsx.md | 60 ++++++++-
gcc/doc/md.texi | 6 +
.../gcc.target/powerpc/vec-splat-constant-df.c | 60 +++++++++
.../gcc.target/powerpc/vec-splat-constant-di.c | 70 +++++++++++
.../gcc.target/powerpc/vec-splat-constant-sf.c | 60 +++++++++
.../gcc.target/powerpc/vec-splat-constant-v2df.c | 64 ++++++++++
.../gcc.target/powerpc/vec-splat-constant-v2di.c | 50 ++++++++
13 files changed, 658 insertions(+), 22 deletions(-)
diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index c8cff1a3038..1ff46c9f4fc 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -208,11 +208,21 @@
(and (match_code "const_int")
(match_test "((- (unsigned HOST_WIDE_INT) ival) + 0x8000) < 0x10000")))
+;; DI/SF/DF scalar constant that can be loaded with the XXSPLTIDP instruction.
+(define_constraint "eF"
+ "A 64-bit scalar constant that can be loaded with the XXSPLTIDP instruction."
+ (match_operand 0 "easy_fp_constant_64bit_scalar"))
+
;; 34-bit signed integer constant
(define_constraint "eI"
"A signed 34-bit integer constant if prefixed instructions are supported."
(match_operand 0 "cint34_operand"))
+;; V2DI/V2DF vector constant that can be loaded with the XXSPLTIDP instruction.
+(define_constraint "eV"
+ "A 128-bit vector constant that can be loaded with the XXSPLTIDP instruction."
+ (match_operand 0 "easy_vector_constant_64bit_element"))
+
;; Floating-point constraints. These two are defined so that insn
;; length attributes can be calculated exactly.
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 956e42bc514..7544ac87700 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -601,6 +601,11 @@
if (TARGET_VSX && op == CONST0_RTX (mode))
return 1;
+ /* If we have the ISA 3.1 XXSPLTIDP instruction, see if the constant can
+ be loaded with that instruction. */
+ if (easy_fp_constant_64bit_scalar (op, mode))
+ return 1;
+
/* Otherwise consider floating point constants hard, so that the
constant gets pushed to memory during the early RTL phases. This
has the advantage that double precision constants that can be
@@ -609,6 +614,138 @@
return 0;
})
+;; Return 1 if the operand is a 64-bit scalar constant that can be loaded via
+;; the XXSPLTIDP instruction, which takes a SFmode value and produces a V2DF or
+;; V2DI mode result that is interpreted as a 64-bit scalar.
+(define_predicate "easy_fp_constant_64bit_scalar"
+ (match_code "const_int,const_double")
+{
+ const REAL_VALUE_TYPE *rv;
+ REAL_VALUE_TYPE rv_type;
+
+ /* Can we do the XXSPLTIDP instruction? */
+ if (!TARGET_XXSPLTIDP || !TARGET_PREFIXED || !TARGET_VSX)
+ return false;
+
+ if (mode == VOIDmode)
+ mode = GET_MODE (op);
+
+ /* Don't return true for 0.0 or 0 since that is easy to create without
+ XXSPLTIDP. */
+ if (op == CONST0_RTX (mode))
+ return false;
+
+ /* Handle DImode by creating a DF value from it. */
+ if (CONST_INT_P (op) && (mode == DImode || mode == VOIDmode))
+ {
+ HOST_WIDE_INT df_value = INTVAL (op);
+
+ /* Avoid values that look like DFmode NaN's. The IEEE 754 64-bit
+ floating format has 1 bit for sign, 11 bits for the exponent,
+ and 52 bits for the mantissa. NaN values have the exponent set
+ to all 1 bits, and the mantissa non-zero (mantissa == 0 is
+ infinity). */
+ int df_exponent = (df_value >> 52) & 0x7ff;
+ HOST_WIDE_INT df_mantissa = df_value & HOST_WIDE_INT_C (0xfffffffffffff);
+
+ if (df_exponent == 0x7ff && df_mantissa != 0) /* NaN. */
+ return false;
+
+ /* Avoid values that are DFmode subnormal values. Subnormal numbers
+ have the exponent all 0 bits, and the mantissa non-zero. If the
+ value is subnormal, then the hidden bit in the mantissa is not
+ set. */
+ if (df_exponent == 0 && df_mantissa != 0) /* subnormal. */
+ return false;
+
+ long df_words[2];
+ df_words[0] = (df_value >> 32) & 0xffffffff;
+ df_words[1] = df_value & 0xffffffff;
+
+ /* real_from_target takes the target words in target order. */
+ if (!BYTES_BIG_ENDIAN)
+ std::swap (df_words[0], df_words[1]);
+
+ real_from_target (&rv_type, df_words, DFmode);
+ rv = &rv_type;
+ }
+
+ /* Handle SFmode/DFmode constants. Don't allow decimal or IEEE 128-bit
+ binary constants. */
+ else if (CONST_DOUBLE_P (op) && (mode == SFmode || mode == DFmode))
+ rv = CONST_DOUBLE_REAL_VALUE (op);
+
+ /* We can't handle anything else with the XXSPLTIDP instruction. */
+ else
+ return false;
+
+ /* Validate that the number can be stored as a SFmode value. */
+ if (!exact_real_truncate (SFmode, rv))
+ return false;
+
+ /* Validate that the number is not a SFmode subnormal value (exponent is 0,
+ mantissa field is non-zero) which is undefined for the XXSPLTIDP
+ instruction. */
+ long sf_value;
+ real_to_target (&sf_value, rv, SFmode);
+
+ /* IEEE 754 32-bit values have 1 bit for the sign, 8 bits for the exponent,
+ and 23 bits for the mantissa. Subnormal numbers have the exponent all
+ 0 bits, and the mantissa non-zero. */
+ long sf_exponent = (sf_value >> 23) & 0xFF;
+ long sf_mantissa = sf_value & 0x7FFFFF;
+
+ if (sf_exponent == 0 && sf_mantissa != 0)
+ return false;
+
+ return true;
+})
+
+;; Return 1 if the operand is a vector constant with 64-bit elements that can
+;; be loaded via the XXSPLTIDP instruction, which takes a SFmode value and
+;; produces a V2DFmode or V2DImode result.
+;;
+;; We cannot combine the scalar and vector cases, because that would cause
+;; problems when a matching integer constant is assigned to a TImode value,
+;; e.g.:
+;;
+;; (set (reg:TI 32)
+;; (const_int 0x8000000000000000))
+;;
+;; With a combined predicate, the constant would be splatted into both 64-bit
+;; positions in the vector register, instead of being loaded with the upper
+;; 64 bits zero and the constant in the lower 64 bits.
+
+(define_predicate "easy_vector_constant_64bit_element"
+ (match_code "const_vector,vec_duplicate")
+{
+ /* Can we do the XXSPLTIDP instruction? */
+ if (!TARGET_XXSPLTIDP || !TARGET_PREFIXED || !TARGET_VSX)
+ return false;
+
+ if (mode == VOIDmode)
+ mode = GET_MODE (op);
+
+ if (mode != V2DFmode && mode != V2DImode)
+ return false;
+
+ if (CONST_VECTOR_P (op))
+ {
+ if (!CONST_VECTOR_DUPLICATE_P (op))
+ return false;
+
+ op = CONST_VECTOR_ELT (op, 0);
+ }
+
+ else if (GET_CODE (op) == VEC_DUPLICATE)
+ op = XEXP (op, 0);
+
+ else
+ return false;
+
+ return easy_fp_constant_64bit_scalar (op, GET_MODE_INNER (mode));
+})
+
;; Return 1 if the operand is a constant that can be loaded with a XXSPLTIB
;; instruction and then a VUPKHSB, VEXTSB2W or VEXTSB2D instruction.
@@ -653,6 +790,9 @@
if (zero_constant (op, mode) || all_ones_constant (op, mode))
return true;
+ if (easy_vector_constant_64bit_element (op, mode))
+ return true;
+
if (TARGET_P9_VECTOR
&& xxspltib_constant_p (op, mode, &num_insns, &value))
return true;
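As a reading aid for the DImode path of easy_fp_constant_64bit_scalar in the predicates.md changes above, the field split described in its comments (1 sign bit, 11 exponent bits, 52 mantissa bits) can be sketched as a small host-side classifier. The names `df_class`, `classify_df_bits` and `check_test_patterns` are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical classification of a 64-bit pattern viewed as IEEE 754
   binary64 (illustration only, not GCC code).  */
enum df_class { DF_ZERO, DF_SUBNORMAL, DF_NORMAL, DF_INFINITY, DF_NAN };

static enum df_class
classify_df_bits (uint64_t bits)
{
  /* Bits 62..52 hold the exponent, bits 51..0 the mantissa field.  */
  uint32_t exponent = (uint32_t) ((bits >> 52) & 0x7ff);
  uint64_t mantissa = bits & UINT64_C (0xfffffffffffff);

  if (exponent == 0x7ff)
    return mantissa != 0 ? DF_NAN : DF_INFINITY;
  if (exponent == 0)
    return mantissa != 0 ? DF_SUBNORMAL : DF_ZERO;
  return DF_NORMAL;
}

/* Exercise the classifier on bit patterns used by the new tests.  */
static void
check_test_patterns (void)
{
  assert (classify_df_bits (UINT64_C (0x3ff0000000000000)) == DF_NORMAL);   /* 1.0 */
  assert (classify_df_bits (UINT64_C (0x8000000000000000)) == DF_ZERO);     /* -0.0 */
  assert (classify_df_bits (UINT64_C (0x400921fb54442d18)) == DF_NORMAL);   /* pi */
}
```

In terms of this sketch, the predicate returns false early for DImode patterns that classify as DF_NAN or DF_SUBNORMAL, and lets the remaining patterns proceed to the SFmode representability checks.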
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 14f6b313105..e9be9c4d99f 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -32,6 +32,7 @@ extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, int, int, int,
extern int easy_altivec_constant (rtx, machine_mode);
extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *);
+extern long xxspltidp_constant_immediate (rtx, machine_mode);
extern int vspltis_shifted (rtx);
extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int);
extern bool macho_lo_sum_memory_operand (rtx, machine_mode);
@@ -198,6 +199,7 @@ enum non_prefixed_form reg_to_non_prefixed (rtx reg, machine_mode mode);
extern bool prefixed_load_p (rtx_insn *);
extern bool prefixed_store_p (rtx_insn *);
extern bool prefixed_paddi_p (rtx_insn *);
+extern bool prefixed_xxsplti_p (rtx_insn *);
extern void rs6000_asm_output_opcode (FILE *);
extern void output_pcrel_opt_reloc (rtx);
extern void rs6000_final_prescan_insn (rtx_insn *, rtx [], int);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index ad860728169..83d243269e3 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -6946,6 +6946,60 @@ xxspltib_constant_p (rtx op,
return true;
}
+/* Return the immediate value used in the XXSPLTIDP instruction. */
+
+long
+xxspltidp_constant_immediate (rtx op, machine_mode mode)
+{
+ long ret;
+
+ /* Handle vectors. */
+ if (CONST_VECTOR_P (op))
+ {
+ op = CONST_VECTOR_ELT (op, 0);
+ mode = GET_MODE_INNER (mode);
+ }
+
+ else if (GET_CODE (op) == VEC_DUPLICATE)
+ {
+ op = XEXP (op, 0);
+ mode = GET_MODE (op);
+ }
+
+ gcc_assert (easy_fp_constant_64bit_scalar (op, mode));
+
+ /* Handle DImode/V2DImode by creating a DF value from it and then converting
+ the DFmode value to SFmode. */
+ if (CONST_INT_P (op))
+ {
+ HOST_WIDE_INT df_value = INTVAL (op);
+ long df_words[2];
+
+ df_words[0] = (df_value >> 32) & 0xffffffff;
+ df_words[1] = df_value & 0xffffffff;
+
+ /* real_from_target takes input in target endian order. */
+ if (!BYTES_BIG_ENDIAN)
+ std::swap (df_words[0], df_words[1]);
+
+ REAL_VALUE_TYPE r;
+ real_from_target (&r, &df_words[0], DFmode);
+ real_to_target (&ret, &r, SFmode);
+ }
+
+ /* For floating point constants, convert to SFmode. */
+ else if (CONST_DOUBLE_P (op) && (mode == SFmode || mode == DFmode))
+ {
+ const REAL_VALUE_TYPE *rv = CONST_DOUBLE_REAL_VALUE (op);
+ real_to_target (&ret, rv, SFmode);
+ }
+
+ else
+ gcc_unreachable ();
+
+ return ret;
+}
+
const char *
output_vec_const_move (rtx *operands)
{
@@ -6990,6 +7044,13 @@ output_vec_const_move (rtx *operands)
gcc_unreachable ();
}
+ if (easy_fp_constant_64bit_scalar (vec, mode)
+ || easy_vector_constant_64bit_element (vec, mode))
+ {
+ operands[2] = GEN_INT (xxspltidp_constant_immediate (vec, mode));
+ return "xxspltidp %x0,%2";
+ }
+
if (TARGET_P9_VECTOR
&& xxspltib_constant_p (vec, mode, &num_insns, &xxspltib_value))
{
@@ -26724,6 +26785,41 @@ prefixed_paddi_p (rtx_insn *insn)
return (iform == INSN_FORM_PCREL_EXTERNAL || iform == INSN_FORM_PCREL_LOCAL);
}
+/* Whether a permute type instruction is a prefixed XXSPLTI* instruction.
+ This is called from the prefixed attribute processing. */
+
+bool
+prefixed_xxsplti_p (rtx_insn *insn)
+{
+ rtx set = single_set (insn);
+ if (!set)
+ return false;
+
+ rtx dest = SET_DEST (set);
+ rtx src = SET_SRC (set);
+ machine_mode mode = GET_MODE (dest);
+
+ if (!REG_P (dest) && !SUBREG_P (dest))
+ return false;
+
+ switch (mode)
+ {
+ case E_DImode:
+ case E_DFmode:
+ case E_SFmode:
+ return easy_fp_constant_64bit_scalar (src, mode);
+
+ case E_V2DImode:
+ case E_V2DFmode:
+ return easy_vector_constant_64bit_element (src, mode);
+
+ default:
+ break;
+ }
+
+ return false;
+}
+
/* Whether the next instruction needs a 'p' prefix issued before the
instruction is printed out. */
static bool prepend_p_to_next_insn;
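For the rs6000.c changes above, the value that xxspltidp_constant_immediate returns for a DImode constant can be approximated on a host with ordinary C conversions. This is only a sketch under the assumption of IEEE 754 `float` and `double`: `sf_immediate_from_df_bits` is a hypothetical name, it uses a round trip through `float` in place of GCC's exact_real_truncate, and unlike the predicate it does not special-case zero or DImode NaN patterns.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical host-side helper (illustration only).  Given the 64-bit
   pattern of a DFmode value, store the 32-bit SFmode pattern that could
   serve as the XXSPLTIDP immediate and return true.  Return false if the
   value does not survive a round trip through float, or if the float is
   subnormal.  */
static bool
sf_immediate_from_df_bits (uint64_t df_bits, uint32_t *imm)
{
  assert (imm != NULL);

  double df;
  memcpy (&df, &df_bits, sizeof df);

  /* Finite values outside the float range cannot be represented; reject
     them before converting.  */
  bool finite = ((df_bits >> 52) & 0x7ff) != 0x7ff;
  if (finite && (df > FLT_MAX || df < -FLT_MAX))
    return false;

  /* Narrow to float and widen again; any lost bits change the pattern.  */
  float sf = (float) df;
  double back = (double) sf;
  uint64_t back_bits;
  memcpy (&back_bits, &back, sizeof back_bits);
  if (back_bits != df_bits)
    return false;

  uint32_t sf_bits;
  memcpy (&sf_bits, &sf, sizeof sf_bits);

  /* SFmode subnormal: exponent field 0, mantissa field non-zero.  */
  if (((sf_bits >> 23) & 0xff) == 0 && (sf_bits & 0x7fffff) != 0)
    return false;

  *imm = sf_bits;
  return true;
}
```

For example, the pattern 0x3ff0000000000000 (1.0) gives the immediate 0x3f800000, while 0x400921fb54442d18 (pi) is rejected because it loses precision as a float.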
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 6bec2bddbde..8afc4b2756d 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -314,6 +314,11 @@
(eq_attr "type" "integer,add")
(if_then_else (match_test "prefixed_paddi_p (insn)")
+ (const_string "yes")
+ (const_string "no"))
+
+ (eq_attr "type" "vecperm")
+ (if_then_else (match_test "prefixed_xxsplti_p (insn)")
(const_string "yes")
(const_string "no"))]
@@ -7759,17 +7764,17 @@
;;
;; LWZ LFS LXSSP LXSSPX STFS STXSSP
;; STXSSPX STW XXLXOR LI FMR XSCPSGNDP
-;; MR MT<x> MF<x> NOP
+;; MR MT<x> MF<x> NOP XXSPLTIDP
(define_insn "movsf_hardfloat"
[(set (match_operand:SF 0 "nonimmediate_operand"
"=!r, f, v, wa, m, wY,
Z, m, wa, !r, f, wa,
- !r, *c*l, !r, *h")
+ !r, *c*l, !r, *h, wa")
(match_operand:SF 1 "input_operand"
"m, m, wY, Z, f, v,
wa, r, j, j, f, wa,
- r, r, *h, 0"))]
+ r, r, *h, 0, eF"))]
"(register_operand (operands[0], SFmode)
|| register_operand (operands[1], SFmode))
&& TARGET_HARD_FLOAT
@@ -7791,15 +7796,16 @@
mr %0,%1
mt%0 %1
mf%1 %0
- nop"
+ nop
+ #"
[(set_attr "type"
"load, fpload, fpload, fpload, fpstore, fpstore,
fpstore, store, veclogical, integer, fpsimple, fpsimple,
- *, mtjmpr, mfjmpr, *")
+ *, mtjmpr, mfjmpr, *, vecperm")
(set_attr "isa"
"*, *, p9v, p8v, *, p9v,
p8v, *, *, *, *, *,
- *, *, *, *")])
+ *, *, *, *, p10")])
;; LWZ LFIWZX STW STFIWX MTVSRWZ MFVSRWZ
;; FMR MR MT%0 MF%1 NOP
@@ -8059,18 +8065,18 @@
;; STFD LFD FMR LXSD STXSD
;; LXSD STXSD XXLOR XXLXOR GPR<-0
-;; LWZ STW MR
+;; LWZ STW MR XXSPLTIDP
(define_insn "*mov<mode>_hardfloat32"
[(set (match_operand:FMOVE64 0 "nonimmediate_operand"
"=m, d, d, <f64_p9>, wY,
<f64_av>, Z, <f64_vsx>, <f64_vsx>, !r,
- Y, r, !r")
+ Y, r, !r, wa")
(match_operand:FMOVE64 1 "input_operand"
"d, m, d, wY, <f64_p9>,
Z, <f64_av>, <f64_vsx>, <zero_fp>, <zero_fp>,
- r, Y, r"))]
+ r, Y, r, eF"))]
"! TARGET_POWERPC64 && TARGET_HARD_FLOAT
&& (gpc_reg_operand (operands[0], <MODE>mode)
|| gpc_reg_operand (operands[1], <MODE>mode))"
@@ -8087,20 +8093,21 @@
#
#
#
+ #
#"
[(set_attr "type"
"fpstore, fpload, fpsimple, fpload, fpstore,
fpload, fpstore, veclogical, veclogical, two,
- store, load, two")
+ store, load, two, vecperm")
(set_attr "size" "64")
(set_attr "length"
"*, *, *, *, *,
*, *, *, *, 8,
- 8, 8, 8")
+ 8, 8, 8, *")
(set_attr "isa"
"*, *, *, p9v, p9v,
p7v, p7v, *, *, *,
- *, *, *")])
+ *, *, *, p10")])
;; STW LWZ MR G-const H-const F-const
@@ -8127,19 +8134,19 @@
;; STFD LFD FMR LXSD STXSD
;; LXSDX STXSDX XXLOR XXLXOR LI 0
;; STD LD MR MT{CTR,LR} MF{CTR,LR}
-;; NOP MFVSRD MTVSRD
+;; NOP MFVSRD MTVSRD XXSPLTIDP
(define_insn "*mov<mode>_hardfloat64"
[(set (match_operand:FMOVE64 0 "nonimmediate_operand"
"=m, d, d, <f64_p9>, wY,
<f64_av>, Z, <f64_vsx>, <f64_vsx>, !r,
YZ, r, !r, *c*l, !r,
- *h, r, <f64_dm>")
+ *h, r, <f64_dm>, wa")
(match_operand:FMOVE64 1 "input_operand"
"d, m, d, wY, <f64_p9>,
Z, <f64_av>, <f64_vsx>, <zero_fp>, <zero_fp>,
r, YZ, r, r, *h,
- 0, <f64_dm>, r"))]
+ 0, <f64_dm>, r, eF"))]
"TARGET_POWERPC64 && TARGET_HARD_FLOAT
&& (gpc_reg_operand (operands[0], <MODE>mode)
|| gpc_reg_operand (operands[1], <MODE>mode))"
@@ -8161,18 +8168,19 @@
mf%1 %0
nop
mfvsrd %0,%x1
- mtvsrd %x0,%1"
+ mtvsrd %x0,%1
+ #"
[(set_attr "type"
"fpstore, fpload, fpsimple, fpload, fpstore,
fpload, fpstore, veclogical, veclogical, integer,
store, load, *, mtjmpr, mfjmpr,
- *, mfvsr, mtvsr")
+ *, mfvsr, mtvsr, vecperm")
(set_attr "size" "64")
(set_attr "isa"
"*, *, *, p9v, p9v,
p7v, p7v, *, *, *,
*, *, *, *, *,
- *, p8v, p8v")])
+ *, p8v, p8v, p10")])
;; STD LD MR MT<SPR> MF<SPR> G-const
;; H-const F-const Special
@@ -9220,6 +9228,7 @@
;; a gpr into a fpr instead of reloading an invalid 'Y' address
;; GPR store GPR load GPR move FPR store FPR load FPR move
+;; XXSPLTIDP
;; GPR const AVX store AVX store AVX load AVX load VSX move
;; P9 0 P9 -1 AVX 0/-1 VSX 0 VSX -1 P9 const
;; AVX const
@@ -9227,11 +9236,13 @@
(define_insn "*movdi_internal32"
[(set (match_operand:DI 0 "nonimmediate_operand"
"=Y, r, r, m, ^d, ^d,
+ ^wa,
r, wY, Z, ^v, $v, ^wa,
wa, wa, v, wa, *i, v,
v")
(match_operand:DI 1 "input_operand"
"r, Y, r, ^d, m, ^d,
+ eF,
IJKnF, ^v, $v, wY, Z, ^wa,
Oj, wM, OjwM, Oj, wM, wS,
wB"))]
@@ -9246,6 +9257,7 @@
lfd%U1%X1 %0,%1
fmr %0,%1
#
+ #
stxsd %1,%0
stxsdx %x1,%y0
lxsd %0,%1
@@ -9260,17 +9272,20 @@
#"
[(set_attr "type"
"store, load, *, fpstore, fpload, fpsimple,
+ vecperm,
*, fpstore, fpstore, fpload, fpload, veclogical,
vecsimple, vecsimple, vecsimple, veclogical,veclogical,vecsimple,
vecsimple")
(set_attr "size" "64")
(set_attr "length"
"8, 8, 8, *, *, *,
+ *,
16, *, *, *, *, *,
*, *, *, *, *, 8,
*")
(set_attr "isa"
"*, *, *, *, *, *,
+ p10,
*, p9v, p7v, p9v, p7v, *,
p9v, p9v, p7v, *, *, p7v,
p7v")])
@@ -9306,6 +9321,7 @@
})
;; GPR store GPR load GPR move
+;; XXSPLTIDP
;; GPR li GPR lis GPR pli GPR #
;; FPR store FPR load FPR move
;; AVX store AVX store AVX load AVX load VSX move
@@ -9316,6 +9332,7 @@
(define_insn "*movdi_internal64"
[(set (match_operand:DI 0 "nonimmediate_operand"
"=YZ, r, r,
+ ^wa,
r, r, r, r,
m, ^d, ^d,
wY, Z, $v, $v, ^wa,
@@ -9325,6 +9342,7 @@
?r, ?wa")
(match_operand:DI 1 "input_operand"
"r, YZ, r,
+ eF,
I, L, eI, nF,
^d, m, ^d,
^v, $v, wY, Z, ^wa,
@@ -9339,6 +9357,7 @@
std%U0%X0 %1,%0
ld%U1%X1 %0,%1
mr %0,%1
+ #
li %0,%1
lis %0,%v1
li %0,%1
@@ -9365,6 +9384,7 @@
mtvsrd %x0,%1"
[(set_attr "type"
"store, load, *,
+ vecperm,
*, *, *, *,
fpstore, fpload, fpsimple,
fpstore, fpstore, fpload, fpload, veclogical,
@@ -9375,6 +9395,7 @@
(set_attr "size" "64")
(set_attr "length"
"*, *, *,
+ *,
*, *, *, 20,
*, *, *,
*, *, *, *, *,
@@ -9384,6 +9405,7 @@
*, *")
(set_attr "isa"
"*, *, *,
+ p10,
*, *, p10, *,
*, *, *,
p9v, p7v, p9v, p7v, *,
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 9d7878f144a..1d7ce4cc94a 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -640,6 +640,10 @@ mprivileged
Target Var(rs6000_privileged) Init(0)
Generate code that will run in privileged state.
+mxxspltidp
+Target Undocumented Var(TARGET_XXSPLTIDP) Init(1) Save
+Generate (do not generate) XXSPLTIDP instructions.
+
-param=rs6000-density-pct-threshold=
Target Undocumented Joined UInteger Var(rs6000_density_pct_threshold) Init(85) IntegerRange(0, 100) Param
When costing for loop vectorization, we probably need to penalize the loop body
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index bf033e31c1c..fa33c9d9fbf 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -1191,16 +1191,19 @@
;; instruction). But generate XXLXOR/XXLORC if it will avoid a register move.
;; VSX store VSX load VSX move VSX->GPR GPR->VSX LQ (GPR)
+;; XXSPLTIDP
;; STQ (GPR) GPR load GPR store GPR move XXSPLTIB VSPLTISW
;; VSX 0/-1 VMX const GPR const LVX (VMX) STVX (VMX)
(define_insn "vsx_mov<mode>_64bit"
[(set (match_operand:VSX_M 0 "nonimmediate_operand"
"=ZwO, wa, wa, r, we, ?wQ,
+ wa,
?&r, ??r, ??Y, <??r>, wa, v,
?wa, v, <??r>, wZ, v")
(match_operand:VSX_M 1 "input_operand"
"wa, ZwO, wa, we, r, r,
+ eV,
wQ, Y, r, r, wE, jwM,
?jwM, W, <nW>, v, wZ"))]
@@ -1212,36 +1215,44 @@
}
[(set_attr "type"
"vecstore, vecload, vecsimple, mtvsr, mfvsr, load,
+ vecperm,
store, load, store, *, vecsimple, vecsimple,
vecsimple, *, *, vecstore, vecload")
(set_attr "num_insns"
"*, *, *, 2, *, 2,
+ *,
2, 2, 2, 2, *, *,
*, 5, 2, *, *")
(set_attr "max_prefixed_insns"
"*, *, *, *, *, 2,
+ *,
2, 2, 2, 2, *, *,
*, *, *, *, *")
(set_attr "length"
"*, *, *, 8, *, 8,
+ *,
8, 8, 8, 8, *, *,
*, 20, 8, *, *")
(set_attr "isa"
"<VSisa>, <VSisa>, <VSisa>, *, *, *,
+ p10,
*, *, *, *, p9v, *,
<VSisa>, *, *, *, *")])
;; VSX store VSX load VSX move GPR load GPR store GPR move
+;; XXSPLTIDP
;; XXSPLTIB VSPLTISW VSX 0/-1 VMX const GPR const
;; LVX (VMX) STVX (VMX)
(define_insn "*vsx_mov<mode>_32bit"
[(set (match_operand:VSX_M 0 "nonimmediate_operand"
"=ZwO, wa, wa, ??r, ??Y, <??r>,
+ wa,
wa, v, ?wa, v, <??r>,
wZ, v")
(match_operand:VSX_M 1 "input_operand"
"wa, ZwO, wa, Y, r, r,
+ eV,
wE, jwM, ?jwM, W, <nW>,
v, wZ"))]
@@ -1253,14 +1264,17 @@
}
[(set_attr "type"
"vecstore, vecload, vecsimple, load, store, *,
+ vecperm,
vecsimple, vecsimple, vecsimple, *, *,
vecstore, vecload")
(set_attr "length"
"*, *, *, 16, 16, 16,
+ *,
*, *, *, 20, 16,
*, *")
(set_attr "isa"
"<VSisa>, <VSisa>, <VSisa>, *, *, *,
+ p10,
p9v, *, <VSisa>, *, *,
*, *")])
@@ -6449,15 +6463,53 @@
DONE;
})
-(define_insn "xxspltidp_v2df_inst"
- [(set (match_operand:V2DF 0 "register_operand" "=wa")
- (unspec:V2DF [(match_operand:SI 1 "c32bit_cint_operand" "n")]
- UNSPEC_XXSPLTIDP))]
+(define_mode_iterator XXSPLTIDP_S [DI SF DF])
+(define_mode_iterator XXSPLTIDP_V [V2DF V2DI])
+(define_mode_iterator XXSPLTIDP [DI SF DF V2DF V2DI])
+
+(define_insn "xxspltidp_<mode>_inst"
+ [(set (match_operand:XXSPLTIDP 0 "register_operand" "=wa")
+ (unspec:XXSPLTIDP [(match_operand:SI 1 "c32bit_cint_operand" "n")]
+ UNSPEC_XXSPLTIDP))]
"TARGET_POWER10"
"xxspltidp %x0,%1"
[(set_attr "type" "vecperm")
(set_attr "prefixed" "yes")])
+;; Generate the XXSPLTIDP instruction to support SFmode, DFmode, and DImode
+;; scalar constants and V2DF and V2DI vector constants where both elements are
+;; the same. The constant has to be expressible as a SFmode constant that is
+;; not a SFmode denormal value.
+(define_insn_and_split "*xxspltidp_<mode>_internal"
+ [(set (match_operand:XXSPLTIDP_S 0 "vsx_register_operand" "=wa")
+ (match_operand:XXSPLTIDP_S 1 "easy_fp_constant_64bit_scalar" "eF"))]
+ "TARGET_POWER10"
+ "#"
+ "&& 1"
+ [(set (match_dup 0)
+ (unspec:XXSPLTIDP_S [(match_dup 2)] UNSPEC_XXSPLTIDP))]
+{
+ long immediate = xxspltidp_constant_immediate (operands[1], <MODE>mode);
+ operands[2] = GEN_INT (immediate);
+}
+ [(set_attr "type" "vecperm")
+ (set_attr "prefixed" "yes")])
+
+(define_insn_and_split "*xxspltidp_<mode>_internal"
+ [(set (match_operand:XXSPLTIDP_V 0 "vsx_register_operand" "=wa")
+ (match_operand:XXSPLTIDP_V 1 "easy_vector_constant_64bit_element" "eV"))]
+ "TARGET_POWER10"
+ "#"
+ "&& 1"
+ [(set (match_dup 0)
+ (unspec:XXSPLTIDP_V [(match_dup 2)] UNSPEC_XXSPLTIDP))]
+{
+ long immediate = xxspltidp_constant_immediate (operands[1], <MODE>mode);
+ operands[2] = GEN_INT (immediate);
+}
+ [(set_attr "type" "vecperm")
+ (set_attr "prefixed" "yes")])
+
;; XXSPLTI32DX built-in function support
(define_expand "xxsplti32dx_v4si"
[(set (match_operand:V4SI 0 "register_operand" "=wa")
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 2b41cb7fb7b..5035a3fd604 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -3333,9 +3333,15 @@ The integer constant zero.
A constant whose negation is a signed 16-bit constant.
@end ifset
+@item eF
+A 64-bit scalar constant that can be loaded with the XXSPLTIDP instruction.
+
@item eI
A signed 34-bit integer constant if prefixed instructions are supported.
+@item eV
+A 128-bit vector constant that can be loaded with the XXSPLTIDP instruction.
+
@ifset INTERNALS
@item G
A floating point constant that can be loaded into a register with one
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
new file mode 100644
index 00000000000..8f6e176f9af
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating DFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+ instruction. */
+
+double
+scalar_double_0 (void)
+{
+ return 0.0; /* XXSPLTIB or XXLXOR. */
+}
+
+double
+scalar_double_1 (void)
+{
+ return 1.0; /* XXSPLTIDP. */
+}
+
+#ifndef __FAST_MATH__
+double
+scalar_double_m0 (void)
+{
+ return -0.0; /* XXSPLTIDP. */
+}
+
+double
+scalar_double_nan (void)
+{
+ return __builtin_nan (""); /* XXSPLTIDP. */
+}
+
+double
+scalar_double_inf (void)
+{
+ return __builtin_inf (); /* XXSPLTIDP. */
+}
+
+double
+scalar_double_m_inf (void) /* XXSPLTIDP. */
+{
+ return - __builtin_inf ();
+}
+#endif
+
+double
+scalar_double_pi (void)
+{
+ return M_PI; /* PLFD. */
+}
+
+double
+scalar_double_denorm (void)
+{
+ return 0x1p-149f; /* PLFD. */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-di.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-di.c
new file mode 100644
index 00000000000..75714d0b11d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-di.c
@@ -0,0 +1,70 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+/* Test generating DImode constants that have the same bit pattern as DFmode
+ constants that can be loaded with the XXSPLTIDP instruction with the ISA 3.1
+ (power10). We use asm to force the value into vector registers. */
+
+double
+scalar_0 (void)
+{
+ /* XXSPLTIB or XXLXOR. */
+ double d;
+ long long ll = 0;
+
+ __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll));
+ return d;
+}
+
+double
+scalar_1 (void)
+{
+ /* VSPLTISW/VUPKLSW or XXSPLTIB/VEXTSB2D. */
+ double d;
+ long long ll = 1;
+
+ __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll));
+ return d;
+}
+
+/* 0x8000000000000000LL is the bit pattern for -0.0, which can be generated
+ with XXSPLTIDP. */
+double
+scalar_float_neg_0 (void)
+{
+ /* XXSPLTIDP. */
+ double d;
+ long long ll = 0x8000000000000000LL;
+
+ __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll));
+ return d;
+}
+
+/* 0x3ff0000000000000LL is the bit pattern for 1.0 which can be generated with
+ XXSPLTIDP. */
+double
+scalar_float_1_0 (void)
+{
+ /* XXSPLTIDP. */
+ double d;
+ long long ll = 0x3ff0000000000000LL;
+
+ __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll));
+ return d;
+}
+
+/* 0x400921fb54442d18LL is the bit pattern for PI, which cannot be generated
+ with XXSPLTIDP. */
+double
+scalar_pi (void)
+{
+ /* PLXV. */
+ double d;
+ long long ll = 0x400921fb54442d18LL;
+
+ __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll));
+ return d;
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 2 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
new file mode 100644
index 00000000000..72504bdfbbd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating SFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+ instruction. */
+
+float
+scalar_float_0 (void)
+{
+ return 0.0f; /* XXSPLTIB or XXLXOR. */
+}
+
+float
+scalar_float_1 (void)
+{
+ return 1.0f; /* XXSPLTIDP. */
+}
+
+#ifndef __FAST_MATH__
+float
+scalar_float_m0 (void)
+{
+ return -0.0f; /* XXSPLTIDP. */
+}
+
+float
+scalar_float_nan (void)
+{
+ return __builtin_nanf (""); /* XXSPLTIDP. */
+}
+
+float
+scalar_float_inf (void)
+{
+ return __builtin_inff (); /* XXSPLTIDP. */
+}
+
+float
+scalar_float_m_inf (void) /* XXSPLTIDP. */
+{
+ return - __builtin_inff ();
+}
+#endif
+
+float
+scalar_float_pi (void)
+{
+ return (float)M_PI; /* XXSPLTIDP. */
+}
+
+float
+scalar_float_denorm (void)
+{
+ return 0x1p-149f; /* PLFS. */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 6 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
new file mode 100644
index 00000000000..82ffc86f8aa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
@@ -0,0 +1,64 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating V2DFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+ instruction. */
+
+vector double
+v2df_double_0 (void)
+{
+ return (vector double) { 0.0, 0.0 }; /* XXSPLTIB or XXLXOR. */
+}
+
+vector double
+v2df_double_1 (void)
+{
+ return (vector double) { 1.0, 1.0 }; /* XXSPLTIDP. */
+}
+
+#ifndef __FAST_MATH__
+vector double
+v2df_double_m0 (void)
+{
+ return (vector double) { -0.0, -0.0 }; /* XXSPLTIDP. */
+}
+
+vector double
+v2df_double_nan (void)
+{
+ return (vector double) { __builtin_nan (""),
+ __builtin_nan ("") }; /* XXSPLTIDP. */
+}
+
+vector double
+v2df_double_inf (void)
+{
+ return (vector double) { __builtin_inf (),
+ __builtin_inf () }; /* XXSPLTIDP. */
+}
+
+vector double
+v2df_double_m_inf (void)
+{
+ return (vector double) { - __builtin_inf (),
+ - __builtin_inf () }; /* XXSPLTIDP. */
+}
+#endif
+
+vector double
+v2df_double_pi (void)
+{
+ return (vector double) { M_PI, M_PI }; /* PLXV. */
+}
+
+vector double
+v2df_double_denorm (void)
+{
+ return (vector double) { (double)0x1p-149f,
+ (double)0x1p-149f }; /* PLXV. */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2di.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2di.c
new file mode 100644
index 00000000000..4d44f943d26
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2di.c
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+/* Test generating V2DImode constants that have the same bit pattern as
+ V2DFmode constants that can be loaded with the XXSPLTIDP instruction with
+ the ISA 3.1 (power10). */
+
+vector long long
+vector_0 (void)
+{
+ /* XXSPLTIB or XXLXOR. */
+ return (vector long long) { 0LL, 0LL };
+}
+
+vector long long
+vector_1 (void)
+{
+ /* XXSPLTIB and VEXTSB2D. */
+ return (vector long long) { 1LL, 1LL };
+}
+
+/* 0x8000000000000000LL is the bit pattern for -0.0, which can be generated
+ with XXSPLTIDP. */
+vector long long
+vector_float_neg_0 (void)
+{
+ /* XXSPLTIDP. */
+ return (vector long long) { 0x8000000000000000LL, 0x8000000000000000LL };
+}
+
+/* 0x3ff0000000000000LL is the bit pattern for 1.0, which can be generated
+ with XXSPLTIDP. */
+vector long long
+vector_float_1_0 (void)
+{
+ /* XXSPLTIDP. */
+ return (vector long long) { 0x3ff0000000000000LL, 0x3ff0000000000000LL };
+}
+
+/* 0x400921fb54442d18LL is the bit pattern for PI, which cannot be generated
+ with XXSPLTIDP. */
+vector long long
+scalar_pi (void)
+{
+ /* PLXV. */
+ return (vector long long) { 0x400921fb54442d18LL, 0x400921fb54442d18LL };
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 2 } } */
* [gcc(refs/users/meissner/heads/work070)] Generate XXSPLTIDP on power10.
@ 2021-10-05 21:11 Michael Meissner
From: Michael Meissner @ 2021-10-05 21:11 UTC (permalink / raw)
To: gcc-cvs
https://gcc.gnu.org/g:4544188d46ae061016d44d77dd2b568c48b36d0f
commit 4544188d46ae061016d44d77dd2b568c48b36d0f
Author: Michael Meissner <meissner@linux.ibm.com>
Date: Tue Oct 5 17:10:03 2021 -0400
Generate XXSPLTIDP on power10.
This patch implements XXSPLTIDP support for SF, DF, and DI scalar constants and
V2DF and V2DI vector constants. The XXSPLTIDP instruction is given a 32-bit
immediate that is converted to a vector of two DFmode constants. The immediate
is in SFmode format, so only constants that fit as SFmode values can be loaded
with XXSPLTIDP.
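To make the rule above concrete, here is a minimal host-side sketch in C of the check it implies. This is not the code from the patch: the names sketch_xxspltidp_immediate and sketch_xxspltidp_immediate_from_bits are hypothetical, the host is assumed to use IEEE 754 float and double, and NaNs are simply rejected to keep the sketch short (the patch itself treats NaN differently for floating-point and integer constants).

```c
#include <float.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of GCC.  Return true and store the 32-bit
   single precision bit pattern in *IMM if VALUE looks loadable with
   XXSPLTIDP under the rules in the commit message.  */
static bool
sketch_xxspltidp_immediate (double value, uint32_t *imm)
{
  uint64_t dbits;
  memcpy (&dbits, &value, sizeof dbits);

  /* +0.0 is cheaper to create with other instructions.  */
  if (dbits == 0)
    return false;

  /* Simplification of this sketch only: reject every NaN.  */
  if (isnan (value))
    return false;

  /* Finite values beyond the float range cannot be represented.  */
  if (!isinf (value) && fabs (value) > FLT_MAX)
    return false;

  /* The value must survive a round trip through single precision.  */
  float f = (float) value;
  if ((double) f != value)
    return false;

  uint32_t fbits;
  memcpy (&fbits, &f, sizeof fbits);

  /* Single precision: 1 sign bit, 8 exponent bits, 23 mantissa bits.
     A zero exponent with a non-zero mantissa is a subnormal, which is
     not allowed as an XXSPLTIDP immediate.  */
  uint32_t exponent = (fbits >> 23) & 0xff;
  uint32_t mantissa = fbits & 0x7fffff;
  if (exponent == 0 && mantissa != 0)
    return false;

  *imm = fbits;
  return true;
}

/* Same check for a 64-bit integer constant, viewed as the bit pattern of
   a double (the DImode/V2DImode case).  */
static bool
sketch_xxspltidp_immediate_from_bits (uint64_t bits, uint32_t *imm)
{
  double value;
  memcpy (&value, &bits, sizeof value);
  return sketch_xxspltidp_immediate (value, imm);
}
```

With these helpers, 1.0 and the integer 0x3ff0000000000000 both yield the immediate 0x3f800000, while the double value of pi and the value 0x1p-149 are rejected, which matches the expectations written in the new tests.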
I added two new constraints (eF and eV) to match scalar and vector constants
that can be loaded with the XXSPLTIDP instruction.
I have added a temporary switch (-mxxspltidp) to control whether or not the
XXSPLTIDP instruction is generated.
I added 5 new tests to test loading up SF/DF/DI scalar and V2DI/V2DF vector
constants.
2021-10-05 Michael Meissner <meissner@linux.ibm.com>
gcc/
* config/rs6000/constraints.md (eF): New constraint.
(eV): New constraint.
* config/rs6000/predicates.md (easy_fp_constant): If we can load
the scalar constant with XXSPLTIDP, the constant is easy.
(easy_fp_constant_64bit_scalar): New predicate.
(easy_vector_constant_64bit_element): New predicate.
(easy_vector_constant): If we can generate XXSPLTIDP, mark the
vector constant as easy.
* config/rs6000/rs6000-protos.h (xxspltidp_constant_immediate): New
declaration.
(prefixed_xxsplti_p): Likewise.
* config/rs6000/rs6000.c (xxspltidp_constant_immediate): New function.
(output_vec_const_move): Add support for XXSPLTIDP.
(prefixed_xxsplti_p): New function.
* config/rs6000/rs6000.md (prefixed attribute): Add support for the
xxsplti* prefixed instructions.
(movsf_hardfloat): Add XXSPLTIDP support.
(mov<mode>_hardfloat32, FMOVE64 iterator): Likewise.
(mov<mode>_hardfloat64, FMOVE64 iterator): Likewise.
(movdi_internal32): Likewise.
(movdi_internal64): Likewise.
* config/rs6000/rs6000.opt (-mxxspltidp): New switch.
* config/rs6000/vsx.md (vsx_move<mode>_64bit): Add XXSPLTIDP
support.
(vsx_move<mode>_32bit): Likewise.
(XXSPLTIDP_S): New mode iterator.
(XXSPLTIDP_V): Likewise.
(XXSPLTIDP): Likewise.
(xxspltidp_<mode>_inst): Replace xxspltidp_v2df_inst with an
iterated form that also does SFmode, DFmode, DImode, and
V2DImode.
(xxspltidp_<mode>_internal): New insn and splits.
* doc/md.texi (PowerPC and IBM RS6000 constraints): Document the
eF and eV constraints.
gcc/testsuite/
* gcc.target/powerpc/vec-splat-constant-df.c: New test.
* gcc.target/powerpc/vec-splat-constant-di.c: New test.
* gcc.target/powerpc/vec-splat-constant-sf.c: New test.
* gcc.target/powerpc/vec-splat-constant-v2df.c: New test.
* gcc.target/powerpc/vec-splat-constant-v2di.c: New test.
Diff:
---
gcc/config/rs6000/constraints.md | 10 ++
gcc/config/rs6000/predicates.md | 140 +++++++++++++++++++++
gcc/config/rs6000/rs6000-protos.h | 2 +
gcc/config/rs6000/rs6000.c | 96 ++++++++++++++
gcc/config/rs6000/rs6000.md | 58 ++++++---
gcc/config/rs6000/rs6000.opt | 4 +
gcc/config/rs6000/vsx.md | 60 ++++++++-
gcc/doc/md.texi | 6 +
.../gcc.target/powerpc/vec-splat-constant-df.c | 60 +++++++++
.../gcc.target/powerpc/vec-splat-constant-di.c | 70 +++++++++++
.../gcc.target/powerpc/vec-splat-constant-sf.c | 60 +++++++++
.../gcc.target/powerpc/vec-splat-constant-v2df.c | 64 ++++++++++
.../gcc.target/powerpc/vec-splat-constant-v2di.c | 50 ++++++++
13 files changed, 658 insertions(+), 22 deletions(-)
diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index c8cff1a3038..1ff46c9f4fc 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -208,11 +208,21 @@
(and (match_code "const_int")
(match_test "((- (unsigned HOST_WIDE_INT) ival) + 0x8000) < 0x10000")))
+;; DI/SF/DF scalar constant that can be loaded with the XXSPLTIDP instruction.
+(define_constraint "eF"
+ "A 64-bit scalar constant that can be loaded with the XXSPLTIDP instruction."
+ (match_operand 0 "easy_fp_constant_64bit_scalar"))
+
;; 34-bit signed integer constant
(define_constraint "eI"
"A signed 34-bit integer constant if prefixed instructions are supported."
(match_operand 0 "cint34_operand"))
+;; V2DI/V2DF vector constant that can be loaded with the XXSPLTIDP instruction.
+(define_constraint "eV"
+ "A 128-bit vector constant that can be loaded with the XXSPLTIDP instruction."
+ (match_operand 0 "easy_vector_constant_64bit_element"))
+
;; Floating-point constraints. These two are defined so that insn
;; length attributes can be calculated exactly.
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 956e42bc514..7544ac87700 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -601,6 +601,11 @@
if (TARGET_VSX && op == CONST0_RTX (mode))
return 1;
+ /* If we have the ISA 3.1 XXSPLTIDP instruction, see if the constant can
+ be loaded with that instruction. */
+ if (easy_fp_constant_64bit_scalar (op, mode))
+ return 1;
+
/* Otherwise consider floating point constants hard, so that the
constant gets pushed to memory during the early RTL phases. This
has the advantage that double precision constants that can be
@@ -609,6 +614,138 @@
return 0;
})
+;; Return 1 if the operand is a 64-bit scalar constant that can be loaded via
+;; the XXSPLTIDP instruction, which takes a SFmode value and produces a V2DF or
+;; V2DI mode result that is interpreted as a 64-bit scalar.
+(define_predicate "easy_fp_constant_64bit_scalar"
+ (match_code "const_int,const_double")
+{
+ const REAL_VALUE_TYPE *rv;
+ REAL_VALUE_TYPE rv_type;
+
+ /* Can we do the XXSPLTIDP instruction? */
+ if (!TARGET_XXSPLTIDP || !TARGET_PREFIXED || !TARGET_VSX)
+ return false;
+
+ if (mode == VOIDmode)
+ mode = GET_MODE (op);
+
+ /* Don't return true for 0.0 or 0 since that is easy to create without
+ XXSPLTIDP. */
+ if (op == CONST0_RTX (mode))
+ return false;
+
+ /* Handle DImode by creating a DF value from it. */
+ if (CONST_INT_P (op) && (mode == DImode || mode == VOIDmode))
+ {
+ HOST_WIDE_INT df_value = INTVAL (op);
+
+ /* Avoid values that look like DFmode NaN's. The IEEE 754 64-bit
+ floating format has 1 bit for sign, 11 bits for the exponent,
+ and 52 bits for the mantissa. NaN values have the exponent set
+ to all 1 bits, and the mantissa non-zero (mantissa == 0 is
+ infinity). */
+ int df_exponent = (df_value >> 52) & 0x7ff;
+ HOST_WIDE_INT df_mantissa = df_value & HOST_WIDE_INT_C (0xfffffffffffff);
+
+ if (df_exponent == 0x7ff && df_mantissa != 0) /* NaN. */
+ return false;
+
+ /* Avoid values that are DFmode subnormal values. Subnormal numbers
+ have the exponent all 0 bits, and the mantissa non-zero. If the
+ value is subnormal, then the hidden bit in the mantissa is not
+ set. */
+ if (df_exponent == 0 && df_mantissa != 0) /* subnormal. */
+ return false;
+
+ long df_words[2];
+ df_words[0] = (df_value >> 32) & 0xffffffff;
+ df_words[1] = df_value & 0xffffffff;
+
+ /* real_from_target takes the target words in target order. */
+ if (!BYTES_BIG_ENDIAN)
+ std::swap (df_words[0], df_words[1]);
+
+ real_from_target (&rv_type, df_words, DFmode);
+ rv = &rv_type;
+ }
+
+ /* Handle SFmode/DFmode constants. Don't allow decimal or IEEE 128-bit
+ binary constants. */
+ else if (CONST_DOUBLE_P (op) && (mode == SFmode || mode == DFmode))
+ rv = CONST_DOUBLE_REAL_VALUE (op);
+
+ /* We can't handle anything else with the XXSPLTIDP instruction. */
+ else
+ return false;
+
+ /* Validate that the number can be stored as a SFmode value. */
+ if (!exact_real_truncate (SFmode, rv))
+ return false;
+
+ /* Validate that the number is not a SFmode subnormal value (exponent is 0,
+ mantissa field is non-zero) which is undefined for the XXSPLTIDP
+ instruction. */
+ long sf_value;
+ real_to_target (&sf_value, rv, SFmode);
+
+ /* IEEE 754 32-bit values have 1 bit for the sign, 8 bits for the exponent,
+ and 23 bits for the mantissa. Subnormal numbers have the exponent all
+ 0 bits, and the mantissa non-zero. */
+ long sf_exponent = (sf_value >> 23) & 0xFF;
+ long sf_mantissa = sf_value & 0x7FFFFF;
+
+ if (sf_exponent == 0 && sf_mantissa != 0)
+ return false;
+
+ return true;
+})
+
+;; Return 1 if the operand is a vector constant with 64-bit elements that can
+;; be loaded via the XXSPLTIDP instruction, which takes a SFmode value and
+;; produces a V2DFmode or V2DImode result.
+;;
+;; We cannot combine the scalar and vector cases, because a combined predicate
+;; could accept an integer constant that is assigned to a TImode value, e.g.:
+;;
+;; (set (reg:TI 32)
+;; (const_int 0x8000000000000000))
+;;
+;; In that case the constant would be splatted into both 64-bit positions in
+;; the vector register, instead of being loaded with the upper 64 bits zero
+;; and the constant in the lower 64 bits.
+
+(define_predicate "easy_vector_constant_64bit_element"
+ (match_code "const_vector,vec_duplicate")
+{
+ /* Can we do the XXSPLTIDP instruction? */
+ if (!TARGET_XXSPLTIDP || !TARGET_PREFIXED || !TARGET_VSX)
+ return false;
+
+ if (mode == VOIDmode)
+ mode = GET_MODE (op);
+
+ if (mode != V2DFmode && mode != V2DImode)
+ return false;
+
+ if (CONST_VECTOR_P (op))
+ {
+ if (!CONST_VECTOR_DUPLICATE_P (op))
+ return false;
+
+ op = CONST_VECTOR_ELT (op, 0);
+ }
+
+ else if (GET_CODE (op) == VEC_DUPLICATE)
+ op = XEXP (op, 0);
+
+ else
+ return false;
+
+ return easy_fp_constant_64bit_scalar (op, GET_MODE_INNER (mode));
+})
+
;; Return 1 if the operand is a constant that can loaded with a XXSPLTIB
;; instruction and then a VUPKHSB, VECSB2W or VECSB2D instruction.
@@ -653,6 +790,9 @@
if (zero_constant (op, mode) || all_ones_constant (op, mode))
return true;
+ if (easy_vector_constant_64bit_element (op, mode))
+ return true;
+
if (TARGET_P9_VECTOR
&& xxspltib_constant_p (op, mode, &num_insns, &value))
return true;
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 14f6b313105..e9be9c4d99f 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -32,6 +32,7 @@ extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, int, int, int,
extern int easy_altivec_constant (rtx, machine_mode);
extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *);
+extern long xxspltidp_constant_immediate (rtx, machine_mode);
extern int vspltis_shifted (rtx);
extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int);
extern bool macho_lo_sum_memory_operand (rtx, machine_mode);
@@ -198,6 +199,7 @@ enum non_prefixed_form reg_to_non_prefixed (rtx reg, machine_mode mode);
extern bool prefixed_load_p (rtx_insn *);
extern bool prefixed_store_p (rtx_insn *);
extern bool prefixed_paddi_p (rtx_insn *);
+extern bool prefixed_xxsplti_p (rtx_insn *);
extern void rs6000_asm_output_opcode (FILE *);
extern void output_pcrel_opt_reloc (rtx);
extern void rs6000_final_prescan_insn (rtx_insn *, rtx [], int);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index ad860728169..83d243269e3 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -6946,6 +6946,60 @@ xxspltib_constant_p (rtx op,
return true;
}
+/* Return the immediate value used in the XXSPLTIDP instruction. */
+
+long
+xxspltidp_constant_immediate (rtx op, machine_mode mode)
+{
+ long ret;
+
+ /* Handle vectors. */
+ if (CONST_VECTOR_P (op))
+ {
+ op = CONST_VECTOR_ELT (op, 0);
+ mode = GET_MODE_INNER (mode);
+ }
+
+ else if (GET_CODE (op) == VEC_DUPLICATE)
+ {
+ op = XEXP (op, 0);
+ mode = GET_MODE (op);
+ }
+
+ gcc_assert (easy_fp_constant_64bit_scalar (op, mode));
+
+ /* Handle DImode/V2DImode by creating a DF value from it and then converting
+ the DFmode value to SFmode. */
+ if (CONST_INT_P (op))
+ {
+ HOST_WIDE_INT df_value = INTVAL (op);
+ long df_words[2];
+
+ df_words[0] = (df_value >> 32) & 0xffffffff;
+ df_words[1] = df_value & 0xffffffff;
+
+ /* real_from_target takes input in target endian order. */
+ if (!BYTES_BIG_ENDIAN)
+ std::swap (df_words[0], df_words[1]);
+
+ REAL_VALUE_TYPE r;
+ real_from_target (&r, &df_words[0], DFmode);
+ real_to_target (&ret, &r, SFmode);
+ }
+
+ /* For floating point constants, convert to SFmode. */
+ else if (CONST_DOUBLE_P (op) && (mode == SFmode || mode == DFmode))
+ {
+ const REAL_VALUE_TYPE *rv = CONST_DOUBLE_REAL_VALUE (op);
+ real_to_target (&ret, rv, SFmode);
+ }
+
+ else
+ gcc_unreachable ();
+
+ return ret;
+}
+
const char *
output_vec_const_move (rtx *operands)
{
@@ -6990,6 +7044,13 @@ output_vec_const_move (rtx *operands)
gcc_unreachable ();
}
+ if (easy_fp_constant_64bit_scalar (vec, mode)
+ || easy_vector_constant_64bit_element (vec, mode))
+ {
+ operands[2] = GEN_INT (xxspltidp_constant_immediate (vec, mode));
+ return "xxspltidp %x0,%2";
+ }
+
if (TARGET_P9_VECTOR
&& xxspltib_constant_p (vec, mode, &num_insns, &xxspltib_value))
{
@@ -26724,6 +26785,41 @@ prefixed_paddi_p (rtx_insn *insn)
return (iform == INSN_FORM_PCREL_EXTERNAL || iform == INSN_FORM_PCREL_LOCAL);
}
+/* Whether a permute type instruction is a prefixed XXSPLTI* instruction.
+ This is called from the prefixed attribute processing. */
+
+bool
+prefixed_xxsplti_p (rtx_insn *insn)
+{
+ rtx set = single_set (insn);
+ if (!set)
+ return false;
+
+ rtx dest = SET_DEST (set);
+ rtx src = SET_SRC (set);
+ machine_mode mode = GET_MODE (dest);
+
+ if (!REG_P (dest) && !SUBREG_P (dest))
+ return false;
+
+ switch (mode)
+ {
+ case E_DImode:
+ case E_DFmode:
+ case E_SFmode:
+ return easy_fp_constant_64bit_scalar (src, mode);
+
+ case E_V2DImode:
+ case E_V2DFmode:
+ return easy_vector_constant_64bit_element (src, mode);
+
+ default:
+ break;
+ }
+
+ return false;
+}
+
/* Whether the next instruction needs a 'p' prefix issued before the
instruction is printed out. */
static bool prepend_p_to_next_insn;
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 6bec2bddbde..8afc4b2756d 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -314,6 +314,11 @@
(eq_attr "type" "integer,add")
(if_then_else (match_test "prefixed_paddi_p (insn)")
+ (const_string "yes")
+ (const_string "no"))
+
+ (eq_attr "type" "vecperm")
+ (if_then_else (match_test "prefixed_xxsplti_p (insn)")
(const_string "yes")
(const_string "no"))]
@@ -7759,17 +7764,17 @@
;;
;; LWZ LFS LXSSP LXSSPX STFS STXSSP
;; STXSSPX STW XXLXOR LI FMR XSCPSGNDP
-;; MR MT<x> MF<x> NOP
+;; MR MT<x> MF<x> NOP XXSPLTIDP
(define_insn "movsf_hardfloat"
[(set (match_operand:SF 0 "nonimmediate_operand"
"=!r, f, v, wa, m, wY,
Z, m, wa, !r, f, wa,
- !r, *c*l, !r, *h")
+ !r, *c*l, !r, *h, wa")
(match_operand:SF 1 "input_operand"
"m, m, wY, Z, f, v,
wa, r, j, j, f, wa,
- r, r, *h, 0"))]
+ r, r, *h, 0, eF"))]
"(register_operand (operands[0], SFmode)
|| register_operand (operands[1], SFmode))
&& TARGET_HARD_FLOAT
@@ -7791,15 +7796,16 @@
mr %0,%1
mt%0 %1
mf%1 %0
- nop"
+ nop
+ #"
[(set_attr "type"
"load, fpload, fpload, fpload, fpstore, fpstore,
fpstore, store, veclogical, integer, fpsimple, fpsimple,
- *, mtjmpr, mfjmpr, *")
+ *, mtjmpr, mfjmpr, *, vecperm")
(set_attr "isa"
"*, *, p9v, p8v, *, p9v,
p8v, *, *, *, *, *,
- *, *, *, *")])
+ *, *, *, *, p10")])
;; LWZ LFIWZX STW STFIWX MTVSRWZ MFVSRWZ
;; FMR MR MT%0 MF%1 NOP
@@ -8059,18 +8065,18 @@
;; STFD LFD FMR LXSD STXSD
;; LXSD STXSD XXLOR XXLXOR GPR<-0
-;; LWZ STW MR
+;; LWZ STW MR XXSPLTIDP
(define_insn "*mov<mode>_hardfloat32"
[(set (match_operand:FMOVE64 0 "nonimmediate_operand"
"=m, d, d, <f64_p9>, wY,
<f64_av>, Z, <f64_vsx>, <f64_vsx>, !r,
- Y, r, !r")
+ Y, r, !r, wa")
(match_operand:FMOVE64 1 "input_operand"
"d, m, d, wY, <f64_p9>,
Z, <f64_av>, <f64_vsx>, <zero_fp>, <zero_fp>,
- r, Y, r"))]
+ r, Y, r, eF"))]
"! TARGET_POWERPC64 && TARGET_HARD_FLOAT
&& (gpc_reg_operand (operands[0], <MODE>mode)
|| gpc_reg_operand (operands[1], <MODE>mode))"
@@ -8087,20 +8093,21 @@
#
#
#
+ #
#"
[(set_attr "type"
"fpstore, fpload, fpsimple, fpload, fpstore,
fpload, fpstore, veclogical, veclogical, two,
- store, load, two")
+ store, load, two, vecperm")
(set_attr "size" "64")
(set_attr "length"
"*, *, *, *, *,
*, *, *, *, 8,
- 8, 8, 8")
+ 8, 8, 8, *")
(set_attr "isa"
"*, *, *, p9v, p9v,
p7v, p7v, *, *, *,
- *, *, *")])
+ *, *, *, p10")])
;; STW LWZ MR G-const H-const F-const
@@ -8127,19 +8134,19 @@
;; STFD LFD FMR LXSD STXSD
;; LXSDX STXSDX XXLOR XXLXOR LI 0
;; STD LD MR MT{CTR,LR} MF{CTR,LR}
-;; NOP MFVSRD MTVSRD
+;; NOP MFVSRD MTVSRD XXSPLTIDP
(define_insn "*mov<mode>_hardfloat64"
[(set (match_operand:FMOVE64 0 "nonimmediate_operand"
"=m, d, d, <f64_p9>, wY,
<f64_av>, Z, <f64_vsx>, <f64_vsx>, !r,
YZ, r, !r, *c*l, !r,
- *h, r, <f64_dm>")
+ *h, r, <f64_dm>, wa")
(match_operand:FMOVE64 1 "input_operand"
"d, m, d, wY, <f64_p9>,
Z, <f64_av>, <f64_vsx>, <zero_fp>, <zero_fp>,
r, YZ, r, r, *h,
- 0, <f64_dm>, r"))]
+ 0, <f64_dm>, r, eF"))]
"TARGET_POWERPC64 && TARGET_HARD_FLOAT
&& (gpc_reg_operand (operands[0], <MODE>mode)
|| gpc_reg_operand (operands[1], <MODE>mode))"
@@ -8161,18 +8168,19 @@
mf%1 %0
nop
mfvsrd %0,%x1
- mtvsrd %x0,%1"
+ mtvsrd %x0,%1
+ #"
[(set_attr "type"
"fpstore, fpload, fpsimple, fpload, fpstore,
fpload, fpstore, veclogical, veclogical, integer,
store, load, *, mtjmpr, mfjmpr,
- *, mfvsr, mtvsr")
+ *, mfvsr, mtvsr, vecperm")
(set_attr "size" "64")
(set_attr "isa"
"*, *, *, p9v, p9v,
p7v, p7v, *, *, *,
*, *, *, *, *,
- *, p8v, p8v")])
+ *, p8v, p8v, p10")])
;; STD LD MR MT<SPR> MF<SPR> G-const
;; H-const F-const Special
@@ -9220,6 +9228,7 @@
;; a gpr into a fpr instead of reloading an invalid 'Y' address
;; GPR store GPR load GPR move FPR store FPR load FPR move
+;; XXSPLTIDP
;; GPR const AVX store AVX store AVX load AVX load VSX move
;; P9 0 P9 -1 AVX 0/-1 VSX 0 VSX -1 P9 const
;; AVX const
@@ -9227,11 +9236,13 @@
(define_insn "*movdi_internal32"
[(set (match_operand:DI 0 "nonimmediate_operand"
"=Y, r, r, m, ^d, ^d,
+ ^wa,
r, wY, Z, ^v, $v, ^wa,
wa, wa, v, wa, *i, v,
v")
(match_operand:DI 1 "input_operand"
"r, Y, r, ^d, m, ^d,
+ eF,
IJKnF, ^v, $v, wY, Z, ^wa,
Oj, wM, OjwM, Oj, wM, wS,
wB"))]
@@ -9246,6 +9257,7 @@
lfd%U1%X1 %0,%1
fmr %0,%1
#
+ #
stxsd %1,%0
stxsdx %x1,%y0
lxsd %0,%1
@@ -9260,17 +9272,20 @@
#"
[(set_attr "type"
"store, load, *, fpstore, fpload, fpsimple,
+ vecperm,
*, fpstore, fpstore, fpload, fpload, veclogical,
vecsimple, vecsimple, vecsimple, veclogical,veclogical,vecsimple,
vecsimple")
(set_attr "size" "64")
(set_attr "length"
"8, 8, 8, *, *, *,
+ *,
16, *, *, *, *, *,
*, *, *, *, *, 8,
*")
(set_attr "isa"
"*, *, *, *, *, *,
+ p10,
*, p9v, p7v, p9v, p7v, *,
p9v, p9v, p7v, *, *, p7v,
p7v")])
@@ -9306,6 +9321,7 @@
})
;; GPR store GPR load GPR move
+;; XXSPLTIDP
;; GPR li GPR lis GPR pli GPR #
;; FPR store FPR load FPR move
;; AVX store AVX store AVX load AVX load VSX move
@@ -9316,6 +9332,7 @@
(define_insn "*movdi_internal64"
[(set (match_operand:DI 0 "nonimmediate_operand"
"=YZ, r, r,
+ ^wa,
r, r, r, r,
m, ^d, ^d,
wY, Z, $v, $v, ^wa,
@@ -9325,6 +9342,7 @@
?r, ?wa")
(match_operand:DI 1 "input_operand"
"r, YZ, r,
+ eF,
I, L, eI, nF,
^d, m, ^d,
^v, $v, wY, Z, ^wa,
@@ -9339,6 +9357,7 @@
std%U0%X0 %1,%0
ld%U1%X1 %0,%1
mr %0,%1
+ #
li %0,%1
lis %0,%v1
li %0,%1
@@ -9365,6 +9384,7 @@
mtvsrd %x0,%1"
[(set_attr "type"
"store, load, *,
+ vecperm,
*, *, *, *,
fpstore, fpload, fpsimple,
fpstore, fpstore, fpload, fpload, veclogical,
@@ -9375,6 +9395,7 @@
(set_attr "size" "64")
(set_attr "length"
"*, *, *,
+ *,
*, *, *, 20,
*, *, *,
*, *, *, *, *,
@@ -9384,6 +9405,7 @@
*, *")
(set_attr "isa"
"*, *, *,
+ p10,
*, *, p10, *,
*, *, *,
p9v, p7v, p9v, p7v, *,
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 9d7878f144a..1d7ce4cc94a 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -640,6 +640,10 @@ mprivileged
Target Var(rs6000_privileged) Init(0)
Generate code that will run in privileged state.
+mxxspltidp
+Target Undocumented Var(TARGET_XXSPLTIDP) Init(1) Save
+Generate (do not generate) XXSPLTIDP instructions.
+
-param=rs6000-density-pct-threshold=
Target Undocumented Joined UInteger Var(rs6000_density_pct_threshold) Init(85) IntegerRange(0, 100) Param
When costing for loop vectorization, we probably need to penalize the loop body
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index bf033e31c1c..fa33c9d9fbf 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -1191,16 +1191,19 @@
;; instruction). But generate XXLXOR/XXLORC if it will avoid a register move.
;; VSX store VSX load VSX move VSX->GPR GPR->VSX LQ (GPR)
+;; XXSPLTIDP
;; STQ (GPR) GPR load GPR store GPR move XXSPLTIB VSPLTISW
;; VSX 0/-1 VMX const GPR const LVX (VMX) STVX (VMX)
(define_insn "vsx_mov<mode>_64bit"
[(set (match_operand:VSX_M 0 "nonimmediate_operand"
"=ZwO, wa, wa, r, we, ?wQ,
+ wa,
?&r, ??r, ??Y, <??r>, wa, v,
?wa, v, <??r>, wZ, v")
(match_operand:VSX_M 1 "input_operand"
"wa, ZwO, wa, we, r, r,
+ eV,
wQ, Y, r, r, wE, jwM,
?jwM, W, <nW>, v, wZ"))]
@@ -1212,36 +1215,44 @@
}
[(set_attr "type"
"vecstore, vecload, vecsimple, mtvsr, mfvsr, load,
+ vecperm,
store, load, store, *, vecsimple, vecsimple,
vecsimple, *, *, vecstore, vecload")
(set_attr "num_insns"
"*, *, *, 2, *, 2,
+ *,
2, 2, 2, 2, *, *,
*, 5, 2, *, *")
(set_attr "max_prefixed_insns"
"*, *, *, *, *, 2,
+ *,
2, 2, 2, 2, *, *,
*, *, *, *, *")
(set_attr "length"
"*, *, *, 8, *, 8,
+ *,
8, 8, 8, 8, *, *,
*, 20, 8, *, *")
(set_attr "isa"
"<VSisa>, <VSisa>, <VSisa>, *, *, *,
+ p10,
*, *, *, *, p9v, *,
<VSisa>, *, *, *, *")])
;; VSX store VSX load VSX move GPR load GPR store GPR move
+;; XXSPLTIDP
;; XXSPLTIB VSPLTISW VSX 0/-1 VMX const GPR const
;; LVX (VMX) STVX (VMX)
(define_insn "*vsx_mov<mode>_32bit"
[(set (match_operand:VSX_M 0 "nonimmediate_operand"
"=ZwO, wa, wa, ??r, ??Y, <??r>,
+ wa,
wa, v, ?wa, v, <??r>,
wZ, v")
(match_operand:VSX_M 1 "input_operand"
"wa, ZwO, wa, Y, r, r,
+ eV,
wE, jwM, ?jwM, W, <nW>,
v, wZ"))]
@@ -1253,14 +1264,17 @@
}
[(set_attr "type"
"vecstore, vecload, vecsimple, load, store, *,
+ vecperm,
vecsimple, vecsimple, vecsimple, *, *,
vecstore, vecload")
(set_attr "length"
"*, *, *, 16, 16, 16,
+ *,
*, *, *, 20, 16,
*, *")
(set_attr "isa"
"<VSisa>, <VSisa>, <VSisa>, *, *, *,
+ p10,
p9v, *, <VSisa>, *, *,
*, *")])
@@ -6449,15 +6463,53 @@
DONE;
})
-(define_insn "xxspltidp_v2df_inst"
- [(set (match_operand:V2DF 0 "register_operand" "=wa")
- (unspec:V2DF [(match_operand:SI 1 "c32bit_cint_operand" "n")]
- UNSPEC_XXSPLTIDP))]
+(define_mode_iterator XXSPLTIDP_S [DI SF DF])
+(define_mode_iterator XXSPLTIDP_V [V2DF V2DI])
+(define_mode_iterator XXSPLTIDP [DI SF DF V2DF V2DI])
+
+(define_insn "xxspltidp_<mode>_inst"
+ [(set (match_operand:XXSPLTIDP 0 "register_operand" "=wa")
+ (unspec:XXSPLTIDP [(match_operand:SI 1 "c32bit_cint_operand" "n")]
+ UNSPEC_XXSPLTIDP))]
"TARGET_POWER10"
"xxspltidp %x0,%1"
[(set_attr "type" "vecperm")
(set_attr "prefixed" "yes")])
+;; Generate the XXSPLTIDP instruction to support SFmode, DFmode, and DImode
+;; scalar constants and V2DF and V2DI vector constants where both elements are
+;; the same. The constant has to be expressible as a SFmode constant that is
+;; not a SFmode denormal value.
+(define_insn_and_split "*xxspltidp_<mode>_internal"
+ [(set (match_operand:XXSPLTIDP_S 0 "vsx_register_operand" "=wa")
+ (match_operand:XXSPLTIDP_S 1 "easy_fp_constant_64bit_scalar" "eF"))]
+ "TARGET_POWER10"
+ "#"
+ "&& 1"
+ [(set (match_dup 0)
+ (unspec:XXSPLTIDP_S [(match_dup 2)] UNSPEC_XXSPLTIDP))]
+{
+ long immediate = xxspltidp_constant_immediate (operands[1], <MODE>mode);
+ operands[2] = GEN_INT (immediate);
+}
+ [(set_attr "type" "vecperm")
+ (set_attr "prefixed" "yes")])
+
+(define_insn_and_split "*xxspltidp_<mode>_internal"
+ [(set (match_operand:XXSPLTIDP_V 0 "vsx_register_operand" "=wa")
+ (match_operand:XXSPLTIDP_V 1 "easy_vector_constant_64bit_element" "eV"))]
+ "TARGET_POWER10"
+ "#"
+ "&& 1"
+ [(set (match_dup 0)
+ (unspec:XXSPLTIDP_V [(match_dup 2)] UNSPEC_XXSPLTIDP))]
+{
+ long immediate = xxspltidp_constant_immediate (operands[1], <MODE>mode);
+ operands[2] = GEN_INT (immediate);
+}
+ [(set_attr "type" "vecperm")
+ (set_attr "prefixed" "yes")])
+
;; XXSPLTI32DX built-in function support
(define_expand "xxsplti32dx_v4si"
[(set (match_operand:V4SI 0 "register_operand" "=wa")
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 2b41cb7fb7b..5035a3fd604 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -3333,9 +3333,15 @@ The integer constant zero.
A constant whose negation is a signed 16-bit constant.
@end ifset
+@item eF
+A 64-bit scalar constant that can be loaded with the XXSPLTIDP instruction.
+
@item eI
A signed 34-bit integer constant if prefixed instructions are supported.
+@item eV
+A 128-bit vector constant that can be loaded with the XXSPLTIDP instruction.
+
@ifset INTERNALS
@item G
A floating point constant that can be loaded into a register with one
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
new file mode 100644
index 00000000000..8f6e176f9af
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating DFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+ instruction. */
+
+double
+scalar_double_0 (void)
+{
+ return 0.0; /* XXSPLTIB or XXLXOR. */
+}
+
+double
+scalar_double_1 (void)
+{
+ return 1.0; /* XXSPLTIDP. */
+}
+
+#ifndef __FAST_MATH__
+double
+scalar_double_m0 (void)
+{
+ return -0.0; /* XXSPLTIDP. */
+}
+
+double
+scalar_double_nan (void)
+{
+ return __builtin_nan (""); /* XXSPLTIDP. */
+}
+
+double
+scalar_double_inf (void)
+{
+ return __builtin_inf (); /* XXSPLTIDP. */
+}
+
+double
+scalar_double_m_inf (void) /* XXSPLTIDP. */
+{
+ return - __builtin_inf ();
+}
+#endif
+
+double
+scalar_double_pi (void)
+{
+ return M_PI; /* PLFD. */
+}
+
+double
+scalar_double_denorm (void)
+{
+ return 0x1p-149f; /* PLFD. */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-di.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-di.c
new file mode 100644
index 00000000000..75714d0b11d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-di.c
@@ -0,0 +1,70 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+/* Test generating DImode constants that have the same bit pattern as DFmode
+ constants that can be loaded with the XXSPLTIDP instruction with the ISA 3.1
+ (power10). We use asm to force the value into vector registers. */
+
+double
+scalar_0 (void)
+{
+ /* XXSPLTIB or XXLXOR. */
+ double d;
+ long long ll = 0;
+
+ __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll));
+ return d;
+}
+
+double
+scalar_1 (void)
+{
+ /* VSPLTISW/VUPKLSW or XXSPLTIB/VEXTSB2D. */
+ double d;
+ long long ll = 1;
+
+ __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll));
+ return d;
+}
+
+/* 0x8000000000000000LL is the bit pattern for -0.0, which can be generated
+ with XXSPLTIDP. */
+double
+scalar_float_neg_0 (void)
+{
+ /* XXSPLTIDP. */
+ double d;
+ long long ll = 0x8000000000000000LL;
+
+ __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll));
+ return d;
+}
+
+/* 0x3ff0000000000000LL is the bit pattern for 1.0 which can be generated with
+ XXSPLTIDP. */
+double
+scalar_float_1_0 (void)
+{
+ /* XXSPLTIDP. */
+ double d;
+ long long ll = 0x3ff0000000000000LL;
+
+ __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll));
+ return d;
+}
+
+/* 0x400921fb54442d18LL is the bit pattern for PI, which cannot be generated
+ with XXSPLTIDP. */
+double
+scalar_pi (void)
+{
+ /* PLXV. */
+ double d;
+ long long ll = 0x400921fb54442d18LL;
+
+ __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll));
+ return d;
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 2 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
new file mode 100644
index 00000000000..72504bdfbbd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating SFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+ instruction. */
+
+float
+scalar_float_0 (void)
+{
+ return 0.0f; /* XXSPLTIB or XXLXOR. */
+}
+
+float
+scalar_float_1 (void)
+{
+ return 1.0f; /* XXSPLTIDP. */
+}
+
+#ifndef __FAST_MATH__
+float
+scalar_float_m0 (void)
+{
+ return -0.0f; /* XXSPLTIDP. */
+}
+
+float
+scalar_float_nan (void)
+{
+ return __builtin_nanf (""); /* XXSPLTIDP. */
+}
+
+float
+scalar_float_inf (void)
+{
+ return __builtin_inff (); /* XXSPLTIDP. */
+}
+
+float
+scalar_float_m_inf (void) /* XXSPLTIDP. */
+{
+ return - __builtin_inff ();
+}
+#endif
+
+float
+scalar_float_pi (void)
+{
+ return (float)M_PI; /* XXSPLTIDP. */
+}
+
+float
+scalar_float_denorm (void)
+{
+ return 0x1p-149f; /* PLFS. */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 6 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
new file mode 100644
index 00000000000..82ffc86f8aa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
@@ -0,0 +1,64 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating V2DFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+ instruction. */
+
+vector double
+v2df_double_0 (void)
+{
+ return (vector double) { 0.0, 0.0 }; /* XXSPLTIB or XXLXOR. */
+}
+
+vector double
+v2df_double_1 (void)
+{
+ return (vector double) { 1.0, 1.0 }; /* XXSPLTIDP. */
+}
+
+#ifndef __FAST_MATH__
+vector double
+v2df_double_m0 (void)
+{
+ return (vector double) { -0.0, -0.0 }; /* XXSPLTIDP. */
+}
+
+vector double
+v2df_double_nan (void)
+{
+ return (vector double) { __builtin_nan (""),
+ __builtin_nan ("") }; /* XXSPLTIDP. */
+}
+
+vector double
+v2df_double_inf (void)
+{
+ return (vector double) { __builtin_inf (),
+ __builtin_inf () }; /* XXSPLTIDP. */
+}
+
+vector double
+v2df_double_m_inf (void)
+{
+ return (vector double) { - __builtin_inf (),
+ - __builtin_inf () }; /* XXSPLTIDP. */
+}
+#endif
+
+vector double
+v2df_double_pi (void)
+{
+ return (vector double) { M_PI, M_PI }; /* PLXV. */
+}
+
+vector double
+v2df_double_denorm (void)
+{
+ return (vector double) { (double)0x1p-149f,
+ (double)0x1p-149f }; /* PLXV. */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2di.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2di.c
new file mode 100644
index 00000000000..4d44f943d26
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2di.c
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+/* Test generating V2DImode constants that have the same bit pattern as
+ V2DFmode constants that can be loaded with the XXSPLTIDP instruction with
+ the ISA 3.1 (power10). */
+
+vector long long
+vector_0 (void)
+{
+ /* XXSPLTIB or XXLXOR. */
+ return (vector long long) { 0LL, 0LL };
+}
+
+vector long long
+vector_1 (void)
+{
+ /* XXSPLTIB and VEXTSB2D. */
+ return (vector long long) { 1LL, 1LL };
+}
+
+/* 0x8000000000000000LL is the bit pattern for -0.0, which can be generated
+ with XXSPLTIDP. */
+vector long long
+vector_float_neg_0 (void)
+{
+ /* XXSPLTIDP. */
+ return (vector long long) { 0x8000000000000000LL, 0x8000000000000000LL };
+}
+
+/* 0x3ff0000000000000LL is the bit pattern for 1.0 which can be generated with
+ XXSPLTIDP. */
+vector long long
+vector_float_1_0 (void)
+{
+ /* XXSPLTIDP. */
+ return (vector long long) { 0x3ff0000000000000LL, 0x3ff0000000000000LL };
+}
+
+/* 0x400921fb54442d18LL is the bit pattern for PI, which cannot be generated
+ with XXSPLTIDP. */
+vector long long
+scalar_pi (void)
+{
+ /* PLXV. */
+ return (vector long long) { 0x400921fb54442d18LL, 0x400921fb54442d18LL };
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 2 } } */
* [gcc(refs/users/meissner/heads/work070)] Generate XXSPLTIDP on power10.
@ 2021-10-04 20:17 Michael Meissner
0 siblings, 0 replies; 5+ messages in thread
From: Michael Meissner @ 2021-10-04 20:17 UTC (permalink / raw)
To: gcc-cvs
https://gcc.gnu.org/g:f62b92fd93d8124fe9773cd3202de6bbf117d656
commit f62b92fd93d8124fe9773cd3202de6bbf117d656
Author: Michael Meissner <meissner@linux.ibm.com>
Date: Mon Oct 4 16:16:58 2021 -0400
Generate XXSPLTIDP on power10.
This patch implements XXSPLTIDP support for SF, DF, and DI scalar constants and
V2DF and V2DI vector constants. The XXSPLTIDP instruction is given a 32-bit
immediate that is converted to a vector of two DFmode constants. The immediate
is in SFmode format, so only constants that fit as SFmode values can be loaded
with XXSPLTIDP.
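The eligibility rule described above can be illustrated with a small stand-alone sketch in plain C. This is not code from the patch: the helper name xxspltidp_immediate_sketch is hypothetical, it works on host double values rather than GCC's REAL_VALUE_TYPE, and as a simplification it rejects every NaN, whereas the patch only rejects NaN bit patterns in the DImode case. It accepts a value only if it converts to single precision exactly and the result is not a single-precision subnormal, and it returns the single-precision bit pattern that would serve as the 32-bit immediate.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of the patch).  Return true and store the
   32-bit single-precision bit pattern in *IMM if D could be loaded with a
   single XXSPLTIDP under the rules in the commit message.  */
static bool
xxspltidp_immediate_sketch (double d, uint32_t *imm)
{
  assert (imm != NULL);

  /* Simplification for this sketch: reject every NaN.  */
  if (isnan (d))
    return false;

  /* +0.0 is cheaper to create without XXSPLTIDP (all-zero bits).  */
  if (d == 0.0 && !signbit (d))
    return false;

  /* Finite values beyond the float range cannot convert exactly.  */
  if (!isinf (d) && fabs (d) > FLT_MAX)
    return false;

  /* The value must survive a round trip through single precision.  */
  float f = (float) d;
  if ((double) f != d)
    return false;

  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);

  /* IEEE 754 binary32: 1 sign bit, 8 exponent bits, 23 mantissa bits.
     Exponent 0 with a non-zero mantissa is a subnormal, which the
     commit treats as not loadable.  */
  uint32_t exponent = (bits >> 23) & 0xff;
  uint32_t mantissa = bits & 0x7fffff;
  if (exponent == 0 && mantissa != 0)
    return false;

  *imm = bits;
  return true;
}
```

With this sketch, 1.0 and -0.0 are accepted, while the double-precision value of pi and the smallest single-precision subnormal (0x1p-149) are rejected, which matches the expectations encoded in the new tests.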
I added two new constraints (eF and eV) to match scalar and vector constants
that can be loaded with the XXSPLTIDP instruction.
I have added a temporary switch (-mxxspltidp) to control whether or not the
XXSPLTIDP instruction is generated.
I added 5 new tests to test loading up SF/DF/DI scalar and V2DI/V2DF vector
constants.
2021-10-04 Michael Meissner <meissner@linux.ibm.com>
gcc/
* config/rs6000/constraints.md (eF): New constraint.
(eV): New constraint.
* config/rs6000/predicates.md (easy_fp_constant): If we can load
the scalar constant with XXSPLTIDP, the constant is easy.
(easy_fp_constant_64bit_scalar): New predicate.
(easy_vector_constant_64bit_element): New predicate.
(easy_vector_constant): If we can generate XXSPLTIDP, mark the
vector constant as easy.
* config/rs6000/rs6000-protos.h (xxspltidp_constant_immediate): New
declaration.
(prefixed_xxsplti_p): Likewise.
* config/rs6000/rs6000.c (xxspltidp_constant_immediate): New function.
(output_vec_const_move): Add support for XXSPLTIDP.
(prefixed_xxsplti_p): New function.
* config/rs6000/rs6000.md (prefixed attribute): Add support for the
xxsplti* prefixed instructions.
(movsf_hardfloat): Add XXSPLTIDP support.
(mov<mode>_hardfloat32, FMOVE64 iterator): Likewise.
(mov<mode>_hardfloat64, FMOVE64 iterator): Likewise.
(movdi_internal32): Likewise.
(movdi_internal64): Likewise.
* config/rs6000/rs6000.opt (-mxxspltidp): New switch.
* config/rs6000/vsx.md (vsx_mov<mode>_64bit): Add XXSPLTIDP
support.
(vsx_mov<mode>_32bit): Likewise.
(XXSPLTIDP_S): New mode iterator.
(XXSPLTIDP_V): Likewise.
(XXSPLTIDP): Likewise.
(xxspltidp_<mode>_inst): Replace xxspltidp_v2df_inst with an
iterated form that also does SFmode, DFmode, DImode, and
V2DImode.
(xxspltidp_<mode>_internal): New insn and splits.
* doc/md.texi (PowerPC and IBM RS6000 constraints): Document the
eF and eV constraints.
gcc/testsuite/
* gcc.target/powerpc/vec-splat-constant-df.c: New test.
* gcc.target/powerpc/vec-splat-constant-di.c: New test.
* gcc.target/powerpc/vec-splat-constant-sf.c: New test.
* gcc.target/powerpc/vec-splat-constant-v2df.c: New test.
* gcc.target/powerpc/vec-splat-constant-v2di.c: New test.
Diff:
---
gcc/config/rs6000/constraints.md | 10 ++
gcc/config/rs6000/predicates.md | 140 +++++++++++++++++++++
gcc/config/rs6000/rs6000-protos.h | 2 +
gcc/config/rs6000/rs6000.c | 96 ++++++++++++++
gcc/config/rs6000/rs6000.md | 58 ++++++---
gcc/config/rs6000/rs6000.opt | 4 +
gcc/config/rs6000/vsx.md | 60 ++++++++-
gcc/doc/md.texi | 6 +
.../gcc.target/powerpc/vec-splat-constant-df.c | 60 +++++++++
.../gcc.target/powerpc/vec-splat-constant-di.c | 70 +++++++++++
.../gcc.target/powerpc/vec-splat-constant-sf.c | 60 +++++++++
.../gcc.target/powerpc/vec-splat-constant-v2df.c | 64 ++++++++++
.../gcc.target/powerpc/vec-splat-constant-v2di.c | 50 ++++++++
13 files changed, 658 insertions(+), 22 deletions(-)
diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index c8cff1a3038..1ff46c9f4fc 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -208,11 +208,21 @@
(and (match_code "const_int")
(match_test "((- (unsigned HOST_WIDE_INT) ival) + 0x8000) < 0x10000")))
+;; DI/SF/DF scalar constant that can be loaded with the XXSPLTIDP instruction.
+(define_constraint "eF"
+ "A 64-bit scalar constant that can be loaded with the XXSPLTIDP instruction."
+ (match_operand 0 "easy_fp_constant_64bit_scalar"))
+
;; 34-bit signed integer constant
(define_constraint "eI"
"A signed 34-bit integer constant if prefixed instructions are supported."
(match_operand 0 "cint34_operand"))
+;; V2DI/V2DF vector constant that can be loaded with the XXSPLTIDP instruction.
+(define_constraint "eV"
+ "A 128-bit vector constant that can be loaded with the XXSPLTIDP instruction."
+ (match_operand 0 "easy_vector_constant_64bit_element"))
+
;; Floating-point constraints. These two are defined so that insn
;; length attributes can be calculated exactly.
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 956e42bc514..7544ac87700 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -601,6 +601,11 @@
if (TARGET_VSX && op == CONST0_RTX (mode))
return 1;
+ /* If we have the ISA 3.1 XXSPLTIDP instruction, see if the constant can
+ be loaded with that instruction. */
+ if (easy_fp_constant_64bit_scalar (op, mode))
+ return 1;
+
/* Otherwise consider floating point constants hard, so that the
constant gets pushed to memory during the early RTL phases. This
has the advantage that double precision constants that can be
@@ -609,6 +614,138 @@
return 0;
})
+;; Return 1 if the operand is a 64-bit scalar constant that can be loaded via
+;; the XXSPLTIDP instruction, which takes a SFmode value and produces a V2DF or
+;; V2DI mode result that is interpreted as a 64-bit scalar.
+(define_predicate "easy_fp_constant_64bit_scalar"
+ (match_code "const_int,const_double")
+{
+ const REAL_VALUE_TYPE *rv;
+ REAL_VALUE_TYPE rv_type;
+
+ /* Can we do the XXSPLTIDP instruction? */
+ if (!TARGET_XXSPLTIDP || !TARGET_PREFIXED || !TARGET_VSX)
+ return false;
+
+ if (mode == VOIDmode)
+ mode = GET_MODE (op);
+
+ /* Don't return true for 0.0 or 0 since that is easy to create without
+ XXSPLTIDP. */
+ if (op == CONST0_RTX (mode))
+ return false;
+
+ /* Handle DImode by creating a DF value from it. */
+ if (CONST_INT_P (op) && (mode == DImode || mode == VOIDmode))
+ {
+ HOST_WIDE_INT df_value = INTVAL (op);
+
+ /* Avoid values that look like DFmode NaN's. The IEEE 754 64-bit
+ floating format has 1 bit for sign, 11 bits for the exponent,
+ and 52 bits for the mantissa. NaN values have the exponent set
+ to all 1 bits, and the mantissa non-zero (mantissa == 0 is
+ infinity). */
+ int df_exponent = (df_value >> 52) & 0x7ff;
+ HOST_WIDE_INT df_mantissa = df_value & HOST_WIDE_INT_C (0xfffffffffffff);
+
+ if (df_exponent == 0x7ff && df_mantissa != 0) /* NaN. */
+ return false;
+
+ /* Avoid values that are DFmode subnormal values. Subnormal numbers
+ have the exponent all 0 bits, and the mantissa non-zero. If the
+ value is subnormal, then the hidden bit in the mantissa is not
+ set. */
+ if (df_exponent == 0 && df_mantissa != 0) /* subnormal. */
+ return false;
+
+ long df_words[2];
+ df_words[0] = (df_value >> 32) & 0xffffffff;
+ df_words[1] = df_value & 0xffffffff;
+
+ /* real_from_target takes the target words in target order. */
+ if (!BYTES_BIG_ENDIAN)
+ std::swap (df_words[0], df_words[1]);
+
+ real_from_target (&rv_type, df_words, DFmode);
+ rv = &rv_type;
+ }
+
+ /* Handle SFmode/DFmode constants. Don't allow decimal or IEEE 128-bit
+ binary constants. */
+ else if (CONST_DOUBLE_P (op) && (mode == SFmode || mode == DFmode))
+ rv = CONST_DOUBLE_REAL_VALUE (op);
+
+ /* We can't handle anything else with the XXSPLTIDP instruction. */
+ else
+ return false;
+
+ /* Validate that the number can be stored as a SFmode value. */
+ if (!exact_real_truncate (SFmode, rv))
+ return false;
+
+ /* Validate that the number is not a SFmode subnormal value (exponent is 0,
+ mantissa field is non-zero) which is undefined for the XXSPLTIDP
+ instruction. */
+ long sf_value;
+ real_to_target (&sf_value, rv, SFmode);
+
+ /* IEEE 754 32-bit values have 1 bit for the sign, 8 bits for the exponent,
+ and 23 bits for the mantissa. Subnormal numbers have the exponent all
+ 0 bits, and the mantissa non-zero. */
+ long sf_exponent = (sf_value >> 23) & 0xFF;
+ long sf_mantissa = sf_value & 0x7FFFFF;
+
+ if (sf_exponent == 0 && sf_mantissa != 0)
+ return false;
+
+ return true;
+})
+
+;; Return 1 if the operand is a vector constant with 64-bit elements that can
+;; be loaded via the XXSPLTIDP instruction, which takes a SFmode value and
+;; produces a V2DFmode or V2DImode result.
+;;
+;; We cannot combine the scalar and vector cases, because that would cause
+;; problems when an appropriate integer constant is assigned to a TImode
+;; value, e.g.:
+;;
+;; (set (reg:TI 32)
+;; (const_int 0x8000000000000000))
+;;
+;; In that case, the constant would be splatted into both 64-bit positions in
+;; the vector register, instead of being loaded with the upper 64 bits zero and
+;; the constant in the lower 64 bits.
+
+(define_predicate "easy_vector_constant_64bit_element"
+ (match_code "const_vector,vec_duplicate")
+{
+ /* Can we do the XXSPLTIDP instruction? */
+ if (!TARGET_XXSPLTIDP || !TARGET_PREFIXED || !TARGET_VSX)
+ return false;
+
+ if (mode == VOIDmode)
+ mode = GET_MODE (op);
+
+ if (mode != V2DFmode && mode != V2DImode)
+ return false;
+
+ if (CONST_VECTOR_P (op))
+ {
+ if (!CONST_VECTOR_DUPLICATE_P (op))
+ return false;
+
+ op = CONST_VECTOR_ELT (op, 0);
+ }
+
+ else if (GET_CODE (op) == VEC_DUPLICATE)
+ op = XEXP (op, 0);
+
+ else
+ return false;
+
+ return easy_fp_constant_64bit_scalar (op, GET_MODE_INNER (mode));
+})
+
;; Return 1 if the operand is a constant that can loaded with a XXSPLTIB
;; instruction and then a VUPKHSB, VECSB2W or VECSB2D instruction.
@@ -653,6 +790,9 @@
if (zero_constant (op, mode) || all_ones_constant (op, mode))
return true;
+ if (easy_vector_constant_64bit_element (op, mode))
+ return true;
+
if (TARGET_P9_VECTOR
&& xxspltib_constant_p (op, mode, &num_insns, &value))
return true;
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 14f6b313105..e9be9c4d99f 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -32,6 +32,7 @@ extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, int, int, int,
extern int easy_altivec_constant (rtx, machine_mode);
extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *);
+extern long xxspltidp_constant_immediate (rtx, machine_mode);
extern int vspltis_shifted (rtx);
extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int);
extern bool macho_lo_sum_memory_operand (rtx, machine_mode);
@@ -198,6 +199,7 @@ enum non_prefixed_form reg_to_non_prefixed (rtx reg, machine_mode mode);
extern bool prefixed_load_p (rtx_insn *);
extern bool prefixed_store_p (rtx_insn *);
extern bool prefixed_paddi_p (rtx_insn *);
+extern bool prefixed_xxsplti_p (rtx_insn *);
extern void rs6000_asm_output_opcode (FILE *);
extern void output_pcrel_opt_reloc (rtx);
extern void rs6000_final_prescan_insn (rtx_insn *, rtx [], int);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index ad860728169..83d243269e3 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -6946,6 +6946,60 @@ xxspltib_constant_p (rtx op,
return true;
}
+/* Return the immediate value used in the XXSPLTIDP instruction. */
+
+long
+xxspltidp_constant_immediate (rtx op, machine_mode mode)
+{
+ long ret;
+
+ /* Handle vectors. */
+ if (CONST_VECTOR_P (op))
+ {
+ op = CONST_VECTOR_ELT (op, 0);
+ mode = GET_MODE_INNER (mode);
+ }
+
+ else if (GET_CODE (op) == VEC_DUPLICATE)
+ {
+ op = XEXP (op, 0);
+ mode = GET_MODE (op);
+ }
+
+ gcc_assert (easy_fp_constant_64bit_scalar (op, mode));
+
+ /* Handle DImode/V2DImode by creating a DF value from it and then converting
+ the DFmode value to SFmode. */
+ if (CONST_INT_P (op))
+ {
+ HOST_WIDE_INT df_value = INTVAL (op);
+ long df_words[2];
+
+ df_words[0] = (df_value >> 32) & 0xffffffff;
+ df_words[1] = df_value & 0xffffffff;
+
+ /* real_from_target takes input in target endian order. */
+ if (!BYTES_BIG_ENDIAN)
+ std::swap (df_words[0], df_words[1]);
+
+ REAL_VALUE_TYPE r;
+ real_from_target (&r, &df_words[0], DFmode);
+ real_to_target (&ret, &r, SFmode);
+ }
+
+ /* For floating point constants, convert to SFmode. */
+ else if (CONST_DOUBLE_P (op) && (mode == SFmode || mode == DFmode))
+ {
+ const REAL_VALUE_TYPE *rv = CONST_DOUBLE_REAL_VALUE (op);
+ real_to_target (&ret, rv, SFmode);
+ }
+
+ else
+ gcc_unreachable ();
+
+ return ret;
+}
+
const char *
output_vec_const_move (rtx *operands)
{
@@ -6990,6 +7044,13 @@ output_vec_const_move (rtx *operands)
gcc_unreachable ();
}
+ if (easy_fp_constant_64bit_scalar (vec, mode)
+ || easy_vector_constant_64bit_element (vec, mode))
+ {
+ operands[2] = GEN_INT (xxspltidp_constant_immediate (vec, mode));
+ return "xxspltidp %x0,%2";
+ }
+
if (TARGET_P9_VECTOR
&& xxspltib_constant_p (vec, mode, &num_insns, &xxspltib_value))
{
@@ -26724,6 +26785,41 @@ prefixed_paddi_p (rtx_insn *insn)
return (iform == INSN_FORM_PCREL_EXTERNAL || iform == INSN_FORM_PCREL_LOCAL);
}
+/* Whether a permute type instruction is a prefixed XXSPLTI* instruction.
+ This is called from the prefixed attribute processing. */
+
+bool
+prefixed_xxsplti_p (rtx_insn *insn)
+{
+ rtx set = single_set (insn);
+ if (!set)
+ return false;
+
+ rtx dest = SET_DEST (set);
+ rtx src = SET_SRC (set);
+ machine_mode mode = GET_MODE (dest);
+
+ if (!REG_P (dest) && !SUBREG_P (dest))
+ return false;
+
+ switch (mode)
+ {
+ case E_DImode:
+ case E_DFmode:
+ case E_SFmode:
+ return easy_fp_constant_64bit_scalar (src, mode);
+
+ case E_V2DImode:
+ case E_V2DFmode:
+ return easy_vector_constant_64bit_element (src, mode);
+
+ default:
+ break;
+ }
+
+ return false;
+}
+
/* Whether the next instruction needs a 'p' prefix issued before the
instruction is printed out. */
static bool prepend_p_to_next_insn;
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 6bec2bddbde..8afc4b2756d 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -314,6 +314,11 @@
(eq_attr "type" "integer,add")
(if_then_else (match_test "prefixed_paddi_p (insn)")
+ (const_string "yes")
+ (const_string "no"))
+
+ (eq_attr "type" "vecperm")
+ (if_then_else (match_test "prefixed_xxsplti_p (insn)")
(const_string "yes")
(const_string "no"))]
@@ -7759,17 +7764,17 @@
;;
;; LWZ LFS LXSSP LXSSPX STFS STXSSP
;; STXSSPX STW XXLXOR LI FMR XSCPSGNDP
-;; MR MT<x> MF<x> NOP
+;; MR MT<x> MF<x> NOP XXSPLTIDP
(define_insn "movsf_hardfloat"
[(set (match_operand:SF 0 "nonimmediate_operand"
"=!r, f, v, wa, m, wY,
Z, m, wa, !r, f, wa,
- !r, *c*l, !r, *h")
+ !r, *c*l, !r, *h, wa")
(match_operand:SF 1 "input_operand"
"m, m, wY, Z, f, v,
wa, r, j, j, f, wa,
- r, r, *h, 0"))]
+ r, r, *h, 0, eF"))]
"(register_operand (operands[0], SFmode)
|| register_operand (operands[1], SFmode))
&& TARGET_HARD_FLOAT
@@ -7791,15 +7796,16 @@
mr %0,%1
mt%0 %1
mf%1 %0
- nop"
+ nop
+ #"
[(set_attr "type"
"load, fpload, fpload, fpload, fpstore, fpstore,
fpstore, store, veclogical, integer, fpsimple, fpsimple,
- *, mtjmpr, mfjmpr, *")
+ *, mtjmpr, mfjmpr, *, vecperm")
(set_attr "isa"
"*, *, p9v, p8v, *, p9v,
p8v, *, *, *, *, *,
- *, *, *, *")])
+ *, *, *, *, p10")])
;; LWZ LFIWZX STW STFIWX MTVSRWZ MFVSRWZ
;; FMR MR MT%0 MF%1 NOP
@@ -8059,18 +8065,18 @@
;; STFD LFD FMR LXSD STXSD
;; LXSD STXSD XXLOR XXLXOR GPR<-0
-;; LWZ STW MR
+;; LWZ STW MR XXSPLTIDP
(define_insn "*mov<mode>_hardfloat32"
[(set (match_operand:FMOVE64 0 "nonimmediate_operand"
"=m, d, d, <f64_p9>, wY,
<f64_av>, Z, <f64_vsx>, <f64_vsx>, !r,
- Y, r, !r")
+ Y, r, !r, wa")
(match_operand:FMOVE64 1 "input_operand"
"d, m, d, wY, <f64_p9>,
Z, <f64_av>, <f64_vsx>, <zero_fp>, <zero_fp>,
- r, Y, r"))]
+ r, Y, r, eF"))]
"! TARGET_POWERPC64 && TARGET_HARD_FLOAT
&& (gpc_reg_operand (operands[0], <MODE>mode)
|| gpc_reg_operand (operands[1], <MODE>mode))"
@@ -8087,20 +8093,21 @@
#
#
#
+ #
#"
[(set_attr "type"
"fpstore, fpload, fpsimple, fpload, fpstore,
fpload, fpstore, veclogical, veclogical, two,
- store, load, two")
+ store, load, two, vecperm")
(set_attr "size" "64")
(set_attr "length"
"*, *, *, *, *,
*, *, *, *, 8,
- 8, 8, 8")
+ 8, 8, 8, *")
(set_attr "isa"
"*, *, *, p9v, p9v,
p7v, p7v, *, *, *,
- *, *, *")])
+ *, *, *, p10")])
;; STW LWZ MR G-const H-const F-const
@@ -8127,19 +8134,19 @@
;; STFD LFD FMR LXSD STXSD
;; LXSDX STXSDX XXLOR XXLXOR LI 0
;; STD LD MR MT{CTR,LR} MF{CTR,LR}
-;; NOP MFVSRD MTVSRD
+;; NOP MFVSRD MTVSRD XXSPLTIDP
(define_insn "*mov<mode>_hardfloat64"
[(set (match_operand:FMOVE64 0 "nonimmediate_operand"
"=m, d, d, <f64_p9>, wY,
<f64_av>, Z, <f64_vsx>, <f64_vsx>, !r,
YZ, r, !r, *c*l, !r,
- *h, r, <f64_dm>")
+ *h, r, <f64_dm>, wa")
(match_operand:FMOVE64 1 "input_operand"
"d, m, d, wY, <f64_p9>,
Z, <f64_av>, <f64_vsx>, <zero_fp>, <zero_fp>,
r, YZ, r, r, *h,
- 0, <f64_dm>, r"))]
+ 0, <f64_dm>, r, eF"))]
"TARGET_POWERPC64 && TARGET_HARD_FLOAT
&& (gpc_reg_operand (operands[0], <MODE>mode)
|| gpc_reg_operand (operands[1], <MODE>mode))"
@@ -8161,18 +8168,19 @@
mf%1 %0
nop
mfvsrd %0,%x1
- mtvsrd %x0,%1"
+ mtvsrd %x0,%1
+ #"
[(set_attr "type"
"fpstore, fpload, fpsimple, fpload, fpstore,
fpload, fpstore, veclogical, veclogical, integer,
store, load, *, mtjmpr, mfjmpr,
- *, mfvsr, mtvsr")
+ *, mfvsr, mtvsr, vecperm")
(set_attr "size" "64")
(set_attr "isa"
"*, *, *, p9v, p9v,
p7v, p7v, *, *, *,
*, *, *, *, *,
- *, p8v, p8v")])
+ *, p8v, p8v, p10")])
;; STD LD MR MT<SPR> MF<SPR> G-const
;; H-const F-const Special
@@ -9220,6 +9228,7 @@
;; a gpr into a fpr instead of reloading an invalid 'Y' address
;; GPR store GPR load GPR move FPR store FPR load FPR move
+;; XXSPLTIDP
;; GPR const AVX store AVX store AVX load AVX load VSX move
;; P9 0 P9 -1 AVX 0/-1 VSX 0 VSX -1 P9 const
;; AVX const
@@ -9227,11 +9236,13 @@
(define_insn "*movdi_internal32"
[(set (match_operand:DI 0 "nonimmediate_operand"
"=Y, r, r, m, ^d, ^d,
+ ^wa,
r, wY, Z, ^v, $v, ^wa,
wa, wa, v, wa, *i, v,
v")
(match_operand:DI 1 "input_operand"
"r, Y, r, ^d, m, ^d,
+ eF,
IJKnF, ^v, $v, wY, Z, ^wa,
Oj, wM, OjwM, Oj, wM, wS,
wB"))]
@@ -9246,6 +9257,7 @@
lfd%U1%X1 %0,%1
fmr %0,%1
#
+ #
stxsd %1,%0
stxsdx %x1,%y0
lxsd %0,%1
@@ -9260,17 +9272,20 @@
#"
[(set_attr "type"
"store, load, *, fpstore, fpload, fpsimple,
+ vecperm,
*, fpstore, fpstore, fpload, fpload, veclogical,
vecsimple, vecsimple, vecsimple, veclogical,veclogical,vecsimple,
vecsimple")
(set_attr "size" "64")
(set_attr "length"
"8, 8, 8, *, *, *,
+ *,
16, *, *, *, *, *,
*, *, *, *, *, 8,
*")
(set_attr "isa"
"*, *, *, *, *, *,
+ p10,
*, p9v, p7v, p9v, p7v, *,
p9v, p9v, p7v, *, *, p7v,
p7v")])
@@ -9306,6 +9321,7 @@
})
;; GPR store GPR load GPR move
+;; XXSPLTIDP
;; GPR li GPR lis GPR pli GPR #
;; FPR store FPR load FPR move
;; AVX store AVX store AVX load AVX load VSX move
@@ -9316,6 +9332,7 @@
(define_insn "*movdi_internal64"
[(set (match_operand:DI 0 "nonimmediate_operand"
"=YZ, r, r,
+ ^wa,
r, r, r, r,
m, ^d, ^d,
wY, Z, $v, $v, ^wa,
@@ -9325,6 +9342,7 @@
?r, ?wa")
(match_operand:DI 1 "input_operand"
"r, YZ, r,
+ eF,
I, L, eI, nF,
^d, m, ^d,
^v, $v, wY, Z, ^wa,
@@ -9339,6 +9357,7 @@
std%U0%X0 %1,%0
ld%U1%X1 %0,%1
mr %0,%1
+ #
li %0,%1
lis %0,%v1
li %0,%1
@@ -9365,6 +9384,7 @@
mtvsrd %x0,%1"
[(set_attr "type"
"store, load, *,
+ vecperm,
*, *, *, *,
fpstore, fpload, fpsimple,
fpstore, fpstore, fpload, fpload, veclogical,
@@ -9375,6 +9395,7 @@
(set_attr "size" "64")
(set_attr "length"
"*, *, *,
+ *,
*, *, *, 20,
*, *, *,
*, *, *, *, *,
@@ -9384,6 +9405,7 @@
*, *")
(set_attr "isa"
"*, *, *,
+ p10,
*, *, p10, *,
*, *, *,
p9v, p7v, p9v, p7v, *,
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 9d7878f144a..1d7ce4cc94a 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -640,6 +640,10 @@ mprivileged
Target Var(rs6000_privileged) Init(0)
Generate code that will run in privileged state.
+mxxspltidp
+Target Undocumented Var(TARGET_XXSPLTIDP) Init(1) Save
+Generate (do not generate) XXSPLTIDP instructions.
+
-param=rs6000-density-pct-threshold=
Target Undocumented Joined UInteger Var(rs6000_density_pct_threshold) Init(85) IntegerRange(0, 100) Param
When costing for loop vectorization, we probably need to penalize the loop body
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index bf033e31c1c..fa33c9d9fbf 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -1191,16 +1191,19 @@
;; instruction). But generate XXLXOR/XXLORC if it will avoid a register move.
;; VSX store VSX load VSX move VSX->GPR GPR->VSX LQ (GPR)
+;; XXSPLTIDP
;; STQ (GPR) GPR load GPR store GPR move XXSPLTIB VSPLTISW
;; VSX 0/-1 VMX const GPR const LVX (VMX) STVX (VMX)
(define_insn "vsx_mov<mode>_64bit"
[(set (match_operand:VSX_M 0 "nonimmediate_operand"
"=ZwO, wa, wa, r, we, ?wQ,
+ wa,
?&r, ??r, ??Y, <??r>, wa, v,
?wa, v, <??r>, wZ, v")
(match_operand:VSX_M 1 "input_operand"
"wa, ZwO, wa, we, r, r,
+ eV,
wQ, Y, r, r, wE, jwM,
?jwM, W, <nW>, v, wZ"))]
@@ -1212,36 +1215,44 @@
}
[(set_attr "type"
"vecstore, vecload, vecsimple, mtvsr, mfvsr, load,
+ vecperm,
store, load, store, *, vecsimple, vecsimple,
vecsimple, *, *, vecstore, vecload")
(set_attr "num_insns"
"*, *, *, 2, *, 2,
+ *,
2, 2, 2, 2, *, *,
*, 5, 2, *, *")
(set_attr "max_prefixed_insns"
"*, *, *, *, *, 2,
+ *,
2, 2, 2, 2, *, *,
*, *, *, *, *")
(set_attr "length"
"*, *, *, 8, *, 8,
+ *,
8, 8, 8, 8, *, *,
*, 20, 8, *, *")
(set_attr "isa"
"<VSisa>, <VSisa>, <VSisa>, *, *, *,
+ p10,
*, *, *, *, p9v, *,
<VSisa>, *, *, *, *")])
;; VSX store VSX load VSX move GPR load GPR store GPR move
+;; XXSPLTIDP
;; XXSPLTIB VSPLTISW VSX 0/-1 VMX const GPR const
;; LVX (VMX) STVX (VMX)
(define_insn "*vsx_mov<mode>_32bit"
[(set (match_operand:VSX_M 0 "nonimmediate_operand"
"=ZwO, wa, wa, ??r, ??Y, <??r>,
+ wa,
wa, v, ?wa, v, <??r>,
wZ, v")
(match_operand:VSX_M 1 "input_operand"
"wa, ZwO, wa, Y, r, r,
+ eV,
wE, jwM, ?jwM, W, <nW>,
v, wZ"))]
@@ -1253,14 +1264,17 @@
}
[(set_attr "type"
"vecstore, vecload, vecsimple, load, store, *,
+ vecperm,
vecsimple, vecsimple, vecsimple, *, *,
vecstore, vecload")
(set_attr "length"
"*, *, *, 16, 16, 16,
+ *,
*, *, *, 20, 16,
*, *")
(set_attr "isa"
"<VSisa>, <VSisa>, <VSisa>, *, *, *,
+ p10,
p9v, *, <VSisa>, *, *,
*, *")])
@@ -6449,15 +6463,53 @@
DONE;
})
-(define_insn "xxspltidp_v2df_inst"
- [(set (match_operand:V2DF 0 "register_operand" "=wa")
- (unspec:V2DF [(match_operand:SI 1 "c32bit_cint_operand" "n")]
- UNSPEC_XXSPLTIDP))]
+(define_mode_iterator XXSPLTIDP_S [DI SF DF])
+(define_mode_iterator XXSPLTIDP_V [V2DF V2DI])
+(define_mode_iterator XXSPLTIDP [DI SF DF V2DF V2DI])
+
+(define_insn "xxspltidp_<mode>_inst"
+ [(set (match_operand:XXSPLTIDP 0 "register_operand" "=wa")
+ (unspec:XXSPLTIDP [(match_operand:SI 1 "c32bit_cint_operand" "n")]
+ UNSPEC_XXSPLTIDP))]
"TARGET_POWER10"
"xxspltidp %x0,%1"
[(set_attr "type" "vecperm")
(set_attr "prefixed" "yes")])
+;; Generate the XXSPLTIDP instruction to support SFmode, DFmode, and DImode
+;; scalar constants and V2DF and V2DI vector constants where both elements are
+;; the same. The constant has to be expressible as a SFmode constant that is
+;; not a SFmode denormal value.
+(define_insn_and_split "*xxspltidp_<mode>_internal"
+ [(set (match_operand:XXSPLTIDP_S 0 "vsx_register_operand" "=wa")
+ (match_operand:XXSPLTIDP_S 1 "easy_fp_constant_64bit_scalar" "eF"))]
+ "TARGET_POWER10"
+ "#"
+ "&& 1"
+ [(set (match_dup 0)
+ (unspec:XXSPLTIDP_S [(match_dup 2)] UNSPEC_XXSPLTIDP))]
+{
+ long immediate = xxspltidp_constant_immediate (operands[1], <MODE>mode);
+ operands[2] = GEN_INT (immediate);
+}
+ [(set_attr "type" "vecperm")
+ (set_attr "prefixed" "yes")])
+
+(define_insn_and_split "*xxspltidp_<mode>_internal"
+ [(set (match_operand:XXSPLTIDP_V 0 "vsx_register_operand" "=wa")
+ (match_operand:XXSPLTIDP_V 1 "easy_vector_constant_64bit_element" "eV"))]
+ "TARGET_POWER10"
+ "#"
+ "&& 1"
+ [(set (match_dup 0)
+ (unspec:XXSPLTIDP_V [(match_dup 2)] UNSPEC_XXSPLTIDP))]
+{
+ long immediate = xxspltidp_constant_immediate (operands[1], <MODE>mode);
+ operands[2] = GEN_INT (immediate);
+}
+ [(set_attr "type" "vecperm")
+ (set_attr "prefixed" "yes")])
+
;; XXSPLTI32DX built-in function support
(define_expand "xxsplti32dx_v4si"
[(set (match_operand:V4SI 0 "register_operand" "=wa")
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 2b41cb7fb7b..5035a3fd604 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -3333,9 +3333,15 @@ The integer constant zero.
A constant whose negation is a signed 16-bit constant.
@end ifset
+@item eF
+A 64-bit scalar constant that can be loaded with the XXSPLTIDP instruction.
+
@item eI
A signed 34-bit integer constant if prefixed instructions are supported.
+@item eV
+A 128-bit vector constant that can be loaded with the XXSPLTIDP instruction.
+
@ifset INTERNALS
@item G
A floating point constant that can be loaded into a register with one
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
new file mode 100644
index 00000000000..8f6e176f9af
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating DFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+ instruction. */
+
+double
+scalar_double_0 (void)
+{
+ return 0.0; /* XXSPLTIB or XXLXOR. */
+}
+
+double
+scalar_double_1 (void)
+{
+ return 1.0; /* XXSPLTIDP. */
+}
+
+#ifndef __FAST_MATH__
+double
+scalar_double_m0 (void)
+{
+ return -0.0; /* XXSPLTIDP. */
+}
+
+double
+scalar_double_nan (void)
+{
+ return __builtin_nan (""); /* XXSPLTIDP. */
+}
+
+double
+scalar_double_inf (void)
+{
+ return __builtin_inf (); /* XXSPLTIDP. */
+}
+
+double
+scalar_double_m_inf (void) /* XXSPLTIDP. */
+{
+ return - __builtin_inf ();
+}
+#endif
+
+double
+scalar_double_pi (void)
+{
+ return M_PI; /* PLFD. */
+}
+
+double
+scalar_double_denorm (void)
+{
+ return 0x1p-149f; /* PLFD. */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-di.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-di.c
new file mode 100644
index 00000000000..75714d0b11d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-di.c
@@ -0,0 +1,70 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+/* Test generating DImode constants that have the same bit pattern as DFmode
+ constants that can be loaded with the XXSPLTIDP instruction with the ISA 3.1
+ (power10). We use asm to force the value into vector registers. */
+
+double
+scalar_0 (void)
+{
+ /* XXSPLTIB or XXLXOR. */
+ double d;
+ long long ll = 0;
+
+ __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll));
+ return d;
+}
+
+double
+scalar_1 (void)
+{
+ /* VSPLTISW/VUPKLSW or XXSPLTIB/VEXTSB2D. */
+ double d;
+ long long ll = 1;
+
+ __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll));
+ return d;
+}
+
+/* 0x8000000000000000LL is the bit pattern for -0.0, which can be generated
+ with XXSPLTIDP. */
+double
+scalar_float_neg_0 (void)
+{
+ /* XXSPLTIDP. */
+ double d;
+ long long ll = 0x8000000000000000LL;
+
+ __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll));
+ return d;
+}
+
+/* 0x3ff0000000000000LL is the bit pattern for 1.0 which can be generated with
+ XXSPLTIDP. */
+double
+scalar_float_1_0 (void)
+{
+ /* XXSPLTIDP. */
+ double d;
+ long long ll = 0x3ff0000000000000LL;
+
+ __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll));
+ return d;
+}
+
+/* 0x400921fb54442d18LL is the bit pattern for PI, which cannot be generated
+ with XXSPLTIDP. */
+double
+scalar_pi (void)
+{
+ /* PLXV. */
+ double d;
+ long long ll = 0x400921fb54442d18LL;
+
+ __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll));
+ return d;
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 2 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
new file mode 100644
index 00000000000..72504bdfbbd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating SFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+ instruction. */
+
+float
+scalar_float_0 (void)
+{
+ return 0.0f; /* XXSPLTIB or XXLXOR. */
+}
+
+float
+scalar_float_1 (void)
+{
+ return 1.0f; /* XXSPLTIDP. */
+}
+
+#ifndef __FAST_MATH__
+float
+scalar_float_m0 (void)
+{
+ return -0.0f; /* XXSPLTIDP. */
+}
+
+float
+scalar_float_nan (void)
+{
+ return __builtin_nanf (""); /* XXSPLTIDP. */
+}
+
+float
+scalar_float_inf (void)
+{
+ return __builtin_inff (); /* XXSPLTIDP. */
+}
+
+float
+scalar_float_m_inf (void)
+{
+ return - __builtin_inff (); /* XXSPLTIDP. */
+}
+#endif
+
+float
+scalar_float_pi (void)
+{
+ return (float)M_PI; /* XXSPLTIDP. */
+}
+
+float
+scalar_float_denorm (void)
+{
+ return 0x1p-149f; /* PLFS. */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 6 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
new file mode 100644
index 00000000000..82ffc86f8aa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
@@ -0,0 +1,64 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <math.h>
+
+/* Test generating V2DFmode constants with the ISA 3.1 (power10) XXSPLTIDP
+ instruction. */
+
+vector double
+v2df_double_0 (void)
+{
+ return (vector double) { 0.0, 0.0 }; /* XXSPLTIB or XXLXOR. */
+}
+
+vector double
+v2df_double_1 (void)
+{
+ return (vector double) { 1.0, 1.0 }; /* XXSPLTIDP. */
+}
+
+#ifndef __FAST_MATH__
+vector double
+v2df_double_m0 (void)
+{
+ return (vector double) { -0.0, -0.0 }; /* XXSPLTIDP. */
+}
+
+vector double
+v2df_double_nan (void)
+{
+ return (vector double) { __builtin_nan (""),
+ __builtin_nan ("") }; /* XXSPLTIDP. */
+}
+
+vector double
+v2df_double_inf (void)
+{
+ return (vector double) { __builtin_inf (),
+ __builtin_inf () }; /* XXSPLTIDP. */
+}
+
+vector double
+v2df_double_m_inf (void)
+{
+ return (vector double) { - __builtin_inf (),
+ - __builtin_inf () }; /* XXSPLTIDP. */
+}
+#endif
+
+vector double
+v2df_double_pi (void)
+{
+ return (vector double) { M_PI, M_PI }; /* PLXV. */
+}
+
+vector double
+v2df_double_denorm (void)
+{
+ return (vector double) { (double)0x1p-149f,
+ (double)0x1p-149f }; /* PLXV. */
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2di.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2di.c
new file mode 100644
index 00000000000..4d44f943d26
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2di.c
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+/* Test generating V2DImode constants that have the same bit pattern as
+ V2DFmode constants that can be loaded with the ISA 3.1 (power10) XXSPLTIDP
+ instruction. */
+
+vector long long
+vector_0 (void)
+{
+ /* XXSPLTIB or XXLXOR. */
+ return (vector long long) { 0LL, 0LL };
+}
+
+vector long long
+vector_1 (void)
+{
+ /* XXSPLTIB and VEXTSB2D. */
+ return (vector long long) { 1LL, 1LL };
+}
+
+/* 0x8000000000000000LL is the bit pattern for -0.0, which can be generated
+ with XXSPLTIDP. */
+vector long long
+vector_float_neg_0 (void)
+{
+ /* XXSPLTIDP. */
+ return (vector long long) { 0x8000000000000000LL, 0x8000000000000000LL };
+}
+
+/* 0x3ff0000000000000LL is the bit pattern for 1.0, which can be generated
+ with XXSPLTIDP. */
+vector long long
+vector_float_1_0 (void)
+{
+ /* XXSPLTIDP. */
+ return (vector long long) { 0x3ff0000000000000LL, 0x3ff0000000000000LL };
+}
+
+/* 0x400921fb54442d18LL is the bit pattern for PI, which cannot be generated
+ with XXSPLTIDP. */
+vector long long
+vector_pi (void)
+{
+ /* PLXV. */
+ return (vector long long) { 0x400921fb54442d18LL, 0x400921fb54442d18LL };
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 2 } } */
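
As a reading aid for the expected-instruction comments in the tests above,
here is a minimal standalone C sketch of the rule they exercise: a 64-bit
pattern is an XXSPLTIDP candidate when, read as a double, it is exactly
representable as a single-precision value that is not denormal. The helper
name fits_xxspltidp is hypothetical and is not the patch's
xxspltidp_constant_immediate; the real GCC check may differ in details (for
instance, the tests expect +0.0 to be loaded with XXSPLTIB or XXLXOR even
though this sketch accepts it).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the patch: return true if the 64-bit
   pattern BITS, read as an IEEE 754 double, is also exactly representable
   as a single-precision value that is not denormal.  */
bool
fits_xxspltidp (uint64_t bits)
{
  /* 11-bit biased exponent and 52-bit fraction of the double.  */
  uint64_t exponent = (bits >> 52) & 0x7ff;
  uint64_t mantissa = bits & ((UINT64_C (1) << 52) - 1);

  /* Single precision keeps only 23 fraction bits, so the low 29 bits of
     the double fraction must be zero.  */
  if ((mantissa & ((UINT64_C (1) << 29) - 1)) != 0)
    return false;

  /* Infinity and NaN: the fraction check above is sufficient.  */
  if (exponent == 0x7ff)
    return true;

  /* Signed zero is fine; any other value with a zero exponent is a double
     denormal, far below the single-precision range.  */
  if (exponent == 0)
    return mantissa == 0;

  /* Normal single-precision values have unbiased exponents -126..127;
     anything smaller would be a single-precision denormal.  */
  int unbiased = (int) exponent - 1023;
  return unbiased >= -126 && unbiased <= 127;
}
```

Under these assumptions the helper accepts 0x3ff0000000000000 (1.0) and
0x8000000000000000 (-0.0), and rejects 0x400921fb54442d18 (PI) and the
integer 1, matching the XXSPLTIDP counts in the DImode and V2DImode tests.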