public inbox for gcc-cvs@sourceware.org
* [gcc(refs/users/meissner/heads/work046)] Use XXSPLTI32DX to generate some constants.
@ 2021-04-12 17:46 Michael Meissner
From: Michael Meissner @ 2021-04-12 17:46 UTC
To: gcc-cvs
https://gcc.gnu.org/g:66839cacee7e18d899e1a79c7860319d9bc14e64
commit 66839cacee7e18d899e1a79c7860319d9bc14e64
Author: Michael Meissner <meissner@linux.ibm.com>
Date: Mon Apr 12 13:45:45 2021 -0400
Use XXSPLTI32DX to generate some constants.
This patch generates a pair of XXSPLTI32DX instructions to load 64-bit
scalar or 128-bit vector constants into the vector registers.
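As a rough illustration of why two instructions suffice, the sketch below models a VSX register as four 32-bit words and models XXSPLTI32DX as writing its 32-bit immediate into words IX and IX+2. That word-level behavior is an assumption based on my reading of the ISA 3.1 description, and the names vsr_model, model_xxsplti32dx, load_double_pair and doubleword are hypothetical helpers, not anything in GCC. It also assumes the host uses 64-bit IEEE 754 doubles.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 128-bit VSX register as four 32-bit words,
   word 0 being the most significant.  */
typedef struct { uint32_t word[4]; } vsr_model;

/* Model of "xxsplti32dx XT,IX,IMM32".  Assumption: the instruction writes
   IMM32 into words IX and IX+2 and leaves the other two words unchanged.  */
static void
model_xxsplti32dx (vsr_model *xt, unsigned ix, uint32_t imm32)
{
  assert (ix <= 1);
  xt->word[ix] = imm32;
  xt->word[ix + 2] = imm32;
}

/* Return the bit pattern of D (assumes a 64-bit IEEE 754 double).  */
static uint64_t
double_bits (double d)
{
  uint64_t bits;
  memcpy (&bits, &d, sizeof bits);
  return bits;
}

/* Load D into both doublewords of XT with two modeled XXSPLTI32DX
   instructions: the upper 32 bits with IX=0, the lower 32 bits with IX=1.  */
static void
load_double_pair (vsr_model *xt, double d)
{
  uint64_t bits = double_bits (d);
  model_xxsplti32dx (xt, 0, (uint32_t) (bits >> 32));
  model_xxsplti32dx (xt, 1, (uint32_t) (bits & 0xffffffffu));
}

/* Reassemble doubleword DW (0 or 1) of the modeled register.  */
static uint64_t
doubleword (const vsr_model *xt, unsigned dw)
{
  return ((uint64_t) xt->word[2 * dw] << 32) | xt->word[2 * dw + 1];
}
```

Under this model, both doublewords end up holding the full 64-bit pattern, which is why the same pair can serve a DF scalar and a V2DF splat.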
I added a new constraint (eD) for constants that can be loaded with
XXSPLTI32DX, but cannot be loaded with the XXSPLTIDP or XXSPLTIW
instructions.
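The selection order implied here can be sketched in plain C. This is only an approximation: it assumes XXSPLTIDP is usable when the double is exactly representable as a single-precision value that is not a denormal, which is my reading of the instruction rather than something this patch states. The names load_kind and classify_double are hypothetical, NaNs are left out, and the XXSPLTIW case is not modeled.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical classification of how a double constant might be loaded.  */
enum load_kind { LOAD_ZERO, LOAD_XXSPLTIDP, LOAD_XXSPLTI32DX_PAIR };

static enum load_kind
classify_double (double d)
{
  assert (d == d);		/* NaNs are outside this sketch.  */

  /* +0.0 (all bits clear) is handled without any splat immediate.  */
  uint64_t bits;
  memcpy (&bits, &d, sizeof bits);
  if (bits == 0)
    return LOAD_ZERO;

  /* Finite values too large for float cannot use XXSPLTIDP.  Checking
     first also avoids an out-of-range double-to-float conversion.  */
  double mag = d < 0 ? -d : d;
  if (mag > FLT_MAX && mag <= DBL_MAX)
    return LOAD_XXSPLTI32DX_PAIR;

  /* Assumption: XXSPLTIDP needs an exact, non-denormal float value.  */
  float f = (float) d;
  float fmag = f < 0 ? -f : f;
  int exact = (double) f == d;
  int denormal = fmag != 0.0f && fmag < FLT_MIN;
  if (exact && !denormal)
    return LOAD_XXSPLTIDP;

  /* Everything else is a candidate for the eD constraint.  */
  return LOAD_XXSPLTI32DX_PAIR;
}
```

With these assumptions, 1.0 classifies as XXSPLTIDP, while M_PI and 0x1p-149 fall through to the XXSPLTI32DX pair, which agrees with the comments in the updated tests.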
I added a debug switch (-mxxsplti32dx) to control whether this behavior is
on or off.
gcc/
2021-04-12 Michael Meissner <meissner@linux.ibm.com>
* config/rs6000/constraints.md (eD): New constraint.
* config/rs6000/predicates.md (easy_fp_constant): If we can load
the constant with a pair of XXSPLTI32DX instructions, it is easy.
(xxsplti32dx_operand): New predicate.
(easy_vector_constant): If we can load the constant with a pair of
XXSPLTI32DX instructions, it is easy.
* config/rs6000/rs6000-cpus.def (OTHER_POWER10_MASKS): Add
-mxxsplti32dx.
(POWERPC_MASKS): Add -mxxsplti32dx.
* config/rs6000/rs6000-protos.h (xxsplti32dx_constant_p): New
declaration.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Add
-mxxsplti32dx support.
(xxsplti32dx_constant_p): New helper function.
(output_vec_const_move): Split constants that need XXSPLTI32DX.
(rs6000_opt_masks): Add -mxxsplti32dx.
* config/rs6000/rs6000.md (movsf_hardfloat): Add support for
loading constants with XXSPLTI32DX.
(mov<mode>_hardfloat32, FMOVE64 iterator): Add support for loading
constants with XXSPLTI32DX.
(mov<mode>_hardfloat64, FMOVE64 iterator): Add support for loading
constants with XXSPLTI32DX.
* config/rs6000/rs6000.opt (-mxxsplti32dx): New switch.
* config/rs6000/vsx.md (UNSPEC_XXSPLTI32DX_CONST): New unspec.
(vsx_mov<mode>_64bit): Add support for loading constants with
XXSPLTI32DX.
(vsx_mov<mode>_32bit): Add support for loading constants with
XXSPLTI32DX.
(XXSPLTI32DX): New mode iterator.
(xxsplti32dx_<mode>): New insn and splits.
(xxsplti32dx_<mode>_first): New insns.
(xxsplti32dx_<mode>_second): New insns.
gcc/testsuite/
2021-04-08 Michael Meissner <meissner@linux.ibm.com>
* gcc.target/powerpc/vec-splati-runnable.c: Update insn count.
* gcc.target/powerpc/vec-splat-constant-sf.c: Update insn count.
* gcc.target/powerpc/vec-splat-constant-df.c: Update insn count.
* gcc.target/powerpc/vec-splat-constant-v2df.c: Update insn
count.
Diff:
---
gcc/config/rs6000/constraints.md | 6 +
gcc/config/rs6000/predicates.md | 22 ++++
gcc/config/rs6000/rs6000-cpus.def | 2 +
gcc/config/rs6000/rs6000-protos.h | 1 +
gcc/config/rs6000/rs6000.c | 94 ++++++++++++++++
gcc/config/rs6000/rs6000.md | 44 +++++---
gcc/config/rs6000/rs6000.opt | 4 +
gcc/config/rs6000/vsx.md | 121 ++++++++++++++++++---
.../gcc.target/powerpc/vec-splat-constant-df.c | 9 +-
.../gcc.target/powerpc/vec-splat-constant-sf.c | 5 +-
.../gcc.target/powerpc/vec-splat-constant-v2df.c | 10 +-
.../gcc.target/powerpc/vec-splati-runnable.c | 2 +-
12 files changed, 283 insertions(+), 37 deletions(-)
diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index 70b1eb01770..c7137337f4c 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -208,6 +208,12 @@
(and (match_code "const_int")
(match_test "((- (unsigned HOST_WIDE_INT) ival) + 0x8000) < 0x10000")))
+;; SF/DF/V2DF/DI/V2DI scalar or vector constant that can be loaded with a pair
+;; of XXSPLTI32DX instructions.
+(define_constraint "eD"
+ "A vector constant that can be loaded with XXSPLTI32DX instructions."
+ (match_operand 0 "xxsplti32dx_operand"))
+
;; SF/DF/V2DF scalar or vector constant that can be loaded with XXSPLTIDP
(define_constraint "eF"
"A vector constant that can be loaded with the XXSPLTIDP instruction."
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 48d9c5509a2..281c6e835b9 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -606,6 +606,11 @@
if (xxspltidp_operand (op, mode))
return 1;
+ /* If we have the ISA 3.1 XXSPLTI32DX instruction, see if the constant can
+ be loaded with a pair of those instructions. */
+ if (xxsplti32dx_operand (op, mode))
+ return 1;
+
/* Otherwise consider floating point constants hard, so that the
constant gets pushed to memory during the early RTL phases. This
has the advantage that double precision constants that can be
@@ -677,6 +682,20 @@
return xxspltidp_constant_p (op, mode, &value);
})
+;; Return 1 if operand is a SF/DF CONST_DOUBLE or V2DF CONST_VECTOR that can be
+;; loaded via a pair of ISA 3.1 XXSPLTI32DX instructions. Do not return true if
+;; the value is 0.0 or it can be loaded with XXSPLTIDP, since that is easy to
+;; generate without using XXSPLTI32DX.
+(define_predicate "xxsplti32dx_operand"
+ (match_code "const_double,const_int,const_vector,vec_duplicate")
+{
+ if (op == CONST0_RTX (mode))
+ return false;
+
+ HOST_WIDE_INT value = 0;
+ return xxsplti32dx_constant_p (op, mode, &value);
+})
+
;; Return 1 if the operand is a CONST_VECTOR and can be loaded into a
;; vector register without using memory.
(define_predicate "easy_vector_constant"
@@ -696,6 +715,9 @@
if (xxspltidp_operand (op, mode))
return true;
+ if (xxsplti32dx_operand (op, mode))
+ return true;
+
if (TARGET_P9_VECTOR
&& xxspltib_constant_p (op, mode, &num_insns, &value))
return true;
diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
index cf4044831f7..5a14191cc6c 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -79,6 +79,7 @@
| OPTION_MASK_PCREL \
| OPTION_MASK_PCREL_OPT \
| OPTION_MASK_PREFIXED \
+ | OPTION_MASK_XXSPLTI32DX \
| OPTION_MASK_XXSPLTIDP \
| OPTION_MASK_XXSPLTIW)
@@ -163,6 +164,7 @@
| OPTION_MASK_SOFT_FLOAT \
| OPTION_MASK_STRICT_ALIGN_OPTIONAL \
| OPTION_MASK_VSX \
+ | OPTION_MASK_XXSPLTI32DX \
| OPTION_MASK_XXSPLTIDP \
| OPTION_MASK_XXSPLTIW)
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 0fe1c176236..12bf60b043d 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -34,6 +34,7 @@ extern bool easy_altivec_constant (rtx, machine_mode);
extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *);
extern bool xxspltiw_constant_p (rtx, machine_mode, HOST_WIDE_INT *);
extern bool xxspltidp_constant_p (rtx, machine_mode, HOST_WIDE_INT *);
+extern bool xxsplti32dx_constant_p (rtx, machine_mode, HOST_WIDE_INT *);
extern int vspltis_shifted (rtx);
extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int);
extern bool macho_lo_sum_memory_operand (rtx, machine_mode);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 08a853f2e8f..7d125d6b65a 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4484,6 +4484,10 @@ rs6000_option_override_internal (bool global_init_p)
&& (rs6000_isa_flags_explicit & OPTION_MASK_XXSPLTIDP) == 0)
rs6000_isa_flags |= OPTION_MASK_XXSPLTIDP;
+ if (TARGET_POWER10 && TARGET_VSX
+ && (rs6000_isa_flags_explicit & OPTION_MASK_XXSPLTI32DX) == 0)
+ rs6000_isa_flags |= OPTION_MASK_XXSPLTI32DX;
+
if (!TARGET_PCREL && TARGET_PCREL_OPT)
rs6000_isa_flags &= ~OPTION_MASK_PCREL_OPT;
@@ -6610,6 +6614,92 @@ xxspltidp_constant_p (rtx op,
return true;
}
+/* Return true if OP is of the given MODE and can be synthesized with a pair
+ of ISA 3.1 XXSPLTI32DX instructions. If the constant can be loaded with
+ XXSPLTIDP or is 0/-1, return false.
+
+ Return the 64-bit constant to use in the two XXSPLTI32DX instructions via
+ CONSTANT_PTR. */
+
+bool
+xxsplti32dx_constant_p (rtx op,
+ machine_mode mode,
+ HOST_WIDE_INT *constant_ptr)
+{
+ *constant_ptr = 0;
+
+ if (!TARGET_XXSPLTI32DX)
+ return false;
+
+ if (mode == VOIDmode)
+ mode = GET_MODE (op);
+
+ if (op == CONST0_RTX (mode))
+ return false;
+
+ rtx element = op;
+ if (mode == V2DFmode || mode == V2DImode)
+ {
+ /* Handle VEC_DUPLICATE and CONST_VECTOR. */
+ if (GET_CODE (op) == VEC_DUPLICATE)
+ element = XEXP (op, 0);
+
+ else if (GET_CODE (op) == CONST_VECTOR)
+ {
+ element = CONST_VECTOR_ELT (op, 0);
+ if (!rtx_equal_p (element, CONST_VECTOR_ELT (op, 1)))
+ return false;
+ }
+
+ else
+ return false;
+
+ mode = GET_MODE_INNER (mode);
+ }
+
+ if (GET_MODE (element) != mode)
+ return false;
+
+ /* Handle floating point constants. */
+ if (mode == SFmode || mode == DFmode)
+ {
+ HOST_WIDE_INT xxspltidp_value = 0;
+
+ if (!CONST_DOUBLE_P (element))
+ return false;
+
+ if (xxspltidp_constant_p (element, mode, &xxspltidp_value))
+ return false;
+
+ long high_low[2];
+ const struct real_value *rv = CONST_DOUBLE_REAL_VALUE (element);
+ REAL_VALUE_TO_TARGET_DOUBLE (*rv, high_low);
+
+ if (!BYTES_BIG_ENDIAN)
+ std::swap (high_low[0], high_low[1]);
+
+ *constant_ptr = (high_low[0] << 32) | (high_low[1] & 0xffffffff);
+ return true;
+ }
+
+ /* Handle integer constants. */
+ else if (mode == DImode)
+ {
+ if (!CONST_INT_P (element))
+ return false;
+
+ HOST_WIDE_INT value = INTVAL (element);
+ if (value == -1)
+ return false;
+
+ *constant_ptr = value;
+ return true;
+ }
+
+ else
+ return false;
+}
+
const char *
output_vec_const_move (rtx *operands)
{
@@ -6683,6 +6773,9 @@ output_vec_const_move (rtx *operands)
return "xxspltidp %x0,%2";
}
+ if (xxsplti32dx_operand (vec, mode))
+ return "#";
+
if (TARGET_P9_VECTOR
&& xxspltib_constant_p (vec, mode, &num_insns, &xxspltib_value))
{
@@ -24182,6 +24275,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
{ "vsx", OPTION_MASK_VSX, false, true },
{ "xxspltiw", OPTION_MASK_XXSPLTIW, false, true },
{ "xxspltidp", OPTION_MASK_XXSPLTIDP, false, true },
+ { "xxsplti32dx", OPTION_MASK_XXSPLTI32DX, false, true },
#ifdef OPTION_MASK_64BIT
#if TARGET_AIX_OS
{ "aix64", OPTION_MASK_64BIT, false, false },
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 5569e0591e6..b5886d3ccf4 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -7564,17 +7564,17 @@
;;
;; LWZ LFS LXSSP LXSSPX STFS STXSSP
;; STXSSPX STW XXLXOR LI FMR XSCPSGNDP
-;; MR MT<x> MF<x> NOP XXSPLTIDP
+;; MR MT<x> MF<x> NOP XXSPLTIDP XXSPLTI32DX
(define_insn "movsf_hardfloat"
[(set (match_operand:SF 0 "nonimmediate_operand"
"=!r, f, v, wa, m, wY,
Z, m, wa, !r, f, wa,
- !r, *c*l, !r, *h, wa")
+ !r, *c*l, !r, *h, wa, wa")
(match_operand:SF 1 "input_operand"
"m, m, wY, Z, f, v,
wa, r, j, j, f, wa,
- r, r, *h, 0, eF"))]
+ r, r, *h, 0, eF, eD"))]
"(register_operand (operands[0], SFmode)
|| register_operand (operands[1], SFmode))
&& TARGET_HARD_FLOAT
@@ -7597,19 +7597,28 @@
mt%0 %1
mf%1 %0
nop
+ #
#"
[(set_attr "type"
"load, fpload, fpload, fpload, fpstore, fpstore,
fpstore, store, veclogical, integer, fpsimple, fpsimple,
- *, mtjmpr, mfjmpr, *, vecperm")
+ *, mtjmpr, mfjmpr, *, vecperm, vecperm")
(set_attr "isa"
"*, *, p9v, p8v, *, p9v,
p8v, *, *, *, *, *,
- *, *, *, *, p10")
+ *, *, *, *, p10, p10")
(set_attr "prefixed"
"*, *, *, *, *, *,
*, *, *, *, *, *,
- *, *, *, *, yes")])
+ *, *, *, *, yes, yes")
+ (set_attr "max_prefixed_insns"
+ "*, *, *, *, *, *,
+ *, *, *, *, *, *,
+ *, *, *, *, *, 2")
+ (set_attr "num_insns"
+ "*, *, *, *, *, *,
+ *, *, *, *, *, *,
+ *, *, *, *, *, 2")])
;; LWZ LFIWZX STW STFIWX MTVSRWZ MFVSRWZ
;; FMR MR MT%0 MF%1 NOP
@@ -7869,18 +7878,18 @@
;; STFD LFD FMR LXSD STXSD
;; LXSD STXSD XXLOR XXLXOR GPR<-0
-;; LWZ STW MR XXSPLTIDP
+;; LWZ STW MR XXSPLTIDP XXSPLTI32DX
(define_insn "*mov<mode>_hardfloat32"
[(set (match_operand:FMOVE64 0 "nonimmediate_operand"
"=m, d, d, <f64_p9>, wY,
<f64_av>, Z, <f64_vsx>, <f64_vsx>, !r,
- Y, r, !r, wa")
+ Y, r, !r, wa, wa")
(match_operand:FMOVE64 1 "input_operand"
"d, m, d, wY, <f64_p9>,
Z, <f64_av>, <f64_vsx>, <zero_fp>, <zero_fp>,
- r, Y, r, eF"))]
+ r, Y, r, eF, eD"))]
"! TARGET_POWERPC64 && TARGET_HARD_FLOAT
&& (gpc_reg_operand (operands[0], <MODE>mode)
|| gpc_reg_operand (operands[1], <MODE>mode))"
@@ -7898,24 +7907,33 @@
#
#
#
+ #
#"
[(set_attr "type"
"fpstore, fpload, fpsimple, fpload, fpstore,
fpload, fpstore, veclogical, veclogical, two,
- store, load, two, vecperm")
+ store, load, two, vecperm, vecperm")
(set_attr "size" "64")
(set_attr "length"
"*, *, *, *, *,
*, *, *, *, 8,
- 8, 8, 8, *")
+ 8, 8, 8, *, *")
(set_attr "isa"
"*, *, *, p9v, p9v,
p7v, p7v, *, *, *,
- *, *, *, p10")
+ *, *, *, p10, p10")
(set_attr "prefixed"
"*, *, *, *, *,
*, *, *, *, *,
- *, *, *, yes")])
+ *, *, *, yes, yes")
+ (set_attr "max_prefixed_insns"
+ "*, *, *, *, *,
+ *, *, *, *, *,
+ *, *, *, *, 2")
+ (set_attr "num_insns"
+ "*, *, *, *, *,
+ *, *, *, *, *,
+ *, *, *, *, 2")])
;; STW LWZ MR G-const H-const F-const
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 6620cdb7716..bd269369ca0 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -627,3 +627,7 @@ Generate (do not generate) the XXSPLTIW instruction.
mxxspltidp
Target Undocumented Mask(XXSPLTIDP) Var(rs6000_isa_flags)
Generate (do not generate) the XXSPLTIDP instruction.
+
+mxxsplti32dx
+Target Undocumented Mask(XXSPLTI32DX) Var(rs6000_isa_flags)
+Generate (do not generate) the XXSPLTI32DX instruction.
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 4b0307d447e..ec2c148fb4d 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -372,6 +372,7 @@
UNSPEC_XXSPLTIW
UNSPEC_XXSPLTIDP
UNSPEC_XXSPLTI32DX
+ UNSPEC_XXSPLTI32DX_CONST
UNSPEC_XXPERMX
UNSPEC_XXEVAL
])
@@ -1173,16 +1174,19 @@
;; VSX store VSX load VSX move VSX->GPR GPR->VSX LQ (GPR)
;; STQ (GPR) GPR load GPR store GPR move XXSPLTIB VSPLTISW
;; VSX 0/-1 VMX const GPR const LVX (VMX) STVX (VMX) XXSPLTI*
+;; XXSPLTI32DX
(define_insn "vsx_mov<mode>_64bit"
[(set (match_operand:VSX_M 0 "nonimmediate_operand"
"=ZwO, wa, wa, r, we, ?wQ,
?&r, ??r, ??Y, <??r>, wa, v,
- ?wa, v, <??r>, wZ, v, wa")
+ ?wa, v, <??r>, wZ, v, wa,
+ wa")
(match_operand:VSX_M 1 "input_operand"
"wa, ZwO, wa, we, r, r,
wQ, Y, r, r, wE, jwM,
- ?jwM, W, <nW>, v, wZ, eWeF"))]
+ ?jwM, W, <nW>, v, wZ, eWeF,
+ eD"))]
"TARGET_POWERPC64 && VECTOR_MEM_VSX_P (<MODE>mode)
&& (register_operand (operands[0], <MODE>mode)
@@ -1193,41 +1197,47 @@
[(set_attr "type"
"vecstore, vecload, vecsimple, mtvsr, mfvsr, load,
store, load, store, *, vecsimple, vecsimple,
- vecsimple, *, *, vecstore, vecload, vecperm")
+ vecsimple, *, *, vecstore, vecload, vecperm,
+ vecperm")
(set_attr "num_insns"
"*, *, *, 2, *, 2,
2, 2, 2, 2, *, *,
- *, 5, 2, *, *, *")
+ *, 5, 2, *, *, *,
+ 2")
(set_attr "max_prefixed_insns"
"*, *, *, *, *, 2,
2, 2, 2, 2, *, *,
- *, *, *, *, *, *")
+ *, *, *, *, *, *,
+ 2")
(set_attr "length"
"*, *, *, 8, *, 8,
8, 8, 8, 8, *, *,
- *, 20, 8, *, *, *")
+ *, 20, 8, *, *, *,
+ *")
(set_attr "isa"
"<VSisa>, <VSisa>, <VSisa>, *, *, *,
*, *, *, *, p9v, *,
- <VSisa>, *, *, *, *, p10")
+ <VSisa>, *, *, *, *, p10,
+ p10")
(set_attr "prefixed"
"*, *, *, *, *, *,
*, *, *, *, *, *,
- *, *, *, *, *, yes")])
+ *, *, *, *, *, yes,
+ yes")])
;; VSX store VSX load VSX move GPR load GPR store GPR move
;; XXSPLTIB VSPLTISW VSX 0/-1 VMX const GPR const XXSPLTI*
-;; LVX (VMX) STVX (VMX)
+;; LVX (VMX) STVX (VMX) XXSPLTI32DX
(define_insn "*vsx_mov<mode>_32bit"
[(set (match_operand:VSX_M 0 "nonimmediate_operand"
"=ZwO, wa, wa, ??r, ??Y, <??r>,
wa, v, ?wa, v, <??r>, wa,
- wZ, v")
+ wZ, v, wa")
(match_operand:VSX_M 1 "input_operand"
"wa, ZwO, wa, Y, r, r,
wE, jwM, ?jwM, W, <nW>, eWeF,
- v, wZ"))]
+ v, wZ, eD"))]
"!TARGET_POWERPC64 && VECTOR_MEM_VSX_P (<MODE>mode)
&& (register_operand (operands[0], <MODE>mode)
@@ -1238,19 +1248,27 @@
[(set_attr "type"
"vecstore, vecload, vecsimple, load, store, *,
vecsimple, vecsimple, vecsimple, *, *, vecperm,
- vecstore, vecload")
+ vecstore, vecload, vecperm")
(set_attr "length"
"*, *, *, 16, 16, 16,
*, *, *, 20, 16, *,
- *, *")
+ *, *, *")
(set_attr "isa"
"<VSisa>, <VSisa>, <VSisa>, *, *, *,
p9v, *, <VSisa>, *, *, p10,
- *, *")
+ *, *, p10")
(set_attr "prefixed"
"*, *, *, *, *, *,
*, *, *, *, *, yes,
- *, *")])
+ *, *, yes")
+ (set_attr "max_prefixed_insns"
+ "*, *, *, *, *, *,
+ *, *, *, *, *, *,
+ *, *, 2")
+ (set_attr "num_insns"
+ "*, *, *, *, *, *,
+ *, *, *, *, *, *,
+ *, *, *")])
;; Explicit load/store expanders for the builtin functions
(define_expand "vsx_load_<mode>"
@@ -6330,6 +6348,79 @@
DONE;
})
+;; XXSPLTI32DX used to create 64-bit constants
+(define_mode_iterator XXSPLTI32DX [SF DF V2DF V2DI])
+
+(define_insn_and_split "*xxsplti32dx_<mode>"
+ [(set (match_operand:XXSPLTI32DX 0 "vsx_register_operand" "=wa")
+ (match_operand:XXSPLTI32DX 1 "xxsplti32dx_operand"))]
+ "TARGET_XXSPLTI32DX"
+ "#"
+ "&& 1"
+ [(set (match_dup 0)
+ (unspec:XXSPLTI32DX [(match_dup 2)
+ (match_dup 3)] UNSPEC_XXSPLTI32DX_CONST))
+ (set (match_dup 0)
+ (unspec:XXSPLTI32DX [(match_dup 0)
+ (match_dup 4)
+ (match_dup 5)] UNSPEC_XXSPLTI32DX_CONST))]
+{
+ HOST_WIDE_INT value = 0;
+
+ if (!xxsplti32dx_constant_p (operands[1], <MODE>mode, &value))
+ gcc_unreachable ();
+
+ HOST_WIDE_INT high = value >> 32;
+ HOST_WIDE_INT low = value & 0xffffffff;
+
+ /* If the low bits are 0/-1, initialize that word first. This way we can
+ use a smaller XXSPLTIB instruction instead of the first XXSPLTI32DX. */
+ if (low == 0 || low == -1)
+ {
+ operands[2] = const1_rtx;
+ operands[3] = GEN_INT (low);
+ operands[4] = const0_rtx;
+ operands[5] = GEN_INT (high);
+ }
+ else
+ {
+ operands[2] = const0_rtx;
+ operands[3] = GEN_INT (high);
+ operands[4] = const1_rtx;
+ operands[5] = GEN_INT (low);
+ }
+}
+ [(set_attr "type" "vecperm")
+ (set_attr "prefixed" "yes")
+ (set_attr "num_insns" "2")
+ (set_attr "max_prefixed_insns" "2")])
+
+;; First word of XXSPLTI32DX
+(define_insn "*xxsplti32dx_<mode>_first"
+ [(set (match_operand:XXSPLTI32DX 0 "vsx_register_operand" "=wa,wa,wa")
+ (unspec:XXSPLTI32DX [(match_operand 1 "u1bit_cint_operand" "n,n,n")
+ (match_operand 2 "const_int_operand" "O,wM,n")]
+ UNSPEC_XXSPLTI32DX_CONST))]
+ "TARGET_XXSPLTI32DX"
+ "@
+ xxspltib %x0,0
+ xxspltib %x0,255
+ xxsplti32dx %x0,%1,%2"
+ [(set_attr "type" "vecperm")
+ (set_attr "prefixed" "*,*,yes")])
+
+;; Second word of XXSPLTI32DX
+(define_insn "*xxsplti32dx_<mode>_second"
+ [(set (match_operand:XXSPLTI32DX 0 "vsx_register_operand" "=wa")
+ (unspec:XXSPLTI32DX [(match_operand:XXSPLTI32DX 1 "vsx_register_operand" "0")
+ (match_operand 2 "u1bit_cint_operand" "n")
+ (match_operand 3 "const_int_operand" "n")]
+ UNSPEC_XXSPLTI32DX_CONST))]
+ "TARGET_XXSPLTI32DX"
+ "xxsplti32dx %x0,%2,%3"
+ [(set_attr "type" "vecperm")
+ (set_attr "prefixed" "yes")])
+
;; XXSPLTI32DX built-in support.
(define_expand "xxsplti32dx_v4si"
[(set (match_operand:V4SI 0 "register_operand" "=wa")
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
index 8f6e176f9af..1435ef4ef4f 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
@@ -48,13 +48,16 @@ scalar_double_m_inf (void) /* XXSPLTIDP. */
double
scalar_double_pi (void)
{
- return M_PI; /* PLFD. */
+ return M_PI; /* 2x XXSPLTI32DX. */
}
double
scalar_double_denorm (void)
{
- return 0x1p-149f; /* PLFD. */
+ return 0x1p-149f; /* XXSPLTIB, XXSPLTI32DX. */
}
-/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
+/* { dg-final { scan-assembler-times {\mxxsplti32dx\M} 3 } } */
+/* { dg-final { scan-assembler-not {\mplfd\M} } } */
+/* { dg-final { scan-assembler-not {\mplxsd\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
index 72504bdfbbd..e9a45d5159d 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
@@ -57,4 +57,7 @@ scalar_float_denorm (void)
return 0x1p-149f; /* PLFS. */
}
-/* { dg-final { scan-assembler-times {\mxxspltidp\M} 6 } } */
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 6 } } */
+/* { dg-final { scan-assembler-times {\mxxsplti32dx\M} 1 } } */
+/* { dg-final { scan-assembler-not {\mplfs\M} } } */
+/* { dg-final { scan-assembler-not {\mplxssp\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
index d509459292c..d81198b163d 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
@@ -51,14 +51,16 @@ v2df_double_m_inf (void)
vector double
v2df_double_pi (void)
{
- return (vector double) { M_PI, M_PI }; /* PLFD. */
+ return (vector double) { M_PI, M_PI }; /* 2x XXSPLTI32DX. */
}
vector double
v2df_double_denorm (void)
{
- return (vector double) { (double)0x1p-149f,
- (double)0x1p-149f }; /* PLFD. */
+ return (vector double) { (double)0x1p-149f, /* XXSPLTIB, */
+ (double)0x1p-149f }; /* XXSPLTI32DX. */
}
-/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
+/* { dg-final { scan-assembler-times {\mxxsplti32dx\M} 3 } } */
+/* { dg-final { scan-assembler-not {\mplxv\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
index 06a8289d09b..f0eb982eadf 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
@@ -162,4 +162,4 @@ main (int argc, char *argv [])
/* { dg-final { scan-assembler-times {\mxxspltiw\M} 1 } } */
/* { dg-final { scan-assembler-times {\mxxspltidp\M} 2 } } */
-/* { dg-final { scan-assembler-times {\mxxsplti32dx\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mxxsplti32dx\M} 4 } } */
* [gcc(refs/users/meissner/heads/work046)] Use XXSPLTI32DX to generate some constants.
@ 2021-04-08 20:08 Michael Meissner
From: Michael Meissner @ 2021-04-08 20:08 UTC
To: gcc-cvs
https://gcc.gnu.org/g:5d6e7dd0d1d1bbcf9cf9a52c54640825438ecc95
commit 5d6e7dd0d1d1bbcf9cf9a52c54640825438ecc95
Author: Michael Meissner <meissner@linux.ibm.com>
Date: Thu Apr 8 16:07:50 2021 -0400
Use XXSPLTI32DX to generate some constants.
This patch generates a pair of XXSPLTI32DX instructions to load 64-bit
scalar or 128-bit vector constants into the vector registers.
I added a new constraint (eD) for constants that can be loaded with
XXSPLTI32DX, but cannot be loaded with the XXSPLTIDP or XXSPLTIW
instructions.
I added a debug switch (-mxxsplti32dx) to control whether this behavior is
on or off.
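The split in vsx.md picks which word to initialize first so that a low word of all zeros or all ones can be set with the shorter XXSPLTIB. A minimal sketch of that choice follows. The names split_plan and plan_split are hypothetical and only mirror the four split operands. Note that this sketch tests the masked low word against 0xffffffff, which is a choice of the sketch and not a copy of the patch's comparison.

```c
#include <stdint.h>

/* Hypothetical description of the two-instruction sequence.  */
struct split_plan
{
  int first_is_xxspltib;	/* Nonzero if XXSPLTIB replaces insn 1.  */
  unsigned first_ix;		/* Word index (0 or 1) set by insn 1.  */
  uint32_t first_imm;		/* 32-bit value for that word.  */
  unsigned second_ix;		/* Word index set by insn 2.  */
  uint32_t second_imm;		/* 32-bit value for that word.  */
};

/* Plan how to load the 64-bit pattern BITS.  */
static struct split_plan
plan_split (uint64_t bits)
{
  uint32_t high = (uint32_t) (bits >> 32);
  uint32_t low = (uint32_t) (bits & 0xffffffffu);
  struct split_plan plan;

  if (low == 0 || low == 0xffffffffu)
    {
      /* Splat 0 or -1 over the whole register first, then overwrite
	 the high word with XXSPLTI32DX.  */
      plan.first_is_xxspltib = 1;
      plan.first_ix = 1;
      plan.first_imm = low;
      plan.second_ix = 0;
      plan.second_imm = high;
    }
  else
    {
      /* General case: two XXSPLTI32DX, high word first.  */
      plan.first_is_xxspltib = 0;
      plan.first_ix = 0;
      plan.first_imm = high;
      plan.second_ix = 1;
      plan.second_imm = low;
    }
  return plan;
}
```

For example, the double 0x1p-149 has the bit pattern 0x36A0000000000000, so its low word is zero and the plan is XXSPLTIB followed by one XXSPLTI32DX. M_PI (0x400921FB54442D18) needs two XXSPLTI32DX instructions. Both results match the counts expected by the updated tests.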
gcc/
2021-04-08 Michael Meissner <meissner@linux.ibm.com>
* config/rs6000/constraints.md (eD): New constraint.
* config/rs6000/predicates.md (easy_fp_constant): If we can load
the constant with a pair of XXSPLTI32DX instructions, it is easy.
(xxsplti32dx_operand): New predicate.
(easy_vector_constant): If we can load the constant with a pair of
XXSPLTI32DX instructions, it is easy.
* config/rs6000/rs6000-cpus.def (OTHER_POWER10_MASKS): Add
-mxxsplti32dx.
(POWERPC_MASKS): Add -mxxsplti32dx.
* config/rs6000/rs6000-protos.h (xxsplti32dx_constant_p): New
declaration.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Add
-mxxsplti32dx support.
(xxsplti32dx_constant_p): New helper function.
(output_vec_const_move): Split constants that need XXSPLTI32DX.
(rs6000_opt_masks): Add -mxxsplti32dx.
* config/rs6000/rs6000.md (movsf_hardfloat): Add support for
loading constants with XXSPLTI32DX.
(mov<mode>_hardfloat32, FMOVE64 iterator): Add support for loading
constants with XXSPLTI32DX.
(mov<mode>_hardfloat64, FMOVE64 iterator): Add support for loading
constants with XXSPLTI32DX.
* config/rs6000/rs6000.opt (-mxxsplti32dx): New switch.
* config/rs6000/vsx.md (UNSPEC_XXSPLTI32DX_CONST): New unspec.
(vsx_mov<mode>_64bit): Add support for loading constants with
XXSPLTI32DX.
(vsx_mov<mode>_32bit): Add support for loading constants with
XXSPLTI32DX.
(XXSPLTI32DX): New mode iterator.
(xxsplti32dx_<mode>): New insn and splits.
(xxsplti32dx_<mode>_first): New insns.
(xxsplti32dx_<mode>_second): New insns.
gcc/testsuite/
2021-04-08 Michael Meissner <meissner@linux.ibm.com>
* gcc.target/powerpc/vec-splati-runnable.c: Update insn count.
* gcc.target/powerpc/vec-splat-constant-sf.c: Update insn count.
* gcc.target/powerpc/vec-splat-constant-df.c: Update insn count.
* gcc.target/powerpc/vec-splat-constant-v2df.c: Update insn
count.
Diff:
---
gcc/config/rs6000/constraints.md | 6 +
gcc/config/rs6000/predicates.md | 22 ++++
gcc/config/rs6000/rs6000-cpus.def | 2 +
gcc/config/rs6000/rs6000-protos.h | 1 +
gcc/config/rs6000/rs6000.c | 94 ++++++++++++++++
gcc/config/rs6000/rs6000.md | 44 +++++---
gcc/config/rs6000/rs6000.opt | 4 +
gcc/config/rs6000/vsx.md | 121 ++++++++++++++++++---
.../gcc.target/powerpc/vec-splat-constant-df.c | 9 +-
.../gcc.target/powerpc/vec-splat-constant-sf.c | 5 +-
.../gcc.target/powerpc/vec-splat-constant-v2df.c | 10 +-
.../gcc.target/powerpc/vec-splati-runnable.c | 2 +-
12 files changed, 283 insertions(+), 37 deletions(-)
diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index 70b1eb01770..c7137337f4c 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -208,6 +208,12 @@
(and (match_code "const_int")
(match_test "((- (unsigned HOST_WIDE_INT) ival) + 0x8000) < 0x10000")))
+;; SF/DF/V2DF/DI/V2DI scalar or vector constant that can be loaded with a pair
+;; of XXSPLTI32DX instructions.
+(define_constraint "eD"
+ "A vector constant that can be loaded with XXSPLTI32DX instructions."
+ (match_operand 0 "xxsplti32dx_operand"))
+
;; SF/DF/V2DF scalar or vector constant that can be loaded with XXSPLTIDP
(define_constraint "eF"
"A vector constant that can be loaded with the XXSPLTIDP instruction."
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 48d9c5509a2..281c6e835b9 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -606,6 +606,11 @@
if (xxspltidp_operand (op, mode))
return 1;
+ /* If we have the ISA 3.1 XXSPLTI32DX instruction, see if the constant can
+ be loaded with a pair of those instructions. */
+ if (xxsplti32dx_operand (op, mode))
+ return 1;
+
/* Otherwise consider floating point constants hard, so that the
constant gets pushed to memory during the early RTL phases. This
has the advantage that double precision constants that can be
@@ -677,6 +682,20 @@
return xxspltidp_constant_p (op, mode, &value);
})
+;; Return 1 if operand is a SF/DF CONST_DOUBLE or V2DF CONST_VECTOR that can be
+;; loaded via a pair of ISA 3.1 XXSPLTI32DX instructions. Do not return true if
+;; the value is 0.0 or it can be loaded with XXSPLTIDP, since that is easy to
+;; generate without using XXSPLTI32DX.
+(define_predicate "xxsplti32dx_operand"
+ (match_code "const_double,const_int,const_vector,vec_duplicate")
+{
+ if (op == CONST0_RTX (mode))
+ return false;
+
+ HOST_WIDE_INT value = 0;
+ return xxsplti32dx_constant_p (op, mode, &value);
+})
+
;; Return 1 if the operand is a CONST_VECTOR and can be loaded into a
;; vector register without using memory.
(define_predicate "easy_vector_constant"
@@ -696,6 +715,9 @@
if (xxspltidp_operand (op, mode))
return true;
+ if (xxsplti32dx_operand (op, mode))
+ return true;
+
if (TARGET_P9_VECTOR
&& xxspltib_constant_p (op, mode, &num_insns, &value))
return true;
diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
index cf4044831f7..5a14191cc6c 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -79,6 +79,7 @@
| OPTION_MASK_PCREL \
| OPTION_MASK_PCREL_OPT \
| OPTION_MASK_PREFIXED \
+ | OPTION_MASK_XXSPLTI32DX \
| OPTION_MASK_XXSPLTIDP \
| OPTION_MASK_XXSPLTIW)
@@ -163,6 +164,7 @@
| OPTION_MASK_SOFT_FLOAT \
| OPTION_MASK_STRICT_ALIGN_OPTIONAL \
| OPTION_MASK_VSX \
+ | OPTION_MASK_XXSPLTI32DX \
| OPTION_MASK_XXSPLTIDP \
| OPTION_MASK_XXSPLTIW)
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 0fe1c176236..12bf60b043d 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -34,6 +34,7 @@ extern bool easy_altivec_constant (rtx, machine_mode);
extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *);
extern bool xxspltiw_constant_p (rtx, machine_mode, HOST_WIDE_INT *);
extern bool xxspltidp_constant_p (rtx, machine_mode, HOST_WIDE_INT *);
+extern bool xxsplti32dx_constant_p (rtx, machine_mode, HOST_WIDE_INT *);
extern int vspltis_shifted (rtx);
extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int);
extern bool macho_lo_sum_memory_operand (rtx, machine_mode);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 08a853f2e8f..659eb301c87 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4484,6 +4484,10 @@ rs6000_option_override_internal (bool global_init_p)
&& (rs6000_isa_flags_explicit & OPTION_MASK_XXSPLTIDP) == 0)
rs6000_isa_flags |= OPTION_MASK_XXSPLTIDP;
+ if (TARGET_POWER10 && TARGET_VSX
+ && (rs6000_isa_flags_explicit & OPTION_MASK_XXSPLTI32DX) == 0)
+ rs6000_isa_flags |= OPTION_MASK_XXSPLTI32DX;
+
if (!TARGET_PCREL && TARGET_PCREL_OPT)
rs6000_isa_flags &= ~OPTION_MASK_PCREL_OPT;
@@ -6610,6 +6614,92 @@ xxspltidp_constant_p (rtx op,
return true;
}
+/* Return true if OP is of the given MODE and can be synthesized with a pair
+ of ISA 3.1 XXSPLTI32DX instructions. If the constant can be loaded with
+ XXSPLTIDP or is 0/-1, return false.
+
+ Return the 64-bit constant to use in the two XXSPLTI32DX instructions via
+ CONSTANT_PTR. */
+
+bool
+xxsplti32dx_constant_p (rtx op,
+ machine_mode mode,
+ HOST_WIDE_INT *constant_ptr)
+{
+ *constant_ptr = 0;
+
+ if (!TARGET_XXSPLTI32DX)
+ return false;
+
+ if (mode == VOIDmode)
+ mode = GET_MODE (op);
+
+ if (op == CONST0_RTX (mode))
+ return false;
+
+ rtx element = op;
+ if (mode == V2DFmode || mode == V2DImode)
+ {
+ /* Handle VEC_DUPLICATE and CONST_VECTOR. */
+ if (GET_CODE (op) == VEC_DUPLICATE)
+ element = XEXP (op, 0);
+
+ else if (GET_CODE (op) == CONST_VECTOR)
+ {
+ element = CONST_VECTOR_ELT (op, 0);
+ if (!rtx_equal_p (element, CONST_VECTOR_ELT (op, 1)))
+ return false;
+ }
+
+ else
+ return false;
+
+ mode = GET_MODE_INNER (mode);
+ }
+
+ if (GET_MODE (element) != mode)
+ return false;
+
+ /* Handle floating point constants. */
+ if (mode == SFmode || mode == DFmode)
+ {
+ HOST_WIDE_INT xxspltidp_value = 0;
+
+ if (!CONST_DOUBLE_P (element))
+ return false;
+
+ if (xxspltidp_constant_p (element, mode, &xxspltidp_value))
+ return false;
+
+ long high_low[2];
+ const struct real_value *rv = CONST_DOUBLE_REAL_VALUE (element);
+ REAL_VALUE_TO_TARGET_DOUBLE (*rv, high_low);
+
+ if (!BYTES_BIG_ENDIAN)
+ std::swap (high_low[0], high_low[1]);
+
+ *constant_ptr = (high_low[0] << 32) | (high_low[1] & 0xffffffff);
+ return true;
+ }
+
+ /* Handle integer constants. */
+ else if (mode == DImode)
+ {
+ if (!CONST_INT_P (element))
+ return false;
+
+ HOST_WIDE_INT value = INTVAL (element);
+ if (value == -1)
+ return false;
+
+ *constant_ptr = value;
+ return true;
+ }
+
+ else
+ return false;
+}
+
const char *
output_vec_const_move (rtx *operands)
{
@@ -6683,6 +6773,9 @@ output_vec_const_move (rtx *operands)
return "xxspltidp %x0,%2";
}
+ if (xxsplti32dx_operand (vec, mode))
+ return "#";
+
if (TARGET_P9_VECTOR
&& xxspltib_constant_p (vec, mode, &num_insns, &xxspltib_value))
{
@@ -24182,6 +24275,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
{ "vsx", OPTION_MASK_VSX, false, true },
{ "xxspltiw", OPTION_MASK_XXSPLTIW, false, true },
{ "xxspltidp", OPTION_MASK_XXSPLTIDP, false, true },
+ { "xxsplti32dx", OPTION_MASK_XXSPLTI32DX, false, true },
#ifdef OPTION_MASK_64BIT
#if TARGET_AIX_OS
{ "aix64", OPTION_MASK_64BIT, false, false },
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 5569e0591e6..b5886d3ccf4 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -7564,17 +7564,17 @@
;;
;; LWZ LFS LXSSP LXSSPX STFS STXSSP
;; STXSSPX STW XXLXOR LI FMR XSCPSGNDP
-;; MR MT<x> MF<x> NOP XXSPLTIDP
+;; MR MT<x> MF<x> NOP XXSPLTIDP XXSPLTI32DX
(define_insn "movsf_hardfloat"
[(set (match_operand:SF 0 "nonimmediate_operand"
"=!r, f, v, wa, m, wY,
Z, m, wa, !r, f, wa,
- !r, *c*l, !r, *h, wa")
+ !r, *c*l, !r, *h, wa, wa")
(match_operand:SF 1 "input_operand"
"m, m, wY, Z, f, v,
wa, r, j, j, f, wa,
- r, r, *h, 0, eF"))]
+ r, r, *h, 0, eF, eD"))]
"(register_operand (operands[0], SFmode)
|| register_operand (operands[1], SFmode))
&& TARGET_HARD_FLOAT
@@ -7597,19 +7597,28 @@
mt%0 %1
mf%1 %0
nop
+ #
#"
[(set_attr "type"
"load, fpload, fpload, fpload, fpstore, fpstore,
fpstore, store, veclogical, integer, fpsimple, fpsimple,
- *, mtjmpr, mfjmpr, *, vecperm")
+ *, mtjmpr, mfjmpr, *, vecperm, vecperm")
(set_attr "isa"
"*, *, p9v, p8v, *, p9v,
p8v, *, *, *, *, *,
- *, *, *, *, p10")
+ *, *, *, *, p10, p10")
(set_attr "prefixed"
"*, *, *, *, *, *,
*, *, *, *, *, *,
- *, *, *, *, yes")])
+ *, *, *, *, yes, yes")
+ (set_attr "max_prefixed_insns"
+ "*, *, *, *, *, *,
+ *, *, *, *, *, *,
+ *, *, *, *, *, 2")
+ (set_attr "num_insns"
+ "*, *, *, *, *, *,
+ *, *, *, *, *, *,
+ *, *, *, *, *, 2")])
;; LWZ LFIWZX STW STFIWX MTVSRWZ MFVSRWZ
;; FMR MR MT%0 MF%1 NOP
@@ -7869,18 +7878,18 @@
;; STFD LFD FMR LXSD STXSD
;; LXSD STXSD XXLOR XXLXOR GPR<-0
-;; LWZ STW MR XXSPLTIDP
+;; LWZ STW MR XXSPLTIDP XXSPLTI32DX
(define_insn "*mov<mode>_hardfloat32"
[(set (match_operand:FMOVE64 0 "nonimmediate_operand"
"=m, d, d, <f64_p9>, wY,
<f64_av>, Z, <f64_vsx>, <f64_vsx>, !r,
- Y, r, !r, wa")
+ Y, r, !r, wa, wa")
(match_operand:FMOVE64 1 "input_operand"
"d, m, d, wY, <f64_p9>,
Z, <f64_av>, <f64_vsx>, <zero_fp>, <zero_fp>,
- r, Y, r, eF"))]
+ r, Y, r, eF, eD"))]
"! TARGET_POWERPC64 && TARGET_HARD_FLOAT
&& (gpc_reg_operand (operands[0], <MODE>mode)
|| gpc_reg_operand (operands[1], <MODE>mode))"
@@ -7898,24 +7907,33 @@
#
#
#
+ #
#"
[(set_attr "type"
"fpstore, fpload, fpsimple, fpload, fpstore,
fpload, fpstore, veclogical, veclogical, two,
- store, load, two, vecperm")
+ store, load, two, vecperm, vecperm")
(set_attr "size" "64")
(set_attr "length"
"*, *, *, *, *,
*, *, *, *, 8,
- 8, 8, 8, *")
+ 8, 8, 8, *, *")
(set_attr "isa"
"*, *, *, p9v, p9v,
p7v, p7v, *, *, *,
- *, *, *, p10")
+ *, *, *, p10, p10")
(set_attr "prefixed"
"*, *, *, *, *,
*, *, *, *, *,
- *, *, *, yes")])
+ *, *, *, yes, yes")
+ (set_attr "max_prefixed_insns"
+ "*, *, *, *, *,
+ *, *, *, *, *,
+ *, *, *, *, 2")
+ (set_attr "num_insns"
+ "*, *, *, *, *,
+ *, *, *, *, *,
+ *, *, *, *, 2")])
;; STW LWZ MR G-const H-const F-const
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 6620cdb7716..bd269369ca0 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -627,3 +627,7 @@ Generate (do not generate) the XXSPLTIW instruction.
mxxspltidp
Target Undocumented Mask(XXSPLTIDP) Var(rs6000_isa_flags)
Generate (do not generate) the XXSPLTIDP instruction.
+
+mxxsplti32dx
+Target Undocumented Mask(XXSPLTI32DX) Var(rs6000_isa_flags)
+Generate (do not generate) the XXSPLTI32DX instruction.
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 4b0307d447e..ec2c148fb4d 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -372,6 +372,7 @@
UNSPEC_XXSPLTIW
UNSPEC_XXSPLTIDP
UNSPEC_XXSPLTI32DX
+ UNSPEC_XXSPLTI32DX_CONST
UNSPEC_XXPERMX
UNSPEC_XXEVAL
])
@@ -1173,16 +1174,19 @@
;; VSX store VSX load VSX move VSX->GPR GPR->VSX LQ (GPR)
;; STQ (GPR) GPR load GPR store GPR move XXSPLTIB VSPLTISW
;; VSX 0/-1 VMX const GPR const LVX (VMX) STVX (VMX) XXSPLTI*
+;; XXSPLTI32DX
(define_insn "vsx_mov<mode>_64bit"
[(set (match_operand:VSX_M 0 "nonimmediate_operand"
"=ZwO, wa, wa, r, we, ?wQ,
?&r, ??r, ??Y, <??r>, wa, v,
- ?wa, v, <??r>, wZ, v, wa")
+ ?wa, v, <??r>, wZ, v, wa,
+ wa")
(match_operand:VSX_M 1 "input_operand"
"wa, ZwO, wa, we, r, r,
wQ, Y, r, r, wE, jwM,
- ?jwM, W, <nW>, v, wZ, eWeF"))]
+ ?jwM, W, <nW>, v, wZ, eWeF,
+ eD"))]
"TARGET_POWERPC64 && VECTOR_MEM_VSX_P (<MODE>mode)
&& (register_operand (operands[0], <MODE>mode)
@@ -1193,41 +1197,47 @@
[(set_attr "type"
"vecstore, vecload, vecsimple, mtvsr, mfvsr, load,
store, load, store, *, vecsimple, vecsimple,
- vecsimple, *, *, vecstore, vecload, vecperm")
+ vecsimple, *, *, vecstore, vecload, vecperm,
+ vecperm")
(set_attr "num_insns"
"*, *, *, 2, *, 2,
2, 2, 2, 2, *, *,
- *, 5, 2, *, *, *")
+ *, 5, 2, *, *, *,
+ 2")
(set_attr "max_prefixed_insns"
"*, *, *, *, *, 2,
2, 2, 2, 2, *, *,
- *, *, *, *, *, *")
+ *, *, *, *, *, *,
+ 2")
(set_attr "length"
"*, *, *, 8, *, 8,
8, 8, 8, 8, *, *,
- *, 20, 8, *, *, *")
+ *, 20, 8, *, *, *,
+ *")
(set_attr "isa"
"<VSisa>, <VSisa>, <VSisa>, *, *, *,
*, *, *, *, p9v, *,
- <VSisa>, *, *, *, *, p10")
+ <VSisa>, *, *, *, *, p10,
+ p10")
(set_attr "prefixed"
"*, *, *, *, *, *,
*, *, *, *, *, *,
- *, *, *, *, *, yes")])
+ *, *, *, *, *, yes,
+ yes")])
;; VSX store VSX load VSX move GPR load GPR store GPR move
;; XXSPLTIB VSPLTISW VSX 0/-1 VMX const GPR const XXSPLTI*
-;; LVX (VMX) STVX (VMX)
+;; LVX (VMX) STVX (VMX) XXSPLTI32DX
(define_insn "*vsx_mov<mode>_32bit"
[(set (match_operand:VSX_M 0 "nonimmediate_operand"
"=ZwO, wa, wa, ??r, ??Y, <??r>,
wa, v, ?wa, v, <??r>, wa,
- wZ, v")
+ wZ, v, wa")
(match_operand:VSX_M 1 "input_operand"
"wa, ZwO, wa, Y, r, r,
wE, jwM, ?jwM, W, <nW>, eWeF,
- v, wZ"))]
+ v, wZ, eD"))]
"!TARGET_POWERPC64 && VECTOR_MEM_VSX_P (<MODE>mode)
&& (register_operand (operands[0], <MODE>mode)
@@ -1238,19 +1248,27 @@
[(set_attr "type"
"vecstore, vecload, vecsimple, load, store, *,
vecsimple, vecsimple, vecsimple, *, *, vecperm,
- vecstore, vecload")
+ vecstore, vecload, vecperm")
(set_attr "length"
"*, *, *, 16, 16, 16,
*, *, *, 20, 16, *,
- *, *")
+ *, *, *")
(set_attr "isa"
"<VSisa>, <VSisa>, <VSisa>, *, *, *,
p9v, *, <VSisa>, *, *, p10,
- *, *")
+ *, *, p10")
(set_attr "prefixed"
"*, *, *, *, *, *,
*, *, *, *, *, yes,
- *, *")])
+ *, *, yes")
+ (set_attr "max_prefixed_insns"
+ "*, *, *, *, *, *,
+ *, *, *, *, *, *,
+ *, *, 2")
+ (set_attr "num_insns"
+ "*, *, *, *, *, *,
+ *, *, *, *, *, *,
+ *, *, *")])
;; Explicit load/store expanders for the builtin functions
(define_expand "vsx_load_<mode>"
@@ -6330,6 +6348,79 @@
DONE;
})
+;; XXSPLTI32DX used to create 64-bit constants
+(define_mode_iterator XXSPLTI32DX [SF DF V2DF V2DI])
+
+(define_insn_and_split "*xxsplti32dx_<mode>"
+ [(set (match_operand:XXSPLTI32DX 0 "vsx_register_operand" "=wa")
+ (match_operand:XXSPLTI32DX 1 "xxsplti32dx_operand"))]
+ "TARGET_XXSPLTI32DX"
+ "#"
+ "&& 1"
+ [(set (match_dup 0)
+ (unspec:XXSPLTI32DX [(match_dup 2)
+ (match_dup 3)] UNSPEC_XXSPLTI32DX_CONST))
+ (set (match_dup 0)
+ (unspec:XXSPLTI32DX [(match_dup 0)
+ (match_dup 4)
+ (match_dup 5)] UNSPEC_XXSPLTI32DX_CONST))]
+{
+ HOST_WIDE_INT value = 0;
+
+ if (!xxsplti32dx_constant_p (operands[1], <MODE>mode, &value))
+ gcc_unreachable ();
+
+ HOST_WIDE_INT high = value >> 32;
+ HOST_WIDE_INT low = value & 0xffffffff;
+
+ /* If the low bits are 0/-1, initialize that word first. This way we can
+ use a smaller XXSPLTIB instruction instead of the first XXSPLTI32DX. */
+ if (low == 0 || low == -1)
+ {
+ operands[2] = const1_rtx;
+ operands[3] = GEN_INT (low);
+ operands[4] = const0_rtx;
+ operands[5] = GEN_INT (high);
+ }
+ else
+ {
+ operands[2] = const0_rtx;
+ operands[3] = GEN_INT (high);
+ operands[4] = const1_rtx;
+ operands[5] = GEN_INT (low);
+ }
+}
+ [(set_attr "type" "vecperm")
+ (set_attr "prefixed" "yes")
+ (set_attr "num_insns" "2")
+ (set_attr "max_prefixed_insns" "2")])
+
+;; First word of XXSPLTI32DX
+(define_insn "*xxsplti32dx_<mode>_first"
+ [(set (match_operand:XXSPLTI32DX 0 "vsx_register_operand" "=wa,wa,wa")
+ (unspec:XXSPLTI32DX [(match_operand 1 "u1bit_cint_operand" "n,n,n")
+ (match_operand 2 "const_int_operand" "O,wM,n")]
+ UNSPEC_XXSPLTI32DX_CONST))]
+ "TARGET_XXSPLTI32DX"
+ "@
+ xxspltib %x0,0
+ xxspltib %x0,255
+ xxsplti32dx %x0,%1,%2"
+ [(set_attr "type" "vecperm")
+ (set_attr "prefixed" "*,*,yes")])
+
+;; Second word of XXSPLTI32DX
+(define_insn "*xxsplti32dx_<mode>_second"
+ [(set (match_operand:XXSPLTI32DX 0 "vsx_register_operand" "=wa")
+ (unspec:XXSPLTI32DX [(match_operand:XXSPLTI32DX 1 "vsx_register_operand" "0")
+ (match_operand 2 "u1bit_cint_operand" "n")
+ (match_operand 3 "const_int_operand" "n")]
+ UNSPEC_XXSPLTI32DX_CONST))]
+ "TARGET_XXSPLTI32DX"
+ "xxsplti32dx %x0,%2,%3"
+ [(set_attr "type" "vecperm")
+ (set_attr "prefixed" "yes")])
+
;; XXSPLTI32DX built-in support.
(define_expand "xxsplti32dx_v4si"
[(set (match_operand:V4SI 0 "register_operand" "=wa")
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
index 8f6e176f9af..1435ef4ef4f 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
@@ -48,13 +48,16 @@ scalar_double_m_inf (void) /* XXSPLTIDP. */
double
scalar_double_pi (void)
{
- return M_PI; /* PLFD. */
+ return M_PI; /* 2x XXSPLTI32DX. */
}
double
scalar_double_denorm (void)
{
- return 0x1p-149f; /* PLFD. */
+ return 0x1p-149f; /* XXSPLTIB, XXSPLTI32DX. */
}
-/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
+/* { dg-final { scan-assembler-times {\mxxsplti32dx\M} 3 } } */
+/* { dg-final { scan-assembler-not {\mplfd\M} } } */
+/* { dg-final { scan-assembler-not {\mplxsd\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
index 72504bdfbbd..e9a45d5159d 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
@@ -57,4 +57,7 @@ scalar_float_denorm (void)
return 0x1p-149f; /* PLFS. */
}
-/* { dg-final { scan-assembler-times {\mxxspltidp\M} 6 } } */
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 6 } } */
+/* { dg-final { scan-assembler-times {\mxxsplti32dx\M} 1 } } */
+/* { dg-final { scan-assembler-not {\mplfs\M} } } */
+/* { dg-final { scan-assembler-not {\mplxssp\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
index d509459292c..d81198b163d 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c
@@ -51,14 +51,16 @@ v2df_double_m_inf (void)
vector double
v2df_double_pi (void)
{
- return (vector double) { M_PI, M_PI }; /* PLFD. */
+ return (vector double) { M_PI, M_PI }; /* 2x XXSPLTI32DX. */
}
vector double
v2df_double_denorm (void)
{
- return (vector double) { (double)0x1p-149f,
- (double)0x1p-149f }; /* PLFD. */
+ return (vector double) { (double)0x1p-149f, /* XXSPLTIB, */
+ (double)0x1p-149f }; /* XXSPLTI32DX. */
}
-/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
+/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
+/* { dg-final { scan-assembler-times {\mxxsplti32dx\M} 3 } } */
+/* { dg-final { scan-assembler-not {\mplxv\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
index 06a8289d09b..f0eb982eadf 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
@@ -162,4 +162,4 @@ main (int argc, char *argv [])
/* { dg-final { scan-assembler-times {\mxxspltiw\M} 1 } } */
/* { dg-final { scan-assembler-times {\mxxspltidp\M} 2 } } */
-/* { dg-final { scan-assembler-times {\mxxsplti32dx\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mxxsplti32dx\M} 4 } } */
* [gcc(refs/users/meissner/heads/work046)] Use XXSPLTI32DX to generate some constants.
@ 2021-04-08 5:35 Michael Meissner
0 siblings, 0 replies; 3+ messages in thread
From: Michael Meissner @ 2021-04-08 5:35 UTC (permalink / raw)
To: gcc-cvs
https://gcc.gnu.org/g:362c5ee933a376db9fd1a49fbb7ffb69c8d3dde2
commit 362c5ee933a376db9fd1a49fbb7ffb69c8d3dde2
Author: Michael Meissner <meissner@linux.ibm.com>
Date: Thu Apr 8 01:35:01 2021 -0400
Use XXSPLTI32DX to generate some constants.
This patch generates a pair of XXSPLTI32DX instructions to load 64-bit
scalar or 128-bit vector constants into the vector registers.
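As a rough illustration of the paragraph above (this is not code from the patch), the following C sketch splits a 64-bit constant into its two 32-bit immediates and picks the order of the two instructions. The names `splat_step`, `double_bits`, and `plan_splat` are hypothetical, and treating an all-ones low word as the "-1" case is an assumption about the intent of the split code in vsx.md.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of one instruction in the two-step sequence.  */
struct splat_step
{
  const char *insn;	/* Mnemonic chosen for this step.  */
  int word;		/* Word selector: 0 = high word, 1 = low word.  */
  uint32_t imm;		/* 32-bit value placed in that word.  */
};

/* Return the raw IEEE 754 bit pattern of D.  */
static uint64_t
double_bits (double d)
{
  uint64_t bits;
  memcpy (&bits, &d, sizeof bits);
  return bits;
}

/* Plan the two instructions that build VALUE in each doubleword.
   If the low word is all zeros or all ones, a non-prefixed XXSPLTIB can
   fill the register first (byte immediate 0 or 255), and one XXSPLTI32DX
   then overwrites the high word.  Otherwise two XXSPLTI32DX are used.  */
static void
plan_splat (uint64_t value, struct splat_step steps[2])
{
  uint32_t high = (uint32_t) (value >> 32);
  uint32_t low = (uint32_t) value;

  assert (steps != NULL);

  if (low == 0 || low == UINT32_MAX)
    {
      steps[0] = (struct splat_step) { "xxspltib", 1, low };
      steps[1] = (struct splat_step) { "xxsplti32dx", 0, high };
    }
  else
    {
      steps[0] = (struct splat_step) { "xxsplti32dx", 0, high };
      steps[1] = (struct splat_step) { "xxsplti32dx", 1, low };
    }
}
```

Under these assumptions, pi (bit pattern 0x400921FB54442D18) needs two XXSPLTI32DX instructions, while 0x1p-149 (bit pattern 0x36A0000000000000) has a zero low word and so gets XXSPLTIB followed by a single XXSPLTI32DX.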
I added a new constraint (eD) for constants that can be loaded with
XXSPLTI32DX, but cannot be loaded with the XXSPLTIDP or XXSPLTIW
instructions.
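A minimal sketch of which double constants would end up under the eD constraint described above. It assumes that XXSPLTIDP can load exactly those values that are representable as a non-denormal single-precision float; that rule, and the names `splat_kind`, `classify_double`, and `classify_v2df`, are assumptions made for illustration and are not taken from the patch. NaNs and the integer (DImode, V2DImode) cases are not modeled.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical classification of how a constant would be loaded.  */
enum splat_kind
{
  SPLAT_NONE,		/* Not a splat, needs some other sequence.  */
  SPLAT_ZERO,		/* +0.0, handled by cheaper existing paths.  */
  SPLAT_XXSPLTIDP,	/* One XXSPLTIDP is enough (eF).  */
  SPLAT_XXSPLTI32DX	/* Needs the two-instruction sequence (eD).  */
};

/* Classify a scalar double constant.  */
static enum splat_kind
classify_double (double d)
{
  if (d == 0.0 && !signbit (d))
    return SPLAT_ZERO;

  /* Assumed XXSPLTIDP rule: the value survives a round trip through
     float, and the float is not denormal.  */
  float f = (float) d;
  bool exact = ((double) f == d);
  bool sf_denormal = (fpclassify (f) == FP_SUBNORMAL);

  if (exact && !sf_denormal)
    return SPLAT_XXSPLTIDP;

  return SPLAT_XXSPLTI32DX;
}

/* Classify a two-element vector: both elements must have the same bit
   pattern, otherwise the constant is not a splat at all.  */
static enum splat_kind
classify_v2df (const double v[2])
{
  if (memcmp (&v[0], &v[1], sizeof (double)) != 0)
    return SPLAT_NONE;

  return classify_double (v[0]);
}
```

With these rules, 1.0 and negative infinity stay with XXSPLTIDP, while pi (not exact in single precision) and 0x1p-149 (denormal in single precision) fall through to the XXSPLTI32DX pair.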
I added a debug switch (-mxxsplti32dx) to control whether this behavior is
on or off.
gcc/
2021-04-08 Michael Meissner <meissner@linux.ibm.com>
* config/rs6000/constraints.md (eD): New constraint.
* config/rs6000/predicates.md (easy_fp_constant): If we can load
the constant with a pair of XXSPLTI32DX instructions, it is easy.
(xxsplti32dx_operand): New predicate.
(easy_vector_constant): If we can load the constant with a pair of
XXSPLTI32DX instructions, it is easy.
* config/rs6000/rs6000-cpus.def (OTHER_POWER10_MASKS): Add
-mxxsplti32dx.
(POWERPC_MASKS): Add -mxxsplti32dx.
* config/rs6000/rs6000-protos.h (xxsplti32dx_constant_p): New
declaration.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Add
-mxxsplti32dx support.
(xxsplti32dx_constant_p): New helper function.
(output_vec_const_move): Split constants that need XXSPLTI32DX.
(rs6000_opt_masks): Add -mxxsplti32dx.
* config/rs6000/rs6000.md (movsf_hardfloat): Add support for
loading constants with XXSPLTI32DX.
(mov<mode>_hardfloat32, FMOVE64 iterator): Add support for loading
constants with XXSPLTI32DX.
(mov<mode>_hardfloat64, FMOVE64 iterator): Add support for loading
constants with XXSPLTI32DX.
* config/rs6000/rs6000.opt (-mxxsplti32dx): New switch.
* config/rs6000/vsx.md (UNSPEC_XXSPLTI32DX_CONST): New unspec.
(vsx_mov<mode>_64bit): Add support for loading constants with
XXSPLTI32DX.
(vsx_mov<mode>_32bit): Add support for loading constants with
XXSPLTI32DX.
(XXSPLTI32DX): New mode iterator.
(xxsplti32dx_<mode>): New insn and splits.
(xxsplti32dx_<mode>_first): New insns.
(xxsplti32dx_<mode>_second): New insns.
gcc/testsuite/
2021-04-08 Michael Meissner <meissner@linux.ibm.com>
* gcc.target/powerpc/vec-splati-runnable.c: Update insn count.
Diff:
---
gcc/config/rs6000/constraints.md | 6 +
gcc/config/rs6000/predicates.md | 22 ++++
gcc/config/rs6000/rs6000-cpus.def | 2 +
gcc/config/rs6000/rs6000-protos.h | 1 +
gcc/config/rs6000/rs6000.c | 94 ++++++++++++++++
gcc/config/rs6000/rs6000.md | 44 +++++---
gcc/config/rs6000/rs6000.opt | 4 +
gcc/config/rs6000/vsx.md | 121 ++++++++++++++++++---
.../gcc.target/powerpc/vec-splati-runnable.c | 2 +-
9 files changed, 267 insertions(+), 29 deletions(-)
diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index 70b1eb01770..c7137337f4c 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -208,6 +208,12 @@
(and (match_code "const_int")
(match_test "((- (unsigned HOST_WIDE_INT) ival) + 0x8000) < 0x10000")))
+;; SF/DF/V2DF/DI/V2DI scalar or vector constant that can be loaded with a pair
+;; of XXSPLTI32DX instructions.
+(define_constraint "eD"
+ "A vector constant that can be loaded with XXSPLTI32DX instructions."
+ (match_operand 0 "xxsplti32dx_operand"))
+
;; SF/DF/V2DF scalar or vector constant that can be loaded with XXSPLTIDP
(define_constraint "eF"
"A vector constant that can be loaded with the XXSPLTIDP instruction."
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 48d9c5509a2..281c6e835b9 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -606,6 +606,11 @@
if (xxspltidp_operand (op, mode))
return 1;
+ /* If we have the ISA 3.1 XXSPLTI32DX instruction, see if the constant can
+ be loaded with a pair of those instructions. */
+ if (xxsplti32dx_operand (op, mode))
+ return 1;
+
/* Otherwise consider floating point constants hard, so that the
constant gets pushed to memory during the early RTL phases. This
has the advantage that double precision constants that can be
@@ -677,6 +682,20 @@
return xxspltidp_constant_p (op, mode, &value);
})
+;; Return 1 if operand is a SF/DF CONST_DOUBLE or V2DF CONST_VECTOR that can be
+;; loaded via a pair of ISA 3.1 XXSPLTI32DX instructions. Do not return true if
+;; the value is 0.0 or it can be loaded with XXSPLTIDP, since that is easy to
+;; generate without using XXSPLTI32DX.
+(define_predicate "xxsplti32dx_operand"
+ (match_code "const_double,const_int,const_vector,vec_duplicate")
+{
+ if (op == CONST0_RTX (mode))
+ return false;
+
+ HOST_WIDE_INT value = 0;
+ return xxsplti32dx_constant_p (op, mode, &value);
+})
+
;; Return 1 if the operand is a CONST_VECTOR and can be loaded into a
;; vector register without using memory.
(define_predicate "easy_vector_constant"
@@ -696,6 +715,9 @@
if (xxspltidp_operand (op, mode))
return true;
+ if (xxsplti32dx_operand (op, mode))
+ return true;
+
if (TARGET_P9_VECTOR
&& xxspltib_constant_p (op, mode, &num_insns, &value))
return true;
diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
index cf4044831f7..5a14191cc6c 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -79,6 +79,7 @@
| OPTION_MASK_PCREL \
| OPTION_MASK_PCREL_OPT \
| OPTION_MASK_PREFIXED \
+ | OPTION_MASK_XXSPLTI32DX \
| OPTION_MASK_XXSPLTIDP \
| OPTION_MASK_XXSPLTIW)
@@ -163,6 +164,7 @@
| OPTION_MASK_SOFT_FLOAT \
| OPTION_MASK_STRICT_ALIGN_OPTIONAL \
| OPTION_MASK_VSX \
+ | OPTION_MASK_XXSPLTI32DX \
| OPTION_MASK_XXSPLTIDP \
| OPTION_MASK_XXSPLTIW)
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 0fe1c176236..12bf60b043d 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -34,6 +34,7 @@ extern bool easy_altivec_constant (rtx, machine_mode);
extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *);
extern bool xxspltiw_constant_p (rtx, machine_mode, HOST_WIDE_INT *);
extern bool xxspltidp_constant_p (rtx, machine_mode, HOST_WIDE_INT *);
+extern bool xxsplti32dx_constant_p (rtx, machine_mode, HOST_WIDE_INT *);
extern int vspltis_shifted (rtx);
extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int);
extern bool macho_lo_sum_memory_operand (rtx, machine_mode);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 08a853f2e8f..659eb301c87 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4484,6 +4484,10 @@ rs6000_option_override_internal (bool global_init_p)
&& (rs6000_isa_flags_explicit & OPTION_MASK_XXSPLTIDP) == 0)
rs6000_isa_flags |= OPTION_MASK_XXSPLTIDP;
+ if (TARGET_POWER10 && TARGET_VSX
+ && (rs6000_isa_flags_explicit & OPTION_MASK_XXSPLTI32DX) == 0)
+ rs6000_isa_flags |= OPTION_MASK_XXSPLTI32DX;
+
if (!TARGET_PCREL && TARGET_PCREL_OPT)
rs6000_isa_flags &= ~OPTION_MASK_PCREL_OPT;
@@ -6610,6 +6614,92 @@ xxspltidp_constant_p (rtx op,
return true;
}
+/* Return true if OP is of the given MODE and can be synthesized with a pair
+ of ISA 3.1 XXSPLTI32DX instructions. If the constant can be synthesized
+ with XXSPLTIDP or is 0/-1, return false.
+
+ Return the 64-bit constant to use in the two XXSPLTI32DX instructions via
+ CONSTANT_PTR. */
+
+bool
+xxsplti32dx_constant_p (rtx op,
+ machine_mode mode,
+ HOST_WIDE_INT *constant_ptr)
+{
+ *constant_ptr = 0;
+
+ if (!TARGET_XXSPLTI32DX)
+ return false;
+
+ if (mode == VOIDmode)
+ mode = GET_MODE (op);
+
+ if (op == CONST0_RTX (mode))
+ return false;
+
+ rtx element = op;
+ if (mode == V2DFmode || mode == V2DImode)
+ {
+ /* Handle VEC_DUPLICATE and CONST_VECTOR. */
+ if (GET_CODE (op) == VEC_DUPLICATE)
+ element = XEXP (op, 0);
+
+ else if (GET_CODE (op) == CONST_VECTOR)
+ {
+ element = CONST_VECTOR_ELT (op, 0);
+ if (!rtx_equal_p (element, CONST_VECTOR_ELT (op, 1)))
+ return false;
+ }
+
+ else
+ return false;
+
+ mode = GET_MODE_INNER (mode);
+ }
+
+ if (GET_MODE (element) != mode)
+ return false;
+
+ /* Handle floating point constants. */
+ if (mode == SFmode || mode == DFmode)
+ {
+ HOST_WIDE_INT xxspltidp_value = 0;
+
+ if (!CONST_DOUBLE_P (element))
+ return false;
+
+ if (xxspltidp_constant_p (element, mode, &xxspltidp_value))
+ return false;
+
+ long high_low[2];
+ const struct real_value *rv = CONST_DOUBLE_REAL_VALUE (element);
+ REAL_VALUE_TO_TARGET_DOUBLE (*rv, high_low);
+
+ if (!BYTES_BIG_ENDIAN)
+ std::swap (high_low[0], high_low[1]);
+
+ *constant_ptr = (((HOST_WIDE_INT) high_low[0] << 32)
+		   | (high_low[1] & 0xffffffff));
+ return true;
+ }
+
+ /* Handle integer constants. */
+ else if (mode == DImode)
+ {
+ if (!CONST_INT_P (element))
+ return false;
+
+ HOST_WIDE_INT value = INTVAL (element);
+ if (value == -1)
+ return false;
+
+ *constant_ptr = value;
+ return true;
+ }
+
+ else
+ return false;
+}
+
const char *
output_vec_const_move (rtx *operands)
{
@@ -6683,6 +6773,9 @@ output_vec_const_move (rtx *operands)
return "xxspltidp %x0,%2";
}
+ if (xxsplti32dx_operand (vec, mode))
+ return "#";
+
if (TARGET_P9_VECTOR
&& xxspltib_constant_p (vec, mode, &num_insns, &xxspltib_value))
{
@@ -24182,6 +24275,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
{ "vsx", OPTION_MASK_VSX, false, true },
{ "xxspltiw", OPTION_MASK_XXSPLTIW, false, true },
{ "xxspltidp", OPTION_MASK_XXSPLTIDP, false, true },
+ { "xxsplti32dx", OPTION_MASK_XXSPLTI32DX, false, true },
#ifdef OPTION_MASK_64BIT
#if TARGET_AIX_OS
{ "aix64", OPTION_MASK_64BIT, false, false },
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 5569e0591e6..b5886d3ccf4 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -7564,17 +7564,17 @@
;;
;; LWZ LFS LXSSP LXSSPX STFS STXSSP
;; STXSSPX STW XXLXOR LI FMR XSCPSGNDP
-;; MR MT<x> MF<x> NOP XXSPLTIDP
+;; MR MT<x> MF<x> NOP XXSPLTIDP XXSPLTI32DX
(define_insn "movsf_hardfloat"
[(set (match_operand:SF 0 "nonimmediate_operand"
"=!r, f, v, wa, m, wY,
Z, m, wa, !r, f, wa,
- !r, *c*l, !r, *h, wa")
+ !r, *c*l, !r, *h, wa, wa")
(match_operand:SF 1 "input_operand"
"m, m, wY, Z, f, v,
wa, r, j, j, f, wa,
- r, r, *h, 0, eF"))]
+ r, r, *h, 0, eF, eD"))]
"(register_operand (operands[0], SFmode)
|| register_operand (operands[1], SFmode))
&& TARGET_HARD_FLOAT
@@ -7597,19 +7597,28 @@
mt%0 %1
mf%1 %0
nop
+ #
#"
[(set_attr "type"
"load, fpload, fpload, fpload, fpstore, fpstore,
fpstore, store, veclogical, integer, fpsimple, fpsimple,
- *, mtjmpr, mfjmpr, *, vecperm")
+ *, mtjmpr, mfjmpr, *, vecperm, vecperm")
(set_attr "isa"
"*, *, p9v, p8v, *, p9v,
p8v, *, *, *, *, *,
- *, *, *, *, p10")
+ *, *, *, *, p10, p10")
(set_attr "prefixed"
"*, *, *, *, *, *,
*, *, *, *, *, *,
- *, *, *, *, yes")])
+ *, *, *, *, yes, yes")
+ (set_attr "max_prefixed_insns"
+ "*, *, *, *, *, *,
+ *, *, *, *, *, *,
+ *, *, *, *, *, 2")
+ (set_attr "num_insns"
+ "*, *, *, *, *, *,
+ *, *, *, *, *, *,
+ *, *, *, *, *, 2")])
;; LWZ LFIWZX STW STFIWX MTVSRWZ MFVSRWZ
;; FMR MR MT%0 MF%1 NOP
@@ -7869,18 +7878,18 @@
;; STFD LFD FMR LXSD STXSD
;; LXSD STXSD XXLOR XXLXOR GPR<-0
-;; LWZ STW MR XXSPLTIDP
+;; LWZ STW MR XXSPLTIDP XXSPLTI32DX
(define_insn "*mov<mode>_hardfloat32"
[(set (match_operand:FMOVE64 0 "nonimmediate_operand"
"=m, d, d, <f64_p9>, wY,
<f64_av>, Z, <f64_vsx>, <f64_vsx>, !r,
- Y, r, !r, wa")
+ Y, r, !r, wa, wa")
(match_operand:FMOVE64 1 "input_operand"
"d, m, d, wY, <f64_p9>,
Z, <f64_av>, <f64_vsx>, <zero_fp>, <zero_fp>,
- r, Y, r, eF"))]
+ r, Y, r, eF, eD"))]
"! TARGET_POWERPC64 && TARGET_HARD_FLOAT
&& (gpc_reg_operand (operands[0], <MODE>mode)
|| gpc_reg_operand (operands[1], <MODE>mode))"
@@ -7898,24 +7907,33 @@
#
#
#
+ #
#"
[(set_attr "type"
"fpstore, fpload, fpsimple, fpload, fpstore,
fpload, fpstore, veclogical, veclogical, two,
- store, load, two, vecperm")
+ store, load, two, vecperm, vecperm")
(set_attr "size" "64")
(set_attr "length"
"*, *, *, *, *,
*, *, *, *, 8,
- 8, 8, 8, *")
+ 8, 8, 8, *, *")
(set_attr "isa"
"*, *, *, p9v, p9v,
p7v, p7v, *, *, *,
- *, *, *, p10")
+ *, *, *, p10, p10")
(set_attr "prefixed"
"*, *, *, *, *,
*, *, *, *, *,
- *, *, *, yes")])
+ *, *, *, yes, yes")
+ (set_attr "max_prefixed_insns"
+ "*, *, *, *, *,
+ *, *, *, *, *,
+ *, *, *, *, 2")
+ (set_attr "num_insns"
+ "*, *, *, *, *,
+ *, *, *, *, *,
+ *, *, *, *, 2")])
;; STW LWZ MR G-const H-const F-const
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 6620cdb7716..bd269369ca0 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -627,3 +627,7 @@ Generate (do not generate) the XXSPLTIW instruction.
mxxspltidp
Target Undocumented Mask(XXSPLTIDP) Var(rs6000_isa_flags)
Generate (do not generate) the XXSPLTIDP instruction.
+
+mxxsplti32dx
+Target Undocumented Mask(XXSPLTI32DX) Var(rs6000_isa_flags)
+Generate (do not generate) the XXSPLTI32DX instruction.
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 4b0307d447e..877f1cdca39 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -372,6 +372,7 @@
UNSPEC_XXSPLTIW
UNSPEC_XXSPLTIDP
UNSPEC_XXSPLTI32DX
+ UNSPEC_XXSPLTI32DX_CONST
UNSPEC_XXPERMX
UNSPEC_XXEVAL
])
@@ -1173,16 +1174,19 @@
;; VSX store VSX load VSX move VSX->GPR GPR->VSX LQ (GPR)
;; STQ (GPR) GPR load GPR store GPR move XXSPLTIB VSPLTISW
;; VSX 0/-1 VMX const GPR const LVX (VMX) STVX (VMX) XXSPLTI*
+;; XXSPLTI32DX
(define_insn "vsx_mov<mode>_64bit"
[(set (match_operand:VSX_M 0 "nonimmediate_operand"
"=ZwO, wa, wa, r, we, ?wQ,
?&r, ??r, ??Y, <??r>, wa, v,
- ?wa, v, <??r>, wZ, v, wa")
+ ?wa, v, <??r>, wZ, v, wa,
+ wa")
(match_operand:VSX_M 1 "input_operand"
"wa, ZwO, wa, we, r, r,
wQ, Y, r, r, wE, jwM,
- ?jwM, W, <nW>, v, wZ, eWeF"))]
+ ?jwM, W, <nW>, v, wZ, eWeF,
+ eD"))]
"TARGET_POWERPC64 && VECTOR_MEM_VSX_P (<MODE>mode)
&& (register_operand (operands[0], <MODE>mode)
@@ -1193,41 +1197,47 @@
[(set_attr "type"
"vecstore, vecload, vecsimple, mtvsr, mfvsr, load,
store, load, store, *, vecsimple, vecsimple,
- vecsimple, *, *, vecstore, vecload, vecperm")
+ vecsimple, *, *, vecstore, vecload, vecperm,
+ vecperm")
(set_attr "num_insns"
"*, *, *, 2, *, 2,
2, 2, 2, 2, *, *,
- *, 5, 2, *, *, *")
+ *, 5, 2, *, *, *,
+ 2")
(set_attr "max_prefixed_insns"
"*, *, *, *, *, 2,
2, 2, 2, 2, *, *,
- *, *, *, *, *, *")
+ *, *, *, *, *, *,
+ 2")
(set_attr "length"
"*, *, *, 8, *, 8,
8, 8, 8, 8, *, *,
- *, 20, 8, *, *, *")
+ *, 20, 8, *, *, *,
+ *")
(set_attr "isa"
"<VSisa>, <VSisa>, <VSisa>, *, *, *,
*, *, *, *, p9v, *,
- <VSisa>, *, *, *, *, p10")
+ <VSisa>, *, *, *, *, p10,
+ p10")
(set_attr "prefixed"
"*, *, *, *, *, *,
*, *, *, *, *, *,
- *, *, *, *, *, yes")])
+ *, *, *, *, *, yes,
+ yes")])
;; VSX store VSX load VSX move GPR load GPR store GPR move
;; XXSPLTIB VSPLTISW VSX 0/-1 VMX const GPR const XXSPLTI*
-;; LVX (VMX) STVX (VMX)
+;; LVX (VMX) STVX (VMX) XXSPLTI32DX
(define_insn "*vsx_mov<mode>_32bit"
[(set (match_operand:VSX_M 0 "nonimmediate_operand"
"=ZwO, wa, wa, ??r, ??Y, <??r>,
wa, v, ?wa, v, <??r>, wa,
- wZ, v")
+ wZ, v, wa")
(match_operand:VSX_M 1 "input_operand"
"wa, ZwO, wa, Y, r, r,
wE, jwM, ?jwM, W, <nW>, eWeF,
- v, wZ"))]
+ v, wZ, eD"))]
"!TARGET_POWERPC64 && VECTOR_MEM_VSX_P (<MODE>mode)
&& (register_operand (operands[0], <MODE>mode)
@@ -1238,19 +1248,27 @@
[(set_attr "type"
"vecstore, vecload, vecsimple, load, store, *,
vecsimple, vecsimple, vecsimple, *, *, vecperm,
- vecstore, vecload")
+ vecstore, vecload, vecperm")
(set_attr "length"
"*, *, *, 16, 16, 16,
*, *, *, 20, 16, *,
- *, *")
+ *, *, *")
(set_attr "isa"
"<VSisa>, <VSisa>, <VSisa>, *, *, *,
p9v, *, <VSisa>, *, *, p10,
- *, *")
+ *, *, p10")
(set_attr "prefixed"
"*, *, *, *, *, *,
*, *, *, *, *, yes,
- *, *")])
+ *, *, yes")
+ (set_attr "max_prefixed_insns"
+ "*, *, *, *, *, *,
+ *, *, *, *, *, *,
+ *, *, 2")
+ (set_attr "num_insns"
+ "*, *, *, *, *, *,
+ *, *, *, *, *, *,
+ *, *, *")])
;; Explicit load/store expanders for the builtin functions
(define_expand "vsx_load_<mode>"
@@ -6330,6 +6348,79 @@
DONE;
})
+;; XXSPLTI32DX used to create 64-bit constants
+(define_mode_iterator XXSPLTI32DX [SF DF V2DF V2DI])
+
+(define_insn_and_split "*xxsplti32dx_<mode>"
+ [(set (match_operand:XXSPLTI32DX 0 "vsx_register_operand" "=wa")
+ (match_operand:XXSPLTI32DX 1 "xxsplti32dx_operand"))]
+ "TARGET_XXSPLTI32DX"
+ "#"
+ "&& 1"
+ [(set (match_dup 0)
+ (unspec:XXSPLTI32DX [(match_dup 2)
+ (match_dup 3)] UNSPEC_XXSPLTI32DX_CONST))
+ (set (match_dup 0)
+ (unspec:XXSPLTI32DX [(match_dup 0)
+ (match_dup 4)
+ (match_dup 5)] UNSPEC_XXSPLTI32DX_CONST))]
+{
+ HOST_WIDE_INT value = 0;
+
+ if (!xxsplti32dx_constant_p (operands[1], <MODE>mode, &value))
+ gcc_unreachable ();
+
+ HOST_WIDE_INT high = value >> 32;
+ HOST_WIDE_INT low = value & 0xffffffff;
+
+ /* If the low bits are 0/-1, initialize that word first. This way we can
+ use a smaller XXSPLTIB instruction instead of the first XXSPLTI32DX. */
+ if (low == 0 || low == -1)
+ {
+ operands[2] = const1_rtx;
+ operands[3] = GEN_INT (low);
+ operands[4] = const0_rtx;
+ operands[5] = GEN_INT (high);
+ }
+ else
+ {
+ operands[2] = const0_rtx;
+ operands[3] = GEN_INT (high);
+ operands[4] = const1_rtx;
+ operands[5] = GEN_INT (low);
+ }
+}
+ [(set_attr "type" "vecperm")
+ (set_attr "prefixed" "yes")
+ (set_attr "num_insns" "2")
+ (set_attr "max_prefixed_insns" "2")])
+
+;; First word of XXSPLTI32DX
+(define_insn "*xxsplti32dx_<mode>_first"
+ [(set (match_operand:XXSPLTI32DX 0 "vsx_register_operand" "=wa,wa,wa")
+ (unspec:XXSPLTI32DX [(match_operand 1 "u1bit_cint_operand" "n,n,n")
+ (match_operand 2 "const_int_operand" "O,wM,n")]
+ UNSPEC_XXSPLTI32DX_CONST))]
+ "TARGET_XXSPLTI32DX"
+ "@
+ xxspltib %x0,0
+ xxspltib %x0,255
+ xxsplti32dx %x0,%1,%2"
+ [(set_attr "type" "vecperm")
+ (set_attr "prefixed" "*,*,yes")])
+
+;; Second word of XXSPLTI32DX
+(define_insn "*xxsplti32dx_<mode>_second"
+ [(set (match_operand:XXSPLTI32DX 0 "vsx_register_operand" "=wa")
+ (unspec:XXSPLTI32DX [(match_dup 0)
+ (match_operand 1 "u1bit_cint_operand" "n")
+ (match_operand 2 "const_int_operand" "n")]
+ UNSPEC_XXSPLTI32DX_CONST))]
+ "TARGET_XXSPLTI32DX"
+ "xxsplti32dx %x0,%1,%2"
+ [(set_attr "type" "vecperm")
+ (set_attr "prefixed" "yes")])
+
;; XXSPLTI32DX built-in support.
(define_expand "xxsplti32dx_v4si"
[(set (match_operand:V4SI 0 "register_operand" "=wa")
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
index 06a8289d09b..f0eb982eadf 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
@@ -162,4 +162,4 @@ main (int argc, char *argv [])
/* { dg-final { scan-assembler-times {\mxxspltiw\M} 1 } } */
/* { dg-final { scan-assembler-times {\mxxspltidp\M} 2 } } */
-/* { dg-final { scan-assembler-times {\mxxsplti32dx\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mxxsplti32dx\M} 4 } } */
end of thread, other threads:[~2021-04-12 17:46 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-12 17:46 [gcc(refs/users/meissner/heads/work046)] Use XXSPLTI32DX to generate some constants Michael Meissner
-- strict thread matches above, loose matches on Subject: below --
2021-04-08 20:08 Michael Meissner
2021-04-08 5:35 Michael Meissner