From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 1005)
	id B0651385841D; Fri, 17 Sep 2021 02:53:00 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org B0651385841D
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: Michael Meissner
To: gcc-cvs@gcc.gnu.org
Subject: [gcc(refs/users/meissner/heads/work069)] Generate XXSPLTIDP on power10.
X-Act-Checkin: gcc
X-Git-Author: Michael Meissner
X-Git-Refname: refs/users/meissner/heads/work069
X-Git-Oldrev: 45a889959ad5b3059bb53b63cd16ffa7340cd658
X-Git-Newrev: 0ea9d62fad957382cfd08c9f95fe163a29530f70
Message-Id: <20210917025300.B0651385841D@sourceware.org>
Date: Fri, 17 Sep 2021 02:53:00 +0000 (GMT)
X-BeenThere: gcc-cvs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-cvs mailing list
List-Unsubscribe: ,
List-Archive:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 17 Sep 2021 02:53:00 -0000

https://gcc.gnu.org/g:0ea9d62fad957382cfd08c9f95fe163a29530f70

commit 0ea9d62fad957382cfd08c9f95fe163a29530f70
Author: Michael Meissner
Date:   Thu Sep 16 22:52:43 2021 -0400

    Generate XXSPLTIDP on power10.

    This patch implements XXSPLTIDP support for SF, DF, and DI scalar
    constants and V2DF and V2DI vector constants.  The XXSPLTIDP instruction
    is given a 32-bit immediate that is converted to a vector of two DFmode
    constants.  The immediate is in SFmode format, so only constants that fit
    as SFmode values can be loaded with XXSPLTIDP.

    I added two new constraints (eF and eV) to match scalar and vector
    constants that can be loaded with the XXSPLTIDP instruction.

    I have added a temporary switch (-mxxspltidp) to control whether or not
    the XXSPLTIDP instruction is generated.

    I added 5 new tests to test loading up SF/DF/DI scalar and V2DI/V2DF
    vector constants.

    2021-09-16  Michael Meissner

    gcc/

        * config/rs6000/constraints.md (eF): New constraint.
        (eV): New constraint.
        * config/rs6000/predicates.md (easy_fp_constant): If we can load the
        scalar constant with XXSPLTIDP, the constant is easy.
        (easy_fp_constant_64bit_scalar): New predicate.
        (easy_vector_constant_64bit_element): New predicate.
        (easy_vector_constant): If we can generate XXSPLTIDP, mark the vector
        constant as easy.
        * config/rs6000/rs6000-protos.h (xxspltidp_constant_immediate): New
        declaration.
        (prefixed_xxsplti_p): Likewise.
        * config/rs6000/rs6000.c (xxspltidp_constant_immediate): New function.
        (output_vec_const_move): Add support for XXSPLTIDP.
        (prefixed_xxsplti_p): New function.
        * config/rs6000/rs6000.md (prefixed attribute): Add support for the
        xxsplti* prefixed instructions.
        (movsf_hardfloat): Add XXSPLTIDP support.
        (mov_hardfloat32, FMOVE64 iterator): Likewise.
        (mov_hardfloat64, FMOVE64 iterator): Likewise.
        (movdi_internal32): Likewise.
        (movdi_internal64): Likewise.
        * config/rs6000/rs6000.opt (-mxxspltidp): New switch.
        * config/rs6000/vsx.md (vsx_move_64bit): Add XXSPLTIDP support.
        (vsx_move_32bit): Likewise.
        (XXSPLTIDP_S): New mode iterator.
        (XXSPLTIDP_V): Likewise.
        (XXSPLTIDP): Likewise.
        (xxspltidp__inst): Replace xxspltidp_v2df_inst with an iterated form
        that also does SFmode, DFmode, DImode, and V2DImode.
        (xxspltidp__internal): New insn and splits.
        * doc/md.texi (PowerPC and IBM RS6000 constraints): Document the eF
        and eV constraints.

    gcc/testsuite/

        * gcc.target/powerpc/vec-splat-constant-df.c: New test.
        * gcc.target/powerpc/vec-splat-constant-di.c: New test.
        * gcc.target/powerpc/vec-splat-constant-sf.c: New test.
        * gcc.target/powerpc/vec-splat-constant-v2df.c: New test.
        * gcc.target/powerpc/vec-splat-constant-v2di.c: New test.

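
The rule described in the commit message (a constant qualifies only if it fits exactly in SFmode format and is not an SFmode subnormal) can be illustrated outside the compiler. The following is a minimal standalone C sketch, not the code from the patch: the helper names xxspltidp_imm_from_double and xxspltidp_imm_from_int64 are hypothetical, the host is assumed to use IEEE 754 float and double, and NaNs are simply rejected here for brevity, which is stricter than what the patch does for floating-point constants.

```c
#include <float.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a GCC function): decide whether the double D can
   be produced from a 32-bit single-precision immediate, following the rules
   stated in the commit message.  On success, store the single-precision bit
   pattern in *IMM.  Assumes IEEE 754 float and double on the host.  */
static bool
xxspltidp_imm_from_double (double d, uint32_t *imm)
{
  uint64_t dbits;
  memcpy (&dbits, &d, sizeof dbits);

  /* +0.0 is easy to create without XXSPLTIDP, so it is excluded.  */
  if (dbits == 0)
    return false;

  /* Simplification made by this sketch only: reject every NaN.  */
  if (isnan (d))
    return false;

  /* Finite values outside the float range cannot be narrowed exactly.  */
  if (!isinf (d) && (d > FLT_MAX || d < -FLT_MAX))
    return false;

  /* The value must survive a round trip through single precision.  */
  float f = (float) d;
  if ((double) f != d)
    return false;

  uint32_t fbits;
  memcpy (&fbits, &f, sizeof fbits);

  /* IEEE 754 single precision: 1 sign bit, 8 exponent bits, 23 mantissa
     bits.  A zero exponent with a non-zero mantissa is a subnormal.  */
  uint32_t sf_exponent = (fbits >> 23) & 0xff;
  uint32_t sf_mantissa = fbits & 0x7fffff;
  if (sf_exponent == 0 && sf_mantissa != 0)
    return false;

  *imm = fbits;
  return true;
}

/* Hypothetical helper for the integer case: treat BITS as the bit pattern
   of a double, reject NaN and subnormal patterns, then reuse the check
   above.  */
static bool
xxspltidp_imm_from_int64 (uint64_t bits, uint32_t *imm)
{
  /* IEEE 754 double precision: 1 sign bit, 11 exponent bits, 52 mantissa
     bits.  */
  uint64_t df_exponent = (bits >> 52) & 0x7ff;
  uint64_t df_mantissa = bits & UINT64_C (0xfffffffffffff);

  if (df_exponent == 0x7ff && df_mantissa != 0)	/* NaN.  */
    return false;
  if (df_exponent == 0 && df_mantissa != 0)	/* Subnormal.  */
    return false;

  double d;
  memcpy (&d, &bits, sizeof d);
  return xxspltidp_imm_from_double (d, imm);
}
```

Under these assumptions, 1.0 yields the immediate 0x3f800000 and -0.0 yields 0x80000000, while pi (not exactly representable in single precision) and 0x1p-149 (a single-precision subnormal) are rejected. The integer pattern 0x3ff0000000000000 is accepted because it is the bit pattern of 1.0.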
Diff:
---
 gcc/config/rs6000/constraints.md                   |  10 ++
 gcc/config/rs6000/predicates.md                    | 140 +++++++++++++++++++++
 gcc/config/rs6000/rs6000-protos.h                  |   2 +
 gcc/config/rs6000/rs6000.c                         |  96 ++++++++++++++
 gcc/config/rs6000/rs6000.md                        |  58 ++++++---
 gcc/config/rs6000/rs6000.opt                       |   4 +
 gcc/config/rs6000/vsx.md                           |  60 ++++++++-
 gcc/doc/md.texi                                    |   6 +
 .../gcc.target/powerpc/vec-splat-constant-df.c     |  60 +++++++++
 .../gcc.target/powerpc/vec-splat-constant-di.c     |  70 +++++++++++
 .../gcc.target/powerpc/vec-splat-constant-sf.c     |  60 +++++++++
 .../gcc.target/powerpc/vec-splat-constant-v2df.c   |  64 ++++++++++
 .../gcc.target/powerpc/vec-splat-constant-v2di.c   |  50 ++++++++
 13 files changed, 658 insertions(+), 22 deletions(-)

diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index c8cff1a3038..1ff46c9f4fc 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -208,11 +208,21 @@
   (and (match_code "const_int")
        (match_test "((- (unsigned HOST_WIDE_INT) ival) + 0x8000) < 0x10000")))
 
+;; DI/SF/DF scalar constant that can be loaded with the XXSPLTIDP instruction.
+(define_constraint "eF"
+  "A 64-bit scalar constant that can be loaded with the XXSPLTIDP instruction."
+  (match_operand 0 "easy_fp_constant_64bit_scalar"))
+
 ;; 34-bit signed integer constant
 (define_constraint "eI"
   "A signed 34-bit integer constant if prefixed instructions are supported."
   (match_operand 0 "cint34_operand"))
 
+;; V2DI/V2DF vector constant that can be loaded with the XXSPLTIDP instruction.
+(define_constraint "eV"
+  "A 128-bit vector constant that can be loaded with the XXSPLTIDP instruction."
+  (match_operand 0 "easy_vector_constant_64bit_element"))
+
 ;; Floating-point constraints.  These two are defined so that insn
 ;; length attributes can be calculated exactly.
 
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md index 956e42bc514..7544ac87700 100644 --- a/gcc/config/rs6000/predicates.md +++ b/gcc/config/rs6000/predicates.md @@ -601,6 +601,11 @@ if (TARGET_VSX && op == CONST0_RTX (mode)) return 1; + /* If we have the ISA 3.1 XXSPLTIDP instruction, see if the constant can + be loaded with that instruction. */ + if (easy_fp_constant_64bit_scalar (op, mode)) + return 1; + /* Otherwise consider floating point constants hard, so that the constant gets pushed to memory during the early RTL phases. This has the advantage that double precision constants that can be @@ -609,6 +614,138 @@ return 0; }) +;; Return 1 if the operand is a 64-bit scalar constant that can be loaded via +;; the XXSPLTIDP instruction, which takes a SFmode value and produces a V2DF or +;; V2DI mode result that is interpretted as a 64-bit scalar. +(define_predicate "easy_fp_constant_64bit_scalar" + (match_code "const_int,const_double") +{ + const REAL_VALUE_TYPE *rv; + REAL_VALUE_TYPE rv_type; + + /* Can we do the XXSPLTIDP instruction? */ + if (!TARGET_XXSPLTIDP || !TARGET_PREFIXED || !TARGET_VSX) + return false; + + if (mode == VOIDmode) + mode = GET_MODE (op); + + /* Don't return true for 0.0 or 0 since that is easy to create without + XXSPLTIDP. */ + if (op == CONST0_RTX (mode)) + return false; + + /* Handle DImode by creating a DF value from it. */ + if (CONST_INT_P (op) && (mode == DImode || mode == VOIDmode)) + { + HOST_WIDE_INT df_value = INTVAL (op); + + /* Avoid values that look like DFmode NaN's. The IEEE 754 64-bit + floating format has 1 bit for sign, 11 bits for the exponent, + and 52 bits for the mantissa. NaN values have the exponent set + to all 1 bits, and the mantissa non-zero (mantissa == 0 is + infinity). */ + int df_exponent = (df_value >> 52) & 0x7ff; + HOST_WIDE_INT df_mantissa = df_value & HOST_WIDE_INT_C (0x1fffffffffffff); + + if (df_exponent == 0x7ff && df_mantissa != 0) /* NaN. 
*/ + return false; + + /* Avoid values that are DFmode subnormal values. Subnormal numbers + have the exponent all 0 bits, and the mantissa non-zero. If the + value is subnormal, then the hidden bit in the mantissa is not + set. */ + if (df_exponent == 0 && df_mantissa != 0) /* subnormal. */ + return false; + + long df_words[2]; + df_words[0] = (df_value >> 32) & 0xffffffff; + df_words[1] = df_value & 0xffffffff; + + /* real_from_target takes the target words in target order. */ + if (!BYTES_BIG_ENDIAN) + std::swap (df_words[0], df_words[1]); + + real_from_target (&rv_type, df_words, DFmode); + rv = &rv_type; + } + + /* Handle SFmode/DFmode constants. Don't allow decimal or IEEE 128-bit + binary constants. */ + else if (CONST_DOUBLE_P (op) && (mode == SFmode || mode == DFmode)) + rv = CONST_DOUBLE_REAL_VALUE (op); + + /* We can't handle anything else with the XXSPLTIDP instruction. */ + else + return false; + + /* Validate that the number can be stored as a SFmode value. */ + if (!exact_real_truncate (SFmode, rv)) + return false; + + /* Validate that the number is not a SFmode subnormal value (exponent is 0, + mantissa field is non-zero) which is undefined for the XXSPLTIDP + instruction. */ + long sf_value; + real_to_target (&sf_value, rv, SFmode); + + /* IEEE 754 32-bit values have 1 bit for the sign, 8 bits for the exponent, + and 23 bits for the mantissa. Subnormal numbers have the exponent all + 0 bits, and the mantissa non-zero. */ + long sf_exponent = (sf_value >> 23) & 0xFF; + long sf_mantissa = sf_value & 0x7FFFFF; + + if (sf_exponent == 0 && sf_mantissa != 0) + return false; + + return true; +}) + +;; Return 1 if the operand is a 64-bit vector constant that can be loaded via +;; the XXSPLTIDP instruction, which takes a SFmode value and produces a +;; V2DFmode or V2DI result. +;; +;; We cannot combine the scalar and vector cases because otherwise it is +;; problematical if we assign an appropriate integer constant to a TImode +;; value. I.e. 
+;; +;; (set (reg:TI 32) +;; (const_int 0x8000000000000000)) +;; +;; Otherwise, the constant would be splatted into the 2 64-bit positions in the +;; vector register, and not loaded with the upper 64-bits 0, and the constant +;; in the lower 64-bits. + +(define_predicate "easy_vector_constant_64bit_element" + (match_code "const_vector,vec_duplicate") +{ + /* Can we do the XXSPLTIDP instruction? */ + if (!TARGET_XXSPLTIDP || !TARGET_PREFIXED || !TARGET_VSX) + return false; + + if (mode == VOIDmode) + mode = GET_MODE (op); + + if (mode != V2DFmode && mode != V2DImode) + return false; + + if (CONST_VECTOR_P (op)) + { + if (!CONST_VECTOR_DUPLICATE_P (op)) + return false; + + op = CONST_VECTOR_ELT (op, 0); + } + + else if (GET_CODE (op) == VEC_DUPLICATE) + op = XEXP (op, 0); + + else + return false; + + return easy_fp_constant_64bit_scalar (op, GET_MODE_INNER (mode)); +}) + ;; Return 1 if the operand is a constant that can loaded with a XXSPLTIB ;; instruction and then a VUPKHSB, VECSB2W or VECSB2D instruction. 
@@ -653,6 +790,9 @@ if (zero_constant (op, mode) || all_ones_constant (op, mode)) return true; + if (easy_vector_constant_64bit_element (op, mode)) + return true; + if (TARGET_P9_VECTOR && xxspltib_constant_p (op, mode, &num_insns, &value)) return true; diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h index 14f6b313105..e9be9c4d99f 100644 --- a/gcc/config/rs6000/rs6000-protos.h +++ b/gcc/config/rs6000/rs6000-protos.h @@ -32,6 +32,7 @@ extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, int, int, int, extern int easy_altivec_constant (rtx, machine_mode); extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *); +extern long xxspltidp_constant_immediate (rtx, machine_mode); extern int vspltis_shifted (rtx); extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int); extern bool macho_lo_sum_memory_operand (rtx, machine_mode); @@ -198,6 +199,7 @@ enum non_prefixed_form reg_to_non_prefixed (rtx reg, machine_mode mode); extern bool prefixed_load_p (rtx_insn *); extern bool prefixed_store_p (rtx_insn *); extern bool prefixed_paddi_p (rtx_insn *); +extern bool prefixed_xxsplti_p (rtx_insn *); extern void rs6000_asm_output_opcode (FILE *); extern void output_pcrel_opt_reloc (rtx); extern void rs6000_final_prescan_insn (rtx_insn *, rtx [], int); diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index ad81dfb316d..d6d3ecdee84 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -6697,6 +6697,60 @@ xxspltib_constant_p (rtx op, return true; } +/* Return the immediate value used in the XXSPLTIDP instruction. */ + +long +xxspltidp_constant_immediate (rtx op, machine_mode mode) +{ + long ret; + + /* Handle vectors. 
*/ + if (CONST_VECTOR_P (op)) + { + op = CONST_VECTOR_ELT (op, 0); + mode = GET_MODE_INNER (mode); + } + + else if (GET_CODE (op) == VEC_DUPLICATE) + { + op = XEXP (op, 0); + mode = GET_MODE (op); + } + + gcc_assert (easy_fp_constant_64bit_scalar (op, mode)); + + /* Handle DImode/V2DImode by creating a DF value from it and then converting + the DFmode value to SFmode. */ + if (CONST_INT_P (op)) + { + HOST_WIDE_INT df_value = INTVAL (op); + long df_words[2]; + + df_words[0] = (df_value >> 32) & 0xffffffff; + df_words[1] = df_value & 0xffffffff; + + /* real_to_target takes input in target endian order. */ + if (!BYTES_BIG_ENDIAN) + std::swap (df_words[0], df_words[1]); + + REAL_VALUE_TYPE r; + real_from_target (&r, &df_words[0], DFmode); + real_to_target (&ret, &r, SFmode); + } + + /* For floating point constants, convert to SFmode. */ + else if (CONST_DOUBLE_P (op) && (mode == SFmode || mode == DFmode)) + { + const REAL_VALUE_TYPE *rv = CONST_DOUBLE_REAL_VALUE (op); + real_to_target (&ret, rv, SFmode); + } + + else + gcc_unreachable (); + + return ret; +} + const char * output_vec_const_move (rtx *operands) { @@ -6741,6 +6795,13 @@ output_vec_const_move (rtx *operands) gcc_unreachable (); } + if (easy_fp_constant_64bit_scalar (vec, mode) + || easy_vector_constant_64bit_element (vec, mode)) + { + operands[2] = GEN_INT (xxspltidp_constant_immediate (vec, mode)); + return "xxspltidp %x0,%2"; + } + if (TARGET_P9_VECTOR && xxspltib_constant_p (vec, mode, &num_insns, &xxspltib_value)) { @@ -26475,6 +26536,41 @@ prefixed_paddi_p (rtx_insn *insn) return (iform == INSN_FORM_PCREL_EXTERNAL || iform == INSN_FORM_PCREL_LOCAL); } +/* Whether a permute type instruction is a prefixed XXSPLTI* instruction. + This is called from the prefixed attribute processing. 
*/ + +bool +prefixed_xxsplti_p (rtx_insn *insn) +{ + rtx set = single_set (insn); + if (!set) + return false; + + rtx dest = SET_DEST (set); + rtx src = SET_SRC (set); + machine_mode mode = GET_MODE (dest); + + if (!REG_P (dest) && !SUBREG_P (dest)) + return false; + + switch (mode) + { + case E_DImode: + case E_DFmode: + case E_SFmode: + return easy_fp_constant_64bit_scalar (src, mode); + + case E_V2DImode: + case E_V2DFmode: + return easy_vector_constant_64bit_element (src, mode); + + default: + break; + } + + return false; +} + /* Whether the next instruction needs a 'p' prefix issued before the instruction is printed out. */ static bool prepend_p_to_next_insn; diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 6bec2bddbde..8afc4b2756d 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -314,6 +314,11 @@ (eq_attr "type" "integer,add") (if_then_else (match_test "prefixed_paddi_p (insn)") + (const_string "yes") + (const_string "no")) + + (eq_attr "type" "vecperm") + (if_then_else (match_test "prefixed_xxsplti_p (insn)") (const_string "yes") (const_string "no"))] @@ -7759,17 +7764,17 @@ ;; ;; LWZ LFS LXSSP LXSSPX STFS STXSSP ;; STXSSPX STW XXLXOR LI FMR XSCPSGNDP -;; MR MT MF NOP +;; MR MT MF NOP XXSPLTIDP (define_insn "movsf_hardfloat" [(set (match_operand:SF 0 "nonimmediate_operand" "=!r, f, v, wa, m, wY, Z, m, wa, !r, f, wa, - !r, *c*l, !r, *h") + !r, *c*l, !r, *h, wa") (match_operand:SF 1 "input_operand" "m, m, wY, Z, f, v, wa, r, j, j, f, wa, - r, r, *h, 0"))] + r, r, *h, 0, eF"))] "(register_operand (operands[0], SFmode) || register_operand (operands[1], SFmode)) && TARGET_HARD_FLOAT @@ -7791,15 +7796,16 @@ mr %0,%1 mt%0 %1 mf%1 %0 - nop" + nop + #" [(set_attr "type" "load, fpload, fpload, fpload, fpstore, fpstore, fpstore, store, veclogical, integer, fpsimple, fpsimple, - *, mtjmpr, mfjmpr, *") + *, mtjmpr, mfjmpr, *, vecperm") (set_attr "isa" "*, *, p9v, p8v, *, p9v, p8v, *, *, *, *, *, - *, *, *, *")]) + 
*, *, *, *, p10")]) ;; LWZ LFIWZX STW STFIWX MTVSRWZ MFVSRWZ ;; FMR MR MT%0 MF%1 NOP @@ -8059,18 +8065,18 @@ ;; STFD LFD FMR LXSD STXSD ;; LXSD STXSD XXLOR XXLXOR GPR<-0 -;; LWZ STW MR +;; LWZ STW MR XXSPLTIDP (define_insn "*mov_hardfloat32" [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=m, d, d, , wY, , Z, , , !r, - Y, r, !r") + Y, r, !r, wa") (match_operand:FMOVE64 1 "input_operand" "d, m, d, wY, , Z, , , , , - r, Y, r"))] + r, Y, r, eF"))] "! TARGET_POWERPC64 && TARGET_HARD_FLOAT && (gpc_reg_operand (operands[0], mode) || gpc_reg_operand (operands[1], mode))" @@ -8087,20 +8093,21 @@ # # # + # #" [(set_attr "type" "fpstore, fpload, fpsimple, fpload, fpstore, fpload, fpstore, veclogical, veclogical, two, - store, load, two") + store, load, two, vecperm") (set_attr "size" "64") (set_attr "length" "*, *, *, *, *, *, *, *, *, 8, - 8, 8, 8") + 8, 8, 8, *") (set_attr "isa" "*, *, *, p9v, p9v, p7v, p7v, *, *, *, - *, *, *")]) + *, *, *, p10")]) ;; STW LWZ MR G-const H-const F-const @@ -8127,19 +8134,19 @@ ;; STFD LFD FMR LXSD STXSD ;; LXSDX STXSDX XXLOR XXLXOR LI 0 ;; STD LD MR MT{CTR,LR} MF{CTR,LR} -;; NOP MFVSRD MTVSRD +;; NOP MFVSRD MTVSRD XXSPLTIDP (define_insn "*mov_hardfloat64" [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=m, d, d, , wY, , Z, , , !r, YZ, r, !r, *c*l, !r, - *h, r, ") + *h, r, , wa") (match_operand:FMOVE64 1 "input_operand" "d, m, d, wY, , Z, , , , , r, YZ, r, r, *h, - 0, , r"))] + 0, , r, eF"))] "TARGET_POWERPC64 && TARGET_HARD_FLOAT && (gpc_reg_operand (operands[0], mode) || gpc_reg_operand (operands[1], mode))" @@ -8161,18 +8168,19 @@ mf%1 %0 nop mfvsrd %0,%x1 - mtvsrd %x0,%1" + mtvsrd %x0,%1 + #" [(set_attr "type" "fpstore, fpload, fpsimple, fpload, fpstore, fpload, fpstore, veclogical, veclogical, integer, store, load, *, mtjmpr, mfjmpr, - *, mfvsr, mtvsr") + *, mfvsr, mtvsr, vecperm") (set_attr "size" "64") (set_attr "isa" "*, *, *, p9v, p9v, p7v, p7v, *, *, *, *, *, *, *, *, - *, p8v, p8v")]) + *, p8v, p8v, p10")]) ;; STD 
LD MR MT MF G-const ;; H-const F-const Special @@ -9220,6 +9228,7 @@ ;; a gpr into a fpr instead of reloading an invalid 'Y' address ;; GPR store GPR load GPR move FPR store FPR load FPR move +;; XXSPLTIDP ;; GPR const AVX store AVX store AVX load AVX load VSX move ;; P9 0 P9 -1 AVX 0/-1 VSX 0 VSX -1 P9 const ;; AVX const @@ -9227,11 +9236,13 @@ (define_insn "*movdi_internal32" [(set (match_operand:DI 0 "nonimmediate_operand" "=Y, r, r, m, ^d, ^d, + ^wa, r, wY, Z, ^v, $v, ^wa, wa, wa, v, wa, *i, v, v") (match_operand:DI 1 "input_operand" "r, Y, r, ^d, m, ^d, + eF, IJKnF, ^v, $v, wY, Z, ^wa, Oj, wM, OjwM, Oj, wM, wS, wB"))] @@ -9246,6 +9257,7 @@ lfd%U1%X1 %0,%1 fmr %0,%1 # + # stxsd %1,%0 stxsdx %x1,%y0 lxsd %0,%1 @@ -9260,17 +9272,20 @@ #" [(set_attr "type" "store, load, *, fpstore, fpload, fpsimple, + vecperm, *, fpstore, fpstore, fpload, fpload, veclogical, vecsimple, vecsimple, vecsimple, veclogical,veclogical,vecsimple, vecsimple") (set_attr "size" "64") (set_attr "length" "8, 8, 8, *, *, *, + *, 16, *, *, *, *, *, *, *, *, *, *, 8, *") (set_attr "isa" "*, *, *, *, *, *, + p10, *, p9v, p7v, p9v, p7v, *, p9v, p9v, p7v, *, *, p7v, p7v")]) @@ -9306,6 +9321,7 @@ }) ;; GPR store GPR load GPR move +;; XXSPLTIDP ;; GPR li GPR lis GPR pli GPR # ;; FPR store FPR load FPR move ;; AVX store AVX store AVX load AVX load VSX move @@ -9316,6 +9332,7 @@ (define_insn "*movdi_internal64" [(set (match_operand:DI 0 "nonimmediate_operand" "=YZ, r, r, + ^wa, r, r, r, r, m, ^d, ^d, wY, Z, $v, $v, ^wa, @@ -9325,6 +9342,7 @@ ?r, ?wa") (match_operand:DI 1 "input_operand" "r, YZ, r, + eF, I, L, eI, nF, ^d, m, ^d, ^v, $v, wY, Z, ^wa, @@ -9339,6 +9357,7 @@ std%U0%X0 %1,%0 ld%U1%X1 %0,%1 mr %0,%1 + # li %0,%1 lis %0,%v1 li %0,%1 @@ -9365,6 +9384,7 @@ mtvsrd %x0,%1" [(set_attr "type" "store, load, *, + vecperm, *, *, *, *, fpstore, fpload, fpsimple, fpstore, fpstore, fpload, fpload, veclogical, @@ -9375,6 +9395,7 @@ (set_attr "size" "64") (set_attr "length" "*, *, *, + *, *, *, *, 20, *, *, 
*, *, *, *, *, *, @@ -9384,6 +9405,7 @@ *, *") (set_attr "isa" "*, *, *, + p10, *, *, p10, *, *, *, *, p9v, p7v, p9v, p7v, *, diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt index c1cb9ab06cd..432f0f716bd 100644 --- a/gcc/config/rs6000/rs6000.opt +++ b/gcc/config/rs6000/rs6000.opt @@ -639,3 +639,7 @@ Enable instructions that guard against return-oriented programming attacks. mprivileged Target Var(rs6000_privileged) Init(0) Generate code that will run in privileged state. + +mxxspltidp +Target Undocumented Var(TARGET_XXSPLTIDP) Init(1) Save +Generate (do not generate) XXSPLTIDP instructions. diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md index bf033e31c1c..fa33c9d9fbf 100644 --- a/gcc/config/rs6000/vsx.md +++ b/gcc/config/rs6000/vsx.md @@ -1191,16 +1191,19 @@ ;; instruction). But generate XXLXOR/XXLORC if it will avoid a register move. ;; VSX store VSX load VSX move VSX->GPR GPR->VSX LQ (GPR) +;; XXSPLTIDP ;; STQ (GPR) GPR load GPR store GPR move XXSPLTIB VSPLTISW ;; VSX 0/-1 VMX const GPR const LVX (VMX) STVX (VMX) (define_insn "vsx_mov_64bit" [(set (match_operand:VSX_M 0 "nonimmediate_operand" "=ZwO, wa, wa, r, we, ?wQ, + wa, ?&r, ??r, ??Y, , wa, v, ?wa, v, , wZ, v") (match_operand:VSX_M 1 "input_operand" "wa, ZwO, wa, we, r, r, + eV, wQ, Y, r, r, wE, jwM, ?jwM, W, , v, wZ"))] @@ -1212,36 +1215,44 @@ } [(set_attr "type" "vecstore, vecload, vecsimple, mtvsr, mfvsr, load, + vecperm, store, load, store, *, vecsimple, vecsimple, vecsimple, *, *, vecstore, vecload") (set_attr "num_insns" "*, *, *, 2, *, 2, + *, 2, 2, 2, 2, *, *, *, 5, 2, *, *") (set_attr "max_prefixed_insns" "*, *, *, *, *, 2, + *, 2, 2, 2, 2, *, *, *, *, *, *, *") (set_attr "length" "*, *, *, 8, *, 8, + *, 8, 8, 8, 8, *, *, *, 20, 8, *, *") (set_attr "isa" ", , , *, *, *, + p10, *, *, *, *, p9v, *, , *, *, *, *")]) ;; VSX store VSX load VSX move GPR load GPR store GPR move +;; XXSPLTIDP ;; XXSPLTIB VSPLTISW VSX 0/-1 VMX const GPR const ;; LVX (VMX) STVX 
(VMX) (define_insn "*vsx_mov_32bit" [(set (match_operand:VSX_M 0 "nonimmediate_operand" "=ZwO, wa, wa, ??r, ??Y, , + wa, wa, v, ?wa, v, , wZ, v") (match_operand:VSX_M 1 "input_operand" "wa, ZwO, wa, Y, r, r, + eV, wE, jwM, ?jwM, W, , v, wZ"))] @@ -1253,14 +1264,17 @@ } [(set_attr "type" "vecstore, vecload, vecsimple, load, store, *, + vecperm, vecsimple, vecsimple, vecsimple, *, *, vecstore, vecload") (set_attr "length" "*, *, *, 16, 16, 16, + *, *, *, *, 20, 16, *, *") (set_attr "isa" ", , , *, *, *, + p10, p9v, *, , *, *, *, *")]) @@ -6449,15 +6463,53 @@ DONE; }) -(define_insn "xxspltidp_v2df_inst" - [(set (match_operand:V2DF 0 "register_operand" "=wa") - (unspec:V2DF [(match_operand:SI 1 "c32bit_cint_operand" "n")] - UNSPEC_XXSPLTIDP))] +(define_mode_iterator XXSPLTIDP_S [DI SF DF]) +(define_mode_iterator XXSPLTIDP_V [V2DF V2DI]) +(define_mode_iterator XXSPLTIDP [DI SF DF V2DF V2DI]) + +(define_insn "xxspltidp__inst" + [(set (match_operand:XXSPLTIDP 0 "register_operand" "=wa") + (unspec:XXSPLTIDP [(match_operand:SI 1 "c32bit_cint_operand" "n")] + UNSPEC_XXSPLTIDP))] "TARGET_POWER10" "xxspltidp %x0,%1" [(set_attr "type" "vecperm") (set_attr "prefixed" "yes")]) +;; Generate the XXSPLTIDP instruction to support SFmode, DFmode, and DImode +;; scalar constants and V2DF and V2DI vector constants where both elements are +;; the same. The constant has to be expressible as a SFmode constant that is +;; not a SFmode denormal value. 
+(define_insn_and_split "*xxspltidp__internal" + [(set (match_operand:XXSPLTIDP_S 0 "vsx_register_operand" "=wa") + (match_operand:XXSPLTIDP_S 1 "easy_fp_constant_64bit_scalar" "eF"))] + "TARGET_POWER10" + "#" + "&& 1" + [(set (match_dup 0) + (unspec:XXSPLTIDP_S [(match_dup 2)] UNSPEC_XXSPLTIDP))] +{ + long immediate = xxspltidp_constant_immediate (operands[1], mode); + operands[2] = GEN_INT (immediate); +} + [(set_attr "type" "vecperm") + (set_attr "prefixed" "yes")]) + +(define_insn_and_split "*xxspltidp__internal" + [(set (match_operand:XXSPLTIDP_V 0 "vsx_register_operand" "=wa") + (match_operand:XXSPLTIDP_V 1 "easy_vector_constant_64bit_element" "eV"))] + "TARGET_POWER10" + "#" + "&& 1" + [(set (match_dup 0) + (unspec:XXSPLTIDP_V [(match_dup 2)] UNSPEC_XXSPLTIDP))] +{ + long immediate = xxspltidp_constant_immediate (operands[1], mode); + operands[2] = GEN_INT (immediate); +} + [(set_attr "type" "vecperm") + (set_attr "prefixed" "yes")]) + ;; XXSPLTI32DX built-in function support (define_expand "xxsplti32dx_v4si" [(set (match_operand:V4SI 0 "register_operand" "=wa") diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index 2b41cb7fb7b..5035a3fd604 100644 --- a/gcc/doc/md.texi +++ b/gcc/doc/md.texi @@ -3333,9 +3333,15 @@ The integer constant zero. A constant whose negation is a signed 16-bit constant. @end ifset +@item eF +A 64-bit scalar constant that can be loaded with the XXSPLTIDP instruction. + @item eI A signed 34-bit integer constant if prefixed instructions are supported. +@item eV +A 128-bit vector constant that can be loaded with the XXSPLTIDP instruction. 
+ @ifset INTERNALS @item G A floating point constant that can be loaded into a register with one diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c new file mode 100644 index 00000000000..8f6e176f9af --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c @@ -0,0 +1,60 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target power10_ok } */ +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ + +#include + +/* Test generating DFmode constants with the ISA 3.1 (power10) XXSPLTIDP + instruction. */ + +double +scalar_double_0 (void) +{ + return 0.0; /* XXSPLTIB or XXLXOR. */ +} + +double +scalar_double_1 (void) +{ + return 1.0; /* XXSPLTIDP. */ +} + +#ifndef __FAST_MATH__ +double +scalar_double_m0 (void) +{ + return -0.0; /* XXSPLTIDP. */ +} + +double +scalar_double_nan (void) +{ + return __builtin_nan (""); /* XXSPLTIDP. */ +} + +double +scalar_double_inf (void) +{ + return __builtin_inf (); /* XXSPLTIDP. */ +} + +double +scalar_double_m_inf (void) /* XXSPLTIDP. */ +{ + return - __builtin_inf (); +} +#endif + +double +scalar_double_pi (void) +{ + return M_PI; /* PLFD. */ +} + +double +scalar_double_denorm (void) +{ + return 0x1p-149f; /* PLFD. */ +} + +/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-di.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-di.c new file mode 100644 index 00000000000..75714d0b11d --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-di.c @@ -0,0 +1,70 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target power10_ok } */ +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ + +/* Test generating DImode constants that have the same bit pattern as DFmode + constants that can be loaded with the XXSPLTIDP instruction with the ISA 3.1 + (power10). We use asm to force the value into vector registers. 
*/ + +double +scalar_0 (void) +{ + /* XXSPLTIB or XXLXOR. */ + double d; + long long ll = 0; + + __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll)); + return d; +} + +double +scalar_1 (void) +{ + /* VSPLTISW/VUPKLSW or XXSPLTIB/VEXTSB2D. */ + double d; + long long ll = 1; + + __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll)); + return d; +} + +/* 0x8000000000000000LL is the bit pattern for -0.0, which can be generated + with XXSPLTIDP. */ +double +scalar_float_neg_0 (void) +{ + /* XXSPLTIDP. */ + double d; + long long ll = 0x8000000000000000LL; + + __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll)); + return d; +} + +/* 0x3ff0000000000000LL is the bit pattern for 1.0 which can be generated with + XXSPLTIDP. */ +double +scalar_float_1_0 (void) +{ + /* XXSPLTIDP. */ + double d; + long long ll = 0x3ff0000000000000LL; + + __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll)); + return d; +} + +/* 0x400921fb54442d18LL is the bit pattern for PI, which cannot be generated + with XXSPLTIDP. */ +double +scalar_pi (void) +{ + /* PLXV. */ + double d; + long long ll = 0x400921fb54442d18LL; + + __asm__ ("xxmr %x0,%x1" : "=wa" (d) : "wa" (ll)); + return d; +} + +/* { dg-final { scan-assembler-times {\mxxspltidp\M} 2 } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c new file mode 100644 index 00000000000..72504bdfbbd --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c @@ -0,0 +1,60 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target power10_ok } */ +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ + +#include + +/* Test generating SFmode constants with the ISA 3.1 (power10) XXSPLTIDP + instruction. */ + +float +scalar_float_0 (void) +{ + return 0.0f; /* XXSPLTIB or XXLXOR. */ +} + +float +scalar_float_1 (void) +{ + return 1.0f; /* XXSPLTIDP. */ +} + +#ifndef __FAST_MATH__ +float +scalar_float_m0 (void) +{ + return -0.0f; /* XXSPLTIDP. 
*/ +} + +float +scalar_float_nan (void) +{ + return __builtin_nanf (""); /* XXSPLTIDP. */ +} + +float +scalar_float_inf (void) +{ + return __builtin_inff (); /* XXSPLTIDP. */ +} + +float +scalar_float_m_inf (void) /* XXSPLTIDP. */ +{ + return - __builtin_inff (); +} +#endif + +float +scalar_float_pi (void) +{ + return (float)M_PI; /* XXSPLTIDP. */ +} + +float +scalar_float_denorm (void) +{ + return 0x1p-149f; /* PLFS. */ +} + +/* { dg-final { scan-assembler-times {\mxxspltidp\M} 6 } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c new file mode 100644 index 00000000000..82ffc86f8aa --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c @@ -0,0 +1,64 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target power10_ok } */ +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ + +#include + +/* Test generating V2DFmode constants with the ISA 3.1 (power10) XXSPLTIDP + instruction. */ + +vector double +v2df_double_0 (void) +{ + return (vector double) { 0.0, 0.0 }; /* XXSPLTIB or XXLXOR. */ +} + +vector double +v2df_double_1 (void) +{ + return (vector double) { 1.0, 1.0 }; /* XXSPLTIDP. */ +} + +#ifndef __FAST_MATH__ +vector double +v2df_double_m0 (void) +{ + return (vector double) { -0.0, -0.0 }; /* XXSPLTIDP. */ +} + +vector double +v2df_double_nan (void) +{ + return (vector double) { __builtin_nan (""), + __builtin_nan ("") }; /* XXSPLTIDP. */ +} + +vector double +v2df_double_inf (void) +{ + return (vector double) { __builtin_inf (), + __builtin_inf () }; /* XXSPLTIDP. */ +} + +vector double +v2df_double_m_inf (void) +{ + return (vector double) { - __builtin_inf (), + - __builtin_inf () }; /* XXSPLTIDP. */ +} +#endif + +vector double +v2df_double_pi (void) +{ + return (vector double) { M_PI, M_PI }; /* PLVX. */ +} + +vector double +v2df_double_denorm (void) +{ + return (vector double) { (double)0x1p-149f, + (double)0x1p-149f }; /* PLVX. 
*/ +} + +/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2di.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2di.c new file mode 100644 index 00000000000..4d44f943d26 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2di.c @@ -0,0 +1,50 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target power10_ok } */ +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ + +/* Test generating V2DImode constants that have the same bit pattern as + V2DFmode constants that can be loaded with the XXSPLTIDP instruction with + the ISA 3.1 (power10). */ + +vector long long +vector_0 (void) +{ + /* XXSPLTIB or XXLXOR. */ + return (vector long long) { 0LL, 0LL }; +} + +vector long long +vector_1 (void) +{ + /* XXSPLTIB and VEXTSB2D. */ + return (vector long long) { 1LL, 1LL }; +} + +/* 0x8000000000000000LL is the bit pattern for -0.0, which can be generated + with XXSPLTISDP. */ +vector long long +vector_float_neg_0 (void) +{ + /* XXSPLTIDP. */ + return (vector long long) { 0x8000000000000000LL, 0x8000000000000000LL }; +} + +/* 0x3ff0000000000000LL is the bit pattern for 1.0 which can be generated with + XXSPLTISDP. */ +vector long long +vector_float_1_0 (void) +{ + /* XXSPLTIDP. */ + return (vector long long) { 0x3ff0000000000000LL, 0x3ff0000000000000LL }; +} + +/* 0x400921fb54442d18LL is the bit pattern for PI, which cannot be generated + with XXSPLTIDP. */ +vector long long +scalar_pi (void) +{ + /* PLXV. */ + return (vector long long) { 0x400921fb54442d18LL, 0x400921fb54442d18LL }; +} + +/* { dg-final { scan-assembler-times {\mxxspltidp\M} 2 } } */