From mboxrd@z Thu Jan 1 00:00:00 1970
Received: by sourceware.org (Postfix, from userid 1005)
    id 8DCAC39450EC; Tue, 13 Apr 2021 17:14:22 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 8DCAC39450EC
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: Michael Meissner
To: gcc-cvs@gcc.gnu.org
Subject: [gcc(refs/users/meissner/heads/work047)] Implement XXSPLTIDP support.
X-Act-Checkin: gcc
X-Git-Author: Michael Meissner
X-Git-Refname: refs/users/meissner/heads/work047
X-Git-Oldrev: 6d106de793e3933f94d5eb4b82a2536a2d315be7
X-Git-Newrev: 74a1250e17ca6c47f5acf9a437db4cf001dca8d1
Message-Id: <20210413171422.8DCAC39450EC@sourceware.org>
Date: Tue, 13 Apr 2021 17:14:22 +0000 (GMT)
X-BeenThere: gcc-cvs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-cvs mailing list
X-List-Received-Date: Tue, 13 Apr 2021 17:14:22 -0000

https://gcc.gnu.org/g:74a1250e17ca6c47f5acf9a437db4cf001dca8d1

commit 74a1250e17ca6c47f5acf9a437db4cf001dca8d1
Author: Michael Meissner
Date:   Tue Apr 13 13:14:02 2021 -0400

    Implement XXSPLTIDP support.

    This patch implements XXSPLTIDP support for SF and DF scalar constants
    and V2DF vector constants.  A new constraint (eF) is added to match
    constants that can be loaded with the XXSPLTIDP instruction.

    I have moved the XXSPLTIDP built-in function support from altivec.md
    to vsx.md because the functions can load any VSX register, not just
    the ALTIVEC registers.

    I have added a temporary switch (-mxxspltidp) to control whether or
    not the XXSPLTIDP instruction is generated.

    This patch provides an xxspltidp_constant_p function which decodes both
    VEC_DUPLICATE and CONST_VECTOR operands (similar to the existing
    xxspltib_constant_p function).  On success, xxspltidp_constant_p
    returns, through a pointer argument, the integer immediate to use in
    the XXSPLTIDP instruction.
    Note, because the result of XXSPLTIDP is undefined for SFmode denormal
    values, xxspltidp_constant_p returns false for them.  It also returns
    false for 0.0 because that value is cheaper to create without
    XXSPLTIDP.

    gcc/
    2021-04-13  Michael Meissner

            * config/rs6000/altivec.md (UNSPEC_XXSPLTID): Move to vsx.md and
            rename to UNSPEC_XXSPLTIDP.
            (xxspltidp_v2df): Move to vsx.md and re-implement.
            (xxspltidp_v2df_inst): Move to vsx.md and re-implement.
            * config/rs6000/constraints.md (eF): New constraint.
            * config/rs6000/predicates.md (easy_fp_constant): If we can load
            the scalar constant with XXSPLTIDP, return true.
            (xxspltidp_operand): New predicate.
            (easy_vector_constant): If we can generate XXSPLTIDP, mark the
            vector constant as easy.
            * config/rs6000/rs6000-cpus.def (OTHER_POWER10_MASKS): Add
            -mxxspltidp support.
            (POWERPC_MASKS): Add -mxxspltidp support.
            * config/rs6000/rs6000-protos.h (xxspltidp_constant_p): New
            declaration.
            * config/rs6000/rs6000.c (rs6000_option_override_internal): Add
            -mxxspltidp support.
            (xxspltidp_constant_p): New function.
            (output_vec_const_move): Add support for XXSPLTIDP.
            (rs6000_opt_masks): Add -mxxspltidp support.
            (rs6000_emit_xxspltidp_v2df): Change function to implement the
            XXSPLTIDP instruction.
            * config/rs6000/rs6000.md (movsf_hardfloat): Add XXSPLTIDP
            support.
            (mov<mode>_hardfloat32, FMOVE64 iterator): Add XXSPLTIDP support.
            (mov<mode>_hardfloat64, FMOVE64 iterator): Add XXSPLTIDP support.
            * config/rs6000/rs6000.opt (-mxxspltidp): New switch.
            * config/rs6000/vsx.md (UNSPEC_XXSPLTIDP): Move here from
            altivec.md, renaming it from UNSPEC_XXSPLTID to match the
            instruction.
            (XXSPLTIDP): New mode iterator.
            (xxspltidp_<mode>_internal1): New define_insn_and_split.
            (xxspltidp_<mode>_internal2): New define_insn.
            (xxspltidp_v2df): Move to vsx.md from altivec.md.  Re-implement
            to use the new constant format.
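The acceptance test described in the commit message (the value must be exactly representable in single precision, must not be 0.0, and must not be an SFmode denormal) can be tried outside the compiler with a small standalone C sketch. This is only an illustration under assumptions, not the code in the patch: xxspltidp_immediate_for is a hypothetical name, it takes a host double instead of a CONST_DOUBLE, it assumes the host float is IEEE 754 binary32, and it conservatively rejects NaN and infinities, which says nothing about how the patch treats those values.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of GCC): return true and store the
   32-bit single-precision image in *IMM if D passes the same kinds of
   checks the commit message lists for XXSPLTIDP constants.  */
static bool
xxspltidp_immediate_for (double d, uint32_t *imm)
{
  /* Assumption: host float is IEEE 754 binary32.  */
  assert (sizeof (float) == sizeof (uint32_t));
  *imm = 0;

  /* Simplification of this sketch: reject NaN and anything outside the
     finite float range, so the conversion below is always defined.  */
  if (d != d || d > FLT_MAX || d < -FLT_MAX)
    return false;

  /* The value must survive a round trip through single precision.  */
  float f = (float) d;
  if ((double) f != d)
    return false;

  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);

  /* +0.0 (all bits clear) is cheaper to create another way.  */
  if (bits == 0)
    return false;

  /* Single-precision denormal: exponent field 0, mantissa non-zero.  */
  if ((bits & 0x7F800000u) == 0 && (bits & 0x007FFFFFu) != 0)
    return false;

  *imm = bits;
  return true;
}
```

With this helper, 1.0 is accepted with the immediate 0x3F800000, while 0.1 is rejected because it has no exact single-precision representation.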
Diff: --- gcc/config/rs6000/altivec.md | 21 --------- gcc/config/rs6000/constraints.md | 5 +++ gcc/config/rs6000/predicates.md | 21 +++++++++ gcc/config/rs6000/rs6000-cpus.def | 2 + gcc/config/rs6000/rs6000-protos.h | 1 + gcc/config/rs6000/rs6000.c | 90 ++++++++++++++++++++++++++++++++++++--- gcc/config/rs6000/rs6000.md | 52 ++++++++++++++-------- gcc/config/rs6000/rs6000.opt | 6 ++- gcc/config/rs6000/vsx.md | 42 ++++++++++++++++++ 9 files changed, 193 insertions(+), 47 deletions(-) diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md index 708296cb14d..ad6ead04cfa 100644 --- a/gcc/config/rs6000/altivec.md +++ b/gcc/config/rs6000/altivec.md @@ -176,7 +176,6 @@ UNSPEC_VSTRIL UNSPEC_SLDB UNSPEC_SRDB - UNSPEC_XXSPLTID UNSPEC_XXSPLTI32DX UNSPEC_XXBLEND UNSPEC_XXPERMX @@ -819,26 +818,6 @@ "vsdbi %0,%1,%2,%3" [(set_attr "type" "vecsimple")]) -(define_expand "xxspltidp_v2df" - [(set (match_operand:V2DF 0 "register_operand" ) - (unspec:V2DF [(match_operand:SF 1 "const_double_operand")] - UNSPEC_XXSPLTID))] - "TARGET_POWER10" -{ - long value = rs6000_const_f32_to_i32 (operands[1]); - rs6000_emit_xxspltidp_v2df (operands[0], value); - DONE; -}) - -(define_insn "xxspltidp_v2df_inst" - [(set (match_operand:V2DF 0 "register_operand" "=wa") - (unspec:V2DF [(match_operand:SI 1 "c32bit_cint_operand" "n")] - UNSPEC_XXSPLTID))] - "TARGET_POWER10" - "xxspltidp %x0,%1" - [(set_attr "type" "vecsimple") - (set_attr "prefixed" "yes")]) - (define_expand "xxsplti32dx_v4si" [(set (match_operand:V4SI 0 "register_operand" "=wa") (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "0") diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md index 561ce9797af..e1fadd63580 100644 --- a/gcc/config/rs6000/constraints.md +++ b/gcc/config/rs6000/constraints.md @@ -208,6 +208,11 @@ (and (match_code "const_int") (match_test "((- (unsigned HOST_WIDE_INT) ival) + 0x8000) < 0x10000"))) +;; SF/DF/V2DF scalar or vector constant that can be loaded with 
XXSPLTIDP +(define_constraint "eF" + "A vector constant that can be loaded with the XXSPLTIDP instruction." + (match_operand 0 "xxspltidp_operand")) + ;; 34-bit signed integer constant (define_constraint "eI" "A signed 34-bit integer constant if prefixed instructions are supported." diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md index bf678f429af..8c461ba2b76 100644 --- a/gcc/config/rs6000/predicates.md +++ b/gcc/config/rs6000/predicates.md @@ -601,6 +601,11 @@ if (TARGET_VSX && op == CONST0_RTX (mode)) return 1; + /* If we have the ISA 3.1 XXSPLTIDP instruction, see if the constant can + be loaded with that instruction. */ + if (xxspltidp_operand (op, mode)) + return 1; + /* Otherwise consider floating point constants hard, so that the constant gets pushed to memory during the early RTL phases. This has the advantage that double precision constants that can be @@ -666,6 +671,19 @@ return true; }) +;; Return 1 if operand is a SF/DF CONST_DOUBLE or V2DF CONST_VECTOR that can be +;; loaded via the ISA 3.1 XXSPLTIDP instruction. Do not return true if the +;; value is 0.0, since that is easy to generate without using XXSPLTIDP. +(define_predicate "xxspltidp_operand" + (match_code "const_double,const_vector,vec_duplicate") +{ + if (op == CONST0_RTX (mode)) + return false; + + HOST_WIDE_INT value = 0; + return xxspltidp_constant_p (op, mode, &value); +}) + ;; Return 1 if the operand is a CONST_VECTOR and can be loaded into a ;; vector register without using memory. 
(define_predicate "easy_vector_constant" @@ -682,6 +700,9 @@ if (xxspltiw_operand (op, mode)) return true; + if (xxspltidp_operand (op, mode)) + return true; + if (TARGET_P9_VECTOR && xxspltib_constant_p (op, mode, &num_insns, &value)) return true; diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def index a21a95bc7aa..3b657e490b1 100644 --- a/gcc/config/rs6000/rs6000-cpus.def +++ b/gcc/config/rs6000/rs6000-cpus.def @@ -86,6 +86,7 @@ | OPTION_MASK_P10_FUSION \ | OPTION_MASK_P10_FUSION_LD_CMPI \ | OPTION_MASK_P10_FUSION_2LOGICAL \ + | OPTION_MASK_XXSPLTIDP \ | OPTION_MASK_XXSPLTIW) /* Flags that need to be turned off if -mno-power9-vector. */ @@ -162,6 +163,7 @@ | OPTION_MASK_SOFT_FLOAT \ | OPTION_MASK_STRICT_ALIGN_OPTIONAL \ | OPTION_MASK_VSX \ + | OPTION_MASK_XXSPLTIDP \ | OPTION_MASK_XXSPLTIW) #endif diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h index 8ac30905013..5cd0b341fa6 100644 --- a/gcc/config/rs6000/rs6000-protos.h +++ b/gcc/config/rs6000/rs6000-protos.h @@ -32,6 +32,7 @@ extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, int, int, int, extern bool easy_altivec_constant (rtx, machine_mode); extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *); +extern bool xxspltidp_constant_p (rtx, machine_mode, HOST_WIDE_INT *); extern int vspltis_shifted (rtx); extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int); extern bool macho_lo_sum_memory_operand (rtx, machine_mode); diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 53789c6fec3..07958c82761 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -4479,11 +4479,16 @@ rs6000_option_override_internal (bool global_init_p) if (!TARGET_PCREL && TARGET_PCREL_OPT) rs6000_isa_flags &= ~OPTION_MASK_PCREL_OPT; - if (TARGET_POWER10 && TARGET_VSX - && (rs6000_isa_flags_explicit & OPTION_MASK_XXSPLTIW) == 0) - rs6000_isa_flags |= OPTION_MASK_XXSPLTIW; - else if (!TARGET_POWER10 
|| !TARGET_VSX) - rs6000_isa_flags &= ~OPTION_MASK_XXSPLTIW; + if (TARGET_POWER10 && TARGET_VSX) + { + if ((rs6000_isa_flags_explicit & OPTION_MASK_XXSPLTIW) == 0) + rs6000_isa_flags |= OPTION_MASK_XXSPLTIW; + + if ((rs6000_isa_flags_explicit & OPTION_MASK_XXSPLTIDP) == 0) + rs6000_isa_flags |= OPTION_MASK_XXSPLTIDP; + } + else + rs6000_isa_flags &= ~(OPTION_MASK_XXSPLTIW | OPTION_MASK_XXSPLTIDP); if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET) rs6000_print_isa_options (stderr, 0, "after subtarget", rs6000_isa_flags); @@ -6475,6 +6480,75 @@ xxspltib_constant_p (rtx op, return true; } +/* Return true if OP is of the given MODE and can be synthesized with ISA 3.1 + XXSPLTIDP instruction. + + Return the constant that is being split via CONSTANT_PTR to use in the + XXSPLTIDP instruction. */ + +bool +xxspltidp_constant_p (rtx op, + machine_mode mode, + HOST_WIDE_INT *constant_ptr) +{ + *constant_ptr = 0; + + if (!TARGET_XXSPLTIDP) + return false; + + if (mode == VOIDmode) + mode = GET_MODE (op); + + rtx element = op; + if (mode == V2DFmode) + { + /* Handle VEC_DUPLICATE and CONST_VECTOR. */ + if (GET_CODE (op) == VEC_DUPLICATE) + element = XEXP (op, 0); + + else if (GET_CODE (op) == CONST_VECTOR) + { + element = CONST_VECTOR_ELT (op, 0); + if (!rtx_equal_p (element, CONST_VECTOR_ELT (op, 1))) + return false; + } + + else + return false; + + mode = DFmode; + } + + if (mode != SFmode && mode != DFmode) + return false; + + if (GET_MODE (element) != mode) + return false; + + if (!CONST_DOUBLE_P (element)) + return false; + + /* Don't return true for 0.0 since that is easy to create without + XXSPLTIDP. */ + if (element == CONST0_RTX (mode)) + return false; + + /* If the value doesn't fit in a SFmode, exactly, we can't use XXSPLTIDP. 
*/ + const struct real_value *rv = CONST_DOUBLE_REAL_VALUE (element); + if (!exact_real_truncate (SFmode, rv)) + return 0; + + long value; + REAL_VALUE_TO_TARGET_SINGLE (*rv, value); + + /* Test for SFmode denormal (exponent is 0, mantissa field is non-zero). */ + if (((value & 0x7F800000) == 0) && ((value & 0x7FFFFF) != 0)) + return false; + + *constant_ptr = value; + return true; +} + const char * output_vec_const_move (rtx *operands) { @@ -6519,7 +6593,8 @@ output_vec_const_move (rtx *operands) gcc_unreachable (); } - if (xxspltiw_operand (vec, mode)) + if (xxspltiw_operand (vec, mode) + || xxspltidp_operand (vec, mode)) return "#"; if (TARGET_P9_VECTOR @@ -24020,6 +24095,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] = { "update", OPTION_MASK_NO_UPDATE, true , true }, { "vsx", OPTION_MASK_VSX, false, true }, { "xxspltiw", OPTION_MASK_XXSPLTIW, false, true }, + { "xxspltidp", OPTION_MASK_XXSPLTIDP, false, true }, #ifdef OPTION_MASK_64BIT #if TARGET_AIX_OS { "aix64", OPTION_MASK_64BIT, false, false }, @@ -27845,7 +27921,7 @@ rs6000_emit_xxspltidp_v2df (rtx dst, long value) inform (input_location, "the result for the xxspltidp instruction " "is undefined for subnormal input values"); - emit_insn( gen_xxspltidp_v2df_inst (dst, GEN_INT (value))); + emit_insn( gen_xxspltidp_v2df_internal2 (dst, GEN_INT (value))); } /* Implement TARGET_ASM_GENERATE_PIC_ADDR_DIFF_VEC. 
*/ diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index ca4a4d01f05..1996fd1ece3 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -7564,17 +7564,17 @@ ;; ;; LWZ LFS LXSSP LXSSPX STFS STXSSP ;; STXSSPX STW XXLXOR LI FMR XSCPSGNDP -;; MR MT MF NOP +;; MR MT MF NOP XXSPLTIDP (define_insn "movsf_hardfloat" [(set (match_operand:SF 0 "nonimmediate_operand" "=!r, f, v, wa, m, wY, Z, m, wa, !r, f, wa, - !r, *c*l, !r, *h") + !r, *c*l, !r, *h, wa") (match_operand:SF 1 "input_operand" "m, m, wY, Z, f, v, wa, r, j, j, f, wa, - r, r, *h, 0"))] + r, r, *h, 0, eF"))] "(register_operand (operands[0], SFmode) || register_operand (operands[1], SFmode)) && TARGET_HARD_FLOAT @@ -7596,15 +7596,20 @@ mr %0,%1 mt%0 %1 mf%1 %0 - nop" + nop + #" [(set_attr "type" "load, fpload, fpload, fpload, fpstore, fpstore, fpstore, store, veclogical, integer, fpsimple, fpsimple, - *, mtjmpr, mfjmpr, *") + *, mtjmpr, mfjmpr, *, vecperm") (set_attr "isa" "*, *, p9v, p8v, *, p9v, p8v, *, *, *, *, *, - *, *, *, *")]) + *, *, *, *, p10") + (set_attr "prefixed" + "*, *, *, *, *, *, + *, *, *, *, *, *, + *, *, *, *, yes")]) ;; LWZ LFIWZX STW STFIWX MTVSRWZ MFVSRWZ ;; FMR MR MT%0 MF%1 NOP @@ -7864,18 +7869,18 @@ ;; STFD LFD FMR LXSD STXSD ;; LXSD STXSD XXLOR XXLXOR GPR<-0 -;; LWZ STW MR +;; LWZ STW MR XXSPLTIDP (define_insn "*mov_hardfloat32" [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=m, d, d, , wY, , Z, , , !r, - Y, r, !r") + Y, r, !r, wa") (match_operand:FMOVE64 1 "input_operand" "d, m, d, wY, , Z, , , , , - r, Y, r"))] + r, Y, r, eF"))] "! 
TARGET_POWERPC64 && TARGET_HARD_FLOAT && (gpc_reg_operand (operands[0], mode) || gpc_reg_operand (operands[1], mode))" @@ -7892,20 +7897,25 @@ # # # + # #" [(set_attr "type" "fpstore, fpload, fpsimple, fpload, fpstore, fpload, fpstore, veclogical, veclogical, two, - store, load, two") + store, load, two, vecperm") (set_attr "size" "64") (set_attr "length" "*, *, *, *, *, *, *, *, *, 8, - 8, 8, 8") + 8, 8, 8, *") (set_attr "isa" "*, *, *, p9v, p9v, p7v, p7v, *, *, *, - *, *, *")]) + *, *, *, p10") + (set_attr "prefixed" + "*, *, *, *, *, + *, *, *, *, *, + *, *, *, yes")]) ;; STW LWZ MR G-const H-const F-const @@ -7932,19 +7942,19 @@ ;; STFD LFD FMR LXSD STXSD ;; LXSDX STXSDX XXLOR XXLXOR LI 0 ;; STD LD MR MT{CTR,LR} MF{CTR,LR} -;; NOP MFVSRD MTVSRD +;; NOP MFVSRD MTVSRD XXSPLTIDP (define_insn "*mov_hardfloat64" [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=m, d, d, , wY, , Z, , , !r, YZ, r, !r, *c*l, !r, - *h, r, ") + *h, r, , wa") (match_operand:FMOVE64 1 "input_operand" "d, m, d, wY, , Z, , , , , r, YZ, r, r, *h, - 0, , r"))] + 0, , r, eF"))] "TARGET_POWERPC64 && TARGET_HARD_FLOAT && (gpc_reg_operand (operands[0], mode) || gpc_reg_operand (operands[1], mode))" @@ -7966,18 +7976,24 @@ mf%1 %0 nop mfvsrd %0,%x1 - mtvsrd %x0,%1" + mtvsrd %x0,%1 + #" [(set_attr "type" "fpstore, fpload, fpsimple, fpload, fpstore, fpload, fpstore, veclogical, veclogical, integer, store, load, *, mtjmpr, mfjmpr, - *, mfvsr, mtvsr") + *, mfvsr, mtvsr, vecperm") (set_attr "size" "64") (set_attr "isa" "*, *, *, p9v, p9v, p7v, p7v, *, *, *, *, *, *, *, *, - *, p8v, p8v")]) + *, p8v, p8v, p10") + (set_attr "prefixed" + "*, *, *, *, *, + *, *, *, *, *, + *, *, *, *, *, + *, *, *, yes")]) ;; STD LD MR MT MF G-const ;; H-const F-const Special diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt index b01ebd78c7f..6620cdb7716 100644 --- a/gcc/config/rs6000/rs6000.opt +++ b/gcc/config/rs6000/rs6000.opt @@ -622,4 +622,8 @@ Target Undocumented 
Var(rs6000_relative_jumptables) Init(1) Save mxxspltiw Target Undocumented Mask(XXSPLTIW) Var(rs6000_isa_flags) -Generate (do not generate) XXSPLTIW instructions. +Generate (do not generate) the XXSPLTIW instruction. + +mxxspltidp +Target Undocumented Mask(XXSPLTIDP) Var(rs6000_isa_flags) +Generate (do not generate) the XXSPLTIDP instruction. diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md index 452260d8df7..951ce659872 100644 --- a/gcc/config/rs6000/vsx.md +++ b/gcc/config/rs6000/vsx.md @@ -369,6 +369,7 @@ UNSPEC_REPLACE_UN UNSPEC_VDIVES UNSPEC_VDIVEU + UNSPEC_XXSPLTIDP ]) (define_int_iterator XVCVBF16 [UNSPEC_VSX_XVCVSPBF16 @@ -6284,3 +6285,44 @@ operands[2] = CONST_VECTOR_ELT (operands[1], 0); }) +;; XXSPLTIDP support. +(define_mode_iterator XXSPLTIDP [SF DF V2DF]) + +(define_insn_and_split "*xxspltidp__internal1" + [(set (match_operand:XXSPLTIDP 0 "vsx_register_operand" "=wa") + (match_operand:XXSPLTIDP 1 "xxspltidp_operand"))] + "TARGET_XXSPLTIDP" + "#" + "&& 1" + [(set (match_operand:XXSPLTIDP 0 "vsx_register_operand") + (unspec:XXSPLTIDP [(match_dup 2)] UNSPEC_XXSPLTIDP))] +{ + HOST_WIDE_INT value = 0; + + if (!xxspltidp_constant_p (operands[1], mode, &value)) + gcc_unreachable (); + + operands[2] = GEN_INT (value); +} + [(set_attr "type" "vecperm") + (set_attr "prefixed" "yes")]) + +(define_insn "xxspltidp__internal2" + [(set (match_operand:XXSPLTIDP 0 "vsx_register_operand" "=wa") + (unspec:XXSPLTIDP [(match_operand 1 "const_int_operand" "n")] + UNSPEC_XXSPLTIDP))] + "TARGET_XXSPLTIDP" + "xxspltidp %x0,%1" + [(set_attr "type" "vecperm") + (set_attr "prefixed" "yes")]) + +;; XXSPLTIDP built-in function support. +(define_expand "xxspltidp_v2df" + [(use (match_operand:V2DF 0 "register_operand" )) + (use (match_operand:SF 1 "const_double_operand"))] + "TARGET_POWER10" +{ + long value = rs6000_const_f32_to_i32 (operands[1]); + rs6000_emit_xxspltidp_v2df (operands[0], value); + DONE; +})
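The -mxxspltiw/-mxxspltidp defaulting in the rs6000_option_override_internal hunk can be read as the following standalone C sketch. It is an illustration only: resolve_splat_flags and the two MASK_* bit values are hypothetical stand-ins for the real OPTION_MASK_XXSPLTIW and OPTION_MASK_XXSPLTIDP handling, and the two bool parameters stand in for TARGET_POWER10 and TARGET_VSX.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments; the real masks are generated from
   rs6000.opt.  */
#define MASK_XXSPLTIW  (UINT64_C (1) << 0)
#define MASK_XXSPLTIDP (UINT64_C (1) << 1)

/* FLAGS is the current option mask; EXPLICIT_FLAGS has a bit set for
   each option the user gave explicitly (either -mfoo or -mno-foo).  */
static uint64_t
resolve_splat_flags (uint64_t flags, uint64_t explicit_flags,
                     bool power10, bool vsx)
{
  /* The two options must occupy distinct bits.  */
  assert ((MASK_XXSPLTIW & MASK_XXSPLTIDP) == 0);

  if (power10 && vsx)
    {
      /* Enable each option by default unless the user chose a value.  */
      if ((explicit_flags & MASK_XXSPLTIW) == 0)
        flags |= MASK_XXSPLTIW;

      if ((explicit_flags & MASK_XXSPLTIDP) == 0)
        flags |= MASK_XXSPLTIDP;
    }
  else
    /* Without both power10 and VSX, neither option may stay on, even
       if it was requested explicitly.  */
    flags &= ~(MASK_XXSPLTIW | MASK_XXSPLTIDP);

  return flags;
}
```

In this model, an explicit -mno-xxspltidp on a power10 VSX target leaves XXSPLTIDP off while XXSPLTIW is still enabled by default.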