From: Richard Sandiford <richard.sandiford@arm.com>
To: Christophe Lyon via Gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH 08/13] arm: Implement auto-vectorized MVE comparisons with vectors of boolean predicates
Date: Mon, 11 Oct 2021 14:59:39 +0100

Christophe Lyon via Gcc-patches writes:
> We make use of qualifier_predicate to describe MVE builtins
> prototypes, restricting to auto-vectorizable vcmp* and vpsel builtins,
> as they are exercised by the tests added earlier in the series.
>
> Special handling is needed for mve_vpselq because it has a v2di
> variant, which has no natural VPR.P0 representation: we keep HImode
> for it.
>
> The vector_compare expansion code is updated to use the right VxBI
> mode instead of HI for the result.
>
> New mov patterns are introduced to handle the new modes.
>
> 2021-09-01  Christophe Lyon  <christophe.lyon@foss.st.com>
>
> gcc/
> 	PR target/100757
> 	PR target/101325
> 	* config/arm/arm-builtins.c (BINOP_PRED_UNONE_UNONE_QUALIFIERS)
> 	(BINOP_PRED_NONE_NONE_QUALIFIERS)
> 	(TERNOP_NONE_NONE_NONE_PRED_QUALIFIERS)
> 	(TERNOP_UNONE_UNONE_UNONE_PRED_QUALIFIERS): New.
> 	* config/arm/arm.c (arm_hard_regno_mode_ok): Handle new VxBI
> 	modes.
> 	(arm_mode_to_pred_mode): New.
> 	(arm_expand_vector_compare): Use the right VxBI mode instead of
> 	HI.
> 	(arm_expand_vcond): Likewise.
> 	* config/arm/arm_mve_builtins.def (vcmpneq_, vcmphiq_, vcmpcsq_)
> 	(vcmpltq_, vcmpleq_, vcmpgtq_, vcmpgeq_, vcmpeqq_, vcmpneq_f)
> 	(vcmpltq_f, vcmpleq_f, vcmpgtq_f, vcmpgeq_f, vcmpeqq_f, vpselq_u)
> 	(vpselq_s, vpselq_f): Use new predicated qualifiers.
> 	* config/arm/iterators.md (MVE_7): New mode iterator.
> 	(MVE_VPRED, MVE_vpred): New attribute iterators.
> 	* config/arm/mve.md (@mve_vcmpq_)
> 	(@mve_vcmpq_f, @mve_vpselq_)
> 	(@mve_vpselq_f): Use MVE_VPRED instead of HI.
> 	(@mve_vpselq_v2di): Define separately.
> 	(mov<mode>): New expander for VxBI modes.
> 	(mve_mov<mode>): New insn for VxBI modes.
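(For readers skimming the thread: the kind of source loop this is aimed at
looks roughly like the one below.  It is only an illustration, not one of the
tests added earlier in the series; with something like
-O3 -march=armv8.1-m.main+mve -mfloat-abi=hard the comparison should now
produce a VxBI predicate in VPR.P0 that feeds VPSEL, instead of going
through an HImode pseudo.)

  /* Illustrative only: an auto-vectorizable compare-and-select loop.  */
  void
  cond_select (int *__restrict dst, const int *__restrict a,
	       const int *__restrict b, const int *__restrict c, int n)
  {
    for (int i = 0; i < n; i++)
      dst[i] = (a[i] > 0) ? b[i] : c[i];
  }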
>
> diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
> index 771759f0cdd..6e3638869f1 100644
> --- a/gcc/config/arm/arm-builtins.c
> +++ b/gcc/config/arm/arm-builtins.c
> @@ -469,6 +469,12 @@ arm_binop_unone_unone_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>  #define BINOP_UNONE_UNONE_UNONE_QUALIFIERS \
>    (arm_binop_unone_unone_unone_qualifiers)
>  
> +static enum arm_type_qualifiers
> +arm_binop_pred_unone_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> +  = { qualifier_predicate, qualifier_unsigned, qualifier_unsigned };
> +#define BINOP_PRED_UNONE_UNONE_QUALIFIERS \
> +  (arm_binop_pred_unone_unone_qualifiers)
> +
>  static enum arm_type_qualifiers
>  arm_binop_unone_none_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>    = { qualifier_unsigned, qualifier_none, qualifier_immediate };
> @@ -487,6 +493,12 @@ arm_binop_unone_none_none_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>  #define BINOP_UNONE_NONE_NONE_QUALIFIERS \
>    (arm_binop_unone_none_none_qualifiers)
>  
> +static enum arm_type_qualifiers
> +arm_binop_pred_none_none_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> +  = { qualifier_predicate, qualifier_none, qualifier_none };
> +#define BINOP_PRED_NONE_NONE_QUALIFIERS \
> +  (arm_binop_pred_none_none_qualifiers)
> +
>  static enum arm_type_qualifiers
>  arm_binop_unone_unone_none_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>    = { qualifier_unsigned, qualifier_unsigned, qualifier_none };
> @@ -558,6 +570,12 @@ arm_ternop_none_none_none_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>  #define TERNOP_NONE_NONE_NONE_UNONE_QUALIFIERS \
>    (arm_ternop_none_none_none_unone_qualifiers)
>  
> +static enum arm_type_qualifiers
> +arm_ternop_none_none_none_pred_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> +  = { qualifier_none, qualifier_none, qualifier_none, qualifier_predicate };
> +#define TERNOP_NONE_NONE_NONE_PRED_QUALIFIERS \
> +  (arm_ternop_none_none_none_pred_qualifiers)
> +
>  static enum arm_type_qualifiers
>  arm_ternop_none_none_imm_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>    = { qualifier_none, qualifier_none, qualifier_immediate, qualifier_unsigned };
> @@ -577,6 +595,13 @@ arm_ternop_unone_unone_unone_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>  #define TERNOP_UNONE_UNONE_UNONE_UNONE_QUALIFIERS \
>    (arm_ternop_unone_unone_unone_unone_qualifiers)
>  
> +static enum arm_type_qualifiers
> +arm_ternop_unone_unone_unone_pred_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> +  = { qualifier_unsigned, qualifier_unsigned, qualifier_unsigned,
> +      qualifier_predicate };
> +#define TERNOP_UNONE_UNONE_UNONE_PRED_QUALIFIERS \
> +  (arm_ternop_unone_unone_unone_pred_qualifiers)
> +
>  static enum arm_type_qualifiers
>  arm_ternop_none_none_none_none_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>    = { qualifier_none, qualifier_none, qualifier_none, qualifier_none };
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 1222cb0d0fe..5f6637d9a5f 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -25304,7 +25304,7 @@ arm_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
>      return false;
>  
>    if (IS_VPR_REGNUM (regno))
> -    return mode == HImode;
> +    return mode == HImode || mode == V16BImode || mode == V8BImode || mode == V4BImode;

Nit: long line, should be:

    return (mode == HImode
	    || mode == V16BImode || mode == V8BImode || mode == V4BImode);

> @@ -30994,6 +30994,19 @@ arm_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
>      arm_post_atomic_barrier (model);
>  }
>  
> +/* Return the mode for the MVE vector of predicates corresponding to MODE.  */
> +machine_mode
> +arm_mode_to_pred_mode (machine_mode mode)
> +{
> +  switch (GET_MODE_NUNITS (mode))
> +    {
> +    case 16: return V16BImode;
> +    case 8: return V8BImode;
> +    case 4: return V4BImode;
> +    }
> +  gcc_unreachable ();
> +}
> +
>  /* Expand code to compare vectors OP0 and OP1 using condition CODE.
>     If CAN_INVERT, store either the result or its inverse in TARGET
>     and return true if TARGET contains the inverse.  If !CAN_INVERT,
> @@ -31077,7 +31090,7 @@ arm_expand_vector_compare (rtx target, rtx_code code, rtx op0, rtx op1,
>        if (vcond_mve)
>  	vpr_p0 = target;
>        else
> -	vpr_p0 = gen_reg_rtx (HImode);
> +	vpr_p0 = gen_reg_rtx (arm_mode_to_pred_mode (cmp_mode));
>  
>        switch (GET_MODE_CLASS (cmp_mode))
>  	{
> @@ -31119,7 +31132,7 @@ arm_expand_vector_compare (rtx target, rtx_code code, rtx op0, rtx op1,
>        if (vcond_mve)
>  	vpr_p0 = target;
>        else
> -	vpr_p0 = gen_reg_rtx (HImode);
> +	vpr_p0 = gen_reg_rtx (arm_mode_to_pred_mode (cmp_mode));
>  
>        emit_insn (gen_mve_vcmpq (code, cmp_mode, vpr_p0, op0, force_reg (cmp_mode, op1)));
>        if (!vcond_mve)
> @@ -31146,7 +31159,7 @@ arm_expand_vector_compare (rtx target, rtx_code code, rtx op0, rtx op1,
>        if (vcond_mve)
>  	vpr_p0 = target;
>        else
> -	vpr_p0 = gen_reg_rtx (HImode);
> +	vpr_p0 = gen_reg_rtx (arm_mode_to_pred_mode (cmp_mode));
>  
>        emit_insn (gen_mve_vcmpq (swap_condition (code), cmp_mode, vpr_p0, force_reg (cmp_mode, op1), op0));
>        if (!vcond_mve)
> @@ -31199,7 +31212,7 @@ arm_expand_vcond (rtx *operands, machine_mode cmp_result_mode)
>    if (TARGET_HAVE_MVE)
>      {
>        vcond_mve=true;
> -      mask = gen_reg_rtx (HImode);
> +      mask = gen_reg_rtx (arm_mode_to_pred_mode (cmp_result_mode));
>      }
>    else
>      mask = gen_reg_rtx (cmp_result_mode);
> diff --git a/gcc/config/arm/arm_mve_builtins.def b/gcc/config/arm/arm_mve_builtins.def
> index e9b5b28f506..58a05e61bd9 100644
> --- a/gcc/config/arm/arm_mve_builtins.def
> +++ b/gcc/config/arm/arm_mve_builtins.def
> @@ -89,7 +89,7 @@ VAR3 (BINOP_UNONE_UNONE_IMM, vshrq_n_u, v16qi, v8hi, v4si)
>  VAR3 (BINOP_NONE_NONE_IMM, vshrq_n_s, v16qi, v8hi, v4si)
>  VAR1 (BINOP_NONE_NONE_UNONE, vaddlvq_p_s, v4si)
>  VAR1 (BINOP_UNONE_UNONE_UNONE, vaddlvq_p_u, v4si)
> -VAR3 (BINOP_UNONE_NONE_NONE, vcmpneq_, v16qi, v8hi, v4si)
> +VAR3 (BINOP_PRED_NONE_NONE, vcmpneq_, v16qi, v8hi, v4si)
>  VAR3 (BINOP_NONE_NONE_NONE, vshlq_s, v16qi, v8hi, v4si)
>  VAR3 (BINOP_UNONE_UNONE_NONE, vshlq_u, v16qi, v8hi, v4si)
>  VAR3 (BINOP_UNONE_UNONE_UNONE, vsubq_u, v16qi, v8hi, v4si)
> @@ -117,9 +117,9 @@ VAR3 (BINOP_UNONE_UNONE_UNONE, vhsubq_n_u, v16qi, v8hi, v4si)
>  VAR3 (BINOP_UNONE_UNONE_UNONE, vhaddq_u, v16qi, v8hi, v4si)
>  VAR3 (BINOP_UNONE_UNONE_UNONE, vhaddq_n_u, v16qi, v8hi, v4si)
>  VAR3 (BINOP_UNONE_UNONE_UNONE, veorq_u, v16qi, v8hi, v4si)
> -VAR3 (BINOP_UNONE_UNONE_UNONE, vcmphiq_, v16qi, v8hi, v4si)
> +VAR3 (BINOP_PRED_UNONE_UNONE, vcmphiq_, v16qi, v8hi, v4si)
>  VAR3 (BINOP_UNONE_UNONE_UNONE, vcmphiq_n_, v16qi, v8hi, v4si)
> -VAR3 (BINOP_UNONE_UNONE_UNONE, vcmpcsq_, v16qi, v8hi, v4si)
> +VAR3 (BINOP_PRED_UNONE_UNONE, vcmpcsq_, v16qi, v8hi, v4si)
>  VAR3 (BINOP_UNONE_UNONE_UNONE, vcmpcsq_n_, v16qi, v8hi, v4si)
>  VAR3 (BINOP_UNONE_UNONE_UNONE, vbicq_u, v16qi, v8hi, v4si)
>  VAR3 (BINOP_UNONE_UNONE_UNONE, vandq_u, v16qi, v8hi, v4si)
> @@ -143,15 +143,15 @@ VAR3 (BINOP_UNONE_UNONE_IMM, vshlq_n_u, v16qi, v8hi, v4si)
>  VAR3 (BINOP_UNONE_UNONE_IMM, vrshrq_n_u, v16qi, v8hi, v4si)
>  VAR3 (BINOP_UNONE_UNONE_IMM, vqshlq_n_u, v16qi, v8hi, v4si)
>  VAR3 (BINOP_UNONE_NONE_NONE, vcmpneq_n_, v16qi, v8hi, v4si)
> -VAR3 (BINOP_UNONE_NONE_NONE, vcmpltq_, v16qi, v8hi, v4si)
> +VAR3 (BINOP_PRED_NONE_NONE, vcmpltq_, v16qi, v8hi, v4si)
>  VAR3 (BINOP_UNONE_NONE_NONE, vcmpltq_n_, v16qi, v8hi, v4si)
> -VAR3 (BINOP_UNONE_NONE_NONE, vcmpleq_, v16qi, v8hi, v4si)
> +VAR3 (BINOP_PRED_NONE_NONE, vcmpleq_, v16qi, v8hi, v4si)
>  VAR3 (BINOP_UNONE_NONE_NONE, vcmpleq_n_, v16qi, v8hi, v4si)
> -VAR3 (BINOP_UNONE_NONE_NONE, vcmpgtq_, v16qi, v8hi, v4si)
> +VAR3 (BINOP_PRED_NONE_NONE, vcmpgtq_, v16qi, v8hi, v4si)
>  VAR3 (BINOP_UNONE_NONE_NONE, vcmpgtq_n_, v16qi, v8hi, v4si)
> -VAR3 (BINOP_UNONE_NONE_NONE, vcmpgeq_, v16qi, v8hi, v4si)
> +VAR3 (BINOP_PRED_NONE_NONE, vcmpgeq_, v16qi, v8hi, v4si)
>  VAR3 (BINOP_UNONE_NONE_NONE, vcmpgeq_n_, v16qi, v8hi, v4si)
> -VAR3 (BINOP_UNONE_NONE_NONE, vcmpeqq_, v16qi, v8hi, v4si)
> +VAR3 (BINOP_PRED_NONE_NONE, vcmpeqq_, v16qi, v8hi, v4si)
>  VAR3 (BINOP_UNONE_NONE_NONE, vcmpeqq_n_, v16qi, v8hi, v4si)
>  VAR3 (BINOP_UNONE_NONE_IMM, vqshluq_n_s, v16qi, v8hi, v4si)
>  VAR3 (BINOP_NONE_NONE_UNONE, vaddvq_p_s, v16qi, v8hi, v4si)
> @@ -219,17 +219,17 @@ VAR2 (BINOP_UNONE_UNONE_IMM, vshllbq_n_u, v16qi, v8hi)
>  VAR2 (BINOP_UNONE_UNONE_IMM, vorrq_n_u, v8hi, v4si)
>  VAR2 (BINOP_UNONE_UNONE_IMM, vbicq_n_u, v8hi, v4si)
>  VAR2 (BINOP_UNONE_NONE_NONE, vcmpneq_n_f, v8hf, v4sf)
> -VAR2 (BINOP_UNONE_NONE_NONE, vcmpneq_f, v8hf, v4sf)
> +VAR2 (BINOP_PRED_NONE_NONE, vcmpneq_f, v8hf, v4sf)
>  VAR2 (BINOP_UNONE_NONE_NONE, vcmpltq_n_f, v8hf, v4sf)
> -VAR2 (BINOP_UNONE_NONE_NONE, vcmpltq_f, v8hf, v4sf)
> +VAR2 (BINOP_PRED_NONE_NONE, vcmpltq_f, v8hf, v4sf)
>  VAR2 (BINOP_UNONE_NONE_NONE, vcmpleq_n_f, v8hf, v4sf)
> -VAR2 (BINOP_UNONE_NONE_NONE, vcmpleq_f, v8hf, v4sf)
> +VAR2 (BINOP_PRED_NONE_NONE, vcmpleq_f, v8hf, v4sf)
>  VAR2 (BINOP_UNONE_NONE_NONE, vcmpgtq_n_f, v8hf, v4sf)
> -VAR2 (BINOP_UNONE_NONE_NONE, vcmpgtq_f, v8hf, v4sf)
> +VAR2 (BINOP_PRED_NONE_NONE, vcmpgtq_f, v8hf, v4sf)
>  VAR2 (BINOP_UNONE_NONE_NONE, vcmpgeq_n_f, v8hf, v4sf)
> -VAR2 (BINOP_UNONE_NONE_NONE, vcmpgeq_f, v8hf, v4sf)
> +VAR2 (BINOP_PRED_NONE_NONE, vcmpgeq_f, v8hf, v4sf)
>  VAR2 (BINOP_UNONE_NONE_NONE, vcmpeqq_n_f, v8hf, v4sf)
> -VAR2 (BINOP_UNONE_NONE_NONE, vcmpeqq_f, v8hf, v4sf)
> +VAR2 (BINOP_PRED_NONE_NONE, vcmpeqq_f, v8hf, v4sf)
>  VAR2 (BINOP_NONE_NONE_NONE, vsubq_f, v8hf, v4sf)
>  VAR2 (BINOP_NONE_NONE_NONE, vqmovntq_s, v8hi, v4si)
>  VAR2 (BINOP_NONE_NONE_NONE, vqmovnbq_s, v8hi, v4si)
> @@ -295,8 +295,8 @@ VAR2 (TERNOP_UNONE_UNONE_NONE_UNONE, vcvtaq_m_u, v8hi, v4si)
>  VAR2 (TERNOP_NONE_NONE_NONE_UNONE, vcvtaq_m_s, v8hi, v4si)
>  VAR3 (TERNOP_UNONE_UNONE_UNONE_IMM, vshlcq_vec_u, v16qi, v8hi, v4si)
>  VAR3 (TERNOP_NONE_NONE_UNONE_IMM, vshlcq_vec_s, v16qi, v8hi, v4si)
> -VAR4 (TERNOP_UNONE_UNONE_UNONE_UNONE, vpselq_u, v16qi, v8hi, v4si, v2di)
> -VAR4 (TERNOP_NONE_NONE_NONE_UNONE, vpselq_s, v16qi, v8hi, v4si, v2di)
> +VAR4 (TERNOP_UNONE_UNONE_UNONE_PRED, vpselq_u, v16qi, v8hi, v4si, v2di)
> +VAR4 (TERNOP_NONE_NONE_NONE_PRED, vpselq_s, v16qi, v8hi, v4si, v2di)
>  VAR3 (TERNOP_UNONE_UNONE_UNONE_UNONE, vrev64q_m_u, v16qi, v8hi, v4si)
>  VAR3 (TERNOP_UNONE_UNONE_UNONE_UNONE, vmvnq_m_u, v16qi, v8hi, v4si)
>  VAR3 (TERNOP_UNONE_UNONE_UNONE_UNONE, vmlasq_n_u, v16qi, v8hi, v4si)
> @@ -426,7 +426,7 @@ VAR2 (TERNOP_NONE_NONE_NONE_UNONE, vrev64q_m_f, v8hf, v4sf)
>  VAR2 (TERNOP_NONE_NONE_NONE_UNONE, vrev32q_m_s, v16qi, v8hi)
>  VAR2 (TERNOP_NONE_NONE_NONE_UNONE, vqmovntq_m_s, v8hi, v4si)
>  VAR2 (TERNOP_NONE_NONE_NONE_UNONE, vqmovnbq_m_s, v8hi, v4si)
> -VAR2 (TERNOP_NONE_NONE_NONE_UNONE, vpselq_f, v8hf, v4sf)
> +VAR2 (TERNOP_NONE_NONE_NONE_PRED, vpselq_f, v8hf, v4sf)
>  VAR2 (TERNOP_NONE_NONE_NONE_UNONE, vnegq_m_f, v8hf, v4sf)
>  VAR2 (TERNOP_NONE_NONE_NONE_UNONE, vmovntq_m_s, v8hi, v4si)
>  VAR2 (TERNOP_NONE_NONE_NONE_UNONE, vmovnbq_m_s, v8hi, v4si)
> diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
> index fafbd2f94b8..df5d15e08b8 100644
> --- a/gcc/config/arm/iterators.md
> +++ b/gcc/config/arm/iterators.md
> @@ -272,6 +272,7 @@ (define_mode_iterator MVE_3 [V16QI V8HI])
>  (define_mode_iterator MVE_2 [V16QI V8HI V4SI])
>  (define_mode_iterator MVE_5 [V8HI V4SI])
>  (define_mode_iterator MVE_6 [V8HI V4SI])
> +(define_mode_iterator MVE_7 [V16BI V8BI V4BI])
>  
>  ;;----------------------------------------------------------------------------
>  ;; Code iterators
> @@ -946,6 +947,10 @@ (define_mode_attr V_extr_elem [(V16QI "u8") (V8HI "u16") (V4SI "32")
>  			       (V8HF "u16") (V4SF "32")])
>  (define_mode_attr earlyclobber_32 [(V16QI "=w") (V8HI "=w") (V4SI "=&w")
>  				   (V8HF "=w") (V4SF "=&w")])
> +(define_mode_attr MVE_VPRED [(V16QI "V16BI") (V8HI "V8BI") (V4SI "V4BI")
> +			     (V8HF "V8BI") (V4SF "V4BI")])
> +(define_mode_attr MVE_vpred [(V16QI "v16bi") (V8HI "v8bi") (V4SI "v4bi")
> +			     (V8HF "v8bi") (V4SF "v4bi")])
>  
>  ;;----------------------------------------------------------------------------
>  ;; Code attributes
> diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
> index 14d17060290..c9c8e2c13fe 100644
> --- a/gcc/config/arm/mve.md
> +++ b/gcc/config/arm/mve.md
> @@ -839,8 +839,8 @@ (define_insn "mve_vaddlvq_p_v4si"
>  ;;
>  (define_insn "@mve_vcmpq_"
>    [
> -   (set (match_operand:HI 0 "vpr_register_operand" "=Up")
> -	(MVE_COMPARISONS:HI (match_operand:MVE_2 1 "s_register_operand" "w")
> +   (set (match_operand:<MVE_VPRED> 0 "vpr_register_operand" "=Up")
> +	(MVE_COMPARISONS:<MVE_VPRED> (match_operand:MVE_2 1 "s_register_operand" "w")
>  		    (match_operand:MVE_2 2 "s_register_operand" "w")))
>   ]
>   "TARGET_HAVE_MVE"
> @@ -1929,8 +1929,8 @@ (define_insn "mve_vcaddq"
>  ;;
>  (define_insn "@mve_vcmpq_f"
>    [
> -   (set (match_operand:HI 0 "vpr_register_operand" "=Up")
> -	(MVE_FP_COMPARISONS:HI (match_operand:MVE_0 1 "s_register_operand" "w")
> +   (set (match_operand:<MVE_VPRED> 0 "vpr_register_operand" "=Up")
> +	(MVE_FP_COMPARISONS:<MVE_VPRED> (match_operand:MVE_0 1 "s_register_operand" "w")
>  			  (match_operand:MVE_0 2 "s_register_operand" "w")))
>   ]
>   "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
> @@ -3321,9 +3321,21 @@ (define_insn "mve_vnegq_m_s"
>  ;;
>  (define_insn "@mve_vpselq_"
>    [
> -   (set (match_operand:MVE_1 0 "s_register_operand" "=w")
> -	(unspec:MVE_1 [(match_operand:MVE_1 1 "s_register_operand" "w")
> -		       (match_operand:MVE_1 2 "s_register_operand" "w")
> +   (set (match_operand:MVE_2 0 "s_register_operand" "=w")
> +	(unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "w")
> +		       (match_operand:MVE_2 2 "s_register_operand" "w")
> +		       (match_operand:<MVE_VPRED> 3 "vpr_register_operand" "Up")]
> +	 VPSELQ))
> +  ]
> +  "TARGET_HAVE_MVE"
> +  "vpsel %q0, %q1, %q2"
> +  [(set_attr "type" "mve_move")
> +])
> +(define_insn "@mve_vpselq_v2di"
> +  [
> +   (set (match_operand:V2DI 0 "s_register_operand" "=w")
> +	(unspec:V2DI [(match_operand:V2DI 1 "s_register_operand" "w")
> +		      (match_operand:V2DI 2 "s_register_operand" "w")
>  		      (match_operand:HI 3 "vpr_register_operand" "Up")]
>  	 VPSELQ))
>   ]

I think we can keep this together and just make MVE_VPRED/MVE_vpred map
V2DI to HI/hi.
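I.e. something like the following (untested sketch only), so that the V2DI
variant doesn't need its own pattern:

  (define_mode_attr MVE_VPRED [(V16QI "V16BI") (V8HI "V8BI") (V4SI "V4BI")
			       (V2DI "HI") (V8HF "V8BI") (V4SF "V4BI")])
  (define_mode_attr MVE_vpred [(V16QI "v16bi") (V8HI "v8bi") (V4SI "v4bi")
			       (V2DI "hi") (V8HF "v8bi") (V4SF "v4bi")])

The vpselq pattern could then keep iterating over the full set of modes
(including V2DI) and simply use MVE_VPRED for operand 3.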
> @@ -4419,7 +4431,7 @@ (define_insn "@mve_vpselq_f"
>     (set (match_operand:MVE_0 0 "s_register_operand" "=w")
>  	(unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "w")
>  		       (match_operand:MVE_0 2 "s_register_operand" "w")
> -		       (match_operand:HI 3 "vpr_register_operand" "Up")]
> +		       (match_operand:<MVE_VPRED> 3 "vpr_register_operand" "Up")]
>  	 VPSELQ_F))
>   ]
>   "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
> @@ -10516,3 +10528,25 @@ (define_insn "*movmisalign_mve_load"
>    "vldr.\t%q0, %E1"
>    [(set_attr "type" "mve_load")]
>  )
> +
> +(define_expand "mov<mode>"
> +  [(set (match_operand:MVE_7 0 "nonimmediate_operand")
> +	(match_operand:MVE_7 1 "nonimmediate_operand"))]
> +  "TARGET_HAVE_MVE"
> +  {
> +  }
> +)

Because of the (correct) register_operand condition on the define_insn,
this expander needs to force operand 1 into a register if neither
operand 0 nor operand 1 is a register:

  {
    if (!register_operand (operands[0], <MODE>mode))
      operands[1] = force_reg (<MODE>mode, operands[1]);
  }

Thanks,
Richard

> +
> +(define_insn "*mve_mov<mode>"
> +  [(set (match_operand:MVE_7 0 "nonimmediate_operand" "=rk, m, r, Up, r")
> +	(match_operand:MVE_7 1 "nonimmediate_operand" "rk, r, m, r, Up"))]
> +  "TARGET_HAVE_MVE
> +   && (register_operand (operands[0], <MODE>mode)
> +       || register_operand (operands[1], <MODE>mode))"
> +  "@
> +   mov%?\t%0, %1
> +   strh%?\t%1, %0
> +   ldrh%?\t%0, %1
> +   vmsr%?\t P0, %1
> +   vmrs%?\t %0, P0"
> +)
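(For reference, folding the suggestion above into the mov<mode> expander
would give something like the following; this is an untested sketch, not
part of the patch as posted:)

  (define_expand "mov<mode>"
    [(set (match_operand:MVE_7 0 "nonimmediate_operand")
	  (match_operand:MVE_7 1 "nonimmediate_operand"))]
    "TARGET_HAVE_MVE"
    {
      /* The *mve_mov<mode> insn requires at least one register operand,
	 so for a mem-to-mem move load the source into a register first.  */
      if (!register_operand (operands[0], <MODE>mode))
	operands[1] = force_reg (<MODE>mode, operands[1]);
    }
  )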