From: Richard Sandiford
To: Robin Dapp
Cc: Richard Biener, gcc-patches
Subject: Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.
Date: Tue, 24 Oct 2023 22:50:42 +0100
In-Reply-To: (Robin Dapp's message of "Mon, 23 Oct 2023 18:09:58 +0200")

Robin Dapp writes:
> The attached patch introduces a VCOND_MASK_LEN, helps with the riscv cases
> that were broken before, and looks unchanged on x86, aarch64 and power
> bootstrap and testsuites.
>
> I only went with the minimal number of new match.pd patterns and did not
> try stripping the length of a COND_LEN_OP in order to simplify the
> associated COND_OP.
>
> An important part that I'm not sure how to handle properly is this:
> when we have a constant immediate length of e.g. 16 and the hardware
> also operates on 16 units, vector length masking is actually
> redundant and the vcond_mask_len can be reduced to a vec_cond.
> For those (if_then_else unsplit) we have a large number of combine
> patterns that fuse instructions which do not correspond to ifns
> (like widening operations but also more complex ones).
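
To make the semantics concrete for anyone following along (this is only
an illustrative model, not code from the patch, and the function name is
made up), element-wise the proposed vcond_mask_len behaves like:

  #include <cstddef>
  #include <vector>

  /* Reference semantics of the proposed vcond_mask_len: for element
     indices below LEN + BIAS the mask selects between IF_TRUE and
     IF_FALSE as in vcond_mask; for indices at or above LEN + BIAS the
     element is taken from IF_FALSE, as if the mask element were zero.  */
  template <typename T>
  std::vector<T>
  vcond_mask_len_model (const std::vector<bool> &mask,
                        const std::vector<T> &if_true,
                        const std::vector<T> &if_false,
                        std::ptrdiff_t len, std::ptrdiff_t bias)
  {
    std::vector<T> result (if_true.size ());
    for (std::ptrdiff_t i = 0; i < (std::ptrdiff_t) result.size (); ++i)
      result[i] = (i < len + bias && mask[i]) ? if_true[i] : if_false[i];
    return result;
  }

So with a constant length that already covers every element (len + bias
at least the number of elements), the length test never fires and the
operation degenerates to a plain vec_cond, which is the case described
above.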
>
> Currently I achieve this in a most likely wrong way:
>
>   auto sz = GET_MODE_NUNITS (TYPE_MODE (res_op->type));
>   bool full_len = len && known_eq (sz.coeffs[0], ilen);
>   if (!len || full_len)
>     "vec_cond"
>   else
>     "vcond_mask_len"

At first, this seemed like an odd place to fold away the length.  AFAIK
the length in res_op is inherited directly from the original operation,
and so it isn't any more redundant after the fold than it was before.

But I suppose the reason for doing it here is that we deliberately
create IFN_COND_LEN_FOO calls that have "redundant" lengths.  Doing that
avoids the need to define an IFN_COND_FOO equivalent of every
IFN_COND_LEN_FOO optab.  Is that right?  If so, I think it deserves
a comment.

But yeah, known_eq (sz.coeffs[0], ilen) doesn't look right.  If the
target knows that the length is exactly 16 at runtime, then it should
set GET_MODE_NUNITS to 16.  So I think the length is only redundant
if known_eq (sz, ilen).  The calculation should take the bias into
account as well.

Any reason not to make IFN_VCOND_MASK_LEN a directly-mapped optab?
(I realise IFN_VCOND_MASK isn't, but that's used differently.)
Failing that, could the expansion use expand_fn_using_insn?

It generally looks OK to me otherwise FWIW, but it would be nice to
handle the fold programmatically in gimple-match*.cc rather than having
the explicit match.pd patterns.  Richi should review the match.pd stuff
though. ;)  (I didn't really look at it.)

Thanks,
Richard

> Another thing not done in this patch: For vcond_mask we only expect
> register operands as mask and force to a register.  For a vcond_mask_len
> that results from a simplification with an all-one or all-zero mask we
> could allow constant immediate vectors and expand them to simple
> len moves in the backend.
>
> Regards
>  Robin
>
> From bc72e9b2f3ee46508404ee7723ca78790fa96b6b Mon Sep 17 00:00:00 2001
> From: Robin Dapp
> Date: Fri, 13 Oct 2023 10:20:35 +0200
> Subject: [PATCH] internal-fn: Add VCOND_MASK_LEN.
>
> In order to prevent simplification of a COND_OP with a degenerate mask
> (all true or all zero) into just an OP in the presence of length
> masking, this patch introduces a length-masked analog to VEC_COND_EXPR:
> IFN_VCOND_MASK_LEN.  If the to-be-simplified conditional operation has a
> length that is not the full hardware vector length, a simplification now
> does not result in a VEC_COND but rather a VCOND_MASK_LEN.
>
> For cases where the mask is known to be all true or all zero, the patch
> introduces new match patterns that allow combination of unconditional
> unary, binary and ternary operations with the respective conditional
> operations if the target supports it.
>
> Similarly, if the length is known to be equal to the target hardware
> length, VCOND_MASK_LEN will be simplified to VEC_COND_EXPR.
>
> gcc/ChangeLog:
>
> 	* config/riscv/autovec.md (vcond_mask_len_<mode><vm>): Add
> 	expander.
> 	* config/riscv/riscv-protos.h (enum insn_type): Add
> 	MERGE_OP_REAL_ELSE.
> 	* doc/md.texi: Add vcond_mask_len.
> 	* gimple-match-exports.cc (maybe_resimplify_conditional_op):
> 	Create VCOND_MASK_LEN when length masking.
> 	* gimple-match.h (gimple_match_op::gimple_match_op): Allow
> 	matching of 6 and 7 parameters.
> 	(gimple_match_op::set_op): Ditto.
> 	(gimple_match_op::gimple_match_op): Always initialize len and
> 	bias.
> 	* internal-fn.cc (vec_cond_mask_len_direct): Add.
> 	(expand_vec_cond_mask_len_optab_fn): Add.
> 	(direct_vec_cond_mask_len_optab_supported_p): Add.
> 	(internal_fn_len_index): Add VCOND_MASK_LEN.
> 	(internal_fn_mask_index): Ditto.
> 	* internal-fn.def (VCOND_MASK_LEN): New internal function.
> 	* match.pd: Combine unconditional unary, binary and ternary
> 	operations into the respective COND_LEN operations.
> 	* optabs.def (OPTAB_CD): Add vcond_mask_len optab.
> ---
>  gcc/config/riscv/autovec.md     | 20 +++++++++
>  gcc/config/riscv/riscv-protos.h |  4 ++
>  gcc/doc/md.texi                 |  9 ++++
>  gcc/gimple-match-exports.cc     | 20 +++++++--
>  gcc/gimple-match.h              | 78 ++++++++++++++++++++++++++++++++-
>  gcc/internal-fn.cc              | 41 +++++++++++++++++
>  gcc/internal-fn.def             |  2 +
>  gcc/match.pd                    | 74 +++++++++++++++++++++++++++++++
>  gcc/optabs.def                  |  1 +
>  9 files changed, 244 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
> index 80910ba3cc2..27a71bc1ef9 100644
> --- a/gcc/config/riscv/autovec.md
> +++ b/gcc/config/riscv/autovec.md
> @@ -565,6 +565,26 @@ (define_insn_and_split "vcond_mask_<mode><vm>"
>    [(set_attr "type" "vector")]
>  )
>
> +(define_expand "vcond_mask_len_<mode><vm>"
> +  [(match_operand:V_VLS 0 "register_operand")
> +   (match_operand:<VM> 3 "register_operand")
> +   (match_operand:V_VLS 1 "nonmemory_operand")
> +   (match_operand:V_VLS 2 "register_operand")
> +   (match_operand 4 "autovec_length_operand")
> +   (match_operand 5 "const_0_operand")]
> +  "TARGET_VECTOR"
> +  {
> +    /* The order of vcond_mask is opposite to pred_merge.  */
> +    rtx ops[] = {operands[0], operands[0], operands[2], operands[1],
> +                 operands[3]};
> +    riscv_vector::emit_nonvlmax_insn (code_for_pred_merge (<MODE>mode),
> +                                      riscv_vector::MERGE_OP_REAL_ELSE, ops,
> +                                      operands[4]);
> +    DONE;
> +  }
> +  [(set_attr "type" "vector")]
> +)
> +
>  ;; -------------------------------------------------------------------------
>  ;; ---- [BOOL] Select based on masks
>  ;; -------------------------------------------------------------------------
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> index 6cb9d459ee9..025a3568566 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -337,6 +337,10 @@ enum insn_type : unsigned int
>    /* For vmerge, no mask operand, no mask policy operand.  */
>    MERGE_OP = __NORMAL_OP_TA2 | TERNARY_OP_P,
>
> +  /* For vmerge with no vundef operand.  */
> +  MERGE_OP_REAL_ELSE = HAS_DEST_P | HAS_MERGE_P | TDEFAULT_POLICY_P
> +                       | TERNARY_OP_P,
> +
>    /* For vm, no tail policy operand.  */
>    COMPARE_OP = __NORMAL_OP_MA | TERNARY_OP_P,
>    COMPARE_OP_MU = __MASK_OP_MU | TERNARY_OP_P,
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index daa318ee3da..de0757f1903 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -5306,6 +5306,15 @@ no need to define this instruction pattern if the others are supported.
>  Similar to @code{vcond@var{m}@var{n}} but operand 3 holds a pre-computed
>  result of vector comparison.
>
> +@cindex @code{vcond_mask_len_@var{m}@var{n}} instruction pattern
> +@item @samp{vcond_mask_len_@var{m}@var{n}}
> +Similar to @code{vcond_mask_@var{m}@var{n}} but operand 4 holds a variable
> +or constant length and operand 5 holds a bias.  If the
> +element index < operand 4 + operand 5 the respective element of the result is
> +computed as in @code{vcond_mask_@var{m}@var{n}}.  For element indices >=
> +operand 4 + operand 5 the computation is performed as if the respective mask
> +element were zero.
> +
>  @cindex @code{maskload@var{m}@var{n}} instruction pattern
>  @item @samp{maskload@var{m}@var{n}}
>  Perform a masked load of vector from memory operand 1 of mode @var{m}
> diff --git a/gcc/gimple-match-exports.cc b/gcc/gimple-match-exports.cc
> index b36027b0bad..32134dbf711 100644
> --- a/gcc/gimple-match-exports.cc
> +++ b/gcc/gimple-match-exports.cc
> @@ -307,9 +307,23 @@ maybe_resimplify_conditional_op (gimple_seq *seq, gimple_match_op *res_op,
>        && VECTOR_TYPE_P (res_op->type)
>        && gimple_simplified_result_is_gimple_val (res_op))
>      {
> -      new_op.set_op (VEC_COND_EXPR, res_op->type,
> -                     res_op->cond.cond, res_op->ops[0],
> -                     res_op->cond.else_value);
> +      tree len = res_op->cond.len;
> +      HOST_WIDE_INT ilen = -1;
> +      if (len && TREE_CODE (len) == INTEGER_CST && tree_fits_uhwi_p (len))
> +        ilen = tree_to_uhwi (len);
> +
> +      auto sz = GET_MODE_NUNITS (TYPE_MODE (res_op->type));
> +      bool full_len = len && known_eq (sz.coeffs[0], ilen);
> +
> +      if (!len || full_len)
> +        new_op.set_op (VEC_COND_EXPR, res_op->type,
> +                       res_op->cond.cond, res_op->ops[0],
> +                       res_op->cond.else_value);
> +      else
> +        new_op.set_op (IFN_VCOND_MASK_LEN, res_op->type,
> +                       res_op->cond.cond, res_op->ops[0],
> +                       res_op->cond.else_value, res_op->cond.len,
> +                       res_op->cond.bias);
>        *res_op = new_op;
>        return gimple_resimplify3 (seq, res_op, valueize);
>      }
> diff --git a/gcc/gimple-match.h b/gcc/gimple-match.h
> index bec3ff42e3e..63a9f029589 100644
> --- a/gcc/gimple-match.h
> +++ b/gcc/gimple-match.h
> @@ -32,7 +32,8 @@ public:
>    enum uncond { UNCOND };
>
>    /* Build an unconditional op.  */
> -  gimple_match_cond (uncond) : cond (NULL_TREE), else_value (NULL_TREE) {}
> +  gimple_match_cond (uncond) : cond (NULL_TREE), else_value (NULL_TREE), len
> +    (NULL_TREE), bias (NULL_TREE) {}
>    gimple_match_cond (tree, tree);
>    gimple_match_cond (tree, tree, tree, tree);
>
> @@ -56,7 +57,8 @@ public:
>
>  inline
>  gimple_match_cond::gimple_match_cond (tree cond_in, tree else_value_in)
> -  : cond (cond_in), else_value (else_value_in)
> +  : cond (cond_in), else_value (else_value_in), len (NULL_TREE),
> +    bias (NULL_TREE)
>  {
>  }
>
> @@ -92,6 +94,10 @@ public:
>                     code_helper, tree, tree, tree, tree, tree);
>    gimple_match_op (const gimple_match_cond &,
>                     code_helper, tree, tree, tree, tree, tree, tree);
> +  gimple_match_op (const gimple_match_cond &,
> +                   code_helper, tree, tree, tree, tree, tree, tree, tree);
> +  gimple_match_op (const gimple_match_cond &,
> +                   code_helper, tree, tree, tree, tree, tree, tree, tree, tree);
>
>    void set_op (code_helper, tree, unsigned int);
>    void set_op (code_helper, tree, tree);
> @@ -100,6 +106,8 @@ public:
>    void set_op (code_helper, tree, tree, tree, tree, bool);
>    void set_op (code_helper, tree, tree, tree, tree, tree);
>    void set_op (code_helper, tree, tree, tree, tree, tree, tree);
> +  void set_op (code_helper, tree, tree, tree, tree, tree, tree, tree);
> +  void set_op (code_helper, tree, tree, tree, tree, tree, tree, tree, tree);
>    void set_value (tree);
>
>    tree op_or_null (unsigned int) const;
> @@ -212,6 +220,39 @@ gimple_match_op::gimple_match_op (const gimple_match_cond &cond_in,
>    ops[4] = op4;
>  }
>
> +inline
> +gimple_match_op::gimple_match_op (const gimple_match_cond &cond_in,
> +                                  code_helper code_in, tree type_in,
> +                                  tree op0, tree op1, tree op2, tree op3,
> +                                  tree op4, tree op5)
> +  : cond (cond_in), code (code_in), type (type_in), reverse (false),
> +    num_ops (6)
> +{
> +  ops[0] = op0;
> +  ops[1] = op1;
> +  ops[2] = op2;
> +  ops[3] = op3;
> +  ops[4] = op4;
> +  ops[5] = op5;
> +}
> +
> +inline
> +gimple_match_op::gimple_match_op (const gimple_match_cond &cond_in,
> +                                  code_helper code_in, tree type_in,
> +                                  tree op0, tree op1, tree op2, tree op3,
> +                                  tree op4, tree op5, tree op6)
> +  : cond (cond_in), code (code_in), type (type_in), reverse (false),
> +    num_ops (7)
> +{
> +  ops[0] = op0;
> +  ops[1] = op1;
> +  ops[2] = op2;
> +  ops[3] = op3;
> +  ops[4] = op4;
> +  ops[5] = op5;
> +  ops[6] = op6;
> +}
> +
>  /* Change the operation performed to CODE_IN, the type of the result to
>     TYPE_IN, and the number of operands to NUM_OPS_IN.  The caller needs
>     to set the operands itself.  */
> @@ -299,6 +340,39 @@ gimple_match_op::set_op (code_helper code_in, tree type_in,
>    ops[4] = op4;
>  }
>
> +inline void
> +gimple_match_op::set_op (code_helper code_in, tree type_in,
> +                         tree op0, tree op1, tree op2, tree op3, tree op4,
> +                         tree op5)
> +{
> +  code = code_in;
> +  type = type_in;
> +  num_ops = 6;
> +  ops[0] = op0;
> +  ops[1] = op1;
> +  ops[2] = op2;
> +  ops[3] = op3;
> +  ops[4] = op4;
> +  ops[5] = op5;
> +}
> +
> +inline void
> +gimple_match_op::set_op (code_helper code_in, tree type_in,
> +                         tree op0, tree op1, tree op2, tree op3, tree op4,
> +                         tree op5, tree op6)
> +{
> +  code = code_in;
> +  type = type_in;
> +  num_ops = 7;
> +  ops[0] = op0;
> +  ops[1] = op1;
> +  ops[2] = op2;
> +  ops[3] = op3;
> +  ops[4] = op4;
> +  ops[5] = op5;
> +  ops[6] = op6;
> +}
> +
>  /* Set the "operation" to be the single value VALUE, such as a constant
>     or SSA_NAME.  */
>
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index 61d5a9e4772..b47c33faf85 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -170,6 +170,7 @@ init_internal_fns ()
>  #define store_lanes_direct { 0, 0, false }
>  #define mask_store_lanes_direct { 0, 0, false }
>  #define vec_cond_mask_direct { 1, 0, false }
> +#define vec_cond_mask_len_direct { 2, 0, false }
>  #define vec_cond_direct { 2, 0, false }
>  #define scatter_store_direct { 3, 1, false }
>  #define len_store_direct { 3, 3, false }
> @@ -3129,6 +3130,41 @@ expand_vec_cond_mask_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
>      emit_move_insn (target, ops[0].value);
>  }
>
> +static void
> +expand_vec_cond_mask_len_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
> +{
> +  class expand_operand ops[6];
> +
> +  tree lhs = gimple_call_lhs (stmt);
> +  tree op0 = gimple_call_arg (stmt, 0);
> +  tree op1 = gimple_call_arg (stmt, 1);
> +  tree op2 = gimple_call_arg (stmt, 2);
> +  tree vec_cond_type = TREE_TYPE (lhs);
> +
> +  machine_mode mode = TYPE_MODE (vec_cond_type);
> +  machine_mode mask_mode = TYPE_MODE (TREE_TYPE (op0));
> +  enum insn_code icode = convert_optab_handler (optab, mode, mask_mode);
> +  rtx rtx_op1, rtx_op2;
> +
> +  gcc_assert (icode != CODE_FOR_nothing);
> +
> +  rtx_op1 = expand_normal (op1);
> +  rtx_op2 = expand_normal (op2);
> +
> +  rtx_op1 = force_reg (mode, rtx_op1);
> +
> +  rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
> +  create_output_operand (&ops[0], target, mode);
> +  create_input_operand (&ops[1], rtx_op1, mode);
> +  create_input_operand (&ops[2], rtx_op2, mode);
> +
> +  int opno = add_mask_and_len_args (ops, 3, stmt);
> +  expand_insn (icode, opno, ops);
> +
> +  if (!rtx_equal_p (ops[0].value, target))
> +    emit_move_insn (target, ops[0].value);
> +}
> +
>  /* Expand VEC_SET internal functions.  */
>
>  static void
> @@ -4018,6 +4054,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
>  #define direct_store_lanes_optab_supported_p multi_vector_optab_supported_p
>  #define direct_mask_store_lanes_optab_supported_p multi_vector_optab_supported_p
>  #define direct_vec_cond_mask_optab_supported_p convert_optab_supported_p
> +#define direct_vec_cond_mask_len_optab_supported_p convert_optab_supported_p
>  #define direct_vec_cond_optab_supported_p convert_optab_supported_p
>  #define direct_scatter_store_optab_supported_p convert_optab_supported_p
>  #define direct_len_store_optab_supported_p direct_optab_supported_p
> @@ -4690,6 +4727,7 @@ internal_fn_len_index (internal_fn fn)
>      case IFN_MASK_LEN_STORE:
>      case IFN_MASK_LEN_LOAD_LANES:
>      case IFN_MASK_LEN_STORE_LANES:
> +    case IFN_VCOND_MASK_LEN:
>        return 3;
>
>      default:
> @@ -4721,6 +4759,9 @@ internal_fn_mask_index (internal_fn fn)
>      case IFN_MASK_LEN_SCATTER_STORE:
>        return 4;
>
> +    case IFN_VCOND_MASK_LEN:
> +      return 0;
> +
>      default:
>        return (conditional_internal_fn_code (fn) != ERROR_MARK
>                || get_unconditional_internal_fn (fn) != IFN_LAST ? 0 : -1);
> diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> index a2023ab9c3d..581cc3b5140 100644
> --- a/gcc/internal-fn.def
> +++ b/gcc/internal-fn.def
> @@ -221,6 +221,8 @@ DEF_INTERNAL_OPTAB_FN (VCONDU, ECF_CONST | ECF_NOTHROW, vcondu, vec_cond)
>  DEF_INTERNAL_OPTAB_FN (VCONDEQ, ECF_CONST | ECF_NOTHROW, vcondeq, vec_cond)
>  DEF_INTERNAL_OPTAB_FN (VCOND_MASK, ECF_CONST | ECF_NOTHROW,
>                         vcond_mask, vec_cond_mask)
> +DEF_INTERNAL_OPTAB_FN (VCOND_MASK_LEN, ECF_CONST | ECF_NOTHROW,
> +                       vcond_mask_len, vec_cond_mask_len)
>
>  DEF_INTERNAL_OPTAB_FN (VEC_SET, ECF_CONST | ECF_NOTHROW, vec_set, vec_set)
>  DEF_INTERNAL_OPTAB_FN (VEC_EXTRACT, ECF_CONST | ECF_NOTHROW,
> diff --git a/gcc/match.pd b/gcc/match.pd
> index ce8d159d260..f187d560fbf 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -87,6 +87,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>    negate bit_not)
>  (define_operator_list COND_UNARY
>    IFN_COND_NEG IFN_COND_NOT)
> +(define_operator_list COND_LEN_UNARY
> +  IFN_COND_LEN_NEG IFN_COND_LEN_NOT)
>
>  /* Binary operations and their associated IFN_COND_* function.  */
>  (define_operator_list UNCOND_BINARY
> @@ -103,12 +105,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>    IFN_COND_FMIN IFN_COND_FMAX
>    IFN_COND_AND IFN_COND_IOR IFN_COND_XOR
>    IFN_COND_SHL IFN_COND_SHR)
> +(define_operator_list COND_LEN_BINARY
> +  IFN_COND_LEN_ADD IFN_COND_LEN_SUB
> +  IFN_COND_LEN_MUL IFN_COND_LEN_DIV IFN_COND_LEN_MOD IFN_COND_LEN_RDIV
> +  IFN_COND_LEN_MIN IFN_COND_LEN_MAX
> +  IFN_COND_LEN_FMIN IFN_COND_LEN_FMAX
> +  IFN_COND_LEN_AND IFN_COND_LEN_IOR IFN_COND_LEN_XOR
> +  IFN_COND_LEN_SHL IFN_COND_LEN_SHR)
>
>  /* Same for ternary operations.  */
>  (define_operator_list UNCOND_TERNARY
>    IFN_FMA IFN_FMS IFN_FNMA IFN_FNMS)
>  (define_operator_list COND_TERNARY
>    IFN_COND_FMA IFN_COND_FMS IFN_COND_FNMA IFN_COND_FNMS)
> +(define_operator_list COND_LEN_TERNARY
> +  IFN_COND_LEN_FMA IFN_COND_LEN_FMS IFN_COND_LEN_FNMA IFN_COND_LEN_FNMS)
>
>  /* __atomic_fetch_or_*, __atomic_fetch_xor_*, __atomic_xor_fetch_* */
>  (define_operator_list ATOMIC_FETCH_OR_XOR_N
> @@ -8949,6 +8960,69 @@ and,
>         && single_use (@5))
>      (view_convert (cond_op (bit_not @0) @2 @3 @4
>                    (view_convert:op_type @1)))))))
> +
> +/* Similar for all cond_len operations.  */
> +(for uncond_op (UNCOND_UNARY)
> +     cond_op (COND_LEN_UNARY)
> + (simplify
> +  (IFN_VCOND_MASK_LEN @0 (view_convert? (uncond_op@3 @1)) @2 @4 @5)
> +   (with { tree op_type = TREE_TYPE (@3); }
> +    (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
> +         && is_truth_type_for (op_type, TREE_TYPE (@0)))
> +     (cond_op @0 @1 @2 @4 @5))))
> + (simplify
> +  (IFN_VCOND_MASK_LEN @0 @1 (view_convert? (uncond_op@3 @2)) @4 @5)
> +   (with { tree op_type = TREE_TYPE (@3); }
> +    (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
> +         && is_truth_type_for (op_type, TREE_TYPE (@0)))
> +     (cond_op (bit_not @0) @2 @1 @4 @5)))))
> +
> +(for uncond_op (UNCOND_BINARY)
> +     cond_op (COND_LEN_BINARY)
> + (simplify
> +  (IFN_VCOND_MASK_LEN @0 (view_convert? (uncond_op@4 @1 @2)) @3 @5 @6)
> +   (with { tree op_type = TREE_TYPE (@4); }
> +    (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
> +         && is_truth_type_for (op_type, TREE_TYPE (@0))
> +         && single_use (@4))
> +     (view_convert (cond_op @0 @1 @2 (view_convert:op_type @3) @5 @6)))))
> + (simplify
> +  (IFN_VCOND_MASK_LEN @0 @1 (view_convert? (uncond_op@4 @2 @3)) @5 @6)
> +   (with { tree op_type = TREE_TYPE (@4); }
> +    (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
> +         && is_truth_type_for (op_type, TREE_TYPE (@0))
> +         && single_use (@4))
> +     (view_convert (cond_op (bit_not @0) @2 @3 (view_convert:op_type @1) @5 @6))))))
> +
> +(for uncond_op (UNCOND_TERNARY)
> +     cond_op (COND_LEN_TERNARY)
> + (simplify
> +  (IFN_VCOND_MASK_LEN @0 (view_convert? (uncond_op@5 @1 @2 @3)) @4 @6 @7)
> +   (with { tree op_type = TREE_TYPE (@5); }
> +    (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
> +         && is_truth_type_for (op_type, TREE_TYPE (@0))
> +         && single_use (@5))
> +     (view_convert (cond_op @0 @1 @2 @3 (view_convert:op_type @4) @6 @7)))))
> + (simplify
> +  (IFN_VCOND_MASK_LEN @0 @1 (view_convert? (uncond_op@5 @2 @3 @4)) @6 @7)
> +   (with { tree op_type = TREE_TYPE (@5); }
> +    (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
> +         && is_truth_type_for (op_type, TREE_TYPE (@0))
> +         && single_use (@5))
> +     (view_convert (cond_op (bit_not @0) @2 @3 @4 (view_convert:op_type @1) @6 @7))))))
> +
> +/* A VCOND_MASK_LEN with a size that equals the full hardware vector size
> +   is just a vec_cond.  */
> +(simplify
> + (IFN_VCOND_MASK_LEN @0 @1 @2 INTEGER_CST@3 INTEGER_CST@4)
> + (with {
> +   HOST_WIDE_INT len = -1;
> +   if (tree_fits_uhwi_p (@3))
> +     len = tree_to_uhwi (@3);
> +   auto sz = GET_MODE_NUNITS (TYPE_MODE (res_op->type));
> +   bool full_len = (sz.coeffs[0] == len); }
> +  (if (full_len)
> +   (vec_cond @0 @1 @2))))
>  #endif
>
>  /* Detect cases in which a VEC_COND_EXPR effectively replaces the
> diff --git a/gcc/optabs.def b/gcc/optabs.def
> index 2ccbe4197b7..3cb16bd3002 100644
> --- a/gcc/optabs.def
> +++ b/gcc/optabs.def
> @@ -88,6 +88,7 @@ OPTAB_CD(vcond_optab, "vcond$a$b")
>  OPTAB_CD(vcondu_optab, "vcondu$a$b")
>  OPTAB_CD(vcondeq_optab, "vcondeq$a$b")
>  OPTAB_CD(vcond_mask_optab, "vcond_mask_$a$b")
> +OPTAB_CD(vcond_mask_len_optab, "vcond_mask_len_$a$b")
>  OPTAB_CD(vec_cmp_optab, "vec_cmp$a$b")
>  OPTAB_CD(vec_cmpu_optab, "vec_cmpu$a$b")
>  OPTAB_CD(vec_cmpeq_optab, "vec_cmpeq$a$b")
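
P.S. To put the comment above about known_eq and the bias into code:
assuming the answer to my question about why the "redundant" lengths
appear is "yes", the shape I had in mind for the
maybe_resimplify_conditional_op check is roughly the following.  It is
only an untested sketch, so take the details with a grain of salt:

      tree len = res_op->cond.len;
      tree bias = res_op->cond.bias;

      /* The vectorizer deliberately creates IFN_COND_LEN_* calls whose
         length covers the whole vector, rather than requiring an
         IFN_COND_* variant of every operation, so the length can be
         provably redundant here even though it was already present in
         the original operation.  */
      bool full_len = false;
      if (len
          && tree_fits_uhwi_p (len)
          && bias
          && tree_fits_shwi_p (bias))
        {
          poly_uint64 nunits = GET_MODE_NUNITS (TYPE_MODE (res_op->type));
          /* Compare the full poly_int rather than just coeffs[0], and
             treat the bias as part of the effective length.  */
          full_len = known_eq (nunits,
                               tree_to_uhwi (len) + tree_to_shwi (bias));
        }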