From: Richard Sandiford <richard.sandiford@arm.com>
To: Robin Dapp <rdapp.gcc@gmail.com>
Cc: Richard Biener <richard.guenther@gmail.com>,
gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.
Date: Tue, 24 Oct 2023 22:50:42 +0100
Message-ID: <mptv8avlmfh.fsf@arm.com>
In-Reply-To: <e9b33876-cf75-417e-85b3-89e00e17435f@gmail.com> (Robin Dapp's message of "Mon, 23 Oct 2023 18:09:58 +0200")
Robin Dapp <rdapp.gcc@gmail.com> writes:
> The attached patch introduces a VCOND_MASK_LEN, fixes the riscv cases
> that were broken before, and leaves x86, aarch64 and power bootstrap
> and testsuite results unchanged.
>
> I only went with the minimal number of new match.pd patterns and did not
> try stripping the length of a COND_LEN_OP in order to simplify the
> associated COND_OP.
>
> An important part that I'm not sure how to handle properly is -
> when we have a constant immediate length of e.g. 16 and the hardware
> also operates on 16 units, vector length masking is actually
> redundant and the vcond_mask_len can be reduced to a vec_cond.
> For those (if_then_else unsplit) we have a large number of combine
> patterns that fuse instructions which do not correspond to ifns
> (like widening operations but also more complex ones).
>
> Currently I achieve this in what is most likely the wrong way:
>
> auto sz = GET_MODE_NUNITS (TYPE_MODE (res_op->type));
> bool full_len = len && known_eq (sz.coeffs[0], ilen);
> if (!len || full_len)
> "vec_cond"
> else
> "vcond_mask_len"
At first, this seemed like an odd place to fold away the length.
AFAIK the length in res_op is inherited directly from the original
operation, and so it isn't any more redundant after the fold than
it was before. But I suppose the reason for doing it here is that
we deliberately create IFN_COND_LEN_FOO calls that have "redundant"
lengths. Doing that avoids the need to define an IFN_COND_FOO
equivalent of every IFN_COND_LEN_FOO optab. Is that right? If so,
I think it deserves a comment.
But yeah, known_eq (sz.coeffs[0], ilen) doesn't look right.
If the target knows that the length is exactly 16 at runtime,
then it should set GET_MODE_NUNITS to 16. So I think the length
is only redundant if known_eq (sz, ilen).
The calculation should take the bias into account as well.
Any reason not to make IFN_VCOND_MASK_LEN a directly-mapped optab?
(I realise IFN_VCOND_MASK isn't, but that's used differently.)
Failing that, could the expansion use expand_fn_using_insn?
It generally looks OK to me otherwise FWIW, but it would be nice
to handle the fold programmatically in gimple-match*.cc rather
than having the explicit match.pd patterns. Richi should review
the match.pd stuff though. ;) (I didn't really look at it.)
Thanks,
Richard
> Another thing not done in this patch: For vcond_mask we only expect
> register operands as mask and force to a register. For a vcond_mask_len
> that results from a simplification with all-one or all-zero mask we
> could allow constant immediate vectors and expand them to simple
> len moves in the backend.
>
> Regards
> Robin
>
> From bc72e9b2f3ee46508404ee7723ca78790fa96b6b Mon Sep 17 00:00:00 2001
> From: Robin Dapp <rdapp@ventanamicro.com>
> Date: Fri, 13 Oct 2023 10:20:35 +0200
> Subject: [PATCH] internal-fn: Add VCOND_MASK_LEN.
>
> In order to prevent simplification of a COND_OP with degenerate mask
> (all true or all zero) into just an OP in the presence of length
> masking this patch introduces a length-masked analog to VEC_COND_EXPR:
> IFN_VCOND_MASK_LEN.  If the to-be-simplified conditional operation has a
> length that is not the full hardware vector length, a simplification now
> does not result in a VEC_COND but rather in a VCOND_MASK_LEN.
>
> For cases where the mask is known to be all true or all zero the patch
> introduces new match patterns that allow combination of unconditional
> unary, binary and ternary operations with the respective conditional
> operations if the target supports it.
>
> Similarly, if the length is known to be equal to the target hardware
> length VCOND_MASK_LEN will be simplified to VEC_COND_EXPR.
>
> gcc/ChangeLog:
>
> * config/riscv/autovec.md (vcond_mask_len_<mode><vm>): Add
> expander.
> * config/riscv/riscv-protos.h (enum insn_type): Add
> MERGE_OP_REAL_ELSE.
> * doc/md.texi: Add vcond_mask_len.
> * gimple-match-exports.cc (maybe_resimplify_conditional_op):
> Create VCOND_MASK_LEN when length masking.
> * gimple-match.h (gimple_match_op::gimple_match_op): Allow
> matching of 6 and 7 parameters.
> (gimple_match_op::set_op): Ditto.
> (gimple_match_op::gimple_match_op): Always initialize len and
> bias.
> * internal-fn.cc (vec_cond_mask_len_direct): Add.
> (expand_vec_cond_mask_len_optab_fn): Add.
> (direct_vec_cond_mask_len_optab_supported_p): Add.
> (internal_fn_len_index): Add VCOND_MASK_LEN.
> (internal_fn_mask_index): Ditto.
> * internal-fn.def (VCOND_MASK_LEN): New internal function.
> * match.pd: Combine unconditional unary, binary and ternary
> operations into the respective COND_LEN operations.
> * optabs.def (OPTAB_CD): Add vcond_mask_len optab.
> ---
> gcc/config/riscv/autovec.md | 20 +++++++++
> gcc/config/riscv/riscv-protos.h | 4 ++
> gcc/doc/md.texi | 9 ++++
> gcc/gimple-match-exports.cc | 20 +++++++--
> gcc/gimple-match.h | 78 ++++++++++++++++++++++++++++++++-
> gcc/internal-fn.cc | 41 +++++++++++++++++
> gcc/internal-fn.def | 2 +
> gcc/match.pd | 74 +++++++++++++++++++++++++++++++
> gcc/optabs.def | 1 +
> 9 files changed, 244 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
> index 80910ba3cc2..27a71bc1ef9 100644
> --- a/gcc/config/riscv/autovec.md
> +++ b/gcc/config/riscv/autovec.md
> @@ -565,6 +565,26 @@ (define_insn_and_split "vcond_mask_<mode><vm>"
> [(set_attr "type" "vector")]
> )
>
> +(define_expand "vcond_mask_len_<mode><vm>"
> + [(match_operand:V_VLS 0 "register_operand")
> + (match_operand:<VM> 3 "register_operand")
> + (match_operand:V_VLS 1 "nonmemory_operand")
> + (match_operand:V_VLS 2 "register_operand")
> + (match_operand 4 "autovec_length_operand")
> + (match_operand 5 "const_0_operand")]
> + "TARGET_VECTOR"
> + {
> + /* The order of vcond_mask is opposite to pred_merge. */
> + rtx ops[] = {operands[0], operands[0], operands[2], operands[1],
> + operands[3]};
> + riscv_vector::emit_nonvlmax_insn (code_for_pred_merge (<MODE>mode),
> + riscv_vector::MERGE_OP_REAL_ELSE, ops,
> + operands[4]);
> + DONE;
> + }
> + [(set_attr "type" "vector")]
> +)
> +
> ;; -------------------------------------------------------------------------
> ;; ---- [BOOL] Select based on masks
> ;; -------------------------------------------------------------------------
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> index 6cb9d459ee9..025a3568566 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -337,6 +337,10 @@ enum insn_type : unsigned int
> /* For vmerge, no mask operand, no mask policy operand. */
> MERGE_OP = __NORMAL_OP_TA2 | TERNARY_OP_P,
>
> + /* For vmerge with no vundef operand. */
> + MERGE_OP_REAL_ELSE = HAS_DEST_P | HAS_MERGE_P | TDEFAULT_POLICY_P
> + | TERNARY_OP_P,
> +
> /* For vm<compare>, no tail policy operand. */
> COMPARE_OP = __NORMAL_OP_MA | TERNARY_OP_P,
> COMPARE_OP_MU = __MASK_OP_MU | TERNARY_OP_P,
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index daa318ee3da..de0757f1903 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -5306,6 +5306,15 @@ no need to define this instruction pattern if the others are supported.
> Similar to @code{vcond@var{m}@var{n}} but operand 3 holds a pre-computed
> result of vector comparison.
>
> +@cindex @code{vcond_mask_len_@var{m}@var{n}} instruction pattern
> +@item @samp{vcond_mask_len_@var{m}@var{n}}
> +Similar to @code{vcond_mask_@var{m}@var{n}} but operand 4 holds a variable
> +or constant length and operand 5 holds a bias.  If the element index is
> +less than operand 4 + operand 5, the respective element of the result is
> +computed as in @code{vcond_mask_@var{m}@var{n}}.  For element indices
> +greater than or equal to operand 4 + operand 5 the computation is
> +performed as if the respective mask element were zero.
> +
> @cindex @code{maskload@var{m}@var{n}} instruction pattern
> @item @samp{maskload@var{m}@var{n}}
> Perform a masked load of vector from memory operand 1 of mode @var{m}
> diff --git a/gcc/gimple-match-exports.cc b/gcc/gimple-match-exports.cc
> index b36027b0bad..32134dbf711 100644
> --- a/gcc/gimple-match-exports.cc
> +++ b/gcc/gimple-match-exports.cc
> @@ -307,9 +307,23 @@ maybe_resimplify_conditional_op (gimple_seq *seq, gimple_match_op *res_op,
> && VECTOR_TYPE_P (res_op->type)
> && gimple_simplified_result_is_gimple_val (res_op))
> {
> - new_op.set_op (VEC_COND_EXPR, res_op->type,
> - res_op->cond.cond, res_op->ops[0],
> - res_op->cond.else_value);
> + tree len = res_op->cond.len;
> + HOST_WIDE_INT ilen = -1;
> + if (len && TREE_CODE (len) == INTEGER_CST && tree_fits_uhwi_p (len))
> + ilen = tree_to_uhwi (len);
> +
> + auto sz = GET_MODE_NUNITS (TYPE_MODE (res_op->type));
> + bool full_len = len && known_eq (sz.coeffs[0], ilen);
> +
> + if (!len || full_len)
> + new_op.set_op (VEC_COND_EXPR, res_op->type,
> + res_op->cond.cond, res_op->ops[0],
> + res_op->cond.else_value);
> + else
> + new_op.set_op (IFN_VCOND_MASK_LEN, res_op->type,
> + res_op->cond.cond, res_op->ops[0],
> + res_op->cond.else_value, res_op->cond.len,
> + res_op->cond.bias);
> *res_op = new_op;
> return gimple_resimplify3 (seq, res_op, valueize);
> }
> diff --git a/gcc/gimple-match.h b/gcc/gimple-match.h
> index bec3ff42e3e..63a9f029589 100644
> --- a/gcc/gimple-match.h
> +++ b/gcc/gimple-match.h
> @@ -32,7 +32,8 @@ public:
> enum uncond { UNCOND };
>
> /* Build an unconditional op. */
> - gimple_match_cond (uncond) : cond (NULL_TREE), else_value (NULL_TREE) {}
> + gimple_match_cond (uncond) : cond (NULL_TREE), else_value (NULL_TREE), len
> + (NULL_TREE), bias (NULL_TREE) {}
> gimple_match_cond (tree, tree);
> gimple_match_cond (tree, tree, tree, tree);
>
> @@ -56,7 +57,8 @@ public:
>
> inline
> gimple_match_cond::gimple_match_cond (tree cond_in, tree else_value_in)
> - : cond (cond_in), else_value (else_value_in)
> + : cond (cond_in), else_value (else_value_in), len (NULL_TREE),
> + bias (NULL_TREE)
> {
> }
>
> @@ -92,6 +94,10 @@ public:
> code_helper, tree, tree, tree, tree, tree);
> gimple_match_op (const gimple_match_cond &,
> code_helper, tree, tree, tree, tree, tree, tree);
> + gimple_match_op (const gimple_match_cond &,
> + code_helper, tree, tree, tree, tree, tree, tree, tree);
> + gimple_match_op (const gimple_match_cond &,
> + code_helper, tree, tree, tree, tree, tree, tree, tree, tree);
>
> void set_op (code_helper, tree, unsigned int);
> void set_op (code_helper, tree, tree);
> @@ -100,6 +106,8 @@ public:
> void set_op (code_helper, tree, tree, tree, tree, bool);
> void set_op (code_helper, tree, tree, tree, tree, tree);
> void set_op (code_helper, tree, tree, tree, tree, tree, tree);
> + void set_op (code_helper, tree, tree, tree, tree, tree, tree, tree);
> + void set_op (code_helper, tree, tree, tree, tree, tree, tree, tree, tree);
> void set_value (tree);
>
> tree op_or_null (unsigned int) const;
> @@ -212,6 +220,39 @@ gimple_match_op::gimple_match_op (const gimple_match_cond &cond_in,
> ops[4] = op4;
> }
>
> +inline
> +gimple_match_op::gimple_match_op (const gimple_match_cond &cond_in,
> + code_helper code_in, tree type_in,
> + tree op0, tree op1, tree op2, tree op3,
> + tree op4, tree op5)
> + : cond (cond_in), code (code_in), type (type_in), reverse (false),
> + num_ops (6)
> +{
> + ops[0] = op0;
> + ops[1] = op1;
> + ops[2] = op2;
> + ops[3] = op3;
> + ops[4] = op4;
> + ops[5] = op5;
> +}
> +
> +inline
> +gimple_match_op::gimple_match_op (const gimple_match_cond &cond_in,
> + code_helper code_in, tree type_in,
> + tree op0, tree op1, tree op2, tree op3,
> + tree op4, tree op5, tree op6)
> + : cond (cond_in), code (code_in), type (type_in), reverse (false),
> + num_ops (7)
> +{
> + ops[0] = op0;
> + ops[1] = op1;
> + ops[2] = op2;
> + ops[3] = op3;
> + ops[4] = op4;
> + ops[5] = op5;
> + ops[6] = op6;
> +}
> +
> /* Change the operation performed to CODE_IN, the type of the result to
> TYPE_IN, and the number of operands to NUM_OPS_IN. The caller needs
> to set the operands itself. */
> @@ -299,6 +340,39 @@ gimple_match_op::set_op (code_helper code_in, tree type_in,
> ops[4] = op4;
> }
>
> +inline void
> +gimple_match_op::set_op (code_helper code_in, tree type_in,
> + tree op0, tree op1, tree op2, tree op3, tree op4,
> + tree op5)
> +{
> + code = code_in;
> + type = type_in;
> + num_ops = 6;
> + ops[0] = op0;
> + ops[1] = op1;
> + ops[2] = op2;
> + ops[3] = op3;
> + ops[4] = op4;
> + ops[5] = op5;
> +}
> +
> +inline void
> +gimple_match_op::set_op (code_helper code_in, tree type_in,
> + tree op0, tree op1, tree op2, tree op3, tree op4,
> + tree op5, tree op6)
> +{
> + code = code_in;
> + type = type_in;
> + num_ops = 7;
> + ops[0] = op0;
> + ops[1] = op1;
> + ops[2] = op2;
> + ops[3] = op3;
> + ops[4] = op4;
> + ops[5] = op5;
> + ops[6] = op6;
> +}
> +
> /* Set the "operation" to be the single value VALUE, such as a constant
> or SSA_NAME. */
>
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index 61d5a9e4772..b47c33faf85 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -170,6 +170,7 @@ init_internal_fns ()
> #define store_lanes_direct { 0, 0, false }
> #define mask_store_lanes_direct { 0, 0, false }
> #define vec_cond_mask_direct { 1, 0, false }
> +#define vec_cond_mask_len_direct { 2, 0, false }
> #define vec_cond_direct { 2, 0, false }
> #define scatter_store_direct { 3, 1, false }
> #define len_store_direct { 3, 3, false }
> @@ -3129,6 +3130,41 @@ expand_vec_cond_mask_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
> emit_move_insn (target, ops[0].value);
> }
>
> +static void
> +expand_vec_cond_mask_len_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
> +{
> + class expand_operand ops[6];
> +
> + tree lhs = gimple_call_lhs (stmt);
> + tree op0 = gimple_call_arg (stmt, 0);
> + tree op1 = gimple_call_arg (stmt, 1);
> + tree op2 = gimple_call_arg (stmt, 2);
> + tree vec_cond_type = TREE_TYPE (lhs);
> +
> + machine_mode mode = TYPE_MODE (vec_cond_type);
> + machine_mode mask_mode = TYPE_MODE (TREE_TYPE (op0));
> + enum insn_code icode = convert_optab_handler (optab, mode, mask_mode);
> + rtx rtx_op1, rtx_op2;
> +
> + gcc_assert (icode != CODE_FOR_nothing);
> +
> + rtx_op1 = expand_normal (op1);
> + rtx_op2 = expand_normal (op2);
> +
> + rtx_op1 = force_reg (mode, rtx_op1);
> +
> + rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
> + create_output_operand (&ops[0], target, mode);
> + create_input_operand (&ops[1], rtx_op1, mode);
> + create_input_operand (&ops[2], rtx_op2, mode);
> +
> + int opno = add_mask_and_len_args (ops, 3, stmt);
> + expand_insn (icode, opno, ops);
> +
> + if (!rtx_equal_p (ops[0].value, target))
> + emit_move_insn (target, ops[0].value);
> +}
> +
> /* Expand VEC_SET internal functions. */
>
> static void
> @@ -4018,6 +4054,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
> #define direct_store_lanes_optab_supported_p multi_vector_optab_supported_p
> #define direct_mask_store_lanes_optab_supported_p multi_vector_optab_supported_p
> #define direct_vec_cond_mask_optab_supported_p convert_optab_supported_p
> +#define direct_vec_cond_mask_len_optab_supported_p convert_optab_supported_p
> #define direct_vec_cond_optab_supported_p convert_optab_supported_p
> #define direct_scatter_store_optab_supported_p convert_optab_supported_p
> #define direct_len_store_optab_supported_p direct_optab_supported_p
> @@ -4690,6 +4727,7 @@ internal_fn_len_index (internal_fn fn)
> case IFN_MASK_LEN_STORE:
> case IFN_MASK_LEN_LOAD_LANES:
> case IFN_MASK_LEN_STORE_LANES:
> + case IFN_VCOND_MASK_LEN:
> return 3;
>
> default:
> @@ -4721,6 +4759,9 @@ internal_fn_mask_index (internal_fn fn)
> case IFN_MASK_LEN_SCATTER_STORE:
> return 4;
>
> + case IFN_VCOND_MASK_LEN:
> + return 0;
> +
> default:
> return (conditional_internal_fn_code (fn) != ERROR_MARK
> || get_unconditional_internal_fn (fn) != IFN_LAST ? 0 : -1);
> diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> index a2023ab9c3d..581cc3b5140 100644
> --- a/gcc/internal-fn.def
> +++ b/gcc/internal-fn.def
> @@ -221,6 +221,8 @@ DEF_INTERNAL_OPTAB_FN (VCONDU, ECF_CONST | ECF_NOTHROW, vcondu, vec_cond)
> DEF_INTERNAL_OPTAB_FN (VCONDEQ, ECF_CONST | ECF_NOTHROW, vcondeq, vec_cond)
> DEF_INTERNAL_OPTAB_FN (VCOND_MASK, ECF_CONST | ECF_NOTHROW,
> vcond_mask, vec_cond_mask)
> +DEF_INTERNAL_OPTAB_FN (VCOND_MASK_LEN, ECF_CONST | ECF_NOTHROW,
> + vcond_mask_len, vec_cond_mask_len)
>
> DEF_INTERNAL_OPTAB_FN (VEC_SET, ECF_CONST | ECF_NOTHROW, vec_set, vec_set)
> DEF_INTERNAL_OPTAB_FN (VEC_EXTRACT, ECF_CONST | ECF_NOTHROW,
> diff --git a/gcc/match.pd b/gcc/match.pd
> index ce8d159d260..f187d560fbf 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -87,6 +87,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> negate bit_not)
> (define_operator_list COND_UNARY
> IFN_COND_NEG IFN_COND_NOT)
> +(define_operator_list COND_LEN_UNARY
> + IFN_COND_LEN_NEG IFN_COND_LEN_NOT)
>
> /* Binary operations and their associated IFN_COND_* function. */
> (define_operator_list UNCOND_BINARY
> @@ -103,12 +105,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> IFN_COND_FMIN IFN_COND_FMAX
> IFN_COND_AND IFN_COND_IOR IFN_COND_XOR
> IFN_COND_SHL IFN_COND_SHR)
> +(define_operator_list COND_LEN_BINARY
> + IFN_COND_LEN_ADD IFN_COND_LEN_SUB
> + IFN_COND_LEN_MUL IFN_COND_LEN_DIV IFN_COND_LEN_MOD IFN_COND_LEN_RDIV
> + IFN_COND_LEN_MIN IFN_COND_LEN_MAX
> + IFN_COND_LEN_FMIN IFN_COND_LEN_FMAX
> + IFN_COND_LEN_AND IFN_COND_LEN_IOR IFN_COND_LEN_XOR
> + IFN_COND_LEN_SHL IFN_COND_LEN_SHR)
>
> /* Same for ternary operations. */
> (define_operator_list UNCOND_TERNARY
> IFN_FMA IFN_FMS IFN_FNMA IFN_FNMS)
> (define_operator_list COND_TERNARY
> IFN_COND_FMA IFN_COND_FMS IFN_COND_FNMA IFN_COND_FNMS)
> +(define_operator_list COND_LEN_TERNARY
> + IFN_COND_LEN_FMA IFN_COND_LEN_FMS IFN_COND_LEN_FNMA IFN_COND_LEN_FNMS)
>
> /* __atomic_fetch_or_*, __atomic_fetch_xor_*, __atomic_xor_fetch_* */
> (define_operator_list ATOMIC_FETCH_OR_XOR_N
> @@ -8949,6 +8960,69 @@ and,
> && single_use (@5))
> (view_convert (cond_op (bit_not @0) @2 @3 @4
> (view_convert:op_type @1)))))))
> +
> +/* Similar for all cond_len operations. */
> +(for uncond_op (UNCOND_UNARY)
> + cond_op (COND_LEN_UNARY)
> + (simplify
> + (IFN_VCOND_MASK_LEN @0 (view_convert? (uncond_op@3 @1)) @2 @4 @5)
> + (with { tree op_type = TREE_TYPE (@3); }
> + (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
> + && is_truth_type_for (op_type, TREE_TYPE (@0)))
> + (cond_op @0 @1 @2 @4 @5))))
> + (simplify
> + (IFN_VCOND_MASK_LEN @0 @1 (view_convert? (uncond_op@3 @2)) @4 @5)
> + (with { tree op_type = TREE_TYPE (@3); }
> + (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
> + && is_truth_type_for (op_type, TREE_TYPE (@0)))
> + (cond_op (bit_not @0) @2 @1 @4 @5)))))
> +
> +(for uncond_op (UNCOND_BINARY)
> + cond_op (COND_LEN_BINARY)
> + (simplify
> + (IFN_VCOND_MASK_LEN @0 (view_convert? (uncond_op@4 @1 @2)) @3 @5 @6)
> + (with { tree op_type = TREE_TYPE (@4); }
> + (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
> + && is_truth_type_for (op_type, TREE_TYPE (@0))
> + && single_use (@4))
> + (view_convert (cond_op @0 @1 @2 (view_convert:op_type @3) @5 @6)))))
> + (simplify
> + (IFN_VCOND_MASK_LEN @0 @1 (view_convert? (uncond_op@4 @2 @3)) @5 @6)
> + (with { tree op_type = TREE_TYPE (@4); }
> + (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
> + && is_truth_type_for (op_type, TREE_TYPE (@0))
> + && single_use (@4))
> + (view_convert (cond_op (bit_not @0) @2 @3 (view_convert:op_type @1) @5 @6))))))
> +
> +(for uncond_op (UNCOND_TERNARY)
> + cond_op (COND_LEN_TERNARY)
> + (simplify
> + (IFN_VCOND_MASK_LEN @0 (view_convert? (uncond_op@5 @1 @2 @3)) @4 @6 @7)
> + (with { tree op_type = TREE_TYPE (@5); }
> + (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
> + && is_truth_type_for (op_type, TREE_TYPE (@0))
> + && single_use (@5))
> + (view_convert (cond_op @0 @1 @2 @3 (view_convert:op_type @4) @6 @7)))))
> + (simplify
> +  (IFN_VCOND_MASK_LEN @0 @1 (view_convert? (uncond_op@5 @2 @3 @4)) @6 @7)
> + (with { tree op_type = TREE_TYPE (@5); }
> + (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
> + && is_truth_type_for (op_type, TREE_TYPE (@0))
> + && single_use (@5))
> + (view_convert (cond_op (bit_not @0) @2 @3 @4 (view_convert:op_type @1) @6 @7))))))
> +
> +/* A VCOND_MASK_LEN with a size that equals the full hardware vector size
> + is just a vec_cond. */
> +(simplify
> + (IFN_VCOND_MASK_LEN @0 @1 @2 INTEGER_CST@3 INTEGER_CST@4)
> + (with {
> + HOST_WIDE_INT len = -1;
> + if (tree_fits_uhwi_p (@3))
> + len = tree_to_uhwi (@3);
> + auto sz = GET_MODE_NUNITS (TYPE_MODE (res_op->type));
> + bool full_len = (sz.coeffs[0] == len); }
> + (if (full_len)
> + (vec_cond @0 @1 @2))))
> #endif
>
> /* Detect cases in which a VEC_COND_EXPR effectively replaces the
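To illustrate what the new binary patterns in the match.pd hunk above do,
in hand-written (hypothetical) gimple:

```
  vect_a = vect_b + vect_c;
  vect_r = .VCOND_MASK_LEN (mask, vect_a, vect_else, len, bias);

  ;; ... is rewritten (when the target supports COND_LEN_ADD) to:

  vect_r = .COND_LEN_ADD (mask, vect_b, vect_c, vect_else, len, bias);
```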
> diff --git a/gcc/optabs.def b/gcc/optabs.def
> index 2ccbe4197b7..3cb16bd3002 100644
> --- a/gcc/optabs.def
> +++ b/gcc/optabs.def
> @@ -88,6 +88,7 @@ OPTAB_CD(vcond_optab, "vcond$a$b")
> OPTAB_CD(vcondu_optab, "vcondu$a$b")
> OPTAB_CD(vcondeq_optab, "vcondeq$a$b")
> OPTAB_CD(vcond_mask_optab, "vcond_mask_$a$b")
> +OPTAB_CD(vcond_mask_len_optab, "vcond_mask_len_$a$b")
> OPTAB_CD(vec_cmp_optab, "vec_cmp$a$b")
> OPTAB_CD(vec_cmpu_optab, "vec_cmpu$a$b")
> OPTAB_CD(vec_cmpeq_optab, "vec_cmpeq$a$b")