Message-ID: <818fe7b8-cb55-49d1-94fa-f929b8cbc5d8@gmail.com>
Date: Thu, 26 Oct 2023 16:02:44 +0200
Cc: rdapp.gcc@gmail.com, gcc-patches, rguenther, "juzhe.zhong@rivai.ai"
Subject: Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.
To: "richard.sandiford"
From: Robin Dapp

Ok, next try.  Now without dubious pattern and with direct optab
but still dedicated expander function.

This will cause one riscv regression in cond_widen_reduc-2.c that
we can deal with later.  It is just a missed optimization where
we do not combine something that we used to because of the
now-present length masking.

I'd also like to postpone handling vcond_mask_len simplifications
via stripping the length and falling back to vec_cond and its fold
patterns to a later time.  As is, this helps us avoid execution
failures in at least five test cases.

Bootstrap et al. running on x86, aarch64 and power10.

Regards
 Robin

From 7acdebb5b13b71331621af08da6649fe08476fe8 Mon Sep 17 00:00:00 2001
From: Robin Dapp
Date: Wed, 25 Oct 2023 22:19:43 +0200
Subject: [PATCH v3] internal-fn: Add VCOND_MASK_LEN.

In order to prevent simplification of a COND_OP with degenerate mask
(all true or all zero) into just an OP in the presence of length
masking this patch introduces a length-masked analog to VEC_COND_EXPR:
IFN_VCOND_MASK_LEN.

It also adds new match patterns that allow the combination of
unconditional unary, binary and ternary operations with the
VCOND_MASK_LEN into a conditional operation if the target supports it.

gcc/ChangeLog:

	PR tree-optimization/111760

	* config/riscv/autovec.md (vcond_mask_len_): Add expander.
	* config/riscv/riscv-protos.h (enum insn_type): Add.
	* doc/md.texi: Add vcond_mask_len.
	* gimple-match-exports.cc (maybe_resimplify_conditional_op):
	Create VCOND_MASK_LEN when length masking.
	* gimple-match.h (gimple_match_op::gimple_match_op): Allow
	matching of 6 and 7 parameters.
	(gimple_match_op::set_op): Ditto.
	(gimple_match_op::gimple_match_op): Always initialize len and
	bias.
	* internal-fn.cc (vec_cond_mask_len_direct): Add.
	(expand_vec_cond_mask_len_optab_fn): Add.
	(direct_vec_cond_mask_len_optab_supported_p): Add.
	(internal_fn_len_index): Add VCOND_MASK_LEN.
	(internal_fn_mask_index): Ditto.
	* internal-fn.def (VCOND_MASK_LEN): New internal function.
	* match.pd: Combine unconditional unary, binary and ternary
	operations into the respective COND_LEN operations.
	* optabs.def (OPTAB_D): Add vcond_mask_len optab.
---
 gcc/config/riscv/autovec.md     | 37 ++++++++++++++++
 gcc/config/riscv/riscv-protos.h |  5 +++
 gcc/doc/md.texi                 |  9 ++++
 gcc/gimple-match-exports.cc     | 13 ++++--
 gcc/gimple-match.h              | 78 ++++++++++++++++++++++++++++++++-
 gcc/internal-fn.cc              | 42 ++++++++++++++++++
 gcc/internal-fn.def             |  2 +
 gcc/match.pd                    | 61 ++++++++++++++++++++++++++
 gcc/optabs.def                  |  1 +
 9 files changed, 243 insertions(+), 5 deletions(-)

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 80910ba3cc2..dadb71c1165 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -565,6 +565,43 @@ (define_insn_and_split "vcond_mask_"
   [(set_attr "type" "vector")]
 )
 
+(define_expand "vcond_mask_len_"
+  [(match_operand:V_VLS 0 "register_operand")
+   (match_operand: 3 "nonmemory_operand")
+   (match_operand:V_VLS 1 "nonmemory_operand")
+   (match_operand:V_VLS 2 "autovec_else_operand")
+   (match_operand 4 "autovec_length_operand")
+   (match_operand 5 "const_0_operand")]
+  "TARGET_VECTOR"
+  {
+    if (satisfies_constraint_Wc1 (operands[3]))
+      {
+        rtx ops[] = {operands[0], operands[2], operands[1]};
+        riscv_vector::emit_nonvlmax_insn (code_for_pred_mov (mode),
+                                          riscv_vector::UNARY_OP_TUMA,
+                                          ops, operands[4]);
+      }
+    else if (satisfies_constraint_Wc0 (operands[3]))
+      {
+        rtx ops[] = {operands[0], operands[2], operands[2]};
+        riscv_vector::emit_nonvlmax_insn (code_for_pred_mov (mode),
+                                          riscv_vector::UNARY_OP_TUMA,
+                                          ops, operands[4]);
+      }
+    else
+      {
+        /* The order of vcond_mask is opposite to pred_merge.  */
+        rtx ops[] = {operands[0], operands[2], operands[2], operands[1],
+                     operands[3]};
+        riscv_vector::emit_nonvlmax_insn (code_for_pred_merge (mode),
+                                          riscv_vector::MERGE_OP_TUMA,
+                                          ops, operands[4]);
+      }
+    DONE;
+  }
+  [(set_attr "type" "vector")]
+)
+
 ;; -------------------------------------------------------------------------
 ;; ---- [BOOL] Select based on masks
 ;; -------------------------------------------------------------------------
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 668d75043ca..0a54e4ff022 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -302,6 +302,7 @@ enum insn_type : unsigned int
   UNARY_OP = __NORMAL_OP | UNARY_OP_P,
   UNARY_OP_TAMA = __MASK_OP_TAMA | UNARY_OP_P,
   UNARY_OP_TAMU = __MASK_OP_TAMU | UNARY_OP_P,
+  UNARY_OP_TUMA = __MASK_OP_TUMA | UNARY_OP_P,
   UNARY_OP_FRM_DYN = UNARY_OP | FRM_DYN_P,
   UNARY_OP_FRM_RMM = UNARY_OP | FRM_RMM_P,
   UNARY_OP_FRM_RUP = UNARY_OP | FRM_RUP_P,
@@ -337,6 +338,10 @@ enum insn_type : unsigned int
   /* For vmerge, no mask operand, no mask policy operand.  */
   MERGE_OP = __NORMAL_OP_TA2 | TERNARY_OP_P,
 
+  /* For vmerge with no vundef operand.  */
+  MERGE_OP_TUMA = HAS_DEST_P | HAS_MERGE_P | TERNARY_OP_P
+                  | TU_POLICY_P,
+
   /* For vm, no tail policy operand.  */
   COMPARE_OP = __NORMAL_OP_MA | TERNARY_OP_P,
   COMPARE_OP_MU = __MASK_OP_MU | TERNARY_OP_P,
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index daa318ee3da..de0757f1903 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5306,6 +5306,15 @@ no need to define this instruction pattern if the others are supported.
 Similar to @code{vcond@var{m}@var{n}} but operand 3 holds a pre-computed
 result of vector comparison.
 
+@cindex @code{vcond_mask_len_@var{m}@var{n}} instruction pattern
+@item @samp{vcond_mask_len_@var{m}@var{n}}
+Similar to @code{vcond_mask_@var{m}@var{n}} but operand 4 holds a variable
+or constant length and operand 5 holds a bias.  If the
+element index < operand 4 + operand 5 the respective element of the result is
+computed as in @code{vcond_mask_@var{m}@var{n}}.  For element indices >=
+operand 4 + operand 5 the computation is performed as if the respective mask
+element were zero.
+
 @cindex @code{maskload@var{m}@var{n}} instruction pattern
 @item @samp{maskload@var{m}@var{n}}
 Perform a masked load of vector from memory operand 1 of mode @var{m}
diff --git a/gcc/gimple-match-exports.cc b/gcc/gimple-match-exports.cc
index b36027b0bad..d6dac08cc2b 100644
--- a/gcc/gimple-match-exports.cc
+++ b/gcc/gimple-match-exports.cc
@@ -307,9 +307,16 @@ maybe_resimplify_conditional_op (gimple_seq *seq, gimple_match_op *res_op,
       && VECTOR_TYPE_P (res_op->type)
       && gimple_simplified_result_is_gimple_val (res_op))
     {
-      new_op.set_op (VEC_COND_EXPR, res_op->type,
-                     res_op->cond.cond, res_op->ops[0],
-                     res_op->cond.else_value);
+      tree len = res_op->cond.len;
+      if (!len)
+        new_op.set_op (VEC_COND_EXPR, res_op->type,
+                       res_op->cond.cond, res_op->ops[0],
+                       res_op->cond.else_value);
+      else
+        new_op.set_op (IFN_VCOND_MASK_LEN, res_op->type,
+                       res_op->cond.cond, res_op->ops[0],
+                       res_op->cond.else_value,
+                       res_op->cond.len, res_op->cond.bias);
       *res_op = new_op;
       return gimple_resimplify3 (seq, res_op, valueize);
     }
diff --git a/gcc/gimple-match.h b/gcc/gimple-match.h
index bec3ff42e3e..63a9f029589 100644
--- a/gcc/gimple-match.h
+++ b/gcc/gimple-match.h
@@ -32,7 +32,8 @@ public:
   enum uncond { UNCOND };
 
   /* Build an unconditional op.  */
-  gimple_match_cond (uncond) : cond (NULL_TREE), else_value (NULL_TREE) {}
+  gimple_match_cond (uncond) : cond (NULL_TREE), else_value (NULL_TREE), len
+                               (NULL_TREE), bias (NULL_TREE) {}
   gimple_match_cond (tree, tree);
   gimple_match_cond (tree, tree, tree, tree);
 
@@ -56,7 +57,8 @@ public:
 
 inline
 gimple_match_cond::gimple_match_cond (tree cond_in, tree else_value_in)
-  : cond (cond_in), else_value (else_value_in)
+  : cond (cond_in), else_value (else_value_in), len (NULL_TREE),
+    bias (NULL_TREE)
 {
 }
 
@@ -92,6 +94,10 @@ public:
                    code_helper, tree, tree, tree, tree, tree);
   gimple_match_op (const gimple_match_cond &,
                    code_helper, tree, tree, tree, tree, tree, tree);
+  gimple_match_op (const gimple_match_cond &,
+                   code_helper, tree, tree, tree, tree, tree, tree, tree);
+  gimple_match_op (const gimple_match_cond &,
+                   code_helper, tree, tree, tree, tree, tree, tree, tree, tree);
 
   void set_op (code_helper, tree, unsigned int);
   void set_op (code_helper, tree, tree);
@@ -100,6 +106,8 @@ public:
   void set_op (code_helper, tree, tree, tree, tree, bool);
   void set_op (code_helper, tree, tree, tree, tree, tree);
   void set_op (code_helper, tree, tree, tree, tree, tree, tree);
+  void set_op (code_helper, tree, tree, tree, tree, tree, tree, tree);
+  void set_op (code_helper, tree, tree, tree, tree, tree, tree, tree, tree);
   void set_value (tree);
 
   tree op_or_null (unsigned int) const;
@@ -212,6 +220,39 @@ gimple_match_op::gimple_match_op (const gimple_match_cond &cond_in,
   ops[4] = op4;
 }
 
+inline
+gimple_match_op::gimple_match_op (const gimple_match_cond &cond_in,
+                                  code_helper code_in, tree type_in,
+                                  tree op0, tree op1, tree op2, tree op3,
+                                  tree op4, tree op5)
+  : cond (cond_in), code (code_in), type (type_in), reverse (false),
+    num_ops (6)
+{
+  ops[0] = op0;
+  ops[1] = op1;
+  ops[2] = op2;
+  ops[3] = op3;
+  ops[4] = op4;
+  ops[5] = op5;
+}
+
+inline
+gimple_match_op::gimple_match_op (const gimple_match_cond &cond_in,
+                                  code_helper code_in, tree type_in,
+                                  tree op0, tree op1, tree op2, tree op3,
+                                  tree op4, tree op5, tree op6)
+  : cond (cond_in), code (code_in), type (type_in), reverse (false),
+    num_ops (7)
+{
+  ops[0] = op0;
+  ops[1] = op1;
+  ops[2] = op2;
+  ops[3] = op3;
+  ops[4] = op4;
+  ops[5] = op5;
+  ops[6] = op6;
+}
+
 /* Change the operation performed to CODE_IN, the type of the result to
    TYPE_IN, and the number of operands to NUM_OPS_IN.  The caller needs
    to set the operands itself.  */
@@ -299,6 +340,39 @@ gimple_match_op::set_op (code_helper code_in, tree type_in,
   ops[4] = op4;
 }
 
+inline void
+gimple_match_op::set_op (code_helper code_in, tree type_in,
+                         tree op0, tree op1, tree op2, tree op3, tree op4,
+                         tree op5)
+{
+  code = code_in;
+  type = type_in;
+  num_ops = 6;
+  ops[0] = op0;
+  ops[1] = op1;
+  ops[2] = op2;
+  ops[3] = op3;
+  ops[4] = op4;
+  ops[5] = op5;
+}
+
+inline void
+gimple_match_op::set_op (code_helper code_in, tree type_in,
+                         tree op0, tree op1, tree op2, tree op3, tree op4,
+                         tree op5, tree op6)
+{
+  code = code_in;
+  type = type_in;
+  num_ops = 7;
+  ops[0] = op0;
+  ops[1] = op1;
+  ops[2] = op2;
+  ops[3] = op3;
+  ops[4] = op4;
+  ops[5] = op5;
+  ops[6] = op6;
+}
+
 /* Set the "operation" to be the single value VALUE, such as a constant
    or SSA_NAME.  */
 
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 018175261b9..ed83fa8112e 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -170,6 +170,7 @@ init_internal_fns ()
 #define store_lanes_direct { 0, 0, false }
 #define mask_store_lanes_direct { 0, 0, false }
 #define vec_cond_mask_direct { 1, 0, false }
+#define vec_cond_mask_len_direct { 1, 1, false }
 #define vec_cond_direct { 2, 0, false }
 #define scatter_store_direct { 3, 1, false }
 #define len_store_direct { 3, 3, false }
@@ -3129,6 +3130,39 @@ expand_vec_cond_mask_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
     emit_move_insn (target, ops[0].value);
 }
 
+static void
+expand_vec_cond_mask_len_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
+{
+  class expand_operand ops[6];
+
+  tree lhs = gimple_call_lhs (stmt);
+  tree op1 = gimple_call_arg (stmt, 1);
+  tree op2 = gimple_call_arg (stmt, 2);
+  tree vec_cond_type = TREE_TYPE (lhs);
+
+  machine_mode mode = TYPE_MODE (vec_cond_type);
+  enum insn_code icode = direct_optab_handler (optab, mode);
+  rtx rtx_op1, rtx_op2;
+
+  gcc_assert (icode != CODE_FOR_nothing);
+
+  rtx_op1 = expand_normal (op1);
+  rtx_op2 = expand_normal (op2);
+
+  rtx_op1 = force_reg (mode, rtx_op1);
+
+  rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
+  create_output_operand (&ops[0], target, mode);
+  create_input_operand (&ops[1], rtx_op1, mode);
+  create_input_operand (&ops[2], rtx_op2, mode);
+
+  int opno = add_mask_and_len_args (ops, 3, stmt);
+  expand_insn (icode, opno, ops);
+
+  if (!rtx_equal_p (ops[0].value, target))
+    emit_move_insn (target, ops[0].value);
+}
+
 /* Expand VEC_SET internal functions.  */
 
 static void
@@ -3927,6 +3961,9 @@ expand_convert_optab_fn (internal_fn fn, gcall *stmt, convert_optab optab,
 #define expand_vec_extract_optab_fn(FN, STMT, OPTAB) \
   expand_convert_optab_fn (FN, STMT, OPTAB, 2)
 
+#define expand_vec_cond_mask_len_optab_fn(FN, STMT, OPTAB) \
+  expand_vec_cond_mask_len_optab_fn (FN, STMT, OPTAB)
+
 /* RETURN_TYPE and ARGS are a return type and argument list that are
    in principle compatible with FN (which satisfies direct_internal_fn_p).
    Return the types that should be used to determine whether the
@@ -4018,6 +4055,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
 #define direct_store_lanes_optab_supported_p multi_vector_optab_supported_p
 #define direct_mask_store_lanes_optab_supported_p multi_vector_optab_supported_p
 #define direct_vec_cond_mask_optab_supported_p convert_optab_supported_p
+#define direct_vec_cond_mask_len_optab_supported_p direct_optab_supported_p
 #define direct_vec_cond_optab_supported_p convert_optab_supported_p
 #define direct_scatter_store_optab_supported_p convert_optab_supported_p
 #define direct_len_store_optab_supported_p direct_optab_supported_p
@@ -4690,6 +4728,7 @@ internal_fn_len_index (internal_fn fn)
     case IFN_MASK_LEN_STORE:
     case IFN_MASK_LEN_LOAD_LANES:
     case IFN_MASK_LEN_STORE_LANES:
+    case IFN_VCOND_MASK_LEN:
       return 3;
 
     default:
@@ -4779,6 +4818,9 @@ internal_fn_mask_index (internal_fn fn)
     case IFN_MASK_LEN_SCATTER_STORE:
       return 4;
 
+    case IFN_VCOND_MASK_LEN:
+      return 0;
+
     default:
       return (conditional_internal_fn_code (fn) != ERROR_MARK
               || get_unconditional_internal_fn (fn) != IFN_LAST ? 0 : -1);
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index a2023ab9c3d..581cc3b5140 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -221,6 +221,8 @@ DEF_INTERNAL_OPTAB_FN (VCONDU, ECF_CONST | ECF_NOTHROW, vcondu, vec_cond)
 DEF_INTERNAL_OPTAB_FN (VCONDEQ, ECF_CONST | ECF_NOTHROW, vcondeq, vec_cond)
 DEF_INTERNAL_OPTAB_FN (VCOND_MASK, ECF_CONST | ECF_NOTHROW,
                        vcond_mask, vec_cond_mask)
+DEF_INTERNAL_OPTAB_FN (VCOND_MASK_LEN, ECF_CONST | ECF_NOTHROW,
+                       vcond_mask_len, vec_cond_mask_len)
 
 DEF_INTERNAL_OPTAB_FN (VEC_SET, ECF_CONST | ECF_NOTHROW, vec_set, vec_set)
 DEF_INTERNAL_OPTAB_FN (VEC_EXTRACT, ECF_CONST | ECF_NOTHROW,
diff --git a/gcc/match.pd b/gcc/match.pd
index f725a685863..0c21c29694d 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -87,6 +87,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   negate bit_not)
 (define_operator_list COND_UNARY
   IFN_COND_NEG IFN_COND_NOT)
+(define_operator_list COND_LEN_UNARY
+  IFN_COND_LEN_NEG IFN_COND_LEN_NOT)
 
 /* Binary operations and their associated IFN_COND_* function.  */
 (define_operator_list UNCOND_BINARY
@@ -103,12 +105,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   IFN_COND_FMIN IFN_COND_FMAX
   IFN_COND_AND IFN_COND_IOR IFN_COND_XOR
   IFN_COND_SHL IFN_COND_SHR)
+(define_operator_list COND_LEN_BINARY
+  IFN_COND_LEN_ADD IFN_COND_LEN_SUB
+  IFN_COND_LEN_MUL IFN_COND_LEN_DIV IFN_COND_LEN_MOD IFN_COND_LEN_RDIV
+  IFN_COND_LEN_MIN IFN_COND_LEN_MAX
+  IFN_COND_LEN_FMIN IFN_COND_LEN_FMAX
+  IFN_COND_LEN_AND IFN_COND_LEN_IOR IFN_COND_LEN_XOR
+  IFN_COND_LEN_SHL IFN_COND_LEN_SHR)
 
 /* Same for ternary operations.  */
 (define_operator_list UNCOND_TERNARY
   IFN_FMA IFN_FMS IFN_FNMA IFN_FNMS)
 (define_operator_list COND_TERNARY
   IFN_COND_FMA IFN_COND_FMS IFN_COND_FNMA IFN_COND_FNMS)
+(define_operator_list COND_LEN_TERNARY
+  IFN_COND_LEN_FMA IFN_COND_LEN_FMS IFN_COND_LEN_FNMA IFN_COND_LEN_FNMS)
 
 /* __atomic_fetch_or_*, __atomic_fetch_xor_*, __atomic_xor_fetch_* */
 (define_operator_list ATOMIC_FETCH_OR_XOR_N
@@ -8949,6 +8960,56 @@ and,
         && single_use (@5))
     (view_convert (cond_op (bit_not @0) @2 @3 @4
                   (view_convert:op_type @1)))))))
+
+/* Similar for all cond_len operations.  */
+(for uncond_op (UNCOND_UNARY)
+     cond_op (COND_LEN_UNARY)
+ (simplify
+  (IFN_VCOND_MASK_LEN @0 (view_convert? (uncond_op@3 @1)) @2 @4 @5)
+   (with { tree op_type = TREE_TYPE (@3); }
+    (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
+        && is_truth_type_for (op_type, TREE_TYPE (@0)))
+     (cond_op @0 @1 @2 @4 @5))))
+ (simplify
+  (IFN_VCOND_MASK_LEN @0 @1 (view_convert? (uncond_op@3 @2)) @4 @5)
+   (with { tree op_type = TREE_TYPE (@3); }
+    (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
+        && is_truth_type_for (op_type, TREE_TYPE (@0)))
+     (cond_op (bit_not @0) @2 @1 @4 @5)))))
+
+(for uncond_op (UNCOND_BINARY)
+     cond_op (COND_LEN_BINARY)
+ (simplify
+  (IFN_VCOND_MASK_LEN @0 (view_convert? (uncond_op@4 @1 @2)) @3 @5 @6)
+  (with { tree op_type = TREE_TYPE (@4); }
+   (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
+        && is_truth_type_for (op_type, TREE_TYPE (@0))
+        && single_use (@4))
+    (view_convert (cond_op @0 @1 @2 (view_convert:op_type @3) @5 @6)))))
+ (simplify
+  (IFN_VCOND_MASK_LEN @0 @1 (view_convert? (uncond_op@4 @2 @3)) @5 @6)
+  (with { tree op_type = TREE_TYPE (@4); }
+   (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
+        && is_truth_type_for (op_type, TREE_TYPE (@0))
+        && single_use (@4))
+    (view_convert (cond_op (bit_not @0) @2 @3 (view_convert:op_type @1) @5 @6))))))
+
+(for uncond_op (UNCOND_TERNARY)
+     cond_op (COND_LEN_TERNARY)
+ (simplify
+  (IFN_VCOND_MASK_LEN @0 (view_convert? (uncond_op@5 @1 @2 @3)) @4 @6 @7)
+  (with { tree op_type = TREE_TYPE (@5); }
+   (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
+        && is_truth_type_for (op_type, TREE_TYPE (@0))
+        && single_use (@5))
+    (view_convert (cond_op @0 @1 @2 @3 (view_convert:op_type @4) @6 @7)))))
+ (simplify
+  (IFN_VCOND_MASK_LEN @0 @1 (view_convert? (uncond_op@5 @2 @3 @4)) @6 @7)
+  (with { tree op_type = TREE_TYPE (@5); }
+   (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
+        && is_truth_type_for (op_type, TREE_TYPE (@0))
+        && single_use (@5))
+    (view_convert (cond_op (bit_not @0) @2 @3 @4 (view_convert:op_type @1) @6 @7))))))
 #endif
 
 /* Detect cases in which a VEC_COND_EXPR effectively replaces the
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 2ccbe4197b7..8d5ceeb8710 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -282,6 +282,7 @@ OPTAB_D (cond_len_fnma_optab, "cond_len_fnma$a")
 OPTAB_D (cond_len_fnms_optab, "cond_len_fnms$a")
 OPTAB_D (cond_len_neg_optab, "cond_len_neg$a")
 OPTAB_D (cond_len_one_cmpl_optab, "cond_len_one_cmpl$a")
+OPTAB_D (vcond_mask_len_optab, "vcond_mask_len_$a")
 OPTAB_D (cmov_optab, "cmov$a6")
 OPTAB_D (cstore_optab, "cstore$a4")
 OPTAB_D (ctrap_optab, "ctrap$a4")
-- 
2.41.0
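
For reference, the element-wise behaviour described in the md.texi hunk can be sketched as a scalar model.  This is only an illustration under stated assumptions and not GCC code: the function name vcond_mask_len_model and its parameter names are hypothetical, the argument order follows the IFN (mask, then-value, else-value, len, bias), and plain int elements stand in for an arbitrary vector mode.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar model of the documented vcond_mask_len semantics.
// Not part of GCC; names and element type are illustrative assumptions.
std::vector<int>
vcond_mask_len_model (const std::vector<bool> &mask,
                      const std::vector<int> &then_vals,
                      const std::vector<int> &else_vals,
                      long len, long bias)
{
  assert (mask.size () == then_vals.size ()
          && mask.size () == else_vals.size ());

  std::vector<int> result (mask.size ());
  for (std::size_t i = 0; i < mask.size (); ++i)
    {
      // Elements at index >= len + bias behave as if their mask
      // element were zero, so they take the else value.
      bool active = static_cast<long> (i) < len + bias;
      result[i] = (active && mask[i]) ? then_vals[i] : else_vals[i];
    }
  return result;
}
```

With an all-true mask over four elements, then-values {1, 2, 3, 4}, else-values {10, 20, 30, 40}, len 2 and bias 0, the model returns {1, 2, 30, 40} rather than the then-values.  That is the case the commit message refers to: a degenerate mask alone does not make the select redundant once a length is involved.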