From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by sourceware.org (Postfix) with ESMTPS id 8DA3F3858401 for ; Fri, 18 Aug 2023 09:26:46 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 8DA3F3858401 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.de Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out1.suse.de (Postfix) with ESMTP id 851EF2183F; Fri, 18 Aug 2023 09:26:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1692350805; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=SErZNquWjOn9c4RpRkeG+ZWZpdSJekKuUAqbckpiR6c=; b=VDURESMEQ588+UCWY6syb7eYjBJthQ8uqOqifpN2TPhD2+cm2HqliE5gxNNNb14GmI6FqX Zm8tfLJfJE2IrLsKeuxkTKWBajQcTGPq8T29t13SgWvXDxOqbNSs+KEW/CdBOqDr0J7DH/ XtxDJLwGgeEXzDdefxhLYGfReqVE6KI= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1692350805; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=SErZNquWjOn9c4RpRkeG+ZWZpdSJekKuUAqbckpiR6c=; b=MbE5mBNkenF3iIa12equvHRY8tfirtWkHfiM5MSNZwoAvWbS2186hBYK+pug+O87blKZe0 yxr3V4cR6WP1zHAw== Received: from wotan.suse.de (wotan.suse.de [10.160.0.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id 767792C142; Fri, 18 Aug 2023 09:26:45 +0000 (UTC) Date: Fri, 18 Aug 2023 09:26:45 +0000 (UTC) From: Richard Biener To: Juzhe-Zhong cc: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com Subject: Re: [PATCH] gimple_fold: Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold In-Reply-To: <20230816131205.233568-1-juzhe.zhong@rivai.ai> Message-ID: References: <20230816131205.233568-1-juzhe.zhong@rivai.ai> User-Agent: Alpine 2.22 (LSU 394 2020-01-19) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Spam-Status: No, score=-11.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Wed, 16 Aug 2023, Juzhe-Zhong wrote: > Hi, Richard and Richi. > > Currently, GCC support COND_LEN_FMA for floating-point **NO** -ffast-math. > It's supported in tree-ssa-math-opts.cc. However, GCC failed to support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS. > > Consider this following case: > #define TEST_TYPE(TYPE) \ > __attribute__ ((noipa)) void ternop_##TYPE (TYPE *__restrict dst, \ > TYPE *__restrict a, \ > TYPE *__restrict b, int n) \ > { \ > for (int i = 0; i < n; i++) \ > dst[i] -= a[i] * b[i]; \ > } > > #define TEST_ALL() \ > TEST_TYPE (float) \ > > TEST_ALL () > > Gimple IR for RVV: > > ... > _39 = -vect__8.14_26; > vect__10.16_21 = .COND_LEN_FMA ({ -1, ... }, vect__6.11_30, _39, vect__4.8_34, vect__4.8_34, _46, 0); > ... > > This is because this following piece of codes in tree-ssa-math-opts.cc: > > if (len) > fma_stmt > = gimple_build_call_internal (IFN_COND_LEN_FMA, 7, cond, mulop1, op2, > addop, else_value, len, bias); > else if (cond) > fma_stmt = gimple_build_call_internal (IFN_COND_FMA, 5, cond, mulop1, > op2, addop, else_value); > else > fma_stmt = gimple_build_call_internal (IFN_FMA, 3, mulop1, op2, addop); > gimple_set_lhs (fma_stmt, gimple_get_lhs (use_stmt)); > gimple_call_set_nothrow (fma_stmt, !stmt_can_throw_internal (cfun, > use_stmt)); > gsi_replace (&gsi, fma_stmt, true); > /* Follow all SSA edges so that we generate FMS, FNMA and FNMS > regardless of where the negation occurs. */ > gimple *orig_stmt = gsi_stmt (gsi); > if (fold_stmt (&gsi, follow_all_ssa_edges)) > { > if (maybe_clean_or_replace_eh_stmt (orig_stmt, gsi_stmt (gsi))) > gcc_unreachable (); > update_stmt (gsi_stmt (gsi)); > } > > 'fold_stmt' failed to fold NEGATE_EXPR + COND_LEN_FMA ====> COND_LEN_FNMA. > > This patch support STMT fold into: > > vect__10.16_21 = .COND_LEN_FNMA ({ -1, ... }, vect__8.14_26, vect__6.11_30, vect__4.8_34, { 0.0, ... }, _46, 0); > > Note that COND_LEN_FNMA has 7 arguments and COND_LEN_ADD has 6 arguments. > > Extend maximum num ops: > - static const unsigned int MAX_NUM_OPS = 5; > + static const unsigned int MAX_NUM_OPS = 7; > > Bootstrap and Regtest on X86 passed. > > Fully tested COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS on RISC-V backend. > > Testing on aarch64 is on progress. > > gcc/ChangeLog: > > * genmatch.cc (decision_tree::gen): Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold. > * gimple-match-exports.cc (gimple_simplify): Ditto. > (gimple_resimplify6): New function. > (gimple_resimplify7): New function. > (gimple_match_op::resimplify): Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold. > (convert_conditional_op): Ditto. > (build_call_internal): Ditto. > (try_conditional_simplification): Ditto. > (gimple_extract): Ditto. > * gimple-match.h (gimple_match_cond::gimple_match_cond): Ditto. > * internal-fn.cc (CASE): Ditto. > > --- > gcc/genmatch.cc | 2 +- > gcc/gimple-match-exports.cc | 124 ++++++++++++++++++++++++++++++++++-- > gcc/gimple-match.h | 19 +++++- > gcc/internal-fn.cc | 11 ++-- > 4 files changed, 144 insertions(+), 12 deletions(-) > > diff --git a/gcc/genmatch.cc b/gcc/genmatch.cc > index f46d2e1520d..a1925a747a7 100644 > --- a/gcc/genmatch.cc > +++ b/gcc/genmatch.cc > @@ -4052,7 +4052,7 @@ decision_tree::gen (vec &files, bool gimple) > } > fprintf (stderr, "removed %u duplicate tails\n", rcnt); > > - for (unsigned n = 1; n <= 5; ++n) > + for (unsigned n = 1; n <= 7; ++n) > { > bool has_kids_p = false; > > diff --git a/gcc/gimple-match-exports.cc b/gcc/gimple-match-exports.cc > index 7aeb4ddb152..895950309b7 100644 > --- a/gcc/gimple-match-exports.cc > +++ b/gcc/gimple-match-exports.cc > @@ -60,6 +60,12 @@ extern bool gimple_simplify (gimple_match_op *, gimple_seq *, tree (*)(tree), > code_helper, tree, tree, tree, tree, tree); > extern bool gimple_simplify (gimple_match_op *, gimple_seq *, tree (*)(tree), > code_helper, tree, tree, tree, tree, tree, tree); > +extern bool gimple_simplify (gimple_match_op *, gimple_seq *, tree (*)(tree), > + code_helper, tree, tree, tree, tree, tree, tree, > + tree); > +extern bool gimple_simplify (gimple_match_op *, gimple_seq *, tree (*)(tree), > + code_helper, tree, tree, tree, tree, tree, tree, > + tree, tree); > > /* Functions that are needed by gimple-match but that are exported and used in > other places in the compiler. */ > @@ -89,6 +95,8 @@ static bool gimple_resimplify2 (gimple_seq *, gimple_match_op *, tree (*)(tree)) > static bool gimple_resimplify3 (gimple_seq *, gimple_match_op *, tree (*)(tree)); > static bool gimple_resimplify4 (gimple_seq *, gimple_match_op *, tree (*)(tree)); > static bool gimple_resimplify5 (gimple_seq *, gimple_match_op *, tree (*)(tree)); > +static bool gimple_resimplify6 (gimple_seq *, gimple_match_op *, tree (*)(tree)); > +static bool gimple_resimplify7 (gimple_seq *, gimple_match_op *, tree (*)(tree)); > > /* Match and simplify the toplevel valueized operation THIS. > Replaces THIS with a simplified and/or canonicalized result and > @@ -109,6 +117,10 @@ gimple_match_op::resimplify (gimple_seq *seq, tree (*valueize)(tree)) > return gimple_resimplify4 (seq, this, valueize); > case 5: > return gimple_resimplify5 (seq, this, valueize); > + case 6: > + return gimple_resimplify6 (seq, this, valueize); > + case 7: > + return gimple_resimplify7 (seq, this, valueize); > default: > gcc_unreachable (); > } > @@ -146,7 +158,14 @@ convert_conditional_op (gimple_match_op *orig_op, > if (ifn == IFN_LAST) > return false; > unsigned int num_ops = orig_op->num_ops; > - new_op->set_op (as_combined_fn (ifn), orig_op->type, num_ops + 2); > + unsigned int num_cond_ops = 2; > + if (orig_op->cond.len) > + { > + /* Convert COND_FNMA to COND_LEN_FNMA if it has LEN and BIAS. */ > + ifn = get_len_internal_fn (ifn); > + num_cond_ops = 4; > + } This should definitely be placed before the ifn == IFN_LAST check, and possibly not doing the get_conditional_internal_fn thing at all. The genmatch changes look OK to me, I hope Richard will look at the conditional transforms things since he invented those. > + new_op->set_op (as_combined_fn (ifn), orig_op->type, num_ops + num_cond_ops); > new_op->ops[0] = orig_op->cond.cond; > for (unsigned int i = 0; i < num_ops; ++i) > new_op->ops[i + 1] = orig_op->ops[i]; > @@ -155,6 +174,11 @@ convert_conditional_op (gimple_match_op *orig_op, > else_value = targetm.preferred_else_value (ifn, orig_op->type, > num_ops, orig_op->ops); > new_op->ops[num_ops + 1] = else_value; > + if (orig_op->cond.len) > + { > + new_op->ops[num_ops + 2] = orig_op->cond.len; > + new_op->ops[num_ops + 3] = orig_op->cond.bias; > + } > return true; > } > /* Helper for gimple_simplify valueizing OP using VALUEIZE and setting > @@ -219,7 +243,9 @@ build_call_internal (internal_fn fn, gimple_match_op *res_op) > res_op->op_or_null (1), > res_op->op_or_null (2), > res_op->op_or_null (3), > - res_op->op_or_null (4)); > + res_op->op_or_null (4), > + res_op->op_or_null (5), > + res_op->op_or_null (6)); > } > > /* RES_OP is the result of a simplification. If it is conditional, > @@ -319,6 +345,7 @@ try_conditional_simplification (internal_fn ifn, gimple_match_op *res_op, > { > code_helper op; > tree_code code = conditional_internal_fn_code (ifn); > + int len_index = internal_fn_len_index (ifn); > if (code != ERROR_MARK) > op = code; > else > @@ -330,12 +357,20 @@ try_conditional_simplification (internal_fn ifn, gimple_match_op *res_op, > } > > unsigned int num_ops = res_op->num_ops; > + /* num_cond_ops = 2 for COND_ADD (MASK and ELSE) > + wheras num_cond_ops = 4 for COND_LEN_ADD (MASK, ELSE, LEN and BIAS). */ > + unsigned int num_cond_ops = len_index < 0 ? 2 : 4; > + tree else_value > + = len_index < 0 ? res_op->ops[num_ops - 1] : res_op->ops[num_ops - 3]; > + tree len = len_index < 0 ? NULL_TREE : res_op->ops[num_ops - 2]; > + tree bias = len_index < 0 ? NULL_TREE : res_op->ops[num_ops - 1]; > + unsigned int last_op_index = len_index < 0 ? num_ops - 1 : num_ops - 3; > gimple_match_op cond_op (gimple_match_cond (res_op->ops[0], > - res_op->ops[num_ops - 1]), > - op, res_op->type, num_ops - 2); > + else_value, len, bias), > + op, res_op->type, num_ops - num_cond_ops); > > memcpy (cond_op.ops, res_op->ops + 1, (num_ops - 1) * sizeof *cond_op.ops); > - switch (num_ops - 2) > + switch (num_ops - num_cond_ops) > { > case 1: > if (!gimple_resimplify1 (seq, &cond_op, valueize)) > @@ -717,7 +752,7 @@ gimple_extract (gimple *stmt, gimple_match_op *res_op, > /* ??? This way we can't simplify calls with side-effects. */ > if (gimple_call_lhs (stmt) != NULL_TREE > && gimple_call_num_args (stmt) >= 1 > - && gimple_call_num_args (stmt) <= 5) > + && gimple_call_num_args (stmt) <= 7) > { > combined_fn cfn; > if (gimple_call_internal_p (stmt)) > @@ -1145,6 +1180,83 @@ gimple_resimplify5 (gimple_seq *seq, gimple_match_op *res_op, > return canonicalized; > } > > +/* Helper that matches and simplifies the toplevel result from > + a gimple_simplify run (where we don't want to build > + a stmt in case it's used in in-place folding). Replaces > + RES_OP with a simplified and/or canonicalized result and > + returns whether any change was made. */ > + > +static bool > +gimple_resimplify6 (gimple_seq *seq, gimple_match_op *res_op, > + tree (*valueize)(tree)) > +{ > + /* No constant folding is defined for five-operand functions. */ > + > + /* Canonicalize operand order. */ > + bool canonicalized = false; > + int argno = first_commutative_argument (res_op->code, res_op->type); > + if (argno >= 0 > + && tree_swap_operands_p (res_op->ops[argno], res_op->ops[argno + 1])) > + { > + std::swap (res_op->ops[argno], res_op->ops[argno + 1]); > + canonicalized = true; > + } > + > + gimple_match_op res_op2 (*res_op); > + if (gimple_simplify (&res_op2, seq, valueize, > + res_op->code, res_op->type, > + res_op->ops[0], res_op->ops[1], res_op->ops[2], > + res_op->ops[3], res_op->ops[4], res_op->ops[5])) > + { > + *res_op = res_op2; > + return true; > + } > + > + if (maybe_resimplify_conditional_op (seq, res_op, valueize)) > + return true; > + > + return canonicalized; > +} > + > +/* Helper that matches and simplifies the toplevel result from > + a gimple_simplify run (where we don't want to build > + a stmt in case it's used in in-place folding). Replaces > + RES_OP with a simplified and/or canonicalized result and > + returns whether any change was made. */ > + > +static bool > +gimple_resimplify7 (gimple_seq *seq, gimple_match_op *res_op, > + tree (*valueize)(tree)) > +{ > + /* No constant folding is defined for five-operand functions. */ > + > + /* Canonicalize operand order. */ > + bool canonicalized = false; > + int argno = first_commutative_argument (res_op->code, res_op->type); > + if (argno >= 0 > + && tree_swap_operands_p (res_op->ops[argno], res_op->ops[argno + 1])) > + { > + std::swap (res_op->ops[argno], res_op->ops[argno + 1]); > + canonicalized = true; > + } > + > + gimple_match_op res_op2 (*res_op); > + if (gimple_simplify (&res_op2, seq, valueize, > + res_op->code, res_op->type, > + res_op->ops[0], res_op->ops[1], res_op->ops[2], > + res_op->ops[3], res_op->ops[4], res_op->ops[5], > + res_op->ops[6])) > + { > + *res_op = res_op2; > + return true; > + } > + > + if (maybe_resimplify_conditional_op (seq, res_op, valueize)) > + return true; > + > + return canonicalized; > +} > + > /* Return a canonical form for CODE when operating on TYPE. The idea > is to remove redundant ways of representing the same operation so > that code_helpers can be hashed and compared for equality. > diff --git a/gcc/gimple-match.h b/gcc/gimple-match.h > index b20585dca4b..c16cfede826 100644 > --- a/gcc/gimple-match.h > +++ b/gcc/gimple-match.h > @@ -34,6 +34,7 @@ public: > /* Build an unconditional op. */ > gimple_match_cond (uncond) : cond (NULL_TREE), else_value (NULL_TREE) {} > gimple_match_cond (tree, tree); > + gimple_match_cond (tree, tree, tree, tree); > > gimple_match_cond any_else () const; > > @@ -44,6 +45,16 @@ public: > /* The value to use when the condition is false. This is NULL_TREE if > the operation is unconditional or if the value doesn't matter. */ > tree else_value; > + > + /* The length value for conditional len operation that operation occurs > + when element index < len + bias. This is NULL_TREE if the operation > + is non-LEN operation. */ > + tree len; > + > + /* The bias value for conditional len operation that operation occurs > + when element index < len + bias. This is NULL_TREE if the operation > + is non-LEN operation. */ > + tree bias; > }; > > inline > @@ -52,6 +63,12 @@ gimple_match_cond::gimple_match_cond (tree cond_in, tree else_value_in) > { > } > > +inline > +gimple_match_cond::gimple_match_cond (tree cond_in, tree else_value_in, > + tree len_in, tree bias_in) > + : cond (cond_in), else_value (else_value_in), len (len_in), bias (bias_in) > +{} > + > /* Return a gimple_match_cond with the same condition but with an > arbitrary ELSE_VALUE. */ > > @@ -93,7 +110,7 @@ public: > bool resimplify (gimple_seq *, tree (*)(tree)); > > /* The maximum value of NUM_OPS. */ > - static const unsigned int MAX_NUM_OPS = 5; > + static const unsigned int MAX_NUM_OPS = 7; > > /* The conditions under which the operation is performed, and the value to > use as a fallback. */ > diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc > index cc1ede58799..be9b93e9f71 100644 > --- a/gcc/internal-fn.cc > +++ b/gcc/internal-fn.cc > @@ -4477,11 +4477,14 @@ get_unconditional_internal_fn (internal_fn ifn) > { > switch (ifn) > { > -#define CASE(NAME) case IFN_COND_##NAME: return IFN_##NAME; > - FOR_EACH_COND_FN_PAIR(CASE) > +#define CASE(NAME) \ > + case IFN_COND_##NAME: \ > + case IFN_COND_LEN_##NAME: \ > + return IFN_##NAME; > + FOR_EACH_COND_FN_PAIR (CASE) > #undef CASE > - default: > - return IFN_LAST; > + default: > + return IFN_LAST; > } > } > > -- Richard Biener SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)