From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 7256D385703A for ; Mon, 21 Aug 2023 14:18:06 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 7256D385703A Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id E3F532F4; Mon, 21 Aug 2023 07:18:46 -0700 (PDT) Received: from localhost (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 7E7583F740; Mon, 21 Aug 2023 07:18:05 -0700 (PDT) From: Richard Sandiford To: Richard Biener Mail-Followup-To: Richard Biener ,Juzhe-Zhong , gcc-patches@gcc.gnu.org, richard.sandiford@arm.com Cc: Juzhe-Zhong , gcc-patches@gcc.gnu.org Subject: Re: [PATCH] gimple_fold: Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold References: <20230816131205.233568-1-juzhe.zhong@rivai.ai> Date: Mon, 21 Aug 2023 15:18:04 +0100 In-Reply-To: (Richard Biener's message of "Fri, 18 Aug 2023 09:26:45 +0000 (UTC)") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Status: No, score=-25.6 required=5.0 tests=BAYES_00,GIT_PATCH_0,KAM_DMARC_NONE,KAM_DMARC_STATUS,KAM_LAZY_DOMAIN_SECURITY,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Richard Biener writes: > On Wed, 16 Aug 2023, Juzhe-Zhong wrote: > >> Hi, Richard and Richi. >> >> Currently, GCC support COND_LEN_FMA for floating-point **NO** -ffast-math. >> It's supported in tree-ssa-math-opts.cc. However, GCC failed to support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS. >> >> Consider this following case: >> #define TEST_TYPE(TYPE) \ >> __attribute__ ((noipa)) void ternop_##TYPE (TYPE *__restrict dst, \ >> TYPE *__restrict a, \ >> TYPE *__restrict b, int n) \ >> { \ >> for (int i = 0; i < n; i++) \ >> dst[i] -= a[i] * b[i]; \ >> } >> >> #define TEST_ALL() \ >> TEST_TYPE (float) \ >> >> TEST_ALL () >> >> Gimple IR for RVV: >> >> ... >> _39 = -vect__8.14_26; >> vect__10.16_21 = .COND_LEN_FMA ({ -1, ... }, vect__6.11_30, _39, vect__4.8_34, vect__4.8_34, _46, 0); >> ... >> >> This is because this following piece of codes in tree-ssa-math-opts.cc: >> >> if (len) >> fma_stmt >> = gimple_build_call_internal (IFN_COND_LEN_FMA, 7, cond, mulop1, op2, >> addop, else_value, len, bias); >> else if (cond) >> fma_stmt = gimple_build_call_internal (IFN_COND_FMA, 5, cond, mulop1, >> op2, addop, else_value); >> else >> fma_stmt = gimple_build_call_internal (IFN_FMA, 3, mulop1, op2, addop); >> gimple_set_lhs (fma_stmt, gimple_get_lhs (use_stmt)); >> gimple_call_set_nothrow (fma_stmt, !stmt_can_throw_internal (cfun, >> use_stmt)); >> gsi_replace (&gsi, fma_stmt, true); >> /* Follow all SSA edges so that we generate FMS, FNMA and FNMS >> regardless of where the negation occurs. */ >> gimple *orig_stmt = gsi_stmt (gsi); >> if (fold_stmt (&gsi, follow_all_ssa_edges)) >> { >> if (maybe_clean_or_replace_eh_stmt (orig_stmt, gsi_stmt (gsi))) >> gcc_unreachable (); >> update_stmt (gsi_stmt (gsi)); >> } >> >> 'fold_stmt' failed to fold NEGATE_EXPR + COND_LEN_FMA ====> COND_LEN_FNMA. >> >> This patch support STMT fold into: >> >> vect__10.16_21 = .COND_LEN_FNMA ({ -1, ... }, vect__8.14_26, vect__6.11_30, vect__4.8_34, { 0.0, ... }, _46, 0); >> >> Note that COND_LEN_FNMA has 7 arguments and COND_LEN_ADD has 6 arguments. >> >> Extend maximum num ops: >> - static const unsigned int MAX_NUM_OPS = 5; >> + static const unsigned int MAX_NUM_OPS = 7; >> >> Bootstrap and Regtest on X86 passed. >> >> Fully tested COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS on RISC-V backend. >> >> Testing on aarch64 is on progress. >> >> gcc/ChangeLog: >> >> * genmatch.cc (decision_tree::gen): Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold. >> * gimple-match-exports.cc (gimple_simplify): Ditto. >> (gimple_resimplify6): New function. >> (gimple_resimplify7): New function. >> (gimple_match_op::resimplify): Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold. >> (convert_conditional_op): Ditto. >> (build_call_internal): Ditto. >> (try_conditional_simplification): Ditto. >> (gimple_extract): Ditto. >> * gimple-match.h (gimple_match_cond::gimple_match_cond): Ditto. >> * internal-fn.cc (CASE): Ditto. >> >> --- >> gcc/genmatch.cc | 2 +- >> gcc/gimple-match-exports.cc | 124 ++++++++++++++++++++++++++++++++++-- >> gcc/gimple-match.h | 19 +++++- >> gcc/internal-fn.cc | 11 ++-- >> 4 files changed, 144 insertions(+), 12 deletions(-) >> >> diff --git a/gcc/genmatch.cc b/gcc/genmatch.cc >> index f46d2e1520d..a1925a747a7 100644 >> --- a/gcc/genmatch.cc >> +++ b/gcc/genmatch.cc >> @@ -4052,7 +4052,7 @@ decision_tree::gen (vec &files, bool gimple) >> } >> fprintf (stderr, "removed %u duplicate tails\n", rcnt); >> >> - for (unsigned n = 1; n <= 5; ++n) >> + for (unsigned n = 1; n <= 7; ++n) >> { >> bool has_kids_p = false; >> >> diff --git a/gcc/gimple-match-exports.cc b/gcc/gimple-match-exports.cc >> index 7aeb4ddb152..895950309b7 100644 >> --- a/gcc/gimple-match-exports.cc >> +++ b/gcc/gimple-match-exports.cc >> @@ -60,6 +60,12 @@ extern bool gimple_simplify (gimple_match_op *, gimple_seq *, tree (*)(tree), >> code_helper, tree, tree, tree, tree, tree); >> extern bool gimple_simplify (gimple_match_op *, gimple_seq *, tree (*)(tree), >> code_helper, tree, tree, tree, tree, tree, tree); >> +extern bool gimple_simplify (gimple_match_op *, gimple_seq *, tree (*)(tree), >> + code_helper, tree, tree, tree, tree, tree, tree, >> + tree); >> +extern bool gimple_simplify (gimple_match_op *, gimple_seq *, tree (*)(tree), >> + code_helper, tree, tree, tree, tree, tree, tree, >> + tree, tree); >> >> /* Functions that are needed by gimple-match but that are exported and used in >> other places in the compiler. */ >> @@ -89,6 +95,8 @@ static bool gimple_resimplify2 (gimple_seq *, gimple_match_op *, tree (*)(tree)) >> static bool gimple_resimplify3 (gimple_seq *, gimple_match_op *, tree (*)(tree)); >> static bool gimple_resimplify4 (gimple_seq *, gimple_match_op *, tree (*)(tree)); >> static bool gimple_resimplify5 (gimple_seq *, gimple_match_op *, tree (*)(tree)); >> +static bool gimple_resimplify6 (gimple_seq *, gimple_match_op *, tree (*)(tree)); >> +static bool gimple_resimplify7 (gimple_seq *, gimple_match_op *, tree (*)(tree)); >> >> /* Match and simplify the toplevel valueized operation THIS. >> Replaces THIS with a simplified and/or canonicalized result and >> @@ -109,6 +117,10 @@ gimple_match_op::resimplify (gimple_seq *seq, tree (*valueize)(tree)) >> return gimple_resimplify4 (seq, this, valueize); >> case 5: >> return gimple_resimplify5 (seq, this, valueize); >> + case 6: >> + return gimple_resimplify6 (seq, this, valueize); >> + case 7: >> + return gimple_resimplify7 (seq, this, valueize); >> default: >> gcc_unreachable (); >> } >> @@ -146,7 +158,14 @@ convert_conditional_op (gimple_match_op *orig_op, >> if (ifn == IFN_LAST) >> return false; >> unsigned int num_ops = orig_op->num_ops; >> - new_op->set_op (as_combined_fn (ifn), orig_op->type, num_ops + 2); >> + unsigned int num_cond_ops = 2; >> + if (orig_op->cond.len) >> + { >> + /* Convert COND_FNMA to COND_LEN_FNMA if it has LEN and BIAS. */ >> + ifn = get_len_internal_fn (ifn); >> + num_cond_ops = 4; >> + } > > This should definitely be placed before the ifn == IFN_LAST check, > and possibly not doing the get_conditional_internal_fn thing at all. FWIW, I think it probably works out neater the current way: i.e. first get the conditional function without length and bias parameters, then add the length and bias parameters where needed. It should be a safe operation now that there is a mechanical link between COND_FOO and COND_LEN_FOO (i.e. we can't have one without the other). Thanks, Richard