From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ej1-x62b.google.com (mail-ej1-x62b.google.com [IPv6:2a00:1450:4864:20::62b]) by sourceware.org (Postfix) with ESMTPS id 04C9C3858D32 for ; Tue, 23 Aug 2022 07:26:51 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 04C9C3858D32 Received: by mail-ej1-x62b.google.com with SMTP id ce26so9944728ejb.11 for ; Tue, 23 Aug 2022 00:26:50 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc; bh=h2aaSAaIYLpGYLVOxf/GnCTvLwYAe1FaEFLiVSjORbs=; b=Ef6YDFLMQ80PkMiOY7j7jSKttsvDd2HJfdvly+l49ANeN72EPByQN0/gddkri0Ikk9 ZBcPseS62JoABXAKHbzpwRKKKYgPtwLeT0djZ4tpAo49TmTYtaNVDcg95cCIBuXeZNGM re/4Z4/coAhJnHBxSWKu8SxURk1n3ZGWm8vz9Mctems2j7sc6n7ctefKjYRqKSKp4zGw /FC2BpgKrCQRZi515Ni2VCox6zQrYaOx7FWrSatWX+uaNm/yO1wdQJ49iXjvS5Rhqf0C +80wrNvECPoU7LhSP53xbKmS3YF/KMeJU8xopAJlevXqZIlJtLoRbcxId6FewFltRPHN XuGA== X-Gm-Message-State: ACgBeo0N27saWjou4HHCdUkRTFKEINdxDvX9b7IMY5CbFH4aqz+kqED7 l28obpW9zA+Dpz8QY37TrCfCuBeINsM2njgADSEvd7Mg X-Google-Smtp-Source: AA6agR6zpHnhyLq53pyy/FTWeV3ZBMigaBa7UItpKq2S74z2FKwYm0u40PbXqJDrXYTZUcgtEIwsL0MmhCOcHhvperY= X-Received: by 2002:a17:907:2722:b0:731:23a3:be78 with SMTP id d2-20020a170907272200b0073123a3be78mr15893835ejl.330.1661239609467; Tue, 23 Aug 2022 00:26:49 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Richard Biener Date: Tue, 23 Aug 2022 09:26:37 +0200 Message-ID: Subject: Re: [PATCH] Enhance final_value_replacement_loop to handle bitop with an invariant induction.[PR105735] To: "Kong, Lingling" Cc: "Liu, Hongtao" , "gcc-patches@gcc.gnu.org" Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-8.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 23 Aug 2022 07:26:53 -0000 On Thu, Aug 18, 2022 at 8:48 AM Kong, Lingling via Gcc-patches wrote: > > Hi, > > This patch is for pr105735/pr101991. It will enable below optimization: > { > - long unsigned int bit; > - > - [local count: 32534376]: > - > - [local count: 1041207449]: > - # tmp_10 = PHI > - # bit_12 = PHI > - tmp_7 = bit2_6(D) & tmp_10; > - bit_8 = bit_12 + 1; > - if (bit_8 != 32) > - goto ; [96.97%] > - else > - goto ; [3.03%] > - > - [local count: 1009658865]: > - goto ; [100.00%] > - > - [local count: 32534376]: > - # tmp_11 = PHI > - return tmp_11; > + tmp_11 = tmp_4(D) & bit2_6(D); > + return tmp_11; > > } > > Ok for master ? > > gcc/ChangeLog: > > PR middle-end/105735 > * match.pd (bitop_with_inv_p): New match. > * tree-scalar-evolution.cc (gimple_bitop_with_inv_p): Declare. > (analyze_and_compute_bitop_with_inv_effect): New function. > (final_value_replacement_loop): Enhanced to handle bitop > with inv induction. > > gcc/testsuite/ChangeLog: > > * gcc.target/i386/pr105735-1.c: New test. > * gcc.target/i386/pr105735-2.c: New test. > --- > gcc/match.pd | 4 + > gcc/testsuite/gcc.target/i386/pr105735-1.c | 88 ++++++++++++++++++++++ gcc/testsuite/gcc.target/i386/pr105735-2.c | 28 +++++++ > gcc/tree-scalar-evolution.cc | 59 +++++++++++++++ > 4 files changed, 179 insertions(+) > create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-1.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-2.c > > diff --git a/gcc/match.pd b/gcc/match.pd index 562138a8034..cfe593ebb02 100644 > --- a/gcc/match.pd > +++ b/gcc/match.pd > @@ -8050,6 +8050,10 @@ and, > (bit_not > (nop_convert1? (bit_xor@0 (convert2? (lshift integer_onep@1 @2)) @3)))) > > +(for bit_op (bit_and bit_ior bit_xor) > + (match (bitop_with_inv_p @0 @1) > + (bit_op:c @0 @1))) > + That's a single-stmt match, you shouldn't use match.pd matching for this. Instead just do if (is_gimple_assign (stmt) && ((code = gimple_assign_rhs_code (stmt)), true) && (code == BIT_AND_EXPR || code == BIT_IOR_EXPR || code == BIT_XOR_EXPR)) .. and pick gimple_assign_rhs{1,2} (stmt) as the operands. The :c in bit_op:c is redundant btw. - while the name suggests "with invariant" you don't actually check for that. But again, given canonicalization rules the invariant will be rhs2 so above add && TREE_CODE (gimple_assign_rhs2 (stmt)) == INTEGER_CST for example. > /* n - (((n > C1) ? n : C1) & -C2) -> n & C1 for unsigned case. > n - (((n > C1) ? n : C1) & -C2) -> (n <= C1) ? n : (n & C1) for signed case. */ (simplify diff --git a/gcc/testsuite/gcc.target/i386/pr105735-1.c b/gcc/testsuite/gcc.target/i386/pr105735-1.c > new file mode 100644 > index 00000000000..8d2123ed351 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr105735-1.c > @@ -0,0 +1,88 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O1 -fdump-tree-sccp-details" } */ > +/* { dg-final { scan-tree-dump-times {final value replacement} 8 "sccp" > +} } */ > + > +unsigned long long > +__attribute__((noipa)) > +foo (unsigned long long tmp, unsigned long long bit2) { you probably need dg-require-effective-target longlong, but is it necessary to use long long for the testcases in the first place? The IV seems to be unused, if it should match the variables bit size use sizeof (type) * 8 > + for (int bit = 0; bit < 64; bit++) > + tmp &= bit2; > + return tmp; > +} > + > +unsigned long long > +__attribute__((noipa)) > +foo1 (unsigned long long tmp, unsigned long long bit2) { > + for (int bit = 63; bit >= 0; bit -=3) > + tmp &= bit2; > + return tmp; > +} > + > +unsigned long long > +__attribute__((noipa)) > +foo2 (unsigned long long tmp, unsigned long long bit2) { > + for (int bit = 0; bit < 64; bit++) > + tmp |= bit2; > + return tmp; > +} > + > +unsigned long long > +__attribute__((noipa)) > +foo3 (unsigned long long tmp, unsigned long long bit2) { > + for (int bit = 63; bit >= 0; bit -=3) > + tmp |= bit2; > + return tmp; > +} > + > +unsigned long long > +__attribute__((noipa)) > +foo4 (unsigned long long tmp, unsigned long long bit2) { > + for (int bit = 0; bit < 64; bit++) > + tmp ^= bit2; > + return tmp; > +} > + > +unsigned long long > +__attribute__((noipa)) > +foo5 (unsigned long long tmp, unsigned long long bit2) { > + for (int bit = 0; bit < 63; bit++) > + tmp ^= bit2; > + return tmp; > +} > + > +unsigned long long > +__attribute__((noipa)) > +f (unsigned long long tmp, long long bit, unsigned long long bit2) { > + unsigned long long res = tmp; > + for (long long i = 0; i < bit; i++) > + res &= bit2; > + return res; > +} > + > +unsigned long long > +__attribute__((noipa)) > +f1 (unsigned long long tmp, long long bit, unsigned long long bit2) { > + unsigned long long res = tmp; > + for (long long i = 0; i < bit; i++) > + res |= bit2; > + return res; > +} > + > +unsigned long long > +__attribute__((noipa)) > +f2 (unsigned long long tmp, long long bit, unsigned long long bit2) { > + unsigned long long res = tmp; > + for (long long i = 0; i < bit; i++) > + res ^= bit2; > + return res; > +} > + > diff --git a/gcc/testsuite/gcc.target/i386/pr105735-2.c b/gcc/testsuite/gcc.target/i386/pr105735-2.c > new file mode 100644 > index 00000000000..79c1d300b1b > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr105735-2.c > @@ -0,0 +1,28 @@ > +/* { dg-do run } */ > +/* { dg-options "-O1" } */ > + > +#include "pr105735-1.c" > + > +int main() > +{ > + unsigned long long tmp = 0x1101101ULL; > + unsigned long long bit2 = 0x1111111011110111ULL; > + if (foo (tmp, bit2) != 0x1100101ULL) > + __builtin_abort (); > + if (foo1 (tmp, bit2) != 0x1100101ULL) > + __builtin_abort (); > + if (foo2 (tmp, bit2) != 0x1111111011111111ULL) > + __builtin_abort (); > + if (foo3 (tmp, bit2) != 0x1111111011111111ULL) > + __builtin_abort (); > + if (foo4 (tmp, bit2) != 0x1101101ULL) > + __builtin_abort (); > + if (foo5 (tmp, bit2) != 0x1111111010011010ULL) > + __builtin_abort (); > + if (f (tmp, 64, bit2) != 0x1100101ULL) > + __builtin_abort (); > + if (f1 (tmp, 64, bit2) != 0x1111111011111111ULL) > + __builtin_abort (); > + if (f2 (tmp, 64, bit2) != 0x1101101ULL) > + __builtin_abort (); > +} > diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc index fc59d035b19..81220f08d2e 100644 > --- a/gcc/tree-scalar-evolution.cc > +++ b/gcc/tree-scalar-evolution.cc > @@ -3635,6 +3635,53 @@ enum bit_op_kind > return fold_build2 (code1, type, inv, wide_int_to_tree (type, bits)); } > > +/* Match.pd function to match bitop with invariant expression > + .i.e. > + tmp_7 = _0 & _1; */ > +extern bool gimple_bitop_with_inv_p (tree, tree *, tree (*)(tree)); > + > +/* Return the inductive expression of bitop with invariant if possible, > + otherwise returns DEF. */ > +static tree > +analyze_and_compute_bitop_with_inv_effect (class loop* loop, tree phidef, > + tree niter) > +{ > + tree match_op[2],inv; > + tree type = TREE_TYPE (phidef); > + gphi* header_phi = NULL; > + /* match thing like op0 (match[0]), op1(match[1]), phidef(PHIDEF) > + > + op1 = PHI > + phidef = op0 & op1 > + if op0 is an invariant, it could change to > + phidef = op0 & inv. */ > + if (!gimple_bitop_with_inv_p (phidef, &match_op[0], NULL) > + || TREE_CODE (match_op[1]) != SSA_NAME > + || !expr_invariant_in_loop_p (loop, match_op[0]) > + || !(header_phi = dyn_cast (SSA_NAME_DEF_STMT (match_op[1]))) > + || gimple_phi_num_args (header_phi) != 2) > + return NULL_TREE; > + > + if (PHI_ARG_DEF_FROM_EDGE (header_phi, loop_latch_edge (loop)) != phidef) > + return NULL_TREE; > + > + enum tree_code code1 > + = gimple_assign_rhs_code (SSA_NAME_DEF_STMT (phidef)); > + > + if (code1 == BIT_XOR_EXPR) > + { > + if (!tree_fits_uhwi_p (niter)) > + return NULL_TREE; > + unsigned HOST_WIDE_INT niter_num; > + niter_num = tree_to_uhwi (niter); > + if (niter_num % 2 != 0) > + match_op[0] = build_zero_cst (type); > + } > + > + inv = PHI_ARG_DEF_FROM_EDGE (header_phi, loop_preheader_edge (loop)); > + return fold_build2 (code1, type, inv, match_op[0]); } The } goes to the next line. > + > /* Do final value replacement for LOOP, return true if we did anything. */ > > bool > @@ -3687,6 +3734,18 @@ final_value_replacement_loop (class loop *loop) > &folded_casts); > def = compute_overall_effect_of_inner_loop (ex_loop, def); > > + /* Handle bitop with invariant induction expression. > + > + .i.e > + for (int i =0 ;i < 32; i++) > + tmp &= bit2; > + if bit2 is an invariant in loop which could simple to > + tmp &= bit2. */ > + tree bitinv_def; > + if ((bitinv_def please use else if here otherwise looks OK. > + = analyze_and_compute_bitop_with_inv_effect (loop, phidef, niter))) > + def = bitinv_def; > + > /* Handle bitwise induction expression. > > .i.e. > -- > 2.18.2 >