From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [63.128.21.124]) by sourceware.org (Postfix) with ESMTP id 1718A3840C2B for ; Mon, 23 Nov 2020 22:37:50 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 1718A3840C2B Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-208-ifjFE2rnMMqanO1uaRy81Q-1; Mon, 23 Nov 2020 17:37:47 -0500 X-MC-Unique: ifjFE2rnMMqanO1uaRy81Q-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id D6AFE10151E7; Mon, 23 Nov 2020 22:37:46 +0000 (UTC) Received: from localhost.localdomain (ovpn-113-152.phx2.redhat.com [10.3.113.152]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4114F5D705; Mon, 23 Nov 2020 22:37:45 +0000 (UTC) Subject: Re: [PATCH PR94274] fold phi whose incoming args are defined from binary To: "Zhanghaijian (A)" , "gcc-patches@gcc.gnu.org" Cc: "rguenth@gcc.gnu.org" References: From: Jeff Law Message-ID: <295ca083-fae7-94b3-d5d1-d7ff01e06be2@redhat.com> Date: Mon, 23 Nov 2020 15:37:45 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.3.1 MIME-Version: 1.0 In-Reply-To: X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=WINDOWS-1252 Content-Transfer-Encoding: 8bit Content-Language: en-US X-Spam-Status: No, score=-11.5 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=unavailable autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Nov 2020 22:37:51 -0000 On 6/11/20 6:01 AM, Zhanghaijian (A) wrote: > Hi, > > This is a experimental fix for pr94274. > For if/else structure, when the expressions is binary operation and have a common subexpression, and the opcode is the same, we can > fold the merging phi node in tree_ssa_phiopt_worker (ssa-phiopt). It can be optimized to do csel first and then do binary operations. > This can eliminate one or more instructions. And this patch is effective for 500.perlbench_r in spec2017. > Bootstrap and tested on aarch64/x86_64 Linux platform. No new regression witnessed. > > Any suggestion? > > Thanks, > Haijian Zhang > > pr94274-v1.diff > > From 1f0f09ef3170569ef15791cf3e70de781a9a4cb0 Mon Sep 17 00:00:00 2001 > From: Haijian Zhang > Date: Thu, 11 Jun 2020 14:56:44 +0800 > Subject: [PATCH] fold phi whose incoming args are defined from binary > operations [PR94274] > > For if/else structure, when the expressions is binary operation and > have a common subexpression, and the opcode is the same. We can fold > the merging phi node in tree_ssa_phiopt_worker (ssa-phiopt). It can > be optimized to do csel first and then do binary operations. This can > eliminate one or more instructions. > > 2020-06-11 Haijian Zhang > > gcc/ > > PR tree-optimization/94274 > * common.opt (ftree-combine): New option. > * doc/invoke.texi (-ftree-combine): Document new option. > * opts.c (default_options_table): Enable -ftree-combine at -O1. > * passes.def: Add pass_adjust_alignment. > * tree-pass.h (make_pass_tree_combine): New. > * tree-ssa-phiopt.c (pass_data_tree_combine): New. > gcc/testsuite/ > PR tree-optimization/94274 > * gcc.dg/tree-ssa/pr94274-1.c: New test. > * gcc.dg/tree-ssa/pr94274-2.c: Likewise. > * gcc.dg/tree-ssa/pr94274-3.c: Likewise. I like a lot of what I see in here.   Do you have a copyright assignment and employer disclaimer on file with the FSF?  If not, then we can't take the submission at this time. > --- > gcc/ChangeLog | 10 + > gcc/common.opt | 4 + > gcc/doc/invoke.texi | 15 +- > gcc/opts.c | 1 + > gcc/passes.def | 1 + > gcc/testsuite/ChangeLog | 7 + > gcc/testsuite/gcc.dg/tree-ssa/pr94274-1.c | 15 + > gcc/testsuite/gcc.dg/tree-ssa/pr94274-2.c | 26 ++ > gcc/testsuite/gcc.dg/tree-ssa/pr94274-3.c | 16 + > gcc/tree-pass.h | 1 + > gcc/tree-ssa-phiopt.c | 352 +++++++++++++++++++++- > 11 files changed, 440 insertions(+), 8 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr94274-1.c > create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr94274-2.c > create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr94274-3.c > > diff --git a/gcc/ChangeLog b/gcc/ChangeLog > index b0860738d04..c1a5bf87626 100644 > --- a/gcc/ChangeLog > +++ b/gcc/ChangeLog > @@ -1,3 +1,13 @@ > +2020-06-11 Haijian Zhang > + > + PR tree-optimization/94274 > + * common.opt (ftree-combine): New option. > + * doc/invoke.texi (-ftree-combine): Document new option. > + * opts.c (default_options_table): Enable -ftree-combine at -O1. > + * passes.def: Add pass_adjust_alignment. > + * tree-pass.h (make_pass_tree_combine): New. > + * tree-ssa-phiopt.c (pass_data_tree_combine): New. You should reference tree-optimization/64700 as well.   While your patch doesn't address all the issues in 64700, it's an excellent start. > + > 2020-06-10 Martin Sebor > > PR middle-end/95353 > diff --git a/gcc/common.opt b/gcc/common.opt > index df8af365d1b..f4441aa9ddd 100644 > --- a/gcc/common.opt > +++ b/gcc/common.opt > @@ -2721,6 +2721,10 @@ ftree-cselim > Common Report Var(flag_tree_cselim) Init(2) Optimization > Transform condition stores into unconditional ones. > > +ftree-combine > +Common Report Var(flag_tree_combine) Optimization > +Combine binary operations using SSA PHI nodes. I probably would bother with a switch for this.  There's already a flag to control phi-opt as a whole and that seems fine for this change as well. > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr94274-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr94274-1.c > new file mode 100644 > index 00000000000..f883c6a3201 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr94274-1.c > @@ -0,0 +1,15 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-combine" } */ > + > +int foo (int cond, int a, int b, int c) > +{ > + int result = 0; > + > + if (cond == 1) > + result = b + a; > + else > + result = a + c; > + return result; > +} You should look at comment #3 in pr64700 which I think has a couple more test cases.diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c > index 5f283890998..9b09330c026 100644 > --- a/gcc/tree-ssa-phiopt.c > +++ b/gcc/tree-ssa-phiopt.c > @@ -48,7 +48,7 @@ along with GCC; see the file COPYING3. If not see > #include "tree-eh.h" > #include "gimple-fold.h" > > -static unsigned int tree_ssa_phiopt_worker (bool, bool, bool); > +static unsigned int tree_ssa_phiopt_worker (bool, bool, bool, bool); > static bool two_value_replacement (basic_block, basic_block, edge, gphi *, > tree, tree); So if we don't make this a separate pass, but instead have it run as a standard part of the phi-opt optimizations, then I don't think you need an argument to control the behavior. > @@ -155,14 +155,262 @@ single_non_singleton_phi_for_edges (gimple_seq seq, edge e0, edge e1) > return phi; > } > > +/* Compare the operands of stmts. Return 1 if only rhs1 is different. > + Return 2 if only rhs2 is different. Otherwise return false. */ > + > +static unsigned > +compare_operands (gimple *first_stmt, tree rhs1, tree rhs2) So the comment here could be better.  "If only rhs1 is different." -- different from what?, similarly for the comment about rhs2.  I'd use "0" instead of false -- we generally use true/false for booleans, mixing them could confuse folks.  You could make an enum for the 3 return states and use that instead. > +/* Check all stmts for phis, return true if the stmts are binary operation, > + and have the same opcode. Otherwise return false. */ > + > +static bool > +check_stmt_for_phi (gimple *phi) > +{ > + gimple *first_stmt; > + enum tree_code opcode; > + tree arg; > + > + /* First clear flag for stmts which used to track of which operand > + need phi node. */ > + gimple_set_plf (phi, GF_PLF_1, false); > + gimple_set_plf (phi, GF_PLF_2, false); > + > + if (gimple_code (phi) != GIMPLE_PHI) > + return false; So this function is called within a loop over the PHIs.  THere should be no case when gimple_code (phi) != GIMPLE_PHI.  I'd just drop that test. If you want checking, then change the argument to a gphi* and you'll get static checking at the call site. + /* Check the stmt is GIMPLE_ASSIGN, and the stmt is binary operation. */ > + if (!first_stmt || !is_gimple_assign (first_stmt) > + || gimple_assign_rhs_class (first_stmt) != GIMPLE_BINARY_RHS > + || !has_single_use (gimple_assign_lhs (first_stmt))) > + return false; So I don't think it should be a requirement for this patch to move forward, but you could easily handle unary operations here too.  I'm thinking about stuff like casts and the like.  But it's also a step towards handling stuff like INDIRECT_REF. You may want to verify that the defining statements do not have any virtual operands.   It's just the safe thing to do. > + > +/* Check all operands of stmts, keep track of which operand needs a phi. > + Return true if the phis can be replaced, otherwise return false. */ > + > +static bool > +check_operands_for_stmt (gimple *phi) > +{ > + tree arg = PHI_ARG_DEF (phi, 0); > + gimple *first_stmt = SSA_NAME_DEF_STMT (arg); > + > + enum tree_code opcode = gimple_assign_rhs_code (first_stmt); > + tree rhs1 = gimple_assign_rhs1 (first_stmt); > + tree rhs2 = gimple_assign_rhs2 (first_stmt); > + > + /* Scan all args to find which can do the replacement. */ > + for (unsigned i = 1; i != gimple_phi_num_args (phi); i++) > + { > + gimple *stmt = SSA_NAME_DEF_STMT (PHI_ARG_DEF (phi, i)); > + unsigned rhschanged = 0; > + > + rhschanged = compare_operands (first_stmt, > + gimple_assign_rhs1 (stmt), > + gimple_assign_rhs2 (stmt)); So you earlier verified that FIRST_STMT is a GIMPLE_BINARY_RHS.  But you don't do it for the defining statements of subsequent operands in the PHI node.   You need to verify the basic form STMT to ensure it's the same as FIRST_STMT before you call gimple_assign_rhs1 (stmt) and gimple_assign_rhs2 (stmt). + > +/* The function binary_rhs_replacement does the main work of doing the > + replacement of binary operation. */ > + > +static void > +binary_rhs_replacement (gimple *phi) > +{ > + edge e; > + gphi *newphi = NULL; > + gimple *first_stmt = SSA_NAME_DEF_STMT (PHI_ARG_DEF (phi, 0)); > + tree rhs1 = gimple_assign_rhs1 (first_stmt); > + tree rhs2 = gimple_assign_rhs2 (first_stmt); > + > + /* Create a new PHI for replacement. */ > + if (gimple_plf (phi, GF_PLF_1)) > + { > + tree temp = make_ssa_name (TREE_TYPE (rhs1), NULL); > + newphi = create_phi_node (temp, gimple_bb (phi)); > + e = gimple_phi_arg_edge (newphi, 0); > + add_phi_arg (newphi, gimple_assign_rhs1 (first_stmt), e, > + UNKNOWN_LOCATION); > + } > + else if (gimple_plf (phi, GF_PLF_2)) > + { > + tree temp = make_ssa_name (TREE_TYPE (rhs2), NULL); > + newphi = create_phi_node (temp, gimple_bb (phi)); > + e = gimple_phi_arg_edge (newphi, 0); > + add_phi_arg (newphi, gimple_assign_rhs2 (first_stmt), e, > + UNKNOWN_LOCATION); > + } I would think we could do better than UNKNOWN_LOCATION here.  We should have locations in the old phi as well as the statements in bb1/bb2.  Any would be better than UNKNOWN_LOCATION here.  Similarly for the rest of the PHI arguments. At a high level, does your patch handle the case where one of the PHI arguments is defined by another PHI in the same block?  This can happen in loops and the semantics are that all the reads of PHI arguments happen at once and assignments to all the PHI results happen after that, and effectively at the same time.  If this scenario is detected, I'd just avoid trying to do the optimization. And a nit.  Please check for consistent use of tabs and spaces.  Formatting guidelines for GCC are to use tabs instead of 8 spaces.  I think the new code you're adding is inconsistent.  As an example look at the 3 lines that start with "/* Remove the first statement.  */ in binary_rhs_replacement. So I think this needs another iteration, but given it was submitted in June, I think it should still be considered for gcc-11, particularly if the copyright assignment and technical details can be worked out relatively quickly. Thanks! jeff