From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 403 invoked by alias); 26 Jun 2014 09:01:34 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 374 invoked by uid 89); 26 Jun 2014 09:01:33 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-3.3 required=5.0 tests=AWL,BAYES_00,T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: mx2.suse.de Received: from cantor2.suse.de (HELO mx2.suse.de) (195.135.220.15) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (CAMELLIA256-SHA encrypted) ESMTPS; Thu, 26 Jun 2014 09:01:29 +0000 Received: from relay1.suse.de (charybdis-ext.suse.de [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id 28F04AC5D; Thu, 26 Jun 2014 09:01:26 +0000 (UTC) Date: Thu, 26 Jun 2014 09:01:00 -0000 From: Richard Biener To: Jeff Law cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH] Fix parts of PR61607 In-Reply-To: Message-ID: References: <53AB14C0.1010606@redhat.com> User-Agent: Alpine 2.11 (LSU 23 2013-08-11) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-SW-Source: 2014-06/txt/msg02104.txt.bz2 On Thu, 26 Jun 2014, Richard Biener wrote: > On Thu, 26 Jun 2014, Richard Biener wrote: > > > On Wed, 25 Jun 2014, Jeff Law wrote: > > > > > On 06/25/14 08:05, Richard Biener wrote: > > > > > > > > This removes restrictions in DOM cprop_operand that inhibit > > > > some optimizations. The volatile pointer thing is really realy > > > > old and no longer necessary while the loop-depth consideration > > > > is only valid for loop-closed PHI nodes (but we're not in > > > > loop-closed SSA in DOM) - the coalescing is handled in out-of-SSA > > > > phase by inserting copies appropriately. > > > > > > > > Bootstrapped on x86_64-unknown-linux-gnu, ok? > > > > > > > > Thanks, > > > > Richard. > > > > > > > > 2014-06-25 Richard Biener > > > > > > > > PR tree-optimization/61607 > > > > * tree-ssa-dom.c (cprop_operand): Remove restriction on > > > > propagating volatile pointers and on loop depth. > > > The first hunk is OK. > > > > > > I thought we had tests for the do not copy propagate out of a loop nest in the > > > suite. Did you check that tests in BZ 19038 still generate good code after > > > this change? If we still generate good code for those tests, then this hunk > > > is fine too. > > > > I have applied the first hunk and will investigate further. Testing > > didn't show any issue and I know how to retain the check but not > > cause the missed optimization shown in PR61607. > > Let's try to summarize what the restriction is supposed to avoid. > It tries to avoid introducing uses of SSA names defined inside a > loop outside of it because if the SSA name is live over the backedge > we will then have an overlapping life-range which prevents out-of-SSA > from coalescing it to a single register. > > Now, the existing test is not working in that way. > > Rather the best way we have to ensure this property (all outside > uses go through a copy that is placed on exit edges rather than > possibly on the backedge) is to go into loop-closed SSA form. > This is also where the PHI nodes that confuse DOM in PR61607 > come from in the first place. > > Now as the existing measure is ineffective in some cases out-of-SSA > has gotten the ability to deal with this (or a subset): > > /* If elimination of a PHI requires inserting a copy on a backedge, > then we will have to split the backedge which has numerous > undesirable performance effects. > > A significant number of such cases can be handled here by inserting > copies into the loop itself. */ > insert_backedge_copies (); > > now, this doesn't seem to deal with outside uses. But eventually > the coalescing code already assigns proper cost to backedge copies > so that we choose to place copies on the exit edges rather than > the backedge ones - seems not so from looking at coalesce_cost_edge. > > So I think that we should remove the copy-propagation restrictions > and instead address this in out-of-SSA. > > For now the following patch retains the exact same restriction in > DOM as it is present in copyprop (but not in FRE - ok my recent fault, > or in VRP). By avoiding to record the equivalency for PHIs > (where we know that either all or no uses should be covered by > the loop depth check) we retain the ability to record the equivalency > for the two loop exit PHI nodes and thus the threading (if only > on the false path). > > Bootstrap and regtest running on x86_64-unknown-linux-gnu. > > I'll try to see what happens to the PR19038 testcases (though > that PR is a mess ...) I checked the very original one (thin6d.f from sixtrack) and the generated assembly for -Ofast is the same without any patch and with _all_ loop_depth_of_name restrictions removed from both DOM and copyprop (thus making loop_depth_of_name dead). The cost of out-of-SSA copies for backedges (or in the case of the PR, loop latch edges causing an edge split) is dealt with by /* Inserting copy on critical edge costs more than inserting it elsewhere. */ if (EDGE_CRITICAL_P (e)) mult = 2; in coalesce_cost_edge. So in the end, without a testcase to investigate, I'd propose to get rid of those restrictions. I'm still going forward with the patch below for now. Richard. > Richard. > > 2014-06-26 Richard Biener > > PR tree-optimization/61607 > * tree-ssa-copy.c (copy_prop_visit_phi_node): Adjust comment > explaining why we restrict copies on loop depth. > * tree-ssa-dom.c (cprop_operand): Remove restriction on > on loop depth. > (record_equivalences_from_phis): Instead add it here. > > * gcc.dg/tree-ssa/ssa-dom-thread-5.c: New testcase. > > Index: gcc/tree-ssa-copy.c > =================================================================== > --- gcc/tree-ssa-copy.c (revision 212012) > +++ gcc/tree-ssa-copy.c (working copy) > @@ -401,11 +401,8 @@ copy_prop_visit_phi_node (gimple phi) > arg_value = valueize_val (arg); > > /* Avoid copy propagation from an inner into an outer loop. > - Otherwise, this may move loop variant variables outside of > - their loops and prevent coalescing opportunities. If the > - value was loop invariant, it will be hoisted by LICM and > - exposed for copy propagation. > - ??? The value will be always loop invariant. > + Otherwise, this may introduce uses of loop variant variables > + outside of their loops and prevent coalescing opportunities. > In loop-closed SSA form do not copy-propagate through > PHI nodes in blocks with a loop exit edge predecessor. */ > if (TREE_CODE (arg_value) == SSA_NAME > Index: gcc/tree-ssa-dom.c > =================================================================== > --- gcc/tree-ssa-dom.c (revision 212013) > +++ gcc/tree-ssa-dom.c (working copy) > @@ -1234,7 +1234,13 @@ record_equivalences_from_phis (basic_blo > this, since this is a true assignment and not an equivalence > inferred from a comparison. All uses of this ssa name are dominated > by this assignment, so unwinding just costs time and space. */ > - if (i == gimple_phi_num_args (phi) && may_propagate_copy (lhs, rhs)) > + if (i == gimple_phi_num_args (phi) > + && may_propagate_copy (lhs, rhs) > + /* Do not propagate copies if the propagated value is at a deeper loop > + depth than the propagatee. Otherwise, this may introduce uses > + of loop variant variables outside of their loops and prevent > + coalescing opportunities. */ > + && !(loop_depth_of_name (rhs) > loop_depth_of_name (lhs))) > set_ssa_name_value (lhs, rhs); > } > } > @@ -2247,14 +2253,6 @@ cprop_operand (gimple stmt, use_operand_ > if (!may_propagate_copy (op, val)) > return; > > - /* Do not propagate copies if the propagated value is at a deeper loop > - depth than the propagatee. Otherwise, this may move loop variant > - variables outside of their loops and prevent coalescing > - opportunities. If the value was loop invariant, it will be hoisted > - by LICM and exposed for copy propagation. */ > - if (loop_depth_of_name (val) > loop_depth_of_name (op)) > - return; > - > /* Do not propagate copies into simple IV increment statements. > See PR23821 for how this can disturb IV analysis. */ > if (TREE_CODE (val) != INTEGER_CST > Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-5.c > =================================================================== > --- gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-5.c (revision 0) > +++ gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-5.c (working copy) > @@ -0,0 +1,23 @@ > +/* { dg-do compile } */ > +/* { dg-options "-Os -fno-tree-fre -fdump-tree-dom1-details" } */ > + > +void foo(int *); > +void f2(int dst[3], int R) > +{ > + int i, inter[2]; > + _Bool inter0p = 0; > + _Bool inter1p = 0; > + for (i = 1; i < R; i++) > + { > + inter0p = 1; > + inter1p = 1; > + } > + if (inter0p) > + inter[0] = 1; > + if (inter1p) > + inter[1] = 1; > + foo(inter); > +} > + > +/* { dg-final { scan-tree-dump "Threaded jump" "dom1" } } */ > +/* { dg-final { cleanup-tree-dump "dom1" } } */ > -- Richard Biener SUSE / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer