From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 7404 invoked by alias); 26 Jun 2014 08:33:20 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 7318 invoked by uid 89); 26 Jun 2014 08:33:19 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-3.3 required=5.0 tests=AWL,BAYES_00,T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: mx2.suse.de Received: from cantor2.suse.de (HELO mx2.suse.de) (195.135.220.15) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (CAMELLIA256-SHA encrypted) ESMTPS; Thu, 26 Jun 2014 08:33:12 +0000 Received: from relay1.suse.de (charybdis-ext.suse.de [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id C725CACC1; Thu, 26 Jun 2014 08:33:08 +0000 (UTC) Date: Thu, 26 Jun 2014 08:33:00 -0000 From: Richard Biener To: Jeff Law cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH] Fix parts of PR61607 In-Reply-To: Message-ID: References: <53AB14C0.1010606@redhat.com> User-Agent: Alpine 2.11 (LSU 23 2013-08-11) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-SW-Source: 2014-06/txt/msg02101.txt.bz2 On Thu, 26 Jun 2014, Richard Biener wrote: > On Wed, 25 Jun 2014, Jeff Law wrote: > > > On 06/25/14 08:05, Richard Biener wrote: > > > > > > This removes restrictions in DOM cprop_operand that inhibit > > > some optimizations. The volatile pointer thing is really realy > > > old and no longer necessary while the loop-depth consideration > > > is only valid for loop-closed PHI nodes (but we're not in > > > loop-closed SSA in DOM) - the coalescing is handled in out-of-SSA > > > phase by inserting copies appropriately. > > > > > > Bootstrapped on x86_64-unknown-linux-gnu, ok? > > > > > > Thanks, > > > Richard. > > > > > > 2014-06-25 Richard Biener > > > > > > PR tree-optimization/61607 > > > * tree-ssa-dom.c (cprop_operand): Remove restriction on > > > propagating volatile pointers and on loop depth. > > The first hunk is OK. > > > > I thought we had tests for the do not copy propagate out of a loop nest in the > > suite. Did you check that tests in BZ 19038 still generate good code after > > this change? If we still generate good code for those tests, then this hunk > > is fine too. > > I have applied the first hunk and will investigate further. Testing > didn't show any issue and I know how to retain the check but not > cause the missed optimization shown in PR61607. Let's try to summarize what the restriction is supposed to avoid. It tries to avoid introducing uses of SSA names defined inside a loop outside of it because if the SSA name is live over the backedge we will then have an overlapping life-range which prevents out-of-SSA from coalescing it to a single register. Now, the existing test is not working in that way. Rather the best way we have to ensure this property (all outside uses go through a copy that is placed on exit edges rather than possibly on the backedge) is to go into loop-closed SSA form. This is also where the PHI nodes that confuse DOM in PR61607 come from in the first place. Now as the existing measure is ineffective in some cases out-of-SSA has gotten the ability to deal with this (or a subset): /* If elimination of a PHI requires inserting a copy on a backedge, then we will have to split the backedge which has numerous undesirable performance effects. A significant number of such cases can be handled here by inserting copies into the loop itself. */ insert_backedge_copies (); now, this doesn't seem to deal with outside uses. But eventually the coalescing code already assigns proper cost to backedge copies so that we choose to place copies on the exit edges rather than the backedge ones - seems not so from looking at coalesce_cost_edge. So I think that we should remove the copy-propagation restrictions and instead address this in out-of-SSA. For now the following patch retains the exact same restriction in DOM as it is present in copyprop (but not in FRE - ok my recent fault, or in VRP). By avoiding to record the equivalency for PHIs (where we know that either all or no uses should be covered by the loop depth check) we retain the ability to record the equivalency for the two loop exit PHI nodes and thus the threading (if only on the false path). Bootstrap and regtest running on x86_64-unknown-linux-gnu. I'll try to see what happens to the PR19038 testcases (though that PR is a mess ...) Richard. 2014-06-26 Richard Biener PR tree-optimization/61607 * tree-ssa-copy.c (copy_prop_visit_phi_node): Adjust comment explaining why we restrict copies on loop depth. * tree-ssa-dom.c (cprop_operand): Remove restriction on on loop depth. (record_equivalences_from_phis): Instead add it here. * gcc.dg/tree-ssa/ssa-dom-thread-5.c: New testcase. Index: gcc/tree-ssa-copy.c =================================================================== --- gcc/tree-ssa-copy.c (revision 212012) +++ gcc/tree-ssa-copy.c (working copy) @@ -401,11 +401,8 @@ copy_prop_visit_phi_node (gimple phi) arg_value = valueize_val (arg); /* Avoid copy propagation from an inner into an outer loop. - Otherwise, this may move loop variant variables outside of - their loops and prevent coalescing opportunities. If the - value was loop invariant, it will be hoisted by LICM and - exposed for copy propagation. - ??? The value will be always loop invariant. + Otherwise, this may introduce uses of loop variant variables + outside of their loops and prevent coalescing opportunities. In loop-closed SSA form do not copy-propagate through PHI nodes in blocks with a loop exit edge predecessor. */ if (TREE_CODE (arg_value) == SSA_NAME Index: gcc/tree-ssa-dom.c =================================================================== --- gcc/tree-ssa-dom.c (revision 212013) +++ gcc/tree-ssa-dom.c (working copy) @@ -1234,7 +1234,13 @@ record_equivalences_from_phis (basic_blo this, since this is a true assignment and not an equivalence inferred from a comparison. All uses of this ssa name are dominated by this assignment, so unwinding just costs time and space. */ - if (i == gimple_phi_num_args (phi) && may_propagate_copy (lhs, rhs)) + if (i == gimple_phi_num_args (phi) + && may_propagate_copy (lhs, rhs) + /* Do not propagate copies if the propagated value is at a deeper loop + depth than the propagatee. Otherwise, this may introduce uses + of loop variant variables outside of their loops and prevent + coalescing opportunities. */ + && !(loop_depth_of_name (rhs) > loop_depth_of_name (lhs))) set_ssa_name_value (lhs, rhs); } } @@ -2247,14 +2253,6 @@ cprop_operand (gimple stmt, use_operand_ if (!may_propagate_copy (op, val)) return; - /* Do not propagate copies if the propagated value is at a deeper loop - depth than the propagatee. Otherwise, this may move loop variant - variables outside of their loops and prevent coalescing - opportunities. If the value was loop invariant, it will be hoisted - by LICM and exposed for copy propagation. */ - if (loop_depth_of_name (val) > loop_depth_of_name (op)) - return; - /* Do not propagate copies into simple IV increment statements. See PR23821 for how this can disturb IV analysis. */ if (TREE_CODE (val) != INTEGER_CST Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-5.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-5.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-5.c (working copy) @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options "-Os -fno-tree-fre -fdump-tree-dom1-details" } */ + +void foo(int *); +void f2(int dst[3], int R) +{ + int i, inter[2]; + _Bool inter0p = 0; + _Bool inter1p = 0; + for (i = 1; i < R; i++) + { + inter0p = 1; + inter1p = 1; + } + if (inter0p) + inter[0] = 1; + if (inter1p) + inter[1] = 1; + foo(inter); +} + +/* { dg-final { scan-tree-dump "Threaded jump" "dom1" } } */ +/* { dg-final { cleanup-tree-dump "dom1" } } */