public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <rguenther@suse.de>
To: Jeff Law <law@redhat.com>
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Fix parts of PR61607
Date: Thu, 26 Jun 2014 09:01:00 -0000	[thread overview]
Message-ID: <alpine.LSU.2.11.1406261046560.29270@zhemvz.fhfr.qr> (raw)
In-Reply-To: <alpine.LSU.2.11.1406261011421.29270@zhemvz.fhfr.qr>

On Thu, 26 Jun 2014, Richard Biener wrote:

> On Thu, 26 Jun 2014, Richard Biener wrote:
> 
> > On Wed, 25 Jun 2014, Jeff Law wrote:
> > 
> > > On 06/25/14 08:05, Richard Biener wrote:
> > > > 
> > > > This removes restrictions in DOM cprop_operand that inhibit
> > > > some optimizations.  The volatile pointer thing is really realy
> > > > old and no longer necessary while the loop-depth consideration
> > > > is only valid for loop-closed PHI nodes (but we're not in
> > > > loop-closed SSA in DOM) - the coalescing is handled in out-of-SSA
> > > > phase by inserting copies appropriately.
> > > > 
> > > > Bootstrapped on x86_64-unknown-linux-gnu, ok?
> > > > 
> > > > Thanks,
> > > > Richard.
> > > > 
> > > > 2014-06-25  Richard Biener  <rguenther@suse.de>
> > > > 
> > > > 	PR tree-optimization/61607
> > > > 	* tree-ssa-dom.c (cprop_operand): Remove restriction on
> > > > 	propagating volatile pointers and on loop depth.
> > > The first hunk is OK.
> > > 
> > > I thought we had tests for the do not copy propagate out of a loop nest in the
> > > suite.  Did you check that tests in BZ 19038 still generate good code after
> > > this change?  If we still generate good code for those tests, then this hunk
> > > is fine too.
> > 
> > I have applied the first hunk and will investigate further.  Testing 
> > didn't show any issue and I know how to retain the check but not
> > cause the missed optimization shown in PR61607.
> 
> Let's try to summarize what the restriction is supposed to avoid.
> It tries to avoid introducing uses of SSA names defined inside a
> loop outside of it because if the SSA name is live over the backedge
> we will then have an overlapping life-range which prevents out-of-SSA
> from coalescing it to a single register.
> 
> Now, the existing test is not working in that way.
> 
> Rather the best way we have to ensure this property (all outside
> uses go through a copy that is placed on exit edges rather than
> possibly on the backedge) is to go into loop-closed SSA form.
> This is also where the PHI nodes that confuse DOM in PR61607
> come from in the first place.
> 
> Now as the existing measure is ineffective in some cases out-of-SSA
> has gotten the ability to deal with this (or a subset):
> 
>   /* If elimination of a PHI requires inserting a copy on a backedge,
>      then we will have to split the backedge which has numerous
>      undesirable performance effects.
> 
>      A significant number of such cases can be handled here by inserting
>      copies into the loop itself.  */
>   insert_backedge_copies ();
> 
> now, this doesn't seem to deal with outside uses.  But eventually
> the coalescing code already assigns proper cost to backedge copies
> so that we choose to place copies on the exit edges rather than
> the backedge ones - seems not so from looking at coalesce_cost_edge.
> 
> So I think that we should remove the copy-propagation restrictions
> and instead address this in out-of-SSA.
> 
> For now the following patch retains the exact same restriction in
> DOM as it is present in copyprop (but not in FRE - ok my recent fault,
> or in VRP).  By avoiding to record the equivalency for PHIs
> (where we know that either all or no uses should be covered by
> the loop depth check) we retain the ability to record the equivalency
> for the two loop exit PHI nodes and thus the threading (if only
> on the false path).
> 
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> 
> I'll try to see what happens to the PR19038 testcases (though
> that PR is a mess ...)

I checked the very original one (thin6d.f from sixtrack) and the
generated assembly for -Ofast is the same without any patch
and with _all_ loop_depth_of_name restrictions removed from
both DOM and copyprop (thus making loop_depth_of_name dead).

The cost of out-of-SSA copies for backedges (or in the case
of the PR, loop latch edges causing an edge split) is dealt
with by

  /* Inserting copy on critical edge costs more than inserting it 
elsewhere.  */
  if (EDGE_CRITICAL_P (e))
    mult = 2;

in coalesce_cost_edge.

So in the end, without a testcase to investigate, I'd propose
to get rid of those restrictions.  I'm still going forward
with the patch below for now.

Richard.

> Richard.
> 
> 2014-06-26  Richard Biener  <rguenther@suse.de>
> 
> 	PR tree-optimization/61607
> 	* tree-ssa-copy.c (copy_prop_visit_phi_node): Adjust comment
> 	explaining why we restrict copies on loop depth.
> 	* tree-ssa-dom.c (cprop_operand): Remove restriction on
> 	on loop depth.
> 	(record_equivalences_from_phis): Instead add it here.
> 
> 	* gcc.dg/tree-ssa/ssa-dom-thread-5.c: New testcase.
> 
> Index: gcc/tree-ssa-copy.c
> ===================================================================
> --- gcc/tree-ssa-copy.c	(revision 212012)
> +++ gcc/tree-ssa-copy.c	(working copy)
> @@ -401,11 +401,8 @@ copy_prop_visit_phi_node (gimple phi)
>  	arg_value = valueize_val (arg);
>  
>        /* Avoid copy propagation from an inner into an outer loop.
> -	 Otherwise, this may move loop variant variables outside of
> -	 their loops and prevent coalescing opportunities.  If the
> -	 value was loop invariant, it will be hoisted by LICM and
> -	 exposed for copy propagation.
> -	 ???  The value will be always loop invariant.
> +	 Otherwise, this may introduce uses of loop variant variables
> +	 outside of their loops and prevent coalescing opportunities.
>  	 In loop-closed SSA form do not copy-propagate through
>  	 PHI nodes in blocks with a loop exit edge predecessor.  */
>        if (TREE_CODE (arg_value) == SSA_NAME
> Index: gcc/tree-ssa-dom.c
> ===================================================================
> --- gcc/tree-ssa-dom.c	(revision 212013)
> +++ gcc/tree-ssa-dom.c	(working copy)
> @@ -1234,7 +1234,13 @@ record_equivalences_from_phis (basic_blo
>  	 this, since this is a true assignment and not an equivalence
>  	 inferred from a comparison.  All uses of this ssa name are dominated
>  	 by this assignment, so unwinding just costs time and space.  */
> -      if (i == gimple_phi_num_args (phi) && may_propagate_copy (lhs, rhs))
> +      if (i == gimple_phi_num_args (phi)
> +	  && may_propagate_copy (lhs, rhs)
> +	  /* Do not propagate copies if the propagated value is at a deeper loop
> +	     depth than the propagatee.  Otherwise, this may introduce uses
> +	     of loop variant variables outside of their loops and prevent
> +	     coalescing opportunities.  */
> +	  && !(loop_depth_of_name (rhs) > loop_depth_of_name (lhs)))
>  	set_ssa_name_value (lhs, rhs);
>      }
>  }
> @@ -2247,14 +2253,6 @@ cprop_operand (gimple stmt, use_operand_
>        if (!may_propagate_copy (op, val))
>  	return;
>  
> -      /* Do not propagate copies if the propagated value is at a deeper loop
> -	 depth than the propagatee.  Otherwise, this may move loop variant
> -	 variables outside of their loops and prevent coalescing
> -	 opportunities.  If the value was loop invariant, it will be hoisted
> -	 by LICM and exposed for copy propagation.  */
> -      if (loop_depth_of_name (val) > loop_depth_of_name (op))
> -	return;
> -
>        /* Do not propagate copies into simple IV increment statements.
>           See PR23821 for how this can disturb IV analysis.  */
>        if (TREE_CODE (val) != INTEGER_CST
> Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-5.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-5.c	(revision 0)
> +++ gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-5.c	(working copy)
> @@ -0,0 +1,23 @@
> +/* { dg-do compile } */
> +/* { dg-options "-Os -fno-tree-fre -fdump-tree-dom1-details" } */
> +
> +void foo(int *);
> +void f2(int dst[3], int R)
> +{
> +  int i, inter[2];
> +  _Bool inter0p = 0;
> +  _Bool inter1p = 0;
> +  for (i = 1; i < R; i++)
> +    {
> +      inter0p = 1;
> +      inter1p = 1;
> +    }
> +  if (inter0p)
> +    inter[0] = 1;
> +  if (inter1p)
> +    inter[1] = 1;
> +  foo(inter);
> +}
> +
> +/* { dg-final { scan-tree-dump "Threaded jump" "dom1" } } */
> +/* { dg-final { cleanup-tree-dump "dom1" } } */
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer

  reply	other threads:[~2014-06-26  9:01 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-25 14:07 Richard Biener
2014-06-25 18:28 ` Jeff Law
2014-06-26  7:57   ` Richard Biener
2014-06-26  8:33     ` Richard Biener
2014-06-26  9:01       ` Richard Biener [this message]
2014-06-26 15:33         ` Jeff Law
2014-06-25 14:09 Richard Biener
2014-06-25 18:34 ` Jeff Law

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LSU.2.11.1406261046560.29270@zhemvz.fhfr.qr \
    --to=rguenther@suse.de \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=law@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).