From: Richard Biener <rguenther@suse.de>
To: Jeff Law <law@redhat.com>
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Fix parts of PR61607
Date: Thu, 26 Jun 2014 09:01:00 -0000 [thread overview]
Message-ID: <alpine.LSU.2.11.1406261046560.29270@zhemvz.fhfr.qr> (raw)
In-Reply-To: <alpine.LSU.2.11.1406261011421.29270@zhemvz.fhfr.qr>
On Thu, 26 Jun 2014, Richard Biener wrote:
> On Thu, 26 Jun 2014, Richard Biener wrote:
>
> > On Wed, 25 Jun 2014, Jeff Law wrote:
> >
> > > On 06/25/14 08:05, Richard Biener wrote:
> > > >
> > > > This removes restrictions in DOM cprop_operand that inhibit
> > > > some optimizations. The volatile pointer thing is really realy
> > > > old and no longer necessary while the loop-depth consideration
> > > > is only valid for loop-closed PHI nodes (but we're not in
> > > > loop-closed SSA in DOM) - the coalescing is handled in out-of-SSA
> > > > phase by inserting copies appropriately.
> > > >
> > > > Bootstrapped on x86_64-unknown-linux-gnu, ok?
> > > >
> > > > Thanks,
> > > > Richard.
> > > >
> > > > 2014-06-25 Richard Biener <rguenther@suse.de>
> > > >
> > > > PR tree-optimization/61607
> > > > * tree-ssa-dom.c (cprop_operand): Remove restriction on
> > > > propagating volatile pointers and on loop depth.
> > > The first hunk is OK.
> > >
> > > I thought we had tests for the do not copy propagate out of a loop nest in the
> > > suite. Did you check that tests in BZ 19038 still generate good code after
> > > this change? If we still generate good code for those tests, then this hunk
> > > is fine too.
> >
> > I have applied the first hunk and will investigate further. Testing
> > didn't show any issue and I know how to retain the check but not
> > cause the missed optimization shown in PR61607.
>
> Let's try to summarize what the restriction is supposed to avoid.
> It tries to avoid introducing uses of SSA names defined inside a
> loop outside of it because if the SSA name is live over the backedge
> we will then have an overlapping life-range which prevents out-of-SSA
> from coalescing it to a single register.
>
> Now, the existing test is not working in that way.
>
> Rather the best way we have to ensure this property (all outside
> uses go through a copy that is placed on exit edges rather than
> possibly on the backedge) is to go into loop-closed SSA form.
> This is also where the PHI nodes that confuse DOM in PR61607
> come from in the first place.
>
> Now as the existing measure is ineffective in some cases out-of-SSA
> has gotten the ability to deal with this (or a subset):
>
> /* If elimination of a PHI requires inserting a copy on a backedge,
> then we will have to split the backedge which has numerous
> undesirable performance effects.
>
> A significant number of such cases can be handled here by inserting
> copies into the loop itself. */
> insert_backedge_copies ();
>
> now, this doesn't seem to deal with outside uses. But eventually
> the coalescing code already assigns proper cost to backedge copies
> so that we choose to place copies on the exit edges rather than
> the backedge ones - seems not so from looking at coalesce_cost_edge.
>
> So I think that we should remove the copy-propagation restrictions
> and instead address this in out-of-SSA.
>
> For now the following patch retains the exact same restriction in
> DOM as it is present in copyprop (but not in FRE - ok my recent fault,
> or in VRP). By avoiding to record the equivalency for PHIs
> (where we know that either all or no uses should be covered by
> the loop depth check) we retain the ability to record the equivalency
> for the two loop exit PHI nodes and thus the threading (if only
> on the false path).
>
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
>
> I'll try to see what happens to the PR19038 testcases (though
> that PR is a mess ...)
I checked the very original one (thin6d.f from sixtrack) and the
generated assembly for -Ofast is the same without any patch
and with _all_ loop_depth_of_name restrictions removed from
both DOM and copyprop (thus making loop_depth_of_name dead).
The cost of out-of-SSA copies for backedges (or in the case
of the PR, loop latch edges causing an edge split) is dealt
with by
/* Inserting copy on critical edge costs more than inserting it
elsewhere. */
if (EDGE_CRITICAL_P (e))
mult = 2;
in coalesce_cost_edge.
So in the end, without a testcase to investigate, I'd propose
to get rid of those restrictions. I'm still going forward
with the patch below for now.
Richard.
> Richard.
>
> 2014-06-26 Richard Biener <rguenther@suse.de>
>
> PR tree-optimization/61607
> * tree-ssa-copy.c (copy_prop_visit_phi_node): Adjust comment
> explaining why we restrict copies on loop depth.
> * tree-ssa-dom.c (cprop_operand): Remove restriction on
> on loop depth.
> (record_equivalences_from_phis): Instead add it here.
>
> * gcc.dg/tree-ssa/ssa-dom-thread-5.c: New testcase.
>
> Index: gcc/tree-ssa-copy.c
> ===================================================================
> --- gcc/tree-ssa-copy.c (revision 212012)
> +++ gcc/tree-ssa-copy.c (working copy)
> @@ -401,11 +401,8 @@ copy_prop_visit_phi_node (gimple phi)
> arg_value = valueize_val (arg);
>
> /* Avoid copy propagation from an inner into an outer loop.
> - Otherwise, this may move loop variant variables outside of
> - their loops and prevent coalescing opportunities. If the
> - value was loop invariant, it will be hoisted by LICM and
> - exposed for copy propagation.
> - ??? The value will be always loop invariant.
> + Otherwise, this may introduce uses of loop variant variables
> + outside of their loops and prevent coalescing opportunities.
> In loop-closed SSA form do not copy-propagate through
> PHI nodes in blocks with a loop exit edge predecessor. */
> if (TREE_CODE (arg_value) == SSA_NAME
> Index: gcc/tree-ssa-dom.c
> ===================================================================
> --- gcc/tree-ssa-dom.c (revision 212013)
> +++ gcc/tree-ssa-dom.c (working copy)
> @@ -1234,7 +1234,13 @@ record_equivalences_from_phis (basic_blo
> this, since this is a true assignment and not an equivalence
> inferred from a comparison. All uses of this ssa name are dominated
> by this assignment, so unwinding just costs time and space. */
> - if (i == gimple_phi_num_args (phi) && may_propagate_copy (lhs, rhs))
> + if (i == gimple_phi_num_args (phi)
> + && may_propagate_copy (lhs, rhs)
> + /* Do not propagate copies if the propagated value is at a deeper loop
> + depth than the propagatee. Otherwise, this may introduce uses
> + of loop variant variables outside of their loops and prevent
> + coalescing opportunities. */
> + && !(loop_depth_of_name (rhs) > loop_depth_of_name (lhs)))
> set_ssa_name_value (lhs, rhs);
> }
> }
> @@ -2247,14 +2253,6 @@ cprop_operand (gimple stmt, use_operand_
> if (!may_propagate_copy (op, val))
> return;
>
> - /* Do not propagate copies if the propagated value is at a deeper loop
> - depth than the propagatee. Otherwise, this may move loop variant
> - variables outside of their loops and prevent coalescing
> - opportunities. If the value was loop invariant, it will be hoisted
> - by LICM and exposed for copy propagation. */
> - if (loop_depth_of_name (val) > loop_depth_of_name (op))
> - return;
> -
> /* Do not propagate copies into simple IV increment statements.
> See PR23821 for how this can disturb IV analysis. */
> if (TREE_CODE (val) != INTEGER_CST
> Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-5.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-5.c (revision 0)
> +++ gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-5.c (working copy)
> @@ -0,0 +1,23 @@
> +/* { dg-do compile } */
> +/* { dg-options "-Os -fno-tree-fre -fdump-tree-dom1-details" } */
> +
> +void foo(int *);
> +void f2(int dst[3], int R)
> +{
> + int i, inter[2];
> + _Bool inter0p = 0;
> + _Bool inter1p = 0;
> + for (i = 1; i < R; i++)
> + {
> + inter0p = 1;
> + inter1p = 1;
> + }
> + if (inter0p)
> + inter[0] = 1;
> + if (inter1p)
> + inter[1] = 1;
> + foo(inter);
> +}
> +
> +/* { dg-final { scan-tree-dump "Threaded jump" "dom1" } } */
> +/* { dg-final { cleanup-tree-dump "dom1" } } */
>
--
Richard Biener <rguenther@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer
next prev parent reply other threads:[~2014-06-26 9:01 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-06-25 14:07 Richard Biener
2014-06-25 18:28 ` Jeff Law
2014-06-26 7:57 ` Richard Biener
2014-06-26 8:33 ` Richard Biener
2014-06-26 9:01 ` Richard Biener [this message]
2014-06-26 15:33 ` Jeff Law
2014-06-25 14:09 Richard Biener
2014-06-25 18:34 ` Jeff Law
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LSU.2.11.1406261046560.29270@zhemvz.fhfr.qr \
--to=rguenther@suse.de \
--cc=gcc-patches@gcc.gnu.org \
--cc=law@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).