public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jeff Law <law@redhat.com>
To: Richard Biener <rguenther@suse.de>
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Fix parts of PR61607
Date: Thu, 26 Jun 2014 15:33:00 -0000	[thread overview]
Message-ID: <53AC3D13.4070709@redhat.com> (raw)
In-Reply-To: <alpine.LSU.2.11.1406261046560.29270@zhemvz.fhfr.qr>

On 06/26/14 02:58, Richard Biener wrote:
> On Thu, 26 Jun 2014, Richard Biener wrote:
>
>> On Thu, 26 Jun 2014, Richard Biener wrote:
>>
>>> On Wed, 25 Jun 2014, Jeff Law wrote:
>>>
>>>> On 06/25/14 08:05, Richard Biener wrote:
>>>>>
>>>>> This removes restrictions in DOM cprop_operand that inhibit
>>>>> some optimizations.  The volatile pointer thing is really realy
>>>>> old and no longer necessary while the loop-depth consideration
>>>>> is only valid for loop-closed PHI nodes (but we're not in
>>>>> loop-closed SSA in DOM) - the coalescing is handled in out-of-SSA
>>>>> phase by inserting copies appropriately.
>>>>>
>>>>> Bootstrapped on x86_64-unknown-linux-gnu, ok?
>>>>>
>>>>> Thanks,
>>>>> Richard.
>>>>>
>>>>> 2014-06-25  Richard Biener  <rguenther@suse.de>
>>>>>
>>>>> 	PR tree-optimization/61607
>>>>> 	* tree-ssa-dom.c (cprop_operand): Remove restriction on
>>>>> 	propagating volatile pointers and on loop depth.
>>>> The first hunk is OK.
>>>>
>>>> I thought we had tests for the do not copy propagate out of a loop nest in the
>>>> suite.  Did you check that tests in BZ 19038 still generate good code after
>>>> this change?  If we still generate good code for those tests, then this hunk
>>>> is fine too.
>>>
>>> I have applied the first hunk and will investigate further.  Testing
>>> didn't show any issue and I know how to retain the check but not
>>> cause the missed optimization shown in PR61607.
>>
>> Let's try to summarize what the restriction is supposed to avoid.
>> It tries to avoid introducing uses of SSA names defined inside a
>> loop outside of it because if the SSA name is live over the backedge
>> we will then have an overlapping life-range which prevents out-of-SSA
>> from coalescing it to a single register.
>>
>> Now, the existing test is not working in that way.
>>
>> Rather the best way we have to ensure this property (all outside
>> uses go through a copy that is placed on exit edges rather than
>> possibly on the backedge) is to go into loop-closed SSA form.
>> This is also where the PHI nodes that confuse DOM in PR61607
>> come from in the first place.
>>
>> Now as the existing measure is ineffective in some cases out-of-SSA
>> has gotten the ability to deal with this (or a subset):
>>
>>    /* If elimination of a PHI requires inserting a copy on a backedge,
>>       then we will have to split the backedge which has numerous
>>       undesirable performance effects.
>>
>>       A significant number of such cases can be handled here by inserting
>>       copies into the loop itself.  */
>>    insert_backedge_copies ();
>>
>> now, this doesn't seem to deal with outside uses.  But eventually
>> the coalescing code already assigns proper cost to backedge copies
>> so that we choose to place copies on the exit edges rather than
>> the backedge ones - seems not so from looking at coalesce_cost_edge.
>>
>> So I think that we should remove the copy-propagation restrictions
>> and instead address this in out-of-SSA.
>>
>> For now the following patch retains the exact same restriction in
>> DOM as it is present in copyprop (but not in FRE - ok my recent fault,
>> or in VRP).  By avoiding to record the equivalency for PHIs
>> (where we know that either all or no uses should be covered by
>> the loop depth check) we retain the ability to record the equivalency
>> for the two loop exit PHI nodes and thus the threading (if only
>> on the false path).
>>
>> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
>>
>> I'll try to see what happens to the PR19038 testcases (though
>> that PR is a mess ...)
>
> I checked the very original one (thin6d.f from sixtrack) and the
> generated assembly for -Ofast is the same without any patch
> and with _all_ loop_depth_of_name restrictions removed from
> both DOM and copyprop (thus making loop_depth_of_name dead).
>
> The cost of out-of-SSA copies for backedges (or in the case
> of the PR, loop latch edges causing an edge split) is dealt
> with by
>
>    /* Inserting copy on critical edge costs more than inserting it
> elsewhere.  */
>    if (EDGE_CRITICAL_P (e))
>      mult = 2;
>
> in coalesce_cost_edge.
>
> So in the end, without a testcase to investigate, I'd propose
> to get rid of those restrictions.  I'm still going forward
> with the patch below for now.
Sounds good.  Glad to see those hacks disappear.

Jeff

  reply	other threads:[~2014-06-26 15:33 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-25 14:07 Richard Biener
2014-06-25 18:28 ` Jeff Law
2014-06-26  7:57   ` Richard Biener
2014-06-26  8:33     ` Richard Biener
2014-06-26  9:01       ` Richard Biener
2014-06-26 15:33         ` Jeff Law [this message]
2014-06-25 14:09 Richard Biener
2014-06-25 18:34 ` Jeff Law

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53AC3D13.4070709@redhat.com \
    --to=law@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=rguenther@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).