From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 5136 invoked by alias); 26 Jun 2014 15:33:12 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 5095 invoked by uid 89); 26 Jun 2014 15:33:09 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.6 required=5.0 tests=AWL,BAYES_00,SPF_HELO_PASS,SPF_PASS,T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Thu, 26 Jun 2014 15:32:38 +0000 Received: from int-mx11.intmail.prod.int.phx2.redhat.com (int-mx11.intmail.prod.int.phx2.redhat.com [10.5.11.24]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id s5QFWZGM025587 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 26 Jun 2014 11:32:36 -0400 Received: from stumpy.slc.redhat.com (ovpn-113-44.phx2.redhat.com [10.3.113.44]) by int-mx11.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id s5QFWZXk004160; Thu, 26 Jun 2014 11:32:35 -0400 Message-ID: <53AC3D13.4070709@redhat.com> Date: Thu, 26 Jun 2014 15:33:00 -0000 From: Jeff Law User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0 MIME-Version: 1.0 To: Richard Biener CC: gcc-patches@gcc.gnu.org Subject: Re: [PATCH] Fix parts of PR61607 References: <53AB14C0.1010606@redhat.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-IsSubscribed: yes X-SW-Source: 2014-06/txt/msg02143.txt.bz2 On 06/26/14 02:58, Richard Biener wrote: > On Thu, 26 Jun 2014, Richard Biener wrote: > >> On Thu, 26 Jun 2014, Richard Biener wrote: >> >>> On Wed, 25 Jun 2014, Jeff Law wrote: >>> >>>> On 06/25/14 08:05, Richard Biener wrote: >>>>> >>>>> This removes restrictions in DOM cprop_operand that inhibit >>>>> some optimizations. The volatile pointer thing is really realy >>>>> old and no longer necessary while the loop-depth consideration >>>>> is only valid for loop-closed PHI nodes (but we're not in >>>>> loop-closed SSA in DOM) - the coalescing is handled in out-of-SSA >>>>> phase by inserting copies appropriately. >>>>> >>>>> Bootstrapped on x86_64-unknown-linux-gnu, ok? >>>>> >>>>> Thanks, >>>>> Richard. >>>>> >>>>> 2014-06-25 Richard Biener >>>>> >>>>> PR tree-optimization/61607 >>>>> * tree-ssa-dom.c (cprop_operand): Remove restriction on >>>>> propagating volatile pointers and on loop depth. >>>> The first hunk is OK. >>>> >>>> I thought we had tests for the do not copy propagate out of a loop nest in the >>>> suite. Did you check that tests in BZ 19038 still generate good code after >>>> this change? If we still generate good code for those tests, then this hunk >>>> is fine too. >>> >>> I have applied the first hunk and will investigate further. Testing >>> didn't show any issue and I know how to retain the check but not >>> cause the missed optimization shown in PR61607. >> >> Let's try to summarize what the restriction is supposed to avoid. >> It tries to avoid introducing uses of SSA names defined inside a >> loop outside of it because if the SSA name is live over the backedge >> we will then have an overlapping life-range which prevents out-of-SSA >> from coalescing it to a single register. >> >> Now, the existing test is not working in that way. >> >> Rather the best way we have to ensure this property (all outside >> uses go through a copy that is placed on exit edges rather than >> possibly on the backedge) is to go into loop-closed SSA form. >> This is also where the PHI nodes that confuse DOM in PR61607 >> come from in the first place. >> >> Now as the existing measure is ineffective in some cases out-of-SSA >> has gotten the ability to deal with this (or a subset): >> >> /* If elimination of a PHI requires inserting a copy on a backedge, >> then we will have to split the backedge which has numerous >> undesirable performance effects. >> >> A significant number of such cases can be handled here by inserting >> copies into the loop itself. */ >> insert_backedge_copies (); >> >> now, this doesn't seem to deal with outside uses. But eventually >> the coalescing code already assigns proper cost to backedge copies >> so that we choose to place copies on the exit edges rather than >> the backedge ones - seems not so from looking at coalesce_cost_edge. >> >> So I think that we should remove the copy-propagation restrictions >> and instead address this in out-of-SSA. >> >> For now the following patch retains the exact same restriction in >> DOM as it is present in copyprop (but not in FRE - ok my recent fault, >> or in VRP). By avoiding to record the equivalency for PHIs >> (where we know that either all or no uses should be covered by >> the loop depth check) we retain the ability to record the equivalency >> for the two loop exit PHI nodes and thus the threading (if only >> on the false path). >> >> Bootstrap and regtest running on x86_64-unknown-linux-gnu. >> >> I'll try to see what happens to the PR19038 testcases (though >> that PR is a mess ...) > > I checked the very original one (thin6d.f from sixtrack) and the > generated assembly for -Ofast is the same without any patch > and with _all_ loop_depth_of_name restrictions removed from > both DOM and copyprop (thus making loop_depth_of_name dead). > > The cost of out-of-SSA copies for backedges (or in the case > of the PR, loop latch edges causing an edge split) is dealt > with by > > /* Inserting copy on critical edge costs more than inserting it > elsewhere. */ > if (EDGE_CRITICAL_P (e)) > mult = 2; > > in coalesce_cost_edge. > > So in the end, without a testcase to investigate, I'd propose > to get rid of those restrictions. I'm still going forward > with the patch below for now. Sounds good. Glad to see those hacks disappear. Jeff