From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 53773 invoked by alias); 24 Apr 2019 18:31:49 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 53764 invoked by uid 89); 24 Apr 2019 18:31:49 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00,SPF_HELO_PASS autolearn=ham version=3.3.1 spammy=seriously, H*i:sk:CAFiYyc X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 24 Apr 2019 18:31:47 +0000 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 4A0C97C0A2; Wed, 24 Apr 2019 18:31:46 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-65.rdu2.redhat.com [10.10.112.65]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5857B5D704; Wed, 24 Apr 2019 18:31:45 +0000 (UTC) Subject: Re: [RFA][tree-optimization/90037] Cleanup const/copies between DOM and erroneous path isolation To: Richard Biener Cc: gcc-patches References: <4be1df73-8e26-b0c8-eaaf-e682819a256d@redhat.com> From: Jeff Law Openpgp: preference=signencrypt Message-ID: <995512fe-f357-c168-b509-baf330874483@redhat.com> Date: Wed, 24 Apr 2019 18:40:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-IsSubscribed: yes X-SW-Source: 2019-04/txt/msg00946.txt.bz2 On 4/24/19 4:44 AM, Richard Biener wrote: > On Tue, Apr 23, 2019 at 4:29 PM Jeff Law wrote: >> >> >> As discussed in the BZ, this patch addresses the false positive warning >> by cleaning up the const/copy propagations left in the IL between DOM's >> jump threading and erroneous path isolation. >> >> In the past we'd been handling this stuff with phi only cprop. To make >> phi only cprop work in this case we'd have to change it to scan >> statements within some subset of blocks where it had previously only >> scanned PHIs. >> >> I was concerned about the compile-time cost of the additional scanning >> plus the extra pass. So I compared that against using the lattice based >> const/copy propagation as well as against the RPO VN bits. >> >> It turns out that the lattice based const/copy propagation does the >> least amount of work, closely followed by a limited RPO VN. The >> improved "phi only" copy propagator being the worst. > > Interesting. Definitely. I didn't dig further than what's mentioned in the BZ. > >> Given that we can use the lattice copy propagator by just adding the >> pass to passes.def whereas using the RPN VN actually requires a little >> bit of real code (to set up the entry/exits for the relevant SEME >> regions), I went with the lattice copy propagator. >> >> This change adds around .4% instruction executions to my testbed of .i >> files. It has no significant impact on the resulting code -- I see >> different register allocation decisions in a lot of places which seem to >> primarily result in reversing arguments to comparisons. > > Was there a need to have two copy-prop passes in the early > DOM/errorneous-path removal where we previously only had > a single phi-only-prop pass? Is the testcase fixed also when > doing copy-prop only a single time? The testcase is fixed with a single copyprop (lattice or RPO VN) after DOM but before erroneous path isolation. I seriously considered just dropping the copyprop pass after erroneous path isolation. I'm pretty sure it'll regress codegen quality, but it may not be too bad. > > Also the late pass you replace should be right after VRP, not > after warn_restrict (that was a mistake done when adding the > warn_restrict pass). Agreed. But ISTM we should make that an independent change. > > The main reason I dislike this is that it is an unconditional cleanup > pass run even when we didn't perform any jump threading. That's > of course the same case as with the existing phi-only-prop passes > but would have been my major argument for the SEME VN > (where at some cut-off of course doing a single whole-function VN > is cheaper than doing N SEME VNs, and I can even think of doing > N SEME regions at once in exchange for doing a whole-function > RPO order compute). Yea. We're certainly doing more work in the cases where we didn't thread anything. I've always wanted better ways to indicate what actions a pass did and using that to bypass subsequent passes if they weren't going to be profitable. > >> FWIW I also considered delaying the erroneous path isolation pass. I >> ultimately decided against that. My recollection is that its location >> came from the desire to clean up those paths early enough in the >> pipeline so that other optimizers could do a better job. >> >> We could consider an early/late split here. Essentially we do the >> optimization early, leaving enough data lying around somewhere for a >> late pass to look for and issue the warning. Something like a >> __builtin_warning that we leave in the IL, then just before expanding to >> RTL, we scan the IL once and issue the __builtin_warnings. >> >> In this specific case the __builtin_warning would be on a path that we >> eventually would determine was unreachable at compile time and the path >> would be removed. I suspect we could do something similar for other >> warnings coming out of the middle end. >> >> Anyway, this has been bootstrapped and regression tested on x86_64, >> ppc64, i686, sparc64, & ppc64le. It's also been bootstrapped on alpha, >> ppc (32 bit), armeb, m68k, riscv64, mipsisa32r4, arm, sh4, & sh4eb. >> It's also built and regression tested all the *-elf targets in my tester. >> >> OK for the trunk, or do we want to defer to gcc-10? > > I like the pass removal and would say OK if you manage with > a 1:1 replacement (thus get rid of that extra copy-prop between > DOM and pass_isolate_erroneous_paths). That's the one we need to fix the regression. We might be able to drop the one after erroneous path isolation which would keep us at the 1:1 replacement. I'll poke at that. jeff