Date: Thu, 13 Feb 2020 07:48:00 -0000
From: Richard Biener
To: Segher Boessenkool
cc: Roman Zhuykov, "Kewen.Lin", GCC Patches, Bill Schmidt, "bin.cheng"
Subject: Re: [PATCH 0/4 GCC11] IVOPTs consider step cost for different forms when unrolling
In-Reply-To: <20200212220546.GE22482@gate.crashing.org>

On Wed, 12 Feb 2020, Segher Boessenkool wrote:

> On Wed, Feb 12, 2020 at 11:53:22AM +0100, Richard Biener wrote:
> > On Wed, 12 Feb 2020, Segher Boessenkool wrote:
> > > On Wed, Feb 12, 2020 at 09:12:58AM +0100, Richard Biener wrote:
> > > > On Tue, 11 Feb 2020, Segher Boessenkool wrote:
> > > > > Basic block partitioning has wildly disproportionate fallout in
> > > > > all later passes, both in terms of what those *do* (or don't, if
> > > > > partitioning is enabled), and of impact on the code (not to mention
> > > > > developer time).
> > > > >
> > > > > Maybe the implementation can be improved, but probably we should do this
> > > > > in a different way altogether. The current situation is not good.
> > > >
> > > > I think the expectation that you can go back to CFG layout mode
> > > > and then work with CFG layout tools after we've lowered to CFG RTL
> > > > is simply bogus.
> > >
> > > Partitioning is also quite problematic if you do not use cfglayout
> > > mode. For example, in shrink-wrapping. It prevents a lot there.
> > >
> > > > Yeah, you can probably do analysis things but
> > > > I wouldn't be surprised if a CFG RTL -> CFG layout -> CFG RTL cycle
> > > > can wreck things. Undoubtedly doing CFG manipulations is not going
> > > > to work since CFG layout does not respect CFG RTL restrictions.
> > >
> > > Doing CFG manipulations on CFG RTL mode directly is almost impossible
> > > to do correctly.
> > >
> > > For example, bb-reorder. Which is a much more important optimisation
> > > than partitioning, btw.
> >
> > BB reorder switches back and forth as well ... :/
>
> Yes. It is extremely hard to change any jumps in cfgrtl mode.
>
> The goal is to use cfglayout mode more, ideally *always*, not to use
> it less!

Sure! It must be that split requires CFG RTL (it doesn't say so, but
we don't have a PROP_rtllayout or a "cannot work with" set of
properties). Otherwise I'm not sure why we go out of CFG layout mode
so early. Passes not messing with the CFG at all should be ignorant
of what mode we are in.

> > I think both are closely enough related that we probably should do
> > partitioning from within the same framework? OTOH BB reorder
> > happens _much_ later.
>
> This may be doable. What we need to do first though is to find a better
> thing to use than EDGE_CROSSING.
>
> Maybe we can determine which blocks should be hot and cold early, but
> actually make that happen only very late (maybe adjusting the decision
> if we have to, to make things work)? That way, intervening passes do not
> have to care (much).

The question is whether we can even preserve things like BB counts
and edge frequencies after we've gone out of CFG layout mode even once.

> > > > Partitioning simply uncovered latent bugs, there's nothing wrong
> > > > with it IMHO.
> > >
> > > I don't agree. The whole way EDGE_CROSSING works hinders all other
> > > optimisations a lot.
> >
> > I'm not sure if it's there for correctness reasons or just to
> > help checking that nothing "undoes" the partitioning decision.
>
> The latter. (It may also make it easier for the former, of course, but
> that can be solved anyway). And that makes life very hard for all later
> passes, while it is doubtful that it even gives the best decisions: for
> example it prevents a lot of shrink-wrapping, which you dearly *want* to
> do on cold code!

Sure. So do that earlier ;)

Richard.