From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 114694 invoked by alias); 27 Nov 2019 10:08:42 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 114685 invoked by uid 89); 27 Nov 2019 10:08:42 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-4.7 required=5.0 tests=AWL,BAYES_00,SPF_PASS autolearn=ham version=3.3.1 spammy=pay X-HELO: foss.arm.com Received: from foss.arm.com (HELO foss.arm.com) (217.140.110.172) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 27 Nov 2019 10:08:40 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 0799330E; Wed, 27 Nov 2019 02:08:39 -0800 (PST) Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.126]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 4DC713F6C4; Wed, 27 Nov 2019 02:08:38 -0800 (PST) From: Richard Sandiford To: Richard Biener Mail-Followup-To: Richard Biener ,Segher Boessenkool , GCC Patches , Nicholas Krause , richard.sandiford@arm.com Cc: Segher Boessenkool , GCC Patches , Nicholas Krause Subject: Re: Add a new combine pass References: <20191118235659.GO16031@gate.crashing.org> <20191119203932.GC16031@gate.crashing.org> <20191120204556.GP16031@gate.crashing.org> <20191122163911.GE9491@gate.crashing.org> <20191125221350.GW9491@gate.crashing.org> <20191126014247.GC9491@gate.crashing.org> Date: Wed, 27 Nov 2019 10:18:00 -0000 In-Reply-To: (Richard Biener's message of "Wed, 27 Nov 2019 09:29:27 +0100") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-IsSubscribed: yes X-SW-Source: 2019-11/txt/msg02488.txt.bz2 Richard Biener writes: > On Tue, Nov 26, 2019 at 2:42 AM Segher Boessenkool > wrote: >> >> On Mon, Nov 25, 2019 at 11:08:47PM +0000, Richard Sandiford wrote: >> > Segher Boessenkool writes: >> > > On Mon, Nov 25, 2019 at 09:16:52PM +0000, Richard Sandiford wrote: >> > >> Segher Boessenkool writes: >> > >> > I am wondering the other way around :-) Is what you do for combine2 >> > >> > something that would be more generally applicable/useful? That's what >> > >> > I'm trying to find out :-) >> > >> > >> > >> > What combine does could use some improvement, if you want to hear a >> > >> > more direct motivations. LOG_LINKS just skip references we cannot >> > >> > handle (and some more), so we always have to do modified_between etc., >> > >> > which hurts. >> > >> >> > >> The trade-offs behind the choice of representation are very specific >> > >> to the pass. >> > > >> > > Yes, but hopefully not so specific that every pass needs a completely >> > > different representation ;-) >> > >> > Well, it depends. Most passes make do with df (without DU/UD-chains). >> > But since DU/UD-chains are naturally quadratic in the general case, >> > and are expensive to keep up to date, each DU/UD pass is going to have >> > make some compromises. It doesn't seem too bad that passes make >> > different compromises based on what they're trying to do. (combine: >> > single use per definition; fwprop.c: track all uses, but for dominating >> > definitions only; sched: fudged via a param; regrename: single >> > definition/multiple use chains optimised for renmaing; combine2: full >> > live range information, but limited use list; etc.) >> >> combine actually *calculates* DU chains almost completely, it just throws >> away most of that information (it wants to have LOG_LINKS, as it did ages >> ago). The only thing stopping us from doing that right now is that not >> all uses are counted (some are skipped). >> >> Since combine works only within BBs, DU chains are linear to compute, and >> UD chains are trivial (and just linear to compute). > > quadraticness appears for RTL DU/UD chains because of partial definitions, > that doesn't change for BBs so even there computing is them is quadratic > (because recording them is). The situation is simply having N partial > defs all reaching M uses which gives you a chain of size N * M. > > Now - for combine you don't want partial defs, so for simplicity we could > choose to _not_ record DU/UD chains whenever we see a partial def for > a pseudo (and mark those as "bad"). Or, slightly enhanced, we can > handle DU/UD chains for regions where there is no partial definition > and add a "fake" D denoting (there are [multiple] defs beyond that > might be partial). Depending on the use-case that should suffice and > make the problem linear. > > I think you want to ask sth like "is REG changed [partially] between > its use in insn A and the def in insn B" and you want to answer that by using > REGs UD chain for that. If you only ever reached the def in insn B via the > "pruned" chain then this would work, likewise for the chain we do not compute > any UD chain for REG. (Passing over this as I think it's about what current combine wants.) >> Updating is quadratic in general, sure. Luckily in most realistic cases >> it is cheap (most, sigh) (insns aren't combined to very far away). > > Updating is linear as well if you can disregard partial defs. > Updating cannot be quadratic if compute is linear ;) This was based on the assumption that we'd do an update after each combination, so that the pass still sees correct info. That then makes the updates across one run of the pass quadratic, since the number of successful combinations is O(ninsns). As far as the new pass goes: the pass would be quadratic if we tried to combine each use in single-def DU chain with its definition. It would also be quadratic if we tried to parallelise each pair of uses in a DU chain. So if we did have full DU chains in the new pass, we'd also need some limit N on the number of uses we try to combine with. And if we're only going to try combining with N uses, then it seemed better to track only N uses "by name", rather than pay the cost of tracking all uses by name but ignoring the information for some of them. All we care about for other uses is whether they would prevent a move. We can track that using a simple point-based live range, where points are LUIDs with gaps in between for new insns. So the new pass uses a list of N specific uses and a single live range. Querying whether a particular definition is live at a particular point is then a constant-time operation. So is updating the info after a successful combination (potentially including a move). That still seems like a reasonable way of representing this, given what the pass wants to do. Moving to full DU chains would IMO just make the pass more expensive with no obvious benefit. Thanks, Richard