From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 14988 invoked by alias); 19 Nov 2005 19:31:25 -0000 Received: (qmail 14972 invoked by uid 22791); 19 Nov 2005 19:31:24 -0000 X-Spam-Check-By: sourceware.org Received: from yosemite.airs.com (HELO yosemite.airs.com) (205.217.158.180) by sourceware.org (qpsmtpd/0.31) with SMTP; Sat, 19 Nov 2005 19:31:23 +0000 Received: (qmail 32111 invoked by uid 10); 19 Nov 2005 19:31:22 -0000 Received: (qmail 3479 invoked by uid 500); 19 Nov 2005 06:18:52 -0000 To: Andrew MacLeod Cc: gcc mailing list Subject: Re: Register Allocation References: <1132246411.10098.153.camel@localhost.localdomain> From: Ian Lance Taylor Date: Sat, 19 Nov 2005 19:31:00 -0000 In-Reply-To: <1132246411.10098.153.camel@localhost.localdomain> Message-ID: User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.4 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-IsSubscribed: yes Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org X-SW-Source: 2005-11/txt/msg00913.txt.bz2 Andrew MacLeod writes: > The document is intended as a starting point and consists mostly of my > thoughts at the moment. By the time the underlying RTL bits are done, I > would like it to have evolved to include input from others. The more > useful comments there are, the better the chance of us getting a decent > allocator. Thanks for thinking about this and writing this up. When I look at reload, there are a few things that make the code overly complex: reload inheritance, general hueristics to handle memory addresses which are intended to work on all machines, and secondary/tertiary reloads and the reload_in/out patterns. So I'd like to quickly look at these in your framework. Secondary/tertiary reloads and reload_in/out patterns are apparently subsumed by the Spill Engine. Porting a target to the new framework is going to require providing the Spill Engine with the appropriate instructions for storing and loading each native register class, including a description of any temporary or scratch registers required. That seems reasonable. One minor thing that I don't quite understand is that bit about reaching a limit of SPILLIDs. For targets with limited displacement, what matters is when you start needing larger displacements, which as far as I can see has more to do with the overall stack frame size than the number of SPILLIDs. Reload inheritance is a complex idea threaded through reload. The goal is to reuse values which happen to already be in registers whenever possible. In particular, to do this for reloads required for a single insn. A typical simple example is 'a++' where 'a' winds up living on the stack at a large unrepresentable offset. You don't want to wind up computing the address of 'a' twice, although that is what a naive spill code implementation. That example is too simple, I suppose, but it has the right general flavor. Reload inheritance is particularly important on machines with limited numbers of registers and limited addressing capability--e.g., SH, Thumb, MIPS16. The current reload implementation needs to do reload inheritance as it goes, or it will run out of spill registers in some cases. That is why a reload CSE pass after reload is not enough to solve the problem. I would be interested in your thoughts on handling this general issue in your framework. One approach that seems reasonably promising to me is to first do pseudo spill code for the instructions in a basic block, then run CSE over the pseudo code, and only then identify real spill registers to fit into the pseudo code. The current reload pass includes general heuristics to handle reloading memory addresses. This code knows things like "if stack pointer plus displacement is not a valid memory address, try loading the displacement into a register." Many targets currently rely on those heuristics to generate valid code. I haven't been able to quite pin down where this happens in your proposal. For example, it's easy for an address to use the frame pointer and be valid before reload, and then for reload to eliminate the frame pointer (in fact, in your scheme, what does frame pointer elimination?) and produce an offset from the stack pointer which is invalid. That is, spill code or frame pointer elimination can generate invalid address, and something needs to fix them up. Where does that happen, and how? The other main thing that bugs me about the current register allocation scheme is that there is no communication with the scheduler. The scheduler can easily increase register pressure so much that the resulting code is worse than if the scheduler had never run at all. Naturally this happens particularly on architectures with relatively few registers and relatively deep pipelines, like XScale. I don't want to argue that your scheme needs to solve every problem. But I think that any attempt to tackle the register allocator should think seriously about better integration with the scheduler. I don't see anything about that in your proposal. Ian