From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 11218 invoked by alias); 23 Nov 2005 17:07:34 -0000 Received: (qmail 11104 invoked by uid 22791); 23 Nov 2005 17:07:30 -0000 X-Spam-Check-By: sourceware.org Received: from mx1.redhat.com (HELO mx1.redhat.com) (66.187.233.31) by sourceware.org (qpsmtpd/0.31) with ESMTP; Wed, 23 Nov 2005 17:07:28 +0000 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11/8.12.11) with ESMTP id jANH7PPE029052; Wed, 23 Nov 2005 12:07:25 -0500 Received: from potter.sfbay.redhat.com (potter.sfbay.redhat.com [172.16.27.15]) by int-mx1.corp.redhat.com (8.11.6/8.11.6) with ESMTP id jANH7OV32602; Wed, 23 Nov 2005 12:07:24 -0500 Received: from vpn83-149.boston.redhat.com (vpn83-149.boston.redhat.com [172.16.83.149]) by potter.sfbay.redhat.com (8.12.8/8.12.8) with ESMTP id jANH7CSY018317; Wed, 23 Nov 2005 12:07:19 -0500 Subject: Re: Register Allocation From: Andrew MacLeod To: Ian Lance Taylor Cc: gcc mailing list In-Reply-To: References: <1132246411.10098.153.camel@localhost.localdomain> Content-Type: text/plain Date: Wed, 23 Nov 2005 17:07:00 -0000 Message-Id: <1132765629.28830.195.camel@localhost.localdomain> Mime-Version: 1.0 Content-Transfer-Encoding: 7bit X-IsSubscribed: yes Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org X-SW-Source: 2005-11/txt/msg01153.txt.bz2 On Fri, 2005-11-18 at 22:18 -0800, Ian Lance Taylor wrote: > Andrew MacLeod writes: > Secondary/tertiary reloads and reload_in/out patterns are apparently > subsumed by the Spill Engine. Porting a target to the new framework > is going to require providing the Spill Engine with the appropriate > instructions for storing and loading each native register class, > including a description of any temporary or scratch registers > required. That seems reasonable. One minor thing that I don't quite > understand is that bit about reaching a limit of SPILLIDs. For > targets with limited displacement, what matters is when you start > needing larger displacements, which as far as I can see has more to do > with the overall stack frame size than the number of SPILLIDs. > There is a correlation between the number of SPILLIDs and the amount of stack space you are going to need. If each spill takes 4 bytes, and there is 1000 bytes available in the limited displacement pattern (already taken any fixed requirements out), when SPILLID reaches 250, you either have to try to compress SPILLIDs by some mechanism (such as colouring them), or switch to the less efficient pattern which uses other registers. Since displacements aren't assigned yet, the number of SPILLIDs is the only measure there is to decide when to do this. It is possible that you could commit these spills to memory at this point, if that served a purpose. If the less efficient store doesn't actually have any different register requirements, there is no need to go through this. The correct insn would be chosen based on the eventually assigned displacement. > > Reload inheritance is a complex idea threaded through reload. The > goal is to reuse values which happen to already be in registers > whenever possible. In particular, to do this for reloads required for > a single insn. A typical simple example is 'a++' where 'a' winds up > living on the stack at a large unrepresentable offset. You don't want > to wind up computing the address of 'a' twice, although that is what a > naive spill code implementation. That example is too simple, I > suppose, but it has the right general flavor. > As the spiller steps through the program, it tracks exactly what is in each hardware register. It also uses lookahead to see how register are used in the near future. When a value is loaded, every attempt is made to reuse that value. This is one of the reasons spilled pseudos are simply left alone until the spiller sees them. The two uses of the address calculation of A should already be in the same pseudo when the spiller sees them. It will simply reuse the value if they are close to each other and it is possible with the various other register constraints. > Reload inheritance is particularly important on machines with limited > numbers of registers and limited addressing capability--e.g., SH, > Thumb, MIPS16. The current reload implementation needs to do reload > inheritance as it goes, or it will run out of spill registers in some > cases. That is why a reload CSE pass after reload is not enough to > solve the problem. Which is one of the reasons the spiller understands how to spill hardware registers. Sometimes there is no spill register available, and/or there is a value which is used multiple times close together, but no register available over the range. It may be better to temporarily spill a value already assigned to a hardware register for a period and free up the additional register. Thats where the lookahead comes in real handy :-) > The current reload pass includes general heuristics to handle > reloading memory addresses. This code knows things like "if stack > pointer plus displacement is not a valid memory address, try loading > the displacement into a register." Many targets currently rely on > those heuristics to generate valid code. I haven't been able to quite > pin down where this happens in your proposal. For example, it's easy > for an address to use the frame pointer and be valid before reload, > and then for reload to eliminate the frame pointer (in fact, in your > scheme, what does frame pointer elimination?) and produce an offset > from the stack pointer which is invalid. That is, spill code or frame > pointer elimination can generate invalid address, and something needs > to fix them up. Where does that happen, and how? > To be fair, I haven't given register elimination a lot of thought yet. I was presuming it could be inserted as a pass or as a separate component of the spiller. Let me get back to you, I must investigate the issues :-). A couple of prime examples would be helpful :-) I'll write up a new section for it. > le. > I don't want to argue that your scheme needs to solve every problem. > But I think that any attempt to tackle the register allocator should > think seriously about better integration with the scheduler. I don't > see anything about that in your proposal. There isn't :-). I have never been a proponent of integrating register allocation and scheduling. Quite the opposite, I prefer keeping them separate. They can do some limited communication, but should not be tightly intertwined. Either should be capable of running without knowing the other exists. Scheduling should be able to use heuristics and perhaps some of the tools of the allocator (such as register pressure and number of register available) to prevent itself from being too stupid if so desired. Conversely, register allocation may be able to call into the scheduler to determine if a spill or rematerialization sequence is cheaper/more expensive in some locations than others, and that sort of thing. The register allocator can help the scheduler by trying to spread its use of register out a bit. Using the minimum number of register is not alway the right thing :-) These types of things are all easy to add after the fact without affecting the design in any way. (I suppose anyone what wanted to experiment with it would be able to add a scheduling pass or scheduling aware optimization(s) to the allocator) Yet another one of those religious subjects :-) Thanks for the comments! Andrew