From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-return-122230-listarch-gcc=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 11218 invoked by alias); 23 Nov 2005 17:07:34 -0000
Received: (qmail 11104 invoked by uid 22791); 23 Nov 2005 17:07:30 -0000
X-Spam-Check-By: sourceware.org
Received: from mx1.redhat.com (HELO mx1.redhat.com) (66.187.233.31)     by sourceware.org (qpsmtpd/0.31) with ESMTP; Wed, 23 Nov 2005 17:07:28 +0000
Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) 	by mx1.redhat.com (8.12.11/8.12.11) with ESMTP id jANH7PPE029052; 	Wed, 23 Nov 2005 12:07:25 -0500
Received: from potter.sfbay.redhat.com (potter.sfbay.redhat.com [172.16.27.15]) 	by int-mx1.corp.redhat.com (8.11.6/8.11.6) with ESMTP id jANH7OV32602; 	Wed, 23 Nov 2005 12:07:24 -0500
Received: from vpn83-149.boston.redhat.com (vpn83-149.boston.redhat.com [172.16.83.149]) 	by potter.sfbay.redhat.com (8.12.8/8.12.8) with ESMTP id jANH7CSY018317; 	Wed, 23 Nov 2005 12:07:19 -0500
Subject: Re: Register Allocation
From: Andrew MacLeod <amacleod@redhat.com>
To: Ian Lance Taylor <ian@airs.com>
Cc: gcc mailing list <gcc@gcc.gnu.org>
In-Reply-To: <m3oe4h6tyr.fsf@gossamer.airs.com>
References: <1132246411.10098.153.camel@localhost.localdomain> 	 <m3oe4h6tyr.fsf@gossamer.airs.com>
Content-Type: text/plain
Date: Wed, 23 Nov 2005 17:07:00 -0000
Message-Id: <1132765629.28830.195.camel@localhost.localdomain>
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive: <http://gcc.gnu.org/ml/gcc/>
List-Post: <mailto:gcc@gcc.gnu.org>
List-Help: <http://gcc.gnu.org/ml/>
Sender: gcc-owner@gcc.gnu.org
X-SW-Source: 2005-11/txt/msg01153.txt.bz2

On Fri, 2005-11-18 at 22:18 -0800, Ian Lance Taylor wrote:
> Andrew MacLeod <amacleod@redhat.com> writes:

> Secondary/tertiary reloads and reload_in/out patterns are apparently
> subsumed by the Spill Engine.  Porting a target to the new framework
> is going to require providing the Spill Engine with the appropriate
> instructions for storing and loading each native register class,
> including a description of any temporary or scratch registers
> required.  That seems reasonable.  One minor thing that I don't quite
> understand is that bit about reaching a limit of SPILLIDs.  For
> targets with limited displacement, what matters is when you start
> needing larger displacements, which as far as I can see has more to do
> with the overall stack frame size than the number of SPILLIDs.
> 

There is a correlation between the number of SPILLIDs and the amount of
stack space you are going to need. If each spill takes 4 bytes, and
there is 1000 bytes available in the limited displacement pattern
(already taken any fixed requirements out), when SPILLID reaches 250,
you either have to try to compress SPILLIDs by some mechanism (such as
colouring them), or switch to the less efficient pattern which uses
other registers.  Since displacements aren't assigned yet, the number of
SPILLIDs is the only measure there is to decide when to do this. It is
possible that you could commit these spills to memory at this point, if
that served a purpose. 

If the less efficient store doesn't actually have any different register
requirements, there is no need to go through this. The correct insn
would be chosen based on the eventually assigned displacement.  


> 
> Reload inheritance is a complex idea threaded through reload.  The
> goal is to reuse values which happen to already be in registers
> whenever possible.  In particular, to do this for reloads required for
> a single insn.  A typical simple example is 'a++' where 'a' winds up
> living on the stack at a large unrepresentable offset.  You don't want
> to wind up computing the address of 'a' twice, although that is what a
> naive spill code implementation.  That example is too simple, I
> suppose, but it has the right general flavor.
> 

As the spiller steps through the program, it tracks exactly what is in
each hardware register. It also uses lookahead to see how register are
used in the near future. When a value is loaded, every attempt is made
to reuse that value.  This is one of the reasons spilled pseudos are
simply left alone until the spiller sees them. The two uses of the
address calculation of A should already be in the same pseudo when the
spiller sees them. It will simply reuse the value if they are close to
each other and it is possible with the various other register
constraints.  

> Reload inheritance is particularly important on machines with limited
> numbers of registers and limited addressing capability--e.g., SH,
> Thumb, MIPS16.  The current reload implementation needs to do reload
> inheritance as it goes, or it will run out of spill registers in some
> cases.  That is why a reload CSE pass after reload is not enough to
> solve the problem.

Which is one of the reasons the spiller understands how to spill
hardware registers. Sometimes there is no spill register available,
and/or there is a value which is used multiple times close together,
but no register available over the range. It may be better to
temporarily spill a value already assigned to a hardware register for a
period and free up the additional register.  Thats where the lookahead
comes in real handy :-)


> The current reload pass includes general heuristics to handle
> reloading memory addresses.  This code knows things like "if stack
> pointer plus displacement is not a valid memory address, try loading
> the displacement into a register."  Many targets currently rely on
> those heuristics to generate valid code.  I haven't been able to quite
> pin down where this happens in your proposal.  For example, it's easy
> for an address to use the frame pointer and be valid before reload,
> and then for reload to eliminate the frame pointer (in fact, in your
> scheme, what does frame pointer elimination?) and produce an offset
> from the stack pointer which is invalid.  That is, spill code or frame
> pointer elimination can generate invalid address, and something needs
> to fix them up.  Where does that happen, and how?
> 


To be fair, I haven't given register elimination a lot of thought yet. I
was presuming it could be inserted as a pass or as a separate component
of the spiller.  Let me get back to you, I must investigate the
issues :-). A couple of prime examples would be helpful :-)  I'll write
up a new section for it.

> le.
> I don't want to argue that your scheme needs to solve every problem.
> But I think that any attempt to tackle the register allocator should
> think seriously about better integration with the scheduler.  I don't
> see anything about that in your proposal.

There isn't :-). I have never been a proponent of integrating register
allocation and scheduling. Quite the opposite, I prefer keeping them
separate.  They can do some limited communication, but should not be
tightly intertwined. Either should be capable of running without knowing
the other exists. 

Scheduling should be able to use heuristics and perhaps some of the
tools of the allocator (such as register pressure and number of register
available) to prevent itself from being too stupid if so desired.
  
Conversely, register allocation may be able to call into the scheduler
to determine if a spill or rematerialization sequence is cheaper/more
expensive in some locations than others, and that sort of thing. The
register allocator can help the scheduler by trying to spread its use of
register out a bit. Using the minimum number of register is not alway
the right thing :-) These types of things are all easy to add after the
fact without affecting the design in any way.  (I suppose anyone what
wanted to experiment with it would be able to add a scheduling pass or
scheduling aware optimization(s) to the allocator)

Yet another one of those religious subjects :-)

Thanks for the comments!
Andrew