IRA conflict graph & alternative selection

public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed

* IRA conflict graph & alternative selection
@ 2009-02-13 15:13 Jeff Law
  2009-02-13 15:32 ` Paolo Bonzini
  2009-02-13 16:37 ` Vladimir Makarov
  0 siblings, 2 replies; 32+ messages in thread
From: Jeff Law @ 2009-02-13 15:13 UTC (permalink / raw)
  To: Vladimir N. Makarov; +Cc: GCC

I've been thinking further about instruction alternative selection prior 
to allocation and one of the questions in my mind is how this interacts 
with IRA.

We select an alternative for each insn based on some "best guess" 
heuristic -- the selection of an alternative will often restrict the 
register classes available for each of the operands.  Presumably we'd 
want to encode that information it the conflict graph so that IRA would 
allocate registers so as to fit the constraints of the early insn 
alternative selection.  Right?   In the case where the graph is 
uncolorable, do we allow IRA to override the alternative selection, or 
do we insert copies to simplify the conflict graph or some mixture of both?

Thoughts?

jeff

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-13 15:13 IRA conflict graph & alternative selection Jeff Law
@ 2009-02-13 15:32 ` Paolo Bonzini
  2009-02-13 15:47   ` Michael Matz
  2009-02-13 17:01   ` Jeff Law
  2009-02-13 16:37 ` Vladimir Makarov
  1 sibling, 2 replies; 32+ messages in thread
From: Paolo Bonzini @ 2009-02-13 15:32 UTC (permalink / raw)
  To: Jeff Law; +Cc: Vladimir N. Makarov, GCC

Jeff Law wrote:
> We'd want to encode [early insn alternative selection]
> information in the conflict graph so that IRA would
> allocate registers so as to fit the constraints of the early insn
> alternative selection.  Right?   In the case where the graph is
> uncolorable, do we allow IRA to override the alternative selection, or
> do we insert copies to simplify the conflict graph or some mixture of both?

Inserting compensation code, for example copies, can be seen as some
kind of "pre-reload" as it was used on new-ra branch; the problem with
pre-reload was that it was built on "cp reload1.c pre-reload.c", so it
was not much less complicated than reload.

Paolo

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-13 15:32 ` Paolo Bonzini
@ 2009-02-13 15:47   ` Michael Matz
  2009-02-16 16:46     ` Jeff Law
  2009-02-13 17:01   ` Jeff Law
  1 sibling, 1 reply; 32+ messages in thread
From: Michael Matz @ 2009-02-13 15:47 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Jeff Law, Vladimir N. Makarov, GCC

Hi,

On Fri, 13 Feb 2009, Paolo Bonzini wrote:

> > We'd want to encode [early insn alternative selection] information in 
> > the conflict graph so that IRA would allocate registers so as to fit 
> > the constraints of the early insn alternative selection.  Right?  In 
> > the case where the graph is uncolorable, do we allow IRA to override 
> > the alternative selection, or do we insert copies to simplify the 
> > conflict graph or some mixture of both?

If the initial alternative selection was done cleverly (like chose the 
alternatives allowing the largest register sets which don't immediately 
create conflicting demands for a pseudo register) the opportunities for 
making an uncolorable graph colorable by chosing another alternative will 
be very small.  This can only happen if that new alternative somehow 
allows for the uncolorable node a completely new set of register (like say 
float instead of integer regs), which would mean also selecting other 
alternatives for all instructions where this pseudo also is used.

So it's not impossible, but I think it would happen relatively seldom that 
changing the alternatives improves the situation.

> Inserting compensation code, for example copies, can be seen as some 
> kind of "pre-reload" as it was used on new-ra branch; the problem with 
> pre-reload was that it was built on "cp reload1.c pre-reload.c", so it 
> was not much less complicated than reload.

Unnecessarily so, though.  Faster to have something working, but as the 
problem is much easier (and you still can reasonably create new pseudos) 
also the solution can be.  It just never was followed through.

Ciao,
Michael.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-13 15:13 IRA conflict graph & alternative selection Jeff Law
  2009-02-13 15:32 ` Paolo Bonzini
@ 2009-02-13 16:37 ` Vladimir Makarov
  2009-02-13 16:49   ` Paolo Bonzini
  2009-02-16 16:40   ` Jeff Law
  1 sibling, 2 replies; 32+ messages in thread
From: Vladimir Makarov @ 2009-02-13 16:37 UTC (permalink / raw)
  To: Jeff Law; +Cc: GCC

Jeff Law wrote:
> I've been thinking further about instruction alternative selection 
> prior to allocation and one of the questions in my mind is how this 
> interacts with IRA.
>
> We select an alternative for each insn based on some "best guess" 
> heuristic -- the selection of an alternative will often restrict the 
> register classes available for each of the operands.  Presumably we'd 
> want to encode that information it the conflict graph so that IRA 
> would allocate registers so as to fit the constraints of the early 
> insn alternative selection.  Right?   In the case where the graph is 
> uncolorable, do we allow IRA to override the alternative selection, or 
> do we insert copies to simplify the conflict graph or some mixture of 
> both?
>
> Thoughts?

Sticking with one insn alternative (by alternative I mean set of operand 
constraints - one constraint for one operand.  That is different from 
RTL insn alternative which is superset of such set) permits more 
accurate (and much faster) cost calculation of register class costs for 
pseudos which improves coloring and spilling decisions especially for 
CISC architectures likes x86, x86_64.  On the other hand it rejects some 
more possible alternatives.  What is more important? I don't know.  
Therefore I wrote about it as an experiment.  That is ok because to get 
current state IRA algorithms I tried probably 5 times more different 
coloring algorithms.

As for copies,  I think it would be a bad decision to stick only to 
original (after the code selection) alternative and generate copies to 
satisfy this alternative.  For example, if pseudo got memory instead of 
hard-register required by the alternative, it would be bad to generate a 
copy (ld/st in this case) if memory is accepted by the insn.  That is 
because a new pseudo will need a hard register and could result in 
spilling another pseudo.  And this  can not be fixed by a simple 
subsequent pass removing ld/st by using memory operands.  So I think if 
the altrernative is not satisfied then we should choose the cheapest one 
and use it for the insn.  Still some copies might be needed to satisfy 
the new alternative.  After changing some insn alternatives and 
generating copies, choosing alternative for them and generating new 
pseudos we modify register class costs of the pseudos and calculate 
costs for the new pseudos (it should be a fast process because we need 
to change costs only for pseudos involved in insns with changed 
alternatives and because again each insn has only one alternative).  
After cost recalculation, another round of coloring is done.

That is in brief how I see it and there are a lot of reload details 
missed (like virtual register eliminations or addressing displacement 
constraints etc).

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-13 16:37 ` Vladimir Makarov
@ 2009-02-13 16:49   ` Paolo Bonzini
  2009-02-13 19:01     ` Ian Lance Taylor
  2009-02-13 19:54     ` Jeff Law
  2009-02-16 16:40   ` Jeff Law
  1 sibling, 2 replies; 32+ messages in thread
From: Paolo Bonzini @ 2009-02-13 16:49 UTC (permalink / raw)
  To: GCC Development, Jeffrey A Law, Vladimir Makarov, Michael Matz

> As for copies,  I think it would be a bad decision to stick only to
> original (after the code selection) alternative and generate copies to
> satisfy this alternative.  For example, if pseudo got memory instead of
> hard-register required by the alternative, it would be bad to generate a
> copy (ld/st in this case) if memory is accepted by the insn.

Yes, spilling is a special case (see also PR19398).  If something is
spilled, "prereloads" could be rolled back for affected instructions and
 recomputed (possibly forcing no change on the constraints of other
registers).

> That is in brief how I see it and there are a lot of reload details
> missed (like virtual register eliminations or addressing displacement
> constraints etc).

I suppose those would stay in reload?  Michael, what were your
intentions for the insn-sel branch?  (Sorry to remind you of that :-P).

Paolo

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-13 15:32 ` Paolo Bonzini
  2009-02-13 15:47   ` Michael Matz
@ 2009-02-13 17:01   ` Jeff Law
  1 sibling, 0 replies; 32+ messages in thread
From: Jeff Law @ 2009-02-13 17:01 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Vladimir N. Makarov, GCC

Paolo Bonzini wrote:
> Jeff Law wrote:
>   
>> We'd want to encode [early insn alternative selection]
>> information in the conflict graph so that IRA would
>> allocate registers so as to fit the constraints of the early insn
>> alternative selection.  Right?   In the case where the graph is
>> uncolorable, do we allow IRA to override the alternative selection, or
>> do we insert copies to simplify the conflict graph or some mixture of both?
>>     
>
> Inserting compensation code, for example copies, can be seen as some
> kind of "pre-reload" as it was used on new-ra branch;
It can certainly be viewed that way; however, it helps me to think in 
terms of having IRA generate spill code.  The difference is subtle, but 
it's helping me get away from the reload-centric mindset which keeps 
creeping in. 


>  the problem with
> pre-reload was that it was built on "cp reload1.c pre-reload.c", so it
> was not much less complicated than reload.
>   
Precisely.  And I'm finding that as I look at the existing reload code 
and ponder where/how to do things better I always end up with the same 
code, but just at a different place in the pipeline.  That shouldn't be 
the goal here -- the goal should be something more maintainable and 
predictable than reload.

What I think I'm getting at is we should be looking at common reload 
cases and mapping them into changes to the conflict graph within IRA 
(possibly with some copies inserted as compensation code).

Jeff

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-13 16:49   ` Paolo Bonzini
@ 2009-02-13 19:01     ` Ian Lance Taylor
  2009-02-13 19:38       ` Vladimir Makarov
  2009-02-16 16:25       ` Jeff Law
  2009-02-13 19:54     ` Jeff Law
  1 sibling, 2 replies; 32+ messages in thread
From: Ian Lance Taylor @ 2009-02-13 19:01 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: GCC Development, Jeffrey A Law, Vladimir Makarov, Michael Matz

Paolo Bonzini <bonzini@gnu.org> writes:

>> That is in brief how I see it and there are a lot of reload details
>> missed (like virtual register eliminations or addressing displacement
>> constraints etc).
>
> I suppose those would stay in reload?

I see no reason for those to stay in reload (especially since I think
reload should disappear entirely).  It is reasonable to pick the total
maximum size of the stack frame, and thus resolve all displacement
constraints, before register allocation.  Carefully relaxing these
constraints during reload can give you slightly better results for some
instructions, but only in very very few cases, and only in functions
which already have unusually large stack frames.  I don't consider that
to be an important optimization.  Given that, we can determine the
maximum offset for all virtual registers before register allocation,
which suffices for selection of insn constraint alternatives, and then
determine the actual offset, once, after register allocation.

Ian

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-13 19:01     ` Ian Lance Taylor
@ 2009-02-13 19:38       ` Vladimir Makarov
  2009-02-16 16:25       ` Jeff Law
  1 sibling, 0 replies; 32+ messages in thread
From: Vladimir Makarov @ 2009-02-13 19:38 UTC (permalink / raw)
  To: Ian Lance Taylor
  Cc: Paolo Bonzini, GCC Development, Jeffrey A Law, Michael Matz

Ian Lance Taylor wrote:
> Paolo Bonzini <bonzini@gnu.org> writes:
>
>   
>>> That is in brief how I see it and there are a lot of reload details
>>> missed (like virtual register eliminations or addressing displacement
>>> constraints etc).
>>>       
>> I suppose those would stay in reload?
>>     
>
> I see no reason for those to stay in reload (especially since I think
> reload should disappear entirely).  It is reasonable to pick the total
> maximum size of the stack frame, and thus resolve all displacement
> constraints, before register allocation.  Carefully relaxing these
> constraints during reload can give you slightly better results for some
> instructions, but only in very very few cases, and only in functions
> which already have unusually large stack frames.  I don't consider that
> to be an important optimization.  Given that, we can determine the
> maximum offset for all virtual registers before register allocation,
> which suffices for selection of insn constraint alternatives, and then
> determine the actual offset, once, after register allocation.
>
>
>   
Ian, sorry for not to be clear which resulted in this misunderstanding.  
What I wrote was a scheme without reload pass.  I just mentioned some 
tasks of reload pass which should be solved in RA without the reload and 
whose solutions I did not discussed in details.  The solution could be 
what you propose for example.

Removing reload is a long way, imho. If we go this way, in the middle of 
it there would be still reload but solving less and less tasks.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-13 16:49   ` Paolo Bonzini
  2009-02-13 19:01     ` Ian Lance Taylor
@ 2009-02-13 19:54     ` Jeff Law
  2009-02-14 15:00       ` Steven Bosscher
  1 sibling, 1 reply; 32+ messages in thread
From: Jeff Law @ 2009-02-13 19:54 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: GCC Development, Vladimir Makarov, Michael Matz

Paolo Bonzini wrote:
>> As for copies,  I think it would be a bad decision to stick only to
>> original (after the code selection) alternative and generate copies to
>> satisfy this alternative.  For example, if pseudo got memory instead of
>> hard-register required by the alternative, it would be bad to generate a
>> copy (ld/st in this case) if memory is accepted by the insn.
>>     
>
> Yes, spilling is a special case (see also PR19398).  If something is
> spilled, "prereloads" could be rolled back for affected instructions and
>  recomputed (possibly forcing no change on the constraints of other
> registers).
>   
Ick.  This reminds me of the old old LRS code -- while it worked and was 
reasonably effective, it was bloody hard to understand and follow (one 
of major reasons the code never was contributed).    I don't see what 
this kind of approach buys us when compared to having IRA deal with 
spill code directly.
>   
>> That is in brief how I see it and there are a lot of reload details
>> missed (like virtual register eliminations or addressing displacement
>> constraints etc).
>>     
>
> I suppose those would stay in reload? 
Ideally they'd all move into IRA.  But I don't mind putting them off 
until after we have a solid handle on how to deal with the most common 
cases.



Jeff

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-13 19:54     ` Jeff Law
@ 2009-02-14 15:00       ` Steven Bosscher
  2009-02-16 15:53         ` Jeff Law
  0 siblings, 1 reply; 32+ messages in thread
From: Steven Bosscher @ 2009-02-14 15:00 UTC (permalink / raw)
  To: Jeff Law; +Cc: Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz

On Fri, Feb 13, 2009 at 8:53 PM, Jeff Law <law@redhat.com> wrote:
>>> That is in brief how I see it and there are a lot of reload details
>>> missed (like virtual register eliminations or addressing displacement
>>> constraints etc).
>>>
>>
>> I suppose those would stay in reload?
>
> Ideally they'd all move into IRA.

...and so, IRA became evil to destroy evil...  :-)

Gr.
Steven

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-14 15:00       ` Steven Bosscher
@ 2009-02-16 15:53         ` Jeff Law
  0 siblings, 0 replies; 32+ messages in thread
From: Jeff Law @ 2009-02-16 15:53 UTC (permalink / raw)
  To: Steven Bosscher
  Cc: Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz

Steven Bosscher wrote:
> On Fri, Feb 13, 2009 at 8:53 PM, Jeff Law <law@redhat.com> wrote:
>   
>>>> That is in brief how I see it and there are a lot of reload details
>>>> missed (like virtual register eliminations or addressing displacement
>>>> constraints etc).
>>>>
>>>>         
>>> I suppose those would stay in reload?
>>>       
>> Ideally they'd all move into IRA.
>>     
>
> ...and so, IRA became evil to destroy evil...  :-)
>   
Obviously the hope would be we could do things much more cleanly in the 
IRA code base.  Modeling spill code generation as a set of 
transformations on the conflict graph has some potential. 

jeff

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-13 19:01     ` Ian Lance Taylor
  2009-02-13 19:38       ` Vladimir Makarov
@ 2009-02-16 16:25       ` Jeff Law
  2009-02-16 20:33         ` Ian Lance Taylor
  1 sibling, 1 reply; 32+ messages in thread
From: Jeff Law @ 2009-02-16 16:25 UTC (permalink / raw)
  To: Ian Lance Taylor
  Cc: Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz

Ian Lance Taylor wrote:
> Paolo Bonzini <bonzini@gnu.org> writes:
>
>   
>>> That is in brief how I see it and there are a lot of reload details
>>> missed (like virtual register eliminations or addressing displacement
>>> constraints etc).
>>>       
>> I suppose those would stay in reload?
>>     
>
> I see no reason for those to stay in reload (especially since I think
> reload should disappear entirely).  It is reasonable to pick the total
> maximum size of the stack frame, and thus resolve all displacement
> constraints, before register allocation.  Carefully relaxing these
> constraints during reload can give you slightly better results for some
> instructions, but only in very very few cases, and only in functions
> which already have unusually large stack frames.  I don't consider that
> to be an important optimization.  Given that, we can determine the
> maximum offset for all virtual registers before register allocation,
> which suffices for selection of insn constraint alternatives, and then
> determine the actual offset, once, after register allocation.
>   
I would agree that careful relaxation of displacements is no longer as 
important as it once was, I don't think we can just  hand wave away the 
displacement issues

  1. The stack frames don't have to be that big to bump up against these 
problems.

  2. The code we generate if we have to reload the address because the 
displacement was out of range can be horrific

  3. There are targets where other registers used in the insn determine 
the range of the displacement.  ie, in a load from memory, the 
destination register used determines the valid range of displacements 
(+-16 bytes vs +-8k on one target I'm aware of.

  4. Register eliminations complicates matters as well.  Enough that I 
don't think you can set maximum offsets until you've finalized 
everything in the stack -- which implies that you're done spilling.



Jeff



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-13 16:37 ` Vladimir Makarov
  2009-02-13 16:49   ` Paolo Bonzini
@ 2009-02-16 16:40   ` Jeff Law
  2009-02-17 16:28     ` Vladimir Makarov
  1 sibling, 1 reply; 32+ messages in thread
From: Jeff Law @ 2009-02-16 16:40 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: GCC

Vladimir Makarov wrote:
> Jeff Law wrote:
>> I've been thinking further about instruction alternative selection 
>> prior to allocation and one of the questions in my mind is how this 
>> interacts with IRA.
>>
>> We select an alternative for each insn based on some "best guess" 
>> heuristic -- the selection of an alternative will often restrict the 
>> register classes available for each of the operands.  Presumably we'd 
>> want to encode that information it the conflict graph so that IRA 
>> would allocate registers so as to fit the constraints of the early 
>> insn alternative selection.  Right?   In the case where the graph is 
>> uncolorable, do we allow IRA to override the alternative selection, 
>> or do we insert copies to simplify the conflict graph or some mixture 
>> of both?
>>
>> Thoughts?
>
>
> As for copies,  I think it would be a bad decision to stick only to 
> original (after the code selection) alternative and generate copies to 
> satisfy this alternative.  For example, if pseudo got memory instead 
> of hard-register required by the alternative, it would be bad to 
> generate a copy (ld/st in this case) if memory is accepted by the insn.  
That's why I mentioned the possibility of relaxing the conflict graph to 
allow other alternatives if we find that the graph is uncolorable.    So 
if we initially wanted class A, but couldn't get it and the operand 
could accept class B, then we remove the conflict between the pseudo and 
the hard regs in class B and recolor.

I have no idea how expensive this would be.

This also implies that we're representing conflicts for register classes 
& memory in the conflict graph. 

Jeff



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-13 15:47   ` Michael Matz
@ 2009-02-16 16:46     ` Jeff Law
  2009-02-16 17:27       ` Michael Matz
  0 siblings, 1 reply; 32+ messages in thread
From: Jeff Law @ 2009-02-16 16:46 UTC (permalink / raw)
  To: Michael Matz; +Cc: Paolo Bonzini, Vladimir N. Makarov, GCC

Michael Matz wrote:
> Hi,
>
> On Fri, 13 Feb 2009, Paolo Bonzini wrote:
>
>   
>>> We'd want to encode [early insn alternative selection] information in 
>>> the conflict graph so that IRA would allocate registers so as to fit 
>>> the constraints of the early insn alternative selection.  Right?  In 
>>> the case where the graph is uncolorable, do we allow IRA to override 
>>> the alternative selection, or do we insert copies to simplify the 
>>> conflict graph or some mixture of both?
>>>       
>
> If the initial alternative selection was done cleverly (like chose the 
> alternatives allowing the largest register sets which don't immediately 
> create conflicting demands for a pseudo register) the opportunities for 
> making an uncolorable graph colorable by chosing another alternative will 
> be very small.  This can only happen if that new alternative somehow 
> allows for the uncolorable node a completely new set of register (like say 
> float instead of integer regs), which would mean also selecting other 
> alternatives for all instructions where this pseudo also is used.
>
> So it's not impossible, but I think it would happen relatively seldom that 
> changing the alternatives improves the situation.
>   
Of course.  However, we might want to pick a narrower class if it has a 
smaller cost.  The mn103 targets come to mind.  In general you're better 
off with d0-d3/a0-a3 as they're the cheapest (cost & space).  However, 
you've got some extended registers which can be used just like 
d0-d3/a0-a3, but which are more expensive (but still cheaper than memory).


Jeff

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-16 16:46     ` Jeff Law
@ 2009-02-16 17:27       ` Michael Matz
  0 siblings, 0 replies; 32+ messages in thread
From: Michael Matz @ 2009-02-16 17:27 UTC (permalink / raw)
  To: Jeff Law; +Cc: Paolo Bonzini, Vladimir N. Makarov, GCC

Hi,

On Mon, 16 Feb 2009, Jeff Law wrote:

> > If the initial alternative selection was done cleverly (like chose the 
> > alternatives allowing the largest register sets which don't 
> > immediately create conflicting demands for a pseudo register) the 
> > opportunities for making an uncolorable graph colorable by chosing 
> > another alternative will be very small.  This can only happen if that 
> > new alternative somehow allows for the uncolorable node a completely 
> > new set of register (like say float instead of integer regs), which 
> > would mean also selecting other alternatives for all instructions 
> > where this pseudo also is used.
> >
> > So it's not impossible, but I think it would happen relatively seldom 
> > that changing the alternatives improves the situation.
> >   
> Of course.  However, we might want to pick a narrower class if it has a 
> smaller cost.  The mn103 targets come to mind.  In general you're better 
> off with d0-d3/a0-a3 as they're the cheapest (cost & space).  However, 
> you've got some extended registers which can be used just like 
> d0-d3/a0-a3, but which are more expensive (but still cheaper than 
> memory).

I'd rather model this as a set of preferrable colors in the node.  If 
they're still free when coloring the node, good, if not, too bad, but 
there are still others to chose from.  This is more or less equivalent to 
chosing a different alternative, but more explicit for the coloring 
problem and with less ripple-down effects.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-16 16:25       ` Jeff Law
@ 2009-02-16 20:33         ` Ian Lance Taylor
  2009-02-16 23:15           ` Jeff Law
  0 siblings, 1 reply; 32+ messages in thread
From: Ian Lance Taylor @ 2009-02-16 20:33 UTC (permalink / raw)
  To: Jeff Law; +Cc: Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz

Jeff Law <law@redhat.com> writes:

> Ian Lance Taylor wrote:
>>
>> I see no reason for those to stay in reload (especially since I think
>> reload should disappear entirely).  It is reasonable to pick the total
>> maximum size of the stack frame, and thus resolve all displacement
>> constraints, before register allocation.  Carefully relaxing these
>> constraints during reload can give you slightly better results for some
>> instructions, but only in very very few cases, and only in functions
>> which already have unusually large stack frames.  I don't consider that
>> to be an important optimization.  Given that, we can determine the
>> maximum offset for all virtual registers before register allocation,
>> which suffices for selection of insn constraint alternatives, and then
>> determine the actual offset, once, after register allocation.
>>   
> I would agree that careful relaxation of displacements is no longer as
> important as it once was, I don't think we can just  hand wave away
> the displacement issues
>
>  1. The stack frames don't have to be that big to bump up against
> these problems.
>
>  2. The code we generate if we have to reload the address because the
> displacement was out of range can be horrific
>
>  3. There are targets where other registers used in the insn determine
> the range of the displacement.  ie, in a load from memory, the
> destination register used determines the valid range of displacements
> (+-16 bytes vs +-8k on one target I'm aware of.

In all of thse cases, the relaxation loop can only affect a handful of
instructions: the cases where saving a few less registers moves the
offset within range.  Those few instructions can only occur in a handful
of functions: the ones where the stack frame is so large that this
becomes an issue at all.

I'm not handwaving away displacement issues in general.  I'm handwaving
away the need to do relaxation, such that we adjust if we find that need
to save one more or one fewer register.  If we eliminate that relaxation
requirement, we can determine all displacements before register
allocation.

>  4. Register eliminations complicates matters as well.  Enough that I
> don't think you can set maximum offsets until you've finalized
> everything in the stack -- which implies that you're done spilling.

We clearly can set maximum offsets, if we are willing to sacrifice an
optimization.  I argue that that optimization is inconsequential for
99.9% of all code, and avoidable (through refactoring and good inline
heuristics) for 100% of all code.

Ian

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-16 20:33         ` Ian Lance Taylor
@ 2009-02-16 23:15           ` Jeff Law
  2009-02-17  3:57             ` Ian Lance Taylor
  2009-02-17 16:28             ` Vladimir Makarov
  0 siblings, 2 replies; 32+ messages in thread
From: Jeff Law @ 2009-02-16 23:15 UTC (permalink / raw)
  To: Ian Lance Taylor
  Cc: Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz

Ian Lance Taylor wrote:
> Jeff Law <law@redhat.com> writes:
>
>   
>> Ian Lance Taylor wrote:
>>     
>>> I see no reason for those to stay in reload (especially since I think
>>> reload should disappear entirely).  It is reasonable to pick the total
>>> maximum size of the stack frame, and thus resolve all displacement
>>> constraints, before register allocation.  Carefully relaxing these
>>> constraints during reload can give you slightly better results for some
>>> instructions, but only in very very few cases, and only in functions
>>> which already have unusually large stack frames.  I don't consider that
>>> to be an important optimization.  Given that, we can determine the
>>> maximum offset for all virtual registers before register allocation,
>>> which suffices for selection of insn constraint alternatives, and then
>>> determine the actual offset, once, after register allocation.
>>>   
>>>       
>> I would agree that careful relaxation of displacements is no longer as
>> important as it once was, I don't think we can just  hand wave away
>> the displacement issues
>>
>>  1. The stack frames don't have to be that big to bump up against
>> these problems.
>>
>>  2. The code we generate if we have to reload the address because the
>> displacement was out of range can be horrific
>>
>>  3. There are targets where other registers used in the insn determine
>> the range of the displacement.  ie, in a load from memory, the
>> destination register used determines the valid range of displacements
>> (+-16 bytes vs +-8k on one target I'm aware of.
>>     
>
> In all of thse cases, the relaxation loop can only affect a handful of
> instructions: the cases where saving a few less registers moves the
> offset within range.  Those few instructions can only occur in a handful
> of functions: the ones where the stack frame is so large that this
> becomes an issue at all.
>   
I disagree, particularly because of point #3.    I don't see how you can 
hand wave it away, that is unless you plan on just making every 
load/store of a stack variable/spill be assumed to be out of the +-16 
byte range which will generate absolutely horrible code.

On that particular target is isn't uncommon to have situations where you 
think you're going to be able to use the +-8k instruction, but because 
of spilling you end up using a different register and suddenly you're 
stuck with only being able to use +-16 byte offsets.


> I'm not handwaving away displacement issues in general.  I'm handwaving
> away the need to do relaxation, such that we adjust if we find that need
> to save one more or one fewer register.  If we eliminate that relaxation
> requirement, we can determine all displacements before register
> allocation.
>   
I still don't see it as that simple. 
>   
>>  4. Register eliminations complicates matters as well.  Enough that I
>> don't think you can set maximum offsets until you've finalized
>> everything in the stack -- which implies that you're done spilling.
>>     
>
> We clearly can set maximum offsets, if we are willing to sacrifice an
> optimization.  I argue that that optimization is inconsequential for
> 99.9% of all code, and avoidable (through refactoring and good inline
> heuristics) for 100% of all code.
>   
Without knowing the size of the frame, how do you plan on doing this 
without making the assumption that nothing is going to fit in the 
shorter displacement variants?  How can you do this when the range of 
valid displacements can change because the register you used got spilled 
and you got a register from a different class (which in turn has a 
drastically smaller set of valid displacements).

jeff


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-16 23:15           ` Jeff Law
@ 2009-02-17  3:57             ` Ian Lance Taylor
  2009-02-17 16:52               ` Bernd Schmidt
  2009-02-17 17:45               ` Jeff Law
  2009-02-17 16:28             ` Vladimir Makarov
  1 sibling, 2 replies; 32+ messages in thread
From: Ian Lance Taylor @ 2009-02-17  3:57 UTC (permalink / raw)
  To: Jeff Law; +Cc: Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz

Jeff Law <law@redhat.com> writes:

>>> I would agree that careful relaxation of displacements is no longer as
>>> important as it once was, I don't think we can just  hand wave away
>>> the displacement issues
>>>
>>>  1. The stack frames don't have to be that big to bump up against
>>> these problems.
>>>
>>>  2. The code we generate if we have to reload the address because the
>>> displacement was out of range can be horrific
>>>
>>>  3. There are targets where other registers used in the insn determine
>>> the range of the displacement.  ie, in a load from memory, the
>>> destination register used determines the valid range of displacements
>>> (+-16 bytes vs +-8k on one target I'm aware of.
>>>     
>>
>> In all of thse cases, the relaxation loop can only affect a handful of
>> instructions: the cases where saving a few less registers moves the
>> offset within range.  Those few instructions can only occur in a handful
>> of functions: the ones where the stack frame is so large that this
>> becomes an issue at all.
>>   
> I disagree, particularly because of point #3.    I don't see how you
> can hand wave it away, that is unless you plan on just making every
> load/store of a stack variable/spill be assumed to be out of the +-16
> byte range which will generate absolutely horrible code.

No, that makes no sense.  What I'm suggesting is that we fix the stack
offsets of all local variables before register allocation, based on a
conservative assessment of how many registers will be saved on the
stack.  Then we know during register allocation whether the memory
reference will be in or out of the +- 16 byte range.  What we lose is
the ability to discover that our conservative assessment was overly
conservative, and so actually some small number of instructions will be
generated as out of range when they could have been in range.  (Of
course we will pick up some of those cases using peepholes).


> Without knowing the size of the frame, how do you plan on doing this
> without making the assumption that nothing is going to fit in the
> shorter displacement variants?  How can you do this when the range of
> valid displacements can change because the register you used got
> spilled and you got a register from a different class (which in turn
> has a drastically smaller set of valid displacements).

I'm saying that you guess the size of the frame, so your premise does
not describe the aproach that I am suggesting.

Ian

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-16 23:15           ` Jeff Law
  2009-02-17  3:57             ` Ian Lance Taylor
@ 2009-02-17 16:28             ` Vladimir Makarov
  2009-02-17 17:47               ` Jeff Law
  1 sibling, 1 reply; 32+ messages in thread
From: Vladimir Makarov @ 2009-02-17 16:28 UTC (permalink / raw)
  To: Jeff Law; +Cc: Ian Lance Taylor, Paolo Bonzini, GCC Development, Michael Matz

Jeff Law wrote:
> Ian Lance Taylor wrote:
>> Jeff Law <law@redhat.com> writes:
>>
>>  
>>> Ian Lance Taylor wrote:
>>>    
>>>> I see no reason for those to stay in reload (especially since I think
>>>> reload should disappear entirely).  It is reasonable to pick the total
>>>> maximum size of the stack frame, and thus resolve all displacement
>>>> constraints, before register allocation.  Carefully relaxing these
>>>> constraints during reload can give you slightly better results for 
>>>> some
>>>> instructions, but only in very very few cases, and only in functions
>>>> which already have unusually large stack frames.  I don't consider 
>>>> that
>>>> to be an important optimization.  Given that, we can determine the
>>>> maximum offset for all virtual registers before register allocation,
>>>> which suffices for selection of insn constraint alternatives, and then
>>>> determine the actual offset, once, after register allocation.
>>>>         
>>> I would agree that careful relaxation of displacements is no longer as
>>> important as it once was, I don't think we can just  hand wave away
>>> the displacement issues
>>>
>>>  1. The stack frames don't have to be that big to bump up against
>>> these problems.
>>>
>>>  2. The code we generate if we have to reload the address because the
>>> displacement was out of range can be horrific
>>>
>>>  3. There are targets where other registers used in the insn determine
>>> the range of the displacement.  ie, in a load from memory, the
>>> destination register used determines the valid range of displacements
>>> (+-16 bytes vs +-8k on one target I'm aware of.
>>>     
>>
>> In all of thse cases, the relaxation loop can only affect a handful of
>> instructions: the cases where saving a few less registers moves the
>> offset within range.  Those few instructions can only occur in a handful
>> of functions: the ones where the stack frame is so large that this
>> becomes an issue at all.
>>   
> I disagree, particularly because of point #3.    I don't see how you 
> can hand wave it away, that is unless you plan on just making every 
> load/store of a stack variable/spill be assumed to be out of the +-16 
> byte range which will generate absolutely horrible code.
>
> On that particular target is isn't uncommon to have situations where 
> you think you're going to be able to use the +-8k instruction, but 
> because of spilling you end up using a different register and suddenly 
> you're stuck with only being able to use +-16 byte offsets.
>
>
According my old notes, there are plenty targets with small displacement 
issues: avr (0:64), c4x (-255:255), fr30 (-512:512 or 0:64 depending on 
data mode), ip2k (0:127), m68c11(0:256), m68k(-128:127), mcore(0:15, 
0:30 etc depending on date mode), mmix (0:255), xtensa (0:255 for QI) 
and famous sh (0:63).  I think not relaxing the displacements will 
really hurt the performance especially for targets with small register 
file like mcore.

IMHO, another important optimization what reload does for such 
processors is usage of already calculated displacement whose value is in 
hard register to calculate an new displacement (that was implemented by 
Joern for SH).

Although it is hard for me to say how these optimizations will 
complicate the new implementation.  May be we could sacrifice a small 
performance degradation for more clear code.  I don't know, only two 
tried implementations will show it.
>
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-16 16:40   ` Jeff Law
@ 2009-02-17 16:28     ` Vladimir Makarov
  0 siblings, 0 replies; 32+ messages in thread
From: Vladimir Makarov @ 2009-02-17 16:28 UTC (permalink / raw)
  To: Jeff Law; +Cc: GCC

Jeff Law wrote:
> Vladimir Makarov wrote:
>> Jeff Law wrote:
>>> I've been thinking further about instruction alternative selection 
>>> prior to allocation and one of the questions in my mind is how this 
>>> interacts with IRA.
>>>
>>> We select an alternative for each insn based on some "best guess" 
>>> heuristic -- the selection of an alternative will often restrict the 
>>> register classes available for each of the operands.  Presumably 
>>> we'd want to encode that information it the conflict graph so that 
>>> IRA would allocate registers so as to fit the constraints of the 
>>> early insn alternative selection.  Right?   In the case where the 
>>> graph is uncolorable, do we allow IRA to override the alternative 
>>> selection, or do we insert copies to simplify the conflict graph or 
>>> some mixture of both?
>>>
>>> Thoughts?
>>
>>
>> As for copies,  I think it would be a bad decision to stick only to 
>> original (after the code selection) alternative and generate copies 
>> to satisfy this alternative.  For example, if pseudo got memory 
>> instead of hard-register required by the alternative, it would be bad 
>> to generate a copy (ld/st in this case) if memory is accepted by the 
>> insn.  
> That's why I mentioned the possibility of relaxing the conflict graph 
> to allow other alternatives if we find that the graph is 
> uncolorable.    So if we initially wanted class A, but couldn't get it 
> and the operand could accept class B, then we remove the conflict 
> between the pseudo and the hard regs in class B and recolor.
>
> I have no idea how expensive this would be.
>
> This also implies that we're representing conflicts for register 
> classes & memory in the conflict graph.
>
The problem is that graph coloring working well on non-intersected 
register classes.  There were some articles how to make it work on 
intersected register classes.  The most known is "A Generalized 
Algorithm for Graph-Coloring Register Allocation":

http://www.cs.tufts.edu/~nr/pubs/gcra-abstract.html

I read this article with attention several times and even wrote some 
notes but unfortunately can not find it.  I got an impression that it 
will not work.

As for changing register class from one cover class (e.g. GENERAL_REGS) 
to another non-intersected one (FLOAT_REGS), it is possible but more 
rare than changing register class in one cover class (like from GENERAL 
to AREG in x86). The most common example is usage pseudo for moving 
memory values (e.g. generated from a[i] = b[i]).  It can be done through 
int or fp regs.  I've tried some graph coloring algorithm modification 
to permit register class changing when register pressure for original 
class is high and one for new class is low but got mixed results on x86 
and ppc (some tests were worse some were better with practically the 
same average result).


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-17  3:57             ` Ian Lance Taylor
@ 2009-02-17 16:52               ` Bernd Schmidt
  2009-02-17 18:37                 ` Steven Bosscher
  2009-02-17 20:01                 ` Ian Lance Taylor
  2009-02-17 17:45               ` Jeff Law
  1 sibling, 2 replies; 32+ messages in thread
From: Bernd Schmidt @ 2009-02-17 16:52 UTC (permalink / raw)
  To: Ian Lance Taylor
  Cc: Jeff Law, Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz

Ian Lance Taylor wrote:

> No, that makes no sense.  What I'm suggesting is that we fix the stack
> offsets of all local variables before register allocation, based on a
> conservative assessment of how many registers will be saved on the
> stack.

The conservative assessment is that all pseudos go on the stack.
However, this way you'll generate terrible code.

I don't really understand why people want to remove reload only to
implement the same thing elsewhere.  There are only two major problems
with reload IMO:
 - the RELOAD_FOR_xxx mechanism of scheduling reload insns is terrible
 - so is the inheritance code
Even so, the number of bugs in these seems to have dropped off over the
years.

If you replace these two with a cleaner solution, you'll end up with a
fairly clean and easy to understand reload pass.  We'd still want the
register allocator to be strong enough not to leave too much work for
it, but I simply don't see how reload can be entirely replaced in gcc.


Bernd
-- 
This footer brought to you by insane German lawmakers.
Analog Devices GmbH      Wilhelm-Wagenfeld-Str. 6      80807 Muenchen
Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 40368
Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-17  3:57             ` Ian Lance Taylor
  2009-02-17 16:52               ` Bernd Schmidt
@ 2009-02-17 17:45               ` Jeff Law
  2009-02-17 19:57                 ` Ian Lance Taylor
  1 sibling, 1 reply; 32+ messages in thread
From: Jeff Law @ 2009-02-17 17:45 UTC (permalink / raw)
  To: Ian Lance Taylor
  Cc: Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz

Ian Lance Taylor wrote:
> Jeff Law <law@redhat.com> writes:
>
>   
>>>> I would agree that careful relaxation of displacements is no longer as
>>>> important as it once was, I don't think we can just  hand wave away
>>>> the displacement issues
>>>>
>>>>  1. The stack frames don't have to be that big to bump up against
>>>> these problems.
>>>>
>>>>  2. The code we generate if we have to reload the address because the
>>>> displacement was out of range can be horrific
>>>>
>>>>  3. There are targets where other registers used in the insn determine
>>>> the range of the displacement.  ie, in a load from memory, the
>>>> destination register used determines the valid range of displacements
>>>> (+-16 bytes vs +-8k on one target I'm aware of.
>>>>     
>>>>         
>>> In all of thse cases, the relaxation loop can only affect a handful of
>>> instructions: the cases where saving a few less registers moves the
>>> offset within range.  Those few instructions can only occur in a handful
>>> of functions: the ones where the stack frame is so large that this
>>> becomes an issue at all.
>>>   
>>>       
>> I disagree, particularly because of point #3.    I don't see how you
>> can hand wave it away, that is unless you plan on just making every
>> load/store of a stack variable/spill be assumed to be out of the +-16
>> byte range which will generate absolutely horrible code.
>>     
>
> No, that makes no sense.  What I'm suggesting is that we fix the stack
> offsets of all local variables before register allocation, based on a
> conservative assessment of how many registers will be saved on the
> stack.  Then we know during register allocation whether the memory
> reference will be in or out of the +- 16 byte range. 
Just fixing the offset is not sufficient for the case I'm concerned 
about because you don't know what offsets are valid until you know 
precisely what registers are used in the insn.   That's absolutely 
critical.  So you have to assume worst case which is you can only use 
the +-16 byte insns which will pessimize just about every stack reference.

Perhaps the confusion in this case is because we're not really relaxing 
-- we're dealing with a case where the offset alone isn't enough to 
determine if the addressing mode is valid.






>  What we lose is
> the ability to discover that our conservative assessment was overly
> conservative, and so actually some small number of instructions will be
> generated as out of range when they could have been in range.  (Of
> course we will pick up some of those cases using peepholes).
>
>   
In the case I'm talking about, it will affect a huge number of insns.
>> Without knowing the size of the frame, how do you plan on doing this
>> without making the assumption that nothing is going to fit in the
>> shorter displacement variants?  How can you do this when the range of
>> valid displacements can change because the register you used got
>> spilled and you got a register from a different class (which in turn
>> has a drastically smaller set of valid displacements).
>>     
>
> I'm saying that you guess the size of the frame, so your premise does
> not describe the aproach that I am suggesting.
>   
But the offsets are relative to the size of the frame.  So I don't see 
how you can set the offsets without knowing the size of the frame.

Clearly neither of us is understanding what the other is saying.

Jeff

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-17 16:28             ` Vladimir Makarov
@ 2009-02-17 17:47               ` Jeff Law
  0 siblings, 0 replies; 32+ messages in thread
From: Jeff Law @ 2009-02-17 17:47 UTC (permalink / raw)
  To: Vladimir Makarov
  Cc: Ian Lance Taylor, Paolo Bonzini, GCC Development, Michael Matz

Vladimir Makarov wrote:
>
> IMHO, another important optimization what reload does for such 
> processors is usage of already calculated displacement whose value is 
> in hard register to calculate an new displacement (that was 
> implemented by Joern for SH).
ISTM this optimization could be done with a pre-variant rather than the 
ad-hoc implementation we have now.


Jeff

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-17 16:52               ` Bernd Schmidt
@ 2009-02-17 18:37                 ` Steven Bosscher
  2009-02-17 20:01                 ` Ian Lance Taylor
  1 sibling, 0 replies; 32+ messages in thread
From: Steven Bosscher @ 2009-02-17 18:37 UTC (permalink / raw)
  To: Bernd Schmidt
  Cc: Ian Lance Taylor, Jeff Law, Paolo Bonzini, GCC Development,
	Vladimir Makarov, Michael Matz

On Tue, Feb 17, 2009 at 5:51 PM, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
> The conservative assessment is that all pseudos go on the stack.
> However, this way you'll generate terrible code.

Yes. So make a not-too-optimistic estimate. And if it turns out your
estimate is too small, then you roll back, update the estimate, and
try again.  This only works if you can make the "roll back" part work,
but it's been done before (not for GCC, of course).

> I don't really understand why people want to remove reload only to
> implement the same thing elsewhere.

I don't think "people" want to remove reload. Simplifying it, is
probably a better description. Splitting out tasks not related to
spilling.  Trying to do the same things in a place where you can still
create a pseudo, instead of trying really hard to find a register and
change many choices of the register allocator while at it.

>  There are only two major problems
> with reload IMO:
>  - the RELOAD_FOR_xxx mechanism of scheduling reload insns is terrible
>  - so is the inheritance code
> Even so, the number of bugs in these seems to have dropped off over the
> years.

There is at least one more major problem with reload: It performs
instruction selection.  A big issue with register allocation in GCC is
that the register allocator doesn't even really know what instruction
it is allocating registers for!

Gr.
Steven

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-17 17:45               ` Jeff Law
@ 2009-02-17 19:57                 ` Ian Lance Taylor
  2009-02-20 21:58                   ` Jeff Law
  0 siblings, 1 reply; 32+ messages in thread
From: Ian Lance Taylor @ 2009-02-17 19:57 UTC (permalink / raw)
  To: Jeff Law; +Cc: Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz

Jeff Law <law@redhat.com> writes:

>> No, that makes no sense.  What I'm suggesting is that we fix the stack
>> offsets of all local variables before register allocation, based on a
>> conservative assessment of how many registers will be saved on the
>> stack.  Then we know during register allocation whether the memory
>> reference will be in or out of the +- 16 byte range. 
> Just fixing the offset is not sufficient for the case I'm concerned
> about because you don't know what offsets are valid until you know
> precisely what registers are used in the insn.   That's absolutely
> critical.  So you have to assume worst case which is you can only use
> the +-16 byte insns which will pessimize just about every stack
> reference.

You fix the offset of the value stored on the stack.  Given that, you
know which instruction you can use.

> Perhaps the confusion in this case is because we're not really
> relaxing -- we're dealing with a case where the offset alone isn't
> enough to determine if the addressing mode is valid.

The confusion may be that we may be talking about different things.
I'm talking about removing this loop in the reload function:

  /* This loop scans the entire function each go-round
     and repeats until one repetition spills no additional hard regs.  */


>>> Without knowing the size of the frame, how do you plan on doing this
>>> without making the assumption that nothing is going to fit in the
>>> shorter displacement variants?  How can you do this when the range of
>>> valid displacements can change because the register you used got
>>> spilled and you got a register from a different class (which in turn
>>> has a drastically smaller set of valid displacements).
>>>     
>>
>> I'm saying that you guess the size of the frame, so your premise does
>> not describe the aproach that I am suggesting.
>>   
> But the offsets are relative to the size of the frame.  So I don't see
> how you can set the offsets without knowing the size of the frame.

You know the size of the frame, by guessing conservatively as to how
many registers will be required.

Ian

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-17 16:52               ` Bernd Schmidt
  2009-02-17 18:37                 ` Steven Bosscher
@ 2009-02-17 20:01                 ` Ian Lance Taylor
  2009-02-20 16:25                   ` Denis Chertykov
  2009-02-20 22:04                   ` Jeff Law
  1 sibling, 2 replies; 32+ messages in thread
From: Ian Lance Taylor @ 2009-02-17 20:01 UTC (permalink / raw)
  To: Bernd Schmidt
  Cc: Jeff Law, Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz

Bernd Schmidt <bernds_cb1@t-online.de> writes:

> Ian Lance Taylor wrote:
>
>> No, that makes no sense.  What I'm suggesting is that we fix the stack
>> offsets of all local variables before register allocation, based on a
>> conservative assessment of how many registers will be saved on the
>> stack.
>
> The conservative assessment is that all pseudos go on the stack.
> However, this way you'll generate terrible code.

I think the number of pseudos on the stack is irrelevant.  You put each
one farther down the stack at a known offset.  The only interesting
number to choose is how many registers will need to be saved on the
stack.

> I don't really understand why people want to remove reload only to
> implement the same thing elsewhere.  There are only two major problems
> with reload IMO:
>  - the RELOAD_FOR_xxx mechanism of scheduling reload insns is terrible
>  - so is the inheritance code
> Even so, the number of bugs in these seems to have dropped off over the
> years.

The problem with reload is that it interferes with register allocation.
Even if gcc had a perfect register allocator, we would still generate
suboptimal code because reload would mess up the allocation.  The key to
solving that problem is that we need to do instruction selection (in gcc
terms, picking which insn alternative to use) either before register
allocation or in conjunction with it.

Ian

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-17 20:01                 ` Ian Lance Taylor
@ 2009-02-20 16:25                   ` Denis Chertykov
  2009-02-20 22:04                   ` Jeff Law
  1 sibling, 0 replies; 32+ messages in thread
From: Denis Chertykov @ 2009-02-20 16:25 UTC (permalink / raw)
  To: Ian Lance Taylor
  Cc: Bernd Schmidt, Jeff Law, Paolo Bonzini, GCC Development,
	Vladimir Makarov, Michael Matz

2009/2/17 Ian Lance Taylor <iant@google.com>:
>
> The problem with reload is that it interferes with register allocation.
> Even if gcc had a perfect register allocator, we would still generate
> suboptimal code because reload would mess up the allocation.  The key to
> solving that problem is that we need to do instruction selection (in gcc
> terms, picking which insn alternative to use) either before register
> allocation or in conjunction with it.

I have a strong opinion that better to have a right insn's before the
register allocation.
(right insn is an insn which satisfies it's constraints.)
Because of that idea I have realised pre-relod in new-ra. (I'm agree
that pre-reload is an ugly pieces of code)
Even more, today I have a set of experiments with pre-reload pass before IRA.
I have founded that IRA costs calculation process work better if I
have a right insn's before IRA.

Denis.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-17 19:57                 ` Ian Lance Taylor
@ 2009-02-20 21:58                   ` Jeff Law
  2009-02-21 16:40                     ` Ian Lance Taylor
  0 siblings, 1 reply; 32+ messages in thread
From: Jeff Law @ 2009-02-20 21:58 UTC (permalink / raw)
  To: Ian Lance Taylor
  Cc: Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz

Ian Lance Taylor wrote:
> Jeff Law <law@redhat.com> writes:
>
>   
>>> No, that makes no sense.  What I'm suggesting is that we fix the stack
>>> offsets of all local variables before register allocation, based on a
>>> conservative assessment of how many registers will be saved on the
>>> stack.  Then we know during register allocation whether the memory
>>> reference will be in or out of the +- 16 byte range. 
>>>       
>> Just fixing the offset is not sufficient for the case I'm concerned
>> about because you don't know what offsets are valid until you know
>> precisely what registers are used in the insn.   That's absolutely
>> critical.  So you have to assume worst case which is you can only use
>> the +-16 byte insns which will pessimize just about every stack
>> reference.
>>     
>
> You fix the offset of the value stored on the stack.  Given that, you
> know which instruction you can use.
>   
No, it's not that simple.  If you spill, then you can change the 
registers that are used and that in turn changes the set of valid 
offsets in reg+d instructions.
>   
>> Perhaps the confusion in this case is because we're not really
>> relaxing -- we're dealing with a case where the offset alone isn't
>> enough to determine if the addressing mode is valid.
>>     
>
> The confusion may be that we may be talking about different things.
> I'm talking about removing this loop in the reload function:
>
>   /* This loop scans the entire function each go-round
>      and repeats until one repetition spills no additional hard regs.  */
>   
Most of that loop's real work isn't in the address relaxation code, but 
in digging through instructions to determine what needs to be reloaded 
and how many registers are necessary to perform the requested reloads.

It is unfortunate that spilling and allocating stack slots requires the 
loop to iterate again -- I do believe it is possible to remove some 
silly iterating, but removing the whole loop is a pipe dream at the moment.


>
>   
>>>> Without knowing the size of the frame, how do you plan on doing this
>>>> without making the assumption that nothing is going to fit in the
>>>> shorter displacement variants?  How can you do this when the range of
>>>> valid displacements can change because the register you used got
>>>> spilled and you got a register from a different class (which in turn
>>>> has a drastically smaller set of valid displacements).
>>>>     
>>>>         
>>> I'm saying that you guess the size of the frame, so your premise does
>>> not describe the aproach that I am suggesting.
>>>   
>>>       
>> But the offsets are relative to the size of the frame.  So I don't see
>> how you can set the offsets without knowing the size of the frame.
>>     
>
> You know the size of the frame, by guessing conservatively as to how
> many registers will be required.
>   
A conservative guess would be something like "every pseudo needs a slot" 
-- anything less than that isn't conservative, it's  optimistic.  
Guessing conservative is going to be wasteful on so many levels.  
Guessing optimistic doesn't solve any of the problems we want to solve 
as we still need to deal with the case where our guess was wrong.

jeff


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-17 20:01                 ` Ian Lance Taylor
  2009-02-20 16:25                   ` Denis Chertykov
@ 2009-02-20 22:04                   ` Jeff Law
  1 sibling, 0 replies; 32+ messages in thread
From: Jeff Law @ 2009-02-20 22:04 UTC (permalink / raw)
  To: Ian Lance Taylor
  Cc: Bernd Schmidt, Paolo Bonzini, GCC Development, Vladimir Makarov,
	Michael Matz

Ian Lance Taylor wrote:
> Bernd Schmidt <bernds_cb1@t-online.de> writes:
>
>   
>> Ian Lance Taylor wrote:
>>
>>     
>>> No, that makes no sense.  What I'm suggesting is that we fix the stack
>>> offsets of all local variables before register allocation, based on a
>>> conservative assessment of how many registers will be saved on the
>>> stack.
>>>       
>> The conservative assessment is that all pseudos go on the stack.
>> However, this way you'll generate terrible code.
>>     
>
> I think the number of pseudos on the stack is irrelevant.  You put each
> one farther down the stack at a known offset.  The only interesting
> number to choose is how many registers will need to be saved on the
> stack.
>   
Allocation of slots isn't that simple because you might be spilling 
later, which in turn allocates a new stack slot.  Note that you want the 
new stack slot to actually be at a smaller address because it's likely 
used far more often than the stack slots for pseudos that didn't get a 
hard register.

You also can't make assumptions about stack/frame growth directions and 
an address that is valid as fp+offset may not be valid as sp+offset'.

I think you're vastly over-simplifying the code in question.

>
>   
>> I don't really understand why people want to remove reload only to
>> implement the same thing elsewhere.  There are only two major problems
>> with reload IMO:
>>  - the RELOAD_FOR_xxx mechanism of scheduling reload insns is terrible
>>  - so is the inheritance code
>> Even so, the number of bugs in these seems to have dropped off over the
>> years.
>>     
>
> The problem with reload is that it interferes with register allocation.
> Even if gcc had a perfect register allocator, we would still generate
> suboptimal code because reload would mess up the allocation.  The key to
> solving that problem is that we need to do instruction selection (in gcc
> terms, picking which insn alternative to use) either before register
> allocation or in conjunction with it.
>   
I agree that instruction selection prior to or as part of allocation is 
good.  But until we're dealing with spills in the allocator as well, 
we're still in a situation where reload can come along and muck things 
up.   Hence my desire to move more of the spill code generation into IRA.

Jeff

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-20 21:58                   ` Jeff Law
@ 2009-02-21 16:40                     ` Ian Lance Taylor
  2009-02-22  2:23                       ` Jeff Law
  2009-02-22  2:35                       ` Jeff Law
  0 siblings, 2 replies; 32+ messages in thread
From: Ian Lance Taylor @ 2009-02-21 16:40 UTC (permalink / raw)
  To: Jeff Law; +Cc: Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz

Jeff Law <law@redhat.com> writes:

> Ian Lance Taylor wrote:
>> Jeff Law <law@redhat.com> writes:
>>
>>   
>>>> No, that makes no sense.  What I'm suggesting is that we fix the stack
>>>> offsets of all local variables before register allocation, based on a
>>>> conservative assessment of how many registers will be saved on the
>>>> stack.  Then we know during register allocation whether the memory
>>>> reference will be in or out of the +- 16 byte range.       
>>> Just fixing the offset is not sufficient for the case I'm concerned
>>> about because you don't know what offsets are valid until you know
>>> precisely what registers are used in the insn.   That's absolutely
>>> critical.  So you have to assume worst case which is you can only use
>>> the +-16 byte insns which will pessimize just about every stack
>>> reference.
>>>     
>>
>> You fix the offset of the value stored on the stack.  Given that, you
>> know which instruction you can use.
>>   
> No, it's not that simple.  If you spill, then you can change the
> registers that are used and that in turn changes the set of valid
> offsets in reg+d instructions.

We are just talking in circles.  As I have been saying all along, I'm
suggesting that gcc conservatively determine which registers will be
used.  Then spilling will not change the stack space allocated for saved
registers.


>>> Perhaps the confusion in this case is because we're not really
>>> relaxing -- we're dealing with a case where the offset alone isn't
>>> enough to determine if the addressing mode is valid.
>>>     
>>
>> The confusion may be that we may be talking about different things.
>> I'm talking about removing this loop in the reload function:
>>
>>   /* This loop scans the entire function each go-round
>>      and repeats until one repetition spills no additional hard regs.  */
>>   
> Most of that loop's real work isn't in the address relaxation code,
> but in digging through instructions to determine what needs to be
> reloaded and how many registers are necessary to perform the requested
> reloads.

You are not using the word "relaxation" in the same sense that I meant
it.  I meant explicitly the process of finding the final set of
elimination offsets.


>> You know the size of the frame, by guessing conservatively as to how
>> many registers will be required.
>>   
> A conservative guess would be something like "every pseudo needs a
> slot" -- anything less than that isn't conservative, it's  optimistic.

Not a conservative guess at pseudo registers, a conservative guess at
hard registers.

I'm sorry, but I think I'm done with this thread.  I'm just repeating
myself, and people are critiquing things which I am not saying.

Ian

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-21 16:40                     ` Ian Lance Taylor
@ 2009-02-22  2:23                       ` Jeff Law
  2009-02-22  2:35                       ` Jeff Law
  1 sibling, 0 replies; 32+ messages in thread
From: Jeff Law @ 2009-02-22  2:23 UTC (permalink / raw)
  To: Ian Lance Taylor
  Cc: Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz

Ian Lance Taylor wrote:
> Not a conservative guess at pseudo registers, a conservative guess at
> hard registers.
>
> I'm sorry, but I think I'm done with this thread.  I'm just repeating
> myself, and people are critiquing things which I am not saying.
>   
FWIW, I'm not trying to be dense -- I respect and value your opinions.  
We're clearly not communicating well.

jeff

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: IRA conflict graph & alternative selection
  2009-02-21 16:40                     ` Ian Lance Taylor
  2009-02-22  2:23                       ` Jeff Law
@ 2009-02-22  2:35                       ` Jeff Law
  1 sibling, 0 replies; 32+ messages in thread
From: Jeff Law @ 2009-02-22  2:35 UTC (permalink / raw)
  To: Ian Lance Taylor
  Cc: Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz

Ian Lance Taylor wrote:
> Jeff Law <law@redhat.com> writes:
>
>   
>> Ian Lance Taylor wrote:
>>     
>>> Jeff Law <law@redhat.com> writes:
>>>
>>>   
>>>       
>>>>> No, that makes no sense.  What I'm suggesting is that we fix the stack
>>>>> offsets of all local variables before register allocation, based on a
>>>>> conservative assessment of how many registers will be saved on the
>>>>> stack.  Then we know during register allocation whether the memory
>>>>> reference will be in or out of the +- 16 byte range.       
>>>>>           
>>>> Just fixing the offset is not sufficient for the case I'm concerned
>>>> about because you don't know what offsets are valid until you know
>>>> precisely what registers are used in the insn.   That's absolutely
>>>> critical.  So you have to assume worst case which is you can only use
>>>> the +-16 byte insns which will pessimize just about every stack
>>>> reference.
>>>>     
>>>>         
>>> You fix the offset of the value stored on the stack.  Given that, you
>>> know which instruction you can use.
>>>   
>>>       
>> No, it's not that simple.  If you spill, then you can change the
>> registers that are used and that in turn changes the set of valid
>> offsets in reg+d instructions.
>>     
>
> We are just talking in circles.  As I have been saying all along, I'm
> suggesting that gcc conservatively determine which registers will be
> used.  Then spilling will not change the stack space allocated for saved
> registers.
>   
Conservatively selecting registers for this case would make the 
generated code so bad on this target that GCC would become unusable.  
>>
>> Most of that loop's real work isn't in the address relaxation code,
>> but in digging through instructions to determine what needs to be
>> reloaded and how many registers are necessary to perform the requested
>> reloads.
>>     
>
> You are not using the word "relaxation" in the same sense that I meant
> it.  I meant explicitly the process of finding the final set of
> elimination offsets.
>   
Actually I think I am.  Very little work of the loop is related to 
finding the final set of elimination offsets.  Furthermore, most of the 
time we run through that loop precisely once.   Of the remaining cases 
where the loop iterates more than once, 25% can be trivially eliminated 
through better sequencing of operations in the loop.




>
>
> Not a conservative guess at pseudo registers, a conservative guess at
> hard registers.
>   
OK.  I see the difference here, but it doesn't change the problem at 
hand -- a conservative guess at what hard registers we're going to use 
would pessimize the code horribly, possibly more so than making a 
conservative guess about what offsets are going to be valid -- a 
conservative guess on the registers used for these memory operations 
would have a cascading effect throughout the code causing a ton of 
reloading of other instructions.

Jeff

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2009-02-22  2:35 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-02-13 15:13 IRA conflict graph & alternative selection Jeff Law
2009-02-13 15:32 ` Paolo Bonzini
2009-02-13 15:47   ` Michael Matz
2009-02-16 16:46     ` Jeff Law
2009-02-16 17:27       ` Michael Matz
2009-02-13 17:01   ` Jeff Law
2009-02-13 16:37 ` Vladimir Makarov
2009-02-13 16:49   ` Paolo Bonzini
2009-02-13 19:01     ` Ian Lance Taylor
2009-02-13 19:38       ` Vladimir Makarov
2009-02-16 16:25       ` Jeff Law
2009-02-16 20:33         ` Ian Lance Taylor
2009-02-16 23:15           ` Jeff Law
2009-02-17  3:57             ` Ian Lance Taylor
2009-02-17 16:52               ` Bernd Schmidt
2009-02-17 18:37                 ` Steven Bosscher
2009-02-17 20:01                 ` Ian Lance Taylor
2009-02-20 16:25                   ` Denis Chertykov
2009-02-20 22:04                   ` Jeff Law
2009-02-17 17:45               ` Jeff Law
2009-02-17 19:57                 ` Ian Lance Taylor
2009-02-20 21:58                   ` Jeff Law
2009-02-21 16:40                     ` Ian Lance Taylor
2009-02-22  2:23                       ` Jeff Law
2009-02-22  2:35                       ` Jeff Law
2009-02-17 16:28             ` Vladimir Makarov
2009-02-17 17:47               ` Jeff Law
2009-02-13 19:54     ` Jeff Law
2009-02-14 15:00       ` Steven Bosscher
2009-02-16 15:53         ` Jeff Law
2009-02-16 16:40   ` Jeff Law
2009-02-17 16:28     ` Vladimir Makarov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).