* IRA conflict graph & alternative selection
@ 2009-02-13 15:13 Jeff Law
2009-02-13 15:32 ` Paolo Bonzini
2009-02-13 16:37 ` Vladimir Makarov
0 siblings, 2 replies; 32+ messages in thread
From: Jeff Law @ 2009-02-13 15:13 UTC (permalink / raw)
To: Vladimir N. Makarov; +Cc: GCC
I've been thinking further about instruction alternative selection prior
to allocation and one of the questions in my mind is how this interacts
with IRA.
We select an alternative for each insn based on some "best guess"
heuristic -- the selection of an alternative will often restrict the
register classes available for each of the operands. Presumably we'd
want to encode that information it the conflict graph so that IRA would
allocate registers so as to fit the constraints of the early insn
alternative selection. Right? In the case where the graph is
uncolorable, do we allow IRA to override the alternative selection, or
do we insert copies to simplify the conflict graph or some mixture of both?
Thoughts?
jeff
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-13 15:13 IRA conflict graph & alternative selection Jeff Law
@ 2009-02-13 15:32 ` Paolo Bonzini
2009-02-13 15:47 ` Michael Matz
2009-02-13 17:01 ` Jeff Law
2009-02-13 16:37 ` Vladimir Makarov
1 sibling, 2 replies; 32+ messages in thread
From: Paolo Bonzini @ 2009-02-13 15:32 UTC (permalink / raw)
To: Jeff Law; +Cc: Vladimir N. Makarov, GCC
Jeff Law wrote:
> We'd want to encode [early insn alternative selection]
> information in the conflict graph so that IRA would
> allocate registers so as to fit the constraints of the early insn
> alternative selection. Right? In the case where the graph is
> uncolorable, do we allow IRA to override the alternative selection, or
> do we insert copies to simplify the conflict graph or some mixture of both?
Inserting compensation code, for example copies, can be seen as some
kind of "pre-reload" as it was used on new-ra branch; the problem with
pre-reload was that it was built on "cp reload1.c pre-reload.c", so it
was not much less complicated than reload.
Paolo
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-13 15:32 ` Paolo Bonzini
@ 2009-02-13 15:47 ` Michael Matz
2009-02-16 16:46 ` Jeff Law
2009-02-13 17:01 ` Jeff Law
1 sibling, 1 reply; 32+ messages in thread
From: Michael Matz @ 2009-02-13 15:47 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Jeff Law, Vladimir N. Makarov, GCC
Hi,
On Fri, 13 Feb 2009, Paolo Bonzini wrote:
> > We'd want to encode [early insn alternative selection] information in
> > the conflict graph so that IRA would allocate registers so as to fit
> > the constraints of the early insn alternative selection. Right? In
> > the case where the graph is uncolorable, do we allow IRA to override
> > the alternative selection, or do we insert copies to simplify the
> > conflict graph or some mixture of both?
If the initial alternative selection was done cleverly (like chose the
alternatives allowing the largest register sets which don't immediately
create conflicting demands for a pseudo register) the opportunities for
making an uncolorable graph colorable by chosing another alternative will
be very small. This can only happen if that new alternative somehow
allows for the uncolorable node a completely new set of register (like say
float instead of integer regs), which would mean also selecting other
alternatives for all instructions where this pseudo also is used.
So it's not impossible, but I think it would happen relatively seldom that
changing the alternatives improves the situation.
> Inserting compensation code, for example copies, can be seen as some
> kind of "pre-reload" as it was used on new-ra branch; the problem with
> pre-reload was that it was built on "cp reload1.c pre-reload.c", so it
> was not much less complicated than reload.
Unnecessarily so, though. Faster to have something working, but as the
problem is much easier (and you still can reasonably create new pseudos)
also the solution can be. It just never was followed through.
Ciao,
Michael.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-13 15:13 IRA conflict graph & alternative selection Jeff Law
2009-02-13 15:32 ` Paolo Bonzini
@ 2009-02-13 16:37 ` Vladimir Makarov
2009-02-13 16:49 ` Paolo Bonzini
2009-02-16 16:40 ` Jeff Law
1 sibling, 2 replies; 32+ messages in thread
From: Vladimir Makarov @ 2009-02-13 16:37 UTC (permalink / raw)
To: Jeff Law; +Cc: GCC
Jeff Law wrote:
> I've been thinking further about instruction alternative selection
> prior to allocation and one of the questions in my mind is how this
> interacts with IRA.
>
> We select an alternative for each insn based on some "best guess"
> heuristic -- the selection of an alternative will often restrict the
> register classes available for each of the operands. Presumably we'd
> want to encode that information it the conflict graph so that IRA
> would allocate registers so as to fit the constraints of the early
> insn alternative selection. Right? In the case where the graph is
> uncolorable, do we allow IRA to override the alternative selection, or
> do we insert copies to simplify the conflict graph or some mixture of
> both?
>
> Thoughts?
Sticking with one insn alternative (by alternative I mean set of operand
constraints - one constraint for one operand. That is different from
RTL insn alternative which is superset of such set) permits more
accurate (and much faster) cost calculation of register class costs for
pseudos which improves coloring and spilling decisions especially for
CISC architectures likes x86, x86_64. On the other hand it rejects some
more possible alternatives. What is more important? I don't know.
Therefore I wrote about it as an experiment. That is ok because to get
current state IRA algorithms I tried probably 5 times more different
coloring algorithms.
As for copies, I think it would be a bad decision to stick only to
original (after the code selection) alternative and generate copies to
satisfy this alternative. For example, if pseudo got memory instead of
hard-register required by the alternative, it would be bad to generate a
copy (ld/st in this case) if memory is accepted by the insn. That is
because a new pseudo will need a hard register and could result in
spilling another pseudo. And this can not be fixed by a simple
subsequent pass removing ld/st by using memory operands. So I think if
the altrernative is not satisfied then we should choose the cheapest one
and use it for the insn. Still some copies might be needed to satisfy
the new alternative. After changing some insn alternatives and
generating copies, choosing alternative for them and generating new
pseudos we modify register class costs of the pseudos and calculate
costs for the new pseudos (it should be a fast process because we need
to change costs only for pseudos involved in insns with changed
alternatives and because again each insn has only one alternative).
After cost recalculation, another round of coloring is done.
That is in brief how I see it and there are a lot of reload details
missed (like virtual register eliminations or addressing displacement
constraints etc).
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-13 16:37 ` Vladimir Makarov
@ 2009-02-13 16:49 ` Paolo Bonzini
2009-02-13 19:01 ` Ian Lance Taylor
2009-02-13 19:54 ` Jeff Law
2009-02-16 16:40 ` Jeff Law
1 sibling, 2 replies; 32+ messages in thread
From: Paolo Bonzini @ 2009-02-13 16:49 UTC (permalink / raw)
To: GCC Development, Jeffrey A Law, Vladimir Makarov, Michael Matz
> As for copies, I think it would be a bad decision to stick only to
> original (after the code selection) alternative and generate copies to
> satisfy this alternative. For example, if pseudo got memory instead of
> hard-register required by the alternative, it would be bad to generate a
> copy (ld/st in this case) if memory is accepted by the insn.
Yes, spilling is a special case (see also PR19398). If something is
spilled, "prereloads" could be rolled back for affected instructions and
recomputed (possibly forcing no change on the constraints of other
registers).
> That is in brief how I see it and there are a lot of reload details
> missed (like virtual register eliminations or addressing displacement
> constraints etc).
I suppose those would stay in reload? Michael, what were your
intentions for the insn-sel branch? (Sorry to remind you of that :-P).
Paolo
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-13 15:32 ` Paolo Bonzini
2009-02-13 15:47 ` Michael Matz
@ 2009-02-13 17:01 ` Jeff Law
1 sibling, 0 replies; 32+ messages in thread
From: Jeff Law @ 2009-02-13 17:01 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Vladimir N. Makarov, GCC
Paolo Bonzini wrote:
> Jeff Law wrote:
>
>> We'd want to encode [early insn alternative selection]
>> information in the conflict graph so that IRA would
>> allocate registers so as to fit the constraints of the early insn
>> alternative selection. Right? In the case where the graph is
>> uncolorable, do we allow IRA to override the alternative selection, or
>> do we insert copies to simplify the conflict graph or some mixture of both?
>>
>
> Inserting compensation code, for example copies, can be seen as some
> kind of "pre-reload" as it was used on new-ra branch;
It can certainly be viewed that way; however, it helps me to think in
terms of having IRA generate spill code. The difference is subtle, but
it's helping me get away from the reload-centric mindset which keeps
creeping in.
> the problem with
> pre-reload was that it was built on "cp reload1.c pre-reload.c", so it
> was not much less complicated than reload.
>
Precisely. And I'm finding that as I look at the existing reload code
and ponder where/how to do things better I always end up with the same
code, but just at a different place in the pipeline. That shouldn't be
the goal here -- the goal should be something more maintainable and
predictable than reload.
What I think I'm getting at is we should be looking at common reload
cases and mapping them into changes to the conflict graph within IRA
(possibly with some copies inserted as compensation code).
Jeff
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-13 16:49 ` Paolo Bonzini
@ 2009-02-13 19:01 ` Ian Lance Taylor
2009-02-13 19:38 ` Vladimir Makarov
2009-02-16 16:25 ` Jeff Law
2009-02-13 19:54 ` Jeff Law
1 sibling, 2 replies; 32+ messages in thread
From: Ian Lance Taylor @ 2009-02-13 19:01 UTC (permalink / raw)
To: Paolo Bonzini
Cc: GCC Development, Jeffrey A Law, Vladimir Makarov, Michael Matz
Paolo Bonzini <bonzini@gnu.org> writes:
>> That is in brief how I see it and there are a lot of reload details
>> missed (like virtual register eliminations or addressing displacement
>> constraints etc).
>
> I suppose those would stay in reload?
I see no reason for those to stay in reload (especially since I think
reload should disappear entirely). It is reasonable to pick the total
maximum size of the stack frame, and thus resolve all displacement
constraints, before register allocation. Carefully relaxing these
constraints during reload can give you slightly better results for some
instructions, but only in very very few cases, and only in functions
which already have unusually large stack frames. I don't consider that
to be an important optimization. Given that, we can determine the
maximum offset for all virtual registers before register allocation,
which suffices for selection of insn constraint alternatives, and then
determine the actual offset, once, after register allocation.
Ian
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-13 19:01 ` Ian Lance Taylor
@ 2009-02-13 19:38 ` Vladimir Makarov
2009-02-16 16:25 ` Jeff Law
1 sibling, 0 replies; 32+ messages in thread
From: Vladimir Makarov @ 2009-02-13 19:38 UTC (permalink / raw)
To: Ian Lance Taylor
Cc: Paolo Bonzini, GCC Development, Jeffrey A Law, Michael Matz
Ian Lance Taylor wrote:
> Paolo Bonzini <bonzini@gnu.org> writes:
>
>
>>> That is in brief how I see it and there are a lot of reload details
>>> missed (like virtual register eliminations or addressing displacement
>>> constraints etc).
>>>
>> I suppose those would stay in reload?
>>
>
> I see no reason for those to stay in reload (especially since I think
> reload should disappear entirely). It is reasonable to pick the total
> maximum size of the stack frame, and thus resolve all displacement
> constraints, before register allocation. Carefully relaxing these
> constraints during reload can give you slightly better results for some
> instructions, but only in very very few cases, and only in functions
> which already have unusually large stack frames. I don't consider that
> to be an important optimization. Given that, we can determine the
> maximum offset for all virtual registers before register allocation,
> which suffices for selection of insn constraint alternatives, and then
> determine the actual offset, once, after register allocation.
>
>
>
Ian, sorry for not to be clear which resulted in this misunderstanding.
What I wrote was a scheme without reload pass. I just mentioned some
tasks of reload pass which should be solved in RA without the reload and
whose solutions I did not discussed in details. The solution could be
what you propose for example.
Removing reload is a long way, imho. If we go this way, in the middle of
it there would be still reload but solving less and less tasks.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-13 16:49 ` Paolo Bonzini
2009-02-13 19:01 ` Ian Lance Taylor
@ 2009-02-13 19:54 ` Jeff Law
2009-02-14 15:00 ` Steven Bosscher
1 sibling, 1 reply; 32+ messages in thread
From: Jeff Law @ 2009-02-13 19:54 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: GCC Development, Vladimir Makarov, Michael Matz
Paolo Bonzini wrote:
>> As for copies, I think it would be a bad decision to stick only to
>> original (after the code selection) alternative and generate copies to
>> satisfy this alternative. For example, if pseudo got memory instead of
>> hard-register required by the alternative, it would be bad to generate a
>> copy (ld/st in this case) if memory is accepted by the insn.
>>
>
> Yes, spilling is a special case (see also PR19398). If something is
> spilled, "prereloads" could be rolled back for affected instructions and
> recomputed (possibly forcing no change on the constraints of other
> registers).
>
Ick. This reminds me of the old old LRS code -- while it worked and was
reasonably effective, it was bloody hard to understand and follow (one
of major reasons the code never was contributed). I don't see what
this kind of approach buys us when compared to having IRA deal with
spill code directly.
>
>> That is in brief how I see it and there are a lot of reload details
>> missed (like virtual register eliminations or addressing displacement
>> constraints etc).
>>
>
> I suppose those would stay in reload?
Ideally they'd all move into IRA. But I don't mind putting them off
until after we have a solid handle on how to deal with the most common
cases.
Jeff
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-13 19:54 ` Jeff Law
@ 2009-02-14 15:00 ` Steven Bosscher
2009-02-16 15:53 ` Jeff Law
0 siblings, 1 reply; 32+ messages in thread
From: Steven Bosscher @ 2009-02-14 15:00 UTC (permalink / raw)
To: Jeff Law; +Cc: Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz
On Fri, Feb 13, 2009 at 8:53 PM, Jeff Law <law@redhat.com> wrote:
>>> That is in brief how I see it and there are a lot of reload details
>>> missed (like virtual register eliminations or addressing displacement
>>> constraints etc).
>>>
>>
>> I suppose those would stay in reload?
>
> Ideally they'd all move into IRA.
...and so, IRA became evil to destroy evil... :-)
Gr.
Steven
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-14 15:00 ` Steven Bosscher
@ 2009-02-16 15:53 ` Jeff Law
0 siblings, 0 replies; 32+ messages in thread
From: Jeff Law @ 2009-02-16 15:53 UTC (permalink / raw)
To: Steven Bosscher
Cc: Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz
Steven Bosscher wrote:
> On Fri, Feb 13, 2009 at 8:53 PM, Jeff Law <law@redhat.com> wrote:
>
>>>> That is in brief how I see it and there are a lot of reload details
>>>> missed (like virtual register eliminations or addressing displacement
>>>> constraints etc).
>>>>
>>>>
>>> I suppose those would stay in reload?
>>>
>> Ideally they'd all move into IRA.
>>
>
> ...and so, IRA became evil to destroy evil... :-)
>
Obviously the hope would be we could do things much more cleanly in the
IRA code base. Modeling spill code generation as a set of
transformations on the conflict graph has some potential.
jeff
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-13 19:01 ` Ian Lance Taylor
2009-02-13 19:38 ` Vladimir Makarov
@ 2009-02-16 16:25 ` Jeff Law
2009-02-16 20:33 ` Ian Lance Taylor
1 sibling, 1 reply; 32+ messages in thread
From: Jeff Law @ 2009-02-16 16:25 UTC (permalink / raw)
To: Ian Lance Taylor
Cc: Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz
Ian Lance Taylor wrote:
> Paolo Bonzini <bonzini@gnu.org> writes:
>
>
>>> That is in brief how I see it and there are a lot of reload details
>>> missed (like virtual register eliminations or addressing displacement
>>> constraints etc).
>>>
>> I suppose those would stay in reload?
>>
>
> I see no reason for those to stay in reload (especially since I think
> reload should disappear entirely). It is reasonable to pick the total
> maximum size of the stack frame, and thus resolve all displacement
> constraints, before register allocation. Carefully relaxing these
> constraints during reload can give you slightly better results for some
> instructions, but only in very very few cases, and only in functions
> which already have unusually large stack frames. I don't consider that
> to be an important optimization. Given that, we can determine the
> maximum offset for all virtual registers before register allocation,
> which suffices for selection of insn constraint alternatives, and then
> determine the actual offset, once, after register allocation.
>
I would agree that careful relaxation of displacements is no longer as
important as it once was, I don't think we can just hand wave away the
displacement issues
1. The stack frames don't have to be that big to bump up against these
problems.
2. The code we generate if we have to reload the address because the
displacement was out of range can be horrific
3. There are targets where other registers used in the insn determine
the range of the displacement. ie, in a load from memory, the
destination register used determines the valid range of displacements
(+-16 bytes vs +-8k on one target I'm aware of.
4. Register eliminations complicates matters as well. Enough that I
don't think you can set maximum offsets until you've finalized
everything in the stack -- which implies that you're done spilling.
Jeff
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-13 16:37 ` Vladimir Makarov
2009-02-13 16:49 ` Paolo Bonzini
@ 2009-02-16 16:40 ` Jeff Law
2009-02-17 16:28 ` Vladimir Makarov
1 sibling, 1 reply; 32+ messages in thread
From: Jeff Law @ 2009-02-16 16:40 UTC (permalink / raw)
To: Vladimir Makarov; +Cc: GCC
Vladimir Makarov wrote:
> Jeff Law wrote:
>> I've been thinking further about instruction alternative selection
>> prior to allocation and one of the questions in my mind is how this
>> interacts with IRA.
>>
>> We select an alternative for each insn based on some "best guess"
>> heuristic -- the selection of an alternative will often restrict the
>> register classes available for each of the operands. Presumably we'd
>> want to encode that information it the conflict graph so that IRA
>> would allocate registers so as to fit the constraints of the early
>> insn alternative selection. Right? In the case where the graph is
>> uncolorable, do we allow IRA to override the alternative selection,
>> or do we insert copies to simplify the conflict graph or some mixture
>> of both?
>>
>> Thoughts?
>
>
> As for copies, I think it would be a bad decision to stick only to
> original (after the code selection) alternative and generate copies to
> satisfy this alternative. For example, if pseudo got memory instead
> of hard-register required by the alternative, it would be bad to
> generate a copy (ld/st in this case) if memory is accepted by the insn.
That's why I mentioned the possibility of relaxing the conflict graph to
allow other alternatives if we find that the graph is uncolorable. So
if we initially wanted class A, but couldn't get it and the operand
could accept class B, then we remove the conflict between the pseudo and
the hard regs in class B and recolor.
I have no idea how expensive this would be.
This also implies that we're representing conflicts for register classes
& memory in the conflict graph.
Jeff
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-13 15:47 ` Michael Matz
@ 2009-02-16 16:46 ` Jeff Law
2009-02-16 17:27 ` Michael Matz
0 siblings, 1 reply; 32+ messages in thread
From: Jeff Law @ 2009-02-16 16:46 UTC (permalink / raw)
To: Michael Matz; +Cc: Paolo Bonzini, Vladimir N. Makarov, GCC
Michael Matz wrote:
> Hi,
>
> On Fri, 13 Feb 2009, Paolo Bonzini wrote:
>
>
>>> We'd want to encode [early insn alternative selection] information in
>>> the conflict graph so that IRA would allocate registers so as to fit
>>> the constraints of the early insn alternative selection. Right? In
>>> the case where the graph is uncolorable, do we allow IRA to override
>>> the alternative selection, or do we insert copies to simplify the
>>> conflict graph or some mixture of both?
>>>
>
> If the initial alternative selection was done cleverly (like chose the
> alternatives allowing the largest register sets which don't immediately
> create conflicting demands for a pseudo register) the opportunities for
> making an uncolorable graph colorable by chosing another alternative will
> be very small. This can only happen if that new alternative somehow
> allows for the uncolorable node a completely new set of register (like say
> float instead of integer regs), which would mean also selecting other
> alternatives for all instructions where this pseudo also is used.
>
> So it's not impossible, but I think it would happen relatively seldom that
> changing the alternatives improves the situation.
>
Of course. However, we might want to pick a narrower class if it has a
smaller cost. The mn103 targets come to mind. In general you're better
off with d0-d3/a0-a3 as they're the cheapest (cost & space). However,
you've got some extended registers which can be used just like
d0-d3/a0-a3, but which are more expensive (but still cheaper than memory).
Jeff
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-16 16:46 ` Jeff Law
@ 2009-02-16 17:27 ` Michael Matz
0 siblings, 0 replies; 32+ messages in thread
From: Michael Matz @ 2009-02-16 17:27 UTC (permalink / raw)
To: Jeff Law; +Cc: Paolo Bonzini, Vladimir N. Makarov, GCC
Hi,
On Mon, 16 Feb 2009, Jeff Law wrote:
> > If the initial alternative selection was done cleverly (like chose the
> > alternatives allowing the largest register sets which don't
> > immediately create conflicting demands for a pseudo register) the
> > opportunities for making an uncolorable graph colorable by chosing
> > another alternative will be very small. This can only happen if that
> > new alternative somehow allows for the uncolorable node a completely
> > new set of register (like say float instead of integer regs), which
> > would mean also selecting other alternatives for all instructions
> > where this pseudo also is used.
> >
> > So it's not impossible, but I think it would happen relatively seldom
> > that changing the alternatives improves the situation.
> >
> Of course. However, we might want to pick a narrower class if it has a
> smaller cost. The mn103 targets come to mind. In general you're better
> off with d0-d3/a0-a3 as they're the cheapest (cost & space). However,
> you've got some extended registers which can be used just like
> d0-d3/a0-a3, but which are more expensive (but still cheaper than
> memory).
I'd rather model this as a set of preferrable colors in the node. If
they're still free when coloring the node, good, if not, too bad, but
there are still others to chose from. This is more or less equivalent to
chosing a different alternative, but more explicit for the coloring
problem and with less ripple-down effects.
Ciao,
Michael.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-16 16:25 ` Jeff Law
@ 2009-02-16 20:33 ` Ian Lance Taylor
2009-02-16 23:15 ` Jeff Law
0 siblings, 1 reply; 32+ messages in thread
From: Ian Lance Taylor @ 2009-02-16 20:33 UTC (permalink / raw)
To: Jeff Law; +Cc: Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz
Jeff Law <law@redhat.com> writes:
> Ian Lance Taylor wrote:
>>
>> I see no reason for those to stay in reload (especially since I think
>> reload should disappear entirely). It is reasonable to pick the total
>> maximum size of the stack frame, and thus resolve all displacement
>> constraints, before register allocation. Carefully relaxing these
>> constraints during reload can give you slightly better results for some
>> instructions, but only in very very few cases, and only in functions
>> which already have unusually large stack frames. I don't consider that
>> to be an important optimization. Given that, we can determine the
>> maximum offset for all virtual registers before register allocation,
>> which suffices for selection of insn constraint alternatives, and then
>> determine the actual offset, once, after register allocation.
>>
> I would agree that careful relaxation of displacements is no longer as
> important as it once was, I don't think we can just hand wave away
> the displacement issues
>
> 1. The stack frames don't have to be that big to bump up against
> these problems.
>
> 2. The code we generate if we have to reload the address because the
> displacement was out of range can be horrific
>
> 3. There are targets where other registers used in the insn determine
> the range of the displacement. ie, in a load from memory, the
> destination register used determines the valid range of displacements
> (+-16 bytes vs +-8k on one target I'm aware of.
In all of thse cases, the relaxation loop can only affect a handful of
instructions: the cases where saving a few less registers moves the
offset within range. Those few instructions can only occur in a handful
of functions: the ones where the stack frame is so large that this
becomes an issue at all.
I'm not handwaving away displacement issues in general. I'm handwaving
away the need to do relaxation, such that we adjust if we find that need
to save one more or one fewer register. If we eliminate that relaxation
requirement, we can determine all displacements before register
allocation.
> 4. Register eliminations complicates matters as well. Enough that I
> don't think you can set maximum offsets until you've finalized
> everything in the stack -- which implies that you're done spilling.
We clearly can set maximum offsets, if we are willing to sacrifice an
optimization. I argue that that optimization is inconsequential for
99.9% of all code, and avoidable (through refactoring and good inline
heuristics) for 100% of all code.
Ian
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-16 20:33 ` Ian Lance Taylor
@ 2009-02-16 23:15 ` Jeff Law
2009-02-17 3:57 ` Ian Lance Taylor
2009-02-17 16:28 ` Vladimir Makarov
0 siblings, 2 replies; 32+ messages in thread
From: Jeff Law @ 2009-02-16 23:15 UTC (permalink / raw)
To: Ian Lance Taylor
Cc: Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz
Ian Lance Taylor wrote:
> Jeff Law <law@redhat.com> writes:
>
>
>> Ian Lance Taylor wrote:
>>
>>> I see no reason for those to stay in reload (especially since I think
>>> reload should disappear entirely). It is reasonable to pick the total
>>> maximum size of the stack frame, and thus resolve all displacement
>>> constraints, before register allocation. Carefully relaxing these
>>> constraints during reload can give you slightly better results for some
>>> instructions, but only in very very few cases, and only in functions
>>> which already have unusually large stack frames. I don't consider that
>>> to be an important optimization. Given that, we can determine the
>>> maximum offset for all virtual registers before register allocation,
>>> which suffices for selection of insn constraint alternatives, and then
>>> determine the actual offset, once, after register allocation.
>>>
>>>
>> I would agree that careful relaxation of displacements is no longer as
>> important as it once was, I don't think we can just hand wave away
>> the displacement issues
>>
>> 1. The stack frames don't have to be that big to bump up against
>> these problems.
>>
>> 2. The code we generate if we have to reload the address because the
>> displacement was out of range can be horrific
>>
>> 3. There are targets where other registers used in the insn determine
>> the range of the displacement. ie, in a load from memory, the
>> destination register used determines the valid range of displacements
>> (+-16 bytes vs +-8k on one target I'm aware of.
>>
>
> In all of thse cases, the relaxation loop can only affect a handful of
> instructions: the cases where saving a few less registers moves the
> offset within range. Those few instructions can only occur in a handful
> of functions: the ones where the stack frame is so large that this
> becomes an issue at all.
>
I disagree, particularly because of point #3. I don't see how you can
hand wave it away, that is unless you plan on just making every
load/store of a stack variable/spill be assumed to be out of the +-16
byte range which will generate absolutely horrible code.
On that particular target is isn't uncommon to have situations where you
think you're going to be able to use the +-8k instruction, but because
of spilling you end up using a different register and suddenly you're
stuck with only being able to use +-16 byte offsets.
> I'm not handwaving away displacement issues in general. I'm handwaving
> away the need to do relaxation, such that we adjust if we find that need
> to save one more or one fewer register. If we eliminate that relaxation
> requirement, we can determine all displacements before register
> allocation.
>
I still don't see it as that simple.
>
>> 4. Register eliminations complicates matters as well. Enough that I
>> don't think you can set maximum offsets until you've finalized
>> everything in the stack -- which implies that you're done spilling.
>>
>
> We clearly can set maximum offsets, if we are willing to sacrifice an
> optimization. I argue that that optimization is inconsequential for
> 99.9% of all code, and avoidable (through refactoring and good inline
> heuristics) for 100% of all code.
>
Without knowing the size of the frame, how do you plan on doing this
without making the assumption that nothing is going to fit in the
shorter displacement variants? How can you do this when the range of
valid displacements can change because the register you used got spilled
and you got a register from a different class (which in turn has a
drastically smaller set of valid displacements).
jeff
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-16 23:15 ` Jeff Law
@ 2009-02-17 3:57 ` Ian Lance Taylor
2009-02-17 16:52 ` Bernd Schmidt
2009-02-17 17:45 ` Jeff Law
2009-02-17 16:28 ` Vladimir Makarov
1 sibling, 2 replies; 32+ messages in thread
From: Ian Lance Taylor @ 2009-02-17 3:57 UTC (permalink / raw)
To: Jeff Law; +Cc: Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz
Jeff Law <law@redhat.com> writes:
>>> I would agree that careful relaxation of displacements is no longer as
>>> important as it once was, I don't think we can just hand wave away
>>> the displacement issues
>>>
>>> 1. The stack frames don't have to be that big to bump up against
>>> these problems.
>>>
>>> 2. The code we generate if we have to reload the address because the
>>> displacement was out of range can be horrific
>>>
>>> 3. There are targets where other registers used in the insn determine
>>> the range of the displacement. ie, in a load from memory, the
>>> destination register used determines the valid range of displacements
>>> (+-16 bytes vs +-8k on one target I'm aware of.
>>>
>>
>> In all of thse cases, the relaxation loop can only affect a handful of
>> instructions: the cases where saving a few less registers moves the
>> offset within range. Those few instructions can only occur in a handful
>> of functions: the ones where the stack frame is so large that this
>> becomes an issue at all.
>>
> I disagree, particularly because of point #3. I don't see how you
> can hand wave it away, that is unless you plan on just making every
> load/store of a stack variable/spill be assumed to be out of the +-16
> byte range which will generate absolutely horrible code.
No, that makes no sense. What I'm suggesting is that we fix the stack
offsets of all local variables before register allocation, based on a
conservative assessment of how many registers will be saved on the
stack. Then we know during register allocation whether the memory
reference will be in or out of the +- 16 byte range. What we lose is
the ability to discover that our conservative assessment was overly
conservative, and so actually some small number of instructions will be
generated as out of range when they could have been in range. (Of
course we will pick up some of those cases using peepholes).
> Without knowing the size of the frame, how do you plan on doing this
> without making the assumption that nothing is going to fit in the
> shorter displacement variants? How can you do this when the range of
> valid displacements can change because the register you used got
> spilled and you got a register from a different class (which in turn
> has a drastically smaller set of valid displacements).
I'm saying that you guess the size of the frame, so your premise does
not describe the aproach that I am suggesting.
Ian
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-16 23:15 ` Jeff Law
2009-02-17 3:57 ` Ian Lance Taylor
@ 2009-02-17 16:28 ` Vladimir Makarov
2009-02-17 17:47 ` Jeff Law
1 sibling, 1 reply; 32+ messages in thread
From: Vladimir Makarov @ 2009-02-17 16:28 UTC (permalink / raw)
To: Jeff Law; +Cc: Ian Lance Taylor, Paolo Bonzini, GCC Development, Michael Matz
Jeff Law wrote:
> Ian Lance Taylor wrote:
>> Jeff Law <law@redhat.com> writes:
>>
>>
>>> Ian Lance Taylor wrote:
>>>
>>>> I see no reason for those to stay in reload (especially since I think
>>>> reload should disappear entirely). It is reasonable to pick the total
>>>> maximum size of the stack frame, and thus resolve all displacement
>>>> constraints, before register allocation. Carefully relaxing these
>>>> constraints during reload can give you slightly better results for
>>>> some
>>>> instructions, but only in very very few cases, and only in functions
>>>> which already have unusually large stack frames. I don't consider
>>>> that
>>>> to be an important optimization. Given that, we can determine the
>>>> maximum offset for all virtual registers before register allocation,
>>>> which suffices for selection of insn constraint alternatives, and then
>>>> determine the actual offset, once, after register allocation.
>>>>
>>> I would agree that careful relaxation of displacements is no longer as
>>> important as it once was, I don't think we can just hand wave away
>>> the displacement issues
>>>
>>> 1. The stack frames don't have to be that big to bump up against
>>> these problems.
>>>
>>> 2. The code we generate if we have to reload the address because the
>>> displacement was out of range can be horrific
>>>
>>> 3. There are targets where other registers used in the insn determine
>>> the range of the displacement. ie, in a load from memory, the
>>> destination register used determines the valid range of displacements
>>> (+-16 bytes vs +-8k on one target I'm aware of.
>>>
>>
>> In all of thse cases, the relaxation loop can only affect a handful of
>> instructions: the cases where saving a few less registers moves the
>> offset within range. Those few instructions can only occur in a handful
>> of functions: the ones where the stack frame is so large that this
>> becomes an issue at all.
>>
> I disagree, particularly because of point #3. I don't see how you
> can hand wave it away, that is unless you plan on just making every
> load/store of a stack variable/spill be assumed to be out of the +-16
> byte range which will generate absolutely horrible code.
>
> On that particular target is isn't uncommon to have situations where
> you think you're going to be able to use the +-8k instruction, but
> because of spilling you end up using a different register and suddenly
> you're stuck with only being able to use +-16 byte offsets.
>
>
According my old notes, there are plenty targets with small displacement
issues: avr (0:64), c4x (-255:255), fr30 (-512:512 or 0:64 depending on
data mode), ip2k (0:127), m68c11(0:256), m68k(-128:127), mcore(0:15,
0:30 etc depending on date mode), mmix (0:255), xtensa (0:255 for QI)
and famous sh (0:63). I think not relaxing the displacements will
really hurt the performance especially for targets with small register
file like mcore.
IMHO, another important optimization what reload does for such
processors is usage of already calculated displacement whose value is in
hard register to calculate an new displacement (that was implemented by
Joern for SH).
Although it is hard for me to say how these optimizations will
complicate the new implementation. May be we could sacrifice a small
performance degradation for more clear code. I don't know, only two
tried implementations will show it.
>
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-16 16:40 ` Jeff Law
@ 2009-02-17 16:28 ` Vladimir Makarov
0 siblings, 0 replies; 32+ messages in thread
From: Vladimir Makarov @ 2009-02-17 16:28 UTC (permalink / raw)
To: Jeff Law; +Cc: GCC
Jeff Law wrote:
> Vladimir Makarov wrote:
>> Jeff Law wrote:
>>> I've been thinking further about instruction alternative selection
>>> prior to allocation and one of the questions in my mind is how this
>>> interacts with IRA.
>>>
>>> We select an alternative for each insn based on some "best guess"
>>> heuristic -- the selection of an alternative will often restrict the
>>> register classes available for each of the operands. Presumably
>>> we'd want to encode that information it the conflict graph so that
>>> IRA would allocate registers so as to fit the constraints of the
>>> early insn alternative selection. Right? In the case where the
>>> graph is uncolorable, do we allow IRA to override the alternative
>>> selection, or do we insert copies to simplify the conflict graph or
>>> some mixture of both?
>>>
>>> Thoughts?
>>
>>
>> As for copies, I think it would be a bad decision to stick only to
>> original (after the code selection) alternative and generate copies
>> to satisfy this alternative. For example, if pseudo got memory
>> instead of hard-register required by the alternative, it would be bad
>> to generate a copy (ld/st in this case) if memory is accepted by the
>> insn.
> That's why I mentioned the possibility of relaxing the conflict graph
> to allow other alternatives if we find that the graph is
> uncolorable. So if we initially wanted class A, but couldn't get it
> and the operand could accept class B, then we remove the conflict
> between the pseudo and the hard regs in class B and recolor.
>
> I have no idea how expensive this would be.
>
> This also implies that we're representing conflicts for register
> classes & memory in the conflict graph.
>
The problem is that graph coloring working well on non-intersected
register classes. There were some articles how to make it work on
intersected register classes. The most known is "A Generalized
Algorithm for Graph-Coloring Register Allocation":
http://www.cs.tufts.edu/~nr/pubs/gcra-abstract.html
I read this article with attention several times and even wrote some
notes but unfortunately can not find it. I got an impression that it
will not work.
As for changing register class from one cover class (e.g. GENERAL_REGS)
to another non-intersected one (FLOAT_REGS), it is possible but more
rare than changing register class in one cover class (like from GENERAL
to AREG in x86). The most common example is usage pseudo for moving
memory values (e.g. generated from a[i] = b[i]). It can be done through
int or fp regs. I've tried some graph coloring algorithm modification
to permit register class changing when register pressure for original
class is high and one for new class is low but got mixed results on x86
and ppc (some tests were worse some were better with practically the
same average result).
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-17 3:57 ` Ian Lance Taylor
@ 2009-02-17 16:52 ` Bernd Schmidt
2009-02-17 18:37 ` Steven Bosscher
2009-02-17 20:01 ` Ian Lance Taylor
2009-02-17 17:45 ` Jeff Law
1 sibling, 2 replies; 32+ messages in thread
From: Bernd Schmidt @ 2009-02-17 16:52 UTC (permalink / raw)
To: Ian Lance Taylor
Cc: Jeff Law, Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz
Ian Lance Taylor wrote:
> No, that makes no sense. What I'm suggesting is that we fix the stack
> offsets of all local variables before register allocation, based on a
> conservative assessment of how many registers will be saved on the
> stack.
The conservative assessment is that all pseudos go on the stack.
However, this way you'll generate terrible code.
I don't really understand why people want to remove reload only to
implement the same thing elsewhere. There are only two major problems
with reload IMO:
- the RELOAD_FOR_xxx mechanism of scheduling reload insns is terrible
- so is the inheritance code
Even so, the number of bugs in these seems to have dropped off over the
years.
If you replace these two with a cleaner solution, you'll end up with a
fairly clean and easy to understand reload pass. We'd still want the
register allocator to be strong enough not to leave too much work for
it, but I simply don't see how reload can be entirely replaced in gcc.
Bernd
--
This footer brought to you by insane German lawmakers.
Analog Devices GmbH Wilhelm-Wagenfeld-Str. 6 80807 Muenchen
Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 40368
Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-17 3:57 ` Ian Lance Taylor
2009-02-17 16:52 ` Bernd Schmidt
@ 2009-02-17 17:45 ` Jeff Law
2009-02-17 19:57 ` Ian Lance Taylor
1 sibling, 1 reply; 32+ messages in thread
From: Jeff Law @ 2009-02-17 17:45 UTC (permalink / raw)
To: Ian Lance Taylor
Cc: Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz
Ian Lance Taylor wrote:
> Jeff Law <law@redhat.com> writes:
>
>
>>>> I would agree that careful relaxation of displacements is no longer as
>>>> important as it once was, I don't think we can just hand wave away
>>>> the displacement issues
>>>>
>>>> 1. The stack frames don't have to be that big to bump up against
>>>> these problems.
>>>>
>>>> 2. The code we generate if we have to reload the address because the
>>>> displacement was out of range can be horrific
>>>>
>>>> 3. There are targets where other registers used in the insn determine
>>>> the range of the displacement. ie, in a load from memory, the
>>>> destination register used determines the valid range of displacements
>>>> (+-16 bytes vs +-8k on one target I'm aware of.
>>>>
>>>>
>>> In all of thse cases, the relaxation loop can only affect a handful of
>>> instructions: the cases where saving a few less registers moves the
>>> offset within range. Those few instructions can only occur in a handful
>>> of functions: the ones where the stack frame is so large that this
>>> becomes an issue at all.
>>>
>>>
>> I disagree, particularly because of point #3. I don't see how you
>> can hand wave it away, that is unless you plan on just making every
>> load/store of a stack variable/spill be assumed to be out of the +-16
>> byte range which will generate absolutely horrible code.
>>
>
> No, that makes no sense. What I'm suggesting is that we fix the stack
> offsets of all local variables before register allocation, based on a
> conservative assessment of how many registers will be saved on the
> stack. Then we know during register allocation whether the memory
> reference will be in or out of the +- 16 byte range.
Just fixing the offset is not sufficient for the case I'm concerned
about because you don't know what offsets are valid until you know
precisely what registers are used in the insn. That's absolutely
critical. So you have to assume worst case which is you can only use
the +-16 byte insns which will pessimize just about every stack reference.
Perhaps the confusion in this case is because we're not really relaxing
-- we're dealing with a case where the offset alone isn't enough to
determine if the addressing mode is valid.
> What we lose is
> the ability to discover that our conservative assessment was overly
> conservative, and so actually some small number of instructions will be
> generated as out of range when they could have been in range. (Of
> course we will pick up some of those cases using peepholes).
>
>
In the case I'm talking about, it will affect a huge number of insns.
>> Without knowing the size of the frame, how do you plan on doing this
>> without making the assumption that nothing is going to fit in the
>> shorter displacement variants? How can you do this when the range of
>> valid displacements can change because the register you used got
>> spilled and you got a register from a different class (which in turn
>> has a drastically smaller set of valid displacements).
>>
>
> I'm saying that you guess the size of the frame, so your premise does
> not describe the aproach that I am suggesting.
>
But the offsets are relative to the size of the frame. So I don't see
how you can set the offsets without knowing the size of the frame.
Clearly neither of us is understanding what the other is saying.
Jeff
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-17 16:28 ` Vladimir Makarov
@ 2009-02-17 17:47 ` Jeff Law
0 siblings, 0 replies; 32+ messages in thread
From: Jeff Law @ 2009-02-17 17:47 UTC (permalink / raw)
To: Vladimir Makarov
Cc: Ian Lance Taylor, Paolo Bonzini, GCC Development, Michael Matz
Vladimir Makarov wrote:
>
> IMHO, another important optimization what reload does for such
> processors is usage of already calculated displacement whose value is
> in hard register to calculate an new displacement (that was
> implemented by Joern for SH).
ISTM this optimization could be done with a pre-variant rather than the
ad-hoc implementation we have now.
Jeff
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-17 16:52 ` Bernd Schmidt
@ 2009-02-17 18:37 ` Steven Bosscher
2009-02-17 20:01 ` Ian Lance Taylor
1 sibling, 0 replies; 32+ messages in thread
From: Steven Bosscher @ 2009-02-17 18:37 UTC (permalink / raw)
To: Bernd Schmidt
Cc: Ian Lance Taylor, Jeff Law, Paolo Bonzini, GCC Development,
Vladimir Makarov, Michael Matz
On Tue, Feb 17, 2009 at 5:51 PM, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
> The conservative assessment is that all pseudos go on the stack.
> However, this way you'll generate terrible code.
Yes. So make a not-too-optimistic estimate. And if it turns out your
estimate is too small, then you roll back, update the estimate, and
try again. This only works if you can make the "roll back" part work,
but it's been done before (not for GCC, of course).
> I don't really understand why people want to remove reload only to
> implement the same thing elsewhere.
I don't think "people" want to remove reload. Simplifying it, is
probably a better description. Splitting out tasks not related to
spilling. Trying to do the same things in a place where you can still
create a pseudo, instead of trying really hard to find a register and
change many choices of the register allocator while at it.
> There are only two major problems
> with reload IMO:
> - the RELOAD_FOR_xxx mechanism of scheduling reload insns is terrible
> - so is the inheritance code
> Even so, the number of bugs in these seems to have dropped off over the
> years.
There is at least one more major problem with reload: It performs
instruction selection. A big issue with register allocation in GCC is
that the register allocator doesn't even really know what instruction
it is allocating registers for!
Gr.
Steven
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-17 17:45 ` Jeff Law
@ 2009-02-17 19:57 ` Ian Lance Taylor
2009-02-20 21:58 ` Jeff Law
0 siblings, 1 reply; 32+ messages in thread
From: Ian Lance Taylor @ 2009-02-17 19:57 UTC (permalink / raw)
To: Jeff Law; +Cc: Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz
Jeff Law <law@redhat.com> writes:
>> No, that makes no sense. What I'm suggesting is that we fix the stack
>> offsets of all local variables before register allocation, based on a
>> conservative assessment of how many registers will be saved on the
>> stack. Then we know during register allocation whether the memory
>> reference will be in or out of the +- 16 byte range.
> Just fixing the offset is not sufficient for the case I'm concerned
> about because you don't know what offsets are valid until you know
> precisely what registers are used in the insn. That's absolutely
> critical. So you have to assume worst case which is you can only use
> the +-16 byte insns which will pessimize just about every stack
> reference.
You fix the offset of the value stored on the stack. Given that, you
know which instruction you can use.
> Perhaps the confusion in this case is because we're not really
> relaxing -- we're dealing with a case where the offset alone isn't
> enough to determine if the addressing mode is valid.
The confusion may be that we may be talking about different things.
I'm talking about removing this loop in the reload function:
/* This loop scans the entire function each go-round
and repeats until one repetition spills no additional hard regs. */
>>> Without knowing the size of the frame, how do you plan on doing this
>>> without making the assumption that nothing is going to fit in the
>>> shorter displacement variants? How can you do this when the range of
>>> valid displacements can change because the register you used got
>>> spilled and you got a register from a different class (which in turn
>>> has a drastically smaller set of valid displacements).
>>>
>>
>> I'm saying that you guess the size of the frame, so your premise does
>> not describe the aproach that I am suggesting.
>>
> But the offsets are relative to the size of the frame. So I don't see
> how you can set the offsets without knowing the size of the frame.
You know the size of the frame, by guessing conservatively as to how
many registers will be required.
Ian
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-17 16:52 ` Bernd Schmidt
2009-02-17 18:37 ` Steven Bosscher
@ 2009-02-17 20:01 ` Ian Lance Taylor
2009-02-20 16:25 ` Denis Chertykov
2009-02-20 22:04 ` Jeff Law
1 sibling, 2 replies; 32+ messages in thread
From: Ian Lance Taylor @ 2009-02-17 20:01 UTC (permalink / raw)
To: Bernd Schmidt
Cc: Jeff Law, Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz
Bernd Schmidt <bernds_cb1@t-online.de> writes:
> Ian Lance Taylor wrote:
>
>> No, that makes no sense. What I'm suggesting is that we fix the stack
>> offsets of all local variables before register allocation, based on a
>> conservative assessment of how many registers will be saved on the
>> stack.
>
> The conservative assessment is that all pseudos go on the stack.
> However, this way you'll generate terrible code.
I think the number of pseudos on the stack is irrelevant. You put each
one farther down the stack at a known offset. The only interesting
number to choose is how many registers will need to be saved on the
stack.
> I don't really understand why people want to remove reload only to
> implement the same thing elsewhere. There are only two major problems
> with reload IMO:
> - the RELOAD_FOR_xxx mechanism of scheduling reload insns is terrible
> - so is the inheritance code
> Even so, the number of bugs in these seems to have dropped off over the
> years.
The problem with reload is that it interferes with register allocation.
Even if gcc had a perfect register allocator, we would still generate
suboptimal code because reload would mess up the allocation. The key to
solving that problem is that we need to do instruction selection (in gcc
terms, picking which insn alternative to use) either before register
allocation or in conjunction with it.
Ian
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-17 20:01 ` Ian Lance Taylor
@ 2009-02-20 16:25 ` Denis Chertykov
2009-02-20 22:04 ` Jeff Law
1 sibling, 0 replies; 32+ messages in thread
From: Denis Chertykov @ 2009-02-20 16:25 UTC (permalink / raw)
To: Ian Lance Taylor
Cc: Bernd Schmidt, Jeff Law, Paolo Bonzini, GCC Development,
Vladimir Makarov, Michael Matz
2009/2/17 Ian Lance Taylor <iant@google.com>:
>
> The problem with reload is that it interferes with register allocation.
> Even if gcc had a perfect register allocator, we would still generate
> suboptimal code because reload would mess up the allocation. The key to
> solving that problem is that we need to do instruction selection (in gcc
> terms, picking which insn alternative to use) either before register
> allocation or in conjunction with it.
I have a strong opinion that better to have a right insn's before the
register allocation.
(right insn is an insn which satisfies it's constraints.)
Because of that idea I have realised pre-relod in new-ra. (I'm agree
that pre-reload is an ugly pieces of code)
Even more, today I have a set of experiments with pre-reload pass before IRA.
I have founded that IRA costs calculation process work better if I
have a right insn's before IRA.
Denis.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-17 19:57 ` Ian Lance Taylor
@ 2009-02-20 21:58 ` Jeff Law
2009-02-21 16:40 ` Ian Lance Taylor
0 siblings, 1 reply; 32+ messages in thread
From: Jeff Law @ 2009-02-20 21:58 UTC (permalink / raw)
To: Ian Lance Taylor
Cc: Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz
Ian Lance Taylor wrote:
> Jeff Law <law@redhat.com> writes:
>
>
>>> No, that makes no sense. What I'm suggesting is that we fix the stack
>>> offsets of all local variables before register allocation, based on a
>>> conservative assessment of how many registers will be saved on the
>>> stack. Then we know during register allocation whether the memory
>>> reference will be in or out of the +- 16 byte range.
>>>
>> Just fixing the offset is not sufficient for the case I'm concerned
>> about because you don't know what offsets are valid until you know
>> precisely what registers are used in the insn. That's absolutely
>> critical. So you have to assume worst case which is you can only use
>> the +-16 byte insns which will pessimize just about every stack
>> reference.
>>
>
> You fix the offset of the value stored on the stack. Given that, you
> know which instruction you can use.
>
No, it's not that simple. If you spill, then you can change the
registers that are used and that in turn changes the set of valid
offsets in reg+d instructions.
>
>> Perhaps the confusion in this case is because we're not really
>> relaxing -- we're dealing with a case where the offset alone isn't
>> enough to determine if the addressing mode is valid.
>>
>
> The confusion may be that we may be talking about different things.
> I'm talking about removing this loop in the reload function:
>
> /* This loop scans the entire function each go-round
> and repeats until one repetition spills no additional hard regs. */
>
Most of that loop's real work isn't in the address relaxation code, but
in digging through instructions to determine what needs to be reloaded
and how many registers are necessary to perform the requested reloads.
It is unfortunate that spilling and allocating stack slots requires the
loop to iterate again -- I do believe it is possible to remove some
silly iterating, but removing the whole loop is a pipe dream at the moment.
>
>
>>>> Without knowing the size of the frame, how do you plan on doing this
>>>> without making the assumption that nothing is going to fit in the
>>>> shorter displacement variants? How can you do this when the range of
>>>> valid displacements can change because the register you used got
>>>> spilled and you got a register from a different class (which in turn
>>>> has a drastically smaller set of valid displacements).
>>>>
>>>>
>>> I'm saying that you guess the size of the frame, so your premise does
>>> not describe the aproach that I am suggesting.
>>>
>>>
>> But the offsets are relative to the size of the frame. So I don't see
>> how you can set the offsets without knowing the size of the frame.
>>
>
> You know the size of the frame, by guessing conservatively as to how
> many registers will be required.
>
A conservative guess would be something like "every pseudo needs a slot"
-- anything less than that isn't conservative, it's optimistic.
Guessing conservative is going to be wasteful on so many levels.
Guessing optimistic doesn't solve any of the problems we want to solve
as we still need to deal with the case where our guess was wrong.
jeff
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-17 20:01 ` Ian Lance Taylor
2009-02-20 16:25 ` Denis Chertykov
@ 2009-02-20 22:04 ` Jeff Law
1 sibling, 0 replies; 32+ messages in thread
From: Jeff Law @ 2009-02-20 22:04 UTC (permalink / raw)
To: Ian Lance Taylor
Cc: Bernd Schmidt, Paolo Bonzini, GCC Development, Vladimir Makarov,
Michael Matz
Ian Lance Taylor wrote:
> Bernd Schmidt <bernds_cb1@t-online.de> writes:
>
>
>> Ian Lance Taylor wrote:
>>
>>
>>> No, that makes no sense. What I'm suggesting is that we fix the stack
>>> offsets of all local variables before register allocation, based on a
>>> conservative assessment of how many registers will be saved on the
>>> stack.
>>>
>> The conservative assessment is that all pseudos go on the stack.
>> However, this way you'll generate terrible code.
>>
>
> I think the number of pseudos on the stack is irrelevant. You put each
> one farther down the stack at a known offset. The only interesting
> number to choose is how many registers will need to be saved on the
> stack.
>
Allocation of slots isn't that simple because you might be spilling
later, which in turn allocates a new stack slot. Note that you want the
new stack slot to actually be at a smaller address because it's likely
used far more often than the stack slots for pseudos that didn't get a
hard register.
You also can't make assumptions about stack/frame growth directions and
an address that is valid as fp+offset may not be valid as sp+offset'.
I think you're vastly over-simplifying the code in question.
>
>
>> I don't really understand why people want to remove reload only to
>> implement the same thing elsewhere. There are only two major problems
>> with reload IMO:
>> - the RELOAD_FOR_xxx mechanism of scheduling reload insns is terrible
>> - so is the inheritance code
>> Even so, the number of bugs in these seems to have dropped off over the
>> years.
>>
>
> The problem with reload is that it interferes with register allocation.
> Even if gcc had a perfect register allocator, we would still generate
> suboptimal code because reload would mess up the allocation. The key to
> solving that problem is that we need to do instruction selection (in gcc
> terms, picking which insn alternative to use) either before register
> allocation or in conjunction with it.
>
I agree that instruction selection prior to or as part of allocation is
good. But until we're dealing with spills in the allocator as well,
we're still in a situation where reload can come along and muck things
up. Hence my desire to move more of the spill code generation into IRA.
Jeff
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-20 21:58 ` Jeff Law
@ 2009-02-21 16:40 ` Ian Lance Taylor
2009-02-22 2:23 ` Jeff Law
2009-02-22 2:35 ` Jeff Law
0 siblings, 2 replies; 32+ messages in thread
From: Ian Lance Taylor @ 2009-02-21 16:40 UTC (permalink / raw)
To: Jeff Law; +Cc: Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz
Jeff Law <law@redhat.com> writes:
> Ian Lance Taylor wrote:
>> Jeff Law <law@redhat.com> writes:
>>
>>
>>>> No, that makes no sense. What I'm suggesting is that we fix the stack
>>>> offsets of all local variables before register allocation, based on a
>>>> conservative assessment of how many registers will be saved on the
>>>> stack. Then we know during register allocation whether the memory
>>>> reference will be in or out of the +- 16 byte range.
>>> Just fixing the offset is not sufficient for the case I'm concerned
>>> about because you don't know what offsets are valid until you know
>>> precisely what registers are used in the insn. That's absolutely
>>> critical. So you have to assume worst case which is you can only use
>>> the +-16 byte insns which will pessimize just about every stack
>>> reference.
>>>
>>
>> You fix the offset of the value stored on the stack. Given that, you
>> know which instruction you can use.
>>
> No, it's not that simple. If you spill, then you can change the
> registers that are used and that in turn changes the set of valid
> offsets in reg+d instructions.
We are just talking in circles. As I have been saying all along, I'm
suggesting that gcc conservatively determine which registers will be
used. Then spilling will not change the stack space allocated for saved
registers.
>>> Perhaps the confusion in this case is because we're not really
>>> relaxing -- we're dealing with a case where the offset alone isn't
>>> enough to determine if the addressing mode is valid.
>>>
>>
>> The confusion may be that we may be talking about different things.
>> I'm talking about removing this loop in the reload function:
>>
>> /* This loop scans the entire function each go-round
>> and repeats until one repetition spills no additional hard regs. */
>>
> Most of that loop's real work isn't in the address relaxation code,
> but in digging through instructions to determine what needs to be
> reloaded and how many registers are necessary to perform the requested
> reloads.
You are not using the word "relaxation" in the same sense that I meant
it. I meant explicitly the process of finding the final set of
elimination offsets.
>> You know the size of the frame, by guessing conservatively as to how
>> many registers will be required.
>>
> A conservative guess would be something like "every pseudo needs a
> slot" -- anything less than that isn't conservative, it's optimistic.
Not a conservative guess at pseudo registers, a conservative guess at
hard registers.
I'm sorry, but I think I'm done with this thread. I'm just repeating
myself, and people are critiquing things which I am not saying.
Ian
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-21 16:40 ` Ian Lance Taylor
@ 2009-02-22 2:23 ` Jeff Law
2009-02-22 2:35 ` Jeff Law
1 sibling, 0 replies; 32+ messages in thread
From: Jeff Law @ 2009-02-22 2:23 UTC (permalink / raw)
To: Ian Lance Taylor
Cc: Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz
Ian Lance Taylor wrote:
> Not a conservative guess at pseudo registers, a conservative guess at
> hard registers.
>
> I'm sorry, but I think I'm done with this thread. I'm just repeating
> myself, and people are critiquing things which I am not saying.
>
FWIW, I'm not trying to be dense -- I respect and value your opinions.
We're clearly not communicating well.
jeff
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: IRA conflict graph & alternative selection
2009-02-21 16:40 ` Ian Lance Taylor
2009-02-22 2:23 ` Jeff Law
@ 2009-02-22 2:35 ` Jeff Law
1 sibling, 0 replies; 32+ messages in thread
From: Jeff Law @ 2009-02-22 2:35 UTC (permalink / raw)
To: Ian Lance Taylor
Cc: Paolo Bonzini, GCC Development, Vladimir Makarov, Michael Matz
Ian Lance Taylor wrote:
> Jeff Law <law@redhat.com> writes:
>
>
>> Ian Lance Taylor wrote:
>>
>>> Jeff Law <law@redhat.com> writes:
>>>
>>>
>>>
>>>>> No, that makes no sense. What I'm suggesting is that we fix the stack
>>>>> offsets of all local variables before register allocation, based on a
>>>>> conservative assessment of how many registers will be saved on the
>>>>> stack. Then we know during register allocation whether the memory
>>>>> reference will be in or out of the +- 16 byte range.
>>>>>
>>>> Just fixing the offset is not sufficient for the case I'm concerned
>>>> about because you don't know what offsets are valid until you know
>>>> precisely what registers are used in the insn. That's absolutely
>>>> critical. So you have to assume worst case which is you can only use
>>>> the +-16 byte insns which will pessimize just about every stack
>>>> reference.
>>>>
>>>>
>>> You fix the offset of the value stored on the stack. Given that, you
>>> know which instruction you can use.
>>>
>>>
>> No, it's not that simple. If you spill, then you can change the
>> registers that are used and that in turn changes the set of valid
>> offsets in reg+d instructions.
>>
>
> We are just talking in circles. As I have been saying all along, I'm
> suggesting that gcc conservatively determine which registers will be
> used. Then spilling will not change the stack space allocated for saved
> registers.
>
Conservatively selecting registers for this case would make the
generated code so bad on this target that GCC would become unusable.
>>
>> Most of that loop's real work isn't in the address relaxation code,
>> but in digging through instructions to determine what needs to be
>> reloaded and how many registers are necessary to perform the requested
>> reloads.
>>
>
> You are not using the word "relaxation" in the same sense that I meant
> it. I meant explicitly the process of finding the final set of
> elimination offsets.
>
Actually I think I am. Very little work of the loop is related to
finding the final set of elimination offsets. Furthermore, most of the
time we run through that loop precisely once. Of the remaining cases
where the loop iterates more than once, 25% can be trivially eliminated
through better sequencing of operations in the loop.
>
>
> Not a conservative guess at pseudo registers, a conservative guess at
> hard registers.
>
OK. I see the difference here, but it doesn't change the problem at
hand -- a conservative guess at what hard registers we're going to use
would pessimize the code horribly, possibly more so than making a
conservative guess about what offsets are going to be valid -- a
conservative guess on the registers used for these memory operations
would have a cascading effect throughout the code causing a ton of
reloading of other instructions.
Jeff
^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2009-02-22 2:35 UTC | newest]
Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-02-13 15:13 IRA conflict graph & alternative selection Jeff Law
2009-02-13 15:32 ` Paolo Bonzini
2009-02-13 15:47 ` Michael Matz
2009-02-16 16:46 ` Jeff Law
2009-02-16 17:27 ` Michael Matz
2009-02-13 17:01 ` Jeff Law
2009-02-13 16:37 ` Vladimir Makarov
2009-02-13 16:49 ` Paolo Bonzini
2009-02-13 19:01 ` Ian Lance Taylor
2009-02-13 19:38 ` Vladimir Makarov
2009-02-16 16:25 ` Jeff Law
2009-02-16 20:33 ` Ian Lance Taylor
2009-02-16 23:15 ` Jeff Law
2009-02-17 3:57 ` Ian Lance Taylor
2009-02-17 16:52 ` Bernd Schmidt
2009-02-17 18:37 ` Steven Bosscher
2009-02-17 20:01 ` Ian Lance Taylor
2009-02-20 16:25 ` Denis Chertykov
2009-02-20 22:04 ` Jeff Law
2009-02-17 17:45 ` Jeff Law
2009-02-17 19:57 ` Ian Lance Taylor
2009-02-20 21:58 ` Jeff Law
2009-02-21 16:40 ` Ian Lance Taylor
2009-02-22 2:23 ` Jeff Law
2009-02-22 2:35 ` Jeff Law
2009-02-17 16:28 ` Vladimir Makarov
2009-02-17 17:47 ` Jeff Law
2009-02-13 19:54 ` Jeff Law
2009-02-14 15:00 ` Steven Bosscher
2009-02-16 15:53 ` Jeff Law
2009-02-16 16:40 ` Jeff Law
2009-02-17 16:28 ` Vladimir Makarov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).