Register classes

public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed

* Register classes
@ 2004-08-18 17:59 Daniel Berlin
  2004-08-20  2:07 ` James E Wilson
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Berlin @ 2004-08-18 17:59 UTC (permalink / raw)
  To: gcc

A whole lot of ports seem to have register classes that contain 
registers that can't possibly be used together.
They also have to seem weird groupings of registers that don't make 
sense to be used by any single pseudo.

What exactly is the goal of these register classes?
Is it to work around bad allocation decisions by the register 
allocator/regclass (IE get regclass to choose this new class as the 
preferred class so that the allocator makes a better decision later 
on)?

I recently had another discussion with Gregory Chaitan about register 
allocation, and two things he was adamant about (that I was also 
somewhat strong feeling, but not as adamant about during design of 
new-ra):

1. The interference graph should represent all interference constraints 
necessary to do allocation in some way.  IE for variables that need 
multiple registers, this needs to be represented in the interference 
graph in some way (through either extra or weighted edges).
IE you should need nothing but a single bit test to determine whether 
two variables can share a register or not.
New-ra didn't follow this, and it became quite messy to try to 
determine whether two variables actually interfered, and what the 
possible registers we could use for these variables were.  This took 
away most of the benefits of using an interference graph in the first 
place.

2. That a lot of parameterization be done properly, so that costs are 
done right for each thing, and the allocator can make the right 
decisions.

In our case, that means that register classes (if we keep them) should 
represent only one thing, whereas right now it seems they can represent 
one of three things:

1. The set of registers possible for some given mode/usage.
2. The set of registers good (but not all possibly usable registers for 
that mode/usage are included) for some given mode/usage.
3. The set of registers good (but including  unusable registers for 
that mode/usage) are included.

Have I missed anything (I haven't touched new-ra or old-ra in a year or 
so, so it's possibly i've lost it).
--Dan

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Register classes
  2004-08-18 17:59 Register classes Daniel Berlin
@ 2004-08-20  2:07 ` James E Wilson
  2004-08-20  5:28   ` Daniel Berlin
  0 siblings, 1 reply; 7+ messages in thread
From: James E Wilson @ 2004-08-20  2:07 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: gcc

Daniel Berlin wrote:
> A whole lot of ports seem to have register classes that contain 
> registers that can't possibly be used together.
> They also have to seem weird groupings of registers that don't make 
> sense to be used by any single pseudo.

This is because of interactions between the register preferences 
computed by regclass and the register allocator.

The register preferencing works by looking at the instructions that use 
a pseudo, looking at the register classes preferred by these 
instructions, and then forming unions/intersections to determine the 
best class to allocate a register to.  If you don't have register 
classes to represent the union/intersection of other register classes, 
then you can end up with suboptimal results.  In some extreme cases, a 
port may fail in reload if a union/intersection class isn't there.  This 
is sometimes a problem for embedded targets with small register sets, 
and lots of registers with special purposes.  See the subunion and 
superunion stuff in regclass.c for some of the stuff that is done with 
register classes.

Suppose you have a pseudo that is used in only two instructions.  One 
instruction requires the pseudo to be in class A.  The other requires it 
to be in class B.  The two classes don't intersect.  Neither class is 
better.  The best class to use would be the union of class A and B, 
which then lets the register allocator choose a hard reg from either 
class, depending on other factors, such as which class has less register 
pressure.  If you don't have the union reg class, then regclass must 
arbitrarily pick one (I think it picks the lower numbered one?), and 
this may result in suboptimal code if this forces an otherwise 
unnecessary spill.

Also, in some cases, it is useful to have classes that represent the 
difference of two classes.  Suppose you have a couple of registers with 
special purposes that cause them to have high register pressure.  Then 
you may be able to improve results by creating register classes with 
those registers subtracted out, and then using hooks to force the 
difference class to be used when possible, to reduce register pressure 
on the special purpose registers.  Again, this is more likely to be 
useful for an embedded target with a small number of registers and a set 
of registers with special purposes.
-- 
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Register classes
  2004-08-20  2:07 ` James E Wilson
@ 2004-08-20  5:28   ` Daniel Berlin
  0 siblings, 0 replies; 7+ messages in thread
From: Daniel Berlin @ 2004-08-20  5:28 UTC (permalink / raw)
  To: James E Wilson; +Cc: gcc


On Aug 19, 2004, at 9:53 PM, James E Wilson wrote:

> Daniel Berlin wrote:
>> A whole lot of ports seem to have register classes that contain 
>> registers that can't possibly be used together.
>> They also have to seem weird groupings of registers that don't make 
>> sense to be used by any single pseudo.
>
> This is because of interactions between the register preferences 
> computed by regclass and the register allocator.
>
IOW, it's done to try to get better performance out of the register 
allocator.
:)
--Dan

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Register classes
  2004-08-19 17:09     ` Daniel Berlin
@ 2004-08-19 20:14       ` Denis Chertykov
  0 siblings, 0 replies; 7+ messages in thread
From: Denis Chertykov @ 2004-08-19 20:14 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Michael Matz, gcc, Caroline Tice

Daniel Berlin <dberlin@dberlin.org> writes:

> On Aug 19, 2004, at 8:56 AM, Michael Matz wrote:
> 
> > Hi,
> >
> > On Wed, 18 Aug 2004, Daniel Berlin wrote:
> >
> >> Just so nobody has any doubts, i have no current plans to do
> >> anything to
> >> the register allocator. I'm just trying to gather more information in
> >> case i decide to mount another future assault on it.
> >>
> >> I should point out something from the below paper though:
> >
> > As this paper didn't yet show up in citeseer (it seems), I would be
> > thankful for this paper.
> >
> I'll send it to you.
> 
> > Note that new-ra does not really use register classes for the
> > interference
> > graph, but instead register sets.
> >
> Except they are taken from register classes.

The register sets are taken from register classes and new-ra use
register sets for interference graph. Even more new-ra tracks useless
conflicts. This is conflicts between webs with nonintersected
register sets but intersected live ranges.

Denis.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Register classes
  2004-08-19 13:42   ` Michael Matz
@ 2004-08-19 17:09     ` Daniel Berlin
  2004-08-19 20:14       ` Denis Chertykov
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Berlin @ 2004-08-19 17:09 UTC (permalink / raw)
  To: Michael Matz; +Cc: gcc, Caroline Tice


On Aug 19, 2004, at 8:56 AM, Michael Matz wrote:

> Hi,
>
> On Wed, 18 Aug 2004, Daniel Berlin wrote:
>
>> Just so nobody has any doubts, i have no current plans to do anything 
>> to
>> the register allocator. I'm just trying to gather more information in
>> case i decide to mount another future assault on it.
>>
>> I should point out something from the below paper though:
>
> As this paper didn't yet show up in citeseer (it seems), I would be
> thankful for this paper.
>
I'll send it to you.

> Note that new-ra does not really use register classes for the 
> interference
> graph, but instead register sets.
>
Except they are taken from register classes.

> Trivially colorable nodes are still trivially colorable also in new-ra,
> _as long_ as they only require one register, so I'm not exactly sure 
> what
> you were speaking about in your first mail.

Uh, handling multi-regs is something very ugly in new-ra right now.
Also, the interference graph in new-ra is split into two graphs, which 
is also a bad idea.
(sup_igraph and igraph).

>> "One of our key insights is that coloring constraints on each
>> interference-graph node should be expressed in terms of the set
>
> Without having read the paper, at least this part I think is basically
> implemented in new-ra.

But it's not, because we don't gracefully handle multi-reg pseudos and 
whatnot simply by using the interference graph.  We go through all 
kinds of contortions.

>   Sometimes we still use the regclasses for
> preferring some regs.  And of course when looking on the constraints 
> of an
> insn you have to deal with regclasses.
>
>
> Ciao,
> Michael.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Register classes
  2004-08-18 18:40 ` Daniel Berlin
@ 2004-08-19 13:42   ` Michael Matz
  2004-08-19 17:09     ` Daniel Berlin
  0 siblings, 1 reply; 7+ messages in thread
From: Michael Matz @ 2004-08-19 13:42 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Caroline Tice, gcc

Hi,

On Wed, 18 Aug 2004, Daniel Berlin wrote:

> Just so nobody has any doubts, i have no current plans to do anything to
> the register allocator. I'm just trying to gather more information in
> case i decide to mount another future assault on it.
> 
> I should point out something from the below paper though:

As this paper didn't yet show up in citeseer (it seems), I would be 
thankful for this paper.

Note that new-ra does not really use register classes for the interference 
graph, but instead register sets.

Trivially colorable nodes are still trivially colorable also in new-ra,
_as long_ as they only require one register, so I'm not exactly sure what
you were speaking about in your first mail.

The problem with multi-reg pseudos is a tradeoff.  If you exactly count 
the necessary registers you make your nodes unnesessarily non-trivial.  
One example:  suppose R needs two _consecutive_ hardregs and that there 
are N colors.  This node is only trivially colorable if it has N/2-1 
neighbors (instead of N-2 which on would think at first).  The problem is 
the consecutivness.  If it has N-2 neighbors that we can be sure that 
there are two arbitrary hardregs available, but not two consecutive.
N/2-1 is a big hammer to ensure trivial colorability, which makes much too 
many nodes spilled than later necessary.

So, I choose another solution, by also popping it off the stack if it has 
<= N-2 neighbors.  But then I have to live and cope with the fact that 
this node might not get its two free consecutive hardregs.  And to make 
the effect of this less noticable (i.e. increasing the probability of it 
finding the two colors) there are some heuristics added.

Maybe you speak about this complexity.  But IMHO the coloring of the graph 
itself is still reasonable clean.  The other complexity is due to the way 
how it tries to avoid spilling must-get-a-color webs (which are of 
different classes, like short-living ones, or those with only one register 
possible and such).  Due to the above some simplified webs do not fall 
into the category, and some do.

> "One of our key insights is that coloring constraints on each
> interference-graph node should be expressed in terms of the set

Without having read the paper, at least this part I think is basically 
implemented in new-ra.  Sometimes we still use the regclasses for 
preferring some regs.  And of course when looking on the constraints of an 
insn you have to deal with regclasses.

Ciao,
Michael.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Register classes
  2004-08-18 18:16 Fwd: " Caroline Tice
@ 2004-08-18 18:40 ` Daniel Berlin
  2004-08-19 13:42   ` Michael Matz
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Berlin @ 2004-08-18 18:40 UTC (permalink / raw)
  To: Caroline Tice; +Cc: gcc

[-- Attachment #1: Type: text/plain, Size: 5110 bytes --]

On Aug 18, 2004, at 2:05 PM, Caroline Tice wrote:

> Oops, I forgot to copy the list on this message...

Just so nobody has any doubts, i have no current plans to do anything 
to the register allocator.
I'm just trying to gather more information in case i decide to mount 
another future assault on it.

I should point out something from the below paper though:
"One of our key insights is that coloring constraints on each
interference-graph node should be expressed in terms of the set
of registers available to it. With this insight we produce a 
generalization
that handles simultaneous allocation of multiple register
classes and accommodates register aliasing in an elegant way. Because
allocators using our approach know about the set of registers
available to each node, they can recognize when overlap between
such sets would introduce inaccuracies in the criterion for 
colorability,
and thereby avoid the overcounting inherent in simpler
formulations."

This insight is something i have shared as well, and one of the reasons 
i've always hated regclasses. They don't give you this information, 
they give you "kinda sorta this information, sometimes".

> Begin forwarded message:
>
>> From: Caroline Tice <ctice@apple.com>
>> Date: August 18, 2004 10:55:43 AM PDT
>> To: Daniel Berlin <dberlin@dberlin.org>
>> Subject: Re: Register classes
>>
>> If you are planning on doing some work on the register allocator, 
>> there was an EXCELLENT paper in
>> PLDI this year (ACM Conference on Programming Language Design and 
>> Implementation) on
>> updating Chaitin's algorithm to work well on modern architectures.  I 
>> would strongly recommend you read it, if you haven't already.  I can 
>> send you a copy if you like.  Below is my summary of the paper:
>>
>> A Generalized Algorithm for Graph-Coloring Register Allocation
>> by M.D. Smith, N. Ramsey and G. Holloway
>>
>> Expanding Chaitin's graph coloring algorithm to work on real hardware 
>> architectures (taking into account
>> register classes, overlapping between register classes, register 
>> aliasing, etc).  Basically, Chaitin said a
>> node in the coloring graph was trivially colorable if the number of 
>> neighbors of N, "degree(N)" was less
>> than the number of registers on the machine.  The new formula says   
>> node is trivially colorable is
>>  "squeeze (N)" is less than the number of registers actually 
>> available for allocation, where squeeze (N) is
>> the number of registers that N's neighbors will actually need.  The 
>> rest of the paper is the concerned with
>> accurately defining "squeeze (N)" and "number of registers actually 
>> available for allocation" taking the
>> various hardware architectures and constraints into account.
>>
>> -- Caroline Tice
>> ctice@apple.com
>>
>> On Aug 18, 2004, at 10:39 AM, Daniel Berlin wrote:
>>
>>> A whole lot of ports seem to have register classes that contain 
>>> registers that can't possibly be used together.
>>> They also have to seem weird groupings of registers that don't make 
>>> sense to be used by any single pseudo.
>>>
>>> What exactly is the goal of these register classes?
>>> Is it to work around bad allocation decisions by the register 
>>> allocator/regclass (IE get regclass to choose this new class as the 
>>> preferred class so that the allocator makes a better decision later 
>>> on)?
>>>
>>> I recently had another discussion with Gregory Chaitan about 
>>> register allocation, and two things he was adamant about (that I was 
>>> also somewhat strong feeling, but not as adamant about during design 
>>> of new-ra):
>>>
>>> 1. The interference graph should represent all interference 
>>> constraints necessary to do allocation in some way.  IE for 
>>> variables that need multiple registers, this needs to be represented 
>>> in the interference graph in some way (through either extra or 
>>> weighted edges).
>>> IE you should need nothing but a single bit test to determine 
>>> whether two variables can share a register or not.
>>> New-ra didn't follow this, and it became quite messy to try to 
>>> determine whether two variables actually interfered, and what the 
>>> possible registers we could use for these variables were.  This took 
>>> away most of the benefits of using an interference graph in the 
>>> first place.
>>>
>>> 2. That a lot of parameterization be done properly, so that costs 
>>> are done right for each thing, and the allocator can make the right 
>>> decisions.
>>>
>>>
>>> In our case, that means that register classes (if we keep them) 
>>> should represent only one thing, whereas right now it seems they can 
>>> represent one of three things:
>>>
>>> 1. The set of registers possible for some given mode/usage.
>>> 2. The set of registers good (but not all possibly usable registers 
>>> for that mode/usage are included) for some given mode/usage.
>>> 3. The set of registers good (but including  unusable registers for 
>>> that mode/usage) are included.
>>>
>>> Have I missed anything (I haven't touched new-ra or old-ra in a year 
>>> or so, so it's possibly i've lost it).
>>> --Dan
>>>
>>

[-- Attachment #2: Type: text/enriched, Size: 5218 bytes --]

On Aug 18, 2004, at 2:05 PM, Caroline Tice wrote:

<excerpt>Oops, I forgot to copy the list on this message...

</excerpt>

Just so nobody has any doubts, i have no current plans to do anything
to the register allocator.

I'm just trying to gather more information in case i decide to mount
another future assault on it.

I should point out something from the below paper though: 

"<smaller><x-tad-smaller>One of our key insights is that coloring
constraints on each

</x-tad-smaller><x-tad-smaller>interference-graph node should be
expressed in terms of the set

of registers available to it. With this insight we produce a
generalization

that handles simultaneous allocation of multiple register

classes and accommodates register aliasing in an elegant way. Because

allocators using our approach know about the set of registers

available to each node, they can recognize when overlap between

such sets would introduce inaccuracies in the criterion for
colorability,

and thereby avoid the overcounting inherent in simpler

formulations.</x-tad-smaller></smaller>"

This insight is something i have shared as well, and one of the
reasons i've always hated regclasses. They don't give you this
information, they give you "kinda sorta this information, sometimes".

<excerpt>Begin forwarded message:

<excerpt><bold><color><param>0000,0000,0000</param>From:
</color></bold>Caroline Tice <<ctice@apple.com>

<bold><color><param>0000,0000,0000</param>Date: </color></bold>August
18, 2004 10:55:43 AM PDT

<bold><color><param>0000,0000,0000</param>To: </color></bold>Daniel
Berlin <<dberlin@dberlin.org>

<bold><color><param>0000,0000,0000</param>Subject: </color>Re:
Register classes

</bold>

If you are planning on doing some work on the register allocator,
there was an EXCELLENT paper in

PLDI this year (ACM Conference on Programming Language Design and
Implementation) on

updating Chaitin's algorithm to work well on modern architectures.  I
would strongly recommend you read it, if you haven't already.  I can
send you a copy if you like.  Below is my summary of the paper:

A Generalized Algorithm for Graph-Coloring Register Allocation

by M.D. Smith, N. Ramsey and G. Holloway

Expanding Chaitin's graph coloring algorithm to work on real hardware
architectures (taking into account

register classes, overlapping between register classes, register
aliasing, etc).  Basically, Chaitin said a

node in the coloring graph was trivially colorable if the number of
neighbors of N, "degree(N)" was less

than the number of registers on the machine.  The new formula says  
node is trivially colorable is

 "squeeze (N)" is less than the number of registers actually available
for allocation, where squeeze (N) is

the number of registers that N's neighbors will actually need.  The
rest of the paper is the concerned with

accurately defining "squeeze (N)" and "number of registers actually
available for allocation" taking the

various hardware architectures and constraints into account.

-- Caroline Tice

ctice@apple.com

On Aug 18, 2004, at 10:39 AM, Daniel Berlin wrote:

<excerpt>A whole lot of ports seem to have register classes that
contain registers that can't possibly be used together.

They also have to seem weird groupings of registers that don't make
sense to be used by any single pseudo.

What exactly is the goal of these register classes?

Is it to work around bad allocation decisions by the register
allocator/regclass (IE get regclass to choose this new class as the
preferred class so that the allocator makes a better decision later
on)?

I recently had another discussion with Gregory Chaitan about register
allocation, and two things he was adamant about (that I was also
somewhat strong feeling, but not as adamant about during design of
new-ra):

1. The interference graph should represent all interference
constraints necessary to do allocation in some way.  IE for variables
that need multiple registers, this needs to be represented in the
interference graph in some way (through either extra or weighted
edges).

IE you should need nothing but a single bit test to determine whether
two variables can share a register or not.

New-ra didn't follow this, and it became quite messy to try to
determine whether two variables actually interfered, and what the
possible registers we could use for these variables were.  This took
away most of the benefits of using an interference graph in the first
place.

2. That a lot of parameterization be done properly, so that costs are
done right for each thing, and the allocator can make the right
decisions.

In our case, that means that register classes (if we keep them) should
represent only one thing, whereas right now it seems they can
represent one of three things:

1. The set of registers possible for some given mode/usage.

2. The set of registers good (but not all possibly usable registers
for that mode/usage are included) for some given mode/usage.

3. The set of registers good (but including  unusable registers for
that mode/usage) are included.

Have I missed anything (I haven't touched new-ra or old-ra in a year
or so, so it's possibly i've lost it).

--Dan

</excerpt>

</excerpt></excerpt>

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2004-08-20  2:07 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-08-18 17:59 Register classes Daniel Berlin
2004-08-20  2:07 ` James E Wilson
2004-08-20  5:28   ` Daniel Berlin
2004-08-18 18:16 Fwd: " Caroline Tice
2004-08-18 18:40 ` Daniel Berlin
2004-08-19 13:42   ` Michael Matz
2004-08-19 17:09     ` Daniel Berlin
2004-08-19 20:14       ` Denis Chertykov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).