* some Sparc hackery in the works
From: David S. Miller @ 1998-07-13 4:36 UTC
To: egcs
I've been tooling in the background on some Sparc backend rewrites to
improve code in general, and in particular for 64-bit targets. The
latter was my original goal when I began these changes.
The work is very preliminary, but I've made much progress. It won't
make it into the upcoming 1.1 release, but I will merge it in soon
afterwards.
So I figured that I'd put my latest patches up for FTP for those who
want to take a look and maybe contribute more improvements or just
plain regression test what I have so far. Feel free to do either.
First, some caveats:
1) 64-bit targets are completely broken; don't even try to use them yet.
This will be fixed soon.
2) JFC's VIS hacks are not completely back in with my changes;
again, this will be fixed soon.
All 32-bit non-VIS/non-v8plus targets should work just fine.
Next, what I was trying to accomplish:
1) Make all RTL ever generated by the Sparc back end have a
one-to-one correspondence between RTL insns and real Sparc
insns.
For certain classes of operations the Sparc backend currently
generates multiple sparc insns per RTL insn. This is bad for
two reasons:
a) Less accurate schedules are obtained
b) reorg can't fill as many delay slots
The first major obstacle here was the move patterns. I rewrote
them completely; happily, half of sparc.c disappeared, as things
such as emit_move_sequence, output_move_double, etc. were no longer
needed.
The remaining cases (most of which I haven't gotten to yet) to fix
this issue completely have to do with multi-register (DI mode etc.)
integer operations on sparc32, some PIC patterns, and the next
topic.
2) Output more efficient constant moves and symbol references on
64-bit targets, in all code models.
This was the major flaw I saw, and it gave me the incentive to work
on these rewrites. Currently the 64-bit sparc backend produces
poor code sequences to load constants, such as:
sethi %hi(const), %tmp
or %tmp, %lo(const), %tmp
sllx %tmp, 32, %tmp
sethi %hi(const), %dest
or %dest, %lo(const), %dest
or %dest, %tmp, %dest
This will happen any time bits need to be set in the upper
32 bits of a 64-bit DImode value.
The framework I have added makes more efficient sequences possible;
the actual implementation is coming soon. The one problem case
which may be difficult to handle satisfactorily is symbol
address loads in the ANYWHERE 64-bit code model. Currently
this requires a fixed hard register, which brings me to...
3) Complete removal of any references to hard integer registers in the
Sparc machine description. The main incentive is that such
hard-coded registers prevent CSE from occurring.
There are two primary places where this occurs in the current Sparc
backend: the aforementioned 64-bit symbol loading case, and
PIC on all sparc sub-targets.
Richard Henderson has worked closely with me on some solutions for
the PIC problems.
Richard has already improved PIC code gen for jump tables, especially
for -fPIC, in this patch set.
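To make the cost of the six-instruction constant load from item 2 concrete, here is a minimal Python model of it. It is only a sketch: the helper names (hi, lo, load_const64) are made up for illustration, and it assumes that the first sethi/or pair builds the upper 32 bits of the constant and the second pair the lower 32 bits, which the sequence above writes loosely as "const" in both places.

```python
MASK64 = (1 << 64) - 1


def hi(value32):
    """Model of %hi(): the upper 22 bits of a 32-bit value, as sethi sets them."""
    return value32 & 0xFFFFFC00


def lo(value32):
    """Model of %lo(): the low 10 bits of a 32-bit value."""
    return value32 & 0x3FF


def load_const64(const):
    """Replay the six-instruction sequence; return (result, instruction count)."""
    upper = (const >> 32) & 0xFFFFFFFF   # assumed operand of the first pair
    lower = const & 0xFFFFFFFF           # assumed operand of the second pair

    tmp = hi(upper)                      # sethi %hi(upper), %tmp
    tmp |= lo(upper)                     # or    %tmp, %lo(upper), %tmp
    tmp = (tmp << 32) & MASK64           # sllx  %tmp, 32, %tmp
    dest = hi(lower)                     # sethi %hi(lower), %dest
    dest |= lo(lower)                    # or    %dest, %lo(lower), %dest
    dest |= tmp                          # or    %dest, %tmp, %dest
    return dest, 6


value, count = load_const64(0x123456789ABCDEF0)
print(hex(value), count)
```

In this model every constant with bits set in its upper half pays for all six instructions plus a temporary register, regardless of how simple the bit pattern is.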
Anyways, the current state of my changes is up for grabs at:
ftp://ftp.cobaltmicro.com/pub/users/davem/egcs/sparc_movsi_rewrite.diff
This is against the latest CVS development sources. Enjoy, and
comments are welcome.
Later,
David S. Miller
davem@dm.cobaltmicro.com
* Re: some Sparc hackery in the works
From: Joern Rennecke @ 1998-07-13 12:58 UTC
To: David S. Miller; +Cc: egcs
> 1) Make all RTL ever generated by the Sparc back end have a
> one-to-one correspondence between RTL insns and real Sparc
> insns.
>
> For certain classes of operations the Sparc backend currently
> generates multiple sparc insns per RTL insn. This is bad for
> two reasons:
>
> a) Less accurate schedules are obtained
> b) reorg can't fill as many delay slots
>
> The first major obstacle here was the move patterns. I rewrote
> them completely; happily, half of sparc.c disappeared, as things
> such as emit_move_sequence, output_move_double, etc. were no longer
> needed.
>
> The remaining cases (most of which I haven't gotten to yet) to fix
> this issue completely have to do with multi-register (DI mode etc.)
> integer operations on sparc32, some PIC patterns, and the next
> topic.
You usually get worse code if you try to open-code DImode operations as
RTL at rtl generation time.
You can get the same benefit for scheduling and delay slot filling if you
provide define_splits for the patterns in question; you can then also
save the output template by replacing it with a '#', to indicate that
this pattern must be split.
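As a rough illustration of what such a split produces, the following Python toy breaks one DImode register-to-register move into two word-sized moves. It is not GCC code: split_di_move and apply_moves are hypothetical names, and the sketch assumes a register pair (n, n+1) holds the high word in n and the low word in n+1. The one subtlety it shows is ordering, since overlapping pairs must not clobber a source word before it is read.

```python
def split_di_move(dst, src):
    """Split a DImode move between register pairs into single-word moves.

    dst and src are the first register number of each pair (assumption:
    high word in n, low word in n + 1). Returns a list of
    ("mov", source_reg, dest_reg) tuples, one per machine instruction.
    """
    if dst == src:
        return []                      # nothing to copy
    high = ("mov", src, dst)
    low = ("mov", src + 1, dst + 1)
    if dst == src + 1:
        # dst's high register is src's low register: copy the low word first
        return [low, high]
    return [high, low]


def apply_moves(moves, regs):
    """Execute the moves on a copy of a {reg: value} dict and return it."""
    regs = dict(regs)
    for _, source, dest in moves:
        regs[dest] = regs[source]
    return regs


print(split_di_move(4, 2))   # disjoint pairs
print(split_di_move(3, 2))   # overlapping pairs, low word moved first
```

Each tuple in the result stands for exactly one instruction, which is what lets the scheduler and delay slot filler treat the pieces individually.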
* Re: some Sparc hackery in the works
From: David S. Miller @ 1998-07-13 16:44 UTC
To: amylaar; +Cc: egcs
From: Joern Rennecke <amylaar@cygnus.co.uk>
Date: Mon, 13 Jul 1998 19:54:53 +0100 (BST)
You usually get worse code if you try to open-code DImode
operations as RTL at rtl generation time. You can get the same
benefit for scheduling and delay slot filling if you provide
define_splits for the patterns in question; you can then also save
the output template by replacing it with a '#', to indicate that
this pattern must be split.
If you look at my code, this is precisely what I'm doing.
Later,
David S. Miller
davem@dm.cobaltmicro.com
* Re: some Sparc hackery in the works
From: Michael Hayes @ 1998-07-13 20:56 UTC
To: Joern Rennecke; +Cc: David S. Miller, egcs
Joern Rennecke writes:
> You usually get worse code if you try to open-code DImode operations as
> RTL at rtl generation time.
Yep, I fell into that trap. The register allocator does a very poor
job with SUBREGs.
> You can get the same benefit for scheduling and delay slot filling if you
> provide define_splits for the patterns in question; you can then also
> save the output template by replacing it with a '#', to indicate that
> this pattern must be split.
Snippets of advice, such as this, should be in the manual somewhere.
The only drawback to this approach I found was that it was harder to
peer inside these composite insns to check for register usage.
* Re: some Sparc hackery in the works
From: Jeffrey A Law @ 1998-07-13 21:47 UTC
To: Joern Rennecke; +Cc: David S. Miller, egcs
In message <199807131854.TAA18918@phal.cygnus.co.uk> you write:
> You usually get worse code if you try to open-code DImode operations as
> RTL at rtl generation time.
This depends on numerous factors, including but not limited to how many
registers the target has and how important scheduling is for good performance
on the target.
> You can get the same benefit for scheduling and delay slot filling if you
> provide define_splits for the patterns in question; you can then also
> save the output template by replacing it with a '#', to indicate that
> this pattern must be split.
Not necessarily since you lose optimizations done by various other
optimizers. For example, the sub-parts of a DImode load become
candidates for CSE, GCSE, combine, local alloc, etc.
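A toy Python model can show the kind of opportunity meant here. It is a sketch under loud assumptions: the "loads" are modelled as loads of 64-bit constants, the instruction tuples and the names count_redundant and split_words are invented for illustration, and the redundancy check is far simpler than a real CSE pass (it ignores register redefinition entirely).

```python
def count_redundant(insns):
    """Count instructions that recompute an (op, operand) pair already seen."""
    seen = set()
    redundant = 0
    for _dest, op, operand in insns:
        key = (op, operand)
        if key in seen:
            redundant += 1
        else:
            seen.add(key)
    return redundant


def split_words(insns):
    """Open-code each composite 64-bit load as two 32-bit word loads."""
    out = []
    for dest, _op, value in insns:
        out.append((dest + ".hi", "load_si", value >> 32))
        out.append((dest + ".lo", "load_si", value & 0xFFFFFFFF))
    return out


# Two 64-bit values that differ only in their low word.
composite = [
    ("a", "load_di", 0x0000000100000010),
    ("b", "load_di", 0x0000000100000020),
]

print(count_redundant(composite))               # composite form
print(count_redundant(split_words(composite)))  # word-level form
```

Seen as whole DImode operations the two loads have nothing in common, while at word level the shared high half shows up as a repeated computation that an optimizer could reuse.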
There's generally some tradeoff for each direction. We should trust
David to do the right thing for the sparc port.
jeff
ps. Much of Dave's work mirrors stuff rth and myself have discussed
as important to handle better on the sparc port.
* Re: some Sparc hackery in the works
From: Jeffrey A Law @ 1998-07-13 21:47 UTC
To: David S. Miller; +Cc: amylaar, egcs
In message <199807140425.VAA21575@dm.cobaltmicro.com> you write:
> ps. Much of Dave's work mirrors stuff rth and myself have discussed
> as important to handle better on the sparc port.
>
> BTW, during my work I noticed how much the PA port is derived from or
> influenced by the Sparc stuff, perhaps you can easily steal some of my
> tricks and techniques ;-)
When I first started hacking on the PA port in 1992 it was still
generating sparc floating point opcodes :-)
"Derived" is just a nice way of saying the port started as:
cp sparc.md pa.md
cp sparc.h pa.h
cp sparc.c pa.c
vi pa.md pa.c pa.h
I think for everything except the conditional moves and other nullified
instructions we'll probably be able to steal some of your code. In
fact, splitting up the DF/DI mode moves has been on the PA todo list
for a long time now. It wasn't worth doing earlier since we didn't
split after reload. But that's changed recently :-)
jeff
* Re: some Sparc hackery in the works
From: Jeffrey A Law @ 1998-07-13 21:47 UTC
To: David S. Miller; +Cc: egcs
In message <199807131107.EAA11110@dm.cobaltmicro.com> you write:
> The work is very preliminary, but I've made much progress. It won't
> make it into the upcoming 1.1 release, but I will merge it in soon
> afterwards.
Agreed.
> So I figured that I'd put my latest patches up for FTP for those who
> want to take a look and maybe contribute more improvements or just
> plain regression test what I have so far. Feel free to do either.
Another option is to make a branch for this work. That way the latest
bits are always in CVS. Folks that want to look at or help with the
code just check out the magic branch.
> 1) Make all RTL ever generated by the Sparc back end have a
> one-to-one correspondence between RTL insns and real Sparc
> insns.
Yippie!
> For certain classes of operations the Sparc backend currently
> generates multiple sparc insns per RTL insn. This is bad for
> two reasons:
>
> a) Less accurate schedules are obtained
> b) reorg can't fill as many delay slots
>
> The first major obstacle here was the move patterns. I rewrote
> them completely; happily, half of sparc.c disappeared, as things
> such as emit_move_sequence, output_move_double, etc. were no longer
> needed.
Yup. This is traditionally where this kind of work gets hard.
> 3) Complete removal of any references to hard integer registers in the
> Sparc machine description. The main incentive is that such
> hard-coded registers prevent CSE from occurring.
Awesome. These hard registers also prevent certain combining, hoisting
out of loops, gcse, etc. Explicitly mentioned hard registers are
generally a lose.
jeff
* Re: some Sparc hackery in the works
From: David S. Miller @ 1998-07-13 23:48 UTC
To: law; +Cc: amylaar, egcs
Date: Mon, 13 Jul 1998 21:48:37 -0600
From: Jeffrey A Law <law@hurl.cygnus.com>
In message <199807131854.TAA18918@phal.cygnus.co.uk> you write:
> You usually get worse code if you try to open-code DImode operations as
> RTL at rtl generation time.
This depends on numerous factors, including but not limited to how many
registers the target has and how important scheduling is for good performance
on the target.
> You can get the same benefit for scheduling and delay slot filling if you
> provide define_splits for the patterns in question; you can then also
> save the output template by replacing it with a '#', to indicate that
> this pattern must be split.
Not necessarily since you lose optimizations done by various other
optimizers. For example, the sub-parts of a DImode load become
candidates for CSE, GCSE, combine, local alloc, etc.
There's generally some tradeoff for each direction. We should trust
David to do the right thing for the sparc port.
There is a third issue. If you open code it early, you risk ending up
with a multi-insn output in the end. The fundamental reason is
that some bits of information (odd/even DI mode register alignment,
and actual address alignment for MEM moves) pop up near the end of
code gen, depending upon what the register allocators do (and subsequently
which stack slots go to which items etc.). There is also an issue with
the argument passing semantics of DI mode objects on 32-bit targets.
Look at my code for the movdi sequences and splits very carefully. I
tried many combinations and approaches, and this was the first which
reliably gave me perfect 1<-->1 insn output in the end.
Some of my early open coding attempts caused many situations where at
the end, in the define_insn's, I had to detect the bad register number
or address alignment cases and fix them up. This was bad and defeated
the goals I had in mind.
Thus I made everything a post-reload split... oh, always keep in mind
that all these move patterns are "special" anyways due to reload.
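The late-appearing alignment facts can be sketched in a few lines of Python. This is an illustration only, not the code in the patch: emit_di_load is a hypothetical helper, and it relies on the SPARC rule that ldd needs an even-numbered destination register and a doubleword-aligned address. Until the register and stack slot are known, the choice between one instruction and two cannot be made.

```python
def emit_di_load(base, offset, align, reg):
    """Return the instructions that load a 64-bit value into %r<reg>, %r<reg+1>.

    base   -- base register name, e.g. "%fp"
    offset -- byte offset from the base register
    align  -- known alignment of the address in bytes
    reg    -- number of the first destination register
    """
    if reg % 2 == 0 and align >= 8:
        # Even register and doubleword-aligned address: one ldd suffices.
        return [f"ldd [{base}{offset:+d}], %r{reg}"]
    # Otherwise fall back to two word loads (high word at the lower address).
    return [
        f"ld [{base}{offset:+d}], %r{reg}",
        f"ld [{base}{offset + 4:+d}], %r{reg + 1}",
    ]


print(emit_di_load("%fp", -8, 8, 2))   # aligned slot, even register
print(emit_di_load("%fp", -8, 8, 3))   # odd register forces two loads
print(emit_di_load("%sp", 64, 4, 2))   # only word-aligned address
```

Both inputs that decide the outcome (reg and align) come from register allocation and stack slot assignment, which is why deciding after reload gives a reliable one-to-one mapping.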
ps. Much of Dave's work mirrors stuff rth and myself have discussed
as important to handle better on the sparc port.
BTW, during my work I noticed how much the PA port is derived from or
influenced by the Sparc stuff, perhaps you can easily steal some of my
tricks and techniques ;-)
Later,
David S. Miller
davem@dm.cobaltmicro.com