public inbox for gcc@gcc.gnu.org
* some Sparc hackery in the works
@ 1998-07-13  4:36 David S. Miller
  1998-07-13 12:58 ` Joern Rennecke
  1998-07-13 21:47 ` Jeffrey A Law
  0 siblings, 2 replies; 8+ messages in thread
From: David S. Miller @ 1998-07-13  4:36 UTC (permalink / raw)
  To: egcs

I've been tooling in the background on some Sparc backend rewrites to
improve code in general, and in particular for 64-bit targets.  The
latter was my original goal when I began these changes.

The work is very preliminary, but I've made much progress.  It won't
make it into the upcoming 1.1 release, but I will merge it in soon
afterwards.

So I figured that I'd put my latest patches up for FTP for those who
want to take a look and maybe contribute more improvements or just
plain regression test what I have so far.  Feel free to do either.

First, some caveats:

1) 64-bit targets are completely broken, don't even try to use them yet.
   This will be fixed soon.

2) JFC's VIS hacks are not completely back in with my changes,
   again this will be fixed soon.

All 32-bit non-VIS/non-v8plus targets should work just fine.

Next, what I was trying to accomplish:

1) Make all RTL ever generated by the Sparc back end have a
   one to one correspondence between RTL insns and real Sparc
   insns.

   For certain classes of operations the Sparc backend currently
   generates multiple sparc insns per RTL insn.  This is bad for
   two reasons:

	a) Less accurate schedules are obtained
	b) reorg can't fill as many delay slots

   The first major obstacle here was the move patterns.  I rewrote
   them completely; happily, half of sparc.c disappeared as things
   such as emit_move_sequence, output_move_double etc. were no longer
   needed.

   The remaining cases (most of which I haven't gotten to yet) to fix
   this issue completely have to do with multi-register (DI mode etc.)
   integer operations on sparc32, some PIC patterns, and the next
   topic.

2) Output more efficient constant moves and symbol references on
   64-bit targets, in all code models.

   This was the major flaw I saw which gave me incentive to work
   on these rewrites.  Currently the 64-bit sparc backend produces
   poor sequences of code to load constants, such as:

   sethi	%hi(const), %tmp
   or		%tmp, %lo(const), %tmp
   sllx		%tmp, 32, %tmp
   sethi	%hi(const), %dest
   or		%dest, %lo(const), %dest
   or		%dest, %tmp, %dest

   This will happen any time bits need to be set in the upper
   32 bits of a 64-bit DImode value.

   The framework I have added makes more efficient sequences
   possible; the actual implementation is coming soon.  The one
   problem case which may be difficult to handle satisfactorily is
   symbol address loads in the ANYWHERE 64-bit code model.  Currently
   this requires a fixed hard register which brings me to...

3) Complete removal of any references to hard integer registers in the
   Sparc machine description.  The main incentive is that such
   hard-coded registers prevent CSE from occurring.

   There are two primary places where this occurs in the current Sparc
   backend: the aforementioned 64-bit symbol loading case, and PIC on
   all sparc sub-targets.

   Richard Henderson has worked closely with me on some solutions for
   the PIC problems.
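
For anyone who wants to see what the six-instruction constant load
quoted in item 2 actually computes, here is a minimal sketch in C that
models each instruction as an operation on 64-bit values.  It assumes
that the first sethi/or pair loads the upper 32 bits of the constant and
the second pair the lower 32 bits (the listing writes "const" for both).
The helper names sethi_hi, lo10 and load_const64 are made up for
illustration and are not part of the backend.

```c
#include <assert.h>
#include <stdint.h>

/* sethi %hi(x), rd: keep the upper 22 bits of a 32-bit value and
   clear the low 10 bits (the upper 32 bits of rd end up zero).  */
static uint64_t sethi_hi(uint32_t x) { return (uint64_t)(x & 0xfffffc00u); }

/* %lo(x): the low 10 bits of a 32-bit value.  */
static uint64_t lo10(uint32_t x) { return (uint64_t)(x & 0x3ffu); }

/* Hypothetical model of the six-instruction 64-bit constant load.  */
static uint64_t load_const64(uint64_t c)
{
    uint32_t high = (uint32_t)(c >> 32);   /* assumed: upper half of const */
    uint32_t low  = (uint32_t)c;           /* assumed: lower half of const */
    uint64_t tmp, dest;

    tmp  = sethi_hi(high);       /* sethi %hi(high), %tmp        */
    tmp  = tmp | lo10(high);     /* or    %tmp, %lo(high), %tmp  */
    tmp  = tmp << 32;            /* sllx  %tmp, 32, %tmp         */
    dest = sethi_hi(low);        /* sethi %hi(low), %dest        */
    dest = dest | lo10(low);     /* or    %dest, %lo(low), %dest */
    dest = dest | tmp;           /* or    %dest, %tmp, %dest     */

    assert(dest == c);           /* the model must rebuild the constant */
    return dest;
}
```

The sketch spends all six steps no matter what the constant looks like,
so a value with only a few bits set in the upper half costs as much as
a fully general one.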

Richard has already improved PIC code gen for jump tables, especially
for -fPIC, in this patch set.

Anyways, the current state of my changes is up for grabs at:

ftp://ftp.cobaltmicro.com/pub/users/davem/egcs/sparc_movsi_rewrite.diff

This is against the latest CVS development sources.  Enjoy, and
comments are welcome.

Later,
David S. Miller
davem@dm.cobaltmicro.com


* Re: some Sparc hackery in the works
  1998-07-13  4:36 some Sparc hackery in the works David S. Miller
@ 1998-07-13 12:58 ` Joern Rennecke
  1998-07-13 16:44   ` David S. Miller
                     ` (2 more replies)
  1998-07-13 21:47 ` Jeffrey A Law
  1 sibling, 3 replies; 8+ messages in thread
From: Joern Rennecke @ 1998-07-13 12:58 UTC (permalink / raw)
  To: David S. Miller; +Cc: egcs

> 1) Make all RTL ever generated by the Sparc back end have a
>    one to one correspondence between RTL insns and real Sparc
>    insns.
> 
>    For certain classes of operations the Sparc backend currently
>    generates multiple sparc insns per RTL insn.  This is bad for
>    two reasons:
> 
> 	a) Less accurate schedules are obtained
> 	b) reorg can't fill as many delay slots
> 
>    The first major obstacle here was the move patterns.  I rewrote
>    them completely; happily, half of sparc.c disappeared as things
>    such as emit_move_sequence, output_move_double etc. were no longer
>    needed.
> 
>    The remaining cases (most of which I haven't gotten to yet) to fix
>    this issue completely have to do with multi-register (DI mode etc.)
>    integer operations on sparc32, some PIC patterns, and the next
>    topic.

You usually get worse code if you try to open-code DImode operations as
RTL at rtl generation time.
You can get the same benefit for scheduling and delay slot filling if you
provide define_splits for the patterns in question; you can then also
save the output template by replacing it with a '#', to indicate that
this pattern must be split.
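
As background for the DImode discussion, here is a rough sketch in C of
what a single 64-bit add turns into on a 32-bit target: two word-sized
steps linked by a carry.  The struct di_pair and the function add_di
are hypothetical names used only for illustration; the comments name
the sparc32 instructions each step would typically correspond to.

```c
#include <stdint.h>

/* A 64-bit (DImode) value held in a pair of 32-bit registers.  */
struct di_pair { uint32_t hi, lo; };

/* Hypothetical illustration: one DImode add as two word-sized steps.  */
static struct di_pair add_di(struct di_pair a, struct di_pair b)
{
    struct di_pair r;
    uint32_t carry;

    r.lo  = a.lo + b.lo;          /* addcc: add low words, set carry    */
    carry = r.lo < a.lo;          /* carry out of the low word (0 or 1) */
    r.hi  = a.hi + b.hi + carry;  /* addx: add high words plus carry    */
    return r;
}
```

Whether those two steps are exposed as separate RTL insns early, or
kept as one pattern that is split later, is the tradeoff at issue here.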


* Re: some Sparc hackery in the works
  1998-07-13 12:58 ` Joern Rennecke
@ 1998-07-13 16:44   ` David S. Miller
  1998-07-13 20:56   ` Michael Hayes
  1998-07-13 21:47   ` Jeffrey A Law
  2 siblings, 0 replies; 8+ messages in thread
From: David S. Miller @ 1998-07-13 16:44 UTC (permalink / raw)
  To: amylaar; +Cc: egcs

   From: Joern Rennecke <amylaar@cygnus.co.uk>
   Date: Mon, 13 Jul 1998 19:54:53 +0100 (BST)

   You usually get worse code if you try to open-code DImode
   operations as RTL at rtl generation time.  You can get the same
   benefit for scheduling and delay slot filling if you provide
   define_splits for the patterns in question; you can then also save
   the output template by replacing it with a '#', to indicate that
   this pattern must be split.

If you look at my code, this is precisely what I'm doing.

Later,
David S. Miller
davem@dm.cobaltmicro.com


* Re: some Sparc hackery in the works
  1998-07-13 12:58 ` Joern Rennecke
  1998-07-13 16:44   ` David S. Miller
@ 1998-07-13 20:56   ` Michael Hayes
  1998-07-13 21:47   ` Jeffrey A Law
  2 siblings, 0 replies; 8+ messages in thread
From: Michael Hayes @ 1998-07-13 20:56 UTC (permalink / raw)
  To: Joern Rennecke; +Cc: David S. Miller, egcs

Joern Rennecke writes:
 > You usually get worse code if you try to open-code DImode operations as
 > RTL at rtl generation time.

Yep, I fell into that trap.  The register allocator does a very poor
job with SUBREGs.

 > You can get the same benefit for scheduling and delay slot filling if you
 > provide define_splits for the patterns in question; you can then also
 > save the output template by replacing it with a '#', to indicate that
 > this pattern must be split.

Snippets of advice, such as this, should be in the manual somewhere.
The only drawback to this approach I found was that it was harder to
peer inside these composite insns to check for register usage.





* Re: some Sparc hackery in the works
  1998-07-13 12:58 ` Joern Rennecke
  1998-07-13 16:44   ` David S. Miller
  1998-07-13 20:56   ` Michael Hayes
@ 1998-07-13 21:47   ` Jeffrey A Law
  1998-07-13 23:48     ` David S. Miller
  2 siblings, 1 reply; 8+ messages in thread
From: Jeffrey A Law @ 1998-07-13 21:47 UTC (permalink / raw)
  To: Joern Rennecke; +Cc: David S. Miller, egcs

  In message < 199807131854.TAA18918@phal.cygnus.co.uk >you write:
  > You usually get worse code if you try to open-code DImode operations as
  > RTL at rtl generation time.
This depends on numerous factors, including but not limited to how many
registers the target has and how important scheduling is for good performance
on the target.

  > You can get the same benefit for scheduling and delay slot filling if you
  > provide define_splits for the patterns in question; you can then also
  > save the output template by replacing it with a '#', to indicate that
  > this pattern must be split.
Not necessarily since you lose optimizations done by various other
optimizers.  For example, the sub-parts of a DImode load become
candidates for CSE, GCSE, combine, local alloc, etc.

There's generally some tradeoff for each direction.  We should trust
David to do the right thing for the sparc port.

jeff

ps.  Much of Dave's work mirrors stuff rth and myself have discussed
as important to handle better on the sparc port.


* Re: some Sparc hackery in the works
  1998-07-13 23:48     ` David S. Miller
@ 1998-07-13 21:47       ` Jeffrey A Law
  0 siblings, 0 replies; 8+ messages in thread
From: Jeffrey A Law @ 1998-07-13 21:47 UTC (permalink / raw)
  To: David S. Miller; +Cc: amylaar, egcs

  In message < 199807140425.VAA21575@dm.cobaltmicro.com >you write:
  >    ps.  Much of Dave's work mirrors stuff rth and myself have discussed
  >    as important to handle better on the sparc port.
  > 
  > BTW, during my work I noticed how much the PA port is derived from or
  > influenced by the Sparc stuff, perhaps you can easily steal some of my
  > tricks and techniques ;-)
When I first started hacking on the PA port in 1992 it was still
generating sparc floating point opcodes :-)

"Derived" is just a nice way of saying the port started as:

cp sparc.md pa.md
cp sparc.h pa.h
cp sparc.c pa.c

vi pa.md pa.c pa.h


I think for everything except the conditional moves and other nullified
instructions we'll probably be able to steal some of your code.  In
fact, splitting up the DF/DI mode moves has been on the PA todo list
for a long time now.  It wasn't worth doing earlier since we didn't
split after reload.  But that's changed recently :-)

jeff






* Re: some Sparc hackery in the works
  1998-07-13  4:36 some Sparc hackery in the works David S. Miller
  1998-07-13 12:58 ` Joern Rennecke
@ 1998-07-13 21:47 ` Jeffrey A Law
  1 sibling, 0 replies; 8+ messages in thread
From: Jeffrey A Law @ 1998-07-13 21:47 UTC (permalink / raw)
  To: David S. Miller; +Cc: egcs

  In message < 199807131107.EAA11110@dm.cobaltmicro.com >you write:
  > The work is very preliminary, but I've made much progress.  It won't
  > make it into the upcoming 1.1 release, but I will merge it in soon
  > afterwards.
Agreed.

  > So I figured that I'd put my latest patches up for FTP for those who
  > want to take a look and maybe contribute more improvements or just
  > plain regression test what I have so far.  Feel free to do either.
Another option is to make a branch for this work.  That way the latest
bits are always in CVS.  Folks that want to look at or help with the
code just check out the magic branch.

  > 1) Make all RTL ever generated by the Sparc back end have a
  >    one to one correspondence between RTL insns and real Sparc
  >    insns.
Yippie!

  >    For certain classes of operations the Sparc backend currently
  >    generates multiple sparc insns per RTL insn.  This is bad for
  >    two reasons:
  > 
  > 	a) Less accurate schedules are obtained
  > 	b) reorg can't fill as many delay slots
  > 
  >    The first major obstacle here was the move patterns.  I rewrote
  >    them completely; happily, half of sparc.c disappeared as things
  >    such as emit_move_sequence, output_move_double etc. were no longer
  >    needed.
Yup.  This is traditionally where this kind of work gets hard.

  > 3) Complete removal of any references to hard integer registers in the
  >    Sparc machine description.  The main incentive is that such
  >    hard-coded registers prevent CSE from occurring.
Awesome.  These hard registers also prevent certain combining, hoisting
out of loops, gcse, etc.  Explicitly mentioned hard registers are 
generally a lose.

jeff


* Re: some Sparc hackery in the works
  1998-07-13 21:47   ` Jeffrey A Law
@ 1998-07-13 23:48     ` David S. Miller
  1998-07-13 21:47       ` Jeffrey A Law
  0 siblings, 1 reply; 8+ messages in thread
From: David S. Miller @ 1998-07-13 23:48 UTC (permalink / raw)
  To: law; +Cc: amylaar, egcs

   Date: Mon, 13 Jul 1998 21:48:37 -0600
   From: Jeffrey A Law <law@hurl.cygnus.com>

     In message < 199807131854.TAA18918@phal.cygnus.co.uk >you write:
     > You usually get worse code if you try to open-code DImode operations as
     > RTL at rtl generation time.

   This depends on numerous factors, including but not limited to how many
   registers the target has and how important scheduling is for good performance
   on the target.

     > You can get the same benefit for scheduling and delay slot filling if you
     > provide define_splits for the patterns in question; you can then also
     > save the output template by replacing it with a '#', to indicate that
     > this pattern must be split.

   Not necessarily since you lose optimizations done by various other
   optimizers.  For example, the sub-parts of a DImode load become
   candidates for CSE, GCSE, combine, local alloc, etc.

   There's generally some tradeoff for each direction.  We should trust
   David to do the right thing for the sparc port.

There is a third issue.  If you open code it early, you risk ending up
with a multi-insn output in the end.  The reason is fundamentally
because some bits of information (odd/even DI mode register alignment,
and actual address alignment for MEM moves) pop up near the end of
code gen depending upon what register allocators do (and subsequently
what stack slots go to which items etc.).  There is also an issue with
argument passing semantics of DI mode objects on 32-bit targets.
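
To make the register and address alignment point concrete, here is a
small sketch in C of the decision that only becomes answerable after
register allocation.  It assumes the usual SPARC rule that ldd/std need
an even-numbered first register and an 8-byte aligned address.  The
function names are hypothetical and this is not the code from my patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical predicate: can a 64-bit register<->memory move be
   done with a single ldd/std?  */
static bool single_doubleword_move_ok(unsigned regno, uintptr_t addr)
{
    bool even_pair = (regno % 2) == 0;  /* first register of the pair is even */
    bool aligned   = (addr % 8) == 0;   /* address is 8-byte aligned          */
    return even_pair && aligned;
}

/* Number of machine insns the DImode move needs under this model:
   one doubleword access, or two word-sized accesses.  */
static int movdi_insn_count(unsigned regno, uintptr_t addr)
{
    return single_doubleword_move_ok(regno, addr) ? 1 : 2;
}
```

Neither input is known at RTL generation time, which is why deciding
this early tends to leave multi-insn output templates behind.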

Look at my code for the movdi sequences and splits very carefully.  I
tried many combinations and approaches, and this was the first which
reliably gave me perfect 1<-->1 insn output in the end.

Some of my early open coding attempts caused many situations where at
the end, in the define_insn's, I had to detect the bad register number
or address alignment cases and fix them up.  This was bad and defeated
the purpose of the goals I had in mind.

Thus I made everything a post-reload split... oh, always keep in mind
that all these move patterns are "special" anyways due to reload.

   ps.  Much of Dave's work mirrors stuff rth and myself have discussed
   as important to handle better on the sparc port.

BTW, during my work I noticed how much the PA port is derived from or
influenced by the Sparc stuff, perhaps you can easily steal some of my
tricks and techniques ;-)

Later,
David S. Miller
davem@dm.cobaltmicro.com

