public inbox for gcc@gcc.gnu.org
* Register renaming works - but sched2 doesn't profit from it ?!?!
From: Toon Moene @ 2000-06-11  5:37 UTC
  To: gcc

L.S.,

A long, long time ago (Dec. 7, 1997) I posted to this list a series of
optimisation opportunities not realised by the (then egcs) compiler, of
particular importance to Fortran programs.

One of them was the rescheduling of instructions after register renaming
in (unrolled) loops.  Now that Stan Cox has recently added a register
renaming pass, I expected this to be taken care of.

Take, for instance, the following code:

      subroutine sum(a, b, c, n)
      integer i, n
      real a(n), b(n), c(n)
      do i = 1, n
         c(i) = a(i) + b(i)
      enddo
      end

The current CVS'd compiler generates, for the unrolled loop, on
alphaev6-unknown-linux-gnu (-O2 -funroll-loops -fno-rerun-loop-opt):

$L6:
        lds $f10,0($17)
        lds $f11,0($16)
        subl $4,3,$1
        lda $4,-4($4)
        addl $1,$31,$2
        adds $f11,$f10,$f11
        sts $f11,0($18)
        lds $f11,4($17)
        lds $f10,4($16)
        adds $f10,$f11,$f10
        sts $f10,4($18)
        lds $f10,8($17)
        lds $f11,8($16)
        adds $f11,$f10,$f11
        sts $f11,8($18)
        lds $f10,12($16)
        lda $16,16($16)
        lds $f11,12($17)
        lda $17,16($17)
        adds $f10,$f11,$f10
        sts $f10,12($18)
        lda $18,16($18)
        bge $2,$L6

and with -frename-registers:

$L6:
        lds $f23,0($16)
        lds $f12,0($17)
        subl $4,3,$1
        lda $4,-4($4)
        addl $1,$31,$2
        adds $f23,$f12,$f24
        sts $f24,0($18)
        lds $f13,4($16)
        lds $f25,4($17)
        adds $f13,$f25,$f14
        sts $f14,4($18)
        lds $f26,8($16)
        lds $f15,8($17)
        adds $f26,$f15,$f27
        sts $f27,8($18)
        lds $f22,12($16)
        lda $16,16($16)
        lds $f11,12($17)
        lda $17,16($17)
        adds $f22,$f11,$f10
        sts $f10,12($18)
        lda $18,16($18)
        bge $2,$L6

Note how all the floating point registers are renamed (they are
temporaries within the loop anyway) - thereby breaking all the
dependency chains between the four unrolled copies of the loop body -
but the sequence of instructions hasn't changed!
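
One way to check this claim is to count the register dependencies in
both listings.  The following Python sketch is not part of GCC; it is a
minimal illustration that looks only at the floating point instructions
copied from the two listings above and classifies each dependency as
true (RAW) or false (WAR, WAW).  The names operands and count_deps are
made up for this example, and memory dependencies between the stores
and later loads are not modelled.

```python
import re
from collections import defaultdict

# Floating point instructions of the loop body, copied from the two
# listings above (the integer bookkeeping instructions are left out).
BEFORE = """
lds $f10,0($17)
lds $f11,0($16)
adds $f11,$f10,$f11
sts $f11,0($18)
lds $f11,4($17)
lds $f10,4($16)
adds $f10,$f11,$f10
sts $f10,4($18)
lds $f10,8($17)
lds $f11,8($16)
adds $f11,$f10,$f11
sts $f11,8($18)
lds $f10,12($16)
lds $f11,12($17)
adds $f10,$f11,$f10
sts $f10,12($18)
"""

AFTER = """
lds $f23,0($16)
lds $f12,0($17)
adds $f23,$f12,$f24
sts $f24,0($18)
lds $f13,4($16)
lds $f25,4($17)
adds $f13,$f25,$f14
sts $f14,4($18)
lds $f26,8($16)
lds $f15,8($17)
adds $f26,$f15,$f27
sts $f27,8($18)
lds $f22,12($16)
lds $f11,12($17)
adds $f22,$f11,$f10
sts $f10,12($18)
"""

FP_REG = re.compile(r"\$f\d+")


def operands(line):
    """Return (source registers, destination register) for one instruction."""
    op = line.split()[0]
    regs = FP_REG.findall(line)
    if op == "lds":               # load: the FP register is written
        return [], regs[0]
    if op == "sts":               # store: the FP register is read
        return regs, None
    return regs[:-1], regs[-1]    # adds: the last operand is the destination


def count_deps(listing):
    """Count RAW, WAR and WAW dependencies on FP registers in program order."""
    last_writer = {}              # registers that have been written so far
    readers = defaultdict(set)    # reads of each register since its last write
    counts = {"RAW": 0, "WAR": 0, "WAW": 0}
    for idx, line in enumerate(listing.strip().splitlines()):
        srcs, dest = operands(line)
        for reg in srcs:
            if reg in last_writer:
                counts["RAW"] += 1            # reads an earlier result
        if dest is not None:
            if dest in last_writer:
                counts["WAW"] += 1            # overwrites an earlier result
            counts["WAR"] += len(readers[dest])  # overwrites a value still read
        for reg in srcs:
            readers[reg].add(idx)
        if dest is not None:
            last_writer[dest] = idx
            readers[dest] = set()
    return counts


print("without renaming:", count_deps(BEFORE))
print("with renaming:   ", count_deps(AFTER))
```

For the original code this counts 6 WAR and 10 WAW dependencies on
$f10/$f11, next to 12 true dependencies.  After renaming only the 12
true dependencies remain (two loads feeding each adds, each adds
feeding its sts), so no register dependency ties one unrolled copy to
the next.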

The following is close to optimal (derived by hand):

$L6:
        lds $f23,0($16)
        lds $f12,0($17)
        lds $f13,4($16)
        lds $f25,4($17)
        lds $f26,8($16)
        lds $f15,8($17)
        lds $f22,12($16)
        lds $f11,12($17)
        subl $4,3,$1
        lda $4,-4($4)
        addl $1,$31,$2
        adds $f23,$f12,$f24
        adds $f13,$f25,$f14
        adds $f26,$f15,$f27
        adds $f22,$f11,$f10
        lda $16,16($16)
        lda $17,16($17)
        sts $f24,0($18)
        sts $f14,4($18)
        sts $f27,8($18)
        sts $f10,12($18)
        lda $18,16($18)
        bge $2,$L6

Here all the loads are moved to the top of the loop, and all the
stores to the bottom.  This is approximately 12% faster on my 466 MHz
21264.  On a statically scheduled machine (e.g. the 21164(A)) the
difference should be dramatic.
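
To get a rough feel for the in-order case, here is a toy model in
Python: a single-issue machine that stalls until all source registers
of the next instruction are ready.  The latencies (3 cycles for lds, 4
for adds, 1 for everything else) are assumptions chosen for
illustration, not measured 21164 figures, and the model ignores
multiple issue, memory dependencies and the loop back-edge.  The
function name cycles is made up for this sketch.

```python
import re

# Loop body as generated with -frename-registers (copied from above).
RENAMED = """
lds $f23,0($16)
lds $f12,0($17)
subl $4,3,$1
lda $4,-4($4)
addl $1,$31,$2
adds $f23,$f12,$f24
sts $f24,0($18)
lds $f13,4($16)
lds $f25,4($17)
adds $f13,$f25,$f14
sts $f14,4($18)
lds $f26,8($16)
lds $f15,8($17)
adds $f26,$f15,$f27
sts $f27,8($18)
lds $f22,12($16)
lda $16,16($16)
lds $f11,12($17)
lda $17,16($17)
adds $f22,$f11,$f10
sts $f10,12($18)
lda $18,16($18)
bge $2,$L6
"""

# Hand-scheduled loop body (copied from above).
HAND = """
lds $f23,0($16)
lds $f12,0($17)
lds $f13,4($16)
lds $f25,4($17)
lds $f26,8($16)
lds $f15,8($17)
lds $f22,12($16)
lds $f11,12($17)
subl $4,3,$1
lda $4,-4($4)
addl $1,$31,$2
adds $f23,$f12,$f24
adds $f13,$f25,$f14
adds $f26,$f15,$f27
adds $f22,$f11,$f10
lda $16,16($16)
lda $17,16($17)
sts $f24,0($18)
sts $f14,4($18)
sts $f27,8($18)
sts $f10,12($18)
lda $18,16($18)
bge $2,$L6
"""

REG = re.compile(r"\$f?\d+")        # matches $4 and $f23, but not the label $L6
LATENCY = {"lds": 3, "adds": 4}     # ASSUMED latencies; all other ops take 1 cycle


def cycles(listing):
    """Cycles for one pass through the body on a single-issue in-order model."""
    ready = {}    # register -> cycle at which its value becomes available
    clock = 0     # earliest cycle at which the next instruction may issue
    for line in listing.strip().splitlines():
        op = line.split()[0]
        regs = REG.findall(line)
        if op in ("sts", "bge"):        # only read registers
            srcs, dest = regs, None
        elif op in ("lds", "lda"):      # first operand is the destination
            srcs, dest = regs[1:], regs[0]
        else:                           # adds, subl, addl: last operand is written
            srcs, dest = regs[:-1], regs[-1]
        # Stall until every source register is ready.
        issue = max([clock] + [ready.get(r, 0) for r in srcs])
        if dest is not None:
            ready[dest] = issue + LATENCY.get(op, 1)
        clock = issue + 1
    return clock


print("renamed, unscheduled:", cycles(RENAMED))
print("hand-scheduled:      ", cycles(HAND))
```

Under these assumptions the renamed but unscheduled body takes 40
cycles and the hand-scheduled one 23: in the first, every sts waits for
the adds just before it and every adds for the loads just before it,
while in the second no instruction has to wait at all.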

Premium question:  Why doesn't sched2 reschedule the instructions, in
spite of the completely different dependencies?

-- 
Toon Moene - mailto:toon@moene.indiv.nluug.nl - phoneto: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
GNU Fortran 77: http://gcc.gnu.org/onlinedocs/g77_news.html
GNU Fortran 95: http://g95.sourceforge.net/ (under construction)
