public inbox for gcc@gcc.gnu.org
* GCC Summit 2010 topic (potentially).
@ 2009-06-04 17:33 Toon Moene
  2009-06-04 20:53 ` Richard Guenther
From: Toon Moene @ 2009-06-04 17:33 UTC
  To: gcc mailing list

L.S.,

This year I'm unable to attend the GCC Summit (both due to time and 
money constraints).

In 2008, I considered giving a talk on the effect of link time optimization 
on typical Fortran programs -

That is, until my attention got hijacked by the geo-politically more 
pressing question of Coarrays in Fortran.

However, the issue still stands.  So I'm thinking ahead to next year 
(assuming LTO will work by that time for most front-end languages):

What will LTO bring for Fortran?

Here's a run-of-the-mill example from our code:

       SUBROUTINE VERINT (
      I   KLON   , KLAT   , KLEV   , KINT  , KHALO
      I , KLON1  , KLON2  , KLAT1  , KLAT2
      I , KP     , KQ     , KR
      R , PARG   , PRES
      R , PALFH  , PBETH
      R , PALFA  , PBETA  , PGAMA   )
...
       DO JY = KLAT1,KLAT2
       DO JX = KLON1,KLON2
          IDX  = KP(JX,JY)
          IDY  = KQ(JX,JY)
          ILEV = KR(JX,JY)
C
          PRES(JX,JY) = PGAMA(JX,JY,1)*(
C
      +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV-1)
      +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY-1,ILEV-1) )
      + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY  ,ILEV-1)
      +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY  ,ILEV-1) ) )
C    +
      +               + PGAMA(JX,JY,2)*(
C    +
      +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV  )
      +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY-1,ILEV  ) )
      + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY  ,ILEV  )
      +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY  ,ILEV  ) ) )
       ENDDO
       ENDDO
...
       RETURN
       END

There are several things a link time optimization pass could determine:

1. Whether or not the arrays PALFA, PARG, ... are suitably aligned for
    vectorization (forgoing a run time check for that).

2. Whether KLON{1,2}, KLAT{1,2} are actually invariant throughout an
    invocation of the executable (as they are in our case)
    (CSE of vectorization criteria).

However, with a little bit of extra effort (instrumentation outside the 
program), the following can be determined:

3. KLON{1,2}, KLAT{1,2} are in fact known constants, which only happen
    to be variables because the executable is built to accommodate
    arbitrary grid sizes.

Would it help to provide GCC with knowledge about KLON, KLAT (and 
thereby, KLON{1,2}, KLAT{1,2})?

Note that this question is less academic than it seems.  We often run on 
the same grid for years without changing an executable, so this 
optimization makes sense.
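
To make the question concrete, here is a minimal sketch (free-form Fortran for brevity, not our actual code) of the two situations: the loop bounds arrive as dummy arguments, versus being written as compile-time constants. The names GRID_DEMO, SUM_VAR and SUM_FIX and the 8 x 4 grid are made up for illustration, and the loop body is reduced to a plain sum.

```fortran
! Hypothetical illustration: GRID_DEMO, SUM_VAR, SUM_FIX and the
! 8 x 4 grid are assumptions, not part of the original code.
module grid_demo
  implicit none
  integer, parameter :: nlon = 8, nlat = 4   ! assumed fixed grid
contains
  ! Bounds are dummy arguments: their values are unknown when this
  ! routine is compiled on its own.
  subroutine sum_var(klon, klat, klon1, klon2, klat1, klat2, parg, pres)
    integer, intent(in) :: klon, klat, klon1, klon2, klat1, klat2
    real, intent(in)    :: parg(klon, klat)
    real, intent(out)   :: pres
    integer :: jx, jy
    pres = 0.0
    do jy = klat1, klat2
      do jx = klon1, klon2
        pres = pres + parg(jx, jy)
      end do
    end do
  end subroutine sum_var

  ! Same loop nest with the bounds written as constants, so the trip
  ! counts are visible at compile time.
  subroutine sum_fix(parg, pres)
    real, intent(in)  :: parg(nlon, nlat)
    real, intent(out) :: pres
    integer :: jx, jy
    pres = 0.0
    do jy = 1, nlat
      do jx = 1, nlon
        pres = pres + parg(jx, jy)
      end do
    end do
  end subroutine sum_fix
end module grid_demo

program demo
  use grid_demo
  implicit none
  real :: a(nlon, nlat), r1, r2
  a = 1.0
  call sum_var(nlon, nlat, 1, nlon, 1, nlat, a, r1)
  call sum_fix(a, r2)
  ! Both variants compute the same sum; only what the compiler can
  ! see about the bounds differs.
  if (r1 /= r2) error stop 'results differ'
  print *, r1, r2
end program demo
```

The question is whether the compiler could get from the first form to something like the second by itself, given the extra knowledge, without us maintaining a specialized copy of the source.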

Kind regards,

-- 
Toon Moene - e-mail: toon@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/
Progress of GNU Fortran: http://gcc.gnu.org/gcc-4.4/changes.html


* Re: GCC Summit 2010 topic (potentially).
  2009-06-04 17:33 GCC Summit 2010 topic (potentially) Toon Moene
@ 2009-06-04 20:53 ` Richard Guenther
From: Richard Guenther @ 2009-06-04 20:53 UTC
  To: Toon Moene; +Cc: gcc mailing list

On Thu, Jun 4, 2009 at 7:33 PM, Toon Moene <toon@moene.org> wrote:
> L.S.,
>
> This year I'm unable to attend the GCC Summit (both due to time and money
> constraints).
>
> In 2008, I considered giving a talk on the effect of link time optimization on
> typical Fortran programs -
>
> That is, until my attention got hijacked by the geo-politically more
> pressing question of Coarrays in Fortran.
>
> However, the issue still stands.  So I'm thinking ahead to next year
> (assuming LTO will work by that time for most front-end languages):
>
> What will LTO bring for Fortran?
>
> Here's a run-of-the-mill example from our code:
>
>      SUBROUTINE VERINT (
>     I   KLON   , KLAT   , KLEV   , KINT  , KHALO
>     I , KLON1  , KLON2  , KLAT1  , KLAT2
>     I , KP     , KQ     , KR
>     R , PARG   , PRES
>     R , PALFH  , PBETH
>     R , PALFA  , PBETA  , PGAMA   )
> ...
>      DO JY = KLAT1,KLAT2
>      DO JX = KLON1,KLON2
>         IDX  = KP(JX,JY)
>         IDY  = KQ(JX,JY)
>         ILEV = KR(JX,JY)
> C
>         PRES(JX,JY) = PGAMA(JX,JY,1)*(
> C
>     +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV-1)
>     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY-1,ILEV-1) )
>     + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY  ,ILEV-1)
>     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY  ,ILEV-1) ) )
> C    +
>     +               + PGAMA(JX,JY,2)*(
> C    +
>     +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV  )
>     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY-1,ILEV  ) )
>     + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY  ,ILEV  )
>     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY  ,ILEV  ) ) )
>      ENDDO
>      ENDDO
> ...
>      RETURN
>      END
>
> There are several things a link time optimization pass could determine:
>
> 1. Whether or not the arrays PALFA, PARG, ... are suitably aligned for
>   vectorization (forgoing a run time check for that).
>
> 2. Whether KLON{1,2}, KLAT{1,2} are actually invariant throughout an
>   invocation of the executable (as they are in our case)
>   (CSE of vectorization criteria).
>
> However, with a little bit of extra effort (instrumentation outside the
> program), the following can be determined:
>
> 3. KLON{1,2}, KLAT{1,2} are in fact known constants, which only happen
>   to be variables because the executable is built to accommodate
>   arbitrary grid sizes.
>
> Would it help to provide GCC with knowledge about KLON, KLAT (and thereby,
> KLON{1,2}, KLAT{1,2})?
>
> Note that this question is less academic than it seems.  We often run on the
> same grid for years without changing an executable, so this optimization
> makes sense.

IPA-CP would be the candidate to propagate that information.  But if
they are both sufficiently large, the performance improvement from knowing
them is likely minimal.
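
As a rough illustration of the pattern IPA-CP looks for (a made-up example in free-form Fortran, not taken from the code above): a routine whose size argument is an ordinary dummy, but where every call site passes the same constant. The names KERNELS, AXPY_N and the size 1000 are assumptions for this sketch only.

```fortran
! Hypothetical example: KERNELS, AXPY_N and the size 1000 are
! assumptions made for illustration only.
module kernels
  implicit none
contains
  ! N is an ordinary dummy argument here; nothing in this routine
  ! says it is constant.
  subroutine axpy_n(n, a, x, y)
    integer, intent(in) :: n
    real, intent(in)    :: a, x(n)
    real, intent(inout) :: y(n)
    integer :: i
    do i = 1, n
      y(i) = y(i) + a * x(i)
    end do
  end subroutine axpy_n
end module kernels

program caller
  use kernels
  implicit none
  integer, parameter :: npts = 1000
  real :: x(npts), y(npts)
  x = 1.0
  y = 2.0
  ! Every call site passes the same constant for N, which is the
  ! situation interprocedural constant propagation can exploit.
  call axpy_n(npts, 0.5, x, y)
  call axpy_n(npts, 0.5, x, y)
  ! 2.0 + 0.5 + 0.5 = 3.0 exactly in single precision.
  if (any(y /= 3.0)) error stop 'unexpected result'
  print *, y(1), y(npts)
end program caller
```

Compiling with something like `gfortran -O2 -fdump-ipa-cp` should produce a dump showing what the pass decided. Whether the constant is actually propagated, or the routine is simply inlined instead, depends on the GCC version and its heuristics.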

Richard.

> Kind regards,
>
> --
> Toon Moene - e-mail: toon@moene.org - phone: +31 346 214290
> Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
> At home: http://moene.org/~toon/
> Progress of GNU Fortran: http://gcc.gnu.org/gcc-4.4/changes.html
>

