public inbox for gcc-bugs@sourceware.org
* [Bug fortran/24519]  New: gfortran slow because of incomplete dependency checking
@ 2005-10-25 13:29 paul dot richard dot thomas at cea dot fr
  2005-10-25 13:47 ` [Bug fortran/24519] " pinskia at gcc dot gnu dot org
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: paul dot richard dot thomas at cea dot fr @ 2005-10-25 13:29 UTC
  To: gcc-bugs

This PR is concerned with gfortran's performance relative to commercial
compilers in respect of Polyhedron's TEST_FPU.F90.

Baseline (test_fpu.f90 out of the box):

gfortran 20050903 -fmax-stack-var-size=1000000 -O2

  Benchmark running, hopefully as only ACTIVE task
Test1 - Gauss 2000 (101x101) inverts 34.6 sec  Err= 0.000000000000002
Test2 - Crout 2000 (101x101) inverts  8.4 sec  Err= 0.000000000000001
Test3 - Crout  2 (1001x1001) inverts 11.0 sec  Err= 0.000000000000001
Test4 - Lapack 2 (1001x1001) inverts  8.1 sec  Err= 0.000000000000273
                             total = 62.1 sec

Digital DF6.0 using /fast

  Benchmark running, hopefully as only ACTIVE task
Test1 - Gauss 2000 (101x101) inverts  5.1 sec  Err= 0.000000000000003
Test2 - Crout 2000 (101x101) inverts  5.4 sec  Err= 0.000000000000012
Test3 - Crout  2 (1001x1001) inverts 10.6 sec  Err= 0.000000000000063
Test4 - Lapack 2 (1001x1001) inverts  7.5 sec  Err= 0.000000000000297
                             total = 28.6 sec

gfortran is doing OK except for Test1, where it is slower by a factor of
nearly 7 (34.6 sec against 5.1 sec).

The offending lines in the source are:

lines 115-120

   temp = b(:,k)
   DO j = 1, n
      c = b(k,j)*d
      b(:,j) = b(:,j)-temp*c
      b(k,j) = c
   END DO

Duplicating these lines in the source nearly doubles the execution time of
Test1.

Modifying the code to:

   zb = b                           ! A copy of b..
   temp = b(:,k)
   DO j = 1, n
      c = b(k,j)*d
      b(:,j) = zb(:,j)-temp*c       ! ..to be used here.
      b(k,j) = c
   END DO

Test1 - Gauss 2000 (101x101) inverts 12.0 sec  Err= 0.000000000000003
Test2 - Crout 2000 (101x101) inverts  8.4 sec  Err= 0.000000000000001
Test3 - Crout  2 (1001x1001) inverts 11.0 sec  Err= 0.000000000000001
Test4 - Lapack 2 (1001x1001) inverts  8.3 sec  Err= 0.000000000000273
                             total = 39.7 sec

This modification gains us nearly a factor of three on Test1 (34.6 sec down
to 12.0 sec) and looks much more respectable.

The reason is evident from a dump of the intermediate code.

The original loop (lines 115-120) is converted into (loosely):

   temp = b(:,k)
   DO j = 1, n
      c = b(k,j)*d
      allocate (atmp6(size(b,1)))
      atmp6(:) = b(:,j)-temp*c
      b(:,j) = atmp6(:)
      deallocate(atmp6)
      b(k,j) = c
   END DO

If I reproduce this by hand with a temporary that is allocated once, outside
the loop, the time for Test1 drops to 11.6s.  It seems to me that the repeated
calls to _gfc_internal_malloc and _gfc_internal_free are responsible for about
23s of the execution time in the original version of the test. This is
confirmed by writing the allocate and deallocate explicitly in the source, as
in the dump above, whereupon the execution time for Test1 goes up to 35.7s.
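
For reference, the hand-written variant with a single allocation can be
sketched roughly as below. This is only an illustration under assumptions: the
name atmp, the small matrix size, the fill values and the definition of d are
mine, not taken from test_fpu.f90. The program checks that the hoisted
temporary gives the same result as the plain array-syntax loop.

```fortran
program hoisted_temporary
   implicit none
   integer, parameter :: dp = kind(1.0d0)
   integer, parameter :: n = 5            ! small illustrative size (assumption)
   real(dp) :: b(n,n), b_ref(n,n), temp(n), c, d
   real(dp), allocatable :: atmp(:)       ! hypothetical name, modelled on atmp6
   integer :: i, j, k

   ! Deterministic fill; any matrix with b(k,k) /= 0 would do.
   do j = 1, n
      do i = 1, n
         b(i,j) = 1.0_dp/real(i+j, dp)
      end do
   end do
   b_ref = b
   k = 1
   d = 1.0_dp/b(k,k)                      ! assumed definition of d

   ! Reference: the array-syntax loop as written in the benchmark.
   temp = b_ref(:,k)
   do j = 1, n
      c = b_ref(k,j)*d
      b_ref(:,j) = b_ref(:,j) - temp*c
      b_ref(k,j) = c
   end do

   ! Same update with the temporary allocated once, outside the loop.
   allocate (atmp(size(b,1)))
   temp = b(:,k)
   do j = 1, n
      c = b(k,j)*d
      atmp(:) = b(:,j) - temp*c
      b(:,j) = atmp(:)
      b(k,j) = c
   end do
   deallocate (atmp)

   if (maxval(abs(b - b_ref)) > 1.0e-14_dp) error stop "results differ"
   print *, "hoisted temporary matches the array-syntax loop"
end program hoisted_temporary
```

The only difference from the dumped form is that the allocate/deallocate pair
is executed once rather than n times per call.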

Finally, the best performance of all is obtained if the vector expression is
written as an explicit F77-style loop:

   temp = b(:,k)
   DO j = 1, n
      c = b(k,j)*d
      do p = 1, n
        b(p,j) = b(p,j)-temp(p)*c
      end do
      b(k,j) = c
   END DO

whereupon gfortran turns in a very healthy 6.7s.
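
A self-contained way to compare the two forms outside the benchmark might look
like the sketch below. It is not the Polyhedron code: the test matrix, the
repeat count nrep, the subroutine names and the definition of d as 1/b(k,k)
are all assumptions made for illustration. It applies the update for every k
in turn, checks that both forms agree, and reports CPU times; the timings will
of course depend on compiler version and flags.

```fortran
program compare_update_forms
   implicit none
   integer, parameter :: dp = kind(1.0d0)
   integer, parameter :: n = 101       ! matrix order, as in Test1
   integer, parameter :: nrep = 20     ! arbitrary repeat count (assumption)
   real(dp) :: b0(n,n), b1(n,n), b2(n,n), t0, t1, t2
   integer :: i, j, k, rep

   ! Diagonally dominant matrix, so that every pivot b(k,k) is non-zero.
   do j = 1, n
      do i = 1, n
         b0(i,j) = 1.0_dp/real(i+j, dp)
      end do
      b0(j,j) = b0(j,j) + real(n, dp)
   end do

   call cpu_time(t0)
   do rep = 1, nrep
      b1 = b0
      do k = 1, n
         call update_array_syntax(b1, k)
      end do
   end do
   call cpu_time(t1)
   do rep = 1, nrep
      b2 = b0
      do k = 1, n
         call update_explicit_loop(b2, k)
      end do
   end do
   call cpu_time(t2)

   if (maxval(abs(b1 - b2)) > 1.0e-10_dp) error stop "the two forms disagree"
   print '(a,f8.3,a)', "array syntax : ", t1 - t0, " s"
   print '(a,f8.3,a)', "explicit loop: ", t2 - t1, " s"

contains

   ! Vector form: the compiler may introduce a temporary for b(:,j).
   subroutine update_array_syntax(b, k)
      real(dp), intent(inout) :: b(:,:)
      integer, intent(in) :: k
      real(dp) :: temp(size(b,1)), c, d
      integer :: j
      d = 1.0_dp/b(k,k)                ! assumed definition of d
      temp = b(:,k)
      do j = 1, size(b,2)
         c = b(k,j)*d
         b(:,j) = b(:,j) - temp*c
         b(k,j) = c
      end do
   end subroutine update_array_syntax

   ! F77-like form: element-by-element update, no temporary required.
   subroutine update_explicit_loop(b, k)
      real(dp), intent(inout) :: b(:,:)
      integer, intent(in) :: k
      real(dp) :: temp(size(b,1)), c, d
      integer :: j, p
      d = 1.0_dp/b(k,k)                ! assumed definition of d
      temp = b(:,k)
      do j = 1, size(b,2)
         c = b(k,j)*d
         do p = 1, size(b,1)
            b(p,j) = b(p,j) - temp(p)*c
         end do
         b(k,j) = c
      end do
   end subroutine update_explicit_loop
end program compare_update_forms
```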

My conclusion from all this is that there is room for optimizing the
scalarizer (perhaps by having gfc_conv_resolve_dependencies recognise that
this is an element-by-element replacement?).
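
To illustrate what "element by element" means here, the small sketch below
contrasts two assignments. It is my own example, not code from the benchmark
or from the compiler, and all names in it are hypothetical. In the first
case the same section appears on both sides, so element i of the result
depends only on element i of the old value and an in-place loop is safe. In
the second the sections are shifted against each other, and a naive in-place
loop gives a different answer from the array assignment, which is why a
temporary (or a reordered loop) is needed there.

```fortran
program dependency_demo
   implicit none
   integer, parameter :: n = 5
   real :: a(n), a_loop(n), t(n), s(n), s_loop(n), c
   integer :: i

   ! Case 1: identical sections on both sides, as in b(:,j) = b(:,j)-temp*c.
   a = [1.0, 2.0, 3.0, 4.0, 5.0]
   t = [1.0, 1.0, 2.0, 2.0, 3.0]
   c = 0.5
   a_loop = a
   a(:) = a(:) - t*c                 ! array assignment
   do i = 1, n                       ! in-place loop, no temporary
      a_loop(i) = a_loop(i) - t(i)*c
   end do
   if (any(a /= a_loop)) error stop "case 1 should not need a temporary"

   ! Case 2: shifted sections. The right-hand side must be evaluated
   ! in full before any element is stored.
   s = [1.0, 2.0, 3.0, 4.0, 5.0]
   s_loop = s
   s(2:n) = s(1:n-1)                 ! gives 1 1 2 3 4
   do i = 2, n                       ! naive forward loop gives 1 1 1 1 1
      s_loop(i) = s_loop(i-1)
   end do
   if (all(s == s_loop)) error stop "case 2 was expected to differ"

   print *, "case 1: in-place loop matches array assignment"
   print *, "case 2: array assignment =", s
   print *, "case 2: naive loop       =", s_loop
end program dependency_demo
```

The assignment in Test1 falls into the first category, so in principle the
dependency check could accept it without generating a temporary.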


-- 
           Summary: gfortran slow because of incomplete dependency checking
           Product: gcc
           Version: 4.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: fortran
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: paul dot richard dot thomas at cea dot fr


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24519



Thread overview: 10+ messages
2005-10-25 13:29 [Bug fortran/24519] New: gfortran slow because of incomplete dependency checking paul dot richard dot thomas at cea dot fr
2005-10-25 13:47 ` [Bug fortran/24519] " pinskia at gcc dot gnu dot org
2005-10-25 13:55 ` pinskia at gcc dot gnu dot org
2005-10-25 18:14 ` pinskia at gcc dot gnu dot org
2006-02-18 17:52 ` pault at gcc dot gnu dot org
2006-02-24 10:53 ` pault at gcc dot gnu dot org
2006-02-24 11:28 ` pault at gcc dot gnu dot org
2006-02-27  3:46 ` pinskia at gcc dot gnu dot org
2006-03-07  0:06 ` pault at gcc dot gnu dot org
2006-03-07  2:26 ` pinskia at gcc dot gnu dot org
