public inbox for gcc@gcc.gnu.org
* Large slowdown with gfortran vs f77 (x7)
@ 2009-09-04  8:45 Jeremy Sanders
  2009-09-08 19:28 ` James Cloos
From: Jeremy Sanders @ 2009-09-04  8:45 UTC
  To: gcc

Hi - We noticed some code was very slow in gfortran vs g77, almost 7 times
slower. This appears to be because gfortran uses the expf library function
call for the Fortran exp() function on single-precision numbers, but g77
uses the exp library function (i.e. the double-precision function).

You can see the very large speed differences by telling gfortran to always 
use double precision numbers.

This can be replicated with this simple program:

test.f:
      implicit none
      integer i,j
      real a, u
c     u is an accumulator, so it must be initialised before the loops
      u = 0.0
      do j=1,10
         do i=1,10000000
            a=exp(1.0+1.0/i)
            u = u+a
         end do
      end do
      print *,u
      end

Compiled with gfortran -O2:
real    0m29.921s
user    0m29.912s
sys     0m0.000s

Compiled with gfortran -O2 -fdefault-real-8:
real    0m4.306s
user    0m4.304s
sys     0m0.000s

This is with a newly built gcc 4.4.1 on Fedora 10 (glibc 2.9), x86-64.

As the difference comes down to the speed difference between exp and expf, 
is it a gcc issue or a glibc issue? Should gcc be using this slow function? 
It seems amazing that the same function call is 7 times slower for lower 
precision numbers!

Thanks

Jeremy

-- 
Jeremy Sanders <jss@ast.cam.ac.uk>   http://www-xray.ast.cam.ac.uk/~jss/
X-Ray Group, Institute of Astronomy, University of Cambridge, UK.
Public Key Server PGP Key ID: E1AAE053

* Re: Large slowdown with gfortran vs f77 (x7)
  2009-09-04  8:45 Large slowdown with gfortran vs f77 (x7) Jeremy Sanders
@ 2009-09-08 19:28 ` James Cloos
From: James Cloos @ 2009-09-08 19:28 UTC
  To: Jeremy Sanders; +Cc: gcc

>>>>> "Jeremy" == Jeremy Sanders <jss@ast.cam.ac.uk> writes:

Jeremy> You can see the very large speed differences by telling gfortran
Jeremy> to always use double precision numbers.

Jeremy> This can be replicated with this simple program:

Jeremy> Compiled with gfortran -O2:
Jeremy> real    0m29.921s
Jeremy> user    0m29.912s
Jeremy> sys     0m0.000s

Jeremy> Compiled with gfortran -O2 -fdefault-real-8:
Jeremy> real    0m4.306s
Jeremy> user    0m4.304s
Jeremy> sys     0m0.000s

Jeremy> This is with a newly built gcc 4.4.1 on Fedora 10 (glibc 2.9), x86-64.

I tried it on my 1 GHz PIII laptop running Gentoo with gcc 4.1.1 and
glibc 2.10.1.

I added -march=pentium3 to the gcc cmd line; that probably made little
difference.  (glibc was also compiled with -march=pentium3 -O2).

I got nearly identical user times for the two compiles; user time was
always within 0.03 of 23.40 over multiple runs.

Incidentally, while the real-8 compile output 271828665.96115601, the
real-4 compile output 67108976.

This does, then, seem to be an x86-64 issue.

-JimC
-- 
James Cloos <cloos@jhcloos.com>         OpenPGP: 1024D/ED7DAEA6

* Re: Large slowdown with gfortran vs f77 (x7)
  2009-09-04 15:04 FX
@ 2009-09-04 15:48 ` Tobias Burnus
From: Tobias Burnus @ 2009-09-04 15:48 UTC
  To: FX; +Cc: gcc, Fortran List, jss

On 09/04/2009 05:04 PM, FX wrote:
>   -- it's unarguably a glibc issue: if exp() is fast and expf() is
> slow, why doesn't glibc implement expf() by calling exp()? (yes, there
> can be other issues like rounding or so, but they can also be dealt
> with separately)

If I recall correctly, it is mostly an x86-64 problem. AMD has some math
patches for GLIBC which speed things up a lot. I think those are used in
openSUSE/SLES but not in Fedora. On the other hand, the AMD patches have
a problem with signaling NaN, which is being fixed [1,4].

Some older timings (from PR 34128) on openSUSE (!)  -- for "sin" but
there is the same problem as for exp:

              g77     gfortran
-m32  real(4) 0.408s  0.421s
-m64  real(4) 1.040s  0.589s ! sinf on x86-64: 40% faster!
-m32  real(8) 0.411s  0.408s
-m64  real(8) 0.976s  0.968s ! sin on x86-64


As this is a math-library problem, one cannot do much from the
GCC/gfortran side. You could consider using the AMD Math Core Library
[2] which implements fast versions of the trigonometric functions and
exp [3]. Those functions are not fully IEEE compliant but it might not
be needed in your case [3,4]. (See AMCL manual [3] for the details.)
Intel's MKL should have something similar if you are on Intel hardware
and have by chance the library.

Switching to SUSE or applying the patches oneself is another
possibility. (I do not know why the patches are not included in the
upstream version of glibc. There must be some (somewhat) well-founded
reason.)

> -- a similar bug was already reported a year and a half ago, and no
> activity was recorded on that front 
> (http://sources.redhat.com/bugzilla/show_bug.cgi?id=5997);

Well, it is assigned to someone @suse and as written it is not an issue
on openSUSE. It might be also related to the AMD patches and the reason
why they are not included in GLIBC.

Tobias
(who uses openSUSE [11.1/Factory] at home, Fedora [version 6 (!)]  at work)


[1] For sNaN, see http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39314 and
https://bugzilla.novell.com/show_bug.cgi?id=487576
[2] ACML (free as in free beer): http://www.amd.com/acml
[3]
http://developer.amd.com/cpu/Libraries/acml/onlinehelp/Documents/Simple.html
; the functions are prefixed by "fast" but if you include the library
before the math library ("-lm") the fast version is used instead of the
libm version; "-lm" is automatically appended (internally) at the end of
the command line when using "gfortran" thus simply adding "-lacml_mv"
(or was it -lacml ?) to the command line should be sufficient.
[4] Note, the GLIBC patches of AMD are supposed to be fully IEEE
compliant while the fastexp etc. of ACML are not (esp. regarding
denormal numbers and signaling NaN).

* Re: Large slowdown with gfortran vs f77 (x7)
@ 2009-09-04 15:04 FX
  2009-09-04 15:48 ` Tobias Burnus
From: FX @ 2009-09-04 15:04 UTC
  To: gcc, Fortran List; +Cc: jss

Hi Jeremy,

   -- it's unarguably a glibc issue: if exp() is fast and expf() is  
slow, why doesn't glibc implement expf() by calling exp()? (yes, there  
can be other issues like rounding or so, but they can also be dealt  
with separately)
   -- a similar bug was already reported a year and a half ago, and no
activity was recorded on that front
(http://sources.redhat.com/bugzilla/show_bug.cgi?id=5997); overall, the
math lib from glibc can be buggy and slow (and its development is not
exactly proceeding at a steady pace) but political reasons prevent GCC
from including its own math lib
   -- there is a GNU Fortran mailing-list where Fortran-related issues  
are welcome

Regards,
FX
