public inbox for gcc-prs@sourceware.org
* Re: optimization/4487: -ffast-math fails to disable gradual underflow on Ultrasparc
@ 2001-10-05 16:26 Tim Prince
From: Tim Prince @ 2001-10-05 16:26 UTC
  To: nobody; +Cc: gcc-prs

The following reply was made to PR optimization/4487; it has been noted by GNATS.

From: "Tim Prince" <tprince@computer.org>
To: "Peter van Hoof" <vanhoof@cita.utoronto.ca>, <gcc-gnats@gcc.gnu.org>
Cc:  
Subject: Re: optimization/4487: -ffast-math fails to disable gradual underflow on Ultrasparc
Date: Fri, 5 Oct 2001 16:23:43 -0700

 > Ultrasparc chips do not support gradual underflow in hardware,
 > and therefore these instructions need to be emulated in software.
 > Since -ffast-math allows deviations from the IEEE-754 standard
 > for the sake of increasing performance, it is my opinion that
 > -ffast-math should flush denormalized numbers to zero (or at least
 > there should be some option for enabling this; to the best of my
 > knowledge no such flag exists for Sparc hardware). Needless to
 > say that software emulation can lead to substantial performance
 > degradation for certain programs. My machine has a 500MHz
 > Ultrasparc IIe processor, but I think the problem is the same
 > for all v9 hardware.
 Current P4 chips process gradual underflows with a relatively slow
 firmware code sequence stored on-board in ROM. Apparently, AMD chips
 have a similar but less severe problem. I think the attitude with gcc
 so far has been "let the run-time library handle it", i.e. surely
 there is a Sun run-time call which sets abrupt underflow, and which
 could be emulated on other targets. Handling this while keeping gcc
 separate from the run-time is a little difficult, but it could likely
 be done by adding a function to libgcc2 which would be invoked when
 main() is built with -ffast-math (or some more specific flag) and the
 architecture is one of those for which it is wanted (e.g. -msse2 in
 gcc-3.1).
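
 A user-level sketch of the idea above, not the proposed libgcc2
 change: a program can switch on abrupt underflow itself before doing
 any floating-point work. This assumes an x86 target with SSE2, where
 the _MM_SET_FLUSH_ZERO_MODE macro from <xmmintrin.h> sets the
 flush-to-zero bit in the MXCSR register. On any other target, SPARC
 included, the sketch does nothing. Both helper names are hypothetical.

```c
#include <float.h>
#if defined(__SSE2__)
#include <xmmintrin.h>   /* _MM_SET_FLUSH_ZERO_MODE, _MM_FLUSH_ZERO_ON */
#endif

/* Hypothetical helper: request abrupt underflow (flush-to-zero).
 * Returns 1 if the mode was switched on, 0 if this target is not
 * covered by the sketch. */
int enable_abrupt_underflow(void)
{
#if defined(__SSE2__)
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical probe: divide the smallest normal double by 3 at run
 * time. The volatile keeps the compiler from folding the division.
 * The true result is below DBL_MIN, so it is subnormal under gradual
 * underflow and zero under abrupt underflow. */
double smallest_normal_over_three(void)
{
    volatile double tiny = DBL_MIN;
    return tiny / 3.0;
}
```

 Calling enable_abrupt_underflow() as the first statement of main()
 approximates what a start-up hook in the run-time would do. The
 probe can be called afterwards to check whether the mode took effect.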
 



* optimization/4487: -ffast-math fails to disable gradual underflow on Ultrasparc
@ 2001-10-05 15:46 Peter van Hoof
From: Peter van Hoof @ 2001-10-05 15:46 UTC
  To: gcc-gnats

>Number:         4487
>Category:       optimization
>Synopsis:       -ffast-math fails to disable gradual underflow for Ultrasparc
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    unassigned
>State:          open
>Class:          pessimizes-code
>Submitter-Id:   net
>Arrival-Date:   Fri Oct 05 15:46:01 PDT 2001
>Closed-Date:
>Last-Modified:
>Originator:     Peter van Hoof
>Release:        3.0.1
>Organization:
Canadian Institute for Theoretical Astrophysics
>Environment:
System: SunOS scooby 5.8 Generic_108528-10 sun4u sparc SUNW,Sun-Blade-100
Architecture: sun4

	
host: sparc-sun-solaris2.8
build: sparc-sun-solaris2.8
target: sparc-sun-solaris2.8
configured with: ../gcc-3.0.1/configure --prefix=/opt/local --enable-threads --enable-gcj
>Description:
	Ultrasparc chips do not support gradual underflow in hardware,
	and therefore these instructions need to be emulated in software.
	Since -ffast-math allows deviations from the IEEE-754 standard
	for the sake of increasing performance, it is my opinion that
	-ffast-math should flush denormalized numbers to zero (or at least
	there should be some option for enabling this; to the best of my
	knowledge no such flag exists for Sparc hardware). Needless to
	say that software emulation can lead to substantial performance
	degradation for certain programs. My machine has a 500MHz
	Ultrasparc IIe processor, but I think the problem is the same
	for all v9 hardware.
>How-To-Repeat:
	To illustrate the degradation, here is a little program that
	generates oodles of underflows:
	
	scooby> gcc -O3 -ffast-math test.c -lm
	scooby> time a.out
	16.02u 135.95s 2:35.33 97.8%
	
	The -fast option on the SunWorks compiler does flush denormalized
	numbers to zero. I do not have a SunWorks compiler myself, so I
	used somebody else's (running on a Sun Ultra 1):
	
	chinook> cc -fast test.c -lm
	scooby> time a.out
	0.23u 0.01s 0:00.21 114.2%
	
	There obviously is a dramatic improvement in performance.
	This is test.c:
	
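#include <math.h>	/* declares pow() */

int main()
{
	long i,j;
	double x[5000],y[5000],fac;

	/* fac is only slightly above the smallest normal double (about 2.2e-308) */
	fac = 1.e-305;
	for( i=0; i < 5000; i++ ) {
		x[i] = pow((double)(i+1),5.);
		y[i] = 0.;
	}
	/* fac/x[i] is denormal for i >= 3, so the inner loop keeps producing denormal results */
	for( j=0; j < 1000; j++ ) {
		for( i=0; i < 5000; i++ ) {
			y[i] += fac/x[i];
		}
	}
	return 0;
}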
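
	To see why test.c generates so many underflows, the quotients
	can be classified directly. The sketch below is not part of the
	original test case: it assumes a C99 <math.h> providing
	fpclassify(), the function name is hypothetical, and it uses
	repeated multiplication in place of pow() so that it links
	without -lm. Since 1.e-305/1024 is already below DBL_MIN, every
	quotient from i = 3 onward is subnormal unless flush-to-zero is
	in effect.

```c
#include <math.h>   /* fpclassify, FP_SUBNORMAL */

/* Hypothetical helper: count how many of the quotients
 * fac / (i+1)^5, for i = 0 .. n-1, are subnormal (denormalized). */
int count_subnormal_quotients(double fac, int n)
{
    int count = 0;
    int i;

    for (i = 0; i < n; i++) {
        double b = (double)(i + 1);
        double x = b * b * b * b * b;   /* stands in for pow(b, 5.) */
        double q = fac / x;

        if (fpclassify(q) == FP_SUBNORMAL)
            count++;
    }
    return count;
}
```

	Calling count_subnormal_quotients(1.e-305, 5000) reproduces the
	operands of the inner loop in test.c. A count of zero with the
	same arguments would indicate that results are being flushed to
	zero.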

	
>Fix:
	The SunWorks compiler can obviously work around the problem,
	so there must be a workaround. However, I haven't found it yet.
	If somebody knows how to do it, I would be happy to hear about it!
>Release-Note:
>Audit-Trail:
>Unformatted:

