Spill frequency and alignment

public inbox for gcc@gcc.gnu.org

* Spill frequency and alignment
@ 1998-12-18  9:16 tprince
From: tprince @ 1998-12-18  9:16 UTC (permalink / raw)
  To: burley, egcs, hjstein, toon

I ran a comparison test on the classic Dawes/Whittle Laboratory
implicit time-marching turbomachinery CFD code.  The code is
written entirely in single precision.  As I related yesterday, there
are significant spills, and g77 runs faster on Intel with -Os than
with -O2.  In certain important sections of the code, the number
of memory references associated with spills and restores
approaches the number of declared memory references.

lf95, which employs mis-aligned 80-bit spills, takes 25% longer
to run than g77 or lf90.  I would tend to ascribe this to the spill
problem, having examined important sections of the generated
code and found the spills to be the primary target for criticism.  For a
double precision code, the failure to align declared arrays would
be a more serious factor.

The differences in results were consistent with my expectation;
the widened spills changed results to a degree similar to what is
expected from the usual re-associations performed by various
compilers.  The code runs reliably in plain single precision, but I
consider the advantage of widened register calculation to be
worth the slight time penalty of less than 5%.  Some of this may
often be regained by more rapid and reliable iterative
convergence.  However, the widened spills don't demonstrate
any advantage worth paying in execution time.

I believe this confirms Toon's suspicion that widened spills will
be unsatisfactory without guaranteed alignment.

Going way out on a limb, I would guess that the penalty for 80-bit
spills rather than 32-bit ones could be reduced from 25% to 10%
by proper alignment, and possibly to 5% by widening only by one
step (64-bit spills for single precision, 80-bit spills for double).

I don't have a strong feeling about 64-bit vs 80-bit spills for
single precision, but I do believe we will be unhappy with
mis-aligned widened spills.
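
As a rough illustration of why packed 80-bit (10-byte) spill slots are a concern, the following C sketch counts how many back-to-back slots straddle a block boundary.  The function names and the 16-byte boundary used in the example are assumptions for illustration only; the boundary that actually matters depends on the processor and is not stated above.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if an access of `size` bytes starting
   at byte `offset` spans more than one `boundary`-byte block, else 0. */
int crosses_boundary(size_t offset, size_t size, size_t boundary)
{
    return offset / boundary != (offset + size - 1) / boundary;
}

/* Hypothetical helper: counts how many of `n` consecutive slots straddle
   a boundary when slot i starts at byte offset i * stride. */
size_t count_straddling(size_t n, size_t size, size_t stride, size_t boundary)
{
    size_t i, hits = 0;

    for (i = 0; i < n; i++)
        hits += crosses_boundary(i * stride, size, boundary);
    return hits;
}
```

Under the assumed 16-byte boundary, count_straddling(8, 10, 10, 16) finds that 4 of 8 packed 10-byte slots straddle a boundary, while padding each slot to a 16-byte stride, or using 4-byte slots at a 4-byte stride, gives 0.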

Dr. Timothy C. Prince
Consulting Engineer
Solar Turbines, a Caterpillar Company
alternate e-mail: tprince@computer.org


* Spill frequency and alignment
@ 1998-12-20  5:57 Stephen L Moshier
From: Stephen L Moshier @ 1998-12-20  5:57 UTC (permalink / raw)
  To: tprince, egcs

> In certain important sections of the code, the number
> of memory references associated with spills and restores
> approaches the number of declared memory references.

In the terminology of reload, spilling merely reassigns a variable
from a register to a home on the stack.  I suspect that the force-mem
switch tends to create gratuitous copying, from the declared memory
variable to a register which then suffers reloading to the stack.  It
would be interesting if you were to find that -fno-force-mem had an
effect on either the overall execution speed or the amount of spilling
in your particular benchmark.

The fact that something was spilled does not imply that it was also
necessarily truncated.  Spilling and truncating are two different
effects to be evaluated.  If a `double' variable is merely copied into
a register and then recopied out to a `double' stack location, then it
was not actually truncated.  I thought we already had some test cases
that showed truncation, but in an hour or so of searching, I could not
find one either in the torture tests or in any of the random C
programs that happened to be handy.  Thus I do not have a test case
for the bug, and the experience of searching for one has made me think
that the truncation effect may be rare compared to the overall
incidence of spilling.  That ratio affects the statistical estimation
of the speed penalty, because there would be no reason to widen the
spills that did not also truncate.
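
The distinction between spilling and truncating can be sketched in C as follows.  This is only an illustration under stated assumptions: `double` stands in for the wide register and `float` for a narrower stack slot, chosen for portability rather than reproducing the 80-bit x87 case, and all function names are hypothetical.

```c
/* Hypothetical helpers: `double` plays the role of the wide register,
   `float` the role of a narrower stack slot. */

/* Spill to a slot narrower than the register: the store rounds. */
double spill_narrow(double reg)
{
    volatile float slot = (float)reg;   /* volatile forces a real store */
    return (double)slot;                /* reload cannot recover lost bits */
}

/* Spill to a slot as wide as the register: the value is preserved. */
double spill_wide(double reg)
{
    volatile double slot = reg;
    return slot;
}

/* Returns 1 if the round trip through memory changed the value. */
int truncated_by(double (*spill)(double), double reg)
{
    return spill(reg) != reg;
}
```

For example, truncated_by(spill_narrow, 1.0 / 3.0) reports a change, while truncated_by(spill_wide, 1.0 / 3.0) does not.  truncated_by(spill_narrow, 0.5) also reports no change, since the value already fits the narrow slot: a spill by itself does not imply truncation.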

Permit me to reiterate that I consider extra-precise register problems
to be a hardware bug, not a compiler bug.  The Intel 8087 fraud is no
substitute for a real XFmode data type in the compiler nor for ways
to get IEEE behavior out of the hardware.
