public inbox for gcc@gcc.gnu.org
* Re: [EGCS] Re: double alignment patch for x86
@ 1997-08-19 19:45 Marc Lehmann
  1998-02-09  2:10 ` Jeffrey A Law
  0 siblings, 1 reply; 5+ messages in thread
From: Marc Lehmann @ 1997-08-19 19:45 UTC (permalink / raw)
  To: egcs

>  > IMHO, the stack should always be aligned to an 8 byte boundary unless you
>  > specifically ask for it otherwise (-fspace for instance).  Otherwise, your
>  > caller may not have aligned the stack properly.
>That's one way to approach the problem, but it results in an extra insn to
>mask off the low bits of the stack in the prologue.
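
The masking step described above amounts to rounding the stack pointer down to a multiple of 8. A minimal C sketch of that arithmetic follows; `align_down` is a hypothetical helper name used only for illustration (it is not GCC code), and an actual prologue would presumably do this with a single `and` on the stack pointer:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round an address down to a multiple of `align`.
   `align` must be a power of two.  The x86 stack grows downward, so
   rounding down only ever reserves extra space, it never overlaps the
   caller's frame. */
static uintptr_t align_down(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return addr & ~(align - 1);
}
```

For example, `align_down(0x1004, 8)` yields `0x1000`, while an already aligned `0x1008` is left unchanged.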

One thing that has bugged me for months now (need advice):

gcc currently ignores the alignment set by FUNCTION_ARG_BOUNDARY
on machines that use push instructions to store arguments. This
results in incorrect code (caller forgets the necessary padding)
with the proposed -marg-align-double.

I'd like to fix this... so... where would be the place to do that?
Should store_one_arg add padding, or should this be done in expand_call?

>Options which increase the minimum alignment requirements for stack objects
>implicitly depend on the OS, crt0 and friends to maintain proper alignment.

We could make the switch dependent on the OS (Linux -> do, Solaris -> ignore).
That's somewhat ugly, but it doesn't break any programs.

In the worst case we could simply default this switch to off.

>[ Which is why I would generally discourage options which work in this
>  manner. ]

I'd say we shouldn't discourage options that can easily double
code speed, with only minimal changes. And on many machines,
you have more alignment than the standard says anyway.

      -----==-
      ----==-- _
      ---==---(_)__  __ ____  __       Marc Lehmann
      --==---/ / _ \/ // /\ \/ /       pcg@goof.com
      -=====/_/_//_/\_,_/ /_/\_\
    The choice of a GNU generation


* Re: [EGCS] Re: double alignment patch for x86
  1997-08-19 19:45 [EGCS] Re: double alignment patch for x86 Marc Lehmann
@ 1998-02-09  2:10 ` Jeffrey A Law
  0 siblings, 0 replies; 5+ messages in thread
From: Jeffrey A Law @ 1998-02-09  2:10 UTC (permalink / raw)
  To: Marc Lehmann; +Cc: egcs

  In message <E0x0rGY-0000Z1-00.1997-08-19-18-36-22_pgcc_forever_@cerebro> you write:
  > One thing that has bugged me for months now (need advice):
  > 
  > gcc currently ignores the alignment set by FUNCTION_ARG_BOUNDARY
  > on machines that use push instructions to store arguments. This
  > results in incorrect code (caller forgets the necessary padding)
  > with the proposed -marg-align-double.
Right.

  > I'd like to fix this... so... where would be the place to do that?
  > Should store_one_arg add padding, or should this be done in expand_call?
Have the caller push dummy word(s) before any of the normal args iff the
sum total of the stack area to be pushed is not a multiple of
FUNCTION_ARG_BOUNDARY.
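
The amount of padding the caller would push can be sketched in C as below. This is only an illustration of the arithmetic, not code from calls.c: `arg_padding_bytes` is a hypothetical name, and the sketch assumes the boundary is given in bytes and is a power of two.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of dummy bytes to push before the real
   arguments so that the total pushed area becomes a multiple of
   `boundary`.  Assumes `boundary` is a power of two, in bytes. */
static size_t arg_padding_bytes(size_t args_size, size_t boundary)
{
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    /* Distance from args_size up to the next multiple of boundary;
       the outer mask turns a full `boundary` into 0. */
    return (boundary - (args_size & (boundary - 1))) & (boundary - 1);
}
```

With an 8-byte boundary, 12 bytes of arguments need 4 bytes of padding (one dummy word on x86), and 16 bytes need none.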

The callee then moves any unaligned args out of the stack into either a
register (if that's the final "home" for the arg) or into an aligned
stack slot.
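
At the source level, the callee-side move looks roughly like the sketch below. `load_double_arg` is a hypothetical function written for illustration; a compiler would emit the equivalent loads and stores directly rather than call anything.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical illustration: copy a double out of a possibly
   misaligned argument slot into a naturally aligned local.  memcpy is
   used because dereferencing a misaligned double pointer is undefined
   behaviour in C. */
static double load_double_arg(const unsigned char *slot)
{
    double home;                      /* aligned by the compiler */
    assert(slot != NULL);
    memcpy(&home, slot, sizeof home);
    return home;
}
```

Every later access then goes to `home`, so the misalignment cost is paid once per call rather than once per use.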

However, this only works if nobody ever misaligns the stack -- which
is hard to guarantee on the x86 because of the existence of old compilers
which do not maintain proper stack alignment.


  > We could make the switch dependent on the OS (Linux -> do, solaris ->
  > ignore).  That's somewhat ugly, but it doesn't break any programs.
You can't even do it on Linux by default, both because existing libraries
may perform callbacks (qsort) and because people need to be able to mix
and match .o files from different versions of gcc.


jeff


* Re: [EGCS] Re: double alignment patch for x86
  1997-08-18  8:22 2 (small?) problems Thomas Hiller
@ 1997-08-18 13:29 ` Dave Love
  0 siblings, 0 replies; 5+ messages in thread
From: Dave Love @ 1997-08-18 13:29 UTC (permalink / raw)
  To: egcs

>>>>> "Toon" == Toon Moene <toon@moene.indiv.nluug.nl> writes:

 >> It takes 2 additional cycles per double access.. this can sum up
 >> to 30% runtime penalty on important algorithms like matrix
 >> multiply...

 Toon> This is on a Pentium ?  

It's about right for daxpy on my pentium.
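
For readers who don't know the kernel: daxpy computes y = a*x + y over vectors of doubles. A plain C sketch of the loop follows (not the reference BLAS routine, which also takes stride arguments); nearly every instruction in it touches a double in memory, which is why the alignment of `x` and `y` shows up so directly in the timing.

```c
#include <assert.h>
#include <stddef.h>

/* y[i] += a * x[i]: each iteration loads two doubles and stores one,
   so a per-access misalignment penalty hits three times per element. */
static void daxpy(size_t n, double a, const double *x, double *y)
{
    assert(n == 0 || (x != NULL && y != NULL));
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}
```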

-- Dave (recycled physicist)


* Re: [EGCS] Re: double alignment patch for x86
@ 1997-08-17 21:48 Marc Lehmann
  0 siblings, 0 replies; 5+ messages in thread
From: Marc Lehmann @ 1997-08-17 21:48 UTC (permalink / raw)
  To: egcs

John Carr wrote:
>
>byte alignment.  8 byte alignment is desirable (and has always been --
>Sun made a bad choice 10 years ago).  gcc copies misaligned double
>arguments to aligned locations on the stack.  Doing this on x86 might

I think the patch already does this with -mstack-align-double;
the (broken) -marg-align-double is just to squeeze out
the last bit of performance ;)

>help performance without breaking compatibility (or it might not: on x86
>there is only a cost if the dynamic value of the pointer is not aligned
>but SPARC requires two instructions instead of one to support possibly
>misaligned pointers).

It takes 2 additional cycles per double access.. this can sum up
to 30% runtime penalty on important algorithms like
matrix multiply...

It was actually a physicist who pointed this out to me
(and H.J.Lu for the libc, I believe).

      -----==-
      ----==-- _
      ---==---(_)__  __ ____  __       Marc Lehmann
      --==---/ / _ \/ // /\ \/ /       pcg@goof.com
      -=====/_/_//_/\_,_/ /_/\_\
    The choice of a GNU generation


* Re: [EGCS] Re: double alignment patch for x86
@ 1997-08-17 21:48 Toon Moene
  0 siblings, 0 replies; 5+ messages in thread
From: Toon Moene @ 1997-08-17 21:48 UTC (permalink / raw)
  To: egcs

Marc,

>  It takes 2 additional cycles per double access.. this
>  can sum up to 30% runtime penalty on important algorithms
>  like matrix multiply...

This is on a Pentium ?  My brother did some experiments on his PPro  
180 for me with a code that had almost all DOUBLE PRECISION data in  
COMMON BLOCKS.

Without -malign-double, the code took 18 seconds, with  
-malign-double 10 seconds, i.e. a factor of almost 2.

>  It was actually a physicist who pointed this
>  out to me (and H.J.Lu for the libc, I believe).

We're both actual physicists :-)

Cheers,
Toon.

