From mboxrd@z Thu Jan 1 00:00:00 1970
From: Craig Burley
To: davem@dm.cobaltmicro.com
Cc: law@cygnus.com, d.love@dl.ac.uk, egcs@cygnus.com
Subject: ix86 double alignment (was Re: egcs-1.1 release schedule)
Date: Mon, 22 Jun 1998 18:20:00 -0000
Message-id: <199806221829.OAA07477@melange.gnu.org>
References: <199806221217.FAA20123@dm.cobaltmicro.com>
X-SW-Source: 1998-06/msg00769.html

>   Date: Sun, 21 Jun 1998 22:31:31 -0600
>   From: Jeffrey A Law
>
>     * The ABI is still going to mandate that some doubles in
>       argument lists are going to be mis-aligned.  We'd have
>       to arrange to copy them from the arglist into a suitable
>       stack slot.  This may be more trouble than its worth.
>
>And there are still going to be issues with equivalence statements.

Well, I'm willing to not try to do any special aligning for EQUIVALENCE
and COMMON for now.

If we can just get 64-bit alignment for stack-allocated VAR_DECLs --
which generally won't include EQUIVALENCE (and certainly not COMMON) --
we'll have made a *huge* improvement in g77 performance, especially in
the *repeatability* of its performance measurements.

(Without this improvement, egcs 1.1 will often appear *substantially*
worse than the combination of g77 0.5.22 and gcc 2.7.2.3 on lots of
widely used Fortran code, assuming users are using -malign-double.)

I hope to have a fairly thorough sample program put together soon
(tomorrow?) to illustrate this, but the simple cases we want to align
for now are like

      subroutine x
      double precision a
      ...
      end

and:

      subroutine y(n)
      double precision a(n)
      ...
      end

The latter uses automatic arrays (which gcc and g77 support); it'd be
great to get those 64-bit aligned as well.

The former is the most important thing we *aren't* aligning, currently,
even with `-malign-double'.  (It should be aligned especially if `a' is
an array, of course.)
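As a rough illustration of what 64-bit aligning a stack slot amounts to,
here is a minimal Python sketch of the usual round-up arithmetic.  The
names `align_up' and `padding_needed' are hypothetical, not taken from
gcc or g77, and the sketch assumes the frame base is itself 8-byte
aligned, which is exactly what cannot be taken for granted on ix86.

```python
def align_up(offset: int, alignment: int) -> int:
    """Round a byte offset up to the next multiple of `alignment`.

    `alignment` must be a positive power of two (8 for a 64-bit double).
    """
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (offset + alignment - 1) & ~(alignment - 1)


def padding_needed(offset: int, alignment: int = 8) -> int:
    """Bytes of padding to insert before a slot that would start at `offset`."""
    return align_up(offset, alignment) - offset


# Hypothetical frame layout: a 4-byte INTEGER at offset 0, then a
# DOUBLE PRECISION that would otherwise start at offset 4.
double_slot = align_up(4, 8)   # start of the double after padding
pad = padding_needed(4)        # padding inserted in front of it
```

Under that assumption, a double that would otherwise land at offset 4
moves to offset 8 at the cost of 4 bytes of padding.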
A case we can't 64-bit align is:

      real r(2)
      double precision d1, d2
      equivalence (r(1),d1)
      equivalence (r(2),d2)

Regardless of whether this is stack, static, or even part of a common
block, we can't 64-bit align both d1 and d2.  (Well, not without an
option to completely change the way we implement Fortran; I wonder if
Sun does that to support weird-but-conforming code on SPARCs, such as
the above.)

What we *can* do is *implement* the above, perhaps warning about the
suboptimal alignment.  That is, there's no reason we can't go ahead and
32-bit align d1 and d2, so one of them is not 64-bit aligned.  The
programmer asked for it, after all!

What we can also 64-bit align is this:

      real r(2)
      double precision d
      equivalence (r(2),d)

We can do that because we can see that there are no actual *conflicts*
of alignment.  We can implement this by either inserting a dummy unused
32-bit variable before r(1) and aligning *that* to a 64-bit boundary
(stack or static, doesn't matter), or, if we have a smart-enough back
end (or linker, for static memory I guess), simply using a directive
that means "align to a 64-bit boundary on bit 32".

But it's not *important* to 64-bit align the above EQUIVALENCE case,
certainly not for egcs 1.1.

And what we also need to continue to support is stuff like

      real r1, r2
      real s(6)
      double precision d1, d2
      common r1, d1, r2, d2
      equivalence (r1,s)

which requires that s(1) overlays r1, s(2) and s(3) overlay d1, s(4)
overlays r2, and s(5) and s(6) overlay d2.

Again, we can do this by seeing that there are no "hard" conflicts (at
the machine or ABI level), and punting (and warning?) over the fact
that the "soft" conflicts (the ideal 64-bit alignment of double for
performance reasons) prevent "ideal" alignment.

Again, "so what": the programmer has specified no 64-bit alignment, so
we don't give it to him in cases like that -- but we can still compile
correct, fairly fast, ABI-compatible code.
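The conflict test described above can be sketched as a small piece of
modular arithmetic.  This is only an illustration under stated
assumptions, not how g77 or the gcc back end actually does it: the
function name `leading_pad' is hypothetical, REAL is taken to be 4
bytes and DOUBLE PRECISION 8 bytes, and each double is represented by
its byte offset from the start of the storage sequence.

```python
DOUBLE_ALIGN = 8  # preferred ("soft") alignment of DOUBLE PRECISION, in bytes


def leading_pad(double_offsets, align=DOUBLE_ALIGN):
    """Pad bytes to put in front of a storage sequence so that every double
    in it lands on an `align`-byte boundary, assuming the padded block
    itself starts on such a boundary.

    Returns None when the doubles need different paddings, i.e. when there
    is a "soft" alignment conflict and no single padding can satisfy them.
    """
    residues = {off % align for off in double_offsets}
    if not residues:
        return 0          # no doubles, nothing to align
    if len(residues) > 1:
        return None       # conflicting requirements
    return (-residues.pop()) % align


# equivalence (r(1),d1), (r(2),d2): d1 at byte 0, d2 at byte 4
both = leading_pad([0, 4])
# equivalence (r(2),d): d at byte 4
single = leading_pad([4])
# common r1, d1, r2, d2: d1 at byte 4, d2 at byte 16
common = leading_pad([4, 16])
```

Here `both' and `common' come out as None (a conflict, so we fall back
to 32-bit alignment), while `single' is 4, which corresponds to the
dummy 32-bit variable placed before r(1).  For the COMMON case the
layout is fixed by the programmer in any event, so the result only
tells us whether a warning might be worthwhile.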
Note that I once suggested that the gcc architecture (machine
descriptions, etc.) be modified to include a more fine-grained
expression of alignment requirements, e.g. distinguishing hardware
requirements (even instruction requirements, such as `ld' vs. `ldd' on
SPARCv8) from ABI requirements from ideal performance settings.  But
this suggestion was turned down at the time -- some seven years ago!

Maybe it's time we finally got this all "right", and I'm sure willing
to help.  But I think we can only manage to get a bit of it "right" to
improve x86 performance for egcs 1.1.

        tq vm, (burley)