From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (qmail 31949 invoked by alias); 19 Jul 2002 14:40:05 -0000
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-owner@gcc.gnu.org
Received: (qmail 31938 invoked from network); 19 Jul 2002 14:40:04 -0000
Received: from unknown (HELO laptop.moene.indiv.nluug.nl) (195.109.255.217)
  by sources.redhat.com with SMTP; 19 Jul 2002 14:40:04 -0000
Received: from local ([127.0.0.1] helo=moene.indiv.nluug.nl)
  by laptop.moene.indiv.nluug.nl with esmtp (Exim 3.12 #1 (Debian))
  id 17VYvj-0000Kn-00 for ; Fri, 19 Jul 2002 16:40:27 +0200
Message-ID: <3D3824DA.B198DC39@moene.indiv.nluug.nl>
Date: Fri, 19 Jul 2002 11:00:00 -0000
From: Toon Moene
Organization: Moene Computational Physics, Maartensdijk, The Netherlands
X-Accept-Language: en
MIME-Version: 1.0
To: gcc@gcc.gnu.org
Subject: Re: Alias analysis - does base_alias_check still work ?
References: <3D346B28.47039CD9@moene.indiv.nluug.nl>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-SW-Source: 2002-07/txt/msg00860.txt.bz2

I wrote:

> f/com.c contains the following note, preceding the definition of
>
> #define LANG_HOOKS_GET_ALIAS_SET hook_get_alias_set_0
>
> /* We do not wish to use alias-set based aliasing at all.  Used in the
>    extreme (every object with its own set, with equivalences recorded) it
>    might be helpful, but there are problems when it comes to inlining.  We
>    get on ok with flag_argument_noalias, and alias-set aliasing does
>    currently limit how stack slots can be reused, which is a lose.  */
>
> I do not know if all the facts mentioned here still actually hold, but I
> do have strong doubts that base_alias_check in alias.c still does its
> duty.
>
> Consider the following Fortran source:
>
>       SUBROUTINE SIMPLE(A, B)
>       B = 3.0
>       A = 2.0
>       B = A*B
>       END
>
> one would assume that alias analysis at least once should check that the
> assignment to A in line 3 doesn't change the value of B set in line 2,
> which, with
>
>       flag_argument_noalias > 1
>
> [arguments don't alias] in effect, would be the case.
>
> However, according to my experiments with setting breakpoints in
> base_alias_check, it never passes that point.

Sigh, that's just because it doesn't need to.  The code generated looks
like this:

        ...
        movl    $0x40400000, (%edx)     ! B=3.0
        movl    $0x40000000, (%eax)     ! A=2.0
        flds    (%edx)                  ! put B on stack
        ...

which, of course, gets neatly around the problem of whether the store
into A would change B: B is simply reloaded from memory after the store
to A.

Now for the real issue.  For register renaming to be really effective,
alias analysis has to work well.  Take the following example:

      subroutine saxpy(n,sa,sx,sy)
      real sx(n),sy(n),sa
      integer i,n
      do i = 1,n
         sy(i) = sy(i) + sa*sx(i)
      enddo
      end

If we compile this with -O2 -march=pentium4 -mfpmath=sse -funroll-loops
-frename-registers, we get for the unrolled loop:

.L6:
        movaps  %xmm1, %xmm7
        movaps  %xmm1, %xmm6
        movaps  %xmm1, %xmm5
        mulss   (%edx), %xmm7
        movaps  %xmm1, %xmm4
        addss   (%eax), %xmm7
        movss   %xmm7, (%eax)
        mulss   4(%edx), %xmm6
        addss   4(%eax), %xmm6
        movss   %xmm6, 4(%eax)
        mulss   8(%edx), %xmm5
        addss   8(%eax), %xmm5
        movss   %xmm5, 8(%eax)
        mulss   12(%edx), %xmm4
        addl    $16, %edx
        addss   12(%eax), %xmm4
        movss   %xmm4, 12(%eax)
        addl    $16, %eax
        subl    $4, %ecx
        jns     .L6

Obviously, register renaming has done its work, but the (re-)scheduling
of instructions leaves something to be desired.

After much gdb'ing in sched-deps.c and alias.c I believe I have found the
cause: the rescheduling of this loop after register renaming is run
after register allocation (hey, no surprise :-).  However, alias
analysis is really careful about assumptions on the contents of these
hard registers, so almost no instruction gets moved around.
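To illustrate why the scheduler needs alias information before it may
move these loads and stores, here is a small C sketch.  It is not taken
from GCC; the function names saxpy4_in_order, saxpy4_batched and
orders_agree are made up for this example.  It runs one unrolled
iteration (four elements) of the saxpy update in two orders: store each
result before the next load, or do all loads and arithmetic first and
all stores last.  The two orders give the same answer only when sx and
sy do not overlap, which Fortran's argument rules let the compiler
assume but which plain C pointers do not.

```c
#include <assert.h>
#include <string.h>

/* One unrolled iteration of sy(i) = sy(i) + sa*sx(i): each result is
   stored before the next element is loaded. */
void saxpy4_in_order(float sa, const float *sx, float *sy)
{
    for (int i = 0; i < 4; i++)
        sy[i] = sy[i] + sa * sx[i];
}

/* The same four updates with every load and multiply-add done first
   and every store done last. */
void saxpy4_batched(float sa, const float *sx, float *sy)
{
    float t0 = sy[0] + sa * sx[0];
    float t1 = sy[1] + sa * sx[1];
    float t2 = sy[2] + sa * sx[2];
    float t3 = sy[3] + sa * sx[3];
    sy[0] = t0;
    sy[1] = t1;
    sy[2] = t2;
    sy[3] = t3;
}

/* Hypothetical helper: run both orders on private copies of buf, with
   sx starting at x_off and sy at y_off (sa is fixed at 2.0 here).
   Returns 1 if both orders leave identical memory, 0 otherwise. */
int orders_agree(const float *buf, size_t len, size_t x_off, size_t y_off)
{
    float a[8], b[8];

    assert(len <= 8 && x_off + 4 <= len && y_off + 4 <= len);
    memcpy(a, buf, len * sizeof *a);
    memcpy(b, buf, len * sizeof *b);
    saxpy4_in_order(2.0f, a + x_off, a + y_off);
    saxpy4_batched(2.0f, b + x_off, b + y_off);
    return memcmp(a, b, len * sizeof *a) == 0;
}
```

With disjoint halves of one buffer (x_off = 0, y_off = 4) the two orders
agree; with sy shifted one element into sx (x_off = 0, y_off = 1) they
do not, because each in-order store feeds the next load.  A scheduler
that cannot prove the first case has to keep the stores where they are.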

OK, but what if we allow instruction scheduling before register
allocation, using -fschedule-insns instead of -frename-registers?  (That
only helps if the floating-point pseudo registers already have
different "names", but fortunately they do.)  We then get:

.L6:
        movaps  %xmm4, %xmm0
        movaps  %xmm4, %xmm1
        movaps  %xmm4, %xmm2
        movaps  %xmm4, %xmm3
        mulss   (%edx), %xmm0
        mulss   4(%edx), %xmm1
        mulss   8(%edx), %xmm2
        mulss   12(%edx), %xmm3
        addss   (%eax), %xmm0
        addss   4(%eax), %xmm1
        addss   8(%eax), %xmm2
        addss   12(%eax), %xmm3
        movss   %xmm0, (%eax)
        movss   %xmm1, 4(%eax)
        movss   %xmm2, 8(%eax)
        movss   %xmm3, 12(%eax)
        addl    $16, %edx
        addl    $16, %eax
        subl    $4, %ecx
        jns     .L6

Bingo!  That's a lot better: the four multiplies, the four adds and the
four stores are now grouped, so the independent operations can overlap.

Which raises the question:

Is there a reason -fschedule-insns isn't on by default when using -O2?

Cheers,

-- 
Toon Moene - mailto:toon@moene.indiv.nluug.nl - phoneto: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Maintainer, GNU Fortran 77: http://gcc.gnu.org/onlinedocs/g77_news.html
Join GNU Fortran 95: http://g95.sourceforge.net/ (under construction)