From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 22376 invoked by alias); 11 Dec 2011 14:08:31 -0000
Received: (qmail 22366 invoked by uid 22791); 11 Dec 2011 14:08:29 -0000
X-SWARE-Spam-Status: No, hits=-2.8 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00,TW_PM
X-Spam-Check-By: sourceware.org
Received: from localhost (HELO gcc.gnu.org) (127.0.0.1) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Sun, 11 Dec 2011 14:08:16 +0000
From: "dominiq at lps dot ens.fr"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug lto/51497] [4.7 Regression] The run time for the polyhedron test nf.f90 is ~10% slower with -flto after revision 182107
Date: Sun, 11 Dec 2011 14:14:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: lto
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: dominiq at lps dot ens.fr
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
Content-Type: text/plain; charset="UTF-8"
MIME-Version: 1.0
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2011-12/txt/msg01151.txt.bz2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51497

--- Comment #2 from Dominique d'Humieres 2011-12-11 14:07:59 UTC ---
Looking further at the assembly, I have found that the seven loops in
spmmult are all vectorized without -flto, while none of them is with -flto.

For nf2dprecon after trisolve inlining, the code looks like

subroutine NF2DPrecon(x,gi,au1,au2,i1,i2,nx)   ! 2D NF Preconditioning matrix
implicit none
integer :: i1,i2,nx
real(8),dimension(i2)::x,t,gi,au1,au2
integer :: i,j
do i = i1 , i2 , nx
   if ( i>i1 ) x(i:i+nx-1) = x(i:i+nx-1) - au2(i-nx:i-1)*x(i-nx:i-1)
   x(i) = gi(i)* x(i)
   do j = i+1 , i+nx-1
      x(j) = gi(j)*(x(j)-au1(j-1)*x(j-1))
   enddo
   do j = i+nx-2 , i , -1
      x(j) = x(j) - gi(j)*au1(j)*x(j+1)
   enddo
enddo
do i = i2-2*nx+1 , i1 , -nx
   t(i:i+nx-1) = au2(i:i+nx-1)*x(i+nx:i+2*nx-1)
   t(i) = gi(i)* t(i)
   do j = i+1 , i+nx-1
      t(j) = gi(j)*(t(j)-au1(j-1)*t(j-1))
   enddo
   do j = i+nx-2 , i , -1
      t(j) = t(j) - gi(j)*au1(j)*t(j+1)
   enddo
   x(i:i+nx-1) = x(i:i+nx-1) - t(i:i+nx-1)
enddo
end subroutine NF2DPrecon
!=========================================

where none of the explicit 'do j' loops is vectorized ("possible dependence
between data-refs"). The three implicit loops are vectorized without -flto,
but only the last two are with -flto.

Note that the first loop, which is not vectorized with -flto:

   x(i:i+nx-1) = x(i:i+nx-1) - au2(i-nx:i-1)*x(i-nx:i-1)

is vectorized without it, with "created 1 versioning for alias checks."
(Alias between au2 and x? If so, valid Fortran code guarantees that there is
no aliasing.)
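
To make it easier to look at the first implicit loop in isolation, here is a minimal sketch of a reduced test case. It is not taken from nf.f90: the module name precon_slice, the subroutine update_block, the extra argument n, and the values used in the driver are all assumptions made for illustration. Only the array statement itself is copied from NF2DPrecon. Whether this reduced form reproduces the -flto difference has not been checked.

```fortran
! Hypothetical reduced test case (not from nf.f90): only the array
! statement is copied from NF2DPrecon; all other names are illustrative.
module precon_slice
  implicit none
contains
  subroutine update_block(x, au2, i, nx, n)
    integer, intent(in) :: i, nx, n
    real(8), intent(inout) :: x(n)
    real(8), intent(in) :: au2(n)
    ! x is modified and au2 is a distinct dummy argument, so a
    ! standard-conforming caller must not pass overlapping actual arguments.
    x(i:i+nx-1) = x(i:i+nx-1) - au2(i-nx:i-1)*x(i-nx:i-1)
  end subroutine update_block
end module precon_slice

program demo
  use precon_slice
  implicit none
  integer, parameter :: n = 8, nx = 4
  real(8) :: x(n), au2(n)

  x = 1.0d0
  au2 = 0.5d0
  ! Update the second block x(5:8) from the first block x(1:4).
  call update_block(x, au2, nx+1, nx, n)

  ! The first block must be untouched; the second is 1 - 0.5*1 = 0.5.
  if (any(x(1:nx) /= 1.0d0) .or. any(x(nx+1:n) /= 0.5d0)) error stop 1
  print *, 'ok'
end program demo
```

Compiling this with gfortran at -O3 and a vectorizer report option (-ftree-vectorizer-verbose=2 on GCC 4.7, -fopt-info-vec on later releases), once with and once without -flto, should show whether the statement is vectorized and whether a versioning for alias checks is created in each case.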