From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (qmail 21043 invoked by alias); 20 Nov 2009 13:45:46 -0000
Received: (qmail 20846 invoked by uid 48); 20 Nov 2009 13:45:10 -0000
Date: Fri, 20 Nov 2009 13:45:00 -0000
Message-ID: <20091120134510.20845.qmail@sourceware.org>
X-Bugzilla-Reason: CC
Subject: [Bug tree-optimization/42108] [4.4/4.5 Regression] Vectorizer cannot deal with PAREN_EXPR gracefully, 50% performance regression
Reply-To: gcc-bugzilla@gcc.gnu.org
To: gcc-bugs@gcc.gnu.org
From: "dominiq at lps dot ens dot fr"
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2009-11/txt/msg01687.txt.bz2

------- Comment #9 from dominiq at lps dot ens dot fr  2009-11-20 13:45 -------
I am rather confused by some comments:

(1) Although I am not fluent in x86 assembly, I am pretty sure that no code in
eval is vectorized (assembly taken from this PR or from the original post
http://gcc.gnu.org/ml/fortran/2009-11/msg00163.html).

(2) If I am not mistaken, the k loop always handles 3 elements: i, i+n, and
i+2*n.

(3) On a Core 2 Duo at 2.1 GHz, I only see small changes in the timing from
4.3.4 to trunk, from -O1 to -O3, and between 32-bit and 64-bit mode.

Now if I do the following change:

--- pr42108_1_db.f90    2009-11-20 14:14:05.000000000 +0100
+++ pr42108_1_db_1.f90  2009-11-20 14:15:24.000000000 +0100
@@ -7,12 +7,10 @@ subroutine eval(foo1,foo2,foo3,foo4,x,n
   do i=2,n
      foo3(i)=foo2*foo4(i)
      do j=1,i-1
-        temp=0.0d0
-        jmini=j-i
-        do k=i,nnd,n
-           temp=temp+(x(k)-x(k+jmini))**2
-        end do
-        temp = sqrt(temp+foo1)
+        temp = sqrt( (x(i)    - x(j))**2     &
+                    +(x(i+n)  - x(j+n))**2   &
+                    +(x(i+2*n)-x(j+2*n))**2  &
+                    +foo1)
         foo3(i)=foo3(i)+temp*foo4(j)
         foo3(j)=foo3(j)+temp*foo4(i)
      end do

I go from 9.2s to 5.5s for n=20000. So the k loop is not automatically unrolled,
even with -funroll-loops.

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42108
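
For anyone who wants to check point (2) and the hand-unrolled expression without the full test case, here is a minimal sketch. It is not taken from the PR's source file: the module, function and program names are made up for illustration, and it assumes nnd = 3*n (x holding three blocks of n values), which is what the three-element claim relies on. It only checks that the strided k loop and the unrolled form compute the same value; it says nothing about timing.

```fortran
module pair_distance
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains

  ! Strided-loop form, mirroring the lines removed by the diff.
  ! Assumption: nnd = 3*n, so k takes the values i, i+n, i+2*n.
  pure function dist_loop(x, i, j, n, foo1) result(temp)
    real(dp), intent(in) :: x(:), foo1
    integer, intent(in)  :: i, j, n
    real(dp) :: temp
    integer  :: k, jmini, nnd
    nnd = 3*n
    jmini = j - i
    temp = 0.0_dp
    do k = i, nnd, n
      temp = temp + (x(k) - x(k+jmini))**2
    end do
    temp = sqrt(temp + foo1)
  end function dist_loop

  ! Hand-unrolled form, mirroring the lines added by the diff.
  pure function dist_unrolled(x, i, j, n, foo1) result(temp)
    real(dp), intent(in) :: x(:), foo1
    integer, intent(in)  :: i, j, n
    real(dp) :: temp
    temp = sqrt( (x(i)     - x(j))**2     &
               + (x(i+n)   - x(j+n))**2   &
               + (x(i+2*n) - x(j+2*n))**2 &
               + foo1)
  end function dist_unrolled

end module pair_distance

program check_unroll
  use pair_distance
  implicit none
  integer, parameter :: n = 5      ! small size, just for the check
  real(dp) :: x(3*n), a, b
  integer  :: i, j

  call random_number(x)
  ! Same i/j ranges as the outer loops in eval.
  do i = 2, n
    do j = 1, i-1
      a = dist_loop(x, i, j, n, 0.5_dp)
      b = dist_unrolled(x, i, j, n, 0.5_dp)
      if (abs(a - b) > 1.0e-12_dp) error stop "loop and unrolled forms differ"
    end do
  end do
  print *, "loop and unrolled forms agree"
end program check_unroll
```

With gfortran this should build as a single file (for example, gfortran -O2 check_unroll.f90). Since 1 <= i <= n, the loop bound 3*n is reached after exactly three iterations, which is why the fixed three-term expression can replace the loop.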