From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 15528 invoked by alias); 30 Jun 2005 22:16:41 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 15462 invoked by uid 48); 30 Jun 2005 22:16:34 -0000
Date: Thu, 30 Jun 2005 22:16:00 -0000
Message-ID: <20050630221634.15461.qmail@sourceware.org>
From: "danalis at cis dot udel dot edu"
To: gcc-bugs@gcc.gnu.org
In-Reply-To: <20041006153323.17863.kunert@physik.tu-dresden.de>
References: <20041006153323.17863.kunert@physik.tu-dresden.de>
Reply-To: gcc-bugzilla@gcc.gnu.org
Subject: [Bug tree-optimization/17863] [4.0/4.1 Regression] threefold performance loss, not inlining as much
X-Bugzilla-Reason: CC
X-SW-Source: 2005-06/txt/msg03640.txt.bz2

------- Additional Comments From danalis at cis dot udel dot edu  2005-06-30 22:16 -------
I'm looking at the reduced testcase from comment #6, and I noticed that
f() is declared to return double but has no return statement. As a result,
the code doesn't compile with -O3 -Wall -Werror.

If I fix the bug by adding a "return(*ap1)", or by declaring f() to be
void, the performance regression disappears.

Here's the test harness I used to call the minimized testcase:

#include <iostream>
using namespace std;

// f() is the reduced testcase from comment #6; it must be declared above main().

int main(int argc, char *argv[]){
    double ay[100][100];
    const double *py, *pz;
    double *dxb, *ap1;
    double sum=0;
    int i,j,k;

    // Fill the array with distinct, deterministic values.
    for(i=0; i<100; i++){
        for(j=0; j<100; j++){
            ay[i][j] = 1000*(i+1)+2*(j+1);
        }
    }

    // Use four different rows as the four pointer arguments.
    py = ay[0];
    pz = ay[1];
    dxb = ay[2];
    ap1 = ay[3];

    // 100 * 10000 * 12 calls to f().
    for(k=0; k<100; k++){
        for(i=0; i<10000; i++){
            for(j=0; j<12; j++){
                sum += f(py,pz,dxb,ap1,j,5);
                sum /= 2;
            }
        }
    }
    cout << sum << endl;

    return 0;
}

Is that ok?

I compiled this with -O3 -mtune=pentium.

Runtimes *without* the fix to f() were 0.31s, 8.72s, 8.83s and 8.80s when
compiled with g++ 2.95.3, 3.4.3, 4.0.0 and 4.1.0-20050625, respectively
(making this a large performance regression relative to gcc-2.95.3).
Runtimes *with* the fix were 0.34s, 0.28s, 0.36s and 0.32s when compiled
with g++ 2.95.3, 3.4.3, 4.0.0 and 4.1.0-20050625, respectively.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17863
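For readers without comment #6 at hand, the kind of fix described above might look roughly like the sketch below. The name f_fixed, its parameter list (inferred only from the call f(py,pz,dxb,ap1,j,5) in the harness) and the arithmetic in its body are all assumptions made for illustration; this is not the original testcase.

```cpp
// Hypothetical stand-in for f() from comment #6. The parameter list is an
// assumption inferred from the call f(py,pz,dxb,ap1,j,5); the arithmetic is
// made up for illustration and is NOT the original testcase.
//
// Broken shape described in the comment (left commented out on purpose):
//   double f_broken(...) { ap1[j] = ...; }   // no return statement
// g++ -Wall reports this through -Wreturn-type, and flowing off the end of
// such a function is undefined behavior in C++.

// Fixed shape: the function returns the value it is declared to return.
double f_fixed(const double *py, const double *pz,
               double *dxb, double *ap1, int j, int n)
{
    ap1[j] = py[j] * pz[j] + dxb[j] * n;  // illustrative update only
    return *ap1;                          // the added return statement
}
```

With a definition of this shape in place of the broken one, the harness's "sum += f(...)" reads a well-defined value on every call.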