From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 26053 invoked by alias); 1 Sep 2009 09:13:57 -0000
Received: (qmail 25867 invoked by uid 48); 1 Sep 2009 09:13:39 -0000
Date: Tue, 01 Sep 2009 09:13:00 -0000
Message-ID: <20090901091339.25866.qmail@sourceware.org>
X-Bugzilla-Reason: CC
References:
Subject: [Bug target/38306] [4.4/4.5 Regression] 15% slowdown w.r.t. 4.3 of computational kernel on some architectures
In-Reply-To:
Reply-To: gcc-bugzilla@gcc.gnu.org
To: gcc-bugs@gcc.gnu.org
From: "jv244 at cam dot ac dot uk"
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2009-09/txt/msg00021.txt.bz2

------- Comment #16 from jv244 at cam dot ac dot uk 2009-09-01 09:13 -------
(In reply to comment #15)
> Please try -O2 and -O2 -funroll-loops too, since -O3 is not always good for
> speed. (It would be even better if -O2 is not slower and you can find out what
> the culprit is at -O3; this is not necessarily possible though).

You're right that, without -fschedule-insns, -O2 (with -funroll-loops) is
faster than -O3 in this case, but nothing comes close to 4.3 performance.
Adding -fschedule-insns to the fastest -O2 choice makes it about 17% slower.

All numbers with trunk:

-O2 -march=native -funroll-loops -ffast-math:                  4.032
-O2 -march=native -funroll-loops -ffast-math -fschedule-insns: 4.712
-O3 -march=native -funroll-loops -ffast-math:                  4.408
-O2 -march=native -ffast-math:                                 11.373
-O2 -march=native -ffast-math -fschedule-insns:                11.409
-O3 -march=native -ffast-math:                                 4.296
-O3 -march=native -ffast-math -fschedule-insns:                4.656

I can test other flags if you have a hint.

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38306
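
To put the timings in this comment on a common scale, here is a small sketch that ranks the flag sets against the fastest one and computes what adding -fschedule-insns costs for each set measured both with and without it. The numbers are copied from the comment; the names `timings`, `slowdown_percent`, and `sched_cost` are illustrative only and are not part of the bug report.

```python
# Timings copied verbatim from the comment (GCC trunk; the unit is not stated there).
timings = {
    "-O2 -march=native -funroll-loops -ffast-math": 4.032,
    "-O2 -march=native -funroll-loops -ffast-math -fschedule-insns": 4.712,
    "-O3 -march=native -funroll-loops -ffast-math": 4.408,
    "-O2 -march=native -ffast-math": 11.373,
    "-O2 -march=native -ffast-math -fschedule-insns": 11.409,
    "-O3 -march=native -ffast-math": 4.296,
    "-O3 -march=native -ffast-math -fschedule-insns": 4.656,
}

SCHED = " -fschedule-insns"


def slowdown_percent(base, other):
    """Return how much slower `other` is than `base`, in percent."""
    return (other / base - 1.0) * 100.0


# Fastest flag set among those measured.
best_flags = min(timings, key=timings.get)
best_time = timings[best_flags]

# Rank every flag set relative to the fastest one.
for flags, t in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{t:7.3f}  {slowdown_percent(best_time, t):+7.1f}%  {flags}")

# Cost of adding -fschedule-insns, for each flag set that was
# measured both with and without it.
sched_cost = {
    flags: slowdown_percent(timings[flags], timings[flags + SCHED])
    for flags in timings
    if flags + SCHED in timings
}

print()
for flags, pct in sched_cost.items():
    print(f"{pct:+6.1f}% when adding -fschedule-insns to: {flags}")
```

Comparing ratios rather than raw differences keeps the effect of a single flag comparable across configurations whose baseline times differ by almost a factor of three.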