From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (qmail 19572 invoked by alias); 30 Nov 2008 16:18:41 -0000
Received: (qmail 19377 invoked by uid 48); 30 Nov 2008 16:17:19 -0000
Date: Sun, 30 Nov 2008 16:18:00 -0000
Message-ID: <20081130161719.19376.qmail@sourceware.org>
X-Bugzilla-Reason: CC
References:
Subject: [Bug target/38306] [4.4 Regression] 15% slowdown of computational kernel
In-Reply-To:
Reply-To: gcc-bugzilla@gcc.gnu.org
To: gcc-bugs@gcc.gnu.org
From: "jv244 at cam dot ac dot uk"
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2008-11/txt/msg02651.txt.bz2

------- Comment #4 from jv244 at cam dot ac dot uk  2008-11-30 16:17 -------
(In reply to comment #2)
> Due to the high density of branches in the code this is easily a code layout
> and/or padding issue.  Different architectures have different constraints on
> their decoders and branch predictors related to branch density.  Core
> introduces other branch limitations for loops that engage the loop stream
> detector.
> We do not at all try to properly optimize (or even model) this apart
> from inserting nops.  YMMV with -fschedule-insns.

I'm not expert enough to understand this, but you have it right. However, it
remains a regression (on opteron)

4.4:
-O3 -march=native -funroll-loops -ffast-math ==> 5.064s
-O3 -march=native -funroll-loops -ffast-math -fschedule-insns ==> 4.396s
4.3:
-O3 -march=native -funroll-loops -ffast-math ==> 4.376s
-O3 -march=native -funroll-loops -ffast-math -fschedule-insns ==> 3.372s

-fno-tree-reassoc has no effect.

--

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38306