From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 801 invoked by alias); 16 Jan 2008 16:47:12 -0000
Received: (qmail 642 invoked by uid 48); 16 Jan 2008 16:46:28 -0000
Date: Wed, 16 Jan 2008 17:40:00 -0000
Message-ID: <20080116164628.641.qmail@sourceware.org>
X-Bugzilla-Reason: CC
References:
Subject: [Bug tree-optimization/33761] non-optimal inlining heuristics pessimizes gzip SPEC score at -O3
In-Reply-To:
Reply-To: gcc-bugzilla@gcc.gnu.org
To: gcc-bugs@gcc.gnu.org
From: "hubicka at gcc dot gnu dot org"
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2008-01/txt/msg01647.txt.bz2

------- Comment #11 from hubicka at gcc dot gnu dot org 2008-01-16 16:46 -------
Last time I looked into it, the problem was code alignment affected by inlining
in the string matching loop (longest_match). This code is very atypical: the
inner loop comparing strings is hand-unrolled, but it almost never iterates,
since the compressed strings tend to all be different. GCC mispredicts this,
moving some code out of the loop, and bb-reorder aligns the code in a way that
the default path, which skips the loop, jumps pretty far. That hurts the decode
bandwidth of K8, especially because the jumps are hard to predict.

I don't see anything direct in the code that the heuristics could use to
realize that the loop is not rolling, except for special-casing this particular
benchmark.

FDO scores of gzip are not that bad, but there is still a gap relative to ICC
(even an archaic version of it running 32-bit compared to 64-bit GCC).
http://www.suse.de/~gcctest/SPEC-britten/CINT/sandbox-britten-FDO/index.html

It would be nice to convince the gzip/zlibc/bzip2 people to use profiling by
default in the build process; those packages are ideal targets. But since Core
is not nearly as sensitive to code alignment and the number of jumps as K8,
perhaps this demonstrates additional problems.

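For readers who don't know the loop shape described above, here is a rough sketch in C. This is not gzip's actual longest_match source; the function name match_length and the unroll factor of 4 are assumptions picked only for illustration. The point is that every comparison in the unrolled body is an early exit, so when the strings differ in the first few bytes the back edge is almost never taken, and nothing in the source tells the compiler that.

```c
#include <stddef.h>

/* Hypothetical helper, NOT gzip's code: returns how many leading bytes
   scan and match have in common, looking at no more than max_len bytes. */
size_t match_length(const unsigned char *scan,
                    const unsigned char *match,
                    size_t max_len)
{
    size_t n = 0;

    /* Hand-unrolled body: four byte comparisons per iteration.
       Each mismatch exits immediately, so for mostly-different inputs
       control usually leaves before the loop ever iterates again. */
    while (n + 4 <= max_len) {
        if (scan[n]     != match[n])     return n;
        if (scan[n + 1] != match[n + 1]) return n + 1;
        if (scan[n + 2] != match[n + 2]) return n + 2;
        if (scan[n + 3] != match[n + 3]) return n + 3;
        n += 4;
    }

    /* Tail: compare the remaining (fewer than four) bytes one at a time. */
    while (n < max_len && scan[n] == match[n])
        n++;

    return n;
}
```

Statically this looks like an ordinary hot loop, which is presumably why the heuristics treat it as one; only profile data would show that the exits dominate.
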
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33761