From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (qmail 21937 invoked by alias); 25 Apr 2002 16:36:22 -0000
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-owner@gcc.gnu.org
Received: (qmail 21873 invoked from network); 25 Apr 2002 16:36:14 -0000
Received: from unknown (HELO vexpert.dbai.tuwien.ac.at) (128.130.111.12)
  by sources.redhat.com with SMTP; 25 Apr 2002 16:36:14 -0000
Received: from naos (naos [128.130.111.28])
  by vexpert.dbai.tuwien.ac.at (8.11.6/8.11.6) with ESMTP id g3PGaCW03987;
  Thu, 25 Apr 2002 18:36:13 +0200 (MET DST)
Date: Thu, 25 Apr 2002 09:41:00 -0000
From: Gerald Pfeifer
To: Kurt Garloff
cc: gcc@gcc.gnu.org, Andreas Jaeger
Subject: Re: inliner in gcc-3.1
In-Reply-To: <20020424132314.B27120@gum01m.etpnet.phys.tue.nl>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-SW-Source: 2002-04/txt/msg01327.txt.bz2

On Wed, 24 Apr 2002, Kurt Garloff wrote:
> I made another patch for 3.0.1 and adapted it to 3.1, which is probably much
> more interesting: The -O3 (-finline-functions) fix.
>
> In the code I use, I put the inline keyword at all places where it was
> obvious to my eyes that inlining was a good idea. When I compile that code
> with -O3 (aka -finline-functions), performance drops by 5--10%.
>
> The reason is that comparably large functions get automatically inlined, so
> the recursive inline limit is hit earlier, and the important leaf functions
> sometimes do not get inlined any more. The problem is that all functions
> are treated as if they were declared inline, so the keyword gets completely
> ignored. I changed this and allow automatically inlined functions only half
> the size of functions declared inline by the programmer.
> This reduced the performance drop incurred by -finline-functions to 0.5--1.5%.
>
> It would be nice if this patch
> http://www.garloff.de/kurt/freesoft/gcc/gcc310-inline-func-acct-v1.diff
> would be tested by more people and integrated into 3.1.

This second patch (partially) fixes a very bad regression we've been
having since GCC 3.0; build time and binary size seem to be fine, though
we seem to degrade slightly for some of the other benchmarks.

I'd really like to see what this does to SPEC -- Andreas, could you give
it a try?

(This is due to a C module in the C++ application, and your fix seems to
be critical for many C applications in general.)

                          build time   binary size
2.95.3                       4:01        4430752
3.0                         23:54        6295044
3.0.3                        3:58        3948444
3.1-20020422                 4:38        3996096   <-- without patch
3.1-20020424+kurt-v3         5:35        4102432
3.1.20020425+kurt-finl.      4:32        3912640   <-- this is with the patch
3.1-20020422+limit=800       6:37        4177344
3.1-20020422+limit=1200     16:50        4597888
3.2-20020422                 5:15        4003276

        Times in [s]| 2.95.3|      3.1       | 3.1-with-patch|
--------------------+-------+----------------+---------------+
      STRATCOMP1-ALL|  3.57 |   96.26 (0.01) |  23.74 (0.01) | <-- REGRESS!
   STRATCOMP-770.2-Q|  0.73 |    0.93 (0.01) |   0.83 (0.00) |
               2QBF1| 19.08 |   22.25 (0.01) |  21.33 (0.00) |
          PRIMEIMPL2| 10.74 |   12.90 (0.02) |  12.85 (0.00) |
            ANCESTOR|  8.88 |    9.54 (0.01) |   9.61 (0.01) |
       3COL-SIMPLEX1|  6.30 |    7.16 (0.00) |   7.30 (0.01) |
        3COL-LADDER1| 36.24 |   42.28 (0.01) |  41.71 (0.03) |
      3COL-N-LADDER1| 19.81 |   22.57 (0.06) |  22.96 (0.00) |
        3COL-RANDOM1| 10.69 |   12.23 (0.01) |  12.87 (0.01) |
          HP-RANDOM1| 13.16 |   14.82 (0.02) |  14.85 (0.01) |
       HAMCYCLE-FREE|  1.18 |    1.71 (0.00) |   1.71 (0.00) |
             DECOMP2| 21.91 |   24.04 (0.02) |  25.75 (0.03) |
        BW-P4-Esra-a| 91.71 |   99.28 (0.03) | 100.19 (0.05) |
        BW-P5-nopush|  6.96 |    7.43 (0.01) |   7.48 (0.00) |
       BW-P5-pushbin|  6.20 |    6.49 (0.00) |   6.57 (0.00) |
     BW-P5-nopushbin|  1.94 |    2.08 (0.00) |   2.13 (0.00) |
              3SAT-1| 32.92 |   38.48 (0.04) |  39.84 (0.00) |
   3SAT-1-CONSTRAINT| 17.46 |   20.63 (0.00) |  21.37 (0.00) |
        HANOI-Towers|  4.73 |    4.94 (0.01) |   5.24 (0.00) |
              RAMSEY|  8.00 |    8.66 (0.00) |   9.01 (0.00) |
             CRISTAL| 11.07 |   13.39 (0.01) |  12.17 (0.02) |
             HANOI-K| 33.41 |   38.73 (0.02) |  38.84 (0.00) |
           21-QUEENS|  9.66 |   10.40 (0.00) |  11.06 (0.01) |
   MSTDir[V=13,A=40]| 25.71 |   21.08 (0.01) |  21.00 (0.01) |
   MSTDir[V=15,A=40]| 25.81 |   21.16 (0.00) |  21.06 (0.01) |
 MSTUndir[V=13,A=40]| 12.86 |   11.47 (0.00) |  11.31 (0.01) |
 MSTUndir[V=15,A=40]|214.87 |  188.61 (0.01) | 185.37 (0.01) |
         TIMETABLING|  9.63 |   10.63 (0.00) |  10.98 (0.01) |
--------------------+-------+----------------+---------------+

Gerald