From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 17508 invoked by alias); 23 Apr 2002 15:23:55 -0000
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-owner@gcc.gnu.org
Received: (qmail 17492 invoked from network); 23 Apr 2002 15:23:49 -0000
Received: from unknown (HELO vexpert.dbai.tuwien.ac.at) (128.130.111.12) by sources.redhat.com with SMTP; 23 Apr 2002 15:23:49 -0000
Received: from naos (naos [128.130.111.28]) by vexpert.dbai.tuwien.ac.at (8.11.6/8.11.6) with ESMTP id g3NFMvW23161; Tue, 23 Apr 2002 17:22:57 +0200 (MET DST)
Date: Tue, 23 Apr 2002 08:24:00 -0000
From: Gerald Pfeifer
To: Mark Mitchell
cc: gcc@gcc.gnu.org
Subject: Re: GCC 3.1 Prerelease
In-Reply-To:
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-SW-Source: 2002-04/txt/msg01169.txt.bz2

On Sun, 21 Apr 2002, Gerald Pfeifer wrote:
>> PR3083 C++ frontend consumes inacceptable amounts of CPU with -O3
> This is a report of mine. I believe it can be (more or less) closed, but
> will do a careful investigation tomorrow and possible Tuesday and provide
> an analysis then.

As promised, I performed tests comparing GCC 2.95.3, GCC 3.0, GCC 3.0.3,
the 3.1-branch as of yesterday, the 3.1-branch with -finline-limit=800 and
1200, respectively, and the 3.2-branch as of yesterday.

ANALYSIS:

The situation isn't as bad as it used to be, but it is still
release-critical. Compiling the file attached to PR 3083/C++ takes 62.4s
with the 3.1-branch versus 46.9s with GCC 3.0.3 (on a PIII/1000).

Considering not only PR 3083/C++, but the situation in general, it seems
that for template/STL-heavy code with deeply nested structures, GCC 2.95
is still the compiler of choice, especially in terms of code quality.

GCC 3.1 seems to generate equivalent code to GCC 3.0.3 but takes
measurably longer to compile.
Current mainline produces slightly better code than 3.0.3 and 3.1 at the
expense of even longer build times, but still does not reach GCC 2.95 on
average.

(Note that the first test above was performed using the same preprocessed
file, while the other tests were performed compiling from source, which
means that GCC 2.95 gets the benefit of a much lighter libstdc++.)

DIAGNOSIS:

Especially considering the runs where I increased -finline-limit, I
strongly suspect our inliner is still quite bad.

DETAILED DATA:

The first table shows the time needed to build DLV
(http://www.dbai.tuwien.ac.at/proj/dlv/), a C++ application that makes
heavy use of STL and has deeply nested template structures, on an
Athlon/1200. It also shows the size of the resulting static binary.

The second table shows DLV running selected benchmarks that exercise
different modules of the system (which is basically equivalent to
evaluating GCC code generation on completely separate applications).

                    | 2.95.3| 3.0   | 3.0.3 | 3.1   |...-800|..-1200| 3.2   |
--------------------+-------+-------+-------+-------+-------+-------+-------+
          Build time|  4:01 | 23:54 |  3:58 |  4:38 |  6:37 | 16:50 |  5:15 |
         Binary size|4430752|6295044|3948444|3996096|4177344|4597888|4003276|

                    | 2.95.3| 3.0   | 3.0.3 | 3.1   |...-800|..-1200| 3.2   |
--------------------+-------+-------+-------+-------+-------+-------+-------+
      STRATCOMP1-ALL|  3.57 | 79.39 | 71.41 | 96.33 | 85.40 | 82.83 | 92.01 |
   STRATCOMP-770.2-Q|  0.73 |  0.93 |  0.90 |  0.95 |  0.98 |  0.87 |  0.93 |
               2QBF1| 19.08 | 27.66 | 22.96 | 22.25 | 19.90 | 18.46 | 20.97 |
          PRIMEIMPL2| 10.74 | 16.39 | 12.92 | 12.90 | 11.74 | 10.29 | 12.35 |
            ANCESTOR|  8.88 |  9.23 |  9.63 |  9.55 | 10.52 |  9.15 |  9.46 |
       3COL-SIMPLEX1|  6.30 |  6.75 |  7.20 |  7.15 |  7.29 |  6.75 |  7.26 |
        3COL-LADDER1| 36.24 | 49.20 | 44.47 | 42.35 | 39.27 | 35.79 | 40.99 |
      3COL-N-LADDER1| 19.81 | 23.25 | 21.27 | 22.60 | 21.11 | 19.99 | 21.32 |
        3COL-RANDOM1| 10.69 | 15.38 | 12.72 | 12.22 | 11.69 | 10.61 | 11.55 |
          HP-RANDOM1| 13.16 | 13.85 | 14.55 | 14.79 | 14.39 | 14.08 | 14.43 |
       HAMCYCLE-FREE|  1.18 |  1.42 |  1.71 |  1.70 |  1.58 |  1.54 |  1.59 |
             DECOMP2| 21.91 | 26.36 | 24.24 | 24.08 | 24.15 | 22.48 | 24.38 |
        BW-P4-Esra-a| 91.71 |102.26 | 97.70 | 99.30 | 95.60 | 92.25 | 96.51 |
        BW-P5-nopush|  6.96 |  7.89 |  7.42 |  7.44 |  7.26 |  7.01 |  7.24 |
       BW-P5-pushbin|  6.20 |  6.99 |  6.52 |  6.48 |  6.30 |  6.04 |  6.31 |
     BW-P5-nopushbin|  1.94 |  2.23 |  2.10 |  2.07 |  2.05 |  1.94 |  2.04 |
              3SAT-1| 32.92 | 47.37 | 37.65 | 38.45 | 35.10 | 30.96 | 36.81 |
   3SAT-1-CONSTRAINT| 17.46 | 27.28 | 21.97 | 20.62 | 18.95 | 17.06 | 19.26 |
        HANOI-Towers|  4.73 |  5.36 |  5.03 |  4.96 |  5.09 |  4.81 |  5.05 |
              RAMSEY|  8.00 |  9.55 |  9.08 |  8.68 |  8.95 |  8.00 |  8.66 |
             CRISTAL| 11.07 | 12.54 | 13.15 | 13.35 | 13.47 | 12.65 | 13.28 |
             HANOI-K| 33.41 | 46.27 | 38.33 | 38.66 | 34.79 | 32.50 | 36.76 |
           21-QUEENS|  9.66 | 13.73 | 10.92 | 10.41 |  9.95 |  9.07 |  9.23 |
   MSTDir[V=13,A=40]| 25.71 | 24.85 | 24.46 | 21.09 | 19.52 | 19.31 | 20.47 |
   MSTDir[V=15,A=40]| 25.81 | 24.88 | 24.47 | 21.15 | 19.61 | 19.39 | 20.52 |
 MSTUndir[V=13,A=40]| 12.86 | 12.86 | 12.82 | 11.45 | 10.53 | 10.40 | 11.04 |
 MSTUndir[V=15,A=40]|214.87 |210.69 |210.63 |188.57 |172.70 |171.26 |182.65 |
         TIMETABLING|  9.63 | 10.58 | 10.74 | 10.62 | 11.78 |  9.98 | 10.80 |
--------------------+-------+-------+-------+-------+-------+-------+-------+

(The STRATCOMP1-ALL regression is probably due to third-party code -- a SAT
solver -- experiencing alignment issues due to dirty programming.)
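
One way to put a number on "on average" for the benchmark table is to normalize each row to its GCC 2.95.3 time and take the geometric mean per compiler. The following is only a minimal sketch under assumptions that are not from the message itself: the choice of geometric mean, the helper names (relative_to_baseline, geometric_mean, summarize), and the four rows copied from the table are all illustrative. The full table can be substituted for the subset.

```python
import math

# Column order as in the tables; the two -finline-limit runs are spelled out.
COMPILERS = ["2.95.3", "3.0", "3.0.3", "3.1",
             "3.1 -finline-limit=800", "3.1 -finline-limit=1200", "3.2"]

# Illustrative subset of rows copied from the benchmark table (assumption:
# any subset works; extend with the remaining rows as needed).
TIMES = {
    "2QBF1":             [19.08, 27.66, 22.96, 22.25, 19.90, 18.46, 20.97],
    "3SAT-1":            [32.92, 47.37, 37.65, 38.45, 35.10, 30.96, 36.81],
    "21-QUEENS":         [9.66, 13.73, 10.92, 10.41, 9.95, 9.07, 9.23],
    "MSTDir[V=13,A=40]": [25.71, 24.85, 24.46, 21.09, 19.52, 19.31, 20.47],
}


def relative_to_baseline(times, baseline_index=0):
    """Divide every time in a row by the baseline compiler's time."""
    base = times[baseline_index]
    return [t / base for t in times]


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def summarize(table):
    """Map each compiler to the geometric mean of its normalized times."""
    normalized_rows = [relative_to_baseline(row) for row in table.values()]
    columns = zip(*normalized_rows)  # transpose: one tuple per compiler
    return {name: geometric_mean(col) for name, col in zip(COMPILERS, columns)}


if __name__ == "__main__":
    for name, ratio in summarize(TIMES).items():
        # Ratios above 1.0 mean slower generated code than GCC 2.95.3.
        print(f"{name:<24} {ratio:.3f}")
```

A ratio above 1.0 means the generated code ran slower than the 2.95.3 build on the chosen rows. The same normalization applied to the single-file compile times quoted earlier (62.4s versus 46.9s) puts the 3.1-branch at roughly a third slower than GCC 3.0.3.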