From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 24039 invoked by alias); 27 Apr 2002 16:15:04 -0000
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-owner@gcc.gnu.org
Received: (qmail 24002 invoked from network); 27 Apr 2002 16:14:54 -0000
Received: from unknown (HELO etpmod.phys.tue.nl) (131.155.111.35)
  by sources.redhat.com with SMTP; 27 Apr 2002 16:14:54 -0000
Received: from gum01m.etpnet.phys.tue.nl (gum01m.etpnet.phys.tue.nl [192.168.84.65])
  by etpmod.phys.tue.nl (8.9.3/8.9.3/SuSE Linux 8.9.3-0.1) with ESMTP id SAA14056;
  Sat, 27 Apr 2002 18:14:53 +0200
Received: (from garloff@localhost)
  by gum01m.etpnet.phys.tue.nl (8.11.6/8.11.6/SuSE Linux 0.5) id g3RGErL28819;
  Sat, 27 Apr 2002 18:14:53 +0200
Date: Sat, 27 Apr 2002 09:49:00 -0000
From: Kurt Garloff
To: Gerald Pfeifer
Cc: gcc@gcc.gnu.org, Andreas Jaeger
Subject: Re: inliner in gcc-3.1
Message-ID: <20020427181453.A28227@gum01m.etpnet.phys.tue.nl>
References: <20020424132314.B27120@gum01m.etpnet.phys.tue.nl>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
  protocol="application/pgp-signature"; boundary="/04w6evG8XlLl3ft"
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.3.22.1i
X-Operating-System: Linux 2.4.16-schedJ2 i686
X-PGP-Info: on http://www.garloff.de/kurt/mykeys.pgp
X-PGP-Key: 1024D/1C98774E, 1024R/CEFC9215
Organization: TU/e(NL), SuSE(DE)
X-SW-Source: 2002-04/txt/msg01490.txt.bz2

--/04w6evG8XlLl3ft
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
Content-length: 4522

Hi Gerald, Andreas,

On Thu, Apr 25, 2002 at 06:36:10PM +0200, Gerald Pfeifer wrote:
> On Wed, 24 Apr 2002, Kurt Garloff wrote:
> > It would be nice if this patch
> > http://www.garloff.de/kurt/freesoft/gcc/gcc310-inline-func-acct-v1.diff
> > would be tested by more people and integrated into 3.1.
>
> This second patch (partially) fixes a very bad regression we've been
> having since GCC 3.0; build time and binary size seem to be fine, though
> we seem to degrade slightly for some of the other benchmarks.
>
> I'd really like to see what this does to SPEC -- Andreas, could you give
> it a try?

I created a new inline accounting patch, which should prevent -O3
(-finline-functions) from delivering worse performance than -O2 for code
that already has the most important functions marked inline.

As it turned out, it is not a good idea to limit the RTL inlining
(integrate.c) for functions selected by -finline-functions. For the tree
inliner the limit is very useful, as the tree inliner cuts off inlining
after some repeated inlining in order to limit compile-time resource
requirements. Maybe some more experiments are needed here.

The patch is at
http://www.garloff.de/kurt/freesoft/gcc/gcc310-inline-func-acct-v1.2.diff
and has been diffed against a 3.1-20020422 with my inline heuristics patch
v3.6 applied:
http://www.garloff.de/kurt/freesoft/gcc/g++310-rec-inline-heuristics-v3.6.diff

Here are my benchmark results.
(Tests performed on 2xpIII-1GHz, Linux-2.4.18, glibc-2.2.5; I left
max-inline-slope and min-inline-insns alone.)

          max  max                      libbench_double   libbench_cplx_double
g++       inl +inl       opt    build   run     binary    run     binary
               single
(times u+s in s)
3.1        600           -O2    27.52   16.00    82579    18.97    95909
+3.6       600           -O2    29.02   15.96    82431    18.90+   95780
+3.6+1.2   600           -O2    29.17   16.02    82431    18.87+   95780

3.1       2500           -O2    48.32   15.97    86017    18.96   111912
+3.6      2500+1250      -O2    48.12   15.98    86049    19.01   111944
+3.6+1.2  2500+1250      -O2    48.50   15.98    86049    18.99   111944

+3.6      2500+ 300      -O2    37.33   15.99    83395    18.88+  105127
+3.6+1.2  2500+ 300      -O2    37.41   15.94    83395    18.88+  105127

3.1        600           -O3    23.88   16.65-   82667    18.98    94805
+3.6       600           -O3    28.67   16.65-   84809    19.04    96097
+3.6+1.2   600           -O3    30.40   16.62-   99900    19.02   112262

3.1       2500           -O3   136.88   15.78+  137523    19.08   165986
+3.6      2500+1250      -O3   145.06   15.80+  139550    19.21-  168431
+3.6+1.2  2500+1250      -O3    64.15   15.82+   98138    18.92   128517

+3.6      2500+ 300      -O3    38.07   16.64-   85845    19.04   108405
+3.6+1.2  2500+ 300      -O3    37.46   16.70-   94113    19.00   117715

This chart does give some unexpected results.

It seems the cplx_double benchmark is almost unaffected by the patch and by
the increased inlining. All times are around 19.0. For -O3 with 2500+1250
(max-inline-insns + max-inline-insns-single) and the v3.6 patch, we are
clearly over the top. The v1.2 patch fixes that: compile time is reduced to
a reasonable number again and performance is good. Some results are around
18.9 (v3.6-600-O2, v3.6-2500-300-O2, v3.6+v1.2-2500-300-O2,
v3.6+v1.2-2500-1250-O3).

Looking at the double results, we have three groups: 15.8, 16.0, and 16.6.
The worst results are for a low inline limit (600) with -O3, independent of
the patches applied. With v3.6 (with or without v1.2), -O3, a small
single-function limit and a large overall one (2500-300), we get the bad
score as well.
The best results are for a lot of inlining (2500 resp. 2500-1250) and -O3.
Among those, build times and binary sizes are quite different: with both
patches applied, only half the compile time is needed and a 1.4 times
smaller binary is produced.

The binary sizes are quite surprising.
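
The "half the compile time" and "1.4 times smaller" figures can be checked against the table. The following is a minimal Python sketch, not part of the original measurements: the numbers are copied from the -O3 rows with limits 2500 (+1250) for libbench_double, while the dictionary layout and the helper name `ratio` are illustrative assumptions. The unit of the binary column is not stated in the table, so only ratios are computed.

```python
# Rows copied from the table: -O3, limits 2500(+1250), libbench_double.
# The dict layout and the helper below are illustrative, not from the post.
ROWS = {
    "3.1":      {"build": 136.88, "run": 15.78, "binary": 137523},
    "+3.6":     {"build": 145.06, "run": 15.80, "binary": 139550},
    "+3.6+1.2": {"build": 64.15,  "run": 15.82, "binary": 98138},
}


def ratio(baseline: str, column: str, patched: str = "+3.6+1.2") -> float:
    """Return baseline value divided by the value with both patches applied."""
    return ROWS[baseline][column] / ROWS[patched][column]


if __name__ == "__main__":
    for baseline in ("3.1", "+3.6"):
        print(
            f"{baseline:>5} vs +3.6+1.2: "
            f"build x{ratio(baseline, 'build'):.2f}, "
            f"binary x{ratio(baseline, 'binary'):.2f}, "
            f"run x{ratio(baseline, 'run'):.3f}"
        )
```

Against either baseline the build-time ratio comes out a little above 2 and the binary-size ratio at about 1.4, while the run times differ by only a fraction of a percent.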
As expected, the v1.2 patch does not do anything for -O2. For -O3 it does
limit the tree inlining. Funnily enough, for small max-inline-insns-single
values this leads to _larger_ binaries! Apparently the smaller chunks later
get inlined by the RTL inliner (integrate), leading to more inlining.
For larger single-function inlining limits, the effects of the v1.2 patch
are closer to what one would expect.

I'd be curious what other people get.

Regards,
--
Kurt Garloff                                             Eindhoven, NL
GPG key: See mail header, key servers         Linux kernel development
SuSE Linux AG, Nuernberg, DE                            SCSI, Security

--/04w6evG8XlLl3ft
Content-Type: application/pgp-signature
Content-Disposition: inline
Content-length: 232

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE8ys58xmLh6hyYd04RApsVAJ9hk3+CqDDeE5n/WEsgvJ96Ruv7BwCdFFQM
f2I75/fO8U6qh3kq10/U2Jg=
=+Bse
-----END PGP SIGNATURE-----

--/04w6evG8XlLl3ft--