From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-return-50838-listarch-gcc=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 24039 invoked by alias); 27 Apr 2002 16:15:04 -0000
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive: <http://gcc.gnu.org/ml/gcc/>
List-Post: <mailto:gcc@gcc.gnu.org>
List-Help: <http://gcc.gnu.org/ml/>
Sender: gcc-owner@gcc.gnu.org
Received: (qmail 24002 invoked from network); 27 Apr 2002 16:14:54 -0000
Received: from unknown (HELO etpmod.phys.tue.nl) (131.155.111.35)
  by sources.redhat.com with SMTP; 27 Apr 2002 16:14:54 -0000
Received: from gum01m.etpnet.phys.tue.nl (gum01m.etpnet.phys.tue.nl [192.168.84.65])
	by etpmod.phys.tue.nl (8.9.3/8.9.3/SuSE Linux 8.9.3-0.1) with ESMTP id SAA14056;
	Sat, 27 Apr 2002 18:14:53 +0200
Received: (from garloff@localhost)
	by gum01m.etpnet.phys.tue.nl (8.11.6/8.11.6/SuSE Linux 0.5) id g3RGErL28819;
	Sat, 27 Apr 2002 18:14:53 +0200
Date: Sat, 27 Apr 2002 09:49:00 -0000
From: Kurt Garloff <garloff@suse.de>
To: Gerald Pfeifer <pfeifer@dbai.tuwien.ac.at>
Cc: gcc@gcc.gnu.org, Andreas Jaeger <aj@suse.de>
Subject: Re: inliner in gcc-3.1
Message-ID: <20020427181453.A28227@gum01m.etpnet.phys.tue.nl>
References: <20020424132314.B27120@gum01m.etpnet.phys.tue.nl> <Pine.BSF.4.44.0204251822260.46238-100000@naos.dbai.tuwien.ac.at>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="/04w6evG8XlLl3ft"
Content-Disposition: inline
In-Reply-To: <Pine.BSF.4.44.0204251822260.46238-100000@naos.dbai.tuwien.ac.at>
User-Agent: Mutt/1.3.22.1i
X-Operating-System: Linux 2.4.16-schedJ2 i686
X-PGP-Info: on http://www.garloff.de/kurt/mykeys.pgp
X-PGP-Key: 1024D/1C98774E, 1024R/CEFC9215
Organization: TU/e(NL), SuSE(DE)
X-SW-Source: 2002-04/txt/msg01490.txt.bz2


--/04w6evG8XlLl3ft
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
Content-length: 4522

Hi Gerald, Andreas,

On Thu, Apr 25, 2002 at 06:36:10PM +0200, Gerald Pfeifer wrote:
> On Wed, 24 Apr 2002, Kurt Garloff wrote:
> > It would be nice if this patch
> > http://www.garloff.de/kurt/freesoft/gcc/gcc310-inline-func-acct-v1.diff
> > would be tested by more people and integrated into 3.1.
>
> This second patch (partially) fixes a very bad regression we've been
> having since GCC 3.0; build time and binary size seem to be fine, though
> we seem to degrade slightly for some of the other benchmarks.
>
> I'd really like to see what this does to SPEC -- Andreas, could you give
> it a try?

I created a new inline accounting patch, which should prevent -O3
(-finline-functions) from delivering worse performance than -O2 for code
that already has the most important functions marked inline.
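
To make the distinction concrete, here is a minimal sketch. The function
names are made up for illustration and are not taken from libbench. With
the GCC versions discussed here, a function declared inline is already an
inlining candidate at -O2, while a function without the keyword is only
considered once -finline-functions (implied by -O3) is in effect.

```cpp
#include <cstddef>

// Hypothetical example code, not part of libbench.

// Explicitly marked inline: an inlining candidate already at -O2.
inline double axpy_elem(double a, double x, double y)
{
    return a * x + y;
}

// Not marked inline: only considered for inlining when
// -finline-functions is active (which -O3 turns on).
static double scale_elem(double a, double x)
{
    return a * x;
}

// Hot loop calling both helpers; whether the calls disappear depends
// on the optimization level and the inline limits in effect.
double kernel(const double *x, const double *y, std::size_t n, double a)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        sum += axpy_elem(a, x[i], y[i]) + scale_elem(a, x[i]);
    return sum;
}
```

In code like this, where the hot helpers already carry the inline keyword,
the extra candidates picked up at -O3 are the ones the accounting patch is
meant to keep in check.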

As it turned out, limiting the RTL inliner (integrate.c) for functions
selected by -finline-functions does not work well. For the tree inliner
the limit is very useful, as the tree inliner cuts off inlining after some
repeated inlining in order to limit compile-time resource requirements.
Maybe some more experiments are needed here.

The patch is at
http://www.garloff.de/kurt/freesoft/gcc/gcc310-inline-func-acct-v1.2.diff
and has been diffed against a 3.1-20020422 with my inline heuristics patch
v3.6 applied.
http://www.garloff.de/kurt/freesoft/gcc/g++310-rec-inline-heuristics-v3.6.diff

Here are my benchmark results.
(Tests performed on 2xpIII-1GHz, Linux-2.4.18, glibc-2.2.5; I left
 max-inline-slope and min-inline-insns alone.)

        max-inline-insns       libbench_double libbench_cplx_double
 g++    [+ -single]    build     run   binary      run   binary
                      (times: user+sys, in seconds)
3.1       600 -O2      27.52    16.00   82579     18.97   95909
+3.6      600 -O2      29.02    15.96   82431     18.90+  95780
+3.6+1.2  600 -O2      29.17    16.02   82431     18.87+  95780

3.1      2500 -O2      48.32    15.97   86017     18.96  111912
+3.6     2500+1250-O2  48.12    15.98   86049     19.01  111944
+3.6+1.2 2500+1250-O2  48.50    15.98   86049     18.99  111944
+3.6     2500+ 300-O2  37.33    15.99   83395     18.88+ 105127
+3.6+1.2 2500+ 300-O2  37.41    15.94   83395     18.88+ 105127

3.1       600 -O3      23.88    16.65-  82667     18.98   94805
+3.6      600 -O3      28.67    16.65-  84809     19.04   96097
+3.6+1.2  600 -O3      30.40    16.62-  99900     19.02  112262

3.1      2500 -O3     136.88    15.78+ 137523     19.08  165986
+3.6     2500+1250-O3 145.06    15.80+ 139550     19.21- 168431
+3.6+1.2 2500+1250-O3  64.15    15.82+  98138     18.92  128517
+3.6     2500+ 300-O3  38.07    16.64-  85845     19.04  108405
+3.6+1.2 2500+ 300-O3  37.46    16.70-  94113     19.00  117715

This chart does give some unexpected results.

It seems the cplx_double benchmark is almost unaffected by the patch and by
the increased inlining. All times are around 19.0. For -O3 with 2500+1250
(max-inline-insns + max-inline-insns-single) and the v3.6 patch alone, we are
clearly over the top. The v1.2 patch fixes that: compile time is reduced to a
reasonable number again and performance is good. Some results are
around 18.9 (v3.6-600-O2, v3.6-2500-300-O2, v3.6+v1.2-2500-300-O2,
v3.6+v1.2-2500-1250-O3).

Looking at the double results, we have three groups: 15.8, 16.0, and 16.6.
The worst results are for a low inline limit (600) with -O3, independent of
the patches applied. With v3.6 (with or without v1.2), -O3, a small single-fn
limit and a large overall one (2500-300), we get the same bad 16.6 score.
The best results are for a lot of inlining (2500 resp. 2500-1250) and -O3.
Among those, build times and binary sizes are quite different: with both
patches applied, only half the compile time is needed and a 1.4 times
smaller binary is produced.
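
The "half the compile time" and "1.4 times smaller" figures can be checked
against the 2500+1250 -O3 rows of the table. The snippet below is only a
worked calculation; the helper and constant names are made up, and the
numbers are copied from the libbench_double columns.

```cpp
// Ratio of a baseline measurement to an improved one; values above 1
// mean the second configuration needs less (time or space).
double ratio(double baseline, double improved)
{
    return baseline / improved;
}

// Figures from the 2500+1250 -O3 rows (hypothetical constant names).
const double build_v36     = 145.06;   // build time with v3.6 only, s
const double build_v36_v12 = 64.15;    // build time with v3.6 + v1.2, s
const double size_v36      = 139550.0; // libbench_double binary, v3.6 only
const double size_v36_v12  = 98138.0;  // libbench_double binary, v3.6 + v1.2

// ratio(build_v36, build_v36_v12) is about 2.26
// ratio(size_v36, size_v36_v12)   is about 1.42
```

So the build is a bit more than twice as fast, and the binary shrinks by a
factor of roughly 1.4.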

The binary sizes are quite surprising. As expected, the v1.2 patch does not
do anything for -O2. For -O3 it does limit the tree inlining. Funnily enough,
for small max-inline-insns-single values this leads to _larger_ binaries!
Apparently the smaller chunks later get inlined by the RTL inliner
(integrate), leading to more inlining overall.
For larger single-fn inlining limits, the effects of the v1.2 patch are
closer to what can be expected.

I'd be curious what other people get.

Regards,
--
Kurt Garloff  <garloff@suse.de>                          Eindhoven, NL
GPG key: See mail header, key servers         Linux kernel development
SuSE Linux AG, Nuernberg, DE                            SCSI, Security

--/04w6evG8XlLl3ft
Content-Type: application/pgp-signature
Content-Disposition: inline
Content-length: 232

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE8ys58xmLh6hyYd04RApsVAJ9hk3+CqDDeE5n/WEsgvJ96Ruv7BwCdFFQM
f2I75/fO8U6qh3kq10/U2Jg=
=+Bse
-----END PGP SIGNATURE-----

--/04w6evG8XlLl3ft--