From: Qing Zhao
Message-Id: <162AA6E1-FC30-4E72-B16E-E639E7D3B8BD@oracle.com>
Subject: Re: Remove overall growth from badness metrics
Date: Wed, 09 Jan 2019 18:04:00 -0000
In-Reply-To: <20190109143124.zaxyj2xdsfvvgh5o@kam.mff.cuni.cz>
To: Jan Hubicka
Cc: gcc-patches@gcc.gnu.org

>>>
>>> Those are sizes of libxul, which is the largest library of Firefox.
>>> PGO is profile guided optimization.
>>
>> Okay.  I see.
>>
>> It looks like for LTO, the code size increase with profiling is much smaller
>> than that without profiling when growth is increased from 20% to 40%.
>
> With LTO the growth is about 9%, while for non-LTO it is about 4% and
> with PGO it is about 3%.  This is expected.
>
> For non-LTO most translation units do not hit the limit because most
> calls are external.  Firefox is a bit special here by using the #include
> based unified build that gets it closer to LTO, but not quite.
>
> With LTO there is only one translation unit that hits the 20% code size
> growth, which after optimization translates to that 9%.
>
> With profile feedback, code is partitioned into cold and hot sections
> where only the hot section grows by the given percentage.  For Firefox
> about 15% of the binary is trained and the rest is cold.
>>
>> For non-LTO, the code size increase is minimal when growth is increased
>> from 20% to 40%.
>>
>> However, I do not quite understand the last column. Could you please
>> explain the last column (-finline-functions) a little bit?
>
> It is a non-LTO build but with additional -finline-functions.
>
> GCC build machinery uses -O2 by default and -O3 for some files.  Adding
> -finline-functions enables aggressive inlining everywhere.
> But double checking the numbers, I must have cut&pasted wrong data here.
> For growth 20 -finline-functions non-LTO non-PGO I get 107272791 (so the
> table is wrong) and increasing growth to 40 gets me 115311719 (which is
> correct in the table).
>>
>>>>>
>>>>> growth        LTO+PGO   PGO       LTO        none       -finline-functions
>>>>> 20 (default)  83752215  94390023  93085455   103437191  94351191
>>>>> 40            85299111  97220935  101600151  108910311  115311719
>>>>> clang 111520431 114863807 108437807
>
> It should be:
> growth        LTO+PGO   PGO       LTO        none       -finline-functions
> 20 (default)  83752215  94390023  93085455   103437191  107272791
> 40            85299111  97220935  101600151  108910311  115311719
> clang 111520431 114863807 108437807
>
> So 7.5% growth.

Okay, I see.

>>>
>>> Yes, I have also reworked the inline metrics somewhat and spent quite
>>> some time looking into dumps to see that it behaves reasonably.  There
>>> were two ages-old bugs I fixed in the last two weeks, and I also added
>>> some extra tricks like penalizing cross-module inlines some time ago.
>>> Given the fact that even with profile feedback I am not able to sort the
>>> priority queue well and neither can Clang do the job, I think it is good
>>> motivation to adjust the parameter which I have set somewhat arbitrarily
>>> at a time I was not able to test it well.
>>
>> Where is the code for your current heuristic for sorting the inlinable
>> candidates?
>
> It is in ipa-inline.c:edge_badness
> If you use -fdump-ipa-inline-details you can search for "Considering" in
> the dump file to find the record about every inline decision.  It dumps
> the badness value and also the individual values used to compute it.

Thanks, I will take a look at it.

Qing
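
For anyone who wants to re-check the percentages discussed above, here is a minimal sketch in Python that recomputes the relative size increase between the growth 20 and growth 40 rows of the corrected table. The names SIZES and growth_percent are made up for this illustration; only the byte counts are taken from the thread.

```python
# Code sizes in bytes for growth 20 and growth 40, copied from the
# corrected table in this thread. The dictionary layout is illustrative.
SIZES = {
    "LTO+PGO": (83752215, 85299111),
    "PGO": (94390023, 97220935),
    "LTO": (93085455, 101600151),
    "none": (103437191, 108910311),
    "-finline-functions": (107272791, 115311719),
}


def growth_percent(before: int, after: int) -> float:
    """Return the relative size increase from `before` to `after` in percent."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    # Print one line per build configuration.
    for config, (at_20, at_40) in SIZES.items():
        print(f"{config:20s} {growth_percent(at_20, at_40):5.2f}%")
```

For the -finline-functions column this gives roughly 7.5%, which agrees with the figure quoted above.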