From: Martin Liška
Date: Sat, 27 Sep 2014 10:32:00 -0000
To: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH 3/5] IPA ICF pass
Message-ID: <542692D1.3030500@suse.cz>
In-Reply-To: <20140926232713.GC7334@kam.mff.cuni.cz>

On 09/27/2014 01:27 AM, Jan Hubicka wrote:
>> While a plain Firefox -flto build works fine.
>> LTO/PGO build fails with:
>>
>> lto1: internal compiler error: in ipa_merge_profiles, at ipa-utils.c:540
>> 0x7d6165 ipa_merge_profiles(cgraph_node*, cgraph_node*)
>>         ../../gcc/gcc/ipa-utils.c:540
>> 0xf10c41 ipa_icf::sem_function::merge(ipa_icf::sem_item*)
>>         ../../gcc/gcc/ipa-icf.c:753
>> 0xf15206 ipa_icf::sem_item_optimizer::merge_classes(unsigned int)
>>         ../../gcc/gcc/ipa-icf.c:2706
>> 0xf1c1f4 ipa_icf::sem_item_optimizer::execute()
>>         ../../gcc/gcc/ipa-icf.c:2098
>> 0xf1d3f1 ipa_icf_driver
>>         ../../gcc/gcc/ipa-icf.c:2784
>> 0xf1d3f1 ipa_icf::pass_ipa_icf::execute(function*)
>>         ../../gcc/gcc/ipa-icf.c:2831
>>
>>
>> The pass is also very memory hungry (from 3GB without ICF to 4GB during
>> libxul link), while the code size savings are in the 1% range.

Most of the problem comes from the groups of candidates that are built
according to a hash. The hash value is based on the number of arguments,
the number of BBs, the number of gimple statements and the types of these
statements; it groups functions into classes. In WPA (before the body of
any function is loaded) I get the following histogram:

Dump after WPA based types groups
Congruence classes: 97204 (unique hash values: 88725), with total: 191457 items
Class size histogram [num of members]: number of classe number of classess
[1]: 86453 classes
[2]: 5680 classes
[3]: 1541 classes
[4]: 915 classes
[5]: 446 classes
[6]: 346 classes
[7]: 200 classes
[8]: 181 classes
[9]: 154 classes
[10]: 109 classes
[11]: 87 classes
[12]: 87 classes
[13]: 68 classes
[14]: 58 classes
[15]: 58 classes
[16]: 41 classes
[17]: 25 classes
[18]: 33 classes
[19]: 28 classes
[20]: 25 classes
[21]: 19 classes
[22]: 30 classes
[23]: 24 classes
[24]: 33 classes
[25]: 17 classes
[26]: 15 classes
[27]: 10 classes
[28]: 13 classes
[29]: 18 classes
[30]: 10 classes

It means that each class with more than one member needs to be iterated
and its functions compared with each other. And yes, that is the root of
the problem.
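
To illustrate the grouping described above, here is a minimal Python
sketch. It is not the ipa-icf.c code: the function summaries, `cheap_key`
and the other names are hypothetical and the sample data is invented. It
only shows how a cheap key puts functions into classes, how the class-size
histogram is derived, and how many items sit in classes with more than one
member. The last lines re-check the arithmetic of the figures in this
mail, assuming that the "Init called" count equals the total number of
items minus the singleton classes (the numbers agree with that reading).

```python
from collections import Counter, defaultdict

# Hypothetical function summaries: (name, n_args, n_basic_blocks, statement kinds).
# Invented for illustration; not taken from the Firefox build.
FUNCTIONS = [
    ("get_x", 1, 1, ("assign", "return")),
    ("get_y", 1, 1, ("assign", "return")),
    ("get_z", 1, 1, ("assign", "return")),
    ("add", 2, 1, ("assign", "return")),
    ("sub", 2, 1, ("assign", "return")),
    ("loop", 1, 3, ("assign", "cond", "assign", "return")),
]


def cheap_key(n_args, n_bbs, kinds):
    """Grouping key built from summary properties only (no deep comparison)."""
    return (n_args, n_bbs, len(kinds), kinds)


def group_into_classes(functions):
    """Put every function with the same cheap key into one candidate class."""
    classes = defaultdict(list)
    for name, n_args, n_bbs, kinds in functions:
        classes[cheap_key(n_args, n_bbs, kinds)].append(name)
    return list(classes.values())


def size_histogram(classes):
    """Map class size -> number of classes of that size."""
    return Counter(len(members) for members in classes)


def items_needing_bodies(classes):
    """Items in classes with more than one member need a deep comparison."""
    return sum(len(members) for members in classes if len(members) > 1)


classes = group_into_classes(FUNCTIONS)
histogram = size_histogram(classes)
needing = items_needing_bodies(classes)

# Re-check of the figures quoted in this mail.
total_items = 191457
singleton_classes = 86453
multi_member_items = total_items - singleton_classes
share_percent = round(100 * multi_member_items / total_items, 2)
```

Two functions that share a key are only candidates: `add` and `sub` land
in the same class here although they are not equal, which is why the deep
comparison (and therefore the function body) is still needed.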
I have to load the function body to perform the deep function comparison.
As you can see, we have almost 200k functions, more than half of which
sit in a group with more than one member. So the 1GB of extra memory
usage is caused by these bodies:

Init called for 105004 items (54.84%).

The memory footprint could be reduced significantly if one could load a
body, compare it and then release it so that the memory is really freed.
I asked Honza about it, but it looks like the GGC mechanism cannot easily
be forced to release it.

>
> Thnks for checking. I was just thinking about doing that myself. Would
> you mind posting -ftime-report of firefox WPA stage?
>
> It seems that in this case we reject too many of equality candidates?
> It think the original numbers was about 4-5% but later some equivalences was
> disabled because of devirt/aliasing issues. Do you compare it with gold ICF
> enabled? There are quite few obvious improvements to the analysis that can
> be done, but I guess we need to analyze the interesting cases one by one.

You are right, the numbers were quite promising, but over time I had to
reduce the aggressiveness of the pass. As Honza said, it can be improved
step by step.

>
> One thing that Martin can try is to hook into lto-symtab and try to check
> that the COMDAT functions that are known to be same pass the equality check.
> I suppose we will learn interesting things this way.

Good point, I will try it.

Martin

> I think the patch adds quite important infrastructure for gimple semantic
> equality checking and function merging. I went through the majority of code and
> I think it is mostly ready to mainline (i.e. cleaner than what we have in
> tree-ssa-tailmerge) so hope we can finish the review process next week.
> We will need to get better cost/benefits ratio to enable it for -O2 that is
> someting I would really like to see for 5.0, but it seems to be easier to
> handle this incrementally....

Thank you for the review,
Martin

>
> Honza
>