From: Martin Jambor
To: Jan Hubicka, gary@amperecomputing.com, mliska@suse.cz, jakub@redhat.com, gcc-patches@gcc.gnu.org
Subject: Re: Materialize clones on demand
In-Reply-To: <20201022094820.GB97578@kam.mff.cuni.cz>
Date: Fri, 23 Oct 2020 13:21:12 +0200

Hi,

On Thu, Oct 22 2020, Jan Hubicka wrote:
> Hi,
> this patch removes the pass to materialize all clones and instead this
> is now done on demand.  The motivation is to reduce lifetime of function
> bodies in ltrans that should noticeably reduce memory use for highly
> parallel compilations of large programs (like Martin does) or with
> partitioning reduced/disabled.
> For cc1 with one partition the memory use
> seems to go down from 4GB to circa 1.5GB (seeing from top, so this is not
> particularly accurate).
>

Nice.

> This should also make get_body to do the right thing at WPA time (still
> not good idea for production patch).  I did not test this path.
>
> Martin (Jambor), Jakub, there is one FIXME in ipa-param-manipulation.
> We seem to ICE when we redirect to a call before callee is materialized
> (this should be possible to trigger on mainline with recursive
> callgraphs too, but it definitely triggers on several testcases in c
> testsuite if the get_untransformed_body is disabled).  It would be nice
> to fix this, but I am not quite sure how the debug info adjustments here
> works.

Well, the debug mappings are all based on PARM_DECLs.  Unfortunately, I
cannot think of any quick fix now, though we might want to sit down and
try to revise the mechanism also because of debug info issues described
in PR 95343 and PR 93385.  I'll keep this in mind and in my notes.

I have one question regarding the patch itself:

> Bootstrapped/regtested x86_64-linux and also lto-bootstrapped with
> release checking.  I plan to commit it after bit more testing.
>
> Honza
>
> gcc/ChangeLog:
>
> 2020-10-22  Jan Hubicka
>
> 	* cgraph.c (cgraph_node::get_untransformed_body): Perform lazy
> 	clone materialization.
> 	* cgraph.h (cgraph_node::materialize_clone): Declare.
> 	(symbol_table::materialize_all_clones): Remove.
> 	* cgraphclones.c (cgraph_materialize_clone): Turn to ...
> 	(cgraph_node::materialize_clone): .. this one; move here
> 	dumping from symbol_table::materialize_all_clones.
> 	(symbol_table::materialize_all_clones): Remove.
> 	* cgraphunit.c (mark_functions_to_output): Clear stmt references.
> 	(cgraph_node::expand): Initialize bitmaps early;
> 	do not call execute_all_ipa_transforms if there are no transforms.
> 	* ipa-inline-transform.c (save_inline_function_body): Fix formatting.
> 	(inline_transform): Materialize all clones before function is modified.
> 	* ipa-param-manipulation.c (ipa_param_adjustments::modify_call):
> 	Materialize clone if needed.
> 	* ipa.c (class pass_materialize_all_clones): Remove.
> 	(make_pass_materialize_all_clones): Remove.
> 	* passes.c (execute_all_ipa_transforms): Materialize all clones.
> 	* passes.def: Remove pass_materialize_all_clones.
> 	* tree-pass.h (make_pass_materialize_all_clones): Remove.
>

[...]

> diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
> index 05713c28cf0..1e2262789dd 100644
> --- a/gcc/cgraphunit.c
> +++ b/gcc/cgraphunit.c
> @@ -2298,7 +2299,8 @@ cgraph_node::expand (void)
>    bitmap_obstack_initialize (&reg_obstack); /* FIXME, only at RTL generation*/
>
>    update_ssa (TODO_update_ssa_only_virtuals);
> -  execute_all_ipa_transforms (false);
> +  if (ipa_transforms_to_apply.exists ())
> +    execute_all_ipa_transforms (false);
>

Can some function not have ipa_inline among the transforms_to_apply?

Martin
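
For readers following the thread, the on-demand scheme discussed above can be
sketched roughly as below.  This is a standalone toy model, not GCC code: the
Node type, its members, and the use of empty() as the guard are all
assumptions made for illustration, and only the method names
get_untransformed_body and expand are borrowed from the quoted ChangeLog.
The sketch shows a clone whose body is built the first time it is requested,
and an expand step that skips the transform loop when nothing is queued.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Toy model only.  Node and its members are hypothetical and much simpler
// than GCC's cgraph_node; only the control flow is meant to be illustrative.
struct Node {
  std::string name;
  Node *clone_of = nullptr;             // origin this clone copies its body from
  std::unique_ptr<std::string> body;    // stays null until first requested
  std::vector<std::string> transforms;  // pending per-function transforms
  int materializations = 0;             // how many times a body was built

  // Return the body, materializing this clone (and its origin) on first use.
  const std::string &get_untransformed_body() {
    if (!body) {
      assert(clone_of != nullptr);  // a non-clone must already own a body
      body = std::make_unique<std::string>(clone_of->get_untransformed_body());
      ++materializations;
    }
    return *body;
  }

  // Expand the function: fetch the body, then run transforms only if any
  // are queued (empty() stands in for the guard in the quoted hunk).
  void expand() {
    get_untransformed_body();
    if (!transforms.empty()) {
      for (const std::string &t : transforms)
        *body += " [" + t + "]";
      transforms.clear();
    }
  }
};
```

In this model a clone that is never expanded never gets a body, which is the
memory effect the patch description is after; whether every function that
reaches expand has at least one queued transform is exactly the open question
above, and the sketch does not answer it.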