Date: Mon, 26 Oct 2020 11:35:19 +0100
From: Jan Hubicka
To: Richard Biener
Cc: jakub@redhat.com, gary@amperecomputing.com, gcc-patches@gcc.gnu.org
Subject: Re: Materialize clones on demand
Message-ID: <20201026103519.GA12163@kam.mff.cuni.cz>
References: <20201022094820.GB97578@kam.mff.cuni.cz> <20201023192748.GB33077@kam.mff.cuni.cz> <20201026094848.GB89299@kam.mff.cuni.cz>

> > >
> > > > cselib.c:3137 (cselib_init)                       34M: 25.9%   34M  1514k: 17.3%  heap
> > > > tree-scalar-evolution.c:2984 (scev_initialize)    37M: 27.6%   50M   228k:  2.6%  ggc
> > >
> > > Hmm, so we do
> > >
> > >   scalar_evolution_info = hash_table::create_ggc (100);
> > >
> > > and
> > >
> > >   scalar_evolution_info->empty ();
> > >   scalar_evolution_info = NULL;
> > >
> > > to reclaim. ->empty () will IIRC at least allocate 7 elements which we
> > > then eventually should reclaim during a GC walk - I guess the hashtable
> > > statistics do not really handle GC reclaimed portions?
> > >
> > > If there's a friendlier way of releasing a GC allocated hash-tab
> > > we can switch to that. Note that in principle the hash-table doesn't
> > > need to be GC allocated but it needs to be walked since it refers to
> > > trees that might not be referenced in other ways.
> >
> > hashtable has a destructor that does ggc_free, so I think ggc_delete is
> > the right way to free.
>
> Can you try if that helps? As said, in the end it's probably
> miscountings in the stats.

I do not think we are miscounting here. empty () really allocates a small
hashtable and leaves it alone. It should be ggc_delete. I will test it.

> > > >
> > > > and hashmaps:
> > > > ipa-reference.c:1133 (ipa_reference_read_optimiz  2047k:  3.0%  3071k     9 :  0.0%  heap
> > > > tree-ssa.c:60 (redirect_edge_var_map_add)         4125k:  6.1%  4126k  8190 :  0.1%  heap
> > >
> > > Similar to SCEV, probably mis-accounting?
> > >
> > > > alias.c:1200 (record_alias_subset)                4510k:  6.6%  4510k  4546 :  0.0%  ggc
> > > > ipa-prop.h:986 (ipcp_transformation_t)            8191k: 12.0%    11M    16 :  0.0%  ggc
> > > > dwarf2out.c:5957 (dwarf2out_register_external_di    47M: 72.2%    71M    12 :  0.0%  ggc
> > > >
> > > > and hashsets:
> > > > ipa-devirt.c:3093 (possible_polymorphic_call_tar    15k:  0.9%    23k     8 :  0.0%  heap
> > > > ipa-devirt.c:1599 (add_type_duplicate)             412k: 22.2%   412k  4065 :  0.0%  heap
> > > > tree-ssa-threadbackward.c:40 (thread_jumps)       1432k: 77.0%  1433k   119k:  0.8%  heap
> > > >
> > > > and vectors:
> > > > tree-ssa-structalias.c:5783 (push_fields_onto_fi   8   847k: 0.3%   976k   475621: 0.8%   17k   24k
> > >
> > > Huh. It's an auto_vec<>
> >
> > Hmm, those maybe get miscounted, I will check.
> > >
> > > > tree-ssa-pre.c:334 (alloc_expression_id)          48  1125k:  0.4%  1187k   198336:  0.3%    23k    34k
> > > > tree-into-ssa.c:1787 (register_new_update_single   8  1196k:  0.5%  1264k   380385:  0.6%    24k    36k
> > > > ggc-page.c:1264 (add_finalizer)                    8  1232k:  0.5%  1848k       43:  0.0%    77k    81k
> > > > tree-ssa-structalias.c:1609 (topo_visit)           8  1302k:  0.5%  1328k   892964:  1.4%    27k    33k
> > > > graphds.c:254 (graphds_dfs)                        4  1469k:  0.6%  1675k  2101780:  3.4%    30k    34k
> > > > dominance.c:955 (get_dominated_to_depth)           8  2251k:  0.9%  2266k   685140:  1.1%    46k    50k
> > > > tree-ssa-structalias.c:410 (new_var_info)         32  2264k:  0.9%  2341k   330758:  0.5%    47k    63k
> > > > tree-ssa-structalias.c:3104 (process_constraint)  48  2376k:  0.9%  2606k   405451:  0.7%    49k    83k
> > > > symtab.c:612 (create_reference)                    8  3314k:  1.3%  4897k    75213:  0.1%   414k   612k
> > > > vec.h:1734 (copy)                                 48   233M: 90.5%   234M  6243163: 10.1%  4982k  5003k
> >
> > Also I should annotate copy.
>
> Yeah, some missing annotations might cause issues.

It will only let us see who copies the vectors ;)
auto_vecs I think are special since we may manage to miscount the
pre-allocated space. I will look into that.

> > >
> > > Well, we're building a DIE tree for the whole unit here so I'm not sure
> > > what parts we can optimize. The structures may keep quite some stuff
> > > on the tree side live through the decl -> DIE and block -> DIE maps
> > > and the external_die_map used for LTO streaming (but if we lazily stream
> > > bodies we do need to keep this map ... unless we add some
> > > start/end-stream-body hooks and doing the map per function. But then
> > > we build the DIEs lazily as well so the query of the map is lazy :/)
> >
> > Yep, not sure how much we could do here. Of course ggc_collect when
> > invoked will do quite a lot of walking to discover relatively few tree
> > references, but not sure if that can be solved by custom marking or so.
>
> In principle the late DIE creation code can remove entries from the
> external_die_map map, but not sure how much that helps (might also
> cause re-allocation of it if we shrink it). It might help quite a bit
> for references to BLOCKs. Maybe you can try the following simple
> patch ...
>
> diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
> index ba93a6c3d81..350cc5d443c 100644
> --- a/gcc/dwarf2out.c
> +++ b/gcc/dwarf2out.c
> @@ -5974,6 +5974,7 @@ maybe_create_die_with_external_ref (tree decl)
>
>    const char *sym = desc->sym;
>    unsigned HOST_WIDE_INT off = desc->off;
> +  external_die_map->remove (decl);
>
>    in_lto_p = false;
>    dw_die_ref die = (TREE_CODE (decl) == BLOCK

I will give it a try. Thanks!
I think shrinking hashtables is not much of a concern here: it happens
lazily, either at ggc_collect (which is desirable) or when the hashtable
is walked (where the cost is amortized by the walk).

Honza
>
> > > >
> > > > Honza
> > >
> > > Richard.
> > >
> > > --
> > > Richard Biener
> > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > Germany; GF: Felix Imend
> >
>
> --
> Richard Biener
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> Germany; GF: Felix Imend
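
For illustration, the one-line patch quoted above follows a
consume-on-lookup pattern: once the (sym, off) pair has been copied out
of the map, the entry is removed so the map no longer keeps the key
reachable. Below is a minimal standalone sketch of that idea. It uses
std::unordered_map with an integer key rather than GCC's GC-aware
hash_map keyed by tree, and ExternalRef and take_external_ref are
hypothetical names made up for this example, not GCC API.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the (sym, off) pair stored per decl.
struct ExternalRef {
  std::string sym;
  std::uint64_t off = 0;
};

// Look up the entry for decl_id, copy it into *out and erase it from the
// map in the same step.  Returns false (and leaves *out untouched) when
// there is no entry.  After a successful call the map no longer
// references the key, which is the effect the quoted patch aims for.
bool take_external_ref(std::unordered_map<int, ExternalRef> &map,
                       int decl_id, ExternalRef *out) {
  auto it = map.find(decl_id);
  if (it == map.end())
    return false;
  *out = std::move(it->second);  // move the payload out before erasing
  map.erase(it);
  return true;
}
```

A second call with the same key returns false, so this only fits when
each entry is needed exactly once, which is the assumption behind
removing it at DIE creation time.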