Date: Mon, 26 Oct 2020 10:48:48 +0100
From: Jan Hubicka
To: Richard Biener
Cc: Martin Jambor, gary@amperecomputing.com, mliska@suse.cz, jakub@redhat.com, gcc-patches@gcc.gnu.org
Subject: Re: Materialize clones on demand
Message-ID: <20201026094848.GB89299@kam.mff.cuni.cz>
References: <20201022094820.GB97578@kam.mff.cuni.cz> <20201023192748.GB33077@kam.mff.cuni.cz>

> > We seem to leak some hashtables:
> > dwarf2out.c:28850 (dwarf2out_init) 31M: 23.8% 47M 19 : 0.0% ggc
>
> that one likely keeps quite some memory live...

Yep, having the in-memory dwarf2out data for the whole of cc1plus eats a lot
of memory, quite naturally.
>
> > cselib.c:3137 (cselib_init) 34M: 25.9% 34M 1514k: 17.3% heap
> > tree-scalar-evolution.c:2984 (scev_initialize) 37M: 27.6% 50M 228k: 2.6% ggc
>
> Hmm, so we do
>
>   scalar_evolution_info = hash_table::create_ggc (100);
>
> and
>
>   scalar_evolution_info->empty ();
>   scalar_evolution_info = NULL;
>
> to reclaim.  ->empty () will IIRC at least allocate 7 elements which we
> then eventually should reclaim during a GC walk - I guess the hashtable
> statistics do not really handle GC reclaimed portions?
>
> If there's a friendlier way of releasing a GC allocated hash-tab
> we can switch to that.  Note that in principle the hash-table doesn't
> need to be GC allocated but it needs to be walked since it refers to
> trees that might not be referenced in other ways.

The hashtable has a destructor that does ggc_free, so I think ggc_delete is
the right way to free it.

>
> > and hashmaps:
> > ipa-reference.c:1133 (ipa_reference_read_optimiz 2047k: 3.0% 3071k 9 : 0.0% heap
> > tree-ssa.c:60 (redirect_edge_var_map_add) 4125k: 6.1% 4126k 8190 : 0.1% heap
>
> Similar as SCEV, probably mis-accounting?
>
> > alias.c:1200 (record_alias_subset) 4510k: 6.6% 4510k 4546 : 0.0% ggc
> > ipa-prop.h:986 (ipcp_transformation_t) 8191k: 12.0% 11M 16 : 0.0% ggc
> > dwarf2out.c:5957 (dwarf2out_register_external_di 47M: 72.2% 71M 12 : 0.0% ggc
> >
> > and hashsets:
> > ipa-devirt.c:3093 (possible_polymorphic_call_tar 15k: 0.9% 23k 8 : 0.0% heap
> > ipa-devirt.c:1599 (add_type_duplicate) 412k: 22.2% 412k 4065 : 0.0% heap
> > tree-ssa-threadbackward.c:40 (thread_jumps) 1432k: 77.0% 1433k 119k: 0.8% heap
> >
> > and vectors:
> > tree-ssa-structalias.c:5783 (push_fields_onto_fi 8 847k: 0.3% 976k 475621: 0.8% 17k 24k
>
> Huh.  It's an auto_vec<>

Hmm, maybe those get miscounted; I will check.
>
> > tree-ssa-pre.c:334 (alloc_expression_id) 48 1125k: 0.4% 1187k 198336: 0.3% 23k 34k
> > tree-into-ssa.c:1787 (register_new_update_single 8 1196k: 0.5% 1264k 380385: 0.6% 24k 36k
> > ggc-page.c:1264 (add_finalizer) 8 1232k: 0.5% 1848k 43: 0.0% 77k 81k
> > tree-ssa-structalias.c:1609 (topo_visit) 8 1302k: 0.5% 1328k 892964: 1.4% 27k 33k
> > graphds.c:254 (graphds_dfs) 4 1469k: 0.6% 1675k 2101780: 3.4% 30k 34k
> > dominance.c:955 (get_dominated_to_depth) 8 2251k: 0.9% 2266k 685140: 1.1% 46k 50k
> > tree-ssa-structalias.c:410 (new_var_info) 32 2264k: 0.9% 2341k 330758: 0.5% 47k 63k
> > tree-ssa-structalias.c:3104 (process_constraint) 48 2376k: 0.9% 2606k 405451: 0.7% 49k 83k
> > symtab.c:612 (create_reference) 8 3314k: 1.3% 4897k 75213: 0.1% 414k 612k
> > vec.h:1734 (copy) 48 233M:90.5% 234M 6243163:10.1% 4982k 5003k

Also I should annotate copy.

>
> Those all look OK to me, not sure why we even think there's a leak?

I think we do not need to hold references anymore (perhaps for aliases - I
will check).  Also, all function bodies should be freed by now.
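
To make the mis-accounting guess from the SCEV exchange above concrete, here is a toy C++ model. It is not GCC's ggc allocator or its memory statistics code: ToyHeap, ToyStats, sweep and explicit_free are hypothetical names invented for this sketch, and the only assumption it encodes is that the statistics hear about explicit frees but not about memory the collector reclaims on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical counters: they only learn about explicit calls.
struct ToyStats {
  std::size_t allocated = 0;  // bytes handed out
  std::size_t freed = 0;      // bytes returned through explicit_free ()
  std::size_t reported_leak() const { return allocated - freed; }
};

// Hypothetical heap with two ways of getting memory back.
class ToyHeap {
public:
  // Allocate a block and record it in the statistics.
  std::size_t alloc(std::size_t bytes) {
    blocks_.push_back(Block{bytes, true});
    stats_.allocated += bytes;
    return blocks_.size() - 1;
  }

  // Explicit release: the statistics see it.
  void explicit_free(std::size_t handle) {
    assert(handle < blocks_.size());
    Block &b = blocks_[handle];
    if (b.live) {
      b.live = false;
      stats_.freed += b.bytes;
    }
  }

  // Collector-style reclaim of an unreferenced block: the memory is
  // gone, but nothing is reported to the statistics.
  void sweep(std::size_t handle) {
    assert(handle < blocks_.size());
    blocks_[handle].live = false;
  }

  // Bytes that are really still in use.
  std::size_t live_bytes() const {
    std::size_t total = 0;
    for (const Block &b : blocks_)
      if (b.live)
        total += b.bytes;
    return total;
  }

  const ToyStats &stats() const { return stats_; }

private:
  struct Block {
    std::size_t bytes;
    bool live;
  };
  std::vector<Block> blocks_;
  ToyStats stats_;
};

// Drop the table and let the collector find it later.
std::size_t leak_when_swept() {
  ToyHeap heap;
  std::size_t table = heap.alloc(800);
  heap.sweep(table);
  return heap.stats().reported_leak();
}

// Release the table explicitly.
std::size_t leak_when_freed() {
  ToyHeap heap;
  std::size_t table = heap.alloc(800);
  heap.explicit_free(table);
  return heap.stats().reported_leak();
}
```

In this model leak_when_swept () reports 800 bytes although nothing is live any more, while leak_when_freed () reports 0. If the guess about the statistics is right, that is the difference one would expect between just dropping the pointer and an explicit ggc_delete.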
>
> > However main problem is
> > cfg.c:202 (connect_src) 5745k: 0.2% 271M: 1.9% 1754k: 0.0% 1132k: 0.2% 7026k
> > cfg.c:212 (connect_dest) 6307k: 0.2% 281M: 2.0% 10129k: 0.2% 2490k: 0.5% 7172k
> > varasm.c:3359 (build_constant_desc) 7387k: 0.2% 0 : 0.0% 0 : 0.0% 0 : 0.0% 51k
> > emit-rtl.c:486 (gen_raw_REG) 7799k: 0.2% 215M: 1.5% 96 : 0.0% 0 : 0.0% 9502k
> > dwarf2cfi.c:2341 (add_cfis_to_fde) 8027k: 0.2% 0 : 0.0% 4906k: 0.1% 1405k: 0.3% 78k
> > emit-rtl.c:4074 (make_jump_insn_raw) 8239k: 0.2% 93M: 0.7% 0 : 0.0% 0 : 0.0% 1442k
> > tree-ssanames.c:308 (make_ssa_name_fn) 9130k: 0.2% 456M: 3.3% 0 : 0.0% 0 : 0.0% 6622k
> > gimple.c:1808 (gimple_copy) 9508k: 0.3% 524M: 3.7% 8609k: 0.2% 2972k: 0.6% 7135k
> > tree-inline.c:4879 (expand_call_inline) 9590k: 0.3% 21M: 0.2% 0 : 0.0% 0 : 0.0% 328k
> > dwarf2cfi.c:418 (new_cfi) 10M: 0.3% 0 : 0.0% 0 : 0.0% 0 : 0.0% 444k
> > cfg.c:266 (unchecked_make_edge) 10M: 0.3% 60M: 0.4% 355M: 6.8% 0 : 0.0% 9083k

I think it is a bug to have a function body at the end of compilation - I will
try to work out the reason for that.
> > tree.c:1642 (wide_int_to_tree_1) 10M: 0.3% 2313k: 0.0% 0 : 0.0% 0 : 0.0% 548k
> > stringpool.c:41 (stringpool_ggc_alloc) 10M: 0.3% 7055k: 0.0% 0 : 0.0% 2270k: 0.5% 588k
> > stringpool.c:63 (alloc_node) 10M: 0.3% 12M: 0.1% 0 : 0.0% 0 : 0.0% 588k
> > tree-phinodes.c:119 (allocate_phi_node) 11M: 0.3% 153M: 1.1% 0 : 0.0% 3539k: 0.7% 340k
> > cgraph.c:289 (create_empty) 12M: 0.3% 0 : 0.0% 109M: 2.1% 0 : 0.0% 371k
> > cfg.c:127 (alloc_block) 14M: 0.4% 705M: 5.0% 0 : 0.0% 0 : 0.0% 7086k
> > tree-streamer-in.c:558 (streamer_read_tree_bitfi 22M: 0.6% 13k: 0.0% 0 : 0.0% 22k: 0.0% 64k
> > tree-inline.c:834 (remap_block) 28M: 0.8% 159M: 1.1% 0 : 0.0% 0 : 0.0% 2009k
> > stringpool.c:79 (ggc_alloc_string) 28M: 0.8% 5619k: 0.0% 0 : 0.0% 6658k: 1.4% 1785k
> > dwarf2out.c:11727 (add_ranges_num) 32M: 0.9% 0 : 0.0% 32M: 0.6% 144 : 0.0% 20
> > tree-inline.c:5942 (copy_decl_to_var) 39M: 1.1% 51M: 0.4% 0 : 0.0% 0 : 0.0% 646k
> > tree-inline.c:5994 (copy_decl_no_change) 78M: 2.1% 270M: 1.9% 0 : 0.0% 0 : 0.0% 2497k
> > function.c:4438 (reorder_blocks_1) 96M: 2.6% 101M: 0.7% 0 : 0.0% 0 : 0.0% 2109k
> > hash-table.h:802 (expand) 142M: 3.9% 18M: 0.1% 198M: 3.8% 32M: 6.9% 38k
> > dwarf2out.c:10086 (new_loc_list) 219M: 6.0% 11M: 0.1% 0 : 0.0% 0 : 0.0% 2955k
> > tree-streamer-in.c:637 (streamer_alloc_tree) 379M: 10.3% 426M: 3.0% 0 : 0.0% 4201k: 0.9% 9828k
> > dwarf2out.c:5702 (new_die_raw) 434M: 11.8% 0 : 0.0% 0 : 0.0% 0 : 0.0% 5556k
> > dwarf2out.c:1383 (new_loc_descr) 519M: 14.1% 12M: 0.1% 2880 : 0.0% 0 : 0.0% 6812k
> > dwarf2out.c:4420 (add_dwarf_attr) 640M: 17.4% 0 : 0.0% 94M: 1.8% 4584k: 1.0% 3877k
> > toplev.c:906 (realloc_for_line_map) 768M: 20.8% 0 : 0.0% 767M: 14.6% 255M: 54.4% 33
> > --------------------------------------------------------------------------------------------------------------------------------------------
> > GGC memory Leak Garbage Freed Overhead Times
> > --------------------------------------------------------------------------------------------------------------------------------------------
> > Total 3689M:100.0% 14039M:100.0% 5254M:100.0% 470M:100.0% 391M
> > --------------------------------------------------------------------------------------------------------------------------------------------
> >
> > Clearly some function bodies leak - I will try to figure out what. But
> > main problem is debug info.
> > I guess debug info for whole cc1plus is large, but it would be nice if
> > it was not in the garbage collector, for example :)
>
> Well, we're building a DIE tree for the whole unit here so I'm not sure
> what parts we can optimize.  The structures may keep quite some stuff
> on the tree side live through the decl -> DIE and block -> DIE maps
> and the external_die_map used for LTO streaming (but if we lazily stream
> bodies we do need to keep this map ... unless we add some
> start/end-stream-body hooks and doing the map per function.  But then
> we build the DIEs lazily as well so the query of the map is lazy :/)

Yep, I am not sure how much we could do here.  Of course ggc_collect, when
invoked, will do quite a lot of walking to discover relatively few tree
references, but I am not sure whether that can be solved by custom marking
or so.

Honza
>
> Richard.
>
> --
> Richard Biener
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> Germany; GF: Felix Imend