From: Jan Hubicka
To: Richard Biener
Cc: Martin Jambor, gary@amperecomputing.com, mliska@suse.cz, jakub@redhat.com, gcc-patches@gcc.gnu.org
Subject: Re: Materialize clones on demand
Date: Wed, 28 Oct 2020 16:51:58 +0100
Message-ID: <20201028155158.GI44896@kam.mff.cuni.cz>
References: <20201022094820.GB97578@kam.mff.cuni.cz> <20201023192748.GB33077@kam.mff.cuni.cz> <20201026094848.GB89299@kam.mff.cuni.cz>

> > > > However main problem is
> > > > cfg.c:202 (connect_src)                             5745k:  0.2%    271M:  1.9%   1754k:  0.0%   1132k:  0.2%   7026k
> > > > cfg.c:212 (connect_dest)                            6307k:  0.2%    281M:  2.0%  10129k:  0.2%   2490k:  0.5%   7172k
> > > > varasm.c:3359 (build_constant_desc)                 7387k:  0.2%      0 :  0.0%      0 :  0.0%      0 :  0.0%     51k
> > > > emit-rtl.c:486 (gen_raw_REG)                        7799k:  0.2%    215M:  1.5%     96 :  0.0%      0 :  0.0%   9502k
> > > > dwarf2cfi.c:2341 (add_cfis_to_fde)                  8027k:  0.2%      0 :  0.0%   4906k:  0.1%   1405k:  0.3%     78k
> > > > emit-rtl.c:4074 (make_jump_insn_raw)                8239k:  0.2%     93M:  0.7%      0 :  0.0%      0 :  0.0%   1442k
> > > > tree-ssanames.c:308 (make_ssa_name_fn)              9130k:  0.2%    456M:  3.3%      0 :  0.0%      0 :  0.0%   6622k
> > > > gimple.c:1808 (gimple_copy)                         9508k:  0.3%    524M:  3.7%   8609k:  0.2%   2972k:  0.6%   7135k
> > > > tree-inline.c:4879 (expand_call_inline)             9590k:  0.3%     21M:  0.2%      0 :  0.0%      0 :  0.0%    328k
> > > > dwarf2cfi.c:418 (new_cfi)                             10M:  0.3%      0 :  0.0%      0 :  0.0%      0 :  0.0%    444k
> > > > cfg.c:266 (unchecked_make_edge)                       10M:  0.3%     60M:  0.4%    355M:  6.8%      0 :  0.0%   9083k
> > I think it is a bug to have a function body at the end of compilation - will
> > try to work out the reason for that.
> > > > tree.c:1642 (wide_int_to_tree_1)                      10M:  0.3%   2313k:  0.0%      0 :  0.0%      0 :  0.0%    548k
> > > > stringpool.c:41 (stringpool_ggc_alloc)                10M:  0.3%   7055k:  0.0%      0 :  0.0%   2270k:  0.5%    588k
> > > > stringpool.c:63 (alloc_node)                          10M:  0.3%     12M:  0.1%      0 :  0.0%      0 :  0.0%    588k
> > > > tree-phinodes.c:119 (allocate_phi_node)               11M:  0.3%    153M:  1.1%      0 :  0.0%   3539k:  0.7%    340k
> > > > cgraph.c:289 (create_empty)                           12M:  0.3%      0 :  0.0%    109M:  2.1%      0 :  0.0%    371k
> > > > cfg.c:127 (alloc_block)                               14M:  0.4%    705M:  5.0%      0 :  0.0%      0 :  0.0%   7086k
> > > > tree-streamer-in.c:558 (streamer_read_tree_bitfi      22M:  0.6%     13k:  0.0%      0 :  0.0%     22k:  0.0%     64k
> > > > tree-inline.c:834 (remap_block)                       28M:  0.8%    159M:  1.1%      0 :  0.0%      0 :  0.0%   2009k
> > > > stringpool.c:79 (ggc_alloc_string)                    28M:  0.8%   5619k:  0.0%      0 :  0.0%   6658k:  1.4%   1785k
> > > > dwarf2out.c:11727 (add_ranges_num)                    32M:  0.9%      0 :  0.0%     32M:  0.6%    144 :  0.0%      20
> > > > tree-inline.c:5942 (copy_decl_to_var)                 39M:  1.1%     51M:  0.4%      0 :  0.0%      0 :  0.0%    646k
> > > > tree-inline.c:5994 (copy_decl_no_change)              78M:  2.1%    270M:  1.9%      0 :  0.0%      0 :  0.0%   2497k
> > > > function.c:4438 (reorder_blocks_1)                    96M:  2.6%    101M:  0.7%      0 :  0.0%      0 :  0.0%   2109k
> > > > hash-table.h:802 (expand)                            142M:  3.9%     18M:  0.1%    198M:  3.8%     32M:  6.9%     38k
> > > > dwarf2out.c:10086 (new_loc_list)                     219M:  6.0%     11M:  0.1%      0 :  0.0%      0 :  0.0%   2955k
> > > > tree-streamer-in.c:637 (streamer_alloc_tree)         379M: 10.3%    426M:  3.0%      0 :  0.0%   4201k:  0.9%   9828k
> > > > dwarf2out.c:5702 (new_die_raw)                       434M: 11.8%      0 :  0.0%      0 :  0.0%      0 :  0.0%   5556k
> > > > dwarf2out.c:1383 (new_loc_descr)                     519M: 14.1%     12M:  0.1%   2880 :  0.0%      0 :  0.0%   6812k
> > > > dwarf2out.c:4420 (add_dwarf_attr)                    640M: 17.4%      0 :  0.0%     94M:  1.8%   4584k:  1.0%   3877k
> > > > toplev.c:906 (realloc_for_line_map)                  768M: 20.8%      0 :  0.0%    767M: 14.6%    255M: 54.4%      33
> > > > ---------------------------------------------------------------------------------------------------------------------
> > > > GGC memory                                                  Leak        Garbage          Freed       Overhead   Times
> > > > ---------------------------------------------------------------------------------------------------------------------
> > > > Total                                               3689M:100.0%  14039M:100.0%   5254M:100.0%    470M:100.0%    391M
> > > > ---------------------------------------------------------------------------------------------------------------------
> > > >
> > > > Clearly some function bodies leak - I will try to figure out what.  But
> > > > main problem is debug info.
> > > > I guess debug info for whole cc1plus is large, but it would be nice if
> > > > it was not in the garbage collector, for example :)
> > > >
> > > Well, we're building a DIE tree for the whole unit here so I'm not sure
> > > what parts we can optimize.  The structures may keep quite some stuff
> > > on the tree side live through the decl -> DIE and block -> DIE maps
> > > and the external_die_map used for LTO streaming (but if we lazily stream
> > > bodies we do need to keep this map ... unless we add some
> > > start/end-stream-body hooks and do the map per function.  But then
> > > we build the DIEs lazily as well so the query of the map is lazy :/)
> >
> > Yep, not sure how much we could do here.  Of course ggc_collect when
> > invoked will do quite a lot of walking to discover relatively few tree
> > references, but not sure if that can be solved by custom marking or so.
>
> In principle the late DIE creation code can remove entries from the
> external_die_map map, but not sure how much that helps (might also
> cause re-allocation of it if we shrink it).  It might help quite a bit
> for references to BLOCKs.  Maybe you can try the following simple
> patch ...
>
> diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
> index ba93a6c3d81..350cc5d443c 100644
> --- a/gcc/dwarf2out.c
> +++ b/gcc/dwarf2out.c
> @@ -5974,6 +5974,7 @@ maybe_create_die_with_external_ref (tree decl)
>
>    const char *sym = desc->sym;
>    unsigned HOST_WIDE_INT off = desc->off;
> +  external_die_map->remove (decl);
>
>    in_lto_p = false;
>    dw_die_ref die = (TREE_CODE (decl) == BLOCK

Updated stats are:

ipa-devirt.c:1950 (get_odr_type)                     385k:  0.0%      0 :  0.0%      0 :  0.0%      0 :  0.0%    7044
emit-rtl.c:4117 (make_note_raw)                      396k:  0.0%    986M:  6.8%      0 :  0.0%      0 :  0.0%     17M
lto-cgraph.c:1983 (input_node_opt_summary)           524k:  0.0%     18M:  0.1%    313k:  0.0%   1012k:  0.2%    124k
tree-inline.c:4883 (expand_call_inline)              526k:  0.0%     30M:  0.2%      0 :  0.0%      0 :  0.0%    329k
gimple.c:1822 (gimple_copy)                          527k:  0.0%    536M:  3.7%   8631k:  0.2%   2997k:  0.6%   7174k
emit-rtl.c:2703 (gen_label_rtx)                      532k:  0.0%     76M:  0.5%      0 :  0.0%      0 :  0.0%   1232k
ipa-modref-tree.h:154 (insert_access)                592k:  0.0%      0 :  0.0%   4052k:  0.1%   7192 :  0.0%     26k
cfg.c:202 (connect_src)                              617k:  0.0%    277M:  1.9%   1755k:  0.0%   1133k:  0.2%   7053k
tree-ssanames.c:308 (make_ssa_name_fn)               627k:  0.0%    466M:  3.2%      0 :  0.0%      0 :  0.0%   6642k
tree.c:7887 (build_pointer_type_for_mode)            635k:  0.0%   1094k:  0.0%      0 :  0.0%      0 :  0.0%     10k
cgraph.c:1989 (rtl_info)                             661k:  0.0%      0 :  0.0%      0 :  0.0%      0 :  0.0%     27k
cfg.c:212 (connect_dest)                             698k:  0.0%    287M:  2.0%  10181k:  0.2%   2490k:  0.5%   7200k
symbol-summary.h:108 (allocate_new)                  736k:  0.0%      0 :  0.0%   8663k:  0.2%      0 :  0.0%    391k
varpool.c:137 (create_empty)                         746k:  0.0%      0 :  0.0%   6257k:  0.1%      0 :  0.0%     54k
varasm.c:1513 (make_decl_rtl)                        834k:  0.0%    866k:  0.0%      0 :  0.0%      0 :  0.0%     70k
emit-rtl.c:4074 (make_jump_insn_raw)                 913k:  0.0%    100M:  0.7%      0 :  0.0%      0 :  0.0%   1448k
tree-phinodes.c:119 (allocate_phi_node)              943k:  0.0%    164M:  1.1%      0 :  0.0%   3563k:  0.7%    343k
emit-rtl.c:386 (set_mem_attrs)                       982k:  0.0%    171M:  1.2%      0 :  0.0%      0 :  0.0%   4413k
tree.c:1311 (build_new_int_cst)                     1080k:  0.0%    838k:  0.0%     66M:  1.3%      0 :  0.0%   2188k
langhooks.c:664 (build_builtin_function)            1125k:  0.0%    137k:  0.0%      0 :  0.0%    170k:  0.0%    4367
emit-rtl.c:486 (gen_raw_REG)                        1158k:  0.0%    221M:  1.5%     96 :  0.0%      0 :  0.0%   9517k
cfg.c:266 (unchecked_make_edge)                     1179k:  0.0%     69M:  0.5%    356M:  6.8%      0 :  0.0%   9119k
varasm.c:3350 (build_constant_desc)                 1232k:  0.0%      0 :  0.0%      0 :  0.0%      0 :  0.0%     51k
varasm.c:3397 (build_constant_desc)                 1232k:  0.0%      0 :  0.0%      0 :  0.0%      0 :  0.0%     51k
tree.c:1497 (cache_wide_int_in_type_cache)          1342k:  0.0%     44k:  0.0%      0 :  0.0%   3184 :  0.0%     18k
cfg.c:127 (alloc_block)                             1597k:  0.0%    720M:  5.0%      0 :  0.0%      0 :  0.0%   7113k
tree-inline.c:837 (remap_block)                     1738k:  0.1%    187M:  1.3%      0 :  0.0%      0 :  0.0%   2016k
dwarf2out.c:15872 (mem_loc_descriptor)              2048k:  0.1%      0 :  0.0%   1531k:  0.0%    512 :  0.0%      10
emit-rtl.c:856 (gen_rtx_MEM)                        2138k:  0.1%    297M:  2.1%      0 :  0.0%      0 :  0.0%     12M
symtab.c:596 (create_reference)                     2486k:  0.1%      0 :  0.0%     44M:  0.8%    341k:  0.1%    192k
tree-inline.c:5038 (expand_call_inline)             2687k:  0.1%      0 :  0.0%   2434k:  0.0%     15k:  0.0%    6432
dwarf2out.c:1028 (dwarf2out_alloc_current_fde)      3084k:  0.1%      0 :  0.0%      0 :  0.0%      0 :  0.0%     27k
ipa-prop.c:5276 (read_ipcp_transformation_info)     3549k:  0.1%     34k:  0.0%      0 :  0.0%    737k:  0.1%    6508
alias.c:1200 (record_alias_subset)                  4712k:  0.1%      0 :  0.0%   3096 :  0.0%     36k:  0.0%    4679
tree.c:2264 (build_string)                          5163k:  0.2%   1782k:  0.0%      0 :  0.0%    652k:  0.1%    115k
function.c:4438 (reorder_blocks_1)                  5470k:  0.2%    193M:  1.3%      0 :  0.0%      0 :  0.0%   2121k
varasm.c:3359 (build_constant_desc)                 7393k:  0.2%      0 :  0.0%      0 :  0.0%      0 :  0.0%     51k
dwarf2cfi.c:2341 (add_cfis_to_fde)                  8078k:  0.2%      0 :  0.0%   4933k:  0.1%   1417k:  0.3%     78k
dwarf2cfi.c:418 (new_cfi)                             10M:  0.3%      0 :  0.0%      0 :  0.0%      0 :  0.0%    447k
stringpool.c:63 (alloc_node)                          10M:  0.3%     12M:  0.1%      0 :  0.0%      0 :  0.0%    591k
tree.c:1642 (wide_int_to_tree_1)                      10M:  0.3%   2375k:  0.0%      0 :  0.0%      0 :  0.0%    549k
stringpool.c:41 (stringpool_ggc_alloc)                10M:  0.3%   7328k:  0.0%      0 :  0.0%   2279k:  0.5%    591k
cgraph.c:290 (create_empty)                           11M:  0.3%      0 :  0.0%     96M:  1.8%      0 :  0.0%    372k
tree-inline.c:5946 (copy_decl_to_var)                 16M:  0.5%     74M:  0.5%      0 :  0.0%      0 :  0.0%    647k
tree-streamer-in.c:558 (streamer_read_tree_bitfi      22M:  0.7%     13k:  0.0%      0 :  0.0%     22k:  0.0%     64k
stringpool.c:79 (ggc_alloc_string)                    27M:  0.8%   7321k:  0.0%      0 :  0.0%   6640k:  1.3%   1784k
dwarf2out.c:11728 (add_ranges_num)                    32M:  1.0%      0 :  0.0%     32M:  0.6%    144 :  0.0%      20
tree-inline.c:5998 (copy_decl_no_change)              34M:  1.0%    315M:  2.2%      0 :  0.0%      0 :  0.0%   2504k
hash-table.h:802 (expand)                            142M:  4.3%     10M:  0.1%    185M:  3.5%     32M:  6.6%     29k
dwarf2out.c:10087 (new_loc_list)                     199M:  6.0%   9350k:  0.1%      0 :  0.0%      0 :  0.0%   2666k
tree-streamer-in.c:637 (streamer_alloc_tree)         315M:  9.5%    491M:  3.4%      0 :  0.0%   4243k:  0.8%   9820k
dwarf2out.c:5702 (new_die_raw)                       412M: 12.4%      0 :  0.0%      0 :  0.0%      0 :  0.0%   5285k
dwarf2out.c:1383 (new_loc_descr)                     480M: 14.4%   9653k:  0.1%   2880 :  0.0%      0 :  0.0%   6265k
dwarf2out.c:4420 (add_dwarf_attr)                    750M: 22.5%      0 :  0.0%     94M:  1.8%     13M:  2.7%   3891k
toplev.c:906 (realloc_for_line_map)                  768M: 23.0%      0 :  0.0%    767M: 14.6%    255M: 52.3%      33
---------------------------------------------------------------------------------------------------------------------
GGC memory                                                  Leak        Garbage          Freed       Overhead   Times
---------------------------------------------------------------------------------------------------------------------
Total                                               3332M:100.0%  14432M:100.0%   5267M:100.0%    489M:100.0%    389M
---------------------------------------------------------------------------------------------------------------------

So it seems there is a reduction in leaked GGC memory from 3.6G to 3.3G
(3689M to 3332M in the Total row).

Honza
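
For readers following the thread, the idea behind the quoted one-line patch (copy what is needed out of the map entry, then remove the entry so the map stops keeping its key alive) can be sketched outside GCC roughly as follows. This is only an illustration under assumptions: std::unordered_map stands in for GCC's hash_map, integer ids stand in for tree keys, and ExternalRef, ExternalRefMap, record, take and entries_left_after_take are hypothetical names, not GCC APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the descriptor held in external_die_map:
// a symbol name plus an offset, like the desc->sym / desc->off pair.
struct ExternalRef {
  std::string sym;
  unsigned long long off;
};

// Hypothetical stand-in for the decl -> external reference map.
// Keys are plain ints here; in GCC they would be decls or BLOCKs.
class ExternalRefMap {
public:
  void record(int decl_id, ExternalRef ref) {
    map_[decl_id] = std::move(ref);
  }

  // Look the entry up and drop it in the same step.  The value is moved
  // out before the erase, because erasing invalidates the entry.
  std::optional<ExternalRef> take(int decl_id) {
    auto it = map_.find(decl_id);
    if (it == map_.end())
      return std::nullopt;
    ExternalRef ref = std::move(it->second);
    map_.erase(it);
    return ref;
  }

  std::size_t size() const { return map_.size(); }

private:
  std::unordered_map<int, ExternalRef> map_;
};

// Small self-check: consuming one of two entries leaves one behind,
// and a second lookup of the consumed key finds nothing.
inline std::size_t entries_left_after_take() {
  ExternalRefMap refs;
  refs.record(1, {"sym_a", 16});
  refs.record(2, {"sym_b", 32});
  std::optional<ExternalRef> ref = refs.take(1);
  assert(ref && ref->sym == "sym_a" && ref->off == 16);
  assert(!refs.take(1));
  return refs.size();
}
```

In a garbage-collected setting the payoff is that a consumed key is no longer reachable through the map, so the collector has one less reference to walk. The sketch does not model the re-allocation-on-shrink concern raised in the quoted text.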