From: Jan Hubicka <hubicka@ucw.cz>
To: Richard Biener <rguenther@suse.de>
Cc: Martin Jambor <mjambor@suse.cz>,
gary@amperecomputing.com, mliska@suse.cz, jakub@redhat.com,
gcc-patches@gcc.gnu.org
Subject: Re: Materialize clones on demand
Date: Wed, 28 Oct 2020 16:51:58 +0100
Message-ID: <20201028155158.GI44896@kam.mff.cuni.cz>
In-Reply-To: <nycvar.YFH.7.76.2010261123100.10073@p653.nepu.fhfr.qr>
> > > > However main problem is
> > > > cfg.c:202 (connect_src) 5745k: 0.2% 271M: 1.9% 1754k: 0.0% 1132k: 0.2% 7026k
> > > > cfg.c:212 (connect_dest) 6307k: 0.2% 281M: 2.0% 10129k: 0.2% 2490k: 0.5% 7172k
> > > > varasm.c:3359 (build_constant_desc) 7387k: 0.2% 0 : 0.0% 0 : 0.0% 0 : 0.0% 51k
> > > > emit-rtl.c:486 (gen_raw_REG) 7799k: 0.2% 215M: 1.5% 96 : 0.0% 0 : 0.0% 9502k
> > > > dwarf2cfi.c:2341 (add_cfis_to_fde) 8027k: 0.2% 0 : 0.0% 4906k: 0.1% 1405k: 0.3% 78k
> > > > emit-rtl.c:4074 (make_jump_insn_raw) 8239k: 0.2% 93M: 0.7% 0 : 0.0% 0 : 0.0% 1442k
> > > > tree-ssanames.c:308 (make_ssa_name_fn) 9130k: 0.2% 456M: 3.3% 0 : 0.0% 0 : 0.0% 6622k
> > > > gimple.c:1808 (gimple_copy) 9508k: 0.3% 524M: 3.7% 8609k: 0.2% 2972k: 0.6% 7135k
> > > > tree-inline.c:4879 (expand_call_inline) 9590k: 0.3% 21M: 0.2% 0 : 0.0% 0 : 0.0% 328k
> > > > dwarf2cfi.c:418 (new_cfi) 10M: 0.3% 0 : 0.0% 0 : 0.0% 0 : 0.0% 444k
> > > > cfg.c:266 (unchecked_make_edge) 10M: 0.3% 60M: 0.4% 355M: 6.8% 0 : 0.0% 9083k
> > I think it is a bug to have a function body at the end of compilation - will
> > try to work out the reason for that.
> > > > tree.c:1642 (wide_int_to_tree_1) 10M: 0.3% 2313k: 0.0% 0 : 0.0% 0 : 0.0% 548k
> > > > stringpool.c:41 (stringpool_ggc_alloc) 10M: 0.3% 7055k: 0.0% 0 : 0.0% 2270k: 0.5% 588k
> > > > stringpool.c:63 (alloc_node) 10M: 0.3% 12M: 0.1% 0 : 0.0% 0 : 0.0% 588k
> > > > tree-phinodes.c:119 (allocate_phi_node) 11M: 0.3% 153M: 1.1% 0 : 0.0% 3539k: 0.7% 340k
> > > > cgraph.c:289 (create_empty) 12M: 0.3% 0 : 0.0% 109M: 2.1% 0 : 0.0% 371k
> > > > cfg.c:127 (alloc_block) 14M: 0.4% 705M: 5.0% 0 : 0.0% 0 : 0.0% 7086k
> > > > tree-streamer-in.c:558 (streamer_read_tree_bitfi 22M: 0.6% 13k: 0.0% 0 : 0.0% 22k: 0.0% 64k
> > > > tree-inline.c:834 (remap_block) 28M: 0.8% 159M: 1.1% 0 : 0.0% 0 : 0.0% 2009k
> > > > stringpool.c:79 (ggc_alloc_string) 28M: 0.8% 5619k: 0.0% 0 : 0.0% 6658k: 1.4% 1785k
> > > > dwarf2out.c:11727 (add_ranges_num) 32M: 0.9% 0 : 0.0% 32M: 0.6% 144 : 0.0% 20
> > > > tree-inline.c:5942 (copy_decl_to_var) 39M: 1.1% 51M: 0.4% 0 : 0.0% 0 : 0.0% 646k
> > > > tree-inline.c:5994 (copy_decl_no_change) 78M: 2.1% 270M: 1.9% 0 : 0.0% 0 : 0.0% 2497k
> > > > function.c:4438 (reorder_blocks_1) 96M: 2.6% 101M: 0.7% 0 : 0.0% 0 : 0.0% 2109k
> > > > hash-table.h:802 (expand) 142M: 3.9% 18M: 0.1% 198M: 3.8% 32M: 6.9% 38k
> > > > dwarf2out.c:10086 (new_loc_list) 219M: 6.0% 11M: 0.1% 0 : 0.0% 0 : 0.0% 2955k
> > > > tree-streamer-in.c:637 (streamer_alloc_tree) 379M: 10.3% 426M: 3.0% 0 : 0.0% 4201k: 0.9% 9828k
> > > > dwarf2out.c:5702 (new_die_raw) 434M: 11.8% 0 : 0.0% 0 : 0.0% 0 : 0.0% 5556k
> > > > dwarf2out.c:1383 (new_loc_descr) 519M: 14.1% 12M: 0.1% 2880 : 0.0% 0 : 0.0% 6812k
> > > > dwarf2out.c:4420 (add_dwarf_attr) 640M: 17.4% 0 : 0.0% 94M: 1.8% 4584k: 1.0% 3877k
> > > > toplev.c:906 (realloc_for_line_map) 768M: 20.8% 0 : 0.0% 767M: 14.6% 255M: 54.4% 33
> > > > --------------------------------------------------------------------------------------------------------------------------------------------
> > > > GGC memory Leak Garbage Freed Overhead Times
> > > > --------------------------------------------------------------------------------------------------------------------------------------------
> > > > Total 3689M:100.0% 14039M:100.0% 5254M:100.0% 470M:100.0% 391M
> > > > --------------------------------------------------------------------------------------------------------------------------------------------
> > > >
> > > > Clearly some function bodies leak - I will try to figure out which. But
> > > > the main problem is debug info.
> > > > I guess debug info for whole cc1plus is large, but it would be nice if
> > > > it was not in the garbage collector, for example :)
> > >
> > > Well, we're building a DIE tree for the whole unit here so I'm not sure
> > > what parts we can optimize. The structures may keep quite some stuff
> > > on the tree side live through the decl -> DIE and block -> DIE maps
> > > and the external_die_map used for LTO streaming (but if we lazily stream
> > > bodies we do need to keep this map ... unless we add some
> > > start/end-stream-body hooks and doing the map per function. But then
> > > we build the DIEs lazily as well so the query of the map is lazy :/)
> >
> > Yep, not sure how much we could do here. Of course ggc_collect when
> > invoked will do quite a lot of walking to discover relatively few tree
> > references, but not sure if that can be solved by custom marking or so.
>
> In principle the late DIE creation code can remove entries from the
> external_die_map map, but not sure how much that helps (might also
> cause re-allocation of it if we shrink it). It might help quite a bit
> for references to BLOCKs. Maybe you can try the following simple
> patch ...
>
> diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
> index ba93a6c3d81..350cc5d443c 100644
> --- a/gcc/dwarf2out.c
> +++ b/gcc/dwarf2out.c
> @@ -5974,6 +5974,7 @@ maybe_create_die_with_external_ref (tree decl)
>
>    const char *sym = desc->sym;
>    unsigned HOST_WIDE_INT off = desc->off;
> +  external_die_map->remove (decl);
>
>    in_lto_p = false;
>    dw_die_ref die = (TREE_CODE (decl) == BLOCK

With this patch applied, the updated stats are:
ipa-devirt.c:1950 (get_odr_type) 385k: 0.0% 0 : 0.0% 0 : 0.0% 0 : 0.0% 7044
emit-rtl.c:4117 (make_note_raw) 396k: 0.0% 986M: 6.8% 0 : 0.0% 0 : 0.0% 17M
lto-cgraph.c:1983 (input_node_opt_summary) 524k: 0.0% 18M: 0.1% 313k: 0.0% 1012k: 0.2% 124k
tree-inline.c:4883 (expand_call_inline) 526k: 0.0% 30M: 0.2% 0 : 0.0% 0 : 0.0% 329k
gimple.c:1822 (gimple_copy) 527k: 0.0% 536M: 3.7% 8631k: 0.2% 2997k: 0.6% 7174k
emit-rtl.c:2703 (gen_label_rtx) 532k: 0.0% 76M: 0.5% 0 : 0.0% 0 : 0.0% 1232k
ipa-modref-tree.h:154 (insert_access) 592k: 0.0% 0 : 0.0% 4052k: 0.1% 7192 : 0.0% 26k
cfg.c:202 (connect_src) 617k: 0.0% 277M: 1.9% 1755k: 0.0% 1133k: 0.2% 7053k
tree-ssanames.c:308 (make_ssa_name_fn) 627k: 0.0% 466M: 3.2% 0 : 0.0% 0 : 0.0% 6642k
tree.c:7887 (build_pointer_type_for_mode) 635k: 0.0% 1094k: 0.0% 0 : 0.0% 0 : 0.0% 10k
cgraph.c:1989 (rtl_info) 661k: 0.0% 0 : 0.0% 0 : 0.0% 0 : 0.0% 27k
cfg.c:212 (connect_dest) 698k: 0.0% 287M: 2.0% 10181k: 0.2% 2490k: 0.5% 7200k
symbol-summary.h:108 (allocate_new) 736k: 0.0% 0 : 0.0% 8663k: 0.2% 0 : 0.0% 391k
varpool.c:137 (create_empty) 746k: 0.0% 0 : 0.0% 6257k: 0.1% 0 : 0.0% 54k
varasm.c:1513 (make_decl_rtl) 834k: 0.0% 866k: 0.0% 0 : 0.0% 0 : 0.0% 70k
emit-rtl.c:4074 (make_jump_insn_raw) 913k: 0.0% 100M: 0.7% 0 : 0.0% 0 : 0.0% 1448k
tree-phinodes.c:119 (allocate_phi_node) 943k: 0.0% 164M: 1.1% 0 : 0.0% 3563k: 0.7% 343k
emit-rtl.c:386 (set_mem_attrs) 982k: 0.0% 171M: 1.2% 0 : 0.0% 0 : 0.0% 4413k
tree.c:1311 (build_new_int_cst) 1080k: 0.0% 838k: 0.0% 66M: 1.3% 0 : 0.0% 2188k
langhooks.c:664 (build_builtin_function) 1125k: 0.0% 137k: 0.0% 0 : 0.0% 170k: 0.0% 4367
emit-rtl.c:486 (gen_raw_REG) 1158k: 0.0% 221M: 1.5% 96 : 0.0% 0 : 0.0% 9517k
cfg.c:266 (unchecked_make_edge) 1179k: 0.0% 69M: 0.5% 356M: 6.8% 0 : 0.0% 9119k
varasm.c:3350 (build_constant_desc) 1232k: 0.0% 0 : 0.0% 0 : 0.0% 0 : 0.0% 51k
varasm.c:3397 (build_constant_desc) 1232k: 0.0% 0 : 0.0% 0 : 0.0% 0 : 0.0% 51k
tree.c:1497 (cache_wide_int_in_type_cache) 1342k: 0.0% 44k: 0.0% 0 : 0.0% 3184 : 0.0% 18k
cfg.c:127 (alloc_block) 1597k: 0.0% 720M: 5.0% 0 : 0.0% 0 : 0.0% 7113k
tree-inline.c:837 (remap_block) 1738k: 0.1% 187M: 1.3% 0 : 0.0% 0 : 0.0% 2016k
dwarf2out.c:15872 (mem_loc_descriptor) 2048k: 0.1% 0 : 0.0% 1531k: 0.0% 512 : 0.0% 10
emit-rtl.c:856 (gen_rtx_MEM) 2138k: 0.1% 297M: 2.1% 0 : 0.0% 0 : 0.0% 12M
symtab.c:596 (create_reference) 2486k: 0.1% 0 : 0.0% 44M: 0.8% 341k: 0.1% 192k
tree-inline.c:5038 (expand_call_inline) 2687k: 0.1% 0 : 0.0% 2434k: 0.0% 15k: 0.0% 6432
dwarf2out.c:1028 (dwarf2out_alloc_current_fde) 3084k: 0.1% 0 : 0.0% 0 : 0.0% 0 : 0.0% 27k
ipa-prop.c:5276 (read_ipcp_transformation_info) 3549k: 0.1% 34k: 0.0% 0 : 0.0% 737k: 0.1% 6508
alias.c:1200 (record_alias_subset) 4712k: 0.1% 0 : 0.0% 3096 : 0.0% 36k: 0.0% 4679
tree.c:2264 (build_string) 5163k: 0.2% 1782k: 0.0% 0 : 0.0% 652k: 0.1% 115k
function.c:4438 (reorder_blocks_1) 5470k: 0.2% 193M: 1.3% 0 : 0.0% 0 : 0.0% 2121k
varasm.c:3359 (build_constant_desc) 7393k: 0.2% 0 : 0.0% 0 : 0.0% 0 : 0.0% 51k
dwarf2cfi.c:2341 (add_cfis_to_fde) 8078k: 0.2% 0 : 0.0% 4933k: 0.1% 1417k: 0.3% 78k
dwarf2cfi.c:418 (new_cfi) 10M: 0.3% 0 : 0.0% 0 : 0.0% 0 : 0.0% 447k
stringpool.c:63 (alloc_node) 10M: 0.3% 12M: 0.1% 0 : 0.0% 0 : 0.0% 591k
tree.c:1642 (wide_int_to_tree_1) 10M: 0.3% 2375k: 0.0% 0 : 0.0% 0 : 0.0% 549k
stringpool.c:41 (stringpool_ggc_alloc) 10M: 0.3% 7328k: 0.0% 0 : 0.0% 2279k: 0.5% 591k
cgraph.c:290 (create_empty) 11M: 0.3% 0 : 0.0% 96M: 1.8% 0 : 0.0% 372k
tree-inline.c:5946 (copy_decl_to_var) 16M: 0.5% 74M: 0.5% 0 : 0.0% 0 : 0.0% 647k
tree-streamer-in.c:558 (streamer_read_tree_bitfi 22M: 0.7% 13k: 0.0% 0 : 0.0% 22k: 0.0% 64k
stringpool.c:79 (ggc_alloc_string) 27M: 0.8% 7321k: 0.0% 0 : 0.0% 6640k: 1.3% 1784k
dwarf2out.c:11728 (add_ranges_num) 32M: 1.0% 0 : 0.0% 32M: 0.6% 144 : 0.0% 20
tree-inline.c:5998 (copy_decl_no_change) 34M: 1.0% 315M: 2.2% 0 : 0.0% 0 : 0.0% 2504k
hash-table.h:802 (expand) 142M: 4.3% 10M: 0.1% 185M: 3.5% 32M: 6.6% 29k
dwarf2out.c:10087 (new_loc_list) 199M: 6.0% 9350k: 0.1% 0 : 0.0% 0 : 0.0% 2666k
tree-streamer-in.c:637 (streamer_alloc_tree) 315M: 9.5% 491M: 3.4% 0 : 0.0% 4243k: 0.8% 9820k
dwarf2out.c:5702 (new_die_raw) 412M: 12.4% 0 : 0.0% 0 : 0.0% 0 : 0.0% 5285k
dwarf2out.c:1383 (new_loc_descr) 480M: 14.4% 9653k: 0.1% 2880 : 0.0% 0 : 0.0% 6265k
dwarf2out.c:4420 (add_dwarf_attr) 750M: 22.5% 0 : 0.0% 94M: 1.8% 13M: 2.7% 3891k
toplev.c:906 (realloc_for_line_map) 768M: 23.0% 0 : 0.0% 767M: 14.6% 255M: 52.3% 33
--------------------------------------------------------------------------------------------------------------------------------------------
GGC memory Leak Garbage Freed Overhead Times
--------------------------------------------------------------------------------------------------------------------------------------------
Total 3332M:100.0% 14432M:100.0% 5267M:100.0% 489M:100.0% 389M
--------------------------------------------------------------------------------------------------------------------------------------------
So it seems the leak total dropped from 3.6G to 3.3G (3689M to 3332M).
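
As a quick sanity check on that figure, here is a minimal sketch that works out the saving from the Leak column of the two "Total" rows. It is not part of GCC; the helper names `saved_mb` and `saved_percent` are made up for illustration, and the only inputs are the 3689M and 3332M totals reported above.

```cpp
// Leak totals in megabytes, copied from the "Total" rows of the two
// GGC memory reports (before and after the external_die_map patch).
constexpr long leak_before_mb = 3689;
constexpr long leak_after_mb = 3332;

// Absolute reduction in megabytes.  Hypothetical helper, not a GCC API.
constexpr long
saved_mb (long before, long after)
{
  return before - after;
}

// Reduction relative to the original figure, in percent.
// Hypothetical helper, not a GCC API.
constexpr double
saved_percent (long before, long after)
{
  return 100.0 * static_cast<double> (before - after)
	 / static_cast<double> (before);
}
```

With these inputs `saved_mb (leak_before_mb, leak_after_mb)` is 357, and `saved_percent` comes out just under 10%.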
Honza