Date: Tue, 14 May 2024 14:20:56 +0200
From: Jan Hubicka
To: Michal Jires
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH 6/7] lto: squash order of symbols in partitions
In-Reply-To: <1169efeea8ca079fc9297a4f95ad292558b1bbcf.1700222403.git.mjires@suse.cz>

> This patch squashes order of symbols in individual partitions, so that
> their relative order is conserved, but is not influenced by symbols in
> other partitions.
> Order of cloned symbols is set to 0. This should be fine because order
> specifies order of symbols in input files, which cloned symbols are not
> part of.

The current use of order is somewhat broken (it has been since cgraph
was converted to C++, which is a while ago). The original code set
order at the time a function was finalized, which made functions be
output in the same order as their bodies appear in the source code (at
least in -fno-toplevel-reorder builds). With this logic the clones
should have the same order as their originals, so they appear next to
them.

Later, the initialization of order was moved to register_symbol, which
is kind of wrong since frontends are allowed to produce symbols early.

So it would be nice to fix this problem and make sure that the order of
clones is sane. I guess this is somewhat independent of the rest of the
caching, so maybe we can first get the other patches in and then worry
about order?
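To make the clone point concrete, here is a minimal standalone C++
sketch, not GCC code. Sym and local_orders are hypothetical names, and
the fallback to 0 for a clone whose original lies outside the partition
is an assumption borrowed from the patch. The idea is that every clone
receives the partition-local order of its original, so the two end up
next to each other.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical, simplified symbol; this is not GCC's symtab_node.
struct Sym
{
  int order;            // global order assigned when the symbol was created
  const Sym *clone_of;  // nullptr for symbols that come from an input file
};

// Return a partition-local order for every symbol in PART.
// Clones inherit the local order of their ultimate original.
std::map<const Sym *, int>
local_orders (const std::vector<const Sym *> &part)
{
  // Collect and sort the global orders of the non-clone symbols.
  std::vector<int> orders;
  for (const Sym *s : part)
    if (!s->clone_of)
      orders.push_back (s->order);
  std::sort (orders.begin (), orders.end ());

  // The rank of each global order inside this partition.
  std::map<int, int> rank;
  for (std::size_t i = 0; i < orders.size (); i++)
    rank[orders[i]] = static_cast<int> (i);

  std::map<const Sym *, int> result;
  for (const Sym *s : part)
    {
      // Walk the clone chain up to the original symbol.
      const Sym *root = s;
      while (root->clone_of)
        root = root->clone_of;
      auto it = rank.find (root->order);
      // Assumption: use 0 when the original is not in this partition.
      result[s] = it != rank.end () ? it->second : 0;
    }
  return result;
}
```

With symbols a (order 10), b (order 20) and a clone of b (order 99) in
one partition, a gets 0 and both b and its clone get 1.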
>
> This is important for incremental LTO because if there is a new symbol,
> it otherwise shifts order of all symbols with higher order, which would
> diverge them all.
>
> Bootstrapped/regtested on x86_64-pc-linux-gnu
>
> gcc/ChangeLog:
>
>         * lto-cgraph.cc (lto_output_node): Add and use order_remap.
>         (lto_output_varpool_node): Likewise.
>         (output_symtab): Likewise.
>         * lto-streamer-out.cc (produce_asm): Likewise.
>         (output_function): Likewise.
>         (output_constructor): Likewise.
>         (copy_function_or_variable): Likewise.
>         (cmp_int): New.
>         (lto_output): Generate order_remap.
>         * lto-streamer.h (produce_asm): Add order_remap.
>         (output_symtab): Likewise.
> ---
>  gcc/lto-cgraph.cc       | 20 ++++++++----
>  gcc/lto-streamer-out.cc | 71 +++++++++++++++++++++++++++++++++--------
>  gcc/lto-streamer.h      |  5 +--
>  3 files changed, 73 insertions(+), 23 deletions(-)
>
> diff --git a/gcc/lto-cgraph.cc b/gcc/lto-cgraph.cc
> index 32c0f5ac6db..a7530290fba 100644
> --- a/gcc/lto-cgraph.cc
> +++ b/gcc/lto-cgraph.cc
> @@ -381,7 +381,8 @@ reachable_from_this_partition_p (struct cgraph_node *node, lto_symtab_encoder_t
>
>  static void
>  lto_output_node (struct lto_simple_output_block *ob, struct cgraph_node *node,
> -                 lto_symtab_encoder_t encoder)
> +                 lto_symtab_encoder_t encoder,
> +                 hash_map, int>* order_remap)
>  {
>    unsigned int tag;
>    struct bitpack_d bp;
> @@ -405,7 +406,9 @@ lto_output_node (struct lto_simple_output_block *ob, struct cgraph_node *node,
>
>    streamer_write_enum (ob->main_stream, LTO_symtab_tags, LTO_symtab_last_tag,
>                         tag);
> -  streamer_write_hwi_stream (ob->main_stream, node->order);
> +
> +  int order = flag_wpa ? *order_remap->get (node->order) : node->order;
> +  streamer_write_hwi_stream (ob->main_stream, order);
>
>    /* In WPA mode, we only output part of the call-graph. Also, we
>       fake cgraph node attributes. There are two cases that we care.
> @@ -585,7 +588,8 @@ lto_output_node (struct lto_simple_output_block *ob, struct cgraph_node *node,
>
>  static void
>  lto_output_varpool_node (struct lto_simple_output_block *ob, varpool_node *node,
> -                         lto_symtab_encoder_t encoder)
> +                         lto_symtab_encoder_t encoder,
> +                         hash_map, int>* order_remap)
>  {
>    bool boundary_p = !lto_symtab_encoder_in_partition_p (encoder, node);
>    bool encode_initializer_p
> @@ -602,7 +606,8 @@ lto_output_varpool_node (struct lto_simple_output_block *ob, varpool_node *node,
>
>    streamer_write_enum (ob->main_stream, LTO_symtab_tags, LTO_symtab_last_tag,
>                         LTO_symtab_variable);
> -  streamer_write_hwi_stream (ob->main_stream, node->order);
> +  int order = flag_wpa ? *order_remap->get (node->order) : node->order;
> +  streamer_write_hwi_stream (ob->main_stream, order);
>    lto_output_var_decl_ref (ob->decl_state, ob->main_stream, node->decl);
>    bp = bitpack_create (ob->main_stream);
>    bp_pack_value (&bp, node->externally_visible, 1);
> @@ -967,7 +972,7 @@ compute_ltrans_boundary (lto_symtab_encoder_t in_encoder)
>  /* Output the part of the symtab in SET and VSET. */
>
>  void
> -output_symtab (void)
> +output_symtab (hash_map, int>* order_remap)
>  {
>    struct cgraph_node *node;
>    struct lto_simple_output_block *ob;
> @@ -994,9 +999,10 @@ output_symtab (void)
>      {
>        symtab_node *node = lto_symtab_encoder_deref (encoder, i);
>        if (cgraph_node *cnode = dyn_cast (node))
> -        lto_output_node (ob, cnode, encoder);
> +        lto_output_node (ob, cnode, encoder, order_remap);
>        else
> -        lto_output_varpool_node (ob, dyn_cast (node), encoder);
> +        lto_output_varpool_node (ob, dyn_cast (node), encoder,
> +                                 order_remap);
>      }
>
>    /* Go over the nodes in SET again to write edges. */
> diff --git a/gcc/lto-streamer-out.cc b/gcc/lto-streamer-out.cc
> index a1bbea8fc68..9448ab195d5 100644
> --- a/gcc/lto-streamer-out.cc
> +++ b/gcc/lto-streamer-out.cc
> @@ -2212,7 +2212,8 @@ output_cfg (struct output_block *ob, struct function *fn)
>     a function, set FN to the decl for that function. */
>
>  void
> -produce_asm (struct output_block *ob, tree fn)
> +produce_asm (struct output_block *ob, tree fn,
> +             hash_map, int>* order_remap)
>  {
>    enum lto_section_type section_type = ob->section_type;
>    struct lto_function_header header;
> @@ -2221,9 +2222,11 @@ produce_asm (struct output_block *ob, tree fn)
>    if (section_type == LTO_section_function_body)
>      {
>        const char *name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (fn));
> -      section_name = lto_get_section_name (section_type, name,
> -                                           symtab_node::get (fn)->order,
> -                                           NULL);
> +
> +      int order = symtab_node::get (fn)->order;
> +      if (flag_wpa && order_remap)
> +        order = *order_remap->get (order);
> +      section_name = lto_get_section_name (section_type, name, order, NULL);
>      }
>    else
>      section_name = lto_get_section_name (section_type, NULL, 0, NULL);
> @@ -2405,7 +2408,8 @@ streamer_write_chain (struct output_block *ob, tree t, bool ref_p)
>  /* Output the body of function NODE->DECL. */
>
>  static void
> -output_function (struct cgraph_node *node)
> +output_function (struct cgraph_node *node,
> +                 hash_map, int>* order_remap)
>  {
>    tree function;
>    struct function *fn;
> @@ -2482,7 +2486,7 @@ output_function (struct cgraph_node *node)
>    streamer_write_uhwi (ob, 0);
>
>    /* Create a section to hold the pickled output of this function. */
> -  produce_asm (ob, function);
> +  produce_asm (ob, function, order_remap);
>
>    destroy_output_block (ob);
>    if (streamer_dump_file)
> @@ -2493,7 +2497,8 @@ output_function (struct cgraph_node *node)
>  /* Output the body of function NODE->DECL. */
>
>  static void
> -output_constructor (struct varpool_node *node)
> +output_constructor (struct varpool_node *node,
> +                    hash_map, int>* order_remap)
>  {
>    tree var = node->decl;
>    struct output_block *ob;
> @@ -2515,7 +2520,7 @@ output_constructor (struct varpool_node *node)
>    stream_write_tree (ob, DECL_INITIAL (var), true);
>
>    /* Create a section to hold the pickled output of this function. */
> -  produce_asm (ob, var);
> +  produce_asm (ob, var, order_remap);
>
>    destroy_output_block (ob);
>    if (streamer_dump_file)
> @@ -2576,15 +2581,18 @@ lto_output_toplevel_asms (void)
>  /* Copy the function body or variable constructor of NODE without deserializing. */
>
>  static void
> -copy_function_or_variable (struct symtab_node *node)
> +copy_function_or_variable (struct symtab_node *node,
> +                           hash_map, int>* order_remap)
>  {
>    tree function = node->decl;
>    struct lto_file_decl_data *file_data = node->lto_file_data;
>    const char *data;
>    size_t len;
>    const char *name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (function));
> +
> +  int order = flag_wpa ? *order_remap->get (node->order) : node->order;
>    char *section_name =
> -    lto_get_section_name (LTO_section_function_body, name, node->order, NULL);
> +    lto_get_section_name (LTO_section_function_body, name, order, NULL);
>    size_t i, j;
>    struct lto_in_decl_state *in_state;
>    struct lto_out_decl_state *out_state = lto_get_out_decl_state ();
> @@ -2729,6 +2737,15 @@ cmp_symbol_files (const void *pn1, const void *pn2, void *id_map_)
>    return n1->order - n2->order;
>  }
>
> +/* Compare ints, callback for qsort. */
> +static int
> +cmp_int (const void *a, const void *b)
> +{
> +  int ia = *(int const*) a;
> +  int ib = *(int const*) b;
> +  return ia - ib;
> +}
> +
>  /* Main entry point from the pass manager. */
>
>  void
> @@ -2741,6 +2758,32 @@ lto_output (void)
>    lto_symtab_encoder_t encoder = lto_get_out_decl_state ()->symtab_node_encoder;
>    auto_vec symbols_to_copy;
>
> +  hash_map, int> order_remap;
> +  if (flag_wpa)
> +    {
> +      /* Remap order so that it does not depend on symbols outside of
> +         partition. */
> +      auto_vec orders;
> +
> +      n_nodes = lto_symtab_encoder_size (encoder);
> +      for (i = 0; i < n_nodes; i++)
> +        {
> +          symtab_node *snode = lto_symtab_encoder_deref (encoder, i);
> +          if (cgraph_node *cnode = dyn_cast (snode))
> +            {
> +              if (cnode->clone_of)
> +                {
> +                  order_remap.put (snode->order, 0);
> +                  continue;
> +                }
> +            }
> +          orders.safe_push (snode->order);
> +        }
> +      orders.qsort (cmp_int);
> +      for (i = 0; i < orders.length (); i++)
> +        order_remap.put (orders[i], i);
> +    }
> +
>    prune_offload_funcs ();
>
>    if (flag_checking)
> @@ -2817,14 +2860,14 @@ lto_output (void)
>                   at WPA time. */
>                || DECL_ARGUMENTS (cnode->decl)
>                || cnode->declare_variant_alt))
> -        output_function (cnode);
> +        output_function (cnode, &order_remap);
>        else if ((vnode = dyn_cast (snode))
>                 && (DECL_INITIAL (vnode->decl) != error_mark_node
>                     || (!flag_wpa
>                         && flag_incremental_link != INCREMENTAL_LINK_LTO)))
> -        output_constructor (vnode);
> +        output_constructor (vnode, &order_remap);
>        else
> -        copy_function_or_variable (snode);
> +        copy_function_or_variable (snode, &order_remap);
>        gcc_assert (lto_get_out_decl_state () == decl_state);
>        lto_pop_out_decl_state ();
>        lto_record_function_out_decl_state (snode->decl, decl_state);
> @@ -2834,7 +2877,7 @@ lto_output (void)
>       be done now to make sure that all the statements in every function
>       have been renumbered so that edges can be associated with call
>       statements using the statement UIDs. */
> -  output_symtab ();
> +  output_symtab (&order_remap);
>
>    output_offload_tables ();
>
> diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
> index 0556b34c837..3363e6f9e61 100644
> --- a/gcc/lto-streamer.h
> +++ b/gcc/lto-streamer.h
> @@ -888,7 +888,8 @@ extern void lto_output_fn_decl_ref (struct lto_out_decl_state *,
>  extern tree lto_input_var_decl_ref (lto_input_block *, lto_file_decl_data *);
>  extern tree lto_input_fn_decl_ref (lto_input_block *, lto_file_decl_data *);
>  extern void lto_output_toplevel_asms (void);
> -extern void produce_asm (struct output_block *ob, tree fn);
> +extern void produce_asm (struct output_block *ob, tree fn,
> +                         hash_map, int>* order_remap = 0);
>  extern void lto_output ();
>  extern void produce_asm_for_decls ();
>  void lto_output_decl_state_streams (struct output_block *,
> @@ -919,7 +920,7 @@ void lto_set_symtab_encoder_in_partition (lto_symtab_encoder_t,
>
>  bool lto_symtab_encoder_encode_initializer_p (lto_symtab_encoder_t,
>                                                varpool_node *);
> -void output_symtab (void);
> +void output_symtab (hash_map, int>*);
>  void input_symtab (void);
>  void output_offload_tables (void);
>  void input_offload_tables (bool);
> --
> 2.42.1
>
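
The remapping that lto_output builds in the quoted patch can be
illustrated outside GCC with a small standalone C++ sketch.
squash_orders is a hypothetical helper that uses std::map and
std::vector in place of hash_map and auto_vec, and it leaves out the
special case for clones. It only shows why the squashed order is
insensitive to symbols in other partitions.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical helper, not part of GCC: map each global order found in
// one partition to its rank (0, 1, 2, ...) within that partition.
std::map<int, int>
squash_orders (std::vector<int> orders)
{
  // Sorting keeps the relative order of the partition's symbols.
  std::sort (orders.begin (), orders.end ());

  std::map<int, int> remap;
  for (std::size_t i = 0; i < orders.size (); i++)
    remap[orders[i]] = static_cast<int> (i);
  return remap;
}
```

For example, a partition holding global orders {7, 3, 12} maps 3 to 0,
7 to 1 and 12 to 2. If a new symbol in another partition shifts those
global orders to {8, 3, 13}, the squashed values are still 0, 1 and 2,
so what gets streamed for this partition does not change.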