public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Michael Ploujnikov <michael.ploujnikov@oracle.com>
To: Martin Jambor <mjambor@suse.cz>,
	       Richard Biener <richard.guenther@gmail.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>,
	Jan Hubicka <hubicka@ucw.cz>,
	       segher@kernel.crashing.org
Subject: [PATCH v3] Make function clone name numbering independent.
Date: Wed, 28 Nov 2018 21:09:00 -0000	[thread overview]
Message-ID: <ded14873-6181-849f-6a3e-faf804d3f097@oracle.com> (raw)
In-Reply-To: <3f9e8c2a-6112-b595-a932-137958e61f5e@oracle.com>


[-- Attachment #1.1.1: Type: text/plain, Size: 1245 bytes --]

Continuing from https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00228.html

It took longer than expected, but I've finally rebased this on top of
the new clone_function_name* API and included the requested
optimizations.

There are a few remaining spots where we could probably apply similar
optimizations:

- gcc/multiple_target.c:create_target_clone
- gcc/multiple_target.c:create_dispatcher_calls
- gcc/omp-expand.c:grid_expand_target_grid_body

But I've yet to figure out how these work in detail and with the new
API these shouldn't block the main change from being merged.

I've also included a small change to rs6000 which I'm pretty sure is
safe, but I have no way of testing.

I'm not sure what's the consensus on what's more appropriate, but I
thought that it would be a good idea to keep these changes as separate
patches since only the first one really changes behaviour, while the
rest are optimizations. It's conceivable that someone who is
backporting this to an older release might want to just backport the
core bits of the change. I can re-submit it as one patch if that's
more appropriate.

Everything in these patches was bootstrapped and regression tested on
x86_64.

Ok for trunk?

- Michael

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.1.2: 0001-Make-function-assembly-more-independent.patch --]
[-- Type: text/x-patch; name="0001-Make-function-assembly-more-independent.patch", Size: 3631 bytes --]

From 40cb5c888522d69bc42791f0c884dcb9e29eff37 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov <michael.ploujnikov@oracle.com>
Date: Thu, 1 Nov 2018 12:57:30 -0400
Subject: [PATCH 1/4] Make function assembly more independent.

This is achieved by having clone_function_name assign unique clone
numbers for each function independently.

gcc:

2018-11-28  Michael Ploujnikov  <michael.ploujnikov@oracle.com>

	* cgraphclones.c: Replaced clone_fn_id_num with clone_fn_ids;
	  hash map.
	(clone_function_name_numbered): Use clone_fn_ids.

gcc/testsuite:

2018-11-28  Michael Ploujnikov  <michael.ploujnikov@oracle.com>

	* gcc.dg/independent-cloneids-1.c: New test.
---
 gcc/cgraphclones.c                            | 10 ++++-
 gcc/testsuite/gcc.dg/independent-cloneids-1.c | 38 +++++++++++++++++++
 2 files changed, 46 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/independent-cloneids-1.c

diff --git gcc/cgraphclones.c gcc/cgraphclones.c
index e17959c0ca..fdd84b60d3 100644
--- gcc/cgraphclones.c
+++ gcc/cgraphclones.c
@@ -513,7 +513,7 @@ cgraph_node::create_clone (tree new_decl, profile_count prof_count,
   return new_node;
 }
 
-static GTY(()) unsigned int clone_fn_id_num;
+static GTY(()) hash_map<const char *, unsigned> *clone_fn_ids;
 
 /* Return a new assembler name for a clone of decl named NAME.  Apart
    from the string SUFFIX, the new name will end with a unique (for
@@ -525,7 +525,13 @@ static GTY(()) unsigned int clone_fn_id_num;
 tree
 clone_function_name_numbered (const char *name, const char *suffix)
 {
-  return clone_function_name (name, suffix, clone_fn_id_num++);
+  /* Initialize the function->counter mapping the first time it's
+     needed.  */
+  if (!clone_fn_ids)
+    clone_fn_ids = hash_map<const char *, unsigned int>::create_ggc (64);
+  unsigned int &suffix_counter = clone_fn_ids->get_or_insert (
+				   IDENTIFIER_POINTER (get_identifier (name)));
+  return clone_function_name (name, suffix, suffix_counter++);
 }
 
 /* Return a new assembler name for a clone of DECL.  Apart from string
diff --git gcc/testsuite/gcc.dg/independent-cloneids-1.c gcc/testsuite/gcc.dg/independent-cloneids-1.c
new file mode 100644
index 0000000000..3203895492
--- /dev/null
+++ gcc/testsuite/gcc.dg/independent-cloneids-1.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fipa-cp -fipa-cp-clone"  } */
+
+extern int printf (const char *, ...);
+
+static int __attribute__ ((noinline))
+foo (int arg)
+{
+  return 7 * arg;
+}
+
+static int __attribute__ ((noinline))
+bar (int arg)
+{
+  return arg * arg;
+}
+
+int
+baz (int arg)
+{
+  printf("%d\n", bar (3));
+  printf("%d\n", bar (4));
+  printf("%d\n", foo (5));
+  printf("%d\n", foo (6));
+  /* adding or removing the following call should not affect foo
+     function's clone numbering */
+  printf("%d\n", bar (7));
+  return foo (8);
+}
+
+/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]0:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]1:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]2:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]0:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]1:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]2:} 1 } } */
+/* { dg-final { scan-assembler-not {(?n)\m_*foo[.$_]constprop[.$_]3:} } } */
+/* { dg-final { scan-assembler-not {(?n)\m_*foo[.$_]constprop[.$_]4:} } } */
-- 
2.19.1


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.1.3: 0002-Minimize-clone-counter-memory-usage-in-create_virtua.patch --]
[-- Type: text/x-patch; name="0002-Minimize-clone-counter-memory-usage-in-create_virtua.patch", Size: 6590 bytes --]

From 80b07ddf059415c896cecb9862517899db3993e9 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov <michael.ploujnikov@oracle.com>
Date: Mon, 17 Sep 2018 16:02:27 -0400
Subject: [PATCH 2/4] Minimize clone counter memory usage in
 create_virtual_clone.

Based on Martin Jambour's suggestion:
https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00111.html

gcc:

2018-11-28  Michael Ploujnikov  <michael.ploujnikov@oracle.com>

	* cgraph.h (cgraph_node::create_virtual_clone): Add a new
	  argument: num_suffix.
	* cgraphclones.c (cgraph_node::create_virtual_clone): Pass
	  num_suffix to clone_function_name.
	* ipa-cp.c (create_specialized_node): Keep track of clone
	  counters in clone_num_suffixes hash map.
	(ipcp_driver): Free the counter hash map.
	* ipa-hsa.c (process_hsa_functions): Keep track of clone
	  counters in hsa_clone_numbers hash map.
---
 gcc/cgraph.h       |  8 +++++---
 gcc/cgraphclones.c | 16 +++++++++++-----
 gcc/ipa-cp.c       | 10 +++++++++-
 gcc/ipa-hsa.c      |  9 +++++++--
 4 files changed, 32 insertions(+), 11 deletions(-)

diff --git gcc/cgraph.h gcc/cgraph.h
index c13d79850f..4c0fb4da05 100644
--- gcc/cgraph.h
+++ gcc/cgraph.h
@@ -968,11 +968,13 @@ public:
 			     cgraph_node *new_inlined_to,
 			     bitmap args_to_skip, const char *suffix = NULL);
 
-  /* Create callgraph node clone with new declaration.  The actual body will
-     be copied later at compilation stage.  */
+  /* Create callgraph node clone with new declaration.  The actual body will be
+     copied later at compilation stage.  The name of the new clone will be
+     constructed from the name of the original node, SUFFIX and NUM_SUFFIX.  */
   cgraph_node *create_virtual_clone (vec<cgraph_edge *> redirect_callers,
 				     vec<ipa_replace_map *, va_gc> *tree_map,
-				     bitmap args_to_skip, const char * suffix);
+				     bitmap args_to_skip, const char * suffix,
+				     unsigned num_suffix);
 
   /* cgraph node being removed from symbol table; see if its entry can be
    replaced by other inline clone.  */
diff --git gcc/cgraphclones.c gcc/cgraphclones.c
index fdd84b60d3..51bdd7680e 100644
--- gcc/cgraphclones.c
+++ gcc/cgraphclones.c
@@ -595,8 +595,9 @@ clone_function_name (tree decl, const char *suffix)
 }
 
 
-/* Create callgraph node clone with new declaration.  The actual body will
-   be copied later at compilation stage.
+/* Create callgraph node clone with new declaration.  The actual body will be
+   copied later at compilation stage.  The name of the new clone will be
+   constructed from the name of the original node, SUFFIX and NUM_SUFFIX.
 
    TODO: after merging in ipa-sra use function call notes instead of args_to_skip
    bitmap interface.
@@ -604,7 +605,8 @@ clone_function_name (tree decl, const char *suffix)
 cgraph_node *
 cgraph_node::create_virtual_clone (vec<cgraph_edge *> redirect_callers,
 				   vec<ipa_replace_map *, va_gc> *tree_map,
-				   bitmap args_to_skip, const char * suffix)
+				   bitmap args_to_skip, const char * suffix,
+				   unsigned num_suffix)
 {
   tree old_decl = decl;
   cgraph_node *new_node = NULL;
@@ -639,8 +641,12 @@ cgraph_node::create_virtual_clone (vec<cgraph_edge *> redirect_callers,
   strcpy (name + len + 1, suffix);
   name[len] = '.';
   DECL_NAME (new_decl) = get_identifier (name);
-  SET_DECL_ASSEMBLER_NAME (new_decl, clone_function_name_numbered (old_decl,
-								   suffix));
+  SET_DECL_ASSEMBLER_NAME (new_decl,
+			   clone_function_name (
+			     IDENTIFIER_POINTER (
+			       DECL_ASSEMBLER_NAME (old_decl)),
+			     suffix,
+			     num_suffix));
   SET_DECL_RTL (new_decl, NULL);
 
   new_node = create_clone (new_decl, count, false,
diff --git gcc/ipa-cp.c gcc/ipa-cp.c
index 4471bae11c..e0cd1bc45b 100644
--- gcc/ipa-cp.c
+++ gcc/ipa-cp.c
@@ -376,6 +376,9 @@ static profile_count max_count;
 
 static long overall_size, max_new_size;
 
+/* Node to unique clone suffix number map.  */
+static hash_map<cgraph_node *, unsigned> *clone_num_suffixes;
+
 /* Return the param lattices structure corresponding to the Ith formal
    parameter of the function described by INFO.  */
 static inline struct ipcp_param_lattices *
@@ -3828,8 +3831,11 @@ create_specialized_node (struct cgraph_node *node,
 	}
     }
 
+  unsigned &suffix_counter = clone_num_suffixes->get_or_insert (node);
   new_node = node->create_virtual_clone (callers, replace_trees,
-					 args_to_skip, "constprop");
+					 args_to_skip, "constprop",
+					 suffix_counter);
+  suffix_counter++;
 
   bool have_self_recursive_calls = !self_recursive_calls.is_empty ();
   for (unsigned j = 0; j < self_recursive_calls.length (); j++)
@@ -5043,6 +5049,7 @@ ipcp_driver (void)
 
   ipa_check_create_node_params ();
   ipa_check_create_edge_args ();
+  clone_num_suffixes = new hash_map<cgraph_node *, unsigned>;
 
   if (dump_file)
     {
@@ -5064,6 +5071,7 @@ ipcp_driver (void)
   ipcp_store_vr_results ();
 
   /* Free all IPCP structures.  */
+  delete clone_num_suffixes;
   free_toporder_info (&topo);
   delete edge_clone_summaries;
   edge_clone_summaries = NULL;
diff --git gcc/ipa-hsa.c gcc/ipa-hsa.c
index 7c6ceaab8f..b5dc3dfef2 100644
--- gcc/ipa-hsa.c
+++ gcc/ipa-hsa.c
@@ -74,6 +74,7 @@ process_hsa_functions (void)
   if (hsa_summaries == NULL)
     hsa_summaries = new hsa_summary_t (symtab);
 
+  hash_map<cgraph_node *, unsigned> hsa_clone_numbers;
   FOR_EACH_DEFINED_FUNCTION (node)
     {
       hsa_function_summary *s = hsa_summaries->get (node);
@@ -86,9 +87,11 @@ process_hsa_functions (void)
 	{
 	  if (!check_warn_node_versionable (node))
 	    continue;
+	  unsigned &clone_number = hsa_clone_numbers.get_or_insert (node);
 	  cgraph_node *clone
 	    = node->create_virtual_clone (vec <cgraph_edge *> (),
-					  NULL, NULL, "hsa");
+					  NULL, NULL, "hsa", clone_number);
+	  clone_number++;
 	  TREE_PUBLIC (clone->decl) = TREE_PUBLIC (node->decl);
 	  clone->externally_visible = node->externally_visible;
 
@@ -107,9 +110,11 @@ process_hsa_functions (void)
 	{
 	  if (!check_warn_node_versionable (node))
 	    continue;
+	  unsigned &clone_number = hsa_clone_numbers.get_or_insert (node);
 	  cgraph_node *clone
 	    = node->create_virtual_clone (vec <cgraph_edge *> (),
-					  NULL, NULL, "hsa");
+					  NULL, NULL, "hsa", clone_number);
+	  clone_number++;
 	  TREE_PUBLIC (clone->decl) = TREE_PUBLIC (node->decl);
 	  clone->externally_visible = node->externally_visible;
 
-- 
2.19.1


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.1.4: 0003-Minimize-clone-counter-memory-usage-in-LTO.patch --]
[-- Type: text/x-patch; name="0003-Minimize-clone-counter-memory-usage-in-LTO.patch", Size: 2613 bytes --]

From 5bb67981114c08fbaf78e5cc0f3f8abaa0aee60b Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov <michael.ploujnikov@oracle.com>
Date: Mon, 26 Nov 2018 14:22:39 -0500
Subject: [PATCH 3/4] Minimize clone counter memory usage in LTO.

gcc/lto:

2018-11-28  Michael Ploujnikov  <michael.ploujnikov@oracle.com>

	* lto-partition.c (privatize_symbol_name_1): Keep track of
	  non-unique symbol counters in the lto_clone_numbers hash
	  map.
	(lto_promote_cross_file_statics): Allocate and free the
	lto_clone_numbers hash map.
	(lto_promote_statics_nonwpa): Free the lto_clone_numbers hash
	map.
---
 gcc/lto/lto-partition.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git gcc/lto/lto-partition.c gcc/lto/lto-partition.c
index 24e7c23859..3afdcd6f5d 100644
--- gcc/lto/lto-partition.c
+++ gcc/lto/lto-partition.c
@@ -951,6 +951,9 @@ validize_symbol_for_target (symtab_node *node)
     }
 }
 
+/* Maps symbol names to unique lto clone counters.  */
+static hash_map<const char *, unsigned> *lto_clone_numbers;
+
 /* Helper for privatize_symbol_name.  Mangle NODE symbol name
    represented by DECL.  */
 
@@ -962,10 +965,13 @@ privatize_symbol_name_1 (symtab_node *node, tree decl)
   if (must_not_rename (node, name))
     return false;
 
+  unsigned &clone_number = lto_clone_numbers->get_or_insert (
+			     IDENTIFIER_POINTER (DECL_NAME (decl)));
   name = maybe_rewrite_identifier (name);
   symtab->change_decl_assembler_name (decl,
-				      clone_function_name_numbered (
-					  name, "lto_priv"));
+				      clone_function_name (
+					  name, "lto_priv", clone_number));
+  clone_number++;
 
   if (node->lto_file_data)
     lto_record_renamed_decl (node->lto_file_data, name,
@@ -1157,6 +1163,8 @@ lto_promote_cross_file_statics (void)
       part->encoder = compute_ltrans_boundary (part->encoder);
     }
 
+  lto_clone_numbers = new hash_map<const char *, unsigned>;
+
   /* Look at boundaries and promote symbols as needed.  */
   for (i = 0; i < n_sets; i++)
     {
@@ -1187,6 +1195,7 @@ lto_promote_cross_file_statics (void)
           promote_symbol (node);
         }
     }
+  delete lto_clone_numbers;
 }
 
 /* Rename statics in the whole unit in the case that 
@@ -1196,9 +1205,12 @@ void
 lto_promote_statics_nonwpa (void)
 {
   symtab_node *node;
+
+  lto_clone_numbers = new hash_map<const char *, unsigned>;
   FOR_EACH_SYMBOL (node)
     {
       rename_statics (NULL, node);
       validize_symbol_for_target (node);
     }
+  delete lto_clone_numbers;
 }
-- 
2.19.1


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.1.5: 0004-There-can-be-at-most-one-.resolver-clone-per-functio.patch --]
[-- Type: text/x-patch; name="0004-There-can-be-at-most-one-.resolver-clone-per-functio.patch", Size: 1242 bytes --]

From a822759e422f0cea11640ec3d6d6fba94bcd4ee9 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov <michael.ploujnikov@oracle.com>
Date: Wed, 28 Nov 2018 09:24:30 -0500
Subject: [PATCH 4/4] There can be at most one .resolver clone per function

gcc:

2018-11-28  Michael Ploujnikov  <michael.ploujnikov@oracle.com>

	* config/rs6000/rs6000.c (make_resolver_func): Generate
	  resolver symbol with clone_function_name instead of
	  clone_function_name_numbered.
---
 gcc/config/rs6000/rs6000.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git gcc/config/rs6000/rs6000.c gcc/config/rs6000/rs6000.c
index 4f113cb025..a9d038829b 100644
--- gcc/config/rs6000/rs6000.c
+++ gcc/config/rs6000/rs6000.c
@@ -36997,7 +36997,7 @@ make_resolver_func (const tree default_decl,
 {
   /* Make the resolver function static.  The resolver function returns
      void *.  */
-  tree decl_name = clone_function_name_numbered (default_decl, "resolver");
+  tree decl_name = clone_function_name (default_decl, "resolver");
   const char *resolver_name = IDENTIFIER_POINTER (decl_name);
   tree type = build_function_type_list (ptr_type_node, NULL_TREE);
   tree decl = build_fn_decl (resolver_name, type);
-- 
2.19.1


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  reply	other threads:[~2018-11-28 21:09 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-16 19:38 [PATCH] " Michael Ploujnikov
2018-07-17  6:10 ` Bernhard Reutner-Fischer
2018-07-17 10:03   ` Richard Biener
2018-07-17 20:25     ` Michael Ploujnikov
2018-07-20  2:49       ` Michael Ploujnikov
2018-07-20 10:05         ` Richard Biener
2018-07-24 13:57           ` Michael Ploujnikov
2018-07-26 17:27             ` Michael Ploujnikov
2018-07-31 17:40               ` Michael Ploujnikov
2018-08-01 10:38                 ` Richard Biener
2018-08-02 19:05                   ` Michael Ploujnikov
2018-08-13 23:58                     ` [PING][PATCH] " Michael Ploujnikov
2018-08-20 19:47                       ` Jeff Law
2018-08-31 18:49                       ` [PING v2][PATCH] " Michael Ploujnikov
2018-09-03 10:02                         ` Martin Jambor
2018-09-03 11:08                           ` Richard Biener
2018-09-03 13:15                             ` Martin Jambor
2018-09-05  3:24                               ` Michael Ploujnikov
2018-11-28 21:09                                 ` Michael Ploujnikov [this message]
2018-11-28 22:49                                   ` [PATCH v3] " Segher Boessenkool
2018-11-29 14:13                                     ` Michael Ploujnikov
2018-11-30  7:26                                   ` Richard Biener
2018-11-30 21:11                                     ` Michael Ploujnikov
2018-12-01 16:31                                       ` H.J. Lu
2018-12-03 17:00                                         ` Michael Ploujnikov
2018-12-04  3:06                                           ` Michael Ploujnikov
2018-12-04 14:07                                             ` Jan Hubicka
2018-09-05  3:32                           ` [PING v2][PATCH] " Michael Ploujnikov
2018-12-04 12:48                             ` Martin Jambor
2018-12-04 13:22                               ` Michael Ploujnikov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ded14873-6181-849f-6a3e-faf804d3f097@oracle.com \
    --to=michael.ploujnikov@oracle.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=hubicka@ucw.cz \
    --cc=mjambor@suse.cz \
    --cc=richard.guenther@gmail.com \
    --cc=segher@kernel.crashing.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).