public inbox for gcc-patches@gcc.gnu.org
* Backporting streaming and enum changes
@ 2020-08-06 14:34 Jan Hubicka
  2020-08-06 14:39 ` Richard Biener
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Hubicka @ 2020-08-06 14:34 UTC
  To: gcc-patches, jakub, rguenther

[-- Attachment #1: Type: text/plain, Size: 1524 bytes --]

Hello,
as discussed some time ago, I would like to discuss the possibility of
backporting the streaming and enum improvements.  The motivation is that
this brings quite noticeable improvements to builds of very large
projects, where we currently have a nonlinearity problem with anonymous
namespaces (which is solved by the first set of patches), and there is
also quite noticeable overhead in streaming of enums that I noticed too
late for GCC 10.1.  This is the second combined patch.

There is also a noticeable reduction in the size of .o files (especially
before compression, which is what affects WPA->ltrans streaming) and some
memory use benefits.

This is an optional thing to do, but I believe it may be helpful for
distro builds and those using LTO for large projects.  

For Firefox the reduction in the global stream (which is the slowest part
of WPA) is from 25678391 tree bodies to 20821629, and from 11160520 SCC
hash collisions to 6002.  The overall section size goes from 392382523 to
287891470 (both compressed).

For Firefox streaming is under control, but other projects like Chromium
hit bigger issues.  The reason is that Firefox has a "unified build" that
#includes multiple cpp sources into one, so it consists of only about 8k
source files, while Chromium has over 25k, and it was also tested on a
project with over 250k sources.  The more small sources one has, the more
noticeable a bottleneck streaming becomes.

The patches are not completely trivial, but they affect code that is
heavily executed during streaming and they have been in mainline for
several months, so I hope they are safe.

Honza

[-- Attachment #2: patch1b --]
[-- Type: text/plain, Size: 47894 bytes --]

From 7edbd8fada3055d55cd238754cd70770d62b10de Mon Sep 17 00:00:00 2001
From: Jan Hubicka <jh@suse.cz>
Date: Wed, 20 May 2020 15:58:22 +0200
Subject: [PATCH 1/5] Avoid SCC hashing on unmergeable trees

This is a new incarnation of the patch to identify unmergeable trees at
stream-out time rather than at stream-in time, and to avoid pickling them
into SCCs with hash codes.

Building cc1 plus this patch reduces:

[WPA] read 4452927 SCCs of average size 1.986030
[WPA] 8843646 tree bodies read in total
[WPA] tree SCC table: size 524287, 205158 elements, collision ratio: 0.505204
[WPA] tree SCC max chain length 43 (size 1)
[WPA] Compared 947551 SCCs, 780270 collisions (0.823460)
[WPA] Merged 944038 SCCs
[WPA] Merged 5253521 tree bodies
[WPA] Merged 590027 types
...
[WPA] Size of mmap'd section decls: 99229066 bytes
[WPA] Size of mmap'd section function_body: 18398837 bytes
[WPA] Size of mmap'd section refs: 733678 bytes
[WPA] Size of mmap'd section jmpfuncs: 2965981 bytes
[WPA] Size of mmap'd section pureconst: 170248 bytes
[WPA] Size of mmap'd section profile: 17985 bytes
[WPA] Size of mmap'd section symbol_nodes: 3392736 bytes
[WPA] Size of mmap'd section inline: 2693920 bytes
[WPA] Size of mmap'd section icf: 435557 bytes
[WPA] Size of mmap'd section offload_table: 0 bytes
[WPA] Size of mmap'd section lto: 4320 bytes
[WPA] Size of mmap'd section ipa_sra: 651660 bytes

... to ...

[WPA] read 3312246 unshared trees
[WPA] read 1144381 mergeable SCCs of average size 4.833785
[WPA] 8843938 tree bodies read in total
[WPA] tree SCC table: size 524287, 197767 elements, collision ratio: 0.506446
[WPA] tree SCC max chain length 43 (size 1)
[WPA] Compared 946614 SCCs, 775077 collisions (0.818789)
[WPA] Merged 943798 SCCs
[WPA] Merged 5253336 tree bodies
[WPA] Merged 590105 types
....
[WPA] Size of mmap'd section decls: 81262144 bytes
[WPA] Size of mmap'd section function_body: 14702611 bytes
[WPA] Size of mmap'd section ext_symtab: 0 bytes
[WPA] Size of mmap'd section refs: 733695 bytes
[WPA] Size of mmap'd section jmpfuncs: 2332150 bytes
[WPA] Size of mmap'd section pureconst: 170292 bytes
[WPA] Size of mmap'd section profile: 17986 bytes
[WPA] Size of mmap'd section symbol_nodes: 3393358 bytes
[WPA] Size of mmap'd section inline: 2567939 bytes
[WPA] Size of mmap'd section icf: 435633 bytes
[WPA] Size of mmap'd section lto: 4320 bytes
[WPA] Size of mmap'd section ipa_sra: 651824 bytes

so this results in about an 18% reduction in the global decl stream and a 20%
reduction in the function bodies stream (which is read mostly by ICF).

Martin, the zstd compression breaks the compression statistics (it works when
GCC is configured for zlib)

At first ltrans I get:

[LTRANS] Size of mmap'd section decls: 3734248 bytes
[LTRANS] Size of mmap'd section function_body: 4895962 bytes

... to ...

[LTRANS] Size of mmap'd section decls: 3479850 bytes
[LTRANS] Size of mmap'd section function_body: 3722935 bytes

So a 7% reduction of the global stream and a 24% reduction of function bodies.
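
The percentages quoted for these dumps depend on which size is taken as the
baseline.  A minimal sketch (the helper names are made up for illustration and
are not GCC code) that computes the change both relative to the old size and
relative to the new size:

```cpp
#include <cassert>

// Hypothetical helper: percentage by which a section shrank,
// measured against its old size.
double reduction_pct(double old_bytes, double new_bytes) {
  assert(old_bytes > 0);
  return 100.0 * (old_bytes - new_bytes) / old_bytes;
}

// Hypothetical helper: percentage by which the old section is larger
// than the new one, i.e. the same difference measured against the new size.
double overhead_pct(double old_bytes, double new_bytes) {
  assert(new_bytes > 0);
  return 100.0 * (old_bytes - new_bytes) / new_bytes;
}
```

For the LTRANS function_body section (4895962 to 3722935 bytes) the first
helper gives about 24% and the second about 31.5%.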

Stream-in seems to get about 3% faster and stream-out about 5%, but this is
close to the noise level of my experiment.  I expect bigger speedups on
Firefox, but I did not test it today since my Firefox setup broke again.
GCC is not a very good example of the problem with anonymous namespace
types since we do not have so many of them.

The size of object files in the gcc directory is reduced by 11% (because
hash numbers do not compress well, I guess).

The patch makes the DFS walk recognize trees that are not merged (anonymous
namespace, local function/variable decls, anonymous types etc).  As discussed
on IRC this is now done during the SCC walk rather than during the hash
computation.  When a local tree is discovered we know that the SCC components
of everything that is on the stack refer to it and thus are also local.
Moreover we mark trees in a hash set in the output block, so if we get a cross
edge referring to a local tree it gets marked too.

The patch also takes care of avoiding SCC wrappers around some trees.  In
particular:
 1) Singleton unmergeable SCCs are now streamed inline in the global decl
    stream.  This includes INTEGER_CSTs and IDENTIFIER_NODEs, which are shared
    by different code than the rest of tree merging.
 2) We use LTO_trees instead of LTO_tree_scc to wrap unmergeable SCC components.
    It is still necessary to mark them because of forward references.  LTO_trees
    has a simple header with the number of trees, and then things are streamed
    the same way as for LTO_tree_scc, that is, tree headers first followed by
    pickled references, so things may point to the future.

    Of course it is not necessary for LTO_tree_scc to be a single component and
    the streamer out may group more components together, but I decided not to
    snowball the patch even more.
 3) In local streams, when lto_output_tree is called and the topmost SCC
    component turns out to be a singleton, we stream the tree directly
    instead of LTO_tree_scc, hash code, pickled tree, reference to the just
    streamed tree.

    LTO_trees is used to wrap all trees needed to represent the tree being
    streamed.  It would again make sense to use only one LTO_trees rather than
    one per SCC, but I think this can be done incrementally.
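
The three cases above amount to a small decision made once per finished SCC.
A rough sketch of that decision follows; the enum, the struct and the function
name are invented for illustration, and the inputs are assumed to have been
computed by the DFS walk already:

```cpp
#include <cassert>

// Illustrative names only; the real tags are LTO_trees and LTO_tree_scc.
enum class Wrapper {
  DeferToCaller,  // nothing written here; the caller streams the tree inline
  Inline,         // tree written directly, no wrapper record
  LtoTrees,       // record with the number of trees, no hash
  LtoTreeScc      // record with the number of trees and the SCC hash
};

struct SccInfo {
  bool decls_section;       // is this the global decl stream?
  bool local_to_unit;       // SCC can not be merged across units
  unsigned size;            // number of trees in the SCC
  bool shared_elsewhere;    // IDENTIFIER_NODE or non-overflowed INTEGER_CST
  bool toplevel_singleton;  // the tree originally requested, alone in its SCC
};

Wrapper choose_wrapper(const SccInfo &s) {
  assert(s.size >= 1);
  if (!s.decls_section) {
    // Local streams never take part in tree merging.
    if (s.toplevel_singleton && s.size == 1)
      return Wrapper::DeferToCaller;
    return Wrapper::LtoTrees;
  }
  // Mergeable SCCs keep the hash, except singletons shared by other code.
  if (!s.local_to_unit && !(s.size == 1 && s.shared_elsewhere))
    return Wrapper::LtoTreeScc;
  // Non-trivial unmergeable SCCs still need a block for forward references.
  if (s.size != 1)
    return Wrapper::LtoTrees;
  return Wrapper::Inline;
}
```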

In general local trees are now recognized by the new predicate local_tree_p.

A bit subtle is the handling of TRANSLATION_UNIT_DECL, INTEGER_CST and
IDENTIFIER_NODE.

TRANSLATION_UNIT_DECL is a local tree, but references to it do not make
other trees local (because we also understand local decls now).
So I check for it later, after localness propagation is done.
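
To make the two-step check concrete, here is a toy version of the predicate
and of the later TRANSLATION_UNIT_DECL check.  The Code enum and the Tree
struct are simplified stand-ins invented for this sketch, not GCC's tree
representation, and the type case is reduced to a single flag:

```cpp
#include <cassert>

// Simplified stand-ins for tree codes and flags.
enum class Code {
  LabelDecl, NamespaceDecl, VarDecl, FunctionDecl,
  RecordType, TranslationUnitDecl, Other
};

struct Tree {
  Code code;
  bool named;         // has a DECL_NAME
  bool is_public;     // TREE_PUBLIC
  bool is_external;   // DECL_EXTERNAL
  bool anon_ns_type;  // main variant type living in an anonymous namespace
};

// Step 1: does T start locality propagation?  TRANSLATION_UNIT_DECL is
// deliberately absent, since every global decl refers to it.
bool propagates_locality(const Tree &t) {
  switch (t.code) {
    case Code::LabelDecl:
      return true;
    case Code::NamespaceDecl:
      return !t.named;
    case Code::VarDecl:
    case Code::FunctionDecl:
      return !t.is_public && !t.is_external;
    case Code::RecordType:
      return t.anon_ns_type;
    default:
      return false;
  }
}

// Step 2: decide how a singleton is streamed once propagation is done.
bool streamed_unshared(const Tree &t, bool local_after_propagation) {
  // A tree that starts propagation must come out of it as local.
  assert(!propagates_locality(t) || local_after_propagation);
  return local_after_propagation || t.code == Code::TranslationUnitDecl;
}
```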

INTEGER_CSTs and IDENTIFIER_NODEs are merged, but not via the tree merging
machinery.  So it makes sense to stream them as unmergeable trees, but we
still need to compute their hashes so that SCCs referring to them do not get
too large collision chains.  For this reason they are checked just prior to
stream-out.

lto-bootstrapped/regtested x86_64-linux, OK?

gcc/ChangeLog:

2020-05-19  Jan Hubicka  <hubicka@ucw.cz>

	* lto-streamer-in.c (lto_input_scc): Add SHARED_SCC parameter.
	(lto_input_tree_1): Strengthen sanity check.
	(lto_input_tree): Update call of lto_input_scc.
	* lto-streamer-out.c: Include ipa-utils.h.
	(create_output_block): Initialize local_trees if merging is going
	to happen.
	(destroy_output_block): Destroy local_trees.
	(DFS): Add max_local_entry.
	(local_tree_p): New function.
	(DFS::DFS): Initialize and maintain it.
	(DFS::DFS_write_tree): Decide on streaming format.
	(lto_output_tree): Stream inline singleton SCCs.
	* lto-streamer.h (enum LTO_tags): Add LTO_trees.
	(struct output_block): Add local_trees.
	(lto_input_scc): Update prototype.

gcc/lto/ChangeLog:

2020-05-19  Jan Hubicka  <hubicka@ucw.cz>

	* lto-common.c (compare_tree_sccs_1): Sanity check that we never
	read TRANSLATION_UNIT_DECL.
	(process_dref): Break out from ...
	(unify_scc): ... here.
	(process_new_tree): Break out from ...
	(lto_read_decls): ... here; handle streaming of singleton trees.
	(print_lto_report_1): Update statistics.

(cherry picked from commit 03d90a20a1afcbb9c30da8d4adf4922b0685061f)
(cherry picked from commit c6328b32770132efa004a3cad127cf74be84e911)
---
 gcc/lto-streamer-in.c  |  28 +++--
 gcc/lto-streamer-out.c | 146 +++++++++++++++++++++----
 gcc/lto-streamer.h     |   8 +-
 gcc/lto/lto-common.c   | 240 ++++++++++++++++++++++++-----------------
 4 files changed, 297 insertions(+), 125 deletions(-)

diff --git a/gcc/lto-streamer-in.c b/gcc/lto-streamer-in.c
index 244f5b8aa5c..85d0edf49a7 100644
--- a/gcc/lto-streamer-in.c
+++ b/gcc/lto-streamer-in.c
@@ -1424,16 +1424,17 @@ lto_read_tree (class lto_input_block *ib, class data_in *data_in,
 
 
 /* Populate the reader cache with trees materialized from the SCC
-   following in the IB, DATA_IN stream.  */
+   following in the IB, DATA_IN stream.
+   If SHARED_SCC is true we input LTO_tree_scc.  */
 
 hashval_t
 lto_input_scc (class lto_input_block *ib, class data_in *data_in,
-	       unsigned *len, unsigned *entry_len)
+	       unsigned *len, unsigned *entry_len, bool shared_scc)
 {
   /* A blob of unnamed tree nodes, fill the cache from it and
      recurse.  */
   unsigned size = streamer_read_uhwi (ib);
-  hashval_t scc_hash = streamer_read_uhwi (ib);
+  hashval_t scc_hash = shared_scc ? streamer_read_uhwi (ib) : 0;
   unsigned scc_entry_len = 1;
 
   if (size == 1)
@@ -1456,7 +1457,8 @@ lto_input_scc (class lto_input_block *ib, class data_in *data_in,
 	      || (tag >= LTO_field_decl_ref && tag <= LTO_global_decl_ref)
 	      || tag == LTO_tree_pickle_reference
 	      || tag == LTO_integer_cst
-	      || tag == LTO_tree_scc)
+	      || tag == LTO_tree_scc
+	      || tag == LTO_trees)
 	    gcc_unreachable ();
 
 	  result = streamer_alloc_tree (ib, data_in, tag);
@@ -1522,7 +1524,7 @@ lto_input_tree_1 (class lto_input_block *ib, class data_in *data_in,
 				 (a, len, TYPE_PRECISION (type)));
       streamer_tree_cache_append (data_in->reader_cache, result, hash);
     }
-  else if (tag == LTO_tree_scc)
+  else if (tag == LTO_tree_scc || tag == LTO_trees)
     gcc_unreachable ();
   else
     {
@@ -1538,11 +1540,11 @@ lto_input_tree (class lto_input_block *ib, class data_in *data_in)
 {
   enum LTO_tags tag;
 
-  /* Input and skip SCCs.  */
-  while ((tag = streamer_read_record_start (ib)) == LTO_tree_scc)
+  /* Input pickled trees needed to stream in the reference.  */
+  while ((tag = streamer_read_record_start (ib)) == LTO_trees)
     {
       unsigned len, entry_len;
-      lto_input_scc (ib, data_in, &len, &entry_len);
+      lto_input_scc (ib, data_in, &len, &entry_len, false);
 
       /* Register DECLs with the debuginfo machinery.  */
       while (!dref_queue.is_empty ())
@@ -1551,7 +1553,15 @@ lto_input_tree (class lto_input_block *ib, class data_in *data_in)
 	  debug_hooks->register_external_die (e.decl, e.sym, e.off);
 	}
     }
-  return lto_input_tree_1 (ib, data_in, tag, 0);
+  tree t = lto_input_tree_1 (ib, data_in, tag, 0);
+
+  if (!dref_queue.is_empty ())
+    {
+      dref_entry e = dref_queue.pop ();
+      debug_hooks->register_external_die (e.decl, e.sym, e.off);
+      gcc_checking_assert (dref_queue.is_empty ());
+    }
+  return t;
 }
 
 
diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index d5e2daa3a9e..2d9d8689459 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -46,6 +46,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-dfa.h"
 #include "file-prefix-map.h" /* remap_debug_filename()  */
 #include "output.h"
+#include "ipa-utils.h"
 
 
 static void lto_write_tree (struct output_block*, tree, bool);
@@ -75,6 +76,10 @@ create_output_block (enum lto_section_type section_type)
 
   ob->section_type = section_type;
   ob->decl_state = lto_get_out_decl_state ();
+  /* Only global decl stream in non-wpa will ever be considered by tree
+     merging.  */
+  if (!flag_wpa && section_type == LTO_section_decls)
+    ob->local_trees = new (hash_set <tree>);
   ob->main_stream = XCNEW (struct lto_output_stream);
   ob->string_stream = XCNEW (struct lto_output_stream);
   ob->writer_cache = streamer_tree_cache_create (!flag_wpa, true, false);
@@ -100,6 +105,7 @@ destroy_output_block (struct output_block *ob)
 
   delete ob->string_hash_table;
   ob->string_hash_table = NULL;
+  delete ob->local_trees;
 
   free (ob->main_stream);
   free (ob->string_stream);
@@ -532,6 +538,8 @@ private:
     bool ref_p;
     bool this_ref_p;
   };
+  /* Maximum index of scc stack containing a local tree.  */
+  int max_local_entry;
 
   static int scc_entry_compare (const void *, const void *);
 
@@ -550,6 +558,41 @@ private:
   struct obstack sccstate_obstack;
 };
 
+/* Return true if type can not be merged with structurally same tree in
+   other translation unit.  During stream out this information is propagated
+   to all trees referring to T and they are not streamed with additional
+   information needed by the tree merging in lto-common.c (in particular,
+   scc hash codes are not streamed).
+
+   TRANSLATION_UNIT_DECL is handled specially since references to it do
+   not make other trees local as well.  */
+
+static bool
+local_tree_p (tree t)
+{
+  switch (TREE_CODE (t))
+    {
+    case LABEL_DECL:
+      return true;
+    case NAMESPACE_DECL:
+      return !DECL_NAME (t);
+    case VAR_DECL:
+    case FUNCTION_DECL:
+      return !TREE_PUBLIC (t) && !DECL_EXTERNAL (t);
+    case RECORD_TYPE:
+    case UNION_TYPE:
+    case ENUMERAL_TYPE:
+      /* Anonymous namespace types are local.
+	 Only work hard for main variants;
+	 variant types will inherit locality.  */
+      return TYPE_MAIN_VARIANT (t) == t
+	     && odr_type_p (t) && type_with_linkage_p (t)
+	     && type_in_anonymous_namespace_p (t);
+    default:
+      return false;
+    }
+}
+
 /* Emit the physical representation of tree node EXPR to output block OB,
    using depth-first search on the subgraph.  If THIS_REF_P is true, the
    leaves of EXPR are emitted as references via lto_output_tree_ref.
@@ -560,6 +603,8 @@ DFS::DFS (struct output_block *ob, tree expr, bool ref_p, bool this_ref_p,
 	  bool single_p)
 {
   unsigned int next_dfs_num = 1;
+
+  max_local_entry = -1;
   gcc_obstack_init (&sccstate_obstack);
   DFS_write_tree (ob, NULL, expr, ref_p, this_ref_p);
   while (!worklist_vec.is_empty ())
@@ -586,6 +631,8 @@ DFS::DFS (struct output_block *ob, tree expr, bool ref_p, bool this_ref_p,
 	  scc_entry e = { expr, 0 };
 	  /* Not yet visited.  DFS recurse and push it onto the stack.  */
 	  *slot = cstate = XOBNEW (&sccstate_obstack, struct sccs);
+	  if (ob->local_trees && local_tree_p (expr))
+	    max_local_entry = sccstack.length ();
 	  sccstack.safe_push (e);
 	  cstate->dfsnum = next_dfs_num++;
 	  cstate->low = cstate->dfsnum;
@@ -640,7 +687,26 @@ DFS::DFS (struct output_block *ob, tree expr, bool ref_p, bool this_ref_p,
 	     any merging there.  */
 	  hashval_t scc_hash = 0;
 	  unsigned scc_entry_len = 0;
-	  if (!flag_wpa)
+	  bool local_to_unit = !ob->local_trees
+			       || max_local_entry >= (int)first;
+
+	  /* Remember that trees are local so info gets propagated to other
+	     SCCs.  */
+	  if (local_to_unit && ob->local_trees)
+	    {
+	      for (unsigned i = 0; i < size; ++i)
+		ob->local_trees->add (sccstack[first + i].t);
+	    }
+
+	  /* As a special case do not stream TRANSLATION_UNIT_DECL as shared
+	     tree.  We can not mark it local because references to it do not
+	     make other trees local (all global decls refer to it via
+	     CONTEXT).  */
+	  if (size == 1
+	      && TREE_CODE (sccstack[first].t) == TRANSLATION_UNIT_DECL)
+	    local_to_unit = true;
+
+	  if (!local_to_unit)
 	    {
 	      scc_hash = hash_scc (ob, first, size, ref_p, this_ref_p);
 
@@ -672,18 +738,47 @@ DFS::DFS (struct output_block *ob, tree expr, bool ref_p, bool this_ref_p,
 	      gcc_checking_assert (scc_entry_len == 1);
 	    }
 
-	  /* Write LTO_tree_scc.  */
-	  streamer_write_record_start (ob, LTO_tree_scc);
-	  streamer_write_uhwi (ob, size);
-	  streamer_write_uhwi (ob, scc_hash);
+	  worklist_vec.pop ();
+
+	  /* Only global decl sections are considered by tree merging.  */
+	  if (ob->section_type != LTO_section_decls)
+	    {
+	      /* If this is the original tree we stream and it forms SCC
+		 by itself then we do not need to stream SCC at all.  */
+	      if (worklist_vec.is_empty () && first == 0 && size == 1)
+		 return;
+	      streamer_write_record_start (ob, LTO_trees);
+	      streamer_write_uhwi (ob, size);
+	    }
+	  /* Write LTO_tree_scc if tree merging is going to be performed.  */
+	  else if (!local_to_unit
+		   /* These are special since sharing is not done by tree
+		      merging machinery.  We can not special case them earlier
+		      because we still need to compute hash for further sharing
+		      of trees referring to them.  */
+		   && (size != 1
+		       || (TREE_CODE (sccstack[first].t) != IDENTIFIER_NODE
+			   && (TREE_CODE (sccstack[first].t) != INTEGER_CST
+			       || TREE_OVERFLOW (sccstack[first].t)))))
+
+	    {
+	      gcc_checking_assert (ob->section_type == LTO_section_decls);
+	      streamer_write_record_start (ob, LTO_tree_scc);
+	      streamer_write_uhwi (ob, size);
+	      streamer_write_uhwi (ob, scc_hash);
+	    }
+	  /* Non-trivial SCCs must be packed to trees blocks so forward
+	     references work correctly.  */
+	  else if (size != 1)
+	    {
+	       streamer_write_record_start (ob, LTO_trees);
+	       streamer_write_uhwi (ob, size);
+	    }
 
 	  /* Write size-1 SCCs without wrapping them inside SCC bundles.
 	     All INTEGER_CSTs need to be handled this way as we need
 	     their type to materialize them.  Also builtins are handled
-	     this way.
-	     ???  We still wrap these in LTO_tree_scc so at the
-	     input side we can properly identify the tree we want
-	     to ultimatively return.  */
+	     this way.  */
 	  if (size == 1)
 	    lto_output_tree_1 (ob, expr, scc_hash, ref_p, this_ref_p);
 	  else
@@ -722,10 +817,11 @@ DFS::DFS (struct output_block *ob, tree expr, bool ref_p, bool this_ref_p,
 
 	  /* Finally truncate the vector.  */
 	  sccstack.truncate (first);
+	  if ((int)first <= max_local_entry)
+	    max_local_entry = first - 1;
 
 	  if (from_state)
 	    from_state->low = MIN (from_state->low, cstate->low);
-	  worklist_vec.pop ();
 	  continue;
 	}
 
@@ -1569,7 +1665,14 @@ DFS::DFS_write_tree (struct output_block *ob, sccs *from_state,
 
   /* Check if we already streamed EXPR.  */
   if (streamer_tree_cache_lookup (ob->writer_cache, expr, NULL))
-    return;
+    {
+      /* Reference to a local tree makes entry also local.  We always process
+	 top of stack entry, so set max to number of entries in stack - 1.  */
+      if (ob->local_trees
+	  && ob->local_trees->contains (expr))
+	max_local_entry = sccstack.length () - 1;
+      return;
+    }
 
   worklist w;
   w.expr = expr;
@@ -1641,15 +1744,20 @@ lto_output_tree (struct output_block *ob, tree expr,
       DFS (ob, expr, ref_p, this_ref_p, false);
       in_dfs_walk = false;
 
-      /* Finally append a reference to the tree we were writing.
-	 ???  If expr ended up as a singleton we could have
-	 inlined it here and avoid outputting a reference.  */
+      /* Finally append a reference to the tree we were writing.  */
       existed_p = streamer_tree_cache_lookup (ob->writer_cache, expr, &ix);
-      gcc_assert (existed_p);
-      streamer_write_record_start (ob, LTO_tree_pickle_reference);
-      streamer_write_uhwi (ob, ix);
-      streamer_write_enum (ob->main_stream, LTO_tags, LTO_NUM_TAGS,
-			   lto_tree_code_to_tag (TREE_CODE (expr)));
+
+      /* DFS walk above possibly skipped streaming EXPR itself to let us inline
+	 it.  */
+      if (!existed_p)
+	lto_output_tree_1 (ob, expr, 0, ref_p, this_ref_p);
+      else
+	{
+	  streamer_write_record_start (ob, LTO_tree_pickle_reference);
+	  streamer_write_uhwi (ob, ix);
+	  streamer_write_enum (ob->main_stream, LTO_tags, LTO_NUM_TAGS,
+			       lto_tree_code_to_tag (TREE_CODE (expr)));
+	}
       if (streamer_dump_file)
 	{
 	  print_node_brief (streamer_dump_file, "   Finished SCC of ",
diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
index 76aa6fe34b8..a466fb8b329 100644
--- a/gcc/lto-streamer.h
+++ b/gcc/lto-streamer.h
@@ -178,6 +178,9 @@ enum LTO_tags
   /* Special for global streamer.  A blob of unnamed tree nodes.  */
   LTO_tree_scc,
 
+  /* Sequence of trees.  */
+  LTO_trees,
+
   /* References to indexable tree nodes.  These objects are stored in
      tables that are written separately from the function bodies that
      reference them.  This way they can be instantiated even when the
@@ -751,6 +754,9 @@ struct output_block
   /* Cache of nodes written in this section.  */
   struct streamer_tree_cache_d *writer_cache;
 
+  /* All trees identified as local to the unit streamed.  */
+  hash_set<tree> *local_trees;
+
   /* All data persistent across whole duration of output block
      can go here.  */
   struct obstack obstack;
@@ -901,7 +907,7 @@ tree lto_input_tree_ref (class lto_input_block *, class data_in *,
 void lto_tag_check_set (enum LTO_tags, int, ...);
 void lto_init_eh (void);
 hashval_t lto_input_scc (class lto_input_block *, class data_in *,
-			 unsigned *, unsigned *);
+			 unsigned *, unsigned *, bool);
 tree lto_input_tree_1 (class lto_input_block *, class data_in *,
 		       enum LTO_tags, hashval_t hash);
 tree lto_input_tree (class lto_input_block *, class data_in *);
diff --git a/gcc/lto/lto-common.c b/gcc/lto/lto-common.c
index c8fd6f5c595..66858a88157 100644
--- a/gcc/lto/lto-common.c
+++ b/gcc/lto/lto-common.c
@@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "builtins.h"
 #include "lto-common.h"
 #include "tree-pretty-print.h"
+#include "print-tree.h"
 
 /* True when no new types are going to be streamd from the global stream.  */
 
@@ -1054,6 +1055,7 @@ static unsigned long num_prevailing_types;
 static unsigned long num_type_scc_trees;
 static unsigned long total_scc_size;
 static unsigned long num_sccs_read;
+static unsigned long num_unshared_trees_read;
 static unsigned long total_scc_size_merged;
 static unsigned long num_sccs_merged;
 static unsigned long num_scc_compares;
@@ -1088,6 +1090,10 @@ compare_tree_sccs_1 (tree t1, tree t2, tree **map)
   compare_values (TREE_CODE);
   code = TREE_CODE (t1);
 
+  /* If we end up comparing translation unit decls we either forgot to mark
+     some SCC as local or we compare too much.  */
+  gcc_checking_assert (code != TRANSLATION_UNIT_DECL);
+
   if (!TYPE_P (t1))
     {
       compare_values (TREE_SIDE_EFFECTS);
@@ -1626,6 +1632,28 @@ cmp_tree (const void *p1_, const void *p2_)
   return ((uintptr_t)p1[1] < (uintptr_t)p2[1]) ? -1 : 1;
 }
 
+/* New scc of size 1 containing T was streamed in from DATA_IN and not merged.
+   Register it to reader cache at index FROM.  */
+
+static void
+process_dref (class data_in *data_in, tree t, unsigned from)
+{
+  struct streamer_tree_cache_d *cache = data_in->reader_cache;
+  /* If we got a debug reference queued, see if the prevailing
+     tree has a debug reference and if not, register the one
+     for the tree we are about to throw away.  */
+  if (dref_queue.length () == 1)
+    {
+      dref_entry e = dref_queue.pop ();
+      gcc_assert (e.decl
+		  == streamer_tree_cache_get_tree (cache, from));
+      const char *sym;
+      unsigned HOST_WIDE_INT off;
+      if (!debug_hooks->die_ref_for_decl (t, &sym, &off))
+	debug_hooks->register_external_die (t, e.sym, e.off);
+    }
+}
+
 /* Try to unify the SCC with nodes FROM to FROM + LEN in CACHE and
    hash value SCC_HASH with an already recorded SCC.  Return true if
    that was successful, otherwise return false.  */
@@ -1646,22 +1674,16 @@ unify_scc (class data_in *data_in, unsigned from,
     {
       tree t = streamer_tree_cache_get_tree (cache, from + i);
       scc->entries[i] = t;
-      /* Do not merge SCCs with local entities inside them.  Also do
-	 not merge TRANSLATION_UNIT_DECLs and anonymous namespaces
-	 and types therein types.  */
-      if (TREE_CODE (t) == TRANSLATION_UNIT_DECL
-	  || (VAR_OR_FUNCTION_DECL_P (t)
-	      && !(TREE_PUBLIC (t) || DECL_EXTERNAL (t)))
-	  || TREE_CODE (t) == LABEL_DECL
-	  || (TREE_CODE (t) == NAMESPACE_DECL && !DECL_NAME (t))
-	  || (TYPE_P (t)
-	      && type_with_linkage_p (TYPE_MAIN_VARIANT (t))
-	      && type_in_anonymous_namespace_p (TYPE_MAIN_VARIANT (t))))
-	{
-	  /* Avoid doing any work for these cases and do not worry to
-	     record the SCCs for further merging.  */
-	  return false;
-	}
+      /* These types should be streamed as unshared.  */
+      gcc_checking_assert
+	 (!(TREE_CODE (t) == TRANSLATION_UNIT_DECL
+	    || (VAR_OR_FUNCTION_DECL_P (t)
+		&& !(TREE_PUBLIC (t) || DECL_EXTERNAL (t)))
+	    || TREE_CODE (t) == LABEL_DECL
+	    || (TREE_CODE (t) == NAMESPACE_DECL && !DECL_NAME (t))
+	    || (TYPE_P (t)
+		&& type_with_linkage_p (TYPE_MAIN_VARIANT (t))
+		&& type_in_anonymous_namespace_p (TYPE_MAIN_VARIANT (t)))));
     }
 
   /* Look for the list of candidate SCCs to compare against.  */
@@ -1712,21 +1734,7 @@ unify_scc (class data_in *data_in, unsigned from,
 	     to the tree node mapping computed by compare_tree_sccs.  */
 	  if (len == 1)
 	    {
-	      /* If we got a debug reference queued, see if the prevailing
-		 tree has a debug reference and if not, register the one
-		 for the tree we are about to throw away.  */
-	      if (dref_queue.length () == 1)
-		{
-		  dref_entry e = dref_queue.pop ();
-		  gcc_assert (e.decl
-			      == streamer_tree_cache_get_tree (cache, from));
-		  const char *sym;
-		  unsigned HOST_WIDE_INT off;
-		  if (!debug_hooks->die_ref_for_decl (pscc->entries[0], &sym,
-						      &off))
-		    debug_hooks->register_external_die (pscc->entries[0],
-							e.sym, e.off);
-		}
+	      process_dref (data_in, pscc->entries[0], from);
 	      lto_maybe_register_decl (data_in, pscc->entries[0], from);
 	      streamer_tree_cache_replace_tree (cache, pscc->entries[0], from);
 	    }
@@ -1785,7 +1793,65 @@ unify_scc (class data_in *data_in, unsigned from,
   return unified_p;
 }
 
+typedef int_hash<unsigned, 0, UINT_MAX> code_id_hash;
+
+/* Do registering necessary once new tree fully streamed in (including all
+   trees it refers to).  */
+
+static void
+process_new_tree (tree t, hash_map <code_id_hash, unsigned> *hm,
+		  unsigned index, unsigned *total, class data_in *data_in)
+{
+  /* Reconstruct the type variant and pointer-to/reference-to
+     chains.  */
+  if (TYPE_P (t))
+    {
+      /* Map the tree types to their frequencies.  */
+      if (flag_lto_dump_type_stats)
+	{
+	  unsigned key = (unsigned) TREE_CODE (t);
+	  unsigned *countp = hm->get (key);
+	  hm->put (key, countp ? (*countp) + 1 : 1);
+	  (*total)++;
+	}
+
+      num_prevailing_types++;
+      lto_fixup_prevailing_type (t);
 
+      /* Compute the canonical type of all non-ODR types.
+	 Delay ODR types for the end of merging process - the canonical
+	 type for those can be computed using the (unique) name however
+	 we want to do this only if units in other languages do not
+	 contain structurally equivalent type.
+
+	 Because SCC components are streamed in random (hash) order
+	 we may have encountered the type before while registering
+	 type canonical of a derived type in the same SCC.  */
+      if (!TYPE_CANONICAL (t))
+	{
+	  if (!RECORD_OR_UNION_TYPE_P (t)
+	      || !TYPE_CXX_ODR_P (t))
+	    gimple_register_canonical_type (t);
+	  else if (COMPLETE_TYPE_P (t))
+	    vec_safe_push (types_to_register, t);
+	}
+      if (TYPE_MAIN_VARIANT (t) == t && odr_type_p (t))
+	register_odr_type (t);
+    }
+  /* Link shared INTEGER_CSTs into TYPE_CACHED_VALUEs of its
+     type which is also member of this SCC.  */
+  if (TREE_CODE (t) == INTEGER_CST
+      && !TREE_OVERFLOW (t))
+    cache_integer_cst (t);
+  if (!flag_ltrans)
+    {
+      lto_maybe_register_decl (data_in, t, index);
+      /* Scan the tree for references to global functions or
+	 variables and record those for later fixup.  */
+      if (mentions_vars_p (t))
+	vec_safe_push (tree_with_vars, t);
+    }
+}
 
 /* Read all the symbols from buffer DATA, using descriptors in DECL_DATA.
    RESOLUTIONS is the set of symbols picked by the linker (read from the
@@ -1813,7 +1879,6 @@ lto_read_decls (struct lto_file_decl_data *decl_data, const void *data,
   /* We do not uniquify the pre-loaded cache entries, those are middle-end
      internal types that should not be merged.  */
 
-  typedef int_hash<unsigned, 0, UINT_MAX> code_id_hash;
   hash_map <code_id_hash, unsigned> hm;
   unsigned total = 0;
 
@@ -1824,31 +1889,41 @@ lto_read_decls (struct lto_file_decl_data *decl_data, const void *data,
       unsigned from = data_in->reader_cache->nodes.length ();
       /* Read and uniquify SCCs as in the input stream.  */
       enum LTO_tags tag = streamer_read_record_start (&ib_main);
-      if (tag == LTO_tree_scc)
+      if (tag == LTO_tree_scc || tag == LTO_trees)
 	{
 	  unsigned len_;
 	  unsigned scc_entry_len;
+
+	  /* Because we stream in SCC order we know that all unshared trees
+	     are now fully streamed.  Process them.  */
 	  hashval_t scc_hash = lto_input_scc (&ib_main, data_in, &len_,
-					      &scc_entry_len);
+					      &scc_entry_len,
+					      tag == LTO_tree_scc);
 	  unsigned len = data_in->reader_cache->nodes.length () - from;
 	  gcc_assert (len == len_);
 
-	  total_scc_size += len;
-	  num_sccs_read++;
+	  if (tag == LTO_tree_scc)
+	    {
+	      total_scc_size += len;
+	      num_sccs_read++;
+	    }
+	  else
+	    num_unshared_trees_read += len;
 
 	  /* We have the special case of size-1 SCCs that are pre-merged
 	     by means of identifier and string sharing for example.
 	     ???  Maybe we should avoid streaming those as SCCs.  */
 	  tree first = streamer_tree_cache_get_tree (data_in->reader_cache,
 						     from);
-	  if (len == 1
-	      && (TREE_CODE (first) == IDENTIFIER_NODE
-		  || (TREE_CODE (first) == INTEGER_CST
-		      && !TREE_OVERFLOW (first))))
-	    continue;
+	  /* Identifier and integers are shared specially, they should never
+	     go by the tree merging path.  */
+	  gcc_checking_assert ((TREE_CODE (first) != IDENTIFIER_NODE
+				&& (TREE_CODE (first) != INTEGER_CST
+				    || TREE_OVERFLOW (first)))
+			       || len != 1);
 
 	  /* Try to unify the SCC with already existing ones.  */
-	  if (!flag_ltrans
+	  if (!flag_ltrans && tag != LTO_trees
 	      && unify_scc (data_in, from,
 			    len, scc_entry_len, scc_hash))
 	    continue;
@@ -1862,56 +1937,9 @@ lto_read_decls (struct lto_file_decl_data *decl_data, const void *data,
 	    {
 	      tree t = streamer_tree_cache_get_tree (data_in->reader_cache,
 						     from + i);
-	      /* Reconstruct the type variant and pointer-to/reference-to
-		 chains.  */
+	      process_new_tree (t, &hm, from + i, &total, data_in);
 	      if (TYPE_P (t))
-		{
-		  /* Map the tree types to their frequencies.  */
-		  if (flag_lto_dump_type_stats)
-		    {
-		      unsigned key = (unsigned) TREE_CODE (t);
-		      unsigned *countp = hm.get (key);
-		      hm.put (key, countp ? (*countp) + 1 : 1);
-		      total++;
-		    }
-
-		  seen_type = true;
-		  num_prevailing_types++;
-		  lto_fixup_prevailing_type (t);
-
-		  /* Compute the canonical type of all non-ODR types.
-		     Delay ODR types for the end of merging process - the canonical
-		     type for those can be computed using the (unique) name however
-		     we want to do this only if units in other languages do not
-		     contain structurally equivalent type.
-
-		     Because SCC components are streamed in random (hash) order
-		     we may have encountered the type before while registering
-		     type canonical of a derived type in the same SCC.  */
-		  if (!TYPE_CANONICAL (t))
-		    {
-		      if (!RECORD_OR_UNION_TYPE_P (t)
-			  || !TYPE_CXX_ODR_P (t))
-		        gimple_register_canonical_type (t);
-		      else if (COMPLETE_TYPE_P (t))
-			vec_safe_push (types_to_register, t);
-		    }
-		  if (TYPE_MAIN_VARIANT (t) == t && odr_type_p (t))
-		    register_odr_type (t);
-		}
-	      /* Link shared INTEGER_CSTs into TYPE_CACHED_VALUEs of its
-		 type which is also member of this SCC.  */
-	      if (TREE_CODE (t) == INTEGER_CST
-		  && !TREE_OVERFLOW (t))
-		cache_integer_cst (t);
-	      if (!flag_ltrans)
-		{
-		  lto_maybe_register_decl (data_in, t, from + i);
-		  /* Scan the tree for references to global functions or
-		     variables and record those for later fixup.  */
-		  if (mentions_vars_p (t))
-		    vec_safe_push (tree_with_vars, t);
-		}
+		seen_type = true;
 	    }
 
 	  /* Register DECLs with the debuginfo machinery.  */
@@ -1926,9 +1954,26 @@ lto_read_decls (struct lto_file_decl_data *decl_data, const void *data,
 	}
       else
 	{
-	  /* Pickle stray references.  */
 	  t = lto_input_tree_1 (&ib_main, data_in, tag, 0);
-	  gcc_assert (t && data_in->reader_cache->nodes.length () == from);
+	  /* We streamed in new tree.  Add it to cache and process dref.  */
+	  if (data_in->reader_cache->nodes.length () == from + 1)
+	    {
+	      num_unshared_trees_read++;
+	      data_in->location_cache.accept_location_cache ();
+	      process_dref (data_in, t, from);
+	      if (TREE_CODE (t) == IDENTIFIER_NODE
+		  || (TREE_CODE (t) == INTEGER_CST
+		      && !TREE_OVERFLOW (t)))
+		;
+	      else
+		{
+		  lto_maybe_register_decl (data_in, t, from);
+		  process_new_tree (t, &hm, from, &total, data_in);
+		}
+	    }
+	  else
+	    /* FIXME: It seems useless to pickle stray references.  */
+	    gcc_assert (data_in->reader_cache->nodes.length () == from);
 	}
     }
 
@@ -2953,10 +2998,13 @@ print_lto_report_1 (void)
   const char *pfx = (flag_lto) ? "LTO" : (flag_wpa) ? "WPA" : "LTRANS";
   fprintf (stderr, "%s statistics\n", pfx);
 
-  fprintf (stderr, "[%s] read %lu SCCs of average size %f\n",
+  fprintf (stderr, "[%s] read %lu unshared trees\n",
+	   pfx, num_unshared_trees_read);
+  fprintf (stderr, "[%s] read %lu mergeable SCCs of average size %f\n",
 	   pfx, num_sccs_read, total_scc_size / (double)num_sccs_read);
-  fprintf (stderr, "[%s] %lu tree bodies read in total\n", pfx, total_scc_size);
-  if (flag_wpa && tree_scc_hash)
+  fprintf (stderr, "[%s] %lu tree bodies read in total\n", pfx,
+	   total_scc_size + num_unshared_trees_read);
+  if (flag_wpa && tree_scc_hash && num_sccs_read)
     {
       fprintf (stderr, "[%s] tree SCC table: size %ld, %ld elements, "
 	       "collision ratio: %f\n", pfx,
-- 
2.20.1


From b5193950495fe9a5e86d2204ee349a946fb0a979 Mon Sep 17 00:00:00 2001
From: Jan Hubicka <jh@suse.cz>
Date: Fri, 22 May 2020 12:29:19 +0200
Subject: [PATCH 2/5] Simplify streaming of SCC components

This patch saves a few bytes in SCC streaming.  First, we stream end markers
that are fully ignored at stream-in time.
Second, I missed the streaming of entry_len in the previous change, so it is
pointlessly streamed for LTO_trees.  Moreover, entry_len is almost always 1
(always during GCC bootstrap), so it makes sense to avoid streaming it
in the majority of cases.

gcc/ChangeLog:

2020-05-21  Jan Hubicka  <hubicka@ucw.cz>

	* lto-streamer-in.c (lto_read_tree): Do not stream end markers.
	(lto_input_scc): Optimize streaming of entry lengths.
	* lto-streamer-out.c (lto_write_tree): Do not stream end markers.
	(DFS::DFS): Optimize streaming of entry lengths.

(cherry picked from commit 47273df0bcdd552385f25049dce71943aac8321e)
(cherry picked from commit f55838d8ec18a499ff2f521bea66bbb54f70c56b)
---
 gcc/lto-streamer-in.c  | 17 +++++++++--------
 gcc/lto-streamer-out.c | 19 +++++++------------
 2 files changed, 16 insertions(+), 20 deletions(-)
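
As a rough illustration of the header encoding described in the commit
message above, here is a minimal sketch in C.  It is not the GCC
implementation: toy_stream, toy_write and toy_read are hypothetical
stand-ins for the streamer and its uhwi read/write calls.  Only the packing
rule is taken from the patch (size * 2 plus a flag bit, with entry_len
written only when it differs from 1).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the LTO streamer: a flat array of words.  */
struct toy_stream
{
  uint64_t words[8];
  size_t n;    /* Number of words written so far.  */
  size_t pos;  /* Read cursor.  */
};

static void
toy_write (struct toy_stream *s, uint64_t v)
{
  s->words[s->n++] = v;
}

static uint64_t
toy_read (struct toy_stream *s)
{
  return s->words[s->pos++];
}

/* Writer side: the low bit of the size word records whether an explicit
   entry_len follows.  In the common case (entry_len == 1) it is omitted.  */
static void
write_scc_header (struct toy_stream *s, unsigned size, unsigned entry_len,
		  uint64_t hash)
{
  toy_write (s, (uint64_t) size * 2 + (entry_len != 1));
  if (entry_len != 1)
    toy_write (s, entry_len);
  toy_write (s, hash);
}

/* Reader side: mirror the writer.  entry_len defaults to 1 and is read
   only when the flag bit is set; the real size is the packed word / 2.  */
static void
read_scc_header (struct toy_stream *s, unsigned *size, unsigned *entry_len,
		 uint64_t *hash)
{
  uint64_t packed = toy_read (s);

  *entry_len = 1;
  if (packed & 1)
    *entry_len = (unsigned) toy_read (s);
  *size = (unsigned) (packed / 2);
  *hash = toy_read (s);
}
```

In this toy model a header with entry_len == 1 occupies two words (packed
size and hash), while one with any other entry_len occupies three.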

diff --git a/gcc/lto-streamer-in.c b/gcc/lto-streamer-in.c
index 85d0edf49a7..d0532c5ac51 100644
--- a/gcc/lto-streamer-in.c
+++ b/gcc/lto-streamer-in.c
@@ -1417,8 +1417,6 @@ lto_read_tree (class lto_input_block *ib, class data_in *data_in,
 
   lto_read_tree_1 (ib, data_in, result);
 
-  /* end_marker = */ streamer_read_uchar (ib);
-
   return result;
 }
 
@@ -1431,12 +1429,18 @@ hashval_t
 lto_input_scc (class lto_input_block *ib, class data_in *data_in,
 	       unsigned *len, unsigned *entry_len, bool shared_scc)
 {
-  /* A blob of unnamed tree nodes, fill the cache from it and
-     recurse.  */
   unsigned size = streamer_read_uhwi (ib);
-  hashval_t scc_hash = shared_scc ? streamer_read_uhwi (ib) : 0;
+  hashval_t scc_hash = 0;
   unsigned scc_entry_len = 1;
 
+  if (shared_scc)
+    {
+      if (size & 1)
+	scc_entry_len = streamer_read_uhwi (ib);
+      size /= 2;
+      scc_hash = streamer_read_uhwi (ib);
+    }
+
   if (size == 1)
     {
       enum LTO_tags tag = streamer_read_record_start (ib);
@@ -1447,8 +1451,6 @@ lto_input_scc (class lto_input_block *ib, class data_in *data_in,
       unsigned int first = data_in->reader_cache->nodes.length ();
       tree result;
 
-      scc_entry_len = streamer_read_uhwi (ib);
-
       /* Materialize size trees by reading their headers.  */
       for (unsigned i = 0; i < size; ++i)
 	{
@@ -1471,7 +1473,6 @@ lto_input_scc (class lto_input_block *ib, class data_in *data_in,
 	  result = streamer_tree_cache_get_tree (data_in->reader_cache,
 						 first + i);
 	  lto_read_tree_1 (ib, data_in, result);
-	  /* end_marker = */ streamer_read_uchar (ib);
 	}
     }
 
diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index 2d9d8689459..2c3d16799b4 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -473,9 +473,6 @@ lto_write_tree (struct output_block *ob, tree expr, bool ref_p)
   streamer_write_tree_header (ob, expr);
 
   lto_write_tree_1 (ob, expr, ref_p);
-
-  /* Mark the end of EXPR.  */
-  streamer_write_zero (ob);
 }
 
 /* Emit the physical representation of tree node EXPR to output block OB,
@@ -764,7 +761,12 @@ DFS::DFS (struct output_block *ob, tree expr, bool ref_p, bool this_ref_p,
 	    {
 	      gcc_checking_assert (ob->section_type == LTO_section_decls);
 	      streamer_write_record_start (ob, LTO_tree_scc);
-	      streamer_write_uhwi (ob, size);
+	      /* In vast majority of cases scc_entry_len is 1 and size is small
+		 integer.  Use extra bit of size to stream info about
+		 exceptions.  */
+	      streamer_write_uhwi (ob, size * 2 + (scc_entry_len != 1));
+	      if (scc_entry_len != 1)
+		streamer_write_uhwi (ob, scc_entry_len);
 	      streamer_write_uhwi (ob, scc_hash);
 	    }
 	  /* Non-trivial SCCs must be packed to trees blocks so forward
@@ -783,8 +785,6 @@ DFS::DFS (struct output_block *ob, tree expr, bool ref_p, bool this_ref_p,
 	    lto_output_tree_1 (ob, expr, scc_hash, ref_p, this_ref_p);
 	  else
 	    {
-	      /* Write the size of the SCC entry candidates.  */
-	      streamer_write_uhwi (ob, scc_entry_len);
 
 	      /* Write all headers and populate the streamer cache.  */
 	      for (unsigned i = 0; i < size; ++i)
@@ -807,12 +807,7 @@ DFS::DFS (struct output_block *ob, tree expr, bool ref_p, bool this_ref_p,
 
 	      /* Write the bitpacks and tree references.  */
 	      for (unsigned i = 0; i < size; ++i)
-		{
-		  lto_write_tree_1 (ob, sccstack[first+i].t, ref_p);
-
-		  /* Mark the end of the tree.  */
-		  streamer_write_zero (ob);
-		}
+		lto_write_tree_1 (ob, sccstack[first+i].t, ref_p);
 	    }
 
 	  /* Finally truncate the vector.  */
-- 
2.20.1


From a783662c8a0e90d7b3b9534df981927c9537b1ea Mon Sep 17 00:00:00 2001
From: Jan Hubicka <jh@suse.cz>
Date: Fri, 22 May 2020 12:31:34 +0200
Subject: [PATCH 3/5] Fix hashing of prestreamed nodes

This patch seems to eliminate essentially all SCC hash collisions while
building cc1.  The statistics change from:

[WPA] read 3312246 unshared trees
[WPA] read 1144381 mergeable SCCs of average size 4.833785
[WPA] 8843938 tree bodies read in total
[WPA] tree SCC table: size 524287, 197767 elements, collision ratio: 0.506446
[WPA] tree SCC max chain length 43 (size 1)
[WPA] Compared 946614 SCCs, 775077 collisions (0.818789)

to

[WPA] read 3314520 unshared trees
[WPA] read 1144763 mergeable SCCs of average size 4.835021
[WPA] 8849473 tree bodies read in total
[WPA] tree SCC table: size 524287, 200574 elements, collision ratio: 0.486418
[WPA] tree SCC max chain length 2 (size 1)
[WPA] Compared 944189 SCCs, 179 collisions (0.000190)

The problem is that preloaded nodes all have hash code 0 because
cache->nodes.length is not updated while streaming out.

I also added an arbitrary constant to avoid a clash with the constant 0 used
to hash NULL pointers and 1 used to hash pointers inside an SCC.

	* tree-streamer.c (record_common_node): Fix hash value of pre-streamed
	nodes.

(cherry picked from commit 1089a367c4b05b5e3f072adca8913904ed65928c)
(cherry picked from commit 5c7ed52a039cfafb63c27fa8c1535e02dfc1ae09)
---
 gcc/tree-streamer.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)
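
To make the effect described in the commit message above concrete, here is
a minimal sketch in C.  It is a simplified model and not the GCC data
structure: toy_cache, toy_append and TOY_HASH_BIAS are hypothetical names.
The model assumes what the message states, namely that the node vector
length stays 0 on the stream-out side, and it additionally assumes that
next_idx advances on every append.

```c
#include <assert.h>

/* Hypothetical, simplified model of the streamer cache.  */
struct toy_cache
{
  unsigned nodes_length;  /* Assumed to stay 0 while streaming out.  */
  unsigned next_idx;      /* Assumed to advance on every append.  */
};

/* Same constant as in the patch; it keeps the hashes of preloaded nodes
   away from the small integers 0 and 1 used for NULL pointers and for
   pointers inside an SCC.  */
enum { TOY_HASH_BIAS = 0xc001 };

/* Append one preloaded node and return the hash recorded for it.
   USE_NEW selects the hash from the patch; otherwise the old one.  */
static unsigned
toy_append (struct toy_cache *c, int use_new)
{
  unsigned h = use_new ? c->next_idx + TOY_HASH_BIAS : c->nodes_length;

  c->next_idx++;
  return h;
}
```

Under these assumptions the old scheme gives every preloaded node the hash
0, while the new one gives consecutive distinct values starting at 0xc001.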

diff --git a/gcc/tree-streamer.c b/gcc/tree-streamer.c
index f6181fafc4c..b0afa1dc6c0 100644
--- a/gcc/tree-streamer.c
+++ b/gcc/tree-streamer.c
@@ -299,10 +299,11 @@ record_common_node (struct streamer_tree_cache_d *cache, tree node)
   if (!node)
     node = error_mark_node;
 
-  /* ???  FIXME, devise a better hash value.  But the hash needs to be equal
-     for all frontend and lto1 invocations.  So just use the position
-     in the cache as hash value.  */
-  streamer_tree_cache_append (cache, node, cache->nodes.length ());
+  /* This hash needs to be equal for all frontend and lto1 invocations.  So
+     just use the position in the cache as hash value.
+     Small integers are used by hash_tree to record positions within scc
+     hash. Values are not in same range.  */
+  streamer_tree_cache_append (cache, node, cache->next_idx + 0xc001);
 
   switch (TREE_CODE (node))
     {
-- 
2.20.1


From 5a66de7d2f197b89dd595f6b9412ac2a0c184bd6 Mon Sep 17 00:00:00 2001
From: Jan Hubicka <jh@suse.cz>
Date: Fri, 22 May 2020 16:37:06 +0200
Subject: [PATCH 4/5] Avoid streaming stray references.

This patch avoids streaming completely useless stray references to the
global decl stream.  I am re-testing the patch (rebased to current tree) on
x86_64-linux and intend to commit once testing finishes.

gcc/ChangeLog:

2020-05-22  Jan Hubicka  <hubicka@ucw.cz>

	* lto-streamer-out.c (lto_output_tree): Do not stream final ref if
	it is not needed.

gcc/lto/ChangeLog:

2020-05-22  Jan Hubicka  <hubicka@ucw.cz>

	* lto-common.c (lto_read_decls): Do not skip stray refs.

(cherry picked from commit bcb63eb2cbd3caf212b9cf42d8c218c09dc6ff8b)
(cherry picked from commit 57400cf273f8052c601d90d86a47705faa17aaa9)
---
 gcc/lto-streamer-out.c |  2 +-
 gcc/lto/lto-common.c   | 28 +++++++++++-----------------
 2 files changed, 12 insertions(+), 18 deletions(-)

diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index 2c3d16799b4..74ea872e511 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -1746,7 +1746,7 @@ lto_output_tree (struct output_block *ob, tree expr,
 	 it.  */
       if (!existed_p)
 	lto_output_tree_1 (ob, expr, 0, ref_p, this_ref_p);
-      else
+      else if (this_ref_p)
 	{
 	  streamer_write_record_start (ob, LTO_tree_pickle_reference);
 	  streamer_write_uhwi (ob, ix);
diff --git a/gcc/lto/lto-common.c b/gcc/lto/lto-common.c
index 66858a88157..bcc0919dfc6 100644
--- a/gcc/lto/lto-common.c
+++ b/gcc/lto/lto-common.c
@@ -1955,25 +1955,19 @@ lto_read_decls (struct lto_file_decl_data *decl_data, const void *data,
       else
 	{
 	  t = lto_input_tree_1 (&ib_main, data_in, tag, 0);
-	  /* We streamed in new tree.  Add it to cache and process dref.  */
-	  if (data_in->reader_cache->nodes.length () == from + 1)
+	  gcc_assert (data_in->reader_cache->nodes.length () == from + 1);
+	  num_unshared_trees_read++;
+	  data_in->location_cache.accept_location_cache ();
+	  process_dref (data_in, t, from);
+	  if (TREE_CODE (t) == IDENTIFIER_NODE
+	      || (TREE_CODE (t) == INTEGER_CST
+		  && !TREE_OVERFLOW (t)))
+	    ;
+	  else
 	    {
-	      num_unshared_trees_read++;
-	      data_in->location_cache.accept_location_cache ();
-	      process_dref (data_in, t, from);
-	      if (TREE_CODE (t) == IDENTIFIER_NODE
-		  || (TREE_CODE (t) == INTEGER_CST
-		      && !TREE_OVERFLOW (t)))
-		;
-	      else
-		{
-		  lto_maybe_register_decl (data_in, t, from);
-		  process_new_tree (t, &hm, from, &total, data_in);
-		}
+	      lto_maybe_register_decl (data_in, t, from);
+	      process_new_tree (t, &hm, from, &total, data_in);
 	    }
-	  else
-	    /* FIXME: It seems useless to pickle stray references.  */
-	    gcc_assert (data_in->reader_cache->nodes.length () == from);
 	}
     }
 
-- 
2.20.1


From 59877e70b25c164761e641b703fafe7cb872acb9 Mon Sep 17 00:00:00 2001
From: Jan Hubicka <jh@suse.cz>
Date: Mon, 25 May 2020 14:41:33 +0200
Subject: [PATCH 5/5] Do not stream redundant stuff

As discussed on IRC, this adds a knob to disable stuff we stream "just for
fun" (or to make it easier to debug streamer desynchronization).

The size of .o files in the gcc subdirectory is reduced from 506MB to 492MB.

gcc/

	* lto-streamer-out.c (lto_output_tree): Add streamer_debugging check.
	* lto-streamer.h (streamer_debugging): New constant
	* tree-streamer-in.c (streamer_read_tree_bitfields): Add
	streamer_debugging check.
	(streamer_get_pickled_tree): Likewise.
	* tree-streamer-out.c (pack_ts_base_value_fields): Likewise.

(cherry picked from commit a746f952abb78af9db28a7f3bce442e113877c9c)
(cherry picked from commit 9d679bd7b621c93c000b7d548f34f3438e0d57a8)
---
 gcc/lto-streamer-out.c  |  5 +++--
 gcc/lto-streamer.h      |  3 +++
 gcc/tree-streamer-in.c  | 21 ++++++++++++++-------
 gcc/tree-streamer-out.c |  3 ++-
 4 files changed, 22 insertions(+), 10 deletions(-)
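
The knob only works if the writer and the reader test the same constant,
since otherwise the two sides would disagree about the stream layout.  A
minimal sketch of that symmetry in C follows.  It is not GCC code: toy_buf,
toy_write_ref, toy_read_ref and toy_streamer_debugging are hypothetical
names that mirror the role of streamer_debugging in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical counterpart of streamer_debugging: a compile-time constant
   that changes the stream format when set to true.  */
static const bool toy_streamer_debugging = false;

/* Hypothetical stand-in for a stream of unsigned words.  */
struct toy_buf
{
  unsigned data[4];
  size_t n;    /* Number of words written so far.  */
  size_t pos;  /* Read cursor.  */
};

/* Writer: always emit the cache index; emit the redundant tag only when
   debugging is enabled.  */
static void
toy_write_ref (struct toy_buf *b, unsigned ix, unsigned tag)
{
  b->data[b->n++] = ix;
  if (toy_streamer_debugging)
    b->data[b->n++] = tag;
}

/* Reader: consume exactly what the writer produced.  The tag is read and
   checked only under the same condition the writer used.  */
static unsigned
toy_read_ref (struct toy_buf *b, unsigned expected_tag)
{
  unsigned ix = b->data[b->pos++];

  if (toy_streamer_debugging)
    {
      unsigned tag = b->data[b->pos++];
      assert (tag == expected_tag);
      (void) tag;
    }
  (void) expected_tag;
  return ix;
}
```

With the constant set to false, each reference in this toy model costs one
word instead of two, and the reader stays in sync because it skips the same
field the writer skipped.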

diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index 74ea872e511..61c21d181fa 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -1710,8 +1710,9 @@ lto_output_tree (struct output_block *ob, tree expr,
 	 will instantiate two different nodes for the same object.  */
       streamer_write_record_start (ob, LTO_tree_pickle_reference);
       streamer_write_uhwi (ob, ix);
-      streamer_write_enum (ob->main_stream, LTO_tags, LTO_NUM_TAGS,
-			   lto_tree_code_to_tag (TREE_CODE (expr)));
+      if (streamer_debugging)
+	streamer_write_enum (ob->main_stream, LTO_tags, LTO_NUM_TAGS,
+			     lto_tree_code_to_tag (TREE_CODE (expr)));
       lto_stats.num_pickle_refs_output++;
     }
   else
diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
index a466fb8b329..93da3c66fd8 100644
--- a/gcc/lto-streamer.h
+++ b/gcc/lto-streamer.h
@@ -125,6 +125,9 @@ along with GCC; see the file COPYING3.  If not see
 
 typedef unsigned char	lto_decl_flags_t;
 
+/* Stream additional data to LTO object files to make it easier to debug
+   streaming code.  This changes object files.  */
+static const bool streamer_debugging = false;
 
 /* Tags representing the various IL objects written to the bytecode file
    (GIMPLE statements, basic blocks, EH regions, tree nodes, etc).
diff --git a/gcc/tree-streamer-in.c b/gcc/tree-streamer-in.c
index 5b653cdf5ad..a3aea3d811f 100644
--- a/gcc/tree-streamer-in.c
+++ b/gcc/tree-streamer-in.c
@@ -487,9 +487,13 @@ streamer_read_tree_bitfields (class lto_input_block *ib,
 
   /* The first word in BP contains the code of the tree that we
      are about to read.  */
-  code = (enum tree_code) bp_unpack_value (&bp, 16);
-  lto_tag_check (lto_tree_code_to_tag (code),
-		 lto_tree_code_to_tag (TREE_CODE (expr)));
+  if (streamer_debugging)
+    {
+      code = (enum tree_code) bp_unpack_value (&bp, 16);
+      lto_tag_check (lto_tree_code_to_tag (code),
+		     lto_tree_code_to_tag (TREE_CODE (expr)));
+    }
+  code = TREE_CODE (expr);
 
   /* Note that all these functions are highly sensitive to changes in
      the types and sizes of each of the fields being packed.  */
@@ -1107,11 +1111,14 @@ streamer_get_pickled_tree (class lto_input_block *ib, class data_in *data_in)
   enum LTO_tags expected_tag;
 
   ix = streamer_read_uhwi (ib);
-  expected_tag = streamer_read_enum (ib, LTO_tags, LTO_NUM_TAGS);
-
   result = streamer_tree_cache_get_tree (data_in->reader_cache, ix);
-  gcc_assert (result
-              && TREE_CODE (result) == lto_tag_to_tree_code (expected_tag));
+
+  if (streamer_debugging)
+    {
+      expected_tag = streamer_read_enum (ib, LTO_tags, LTO_NUM_TAGS);
+      gcc_assert (result
+		  && TREE_CODE (result) == lto_tag_to_tree_code (expected_tag));
+    }
 
   return result;
 }
diff --git a/gcc/tree-streamer-out.c b/gcc/tree-streamer-out.c
index 0b54ffb63f3..7bc97196cf1 100644
--- a/gcc/tree-streamer-out.c
+++ b/gcc/tree-streamer-out.c
@@ -70,7 +70,8 @@ write_identifier (struct output_block *ob,
 static inline void
 pack_ts_base_value_fields (struct bitpack_d *bp, tree expr)
 {
-  bp_pack_value (bp, TREE_CODE (expr), 16);
+  if (streamer_debugging)
+    bp_pack_value (bp, TREE_CODE (expr), 16);
   if (!TYPE_P (expr))
     {
       bp_pack_value (bp, TREE_SIDE_EFFECTS (expr), 1);
-- 
2.20.1


[-- Attachment #3: patch2b --]
[-- Type: text/plain, Size: 36768 bytes --]

From 11821bc4536a4fafda73c4130961fa5fa445b24a Mon Sep 17 00:00:00 2001
From: Jan Hubicka <jh@suse.cz>
Date: Wed, 3 Jun 2020 21:16:43 +0200
Subject: [PATCH 1/2] Optimize ODR enum streaming

It turns out that half of the global decl stream of a cc1 LTO build consists
of TREE_LISTs, identifiers and integer constants representing TYPE_VALUES of
enums.  Those are streamed only to produce ODR warnings and are otherwise
unused, so this patch moves the info to a separate section that is
represented and streamed more efficiently.

This also adds a place for more info that may be used for ODR diagnostics
(at the moment we do not warn when the declarations differ, e.g., in the
associated member functions and their types) and for the type inheritance
graph, rather than polluting the global stream.

I was a bit unsure which enums we want to store in the section.  Storing all
parsed enums is probably too expensive, and collecting only those enums
streamed to represent IL is a bit hard, so I went for those seen by free
lang data.

As a plus, we now get a slightly more precise warning because the location
of the mismatched enum CONST_DECL is also streamed.

It changes the WPA statistics from the first block below to the second:
[WPA] read 4608466 unshared trees
[WPA] read 2942094 mergeable SCCs of average size 1.365328
[WPA] 8625389 tree bodies read in total
[WPA] tree SCC table: size 524287, 247652 elements, collision ratio: 0.383702
[WPA] tree SCC max chain length 2 (size 1)
[WPA] Compared 2694442 SCCs, 228 collisions (0.000085)
[WPA] Merged 2694419 SCCs
[WPA] Merged 3731982 tree bodies
[WPA] Merged 633335 types
[WPA] 122077 types prevailed (155548 associated trees)
...
[WPA] Compression: 110593119 input bytes, 287696614 uncompressed bytes (ratio: 2.601397)
[WPA] Size of mmap'd section decls: 85628556 bytes
[WPA] Size of mmap'd section function_body: 13842928 bytes

[WPA] read 1720989 unshared trees
[WPA] read 1252217 mergeable SCCs of average size 1.858507
[WPA] 4048243 tree bodies read in total
[WPA] tree SCC table: size 524287, 226524 elements, collision ratio: 0.491759
[WPA] tree SCC max chain length 2 (size 1)
[WPA] Compared 1025693 SCCs, 196 collisions (0.000191)
[WPA] Merged 1025670 SCCs
[WPA] Merged 2063373 tree bodies
[WPA] Merged 633497 types
[WPA] 122299 types prevailed (155827 associated trees)
...
[WPA] Compression: 103428770 input bytes, 281151423 uncompressed bytes (ratio: 2.718310)
[WPA] Size of mmap'd section decls: 49390917 bytes
[WPA] Size of mmap'd section function_body: 13858258 bytes
...
[WPA] Size of mmap'd section odr_types: 29054816 bytes

So the number of SCCs streamed drops to 38% and the number of unshared trees
(which are a bit misnamed, since they are mostly integer_cst nodes) to 37%.

Things speed up correspondingly, but I did not save the time report from the
previous build.

The enum values are still surprisingly large.  I may look into ways of
making them smaller incrementally, but the section streams reasonably fast:

Time variable                                   usr           sys          wall               GGC
 phase opt and generate             :  25.20 ( 68%)  10.88 ( 72%)  36.13 ( 69%)  868060 kB ( 52%)
 phase stream in                    :   4.46 ( 12%)   0.90 (  6%)   5.38 ( 10%)  790724 kB ( 48%)
 phase stream out                   :   6.69 ( 18%)   3.32 ( 22%)  10.03 ( 19%)       8 kB (  0%)
 ipa lto gimple in                  :   0.79 (  2%)   1.86 ( 12%)   2.39 (  5%)  252612 kB ( 15%)
 ipa lto gimple out                 :   2.48 (  7%)   0.78 (  5%)   3.26 (  6%)       0 kB (  0%)
 ipa lto decl in                    :   1.71 (  5%)   0.46 (  3%)   2.34 (  4%)  417883 kB ( 25%)
 ipa lto decl out                   :   3.28 (  9%)   0.07 (  0%)   3.27 (  6%)       0 kB (  0%)
 whopr wpa I/O                      :   0.40 (  1%)   2.24 ( 15%)   2.77 (  5%)       8 kB (  0%)
 lto stream decompression           :   1.38 (  4%)   0.31 (  2%)   1.36 (  3%)       0 kB (  0%)
 ipa ODR types                      :   0.18 (  0%)   0.02 (  0%)   0.25 (  0%)       0 kB (  0%)
 ipa inlining heuristics            :  11.64 ( 31%)   1.45 ( 10%)  13.12 ( 25%)  453160 kB ( 27%)
 ipa pure const                     :   1.74 (  5%)   0.00 (  0%)   1.76 (  3%)       0 kB (  0%)
 ipa icf                            :   1.72 (  5%)   5.33 ( 35%)   7.06 ( 13%)   16593 kB (  1%)
 whopr partitioning                 :   2.22 (  6%)   0.01 (  0%)   2.23 (  4%)    5689 kB (  0%)
 TOTAL                              :  37.17         15.20         52.46        1660886 kB

LTO-bootstrapped/regtested x86_64-linux, will commit it shortly.

gcc/ChangeLog:

2020-06-03  Jan Hubicka  <hubicka@ucw.cz>

	* ipa-devirt.c: Include data-streamer.h, lto-streamer.h and
	streamer-hooks.h.
	(odr_enums): New static var.
	(struct odr_enum_val): New struct.
	(class odr_enum): New struct.
	(odr_enum_map): New hashtable.
	(odr_types_equivalent_p): Drop code testing TYPE_VALUES.
	(add_type_duplicate): Likewise.
	(free_odr_warning_data): Do not free TYPE_VALUES.
	(register_odr_enum): New function.
	(ipa_odr_summary_write): New function.
	(ipa_odr_read_section): New function.
	(ipa_odr_summary_read): New function.
	(class pass_ipa_odr): New pass.
	(make_pass_ipa_odr): New function.
	* ipa-utils.h (register_odr_enum): Declare.
	* lto-section-in.c (lto_section_name): Add odr_types section.
	* lto-streamer.h (enum lto_section_type): Add odr_types section.
	* passes.def: Add odr_types pass.
	* lto-streamer-out.c (DFS::DFS_write_tree_body): Do not stream
	TYPE_VALUES.
	(hash_tree): Likewise.
	* tree-streamer-in.c (lto_input_ts_type_non_common_tree_pointers):
	Likewise.
	* tree-streamer-out.c (write_ts_type_non_common_tree_pointers):
	Likewise.
	* timevar.def (TV_IPA_ODR): New timervar.
	* tree-pass.h (make_pass_ipa_odr): Declare.
	* tree.c (free_lang_data_in_type): Register ODR types.

gcc/lto/ChangeLog:

2020-06-03  Jan Hubicka  <hubicka@ucw.cz>

	* lto-common.c (compare_tree_sccs_1): Do not compare TYPE_VALUES.

gcc/testsuite/ChangeLog:

2020-06-03  Jan Hubicka  <hubicka@ucw.cz>

	* g++.dg/lto/pr84805_0.C: Update.

(cherry picked from commit 3fb68f2e666d9de7e0326af9f43b12c9e98f19a6)

Fix typo.
---
 gcc/ipa-devirt.c                     | 404 ++++++++++++++++++++++++---
 gcc/ipa-utils.h                      |   2 +
 gcc/lto-section-in.c                 |   3 +-
 gcc/lto-streamer-out.c               |   8 +-
 gcc/lto-streamer.h                   |   1 +
 gcc/lto/lto-common.c                 |   4 +-
 gcc/passes.def                       |   1 +
 gcc/testsuite/g++.dg/lto/pr84805_0.C |   2 +-
 gcc/timevar.def                      |   1 +
 gcc/tree-pass.h                      |   1 +
 gcc/tree-streamer-in.c               |   4 +-
 gcc/tree-streamer-out.c              |   4 +-
 gcc/tree.c                           |  13 +-
 13 files changed, 379 insertions(+), 69 deletions(-)
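
The percentages quoted in the commit message above can be re-derived from
the two [WPA] statistics blocks.  The helper below is only a worked
arithmetic sketch in C; percent_of is a hypothetical name.  Which pair of
counters the 38% figure refers to is an assumption here: it matches the
"Compared ... SCCs" lines, whereas the "read ... mergeable SCCs" lines give
roughly 43%.

```c
#include <assert.h>

/* Return AFTER as a percentage of BEFORE, rounded to the nearest
   integer.  BEFORE must be nonzero.  */
static unsigned
percent_of (unsigned long long before, unsigned long long after)
{
  return (unsigned) ((after * 100 + before / 2) / before);
}

/* Counters copied from the two statistics blocks in the commit message.  */
static const unsigned long long unshared_before = 4608466ULL;
static const unsigned long long unshared_after = 1720989ULL;
static const unsigned long long compared_before = 2694442ULL;
static const unsigned long long compared_after = 1025693ULL;
static const unsigned long long read_sccs_before = 2942094ULL;
static const unsigned long long read_sccs_after = 1252217ULL;
```

For example, percent_of (unshared_before, unshared_after) evaluates to 37,
in line with the figure given for unshared trees.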

diff --git a/gcc/ipa-devirt.c b/gcc/ipa-devirt.c
index 85f6e20da32..31903a1c623 100644
--- a/gcc/ipa-devirt.c
+++ b/gcc/ipa-devirt.c
@@ -88,7 +88,7 @@ along with GCC; see the file COPYING3.  If not see
      This means that the graph is not complete. Types with no methods are not
      inserted into the graph.  Also types without virtual methods are not
      represented at all, though it may be easy to add this.
- 
+
      The inheritance graph is represented as follows:
 
        Vertices are structures odr_type.  Every odr_type may correspond
@@ -131,6 +131,9 @@ along with GCC; see the file COPYING3.  If not see
 #include "intl.h"
 #include "stringpool.h"
 #include "attribs.h"
+#include "data-streamer.h"
+#include "lto-streamer.h"
+#include "streamer-hooks.h"
 
 /* Hash based set of pairs of types.  */
 struct type_pair
@@ -495,6 +498,29 @@ static odr_hash_type *odr_hash;
 static GTY(()) vec <odr_type, va_gc> *odr_types_ptr;
 #define odr_types (*odr_types_ptr)
 
+/* All enums defined and accessible for the unit.  */
+static GTY(()) vec <tree, va_gc> *odr_enums;
+
+/* Information we hold about value defined by an enum type.  */
+struct odr_enum_val
+{
+  const char *name;
+  HOST_WIDE_INT val;
+  location_t locus;
+};
+
+/* Information about enum values.  */
+struct odr_enum
+{
+  location_t locus;
+  auto_vec<odr_enum_val, 0> vals;
+  bool warned;
+};
+
+/* A table of all ODR enum definitions.  */
+static hash_map <nofree_string_hash, odr_enum> *odr_enum_map = NULL;
+static struct obstack odr_enum_obstack;
+
 /* Set TYPE_BINFO of TYPE and its variants to BINFO.  */
 void
 set_type_binfo (tree type, tree binfo)
@@ -1205,37 +1231,6 @@ odr_types_equivalent_p (tree t1, tree t2, bool warn, bool *warned,
       return false;
     }
 
-  if (TREE_CODE (t1) == ENUMERAL_TYPE
-      && TYPE_VALUES (t1) && TYPE_VALUES (t2))
-    {
-      tree v1, v2;
-      for (v1 = TYPE_VALUES (t1), v2 = TYPE_VALUES (t2);
-	   v1 && v2 ; v1 = TREE_CHAIN (v1), v2 = TREE_CHAIN (v2))
-	{
-	  if (TREE_PURPOSE (v1) != TREE_PURPOSE (v2))
-	    {
-	      warn_odr (t1, t2, NULL, NULL, warn, warned,
-			G_("an enum with different value name"
-			   " is defined in another translation unit"));
-	      return false;
-	    }
-	  if (!operand_equal_p (TREE_VALUE (v1), TREE_VALUE (v2), 0))
-	    {
-	      warn_odr (t1, t2, NULL, NULL, warn, warned,
-			G_("an enum with different values is defined"
-			   " in another translation unit"));
-	      return false;
-	    }
-	}
-      if (v1 || v2)
-	{
-	  warn_odr (t1, t2, NULL, NULL, warn, warned,
-		    G_("an enum with mismatching number of values "
-		       "is defined in another translation unit"));
-	  return false;
-	}
-    }
-
   /* Non-aggregate types can be handled cheaply.  */
   if (INTEGRAL_TYPE_P (t1)
       || SCALAR_FLOAT_TYPE_P (t1)
@@ -1622,10 +1617,6 @@ add_type_duplicate (odr_type val, tree type)
     }
   else if (COMPLETE_TYPE_P (val->type) && !COMPLETE_TYPE_P (type))
     ;
-  else if (TREE_CODE (val->type) == ENUMERAL_TYPE
-	   && TREE_CODE (type) == ENUMERAL_TYPE
-	   && !TYPE_VALUES (val->type) && TYPE_VALUES (type))
-    prevail = true;
   else if (TREE_CODE (val->type) == RECORD_TYPE
 	   && TREE_CODE (type) == RECORD_TYPE
 	   && TYPE_BINFO (type) && !TYPE_BINFO (val->type))
@@ -1977,7 +1968,7 @@ get_odr_type (tree type, bool insert)
       unsigned int i;
 
       gcc_assert (BINFO_TYPE (TYPE_BINFO (val->type)) == type);
-  
+
       val->all_derivations_known = type_all_derivations_known_p (type);
       for (i = 0; i < BINFO_N_BASE_BINFOS (binfo); i++)
 	/* For now record only polymorphic types. other are
@@ -2222,7 +2213,7 @@ dump_type_inheritance_graph (FILE *f)
 
 /* Save some WPA->ltrans streaming by freeing stuff needed only for good
    ODR warnings.
-   We free TYPE_VALUES of enums and also make TYPE_DECLs to not point back
+   We make TYPE_DECLs to not point back
    to the type (which is needed to keep them in the same SCC and preserve
    location information to output warnings) and subsequently we make all
    TYPE_DECLS of same assembler name equivalent.  */
@@ -2242,8 +2233,6 @@ free_odr_warning_data ()
       {
 	tree t = odr_types[i]->type;
 
-	if (TREE_CODE (t) == ENUMERAL_TYPE)
-	  TYPE_VALUES (t) = NULL;
 	TREE_TYPE (TYPE_NAME (t)) = void_type_node;
 
 	if (odr_types[i]->types)
@@ -2251,8 +2240,6 @@ free_odr_warning_data ()
 	    {
 	      tree td = (*odr_types[i]->types)[j];
 
-	      if (TREE_CODE (td) == ENUMERAL_TYPE)
-	        TYPE_VALUES (td) = NULL;
 	      TYPE_NAME (td) = TYPE_NAME (t);
 	    }
       }
@@ -2286,7 +2273,7 @@ build_type_inheritance_graph (void)
       get_odr_type (TYPE_METHOD_BASETYPE (TREE_TYPE (n->decl)), true);
 
     /* Look also for virtual tables of types that do not define any methods.
- 
+
        We need it in a case where class B has virtual base of class A
        re-defining its virtual method and there is class C with no virtual
        methods with B as virtual base.
@@ -2603,7 +2590,7 @@ record_target_from_binfo (vec <cgraph_node *> &nodes,
    INSERTED is used to avoid duplicate insertions of methods into NODES.
    MATCHED_VTABLES are used to avoid duplicate walking vtables.
    Clear COMPLETEP if unreferable target is found.
- 
+
    If CONSIDER_CONSTRUCTION is true, record to BASES_TO_CONSIDER
    all cases where BASE_SKIPPED is true (because the base is abstract
    class).  */
@@ -2803,7 +2790,7 @@ subbinfo_with_vtable_at_offset (tree binfo, unsigned HOST_WIDE_INT offset,
 	  && DECL_ASSEMBLER_NAME (v) == DECL_ASSEMBLER_NAME (vtable))
 	return binfo;
     }
-  
+
   for (i = 0; BINFO_BASE_ITERATE (binfo, i, base_binfo); i++)
     if (polymorphic_type_binfo_p (base_binfo))
       {
@@ -4019,4 +4006,331 @@ debug_tree_odr_name (tree type, bool demangle)
   fprintf (stderr, "%s\n", odr);
 }
 
+/* Register ODR enum so we later stream record about its values.  */
+
+void
+register_odr_enum (tree t)
+{
+  if (flag_lto)
+    vec_safe_push (odr_enums, t);
+}
+
+/* Write ODR enums to LTO stream file.  */
+
+static void
+ipa_odr_summary_write (void)
+{
+  if (!odr_enums && !odr_enum_map)
+    return;
+  struct output_block *ob = create_output_block (LTO_section_odr_types);
+  unsigned int i;
+  tree t;
+
+  if (odr_enums)
+    {
+      streamer_write_uhwi (ob, odr_enums->length ());
+
+      /* For every ODR enum stream out
+	   - its ODR name
+	   - number of values,
+	   - value names and constant their represent
+	   - bitpack of locations so we can do good diagnostics.  */
+      FOR_EACH_VEC_ELT (*odr_enums, i, t)
+	{
+	  streamer_write_string (ob, ob->main_stream,
+				 IDENTIFIER_POINTER
+				     (DECL_ASSEMBLER_NAME (TYPE_NAME (t))),
+				 true);
+
+	  int n = 0;
+	  for (tree e = TYPE_VALUES (t); e; e = TREE_CHAIN (e))
+	    n++;
+	  streamer_write_uhwi (ob, n);
+	  for (tree e = TYPE_VALUES (t); e; e = TREE_CHAIN (e))
+	    {
+	      streamer_write_string (ob, ob->main_stream,
+				     IDENTIFIER_POINTER (TREE_PURPOSE (e)),
+				     true);
+	      streamer_write_hwi (ob, tree_to_shwi
+					(DECL_INITIAL (TREE_VALUE (e))));
+	    }
+
+	  bitpack_d bp = bitpack_create (ob->main_stream);
+	  lto_output_location (ob, &bp, DECL_SOURCE_LOCATION (TYPE_NAME (t)));
+	  for (tree e = TYPE_VALUES (t); e; e = TREE_CHAIN (e))
+	    lto_output_location (ob, &bp,
+				 DECL_SOURCE_LOCATION (TREE_VALUE (e)));
+	  streamer_write_bitpack (&bp);
+	}
+      vec_free (odr_enums);
+      odr_enums = NULL;
+    }
+  /* During LTO incremental linking we already have streamed in types.  */
+  else if (odr_enum_map)
+    {
+      gcc_checking_assert (!odr_enums);
+      streamer_write_uhwi (ob, odr_enum_map->elements ());
+
+      hash_map<nofree_string_hash, odr_enum>::iterator iter
+		= odr_enum_map->begin ();
+      for (; iter != odr_enum_map->end (); ++iter)
+	{
+	  odr_enum &this_enum = (*iter).second;
+	  streamer_write_string (ob, ob->main_stream, (*iter).first, true);
+
+	  streamer_write_uhwi (ob, this_enum.vals.length ());
+	  for (unsigned j = 0; j < this_enum.vals.length (); j++)
+	    {
+	      streamer_write_string (ob, ob->main_stream,
+				     this_enum.vals[j].name, true);
+	      streamer_write_hwi (ob, this_enum.vals[j].val);
+	    }
+
+	  bitpack_d bp = bitpack_create (ob->main_stream);
+	  lto_output_location (ob, &bp, this_enum.locus);
+	  for (unsigned j = 0; j < this_enum.vals.length (); j++)
+	    lto_output_location (ob, &bp, this_enum.vals[j].locus);
+	  streamer_write_bitpack (&bp);
+	}
+
+      delete odr_enum_map;
+      obstack_free (&odr_enum_obstack, NULL);
+      odr_enum_map = NULL;
+    }
+
+  produce_asm (ob, NULL);
+  destroy_output_block (ob);
+}
+
+/* Read ODR enums from LTO stream file and warn on mismatches.  */
+
+static void
+ipa_odr_read_section (struct lto_file_decl_data *file_data, const char *data,
+		      size_t len)
+{
+  const struct lto_function_header *header
+    = (const struct lto_function_header *) data;
+  const int cfg_offset = sizeof (struct lto_function_header);
+  const int main_offset = cfg_offset + header->cfg_size;
+  const int string_offset = main_offset + header->main_size;
+  class data_in *data_in;
+
+  lto_input_block ib ((const char *) data + main_offset, header->main_size,
+		      file_data->mode_table);
+
+  data_in
+    = lto_data_in_create (file_data, (const char *) data + string_offset,
+			  header->string_size, vNULL);
+  unsigned int n = streamer_read_uhwi (&ib);
+
+  if (!odr_enum_map)
+    {
+      gcc_obstack_init (&odr_enum_obstack);
+      odr_enum_map = new (hash_map <nofree_string_hash, odr_enum>);
+    }
+
+  for (unsigned i = 0; i < n; i++)
+    {
+      const char *rname = streamer_read_string (data_in, &ib);
+      unsigned int nvals = streamer_read_uhwi (&ib);
+      char *name;
+  
+      obstack_grow (&odr_enum_obstack, rname, strlen (rname) + 1);
+      name = XOBFINISH (&odr_enum_obstack, char *);
+
+      bool existed_p;
+      class odr_enum &this_enum
+		 = odr_enum_map->get_or_insert (xstrdup (name), &existed_p);
+
+      if (!existed_p)
+	{
+	  this_enum.vals.safe_grow_cleared (nvals);
+	  this_enum.warned = false;
+	  for (unsigned j = 0; j < nvals; j++)
+	    {
+	      const char *val_name = streamer_read_string (data_in, &ib);
+	      obstack_grow (&odr_enum_obstack, val_name, strlen (val_name) + 1);
+	      this_enum.vals[j].name = XOBFINISH (&odr_enum_obstack, char *);
+	      this_enum.vals[j].val = streamer_read_hwi (&ib);
+	    }
+	  bitpack_d bp = streamer_read_bitpack (&ib);
+	  stream_input_location (&this_enum.locus, &bp, data_in);
+	  for (unsigned j = 0; j < nvals; j++)
+	    stream_input_location (&this_enum.vals[j].locus, &bp, data_in);
+	  data_in->location_cache.apply_location_cache ();
+	}
+      else
+	{
+	  int do_warning = -1;
+	  char *warn_name = NULL;
+	  HOST_WIDE_INT warn_value = 0;
+
+	  for (unsigned j = 0; j < nvals; j++)
+	    {
+	      const char *id = streamer_read_string (data_in, &ib);
+	      HOST_WIDE_INT val = streamer_read_hwi (&ib);
+
+	      if (do_warning != -1 || j > this_enum.vals.length ())
+		continue;
+	      if (strcmp (id, this_enum.vals[j].name)
+		  || val != this_enum.vals[j].val)
+		{
+		  warn_name = xstrdup (id);
+		  warn_value = val;
+		  do_warning = j;
+		}
+	    }
+	  bitpack_d bp = streamer_read_bitpack (&ib);
+
+	  location_t locus;
+	  stream_input_location (&locus, &bp, data_in);
+
+	  if (do_warning != -1 || nvals != this_enum.vals.length ())
+	    {
+	      data_in->location_cache.apply_location_cache ();
+
+	      const int opts = DMGL_PARAMS | DMGL_ANSI | DMGL_TYPES;
+	      char *dmgname = cplus_demangle (name, opts);
+	      if (this_enum.warned
+		  || !warning_at (this_enum.locus,
+				  OPT_Wodr, "type %qs violates the "
+				  "C++ One Definition Rule",
+				  dmgname))
+		do_warning = -1;
+	      else
+	       {
+		 this_enum.warned = true;
+		 if (do_warning == -1)
+		   inform (locus,
+			   "an enum with different number of values is defined"
+			   " in another translation unit");
+		 else if (warn_name)
+		   inform (locus,
+			   "an enum with different value name"
+			   " is defined in another translation unit");
+		 else
+		   inform (locus,
+			   "an enum with different values"
+			   " is defined in another translation unit");
+	       }
+	    }
+	  else
+	    data_in->location_cache.revert_location_cache ();
+	  for (unsigned j = 0; j < nvals; j++)
+	    {
+	      location_t id_locus;
+
+	      data_in->location_cache.revert_location_cache ();
+	      stream_input_location (&id_locus, &bp, data_in);
+	      if ((int) j == do_warning)
+		{
+		  data_in->location_cache.apply_location_cache ();
+		  if (strcmp (warn_name, this_enum.vals[j].name))
+		    inform (this_enum.vals[j].locus,
+			    "name %qs differs from name %qs defined"
+			    " in another translation unit",
+			    this_enum.vals[j].name, warn_name);
+		  else
+		    inform (this_enum.vals[j].locus,
+			    "name %qs is defined to " HOST_WIDE_INT_PRINT_DEC
+			    " while another translation unit defines "
+			    "it as " HOST_WIDE_INT_PRINT_DEC,
+			    warn_name, this_enum.vals[j].val, warn_value);
+		  inform (id_locus,
+			  "mismatching definition");
+		}
+	      else
+	        data_in->location_cache.revert_location_cache ();
+	    }
+	  if (warn_name)
+	    free (warn_name);
+	  obstack_free (&odr_enum_obstack, name);
+	}
+    }
+  lto_free_section_data (file_data, LTO_section_odr_types, NULL, data,
+			 len);
+  lto_data_in_delete (data_in);
+}
+
+/* Read all ODR type sections.  */
+
+static void
+ipa_odr_summary_read (void)
+{
+  struct lto_file_decl_data **file_data_vec = lto_get_file_decl_data ();
+  struct lto_file_decl_data *file_data;
+  unsigned int j = 0;
+
+  while ((file_data = file_data_vec[j++]))
+    {
+      size_t len;
+      const char *data
+	= lto_get_summary_section_data (file_data, LTO_section_odr_types,
+					&len);
+      if (data)
+	ipa_odr_read_section (file_data, data, len);
+    }
+  /* Enum info is used only to produce warnings.  Only case we will need it
+     again is streaming for incremental LTO.  */
+  if (flag_incremental_link != INCREMENTAL_LINK_LTO)
+    {
+      delete odr_enum_map;
+      obstack_free (&odr_enum_obstack, NULL);
+      odr_enum_map = NULL;
+    }
+}
+
+namespace {
+
+const pass_data pass_data_ipa_odr =
+{
+  IPA_PASS, /* type */
+  "odr", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_IPA_ODR, /* tv_id */
+  0, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  0, /* todo_flags_finish */
+};
+
+class pass_ipa_odr : public ipa_opt_pass_d
+{
+public:
+  pass_ipa_odr (gcc::context *ctxt)
+    : ipa_opt_pass_d (pass_data_ipa_odr, ctxt,
+		      NULL, /* generate_summary */
+		      ipa_odr_summary_write, /* write_summary */
+		      ipa_odr_summary_read, /* read_summary */
+		      NULL, /* write_optimization_summary */
+		      NULL, /* read_optimization_summary */
+		      NULL, /* stmt_fixup */
+		      0, /* function_transform_todo_flags_start */
+		      NULL, /* function_transform */
+		      NULL) /* variable_transform */
+  {}
+
+  /* opt_pass methods: */
+  virtual bool gate (function *)
+    {
+      return (in_lto_p || flag_lto);
+    }
+
+  virtual unsigned int execute (function *)
+    {
+      return 0;
+    }
+
+}; // class pass_ipa_odr
+
+} // anon namespace
+
+ipa_opt_pass_d *
+make_pass_ipa_odr (gcc::context *ctxt)
+{
+  return new pass_ipa_odr (ctxt);
+}
+
+
 #include "gt-ipa-devirt.h"
diff --git a/gcc/ipa-utils.h b/gcc/ipa-utils.h
index 98edc383461..6597593d138 100644
--- a/gcc/ipa-utils.h
+++ b/gcc/ipa-utils.h
@@ -98,6 +98,8 @@ void enable_odr_based_tbaa (tree type);
 bool odr_based_tbaa_p (const_tree type);
 void set_type_canonical_for_odr_type (tree type, tree canonical);
 
+void register_odr_enum (tree type);
+
 /* Return vector containing possible targets of polymorphic call E.
    If COMPLETEP is non-NULL, store true if the list is complete. 
    CACHE_TOKEN (if non-NULL) will get stored to an unique ID of entry
diff --git a/gcc/lto-section-in.c b/gcc/lto-section-in.c
index 0923a8c0746..48cf4844b63 100644
--- a/gcc/lto-section-in.c
+++ b/gcc/lto-section-in.c
@@ -55,7 +55,8 @@ const char *lto_section_name[LTO_N_SECTION_TYPES] =
   "mode_table",
   "hsa",
   "lto",
-  "ipa_sra"
+  "ipa_sra",
+  "odr_types",
 };
 
 /* Hooks so that the ipa passes can call into the lto front end to get
diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index 61c21d181fa..4651d8e5c51 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -964,9 +964,7 @@ DFS::DFS_write_tree_body (struct output_block *ob,
 
   if (CODE_CONTAINS_STRUCT (code, TS_TYPE_NON_COMMON))
     {
-      if (TREE_CODE (expr) == ENUMERAL_TYPE)
-	DFS_follow_tree_edge (TYPE_VALUES (expr));
-      else if (TREE_CODE (expr) == ARRAY_TYPE)
+      if (TREE_CODE (expr) == ARRAY_TYPE)
 	DFS_follow_tree_edge (TYPE_DOMAIN (expr));
       else if (RECORD_OR_UNION_TYPE_P (expr))
 	for (tree t = TYPE_FIELDS (expr); t; t = TREE_CHAIN (t))
@@ -1377,9 +1375,7 @@ hash_tree (struct streamer_tree_cache_d *cache, hash_map<tree, hashval_t> *map,
 
   if (CODE_CONTAINS_STRUCT (code, TS_TYPE_NON_COMMON))
     {
-      if (code == ENUMERAL_TYPE)
-	visit (TYPE_VALUES (t));
-      else if (code == ARRAY_TYPE)
+      if (code == ARRAY_TYPE)
 	visit (TYPE_DOMAIN (t));
       else if (RECORD_OR_UNION_TYPE_P (t))
 	for (tree f = TYPE_FIELDS (t); f; f = TREE_CHAIN (f))
diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
index 93da3c66fd8..261e6a9ffa8 100644
--- a/gcc/lto-streamer.h
+++ b/gcc/lto-streamer.h
@@ -243,6 +243,7 @@ enum lto_section_type
   LTO_section_ipa_hsa,
   LTO_section_lto,
   LTO_section_ipa_sra,
+  LTO_section_odr_types,
   LTO_N_SECTION_TYPES		/* Must be last.  */
 };
 
diff --git a/gcc/lto/lto-common.c b/gcc/lto/lto-common.c
index bcc0919dfc6..6a60a74daab 100644
--- a/gcc/lto/lto-common.c
+++ b/gcc/lto/lto-common.c
@@ -1503,9 +1503,7 @@ compare_tree_sccs_1 (tree t1, tree t2, tree **map)
 
   if (CODE_CONTAINS_STRUCT (code, TS_TYPE_NON_COMMON))
     {
-      if (code == ENUMERAL_TYPE)
-	compare_tree_edges (TYPE_VALUES (t1), TYPE_VALUES (t2));
-      else if (code == ARRAY_TYPE)
+      if (code == ARRAY_TYPE)
 	compare_tree_edges (TYPE_DOMAIN (t1), TYPE_DOMAIN (t2));
       else if (RECORD_OR_UNION_TYPE_P (t1))
 	{
diff --git a/gcc/passes.def b/gcc/passes.def
index 2bf2cb78fc5..adc88bb2929 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -145,6 +145,7 @@ along with GCC; see the file COPYING3.  If not see
 
   INSERT_PASSES_AFTER (all_regular_ipa_passes)
   NEXT_PASS (pass_analyzer);
+  NEXT_PASS (pass_ipa_odr);
   NEXT_PASS (pass_ipa_whole_program_visibility);
   NEXT_PASS (pass_ipa_profile);
   NEXT_PASS (pass_ipa_icf);
diff --git a/gcc/testsuite/g++.dg/lto/pr84805_0.C b/gcc/testsuite/g++.dg/lto/pr84805_0.C
index c9e00add79f..1509eae4845 100644
--- a/gcc/testsuite/g++.dg/lto/pr84805_0.C
+++ b/gcc/testsuite/g++.dg/lto/pr84805_0.C
@@ -11,7 +11,7 @@ struct __is_void_helper : false_type {};
 struct is_void : __is_void_helper {};
 template < typename > struct is_array : false_type {};
 namespace __gnu_cxx {
-enum _Lock_policy { _S_single, _S_mutex, _S_atomic }; // { dg-lto-warning "6: type '_Lock_policy' violates the C\\+\\+ One Definition Rule" }
+enum _Lock_policy { _S_single, _S_mutex, _S_atomic }; // { dg-lto-warning "6: type '__gnu_cxx::_Lock_policy' violates the C\\+\\+ One Definition Rule" }
 const _Lock_policy __default_lock_policy = _S_atomic;
 } namespace std {
 using __gnu_cxx::_Lock_policy;
diff --git a/gcc/timevar.def b/gcc/timevar.def
index f467d0251f7..29479205c41 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -70,6 +70,7 @@ DEFTIMEVAR (TV_CGRAPH                , "callgraph construction")
 DEFTIMEVAR (TV_CGRAPHOPT             , "callgraph optimization")
 DEFTIMEVAR (TV_CGRAPH_FUNC_EXPANSION , "callgraph functions expansion")
 DEFTIMEVAR (TV_CGRAPH_IPA_PASSES     , "callgraph ipa passes")
+DEFTIMEVAR (TV_IPA_ODR		     , "ipa ODR types")
 DEFTIMEVAR (TV_IPA_FNSUMMARY         , "ipa function summary")
 DEFTIMEVAR (TV_IPA_UNREACHABLE       , "ipa dead code removal")
 DEFTIMEVAR (TV_IPA_INHERITANCE       , "ipa inheritance graph")
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index a1207a20a3c..a97c7bd36fb 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -505,6 +505,7 @@ extern ipa_opt_pass_d *make_pass_ipa_cp (gcc::context *ctxt);
 extern ipa_opt_pass_d *make_pass_ipa_sra (gcc::context *ctxt);
 extern ipa_opt_pass_d *make_pass_ipa_icf (gcc::context *ctxt);
 extern ipa_opt_pass_d *make_pass_ipa_devirt (gcc::context *ctxt);
+extern ipa_opt_pass_d *make_pass_ipa_odr (gcc::context *ctxt);
 extern ipa_opt_pass_d *make_pass_ipa_reference (gcc::context *ctxt);
 extern ipa_opt_pass_d *make_pass_ipa_hsa (gcc::context *ctxt);
 extern ipa_opt_pass_d *make_pass_ipa_pure_const (gcc::context *ctxt);
diff --git a/gcc/tree-streamer-in.c b/gcc/tree-streamer-in.c
index a3aea3d811f..9b5d2fe0243 100644
--- a/gcc/tree-streamer-in.c
+++ b/gcc/tree-streamer-in.c
@@ -842,9 +842,7 @@ lto_input_ts_type_non_common_tree_pointers (class lto_input_block *ib,
 					    class data_in *data_in,
 					    tree expr)
 {
-  if (TREE_CODE (expr) == ENUMERAL_TYPE)
-    TYPE_VALUES (expr) = stream_read_tree (ib, data_in);
-  else if (TREE_CODE (expr) == ARRAY_TYPE)
+  if (TREE_CODE (expr) == ARRAY_TYPE)
     TYPE_DOMAIN (expr) = stream_read_tree (ib, data_in);
   else if (RECORD_OR_UNION_TYPE_P (expr))
     TYPE_FIELDS (expr) = streamer_read_chain (ib, data_in);
diff --git a/gcc/tree-streamer-out.c b/gcc/tree-streamer-out.c
index 7bc97196cf1..150f876d0e6 100644
--- a/gcc/tree-streamer-out.c
+++ b/gcc/tree-streamer-out.c
@@ -730,9 +730,7 @@ static void
 write_ts_type_non_common_tree_pointers (struct output_block *ob, tree expr,
 					bool ref_p)
 {
-  if (TREE_CODE (expr) == ENUMERAL_TYPE)
-    stream_write_tree (ob, TYPE_VALUES (expr), ref_p);
-  else if (TREE_CODE (expr) == ARRAY_TYPE)
+  if (TREE_CODE (expr) == ARRAY_TYPE)
     stream_write_tree (ob, TYPE_DOMAIN (expr), ref_p);
   else if (RECORD_OR_UNION_TYPE_P (expr))
     streamer_write_chain (ob, TYPE_FIELDS (expr), ref_p);
diff --git a/gcc/tree.c b/gcc/tree.c
index 78fce74ff78..6da3dc816c0 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -5606,15 +5606,14 @@ free_lang_data_in_type (tree type, class free_lang_data_d *fld)
 	  /* Type values are used only for C++ ODR checking.  Drop them
 	     for all type variants and non-ODR types.
 	     For ODR types the data is freed in free_odr_warning_data.  */
-	  if (TYPE_MAIN_VARIANT (type) != type
-	      || !type_with_linkage_p (type))
+	  if (!TYPE_VALUES (type))
+	    ;
+	  else if (TYPE_MAIN_VARIANT (type) != type
+		   || !type_with_linkage_p (type)
+		   || type_in_anonymous_namespace_p (type))
 	    TYPE_VALUES (type) = NULL;
 	  else
-	  /* Simplify representation by recording only values rather
-	     than const decls.  */
-	    for (tree e = TYPE_VALUES (type); e; e = TREE_CHAIN (e))
-	      if (TREE_CODE (TREE_VALUE (e)) == CONST_DECL)
-		TREE_VALUE (e) = DECL_INITIAL (TREE_VALUE (e));
+	    register_odr_enum (type);
 	}
       free_lang_data_in_one_sizepos (&TYPE_MIN_VALUE (type));
       free_lang_data_in_one_sizepos (&TYPE_MAX_VALUE (type));
-- 
2.20.1


From 17e2bae3a3ee3615e7bb2a7aa090786fd3746a3c Mon Sep 17 00:00:00 2001
From: Jan Hubicka <jh@suse.cz>
Date: Sat, 6 Jun 2020 22:19:46 +0200
Subject: [PATCH 2/2] Fix ICE in ODR enum streaming [PR95548]

gcc/ChangeLog:

2020-06-06  Jan Hubicka  <hubicka@ucw.cz>

	PR lto/95548
	* ipa-devirt.c (struct odr_enum_val): Turn values to wide_int.
	(ipa_odr_summary_write): Update streaming.
	(ipa_odr_read_section): Update streaming.

gcc/testsuite/ChangeLog:

2020-06-06  Jan Hubicka  <hubicka@ucw.cz>

	* g++.dg/torture/pr95548.C: New test.

(cherry picked from commit eca7a60bd24ebd91addd785e420a06d8f5086634)
---
 gcc/ipa-devirt.c                       | 59 +++++++++++++++++++++-----
 gcc/testsuite/g++.dg/torture/pr95548.C |  8 ++++
 2 files changed, 56 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/torture/pr95548.C
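
The ChangeLog entry above turns the streamed enumerator values into
wide_int.  Patch 1 streamed them with tree_to_shwi, which expects the
constant to fit a signed HOST_WIDE_INT.  A minimal sketch of why that is
too narrow, assuming a 64-bit host integer and using plain standard C++
rather than GCC's wide_int API (fits_signed_host_int, describe_value and
check_examples are hypothetical names, not functions from the patch):

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>

// Return true if V, taken as an unsigned 64-bit enumerator value, is also
// representable as a signed 64-bit integer.
bool
fits_signed_host_int (std::uint64_t v)
{
  return v <= static_cast<std::uint64_t>
		(std::numeric_limits<std::int64_t>::max ());
}

// Format NAME and V for a diagnostic.  Print the number only when it fits
// a signed 64-bit integer; otherwise fall back to a message without it,
// which mirrors the two inform () branches added by the patch.
std::string
describe_value (const std::string &name, std::uint64_t v)
{
  if (fits_signed_host_int (v))
    return name + " = " + std::to_string (static_cast<std::int64_t> (v));
  return name + " = <does not fit a signed 64-bit integer>";
}

// Exercise both branches.
void
check_examples ()
{
  assert (describe_value ("small", 3) == "small = 3");
  // (unsigned long)-1 on an LP64 target: all 64 bits set.
  const std::uint64_t all_ones = std::numeric_limits<std::uint64_t>::max ();
  assert (!fits_signed_host_int (all_ones));
  assert (describe_value ("b", all_ones)
	  == "b = <does not fit a signed 64-bit integer>");
}
```

The enumerator `b = (unsigned long)-1` from the new test is the all-ones
case here, assuming an LP64 target where unsigned long is 64 bits wide.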

diff --git a/gcc/ipa-devirt.c b/gcc/ipa-devirt.c
index 31903a1c623..3ab7049734f 100644
--- a/gcc/ipa-devirt.c
+++ b/gcc/ipa-devirt.c
@@ -505,7 +505,7 @@ static GTY(()) vec <tree, va_gc> *odr_enums;
 struct odr_enum_val
 {
   const char *name;
-  HOST_WIDE_INT val;
+  wide_int val;
   location_t locus;
 };
 
@@ -4051,8 +4051,9 @@ ipa_odr_summary_write (void)
 	      streamer_write_string (ob, ob->main_stream,
 				     IDENTIFIER_POINTER (TREE_PURPOSE (e)),
 				     true);
-	      streamer_write_hwi (ob, tree_to_shwi
-					(DECL_INITIAL (TREE_VALUE (e))));
+	      streamer_write_wide_int (ob,
+				       wi::to_wide (DECL_INITIAL
+						      (TREE_VALUE (e))));
 	    }
 
 	  bitpack_d bp = bitpack_create (ob->main_stream);
@@ -4083,7 +4084,7 @@ ipa_odr_summary_write (void)
 	    {
 	      streamer_write_string (ob, ob->main_stream,
 				     this_enum.vals[j].name, true);
-	      streamer_write_hwi (ob, this_enum.vals[j].val);
+	      streamer_write_wide_int (ob, this_enum.vals[j].val);
 	    }
 
 	  bitpack_d bp = bitpack_create (ob->main_stream);
@@ -4142,35 +4143,51 @@ ipa_odr_read_section (struct lto_file_decl_data *file_data, const char *data,
       class odr_enum &this_enum
 		 = odr_enum_map->get_or_insert (xstrdup (name), &existed_p);
 
+      /* If this is first time we see the enum, remember its definition.  */
       if (!existed_p)
 	{
 	  this_enum.vals.safe_grow_cleared (nvals);
 	  this_enum.warned = false;
+	  if (dump_file)
+	    fprintf (dump_file, "enum %s\n{\n", name);
 	  for (unsigned j = 0; j < nvals; j++)
 	    {
 	      const char *val_name = streamer_read_string (data_in, &ib);
 	      obstack_grow (&odr_enum_obstack, val_name, strlen (val_name) + 1);
 	      this_enum.vals[j].name = XOBFINISH (&odr_enum_obstack, char *);
-	      this_enum.vals[j].val = streamer_read_hwi (&ib);
+	      this_enum.vals[j].val = streamer_read_wide_int (&ib);
+	      if (dump_file)
+		fprintf (dump_file, "  %s = " HOST_WIDE_INT_PRINT_DEC ",\n",
+			 val_name, wi::fits_shwi_p (this_enum.vals[j].val)
+			 ? this_enum.vals[j].val.to_shwi () : -1);
 	    }
 	  bitpack_d bp = streamer_read_bitpack (&ib);
 	  stream_input_location (&this_enum.locus, &bp, data_in);
 	  for (unsigned j = 0; j < nvals; j++)
 	    stream_input_location (&this_enum.vals[j].locus, &bp, data_in);
 	  data_in->location_cache.apply_location_cache ();
+	  if (dump_file)
+	    fprintf (dump_file, "}\n");
 	}
+      /* If we already have definition, compare it with new one and output
+	 warnings if they differ.  */
       else
 	{
 	  int do_warning = -1;
 	  char *warn_name = NULL;
-	  HOST_WIDE_INT warn_value = 0;
+	  wide_int warn_value = wi::zero (1);
 
+	  if (dump_file)
+	    fprintf (dump_file, "Comparing enum %s\n", name);
+
+	  /* Look for differences which we will warn about later once locations
+	     are streamed.  */
 	  for (unsigned j = 0; j < nvals; j++)
 	    {
 	      const char *id = streamer_read_string (data_in, &ib);
-	      HOST_WIDE_INT val = streamer_read_hwi (&ib);
+	      wide_int val = streamer_read_wide_int (&ib);
 
-	      if (do_warning != -1 || j > this_enum.vals.length ())
+	      if (do_warning != -1 || j >= this_enum.vals.length ())
 		continue;
 	      if (strcmp (id, this_enum.vals[j].name)
 		  || val != this_enum.vals[j].val)
@@ -4178,13 +4195,19 @@ ipa_odr_read_section (struct lto_file_decl_data *file_data, const char *data,
 		  warn_name = xstrdup (id);
 		  warn_value = val;
 		  do_warning = j;
+		  if (dump_file)
+		    fprintf (dump_file, "  Different on entry %i\n", j);
 		}
 	    }
-	  bitpack_d bp = streamer_read_bitpack (&ib);
 
+	  /* Stream in locations, but do not apply them unless we are going
+	     to warn.  */
+	  bitpack_d bp = streamer_read_bitpack (&ib);
 	  location_t locus;
+
 	  stream_input_location (&locus, &bp, data_in);
 
+	  /* Did we find a difference?  */
 	  if (do_warning != -1 || nvals != this_enum.vals.length ())
 	    {
 	      data_in->location_cache.apply_location_cache ();
@@ -4216,26 +4239,40 @@ ipa_odr_read_section (struct lto_file_decl_data *file_data, const char *data,
 	    }
 	  else
 	    data_in->location_cache.revert_location_cache ();
+
+	  /* Finally look up the location of the actual value that diverged.  */
 	  for (unsigned j = 0; j < nvals; j++)
 	    {
 	      location_t id_locus;
 
 	      data_in->location_cache.revert_location_cache ();
 	      stream_input_location (&id_locus, &bp, data_in);
+
 	      if ((int) j == do_warning)
 		{
 		  data_in->location_cache.apply_location_cache ();
+
 		  if (strcmp (warn_name, this_enum.vals[j].name))
 		    inform (this_enum.vals[j].locus,
 			    "name %qs differs from name %qs defined"
 			    " in another translation unit",
 			    this_enum.vals[j].name, warn_name);
-		  else
+		  /* FIXME: In case there is easy way to print wide_ints,
+		     perhaps we could do it here instead of overflow check.  */
+		  else if (wi::fits_shwi_p (this_enum.vals[j].val)
+			   && wi::fits_shwi_p (warn_value))
 		    inform (this_enum.vals[j].locus,
 			    "name %qs is defined to " HOST_WIDE_INT_PRINT_DEC
 			    " while another translation unit defines "
 			    "it as " HOST_WIDE_INT_PRINT_DEC,
-			    warn_name, this_enum.vals[j].val, warn_value);
+			    warn_name, this_enum.vals[j].val.to_shwi (),
+			    warn_value.to_shwi ());
+		  else
+		    inform (this_enum.vals[j].locus,
+			    "name %qs is defined to different value "
+			    "in another translation unit",
+			    warn_name);
+
 		  inform (id_locus,
 			  "mismatching definition");
 		}
diff --git a/gcc/testsuite/g++.dg/torture/pr95548.C b/gcc/testsuite/g++.dg/torture/pr95548.C
new file mode 100644
index 00000000000..bca4f753f7e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr95548.C
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+enum a { b = (unsigned long)-1 } c;
+#ifdef __SIZEOF_INT128__
+enum c { d = (unsigned long)-1 } e;
+#endif
+main()
+{
+}
-- 
2.20.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Backporting streaming and enum changes
  2020-08-06 14:34 Backporting streaming and enum changes Jan Hubicka
@ 2020-08-06 14:39 ` Richard Biener
  2020-08-14  8:36   ` Christophe Lyon
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Biener @ 2020-08-06 14:39 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches, Jakub Jelinek

On Thu, 6 Aug 2020, Jan Hubicka wrote:

> Hello,
> as discussed some time ago, I would like to discuss possibility to
> backport the straming and enum improvements.  The motivation is that
> this brings quite noticeable improvements to builds of very large
> projects where we currently have nonlinearity problem with anonymous
> namespaces (which is solved by first set of patches) and also there is
> quite noticeable overhead of streaming of enums that I noticed too late
> for gcc 10.1. This is the second combine dpatch.
> 
> There is also noticeable reduction of .o files (especially before
> compression as hit to WPA->ltrans streaming) and some memory use
> benefits.
> 
> This is an optional thing to do, but I believe it may be helpful for
> distro builds and those using LTO for large projects.  
> 
> For firefox the reduction in global stream (that is slowest part of WPA)
> is from 25678391 tree bodies to 20821629, 11160520 SCC hash collisions
> to 6002. 392382523 overal section size to 287891470 (both is
> compressed).
> 
> For Firefox streaming is under control, but other projects like Chromium
> hits bigger issues. The reason is that Firefox has "unified build" that
> #includes multiple cpp sources to one, so it consists of only about 8k
> source files, while chromium is over 25k and it was tested on project
> with over 250k sources. More smaller sources one gets, the more
> noticeable bottleneck streaming become.
> 
> The patches are not completely trivial, but they affect code that is
> heavily executed during streaming and was in mainline for several
> months, so I hope they are safe.

So we've built the core of openSUSE (~3000 packages) on x86_64
and i586 with these backported and so far found no issues.

I'm fine with backporting but I'll give Jakub the chance to
object.

Honza - please make sure to bump the LTO stream version minor
together with the streaming change (I think the enum change
doesn't require bumping).
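
The point of the bump is that a reader can then reject object files
written in the older format instead of misreading them.  A minimal
sketch of such a guard, assuming a simple major.minor pair recorded in
each file (stream_version, check_version and check_examples are
hypothetical names, not GCC's actual interface):

```cpp
#include <cassert>
#include <string>

// Version of the stream format, as recorded by the writer.
struct stream_version
{
  int major;
  int minor;
};

// Compare the version the reader was built with against the one found in
// the file.  Return an empty string when they match, otherwise the text
// of a diagnostic describing the mismatch.
std::string
check_version (const stream_version &reader, const stream_version &file)
{
  if (reader.major == file.major && reader.minor == file.minor)
    return "";
  return "stream generated with version "
	 + std::to_string (file.major) + "." + std::to_string (file.minor)
	 + " instead of the expected "
	 + std::to_string (reader.major) + "." + std::to_string (reader.minor);
}

// A reader built after the bump accepts its own files and rejects files
// written before the bump.  The numbers are made up for illustration.
void
check_examples ()
{
  assert (check_version ({9, 1}, {9, 1}).empty ());
  assert (check_version ({9, 1}, {9, 0})
	  == "stream generated with version 9.0 instead of the expected 9.1");
}
```

Without the bump, both sides would report the same version and the
reader would go on to parse the changed layout as if it were the old one.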

Thanks,
Richard.

> Honza
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Backporting streaming and enum changes
  2020-08-06 14:39 ` Richard Biener
@ 2020-08-14  8:36   ` Christophe Lyon
  2020-08-14  9:15     ` Jan Hubicka
  0 siblings, 1 reply; 6+ messages in thread
From: Christophe Lyon @ 2020-08-14  8:36 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jan Hubicka, Jakub Jelinek, gcc Patches

Hi,

On Thu, 6 Aug 2020 at 16:39, Richard Biener <rguenther@suse.de> wrote:
>
> On Thu, 6 Aug 2020, Jan Hubicka wrote:
>
> > Hello,
> > as discussed some time ago, I would like to discuss possibility to
> > backport the straming and enum improvements.  The motivation is that
> > this brings quite noticeable improvements to builds of very large
> > projects where we currently have nonlinearity problem with anonymous
> > namespaces (which is solved by first set of patches) and also there is
> > quite noticeable overhead of streaming of enums that I noticed too late
> > for gcc 10.1. This is the second combine dpatch.
> >
> > There is also noticeable reduction of .o files (especially before
> > compression as hit to WPA->ltrans streaming) and some memory use
> > benefits.
> >
> > This is an optional thing to do, but I believe it may be helpful for
> > distro builds and those using LTO for large projects.
> >
> > For firefox the reduction in global stream (that is slowest part of WPA)
> > is from 25678391 tree bodies to 20821629, 11160520 SCC hash collisions
> > to 6002. 392382523 overal section size to 287891470 (both is
> > compressed).
> >
> > For Firefox streaming is under control, but other projects like Chromium
> > hits bigger issues. The reason is that Firefox has "unified build" that
> > #includes multiple cpp sources to one, so it consists of only about 8k
> > source files, while chromium is over 25k and it was tested on project
> > with over 250k sources. More smaller sources one gets, the more
> > noticeable bottleneck streaming become.
> >
> > The patches are not completely trivial, but they affect code that is
> > heavily executed during streaming and was in mainline for several
> > months, so I hope they are safe.
>
> So we've built the core of openSUSE (~3000 packages) on x86_64
> and i586 with these backported and sofar found no issues.
>
> I'm fine with backporting but I'll give Jakub the chance to
> object.
>
> Honza - please make sure to bump the LTO stream version minor
> together with the streaming change (I think the enum change
> doesn't require bumping).
>

Since this was backported as r10-8623-g0d96c3424bbb5e5f994b78c8f65d8704d215be54,
I've noticed ICEs on arm and aarch64:
    gcc.dg/pr34457-1.c (internal compiler error)
    gcc.dg/torture/pr92088-1.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler error)
    gcc.dg/torture/pr92088-1.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (internal compiler error)

I can see:
Excess errors:
during IPA pass: cp
lto1: internal compiler error: in operator[], at vec.h:878
0xa0a5d7 vec<lto_encoder_entry, va_heap, vl_embed>::operator[](unsigned int)
        /gcc/vec.h:878
0xa0a5d7 vec<lto_encoder_entry, va_heap, vl_ptr>::operator[](unsigned int)
        /gcc/vec.h:1444
0xa194d3 vec<lto_encoder_entry, va_heap, vl_ptr>::operator[](unsigned int)
        /gcc/tree.h:3408
0xa194d3 lto_symtab_encoder_deref
        /gcc/lto-streamer.h:1173
0xa194d3 ipa_prop_read_section
        /gcc/ipa-prop.c:5060
0xa194d3 ipa_prop_read_jump_functions()
        /gcc/ipa-prop.c:5089
0xb6ba71 ipa_read_summaries_1
        /gcc/passes.c:2837
0x6bc4b5 read_cgraph_and_symbols(unsigned int, char const**)
        /gcc/lto/lto-common.c:2921
0x69deb2 lto_main()
        /gcc/lto/lto.c:625

The tests pass on trunk.

Christophe
> Thanks,
> Richard.
>
> > Honza
> >
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Backporting streaming and enum changes
  2020-08-14  8:36   ` Christophe Lyon
@ 2020-08-14  9:15     ` Jan Hubicka
  2020-08-14  9:21       ` Jan Hubicka
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Hubicka @ 2020-08-14  9:15 UTC (permalink / raw)
  To: Christophe Lyon; +Cc: Richard Biener, Jakub Jelinek, gcc Patches

Hi,
> 
> Since this was backported as r10-8623-g0d96c3424bbb5e5f994b78c8f65d8704d215be54,

Yes, after discussion with Jakub on IRC.
> I've noticed ICEs on arm and aarch64:
>     gcc.dg/pr34457-1.c (internal compiler error)
>     gcc.dg/torture/pr92088-1.c   -O2 -flto -fno-use-linker-plugin
> -flto-partition=none  (internal compiler error)
>     gcc.dg/torture/pr92088-1.c   -O2 -flto -fuse-linker-plugin
> -fno-fat-lto-objects  (internal compiler error)
> 
> I can see:
> Excess errors:
> during IPA pass: cp
> lto1: internal compiler error: in operator[], at vec.h:878
> 0xa0a5d7 vec<lto_encoder_entry, va_heap, vl_embed>::operator[](unsigned int)
>         /gcc/vec.h:878
> 0xa0a5d7 vec<lto_encoder_entry, va_heap, vl_ptr>::operator[](unsigned int)
>         /gcc/vec.h:1444
> 0xa194d3 vec<lto_encoder_entry, va_heap, vl_ptr>::operator[](unsigned int)
>         /gcc/tree.h:3408
> 0xa194d3 lto_symtab_encoder_deref
>         /gcc/lto-streamer.h:1173
> 0xa194d3 ipa_prop_read_section
>         /gcc/ipa-prop.c:5060
> 0xa194d3 ipa_prop_read_jump_functions()
>         /gcc/ipa-prop.c:5089
> 0xb6ba71 ipa_read_summaries_1
>         /gcc/passes.c:2837
> 0x6bc4b5 read_cgraph_and_symbols(unsigned int, char const**)
>         /gcc/lto/lto-common.c:2921
> 0x69deb2 lto_main()
>         /gcc/lto/lto.c:625
> 
> The tests pass on trunk.

I will check this out.  I seem to remember that we fixed the issue on
mainline.  It was an extra zero byte being streamed; I will dig it out.

Honza
> 
> Christophe
> > Thanks,
> > Richard.
> >
> > > Honza
> > >
> >
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Backporting streaming and enum changes
  2020-08-14  9:15     ` Jan Hubicka
@ 2020-08-14  9:21       ` Jan Hubicka
  2020-08-14  9:30         ` Christophe Lyon
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Hubicka @ 2020-08-14  9:21 UTC (permalink / raw)
  To: Christophe Lyon; +Cc: Jakub Jelinek, gcc Patches, Richard Biener

> Hi,
> > 
> > Since this was backported as r10-8623-g0d96c3424bbb5e5f994b78c8f65d8704d215be54,
> 
> Yes, after discussion with Jakub on IRC.
> > I've noticed ICEs on arm and aarch64:
> >     gcc.dg/pr34457-1.c (internal compiler error)
> >     gcc.dg/torture/pr92088-1.c   -O2 -flto -fno-use-linker-plugin
> > -flto-partition=none  (internal compiler error)
> >     gcc.dg/torture/pr92088-1.c   -O2 -flto -fuse-linker-plugin
> > -fno-fat-lto-objects  (internal compiler error)
> > 
> > I can see:
> > Excess errors:
> > during IPA pass: cp
> > lto1: internal compiler error: in operator[], at vec.h:878
> > 0xa0a5d7 vec<lto_encoder_entry, va_heap, vl_embed>::operator[](unsigned int)
> >         /gcc/vec.h:878
> > 0xa0a5d7 vec<lto_encoder_entry, va_heap, vl_ptr>::operator[](unsigned int)
> >         /gcc/vec.h:1444
> > 0xa194d3 vec<lto_encoder_entry, va_heap, vl_ptr>::operator[](unsigned int)
> >         /gcc/tree.h:3408
> > 0xa194d3 lto_symtab_encoder_deref
> >         /gcc/lto-streamer.h:1173
> > 0xa194d3 ipa_prop_read_section
> >         /gcc/ipa-prop.c:5060
> > 0xa194d3 ipa_prop_read_jump_functions()
> >         /gcc/ipa-prop.c:5089
> > 0xb6ba71 ipa_read_summaries_1
> >         /gcc/passes.c:2837
> > 0x6bc4b5 read_cgraph_and_symbols(unsigned int, char const**)
> >         /gcc/lto/lto-common.c:2921
> > 0x69deb2 lto_main()
> >         /gcc/lto/lto.c:625
> > 
> > The tests pass on trunk.
> 
> I will check this out.  I seem to remember that we fixed this issue on
> mainline.  It was an extra zero byte being streamed; I will dig it out.
Found it:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95362
I will backport the patch shortly.  Sorry for the breakage - it is quite
impressive that no non-trivial SCC is streamed in the distro bootstrap.

Honza
> 
> Honza
> > 
> > Christophe
> > > Thanks,
> > > Richard.
> > >
> > > > Honza
> > > >
> > >
> > > --
> > > Richard Biener <rguenther@suse.de>
> > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

* Re: Backporting streaming and enum changes
  2020-08-14  9:21       ` Jan Hubicka
@ 2020-08-14  9:30         ` Christophe Lyon
  0 siblings, 0 replies; 6+ messages in thread
From: Christophe Lyon @ 2020-08-14  9:30 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Jakub Jelinek, gcc Patches, Richard Biener

On Fri, 14 Aug 2020 at 11:21, Jan Hubicka <hubicka@ucw.cz> wrote:
>
> > Hi,
> > >
> > > Since this was backported as r10-8623-g0d96c3424bbb5e5f994b78c8f65d8704d215be54,
> >
> > Yes, after discussion with Jakub on IRC.
> > > I've noticed ICEs on arm and aarch64:
> > >     gcc.dg/pr34457-1.c (internal compiler error)
> > >     gcc.dg/torture/pr92088-1.c   -O2 -flto -fno-use-linker-plugin
> > > -flto-partition=none  (internal compiler error)
> > >     gcc.dg/torture/pr92088-1.c   -O2 -flto -fuse-linker-plugin
> > > -fno-fat-lto-objects  (internal compiler error)
> > >
> > > I can see:
> > > Excess errors:
> > > during IPA pass: cp
> > > lto1: internal compiler error: in operator[], at vec.h:878
> > > 0xa0a5d7 vec<lto_encoder_entry, va_heap, vl_embed>::operator[](unsigned int)
> > >         /gcc/vec.h:878
> > > 0xa0a5d7 vec<lto_encoder_entry, va_heap, vl_ptr>::operator[](unsigned int)
> > >         /gcc/vec.h:1444
> > > 0xa194d3 vec<lto_encoder_entry, va_heap, vl_ptr>::operator[](unsigned int)
> > >         /gcc/tree.h:3408
> > > 0xa194d3 lto_symtab_encoder_deref
> > >         /gcc/lto-streamer.h:1173
> > > 0xa194d3 ipa_prop_read_section
> > >         /gcc/ipa-prop.c:5060
> > > 0xa194d3 ipa_prop_read_jump_functions()
> > >         /gcc/ipa-prop.c:5089
> > > 0xb6ba71 ipa_read_summaries_1
> > >         /gcc/passes.c:2837
> > > 0x6bc4b5 read_cgraph_and_symbols(unsigned int, char const**)
> > >         /gcc/lto/lto-common.c:2921
> > > 0x69deb2 lto_main()
> > >         /gcc/lto/lto.c:625
> > >
> > > The tests pass on trunk.
> >
> > I will check this out.  I seem to remember that we fixed this issue on
> > mainline.  It was an extra zero byte being streamed; I will dig it out.
> Found it:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95362
> I will backport the patch shortly.  Sorry for the breakage - it is quite
> impressive that no non-trivial SCC is streamed in the distro bootstrap.
>

Indeed! Thanks for the quick fix.

> Honza
> >
> > Honza
> > >
> > > Christophe
> > > > Thanks,
> > > > Richard.
> > > >
> > > > > Honza
> > > > >
> > > >
> > > > --
> > > > Richard Biener <rguenther@suse.de>
> > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

end of thread

Thread overview: 6+ messages
2020-08-06 14:34 Backporting streaming and enum changes Jan Hubicka
2020-08-06 14:39 ` Richard Biener
2020-08-14  8:36   ` Christophe Lyon
2020-08-14  9:15     ` Jan Hubicka
2020-08-14  9:21       ` Jan Hubicka
2020-08-14  9:30         ` Christophe Lyon
