public inbox for gcc-patches@gcc.gnu.org
* [PR64164] drop copyrename, integrate into expand
@ 2015-03-27 18:04 Alexandre Oliva
From: Alexandre Oliva @ 2015-03-27 18:04 UTC (permalink / raw)
  To: gcc-patches

This patch reworks the out-of-ssa expander to enable coalescing of SSA
partitions that don't share the same base name.  This is done only when
optimizing.

The test we use to tell whether two partitions can be merged no longer
requires them to have the same base variable when optimizing, so they
become eligible for coalescing, as they would after copyrename.  We then
compute the partitioning we'd get if all coalescible partitions were
coalesced, and use this partition assignment to assign base var numbers.
These base var numbers are then used to identify conflicts, which used
to be based on shared base vars or base types.
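
A minimal sketch of that base numbering, assuming a plain union-find in
place of GCC's partition type.  The names uf_find and
assign_base_indices are hypothetical and do not appear in the patch,
and unlike the patch the sketch numbers every partition rather than
only those used in copies:

```c
#include <assert.h>
#include <stdlib.h>

/* Find the representative of X, halving paths along the way.  */
static int
uf_find (int *parent, int x)
{
  while (parent[x] != x)
    {
      parent[x] = parent[parent[x]];
      x = parent[x];
    }
  return x;
}

/* Number N partitions so that members of the same tentative coalesce
   class share a base index.  FIRST[i] and SECOND[i] name the I-th
   coalescible pair.  Store each partition's index in BASE and return
   the number of distinct bases, or -1 if out of memory.  */
int
assign_base_indices (int n, const int *first, const int *second,
                     int npairs, int *base)
{
  if (n <= 0)
    return 0;

  int *parent = malloc (n * sizeof *parent);
  int *index = malloc (n * sizeof *index);
  if (!parent || !index)
    {
      free (parent);
      free (index);
      return -1;
    }

  for (int i = 0; i < n; i++)
    {
      parent[i] = i;   /* Every partition starts in its own class.  */
      index[i] = -1;   /* No base index handed out yet.  */
    }

  /* Pretend every coalescible pair does get coalesced.  */
  for (int i = 0; i < npairs; i++)
    {
      assert (first[i] >= 0 && first[i] < n);
      assert (second[i] >= 0 && second[i] < n);
      int r1 = uf_find (parent, first[i]);
      int r2 = uf_find (parent, second[i]);
      if (r1 != r2)
        parent[r2] = r1;
    }

  /* Enumerate the resulting classes densely, in order of first use.  */
  int count = 0;
  for (int i = 0; i < n; i++)
    {
      int root = uf_find (parent, i);
      if (index[root] < 0)
        index[root] = count++;
      base[i] = index[root];
    }

  free (parent);
  free (index);
  return count;
}
```

Two partitions can only conflict if they end up with the same base
index, so conflict detection stays restricted to names that could
actually be coalesced.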

We now propagate base var names during coalescing proper, only towards
the leader variable.  I'm no longer sure this is still needed, but
something about handling variables and results led me this way and I
didn't revisit it.  I might rework that with a later patch, or a later
revision of this patch; doing so would require either other means to
identify partitions holding result_decls during merging, or allowing
such merges and dealing with param and result decls in a different way
during expand proper.
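
The preference order used when propagating the base var to the leader
(see the attempt_coalesce ChangeLog entry below) can be sketched as a
simple ranking.  This is only an illustration: enum base_kind and
pick_leader_base are hypothetical names, and the sketch collapses the
patch's chain of tree-code tests into one comparison:

```c
#include <assert.h>
#include <stdbool.h>

/* Kinds of base variable a partition may carry, in increasing order
   of preference for naming the merged partition.  */
enum base_kind
{
  BASE_NONE,                    /* Anonymous SSA name, no base var.  */
  BASE_IGNORED_VAR,             /* VAR_DECL with DECL_IGNORED_P.  */
  BASE_NAMED_VAR,               /* User-visible VAR_DECL.  */
  BASE_PARM,                    /* PARM_DECL without a default def.  */
  BASE_PARM_DEFAULT_OR_RESULT   /* PARM_DECL default def, or RESULT_DECL.  */
};

/* Decide which partition's base var names the merged partition.
   Return 0 to keep the first one, 1 to take the second one, or -1 to
   refuse the merge.  SAME_DECL says both partitions already share one
   base var.  */
int
pick_leader_base (enum base_kind k1, enum base_kind k2, bool same_decl)
{
  assert (k1 <= BASE_PARM_DEFAULT_OR_RESULT);
  assert (k2 <= BASE_PARM_DEFAULT_OR_RESULT);

  if (same_decl)
    return 0;

  /* Two parm default defs would already have conflicted, so this can
     only be a parm default def meeting the result decl.  */
  if (k1 == BASE_PARM_DEFAULT_OR_RESULT
      && k2 == BASE_PARM_DEFAULT_OR_RESULT)
    return -1;

  /* Otherwise the higher-ranked kind wins; ties go to the first.  */
  return k2 > k1 ? 1 : 0;
}
```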

I had to fix two lingering bugs in order for the whole thing to work.
First, we perform conflict detection after abnormal coalescing, but we
computed live ranges involving only the partition leaders, so conflicts
with other names already coalesced wouldn't be detected.  Second, we
didn't track default defs for parms as live at entry, so they might end
up coalesced.  I guess neither of these problems would have been
exercised in practice, because we wouldn't even consider merging ssa
names associated with different variables.
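
To illustrate the second fix: all parm default defs are defined at
function entry, so any two of them that are live there must conflict.
A rough sketch, assuming at most 32 partitions held in a bitmask and
ignoring the per-base grouping the real conflict builder uses;
add_entry_conflicts and MAX_PARTS are hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PARTS 32

/* PARM_DEFAULTS_LIVE has bit P set if partition P holds a parm
   default def that is live at function entry.  Record in CONFLICTS a
   conflict between every pair of such partitions; CONFLICTS[P] is the
   bitmask of partitions that conflict with P.  */
void
add_entry_conflicts (uint32_t parm_defaults_live,
                     uint32_t conflicts[MAX_PARTS])
{
  assert (conflicts);

  for (int p = 0; p < MAX_PARTS; p++)
    if (parm_defaults_live & (UINT32_C (1) << p))
      /* P conflicts with every other live parm default def.  */
      conflicts[p] |= parm_defaults_live & ~(UINT32_C (1) << p);
}
```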

In the end, I verified that this fixed the codegen regression in the
PR64164 testcase, which failed to merge two partitions that could in
theory be merged; the merge wasn't even considered due to differences
in the SSA var names.

I'd agree that disregarding the var names and dropping all 5 instances
of the copyrename pass is too much of a change to fix this one problem,
but...  it's something we should have tackled long ago, and it gets
this and other jobs done, so...

Regstrapped on x86_64-linux-gnu native and on i686-pc-linux-gnu native
on x86_64, so without lto.  Is this ok to install?


for  gcc/ChangeLog

	PR rtl-optimization/64164
	* Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
	* tree-ssa-copyrename.c: Removed.
	* opts.c (default_options_table): Drop -ftree-copyrename.
	* passes.def: Drop all occurrences of pass_rename_ssa_copies.
	* common.opt (ftree-copyrename): Ignore.
	(ftree-coalesce-vars, ftree-coalesce-inlined-vars): Likewise.
	* doc/invoke.texi: Remove the ignored options above.
	* gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
	across base variables when optimizing.
	* tree-ssa-coalesce.c (build_ssa_conflict_graph): Process
	PARM_DECLs' default defs at the entry point.
	(attempt_coalesce): Add param_defaults argument, and
	track the presence of default defs for params in each
	partition.  Propagate base var to leader on merge, preferring
	parms and results, named vars, ignored vars, and then anon
	vars.  Refuse to merge a RESULT_DECL partition with a default
	PARM_DECL one.
	(perform_abnormal_coalescing): Add param_defaults argument,
	and pass it to attempt_coalesce.
	(coalesce_partitions): Likewise.
	(dump_part_var_map): New.
	(compute_optimized_partition_bases): New, called by...
	(coalesce_ssa_name): ... when optimizing, disabling
	partition_view_bitmap's base assignment.  Pass local
	param_defaults to coalescer functions.
	* tree-ssa-live.c (var_map_base_init): Note use only when not
	optimizing.
	(calculate_live_ranges): Initialize for all SSA names, not
	just partition leaders.
---
 gcc/Makefile.in           |    1 
 gcc/common.opt            |   12 +
 gcc/doc/invoke.texi       |   29 ---
 gcc/gimple-expr.c         |    7 -
 gcc/opts.c                |    1 
 gcc/passes.def            |    5 
 gcc/tree-ssa-coalesce.c   |  336 ++++++++++++++++++++++++++++++
 gcc/tree-ssa-copyrename.c |  499 ---------------------------------------------
 gcc/tree-ssa-live.c       |   11 +
 9 files changed, 347 insertions(+), 554 deletions(-)
 delete mode 100644 gcc/tree-ssa-copyrename.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index de1f3b6..b3149ba 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1427,7 +1427,6 @@ OBJS = \
 	tree-ssa-ccp.o \
 	tree-ssa-coalesce.o \
 	tree-ssa-copy.o \
-	tree-ssa-copyrename.o \
 	tree-ssa-dce.o \
 	tree-ssa-dom.o \
 	tree-ssa-dse.o \
diff --git a/gcc/common.opt b/gcc/common.opt
index b49ac46..fefaee7 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2207,16 +2207,16 @@ Common Report Var(flag_tree_ch) Optimization
 Enable loop header copying on trees
 
 ftree-coalesce-inlined-vars
-Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization
-Enable coalescing of copy-related user variables that are inlined
+Common Ignore RejectNegative
+Does nothing.  Preserved for backward compatibility.
 
 ftree-coalesce-vars
-Common Report Var(flag_ssa_coalesce_vars,2) Optimization
-Enable coalescing of all copy-related user variables
+Common Ignore
+Does nothing.  Preserved for backward compatibility.
 
 ftree-copyrename
-Common Report Var(flag_tree_copyrename) Optimization
-Replace SSA temporaries with better names in copies
+Common Ignore
+Does nothing.  Preserved for backward compatibility.
 
 ftree-copy-prop
 Common Report Var(flag_tree_copy_prop) Optimization
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 08ce074..2f6acb5 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -442,8 +442,7 @@ Objective-C and Objective-C++ Dialects}.
 -fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol
 -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
 -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
--ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
--ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
+-ftree-copy-prop -ftree-dce -ftree-dominator-opts -ftree-dse @gol
 -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
 -ftree-loop-if-convert-stores -ftree-loop-im @gol
 -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
@@ -8800,32 +8799,6 @@ Perform scalar replacement of aggregates.  This pass replaces structure
 references with scalars to prevent committing structures to memory too
 early.  This flag is enabled by default at @option{-O} and higher.
 
-@item -ftree-copyrename
-@opindex ftree-copyrename
-Perform copy renaming on trees.  This pass attempts to rename compiler
-temporaries to other variables at copy locations, usually resulting in
-variable names which more closely resemble the original variables.  This flag
-is enabled by default at @option{-O} and higher.
-
-@item -ftree-coalesce-inlined-vars
-@opindex ftree-coalesce-inlined-vars
-Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
-combine small user-defined variables too, but only if they are inlined
-from other functions.  It is a more limited form of
-@option{-ftree-coalesce-vars}.  This may harm debug information of such
-inlined variables, but it keeps variables of the inlined-into
-function apart from each other, such that they are more likely to
-contain the expected values in a debugging session.
-
-@item -ftree-coalesce-vars
-@opindex ftree-coalesce-vars
-Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
-combine small user-defined variables too, instead of just compiler
-temporaries.  This may severely limit the ability to debug an optimized
-program compiled with @option{-fno-var-tracking-assignments}.  In the
-negated form, this flag prevents SSA coalescing of user variables,
-including inlined ones.  This option is enabled by default.
-
 @item -ftree-ter
 @opindex ftree-ter
 Perform temporary expression replacement during the SSA->normal phase.  Single
diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
index efc93b7..62ae577 100644
--- a/gcc/gimple-expr.c
+++ b/gcc/gimple-expr.c
@@ -399,13 +399,14 @@ copy_var_decl (tree var, tree name, tree type)
 bool
 gimple_can_coalesce_p (tree name1, tree name2)
 {
-  /* First check the SSA_NAME's associated DECL.  We only want to
-     coalesce if they have the same DECL or both have no associated DECL.  */
+  /* First check the SSA_NAME's associated DECL.  Without
+     optimization, we only want to coalesce if they have the same DECL
+     or both have no associated DECL.  */
   tree var1 = SSA_NAME_VAR (name1);
   tree var2 = SSA_NAME_VAR (name2);
   var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
   var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
-  if (var1 != var2)
+  if (var1 != var2 && !optimize)
     return false;
 
   /* Now check the types.  If the types are the same, then we should
diff --git a/gcc/opts.c b/gcc/opts.c
index 39c190d..8149421 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -453,7 +453,6 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 },
     { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 },
-    { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 },
diff --git a/gcc/passes.def b/gcc/passes.def
index 2bc5dcd..345f451 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -76,7 +76,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_all_early_optimizations);
       PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
 	  NEXT_PASS (pass_remove_cgraph_callee_edges);
-	  NEXT_PASS (pass_rename_ssa_copies);
 	  NEXT_PASS (pass_ccp);
 	  /* After CCP we rewrite no longer addressed locals into SSA
 	     form if possible.  */
@@ -152,7 +151,6 @@ along with GCC; see the file COPYING3.  If not see
       /* Initial scalar cleanups before alias computation.
 	 They ensure memory accesses are not indirect wherever possible.  */
       NEXT_PASS (pass_strip_predict_hints);
-      NEXT_PASS (pass_rename_ssa_copies);
       NEXT_PASS (pass_ccp);
       /* After CCP we rewrite no longer addressed locals into SSA
 	 form if possible.  */
@@ -180,7 +178,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_stdarg);
       NEXT_PASS (pass_lower_complex);
       NEXT_PASS (pass_sra);
-      NEXT_PASS (pass_rename_ssa_copies);
       /* The dom pass will also resolve all __builtin_constant_p calls
          that are still there to 0.  This has to be done after some
 	 propagations have already run, but before some more dead code
@@ -289,7 +286,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_fold_builtins);
       NEXT_PASS (pass_optimize_widening_mul);
       NEXT_PASS (pass_tail_calls);
-      NEXT_PASS (pass_rename_ssa_copies);
       /* FIXME: If DCE is not run before checking for uninitialized uses,
 	 we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
 	 However, this also causes us to misdiagnose cases that should be
@@ -324,7 +320,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_dce);
       NEXT_PASS (pass_asan);
       NEXT_PASS (pass_tsan);
-      NEXT_PASS (pass_rename_ssa_copies);
       /* ???  We do want some kind of loop invariant motion, but we possibly
          need to adjust LIM to be more friendly towards preserving accurate
 	 debug information here.  */
diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index dd6b9c0..48a723c 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -833,6 +833,16 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
   basic_block bb;
   ssa_op_iter iter;
   live_track_p live;
+  basic_block entry;
+
+  /* If we are optimizing, we may attempt to coalesce variables from
+     different base variables, including different parameters, so we
+     have to make sure default defs live at the entry block conflict
+     with each other.  */
+  if (optimize)
+    entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+  else
+    entry = NULL;
 
   map = live_var_map (liveinfo);
   graph = ssa_conflicts_new (num_var_partitions (map));
@@ -891,6 +901,33 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
 	    live_track_process_def (live, result, graph);
 	}
 
+      /* Pretend there are defs for params' default defs at the start
+	 of the (post-)entry block.  We run after abnormal coalescing,
+	 so we can't assume the leader variable is the default
+	 definition, but because of SSA_NAME_VAR adjustments in
+	 attempt_coalesce, we can assume that if there is any
+	 PARM_DECL in the partition, it will be the leader's
+	 SSA_NAME_VAR.  */
+      if (bb == entry)
+	{
+	  unsigned part;
+	  bitmap_iterator bi;
+	  EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, part, bi)
+	    {
+	      bitmap_iterator bi2;
+	      unsigned v;
+	      EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[part],
+					0, v, bi2)
+		{
+		  tree var = partition_to_var (map, v);
+		  if (!SSA_NAME_VAR (var)
+		      || TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL)
+		    continue;
+		  live_track_process_def (live, var, graph);
+		}
+	    }
+	}
+
      live_track_clear_base_vars (live);
     }
 
@@ -1127,11 +1164,12 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
 
 static inline bool
 attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
-		  FILE *debug)
+		  bitmap param_defaults, FILE *debug)
 {
   int z;
   tree var1, var2;
   int p1, p2;
+  bool default_def = false;
 
   p1 = var_to_partition (map, ssa_name (x));
   p2 = var_to_partition (map, ssa_name (y));
@@ -1160,6 +1198,70 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
     {
       var1 = partition_to_var (map, p1);
       var2 = partition_to_var (map, p2);
+
+      tree leader;
+
+      if (var1 == var2 || !SSA_NAME_VAR (var2)
+	  || SSA_NAME_VAR (var1) == SSA_NAME_VAR (var2))
+	{
+	  leader = SSA_NAME_VAR (var1);
+	  default_def = (leader && TREE_CODE (leader) == PARM_DECL
+			 && (SSA_NAME_IS_DEFAULT_DEF (var1)
+			     || bitmap_bit_p (param_defaults, p1)));
+	}
+      else if (!SSA_NAME_VAR (var1))
+	{
+	  leader = SSA_NAME_VAR (var2);
+	  default_def = (leader && TREE_CODE (leader) == PARM_DECL
+			 && (SSA_NAME_IS_DEFAULT_DEF (var2)
+			     || bitmap_bit_p (param_defaults, p2)));
+	}
+      else if ((TREE_CODE (SSA_NAME_VAR (var1)) == PARM_DECL
+		&& (SSA_NAME_IS_DEFAULT_DEF (var1)
+		    || bitmap_bit_p (param_defaults, p1)))
+	       || TREE_CODE (SSA_NAME_VAR (var1)) == RESULT_DECL)
+	{
+	  if ((TREE_CODE (SSA_NAME_VAR (var2)) == PARM_DECL
+	       && (SSA_NAME_IS_DEFAULT_DEF (var2)
+		   || bitmap_bit_p (param_defaults, p2)))
+	      || TREE_CODE (SSA_NAME_VAR (var2)) == RESULT_DECL)
+	    {
+	      /* We only have one RESULT_DECL, and two PARM_DECL
+		 DEFAULT_DEFs would have conflicted, so we know either
+		 one of var1 or var2 is a PARM_DECL, and the other is
+		 a RESULT_DECL.  */
+	      if (debug)
+		fprintf (debug, ": Cannot coalesce PARM_DECL and RESULT_DECL\n");
+	      return false;
+	    }
+	  leader = SSA_NAME_VAR (var1);
+	  default_def = TREE_CODE (leader) == PARM_DECL;
+	}
+      else if ((TREE_CODE (SSA_NAME_VAR (var2)) == PARM_DECL
+		&& (SSA_NAME_IS_DEFAULT_DEF (var2)
+		    || bitmap_bit_p (param_defaults, p2)))
+	       || TREE_CODE (SSA_NAME_VAR (var2)) == RESULT_DECL)
+	{
+	  leader = SSA_NAME_VAR (var2);
+	  default_def = TREE_CODE (leader) == PARM_DECL;
+	}
+      else if (TREE_CODE (SSA_NAME_VAR (var1)) == PARM_DECL)
+	leader = SSA_NAME_VAR (var1);
+      else if (TREE_CODE (SSA_NAME_VAR (var2)) == PARM_DECL)
+	leader = SSA_NAME_VAR (var2);
+      else if (TREE_CODE (SSA_NAME_VAR (var1)) == VAR_DECL
+	       && !DECL_IGNORED_P (SSA_NAME_VAR (var1)))
+	leader = SSA_NAME_VAR (var1);
+      else if (TREE_CODE (SSA_NAME_VAR (var2)) == VAR_DECL
+	       && !DECL_IGNORED_P (SSA_NAME_VAR (var2)))
+	leader = SSA_NAME_VAR (var2);
+      else if (TREE_CODE (SSA_NAME_VAR (var1)) == VAR_DECL)
+	leader = SSA_NAME_VAR (var1);
+      else if (TREE_CODE (SSA_NAME_VAR (var2)) == VAR_DECL)
+	leader = SSA_NAME_VAR (var2);
+      else /* What else could it be?  */
+	gcc_unreachable ();
+
       z = var_union (map, var1, var2);
       if (z == NO_PARTITION)
 	{
@@ -1178,8 +1280,46 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
 	    ssa_conflicts_merge (graph, p2, p1);
 	}
 
+      if (z == p1)
+	{
+	  if (SSA_NAME_VAR (var1) != leader)
+	    {
+	      replace_ssa_name_symbol (var1, leader);
+	      if (debug)
+		{
+		  fprintf (debug, ": Renamed ");
+		  print_generic_expr (debug, var1, TDF_SLIM);
+		}
+	    }
+	  if (default_def)
+	    {
+	      if (SSA_NAME_IS_DEFAULT_DEF (var2))
+		bitmap_clear_bit (param_defaults, p2);
+	      bitmap_set_bit (param_defaults, p1);
+	    }
+	}
+      else
+	{
+	  if (SSA_NAME_VAR (var2) != leader)
+	    {
+	      replace_ssa_name_symbol (var2, leader);
+	      if (debug)
+		{
+		  fprintf (debug, ": Renamed ");
+		  print_generic_expr (debug, var2, TDF_SLIM);
+		}
+	    }
+	  if (default_def)
+	    {
+	      if (SSA_NAME_IS_DEFAULT_DEF (var1))
+		bitmap_clear_bit (param_defaults, p1);
+	      bitmap_set_bit (param_defaults, p2);
+	    }
+	}
+
       if (debug)
 	fprintf (debug, ": Success -> %d\n", z);
+
       return true;
     }
 
@@ -1194,7 +1334,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
    Debug output is sent to DEBUG if it is non-NULL.  */
 
 static void
-perform_abnormal_coalescing (var_map map, FILE *debug)
+perform_abnormal_coalescing (var_map map, bitmap param_defaults, FILE *debug)
 {
   basic_block bb;
   edge e;
@@ -1223,7 +1363,7 @@ perform_abnormal_coalescing (var_map map, FILE *debug)
 		if (debug)
 		  fprintf (debug, "Abnormal coalesce: ");
 
-		if (!attempt_coalesce (map, NULL, v1, v2, debug))
+		if (!attempt_coalesce (map, NULL, v1, v2, param_defaults, debug))
 		  fail_abnormal_edge_coalesce (v1, v2);
 	      }
 	  }
@@ -1235,7 +1375,7 @@ perform_abnormal_coalescing (var_map map, FILE *debug)
 
 static void
 coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
-		     FILE *debug)
+		     bitmap param_defaults, FILE *debug)
 {
   int x = 0, y = 0;
   tree var1, var2;
@@ -1253,7 +1393,7 @@ coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
 
       if (debug)
 	fprintf (debug, "Coalesce list: ");
-      attempt_coalesce (map, graph, x, y, debug);
+      attempt_coalesce (map, graph, x, y, param_defaults, debug);
     }
 }
 
@@ -1281,6 +1421,178 @@ ssa_name_var_hash::equal (const value_type *n1, const compare_type *n2)
 }
 
 
+/* Output partition map MAP with coalescing plan PART to file F.  */
+
+void
+dump_part_var_map (FILE *f, partition part, var_map map)
+{
+  int t;
+  unsigned x, y;
+  int p;
+
+  fprintf (f, "\nCoalescible Partition map \n\n");
+
+  for (x = 0; x < map->num_partitions; x++)
+    {
+      if (map->view_to_partition != NULL)
+	p = map->view_to_partition[x];
+      else
+	p = x;
+
+      if (ssa_name (p) == NULL_TREE
+	  || virtual_operand_p (ssa_name (p)))
+        continue;
+
+      t = 0;
+      for (y = 1; y < num_ssa_names; y++)
+        {
+	  tree var = version_to_var (map, y);
+	  if (!var)
+	    continue;
+	  int q = var_to_partition (map, var);
+	  p = partition_find (part, q);
+	  gcc_assert (map->partition_to_base_index[q]
+		      == map->partition_to_base_index[p]);
+
+	  if (p == (int)x)
+	    {
+	      if (t++ == 0)
+	        {
+		  fprintf (f, "Partition %d, base %d (", x,
+			   map->partition_to_base_index[q]);
+		  print_generic_expr (f, partition_to_var (map, q), TDF_SLIM);
+		  fprintf (f, " - ");
+		}
+	      fprintf (f, "%d ", y);
+	    }
+	}
+      if (t != 0)
+	fprintf (f, ")\n");
+    }
+  fprintf (f, "\n");
+}
+
+/* Fill in MAP's partition_to_base_index, with one index for each
+   partition of SSA names USED_IN_COPIES and related by CL
+   coalesce possibilities.  */
+
+static void
+compute_optimized_partition_bases (var_map map, bitmap used_in_copies,
+				   coalesce_list_p cl)
+{
+  int parts = num_var_partitions (map);
+  partition tentative = partition_new (parts);
+
+  /* Partition the SSA versions so that, for each coalescible
+     pair, both of its members are in the same partition in
+     TENTATIVE.  */
+  gcc_assert (!cl->sorted);
+  coalesce_pair_p node;
+  coalesce_iterator_type ppi;
+  FOR_EACH_PARTITION_PAIR (node, ppi, cl)
+    {
+      tree v1 = ssa_name (node->first_element);
+      int p1 = partition_find (tentative, var_to_partition (map, v1));
+      tree v2 = ssa_name (node->second_element);
+      int p2 = partition_find (tentative, var_to_partition (map, v2));
+
+      if (p1 == p2)
+	continue;
+
+      partition_union (tentative, p1, p2);
+    }
+
+  /* We have to deal with cost one pairs too.  */
+  for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next)
+    {
+      tree v1 = ssa_name (co->first_element);
+      int p1 = partition_find (tentative, var_to_partition (map, v1));
+      tree v2 = ssa_name (co->second_element);
+      int p2 = partition_find (tentative, var_to_partition (map, v2));
+
+      if (p1 == p2)
+	continue;
+
+      partition_union (tentative, p1, p2);
+    }
+
+  /* And also with abnormal edges.  */
+  basic_block bb;
+  edge e;
+  edge_iterator ei;
+  FOR_EACH_BB_FN (bb, cfun)
+    {
+      FOR_EACH_EDGE (e, ei, bb->preds)
+	if (e->flags & EDGE_ABNORMAL)
+	  {
+	    gphi_iterator gsi;
+	    for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
+		 gsi_next (&gsi))
+	      {
+		gphi *phi = gsi.phi ();
+		tree arg = PHI_ARG_DEF (phi, e->dest_idx);
+		if (SSA_NAME_IS_DEFAULT_DEF (arg)
+		    && (!SSA_NAME_VAR (arg)
+			|| TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL))
+		  continue;
+
+		tree res = PHI_RESULT (phi);
+
+		int p1 = partition_find (tentative, var_to_partition (map, res));
+		int p2 = partition_find (tentative, var_to_partition (map, arg));
+
+		if (p1 == p2)
+		  continue;
+
+		partition_union (tentative, p1, p2);
+	      }
+	  }
+    }
+  
+
+  map->partition_to_base_index = XCNEWVEC (int, parts);
+  auto_vec<unsigned int> index_map (parts);
+  if (parts)
+    index_map.quick_grow (parts);
+
+  const unsigned no_part = -1;
+  unsigned count = parts;
+  while (count)
+    index_map[--count] = no_part;
+
+  /* Initialize MAP's mapping from partition to base index, using
+     as base indices an enumeration of the TENTATIVE partitions in
+     which each SSA version ended up, so that we compute conflicts
+     between all SSA versions that ended up in the same potential
+     coalesce partition.  */
+  bitmap_iterator bi;
+  unsigned i;
+  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
+    {
+      int pidx = var_to_partition (map, ssa_name (i));
+      int base = partition_find (tentative, pidx);
+      if (index_map[base] != no_part)
+	continue;
+      index_map[base] = count++;
+    }
+
+  map->num_basevars = count;
+
+  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
+    {
+      int pidx = var_to_partition (map, ssa_name (i));
+      int base = partition_find (tentative, pidx);
+      gcc_assert (index_map[base] < count);
+      map->partition_to_base_index[pidx] = index_map[base];
+    }
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    dump_part_var_map (dump_file, tentative, map);
+
+  partition_delete (tentative);
+}
+
+
 /* Reduce the number of copies by coalescing variables in the function.  Return
    a partition map with the resulting coalesces.  */
 
@@ -1346,7 +1658,12 @@ coalesce_ssa_name (void)
     dump_var_map (dump_file, map);
 
   /* Don't calculate live ranges for variables not in the coalesce list.  */
-  partition_view_bitmap (map, used_in_copies, true);
+  partition_view_bitmap (map, used_in_copies, !optimize);
+
+  /* If we are optimizing, compute the base indices ourselves.  */
+  if (optimize)
+    compute_optimized_partition_bases (map, used_in_copies, cl);
+
   BITMAP_FREE (used_in_copies);
 
   if (num_var_partitions (map) < 1)
@@ -1355,13 +1672,15 @@ coalesce_ssa_name (void)
       return map;
     }
 
+  bitmap param_defaults = BITMAP_ALLOC (NULL);
+
   /* First, coalesce all the copies across abnormal edges.  These are not placed
      in the coalesce list because they do not need to be sorted, and simply
      consume extra memory/compilation time in large programs.
      Performing abnormal coalescing also needs no live/conflict computation
      because it must succeed (but we lose checking that it indeed does).
      Still for PR63155 this reduces memory usage from 10GB to zero.  */
-  perform_abnormal_coalescing (map,
+  perform_abnormal_coalescing (map, param_defaults,
 			       ((dump_flags & TDF_DETAILS) ? dump_file : NULL));
 
   if (dump_file && (dump_flags & TDF_DETAILS))
@@ -1393,9 +1712,10 @@ coalesce_ssa_name (void)
     dump_var_map (dump_file, map);
 
   /* Now coalesce everything in the list.  */
-  coalesce_partitions (map, graph, cl,
+  coalesce_partitions (map, graph, cl, param_defaults,
 		       ((dump_flags & TDF_DETAILS) ? dump_file : NULL));
 
+  BITMAP_FREE (param_defaults);
   delete_coalesce_list (cl);
   ssa_conflicts_delete (graph);
 
diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c
deleted file mode 100644
index f3cb56e..0000000
--- a/gcc/tree-ssa-copyrename.c
+++ /dev/null
@@ -1,499 +0,0 @@
-/* Rename SSA copies.
-   Copyright (C) 2004-2015 Free Software Foundation, Inc.
-   Contributed by Andrew MacLeod <amacleod@redhat.com>
-
-This file is part of GCC.
-
-GCC is free software; you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation; either version 3, or (at your option)
-any later version.
-
-GCC is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-GNU General Public License for more details.
-
-You should have received a copy of the GNU General Public License
-along with GCC; see the file COPYING3.  If not see
-<http://www.gnu.org/licenses/>.  */
-
-#include "config.h"
-#include "system.h"
-#include "coretypes.h"
-#include "tm.h"
-#include "hash-set.h"
-#include "machmode.h"
-#include "vec.h"
-#include "double-int.h"
-#include "input.h"
-#include "alias.h"
-#include "symtab.h"
-#include "wide-int.h"
-#include "inchash.h"
-#include "tree.h"
-#include "fold-const.h"
-#include "predict.h"
-#include "hard-reg-set.h"
-#include "function.h"
-#include "dominance.h"
-#include "cfg.h"
-#include "basic-block.h"
-#include "tree-ssa-alias.h"
-#include "internal-fn.h"
-#include "gimple-expr.h"
-#include "is-a.h"
-#include "gimple.h"
-#include "gimple-iterator.h"
-#include "flags.h"
-#include "tree-pretty-print.h"
-#include "bitmap.h"
-#include "gimple-ssa.h"
-#include "stringpool.h"
-#include "tree-ssanames.h"
-#include "hashtab.h"
-#include "rtl.h"
-#include "statistics.h"
-#include "real.h"
-#include "fixed-value.h"
-#include "insn-config.h"
-#include "expmed.h"
-#include "dojump.h"
-#include "explow.h"
-#include "calls.h"
-#include "emit-rtl.h"
-#include "varasm.h"
-#include "stmt.h"
-#include "expr.h"
-#include "tree-dfa.h"
-#include "tree-inline.h"
-#include "tree-ssa-live.h"
-#include "tree-pass.h"
-#include "langhooks.h"
-
-static struct
-{
-  /* Number of copies coalesced.  */
-  int coalesced;
-} stats;
-
-/* The following routines implement the SSA copy renaming phase.
-
-   This optimization looks for copies between 2 SSA_NAMES, either through a
-   direct copy, or an implicit one via a PHI node result and its arguments.
-
-   Each copy is examined to determine if it is possible to rename the base
-   variable of one of the operands to the same variable as the other operand.
-   i.e.
-   T.3_5 = <blah>
-   a_1 = T.3_5
-
-   If this copy couldn't be copy propagated, it could possibly remain in the
-   program throughout the optimization phases.   After SSA->normal, it would
-   become:
-
-   T.3 = <blah>
-   a = T.3
-
-   Since T.3_5 is distinct from all other SSA versions of T.3, there is no
-   fundamental reason why the base variable needs to be T.3, subject to
-   certain restrictions.  This optimization attempts to determine if we can
-   change the base variable on copies like this, and result in code such as:
-
-   a_5 = <blah>
-   a_1 = a_5
-
-   This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is
-   possible, the copy goes away completely. If it isn't possible, a new temp
-   will be created for a_5, and you will end up with the exact same code:
-
-   a.8 = <blah>
-   a = a.8
-
-   The other benefit of performing this optimization relates to what variables
-   are chosen in copies.  Gimplification of the program uses temporaries for
-   a lot of things. expressions like
-
-   a_1 = <blah>
-   <blah2> = a_1
-
-   get turned into
-
-   T.3_5 = <blah>
-   a_1 = T.3_5
-   <blah2> = a_1
-
-   Copy propagation is done in a forward direction, and if we can propagate
-   through the copy, we end up with:
-
-   T.3_5 = <blah>
-   <blah2> = T.3_5
-
-   The copy is gone, but so is all reference to the user variable 'a'. By
-   performing this optimization, we would see the sequence:
-
-   a_5 = <blah>
-   a_1 = a_5
-   <blah2> = a_1
-
-   which copy propagation would then turn into:
-
-   a_5 = <blah>
-   <blah2> = a_5
-
-   and so we still retain the user variable whenever possible.  */
-
-
-/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
-   Choose a representative for the partition, and send debug info to DEBUG.  */
-
-static void
-copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
-{
-  int p1, p2, p3;
-  tree root1, root2;
-  tree rep1, rep2;
-  bool ign1, ign2, abnorm;
-
-  gcc_assert (TREE_CODE (var1) == SSA_NAME);
-  gcc_assert (TREE_CODE (var2) == SSA_NAME);
-
-  register_ssa_partition (map, var1);
-  register_ssa_partition (map, var2);
-
-  p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
-  p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
-
-  if (debug)
-    {
-      fprintf (debug, "Try : ");
-      print_generic_expr (debug, var1, TDF_SLIM);
-      fprintf (debug, "(P%d) & ", p1);
-      print_generic_expr (debug, var2, TDF_SLIM);
-      fprintf (debug, "(P%d)", p2);
-    }
-
-  gcc_assert (p1 != NO_PARTITION);
-  gcc_assert (p2 != NO_PARTITION);
-
-  if (p1 == p2)
-    {
-      if (debug)
-	fprintf (debug, " : Already coalesced.\n");
-      return;
-    }
-
-  rep1 = partition_to_var (map, p1);
-  rep2 = partition_to_var (map, p2);
-  root1 = SSA_NAME_VAR (rep1);
-  root2 = SSA_NAME_VAR (rep2);
-  if (!root1 && !root2)
-    return;
-
-  /* Don't coalesce if one of the variables occurs in an abnormal PHI.  */
-  abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1)
-	    || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2));
-  if (abnorm)
-    {
-      if (debug)
-	fprintf (debug, " : Abnormal PHI barrier.  No coalesce.\n");
-      return;
-    }
-
-  /* Partitions already have the same root, simply merge them.  */
-  if (root1 == root2)
-    {
-      p1 = partition_union (map->var_partition, p1, p2);
-      if (debug)
-	fprintf (debug, " : Same root, coalesced --> P%d.\n", p1);
-      return;
-    }
-
-  /* Never attempt to coalesce 2 different parameters.  */
-  if ((root1 && TREE_CODE (root1) == PARM_DECL)
-      && (root2 && TREE_CODE (root2) == PARM_DECL))
-    {
-      if (debug)
-        fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n");
-      return;
-    }
-
-  if ((root1 && TREE_CODE (root1) == RESULT_DECL)
-      != (root2 && TREE_CODE (root2) == RESULT_DECL))
-    {
-      if (debug)
-        fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n");
-      return;
-    }
-
-  ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1));
-  ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2));
-
-  /* Refrain from coalescing user variables, if requested.  */
-  if (!ign1 && !ign2)
-    {
-      if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2))
-	ign2 = true;
-      else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1))
-	ign1 = true;
-      else if (flag_ssa_coalesce_vars != 2)
-	{
-	  if (debug)
-	    fprintf (debug, " : 2 different USER vars. No coalesce.\n");
-	  return;
-	}
-      else
-	ign2 = true;
-    }
-
-  /* If both values have default defs, we can't coalesce.  If only one has a
-     tag, make sure that variable is the new root partition.  */
-  if (root1 && ssa_default_def (cfun, root1))
-    {
-      if (root2 && ssa_default_def (cfun, root2))
-	{
-	  if (debug)
-	    fprintf (debug, " : 2 default defs. No coalesce.\n");
-	  return;
-	}
-      else
-        {
-	  ign2 = true;
-	  ign1 = false;
-	}
-    }
-  else if (root2 && ssa_default_def (cfun, root2))
-    {
-      ign1 = true;
-      ign2 = false;
-    }
-
-  /* Do not coalesce if we cannot assign a symbol to the partition.  */
-  if (!(!ign2 && root2)
-      && !(!ign1 && root1))
-    {
-      if (debug)
-	fprintf (debug, " : Choosen variable has no root.  No coalesce.\n");
-      return;
-    }
-
-  /* Don't coalesce if the new chosen root variable would be read-only.
-     If both ign1 && ign2, then the root var of the larger partition
-     wins, so reject in that case if any of the root vars is TREE_READONLY.
-     Otherwise reject only if the root var, on which replace_ssa_name_symbol
-     will be called below, is readonly.  */
-  if (((root1 && TREE_READONLY (root1)) && ign2)
-      || ((root2 && TREE_READONLY (root2)) && ign1))
-    {
-      if (debug)
-	fprintf (debug, " : Readonly variable.  No coalesce.\n");
-      return;
-    }
-
-  /* Don't coalesce if the two variables aren't type compatible .  */
-  if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2))
-      /* There is a disconnect between the middle-end type-system and
-         VRP, avoid coalescing enum types with different bounds.  */
-      || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE
-	   || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE)
-	  && TREE_TYPE (var1) != TREE_TYPE (var2)))
-    {
-      if (debug)
-	fprintf (debug, " : Incompatible types.  No coalesce.\n");
-      return;
-    }
-
-  /* Merge the two partitions.  */
-  p3 = partition_union (map->var_partition, p1, p2);
-
-  /* Set the root variable of the partition to the better choice, if there is
-     one.  */
-  if (!ign2 && root2)
-    replace_ssa_name_symbol (partition_to_var (map, p3), root2);
-  else if (!ign1 && root1)
-    replace_ssa_name_symbol (partition_to_var (map, p3), root1);
-  else
-    gcc_unreachable ();
-
-  if (debug)
-    {
-      fprintf (debug, " --> P%d ", p3);
-      print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)),
-			  TDF_SLIM);
-      fprintf (debug, "\n");
-    }
-}
-
-
-namespace {
-
-const pass_data pass_data_rename_ssa_copies =
-{
-  GIMPLE_PASS, /* type */
-  "copyrename", /* name */
-  OPTGROUP_NONE, /* optinfo_flags */
-  TV_TREE_COPY_RENAME, /* tv_id */
-  ( PROP_cfg | PROP_ssa ), /* properties_required */
-  0, /* properties_provided */
-  0, /* properties_destroyed */
-  0, /* todo_flags_start */
-  0, /* todo_flags_finish */
-};
-
-class pass_rename_ssa_copies : public gimple_opt_pass
-{
-public:
-  pass_rename_ssa_copies (gcc::context *ctxt)
-    : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt)
-  {}
-
-  /* opt_pass methods: */
-  opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); }
-  virtual bool gate (function *) { return flag_tree_copyrename != 0; }
-  virtual unsigned int execute (function *);
-
-}; // class pass_rename_ssa_copies
-
-/* This function will make a pass through the IL, and attempt to coalesce any
-   SSA versions which occur in PHI's or copies.  Coalescing is accomplished by
-   changing the underlying root variable of all coalesced version.  This will
-   then cause the SSA->normal pass to attempt to coalesce them all to the same
-   variable.  */
-
-unsigned int
-pass_rename_ssa_copies::execute (function *fun)
-{
-  var_map map;
-  basic_block bb;
-  tree var, part_var;
-  gimple stmt;
-  unsigned x;
-  FILE *debug;
-
-  memset (&stats, 0, sizeof (stats));
-
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    debug = dump_file;
-  else
-    debug = NULL;
-
-  map = init_var_map (num_ssa_names);
-
-  FOR_EACH_BB_FN (bb, fun)
-    {
-      /* Scan for real copies.  */
-      for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
-	   gsi_next (&gsi))
-	{
-	  stmt = gsi_stmt (gsi);
-	  if (gimple_assign_ssa_name_copy_p (stmt))
-	    {
-	      tree lhs = gimple_assign_lhs (stmt);
-	      tree rhs = gimple_assign_rhs1 (stmt);
-
-	      copy_rename_partition_coalesce (map, lhs, rhs, debug);
-	    }
-	}
-    }
-
-  FOR_EACH_BB_FN (bb, fun)
-    {
-      /* Treat PHI nodes as copies between the result and each argument.  */
-      for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
-	   gsi_next (&gsi))
-        {
-          size_t i;
-	  tree res;
-	  gphi *phi = gsi.phi ();
-	  res = gimple_phi_result (phi);
-
-	  /* Do not process virtual SSA_NAMES.  */
-	  if (virtual_operand_p (res))
-	    continue;
-
-	  /* Make sure to only use the same partition for an argument
-	     as the result but never the other way around.  */
-	  if (SSA_NAME_VAR (res)
-	      && !DECL_IGNORED_P (SSA_NAME_VAR (res)))
-	    for (i = 0; i < gimple_phi_num_args (phi); i++)
-	      {
-		tree arg = PHI_ARG_DEF (phi, i);
-		if (TREE_CODE (arg) == SSA_NAME)
-		  copy_rename_partition_coalesce (map, res, arg,
-						  debug);
-	      }
-	  /* Else if all arguments are in the same partition try to merge
-	     it with the result.  */
-	  else
-	    {
-	      int all_p_same = -1;
-	      int p = -1;
-	      for (i = 0; i < gimple_phi_num_args (phi); i++)
-		{
-		  tree arg = PHI_ARG_DEF (phi, i);
-		  if (TREE_CODE (arg) != SSA_NAME)
-		    {
-		      all_p_same = 0;
-		      break;
-		    }
-		  else if (all_p_same == -1)
-		    {
-		      p = partition_find (map->var_partition,
-					  SSA_NAME_VERSION (arg));
-		      all_p_same = 1;
-		    }
-		  else if (all_p_same == 1
-			   && p != partition_find (map->var_partition,
-						   SSA_NAME_VERSION (arg)))
-		    {
-		      all_p_same = 0;
-		      break;
-		    }
-		}
-	      if (all_p_same == 1)
-		copy_rename_partition_coalesce (map, res,
-						PHI_ARG_DEF (phi, 0),
-						debug);
-	    }
-        }
-    }
-
-  if (debug)
-    dump_var_map (debug, map);
-
-  /* Now one more pass to make all elements of a partition share the same
-     root variable.  */
-
-  for (x = 1; x < num_ssa_names; x++)
-    {
-      part_var = partition_to_var (map, x);
-      if (!part_var)
-        continue;
-      var = ssa_name (x);
-      if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var))
-	continue;
-      if (debug)
-        {
-	  fprintf (debug, "Coalesced ");
-	  print_generic_expr (debug, var, TDF_SLIM);
-	  fprintf (debug, " to ");
-	  print_generic_expr (debug, part_var, TDF_SLIM);
-	  fprintf (debug, "\n");
-	}
-      stats.coalesced++;
-      replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var));
-    }
-
-  statistics_counter_event (fun, "copies coalesced",
-			    stats.coalesced);
-  delete_var_map (map);
-  return 0;
-}
-
-} // anon namespace
-
-gimple_opt_pass *
-make_pass_rename_ssa_copies (gcc::context *ctxt)
-{
-  return new pass_rename_ssa_copies (ctxt);
-}
diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
index e0c42669..d8f2b08 100644
--- a/gcc/tree-ssa-live.c
+++ b/gcc/tree-ssa-live.c
@@ -119,7 +119,10 @@ tree_int_map_hasher::equal (const value_type *v, const compare_type *c)
 }
 
 
-/* This routine will initialize the basevar fields of MAP.  */
+/* This routine will initialize the basevar fields of MAP with base
+   names, when we are not optimizing.  When optimizing, we'll use
+   partition numbers as base index numbers, see coalesce_ssa_name in
+   tree-ssa-coalesce.c.  */
 
 static void
 var_map_base_init (var_map map)
@@ -1233,9 +1236,11 @@ calculate_live_ranges (var_map map, bool want_livein)
   tree_live_info_p live;
 
   live = new_tree_live_info (map);
-  for (i = 0; i < num_var_partitions (map); i++)
+  /* We have already coalesced abnormal SSA names, so iterate over all
+     names, so as to cover all variables in each partition.  */
+  for (i = 1; i < num_ssa_names; i++)
     {
-      var = partition_to_var (map, i);
+      var = ssa_name (i);
       if (var != NULL_TREE)
 	set_var_live_on_entry (var, live);
     }


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


* Re: [PR64164] drop copyrename, integrate into expand
  2015-03-27 18:04 [PR64164] drop copyrename, integrate into expand Alexandre Oliva
@ 2015-03-27 18:11 ` Alexandre Oliva
  2015-03-28 19:22 ` Alexandre Oliva
  2015-12-04 12:45 ` Dominik Vogt
  2 siblings, 0 replies; 127+ messages in thread
From: Alexandre Oliva @ 2015-03-27 18:11 UTC (permalink / raw)
  To: gcc-patches

On Mar 27, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:

> Regstrapped on x86_64-linux-gnu native and on i686-pc-linux-gnu native
> on x86_64, so without lto.  Is this ok to install?

Err, sorry, I didn't mean to post that message yet.  I was just drafting
it while several of the issues were still fresh, and a wrong keystroke
sent it out.  I still have a couple of ICEs to look into, and a number of
guality regressions to analyze.  Apologies for setting false
expectations, but I will get there, and when I do, I'll post a revised
patch.

Comments on the one I posted are welcome nevertheless.



* Re: [PR64164] drop copyrename, integrate into expand
  2015-03-27 18:04 [PR64164] drop copyrename, integrate into expand Alexandre Oliva
  2015-03-27 18:11 ` Alexandre Oliva
@ 2015-03-28 19:22 ` Alexandre Oliva
  2015-03-31  5:11   ` Jeff Law
                     ` (2 more replies)
  2015-12-04 12:45 ` Dominik Vogt
  2 siblings, 3 replies; 127+ messages in thread
From: Alexandre Oliva @ 2015-03-28 19:22 UTC (permalink / raw)
  To: gcc-patches

On Mar 27, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:

> This patch reworks the out-of-ssa expander to enable coalescing of SSA
> partitions that don't share the same base name.  This is done only when
> optimizing.

> The test we use to tell whether two partitions can be merged no longer
> demands them to have the same base variable when optimizing, so they
> become eligible for coalescing, as they would after copyrename.  We then
> compute the partitioning we'd get if all coalescible partitions were
> coalesced, using this partition assignment to assign base vars numbers.
> These base var numbers are then used to identify conflicts, which used
> to be based on shared base vars or base types.
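
The base-numbering step described in the quoted paragraph can be illustrated with a small union-find sketch.  This is not code from the patch: the type `struct tentative`, the functions `tentative_init`, `tentative_find`, `tentative_union` and `assign_base_indices`, and the `MAX_PARTS` limit are all hypothetical stand-ins for GCC's partition API, and the coalescible pairs are assumed to arrive as a plain array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for GCC's partition type; not the real API.  */
#define MAX_PARTS 64

struct tentative
{
  int parent[MAX_PARTS];
  int n;
};

/* Start with every partition in a class of its own.  */
void
tentative_init (struct tentative *t, int n)
{
  assert (n >= 0 && n <= MAX_PARTS);
  t->n = n;
  for (int i = 0; i < n; i++)
    t->parent[i] = i;
}

/* Return the representative of X's class, halving paths on the way.  */
int
tentative_find (struct tentative *t, int x)
{
  assert (x >= 0 && x < t->n);
  while (t->parent[x] != x)
    {
      t->parent[x] = t->parent[t->parent[x]];
      x = t->parent[x];
    }
  return x;
}

/* Merge the classes holding A and B.  */
void
tentative_union (struct tentative *t, int a, int b)
{
  int ra = tentative_find (t, a);
  int rb = tentative_find (t, b);
  if (ra != rb)
    t->parent[rb] = ra;
}

/* Pretend every coalescible pair in PAIRS gets coalesced, then give
   each resulting class a dense number in BASE_INDEX.  Two partitions
   share a base index only if some chain of pairs connects them.
   Returns the number of distinct bases.  */
int
assign_base_indices (struct tentative *t, const int (*pairs)[2],
		     size_t npairs, int *base_index)
{
  int root_to_base[MAX_PARTS];
  int nbases = 0;

  for (size_t i = 0; i < npairs; i++)
    tentative_union (t, pairs[i][0], pairs[i][1]);

  for (int i = 0; i < t->n; i++)
    root_to_base[i] = -1;

  for (int p = 0; p < t->n; p++)
    {
      int r = tentative_find (t, p);
      if (root_to_base[r] < 0)
	root_to_base[r] = nbases++;
      base_index[p] = root_to_base[r];
    }
  return nbases;
}
```

With five partitions and the pairs (0,1) and (1,2), partitions 0, 1 and 2 end up with one shared base index, while 3 and 4 each get their own, so conflicts only need to be tracked within the first group.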

> We now propagate base var names during coalescing proper, only towards
> the leader variable.  I'm no longer sure this is still needed, but
> something about handling variables and results led me this way and I
> didn't revisit it.  I might rework that with a later patch, or a later
> revision of this patch; it would require other means to identify
> partitions holding result_decls during merging, or allow that and deal
> with param and result decls in a different way during expand proper.

> I had to fix two lingering bugs in order for the whole thing to work: we
> perform conflict detection after abnormal coalescing, but we computed
> live ranges involving only the partition leaders, so conflicts with
> other names already coalesced wouldn't be detected.

This early abnormal coalescing was only present for a few days in the
trunk, and I was lucky enough to start working on a tree that had it.
It turns out that the fix for it was thus rendered unnecessary, so I
dropped it.  It was that fix, which didn't cover the live range check,
that caused the two ICEs I saw in the regression tests.  Since both the
ultimate cause of the problem and the change that introduced the check
failures are gone, both problems went *poof* after I updated the tree,
resolved the conflicts and dropped the redundant code.

> The other problem was that we didn't track default defs for parms as
> live at entry, so they might end up coalesced.

I improved this a little bit, using the bitmap of partitions containing
default params to check that we only process function-entry defs for
them, rather than for all param decls in case they end up in other
partitions.
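
A simplified sketch of that bookkeeping follows, assuming at most 64 partitions so that a single machine word can stand in for the bitmap.  The names `part_set`, `part_set_test`, `merge_param_defaults` and `needs_entry_def` are hypothetical and do not appear in the patch; the patch itself uses GCC's `bitmap` type and clears the losing partition's bit under a narrower condition than shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a GCC bitmap: one bit per partition.  */
typedef uint64_t part_set;

/* Return true if PART is marked in S.  */
bool
part_set_test (part_set s, int part)
{
  assert (part >= 0 && part < 64);
  return (s >> part) & 1u;
}

/* After LOSER has been merged into WINNER, move the "contains a
   parameter's default def" marker along with it.  */
void
merge_param_defaults (part_set *s, int winner, int loser)
{
  bool had = part_set_test (*s, winner) || part_set_test (*s, loser);
  *s &= ~((part_set) 1 << loser);
  if (had)
    *s |= (part_set) 1 << winner;
}

/* A function-entry def is simulated for a partition only when its
   base variable is a parameter and the partition actually holds that
   parameter's default def, either because the leader is the default
   def or because the marker was carried over by an earlier merge.  */
bool
needs_entry_def (part_set s, int part, bool base_is_parm,
		 bool leader_is_default_def)
{
  return base_is_parm && (leader_is_default_def || part_set_test (s, part));
}
```

The point of the marker is that, after a merge, the partition leader need not be the default def itself, so checking the leader alone would miss partitions that still hold an incoming parameter value.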

> I guess none of these problems would have been exercised in practice,
> because we wouldn't even consider merging ssa names associated with
> different variables.

> In the end, I verified that this fixed the codegen regression in the
> PR64164 testcase, that failed to merge two partitions that could in
> theory be merged, but that wasn't even considered due to differences in
> the SSA var names.

> I'd agree that disregarding the var names and dropping 4 passes is too
> much of a change to fix this one problem, but...  it's something we
> should have long tackled, and it gets this and other jobs done, so...

Regstrapped on x86_64-linux-gnu native and on i686-pc-linux-gnu native
on x86_64, so without lto plugin.  The only regression is in
gcc.dg/guality/pr54200.c, that explicitly disables VTA.  When
optimization is enabled, the different coalescing we perform now causes
VTA-less variable tracking to lose track of variable "z".  This
regression in non-VTA var-tracking is expected and, as richi put it in
PR 64164, I guess we don't care about that, do we? :-)

The other guality regressions I mentioned in my other email turned out
not to be regressions, but preexisting failures that somehow did not
make it into the test_summary of my earlier pristine build.

Is this ok to install?


for  gcc/ChangeLog

	PR rtl-optimization/64164
	* Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
	* tree-ssa-copyrename.c: Removed.
	* opts.c (default_options_table): Drop -ftree-copyrename.
	* passes.def: Drop all occurrences of pass_rename_ssa_copies.
	* common.opt (ftree-copyrename): Ignore.
	(ftree-coalesce-vars, ftree-coalesce-inlined-vars): Likewise.
	* doc/invoke.texi: Remove the ignored options above.
	* gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
	across base variables when optimizing.
	* tree-ssa-coalesce.c (build_ssa_conflict_graph): Add
	param_defaults argument.  Process PARM_DECLs' default defs at
	the entry point.
	(attempt_coalesce): Add param_defaults argument, and
	track the presence of default defs for params in each
	partition.  Propagate base var to leader on merge, preferring
	parms and results, named vars, ignored vars, and then anon
	vars.  Refuse to merge a RESULT_DECL partition with a default
	PARM_DECL one.
	(coalesce_partitions): Add param_defaults argument,
	and pass it to attempt_coalesce.
	(dump_part_var_map): New.
	(compute_optimized_partition_bases): New, called by...
	(coalesce_ssa_name): ... when optimizing, disabling
	partition_view_bitmap's base assignment.  Pass local
	param_defaults to coalescer functions.
	* tree-ssa-live.c (var_map_base_init): Note use only when not
	optimizing.
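
The leader preference order listed for attempt_coalesce above can be sketched as a simple ranking.  This is an illustration under stated assumptions rather than the patch's logic: the enum `base_kind` and the function `pick_leader` are hypothetical, the two partitions are assumed to have different base variables (the same-base case is handled separately in the patch), and ties go to the first partition.

```c
#include <assert.h>

/* Hypothetical classification of a partition's base variable,
   ordered from least to most preferred as merge leader.  */
enum base_kind
{
  BASE_ANON = 0,		/* no base variable at all */
  BASE_IGNORED_VAR,		/* VAR_DECL with DECL_IGNORED_P set */
  BASE_NAMED_VAR,		/* user-visible VAR_DECL */
  BASE_PARM,			/* PARM_DECL, not holding the default def */
  BASE_PARM_DEFAULT_OR_RESULT	/* default-def PARM_DECL, or RESULT_DECL */
};

/* Return 1 or 2 to say which partition's base variable becomes the
   leader of the merged partition, or 0 if the merge must be refused
   because both sides are pinned to a parameter or to the result.  */
int
pick_leader (enum base_kind k1, enum base_kind k2)
{
  assert (k1 <= BASE_PARM_DEFAULT_OR_RESULT);
  assert (k2 <= BASE_PARM_DEFAULT_OR_RESULT);

  if (k1 == BASE_PARM_DEFAULT_OR_RESULT
      && k2 == BASE_PARM_DEFAULT_OR_RESULT)
    return 0;

  return k2 > k1 ? 2 : 1;
}
```

For example, merging an anonymous temporary with a named variable keeps the named variable as base, while merging a default-def parameter partition with the result partition is refused.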
---
 gcc/Makefile.in           |    1 
 gcc/common.opt            |   12 +
 gcc/doc/invoke.texi       |   29 ---
 gcc/gimple-expr.c         |    7 -
 gcc/opts.c                |    1 
 gcc/passes.def            |    5 
 gcc/tree-ssa-coalesce.c   |  342 ++++++++++++++++++++++++++++++-
 gcc/tree-ssa-copyrename.c |  499 ---------------------------------------------
 gcc/tree-ssa-live.c       |    5 
 9 files changed, 347 insertions(+), 554 deletions(-)
 delete mode 100644 gcc/tree-ssa-copyrename.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index f924fb8..990c4e9 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1428,7 +1428,6 @@ OBJS = \
 	tree-ssa-ccp.o \
 	tree-ssa-coalesce.o \
 	tree-ssa-copy.o \
-	tree-ssa-copyrename.o \
 	tree-ssa-dce.o \
 	tree-ssa-dom.o \
 	tree-ssa-dse.o \
diff --git a/gcc/common.opt b/gcc/common.opt
index b49ac46..fefaee7 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2207,16 +2207,16 @@ Common Report Var(flag_tree_ch) Optimization
 Enable loop header copying on trees
 
 ftree-coalesce-inlined-vars
-Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization
-Enable coalescing of copy-related user variables that are inlined
+Common Ignore RejectNegative
+Does nothing.  Preserved for backward compatibility.
 
 ftree-coalesce-vars
-Common Report Var(flag_ssa_coalesce_vars,2) Optimization
-Enable coalescing of all copy-related user variables
+Common Ignore
+Does nothing.  Preserved for backward compatibility.
 
 ftree-copyrename
-Common Report Var(flag_tree_copyrename) Optimization
-Replace SSA temporaries with better names in copies
+Common Ignore
+Does nothing.  Preserved for backward compatibility.
 
 ftree-copy-prop
 Common Report Var(flag_tree_copy_prop) Optimization
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 9749727..5d2c516 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -442,8 +442,7 @@ Objective-C and Objective-C++ Dialects}.
 -fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol
 -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
 -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
--ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
--ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
+-ftree-copy-prop -ftree-dce -ftree-dominator-opts -ftree-dse @gol
 -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
 -ftree-loop-if-convert-stores -ftree-loop-im @gol
 -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
@@ -8822,32 +8821,6 @@ Perform scalar replacement of aggregates.  This pass replaces structure
 references with scalars to prevent committing structures to memory too
 early.  This flag is enabled by default at @option{-O} and higher.
 
-@item -ftree-copyrename
-@opindex ftree-copyrename
-Perform copy renaming on trees.  This pass attempts to rename compiler
-temporaries to other variables at copy locations, usually resulting in
-variable names which more closely resemble the original variables.  This flag
-is enabled by default at @option{-O} and higher.
-
-@item -ftree-coalesce-inlined-vars
-@opindex ftree-coalesce-inlined-vars
-Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
-combine small user-defined variables too, but only if they are inlined
-from other functions.  It is a more limited form of
-@option{-ftree-coalesce-vars}.  This may harm debug information of such
-inlined variables, but it keeps variables of the inlined-into
-function apart from each other, such that they are more likely to
-contain the expected values in a debugging session.
-
-@item -ftree-coalesce-vars
-@opindex ftree-coalesce-vars
-Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
-combine small user-defined variables too, instead of just compiler
-temporaries.  This may severely limit the ability to debug an optimized
-program compiled with @option{-fno-var-tracking-assignments}.  In the
-negated form, this flag prevents SSA coalescing of user variables,
-including inlined ones.  This option is enabled by default.
-
 @item -ftree-ter
 @opindex ftree-ter
 Perform temporary expression replacement during the SSA->normal phase.  Single
diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
index efc93b7..62ae577 100644
--- a/gcc/gimple-expr.c
+++ b/gcc/gimple-expr.c
@@ -399,13 +399,14 @@ copy_var_decl (tree var, tree name, tree type)
 bool
 gimple_can_coalesce_p (tree name1, tree name2)
 {
-  /* First check the SSA_NAME's associated DECL.  We only want to
-     coalesce if they have the same DECL or both have no associated DECL.  */
+  /* First check the SSA_NAME's associated DECL.  Without
+     optimization, we only want to coalesce if they have the same DECL
+     or both have no associated DECL.  */
   tree var1 = SSA_NAME_VAR (name1);
   tree var2 = SSA_NAME_VAR (name2);
   var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
   var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
-  if (var1 != var2)
+  if (var1 != var2 && !optimize)
     return false;
 
   /* Now check the types.  If the types are the same, then we should
diff --git a/gcc/opts.c b/gcc/opts.c
index 39c190d..8149421 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -453,7 +453,6 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 },
     { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 },
-    { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 },
diff --git a/gcc/passes.def b/gcc/passes.def
index 1d598b2..f8fd0ef 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -77,7 +77,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_all_early_optimizations);
       PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
 	  NEXT_PASS (pass_remove_cgraph_callee_edges);
-	  NEXT_PASS (pass_rename_ssa_copies);
 	  NEXT_PASS (pass_object_sizes);
 	  NEXT_PASS (pass_ccp);
 	  /* After CCP we rewrite no longer addressed locals into SSA
@@ -154,7 +153,6 @@ along with GCC; see the file COPYING3.  If not see
       /* Initial scalar cleanups before alias computation.
 	 They ensure memory accesses are not indirect wherever possible.  */
       NEXT_PASS (pass_strip_predict_hints);
-      NEXT_PASS (pass_rename_ssa_copies);
       NEXT_PASS (pass_ccp);
       /* After CCP we rewrite no longer addressed locals into SSA
 	 form if possible.  */
@@ -182,7 +180,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_stdarg);
       NEXT_PASS (pass_lower_complex);
       NEXT_PASS (pass_sra);
-      NEXT_PASS (pass_rename_ssa_copies);
       /* The dom pass will also resolve all __builtin_constant_p calls
          that are still there to 0.  This has to be done after some
 	 propagations have already run, but before some more dead code
@@ -291,7 +288,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_fold_builtins);
       NEXT_PASS (pass_optimize_widening_mul);
       NEXT_PASS (pass_tail_calls);
-      NEXT_PASS (pass_rename_ssa_copies);
       /* FIXME: If DCE is not run before checking for uninitialized uses,
 	 we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
 	 However, this also causes us to misdiagnose cases that should be
@@ -326,7 +322,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_dce);
       NEXT_PASS (pass_asan);
       NEXT_PASS (pass_tsan);
-      NEXT_PASS (pass_rename_ssa_copies);
       /* ???  We do want some kind of loop invariant motion, but we possibly
          need to adjust LIM to be more friendly towards preserving accurate
 	 debug information here.  */
diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index 1afeefe..8557d84 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -825,13 +825,23 @@ live_track_clear_base_vars (live_track_p ptr)
    base variable are added.  */
 
 static ssa_conflicts_p
-build_ssa_conflict_graph (tree_live_info_p liveinfo)
+build_ssa_conflict_graph (tree_live_info_p liveinfo, bitmap param_defaults)
 {
   ssa_conflicts_p graph;
   var_map map;
   basic_block bb;
   ssa_op_iter iter;
   live_track_p live;
+  basic_block entry;
+
+  /* If we are optimizing, we may attempt to coalesce variables from
+     different base variables, including different parameters, so we
+     have to make sure default defs live at the entry block conflict
+     with each other.  */
+  if (optimize)
+    entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+  else
+    entry = NULL;
 
   map = live_var_map (liveinfo);
   graph = ssa_conflicts_new (num_var_partitions (map));
@@ -890,6 +900,36 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
 	    live_track_process_def (live, result, graph);
 	}
 
+      /* Pretend there are defs for params' default defs at the start
+	 of the (post-)entry block.  We run after abnormal coalescing,
+	 so we can't assume the leader variable is the default
+	 definition, but because of SSA_NAME_VAR adjustments in
+	 attempt_coalesce, we can assume that if there is any
+	 PARM_DECL in the partition, it will be the leader's
+	 SSA_NAME_VAR.  */
+      if (bb == entry)
+	{
+	  unsigned base;
+	  bitmap_iterator bi;
+	  EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi)
+	    {
+	      bitmap_iterator bi2;
+	      unsigned part;
+	      EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base],
+					0, part, bi2)
+		{
+		  tree var = partition_to_var (map, part);
+		  if (!SSA_NAME_VAR (var)
+		      || TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL
+		      || !(SSA_NAME_IS_DEFAULT_DEF (var)
+			   || (param_defaults
+			       && bitmap_bit_p (param_defaults, part))))
+		    continue;
+		  live_track_process_def (live, var, graph);
+		}
+	    }
+	}
+
      live_track_clear_base_vars (live);
     }
 
@@ -1126,11 +1166,12 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
 
 static inline bool
 attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
-		  FILE *debug)
+		  bitmap param_defaults, FILE *debug)
 {
   int z;
   tree var1, var2;
   int p1, p2;
+  bool default_def = false;
 
   p1 = var_to_partition (map, ssa_name (x));
   p2 = var_to_partition (map, ssa_name (y));
@@ -1158,6 +1199,70 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
     {
       var1 = partition_to_var (map, p1);
       var2 = partition_to_var (map, p2);
+
+      tree leader;
+
+      if (var1 == var2 || !SSA_NAME_VAR (var2)
+	  || SSA_NAME_VAR (var1) == SSA_NAME_VAR (var2))
+	{
+	  leader = SSA_NAME_VAR (var1);
+	  default_def = (leader && TREE_CODE (leader) == PARM_DECL
+			 && (SSA_NAME_IS_DEFAULT_DEF (var1)
+			     || bitmap_bit_p (param_defaults, p1)));
+	}
+      else if (!SSA_NAME_VAR (var1))
+	{
+	  leader = SSA_NAME_VAR (var2);
+	  default_def = (leader && TREE_CODE (leader) == PARM_DECL
+			 && (SSA_NAME_IS_DEFAULT_DEF (var2)
+			     || bitmap_bit_p (param_defaults, p2)));
+	}
+      else if ((TREE_CODE (SSA_NAME_VAR (var1)) == PARM_DECL
+		&& (SSA_NAME_IS_DEFAULT_DEF (var1)
+		    || bitmap_bit_p (param_defaults, p1)))
+	       || TREE_CODE (SSA_NAME_VAR (var1)) == RESULT_DECL)
+	{
+	  if ((TREE_CODE (SSA_NAME_VAR (var2)) == PARM_DECL
+	       && (SSA_NAME_IS_DEFAULT_DEF (var2)
+		   || bitmap_bit_p (param_defaults, p2)))
+	      || TREE_CODE (SSA_NAME_VAR (var2)) == RESULT_DECL)
+	    {
+	      /* We only have one RESULT_DECL, and two PARM_DECL
+		 DEFAULT_DEFs would have conflicted, so we know either
+		 one of var1 or var2 is a PARM_DECL, and the other is
+		 a RESULT_DECL.  */
+	      if (debug)
+		fprintf (debug, ": Cannot coalesce PARM_DECL and RESULT_DECL\n");
+	      return false;
+	    }
+	  leader = SSA_NAME_VAR (var1);
+	  default_def = TREE_CODE (leader) == PARM_DECL;
+	}
+      else if ((TREE_CODE (SSA_NAME_VAR (var2)) == PARM_DECL
+		&& (SSA_NAME_IS_DEFAULT_DEF (var2)
+		    || bitmap_bit_p (param_defaults, p2)))
+	       || TREE_CODE (SSA_NAME_VAR (var2)) == RESULT_DECL)
+	{
+	  leader = SSA_NAME_VAR (var2);
+	  default_def = TREE_CODE (leader) == PARM_DECL;
+	}
+      else if (TREE_CODE (SSA_NAME_VAR (var1)) == PARM_DECL)
+	leader = SSA_NAME_VAR (var1);
+      else if (TREE_CODE (SSA_NAME_VAR (var2)) == PARM_DECL)
+	leader = SSA_NAME_VAR (var2);
+      else if (TREE_CODE (SSA_NAME_VAR (var1)) == VAR_DECL
+	       && !DECL_IGNORED_P (SSA_NAME_VAR (var1)))
+	leader = SSA_NAME_VAR (var1);
+      else if (TREE_CODE (SSA_NAME_VAR (var2)) == VAR_DECL
+	       && !DECL_IGNORED_P (SSA_NAME_VAR (var2)))
+	leader = SSA_NAME_VAR (var2);
+      else if (TREE_CODE (SSA_NAME_VAR (var1)) == VAR_DECL)
+	leader = SSA_NAME_VAR (var1);
+      else if (TREE_CODE (SSA_NAME_VAR (var2)) == VAR_DECL)
+	leader = SSA_NAME_VAR (var2);
+      else /* What else could it be?  */
+	gcc_unreachable ();
+
       z = var_union (map, var1, var2);
       if (z == NO_PARTITION)
 	{
@@ -1173,8 +1278,46 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
       else
 	ssa_conflicts_merge (graph, p2, p1);
 
+      if (z == p1)
+	{
+	  if (SSA_NAME_VAR (var1) != leader)
+	    {
+	      replace_ssa_name_symbol (var1, leader);
+	      if (debug)
+		{
+		  fprintf (debug, ": Renamed ");
+		  print_generic_expr (debug, var1, TDF_SLIM);
+		}
+	    }
+	  if (default_def)
+	    {
+	      if (SSA_NAME_IS_DEFAULT_DEF (var2))
+		bitmap_clear_bit (param_defaults, p2);
+	      bitmap_set_bit (param_defaults, p1);
+	    }
+	}
+      else
+	{
+	  if (SSA_NAME_VAR (var2) != leader)
+	    {
+	      replace_ssa_name_symbol (var2, leader);
+	      if (debug)
+		{
+		  fprintf (debug, ": Renamed ");
+		  print_generic_expr (debug, var2, TDF_SLIM);
+		}
+	    }
+	  if (default_def)
+	    {
+	      if (SSA_NAME_IS_DEFAULT_DEF (var1))
+		bitmap_clear_bit (param_defaults, p1);
+	      bitmap_set_bit (param_defaults, p2);
+	    }
+	}
+
       if (debug)
 	fprintf (debug, ": Success -> %d\n", z);
+
       return true;
     }
 
@@ -1190,7 +1333,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
 
 static void
 coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
-		     FILE *debug)
+		     bitmap param_defaults, FILE *debug)
 {
   int x = 0, y = 0;
   tree var1, var2;
@@ -1226,7 +1369,7 @@ coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
 		if (debug)
 		  fprintf (debug, "Abnormal coalesce: ");
 
-		if (!attempt_coalesce (map, graph, v1, v2, debug))
+		if (!attempt_coalesce (map, graph, v1, v2, param_defaults, debug))
 		  fail_abnormal_edge_coalesce (v1, v2);
 	      }
 	  }
@@ -1244,7 +1387,7 @@ coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
 
       if (debug)
 	fprintf (debug, "Coalesce list: ");
-      attempt_coalesce (map, graph, x, y, debug);
+      attempt_coalesce (map, graph, x, y, param_defaults, debug);
     }
 }
 
@@ -1272,6 +1415,178 @@ ssa_name_var_hash::equal (const value_type *n1, const compare_type *n2)
 }
 
 
+/* Output partition map MAP with coalescing plan PART to file F.  */
+
+void
+dump_part_var_map (FILE *f, partition part, var_map map)
+{
+  int t;
+  unsigned x, y;
+  int p;
+
+  fprintf (f, "\nCoalescible Partition map \n\n");
+
+  for (x = 0; x < map->num_partitions; x++)
+    {
+      if (map->view_to_partition != NULL)
+	p = map->view_to_partition[x];
+      else
+	p = x;
+
+      if (ssa_name (p) == NULL_TREE
+	  || virtual_operand_p (ssa_name (p)))
+        continue;
+
+      t = 0;
+      for (y = 1; y < num_ssa_names; y++)
+        {
+	  tree var = version_to_var (map, y);
+	  if (!var)
+	    continue;
+	  int q = var_to_partition (map, var);
+	  p = partition_find (part, q);
+	  gcc_assert (map->partition_to_base_index[q]
+		      == map->partition_to_base_index[p]);
+
+	  if (p == (int)x)
+	    {
+	      if (t++ == 0)
+	        {
+		  fprintf (f, "Partition %d, base %d (", x,
+			   map->partition_to_base_index[q]);
+		  print_generic_expr (f, partition_to_var (map, q), TDF_SLIM);
+		  fprintf (f, " - ");
+		}
+	      fprintf (f, "%d ", y);
+	    }
+	}
+      if (t != 0)
+	fprintf (f, ")\n");
+    }
+  fprintf (f, "\n");
+}
+
+/* Fill in MAP's partition_to_base_index, with one index for each
+   partition of SSA names USED_IN_COPIES and related by CL
+   coalesce possibilities.  */
+
+static void
+compute_optimized_partition_bases (var_map map, bitmap used_in_copies,
+				   coalesce_list_p cl)
+{
+  int parts = num_var_partitions (map);
+  partition tentative = partition_new (parts);
+
+  /* Partition the SSA versions so that, for each coalescible
+     pair, both of its members are in the same partition in
+     TENTATIVE.  */
+  gcc_assert (!cl->sorted);
+  coalesce_pair_p node;
+  coalesce_iterator_type ppi;
+  FOR_EACH_PARTITION_PAIR (node, ppi, cl)
+    {
+      tree v1 = ssa_name (node->first_element);
+      int p1 = partition_find (tentative, var_to_partition (map, v1));
+      tree v2 = ssa_name (node->second_element);
+      int p2 = partition_find (tentative, var_to_partition (map, v2));
+
+      if (p1 == p2)
+	continue;
+
+      partition_union (tentative, p1, p2);
+    }
+
+  /* We have to deal with cost one pairs too.  */
+  for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next)
+    {
+      tree v1 = ssa_name (co->first_element);
+      int p1 = partition_find (tentative, var_to_partition (map, v1));
+      tree v2 = ssa_name (co->second_element);
+      int p2 = partition_find (tentative, var_to_partition (map, v2));
+
+      if (p1 == p2)
+	continue;
+
+      partition_union (tentative, p1, p2);
+    }
+
+  /* And also with abnormal edges.  */
+  basic_block bb;
+  edge e;
+  edge_iterator ei;
+  FOR_EACH_BB_FN (bb, cfun)
+    {
+      FOR_EACH_EDGE (e, ei, bb->preds)
+	if (e->flags & EDGE_ABNORMAL)
+	  {
+	    gphi_iterator gsi;
+	    for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
+		 gsi_next (&gsi))
+	      {
+		gphi *phi = gsi.phi ();
+		tree arg = PHI_ARG_DEF (phi, e->dest_idx);
+		if (SSA_NAME_IS_DEFAULT_DEF (arg)
+		    && (!SSA_NAME_VAR (arg)
+			|| TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL))
+		  continue;
+
+		tree res = PHI_RESULT (phi);
+
+		int p1 = partition_find (tentative, var_to_partition (map, res));
+		int p2 = partition_find (tentative, var_to_partition (map, arg));
+
+		if (p1 == p2)
+		  continue;
+
+		partition_union (tentative, p1, p2);
+	      }
+	  }
+    }
+  
+
+  map->partition_to_base_index = XCNEWVEC (int, parts);
+  auto_vec<unsigned int> index_map (parts);
+  if (parts)
+    index_map.quick_grow (parts);
+
+  const unsigned no_part = -1;
+  unsigned count = parts;
+  while (count)
+    index_map[--count] = no_part;
+
+  /* Initialize MAP's mapping from partition to base index, using
+     as base indices an enumeration of the TENTATIVE partitions in
+     which each SSA version ended up, so that we compute conflicts
+     between all SSA versions that ended up in the same potential
+     coalesce partition.  */
+  bitmap_iterator bi;
+  unsigned i;
+  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
+    {
+      int pidx = var_to_partition (map, ssa_name (i));
+      int base = partition_find (tentative, pidx);
+      if (index_map[base] != no_part)
+	continue;
+      index_map[base] = count++;
+    }
+
+  map->num_basevars = count;
+
+  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
+    {
+      int pidx = var_to_partition (map, ssa_name (i));
+      int base = partition_find (tentative, pidx);
+      gcc_assert (index_map[base] < count);
+      map->partition_to_base_index[pidx] = index_map[base];
+    }
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    dump_part_var_map (dump_file, tentative, map);
+
+  partition_delete (tentative);
+}
+
+
 /* Reduce the number of copies by coalescing variables in the function.  Return
    a partition map with the resulting coalesces.  */
 
@@ -1332,7 +1647,12 @@ coalesce_ssa_name (void)
     dump_var_map (dump_file, map);
 
   /* Don't calculate live ranges for variables not in the coalesce list.  */
-  partition_view_bitmap (map, used_in_copies, true);
+  partition_view_bitmap (map, used_in_copies, !optimize);
+
+  /* If we are optimizing, compute the base indices ourselves.  */
+  if (optimize)
+    compute_optimized_partition_bases (map, used_in_copies, cl);
+
   BITMAP_FREE (used_in_copies);
 
   if (num_var_partitions (map) < 1)
@@ -1341,6 +1661,8 @@ coalesce_ssa_name (void)
       return map;
     }
 
+  bitmap param_defaults = BITMAP_ALLOC (NULL);
+
   if (dump_file && (dump_flags & TDF_DETAILS))
     dump_var_map (dump_file, map);
 
@@ -1350,7 +1672,7 @@ coalesce_ssa_name (void)
     dump_live_info (dump_file, liveinfo, LIVEDUMP_ENTRY);
 
   /* Build a conflict graph.  */
-  graph = build_ssa_conflict_graph (liveinfo);
+  graph = build_ssa_conflict_graph (liveinfo, param_defaults);
   delete_tree_live_info (liveinfo);
   if (dump_file && (dump_flags & TDF_DETAILS))
     ssa_conflicts_dump (dump_file, graph);
@@ -1370,10 +1692,10 @@ coalesce_ssa_name (void)
     dump_var_map (dump_file, map);
 
   /* Now coalesce everything in the list.  */
-  coalesce_partitions (map, graph, cl,
-		       ((dump_flags & TDF_DETAILS) ? dump_file
-						   : NULL));
+  coalesce_partitions (map, graph, cl, param_defaults,
+		       ((dump_flags & TDF_DETAILS) ? dump_file : NULL));
 
+  BITMAP_FREE (param_defaults);
   delete_coalesce_list (cl);
   ssa_conflicts_delete (graph);
 
diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c
deleted file mode 100644
index f3cb56e..0000000
--- a/gcc/tree-ssa-copyrename.c
+++ /dev/null
@@ -1,499 +0,0 @@
-/* Rename SSA copies.
-   Copyright (C) 2004-2015 Free Software Foundation, Inc.
-   Contributed by Andrew MacLeod <amacleod@redhat.com>
-
-This file is part of GCC.
-
-GCC is free software; you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation; either version 3, or (at your option)
-any later version.
-
-GCC is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-GNU General Public License for more details.
-
-You should have received a copy of the GNU General Public License
-along with GCC; see the file COPYING3.  If not see
-<http://www.gnu.org/licenses/>.  */
-
-#include "config.h"
-#include "system.h"
-#include "coretypes.h"
-#include "tm.h"
-#include "hash-set.h"
-#include "machmode.h"
-#include "vec.h"
-#include "double-int.h"
-#include "input.h"
-#include "alias.h"
-#include "symtab.h"
-#include "wide-int.h"
-#include "inchash.h"
-#include "tree.h"
-#include "fold-const.h"
-#include "predict.h"
-#include "hard-reg-set.h"
-#include "function.h"
-#include "dominance.h"
-#include "cfg.h"
-#include "basic-block.h"
-#include "tree-ssa-alias.h"
-#include "internal-fn.h"
-#include "gimple-expr.h"
-#include "is-a.h"
-#include "gimple.h"
-#include "gimple-iterator.h"
-#include "flags.h"
-#include "tree-pretty-print.h"
-#include "bitmap.h"
-#include "gimple-ssa.h"
-#include "stringpool.h"
-#include "tree-ssanames.h"
-#include "hashtab.h"
-#include "rtl.h"
-#include "statistics.h"
-#include "real.h"
-#include "fixed-value.h"
-#include "insn-config.h"
-#include "expmed.h"
-#include "dojump.h"
-#include "explow.h"
-#include "calls.h"
-#include "emit-rtl.h"
-#include "varasm.h"
-#include "stmt.h"
-#include "expr.h"
-#include "tree-dfa.h"
-#include "tree-inline.h"
-#include "tree-ssa-live.h"
-#include "tree-pass.h"
-#include "langhooks.h"
-
-static struct
-{
-  /* Number of copies coalesced.  */
-  int coalesced;
-} stats;
-
-/* The following routines implement the SSA copy renaming phase.
-
-   This optimization looks for copies between 2 SSA_NAMES, either through a
-   direct copy, or an implicit one via a PHI node result and its arguments.
-
-   Each copy is examined to determine if it is possible to rename the base
-   variable of one of the operands to the same variable as the other operand.
-   i.e.
-   T.3_5 = <blah>
-   a_1 = T.3_5
-
-   If this copy couldn't be copy propagated, it could possibly remain in the
-   program throughout the optimization phases.   After SSA->normal, it would
-   become:
-
-   T.3 = <blah>
-   a = T.3
-
-   Since T.3_5 is distinct from all other SSA versions of T.3, there is no
-   fundamental reason why the base variable needs to be T.3, subject to
-   certain restrictions.  This optimization attempts to determine if we can
-   change the base variable on copies like this, and result in code such as:
-
-   a_5 = <blah>
-   a_1 = a_5
-
-   This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is
-   possible, the copy goes away completely. If it isn't possible, a new temp
-   will be created for a_5, and you will end up with the exact same code:
-
-   a.8 = <blah>
-   a = a.8
-
-   The other benefit of performing this optimization relates to what variables
-   are chosen in copies.  Gimplification of the program uses temporaries for
-   a lot of things. expressions like
-
-   a_1 = <blah>
-   <blah2> = a_1
-
-   get turned into
-
-   T.3_5 = <blah>
-   a_1 = T.3_5
-   <blah2> = a_1
-
-   Copy propagation is done in a forward direction, and if we can propagate
-   through the copy, we end up with:
-
-   T.3_5 = <blah>
-   <blah2> = T.3_5
-
-   The copy is gone, but so is all reference to the user variable 'a'. By
-   performing this optimization, we would see the sequence:
-
-   a_5 = <blah>
-   a_1 = a_5
-   <blah2> = a_1
-
-   which copy propagation would then turn into:
-
-   a_5 = <blah>
-   <blah2> = a_5
-
-   and so we still retain the user variable whenever possible.  */
-
-
-/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
-   Choose a representative for the partition, and send debug info to DEBUG.  */
-
-static void
-copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
-{
-  int p1, p2, p3;
-  tree root1, root2;
-  tree rep1, rep2;
-  bool ign1, ign2, abnorm;
-
-  gcc_assert (TREE_CODE (var1) == SSA_NAME);
-  gcc_assert (TREE_CODE (var2) == SSA_NAME);
-
-  register_ssa_partition (map, var1);
-  register_ssa_partition (map, var2);
-
-  p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
-  p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
-
-  if (debug)
-    {
-      fprintf (debug, "Try : ");
-      print_generic_expr (debug, var1, TDF_SLIM);
-      fprintf (debug, "(P%d) & ", p1);
-      print_generic_expr (debug, var2, TDF_SLIM);
-      fprintf (debug, "(P%d)", p2);
-    }
-
-  gcc_assert (p1 != NO_PARTITION);
-  gcc_assert (p2 != NO_PARTITION);
-
-  if (p1 == p2)
-    {
-      if (debug)
-	fprintf (debug, " : Already coalesced.\n");
-      return;
-    }
-
-  rep1 = partition_to_var (map, p1);
-  rep2 = partition_to_var (map, p2);
-  root1 = SSA_NAME_VAR (rep1);
-  root2 = SSA_NAME_VAR (rep2);
-  if (!root1 && !root2)
-    return;
-
-  /* Don't coalesce if one of the variables occurs in an abnormal PHI.  */
-  abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1)
-	    || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2));
-  if (abnorm)
-    {
-      if (debug)
-	fprintf (debug, " : Abnormal PHI barrier.  No coalesce.\n");
-      return;
-    }
-
-  /* Partitions already have the same root, simply merge them.  */
-  if (root1 == root2)
-    {
-      p1 = partition_union (map->var_partition, p1, p2);
-      if (debug)
-	fprintf (debug, " : Same root, coalesced --> P%d.\n", p1);
-      return;
-    }
-
-  /* Never attempt to coalesce 2 different parameters.  */
-  if ((root1 && TREE_CODE (root1) == PARM_DECL)
-      && (root2 && TREE_CODE (root2) == PARM_DECL))
-    {
-      if (debug)
-        fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n");
-      return;
-    }
-
-  if ((root1 && TREE_CODE (root1) == RESULT_DECL)
-      != (root2 && TREE_CODE (root2) == RESULT_DECL))
-    {
-      if (debug)
-        fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n");
-      return;
-    }
-
-  ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1));
-  ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2));
-
-  /* Refrain from coalescing user variables, if requested.  */
-  if (!ign1 && !ign2)
-    {
-      if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2))
-	ign2 = true;
-      else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1))
-	ign1 = true;
-      else if (flag_ssa_coalesce_vars != 2)
-	{
-	  if (debug)
-	    fprintf (debug, " : 2 different USER vars. No coalesce.\n");
-	  return;
-	}
-      else
-	ign2 = true;
-    }
-
-  /* If both values have default defs, we can't coalesce.  If only one has a
-     tag, make sure that variable is the new root partition.  */
-  if (root1 && ssa_default_def (cfun, root1))
-    {
-      if (root2 && ssa_default_def (cfun, root2))
-	{
-	  if (debug)
-	    fprintf (debug, " : 2 default defs. No coalesce.\n");
-	  return;
-	}
-      else
-        {
-	  ign2 = true;
-	  ign1 = false;
-	}
-    }
-  else if (root2 && ssa_default_def (cfun, root2))
-    {
-      ign1 = true;
-      ign2 = false;
-    }
-
-  /* Do not coalesce if we cannot assign a symbol to the partition.  */
-  if (!(!ign2 && root2)
-      && !(!ign1 && root1))
-    {
-      if (debug)
-	fprintf (debug, " : Choosen variable has no root.  No coalesce.\n");
-      return;
-    }
-
-  /* Don't coalesce if the new chosen root variable would be read-only.
-     If both ign1 && ign2, then the root var of the larger partition
-     wins, so reject in that case if any of the root vars is TREE_READONLY.
-     Otherwise reject only if the root var, on which replace_ssa_name_symbol
-     will be called below, is readonly.  */
-  if (((root1 && TREE_READONLY (root1)) && ign2)
-      || ((root2 && TREE_READONLY (root2)) && ign1))
-    {
-      if (debug)
-	fprintf (debug, " : Readonly variable.  No coalesce.\n");
-      return;
-    }
-
-  /* Don't coalesce if the two variables aren't type compatible .  */
-  if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2))
-      /* There is a disconnect between the middle-end type-system and
-         VRP, avoid coalescing enum types with different bounds.  */
-      || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE
-	   || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE)
-	  && TREE_TYPE (var1) != TREE_TYPE (var2)))
-    {
-      if (debug)
-	fprintf (debug, " : Incompatible types.  No coalesce.\n");
-      return;
-    }
-
-  /* Merge the two partitions.  */
-  p3 = partition_union (map->var_partition, p1, p2);
-
-  /* Set the root variable of the partition to the better choice, if there is
-     one.  */
-  if (!ign2 && root2)
-    replace_ssa_name_symbol (partition_to_var (map, p3), root2);
-  else if (!ign1 && root1)
-    replace_ssa_name_symbol (partition_to_var (map, p3), root1);
-  else
-    gcc_unreachable ();
-
-  if (debug)
-    {
-      fprintf (debug, " --> P%d ", p3);
-      print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)),
-			  TDF_SLIM);
-      fprintf (debug, "\n");
-    }
-}
-
-
-namespace {
-
-const pass_data pass_data_rename_ssa_copies =
-{
-  GIMPLE_PASS, /* type */
-  "copyrename", /* name */
-  OPTGROUP_NONE, /* optinfo_flags */
-  TV_TREE_COPY_RENAME, /* tv_id */
-  ( PROP_cfg | PROP_ssa ), /* properties_required */
-  0, /* properties_provided */
-  0, /* properties_destroyed */
-  0, /* todo_flags_start */
-  0, /* todo_flags_finish */
-};
-
-class pass_rename_ssa_copies : public gimple_opt_pass
-{
-public:
-  pass_rename_ssa_copies (gcc::context *ctxt)
-    : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt)
-  {}
-
-  /* opt_pass methods: */
-  opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); }
-  virtual bool gate (function *) { return flag_tree_copyrename != 0; }
-  virtual unsigned int execute (function *);
-
-}; // class pass_rename_ssa_copies
-
-/* This function will make a pass through the IL, and attempt to coalesce any
-   SSA versions which occur in PHI's or copies.  Coalescing is accomplished by
-   changing the underlying root variable of all coalesced version.  This will
-   then cause the SSA->normal pass to attempt to coalesce them all to the same
-   variable.  */
-
-unsigned int
-pass_rename_ssa_copies::execute (function *fun)
-{
-  var_map map;
-  basic_block bb;
-  tree var, part_var;
-  gimple stmt;
-  unsigned x;
-  FILE *debug;
-
-  memset (&stats, 0, sizeof (stats));
-
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    debug = dump_file;
-  else
-    debug = NULL;
-
-  map = init_var_map (num_ssa_names);
-
-  FOR_EACH_BB_FN (bb, fun)
-    {
-      /* Scan for real copies.  */
-      for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
-	   gsi_next (&gsi))
-	{
-	  stmt = gsi_stmt (gsi);
-	  if (gimple_assign_ssa_name_copy_p (stmt))
-	    {
-	      tree lhs = gimple_assign_lhs (stmt);
-	      tree rhs = gimple_assign_rhs1 (stmt);
-
-	      copy_rename_partition_coalesce (map, lhs, rhs, debug);
-	    }
-	}
-    }
-
-  FOR_EACH_BB_FN (bb, fun)
-    {
-      /* Treat PHI nodes as copies between the result and each argument.  */
-      for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
-	   gsi_next (&gsi))
-        {
-          size_t i;
-	  tree res;
-	  gphi *phi = gsi.phi ();
-	  res = gimple_phi_result (phi);
-
-	  /* Do not process virtual SSA_NAMES.  */
-	  if (virtual_operand_p (res))
-	    continue;
-
-	  /* Make sure to only use the same partition for an argument
-	     as the result but never the other way around.  */
-	  if (SSA_NAME_VAR (res)
-	      && !DECL_IGNORED_P (SSA_NAME_VAR (res)))
-	    for (i = 0; i < gimple_phi_num_args (phi); i++)
-	      {
-		tree arg = PHI_ARG_DEF (phi, i);
-		if (TREE_CODE (arg) == SSA_NAME)
-		  copy_rename_partition_coalesce (map, res, arg,
-						  debug);
-	      }
-	  /* Else if all arguments are in the same partition try to merge
-	     it with the result.  */
-	  else
-	    {
-	      int all_p_same = -1;
-	      int p = -1;
-	      for (i = 0; i < gimple_phi_num_args (phi); i++)
-		{
-		  tree arg = PHI_ARG_DEF (phi, i);
-		  if (TREE_CODE (arg) != SSA_NAME)
-		    {
-		      all_p_same = 0;
-		      break;
-		    }
-		  else if (all_p_same == -1)
-		    {
-		      p = partition_find (map->var_partition,
-					  SSA_NAME_VERSION (arg));
-		      all_p_same = 1;
-		    }
-		  else if (all_p_same == 1
-			   && p != partition_find (map->var_partition,
-						   SSA_NAME_VERSION (arg)))
-		    {
-		      all_p_same = 0;
-		      break;
-		    }
-		}
-	      if (all_p_same == 1)
-		copy_rename_partition_coalesce (map, res,
-						PHI_ARG_DEF (phi, 0),
-						debug);
-	    }
-        }
-    }
-
-  if (debug)
-    dump_var_map (debug, map);
-
-  /* Now one more pass to make all elements of a partition share the same
-     root variable.  */
-
-  for (x = 1; x < num_ssa_names; x++)
-    {
-      part_var = partition_to_var (map, x);
-      if (!part_var)
-        continue;
-      var = ssa_name (x);
-      if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var))
-	continue;
-      if (debug)
-        {
-	  fprintf (debug, "Coalesced ");
-	  print_generic_expr (debug, var, TDF_SLIM);
-	  fprintf (debug, " to ");
-	  print_generic_expr (debug, part_var, TDF_SLIM);
-	  fprintf (debug, "\n");
-	}
-      stats.coalesced++;
-      replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var));
-    }
-
-  statistics_counter_event (fun, "copies coalesced",
-			    stats.coalesced);
-  delete_var_map (map);
-  return 0;
-}
-
-} // anon namespace
-
-gimple_opt_pass *
-make_pass_rename_ssa_copies (gcc::context *ctxt)
-{
-  return new pass_rename_ssa_copies (ctxt);
-}
diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
index e0c42669..46b1869 100644
--- a/gcc/tree-ssa-live.c
+++ b/gcc/tree-ssa-live.c
@@ -119,7 +119,10 @@ tree_int_map_hasher::equal (const value_type *v, const compare_type *c)
 }
 
 
-/* This routine will initialize the basevar fields of MAP.  */
+/* This routine will initialize the basevar fields of MAP with base
+   names, when we are not optimizing.  When optimizing, we'll use
+   partition numbers as base index numbers, see coalesce_ssa_name in
+   tree-ssa-coalesce.c.  */
 
 static void
 var_map_base_init (var_map map)


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


* Re: [PR64164] drop copyrename, integrate into expand
  2015-03-28 19:22 ` Alexandre Oliva
@ 2015-03-31  5:11   ` Jeff Law
  2015-04-03 13:17     ` Alexandre Oliva
  2015-03-31  6:55   ` Steven Bosscher
  2015-03-31 14:06   ` Richard Biener
  2 siblings, 1 reply; 127+ messages in thread
From: Jeff Law @ 2015-03-31  5:11 UTC (permalink / raw)
  To: Alexandre Oliva, gcc-patches

On 03/28/2015 01:21 PM, Alexandre Oliva wrote:
> On Mar 27, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
>
>> This patch reworks the out-of-ssa expander to enable coalescing of SSA
>> partitions that don't share the same base name.  This is done only when
>> optimizing.
>
>> The test we use to tell whether two partitions can be merged no longer
>> demands them to have the same base variable when optimizing, so they
>> become eligible for coalescing, as they would after copyrename.  We then
>> compute the partitioning we'd get if all coalescible partitions were
>> coalesced, using this partition assignment to assign base vars numbers.
>> These base var numbers are then used to identify conflicts, which used
>> to be based on shared base vars or base types.
>
>> We now propagate base var names during coalescing proper, only towards
>> the leader variable.  I'm no longer sure this is still needed, but
>> something about handling variables and results led me this way and I
>> didn't revisit it.  I might rework that with a later patch, or a later
>> revision of this patch; it would require other means to identify
>> partitions holding result_decls during merging, or allow that and deal
>> with param and result decls in a different way during expand proper.
>
>> I had to fix two lingering bugs in order for the whole thing to work: we
>> perform conflict detection after abnormal coalescing, but we computed
>> live ranges involving only the partition leaders, so conflicts with
>> other names already coalesced wouldn't be detected.
>
> This early abnormal coalescing was only present for a few days in the
> trunk, and I was lucky enough to start working on a tree that had it.
> It turns out that the fix for it was thus rendered unnecessary, so I
> dropped it.  It was the fix for it, that didn't cover the live range
> check, that caused the two ICEs I saw in the regressions tests.  Since
> the ultimate cause of the problem is gone, and the change that
> introduced the check failures, both problems went *poof* after I updated
> the tree, resolved the conflicts and dropped the redundant code.
>
>> The other problem was that we didn't track default defs for parms as
>> live at entry, so they might end up coalesced.
>
> I improved this a little bit, using the bitmap of partitions containing
> default params to check that we only process function-entry defs for
> them, rather than for all param decls in case they end up in other
> partitions.
>
>> I guess none of these problems would have been exercised in practice,
>> because we wouldn't even consider merging ssa names associated with
>> different variables.
>
>> In the end, I verified that this fixed the codegen regression in the
>> PR64164 testcase, that failed to merge two partitions that could in
>> theory be merged, but that wasn't even considered due to differences in
>> the SSA var names.
>
>> I'd agree that disregarding the var names and dropping 4 passes is too
>> much of a change to fix this one problem, but...  it's something we
>> should have long tackled, and it gets this and other jobs done, so...
>
> Regstrapped on x86_64-linux-gnu native and on i686-pc-linux-gnu native
> on x86_64, so without lto plugin.  The only regression is in
> gcc.dg/guality/pr54200.c, that explicitly disables VTA.  When
> optimization is enabled, the different coalescing we perform now causes
> VTA-less variable tracking to lose track of variable "z".  This
> regression in non-VTA var-tracking is expected and, as richi put it in
> PR 64164, I guess we don't care about that, do we? :-)
>
> The other guality regressions I mentioned in my other email turned out
> not to be regressions, but preexisting failures that somehow did not
> make it to the test_summary of my earlier pristine build.
>
> Is this ok to install?
>
>
> for  gcc/ChangeLog
>
> 	PR rtl-optimization/64164
> 	* Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
> 	* tree-ssa-copyrename.c: Removed.
> 	* opts.c (default_options_table): Drop -ftree-copyrename.
> 	* passes.def: Drop all occurrences of pass_rename_ssa_copies.
> 	* common.opt (ftree-copyrename): Ignore.
> 	(ftree-coalesce-vars, ftree-coalesce-inlined-vars): Likewise.
> 	* doc/invoke.texi: Remove the ignored options above.
> 	* gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
> 	across base variables when optimizing.
> 	* tree-ssa-coalesce.c (build_ssa_conflict_graph): Add
> 	param_defaults argument.  Process PARM_DECLs's default defs at
> 	the entry point.
> 	(attempt_coalesce): Add param_defaults argument, and
> 	track the presence of default defs for params in each
> 	partition.  Propagate base var to leader on merge, preferring
> 	parms and results, named vars, ignored vars, and then anon
> 	vars.  Refuse to merge a RESULT_DECL partition with a default
> 	PARM_DECL one.
> 	(coalesce_partitions): Add param_defaults argument,
> 	and pass it to attempt_coalesce.
> 	(dump_part_var_map): New.
> 	(compute_optimized_partition_bases): New, called by...
> 	(coalesce_ssa_name): ... when optimizing, disabling
> 	partition_view_bitmap's base assignment.  Pass local
> 	param_defaults to coalescer functions.
> 	* tree-ssa-live.c (var_map_base_init): Note use only when not
> 	optimizing.
> ---
>   gcc/Makefile.in           |    1
>   gcc/common.opt            |   12 +
>   gcc/doc/invoke.texi       |   29 ---
>   gcc/gimple-expr.c         |    7 -
>   gcc/opts.c                |    1
>   gcc/passes.def            |    5
>   gcc/tree-ssa-coalesce.c   |  342 ++++++++++++++++++++++++++++++-
>   gcc/tree-ssa-copyrename.c |  499 ---------------------------------------------
>   gcc/tree-ssa-live.c       |    5
>   9 files changed, 347 insertions(+), 554 deletions(-)
>   delete mode 100644 gcc/tree-ssa-copyrename.c
>
> diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
> index efc93b7..62ae577 100644
> --- a/gcc/gimple-expr.c
> +++ b/gcc/gimple-expr.c
> @@ -399,13 +399,14 @@ copy_var_decl (tree var, tree name, tree type)
>   bool
>   gimple_can_coalesce_p (tree name1, tree name2)
>   {
> -  /* First check the SSA_NAME's associated DECL.  We only want to
> -     coalesce if they have the same DECL or both have no associated DECL.  */
> +  /* First check the SSA_NAME's associated DECL.  Without
> +     optimization, we only want to coalesce if they have the same DECL
> +     or both have no associated DECL.  */
>     tree var1 = SSA_NAME_VAR (name1);
>     tree var2 = SSA_NAME_VAR (name2);
>     var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
>     var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
> -  if (var1 != var2)
> +  if (var1 != var2 && !optimize)
>       return false;
So when the base variables are different and we are optimizing, 
this allows coalescing, right?

What I don't see is a corresponding change to var_map_base_init to 
ensure we build a conflict graph which includes objects when 
SSA_NAME_VARs are not the same.  I see a vague reference in 
var_map_base_init's header comment that refers us to coalesce_ssa_name.

It appears that compute_optimized_partition_bases handles this by 
creating partitions of things that are related by copies/phis 
regardless of their underlying named object, type, etc.  Right?
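
To make that reading concrete, here is a minimal sketch of the idea, not the patch's code: a plain union-find array stands in for GCC's partition type, every potential coalesce pair is unioned, and each resulting class then gets a dense base index.  All toy_* names are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for one entry of the coalesce list.  */
struct toy_pair
{
  int first;
  int second;
};

/* Return the representative of X's class in PARENT (union-find with
   path halving).  */
static int
toy_find (int *parent, int x)
{
  while (parent[x] != x)
    {
      parent[x] = parent[parent[x]];
      x = parent[x];
    }
  return x;
}

/* Group the N elements so that both members of every pair in PAIRS
   end up in the same class, then number the classes densely.
   BASE_INDEX[i] receives the class number of element i.  Return the
   number of classes.  */
static int
toy_compute_bases (int n, const struct toy_pair *pairs, int npairs,
		   int *base_index)
{
  int *parent = (int *) malloc (sizeof (int) * (n > 0 ? n : 1));
  int *index_map = (int *) malloc (sizeof (int) * (n > 0 ? n : 1));
  int count = 0;

  assert (parent && index_map);

  for (int i = 0; i < n; i++)
    {
      parent[i] = i;
      index_map[i] = -1;	/* No base index assigned yet.  */
    }

  /* Tentatively merge everything that could be coalesced.  */
  for (int k = 0; k < npairs; k++)
    {
      int r1 = toy_find (parent, pairs[k].first);
      int r2 = toy_find (parent, pairs[k].second);
      if (r1 != r2)
	parent[r2] = r1;
    }

  /* Enumerate the classes in order of first appearance.  */
  for (int i = 0; i < n; i++)
    {
      int root = toy_find (parent, i);
      if (index_map[root] < 0)
	index_map[root] = count++;
      base_index[i] = index_map[root];
    }

  free (index_map);
  free (parent);
  return count;
}
```

With pairs (0,1), (2,3) and (1,2) over six elements, elements 0 through 3 share one base index while 4 and 5 each get their own, so conflicts would only need to be computed within the first group.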





> index 1d598b2..f8fd0ef 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
[ ... ]
Hard to argue with removing a pass that gets called 5 times!


> @@ -890,6 +900,36 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
>   	    live_track_process_def (live, result, graph);
>   	}
>
> +      /* Pretend there are defs for params' default defs at the start
> +	 of the (post-)entry block.  We run after abnormal coalescing,
> +	 so we can't assume the leader variable is the default
> +	 definition, but because of SSA_NAME_VAR adjustments in
> +	 attempt_coalesce, we can assume that if there is any
> +	 PARM_DECL in the partition, it will be the leader's
> +	 SSA_NAME_VAR.  */
So the issue here is you want to iterate over the objects live at the 
entry block, which would include any SSA_NAMEs which result from 
PARM_DECLs.  I don't guess there's an easier way to do that other than 
iterating over everything live in that initial block?

And the second EXECUTE_IF_SET_IN_BITMAP iterates over everything 
in the partitions associated with the SSA_NAMES that are live at the 
entry block, right?

I don't guess it'd be more efficient to walk over the SSA_NAMEs looking 
for anything marked as a default definition, then map that back to a 
partition since we'd have to look at every SSA_NAME whereas your code 
only looks at partitions that are live in the entry block, then looks at 
the elements in those partitions.

> @@ -1126,11 +1166,12 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
>
>   static inline bool
>   attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
> -		  FILE *debug)
> +		  bitmap param_defaults, FILE *debug)
[ ... ]
So the bulk of the changes into this routine are really about picking a 
good leader, which presumably is how we're able to get the desired 
effects on debuginfo that we used to get from tree-ssa-copyrename.c?


> @@ -1158,6 +1199,70 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
>       {
>         var1 = partition_to_var (map, p1);
>         var2 = partition_to_var (map, p2);
> +
> +      tree leader;
> +
> +      if (var1 == var2 || !SSA_NAME_VAR (var2)
> +	  || SSA_NAME_VAR (var1) == SSA_NAME_VAR (var2))
> +	{
> +	  leader = SSA_NAME_VAR (var1);
> +	  default_def = (leader && TREE_CODE (leader) == PARM_DECL
> +			 && (SSA_NAME_IS_DEFAULT_DEF (var1)
> +			     || bitmap_bit_p (param_defaults, p1)));
> +	}
So some comments about the various cases here might help.  I can sort 
them out if I read the code, but one could argue that a block comment on 
the rules for how to select the partition leader would be better.
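
Going only by the ChangeLog entry ("preferring parms and results, named vars, ignored vars, and then anon vars"), the rule might be summarized roughly as below.  This is a sketch with hypothetical names, not the patch's logic; in particular, keeping the current leader on a tie is my assumption.

```c
/* Hypothetical classification of an SSA name's base variable, listed
   from least to most preferred as partition leader.  */
enum toy_base_kind
{
  TOY_ANON,		/* No SSA_NAME_VAR at all.  */
  TOY_IGNORED_VAR,	/* Compiler temporary (DECL_IGNORED_P).  */
  TOY_NAMED_VAR,	/* User-visible variable.  */
  TOY_PARM_OR_RESULT	/* PARM_DECL or RESULT_DECL.  */
};

/* Return nonzero if the base of kind OTHER should replace the
   leader's base of kind LEADER when two partitions are merged.
   Assumption: on a tie the existing leader's base is kept.  */
static int
toy_prefer_other (enum toy_base_kind leader, enum toy_base_kind other)
{
  return other > leader;
}
```

Under this ordering a named variable would displace an anonymous leader, but nothing would displace a parm or result.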

Is the special casing of PARM_DECLs + RESULT_DECLs really a failing of 
not handling one or both properly when computing liveness information?

I'm not aware of an inherent reason why a PARM_DECL couldn't coalesce 
with a related RESULT_DECL if they are otherwise non-conflicting and 
related by a copy/phi.


> @@ -1272,6 +1415,178 @@ ssa_name_var_hash::equal (const value_type *n1, const compare_type *n2)
> +
> +/* Fill in MAP's partition_to_base_index, with one index for each
> +   partition of SSA names USED_IN_COPIES and related by CL
> +   coalesce possibilities.  */
> +
> +static void
> +compute_optimized_partition_bases (var_map map, bitmap used_in_copies,
> +				   coalesce_list_p cl)
Presumably ordering of unioning of the partitions doesn't matter here as 
we're looking at coalesce possibilities rather than things we have 
actually coalesced?  Thus it's OK (?) to handle the names occurring in 
abnormal PHIs after those names that are associated by a copy.
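
To illustrate the ordering question: when unions are performed without
any conflict check, the resulting grouping is just the transitive
closure of the pairs, so the order in which pairs are visited should not
matter.  The following is a minimal standalone sketch of that property,
not GCC code; `struct uf`, `N_PARTS` and the helper names are
hypothetical stand-ins for the TENTATIVE partition in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the TENTATIVE partition; this is not
   GCC's partition type.  */
#define N_PARTS 6

struct uf { int parent[N_PARTS]; };

static void uf_init (struct uf *u)
{
  for (int i = 0; i < N_PARTS; i++)
    u->parent[i] = i;
}

static int uf_find (struct uf *u, int x)
{
  while (u->parent[x] != x)
    {
      u->parent[x] = u->parent[u->parent[x]];  /* Path halving.  */
      x = u->parent[x];
    }
  return x;
}

/* Unconditional union: no conflict test, as in the tentative pass.  */
static void uf_union (struct uf *u, int a, int b)
{
  a = uf_find (u, a);
  b = uf_find (u, b);
  if (a != b)
    u->parent[b] = a;
}

/* Return nonzero if U1 and U2 group the elements identically,
   regardless of which element each chose as representative.  */
static int uf_same_grouping (struct uf *u1, struct uf *u2)
{
  for (int i = 0; i < N_PARTS; i++)
    for (int j = 0; j < N_PARTS; j++)
      if ((uf_find (u1, i) == uf_find (u1, j))
          != (uf_find (u2, i) == uf_find (u2, j)))
        return 0;
  return 1;
}

/* Union the N pairs in forward order and in reverse order, then
   report whether both orders produced the same grouping.  */
static int order_independent (const int (*pairs)[2], size_t n)
{
  struct uf fwd, rev;
  uf_init (&fwd);
  uf_init (&rev);
  for (size_t i = 0; i < n; i++)
    uf_union (&fwd, pairs[i][0], pairs[i][1]);
  for (size_t i = n; i-- > 0;)
    uf_union (&rev, pairs[i][0], pairs[i][1]);
  return uf_same_grouping (&fwd, &rev);
}
```

Only the choice of representative can differ between orders, which is
harmless if the representatives are used solely to number the groups.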

This is all probably OK, but I want to make sure I understand what's 
happening before a final approval.

jeff


* Re: [PR64164] drop copyrename, integrate into expand
  2015-03-28 19:22 ` Alexandre Oliva
  2015-03-31  5:11   ` Jeff Law
@ 2015-03-31  6:55   ` Steven Bosscher
  2015-03-31 13:30     ` Richard Biener
  2015-03-31 14:06   ` Richard Biener
  2 siblings, 1 reply; 127+ messages in thread
From: Steven Bosscher @ 2015-03-31  6:55 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: GCC Patches

On Sat, Mar 28, 2015 at 8:21 PM, Alexandre Oliva wrote:
> Regstrapped on x86_64-linux-gnu native and on i686-pc-linux-gnu native
> on x86_64, so without lto plugin.  The only regression is in
> gcc.dg/guality/pr54200.c, that explicitly disables VTA.

What about memory footprint? IIRC this pass was in part introduced to
reduce the number of VAR_DECLs.

Ciao!
Steven


* Re: [PR64164] drop copyrename, integrate into expand
  2015-03-31  6:55   ` Steven Bosscher
@ 2015-03-31 13:30     ` Richard Biener
  0 siblings, 0 replies; 127+ messages in thread
From: Richard Biener @ 2015-03-31 13:30 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: Alexandre Oliva, GCC Patches

On Tue, Mar 31, 2015 at 8:55 AM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> On Sat, Mar 28, 2015 at 8:21 PM, Alexandre Oliva wrote:
>> Regstrapped on x86_64-linux-gnu native and on i686-pc-linux-gnu native
>> on x86_64, so without lto plugin.  The only regression is in
>> gcc.dg/guality/pr54200.c, that explicitly disables VTA.
>
> What about memory footprint? IIRC this pass was in part introduced to
> reduce the number of VAR_DECLs.

That's no longer necessary as we now drop VAR_DECLs from non-user vars
completely at into-SSA time.  We have "anonymous" SSA names without
associated decls.

Richard.

> Ciao!
> Steven


* Re: [PR64164] drop copyrename, integrate into expand
  2015-03-28 19:22 ` Alexandre Oliva
  2015-03-31  5:11   ` Jeff Law
  2015-03-31  6:55   ` Steven Bosscher
@ 2015-03-31 14:06   ` Richard Biener
  2015-04-03 13:30     ` Alexandre Oliva
  2 siblings, 1 reply; 127+ messages in thread
From: Richard Biener @ 2015-03-31 14:06 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: GCC Patches

On Sat, Mar 28, 2015 at 8:21 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Mar 27, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
>
>> This patch reworks the out-of-ssa expander to enable coalescing of SSA
>> partitions that don't share the same base name.  This is done only when
>> optimizing.
>
>> The test we use to tell whether two partitions can be merged no longer
>> demands them to have the same base variable when optimizing, so they
>> become eligible for coalescing, as they would after copyrename.  We then
>> compute the partitioning we'd get if all coalescible partitions were
>> coalesced, using this partition assignment to assign base vars numbers.
>> These base var numbers are then used to identify conflicts, which used
>> to be based on shared base vars or base types.
>
>> We now propagate base var names during coalescing proper, only towards
>> the leader variable.  I'm no longer sure this is still needed, but
>> something about handling variables and results led me this way and I
>> didn't revisit it.  I might rework that with a later patch, or a later
>> revision of this patch; it would require other means to identify
>> partitions holding result_decls during merging, or allow that and deal
>> with param and result decls in a different way during expand proper.
>
>> I had to fix two lingering bugs in order for the whole thing to work: we
>> perform conflict detection after abnormal coalescing, but we computed
>> live ranges involving only the partition leaders, so conflicts with
>> other names already coalesced wouldn't be detected.
>
> This early abnormal coalescing was only present for a few days in the
> trunk, and I was lucky enough to start working on a tree that had it.
> It turns out that the fix for it was thus rendered unnecessary, so I
> dropped it.  It was the fix for it, that didn't cover the live range
> check, that caused the two ICEs I saw in the regressions tests.  Since
> the ultimate cause of the problem is gone, and the change that
> introduced the check failures, both problems went *poof* after I updated
> the tree, resolved the conflicts and dropped the redundant code.
>
>> The other problem was that we didn't track default defs for parms as
>> live at entry, so they might end up coalesced.
>
> I improved this a little bit, using the bitmap of partitions containing
> default params to check that we only process function-entry defs for
> them, rather than for all param decls in case they end up in other
> partitions.
>
>> I guess none of these problems would have been exercised in practice,
>> because we wouldn't even consider merging ssa names associated with
>> different variables.
>
>> In the end, I verified that this fixed the codegen regression in the
>> PR64164 testcase, that failed to merge two partitions that could in
>> theory be merged, but that wasn't even considered due to differences in
>> the SSA var names.
>
>> I'd agree that disregarding the var names and dropping 4 passes is too
>> much of a change to fix this one problem, but...  it's something we
>> should have long tackled, and it gets this and other jobs done, so...
>
> Regstrapped on x86_64-linux-gnu native and on i686-pc-linux-gnu native
> on x86_64, so without lto plugin.  The only regression is in
> gcc.dg/guality/pr54200.c, that explicitly disables VTA.  When
> optimization is enabled, the different coalescing we perform now causes
> VTA-less variable tracking to lose track of variable "z".  This
> regression in non-VTA var-tracking is expected and, as richi put it in
> PR 64164, I guess we don't care about that, do we? :-)

Apart from at -O0, yes.

> The other guality regressions I mentioned in my other email turned out
> not to be regressions, but preexisting failures that somehow did not
> make to the test_summary of my earlier pristine build.
>
> Is this ok to install?

I think this is stage1 material.  Some comments inline.

>
> for  gcc/ChangeLog
>
>         PR rtl-optimization/64164
>         * Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
>         * tree-ssa-copyrename.c: Removed.
>         * opts.c (default_options_table): Drop -ftree-copyrename.
>         * passes.def: Drop all occurrences of pass_rename_ssa_copies.
>         * common.opt (ftree-copyrename): Ignore.
>         (ftree-coalesce-vars, ftree-coalesce-inlined-vars): Likewise.
>         * doc/invoke.texi: Remove the ignored options above.
>         * gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
>         across base variables when optimizing.
>         * tree-ssa-coalesce.c (build_ssa_conflict_graph): Add
>         param_defaults argument.  Process PARM_DECLs's default defs at
>         the entry point.
>         (attempt_coalesce): Add param_defaults argument, and
>         track the presence of default defs for params in each
>         partition.  Propagate base var to leader on merge, preferring
>         parms and results, named vars, ignored vars, and then anon
>         vars.  Refuse to merge a RESULT_DECL partition with a default
>         PARM_DECL one.
>         (coalesce_partitions): Add param_defaults argument,
>         and pass it to attempt_coalesce.
>         (dump_part_var_map): New.
>         (compute_optimized_partition_bases): New, called by...
>         (coalesce_ssa_name): ... when optimizing, disabling
>         partition_view_bitmap's base assignment.  Pass local
>         param_defaults to coalescer functions.
>         * tree-ssa-live.c (var_map_base_init): Note use only when not
>         optimizing.
> ---
>  gcc/Makefile.in           |    1
>  gcc/common.opt            |   12 +
>  gcc/doc/invoke.texi       |   29 ---
>  gcc/gimple-expr.c         |    7 -
>  gcc/opts.c                |    1
>  gcc/passes.def            |    5
>  gcc/tree-ssa-coalesce.c   |  342 ++++++++++++++++++++++++++++++-
>  gcc/tree-ssa-copyrename.c |  499 ---------------------------------------------
>  gcc/tree-ssa-live.c       |    5
>  9 files changed, 347 insertions(+), 554 deletions(-)
>  delete mode 100644 gcc/tree-ssa-copyrename.c
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index f924fb8..990c4e9 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1428,7 +1428,6 @@ OBJS = \
>         tree-ssa-ccp.o \
>         tree-ssa-coalesce.o \
>         tree-ssa-copy.o \
> -       tree-ssa-copyrename.o \
>         tree-ssa-dce.o \
>         tree-ssa-dom.o \
>         tree-ssa-dse.o \
> diff --git a/gcc/common.opt b/gcc/common.opt
> index b49ac46..fefaee7 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2207,16 +2207,16 @@ Common Report Var(flag_tree_ch) Optimization
>  Enable loop header copying on trees
>
>  ftree-coalesce-inlined-vars
> -Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization
> -Enable coalescing of copy-related user variables that are inlined
> +Common Ignore RejectNegative
> +Does nothing.  Preserved for backward compatibility.
>
>  ftree-coalesce-vars
> -Common Report Var(flag_ssa_coalesce_vars,2) Optimization
> -Enable coalescing of all copy-related user variables
> +Common Ignore
> +Does nothing.  Preserved for backward compatibility.
>
>  ftree-copyrename
> -Common Report Var(flag_tree_copyrename) Optimization
> -Replace SSA temporaries with better names in copies
> +Common Ignore
> +Does nothing.  Preserved for backward compatibility.
>
>  ftree-copy-prop
>  Common Report Var(flag_tree_copy_prop) Optimization
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 9749727..5d2c516 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -442,8 +442,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol
>  -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
>  -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
> --ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
> --ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
> +-ftree-copy-prop -ftree-dce -ftree-dominator-opts -ftree-dse @gol
>  -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
>  -ftree-loop-if-convert-stores -ftree-loop-im @gol
>  -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
> @@ -8822,32 +8821,6 @@ Perform scalar replacement of aggregates.  This pass replaces structure
>  references with scalars to prevent committing structures to memory too
>  early.  This flag is enabled by default at @option{-O} and higher.
>
> -@item -ftree-copyrename
> -@opindex ftree-copyrename
> -Perform copy renaming on trees.  This pass attempts to rename compiler
> -temporaries to other variables at copy locations, usually resulting in
> -variable names which more closely resemble the original variables.  This flag
> -is enabled by default at @option{-O} and higher.
> -
> -@item -ftree-coalesce-inlined-vars
> -@opindex ftree-coalesce-inlined-vars
> -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
> -combine small user-defined variables too, but only if they are inlined
> -from other functions.  It is a more limited form of
> -@option{-ftree-coalesce-vars}.  This may harm debug information of such
> -inlined variables, but it keeps variables of the inlined-into
> -function apart from each other, such that they are more likely to
> -contain the expected values in a debugging session.
> -
> -@item -ftree-coalesce-vars
> -@opindex ftree-coalesce-vars
> -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
> -combine small user-defined variables too, instead of just compiler
> -temporaries.  This may severely limit the ability to debug an optimized
> -program compiled with @option{-fno-var-tracking-assignments}.  In the
> -negated form, this flag prevents SSA coalescing of user variables,
> -including inlined ones.  This option is enabled by default.
> -
>  @item -ftree-ter
>  @opindex ftree-ter
>  Perform temporary expression replacement during the SSA->normal phase.  Single
> diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
> index efc93b7..62ae577 100644
> --- a/gcc/gimple-expr.c
> +++ b/gcc/gimple-expr.c
> @@ -399,13 +399,14 @@ copy_var_decl (tree var, tree name, tree type)
>  bool
>  gimple_can_coalesce_p (tree name1, tree name2)
>  {
> -  /* First check the SSA_NAME's associated DECL.  We only want to
> -     coalesce if they have the same DECL or both have no associated DECL.  */
> +  /* First check the SSA_NAME's associated DECL.  Without
> +     optimization, we only want to coalesce if they have the same DECL
> +     or both have no associated DECL.  */
>    tree var1 = SSA_NAME_VAR (name1);
>    tree var2 = SSA_NAME_VAR (name2);
>    var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
>    var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
> -  if (var1 != var2)
> +  if (var1 != var2 && !optimize)
>      return false;
>
>    /* Now check the types.  If the types are the same, then we should
> diff --git a/gcc/opts.c b/gcc/opts.c
> index 39c190d..8149421 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -453,7 +453,6 @@ static const struct default_options default_options_table[] =
>      { OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 },
>      { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 },
> -    { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 },
> diff --git a/gcc/passes.def b/gcc/passes.def
> index 1d598b2..f8fd0ef 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -77,7 +77,6 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_all_early_optimizations);
>        PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
>           NEXT_PASS (pass_remove_cgraph_callee_edges);
> -         NEXT_PASS (pass_rename_ssa_copies);
>           NEXT_PASS (pass_object_sizes);
>           NEXT_PASS (pass_ccp);
>           /* After CCP we rewrite no longer addressed locals into SSA
> @@ -154,7 +153,6 @@ along with GCC; see the file COPYING3.  If not see
>        /* Initial scalar cleanups before alias computation.
>          They ensure memory accesses are not indirect wherever possible.  */
>        NEXT_PASS (pass_strip_predict_hints);
> -      NEXT_PASS (pass_rename_ssa_copies);
>        NEXT_PASS (pass_ccp);
>        /* After CCP we rewrite no longer addressed locals into SSA
>          form if possible.  */
> @@ -182,7 +180,6 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_stdarg);
>        NEXT_PASS (pass_lower_complex);
>        NEXT_PASS (pass_sra);
> -      NEXT_PASS (pass_rename_ssa_copies);
>        /* The dom pass will also resolve all __builtin_constant_p calls
>           that are still there to 0.  This has to be done after some
>          propagations have already run, but before some more dead code
> @@ -291,7 +288,6 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_fold_builtins);
>        NEXT_PASS (pass_optimize_widening_mul);
>        NEXT_PASS (pass_tail_calls);
> -      NEXT_PASS (pass_rename_ssa_copies);
>        /* FIXME: If DCE is not run before checking for uninitialized uses,
>          we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
>          However, this also causes us to misdiagnose cases that should be
> @@ -326,7 +322,6 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_dce);
>        NEXT_PASS (pass_asan);
>        NEXT_PASS (pass_tsan);
> -      NEXT_PASS (pass_rename_ssa_copies);
>        /* ???  We do want some kind of loop invariant motion, but we possibly
>           need to adjust LIM to be more friendly towards preserving accurate
>          debug information here.  */
> diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
> index 1afeefe..8557d84 100644
> --- a/gcc/tree-ssa-coalesce.c
> +++ b/gcc/tree-ssa-coalesce.c
> @@ -825,13 +825,23 @@ live_track_clear_base_vars (live_track_p ptr)
>     base variable are added.  */
>
>  static ssa_conflicts_p
> -build_ssa_conflict_graph (tree_live_info_p liveinfo)
> +build_ssa_conflict_graph (tree_live_info_p liveinfo, bitmap param_defaults)
>  {
>    ssa_conflicts_p graph;
>    var_map map;
>    basic_block bb;
>    ssa_op_iter iter;
>    live_track_p live;
> +  basic_block entry;
> +
> +  /* If we are optimizing, we may attempt to coalesce variables from
> +     different base variables, including different parameters, so we
> +     have to make sure default defs live at the entry block conflict
> +     with each other.  */
> +  if (optimize)
> +    entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
> +  else
> +    entry = NULL;
>
>    map = live_var_map (liveinfo);
>    graph = ssa_conflicts_new (num_var_partitions (map));
> @@ -890,6 +900,36 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
>             live_track_process_def (live, result, graph);
>         }
>
> +      /* Pretend there are defs for params' default defs at the start
> +        of the (post-)entry block.  We run after abnormal coalescing,
> +        so we can't assume the leader variable is the default
> +        definition, but because of SSA_NAME_VAR adjustments in
> +        attempt_coalesce, we can assume that if there is any
> +        PARM_DECL in the partition, it will be the leader's
> +        SSA_NAME_VAR.  */
> +      if (bb == entry)
> +       {
> +         unsigned base;
> +         bitmap_iterator bi;
> +         EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi)
> +           {
> +             bitmap_iterator bi2;
> +             unsigned part;
> +             EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base],
> +                                       0, part, bi2)
> +               {
> +                 tree var = partition_to_var (map, part);
> +                 if (!SSA_NAME_VAR (var)
> +                     || TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL
> +                     || !(SSA_NAME_IS_DEFAULT_DEF (var)
> +                          || (param_defaults
> +                              && bitmap_bit_p (param_defaults, part))))
> +                   continue;
> +                 live_track_process_def (live, var, graph);
> +               }
> +           }
> +       }
> +

This looks somewhat awkward to me ;)  Is it really important to allow
coalescing PARM_DECL-based SSA vars with something else?  At least
abnormal coalescing doesn't need to do that, so just walking over
the function decl's parameters and making their default defs live
should be enough?

That is, that param_defaults bitmap looks ugly to me.

>       live_track_clear_base_vars (live);
>      }
>
> @@ -1126,11 +1166,12 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
>
>  static inline bool
>  attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
> -                 FILE *debug)
> +                 bitmap param_defaults, FILE *debug)
>  {
>    int z;
>    tree var1, var2;
>    int p1, p2;
> +  bool default_def = false;
>
>    p1 = var_to_partition (map, ssa_name (x));
>    p2 = var_to_partition (map, ssa_name (y));
> @@ -1158,6 +1199,70 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
>      {
>        var1 = partition_to_var (map, p1);
>        var2 = partition_to_var (map, p2);
> +
> +      tree leader;
> +
> +      if (var1 == var2 || !SSA_NAME_VAR (var2)
> +         || SSA_NAME_VAR (var1) == SSA_NAME_VAR (var2))
> +       {
> +         leader = SSA_NAME_VAR (var1);
> +         default_def = (leader && TREE_CODE (leader) == PARM_DECL
> +                        && (SSA_NAME_IS_DEFAULT_DEF (var1)
> +                            || bitmap_bit_p (param_defaults, p1)));
> +       }
> +      else if (!SSA_NAME_VAR (var1))
> +       {
> +         leader = SSA_NAME_VAR (var2);
> +         default_def = (leader && TREE_CODE (leader) == PARM_DECL
> +                        && (SSA_NAME_IS_DEFAULT_DEF (var2)
> +                            || bitmap_bit_p (param_defaults, p2)));
> +       }
> +      else if ((TREE_CODE (SSA_NAME_VAR (var1)) == PARM_DECL
> +               && (SSA_NAME_IS_DEFAULT_DEF (var1)
> +                   || bitmap_bit_p (param_defaults, p1)))
> +              || TREE_CODE (SSA_NAME_VAR (var1)) == RESULT_DECL)
> +       {
> +         if ((TREE_CODE (SSA_NAME_VAR (var2)) == PARM_DECL
> +              && (SSA_NAME_IS_DEFAULT_DEF (var2)
> +                  || bitmap_bit_p (param_defaults, p2)))
> +             || TREE_CODE (SSA_NAME_VAR (var2)) == RESULT_DECL)
> +           {
> +             /* We only have one RESULT_DECL, and two PARM_DECL
> +                DEFAULT_DEFs would have conflicted, so we know either
> +                one of var1 or var2 is a PARM_DECL, and the other is
> +                a RESULT_DECL.  */
> +             if (debug)
> +               fprintf (debug, ": Cannot coalesce PARM_DECL and RESULT_DECL\n");
> +             return false;
> +           }
> +         leader = SSA_NAME_VAR (var1);
> +         default_def = TREE_CODE (leader) == PARM_DECL;
> +       }
> +      else if ((TREE_CODE (SSA_NAME_VAR (var2)) == PARM_DECL
> +               && (SSA_NAME_IS_DEFAULT_DEF (var2)
> +                   || bitmap_bit_p (param_defaults, p2)))
> +              || TREE_CODE (SSA_NAME_VAR (var2)) == RESULT_DECL)
> +       {
> +         leader = SSA_NAME_VAR (var2);
> +         default_def = TREE_CODE (leader) == PARM_DECL;
> +       }
> +      else if (TREE_CODE (SSA_NAME_VAR (var1)) == PARM_DECL)
> +       leader = SSA_NAME_VAR (var1);
> +      else if (TREE_CODE (SSA_NAME_VAR (var2)) == PARM_DECL)
> +       leader = SSA_NAME_VAR (var2);
> +      else if (TREE_CODE (SSA_NAME_VAR (var1)) == VAR_DECL
> +              && !DECL_IGNORED_P (SSA_NAME_VAR (var1)))
> +       leader = SSA_NAME_VAR (var1);
> +      else if (TREE_CODE (SSA_NAME_VAR (var2)) == VAR_DECL
> +              && !DECL_IGNORED_P (SSA_NAME_VAR (var2)))
> +       leader = SSA_NAME_VAR (var2);
> +      else if (TREE_CODE (SSA_NAME_VAR (var1)) == VAR_DECL)
> +       leader = SSA_NAME_VAR (var1);
> +      else if (TREE_CODE (SSA_NAME_VAR (var2)) == VAR_DECL)
> +       leader = SSA_NAME_VAR (var2);
> +      else /* What else could it be?  */
> +       gcc_unreachable ();
> +

Comments are definitely missing in this spaghetti...
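
As a reading aid, the if/else chain above appears to implement a fixed
preference order, matching the ChangeLog wording (parms and results,
named vars, ignored vars, then anonymous names), with var1 winning
ties.  A minimal standalone model of that reading follows; it is a
sketch, not the patch's code.  `struct name_info`, `leader_rank` and
`pick_leader` are hypothetical names, and the `default_def` field stands
in for "SSA_NAME_IS_DEFAULT_DEF or bit set in param_defaults".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of what attempt_coalesce inspects per name.  */
enum decl_kind { ANON, VAR, PARM, RESULT };

struct name_info
{
  enum decl_kind kind;  /* Kind of SSA_NAME_VAR; ANON if there is none.  */
  int decl;             /* Identity of the base decl; unused for ANON.  */
  bool ignored;         /* Models DECL_IGNORED_P, for VAR.  */
  bool default_def;     /* Partition holds a PARM default def.  */
};

/* Lower rank means a stronger claim to become the leader.  */
static int leader_rank (const struct name_info *n)
{
  switch (n->kind)
    {
    case RESULT: return 0;
    case PARM:   return n->default_def ? 0 : 1;
    case VAR:    return n->ignored ? 3 : 2;
    default:     return 4;
    }
}

/* Return 1 if A's base var becomes the leader, 2 if B's does, and 0
   if the merge is refused (default-def PARM_DECL vs. RESULT_DECL).  */
static int pick_leader (const struct name_info *a, const struct name_info *b)
{
  /* Same base var, or B anonymous: keep A's base var.  */
  if (b->kind == ANON || (a->kind != ANON && a->decl == b->decl))
    return 1;
  /* A anonymous: B's base var wins.  */
  if (a->kind == ANON)
    return 2;

  int ra = leader_rank (a);
  int rb = leader_rank (b);
  if (ra == 0 && rb == 0)
    return 0;
  return ra <= rb ? 1 : 2;
}
```

If that reading is right, a block comment stating the ranking once
would let the chain itself shrink to a comparison of two ranks.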

>        z = var_union (map, var1, var2);
>        if (z == NO_PARTITION)
>         {
> @@ -1173,8 +1278,46 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
>        else
>         ssa_conflicts_merge (graph, p2, p1);
>
> +      if (z == p1)
> +       {
> +         if (SSA_NAME_VAR (var1) != leader)
> +           {
> +             replace_ssa_name_symbol (var1, leader);
> +             if (debug)
> +               {
> +                 fprintf (debug, ": Renamed ");
> +                 print_generic_expr (debug, var1, TDF_SLIM);
> +               }
> +           }
> +         if (default_def)
> +           {
> +             if (SSA_NAME_IS_DEFAULT_DEF (var2))
> +               bitmap_clear_bit (param_defaults, p2);
> +             bitmap_set_bit (param_defaults, p1);
> +           }
> +       }
> +      else
> +       {
> +         if (SSA_NAME_VAR (var2) != leader)
> +           {
> +             replace_ssa_name_symbol (var2, leader);
> +             if (debug)
> +               {
> +                 fprintf (debug, ": Renamed ");
> +                 print_generic_expr (debug, var2, TDF_SLIM);
> +               }
> +           }
> +         if (default_def)
> +           {
> +             if (SSA_NAME_IS_DEFAULT_DEF (var1))
> +               bitmap_clear_bit (param_defaults, p1);
> +             bitmap_set_bit (param_defaults, p2);
> +           }
> +       }

Or, seeing this, why coalesce default-defs at all?  Either they are param values
or they have indeterminate values (and thus we can and do pick whatever is
available at expansion time)?

> +
>        if (debug)
>         fprintf (debug, ": Success -> %d\n", z);
> +
>        return true;
>      }
>
> @@ -1190,7 +1333,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
>
>  static void
>  coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
> -                    FILE *debug)
> +                    bitmap param_defaults, FILE *debug)
>  {
>    int x = 0, y = 0;
>    tree var1, var2;
> @@ -1226,7 +1369,7 @@ coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
>                 if (debug)
>                   fprintf (debug, "Abnormal coalesce: ");
>
> -               if (!attempt_coalesce (map, graph, v1, v2, debug))
> +               if (!attempt_coalesce (map, graph, v1, v2, param_defaults, debug))
>                   fail_abnormal_edge_coalesce (v1, v2);
>               }
>           }
> @@ -1244,7 +1387,7 @@ coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
>
>        if (debug)
>         fprintf (debug, "Coalesce list: ");
> -      attempt_coalesce (map, graph, x, y, debug);
> +      attempt_coalesce (map, graph, x, y, param_defaults, debug);
>      }
>  }
>
> @@ -1272,6 +1415,178 @@ ssa_name_var_hash::equal (const value_type *n1, const compare_type *n2)
>  }
>
>
> +/* Output partition map MAP with coalescing plan PART to file F.  */
> +
> +void
> +dump_part_var_map (FILE *f, partition part, var_map map)
> +{
> +  int t;
> +  unsigned x, y;
> +  int p;
> +
> +  fprintf (f, "\nCoalescible Partition map \n\n");
> +
> +  for (x = 0; x < map->num_partitions; x++)
> +    {
> +      if (map->view_to_partition != NULL)
> +       p = map->view_to_partition[x];
> +      else
> +       p = x;
> +
> +      if (ssa_name (p) == NULL_TREE
> +         || virtual_operand_p (ssa_name (p)))
> +        continue;
> +
> +      t = 0;
> +      for (y = 1; y < num_ssa_names; y++)
> +        {
> +         tree var = version_to_var (map, y);
> +         if (!var)
> +           continue;
> +         int q = var_to_partition (map, var);
> +         p = partition_find (part, q);
> +         gcc_assert (map->partition_to_base_index[q]
> +                     == map->partition_to_base_index[p]);
> +
> +         if (p == (int)x)
> +           {
> +             if (t++ == 0)
> +               {
> +                 fprintf (f, "Partition %d, base %d (", x,
> +                          map->partition_to_base_index[q]);
> +                 print_generic_expr (f, partition_to_var (map, q), TDF_SLIM);
> +                 fprintf (f, " - ");
> +               }
> +             fprintf (f, "%d ", y);
> +           }
> +       }
> +      if (t != 0)
> +       fprintf (f, ")\n");
> +    }
> +  fprintf (f, "\n");
> +}
> +
> +/* Fill in MAP's partition_to_base_index, with one index for each
> +   partition of SSA names USED_IN_COPIES and related by CL
> +   coalesce possibilities.  */
> +
> +static void
> +compute_optimized_partition_bases (var_map map, bitmap used_in_copies,
> +                                  coalesce_list_p cl)
> +{
> +  int parts = num_var_partitions (map);
> +  partition tentative = partition_new (parts);
> +
> +  /* Partition the SSA versions so that, for each coalescible
> +     pair, both of its members are in the same partition in
> +     TENTATIVE.  */
> +  gcc_assert (!cl->sorted);
> +  coalesce_pair_p node;
> +  coalesce_iterator_type ppi;
> +  FOR_EACH_PARTITION_PAIR (node, ppi, cl)
> +    {
> +      tree v1 = ssa_name (node->first_element);
> +      int p1 = partition_find (tentative, var_to_partition (map, v1));
> +      tree v2 = ssa_name (node->second_element);
> +      int p2 = partition_find (tentative, var_to_partition (map, v2));
> +
> +      if (p1 == p2)
> +       continue;
> +
> +      partition_union (tentative, p1, p2);
> +    }
> +
> +  /* We have to deal with cost one pairs too.  */
> +  for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next)
> +    {
> +      tree v1 = ssa_name (co->first_element);
> +      int p1 = partition_find (tentative, var_to_partition (map, v1));
> +      tree v2 = ssa_name (co->second_element);
> +      int p2 = partition_find (tentative, var_to_partition (map, v2));
> +
> +      if (p1 == p2)
> +       continue;
> +
> +      partition_union (tentative, p1, p2);
> +    }
> +
> +  /* And also with abnormal edges.  */
> +  basic_block bb;
> +  edge e;
> +  edge_iterator ei;
> +  FOR_EACH_BB_FN (bb, cfun)
> +    {
> +      FOR_EACH_EDGE (e, ei, bb->preds)
> +       if (e->flags & EDGE_ABNORMAL)
> +         {
> +           gphi_iterator gsi;
> +           for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
> +                gsi_next (&gsi))
> +             {
> +               gphi *phi = gsi.phi ();
> +               tree arg = PHI_ARG_DEF (phi, e->dest_idx);
> +               if (SSA_NAME_IS_DEFAULT_DEF (arg)
> +                   && (!SSA_NAME_VAR (arg)
> +                       || TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL))
> +                 continue;
> +
> +               tree res = PHI_RESULT (phi);
> +
> +               int p1 = partition_find (tentative, var_to_partition (map, res));
> +               int p2 = partition_find (tentative, var_to_partition (map, arg));
> +
> +               if (p1 == p2)
> +                 continue;
> +
> +               partition_union (tentative, p1, p2);
> +             }
> +         }
> +    }

So the above does full coalescing ignoring conflicts.

> +
> +  map->partition_to_base_index = XCNEWVEC (int, parts);
> +  auto_vec<unsigned int> index_map (parts);
> +  if (parts)
> +    index_map.quick_grow (parts);
> +
> +  const unsigned no_part = -1;
> +  unsigned count = parts;
> +  while (count)
> +    index_map[--count] = no_part;
> +
> +  /* Initialize MAP's mapping from partition to base index, using
> +     as base indices an enumeration of the TENTATIVE partitions in
> +     which each SSA version ended up, so that we compute conflicts
> +     between all SSA versions that ended up in the same potential
> +     coalesce partition.  */
> +  bitmap_iterator bi;
> +  unsigned i;
> +  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
> +    {
> +      int pidx = var_to_partition (map, ssa_name (i));
> +      int base = partition_find (tentative, pidx);
> +      if (index_map[base] != no_part)
> +       continue;
> +      index_map[base] = count++;
> +    }
> +
> +  map->num_basevars = count;

Did you do any statistics on how the number of basevars changes with your patch
compared to trunk?
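
For context on what is being counted: in the hunk above, num_basevars
ends up as the number of distinct tentative partitions reached from
used_in_copies, numbered densely in order of first appearance.  Below is
a minimal standalone model of that enumeration, simplified to iterate
over partitions rather than SSA versions.  `assign_base_indices`,
`NO_INDEX` and the array parameters are hypothetical and not part of the
patch.

```c
#include <assert.h>
#include <stddef.h>

#define NO_INDEX ((unsigned) -1)

/* ROOT[i] is the tentative representative of partition I (what
   partition_find would return), and USED[i] is nonzero if partition I
   is in the used-in-copies set.  Store in BASE[i] the dense base index
   of every used partition (NO_INDEX for the others), using INDEX_MAP
   as scratch space of N entries.  Return the number of distinct bases.  */
static unsigned
assign_base_indices (const int *root, const unsigned char *used,
                     unsigned *base, unsigned *index_map, size_t n)
{
  unsigned count = 0;

  for (size_t i = 0; i < n; i++)
    index_map[i] = NO_INDEX;

  /* First pass: give each representative an index the first time a
     used partition reaches it.  */
  for (size_t i = 0; i < n; i++)
    if (used[i] && index_map[root[i]] == NO_INDEX)
      index_map[root[i]] = count++;

  /* Second pass: every used partition inherits the index of its
     representative.  */
  for (size_t i = 0; i < n; i++)
    base[i] = used[i] ? index_map[root[i]] : NO_INDEX;

  return count;
}
```

Under this model the count can only go down relative to one base per
decl or type, since names that could never be copy-related no longer
share a base.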

So apart from possibly simplifying the patch by not dealing with
default-def coalesces of PARM_DECLs and ignoring them for conflict
purposes for others (as tree-ssa-live.c does), the patch looks good
to me.

Richard.

> +  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
> +    {
> +      int pidx = var_to_partition (map, ssa_name (i));
> +      int base = partition_find (tentative, pidx);
> +      gcc_assert (index_map[base] < count);
> +      map->partition_to_base_index[pidx] = index_map[base];
> +    }
> +
> +  if (dump_file && (dump_flags & TDF_DETAILS))
> +    dump_part_var_map (dump_file, tentative, map);
> +
> +  partition_delete (tentative);
> +}
> +
> +
>  /* Reduce the number of copies by coalescing variables in the function.  Return
>     a partition map with the resulting coalesces.  */
>
> @@ -1332,7 +1647,12 @@ coalesce_ssa_name (void)
>      dump_var_map (dump_file, map);
>
>    /* Don't calculate live ranges for variables not in the coalesce list.  */
> -  partition_view_bitmap (map, used_in_copies, true);
> +  partition_view_bitmap (map, used_in_copies, !optimize);
> +
> +  /* If we are optimizing, compute the base indices ourselves.  */
> +  if (optimize)
> +    compute_optimized_partition_bases (map, used_in_copies, cl);
> +
>    BITMAP_FREE (used_in_copies);
>
>    if (num_var_partitions (map) < 1)
> @@ -1341,6 +1661,8 @@ coalesce_ssa_name (void)
>        return map;
>      }
>
> +  bitmap param_defaults = BITMAP_ALLOC (NULL);
> +
>    if (dump_file && (dump_flags & TDF_DETAILS))
>      dump_var_map (dump_file, map);
>
> @@ -1350,7 +1672,7 @@ coalesce_ssa_name (void)
>      dump_live_info (dump_file, liveinfo, LIVEDUMP_ENTRY);
>
>    /* Build a conflict graph.  */
> -  graph = build_ssa_conflict_graph (liveinfo);
> +  graph = build_ssa_conflict_graph (liveinfo, param_defaults);
>    delete_tree_live_info (liveinfo);
>    if (dump_file && (dump_flags & TDF_DETAILS))
>      ssa_conflicts_dump (dump_file, graph);
> @@ -1370,10 +1692,10 @@ coalesce_ssa_name (void)
>      dump_var_map (dump_file, map);
>
>    /* Now coalesce everything in the list.  */
> -  coalesce_partitions (map, graph, cl,
> -                      ((dump_flags & TDF_DETAILS) ? dump_file
> -                                                  : NULL));
> +  coalesce_partitions (map, graph, cl, param_defaults,
> +                      ((dump_flags & TDF_DETAILS) ? dump_file : NULL));
>
> +  BITMAP_FREE (param_defaults);
>    delete_coalesce_list (cl);
>    ssa_conflicts_delete (graph);
>
> diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c
> deleted file mode 100644
> index f3cb56e..0000000
> --- a/gcc/tree-ssa-copyrename.c
> +++ /dev/null
> @@ -1,499 +0,0 @@
> -/* Rename SSA copies.
> -   Copyright (C) 2004-2015 Free Software Foundation, Inc.
> -   Contributed by Andrew MacLeod <amacleod@redhat.com>
> -
> -This file is part of GCC.
> -
> -GCC is free software; you can redistribute it and/or modify
> -it under the terms of the GNU General Public License as published by
> -the Free Software Foundation; either version 3, or (at your option)
> -any later version.
> -
> -GCC is distributed in the hope that it will be useful,
> -but WITHOUT ANY WARRANTY; without even the implied warranty of
> -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> -GNU General Public License for more details.
> -
> -You should have received a copy of the GNU General Public License
> -along with GCC; see the file COPYING3.  If not see
> -<http://www.gnu.org/licenses/>.  */
> -
> -#include "config.h"
> -#include "system.h"
> -#include "coretypes.h"
> -#include "tm.h"
> -#include "hash-set.h"
> -#include "machmode.h"
> -#include "vec.h"
> -#include "double-int.h"
> -#include "input.h"
> -#include "alias.h"
> -#include "symtab.h"
> -#include "wide-int.h"
> -#include "inchash.h"
> -#include "tree.h"
> -#include "fold-const.h"
> -#include "predict.h"
> -#include "hard-reg-set.h"
> -#include "function.h"
> -#include "dominance.h"
> -#include "cfg.h"
> -#include "basic-block.h"
> -#include "tree-ssa-alias.h"
> -#include "internal-fn.h"
> -#include "gimple-expr.h"
> -#include "is-a.h"
> -#include "gimple.h"
> -#include "gimple-iterator.h"
> -#include "flags.h"
> -#include "tree-pretty-print.h"
> -#include "bitmap.h"
> -#include "gimple-ssa.h"
> -#include "stringpool.h"
> -#include "tree-ssanames.h"
> -#include "hashtab.h"
> -#include "rtl.h"
> -#include "statistics.h"
> -#include "real.h"
> -#include "fixed-value.h"
> -#include "insn-config.h"
> -#include "expmed.h"
> -#include "dojump.h"
> -#include "explow.h"
> -#include "calls.h"
> -#include "emit-rtl.h"
> -#include "varasm.h"
> -#include "stmt.h"
> -#include "expr.h"
> -#include "tree-dfa.h"
> -#include "tree-inline.h"
> -#include "tree-ssa-live.h"
> -#include "tree-pass.h"
> -#include "langhooks.h"
> -
> -static struct
> -{
> -  /* Number of copies coalesced.  */
> -  int coalesced;
> -} stats;
> -
> -/* The following routines implement the SSA copy renaming phase.
> -
> -   This optimization looks for copies between 2 SSA_NAMES, either through a
> -   direct copy, or an implicit one via a PHI node result and its arguments.
> -
> -   Each copy is examined to determine if it is possible to rename the base
> -   variable of one of the operands to the same variable as the other operand.
> -   i.e.
> -   T.3_5 = <blah>
> -   a_1 = T.3_5
> -
> -   If this copy couldn't be copy propagated, it could possibly remain in the
> -   program throughout the optimization phases.   After SSA->normal, it would
> -   become:
> -
> -   T.3 = <blah>
> -   a = T.3
> -
> -   Since T.3_5 is distinct from all other SSA versions of T.3, there is no
> -   fundamental reason why the base variable needs to be T.3, subject to
> -   certain restrictions.  This optimization attempts to determine if we can
> -   change the base variable on copies like this, and result in code such as:
> -
> -   a_5 = <blah>
> -   a_1 = a_5
> -
> -   This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is
> -   possible, the copy goes away completely. If it isn't possible, a new temp
> -   will be created for a_5, and you will end up with the exact same code:
> -
> -   a.8 = <blah>
> -   a = a.8
> -
> -   The other benefit of performing this optimization relates to what variables
> -   are chosen in copies.  Gimplification of the program uses temporaries for
> -   a lot of things. expressions like
> -
> -   a_1 = <blah>
> -   <blah2> = a_1
> -
> -   get turned into
> -
> -   T.3_5 = <blah>
> -   a_1 = T.3_5
> -   <blah2> = a_1
> -
> -   Copy propagation is done in a forward direction, and if we can propagate
> -   through the copy, we end up with:
> -
> -   T.3_5 = <blah>
> -   <blah2> = T.3_5
> -
> -   The copy is gone, but so is all reference to the user variable 'a'. By
> -   performing this optimization, we would see the sequence:
> -
> -   a_5 = <blah>
> -   a_1 = a_5
> -   <blah2> = a_1
> -
> -   which copy propagation would then turn into:
> -
> -   a_5 = <blah>
> -   <blah2> = a_5
> -
> -   and so we still retain the user variable whenever possible.  */
> -
> -
> -/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
> -   Choose a representative for the partition, and send debug info to DEBUG.  */
> -
> -static void
> -copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
> -{
> -  int p1, p2, p3;
> -  tree root1, root2;
> -  tree rep1, rep2;
> -  bool ign1, ign2, abnorm;
> -
> -  gcc_assert (TREE_CODE (var1) == SSA_NAME);
> -  gcc_assert (TREE_CODE (var2) == SSA_NAME);
> -
> -  register_ssa_partition (map, var1);
> -  register_ssa_partition (map, var2);
> -
> -  p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
> -  p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
> -
> -  if (debug)
> -    {
> -      fprintf (debug, "Try : ");
> -      print_generic_expr (debug, var1, TDF_SLIM);
> -      fprintf (debug, "(P%d) & ", p1);
> -      print_generic_expr (debug, var2, TDF_SLIM);
> -      fprintf (debug, "(P%d)", p2);
> -    }
> -
> -  gcc_assert (p1 != NO_PARTITION);
> -  gcc_assert (p2 != NO_PARTITION);
> -
> -  if (p1 == p2)
> -    {
> -      if (debug)
> -       fprintf (debug, " : Already coalesced.\n");
> -      return;
> -    }
> -
> -  rep1 = partition_to_var (map, p1);
> -  rep2 = partition_to_var (map, p2);
> -  root1 = SSA_NAME_VAR (rep1);
> -  root2 = SSA_NAME_VAR (rep2);
> -  if (!root1 && !root2)
> -    return;
> -
> -  /* Don't coalesce if one of the variables occurs in an abnormal PHI.  */
> -  abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1)
> -           || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2));
> -  if (abnorm)
> -    {
> -      if (debug)
> -       fprintf (debug, " : Abnormal PHI barrier.  No coalesce.\n");
> -      return;
> -    }
> -
> -  /* Partitions already have the same root, simply merge them.  */
> -  if (root1 == root2)
> -    {
> -      p1 = partition_union (map->var_partition, p1, p2);
> -      if (debug)
> -       fprintf (debug, " : Same root, coalesced --> P%d.\n", p1);
> -      return;
> -    }
> -
> -  /* Never attempt to coalesce 2 different parameters.  */
> -  if ((root1 && TREE_CODE (root1) == PARM_DECL)
> -      && (root2 && TREE_CODE (root2) == PARM_DECL))
> -    {
> -      if (debug)
> -        fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n");
> -      return;
> -    }
> -
> -  if ((root1 && TREE_CODE (root1) == RESULT_DECL)
> -      != (root2 && TREE_CODE (root2) == RESULT_DECL))
> -    {
> -      if (debug)
> -        fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n");
> -      return;
> -    }
> -
> -  ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1));
> -  ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2));
> -
> -  /* Refrain from coalescing user variables, if requested.  */
> -  if (!ign1 && !ign2)
> -    {
> -      if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2))
> -       ign2 = true;
> -      else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1))
> -       ign1 = true;
> -      else if (flag_ssa_coalesce_vars != 2)
> -       {
> -         if (debug)
> -           fprintf (debug, " : 2 different USER vars. No coalesce.\n");
> -         return;
> -       }
> -      else
> -       ign2 = true;
> -    }
> -
> -  /* If both values have default defs, we can't coalesce.  If only one has a
> -     tag, make sure that variable is the new root partition.  */
> -  if (root1 && ssa_default_def (cfun, root1))
> -    {
> -      if (root2 && ssa_default_def (cfun, root2))
> -       {
> -         if (debug)
> -           fprintf (debug, " : 2 default defs. No coalesce.\n");
> -         return;
> -       }
> -      else
> -        {
> -         ign2 = true;
> -         ign1 = false;
> -       }
> -    }
> -  else if (root2 && ssa_default_def (cfun, root2))
> -    {
> -      ign1 = true;
> -      ign2 = false;
> -    }
> -
> -  /* Do not coalesce if we cannot assign a symbol to the partition.  */
> -  if (!(!ign2 && root2)
> -      && !(!ign1 && root1))
> -    {
> -      if (debug)
> -       fprintf (debug, " : Choosen variable has no root.  No coalesce.\n");
> -      return;
> -    }
> -
> -  /* Don't coalesce if the new chosen root variable would be read-only.
> -     If both ign1 && ign2, then the root var of the larger partition
> -     wins, so reject in that case if any of the root vars is TREE_READONLY.
> -     Otherwise reject only if the root var, on which replace_ssa_name_symbol
> -     will be called below, is readonly.  */
> -  if (((root1 && TREE_READONLY (root1)) && ign2)
> -      || ((root2 && TREE_READONLY (root2)) && ign1))
> -    {
> -      if (debug)
> -       fprintf (debug, " : Readonly variable.  No coalesce.\n");
> -      return;
> -    }
> -
> -  /* Don't coalesce if the two variables aren't type compatible .  */
> -  if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2))
> -      /* There is a disconnect between the middle-end type-system and
> -         VRP, avoid coalescing enum types with different bounds.  */
> -      || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE
> -          || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE)
> -         && TREE_TYPE (var1) != TREE_TYPE (var2)))
> -    {
> -      if (debug)
> -       fprintf (debug, " : Incompatible types.  No coalesce.\n");
> -      return;
> -    }
> -
> -  /* Merge the two partitions.  */
> -  p3 = partition_union (map->var_partition, p1, p2);
> -
> -  /* Set the root variable of the partition to the better choice, if there is
> -     one.  */
> -  if (!ign2 && root2)
> -    replace_ssa_name_symbol (partition_to_var (map, p3), root2);
> -  else if (!ign1 && root1)
> -    replace_ssa_name_symbol (partition_to_var (map, p3), root1);
> -  else
> -    gcc_unreachable ();
> -
> -  if (debug)
> -    {
> -      fprintf (debug, " --> P%d ", p3);
> -      print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)),
> -                         TDF_SLIM);
> -      fprintf (debug, "\n");
> -    }
> -}
> -
> -
> -namespace {
> -
> -const pass_data pass_data_rename_ssa_copies =
> -{
> -  GIMPLE_PASS, /* type */
> -  "copyrename", /* name */
> -  OPTGROUP_NONE, /* optinfo_flags */
> -  TV_TREE_COPY_RENAME, /* tv_id */
> -  ( PROP_cfg | PROP_ssa ), /* properties_required */
> -  0, /* properties_provided */
> -  0, /* properties_destroyed */
> -  0, /* todo_flags_start */
> -  0, /* todo_flags_finish */
> -};
> -
> -class pass_rename_ssa_copies : public gimple_opt_pass
> -{
> -public:
> -  pass_rename_ssa_copies (gcc::context *ctxt)
> -    : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt)
> -  {}
> -
> -  /* opt_pass methods: */
> -  opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); }
> -  virtual bool gate (function *) { return flag_tree_copyrename != 0; }
> -  virtual unsigned int execute (function *);
> -
> -}; // class pass_rename_ssa_copies
> -
> -/* This function will make a pass through the IL, and attempt to coalesce any
> -   SSA versions which occur in PHI's or copies.  Coalescing is accomplished by
> -   changing the underlying root variable of all coalesced version.  This will
> -   then cause the SSA->normal pass to attempt to coalesce them all to the same
> -   variable.  */
> -
> -unsigned int
> -pass_rename_ssa_copies::execute (function *fun)
> -{
> -  var_map map;
> -  basic_block bb;
> -  tree var, part_var;
> -  gimple stmt;
> -  unsigned x;
> -  FILE *debug;
> -
> -  memset (&stats, 0, sizeof (stats));
> -
> -  if (dump_file && (dump_flags & TDF_DETAILS))
> -    debug = dump_file;
> -  else
> -    debug = NULL;
> -
> -  map = init_var_map (num_ssa_names);
> -
> -  FOR_EACH_BB_FN (bb, fun)
> -    {
> -      /* Scan for real copies.  */
> -      for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
> -          gsi_next (&gsi))
> -       {
> -         stmt = gsi_stmt (gsi);
> -         if (gimple_assign_ssa_name_copy_p (stmt))
> -           {
> -             tree lhs = gimple_assign_lhs (stmt);
> -             tree rhs = gimple_assign_rhs1 (stmt);
> -
> -             copy_rename_partition_coalesce (map, lhs, rhs, debug);
> -           }
> -       }
> -    }
> -
> -  FOR_EACH_BB_FN (bb, fun)
> -    {
> -      /* Treat PHI nodes as copies between the result and each argument.  */
> -      for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
> -          gsi_next (&gsi))
> -        {
> -          size_t i;
> -         tree res;
> -         gphi *phi = gsi.phi ();
> -         res = gimple_phi_result (phi);
> -
> -         /* Do not process virtual SSA_NAMES.  */
> -         if (virtual_operand_p (res))
> -           continue;
> -
> -         /* Make sure to only use the same partition for an argument
> -            as the result but never the other way around.  */
> -         if (SSA_NAME_VAR (res)
> -             && !DECL_IGNORED_P (SSA_NAME_VAR (res)))
> -           for (i = 0; i < gimple_phi_num_args (phi); i++)
> -             {
> -               tree arg = PHI_ARG_DEF (phi, i);
> -               if (TREE_CODE (arg) == SSA_NAME)
> -                 copy_rename_partition_coalesce (map, res, arg,
> -                                                 debug);
> -             }
> -         /* Else if all arguments are in the same partition try to merge
> -            it with the result.  */
> -         else
> -           {
> -             int all_p_same = -1;
> -             int p = -1;
> -             for (i = 0; i < gimple_phi_num_args (phi); i++)
> -               {
> -                 tree arg = PHI_ARG_DEF (phi, i);
> -                 if (TREE_CODE (arg) != SSA_NAME)
> -                   {
> -                     all_p_same = 0;
> -                     break;
> -                   }
> -                 else if (all_p_same == -1)
> -                   {
> -                     p = partition_find (map->var_partition,
> -                                         SSA_NAME_VERSION (arg));
> -                     all_p_same = 1;
> -                   }
> -                 else if (all_p_same == 1
> -                          && p != partition_find (map->var_partition,
> -                                                  SSA_NAME_VERSION (arg)))
> -                   {
> -                     all_p_same = 0;
> -                     break;
> -                   }
> -               }
> -             if (all_p_same == 1)
> -               copy_rename_partition_coalesce (map, res,
> -                                               PHI_ARG_DEF (phi, 0),
> -                                               debug);
> -           }
> -        }
> -    }
> -
> -  if (debug)
> -    dump_var_map (debug, map);
> -
> -  /* Now one more pass to make all elements of a partition share the same
> -     root variable.  */
> -
> -  for (x = 1; x < num_ssa_names; x++)
> -    {
> -      part_var = partition_to_var (map, x);
> -      if (!part_var)
> -        continue;
> -      var = ssa_name (x);
> -      if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var))
> -       continue;
> -      if (debug)
> -        {
> -         fprintf (debug, "Coalesced ");
> -         print_generic_expr (debug, var, TDF_SLIM);
> -         fprintf (debug, " to ");
> -         print_generic_expr (debug, part_var, TDF_SLIM);
> -         fprintf (debug, "\n");
> -       }
> -      stats.coalesced++;
> -      replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var));
> -    }
> -
> -  statistics_counter_event (fun, "copies coalesced",
> -                           stats.coalesced);
> -  delete_var_map (map);
> -  return 0;
> -}
> -
> -} // anon namespace
> -
> -gimple_opt_pass *
> -make_pass_rename_ssa_copies (gcc::context *ctxt)
> -{
> -  return new pass_rename_ssa_copies (ctxt);
> -}
> diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
> index e0c42669..46b1869 100644
> --- a/gcc/tree-ssa-live.c
> +++ b/gcc/tree-ssa-live.c
> @@ -119,7 +119,10 @@ tree_int_map_hasher::equal (const value_type *v, const compare_type *c)
>  }
>
>
> -/* This routine will initialize the basevar fields of MAP.  */
> +/* This routine will initialize the basevar fields of MAP with base
> +   names, when we are not optimizing.  When optimizing, we'll use
> +   partition numbers as base index numbers, see coalesce_ssa_name in
> +   tree-ssa-coalesce.c.  */
>
>  static void
>  var_map_base_init (var_map map)
>
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

* Re: [PR64164] drop copyrename, integrate into expand
  2015-03-31  5:11   ` Jeff Law
@ 2015-04-03 13:17     ` Alexandre Oliva
  2015-04-06 16:08       ` Jeff Law
  0 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-04-03 13:17 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Mar 31, 2015, Jeff Law <law@redhat.com> wrote:

>> -  if (var1 != var2)
>> +  if (var1 != var2 && !optimize)
>> return false;
> So when the base variables are different and we are optimizing,
> this allows coalescing, right?

Yeah.

> What I don't see is a corresponding change to var_map_base_init to
> ensure we build a conflict graph which includes objects when
> SSA_NAME_VARs are not the same.  I see a vague reference in
> var_map_base_init's header comment that refers us to
> coalesce_ssa_name.

> It appears that compute_optimized_partition_bases handles this by
> creating partitions of things that are related by copies/phis
> regardless of their underlying named object, type, etc.  Right?

Correct.  I guess it makes sense to move partition base computation to a
single location.  Since compute_optimized_partition_bases relies on data
structures local to this source file, I'm moving the non-optimized
version to tree-ssa-coalesce.c, and dropping support for basevar
initialization from tree-ssa-live.c.
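
For contrast with the optimized scheme, here is a minimal sketch of
name-keyed base assignment, roughly what the non-optimized path amounts
to.  It is not the code in var_map_base_init: the string keys stand in
for SSA_NAME_VAR, base_indices_by_name is a hypothetical helper, and
the base-type case is ignored.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Assign a dense base index to each partition, keyed on the name of its
// underlying variable (an assumed stand-in for SSA_NAME_VAR).
std::vector<int> base_indices_by_name(
    const std::vector<std::string>& base_names, int* num_bases) {
  std::unordered_map<std::string, int> seen;  // base name -> base index
  std::vector<int> base;
  base.reserve(base_names.size());
  for (const auto& name : base_names) {
    auto it = seen.find(name);
    if (it == seen.end())
      it = seen.emplace(name, static_cast<int>(seen.size())).first;
    assert(it->second < static_cast<int>(seen.size()));
    base.push_back(it->second);
  }
  *num_bases = static_cast<int>(seen.size());
  return base;
}
```

With keys {"a", "T.3", "a"}, the two "a" partitions share a base and
"T.3" gets its own, so a copy between T.3_5 and a_1 is never a
coalesce candidate under this scheme.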


> Hard to argue with removing a pass that gets called 5 times!

:-)


>> @@ -890,6 +900,36 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
>> live_track_process_def (live, result, graph);
>> }
>> 
>> +      /* Pretend there are defs for params' default defs at the start
>> +	 of the (post-)entry block.  We run after abnormal coalescing,
>> +	 so we can't assume the leader variable is the default
>> +	 definition, but because of SSA_NAME_VAR adjustments in
>> +	 attempt_coalesce, we can assume that if there is any
>> +	 PARM_DECL in the partition, it will be the leader's
>> +	 SSA_NAME_VAR.  */

This comment is outdated.  Since we no longer have abnormal coalescing
before building the conflict graph, we can just test whether each
SSA_NAME is a default def for a PARM_DECL and be done with it.

> So the issue here is you want to iterate over the objects live at the
> entry block, which would include any SSA_NAMEs which result from
> PARM_DECLs.  I don't guess there's an easier way to do that other than
> iterating over everything live in that initial block?

We could iterate over all SSA_NAMEs, but that would probably be more
costly.  There shouldn't be very many live variables at the function
entry, so using the live bitmaps is likely to save us time, especially
on functions with lots of SSA_NAMEs.
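
To illustrate the cost argument, a rough sketch of walking only the set
bits of a live-on-entry bitmap.  The word-vector bitmap, NameInfo, and
parm_defaults_live_on_entry are assumptions made for this example; GCC's
own bitmap type and EXECUTE_IF_SET_IN_BITMAP are not used here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-name flag, standing in for "this SSA name is the
// default definition of a PARM_DECL".
struct NameInfo {
  bool is_parm_default_def;
};

// Visit only the versions whose bit is set in LIVE_ON_ENTRY (64 versions
// per word) and collect those that are parameter default defs.
std::vector<unsigned> parm_defaults_live_on_entry(
    const std::vector<std::uint64_t>& live_on_entry,
    const std::vector<NameInfo>& names) {
  std::vector<unsigned> result;
  for (std::size_t w = 0; w < live_on_entry.size(); ++w) {
    std::uint64_t word = live_on_entry[w];
    while (word) {
      unsigned bit = 0;
      while (!((word >> bit) & 1))  // locate the lowest set bit
        ++bit;
      word &= word - 1;             // clear that bit
      unsigned version = static_cast<unsigned>(w * 64 + bit);
      assert(version < names.size());  // every live version must be known
      if (names[version].is_parm_default_def)
        result.push_back(version);
    }
  }
  return result;
}
```

The names that are not live on entry are never tested individually,
which is where the saving comes from when few names are live at the
function entry.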

> And the second EXECUTE_IF_SET_IN_BITMAP iterates over
> everything in the partitions associated with the SSA_NAMES that are
> live at the entry block, right?

Yeah, we iterate over the bases in live_base_var, because the per-base
bitmaps are only accurate when the corresponding live_base_var bit is
set.

>> @@ -1126,11 +1166,12 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
>> 
>> static inline bool
>> attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
>> -		  FILE *debug)
>> +		  bitmap param_defaults, FILE *debug)
> [ ... ]
> So the bulk of the changes into this routine are really about picking
> a good leader, which presumably is how we're able to get the desired
> effects on debuginfo that we used to get from tree-ssa-copyrename.c?

This has nothing to do with debuginfo, I'm afraid.  We just had to keep
track of parm and result decls to avoid coalescing them together, and
parm decl default defs to promote them to leaders, because expand copies
incoming REGs to pseudos in PARM_DECL's DECL_RTL.  We should fill that
in with the RTL created for the default def for the PARM_DECL.  At the
end, I believe we also copy the RESULT_DECL DECL_RTL to the actual
return register or rtl.  I didn't want to tackle the reworking of these
expanders to avoid problems out of copying incoming parms to one pseudo
and then reading it from another, as I observed before I made this
change.  I'm now tackling that, so that we can refrain from touching the
base vars altogether, and not refrain from coalescing parms and results.

> So some comments about the various cases here might help.  I can sort
> them out if I read the code, but one could argue that a block comment
> on the rules for how to select the partition leader would be better.

*nod*.  I won't bother, though, if this code ends up gone in the next
iteration of the patch ;-)

> Is the special casing of PARM_DECLs + RESULT_DECLs really a failing of
> not handling one or both properly when computing liveness information?

No, it's about RTL assignment and copying to/from hard regs.  We assign
RTL to PARM_DECLs and RESULT_DECLs explicitly in the expander, but we
can't assign different RTL to them if they are coalesced in a single
partition.

> I'm not aware of an inherent reason why a PARM_DECL couldn't coalesce
> with a related RESULT_DECL if they are otherwise non-conflicting and
> related by a copy/phi.

Indeed, there isn't any inherent reason.  It was just a restriction I
carried over from copyrename, and that I postponed cleaning up.

> Presumably ordering of unioning of the partitions doesn't matter here
> as we're looking at coalesce possibilities rather than things we have
> actually coalesced?  Thus it's OK (?) to handle the names occurring in
> abnormal PHIs after those names that are associated by a copy.

Yeah, they'll end up with the same basevar one way or another.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

* Re: [PR64164] drop copyrename, integrate into expand
  2015-03-31 14:06   ` Richard Biener
@ 2015-04-03 13:30     ` Alexandre Oliva
  2015-04-06 15:57       ` Jeff Law
  0 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-04-03 13:30 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches

On Mar 31, 2015, Richard Biener <richard.guenther@gmail.com> wrote:

>> +                     || !(SSA_NAME_IS_DEFAULT_DEF (var)
>> +                          || (param_defaults
>> +                              && bitmap_bit_p (param_defaults, part))))

> This looks somewhat awkward to me ;)  Is it really important to allow
> coalescing PARM_DECL-based SSA vars with sth else?

It's a valid optimization.  I can't say it's really important, but if
the only objection is to param_defaults, I'm getting rid of it.

> At least abnormal coalescing doesn't need to do that, so just walking
> over the function decls parameters and making their default-def live
> should be enough?

It should.  We'd have to duplicate logic of parameters, including static
chain and whatnot.  I figured this would make it more resilient to
changes elsewhere.

>> +      else if (TREE_CODE (SSA_NAME_VAR (var2)) == VAR_DECL)
>> +       leader = SSA_NAME_VAR (var2);
>> +      else /* What else could it be?  */
>> +       gcc_unreachable ();
>> +

> definitely comments missing in this spaghetti...

I'm trying to remove the spaghetti now.

> or seeing this, why coalesce default-defs at all?

Why not?  (the referenced code is gone from my local tree, BTW)

> Either they are param values or they have indeterminate values (and
> thus we can and do pick whatever is available at expansion time)?

If they are param values, we want to have them available; if they
aren't, whatever we coalesce with is good.


> So the above does full coalescing ignoring conflicts.

Yeah.  We want to tell what we'd get if all coalesce possibilities are
taken, so that we can assign the same basevar to all partitions that
might end up coalesced together, and thus detect conflicts among them.
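
A minimal sketch of that idea, under stated assumptions: Partition is a
toy union-find standing in for GCC's partition type, and
tentative_base_indices is a hypothetical name.  Unlike the patch, which
only walks the names in used_in_copies, this numbers every element.

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Toy union-find; an assumed stand-in for GCC's partition type.
struct Partition {
  std::vector<int> parent;
  explicit Partition(int n) : parent(n) {
    std::iota(parent.begin(), parent.end(), 0);
  }
  int find(int x) {
    while (parent[x] != x) {
      parent[x] = parent[parent[x]];  // path halving
      x = parent[x];
    }
    return x;
  }
  void unite(int a, int b) { parent[find(a)] = find(b); }
};

// Union every copy-related pair, ignoring conflicts, then number the
// resulting classes densely.  Returns one base index per element and
// stores the number of distinct bases in *NUM_BASES.
std::vector<int> tentative_base_indices(
    int n, const std::vector<std::pair<int, int>>& copies, int* num_bases) {
  Partition tentative(n);
  for (const auto& c : copies)
    tentative.unite(c.first, c.second);

  const int no_part = -1;
  std::vector<int> index_map(n, no_part);  // class root -> base index
  std::vector<int> base(n);
  int count = 0;
  for (int i = 0; i < n; ++i) {
    int root = tentative.find(i);
    if (index_map[root] == no_part)
      index_map[root] = count++;
    assert(index_map[root] < count);
    base[i] = index_map[root];
  }
  *num_bases = count;
  return base;
}
```

Elements that share a base index are the ones that might end up in the
same partition, so those are the only pairs for which conflicts need to
be recorded.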

> Did you do any statistics on how the number of basevars changes with your patch
> compared to trunk?

'fraid I didn't run any statistics whatsoever.  I didn't think it was
important, since it's pretty much just doing copyrename during coalesce.
Truth be told, this has since relaxed some of the constraints from
copyrename, and I'm going to relax some more in the next iteration, so I
guess some statistics wouldn't be a bad idea.  Is there any specific
testcase you're interested in, or something like a GCC bootstrap or
somesuch?

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

* Re: [PR64164] drop copyrename, integrate into expand
  2015-04-03 13:30     ` Alexandre Oliva
@ 2015-04-06 15:57       ` Jeff Law
  0 siblings, 0 replies; 127+ messages in thread
From: Jeff Law @ 2015-04-06 15:57 UTC (permalink / raw)
  To: Alexandre Oliva, Richard Biener; +Cc: GCC Patches

On 04/03/2015 07:28 AM, Alexandre Oliva wrote:
> On Mar 31, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
>
>>> +                     || !(SSA_NAME_IS_DEFAULT_DEF (var)
>>> +                          || (param_defaults
>>> +                              && bitmap_bit_p (param_defaults, part))))
>
>> This looks somewhat awkward to me ;)  Is it really important to allow
>> coalescing PARM_DECL-based SSA vars with sth else?
>
> It's a valid optimization.  I can't say it's really important, but if
> the only objection is to param_defaults, I'm getting rid of it.
I doubt it's terribly important, but I agree it's a valid optimization.
Do you have a testcase where it triggers?  Can we include that too, so
that if someone wants to remove this later for some reason or another
we'd at least have a chance of seeing a regression?

ISTM it can only trigger when the PARM is tied to another object via a 
copy and the PARM and other object have non-overlapping lifetimes.  I'd 
expect that this may happen at PHIs where the PARM appears on the RHS 
and dies at that point -- the PARM and the PHI result are likely not 
going to conflict and thus may coalesce.
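
A candidate shape for such a testcase might look like the following.
This is purely a hypothetical sketch: other_source and select_value are
made-up names, and it has not been checked that any GCC version
actually coalesces x with r here.

```cpp
// Hypothetical helper; in a real testcase it would be kept out of line
// so the call is not folded away.
int other_source(int v)
{
  return v * 2;
}

int select_value(int x, int c)
{
  int r;
  if (c)
    r = other_source(c);  // x is not used on this path
  else
    r = x;                // last use of x: it feeds the PHI for r
  return r + 1;
}
```

Here x appears only as a PHI argument for r and is dead afterwards, so
the parameter's default def and the PHI result would not be expected to
conflict.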


>
>> At least abnormal coalescing doesn't need to do that, so just walking
>> over the function decls parameters and making their default-def live
>> should be enough?
>
> It should.  We'd have to duplicate logic of parameters, including static
> chain and whatnot.  I figured this would make it more resilient to
> changes elsewhere.
This ties in a bit to my comment about whether or not we've got proper
life information for PARMs.  I'd generally prefer to see us get the life
information correct.


>
>>> +      else if (TREE_CODE (SSA_NAME_VAR (var2)) == VAR_DECL)
>>> +       leader = SSA_NAME_VAR (var2);
>>> +      else /* What else could it be?  */
>>> +       gcc_unreachable ();
>>> +
>
>> definitely comments missing in this spaghetti...
>
> I'm trying to remove the spaghetti now.
Good :-)

>
>> or seeing this, why coalesce default-defs at all?
>
> Why not?  (the referenced code is gone from my local tree, BTW)
>
>> Either they are param values or they have indeterminate values (and
>> thus we can and do pick whatever is available at expansion time)?
>
> If they are param values, we want to have them available; if they
> aren't, whatever we coalesce with is good.
Agreed.  Didn't we recently change the coalescing code to allow 
coalescing non-PARM default defs more aggressively:

Author: glisse <glisse@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Mon Nov 3 10:47:04 2014 +0000

     2014-11-03  Marc Glisse  <marc.glisse@inria.fr>

         PR tree-optimization/60770
     gcc/
         * tree-into-ssa.c (rewrite_update_stmt): Return whether the
         statement should be removed.
         (maybe_register_def): Likewise. Replace clobbers with default
         definitions.
         (rewrite_dom_walker::before_dom_children): Remove statement if
         rewrite_update_stmt says so.
         * tree-ssa-live.c: Include tree-ssa.h.
         (set_var_live_on_entry): Do not mark undefined variables as live.
         (verify_live_on_entry): Do not check undefined variables.
         * tree-ssa.h (ssa_undefined_value_p): New parameter for the case
         of partially undefined variables.
         * tree-ssa.c (ssa_undefined_value_p): Likewise.
         (execute_update_addresses_taken): Do not drop clobbers.

     gcc/testsuite/
         * gcc.dg/tree-ssa/pr60770-1.c: New file.

> guess some statistics wouldn't be a bad idea.  Is there any specific
> testcase you're interested in, or something like a GCC bootstrap or
> somesuch?
Not from me.  bootstrap or .i files from gcc bootstrap would seem to be 
sufficient to me.

jeff

* Re: [PR64164] drop copyrename, integrate into expand
  2015-04-03 13:17     ` Alexandre Oliva
@ 2015-04-06 16:08       ` Jeff Law
  2015-04-24  1:56         ` Alexandre Oliva
  0 siblings, 1 reply; 127+ messages in thread
From: Jeff Law @ 2015-04-06 16:08 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: gcc-patches

On 04/03/2015 07:17 AM, Alexandre Oliva wrote:
>>> @@ -890,6 +900,36 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
>>> live_track_process_def (live, result, graph);
>>> }
>>>
>>> +      /* Pretend there are defs for params' default defs at the start
>>> +	 of the (post-)entry block.  We run after abnormal coalescing,
>>> +	 so we can't assume the leader variable is the default
>>> +	 definition, but because of SSA_NAME_VAR adjustments in
>>> +	 attempt_coalesce, we can assume that if there is any
>>> +	 PARM_DECL in the partition, it will be the leader's
>>> +	 SSA_NAME_VAR.  */
>
> This comment is outdated.  Since we no longer have abnormal coalescing
> before building the conflict graph, we can just test whether each
> SSA_NAME is a default def for a PARM_DECL and be done with it.
OK.  Please update the comment :-0

>
>> So the issue here is you want to iterate over the objects live at the
>> entry block, which would include any SSA_NAMEs which result from
>> PARM_DECLs.  I don't guess there's an easier way to do that other than
>> iterating over everything live in that initial block?
>
> We could iterate over all SSA_NAMEs, but that would probably be more
> costly.  There shouldn't be very many live variables at the function
> entry, so using the live bitmaps is likely to save us time, especially
> on functions with lots of SSA_NAMEs.
Agreed.  Iterating over the SSA_NAMEs was what came to mind when I 
pondered this a bit more and I'd rejected it for the same reason.

Can we get to the SSA_NAMEs associated with the PARM_DECLs from the 
function decl?  I can't think of a way off the top of my head, but if we 
could, then that'd avoid the iteration over the bitmap of live variables.

But then again, the bitmap of live variables ought to be small, 
particularly if we're not marking non-PARM default defs as live anymore 
(see patch reference in my prior message).
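
To illustrate why walking the live-on-entry bitmap is cheap when few
names are live, here is a minimal standalone sketch.  It is not GCC
code: the 64-bit word standing in for the bitmap, the
is_parm_default_def array and the function name are all assumptions
made up for the example.  It only shows that the work is proportional
to the number of set bits rather than to the number of names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: bit I of LIVE_ON_ENTRY is set when name I is
   live at function entry, and IS_PARM_DEFAULT_DEF[I] says whether
   name I is the default definition of a parameter.  Neither is a GCC
   data structure.  */

/* Count the parameter default defs among the live-on-entry names,
   visiting only the set bits.  *VISITED receives the number of names
   actually examined.  */
static size_t
count_live_parm_default_defs (uint64_t live_on_entry,
                              const bool *is_parm_default_def,
                              size_t num_names, size_t *visited)
{
  size_t found = 0;

  assert (num_names <= 64);
  *visited = 0;

  while (live_on_entry != 0)
    {
      /* Isolate the lowest set bit, then find its index.  */
      uint64_t lowest = live_on_entry & (~live_on_entry + 1);
      size_t idx = 0;
      while ((lowest >> idx) != 1)
        idx++;

      assert (idx < num_names);
      (*visited)++;
      if (is_parm_default_def[idx])
        found++;

      /* Clear the bit we just handled.  */
      live_on_entry &= ~lowest;
    }

  return found;
}
```

With three names live out of eight, the loop body runs three times,
however many names the function has in total.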

>> So the bulk of the changes into this routine are really about picking
>> a good leader, which presumably is how we're able to get the desired
>> effects on debuginfo that we used to get from tree-ssa-copyrename.c?
>
> This has nothing to do with debuginfo, I'm afraid.  We just had to keep
> track of parm and result decls to avoid coalescing them together, and
> parm decl default defs to promote them to leaders, because expand copies
> incoming REGs to pseudos in PARM_DECL's DECL_RTL.  We should fill that
> in with the RTL created for the default def for the PARM_DECL.  At the
> end, I believe we also copy the RESULT_DECL DECL_RTL to the actual
> return register or rtl.  I didn't want to tackle the reworking of these
> expanders to avoid problems out of copying incoming parms to one pseudo
> and then reading it from another, as I observed before I made this
> change.  I'm now tackling that, so that we can refrain from touching the
> base vars altogether, and not refrain from coalescing parms and results.
Hmmm, so the real issue here is the expansion setup of parms and 
results.  I hadn't pondered that aspect.  I'd encourage fixing the 
expansion code too if you can see a path for that.

Basically I just don't like special casing things like this -- 
coalescing should be driven by life information/conflict graph and a 
copy relationship between the two candidate objects.

Overall it looks like you're on the right path and we'll just need to 
iterate a bit more.

jeff

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR64164] drop copyrename, integrate into expand
  2015-04-06 16:08       ` Jeff Law
@ 2015-04-24  1:56         ` Alexandre Oliva
  2015-04-27 11:39           ` Richard Biener
  2015-04-29  3:51           ` Jeff Law
  0 siblings, 2 replies; 127+ messages in thread
From: Alexandre Oliva @ 2015-04-24  1:56 UTC (permalink / raw)
  To: Jeff Law, Richard Biener; +Cc: gcc-patches

On Apr  6, 2015, Jeff Law <law@redhat.com> wrote:

>>> So the bulk of the changes into this routine are really about picking
>>> a good leader, which presumably is how we're able to get the desired
>>> effects on debuginfo that we used to get from tree-ssa-copyrename.c?
>> 
>> This has nothing to do with debuginfo, I'm afraid.  We just had to keep
>> track of parm and result decls to avoid coalescing them together, and
>> parm decl default defs to promote them to leaders, because expand copies
>> incoming REGs to pseudos in PARM_DECL's DECL_RTL.  We should fill that
>> in with the RTL created for the default def for the PARM_DECL.  At the
>> end, I believe we also copy the RESULT_DECL DECL_RTL to the actual
>> return register or rtl.  I didn't want to tackle the reworking of these
>> expanders to avoid problems out of copying incoming parms to one pseudo
>> and then reading it from another, as I observed before I made this
>> change.  I'm now tackling that, so that we can refrain from touching the
>> base vars altogether, and not refrain from coalescing parms and results.

> Hmmm, so the real issue here is the expansion setup of parms and
> results.  I hadn't pondered that aspect.  I'd encourage fixing the
> expansion code too if you can see a path for that.

That was the trickiest bit of the patch: getting assign_parms to use the
out-of-SSA-chosen RTL for the (default def of the) param instead of
creating a pseudo or stack slot of its own, especially when we create a
.result_ptr decl and there is an incoming by-ref result_decl, in which
case we ought to use the same SSA-assigned pseudo for both.  Another
case worth mentioning is that in which a param is unused: there is no
default def for it, but in the non-optimized case, we have to assign it
to the same location.  I've used the DECL_RTL itself to carry the
information in this case, at least in the non-optimized case, in which
we know all SSA_NAMEs associated with each param *will* be assigned to
the same partition, and use the same RTL.  If we do optimize, the param
may get multiple locations, and DECL_RTL will be cleared.  That's fine:
the incoming value of the param will end up being copied to a separate
pseudo, so that there's no risk of messing with any other default def
(there's a new testcase for this), and the copy is likely to be
optimized out.

The other tricky bit was to fix all expander bits that required
SSA_NAMEs to have an associated decl.  I've removed all such cases, so we
can now expand anonymous SSA decls directly, without having to create an
ignored decl.  Doing that, we can coalesce variables and expand each
partition without worrying about choosing a preferred partition leader.
We just have to make sure we don't consider pairs of variables eligible
for coalescing if they should get different promoted modes, or a
different register-or-stack choice, and then expansion of partitions is
streamlined: we just expand each leader, and then adjust all SSA_NAMEs
to associate the RTL with their base variables, if any.
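
The eligibility condition described above can be pictured with a
small standalone sketch.  This is not the patch's
gimple_can_coalesce_p: the struct, its fields and the function name
are hypothetical, and only the two criteria named in the text
(promoted mode and register-or-stack choice) are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the choices expansion would make for one
   SSA name.  The field names are illustrative, not GCC's.  */
struct expand_choice
{
  int promoted_mode;   /* Stand-in for the promoted machine mode.  */
  bool in_register;    /* True for a pseudo, false for a stack slot.  */
};

/* Return true if two names agree on everything that determines the
   kind of RTL their shared partition would get.  */
static bool
expand_choices_compatible_p (const struct expand_choice *a,
                             const struct expand_choice *b)
{
  assert (a != NULL && b != NULL);
  return a->promoted_mode == b->promoted_mode
         && a->in_register == b->in_register;
}
```

Two names that pass such a test can share one piece of RTL, so any
member of the partition can serve as the one that gets expanded.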


In this revision of the patch, I have retained -ftree-coalesce-vars, so
that its negated form can be used in testcases that formerly expected no
coalescing across user variables, but that explicitly disabled VTA.

As for testcases, while investigating test regressions, I found that
various guality failures had to do with var-tracking's lack of awareness
of custom calling conventions.  Caller's variables saved in registers
that are normally call-clobbered, but call-saved in custom conventions
set up for a callee, would end up invalidating the entry-point location
associations.  I've arranged for var-tracking to use custom calling
conventions for register invalidation at call insns, and this fixed not
only a few guality regressions due to changes in register assignment,
but a number of other long-standing guality failures.  Yay!  This could
be split out into a standalone patch.
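
The effect of per-call clobber sets on location tracking can be
sketched with plain bit masks.  This is only an illustration under
assumed names (regset, reg_bit, regs_surviving_call); the actual
change works on var-tracking's dataflow sets and the call insn's
register usage, which are not modelled here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register set: bit N stands for hard register N.  */
typedef uint32_t regset;

/* Return the set containing only hard register N.  */
static regset
reg_bit (unsigned int n)
{
  assert (n < 32);
  return (regset) 1 << n;
}

/* Return the subset of TRACKED registers whose contents survive a
   call that clobbers the registers in CLOBBERED.  */
static regset
regs_surviving_call (regset tracked, regset clobbered)
{
  return tracked & ~clobbered;
}
```

If a variable lives in register 1, and the default convention
clobbers registers 0 to 2 while the callee's custom convention
clobbers only register 0, then invalidating with the default set
drops the variable's location and invalidating with the callee's own
set keeps it.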


On Mar 31, 2015, Richard Biener <richard.guenther@gmail.com> wrote:

> Did you do any statistics on how the number of basevars changes with your patch
> compared to trunk?

In this version of the patch, we no longer touch the base vars at all.
We just associate the piece of RTL generated for the partition with a
list of decls, if needed.  (I've just realized that I never noticed a
list of decls show up anywhere, and looking into this, I saw a bug in
the leader_merge function, that causes it to fail to go from a single
entry to a list: it creates the list, but then returns the original
non-list entry; that's why I never saw them!  I won't delay posting the
patch just because of this; I'm not even sure we want decl lists in REG
or MEM attrs to begin with.)
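
The shape of the leader_merge problem described above can be
reproduced with a tiny standalone list type.  Everything below is a
hypothetical model (struct node, merge_buggy, merge_fixed), not GCC's
tree API, and merge_fixed is just one possible repair, not
necessarily what a revised patch would do.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a decl or a TREE_LIST cell.  */
struct node
{
  int is_list;          /* Nonzero for a list cell, zero for an entry.  */
  int value;            /* Payload of a plain entry.  */
  struct node *item;    /* For a list cell: the entry it holds.  */
  struct node *chain;   /* For a list cell: the next cell, or NULL.  */
};

static struct node *
make_entry (int value)
{
  struct node *n = calloc (1, sizeof *n);
  assert (n != NULL);
  n->value = value;
  return n;
}

static struct node *
make_cell (struct node *item)
{
  struct node *n = calloc (1, sizeof *n);
  assert (n != NULL);
  n->is_list = 1;
  n->item = item;
  return n;
}

/* Same control flow as the problem described: when CUR is a single
   entry, a two-cell list is built but CUR itself is returned.  */
static struct node *
merge_buggy (struct node *cur, struct node *next)
{
  if (cur == NULL || cur == next)
    return next;

  struct node *last;
  if (cur->is_list)
    for (last = cur; ; last = last->chain)
      {
        if (last->item == next)
          return cur;
        if (last->chain == NULL)
          break;
      }
  else
    last = make_cell (cur);

  last->chain = make_cell (next);
  return cur;   /* The freshly built list is lost here.  */
}

/* One possible repair: remember the head of the list and return it.  */
static struct node *
merge_fixed (struct node *cur, struct node *next)
{
  if (cur == NULL || cur == next)
    return next;

  struct node *head, *last;
  if (cur->is_list)
    {
      head = cur;
      for (last = cur; ; last = last->chain)
        {
          if (last->item == next)
            return cur;
          if (last->chain == NULL)
            break;
        }
    }
  else
    head = last = make_cell (cur);

  last->chain = make_cell (next);
  return head;
}

/* Number of entries reachable from N (a plain entry counts as one).  */
static int
count_entries (const struct node *n)
{
  int count = 0;
  if (n == NULL)
    return 0;
  if (!n->is_list)
    return 1;
  for (; n != NULL; n = n->chain)
    count++;
  return count;
}
```

Merging a second entry into a single one with merge_buggy hands back
the original single entry, which matches the observation that lists
never showed up; merge_fixed hands back a two-entry list.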

I have collected some statistics on the effects of the patch in
compiling stage3-gcc/, before and after the patch, with and without
-fno-tree-coalesce-vars.  I counted, per function:

b/a: before the patch, or after the patch

c/n: -ftree-coalesce-vars (default when optimizing) or
-fno-tree-coalesce-vars

cv: the coalescible var count, i.e., the active partition count prior to
coalescing.  SSA_NAMEs not eligible for coalescing are not counted.
The more of these there are, the larger the conflict graph we have to
build.

base: the base variable count that guides the construction of the
conflict map.  The more of these there are, the smaller the conflict
graph we have to build, but it is also a lower bound for the final
partition count.

part: the partition count after coalescing, not counting those of
SSA_NAMEs that were not eligible for coalescing to begin with.

abn: successful abnormal coalesce count, i.e., how many times
attempt_coalesce returned true when called from the abnormal coalesce loop.

same: successful normal coalesces of pairs of SSA_NAMEs that share the
same base variable (SSA_NAME_VAR, not the base index used to guide the
construction of the conflict graph).  Ignored base decls are regarded as
NULL for purposes of this comparison.  How many times attempt_coalesce
returned true for variables that share the same base variable.  This may
count cases in which both vars are in the same partition already due to
earlier coalesces.

other: successful normal coalesces of pairs of SSA_NAMEs that do NOT
share the same base variable.  Same caveats as above.

fail: failed attempts at normal coalesce.  How many times
attempt_coalesce returned false.

b/a    c/n     cv     base   part   abn   same   other fail

before -fno-tr 570180 176682 221442 82076 370746     0 10542
before -ftree- 577212 171581 221927 82076 378093     0 18654

after  -fno-tr 608533 179959 220948 82076 488119     0 11697
after  -ftree- 589243 202588 221817 82076 349373 41775 24124
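
As a reading aid for the columns above, the following standalone
sketch shows why the success counters can exceed the drop in the
partition count: a coalesce of two names that already share a
partition still counts as a success, but removes no partition.  The
union-find representation and all names in it are assumptions made
for the example; they are not GCC's var_map.

```c
#include <assert.h>

/* Hypothetical counters mirroring the "same"/"other" and "part"
   columns.  */
struct coalesce_counts
{
  int successes;    /* Successful coalesce attempts.  */
  int partitions;   /* Current number of partitions.  */
};

/* Return the representative of X's partition, halving paths as we go.  */
static int
find_partition (int *parent, int x)
{
  assert (x >= 0);
  while (parent[x] != x)
    {
      parent[x] = parent[parent[x]];
      x = parent[x];
    }
  return x;
}

/* Record a successful coalesce of A and B.  The partition count only
   drops when the two names were in different partitions before.  */
static void
record_coalesce (int *parent, int a, int b, struct coalesce_counts *counts)
{
  int pa = find_partition (parent, a);
  int pb = find_partition (parent, b);

  counts->successes++;
  if (pa != pb)
    {
      parent[pb] = pa;
      counts->partitions--;
    }
}
```

Starting from four singleton partitions, coalescing (0,1), (1,2) and
then (0,2) yields three successes but only two fewer partitions.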


Here's (for reference only) the patch used to gather the data
consolidated above:

diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index eeac5a4..d9fe4cc 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -1199,6 +1199,11 @@ coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
   edge e;
   edge_iterator ei;
 
+  int abnormal = 0, samevar = 0, othervar = 0, failure = 0;
+  int initial_partitions = num_var_partitions (map);
+  int final_partitions = initial_partitions;
+  int p1, p2;
+
   /* First, coalesce all the copies across abnormal edges.  These are not placed
      in the coalesce list because they do not need to be sorted, and simply
      consume extra memory/compilation time in large programs.  */
@@ -1226,8 +1231,17 @@ coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
 		if (debug)
 		  fprintf (debug, "Abnormal coalesce: ");
 
+		p1 = var_to_partition (map, arg);
+		p2 = var_to_partition (map, res);
+
 		if (!attempt_coalesce (map, graph, v1, v2, debug))
 		  fail_abnormal_edge_coalesce (v1, v2);
+		else
+		  {
+		    abnormal++;
+		    if (p1 != p2)
+		      final_partitions--;
+		  }
 	      }
 	  }
     }
@@ -1244,8 +1258,30 @@ coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
 
       if (debug)
 	fprintf (debug, "Coalesce list: ");
-      attempt_coalesce (map, graph, x, y, debug);
+
+      p1 = var_to_partition (map, var1);
+      p2 = var_to_partition (map, var2);
+
+      if (!attempt_coalesce (map, graph, x, y, debug))
+	failure++;
+      else
+	{
+	  if (p1 != p2)
+	    final_partitions--;
+	  if ((SSA_NAME_VAR (var1) && !DECL_IGNORED_P (SSA_NAME_VAR (var1))
+	       ? SSA_NAME_VAR (var1) : NULL)
+	      == (SSA_NAME_VAR (var2) && !DECL_IGNORED_P (SSA_NAME_VAR (var2))
+		  ? SSA_NAME_VAR (var2) : NULL))
+	    samevar++;
+	  else
+	    othervar++;
+	}
     }
+
+  inform (1,
+	  "%i cv, %i base, %i part, %i abn, %i same, %i other, %i failed in %q+F",
+	  initial_partitions, num_basevars (map), final_partitions,
+	  abnormal, samevar, othervar, failure, current_function_decl);
 }
 


And here's the actual patch I'm submitting for your appreciation (I was
gonna say for inclusion, but given the leader_merge brown paper bag bug,
I'll just want feedback on whether we want that or not, and either drop
the list-building, or probably post a revised patch that fixes fallout
from lists where decls are expected.)

No regressions, and many progressions, on x86_64-linux-gnu and
i686-pc-linux-gnu.

[PR64164] Drop copyrename, use coalescible partition as base when optimizing.

for  gcc/ChangeLog

	PR rtl-optimization/64164
	* Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
	* tree-ssa-copyrename.c: Removed.
	* opts.c (default_options_table): Drop -ftree-copyrename.  Add
	-ftree-coalesce-vars.
	* passes.def: Drop all occurrences of pass_rename_ssa_copies.
	* common.opt (ftree-copyrename): Ignore.
	(ftree-coalesce-inlined-vars): Likewise.
	* doc/invoke.texi: Remove the ignored options above.
	* gimple-expr.h (gimple_can_coalesce_p): Note def location.
	* gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
	across variables when flag_tree_coalesce_vars.  Check register
	use and promoted modes to allow coalescing.  Moved to
	tree-ssa-coalesce.c.
	* tree-ssa-live.c (struct tree_int_map_hasher): Move along
	with its member functions to tree-ssa-coalesce.c.
	(var_map_base_init): Likewise.  Renamed to
	compute_samebase_partition_bases.
	(partition_view_normal): Drop want_bases parameter.
	(partition_view_bitmap): Likewise.
	* tree-ssa-live.h: Adjust declarations.
	* tree-ssa-coalesce.c: Include explow.h.
	(build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs's
	default defs at the entry point.
	(dump_part_var_map): New.
	(compute_optimized_partition_bases): New, called by...
	(coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
	of compute_samebase_partition_bases.  Adjust.
	* alias.c (nonoverlapping_memrefs_p): Special-case RTL-less
	gimple-reg exprs.
	* cfgexpand.c (leader_merge): New.
	(get_rtl_for_parm_ssa_default_def): New.
	(set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
	vars.  Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
	(expand_one_stack_var_at): Handle anonymous SSA_NAMEs.  Drop
	redundant MEM attr setting.
	(expand_one_stack_var_1): Handle anonymous SSA_NAMEs.  Renamed
	from...
	(expand_one_stack_var): ... this.  New wrapper to check and
	skip already expanded SSA partitions.
	(record_alignment_for_reg_var): New, factored out of...
	(expand_one_var): ... this.
	(expand_one_ssa_partition): New.
	(adjust_one_expanded_partition_var): New.
	(expand_one_register_var): Check and skip already expanded SSA
	partitions.
	(expand_used_vars): Don't create DECLs for anonymous SSA
	names.  Expand all SSA partitions, then adjust all SSA names.
	(pass::execute): Replace the loops that set
	SA.partition_to_pseudo from partition leaders and cleared
	DECL_RTL for multi-location variables, and that which used to
	rename vars and set attrs, with one that clears DECL_RTL and
	checks that PARMs and RESULTs default_defs match DECL_RTL.
	* cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
	* emit-rtl.c (set_reg_attrs_for_parm): Handle NULL and
	TREE_LIST decl.
	* explow.c (promote_ssa_mode): New.
	* explow.h (promote_ssa_mode): Declare.
	* expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
	* function.c: Include cfgexpand.h.
	(use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
	(use_register_for_parm_decl): Wrapper for the above to
	special-case the result_ptr.
	(rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
	(maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
	multiple locations.
	(assign_parm_adjust_stack_rtl): Add all and parm arguments,
	for rtl_for_parm.  For SSA-assigned parms, zero stack_parm.
	(assign_parm_setup_block): Prefer SSA-assigned location.
	(assign_parm_setup_reg): Likewise.  Use entry_parm for equiv
	if stack_parm is NULL.
	(assign_parm_setup_stack): Prefer SSA-assigned location.
	(assign_parms): Maybe reset DECL_RTL of params.  Adjust stack
	rtl before testing for pointer bounds.  Special-case result_ptr.
	(expand_function_start): Maybe reset DECL_RTL of result.
	Prefer SSA-assigned location for result and static chain.
	Factor out DECL_RESULT and SET_DECL_RTL.
	* tree-outof-ssa.c (insert_value_copy_on_edge): Handle
	anonymous SSA names.  Use promote_ssa_mode.
	(get_temp_reg): Likewise.
	(remove_ssa_form): Adjust.
	* var-tracking.c (dataflow_set_clear_at_call): Take call_insn
	and get its reg_usage for reg invalidation.
	(compute_bb_dataflow): Pass it insn.
	(emit_notes_in_bb): Likewise.

for  gcc/testsuite/ChangeLog

	* gcc.dg/guality/pr54200.c: Add -fno-tree-coalesce-vars.
	* gcc.dg/ssp-1.c: Make counter a register.
	* gcc.dg/ssp-2.c: Likewise.
	* gcc.dg/torture/parm-coalesce.c: New.
---
 gcc/Makefile.in                              |    1 
 gcc/alias.c                                  |   12 +
 gcc/cfgexpand.c                              |  383 +++++++++++++++-----
 gcc/cfgexpand.h                              |    2 
 gcc/common.opt                               |   12 -
 gcc/doc/invoke.texi                          |   48 +--
 gcc/emit-rtl.c                               |    7 
 gcc/explow.c                                 |   25 +
 gcc/explow.h                                 |    3 
 gcc/expr.c                                   |   33 +-
 gcc/function.c                               |  211 +++++++++--
 gcc/gimple-expr.c                            |   39 --
 gcc/gimple-expr.h                            |    5 
 gcc/opts.c                                   |    2 
 gcc/passes.def                               |    5 
 gcc/testsuite/gcc.dg/guality/pr54200.c       |    2 
 gcc/testsuite/gcc.dg/ssp-1.c                 |    2 
 gcc/testsuite/gcc.dg/ssp-2.c                 |    2 
 gcc/testsuite/gcc.dg/torture/parm-coalesce.c |   40 ++
 gcc/tree-outof-ssa.c                         |   15 -
 gcc/tree-ssa-coalesce.c                      |  380 +++++++++++++++++++-
 gcc/tree-ssa-copyrename.c                    |  499 --------------------------
 gcc/tree-ssa-live.c                          |  101 -----
 gcc/tree-ssa-live.h                          |    4 
 gcc/var-tracking.c                           |   12 -
 25 files changed, 980 insertions(+), 865 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/parm-coalesce.c
 delete mode 100644 gcc/tree-ssa-copyrename.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 80c91f0..6920ee7 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1428,7 +1428,6 @@ OBJS = \
 	tree-ssa-ccp.o \
 	tree-ssa-coalesce.o \
 	tree-ssa-copy.o \
-	tree-ssa-copyrename.o \
 	tree-ssa-dce.o \
 	tree-ssa-dom.o \
 	tree-ssa-dse.o \
diff --git a/gcc/alias.c b/gcc/alias.c
index a7160f3..2100e8b 100644
--- a/gcc/alias.c
+++ b/gcc/alias.c
@@ -2365,6 +2365,18 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
   if (! DECL_P (exprx) || ! DECL_P (expry))
     return 0;
 
+  /* If we refer to different gimple registers, or one gimple register
+     and one non-gimple-register, we know they can't overlap.  Now,
+     there could be more than one stack slot for (different versions
+     of) the same gimple register, but we can presumably tell they
+     don't overlap based on offsets from stack base addresses
+     elsewhere.  It's important that we don't proceed to DECL_RTL,
+     because gimple registers may not pass DECL_RTL_SET_P, and
+     make_decl_rtl won't be able to do anything about them since no
+     SSA information will have remained to guide it.  */
+  if (is_gimple_reg (exprx) || is_gimple_reg (expry))
+    return exprx != expry;
+
   /* With invalid code we can end up storing into the constant pool.
      Bail out to avoid ICEing when creating RTL for this.
      See gfortran.dg/lto/20091028-2_0.f90.  */
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index ca491a0..74190a6d 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -179,21 +179,137 @@ gimple_assign_rhs_to_tree (gimple stmt)
 
 #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
 
+/* NEXT is a DECL to be associated with some RTL, CUR is a DECL or a
+   TREE_LIST of DECLs.  If NEXT is covered by CUR, return CUR
+   unchanged.  Otherwise, return a list with all entries of CUR, with
+   NEXT at the end.  If CUR was a list, it will be modified in
+   place.  */
+
+static tree
+leader_merge (tree cur, tree next)
+{
+  if (cur == NULL || cur == next)
+    return next;
+
+  tree list;
+
+  if (TREE_CODE (cur) == TREE_LIST)
+    {
+      /* Look for NEXT in the list.  Stop at the last node to insert
+	 there.  */
+      for (list = cur; ; list = TREE_CHAIN (list))
+	{
+	  if (TREE_VALUE (list) == next)
+	    return cur;
+	  if (!TREE_CHAIN (list))
+	    break;
+	}
+    }
+  else
+    /* Create the first node.  */
+    list = build_tree_list (NULL, cur);
+
+  next = build_tree_list (NULL, next);
+  TREE_CHAIN (list) = next;
+
+  return cur;
+}
+
+
+/* Return the RTL for the default SSA def of a PARM or RESULT, if
+   there is one.  */
+
+rtx
+get_rtl_for_parm_ssa_default_def (tree var)
+{
+  gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL);
+
+  if (!is_gimple_reg (var))
+    return NULL_RTX;
+
+  /* If we've already determined RTL for the decl, use it.  This is
+     not just an optimization: if VAR is a PARM whose incoming value
+     is unused, we won't find a default def to use its partition, but
+     we still want to use the location of the parm, if it was used at
+     all.  During assign_parms, until a location is assigned for the
+     VAR, RTL can only for a parm or result if we're not coalescing
+     across variables, when we know we're coalescing all SSA_NAMEs of
+     each parm or result, and we're not coalescing them with names
+     pertaining to other variables, such as other parms' default
+     defs.  */
+  if (DECL_RTL_SET_P (var))
+    {
+      gcc_assert (DECL_RTL (var) != pc_rtx);
+      return DECL_RTL (var);
+    }
+
+  tree name = ssa_default_def (cfun, var);
+
+  if (!name)
+    return NULL_RTX;
+
+  int part = var_to_partition (SA.map, name);
+  if (part == NO_PARTITION)
+    return NULL_RTX;
+
+  return SA.partition_to_pseudo[part];
+}
+
 /* Associate declaration T with storage space X.  If T is no
    SSA name this is exactly SET_DECL_RTL, otherwise make the
    partition of T associated with X.  */
 static inline void
 set_rtl (tree t, rtx x)
 {
+  if (x && SSAVAR (t))
+    {
+      bool skip = false;
+      tree cur = NULL_TREE;
+
+      if (MEM_P (x))
+	cur = MEM_EXPR (x);
+      else if (REG_P (x))
+	cur = REG_EXPR (x);
+      else if (GET_CODE (x) == CONCAT
+	       && REG_P (XEXP (x, 0)))
+	cur = REG_EXPR (XEXP (x, 0));
+      else if (GET_CODE (x) == PARALLEL)
+	cur = REG_EXPR (XVECEXP (x, 0, 0));
+      else if (x == pc_rtx)
+	skip = true;
+      else
+	gcc_unreachable ();
+
+      tree next = skip ? cur : leader_merge (cur, SSAVAR (t));
+
+      if (cur != next)
+	{
+	  if (MEM_P (x))
+	    set_mem_attributes (x, SSAVAR (t), true);
+	  else
+	    set_reg_attrs_for_decl_rtl (SSAVAR (t), x);
+	}
+    }
+
   if (TREE_CODE (t) == SSA_NAME)
     {
-      SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x;
-      if (x && !MEM_P (x))
-	set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x);
-      /* For the benefit of debug information at -O0 (where vartracking
-         doesn't run) record the place also in the base DECL if it's
-	 a normal variable (not a parameter).  */
-      if (x && x != pc_rtx && TREE_CODE (SSA_NAME_VAR (t)) == VAR_DECL)
+      int part = var_to_partition (SA.map, t);
+      if (part != NO_PARTITION)
+	{
+	  if (SA.partition_to_pseudo[part])
+	    gcc_assert (SA.partition_to_pseudo[part] == x);
+	  else
+	    SA.partition_to_pseudo[part] = x;
+	}
+      /* For the benefit of debug information at -O0 (where
+         vartracking doesn't run) record the place also in the base
+         DECL.  For PARMs and RESULTs, we may end up resetting these
+         in function.c:maybe_reset_rtl_for_parm, but in some rare
+         cases we may need them (unused and overwritten incoming
+         value, that at -O0 must share the location with the other
+         uses in spite of the missing default def), and this may be
+         the only chance to preserve them.  */
+      if (x && x != pc_rtx && SSA_NAME_VAR (t))
 	{
 	  tree var = SSA_NAME_VAR (t);
 	  /* If we don't yet have something recorded, just record it now.  */
@@ -909,7 +1025,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
   gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
 
   x = plus_constant (Pmode, base, offset);
-  x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x);
+  x = gen_rtx_MEM ((TREE_CODE (decl) != SSA_NAME || SSA_NAME_VAR (decl))
+		   ? DECL_MODE (SSAVAR (decl))
+		   : TYPE_MODE (TREE_TYPE (decl)), x);
 
   if (TREE_CODE (decl) != SSA_NAME)
     {
@@ -931,7 +1049,6 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
       DECL_USER_ALIGN (decl) = 0;
     }
 
-  set_mem_attributes (x, SSAVAR (decl), true);
   set_rtl (decl, x);
 }
 
@@ -1146,13 +1263,22 @@ account_stack_vars (void)
    to a variable to be allocated in the stack frame.  */
 
 static void
-expand_one_stack_var (tree var)
+expand_one_stack_var_1 (tree var)
 {
   HOST_WIDE_INT size, offset;
   unsigned byte_align;
 
-  size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
-  byte_align = align_local_variable (SSAVAR (var));
+  if (TREE_CODE (var) != SSA_NAME || SSA_NAME_VAR (var))
+    {
+      size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
+      byte_align = align_local_variable (SSAVAR (var));
+    }
+  else
+    {
+      tree type = TREE_TYPE (var);
+      size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
+      byte_align = TYPE_ALIGN_UNIT (type);
+    }
 
   /* We handle highly aligned variables in expand_stack_vars.  */
   gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
@@ -1163,6 +1289,27 @@ expand_one_stack_var (tree var)
 			   crtl->max_used_stack_slot_alignment, offset);
 }
 
+/* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are
+   already assigned some MEM.  */
+
+static void
+expand_one_stack_var (tree var)
+{
+  if (TREE_CODE (var) == SSA_NAME)
+    {
+      int part = var_to_partition (SA.map, var);
+      if (part != NO_PARTITION)
+	{
+	  rtx x = SA.partition_to_pseudo[part];
+	  gcc_assert (x);
+	  gcc_assert (MEM_P (x));
+	  return;
+	}
+    }
+
+  return expand_one_stack_var_1 (var);
+}
+
 /* A subroutine of expand_one_var.  Called to assign rtl to a VAR_DECL
    that will reside in a hard register.  */
 
@@ -1172,12 +1319,112 @@ expand_one_hard_reg_var (tree var)
   rest_of_decl_compilation (var, 0, 0);
 }
 
+/* Record the alignment requirements of some variable assigned to a
+   pseudo.  */
+
+static void
+record_alignment_for_reg_var (unsigned int align)
+{
+  if (SUPPORTS_STACK_ALIGNMENT
+      && crtl->stack_alignment_estimated < align)
+    {
+      /* stack_alignment_estimated shouldn't change after stack
+         realign decision made */
+      gcc_assert (!crtl->stack_realign_processed);
+      crtl->stack_alignment_estimated = align;
+    }
+
+  /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
+     So here we only make sure stack_alignment_needed >= align.  */
+  if (crtl->stack_alignment_needed < align)
+    crtl->stack_alignment_needed = align;
+  if (crtl->max_used_stack_slot_alignment < align)
+    crtl->max_used_stack_slot_alignment = align;
+}
+
+/* Create RTL for an SSA partition.  */
+
+static void
+expand_one_ssa_partition (tree var)
+{
+  int part = var_to_partition (SA.map, var);
+  gcc_assert (part != NO_PARTITION);
+
+  if (SA.partition_to_pseudo[part])
+    return;
+
+  if (!use_register_for_decl (var))
+    {
+      expand_one_stack_var_1 (var);
+      return;
+    }
+
+  unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (var),
+					  TYPE_MODE (TREE_TYPE (var)),
+					  TYPE_ALIGN (TREE_TYPE (var)));
+
+  /* If the variable alignment is very large we'll dynamically allocate
+     it, which means that in-frame portion is just a pointer.  */
+  if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
+    align = POINTER_SIZE;
+
+  record_alignment_for_reg_var (align);
+
+  machine_mode reg_mode = promote_ssa_mode (var, NULL);
+
+  rtx x = gen_reg_rtx (reg_mode);
+
+  set_rtl (var, x);
+}
+
+/* Record the association between the RTL generated for a partition
+   and the underlying variable of the SSA_NAME.  */
+
+static void
+adjust_one_expanded_partition_var (tree var)
+{
+  if (!var)
+    return;
+
+  tree decl = SSA_NAME_VAR (var);
+
+  int part = var_to_partition (SA.map, var);
+  if (part == NO_PARTITION)
+    return;
+
+  rtx x = SA.partition_to_pseudo[part];
+
+  set_rtl (var, x);
+
+  if (!REG_P (x))
+    return;
+
+  /* Note if the object is a user variable.  */
+  if (decl && !DECL_ARTIFICIAL (decl))
+    mark_user_reg (x);
+
+  if (POINTER_TYPE_P (decl ? TREE_TYPE (decl) : TREE_TYPE (var)))
+    mark_reg_pointer (x, get_pointer_alignment (var));
+}
+
 /* A subroutine of expand_one_var.  Called to assign rtl to a VAR_DECL
    that will reside in a pseudo register.  */
 
 static void
 expand_one_register_var (tree var)
 {
+  if (TREE_CODE (var) == SSA_NAME)
+    {
+      int part = var_to_partition (SA.map, var);
+      if (part != NO_PARTITION)
+	{
+	  rtx x = SA.partition_to_pseudo[part];
+	  gcc_assert (x);
+	  gcc_assert (REG_P (x));
+	  return;
+	}
+    }
+
   tree decl = SSAVAR (var);
   tree type = TREE_TYPE (decl);
   machine_mode reg_mode = promote_decl_mode (decl, NULL);
@@ -1312,21 +1559,7 @@ expand_one_var (tree var, bool toplevel, bool really_expand)
 	align = POINTER_SIZE;
     }
 
-  if (SUPPORTS_STACK_ALIGNMENT
-      && crtl->stack_alignment_estimated < align)
-    {
-      /* stack_alignment_estimated shouldn't change after stack
-         realign decision made */
-      gcc_assert (!crtl->stack_realign_processed);
-      crtl->stack_alignment_estimated = align;
-    }
-
-  /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
-     So here we only make sure stack_alignment_needed >= align.  */
-  if (crtl->stack_alignment_needed < align)
-    crtl->stack_alignment_needed = align;
-  if (crtl->max_used_stack_slot_alignment < align)
-    crtl->max_used_stack_slot_alignment = align;
+  record_alignment_for_reg_var (align);
 
   if (TREE_CODE (origvar) == SSA_NAME)
     {
@@ -1760,48 +1993,18 @@ expand_used_vars (void)
   if (targetm.use_pseudo_pic_reg ())
     pic_offset_table_rtx = gen_reg_rtx (Pmode);
 
-  hash_map<tree, tree> ssa_name_decls;
   for (i = 0; i < SA.map->num_partitions; i++)
     {
       tree var = partition_to_var (SA.map, i);
 
       gcc_assert (!virtual_operand_p (var));
 
-      /* Assign decls to each SSA name partition, share decls for partitions
-         we could have coalesced (those with the same type).  */
-      if (SSA_NAME_VAR (var) == NULL_TREE)
-	{
-	  tree *slot = &ssa_name_decls.get_or_insert (TREE_TYPE (var));
-	  if (!*slot)
-	    *slot = create_tmp_reg (TREE_TYPE (var));
-	  replace_ssa_name_symbol (var, *slot);
-	}
-
-      /* Always allocate space for partitions based on VAR_DECLs.  But for
-	 those based on PARM_DECLs or RESULT_DECLs and which matter for the
-	 debug info, there is no need to do so if optimization is disabled
-	 because all the SSA_NAMEs based on these DECLs have been coalesced
-	 into a single partition, which is thus assigned the canonical RTL
-	 location of the DECLs.  If in_lto_p, we can't rely on optimize,
-	 a function could be compiled with -O1 -flto first and only the
-	 link performed at -O0.  */
-      if (TREE_CODE (SSA_NAME_VAR (var)) == VAR_DECL)
-	expand_one_var (var, true, true);
-      else if (DECL_IGNORED_P (SSA_NAME_VAR (var)) || optimize || in_lto_p)
-	{
-	  /* This is a PARM_DECL or RESULT_DECL.  For those partitions that
-	     contain the default def (representing the parm or result itself)
-	     we don't do anything here.  But those which don't contain the
-	     default def (representing a temporary based on the parm/result)
-	     we need to allocate space just like for normal VAR_DECLs.  */
-	  if (!bitmap_bit_p (SA.partition_has_default_def, i))
-	    {
-	      expand_one_var (var, true, true);
-	      gcc_assert (SA.partition_to_pseudo[i]);
-	    }
-	}
+      expand_one_ssa_partition (var);
     }
 
+  for (i = 1; i < num_ssa_names; i++)
+    adjust_one_expanded_partition_var (ssa_name (i));
+
   if (flag_stack_protect == SPCT_FLAG_STRONG)
       gen_stack_protect_signal
 	= stack_protect_decl_p () || stack_protect_return_slot_p ();
@@ -6033,35 +6236,6 @@ pass_expand::execute (function *fun)
       parm_birth_insn = var_seq;
     }
 
-  /* Now that we also have the parameter RTXs, copy them over to our
-     partitions.  */
-  for (i = 0; i < SA.map->num_partitions; i++)
-    {
-      tree var = SSA_NAME_VAR (partition_to_var (SA.map, i));
-
-      if (TREE_CODE (var) != VAR_DECL
-	  && !SA.partition_to_pseudo[i])
-	SA.partition_to_pseudo[i] = DECL_RTL_IF_SET (var);
-      gcc_assert (SA.partition_to_pseudo[i]);
-
-      /* If this decl was marked as living in multiple places, reset
-	 this now to NULL.  */
-      if (DECL_RTL_IF_SET (var) == pc_rtx)
-	SET_DECL_RTL (var, NULL);
-
-      /* Some RTL parts really want to look at DECL_RTL(x) when x
-	 was a decl marked in REG_ATTR or MEM_ATTR.  We could use
-	 SET_DECL_RTL here making this available, but that would mean
-	 to select one of the potentially many RTLs for one DECL.  Instead
-	 of doing that we simply reset the MEM_EXPR of the RTL in question,
-	 then nobody can get at it and hence nobody can call DECL_RTL on it.  */
-      if (!DECL_RTL_SET_P (var))
-	{
-	  if (MEM_P (SA.partition_to_pseudo[i]))
-	    set_mem_expr (SA.partition_to_pseudo[i], NULL);
-	}
-    }
-
   /* If we have a class containing differently aligned pointers
      we need to merge those into the corresponding RTL pointer
      alignment.  */
@@ -6069,7 +6243,6 @@ pass_expand::execute (function *fun)
     {
       tree name = ssa_name (i);
       int part;
-      rtx r;
 
       if (!name
 	  /* We might have generated new SSA names in
@@ -6082,20 +6255,24 @@ pass_expand::execute (function *fun)
       if (part == NO_PARTITION)
 	continue;
 
-      /* Adjust all partition members to get the underlying decl of
-	 the representative which we might have created in expand_one_var.  */
-      if (SSA_NAME_VAR (name) == NULL_TREE)
+      gcc_assert (SA.partition_to_pseudo[part]);
+
+      /* If this decl was marked as living in multiple places, reset
+	 this now to NULL.  */
+      tree var = SSA_NAME_VAR (name);
+      if (var && DECL_RTL_IF_SET (var) == pc_rtx)
+	SET_DECL_RTL (var, NULL);
+      /* Check that the pseudos chosen by assign_parms are those of
+	 the corresponding default defs.  */
+      else if (SSA_NAME_IS_DEFAULT_DEF (name)
+	       && (TREE_CODE (var) == PARM_DECL
+		   || TREE_CODE (var) == RESULT_DECL))
 	{
-	  tree leader = partition_to_var (SA.map, part);
-	  gcc_assert (SSA_NAME_VAR (leader) != NULL_TREE);
-	  replace_ssa_name_symbol (name, SSA_NAME_VAR (leader));
+	  rtx in = DECL_RTL_IF_SET (var);
+	  gcc_assert (in);
+	  rtx out = SA.partition_to_pseudo[part];
+	  gcc_assert (in == out || rtx_equal_p (in, out));
 	}
-      if (!POINTER_TYPE_P (TREE_TYPE (name)))
-	continue;
-
-      r = SA.partition_to_pseudo[part];
-      if (REG_P (r))
-	mark_reg_pointer (r, get_pointer_alignment (name));
     }
 
   /* If this function is `main', emit a call to `__main'
diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
index a0b6e3e..602579d 100644
--- a/gcc/cfgexpand.h
+++ b/gcc/cfgexpand.h
@@ -22,5 +22,7 @@ along with GCC; see the file COPYING3.  If not see
 
 extern tree gimple_assign_rhs_to_tree (gimple);
 extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
+extern rtx get_rtl_for_parm_ssa_default_def (tree var);
+
 
 #endif /* GCC_CFGEXPAND_H */
diff --git a/gcc/common.opt b/gcc/common.opt
index 380848c..2cdbea1 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2212,16 +2212,16 @@ Common Report Var(flag_tree_ch) Optimization
 Enable loop header copying on trees
 
 ftree-coalesce-inlined-vars
-Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization
-Enable coalescing of copy-related user variables that are inlined
+Common Ignore RejectNegative
+Does nothing.  Preserved for backward compatibility.
 
 ftree-coalesce-vars
-Common Report Var(flag_ssa_coalesce_vars,2) Optimization
-Enable coalescing of all copy-related user variables
+Common Report Var(flag_tree_coalesce_vars) Optimization
+Enable SSA coalescing of user variables
 
 ftree-copyrename
-Common Report Var(flag_tree_copyrename) Optimization
-Replace SSA temporaries with better names in copies
+Common Ignore
+Does nothing.  Preserved for backward compatibility.
 
 ftree-copy-prop
 Common Report Var(flag_tree_copy_prop) Optimization
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c20dd4d..0a3b930 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -337,7 +337,6 @@ Objective-C and Objective-C++ Dialects}.
 -fdump-tree-phiprop@r{[}-@var{n}@r{]} @gol
 -fdump-tree-phiopt@r{[}-@var{n}@r{]} @gol
 -fdump-tree-forwprop@r{[}-@var{n}@r{]} @gol
--fdump-tree-copyrename@r{[}-@var{n}@r{]} @gol
 -fdump-tree-nrv -fdump-tree-vect @gol
 -fdump-tree-sink @gol
 -fdump-tree-sra@r{[}-@var{n}@r{]} @gol
@@ -443,9 +442,8 @@ Objective-C and Objective-C++ Dialects}.
 -fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol
 -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
 -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
--ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
--ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
--ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
+-ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol
+-ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
 -ftree-loop-if-convert-stores -ftree-loop-im @gol
 -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
 -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
@@ -6989,11 +6987,6 @@ name is made by appending @file{.phiopt} to the source file name.
 Dump each function after forward propagating single use variables.  The file
 name is made by appending @file{.forwprop} to the source file name.
 
-@item copyrename
-@opindex fdump-tree-copyrename
-Dump each function after applying the copy rename optimization.  The file
-name is made by appending @file{.copyrename} to the source file name.
-
 @item nrv
 @opindex fdump-tree-nrv
 Dump each function after applying the named return value optimization on
@@ -7458,8 +7451,8 @@ compilation time.
 -ftree-ccp @gol
 -fssa-phiopt @gol
 -ftree-ch @gol
+-ftree-coalesce-vars @gol
 -ftree-copy-prop @gol
--ftree-copyrename @gol
 -ftree-dce @gol
 -ftree-dominator-opts @gol
 -ftree-dse @gol
@@ -8724,6 +8717,15 @@ profitable to parallelize the loops.
 Compare the results of several data dependence analyzers.  This option
 is used for debugging the data dependence analyzers.
 
+@item -ftree-coalesce-vars
+@opindex ftree-coalesce-vars
+Tell the compiler to attempt to combine small user-defined variables
+too, instead of just compiler temporaries.  This may severely limit the
+ability to debug an optimized program compiled with
+@option{-fno-var-tracking-assignments}.  In the negated form, this flag
+prevents SSA coalescing of user variables.  This option is enabled by
+default if optimization is enabled.
+
 @item -ftree-loop-if-convert
 @opindex ftree-loop-if-convert
 Attempt to transform conditional jumps in the innermost loops to
@@ -8837,32 +8839,6 @@ Perform scalar replacement of aggregates.  This pass replaces structure
 references with scalars to prevent committing structures to memory too
 early.  This flag is enabled by default at @option{-O} and higher.
 
-@item -ftree-copyrename
-@opindex ftree-copyrename
-Perform copy renaming on trees.  This pass attempts to rename compiler
-temporaries to other variables at copy locations, usually resulting in
-variable names which more closely resemble the original variables.  This flag
-is enabled by default at @option{-O} and higher.
-
-@item -ftree-coalesce-inlined-vars
-@opindex ftree-coalesce-inlined-vars
-Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
-combine small user-defined variables too, but only if they are inlined
-from other functions.  It is a more limited form of
-@option{-ftree-coalesce-vars}.  This may harm debug information of such
-inlined variables, but it keeps variables of the inlined-into
-function apart from each other, such that they are more likely to
-contain the expected values in a debugging session.
-
-@item -ftree-coalesce-vars
-@opindex ftree-coalesce-vars
-Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
-combine small user-defined variables too, instead of just compiler
-temporaries.  This may severely limit the ability to debug an optimized
-program compiled with @option{-fno-var-tracking-assignments}.  In the
-negated form, this flag prevents SSA coalescing of user variables,
-including inlined ones.  This option is enabled by default.
-
 @item -ftree-ter
 @opindex ftree-ter
 Perform temporary expression replacement during the SSA->normal phase.  Single
diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index b8dc7d5..ef31ba0f 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -1229,6 +1229,11 @@ set_reg_attrs_for_parm (rtx parm_rtx, rtx mem)
 void
 set_reg_attrs_for_decl_rtl (tree t, rtx x)
 {
+  if (!t)
+    return;
+  tree tdecl = t;
+  if (TREE_CODE (t) == TREE_LIST)
+    tdecl = TREE_VALUE (t);
   if (GET_CODE (x) == SUBREG)
     {
       gcc_assert (subreg_lowpart_p (x));
@@ -1237,7 +1242,7 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x)
   if (REG_P (x))
     REG_ATTRS (x)
       = get_reg_attrs (t, byte_lowpart_offset (GET_MODE (x),
-					       DECL_MODE (t)));
+					       DECL_MODE (tdecl)));
   if (GET_CODE (x) == CONCAT)
     {
       if (REG_P (XEXP (x, 0)))
diff --git a/gcc/explow.c b/gcc/explow.c
index de446a9..b53a3b7 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -854,6 +854,31 @@ promote_decl_mode (const_tree decl, int *punsignedp)
   return pmode;
 }
 
+/* Return the promoted mode for NAME.  If NAME is based on a decl, this
+   is the same as promote_decl_mode of that decl.  Otherwise, it is the
+   promoted mode that a temporary decl of the same type as NAME would
+   have had, if we had created one.  */
+
+machine_mode
+promote_ssa_mode (const_tree name, int *punsignedp)
+{
+  gcc_assert (TREE_CODE (name) == SSA_NAME);
+
+  if (SSA_NAME_VAR (name))
+    return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
+
+  tree type = TREE_TYPE (name);
+  int unsignedp = TYPE_UNSIGNED (type);
+  machine_mode mode = TYPE_MODE (type);
+
+  machine_mode pmode = promote_mode (type, mode, &unsignedp);
+  if (punsignedp)
+    *punsignedp = unsignedp;
+
+  return pmode;
+}
+
+
 \f
 /* Controls the behaviour of {anti_,}adjust_stack.  */
 static bool suppress_reg_args_size;
diff --git a/gcc/explow.h b/gcc/explow.h
index 48f1859..7b11e46 100644
--- a/gcc/explow.h
+++ b/gcc/explow.h
@@ -57,6 +57,9 @@ extern machine_mode promote_mode (const_tree, machine_mode, int *);
 /* Return mode and signedness to use when object is promoted.  */
 machine_mode promote_decl_mode (const_tree, int *);
 
+/* Return mode and signedness to use when an SSA_NAME is promoted.  */
+machine_mode promote_ssa_mode (const_tree, int *);
+
 /* Remove some bytes from the stack.  An rtx says how many.  */
 extern void adjust_stack (rtx);
 
diff --git a/gcc/expr.c b/gcc/expr.c
index 530a944..95a9bab 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9388,7 +9388,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
   rtx op0, op1, temp, decl_rtl;
   tree type;
   int unsignedp;
-  machine_mode mode;
+  machine_mode mode, dmode;
   enum tree_code code = TREE_CODE (exp);
   rtx subtarget, original_target;
   int ignore;
@@ -9519,7 +9519,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
       if (g == NULL
 	  && modifier == EXPAND_INITIALIZER
 	  && !SSA_NAME_IS_DEFAULT_DEF (exp)
-	  && (optimize || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
+	  && (optimize || !SSA_NAME_VAR (exp)
+	      || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
 	  && stmt_is_replaceable_p (SSA_NAME_DEF_STMT (exp)))
 	g = SSA_NAME_DEF_STMT (exp);
       if (g)
@@ -9598,15 +9599,18 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
       /* Ensure variable marked as used even if it doesn't go through
 	 a parser.  If it hasn't be used yet, write out an external
 	 definition.  */
-      TREE_USED (exp) = 1;
+      if (exp)
+	TREE_USED (exp) = 1;
 
       /* Show we haven't gotten RTL for this yet.  */
       temp = 0;
 
       /* Variables inherited from containing functions should have
 	 been lowered by this point.  */
-      context = decl_function_context (exp);
-      gcc_assert (SCOPE_FILE_SCOPE_P (context)
+      if (exp)
+	context = decl_function_context (exp);
+      gcc_assert (!exp
+		  || SCOPE_FILE_SCOPE_P (context)
 		  || context == current_function_decl
 		  || TREE_STATIC (exp)
 		  || DECL_EXTERNAL (exp)
@@ -9630,7 +9634,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 	  decl_rtl = use_anchored_address (decl_rtl);
 	  if (modifier != EXPAND_CONST_ADDRESS
 	      && modifier != EXPAND_SUM
-	      && !memory_address_addr_space_p (DECL_MODE (exp),
+	      && !memory_address_addr_space_p (exp ? DECL_MODE (exp)
+					       : GET_MODE (decl_rtl),
 					       XEXP (decl_rtl, 0),
 					       MEM_ADDR_SPACE (decl_rtl)))
 	    temp = replace_equiv_address (decl_rtl,
@@ -9641,12 +9646,17 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 	 if the address is a register.  */
       if (temp != 0)
 	{
-	  if (MEM_P (temp) && REG_P (XEXP (temp, 0)))
+	  if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0)))
 	    mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp));
 
 	  return temp;
 	}
 
+      if (exp)
+	dmode = DECL_MODE (exp);
+      else
+	dmode = TYPE_MODE (TREE_TYPE (ssa_name));
+
       /* If the mode of DECL_RTL does not match that of the decl,
 	 there are two cases: we are dealing with a BLKmode value
 	 that is returned in a register, or we are dealing with
@@ -9654,8 +9664,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 	 of the wanted mode, but mark it so that we know that it
 	 was already extended.  */
       if (REG_P (decl_rtl)
-	  && DECL_MODE (exp) != BLKmode
-	  && GET_MODE (decl_rtl) != DECL_MODE (exp))
+	  && dmode != BLKmode
+	  && GET_MODE (decl_rtl) != dmode)
 	{
 	  machine_mode pmode;
 
@@ -9668,6 +9678,11 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 	    pmode = promote_function_mode (type, mode, &unsignedp,
 					   gimple_call_fntype (g),
 					   2);
+	  else if (!exp)
+	    {
+	      gcc_assert (code == SSA_NAME);
+	      pmode = promote_ssa_mode (ssa_name, &unsignedp);
+	    }
 	  else
 	    pmode = promote_decl_mode (exp, &unsignedp);
 	  gcc_assert (GET_MODE (decl_rtl) == pmode);
diff --git a/gcc/function.c b/gcc/function.c
index 7d4df92..1f5296e 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -87,6 +87,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfganal.h"
 #include "cfgbuild.h"
 #include "cfgcleanup.h"
+#include "cfgexpand.h"
 #include "basic-block.h"
 #include "df.h"
 #include "params.h"
@@ -2121,6 +2122,15 @@ aggregate_value_p (const_tree exp, const_tree fntype)
 bool
 use_register_for_decl (const_tree decl)
 {
+  if (TREE_CODE (decl) == SSA_NAME)
+    {
+      if (!SSA_NAME_VAR (decl))
+	return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
+	  && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
+
+      decl = SSA_NAME_VAR (decl);
+    }
+
   if (!targetm.calls.allocate_stack_slots_for_args ())
     return true;
 
@@ -2804,23 +2814,88 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data)
   data->entry_parm = entry_parm;
 }
 
+/* Wrapper for use_register_for_decl that special-cases the
+   .result_ptr as the function's RESULT_DECL when the RESULT_DECL is
+   passed by reference.  */
+
+static bool
+use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm)
+{
+  if (parm == all->function_result_decl)
+    {
+      tree result = DECL_RESULT (current_function_decl);
+
+      if (DECL_BY_REFERENCE (result))
+	parm = result;
+    }
+
+  return use_register_for_decl (parm);
+}
+
+/* Wrapper for get_rtl_for_parm_ssa_default_def that special-cases
+   the .result_ptr as the function's RESULT_DECL when the RESULT_DECL
+   is passed by reference.  */
+
+static rtx
+rtl_for_parm (struct assign_parm_data_all *all, tree parm)
+{
+  if (parm == all->function_result_decl)
+    {
+      tree result = DECL_RESULT (current_function_decl);
+
+      if (!DECL_BY_REFERENCE (result))
+	return NULL_RTX;
+
+      parm = result;
+    }
+
+  return get_rtl_for_parm_ssa_default_def (parm);
+}
+
+/* Reset the location of PARM_DECLs and RESULT_DECLs that had
+   SSA_NAMEs in multiple partitions, so that assign_parms will choose
+   the default def, if it exists, or create new RTL to hold the unused
+   entry value.  If we are coalescing across variables, we want to
+   reset the location too, because a parm without a default def
+   (incoming value unused) might be coalesced with one with a default
+   def, and then assign_parms would copy both incoming values to the
+   same location, which might cause the wrong value to survive.  */
+static void
+maybe_reset_rtl_for_parm (tree parm)
+{
+  gcc_assert (TREE_CODE (parm) == PARM_DECL
+	      || TREE_CODE (parm) == RESULT_DECL);
+  if ((flag_tree_coalesce_vars
+       || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx))
+      && is_gimple_reg (parm))
+    SET_DECL_RTL (parm, NULL_RTX);
+}
+
 /* A subroutine of assign_parms.  Adjust DATA->STACK_RTL such that it's
    always valid and properly aligned.  */
 
 static void
-assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data)
+assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm,
+			      struct assign_parm_data_one *data)
 {
   rtx stack_parm = data->stack_parm;
 
+  /* If out-of-SSA assigned RTL to the parm default def, make sure we
+     don't use what we might have computed before.  */
+  rtx ssa_assigned = rtl_for_parm (all, parm);
+  if (ssa_assigned)
+    stack_parm = NULL;
+
   /* If we can't trust the parm stack slot to be aligned enough for its
      ultimate type, don't use that slot after entry.  We'll make another
      stack slot, if we need one.  */
-  if (stack_parm
-      && ((STRICT_ALIGNMENT
-	   && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm))
-	  || (data->nominal_type
-	      && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
-	      && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
+  else if (stack_parm
+	   && ((STRICT_ALIGNMENT
+		&& (GET_MODE_ALIGNMENT (data->nominal_mode)
+		    > MEM_ALIGN (stack_parm)))
+	       || (data->nominal_type
+		   && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
+		   && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
     stack_parm = NULL;
 
   /* If parm was passed in memory, and we need to convert it on entry,
@@ -2882,11 +2957,16 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
 
   size = int_size_in_bytes (data->passed_type);
   size_stored = CEIL_ROUND (size, UNITS_PER_WORD);
+
   if (stack_parm == 0)
     {
       DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
-      stack_parm = assign_stack_local (BLKmode, size_stored,
-				       DECL_ALIGN (parm));
+      stack_parm = rtl_for_parm (all, parm);
+      if (!stack_parm)
+	stack_parm = assign_stack_local (BLKmode, size_stored,
+					 DECL_ALIGN (parm));
+      else
+	stack_parm = copy_rtx (stack_parm);
       if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
 	PUT_MODE (stack_parm, GET_MODE (entry_parm));
       set_mem_attributes (stack_parm, parm, 1);
@@ -3027,10 +3107,19 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
     = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
 			     TREE_TYPE (current_function_decl), 2);
 
-  parmreg = gen_reg_rtx (promoted_nominal_mode);
+  rtx from_expand = rtl_for_parm (all, parm);
 
-  if (!DECL_ARTIFICIAL (parm))
-    mark_user_reg (parmreg);
+  if (from_expand && !data->passed_pointer)
+    {
+      parmreg = from_expand;
+      gcc_assert (GET_MODE (parmreg) == promoted_nominal_mode);
+    }
+  else
+    {
+      parmreg = gen_reg_rtx (promoted_nominal_mode);
+      if (!DECL_ARTIFICIAL (parm))
+	mark_user_reg (parmreg);
+    }
 
   /* If this was an item that we received a pointer to,
      set DECL_RTL appropriately.  */
@@ -3049,6 +3138,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
      assign_parm_find_data_types and expand_expr_real_1.  */
 
   equiv_stack_parm = data->stack_parm;
+  if (!equiv_stack_parm)
+    equiv_stack_parm = data->entry_parm;
   validated_mem = validize_mem (copy_rtx (data->entry_parm));
 
   need_conversion = (data->nominal_mode != data->passed_mode
@@ -3189,11 +3280,17 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 
   /* If we were passed a pointer but the actual value can safely live
      in a register, retrieve it and use it directly.  */
-  if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode)
+  if (data->passed_pointer
+      && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode))
     {
       /* We can't use nominal_mode, because it will have been set to
 	 Pmode above.  We must use the actual mode of the parm.  */
-      if (use_register_for_decl (parm))
+      if (from_expand)
+	{
+	  parmreg = from_expand;
+	  gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm)));
+	}
+      else if (use_register_for_decl (parm))
 	{
 	  parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm)));
 	  mark_user_reg (parmreg);
@@ -3233,7 +3330,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 
       /* STACK_PARM is the pointer, not the parm, and PARMREG is
 	 now the parm.  */
-      data->stack_parm = NULL;
+      data->stack_parm = equiv_stack_parm = NULL;
     }
 
   /* Mark the register as eliminable if we did no conversion and it was
@@ -3243,11 +3340,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
      make here would screw up life analysis for it.  */
   if (data->nominal_mode == data->passed_mode
       && !did_conversion
-      && data->stack_parm != 0
-      && MEM_P (data->stack_parm)
+      && equiv_stack_parm != 0
+      && MEM_P (equiv_stack_parm)
       && data->locate.offset.var == 0
       && reg_mentioned_p (virtual_incoming_args_rtx,
-			  XEXP (data->stack_parm, 0)))
+			  XEXP (equiv_stack_parm, 0)))
     {
       rtx_insn *linsn = get_last_insn ();
       rtx_insn *sinsn;
@@ -3260,8 +3357,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 	    = GET_MODE_INNER (GET_MODE (parmreg));
 	  int regnor = REGNO (XEXP (parmreg, 0));
 	  int regnoi = REGNO (XEXP (parmreg, 1));
-	  rtx stackr = adjust_address_nv (data->stack_parm, submode, 0);
-	  rtx stacki = adjust_address_nv (data->stack_parm, submode,
+	  rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0);
+	  rtx stacki = adjust_address_nv (equiv_stack_parm, submode,
 					  GET_MODE_SIZE (submode));
 
 	  /* Scan backwards for the set of the real and
@@ -3334,6 +3431,13 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
 
       if (data->stack_parm == 0)
 	{
+	  rtx x = data->stack_parm = rtl_for_parm (all, parm);
+	  if (x)
+	    gcc_assert (GET_MODE (x) == GET_MODE (data->entry_parm));
+	}
+
+      if (data->stack_parm == 0)
+	{
 	  int align = STACK_SLOT_ALIGNMENT (data->passed_type,
 					    GET_MODE (data->entry_parm),
 					    TYPE_ALIGN (data->passed_type));
@@ -3592,6 +3696,8 @@ assign_parms (tree fndecl)
 	  DECL_INCOMING_RTL (parm) = DECL_RTL (parm);
 	  continue;
 	}
+      else
+	maybe_reset_rtl_for_parm (parm);
 
       /* Estimate stack alignment from parameter alignment.  */
       if (SUPPORTS_STACK_ALIGNMENT)
@@ -3641,7 +3747,9 @@ assign_parms (tree fndecl)
       else
 	set_decl_incoming_rtl (parm, data.entry_parm, false);
 
-      /* Boudns should be loaded in the particular order to
+      assign_parm_adjust_stack_rtl (&all, parm, &data);
+
+      /* Bounds should be loaded in the particular order to
 	 have registers allocated correctly.  Collect info about
 	 input bounds and load them later.  */
       if (POINTER_BOUNDS_TYPE_P (data.passed_type))
@@ -3658,11 +3766,10 @@ assign_parms (tree fndecl)
 	}
       else
 	{
-	  assign_parm_adjust_stack_rtl (&data);
-
 	  if (assign_parm_setup_block_p (&data))
 	    assign_parm_setup_block (&all, parm, &data);
-	  else if (data.passed_pointer || use_register_for_decl (parm))
+	  else if (data.passed_pointer
+		   || use_register_for_parm_decl (&all, parm))
 	    assign_parm_setup_reg (&all, parm, &data);
 	  else
 	    assign_parm_setup_stack (&all, parm, &data);
@@ -5001,7 +5108,9 @@ expand_function_start (tree subr)
      before any library calls that assign parms might generate.  */
 
   /* Decide whether to return the value in memory or in a register.  */
-  if (aggregate_value_p (DECL_RESULT (subr), subr))
+  tree res = DECL_RESULT (subr);
+  maybe_reset_rtl_for_parm (res);
+  if (aggregate_value_p (res, subr))
     {
       /* Returning something that won't go in a register.  */
       rtx value_address = 0;
@@ -5009,7 +5118,7 @@ expand_function_start (tree subr)
 #ifdef PCC_STATIC_STRUCT_RETURN
       if (cfun->returns_pcc_struct)
 	{
-	  int size = int_size_in_bytes (TREE_TYPE (DECL_RESULT (subr)));
+	  int size = int_size_in_bytes (TREE_TYPE (res));
 	  value_address = assemble_static_space (size);
 	}
       else
@@ -5021,36 +5130,45 @@ expand_function_start (tree subr)
 	     it.  */
 	  if (sv)
 	    {
-	      value_address = gen_reg_rtx (Pmode);
+	      if (DECL_BY_REFERENCE (res))
+		value_address = get_rtl_for_parm_ssa_default_def (res);
+	      if (!value_address)
+		value_address = gen_reg_rtx (Pmode);
 	      emit_move_insn (value_address, sv);
 	    }
 	}
       if (value_address)
 	{
 	  rtx x = value_address;
-	  if (!DECL_BY_REFERENCE (DECL_RESULT (subr)))
+	  if (!DECL_BY_REFERENCE (res))
 	    {
-	      x = gen_rtx_MEM (DECL_MODE (DECL_RESULT (subr)), x);
-	      set_mem_attributes (x, DECL_RESULT (subr), 1);
+	      x = get_rtl_for_parm_ssa_default_def (res);
+	      if (!x)
+		{
+		  x = gen_rtx_MEM (DECL_MODE (res), value_address);
+		  set_mem_attributes (x, res, 1);
+		}
 	    }
-	  SET_DECL_RTL (DECL_RESULT (subr), x);
+	  SET_DECL_RTL (res, x);
 	}
     }
-  else if (DECL_MODE (DECL_RESULT (subr)) == VOIDmode)
+  else if (DECL_MODE (res) == VOIDmode)
     /* If return mode is void, this decl rtl should not be used.  */
-    SET_DECL_RTL (DECL_RESULT (subr), NULL_RTX);
+    SET_DECL_RTL (res, NULL_RTX);
   else
     {
       /* Compute the return values into a pseudo reg, which we will copy
 	 into the true return register after the cleanups are done.  */
-      tree return_type = TREE_TYPE (DECL_RESULT (subr));
-      if (TYPE_MODE (return_type) != BLKmode
-	  && targetm.calls.return_in_msb (return_type))
+      tree return_type = TREE_TYPE (res);
+      rtx x = get_rtl_for_parm_ssa_default_def (res);
+      if (x)
+	/* Use it.  */;
+      else if (TYPE_MODE (return_type) != BLKmode
+	       && targetm.calls.return_in_msb (return_type))
 	/* expand_function_end will insert the appropriate padding in
 	   this case.  Use the return value's natural (unpadded) mode
 	   within the function proper.  */
-	SET_DECL_RTL (DECL_RESULT (subr),
-		      gen_reg_rtx (TYPE_MODE (return_type)));
+	x = gen_reg_rtx (TYPE_MODE (return_type));
       else
 	{
 	  /* In order to figure out what mode to use for the pseudo, we
@@ -5061,25 +5179,26 @@ expand_function_start (tree subr)
 	  /* Structures that are returned in registers are not
 	     aggregate_value_p, so we may see a PARALLEL or a REG.  */
 	  if (REG_P (hard_reg))
-	    SET_DECL_RTL (DECL_RESULT (subr),
-			  gen_reg_rtx (GET_MODE (hard_reg)));
+	    x = gen_reg_rtx (GET_MODE (hard_reg));
 	  else
 	    {
 	      gcc_assert (GET_CODE (hard_reg) == PARALLEL);
-	      SET_DECL_RTL (DECL_RESULT (subr), gen_group_rtx (hard_reg));
+	      x = gen_group_rtx (hard_reg);
 	    }
 	}
 
+      SET_DECL_RTL (res, x);
+
       /* Set DECL_REGISTER flag so that expand_function_end will copy the
 	 result to the real return register(s).  */
-      DECL_REGISTER (DECL_RESULT (subr)) = 1;
+      DECL_REGISTER (res) = 1;
 
       if (chkp_function_instrumented_p (current_function_decl))
 	{
-	  tree return_type = TREE_TYPE (DECL_RESULT (subr));
+	  tree return_type = TREE_TYPE (res);
 	  rtx bounds = targetm.calls.chkp_function_value_bounds (return_type,
 								 subr, 1);
-	  SET_DECL_BOUNDS_RTL (DECL_RESULT (subr), bounds);
+	  SET_DECL_BOUNDS_RTL (res, bounds);
 	}
     }
 
@@ -5093,7 +5212,9 @@ expand_function_start (tree subr)
       tree parm = cfun->static_chain_decl;
       rtx local, chain, insn;
 
-      local = gen_reg_rtx (Pmode);
+      local = get_rtl_for_parm_ssa_default_def (parm);
+      if (!local)
+	local = gen_reg_rtx (Pmode);
       chain = targetm.calls.static_chain (current_function_decl, true);
 
       set_decl_incoming_rtl (parm, chain, false);
diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
index efc93b7..e29f300 100644
--- a/gcc/gimple-expr.c
+++ b/gcc/gimple-expr.c
@@ -391,45 +391,6 @@ copy_var_decl (tree var, tree name, tree type)
   return copy;
 }
 
-/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
-   coalescing together, false otherwise.
-
-   This must stay consistent with var_map_base_init in tree-ssa-live.c.  */
-
-bool
-gimple_can_coalesce_p (tree name1, tree name2)
-{
-  /* First check the SSA_NAME's associated DECL.  We only want to
-     coalesce if they have the same DECL or both have no associated DECL.  */
-  tree var1 = SSA_NAME_VAR (name1);
-  tree var2 = SSA_NAME_VAR (name2);
-  var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
-  var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
-  if (var1 != var2)
-    return false;
-
-  /* Now check the types.  If the types are the same, then we should
-     try to coalesce V1 and V2.  */
-  tree t1 = TREE_TYPE (name1);
-  tree t2 = TREE_TYPE (name2);
-  if (t1 == t2)
-    return true;
-
-  /* If the types are not the same, check for a canonical type match.  This
-     (for example) allows coalescing when the types are fundamentally the
-     same, but just have different names. 
-
-     Note pointer types with different address spaces may have the same
-     canonical type.  Those are rejected for coalescing by the
-     types_compatible_p check.  */
-  if (TYPE_CANONICAL (t1)
-      && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
-      && types_compatible_p (t1, t2))
-    return true;
-
-  return false;
-}
-
 /* Strip off a legitimate source ending from the input string NAME of
    length LEN.  Rather than having to know the names used by all of
    our front ends, we strip off an ending of a period followed by
diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
index a50a90a..b492137 100644
--- a/gcc/gimple-expr.h
+++ b/gcc/gimple-expr.h
@@ -28,7 +28,6 @@ extern gimple_seq gimple_body (tree);
 extern bool gimple_has_body_p (tree);
 extern const char *gimple_decl_printable_name (tree, int);
 extern tree copy_var_decl (tree, tree, tree);
-extern bool gimple_can_coalesce_p (tree, tree);
 extern tree create_tmp_var_name (const char *);
 extern tree create_tmp_var_raw (tree, const char * = NULL);
 extern tree create_tmp_var (tree, const char * = NULL);
@@ -56,6 +55,10 @@ extern bool is_gimple_mem_ref_addr (tree);
 extern void mark_addressable (tree);
 extern bool is_gimple_reg_rhs (tree);
 
+/* Defined in tree-ssa-coalesce.c.  */
+extern bool gimple_can_coalesce_p (tree, tree);
+
+
 /* Return true if a conversion from either type of TYPE1 and TYPE2
    to the other is not required.  Otherwise return false.  */
 
diff --git a/gcc/opts.c b/gcc/opts.c
index 39c190d..7e41b1f 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -448,12 +448,12 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_1_PLUS, OPT_fsplit_wide_types, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_ccp, NULL, 1 },
     { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_bit_ccp, NULL, 1 },
+    { OPT_LEVELS_1_PLUS, OPT_ftree_coalesce_vars, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_dce, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_dominator_opts, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 },
     { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 },
-    { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 },
diff --git a/gcc/passes.def b/gcc/passes.def
index ffa63b5..4548b20 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -77,7 +77,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_all_early_optimizations);
       PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
 	  NEXT_PASS (pass_remove_cgraph_callee_edges);
-	  NEXT_PASS (pass_rename_ssa_copies);
 	  NEXT_PASS (pass_object_sizes);
 	  NEXT_PASS (pass_ccp);
 	  /* After CCP we rewrite no longer addressed locals into SSA
@@ -154,7 +153,6 @@ along with GCC; see the file COPYING3.  If not see
       /* Initial scalar cleanups before alias computation.
 	 They ensure memory accesses are not indirect wherever possible.  */
       NEXT_PASS (pass_strip_predict_hints);
-      NEXT_PASS (pass_rename_ssa_copies);
       NEXT_PASS (pass_ccp);
       /* After CCP we rewrite no longer addressed locals into SSA
 	 form if possible.  */
@@ -182,7 +180,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_stdarg);
       NEXT_PASS (pass_lower_complex);
       NEXT_PASS (pass_sra);
-      NEXT_PASS (pass_rename_ssa_copies);
       /* The dom pass will also resolve all __builtin_constant_p calls
          that are still there to 0.  This has to be done after some
 	 propagations have already run, but before some more dead code
@@ -291,7 +288,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_fold_builtins);
       NEXT_PASS (pass_optimize_widening_mul);
       NEXT_PASS (pass_tail_calls);
-      NEXT_PASS (pass_rename_ssa_copies);
       /* FIXME: If DCE is not run before checking for uninitialized uses,
 	 we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
 	 However, this also causes us to misdiagnose cases that should be
@@ -326,7 +322,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_dce);
       NEXT_PASS (pass_asan);
       NEXT_PASS (pass_tsan);
-      NEXT_PASS (pass_rename_ssa_copies);
       /* ???  We do want some kind of loop invariant motion, but we possibly
          need to adjust LIM to be more friendly towards preserving accurate
 	 debug information here.  */
diff --git a/gcc/testsuite/gcc.dg/guality/pr54200.c b/gcc/testsuite/gcc.dg/guality/pr54200.c
index 9b17187..e1e7293 100644
--- a/gcc/testsuite/gcc.dg/guality/pr54200.c
+++ b/gcc/testsuite/gcc.dg/guality/pr54200.c
@@ -1,6 +1,6 @@
 /* PR tree-optimization/54200 */
 /* { dg-do run } */
-/* { dg-options "-g -fno-var-tracking-assignments" } */
+/* { dg-options "-g -fno-var-tracking-assignments -fno-tree-coalesce-vars" } */
 
 int o __attribute__((used));
 
diff --git a/gcc/testsuite/gcc.dg/ssp-1.c b/gcc/testsuite/gcc.dg/ssp-1.c
index 5467f4d..db69332 100644
--- a/gcc/testsuite/gcc.dg/ssp-1.c
+++ b/gcc/testsuite/gcc.dg/ssp-1.c
@@ -12,7 +12,7 @@ __stack_chk_fail (void)
 
 int main ()
 {
-  int i;
+  register int i;
   char foo[255];
 
   // smash stack
diff --git a/gcc/testsuite/gcc.dg/ssp-2.c b/gcc/testsuite/gcc.dg/ssp-2.c
index 9a7ac32..752fe53 100644
--- a/gcc/testsuite/gcc.dg/ssp-2.c
+++ b/gcc/testsuite/gcc.dg/ssp-2.c
@@ -14,7 +14,7 @@ __stack_chk_fail (void)
 void
 overflow()
 {
-  int i = 0;
+  register int i = 0;
   char foo[30];
 
   /* Overflow buffer.  */
diff --git a/gcc/testsuite/gcc.dg/torture/parm-coalesce.c b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
new file mode 100644
index 0000000..dbd81c1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+
+/* Make sure we don't coalesce both incoming parms, one whose incoming
+   value is unused, to the same location, so as to overwrite one of
+   them with the incoming value of the other.  */
+
+int __attribute__((noinline, noclone))
+foo (int i, int j)
+{
+  j = i; /* The incoming value for J is unused.  */
+  i = 2;
+  if (j)
+    j++;
+  j += i + 1;
+  return j;
+}
+
+/* Same as foo, but with swapped parameters.  */
+int __attribute__((noinline, noclone))
+bar (int j, int i)
+{
+  j = i; /* The incoming value for J is unused.  */
+  i = 2;
+  if (j)
+    j++;
+  j += i + 1;
+  return j;
+}
+
+int
+main (void)
+{
+  if (foo (0, 1) != 3)
+    abort ();
+  if (bar (1, 0) != 3)
+    abort ();
+  return 0;
+}
diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
index e6310cd..e62f36b 100644
--- a/gcc/tree-outof-ssa.c
+++ b/gcc/tree-outof-ssa.c
@@ -330,12 +330,13 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
 
   start_sequence ();
 
-  var = SSA_NAME_VAR (partition_to_var (SA.map, dest));
+  tree name = partition_to_var (SA.map, dest);
+  var = SSA_NAME_VAR (name);
   src_mode = TYPE_MODE (TREE_TYPE (src));
   dest_mode = GET_MODE (dest_rtx);
-  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var)));
+  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var ? var : name)));
   gcc_assert (!REG_P (dest_rtx)
-	      || dest_mode == promote_decl_mode (var, &unsignedp));
+	      || dest_mode == promote_ssa_mode (name, &unsignedp));
 
   if (src_mode != dest_mode)
     {
@@ -714,12 +715,12 @@ static rtx
 get_temp_reg (tree name)
 {
   tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
-  tree type = TREE_TYPE (var);
+  tree type = var ? TREE_TYPE (var) : TREE_TYPE (name);
   int unsignedp;
-  machine_mode reg_mode = promote_decl_mode (var, &unsignedp);
+  machine_mode reg_mode = promote_ssa_mode (name, &unsignedp);
   rtx x = gen_reg_rtx (reg_mode);
   if (POINTER_TYPE_P (type))
-    mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var))));
+    mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type)));
   return x;
 }
 
@@ -1019,7 +1020,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
 
   /* Return to viewing the variable list as just all reference variables after
      coalescing has been performed.  */
-  partition_view_normal (map, false);
+  partition_view_normal (map);
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     {
diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index eeac5a4..c2cdeef0 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -58,6 +58,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssanames.h"
 #include "tree-ssa-live.h"
 #include "tree-ssa-coalesce.h"
+#include "explow.h"
 #include "diagnostic-core.h"
 
 
@@ -832,6 +833,16 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
   basic_block bb;
   ssa_op_iter iter;
   live_track_p live;
+  basic_block entry;
+
+  /* If inter-variable coalescing is enabled, we may attempt to
+     coalesce SSA names from different base variables, including
+     different parameters, so we have to make sure default defs live
+     at the entry block conflict with each other.  */
+  if (flag_tree_coalesce_vars)
+    entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+  else
+    entry = NULL;
 
   map = live_var_map (liveinfo);
   graph = ssa_conflicts_new (num_var_partitions (map));
@@ -890,6 +901,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
 	    live_track_process_def (live, result, graph);
 	}
 
+      /* Pretend there are defs for params' default defs at the start
+	 of the (post-)entry block.  */
+      if (bb == entry)
+	{
+	  unsigned base;
+	  bitmap_iterator bi;
+	  EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi)
+	    {
+	      bitmap_iterator bi2;
+	      unsigned part;
+	      EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base],
+					0, part, bi2)
+		{
+		  tree var = partition_to_var (map, part);
+		  if (!SSA_NAME_VAR (var)
+		      || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL
+			  && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL)
+		      || !SSA_NAME_IS_DEFAULT_DEF (var))
+		    continue;
+		  live_track_process_def (live, var, graph);
+		}
+	    }
+	}
+
       live_track_clear_base_vars (live);
     }
 
@@ -1158,6 +1193,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
     {
       var1 = partition_to_var (map, p1);
       var2 = partition_to_var (map, p2);
+
       z = var_union (map, var1, var2);
       if (z == NO_PARTITION)
 	{
@@ -1175,6 +1211,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
 
       if (debug)
 	fprintf (debug, ": Success -> %d\n", z);
+
       return true;
     }
 
@@ -1272,6 +1309,330 @@ ssa_name_var_hash::equal (const tree_node *n1, const tree_node *n2)
 }
 
 
+/* Output partition map MAP with coalescing plan PART to file F.  */
+
+void
+dump_part_var_map (FILE *f, partition part, var_map map)
+{
+  int t;
+  unsigned x, y;
+  int p;
+
+  fprintf (f, "\nCoalescible Partition map \n\n");
+
+  for (x = 0; x < map->num_partitions; x++)
+    {
+      if (map->view_to_partition != NULL)
+	p = map->view_to_partition[x];
+      else
+	p = x;
+
+      if (ssa_name (p) == NULL_TREE
+	  || virtual_operand_p (ssa_name (p)))
+        continue;
+
+      t = 0;
+      for (y = 1; y < num_ssa_names; y++)
+        {
+	  tree var = version_to_var (map, y);
+	  if (!var)
+	    continue;
+	  int q = var_to_partition (map, var);
+	  p = partition_find (part, q);
+	  gcc_assert (map->partition_to_base_index[q]
+		      == map->partition_to_base_index[p]);
+
+	  if (p == (int) x)
+	    {
+	      if (t++ == 0)
+	        {
+		  fprintf (f, "Partition %d, base %d (", x,
+			   map->partition_to_base_index[q]);
+		  print_generic_expr (f, partition_to_var (map, q), TDF_SLIM);
+		  fprintf (f, " - ");
+		}
+	      fprintf (f, "%d ", y);
+	    }
+	}
+      if (t != 0)
+	fprintf (f, ")\n");
+    }
+  fprintf (f, "\n");
+}
+
+/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
+   coalescing together, false otherwise.
+
+   This must stay consistent with var_map_base_init in tree-ssa-live.c.  */
+
+bool
+gimple_can_coalesce_p (tree name1, tree name2)
+{
+  /* First check the SSA_NAME's associated DECL.  Without
+     optimization, we only want to coalesce if they have the same DECL
+     or both have no associated DECL.  */
+  tree var1 = SSA_NAME_VAR (name1);
+  tree var2 = SSA_NAME_VAR (name2);
+  var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
+  var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
+  if (var1 != var2 && !flag_tree_coalesce_vars)
+    return false;
+
+  /* Now check the types.  If the types are the same, then we should
+     try to coalesce V1 and V2.  */
+  tree t1 = TREE_TYPE (name1);
+  tree t2 = TREE_TYPE (name2);
+  if (t1 == t2)
+    {
+    check_modes:
+      /* If the base variables are the same, we're good: none of the
+	 other tests below could possibly fail.  */
+      var1 = SSA_NAME_VAR (name1);
+      var2 = SSA_NAME_VAR (name2);
+      if (var1 == var2)
+	return true;
+
+      /* We don't want to coalesce two SSA names if one of the base
+	 variables is supposed to be a register while the other is
+	 supposed to be on the stack.  Anonymous SSA names take
+	 registers, but when not optimizing, user variables should go
+	 on the stack, so coalescing them with the anonymous variable
+	 as the partition leader would end up assigning the user
+	 variable to a register.  Don't do that!  */
+      bool reg1 = !var1 || use_register_for_decl (var1);
+      bool reg2 = !var2 || use_register_for_decl (var2);
+      if (reg1 != reg2)
+	return false;
+
+      /* Check that the promoted modes are the same.  We don't want to
+	 coalesce if the promoted modes would be different.  Only
+	 PARM_DECLs and RESULT_DECLs have different promotion rules,
+	 so skip the test if both are variables or anonymous
+	 SSA_NAMEs.  */
+      return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
+	|| promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL);
+    }
+
+  /* If the types are not the same, check for a canonical type match.  This
+     (for example) allows coalescing when the types are fundamentally the
+     same, but just have different names.
+
+     Note pointer types with different address spaces may have the same
+     canonical type.  Those are rejected for coalescing by the
+     types_compatible_p check.  */
+  if (TYPE_CANONICAL (t1)
+      && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
+      && types_compatible_p (t1, t2))
+    goto check_modes;
+
+  return false;
+}
+
+/* Fill in MAP's partition_to_base_index, with one index for each
+   partition of SSA names USED_IN_COPIES and related by CL coalesce
+   possibilities.  This must match gimple_can_coalesce_p in the
+   optimized case.  */
+
+static void
+compute_optimized_partition_bases (var_map map, bitmap used_in_copies,
+				   coalesce_list_p cl)
+{
+  int parts = num_var_partitions (map);
+  partition tentative = partition_new (parts);
+
+  /* Partition the SSA versions so that, for each coalescible
+     pair, both of its members are in the same partition in
+     TENTATIVE.  */
+  gcc_assert (!cl->sorted);
+  coalesce_pair_p node;
+  coalesce_iterator_type ppi;
+  FOR_EACH_PARTITION_PAIR (node, ppi, cl)
+    {
+      tree v1 = ssa_name (node->first_element);
+      int p1 = partition_find (tentative, var_to_partition (map, v1));
+      tree v2 = ssa_name (node->second_element);
+      int p2 = partition_find (tentative, var_to_partition (map, v2));
+
+      if (p1 == p2)
+	continue;
+
+      partition_union (tentative, p1, p2);
+    }
+
+  /* We have to deal with cost one pairs too.  */
+  for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next)
+    {
+      tree v1 = ssa_name (co->first_element);
+      int p1 = partition_find (tentative, var_to_partition (map, v1));
+      tree v2 = ssa_name (co->second_element);
+      int p2 = partition_find (tentative, var_to_partition (map, v2));
+
+      if (p1 == p2)
+	continue;
+
+      partition_union (tentative, p1, p2);
+    }
+
+  /* And also with abnormal edges.  */
+  basic_block bb;
+  edge e;
+  edge_iterator ei;
+  FOR_EACH_BB_FN (bb, cfun)
+    {
+      FOR_EACH_EDGE (e, ei, bb->preds)
+	if (e->flags & EDGE_ABNORMAL)
+	  {
+	    gphi_iterator gsi;
+	    for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
+		 gsi_next (&gsi))
+	      {
+		gphi *phi = gsi.phi ();
+		tree arg = PHI_ARG_DEF (phi, e->dest_idx);
+		if (SSA_NAME_IS_DEFAULT_DEF (arg)
+		    && (!SSA_NAME_VAR (arg)
+			|| TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL))
+		  continue;
+
+		tree res = PHI_RESULT (phi);
+
+		int p1 = partition_find (tentative, var_to_partition (map, res));
+		int p2 = partition_find (tentative, var_to_partition (map, arg));
+
+		if (p1 == p2)
+		  continue;
+
+		partition_union (tentative, p1, p2);
+	      }
+	  }
+    }
+
+  map->partition_to_base_index = XCNEWVEC (int, parts);
+  auto_vec<unsigned int> index_map (parts);
+  if (parts)
+    index_map.quick_grow (parts);
+
+  const unsigned no_part = -1;
+  unsigned count = parts;
+  while (count)
+    index_map[--count] = no_part;
+
+  /* Initialize MAP's mapping from partition to base index, using
+     as base indices an enumeration of the TENTATIVE partitions in
+     which each SSA version ended up, so that we compute conflicts
+     between all SSA versions that ended up in the same potential
+     coalesce partition.  */
+  bitmap_iterator bi;
+  unsigned i;
+  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
+    {
+      int pidx = var_to_partition (map, ssa_name (i));
+      int base = partition_find (tentative, pidx);
+      if (index_map[base] != no_part)
+	continue;
+      index_map[base] = count++;
+    }
+
+  map->num_basevars = count;
+
+  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
+    {
+      int pidx = var_to_partition (map, ssa_name (i));
+      int base = partition_find (tentative, pidx);
+      gcc_assert (index_map[base] < count);
+      map->partition_to_base_index[pidx] = index_map[base];
+    }
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    dump_part_var_map (dump_file, tentative, map);
+
+  partition_delete (tentative);
+}
+
+/* Hashtable helpers.  */
+
+struct tree_int_map_hasher : typed_noop_remove <tree_int_map>
+{
+  typedef tree_int_map *value_type;
+  typedef tree_int_map *compare_type;
+  static inline hashval_t hash (const tree_int_map *);
+  static inline bool equal (const tree_int_map *, const tree_int_map *);
+};
+
+inline hashval_t
+tree_int_map_hasher::hash (const tree_int_map *v)
+{
+  return tree_map_base_hash (v);
+}
+
+inline bool
+tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
+{
+  return tree_int_map_eq (v, c);
+}
+
+/* This routine will initialize the basevar fields of MAP with base
+   names.  Partitions will share the same base if they have the same
+   SSA_NAME_VAR, or, being anonymous variables, the same type.  This
+   must match gimple_can_coalesce_p in the non-optimized case.  */
+
+static void
+compute_samebase_partition_bases (var_map map)
+{
+  int x, num_part;
+  tree var;
+  struct tree_int_map *m, *mapstorage;
+
+  num_part = num_var_partitions (map);
+  hash_table<tree_int_map_hasher> tree_to_index (num_part);
+  /* We can have at most num_part entries in the hash tables, so it's
+     enough to allocate so many map elements once, saving some malloc
+     calls.  */
+  mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
+
+  /* If a base table already exists, clear it, otherwise create it.  */
+  free (map->partition_to_base_index);
+  map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
+
+  /* Build the base variable list, and point partitions at their bases.  */
+  for (x = 0; x < num_part; x++)
+    {
+      struct tree_int_map **slot;
+      unsigned baseindex;
+      var = partition_to_var (map, x);
+      if (SSA_NAME_VAR (var)
+	  && (!VAR_P (SSA_NAME_VAR (var))
+	      || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
+	m->base.from = SSA_NAME_VAR (var);
+      else
+	/* This restricts what anonymous SSA names we can coalesce
+	   as it restricts the sets we compute conflicts for.
+	   Using TREE_TYPE to generate sets is the easiest as
+	   type equivalency also holds for SSA names with the same
+	   underlying decl.
+
+	   Check gimple_can_coalesce_p when changing this code.  */
+	m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
+			? TYPE_CANONICAL (TREE_TYPE (var))
+			: TREE_TYPE (var));
+      /* If base variable hasn't been seen, set it up.  */
+      slot = tree_to_index.find_slot (m, INSERT);
+      if (!*slot)
+	{
+	  baseindex = m - mapstorage;
+	  m->to = baseindex;
+	  *slot = m;
+	  m++;
+	}
+      else
+	baseindex = (*slot)->to;
+      map->partition_to_base_index[x] = baseindex;
+    }
+
+  map->num_basevars = m - mapstorage;
+
+  free (mapstorage);
+}
+
 /* Reduce the number of copies by coalescing variables in the function.  Return
    a partition map with the resulting coalesces.  */
 
@@ -1288,9 +1649,10 @@ coalesce_ssa_name (void)
   cl = create_coalesce_list ();
   map = create_outofssa_var_map (cl, used_in_copies);
 
-  /* If optimization is disabled, we need to coalesce all the names originating
-     from the same SSA_NAME_VAR so debug info remains undisturbed.  */
-  if (!optimize)
+  /* If this optimization is disabled, we need to coalesce all the
+     names originating from the same SSA_NAME_VAR so debug info
+     remains undisturbed.  */
+  if (!flag_tree_coalesce_vars)
     {
       hash_table<ssa_name_var_hash> ssa_name_hash (10);
 
@@ -1331,8 +1693,13 @@ coalesce_ssa_name (void)
   if (dump_file && (dump_flags & TDF_DETAILS))
     dump_var_map (dump_file, map);
 
-  /* Don't calculate live ranges for variables not in the coalesce list.  */
-  partition_view_bitmap (map, used_in_copies, true);
+  partition_view_bitmap (map, used_in_copies);
+
+  if (flag_tree_coalesce_vars)
+    compute_optimized_partition_bases (map, used_in_copies, cl);
+  else
+    compute_samebase_partition_bases (map);
+
   BITMAP_FREE (used_in_copies);
 
   if (num_var_partitions (map) < 1)
@@ -1371,8 +1738,7 @@ coalesce_ssa_name (void)
 
   /* Now coalesce everything in the list.  */
   coalesce_partitions (map, graph, cl,
-		       ((dump_flags & TDF_DETAILS) ? dump_file
-						   : NULL));
+		       ((dump_flags & TDF_DETAILS) ? dump_file : NULL));
 
   delete_coalesce_list (cl);
   ssa_conflicts_delete (graph);
diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c
deleted file mode 100644
index f3cb56e..0000000
--- a/gcc/tree-ssa-copyrename.c
+++ /dev/null
@@ -1,499 +0,0 @@
-/* Rename SSA copies.
-   Copyright (C) 2004-2015 Free Software Foundation, Inc.
-   Contributed by Andrew MacLeod <amacleod@redhat.com>
-
-This file is part of GCC.
-
-GCC is free software; you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation; either version 3, or (at your option)
-any later version.
-
-GCC is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-GNU General Public License for more details.
-
-You should have received a copy of the GNU General Public License
-along with GCC; see the file COPYING3.  If not see
-<http://www.gnu.org/licenses/>.  */
-
-#include "config.h"
-#include "system.h"
-#include "coretypes.h"
-#include "tm.h"
-#include "hash-set.h"
-#include "machmode.h"
-#include "vec.h"
-#include "double-int.h"
-#include "input.h"
-#include "alias.h"
-#include "symtab.h"
-#include "wide-int.h"
-#include "inchash.h"
-#include "tree.h"
-#include "fold-const.h"
-#include "predict.h"
-#include "hard-reg-set.h"
-#include "function.h"
-#include "dominance.h"
-#include "cfg.h"
-#include "basic-block.h"
-#include "tree-ssa-alias.h"
-#include "internal-fn.h"
-#include "gimple-expr.h"
-#include "is-a.h"
-#include "gimple.h"
-#include "gimple-iterator.h"
-#include "flags.h"
-#include "tree-pretty-print.h"
-#include "bitmap.h"
-#include "gimple-ssa.h"
-#include "stringpool.h"
-#include "tree-ssanames.h"
-#include "hashtab.h"
-#include "rtl.h"
-#include "statistics.h"
-#include "real.h"
-#include "fixed-value.h"
-#include "insn-config.h"
-#include "expmed.h"
-#include "dojump.h"
-#include "explow.h"
-#include "calls.h"
-#include "emit-rtl.h"
-#include "varasm.h"
-#include "stmt.h"
-#include "expr.h"
-#include "tree-dfa.h"
-#include "tree-inline.h"
-#include "tree-ssa-live.h"
-#include "tree-pass.h"
-#include "langhooks.h"
-
-static struct
-{
-  /* Number of copies coalesced.  */
-  int coalesced;
-} stats;
-
-/* The following routines implement the SSA copy renaming phase.
-
-   This optimization looks for copies between 2 SSA_NAMES, either through a
-   direct copy, or an implicit one via a PHI node result and its arguments.
-
-   Each copy is examined to determine if it is possible to rename the base
-   variable of one of the operands to the same variable as the other operand.
-   i.e.
-   T.3_5 = <blah>
-   a_1 = T.3_5
-
-   If this copy couldn't be copy propagated, it could possibly remain in the
-   program throughout the optimization phases.   After SSA->normal, it would
-   become:
-
-   T.3 = <blah>
-   a = T.3
-
-   Since T.3_5 is distinct from all other SSA versions of T.3, there is no
-   fundamental reason why the base variable needs to be T.3, subject to
-   certain restrictions.  This optimization attempts to determine if we can
-   change the base variable on copies like this, and result in code such as:
-
-   a_5 = <blah>
-   a_1 = a_5
-
-   This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is
-   possible, the copy goes away completely. If it isn't possible, a new temp
-   will be created for a_5, and you will end up with the exact same code:
-
-   a.8 = <blah>
-   a = a.8
-
-   The other benefit of performing this optimization relates to what variables
-   are chosen in copies.  Gimplification of the program uses temporaries for
-   a lot of things. expressions like
-
-   a_1 = <blah>
-   <blah2> = a_1
-
-   get turned into
-
-   T.3_5 = <blah>
-   a_1 = T.3_5
-   <blah2> = a_1
-
-   Copy propagation is done in a forward direction, and if we can propagate
-   through the copy, we end up with:
-
-   T.3_5 = <blah>
-   <blah2> = T.3_5
-
-   The copy is gone, but so is all reference to the user variable 'a'. By
-   performing this optimization, we would see the sequence:
-
-   a_5 = <blah>
-   a_1 = a_5
-   <blah2> = a_1
-
-   which copy propagation would then turn into:
-
-   a_5 = <blah>
-   <blah2> = a_5
-
-   and so we still retain the user variable whenever possible.  */
-
-
-/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
-   Choose a representative for the partition, and send debug info to DEBUG.  */
-
-static void
-copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
-{
-  int p1, p2, p3;
-  tree root1, root2;
-  tree rep1, rep2;
-  bool ign1, ign2, abnorm;
-
-  gcc_assert (TREE_CODE (var1) == SSA_NAME);
-  gcc_assert (TREE_CODE (var2) == SSA_NAME);
-
-  register_ssa_partition (map, var1);
-  register_ssa_partition (map, var2);
-
-  p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
-  p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
-
-  if (debug)
-    {
-      fprintf (debug, "Try : ");
-      print_generic_expr (debug, var1, TDF_SLIM);
-      fprintf (debug, "(P%d) & ", p1);
-      print_generic_expr (debug, var2, TDF_SLIM);
-      fprintf (debug, "(P%d)", p2);
-    }
-
-  gcc_assert (p1 != NO_PARTITION);
-  gcc_assert (p2 != NO_PARTITION);
-
-  if (p1 == p2)
-    {
-      if (debug)
-	fprintf (debug, " : Already coalesced.\n");
-      return;
-    }
-
-  rep1 = partition_to_var (map, p1);
-  rep2 = partition_to_var (map, p2);
-  root1 = SSA_NAME_VAR (rep1);
-  root2 = SSA_NAME_VAR (rep2);
-  if (!root1 && !root2)
-    return;
-
-  /* Don't coalesce if one of the variables occurs in an abnormal PHI.  */
-  abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1)
-	    || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2));
-  if (abnorm)
-    {
-      if (debug)
-	fprintf (debug, " : Abnormal PHI barrier.  No coalesce.\n");
-      return;
-    }
-
-  /* Partitions already have the same root, simply merge them.  */
-  if (root1 == root2)
-    {
-      p1 = partition_union (map->var_partition, p1, p2);
-      if (debug)
-	fprintf (debug, " : Same root, coalesced --> P%d.\n", p1);
-      return;
-    }
-
-  /* Never attempt to coalesce 2 different parameters.  */
-  if ((root1 && TREE_CODE (root1) == PARM_DECL)
-      && (root2 && TREE_CODE (root2) == PARM_DECL))
-    {
-      if (debug)
-        fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n");
-      return;
-    }
-
-  if ((root1 && TREE_CODE (root1) == RESULT_DECL)
-      != (root2 && TREE_CODE (root2) == RESULT_DECL))
-    {
-      if (debug)
-        fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n");
-      return;
-    }
-
-  ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1));
-  ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2));
-
-  /* Refrain from coalescing user variables, if requested.  */
-  if (!ign1 && !ign2)
-    {
-      if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2))
-	ign2 = true;
-      else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1))
-	ign1 = true;
-      else if (flag_ssa_coalesce_vars != 2)
-	{
-	  if (debug)
-	    fprintf (debug, " : 2 different USER vars. No coalesce.\n");
-	  return;
-	}
-      else
-	ign2 = true;
-    }
-
-  /* If both values have default defs, we can't coalesce.  If only one has a
-     tag, make sure that variable is the new root partition.  */
-  if (root1 && ssa_default_def (cfun, root1))
-    {
-      if (root2 && ssa_default_def (cfun, root2))
-	{
-	  if (debug)
-	    fprintf (debug, " : 2 default defs. No coalesce.\n");
-	  return;
-	}
-      else
-        {
-	  ign2 = true;
-	  ign1 = false;
-	}
-    }
-  else if (root2 && ssa_default_def (cfun, root2))
-    {
-      ign1 = true;
-      ign2 = false;
-    }
-
-  /* Do not coalesce if we cannot assign a symbol to the partition.  */
-  if (!(!ign2 && root2)
-      && !(!ign1 && root1))
-    {
-      if (debug)
-	fprintf (debug, " : Choosen variable has no root.  No coalesce.\n");
-      return;
-    }
-
-  /* Don't coalesce if the new chosen root variable would be read-only.
-     If both ign1 && ign2, then the root var of the larger partition
-     wins, so reject in that case if any of the root vars is TREE_READONLY.
-     Otherwise reject only if the root var, on which replace_ssa_name_symbol
-     will be called below, is readonly.  */
-  if (((root1 && TREE_READONLY (root1)) && ign2)
-      || ((root2 && TREE_READONLY (root2)) && ign1))
-    {
-      if (debug)
-	fprintf (debug, " : Readonly variable.  No coalesce.\n");
-      return;
-    }
-
-  /* Don't coalesce if the two variables aren't type compatible .  */
-  if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2))
-      /* There is a disconnect between the middle-end type-system and
-         VRP, avoid coalescing enum types with different bounds.  */
-      || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE
-	   || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE)
-	  && TREE_TYPE (var1) != TREE_TYPE (var2)))
-    {
-      if (debug)
-	fprintf (debug, " : Incompatible types.  No coalesce.\n");
-      return;
-    }
-
-  /* Merge the two partitions.  */
-  p3 = partition_union (map->var_partition, p1, p2);
-
-  /* Set the root variable of the partition to the better choice, if there is
-     one.  */
-  if (!ign2 && root2)
-    replace_ssa_name_symbol (partition_to_var (map, p3), root2);
-  else if (!ign1 && root1)
-    replace_ssa_name_symbol (partition_to_var (map, p3), root1);
-  else
-    gcc_unreachable ();
-
-  if (debug)
-    {
-      fprintf (debug, " --> P%d ", p3);
-      print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)),
-			  TDF_SLIM);
-      fprintf (debug, "\n");
-    }
-}
-
-
-namespace {
-
-const pass_data pass_data_rename_ssa_copies =
-{
-  GIMPLE_PASS, /* type */
-  "copyrename", /* name */
-  OPTGROUP_NONE, /* optinfo_flags */
-  TV_TREE_COPY_RENAME, /* tv_id */
-  ( PROP_cfg | PROP_ssa ), /* properties_required */
-  0, /* properties_provided */
-  0, /* properties_destroyed */
-  0, /* todo_flags_start */
-  0, /* todo_flags_finish */
-};
-
-class pass_rename_ssa_copies : public gimple_opt_pass
-{
-public:
-  pass_rename_ssa_copies (gcc::context *ctxt)
-    : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt)
-  {}
-
-  /* opt_pass methods: */
-  opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); }
-  virtual bool gate (function *) { return flag_tree_copyrename != 0; }
-  virtual unsigned int execute (function *);
-
-}; // class pass_rename_ssa_copies
-
-/* This function will make a pass through the IL, and attempt to coalesce any
-   SSA versions which occur in PHI's or copies.  Coalescing is accomplished by
-   changing the underlying root variable of all coalesced version.  This will
-   then cause the SSA->normal pass to attempt to coalesce them all to the same
-   variable.  */
-
-unsigned int
-pass_rename_ssa_copies::execute (function *fun)
-{
-  var_map map;
-  basic_block bb;
-  tree var, part_var;
-  gimple stmt;
-  unsigned x;
-  FILE *debug;
-
-  memset (&stats, 0, sizeof (stats));
-
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    debug = dump_file;
-  else
-    debug = NULL;
-
-  map = init_var_map (num_ssa_names);
-
-  FOR_EACH_BB_FN (bb, fun)
-    {
-      /* Scan for real copies.  */
-      for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
-	   gsi_next (&gsi))
-	{
-	  stmt = gsi_stmt (gsi);
-	  if (gimple_assign_ssa_name_copy_p (stmt))
-	    {
-	      tree lhs = gimple_assign_lhs (stmt);
-	      tree rhs = gimple_assign_rhs1 (stmt);
-
-	      copy_rename_partition_coalesce (map, lhs, rhs, debug);
-	    }
-	}
-    }
-
-  FOR_EACH_BB_FN (bb, fun)
-    {
-      /* Treat PHI nodes as copies between the result and each argument.  */
-      for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
-	   gsi_next (&gsi))
-        {
-          size_t i;
-	  tree res;
-	  gphi *phi = gsi.phi ();
-	  res = gimple_phi_result (phi);
-
-	  /* Do not process virtual SSA_NAMES.  */
-	  if (virtual_operand_p (res))
-	    continue;
-
-	  /* Make sure to only use the same partition for an argument
-	     as the result but never the other way around.  */
-	  if (SSA_NAME_VAR (res)
-	      && !DECL_IGNORED_P (SSA_NAME_VAR (res)))
-	    for (i = 0; i < gimple_phi_num_args (phi); i++)
-	      {
-		tree arg = PHI_ARG_DEF (phi, i);
-		if (TREE_CODE (arg) == SSA_NAME)
-		  copy_rename_partition_coalesce (map, res, arg,
-						  debug);
-	      }
-	  /* Else if all arguments are in the same partition try to merge
-	     it with the result.  */
-	  else
-	    {
-	      int all_p_same = -1;
-	      int p = -1;
-	      for (i = 0; i < gimple_phi_num_args (phi); i++)
-		{
-		  tree arg = PHI_ARG_DEF (phi, i);
-		  if (TREE_CODE (arg) != SSA_NAME)
-		    {
-		      all_p_same = 0;
-		      break;
-		    }
-		  else if (all_p_same == -1)
-		    {
-		      p = partition_find (map->var_partition,
-					  SSA_NAME_VERSION (arg));
-		      all_p_same = 1;
-		    }
-		  else if (all_p_same == 1
-			   && p != partition_find (map->var_partition,
-						   SSA_NAME_VERSION (arg)))
-		    {
-		      all_p_same = 0;
-		      break;
-		    }
-		}
-	      if (all_p_same == 1)
-		copy_rename_partition_coalesce (map, res,
-						PHI_ARG_DEF (phi, 0),
-						debug);
-	    }
-        }
-    }
-
-  if (debug)
-    dump_var_map (debug, map);
-
-  /* Now one more pass to make all elements of a partition share the same
-     root variable.  */
-
-  for (x = 1; x < num_ssa_names; x++)
-    {
-      part_var = partition_to_var (map, x);
-      if (!part_var)
-        continue;
-      var = ssa_name (x);
-      if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var))
-	continue;
-      if (debug)
-        {
-	  fprintf (debug, "Coalesced ");
-	  print_generic_expr (debug, var, TDF_SLIM);
-	  fprintf (debug, " to ");
-	  print_generic_expr (debug, part_var, TDF_SLIM);
-	  fprintf (debug, "\n");
-	}
-      stats.coalesced++;
-      replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var));
-    }
-
-  statistics_counter_event (fun, "copies coalesced",
-			    stats.coalesced);
-  delete_var_map (map);
-  return 0;
-}
-
-} // anon namespace
-
-gimple_opt_pass *
-make_pass_rename_ssa_copies (gcc::context *ctxt)
-{
-  return new pass_rename_ssa_copies (ctxt);
-}
diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
index 2c7c072..821b2f4 100644
--- a/gcc/tree-ssa-live.c
+++ b/gcc/tree-ssa-live.c
@@ -100,90 +100,6 @@ static void  verify_live_on_entry (tree_live_info_p);
    ssa_name or variable, and vice versa.  */
 
 
-/* Hashtable helpers.  */
-
-struct tree_int_map_hasher : typed_noop_remove <tree_int_map>
-{
-  typedef tree_int_map *value_type;
-  typedef tree_int_map *compare_type;
-  static inline hashval_t hash (const tree_int_map *);
-  static inline bool equal (const tree_int_map *, const tree_int_map *);
-};
-
-inline hashval_t
-tree_int_map_hasher::hash (const tree_int_map *v)
-{
-  return tree_map_base_hash (v);
-}
-
-inline bool
-tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
-{
-  return tree_int_map_eq (v, c);
-}
-
-
-/* This routine will initialize the basevar fields of MAP.  */
-
-static void
-var_map_base_init (var_map map)
-{
-  int x, num_part;
-  tree var;
-  struct tree_int_map *m, *mapstorage;
-
-  num_part = num_var_partitions (map);
-  hash_table<tree_int_map_hasher> tree_to_index (num_part);
-  /* We can have at most num_part entries in the hash tables, so it's
-     enough to allocate so many map elements once, saving some malloc
-     calls.  */
-  mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
-
-  /* If a base table already exists, clear it, otherwise create it.  */
-  free (map->partition_to_base_index);
-  map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
-
-  /* Build the base variable list, and point partitions at their bases.  */
-  for (x = 0; x < num_part; x++)
-    {
-      struct tree_int_map **slot;
-      unsigned baseindex;
-      var = partition_to_var (map, x);
-      if (SSA_NAME_VAR (var)
-	  && (!VAR_P (SSA_NAME_VAR (var))
-	      || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
-	m->base.from = SSA_NAME_VAR (var);
-      else
-	/* This restricts what anonymous SSA names we can coalesce
-	   as it restricts the sets we compute conflicts for.
-	   Using TREE_TYPE to generate sets is the easies as
-	   type equivalency also holds for SSA names with the same
-	   underlying decl. 
-
-	   Check gimple_can_coalesce_p when changing this code.  */
-	m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
-			? TYPE_CANONICAL (TREE_TYPE (var))
-			: TREE_TYPE (var));
-      /* If base variable hasn't been seen, set it up.  */
-      slot = tree_to_index.find_slot (m, INSERT);
-      if (!*slot)
-	{
-	  baseindex = m - mapstorage;
-	  m->to = baseindex;
-	  *slot = m;
-	  m++;
-	}
-      else
-	baseindex = (*slot)->to;
-      map->partition_to_base_index[x] = baseindex;
-    }
-
-  map->num_basevars = m - mapstorage;
-
-  free (mapstorage);
-}
-
-
 /* Remove the base table in MAP.  */
 
 static void
@@ -361,21 +277,17 @@ partition_view_fini (var_map map, bitmap selected)
 }
 
 
-/* Create a partition view which includes all the used partitions in MAP.  If
-   WANT_BASES is true, create the base variable map as well.  */
+/* Create a partition view which includes all the used partitions in MAP.  */
 
 void
-partition_view_normal (var_map map, bool want_bases)
+partition_view_normal (var_map map)
 {
   bitmap used;
 
   used = partition_view_init (map);
   partition_view_fini (map, used);
 
-  if (want_bases)
-    var_map_base_init (map);
-  else
-    var_map_base_fini (map);
+  var_map_base_fini (map);
 }
 
 
@@ -384,7 +296,7 @@ partition_view_normal (var_map map, bool want_bases)
    as well.  */
 
 void
-partition_view_bitmap (var_map map, bitmap only, bool want_bases)
+partition_view_bitmap (var_map map, bitmap only)
 {
   bitmap used;
   bitmap new_partitions = BITMAP_ALLOC (NULL);
@@ -400,10 +312,7 @@ partition_view_bitmap (var_map map, bitmap only, bool want_bases)
     }
   partition_view_fini (map, new_partitions);
 
-  if (want_bases)
-    var_map_base_init (map);
-  else
-    var_map_base_fini (map);
+  var_map_base_fini (map);
 }
 
 
diff --git a/gcc/tree-ssa-live.h b/gcc/tree-ssa-live.h
index d5d7820..1f88358 100644
--- a/gcc/tree-ssa-live.h
+++ b/gcc/tree-ssa-live.h
@@ -71,8 +71,8 @@ typedef struct _var_map
 extern var_map init_var_map (int);
 extern void delete_var_map (var_map);
 extern int var_union (var_map, tree, tree);
-extern void partition_view_normal (var_map, bool);
-extern void partition_view_bitmap (var_map, bitmap, bool);
+extern void partition_view_normal (var_map);
+extern void partition_view_bitmap (var_map, bitmap);
 extern void dump_scope_blocks (FILE *, int);
 extern void debug_scope_block (tree, int);
 extern void debug_scope_blocks (int);
diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
index 685fcc38c..447fcd9 100644
--- a/gcc/var-tracking.c
+++ b/gcc/var-tracking.c
@@ -4872,12 +4872,16 @@ dataflow_set_remove_mem_locs (variable_def **slot, dataflow_set *set)
    registers, as well as associations between MEMs and VALUEs.  */
 
 static void
-dataflow_set_clear_at_call (dataflow_set *set)
+dataflow_set_clear_at_call (dataflow_set *set, rtx_insn *call_insn)
 {
   unsigned int r;
   hard_reg_set_iterator hrsi;
+  HARD_REG_SET invalidated_regs;
 
-  EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, r, hrsi)
+  get_call_reg_set_usage (call_insn, &invalidated_regs,
+			  regs_invalidated_by_call);
+
+  EXECUTE_IF_SET_IN_HARD_REG_SET (invalidated_regs, 0, r, hrsi)
     var_regno_delete (set, r);
 
   if (MAY_HAVE_DEBUG_INSNS)
@@ -6685,7 +6689,7 @@ compute_bb_dataflow (basic_block bb)
       switch (mo->type)
 	{
 	  case MO_CALL:
-	    dataflow_set_clear_at_call (out);
+	    dataflow_set_clear_at_call (out, insn);
 	    break;
 
 	  case MO_USE:
@@ -9152,7 +9156,7 @@ emit_notes_in_bb (basic_block bb, dataflow_set *set)
       switch (mo->type)
 	{
 	  case MO_CALL:
-	    dataflow_set_clear_at_call (set);
+	    dataflow_set_clear_at_call (set, insn);
 	    emit_notes_for_changes (insn, EMIT_NOTE_AFTER_CALL_INSN, set->vars);
 	    {
 	      rtx arguments = mo->u.loc, *p = &arguments;
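
As a reading aid for the var-tracking.c hunks above, here is a minimal
sketch of the idea in plain C.  It is not GCC code: toy_reg_set,
struct toy_call, toy_call_reg_usage and toy_clear_at_call are
hypothetical stand-ins for HARD_REG_SET, the call insn,
get_call_reg_set_usage and dataflow_set_clear_at_call, and the "use the
per-call set if one is known, else the default" behaviour is an
assumption about that helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for HARD_REG_SET: one bit per hard register.  */
typedef uint32_t toy_reg_set;

/* Hypothetical stand-in for a call insn.  If the callee's actual
   clobber set is known, HAS_USAGE is nonzero and CLOBBERED holds it.  */
struct toy_call
{
  int has_usage;
  toy_reg_set clobbered;
};

/* Assumed model of get_call_reg_set_usage as used in the hunk: store
   in *OUT the set known for CALL, falling back to DEFAULT_SET.  */
static void
toy_call_reg_usage (const struct toy_call *call, toy_reg_set *out,
                    toy_reg_set default_set)
{
  assert (call != NULL && out != NULL);
  *out = call->has_usage ? call->clobbered : default_set;
}

/* Models dataflow_set_clear_at_call: LIVE has one bit per register
   that currently holds a tracked variable; return what survives.  */
static toy_reg_set
toy_clear_at_call (toy_reg_set live, const struct toy_call *call,
                   toy_reg_set default_set)
{
  toy_reg_set invalidated;

  toy_call_reg_usage (call, &invalidated, default_set);
  return live & ~invalidated;
}
```

With a default clobber set of {r0, r1, r2} and a callee known to
clobber only r0, variables tracked in r1 and r2 survive the call in
this model, whereas the global set would have dropped them.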


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


* Re: [PR64164] drop copyrename, integrate into expand
  2015-04-24  1:56         ` Alexandre Oliva
@ 2015-04-27 11:39           ` Richard Biener
  2015-06-06  5:12             ` Alexandre Oliva
  2015-04-29  3:51           ` Jeff Law
  1 sibling, 1 reply; 127+ messages in thread
From: Richard Biener @ 2015-04-27 11:39 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: Jeff Law, GCC Patches

On Fri, Apr 24, 2015 at 3:56 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Apr  6, 2015, Jeff Law <law@redhat.com> wrote:
>
>>>> So the bulk of the changes into this routine are really about picking
>>>> a good leader, which presumably is how we're able to get the desired
>>>> effects on debuginfo that we used to get from tree-ssa-copyrename.c?
>>>
>>> This has nothing to do with debuginfo, I'm afraid.  We just had to keep
>>> track of parm and result decls to avoid coalescing them together, and
>>> parm decl default defs to promote them to leaders, because expand copies
>>> incoming REGs to pseudos in PARM_DECL's DECL_RTL.  We should fill that
>>> in with the RTL created for the default def for the PARM_DECL.  At the
>>> end, I believe we also copy the RESULT_DECL DECL_RTL to the actual
>>> return register or rtl.  I didn't want to tackle the reworking of these
>>> expanders to avoid problems out of copying incoming parms to one pseudo
>>> and then reading it from another, as I observed before I made this
>>> change.  I'm now tackling that, so that we can refrain from touching the
>>> base vars altogether, and not refrain from coalescing parms and results.
>
>> Hmmm, so the real issue here is the expansion setup of parms and
>> results.  I hadn't pondered that aspect.  I'd encourage fixing the
>> expansion code too if you can see a path for that.
>
> That was the trickiest bit of the patch: getting assign_parms to use the
> out-of-SSA-chosen RTL for the (default def of the) param instead of
> creating a pseudo or stack slot of its own, especially when we create a
> .result_ptr decl and there is an incoming by-ref result_decl, in which
> case we ought to use the same SSA-assigned pseudo for both.  Another
> case worth mentioning is that in which a param is unused: there is no
> default def for it, but in the non-optimized case, we have to assign it
> to the same location.  I've used the DECL_RTL itself to carry the
> information in this case, at least in the non-optimized case, in which
> we know all SSA_NAMEs associated with each param *will* be assigned to
> the same partition, and use the same RTL.  If we do optimize, the param
> may get multiple locations, and DECL_RTL will be cleared.  That's fine:
> the incoming value of the param will end up being copied to a separate
> pseudo, so that there's no risk of messing with any other default def
> (there's a new testcase for this), and the copy is likely to be
> optimized out.
>
> The other tricky bit was to fix all expander bits that required
> SSA_NAMEs to have an associated decl.  I've removed all such cases, so we
> can now expand anonymous SSA decls directly, without having to create an
> ignored decl.  Doing that, we can coalesce variables and expand each
> partition without worrying about choosing a preferred partition leader.
> We just have to make sure we don't consider pairs of variables eligible
> for coalescing if they should get different promoted modes, or a
> different register-or-stack choice, and then expansion of partitions is
> streamlined: we just expand each leader, and then adjust all SSA_NAMEs
> to associate the RTL with their base variables, if any.
>
>
> In this revision of the patch, I have retained -ftree-coalesce-vars, so
> that its negated form can be used in testcases that formerly expected no
> coalescing across user variables, but that explicitly disabled VTA.
>
> As for testcases, while investigating test regressions, I found out
> various guality failures had to do with VT's lack of awareness of custom
> calling conventions.  Caller's variables saved in registers that are
> normally call-clobbered, but that are call-saved in custom conventions
> set up for a callee, would end up invalidating the entry-point location
> associations.  I've arranged for var-tracking to use custom calling
> conventions for register invalidation at call insns, and this fixed not
> only a few guality regressions due to changes in register assignment,
> but a number of other long-standing guality failures.  Yay!  This could
> be split out into a standalone patch.
>
>
> On Mar 31, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
>
>> Did you do any statistics on how the number of basevars changes with your patch
>> compared to trunk?
>
> In this version of the patch, we no longer touch the base vars at all.
> We just associate the piece of RTL generated for the partition with a
> list of decls, if needed.  (I've just realized that I never noticed a
> list of decls show up anywhere, and looking into this, I saw a bug in
> the leader_merge function, that causes it to fail to go from a single
> entry to a list: it creates the list, but then returns the original
> non-list entry; that's why I never saw them!  I won't delay posting the
> patch just because of this; I'm not even sure we want decl lists in REG
> or MEM attrs to begin with.)
>
> I have collected some statistics on the effects of the patch in
> compiling stage3-gcc/, before and after the patch, with and without
> -fno-tree-coalesce-vars.  I counted, per function:
>
> b/a: before the patch, or after the patch
>
> c/n: -ftree-coalesce-vars (default when optimizing) or
> -fno-tree-coalesce-vars
>
> cv: the coalescible var count, i.e., the active partition count prior to
> coalescing.  SSA_NAMEs not eligible for coalescing are not counted.
> The more of these there are, the larger the conflict graph we have to
> build.
>
> base: the base variable count that guides the construction of the
> conflict map.  The more of these there are, the smaller the conflict
> graph we have to build, but it is also a lower bound for the final
> partition count.
>
> part: the partition count after coalescing, not counting those of
> SSA_NAMEs that were not eligible for coalescing to begin with.
>
> abn: successful abnormal coalesce count.  How many times
> attempt_coalesce returned true as called in the abnormal coalesce loop.
>
> same: successful normal coalesces of pairs of SSA_NAMEs that share the
> same base variable (SSA_NAME_VAR, not the base index used to guide the
> construction of the conflict graph).  Ignored base decls are regarded as
> NULL for purposes of this comparison.  How many times attempt_coalesce
> returned true for variables that share the same base variable.  This may
> count cases in which both vars are in the same partition already due to
> earlier coalesces.
>
> other: successful normal coalesces of pairs of SSA_NAMEs that do NOT
> share the same base variable.  Same caveats as above.
>
> fail: failed attempts at normal coalesce.  How many times
> attempt_coalesce returned false.
>
> b/a    c/n         cv   base   part   abn   same other  fail
>
> before -fno-tr 570180 176682 221442 82076 370746     0 10542
> before -ftree- 577212 171581 221927 82076 378093     0 18654
>
> after  -fno-tr 608533 179959 220948 82076 488119     0 11697
> after  -ftree- 589243 202588 221817 82076 349373 41775 24124
>
>
> Here's (for reference only) the patch used to gather the data
> consolidated above:
>
> diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
> index eeac5a4..d9fe4cc 100644
> --- a/gcc/tree-ssa-coalesce.c
> +++ b/gcc/tree-ssa-coalesce.c
> @@ -1199,6 +1199,11 @@ coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
>    edge e;
>    edge_iterator ei;
>
> +  int abnormal = 0, samevar = 0, othervar = 0, failure = 0;
> +  int initial_partitions = num_var_partitions (map);
> +  int final_partitions = initial_partitions;
> +  int p1, p2;
> +
>    /* First, coalesce all the copies across abnormal edges.  These are not placed
>       in the coalesce list because they do not need to be sorted, and simply
>       consume extra memory/compilation time in large programs.  */
> @@ -1226,8 +1231,17 @@ coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
>                 if (debug)
>                   fprintf (debug, "Abnormal coalesce: ");
>
> +               p1 = var_to_partition (map, arg);
> +               p2 = var_to_partition (map, res);
> +
>                 if (!attempt_coalesce (map, graph, v1, v2, debug))
>                   fail_abnormal_edge_coalesce (v1, v2);
> +               else
> +                 {
> +                   abnormal++;
> +                   if (p1 != p2)
> +                     final_partitions--;
> +                 }
>               }
>           }
>      }
> @@ -1244,8 +1258,30 @@ coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
>
>        if (debug)
>         fprintf (debug, "Coalesce list: ");
> -      attempt_coalesce (map, graph, x, y, debug);
> +
> +      p1 = var_to_partition (map, var1);
> +      p2 = var_to_partition (map, var2);
> +
> +      if (!attempt_coalesce (map, graph, x, y, debug))
> +       failure++;
> +      else
> +       {
> +         if (p1 != p2)
> +           final_partitions--;
> +         if ((SSA_NAME_VAR (var1) && !DECL_IGNORED_P (SSA_NAME_VAR (var1))
> +              ? SSA_NAME_VAR (var1) : NULL)
> +             == (SSA_NAME_VAR (var2) && !DECL_IGNORED_P (SSA_NAME_VAR (var2))
> +                 ? SSA_NAME_VAR (var2) : NULL))
> +           samevar++;
> +         else
> +           othervar++;
> +       }
>      }
> +
> +  inform (1,
> +         "%i cv, %i base, %i part, %i abn, %i same, %i other, %i failed in %q+F",
> +         initial_partitions, num_basevars (map), final_partitions,
> +         abnormal, samevar, othervar, failure, current_function_decl);
>  }
>
>
>
> And here's the actual patch I'm submitting for your appreciation (I was
> gonna say for inclusion, but given the leader_merge brown paper bag bug,
> I'll just want feedback on whether we want that or not, and either drop
> the list-building, or probably post a revised patch that fixes fallout
> from lists where decls are expected.)
>
> No regressions, and many progressions, on x86_64-linux-gnu and
> i686-pc-linux-gnu.
>
> [PR64164] Drop copyrename, use coalescible partition as base when optimizing.
>
> for  gcc/ChangeLog
>
>         PR rtl-optimization/64164
>         * Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
>         * tree-ssa-copyrename.c: Removed.
>         * opts.c (default_options_table): Drop -ftree-copyrename.  Add
>         -ftree-coalesce-vars.
>         * passes.def: Drop all occurrences of pass_rename_ssa_copies.
>         * common.opt (ftree-copyrename): Ignore.
>         (ftree-coalesce-inlined-vars): Likewise.
>         * doc/invoke.texi: Remove the ignored options above.
>         * gimple-expr.h (gimple_can_coalesce_p): Note def location.
>         * gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
>         across variables when flag_tree_coalesce_vars.  Check register
>         use and promoted modes to allow coalescing.  Moved to
>         tree-ssa-coalesce.c.
>         * tree-ssa-live.c (struct tree_int_map_hasher): Move along
>         with its member functions to tree-ssa-coalesce.c.
>         (var_map_base_init): Likewise.  Renamed to
>         compute_samebase_partition_bases.
>         (partition_view_normal): Drop want_bases parameter.
>         (partition_view_bitmap): Likewise.
>         * tree-ssa-live.h: Adjust declarations.
>         * tree-ssa-coalesce.c: Include explow.h.
>         (build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs's
>         default defs at the entry point.
>         (dump_part_var_map): New.
>         (compute_optimized_partition_bases): New, called by...
>         (coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
>         of compute_samebase_partition_bases.  Adjust.
>         * alias.c (nonoverlapping_memrefs_p): Special-case RTL-less
>         gimple-reg exprs.
>         * cfgexpand.c (leader_merge): New.
>         (get_rtl_for_parm_ssa_default_def): New.
>         (set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
>         vars.  Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
>         (expand_one_stack_var_at): Handle anonymous SSA_NAMEs.  Drop
>         redundant MEM attr setting.
>         (expand_one_stack_var_1): Handle anonymous SSA_NAMEs.  Renamed
>         from...
>         (expand_one_stack_var): ... this.  New wrapper to check and
>         skip already expanded SSA partitions.
>         (record_alignment_for_reg_var): New, factored out of...
>         (expand_one_var): ... this.
>         (expand_one_ssa_partition): New.
>         (adjust_one_expanded_partition_var): New.
>         (expand_one_register_var): Check and skip already expanded SSA
>         partitions.
>         (expand_used_vars): Don't create DECLs for anonymous SSA
>         names.  Expand all SSA partitions, then adjust all SSA names.
>         (pass::execute): Replace the loops that set
>         SA.partition_to_pseudo from partition leaders and cleared
>         DECL_RTL for multi-location variables, and that which used to
>         rename vars and set attrs, with one that clears DECL_RTL and
>         checks that PARMs and RESULTs default_defs match DECL_RTL.
>         * cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
>         * emit-rtl.c (set_reg_attrs_for_parm): Handle NULL and
>         TREE_LIST decl.
>         * explow.c (promote_ssa_mode): New.
>         * explow.h (promote_ssa_mode): Declare.
>         * expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
>         * function.c: Include cfgexpand.h.
>         (use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
>         (use_register_for_parm_decl): Wrapper for the above to
>         special-case the result_ptr.
>         (rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
>         (maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
>         multiple locations.
>         (assign_parm_adjust_stack_rtl): Add all and parm arguments,
>         for rtl_for_parm.  For SSA-assigned parms, zero stack_parm.
>         (assign_parm_setup_block): Prefer SSA-assigned location.
>         (assign_parm_setup_reg): Likewise.  Use entry_parm for equiv
>         if stack_parm is NULL.
>         (assign_parm_setup_stack): Prefer SSA-assigned location.
>         (assign_parms): Maybe reset DECL_RTL of params.  Adjust stack
>         rtl before testing for pointer bounds.  Special-case result_ptr.
>         (expand_function_start): Maybe reset DECL_RTL of result.
>         Prefer SSA-assigned location for result and static chain.
>         Factor out DECL_RESULT and SET_DECL_RTL.
>         * tree-outof-ssa.c (insert_value_copy_on_edge): Handle
>         anonymous SSA names.  Use promote_ssa_mode.
>         (get_temp_reg): Likewise.
>         (remove_ssa_form): Adjust.
>         * var-tracking.c (dataflow_set_clear_at_call): Take call_insn
>         and get its reg_usage for reg invalidation.
>         (compute_bb_dataflow): Pass it insn.
>         (emit_notes_in_bb): Likewise.
>
> for  gcc/testsuite/ChangeLog
>
>         * gcc.dg/guality/pr54200.c: Add -fno-tree-coalesce-vars.
>         * gcc.dg/ssp-1.c: Make counter a register.
>         * gcc.dg/ssp-2.c: Likewise.
>         * gcc.dg/torture/parm-coalesce.c: New.
> ---
>  gcc/Makefile.in                              |    1
>  gcc/alias.c                                  |   12 +
>  gcc/cfgexpand.c                              |  383 +++++++++++++++-----
>  gcc/cfgexpand.h                              |    2
>  gcc/common.opt                               |   12 -
>  gcc/doc/invoke.texi                          |   48 +--
>  gcc/emit-rtl.c                               |    7
>  gcc/explow.c                                 |   25 +
>  gcc/explow.h                                 |    3
>  gcc/expr.c                                   |   33 +-
>  gcc/function.c                               |  211 +++++++++--
>  gcc/gimple-expr.c                            |   39 --
>  gcc/gimple-expr.h                            |    5
>  gcc/opts.c                                   |    2
>  gcc/passes.def                               |    5
>  gcc/testsuite/gcc.dg/guality/pr54200.c       |    2
>  gcc/testsuite/gcc.dg/ssp-1.c                 |    2
>  gcc/testsuite/gcc.dg/ssp-2.c                 |    2
>  gcc/testsuite/gcc.dg/torture/parm-coalesce.c |   40 ++
>  gcc/tree-outof-ssa.c                         |   15 -
>  gcc/tree-ssa-coalesce.c                      |  380 +++++++++++++++++++-
>  gcc/tree-ssa-copyrename.c                    |  499 --------------------------
>  gcc/tree-ssa-live.c                          |  101 -----
>  gcc/tree-ssa-live.h                          |    4
>  gcc/var-tracking.c                           |   12 -
>  25 files changed, 980 insertions(+), 865 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/torture/parm-coalesce.c
>  delete mode 100644 gcc/tree-ssa-copyrename.c
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index 80c91f0..6920ee7 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1428,7 +1428,6 @@ OBJS = \
>         tree-ssa-ccp.o \
>         tree-ssa-coalesce.o \
>         tree-ssa-copy.o \
> -       tree-ssa-copyrename.o \
>         tree-ssa-dce.o \
>         tree-ssa-dom.o \
>         tree-ssa-dse.o \
> diff --git a/gcc/alias.c b/gcc/alias.c
> index a7160f3..2100e8b 100644
> --- a/gcc/alias.c
> +++ b/gcc/alias.c
> @@ -2365,6 +2365,18 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
>    if (! DECL_P (exprx) || ! DECL_P (expry))
>      return 0;
>
> +  /* If we refer to different gimple registers, or one gimple register
> +     and one non-gimple-register, we know they can't overlap.  Now,
> +     there could be more than one stack slot for (different versions
> +     of) the same gimple register, but we can presumably tell they
> +     don't overlap based on offsets from stack base addresses
> +     elsewhere.  It's important that we don't proceed to DECL_RTL,
> +     because gimple registers may not pass DECL_RTL_SET_P, and
> +     make_decl_rtl won't be able to do anything about them since no
> +     SSA information will have remained to guide it.  */
> +  if (is_gimple_reg (exprx) || is_gimple_reg (expry))
> +    return exprx != expry;
> +

This should also mention that is_gimple_reg vars do not have their
address taken.
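
To spell out why that matters, here is a toy model of the new
early-out.  It is only a sketch: struct toy_decl and both functions are
hypothetical, and toy_is_gimple_reg is a deliberate simplification of
is_gimple_reg (register-like type, address never taken).  Because no
pointer can refer to such a decl, comparing the decls by identity is
enough.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a decl.  */
struct toy_decl
{
  const char *name;
  bool register_type;   /* Type that can live in a register.  */
  bool addressable;     /* Address taken somewhere.  */
};

/* Simplified model of is_gimple_reg for decls (an assumption, not the
   real predicate).  */
static bool
toy_is_gimple_reg (const struct toy_decl *d)
{
  assert (d != NULL);
  return d->register_type && !d->addressable;
}

/* Returns 1 when the early-out proves X and Y cannot overlap, 0 when
   they are the same decl, and -1 when the early-out does not apply and
   the real function would go on to look at DECL_RTL.  */
static int
toy_early_nonoverlap (const struct toy_decl *x, const struct toy_decl *y)
{
  if (toy_is_gimple_reg (x) || toy_is_gimple_reg (y))
    return x != y;
  return -1;
}
```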

>    /* With invalid code we can end up storing into the constant pool.
>       Bail out to avoid ICEing when creating RTL for this.
>       See gfortran.dg/lto/20091028-2_0.f90.  */
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index ca491a0..74190a6d 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -179,21 +179,137 @@ gimple_assign_rhs_to_tree (gimple stmt)
>
>  #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
>
> +/* NEXT is a DECL to be associated with some RTL, CUR is a DECL or a
> +   TREE_LIST of DECLs.  If NEXT is covered by CUR, return CUR
> +   unchanged.  Otherwise, return a list with all entries of CUR, with
> +   NEXT at the end.  If CUR was a list, it will be modified in
> +   place.  */
> +
> +static tree
> +leader_merge (tree cur, tree next)
> +{
> +  if (cur == NULL || cur == next)
> +    return next;
> +
> +  tree list;
> +
> +  if (TREE_CODE (cur) == TREE_LIST)
> +    {
> +      /* Look for NEXT in the list.  Stop at the last node to insert
> +        there.  */
> +      for (list = cur; ; list = TREE_CHAIN (list))
> +       {
> +         if (TREE_VALUE (list) == next)
> +           return cur;
> +         if (!TREE_CHAIN (list))
> +           break;
> +       }
> +    }
> +  else
> +    /* Create the first node.  */
> +    list = build_tree_list (NULL, cur);
> +
> +  next = build_tree_list (NULL, next);
> +  TREE_CHAIN (list) = next;

Ick - presumably you can't use something better than a TREE_LIST here?
First, the linear walk looks expensive, and second, well, TREE_LIST ...
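
The bug mentioned in the cover text is also visible in this hunk: when
CUR is a single decl, the new list head is built in LIST, but the
function still returns CUR.  A toy model of one possible fix follows;
struct toy_node and the toy_* functions are hypothetical and only mimic
TREE_LIST, TREE_VALUE and TREE_CHAIN, they are not GCC's tree API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a tree: either a decl (IS_LIST == 0) or a
   TREE_LIST-like cell holding VALUE and chaining to CHAIN.  */
struct toy_node
{
  int is_list;
  struct toy_node *value;
  struct toy_node *chain;
};

/* Allocate a one-element list cell holding VALUE.  */
static struct toy_node *
toy_build_list (struct toy_node *value)
{
  struct toy_node *cell = calloc (1, sizeof *cell);

  assert (cell != NULL);
  cell->is_list = 1;
  cell->value = value;
  return cell;
}

/* Like leader_merge in the hunk, except that when CUR is a single
   decl the freshly built list head is returned instead of CUR.  */
static struct toy_node *
toy_leader_merge (struct toy_node *cur, struct toy_node *next)
{
  if (cur == NULL || cur == next)
    return next;

  struct toy_node *list;

  if (cur->is_list)
    {
      /* Look for NEXT; stop at the last cell so we can append there.  */
      for (list = cur; ; list = list->chain)
        {
          if (list->value == next)
            return cur;
          if (!list->chain)
            break;
        }
    }
  else
    /* Promote the single decl to a list and remember the new head.  */
    cur = list = toy_build_list (cur);

  list->chain = toy_build_list (next);
  return cur;
}

/* Number of decls covered by T: 0 for NULL, 1 for a bare decl.  */
static int
toy_count (const struct toy_node *t)
{
  int n = 0;

  if (t == NULL)
    return 0;
  if (!t->is_list)
    return 1;
  for (; t; t = t->chain)
    n++;
  return n;
}
```

This keeps the linear walk, so it does nothing for the cost concern; it
only makes the single-decl-to-list transition observable to the caller.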

> +
> +  return cur;
> +}
> +
> +
> +/* Return the RTL for the default SSA def of a PARM or RESULT, if
> +   there is one.  */
> +
> +rtx
> +get_rtl_for_parm_ssa_default_def (tree var)
> +{
> +  gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL);
> +
> +  if (!is_gimple_reg (var))
> +    return NULL_RTX;
> +
> +  /* If we've already determined RTL for the decl, use it.  This is
> +     not just an optimization: if VAR is a PARM whose incoming value
> +     is unused, we won't find a default def to use its partition, but
> +     we still want to use the location of the parm, if it was used at
> +     all.  During assign_parms, until a location is assigned for the
> +     VAR, RTL can only for a parm or result if we're not coalescing
> +     across variables, when we know we're coalescing all SSA_NAMEs of
> +     each parm or result, and we're not coalescing them with names
> +     pertaining to other variables, such as other parms' default
> +     defs.  */
> +  if (DECL_RTL_SET_P (var))
> +    {
> +      gcc_assert (DECL_RTL (var) != pc_rtx);
> +      return DECL_RTL (var);
> +    }
> +
> +  tree name = ssa_default_def (cfun, var);
> +
> +  if (!name)
> +    return NULL_RTX;
> +
> +  int part = var_to_partition (SA.map, name);
> +  if (part == NO_PARTITION)
> +    return NULL_RTX;
> +
> +  return SA.partition_to_pseudo[part];
> +}
> +
>  /* Associate declaration T with storage space X.  If T is no
>     SSA name this is exactly SET_DECL_RTL, otherwise make the
>     partition of T associated with X.  */
>  static inline void
>  set_rtl (tree t, rtx x)
>  {
> +  if (x && SSAVAR (t))
> +    {
> +      bool skip = false;
> +      tree cur = NULL_TREE;
> +
> +      if (MEM_P (x))
> +       cur = MEM_EXPR (x);
> +      else if (REG_P (x))
> +       cur = REG_EXPR (x);
> +      else if (GET_CODE (x) == CONCAT
> +              && REG_P (XEXP (x, 0)))
> +       cur = REG_EXPR (XEXP (x, 0));
> +      else if (GET_CODE (x) == PARALLEL)
> +       cur = REG_EXPR (XVECEXP (x, 0, 0));
> +      else if (x == pc_rtx)
> +       skip = true;
> +      else
> +       gcc_unreachable ();
> +
> +      tree next = skip ? cur : leader_merge (cur, SSAVAR (t));
> +
> +      if (cur != next)
> +       {
> +         if (MEM_P (x))
> +           set_mem_attributes (x, SSAVAR (t), true);
> +         else
> +           set_reg_attrs_for_decl_rtl (SSAVAR (t), x);
> +       }
> +    }
> +
>    if (TREE_CODE (t) == SSA_NAME)
>      {
> -      SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x;
> -      if (x && !MEM_P (x))
> -       set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x);
> -      /* For the benefit of debug information at -O0 (where vartracking
> -         doesn't run) record the place also in the base DECL if it's
> -        a normal variable (not a parameter).  */
> -      if (x && x != pc_rtx && TREE_CODE (SSA_NAME_VAR (t)) == VAR_DECL)
> +      int part = var_to_partition (SA.map, t);
> +      if (part != NO_PARTITION)
> +       {
> +         if (SA.partition_to_pseudo[part])
> +           gcc_assert (SA.partition_to_pseudo[part] == x);
> +         else
> +           SA.partition_to_pseudo[part] = x;
> +       }
> +      /* For the benefit of debug information at -O0 (where
> +         vartracking doesn't run) record the place also in the base
> +         DECL.  For PARMs and RESULTs, we may end up resetting these
> +         in function.c:maybe_reset_rtl_for_parm, but in some rare
> +         cases we may need them (unused and overwritten incoming
> +         value, that at -O0 must share the location with the other
> +         uses in spite of the missing default def), and this may be
> +         the only chance to preserve them.  */
> +      if (x && x != pc_rtx && SSA_NAME_VAR (t))
>         {
>           tree var = SSA_NAME_VAR (t);
>           /* If we don't yet have something recorded, just record it now.  */
> @@ -909,7 +1025,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
>    gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
>
>    x = plus_constant (Pmode, base, offset);
> -  x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x);
> +  x = gen_rtx_MEM ((TREE_CODE (decl) != SSA_NAME || SSA_NAME_VAR (decl))
> +                  ? DECL_MODE (SSAVAR (decl))
> +                  : TYPE_MODE (TREE_TYPE (decl)), x);
>
>    if (TREE_CODE (decl) != SSA_NAME)
>      {
> @@ -931,7 +1049,6 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
>        DECL_USER_ALIGN (decl) = 0;
>      }
>
> -  set_mem_attributes (x, SSAVAR (decl), true);
>    set_rtl (decl, x);
>  }
>
> @@ -1146,13 +1263,22 @@ account_stack_vars (void)
>     to a variable to be allocated in the stack frame.  */
>
>  static void
> -expand_one_stack_var (tree var)
> +expand_one_stack_var_1 (tree var)
>  {
>    HOST_WIDE_INT size, offset;
>    unsigned byte_align;
>
> -  size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
> -  byte_align = align_local_variable (SSAVAR (var));
> +  if (TREE_CODE (var) != SSA_NAME || SSA_NAME_VAR (var))
> +    {
> +      size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
> +      byte_align = align_local_variable (SSAVAR (var));
> +    }
> +  else

I'd take the else branch below for all TREE_CODE (var) == SSA_NAME,
named or not (and get rid of the SSAVAR macro?)
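
Roughly, the shape I'm suggesting is the following.  This is a standalone
model, not GCC code: `type_model', `decl_model', `node_model' and
`stack_layout_for' are invented for illustration, and the numbers stand
in for TYPE_SIZE_UNIT/TYPE_ALIGN_UNIT and their DECL counterparts.

```cpp
#include <cassert>

// Hypothetical stand-ins for the size/alignment of a type and of a decl.
struct type_model
{
  unsigned size_unit;
  unsigned align_unit;
};

struct decl_model
{
  unsigned size_unit;
  unsigned align_unit;
};

// A variable to lay out: either an SSA name (with or without a base
// decl) or a plain decl.
struct node_model
{
  bool is_ssa_name;
  const type_model *type;  // always set
  const decl_model *decl;  // may be null for an anonymous SSA name
};

struct layout
{
  unsigned size;
  unsigned byte_align;
};

// Every SSA name takes the type-based path, whether or not it has a
// base decl; only real decls consult the decl.
layout stack_layout_for (const node_model &var)
{
  assert (var.type != nullptr);
  if (var.is_ssa_name)
    return layout{var.type->size_unit, var.type->align_unit};
  assert (var.decl != nullptr);
  return layout{var.decl->size_unit, var.decl->align_unit};
}
```

Note that a named SSA name whose decl is more strictly aligned than its
type would then get the type's alignment; whether that is acceptable is
a separate question.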

> +    {
> +      tree type = TREE_TYPE (var);
> +      size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
> +      byte_align = TYPE_ALIGN_UNIT (type);
> +    }
>
>    /* We handle highly aligned variables in expand_stack_vars.  */
>    gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
> @@ -1163,6 +1289,27 @@ expand_one_stack_var (tree var)
>                            crtl->max_used_stack_slot_alignment, offset);
>  }
>
> +/* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are
> +   already assigned some MEM.  */
> +
> +static void
> +expand_one_stack_var (tree var)
> +{
> +  if (TREE_CODE (var) == SSA_NAME)
> +    {
> +      int part = var_to_partition (SA.map, var);
> +      if (part != NO_PARTITION)
> +       {
> +         rtx x = SA.partition_to_pseudo[part];
> +         gcc_assert (x);
> +         gcc_assert (MEM_P (x));
> +         return;
> +       }
> +    }
> +
> +  return expand_one_stack_var_1 (var);
> +}
> +
>  /* A subroutine of expand_one_var.  Called to assign rtl to a VAR_DECL
>     that will reside in a hard register.  */
>
> @@ -1172,12 +1319,112 @@ expand_one_hard_reg_var (tree var)
>    rest_of_decl_compilation (var, 0, 0);
>  }
>
> +/* Record the alignment requirements of some variable assigned to a
> +   pseudo.  */
> +
> +static void
> +record_alignment_for_reg_var (unsigned int align)
> +{
> +  if (SUPPORTS_STACK_ALIGNMENT
> +      && crtl->stack_alignment_estimated < align)
> +    {
> +      /* stack_alignment_estimated shouldn't change after stack
> +         realign decision made */
> +      gcc_assert (!crtl->stack_realign_processed);
> +      crtl->stack_alignment_estimated = align;
> +    }
> +
> +  /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
> +     So here we only make sure stack_alignment_needed >= align.  */
> +  if (crtl->stack_alignment_needed < align)
> +    crtl->stack_alignment_needed = align;
> +  if (crtl->max_used_stack_slot_alignment < align)
> +    crtl->max_used_stack_slot_alignment = align;
> +}
> +
> +/* Create RTL for an SSA partition.  */
> +
> +static void
> +expand_one_ssa_partition (tree var)
> +{
> +  int part = var_to_partition (SA.map, var);
> +  gcc_assert (part != NO_PARTITION);
> +
> +  if (SA.partition_to_pseudo[part])
> +    return;
> +
> +  if (!use_register_for_decl (var))
> +    {
> +      expand_one_stack_var_1 (var);
> +      return;
> +    }
> +
> +  unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (var),
> +                                         TYPE_MODE (TREE_TYPE (var)),
> +                                         TYPE_ALIGN (TREE_TYPE (var)));
> +
> +  /* If the variable alignment is very large we'll dynamically allocate
> +     it, which means that in-frame portion is just a pointer.  */
> +  if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
> +    align = POINTER_SIZE;
> +
> +  record_alignment_for_reg_var (align);
> +
> +  machine_mode reg_mode = promote_ssa_mode (var, NULL);
> +
> +  rtx x = gen_reg_rtx (reg_mode);
> +
> +  set_rtl (var, x);
> +}
> +
> +/* Record the association between the RTL generated for a partition
> +   and the underlying variable of the SSA_NAME.  */
> +
> +static void
> +adjust_one_expanded_partition_var (tree var)
> +{
> +  if (!var)
> +    return;
> +
> +  tree decl = SSA_NAME_VAR (var);
> +
> +  int part = var_to_partition (SA.map, var);
> +  if (part == NO_PARTITION)
> +    return;
> +
> +  rtx x = SA.partition_to_pseudo[part];
> +
> +  set_rtl (var, x);
> +
> +  if (!REG_P (x))
> +    return;
> +
> +  /* Note if the object is a user variable.  */
> +  if (decl && !DECL_ARTIFICIAL (decl))
> +    mark_user_reg (x);
> +
> +  if (POINTER_TYPE_P (decl ? TREE_TYPE (decl) : TREE_TYPE (var)))
> +    mark_reg_pointer (x, get_pointer_alignment (var));
> +}
> +
>  /* A subroutine of expand_one_var.  Called to assign rtl to a VAR_DECL
>     that will reside in a pseudo register.  */
>
>  static void
>  expand_one_register_var (tree var)
>  {
> +  if (TREE_CODE (var) == SSA_NAME)
> +    {
> +      int part = var_to_partition (SA.map, var);
> +      if (part != NO_PARTITION)
> +       {
> +         rtx x = SA.partition_to_pseudo[part];
> +         gcc_assert (x);
> +         gcc_assert (REG_P (x));
> +         return;
> +       }
> +    }
> +
>    tree decl = SSAVAR (var);
>    tree type = TREE_TYPE (decl);
>    machine_mode reg_mode = promote_decl_mode (decl, NULL);
> @@ -1312,21 +1559,7 @@ expand_one_var (tree var, bool toplevel, bool really_expand)
>         align = POINTER_SIZE;
>      }
>
> -  if (SUPPORTS_STACK_ALIGNMENT
> -      && crtl->stack_alignment_estimated < align)
> -    {
> -      /* stack_alignment_estimated shouldn't change after stack
> -         realign decision made */
> -      gcc_assert (!crtl->stack_realign_processed);
> -      crtl->stack_alignment_estimated = align;
> -    }
> -
> -  /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
> -     So here we only make sure stack_alignment_needed >= align.  */
> -  if (crtl->stack_alignment_needed < align)
> -    crtl->stack_alignment_needed = align;
> -  if (crtl->max_used_stack_slot_alignment < align)
> -    crtl->max_used_stack_slot_alignment = align;
> +  record_alignment_for_reg_var (align);
>
>    if (TREE_CODE (origvar) == SSA_NAME)
>      {
> @@ -1760,48 +1993,18 @@ expand_used_vars (void)
>    if (targetm.use_pseudo_pic_reg ())
>      pic_offset_table_rtx = gen_reg_rtx (Pmode);
>
> -  hash_map<tree, tree> ssa_name_decls;
>    for (i = 0; i < SA.map->num_partitions; i++)
>      {
>        tree var = partition_to_var (SA.map, i);
>
>        gcc_assert (!virtual_operand_p (var));
>
> -      /* Assign decls to each SSA name partition, share decls for partitions
> -         we could have coalesced (those with the same type).  */
> -      if (SSA_NAME_VAR (var) == NULL_TREE)
> -       {
> -         tree *slot = &ssa_name_decls.get_or_insert (TREE_TYPE (var));
> -         if (!*slot)
> -           *slot = create_tmp_reg (TREE_TYPE (var));
> -         replace_ssa_name_symbol (var, *slot);
> -       }
> -
> -      /* Always allocate space for partitions based on VAR_DECLs.  But for
> -        those based on PARM_DECLs or RESULT_DECLs and which matter for the
> -        debug info, there is no need to do so if optimization is disabled
> -        because all the SSA_NAMEs based on these DECLs have been coalesced
> -        into a single partition, which is thus assigned the canonical RTL
> -        location of the DECLs.  If in_lto_p, we can't rely on optimize,
> -        a function could be compiled with -O1 -flto first and only the
> -        link performed at -O0.  */
> -      if (TREE_CODE (SSA_NAME_VAR (var)) == VAR_DECL)
> -       expand_one_var (var, true, true);
> -      else if (DECL_IGNORED_P (SSA_NAME_VAR (var)) || optimize || in_lto_p)
> -       {
> -         /* This is a PARM_DECL or RESULT_DECL.  For those partitions that
> -            contain the default def (representing the parm or result itself)
> -            we don't do anything here.  But those which don't contain the
> -            default def (representing a temporary based on the parm/result)
> -            we need to allocate space just like for normal VAR_DECLs.  */
> -         if (!bitmap_bit_p (SA.partition_has_default_def, i))
> -           {
> -             expand_one_var (var, true, true);
> -             gcc_assert (SA.partition_to_pseudo[i]);
> -           }
> -       }
> +      expand_one_ssa_partition (var);
>      }
>
> +  for (i = 1; i < num_ssa_names; i++)
> +    adjust_one_expanded_partition_var (ssa_name (i));
> +
>    if (flag_stack_protect == SPCT_FLAG_STRONG)
>        gen_stack_protect_signal
>         = stack_protect_decl_p () || stack_protect_return_slot_p ();
> @@ -6033,35 +6236,6 @@ pass_expand::execute (function *fun)
>        parm_birth_insn = var_seq;
>      }
>
> -  /* Now that we also have the parameter RTXs, copy them over to our
> -     partitions.  */
> -  for (i = 0; i < SA.map->num_partitions; i++)
> -    {
> -      tree var = SSA_NAME_VAR (partition_to_var (SA.map, i));
> -
> -      if (TREE_CODE (var) != VAR_DECL
> -         && !SA.partition_to_pseudo[i])
> -       SA.partition_to_pseudo[i] = DECL_RTL_IF_SET (var);
> -      gcc_assert (SA.partition_to_pseudo[i]);
> -
> -      /* If this decl was marked as living in multiple places, reset
> -        this now to NULL.  */
> -      if (DECL_RTL_IF_SET (var) == pc_rtx)
> -       SET_DECL_RTL (var, NULL);
> -
> -      /* Some RTL parts really want to look at DECL_RTL(x) when x
> -        was a decl marked in REG_ATTR or MEM_ATTR.  We could use
> -        SET_DECL_RTL here making this available, but that would mean
> -        to select one of the potentially many RTLs for one DECL.  Instead
> -        of doing that we simply reset the MEM_EXPR of the RTL in question,
> -        then nobody can get at it and hence nobody can call DECL_RTL on it.  */
> -      if (!DECL_RTL_SET_P (var))
> -       {
> -         if (MEM_P (SA.partition_to_pseudo[i]))
> -           set_mem_expr (SA.partition_to_pseudo[i], NULL);
> -       }
> -    }
> -
>    /* If we have a class containing differently aligned pointers
>       we need to merge those into the corresponding RTL pointer
>       alignment.  */
> @@ -6069,7 +6243,6 @@ pass_expand::execute (function *fun)
>      {
>        tree name = ssa_name (i);
>        int part;
> -      rtx r;
>
>        if (!name
>           /* We might have generated new SSA names in
> @@ -6082,20 +6255,24 @@ pass_expand::execute (function *fun)
>        if (part == NO_PARTITION)
>         continue;
>
> -      /* Adjust all partition members to get the underlying decl of
> -        the representative which we might have created in expand_one_var.  */
> -      if (SSA_NAME_VAR (name) == NULL_TREE)
> +      gcc_assert (SA.partition_to_pseudo[part]);
> +
> +      /* If this decl was marked as living in multiple places, reset
> +        this now to NULL.  */
> +      tree var = SSA_NAME_VAR (name);
> +      if (var && DECL_RTL_IF_SET (var) == pc_rtx)
> +       SET_DECL_RTL (var, NULL);
> +      /* Check that the pseudos chosen by assign_parms are those of
> +        the corresponding default defs.  */
> +      else if (SSA_NAME_IS_DEFAULT_DEF (name)
> +              && (TREE_CODE (var) == PARM_DECL
> +                  || TREE_CODE (var) == RESULT_DECL))
>         {
> -         tree leader = partition_to_var (SA.map, part);
> -         gcc_assert (SSA_NAME_VAR (leader) != NULL_TREE);
> -         replace_ssa_name_symbol (name, SSA_NAME_VAR (leader));
> +         rtx in = DECL_RTL_IF_SET (var);
> +         gcc_assert (in);
> +         rtx out = SA.partition_to_pseudo[part];
> +         gcc_assert (in == out || rtx_equal_p (in, out));
>         }
> -      if (!POINTER_TYPE_P (TREE_TYPE (name)))
> -       continue;
> -
> -      r = SA.partition_to_pseudo[part];
> -      if (REG_P (r))
> -       mark_reg_pointer (r, get_pointer_alignment (name));
>      }
>
>    /* If this function is `main', emit a call to `__main'
> diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
> index a0b6e3e..602579d 100644
> --- a/gcc/cfgexpand.h
> +++ b/gcc/cfgexpand.h
> @@ -22,5 +22,7 @@ along with GCC; see the file COPYING3.  If not see
>
>  extern tree gimple_assign_rhs_to_tree (gimple);
>  extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
> +extern rtx get_rtl_for_parm_ssa_default_def (tree var);
> +
>
>  #endif /* GCC_CFGEXPAND_H */
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 380848c..2cdbea1 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2212,16 +2212,16 @@ Common Report Var(flag_tree_ch) Optimization
>  Enable loop header copying on trees
>
>  ftree-coalesce-inlined-vars
> -Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization
> -Enable coalescing of copy-related user variables that are inlined
> +Common Ignore RejectNegative
> +Does nothing.  Preserved for backward compatibility.
>
>  ftree-coalesce-vars
> -Common Report Var(flag_ssa_coalesce_vars,2) Optimization
> -Enable coalescing of all copy-related user variables
> +Common Report Var(flag_tree_coalesce_vars) Optimization
> +Enable SSA coalescing of user variables
>
>  ftree-copyrename
> -Common Report Var(flag_tree_copyrename) Optimization
> -Replace SSA temporaries with better names in copies
> +Common Ignore
> +Does nothing.  Preserved for backward compatibility.
>
>  ftree-copy-prop
>  Common Report Var(flag_tree_copy_prop) Optimization
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index c20dd4d..0a3b930 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -337,7 +337,6 @@ Objective-C and Objective-C++ Dialects}.
>  -fdump-tree-phiprop@r{[}-@var{n}@r{]} @gol
>  -fdump-tree-phiopt@r{[}-@var{n}@r{]} @gol
>  -fdump-tree-forwprop@r{[}-@var{n}@r{]} @gol
> --fdump-tree-copyrename@r{[}-@var{n}@r{]} @gol
>  -fdump-tree-nrv -fdump-tree-vect @gol
>  -fdump-tree-sink @gol
>  -fdump-tree-sra@r{[}-@var{n}@r{]} @gol
> @@ -443,9 +442,8 @@ Objective-C and Objective-C++ Dialects}.
>  -fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol
>  -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
>  -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
> --ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
> --ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
> --ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
> +-ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol
> +-ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
>  -ftree-loop-if-convert-stores -ftree-loop-im @gol
>  -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
>  -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
> @@ -6989,11 +6987,6 @@ name is made by appending @file{.phiopt} to the source file name.
>  Dump each function after forward propagating single use variables.  The file
>  name is made by appending @file{.forwprop} to the source file name.
>
> -@item copyrename
> -@opindex fdump-tree-copyrename
> -Dump each function after applying the copy rename optimization.  The file
> -name is made by appending @file{.copyrename} to the source file name.
> -
>  @item nrv
>  @opindex fdump-tree-nrv
>  Dump each function after applying the named return value optimization on
> @@ -7458,8 +7451,8 @@ compilation time.
>  -ftree-ccp @gol
>  -fssa-phiopt @gol
>  -ftree-ch @gol
> +-ftree-coalesce-vars @gol
>  -ftree-copy-prop @gol
> --ftree-copyrename @gol
>  -ftree-dce @gol
>  -ftree-dominator-opts @gol
>  -ftree-dse @gol
> @@ -8724,6 +8717,15 @@ profitable to parallelize the loops.
>  Compare the results of several data dependence analyzers.  This option
>  is used for debugging the data dependence analyzers.
>
> +@item -ftree-coalesce-vars
> +@opindex ftree-coalesce-vars
> +Tell the compiler to attempt to combine small user-defined variables
> +too, instead of just compiler temporaries.  This may severely limit the
> +ability to debug an optimized program compiled with
> +@option{-fno-var-tracking-assignments}.  In the negated form, this flag
> +prevents SSA coalescing of user variables.  This option is enabled by
> +default if optimization is enabled.
> +
>  @item -ftree-loop-if-convert
>  @opindex ftree-loop-if-convert
>  Attempt to transform conditional jumps in the innermost loops to
> @@ -8837,32 +8839,6 @@ Perform scalar replacement of aggregates.  This pass replaces structure
>  references with scalars to prevent committing structures to memory too
>  early.  This flag is enabled by default at @option{-O} and higher.
>
> -@item -ftree-copyrename
> -@opindex ftree-copyrename
> -Perform copy renaming on trees.  This pass attempts to rename compiler
> -temporaries to other variables at copy locations, usually resulting in
> -variable names which more closely resemble the original variables.  This flag
> -is enabled by default at @option{-O} and higher.
> -
> -@item -ftree-coalesce-inlined-vars
> -@opindex ftree-coalesce-inlined-vars
> -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
> -combine small user-defined variables too, but only if they are inlined
> -from other functions.  It is a more limited form of
> -@option{-ftree-coalesce-vars}.  This may harm debug information of such
> -inlined variables, but it keeps variables of the inlined-into
> -function apart from each other, such that they are more likely to
> -contain the expected values in a debugging session.
> -
> -@item -ftree-coalesce-vars
> -@opindex ftree-coalesce-vars
> -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
> -combine small user-defined variables too, instead of just compiler
> -temporaries.  This may severely limit the ability to debug an optimized
> -program compiled with @option{-fno-var-tracking-assignments}.  In the
> -negated form, this flag prevents SSA coalescing of user variables,
> -including inlined ones.  This option is enabled by default.
> -
>  @item -ftree-ter
>  @opindex ftree-ter
>  Perform temporary expression replacement during the SSA->normal phase.  Single
> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
> index b8dc7d5..ef31ba0f 100644
> --- a/gcc/emit-rtl.c
> +++ b/gcc/emit-rtl.c
> @@ -1229,6 +1229,11 @@ set_reg_attrs_for_parm (rtx parm_rtx, rtx mem)
>  void
>  set_reg_attrs_for_decl_rtl (tree t, rtx x)
>  {
> +  if (!t)
> +    return;
> +  tree tdecl = t;
> +  if (TREE_CODE (t) == TREE_LIST)
> +    tdecl = TREE_VALUE (t);

So it only uses the "first" entry?...
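
If the intent is that every decl in the list has the same mode, so that
looking at the first one is enough, a check along these lines would make
that explicit.  This is a standalone sketch under that assumption, with
made-up names (`mode_model', `decl_model', `modes_agree'), not GCC code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for machine_mode.
enum mode_model { MODE_QI, MODE_SI, MODE_DI };

// Hypothetical stand-in for a decl; only its mode matters here.
struct decl_model
{
  mode_model mode;
};

// Return true if every decl in DECLS has the same mode as the first
// one, i.e. if reading the mode off the first entry loses nothing.
bool modes_agree (const std::vector<decl_model> &decls)
{
  assert (!decls.empty ());
  for (const decl_model &d : decls)
    if (d.mode != decls.front ().mode)
      return false;
  return true;
}
```
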

>    if (GET_CODE (x) == SUBREG)
>      {
>        gcc_assert (subreg_lowpart_p (x));
> @@ -1237,7 +1242,7 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x)
>    if (REG_P (x))
>      REG_ATTRS (x)
>        = get_reg_attrs (t, byte_lowpart_offset (GET_MODE (x),
> -                                              DECL_MODE (t)));
> +                                              DECL_MODE (tdecl)));
>    if (GET_CODE (x) == CONCAT)
>      {
>        if (REG_P (XEXP (x, 0)))
> diff --git a/gcc/explow.c b/gcc/explow.c
> index de446a9..b53a3b7 100644
> --- a/gcc/explow.c
> +++ b/gcc/explow.c
> @@ -854,6 +854,31 @@ promote_decl_mode (const_tree decl, int *punsignedp)
>    return pmode;
>  }
>
> +/* Return the promoted mode for name.  If it is a named SSA_NAME, it
> +   is the same as promote_decl_mode.  Otherwise, it is the promoted
> +   mode of a temp decl of same type as the SSA_NAME, if we had created
> +   one.  */
> +
> +machine_mode
> +promote_ssa_mode (const_tree name, int *punsignedp)
> +{
> +  gcc_assert (TREE_CODE (name) == SSA_NAME);
> +
> +  if (SSA_NAME_VAR (name))
> +    return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);

As above I'd rather not have different paths for anonymous vs. non-anonymous
vars (so just delete the above two lines).

> +  tree type = TREE_TYPE (name);
> +  int unsignedp = TYPE_UNSIGNED (type);
> +  machine_mode mode = TYPE_MODE (type);
> +
> +  machine_mode pmode = promote_mode (type, mode, &unsignedp);
> +  if (punsignedp)
> +    *punsignedp = unsignedp;
> +
> +  return pmode;
> +}
> +
> +
>
>  /* Controls the behaviour of {anti_,}adjust_stack.  */
>  static bool suppress_reg_args_size;
> diff --git a/gcc/explow.h b/gcc/explow.h
> index 48f1859..7b11e46 100644
> --- a/gcc/explow.h
> +++ b/gcc/explow.h
> @@ -57,6 +57,9 @@ extern machine_mode promote_mode (const_tree, machine_mode, int *);
>  /* Return mode and signedness to use when object is promoted.  */
>  machine_mode promote_decl_mode (const_tree, int *);
>
> +/* Return mode and signedness to use when object is promoted.  */
> +machine_mode promote_ssa_mode (const_tree, int *);
> +
>  /* Remove some bytes from the stack.  An rtx says how many.  */
>  extern void adjust_stack (rtx);
>
> diff --git a/gcc/expr.c b/gcc/expr.c
> index 530a944..95a9bab 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -9388,7 +9388,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>    rtx op0, op1, temp, decl_rtl;
>    tree type;
>    int unsignedp;
> -  machine_mode mode;
> +  machine_mode mode, dmode;
>    enum tree_code code = TREE_CODE (exp);
>    rtx subtarget, original_target;
>    int ignore;
> @@ -9519,7 +9519,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>        if (g == NULL
>           && modifier == EXPAND_INITIALIZER
>           && !SSA_NAME_IS_DEFAULT_DEF (exp)
> -         && (optimize || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
> +         && (optimize || !SSA_NAME_VAR (exp)
> +             || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
>           && stmt_is_replaceable_p (SSA_NAME_DEF_STMT (exp)))
>         g = SSA_NAME_DEF_STMT (exp);
>        if (g)
> @@ -9598,15 +9599,18 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>        /* Ensure variable marked as used even if it doesn't go through
>          a parser.  If it hasn't been used yet, write out an external
>          definition.  */
> -      TREE_USED (exp) = 1;
> +      if (exp)
> +       TREE_USED (exp) = 1;
>
>        /* Show we haven't gotten RTL for this yet.  */
>        temp = 0;
>
>        /* Variables inherited from containing functions should have
>          been lowered by this point.  */
> -      context = decl_function_context (exp);
> -      gcc_assert (SCOPE_FILE_SCOPE_P (context)
> +      if (exp)
> +       context = decl_function_context (exp);
> +      gcc_assert (!exp
> +                 || SCOPE_FILE_SCOPE_P (context)
>                   || context == current_function_decl
>                   || TREE_STATIC (exp)
>                   || DECL_EXTERNAL (exp)
> @@ -9630,7 +9634,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>           decl_rtl = use_anchored_address (decl_rtl);
>           if (modifier != EXPAND_CONST_ADDRESS
>               && modifier != EXPAND_SUM
> -             && !memory_address_addr_space_p (DECL_MODE (exp),
> +             && !memory_address_addr_space_p (exp ? DECL_MODE (exp)
> +                                              : GET_MODE (decl_rtl),
>                                                XEXP (decl_rtl, 0),
>                                                MEM_ADDR_SPACE (decl_rtl)))
>             temp = replace_equiv_address (decl_rtl,
> @@ -9641,12 +9646,17 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>          if the address is a register.  */
>        if (temp != 0)
>         {
> -         if (MEM_P (temp) && REG_P (XEXP (temp, 0)))
> +         if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0)))
>             mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp));
>
>           return temp;
>         }
>
> +      if (exp)
> +       dmode = DECL_MODE (exp);
> +      else
> +       dmode = TYPE_MODE (TREE_TYPE (ssa_name));
> +
>        /* If the mode of DECL_RTL does not match that of the decl,
>          there are two cases: we are dealing with a BLKmode value
>          that is returned in a register, or we are dealing with
> @@ -9654,8 +9664,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>          of the wanted mode, but mark it so that we know that it
>          was already extended.  */
>        if (REG_P (decl_rtl)
> -         && DECL_MODE (exp) != BLKmode
> -         && GET_MODE (decl_rtl) != DECL_MODE (exp))
> +         && dmode != BLKmode
> +         && GET_MODE (decl_rtl) != dmode)
>         {
>           machine_mode pmode;
>
> @@ -9668,6 +9678,11 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>             pmode = promote_function_mode (type, mode, &unsignedp,
>                                            gimple_call_fntype (g),
>                                            2);
> +         else if (!exp)
> +           {
> +             gcc_assert (code == SSA_NAME);

promote_ssa_mode should assert this.

> +             pmode = promote_ssa_mode (ssa_name, &unsignedp);
> +           }
>           else
>             pmode = promote_decl_mode (exp, &unsignedp);
>           gcc_assert (GET_MODE (decl_rtl) == pmode);
> diff --git a/gcc/function.c b/gcc/function.c
> index 7d4df92..1f5296e 100644
> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -87,6 +87,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "cfganal.h"
>  #include "cfgbuild.h"
>  #include "cfgcleanup.h"
> +#include "cfgexpand.h"
>  #include "basic-block.h"
>  #include "df.h"
>  #include "params.h"
> @@ -2121,6 +2122,15 @@ aggregate_value_p (const_tree exp, const_tree fntype)
>  bool
>  use_register_for_decl (const_tree decl)
>  {
> +  if (TREE_CODE (decl) == SSA_NAME)
> +    {
> +      if (!SSA_NAME_VAR (decl))
> +       return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
> +         && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
> +
> +      decl = SSA_NAME_VAR (decl);

See above.  Please drop the SSA_NAME_VAR != NULL path.
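
That is, for an SSA_NAME the whole decision would be the type-based test
from the patch.  A standalone model of just that predicate, with
`type_model' and `use_register_for_ssa_name' as made-up names and the
two booleans standing in for TYPE_MODE == BLKmode and FLOAT_TYPE_P:

```cpp
#include <cassert>

// Hypothetical summary of the two type properties the test looks at.
struct type_model
{
  bool is_blkmode;  // stands in for TYPE_MODE (type) == BLKmode
  bool is_float;    // stands in for FLOAT_TYPE_P (type)
};

// An SSA name lives in a register unless its type has BLKmode, or it is
// a floating-point type and -ffloat-store is in effect.
bool use_register_for_ssa_name (const type_model &type, bool flag_float_store)
{
  return !type.is_blkmode && !(flag_float_store && type.is_float);
}
```
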

> +    }
> +
>    if (!targetm.calls.allocate_stack_slots_for_args ())
>      return true;
>
> @@ -2804,23 +2814,88 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data)
>    data->entry_parm = entry_parm;
>  }
>
> +/* Wrapper for use_register_for_decl, that special-cases the
> +   .result_ptr as the function's RESULT_DECL when the RESULT_DECL is
> +   passed by reference.  */
> +
> +static bool
> +use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm)
> +{
> +  if (parm == all->function_result_decl)
> +    {
> +      tree result = DECL_RESULT (current_function_decl);
> +
> +      if (DECL_BY_REFERENCE (result))
> +       parm = result;
> +    }
> +
> +  return use_register_for_decl (parm);
> +}
> +
> +/* Wrapper for get_rtl_for_parm_ssa_default_def, that special-cases
> +   the .result_ptr as the function's RESULT_DECL when the RESULT_DECL
> +   is passed by reference.  */
> +
> +static rtx
> +rtl_for_parm (struct assign_parm_data_all *all, tree parm)
> +{
> +  if (parm == all->function_result_decl)
> +    {
> +      tree result = DECL_RESULT (current_function_decl);
> +
> +      if (!DECL_BY_REFERENCE (result))
> +       return NULL_RTX;
> +
> +      parm = result;
> +    }
> +
> +  return get_rtl_for_parm_ssa_default_def (parm);
> +}
> +
> +/* Reset the location of PARM_DECLs and RESULT_DECLs that had
> +   SSA_NAMEs in multiple partitions, so that assign_parms will choose
> +   the default def, if it exists, or create new RTL to hold the unused
> +   entry value.  If we are coalescing across variables, we want to
> +   reset the location too, because a parm without a default def
> +   (incoming value unused) might be coalesced with one with a default
> +   def, and then assign_parms would copy both incoming values to the
> +   same location, which might cause the wrong value to survive.  */
> +static void
> +maybe_reset_rtl_for_parm (tree parm)
> +{
> +  gcc_assert (TREE_CODE (parm) == PARM_DECL
> +             || TREE_CODE (parm) == RESULT_DECL);
> +  if ((flag_tree_coalesce_vars
> +       || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx))
> +      && is_gimple_reg (parm))
> +    SET_DECL_RTL (parm, NULL_RTX);
> +}
> +
>  /* A subroutine of assign_parms.  Adjust DATA->STACK_RTL such that it's
>     always valid and properly aligned.  */
>
>  static void
> -assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data)
> +assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm,
> +                             struct assign_parm_data_one *data)
>  {
>    rtx stack_parm = data->stack_parm;
>
> +  /* If out-of-SSA assigned RTL to the parm default def, make sure we
> +     don't use what we might have computed before.  */
> +  rtx ssa_assigned = rtl_for_parm (all, parm);
> +  if (ssa_assigned)
> +    stack_parm = NULL;
> +
>    /* If we can't trust the parm stack slot to be aligned enough for its
>       ultimate type, don't use that slot after entry.  We'll make another
>       stack slot, if we need one.  */
> -  if (stack_parm
> -      && ((STRICT_ALIGNMENT
> -          && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm))
> -         || (data->nominal_type
> -             && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
> -             && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
> +  else if (stack_parm
> +          && ((STRICT_ALIGNMENT
> +               && (GET_MODE_ALIGNMENT (data->nominal_mode)
> +                   > MEM_ALIGN (stack_parm)))
> +              || (data->nominal_type
> +                  && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
> +                  && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
>      stack_parm = NULL;
>
>    /* If parm was passed in memory, and we need to convert it on entry,
> @@ -2882,11 +2957,16 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>
>    size = int_size_in_bytes (data->passed_type);
>    size_stored = CEIL_ROUND (size, UNITS_PER_WORD);
> +
>    if (stack_parm == 0)
>      {
>        DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
> -      stack_parm = assign_stack_local (BLKmode, size_stored,
> -                                      DECL_ALIGN (parm));
> +      stack_parm = rtl_for_parm (all, parm);
> +      if (!stack_parm)
> +       stack_parm = assign_stack_local (BLKmode, size_stored,
> +                                        DECL_ALIGN (parm));
> +      else
> +       stack_parm = copy_rtx (stack_parm);
>        if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
>         PUT_MODE (stack_parm, GET_MODE (entry_parm));
>        set_mem_attributes (stack_parm, parm, 1);
> @@ -3027,10 +3107,19 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>      = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
>                              TREE_TYPE (current_function_decl), 2);
>
> -  parmreg = gen_reg_rtx (promoted_nominal_mode);
> +  rtx from_expand = rtl_for_parm (all, parm);
>
> -  if (!DECL_ARTIFICIAL (parm))
> -    mark_user_reg (parmreg);
> +  if (from_expand && !data->passed_pointer)
> +    {
> +      parmreg = from_expand;
> +      gcc_assert (GET_MODE (parmreg) == promoted_nominal_mode);
> +    }
> +  else
> +    {
> +      parmreg = gen_reg_rtx (promoted_nominal_mode);
> +      if (!DECL_ARTIFICIAL (parm))
> +       mark_user_reg (parmreg);
> +    }
>
>    /* If this was an item that we received a pointer to,
>       set DECL_RTL appropriately.  */
> @@ -3049,6 +3138,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>       assign_parm_find_data_types and expand_expr_real_1.  */
>
>    equiv_stack_parm = data->stack_parm;
> +  if (!equiv_stack_parm)
> +    equiv_stack_parm = data->entry_parm;
>    validated_mem = validize_mem (copy_rtx (data->entry_parm));
>
>    need_conversion = (data->nominal_mode != data->passed_mode
> @@ -3189,11 +3280,17 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>
>    /* If we were passed a pointer but the actual value can safely live
>       in a register, retrieve it and use it directly.  */
> -  if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode)
> +  if (data->passed_pointer
> +      && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode))
>      {
>        /* We can't use nominal_mode, because it will have been set to
>          Pmode above.  We must use the actual mode of the parm.  */
> -      if (use_register_for_decl (parm))
> +      if (from_expand)
> +       {
> +         parmreg = from_expand;
> +         gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm)));
> +       }
> +      else if (use_register_for_decl (parm))
>         {
>           parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm)));
>           mark_user_reg (parmreg);
> @@ -3233,7 +3330,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>
>        /* STACK_PARM is the pointer, not the parm, and PARMREG is
>          now the parm.  */
> -      data->stack_parm = NULL;
> +      data->stack_parm = equiv_stack_parm = NULL;
>      }
>
>    /* Mark the register as eliminable if we did no conversion and it was
> @@ -3243,11 +3340,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>       make here would screw up life analysis for it.  */
>    if (data->nominal_mode == data->passed_mode
>        && !did_conversion
> -      && data->stack_parm != 0
> -      && MEM_P (data->stack_parm)
> +      && equiv_stack_parm != 0
> +      && MEM_P (equiv_stack_parm)
>        && data->locate.offset.var == 0
>        && reg_mentioned_p (virtual_incoming_args_rtx,
> -                         XEXP (data->stack_parm, 0)))
> +                         XEXP (equiv_stack_parm, 0)))
>      {
>        rtx_insn *linsn = get_last_insn ();
>        rtx_insn *sinsn;
> @@ -3260,8 +3357,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>             = GET_MODE_INNER (GET_MODE (parmreg));
>           int regnor = REGNO (XEXP (parmreg, 0));
>           int regnoi = REGNO (XEXP (parmreg, 1));
> -         rtx stackr = adjust_address_nv (data->stack_parm, submode, 0);
> -         rtx stacki = adjust_address_nv (data->stack_parm, submode,
> +         rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0);
> +         rtx stacki = adjust_address_nv (equiv_stack_parm, submode,
>                                           GET_MODE_SIZE (submode));
>
>           /* Scan backwards for the set of the real and
> @@ -3334,6 +3431,13 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
>
>        if (data->stack_parm == 0)
>         {
> +         rtx x = data->stack_parm = rtl_for_parm (all, parm);
> +         if (x)
> +           gcc_assert (GET_MODE (x) == GET_MODE (data->entry_parm));
> +       }
> +
> +      if (data->stack_parm == 0)
> +       {
>           int align = STACK_SLOT_ALIGNMENT (data->passed_type,
>                                             GET_MODE (data->entry_parm),
>                                             TYPE_ALIGN (data->passed_type));
> @@ -3592,6 +3696,8 @@ assign_parms (tree fndecl)
>           DECL_INCOMING_RTL (parm) = DECL_RTL (parm);
>           continue;
>         }
> +      else
> +       maybe_reset_rtl_for_parm (parm);
>
>        /* Estimate stack alignment from parameter alignment.  */
>        if (SUPPORTS_STACK_ALIGNMENT)
> @@ -3641,7 +3747,9 @@ assign_parms (tree fndecl)
>        else
>         set_decl_incoming_rtl (parm, data.entry_parm, false);
>
> -      /* Boudns should be loaded in the particular order to
> +      assign_parm_adjust_stack_rtl (&all, parm, &data);
> +
> +      /* Bounds should be loaded in the particular order to
>          have registers allocated correctly.  Collect info about
>          input bounds and load them later.  */
>        if (POINTER_BOUNDS_TYPE_P (data.passed_type))
> @@ -3658,11 +3766,10 @@ assign_parms (tree fndecl)
>         }
>        else
>         {
> -         assign_parm_adjust_stack_rtl (&data);
> -
>           if (assign_parm_setup_block_p (&data))
>             assign_parm_setup_block (&all, parm, &data);
> -         else if (data.passed_pointer || use_register_for_decl (parm))
> +         else if (data.passed_pointer
> +                  || use_register_for_parm_decl (&all, parm))
>             assign_parm_setup_reg (&all, parm, &data);
>           else
>             assign_parm_setup_stack (&all, parm, &data);
> @@ -5001,7 +5108,9 @@ expand_function_start (tree subr)
>       before any library calls that assign parms might generate.  */
>
>    /* Decide whether to return the value in memory or in a register.  */
> -  if (aggregate_value_p (DECL_RESULT (subr), subr))
> +  tree res = DECL_RESULT (subr);
> +  maybe_reset_rtl_for_parm (res);
> +  if (aggregate_value_p (res, subr))
>      {
>        /* Returning something that won't go in a register.  */
>        rtx value_address = 0;
> @@ -5009,7 +5118,7 @@ expand_function_start (tree subr)
>  #ifdef PCC_STATIC_STRUCT_RETURN
>        if (cfun->returns_pcc_struct)
>         {
> -         int size = int_size_in_bytes (TREE_TYPE (DECL_RESULT (subr)));
> +         int size = int_size_in_bytes (TREE_TYPE (res));
>           value_address = assemble_static_space (size);
>         }
>        else
> @@ -5021,36 +5130,45 @@ expand_function_start (tree subr)
>              it.  */
>           if (sv)
>             {
> -             value_address = gen_reg_rtx (Pmode);
> +             if (DECL_BY_REFERENCE (res))
> +               value_address = get_rtl_for_parm_ssa_default_def (res);
> +             if (!value_address)
> +               value_address = gen_reg_rtx (Pmode);
>               emit_move_insn (value_address, sv);
>             }
>         }
>        if (value_address)
>         {
>           rtx x = value_address;
> -         if (!DECL_BY_REFERENCE (DECL_RESULT (subr)))
> +         if (!DECL_BY_REFERENCE (res))
>             {
> -             x = gen_rtx_MEM (DECL_MODE (DECL_RESULT (subr)), x);
> -             set_mem_attributes (x, DECL_RESULT (subr), 1);
> +             x = get_rtl_for_parm_ssa_default_def (res);
> +             if (!x)
> +               {
> +                 x = gen_rtx_MEM (DECL_MODE (res), value_address);
> +                 set_mem_attributes (x, res, 1);
> +               }
>             }
> -         SET_DECL_RTL (DECL_RESULT (subr), x);
> +         SET_DECL_RTL (res, x);
>         }
>      }
> -  else if (DECL_MODE (DECL_RESULT (subr)) == VOIDmode)
> +  else if (DECL_MODE (res) == VOIDmode)
>      /* If return mode is void, this decl rtl should not be used.  */
> -    SET_DECL_RTL (DECL_RESULT (subr), NULL_RTX);
> +    SET_DECL_RTL (res, NULL_RTX);
>    else
>      {
>        /* Compute the return values into a pseudo reg, which we will copy
>          into the true return register after the cleanups are done.  */
> -      tree return_type = TREE_TYPE (DECL_RESULT (subr));
> -      if (TYPE_MODE (return_type) != BLKmode
> -         && targetm.calls.return_in_msb (return_type))
> +      tree return_type = TREE_TYPE (res);
> +      rtx x = get_rtl_for_parm_ssa_default_def (res);
> +      if (x)
> +       /* Use it.  */;
> +      else if (TYPE_MODE (return_type) != BLKmode
> +              && targetm.calls.return_in_msb (return_type))
>         /* expand_function_end will insert the appropriate padding in
>            this case.  Use the return value's natural (unpadded) mode
>            within the function proper.  */
> -       SET_DECL_RTL (DECL_RESULT (subr),
> -                     gen_reg_rtx (TYPE_MODE (return_type)));
> +       x = gen_reg_rtx (TYPE_MODE (return_type));
>        else
>         {
>           /* In order to figure out what mode to use for the pseudo, we
> @@ -5061,25 +5179,26 @@ expand_function_start (tree subr)
>           /* Structures that are returned in registers are not
>              aggregate_value_p, so we may see a PARALLEL or a REG.  */
>           if (REG_P (hard_reg))
> -           SET_DECL_RTL (DECL_RESULT (subr),
> -                         gen_reg_rtx (GET_MODE (hard_reg)));
> +           x = gen_reg_rtx (GET_MODE (hard_reg));
>           else
>             {
>               gcc_assert (GET_CODE (hard_reg) == PARALLEL);
> -             SET_DECL_RTL (DECL_RESULT (subr), gen_group_rtx (hard_reg));
> +             x = gen_group_rtx (hard_reg);
>             }
>         }
>
> +      SET_DECL_RTL (res, x);
> +
>        /* Set DECL_REGISTER flag so that expand_function_end will copy the
>          result to the real return register(s).  */
> -      DECL_REGISTER (DECL_RESULT (subr)) = 1;
> +      DECL_REGISTER (res) = 1;
>
>        if (chkp_function_instrumented_p (current_function_decl))
>         {
> -         tree return_type = TREE_TYPE (DECL_RESULT (subr));
> +         tree return_type = TREE_TYPE (res);
>           rtx bounds = targetm.calls.chkp_function_value_bounds (return_type,
>                                                                  subr, 1);
> -         SET_DECL_BOUNDS_RTL (DECL_RESULT (subr), bounds);
> +         SET_DECL_BOUNDS_RTL (res, bounds);
>         }
>      }
>
> @@ -5093,7 +5212,9 @@ expand_function_start (tree subr)
>        tree parm = cfun->static_chain_decl;
>        rtx local, chain, insn;
>
> -      local = gen_reg_rtx (Pmode);
> +      local = get_rtl_for_parm_ssa_default_def (parm);
> +      if (!local)
> +       local = gen_reg_rtx (Pmode);
>        chain = targetm.calls.static_chain (current_function_decl, true);
>
>        set_decl_incoming_rtl (parm, chain, false);
> diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
> index efc93b7..e29f300 100644
> --- a/gcc/gimple-expr.c
> +++ b/gcc/gimple-expr.c
> @@ -391,45 +391,6 @@ copy_var_decl (tree var, tree name, tree type)
>    return copy;
>  }
>
> -/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
> -   coalescing together, false otherwise.
> -
> -   This must stay consistent with var_map_base_init in tree-ssa-live.c.  */
> -
> -bool
> -gimple_can_coalesce_p (tree name1, tree name2)
> -{
> -  /* First check the SSA_NAME's associated DECL.  We only want to
> -     coalesce if they have the same DECL or both have no associated DECL.  */
> -  tree var1 = SSA_NAME_VAR (name1);
> -  tree var2 = SSA_NAME_VAR (name2);
> -  var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
> -  var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
> -  if (var1 != var2)
> -    return false;
> -
> -  /* Now check the types.  If the types are the same, then we should
> -     try to coalesce V1 and V2.  */
> -  tree t1 = TREE_TYPE (name1);
> -  tree t2 = TREE_TYPE (name2);
> -  if (t1 == t2)
> -    return true;
> -
> -  /* If the types are not the same, check for a canonical type match.  This
> -     (for example) allows coalescing when the types are fundamentally the
> -     same, but just have different names.
> -
> -     Note pointer types with different address spaces may have the same
> -     canonical type.  Those are rejected for coalescing by the
> -     types_compatible_p check.  */
> -  if (TYPE_CANONICAL (t1)
> -      && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
> -      && types_compatible_p (t1, t2))
> -    return true;
> -
> -  return false;
> -}
> -
>  /* Strip off a legitimate source ending from the input string NAME of
>     length LEN.  Rather than having to know the names used by all of
>     our front ends, we strip off an ending of a period followed by
> diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
> index a50a90a..b492137 100644
> --- a/gcc/gimple-expr.h
> +++ b/gcc/gimple-expr.h
> @@ -28,7 +28,6 @@ extern gimple_seq gimple_body (tree);
>  extern bool gimple_has_body_p (tree);
>  extern const char *gimple_decl_printable_name (tree, int);
>  extern tree copy_var_decl (tree, tree, tree);
> -extern bool gimple_can_coalesce_p (tree, tree);
>  extern tree create_tmp_var_name (const char *);
>  extern tree create_tmp_var_raw (tree, const char * = NULL);
>  extern tree create_tmp_var (tree, const char * = NULL);
> @@ -56,6 +55,10 @@ extern bool is_gimple_mem_ref_addr (tree);
>  extern void mark_addressable (tree);
>  extern bool is_gimple_reg_rhs (tree);
>
> +/* Defined in tree-ssa-coalesce.c.   */

Err, why not put this declaration in tree-ssa-coalesce.h instead?

> +extern bool gimple_can_coalesce_p (tree, tree);
> +
> +
>  /* Return true if a conversion from either type of TYPE1 and TYPE2
>     to the other is not required.  Otherwise return false.  */
>
> diff --git a/gcc/opts.c b/gcc/opts.c
> index 39c190d..7e41b1f 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -448,12 +448,12 @@ static const struct default_options default_options_table[] =
>      { OPT_LEVELS_1_PLUS, OPT_fsplit_wide_types, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_ccp, NULL, 1 },
>      { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_bit_ccp, NULL, 1 },
> +    { OPT_LEVELS_1_PLUS, OPT_ftree_coalesce_vars, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_dce, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_dominator_opts, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 },
>      { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 },
> -    { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 },
> diff --git a/gcc/passes.def b/gcc/passes.def
> index ffa63b5..4548b20 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -77,7 +77,6 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_all_early_optimizations);
>        PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
>           NEXT_PASS (pass_remove_cgraph_callee_edges);
> -         NEXT_PASS (pass_rename_ssa_copies);
>           NEXT_PASS (pass_object_sizes);
>           NEXT_PASS (pass_ccp);
>           /* After CCP we rewrite no longer addressed locals into SSA
> @@ -154,7 +153,6 @@ along with GCC; see the file COPYING3.  If not see
>        /* Initial scalar cleanups before alias computation.
>          They ensure memory accesses are not indirect wherever possible.  */
>        NEXT_PASS (pass_strip_predict_hints);
> -      NEXT_PASS (pass_rename_ssa_copies);
>        NEXT_PASS (pass_ccp);
>        /* After CCP we rewrite no longer addressed locals into SSA
>          form if possible.  */
> @@ -182,7 +180,6 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_stdarg);
>        NEXT_PASS (pass_lower_complex);
>        NEXT_PASS (pass_sra);
> -      NEXT_PASS (pass_rename_ssa_copies);
>        /* The dom pass will also resolve all __builtin_constant_p calls
>           that are still there to 0.  This has to be done after some
>          propagations have already run, but before some more dead code
> @@ -291,7 +288,6 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_fold_builtins);
>        NEXT_PASS (pass_optimize_widening_mul);
>        NEXT_PASS (pass_tail_calls);
> -      NEXT_PASS (pass_rename_ssa_copies);
>        /* FIXME: If DCE is not run before checking for uninitialized uses,
>          we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
>          However, this also causes us to misdiagnose cases that should be
> @@ -326,7 +322,6 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_dce);
>        NEXT_PASS (pass_asan);
>        NEXT_PASS (pass_tsan);
> -      NEXT_PASS (pass_rename_ssa_copies);
>        /* ???  We do want some kind of loop invariant motion, but we possibly
>           need to adjust LIM to be more friendly towards preserving accurate
>          debug information here.  */
> diff --git a/gcc/testsuite/gcc.dg/guality/pr54200.c b/gcc/testsuite/gcc.dg/guality/pr54200.c
> index 9b17187..e1e7293 100644
> --- a/gcc/testsuite/gcc.dg/guality/pr54200.c
> +++ b/gcc/testsuite/gcc.dg/guality/pr54200.c
> @@ -1,6 +1,6 @@
>  /* PR tree-optimization/54200 */
>  /* { dg-do run } */
> -/* { dg-options "-g -fno-var-tracking-assignments" } */
> +/* { dg-options "-g -fno-var-tracking-assignments -fno-tree-coalesce-vars" } */
>
>  int o __attribute__((used));
>
> diff --git a/gcc/testsuite/gcc.dg/ssp-1.c b/gcc/testsuite/gcc.dg/ssp-1.c
> index 5467f4d..db69332 100644
> --- a/gcc/testsuite/gcc.dg/ssp-1.c
> +++ b/gcc/testsuite/gcc.dg/ssp-1.c
> @@ -12,7 +12,7 @@ __stack_chk_fail (void)
>
>  int main ()
>  {
> -  int i;
> +  register int i;
>    char foo[255];
>
>    // smash stack
> diff --git a/gcc/testsuite/gcc.dg/ssp-2.c b/gcc/testsuite/gcc.dg/ssp-2.c
> index 9a7ac32..752fe53 100644
> --- a/gcc/testsuite/gcc.dg/ssp-2.c
> +++ b/gcc/testsuite/gcc.dg/ssp-2.c
> @@ -14,7 +14,7 @@ __stack_chk_fail (void)
>  void
>  overflow()
>  {
> -  int i = 0;
> +  register int i = 0;
>    char foo[30];
>
>    /* Overflow buffer.  */
> diff --git a/gcc/testsuite/gcc.dg/torture/parm-coalesce.c b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
> new file mode 100644
> index 0000000..dbd81c1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
> @@ -0,0 +1,40 @@
> +/* { dg-do run } */
> +
> +#include <stdlib.h>
> +
> +/* Make sure we don't coalesce both incoming parms, one whose incoming
> +   value is unused, to the same location, so as to overwrite one of
> +   them with the incoming value of the other.  */
> +
> +int __attribute__((noinline, noclone))
> +foo (int i, int j)
> +{
> +  j = i; /* The incoming value for J is unused.  */
> +  i = 2;
> +  if (j)
> +    j++;
> +  j += i + 1;
> +  return j;
> +}
> +
> +/* Same as foo, but with swapped parameters.  */
> +int __attribute__((noinline, noclone))
> +bar (int j, int i)
> +{
> +  j = i; /* The incoming value for J is unused.  */
> +  i = 2;
> +  if (j)
> +    j++;
> +  j += i + 1;
> +  return j;
> +}
> +
> +int
> +main (void)
> +{
> +  if (foo (0, 1) != 3)
> +    abort ();
> +  if (bar (1, 0) != 3)
> +    abort ();
> +  return 0;
> +}
> diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
> index e6310cd..e62f36b 100644
> --- a/gcc/tree-outof-ssa.c
> +++ b/gcc/tree-outof-ssa.c
> @@ -330,12 +330,13 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
>
>    start_sequence ();
>
> -  var = SSA_NAME_VAR (partition_to_var (SA.map, dest));
> +  tree name = partition_to_var (SA.map, dest);
> +  var = SSA_NAME_VAR (name);
>    src_mode = TYPE_MODE (TREE_TYPE (src));
>    dest_mode = GET_MODE (dest_rtx);
> -  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var)));
> +  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var ? var : name)));

The TREE_TYPE of an SSA name and that of its SSA_NAME_VAR are always the
same, so just use TREE_TYPE (name) here.

>    gcc_assert (!REG_P (dest_rtx)
> -             || dest_mode == promote_decl_mode (var, &unsignedp));
> +             || dest_mode == promote_ssa_mode (name, &unsignedp));
>
>    if (src_mode != dest_mode)
>      {
> @@ -714,12 +715,12 @@ static rtx
>  get_temp_reg (tree name)
>  {
>    tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
> -  tree type = TREE_TYPE (var);
> +  tree type = var ? TREE_TYPE (var) : TREE_TYPE (name);

See above: TREE_TYPE (name) is enough here as well.

>    int unsignedp;
> -  machine_mode reg_mode = promote_decl_mode (var, &unsignedp);
> +  machine_mode reg_mode = promote_ssa_mode (name, &unsignedp);
>    rtx x = gen_reg_rtx (reg_mode);
>    if (POINTER_TYPE_P (type))
> -    mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var))));
> +    mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type)));
>    return x;
>  }
>
> @@ -1019,7 +1020,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
>
>    /* Return to viewing the variable list as just all reference variables after
>       coalescing has been performed.  */
> -  partition_view_normal (map, false);
> +  partition_view_normal (map);
>
>    if (dump_file && (dump_flags & TDF_DETAILS))
>      {
> diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
> index eeac5a4..c2cdeef0 100644
> --- a/gcc/tree-ssa-coalesce.c
> +++ b/gcc/tree-ssa-coalesce.c
> @@ -58,6 +58,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-ssanames.h"
>  #include "tree-ssa-live.h"
>  #include "tree-ssa-coalesce.h"
> +#include "explow.h"
>  #include "diagnostic-core.h"
>
>
> @@ -832,6 +833,16 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
>    basic_block bb;
>    ssa_op_iter iter;
>    live_track_p live;
> +  basic_block entry;
> +
> +  /* If inter-variable coalescing is enabled, we may attempt to
> +     coalesce variables from different base variables, including
> +     different parameters, so we have to make sure default defs live
> +     at the entry block conflict with each other.  */
> +  if (flag_tree_coalesce_vars)
> +    entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
> +  else
> +    entry = NULL;
>
>    map = live_var_map (liveinfo);
>    graph = ssa_conflicts_new (num_var_partitions (map));
> @@ -890,6 +901,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
>             live_track_process_def (live, result, graph);
>         }
>
> +      /* Pretend there are defs for params' default defs at the start
> +        of the (post-)entry block.  */
> +      if (bb == entry)
> +       {
> +         unsigned base;
> +         bitmap_iterator bi;
> +         EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi)
> +           {
> +             bitmap_iterator bi2;
> +             unsigned part;
> +             EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base],
> +                                       0, part, bi2)
> +               {
> +                 tree var = partition_to_var (map, part);
> +                 if (!SSA_NAME_VAR (var)
> +                     || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL
> +                         && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL)
> +                     || !SSA_NAME_IS_DEFAULT_DEF (var))
> +                   continue;
> +                 live_track_process_def (live, var, graph);
> +               }
> +           }
> +       }
> +
>       live_track_clear_base_vars (live);
>      }
>
> @@ -1158,6 +1193,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
>      {
>        var1 = partition_to_var (map, p1);
>        var2 = partition_to_var (map, p2);
> +
>        z = var_union (map, var1, var2);
>        if (z == NO_PARTITION)
>         {
> @@ -1175,6 +1211,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
>
>        if (debug)
>         fprintf (debug, ": Success -> %d\n", z);
> +
>        return true;
>      }
>
> @@ -1272,6 +1309,330 @@ ssa_name_var_hash::equal (const tree_node *n1, const tree_node *n2)
>  }
>
>
> +/* Output partition map MAP with coalescing plan PART to file F.  */
> +
> +void
> +dump_part_var_map (FILE *f, partition part, var_map map)
> +{
> +  int t;
> +  unsigned x, y;
> +  int p;
> +
> +  fprintf (f, "\nCoalescible Partition map \n\n");
> +
> +  for (x = 0; x < map->num_partitions; x++)
> +    {
> +      if (map->view_to_partition != NULL)
> +       p = map->view_to_partition[x];
> +      else
> +       p = x;
> +
> +      if (ssa_name (p) == NULL_TREE
> +         || virtual_operand_p (ssa_name (p)))
> +        continue;
> +
> +      t = 0;
> +      for (y = 1; y < num_ssa_names; y++)
> +        {
> +         tree var = version_to_var (map, y);
> +         if (!var)
> +           continue;
> +         int q = var_to_partition (map, var);
> +         p = partition_find (part, q);
> +         gcc_assert (map->partition_to_base_index[q]
> +                     == map->partition_to_base_index[p]);
> +
> +         if (p == (int)x)
> +           {
> +             if (t++ == 0)
> +               {
> +                 fprintf (f, "Partition %d, base %d (", x,
> +                          map->partition_to_base_index[q]);
> +                 print_generic_expr (f, partition_to_var (map, q), TDF_SLIM);
> +                 fprintf (f, " - ");
> +               }
> +             fprintf (f, "%d ", y);
> +           }
> +       }
> +      if (t != 0)
> +       fprintf (f, ")\n");
> +    }
> +  fprintf (f, "\n");
> +}
> +
> +/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
> +   coalescing together, false otherwise.
> +
> +   This must stay consistent with var_map_base_init in tree-ssa-live.c.  */
> +
> +bool
> +gimple_can_coalesce_p (tree name1, tree name2)
> +{
> +  /* First check the SSA_NAME's associated DECL.  Without
> +     optimization, we only want to coalesce if they have the same DECL
> +     or both have no associated DECL.  */
> +  tree var1 = SSA_NAME_VAR (name1);
> +  tree var2 = SSA_NAME_VAR (name2);
> +  var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
> +  var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
> +  if (var1 != var2 && !flag_tree_coalesce_vars)
> +    return false;
> +
> +  /* Now check the types.  If the types are the same, then we should
> +     try to coalesce V1 and V2.  */
> +  tree t1 = TREE_TYPE (name1);
> +  tree t2 = TREE_TYPE (name2);
> +  if (t1 == t2)
> +    {
> +    check_modes:
> +      /* If the base variables are the same, we're good: none of the
> +        other tests below could possibly fail.  */
> +      var1 = SSA_NAME_VAR (name1);
> +      var2 = SSA_NAME_VAR (name2);
> +      if (var1 == var2)
> +       return true;
> +
> +      /* We don't want to coalesce two SSA names if one of the base
> +        variables is supposed to be a register while the other is
> +        supposed to be on the stack.  Anonymous SSA names take
> +        registers, but when not optimizing, user variables should go
> +        on the stack, so coalescing them with the anonymous variable
> +        as the partition leader would end up assigning the user
> +        variable to a register.  Don't do that!  */
> +      bool reg1 = !var1 || use_register_for_decl (var1);
> +      bool reg2 = !var2 || use_register_for_decl (var2);
> +      if (reg1 != reg2)
> +       return false;
> +
> +      /* Check that the promoted modes are the same.  We don't want to
> +        coalesce if the promoted modes would be different.  Only
> +        PARM_DECLs and RESULT_DECLs have different promotion rules,
> +        so skip the test if we both are variables or anonymous
> +        SSA_NAMEs.  */
> +      return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
> +       || promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL);
> +    }
> +
> +  /* If the types are not the same, check for a canonical type match.  This
> +     (for example) allows coalescing when the types are fundamentally the
> +     same, but just have different names.
> +
> +     Note pointer types with different address spaces may have the same
> +     canonical type.  Those are rejected for coalescing by the
> +     types_compatible_p check.  */
> +  if (TYPE_CANONICAL (t1)
> +      && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
> +      && types_compatible_p (t1, t2))
> +    goto check_modes;
> +
> +  return false;
> +}
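
To make the decision order above easier to follow, here is a standalone
toy model of it without the goto.  Everything in it is hypothetical
(struct toy_name, toy_can_coalesce_p, plain ints standing in for trees,
types and modes), coalesce_vars plays the role of
flag_tree_coalesce_vars, and it leaves out the DECL_IGNORED_P filtering,
so take it as a sketch of the control flow only, not as a replacement:

```c
#include <stdbool.h>

/* Toy stand-in for an SSA name.  All fields are hypothetical.  */
struct toy_name
{
  int var;            /* Base variable id, 0 for an anonymous name.  */
  bool var_is_parm;   /* Base is a PARM_DECL or a RESULT_DECL.  */
  bool var_in_reg;    /* What use_register_for_decl would answer.  */
  int type;           /* Type id; equal ids mean compatible types.  */
  int promoted_mode;  /* What promote_ssa_mode would return.  */
};

static bool
toy_can_coalesce_p (const struct toy_name *n1, const struct toy_name *n2,
                    bool coalesce_vars)
{
  /* Without inter-variable coalescing, the base variables must match.  */
  if (n1->var != n2->var && !coalesce_vars)
    return false;

  /* Types must be the same or compatible.  */
  if (n1->type != n2->type)
    return false;

  /* Same base variable: none of the remaining tests can fail.  */
  if (n1->var == n2->var)
    return true;

  /* Anonymous names live in registers; don't mix register and stack.  */
  bool reg1 = n1->var == 0 || n1->var_in_reg;
  bool reg2 = n2->var == 0 || n2->var_in_reg;
  if (reg1 != reg2)
    return false;

  /* Only parms and results have promotion rules of their own.  */
  if (!n1->var_is_parm && !n2->var_is_parm)
    return true;

  return n1->promoted_mode == n2->promoted_mode;
}
```

In this model, two names with different register-allocated bases and
the same type are rejected when coalesce_vars is false and accepted
when it is true, unless one of them is a parm whose promoted mode
differs.
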
> +
> +/* Fill in MAP's partition_to_base_index, with one index for each
> +   partition of SSA names USED_IN_COPIES and related by CL coalesce
> +   possibilities.  This must match gimple_can_coalesce_p in the
> +   optimized case.  */
> +
> +static void
> +compute_optimized_partition_bases (var_map map, bitmap used_in_copies,
> +                                  coalesce_list_p cl)
> +{
> +  int parts = num_var_partitions (map);
> +  partition tentative = partition_new (parts);
> +
> +  /* Partition the SSA versions so that, for each coalescible
> +     pair, both of its members are in the same partition in
> +     TENTATIVE.  */
> +  gcc_assert (!cl->sorted);
> +  coalesce_pair_p node;
> +  coalesce_iterator_type ppi;
> +  FOR_EACH_PARTITION_PAIR (node, ppi, cl)
> +    {
> +      tree v1 = ssa_name (node->first_element);
> +      int p1 = partition_find (tentative, var_to_partition (map, v1));
> +      tree v2 = ssa_name (node->second_element);
> +      int p2 = partition_find (tentative, var_to_partition (map, v2));
> +
> +      if (p1 == p2)
> +       continue;
> +
> +      partition_union (tentative, p1, p2);
> +    }
> +
> +  /* We have to deal with cost one pairs too.  */
> +  for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next)
> +    {
> +      tree v1 = ssa_name (co->first_element);
> +      int p1 = partition_find (tentative, var_to_partition (map, v1));
> +      tree v2 = ssa_name (co->second_element);
> +      int p2 = partition_find (tentative, var_to_partition (map, v2));
> +
> +      if (p1 == p2)
> +       continue;
> +
> +      partition_union (tentative, p1, p2);
> +    }
> +
> +  /* And also with abnormal edges.  */
> +  basic_block bb;
> +  edge e;
> +  edge_iterator ei;
> +  FOR_EACH_BB_FN (bb, cfun)
> +    {
> +      FOR_EACH_EDGE (e, ei, bb->preds)
> +       if (e->flags & EDGE_ABNORMAL)
> +         {
> +           gphi_iterator gsi;
> +           for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
> +                gsi_next (&gsi))
> +             {
> +               gphi *phi = gsi.phi ();
> +               tree arg = PHI_ARG_DEF (phi, e->dest_idx);
> +               if (SSA_NAME_IS_DEFAULT_DEF (arg)
> +                   && (!SSA_NAME_VAR (arg)
> +                       || TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL))
> +                 continue;
> +
> +               tree res = PHI_RESULT (phi);
> +
> +               int p1 = partition_find (tentative, var_to_partition (map, res));
> +               int p2 = partition_find (tentative, var_to_partition (map, arg));
> +
> +               if (p1 == p2)
> +                 continue;
> +
> +               partition_union (tentative, p1, p2);
> +             }
> +         }
> +    }
> +
> +  map->partition_to_base_index = XCNEWVEC (int, parts);
> +  auto_vec<unsigned int> index_map (parts);
> +  if (parts)
> +    index_map.quick_grow (parts);
> +
> +  const unsigned no_part = -1;
> +  unsigned count = parts;
> +  while (count)
> +    index_map[--count] = no_part;
> +
> +  /* Initialize MAP's mapping from partition to base index, using
> +     as base indices an enumeration of the TENTATIVE partitions in
> +     which each SSA version ended up, so that we compute conflicts
> +     between all SSA versions that ended up in the same potential
> +     coalesce partition.  */
> +  bitmap_iterator bi;
> +  unsigned i;
> +  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
> +    {
> +      int pidx = var_to_partition (map, ssa_name (i));
> +      int base = partition_find (tentative, pidx);
> +      if (index_map[base] != no_part)
> +       continue;
> +      index_map[base] = count++;
> +    }
> +
> +  map->num_basevars = count;
> +
> +  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
> +    {
> +      int pidx = var_to_partition (map, ssa_name (i));
> +      int base = partition_find (tentative, pidx);
> +      gcc_assert (index_map[base] < count);
> +      map->partition_to_base_index[pidx] = index_map[base];
> +    }
> +
> +  if (dump_file && (dump_flags & TDF_DETAILS))
> +    dump_part_var_map (dump_file, tentative, map);
> +
> +  partition_delete (tentative);
> +}
> +
> +/* Hashtable helpers.  */
> +
> +struct tree_int_map_hasher : typed_noop_remove <tree_int_map>
> +{
> +  typedef tree_int_map *value_type;
> +  typedef tree_int_map *compare_type;
> +  static inline hashval_t hash (const tree_int_map *);
> +  static inline bool equal (const tree_int_map *, const tree_int_map *);
> +};
> +
> +inline hashval_t
> +tree_int_map_hasher::hash (const tree_int_map *v)
> +{
> +  return tree_map_base_hash (v);
> +}
> +
> +inline bool
> +tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
> +{
> +  return tree_int_map_eq (v, c);
> +}
> +
> +/* This routine will initialize the basevar fields of MAP with base
> +   names.  Partitions will share the same base if they have the same
> +   SSA_NAME_VAR, or, being anonymous variables, the same type.  This
> +   must match gimple_can_coalesce_p in the non-optimized case.  */
> +
> +static void
> +compute_samebase_partition_bases (var_map map)
> +{
> +  int x, num_part;
> +  tree var;
> +  struct tree_int_map *m, *mapstorage;
> +
> +  num_part = num_var_partitions (map);
> +  hash_table<tree_int_map_hasher> tree_to_index (num_part);
> +  /* We can have at most num_part entries in the hash tables, so it's
> +     enough to allocate so many map elements once, saving some malloc
> +     calls.  */
> +  mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
> +
> +  /* If a base table already exists, clear it, otherwise create it.  */
> +  free (map->partition_to_base_index);
> +  map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
> +
> +  /* Build the base variable list, and point partitions at their bases.  */
> +  for (x = 0; x < num_part; x++)
> +    {
> +      struct tree_int_map **slot;
> +      unsigned baseindex;
> +      var = partition_to_var (map, x);
> +      if (SSA_NAME_VAR (var)
> +         && (!VAR_P (SSA_NAME_VAR (var))
> +             || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
> +       m->base.from = SSA_NAME_VAR (var);
> +      else
> +       /* This restricts what anonymous SSA names we can coalesce
> +          as it restricts the sets we compute conflicts for.
> +          Using TREE_TYPE to generate sets is the easies as
> +          type equivalency also holds for SSA names with the same
> +          underlying decl.
> +
> +          Check gimple_can_coalesce_p when changing this code.  */
> +       m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
> +                       ? TYPE_CANONICAL (TREE_TYPE (var))
> +                       : TREE_TYPE (var));
> +      /* If base variable hasn't been seen, set it up.  */
> +      slot = tree_to_index.find_slot (m, INSERT);
> +      if (!*slot)
> +       {
> +         baseindex = m - mapstorage;
> +         m->to = baseindex;
> +         *slot = m;
> +         m++;
> +       }
> +      else
> +       baseindex = (*slot)->to;
> +      map->partition_to_base_index[x] = baseindex;
> +    }
> +
> +  map->num_basevars = m - mapstorage;
> +
> +  free (mapstorage);
> +}
> +
>  /* Reduce the number of copies by coalescing variables in the function.  Return
>     a partition map with the resulting coalesces.  */
>
> @@ -1288,9 +1649,10 @@ coalesce_ssa_name (void)
>    cl = create_coalesce_list ();
>    map = create_outofssa_var_map (cl, used_in_copies);
>
> -  /* If optimization is disabled, we need to coalesce all the names originating
> -     from the same SSA_NAME_VAR so debug info remains undisturbed.  */
> -  if (!optimize)
> +  /* If this optimization is disabled, we need to coalesce all the
> +     names originating from the same SSA_NAME_VAR so debug info
> +     remains undisturbed.  */
> +  if (!flag_tree_coalesce_vars)
>      {
>        hash_table<ssa_name_var_hash> ssa_name_hash (10);
>
> @@ -1331,8 +1693,13 @@ coalesce_ssa_name (void)
>    if (dump_file && (dump_flags & TDF_DETAILS))
>      dump_var_map (dump_file, map);
>
> -  /* Don't calculate live ranges for variables not in the coalesce list.  */
> -  partition_view_bitmap (map, used_in_copies, true);
> +  partition_view_bitmap (map, used_in_copies);
> +
> +  if (flag_tree_coalesce_vars)
> +    compute_optimized_partition_bases (map, used_in_copies, cl);
> +  else
> +    compute_samebase_partition_bases (map);
> +
>    BITMAP_FREE (used_in_copies);
>
>    if (num_var_partitions (map) < 1)
> @@ -1371,8 +1738,7 @@ coalesce_ssa_name (void)
>
>    /* Now coalesce everything in the list.  */
>    coalesce_partitions (map, graph, cl,
> -                      ((dump_flags & TDF_DETAILS) ? dump_file
> -                                                  : NULL));
> +                      ((dump_flags & TDF_DETAILS) ? dump_file : NULL));
>
>    delete_coalesce_list (cl);
>    ssa_conflicts_delete (graph);
> diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c
> deleted file mode 100644
> index f3cb56e..0000000
> --- a/gcc/tree-ssa-copyrename.c
> +++ /dev/null
> @@ -1,499 +0,0 @@
> -/* Rename SSA copies.
> -   Copyright (C) 2004-2015 Free Software Foundation, Inc.
> -   Contributed by Andrew MacLeod <amacleod@redhat.com>
> -
> -This file is part of GCC.
> -
> -GCC is free software; you can redistribute it and/or modify
> -it under the terms of the GNU General Public License as published by
> -the Free Software Foundation; either version 3, or (at your option)
> -any later version.
> -
> -GCC is distributed in the hope that it will be useful,
> -but WITHOUT ANY WARRANTY; without even the implied warranty of
> -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> -GNU General Public License for more details.
> -
> -You should have received a copy of the GNU General Public License
> -along with GCC; see the file COPYING3.  If not see
> -<http://www.gnu.org/licenses/>.  */
> -
> -#include "config.h"
> -#include "system.h"
> -#include "coretypes.h"
> -#include "tm.h"
> -#include "hash-set.h"
> -#include "machmode.h"
> -#include "vec.h"
> -#include "double-int.h"
> -#include "input.h"
> -#include "alias.h"
> -#include "symtab.h"
> -#include "wide-int.h"
> -#include "inchash.h"
> -#include "tree.h"
> -#include "fold-const.h"
> -#include "predict.h"
> -#include "hard-reg-set.h"
> -#include "function.h"
> -#include "dominance.h"
> -#include "cfg.h"
> -#include "basic-block.h"
> -#include "tree-ssa-alias.h"
> -#include "internal-fn.h"
> -#include "gimple-expr.h"
> -#include "is-a.h"
> -#include "gimple.h"
> -#include "gimple-iterator.h"
> -#include "flags.h"
> -#include "tree-pretty-print.h"
> -#include "bitmap.h"
> -#include "gimple-ssa.h"
> -#include "stringpool.h"
> -#include "tree-ssanames.h"
> -#include "hashtab.h"
> -#include "rtl.h"
> -#include "statistics.h"
> -#include "real.h"
> -#include "fixed-value.h"
> -#include "insn-config.h"
> -#include "expmed.h"
> -#include "dojump.h"
> -#include "explow.h"
> -#include "calls.h"
> -#include "emit-rtl.h"
> -#include "varasm.h"
> -#include "stmt.h"
> -#include "expr.h"
> -#include "tree-dfa.h"
> -#include "tree-inline.h"
> -#include "tree-ssa-live.h"
> -#include "tree-pass.h"
> -#include "langhooks.h"
> -
> -static struct
> -{
> -  /* Number of copies coalesced.  */
> -  int coalesced;
> -} stats;
> -
> -/* The following routines implement the SSA copy renaming phase.
> -
> -   This optimization looks for copies between 2 SSA_NAMES, either through a
> -   direct copy, or an implicit one via a PHI node result and its arguments.
> -
> -   Each copy is examined to determine if it is possible to rename the base
> -   variable of one of the operands to the same variable as the other operand.
> -   i.e.
> -   T.3_5 = <blah>
> -   a_1 = T.3_5
> -
> -   If this copy couldn't be copy propagated, it could possibly remain in the
> -   program throughout the optimization phases.   After SSA->normal, it would
> -   become:
> -
> -   T.3 = <blah>
> -   a = T.3
> -
> -   Since T.3_5 is distinct from all other SSA versions of T.3, there is no
> -   fundamental reason why the base variable needs to be T.3, subject to
> -   certain restrictions.  This optimization attempts to determine if we can
> -   change the base variable on copies like this, and result in code such as:
> -
> -   a_5 = <blah>
> -   a_1 = a_5
> -
> -   This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is
> -   possible, the copy goes away completely. If it isn't possible, a new temp
> -   will be created for a_5, and you will end up with the exact same code:
> -
> -   a.8 = <blah>
> -   a = a.8
> -
> -   The other benefit of performing this optimization relates to what variables
> -   are chosen in copies.  Gimplification of the program uses temporaries for
> -   a lot of things. expressions like
> -
> -   a_1 = <blah>
> -   <blah2> = a_1
> -
> -   get turned into
> -
> -   T.3_5 = <blah>
> -   a_1 = T.3_5
> -   <blah2> = a_1
> -
> -   Copy propagation is done in a forward direction, and if we can propagate
> -   through the copy, we end up with:
> -
> -   T.3_5 = <blah>
> -   <blah2> = T.3_5
> -
> -   The copy is gone, but so is all reference to the user variable 'a'. By
> -   performing this optimization, we would see the sequence:
> -
> -   a_5 = <blah>
> -   a_1 = a_5
> -   <blah2> = a_1
> -
> -   which copy propagation would then turn into:
> -
> -   a_5 = <blah>
> -   <blah2> = a_5
> -
> -   and so we still retain the user variable whenever possible.  */
> -
> -
> -/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
> -   Choose a representative for the partition, and send debug info to DEBUG.  */
> -
> -static void
> -copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
> -{
> -  int p1, p2, p3;
> -  tree root1, root2;
> -  tree rep1, rep2;
> -  bool ign1, ign2, abnorm;
> -
> -  gcc_assert (TREE_CODE (var1) == SSA_NAME);
> -  gcc_assert (TREE_CODE (var2) == SSA_NAME);
> -
> -  register_ssa_partition (map, var1);
> -  register_ssa_partition (map, var2);
> -
> -  p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
> -  p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
> -
> -  if (debug)
> -    {
> -      fprintf (debug, "Try : ");
> -      print_generic_expr (debug, var1, TDF_SLIM);
> -      fprintf (debug, "(P%d) & ", p1);
> -      print_generic_expr (debug, var2, TDF_SLIM);
> -      fprintf (debug, "(P%d)", p2);
> -    }
> -
> -  gcc_assert (p1 != NO_PARTITION);
> -  gcc_assert (p2 != NO_PARTITION);
> -
> -  if (p1 == p2)
> -    {
> -      if (debug)
> -       fprintf (debug, " : Already coalesced.\n");
> -      return;
> -    }
> -
> -  rep1 = partition_to_var (map, p1);
> -  rep2 = partition_to_var (map, p2);
> -  root1 = SSA_NAME_VAR (rep1);
> -  root2 = SSA_NAME_VAR (rep2);
> -  if (!root1 && !root2)
> -    return;
> -
> -  /* Don't coalesce if one of the variables occurs in an abnormal PHI.  */
> -  abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1)
> -           || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2));
> -  if (abnorm)
> -    {
> -      if (debug)
> -       fprintf (debug, " : Abnormal PHI barrier.  No coalesce.\n");
> -      return;
> -    }
> -
> -  /* Partitions already have the same root, simply merge them.  */
> -  if (root1 == root2)
> -    {
> -      p1 = partition_union (map->var_partition, p1, p2);
> -      if (debug)
> -       fprintf (debug, " : Same root, coalesced --> P%d.\n", p1);
> -      return;
> -    }
> -
> -  /* Never attempt to coalesce 2 different parameters.  */
> -  if ((root1 && TREE_CODE (root1) == PARM_DECL)
> -      && (root2 && TREE_CODE (root2) == PARM_DECL))
> -    {
> -      if (debug)
> -        fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n");
> -      return;
> -    }
> -
> -  if ((root1 && TREE_CODE (root1) == RESULT_DECL)
> -      != (root2 && TREE_CODE (root2) == RESULT_DECL))
> -    {
> -      if (debug)
> -        fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n");
> -      return;
> -    }
> -
> -  ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1));
> -  ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2));
> -
> -  /* Refrain from coalescing user variables, if requested.  */
> -  if (!ign1 && !ign2)
> -    {
> -      if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2))
> -       ign2 = true;
> -      else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1))
> -       ign1 = true;
> -      else if (flag_ssa_coalesce_vars != 2)
> -       {
> -         if (debug)
> -           fprintf (debug, " : 2 different USER vars. No coalesce.\n");
> -         return;
> -       }
> -      else
> -       ign2 = true;
> -    }
> -
> -  /* If both values have default defs, we can't coalesce.  If only one has a
> -     tag, make sure that variable is the new root partition.  */
> -  if (root1 && ssa_default_def (cfun, root1))
> -    {
> -      if (root2 && ssa_default_def (cfun, root2))
> -       {
> -         if (debug)
> -           fprintf (debug, " : 2 default defs. No coalesce.\n");
> -         return;
> -       }
> -      else
> -        {
> -         ign2 = true;
> -         ign1 = false;
> -       }
> -    }
> -  else if (root2 && ssa_default_def (cfun, root2))
> -    {
> -      ign1 = true;
> -      ign2 = false;
> -    }
> -
> -  /* Do not coalesce if we cannot assign a symbol to the partition.  */
> -  if (!(!ign2 && root2)
> -      && !(!ign1 && root1))
> -    {
> -      if (debug)
> -       fprintf (debug, " : Choosen variable has no root.  No coalesce.\n");
> -      return;
> -    }
> -
> -  /* Don't coalesce if the new chosen root variable would be read-only.
> -     If both ign1 && ign2, then the root var of the larger partition
> -     wins, so reject in that case if any of the root vars is TREE_READONLY.
> -     Otherwise reject only if the root var, on which replace_ssa_name_symbol
> -     will be called below, is readonly.  */
> -  if (((root1 && TREE_READONLY (root1)) && ign2)
> -      || ((root2 && TREE_READONLY (root2)) && ign1))
> -    {
> -      if (debug)
> -       fprintf (debug, " : Readonly variable.  No coalesce.\n");
> -      return;
> -    }
> -
> -  /* Don't coalesce if the two variables aren't type compatible .  */
> -  if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2))
> -      /* There is a disconnect between the middle-end type-system and
> -         VRP, avoid coalescing enum types with different bounds.  */
> -      || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE
> -          || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE)
> -         && TREE_TYPE (var1) != TREE_TYPE (var2)))
> -    {
> -      if (debug)
> -       fprintf (debug, " : Incompatible types.  No coalesce.\n");
> -      return;
> -    }
> -
> -  /* Merge the two partitions.  */
> -  p3 = partition_union (map->var_partition, p1, p2);
> -
> -  /* Set the root variable of the partition to the better choice, if there is
> -     one.  */
> -  if (!ign2 && root2)
> -    replace_ssa_name_symbol (partition_to_var (map, p3), root2);
> -  else if (!ign1 && root1)
> -    replace_ssa_name_symbol (partition_to_var (map, p3), root1);
> -  else
> -    gcc_unreachable ();
> -
> -  if (debug)
> -    {
> -      fprintf (debug, " --> P%d ", p3);
> -      print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)),
> -                         TDF_SLIM);
> -      fprintf (debug, "\n");
> -    }
> -}
> -
> -
> -namespace {
> -
> -const pass_data pass_data_rename_ssa_copies =
> -{
> -  GIMPLE_PASS, /* type */
> -  "copyrename", /* name */
> -  OPTGROUP_NONE, /* optinfo_flags */
> -  TV_TREE_COPY_RENAME, /* tv_id */
> -  ( PROP_cfg | PROP_ssa ), /* properties_required */
> -  0, /* properties_provided */
> -  0, /* properties_destroyed */
> -  0, /* todo_flags_start */
> -  0, /* todo_flags_finish */
> -};
> -
> -class pass_rename_ssa_copies : public gimple_opt_pass
> -{
> -public:
> -  pass_rename_ssa_copies (gcc::context *ctxt)
> -    : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt)
> -  {}
> -
> -  /* opt_pass methods: */
> -  opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); }
> -  virtual bool gate (function *) { return flag_tree_copyrename != 0; }
> -  virtual unsigned int execute (function *);
> -
> -}; // class pass_rename_ssa_copies
> -
> -/* This function will make a pass through the IL, and attempt to coalesce any
> -   SSA versions which occur in PHI's or copies.  Coalescing is accomplished by
> -   changing the underlying root variable of all coalesced version.  This will
> -   then cause the SSA->normal pass to attempt to coalesce them all to the same
> -   variable.  */
> -
> -unsigned int
> -pass_rename_ssa_copies::execute (function *fun)
> -{
> -  var_map map;
> -  basic_block bb;
> -  tree var, part_var;
> -  gimple stmt;
> -  unsigned x;
> -  FILE *debug;
> -
> -  memset (&stats, 0, sizeof (stats));
> -
> -  if (dump_file && (dump_flags & TDF_DETAILS))
> -    debug = dump_file;
> -  else
> -    debug = NULL;
> -
> -  map = init_var_map (num_ssa_names);
> -
> -  FOR_EACH_BB_FN (bb, fun)
> -    {
> -      /* Scan for real copies.  */
> -      for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
> -          gsi_next (&gsi))
> -       {
> -         stmt = gsi_stmt (gsi);
> -         if (gimple_assign_ssa_name_copy_p (stmt))
> -           {
> -             tree lhs = gimple_assign_lhs (stmt);
> -             tree rhs = gimple_assign_rhs1 (stmt);
> -
> -             copy_rename_partition_coalesce (map, lhs, rhs, debug);
> -           }
> -       }
> -    }
> -
> -  FOR_EACH_BB_FN (bb, fun)
> -    {
> -      /* Treat PHI nodes as copies between the result and each argument.  */
> -      for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
> -          gsi_next (&gsi))
> -        {
> -          size_t i;
> -         tree res;
> -         gphi *phi = gsi.phi ();
> -         res = gimple_phi_result (phi);
> -
> -         /* Do not process virtual SSA_NAMES.  */
> -         if (virtual_operand_p (res))
> -           continue;
> -
> -         /* Make sure to only use the same partition for an argument
> -            as the result but never the other way around.  */
> -         if (SSA_NAME_VAR (res)
> -             && !DECL_IGNORED_P (SSA_NAME_VAR (res)))
> -           for (i = 0; i < gimple_phi_num_args (phi); i++)
> -             {
> -               tree arg = PHI_ARG_DEF (phi, i);
> -               if (TREE_CODE (arg) == SSA_NAME)
> -                 copy_rename_partition_coalesce (map, res, arg,
> -                                                 debug);
> -             }
> -         /* Else if all arguments are in the same partition try to merge
> -            it with the result.  */
> -         else
> -           {
> -             int all_p_same = -1;
> -             int p = -1;
> -             for (i = 0; i < gimple_phi_num_args (phi); i++)
> -               {
> -                 tree arg = PHI_ARG_DEF (phi, i);
> -                 if (TREE_CODE (arg) != SSA_NAME)
> -                   {
> -                     all_p_same = 0;
> -                     break;
> -                   }
> -                 else if (all_p_same == -1)
> -                   {
> -                     p = partition_find (map->var_partition,
> -                                         SSA_NAME_VERSION (arg));
> -                     all_p_same = 1;
> -                   }
> -                 else if (all_p_same == 1
> -                          && p != partition_find (map->var_partition,
> -                                                  SSA_NAME_VERSION (arg)))
> -                   {
> -                     all_p_same = 0;
> -                     break;
> -                   }
> -               }
> -             if (all_p_same == 1)
> -               copy_rename_partition_coalesce (map, res,
> -                                               PHI_ARG_DEF (phi, 0),
> -                                               debug);
> -           }
> -        }
> -    }
> -
> -  if (debug)
> -    dump_var_map (debug, map);
> -
> -  /* Now one more pass to make all elements of a partition share the same
> -     root variable.  */
> -
> -  for (x = 1; x < num_ssa_names; x++)
> -    {
> -      part_var = partition_to_var (map, x);
> -      if (!part_var)
> -        continue;
> -      var = ssa_name (x);
> -      if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var))
> -       continue;
> -      if (debug)
> -        {
> -         fprintf (debug, "Coalesced ");
> -         print_generic_expr (debug, var, TDF_SLIM);
> -         fprintf (debug, " to ");
> -         print_generic_expr (debug, part_var, TDF_SLIM);
> -         fprintf (debug, "\n");
> -       }
> -      stats.coalesced++;
> -      replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var));
> -    }
> -
> -  statistics_counter_event (fun, "copies coalesced",
> -                           stats.coalesced);
> -  delete_var_map (map);
> -  return 0;
> -}
> -
> -} // anon namespace
> -
> -gimple_opt_pass *
> -make_pass_rename_ssa_copies (gcc::context *ctxt)
> -{
> -  return new pass_rename_ssa_copies (ctxt);
> -}
> diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
> index 2c7c072..821b2f4 100644
> --- a/gcc/tree-ssa-live.c
> +++ b/gcc/tree-ssa-live.c
> @@ -100,90 +100,6 @@ static void  verify_live_on_entry (tree_live_info_p);
>     ssa_name or variable, and vice versa.  */
>
>
> -/* Hashtable helpers.  */
> -
> -struct tree_int_map_hasher : typed_noop_remove <tree_int_map>
> -{
> -  typedef tree_int_map *value_type;
> -  typedef tree_int_map *compare_type;
> -  static inline hashval_t hash (const tree_int_map *);
> -  static inline bool equal (const tree_int_map *, const tree_int_map *);
> -};
> -
> -inline hashval_t
> -tree_int_map_hasher::hash (const tree_int_map *v)
> -{
> -  return tree_map_base_hash (v);
> -}
> -
> -inline bool
> -tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
> -{
> -  return tree_int_map_eq (v, c);
> -}
> -
> -
> -/* This routine will initialize the basevar fields of MAP.  */
> -
> -static void
> -var_map_base_init (var_map map)
> -{
> -  int x, num_part;
> -  tree var;
> -  struct tree_int_map *m, *mapstorage;
> -
> -  num_part = num_var_partitions (map);
> -  hash_table<tree_int_map_hasher> tree_to_index (num_part);
> -  /* We can have at most num_part entries in the hash tables, so it's
> -     enough to allocate so many map elements once, saving some malloc
> -     calls.  */
> -  mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
> -
> -  /* If a base table already exists, clear it, otherwise create it.  */
> -  free (map->partition_to_base_index);
> -  map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
> -
> -  /* Build the base variable list, and point partitions at their bases.  */
> -  for (x = 0; x < num_part; x++)
> -    {
> -      struct tree_int_map **slot;
> -      unsigned baseindex;
> -      var = partition_to_var (map, x);
> -      if (SSA_NAME_VAR (var)
> -         && (!VAR_P (SSA_NAME_VAR (var))
> -             || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
> -       m->base.from = SSA_NAME_VAR (var);
> -      else
> -       /* This restricts what anonymous SSA names we can coalesce
> -          as it restricts the sets we compute conflicts for.
> -          Using TREE_TYPE to generate sets is the easies as
> -          type equivalency also holds for SSA names with the same
> -          underlying decl.
> -
> -          Check gimple_can_coalesce_p when changing this code.  */
> -       m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
> -                       ? TYPE_CANONICAL (TREE_TYPE (var))
> -                       : TREE_TYPE (var));
> -      /* If base variable hasn't been seen, set it up.  */
> -      slot = tree_to_index.find_slot (m, INSERT);
> -      if (!*slot)
> -       {
> -         baseindex = m - mapstorage;
> -         m->to = baseindex;
> -         *slot = m;
> -         m++;
> -       }
> -      else
> -       baseindex = (*slot)->to;
> -      map->partition_to_base_index[x] = baseindex;
> -    }
> -
> -  map->num_basevars = m - mapstorage;
> -
> -  free (mapstorage);
> -}
> -
> -
>  /* Remove the base table in MAP.  */
>
>  static void
> @@ -361,21 +277,17 @@ partition_view_fini (var_map map, bitmap selected)
>  }
>
>
> -/* Create a partition view which includes all the used partitions in MAP.  If
> -   WANT_BASES is true, create the base variable map as well.  */
> +/* Create a partition view which includes all the used partitions in MAP.  */
>
>  void
> -partition_view_normal (var_map map, bool want_bases)
> +partition_view_normal (var_map map)
>  {
>    bitmap used;
>
>    used = partition_view_init (map);
>    partition_view_fini (map, used);
>
> -  if (want_bases)
> -    var_map_base_init (map);
> -  else
> -    var_map_base_fini (map);
> +  var_map_base_fini (map);
>  }
>
>
> @@ -384,7 +296,7 @@ partition_view_normal (var_map map, bool want_bases)
>     as well.  */
>
>  void
> -partition_view_bitmap (var_map map, bitmap only, bool want_bases)
> +partition_view_bitmap (var_map map, bitmap only)
>  {
>    bitmap used;
>    bitmap new_partitions = BITMAP_ALLOC (NULL);
> @@ -400,10 +312,7 @@ partition_view_bitmap (var_map map, bitmap only, bool want_bases)
>      }
>    partition_view_fini (map, new_partitions);
>
> -  if (want_bases)
> -    var_map_base_init (map);
> -  else
> -    var_map_base_fini (map);
> +  var_map_base_fini (map);
>  }
>
>
> diff --git a/gcc/tree-ssa-live.h b/gcc/tree-ssa-live.h
> index d5d7820..1f88358 100644
> --- a/gcc/tree-ssa-live.h
> +++ b/gcc/tree-ssa-live.h
> @@ -71,8 +71,8 @@ typedef struct _var_map
>  extern var_map init_var_map (int);
>  extern void delete_var_map (var_map);
>  extern int var_union (var_map, tree, tree);
> -extern void partition_view_normal (var_map, bool);
> -extern void partition_view_bitmap (var_map, bitmap, bool);
> +extern void partition_view_normal (var_map);
> +extern void partition_view_bitmap (var_map, bitmap);
>  extern void dump_scope_blocks (FILE *, int);
>  extern void debug_scope_block (tree, int);
>  extern void debug_scope_blocks (int);
> diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
> index 685fcc38c..447fcd9 100644
> --- a/gcc/var-tracking.c
> +++ b/gcc/var-tracking.c
> @@ -4872,12 +4872,16 @@ dataflow_set_remove_mem_locs (variable_def **slot, dataflow_set *set)
>     registers, as well as associations between MEMs and VALUEs.  */
>
>  static void
> -dataflow_set_clear_at_call (dataflow_set *set)
> +dataflow_set_clear_at_call (dataflow_set *set, rtx_insn *call_insn)
>  {
>    unsigned int r;
>    hard_reg_set_iterator hrsi;
> +  HARD_REG_SET invalidated_regs;
>
> -  EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, r, hrsi)
> +  get_call_reg_set_usage (call_insn, &invalidated_regs,
> +                         regs_invalidated_by_call);
> +
> +  EXECUTE_IF_SET_IN_HARD_REG_SET (invalidated_regs, 0, r, hrsi)
>      var_regno_delete (set, r);
>
>    if (MAY_HAVE_DEBUG_INSNS)
> @@ -6685,7 +6689,7 @@ compute_bb_dataflow (basic_block bb)
>        switch (mo->type)
>         {
>           case MO_CALL:
> -           dataflow_set_clear_at_call (out);
> +           dataflow_set_clear_at_call (out, insn);
>             break;
>
>           case MO_USE:
> @@ -9152,7 +9156,7 @@ emit_notes_in_bb (basic_block bb, dataflow_set *set)
>        switch (mo->type)
>         {
>           case MO_CALL:
> -           dataflow_set_clear_at_call (set);
> +           dataflow_set_clear_at_call (set, insn);
>             emit_notes_for_changes (insn, EMIT_NOTE_AFTER_CALL_INSN, set->vars);
>             {
>               rtx arguments = mo->u.loc, *p = &arguments;

Otherwise this looks fine to me - I didn't really spot the TREE_LIST
uses though (apart from that first element use).

2nd eyes welcome.

Thanks,
Richard.

>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


* Re: [PR64164] drop copyrename, integrate into expand
  2015-04-24  1:56         ` Alexandre Oliva
  2015-04-27 11:39           ` Richard Biener
@ 2015-04-29  3:51           ` Jeff Law
  1 sibling, 0 replies; 127+ messages in thread
From: Jeff Law @ 2015-04-29  3:51 UTC (permalink / raw)
  To: Alexandre Oliva, Richard Biener; +Cc: gcc-patches

On 04/23/2015 07:56 PM, Alexandre Oliva wrote:
>
> The other tricky bit was to fix all expander bits that required
> SSA_NAMEs to have a associated decl.  I've removed all such cases, so we
> can now expand anonymous SSA decls directly, without having to create an
> ignored decl.  Doing that, we can coalesce variables and expand each
> partition without worrying about choosing a preferred partition leader.
> We just have to make sure we don't consider pairs of variables eligible
> for coalescing if they should get different promoted modes, or a
> different register-or-stack choice, and then expansion of partitions is
> streamlined: we just expand each leader, and then adjust all SSA_NAMEs
> to associate the RTL with their base variables, if any.
Nice.



>
>
> In this revision of the patch, I have retained -ftree-coalesce-vars, so
> that its negated form can be used in testcases that formerly expected no
> coalescing across user variables, but that explicitly disabled VTA.
Seems reasonable.


>
> As for testcases, while investigating test regressions, I found out
> various guality failures had to do with VT's lack of awareness of custom
> calling conventions.  Caller's variables saved in registers that are
> normally call-clobbered, but that are call-saved in custom conventions
> set up for a callee, would end up invalidating the entry-point location
> associations.  I've arranged for var-tracking to use custom calling
> conventions for register invalidation at call insns, and this fixed not
> only a few guality regressions due to changes in register assignment,
> but a number of other long-standing guality failures.  Yay!  This could
> be split out into a standalone patch.
That might be wise -- I think we're going to need at least one more 
iteration on the removal of copyrename.
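
To illustrate the var-tracking point in isolation, here is a minimal
standalone sketch, not GCC code.  It assumes a toy register file and
models "which registers does this call clobber" as a bitset; kNumRegs,
RegSet and clear_at_call are hypothetical names.  The idea is only that
invalidation consults the clobber set of the specific call rather than
one global set.

```cpp
#include <bitset>
#include <cassert>
#include <map>
#include <string>

// Hypothetical toy register file; real targets differ.
constexpr int kNumRegs = 8;
using RegSet = std::bitset<kNumRegs>;

// Drop every variable whose tracked location is a register that the
// given call clobbers.  The clobber set is passed per call, so a callee
// with a custom convention can preserve registers that the default
// convention would clobber.
static void clear_at_call(std::map<std::string, int> &var_in_reg,
                          const RegSet &clobbered) {
  for (auto it = var_in_reg.begin(); it != var_in_reg.end();) {
    if (clobbered.test(it->second))
      it = var_in_reg.erase(it);  // location no longer valid after the call
    else
      ++it;                       // register survives the call
  }
}
```

With a default clobber set covering registers 0-3 and a custom callee
that clobbers only registers 0-1, a variable held in register 2 is
dropped in the first case and kept in the second.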


> In this version of the patch, we no longer touch the base vars at all.
> We just associate the piece of RTL generated for the partition with a
> list of decls, if needed.  (I've just realized that I never noticed a
> list of decls show up anywhere, and looking into this, I saw a bug in
> the leader_merge function, that causes it to fail to go from a single
> entry to a list: it creates the list, but then returns the original
> non-list entry; that's why I never saw them!  I won't delay posting the
> patch just because of this; I'm not even sure we want decl lists in REG
> or MEM attrs to begin with.)
Well, Richi noted the compile-time cost and poor data structure choice.
I'd ask the question: what's the benefit in tracking these as a list?

If we want to track them, then how often do we need to actually traverse
the list, and how hard would it be to build a pathological case (from a
compile-time standpoint)?  Presumably there's no way to sort the list to
make finding an entry cheap?


>
> I have collected some statistics on the effects of the patch in
> compiling stage3-gcc/, before and after the patch, with and without
> -fno-tree-coalesce-vars.  I counted, per function:
>
> b/a: before the patch, or after the patch
>
> c/n: -ftree-coalesce-vars (default when optimizing) or
> -fno-tree-coalesce-vars
>
> cv: the coalescible var count, i.e., the active partition count prior to
> coalescing.  SSA_NAMEs not eligible for coalescing are not counted.
> The more of these there are, the larger the conflict graph we have to
> build.
>
> base: the base variable count that guides the construction of the
> conflict map.  The more of these there are, the smaller the conflict
> graph we have to build, but it is also a lower bound for the final
> partition count.
>
> part: the partition count after coalescing, not counting those of
> SSA_NAMEs that were not eligible for coalescing to begin with.
>
> abn: successful abnormal coalesce count.  How many times
> attempt_coalesce returned true as called in the abnormal coalesce loop.
>
> same: successful normal coalesces of pairs of SSA_NAMEs that share the
> same base variable (SSA_NAME_VAR, not the base index used to guide the
> construction of the conflict graph).  Ignored base decls are regarded as
> NULL for purposes of this comparison.  How many times attempt_coalesce
> returned true for variables that share the same base variable.  This may
> count cases in which both vars are in the same partition already due to
> earlier coalesces.
>
> other: successful normal coalesces of pairs of SSA_NAMEs that do NOT
> share the same base variable.  Same caveats as above.
>
> fail: failed attempts at normal coalesce.  How many times
> attempt_coalesce returned false.
>
> b/a    c/n         cv   base   part   abn   same other  fail
>
> before -fno-tr 570180 176682 221442 82076 370746     0 10542
> before -ftree- 577212 171581 221927 82076 378093     0 18654
>
> after  -fno-tr 608533 179959 220948 82076 488119     0 11697
> after  -ftree- 589243 202588 221817 82076 349373 41775 24124
I've spent quite a bit of time trying to figure out what all this means.
I think the takeaway is we'll use a bit more memory, but we also
coalesce a bit better.  Neither effect appears to be very large.
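
One way to read the table is to difference the two default rows
(-ftree-coalesce-vars, before and after the patch).  The sketch below
does just that with the figures quoted above; Row and the helper names
are illustrative only.

```cpp
#include <cassert>

// One row of the quoted statistics table.
struct Row {
  long cv, base, part, abn, same, other, fail;
};

// "before -ftree-" and "after -ftree-" rows, copied from the table.
static const Row before_default = {577212, 171581, 221927, 82076,
                                   378093, 0, 18654};
static const Row after_default = {589243, 202588, 221817, 82076,
                                  349373, 41775, 24124};

// Successful normal coalesces, regardless of base variable.
static long successful_normal(const Row &r) { return r.same + r.other; }

// Growth in coalescible vars, i.e. in conflict-graph input size.
static long cv_growth() { return after_default.cv - before_default.cv; }

// Reduction in the final partition count.
static long part_reduction() {
  return before_default.part - after_default.part;
}
```

On these figures that is 12031 more coalescible vars to consider, 110
fewer final partitions, and 13055 more successful normal coalesces
(391148 versus 378093).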

>
>
>
> And here's the actual patch I'm submitting for your appreciation (I was
> gonna say for inclusion, but given the leader_merge brown paper bag bug,
> I'll just want feedback on whether we want that or not, and either drop
> the list-building, or probably post a revised patch that fixes fallout
> from lists where decls are expected.)
>
> No regressions, and many progressions, on x86_64-linux-gnu and
> i686-pc-linux-gnu.
>
> [PR64164] Drop copyrename, use coalescible partition as base when optimizing.
>
> for  gcc/ChangeLog
>
> 	PR rtl-optimization/64164
> 	* Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
> 	* tree-ssa-copyrename.c: Removed.
> 	* opts.c (default_options_table): Drop -ftree-copyrename.  Add
> 	-ftree-coalesce-vars.
> 	* passes.def: Drop all occurrences of pass_rename_ssa_copies.
> 	* common.opt (ftree-copyrename): Ignore.
> 	(ftree-coalesce-inlined-vars): Likewise.
> 	* doc/invoke.texi: Remove the ignored options above.
> 	* gimple-expr.h (gimple_can_coalesce_p): Note def location.
> 	* gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
> 	across variables when flag_tree_coalesce_vars.  Check register
> 	use and promoted modes to allow coalescing.  Moved to
> 	tree-ssa-coalesce.c.
> 	* tree-ssa-live.c (struct tree_int_map_hasher): Move along
> 	with its member functions to tree-ssa-coalesce.c.
> 	(var_map_base_init): Likewise.  Renamed to
> 	compute_samebase_partition_bases.
> 	(partition_view_normal): Drop want_bases parameter.
> 	(partition_view_bitmap): Likewise.
> 	* tree-ssa-live.h: Adjust declarations.
> 	* tree-ssa-coalesce.c: Include explow.h.
> 	(build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs's
> 	default defs at the entry point.
> 	(dump_part_var_map): New.
> 	(compute_optimized_partition_bases): New, called by...
> 	(coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
> 	of compute_samebase_partition_bases.  Adjust.
> 	* alias.c (nonoverlapping_memrefs_p): Special-case RTL-less
> 	gimple-reg exprs.
> 	* cfgexpand.c (leader_merge): New.
> 	(get_rtl_for_parm_ssa_default_def): New.
> 	(set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
> 	vars.  Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
> 	(expand_one_stack_var_at): Handle anonymous SSA_NAMEs.  Drop
> 	redundant MEM attr setting.
> 	(expand_one_stack_var_1): Handle anonymous SSA_NAMEs.  Renamed
> 	from...
> 	(expand_one_stack_var): ... this.  New wrapper to check and
> 	skip already expanded SSA partitions.
> 	(record_alignment_for_reg_var): New, factored out of...
> 	(expand_one_var): ... this.
> 	(expand_one_ssa_partition): New.
> 	(adjust_one_expanded_partition_var): New.
> 	(expand_one_register_var): Check and skip already expanded SSA
> 	partitions.
> 	(expand_used_vars): Don't create DECLs for anonymous SSA
> 	names.  Expand all SSA partitions, then adjust all SSA names.
> 	(pass::execute): Replace the loops that set
> 	SA.partition_to_pseudo from partition leaders and cleared
> 	DECL_RTL for multi-location variables, and that which used to
> 	rename vars and set attrs, with one that clears DECL_RTL and
> 	checks that PARMs and RESULTs default_defs match DECL_RTL.
> 	* cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
> 	* emit-rtl.c (set_reg_attrs_for_parm): Handle NULL and
> 	TREE_LIST decl.
> 	* explow.c (promote_ssa_mode): New.
> 	* explow.h (promote_ssa_mode): Declare.
> 	* expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
> 	* function.c: Include cfgexpand.h.
> 	(use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
> 	(use_register_for_parm_decl): Wrapper for the above to
> 	special-case the result_ptr.
> 	(rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
> 	(maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
> 	multiple locations.
> 	(assign_parm_adjust_stack_rtl): Add all and parm arguments,
> 	for rtl_for_parm.  For SSA-assigned parms, zero stack_parm.
> 	(assign_parm_setup_block): Prefer SSA-assigned location.
> 	(assign_parm_setup_reg): Likewise.  Use entry_parm for equiv
> 	if stack_parm is NULL.
> 	(assign_parm_setup_stack): Prefer SSA-assigned location.
> 	(assign_parms): Maybe reset DECL_RTL of params.  Adjust stack
> 	rtl before testing for pointer bounds.  Special-case result_ptr.
> 	(expand_function_start): Maybe reset DECL_RTL of result.
> 	Prefer SSA-assigned location for result and static chain.
> 	Factor out DECL_RESULT and SET_DECL_RTL.
> 	* tree-outof-ssa.c (insert_value_copy_on_edge): Handle
> 	anonymous SSA names.  Use promote_ssa_mode.
> 	(get_temp_reg): Likewise.
> 	(remove_ssa_form): Adjust.
> 	* var-tracking.c (dataflow_set_clear_at_call): Take call_insn
> 	and get its reg_usage for reg invalidation.
> 	(compute_bb_dataflow): Pass it insn.
> 	(emit_notes_in_bb): Likewise.
>
> for  gcc/testsuite/ChangeLog
>
> 	* gcc.dg/guality/pr54200.c: Add -fno-tree-coalesce-vars.
> 	* gcc.dg/ssp-1.c: Make counter a register.
> 	* gcc.dg/ssp-2.c: Likewise.
> 	* gcc.dg/torture/parm-coalesce.c: New.

Just a few comments in addition to Richi's....


> ---
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index ca491a0..74190a6d 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -179,21 +179,137 @@ gimple_assign_rhs_to_tree (gimple stmt)
>
>   #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
>
> +/* NEXT is a DECL to be associated with some RTL, CUR is a DECL or a
> +   TREE_LIST of DECLs.  If NEXT is covered by CUR, return CUR
> +   unchanged.  Otherwise, return a list with all entries of CUR, with
> +   NEXT at the end.  If CUR was a list, it will be modified in
> +   place.  */
> +
> +static tree
> +leader_merge (tree cur, tree next)
> +{
> +  if (cur == NULL || cur == next)
> +    return next;
> +
> +  tree list;
> +
> +  if (TREE_CODE (cur) == TREE_LIST)
> +    {
> +      /* Look for NEXT in the list.  Stop at the last node to insert
> +	 there.  */
> +      for (list = cur; ; list = TREE_CHAIN (list))
> +	{
> +	  if (TREE_VALUE (list) == next)
> +	    return cur;
> +	  if (!TREE_CHAIN (list))
> +	    break;
> +	}
> +    }
> +  else
> +    /* Create the first node.  */
> +    list = build_tree_list (NULL, cur);
> +
> +  next = build_tree_list (NULL, next);
> +  TREE_CHAIN (list) = next;
> +
> +  return cur;
> +}
As Richi notes, avoid TREE_LIST :-)  I suspect a vec would be an 
improvement here.  How often do we have more than one entry?  How often 
do we have to search this list and how hard is it to trigger 
pathological behaviour here?  If we're not gaining much, consider 
dropping this completely.  It's the most controversial part of the patch.
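
For reference, the single-entry-to-list bug mentioned earlier in the
thread can be reproduced in a small standalone model.  This is not GCC
code: Tree, make_decl, make_list and merge are hypothetical stand-ins
for DECL and TREE_LIST nodes and for leader_merge, and the "fixed" flag
only shows one possible repair (returning the head of the newly created
list).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a tree node: either a decl or a list cell.
struct Tree {
  bool is_list;  // true models TREE_LIST, false models a DECL
  int id;        // decl identity when !is_list
  Tree *value;   // models TREE_VALUE
  Tree *chain;   // models TREE_CHAIN
};

static Tree *make_decl(int id) { return new Tree{false, id, nullptr, nullptr}; }
static Tree *make_list(Tree *v) { return new Tree{true, 0, v, nullptr}; }

// Same shape as the quoted function.  With fixed == false, the
// single-entry case builds a list but still returns the bare decl.
static Tree *merge(Tree *cur, Tree *next, bool fixed) {
  if (cur == nullptr || cur == next)
    return next;

  Tree *list;
  if (cur->is_list) {
    // Look for NEXT; stop at the last node so we can append there.
    for (list = cur;; list = list->chain) {
      if (list->value == next)
        return cur;
      if (!list->chain)
        break;
    }
  } else {
    list = make_list(cur);  // create the first node
    if (fixed)
      cur = list;           // repair: hand back the list head
  }

  list->chain = make_list(next);
  return cur;
}

// Number of decls reachable from T (1 for a bare decl).
static std::size_t length(const Tree *t) {
  if (!t)
    return 0;
  if (!t->is_list)
    return 1;
  std::size_t n = 0;
  for (; t; t = t->chain)
    ++n;
  return n;
}
```

In this model, merging two distinct decls with fixed == false yields a
result of length 1, which matches the observation that lists never
showed up; with fixed == true the result has length 2.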

> +
> +      /* If this decl was marked as living in multiple places, reset
> +	 this now to NULL.  */
> +      tree var = SSA_NAME_VAR (name);
> +      if (var && DECL_RTL_IF_SET (var) == pc_rtx)
Do we document the special meaning of pc_rtx in DECL_RTL?

> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
> index b8dc7d5..ef31ba0f 100644
> --- a/gcc/emit-rtl.c
> +++ b/gcc/emit-rtl.c
> @@ -1229,6 +1229,11 @@ set_reg_attrs_for_parm (rtx parm_rtx, rtx mem)
>   void
>   set_reg_attrs_for_decl_rtl (tree t, rtx x)
>   {
> +  if (!t)
> +    return;
> +  tree tdecl = t;
> +  if (TREE_CODE (t) == TREE_LIST)
> +    tdecl = TREE_VALUE (t);
As Richi mentioned, we only use the first entry, which raises the
question: do we need the leader_merge bits at all?

Jeff


* Re: [PR64164] drop copyrename, integrate into expand
  2015-04-27 11:39           ` Richard Biener
@ 2015-06-06  5:12             ` Alexandre Oliva
  2015-06-08  8:16               ` Richard Biener
  2015-06-10  0:28               ` Alexandre Oliva
  0 siblings, 2 replies; 127+ messages in thread
From: Alexandre Oliva @ 2015-06-06  5:12 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jeff Law, GCC Patches

On Apr 27, 2015, Richard Biener <richard.guenther@gmail.com> wrote:

> This should also mention that is_gimple_reg vars do not have their
> address taken.

Check

>> +static tree
>> +leader_merge (tree cur, tree next)

> Ick - presumably you can't use sth better than a TREE_LIST here?

The list was an experiment that never really worked, and when I tried to
make it work after the patch, it proved to be unworkable, so I dropped
it, and rewrote leader_merge to choose either of the params, preferring
anonymous over ignored over named, so as to reduce the likelihood of
misreading of debug dumps, since that's all they're used for.
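
As a rough illustration of that preference order (anonymous over
ignored over named), here is a standalone sketch.  It is a model of the
ordering only, not the revised leader_merge: Decl, leader_rank and
choose_leader are hypothetical names, and a null pointer stands for an
anonymous SSA name.

```cpp
#include <cassert>

// Hypothetical model of a decl; nullptr means "anonymous".
struct Decl {
  const char *name;
  bool ignored;  // models DECL_IGNORED_P
};

// Lower rank is preferred: anonymous (0) < ignored (1) < named (2).
static int leader_rank(const Decl *d) {
  if (!d)
    return 0;
  return d->ignored ? 1 : 2;
}

// Keep CUR unless NEXT is strictly preferable, so ties stay stable.
static const Decl *choose_leader(const Decl *cur, const Decl *next) {
  return leader_rank(next) < leader_rank(cur) ? next : cur;
}
```

Keeping CUR on ties means the result does not depend on how many
equally ranked candidates are merged afterwards.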

>> static void
>> -expand_one_stack_var (tree var)
>> +expand_one_stack_var_1 (tree var)
>> {
>> HOST_WIDE_INT size, offset;
>> unsigned byte_align;
>> 
>> -  size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
>> -  byte_align = align_local_variable (SSAVAR (var));
>> +  if (TREE_CODE (var) != SSA_NAME || SSA_NAME_VAR (var))
>> +    {
>> +      size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
>> +      byte_align = align_local_variable (SSAVAR (var));
>> +    }
>> +  else

> I'd go here for all TREE_CODE (var) == SSA_NAME

Check

> (and get rid of the SSAVAR macro?)

There are remaining uses that don't seem worth dropping it for.

>> +/* Return the promoted mode for name.  If it is a named SSA_NAME, it
>> +   is the same as promote_decl_mode.  Otherwise, it is the promoted
>> +   mode of a temp decl of same type as the SSA_NAME, if we had created
>> +   one.  */
>> +
>> +machine_mode
>> +promote_ssa_mode (const_tree name, int *punsignedp)
>> +{
>> +  gcc_assert (TREE_CODE (name) == SSA_NAME);
>> +
>> +  if (SSA_NAME_VAR (name))
>> +    return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);

> As above I'd rather not have different paths for anonymous vs. non-anonymous
> vars (so just delete the above two lines).

Check

>> @@ -9668,6 +9678,11 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>> pmode = promote_function_mode (type, mode, &unsignedp,
>> gimple_call_fntype (g),
>> 2);
>> +         else if (!exp)
>> +           {
>> +             gcc_assert (code == SSA_NAME);

> promote_ssa_mode should assert this.

>> +             pmode = promote_ssa_mode (ssa_name, &unsignedp);

It does, so...  check.


>> @@ -2121,6 +2122,15 @@ aggregate_value_p (const_tree exp, const_tree fntype)
>> bool
>> use_register_for_decl (const_tree decl)
>> {
>> +  if (TREE_CODE (decl) == SSA_NAME)
>> +    {
>> +      if (!SSA_NAME_VAR (decl))
>> +       return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
>> +         && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
>> +
>> +      decl = SSA_NAME_VAR (decl);

> See above.  Please drop the SSA_NAME_VAR != NULL path.

Check, then taken back, after a bootstrap failure and some debugging
made me realize this would be wrong.  Here are the newly-added comments
that explain why:

      /* We often try to use the SSA_NAME, instead of its underlying
	 decl, to get type information and guide decisions, to avoid
	 differences of behavior between anonymous and named
	 variables, but in this one case we have to go for the actual
	 variable if there is one.  The main reason is that, at least
	 at -O0, we want to place user variables on the stack, but we
	 don't mind using pseudos for anonymous or ignored temps.
	 Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
	 should go in pseudos, whereas their corresponding variables
	 might have to go on the stack.  So, disregarding the decl
	 here would negatively impact debug info at -O0, enable
	 coalescing between SSA_NAMEs that ought to get different
	 stack/pseudo assignments, and get the incoming argument
	 processing thoroughly confused by PARM_DECLs expected to live
	 in stack slots but assigned to pseudos.  */


>> +++ b/gcc/gimple-expr.h
>> +/* Defined in tree-ssa-coalesce.c.   */
>> +extern bool gimple_can_coalesce_p (tree, tree);

> Err, put it to tree-ssa-coalesce.h?

Check.  Lots of additional headers required to be able to include
tree-ssa-coalesce.h, though.


>> -  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var)));
>> +  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var ? var : name)));

> The TREE_TYPE of name and its SSA_NAME_VAR are always the same.  So just
> use TREE_TYPE (name) here.

Check

>> gcc_assert (!REG_P (dest_rtx)
>> -             || dest_mode == promote_decl_mode (var, &unsignedp));
>> +             || dest_mode == promote_ssa_mode (name, &unsignedp));
>> 
>> if (src_mode != dest_mode)
>> {
>> @@ -714,12 +715,12 @@ static rtx
>> get_temp_reg (tree name)
>> {
>> tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
>> -  tree type = TREE_TYPE (var);
>> +  tree type = var ? TREE_TYPE (var) : TREE_TYPE (name);

> See above.

Check


Here's the revised patch, regstrapped on x86_64-linux-gnu and
i686-linux-gnu.  The first attempt failed to compile libjava on x86_64,
requiring the new change in tree-ssa-loop-niter.c to pass.  The failure
didn't occur in the unpatched tree because the differences between
anonymous and named SSA_NAMEs in copyrename changed costs and caused
different choices in ivopts, which ultimately failed to expose the
problem in loop-niter during vrp.

At the end, I enclose the incremental changes since the previous
revision of the patch, to ease the incremental review.

Ok to install?


for  gcc/ChangeLog

	PR rtl-optimization/64164
	* Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
	* tree-ssa-copyrename.c: Removed.
	* opts.c (default_options_table): Drop -ftree-copyrename.  Add
	-ftree-coalesce-vars.
	* passes.def: Drop all occurrences of pass_rename_ssa_copies.
	* common.opt (ftree-copyrename): Ignore.
	(ftree-coalesce-inlined-vars): Likewise.
	* doc/invoke.texi: Remove the ignored options above.
	* gimple-expr.h (gimple_can_coalesce_p): Move declaration
	* tree-ssa-coalesce.h: ... here.
	* tree-ssa-uncprop.c: Include tree-ssa-coalesce.h and other
	headers required by it.
	* gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
	across variables when flag_tree_coalesce_vars.  Check register
	use and promoted modes to allow coalescing.  Moved to
	tree-ssa-coalesce.c.
	* tree-ssa-live.c (struct tree_int_map_hasher): Move along
	with its member functions to tree-ssa-coalesce.c.
	(var_map_base_init): Likewise.  Renamed to
	compute_samebase_partition_bases.
	(partition_view_normal): Drop want_bases parameter.
	(partition_view_bitmap): Likewise.
	* tree-ssa-live.h: Adjust declarations.
	* tree-ssa-coalesce.c: Include explow.h.
	(build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs's
	default defs at the entry point.
	(dump_part_var_map): New.
	(compute_optimized_partition_bases): New, called by...
	(coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
	of compute_samebase_partition_bases.  Adjust.
	* alias.c (nonoverlapping_memrefs_p): Disregard gimple-regs.
	* cfgexpand.c (leader_merge): New.
	(get_rtl_for_parm_ssa_default_def): New.
	(set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
	vars.  Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
	(expand_one_stack_var_at): Handle anonymous SSA_NAMEs.  Drop
	redundant MEM attr setting.
	(expand_one_stack_var_1): Handle anonymous SSA_NAMEs.  Renamed
	from...
	(expand_one_stack_var): ... this.  New wrapper to check and
	skip already expanded SSA partitions.
	(record_alignment_for_reg_var): New, factored out of...
	(expand_one_var): ... this.
	(expand_one_ssa_partition): New.
	(adjust_one_expanded_partition_var): New.
	(expand_one_register_var): Check and skip already expanded SSA
	partitions.
	(expand_used_vars): Don't create DECLs for anonymous SSA
	names.  Expand all SSA partitions, then adjust all SSA names.
	(pass::execute): Replace the loops that set
	SA.partition_to_pseudo from partition leaders and cleared
	DECL_RTL for multi-location variables, and that which used to
	rename vars and set attrs, with one that clears DECL_RTL and
	checks that PARMs and RESULTs default_defs match DECL_RTL.
	* cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
	* emit-rtl.c (set_reg_attrs_for_parm): Handle NULL decl.
	* explow.c (promote_ssa_mode): New.
	* explow.h (promote_ssa_mode): Declare.
	* expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
	* function.c: Include cfgexpand.h.
	(use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
	(use_register_for_parm_decl): Wrapper for the above to
	special-case the result_ptr.
	(rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
	(maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
	multiple locations.
	(assign_parm_adjust_stack_rtl): Add all and parm arguments,
	for rtl_for_parm.  For SSA-assigned parms, zero stack_parm.
	(assign_parm_setup_block): Prefer SSA-assigned location.
	(assign_parm_setup_reg): Likewise.  Use entry_parm for equiv
	if stack_parm is NULL.
	(assign_parm_setup_stack): Prefer SSA-assigned location.
	(assign_parms): Maybe reset DECL_RTL of params.  Adjust stack
	rtl before testing for pointer bounds.  Special-case result_ptr.
	(expand_function_start): Maybe reset DECL_RTL of result.
	Prefer SSA-assigned location for result and static chain.
	Factor out DECL_RESULT and SET_DECL_RTL.
	* tree-outof-ssa.c (insert_value_copy_on_edge): Handle
	anonymous SSA names.  Use promote_ssa_mode.
	(get_temp_reg): Likewise.
	(remove_ssa_form): Adjust.
	* var-tracking.c (dataflow_set_clear_at_call): Take call_insn
	and get its reg_usage for reg invalidation.
	(compute_bb_dataflow): Pass it insn.
	(emit_notes_in_bb): Likewise.
	* tree-ssa-loop-niter.c (loop_exits_before_overflow): Don't
	fail assert on conversion between unsigned types.

for  gcc/testsuite/ChangeLog

	* gcc.dg/guality/pr54200.c: Add -fno-tree-coalesce-vars.
	* gcc.dg/ssp-1.c: Make counter a register.
	* gcc.dg/ssp-2.c: Likewise.
	* gcc.dg/torture/parm-coalesce.c: New.
---
 gcc/Makefile.in                              |    1 
 gcc/alias.c                                  |   13 +
 gcc/cfgexpand.c                              |  370 ++++++++++++++-----
 gcc/cfgexpand.h                              |    2 
 gcc/common.opt                               |   12 -
 gcc/doc/invoke.texi                          |   48 +--
 gcc/emit-rtl.c                               |    5 
 gcc/explow.c                                 |   22 +
 gcc/explow.h                                 |    3 
 gcc/expr.c                                   |   39 +-
 gcc/function.c                               |  226 +++++++++---
 gcc/gimple-expr.c                            |   39 --
 gcc/gimple-expr.h                            |    1 
 gcc/opts.c                                   |    2 
 gcc/passes.def                               |    5 
 gcc/testsuite/gcc.dg/guality/pr54200.c       |    2 
 gcc/testsuite/gcc.dg/ssp-1.c                 |    2 
 gcc/testsuite/gcc.dg/ssp-2.c                 |    2 
 gcc/testsuite/gcc.dg/torture/parm-coalesce.c |   40 ++
 gcc/tree-outof-ssa.c                         |   16 -
 gcc/tree-ssa-coalesce.c                      |  380 +++++++++++++++++++-
 gcc/tree-ssa-coalesce.h                      |    1 
 gcc/tree-ssa-copyrename.c                    |  499 --------------------------
 gcc/tree-ssa-live.c                          |  101 -----
 gcc/tree-ssa-live.h                          |    4 
 gcc/tree-ssa-loop-niter.c                    |    6 
 gcc/tree-ssa-uncprop.c                       |    5 
 gcc/var-tracking.c                           |   12 -
 28 files changed, 984 insertions(+), 874 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/parm-coalesce.c
 delete mode 100644 gcc/tree-ssa-copyrename.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 3d14938..2a03223 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1441,7 +1441,6 @@ OBJS = \
 	tree-ssa-ccp.o \
 	tree-ssa-coalesce.o \
 	tree-ssa-copy.o \
-	tree-ssa-copyrename.o \
 	tree-ssa-dce.o \
 	tree-ssa-dom.o \
 	tree-ssa-dse.o \
diff --git a/gcc/alias.c b/gcc/alias.c
index ea539c5..5a031d9 100644
--- a/gcc/alias.c
+++ b/gcc/alias.c
@@ -2552,6 +2552,19 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
   if (! DECL_P (exprx) || ! DECL_P (expry))
     return 0;
 
+  /* If we refer to different gimple registers, or one gimple register
+     and one non-gimple-register, we know they can't overlap.  First,
+     gimple registers don't have their addresses taken.  Now, there
+     could be more than one stack slot for (different versions of) the
+     same gimple register, but we can presumably tell they don't
+     overlap based on offsets from stack base addresses elsewhere.
+     It's important that we don't proceed to DECL_RTL, because gimple
+     registers may not pass DECL_RTL_SET_P, and make_decl_rtl won't be
+     able to do anything about them since no SSA information will have
+     remained to guide it.  */
+  if (is_gimple_reg (exprx) || is_gimple_reg (expry))
+    return exprx != expry;
+
   /* With invalid code we can end up storing into the constant pool.
      Bail out to avoid ICEing when creating RTL for this.
      See gfortran.dg/lto/20091028-2_0.f90.  */
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index b190f91..bf972fc 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -179,21 +179,121 @@ gimple_assign_rhs_to_tree (gimple stmt)
 
 #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
 
+/* Choose either CUR or NEXT as the leader DECL for a partition.
+   Prefer ignored decls, to simplify debug dumps and reduce ambiguity
+   out of the same user variable being in multiple partitions (this is
+   less likely for compiler-introduced temps).  */
+
+static tree
+leader_merge (tree cur, tree next)
+{
+  if (cur == NULL || cur == next)
+    return next;
+
+  if (DECL_P (cur) && DECL_IGNORED_P (cur))
+    return cur;
+
+  if (DECL_P (next) && DECL_IGNORED_P (next))
+    return next;
+
+  return cur;
+}
+
+
+/* Return the RTL for the default SSA def of a PARM or RESULT, if
+   there is one.  */
+
+rtx
+get_rtl_for_parm_ssa_default_def (tree var)
+{
+  gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL);
+
+  if (!is_gimple_reg (var))
+    return NULL_RTX;
+
+  /* If we've already determined RTL for the decl, use it.  This is
+     not just an optimization: if VAR is a PARM whose incoming value
+     is unused, we won't find a default def to use its partition, but
+     we still want to use the location of the parm, if it was used at
+     all.  During assign_parms, until a location is assigned for the
+     VAR, RTL can only for a parm or result if we're not coalescing
+     across variables, when we know we're coalescing all SSA_NAMEs of
+     each parm or result, and we're not coalescing them with names
+     pertaining to other variables, such as other parms' default
+     defs.  */
+  if (DECL_RTL_SET_P (var))
+    {
+      gcc_assert (DECL_RTL (var) != pc_rtx);
+      return DECL_RTL (var);
+    }
+
+  tree name = ssa_default_def (cfun, var);
+
+  if (!name)
+    return NULL_RTX;
+
+  int part = var_to_partition (SA.map, name);
+  if (part == NO_PARTITION)
+    return NULL_RTX;
+
+  return SA.partition_to_pseudo[part];
+}
+
 /* Associate declaration T with storage space X.  If T is no
    SSA name this is exactly SET_DECL_RTL, otherwise make the
    partition of T associated with X.  */
 static inline void
 set_rtl (tree t, rtx x)
 {
+  if (x && SSAVAR (t))
+    {
+      bool skip = false;
+      tree cur = NULL_TREE;
+
+      if (MEM_P (x))
+	cur = MEM_EXPR (x);
+      else if (REG_P (x))
+	cur = REG_EXPR (x);
+      else if (GET_CODE (x) == CONCAT
+	       && REG_P (XEXP (x, 0)))
+	cur = REG_EXPR (XEXP (x, 0));
+      else if (GET_CODE (x) == PARALLEL)
+	cur = REG_EXPR (XVECEXP (x, 0, 0));
+      else if (x == pc_rtx)
+	skip = true;
+      else
+	gcc_unreachable ();
+
+      tree next = skip ? cur : leader_merge (cur, SSAVAR (t));
+
+      if (cur != next)
+	{
+	  if (MEM_P (x))
+	    set_mem_attributes (x, next, true);
+	  else
+	    set_reg_attrs_for_decl_rtl (next, x);
+	}
+    }
+
   if (TREE_CODE (t) == SSA_NAME)
     {
-      SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x;
-      if (x && !MEM_P (x))
-	set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x);
-      /* For the benefit of debug information at -O0 (where vartracking
-         doesn't run) record the place also in the base DECL if it's
-	 a normal variable (not a parameter).  */
-      if (x && x != pc_rtx && TREE_CODE (SSA_NAME_VAR (t)) == VAR_DECL)
+      int part = var_to_partition (SA.map, t);
+      if (part != NO_PARTITION)
+	{
+	  if (SA.partition_to_pseudo[part])
+	    gcc_assert (SA.partition_to_pseudo[part] == x);
+	  else
+	    SA.partition_to_pseudo[part] = x;
+	}
+      /* For the benefit of debug information at -O0 (where
+         vartracking doesn't run) record the place also in the base
+         DECL.  For PARMs and RESULTs, we may end up resetting these
+         in function.c:maybe_reset_rtl_for_parm, but in some rare
+         cases we may need them (unused and overwritten incoming
+         value, that at -O0 must share the location with the other
+         uses in spite of the missing default def), and this may be
+         the only chance to preserve them.  */
+      if (x && x != pc_rtx && SSA_NAME_VAR (t))
 	{
 	  tree var = SSA_NAME_VAR (t);
 	  /* If we don't yet have something recorded, just record it now.  */
@@ -909,7 +1009,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
   gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
 
   x = plus_constant (Pmode, base, offset);
-  x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x);
+  x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
+		   ? TYPE_MODE (TREE_TYPE (decl))
+		   : DECL_MODE (SSAVAR (decl)), x);
 
   if (TREE_CODE (decl) != SSA_NAME)
     {
@@ -931,7 +1033,6 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
       DECL_USER_ALIGN (decl) = 0;
     }
 
-  set_mem_attributes (x, SSAVAR (decl), true);
   set_rtl (decl, x);
 }
 
@@ -1146,13 +1247,22 @@ account_stack_vars (void)
    to a variable to be allocated in the stack frame.  */
 
 static void
-expand_one_stack_var (tree var)
+expand_one_stack_var_1 (tree var)
 {
   HOST_WIDE_INT size, offset;
   unsigned byte_align;
 
-  size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
-  byte_align = align_local_variable (SSAVAR (var));
+  if (TREE_CODE (var) == SSA_NAME)
+    {
+      tree type = TREE_TYPE (var);
+      size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
+      byte_align = TYPE_ALIGN_UNIT (type);
+    }
+  else
+    {
+      size = tree_to_uhwi (DECL_SIZE_UNIT (var));
+      byte_align = align_local_variable (var);
+    }
 
   /* We handle highly aligned variables in expand_stack_vars.  */
   gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
@@ -1163,6 +1273,27 @@ expand_one_stack_var (tree var)
 			   crtl->max_used_stack_slot_alignment, offset);
 }
 
+/* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are
+   already assigned some MEM.  */
+
+static void
+expand_one_stack_var (tree var)
+{
+  if (TREE_CODE (var) == SSA_NAME)
+    {
+      int part = var_to_partition (SA.map, var);
+      if (part != NO_PARTITION)
+	{
+	  rtx x = SA.partition_to_pseudo[part];
+	  gcc_assert (x);
+	  gcc_assert (MEM_P (x));
+	  return;
+	}
+    }
+
+  return expand_one_stack_var_1 (var);
+}
+
 /* A subroutine of expand_one_var.  Called to assign rtl to a VAR_DECL
    that will reside in a hard register.  */
 
@@ -1172,13 +1303,114 @@ expand_one_hard_reg_var (tree var)
   rest_of_decl_compilation (var, 0, 0);
 }
 
+/* Record the alignment requirements of some variable assigned to a
+   pseudo.  */
+
+static void
+record_alignment_for_reg_var (unsigned int align)
+{
+  if (SUPPORTS_STACK_ALIGNMENT
+      && crtl->stack_alignment_estimated < align)
+    {
+      /* stack_alignment_estimated shouldn't change after stack
+         realign decision made */
+      gcc_assert (!crtl->stack_realign_processed);
+      crtl->stack_alignment_estimated = align;
+    }
+
+  /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
+     So here we only make sure stack_alignment_needed >= align.  */
+  if (crtl->stack_alignment_needed < align)
+    crtl->stack_alignment_needed = align;
+  if (crtl->max_used_stack_slot_alignment < align)
+    crtl->max_used_stack_slot_alignment = align;
+}
+
+/* Create RTL for an SSA partition.  */
+
+static void
+expand_one_ssa_partition (tree var)
+{
+  int part = var_to_partition (SA.map, var);
+  gcc_assert (part != NO_PARTITION);
+
+  if (SA.partition_to_pseudo[part])
+    return;
+
+  if (!use_register_for_decl (var))
+    {
+      expand_one_stack_var_1 (var);
+      return;
+    }
+
+  unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (var),
+					  TYPE_MODE (TREE_TYPE (var)),
+					  TYPE_ALIGN (TREE_TYPE (var)));
+
+  /* If the variable alignment is very large we'll dynamically allocate
+     it, which means that the in-frame portion is just a pointer.  */
+  if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
+    align = POINTER_SIZE;
+
+  record_alignment_for_reg_var (align);
+
+  machine_mode reg_mode = promote_ssa_mode (var, NULL);
+
+  rtx x = gen_reg_rtx (reg_mode);
+
+  set_rtl (var, x);
+}
+
+/* Record the association between the RTL generated for a partition
+   and the underlying variable of the SSA_NAME.  */
+
+static void
+adjust_one_expanded_partition_var (tree var)
+{
+  if (!var)
+    return;
+
+  tree decl = SSA_NAME_VAR (var);
+
+  int part = var_to_partition (SA.map, var);
+  if (part == NO_PARTITION)
+    return;
+
+  rtx x = SA.partition_to_pseudo[part];
+
+  set_rtl (var, x);
+
+  if (!REG_P (x))
+    return;
+
+  /* Note if the object is a user variable.  */
+  if (decl && !DECL_ARTIFICIAL (decl))
+    mark_user_reg (x);
+
+  if (POINTER_TYPE_P (decl ? TREE_TYPE (decl) : TREE_TYPE (var)))
+    mark_reg_pointer (x, get_pointer_alignment (var));
+}
+
 /* A subroutine of expand_one_var.  Called to assign rtl to a VAR_DECL
    that will reside in a pseudo register.  */
 
 static void
 expand_one_register_var (tree var)
 {
-  tree decl = SSAVAR (var);
+  if (TREE_CODE (var) == SSA_NAME)
+    {
+      int part = var_to_partition (SA.map, var);
+      if (part != NO_PARTITION)
+	{
+	  rtx x = SA.partition_to_pseudo[part];
+	  gcc_assert (x);
+	  gcc_assert (REG_P (x));
+	  return;
+	}
+      gcc_unreachable ();
+    }
+
+  tree decl = var;
   tree type = TREE_TYPE (decl);
   machine_mode reg_mode = promote_decl_mode (decl, NULL);
   rtx x = gen_reg_rtx (reg_mode);
@@ -1312,21 +1544,7 @@ expand_one_var (tree var, bool toplevel, bool really_expand)
 	align = POINTER_SIZE;
     }
 
-  if (SUPPORTS_STACK_ALIGNMENT
-      && crtl->stack_alignment_estimated < align)
-    {
-      /* stack_alignment_estimated shouldn't change after stack
-         realign decision made */
-      gcc_assert (!crtl->stack_realign_processed);
-      crtl->stack_alignment_estimated = align;
-    }
-
-  /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
-     So here we only make sure stack_alignment_needed >= align.  */
-  if (crtl->stack_alignment_needed < align)
-    crtl->stack_alignment_needed = align;
-  if (crtl->max_used_stack_slot_alignment < align)
-    crtl->max_used_stack_slot_alignment = align;
+  record_alignment_for_reg_var (align);
 
   if (TREE_CODE (origvar) == SSA_NAME)
     {
@@ -1760,48 +1978,18 @@ expand_used_vars (void)
   if (targetm.use_pseudo_pic_reg ())
     pic_offset_table_rtx = gen_reg_rtx (Pmode);
 
-  hash_map<tree, tree> ssa_name_decls;
   for (i = 0; i < SA.map->num_partitions; i++)
     {
       tree var = partition_to_var (SA.map, i);
 
       gcc_assert (!virtual_operand_p (var));
 
-      /* Assign decls to each SSA name partition, share decls for partitions
-         we could have coalesced (those with the same type).  */
-      if (SSA_NAME_VAR (var) == NULL_TREE)
-	{
-	  tree *slot = &ssa_name_decls.get_or_insert (TREE_TYPE (var));
-	  if (!*slot)
-	    *slot = create_tmp_reg (TREE_TYPE (var));
-	  replace_ssa_name_symbol (var, *slot);
-	}
-
-      /* Always allocate space for partitions based on VAR_DECLs.  But for
-	 those based on PARM_DECLs or RESULT_DECLs and which matter for the
-	 debug info, there is no need to do so if optimization is disabled
-	 because all the SSA_NAMEs based on these DECLs have been coalesced
-	 into a single partition, which is thus assigned the canonical RTL
-	 location of the DECLs.  If in_lto_p, we can't rely on optimize,
-	 a function could be compiled with -O1 -flto first and only the
-	 link performed at -O0.  */
-      if (TREE_CODE (SSA_NAME_VAR (var)) == VAR_DECL)
-	expand_one_var (var, true, true);
-      else if (DECL_IGNORED_P (SSA_NAME_VAR (var)) || optimize || in_lto_p)
-	{
-	  /* This is a PARM_DECL or RESULT_DECL.  For those partitions that
-	     contain the default def (representing the parm or result itself)
-	     we don't do anything here.  But those which don't contain the
-	     default def (representing a temporary based on the parm/result)
-	     we need to allocate space just like for normal VAR_DECLs.  */
-	  if (!bitmap_bit_p (SA.partition_has_default_def, i))
-	    {
-	      expand_one_var (var, true, true);
-	      gcc_assert (SA.partition_to_pseudo[i]);
-	    }
-	}
+      expand_one_ssa_partition (var);
     }
 
+  for (i = 1; i < num_ssa_names; i++)
+    adjust_one_expanded_partition_var (ssa_name (i));
+
   if (flag_stack_protect == SPCT_FLAG_STRONG)
       gen_stack_protect_signal
 	= stack_protect_decl_p () || stack_protect_return_slot_p ();
@@ -5961,35 +6149,6 @@ pass_expand::execute (function *fun)
       parm_birth_insn = var_seq;
     }
 
-  /* Now that we also have the parameter RTXs, copy them over to our
-     partitions.  */
-  for (i = 0; i < SA.map->num_partitions; i++)
-    {
-      tree var = SSA_NAME_VAR (partition_to_var (SA.map, i));
-
-      if (TREE_CODE (var) != VAR_DECL
-	  && !SA.partition_to_pseudo[i])
-	SA.partition_to_pseudo[i] = DECL_RTL_IF_SET (var);
-      gcc_assert (SA.partition_to_pseudo[i]);
-
-      /* If this decl was marked as living in multiple places, reset
-	 this now to NULL.  */
-      if (DECL_RTL_IF_SET (var) == pc_rtx)
-	SET_DECL_RTL (var, NULL);
-
-      /* Some RTL parts really want to look at DECL_RTL(x) when x
-	 was a decl marked in REG_ATTR or MEM_ATTR.  We could use
-	 SET_DECL_RTL here making this available, but that would mean
-	 to select one of the potentially many RTLs for one DECL.  Instead
-	 of doing that we simply reset the MEM_EXPR of the RTL in question,
-	 then nobody can get at it and hence nobody can call DECL_RTL on it.  */
-      if (!DECL_RTL_SET_P (var))
-	{
-	  if (MEM_P (SA.partition_to_pseudo[i]))
-	    set_mem_expr (SA.partition_to_pseudo[i], NULL);
-	}
-    }
-
   /* If we have a class containing differently aligned pointers
      we need to merge those into the corresponding RTL pointer
      alignment.  */
@@ -5997,7 +6156,6 @@ pass_expand::execute (function *fun)
     {
       tree name = ssa_name (i);
       int part;
-      rtx r;
 
       if (!name
 	  /* We might have generated new SSA names in
@@ -6010,20 +6168,24 @@ pass_expand::execute (function *fun)
       if (part == NO_PARTITION)
 	continue;
 
-      /* Adjust all partition members to get the underlying decl of
-	 the representative which we might have created in expand_one_var.  */
-      if (SSA_NAME_VAR (name) == NULL_TREE)
+      gcc_assert (SA.partition_to_pseudo[part]);
+
+      /* If this decl was marked as living in multiple places, reset
+	 this now to NULL.  */
+      tree var = SSA_NAME_VAR (name);
+      if (var && DECL_RTL_IF_SET (var) == pc_rtx)
+	SET_DECL_RTL (var, NULL);
+      /* Check that the pseudos chosen by assign_parms are those of
+	 the corresponding default defs.  */
+      else if (SSA_NAME_IS_DEFAULT_DEF (name)
+	       && (TREE_CODE (var) == PARM_DECL
+		   || TREE_CODE (var) == RESULT_DECL))
 	{
-	  tree leader = partition_to_var (SA.map, part);
-	  gcc_assert (SSA_NAME_VAR (leader) != NULL_TREE);
-	  replace_ssa_name_symbol (name, SSA_NAME_VAR (leader));
+	  rtx in = DECL_RTL_IF_SET (var);
+	  gcc_assert (in);
+	  rtx out = SA.partition_to_pseudo[part];
+	  gcc_assert (in == out || rtx_equal_p (in, out));
 	}
-      if (!POINTER_TYPE_P (TREE_TYPE (name)))
-	continue;
-
-      r = SA.partition_to_pseudo[part];
-      if (REG_P (r))
-	mark_reg_pointer (r, get_pointer_alignment (name));
     }
 
   /* If this function is `main', emit a call to `__main'
diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
index a0b6e3e..602579d 100644
--- a/gcc/cfgexpand.h
+++ b/gcc/cfgexpand.h
@@ -22,5 +22,7 @@ along with GCC; see the file COPYING3.  If not see
 
 extern tree gimple_assign_rhs_to_tree (gimple);
 extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
+extern rtx get_rtl_for_parm_ssa_default_def (tree var);
+
 
 #endif /* GCC_CFGEXPAND_H */
diff --git a/gcc/common.opt b/gcc/common.opt
index 32b416a..051f824 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2227,16 +2227,16 @@ Common Report Var(flag_tree_ch) Optimization
 Enable loop header copying on trees
 
 ftree-coalesce-inlined-vars
-Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization
-Enable coalescing of copy-related user variables that are inlined
+Common Ignore RejectNegative
+Does nothing.  Preserved for backward compatibility.
 
 ftree-coalesce-vars
-Common Report Var(flag_ssa_coalesce_vars,2) Optimization
-Enable coalescing of all copy-related user variables
+Common Report Var(flag_tree_coalesce_vars) Optimization
+Enable SSA coalescing of user variables
 
 ftree-copyrename
-Common Report Var(flag_tree_copyrename) Optimization
-Replace SSA temporaries with better names in copies
+Common Ignore
+Does nothing.  Preserved for backward compatibility.
 
 ftree-copy-prop
 Common Report Var(flag_tree_copy_prop) Optimization
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index e25bd62..e359be2 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -339,7 +339,6 @@ Objective-C and Objective-C++ Dialects}.
 -fdump-tree-phiprop@r{[}-@var{n}@r{]} @gol
 -fdump-tree-phiopt@r{[}-@var{n}@r{]} @gol
 -fdump-tree-forwprop@r{[}-@var{n}@r{]} @gol
--fdump-tree-copyrename@r{[}-@var{n}@r{]} @gol
 -fdump-tree-nrv -fdump-tree-vect @gol
 -fdump-tree-sink @gol
 -fdump-tree-sra@r{[}-@var{n}@r{]} @gol
@@ -445,9 +444,8 @@ Objective-C and Objective-C++ Dialects}.
 -fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol
 -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
 -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
--ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
--ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
--ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
+-ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol
+-ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
 -ftree-loop-if-convert-stores -ftree-loop-im @gol
 -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
 -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
@@ -7076,11 +7074,6 @@ name is made by appending @file{.phiopt} to the source file name.
 Dump each function after forward propagating single use variables.  The file
 name is made by appending @file{.forwprop} to the source file name.
 
-@item copyrename
-@opindex fdump-tree-copyrename
-Dump each function after applying the copy rename optimization.  The file
-name is made by appending @file{.copyrename} to the source file name.
-
 @item nrv
 @opindex fdump-tree-nrv
 Dump each function after applying the named return value optimization on
@@ -7545,8 +7538,8 @@ compilation time.
 -ftree-ccp @gol
 -fssa-phiopt @gol
 -ftree-ch @gol
+-ftree-coalesce-vars @gol
 -ftree-copy-prop @gol
--ftree-copyrename @gol
 -ftree-dce @gol
 -ftree-dominator-opts @gol
 -ftree-dse @gol
@@ -8815,6 +8808,15 @@ profitable to parallelize the loops.
 Compare the results of several data dependence analyzers.  This option
 is used for debugging the data dependence analyzers.
 
+@item -ftree-coalesce-vars
+@opindex ftree-coalesce-vars
+Tell the compiler to attempt to combine small user-defined variables
+too, instead of just compiler temporaries.  This may severely limit the
+ability to debug an optimized program compiled with
+@option{-fno-var-tracking-assignments}.  In the negated form, this flag
+prevents SSA coalescing of user variables.  This option is enabled by
+default if optimization is enabled.
+
 @item -ftree-loop-if-convert
 @opindex ftree-loop-if-convert
 Attempt to transform conditional jumps in the innermost loops to
@@ -8928,32 +8930,6 @@ Perform scalar replacement of aggregates.  This pass replaces structure
 references with scalars to prevent committing structures to memory too
 early.  This flag is enabled by default at @option{-O} and higher.
 
-@item -ftree-copyrename
-@opindex ftree-copyrename
-Perform copy renaming on trees.  This pass attempts to rename compiler
-temporaries to other variables at copy locations, usually resulting in
-variable names which more closely resemble the original variables.  This flag
-is enabled by default at @option{-O} and higher.
-
-@item -ftree-coalesce-inlined-vars
-@opindex ftree-coalesce-inlined-vars
-Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
-combine small user-defined variables too, but only if they are inlined
-from other functions.  It is a more limited form of
-@option{-ftree-coalesce-vars}.  This may harm debug information of such
-inlined variables, but it keeps variables of the inlined-into
-function apart from each other, such that they are more likely to
-contain the expected values in a debugging session.
-
-@item -ftree-coalesce-vars
-@opindex ftree-coalesce-vars
-Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
-combine small user-defined variables too, instead of just compiler
-temporaries.  This may severely limit the ability to debug an optimized
-program compiled with @option{-fno-var-tracking-assignments}.  In the
-negated form, this flag prevents SSA coalescing of user variables,
-including inlined ones.  This option is enabled by default.
-
 @item -ftree-ter
 @opindex ftree-ter
 Perform temporary expression replacement during the SSA->normal phase.  Single
diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index 49a1509..2b98946 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -1249,6 +1249,9 @@ set_reg_attrs_for_parm (rtx parm_rtx, rtx mem)
 void
 set_reg_attrs_for_decl_rtl (tree t, rtx x)
 {
+  if (!t)
+    return;
+  tree tdecl = t;
   if (GET_CODE (x) == SUBREG)
     {
       gcc_assert (subreg_lowpart_p (x));
@@ -1257,7 +1260,7 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x)
   if (REG_P (x))
     REG_ATTRS (x)
       = get_reg_attrs (t, byte_lowpart_offset (GET_MODE (x),
-					       DECL_MODE (t)));
+					       DECL_MODE (tdecl)));
   if (GET_CODE (x) == CONCAT)
     {
       if (REG_P (XEXP (x, 0)))
diff --git a/gcc/explow.c b/gcc/explow.c
index 8745aea..5b0d49c 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -856,6 +856,28 @@ promote_decl_mode (const_tree decl, int *punsignedp)
   return pmode;
 }
 
+/* Return the promoted mode for NAME.  If it is a named SSA_NAME, it
+   is the same as promote_decl_mode.  Otherwise, it is the promoted
+   mode of a temp decl of same type as the SSA_NAME, if we had created
+   one.  */
+
+machine_mode
+promote_ssa_mode (const_tree name, int *punsignedp)
+{
+  gcc_assert (TREE_CODE (name) == SSA_NAME);
+
+  tree type = TREE_TYPE (name);
+  int unsignedp = TYPE_UNSIGNED (type);
+  machine_mode mode = TYPE_MODE (type);
+
+  machine_mode pmode = promote_mode (type, mode, &unsignedp);
+  if (punsignedp)
+    *punsignedp = unsignedp;
+
+  return pmode;
+}
+
+
 \f
 /* Controls the behaviour of {anti_,}adjust_stack.  */
 static bool suppress_reg_args_size;
diff --git a/gcc/explow.h b/gcc/explow.h
index 94613de..52113db 100644
--- a/gcc/explow.h
+++ b/gcc/explow.h
@@ -57,6 +57,9 @@ extern machine_mode promote_mode (const_tree, machine_mode, int *);
 /* Return mode and signedness to use when object is promoted.  */
 machine_mode promote_decl_mode (const_tree, int *);
 
+/* Return mode and signedness to use when an SSA_NAME is promoted.  */
+machine_mode promote_ssa_mode (const_tree, int *);
+
 /* Remove some bytes from the stack.  An rtx says how many.  */
 extern void adjust_stack (rtx);
 
diff --git a/gcc/expr.c b/gcc/expr.c
index 5a931dc..5b6e16e 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9301,7 +9301,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
   rtx op0, op1, temp, decl_rtl;
   tree type;
   int unsignedp;
-  machine_mode mode;
+  machine_mode mode, dmode;
   enum tree_code code = TREE_CODE (exp);
   rtx subtarget, original_target;
   int ignore;
@@ -9432,7 +9432,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
       if (g == NULL
 	  && modifier == EXPAND_INITIALIZER
 	  && !SSA_NAME_IS_DEFAULT_DEF (exp)
-	  && (optimize || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
+	  && (optimize || !SSA_NAME_VAR (exp)
+	      || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
 	  && stmt_is_replaceable_p (SSA_NAME_DEF_STMT (exp)))
 	g = SSA_NAME_DEF_STMT (exp);
       if (g)
@@ -9511,15 +9512,18 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
       /* Ensure variable marked as used even if it doesn't go through
 	 a parser.  If it hasn't be used yet, write out an external
 	 definition.  */
-      TREE_USED (exp) = 1;
+      if (exp)
+	TREE_USED (exp) = 1;
 
       /* Show we haven't gotten RTL for this yet.  */
       temp = 0;
 
       /* Variables inherited from containing functions should have
 	 been lowered by this point.  */
-      context = decl_function_context (exp);
-      gcc_assert (SCOPE_FILE_SCOPE_P (context)
+      if (exp)
+	context = decl_function_context (exp);
+      gcc_assert (!exp
+		  || SCOPE_FILE_SCOPE_P (context)
 		  || context == current_function_decl
 		  || TREE_STATIC (exp)
 		  || DECL_EXTERNAL (exp)
@@ -9543,7 +9547,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 	  decl_rtl = use_anchored_address (decl_rtl);
 	  if (modifier != EXPAND_CONST_ADDRESS
 	      && modifier != EXPAND_SUM
-	      && !memory_address_addr_space_p (DECL_MODE (exp),
+	      && !memory_address_addr_space_p (exp ? DECL_MODE (exp)
+					       : GET_MODE (decl_rtl),
 					       XEXP (decl_rtl, 0),
 					       MEM_ADDR_SPACE (decl_rtl)))
 	    temp = replace_equiv_address (decl_rtl,
@@ -9554,12 +9559,17 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 	 if the address is a register.  */
       if (temp != 0)
 	{
-	  if (MEM_P (temp) && REG_P (XEXP (temp, 0)))
+	  if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0)))
 	    mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp));
 
 	  return temp;
 	}
 
+      if (exp)
+	dmode = DECL_MODE (exp);
+      else
+	dmode = TYPE_MODE (TREE_TYPE (ssa_name));
+
       /* If the mode of DECL_RTL does not match that of the decl,
 	 there are two cases: we are dealing with a BLKmode value
 	 that is returned in a register, or we are dealing with
@@ -9567,22 +9577,23 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 	 of the wanted mode, but mark it so that we know that it
 	 was already extended.  */
       if (REG_P (decl_rtl)
-	  && DECL_MODE (exp) != BLKmode
-	  && GET_MODE (decl_rtl) != DECL_MODE (exp))
+	  && dmode != BLKmode
+	  && GET_MODE (decl_rtl) != dmode)
 	{
 	  machine_mode pmode;
 
 	  /* Get the signedness to be used for this variable.  Ensure we get
 	     the same mode we got when the variable was declared.  */
-	  if (code == SSA_NAME
-	      && (g = SSA_NAME_DEF_STMT (ssa_name))
-	      && gimple_code (g) == GIMPLE_CALL
-	      && !gimple_call_internal_p (g))
+	  if (code != SSA_NAME)
+	    pmode = promote_decl_mode (exp, &unsignedp);
+	  else if ((g = SSA_NAME_DEF_STMT (ssa_name))
+		   && gimple_code (g) == GIMPLE_CALL
+		   && !gimple_call_internal_p (g))
 	    pmode = promote_function_mode (type, mode, &unsignedp,
 					   gimple_call_fntype (g),
 					   2);
 	  else
-	    pmode = promote_decl_mode (exp, &unsignedp);
+	    pmode = promote_ssa_mode (ssa_name, &unsignedp);
 	  gcc_assert (GET_MODE (decl_rtl) == pmode);
 
 	  temp = gen_lowpart_SUBREG (mode, decl_rtl);
diff --git a/gcc/function.c b/gcc/function.c
index 7d2d7e4..58e2498 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -87,6 +87,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfganal.h"
 #include "cfgbuild.h"
 #include "cfgcleanup.h"
+#include "cfgexpand.h"
 #include "basic-block.h"
 #include "df.h"
 #include "params.h"
@@ -2121,6 +2122,30 @@ aggregate_value_p (const_tree exp, const_tree fntype)
 bool
 use_register_for_decl (const_tree decl)
 {
+  if (TREE_CODE (decl) == SSA_NAME)
+    {
+      /* We often try to use the SSA_NAME, instead of its underlying
+	 decl, to get type information and guide decisions, to avoid
+	 differences of behavior between anonymous and named
+	 variables, but in this one case we have to go for the actual
+	 variable if there is one.  The main reason is that, at least
+	 at -O0, we want to place user variables on the stack, but we
+	 don't mind using pseudos for anonymous or ignored temps.
+	 Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
+	 should go in pseudos, whereas their corresponding variables
+	 might have to go on the stack.  So, disregarding the decl
+	 here would negatively impact debug info at -O0, enable
+	 coalescing between SSA_NAMEs that ought to get different
+	 stack/pseudo assignments, and get the incoming argument
+	 processing thoroughly confused by PARM_DECLs expected to live
+	 in stack slots but assigned to pseudos.  */
+      if (!SSA_NAME_VAR (decl))
+	return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
+	  && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
+
+      decl = SSA_NAME_VAR (decl);
+    }
+
   if (!targetm.calls.allocate_stack_slots_for_args ())
     return true;
 
@@ -2804,23 +2829,88 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data)
   data->entry_parm = entry_parm;
 }
 
+/* Wrapper for use_register_for_decl, that special-cases the
+   .result_ptr as the function's RESULT_DECL when the RESULT_DECL is
+   passed by reference.  */
+
+static bool
+use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm)
+{
+  if (parm == all->function_result_decl)
+    {
+      tree result = DECL_RESULT (current_function_decl);
+
+      if (DECL_BY_REFERENCE (result))
+	parm = result;
+    }
+
+  return use_register_for_decl (parm);
+}
+
+/* Wrapper for get_rtl_for_parm_ssa_default_def, that special-cases
+   the .result_ptr as the function's RESULT_DECL when the RESULT_DECL
+   is passed by reference.  */
+
+static rtx
+rtl_for_parm (struct assign_parm_data_all *all, tree parm)
+{
+  if (parm == all->function_result_decl)
+    {
+      tree result = DECL_RESULT (current_function_decl);
+
+      if (!DECL_BY_REFERENCE (result))
+	return NULL_RTX;
+
+      parm = result;
+    }
+
+  return get_rtl_for_parm_ssa_default_def (parm);
+}
+
+/* Reset the location of PARM_DECLs and RESULT_DECLs that had
+   SSA_NAMEs in multiple partitions, so that assign_parms will choose
+   the default def, if it exists, or create new RTL to hold the unused
+   entry value.  If we are coalescing across variables, we want to
+   reset the location too, because a parm without a default def
+   (incoming value unused) might be coalesced with one with a default
+   def, and then assign_parms would copy both incoming values to the
+   same location, which might cause the wrong value to survive.  */
+static void
+maybe_reset_rtl_for_parm (tree parm)
+{
+  gcc_assert (TREE_CODE (parm) == PARM_DECL
+	      || TREE_CODE (parm) == RESULT_DECL);
+  if ((flag_tree_coalesce_vars
+       || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx))
+      && is_gimple_reg (parm))
+    SET_DECL_RTL (parm, NULL_RTX);
+}
+
 /* A subroutine of assign_parms.  Adjust DATA->STACK_RTL such that it's
    always valid and properly aligned.  */
 
 static void
-assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data)
+assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm,
+			      struct assign_parm_data_one *data)
 {
   rtx stack_parm = data->stack_parm;
 
+  /* If out-of-SSA assigned RTL to the parm default def, make sure we
+     don't use what we might have computed before.  */
+  rtx ssa_assigned = rtl_for_parm (all, parm);
+  if (ssa_assigned)
+    stack_parm = NULL;
+
   /* If we can't trust the parm stack slot to be aligned enough for its
      ultimate type, don't use that slot after entry.  We'll make another
      stack slot, if we need one.  */
-  if (stack_parm
-      && ((STRICT_ALIGNMENT
-	   && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm))
-	  || (data->nominal_type
-	      && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
-	      && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
+  else if (stack_parm
+	   && ((STRICT_ALIGNMENT
+		&& (GET_MODE_ALIGNMENT (data->nominal_mode)
+		    > MEM_ALIGN (stack_parm)))
+	       || (data->nominal_type
+		   && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
+		   && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
     stack_parm = NULL;
 
   /* If parm was passed in memory, and we need to convert it on entry,
@@ -2882,11 +2972,16 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
 
   size = int_size_in_bytes (data->passed_type);
   size_stored = CEIL_ROUND (size, UNITS_PER_WORD);
+
   if (stack_parm == 0)
     {
       DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
-      stack_parm = assign_stack_local (BLKmode, size_stored,
-				       DECL_ALIGN (parm));
+      stack_parm = rtl_for_parm (all, parm);
+      if (!stack_parm)
+	stack_parm = assign_stack_local (BLKmode, size_stored,
+					 DECL_ALIGN (parm));
+      else
+	stack_parm = copy_rtx (stack_parm);
       if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
 	PUT_MODE (stack_parm, GET_MODE (entry_parm));
       set_mem_attributes (stack_parm, parm, 1);
@@ -3027,10 +3122,19 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
     = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
 			     TREE_TYPE (current_function_decl), 2);
 
-  parmreg = gen_reg_rtx (promoted_nominal_mode);
+  rtx from_expand = rtl_for_parm (all, parm);
 
-  if (!DECL_ARTIFICIAL (parm))
-    mark_user_reg (parmreg);
+  if (from_expand && !data->passed_pointer)
+    {
+      parmreg = from_expand;
+      gcc_assert (GET_MODE (parmreg) == promoted_nominal_mode);
+    }
+  else
+    {
+      parmreg = gen_reg_rtx (promoted_nominal_mode);
+      if (!DECL_ARTIFICIAL (parm))
+	mark_user_reg (parmreg);
+    }
 
   /* If this was an item that we received a pointer to,
      set DECL_RTL appropriately.  */
@@ -3049,6 +3153,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
      assign_parm_find_data_types and expand_expr_real_1.  */
 
   equiv_stack_parm = data->stack_parm;
+  if (!equiv_stack_parm)
+    equiv_stack_parm = data->entry_parm;
   validated_mem = validize_mem (copy_rtx (data->entry_parm));
 
   need_conversion = (data->nominal_mode != data->passed_mode
@@ -3189,11 +3295,17 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 
   /* If we were passed a pointer but the actual value can safely live
      in a register, retrieve it and use it directly.  */
-  if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode)
+  if (data->passed_pointer
+      && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode))
     {
       /* We can't use nominal_mode, because it will have been set to
 	 Pmode above.  We must use the actual mode of the parm.  */
-      if (use_register_for_decl (parm))
+      if (from_expand)
+	{
+	  parmreg = from_expand;
+	  gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm)));
+	}
+      else if (use_register_for_decl (parm))
 	{
 	  parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm)));
 	  mark_user_reg (parmreg);
@@ -3233,7 +3345,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 
       /* STACK_PARM is the pointer, not the parm, and PARMREG is
 	 now the parm.  */
-      data->stack_parm = NULL;
+      data->stack_parm = equiv_stack_parm = NULL;
     }
 
   /* Mark the register as eliminable if we did no conversion and it was
@@ -3243,11 +3355,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
      make here would screw up life analysis for it.  */
   if (data->nominal_mode == data->passed_mode
       && !did_conversion
-      && data->stack_parm != 0
-      && MEM_P (data->stack_parm)
+      && equiv_stack_parm != 0
+      && MEM_P (equiv_stack_parm)
       && data->locate.offset.var == 0
       && reg_mentioned_p (virtual_incoming_args_rtx,
-			  XEXP (data->stack_parm, 0)))
+			  XEXP (equiv_stack_parm, 0)))
     {
       rtx_insn *linsn = get_last_insn ();
       rtx_insn *sinsn;
@@ -3260,8 +3372,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 	    = GET_MODE_INNER (GET_MODE (parmreg));
 	  int regnor = REGNO (XEXP (parmreg, 0));
 	  int regnoi = REGNO (XEXP (parmreg, 1));
-	  rtx stackr = adjust_address_nv (data->stack_parm, submode, 0);
-	  rtx stacki = adjust_address_nv (data->stack_parm, submode,
+	  rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0);
+	  rtx stacki = adjust_address_nv (equiv_stack_parm, submode,
 					  GET_MODE_SIZE (submode));
 
 	  /* Scan backwards for the set of the real and
@@ -3334,6 +3446,13 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
 
       if (data->stack_parm == 0)
 	{
+	  rtx x = data->stack_parm = rtl_for_parm (all, parm);
+	  if (x)
+	    gcc_assert (GET_MODE (x) == GET_MODE (data->entry_parm));
+	}
+
+      if (data->stack_parm == 0)
+	{
 	  int align = STACK_SLOT_ALIGNMENT (data->passed_type,
 					    GET_MODE (data->entry_parm),
 					    TYPE_ALIGN (data->passed_type));
@@ -3592,6 +3711,8 @@ assign_parms (tree fndecl)
 	  DECL_INCOMING_RTL (parm) = DECL_RTL (parm);
 	  continue;
 	}
+      else
+	maybe_reset_rtl_for_parm (parm);
 
       /* Estimate stack alignment from parameter alignment.  */
       if (SUPPORTS_STACK_ALIGNMENT)
@@ -3641,7 +3762,9 @@ assign_parms (tree fndecl)
       else
 	set_decl_incoming_rtl (parm, data.entry_parm, false);
 
-      /* Boudns should be loaded in the particular order to
+      assign_parm_adjust_stack_rtl (&all, parm, &data);
+
+      /* Bounds should be loaded in the particular order to
 	 have registers allocated correctly.  Collect info about
 	 input bounds and load them later.  */
       if (POINTER_BOUNDS_TYPE_P (data.passed_type))
@@ -3658,11 +3781,10 @@ assign_parms (tree fndecl)
 	}
       else
 	{
-	  assign_parm_adjust_stack_rtl (&data);
-
 	  if (assign_parm_setup_block_p (&data))
 	    assign_parm_setup_block (&all, parm, &data);
-	  else if (data.passed_pointer || use_register_for_decl (parm))
+	  else if (data.passed_pointer
+		   || use_register_for_parm_decl (&all, parm))
 	    assign_parm_setup_reg (&all, parm, &data);
 	  else
 	    assign_parm_setup_stack (&all, parm, &data);
@@ -5004,7 +5126,9 @@ expand_function_start (tree subr)
      before any library calls that assign parms might generate.  */
 
   /* Decide whether to return the value in memory or in a register.  */
-  if (aggregate_value_p (DECL_RESULT (subr), subr))
+  tree res = DECL_RESULT (subr);
+  maybe_reset_rtl_for_parm (res);
+  if (aggregate_value_p (res, subr))
     {
       /* Returning something that won't go in a register.  */
       rtx value_address = 0;
@@ -5012,7 +5136,7 @@ expand_function_start (tree subr)
 #ifdef PCC_STATIC_STRUCT_RETURN
       if (cfun->returns_pcc_struct)
 	{
-	  int size = int_size_in_bytes (TREE_TYPE (DECL_RESULT (subr)));
+	  int size = int_size_in_bytes (TREE_TYPE (res));
 	  value_address = assemble_static_space (size);
 	}
       else
@@ -5024,36 +5148,45 @@ expand_function_start (tree subr)
 	     it.  */
 	  if (sv)
 	    {
-	      value_address = gen_reg_rtx (Pmode);
+	      if (DECL_BY_REFERENCE (res))
+		value_address = get_rtl_for_parm_ssa_default_def (res);
+	      if (!value_address)
+		value_address = gen_reg_rtx (Pmode);
 	      emit_move_insn (value_address, sv);
 	    }
 	}
       if (value_address)
 	{
 	  rtx x = value_address;
-	  if (!DECL_BY_REFERENCE (DECL_RESULT (subr)))
+	  if (!DECL_BY_REFERENCE (res))
 	    {
-	      x = gen_rtx_MEM (DECL_MODE (DECL_RESULT (subr)), x);
-	      set_mem_attributes (x, DECL_RESULT (subr), 1);
+	      x = get_rtl_for_parm_ssa_default_def (res);
+	      if (!x)
+		{
+		  x = gen_rtx_MEM (DECL_MODE (res), value_address);
+		  set_mem_attributes (x, res, 1);
+		}
 	    }
-	  SET_DECL_RTL (DECL_RESULT (subr), x);
+	  SET_DECL_RTL (res, x);
 	}
     }
-  else if (DECL_MODE (DECL_RESULT (subr)) == VOIDmode)
+  else if (DECL_MODE (res) == VOIDmode)
     /* If return mode is void, this decl rtl should not be used.  */
-    SET_DECL_RTL (DECL_RESULT (subr), NULL_RTX);
+    SET_DECL_RTL (res, NULL_RTX);
   else
     {
       /* Compute the return values into a pseudo reg, which we will copy
 	 into the true return register after the cleanups are done.  */
-      tree return_type = TREE_TYPE (DECL_RESULT (subr));
-      if (TYPE_MODE (return_type) != BLKmode
-	  && targetm.calls.return_in_msb (return_type))
+      tree return_type = TREE_TYPE (res);
+      rtx x = get_rtl_for_parm_ssa_default_def (res);
+      if (x)
+	/* Use it.  */;
+      else if (TYPE_MODE (return_type) != BLKmode
+	       && targetm.calls.return_in_msb (return_type))
 	/* expand_function_end will insert the appropriate padding in
 	   this case.  Use the return value's natural (unpadded) mode
 	   within the function proper.  */
-	SET_DECL_RTL (DECL_RESULT (subr),
-		      gen_reg_rtx (TYPE_MODE (return_type)));
+	x = gen_reg_rtx (TYPE_MODE (return_type));
       else
 	{
 	  /* In order to figure out what mode to use for the pseudo, we
@@ -5064,25 +5197,26 @@ expand_function_start (tree subr)
 	  /* Structures that are returned in registers are not
 	     aggregate_value_p, so we may see a PARALLEL or a REG.  */
 	  if (REG_P (hard_reg))
-	    SET_DECL_RTL (DECL_RESULT (subr),
-			  gen_reg_rtx (GET_MODE (hard_reg)));
+	    x = gen_reg_rtx (GET_MODE (hard_reg));
 	  else
 	    {
 	      gcc_assert (GET_CODE (hard_reg) == PARALLEL);
-	      SET_DECL_RTL (DECL_RESULT (subr), gen_group_rtx (hard_reg));
+	      x = gen_group_rtx (hard_reg);
 	    }
 	}
 
+      SET_DECL_RTL (res, x);
+
       /* Set DECL_REGISTER flag so that expand_function_end will copy the
 	 result to the real return register(s).  */
-      DECL_REGISTER (DECL_RESULT (subr)) = 1;
+      DECL_REGISTER (res) = 1;
 
       if (chkp_function_instrumented_p (current_function_decl))
 	{
-	  tree return_type = TREE_TYPE (DECL_RESULT (subr));
+	  tree return_type = TREE_TYPE (res);
 	  rtx bounds = targetm.calls.chkp_function_value_bounds (return_type,
 								 subr, 1);
-	  SET_DECL_BOUNDS_RTL (DECL_RESULT (subr), bounds);
+	  SET_DECL_BOUNDS_RTL (res, bounds);
 	}
     }
 
@@ -5097,7 +5231,9 @@ expand_function_start (tree subr)
       rtx local, chain;
      rtx_insn *insn;
 
-      local = gen_reg_rtx (Pmode);
+      local = get_rtl_for_parm_ssa_default_def (parm);
+      if (!local)
+	local = gen_reg_rtx (Pmode);
       chain = targetm.calls.static_chain (current_function_decl, true);
 
       set_decl_incoming_rtl (parm, chain, false);
diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
index 4d683d6..d3d1c5f 100644
--- a/gcc/gimple-expr.c
+++ b/gcc/gimple-expr.c
@@ -391,45 +391,6 @@ copy_var_decl (tree var, tree name, tree type)
   return copy;
 }
 
-/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
-   coalescing together, false otherwise.
-
-   This must stay consistent with var_map_base_init in tree-ssa-live.c.  */
-
-bool
-gimple_can_coalesce_p (tree name1, tree name2)
-{
-  /* First check the SSA_NAME's associated DECL.  We only want to
-     coalesce if they have the same DECL or both have no associated DECL.  */
-  tree var1 = SSA_NAME_VAR (name1);
-  tree var2 = SSA_NAME_VAR (name2);
-  var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
-  var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
-  if (var1 != var2)
-    return false;
-
-  /* Now check the types.  If the types are the same, then we should
-     try to coalesce V1 and V2.  */
-  tree t1 = TREE_TYPE (name1);
-  tree t2 = TREE_TYPE (name2);
-  if (t1 == t2)
-    return true;
-
-  /* If the types are not the same, check for a canonical type match.  This
-     (for example) allows coalescing when the types are fundamentally the
-     same, but just have different names. 
-
-     Note pointer types with different address spaces may have the same
-     canonical type.  Those are rejected for coalescing by the
-     types_compatible_p check.  */
-  if (TYPE_CANONICAL (t1)
-      && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
-      && types_compatible_p (t1, t2))
-    return true;
-
-  return false;
-}
-
 /* Strip off a legitimate source ending from the input string NAME of
    length LEN.  Rather than having to know the names used by all of
    our front ends, we strip off an ending of a period followed by
diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
index ed23eb2..3d1c89f 100644
--- a/gcc/gimple-expr.h
+++ b/gcc/gimple-expr.h
@@ -28,7 +28,6 @@ extern gimple_seq gimple_body (tree);
 extern bool gimple_has_body_p (tree);
 extern const char *gimple_decl_printable_name (tree, int);
 extern tree copy_var_decl (tree, tree, tree);
-extern bool gimple_can_coalesce_p (tree, tree);
 extern tree create_tmp_var_name (const char *);
 extern tree create_tmp_var_raw (tree, const char * = NULL);
 extern tree create_tmp_var (tree, const char * = NULL);
diff --git a/gcc/opts.c b/gcc/opts.c
index 9793999..5305299 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -448,12 +448,12 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_1_PLUS, OPT_fsplit_wide_types, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_ccp, NULL, 1 },
     { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_bit_ccp, NULL, 1 },
+    { OPT_LEVELS_1_PLUS, OPT_ftree_coalesce_vars, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_dce, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_dominator_opts, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 },
     { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 },
-    { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 },
diff --git a/gcc/passes.def b/gcc/passes.def
index 4690e23..230e089 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -77,7 +77,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_all_early_optimizations);
       PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
 	  NEXT_PASS (pass_remove_cgraph_callee_edges);
-	  NEXT_PASS (pass_rename_ssa_copies);
 	  NEXT_PASS (pass_object_sizes);
 	  NEXT_PASS (pass_ccp);
 	  /* After CCP we rewrite no longer addressed locals into SSA
@@ -155,7 +154,6 @@ along with GCC; see the file COPYING3.  If not see
       /* Initial scalar cleanups before alias computation.
 	 They ensure memory accesses are not indirect wherever possible.  */
       NEXT_PASS (pass_strip_predict_hints);
-      NEXT_PASS (pass_rename_ssa_copies);
       NEXT_PASS (pass_ccp);
       /* After CCP we rewrite no longer addressed locals into SSA
 	 form if possible.  */
@@ -183,7 +181,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_ch);
       NEXT_PASS (pass_lower_complex);
       NEXT_PASS (pass_sra);
-      NEXT_PASS (pass_rename_ssa_copies);
       /* The dom pass will also resolve all __builtin_constant_p calls
          that are still there to 0.  This has to be done after some
 	 propagations have already run, but before some more dead code
@@ -291,7 +288,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_fold_builtins);
       NEXT_PASS (pass_optimize_widening_mul);
       NEXT_PASS (pass_tail_calls);
-      NEXT_PASS (pass_rename_ssa_copies);
       /* FIXME: If DCE is not run before checking for uninitialized uses,
 	 we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
 	 However, this also causes us to misdiagnose cases that should be
@@ -326,7 +322,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_dce);
       NEXT_PASS (pass_asan);
       NEXT_PASS (pass_tsan);
-      NEXT_PASS (pass_rename_ssa_copies);
       /* ???  We do want some kind of loop invariant motion, but we possibly
          need to adjust LIM to be more friendly towards preserving accurate
 	 debug information here.  */
diff --git a/gcc/testsuite/gcc.dg/guality/pr54200.c b/gcc/testsuite/gcc.dg/guality/pr54200.c
index 9b17187..e1e7293 100644
--- a/gcc/testsuite/gcc.dg/guality/pr54200.c
+++ b/gcc/testsuite/gcc.dg/guality/pr54200.c
@@ -1,6 +1,6 @@
 /* PR tree-optimization/54200 */
 /* { dg-do run } */
-/* { dg-options "-g -fno-var-tracking-assignments" } */
+/* { dg-options "-g -fno-var-tracking-assignments -fno-tree-coalesce-vars" } */
 
 int o __attribute__((used));
 
diff --git a/gcc/testsuite/gcc.dg/ssp-1.c b/gcc/testsuite/gcc.dg/ssp-1.c
index 5467f4d..db69332 100644
--- a/gcc/testsuite/gcc.dg/ssp-1.c
+++ b/gcc/testsuite/gcc.dg/ssp-1.c
@@ -12,7 +12,7 @@ __stack_chk_fail (void)
 
 int main ()
 {
-  int i;
+  register int i;
   char foo[255];
 
   // smash stack
diff --git a/gcc/testsuite/gcc.dg/ssp-2.c b/gcc/testsuite/gcc.dg/ssp-2.c
index 9a7ac32..752fe53 100644
--- a/gcc/testsuite/gcc.dg/ssp-2.c
+++ b/gcc/testsuite/gcc.dg/ssp-2.c
@@ -14,7 +14,7 @@ __stack_chk_fail (void)
 void
 overflow()
 {
-  int i = 0;
+  register int i = 0;
   char foo[30];
 
   /* Overflow buffer.  */
diff --git a/gcc/testsuite/gcc.dg/torture/parm-coalesce.c b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
new file mode 100644
index 0000000..dbd81c1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+
+/* Make sure we don't coalesce both incoming parms, one of which has
+   an unused incoming value, to the same location, which would
+   overwrite one of them with the incoming value of the other.  */
+
+int __attribute__((noinline, noclone))
+foo (int i, int j)
+{
+  j = i; /* The incoming value for J is unused.  */
+  i = 2;
+  if (j)
+    j++;
+  j += i + 1;
+  return j;
+}
+
+/* Same as foo, but with swapped parameters.  */
+int __attribute__((noinline, noclone))
+bar (int j, int i)
+{
+  j = i; /* The incoming value for J is unused.  */
+  i = 2;
+  if (j)
+    j++;
+  j += i + 1;
+  return j;
+}
+
+int
+main (void)
+{
+  if (foo (0, 1) != 3)
+    abort ();
+  if (bar (1, 0) != 3)
+    abort ();
+  return 0;
+}
diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
index e23bc0b..59d91c6 100644
--- a/gcc/tree-outof-ssa.c
+++ b/gcc/tree-outof-ssa.c
@@ -305,7 +305,6 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
   rtx dest_rtx, seq, x;
   machine_mode dest_mode, src_mode;
   int unsignedp;
-  tree var;
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     {
@@ -327,12 +326,12 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
 
   start_sequence ();
 
-  var = SSA_NAME_VAR (partition_to_var (SA.map, dest));
+  tree name = partition_to_var (SA.map, dest);
   src_mode = TYPE_MODE (TREE_TYPE (src));
   dest_mode = GET_MODE (dest_rtx);
-  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var)));
+  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (name)));
   gcc_assert (!REG_P (dest_rtx)
-	      || dest_mode == promote_decl_mode (var, &unsignedp));
+	      || dest_mode == promote_ssa_mode (name, &unsignedp));
 
   if (src_mode != dest_mode)
     {
@@ -708,13 +707,12 @@ elim_backward (elim_graph g, int T)
 static rtx
 get_temp_reg (tree name)
 {
-  tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
-  tree type = TREE_TYPE (var);
+  tree type = TREE_TYPE (name);
   int unsignedp;
-  machine_mode reg_mode = promote_decl_mode (var, &unsignedp);
+  machine_mode reg_mode = promote_ssa_mode (name, &unsignedp);
   rtx x = gen_reg_rtx (reg_mode);
   if (POINTER_TYPE_P (type))
-    mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var))));
+    mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type)));
   return x;
 }
 
@@ -1014,7 +1012,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
 
   /* Return to viewing the variable list as just all reference variables after
      coalescing has been performed.  */
-  partition_view_normal (map, false);
+  partition_view_normal (map);
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     {
diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index b05a860..9ffa3f1 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -58,6 +58,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssanames.h"
 #include "tree-ssa-live.h"
 #include "tree-ssa-coalesce.h"
+#include "explow.h"
 #include "diagnostic-core.h"
 
 
@@ -830,6 +831,16 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
   basic_block bb;
   ssa_op_iter iter;
   live_track_p live;
+  basic_block entry;
+
+  /* If inter-variable coalescing is enabled, we may attempt to
+     coalesce variables from different base variables, including
+     different parameters, so we have to make sure default defs live
+     at the entry block conflict with each other.  */
+  if (flag_tree_coalesce_vars)
+    entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+  else
+    entry = NULL;
 
   map = live_var_map (liveinfo);
   graph = ssa_conflicts_new (num_var_partitions (map));
@@ -888,6 +899,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
 	    live_track_process_def (live, result, graph);
 	}
 
+      /* Pretend there are defs for params' default defs at the start
+	 of the (post-)entry block.  */
+      if (bb == entry)
+	{
+	  unsigned base;
+	  bitmap_iterator bi;
+	  EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi)
+	    {
+	      bitmap_iterator bi2;
+	      unsigned part;
+	      EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base],
+					0, part, bi2)
+		{
+		  tree var = partition_to_var (map, part);
+		  if (!SSA_NAME_VAR (var)
+		      || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL
+			  && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL)
+		      || !SSA_NAME_IS_DEFAULT_DEF (var))
+		    continue;
+		  live_track_process_def (live, var, graph);
+		}
+	    }
+	}
+
      live_track_clear_base_vars (live);
     }
 
@@ -1156,6 +1191,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
     {
       var1 = partition_to_var (map, p1);
       var2 = partition_to_var (map, p2);
+
       z = var_union (map, var1, var2);
       if (z == NO_PARTITION)
 	{
@@ -1173,6 +1209,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
 
       if (debug)
 	fprintf (debug, ": Success -> %d\n", z);
+
       return true;
     }
 
@@ -1270,6 +1307,330 @@ ssa_name_var_hash::equal (const tree_node *n1, const tree_node *n2)
 }
 
 
+/* Output partition map MAP with coalescing plan PART to file F.  */
+
+void
+dump_part_var_map (FILE *f, partition part, var_map map)
+{
+  int t;
+  unsigned x, y;
+  int p;
+
+  fprintf (f, "\nCoalescible Partition map \n\n");
+
+  for (x = 0; x < map->num_partitions; x++)
+    {
+      if (map->view_to_partition != NULL)
+	p = map->view_to_partition[x];
+      else
+	p = x;
+
+      if (ssa_name (p) == NULL_TREE
+	  || virtual_operand_p (ssa_name (p)))
+	continue;
+
+      t = 0;
+      for (y = 1; y < num_ssa_names; y++)
+	{
+	  tree var = version_to_var (map, y);
+	  if (!var)
+	    continue;
+	  int q = var_to_partition (map, var);
+	  p = partition_find (part, q);
+	  gcc_assert (map->partition_to_base_index[q]
+		      == map->partition_to_base_index[p]);
+
+	  if (p == (int) x)
+	    {
+	      if (t++ == 0)
+		{
+		  fprintf (f, "Partition %d, base %d (", x,
+			   map->partition_to_base_index[q]);
+		  print_generic_expr (f, partition_to_var (map, q), TDF_SLIM);
+		  fprintf (f, " - ");
+		}
+	      fprintf (f, "%d ", y);
+	    }
+	}
+      if (t != 0)
+	fprintf (f, ")\n");
+    }
+  fprintf (f, "\n");
+}
+
+/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
+   coalescing together, false otherwise.
+
+   This must stay consistent with var_map_base_init in tree-ssa-live.c.  */
+
+bool
+gimple_can_coalesce_p (tree name1, tree name2)
+{
+  /* First check the SSA_NAME's associated DECL.  Without
+     optimization, we only want to coalesce if they have the same DECL
+     or both have no associated DECL.  */
+  tree var1 = SSA_NAME_VAR (name1);
+  tree var2 = SSA_NAME_VAR (name2);
+  var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
+  var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
+  if (var1 != var2 && !flag_tree_coalesce_vars)
+    return false;
+
+  /* Now check the types.  If the types are the same, then we should
+     try to coalesce V1 and V2.  */
+  tree t1 = TREE_TYPE (name1);
+  tree t2 = TREE_TYPE (name2);
+  if (t1 == t2)
+    {
+    check_modes:
+      /* If the base variables are the same, we're good: none of the
+	 other tests below could possibly fail.  */
+      var1 = SSA_NAME_VAR (name1);
+      var2 = SSA_NAME_VAR (name2);
+      if (var1 == var2)
+	return true;
+
+      /* We don't want to coalesce two SSA names if one of the base
+	 variables is supposed to be a register while the other is
+	 supposed to be on the stack.  Anonymous SSA names take
+	 registers, but when not optimizing, user variables should go
+	 on the stack, so coalescing them with the anonymous variable
+	 as the partition leader would end up assigning the user
+	 variable to a register.  Don't do that!  */
+      bool reg1 = !var1 || use_register_for_decl (var1);
+      bool reg2 = !var2 || use_register_for_decl (var2);
+      if (reg1 != reg2)
+	return false;
+
+      /* Check that the promoted modes are the same.  We don't want to
+	 coalesce if the promoted modes would be different.  Only
+	 PARM_DECLs and RESULT_DECLs have different promotion rules,
+	 so skip the test if both are variables or anonymous
+	 SSA_NAMEs.  */
+      return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
+	|| promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL);
+    }
+
+  /* If the types are not the same, check for a canonical type match.  This
+     (for example) allows coalescing when the types are fundamentally the
+     same, but just have different names.
+
+     Note pointer types with different address spaces may have the same
+     canonical type.  Those are rejected for coalescing by the
+     types_compatible_p check.  */
+  if (TYPE_CANONICAL (t1)
+      && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
+      && types_compatible_p (t1, t2))
+    goto check_modes;
+
+  return false;
+}
+
+/* Fill in MAP's partition_to_base_index, with one index for each
+   partition of SSA names USED_IN_COPIES and related by CL coalesce
+   possibilities.  This must match gimple_can_coalesce_p in the
+   optimized case.  */
+
+static void
+compute_optimized_partition_bases (var_map map, bitmap used_in_copies,
+				   coalesce_list_p cl)
+{
+  int parts = num_var_partitions (map);
+  partition tentative = partition_new (parts);
+
+  /* Partition the SSA versions so that, for each coalescible
+     pair, both of its members are in the same partition in
+     TENTATIVE.  */
+  gcc_assert (!cl->sorted);
+  coalesce_pair_p node;
+  coalesce_iterator_type ppi;
+  FOR_EACH_PARTITION_PAIR (node, ppi, cl)
+    {
+      tree v1 = ssa_name (node->first_element);
+      int p1 = partition_find (tentative, var_to_partition (map, v1));
+      tree v2 = ssa_name (node->second_element);
+      int p2 = partition_find (tentative, var_to_partition (map, v2));
+
+      if (p1 == p2)
+	continue;
+
+      partition_union (tentative, p1, p2);
+    }
+
+  /* We have to deal with cost one pairs too.  */
+  for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next)
+    {
+      tree v1 = ssa_name (co->first_element);
+      int p1 = partition_find (tentative, var_to_partition (map, v1));
+      tree v2 = ssa_name (co->second_element);
+      int p2 = partition_find (tentative, var_to_partition (map, v2));
+
+      if (p1 == p2)
+	continue;
+
+      partition_union (tentative, p1, p2);
+    }
+
+  /* And also with abnormal edges.  */
+  basic_block bb;
+  edge e;
+  edge_iterator ei;
+  FOR_EACH_BB_FN (bb, cfun)
+    {
+      FOR_EACH_EDGE (e, ei, bb->preds)
+	if (e->flags & EDGE_ABNORMAL)
+	  {
+	    gphi_iterator gsi;
+	    for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
+		 gsi_next (&gsi))
+	      {
+		gphi *phi = gsi.phi ();
+		tree arg = PHI_ARG_DEF (phi, e->dest_idx);
+		if (SSA_NAME_IS_DEFAULT_DEF (arg)
+		    && (!SSA_NAME_VAR (arg)
+			|| TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL))
+		  continue;
+
+		tree res = PHI_RESULT (phi);
+
+		int p1 = partition_find (tentative, var_to_partition (map, res));
+		int p2 = partition_find (tentative, var_to_partition (map, arg));
+
+		if (p1 == p2)
+		  continue;
+
+		partition_union (tentative, p1, p2);
+	      }
+	  }
+    }
+
+  map->partition_to_base_index = XCNEWVEC (int, parts);
+  auto_vec<unsigned int> index_map (parts);
+  if (parts)
+    index_map.quick_grow (parts);
+
+  const unsigned no_part = -1;
+  unsigned count = parts;
+  while (count)
+    index_map[--count] = no_part;
+
+  /* Initialize MAP's mapping from partition to base index, using
+     as base indices an enumeration of the TENTATIVE partitions in
+     which each SSA version ended up, so that we compute conflicts
+     between all SSA versions that ended up in the same potential
+     coalesce partition.  */
+  bitmap_iterator bi;
+  unsigned i;
+  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
+    {
+      int pidx = var_to_partition (map, ssa_name (i));
+      int base = partition_find (tentative, pidx);
+      if (index_map[base] != no_part)
+	continue;
+      index_map[base] = count++;
+    }
+
+  map->num_basevars = count;
+
+  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
+    {
+      int pidx = var_to_partition (map, ssa_name (i));
+      int base = partition_find (tentative, pidx);
+      gcc_assert (index_map[base] < count);
+      map->partition_to_base_index[pidx] = index_map[base];
+    }
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    dump_part_var_map (dump_file, tentative, map);
+
+  partition_delete (tentative);
+}
+
+/* Hashtable helpers.  */
+
+struct tree_int_map_hasher : typed_noop_remove <tree_int_map>
+{
+  typedef tree_int_map *value_type;
+  typedef tree_int_map *compare_type;
+  static inline hashval_t hash (const tree_int_map *);
+  static inline bool equal (const tree_int_map *, const tree_int_map *);
+};
+
+inline hashval_t
+tree_int_map_hasher::hash (const tree_int_map *v)
+{
+  return tree_map_base_hash (v);
+}
+
+inline bool
+tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
+{
+  return tree_int_map_eq (v, c);
+}
+
+/* This routine will initialize the basevar fields of MAP with base
+   names.  Partitions will share the same base if they have the same
+   SSA_NAME_VAR, or, being anonymous variables, the same type.  This
+   must match gimple_can_coalesce_p in the non-optimized case.  */
+
+static void
+compute_samebase_partition_bases (var_map map)
+{
+  int x, num_part;
+  tree var;
+  struct tree_int_map *m, *mapstorage;
+
+  num_part = num_var_partitions (map);
+  hash_table<tree_int_map_hasher> tree_to_index (num_part);
+  /* We can have at most num_part entries in the hash tables, so it's
+     enough to allocate so many map elements once, saving some malloc
+     calls.  */
+  mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
+
+  /* If a base table already exists, clear it, otherwise create it.  */
+  free (map->partition_to_base_index);
+  map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
+
+  /* Build the base variable list, and point partitions at their bases.  */
+  for (x = 0; x < num_part; x++)
+    {
+      struct tree_int_map **slot;
+      unsigned baseindex;
+      var = partition_to_var (map, x);
+      if (SSA_NAME_VAR (var)
+	  && (!VAR_P (SSA_NAME_VAR (var))
+	      || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
+	m->base.from = SSA_NAME_VAR (var);
+      else
+	/* This restricts what anonymous SSA names we can coalesce
+	   as it restricts the sets we compute conflicts for.
+	   Using TREE_TYPE to generate sets is the easiest as
+	   type equivalency also holds for SSA names with the same
+	   underlying decl.
+
+	   Check gimple_can_coalesce_p when changing this code.  */
+	m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
+			? TYPE_CANONICAL (TREE_TYPE (var))
+			: TREE_TYPE (var));
+      /* If base variable hasn't been seen, set it up.  */
+      slot = tree_to_index.find_slot (m, INSERT);
+      if (!*slot)
+	{
+	  baseindex = m - mapstorage;
+	  m->to = baseindex;
+	  *slot = m;
+	  m++;
+	}
+      else
+	baseindex = (*slot)->to;
+      map->partition_to_base_index[x] = baseindex;
+    }
+
+  map->num_basevars = m - mapstorage;
+
+  free (mapstorage);
+}
+
 /* Reduce the number of copies by coalescing variables in the function.  Return
    a partition map with the resulting coalesces.  */
 
@@ -1286,9 +1647,10 @@ coalesce_ssa_name (void)
   cl = create_coalesce_list ();
   map = create_outofssa_var_map (cl, used_in_copies);
 
-  /* If optimization is disabled, we need to coalesce all the names originating
-     from the same SSA_NAME_VAR so debug info remains undisturbed.  */
-  if (!optimize)
+  /* If this optimization is disabled, we need to coalesce all the
+     names originating from the same SSA_NAME_VAR so debug info
+     remains undisturbed.  */
+  if (!flag_tree_coalesce_vars)
     {
       hash_table<ssa_name_var_hash> ssa_name_hash (10);
 
@@ -1329,8 +1691,13 @@ coalesce_ssa_name (void)
   if (dump_file && (dump_flags & TDF_DETAILS))
     dump_var_map (dump_file, map);
 
-  /* Don't calculate live ranges for variables not in the coalesce list.  */
-  partition_view_bitmap (map, used_in_copies, true);
+  partition_view_bitmap (map, used_in_copies);
+
+  if (flag_tree_coalesce_vars)
+    compute_optimized_partition_bases (map, used_in_copies, cl);
+  else
+    compute_samebase_partition_bases (map);
+
   BITMAP_FREE (used_in_copies);
 
   if (num_var_partitions (map) < 1)
@@ -1369,8 +1736,7 @@ coalesce_ssa_name (void)
 
   /* Now coalesce everything in the list.  */
   coalesce_partitions (map, graph, cl,
-		       ((dump_flags & TDF_DETAILS) ? dump_file
-						   : NULL));
+		       ((dump_flags & TDF_DETAILS) ? dump_file : NULL));
 
   delete_coalesce_list (cl);
   ssa_conflicts_delete (graph);
diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h
index 99b188a..ae289b4 100644
--- a/gcc/tree-ssa-coalesce.h
+++ b/gcc/tree-ssa-coalesce.h
@@ -21,5 +21,6 @@ along with GCC; see the file COPYING3.  If not see
 #define GCC_TREE_SSA_COALESCE_H
 
 extern var_map coalesce_ssa_name (void);
+extern bool gimple_can_coalesce_p (tree, tree);
 
 #endif /* GCC_TREE_SSA_COALESCE_H */
diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c
deleted file mode 100644
index f3cb56e..0000000
--- a/gcc/tree-ssa-copyrename.c
+++ /dev/null
@@ -1,499 +0,0 @@
-/* Rename SSA copies.
-   Copyright (C) 2004-2015 Free Software Foundation, Inc.
-   Contributed by Andrew MacLeod <amacleod@redhat.com>
-
-This file is part of GCC.
-
-GCC is free software; you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation; either version 3, or (at your option)
-any later version.
-
-GCC is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-GNU General Public License for more details.
-
-You should have received a copy of the GNU General Public License
-along with GCC; see the file COPYING3.  If not see
-<http://www.gnu.org/licenses/>.  */
-
-#include "config.h"
-#include "system.h"
-#include "coretypes.h"
-#include "tm.h"
-#include "hash-set.h"
-#include "machmode.h"
-#include "vec.h"
-#include "double-int.h"
-#include "input.h"
-#include "alias.h"
-#include "symtab.h"
-#include "wide-int.h"
-#include "inchash.h"
-#include "tree.h"
-#include "fold-const.h"
-#include "predict.h"
-#include "hard-reg-set.h"
-#include "function.h"
-#include "dominance.h"
-#include "cfg.h"
-#include "basic-block.h"
-#include "tree-ssa-alias.h"
-#include "internal-fn.h"
-#include "gimple-expr.h"
-#include "is-a.h"
-#include "gimple.h"
-#include "gimple-iterator.h"
-#include "flags.h"
-#include "tree-pretty-print.h"
-#include "bitmap.h"
-#include "gimple-ssa.h"
-#include "stringpool.h"
-#include "tree-ssanames.h"
-#include "hashtab.h"
-#include "rtl.h"
-#include "statistics.h"
-#include "real.h"
-#include "fixed-value.h"
-#include "insn-config.h"
-#include "expmed.h"
-#include "dojump.h"
-#include "explow.h"
-#include "calls.h"
-#include "emit-rtl.h"
-#include "varasm.h"
-#include "stmt.h"
-#include "expr.h"
-#include "tree-dfa.h"
-#include "tree-inline.h"
-#include "tree-ssa-live.h"
-#include "tree-pass.h"
-#include "langhooks.h"
-
-static struct
-{
-  /* Number of copies coalesced.  */
-  int coalesced;
-} stats;
-
-/* The following routines implement the SSA copy renaming phase.
-
-   This optimization looks for copies between 2 SSA_NAMES, either through a
-   direct copy, or an implicit one via a PHI node result and its arguments.
-
-   Each copy is examined to determine if it is possible to rename the base
-   variable of one of the operands to the same variable as the other operand.
-   i.e.
-   T.3_5 = <blah>
-   a_1 = T.3_5
-
-   If this copy couldn't be copy propagated, it could possibly remain in the
-   program throughout the optimization phases.   After SSA->normal, it would
-   become:
-
-   T.3 = <blah>
-   a = T.3
-
-   Since T.3_5 is distinct from all other SSA versions of T.3, there is no
-   fundamental reason why the base variable needs to be T.3, subject to
-   certain restrictions.  This optimization attempts to determine if we can
-   change the base variable on copies like this, and result in code such as:
-
-   a_5 = <blah>
-   a_1 = a_5
-
-   This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is
-   possible, the copy goes away completely. If it isn't possible, a new temp
-   will be created for a_5, and you will end up with the exact same code:
-
-   a.8 = <blah>
-   a = a.8
-
-   The other benefit of performing this optimization relates to what variables
-   are chosen in copies.  Gimplification of the program uses temporaries for
-   a lot of things. expressions like
-
-   a_1 = <blah>
-   <blah2> = a_1
-
-   get turned into
-
-   T.3_5 = <blah>
-   a_1 = T.3_5
-   <blah2> = a_1
-
-   Copy propagation is done in a forward direction, and if we can propagate
-   through the copy, we end up with:
-
-   T.3_5 = <blah>
-   <blah2> = T.3_5
-
-   The copy is gone, but so is all reference to the user variable 'a'. By
-   performing this optimization, we would see the sequence:
-
-   a_5 = <blah>
-   a_1 = a_5
-   <blah2> = a_1
-
-   which copy propagation would then turn into:
-
-   a_5 = <blah>
-   <blah2> = a_5
-
-   and so we still retain the user variable whenever possible.  */
-
-
-/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
-   Choose a representative for the partition, and send debug info to DEBUG.  */
-
-static void
-copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
-{
-  int p1, p2, p3;
-  tree root1, root2;
-  tree rep1, rep2;
-  bool ign1, ign2, abnorm;
-
-  gcc_assert (TREE_CODE (var1) == SSA_NAME);
-  gcc_assert (TREE_CODE (var2) == SSA_NAME);
-
-  register_ssa_partition (map, var1);
-  register_ssa_partition (map, var2);
-
-  p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
-  p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
-
-  if (debug)
-    {
-      fprintf (debug, "Try : ");
-      print_generic_expr (debug, var1, TDF_SLIM);
-      fprintf (debug, "(P%d) & ", p1);
-      print_generic_expr (debug, var2, TDF_SLIM);
-      fprintf (debug, "(P%d)", p2);
-    }
-
-  gcc_assert (p1 != NO_PARTITION);
-  gcc_assert (p2 != NO_PARTITION);
-
-  if (p1 == p2)
-    {
-      if (debug)
-	fprintf (debug, " : Already coalesced.\n");
-      return;
-    }
-
-  rep1 = partition_to_var (map, p1);
-  rep2 = partition_to_var (map, p2);
-  root1 = SSA_NAME_VAR (rep1);
-  root2 = SSA_NAME_VAR (rep2);
-  if (!root1 && !root2)
-    return;
-
-  /* Don't coalesce if one of the variables occurs in an abnormal PHI.  */
-  abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1)
-	    || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2));
-  if (abnorm)
-    {
-      if (debug)
-	fprintf (debug, " : Abnormal PHI barrier.  No coalesce.\n");
-      return;
-    }
-
-  /* Partitions already have the same root, simply merge them.  */
-  if (root1 == root2)
-    {
-      p1 = partition_union (map->var_partition, p1, p2);
-      if (debug)
-	fprintf (debug, " : Same root, coalesced --> P%d.\n", p1);
-      return;
-    }
-
-  /* Never attempt to coalesce 2 different parameters.  */
-  if ((root1 && TREE_CODE (root1) == PARM_DECL)
-      && (root2 && TREE_CODE (root2) == PARM_DECL))
-    {
-      if (debug)
-        fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n");
-      return;
-    }
-
-  if ((root1 && TREE_CODE (root1) == RESULT_DECL)
-      != (root2 && TREE_CODE (root2) == RESULT_DECL))
-    {
-      if (debug)
-        fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n");
-      return;
-    }
-
-  ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1));
-  ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2));
-
-  /* Refrain from coalescing user variables, if requested.  */
-  if (!ign1 && !ign2)
-    {
-      if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2))
-	ign2 = true;
-      else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1))
-	ign1 = true;
-      else if (flag_ssa_coalesce_vars != 2)
-	{
-	  if (debug)
-	    fprintf (debug, " : 2 different USER vars. No coalesce.\n");
-	  return;
-	}
-      else
-	ign2 = true;
-    }
-
-  /* If both values have default defs, we can't coalesce.  If only one has a
-     tag, make sure that variable is the new root partition.  */
-  if (root1 && ssa_default_def (cfun, root1))
-    {
-      if (root2 && ssa_default_def (cfun, root2))
-	{
-	  if (debug)
-	    fprintf (debug, " : 2 default defs. No coalesce.\n");
-	  return;
-	}
-      else
-        {
-	  ign2 = true;
-	  ign1 = false;
-	}
-    }
-  else if (root2 && ssa_default_def (cfun, root2))
-    {
-      ign1 = true;
-      ign2 = false;
-    }
-
-  /* Do not coalesce if we cannot assign a symbol to the partition.  */
-  if (!(!ign2 && root2)
-      && !(!ign1 && root1))
-    {
-      if (debug)
-	fprintf (debug, " : Choosen variable has no root.  No coalesce.\n");
-      return;
-    }
-
-  /* Don't coalesce if the new chosen root variable would be read-only.
-     If both ign1 && ign2, then the root var of the larger partition
-     wins, so reject in that case if any of the root vars is TREE_READONLY.
-     Otherwise reject only if the root var, on which replace_ssa_name_symbol
-     will be called below, is readonly.  */
-  if (((root1 && TREE_READONLY (root1)) && ign2)
-      || ((root2 && TREE_READONLY (root2)) && ign1))
-    {
-      if (debug)
-	fprintf (debug, " : Readonly variable.  No coalesce.\n");
-      return;
-    }
-
-  /* Don't coalesce if the two variables aren't type compatible .  */
-  if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2))
-      /* There is a disconnect between the middle-end type-system and
-         VRP, avoid coalescing enum types with different bounds.  */
-      || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE
-	   || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE)
-	  && TREE_TYPE (var1) != TREE_TYPE (var2)))
-    {
-      if (debug)
-	fprintf (debug, " : Incompatible types.  No coalesce.\n");
-      return;
-    }
-
-  /* Merge the two partitions.  */
-  p3 = partition_union (map->var_partition, p1, p2);
-
-  /* Set the root variable of the partition to the better choice, if there is
-     one.  */
-  if (!ign2 && root2)
-    replace_ssa_name_symbol (partition_to_var (map, p3), root2);
-  else if (!ign1 && root1)
-    replace_ssa_name_symbol (partition_to_var (map, p3), root1);
-  else
-    gcc_unreachable ();
-
-  if (debug)
-    {
-      fprintf (debug, " --> P%d ", p3);
-      print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)),
-			  TDF_SLIM);
-      fprintf (debug, "\n");
-    }
-}
-
-
-namespace {
-
-const pass_data pass_data_rename_ssa_copies =
-{
-  GIMPLE_PASS, /* type */
-  "copyrename", /* name */
-  OPTGROUP_NONE, /* optinfo_flags */
-  TV_TREE_COPY_RENAME, /* tv_id */
-  ( PROP_cfg | PROP_ssa ), /* properties_required */
-  0, /* properties_provided */
-  0, /* properties_destroyed */
-  0, /* todo_flags_start */
-  0, /* todo_flags_finish */
-};
-
-class pass_rename_ssa_copies : public gimple_opt_pass
-{
-public:
-  pass_rename_ssa_copies (gcc::context *ctxt)
-    : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt)
-  {}
-
-  /* opt_pass methods: */
-  opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); }
-  virtual bool gate (function *) { return flag_tree_copyrename != 0; }
-  virtual unsigned int execute (function *);
-
-}; // class pass_rename_ssa_copies
-
-/* This function will make a pass through the IL, and attempt to coalesce any
-   SSA versions which occur in PHI's or copies.  Coalescing is accomplished by
-   changing the underlying root variable of all coalesced version.  This will
-   then cause the SSA->normal pass to attempt to coalesce them all to the same
-   variable.  */
-
-unsigned int
-pass_rename_ssa_copies::execute (function *fun)
-{
-  var_map map;
-  basic_block bb;
-  tree var, part_var;
-  gimple stmt;
-  unsigned x;
-  FILE *debug;
-
-  memset (&stats, 0, sizeof (stats));
-
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    debug = dump_file;
-  else
-    debug = NULL;
-
-  map = init_var_map (num_ssa_names);
-
-  FOR_EACH_BB_FN (bb, fun)
-    {
-      /* Scan for real copies.  */
-      for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
-	   gsi_next (&gsi))
-	{
-	  stmt = gsi_stmt (gsi);
-	  if (gimple_assign_ssa_name_copy_p (stmt))
-	    {
-	      tree lhs = gimple_assign_lhs (stmt);
-	      tree rhs = gimple_assign_rhs1 (stmt);
-
-	      copy_rename_partition_coalesce (map, lhs, rhs, debug);
-	    }
-	}
-    }
-
-  FOR_EACH_BB_FN (bb, fun)
-    {
-      /* Treat PHI nodes as copies between the result and each argument.  */
-      for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
-	   gsi_next (&gsi))
-        {
-          size_t i;
-	  tree res;
-	  gphi *phi = gsi.phi ();
-	  res = gimple_phi_result (phi);
-
-	  /* Do not process virtual SSA_NAMES.  */
-	  if (virtual_operand_p (res))
-	    continue;
-
-	  /* Make sure to only use the same partition for an argument
-	     as the result but never the other way around.  */
-	  if (SSA_NAME_VAR (res)
-	      && !DECL_IGNORED_P (SSA_NAME_VAR (res)))
-	    for (i = 0; i < gimple_phi_num_args (phi); i++)
-	      {
-		tree arg = PHI_ARG_DEF (phi, i);
-		if (TREE_CODE (arg) == SSA_NAME)
-		  copy_rename_partition_coalesce (map, res, arg,
-						  debug);
-	      }
-	  /* Else if all arguments are in the same partition try to merge
-	     it with the result.  */
-	  else
-	    {
-	      int all_p_same = -1;
-	      int p = -1;
-	      for (i = 0; i < gimple_phi_num_args (phi); i++)
-		{
-		  tree arg = PHI_ARG_DEF (phi, i);
-		  if (TREE_CODE (arg) != SSA_NAME)
-		    {
-		      all_p_same = 0;
-		      break;
-		    }
-		  else if (all_p_same == -1)
-		    {
-		      p = partition_find (map->var_partition,
-					  SSA_NAME_VERSION (arg));
-		      all_p_same = 1;
-		    }
-		  else if (all_p_same == 1
-			   && p != partition_find (map->var_partition,
-						   SSA_NAME_VERSION (arg)))
-		    {
-		      all_p_same = 0;
-		      break;
-		    }
-		}
-	      if (all_p_same == 1)
-		copy_rename_partition_coalesce (map, res,
-						PHI_ARG_DEF (phi, 0),
-						debug);
-	    }
-        }
-    }
-
-  if (debug)
-    dump_var_map (debug, map);
-
-  /* Now one more pass to make all elements of a partition share the same
-     root variable.  */
-
-  for (x = 1; x < num_ssa_names; x++)
-    {
-      part_var = partition_to_var (map, x);
-      if (!part_var)
-        continue;
-      var = ssa_name (x);
-      if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var))
-	continue;
-      if (debug)
-        {
-	  fprintf (debug, "Coalesced ");
-	  print_generic_expr (debug, var, TDF_SLIM);
-	  fprintf (debug, " to ");
-	  print_generic_expr (debug, part_var, TDF_SLIM);
-	  fprintf (debug, "\n");
-	}
-      stats.coalesced++;
-      replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var));
-    }
-
-  statistics_counter_event (fun, "copies coalesced",
-			    stats.coalesced);
-  delete_var_map (map);
-  return 0;
-}
-
-} // anon namespace
-
-gimple_opt_pass *
-make_pass_rename_ssa_copies (gcc::context *ctxt)
-{
-  return new pass_rename_ssa_copies (ctxt);
-}
diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
index 2c7c072..821b2f4 100644
--- a/gcc/tree-ssa-live.c
+++ b/gcc/tree-ssa-live.c
@@ -100,90 +100,6 @@ static void  verify_live_on_entry (tree_live_info_p);
    ssa_name or variable, and vice versa.  */
 
 
-/* Hashtable helpers.  */
-
-struct tree_int_map_hasher : typed_noop_remove <tree_int_map>
-{
-  typedef tree_int_map *value_type;
-  typedef tree_int_map *compare_type;
-  static inline hashval_t hash (const tree_int_map *);
-  static inline bool equal (const tree_int_map *, const tree_int_map *);
-};
-
-inline hashval_t
-tree_int_map_hasher::hash (const tree_int_map *v)
-{
-  return tree_map_base_hash (v);
-}
-
-inline bool
-tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
-{
-  return tree_int_map_eq (v, c);
-}
-
-
-/* This routine will initialize the basevar fields of MAP.  */
-
-static void
-var_map_base_init (var_map map)
-{
-  int x, num_part;
-  tree var;
-  struct tree_int_map *m, *mapstorage;
-
-  num_part = num_var_partitions (map);
-  hash_table<tree_int_map_hasher> tree_to_index (num_part);
-  /* We can have at most num_part entries in the hash tables, so it's
-     enough to allocate so many map elements once, saving some malloc
-     calls.  */
-  mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
-
-  /* If a base table already exists, clear it, otherwise create it.  */
-  free (map->partition_to_base_index);
-  map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
-
-  /* Build the base variable list, and point partitions at their bases.  */
-  for (x = 0; x < num_part; x++)
-    {
-      struct tree_int_map **slot;
-      unsigned baseindex;
-      var = partition_to_var (map, x);
-      if (SSA_NAME_VAR (var)
-	  && (!VAR_P (SSA_NAME_VAR (var))
-	      || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
-	m->base.from = SSA_NAME_VAR (var);
-      else
-	/* This restricts what anonymous SSA names we can coalesce
-	   as it restricts the sets we compute conflicts for.
-	   Using TREE_TYPE to generate sets is the easies as
-	   type equivalency also holds for SSA names with the same
-	   underlying decl. 
-
-	   Check gimple_can_coalesce_p when changing this code.  */
-	m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
-			? TYPE_CANONICAL (TREE_TYPE (var))
-			: TREE_TYPE (var));
-      /* If base variable hasn't been seen, set it up.  */
-      slot = tree_to_index.find_slot (m, INSERT);
-      if (!*slot)
-	{
-	  baseindex = m - mapstorage;
-	  m->to = baseindex;
-	  *slot = m;
-	  m++;
-	}
-      else
-	baseindex = (*slot)->to;
-      map->partition_to_base_index[x] = baseindex;
-    }
-
-  map->num_basevars = m - mapstorage;
-
-  free (mapstorage);
-}
-
-
 /* Remove the base table in MAP.  */
 
 static void
@@ -361,21 +277,17 @@ partition_view_fini (var_map map, bitmap selected)
 }
 
 
-/* Create a partition view which includes all the used partitions in MAP.  If
-   WANT_BASES is true, create the base variable map as well.  */
+/* Create a partition view which includes all the used partitions in MAP.  */
 
 void
-partition_view_normal (var_map map, bool want_bases)
+partition_view_normal (var_map map)
 {
   bitmap used;
 
   used = partition_view_init (map);
   partition_view_fini (map, used);
 
-  if (want_bases)
-    var_map_base_init (map);
-  else
-    var_map_base_fini (map);
+  var_map_base_fini (map);
 }
 
 
@@ -384,7 +296,7 @@ partition_view_normal (var_map map, bool want_bases)
    as well.  */
 
 void
-partition_view_bitmap (var_map map, bitmap only, bool want_bases)
+partition_view_bitmap (var_map map, bitmap only)
 {
   bitmap used;
   bitmap new_partitions = BITMAP_ALLOC (NULL);
@@ -400,10 +312,7 @@ partition_view_bitmap (var_map map, bitmap only, bool want_bases)
     }
   partition_view_fini (map, new_partitions);
 
-  if (want_bases)
-    var_map_base_init (map);
-  else
-    var_map_base_fini (map);
+  var_map_base_fini (map);
 }
 
 
diff --git a/gcc/tree-ssa-live.h b/gcc/tree-ssa-live.h
index d5d7820..1f88358 100644
--- a/gcc/tree-ssa-live.h
+++ b/gcc/tree-ssa-live.h
@@ -71,8 +71,8 @@ typedef struct _var_map
 extern var_map init_var_map (int);
 extern void delete_var_map (var_map);
 extern int var_union (var_map, tree, tree);
-extern void partition_view_normal (var_map, bool);
-extern void partition_view_bitmap (var_map, bitmap, bool);
+extern void partition_view_normal (var_map);
+extern void partition_view_bitmap (var_map, bitmap);
 extern void dump_scope_blocks (FILE *, int);
 extern void debug_scope_block (tree, int);
 extern void debug_scope_blocks (int);
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 3f6bebe..7bef8cf 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -3964,7 +3964,11 @@ loop_exits_before_overflow (tree base, tree step,
 	if (!CONVERT_EXPR_P (e) || !operand_equal_p (e, unsigned_base, 0))
 	  continue;
 	e = TREE_OPERAND (e, 0);
-	gcc_assert (operand_equal_p (e, base, 0));
+	/* If E has an unsigned type, the operand equality test below
+	   would fail, but the equality test above would have already
+	   verified the equality, so we can proceed with it.  */
+	gcc_assert (TYPE_UNSIGNED (TREE_TYPE (e))
+		    || operand_equal_p (e, base, 0));
 	if (tree_int_cst_sign_bit (step))
 	  {
 	    code = LT_EXPR;
diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c
index f75a7f1..0982305 100644
--- a/gcc/tree-ssa-uncprop.c
+++ b/gcc/tree-ssa-uncprop.c
@@ -59,6 +59,11 @@ along with GCC; see the file COPYING3.  If not see
 #include "domwalk.h"
 #include "tree-pass.h"
 #include "tree-ssa-propagate.h"
+#include "bitmap.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-ssa-live.h"
+#include "tree-ssa-coalesce.h"
 
 /* The basic structure describing an equivalency created by traversing
    an edge.  Traversing the edge effectively means that we can assume
diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
index 0b24007..acdcd46 100644
--- a/gcc/var-tracking.c
+++ b/gcc/var-tracking.c
@@ -4931,12 +4931,16 @@ dataflow_set_remove_mem_locs (variable_def **slot, dataflow_set *set)
    registers, as well as associations between MEMs and VALUEs.  */
 
 static void
-dataflow_set_clear_at_call (dataflow_set *set)
+dataflow_set_clear_at_call (dataflow_set *set, rtx_insn *call_insn)
 {
   unsigned int r;
   hard_reg_set_iterator hrsi;
+  HARD_REG_SET invalidated_regs;
 
-  EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, r, hrsi)
+  get_call_reg_set_usage (call_insn, &invalidated_regs,
+			  regs_invalidated_by_call);
+
+  EXECUTE_IF_SET_IN_HARD_REG_SET (invalidated_regs, 0, r, hrsi)
     var_regno_delete (set, r);
 
   if (MAY_HAVE_DEBUG_INSNS)
@@ -6720,7 +6724,7 @@ compute_bb_dataflow (basic_block bb)
       switch (mo->type)
 	{
 	  case MO_CALL:
-	    dataflow_set_clear_at_call (out);
+	    dataflow_set_clear_at_call (out, insn);
 	    break;
 
 	  case MO_USE:
@@ -9182,7 +9186,7 @@ emit_notes_in_bb (basic_block bb, dataflow_set *set)
       switch (mo->type)
 	{
 	  case MO_CALL:
-	    dataflow_set_clear_at_call (set);
+	    dataflow_set_clear_at_call (set, insn);
 	    emit_notes_for_changes (insn, EMIT_NOTE_AFTER_CALL_INSN, set->vars);
 	    {
 	      rtx arguments = mo->u.loc, *p = &arguments;


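For anyone skimming the removal above: the core of
copy_rename_partition_coalesce was a union-find merge guarded by a
list of vetoes.  Below is a minimal standalone sketch of that shape.
It is not GCC code: struct toy_map, toy_find and toy_coalesce are
made-up names, only the "two different parameters" veto is modelled,
and keeping the parameter as the representative is a simplification
of the default-def rule in the original.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_NAMES 16

/* Hypothetical, much-reduced model of a var_map: a union-find forest
   over SSA version numbers plus one flag per name.  */
struct toy_map
{
  int parent[TOY_MAX_NAMES];
  bool is_parm[TOY_MAX_NAMES];	/* Root decl is a PARM_DECL.  */
};

static void
toy_map_init (struct toy_map *map)
{
  for (int i = 0; i < TOY_MAX_NAMES; i++)
    {
      map->parent[i] = i;
      map->is_parm[i] = false;
    }
}

/* Return the representative of X's partition, halving the path on
   the way up.  */
static int
toy_find (struct toy_map *map, int x)
{
  assert (x >= 0 && x < TOY_MAX_NAMES);
  while (map->parent[x] != x)
    {
      map->parent[x] = map->parent[map->parent[x]];
      x = map->parent[x];
    }
  return x;
}

/* Try to coalesce the partitions of A and B.  Return true if they end
   up in the same partition, false if the merge was vetoed.  */
static bool
toy_coalesce (struct toy_map *map, int a, int b)
{
  int p1 = toy_find (map, a);
  int p2 = toy_find (map, b);

  if (p1 == p2)
    return true;		/* Already coalesced.  */
  if (map->is_parm[p1] && map->is_parm[p2])
    return false;		/* Two different parameters: refuse.  */

  /* Keep a parameter, if there is one, as the representative, so the
     flag on the representative stays accurate for the whole
     partition.  */
  if (map->is_parm[p2])
    map->parent[p1] = p2;
  else
    map->parent[p2] = p1;
  return true;
}
```

Marking names 1 and 2 as parameters and coalescing 3 with 4, then 4
with 1, leaves 1 as the representative of all three; a later attempt
to coalesce 3 with 2 is then refused.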

And here's the incremental patch:

---
 gcc/alias.c               |   17 +++++++------
 gcc/cfgexpand.c           |   57 +++++++++++++++++----------------------------
 gcc/emit-rtl.c            |    2 --
 gcc/explow.c              |    3 --
 gcc/expr.c                |   16 +++++--------
 gcc/function.c            |   15 ++++++++++++
 gcc/gimple-expr.h         |    4 ---
 gcc/tree-outof-ssa.c      |    7 ++----
 gcc/tree-ssa-coalesce.h   |    1 +
 gcc/tree-ssa-loop-niter.c |    6 ++++-
 gcc/tree-ssa-uncprop.c    |    5 ++++
 11 files changed, 64 insertions(+), 69 deletions(-)

diff --git a/gcc/alias.c b/gcc/alias.c
index 7a74e81..5a031d9 100644
--- a/gcc/alias.c
+++ b/gcc/alias.c
@@ -2553,14 +2553,15 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
     return 0;
 
   /* If we refer to different gimple registers, or one gimple register
-     and one non-gimple-register, we know they can't overlap.  Now,
-     there could be more than one stack slot for (different versions
-     of) the same gimple register, but we can presumably tell they
-     don't overlap based on offsets from stack base addresses
-     elsewhere.  It's important that we don't proceed to DECL_RTL,
-     because gimple registers may not pass DECL_RTL_SET_P, and
-     make_decl_rtl won't be able to do anything about them since no
-     SSA information will have remained to guide it.  */
+     and one non-gimple-register, we know they can't overlap.  First,
+     gimple registers don't have their addresses taken.  Now, there
+     could be more than one stack slot for (different versions of) the
+     same gimple register, but we can presumably tell they don't
+     overlap based on offsets from stack base addresses elsewhere.
+     It's important that we don't proceed to DECL_RTL, because gimple
+     registers may not pass DECL_RTL_SET_P, and make_decl_rtl won't be
+     able to do anything about them since no SSA information will have
+     remained to guide it.  */
   if (is_gimple_reg (exprx) || is_gimple_reg (expry))
     return exprx != expry;
 
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 3e80b4a..bf972fc 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -179,11 +179,10 @@ gimple_assign_rhs_to_tree (gimple stmt)
 
 #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
 
-/* NEXT is a DECL to be associated with some RTL, CUR is a DECL or a
-   TREE_LIST of DECLs.  If NEXT is covered by CUR, return CUR
-   unchanged.  Otherwise, return a list with all entries of CUR, with
-   NEXT at the end.  If CUR was a list, it will be modified in
-   place.  */
+/* Choose either CUR or NEXT as the leader DECL for a partition.
+   Prefer ignored decls, to simplify debug dumps and reduce ambiguity
+   out of the same user variable being in multiple partitions (this is
+   less likely for compiler-introduced temps).  */
 
 static tree
 leader_merge (tree cur, tree next)
@@ -191,26 +190,11 @@ leader_merge (tree cur, tree next)
   if (cur == NULL || cur == next)
     return next;
 
-  tree list;
+  if (DECL_P (cur) && DECL_IGNORED_P (cur))
+    return cur;
 
-  if (TREE_CODE (cur) == TREE_LIST)
-    {
-      /* Look for NEXT in the list.  Stop at the last node to insert
-	 there.  */
-      for (list = cur; ; list = TREE_CHAIN (list))
-	{
-	  if (TREE_VALUE (list) == next)
-	    return cur;
-	  if (!TREE_CHAIN (list))
-	    break;
-	}
-    }
-  else
-    /* Create the first node.  */
-    list = build_tree_list (NULL, cur);
-
-  next = build_tree_list (NULL, next);
-  TREE_CHAIN (list) = next;
+  if (DECL_P (next) && DECL_IGNORED_P (next))
+    return next;
 
   return cur;
 }
@@ -285,9 +269,9 @@ set_rtl (tree t, rtx x)
       if (cur != next)
 	{
 	  if (MEM_P (x))
-	    set_mem_attributes (x, SSAVAR (t), true);
+	    set_mem_attributes (x, next, true);
 	  else
-	    set_reg_attrs_for_decl_rtl (SSAVAR (t), x);
+	    set_reg_attrs_for_decl_rtl (next, x);
 	}
     }
 
@@ -1025,9 +1009,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
   gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
 
   x = plus_constant (Pmode, base, offset);
-  x = gen_rtx_MEM ((TREE_CODE (decl) != SSA_NAME || SSA_NAME_VAR (decl))
-		   ? DECL_MODE (SSAVAR (decl))
-		   : TYPE_MODE (TREE_TYPE (decl)), x);
+  x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
+		   ? TYPE_MODE (TREE_TYPE (decl))
+		   : DECL_MODE (SSAVAR (decl)), x);
 
   if (TREE_CODE (decl) != SSA_NAME)
     {
@@ -1268,17 +1252,17 @@ expand_one_stack_var_1 (tree var)
   HOST_WIDE_INT size, offset;
   unsigned byte_align;
 
-  if (TREE_CODE (var) != SSA_NAME || SSA_NAME_VAR (var))
-    {
-      size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
-      byte_align = align_local_variable (SSAVAR (var));
-    }
-  else
+  if (TREE_CODE (var) == SSA_NAME)
     {
       tree type = TREE_TYPE (var);
       size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
       byte_align = TYPE_ALIGN_UNIT (type);
     }
+  else
+    {
+      size = tree_to_uhwi (DECL_SIZE_UNIT (var));
+      byte_align = align_local_variable (var);
+    }
 
   /* We handle highly aligned variables in expand_stack_vars.  */
   gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
@@ -1423,9 +1407,10 @@ expand_one_register_var (tree var)
 	  gcc_assert (REG_P (x));
 	  return;
 	}
+      gcc_unreachable ();
     }
 
-  tree decl = SSAVAR (var);
+  tree decl = var;
   tree type = TREE_TYPE (decl);
   machine_mode reg_mode = promote_decl_mode (decl, NULL);
   rtx x = gen_reg_rtx (reg_mode);
diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index 308da40..2b98946 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -1252,8 +1252,6 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x)
   if (!t)
     return;
   tree tdecl = t;
-  if (TREE_CODE (t) == TREE_LIST)
-    tdecl = TREE_VALUE (t);
   if (GET_CODE (x) == SUBREG)
     {
       gcc_assert (subreg_lowpart_p (x));
diff --git a/gcc/explow.c b/gcc/explow.c
index e09c032e1..5b0d49c 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -866,9 +866,6 @@ promote_ssa_mode (const_tree name, int *punsignedp)
 {
   gcc_assert (TREE_CODE (name) == SSA_NAME);
 
-  if (SSA_NAME_VAR (name))
-    return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
-
   tree type = TREE_TYPE (name);
   int unsignedp = TYPE_UNSIGNED (type);
   machine_mode mode = TYPE_MODE (type);
diff --git a/gcc/expr.c b/gcc/expr.c
index effe379..5b6e16e 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9584,20 +9584,16 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 
 	  /* Get the signedness to be used for this variable.  Ensure we get
 	     the same mode we got when the variable was declared.  */
-	  if (code == SSA_NAME
-	      && (g = SSA_NAME_DEF_STMT (ssa_name))
-	      && gimple_code (g) == GIMPLE_CALL
-	      && !gimple_call_internal_p (g))
+	  if (code != SSA_NAME)
+	    pmode = promote_decl_mode (exp, &unsignedp);
+	  else if ((g = SSA_NAME_DEF_STMT (ssa_name))
+		   && gimple_code (g) == GIMPLE_CALL
+		   && !gimple_call_internal_p (g))
 	    pmode = promote_function_mode (type, mode, &unsignedp,
 					   gimple_call_fntype (g),
 					   2);
-	  else if (!exp)
-	    {
-	      gcc_assert (code == SSA_NAME);
-	      pmode = promote_ssa_mode (ssa_name, &unsignedp);
-	    }
 	  else
-	    pmode = promote_decl_mode (exp, &unsignedp);
+	    pmode = promote_ssa_mode (ssa_name, &unsignedp);
 	  gcc_assert (GET_MODE (decl_rtl) == pmode);
 
 	  temp = gen_lowpart_SUBREG (mode, decl_rtl);
diff --git a/gcc/function.c b/gcc/function.c
index dc9e77f..58e2498 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -2124,6 +2124,21 @@ use_register_for_decl (const_tree decl)
 {
   if (TREE_CODE (decl) == SSA_NAME)
     {
+      /* We often try to use the SSA_NAME, instead of its underlying
+	 decl, to get type information and guide decisions, to avoid
+	 differences of behavior between anonymous and named
+	 variables, but in this one case we have to go for the actual
+	 variable if there is one.  The main reason is that, at least
+	 at -O0, we want to place user variables on the stack, but we
+	 don't mind using pseudos for anonymous or ignored temps.
+	 Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
+	 should go in pseudos, whereas their corresponding variables
+	 might have to go on the stack.  So, disregarding the decl
+	 here would negatively impact debug info at -O0, enable
+	 coalescing between SSA_NAMEs that ought to get different
+	 stack/pseudo assignments, and get the incoming argument
+	 processing thoroughly confused by PARM_DECLs expected to live
+	 in stack slots but assigned to pseudos.  */
       if (!SSA_NAME_VAR (decl))
 	return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
 	  && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
index 146cede..3d1c89f 100644
--- a/gcc/gimple-expr.h
+++ b/gcc/gimple-expr.h
@@ -55,10 +55,6 @@ extern bool is_gimple_mem_ref_addr (tree);
 extern void mark_addressable (tree);
 extern bool is_gimple_reg_rhs (tree);
 
-/* Defined in tree-ssa-coalesce.c.   */
-extern bool gimple_can_coalesce_p (tree, tree);
-
-
 /* Return true if a conversion from either type of TYPE1 and TYPE2
    to the other is not required.  Otherwise return false.  */
 
diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
index dda9973..59d91c6 100644
--- a/gcc/tree-outof-ssa.c
+++ b/gcc/tree-outof-ssa.c
@@ -305,7 +305,6 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
   rtx dest_rtx, seq, x;
   machine_mode dest_mode, src_mode;
   int unsignedp;
-  tree var;
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     {
@@ -328,10 +327,9 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
   start_sequence ();
 
   tree name = partition_to_var (SA.map, dest);
-  var = SSA_NAME_VAR (name);
   src_mode = TYPE_MODE (TREE_TYPE (src));
   dest_mode = GET_MODE (dest_rtx);
-  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var ? var : name)));
+  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (name)));
   gcc_assert (!REG_P (dest_rtx)
 	      || dest_mode == promote_ssa_mode (name, &unsignedp));
 
@@ -709,8 +707,7 @@ elim_backward (elim_graph g, int T)
 static rtx
 get_temp_reg (tree name)
 {
-  tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
-  tree type = var ? TREE_TYPE (var) : TREE_TYPE (name);
+  tree type = TREE_TYPE (name);
   int unsignedp;
   machine_mode reg_mode = promote_ssa_mode (name, &unsignedp);
   rtx x = gen_reg_rtx (reg_mode);
diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h
index 99b188a..ae289b4 100644
--- a/gcc/tree-ssa-coalesce.h
+++ b/gcc/tree-ssa-coalesce.h
@@ -21,5 +21,6 @@ along with GCC; see the file COPYING3.  If not see
 #define GCC_TREE_SSA_COALESCE_H
 
 extern var_map coalesce_ssa_name (void);
+extern bool gimple_can_coalesce_p (tree, tree);
 
 #endif /* GCC_TREE_SSA_COALESCE_H */
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 3f6bebe..7bef8cf 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -3964,7 +3964,11 @@ loop_exits_before_overflow (tree base, tree step,
 	if (!CONVERT_EXPR_P (e) || !operand_equal_p (e, unsigned_base, 0))
 	  continue;
 	e = TREE_OPERAND (e, 0);
-	gcc_assert (operand_equal_p (e, base, 0));
+	/* If E has an unsigned type, the operand equality test below
+	   would fail, but the equality test above would have already
+	   verified the equality, so we can proceed with it.  */
+	gcc_assert (TYPE_UNSIGNED (TREE_TYPE (e))
+		    || operand_equal_p (e, base, 0));
 	if (tree_int_cst_sign_bit (step))
 	  {
 	    code = LT_EXPR;
diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c
index f75a7f1..0982305 100644
--- a/gcc/tree-ssa-uncprop.c
+++ b/gcc/tree-ssa-uncprop.c
@@ -59,6 +59,11 @@ along with GCC; see the file COPYING3.  If not see
 #include "domwalk.h"
 #include "tree-pass.h"
 #include "tree-ssa-propagate.h"
+#include "bitmap.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-ssa-live.h"
+#include "tree-ssa-coalesce.h"
 
 /* The basic structure describing an equivalency created by traversing
    an edge.  Traversing the edge effectively means that we can assume


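In case the new leader_merge is easier to read outside the diff, here
is a rough standalone sketch of the choice it makes.  struct toy_decl
and toy_leader_merge are hypothetical stand-ins (the real function
works on trees and tests DECL_P and DECL_IGNORED_P), and the sketch
assumes NEXT is never null.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a DECL tree node; only the bit that
   leader_merge looks at is modelled.  */
struct toy_decl
{
  const char *name;
  bool ignored;		/* Models DECL_IGNORED_P.  */
};

/* Mirror of the selection logic in the patched leader_merge: take
   NEXT if there is no leader yet, otherwise prefer whichever of the
   two is an ignored decl, and fall back to CUR.  */
static const struct toy_decl *
toy_leader_merge (const struct toy_decl *cur, const struct toy_decl *next)
{
  assert (next != NULL);	/* Assumption of this sketch only.  */

  if (cur == NULL || cur == next)
    return next;

  if (cur->ignored)
    return cur;

  if (next->ignored)
    return next;

  return cur;
}
```

With a user variable and an ignored temporary, the temporary becomes
the leader in either argument order; with two user variables, the
current leader is kept.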

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


* Re: [PR64164] drop copyrename, integrate into expand
  2015-06-06  5:12             ` Alexandre Oliva
@ 2015-06-08  8:16               ` Richard Biener
  2015-06-09  8:58                 ` Christophe Lyon
  2015-06-10  0:28               ` Alexandre Oliva
  1 sibling, 1 reply; 127+ messages in thread
From: Richard Biener @ 2015-06-08  8:16 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: Jeff Law, GCC Patches

On Sat, Jun 6, 2015 at 3:14 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Apr 27, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
>
>> This should also mention that is_gimple_reg vars do not have their
>> address taken.
>
> check
>
>>> +static tree
>>> +leader_merge (tree cur, tree next)
>
>> Ick - presumably you can't use sth better than a TREE_LIST here?
>
> The list was an experiment that never really worked, and when I tried to
> make it work after the patch, it proved to be unworkable, so I dropped
> it, and rewrote leader_merge to choose either of the params, preferring
> anonymous over ignored over named, so as to reduce the likelihood of
> misreading of debug dumps, since that's all they're used for.
>
>>> static void
>>> -expand_one_stack_var (tree var)
>>> +expand_one_stack_var_1 (tree var)
>>> {
>>> HOST_WIDE_INT size, offset;
>>> unsigned byte_align;
>>>
>>> -  size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
>>> -  byte_align = align_local_variable (SSAVAR (var));
>>> +  if (TREE_CODE (var) != SSA_NAME || SSA_NAME_VAR (var))
>>> +    {
>>> +      size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
>>> +      byte_align = align_local_variable (SSAVAR (var));
>>> +    }
>>> +  else
>
>> I'd go here for all TREE_CODE (var) == SSA_NAME
>
> Check
>
>> (and get rid of the SSAVAR macro?)
>
> There are remaining uses that don't seem worth dropping it for.
>
>>> +/* Return the promoted mode for name.  If it is a named SSA_NAME, it
>>> +   is the same as promote_decl_mode.  Otherwise, it is the promoted
>>> +   mode of a temp decl of same type as the SSA_NAME, if we had created
>>> +   one.  */
>>> +
>>> +machine_mode
>>> +promote_ssa_mode (const_tree name, int *punsignedp)
>>> +{
>>> +  gcc_assert (TREE_CODE (name) == SSA_NAME);
>>> +
>>> +  if (SSA_NAME_VAR (name))
>>> +    return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
>
>> As above I'd rather not have different paths for anonymous vs. non-anonymous
>> vars (so just delete the above two lines).
>
> Check
>
>>> @@ -9668,6 +9678,11 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>>> pmode = promote_function_mode (type, mode, &unsignedp,
>>> gimple_call_fntype (g),
>>> 2);
>>> +         else if (!exp)
>>> +           {
>>> +             gcc_assert (code == SSA_NAME);
>
>> promote_ssa_mode should assert this.
>
>>> +             pmode = promote_ssa_mode (ssa_name, &unsignedp);
>
> It does, so...  check.
>
>
>>> @@ -2121,6 +2122,15 @@ aggregate_value_p (const_tree exp, const_tree fntype)
>>> bool
>>> use_register_for_decl (const_tree decl)
>>> {
>>> +  if (TREE_CODE (decl) == SSA_NAME)
>>> +    {
>>> +      if (!SSA_NAME_VAR (decl))
>>> +       return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
>>> +         && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
>>> +
>>> +      decl = SSA_NAME_VAR (decl);
>
>> See above.  Please drop the SSA_NAME_VAR != NULL path.
>
> Check, then taken back, after a bootstrap failure and some debugging
> made me realize this would be wrong.  Here are the newly-added comments
> that explain why:
>
>       /* We often try to use the SSA_NAME, instead of its underlying
>          decl, to get type information and guide decisions, to avoid
>          differences of behavior between anonymous and named
>          variables, but in this one case we have to go for the actual
>          variable if there is one.  The main reason is that, at least
>          at -O0, we want to place user variables on the stack, but we
>          don't mind using pseudos for anonymous or ignored temps.
>          Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
>          should go in pseudos, whereas their corresponding variables
>          might have to go on the stack.  So, disregarding the decl
>          here would negatively impact debug info at -O0, enable
>          coalescing between SSA_NAMEs that ought to get different
>          stack/pseudo assignments, and get the incoming argument
>          processing thoroughly confused by PARM_DECLs expected to live
>          in stack slots but assigned to pseudos.  */
>
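As a rough standalone model of the decision explained in that comment: an anonymous SSA name is judged by its type alone, while a named one defers to its underlying variable. The types and the function below are hypothetical stand-ins, not GCC's; `wants_register` simply abstracts whatever the decl-based checks in use_register_for_decl would answer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VAR_DECL/PARM_DECL; not a GCC type.  */
struct toy_decl
{
  bool wants_register;   /* result of the decl-based checks */
};

/* Hypothetical stand-in for an SSA_NAME.  */
struct toy_ssa_name
{
  const struct toy_decl *var;  /* NULL models an anonymous SSA name */
  bool is_blkmode;             /* models TYPE_MODE (type) == BLKmode */
  bool is_float;               /* models FLOAT_TYPE_P (type) */
};

/* Anonymous names go in a pseudo unless the type is BLKmode or a
   float under -ffloat-store; named ones follow their variable, so a
   user variable that must live on the stack keeps its slot.  */
static bool
toy_use_register_for_ssa_name (const struct toy_ssa_name *name,
                               bool flag_float_store)
{
  if (name->var == NULL)
    return !name->is_blkmode && !(flag_float_store && name->is_float);

  return name->var->wants_register;
}
```

In this model, dropping the second branch would put every named SSA name in a pseudo regardless of its variable, which is the failure mode the comment describes.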
>
>>> +++ b/gcc/gimple-expr.h
>>> +/* Defined in tree-ssa-coalesce.c.   */
>>> +extern bool gimple_can_coalesce_p (tree, tree);
>
>> Err, put it to tree-ssa-coalesce.h?
>
> Check.  Lots of additional headers required to be able to include
> tree-ssa-coalesce.h, though.
>
>
>>> -  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var)));
>>> +  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var ? var : name)));
>
>> The TREE_TYPE of name and its SSA_NAME_VAR are always the same.  So just
>> use TREE_TYPE (name) here.
>
> Check
>
>>> gcc_assert (!REG_P (dest_rtx)
>>> -             || dest_mode == promote_decl_mode (var, &unsignedp));
>>> +             || dest_mode == promote_ssa_mode (name, &unsignedp));
>>>
>>> if (src_mode != dest_mode)
>>> {
>>> @@ -714,12 +715,12 @@ static rtx
>>> get_temp_reg (tree name)
>>> {
>>> tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
>>> -  tree type = TREE_TYPE (var);
>>> +  tree type = var ? TREE_TYPE (var) : TREE_TYPE (name);
>
>> See above.
>
> Check
>
>
> Here's the revised patch, regstrapped on x86_64-linux-gnu and
> i686-linux-gnu.  The first attempt failed to compile libjava on x86_64,
> requiring the new change in tree-ssa-loop-niter.c to pass.  It didn't
> occur in the unpatched tree because the differences between anon or
> named SSA_NAMEs in copyrename changed costs and caused different choices
> in ivopts, which ultimately failed to expose the problem in loop-niter
> during vrp.
>
> At the end, I enclose the incremental changes since the previous
> revision of the patch, to ease the incremental review.
>
> Ok to install?

Ok.

Thanks,
Richard.

>
> for  gcc/ChangeLog
>
>         PR rtl-optimization/64164
>         * Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
>         * tree-ssa-copyrename.c: Removed.
>         * opts.c (default_options_table): Drop -ftree-copyrename.  Add
>         -ftree-coalesce-vars.
>         * passes.def: Drop all occurrences of pass_rename_ssa_copies.
>         * common.opt (ftree-copyrename): Ignore.
>         (ftree-coalesce-inlined-vars): Likewise.
>         * doc/invoke.texi: Remove the ignored options above.
>         * gimple-expr.h (gimple_can_coalesce_p): Move declaration
>         * tree-ssa-coalesce.h: ... here.
>         * tree-ssa-uncprop.c: Include tree-ssa-coalesce.h and other
>         headers required by it.
>         * gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
>         across variables when flag_tree_coalesce_vars.  Check register
>         use and promoted modes to allow coalescing.  Moved to
>         tree-ssa-coalesce.c.
>         * tree-ssa-live.c (struct tree_int_map_hasher): Move along
>         with its member functions to tree-ssa-coalesce.c.
>         (var_map_base_init): Likewise.  Renamed to
>         compute_samebase_partition_bases.
>         (partition_view_normal): Drop want_bases parameter.
>         (partition_view_bitmap): Likewise.
>         * tree-ssa-live.h: Adjust declarations.
>         * tree-ssa-coalesce.c: Include explow.h.
>         (build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs'
>         default defs at the entry point.
>         (dump_part_var_map): New.
>         (compute_optimized_partition_bases): New, called by...
>         (coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
>         of compute_samebase_partition_bases.  Adjust.
>         * alias.c (nonoverlapping_memrefs_p): Disregard gimple-regs.
>         * cfgexpand.c (leader_merge): New.
>         (get_rtl_for_parm_ssa_default_def): New.
>         (set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
>         vars.  Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
>         (expand_one_stack_var_at): Handle anonymous SSA_NAMEs.  Drop
>         redundant MEM attr setting.
>         (expand_one_stack_var_1): Handle anonymous SSA_NAMEs.  Renamed
>         from...
>         (expand_one_stack_var): ... this.  New wrapper to check and
>         skip already expanded SSA partitions.
>         (record_alignment_for_reg_var): New, factored out of...
>         (expand_one_var): ... this.
>         (expand_one_ssa_partition): New.
>         (adjust_one_expanded_partition_var): New.
>         (expand_one_register_var): Check and skip already expanded SSA
>         partitions.
>         (expand_used_vars): Don't create DECLs for anonymous SSA
>         names.  Expand all SSA partitions, then adjust all SSA names.
>         (pass::execute): Replace the loops that set
>         SA.partition_to_pseudo from partition leaders and cleared
>         DECL_RTL for multi-location variables, and that which used to
>         rename vars and set attrs, with one that clears DECL_RTL and
>         checks that PARMs and RESULTs default_defs match DECL_RTL.
>         * cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
>         * emit-rtl.c (set_reg_attrs_for_parm): Handle NULL decl.
>         * explow.c (promote_ssa_mode): New.
>         * explow.h (promote_ssa_mode): Declare.
>         * expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
>         * function.c: Include cfgexpand.h.
>         (use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
>         (use_register_for_parm_decl): Wrapper for the above to
>         special-case the result_ptr.
>         (rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
>         (maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
>         multiple locations.
>         (assign_parm_adjust_stack_rtl): Add all and parm arguments,
>         for rtl_for_parm.  For SSA-assigned parms, zero stack_parm.
>         (assign_parm_setup_block): Prefer SSA-assigned location.
>         (assign_parm_setup_reg): Likewise.  Use entry_parm for equiv
>         if stack_parm is NULL.
>         (assign_parm_setup_stack): Prefer SSA-assigned location.
>         (assign_parms): Maybe reset DECL_RTL of params.  Adjust stack
>         rtl before testing for pointer bounds.  Special-case result_ptr.
>         (expand_function_start): Maybe reset DECL_RTL of result.
>         Prefer SSA-assigned location for result and static chain.
>         Factor out DECL_RESULT and SET_DECL_RTL.
>         * tree-outof-ssa.c (insert_value_copy_on_edge): Handle
>         anonymous SSA names.  Use promote_ssa_mode.
>         (get_temp_reg): Likewise.
>         (remove_ssa_form): Adjust.
>         * var-tracking.c (dataflow_set_clear_at_call): Take call_insn
>         and get its reg_usage for reg invalidation.
>         (compute_bb_dataflow): Pass it insn.
>         (emit_notes_in_bb): Likewise.
>         * tree-ssa-loop-niter.c (loop_exits_before_overflow): Don't
>         fail assert on conversion between unsigned types.
>
> for  gcc/testsuite/ChangeLog
>
>         * gcc.dg/guality/pr54200.c: Add -fno-tree-coalesce-vars.
>         * gcc.dg/ssp-1.c: Make counter a register.
>         * gcc.dg/ssp-2.c: Likewise.
>         * gcc.dg/torture/parm-coalesce.c: New.
> ---
>  gcc/Makefile.in                              |    1
>  gcc/alias.c                                  |   13 +
>  gcc/cfgexpand.c                              |  370 ++++++++++++++-----
>  gcc/cfgexpand.h                              |    2
>  gcc/common.opt                               |   12 -
>  gcc/doc/invoke.texi                          |   48 +--
>  gcc/emit-rtl.c                               |    5
>  gcc/explow.c                                 |   22 +
>  gcc/explow.h                                 |    3
>  gcc/expr.c                                   |   39 +-
>  gcc/function.c                               |  226 +++++++++---
>  gcc/gimple-expr.c                            |   39 --
>  gcc/gimple-expr.h                            |    1
>  gcc/opts.c                                   |    2
>  gcc/passes.def                               |    5
>  gcc/testsuite/gcc.dg/guality/pr54200.c       |    2
>  gcc/testsuite/gcc.dg/ssp-1.c                 |    2
>  gcc/testsuite/gcc.dg/ssp-2.c                 |    2
>  gcc/testsuite/gcc.dg/torture/parm-coalesce.c |   40 ++
>  gcc/tree-outof-ssa.c                         |   16 -
>  gcc/tree-ssa-coalesce.c                      |  380 +++++++++++++++++++-
>  gcc/tree-ssa-coalesce.h                      |    1
>  gcc/tree-ssa-copyrename.c                    |  499 --------------------------
>  gcc/tree-ssa-live.c                          |  101 -----
>  gcc/tree-ssa-live.h                          |    4
>  gcc/tree-ssa-loop-niter.c                    |    6
>  gcc/tree-ssa-uncprop.c                       |    5
>  gcc/var-tracking.c                           |   12 -
>  28 files changed, 984 insertions(+), 874 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/torture/parm-coalesce.c
>  delete mode 100644 gcc/tree-ssa-copyrename.c
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index 3d14938..2a03223 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1441,7 +1441,6 @@ OBJS = \
>         tree-ssa-ccp.o \
>         tree-ssa-coalesce.o \
>         tree-ssa-copy.o \
> -       tree-ssa-copyrename.o \
>         tree-ssa-dce.o \
>         tree-ssa-dom.o \
>         tree-ssa-dse.o \
> diff --git a/gcc/alias.c b/gcc/alias.c
> index ea539c5..5a031d9 100644
> --- a/gcc/alias.c
> +++ b/gcc/alias.c
> @@ -2552,6 +2552,19 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
>    if (! DECL_P (exprx) || ! DECL_P (expry))
>      return 0;
>
> +  /* If we refer to different gimple registers, or one gimple register
> +     and one non-gimple-register, we know they can't overlap.  First,
> +     gimple registers don't have their addresses taken.  Now, there
> +     could be more than one stack slot for (different versions of) the
> +     same gimple register, but we can presumably tell they don't
> +     overlap based on offsets from stack base addresses elsewhere.
> +     It's important that we don't proceed to DECL_RTL, because gimple
> +     registers may not pass DECL_RTL_SET_P, and make_decl_rtl won't be
> +     able to do anything about them since no SSA information will have
> +     remained to guide it.  */
> +  if (is_gimple_reg (exprx) || is_gimple_reg (expry))
> +    return exprx != expry;
> +
>    /* With invalid code we can end up storing into the constant pool.
>       Bail out to avoid ICEing when creating RTL for this.
>       See gfortran.dg/lto/20091028-2_0.f90.  */
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index b190f91..bf972fc 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -179,21 +179,121 @@ gimple_assign_rhs_to_tree (gimple stmt)
>
>  #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
>
> +/* Choose either CUR or NEXT as the leader DECL for a partition.
> +   Prefer ignored decls, to simplify debug dumps and reduce ambiguity
> +   out of the same user variable being in multiple partitions (this is
> +   less likely for compiler-introduced temps).  */
> +
> +static tree
> +leader_merge (tree cur, tree next)
> +{
> +  if (cur == NULL || cur == next)
> +    return next;
> +
> +  if (DECL_P (cur) && DECL_IGNORED_P (cur))
> +    return cur;
> +
> +  if (DECL_P (next) && DECL_IGNORED_P (next))
> +    return next;
> +
> +  return cur;
> +}
> +
> +
> +/* Return the RTL for the default SSA def of a PARM or RESULT, if
> +   there is one.  */
> +
> +rtx
> +get_rtl_for_parm_ssa_default_def (tree var)
> +{
> +  gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL);
> +
> +  if (!is_gimple_reg (var))
> +    return NULL_RTX;
> +
> +  /* If we've already determined RTL for the decl, use it.  This is
> +     not just an optimization: if VAR is a PARM whose incoming value
> +     is unused, we won't find a default def to use its partition, but
> +     we still want to use the location of the parm, if it was used at
> +     all.  During assign_parms, until a location is assigned for the
> +     VAR, RTL can only be set for a parm or result if we're not coalescing
> +     across variables, when we know we're coalescing all SSA_NAMEs of
> +     each parm or result, and we're not coalescing them with names
> +     pertaining to other variables, such as other parms' default
> +     defs.  */
> +  if (DECL_RTL_SET_P (var))
> +    {
> +      gcc_assert (DECL_RTL (var) != pc_rtx);
> +      return DECL_RTL (var);
> +    }
> +
> +  tree name = ssa_default_def (cfun, var);
> +
> +  if (!name)
> +    return NULL_RTX;
> +
> +  int part = var_to_partition (SA.map, name);
> +  if (part == NO_PARTITION)
> +    return NULL_RTX;
> +
> +  return SA.partition_to_pseudo[part];
> +}
> +
>  /* Associate declaration T with storage space X.  If T is no
>     SSA name this is exactly SET_DECL_RTL, otherwise make the
>     partition of T associated with X.  */
>  static inline void
>  set_rtl (tree t, rtx x)
>  {
> +  if (x && SSAVAR (t))
> +    {
> +      bool skip = false;
> +      tree cur = NULL_TREE;
> +
> +      if (MEM_P (x))
> +       cur = MEM_EXPR (x);
> +      else if (REG_P (x))
> +       cur = REG_EXPR (x);
> +      else if (GET_CODE (x) == CONCAT
> +              && REG_P (XEXP (x, 0)))
> +       cur = REG_EXPR (XEXP (x, 0));
> +      else if (GET_CODE (x) == PARALLEL)
> +       cur = REG_EXPR (XVECEXP (x, 0, 0));
> +      else if (x == pc_rtx)
> +       skip = true;
> +      else
> +       gcc_unreachable ();
> +
> +      tree next = skip ? cur : leader_merge (cur, SSAVAR (t));
> +
> +      if (cur != next)
> +       {
> +         if (MEM_P (x))
> +           set_mem_attributes (x, next, true);
> +         else
> +           set_reg_attrs_for_decl_rtl (next, x);
> +       }
> +    }
> +
>    if (TREE_CODE (t) == SSA_NAME)
>      {
> -      SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x;
> -      if (x && !MEM_P (x))
> -       set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x);
> -      /* For the benefit of debug information at -O0 (where vartracking
> -         doesn't run) record the place also in the base DECL if it's
> -        a normal variable (not a parameter).  */
> -      if (x && x != pc_rtx && TREE_CODE (SSA_NAME_VAR (t)) == VAR_DECL)
> +      int part = var_to_partition (SA.map, t);
> +      if (part != NO_PARTITION)
> +       {
> +         if (SA.partition_to_pseudo[part])
> +           gcc_assert (SA.partition_to_pseudo[part] == x);
> +         else
> +           SA.partition_to_pseudo[part] = x;
> +       }
> +      /* For the benefit of debug information at -O0 (where
> +         vartracking doesn't run) record the place also in the base
> +         DECL.  For PARMs and RESULTs, we may end up resetting these
> +         in function.c:maybe_reset_rtl_for_parm, but in some rare
> +         cases we may need them (unused and overwritten incoming
> +         value, that at -O0 must share the location with the other
> +         uses in spite of the missing default def), and this may be
> +         the only chance to preserve them.  */
> +      if (x && x != pc_rtx && SSA_NAME_VAR (t))
>         {
>           tree var = SSA_NAME_VAR (t);
>           /* If we don't yet have something recorded, just record it now.  */
> @@ -909,7 +1009,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
>    gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
>
>    x = plus_constant (Pmode, base, offset);
> -  x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x);
> +  x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
> +                  ? TYPE_MODE (TREE_TYPE (decl))
> +                  : DECL_MODE (SSAVAR (decl)), x);
>
>    if (TREE_CODE (decl) != SSA_NAME)
>      {
> @@ -931,7 +1033,6 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
>        DECL_USER_ALIGN (decl) = 0;
>      }
>
> -  set_mem_attributes (x, SSAVAR (decl), true);
>    set_rtl (decl, x);
>  }
>
> @@ -1146,13 +1247,22 @@ account_stack_vars (void)
>     to a variable to be allocated in the stack frame.  */
>
>  static void
> -expand_one_stack_var (tree var)
> +expand_one_stack_var_1 (tree var)
>  {
>    HOST_WIDE_INT size, offset;
>    unsigned byte_align;
>
> -  size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
> -  byte_align = align_local_variable (SSAVAR (var));
> +  if (TREE_CODE (var) == SSA_NAME)
> +    {
> +      tree type = TREE_TYPE (var);
> +      size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
> +      byte_align = TYPE_ALIGN_UNIT (type);
> +    }
> +  else
> +    {
> +      size = tree_to_uhwi (DECL_SIZE_UNIT (var));
> +      byte_align = align_local_variable (var);
> +    }
>
>    /* We handle highly aligned variables in expand_stack_vars.  */
>    gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
> @@ -1163,6 +1273,27 @@ expand_one_stack_var (tree var)
>                            crtl->max_used_stack_slot_alignment, offset);
>  }
>
> +/* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are
> +   already assigned some MEM.  */
> +
> +static void
> +expand_one_stack_var (tree var)
> +{
> +  if (TREE_CODE (var) == SSA_NAME)
> +    {
> +      int part = var_to_partition (SA.map, var);
> +      if (part != NO_PARTITION)
> +       {
> +         rtx x = SA.partition_to_pseudo[part];
> +         gcc_assert (x);
> +         gcc_assert (MEM_P (x));
> +         return;
> +       }
> +    }
> +
> +  return expand_one_stack_var_1 (var);
> +}
> +
>  /* A subroutine of expand_one_var.  Called to assign rtl to a VAR_DECL
>     that will reside in a hard register.  */
>
> @@ -1172,13 +1303,114 @@ expand_one_hard_reg_var (tree var)
>    rest_of_decl_compilation (var, 0, 0);
>  }
>
> +/* Record the alignment requirements of some variable assigned to a
> +   pseudo.  */
> +
> +static void
> +record_alignment_for_reg_var (unsigned int align)
> +{
> +  if (SUPPORTS_STACK_ALIGNMENT
> +      && crtl->stack_alignment_estimated < align)
> +    {
> +      /* stack_alignment_estimated shouldn't change after stack
> +         realign decision made */
> +      gcc_assert (!crtl->stack_realign_processed);
> +      crtl->stack_alignment_estimated = align;
> +    }
> +
> +  /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
> +     So here we only make sure stack_alignment_needed >= align.  */
> +  if (crtl->stack_alignment_needed < align)
> +    crtl->stack_alignment_needed = align;
> +  if (crtl->max_used_stack_slot_alignment < align)
> +    crtl->max_used_stack_slot_alignment = align;
> +}
> +
> +/* Create RTL for an SSA partition.  */
> +
> +static void
> +expand_one_ssa_partition (tree var)
> +{
> +  int part = var_to_partition (SA.map, var);
> +  gcc_assert (part != NO_PARTITION);
> +
> +  if (SA.partition_to_pseudo[part])
> +    return;
> +
> +  if (!use_register_for_decl (var))
> +    {
> +      expand_one_stack_var_1 (var);
> +      return;
> +    }
> +
> +  unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (var),
> +                                         TYPE_MODE (TREE_TYPE (var)),
> +                                         TYPE_ALIGN (TREE_TYPE (var)));
> +
> +  /* If the variable alignment is very large we'll dynamically allocate
> +     it, which means that in-frame portion is just a pointer.  */
> +  if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
> +    align = POINTER_SIZE;
> +
> +  record_alignment_for_reg_var (align);
> +
> +  machine_mode reg_mode = promote_ssa_mode (var, NULL);
> +
> +  rtx x = gen_reg_rtx (reg_mode);
> +
> +  set_rtl (var, x);
> +}
> +
> +/* Record the association between the RTL generated for a partition
> +   and the underlying variable of the SSA_NAME.  */
> +
> +static void
> +adjust_one_expanded_partition_var (tree var)
> +{
> +  if (!var)
> +    return;
> +
> +  tree decl = SSA_NAME_VAR (var);
> +
> +  int part = var_to_partition (SA.map, var);
> +  if (part == NO_PARTITION)
> +    return;
> +
> +  rtx x = SA.partition_to_pseudo[part];
> +
> +  set_rtl (var, x);
> +
> +  if (!REG_P (x))
> +    return;
> +
> +  /* Note if the object is a user variable.  */
> +  if (decl && !DECL_ARTIFICIAL (decl))
> +    mark_user_reg (x);
> +
> +  if (POINTER_TYPE_P (decl ? TREE_TYPE (decl) : TREE_TYPE (var)))
> +    mark_reg_pointer (x, get_pointer_alignment (var));
> +}
> +
>  /* A subroutine of expand_one_var.  Called to assign rtl to a VAR_DECL
>     that will reside in a pseudo register.  */
>
>  static void
>  expand_one_register_var (tree var)
>  {
> -  tree decl = SSAVAR (var);
> +  if (TREE_CODE (var) == SSA_NAME)
> +    {
> +      int part = var_to_partition (SA.map, var);
> +      if (part != NO_PARTITION)
> +       {
> +         rtx x = SA.partition_to_pseudo[part];
> +         gcc_assert (x);
> +         gcc_assert (REG_P (x));
> +         return;
> +       }
> +      gcc_unreachable ();
> +    }
> +
> +  tree decl = var;
>    tree type = TREE_TYPE (decl);
>    machine_mode reg_mode = promote_decl_mode (decl, NULL);
>    rtx x = gen_reg_rtx (reg_mode);
> @@ -1312,21 +1544,7 @@ expand_one_var (tree var, bool toplevel, bool really_expand)
>         align = POINTER_SIZE;
>      }
>
> -  if (SUPPORTS_STACK_ALIGNMENT
> -      && crtl->stack_alignment_estimated < align)
> -    {
> -      /* stack_alignment_estimated shouldn't change after stack
> -         realign decision made */
> -      gcc_assert (!crtl->stack_realign_processed);
> -      crtl->stack_alignment_estimated = align;
> -    }
> -
> -  /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
> -     So here we only make sure stack_alignment_needed >= align.  */
> -  if (crtl->stack_alignment_needed < align)
> -    crtl->stack_alignment_needed = align;
> -  if (crtl->max_used_stack_slot_alignment < align)
> -    crtl->max_used_stack_slot_alignment = align;
> +  record_alignment_for_reg_var (align);
>
>    if (TREE_CODE (origvar) == SSA_NAME)
>      {
> @@ -1760,48 +1978,18 @@ expand_used_vars (void)
>    if (targetm.use_pseudo_pic_reg ())
>      pic_offset_table_rtx = gen_reg_rtx (Pmode);
>
> -  hash_map<tree, tree> ssa_name_decls;
>    for (i = 0; i < SA.map->num_partitions; i++)
>      {
>        tree var = partition_to_var (SA.map, i);
>
>        gcc_assert (!virtual_operand_p (var));
>
> -      /* Assign decls to each SSA name partition, share decls for partitions
> -         we could have coalesced (those with the same type).  */
> -      if (SSA_NAME_VAR (var) == NULL_TREE)
> -       {
> -         tree *slot = &ssa_name_decls.get_or_insert (TREE_TYPE (var));
> -         if (!*slot)
> -           *slot = create_tmp_reg (TREE_TYPE (var));
> -         replace_ssa_name_symbol (var, *slot);
> -       }
> -
> -      /* Always allocate space for partitions based on VAR_DECLs.  But for
> -        those based on PARM_DECLs or RESULT_DECLs and which matter for the
> -        debug info, there is no need to do so if optimization is disabled
> -        because all the SSA_NAMEs based on these DECLs have been coalesced
> -        into a single partition, which is thus assigned the canonical RTL
> -        location of the DECLs.  If in_lto_p, we can't rely on optimize,
> -        a function could be compiled with -O1 -flto first and only the
> -        link performed at -O0.  */
> -      if (TREE_CODE (SSA_NAME_VAR (var)) == VAR_DECL)
> -       expand_one_var (var, true, true);
> -      else if (DECL_IGNORED_P (SSA_NAME_VAR (var)) || optimize || in_lto_p)
> -       {
> -         /* This is a PARM_DECL or RESULT_DECL.  For those partitions that
> -            contain the default def (representing the parm or result itself)
> -            we don't do anything here.  But those which don't contain the
> -            default def (representing a temporary based on the parm/result)
> -            we need to allocate space just like for normal VAR_DECLs.  */
> -         if (!bitmap_bit_p (SA.partition_has_default_def, i))
> -           {
> -             expand_one_var (var, true, true);
> -             gcc_assert (SA.partition_to_pseudo[i]);
> -           }
> -       }
> +      expand_one_ssa_partition (var);
>      }
>
> +  for (i = 1; i < num_ssa_names; i++)
> +    adjust_one_expanded_partition_var (ssa_name (i));
> +
>    if (flag_stack_protect == SPCT_FLAG_STRONG)
>        gen_stack_protect_signal
>         = stack_protect_decl_p () || stack_protect_return_slot_p ();
> @@ -5961,35 +6149,6 @@ pass_expand::execute (function *fun)
>        parm_birth_insn = var_seq;
>      }
>
> -  /* Now that we also have the parameter RTXs, copy them over to our
> -     partitions.  */
> -  for (i = 0; i < SA.map->num_partitions; i++)
> -    {
> -      tree var = SSA_NAME_VAR (partition_to_var (SA.map, i));
> -
> -      if (TREE_CODE (var) != VAR_DECL
> -         && !SA.partition_to_pseudo[i])
> -       SA.partition_to_pseudo[i] = DECL_RTL_IF_SET (var);
> -      gcc_assert (SA.partition_to_pseudo[i]);
> -
> -      /* If this decl was marked as living in multiple places, reset
> -        this now to NULL.  */
> -      if (DECL_RTL_IF_SET (var) == pc_rtx)
> -       SET_DECL_RTL (var, NULL);
> -
> -      /* Some RTL parts really want to look at DECL_RTL(x) when x
> -        was a decl marked in REG_ATTR or MEM_ATTR.  We could use
> -        SET_DECL_RTL here making this available, but that would mean
> -        to select one of the potentially many RTLs for one DECL.  Instead
> -        of doing that we simply reset the MEM_EXPR of the RTL in question,
> -        then nobody can get at it and hence nobody can call DECL_RTL on it.  */
> -      if (!DECL_RTL_SET_P (var))
> -       {
> -         if (MEM_P (SA.partition_to_pseudo[i]))
> -           set_mem_expr (SA.partition_to_pseudo[i], NULL);
> -       }
> -    }
> -
>    /* If we have a class containing differently aligned pointers
>       we need to merge those into the corresponding RTL pointer
>       alignment.  */
> @@ -5997,7 +6156,6 @@ pass_expand::execute (function *fun)
>      {
>        tree name = ssa_name (i);
>        int part;
> -      rtx r;
>
>        if (!name
>           /* We might have generated new SSA names in
> @@ -6010,20 +6168,24 @@ pass_expand::execute (function *fun)
>        if (part == NO_PARTITION)
>         continue;
>
> -      /* Adjust all partition members to get the underlying decl of
> -        the representative which we might have created in expand_one_var.  */
> -      if (SSA_NAME_VAR (name) == NULL_TREE)
> +      gcc_assert (SA.partition_to_pseudo[part]);
> +
> +      /* If this decl was marked as living in multiple places, reset
> +        this now to NULL.  */
> +      tree var = SSA_NAME_VAR (name);
> +      if (var && DECL_RTL_IF_SET (var) == pc_rtx)
> +       SET_DECL_RTL (var, NULL);
> +      /* Check that the pseudos chosen by assign_parms are those of
> +        the corresponding default defs.  */
> +      else if (SSA_NAME_IS_DEFAULT_DEF (name)
> +              && (TREE_CODE (var) == PARM_DECL
> +                  || TREE_CODE (var) == RESULT_DECL))
>         {
> -         tree leader = partition_to_var (SA.map, part);
> -         gcc_assert (SSA_NAME_VAR (leader) != NULL_TREE);
> -         replace_ssa_name_symbol (name, SSA_NAME_VAR (leader));
> +         rtx in = DECL_RTL_IF_SET (var);
> +         gcc_assert (in);
> +         rtx out = SA.partition_to_pseudo[part];
> +         gcc_assert (in == out || rtx_equal_p (in, out));
>         }
> -      if (!POINTER_TYPE_P (TREE_TYPE (name)))
> -       continue;
> -
> -      r = SA.partition_to_pseudo[part];
> -      if (REG_P (r))
> -       mark_reg_pointer (r, get_pointer_alignment (name));
>      }
>
>    /* If this function is `main', emit a call to `__main'
> diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
> index a0b6e3e..602579d 100644
> --- a/gcc/cfgexpand.h
> +++ b/gcc/cfgexpand.h
> @@ -22,5 +22,7 @@ along with GCC; see the file COPYING3.  If not see
>
>  extern tree gimple_assign_rhs_to_tree (gimple);
>  extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
> +extern rtx get_rtl_for_parm_ssa_default_def (tree var);
> +
>
>  #endif /* GCC_CFGEXPAND_H */
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 32b416a..051f824 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2227,16 +2227,16 @@ Common Report Var(flag_tree_ch) Optimization
>  Enable loop header copying on trees
>
>  ftree-coalesce-inlined-vars
> -Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization
> -Enable coalescing of copy-related user variables that are inlined
> +Common Ignore RejectNegative
> +Does nothing.  Preserved for backward compatibility.
>
>  ftree-coalesce-vars
> -Common Report Var(flag_ssa_coalesce_vars,2) Optimization
> -Enable coalescing of all copy-related user variables
> +Common Report Var(flag_tree_coalesce_vars) Optimization
> +Enable SSA coalescing of user variables
>
>  ftree-copyrename
> -Common Report Var(flag_tree_copyrename) Optimization
> -Replace SSA temporaries with better names in copies
> +Common Ignore
> +Does nothing.  Preserved for backward compatibility.
>
>  ftree-copy-prop
>  Common Report Var(flag_tree_copy_prop) Optimization
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index e25bd62..e359be2 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -339,7 +339,6 @@ Objective-C and Objective-C++ Dialects}.
>  -fdump-tree-phiprop@r{[}-@var{n}@r{]} @gol
>  -fdump-tree-phiopt@r{[}-@var{n}@r{]} @gol
>  -fdump-tree-forwprop@r{[}-@var{n}@r{]} @gol
> --fdump-tree-copyrename@r{[}-@var{n}@r{]} @gol
>  -fdump-tree-nrv -fdump-tree-vect @gol
>  -fdump-tree-sink @gol
>  -fdump-tree-sra@r{[}-@var{n}@r{]} @gol
> @@ -445,9 +444,8 @@ Objective-C and Objective-C++ Dialects}.
>  -fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol
>  -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
>  -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
> --ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
> --ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
> --ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
> +-ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol
> +-ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
>  -ftree-loop-if-convert-stores -ftree-loop-im @gol
>  -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
>  -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
> @@ -7076,11 +7074,6 @@ name is made by appending @file{.phiopt} to the source file name.
>  Dump each function after forward propagating single use variables.  The file
>  name is made by appending @file{.forwprop} to the source file name.
>
> -@item copyrename
> -@opindex fdump-tree-copyrename
> -Dump each function after applying the copy rename optimization.  The file
> -name is made by appending @file{.copyrename} to the source file name.
> -
>  @item nrv
>  @opindex fdump-tree-nrv
>  Dump each function after applying the named return value optimization on
> @@ -7545,8 +7538,8 @@ compilation time.
>  -ftree-ccp @gol
>  -fssa-phiopt @gol
>  -ftree-ch @gol
> +-ftree-coalesce-vars @gol
>  -ftree-copy-prop @gol
> --ftree-copyrename @gol
>  -ftree-dce @gol
>  -ftree-dominator-opts @gol
>  -ftree-dse @gol
> @@ -8815,6 +8808,15 @@ profitable to parallelize the loops.
>  Compare the results of several data dependence analyzers.  This option
>  is used for debugging the data dependence analyzers.
>
> +@item -ftree-coalesce-vars
> +@opindex ftree-coalesce-vars
> +Tell the compiler to attempt to combine small user-defined variables
> +too, instead of just compiler temporaries.  This may severely limit the
> +ability to debug an optimized program compiled with
> +@option{-fno-var-tracking-assignments}.  In the negated form, this flag
> +prevents SSA coalescing of user variables.  This option is enabled by
> +default if optimization is enabled.
> +
>  @item -ftree-loop-if-convert
>  @opindex ftree-loop-if-convert
>  Attempt to transform conditional jumps in the innermost loops to
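
To make the -ftree-coalesce-vars wording above concrete, here is a minimal
sketch of the kind of code it is about.  The function and variable names are
made up for illustration and do not come from the patch.  Whether the two
variables really end up sharing a location depends on the target and the
optimization level, so this only shows the source-level shape, not the
generated code.

```c
/* Hypothetical example: `sum' is dead once `scaled' has been computed,
   so the two user variables have non-overlapping live ranges.  With
   coalescing of user variables enabled, the compiler may assign both
   to a single location; with -fno-tree-coalesce-vars it keeps them
   apart, which tends to help debugging.  */
int
sum_then_scale (int a, int b)
{
  int sum = a + b;
  int scaled = sum * 2;
  return scaled;
}
```

Calling sum_then_scale (1, 2) returns 6 either way; the option is only meant
to affect where the variables live, not the result.
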
> @@ -8928,32 +8930,6 @@ Perform scalar replacement of aggregates.  This pass replaces structure
>  references with scalars to prevent committing structures to memory too
>  early.  This flag is enabled by default at @option{-O} and higher.
>
> -@item -ftree-copyrename
> -@opindex ftree-copyrename
> -Perform copy renaming on trees.  This pass attempts to rename compiler
> -temporaries to other variables at copy locations, usually resulting in
> -variable names which more closely resemble the original variables.  This flag
> -is enabled by default at @option{-O} and higher.
> -
> -@item -ftree-coalesce-inlined-vars
> -@opindex ftree-coalesce-inlined-vars
> -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
> -combine small user-defined variables too, but only if they are inlined
> -from other functions.  It is a more limited form of
> -@option{-ftree-coalesce-vars}.  This may harm debug information of such
> -inlined variables, but it keeps variables of the inlined-into
> -function apart from each other, such that they are more likely to
> -contain the expected values in a debugging session.
> -
> -@item -ftree-coalesce-vars
> -@opindex ftree-coalesce-vars
> -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
> -combine small user-defined variables too, instead of just compiler
> -temporaries.  This may severely limit the ability to debug an optimized
> -program compiled with @option{-fno-var-tracking-assignments}.  In the
> -negated form, this flag prevents SSA coalescing of user variables,
> -including inlined ones.  This option is enabled by default.
> -
>  @item -ftree-ter
>  @opindex ftree-ter
>  Perform temporary expression replacement during the SSA->normal phase.  Single
> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
> index 49a1509..2b98946 100644
> --- a/gcc/emit-rtl.c
> +++ b/gcc/emit-rtl.c
> @@ -1249,6 +1249,9 @@ set_reg_attrs_for_parm (rtx parm_rtx, rtx mem)
>  void
>  set_reg_attrs_for_decl_rtl (tree t, rtx x)
>  {
> +  if (!t)
> +    return;
> +  tree tdecl = t;
>    if (GET_CODE (x) == SUBREG)
>      {
>        gcc_assert (subreg_lowpart_p (x));
> @@ -1257,7 +1260,7 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x)
>    if (REG_P (x))
>      REG_ATTRS (x)
>        = get_reg_attrs (t, byte_lowpart_offset (GET_MODE (x),
> -                                              DECL_MODE (t)));
> +                                              DECL_MODE (tdecl)));
>    if (GET_CODE (x) == CONCAT)
>      {
>        if (REG_P (XEXP (x, 0)))
> diff --git a/gcc/explow.c b/gcc/explow.c
> index 8745aea..5b0d49c 100644
> --- a/gcc/explow.c
> +++ b/gcc/explow.c
> @@ -856,6 +856,28 @@ promote_decl_mode (const_tree decl, int *punsignedp)
>    return pmode;
>  }
>
> +/* Return the promoted mode for name.  If it is a named SSA_NAME, it
> +   is the same as promote_decl_mode.  Otherwise, it is the promoted
> +   mode of a temp decl of same type as the SSA_NAME, if we had created
> +   one.  */
> +
> +machine_mode
> +promote_ssa_mode (const_tree name, int *punsignedp)
> +{
> +  gcc_assert (TREE_CODE (name) == SSA_NAME);
> +
> +  tree type = TREE_TYPE (name);
> +  int unsignedp = TYPE_UNSIGNED (type);
> +  machine_mode mode = TYPE_MODE (type);
> +
> +  machine_mode pmode = promote_mode (type, mode, &unsignedp);
> +  if (punsignedp)
> +    *punsignedp = unsignedp;
> +
> +  return pmode;
> +}
> +
> +
>
>  /* Controls the behaviour of {anti_,}adjust_stack.  */
>  static bool suppress_reg_args_size;
> diff --git a/gcc/explow.h b/gcc/explow.h
> index 94613de..52113db 100644
> --- a/gcc/explow.h
> +++ b/gcc/explow.h
> @@ -57,6 +57,9 @@ extern machine_mode promote_mode (const_tree, machine_mode, int *);
>  /* Return mode and signedness to use when object is promoted.  */
>  machine_mode promote_decl_mode (const_tree, int *);
>
> +/* Return mode and signedness to use when object is promoted.  */
> +machine_mode promote_ssa_mode (const_tree, int *);
> +
>  /* Remove some bytes from the stack.  An rtx says how many.  */
>  extern void adjust_stack (rtx);
>
> diff --git a/gcc/expr.c b/gcc/expr.c
> index 5a931dc..5b6e16e 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -9301,7 +9301,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>    rtx op0, op1, temp, decl_rtl;
>    tree type;
>    int unsignedp;
> -  machine_mode mode;
> +  machine_mode mode, dmode;
>    enum tree_code code = TREE_CODE (exp);
>    rtx subtarget, original_target;
>    int ignore;
> @@ -9432,7 +9432,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>        if (g == NULL
>           && modifier == EXPAND_INITIALIZER
>           && !SSA_NAME_IS_DEFAULT_DEF (exp)
> -         && (optimize || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
> +         && (optimize || !SSA_NAME_VAR (exp)
> +             || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
>           && stmt_is_replaceable_p (SSA_NAME_DEF_STMT (exp)))
>         g = SSA_NAME_DEF_STMT (exp);
>        if (g)
> @@ -9511,15 +9512,18 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>        /* Ensure variable marked as used even if it doesn't go through
>          a parser.  If it hasn't be used yet, write out an external
>          definition.  */
> -      TREE_USED (exp) = 1;
> +      if (exp)
> +       TREE_USED (exp) = 1;
>
>        /* Show we haven't gotten RTL for this yet.  */
>        temp = 0;
>
>        /* Variables inherited from containing functions should have
>          been lowered by this point.  */
> -      context = decl_function_context (exp);
> -      gcc_assert (SCOPE_FILE_SCOPE_P (context)
> +      if (exp)
> +       context = decl_function_context (exp);
> +      gcc_assert (!exp
> +                 || SCOPE_FILE_SCOPE_P (context)
>                   || context == current_function_decl
>                   || TREE_STATIC (exp)
>                   || DECL_EXTERNAL (exp)
> @@ -9543,7 +9547,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>           decl_rtl = use_anchored_address (decl_rtl);
>           if (modifier != EXPAND_CONST_ADDRESS
>               && modifier != EXPAND_SUM
> -             && !memory_address_addr_space_p (DECL_MODE (exp),
> +             && !memory_address_addr_space_p (exp ? DECL_MODE (exp)
> +                                              : GET_MODE (decl_rtl),
>                                                XEXP (decl_rtl, 0),
>                                                MEM_ADDR_SPACE (decl_rtl)))
>             temp = replace_equiv_address (decl_rtl,
> @@ -9554,12 +9559,17 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>          if the address is a register.  */
>        if (temp != 0)
>         {
> -         if (MEM_P (temp) && REG_P (XEXP (temp, 0)))
> +         if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0)))
>             mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp));
>
>           return temp;
>         }
>
> +      if (exp)
> +       dmode = DECL_MODE (exp);
> +      else
> +       dmode = TYPE_MODE (TREE_TYPE (ssa_name));
> +
>        /* If the mode of DECL_RTL does not match that of the decl,
>          there are two cases: we are dealing with a BLKmode value
>          that is returned in a register, or we are dealing with
> @@ -9567,22 +9577,23 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>          of the wanted mode, but mark it so that we know that it
>          was already extended.  */
>        if (REG_P (decl_rtl)
> -         && DECL_MODE (exp) != BLKmode
> -         && GET_MODE (decl_rtl) != DECL_MODE (exp))
> +         && dmode != BLKmode
> +         && GET_MODE (decl_rtl) != dmode)
>         {
>           machine_mode pmode;
>
>           /* Get the signedness to be used for this variable.  Ensure we get
>              the same mode we got when the variable was declared.  */
> -         if (code == SSA_NAME
> -             && (g = SSA_NAME_DEF_STMT (ssa_name))
> -             && gimple_code (g) == GIMPLE_CALL
> -             && !gimple_call_internal_p (g))
> +         if (code != SSA_NAME)
> +           pmode = promote_decl_mode (exp, &unsignedp);
> +         else if ((g = SSA_NAME_DEF_STMT (ssa_name))
> +                  && gimple_code (g) == GIMPLE_CALL
> +                  && !gimple_call_internal_p (g))
>             pmode = promote_function_mode (type, mode, &unsignedp,
>                                            gimple_call_fntype (g),
>                                            2);
>           else
> -           pmode = promote_decl_mode (exp, &unsignedp);
> +           pmode = promote_ssa_mode (ssa_name, &unsignedp);
>           gcc_assert (GET_MODE (decl_rtl) == pmode);
>
>           temp = gen_lowpart_SUBREG (mode, decl_rtl);
> diff --git a/gcc/function.c b/gcc/function.c
> index 7d2d7e4..58e2498 100644
> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -87,6 +87,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "cfganal.h"
>  #include "cfgbuild.h"
>  #include "cfgcleanup.h"
> +#include "cfgexpand.h"
>  #include "basic-block.h"
>  #include "df.h"
>  #include "params.h"
> @@ -2121,6 +2122,30 @@ aggregate_value_p (const_tree exp, const_tree fntype)
>  bool
>  use_register_for_decl (const_tree decl)
>  {
> +  if (TREE_CODE (decl) == SSA_NAME)
> +    {
> +      /* We often try to use the SSA_NAME, instead of its underlying
> +        decl, to get type information and guide decisions, to avoid
> +        differences of behavior between anonymous and named
> +        variables, but in this one case we have to go for the actual
> +        variable if there is one.  The main reason is that, at least
> +        at -O0, we want to place user variables on the stack, but we
> +        don't mind using pseudos for anonymous or ignored temps.
> +        Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
> +        should go in pseudos, whereas their corresponding variables
> +        might have to go on the stack.  So, disregarding the decl
> +        here would negatively impact debug info at -O0, enable
> +        coalescing between SSA_NAMEs that ought to get different
> +        stack/pseudo assignments, and get the incoming argument
> +        processing thoroughly confused by PARM_DECLs expected to live
> +        in stack slots but assigned to pseudos.  */
> +      if (!SSA_NAME_VAR (decl))
> +       return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
> +         && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
> +
> +      decl = SSA_NAME_VAR (decl);
> +    }
> +
>    if (!targetm.calls.allocate_stack_slots_for_args ())
>      return true;
>
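
The comment in the use_register_for_decl hunk above describes a two-way
decision, which can be sketched as a toy model.  Everything below is
hypothetical: the structs stand in for GCC's tree nodes, and the decl-side
function is a deliberately simplified assumption that keeps only the
ignored, optimize and `register' checks, not the real function's full logic.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a VAR_DECL or PARM_DECL (hypothetical).  */
struct toy_decl
{
  bool ignored;      /* Models DECL_IGNORED_P.  */
  bool reg_keyword;  /* Models DECL_REGISTER.  */
};

/* Toy stand-in for an SSA_NAME (hypothetical).  */
struct toy_ssa_name
{
  const struct toy_decl *var;  /* Models SSA_NAME_VAR; NULL if anonymous.  */
  bool blkmode;                /* The type's mode is BLKmode.  */
  bool is_float;               /* Models FLOAT_TYPE_P on the type.  */
};

/* Simplified decl-side decision: when not optimizing, only ignored
   or `register' variables get a pseudo.  */
bool
toy_use_register_for_decl (const struct toy_decl *decl, bool optimize)
{
  if (decl->ignored || optimize)
    return true;
  return decl->reg_keyword;
}

/* SSA_NAME-side decision, following the shape of the quoted hunk:
   anonymous names are judged by their type alone, named ones defer
   to the underlying variable.  */
bool
toy_use_register_for_ssa_name (const struct toy_ssa_name *name,
                               bool optimize, bool float_store)
{
  if (name->var == NULL)
    return !name->blkmode && !(float_store && name->is_float);
  return toy_use_register_for_decl (name->var, optimize);
}
```

In this model an anonymous integer temporary gets a pseudo even when not
optimizing, while an SSA name backed by an ordinary user variable does not,
which is the asymmetry the comment is warning about.
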
> @@ -2804,23 +2829,88 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data)
>    data->entry_parm = entry_parm;
>  }
>
> +/* Wrapper for use_register_for_decl, that special-cases the
> +   .result_ptr as the function's RESULT_DECL when the RESULT_DECL is
> +   passed by reference.  */
> +
> +static bool
> +use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm)
> +{
> +  if (parm == all->function_result_decl)
> +    {
> +      tree result = DECL_RESULT (current_function_decl);
> +
> +      if (DECL_BY_REFERENCE (result))
> +       parm = result;
> +    }
> +
> +  return use_register_for_decl (parm);
> +}
> +
> +/* Wrapper for get_rtl_for_parm_ssa_default_def, that special-cases
> +   the .result_ptr as the function's RESULT_DECL when the RESULT_DECL
> +   is passed by reference.  */
> +
> +static rtx
> +rtl_for_parm (struct assign_parm_data_all *all, tree parm)
> +{
> +  if (parm == all->function_result_decl)
> +    {
> +      tree result = DECL_RESULT (current_function_decl);
> +
> +      if (!DECL_BY_REFERENCE (result))
> +       return NULL_RTX;
> +
> +      parm = result;
> +    }
> +
> +  return get_rtl_for_parm_ssa_default_def (parm);
> +}
> +
> +/* Reset the location of PARM_DECLs and RESULT_DECLs that had
> +   SSA_NAMEs in multiple partitions, so that assign_parms will choose
> +   the default def, if it exists, or create new RTL to hold the unused
> +   entry value.  If we are coalescing across variables, we want to
> +   reset the location too, because a parm without a default def
> +   (incoming value unused) might be coalesced with one with a default
> +   def, and then assign_parms would copy both incoming values to the
> +   same location, which might cause the wrong value to survive.  */
> +static void
> +maybe_reset_rtl_for_parm (tree parm)
> +{
> +  gcc_assert (TREE_CODE (parm) == PARM_DECL
> +             || TREE_CODE (parm) == RESULT_DECL);
> +  if ((flag_tree_coalesce_vars
> +       || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx))
> +      && is_gimple_reg (parm))
> +    SET_DECL_RTL (parm, NULL_RTX);
> +}
> +
>  /* A subroutine of assign_parms.  Adjust DATA->STACK_RTL such that it's
>     always valid and properly aligned.  */
>
>  static void
> -assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data)
> +assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm,
> +                             struct assign_parm_data_one *data)
>  {
>    rtx stack_parm = data->stack_parm;
>
> +  /* If out-of-SSA assigned RTL to the parm default def, make sure we
> +     don't use what we might have computed before.  */
> +  rtx ssa_assigned = rtl_for_parm (all, parm);
> +  if (ssa_assigned)
> +    stack_parm = NULL;
> +
>    /* If we can't trust the parm stack slot to be aligned enough for its
>       ultimate type, don't use that slot after entry.  We'll make another
>       stack slot, if we need one.  */
> -  if (stack_parm
> -      && ((STRICT_ALIGNMENT
> -          && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm))
> -         || (data->nominal_type
> -             && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
> -             && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
> +  else if (stack_parm
> +          && ((STRICT_ALIGNMENT
> +               && (GET_MODE_ALIGNMENT (data->nominal_mode)
> +                   > MEM_ALIGN (stack_parm)))
> +              || (data->nominal_type
> +                  && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
> +                  && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
>      stack_parm = NULL;
>
>    /* If parm was passed in memory, and we need to convert it on entry,
> @@ -2882,11 +2972,16 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>
>    size = int_size_in_bytes (data->passed_type);
>    size_stored = CEIL_ROUND (size, UNITS_PER_WORD);
> +
>    if (stack_parm == 0)
>      {
>        DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
> -      stack_parm = assign_stack_local (BLKmode, size_stored,
> -                                      DECL_ALIGN (parm));
> +      stack_parm = rtl_for_parm (all, parm);
> +      if (!stack_parm)
> +       stack_parm = assign_stack_local (BLKmode, size_stored,
> +                                        DECL_ALIGN (parm));
> +      else
> +       stack_parm = copy_rtx (stack_parm);
>        if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
>         PUT_MODE (stack_parm, GET_MODE (entry_parm));
>        set_mem_attributes (stack_parm, parm, 1);
> @@ -3027,10 +3122,19 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>      = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
>                              TREE_TYPE (current_function_decl), 2);
>
> -  parmreg = gen_reg_rtx (promoted_nominal_mode);
> +  rtx from_expand = rtl_for_parm (all, parm);
>
> -  if (!DECL_ARTIFICIAL (parm))
> -    mark_user_reg (parmreg);
> +  if (from_expand && !data->passed_pointer)
> +    {
> +      parmreg = from_expand;
> +      gcc_assert (GET_MODE (parmreg) == promoted_nominal_mode);
> +    }
> +  else
> +    {
> +      parmreg = gen_reg_rtx (promoted_nominal_mode);
> +      if (!DECL_ARTIFICIAL (parm))
> +       mark_user_reg (parmreg);
> +    }
>
>    /* If this was an item that we received a pointer to,
>       set DECL_RTL appropriately.  */
> @@ -3049,6 +3153,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>       assign_parm_find_data_types and expand_expr_real_1.  */
>
>    equiv_stack_parm = data->stack_parm;
> +  if (!equiv_stack_parm)
> +    equiv_stack_parm = data->entry_parm;
>    validated_mem = validize_mem (copy_rtx (data->entry_parm));
>
>    need_conversion = (data->nominal_mode != data->passed_mode
> @@ -3189,11 +3295,17 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>
>    /* If we were passed a pointer but the actual value can safely live
>       in a register, retrieve it and use it directly.  */
> -  if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode)
> +  if (data->passed_pointer
> +      && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode))
>      {
>        /* We can't use nominal_mode, because it will have been set to
>          Pmode above.  We must use the actual mode of the parm.  */
> -      if (use_register_for_decl (parm))
> +      if (from_expand)
> +       {
> +         parmreg = from_expand;
> +         gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm)));
> +       }
> +      else if (use_register_for_decl (parm))
>         {
>           parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm)));
>           mark_user_reg (parmreg);
> @@ -3233,7 +3345,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>
>        /* STACK_PARM is the pointer, not the parm, and PARMREG is
>          now the parm.  */
> -      data->stack_parm = NULL;
> +      data->stack_parm = equiv_stack_parm = NULL;
>      }
>
>    /* Mark the register as eliminable if we did no conversion and it was
> @@ -3243,11 +3355,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>       make here would screw up life analysis for it.  */
>    if (data->nominal_mode == data->passed_mode
>        && !did_conversion
> -      && data->stack_parm != 0
> -      && MEM_P (data->stack_parm)
> +      && equiv_stack_parm != 0
> +      && MEM_P (equiv_stack_parm)
>        && data->locate.offset.var == 0
>        && reg_mentioned_p (virtual_incoming_args_rtx,
> -                         XEXP (data->stack_parm, 0)))
> +                         XEXP (equiv_stack_parm, 0)))
>      {
>        rtx_insn *linsn = get_last_insn ();
>        rtx_insn *sinsn;
> @@ -3260,8 +3372,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>             = GET_MODE_INNER (GET_MODE (parmreg));
>           int regnor = REGNO (XEXP (parmreg, 0));
>           int regnoi = REGNO (XEXP (parmreg, 1));
> -         rtx stackr = adjust_address_nv (data->stack_parm, submode, 0);
> -         rtx stacki = adjust_address_nv (data->stack_parm, submode,
> +         rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0);
> +         rtx stacki = adjust_address_nv (equiv_stack_parm, submode,
>                                           GET_MODE_SIZE (submode));
>
>           /* Scan backwards for the set of the real and
> @@ -3334,6 +3446,13 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
>
>        if (data->stack_parm == 0)
>         {
> +         rtx x = data->stack_parm = rtl_for_parm (all, parm);
> +         if (x)
> +           gcc_assert (GET_MODE (x) == GET_MODE (data->entry_parm));
> +       }
> +
> +      if (data->stack_parm == 0)
> +       {
>           int align = STACK_SLOT_ALIGNMENT (data->passed_type,
>                                             GET_MODE (data->entry_parm),
>                                             TYPE_ALIGN (data->passed_type));
> @@ -3592,6 +3711,8 @@ assign_parms (tree fndecl)
>           DECL_INCOMING_RTL (parm) = DECL_RTL (parm);
>           continue;
>         }
> +      else
> +       maybe_reset_rtl_for_parm (parm);
>
>        /* Estimate stack alignment from parameter alignment.  */
>        if (SUPPORTS_STACK_ALIGNMENT)
> @@ -3641,7 +3762,9 @@ assign_parms (tree fndecl)
>        else
>         set_decl_incoming_rtl (parm, data.entry_parm, false);
>
> -      /* Boudns should be loaded in the particular order to
> +      assign_parm_adjust_stack_rtl (&all, parm, &data);
> +
> +      /* Bounds should be loaded in the particular order to
>          have registers allocated correctly.  Collect info about
>          input bounds and load them later.  */
>        if (POINTER_BOUNDS_TYPE_P (data.passed_type))
> @@ -3658,11 +3781,10 @@ assign_parms (tree fndecl)
>         }
>        else
>         {
> -         assign_parm_adjust_stack_rtl (&data);
> -
>           if (assign_parm_setup_block_p (&data))
>             assign_parm_setup_block (&all, parm, &data);
> -         else if (data.passed_pointer || use_register_for_decl (parm))
> +         else if (data.passed_pointer
> +                  || use_register_for_parm_decl (&all, parm))
>             assign_parm_setup_reg (&all, parm, &data);
>           else
>             assign_parm_setup_stack (&all, parm, &data);
> @@ -5004,7 +5126,9 @@ expand_function_start (tree subr)
>       before any library calls that assign parms might generate.  */
>
>    /* Decide whether to return the value in memory or in a register.  */
> -  if (aggregate_value_p (DECL_RESULT (subr), subr))
> +  tree res = DECL_RESULT (subr);
> +  maybe_reset_rtl_for_parm (res);
> +  if (aggregate_value_p (res, subr))
>      {
>        /* Returning something that won't go in a register.  */
>        rtx value_address = 0;
> @@ -5012,7 +5136,7 @@ expand_function_start (tree subr)
>  #ifdef PCC_STATIC_STRUCT_RETURN
>        if (cfun->returns_pcc_struct)
>         {
> -         int size = int_size_in_bytes (TREE_TYPE (DECL_RESULT (subr)));
> +         int size = int_size_in_bytes (TREE_TYPE (res));
>           value_address = assemble_static_space (size);
>         }
>        else
> @@ -5024,36 +5148,45 @@ expand_function_start (tree subr)
>              it.  */
>           if (sv)
>             {
> -             value_address = gen_reg_rtx (Pmode);
> +             if (DECL_BY_REFERENCE (res))
> +               value_address = get_rtl_for_parm_ssa_default_def (res);
> +             if (!value_address)
> +               value_address = gen_reg_rtx (Pmode);
>               emit_move_insn (value_address, sv);
>             }
>         }
>        if (value_address)
>         {
>           rtx x = value_address;
> -         if (!DECL_BY_REFERENCE (DECL_RESULT (subr)))
> +         if (!DECL_BY_REFERENCE (res))
>             {
> -             x = gen_rtx_MEM (DECL_MODE (DECL_RESULT (subr)), x);
> -             set_mem_attributes (x, DECL_RESULT (subr), 1);
> +             x = get_rtl_for_parm_ssa_default_def (res);
> +             if (!x)
> +               {
> +                 x = gen_rtx_MEM (DECL_MODE (res), value_address);
> +                 set_mem_attributes (x, res, 1);
> +               }
>             }
> -         SET_DECL_RTL (DECL_RESULT (subr), x);
> +         SET_DECL_RTL (res, x);
>         }
>      }
> -  else if (DECL_MODE (DECL_RESULT (subr)) == VOIDmode)
> +  else if (DECL_MODE (res) == VOIDmode)
>      /* If return mode is void, this decl rtl should not be used.  */
> -    SET_DECL_RTL (DECL_RESULT (subr), NULL_RTX);
> +    SET_DECL_RTL (res, NULL_RTX);
>    else
>      {
>        /* Compute the return values into a pseudo reg, which we will copy
>          into the true return register after the cleanups are done.  */
> -      tree return_type = TREE_TYPE (DECL_RESULT (subr));
> -      if (TYPE_MODE (return_type) != BLKmode
> -         && targetm.calls.return_in_msb (return_type))
> +      tree return_type = TREE_TYPE (res);
> +      rtx x = get_rtl_for_parm_ssa_default_def (res);
> +      if (x)
> +       /* Use it.  */;
> +      else if (TYPE_MODE (return_type) != BLKmode
> +              && targetm.calls.return_in_msb (return_type))
>         /* expand_function_end will insert the appropriate padding in
>            this case.  Use the return value's natural (unpadded) mode
>            within the function proper.  */
> -       SET_DECL_RTL (DECL_RESULT (subr),
> -                     gen_reg_rtx (TYPE_MODE (return_type)));
> +       x = gen_reg_rtx (TYPE_MODE (return_type));
>        else
>         {
>           /* In order to figure out what mode to use for the pseudo, we
> @@ -5064,25 +5197,26 @@ expand_function_start (tree subr)
>           /* Structures that are returned in registers are not
>              aggregate_value_p, so we may see a PARALLEL or a REG.  */
>           if (REG_P (hard_reg))
> -           SET_DECL_RTL (DECL_RESULT (subr),
> -                         gen_reg_rtx (GET_MODE (hard_reg)));
> +           x = gen_reg_rtx (GET_MODE (hard_reg));
>           else
>             {
>               gcc_assert (GET_CODE (hard_reg) == PARALLEL);
> -             SET_DECL_RTL (DECL_RESULT (subr), gen_group_rtx (hard_reg));
> +             x = gen_group_rtx (hard_reg);
>             }
>         }
>
> +      SET_DECL_RTL (res, x);
> +
>        /* Set DECL_REGISTER flag so that expand_function_end will copy the
>          result to the real return register(s).  */
> -      DECL_REGISTER (DECL_RESULT (subr)) = 1;
> +      DECL_REGISTER (res) = 1;
>
>        if (chkp_function_instrumented_p (current_function_decl))
>         {
> -         tree return_type = TREE_TYPE (DECL_RESULT (subr));
> +         tree return_type = TREE_TYPE (res);
>           rtx bounds = targetm.calls.chkp_function_value_bounds (return_type,
>                                                                  subr, 1);
> -         SET_DECL_BOUNDS_RTL (DECL_RESULT (subr), bounds);
> +         SET_DECL_BOUNDS_RTL (res, bounds);
>         }
>      }
>
> @@ -5097,7 +5231,9 @@ expand_function_start (tree subr)
>        rtx local, chain;
>       rtx_insn *insn;
>
> -      local = gen_reg_rtx (Pmode);
> +      local = get_rtl_for_parm_ssa_default_def (parm);
> +      if (!local)
> +       local = gen_reg_rtx (Pmode);
>        chain = targetm.calls.static_chain (current_function_decl, true);
>
>        set_decl_incoming_rtl (parm, chain, false);
> diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
> index 4d683d6..d3d1c5f 100644
> --- a/gcc/gimple-expr.c
> +++ b/gcc/gimple-expr.c
> @@ -391,45 +391,6 @@ copy_var_decl (tree var, tree name, tree type)
>    return copy;
>  }
>
> -/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
> -   coalescing together, false otherwise.
> -
> -   This must stay consistent with var_map_base_init in tree-ssa-live.c.  */
> -
> -bool
> -gimple_can_coalesce_p (tree name1, tree name2)
> -{
> -  /* First check the SSA_NAME's associated DECL.  We only want to
> -     coalesce if they have the same DECL or both have no associated DECL.  */
> -  tree var1 = SSA_NAME_VAR (name1);
> -  tree var2 = SSA_NAME_VAR (name2);
> -  var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
> -  var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
> -  if (var1 != var2)
> -    return false;
> -
> -  /* Now check the types.  If the types are the same, then we should
> -     try to coalesce V1 and V2.  */
> -  tree t1 = TREE_TYPE (name1);
> -  tree t2 = TREE_TYPE (name2);
> -  if (t1 == t2)
> -    return true;
> -
> -  /* If the types are not the same, check for a canonical type match.  This
> -     (for example) allows coalescing when the types are fundamentally the
> -     same, but just have different names.
> -
> -     Note pointer types with different address spaces may have the same
> -     canonical type.  Those are rejected for coalescing by the
> -     types_compatible_p check.  */
> -  if (TYPE_CANONICAL (t1)
> -      && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
> -      && types_compatible_p (t1, t2))
> -    return true;
> -
> -  return false;
> -}
> -
>  /* Strip off a legitimate source ending from the input string NAME of
>     length LEN.  Rather than having to know the names used by all of
>     our front ends, we strip off an ending of a period followed by
> diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
> index ed23eb2..3d1c89f 100644
> --- a/gcc/gimple-expr.h
> +++ b/gcc/gimple-expr.h
> @@ -28,7 +28,6 @@ extern gimple_seq gimple_body (tree);
>  extern bool gimple_has_body_p (tree);
>  extern const char *gimple_decl_printable_name (tree, int);
>  extern tree copy_var_decl (tree, tree, tree);
> -extern bool gimple_can_coalesce_p (tree, tree);
>  extern tree create_tmp_var_name (const char *);
>  extern tree create_tmp_var_raw (tree, const char * = NULL);
>  extern tree create_tmp_var (tree, const char * = NULL);
> diff --git a/gcc/opts.c b/gcc/opts.c
> index 9793999..5305299 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -448,12 +448,12 @@ static const struct default_options default_options_table[] =
>      { OPT_LEVELS_1_PLUS, OPT_fsplit_wide_types, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_ccp, NULL, 1 },
>      { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_bit_ccp, NULL, 1 },
> +    { OPT_LEVELS_1_PLUS, OPT_ftree_coalesce_vars, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_dce, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_dominator_opts, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 },
>      { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 },
> -    { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 },
> diff --git a/gcc/passes.def b/gcc/passes.def
> index 4690e23..230e089 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -77,7 +77,6 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_all_early_optimizations);
>        PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
>           NEXT_PASS (pass_remove_cgraph_callee_edges);
> -         NEXT_PASS (pass_rename_ssa_copies);
>           NEXT_PASS (pass_object_sizes);
>           NEXT_PASS (pass_ccp);
>           /* After CCP we rewrite no longer addressed locals into SSA
> @@ -155,7 +154,6 @@ along with GCC; see the file COPYING3.  If not see
>        /* Initial scalar cleanups before alias computation.
>          They ensure memory accesses are not indirect wherever possible.  */
>        NEXT_PASS (pass_strip_predict_hints);
> -      NEXT_PASS (pass_rename_ssa_copies);
>        NEXT_PASS (pass_ccp);
>        /* After CCP we rewrite no longer addressed locals into SSA
>          form if possible.  */
> @@ -183,7 +181,6 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_ch);
>        NEXT_PASS (pass_lower_complex);
>        NEXT_PASS (pass_sra);
> -      NEXT_PASS (pass_rename_ssa_copies);
>        /* The dom pass will also resolve all __builtin_constant_p calls
>           that are still there to 0.  This has to be done after some
>          propagations have already run, but before some more dead code
> @@ -291,7 +288,6 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_fold_builtins);
>        NEXT_PASS (pass_optimize_widening_mul);
>        NEXT_PASS (pass_tail_calls);
> -      NEXT_PASS (pass_rename_ssa_copies);
>        /* FIXME: If DCE is not run before checking for uninitialized uses,
>          we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
>          However, this also causes us to misdiagnose cases that should be
> @@ -326,7 +322,6 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_dce);
>        NEXT_PASS (pass_asan);
>        NEXT_PASS (pass_tsan);
> -      NEXT_PASS (pass_rename_ssa_copies);
>        /* ???  We do want some kind of loop invariant motion, but we possibly
>           need to adjust LIM to be more friendly towards preserving accurate
>          debug information here.  */
> diff --git a/gcc/testsuite/gcc.dg/guality/pr54200.c b/gcc/testsuite/gcc.dg/guality/pr54200.c
> index 9b17187..e1e7293 100644
> --- a/gcc/testsuite/gcc.dg/guality/pr54200.c
> +++ b/gcc/testsuite/gcc.dg/guality/pr54200.c
> @@ -1,6 +1,6 @@
>  /* PR tree-optimization/54200 */
>  /* { dg-do run } */
> -/* { dg-options "-g -fno-var-tracking-assignments" } */
> +/* { dg-options "-g -fno-var-tracking-assignments -fno-tree-coalesce-vars" } */
>
>  int o __attribute__((used));
>
> diff --git a/gcc/testsuite/gcc.dg/ssp-1.c b/gcc/testsuite/gcc.dg/ssp-1.c
> index 5467f4d..db69332 100644
> --- a/gcc/testsuite/gcc.dg/ssp-1.c
> +++ b/gcc/testsuite/gcc.dg/ssp-1.c
> @@ -12,7 +12,7 @@ __stack_chk_fail (void)
>
>  int main ()
>  {
> -  int i;
> +  register int i;
>    char foo[255];
>
>    // smash stack
> diff --git a/gcc/testsuite/gcc.dg/ssp-2.c b/gcc/testsuite/gcc.dg/ssp-2.c
> index 9a7ac32..752fe53 100644
> --- a/gcc/testsuite/gcc.dg/ssp-2.c
> +++ b/gcc/testsuite/gcc.dg/ssp-2.c
> @@ -14,7 +14,7 @@ __stack_chk_fail (void)
>  void
>  overflow()
>  {
> -  int i = 0;
> +  register int i = 0;
>    char foo[30];
>
>    /* Overflow buffer.  */
> diff --git a/gcc/testsuite/gcc.dg/torture/parm-coalesce.c b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
> new file mode 100644
> index 0000000..dbd81c1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
> @@ -0,0 +1,40 @@
> +/* { dg-do run } */
> +
> +#include <stdlib.h>
> +
> +/* Make sure we don't coalesce both incoming parms, one whose incoming
> +   value is unused, to the same location, so as to overwrite one of
> +   them with the incoming value of the other.  */
> +
> +int __attribute__((noinline, noclone))
> +foo (int i, int j)
> +{
> +  j = i; /* The incoming value for J is unused.  */
> +  i = 2;
> +  if (j)
> +    j++;
> +  j += i + 1;
> +  return j;
> +}
> +
> +/* Same as foo, but with swapped parameters.  */
> +int __attribute__((noinline, noclone))
> +bar (int j, int i)
> +{
> +  j = i; /* The incoming value for J is unused.  */
> +  i = 2;
> +  if (j)
> +    j++;
> +  j += i + 1;
> +  return j;
> +}
> +
> +int
> +main (void)
> +{
> +  if (foo (0, 1) != 3)
> +    abort ();
> +  if (bar (1, 0) != 3)
> +    abort ();
> +  return 0;
> +}
> diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
> index e23bc0b..59d91c6 100644
> --- a/gcc/tree-outof-ssa.c
> +++ b/gcc/tree-outof-ssa.c
> @@ -305,7 +305,6 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
>    rtx dest_rtx, seq, x;
>    machine_mode dest_mode, src_mode;
>    int unsignedp;
> -  tree var;
>
>    if (dump_file && (dump_flags & TDF_DETAILS))
>      {
> @@ -327,12 +326,12 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
>
>    start_sequence ();
>
> -  var = SSA_NAME_VAR (partition_to_var (SA.map, dest));
> +  tree name = partition_to_var (SA.map, dest);
>    src_mode = TYPE_MODE (TREE_TYPE (src));
>    dest_mode = GET_MODE (dest_rtx);
> -  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var)));
> +  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (name)));
>    gcc_assert (!REG_P (dest_rtx)
> -             || dest_mode == promote_decl_mode (var, &unsignedp));
> +             || dest_mode == promote_ssa_mode (name, &unsignedp));
>
>    if (src_mode != dest_mode)
>      {
> @@ -708,13 +707,12 @@ elim_backward (elim_graph g, int T)
>  static rtx
>  get_temp_reg (tree name)
>  {
> -  tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
> -  tree type = TREE_TYPE (var);
> +  tree type = TREE_TYPE (name);
>    int unsignedp;
> -  machine_mode reg_mode = promote_decl_mode (var, &unsignedp);
> +  machine_mode reg_mode = promote_ssa_mode (name, &unsignedp);
>    rtx x = gen_reg_rtx (reg_mode);
>    if (POINTER_TYPE_P (type))
> -    mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var))));
> +    mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type)));
>    return x;
>  }
>
> @@ -1014,7 +1012,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
>
>    /* Return to viewing the variable list as just all reference variables after
>       coalescing has been performed.  */
> -  partition_view_normal (map, false);
> +  partition_view_normal (map);
>
>    if (dump_file && (dump_flags & TDF_DETAILS))
>      {
> diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
> index b05a860..9ffa3f1 100644
> --- a/gcc/tree-ssa-coalesce.c
> +++ b/gcc/tree-ssa-coalesce.c
> @@ -58,6 +58,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-ssanames.h"
>  #include "tree-ssa-live.h"
>  #include "tree-ssa-coalesce.h"
> +#include "explow.h"
>  #include "diagnostic-core.h"
>
>
> @@ -830,6 +831,16 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
>    basic_block bb;
>    ssa_op_iter iter;
>    live_track_p live;
> +  basic_block entry;
> +
> +  /* If inter-variable coalescing is enabled, we may attempt to
> +     coalesce variables from different base variables, including
> +     different parameters, so we have to make sure default defs live
> +     at the entry block conflict with each other.  */
> +  if (flag_tree_coalesce_vars)
> +    entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
> +  else
> +    entry = NULL;
>
>    map = live_var_map (liveinfo);
>    graph = ssa_conflicts_new (num_var_partitions (map));
> @@ -888,6 +899,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
>             live_track_process_def (live, result, graph);
>         }
>
> +      /* Pretend there are defs for params' default defs at the start
> +        of the (post-)entry block.  */
> +      if (bb == entry)
> +       {
> +         unsigned base;
> +         bitmap_iterator bi;
> +         EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi)
> +           {
> +             bitmap_iterator bi2;
> +             unsigned part;
> +             EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base],
> +                                       0, part, bi2)
> +               {
> +                 tree var = partition_to_var (map, part);
> +                 if (!SSA_NAME_VAR (var)
> +                     || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL
> +                         && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL)
> +                     || !SSA_NAME_IS_DEFAULT_DEF (var))
> +                   continue;
> +                 live_track_process_def (live, var, graph);
> +               }
> +           }
> +       }
> +
>       live_track_clear_base_vars (live);
>      }
>
> @@ -1156,6 +1191,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
>      {
>        var1 = partition_to_var (map, p1);
>        var2 = partition_to_var (map, p2);
> +
>        z = var_union (map, var1, var2);
>        if (z == NO_PARTITION)
>         {
> @@ -1173,6 +1209,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
>
>        if (debug)
>         fprintf (debug, ": Success -> %d\n", z);
> +
>        return true;
>      }
>
> @@ -1270,6 +1307,330 @@ ssa_name_var_hash::equal (const tree_node *n1, const tree_node *n2)
>  }
>
>
> +/* Output partition map MAP with coalescing plan PART to file F.  */
> +
> +void
> +dump_part_var_map (FILE *f, partition part, var_map map)
> +{
> +  int t;
> +  unsigned x, y;
> +  int p;
> +
> +  fprintf (f, "\nCoalescible Partition map \n\n");
> +
> +  for (x = 0; x < map->num_partitions; x++)
> +    {
> +      if (map->view_to_partition != NULL)
> +       p = map->view_to_partition[x];
> +      else
> +       p = x;
> +
> +      if (ssa_name (p) == NULL_TREE
> +         || virtual_operand_p (ssa_name (p)))
> +        continue;
> +
> +      t = 0;
> +      for (y = 1; y < num_ssa_names; y++)
> +        {
> +         tree var = version_to_var (map, y);
> +         if (!var)
> +           continue;
> +         int q = var_to_partition (map, var);
> +         p = partition_find (part, q);
> +         gcc_assert (map->partition_to_base_index[q]
> +                     == map->partition_to_base_index[p]);
> +
> +         if (p == (int)x)
> +           {
> +             if (t++ == 0)
> +               {
> +                 fprintf (f, "Partition %d, base %d (", x,
> +                          map->partition_to_base_index[q]);
> +                 print_generic_expr (f, partition_to_var (map, q), TDF_SLIM);
> +                 fprintf (f, " - ");
> +               }
> +             fprintf (f, "%d ", y);
> +           }
> +       }
> +      if (t != 0)
> +       fprintf (f, ")\n");
> +    }
> +  fprintf (f, "\n");
> +}
> +
> +/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
> +   coalescing together, false otherwise.
> +
> +   This must stay consistent with var_map_base_init in tree-ssa-live.c.  */
> +
> +bool
> +gimple_can_coalesce_p (tree name1, tree name2)
> +{
> +  /* First check the SSA_NAME's associated DECL.  Without
> +     optimization, we only want to coalesce if they have the same DECL
> +     or both have no associated DECL.  */
> +  tree var1 = SSA_NAME_VAR (name1);
> +  tree var2 = SSA_NAME_VAR (name2);
> +  var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
> +  var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
> +  if (var1 != var2 && !flag_tree_coalesce_vars)
> +    return false;
> +
> +  /* Now check the types.  If the types are the same, then we should
> +     try to coalesce NAME1 and NAME2.  */
> +  tree t1 = TREE_TYPE (name1);
> +  tree t2 = TREE_TYPE (name2);
> +  if (t1 == t2)
> +    {
> +    check_modes:
> +      /* If the base variables are the same, we're good: none of the
> +        other tests below could possibly fail.  */
> +      var1 = SSA_NAME_VAR (name1);
> +      var2 = SSA_NAME_VAR (name2);
> +      if (var1 == var2)
> +       return true;
> +
> +      /* We don't want to coalesce two SSA names if one of the base
> +        variables is supposed to be a register while the other is
> +        supposed to be on the stack.  Anonymous SSA names take
> +        registers, but when not optimizing, user variables should go
> +        on the stack, so coalescing them with the anonymous variable
> +        as the partition leader would end up assigning the user
> +        variable to a register.  Don't do that!  */
> +      bool reg1 = !var1 || use_register_for_decl (var1);
> +      bool reg2 = !var2 || use_register_for_decl (var2);
> +      if (reg1 != reg2)
> +       return false;
> +
> +      /* Check that the promoted modes are the same.  We don't want to
> +        coalesce if the promoted modes would be different.  Only
> +        PARM_DECLs and RESULT_DECLs have different promotion rules,
> +        so skip the test if both are variables or anonymous
> +        SSA_NAMEs.  */
> +      return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
> +       || promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL);
> +    }
> +
> +  /* If the types are not the same, check for a canonical type match.  This
> +     (for example) allows coalescing when the types are fundamentally the
> +     same, but just have different names.
> +
> +     Note pointer types with different address spaces may have the same
> +     canonical type.  Those are rejected for coalescing by the
> +     types_compatible_p check.  */
> +  if (TYPE_CANONICAL (t1)
> +      && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
> +      && types_compatible_p (t1, t2))
> +    goto check_modes;
> +
> +  return false;
> +}
> +
> +/* Fill in MAP's partition_to_base_index, with one index for each
> +   partition of SSA names USED_IN_COPIES and related by CL coalesce
> +   possibilities.  This must match gimple_can_coalesce_p in the
> +   optimized case.  */
> +
> +static void
> +compute_optimized_partition_bases (var_map map, bitmap used_in_copies,
> +                                  coalesce_list_p cl)
> +{
> +  int parts = num_var_partitions (map);
> +  partition tentative = partition_new (parts);
> +
> +  /* Partition the SSA versions so that, for each coalescible
> +     pair, both of its members are in the same partition in
> +     TENTATIVE.  */
> +  gcc_assert (!cl->sorted);
> +  coalesce_pair_p node;
> +  coalesce_iterator_type ppi;
> +  FOR_EACH_PARTITION_PAIR (node, ppi, cl)
> +    {
> +      tree v1 = ssa_name (node->first_element);
> +      int p1 = partition_find (tentative, var_to_partition (map, v1));
> +      tree v2 = ssa_name (node->second_element);
> +      int p2 = partition_find (tentative, var_to_partition (map, v2));
> +
> +      if (p1 == p2)
> +       continue;
> +
> +      partition_union (tentative, p1, p2);
> +    }
> +
> +  /* We have to deal with cost one pairs too.  */
> +  for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next)
> +    {
> +      tree v1 = ssa_name (co->first_element);
> +      int p1 = partition_find (tentative, var_to_partition (map, v1));
> +      tree v2 = ssa_name (co->second_element);
> +      int p2 = partition_find (tentative, var_to_partition (map, v2));
> +
> +      if (p1 == p2)
> +       continue;
> +
> +      partition_union (tentative, p1, p2);
> +    }
> +
> +  /* And also with abnormal edges.  */
> +  basic_block bb;
> +  edge e;
> +  edge_iterator ei;
> +  FOR_EACH_BB_FN (bb, cfun)
> +    {
> +      FOR_EACH_EDGE (e, ei, bb->preds)
> +       if (e->flags & EDGE_ABNORMAL)
> +         {
> +           gphi_iterator gsi;
> +           for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
> +                gsi_next (&gsi))
> +             {
> +               gphi *phi = gsi.phi ();
> +               tree arg = PHI_ARG_DEF (phi, e->dest_idx);
> +               if (SSA_NAME_IS_DEFAULT_DEF (arg)
> +                   && (!SSA_NAME_VAR (arg)
> +                       || TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL))
> +                 continue;
> +
> +               tree res = PHI_RESULT (phi);
> +
> +               int p1 = partition_find (tentative, var_to_partition (map, res));
> +               int p2 = partition_find (tentative, var_to_partition (map, arg));
> +
> +               if (p1 == p2)
> +                 continue;
> +
> +               partition_union (tentative, p1, p2);
> +             }
> +         }
> +    }
> +
> +  map->partition_to_base_index = XCNEWVEC (int, parts);
> +  auto_vec<unsigned int> index_map (parts);
> +  if (parts)
> +    index_map.quick_grow (parts);
> +
> +  const unsigned no_part = -1;
> +  unsigned count = parts;
> +  while (count)
> +    index_map[--count] = no_part;
> +
> +  /* Initialize MAP's mapping from partition to base index, using
> +     as base indices an enumeration of the TENTATIVE partitions in
> +     which each SSA version ended up, so that we compute conflicts
> +     between all SSA versions that ended up in the same potential
> +     coalesce partition.  */
> +  bitmap_iterator bi;
> +  unsigned i;
> +  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
> +    {
> +      int pidx = var_to_partition (map, ssa_name (i));
> +      int base = partition_find (tentative, pidx);
> +      if (index_map[base] != no_part)
> +       continue;
> +      index_map[base] = count++;
> +    }
> +
> +  map->num_basevars = count;
> +
> +  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
> +    {
> +      int pidx = var_to_partition (map, ssa_name (i));
> +      int base = partition_find (tentative, pidx);
> +      gcc_assert (index_map[base] < count);
> +      map->partition_to_base_index[pidx] = index_map[base];
> +    }
> +
> +  if (dump_file && (dump_flags & TDF_DETAILS))
> +    dump_part_var_map (dump_file, tentative, map);
> +
> +  partition_delete (tentative);
> +}
> +
> +/* Hashtable helpers.  */
> +
> +struct tree_int_map_hasher : typed_noop_remove <tree_int_map>
> +{
> +  typedef tree_int_map *value_type;
> +  typedef tree_int_map *compare_type;
> +  static inline hashval_t hash (const tree_int_map *);
> +  static inline bool equal (const tree_int_map *, const tree_int_map *);
> +};
> +
> +inline hashval_t
> +tree_int_map_hasher::hash (const tree_int_map *v)
> +{
> +  return tree_map_base_hash (v);
> +}
> +
> +inline bool
> +tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
> +{
> +  return tree_int_map_eq (v, c);
> +}
> +
> +/* This routine will initialize the basevar fields of MAP with base
> +   names.  Partitions will share the same base if they have the same
> +   SSA_NAME_VAR, or, being anonymous variables, the same type.  This
> +   must match gimple_can_coalesce_p in the non-optimized case.  */
> +
> +static void
> +compute_samebase_partition_bases (var_map map)
> +{
> +  int x, num_part;
> +  tree var;
> +  struct tree_int_map *m, *mapstorage;
> +
> +  num_part = num_var_partitions (map);
> +  hash_table<tree_int_map_hasher> tree_to_index (num_part);
> +  /* We can have at most num_part entries in the hash table, so it's
> +     enough to allocate so many map elements once, saving some malloc
> +     calls.  */
> +  mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
> +
> +  /* If a base table already exists, clear it, otherwise create it.  */
> +  free (map->partition_to_base_index);
> +  map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
> +
> +  /* Build the base variable list, and point partitions at their bases.  */
> +  for (x = 0; x < num_part; x++)
> +    {
> +      struct tree_int_map **slot;
> +      unsigned baseindex;
> +      var = partition_to_var (map, x);
> +      if (SSA_NAME_VAR (var)
> +         && (!VAR_P (SSA_NAME_VAR (var))
> +             || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
> +       m->base.from = SSA_NAME_VAR (var);
> +      else
> +       /* This restricts what anonymous SSA names we can coalesce
> +          as it restricts the sets we compute conflicts for.
> +          Using TREE_TYPE to generate sets is the easiest as
> +          type equivalency also holds for SSA names with the same
> +          underlying decl.
> +
> +          Check gimple_can_coalesce_p when changing this code.  */
> +       m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
> +                       ? TYPE_CANONICAL (TREE_TYPE (var))
> +                       : TREE_TYPE (var));
> +      /* If base variable hasn't been seen, set it up.  */
> +      slot = tree_to_index.find_slot (m, INSERT);
> +      if (!*slot)
> +       {
> +         baseindex = m - mapstorage;
> +         m->to = baseindex;
> +         *slot = m;
> +         m++;
> +       }
> +      else
> +       baseindex = (*slot)->to;
> +      map->partition_to_base_index[x] = baseindex;
> +    }
> +
> +  map->num_basevars = m - mapstorage;
> +
> +  free (mapstorage);
> +}
> +
>  /* Reduce the number of copies by coalescing variables in the function.  Return
>     a partition map with the resulting coalesces.  */
>
> @@ -1286,9 +1647,10 @@ coalesce_ssa_name (void)
>    cl = create_coalesce_list ();
>    map = create_outofssa_var_map (cl, used_in_copies);
>
> -  /* If optimization is disabled, we need to coalesce all the names originating
> -     from the same SSA_NAME_VAR so debug info remains undisturbed.  */
> -  if (!optimize)
> +  /* If this optimization is disabled, we need to coalesce all the
> +     names originating from the same SSA_NAME_VAR so debug info
> +     remains undisturbed.  */
> +  if (!flag_tree_coalesce_vars)
>      {
>        hash_table<ssa_name_var_hash> ssa_name_hash (10);
>
> @@ -1329,8 +1691,13 @@ coalesce_ssa_name (void)
>    if (dump_file && (dump_flags & TDF_DETAILS))
>      dump_var_map (dump_file, map);
>
> -  /* Don't calculate live ranges for variables not in the coalesce list.  */
> -  partition_view_bitmap (map, used_in_copies, true);
> +  partition_view_bitmap (map, used_in_copies);
> +
> +  if (flag_tree_coalesce_vars)
> +    compute_optimized_partition_bases (map, used_in_copies, cl);
> +  else
> +    compute_samebase_partition_bases (map);
> +
>    BITMAP_FREE (used_in_copies);
>
>    if (num_var_partitions (map) < 1)
> @@ -1369,8 +1736,7 @@ coalesce_ssa_name (void)
>
>    /* Now coalesce everything in the list.  */
>    coalesce_partitions (map, graph, cl,
> -                      ((dump_flags & TDF_DETAILS) ? dump_file
> -                                                  : NULL));
> +                      ((dump_flags & TDF_DETAILS) ? dump_file : NULL));
>
>    delete_coalesce_list (cl);
>    ssa_conflicts_delete (graph);
> diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h
> index 99b188a..ae289b4 100644
> --- a/gcc/tree-ssa-coalesce.h
> +++ b/gcc/tree-ssa-coalesce.h
> @@ -21,5 +21,6 @@ along with GCC; see the file COPYING3.  If not see
>  #define GCC_TREE_SSA_COALESCE_H
>
>  extern var_map coalesce_ssa_name (void);
> +extern bool gimple_can_coalesce_p (tree, tree);
>
>  #endif /* GCC_TREE_SSA_COALESCE_H */
> diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c
> deleted file mode 100644
> index f3cb56e..0000000
> --- a/gcc/tree-ssa-copyrename.c
> +++ /dev/null
> @@ -1,499 +0,0 @@
> -/* Rename SSA copies.
> -   Copyright (C) 2004-2015 Free Software Foundation, Inc.
> -   Contributed by Andrew MacLeod <amacleod@redhat.com>
> -
> -This file is part of GCC.
> -
> -GCC is free software; you can redistribute it and/or modify
> -it under the terms of the GNU General Public License as published by
> -the Free Software Foundation; either version 3, or (at your option)
> -any later version.
> -
> -GCC is distributed in the hope that it will be useful,
> -but WITHOUT ANY WARRANTY; without even the implied warranty of
> -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> -GNU General Public License for more details.
> -
> -You should have received a copy of the GNU General Public License
> -along with GCC; see the file COPYING3.  If not see
> -<http://www.gnu.org/licenses/>.  */
> -
> -#include "config.h"
> -#include "system.h"
> -#include "coretypes.h"
> -#include "tm.h"
> -#include "hash-set.h"
> -#include "machmode.h"
> -#include "vec.h"
> -#include "double-int.h"
> -#include "input.h"
> -#include "alias.h"
> -#include "symtab.h"
> -#include "wide-int.h"
> -#include "inchash.h"
> -#include "tree.h"
> -#include "fold-const.h"
> -#include "predict.h"
> -#include "hard-reg-set.h"
> -#include "function.h"
> -#include "dominance.h"
> -#include "cfg.h"
> -#include "basic-block.h"
> -#include "tree-ssa-alias.h"
> -#include "internal-fn.h"
> -#include "gimple-expr.h"
> -#include "is-a.h"
> -#include "gimple.h"
> -#include "gimple-iterator.h"
> -#include "flags.h"
> -#include "tree-pretty-print.h"
> -#include "bitmap.h"
> -#include "gimple-ssa.h"
> -#include "stringpool.h"
> -#include "tree-ssanames.h"
> -#include "hashtab.h"
> -#include "rtl.h"
> -#include "statistics.h"
> -#include "real.h"
> -#include "fixed-value.h"
> -#include "insn-config.h"
> -#include "expmed.h"
> -#include "dojump.h"
> -#include "explow.h"
> -#include "calls.h"
> -#include "emit-rtl.h"
> -#include "varasm.h"
> -#include "stmt.h"
> -#include "expr.h"
> -#include "tree-dfa.h"
> -#include "tree-inline.h"
> -#include "tree-ssa-live.h"
> -#include "tree-pass.h"
> -#include "langhooks.h"
> -
> -static struct
> -{
> -  /* Number of copies coalesced.  */
> -  int coalesced;
> -} stats;
> -
> -/* The following routines implement the SSA copy renaming phase.
> -
> -   This optimization looks for copies between 2 SSA_NAMES, either through a
> -   direct copy, or an implicit one via a PHI node result and its arguments.
> -
> -   Each copy is examined to determine if it is possible to rename the base
> -   variable of one of the operands to the same variable as the other operand.
> -   i.e.
> -   T.3_5 = <blah>
> -   a_1 = T.3_5
> -
> -   If this copy couldn't be copy propagated, it could possibly remain in the
> -   program throughout the optimization phases.   After SSA->normal, it would
> -   become:
> -
> -   T.3 = <blah>
> -   a = T.3
> -
> -   Since T.3_5 is distinct from all other SSA versions of T.3, there is no
> -   fundamental reason why the base variable needs to be T.3, subject to
> -   certain restrictions.  This optimization attempts to determine if we can
> -   change the base variable on copies like this, and result in code such as:
> -
> -   a_5 = <blah>
> -   a_1 = a_5
> -
> -   This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is
> -   possible, the copy goes away completely. If it isn't possible, a new temp
> -   will be created for a_5, and you will end up with the exact same code:
> -
> -   a.8 = <blah>
> -   a = a.8
> -
> -   The other benefit of performing this optimization relates to what variables
> -   are chosen in copies.  Gimplification of the program uses temporaries for
> -   a lot of things. expressions like
> -
> -   a_1 = <blah>
> -   <blah2> = a_1
> -
> -   get turned into
> -
> -   T.3_5 = <blah>
> -   a_1 = T.3_5
> -   <blah2> = a_1
> -
> -   Copy propagation is done in a forward direction, and if we can propagate
> -   through the copy, we end up with:
> -
> -   T.3_5 = <blah>
> -   <blah2> = T.3_5
> -
> -   The copy is gone, but so is all reference to the user variable 'a'. By
> -   performing this optimization, we would see the sequence:
> -
> -   a_5 = <blah>
> -   a_1 = a_5
> -   <blah2> = a_1
> -
> -   which copy propagation would then turn into:
> -
> -   a_5 = <blah>
> -   <blah2> = a_5
> -
> -   and so we still retain the user variable whenever possible.  */
> -
> -
> -/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
> -   Choose a representative for the partition, and send debug info to DEBUG.  */
> -
> -static void
> -copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
> -{
> -  int p1, p2, p3;
> -  tree root1, root2;
> -  tree rep1, rep2;
> -  bool ign1, ign2, abnorm;
> -
> -  gcc_assert (TREE_CODE (var1) == SSA_NAME);
> -  gcc_assert (TREE_CODE (var2) == SSA_NAME);
> -
> -  register_ssa_partition (map, var1);
> -  register_ssa_partition (map, var2);
> -
> -  p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
> -  p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
> -
> -  if (debug)
> -    {
> -      fprintf (debug, "Try : ");
> -      print_generic_expr (debug, var1, TDF_SLIM);
> -      fprintf (debug, "(P%d) & ", p1);
> -      print_generic_expr (debug, var2, TDF_SLIM);
> -      fprintf (debug, "(P%d)", p2);
> -    }
> -
> -  gcc_assert (p1 != NO_PARTITION);
> -  gcc_assert (p2 != NO_PARTITION);
> -
> -  if (p1 == p2)
> -    {
> -      if (debug)
> -       fprintf (debug, " : Already coalesced.\n");
> -      return;
> -    }
> -
> -  rep1 = partition_to_var (map, p1);
> -  rep2 = partition_to_var (map, p2);
> -  root1 = SSA_NAME_VAR (rep1);
> -  root2 = SSA_NAME_VAR (rep2);
> -  if (!root1 && !root2)
> -    return;
> -
> -  /* Don't coalesce if one of the variables occurs in an abnormal PHI.  */
> -  abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1)
> -           || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2));
> -  if (abnorm)
> -    {
> -      if (debug)
> -       fprintf (debug, " : Abnormal PHI barrier.  No coalesce.\n");
> -      return;
> -    }
> -
> -  /* Partitions already have the same root, simply merge them.  */
> -  if (root1 == root2)
> -    {
> -      p1 = partition_union (map->var_partition, p1, p2);
> -      if (debug)
> -       fprintf (debug, " : Same root, coalesced --> P%d.\n", p1);
> -      return;
> -    }
> -
> -  /* Never attempt to coalesce 2 different parameters.  */
> -  if ((root1 && TREE_CODE (root1) == PARM_DECL)
> -      && (root2 && TREE_CODE (root2) == PARM_DECL))
> -    {
> -      if (debug)
> -        fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n");
> -      return;
> -    }
> -
> -  if ((root1 && TREE_CODE (root1) == RESULT_DECL)
> -      != (root2 && TREE_CODE (root2) == RESULT_DECL))
> -    {
> -      if (debug)
> -        fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n");
> -      return;
> -    }
> -
> -  ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1));
> -  ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2));
> -
> -  /* Refrain from coalescing user variables, if requested.  */
> -  if (!ign1 && !ign2)
> -    {
> -      if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2))
> -       ign2 = true;
> -      else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1))
> -       ign1 = true;
> -      else if (flag_ssa_coalesce_vars != 2)
> -       {
> -         if (debug)
> -           fprintf (debug, " : 2 different USER vars. No coalesce.\n");
> -         return;
> -       }
> -      else
> -       ign2 = true;
> -    }
> -
> -  /* If both values have default defs, we can't coalesce.  If only one has a
> -     tag, make sure that variable is the new root partition.  */
> -  if (root1 && ssa_default_def (cfun, root1))
> -    {
> -      if (root2 && ssa_default_def (cfun, root2))
> -       {
> -         if (debug)
> -           fprintf (debug, " : 2 default defs. No coalesce.\n");
> -         return;
> -       }
> -      else
> -        {
> -         ign2 = true;
> -         ign1 = false;
> -       }
> -    }
> -  else if (root2 && ssa_default_def (cfun, root2))
> -    {
> -      ign1 = true;
> -      ign2 = false;
> -    }
> -
> -  /* Do not coalesce if we cannot assign a symbol to the partition.  */
> -  if (!(!ign2 && root2)
> -      && !(!ign1 && root1))
> -    {
> -      if (debug)
> -       fprintf (debug, " : Choosen variable has no root.  No coalesce.\n");
> -      return;
> -    }
> -
> -  /* Don't coalesce if the new chosen root variable would be read-only.
> -     If both ign1 && ign2, then the root var of the larger partition
> -     wins, so reject in that case if any of the root vars is TREE_READONLY.
> -     Otherwise reject only if the root var, on which replace_ssa_name_symbol
> -     will be called below, is readonly.  */
> -  if (((root1 && TREE_READONLY (root1)) && ign2)
> -      || ((root2 && TREE_READONLY (root2)) && ign1))
> -    {
> -      if (debug)
> -       fprintf (debug, " : Readonly variable.  No coalesce.\n");
> -      return;
> -    }
> -
> -  /* Don't coalesce if the two variables aren't type compatible .  */
> -  if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2))
> -      /* There is a disconnect between the middle-end type-system and
> -         VRP, avoid coalescing enum types with different bounds.  */
> -      || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE
> -          || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE)
> -         && TREE_TYPE (var1) != TREE_TYPE (var2)))
> -    {
> -      if (debug)
> -       fprintf (debug, " : Incompatible types.  No coalesce.\n");
> -      return;
> -    }
> -
> -  /* Merge the two partitions.  */
> -  p3 = partition_union (map->var_partition, p1, p2);
> -
> -  /* Set the root variable of the partition to the better choice, if there is
> -     one.  */
> -  if (!ign2 && root2)
> -    replace_ssa_name_symbol (partition_to_var (map, p3), root2);
> -  else if (!ign1 && root1)
> -    replace_ssa_name_symbol (partition_to_var (map, p3), root1);
> -  else
> -    gcc_unreachable ();
> -
> -  if (debug)
> -    {
> -      fprintf (debug, " --> P%d ", p3);
> -      print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)),
> -                         TDF_SLIM);
> -      fprintf (debug, "\n");
> -    }
> -}
> -
> -
> -namespace {
> -
> -const pass_data pass_data_rename_ssa_copies =
> -{
> -  GIMPLE_PASS, /* type */
> -  "copyrename", /* name */
> -  OPTGROUP_NONE, /* optinfo_flags */
> -  TV_TREE_COPY_RENAME, /* tv_id */
> -  ( PROP_cfg | PROP_ssa ), /* properties_required */
> -  0, /* properties_provided */
> -  0, /* properties_destroyed */
> -  0, /* todo_flags_start */
> -  0, /* todo_flags_finish */
> -};
> -
> -class pass_rename_ssa_copies : public gimple_opt_pass
> -{
> -public:
> -  pass_rename_ssa_copies (gcc::context *ctxt)
> -    : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt)
> -  {}
> -
> -  /* opt_pass methods: */
> -  opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); }
> -  virtual bool gate (function *) { return flag_tree_copyrename != 0; }
> -  virtual unsigned int execute (function *);
> -
> -}; // class pass_rename_ssa_copies
> -
> -/* This function will make a pass through the IL, and attempt to coalesce any
> -   SSA versions which occur in PHI's or copies.  Coalescing is accomplished by
> -   changing the underlying root variable of all coalesced version.  This will
> -   then cause the SSA->normal pass to attempt to coalesce them all to the same
> -   variable.  */
> -
> -unsigned int
> -pass_rename_ssa_copies::execute (function *fun)
> -{
> -  var_map map;
> -  basic_block bb;
> -  tree var, part_var;
> -  gimple stmt;
> -  unsigned x;
> -  FILE *debug;
> -
> -  memset (&stats, 0, sizeof (stats));
> -
> -  if (dump_file && (dump_flags & TDF_DETAILS))
> -    debug = dump_file;
> -  else
> -    debug = NULL;
> -
> -  map = init_var_map (num_ssa_names);
> -
> -  FOR_EACH_BB_FN (bb, fun)
> -    {
> -      /* Scan for real copies.  */
> -      for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
> -          gsi_next (&gsi))
> -       {
> -         stmt = gsi_stmt (gsi);
> -         if (gimple_assign_ssa_name_copy_p (stmt))
> -           {
> -             tree lhs = gimple_assign_lhs (stmt);
> -             tree rhs = gimple_assign_rhs1 (stmt);
> -
> -             copy_rename_partition_coalesce (map, lhs, rhs, debug);
> -           }
> -       }
> -    }
> -
> -  FOR_EACH_BB_FN (bb, fun)
> -    {
> -      /* Treat PHI nodes as copies between the result and each argument.  */
> -      for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
> -          gsi_next (&gsi))
> -        {
> -          size_t i;
> -         tree res;
> -         gphi *phi = gsi.phi ();
> -         res = gimple_phi_result (phi);
> -
> -         /* Do not process virtual SSA_NAMES.  */
> -         if (virtual_operand_p (res))
> -           continue;
> -
> -         /* Make sure to only use the same partition for an argument
> -            as the result but never the other way around.  */
> -         if (SSA_NAME_VAR (res)
> -             && !DECL_IGNORED_P (SSA_NAME_VAR (res)))
> -           for (i = 0; i < gimple_phi_num_args (phi); i++)
> -             {
> -               tree arg = PHI_ARG_DEF (phi, i);
> -               if (TREE_CODE (arg) == SSA_NAME)
> -                 copy_rename_partition_coalesce (map, res, arg,
> -                                                 debug);
> -             }
> -         /* Else if all arguments are in the same partition try to merge
> -            it with the result.  */
> -         else
> -           {
> -             int all_p_same = -1;
> -             int p = -1;
> -             for (i = 0; i < gimple_phi_num_args (phi); i++)
> -               {
> -                 tree arg = PHI_ARG_DEF (phi, i);
> -                 if (TREE_CODE (arg) != SSA_NAME)
> -                   {
> -                     all_p_same = 0;
> -                     break;
> -                   }
> -                 else if (all_p_same == -1)
> -                   {
> -                     p = partition_find (map->var_partition,
> -                                         SSA_NAME_VERSION (arg));
> -                     all_p_same = 1;
> -                   }
> -                 else if (all_p_same == 1
> -                          && p != partition_find (map->var_partition,
> -                                                  SSA_NAME_VERSION (arg)))
> -                   {
> -                     all_p_same = 0;
> -                     break;
> -                   }
> -               }
> -             if (all_p_same == 1)
> -               copy_rename_partition_coalesce (map, res,
> -                                               PHI_ARG_DEF (phi, 0),
> -                                               debug);
> -           }
> -        }
> -    }
> -
> -  if (debug)
> -    dump_var_map (debug, map);
> -
> -  /* Now one more pass to make all elements of a partition share the same
> -     root variable.  */
> -
> -  for (x = 1; x < num_ssa_names; x++)
> -    {
> -      part_var = partition_to_var (map, x);
> -      if (!part_var)
> -        continue;
> -      var = ssa_name (x);
> -      if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var))
> -       continue;
> -      if (debug)
> -        {
> -         fprintf (debug, "Coalesced ");
> -         print_generic_expr (debug, var, TDF_SLIM);
> -         fprintf (debug, " to ");
> -         print_generic_expr (debug, part_var, TDF_SLIM);
> -         fprintf (debug, "\n");
> -       }
> -      stats.coalesced++;
> -      replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var));
> -    }
> -
> -  statistics_counter_event (fun, "copies coalesced",
> -                           stats.coalesced);
> -  delete_var_map (map);
> -  return 0;
> -}
> -
> -} // anon namespace
> -
> -gimple_opt_pass *
> -make_pass_rename_ssa_copies (gcc::context *ctxt)
> -{
> -  return new pass_rename_ssa_copies (ctxt);
> -}
> diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
> index 2c7c072..821b2f4 100644
> --- a/gcc/tree-ssa-live.c
> +++ b/gcc/tree-ssa-live.c
> @@ -100,90 +100,6 @@ static void  verify_live_on_entry (tree_live_info_p);
>     ssa_name or variable, and vice versa.  */
>
>
> -/* Hashtable helpers.  */
> -
> -struct tree_int_map_hasher : typed_noop_remove <tree_int_map>
> -{
> -  typedef tree_int_map *value_type;
> -  typedef tree_int_map *compare_type;
> -  static inline hashval_t hash (const tree_int_map *);
> -  static inline bool equal (const tree_int_map *, const tree_int_map *);
> -};
> -
> -inline hashval_t
> -tree_int_map_hasher::hash (const tree_int_map *v)
> -{
> -  return tree_map_base_hash (v);
> -}
> -
> -inline bool
> -tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
> -{
> -  return tree_int_map_eq (v, c);
> -}
> -
> -
> -/* This routine will initialize the basevar fields of MAP.  */
> -
> -static void
> -var_map_base_init (var_map map)
> -{
> -  int x, num_part;
> -  tree var;
> -  struct tree_int_map *m, *mapstorage;
> -
> -  num_part = num_var_partitions (map);
> -  hash_table<tree_int_map_hasher> tree_to_index (num_part);
> -  /* We can have at most num_part entries in the hash tables, so it's
> -     enough to allocate so many map elements once, saving some malloc
> -     calls.  */
> -  mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
> -
> -  /* If a base table already exists, clear it, otherwise create it.  */
> -  free (map->partition_to_base_index);
> -  map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
> -
> -  /* Build the base variable list, and point partitions at their bases.  */
> -  for (x = 0; x < num_part; x++)
> -    {
> -      struct tree_int_map **slot;
> -      unsigned baseindex;
> -      var = partition_to_var (map, x);
> -      if (SSA_NAME_VAR (var)
> -         && (!VAR_P (SSA_NAME_VAR (var))
> -             || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
> -       m->base.from = SSA_NAME_VAR (var);
> -      else
> -       /* This restricts what anonymous SSA names we can coalesce
> -          as it restricts the sets we compute conflicts for.
> -          Using TREE_TYPE to generate sets is the easies as
> -          type equivalency also holds for SSA names with the same
> -          underlying decl.
> -
> -          Check gimple_can_coalesce_p when changing this code.  */
> -       m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
> -                       ? TYPE_CANONICAL (TREE_TYPE (var))
> -                       : TREE_TYPE (var));
> -      /* If base variable hasn't been seen, set it up.  */
> -      slot = tree_to_index.find_slot (m, INSERT);
> -      if (!*slot)
> -       {
> -         baseindex = m - mapstorage;
> -         m->to = baseindex;
> -         *slot = m;
> -         m++;
> -       }
> -      else
> -       baseindex = (*slot)->to;
> -      map->partition_to_base_index[x] = baseindex;
> -    }
> -
> -  map->num_basevars = m - mapstorage;
> -
> -  free (mapstorage);
> -}
> -
> -
>  /* Remove the base table in MAP.  */
>
>  static void
> @@ -361,21 +277,17 @@ partition_view_fini (var_map map, bitmap selected)
>  }
>
>
> -/* Create a partition view which includes all the used partitions in MAP.  If
> -   WANT_BASES is true, create the base variable map as well.  */
> +/* Create a partition view which includes all the used partitions in MAP.  */
>
>  void
> -partition_view_normal (var_map map, bool want_bases)
> +partition_view_normal (var_map map)
>  {
>    bitmap used;
>
>    used = partition_view_init (map);
>    partition_view_fini (map, used);
>
> -  if (want_bases)
> -    var_map_base_init (map);
> -  else
> -    var_map_base_fini (map);
> +  var_map_base_fini (map);
>  }
>
>
> @@ -384,7 +296,7 @@ partition_view_normal (var_map map, bool want_bases)
>     as well.  */
>
>  void
> -partition_view_bitmap (var_map map, bitmap only, bool want_bases)
> +partition_view_bitmap (var_map map, bitmap only)
>  {
>    bitmap used;
>    bitmap new_partitions = BITMAP_ALLOC (NULL);
> @@ -400,10 +312,7 @@ partition_view_bitmap (var_map map, bitmap only, bool want_bases)
>      }
>    partition_view_fini (map, new_partitions);
>
> -  if (want_bases)
> -    var_map_base_init (map);
> -  else
> -    var_map_base_fini (map);
> +  var_map_base_fini (map);
>  }
>
>
> diff --git a/gcc/tree-ssa-live.h b/gcc/tree-ssa-live.h
> index d5d7820..1f88358 100644
> --- a/gcc/tree-ssa-live.h
> +++ b/gcc/tree-ssa-live.h
> @@ -71,8 +71,8 @@ typedef struct _var_map
>  extern var_map init_var_map (int);
>  extern void delete_var_map (var_map);
>  extern int var_union (var_map, tree, tree);
> -extern void partition_view_normal (var_map, bool);
> -extern void partition_view_bitmap (var_map, bitmap, bool);
> +extern void partition_view_normal (var_map);
> +extern void partition_view_bitmap (var_map, bitmap);
>  extern void dump_scope_blocks (FILE *, int);
>  extern void debug_scope_block (tree, int);
>  extern void debug_scope_blocks (int);
> diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
> index 3f6bebe..7bef8cf 100644
> --- a/gcc/tree-ssa-loop-niter.c
> +++ b/gcc/tree-ssa-loop-niter.c
> @@ -3964,7 +3964,11 @@ loop_exits_before_overflow (tree base, tree step,
>         if (!CONVERT_EXPR_P (e) || !operand_equal_p (e, unsigned_base, 0))
>           continue;
>         e = TREE_OPERAND (e, 0);
> -       gcc_assert (operand_equal_p (e, base, 0));
> +       /* If E has an unsigned type, the operand equality test below
> +          would fail, but the equality test above would have already
> +          verified the equality, so we can proceed with it.  */
> +       gcc_assert (TYPE_UNSIGNED (TREE_TYPE (e))
> +                   || operand_equal_p (e, base, 0));
>         if (tree_int_cst_sign_bit (step))
>           {
>             code = LT_EXPR;
> diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c
> index f75a7f1..0982305 100644
> --- a/gcc/tree-ssa-uncprop.c
> +++ b/gcc/tree-ssa-uncprop.c
> @@ -59,6 +59,11 @@ along with GCC; see the file COPYING3.  If not see
>  #include "domwalk.h"
>  #include "tree-pass.h"
>  #include "tree-ssa-propagate.h"
> +#include "bitmap.h"
> +#include "stringpool.h"
> +#include "tree-ssanames.h"
> +#include "tree-ssa-live.h"
> +#include "tree-ssa-coalesce.h"
>
>  /* The basic structure describing an equivalency created by traversing
>     an edge.  Traversing the edge effectively means that we can assume
> diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
> index 0b24007..acdcd46 100644
> --- a/gcc/var-tracking.c
> +++ b/gcc/var-tracking.c
> @@ -4931,12 +4931,16 @@ dataflow_set_remove_mem_locs (variable_def **slot, dataflow_set *set)
>     registers, as well as associations between MEMs and VALUEs.  */
>
>  static void
> -dataflow_set_clear_at_call (dataflow_set *set)
> +dataflow_set_clear_at_call (dataflow_set *set, rtx_insn *call_insn)
>  {
>    unsigned int r;
>    hard_reg_set_iterator hrsi;
> +  HARD_REG_SET invalidated_regs;
>
> -  EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, r, hrsi)
> +  get_call_reg_set_usage (call_insn, &invalidated_regs,
> +                         regs_invalidated_by_call);
> +
> +  EXECUTE_IF_SET_IN_HARD_REG_SET (invalidated_regs, 0, r, hrsi)
>      var_regno_delete (set, r);
>
>    if (MAY_HAVE_DEBUG_INSNS)
> @@ -6720,7 +6724,7 @@ compute_bb_dataflow (basic_block bb)
>        switch (mo->type)
>         {
>           case MO_CALL:
> -           dataflow_set_clear_at_call (out);
> +           dataflow_set_clear_at_call (out, insn);
>             break;
>
>           case MO_USE:
> @@ -9182,7 +9186,7 @@ emit_notes_in_bb (basic_block bb, dataflow_set *set)
>        switch (mo->type)
>         {
>           case MO_CALL:
> -           dataflow_set_clear_at_call (set);
> +           dataflow_set_clear_at_call (set, insn);
>             emit_notes_for_changes (insn, EMIT_NOTE_AFTER_CALL_INSN, set->vars);
>             {
>               rtx arguments = mo->u.loc, *p = &arguments;
>
>
>
> And here's the incremental patch:
>
> ---
>  gcc/alias.c               |   17 +++++++------
>  gcc/cfgexpand.c           |   57 +++++++++++++++++----------------------------
>  gcc/emit-rtl.c            |    2 --
>  gcc/explow.c              |    3 --
>  gcc/expr.c                |   16 +++++--------
>  gcc/function.c            |   15 ++++++++++++
>  gcc/gimple-expr.h         |    4 ---
>  gcc/tree-outof-ssa.c      |    7 ++----
>  gcc/tree-ssa-coalesce.h   |    1 +
>  gcc/tree-ssa-loop-niter.c |    6 ++++-
>  gcc/tree-ssa-uncprop.c    |    5 ++++
>  11 files changed, 64 insertions(+), 69 deletions(-)
>
> diff --git a/gcc/alias.c b/gcc/alias.c
> index 7a74e81..5a031d9 100644
> --- a/gcc/alias.c
> +++ b/gcc/alias.c
> @@ -2553,14 +2553,15 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
>      return 0;
>
>    /* If we refer to different gimple registers, or one gimple register
> -     and one non-gimple-register, we know they can't overlap.  Now,
> -     there could be more than one stack slot for (different versions
> -     of) the same gimple register, but we can presumably tell they
> -     don't overlap based on offsets from stack base addresses
> -     elsewhere.  It's important that we don't proceed to DECL_RTL,
> -     because gimple registers may not pass DECL_RTL_SET_P, and
> -     make_decl_rtl won't be able to do anything about them since no
> -     SSA information will have remained to guide it.  */
> +     and one non-gimple-register, we know they can't overlap.  First,
> +     gimple registers don't have their addresses taken.  Now, there
> +     could be more than one stack slot for (different versions of) the
> +     same gimple register, but we can presumably tell they don't
> +     overlap based on offsets from stack base addresses elsewhere.
> +     It's important that we don't proceed to DECL_RTL, because gimple
> +     registers may not pass DECL_RTL_SET_P, and make_decl_rtl won't be
> +     able to do anything about them since no SSA information will have
> +     remained to guide it.  */
>    if (is_gimple_reg (exprx) || is_gimple_reg (expry))
>      return exprx != expry;
>
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index 3e80b4a..bf972fc 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -179,11 +179,10 @@ gimple_assign_rhs_to_tree (gimple stmt)
>
>  #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
>
> -/* NEXT is a DECL to be associated with some RTL, CUR is a DECL or a
> -   TREE_LIST of DECLs.  If NEXT is covered by CUR, return CUR
> -   unchanged.  Otherwise, return a list with all entries of CUR, with
> -   NEXT at the end.  If CUR was a list, it will be modified in
> -   place.  */
> +/* Choose either CUR or NEXT as the leader DECL for a partition.
> +   Prefer ignored decls, to simplify debug dumps and reduce ambiguity
> +   out of the same user variable being in multiple partitions (this is
> +   less likely for compiler-introduced temps).  */
>
>  static tree
>  leader_merge (tree cur, tree next)
> @@ -191,26 +190,11 @@ leader_merge (tree cur, tree next)
>    if (cur == NULL || cur == next)
>      return next;
>
> -  tree list;
> +  if (DECL_P (cur) && DECL_IGNORED_P (cur))
> +    return cur;
>
> -  if (TREE_CODE (cur) == TREE_LIST)
> -    {
> -      /* Look for NEXT in the list.  Stop at the last node to insert
> -        there.  */
> -      for (list = cur; ; list = TREE_CHAIN (list))
> -       {
> -         if (TREE_VALUE (list) == next)
> -           return cur;
> -         if (!TREE_CHAIN (list))
> -           break;
> -       }
> -    }
> -  else
> -    /* Create the first node.  */
> -    list = build_tree_list (NULL, cur);
> -
> -  next = build_tree_list (NULL, next);
> -  TREE_CHAIN (list) = next;
> +  if (DECL_P (next) && DECL_IGNORED_P (next))
> +    return next;
>
>    return cur;
>  }
> @@ -285,9 +269,9 @@ set_rtl (tree t, rtx x)
>        if (cur != next)
>         {
>           if (MEM_P (x))
> -           set_mem_attributes (x, SSAVAR (t), true);
> +           set_mem_attributes (x, next, true);
>           else
> -           set_reg_attrs_for_decl_rtl (SSAVAR (t), x);
> +           set_reg_attrs_for_decl_rtl (next, x);
>         }
>      }
>
> @@ -1025,9 +1009,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
>    gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
>
>    x = plus_constant (Pmode, base, offset);
> -  x = gen_rtx_MEM ((TREE_CODE (decl) != SSA_NAME || SSA_NAME_VAR (decl))
> -                  ? DECL_MODE (SSAVAR (decl))
> -                  : TYPE_MODE (TREE_TYPE (decl)), x);
> +  x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
> +                  ? TYPE_MODE (TREE_TYPE (decl))
> +                  : DECL_MODE (SSAVAR (decl)), x);
>
>    if (TREE_CODE (decl) != SSA_NAME)
>      {
> @@ -1268,17 +1252,17 @@ expand_one_stack_var_1 (tree var)
>    HOST_WIDE_INT size, offset;
>    unsigned byte_align;
>
> -  if (TREE_CODE (var) != SSA_NAME || SSA_NAME_VAR (var))
> -    {
> -      size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
> -      byte_align = align_local_variable (SSAVAR (var));
> -    }
> -  else
> +  if (TREE_CODE (var) == SSA_NAME)
>      {
>        tree type = TREE_TYPE (var);
>        size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
>        byte_align = TYPE_ALIGN_UNIT (type);
>      }
> +  else
> +    {
> +      size = tree_to_uhwi (DECL_SIZE_UNIT (var));
> +      byte_align = align_local_variable (var);
> +    }
>
>    /* We handle highly aligned variables in expand_stack_vars.  */
>    gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
> @@ -1423,9 +1407,10 @@ expand_one_register_var (tree var)
>           gcc_assert (REG_P (x));
>           return;
>         }
> +      gcc_unreachable ();
>      }
>
> -  tree decl = SSAVAR (var);
> +  tree decl = var;
>    tree type = TREE_TYPE (decl);
>    machine_mode reg_mode = promote_decl_mode (decl, NULL);
>    rtx x = gen_reg_rtx (reg_mode);
> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
> index 308da40..2b98946 100644
> --- a/gcc/emit-rtl.c
> +++ b/gcc/emit-rtl.c
> @@ -1252,8 +1252,6 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x)
>    if (!t)
>      return;
>    tree tdecl = t;
> -  if (TREE_CODE (t) == TREE_LIST)
> -    tdecl = TREE_VALUE (t);
>    if (GET_CODE (x) == SUBREG)
>      {
>        gcc_assert (subreg_lowpart_p (x));
> diff --git a/gcc/explow.c b/gcc/explow.c
> index e09c032e1..5b0d49c 100644
> --- a/gcc/explow.c
> +++ b/gcc/explow.c
> @@ -866,9 +866,6 @@ promote_ssa_mode (const_tree name, int *punsignedp)
>  {
>    gcc_assert (TREE_CODE (name) == SSA_NAME);
>
> -  if (SSA_NAME_VAR (name))
> -    return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
> -
>    tree type = TREE_TYPE (name);
>    int unsignedp = TYPE_UNSIGNED (type);
>    machine_mode mode = TYPE_MODE (type);
> diff --git a/gcc/expr.c b/gcc/expr.c
> index effe379..5b6e16e 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -9584,20 +9584,16 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>
>           /* Get the signedness to be used for this variable.  Ensure we get
>              the same mode we got when the variable was declared.  */
> -         if (code == SSA_NAME
> -             && (g = SSA_NAME_DEF_STMT (ssa_name))
> -             && gimple_code (g) == GIMPLE_CALL
> -             && !gimple_call_internal_p (g))
> +         if (code != SSA_NAME)
> +           pmode = promote_decl_mode (exp, &unsignedp);
> +         else if ((g = SSA_NAME_DEF_STMT (ssa_name))
> +                  && gimple_code (g) == GIMPLE_CALL
> +                  && !gimple_call_internal_p (g))
>             pmode = promote_function_mode (type, mode, &unsignedp,
>                                            gimple_call_fntype (g),
>                                            2);
> -         else if (!exp)
> -           {
> -             gcc_assert (code == SSA_NAME);
> -             pmode = promote_ssa_mode (ssa_name, &unsignedp);
> -           }
>           else
> -           pmode = promote_decl_mode (exp, &unsignedp);
> +           pmode = promote_ssa_mode (ssa_name, &unsignedp);
>           gcc_assert (GET_MODE (decl_rtl) == pmode);
>
>           temp = gen_lowpart_SUBREG (mode, decl_rtl);
> diff --git a/gcc/function.c b/gcc/function.c
> index dc9e77f..58e2498 100644
> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -2124,6 +2124,21 @@ use_register_for_decl (const_tree decl)
>  {
>    if (TREE_CODE (decl) == SSA_NAME)
>      {
> +      /* We often try to use the SSA_NAME, instead of its underlying
> +        decl, to get type information and guide decisions, to avoid
> +        differences of behavior between anonymous and named
> +        variables, but in this one case we have to go for the actual
> +        variable if there is one.  The main reason is that, at least
> +        at -O0, we want to place user variables on the stack, but we
> +        don't mind using pseudos for anonymous or ignored temps.
> +        Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
> +        should go in pseudos, whereas their corresponding variables
> +        might have to go on the stack.  So, disregarding the decl
> +        here would negatively impact debug info at -O0, enable
> +        coalescing between SSA_NAMEs that ought to get different
> +        stack/pseudo assignments, and get the incoming argument
> +        processing thoroughly confused by PARM_DECLs expected to live
> +        in stack slots but assigned to pseudos.  */
>        if (!SSA_NAME_VAR (decl))
>         return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
>           && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
> diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
> index 146cede..3d1c89f 100644
> --- a/gcc/gimple-expr.h
> +++ b/gcc/gimple-expr.h
> @@ -55,10 +55,6 @@ extern bool is_gimple_mem_ref_addr (tree);
>  extern void mark_addressable (tree);
>  extern bool is_gimple_reg_rhs (tree);
>
> -/* Defined in tree-ssa-coalesce.c.   */
> -extern bool gimple_can_coalesce_p (tree, tree);
> -
> -
>  /* Return true if a conversion from either type of TYPE1 and TYPE2
>     to the other is not required.  Otherwise return false.  */
>
> diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
> index dda9973..59d91c6 100644
> --- a/gcc/tree-outof-ssa.c
> +++ b/gcc/tree-outof-ssa.c
> @@ -305,7 +305,6 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
>    rtx dest_rtx, seq, x;
>    machine_mode dest_mode, src_mode;
>    int unsignedp;
> -  tree var;
>
>    if (dump_file && (dump_flags & TDF_DETAILS))
>      {
> @@ -328,10 +327,9 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
>    start_sequence ();
>
>    tree name = partition_to_var (SA.map, dest);
> -  var = SSA_NAME_VAR (name);
>    src_mode = TYPE_MODE (TREE_TYPE (src));
>    dest_mode = GET_MODE (dest_rtx);
> -  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var ? var : name)));
> +  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (name)));
>    gcc_assert (!REG_P (dest_rtx)
>               || dest_mode == promote_ssa_mode (name, &unsignedp));
>
> @@ -709,8 +707,7 @@ elim_backward (elim_graph g, int T)
>  static rtx
>  get_temp_reg (tree name)
>  {
> -  tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
> -  tree type = var ? TREE_TYPE (var) : TREE_TYPE (name);
> +  tree type = TREE_TYPE (name);
>    int unsignedp;
>    machine_mode reg_mode = promote_ssa_mode (name, &unsignedp);
>    rtx x = gen_reg_rtx (reg_mode);
> diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h
> index 99b188a..ae289b4 100644
> --- a/gcc/tree-ssa-coalesce.h
> +++ b/gcc/tree-ssa-coalesce.h
> @@ -21,5 +21,6 @@ along with GCC; see the file COPYING3.  If not see
>  #define GCC_TREE_SSA_COALESCE_H
>
>  extern var_map coalesce_ssa_name (void);
> +extern bool gimple_can_coalesce_p (tree, tree);
>
>  #endif /* GCC_TREE_SSA_COALESCE_H */
> diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
> index 3f6bebe..7bef8cf 100644
> --- a/gcc/tree-ssa-loop-niter.c
> +++ b/gcc/tree-ssa-loop-niter.c
> @@ -3964,7 +3964,11 @@ loop_exits_before_overflow (tree base, tree step,
>         if (!CONVERT_EXPR_P (e) || !operand_equal_p (e, unsigned_base, 0))
>           continue;
>         e = TREE_OPERAND (e, 0);
> -       gcc_assert (operand_equal_p (e, base, 0));
> +       /* If E has an unsigned type, the operand equality test below
> +          would fail, but the equality test above would have already
> +          verified the equality, so we can proceed with it.  */
> +       gcc_assert (TYPE_UNSIGNED (TREE_TYPE (e))
> +                   || operand_equal_p (e, base, 0));
>         if (tree_int_cst_sign_bit (step))
>           {
>             code = LT_EXPR;
> diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c
> index f75a7f1..0982305 100644
> --- a/gcc/tree-ssa-uncprop.c
> +++ b/gcc/tree-ssa-uncprop.c
> @@ -59,6 +59,11 @@ along with GCC; see the file COPYING3.  If not see
>  #include "domwalk.h"
>  #include "tree-pass.h"
>  #include "tree-ssa-propagate.h"
> +#include "bitmap.h"
> +#include "stringpool.h"
> +#include "tree-ssanames.h"
> +#include "tree-ssa-live.h"
> +#include "tree-ssa-coalesce.h"
>
>  /* The basic structure describing an equivalency created by traversing
>     an edge.  Traversing the edge effectively means that we can assume
>
>
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
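
The selection rule in the rewritten leader_merge from the incremental patch
above can be tried out in isolation.  What follows is only a minimal sketch,
assuming a toy struct in place of GCC's tree: the names toy_decl and
toy_leader_merge are hypothetical, the `ignored' field stands in for
DECL_IGNORED_P, and the extra NULL check on NEXT is an assumption added so
the toy version is safe to call with a missing decl.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a DECL; `ignored' models DECL_IGNORED_P.  */
struct toy_decl
{
  const char *name;
  bool ignored;
};

/* Choose either CUR or NEXT as the leader for a partition, mirroring
   the order of tests in the patched leader_merge: take NEXT when there
   is no leader yet (or it is the same decl), otherwise prefer an
   ignored decl, otherwise keep CUR.  */
static const struct toy_decl *
toy_leader_merge (const struct toy_decl *cur, const struct toy_decl *next)
{
  if (cur == NULL || cur == next)
    return next;

  if (cur->ignored)
    return cur;

  /* Assumption: guard against a missing NEXT in this toy model.  */
  if (next != NULL && next->ignored)
    return next;

  return cur;
}
```

With one ignored temp and two user variables, merging the temp with a user
variable yields the temp regardless of argument order, while merging the two
user variables keeps whichever was the leader first.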


* Re: [PR64164] drop copyrename, integrate into expand
  2015-06-08  8:16               ` Richard Biener
@ 2015-06-09  8:58                 ` Christophe Lyon
  0 siblings, 0 replies; 127+ messages in thread
From: Christophe Lyon @ 2015-06-09  8:58 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: GCC Patches

On 8 June 2015 at 10:14, Richard Biener <richard.guenther@gmail.com> wrote:
> On Sat, Jun 6, 2015 at 3:14 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
>> On Apr 27, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
>>
>>> This should also mention that is_gimple_reg vars do not have their
>>> address taken.
>>
>> check
>>
>>>> +static tree
>>>> +leader_merge (tree cur, tree next)
>>
>>> Ick - presumably you can't use sth better than a TREE_LIST here?
>>
>> The list was an experiment that never really worked, and when I tried to
>> make it work after the patch, it proved to be unworkable, so I dropped
>> it, and rewrote leader_merge to choose either of the params, preferring
>> anonymous over ignored over named, so as to reduce the likelihood of
>> misreading of debug dumps, since that's all they're used for.
>>
>>>> static void
>>>> -expand_one_stack_var (tree var)
>>>> +expand_one_stack_var_1 (tree var)
>>>> {
>>>> HOST_WIDE_INT size, offset;
>>>> unsigned byte_align;
>>>>
>>>> -  size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
>>>> -  byte_align = align_local_variable (SSAVAR (var));
>>>> +  if (TREE_CODE (var) != SSA_NAME || SSA_NAME_VAR (var))
>>>> +    {
>>>> +      size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
>>>> +      byte_align = align_local_variable (SSAVAR (var));
>>>> +    }
>>>> +  else
>>
>>> I'd go here for all TREE_CODE (var) == SSA_NAME
>>
>> Check
>>
>>> (and get rid of the SSAVAR macro?)
>>
>> There are remaining uses that don't seem worth dropping it for.
>>
>>>> +/* Return the promoted mode for name.  If it is a named SSA_NAME, it
>>>> +   is the same as promote_decl_mode.  Otherwise, it is the promoted
>>>> +   mode of a temp decl of same type as the SSA_NAME, if we had created
>>>> +   one.  */
>>>> +
>>>> +machine_mode
>>>> +promote_ssa_mode (const_tree name, int *punsignedp)
>>>> +{
>>>> +  gcc_assert (TREE_CODE (name) == SSA_NAME);
>>>> +
>>>> +  if (SSA_NAME_VAR (name))
>>>> +    return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
>>
>>> As above I'd rather not have different paths for anonymous vs. non-anonymous
>>> vars (so just delete the above two lines).
>>
>> Check
>>
>>>> @@ -9668,6 +9678,11 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>>>> pmode = promote_function_mode (type, mode, &unsignedp,
>>>> gimple_call_fntype (g),
>>>> 2);
>>>> +         else if (!exp)
>>>> +           {
>>>> +             gcc_assert (code == SSA_NAME);
>>
>>> promote_ssa_mode should assert this.
>>
>>>> +             pmode = promote_ssa_mode (ssa_name, &unsignedp);
>>
>> It does, so...  check.
>>
>>
>>>> @@ -2121,6 +2122,15 @@ aggregate_value_p (const_tree exp, const_tree fntype)
>>>> bool
>>>> use_register_for_decl (const_tree decl)
>>>> {
>>>> +  if (TREE_CODE (decl) == SSA_NAME)
>>>> +    {
>>>> +      if (!SSA_NAME_VAR (decl))
>>>> +       return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
>>>> +         && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
>>>> +
>>>> +      decl = SSA_NAME_VAR (decl);
>>
>>> See above.  Please drop the SSA_NAME_VAR != NULL path.
>>
>> Check, then taken back, after a bootstrap failure and some debugging
>> made me realize this would be wrong.  Here are the newly-added comments
>> that explain why:
>>
>>       /* We often try to use the SSA_NAME, instead of its underlying
>>          decl, to get type information and guide decisions, to avoid
>>          differences of behavior between anonymous and named
>>          variables, but in this one case we have to go for the actual
>>          variable if there is one.  The main reason is that, at least
>>          at -O0, we want to place user variables on the stack, but we
>>          don't mind using pseudos for anonymous or ignored temps.
>>          Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
>>          should go in pseudos, whereas their corresponding variables
>>          might have to go on the stack.  So, disregarding the decl
>>          here would negatively impact debug info at -O0, enable
>>          coalescing between SSA_NAMEs that ought to get different
>>          stack/pseudo assignments, and get the incoming argument
>>          processing thoroughly confused by PARM_DECLs expected to live
>>          in stack slots but assigned to pseudos.  */
>>
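
The decision described in the comment above can be sketched roughly as
follows.  This is a simplified model under stated assumptions, not the
patched function.c: the toy_* types and functions are hypothetical, and
the decl rule (user variables go on the stack when not optimizing) is a
deliberately crude stand-in for the real use_register_for_decl tests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not GCC's tree structures.  */
struct toy_decl
{
  bool user_var;   /* Declared by the user, not a compiler temp.  */
};

struct toy_ssa_name
{
  const struct toy_decl *var;   /* NULL for an anonymous SSA name.  */
  bool blk_mode;                /* Type has no register mode.  */
  bool float_type;              /* Type is a floating-point type.  */
};

/* Assumed decl rule: at -O0, user variables live on the stack;
   everything else may use a pseudo.  */
static bool
toy_use_register_for_decl (const struct toy_decl *decl, bool optimize)
{
  return optimize || !decl->user_var;
}

/* Anonymous names are judged by their type alone; named ones defer
   to the underlying decl, as the comment above argues they must.  */
static bool
toy_use_register_for_ssa_name (const struct toy_ssa_name *name,
                               bool optimize, bool float_store)
{
  assert (name != NULL);

  if (!name->var)
    return !name->blk_mode && !(float_store && name->float_type);

  return toy_use_register_for_decl (name->var, optimize);
}
```

With this model, a name backed by a user variable is kept out of pseudos
when not optimizing, while an anonymous name of the same type is not,
which is the asymmetry the comment wants to preserve.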
>>
>>>> +++ b/gcc/gimple-expr.h
>>>> +/* Defined in tree-ssa-coalesce.c.   */
>>>> +extern bool gimple_can_coalesce_p (tree, tree);
>>
>>> Err, put it to tree-ssa-coalesce.h?
>>
>> Check.  Lots of additional headers required to be able to include
>> tree-ssa-coalesce.h, though.
>>
>>
>>>> -  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var)));
>>>> +  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var ? var : name)));
>>
>>> The TREE_TYPE of name and its SSA_NAME_VAR are always the same.  So just
>>> use TREE_TYPE (name) here.
>>
>> Check
>>
>>>> gcc_assert (!REG_P (dest_rtx)
>>>> -             || dest_mode == promote_decl_mode (var, &unsignedp));
>>>> +             || dest_mode == promote_ssa_mode (name, &unsignedp));
>>>>
>>>> if (src_mode != dest_mode)
>>>> {
>>>> @@ -714,12 +715,12 @@ static rtx
>>>> get_temp_reg (tree name)
>>>> {
>>>> tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
>>>> -  tree type = TREE_TYPE (var);
>>>> +  tree type = var ? TREE_TYPE (var) : TREE_TYPE (name);
>>
>>> See above.
>>
>> Check
>>
>>
>> Here's the revised patch, regstrapped on x86_64-linux-gnu and
>> i686-linux-gnu.  The first attempt failed to compile libjava on x86_64,
>> requiring the new change in tree-ssa-loop-niter.c to pass.  It didn't
>> occur in the unpatched tree because the differences between anon or
>> named SSA_NAMEs in copyrename changed costs and caused different choices
>> in ivopts, which ultimately failed to expose the problem in loop-niter
>> during vrp.
>>
>> At the end, I enclose the incremental changes since the previous
>> revision of the patch, to ease the incremental review.
>>
>> Ok to install?
>
> Ok.
>
> Thanks,
> Richard.
>
>>
>> for  gcc/ChangeLog
>>
>>         PR rtl-optimization/64164
>>         * Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
>>         * tree-ssa-copyrename.c: Removed.
>>         * opts.c (default_options_table): Drop -ftree-copyrename.  Add
>>         -ftree-coalesce-vars.
>>         * passes.def: Drop all occurrences of pass_rename_ssa_copies.
>>         * common.opt (ftree-copyrename): Ignore.
>>         (ftree-coalesce-inlined-vars): Likewise.
>>         * doc/invoke.texi: Remove the ignored options above.
>>         * gimple-expr.h (gimple_can_coalesce_p): Move declaration
>>         * tree-ssa-coalesce.h: ... here.
>>         * tree-ssa-uncprop.c: Include tree-ssa-coalesce.h and other
>>         headers required by it.
>>         * gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
>>         across variables when flag_tree_coalesce_vars.  Check register
>>         use and promoted modes to allow coalescing.  Moved to
>>         tree-ssa-coalesce.c.
>>         * tree-ssa-live.c (struct tree_int_map_hasher): Move along
>>         with its member functions to tree-ssa-coalesce.c.
>>         (var_map_base_init): Likewise.  Renamed to
>>         compute_samebase_partition_bases.
>>         (partition_view_normal): Drop want_bases parameter.
>>         (partition_view_bitmap): Likewise.
>>         * tree-ssa-live.h: Adjust declarations.
>>         * tree-ssa-coalesce.c: Include explow.h.
>>         (build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs'
>>         default defs at the entry point.
>>         (dump_part_var_map): New.
>>         (compute_optimized_partition_bases): New, called by...
>>         (coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
>>         of compute_samebase_partition_bases.  Adjust.
>>         * alias.c (nonoverlapping_memrefs_p): Disregard gimple-regs.
>>         * cfgexpand.c (leader_merge): New.
>>         (get_rtl_for_parm_ssa_default_def): New.
>>         (set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
>>         vars.  Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
>>         (expand_one_stack_var_at): Handle anonymous SSA_NAMEs.  Drop
>>         redundant MEM attr setting.
>>         (expand_one_stack_var_1): Handle anonymous SSA_NAMEs.  Renamed
>>         from...
>>         (expand_one_stack_var): ... this.  New wrapper to check and
>>         skip already expanded SSA partitions.
>>         (record_alignment_for_reg_var): New, factored out of...
>>         (expand_one_var): ... this.
>>         (expand_one_ssa_partition): New.
>>         (adjust_one_expanded_partition_var): New.
>>         (expand_one_register_var): Check and skip already expanded SSA
>>         partitions.
>>         (expand_used_vars): Don't create DECLs for anonymous SSA
>>         names.  Expand all SSA partitions, then adjust all SSA names.
>>         (pass::execute): Replace the loops that set
>>         SA.partition_to_pseudo from partition leaders and cleared
>>         DECL_RTL for multi-location variables, and that which used to
>>         rename vars and set attrs, with one that clears DECL_RTL and
>>         checks that PARMs and RESULTs default_defs match DECL_RTL.
>>         * cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
>>         * emit-rtl.c (set_reg_attrs_for_parm): Handle NULL decl.
>>         * explow.c (promote_ssa_mode): New.
>>         * explow.h (promote_ssa_mode): Declare.
>>         * expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
>>         * function.c: Include cfgexpand.h.
>>         (use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
>>         (use_register_for_parm_decl): Wrapper for the above to
>>         special-case the result_ptr.
>>         (rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
>>         (maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
>>         multiple locations.
>>         (assign_parm_adjust_stack_rtl): Add all and parm arguments,
>>         for rtl_for_parm.  For SSA-assigned parms, zero stack_parm.
>>         (assign_parm_setup_block): Prefer SSA-assigned location.
>>         (assign_parm_setup_reg): Likewise.  Use entry_parm for equiv
>>         if stack_parm is NULL.
>>         (assign_parm_setup_stack): Prefer SSA-assigned location.
>>         (assign_parms): Maybe reset DECL_RTL of params.  Adjust stack
>>         rtl before testing for pointer bounds.  Special-case result_ptr.
>>         (expand_function_start): Maybe reset DECL_RTL of result.
>>         Prefer SSA-assigned location for result and static chain.
>>         Factor out DECL_RESULT and SET_DECL_RTL.
>>         * tree-outof-ssa.c (insert_value_copy_on_edge): Handle
>>         anonymous SSA names.  Use promote_ssa_mode.
>>         (get_temp_reg): Likewise.
>>         (remove_ssa_form): Adjust.
>>         * var-tracking.c (dataflow_set_clear_at_call): Take call_insn
>>         and get its reg_usage for reg invalidation.
>>         (compute_bb_dataflow): Pass it insn.
>>         (emit_notes_in_bb): Likewise.
>>         * tree-ssa-loop-niter.c (loop_exits_before_overflow): Don't
>>         fail assert on conversion between unsigned types.
>>

Hi,

This patch causes a GCC build failure during the libgcc compilation
for target armeb-linux-gnueabihf, configured with --with-mode=arm
--with-cpu=cortex-a9 --with-fpu=neon.

Here is the failing command and the backtrace I have:
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/builds/gcc-fsf-trunk/obj-armeb-none-linux-gnueabihf/gcc1/./gcc/xgcc
-B/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/builds/gcc-fsf-trunk/obj-armeb-none-linux-gnueabihf/gcc1/./gcc/
-B/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/builds/gcc-fsf-trunk/tools/armeb-none-linux-gnueabihf/bin/
-B/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/builds/gcc-fsf-trunk/tools/armeb-none-linux-gnueabihf/lib/
-isystem /media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/builds/gcc-fsf-trunk/tools/armeb-none-linux-gnueabihf/include
-isystem /media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/builds/gcc-fsf-trunk/tools/armeb-none-linux-gnueabihf/sys-include
   -g -O2 -O2  -g -O2 -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE  -W -Wall
-Wno-narrowing -Wwrite-strings -Wcast-qual -Wstrict-prototypes
-Wmissing-prototypes -Wold-style-definition  -isystem ./include
-fPIC -fno-inline -g -DIN_LIBGCC2 -fbuilding-libgcc
-fno-stack-protector -Dinhibit_libc  -fPIC -fno-inline -I. -I.
-I../.././gcc -I/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc
-I/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/.
-I/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/../gcc
-I/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/../include
 -DHAVE_CC_TLS  -o _addQQ.o -MT _addQQ.o -MD -MP -MF _addQQ.dep
-DL_add -DQQ_MODE -c
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/fixed-bit.c
-fvisibility=hidden -DHIDE_EXPORTS
In file included from
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/fixed-bit.c:55:0:
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/fixed-bit.c:
In function '__gnu_addqq3':
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/fixed-bit.h:450:31:
internal compiler error: RTL flag check: MEM_VOLATILE_P used with
unexpected rtx code 'reg' in set_mem_attributes_minus_bitpos, at
emit-rtl.c:1787
 #define FIXED_OP(OP,MODE,NUM) __gnu_ ## OP ## MODE ## NUM
                               ^
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/fixed-bit.h:460:30:
note: in expansion of macro 'FIXED_OP'
 #define FIXED_ADD_TEMP(NAME) FIXED_OP(add,NAME,3)
                              ^
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/fixed-bit.h:492:19:
note: in expansion of macro 'FIXED_ADD_TEMP'
 #define FIXED_ADD FIXED_ADD_TEMP(MODE_NAME_S)
                   ^
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/fixed-bit.c:59:1:
note: in expansion of macro 'FIXED_ADD'
 FIXED_ADD (FIXED_C_TYPE a, FIXED_C_TYPE b)
 ^
0xa6eb52 rtl_check_failed_flag(char const*, rtx_def const*, char
const*, int, char const*)
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/gcc/rtl.c:800
0x771fc7 set_mem_attributes_minus_bitpos(rtx_def*, tree_node*, int, long)
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/gcc/emit-rtl.c:1787
0x805294 assign_parm_setup_block
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/gcc/function.c:2977
0x80b65c assign_parms
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/gcc/function.c:3775
0x80e087 expand_function_start(tree_node*)
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/gcc/function.c:5215
0x6a77ed execute
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/gcc/cfgexpand.c:6127
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/fixed-obj.mk:27:
recipe for target '_addQQ.o' failed
make[2]: *** [_addQQ.o] Error 1
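
For context, the check that fires here guards flag accessors against
being applied to the wrong kind of rtx.  Below is a minimal sketch of
that pattern, assuming a much-simplified model: toy_rtx,
toy_mem_volatile_p, toy_set_mem_attributes and toy_setup_block_parm are
hypothetical names for illustration, not GCC's rtl.h or function.c
definitions.  It only shows why handing a REG to a MEM-only routine
trips the check; it does not identify the root cause in
assign_parm_setup_block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for GCC's rtx codes.  */
enum toy_code { TOY_REG, TOY_MEM };

struct toy_rtx
{
  enum toy_code code;
  bool volatil;   /* Only meaningful when code == TOY_MEM.  */
};

/* Counts failed flag checks; the real compiler aborts with an ICE.  */
static int toy_flag_check_failures;

/* Checked accessor: return a pointer to the flag if X is a MEM,
   otherwise record a failure and return NULL.  */
static bool *
toy_mem_volatile_p (struct toy_rtx *x)
{
  if (x->code != TOY_MEM)
    {
      toy_flag_check_failures++;
      return NULL;
    }
  return &x->volatil;
}

/* MEM-only routine.  Returns false if X was not a MEM.  */
static bool
toy_set_mem_attributes (struct toy_rtx *x, bool is_volatile)
{
  bool *flag = toy_mem_volatile_p (x);
  if (!flag)
    return false;
  *flag = is_volatile;
  return true;
}

/* Caller-side guard: only touch MEM attributes when the location
   chosen for the parameter really is a MEM.  */
static bool
toy_setup_block_parm (struct toy_rtx *loc, bool is_volatile)
{
  assert (loc != NULL);
  if (loc->code != TOY_MEM)
    return false;
  return toy_set_mem_attributes (loc, is_volatile);
}
```

In this model the unguarded call on a REG is what records the failure,
which mirrors the backtrace above: set_mem_attributes_minus_bitpos is
reached with an rtx whose code is 'reg'.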

>> for  gcc/testsuite/ChangeLog
>>
>>         * gcc.dg/guality/pr54200.c: Add -fno-tree-coalesce-vars.
>>         * gcc.dg/ssp-1.c: Make counter a register.
>>         * gcc.dg/ssp-2.c: Likewise.
>>         * gcc.dg/torture/parm-coalesce.c: New.
>> ---
>>  gcc/Makefile.in                              |    1
>>  gcc/alias.c                                  |   13 +
>>  gcc/cfgexpand.c                              |  370 ++++++++++++++-----
>>  gcc/cfgexpand.h                              |    2
>>  gcc/common.opt                               |   12 -
>>  gcc/doc/invoke.texi                          |   48 +--
>>  gcc/emit-rtl.c                               |    5
>>  gcc/explow.c                                 |   22 +
>>  gcc/explow.h                                 |    3
>>  gcc/expr.c                                   |   39 +-
>>  gcc/function.c                               |  226 +++++++++---
>>  gcc/gimple-expr.c                            |   39 --
>>  gcc/gimple-expr.h                            |    1
>>  gcc/opts.c                                   |    2
>>  gcc/passes.def                               |    5
>>  gcc/testsuite/gcc.dg/guality/pr54200.c       |    2
>>  gcc/testsuite/gcc.dg/ssp-1.c                 |    2
>>  gcc/testsuite/gcc.dg/ssp-2.c                 |    2
>>  gcc/testsuite/gcc.dg/torture/parm-coalesce.c |   40 ++
>>  gcc/tree-outof-ssa.c                         |   16 -
>>  gcc/tree-ssa-coalesce.c                      |  380 +++++++++++++++++++-
>>  gcc/tree-ssa-coalesce.h                      |    1
>>  gcc/tree-ssa-copyrename.c                    |  499 --------------------------
>>  gcc/tree-ssa-live.c                          |  101 -----
>>  gcc/tree-ssa-live.h                          |    4
>>  gcc/tree-ssa-loop-niter.c                    |    6
>>  gcc/tree-ssa-uncprop.c                       |    5
>>  gcc/var-tracking.c                           |   12 -
>>  28 files changed, 984 insertions(+), 874 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.dg/torture/parm-coalesce.c
>>  delete mode 100644 gcc/tree-ssa-copyrename.c
>>
>> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
>> index 3d14938..2a03223 100644
>> --- a/gcc/Makefile.in
>> +++ b/gcc/Makefile.in
>> @@ -1441,7 +1441,6 @@ OBJS = \
>>         tree-ssa-ccp.o \
>>         tree-ssa-coalesce.o \
>>         tree-ssa-copy.o \
>> -       tree-ssa-copyrename.o \
>>         tree-ssa-dce.o \
>>         tree-ssa-dom.o \
>>         tree-ssa-dse.o \
>> diff --git a/gcc/alias.c b/gcc/alias.c
>> index ea539c5..5a031d9 100644
>> --- a/gcc/alias.c
>> +++ b/gcc/alias.c
>> @@ -2552,6 +2552,19 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
>>    if (! DECL_P (exprx) || ! DECL_P (expry))
>>      return 0;
>>
>> +  /* If we refer to different gimple registers, or one gimple register
>> +     and one non-gimple-register, we know they can't overlap.  First,
>> +     gimple registers don't have their addresses taken.  Now, there
>> +     could be more than one stack slot for (different versions of) the
>> +     same gimple register, but we can presumably tell they don't
>> +     overlap based on offsets from stack base addresses elsewhere.
>> +     It's important that we don't proceed to DECL_RTL, because gimple
>> +     registers may not pass DECL_RTL_SET_P, and make_decl_rtl won't be
>> +     able to do anything about them since no SSA information will have
>> +     remained to guide it.  */
>> +  if (is_gimple_reg (exprx) || is_gimple_reg (expry))
>> +    return exprx != expry;
>> +
>>    /* With invalid code we can end up storing into the constant pool.
>>       Bail out to avoid ICEing when creating RTL for this.
>>       See gfortran.dg/lto/20091028-2_0.f90.  */
>> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
>> index b190f91..bf972fc 100644
>> --- a/gcc/cfgexpand.c
>> +++ b/gcc/cfgexpand.c
>> @@ -179,21 +179,121 @@ gimple_assign_rhs_to_tree (gimple stmt)
>>
>>  #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
>>
>> +/* Choose either CUR or NEXT as the leader DECL for a partition.
>> +   Prefer ignored decls, to simplify debug dumps and reduce ambiguity
>> +   out of the same user variable being in multiple partitions (this is
>> +   less likely for compiler-introduced temps).  */
>> +
>> +static tree
>> +leader_merge (tree cur, tree next)
>> +{
>> +  if (cur == NULL || cur == next)
>> +    return next;
>> +
>> +  if (DECL_P (cur) && DECL_IGNORED_P (cur))
>> +    return cur;
>> +
>> +  if (DECL_P (next) && DECL_IGNORED_P (next))
>> +    return next;
>> +
>> +  return cur;
>> +}
>> +
>> +
>> +/* Return the RTL for the default SSA def of a PARM or RESULT, if
>> +   there is one.  */
>> +
>> +rtx
>> +get_rtl_for_parm_ssa_default_def (tree var)
>> +{
>> +  gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL);
>> +
>> +  if (!is_gimple_reg (var))
>> +    return NULL_RTX;
>> +
>> +  /* If we've already determined RTL for the decl, use it.  This is
>> +     not just an optimization: if VAR is a PARM whose incoming value
>> +     is unused, we won't find a default def to use its partition, but
>> +     we still want to use the location of the parm, if it was used at
>> +     all.  During assign_parms, until a location is assigned for the
>> +     VAR, RTL can only be set for a parm or result if we're not coalescing
>> +     across variables, when we know we're coalescing all SSA_NAMEs of
>> +     each parm or result, and we're not coalescing them with names
>> +     pertaining to other variables, such as other parms' default
>> +     defs.  */
>> +  if (DECL_RTL_SET_P (var))
>> +    {
>> +      gcc_assert (DECL_RTL (var) != pc_rtx);
>> +      return DECL_RTL (var);
>> +    }
>> +
>> +  tree name = ssa_default_def (cfun, var);
>> +
>> +  if (!name)
>> +    return NULL_RTX;
>> +
>> +  int part = var_to_partition (SA.map, name);
>> +  if (part == NO_PARTITION)
>> +    return NULL_RTX;
>> +
>> +  return SA.partition_to_pseudo[part];
>> +}
>> +
>>  /* Associate declaration T with storage space X.  If T is no
>>     SSA name this is exactly SET_DECL_RTL, otherwise make the
>>     partition of T associated with X.  */
>>  static inline void
>>  set_rtl (tree t, rtx x)
>>  {
>> +  if (x && SSAVAR (t))
>> +    {
>> +      bool skip = false;
>> +      tree cur = NULL_TREE;
>> +
>> +      if (MEM_P (x))
>> +       cur = MEM_EXPR (x);
>> +      else if (REG_P (x))
>> +       cur = REG_EXPR (x);
>> +      else if (GET_CODE (x) == CONCAT
>> +              && REG_P (XEXP (x, 0)))
>> +       cur = REG_EXPR (XEXP (x, 0));
>> +      else if (GET_CODE (x) == PARALLEL)
>> +       cur = REG_EXPR (XVECEXP (x, 0, 0));
>> +      else if (x == pc_rtx)
>> +       skip = true;
>> +      else
>> +       gcc_unreachable ();
>> +
>> +      tree next = skip ? cur : leader_merge (cur, SSAVAR (t));
>> +
>> +      if (cur != next)
>> +       {
>> +         if (MEM_P (x))
>> +           set_mem_attributes (x, next, true);
>> +         else
>> +           set_reg_attrs_for_decl_rtl (next, x);
>> +       }
>> +    }
>> +
>>    if (TREE_CODE (t) == SSA_NAME)
>>      {
>> -      SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x;
>> -      if (x && !MEM_P (x))
>> -       set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x);
>> -      /* For the benefit of debug information at -O0 (where vartracking
>> -         doesn't run) record the place also in the base DECL if it's
>> -        a normal variable (not a parameter).  */
>> -      if (x && x != pc_rtx && TREE_CODE (SSA_NAME_VAR (t)) == VAR_DECL)
>> +      int part = var_to_partition (SA.map, t);
>> +      if (part != NO_PARTITION)
>> +       {
>> +         if (SA.partition_to_pseudo[part])
>> +           gcc_assert (SA.partition_to_pseudo[part] == x);
>> +         else
>> +           SA.partition_to_pseudo[part] = x;
>> +       }
>> +      /* For the benefit of debug information at -O0 (where
>> +         vartracking doesn't run) record the place also in the base
>> +         DECL.  For PARMs and RESULTs, we may end up resetting these
>> +         in function.c:maybe_reset_rtl_for_parm, but in some rare
>> +         cases we may need them (unused and overwritten incoming
>> +         value, that at -O0 must share the location with the other
>> +         uses in spite of the missing default def), and this may be
>> +         the only chance to preserve them.  */
>> +      if (x && x != pc_rtx && SSA_NAME_VAR (t))
>>         {
>>           tree var = SSA_NAME_VAR (t);
>>           /* If we don't yet have something recorded, just record it now.  */
>> @@ -909,7 +1009,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
>>    gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
>>
>>    x = plus_constant (Pmode, base, offset);
>> -  x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x);
>> +  x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
>> +                  ? TYPE_MODE (TREE_TYPE (decl))
>> +                  : DECL_MODE (SSAVAR (decl)), x);
>>
>>    if (TREE_CODE (decl) != SSA_NAME)
>>      {
>> @@ -931,7 +1033,6 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
>>        DECL_USER_ALIGN (decl) = 0;
>>      }
>>
>> -  set_mem_attributes (x, SSAVAR (decl), true);
>>    set_rtl (decl, x);
>>  }
>>
>> @@ -1146,13 +1247,22 @@ account_stack_vars (void)
>>     to a variable to be allocated in the stack frame.  */
>>
>>  static void
>> -expand_one_stack_var (tree var)
>> +expand_one_stack_var_1 (tree var)
>>  {
>>    HOST_WIDE_INT size, offset;
>>    unsigned byte_align;
>>
>> -  size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
>> -  byte_align = align_local_variable (SSAVAR (var));
>> +  if (TREE_CODE (var) == SSA_NAME)
>> +    {
>> +      tree type = TREE_TYPE (var);
>> +      size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
>> +      byte_align = TYPE_ALIGN_UNIT (type);
>> +    }
>> +  else
>> +    {
>> +      size = tree_to_uhwi (DECL_SIZE_UNIT (var));
>> +      byte_align = align_local_variable (var);
>> +    }
>>
>>    /* We handle highly aligned variables in expand_stack_vars.  */
>>    gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
>> @@ -1163,6 +1273,27 @@ expand_one_stack_var (tree var)
>>                            crtl->max_used_stack_slot_alignment, offset);
>>  }
>>
>> +/* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are
>> +   already assigned some MEM.  */
>> +
>> +static void
>> +expand_one_stack_var (tree var)
>> +{
>> +  if (TREE_CODE (var) == SSA_NAME)
>> +    {
>> +      int part = var_to_partition (SA.map, var);
>> +      if (part != NO_PARTITION)
>> +       {
>> +         rtx x = SA.partition_to_pseudo[part];
>> +         gcc_assert (x);
>> +         gcc_assert (MEM_P (x));
>> +         return;
>> +       }
>> +    }
>> +
>> +  return expand_one_stack_var_1 (var);
>> +}
>> +
>>  /* A subroutine of expand_one_var.  Called to assign rtl to a VAR_DECL
>>     that will reside in a hard register.  */
>>
>> @@ -1172,13 +1303,114 @@ expand_one_hard_reg_var (tree var)
>>    rest_of_decl_compilation (var, 0, 0);
>>  }
>>
>> +/* Record the alignment requirements of some variable assigned to a
>> +   pseudo.  */
>> +
>> +static void
>> +record_alignment_for_reg_var (unsigned int align)
>> +{
>> +  if (SUPPORTS_STACK_ALIGNMENT
>> +      && crtl->stack_alignment_estimated < align)
>> +    {
>> +      /* stack_alignment_estimated shouldn't change after stack
>> +         realign decision made */
>> +      gcc_assert (!crtl->stack_realign_processed);
>> +      crtl->stack_alignment_estimated = align;
>> +    }
>> +
>> +  /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
>> +     So here we only make sure stack_alignment_needed >= align.  */
>> +  if (crtl->stack_alignment_needed < align)
>> +    crtl->stack_alignment_needed = align;
>> +  if (crtl->max_used_stack_slot_alignment < align)
>> +    crtl->max_used_stack_slot_alignment = align;
>> +}
>> +
>> +/* Create RTL for an SSA partition.  */
>> +
>> +static void
>> +expand_one_ssa_partition (tree var)
>> +{
>> +  int part = var_to_partition (SA.map, var);
>> +  gcc_assert (part != NO_PARTITION);
>> +
>> +  if (SA.partition_to_pseudo[part])
>> +    return;
>> +
>> +  if (!use_register_for_decl (var))
>> +    {
>> +      expand_one_stack_var_1 (var);
>> +      return;
>> +    }
>> +
>> +  unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (var),
>> +                                         TYPE_MODE (TREE_TYPE (var)),
>> +                                         TYPE_ALIGN (TREE_TYPE (var)));
>> +
>> +  /* If the variable alignment is very large we'll dynamically allocate
>> +     it, which means that in-frame portion is just a pointer.  */
>> +  if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
>> +    align = POINTER_SIZE;
>> +
>> +  record_alignment_for_reg_var (align);
>> +
>> +  machine_mode reg_mode = promote_ssa_mode (var, NULL);
>> +
>> +  rtx x = gen_reg_rtx (reg_mode);
>> +
>> +  set_rtl (var, x);
>> +}
>> +
>> +/* Record the association between the RTL generated for a partition
>> +   and the underlying variable of the SSA_NAME.  */
>> +
>> +static void
>> +adjust_one_expanded_partition_var (tree var)
>> +{
>> +  if (!var)
>> +    return;
>> +
>> +  tree decl = SSA_NAME_VAR (var);
>> +
>> +  int part = var_to_partition (SA.map, var);
>> +  if (part == NO_PARTITION)
>> +    return;
>> +
>> +  rtx x = SA.partition_to_pseudo[part];
>> +
>> +  set_rtl (var, x);
>> +
>> +  if (!REG_P (x))
>> +    return;
>> +
>> +  /* Note if the object is a user variable.  */
>> +  if (decl && !DECL_ARTIFICIAL (decl))
>> +    mark_user_reg (x);
>> +
>> +  if (POINTER_TYPE_P (decl ? TREE_TYPE (decl) : TREE_TYPE (var)))
>> +    mark_reg_pointer (x, get_pointer_alignment (var));
>> +}
>> +
>>  /* A subroutine of expand_one_var.  Called to assign rtl to a VAR_DECL
>>     that will reside in a pseudo register.  */
>>
>>  static void
>>  expand_one_register_var (tree var)
>>  {
>> -  tree decl = SSAVAR (var);
>> +  if (TREE_CODE (var) == SSA_NAME)
>> +    {
>> +      int part = var_to_partition (SA.map, var);
>> +      if (part != NO_PARTITION)
>> +       {
>> +         rtx x = SA.partition_to_pseudo[part];
>> +         gcc_assert (x);
>> +         gcc_assert (REG_P (x));
>> +         return;
>> +       }
>> +      gcc_unreachable ();
>> +    }
>> +
>> +  tree decl = var;
>>    tree type = TREE_TYPE (decl);
>>    machine_mode reg_mode = promote_decl_mode (decl, NULL);
>>    rtx x = gen_reg_rtx (reg_mode);
>> @@ -1312,21 +1544,7 @@ expand_one_var (tree var, bool toplevel, bool really_expand)
>>         align = POINTER_SIZE;
>>      }
>>
>> -  if (SUPPORTS_STACK_ALIGNMENT
>> -      && crtl->stack_alignment_estimated < align)
>> -    {
>> -      /* stack_alignment_estimated shouldn't change after stack
>> -         realign decision made */
>> -      gcc_assert (!crtl->stack_realign_processed);
>> -      crtl->stack_alignment_estimated = align;
>> -    }
>> -
>> -  /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
>> -     So here we only make sure stack_alignment_needed >= align.  */
>> -  if (crtl->stack_alignment_needed < align)
>> -    crtl->stack_alignment_needed = align;
>> -  if (crtl->max_used_stack_slot_alignment < align)
>> -    crtl->max_used_stack_slot_alignment = align;
>> +  record_alignment_for_reg_var (align);
>>
>>    if (TREE_CODE (origvar) == SSA_NAME)
>>      {
>> @@ -1760,48 +1978,18 @@ expand_used_vars (void)
>>    if (targetm.use_pseudo_pic_reg ())
>>      pic_offset_table_rtx = gen_reg_rtx (Pmode);
>>
>> -  hash_map<tree, tree> ssa_name_decls;
>>    for (i = 0; i < SA.map->num_partitions; i++)
>>      {
>>        tree var = partition_to_var (SA.map, i);
>>
>>        gcc_assert (!virtual_operand_p (var));
>>
>> -      /* Assign decls to each SSA name partition, share decls for partitions
>> -         we could have coalesced (those with the same type).  */
>> -      if (SSA_NAME_VAR (var) == NULL_TREE)
>> -       {
>> -         tree *slot = &ssa_name_decls.get_or_insert (TREE_TYPE (var));
>> -         if (!*slot)
>> -           *slot = create_tmp_reg (TREE_TYPE (var));
>> -         replace_ssa_name_symbol (var, *slot);
>> -       }
>> -
>> -      /* Always allocate space for partitions based on VAR_DECLs.  But for
>> -        those based on PARM_DECLs or RESULT_DECLs and which matter for the
>> -        debug info, there is no need to do so if optimization is disabled
>> -        because all the SSA_NAMEs based on these DECLs have been coalesced
>> -        into a single partition, which is thus assigned the canonical RTL
>> -        location of the DECLs.  If in_lto_p, we can't rely on optimize,
>> -        a function could be compiled with -O1 -flto first and only the
>> -        link performed at -O0.  */
>> -      if (TREE_CODE (SSA_NAME_VAR (var)) == VAR_DECL)
>> -       expand_one_var (var, true, true);
>> -      else if (DECL_IGNORED_P (SSA_NAME_VAR (var)) || optimize || in_lto_p)
>> -       {
>> -         /* This is a PARM_DECL or RESULT_DECL.  For those partitions that
>> -            contain the default def (representing the parm or result itself)
>> -            we don't do anything here.  But those which don't contain the
>> -            default def (representing a temporary based on the parm/result)
>> -            we need to allocate space just like for normal VAR_DECLs.  */
>> -         if (!bitmap_bit_p (SA.partition_has_default_def, i))
>> -           {
>> -             expand_one_var (var, true, true);
>> -             gcc_assert (SA.partition_to_pseudo[i]);
>> -           }
>> -       }
>> +      expand_one_ssa_partition (var);
>>      }
>>
>> +  for (i = 1; i < num_ssa_names; i++)
>> +    adjust_one_expanded_partition_var (ssa_name (i));
>> +
>>    if (flag_stack_protect == SPCT_FLAG_STRONG)
>>        gen_stack_protect_signal
>>         = stack_protect_decl_p () || stack_protect_return_slot_p ();
>> @@ -5961,35 +6149,6 @@ pass_expand::execute (function *fun)
>>        parm_birth_insn = var_seq;
>>      }
>>
>> -  /* Now that we also have the parameter RTXs, copy them over to our
>> -     partitions.  */
>> -  for (i = 0; i < SA.map->num_partitions; i++)
>> -    {
>> -      tree var = SSA_NAME_VAR (partition_to_var (SA.map, i));
>> -
>> -      if (TREE_CODE (var) != VAR_DECL
>> -         && !SA.partition_to_pseudo[i])
>> -       SA.partition_to_pseudo[i] = DECL_RTL_IF_SET (var);
>> -      gcc_assert (SA.partition_to_pseudo[i]);
>> -
>> -      /* If this decl was marked as living in multiple places, reset
>> -        this now to NULL.  */
>> -      if (DECL_RTL_IF_SET (var) == pc_rtx)
>> -       SET_DECL_RTL (var, NULL);
>> -
>> -      /* Some RTL parts really want to look at DECL_RTL(x) when x
>> -        was a decl marked in REG_ATTR or MEM_ATTR.  We could use
>> -        SET_DECL_RTL here making this available, but that would mean
>> -        to select one of the potentially many RTLs for one DECL.  Instead
>> -        of doing that we simply reset the MEM_EXPR of the RTL in question,
>> -        then nobody can get at it and hence nobody can call DECL_RTL on it.  */
>> -      if (!DECL_RTL_SET_P (var))
>> -       {
>> -         if (MEM_P (SA.partition_to_pseudo[i]))
>> -           set_mem_expr (SA.partition_to_pseudo[i], NULL);
>> -       }
>> -    }
>> -
>>    /* If we have a class containing differently aligned pointers
>>       we need to merge those into the corresponding RTL pointer
>>       alignment.  */
>> @@ -5997,7 +6156,6 @@ pass_expand::execute (function *fun)
>>      {
>>        tree name = ssa_name (i);
>>        int part;
>> -      rtx r;
>>
>>        if (!name
>>           /* We might have generated new SSA names in
>> @@ -6010,20 +6168,24 @@ pass_expand::execute (function *fun)
>>        if (part == NO_PARTITION)
>>         continue;
>>
>> -      /* Adjust all partition members to get the underlying decl of
>> -        the representative which we might have created in expand_one_var.  */
>> -      if (SSA_NAME_VAR (name) == NULL_TREE)
>> +      gcc_assert (SA.partition_to_pseudo[part]);
>> +
>> +      /* If this decl was marked as living in multiple places, reset
>> +        this now to NULL.  */
>> +      tree var = SSA_NAME_VAR (name);
>> +      if (var && DECL_RTL_IF_SET (var) == pc_rtx)
>> +       SET_DECL_RTL (var, NULL);
>> +      /* Check that the pseudos chosen by assign_parms are those of
>> +        the corresponding default defs.  */
>> +      else if (SSA_NAME_IS_DEFAULT_DEF (name)
>> +              && (TREE_CODE (var) == PARM_DECL
>> +                  || TREE_CODE (var) == RESULT_DECL))
>>         {
>> -         tree leader = partition_to_var (SA.map, part);
>> -         gcc_assert (SSA_NAME_VAR (leader) != NULL_TREE);
>> -         replace_ssa_name_symbol (name, SSA_NAME_VAR (leader));
>> +         rtx in = DECL_RTL_IF_SET (var);
>> +         gcc_assert (in);
>> +         rtx out = SA.partition_to_pseudo[part];
>> +         gcc_assert (in == out || rtx_equal_p (in, out));
>>         }
>> -      if (!POINTER_TYPE_P (TREE_TYPE (name)))
>> -       continue;
>> -
>> -      r = SA.partition_to_pseudo[part];
>> -      if (REG_P (r))
>> -       mark_reg_pointer (r, get_pointer_alignment (name));
>>      }
>>
>>    /* If this function is `main', emit a call to `__main'
>> diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
>> index a0b6e3e..602579d 100644
>> --- a/gcc/cfgexpand.h
>> +++ b/gcc/cfgexpand.h
>> @@ -22,5 +22,7 @@ along with GCC; see the file COPYING3.  If not see
>>
>>  extern tree gimple_assign_rhs_to_tree (gimple);
>>  extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
>> +extern rtx get_rtl_for_parm_ssa_default_def (tree var);
>> +
>>
>>  #endif /* GCC_CFGEXPAND_H */
>> diff --git a/gcc/common.opt b/gcc/common.opt
>> index 32b416a..051f824 100644
>> --- a/gcc/common.opt
>> +++ b/gcc/common.opt
>> @@ -2227,16 +2227,16 @@ Common Report Var(flag_tree_ch) Optimization
>>  Enable loop header copying on trees
>>
>>  ftree-coalesce-inlined-vars
>> -Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization
>> -Enable coalescing of copy-related user variables that are inlined
>> +Common Ignore RejectNegative
>> +Does nothing.  Preserved for backward compatibility.
>>
>>  ftree-coalesce-vars
>> -Common Report Var(flag_ssa_coalesce_vars,2) Optimization
>> -Enable coalescing of all copy-related user variables
>> +Common Report Var(flag_tree_coalesce_vars) Optimization
>> +Enable SSA coalescing of user variables
>>
>>  ftree-copyrename
>> -Common Report Var(flag_tree_copyrename) Optimization
>> -Replace SSA temporaries with better names in copies
>> +Common Ignore
>> +Does nothing.  Preserved for backward compatibility.
>>
>>  ftree-copy-prop
>>  Common Report Var(flag_tree_copy_prop) Optimization
>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> index e25bd62..e359be2 100644
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -339,7 +339,6 @@ Objective-C and Objective-C++ Dialects}.
>>  -fdump-tree-phiprop@r{[}-@var{n}@r{]} @gol
>>  -fdump-tree-phiopt@r{[}-@var{n}@r{]} @gol
>>  -fdump-tree-forwprop@r{[}-@var{n}@r{]} @gol
>> --fdump-tree-copyrename@r{[}-@var{n}@r{]} @gol
>>  -fdump-tree-nrv -fdump-tree-vect @gol
>>  -fdump-tree-sink @gol
>>  -fdump-tree-sra@r{[}-@var{n}@r{]} @gol
>> @@ -445,9 +444,8 @@ Objective-C and Objective-C++ Dialects}.
>>  -fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol
>>  -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
>>  -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
>> --ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
>> --ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
>> --ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
>> +-ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol
>> +-ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
>>  -ftree-loop-if-convert-stores -ftree-loop-im @gol
>>  -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
>>  -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
>> @@ -7076,11 +7074,6 @@ name is made by appending @file{.phiopt} to the source file name.
>>  Dump each function after forward propagating single use variables.  The file
>>  name is made by appending @file{.forwprop} to the source file name.
>>
>> -@item copyrename
>> -@opindex fdump-tree-copyrename
>> -Dump each function after applying the copy rename optimization.  The file
>> -name is made by appending @file{.copyrename} to the source file name.
>> -
>>  @item nrv
>>  @opindex fdump-tree-nrv
>>  Dump each function after applying the named return value optimization on
>> @@ -7545,8 +7538,8 @@ compilation time.
>>  -ftree-ccp @gol
>>  -fssa-phiopt @gol
>>  -ftree-ch @gol
>> +-ftree-coalesce-vars @gol
>>  -ftree-copy-prop @gol
>> --ftree-copyrename @gol
>>  -ftree-dce @gol
>>  -ftree-dominator-opts @gol
>>  -ftree-dse @gol
>> @@ -8815,6 +8808,15 @@ profitable to parallelize the loops.
>>  Compare the results of several data dependence analyzers.  This option
>>  is used for debugging the data dependence analyzers.
>>
>> +@item -ftree-coalesce-vars
>> +@opindex ftree-coalesce-vars
>> +Tell the compiler to attempt to combine small user-defined variables
>> +too, instead of just compiler temporaries.  This may severely limit the
>> +ability to debug an optimized program compiled with
>> +@option{-fno-var-tracking-assignments}.  In the negated form, this flag
>> +prevents SSA coalescing of user variables.  This option is enabled by
>> +default if optimization is enabled.
>> +
>>  @item -ftree-loop-if-convert
>>  @opindex ftree-loop-if-convert
>>  Attempt to transform conditional jumps in the innermost loops to
>> @@ -8928,32 +8930,6 @@ Perform scalar replacement of aggregates.  This pass replaces structure
>>  references with scalars to prevent committing structures to memory too
>>  early.  This flag is enabled by default at @option{-O} and higher.
>>
>> -@item -ftree-copyrename
>> -@opindex ftree-copyrename
>> -Perform copy renaming on trees.  This pass attempts to rename compiler
>> -temporaries to other variables at copy locations, usually resulting in
>> -variable names which more closely resemble the original variables.  This flag
>> -is enabled by default at @option{-O} and higher.
>> -
>> -@item -ftree-coalesce-inlined-vars
>> -@opindex ftree-coalesce-inlined-vars
>> -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
>> -combine small user-defined variables too, but only if they are inlined
>> -from other functions.  It is a more limited form of
>> -@option{-ftree-coalesce-vars}.  This may harm debug information of such
>> -inlined variables, but it keeps variables of the inlined-into
>> -function apart from each other, such that they are more likely to
>> -contain the expected values in a debugging session.
>> -
>> -@item -ftree-coalesce-vars
>> -@opindex ftree-coalesce-vars
>> -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
>> -combine small user-defined variables too, instead of just compiler
>> -temporaries.  This may severely limit the ability to debug an optimized
>> -program compiled with @option{-fno-var-tracking-assignments}.  In the
>> -negated form, this flag prevents SSA coalescing of user variables,
>> -including inlined ones.  This option is enabled by default.
>> -
>>  @item -ftree-ter
>>  @opindex ftree-ter
>>  Perform temporary expression replacement during the SSA->normal phase.  Single
>> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
>> index 49a1509..2b98946 100644
>> --- a/gcc/emit-rtl.c
>> +++ b/gcc/emit-rtl.c
>> @@ -1249,6 +1249,9 @@ set_reg_attrs_for_parm (rtx parm_rtx, rtx mem)
>>  void
>>  set_reg_attrs_for_decl_rtl (tree t, rtx x)
>>  {
>> +  if (!t)
>> +    return;
>> +  tree tdecl = t;
>>    if (GET_CODE (x) == SUBREG)
>>      {
>>        gcc_assert (subreg_lowpart_p (x));
>> @@ -1257,7 +1260,7 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x)
>>    if (REG_P (x))
>>      REG_ATTRS (x)
>>        = get_reg_attrs (t, byte_lowpart_offset (GET_MODE (x),
>> -                                              DECL_MODE (t)));
>> +                                              DECL_MODE (tdecl)));
>>    if (GET_CODE (x) == CONCAT)
>>      {
>>        if (REG_P (XEXP (x, 0)))
>> diff --git a/gcc/explow.c b/gcc/explow.c
>> index 8745aea..5b0d49c 100644
>> --- a/gcc/explow.c
>> +++ b/gcc/explow.c
>> @@ -856,6 +856,28 @@ promote_decl_mode (const_tree decl, int *punsignedp)
>>    return pmode;
>>  }
>>
>> +/* Return the promoted mode for NAME.  If it is a named SSA_NAME, it
>> +   is the same as promote_decl_mode.  Otherwise, it is the promoted
>> +   mode of a temp decl of the same type as the SSA_NAME, if we had
>> +   created one.  */
>> +
>> +machine_mode
>> +promote_ssa_mode (const_tree name, int *punsignedp)
>> +{
>> +  gcc_assert (TREE_CODE (name) == SSA_NAME);
>> +
>> +  tree type = TREE_TYPE (name);
>> +  int unsignedp = TYPE_UNSIGNED (type);
>> +  machine_mode mode = TYPE_MODE (type);
>> +
>> +  machine_mode pmode = promote_mode (type, mode, &unsignedp);
>> +  if (punsignedp)
>> +    *punsignedp = unsignedp;
>> +
>> +  return pmode;
>> +}
>> +
>> +
>>
>>  /* Controls the behaviour of {anti_,}adjust_stack.  */
>>  static bool suppress_reg_args_size;
>> diff --git a/gcc/explow.h b/gcc/explow.h
>> index 94613de..52113db 100644
>> --- a/gcc/explow.h
>> +++ b/gcc/explow.h
>> @@ -57,6 +57,9 @@ extern machine_mode promote_mode (const_tree, machine_mode, int *);
>>  /* Return mode and signedness to use when object is promoted.  */
>>  machine_mode promote_decl_mode (const_tree, int *);
>>
>> +/* Return mode and signedness to use when an SSA_NAME is promoted.  */
>> +machine_mode promote_ssa_mode (const_tree, int *);
>> +
>>  /* Remove some bytes from the stack.  An rtx says how many.  */
>>  extern void adjust_stack (rtx);
>>
>> diff --git a/gcc/expr.c b/gcc/expr.c
>> index 5a931dc..5b6e16e 100644
>> --- a/gcc/expr.c
>> +++ b/gcc/expr.c
>> @@ -9301,7 +9301,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>>    rtx op0, op1, temp, decl_rtl;
>>    tree type;
>>    int unsignedp;
>> -  machine_mode mode;
>> +  machine_mode mode, dmode;
>>    enum tree_code code = TREE_CODE (exp);
>>    rtx subtarget, original_target;
>>    int ignore;
>> @@ -9432,7 +9432,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>>        if (g == NULL
>>           && modifier == EXPAND_INITIALIZER
>>           && !SSA_NAME_IS_DEFAULT_DEF (exp)
>> -         && (optimize || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
>> +         && (optimize || !SSA_NAME_VAR (exp)
>> +             || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
>>           && stmt_is_replaceable_p (SSA_NAME_DEF_STMT (exp)))
>>         g = SSA_NAME_DEF_STMT (exp);
>>        if (g)
>> @@ -9511,15 +9512,18 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>>        /* Ensure variable marked as used even if it doesn't go through
>>          a parser.  If it hasn't be used yet, write out an external
>>          definition.  */
>> -      TREE_USED (exp) = 1;
>> +      if (exp)
>> +       TREE_USED (exp) = 1;
>>
>>        /* Show we haven't gotten RTL for this yet.  */
>>        temp = 0;
>>
>>        /* Variables inherited from containing functions should have
>>          been lowered by this point.  */
>> -      context = decl_function_context (exp);
>> -      gcc_assert (SCOPE_FILE_SCOPE_P (context)
>> +      if (exp)
>> +       context = decl_function_context (exp);
>> +      gcc_assert (!exp
>> +                 || SCOPE_FILE_SCOPE_P (context)
>>                   || context == current_function_decl
>>                   || TREE_STATIC (exp)
>>                   || DECL_EXTERNAL (exp)
>> @@ -9543,7 +9547,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>>           decl_rtl = use_anchored_address (decl_rtl);
>>           if (modifier != EXPAND_CONST_ADDRESS
>>               && modifier != EXPAND_SUM
>> -             && !memory_address_addr_space_p (DECL_MODE (exp),
>> +             && !memory_address_addr_space_p (exp ? DECL_MODE (exp)
>> +                                              : GET_MODE (decl_rtl),
>>                                                XEXP (decl_rtl, 0),
>>                                                MEM_ADDR_SPACE (decl_rtl)))
>>             temp = replace_equiv_address (decl_rtl,
>> @@ -9554,12 +9559,17 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>>          if the address is a register.  */
>>        if (temp != 0)
>>         {
>> -         if (MEM_P (temp) && REG_P (XEXP (temp, 0)))
>> +         if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0)))
>>             mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp));
>>
>>           return temp;
>>         }
>>
>> +      if (exp)
>> +       dmode = DECL_MODE (exp);
>> +      else
>> +       dmode = TYPE_MODE (TREE_TYPE (ssa_name));
>> +
>>        /* If the mode of DECL_RTL does not match that of the decl,
>>          there are two cases: we are dealing with a BLKmode value
>>          that is returned in a register, or we are dealing with
>> @@ -9567,22 +9577,23 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>>          of the wanted mode, but mark it so that we know that it
>>          was already extended.  */
>>        if (REG_P (decl_rtl)
>> -         && DECL_MODE (exp) != BLKmode
>> -         && GET_MODE (decl_rtl) != DECL_MODE (exp))
>> +         && dmode != BLKmode
>> +         && GET_MODE (decl_rtl) != dmode)
>>         {
>>           machine_mode pmode;
>>
>>           /* Get the signedness to be used for this variable.  Ensure we get
>>              the same mode we got when the variable was declared.  */
>> -         if (code == SSA_NAME
>> -             && (g = SSA_NAME_DEF_STMT (ssa_name))
>> -             && gimple_code (g) == GIMPLE_CALL
>> -             && !gimple_call_internal_p (g))
>> +         if (code != SSA_NAME)
>> +           pmode = promote_decl_mode (exp, &unsignedp);
>> +         else if ((g = SSA_NAME_DEF_STMT (ssa_name))
>> +                  && gimple_code (g) == GIMPLE_CALL
>> +                  && !gimple_call_internal_p (g))
>>             pmode = promote_function_mode (type, mode, &unsignedp,
>>                                            gimple_call_fntype (g),
>>                                            2);
>>           else
>> -           pmode = promote_decl_mode (exp, &unsignedp);
>> +           pmode = promote_ssa_mode (ssa_name, &unsignedp);
>>           gcc_assert (GET_MODE (decl_rtl) == pmode);
>>
>>           temp = gen_lowpart_SUBREG (mode, decl_rtl);
>> diff --git a/gcc/function.c b/gcc/function.c
>> index 7d2d7e4..58e2498 100644
>> --- a/gcc/function.c
>> +++ b/gcc/function.c
>> @@ -87,6 +87,7 @@ along with GCC; see the file COPYING3.  If not see
>>  #include "cfganal.h"
>>  #include "cfgbuild.h"
>>  #include "cfgcleanup.h"
>> +#include "cfgexpand.h"
>>  #include "basic-block.h"
>>  #include "df.h"
>>  #include "params.h"
>> @@ -2121,6 +2122,30 @@ aggregate_value_p (const_tree exp, const_tree fntype)
>>  bool
>>  use_register_for_decl (const_tree decl)
>>  {
>> +  if (TREE_CODE (decl) == SSA_NAME)
>> +    {
>> +      /* We often try to use the SSA_NAME, instead of its underlying
>> +        decl, to get type information and guide decisions, to avoid
>> +        differences of behavior between anonymous and named
>> +        variables, but in this one case we have to go for the actual
>> +        variable if there is one.  The main reason is that, at least
>> +        at -O0, we want to place user variables on the stack, but we
>> +        don't mind using pseudos for anonymous or ignored temps.
>> +        Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
>> +        should go in pseudos, whereas their corresponding variables
>> +        might have to go on the stack.  So, disregarding the decl
>> +        here would negatively impact debug info at -O0, enable
>> +        coalescing between SSA_NAMEs that ought to get different
>> +        stack/pseudo assignments, and get the incoming argument
>> +        processing thoroughly confused by PARM_DECLs expected to live
>> +        in stack slots but assigned to pseudos.  */
>> +      if (!SSA_NAME_VAR (decl))
>> +       return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
>> +         && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
>> +
>> +      decl = SSA_NAME_VAR (decl);
>> +    }
>> +
>>    if (!targetm.calls.allocate_stack_slots_for_args ())
>>      return true;
>>
>> @@ -2804,23 +2829,88 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data)
>>    data->entry_parm = entry_parm;
>>  }
>>
>> +/* Wrapper for use_register_for_decl, that special-cases the
>> +   .result_ptr as the function's RESULT_DECL when the RESULT_DECL is
>> +   passed by reference.  */
>> +
>> +static bool
>> +use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm)
>> +{
>> +  if (parm == all->function_result_decl)
>> +    {
>> +      tree result = DECL_RESULT (current_function_decl);
>> +
>> +      if (DECL_BY_REFERENCE (result))
>> +       parm = result;
>> +    }
>> +
>> +  return use_register_for_decl (parm);
>> +}
>> +
>> +/* Wrapper for get_rtl_for_parm_ssa_default_def, that special-cases
>> +   the .result_ptr as the function's RESULT_DECL when the RESULT_DECL
>> +   is passed by reference.  */
>> +
>> +static rtx
>> +rtl_for_parm (struct assign_parm_data_all *all, tree parm)
>> +{
>> +  if (parm == all->function_result_decl)
>> +    {
>> +      tree result = DECL_RESULT (current_function_decl);
>> +
>> +      if (!DECL_BY_REFERENCE (result))
>> +       return NULL_RTX;
>> +
>> +      parm = result;
>> +    }
>> +
>> +  return get_rtl_for_parm_ssa_default_def (parm);
>> +}
>> +
>> +/* Reset the location of PARM_DECLs and RESULT_DECLs that had
>> +   SSA_NAMEs in multiple partitions, so that assign_parms will choose
>> +   the default def, if it exists, or create new RTL to hold the unused
>> +   entry value.  If we are coalescing across variables, we want to
>> +   reset the location too, because a parm without a default def
>> +   (incoming value unused) might be coalesced with one with a default
>> +   def, and then assign_parms would copy both incoming values to the
>> +   same location, which might cause the wrong value to survive.  */
>> +static void
>> +maybe_reset_rtl_for_parm (tree parm)
>> +{
>> +  gcc_assert (TREE_CODE (parm) == PARM_DECL
>> +             || TREE_CODE (parm) == RESULT_DECL);
>> +  if ((flag_tree_coalesce_vars
>> +       || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx))
>> +      && is_gimple_reg (parm))
>> +    SET_DECL_RTL (parm, NULL_RTX);
>> +}
>> +
>>  /* A subroutine of assign_parms.  Adjust DATA->STACK_RTL such that it's
>>     always valid and properly aligned.  */
>>
>>  static void
>> -assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data)
>> +assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm,
>> +                             struct assign_parm_data_one *data)
>>  {
>>    rtx stack_parm = data->stack_parm;
>>
>> +  /* If out-of-SSA assigned RTL to the parm default def, make sure we
>> +     don't use what we might have computed before.  */
>> +  rtx ssa_assigned = rtl_for_parm (all, parm);
>> +  if (ssa_assigned)
>> +    stack_parm = NULL;
>> +
>>    /* If we can't trust the parm stack slot to be aligned enough for its
>>       ultimate type, don't use that slot after entry.  We'll make another
>>       stack slot, if we need one.  */
>> -  if (stack_parm
>> -      && ((STRICT_ALIGNMENT
>> -          && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm))
>> -         || (data->nominal_type
>> -             && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
>> -             && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
>> +  else if (stack_parm
>> +          && ((STRICT_ALIGNMENT
>> +               && (GET_MODE_ALIGNMENT (data->nominal_mode)
>> +                   > MEM_ALIGN (stack_parm)))
>> +              || (data->nominal_type
>> +                  && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
>> +                  && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
>>      stack_parm = NULL;
>>
>>    /* If parm was passed in memory, and we need to convert it on entry,
>> @@ -2882,11 +2972,16 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>>
>>    size = int_size_in_bytes (data->passed_type);
>>    size_stored = CEIL_ROUND (size, UNITS_PER_WORD);
>> +
>>    if (stack_parm == 0)
>>      {
>>        DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
>> -      stack_parm = assign_stack_local (BLKmode, size_stored,
>> -                                      DECL_ALIGN (parm));
>> +      stack_parm = rtl_for_parm (all, parm);
>> +      if (!stack_parm)
>> +       stack_parm = assign_stack_local (BLKmode, size_stored,
>> +                                        DECL_ALIGN (parm));
>> +      else
>> +       stack_parm = copy_rtx (stack_parm);
>>        if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
>>         PUT_MODE (stack_parm, GET_MODE (entry_parm));
>>        set_mem_attributes (stack_parm, parm, 1);
>> @@ -3027,10 +3122,19 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>>      = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
>>                              TREE_TYPE (current_function_decl), 2);
>>
>> -  parmreg = gen_reg_rtx (promoted_nominal_mode);
>> +  rtx from_expand = rtl_for_parm (all, parm);
>>
>> -  if (!DECL_ARTIFICIAL (parm))
>> -    mark_user_reg (parmreg);
>> +  if (from_expand && !data->passed_pointer)
>> +    {
>> +      parmreg = from_expand;
>> +      gcc_assert (GET_MODE (parmreg) == promoted_nominal_mode);
>> +    }
>> +  else
>> +    {
>> +      parmreg = gen_reg_rtx (promoted_nominal_mode);
>> +      if (!DECL_ARTIFICIAL (parm))
>> +       mark_user_reg (parmreg);
>> +    }
>>
>>    /* If this was an item that we received a pointer to,
>>       set DECL_RTL appropriately.  */
>> @@ -3049,6 +3153,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>>       assign_parm_find_data_types and expand_expr_real_1.  */
>>
>>    equiv_stack_parm = data->stack_parm;
>> +  if (!equiv_stack_parm)
>> +    equiv_stack_parm = data->entry_parm;
>>    validated_mem = validize_mem (copy_rtx (data->entry_parm));
>>
>>    need_conversion = (data->nominal_mode != data->passed_mode
>> @@ -3189,11 +3295,17 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>>
>>    /* If we were passed a pointer but the actual value can safely live
>>       in a register, retrieve it and use it directly.  */
>> -  if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode)
>> +  if (data->passed_pointer
>> +      && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode))
>>      {
>>        /* We can't use nominal_mode, because it will have been set to
>>          Pmode above.  We must use the actual mode of the parm.  */
>> -      if (use_register_for_decl (parm))
>> +      if (from_expand)
>> +       {
>> +         parmreg = from_expand;
>> +         gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm)));
>> +       }
>> +      else if (use_register_for_decl (parm))
>>         {
>>           parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm)));
>>           mark_user_reg (parmreg);
>> @@ -3233,7 +3345,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>>
>>        /* STACK_PARM is the pointer, not the parm, and PARMREG is
>>          now the parm.  */
>> -      data->stack_parm = NULL;
>> +      data->stack_parm = equiv_stack_parm = NULL;
>>      }
>>
>>    /* Mark the register as eliminable if we did no conversion and it was
>> @@ -3243,11 +3355,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>>       make here would screw up life analysis for it.  */
>>    if (data->nominal_mode == data->passed_mode
>>        && !did_conversion
>> -      && data->stack_parm != 0
>> -      && MEM_P (data->stack_parm)
>> +      && equiv_stack_parm != 0
>> +      && MEM_P (equiv_stack_parm)
>>        && data->locate.offset.var == 0
>>        && reg_mentioned_p (virtual_incoming_args_rtx,
>> -                         XEXP (data->stack_parm, 0)))
>> +                         XEXP (equiv_stack_parm, 0)))
>>      {
>>        rtx_insn *linsn = get_last_insn ();
>>        rtx_insn *sinsn;
>> @@ -3260,8 +3372,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>>             = GET_MODE_INNER (GET_MODE (parmreg));
>>           int regnor = REGNO (XEXP (parmreg, 0));
>>           int regnoi = REGNO (XEXP (parmreg, 1));
>> -         rtx stackr = adjust_address_nv (data->stack_parm, submode, 0);
>> -         rtx stacki = adjust_address_nv (data->stack_parm, submode,
>> +         rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0);
>> +         rtx stacki = adjust_address_nv (equiv_stack_parm, submode,
>>                                           GET_MODE_SIZE (submode));
>>
>>           /* Scan backwards for the set of the real and
>> @@ -3334,6 +3446,13 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
>>
>>        if (data->stack_parm == 0)
>>         {
>> +         rtx x = data->stack_parm = rtl_for_parm (all, parm);
>> +         if (x)
>> +           gcc_assert (GET_MODE (x) == GET_MODE (data->entry_parm));
>> +       }
>> +
>> +      if (data->stack_parm == 0)
>> +       {
>>           int align = STACK_SLOT_ALIGNMENT (data->passed_type,
>>                                             GET_MODE (data->entry_parm),
>>                                             TYPE_ALIGN (data->passed_type));
>> @@ -3592,6 +3711,8 @@ assign_parms (tree fndecl)
>>           DECL_INCOMING_RTL (parm) = DECL_RTL (parm);
>>           continue;
>>         }
>> +      else
>> +       maybe_reset_rtl_for_parm (parm);
>>
>>        /* Estimate stack alignment from parameter alignment.  */
>>        if (SUPPORTS_STACK_ALIGNMENT)
>> @@ -3641,7 +3762,9 @@ assign_parms (tree fndecl)
>>        else
>>         set_decl_incoming_rtl (parm, data.entry_parm, false);
>>
>> -      /* Boudns should be loaded in the particular order to
>> +      assign_parm_adjust_stack_rtl (&all, parm, &data);
>> +
>> +      /* Bounds should be loaded in the particular order to
>>          have registers allocated correctly.  Collect info about
>>          input bounds and load them later.  */
>>        if (POINTER_BOUNDS_TYPE_P (data.passed_type))
>> @@ -3658,11 +3781,10 @@ assign_parms (tree fndecl)
>>         }
>>        else
>>         {
>> -         assign_parm_adjust_stack_rtl (&data);
>> -
>>           if (assign_parm_setup_block_p (&data))
>>             assign_parm_setup_block (&all, parm, &data);
>> -         else if (data.passed_pointer || use_register_for_decl (parm))
>> +         else if (data.passed_pointer
>> +                  || use_register_for_parm_decl (&all, parm))
>>             assign_parm_setup_reg (&all, parm, &data);
>>           else
>>             assign_parm_setup_stack (&all, parm, &data);
>> @@ -5004,7 +5126,9 @@ expand_function_start (tree subr)
>>       before any library calls that assign parms might generate.  */
>>
>>    /* Decide whether to return the value in memory or in a register.  */
>> -  if (aggregate_value_p (DECL_RESULT (subr), subr))
>> +  tree res = DECL_RESULT (subr);
>> +  maybe_reset_rtl_for_parm (res);
>> +  if (aggregate_value_p (res, subr))
>>      {
>>        /* Returning something that won't go in a register.  */
>>        rtx value_address = 0;
>> @@ -5012,7 +5136,7 @@ expand_function_start (tree subr)
>>  #ifdef PCC_STATIC_STRUCT_RETURN
>>        if (cfun->returns_pcc_struct)
>>         {
>> -         int size = int_size_in_bytes (TREE_TYPE (DECL_RESULT (subr)));
>> +         int size = int_size_in_bytes (TREE_TYPE (res));
>>           value_address = assemble_static_space (size);
>>         }
>>        else
>> @@ -5024,36 +5148,45 @@ expand_function_start (tree subr)
>>              it.  */
>>           if (sv)
>>             {
>> -             value_address = gen_reg_rtx (Pmode);
>> +             if (DECL_BY_REFERENCE (res))
>> +               value_address = get_rtl_for_parm_ssa_default_def (res);
>> +             if (!value_address)
>> +               value_address = gen_reg_rtx (Pmode);
>>               emit_move_insn (value_address, sv);
>>             }
>>         }
>>        if (value_address)
>>         {
>>           rtx x = value_address;
>> -         if (!DECL_BY_REFERENCE (DECL_RESULT (subr)))
>> +         if (!DECL_BY_REFERENCE (res))
>>             {
>> -             x = gen_rtx_MEM (DECL_MODE (DECL_RESULT (subr)), x);
>> -             set_mem_attributes (x, DECL_RESULT (subr), 1);
>> +             x = get_rtl_for_parm_ssa_default_def (res);
>> +             if (!x)
>> +               {
>> +                 x = gen_rtx_MEM (DECL_MODE (res), value_address);
>> +                 set_mem_attributes (x, res, 1);
>> +               }
>>             }
>> -         SET_DECL_RTL (DECL_RESULT (subr), x);
>> +         SET_DECL_RTL (res, x);
>>         }
>>      }
>> -  else if (DECL_MODE (DECL_RESULT (subr)) == VOIDmode)
>> +  else if (DECL_MODE (res) == VOIDmode)
>>      /* If return mode is void, this decl rtl should not be used.  */
>> -    SET_DECL_RTL (DECL_RESULT (subr), NULL_RTX);
>> +    SET_DECL_RTL (res, NULL_RTX);
>>    else
>>      {
>>        /* Compute the return values into a pseudo reg, which we will copy
>>          into the true return register after the cleanups are done.  */
>> -      tree return_type = TREE_TYPE (DECL_RESULT (subr));
>> -      if (TYPE_MODE (return_type) != BLKmode
>> -         && targetm.calls.return_in_msb (return_type))
>> +      tree return_type = TREE_TYPE (res);
>> +      rtx x = get_rtl_for_parm_ssa_default_def (res);
>> +      if (x)
>> +       /* Use it.  */;
>> +      else if (TYPE_MODE (return_type) != BLKmode
>> +              && targetm.calls.return_in_msb (return_type))
>>         /* expand_function_end will insert the appropriate padding in
>>            this case.  Use the return value's natural (unpadded) mode
>>            within the function proper.  */
>> -       SET_DECL_RTL (DECL_RESULT (subr),
>> -                     gen_reg_rtx (TYPE_MODE (return_type)));
>> +       x = gen_reg_rtx (TYPE_MODE (return_type));
>>        else
>>         {
>>           /* In order to figure out what mode to use for the pseudo, we
>> @@ -5064,25 +5197,26 @@ expand_function_start (tree subr)
>>           /* Structures that are returned in registers are not
>>              aggregate_value_p, so we may see a PARALLEL or a REG.  */
>>           if (REG_P (hard_reg))
>> -           SET_DECL_RTL (DECL_RESULT (subr),
>> -                         gen_reg_rtx (GET_MODE (hard_reg)));
>> +           x = gen_reg_rtx (GET_MODE (hard_reg));
>>           else
>>             {
>>               gcc_assert (GET_CODE (hard_reg) == PARALLEL);
>> -             SET_DECL_RTL (DECL_RESULT (subr), gen_group_rtx (hard_reg));
>> +             x = gen_group_rtx (hard_reg);
>>             }
>>         }
>>
>> +      SET_DECL_RTL (res, x);
>> +
>>        /* Set DECL_REGISTER flag so that expand_function_end will copy the
>>          result to the real return register(s).  */
>> -      DECL_REGISTER (DECL_RESULT (subr)) = 1;
>> +      DECL_REGISTER (res) = 1;
>>
>>        if (chkp_function_instrumented_p (current_function_decl))
>>         {
>> -         tree return_type = TREE_TYPE (DECL_RESULT (subr));
>> +         tree return_type = TREE_TYPE (res);
>>           rtx bounds = targetm.calls.chkp_function_value_bounds (return_type,
>>                                                                  subr, 1);
>> -         SET_DECL_BOUNDS_RTL (DECL_RESULT (subr), bounds);
>> +         SET_DECL_BOUNDS_RTL (res, bounds);
>>         }
>>      }
>>
>> @@ -5097,7 +5231,9 @@ expand_function_start (tree subr)
>>        rtx local, chain;
>>       rtx_insn *insn;
>>
>> -      local = gen_reg_rtx (Pmode);
>> +      local = get_rtl_for_parm_ssa_default_def (parm);
>> +      if (!local)
>> +       local = gen_reg_rtx (Pmode);
>>        chain = targetm.calls.static_chain (current_function_decl, true);
>>
>>        set_decl_incoming_rtl (parm, chain, false);
>> diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
>> index 4d683d6..d3d1c5f 100644
>> --- a/gcc/gimple-expr.c
>> +++ b/gcc/gimple-expr.c
>> @@ -391,45 +391,6 @@ copy_var_decl (tree var, tree name, tree type)
>>    return copy;
>>  }
>>
>> -/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
>> -   coalescing together, false otherwise.
>> -
>> -   This must stay consistent with var_map_base_init in tree-ssa-live.c.  */
>> -
>> -bool
>> -gimple_can_coalesce_p (tree name1, tree name2)
>> -{
>> -  /* First check the SSA_NAME's associated DECL.  We only want to
>> -     coalesce if they have the same DECL or both have no associated DECL.  */
>> -  tree var1 = SSA_NAME_VAR (name1);
>> -  tree var2 = SSA_NAME_VAR (name2);
>> -  var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
>> -  var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
>> -  if (var1 != var2)
>> -    return false;
>> -
>> -  /* Now check the types.  If the types are the same, then we should
>> -     try to coalesce V1 and V2.  */
>> -  tree t1 = TREE_TYPE (name1);
>> -  tree t2 = TREE_TYPE (name2);
>> -  if (t1 == t2)
>> -    return true;
>> -
>> -  /* If the types are not the same, check for a canonical type match.  This
>> -     (for example) allows coalescing when the types are fundamentally the
>> -     same, but just have different names.
>> -
>> -     Note pointer types with different address spaces may have the same
>> -     canonical type.  Those are rejected for coalescing by the
>> -     types_compatible_p check.  */
>> -  if (TYPE_CANONICAL (t1)
>> -      && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
>> -      && types_compatible_p (t1, t2))
>> -    return true;
>> -
>> -  return false;
>> -}
>> -
>>  /* Strip off a legitimate source ending from the input string NAME of
>>     length LEN.  Rather than having to know the names used by all of
>>     our front ends, we strip off an ending of a period followed by
>> diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
>> index ed23eb2..3d1c89f 100644
>> --- a/gcc/gimple-expr.h
>> +++ b/gcc/gimple-expr.h
>> @@ -28,7 +28,6 @@ extern gimple_seq gimple_body (tree);
>>  extern bool gimple_has_body_p (tree);
>>  extern const char *gimple_decl_printable_name (tree, int);
>>  extern tree copy_var_decl (tree, tree, tree);
>> -extern bool gimple_can_coalesce_p (tree, tree);
>>  extern tree create_tmp_var_name (const char *);
>>  extern tree create_tmp_var_raw (tree, const char * = NULL);
>>  extern tree create_tmp_var (tree, const char * = NULL);
>> diff --git a/gcc/opts.c b/gcc/opts.c
>> index 9793999..5305299 100644
>> --- a/gcc/opts.c
>> +++ b/gcc/opts.c
>> @@ -448,12 +448,12 @@ static const struct default_options default_options_table[] =
>>      { OPT_LEVELS_1_PLUS, OPT_fsplit_wide_types, NULL, 1 },
>>      { OPT_LEVELS_1_PLUS, OPT_ftree_ccp, NULL, 1 },
>>      { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_bit_ccp, NULL, 1 },
>> +    { OPT_LEVELS_1_PLUS, OPT_ftree_coalesce_vars, NULL, 1 },
>>      { OPT_LEVELS_1_PLUS, OPT_ftree_dce, NULL, 1 },
>>      { OPT_LEVELS_1_PLUS, OPT_ftree_dominator_opts, NULL, 1 },
>>      { OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 },
>>      { OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 },
>>      { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 },
>> -    { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 },
>>      { OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 },
>>      { OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 },
>>      { OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 },
>> diff --git a/gcc/passes.def b/gcc/passes.def
>> index 4690e23..230e089 100644
>> --- a/gcc/passes.def
>> +++ b/gcc/passes.def
>> @@ -77,7 +77,6 @@ along with GCC; see the file COPYING3.  If not see
>>        NEXT_PASS (pass_all_early_optimizations);
>>        PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
>>           NEXT_PASS (pass_remove_cgraph_callee_edges);
>> -         NEXT_PASS (pass_rename_ssa_copies);
>>           NEXT_PASS (pass_object_sizes);
>>           NEXT_PASS (pass_ccp);
>>           /* After CCP we rewrite no longer addressed locals into SSA
>> @@ -155,7 +154,6 @@ along with GCC; see the file COPYING3.  If not see
>>        /* Initial scalar cleanups before alias computation.
>>          They ensure memory accesses are not indirect wherever possible.  */
>>        NEXT_PASS (pass_strip_predict_hints);
>> -      NEXT_PASS (pass_rename_ssa_copies);
>>        NEXT_PASS (pass_ccp);
>>        /* After CCP we rewrite no longer addressed locals into SSA
>>          form if possible.  */
>> @@ -183,7 +181,6 @@ along with GCC; see the file COPYING3.  If not see
>>        NEXT_PASS (pass_ch);
>>        NEXT_PASS (pass_lower_complex);
>>        NEXT_PASS (pass_sra);
>> -      NEXT_PASS (pass_rename_ssa_copies);
>>        /* The dom pass will also resolve all __builtin_constant_p calls
>>           that are still there to 0.  This has to be done after some
>>          propagations have already run, but before some more dead code
>> @@ -291,7 +288,6 @@ along with GCC; see the file COPYING3.  If not see
>>        NEXT_PASS (pass_fold_builtins);
>>        NEXT_PASS (pass_optimize_widening_mul);
>>        NEXT_PASS (pass_tail_calls);
>> -      NEXT_PASS (pass_rename_ssa_copies);
>>        /* FIXME: If DCE is not run before checking for uninitialized uses,
>>          we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
>>          However, this also causes us to misdiagnose cases that should be
>> @@ -326,7 +322,6 @@ along with GCC; see the file COPYING3.  If not see
>>        NEXT_PASS (pass_dce);
>>        NEXT_PASS (pass_asan);
>>        NEXT_PASS (pass_tsan);
>> -      NEXT_PASS (pass_rename_ssa_copies);
>>        /* ???  We do want some kind of loop invariant motion, but we possibly
>>           need to adjust LIM to be more friendly towards preserving accurate
>>          debug information here.  */
>> diff --git a/gcc/testsuite/gcc.dg/guality/pr54200.c b/gcc/testsuite/gcc.dg/guality/pr54200.c
>> index 9b17187..e1e7293 100644
>> --- a/gcc/testsuite/gcc.dg/guality/pr54200.c
>> +++ b/gcc/testsuite/gcc.dg/guality/pr54200.c
>> @@ -1,6 +1,6 @@
>>  /* PR tree-optimization/54200 */
>>  /* { dg-do run } */
>> -/* { dg-options "-g -fno-var-tracking-assignments" } */
>> +/* { dg-options "-g -fno-var-tracking-assignments -fno-tree-coalesce-vars" } */
>>
>>  int o __attribute__((used));
>>
>> diff --git a/gcc/testsuite/gcc.dg/ssp-1.c b/gcc/testsuite/gcc.dg/ssp-1.c
>> index 5467f4d..db69332 100644
>> --- a/gcc/testsuite/gcc.dg/ssp-1.c
>> +++ b/gcc/testsuite/gcc.dg/ssp-1.c
>> @@ -12,7 +12,7 @@ __stack_chk_fail (void)
>>
>>  int main ()
>>  {
>> -  int i;
>> +  register int i;
>>    char foo[255];
>>
>>    // smash stack
>> diff --git a/gcc/testsuite/gcc.dg/ssp-2.c b/gcc/testsuite/gcc.dg/ssp-2.c
>> index 9a7ac32..752fe53 100644
>> --- a/gcc/testsuite/gcc.dg/ssp-2.c
>> +++ b/gcc/testsuite/gcc.dg/ssp-2.c
>> @@ -14,7 +14,7 @@ __stack_chk_fail (void)
>>  void
>>  overflow()
>>  {
>> -  int i = 0;
>> +  register int i = 0;
>>    char foo[30];
>>
>>    /* Overflow buffer.  */
>> diff --git a/gcc/testsuite/gcc.dg/torture/parm-coalesce.c b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
>> new file mode 100644
>> index 0000000..dbd81c1
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
>> @@ -0,0 +1,40 @@
>> +/* { dg-do run } */
>> +
>> +#include <stdlib.h>
>> +
>> +/* Make sure we don't coalesce both incoming parms, one whose incoming
>> +   value is unused, to the same location, so as to overwrite one of
>> +   them with the incoming value of the other.  */
>> +
>> +int __attribute__((noinline, noclone))
>> +foo (int i, int j)
>> +{
>> +  j = i; /* The incoming value for J is unused.  */
>> +  i = 2;
>> +  if (j)
>> +    j++;
>> +  j += i + 1;
>> +  return j;
>> +}
>> +
>> +/* Same as foo, but with swapped parameters.  */
>> +int __attribute__((noinline, noclone))
>> +bar (int j, int i)
>> +{
>> +  j = i; /* The incoming value for J is unused.  */
>> +  i = 2;
>> +  if (j)
>> +    j++;
>> +  j += i + 1;
>> +  return j;
>> +}
>> +
>> +int
>> +main (void)
>> +{
>> +  if (foo (0, 1) != 3)
>> +    abort ();
>> +  if (bar (1, 0) != 3)
>> +    abort ();
>> +  return 0;
>> +}
>> diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
>> index e23bc0b..59d91c6 100644
>> --- a/gcc/tree-outof-ssa.c
>> +++ b/gcc/tree-outof-ssa.c
>> @@ -305,7 +305,6 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
>>    rtx dest_rtx, seq, x;
>>    machine_mode dest_mode, src_mode;
>>    int unsignedp;
>> -  tree var;
>>
>>    if (dump_file && (dump_flags & TDF_DETAILS))
>>      {
>> @@ -327,12 +326,12 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
>>
>>    start_sequence ();
>>
>> -  var = SSA_NAME_VAR (partition_to_var (SA.map, dest));
>> +  tree name = partition_to_var (SA.map, dest);
>>    src_mode = TYPE_MODE (TREE_TYPE (src));
>>    dest_mode = GET_MODE (dest_rtx);
>> -  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var)));
>> +  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (name)));
>>    gcc_assert (!REG_P (dest_rtx)
>> -             || dest_mode == promote_decl_mode (var, &unsignedp));
>> +             || dest_mode == promote_ssa_mode (name, &unsignedp));
>>
>>    if (src_mode != dest_mode)
>>      {
>> @@ -708,13 +707,12 @@ elim_backward (elim_graph g, int T)
>>  static rtx
>>  get_temp_reg (tree name)
>>  {
>> -  tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
>> -  tree type = TREE_TYPE (var);
>> +  tree type = TREE_TYPE (name);
>>    int unsignedp;
>> -  machine_mode reg_mode = promote_decl_mode (var, &unsignedp);
>> +  machine_mode reg_mode = promote_ssa_mode (name, &unsignedp);
>>    rtx x = gen_reg_rtx (reg_mode);
>>    if (POINTER_TYPE_P (type))
>> -    mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var))));
>> +    mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type)));
>>    return x;
>>  }
>>
>> @@ -1014,7 +1012,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
>>
>>    /* Return to viewing the variable list as just all reference variables after
>>       coalescing has been performed.  */
>> -  partition_view_normal (map, false);
>> +  partition_view_normal (map);
>>
>>    if (dump_file && (dump_flags & TDF_DETAILS))
>>      {
>> diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
>> index b05a860..9ffa3f1 100644
>> --- a/gcc/tree-ssa-coalesce.c
>> +++ b/gcc/tree-ssa-coalesce.c
>> @@ -58,6 +58,7 @@ along with GCC; see the file COPYING3.  If not see
>>  #include "tree-ssanames.h"
>>  #include "tree-ssa-live.h"
>>  #include "tree-ssa-coalesce.h"
>> +#include "explow.h"
>>  #include "diagnostic-core.h"
>>
>>
>> @@ -830,6 +831,16 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
>>    basic_block bb;
>>    ssa_op_iter iter;
>>    live_track_p live;
>> +  basic_block entry;
>> +
>> +  /* If inter-variable coalescing is enabled, we may attempt to
>> +     coalesce variables from different base variables, including
>> +     different parameters, so we have to make sure default defs live
>> +     at the entry block conflict with each other.  */
>> +  if (flag_tree_coalesce_vars)
>> +    entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
>> +  else
>> +    entry = NULL;
>>
>>    map = live_var_map (liveinfo);
>>    graph = ssa_conflicts_new (num_var_partitions (map));
>> @@ -888,6 +899,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
>>             live_track_process_def (live, result, graph);
>>         }
>>
>> +      /* Pretend there are defs for params' default defs at the start
>> +        of the (post-)entry block.  */
>> +      if (bb == entry)
>> +       {
>> +         unsigned base;
>> +         bitmap_iterator bi;
>> +         EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi)
>> +           {
>> +             bitmap_iterator bi2;
>> +             unsigned part;
>> +             EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base],
>> +                                       0, part, bi2)
>> +               {
>> +                 tree var = partition_to_var (map, part);
>> +                 if (!SSA_NAME_VAR (var)
>> +                     || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL
>> +                         && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL)
>> +                     || !SSA_NAME_IS_DEFAULT_DEF (var))
>> +                   continue;
>> +                 live_track_process_def (live, var, graph);
>> +               }
>> +           }
>> +       }
>> +
>>       live_track_clear_base_vars (live);
>>      }
>>
>> @@ -1156,6 +1191,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
>>      {
>>        var1 = partition_to_var (map, p1);
>>        var2 = partition_to_var (map, p2);
>> +
>>        z = var_union (map, var1, var2);
>>        if (z == NO_PARTITION)
>>         {
>> @@ -1173,6 +1209,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
>>
>>        if (debug)
>>         fprintf (debug, ": Success -> %d\n", z);
>> +
>>        return true;
>>      }
>>
>> @@ -1270,6 +1307,330 @@ ssa_name_var_hash::equal (const tree_node *n1, const tree_node *n2)
>>  }
>>
>>
>> +/* Output partition map MAP with coalescing plan PART to file F.  */
>> +
>> +void
>> +dump_part_var_map (FILE *f, partition part, var_map map)
>> +{
>> +  int t;
>> +  unsigned x, y;
>> +  int p;
>> +
>> +  fprintf (f, "\nCoalescible Partition map \n\n");
>> +
>> +  for (x = 0; x < map->num_partitions; x++)
>> +    {
>> +      if (map->view_to_partition != NULL)
>> +       p = map->view_to_partition[x];
>> +      else
>> +       p = x;
>> +
>> +      if (ssa_name (p) == NULL_TREE
>> +         || virtual_operand_p (ssa_name (p)))
>> +        continue;
>> +
>> +      t = 0;
>> +      for (y = 1; y < num_ssa_names; y++)
>> +        {
>> +         tree var = version_to_var (map, y);
>> +         if (!var)
>> +           continue;
>> +         int q = var_to_partition (map, var);
>> +         p = partition_find (part, q);
>> +         gcc_assert (map->partition_to_base_index[q]
>> +                     == map->partition_to_base_index[p]);
>> +
>> +         if (p == (int)x)
>> +           {
>> +             if (t++ == 0)
>> +               {
>> +                 fprintf (f, "Partition %d, base %d (", x,
>> +                          map->partition_to_base_index[q]);
>> +                 print_generic_expr (f, partition_to_var (map, q), TDF_SLIM);
>> +                 fprintf (f, " - ");
>> +               }
>> +             fprintf (f, "%d ", y);
>> +           }
>> +       }
>> +      if (t != 0)
>> +       fprintf (f, ")\n");
>> +    }
>> +  fprintf (f, "\n");
>> +}
>> +
>> +/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
>> +   coalescing together, false otherwise.
>> +
>> +   This must stay consistent with var_map_base_init in tree-ssa-live.c.  */
>> +
>> +bool
>> +gimple_can_coalesce_p (tree name1, tree name2)
>> +{
>> +  /* First check the SSA_NAME's associated DECL.  Without
>> +     optimization, we only want to coalesce if they have the same DECL
>> +     or both have no associated DECL.  */
>> +  tree var1 = SSA_NAME_VAR (name1);
>> +  tree var2 = SSA_NAME_VAR (name2);
>> +  var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
>> +  var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
>> +  if (var1 != var2 && !flag_tree_coalesce_vars)
>> +    return false;
>> +
>> +  /* Now check the types.  If the types are the same, then we should
>> +     try to coalesce V1 and V2.  */
>> +  tree t1 = TREE_TYPE (name1);
>> +  tree t2 = TREE_TYPE (name2);
>> +  if (t1 == t2)
>> +    {
>> +    check_modes:
>> +      /* If the base variables are the same, we're good: none of the
>> +        other tests below could possibly fail.  */
>> +      var1 = SSA_NAME_VAR (name1);
>> +      var2 = SSA_NAME_VAR (name2);
>> +      if (var1 == var2)
>> +       return true;
>> +
>> +      /* We don't want to coalesce two SSA names if one of the base
>> +        variables is supposed to be a register while the other is
>> +        supposed to be on the stack.  Anonymous SSA names take
>> +        registers, but when not optimizing, user variables should go
>> +        on the stack, so coalescing them with the anonymous variable
>> +        as the partition leader would end up assigning the user
>> +        variable to a register.  Don't do that!  */
>> +      bool reg1 = !var1 || use_register_for_decl (var1);
>> +      bool reg2 = !var2 || use_register_for_decl (var2);
>> +      if (reg1 != reg2)
>> +       return false;
>> +
>> +      /* Check that the promoted modes are the same.  We don't want to
>> +        coalesce if the promoted modes would be different.  Only
>> +        PARM_DECLs and RESULT_DECLs have different promotion rules,
>> +        so skip the test if both are variables or anonymous
>> +        SSA_NAMEs.  */
>> +      return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
>> +       || promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL);
>> +    }
>> +
>> +  /* If the types are not the same, check for a canonical type match.  This
>> +     (for example) allows coalescing when the types are fundamentally the
>> +     same, but just have different names.
>> +
>> +     Note pointer types with different address spaces may have the same
>> +     canonical type.  Those are rejected for coalescing by the
>> +     types_compatible_p check.  */
>> +  if (TYPE_CANONICAL (t1)
>> +      && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
>> +      && types_compatible_p (t1, t2))
>> +    goto check_modes;
>> +
>> +  return false;
>> +}
>> +
>> +/* Fill in MAP's partition_to_base_index, with one index for each
>> +   partition of SSA names USED_IN_COPIES and related by CL coalesce
>> +   possibilities.  This must match gimple_can_coalesce_p in the
>> +   optimized case.  */
>> +
>> +static void
>> +compute_optimized_partition_bases (var_map map, bitmap used_in_copies,
>> +                                  coalesce_list_p cl)
>> +{
>> +  int parts = num_var_partitions (map);
>> +  partition tentative = partition_new (parts);
>> +
>> +  /* Partition the SSA versions so that, for each coalescible
>> +     pair, both of its members are in the same partition in
>> +     TENTATIVE.  */
>> +  gcc_assert (!cl->sorted);
>> +  coalesce_pair_p node;
>> +  coalesce_iterator_type ppi;
>> +  FOR_EACH_PARTITION_PAIR (node, ppi, cl)
>> +    {
>> +      tree v1 = ssa_name (node->first_element);
>> +      int p1 = partition_find (tentative, var_to_partition (map, v1));
>> +      tree v2 = ssa_name (node->second_element);
>> +      int p2 = partition_find (tentative, var_to_partition (map, v2));
>> +
>> +      if (p1 == p2)
>> +       continue;
>> +
>> +      partition_union (tentative, p1, p2);
>> +    }
>> +
>> +  /* We have to deal with cost one pairs too.  */
>> +  for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next)
>> +    {
>> +      tree v1 = ssa_name (co->first_element);
>> +      int p1 = partition_find (tentative, var_to_partition (map, v1));
>> +      tree v2 = ssa_name (co->second_element);
>> +      int p2 = partition_find (tentative, var_to_partition (map, v2));
>> +
>> +      if (p1 == p2)
>> +       continue;
>> +
>> +      partition_union (tentative, p1, p2);
>> +    }
>> +
>> +  /* And also with abnormal edges.  */
>> +  basic_block bb;
>> +  edge e;
>> +  edge_iterator ei;
>> +  FOR_EACH_BB_FN (bb, cfun)
>> +    {
>> +      FOR_EACH_EDGE (e, ei, bb->preds)
>> +       if (e->flags & EDGE_ABNORMAL)
>> +         {
>> +           gphi_iterator gsi;
>> +           for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
>> +                gsi_next (&gsi))
>> +             {
>> +               gphi *phi = gsi.phi ();
>> +               tree arg = PHI_ARG_DEF (phi, e->dest_idx);
>> +               if (SSA_NAME_IS_DEFAULT_DEF (arg)
>> +                   && (!SSA_NAME_VAR (arg)
>> +                       || TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL))
>> +                 continue;
>> +
>> +               tree res = PHI_RESULT (phi);
>> +
>> +               int p1 = partition_find (tentative, var_to_partition (map, res));
>> +               int p2 = partition_find (tentative, var_to_partition (map, arg));
>> +
>> +               if (p1 == p2)
>> +                 continue;
>> +
>> +               partition_union (tentative, p1, p2);
>> +             }
>> +         }
>> +    }
>> +
>> +  map->partition_to_base_index = XCNEWVEC (int, parts);
>> +  auto_vec<unsigned int> index_map (parts);
>> +  if (parts)
>> +    index_map.quick_grow (parts);
>> +
>> +  const unsigned no_part = -1;
>> +  unsigned count = parts;
>> +  while (count)
>> +    index_map[--count] = no_part;
>> +
>> +  /* Initialize MAP's mapping from partition to base index, using
>> +     as base indices an enumeration of the TENTATIVE partitions in
>> +     which each SSA version ended up, so that we compute conflicts
>> +     between all SSA versions that ended up in the same potential
>> +     coalesce partition.  */
>> +  bitmap_iterator bi;
>> +  unsigned i;
>> +  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
>> +    {
>> +      int pidx = var_to_partition (map, ssa_name (i));
>> +      int base = partition_find (tentative, pidx);
>> +      if (index_map[base] != no_part)
>> +       continue;
>> +      index_map[base] = count++;
>> +    }
>> +
>> +  map->num_basevars = count;
>> +
>> +  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
>> +    {
>> +      int pidx = var_to_partition (map, ssa_name (i));
>> +      int base = partition_find (tentative, pidx);
>> +      gcc_assert (index_map[base] < count);
>> +      map->partition_to_base_index[pidx] = index_map[base];
>> +    }
>> +
>> +  if (dump_file && (dump_flags & TDF_DETAILS))
>> +    dump_part_var_map (dump_file, tentative, map);
>> +
>> +  partition_delete (tentative);
>> +}
>> +
>> +/* Hashtable helpers.  */
>> +
>> +struct tree_int_map_hasher : typed_noop_remove <tree_int_map>
>> +{
>> +  typedef tree_int_map *value_type;
>> +  typedef tree_int_map *compare_type;
>> +  static inline hashval_t hash (const tree_int_map *);
>> +  static inline bool equal (const tree_int_map *, const tree_int_map *);
>> +};
>> +
>> +inline hashval_t
>> +tree_int_map_hasher::hash (const tree_int_map *v)
>> +{
>> +  return tree_map_base_hash (v);
>> +}
>> +
>> +inline bool
>> +tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
>> +{
>> +  return tree_int_map_eq (v, c);
>> +}
>> +
>> +/* This routine will initialize the basevar fields of MAP with base
>> +   names.  Partitions will share the same base if they have the same
>> +   SSA_NAME_VAR, or, being anonymous variables, the same type.  This
>> +   must match gimple_can_coalesce_p in the non-optimized case.  */
>> +
>> +static void
>> +compute_samebase_partition_bases (var_map map)
>> +{
>> +  int x, num_part;
>> +  tree var;
>> +  struct tree_int_map *m, *mapstorage;
>> +
>> +  num_part = num_var_partitions (map);
>> +  hash_table<tree_int_map_hasher> tree_to_index (num_part);
>> +  /* We can have at most num_part entries in the hash tables, so it's
>> +     enough to allocate so many map elements once, saving some malloc
>> +     calls.  */
>> +  mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
>> +
>> +  /* If a base table already exists, clear it, otherwise create it.  */
>> +  free (map->partition_to_base_index);
>> +  map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
>> +
>> +  /* Build the base variable list, and point partitions at their bases.  */
>> +  for (x = 0; x < num_part; x++)
>> +    {
>> +      struct tree_int_map **slot;
>> +      unsigned baseindex;
>> +      var = partition_to_var (map, x);
>> +      if (SSA_NAME_VAR (var)
>> +         && (!VAR_P (SSA_NAME_VAR (var))
>> +             || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
>> +       m->base.from = SSA_NAME_VAR (var);
>> +      else
>> +       /* This restricts what anonymous SSA names we can coalesce
>> +          as it restricts the sets we compute conflicts for.
>> +          Using TREE_TYPE to generate sets is the easiest as
>> +          type equivalency also holds for SSA names with the same
>> +          underlying decl.
>> +
>> +          Check gimple_can_coalesce_p when changing this code.  */
>> +       m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
>> +                       ? TYPE_CANONICAL (TREE_TYPE (var))
>> +                       : TREE_TYPE (var));
>> +      /* If base variable hasn't been seen, set it up.  */
>> +      slot = tree_to_index.find_slot (m, INSERT);
>> +      if (!*slot)
>> +       {
>> +         baseindex = m - mapstorage;
>> +         m->to = baseindex;
>> +         *slot = m;
>> +         m++;
>> +       }
>> +      else
>> +       baseindex = (*slot)->to;
>> +      map->partition_to_base_index[x] = baseindex;
>> +    }
>> +
>> +  map->num_basevars = m - mapstorage;
>> +
>> +  free (mapstorage);
>> +}
>> +
>>  /* Reduce the number of copies by coalescing variables in the function.  Return
>>     a partition map with the resulting coalesces.  */
>>
>> @@ -1286,9 +1647,10 @@ coalesce_ssa_name (void)
>>    cl = create_coalesce_list ();
>>    map = create_outofssa_var_map (cl, used_in_copies);
>>
>> -  /* If optimization is disabled, we need to coalesce all the names originating
>> -     from the same SSA_NAME_VAR so debug info remains undisturbed.  */
>> -  if (!optimize)
>> +  /* If this optimization is disabled, we need to coalesce all the
>> +     names originating from the same SSA_NAME_VAR so debug info
>> +     remains undisturbed.  */
>> +  if (!flag_tree_coalesce_vars)
>>      {
>>        hash_table<ssa_name_var_hash> ssa_name_hash (10);
>>
>> @@ -1329,8 +1691,13 @@ coalesce_ssa_name (void)
>>    if (dump_file && (dump_flags & TDF_DETAILS))
>>      dump_var_map (dump_file, map);
>>
>> -  /* Don't calculate live ranges for variables not in the coalesce list.  */
>> -  partition_view_bitmap (map, used_in_copies, true);
>> +  partition_view_bitmap (map, used_in_copies);
>> +
>> +  if (flag_tree_coalesce_vars)
>> +    compute_optimized_partition_bases (map, used_in_copies, cl);
>> +  else
>> +    compute_samebase_partition_bases (map);
>> +
>>    BITMAP_FREE (used_in_copies);
>>
>>    if (num_var_partitions (map) < 1)
>> @@ -1369,8 +1736,7 @@ coalesce_ssa_name (void)
>>
>>    /* Now coalesce everything in the list.  */
>>    coalesce_partitions (map, graph, cl,
>> -                      ((dump_flags & TDF_DETAILS) ? dump_file
>> -                                                  : NULL));
>> +                      ((dump_flags & TDF_DETAILS) ? dump_file : NULL));
>>
>>    delete_coalesce_list (cl);
>>    ssa_conflicts_delete (graph);
>> diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h
>> index 99b188a..ae289b4 100644
>> --- a/gcc/tree-ssa-coalesce.h
>> +++ b/gcc/tree-ssa-coalesce.h
>> @@ -21,5 +21,6 @@ along with GCC; see the file COPYING3.  If not see
>>  #define GCC_TREE_SSA_COALESCE_H
>>
>>  extern var_map coalesce_ssa_name (void);
>> +extern bool gimple_can_coalesce_p (tree, tree);
>>
>>  #endif /* GCC_TREE_SSA_COALESCE_H */
>> diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c
>> deleted file mode 100644
>> index f3cb56e..0000000
>> --- a/gcc/tree-ssa-copyrename.c
>> +++ /dev/null
>> @@ -1,499 +0,0 @@
>> -/* Rename SSA copies.
>> -   Copyright (C) 2004-2015 Free Software Foundation, Inc.
>> -   Contributed by Andrew MacLeod <amacleod@redhat.com>
>> -
>> -This file is part of GCC.
>> -
>> -GCC is free software; you can redistribute it and/or modify
>> -it under the terms of the GNU General Public License as published by
>> -the Free Software Foundation; either version 3, or (at your option)
>> -any later version.
>> -
>> -GCC is distributed in the hope that it will be useful,
>> -but WITHOUT ANY WARRANTY; without even the implied warranty of
>> -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> -GNU General Public License for more details.
>> -
>> -You should have received a copy of the GNU General Public License
>> -along with GCC; see the file COPYING3.  If not see
>> -<http://www.gnu.org/licenses/>.  */
>> -
>> -#include "config.h"
>> -#include "system.h"
>> -#include "coretypes.h"
>> -#include "tm.h"
>> -#include "hash-set.h"
>> -#include "machmode.h"
>> -#include "vec.h"
>> -#include "double-int.h"
>> -#include "input.h"
>> -#include "alias.h"
>> -#include "symtab.h"
>> -#include "wide-int.h"
>> -#include "inchash.h"
>> -#include "tree.h"
>> -#include "fold-const.h"
>> -#include "predict.h"
>> -#include "hard-reg-set.h"
>> -#include "function.h"
>> -#include "dominance.h"
>> -#include "cfg.h"
>> -#include "basic-block.h"
>> -#include "tree-ssa-alias.h"
>> -#include "internal-fn.h"
>> -#include "gimple-expr.h"
>> -#include "is-a.h"
>> -#include "gimple.h"
>> -#include "gimple-iterator.h"
>> -#include "flags.h"
>> -#include "tree-pretty-print.h"
>> -#include "bitmap.h"
>> -#include "gimple-ssa.h"
>> -#include "stringpool.h"
>> -#include "tree-ssanames.h"
>> -#include "hashtab.h"
>> -#include "rtl.h"
>> -#include "statistics.h"
>> -#include "real.h"
>> -#include "fixed-value.h"
>> -#include "insn-config.h"
>> -#include "expmed.h"
>> -#include "dojump.h"
>> -#include "explow.h"
>> -#include "calls.h"
>> -#include "emit-rtl.h"
>> -#include "varasm.h"
>> -#include "stmt.h"
>> -#include "expr.h"
>> -#include "tree-dfa.h"
>> -#include "tree-inline.h"
>> -#include "tree-ssa-live.h"
>> -#include "tree-pass.h"
>> -#include "langhooks.h"
>> -
>> -static struct
>> -{
>> -  /* Number of copies coalesced.  */
>> -  int coalesced;
>> -} stats;
>> -
>> -/* The following routines implement the SSA copy renaming phase.
>> -
>> -   This optimization looks for copies between 2 SSA_NAMES, either through a
>> -   direct copy, or an implicit one via a PHI node result and its arguments.
>> -
>> -   Each copy is examined to determine if it is possible to rename the base
>> -   variable of one of the operands to the same variable as the other operand.
>> -   i.e.
>> -   T.3_5 = <blah>
>> -   a_1 = T.3_5
>> -
>> -   If this copy couldn't be copy propagated, it could possibly remain in the
>> -   program throughout the optimization phases.   After SSA->normal, it would
>> -   become:
>> -
>> -   T.3 = <blah>
>> -   a = T.3
>> -
>> -   Since T.3_5 is distinct from all other SSA versions of T.3, there is no
>> -   fundamental reason why the base variable needs to be T.3, subject to
>> -   certain restrictions.  This optimization attempts to determine if we can
>> -   change the base variable on copies like this, and result in code such as:
>> -
>> -   a_5 = <blah>
>> -   a_1 = a_5
>> -
>> -   This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is
>> -   possible, the copy goes away completely. If it isn't possible, a new temp
>> -   will be created for a_5, and you will end up with the exact same code:
>> -
>> -   a.8 = <blah>
>> -   a = a.8
>> -
>> -   The other benefit of performing this optimization relates to what variables
>> -   are chosen in copies.  Gimplification of the program uses temporaries for
>> -   a lot of things. expressions like
>> -
>> -   a_1 = <blah>
>> -   <blah2> = a_1
>> -
>> -   get turned into
>> -
>> -   T.3_5 = <blah>
>> -   a_1 = T.3_5
>> -   <blah2> = a_1
>> -
>> -   Copy propagation is done in a forward direction, and if we can propagate
>> -   through the copy, we end up with:
>> -
>> -   T.3_5 = <blah>
>> -   <blah2> = T.3_5
>> -
>> -   The copy is gone, but so is all reference to the user variable 'a'. By
>> -   performing this optimization, we would see the sequence:
>> -
>> -   a_5 = <blah>
>> -   a_1 = a_5
>> -   <blah2> = a_1
>> -
>> -   which copy propagation would then turn into:
>> -
>> -   a_5 = <blah>
>> -   <blah2> = a_5
>> -
>> -   and so we still retain the user variable whenever possible.  */
>> -
>> -
>> -/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
>> -   Choose a representative for the partition, and send debug info to DEBUG.  */
>> -
>> -static void
>> -copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
>> -{
>> -  int p1, p2, p3;
>> -  tree root1, root2;
>> -  tree rep1, rep2;
>> -  bool ign1, ign2, abnorm;
>> -
>> -  gcc_assert (TREE_CODE (var1) == SSA_NAME);
>> -  gcc_assert (TREE_CODE (var2) == SSA_NAME);
>> -
>> -  register_ssa_partition (map, var1);
>> -  register_ssa_partition (map, var2);
>> -
>> -  p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
>> -  p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
>> -
>> -  if (debug)
>> -    {
>> -      fprintf (debug, "Try : ");
>> -      print_generic_expr (debug, var1, TDF_SLIM);
>> -      fprintf (debug, "(P%d) & ", p1);
>> -      print_generic_expr (debug, var2, TDF_SLIM);
>> -      fprintf (debug, "(P%d)", p2);
>> -    }
>> -
>> -  gcc_assert (p1 != NO_PARTITION);
>> -  gcc_assert (p2 != NO_PARTITION);
>> -
>> -  if (p1 == p2)
>> -    {
>> -      if (debug)
>> -       fprintf (debug, " : Already coalesced.\n");
>> -      return;
>> -    }
>> -
>> -  rep1 = partition_to_var (map, p1);
>> -  rep2 = partition_to_var (map, p2);
>> -  root1 = SSA_NAME_VAR (rep1);
>> -  root2 = SSA_NAME_VAR (rep2);
>> -  if (!root1 && !root2)
>> -    return;
>> -
>> -  /* Don't coalesce if one of the variables occurs in an abnormal PHI.  */
>> -  abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1)
>> -           || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2));
>> -  if (abnorm)
>> -    {
>> -      if (debug)
>> -       fprintf (debug, " : Abnormal PHI barrier.  No coalesce.\n");
>> -      return;
>> -    }
>> -
>> -  /* Partitions already have the same root, simply merge them.  */
>> -  if (root1 == root2)
>> -    {
>> -      p1 = partition_union (map->var_partition, p1, p2);
>> -      if (debug)
>> -       fprintf (debug, " : Same root, coalesced --> P%d.\n", p1);
>> -      return;
>> -    }
>> -
>> -  /* Never attempt to coalesce 2 different parameters.  */
>> -  if ((root1 && TREE_CODE (root1) == PARM_DECL)
>> -      && (root2 && TREE_CODE (root2) == PARM_DECL))
>> -    {
>> -      if (debug)
>> -        fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n");
>> -      return;
>> -    }
>> -
>> -  if ((root1 && TREE_CODE (root1) == RESULT_DECL)
>> -      != (root2 && TREE_CODE (root2) == RESULT_DECL))
>> -    {
>> -      if (debug)
>> -        fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n");
>> -      return;
>> -    }
>> -
>> -  ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1));
>> -  ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2));
>> -
>> -  /* Refrain from coalescing user variables, if requested.  */
>> -  if (!ign1 && !ign2)
>> -    {
>> -      if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2))
>> -       ign2 = true;
>> -      else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1))
>> -       ign1 = true;
>> -      else if (flag_ssa_coalesce_vars != 2)
>> -       {
>> -         if (debug)
>> -           fprintf (debug, " : 2 different USER vars. No coalesce.\n");
>> -         return;
>> -       }
>> -      else
>> -       ign2 = true;
>> -    }
>> -
>> -  /* If both values have default defs, we can't coalesce.  If only one has a
>> -     tag, make sure that variable is the new root partition.  */
>> -  if (root1 && ssa_default_def (cfun, root1))
>> -    {
>> -      if (root2 && ssa_default_def (cfun, root2))
>> -       {
>> -         if (debug)
>> -           fprintf (debug, " : 2 default defs. No coalesce.\n");
>> -         return;
>> -       }
>> -      else
>> -        {
>> -         ign2 = true;
>> -         ign1 = false;
>> -       }
>> -    }
>> -  else if (root2 && ssa_default_def (cfun, root2))
>> -    {
>> -      ign1 = true;
>> -      ign2 = false;
>> -    }
>> -
>> -  /* Do not coalesce if we cannot assign a symbol to the partition.  */
>> -  if (!(!ign2 && root2)
>> -      && !(!ign1 && root1))
>> -    {
>> -      if (debug)
>> -       fprintf (debug, " : Choosen variable has no root.  No coalesce.\n");
>> -      return;
>> -    }
>> -
>> -  /* Don't coalesce if the new chosen root variable would be read-only.
>> -     If both ign1 && ign2, then the root var of the larger partition
>> -     wins, so reject in that case if any of the root vars is TREE_READONLY.
>> -     Otherwise reject only if the root var, on which replace_ssa_name_symbol
>> -     will be called below, is readonly.  */
>> -  if (((root1 && TREE_READONLY (root1)) && ign2)
>> -      || ((root2 && TREE_READONLY (root2)) && ign1))
>> -    {
>> -      if (debug)
>> -       fprintf (debug, " : Readonly variable.  No coalesce.\n");
>> -      return;
>> -    }
>> -
>> -  /* Don't coalesce if the two variables aren't type compatible .  */
>> -  if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2))
>> -      /* There is a disconnect between the middle-end type-system and
>> -         VRP, avoid coalescing enum types with different bounds.  */
>> -      || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE
>> -          || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE)
>> -         && TREE_TYPE (var1) != TREE_TYPE (var2)))
>> -    {
>> -      if (debug)
>> -       fprintf (debug, " : Incompatible types.  No coalesce.\n");
>> -      return;
>> -    }
>> -
>> -  /* Merge the two partitions.  */
>> -  p3 = partition_union (map->var_partition, p1, p2);
>> -
>> -  /* Set the root variable of the partition to the better choice, if there is
>> -     one.  */
>> -  if (!ign2 && root2)
>> -    replace_ssa_name_symbol (partition_to_var (map, p3), root2);
>> -  else if (!ign1 && root1)
>> -    replace_ssa_name_symbol (partition_to_var (map, p3), root1);
>> -  else
>> -    gcc_unreachable ();
>> -
>> -  if (debug)
>> -    {
>> -      fprintf (debug, " --> P%d ", p3);
>> -      print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)),
>> -                         TDF_SLIM);
>> -      fprintf (debug, "\n");
>> -    }
>> -}
>> -
>> -
>> -namespace {
>> -
>> -const pass_data pass_data_rename_ssa_copies =
>> -{
>> -  GIMPLE_PASS, /* type */
>> -  "copyrename", /* name */
>> -  OPTGROUP_NONE, /* optinfo_flags */
>> -  TV_TREE_COPY_RENAME, /* tv_id */
>> -  ( PROP_cfg | PROP_ssa ), /* properties_required */
>> -  0, /* properties_provided */
>> -  0, /* properties_destroyed */
>> -  0, /* todo_flags_start */
>> -  0, /* todo_flags_finish */
>> -};
>> -
>> -class pass_rename_ssa_copies : public gimple_opt_pass
>> -{
>> -public:
>> -  pass_rename_ssa_copies (gcc::context *ctxt)
>> -    : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt)
>> -  {}
>> -
>> -  /* opt_pass methods: */
>> -  opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); }
>> -  virtual bool gate (function *) { return flag_tree_copyrename != 0; }
>> -  virtual unsigned int execute (function *);
>> -
>> -}; // class pass_rename_ssa_copies
>> -
>> -/* This function will make a pass through the IL, and attempt to coalesce any
>> -   SSA versions which occur in PHI's or copies.  Coalescing is accomplished by
>> -   changing the underlying root variable of all coalesced version.  This will
>> -   then cause the SSA->normal pass to attempt to coalesce them all to the same
>> -   variable.  */
>> -
>> -unsigned int
>> -pass_rename_ssa_copies::execute (function *fun)
>> -{
>> -  var_map map;
>> -  basic_block bb;
>> -  tree var, part_var;
>> -  gimple stmt;
>> -  unsigned x;
>> -  FILE *debug;
>> -
>> -  memset (&stats, 0, sizeof (stats));
>> -
>> -  if (dump_file && (dump_flags & TDF_DETAILS))
>> -    debug = dump_file;
>> -  else
>> -    debug = NULL;
>> -
>> -  map = init_var_map (num_ssa_names);
>> -
>> -  FOR_EACH_BB_FN (bb, fun)
>> -    {
>> -      /* Scan for real copies.  */
>> -      for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
>> -          gsi_next (&gsi))
>> -       {
>> -         stmt = gsi_stmt (gsi);
>> -         if (gimple_assign_ssa_name_copy_p (stmt))
>> -           {
>> -             tree lhs = gimple_assign_lhs (stmt);
>> -             tree rhs = gimple_assign_rhs1 (stmt);
>> -
>> -             copy_rename_partition_coalesce (map, lhs, rhs, debug);
>> -           }
>> -       }
>> -    }
>> -
>> -  FOR_EACH_BB_FN (bb, fun)
>> -    {
>> -      /* Treat PHI nodes as copies between the result and each argument.  */
>> -      for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
>> -          gsi_next (&gsi))
>> -        {
>> -          size_t i;
>> -         tree res;
>> -         gphi *phi = gsi.phi ();
>> -         res = gimple_phi_result (phi);
>> -
>> -         /* Do not process virtual SSA_NAMES.  */
>> -         if (virtual_operand_p (res))
>> -           continue;
>> -
>> -         /* Make sure to only use the same partition for an argument
>> -            as the result but never the other way around.  */
>> -         if (SSA_NAME_VAR (res)
>> -             && !DECL_IGNORED_P (SSA_NAME_VAR (res)))
>> -           for (i = 0; i < gimple_phi_num_args (phi); i++)
>> -             {
>> -               tree arg = PHI_ARG_DEF (phi, i);
>> -               if (TREE_CODE (arg) == SSA_NAME)
>> -                 copy_rename_partition_coalesce (map, res, arg,
>> -                                                 debug);
>> -             }
>> -         /* Else if all arguments are in the same partition try to merge
>> -            it with the result.  */
>> -         else
>> -           {
>> -             int all_p_same = -1;
>> -             int p = -1;
>> -             for (i = 0; i < gimple_phi_num_args (phi); i++)
>> -               {
>> -                 tree arg = PHI_ARG_DEF (phi, i);
>> -                 if (TREE_CODE (arg) != SSA_NAME)
>> -                   {
>> -                     all_p_same = 0;
>> -                     break;
>> -                   }
>> -                 else if (all_p_same == -1)
>> -                   {
>> -                     p = partition_find (map->var_partition,
>> -                                         SSA_NAME_VERSION (arg));
>> -                     all_p_same = 1;
>> -                   }
>> -                 else if (all_p_same == 1
>> -                          && p != partition_find (map->var_partition,
>> -                                                  SSA_NAME_VERSION (arg)))
>> -                   {
>> -                     all_p_same = 0;
>> -                     break;
>> -                   }
>> -               }
>> -             if (all_p_same == 1)
>> -               copy_rename_partition_coalesce (map, res,
>> -                                               PHI_ARG_DEF (phi, 0),
>> -                                               debug);
>> -           }
>> -        }
>> -    }
>> -
>> -  if (debug)
>> -    dump_var_map (debug, map);
>> -
>> -  /* Now one more pass to make all elements of a partition share the same
>> -     root variable.  */
>> -
>> -  for (x = 1; x < num_ssa_names; x++)
>> -    {
>> -      part_var = partition_to_var (map, x);
>> -      if (!part_var)
>> -        continue;
>> -      var = ssa_name (x);
>> -      if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var))
>> -       continue;
>> -      if (debug)
>> -        {
>> -         fprintf (debug, "Coalesced ");
>> -         print_generic_expr (debug, var, TDF_SLIM);
>> -         fprintf (debug, " to ");
>> -         print_generic_expr (debug, part_var, TDF_SLIM);
>> -         fprintf (debug, "\n");
>> -       }
>> -      stats.coalesced++;
>> -      replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var));
>> -    }
>> -
>> -  statistics_counter_event (fun, "copies coalesced",
>> -                           stats.coalesced);
>> -  delete_var_map (map);
>> -  return 0;
>> -}
>> -
>> -} // anon namespace
>> -
>> -gimple_opt_pass *
>> -make_pass_rename_ssa_copies (gcc::context *ctxt)
>> -{
>> -  return new pass_rename_ssa_copies (ctxt);
>> -}
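
For readers following the removal, the core of the deleted pass can be sketched as a small union-find over SSA versions. This is only a toy model under loud assumptions: `toy_ssa_name`, `toy_find`, `toy_coalesce_copy` and `toy_rename_all` are hypothetical names, an anonymous temporary is modelled as a NULL base, and the PARM_DECL, RESULT_DECL, read-only, default-def and type checks of the real function are left out.

```c
#include <assert.h>
#include <string.h>

#define MAX_NAMES 16

/* Hypothetical stand-in for an SSA name, indexed by version number.
   BASE is the name of its base variable, or NULL for an anonymous
   temporary.  */
struct toy_ssa_name
{
  const char *base;
};

struct toy_ssa_name names[MAX_NAMES];
int parent[MAX_NAMES];

/* Put every version in its own partition, with no base variable.  */
void
toy_init (void)
{
  for (int i = 0; i < MAX_NAMES; i++)
    {
      parent[i] = i;
      names[i].base = NULL;
    }
}

/* Return the representative of the partition holding version V.  */
int
toy_find (int v)
{
  while (parent[v] != v)
    {
      parent[v] = parent[parent[v]];
      v = parent[v];
    }
  return v;
}

/* Try to coalesce the partitions on both sides of a copy LHS = RHS.
   Return the merged representative, or -1 if coalescing is refused.  */
int
toy_coalesce_copy (int lhs, int rhs)
{
  int p1 = toy_find (lhs), p2 = toy_find (rhs);
  if (p1 == p2)
    return p1;

  const char *b1 = names[p1].base, *b2 = names[p2].base;
  /* No base variable on either side: nothing to rename to.  */
  if (!b1 && !b2)
    return -1;
  /* Two different user variables: refuse, as the default setting did.  */
  if (b1 && b2 && strcmp (b1, b2) != 0)
    return -1;

  parent[p2] = p1;
  names[p1].base = b1 ? b1 : b2;
  return p1;
}

/* Final pass: every version takes the base of its representative.  */
void
toy_rename_all (void)
{
  for (int v = 0; v < MAX_NAMES; v++)
    names[v].base = names[toy_find (v)].base;
}
```

With version 1 based on "a" and version 5 anonymous, coalescing the copy and then calling toy_rename_all leaves version 5 based on "a", which is the T.3_5 to a_5 renaming described in the removed comment.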
>> diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
>> index 2c7c072..821b2f4 100644
>> --- a/gcc/tree-ssa-live.c
>> +++ b/gcc/tree-ssa-live.c
>> @@ -100,90 +100,6 @@ static void  verify_live_on_entry (tree_live_info_p);
>>     ssa_name or variable, and vice versa.  */
>>
>>
>> -/* Hashtable helpers.  */
>> -
>> -struct tree_int_map_hasher : typed_noop_remove <tree_int_map>
>> -{
>> -  typedef tree_int_map *value_type;
>> -  typedef tree_int_map *compare_type;
>> -  static inline hashval_t hash (const tree_int_map *);
>> -  static inline bool equal (const tree_int_map *, const tree_int_map *);
>> -};
>> -
>> -inline hashval_t
>> -tree_int_map_hasher::hash (const tree_int_map *v)
>> -{
>> -  return tree_map_base_hash (v);
>> -}
>> -
>> -inline bool
>> -tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
>> -{
>> -  return tree_int_map_eq (v, c);
>> -}
>> -
>> -
>> -/* This routine will initialize the basevar fields of MAP.  */
>> -
>> -static void
>> -var_map_base_init (var_map map)
>> -{
>> -  int x, num_part;
>> -  tree var;
>> -  struct tree_int_map *m, *mapstorage;
>> -
>> -  num_part = num_var_partitions (map);
>> -  hash_table<tree_int_map_hasher> tree_to_index (num_part);
>> -  /* We can have at most num_part entries in the hash tables, so it's
>> -     enough to allocate so many map elements once, saving some malloc
>> -     calls.  */
>> -  mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
>> -
>> -  /* If a base table already exists, clear it, otherwise create it.  */
>> -  free (map->partition_to_base_index);
>> -  map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
>> -
>> -  /* Build the base variable list, and point partitions at their bases.  */
>> -  for (x = 0; x < num_part; x++)
>> -    {
>> -      struct tree_int_map **slot;
>> -      unsigned baseindex;
>> -      var = partition_to_var (map, x);
>> -      if (SSA_NAME_VAR (var)
>> -         && (!VAR_P (SSA_NAME_VAR (var))
>> -             || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
>> -       m->base.from = SSA_NAME_VAR (var);
>> -      else
>> -       /* This restricts what anonymous SSA names we can coalesce
>> -          as it restricts the sets we compute conflicts for.
>> -          Using TREE_TYPE to generate sets is the easies as
>> -          type equivalency also holds for SSA names with the same
>> -          underlying decl.
>> -
>> -          Check gimple_can_coalesce_p when changing this code.  */
>> -       m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
>> -                       ? TYPE_CANONICAL (TREE_TYPE (var))
>> -                       : TREE_TYPE (var));
>> -      /* If base variable hasn't been seen, set it up.  */
>> -      slot = tree_to_index.find_slot (m, INSERT);
>> -      if (!*slot)
>> -       {
>> -         baseindex = m - mapstorage;
>> -         m->to = baseindex;
>> -         *slot = m;
>> -         m++;
>> -       }
>> -      else
>> -       baseindex = (*slot)->to;
>> -      map->partition_to_base_index[x] = baseindex;
>> -    }
>> -
>> -  map->num_basevars = m - mapstorage;
>> -
>> -  free (mapstorage);
>> -}
>> -
>> -
>>  /* Remove the base table in MAP.  */
>>
>>  static void
>> @@ -361,21 +277,17 @@ partition_view_fini (var_map map, bitmap selected)
>>  }
>>
>>
>> -/* Create a partition view which includes all the used partitions in MAP.  If
>> -   WANT_BASES is true, create the base variable map as well.  */
>> +/* Create a partition view which includes all the used partitions in MAP.  */
>>
>>  void
>> -partition_view_normal (var_map map, bool want_bases)
>> +partition_view_normal (var_map map)
>>  {
>>    bitmap used;
>>
>>    used = partition_view_init (map);
>>    partition_view_fini (map, used);
>>
>> -  if (want_bases)
>> -    var_map_base_init (map);
>> -  else
>> -    var_map_base_fini (map);
>> +  var_map_base_fini (map);
>>  }
>>
>>
>> @@ -384,7 +296,7 @@ partition_view_normal (var_map map, bool want_bases)
>>     as well.  */
>>
>>  void
>> -partition_view_bitmap (var_map map, bitmap only, bool want_bases)
>> +partition_view_bitmap (var_map map, bitmap only)
>>  {
>>    bitmap used;
>>    bitmap new_partitions = BITMAP_ALLOC (NULL);
>> @@ -400,10 +312,7 @@ partition_view_bitmap (var_map map, bitmap only, bool want_bases)
>>      }
>>    partition_view_fini (map, new_partitions);
>>
>> -  if (want_bases)
>> -    var_map_base_init (map);
>> -  else
>> -    var_map_base_fini (map);
>> +  var_map_base_fini (map);
>>  }
>>
>>
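
The removed var_map_base_init boils down to giving each partition a dense base index, shared by all partitions with the same key (base decl, or canonical type for anonymous names). A minimal sketch of that numbering, assuming a linear scan in place of the hash table and a hypothetical helper named `toy_base_init`:

```c
#include <assert.h>
#include <stddef.h>

/* Assign each of the N partitions a dense base index in BASE_INDEX.
   KEYS[x] stands in for the base decl or canonical type of partition
   X; partitions with the same key pointer share an index.  Indices
   are handed out in order of first appearance.  Return the number of
   distinct bases.  */
int
toy_base_init (const void *keys[], int n, int *base_index)
{
  int num_bases = 0;

  for (int x = 0; x < n; x++)
    {
      int found = -1;

      /* Linear lookup; the real code used a hash table here.  */
      for (int y = 0; y < x; y++)
        if (keys[y] == keys[x])
          {
            found = base_index[y];
            break;
          }

      base_index[x] = found >= 0 ? found : num_bases++;
    }

  return num_bases;
}
```

Conflicts then only need to be computed among partitions that share a base index, which is why the choice of key limits what can be coalesced.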
>> diff --git a/gcc/tree-ssa-live.h b/gcc/tree-ssa-live.h
>> index d5d7820..1f88358 100644
>> --- a/gcc/tree-ssa-live.h
>> +++ b/gcc/tree-ssa-live.h
>> @@ -71,8 +71,8 @@ typedef struct _var_map
>>  extern var_map init_var_map (int);
>>  extern void delete_var_map (var_map);
>>  extern int var_union (var_map, tree, tree);
>> -extern void partition_view_normal (var_map, bool);
>> -extern void partition_view_bitmap (var_map, bitmap, bool);
>> +extern void partition_view_normal (var_map);
>> +extern void partition_view_bitmap (var_map, bitmap);
>>  extern void dump_scope_blocks (FILE *, int);
>>  extern void debug_scope_block (tree, int);
>>  extern void debug_scope_blocks (int);
>> diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
>> index 3f6bebe..7bef8cf 100644
>> --- a/gcc/tree-ssa-loop-niter.c
>> +++ b/gcc/tree-ssa-loop-niter.c
>> @@ -3964,7 +3964,11 @@ loop_exits_before_overflow (tree base, tree step,
>>         if (!CONVERT_EXPR_P (e) || !operand_equal_p (e, unsigned_base, 0))
>>           continue;
>>         e = TREE_OPERAND (e, 0);
>> -       gcc_assert (operand_equal_p (e, base, 0));
>> +       /* If E has an unsigned type, the operand equality test below
>> +          would fail, but the equality test above would have already
>> +          verified the equality, so we can proceed with it.  */
>> +       gcc_assert (TYPE_UNSIGNED (TREE_TYPE (e))
>> +                   || operand_equal_p (e, base, 0));
>>         if (tree_int_cst_sign_bit (step))
>>           {
>>             code = LT_EXPR;
>> diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c
>> index f75a7f1..0982305 100644
>> --- a/gcc/tree-ssa-uncprop.c
>> +++ b/gcc/tree-ssa-uncprop.c
>> @@ -59,6 +59,11 @@ along with GCC; see the file COPYING3.  If not see
>>  #include "domwalk.h"
>>  #include "tree-pass.h"
>>  #include "tree-ssa-propagate.h"
>> +#include "bitmap.h"
>> +#include "stringpool.h"
>> +#include "tree-ssanames.h"
>> +#include "tree-ssa-live.h"
>> +#include "tree-ssa-coalesce.h"
>>
>>  /* The basic structure describing an equivalency created by traversing
>>     an edge.  Traversing the edge effectively means that we can assume
>> diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
>> index 0b24007..acdcd46 100644
>> --- a/gcc/var-tracking.c
>> +++ b/gcc/var-tracking.c
>> @@ -4931,12 +4931,16 @@ dataflow_set_remove_mem_locs (variable_def **slot, dataflow_set *set)
>>     registers, as well as associations between MEMs and VALUEs.  */
>>
>>  static void
>> -dataflow_set_clear_at_call (dataflow_set *set)
>> +dataflow_set_clear_at_call (dataflow_set *set, rtx_insn *call_insn)
>>  {
>>    unsigned int r;
>>    hard_reg_set_iterator hrsi;
>> +  HARD_REG_SET invalidated_regs;
>>
>> -  EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, r, hrsi)
>> +  get_call_reg_set_usage (call_insn, &invalidated_regs,
>> +                         regs_invalidated_by_call);
>> +
>> +  EXECUTE_IF_SET_IN_HARD_REG_SET (invalidated_regs, 0, r, hrsi)
>>      var_regno_delete (set, r);
>>
>>    if (MAY_HAVE_DEBUG_INSNS)
>> @@ -6720,7 +6724,7 @@ compute_bb_dataflow (basic_block bb)
>>        switch (mo->type)
>>         {
>>           case MO_CALL:
>> -           dataflow_set_clear_at_call (out);
>> +           dataflow_set_clear_at_call (out, insn);
>>             break;
>>
>>           case MO_USE:
>> @@ -9182,7 +9186,7 @@ emit_notes_in_bb (basic_block bb, dataflow_set *set)
>>        switch (mo->type)
>>         {
>>           case MO_CALL:
>> -           dataflow_set_clear_at_call (set);
>> +           dataflow_set_clear_at_call (set, insn);
>>             emit_notes_for_changes (insn, EMIT_NOTE_AFTER_CALL_INSN, set->vars);
>>             {
>>               rtx arguments = mo->u.loc, *p = &arguments;
>>
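
The var-tracking hunks above switch from the global call-clobber set to a per-call one. A rough sketch of the idea, assuming a 64-register bitmask and hypothetical names (`toy_call`, `toy_call_reg_set_usage`, `toy_clear_at_call`); it only mirrors the "use the call-specific set if known, else the default" behaviour and is not the real interface:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t toy_reg_set;

/* Hypothetical call site: CLOBBERED is only meaningful if HAS_USAGE.  */
struct toy_call
{
  bool has_usage;
  toy_reg_set clobbered;
};

/* Store in *OUT the registers CALL may clobber, falling back to
   DEFAULT_SET when nothing call-specific is known.  Return true if
   call-specific information was used.  */
bool
toy_call_reg_set_usage (const struct toy_call *call, toy_reg_set *out,
                        toy_reg_set default_set)
{
  if (call->has_usage)
    {
      *out = call->clobbered;
      return true;
    }
  *out = default_set;
  return false;
}

/* Drop tracking for every register invalidated by CALL.  Return the
   number of tracked registers that were dropped.  */
int
toy_clear_at_call (bool tracked[64], const struct toy_call *call,
                   toy_reg_set default_set)
{
  toy_reg_set invalidated;
  int dropped = 0;

  toy_call_reg_set_usage (call, &invalidated, default_set);

  for (int r = 0; r < 64; r++)
    if (((invalidated >> r) & 1) && tracked[r])
      {
        tracked[r] = false;
        dropped++;
      }

  return dropped;
}
```

A call with known usage drops fewer variable locations than the conservative default would.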
>>
>>
>> And here's the incremental patch:
>>
>> ---
>>  gcc/alias.c               |   17 +++++++------
>>  gcc/cfgexpand.c           |   57 +++++++++++++++++----------------------------
>>  gcc/emit-rtl.c            |    2 --
>>  gcc/explow.c              |    3 --
>>  gcc/expr.c                |   16 +++++--------
>>  gcc/function.c            |   15 ++++++++++++
>>  gcc/gimple-expr.h         |    4 ---
>>  gcc/tree-outof-ssa.c      |    7 ++----
>>  gcc/tree-ssa-coalesce.h   |    1 +
>>  gcc/tree-ssa-loop-niter.c |    6 ++++-
>>  gcc/tree-ssa-uncprop.c    |    5 ++++
>>  11 files changed, 64 insertions(+), 69 deletions(-)
>>
>> diff --git a/gcc/alias.c b/gcc/alias.c
>> index 7a74e81..5a031d9 100644
>> --- a/gcc/alias.c
>> +++ b/gcc/alias.c
>> @@ -2553,14 +2553,15 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
>>      return 0;
>>
>>    /* If we refer to different gimple registers, or one gimple register
>> -     and one non-gimple-register, we know they can't overlap.  Now,
>> -     there could be more than one stack slot for (different versions
>> -     of) the same gimple register, but we can presumably tell they
>> -     don't overlap based on offsets from stack base addresses
>> -     elsewhere.  It's important that we don't proceed to DECL_RTL,
>> -     because gimple registers may not pass DECL_RTL_SET_P, and
>> -     make_decl_rtl won't be able to do anything about them since no
>> -     SSA information will have remained to guide it.  */
>> +     and one non-gimple-register, we know they can't overlap.  First,
>> +     gimple registers don't have their addresses taken.  Now, there
>> +     could be more than one stack slot for (different versions of) the
>> +     same gimple register, but we can presumably tell they don't
>> +     overlap based on offsets from stack base addresses elsewhere.
>> +     It's important that we don't proceed to DECL_RTL, because gimple
>> +     registers may not pass DECL_RTL_SET_P, and make_decl_rtl won't be
>> +     able to do anything about them since no SSA information will have
>> +     remained to guide it.  */
>>    if (is_gimple_reg (exprx) || is_gimple_reg (expry))
>>      return exprx != expry;
>>
>> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
>> index 3e80b4a..bf972fc 100644
>> --- a/gcc/cfgexpand.c
>> +++ b/gcc/cfgexpand.c
>> @@ -179,11 +179,10 @@ gimple_assign_rhs_to_tree (gimple stmt)
>>
>>  #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
>>
>> -/* NEXT is a DECL to be associated with some RTL, CUR is a DECL or a
>> -   TREE_LIST of DECLs.  If NEXT is covered by CUR, return CUR
>> -   unchanged.  Otherwise, return a list with all entries of CUR, with
>> -   NEXT at the end.  If CUR was a list, it will be modified in
>> -   place.  */
>> +/* Choose either CUR or NEXT as the leader DECL for a partition.
>> +   Prefer ignored decls, to simplify debug dumps and reduce ambiguity
>> +   out of the same user variable being in multiple partitions (this is
>> +   less likely for compiler-introduced temps).  */
>>
>>  static tree
>>  leader_merge (tree cur, tree next)
>> @@ -191,26 +190,11 @@ leader_merge (tree cur, tree next)
>>    if (cur == NULL || cur == next)
>>      return next;
>>
>> -  tree list;
>> +  if (DECL_P (cur) && DECL_IGNORED_P (cur))
>> +    return cur;
>>
>> -  if (TREE_CODE (cur) == TREE_LIST)
>> -    {
>> -      /* Look for NEXT in the list.  Stop at the last node to insert
>> -        there.  */
>> -      for (list = cur; ; list = TREE_CHAIN (list))
>> -       {
>> -         if (TREE_VALUE (list) == next)
>> -           return cur;
>> -         if (!TREE_CHAIN (list))
>> -           break;
>> -       }
>> -    }
>> -  else
>> -    /* Create the first node.  */
>> -    list = build_tree_list (NULL, cur);
>> -
>> -  next = build_tree_list (NULL, next);
>> -  TREE_CHAIN (list) = next;
>> +  if (DECL_P (next) && DECL_IGNORED_P (next))
>> +    return next;
>>
>>    return cur;
>>  }
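
The new leader_merge no longer builds a TREE_LIST; it just picks one decl per partition. A standalone sketch of that selection rule, assuming a hypothetical `toy_decl` with an `ignored` flag standing in for DECL_IGNORED_P, and a non-NULL check standing in for DECL_P:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a decl; IGNORED models DECL_IGNORED_P.  */
struct toy_decl
{
  const char *name;
  bool ignored;
};

/* Choose either CUR or NEXT as the leader for a partition, preferring
   ignored decls and otherwise keeping the current leader.  */
const struct toy_decl *
toy_leader_merge (const struct toy_decl *cur, const struct toy_decl *next)
{
  if (cur == NULL || cur == next)
    return next;

  if (cur->ignored)
    return cur;

  if (next != NULL && next->ignored)
    return next;

  return cur;
}
```

Given a user variable and a compiler temporary, the temporary wins regardless of the order in which they are merged; given two user variables, the first one seen stays the leader.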
>> @@ -285,9 +269,9 @@ set_rtl (tree t, rtx x)
>>        if (cur != next)
>>         {
>>           if (MEM_P (x))
>> -           set_mem_attributes (x, SSAVAR (t), true);
>> +           set_mem_attributes (x, next, true);
>>           else
>> -           set_reg_attrs_for_decl_rtl (SSAVAR (t), x);
>> +           set_reg_attrs_for_decl_rtl (next, x);
>>         }
>>      }
>>
>> @@ -1025,9 +1009,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
>>    gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
>>
>>    x = plus_constant (Pmode, base, offset);
>> -  x = gen_rtx_MEM ((TREE_CODE (decl) != SSA_NAME || SSA_NAME_VAR (decl))
>> -                  ? DECL_MODE (SSAVAR (decl))
>> -                  : TYPE_MODE (TREE_TYPE (decl)), x);
>> +  x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
>> +                  ? TYPE_MODE (TREE_TYPE (decl))
>> +                  : DECL_MODE (SSAVAR (decl)), x);
>>
>>    if (TREE_CODE (decl) != SSA_NAME)
>>      {
>> @@ -1268,17 +1252,17 @@ expand_one_stack_var_1 (tree var)
>>    HOST_WIDE_INT size, offset;
>>    unsigned byte_align;
>>
>> -  if (TREE_CODE (var) != SSA_NAME || SSA_NAME_VAR (var))
>> -    {
>> -      size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
>> -      byte_align = align_local_variable (SSAVAR (var));
>> -    }
>> -  else
>> +  if (TREE_CODE (var) == SSA_NAME)
>>      {
>>        tree type = TREE_TYPE (var);
>>        size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
>>        byte_align = TYPE_ALIGN_UNIT (type);
>>      }
>> +  else
>> +    {
>> +      size = tree_to_uhwi (DECL_SIZE_UNIT (var));
>> +      byte_align = align_local_variable (var);
>> +    }
>>
>>    /* We handle highly aligned variables in expand_stack_vars.  */
>>    gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
>> @@ -1423,9 +1407,10 @@ expand_one_register_var (tree var)
>>           gcc_assert (REG_P (x));
>>           return;
>>         }
>> +      gcc_unreachable ();
>>      }
>>
>> -  tree decl = SSAVAR (var);
>> +  tree decl = var;
>>    tree type = TREE_TYPE (decl);
>>    machine_mode reg_mode = promote_decl_mode (decl, NULL);
>>    rtx x = gen_reg_rtx (reg_mode);
>> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
>> index 308da40..2b98946 100644
>> --- a/gcc/emit-rtl.c
>> +++ b/gcc/emit-rtl.c
>> @@ -1252,8 +1252,6 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x)
>>    if (!t)
>>      return;
>>    tree tdecl = t;
>> -  if (TREE_CODE (t) == TREE_LIST)
>> -    tdecl = TREE_VALUE (t);
>>    if (GET_CODE (x) == SUBREG)
>>      {
>>        gcc_assert (subreg_lowpart_p (x));
>> diff --git a/gcc/explow.c b/gcc/explow.c
>> index e09c032e1..5b0d49c 100644
>> --- a/gcc/explow.c
>> +++ b/gcc/explow.c
>> @@ -866,9 +866,6 @@ promote_ssa_mode (const_tree name, int *punsignedp)
>>  {
>>    gcc_assert (TREE_CODE (name) == SSA_NAME);
>>
>> -  if (SSA_NAME_VAR (name))
>> -    return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
>> -
>>    tree type = TREE_TYPE (name);
>>    int unsignedp = TYPE_UNSIGNED (type);
>>    machine_mode mode = TYPE_MODE (type);
>> diff --git a/gcc/expr.c b/gcc/expr.c
>> index effe379..5b6e16e 100644
>> --- a/gcc/expr.c
>> +++ b/gcc/expr.c
>> @@ -9584,20 +9584,16 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>>
>>           /* Get the signedness to be used for this variable.  Ensure we get
>>              the same mode we got when the variable was declared.  */
>> -         if (code == SSA_NAME
>> -             && (g = SSA_NAME_DEF_STMT (ssa_name))
>> -             && gimple_code (g) == GIMPLE_CALL
>> -             && !gimple_call_internal_p (g))
>> +         if (code != SSA_NAME)
>> +           pmode = promote_decl_mode (exp, &unsignedp);
>> +         else if ((g = SSA_NAME_DEF_STMT (ssa_name))
>> +                  && gimple_code (g) == GIMPLE_CALL
>> +                  && !gimple_call_internal_p (g))
>>             pmode = promote_function_mode (type, mode, &unsignedp,
>>                                            gimple_call_fntype (g),
>>                                            2);
>> -         else if (!exp)
>> -           {
>> -             gcc_assert (code == SSA_NAME);
>> -             pmode = promote_ssa_mode (ssa_name, &unsignedp);
>> -           }
>>           else
>> -           pmode = promote_decl_mode (exp, &unsignedp);
>> +           pmode = promote_ssa_mode (ssa_name, &unsignedp);
>>           gcc_assert (GET_MODE (decl_rtl) == pmode);
>>
>>           temp = gen_lowpart_SUBREG (mode, decl_rtl);
>> diff --git a/gcc/function.c b/gcc/function.c
>> index dc9e77f..58e2498 100644
>> --- a/gcc/function.c
>> +++ b/gcc/function.c
>> @@ -2124,6 +2124,21 @@ use_register_for_decl (const_tree decl)
>>  {
>>    if (TREE_CODE (decl) == SSA_NAME)
>>      {
>> +      /* We often try to use the SSA_NAME, instead of its underlying
>> +        decl, to get type information and guide decisions, to avoid
>> +        differences of behavior between anonymous and named
>> +        variables, but in this one case we have to go for the actual
>> +        variable if there is one.  The main reason is that, at least
>> +        at -O0, we want to place user variables on the stack, but we
>> +        don't mind using pseudos for anonymous or ignored temps.
>> +        Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
>> +        should go in pseudos, whereas their corresponding variables
>> +        might have to go on the stack.  So, disregarding the decl
>> +        here would negatively impact debug info at -O0, enable
>> +        coalescing between SSA_NAMEs that ought to get different
>> +        stack/pseudo assignments, and get the incoming argument
>> +        processing thoroughly confused by PARM_DECLs expected to live
>> +        in stack slots but assigned to pseudos.  */
>>        if (!SSA_NAME_VAR (decl))
>>         return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
>>           && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
>> diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
>> index 146cede..3d1c89f 100644
>> --- a/gcc/gimple-expr.h
>> +++ b/gcc/gimple-expr.h
>> @@ -55,10 +55,6 @@ extern bool is_gimple_mem_ref_addr (tree);
>>  extern void mark_addressable (tree);
>>  extern bool is_gimple_reg_rhs (tree);
>>
>> -/* Defined in tree-ssa-coalesce.c.   */
>> -extern bool gimple_can_coalesce_p (tree, tree);
>> -
>> -
>>  /* Return true if a conversion from either type of TYPE1 and TYPE2
>>     to the other is not required.  Otherwise return false.  */
>>
>> diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
>> index dda9973..59d91c6 100644
>> --- a/gcc/tree-outof-ssa.c
>> +++ b/gcc/tree-outof-ssa.c
>> @@ -305,7 +305,6 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
>>    rtx dest_rtx, seq, x;
>>    machine_mode dest_mode, src_mode;
>>    int unsignedp;
>> -  tree var;
>>
>>    if (dump_file && (dump_flags & TDF_DETAILS))
>>      {
>> @@ -328,10 +327,9 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
>>    start_sequence ();
>>
>>    tree name = partition_to_var (SA.map, dest);
>> -  var = SSA_NAME_VAR (name);
>>    src_mode = TYPE_MODE (TREE_TYPE (src));
>>    dest_mode = GET_MODE (dest_rtx);
>> -  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var ? var : name)));
>> +  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (name)));
>>    gcc_assert (!REG_P (dest_rtx)
>>               || dest_mode == promote_ssa_mode (name, &unsignedp));
>>
>> @@ -709,8 +707,7 @@ elim_backward (elim_graph g, int T)
>>  static rtx
>>  get_temp_reg (tree name)
>>  {
>> -  tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
>> -  tree type = var ? TREE_TYPE (var) : TREE_TYPE (name);
>> +  tree type = TREE_TYPE (name);
>>    int unsignedp;
>>    machine_mode reg_mode = promote_ssa_mode (name, &unsignedp);
>>    rtx x = gen_reg_rtx (reg_mode);
>> diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h
>> index 99b188a..ae289b4 100644
>> --- a/gcc/tree-ssa-coalesce.h
>> +++ b/gcc/tree-ssa-coalesce.h
>> @@ -21,5 +21,6 @@ along with GCC; see the file COPYING3.  If not see
>>  #define GCC_TREE_SSA_COALESCE_H
>>
>>  extern var_map coalesce_ssa_name (void);
>> +extern bool gimple_can_coalesce_p (tree, tree);
>>
>>  #endif /* GCC_TREE_SSA_COALESCE_H */
>> diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
>> index 3f6bebe..7bef8cf 100644
>> --- a/gcc/tree-ssa-loop-niter.c
>> +++ b/gcc/tree-ssa-loop-niter.c
>> @@ -3964,7 +3964,11 @@ loop_exits_before_overflow (tree base, tree step,
>>         if (!CONVERT_EXPR_P (e) || !operand_equal_p (e, unsigned_base, 0))
>>           continue;
>>         e = TREE_OPERAND (e, 0);
>> -       gcc_assert (operand_equal_p (e, base, 0));
>> +       /* If E has an unsigned type, the operand equality test below
>> +          would fail, but the equality test above would have already
>> +          verified the equality, so we can proceed with it.  */
>> +       gcc_assert (TYPE_UNSIGNED (TREE_TYPE (e))
>> +                   || operand_equal_p (e, base, 0));
>>         if (tree_int_cst_sign_bit (step))
>>           {
>>             code = LT_EXPR;
>> diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c
>> index f75a7f1..0982305 100644
>> --- a/gcc/tree-ssa-uncprop.c
>> +++ b/gcc/tree-ssa-uncprop.c
>> @@ -59,6 +59,11 @@ along with GCC; see the file COPYING3.  If not see
>>  #include "domwalk.h"
>>  #include "tree-pass.h"
>>  #include "tree-ssa-propagate.h"
>> +#include "bitmap.h"
>> +#include "stringpool.h"
>> +#include "tree-ssanames.h"
>> +#include "tree-ssa-live.h"
>> +#include "tree-ssa-coalesce.h"
>>
>>  /* The basic structure describing an equivalency created by traversing
>>     an edge.  Traversing the edge effectively means that we can assume
>>
>>
>>
>> --
>> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
>> You must be the change you wish to see in the world. -- Gandhi
>> Be Free! -- http://FSFLA.org/   FSF Latin America board member
>> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


* Re: [PR64164] drop copyrename, integrate into expand
  2015-06-06  5:12             ` Alexandre Oliva
  2015-06-08  8:16               ` Richard Biener
@ 2015-06-10  0:28               ` Alexandre Oliva
  2015-06-10 13:36                 ` Richard Biener
  1 sibling, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-06-10  0:28 UTC (permalink / raw)
  To: Richard Biener
  Cc: Jeff Law, GCC Patches, Christophe Lyon, David Edelsohn, Eric Botcazou

On Jun  5, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:

> On Apr 27, 2015, Richard Biener <richard.guenther@gmail.com> wrote:

>>> +/* Return the promoted mode for name.  If it is a named SSA_NAME, it
>>> +   is the same as promote_decl_mode.  Otherwise, it is the promoted
>>> +   mode of a temp decl of same type as the SSA_NAME, if we had created
>>> +   one.  */
>>> +
>>> +machine_mode
>>> +promote_ssa_mode (const_tree name, int *punsignedp)
>>> +{
>>> +  gcc_assert (TREE_CODE (name) == SSA_NAME);
>>> +
>>> +  if (SSA_NAME_VAR (name))
>>> +    return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);

>> As above I'd rather not have different paths for anonymous vs. non-anonymous
>> vars (so just delete the above two lines).

> Check

This caused the sparc regression reported by Eric in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64164#c37

We need to match the mode of the rtl created for the partition and the
promoted mode expected for the parm.  I recall working to make parm and
result decls the partition leaders, so that promote_ssa_mode would DTRT,
but this escaped my mind when revisiting the patch after some time on
another project.

So we either restore promote_ssa_mode's check for an underlying decl, at
least for PARM_ and RESULT_DECLs, or further massage function.c to deal
with the mode difference.  Any preference?


I'm reverting the patch for now, so that we don't have to rush to a fix
on this, and I can have more time to test and fix other arches.  It was
a terrible mistake to not do so before submitting the final version of
the patch, or at least before installing it.  I apologize for that.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


* Re: [PR64164] drop copyrename, integrate into expand
  2015-06-10  0:28               ` Alexandre Oliva
@ 2015-06-10 13:36                 ` Richard Biener
  2015-07-16  7:58                   ` Alexandre Oliva
  0 siblings, 1 reply; 127+ messages in thread
From: Richard Biener @ 2015-06-10 13:36 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Jeff Law, GCC Patches, Christophe Lyon, David Edelsohn, Eric Botcazou

On Wed, Jun 10, 2015 at 2:24 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Jun  5, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
>
>> On Apr 27, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
>
>>>> +/* Return the promoted mode for name.  If it is a named SSA_NAME, it
>>>> +   is the same as promote_decl_mode.  Otherwise, it is the promoted
>>>> +   mode of a temp decl of same type as the SSA_NAME, if we had created
>>>> +   one.  */
>>>> +
>>>> +machine_mode
>>>> +promote_ssa_mode (const_tree name, int *punsignedp)
>>>> +{
>>>> +  gcc_assert (TREE_CODE (name) == SSA_NAME);
>>>> +
>>>> +  if (SSA_NAME_VAR (name))
>>>> +    return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
>
>>> As above I'd rather not have different paths for anonymous vs. non-anonymous
>>> vars (so just delete the above two lines).
>
>> Check
>
> This caused the sparc regression reported by Eric in
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64164#c37
>
> We need to match the mode of the rtl created for the partition and the
> promoted mode expected for the parm.  I recall working to make parm and
> result decls the partition leaders, so that promote_ssa_mode would DTRT,
> but this escaped my mind when revisiting the patch after some time on
> another project.
>
> So we either restore promote_ssa_mode's check for an underlying decl, at
> least for PARM_ and RESULT_DECLs, or further massage function.c to deal
> with the mode difference.  Any preference?

Alternatively not coalesce SSA names when promote_decl_mode gives
different answers (for their underlying decl)?  It sounds wrong to do that
(if that is really what happens).

Richard.

> I'm reverting the patch for now, so that we don't have to rush to a fix
> on this, and I can have more time to test and fix other arches.  It was
> a terrible mistake to not do so before submitting the final version of
> the patch, or at least before installing it.  I apologize for that.
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


* Re: [PR64164] drop copyrename, integrate into expand
  2015-06-10 13:36                 ` Richard Biener
@ 2015-07-16  7:58                   ` Alexandre Oliva
  2015-07-16  8:50                     ` Richard Biener
  0 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-07-16  7:58 UTC (permalink / raw)
  To: Richard Biener
  Cc: Jeff Law, GCC Patches, Christophe Lyon, David Edelsohn, Eric Botcazou

On Jun 10, 2015, Richard Biener <richard.guenther@gmail.com> wrote:

> On Wed, Jun 10, 2015 at 2:24 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
>> This caused the sparc regression reported by Eric in
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64164#c37

>> We need to match the mode of the rtl created for the partition and the
>> promoted mode expected for the parm.  I recall working to make parm and
>> result decls the partition leaders, so that promote_ssa_mode would DTRT,
>> but this escaped my mind when revisiting the patch after some time on
>> another project.

FWIW, during the development of this improvement, I dropped the notion
of making parm and result decls partition leaders, and instead only
considered eligible for coalescing into the same partition SSA_NAMEs
that promoted to the same mode.

> Alternatively not coalesce SSA names when promote_decl_mode gives
> different answers (for their underlying decl)?  It sounds wrong to do that
> (if that is really what happens).

Exactly.  I've now restored the promote_decl_mode behavior to
promote_ssa_mode for PARM_ and RESULT_DECLs, so that the strategy
described above works again.  This fixed the sparc regression.
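
As a rough illustration of the strategy discussed here, the following is
a minimal toy model in C, not GCC code.  Every name in it (toy_mode,
toy_decl_kind, toy_name, toy_promote_type_mode, toy_promote_ssa_mode,
toy_can_coalesce) is hypothetical, and the promotion rule (sub-word
modes widen to TOY_SI, while parms and results carry their own promoted
mode) is an assumption standing in for the target's real promotion
logic.  The real coalescing test presumably checks more than promoted
modes; this sketch models only that one condition.

```c
#include <stdbool.h>

/* Toy stand-ins for machine modes; these are NOT GCC's machine_mode.  */
enum toy_mode { TOY_QI, TOY_HI, TOY_SI, TOY_DI };

/* Kind of the decl underlying an SSA name, if any (hypothetical).  */
enum toy_decl_kind
{
  TOY_NO_DECL, TOY_VAR_DECL, TOY_PARM_DECL, TOY_RESULT_DECL
};

struct toy_name
{
  enum toy_mode type_mode;       /* Mode of the name's type.  */
  enum toy_decl_kind decl_kind;  /* Underlying decl, or TOY_NO_DECL.  */
  enum toy_mode decl_promoted;   /* Promoted mode expected for the decl;
                                    only consulted for parms/results.  */
};

/* Assumed type-based promotion: sub-word modes widen to TOY_SI.  */
enum toy_mode
toy_promote_type_mode (enum toy_mode m)
{
  return m < TOY_SI ? TOY_SI : m;
}

/* Parms and results keep the promoted mode of their decl; every other
   name, anonymous or not, is promoted from its type alone.  */
enum toy_mode
toy_promote_ssa_mode (const struct toy_name *n)
{
  if (n->decl_kind == TOY_PARM_DECL || n->decl_kind == TOY_RESULT_DECL)
    return n->decl_promoted;
  return toy_promote_type_mode (n->type_mode);
}

/* Two names may share a partition only if they promote to the same
   mode, so the partition's rtl has a single well-defined mode.  */
bool
toy_can_coalesce (const struct toy_name *a, const struct toy_name *b)
{
  return toy_promote_ssa_mode (a) == toy_promote_ssa_mode (b);
}
```

In this model, a parm whose decl promotes to TOY_DI is kept apart from
an anonymous TOY_SI temporary, which is the kind of mismatch behind the
sparc failure, while two names that both promote to TOY_SI remain
eligible for coalescing.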

On Jun  9, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:

> On Jun  9, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:

>> On Jun  9, 2015, David Edelsohn <dje.gcc@gmail.com> wrote:
>>> This also broke bootstrap on PPC64 LE Linux with the same error.

>> Thanks for your reports.  I'm looking into the problem.

>> I'd appreciate a preprocessed testcase from either of you to confirm the
>> fix, if not to help debug it.

> The first potential source for this problem that jumped at me would be
> silenced with this change:

> diff --git a/gcc/function.c b/gcc/function.c
> index 8bcc352..9201ed9 100644
> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -2974,7 +2974,8 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>  	stack_parm = copy_rtx (stack_parm);
>        if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
>  	PUT_MODE (stack_parm, GET_MODE (entry_parm));
> -      set_mem_attributes (stack_parm, parm, 1);
> +      if (GET_CODE (stack_parm) == MEM)
> +	set_mem_attributes (stack_parm, parm, 1);
>      }
 
>    /* If a BLKmode arrives in registers, copy it to a stack slot.  Handle

I ended up fixing this in a slightly different way, running the original
code above, from assign_stack_local to set_mem_attributes, only when
rtl_for_parm does not obtain an assignment set up by out-of-ssa.
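
The shape of that guard can be sketched with a small toy model in C.
This is not the function.c code: toy_loc, toy_parm,
toy_assign_stack_local, toy_stack_slots_used and toy_setup_block_parm
are hypothetical names, and the has_mem_attrs flag merely stands in for
the effect of set_mem_attributes.  The only point illustrated is the
control flow: a location already chosen by out-of-ssa is used as is,
and the stack-slot path runs only when there is none.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy location: either a register or a stack slot (MEM).  */
struct toy_loc
{
  bool is_mem;
  bool has_mem_attrs;
};

/* Toy parameter.  ssa_loc is the location out-of-ssa picked for the
   parm's partition, or NULL when it picked none.  */
struct toy_parm
{
  struct toy_loc *ssa_loc;
};

/* Small pool of stack slots, standing in for assign_stack_local.  */
static struct toy_loc toy_slots[4];
static size_t toy_slots_used;

struct toy_loc *
toy_assign_stack_local (void)
{
  if (toy_slots_used == sizeof toy_slots / sizeof toy_slots[0])
    return NULL;
  struct toy_loc *slot = &toy_slots[toy_slots_used++];
  slot->is_mem = true;
  slot->has_mem_attrs = false;
  return slot;
}

/* Number of stack slots handed out so far.  */
size_t
toy_stack_slots_used (void)
{
  return toy_slots_used;
}

/* Prefer the SSA-assigned location.  Only when there is none do we
   allocate a stack slot and set its MEM attributes, so attributes are
   never applied to something that is not a MEM.  */
struct toy_loc *
toy_setup_block_parm (const struct toy_parm *parm)
{
  if (parm->ssa_loc)
    return parm->ssa_loc;

  struct toy_loc *slot = toy_assign_stack_local ();
  if (slot)
    slot->has_mem_attrs = true;  /* Stands in for set_mem_attributes.  */
  return slot;
}
```

Compared with testing GET_CODE (stack_parm) == MEM right before
set_mem_attributes, as in the quoted hunk, guarding the whole sequence
also avoids allocating a stack slot that would then go unused.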

> but I suspect there might be other similar issues lurking in function.c
> after my attempt to turn parm assignment upside down ;-)

There weren't, after all.

On Jun  9, 2015, David Edelsohn <dje.gcc@gmail.com> wrote:

> This patch clearly should have been tested on more
> architectures than x86 before being approved and merged.

The following patch was regstrapped on x86_64-linux-gnu and
i686-pc-linux-gnu.  I've also cross-built all-target successfully for
targets aarch64-elf, arm-eabi, arm-symbianelf, avr-elf, bfin-elf,
cr16-elf, cris-elf, crisv32-elf, epiphany-elf, fido-elf, fr30-elf,
frv-elf, i686-elf, lm32-elf, m68k-elf, mcore-elf, microblaze-elf,
mips64el-elf, mips64-elf, mips64orion-elf, mipsel-elf,
mipsisa32-elfoabi, mipsisa64-elfoabi, mipsisa64r2el-elf,
mipsisa64r2-sde-elf, mipsisa64sb1-elf, mipstx39-elf, mn10300-elf,
moxie-elf, nds32be-elf, nds32le-elf, nios2-elf, powerpc-eabialtivec,
powerpc-eabisimaltivec, powerpc-eabisim, powerpc-eabispe, powerpc-eabi,
powerpcle-eabisim, powerpcle-eabi, powerpcle-elf, ppc-eabi, ppc-elf,
rx-elf, sh-elf, sh-superh-elf, sparc64-elf, sparc-elf, spu-elf, and
visium-elf, and got the same build failures before and after the patch
with targets c6x-elf, ft32-elf, h8300-elf, ia64-elf, iq2000-elf,
m32c-elf, m32r-elf, m32rle-elf, mep-elf, mips64vr-elf
(mips64vr-elf/mips16/newlib/libm/math/lib_a_e_hypot.o failed to build
with the patch and passed without it, but there were other "invalid
operand" failures for "lwu" insns without the patch, so I'm counting the
e_hypot failure as present but latent before), mipsisa64sr71k-elf,
msp430-elf, pdp11-aout, powerpc-xilinx-eabi, ppc64-eabi, rl78-elf,
sh64-elf, sparc-leon-elf, v850e-elf, v850-elf, xstormy16-elf, and
xtensa-elf.

This patch differs from the previous one in the following ways: I
dropped the hunk I had put in loop_exits_before_overflow, since that
problem was already noticed and fixed independently (PR66638); I updated
tree_int_map_hasher, which changed on the trunk in tree-ssa-live.c but
which the patch moves to tree-ssa-coalesce.c; I resolved other conflicts
in files that had #includes added both by the patch and by other
changes; and I put in the two fixes mentioned above.  After the full
updated patch, I enclose a diff with these two additional fixes, to ease
the review.

Is this ok to install?


for  gcc/ChangeLog

	PR rtl-optimization/64164
	* Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
	* tree-ssa-copyrename.c: Removed.
	* opts.c (default_options_table): Drop -ftree-copyrename.  Add
	-ftree-coalesce-vars.
	* passes.def: Drop all occurrences of pass_rename_ssa_copies.
	* common.opt (ftree-copyrename): Ignore.
	(ftree-coalesce-inlined-vars): Likewise.
	* doc/invoke.texi: Remove the ignored options above.
	* gimple-expr.h (gimple_can_coalesce_p): Move declaration...
	* tree-ssa-coalesce.h: ... here.
	* tree-ssa-uncprop.c: Include tree-ssa-coalesce.h and other
	headers required by it.
	* gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
	across variables when flag_tree_coalesce_vars.  Check register
	use and promoted modes to allow coalescing.  Moved to
	tree-ssa-coalesce.c.
	* tree-ssa-live.c (struct tree_int_map_hasher): Move along
	with its member functions to tree-ssa-coalesce.c.
	(var_map_base_init): Likewise.  Renamed to
	compute_samebase_partition_bases.
	(partition_view_normal): Drop want_bases parameter.
	(partition_view_bitmap): Likewise.
	* tree-ssa-live.h: Adjust declarations.
	* tree-ssa-coalesce.c: Include explow.h.
	(build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs'
	default defs at the entry point.
	(dump_part_var_map): New.
	(compute_optimized_partition_bases): New, called by...
	(coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
	of compute_samebase_partition_bases.  Adjust.
	* alias.c (nonoverlapping_memrefs_p): Disregard gimple-regs.
	* cfgexpand.c (leader_merge): New.
	(get_rtl_for_parm_ssa_default_def): New.
	(set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
	vars.  Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
	(expand_one_stack_var_at): Handle anonymous SSA_NAMEs.  Drop
	redundant MEM attr setting.
	(expand_one_stack_var_1): Handle anonymous SSA_NAMEs.  Renamed
	from...
	(expand_one_stack_var): ... this.  New wrapper to check and
	skip already expanded SSA partitions.
	(record_alignment_for_reg_var): New, factored out of...
	(expand_one_var): ... this.
	(expand_one_ssa_partition): New.
	(adjust_one_expanded_partition_var): New.
	(expand_one_register_var): Check and skip already expanded SSA
	partitions.
	(expand_used_vars): Don't create DECLs for anonymous SSA
	names.  Expand all SSA partitions, then adjust all SSA names.
	(pass::execute): Replace the loops that set
	SA.partition_to_pseudo from partition leaders and cleared
	DECL_RTL for multi-location variables, and that which used to
	rename vars and set attrs, with one that clears DECL_RTL and
	checks that PARMs and RESULTs default_defs match DECL_RTL.
	* cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
	* emit-rtl.c (set_reg_attrs_for_parm): Handle NULL decl.
	* explow.c (promote_ssa_mode): New.
	* explow.h (promote_ssa_mode): Declare.
	* expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
	* function.c: Include cfgexpand.h.
	(use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
	(use_register_for_parm_decl): Wrapper for the above to
	special-case the result_ptr.
	(rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
	(maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
	multiple locations.
	(assign_parm_adjust_stack_rtl): Add all and parm arguments,
	for rtl_for_parm.  For SSA-assigned parms, zero stack_parm.
	(assign_parm_setup_block): Prefer SSA-assigned location.
	(assign_parm_setup_reg): Likewise.  Use entry_parm for equiv
	if stack_parm is NULL.
	(assign_parm_setup_stack): Prefer SSA-assigned location.
	(assign_parms): Maybe reset DECL_RTL of params.  Adjust stack
	rtl before testing for pointer bounds.  Special-case result_ptr.
	(expand_function_start): Maybe reset DECL_RTL of result.
	Prefer SSA-assigned location for result and static chain.
	Factor out DECL_RESULT and SET_DECL_RTL.
	* tree-outof-ssa.c (insert_value_copy_on_edge): Handle
	anonymous SSA names.  Use promote_ssa_mode.
	(get_temp_reg): Likewise.
	(remove_ssa_form): Adjust.
	* var-tracking.c (dataflow_set_clear_at_call): Take call_insn
	and get its reg_usage for reg invalidation.
	(compute_bb_dataflow): Pass it insn.
	(emit_notes_in_bb): Likewise.

for  gcc/testsuite/ChangeLog

	* gcc.dg/guality/pr54200.c: Add -fno-tree-coalesce-vars.
	* gcc.dg/ssp-1.c: Make counter a register.
	* gcc.dg/ssp-2.c: Likewise.
	* gcc.dg/torture/parm-coalesce.c: New.
---
 gcc/Makefile.in                              |    1 
 gcc/alias.c                                  |   13 +
 gcc/cfgexpand.c                              |  370 +++++++++++++++-----
 gcc/cfgexpand.h                              |    2 
 gcc/common.opt                               |   12 -
 gcc/doc/invoke.texi                          |   48 +--
 gcc/emit-rtl.c                               |    5 
 gcc/explow.c                                 |   22 +
 gcc/explow.h                                 |    3 
 gcc/expr.c                                   |   39 +-
 gcc/function.c                               |  228 ++++++++++--
 gcc/gimple-expr.c                            |   39 --
 gcc/gimple-expr.h                            |    1 
 gcc/opts.c                                   |    2 
 gcc/passes.def                               |    5 
 gcc/testsuite/gcc.dg/guality/pr54200.c       |    2 
 gcc/testsuite/gcc.dg/ssp-1.c                 |    2 
 gcc/testsuite/gcc.dg/ssp-2.c                 |    2 
 gcc/testsuite/gcc.dg/torture/parm-coalesce.c |   40 ++
 gcc/tree-outof-ssa.c                         |   16 -
 gcc/tree-ssa-coalesce.c                      |  378 ++++++++++++++++++++-
 gcc/tree-ssa-coalesce.h                      |    1 
 gcc/tree-ssa-copyrename.c                    |  475 --------------------------
 gcc/tree-ssa-live.c                          |   99 -----
 gcc/tree-ssa-live.h                          |    4 
 gcc/tree-ssa-uncprop.c                       |    5 
 gcc/var-tracking.c                           |   12 -
 27 files changed, 979 insertions(+), 847 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/parm-coalesce.c
 delete mode 100644 gcc/tree-ssa-copyrename.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index bf2186a..b36f9c1 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1445,7 +1445,6 @@ OBJS = \
 	tree-ssa-ccp.o \
 	tree-ssa-coalesce.o \
 	tree-ssa-copy.o \
-	tree-ssa-copyrename.o \
 	tree-ssa-dce.o \
 	tree-ssa-dom.o \
 	tree-ssa-dse.o \
diff --git a/gcc/alias.c b/gcc/alias.c
index 3203722..69e3732 100644
--- a/gcc/alias.c
+++ b/gcc/alias.c
@@ -2507,6 +2507,19 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
   if (! DECL_P (exprx) || ! DECL_P (expry))
     return 0;
 
+  /* If we refer to different gimple registers, or one gimple register
+     and one non-gimple-register, we know they can't overlap.  First,
+     gimple registers don't have their addresses taken.  Now, there
+     could be more than one stack slot for (different versions of) the
+     same gimple register, but we can presumably tell they don't
+     overlap based on offsets from stack base addresses elsewhere.
+     It's important that we don't proceed to DECL_RTL, because gimple
+     registers may not pass DECL_RTL_SET_P, and make_decl_rtl won't be
+     able to do anything about them since no SSA information will have
+     remained to guide it.  */
+  if (is_gimple_reg (exprx) || is_gimple_reg (expry))
+    return exprx != expry;
+
   /* With invalid code we can end up storing into the constant pool.
      Bail out to avoid ICEing when creating RTL for this.
      See gfortran.dg/lto/20091028-2_0.f90.  */
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index a047632..0b19953 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -150,21 +150,121 @@ gimple_assign_rhs_to_tree (gimple stmt)
 
 #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
 
+/* Choose either CUR or NEXT as the leader DECL for a partition.
+   Prefer ignored decls, to simplify debug dumps and reduce ambiguity
+   out of the same user variable being in multiple partitions (this is
+   less likely for compiler-introduced temps).  */
+
+static tree
+leader_merge (tree cur, tree next)
+{
+  if (cur == NULL || cur == next)
+    return next;
+
+  if (DECL_P (cur) && DECL_IGNORED_P (cur))
+    return cur;
+
+  if (DECL_P (next) && DECL_IGNORED_P (next))
+    return next;
+
+  return cur;
+}
+
+
+/* Return the RTL for the default SSA def of a PARM or RESULT, if
+   there is one.  */
+
+rtx
+get_rtl_for_parm_ssa_default_def (tree var)
+{
+  gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL);
+
+  if (!is_gimple_reg (var))
+    return NULL_RTX;
+
+  /* If we've already determined RTL for the decl, use it.  This is
+     not just an optimization: if VAR is a PARM whose incoming value
+     is unused, we won't find a default def to use its partition, but
+     we still want to use the location of the parm, if it was used at
+     all.  During assign_parms, until a location is assigned for the
+     VAR, RTL can only be set for a parm or result if we're not coalescing
+     across variables, when we know we're coalescing all SSA_NAMEs of
+     each parm or result, and we're not coalescing them with names
+     pertaining to other variables, such as other parms' default
+     defs.  */
+  if (DECL_RTL_SET_P (var))
+    {
+      gcc_assert (DECL_RTL (var) != pc_rtx);
+      return DECL_RTL (var);
+    }
+
+  tree name = ssa_default_def (cfun, var);
+
+  if (!name)
+    return NULL_RTX;
+
+  int part = var_to_partition (SA.map, name);
+  if (part == NO_PARTITION)
+    return NULL_RTX;
+
+  return SA.partition_to_pseudo[part];
+}
+
 /* Associate declaration T with storage space X.  If T is no
    SSA name this is exactly SET_DECL_RTL, otherwise make the
    partition of T associated with X.  */
 static inline void
 set_rtl (tree t, rtx x)
 {
+  if (x && SSAVAR (t))
+    {
+      bool skip = false;
+      tree cur = NULL_TREE;
+
+      if (MEM_P (x))
+	cur = MEM_EXPR (x);
+      else if (REG_P (x))
+	cur = REG_EXPR (x);
+      else if (GET_CODE (x) == CONCAT
+	       && REG_P (XEXP (x, 0)))
+	cur = REG_EXPR (XEXP (x, 0));
+      else if (GET_CODE (x) == PARALLEL)
+	cur = REG_EXPR (XVECEXP (x, 0, 0));
+      else if (x == pc_rtx)
+	skip = true;
+      else
+	gcc_unreachable ();
+
+      tree next = skip ? cur : leader_merge (cur, SSAVAR (t));
+
+      if (cur != next)
+	{
+	  if (MEM_P (x))
+	    set_mem_attributes (x, next, true);
+	  else
+	    set_reg_attrs_for_decl_rtl (next, x);
+	}
+    }
+
   if (TREE_CODE (t) == SSA_NAME)
     {
-      SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x;
-      if (x && !MEM_P (x))
-	set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x);
-      /* For the benefit of debug information at -O0 (where vartracking
-         doesn't run) record the place also in the base DECL if it's
-	 a normal variable (not a parameter).  */
-      if (x && x != pc_rtx && TREE_CODE (SSA_NAME_VAR (t)) == VAR_DECL)
+      int part = var_to_partition (SA.map, t);
+      if (part != NO_PARTITION)
+	{
+	  if (SA.partition_to_pseudo[part])
+	    gcc_assert (SA.partition_to_pseudo[part] == x);
+	  else
+	    SA.partition_to_pseudo[part] = x;
+	}
+      /* For the benefit of debug information at -O0 (where
+         vartracking doesn't run) record the place also in the base
+         DECL.  For PARMs and RESULTs, we may end up resetting these
+         in function.c:maybe_reset_rtl_for_parm, but in some rare
+         cases we may need them (unused and overwritten incoming
+         value, that at -O0 must share the location with the other
+         uses in spite of the missing default def), and this may be
+         the only chance to preserve them.  */
+      if (x && x != pc_rtx && SSA_NAME_VAR (t))
 	{
 	  tree var = SSA_NAME_VAR (t);
 	  /* If we don't yet have something recorded, just record it now.  */
@@ -862,7 +962,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
   gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
 
   x = plus_constant (Pmode, base, offset);
-  x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x);
+  x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
+		   ? TYPE_MODE (TREE_TYPE (decl))
+		   : DECL_MODE (SSAVAR (decl)), x);
 
   if (TREE_CODE (decl) != SSA_NAME)
     {
@@ -884,7 +986,6 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
       DECL_USER_ALIGN (decl) = 0;
     }
 
-  set_mem_attributes (x, SSAVAR (decl), true);
   set_rtl (decl, x);
 }
 
@@ -1099,13 +1200,22 @@ account_stack_vars (void)
    to a variable to be allocated in the stack frame.  */
 
 static void
-expand_one_stack_var (tree var)
+expand_one_stack_var_1 (tree var)
 {
   HOST_WIDE_INT size, offset;
   unsigned byte_align;
 
-  size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
-  byte_align = align_local_variable (SSAVAR (var));
+  if (TREE_CODE (var) == SSA_NAME)
+    {
+      tree type = TREE_TYPE (var);
+      size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
+      byte_align = TYPE_ALIGN_UNIT (type);
+    }
+  else
+    {
+      size = tree_to_uhwi (DECL_SIZE_UNIT (var));
+      byte_align = align_local_variable (var);
+    }
 
   /* We handle highly aligned variables in expand_stack_vars.  */
   gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
@@ -1116,6 +1226,27 @@ expand_one_stack_var (tree var)
 			   crtl->max_used_stack_slot_alignment, offset);
 }
 
+/* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are
+   already assigned some MEM.  */
+
+static void
+expand_one_stack_var (tree var)
+{
+  if (TREE_CODE (var) == SSA_NAME)
+    {
+      int part = var_to_partition (SA.map, var);
+      if (part != NO_PARTITION)
+	{
+	  rtx x = SA.partition_to_pseudo[part];
+	  gcc_assert (x);
+	  gcc_assert (MEM_P (x));
+	  return;
+	}
+    }
+
+  return expand_one_stack_var_1 (var);
+}
+
 /* A subroutine of expand_one_var.  Called to assign rtl to a VAR_DECL
    that will reside in a hard register.  */
 
@@ -1125,13 +1256,114 @@ expand_one_hard_reg_var (tree var)
   rest_of_decl_compilation (var, 0, 0);
 }
 
+/* Record the alignment requirements of some variable assigned to a
+   pseudo.  */
+
+static void
+record_alignment_for_reg_var (unsigned int align)
+{
+  if (SUPPORTS_STACK_ALIGNMENT
+      && crtl->stack_alignment_estimated < align)
+    {
+      /* stack_alignment_estimated shouldn't change after stack
+         realign decision made */
+      gcc_assert (!crtl->stack_realign_processed);
+      crtl->stack_alignment_estimated = align;
+    }
+
+  /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
+     So here we only make sure stack_alignment_needed >= align.  */
+  if (crtl->stack_alignment_needed < align)
+    crtl->stack_alignment_needed = align;
+  if (crtl->max_used_stack_slot_alignment < align)
+    crtl->max_used_stack_slot_alignment = align;
+}
+
+/* Create RTL for an SSA partition.  */
+
+static void
+expand_one_ssa_partition (tree var)
+{
+  int part = var_to_partition (SA.map, var);
+  gcc_assert (part != NO_PARTITION);
+
+  if (SA.partition_to_pseudo[part])
+    return;
+
+  if (!use_register_for_decl (var))
+    {
+      expand_one_stack_var_1 (var);
+      return;
+    }
+
+  unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (var),
+					  TYPE_MODE (TREE_TYPE (var)),
+					  TYPE_ALIGN (TREE_TYPE (var)));
+
+  /* If the variable alignment is very large we'll dynamically allocate
+     it, which means that in-frame portion is just a pointer.  */
+  if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
+    align = POINTER_SIZE;
+
+  record_alignment_for_reg_var (align);
+
+  machine_mode reg_mode = promote_ssa_mode (var, NULL);
+
+  rtx x = gen_reg_rtx (reg_mode);
+
+  set_rtl (var, x);
+}
+
+/* Record the association between the RTL generated for a partition
+   and the underlying variable of the SSA_NAME.  */
+
+static void
+adjust_one_expanded_partition_var (tree var)
+{
+  if (!var)
+    return;
+
+  tree decl = SSA_NAME_VAR (var);
+
+  int part = var_to_partition (SA.map, var);
+  if (part == NO_PARTITION)
+    return;
+
+  rtx x = SA.partition_to_pseudo[part];
+
+  set_rtl (var, x);
+
+  if (!REG_P (x))
+    return;
+
+  /* Note if the object is a user variable.  */
+  if (decl && !DECL_ARTIFICIAL (decl))
+    mark_user_reg (x);
+
+  if (POINTER_TYPE_P (decl ? TREE_TYPE (decl) : TREE_TYPE (var)))
+    mark_reg_pointer (x, get_pointer_alignment (var));
+}
+
 /* A subroutine of expand_one_var.  Called to assign rtl to a VAR_DECL
    that will reside in a pseudo register.  */
 
 static void
 expand_one_register_var (tree var)
 {
-  tree decl = SSAVAR (var);
+  if (TREE_CODE (var) == SSA_NAME)
+    {
+      int part = var_to_partition (SA.map, var);
+      if (part != NO_PARTITION)
+	{
+	  rtx x = SA.partition_to_pseudo[part];
+	  gcc_assert (x);
+	  gcc_assert (REG_P (x));
+	  return;
+	}
+      gcc_unreachable ();
+    }
+
+  tree decl = var;
   tree type = TREE_TYPE (decl);
   machine_mode reg_mode = promote_decl_mode (decl, NULL);
   rtx x = gen_reg_rtx (reg_mode);
@@ -1265,21 +1497,7 @@ expand_one_var (tree var, bool toplevel, bool really_expand)
 	align = POINTER_SIZE;
     }
 
-  if (SUPPORTS_STACK_ALIGNMENT
-      && crtl->stack_alignment_estimated < align)
-    {
-      /* stack_alignment_estimated shouldn't change after stack
-         realign decision made */
-      gcc_assert (!crtl->stack_realign_processed);
-      crtl->stack_alignment_estimated = align;
-    }
-
-  /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
-     So here we only make sure stack_alignment_needed >= align.  */
-  if (crtl->stack_alignment_needed < align)
-    crtl->stack_alignment_needed = align;
-  if (crtl->max_used_stack_slot_alignment < align)
-    crtl->max_used_stack_slot_alignment = align;
+  record_alignment_for_reg_var (align);
 
   if (TREE_CODE (origvar) == SSA_NAME)
     {
@@ -1713,48 +1931,18 @@ expand_used_vars (void)
   if (targetm.use_pseudo_pic_reg ())
     pic_offset_table_rtx = gen_reg_rtx (Pmode);
 
-  hash_map<tree, tree> ssa_name_decls;
   for (i = 0; i < SA.map->num_partitions; i++)
     {
       tree var = partition_to_var (SA.map, i);
 
       gcc_assert (!virtual_operand_p (var));
 
-      /* Assign decls to each SSA name partition, share decls for partitions
-         we could have coalesced (those with the same type).  */
-      if (SSA_NAME_VAR (var) == NULL_TREE)
-	{
-	  tree *slot = &ssa_name_decls.get_or_insert (TREE_TYPE (var));
-	  if (!*slot)
-	    *slot = create_tmp_reg (TREE_TYPE (var));
-	  replace_ssa_name_symbol (var, *slot);
-	}
-
-      /* Always allocate space for partitions based on VAR_DECLs.  But for
-	 those based on PARM_DECLs or RESULT_DECLs and which matter for the
-	 debug info, there is no need to do so if optimization is disabled
-	 because all the SSA_NAMEs based on these DECLs have been coalesced
-	 into a single partition, which is thus assigned the canonical RTL
-	 location of the DECLs.  If in_lto_p, we can't rely on optimize,
-	 a function could be compiled with -O1 -flto first and only the
-	 link performed at -O0.  */
-      if (TREE_CODE (SSA_NAME_VAR (var)) == VAR_DECL)
-	expand_one_var (var, true, true);
-      else if (DECL_IGNORED_P (SSA_NAME_VAR (var)) || optimize || in_lto_p)
-	{
-	  /* This is a PARM_DECL or RESULT_DECL.  For those partitions that
-	     contain the default def (representing the parm or result itself)
-	     we don't do anything here.  But those which don't contain the
-	     default def (representing a temporary based on the parm/result)
-	     we need to allocate space just like for normal VAR_DECLs.  */
-	  if (!bitmap_bit_p (SA.partition_has_default_def, i))
-	    {
-	      expand_one_var (var, true, true);
-	      gcc_assert (SA.partition_to_pseudo[i]);
-	    }
-	}
+      expand_one_ssa_partition (var);
     }
 
+  for (i = 1; i < num_ssa_names; i++)
+    adjust_one_expanded_partition_var (ssa_name (i));
+
   if (flag_stack_protect == SPCT_FLAG_STRONG)
       gen_stack_protect_signal
 	= stack_protect_decl_p () || stack_protect_return_slot_p ();
@@ -5928,35 +6116,6 @@ pass_expand::execute (function *fun)
       parm_birth_insn = var_seq;
     }
 
-  /* Now that we also have the parameter RTXs, copy them over to our
-     partitions.  */
-  for (i = 0; i < SA.map->num_partitions; i++)
-    {
-      tree var = SSA_NAME_VAR (partition_to_var (SA.map, i));
-
-      if (TREE_CODE (var) != VAR_DECL
-	  && !SA.partition_to_pseudo[i])
-	SA.partition_to_pseudo[i] = DECL_RTL_IF_SET (var);
-      gcc_assert (SA.partition_to_pseudo[i]);
-
-      /* If this decl was marked as living in multiple places, reset
-	 this now to NULL.  */
-      if (DECL_RTL_IF_SET (var) == pc_rtx)
-	SET_DECL_RTL (var, NULL);
-
-      /* Some RTL parts really want to look at DECL_RTL(x) when x
-	 was a decl marked in REG_ATTR or MEM_ATTR.  We could use
-	 SET_DECL_RTL here making this available, but that would mean
-	 to select one of the potentially many RTLs for one DECL.  Instead
-	 of doing that we simply reset the MEM_EXPR of the RTL in question,
-	 then nobody can get at it and hence nobody can call DECL_RTL on it.  */
-      if (!DECL_RTL_SET_P (var))
-	{
-	  if (MEM_P (SA.partition_to_pseudo[i]))
-	    set_mem_expr (SA.partition_to_pseudo[i], NULL);
-	}
-    }
-
   /* If we have a class containing differently aligned pointers
      we need to merge those into the corresponding RTL pointer
      alignment.  */
@@ -5964,7 +6123,6 @@ pass_expand::execute (function *fun)
     {
       tree name = ssa_name (i);
       int part;
-      rtx r;
 
       if (!name
 	  /* We might have generated new SSA names in
@@ -5977,20 +6135,24 @@ pass_expand::execute (function *fun)
       if (part == NO_PARTITION)
 	continue;
 
-      /* Adjust all partition members to get the underlying decl of
-	 the representative which we might have created in expand_one_var.  */
-      if (SSA_NAME_VAR (name) == NULL_TREE)
+      gcc_assert (SA.partition_to_pseudo[part]);
+
+      /* If this decl was marked as living in multiple places, reset
+	 this now to NULL.  */
+      tree var = SSA_NAME_VAR (name);
+      if (var && DECL_RTL_IF_SET (var) == pc_rtx)
+	SET_DECL_RTL (var, NULL);
+      /* Check that the pseudos chosen by assign_parms are those of
+	 the corresponding default defs.  */
+      else if (SSA_NAME_IS_DEFAULT_DEF (name)
+	       && (TREE_CODE (var) == PARM_DECL
+		   || TREE_CODE (var) == RESULT_DECL))
 	{
-	  tree leader = partition_to_var (SA.map, part);
-	  gcc_assert (SSA_NAME_VAR (leader) != NULL_TREE);
-	  replace_ssa_name_symbol (name, SSA_NAME_VAR (leader));
+	  rtx in = DECL_RTL_IF_SET (var);
+	  gcc_assert (in);
+	  rtx out = SA.partition_to_pseudo[part];
+	  gcc_assert (in == out || rtx_equal_p (in, out));
 	}
-      if (!POINTER_TYPE_P (TREE_TYPE (name)))
-	continue;
-
-      r = SA.partition_to_pseudo[part];
-      if (REG_P (r))
-	mark_reg_pointer (r, get_pointer_alignment (name));
     }
 
   /* If this function is `main', emit a call to `__main'
diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
index a0b6e3e..602579d 100644
--- a/gcc/cfgexpand.h
+++ b/gcc/cfgexpand.h
@@ -22,5 +22,7 @@ along with GCC; see the file COPYING3.  If not see
 
 extern tree gimple_assign_rhs_to_tree (gimple);
 extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
+extern rtx get_rtl_for_parm_ssa_default_def (tree var);
+
 
 #endif /* GCC_CFGEXPAND_H */
diff --git a/gcc/common.opt b/gcc/common.opt
index 6b2ccbc..89dcabf 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2230,16 +2230,16 @@ Common Report Var(flag_tree_ch) Optimization
 Enable loop header copying on trees
 
 ftree-coalesce-inlined-vars
-Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization
-Enable coalescing of copy-related user variables that are inlined
+Common Ignore RejectNegative
+Does nothing.  Preserved for backward compatibility.
 
 ftree-coalesce-vars
-Common Report Var(flag_ssa_coalesce_vars,2) Optimization
-Enable coalescing of all copy-related user variables
+Common Report Var(flag_tree_coalesce_vars) Optimization
+Enable SSA coalescing of user variables
 
 ftree-copyrename
-Common Report Var(flag_tree_copyrename) Optimization
-Replace SSA temporaries with better names in copies
+Common Ignore
+Does nothing.  Preserved for backward compatibility.
 
 ftree-copy-prop
 Common Report Var(flag_tree_copy_prop) Optimization
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 522e924..681c33e 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -339,7 +339,6 @@ Objective-C and Objective-C++ Dialects}.
 -fdump-tree-phiprop@r{[}-@var{n}@r{]} @gol
 -fdump-tree-phiopt@r{[}-@var{n}@r{]} @gol
 -fdump-tree-forwprop@r{[}-@var{n}@r{]} @gol
--fdump-tree-copyrename@r{[}-@var{n}@r{]} @gol
 -fdump-tree-nrv -fdump-tree-vect @gol
 -fdump-tree-sink @gol
 -fdump-tree-sra@r{[}-@var{n}@r{]} @gol
@@ -445,9 +444,8 @@ Objective-C and Objective-C++ Dialects}.
 -fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol
 -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
 -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
--ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
--ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
--ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
+-ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol
+-ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
 -ftree-loop-if-convert-stores -ftree-loop-im @gol
 -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
 -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
@@ -7078,11 +7076,6 @@ name is made by appending @file{.phiopt} to the source file name.
 Dump each function after forward propagating single use variables.  The file
 name is made by appending @file{.forwprop} to the source file name.
 
-@item copyrename
-@opindex fdump-tree-copyrename
-Dump each function after applying the copy rename optimization.  The file
-name is made by appending @file{.copyrename} to the source file name.
-
 @item nrv
 @opindex fdump-tree-nrv
 Dump each function after applying the named return value optimization on
@@ -7547,8 +7540,8 @@ compilation time.
 -ftree-ccp @gol
 -fssa-phiopt @gol
 -ftree-ch @gol
+-ftree-coalesce-vars @gol
 -ftree-copy-prop @gol
--ftree-copyrename @gol
 -ftree-dce @gol
 -ftree-dominator-opts @gol
 -ftree-dse @gol
@@ -8817,6 +8810,15 @@ profitable to parallelize the loops.
 Compare the results of several data dependence analyzers.  This option
 is used for debugging the data dependence analyzers.
 
+@item -ftree-coalesce-vars
+@opindex ftree-coalesce-vars
+Tell the compiler to attempt to combine small user-defined variables
+too, instead of just compiler temporaries.  This may severely limit the
+ability to debug an optimized program compiled with
+@option{-fno-var-tracking-assignments}.  In the negated form, this flag
+prevents SSA coalescing of user variables.  This option is enabled by
+default if optimization is enabled.
+
 @item -ftree-loop-if-convert
 @opindex ftree-loop-if-convert
 Attempt to transform conditional jumps in the innermost loops to
@@ -8930,32 +8932,6 @@ Perform scalar replacement of aggregates.  This pass replaces structure
 references with scalars to prevent committing structures to memory too
 early.  This flag is enabled by default at @option{-O} and higher.
 
-@item -ftree-copyrename
-@opindex ftree-copyrename
-Perform copy renaming on trees.  This pass attempts to rename compiler
-temporaries to other variables at copy locations, usually resulting in
-variable names which more closely resemble the original variables.  This flag
-is enabled by default at @option{-O} and higher.
-
-@item -ftree-coalesce-inlined-vars
-@opindex ftree-coalesce-inlined-vars
-Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
-combine small user-defined variables too, but only if they are inlined
-from other functions.  It is a more limited form of
-@option{-ftree-coalesce-vars}.  This may harm debug information of such
-inlined variables, but it keeps variables of the inlined-into
-function apart from each other, such that they are more likely to
-contain the expected values in a debugging session.
-
-@item -ftree-coalesce-vars
-@opindex ftree-coalesce-vars
-Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
-combine small user-defined variables too, instead of just compiler
-temporaries.  This may severely limit the ability to debug an optimized
-program compiled with @option{-fno-var-tracking-assignments}.  In the
-negated form, this flag prevents SSA coalescing of user variables,
-including inlined ones.  This option is enabled by default.
-
 @item -ftree-ter
 @opindex ftree-ter
 Perform temporary expression replacement during the SSA->normal phase.  Single
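
For illustration only (not part of the patch): a minimal sketch of the kind of source the -ftree-coalesce-vars documentation above is talking about.  The function and variable names are made up.  The three user variables are related through the merge at the end of the conditional, so they are plausible coalescing candidates when optimizing; whether a given GCC version and target actually assigns them the same pseudo is not guaranteed.

```c
/* Hypothetical example: three distinct user variables that meet at a
   control-flow merge.  In SSA form, RESULT is defined by a PHI of FIRST
   and SECOND, which makes the three names candidates for coalescing
   even though they have different base variables.  */
int
pick (int a, int b, int flag)
{
  int first = a + 1;	/* user variable, live only on one arm */
  int second = b + 2;	/* user variable, live only on the other arm */
  int result;

  if (flag)
    result = first;
  else
    result = second;

  return result;
}
```

Compiling this with -O1 -g and then again with -fno-tree-coalesce-vars added, and comparing the -fdump-rtl-expand dumps, is one way to see whether the variables ended up sharing a pseudo.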
diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index ed2b30b..0648af6 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -1232,6 +1232,9 @@ set_reg_attrs_for_parm (rtx parm_rtx, rtx mem)
 void
 set_reg_attrs_for_decl_rtl (tree t, rtx x)
 {
+  if (!t)
+    return;
+  tree tdecl = t;
   if (GET_CODE (x) == SUBREG)
     {
       gcc_assert (subreg_lowpart_p (x));
@@ -1240,7 +1243,7 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x)
   if (REG_P (x))
     REG_ATTRS (x)
       = get_reg_attrs (t, byte_lowpart_offset (GET_MODE (x),
-					       DECL_MODE (t)));
+					       DECL_MODE (tdecl)));
   if (GET_CODE (x) == CONCAT)
     {
       if (REG_P (XEXP (x, 0)))
diff --git a/gcc/explow.c b/gcc/explow.c
index bd342c1..6dba6e5 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -842,6 +842,28 @@ promote_decl_mode (const_tree decl, int *punsignedp)
   return pmode;
 }
 
+/* Return the promoted mode for NAME.  If it is a named SSA_NAME, it
+   is the same as promote_decl_mode.  Otherwise, it is the promoted
+   mode of a temp decl of same type as the SSA_NAME, if we had created
+   one.  */
+
+machine_mode
+promote_ssa_mode (const_tree name, int *punsignedp)
+{
+  gcc_assert (TREE_CODE (name) == SSA_NAME);
+
+  tree type = TREE_TYPE (name);
+  int unsignedp = TYPE_UNSIGNED (type);
+  machine_mode mode = TYPE_MODE (type);
+
+  machine_mode pmode = promote_mode (type, mode, &unsignedp);
+  if (punsignedp)
+    *punsignedp = unsignedp;
+
+  return pmode;
+}
+
+
 \f
 /* Controls the behaviour of {anti_,}adjust_stack.  */
 static bool suppress_reg_args_size;
diff --git a/gcc/explow.h b/gcc/explow.h
index 94613de..52113db 100644
--- a/gcc/explow.h
+++ b/gcc/explow.h
@@ -57,6 +57,9 @@ extern machine_mode promote_mode (const_tree, machine_mode, int *);
 /* Return mode and signedness to use when object is promoted.  */
 machine_mode promote_decl_mode (const_tree, int *);
 
+/* Return mode and signedness to use when an SSA_NAME is promoted.  */
+machine_mode promote_ssa_mode (const_tree, int *);
+
 /* Remove some bytes from the stack.  An rtx says how many.  */
 extern void adjust_stack (rtx);
 
diff --git a/gcc/expr.c b/gcc/expr.c
index 899a42c..d601129 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9246,7 +9246,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
   rtx op0, op1, temp, decl_rtl;
   tree type;
   int unsignedp;
-  machine_mode mode;
+  machine_mode mode, dmode;
   enum tree_code code = TREE_CODE (exp);
   rtx subtarget, original_target;
   int ignore;
@@ -9377,7 +9377,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
       if (g == NULL
 	  && modifier == EXPAND_INITIALIZER
 	  && !SSA_NAME_IS_DEFAULT_DEF (exp)
-	  && (optimize || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
+	  && (optimize || !SSA_NAME_VAR (exp)
+	      || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
 	  && stmt_is_replaceable_p (SSA_NAME_DEF_STMT (exp)))
 	g = SSA_NAME_DEF_STMT (exp);
       if (g)
@@ -9456,15 +9457,18 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
       /* Ensure variable marked as used even if it doesn't go through
 	 a parser.  If it hasn't be used yet, write out an external
 	 definition.  */
-      TREE_USED (exp) = 1;
+      if (exp)
+	TREE_USED (exp) = 1;
 
       /* Show we haven't gotten RTL for this yet.  */
       temp = 0;
 
       /* Variables inherited from containing functions should have
 	 been lowered by this point.  */
-      context = decl_function_context (exp);
-      gcc_assert (SCOPE_FILE_SCOPE_P (context)
+      if (exp)
+	context = decl_function_context (exp);
+      gcc_assert (!exp
+		  || SCOPE_FILE_SCOPE_P (context)
 		  || context == current_function_decl
 		  || TREE_STATIC (exp)
 		  || DECL_EXTERNAL (exp)
@@ -9488,7 +9492,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 	  decl_rtl = use_anchored_address (decl_rtl);
 	  if (modifier != EXPAND_CONST_ADDRESS
 	      && modifier != EXPAND_SUM
-	      && !memory_address_addr_space_p (DECL_MODE (exp),
+	      && !memory_address_addr_space_p (exp ? DECL_MODE (exp)
+					       : GET_MODE (decl_rtl),
 					       XEXP (decl_rtl, 0),
 					       MEM_ADDR_SPACE (decl_rtl)))
 	    temp = replace_equiv_address (decl_rtl,
@@ -9499,12 +9504,17 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 	 if the address is a register.  */
       if (temp != 0)
 	{
-	  if (MEM_P (temp) && REG_P (XEXP (temp, 0)))
+	  if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0)))
 	    mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp));
 
 	  return temp;
 	}
 
+      if (exp)
+	dmode = DECL_MODE (exp);
+      else
+	dmode = TYPE_MODE (TREE_TYPE (ssa_name));
+
       /* If the mode of DECL_RTL does not match that of the decl,
 	 there are two cases: we are dealing with a BLKmode value
 	 that is returned in a register, or we are dealing with
@@ -9512,22 +9522,23 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 	 of the wanted mode, but mark it so that we know that it
 	 was already extended.  */
       if (REG_P (decl_rtl)
-	  && DECL_MODE (exp) != BLKmode
-	  && GET_MODE (decl_rtl) != DECL_MODE (exp))
+	  && dmode != BLKmode
+	  && GET_MODE (decl_rtl) != dmode)
 	{
 	  machine_mode pmode;
 
 	  /* Get the signedness to be used for this variable.  Ensure we get
 	     the same mode we got when the variable was declared.  */
-	  if (code == SSA_NAME
-	      && (g = SSA_NAME_DEF_STMT (ssa_name))
-	      && gimple_code (g) == GIMPLE_CALL
-	      && !gimple_call_internal_p (g))
+	  if (code != SSA_NAME)
+	    pmode = promote_decl_mode (exp, &unsignedp);
+	  else if ((g = SSA_NAME_DEF_STMT (ssa_name))
+		   && gimple_code (g) == GIMPLE_CALL
+		   && !gimple_call_internal_p (g))
 	    pmode = promote_function_mode (type, mode, &unsignedp,
 					   gimple_call_fntype (g),
 					   2);
 	  else
-	    pmode = promote_decl_mode (exp, &unsignedp);
+	    pmode = promote_ssa_mode (ssa_name, &unsignedp);
 	  gcc_assert (GET_MODE (decl_rtl) == pmode);
 
 	  temp = gen_lowpart_SUBREG (mode, decl_rtl);
diff --git a/gcc/function.c b/gcc/function.c
index f9d11bf4..840f4a2 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -72,6 +72,9 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfganal.h"
 #include "cfgbuild.h"
 #include "cfgcleanup.h"
+#include "cfgexpand.h"
+#include "basic-block.h"
+#include "df.h"
 #include "params.h"
 #include "bb-reorder.h"
 #include "shrink-wrap.h"
@@ -2105,6 +2108,30 @@ aggregate_value_p (const_tree exp, const_tree fntype)
 bool
 use_register_for_decl (const_tree decl)
 {
+  if (TREE_CODE (decl) == SSA_NAME)
+    {
+      /* We often try to use the SSA_NAME, instead of its underlying
+	 decl, to get type information and guide decisions, to avoid
+	 differences of behavior between anonymous and named
+	 variables, but in this one case we have to go for the actual
+	 variable if there is one.  The main reason is that, at least
+	 at -O0, we want to place user variables on the stack, but we
+	 don't mind using pseudos for anonymous or ignored temps.
+	 Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
+	 should go in pseudos, whereas their corresponding variables
+	 might have to go on the stack.  So, disregarding the decl
+	 here would negatively impact debug info at -O0, enable
+	 coalescing between SSA_NAMEs that ought to get different
+	 stack/pseudo assignments, and get the incoming argument
+	 processing thoroughly confused by PARM_DECLs expected to live
+	 in stack slots but assigned to pseudos.  */
+      if (!SSA_NAME_VAR (decl))
+	return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
+	  && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
+
+      decl = SSA_NAME_VAR (decl);
+    }
+
   if (!targetm.calls.allocate_stack_slots_for_args ())
     return true;
 
@@ -2745,23 +2772,88 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data)
   data->entry_parm = entry_parm;
 }
 
+/* Wrapper for use_register_for_decl, that special-cases the
+   .result_ptr as the function's RESULT_DECL when the RESULT_DECL is
+   passed by reference.  */
+
+static bool
+use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm)
+{
+  if (parm == all->function_result_decl)
+    {
+      tree result = DECL_RESULT (current_function_decl);
+
+      if (DECL_BY_REFERENCE (result))
+	parm = result;
+    }
+
+  return use_register_for_decl (parm);
+}
+
+/* Wrapper for get_rtl_for_parm_ssa_default_def, that special-cases
+   the .result_ptr as the function's RESULT_DECL when the RESULT_DECL
+   is passed by reference.  */
+
+static rtx
+rtl_for_parm (struct assign_parm_data_all *all, tree parm)
+{
+  if (parm == all->function_result_decl)
+    {
+      tree result = DECL_RESULT (current_function_decl);
+
+      if (!DECL_BY_REFERENCE (result))
+	return NULL_RTX;
+
+      parm = result;
+    }
+
+  return get_rtl_for_parm_ssa_default_def (parm);
+}
+
+/* Reset the location of PARM_DECLs and RESULT_DECLs that had
+   SSA_NAMEs in multiple partitions, so that assign_parms will choose
+   the default def, if it exists, or create new RTL to hold the unused
+   entry value.  If we are coalescing across variables, we want to
+   reset the location too, because a parm without a default def
+   (incoming value unused) might be coalesced with one with a default
+   def, and then assign_parms would copy both incoming values to the
+   same location, which might cause the wrong value to survive.  */
+static void
+maybe_reset_rtl_for_parm (tree parm)
+{
+  gcc_assert (TREE_CODE (parm) == PARM_DECL
+	      || TREE_CODE (parm) == RESULT_DECL);
+  if ((flag_tree_coalesce_vars
+       || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx))
+      && is_gimple_reg (parm))
+    SET_DECL_RTL (parm, NULL_RTX);
+}
+
 /* A subroutine of assign_parms.  Adjust DATA->STACK_RTL such that it's
    always valid and properly aligned.  */
 
 static void
-assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data)
+assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm,
+			      struct assign_parm_data_one *data)
 {
   rtx stack_parm = data->stack_parm;
 
+  /* If out-of-SSA assigned RTL to the parm default def, make sure we
+     don't use what we might have computed before.  */
+  rtx ssa_assigned = rtl_for_parm (all, parm);
+  if (ssa_assigned)
+    stack_parm = NULL;
+
   /* If we can't trust the parm stack slot to be aligned enough for its
      ultimate type, don't use that slot after entry.  We'll make another
      stack slot, if we need one.  */
-  if (stack_parm
-      && ((STRICT_ALIGNMENT
-	   && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm))
-	  || (data->nominal_type
-	      && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
-	      && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
+  else if (stack_parm
+	   && ((STRICT_ALIGNMENT
+		&& (GET_MODE_ALIGNMENT (data->nominal_mode)
+		    > MEM_ALIGN (stack_parm)))
+	       || (data->nominal_type
+		   && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
+		   && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
     stack_parm = NULL;
 
   /* If parm was passed in memory, and we need to convert it on entry,
@@ -2823,11 +2915,16 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
 
   size = int_size_in_bytes (data->passed_type);
   size_stored = CEIL_ROUND (size, UNITS_PER_WORD);
+
   if (stack_parm == 0)
     {
       DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
-      stack_parm = assign_stack_local (BLKmode, size_stored,
-				       DECL_ALIGN (parm));
+      stack_parm = rtl_for_parm (all, parm);
+      if (!stack_parm)
+	stack_parm = assign_stack_local (BLKmode, size_stored,
+					 DECL_ALIGN (parm));
+      else
+	stack_parm = copy_rtx (stack_parm);
       if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
 	PUT_MODE (stack_parm, GET_MODE (entry_parm));
       set_mem_attributes (stack_parm, parm, 1);
@@ -2968,10 +3065,19 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
     = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
 			     TREE_TYPE (current_function_decl), 2);
 
-  parmreg = gen_reg_rtx (promoted_nominal_mode);
+  rtx from_expand = rtl_for_parm (all, parm);
 
-  if (!DECL_ARTIFICIAL (parm))
-    mark_user_reg (parmreg);
+  if (from_expand && !data->passed_pointer)
+    {
+      parmreg = from_expand;
+      gcc_assert (GET_MODE (parmreg) == promoted_nominal_mode);
+    }
+  else
+    {
+      parmreg = gen_reg_rtx (promoted_nominal_mode);
+      if (!DECL_ARTIFICIAL (parm))
+	mark_user_reg (parmreg);
+    }
 
   /* If this was an item that we received a pointer to,
      set DECL_RTL appropriately.  */
@@ -2990,6 +3096,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
      assign_parm_find_data_types and expand_expr_real_1.  */
 
   equiv_stack_parm = data->stack_parm;
+  if (!equiv_stack_parm)
+    equiv_stack_parm = data->entry_parm;
   validated_mem = validize_mem (copy_rtx (data->entry_parm));
 
   need_conversion = (data->nominal_mode != data->passed_mode
@@ -3130,11 +3238,17 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 
   /* If we were passed a pointer but the actual value can safely live
      in a register, retrieve it and use it directly.  */
-  if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode)
+  if (data->passed_pointer
+      && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode))
     {
       /* We can't use nominal_mode, because it will have been set to
 	 Pmode above.  We must use the actual mode of the parm.  */
-      if (use_register_for_decl (parm))
+      if (from_expand)
+	{
+	  parmreg = from_expand;
+	  gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm)));
+	}
+      else if (use_register_for_decl (parm))
 	{
 	  parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm)));
 	  mark_user_reg (parmreg);
@@ -3174,7 +3288,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 
       /* STACK_PARM is the pointer, not the parm, and PARMREG is
 	 now the parm.  */
-      data->stack_parm = NULL;
+      data->stack_parm = equiv_stack_parm = NULL;
     }
 
   /* Mark the register as eliminable if we did no conversion and it was
@@ -3184,11 +3298,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
      make here would screw up life analysis for it.  */
   if (data->nominal_mode == data->passed_mode
       && !did_conversion
-      && data->stack_parm != 0
-      && MEM_P (data->stack_parm)
+      && equiv_stack_parm != 0
+      && MEM_P (equiv_stack_parm)
       && data->locate.offset.var == 0
       && reg_mentioned_p (virtual_incoming_args_rtx,
-			  XEXP (data->stack_parm, 0)))
+			  XEXP (equiv_stack_parm, 0)))
     {
       rtx_insn *linsn = get_last_insn ();
       rtx_insn *sinsn;
@@ -3201,8 +3315,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 	    = GET_MODE_INNER (GET_MODE (parmreg));
 	  int regnor = REGNO (XEXP (parmreg, 0));
 	  int regnoi = REGNO (XEXP (parmreg, 1));
-	  rtx stackr = adjust_address_nv (data->stack_parm, submode, 0);
-	  rtx stacki = adjust_address_nv (data->stack_parm, submode,
+	  rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0);
+	  rtx stacki = adjust_address_nv (equiv_stack_parm, submode,
 					  GET_MODE_SIZE (submode));
 
 	  /* Scan backwards for the set of the real and
@@ -3275,6 +3389,13 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
 
       if (data->stack_parm == 0)
 	{
+	  rtx x = data->stack_parm = rtl_for_parm (all, parm);
+	  if (x)
+	    gcc_assert (GET_MODE (x) == GET_MODE (data->entry_parm));
+	}
+
+      if (data->stack_parm == 0)
+	{
 	  int align = STACK_SLOT_ALIGNMENT (data->passed_type,
 					    GET_MODE (data->entry_parm),
 					    TYPE_ALIGN (data->passed_type));
@@ -3531,6 +3652,8 @@ assign_parms (tree fndecl)
 	  DECL_INCOMING_RTL (parm) = DECL_RTL (parm);
 	  continue;
 	}
+      else
+	maybe_reset_rtl_for_parm (parm);
 
       /* Estimate stack alignment from parameter alignment.  */
       if (SUPPORTS_STACK_ALIGNMENT)
@@ -3580,7 +3703,9 @@ assign_parms (tree fndecl)
       else
 	set_decl_incoming_rtl (parm, data.entry_parm, false);
 
-      /* Boudns should be loaded in the particular order to
+      assign_parm_adjust_stack_rtl (&all, parm, &data);
+
+      /* Bounds should be loaded in the particular order to
 	 have registers allocated correctly.  Collect info about
 	 input bounds and load them later.  */
       if (POINTER_BOUNDS_TYPE_P (data.passed_type))
@@ -3597,11 +3722,10 @@ assign_parms (tree fndecl)
 	}
       else
 	{
-	  assign_parm_adjust_stack_rtl (&data);
-
 	  if (assign_parm_setup_block_p (&data))
 	    assign_parm_setup_block (&all, parm, &data);
-	  else if (data.passed_pointer || use_register_for_decl (parm))
+	  else if (data.passed_pointer
+		   || use_register_for_parm_decl (&all, parm))
 	    assign_parm_setup_reg (&all, parm, &data);
 	  else
 	    assign_parm_setup_stack (&all, parm, &data);
@@ -4932,7 +5056,9 @@ expand_function_start (tree subr)
      before any library calls that assign parms might generate.  */
 
   /* Decide whether to return the value in memory or in a register.  */
-  if (aggregate_value_p (DECL_RESULT (subr), subr))
+  tree res = DECL_RESULT (subr);
+  maybe_reset_rtl_for_parm (res);
+  if (aggregate_value_p (res, subr))
     {
       /* Returning something that won't go in a register.  */
       rtx value_address = 0;
@@ -4940,7 +5066,7 @@ expand_function_start (tree subr)
 #ifdef PCC_STATIC_STRUCT_RETURN
       if (cfun->returns_pcc_struct)
 	{
-	  int size = int_size_in_bytes (TREE_TYPE (DECL_RESULT (subr)));
+	  int size = int_size_in_bytes (TREE_TYPE (res));
 	  value_address = assemble_static_space (size);
 	}
       else
@@ -4952,36 +5078,45 @@ expand_function_start (tree subr)
 	     it.  */
 	  if (sv)
 	    {
-	      value_address = gen_reg_rtx (Pmode);
+	      if (DECL_BY_REFERENCE (res))
+		value_address = get_rtl_for_parm_ssa_default_def (res);
+	      if (!value_address)
+		value_address = gen_reg_rtx (Pmode);
 	      emit_move_insn (value_address, sv);
 	    }
 	}
       if (value_address)
 	{
 	  rtx x = value_address;
-	  if (!DECL_BY_REFERENCE (DECL_RESULT (subr)))
+	  if (!DECL_BY_REFERENCE (res))
 	    {
-	      x = gen_rtx_MEM (DECL_MODE (DECL_RESULT (subr)), x);
-	      set_mem_attributes (x, DECL_RESULT (subr), 1);
+	      x = get_rtl_for_parm_ssa_default_def (res);
+	      if (!x)
+		{
+		  x = gen_rtx_MEM (DECL_MODE (res), value_address);
+		  set_mem_attributes (x, res, 1);
+		}
 	    }
-	  SET_DECL_RTL (DECL_RESULT (subr), x);
+	  SET_DECL_RTL (res, x);
 	}
     }
-  else if (DECL_MODE (DECL_RESULT (subr)) == VOIDmode)
+  else if (DECL_MODE (res) == VOIDmode)
     /* If return mode is void, this decl rtl should not be used.  */
-    SET_DECL_RTL (DECL_RESULT (subr), NULL_RTX);
+    SET_DECL_RTL (res, NULL_RTX);
   else
     {
       /* Compute the return values into a pseudo reg, which we will copy
 	 into the true return register after the cleanups are done.  */
-      tree return_type = TREE_TYPE (DECL_RESULT (subr));
-      if (TYPE_MODE (return_type) != BLKmode
-	  && targetm.calls.return_in_msb (return_type))
+      tree return_type = TREE_TYPE (res);
+      rtx x = get_rtl_for_parm_ssa_default_def (res);
+      if (x)
+	/* Use it.  */;
+      else if (TYPE_MODE (return_type) != BLKmode
+	       && targetm.calls.return_in_msb (return_type))
 	/* expand_function_end will insert the appropriate padding in
 	   this case.  Use the return value's natural (unpadded) mode
 	   within the function proper.  */
-	SET_DECL_RTL (DECL_RESULT (subr),
-		      gen_reg_rtx (TYPE_MODE (return_type)));
+	x = gen_reg_rtx (TYPE_MODE (return_type));
       else
 	{
 	  /* In order to figure out what mode to use for the pseudo, we
@@ -4992,25 +5127,26 @@ expand_function_start (tree subr)
 	  /* Structures that are returned in registers are not
 	     aggregate_value_p, so we may see a PARALLEL or a REG.  */
 	  if (REG_P (hard_reg))
-	    SET_DECL_RTL (DECL_RESULT (subr),
-			  gen_reg_rtx (GET_MODE (hard_reg)));
+	    x = gen_reg_rtx (GET_MODE (hard_reg));
 	  else
 	    {
 	      gcc_assert (GET_CODE (hard_reg) == PARALLEL);
-	      SET_DECL_RTL (DECL_RESULT (subr), gen_group_rtx (hard_reg));
+	      x = gen_group_rtx (hard_reg);
 	    }
 	}
 
+      SET_DECL_RTL (res, x);
+
       /* Set DECL_REGISTER flag so that expand_function_end will copy the
 	 result to the real return register(s).  */
-      DECL_REGISTER (DECL_RESULT (subr)) = 1;
+      DECL_REGISTER (res) = 1;
 
       if (chkp_function_instrumented_p (current_function_decl))
 	{
-	  tree return_type = TREE_TYPE (DECL_RESULT (subr));
+	  tree return_type = TREE_TYPE (res);
 	  rtx bounds = targetm.calls.chkp_function_value_bounds (return_type,
 								 subr, 1);
-	  SET_DECL_BOUNDS_RTL (DECL_RESULT (subr), bounds);
+	  SET_DECL_BOUNDS_RTL (res, bounds);
 	}
     }
 
@@ -5025,7 +5161,9 @@ expand_function_start (tree subr)
       rtx local, chain;
      rtx_insn *insn;
 
-      local = gen_reg_rtx (Pmode);
+      local = get_rtl_for_parm_ssa_default_def (parm);
+      if (!local)
+	local = gen_reg_rtx (Pmode);
       chain = targetm.calls.static_chain (current_function_decl, true);
 
       set_decl_incoming_rtl (parm, chain, false);
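
For illustration only (not part of the patch): a toy model of the decision the new SSA_NAME branch in use_register_for_decl makes.  The struct and function below are hypothetical stand-ins, not GCC API; each field merely models the query named in its comment, and the handling of named variables is reduced to a single precomputed flag.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the properties of an SSA name that the
   new branch looks at.  None of these names exist in GCC.  */
struct toy_ssa_name
{
  bool has_decl;	/* models SSA_NAME_VAR (name) != NULL_TREE */
  bool is_blkmode;	/* models TYPE_MODE (TREE_TYPE (name)) == BLKmode */
  bool is_float;	/* models FLOAT_TYPE_P (TREE_TYPE (name)) */
  bool decl_wants_reg;	/* models the answer for the underlying decl */
};

/* Anonymous names go in a pseudo unless they are BLKmode, or they are
   floating-point and -ffloat-store is in effect.  Named ones defer to
   whatever is decided for their underlying variable.  */
static bool
toy_use_register (const struct toy_ssa_name *name, bool float_store)
{
  if (!name->has_decl)
    return !name->is_blkmode && !(float_store && name->is_float);

  return name->decl_wants_reg;
}
```

The point of the split is the one made in the comment added to use_register_for_decl: anonymous temporaries can freely live in pseudos, while names backed by a user variable must follow that variable's stack-or-pseudo choice, so that -O0 debug info and incoming-argument processing stay consistent.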
diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
index b558d90..baed630 100644
--- a/gcc/gimple-expr.c
+++ b/gcc/gimple-expr.c
@@ -375,45 +375,6 @@ copy_var_decl (tree var, tree name, tree type)
   return copy;
 }
 
-/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
-   coalescing together, false otherwise.
-
-   This must stay consistent with var_map_base_init in tree-ssa-live.c.  */
-
-bool
-gimple_can_coalesce_p (tree name1, tree name2)
-{
-  /* First check the SSA_NAME's associated DECL.  We only want to
-     coalesce if they have the same DECL or both have no associated DECL.  */
-  tree var1 = SSA_NAME_VAR (name1);
-  tree var2 = SSA_NAME_VAR (name2);
-  var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
-  var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
-  if (var1 != var2)
-    return false;
-
-  /* Now check the types.  If the types are the same, then we should
-     try to coalesce V1 and V2.  */
-  tree t1 = TREE_TYPE (name1);
-  tree t2 = TREE_TYPE (name2);
-  if (t1 == t2)
-    return true;
-
-  /* If the types are not the same, check for a canonical type match.  This
-     (for example) allows coalescing when the types are fundamentally the
-     same, but just have different names. 
-
-     Note pointer types with different address spaces may have the same
-     canonical type.  Those are rejected for coalescing by the
-     types_compatible_p check.  */
-  if (TYPE_CANONICAL (t1)
-      && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
-      && types_compatible_p (t1, t2))
-    return true;
-
-  return false;
-}
-
 /* Strip off a legitimate source ending from the input string NAME of
    length LEN.  Rather than having to know the names used by all of
    our front ends, we strip off an ending of a period followed by
diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
index ed23eb2..3d1c89f 100644
--- a/gcc/gimple-expr.h
+++ b/gcc/gimple-expr.h
@@ -28,7 +28,6 @@ extern gimple_seq gimple_body (tree);
 extern bool gimple_has_body_p (tree);
 extern const char *gimple_decl_printable_name (tree, int);
 extern tree copy_var_decl (tree, tree, tree);
-extern bool gimple_can_coalesce_p (tree, tree);
 extern tree create_tmp_var_name (const char *);
 extern tree create_tmp_var_raw (tree, const char * = NULL);
 extern tree create_tmp_var (tree, const char * = NULL);
diff --git a/gcc/opts.c b/gcc/opts.c
index 468a802..f22edd3 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -445,12 +445,12 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_1_PLUS, OPT_fsplit_wide_types, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_ccp, NULL, 1 },
     { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_bit_ccp, NULL, 1 },
+    { OPT_LEVELS_1_PLUS, OPT_ftree_coalesce_vars, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_dce, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_dominator_opts, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 },
     { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 },
-    { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 },
diff --git a/gcc/passes.def b/gcc/passes.def
index 5cd07ae..103fd2e 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -77,7 +77,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_all_early_optimizations);
       PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
 	  NEXT_PASS (pass_remove_cgraph_callee_edges);
-	  NEXT_PASS (pass_rename_ssa_copies);
 	  NEXT_PASS (pass_object_sizes);
 	  NEXT_PASS (pass_ccp);
 	  /* After CCP we rewrite no longer addressed locals into SSA
@@ -155,7 +154,6 @@ along with GCC; see the file COPYING3.  If not see
       /* Initial scalar cleanups before alias computation.
 	 They ensure memory accesses are not indirect wherever possible.  */
       NEXT_PASS (pass_strip_predict_hints);
-      NEXT_PASS (pass_rename_ssa_copies);
       NEXT_PASS (pass_ccp);
       /* After CCP we rewrite no longer addressed locals into SSA
 	 form if possible.  */
@@ -183,7 +181,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_ch);
       NEXT_PASS (pass_lower_complex);
       NEXT_PASS (pass_sra);
-      NEXT_PASS (pass_rename_ssa_copies);
       /* The dom pass will also resolve all __builtin_constant_p calls
          that are still there to 0.  This has to be done after some
 	 propagations have already run, but before some more dead code
@@ -294,7 +291,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_fold_builtins);
       NEXT_PASS (pass_optimize_widening_mul);
       NEXT_PASS (pass_tail_calls);
-      NEXT_PASS (pass_rename_ssa_copies);
       /* FIXME: If DCE is not run before checking for uninitialized uses,
 	 we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
 	 However, this also causes us to misdiagnose cases that should be
@@ -329,7 +325,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_dce);
       NEXT_PASS (pass_asan);
       NEXT_PASS (pass_tsan);
-      NEXT_PASS (pass_rename_ssa_copies);
       /* ???  We do want some kind of loop invariant motion, but we possibly
          need to adjust LIM to be more friendly towards preserving accurate
 	 debug information here.  */
diff --git a/gcc/testsuite/gcc.dg/guality/pr54200.c b/gcc/testsuite/gcc.dg/guality/pr54200.c
index 9b17187..e1e7293 100644
--- a/gcc/testsuite/gcc.dg/guality/pr54200.c
+++ b/gcc/testsuite/gcc.dg/guality/pr54200.c
@@ -1,6 +1,6 @@
 /* PR tree-optimization/54200 */
 /* { dg-do run } */
-/* { dg-options "-g -fno-var-tracking-assignments" } */
+/* { dg-options "-g -fno-var-tracking-assignments -fno-tree-coalesce-vars" } */
 
 int o __attribute__((used));
 
diff --git a/gcc/testsuite/gcc.dg/ssp-1.c b/gcc/testsuite/gcc.dg/ssp-1.c
index 5467f4d..db69332 100644
--- a/gcc/testsuite/gcc.dg/ssp-1.c
+++ b/gcc/testsuite/gcc.dg/ssp-1.c
@@ -12,7 +12,7 @@ __stack_chk_fail (void)
 
 int main ()
 {
-  int i;
+  register int i;
   char foo[255];
 
   // smash stack
diff --git a/gcc/testsuite/gcc.dg/ssp-2.c b/gcc/testsuite/gcc.dg/ssp-2.c
index 9a7ac32..752fe53 100644
--- a/gcc/testsuite/gcc.dg/ssp-2.c
+++ b/gcc/testsuite/gcc.dg/ssp-2.c
@@ -14,7 +14,7 @@ __stack_chk_fail (void)
 void
 overflow()
 {
-  int i = 0;
+  register int i = 0;
   char foo[30];
 
   /* Overflow buffer.  */
diff --git a/gcc/testsuite/gcc.dg/torture/parm-coalesce.c b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
new file mode 100644
index 0000000..dbd81c1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+
+/* Make sure we don't coalesce both incoming parms, one whose incoming
+   value is unused, to the same location, so as to overwrite one of
+   them with the incoming value of the other.  */
+
+int __attribute__((noinline, noclone))
+foo (int i, int j)
+{
+  j = i; /* The incoming value for J is unused.  */
+  i = 2;
+  if (j)
+    j++;
+  j += i + 1;
+  return j;
+}
+
+/* Same as foo, but with swapped parameters.  */
+int __attribute__((noinline, noclone))
+bar (int j, int i)
+{
+  j = i; /* The incoming value for J is unused.  */
+  i = 2;
+  if (j)
+    j++;
+  j += i + 1;
+  return j;
+}
+
+int
+main (void)
+{
+  if (foo (0, 1) != 3)
+    abort ();
+  if (bar (1, 0) != 3)
+    abort ();
+  return 0;
+}
diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
index 7b747ab9..978476c 100644
--- a/gcc/tree-outof-ssa.c
+++ b/gcc/tree-outof-ssa.c
@@ -279,7 +279,6 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
   rtx dest_rtx, seq, x;
   machine_mode dest_mode, src_mode;
   int unsignedp;
-  tree var;
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     {
@@ -301,12 +300,12 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
 
   start_sequence ();
 
-  var = SSA_NAME_VAR (partition_to_var (SA.map, dest));
+  tree name = partition_to_var (SA.map, dest);
   src_mode = TYPE_MODE (TREE_TYPE (src));
   dest_mode = GET_MODE (dest_rtx);
-  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var)));
+  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (name)));
   gcc_assert (!REG_P (dest_rtx)
-	      || dest_mode == promote_decl_mode (var, &unsignedp));
+	      || dest_mode == promote_ssa_mode (name, &unsignedp));
 
   if (src_mode != dest_mode)
     {
@@ -682,13 +681,12 @@ elim_backward (elim_graph g, int T)
 static rtx
 get_temp_reg (tree name)
 {
-  tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
-  tree type = TREE_TYPE (var);
+  tree type = TREE_TYPE (name);
   int unsignedp;
-  machine_mode reg_mode = promote_decl_mode (var, &unsignedp);
+  machine_mode reg_mode = promote_ssa_mode (name, &unsignedp);
   rtx x = gen_reg_rtx (reg_mode);
   if (POINTER_TYPE_P (type))
-    mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var))));
+    mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type)));
   return x;
 }
 
@@ -988,7 +986,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
 
   /* Return to viewing the variable list as just all reference variables after
      coalescing has been performed.  */
-  partition_view_normal (map, false);
+  partition_view_normal (map);
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     {
diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index bf8983f..a622728 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -36,6 +36,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-iterator.h"
 #include "tree-ssa-live.h"
 #include "tree-ssa-coalesce.h"
+#include "explow.h"
 #include "diagnostic-core.h"
 
 
@@ -806,6 +807,16 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
   basic_block bb;
   ssa_op_iter iter;
   live_track_p live;
+  basic_block entry;
+
+  /* If inter-variable coalescing is enabled, we may attempt to
+     coalesce variables from different base variables, including
+     different parameters, so we have to make sure default defs live
+     at the entry block conflict with each other.  */
+  if (flag_tree_coalesce_vars)
+    entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+  else
+    entry = NULL;
 
   map = live_var_map (liveinfo);
   graph = ssa_conflicts_new (num_var_partitions (map));
@@ -864,6 +875,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
 	    live_track_process_def (live, result, graph);
 	}
 
+      /* Pretend there are defs for params' default defs at the start
+	 of the (post-)entry block.  */
+      if (bb == entry)
+	{
+	  unsigned base;
+	  bitmap_iterator bi;
+	  EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi)
+	    {
+	      bitmap_iterator bi2;
+	      unsigned part;
+	      EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base],
+					0, part, bi2)
+		{
+		  tree var = partition_to_var (map, part);
+		  if (!SSA_NAME_VAR (var)
+		      || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL
+			  && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL)
+		      || !SSA_NAME_IS_DEFAULT_DEF (var))
+		    continue;
+		  live_track_process_def (live, var, graph);
+		}
+	    }
+	}
+
       live_track_clear_base_vars (live);
     }
 
@@ -1132,6 +1167,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
     {
       var1 = partition_to_var (map, p1);
       var2 = partition_to_var (map, p2);
+
       z = var_union (map, var1, var2);
       if (z == NO_PARTITION)
 	{
@@ -1149,6 +1185,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
 
       if (debug)
 	fprintf (debug, ": Success -> %d\n", z);
+
       return true;
     }
 
@@ -1244,6 +1281,328 @@ ssa_name_var_hash::equal (const tree_node *n1, const tree_node *n2)
 }
 
 
+/* Output partition map MAP with coalescing plan PART to file F.  */
+
+void
+dump_part_var_map (FILE *f, partition part, var_map map)
+{
+  int t;
+  unsigned x, y;
+  int p;
+
+  fprintf (f, "\nCoalescible Partition map \n\n");
+
+  for (x = 0; x < map->num_partitions; x++)
+    {
+      if (map->view_to_partition != NULL)
+	p = map->view_to_partition[x];
+      else
+	p = x;
+
+      if (ssa_name (p) == NULL_TREE
+	  || virtual_operand_p (ssa_name (p)))
+        continue;
+
+      t = 0;
+      for (y = 1; y < num_ssa_names; y++)
+        {
+	  tree var = version_to_var (map, y);
+	  if (!var)
+	    continue;
+	  int q = var_to_partition (map, var);
+	  p = partition_find (part, q);
+	  gcc_assert (map->partition_to_base_index[q]
+		      == map->partition_to_base_index[p]);
+
+	  if (p == (int)x)
+	    {
+	      if (t++ == 0)
+	        {
+		  fprintf (f, "Partition %d, base %d (", x,
+			   map->partition_to_base_index[q]);
+		  print_generic_expr (f, partition_to_var (map, q), TDF_SLIM);
+		  fprintf (f, " - ");
+		}
+	      fprintf (f, "%d ", y);
+	    }
+	}
+      if (t != 0)
+	fprintf (f, ")\n");
+    }
+  fprintf (f, "\n");
+}
+
+/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
+   coalescing together, false otherwise.
+
+   This must stay consistent with var_map_base_init in tree-ssa-live.c.  */
+
+bool
+gimple_can_coalesce_p (tree name1, tree name2)
+{
+  /* First check the SSA_NAME's associated DECL.  Without
+     optimization, we only want to coalesce if they have the same DECL
+     or both have no associated DECL.  */
+  tree var1 = SSA_NAME_VAR (name1);
+  tree var2 = SSA_NAME_VAR (name2);
+  var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
+  var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
+  if (var1 != var2 && !flag_tree_coalesce_vars)
+    return false;
+
+  /* Now check the types.  If the types are the same, then we should
+     try to coalesce NAME1 and NAME2.  */
+  tree t1 = TREE_TYPE (name1);
+  tree t2 = TREE_TYPE (name2);
+  if (t1 == t2)
+    {
+    check_modes:
+      /* If the base variables are the same, we're good: none of the
+	 other tests below could possibly fail.  */
+      var1 = SSA_NAME_VAR (name1);
+      var2 = SSA_NAME_VAR (name2);
+      if (var1 == var2)
+	return true;
+
+      /* We don't want to coalesce two SSA names if one of the base
+	 variables is supposed to be a register while the other is
+	 supposed to be on the stack.  Anonymous SSA names take
+	 registers, but when not optimizing, user variables should go
+	 on the stack, so coalescing them with the anonymous variable
+	 as the partition leader would end up assigning the user
+	 variable to a register.  Don't do that!  */
+      bool reg1 = !var1 || use_register_for_decl (var1);
+      bool reg2 = !var2 || use_register_for_decl (var2);
+      if (reg1 != reg2)
+	return false;
+
+      /* Check that the promoted modes are the same.  We don't want to
+	 coalesce if the promoted modes would be different.  Only
+	 PARM_DECLs and RESULT_DECLs have different promotion rules,
+	 so skip the test if both are variables or anonymous
+	 SSA_NAMEs.  */
+      return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
+	|| promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL);
+    }
+
+  /* If the types are not the same, check for a canonical type match.  This
+     (for example) allows coalescing when the types are fundamentally the
+     same, but just have different names.
+
+     Note pointer types with different address spaces may have the same
+     canonical type.  Those are rejected for coalescing by the
+     types_compatible_p check.  */
+  if (TYPE_CANONICAL (t1)
+      && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
+      && types_compatible_p (t1, t2))
+    goto check_modes;
+
+  return false;
+}
+
+/* Fill in MAP's partition_to_base_index, with one index for each
+   partition of SSA names USED_IN_COPIES and related by CL coalesce
+   possibilities.  This must match gimple_can_coalesce_p in the
+   optimized case.  */
+
+static void
+compute_optimized_partition_bases (var_map map, bitmap used_in_copies,
+				   coalesce_list_p cl)
+{
+  int parts = num_var_partitions (map);
+  partition tentative = partition_new (parts);
+
+  /* Partition the SSA versions so that, for each coalescible
+     pair, both of its members are in the same partition in
+     TENTATIVE.  */
+  gcc_assert (!cl->sorted);
+  coalesce_pair_p node;
+  coalesce_iterator_type ppi;
+  FOR_EACH_PARTITION_PAIR (node, ppi, cl)
+    {
+      tree v1 = ssa_name (node->first_element);
+      int p1 = partition_find (tentative, var_to_partition (map, v1));
+      tree v2 = ssa_name (node->second_element);
+      int p2 = partition_find (tentative, var_to_partition (map, v2));
+
+      if (p1 == p2)
+	continue;
+
+      partition_union (tentative, p1, p2);
+    }
+
+  /* We have to deal with cost one pairs too.  */
+  for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next)
+    {
+      tree v1 = ssa_name (co->first_element);
+      int p1 = partition_find (tentative, var_to_partition (map, v1));
+      tree v2 = ssa_name (co->second_element);
+      int p2 = partition_find (tentative, var_to_partition (map, v2));
+
+      if (p1 == p2)
+	continue;
+
+      partition_union (tentative, p1, p2);
+    }
+
+  /* And also with abnormal edges.  */
+  basic_block bb;
+  edge e;
+  edge_iterator ei;
+  FOR_EACH_BB_FN (bb, cfun)
+    {
+      FOR_EACH_EDGE (e, ei, bb->preds)
+	if (e->flags & EDGE_ABNORMAL)
+	  {
+	    gphi_iterator gsi;
+	    for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
+		 gsi_next (&gsi))
+	      {
+		gphi *phi = gsi.phi ();
+		tree arg = PHI_ARG_DEF (phi, e->dest_idx);
+		if (SSA_NAME_IS_DEFAULT_DEF (arg)
+		    && (!SSA_NAME_VAR (arg)
+			|| TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL))
+		  continue;
+
+		tree res = PHI_RESULT (phi);
+
+		int p1 = partition_find (tentative, var_to_partition (map, res));
+		int p2 = partition_find (tentative, var_to_partition (map, arg));
+
+		if (p1 == p2)
+		  continue;
+
+		partition_union (tentative, p1, p2);
+	      }
+	  }
+    }
+
+  map->partition_to_base_index = XCNEWVEC (int, parts);
+  auto_vec<unsigned int> index_map (parts);
+  if (parts)
+    index_map.quick_grow (parts);
+
+  const unsigned no_part = -1;
+  unsigned count = parts;
+  while (count)
+    index_map[--count] = no_part;
+
+  /* Initialize MAP's mapping from partition to base index, using
+     as base indices an enumeration of the TENTATIVE partitions in
+     which each SSA version ended up, so that we compute conflicts
+     between all SSA versions that ended up in the same potential
+     coalesce partition.  */
+  bitmap_iterator bi;
+  unsigned i;
+  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
+    {
+      int pidx = var_to_partition (map, ssa_name (i));
+      int base = partition_find (tentative, pidx);
+      if (index_map[base] != no_part)
+	continue;
+      index_map[base] = count++;
+    }
+
+  map->num_basevars = count;
+
+  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
+    {
+      int pidx = var_to_partition (map, ssa_name (i));
+      int base = partition_find (tentative, pidx);
+      gcc_assert (index_map[base] < count);
+      map->partition_to_base_index[pidx] = index_map[base];
+    }
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    dump_part_var_map (dump_file, tentative, map);
+
+  partition_delete (tentative);
+}
+
+/* Hashtable helpers.  */
+
+struct tree_int_map_hasher : nofree_ptr_hash <tree_int_map>
+{
+  static inline hashval_t hash (const tree_int_map *);
+  static inline bool equal (const tree_int_map *, const tree_int_map *);
+};
+
+inline hashval_t
+tree_int_map_hasher::hash (const tree_int_map *v)
+{
+  return tree_map_base_hash (v);
+}
+
+inline bool
+tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
+{
+  return tree_int_map_eq (v, c);
+}
+
+/* This routine will initialize the basevar fields of MAP with base
+   names.  Partitions will share the same base if they have the same
+   SSA_NAME_VAR, or, being anonymous variables, the same type.  This
+   must match gimple_can_coalesce_p in the non-optimized case.  */
+
+static void
+compute_samebase_partition_bases (var_map map)
+{
+  int x, num_part;
+  tree var;
+  struct tree_int_map *m, *mapstorage;
+
+  num_part = num_var_partitions (map);
+  hash_table<tree_int_map_hasher> tree_to_index (num_part);
+  /* We can have at most num_part entries in the hash tables, so it's
+     enough to allocate so many map elements once, saving some malloc
+     calls.  */
+  mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
+
+  /* If a base table already exists, clear it, otherwise create it.  */
+  free (map->partition_to_base_index);
+  map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
+
+  /* Build the base variable list, and point partitions at their bases.  */
+  for (x = 0; x < num_part; x++)
+    {
+      struct tree_int_map **slot;
+      unsigned baseindex;
+      var = partition_to_var (map, x);
+      if (SSA_NAME_VAR (var)
+	  && (!VAR_P (SSA_NAME_VAR (var))
+	      || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
+	m->base.from = SSA_NAME_VAR (var);
+      else
+	/* This restricts what anonymous SSA names we can coalesce
+	   as it restricts the sets we compute conflicts for.
+	   Using TREE_TYPE to generate sets is the easiest as
+	   type equivalency also holds for SSA names with the same
+	   underlying decl.
+
+	   Check gimple_can_coalesce_p when changing this code.  */
+	m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
+			? TYPE_CANONICAL (TREE_TYPE (var))
+			: TREE_TYPE (var));
+      /* If base variable hasn't been seen, set it up.  */
+      slot = tree_to_index.find_slot (m, INSERT);
+      if (!*slot)
+	{
+	  baseindex = m - mapstorage;
+	  m->to = baseindex;
+	  *slot = m;
+	  m++;
+	}
+      else
+	baseindex = (*slot)->to;
+      map->partition_to_base_index[x] = baseindex;
+    }
+
+  map->num_basevars = m - mapstorage;
+
+  free (mapstorage);
+}
+
 /* Reduce the number of copies by coalescing variables in the function.  Return
    a partition map with the resulting coalesces.  */
 
@@ -1260,9 +1619,10 @@ coalesce_ssa_name (void)
   cl = create_coalesce_list ();
   map = create_outofssa_var_map (cl, used_in_copies);
 
-  /* If optimization is disabled, we need to coalesce all the names originating
-     from the same SSA_NAME_VAR so debug info remains undisturbed.  */
-  if (!optimize)
+  /* If this optimization is disabled, we need to coalesce all the
+     names originating from the same SSA_NAME_VAR so debug info
+     remains undisturbed.  */
+  if (!flag_tree_coalesce_vars)
     {
       hash_table<ssa_name_var_hash> ssa_name_hash (10);
 
@@ -1303,8 +1663,13 @@ coalesce_ssa_name (void)
   if (dump_file && (dump_flags & TDF_DETAILS))
     dump_var_map (dump_file, map);
 
-  /* Don't calculate live ranges for variables not in the coalesce list.  */
-  partition_view_bitmap (map, used_in_copies, true);
+  partition_view_bitmap (map, used_in_copies);
+
+  if (flag_tree_coalesce_vars)
+    compute_optimized_partition_bases (map, used_in_copies, cl);
+  else
+    compute_samebase_partition_bases (map);
+
   BITMAP_FREE (used_in_copies);
 
   if (num_var_partitions (map) < 1)
@@ -1343,8 +1708,7 @@ coalesce_ssa_name (void)
 
   /* Now coalesce everything in the list.  */
   coalesce_partitions (map, graph, cl,
-		       ((dump_flags & TDF_DETAILS) ? dump_file
-						   : NULL));
+		       ((dump_flags & TDF_DETAILS) ? dump_file : NULL));
 
   delete_coalesce_list (cl);
   ssa_conflicts_delete (graph);
diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h
index 99b188a..ae289b4 100644
--- a/gcc/tree-ssa-coalesce.h
+++ b/gcc/tree-ssa-coalesce.h
@@ -21,5 +21,6 @@ along with GCC; see the file COPYING3.  If not see
 #define GCC_TREE_SSA_COALESCE_H
 
 extern var_map coalesce_ssa_name (void);
+extern bool gimple_can_coalesce_p (tree, tree);
 
 #endif /* GCC_TREE_SSA_COALESCE_H */
diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c
deleted file mode 100644
index aeb7f28..0000000
--- a/gcc/tree-ssa-copyrename.c
+++ /dev/null
@@ -1,475 +0,0 @@
-/* Rename SSA copies.
-   Copyright (C) 2004-2015 Free Software Foundation, Inc.
-   Contributed by Andrew MacLeod <amacleod@redhat.com>
-
-This file is part of GCC.
-
-GCC is free software; you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation; either version 3, or (at your option)
-any later version.
-
-GCC is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-GNU General Public License for more details.
-
-You should have received a copy of the GNU General Public License
-along with GCC; see the file COPYING3.  If not see
-<http://www.gnu.org/licenses/>.  */
-
-#include "config.h"
-#include "system.h"
-#include "coretypes.h"
-#include "backend.h"
-#include "tree.h"
-#include "gimple.h"
-#include "rtl.h"
-#include "ssa.h"
-#include "alias.h"
-#include "fold-const.h"
-#include "internal-fn.h"
-#include "gimple-iterator.h"
-#include "flags.h"
-#include "tree-pretty-print.h"
-#include "insn-config.h"
-#include "expmed.h"
-#include "dojump.h"
-#include "explow.h"
-#include "calls.h"
-#include "emit-rtl.h"
-#include "varasm.h"
-#include "stmt.h"
-#include "expr.h"
-#include "tree-dfa.h"
-#include "tree-inline.h"
-#include "tree-ssa-live.h"
-#include "tree-pass.h"
-#include "langhooks.h"
-
-static struct
-{
-  /* Number of copies coalesced.  */
-  int coalesced;
-} stats;
-
-/* The following routines implement the SSA copy renaming phase.
-
-   This optimization looks for copies between 2 SSA_NAMES, either through a
-   direct copy, or an implicit one via a PHI node result and its arguments.
-
-   Each copy is examined to determine if it is possible to rename the base
-   variable of one of the operands to the same variable as the other operand.
-   i.e.
-   T.3_5 = <blah>
-   a_1 = T.3_5
-
-   If this copy couldn't be copy propagated, it could possibly remain in the
-   program throughout the optimization phases.   After SSA->normal, it would
-   become:
-
-   T.3 = <blah>
-   a = T.3
-
-   Since T.3_5 is distinct from all other SSA versions of T.3, there is no
-   fundamental reason why the base variable needs to be T.3, subject to
-   certain restrictions.  This optimization attempts to determine if we can
-   change the base variable on copies like this, and result in code such as:
-
-   a_5 = <blah>
-   a_1 = a_5
-
-   This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is
-   possible, the copy goes away completely. If it isn't possible, a new temp
-   will be created for a_5, and you will end up with the exact same code:
-
-   a.8 = <blah>
-   a = a.8
-
-   The other benefit of performing this optimization relates to what variables
-   are chosen in copies.  Gimplification of the program uses temporaries for
-   a lot of things. expressions like
-
-   a_1 = <blah>
-   <blah2> = a_1
-
-   get turned into
-
-   T.3_5 = <blah>
-   a_1 = T.3_5
-   <blah2> = a_1
-
-   Copy propagation is done in a forward direction, and if we can propagate
-   through the copy, we end up with:
-
-   T.3_5 = <blah>
-   <blah2> = T.3_5
-
-   The copy is gone, but so is all reference to the user variable 'a'. By
-   performing this optimization, we would see the sequence:
-
-   a_5 = <blah>
-   a_1 = a_5
-   <blah2> = a_1
-
-   which copy propagation would then turn into:
-
-   a_5 = <blah>
-   <blah2> = a_5
-
-   and so we still retain the user variable whenever possible.  */
-
-
-/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
-   Choose a representative for the partition, and send debug info to DEBUG.  */
-
-static void
-copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
-{
-  int p1, p2, p3;
-  tree root1, root2;
-  tree rep1, rep2;
-  bool ign1, ign2, abnorm;
-
-  gcc_assert (TREE_CODE (var1) == SSA_NAME);
-  gcc_assert (TREE_CODE (var2) == SSA_NAME);
-
-  register_ssa_partition (map, var1);
-  register_ssa_partition (map, var2);
-
-  p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
-  p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
-
-  if (debug)
-    {
-      fprintf (debug, "Try : ");
-      print_generic_expr (debug, var1, TDF_SLIM);
-      fprintf (debug, "(P%d) & ", p1);
-      print_generic_expr (debug, var2, TDF_SLIM);
-      fprintf (debug, "(P%d)", p2);
-    }
-
-  gcc_assert (p1 != NO_PARTITION);
-  gcc_assert (p2 != NO_PARTITION);
-
-  if (p1 == p2)
-    {
-      if (debug)
-	fprintf (debug, " : Already coalesced.\n");
-      return;
-    }
-
-  rep1 = partition_to_var (map, p1);
-  rep2 = partition_to_var (map, p2);
-  root1 = SSA_NAME_VAR (rep1);
-  root2 = SSA_NAME_VAR (rep2);
-  if (!root1 && !root2)
-    return;
-
-  /* Don't coalesce if one of the variables occurs in an abnormal PHI.  */
-  abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1)
-	    || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2));
-  if (abnorm)
-    {
-      if (debug)
-	fprintf (debug, " : Abnormal PHI barrier.  No coalesce.\n");
-      return;
-    }
-
-  /* Partitions already have the same root, simply merge them.  */
-  if (root1 == root2)
-    {
-      p1 = partition_union (map->var_partition, p1, p2);
-      if (debug)
-	fprintf (debug, " : Same root, coalesced --> P%d.\n", p1);
-      return;
-    }
-
-  /* Never attempt to coalesce 2 different parameters.  */
-  if ((root1 && TREE_CODE (root1) == PARM_DECL)
-      && (root2 && TREE_CODE (root2) == PARM_DECL))
-    {
-      if (debug)
-        fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n");
-      return;
-    }
-
-  if ((root1 && TREE_CODE (root1) == RESULT_DECL)
-      != (root2 && TREE_CODE (root2) == RESULT_DECL))
-    {
-      if (debug)
-        fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n");
-      return;
-    }
-
-  ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1));
-  ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2));
-
-  /* Refrain from coalescing user variables, if requested.  */
-  if (!ign1 && !ign2)
-    {
-      if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2))
-	ign2 = true;
-      else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1))
-	ign1 = true;
-      else if (flag_ssa_coalesce_vars != 2)
-	{
-	  if (debug)
-	    fprintf (debug, " : 2 different USER vars. No coalesce.\n");
-	  return;
-	}
-      else
-	ign2 = true;
-    }
-
-  /* If both values have default defs, we can't coalesce.  If only one has a
-     tag, make sure that variable is the new root partition.  */
-  if (root1 && ssa_default_def (cfun, root1))
-    {
-      if (root2 && ssa_default_def (cfun, root2))
-	{
-	  if (debug)
-	    fprintf (debug, " : 2 default defs. No coalesce.\n");
-	  return;
-	}
-      else
-        {
-	  ign2 = true;
-	  ign1 = false;
-	}
-    }
-  else if (root2 && ssa_default_def (cfun, root2))
-    {
-      ign1 = true;
-      ign2 = false;
-    }
-
-  /* Do not coalesce if we cannot assign a symbol to the partition.  */
-  if (!(!ign2 && root2)
-      && !(!ign1 && root1))
-    {
-      if (debug)
-	fprintf (debug, " : Choosen variable has no root.  No coalesce.\n");
-      return;
-    }
-
-  /* Don't coalesce if the new chosen root variable would be read-only.
-     If both ign1 && ign2, then the root var of the larger partition
-     wins, so reject in that case if any of the root vars is TREE_READONLY.
-     Otherwise reject only if the root var, on which replace_ssa_name_symbol
-     will be called below, is readonly.  */
-  if (((root1 && TREE_READONLY (root1)) && ign2)
-      || ((root2 && TREE_READONLY (root2)) && ign1))
-    {
-      if (debug)
-	fprintf (debug, " : Readonly variable.  No coalesce.\n");
-      return;
-    }
-
-  /* Don't coalesce if the two variables aren't type compatible .  */
-  if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2))
-      /* There is a disconnect between the middle-end type-system and
-         VRP, avoid coalescing enum types with different bounds.  */
-      || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE
-	   || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE)
-	  && TREE_TYPE (var1) != TREE_TYPE (var2)))
-    {
-      if (debug)
-	fprintf (debug, " : Incompatible types.  No coalesce.\n");
-      return;
-    }
-
-  /* Merge the two partitions.  */
-  p3 = partition_union (map->var_partition, p1, p2);
-
-  /* Set the root variable of the partition to the better choice, if there is
-     one.  */
-  if (!ign2 && root2)
-    replace_ssa_name_symbol (partition_to_var (map, p3), root2);
-  else if (!ign1 && root1)
-    replace_ssa_name_symbol (partition_to_var (map, p3), root1);
-  else
-    gcc_unreachable ();
-
-  if (debug)
-    {
-      fprintf (debug, " --> P%d ", p3);
-      print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)),
-			  TDF_SLIM);
-      fprintf (debug, "\n");
-    }
-}
-
-
-namespace {
-
-const pass_data pass_data_rename_ssa_copies =
-{
-  GIMPLE_PASS, /* type */
-  "copyrename", /* name */
-  OPTGROUP_NONE, /* optinfo_flags */
-  TV_TREE_COPY_RENAME, /* tv_id */
-  ( PROP_cfg | PROP_ssa ), /* properties_required */
-  0, /* properties_provided */
-  0, /* properties_destroyed */
-  0, /* todo_flags_start */
-  0, /* todo_flags_finish */
-};
-
-class pass_rename_ssa_copies : public gimple_opt_pass
-{
-public:
-  pass_rename_ssa_copies (gcc::context *ctxt)
-    : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt)
-  {}
-
-  /* opt_pass methods: */
-  opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); }
-  virtual bool gate (function *) { return flag_tree_copyrename != 0; }
-  virtual unsigned int execute (function *);
-
-}; // class pass_rename_ssa_copies
-
-/* This function will make a pass through the IL, and attempt to coalesce any
-   SSA versions which occur in PHI's or copies.  Coalescing is accomplished by
-   changing the underlying root variable of all coalesced version.  This will
-   then cause the SSA->normal pass to attempt to coalesce them all to the same
-   variable.  */
-
-unsigned int
-pass_rename_ssa_copies::execute (function *fun)
-{
-  var_map map;
-  basic_block bb;
-  tree var, part_var;
-  gimple stmt;
-  unsigned x;
-  FILE *debug;
-
-  memset (&stats, 0, sizeof (stats));
-
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    debug = dump_file;
-  else
-    debug = NULL;
-
-  map = init_var_map (num_ssa_names);
-
-  FOR_EACH_BB_FN (bb, fun)
-    {
-      /* Scan for real copies.  */
-      for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
-	   gsi_next (&gsi))
-	{
-	  stmt = gsi_stmt (gsi);
-	  if (gimple_assign_ssa_name_copy_p (stmt))
-	    {
-	      tree lhs = gimple_assign_lhs (stmt);
-	      tree rhs = gimple_assign_rhs1 (stmt);
-
-	      copy_rename_partition_coalesce (map, lhs, rhs, debug);
-	    }
-	}
-    }
-
-  FOR_EACH_BB_FN (bb, fun)
-    {
-      /* Treat PHI nodes as copies between the result and each argument.  */
-      for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
-	   gsi_next (&gsi))
-        {
-          size_t i;
-	  tree res;
-	  gphi *phi = gsi.phi ();
-	  res = gimple_phi_result (phi);
-
-	  /* Do not process virtual SSA_NAMES.  */
-	  if (virtual_operand_p (res))
-	    continue;
-
-	  /* Make sure to only use the same partition for an argument
-	     as the result but never the other way around.  */
-	  if (SSA_NAME_VAR (res)
-	      && !DECL_IGNORED_P (SSA_NAME_VAR (res)))
-	    for (i = 0; i < gimple_phi_num_args (phi); i++)
-	      {
-		tree arg = PHI_ARG_DEF (phi, i);
-		if (TREE_CODE (arg) == SSA_NAME)
-		  copy_rename_partition_coalesce (map, res, arg,
-						  debug);
-	      }
-	  /* Else if all arguments are in the same partition try to merge
-	     it with the result.  */
-	  else
-	    {
-	      int all_p_same = -1;
-	      int p = -1;
-	      for (i = 0; i < gimple_phi_num_args (phi); i++)
-		{
-		  tree arg = PHI_ARG_DEF (phi, i);
-		  if (TREE_CODE (arg) != SSA_NAME)
-		    {
-		      all_p_same = 0;
-		      break;
-		    }
-		  else if (all_p_same == -1)
-		    {
-		      p = partition_find (map->var_partition,
-					  SSA_NAME_VERSION (arg));
-		      all_p_same = 1;
-		    }
-		  else if (all_p_same == 1
-			   && p != partition_find (map->var_partition,
-						   SSA_NAME_VERSION (arg)))
-		    {
-		      all_p_same = 0;
-		      break;
-		    }
-		}
-	      if (all_p_same == 1)
-		copy_rename_partition_coalesce (map, res,
-						PHI_ARG_DEF (phi, 0),
-						debug);
-	    }
-        }
-    }
-
-  if (debug)
-    dump_var_map (debug, map);
-
-  /* Now one more pass to make all elements of a partition share the same
-     root variable.  */
-
-  for (x = 1; x < num_ssa_names; x++)
-    {
-      part_var = partition_to_var (map, x);
-      if (!part_var)
-        continue;
-      var = ssa_name (x);
-      if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var))
-	continue;
-      if (debug)
-        {
-	  fprintf (debug, "Coalesced ");
-	  print_generic_expr (debug, var, TDF_SLIM);
-	  fprintf (debug, " to ");
-	  print_generic_expr (debug, part_var, TDF_SLIM);
-	  fprintf (debug, "\n");
-	}
-      stats.coalesced++;
-      replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var));
-    }
-
-  statistics_counter_event (fun, "copies coalesced",
-			    stats.coalesced);
-  delete_var_map (map);
-  return 0;
-}
-
-} // anon namespace
-
-gimple_opt_pass *
-make_pass_rename_ssa_copies (gcc::context *ctxt)
-{
-  return new pass_rename_ssa_copies (ctxt);
-}
diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
index 5b00f58..4772558 100644
--- a/gcc/tree-ssa-live.c
+++ b/gcc/tree-ssa-live.c
@@ -70,88 +70,6 @@ static void  verify_live_on_entry (tree_live_info_p);
    ssa_name or variable, and vice versa.  */
 
 
-/* Hashtable helpers.  */
-
-struct tree_int_map_hasher : nofree_ptr_hash <tree_int_map>
-{
-  static inline hashval_t hash (const tree_int_map *);
-  static inline bool equal (const tree_int_map *, const tree_int_map *);
-};
-
-inline hashval_t
-tree_int_map_hasher::hash (const tree_int_map *v)
-{
-  return tree_map_base_hash (v);
-}
-
-inline bool
-tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
-{
-  return tree_int_map_eq (v, c);
-}
-
-
-/* This routine will initialize the basevar fields of MAP.  */
-
-static void
-var_map_base_init (var_map map)
-{
-  int x, num_part;
-  tree var;
-  struct tree_int_map *m, *mapstorage;
-
-  num_part = num_var_partitions (map);
-  hash_table<tree_int_map_hasher> tree_to_index (num_part);
-  /* We can have at most num_part entries in the hash tables, so it's
-     enough to allocate so many map elements once, saving some malloc
-     calls.  */
-  mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
-
-  /* If a base table already exists, clear it, otherwise create it.  */
-  free (map->partition_to_base_index);
-  map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
-
-  /* Build the base variable list, and point partitions at their bases.  */
-  for (x = 0; x < num_part; x++)
-    {
-      struct tree_int_map **slot;
-      unsigned baseindex;
-      var = partition_to_var (map, x);
-      if (SSA_NAME_VAR (var)
-	  && (!VAR_P (SSA_NAME_VAR (var))
-	      || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
-	m->base.from = SSA_NAME_VAR (var);
-      else
-	/* This restricts what anonymous SSA names we can coalesce
-	   as it restricts the sets we compute conflicts for.
-	   Using TREE_TYPE to generate sets is the easies as
-	   type equivalency also holds for SSA names with the same
-	   underlying decl. 
-
-	   Check gimple_can_coalesce_p when changing this code.  */
-	m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
-			? TYPE_CANONICAL (TREE_TYPE (var))
-			: TREE_TYPE (var));
-      /* If base variable hasn't been seen, set it up.  */
-      slot = tree_to_index.find_slot (m, INSERT);
-      if (!*slot)
-	{
-	  baseindex = m - mapstorage;
-	  m->to = baseindex;
-	  *slot = m;
-	  m++;
-	}
-      else
-	baseindex = (*slot)->to;
-      map->partition_to_base_index[x] = baseindex;
-    }
-
-  map->num_basevars = m - mapstorage;
-
-  free (mapstorage);
-}
-
-
 /* Remove the base table in MAP.  */
 
 static void
@@ -329,21 +247,17 @@ partition_view_fini (var_map map, bitmap selected)
 }
 
 
-/* Create a partition view which includes all the used partitions in MAP.  If
-   WANT_BASES is true, create the base variable map as well.  */
+/* Create a partition view which includes all the used partitions in MAP.  */
 
 void
-partition_view_normal (var_map map, bool want_bases)
+partition_view_normal (var_map map)
 {
   bitmap used;
 
   used = partition_view_init (map);
   partition_view_fini (map, used);
 
-  if (want_bases)
-    var_map_base_init (map);
-  else
-    var_map_base_fini (map);
+  var_map_base_fini (map);
 }
 
 
@@ -352,7 +266,7 @@ partition_view_normal (var_map map, bool want_bases)
    as well.  */
 
 void
-partition_view_bitmap (var_map map, bitmap only, bool want_bases)
+partition_view_bitmap (var_map map, bitmap only)
 {
   bitmap used;
   bitmap new_partitions = BITMAP_ALLOC (NULL);
@@ -368,10 +282,7 @@ partition_view_bitmap (var_map map, bitmap only, bool want_bases)
     }
   partition_view_fini (map, new_partitions);
 
-  if (want_bases)
-    var_map_base_init (map);
-  else
-    var_map_base_fini (map);
+  var_map_base_fini (map);
 }
 
 
diff --git a/gcc/tree-ssa-live.h b/gcc/tree-ssa-live.h
index d5d7820..1f88358 100644
--- a/gcc/tree-ssa-live.h
+++ b/gcc/tree-ssa-live.h
@@ -71,8 +71,8 @@ typedef struct _var_map
 extern var_map init_var_map (int);
 extern void delete_var_map (var_map);
 extern int var_union (var_map, tree, tree);
-extern void partition_view_normal (var_map, bool);
-extern void partition_view_bitmap (var_map, bitmap, bool);
+extern void partition_view_normal (var_map);
+extern void partition_view_bitmap (var_map, bitmap);
 extern void dump_scope_blocks (FILE *, int);
 extern void debug_scope_block (tree, int);
 extern void debug_scope_blocks (int);
diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c
index 437f69d..1fbd71e 100644
--- a/gcc/tree-ssa-uncprop.c
+++ b/gcc/tree-ssa-uncprop.c
@@ -38,6 +38,11 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-pass.h"
 #include "tree-ssa-propagate.h"
 #include "tree-hash-traits.h"
+#include "bitmap.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-ssa-live.h"
+#include "tree-ssa-coalesce.h"
 
 /* The basic structure describing an equivalency created by traversing
    an edge.  Traversing the edge effectively means that we can assume
diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
index b5b0cb6..e10f775 100644
--- a/gcc/var-tracking.c
+++ b/gcc/var-tracking.c
@@ -4909,12 +4909,16 @@ dataflow_set_remove_mem_locs (variable_def **slot, dataflow_set *set)
    registers, as well as associations between MEMs and VALUEs.  */
 
 static void
-dataflow_set_clear_at_call (dataflow_set *set)
+dataflow_set_clear_at_call (dataflow_set *set, rtx_insn *call_insn)
 {
   unsigned int r;
   hard_reg_set_iterator hrsi;
+  HARD_REG_SET invalidated_regs;
 
-  EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, r, hrsi)
+  get_call_reg_set_usage (call_insn, &invalidated_regs,
+			  regs_invalidated_by_call);
+
+  EXECUTE_IF_SET_IN_HARD_REG_SET (invalidated_regs, 0, r, hrsi)
     var_regno_delete (set, r);
 
   if (MAY_HAVE_DEBUG_INSNS)
@@ -6698,7 +6702,7 @@ compute_bb_dataflow (basic_block bb)
       switch (mo->type)
 	{
 	  case MO_CALL:
-	    dataflow_set_clear_at_call (out);
+	    dataflow_set_clear_at_call (out, insn);
 	    break;
 
 	  case MO_USE:
@@ -9160,7 +9164,7 @@ emit_notes_in_bb (basic_block bb, dataflow_set *set)
       switch (mo->type)
 	{
 	  case MO_CALL:
-	    dataflow_set_clear_at_call (set);
+	    dataflow_set_clear_at_call (set, insn);
 	    emit_notes_for_changes (insn, EMIT_NOTE_AFTER_CALL_INSN, set->vars);
 	    {
 	      rtx arguments = mo->u.loc, *p = &arguments;
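
As an aside on the var-tracking.c hunks just above: the effect of passing the call insn down can be pictured with a small standalone model.  This is only a sketch, not GCC code.  The names reg_set, call_site, call_reg_set_usage and clear_at_call are made up for illustration, and it assumes (as the shape of the get_call_reg_set_usage call suggests) that a per-call register set is used when one is known, with regs_invalidated_by_call as the fallback.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for HARD_REG_SET: one bit per hard register.  */
typedef uint32_t reg_set;

/* Hypothetical stand-in for a call insn, which may or may not carry a
   precise set of registers clobbered by the callee.  */
struct call_site
{
  bool has_usage;
  reg_set clobbered;
};

/* Toy analogue of the lookup: store the per-call set in *OUT if it is
   known, otherwise DEFAULT_SET.  Return whether precise information
   was available.  */
static bool
call_reg_set_usage (const struct call_site *call, reg_set *out,
                    reg_set default_set)
{
  *out = call->has_usage ? call->clobbered : default_set;
  return call->has_usage;
}

/* TRACKED has one bit set per register currently holding a tracked
   location.  Return the registers that survive CALL.  */
static reg_set
clear_at_call (reg_set tracked, const struct call_site *call,
               reg_set default_set)
{
  reg_set invalidated;

  call_reg_set_usage (call, &invalidated, default_set);
  return tracked & ~invalidated;
}
```

In this model, a call known to clobber only register 0 keeps the tracked locations in registers 1 and 2 alive, whereas a call with no recorded usage loses everything in the default set.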



These are the incremental fixes:

diff --git a/gcc/explow.c b/gcc/explow.c
index 6dba6e5..6941f4e 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -852,6 +852,13 @@ promote_ssa_mode (const_tree name, int *punsignedp)
 {
   gcc_assert (TREE_CODE (name) == SSA_NAME);
 
+  /* Partitions holding parms and results must be promoted as expected
+     by function.c.  */
+  if (SSA_NAME_VAR (name)
+      && (TREE_CODE (SSA_NAME_VAR (name)) == PARM_DECL
+	  || TREE_CODE (SSA_NAME_VAR (name)) == RESULT_DECL))
+    return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
+
   tree type = TREE_TYPE (name);
   int unsignedp = TYPE_UNSIGNED (type);
   machine_mode mode = TYPE_MODE (type);
diff --git a/gcc/function.c b/gcc/function.c
index 840f4a2..753d889 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -2920,14 +2920,16 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
     {
       DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
       stack_parm = rtl_for_parm (all, parm);
-      if (!stack_parm)
-	stack_parm = assign_stack_local (BLKmode, size_stored,
-					 DECL_ALIGN (parm));
-      else
+      if (stack_parm)
 	stack_parm = copy_rtx (stack_parm);
-      if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
-	PUT_MODE (stack_parm, GET_MODE (entry_parm));
-      set_mem_attributes (stack_parm, parm, 1);
+      else
+	{
+	  stack_parm = assign_stack_local (BLKmode, size_stored,
+					   DECL_ALIGN (parm));
+	  if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
+	    PUT_MODE (stack_parm, GET_MODE (entry_parm));
+	  set_mem_attributes (stack_parm, parm, 1);
+	}
     }
 
   /* If a BLKmode arrives in registers, copy it to a stack slot.  Handle
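
To picture the explow.c fix, here is a toy model of the dispatch, assuming a made-up target on which parms and results are always widened to a word while other values keep the mode of their type.  None of this is GCC code: the enums, the structs and the *_toy functions are hypothetical, and the real promote_decl_mode and promote_mode rules are target-dependent.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy machine modes and decl kinds (hypothetical).  */
enum mode { MODE_QI, MODE_HI, MODE_SI };
enum decl_kind { KIND_VAR, KIND_PARM, KIND_RESULT };

struct decl
{
  enum decl_kind kind;
  enum mode declared;
};

/* An SSA name has a type mode and, unless anonymous, a base decl.  */
struct ssa_name
{
  const struct decl *var;   /* NULL for anonymous names.  */
  enum mode type_mode;
};

/* Assumed ABI rule for this toy target: parms and results always live
   in word-sized (MODE_SI) registers.  */
static enum mode
promote_decl_mode_toy (const struct decl *d)
{
  (void) d;
  return MODE_SI;
}

/* Names based on a parm or result follow the decl's promotion, so the
   partition's mode matches what parm/result setup expects; all other
   names follow their type (no promotion on this toy target).  */
static enum mode
promote_ssa_mode_toy (const struct ssa_name *name)
{
  if (name->var
      && (name->var->kind == KIND_PARM || name->var->kind == KIND_RESULT))
    return promote_decl_mode_toy (name->var);

  return name->type_mode;
}

/* Only names that promote to the same mode are candidates for sharing
   a partition.  */
static bool
same_promoted_mode_p (const struct ssa_name *a, const struct ssa_name *b)
{
  return promote_ssa_mode_toy (a) == promote_ssa_mode_toy (b);
}
```

With these rules, a MODE_HI name based on a parm promotes to MODE_SI and so would not be grouped with an anonymous MODE_HI temporary, while a MODE_HI name based on an ordinary variable would.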


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-16  7:58                   ` Alexandre Oliva
@ 2015-07-16  8:50                     ` Richard Biener
  2015-07-16 21:33                       ` Alexandre Oliva
  0 siblings, 1 reply; 127+ messages in thread
From: Richard Biener @ 2015-07-16  8:50 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Jeff Law, GCC Patches, Christophe Lyon, David Edelsohn, Eric Botcazou

On Thu, Jul 16, 2015 at 9:29 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Jun 10, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
>
>> On Wed, Jun 10, 2015 at 2:24 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
>>> This caused the sparc regression reported by Eric in
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64164#c37
>
>>> We need to match the mode of the rtl created for the partition and the
>>> promoted mode expected for the parm.  I recall working to make parm and
>>> result decls the partition leaders, so that promote_ssa_mode would DTRT,
>>> but this escaped my mind when revisiting the patch after some time on
>>> another project.
>
> FWIW, during the development of this improvement, I dropped the notion
> of making parm and result decls partition leaders, and instead only
> considered eligible for coalescing into the same partition SSA_NAMEs
> that promoted to the same mode.
>
>> Alternatively not coalesce SSA names when promote_decl_mode gives
>> different answers (for their underlying decl)?  It sounds wrong to do that
>> (if that is really what happens).
>
> Exactly.  I've now restored the promote_decl_mode behavior to
> promote_ssa_mode for PARM_ and RESULT_DECLs, so that the strategy
> described above works again.  This fixed the sparc regression.
>
> On Jun  9, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
>
>> On Jun  9, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
>
>>> On Jun  9, 2015, David Edelsohn <dje.gcc@gmail.com> wrote:
>>>> This also broke bootstrap on PPC64 LE Linux with the same error.
>
>>> Thanks for your reports.  I'm looking into the problem.
>
>>> I'd appreciate a preprocessed testcase from either of you to confirm the
>>> fix, if not to help debug it.
>
>> The first potential source for this problem that jumped at me would be
>> silenced with this change:
>
>> diff --git a/gcc/function.c b/gcc/function.c
>> index 8bcc352..9201ed9 100644
>> --- a/gcc/function.c
>> +++ b/gcc/function.c
>> @@ -2974,7 +2974,8 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>>       stack_parm = copy_rtx (stack_parm);
>>        if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
>>       PUT_MODE (stack_parm, GET_MODE (entry_parm));
>> -      set_mem_attributes (stack_parm, parm, 1);
>> +      if (GET_CODE (stack_parm) == MEM)
>> +     set_mem_attributes (stack_parm, parm, 1);
>>      }
>
>>    /* If a BLKmode arrives in registers, copy it to a stack slot.  Handle
>
> I ended up fixing this in a slightly different way, running the original
> code above, from assign_stack_local to set_mem_attributes, only when
> rtl_for_parm does not obtain an assignment set up by out-of-ssa.
>
>> but I suspect there might be other similar issues lurking in function.c
>> after my attempt to turn parm assignment upside down ;-)
>
> There weren't, after all.
>
> On Jun  9, 2015, David Edelsohn <dje.gcc@gmail.com> wrote:
>
>> This patch clearly should have been tested on more
>> architectures than x86 before being approved and merged.
>
> The following patch was regstrapped on x86_64-linux-gnu and
> i686-pc-linux-gnu.  I've also cross-built all-target successfully for
> targets aarch64-elf, arm-eabi, arm-symbianelf, avr-elf, bfin-elf,
> cr16-elf, cris-elf, crisv32-elf, epiphany-elf, fido-elf, fr30-elf,
> frv-elf, i686-elf, lm32-elf, m68k-elf, mcore-elf, microblaze-elf,
> mips64el-elf, mips64-elf, mips64orion-elf, mipsel-elf,
> mipsisa32-elfoabi, mipsisa64-elfoabi, mipsisa64r2el-elf,
> mipsisa64r2-sde-elf, mipsisa64sb1-elf, mipstx39-elf, mn10300-elf,
> moxie-elf, nds32be-elf, nds32le-elf, nios2-elf, powerpc-eabialtivec,
> powerpc-eabisimaltivec, powerpc-eabisim, powerpc-eabispe, powerpc-eabi,
> powerpcle-eabisim, powerpcle-eabi, powerpcle-elf, ppc-eabi, ppc-elf,
> rx-elf, sh-elf, sh-superh-elf, sparc64-elf, sparc-elf, spu-elf, and
> visium-elf, and got the same build failures before and after the patch
> with targets c6x-elf, ft32-elf, h8300-elf, ia64-elf, iq2000-elf,
> m32c-elf, m32r-elf, m32rle-elf, mep-elf, mips64vr-elf
> (mips64vr-elf/mips16/newlib/libm/math/lib_a_e_hypot.o failed to build
> with the patch and passed without it, but there were other "invalid
> operand" failures for "lwu" insns without the patch, so I'm counting the
> e_hypot failure as present but latent before), mipsisa64sr71k-elf,
> msp430-elf, pdp11-aout, powerpc-xilinx-eabi, ppc64-eabi, rl78-elf,
> sh64-elf, sparc-leon-elf, v850e-elf, v850-elf, xstormy16-elf, and
> xtensa-elf.
>
> This patch differs from the previous one in four ways: I dropped the
> hunk I had put in loop_exits_before_overflow, since that problem was
> noticed and fixed independently (PR66638); I updated
> tree_int_map_hasher, which changed on the trunk in tree-ssa-live.c but
> which the patch moves to tree-ssa-coalesce.c; I resolved other
> conflicts in files that had #includes added both by the patch and by
> other changes; and I put in the two fixes mentioned above.  After the
> full updated patch, I enclose a diff with just these two additional
> fixes, to ease the review.
>
> Is this ok to install?

Yes.

Thanks again for taking care of this!

Richard.

>
> for  gcc/ChangeLog
>
>         PR rtl-optimization/64164
>         * Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
>         * tree-ssa-copyrename.c: Removed.
>         * opts.c (default_options_table): Drop -ftree-copyrename.  Add
>         -ftree-coalesce-vars.
>         * passes.def: Drop all occurrences of pass_rename_ssa_copies.
>         * common.opt (ftree-copyrename): Ignore.
>         (ftree-coalesce-inlined-vars): Likewise.
>         * doc/invoke.texi: Remove the ignored options above.
>         * gimple-expr.h (gimple_can_coalesce_p): Move declaration
>         * tree-ssa-coalesce.h: ... here.
>         * tree-ssa-uncprop.c: Include tree-ssa-coalesce.h and other
>         headers required by it.
>         * gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
>         across variables when flag_tree_coalesce_vars.  Check register
>         use and promoted modes to allow coalescing.  Moved to
>         tree-ssa-coalesce.c.
>         * tree-ssa-live.c (struct tree_int_map_hasher): Move along
>         with its member functions to tree-ssa-coalesce.c.
>         (var_map_base_init): Likewise.  Renamed to
>         compute_samebase_partition_bases.
>         (partition_view_normal): Drop want_bases parameter.
>         (partition_view_bitmap): Likewise.
>         * tree-ssa-live.h: Adjust declarations.
>         * tree-ssa-coalesce.c: Include explow.h.
>         (build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs's
>         default defs at the entry point.
>         (dump_part_var_map): New.
>         (compute_optimized_partition_bases): New, called by...
>         (coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
>         of compute_samebase_partition_bases.  Adjust.
>         * alias.c (nonoverlapping_memrefs_p): Disregard gimple-regs.
>         * cfgexpand.c (leader_merge): New.
>         (get_rtl_for_parm_ssa_default_def): New.
>         (set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
>         vars.  Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
>         (expand_one_stack_var_at): Handle anonymous SSA_NAMEs.  Drop
>         redundant MEM attr setting.
>         (expand_one_stack_var_1): Handle anonymous SSA_NAMEs.  Renamed
>         from...
>         (expand_one_stack_var): ... this.  New wrapper to check and
>         skip already expanded SSA partitions.
>         (record_alignment_for_reg_var): New, factored out of...
>         (expand_one_var): ... this.
>         (expand_one_ssa_partition): New.
>         (adjust_one_expanded_partition_var): New.
>         (expand_one_register_var): Check and skip already expanded SSA
>         partitions.
>         (expand_used_vars): Don't create DECLs for anonymous SSA
>         names.  Expand all SSA partitions, then adjust all SSA names.
>         (pass::execute): Replace the loops that set
>         SA.partition_to_pseudo from partition leaders and cleared
>         DECL_RTL for multi-location variables, and that which used to
>         rename vars and set attrs, with one that clears DECL_RTL and
>         checks that PARMs and RESULTs default_defs match DECL_RTL.
>         * cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
>         * emit-rtl.c (set_reg_attrs_for_parm): Handle NULL decl.
>         * explow.c (promote_ssa_mode): New.
>         * explow.h (promote_ssa_mode): Declare.
>         * expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
>         * function.c: Include cfgexpand.h.
>         (use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
>         (use_register_for_parm_decl): Wrapper for the above to
>         special-case the result_ptr.
>         (rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
>         (maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
>         multiple locations.
>         (assign_parm_adjust_stack_rtl): Add all and parm arguments,
>         for rtl_for_parm.  For SSA-assigned parms, zero stack_parm.
>         (assign_parm_setup_block): Prefer SSA-assigned location.
>         (assign_parm_setup_reg): Likewise.  Use entry_parm for equiv
>         if stack_parm is NULL.
>         (assign_parm_setup_stack): Prefer SSA-assigned location.
>         (assign_parms): Maybe reset DECL_RTL of params.  Adjust stack
>         rtl before testing for pointer bounds.  Special-case result_ptr.
>         (expand_function_start): Maybe reset DECL_RTL of result.
>         Prefer SSA-assigned location for result and static chain.
>         Factor out DECL_RESULT and SET_DECL_RTL.
>         * tree-outof-ssa.c (insert_value_copy_on_edge): Handle
>         anonymous SSA names.  Use promote_ssa_mode.
>         (get_temp_reg): Likewise.
>         (remove_ssa_form): Adjust.
>         * var-tracking.c (dataflow_set_clear_at_call): Take call_insn
>         and get its reg_usage for reg invalidation.
>         (compute_bb_dataflow): Pass it insn.
>         (emit_notes_in_bb): Likewise.
>
> for  gcc/testsuite/ChangeLog
>
>         * gcc.dg/guality/pr54200.c: Add -fno-tree-coalesce-vars.
>         * gcc.dg/ssp-1.c: Make counter a register.
>         * gcc.dg/ssp-2.c: Likewise.
>         * gcc.dg/torture/parm-coalesce.c: New.
> ---
>  gcc/Makefile.in                              |    1
>  gcc/alias.c                                  |   13 +
>  gcc/cfgexpand.c                              |  370 +++++++++++++++-----
>  gcc/cfgexpand.h                              |    2
>  gcc/common.opt                               |   12 -
>  gcc/doc/invoke.texi                          |   48 +--
>  gcc/emit-rtl.c                               |    5
>  gcc/explow.c                                 |   22 +
>  gcc/explow.h                                 |    3
>  gcc/expr.c                                   |   39 +-
>  gcc/function.c                               |  228 ++++++++++--
>  gcc/gimple-expr.c                            |   39 --
>  gcc/gimple-expr.h                            |    1
>  gcc/opts.c                                   |    2
>  gcc/passes.def                               |    5
>  gcc/testsuite/gcc.dg/guality/pr54200.c       |    2
>  gcc/testsuite/gcc.dg/ssp-1.c                 |    2
>  gcc/testsuite/gcc.dg/ssp-2.c                 |    2
>  gcc/testsuite/gcc.dg/torture/parm-coalesce.c |   40 ++
>  gcc/tree-outof-ssa.c                         |   16 -
>  gcc/tree-ssa-coalesce.c                      |  378 ++++++++++++++++++++-
>  gcc/tree-ssa-coalesce.h                      |    1
>  gcc/tree-ssa-copyrename.c                    |  475 --------------------------
>  gcc/tree-ssa-live.c                          |   99 -----
>  gcc/tree-ssa-live.h                          |    4
>  gcc/tree-ssa-uncprop.c                       |    5
>  gcc/var-tracking.c                           |   12 -
>  27 files changed, 979 insertions(+), 847 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/torture/parm-coalesce.c
>  delete mode 100644 gcc/tree-ssa-copyrename.c
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index bf2186a..b36f9c1 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1445,7 +1445,6 @@ OBJS = \
>         tree-ssa-ccp.o \
>         tree-ssa-coalesce.o \
>         tree-ssa-copy.o \
> -       tree-ssa-copyrename.o \
>         tree-ssa-dce.o \
>         tree-ssa-dom.o \
>         tree-ssa-dse.o \
> diff --git a/gcc/alias.c b/gcc/alias.c
> index 3203722..69e3732 100644
> --- a/gcc/alias.c
> +++ b/gcc/alias.c
> @@ -2507,6 +2507,19 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
>    if (! DECL_P (exprx) || ! DECL_P (expry))
>      return 0;
>
> +  /* If we refer to different gimple registers, or one gimple register
> +     and one non-gimple-register, we know they can't overlap.  First,
> +     gimple registers don't have their addresses taken.  Now, there
> +     could be more than one stack slot for (different versions of) the
> +     same gimple register, but we can presumably tell they don't
> +     overlap based on offsets from stack base addresses elsewhere.
> +     It's important that we don't proceed to DECL_RTL, because gimple
> +     registers may not pass DECL_RTL_SET_P, and make_decl_rtl won't be
> +     able to do anything about them since no SSA information will have
> +     remained to guide it.  */
> +  if (is_gimple_reg (exprx) || is_gimple_reg (expry))
> +    return exprx != expry;
> +
>    /* With invalid code we can end up storing into the constant pool.
>       Bail out to avoid ICEing when creating RTL for this.
>       See gfortran.dg/lto/20091028-2_0.f90.  */
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index a047632..0b19953 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -150,21 +150,121 @@ gimple_assign_rhs_to_tree (gimple stmt)
>
>  #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
>
> +/* Choose either CUR or NEXT as the leader DECL for a partition.
> +   Prefer ignored decls, to simplify debug dumps and reduce ambiguity
> +   out of the same user variable being in multiple partitions (this is
> +   less likely for compiler-introduced temps).  */
> +
> +static tree
> +leader_merge (tree cur, tree next)
> +{
> +  if (cur == NULL || cur == next)
> +    return next;
> +
> +  if (DECL_P (cur) && DECL_IGNORED_P (cur))
> +    return cur;
> +
> +  if (DECL_P (next) && DECL_IGNORED_P (next))
> +    return next;
> +
> +  return cur;
> +}
> +
> +
> +/* Return the RTL for the default SSA def of a PARM or RESULT, if
> +   there is one.  */
> +
> +rtx
> +get_rtl_for_parm_ssa_default_def (tree var)
> +{
> +  gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL);
> +
> +  if (!is_gimple_reg (var))
> +    return NULL_RTX;
> +
> +  /* If we've already determined RTL for the decl, use it.  This is
> +     not just an optimization: if VAR is a PARM whose incoming value
> +     is unused, we won't find a default def to use its partition, but
> +     we still want to use the location of the parm, if it was used at
> +     all.  During assign_parms, until a location is assigned for the
> +     VAR, RTL can only be set for a parm or result if we're not coalescing
> +     across variables, when we know we're coalescing all SSA_NAMEs of
> +     each parm or result, and we're not coalescing them with names
> +     pertaining to other variables, such as other parms' default
> +     defs.  */
> +  if (DECL_RTL_SET_P (var))
> +    {
> +      gcc_assert (DECL_RTL (var) != pc_rtx);
> +      return DECL_RTL (var);
> +    }
> +
> +  tree name = ssa_default_def (cfun, var);
> +
> +  if (!name)
> +    return NULL_RTX;
> +
> +  int part = var_to_partition (SA.map, name);
> +  if (part == NO_PARTITION)
> +    return NULL_RTX;
> +
> +  return SA.partition_to_pseudo[part];
> +}
> +
>  /* Associate declaration T with storage space X.  If T is no
>     SSA name this is exactly SET_DECL_RTL, otherwise make the
>     partition of T associated with X.  */
>  static inline void
>  set_rtl (tree t, rtx x)
>  {
> +  if (x && SSAVAR (t))
> +    {
> +      bool skip = false;
> +      tree cur = NULL_TREE;
> +
> +      if (MEM_P (x))
> +       cur = MEM_EXPR (x);
> +      else if (REG_P (x))
> +       cur = REG_EXPR (x);
> +      else if (GET_CODE (x) == CONCAT
> +              && REG_P (XEXP (x, 0)))
> +       cur = REG_EXPR (XEXP (x, 0));
> +      else if (GET_CODE (x) == PARALLEL)
> +       cur = REG_EXPR (XVECEXP (x, 0, 0));
> +      else if (x == pc_rtx)
> +       skip = true;
> +      else
> +       gcc_unreachable ();
> +
> +      tree next = skip ? cur : leader_merge (cur, SSAVAR (t));
> +
> +      if (cur != next)
> +       {
> +         if (MEM_P (x))
> +           set_mem_attributes (x, next, true);
> +         else
> +           set_reg_attrs_for_decl_rtl (next, x);
> +       }
> +    }
> +
>    if (TREE_CODE (t) == SSA_NAME)
>      {
> -      SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x;
> -      if (x && !MEM_P (x))
> -       set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x);
> -      /* For the benefit of debug information at -O0 (where vartracking
> -         doesn't run) record the place also in the base DECL if it's
> -        a normal variable (not a parameter).  */
> -      if (x && x != pc_rtx && TREE_CODE (SSA_NAME_VAR (t)) == VAR_DECL)
> +      int part = var_to_partition (SA.map, t);
> +      if (part != NO_PARTITION)
> +       {
> +         if (SA.partition_to_pseudo[part])
> +           gcc_assert (SA.partition_to_pseudo[part] == x);
> +         else
> +           SA.partition_to_pseudo[part] = x;
> +       }
> +      /* For the benefit of debug information at -O0 (where
> +         vartracking doesn't run) record the place also in the base
> +         DECL.  For PARMs and RESULTs, we may end up resetting these
> +         in function.c:maybe_reset_rtl_for_parm, but in some rare
> +         cases we may need them (unused and overwritten incoming
> +         value, that at -O0 must share the location with the other
> +         uses in spite of the missing default def), and this may be
> +         the only chance to preserve them.  */
> +      if (x && x != pc_rtx && SSA_NAME_VAR (t))
>         {
>           tree var = SSA_NAME_VAR (t);
>           /* If we don't yet have something recorded, just record it now.  */
> @@ -862,7 +962,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
>    gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
>
>    x = plus_constant (Pmode, base, offset);
> -  x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x);
> +  x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
> +                  ? TYPE_MODE (TREE_TYPE (decl))
> +                  : DECL_MODE (SSAVAR (decl)), x);
>
>    if (TREE_CODE (decl) != SSA_NAME)
>      {
> @@ -884,7 +986,6 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
>        DECL_USER_ALIGN (decl) = 0;
>      }
>
> -  set_mem_attributes (x, SSAVAR (decl), true);
>    set_rtl (decl, x);
>  }
>
> @@ -1099,13 +1200,22 @@ account_stack_vars (void)
>     to a variable to be allocated in the stack frame.  */
>
>  static void
> -expand_one_stack_var (tree var)
> +expand_one_stack_var_1 (tree var)
>  {
>    HOST_WIDE_INT size, offset;
>    unsigned byte_align;
>
> -  size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
> -  byte_align = align_local_variable (SSAVAR (var));
> +  if (TREE_CODE (var) == SSA_NAME)
> +    {
> +      tree type = TREE_TYPE (var);
> +      size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
> +      byte_align = TYPE_ALIGN_UNIT (type);
> +    }
> +  else
> +    {
> +      size = tree_to_uhwi (DECL_SIZE_UNIT (var));
> +      byte_align = align_local_variable (var);
> +    }
>
>    /* We handle highly aligned variables in expand_stack_vars.  */
>    gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
> @@ -1116,6 +1226,27 @@ expand_one_stack_var (tree var)
>                            crtl->max_used_stack_slot_alignment, offset);
>  }
>
> +/* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are
> +   already assigned some MEM.  */
> +
> +static void
> +expand_one_stack_var (tree var)
> +{
> +  if (TREE_CODE (var) == SSA_NAME)
> +    {
> +      int part = var_to_partition (SA.map, var);
> +      if (part != NO_PARTITION)
> +       {
> +         rtx x = SA.partition_to_pseudo[part];
> +         gcc_assert (x);
> +         gcc_assert (MEM_P (x));
> +         return;
> +       }
> +    }
> +
> +  return expand_one_stack_var_1 (var);
> +}
> +
>  /* A subroutine of expand_one_var.  Called to assign rtl to a VAR_DECL
>     that will reside in a hard register.  */
>
> @@ -1125,13 +1256,114 @@ expand_one_hard_reg_var (tree var)
>    rest_of_decl_compilation (var, 0, 0);
>  }
>
> +/* Record the alignment requirements of some variable assigned to a
> +   pseudo.  */
> +
> +static void
> +record_alignment_for_reg_var (unsigned int align)
> +{
> +  if (SUPPORTS_STACK_ALIGNMENT
> +      && crtl->stack_alignment_estimated < align)
> +    {
> +      /* stack_alignment_estimated shouldn't change after the stack
> +         realign decision has been made.  */
> +      gcc_assert (!crtl->stack_realign_processed);
> +      crtl->stack_alignment_estimated = align;
> +    }
> +
> +  /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
> +     So here we only make sure stack_alignment_needed >= align.  */
> +  if (crtl->stack_alignment_needed < align)
> +    crtl->stack_alignment_needed = align;
> +  if (crtl->max_used_stack_slot_alignment < align)
> +    crtl->max_used_stack_slot_alignment = align;
> +}
> +
> +/* Create RTL for an SSA partition.  */
> +
> +static void
> +expand_one_ssa_partition (tree var)
> +{
> +  int part = var_to_partition (SA.map, var);
> +  gcc_assert (part != NO_PARTITION);
> +
> +  if (SA.partition_to_pseudo[part])
> +    return;
> +
> +  if (!use_register_for_decl (var))
> +    {
> +      expand_one_stack_var_1 (var);
> +      return;
> +    }
> +
> +  unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (var),
> +                                         TYPE_MODE (TREE_TYPE (var)),
> +                                         TYPE_ALIGN (TREE_TYPE (var)));
> +
> +  /* If the variable alignment is very large we'll dynamically allocate
> +     it, which means that the in-frame portion is just a pointer.  */
> +  if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
> +    align = POINTER_SIZE;
> +
> +  record_alignment_for_reg_var (align);
> +
> +  machine_mode reg_mode = promote_ssa_mode (var, NULL);
> +
> +  rtx x = gen_reg_rtx (reg_mode);
> +
> +  set_rtl (var, x);
> +}
> +
> +/* Record the association between the RTL generated for a partition
> +   and the underlying variable of the SSA_NAME.  */
> +
> +static void
> +adjust_one_expanded_partition_var (tree var)
> +{
> +  if (!var)
> +    return;
> +
> +  tree decl = SSA_NAME_VAR (var);
> +
> +  int part = var_to_partition (SA.map, var);
> +  if (part == NO_PARTITION)
> +    return;
> +
> +  rtx x = SA.partition_to_pseudo[part];
> +
> +  set_rtl (var, x);
> +
> +  if (!REG_P (x))
> +    return;
> +
> +  /* Note if the object is a user variable.  */
> +  if (decl && !DECL_ARTIFICIAL (decl))
> +    mark_user_reg (x);
> +
> +  if (POINTER_TYPE_P (decl ? TREE_TYPE (decl) : TREE_TYPE (var)))
> +    mark_reg_pointer (x, get_pointer_alignment (var));
> +}
> +
>  /* A subroutine of expand_one_var.  Called to assign rtl to a VAR_DECL
>     that will reside in a pseudo register.  */
>
>  static void
>  expand_one_register_var (tree var)
>  {
> -  tree decl = SSAVAR (var);
> +  if (TREE_CODE (var) == SSA_NAME)
> +    {
> +      int part = var_to_partition (SA.map, var);
> +      if (part != NO_PARTITION)
> +       {
> +         rtx x = SA.partition_to_pseudo[part];
> +         gcc_assert (x);
> +         gcc_assert (REG_P (x));
> +         return;
> +       }
> +      gcc_unreachable ();
> +    }
> +
> +  tree decl = var;
>    tree type = TREE_TYPE (decl);
>    machine_mode reg_mode = promote_decl_mode (decl, NULL);
>    rtx x = gen_reg_rtx (reg_mode);
> @@ -1265,21 +1497,7 @@ expand_one_var (tree var, bool toplevel, bool really_expand)
>         align = POINTER_SIZE;
>      }
>
> -  if (SUPPORTS_STACK_ALIGNMENT
> -      && crtl->stack_alignment_estimated < align)
> -    {
> -      /* stack_alignment_estimated shouldn't change after stack
> -         realign decision made */
> -      gcc_assert (!crtl->stack_realign_processed);
> -      crtl->stack_alignment_estimated = align;
> -    }
> -
> -  /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
> -     So here we only make sure stack_alignment_needed >= align.  */
> -  if (crtl->stack_alignment_needed < align)
> -    crtl->stack_alignment_needed = align;
> -  if (crtl->max_used_stack_slot_alignment < align)
> -    crtl->max_used_stack_slot_alignment = align;
> +  record_alignment_for_reg_var (align);
>
>    if (TREE_CODE (origvar) == SSA_NAME)
>      {
> @@ -1713,48 +1931,18 @@ expand_used_vars (void)
>    if (targetm.use_pseudo_pic_reg ())
>      pic_offset_table_rtx = gen_reg_rtx (Pmode);
>
> -  hash_map<tree, tree> ssa_name_decls;
>    for (i = 0; i < SA.map->num_partitions; i++)
>      {
>        tree var = partition_to_var (SA.map, i);
>
>        gcc_assert (!virtual_operand_p (var));
>
> -      /* Assign decls to each SSA name partition, share decls for partitions
> -         we could have coalesced (those with the same type).  */
> -      if (SSA_NAME_VAR (var) == NULL_TREE)
> -       {
> -         tree *slot = &ssa_name_decls.get_or_insert (TREE_TYPE (var));
> -         if (!*slot)
> -           *slot = create_tmp_reg (TREE_TYPE (var));
> -         replace_ssa_name_symbol (var, *slot);
> -       }
> -
> -      /* Always allocate space for partitions based on VAR_DECLs.  But for
> -        those based on PARM_DECLs or RESULT_DECLs and which matter for the
> -        debug info, there is no need to do so if optimization is disabled
> -        because all the SSA_NAMEs based on these DECLs have been coalesced
> -        into a single partition, which is thus assigned the canonical RTL
> -        location of the DECLs.  If in_lto_p, we can't rely on optimize,
> -        a function could be compiled with -O1 -flto first and only the
> -        link performed at -O0.  */
> -      if (TREE_CODE (SSA_NAME_VAR (var)) == VAR_DECL)
> -       expand_one_var (var, true, true);
> -      else if (DECL_IGNORED_P (SSA_NAME_VAR (var)) || optimize || in_lto_p)
> -       {
> -         /* This is a PARM_DECL or RESULT_DECL.  For those partitions that
> -            contain the default def (representing the parm or result itself)
> -            we don't do anything here.  But those which don't contain the
> -            default def (representing a temporary based on the parm/result)
> -            we need to allocate space just like for normal VAR_DECLs.  */
> -         if (!bitmap_bit_p (SA.partition_has_default_def, i))
> -           {
> -             expand_one_var (var, true, true);
> -             gcc_assert (SA.partition_to_pseudo[i]);
> -           }
> -       }
> +      expand_one_ssa_partition (var);
>      }
>
> +  for (i = 1; i < num_ssa_names; i++)
> +    adjust_one_expanded_partition_var (ssa_name (i));
> +
>    if (flag_stack_protect == SPCT_FLAG_STRONG)
>        gen_stack_protect_signal
>         = stack_protect_decl_p () || stack_protect_return_slot_p ();
> @@ -5928,35 +6116,6 @@ pass_expand::execute (function *fun)
>        parm_birth_insn = var_seq;
>      }
>
> -  /* Now that we also have the parameter RTXs, copy them over to our
> -     partitions.  */
> -  for (i = 0; i < SA.map->num_partitions; i++)
> -    {
> -      tree var = SSA_NAME_VAR (partition_to_var (SA.map, i));
> -
> -      if (TREE_CODE (var) != VAR_DECL
> -         && !SA.partition_to_pseudo[i])
> -       SA.partition_to_pseudo[i] = DECL_RTL_IF_SET (var);
> -      gcc_assert (SA.partition_to_pseudo[i]);
> -
> -      /* If this decl was marked as living in multiple places, reset
> -        this now to NULL.  */
> -      if (DECL_RTL_IF_SET (var) == pc_rtx)
> -       SET_DECL_RTL (var, NULL);
> -
> -      /* Some RTL parts really want to look at DECL_RTL(x) when x
> -        was a decl marked in REG_ATTR or MEM_ATTR.  We could use
> -        SET_DECL_RTL here making this available, but that would mean
> -        to select one of the potentially many RTLs for one DECL.  Instead
> -        of doing that we simply reset the MEM_EXPR of the RTL in question,
> -        then nobody can get at it and hence nobody can call DECL_RTL on it.  */
> -      if (!DECL_RTL_SET_P (var))
> -       {
> -         if (MEM_P (SA.partition_to_pseudo[i]))
> -           set_mem_expr (SA.partition_to_pseudo[i], NULL);
> -       }
> -    }
> -
>    /* If we have a class containing differently aligned pointers
>       we need to merge those into the corresponding RTL pointer
>       alignment.  */
> @@ -5964,7 +6123,6 @@ pass_expand::execute (function *fun)
>      {
>        tree name = ssa_name (i);
>        int part;
> -      rtx r;
>
>        if (!name
>           /* We might have generated new SSA names in
> @@ -5977,20 +6135,24 @@ pass_expand::execute (function *fun)
>        if (part == NO_PARTITION)
>         continue;
>
> -      /* Adjust all partition members to get the underlying decl of
> -        the representative which we might have created in expand_one_var.  */
> -      if (SSA_NAME_VAR (name) == NULL_TREE)
> +      gcc_assert (SA.partition_to_pseudo[part]);
> +
> +      /* If this decl was marked as living in multiple places, reset
> +        this now to NULL.  */
> +      tree var = SSA_NAME_VAR (name);
> +      if (var && DECL_RTL_IF_SET (var) == pc_rtx)
> +       SET_DECL_RTL (var, NULL);
> +      /* Check that the pseudos chosen by assign_parms are those of
> +        the corresponding default defs.  */
> +      else if (SSA_NAME_IS_DEFAULT_DEF (name)
> +              && (TREE_CODE (var) == PARM_DECL
> +                  || TREE_CODE (var) == RESULT_DECL))
>         {
> -         tree leader = partition_to_var (SA.map, part);
> -         gcc_assert (SSA_NAME_VAR (leader) != NULL_TREE);
> -         replace_ssa_name_symbol (name, SSA_NAME_VAR (leader));
> +         rtx in = DECL_RTL_IF_SET (var);
> +         gcc_assert (in);
> +         rtx out = SA.partition_to_pseudo[part];
> +         gcc_assert (in == out || rtx_equal_p (in, out));
>         }
> -      if (!POINTER_TYPE_P (TREE_TYPE (name)))
> -       continue;
> -
> -      r = SA.partition_to_pseudo[part];
> -      if (REG_P (r))
> -       mark_reg_pointer (r, get_pointer_alignment (name));
>      }
>
>    /* If this function is `main', emit a call to `__main'
> diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
> index a0b6e3e..602579d 100644
> --- a/gcc/cfgexpand.h
> +++ b/gcc/cfgexpand.h
> @@ -22,5 +22,7 @@ along with GCC; see the file COPYING3.  If not see
>
>  extern tree gimple_assign_rhs_to_tree (gimple);
>  extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
> +extern rtx get_rtl_for_parm_ssa_default_def (tree var);
> +
>
>  #endif /* GCC_CFGEXPAND_H */
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 6b2ccbc..89dcabf 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2230,16 +2230,16 @@ Common Report Var(flag_tree_ch) Optimization
>  Enable loop header copying on trees
>
>  ftree-coalesce-inlined-vars
> -Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization
> -Enable coalescing of copy-related user variables that are inlined
> +Common Ignore RejectNegative
> +Does nothing.  Preserved for backward compatibility.
>
>  ftree-coalesce-vars
> -Common Report Var(flag_ssa_coalesce_vars,2) Optimization
> -Enable coalescing of all copy-related user variables
> +Common Report Var(flag_tree_coalesce_vars) Optimization
> +Enable SSA coalescing of user variables
>
>  ftree-copyrename
> -Common Report Var(flag_tree_copyrename) Optimization
> -Replace SSA temporaries with better names in copies
> +Common Ignore
> +Does nothing.  Preserved for backward compatibility.
>
>  ftree-copy-prop
>  Common Report Var(flag_tree_copy_prop) Optimization
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 522e924..681c33e 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -339,7 +339,6 @@ Objective-C and Objective-C++ Dialects}.
>  -fdump-tree-phiprop@r{[}-@var{n}@r{]} @gol
>  -fdump-tree-phiopt@r{[}-@var{n}@r{]} @gol
>  -fdump-tree-forwprop@r{[}-@var{n}@r{]} @gol
> --fdump-tree-copyrename@r{[}-@var{n}@r{]} @gol
>  -fdump-tree-nrv -fdump-tree-vect @gol
>  -fdump-tree-sink @gol
>  -fdump-tree-sra@r{[}-@var{n}@r{]} @gol
> @@ -445,9 +444,8 @@ Objective-C and Objective-C++ Dialects}.
>  -fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol
>  -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
>  -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
> --ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
> --ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
> --ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
> +-ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol
> +-ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
>  -ftree-loop-if-convert-stores -ftree-loop-im @gol
>  -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
>  -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
> @@ -7078,11 +7076,6 @@ name is made by appending @file{.phiopt} to the source file name.
>  Dump each function after forward propagating single use variables.  The file
>  name is made by appending @file{.forwprop} to the source file name.
>
> -@item copyrename
> -@opindex fdump-tree-copyrename
> -Dump each function after applying the copy rename optimization.  The file
> -name is made by appending @file{.copyrename} to the source file name.
> -
>  @item nrv
>  @opindex fdump-tree-nrv
>  Dump each function after applying the named return value optimization on
> @@ -7547,8 +7540,8 @@ compilation time.
>  -ftree-ccp @gol
>  -fssa-phiopt @gol
>  -ftree-ch @gol
> +-ftree-coalesce-vars @gol
>  -ftree-copy-prop @gol
> --ftree-copyrename @gol
>  -ftree-dce @gol
>  -ftree-dominator-opts @gol
>  -ftree-dse @gol
> @@ -8817,6 +8810,15 @@ profitable to parallelize the loops.
>  Compare the results of several data dependence analyzers.  This option
>  is used for debugging the data dependence analyzers.
>
> +@item -ftree-coalesce-vars
> +@opindex ftree-coalesce-vars
> +Tell the compiler to attempt to combine small user-defined variables
> +too, instead of just compiler temporaries.  This may severely limit the
> +ability to debug an optimized program compiled with
> +@option{-fno-var-tracking-assignments}.  In the negated form, this flag
> +prevents SSA coalescing of user variables.  This option is enabled by
> +default if optimization is enabled.
> +
>  @item -ftree-loop-if-convert
>  @opindex ftree-loop-if-convert
>  Attempt to transform conditional jumps in the innermost loops to
> @@ -8930,32 +8932,6 @@ Perform scalar replacement of aggregates.  This pass replaces structure
>  references with scalars to prevent committing structures to memory too
>  early.  This flag is enabled by default at @option{-O} and higher.
>
> -@item -ftree-copyrename
> -@opindex ftree-copyrename
> -Perform copy renaming on trees.  This pass attempts to rename compiler
> -temporaries to other variables at copy locations, usually resulting in
> -variable names which more closely resemble the original variables.  This flag
> -is enabled by default at @option{-O} and higher.
> -
> -@item -ftree-coalesce-inlined-vars
> -@opindex ftree-coalesce-inlined-vars
> -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
> -combine small user-defined variables too, but only if they are inlined
> -from other functions.  It is a more limited form of
> -@option{-ftree-coalesce-vars}.  This may harm debug information of such
> -inlined variables, but it keeps variables of the inlined-into
> -function apart from each other, such that they are more likely to
> -contain the expected values in a debugging session.
> -
> -@item -ftree-coalesce-vars
> -@opindex ftree-coalesce-vars
> -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
> -combine small user-defined variables too, instead of just compiler
> -temporaries.  This may severely limit the ability to debug an optimized
> -program compiled with @option{-fno-var-tracking-assignments}.  In the
> -negated form, this flag prevents SSA coalescing of user variables,
> -including inlined ones.  This option is enabled by default.
> -
>  @item -ftree-ter
>  @opindex ftree-ter
>  Perform temporary expression replacement during the SSA->normal phase.  Single
> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
> index ed2b30b..0648af6 100644
> --- a/gcc/emit-rtl.c
> +++ b/gcc/emit-rtl.c
> @@ -1232,6 +1232,9 @@ set_reg_attrs_for_parm (rtx parm_rtx, rtx mem)
>  void
>  set_reg_attrs_for_decl_rtl (tree t, rtx x)
>  {
> +  if (!t)
> +    return;
> +  tree tdecl = t;
>    if (GET_CODE (x) == SUBREG)
>      {
>        gcc_assert (subreg_lowpart_p (x));
> @@ -1240,7 +1243,7 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x)
>    if (REG_P (x))
>      REG_ATTRS (x)
>        = get_reg_attrs (t, byte_lowpart_offset (GET_MODE (x),
> -                                              DECL_MODE (t)));
> +                                              DECL_MODE (tdecl)));
>    if (GET_CODE (x) == CONCAT)
>      {
>        if (REG_P (XEXP (x, 0)))
> diff --git a/gcc/explow.c b/gcc/explow.c
> index bd342c1..6dba6e5 100644
> --- a/gcc/explow.c
> +++ b/gcc/explow.c
> @@ -842,6 +842,28 @@ promote_decl_mode (const_tree decl, int *punsignedp)
>    return pmode;
>  }
>
> +/* Return the promoted mode for NAME.  If it is a named SSA_NAME, it
> +   is the same as promote_decl_mode.  Otherwise, it is the promoted
> +   mode of a temp decl of same type as the SSA_NAME, if we had created
> +   one.  */
> +
> +machine_mode
> +promote_ssa_mode (const_tree name, int *punsignedp)
> +{
> +  gcc_assert (TREE_CODE (name) == SSA_NAME);
> +
> +  tree type = TREE_TYPE (name);
> +  int unsignedp = TYPE_UNSIGNED (type);
> +  machine_mode mode = TYPE_MODE (type);
> +
> +  machine_mode pmode = promote_mode (type, mode, &unsignedp);
> +  if (punsignedp)
> +    *punsignedp = unsignedp;
> +
> +  return pmode;
> +}
> +
> +
>
>  /* Controls the behaviour of {anti_,}adjust_stack.  */
>  static bool suppress_reg_args_size;
> diff --git a/gcc/explow.h b/gcc/explow.h
> index 94613de..52113db 100644
> --- a/gcc/explow.h
> +++ b/gcc/explow.h
> @@ -57,6 +57,9 @@ extern machine_mode promote_mode (const_tree, machine_mode, int *);
>  /* Return mode and signedness to use when object is promoted.  */
>  machine_mode promote_decl_mode (const_tree, int *);
>
> +/* Return mode and signedness to use when an SSA_NAME is promoted.  */
> +machine_mode promote_ssa_mode (const_tree, int *);
> +
>  /* Remove some bytes from the stack.  An rtx says how many.  */
>  extern void adjust_stack (rtx);
>
> diff --git a/gcc/expr.c b/gcc/expr.c
> index 899a42c..d601129 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -9246,7 +9246,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>    rtx op0, op1, temp, decl_rtl;
>    tree type;
>    int unsignedp;
> -  machine_mode mode;
> +  machine_mode mode, dmode;
>    enum tree_code code = TREE_CODE (exp);
>    rtx subtarget, original_target;
>    int ignore;
> @@ -9377,7 +9377,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>        if (g == NULL
>           && modifier == EXPAND_INITIALIZER
>           && !SSA_NAME_IS_DEFAULT_DEF (exp)
> -         && (optimize || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
> +         && (optimize || !SSA_NAME_VAR (exp)
> +             || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
>           && stmt_is_replaceable_p (SSA_NAME_DEF_STMT (exp)))
>         g = SSA_NAME_DEF_STMT (exp);
>        if (g)
> @@ -9456,15 +9457,18 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>        /* Ensure variable marked as used even if it doesn't go through
>          a parser.  If it hasn't be used yet, write out an external
>          definition.  */
> -      TREE_USED (exp) = 1;
> +      if (exp)
> +       TREE_USED (exp) = 1;
>
>        /* Show we haven't gotten RTL for this yet.  */
>        temp = 0;
>
>        /* Variables inherited from containing functions should have
>          been lowered by this point.  */
> -      context = decl_function_context (exp);
> -      gcc_assert (SCOPE_FILE_SCOPE_P (context)
> +      if (exp)
> +       context = decl_function_context (exp);
> +      gcc_assert (!exp
> +                 || SCOPE_FILE_SCOPE_P (context)
>                   || context == current_function_decl
>                   || TREE_STATIC (exp)
>                   || DECL_EXTERNAL (exp)
> @@ -9488,7 +9492,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>           decl_rtl = use_anchored_address (decl_rtl);
>           if (modifier != EXPAND_CONST_ADDRESS
>               && modifier != EXPAND_SUM
> -             && !memory_address_addr_space_p (DECL_MODE (exp),
> +             && !memory_address_addr_space_p (exp ? DECL_MODE (exp)
> +                                              : GET_MODE (decl_rtl),
>                                                XEXP (decl_rtl, 0),
>                                                MEM_ADDR_SPACE (decl_rtl)))
>             temp = replace_equiv_address (decl_rtl,
> @@ -9499,12 +9504,17 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>          if the address is a register.  */
>        if (temp != 0)
>         {
> -         if (MEM_P (temp) && REG_P (XEXP (temp, 0)))
> +         if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0)))
>             mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp));
>
>           return temp;
>         }
>
> +      if (exp)
> +       dmode = DECL_MODE (exp);
> +      else
> +       dmode = TYPE_MODE (TREE_TYPE (ssa_name));
> +
>        /* If the mode of DECL_RTL does not match that of the decl,
>          there are two cases: we are dealing with a BLKmode value
>          that is returned in a register, or we are dealing with
> @@ -9512,22 +9522,23 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>          of the wanted mode, but mark it so that we know that it
>          was already extended.  */
>        if (REG_P (decl_rtl)
> -         && DECL_MODE (exp) != BLKmode
> -         && GET_MODE (decl_rtl) != DECL_MODE (exp))
> +         && dmode != BLKmode
> +         && GET_MODE (decl_rtl) != dmode)
>         {
>           machine_mode pmode;
>
>           /* Get the signedness to be used for this variable.  Ensure we get
>              the same mode we got when the variable was declared.  */
> -         if (code == SSA_NAME
> -             && (g = SSA_NAME_DEF_STMT (ssa_name))
> -             && gimple_code (g) == GIMPLE_CALL
> -             && !gimple_call_internal_p (g))
> +         if (code != SSA_NAME)
> +           pmode = promote_decl_mode (exp, &unsignedp);
> +         else if ((g = SSA_NAME_DEF_STMT (ssa_name))
> +                  && gimple_code (g) == GIMPLE_CALL
> +                  && !gimple_call_internal_p (g))
>             pmode = promote_function_mode (type, mode, &unsignedp,
>                                            gimple_call_fntype (g),
>                                            2);
>           else
> -           pmode = promote_decl_mode (exp, &unsignedp);
> +           pmode = promote_ssa_mode (ssa_name, &unsignedp);
>           gcc_assert (GET_MODE (decl_rtl) == pmode);
>
>           temp = gen_lowpart_SUBREG (mode, decl_rtl);
> diff --git a/gcc/function.c b/gcc/function.c
> index f9d11bf4..840f4a2 100644
> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -72,6 +72,9 @@ along with GCC; see the file COPYING3.  If not see
>  #include "cfganal.h"
>  #include "cfgbuild.h"
>  #include "cfgcleanup.h"
> +#include "cfgexpand.h"
> +#include "basic-block.h"
> +#include "df.h"
>  #include "params.h"
>  #include "bb-reorder.h"
>  #include "shrink-wrap.h"
> @@ -2105,6 +2108,30 @@ aggregate_value_p (const_tree exp, const_tree fntype)
>  bool
>  use_register_for_decl (const_tree decl)
>  {
> +  if (TREE_CODE (decl) == SSA_NAME)
> +    {
> +      /* We often try to use the SSA_NAME, instead of its underlying
> +        decl, to get type information and guide decisions, to avoid
> +        differences of behavior between anonymous and named
> +        variables, but in this one case we have to go for the actual
> +        variable if there is one.  The main reason is that, at least
> +        at -O0, we want to place user variables on the stack, but we
> +        don't mind using pseudos for anonymous or ignored temps.
> +        Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
> +        should go in pseudos, whereas their corresponding variables
> +        might have to go on the stack.  So, disregarding the decl
> +        here would negatively impact debug info at -O0, enable
> +        coalescing between SSA_NAMEs that ought to get different
> +        stack/pseudo assignments, and get the incoming argument
> +        processing thoroughly confused by PARM_DECLs expected to live
> +        in stack slots but assigned to pseudos.  */
> +      if (!SSA_NAME_VAR (decl))
> +       return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
> +         && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
> +
> +      decl = SSA_NAME_VAR (decl);
> +    }
> +
>    if (!targetm.calls.allocate_stack_slots_for_args ())
>      return true;
>
> @@ -2745,23 +2772,88 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data)
>    data->entry_parm = entry_parm;
>  }
>
> +/* Wrapper for use_register_for_decl that special-cases the
> +   .result_ptr as the function's RESULT_DECL when the RESULT_DECL is
> +   passed by reference.  */
> +
> +static bool
> +use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm)
> +{
> +  if (parm == all->function_result_decl)
> +    {
> +      tree result = DECL_RESULT (current_function_decl);
> +
> +      if (DECL_BY_REFERENCE (result))
> +       parm = result;
> +    }
> +
> +  return use_register_for_decl (parm);
> +}
> +
> +/* Wrapper for get_rtl_for_parm_ssa_default_def that special-cases
> +   the .result_ptr as the function's RESULT_DECL when the RESULT_DECL
> +   is passed by reference.  */
> +
> +static rtx
> +rtl_for_parm (struct assign_parm_data_all *all, tree parm)
> +{
> +  if (parm == all->function_result_decl)
> +    {
> +      tree result = DECL_RESULT (current_function_decl);
> +
> +      if (!DECL_BY_REFERENCE (result))
> +       return NULL_RTX;
> +
> +      parm = result;
> +    }
> +
> +  return get_rtl_for_parm_ssa_default_def (parm);
> +}
> +
> +/* Reset the location of PARM_DECLs and RESULT_DECLs that had
> +   SSA_NAMEs in multiple partitions, so that assign_parms will choose
> +   the default def, if it exists, or create new RTL to hold the unused
> +   entry value.  If we are coalescing across variables, we want to
> +   reset the location too, because a parm without a default def
> +   (incoming value unused) might be coalesced with one with a default
> +   def, and then assign_parms would copy both incoming values to the
> +   same location, which might cause the wrong value to survive.  */
> +static void
> +maybe_reset_rtl_for_parm (tree parm)
> +{
> +  gcc_assert (TREE_CODE (parm) == PARM_DECL
> +             || TREE_CODE (parm) == RESULT_DECL);
> +  if ((flag_tree_coalesce_vars
> +       || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx))
> +      && is_gimple_reg (parm))
> +    SET_DECL_RTL (parm, NULL_RTX);
> +}
> +
>  /* A subroutine of assign_parms.  Adjust DATA->STACK_RTL such that it's
>     always valid and properly aligned.  */
>
>  static void
> -assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data)
> +assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm,
> +                             struct assign_parm_data_one *data)
>  {
>    rtx stack_parm = data->stack_parm;
>
> +  /* If out-of-SSA assigned RTL to the parm default def, make sure we
> +     don't use what we might have computed before.  */
> +  rtx ssa_assigned = rtl_for_parm (all, parm);
> +  if (ssa_assigned)
> +    stack_parm = NULL;
> +
>    /* If we can't trust the parm stack slot to be aligned enough for its
>       ultimate type, don't use that slot after entry.  We'll make another
>       stack slot, if we need one.  */
> -  if (stack_parm
> -      && ((STRICT_ALIGNMENT
> -          && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm))
> -         || (data->nominal_type
> -             && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
> -             && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
> +  else if (stack_parm
> +          && ((STRICT_ALIGNMENT
> +               && (GET_MODE_ALIGNMENT (data->nominal_mode)
> +                   > MEM_ALIGN (stack_parm)))
> +              || (data->nominal_type
> +                  && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
> +                  && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
>      stack_parm = NULL;
>
>    /* If parm was passed in memory, and we need to convert it on entry,
> @@ -2823,11 +2915,16 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>
>    size = int_size_in_bytes (data->passed_type);
>    size_stored = CEIL_ROUND (size, UNITS_PER_WORD);
> +
>    if (stack_parm == 0)
>      {
>        DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
> -      stack_parm = assign_stack_local (BLKmode, size_stored,
> -                                      DECL_ALIGN (parm));
> +      stack_parm = rtl_for_parm (all, parm);
> +      if (!stack_parm)
> +       stack_parm = assign_stack_local (BLKmode, size_stored,
> +                                        DECL_ALIGN (parm));
> +      else
> +       stack_parm = copy_rtx (stack_parm);
>        if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
>         PUT_MODE (stack_parm, GET_MODE (entry_parm));
>        set_mem_attributes (stack_parm, parm, 1);
> @@ -2968,10 +3065,19 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>      = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
>                              TREE_TYPE (current_function_decl), 2);
>
> -  parmreg = gen_reg_rtx (promoted_nominal_mode);
> +  rtx from_expand = rtl_for_parm (all, parm);
>
> -  if (!DECL_ARTIFICIAL (parm))
> -    mark_user_reg (parmreg);
> +  if (from_expand && !data->passed_pointer)
> +    {
> +      parmreg = from_expand;
> +      gcc_assert (GET_MODE (parmreg) == promoted_nominal_mode);
> +    }
> +  else
> +    {
> +      parmreg = gen_reg_rtx (promoted_nominal_mode);
> +      if (!DECL_ARTIFICIAL (parm))
> +       mark_user_reg (parmreg);
> +    }
>
>    /* If this was an item that we received a pointer to,
>       set DECL_RTL appropriately.  */
> @@ -2990,6 +3096,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>       assign_parm_find_data_types and expand_expr_real_1.  */
>
>    equiv_stack_parm = data->stack_parm;
> +  if (!equiv_stack_parm)
> +    equiv_stack_parm = data->entry_parm;
>    validated_mem = validize_mem (copy_rtx (data->entry_parm));
>
>    need_conversion = (data->nominal_mode != data->passed_mode
> @@ -3130,11 +3238,17 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>
>    /* If we were passed a pointer but the actual value can safely live
>       in a register, retrieve it and use it directly.  */
> -  if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode)
> +  if (data->passed_pointer
> +      && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode))
>      {
>        /* We can't use nominal_mode, because it will have been set to
>          Pmode above.  We must use the actual mode of the parm.  */
> -      if (use_register_for_decl (parm))
> +      if (from_expand)
> +       {
> +         parmreg = from_expand;
> +         gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm)));
> +       }
> +      else if (use_register_for_decl (parm))
>         {
>           parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm)));
>           mark_user_reg (parmreg);
> @@ -3174,7 +3288,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>
>        /* STACK_PARM is the pointer, not the parm, and PARMREG is
>          now the parm.  */
> -      data->stack_parm = NULL;
> +      data->stack_parm = equiv_stack_parm = NULL;
>      }
>
>    /* Mark the register as eliminable if we did no conversion and it was
> @@ -3184,11 +3298,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>       make here would screw up life analysis for it.  */
>    if (data->nominal_mode == data->passed_mode
>        && !did_conversion
> -      && data->stack_parm != 0
> -      && MEM_P (data->stack_parm)
> +      && equiv_stack_parm != 0
> +      && MEM_P (equiv_stack_parm)
>        && data->locate.offset.var == 0
>        && reg_mentioned_p (virtual_incoming_args_rtx,
> -                         XEXP (data->stack_parm, 0)))
> +                         XEXP (equiv_stack_parm, 0)))
>      {
>        rtx_insn *linsn = get_last_insn ();
>        rtx_insn *sinsn;
> @@ -3201,8 +3315,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>             = GET_MODE_INNER (GET_MODE (parmreg));
>           int regnor = REGNO (XEXP (parmreg, 0));
>           int regnoi = REGNO (XEXP (parmreg, 1));
> -         rtx stackr = adjust_address_nv (data->stack_parm, submode, 0);
> -         rtx stacki = adjust_address_nv (data->stack_parm, submode,
> +         rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0);
> +         rtx stacki = adjust_address_nv (equiv_stack_parm, submode,
>                                           GET_MODE_SIZE (submode));
>
>           /* Scan backwards for the set of the real and
> @@ -3275,6 +3389,13 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
>
>        if (data->stack_parm == 0)
>         {
> +         rtx x = data->stack_parm = rtl_for_parm (all, parm);
> +         if (x)
> +           gcc_assert (GET_MODE (x) == GET_MODE (data->entry_parm));
> +       }
> +
> +      if (data->stack_parm == 0)
> +       {
>           int align = STACK_SLOT_ALIGNMENT (data->passed_type,
>                                             GET_MODE (data->entry_parm),
>                                             TYPE_ALIGN (data->passed_type));
> @@ -3531,6 +3652,8 @@ assign_parms (tree fndecl)
>           DECL_INCOMING_RTL (parm) = DECL_RTL (parm);
>           continue;
>         }
> +      else
> +       maybe_reset_rtl_for_parm (parm);
>
>        /* Estimate stack alignment from parameter alignment.  */
>        if (SUPPORTS_STACK_ALIGNMENT)
> @@ -3580,7 +3703,9 @@ assign_parms (tree fndecl)
>        else
>         set_decl_incoming_rtl (parm, data.entry_parm, false);
>
> -      /* Boudns should be loaded in the particular order to
> +      assign_parm_adjust_stack_rtl (&all, parm, &data);
> +
> +      /* Bounds should be loaded in the particular order to
>          have registers allocated correctly.  Collect info about
>          input bounds and load them later.  */
>        if (POINTER_BOUNDS_TYPE_P (data.passed_type))
> @@ -3597,11 +3722,10 @@ assign_parms (tree fndecl)
>         }
>        else
>         {
> -         assign_parm_adjust_stack_rtl (&data);
> -
>           if (assign_parm_setup_block_p (&data))
>             assign_parm_setup_block (&all, parm, &data);
> -         else if (data.passed_pointer || use_register_for_decl (parm))
> +         else if (data.passed_pointer
> +                  || use_register_for_parm_decl (&all, parm))
>             assign_parm_setup_reg (&all, parm, &data);
>           else
>             assign_parm_setup_stack (&all, parm, &data);
> @@ -4932,7 +5056,9 @@ expand_function_start (tree subr)
>       before any library calls that assign parms might generate.  */
>
>    /* Decide whether to return the value in memory or in a register.  */
> -  if (aggregate_value_p (DECL_RESULT (subr), subr))
> +  tree res = DECL_RESULT (subr);
> +  maybe_reset_rtl_for_parm (res);
> +  if (aggregate_value_p (res, subr))
>      {
>        /* Returning something that won't go in a register.  */
>        rtx value_address = 0;
> @@ -4940,7 +5066,7 @@ expand_function_start (tree subr)
>  #ifdef PCC_STATIC_STRUCT_RETURN
>        if (cfun->returns_pcc_struct)
>         {
> -         int size = int_size_in_bytes (TREE_TYPE (DECL_RESULT (subr)));
> +         int size = int_size_in_bytes (TREE_TYPE (res));
>           value_address = assemble_static_space (size);
>         }
>        else
> @@ -4952,36 +5078,45 @@ expand_function_start (tree subr)
>              it.  */
>           if (sv)
>             {
> -             value_address = gen_reg_rtx (Pmode);
> +             if (DECL_BY_REFERENCE (res))
> +               value_address = get_rtl_for_parm_ssa_default_def (res);
> +             if (!value_address)
> +               value_address = gen_reg_rtx (Pmode);
>               emit_move_insn (value_address, sv);
>             }
>         }
>        if (value_address)
>         {
>           rtx x = value_address;
> -         if (!DECL_BY_REFERENCE (DECL_RESULT (subr)))
> +         if (!DECL_BY_REFERENCE (res))
>             {
> -             x = gen_rtx_MEM (DECL_MODE (DECL_RESULT (subr)), x);
> -             set_mem_attributes (x, DECL_RESULT (subr), 1);
> +             x = get_rtl_for_parm_ssa_default_def (res);
> +             if (!x)
> +               {
> +                 x = gen_rtx_MEM (DECL_MODE (res), value_address);
> +                 set_mem_attributes (x, res, 1);
> +               }
>             }
> -         SET_DECL_RTL (DECL_RESULT (subr), x);
> +         SET_DECL_RTL (res, x);
>         }
>      }
> -  else if (DECL_MODE (DECL_RESULT (subr)) == VOIDmode)
> +  else if (DECL_MODE (res) == VOIDmode)
>      /* If return mode is void, this decl rtl should not be used.  */
> -    SET_DECL_RTL (DECL_RESULT (subr), NULL_RTX);
> +    SET_DECL_RTL (res, NULL_RTX);
>    else
>      {
>        /* Compute the return values into a pseudo reg, which we will copy
>          into the true return register after the cleanups are done.  */
> -      tree return_type = TREE_TYPE (DECL_RESULT (subr));
> -      if (TYPE_MODE (return_type) != BLKmode
> -         && targetm.calls.return_in_msb (return_type))
> +      tree return_type = TREE_TYPE (res);
> +      rtx x = get_rtl_for_parm_ssa_default_def (res);
> +      if (x)
> +       /* Use it.  */;
> +      else if (TYPE_MODE (return_type) != BLKmode
> +              && targetm.calls.return_in_msb (return_type))
>         /* expand_function_end will insert the appropriate padding in
>            this case.  Use the return value's natural (unpadded) mode
>            within the function proper.  */
> -       SET_DECL_RTL (DECL_RESULT (subr),
> -                     gen_reg_rtx (TYPE_MODE (return_type)));
> +       x = gen_reg_rtx (TYPE_MODE (return_type));
>        else
>         {
>           /* In order to figure out what mode to use for the pseudo, we
> @@ -4992,25 +5127,26 @@ expand_function_start (tree subr)
>           /* Structures that are returned in registers are not
>              aggregate_value_p, so we may see a PARALLEL or a REG.  */
>           if (REG_P (hard_reg))
> -           SET_DECL_RTL (DECL_RESULT (subr),
> -                         gen_reg_rtx (GET_MODE (hard_reg)));
> +           x = gen_reg_rtx (GET_MODE (hard_reg));
>           else
>             {
>               gcc_assert (GET_CODE (hard_reg) == PARALLEL);
> -             SET_DECL_RTL (DECL_RESULT (subr), gen_group_rtx (hard_reg));
> +             x = gen_group_rtx (hard_reg);
>             }
>         }
>
> +      SET_DECL_RTL (res, x);
> +
>        /* Set DECL_REGISTER flag so that expand_function_end will copy the
>          result to the real return register(s).  */
> -      DECL_REGISTER (DECL_RESULT (subr)) = 1;
> +      DECL_REGISTER (res) = 1;
>
>        if (chkp_function_instrumented_p (current_function_decl))
>         {
> -         tree return_type = TREE_TYPE (DECL_RESULT (subr));
> +         tree return_type = TREE_TYPE (res);
>           rtx bounds = targetm.calls.chkp_function_value_bounds (return_type,
>                                                                  subr, 1);
> -         SET_DECL_BOUNDS_RTL (DECL_RESULT (subr), bounds);
> +         SET_DECL_BOUNDS_RTL (res, bounds);
>         }
>      }
>
> @@ -5025,7 +5161,9 @@ expand_function_start (tree subr)
>        rtx local, chain;
>       rtx_insn *insn;
>
> -      local = gen_reg_rtx (Pmode);
> +      local = get_rtl_for_parm_ssa_default_def (parm);
> +      if (!local)
> +       local = gen_reg_rtx (Pmode);
>        chain = targetm.calls.static_chain (current_function_decl, true);
>
>        set_decl_incoming_rtl (parm, chain, false);
> diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
> index b558d90..baed630 100644
> --- a/gcc/gimple-expr.c
> +++ b/gcc/gimple-expr.c
> @@ -375,45 +375,6 @@ copy_var_decl (tree var, tree name, tree type)
>    return copy;
>  }
>
> -/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
> -   coalescing together, false otherwise.
> -
> -   This must stay consistent with var_map_base_init in tree-ssa-live.c.  */
> -
> -bool
> -gimple_can_coalesce_p (tree name1, tree name2)
> -{
> -  /* First check the SSA_NAME's associated DECL.  We only want to
> -     coalesce if they have the same DECL or both have no associated DECL.  */
> -  tree var1 = SSA_NAME_VAR (name1);
> -  tree var2 = SSA_NAME_VAR (name2);
> -  var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
> -  var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
> -  if (var1 != var2)
> -    return false;
> -
> -  /* Now check the types.  If the types are the same, then we should
> -     try to coalesce V1 and V2.  */
> -  tree t1 = TREE_TYPE (name1);
> -  tree t2 = TREE_TYPE (name2);
> -  if (t1 == t2)
> -    return true;
> -
> -  /* If the types are not the same, check for a canonical type match.  This
> -     (for example) allows coalescing when the types are fundamentally the
> -     same, but just have different names.
> -
> -     Note pointer types with different address spaces may have the same
> -     canonical type.  Those are rejected for coalescing by the
> -     types_compatible_p check.  */
> -  if (TYPE_CANONICAL (t1)
> -      && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
> -      && types_compatible_p (t1, t2))
> -    return true;
> -
> -  return false;
> -}
> -
>  /* Strip off a legitimate source ending from the input string NAME of
>     length LEN.  Rather than having to know the names used by all of
>     our front ends, we strip off an ending of a period followed by
> diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
> index ed23eb2..3d1c89f 100644
> --- a/gcc/gimple-expr.h
> +++ b/gcc/gimple-expr.h
> @@ -28,7 +28,6 @@ extern gimple_seq gimple_body (tree);
>  extern bool gimple_has_body_p (tree);
>  extern const char *gimple_decl_printable_name (tree, int);
>  extern tree copy_var_decl (tree, tree, tree);
> -extern bool gimple_can_coalesce_p (tree, tree);
>  extern tree create_tmp_var_name (const char *);
>  extern tree create_tmp_var_raw (tree, const char * = NULL);
>  extern tree create_tmp_var (tree, const char * = NULL);
> diff --git a/gcc/opts.c b/gcc/opts.c
> index 468a802..f22edd3 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -445,12 +445,12 @@ static const struct default_options default_options_table[] =
>      { OPT_LEVELS_1_PLUS, OPT_fsplit_wide_types, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_ccp, NULL, 1 },
>      { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_bit_ccp, NULL, 1 },
> +    { OPT_LEVELS_1_PLUS, OPT_ftree_coalesce_vars, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_dce, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_dominator_opts, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 },
>      { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 },
> -    { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 },
> diff --git a/gcc/passes.def b/gcc/passes.def
> index 5cd07ae..103fd2e 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -77,7 +77,6 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_all_early_optimizations);
>        PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
>           NEXT_PASS (pass_remove_cgraph_callee_edges);
> -         NEXT_PASS (pass_rename_ssa_copies);
>           NEXT_PASS (pass_object_sizes);
>           NEXT_PASS (pass_ccp);
>           /* After CCP we rewrite no longer addressed locals into SSA
> @@ -155,7 +154,6 @@ along with GCC; see the file COPYING3.  If not see
>        /* Initial scalar cleanups before alias computation.
>          They ensure memory accesses are not indirect wherever possible.  */
>        NEXT_PASS (pass_strip_predict_hints);
> -      NEXT_PASS (pass_rename_ssa_copies);
>        NEXT_PASS (pass_ccp);
>        /* After CCP we rewrite no longer addressed locals into SSA
>          form if possible.  */
> @@ -183,7 +181,6 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_ch);
>        NEXT_PASS (pass_lower_complex);
>        NEXT_PASS (pass_sra);
> -      NEXT_PASS (pass_rename_ssa_copies);
>        /* The dom pass will also resolve all __builtin_constant_p calls
>           that are still there to 0.  This has to be done after some
>          propagations have already run, but before some more dead code
> @@ -294,7 +291,6 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_fold_builtins);
>        NEXT_PASS (pass_optimize_widening_mul);
>        NEXT_PASS (pass_tail_calls);
> -      NEXT_PASS (pass_rename_ssa_copies);
>        /* FIXME: If DCE is not run before checking for uninitialized uses,
>          we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
>          However, this also causes us to misdiagnose cases that should be
> @@ -329,7 +325,6 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_dce);
>        NEXT_PASS (pass_asan);
>        NEXT_PASS (pass_tsan);
> -      NEXT_PASS (pass_rename_ssa_copies);
>        /* ???  We do want some kind of loop invariant motion, but we possibly
>           need to adjust LIM to be more friendly towards preserving accurate
>          debug information here.  */
> diff --git a/gcc/testsuite/gcc.dg/guality/pr54200.c b/gcc/testsuite/gcc.dg/guality/pr54200.c
> index 9b17187..e1e7293 100644
> --- a/gcc/testsuite/gcc.dg/guality/pr54200.c
> +++ b/gcc/testsuite/gcc.dg/guality/pr54200.c
> @@ -1,6 +1,6 @@
>  /* PR tree-optimization/54200 */
>  /* { dg-do run } */
> -/* { dg-options "-g -fno-var-tracking-assignments" } */
> +/* { dg-options "-g -fno-var-tracking-assignments -fno-tree-coalesce-vars" } */
>
>  int o __attribute__((used));
>
> diff --git a/gcc/testsuite/gcc.dg/ssp-1.c b/gcc/testsuite/gcc.dg/ssp-1.c
> index 5467f4d..db69332 100644
> --- a/gcc/testsuite/gcc.dg/ssp-1.c
> +++ b/gcc/testsuite/gcc.dg/ssp-1.c
> @@ -12,7 +12,7 @@ __stack_chk_fail (void)
>
>  int main ()
>  {
> -  int i;
> +  register int i;
>    char foo[255];
>
>    // smash stack
> diff --git a/gcc/testsuite/gcc.dg/ssp-2.c b/gcc/testsuite/gcc.dg/ssp-2.c
> index 9a7ac32..752fe53 100644
> --- a/gcc/testsuite/gcc.dg/ssp-2.c
> +++ b/gcc/testsuite/gcc.dg/ssp-2.c
> @@ -14,7 +14,7 @@ __stack_chk_fail (void)
>  void
>  overflow()
>  {
> -  int i = 0;
> +  register int i = 0;
>    char foo[30];
>
>    /* Overflow buffer.  */
> diff --git a/gcc/testsuite/gcc.dg/torture/parm-coalesce.c b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
> new file mode 100644
> index 0000000..dbd81c1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
> @@ -0,0 +1,40 @@
> +/* { dg-do run } */
> +
> +#include <stdlib.h>
> +
> +/* Make sure we don't coalesce both incoming parms, one whose incoming
> +   value is unused, to the same location, so as to overwrite one of
> +   them with the incoming value of the other.  */
> +
> +int __attribute__((noinline, noclone))
> +foo (int i, int j)
> +{
> +  j = i; /* The incoming value for J is unused.  */
> +  i = 2;
> +  if (j)
> +    j++;
> +  j += i + 1;
> +  return j;
> +}
> +
> +/* Same as foo, but with swapped parameters.  */
> +int __attribute__((noinline, noclone))
> +bar (int j, int i)
> +{
> +  j = i; /* The incoming value for J is unused.  */
> +  i = 2;
> +  if (j)
> +    j++;
> +  j += i + 1;
> +  return j;
> +}
> +
> +int
> +main (void)
> +{
> +  if (foo (0, 1) != 3)
> +    abort ();
> +  if (bar (1, 0) != 3)
> +    abort ();
> +  return 0;
> +}
> diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
> index 7b747ab9..978476c 100644
> --- a/gcc/tree-outof-ssa.c
> +++ b/gcc/tree-outof-ssa.c
> @@ -279,7 +279,6 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
>    rtx dest_rtx, seq, x;
>    machine_mode dest_mode, src_mode;
>    int unsignedp;
> -  tree var;
>
>    if (dump_file && (dump_flags & TDF_DETAILS))
>      {
> @@ -301,12 +300,12 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
>
>    start_sequence ();
>
> -  var = SSA_NAME_VAR (partition_to_var (SA.map, dest));
> +  tree name = partition_to_var (SA.map, dest);
>    src_mode = TYPE_MODE (TREE_TYPE (src));
>    dest_mode = GET_MODE (dest_rtx);
> -  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var)));
> +  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (name)));
>    gcc_assert (!REG_P (dest_rtx)
> -             || dest_mode == promote_decl_mode (var, &unsignedp));
> +             || dest_mode == promote_ssa_mode (name, &unsignedp));
>
>    if (src_mode != dest_mode)
>      {
> @@ -682,13 +681,12 @@ elim_backward (elim_graph g, int T)
>  static rtx
>  get_temp_reg (tree name)
>  {
> -  tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
> -  tree type = TREE_TYPE (var);
> +  tree type = TREE_TYPE (name);
>    int unsignedp;
> -  machine_mode reg_mode = promote_decl_mode (var, &unsignedp);
> +  machine_mode reg_mode = promote_ssa_mode (name, &unsignedp);
>    rtx x = gen_reg_rtx (reg_mode);
>    if (POINTER_TYPE_P (type))
> -    mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var))));
> +    mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type)));
>    return x;
>  }
>
> @@ -988,7 +986,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
>
>    /* Return to viewing the variable list as just all reference variables after
>       coalescing has been performed.  */
> -  partition_view_normal (map, false);
> +  partition_view_normal (map);
>
>    if (dump_file && (dump_flags & TDF_DETAILS))
>      {
> diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
> index bf8983f..a622728 100644
> --- a/gcc/tree-ssa-coalesce.c
> +++ b/gcc/tree-ssa-coalesce.c
> @@ -36,6 +36,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "gimple-iterator.h"
>  #include "tree-ssa-live.h"
>  #include "tree-ssa-coalesce.h"
> +#include "explow.h"
>  #include "diagnostic-core.h"
>
>
> @@ -806,6 +807,16 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
>    basic_block bb;
>    ssa_op_iter iter;
>    live_track_p live;
> +  basic_block entry;
> +
> +  /* If inter-variable coalescing is enabled, we may attempt to
> +     coalesce variables from different base variables, including
> +     different parameters, so we have to make sure default defs live
> +     at the entry block conflict with each other.  */
> +  if (flag_tree_coalesce_vars)
> +    entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
> +  else
> +    entry = NULL;
>
>    map = live_var_map (liveinfo);
>    graph = ssa_conflicts_new (num_var_partitions (map));
> @@ -864,6 +875,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
>             live_track_process_def (live, result, graph);
>         }
>
> +      /* Pretend there are defs for params' default defs at the start
> +        of the (post-)entry block.  */
> +      if (bb == entry)
> +       {
> +         unsigned base;
> +         bitmap_iterator bi;
> +         EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi)
> +           {
> +             bitmap_iterator bi2;
> +             unsigned part;
> +             EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base],
> +                                       0, part, bi2)
> +               {
> +                 tree var = partition_to_var (map, part);
> +                 if (!SSA_NAME_VAR (var)
> +                     || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL
> +                         && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL)
> +                     || !SSA_NAME_IS_DEFAULT_DEF (var))
> +                   continue;
> +                 live_track_process_def (live, var, graph);
> +               }
> +           }
> +       }
> +
>       live_track_clear_base_vars (live);
>      }
>
> @@ -1132,6 +1167,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
>      {
>        var1 = partition_to_var (map, p1);
>        var2 = partition_to_var (map, p2);
> +
>        z = var_union (map, var1, var2);
>        if (z == NO_PARTITION)
>         {
> @@ -1149,6 +1185,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
>
>        if (debug)
>         fprintf (debug, ": Success -> %d\n", z);
> +
>        return true;
>      }
>
> @@ -1244,6 +1281,328 @@ ssa_name_var_hash::equal (const tree_node *n1, const tree_node *n2)
>  }
>
>
> +/* Output partition map MAP with coalescing plan PART to file F.  */
> +
> +void
> +dump_part_var_map (FILE *f, partition part, var_map map)
> +{
> +  int t;
> +  unsigned x, y;
> +  int p;
> +
> +  fprintf (f, "\nCoalescible Partition map \n\n");
> +
> +  for (x = 0; x < map->num_partitions; x++)
> +    {
> +      if (map->view_to_partition != NULL)
> +       p = map->view_to_partition[x];
> +      else
> +       p = x;
> +
> +      if (ssa_name (p) == NULL_TREE
> +         || virtual_operand_p (ssa_name (p)))
> +        continue;
> +
> +      t = 0;
> +      for (y = 1; y < num_ssa_names; y++)
> +        {
> +         tree var = version_to_var (map, y);
> +         if (!var)
> +           continue;
> +         int q = var_to_partition (map, var);
> +         p = partition_find (part, q);
> +         gcc_assert (map->partition_to_base_index[q]
> +                     == map->partition_to_base_index[p]);
> +
> +         if (p == (int)x)
> +           {
> +             if (t++ == 0)
> +               {
> +                 fprintf (f, "Partition %d, base %d (", x,
> +                          map->partition_to_base_index[q]);
> +                 print_generic_expr (f, partition_to_var (map, q), TDF_SLIM);
> +                 fprintf (f, " - ");
> +               }
> +             fprintf (f, "%d ", y);
> +           }
> +       }
> +      if (t != 0)
> +       fprintf (f, ")\n");
> +    }
> +  fprintf (f, "\n");
> +}
> +
> +/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
> +   coalescing together, false otherwise.
> +
> +   This must stay consistent with var_map_base_init in tree-ssa-live.c.  */
> +
> +bool
> +gimple_can_coalesce_p (tree name1, tree name2)
> +{
> +  /* First check the SSA_NAME's associated DECL.  Without
> +     optimization, we only want to coalesce if they have the same DECL
> +     or both have no associated DECL.  */
> +  tree var1 = SSA_NAME_VAR (name1);
> +  tree var2 = SSA_NAME_VAR (name2);
> +  var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
> +  var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
> +  if (var1 != var2 && !flag_tree_coalesce_vars)
> +    return false;
> +
> +  /* Now check the types.  If the types are the same, then we should
> +     try to coalesce V1 and V2.  */
> +  tree t1 = TREE_TYPE (name1);
> +  tree t2 = TREE_TYPE (name2);
> +  if (t1 == t2)
> +    {
> +    check_modes:
> +      /* If the base variables are the same, we're good: none of the
> +        other tests below could possibly fail.  */
> +      var1 = SSA_NAME_VAR (name1);
> +      var2 = SSA_NAME_VAR (name2);
> +      if (var1 == var2)
> +       return true;
> +
> +      /* We don't want to coalesce two SSA names if one of the base
> +        variables is supposed to be a register while the other is
> +        supposed to be on the stack.  Anonymous SSA names take
> +        registers, but when not optimizing, user variables should go
> +        on the stack, so coalescing them with the anonymous variable
> +        as the partition leader would end up assigning the user
> +        variable to a register.  Don't do that!  */
> +      bool reg1 = !var1 || use_register_for_decl (var1);
> +      bool reg2 = !var2 || use_register_for_decl (var2);
> +      if (reg1 != reg2)
> +       return false;
> +
> +      /* Check that the promoted modes are the same.  We don't want to
> +        coalesce if the promoted modes would be different.  Only
> +        PARM_DECLs and RESULT_DECLs have different promotion rules,
> +        so skip the test if both are variables or anonymous
> +        SSA_NAMEs.  */
> +      return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
> +       || promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL);
> +    }
> +
> +  /* If the types are not the same, check for a canonical type match.  This
> +     (for example) allows coalescing when the types are fundamentally the
> +     same, but just have different names.
> +
> +     Note pointer types with different address spaces may have the same
> +     canonical type.  Those are rejected for coalescing by the
> +     types_compatible_p check.  */
> +  if (TYPE_CANONICAL (t1)
> +      && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
> +      && types_compatible_p (t1, t2))
> +    goto check_modes;
> +
> +  return false;
> +}
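
For readers skimming the hunk, the order of the checks in the new
gimple_can_coalesce_p can be modelled outside GCC.  The following is
only a sketch under stated assumptions: every type and name in it
(toy_ssa_name, toy_can_coalesce, the integer ids) is hypothetical,
DECL_IGNORED_P filtering is left out, and "compatible types" is
reduced to comparing an integer id.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the tree properties the
   predicate consults; none of these are GCC types.  */
enum toy_kind { TOY_ANON, TOY_VAR, TOY_PARM, TOY_RESULT };

struct toy_ssa_name
{
  int base_id;          /* 0 models an anonymous SSA name.  */
  enum toy_kind kind;   /* Kind of the base declaration.  */
  int type_id;          /* Equal ids model compatible types.  */
  bool wants_register;  /* Models use_register_for_decl on the base.  */
  int promoted_mode;    /* Models promote_ssa_mode on the name.  */
};

/* Mirror the order of the tests in the patch: base variable (unless
   COALESCE_VARS), type, register-vs-stack, then promoted mode.  */
static bool
toy_can_coalesce (const struct toy_ssa_name *a,
                  const struct toy_ssa_name *b,
                  bool coalesce_vars)
{
  assert (a && b);

  /* Without inter-variable coalescing, the bases must match.  */
  if (a->base_id != b->base_id && !coalesce_vars)
    return false;

  /* The types must be compatible in either mode.  */
  if (a->type_id != b->type_id)
    return false;

  /* Same base: none of the remaining tests can fail.  */
  if (a->base_id == b->base_id)
    return true;

  /* Anonymous names take registers; do not mix register and stack.  */
  bool reg_a = a->kind == TOY_ANON || a->wants_register;
  bool reg_b = b->kind == TOY_ANON || b->wants_register;
  if (reg_a != reg_b)
    return false;

  /* Only parms and results may promote differently.  */
  bool plain_a = a->kind == TOY_ANON || a->kind == TOY_VAR;
  bool plain_b = b->kind == TOY_ANON || b->kind == TOY_VAR;
  return (plain_a && plain_b) || a->promoted_mode == b->promoted_mode;
}
```

In this model two user variables of the same type are rejected when
coalesce_vars is false and accepted when it is true, while a register
variable and a stack variable are rejected either way.
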
> +
> +/* Fill in MAP's partition_to_base_index, with one index for each
> +   partition of SSA names USED_IN_COPIES and related by CL coalesce
> +   possibilities.  This must match gimple_can_coalesce_p in the
> +   optimized case.  */
> +
> +static void
> +compute_optimized_partition_bases (var_map map, bitmap used_in_copies,
> +                                  coalesce_list_p cl)
> +{
> +  int parts = num_var_partitions (map);
> +  partition tentative = partition_new (parts);
> +
> +  /* Partition the SSA versions so that, for each coalescible
> +     pair, both of its members are in the same partition in
> +     TENTATIVE.  */
> +  gcc_assert (!cl->sorted);
> +  coalesce_pair_p node;
> +  coalesce_iterator_type ppi;
> +  FOR_EACH_PARTITION_PAIR (node, ppi, cl)
> +    {
> +      tree v1 = ssa_name (node->first_element);
> +      int p1 = partition_find (tentative, var_to_partition (map, v1));
> +      tree v2 = ssa_name (node->second_element);
> +      int p2 = partition_find (tentative, var_to_partition (map, v2));
> +
> +      if (p1 == p2)
> +       continue;
> +
> +      partition_union (tentative, p1, p2);
> +    }
> +
> +  /* We have to deal with cost one pairs too.  */
> +  for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next)
> +    {
> +      tree v1 = ssa_name (co->first_element);
> +      int p1 = partition_find (tentative, var_to_partition (map, v1));
> +      tree v2 = ssa_name (co->second_element);
> +      int p2 = partition_find (tentative, var_to_partition (map, v2));
> +
> +      if (p1 == p2)
> +       continue;
> +
> +      partition_union (tentative, p1, p2);
> +    }
> +
> +  /* And also with abnormal edges.  */
> +  basic_block bb;
> +  edge e;
> +  edge_iterator ei;
> +  FOR_EACH_BB_FN (bb, cfun)
> +    {
> +      FOR_EACH_EDGE (e, ei, bb->preds)
> +       if (e->flags & EDGE_ABNORMAL)
> +         {
> +           gphi_iterator gsi;
> +           for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
> +                gsi_next (&gsi))
> +             {
> +               gphi *phi = gsi.phi ();
> +               tree arg = PHI_ARG_DEF (phi, e->dest_idx);
> +               if (SSA_NAME_IS_DEFAULT_DEF (arg)
> +                   && (!SSA_NAME_VAR (arg)
> +                       || TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL))
> +                 continue;
> +
> +               tree res = PHI_RESULT (phi);
> +
> +               int p1 = partition_find (tentative, var_to_partition (map, res));
> +               int p2 = partition_find (tentative, var_to_partition (map, arg));
> +
> +               if (p1 == p2)
> +                 continue;
> +
> +               partition_union (tentative, p1, p2);
> +             }
> +         }
> +    }
> +
> +  map->partition_to_base_index = XCNEWVEC (int, parts);
> +  auto_vec<unsigned int> index_map (parts);
> +  if (parts)
> +    index_map.quick_grow (parts);
> +
> +  const unsigned no_part = -1;
> +  unsigned count = parts;
> +  while (count)
> +    index_map[--count] = no_part;
> +
> +  /* Initialize MAP's mapping from partition to base index, using
> +     as base indices an enumeration of the TENTATIVE partitions in
> +     which each SSA version ended up, so that we compute conflicts
> +     between all SSA versions that ended up in the same potential
> +     coalesce partition.  */
> +  bitmap_iterator bi;
> +  unsigned i;
> +  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
> +    {
> +      int pidx = var_to_partition (map, ssa_name (i));
> +      int base = partition_find (tentative, pidx);
> +      if (index_map[base] != no_part)
> +       continue;
> +      index_map[base] = count++;
> +    }
> +
> +  map->num_basevars = count;
> +
> +  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
> +    {
> +      int pidx = var_to_partition (map, ssa_name (i));
> +      int base = partition_find (tentative, pidx);
> +      gcc_assert (index_map[base] < count);
> +      map->partition_to_base_index[pidx] = index_map[base];
> +    }
> +
> +  if (dump_file && (dump_flags & TDF_DETAILS))
> +    dump_part_var_map (dump_file, tentative, map);
> +
> +  partition_delete (tentative);
> +}
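
The two-step scheme in compute_optimized_partition_bases (union every
coalescible pair into a tentative partition, then number the resulting
classes densely) can be sketched in isolation.  What follows is an
illustration only, not the libiberty partition code: toy_partition and
its helpers are hypothetical names, the capacity is a fixed constant,
and it numbers every element rather than only those in used_in_copies
as the patch does.

```c
#include <assert.h>

/* Hypothetical stand-in for GCC's partition type: a plain union-find
   with a fixed capacity.  */
#define TOY_MAX_PARTS 64

struct toy_partition
{
  int parent[TOY_MAX_PARTS];
  int n;
};

/* Start with every element in a class of its own.  */
static void
toy_partition_init (struct toy_partition *p, int n)
{
  assert (n >= 0 && n <= TOY_MAX_PARTS);
  p->n = n;
  for (int i = 0; i < n; i++)
    p->parent[i] = i;
}

/* Return the representative of X's class, halving paths on the way.  */
static int
toy_partition_find (struct toy_partition *p, int x)
{
  while (p->parent[x] != x)
    {
      p->parent[x] = p->parent[p->parent[x]];
      x = p->parent[x];
    }
  return x;
}

/* Merge the classes of A and B, as done for each coalescible pair.  */
static void
toy_partition_union (struct toy_partition *p, int a, int b)
{
  int ra = toy_partition_find (p, a);
  int rb = toy_partition_find (p, b);
  if (ra != rb)
    p->parent[rb] = ra;
}

/* Give every element a dense base index so that members of the same
   tentative class share one index.  Return the number of bases.  */
static int
toy_assign_bases (struct toy_partition *p, int *base_index)
{
  int index_of_root[TOY_MAX_PARTS];
  int count = 0;

  for (int i = 0; i < p->n; i++)
    index_of_root[i] = -1;

  for (int i = 0; i < p->n; i++)
    {
      int root = toy_partition_find (p, i);
      if (index_of_root[root] < 0)
        index_of_root[root] = count++;
      base_index[i] = index_of_root[root];
    }
  return count;
}
```

With five elements and the pairs (0, 3) and (3, 4) merged, elements 0,
3 and 4 share one base index and the returned count is 3; conflicts
would then only need to be tracked among names with the same index.
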
> +
> +/* Hashtable helpers.  */
> +
> +struct tree_int_map_hasher : nofree_ptr_hash <tree_int_map>
> +{
> +  static inline hashval_t hash (const tree_int_map *);
> +  static inline bool equal (const tree_int_map *, const tree_int_map *);
> +};
> +
> +inline hashval_t
> +tree_int_map_hasher::hash (const tree_int_map *v)
> +{
> +  return tree_map_base_hash (v);
> +}
> +
> +inline bool
> +tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
> +{
> +  return tree_int_map_eq (v, c);
> +}
> +
> +/* This routine will initialize the basevar fields of MAP with base
> +   names.  Partitions will share the same base if they have the same
> +   SSA_NAME_VAR, or, being anonymous variables, the same type.  This
> +   must match gimple_can_coalesce_p in the non-optimized case.  */
> +
> +static void
> +compute_samebase_partition_bases (var_map map)
> +{
> +  int x, num_part;
> +  tree var;
> +  struct tree_int_map *m, *mapstorage;
> +
> +  num_part = num_var_partitions (map);
> +  hash_table<tree_int_map_hasher> tree_to_index (num_part);
> +  /* We can have at most num_part entries in the hash tables, so it's
> +     enough to allocate so many map elements once, saving some malloc
> +     calls.  */
> +  mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
> +
> +  /* If a base table already exists, clear it, otherwise create it.  */
> +  free (map->partition_to_base_index);
> +  map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
> +
> +  /* Build the base variable list, and point partitions at their bases.  */
> +  for (x = 0; x < num_part; x++)
> +    {
> +      struct tree_int_map **slot;
> +      unsigned baseindex;
> +      var = partition_to_var (map, x);
> +      if (SSA_NAME_VAR (var)
> +         && (!VAR_P (SSA_NAME_VAR (var))
> +             || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
> +       m->base.from = SSA_NAME_VAR (var);
> +      else
> +       /* This restricts what anonymous SSA names we can coalesce
> +          as it restricts the sets we compute conflicts for.
> +          Using TREE_TYPE to generate sets is the easiest as
> +          type equivalency also holds for SSA names with the same
> +          underlying decl.
> +
> +          Check gimple_can_coalesce_p when changing this code.  */
> +       m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
> +                       ? TYPE_CANONICAL (TREE_TYPE (var))
> +                       : TREE_TYPE (var));
> +      /* If base variable hasn't been seen, set it up.  */
> +      slot = tree_to_index.find_slot (m, INSERT);
> +      if (!*slot)
> +       {
> +         baseindex = m - mapstorage;
> +         m->to = baseindex;
> +         *slot = m;
> +         m++;
> +       }
> +      else
> +       baseindex = (*slot)->to;
> +      map->partition_to_base_index[x] = baseindex;
> +    }
> +
> +  map->num_basevars = m - mapstorage;
> +
> +  free (mapstorage);
> +}
> +
>  /* Reduce the number of copies by coalescing variables in the function.  Return
>     a partition map with the resulting coalesces.  */
>
> @@ -1260,9 +1619,10 @@ coalesce_ssa_name (void)
>    cl = create_coalesce_list ();
>    map = create_outofssa_var_map (cl, used_in_copies);
>
> -  /* If optimization is disabled, we need to coalesce all the names originating
> -     from the same SSA_NAME_VAR so debug info remains undisturbed.  */
> -  if (!optimize)
> +  /* If this optimization is disabled, we need to coalesce all the
> +     names originating from the same SSA_NAME_VAR so debug info
> +     remains undisturbed.  */
> +  if (!flag_tree_coalesce_vars)
>      {
>        hash_table<ssa_name_var_hash> ssa_name_hash (10);
>
> @@ -1303,8 +1663,13 @@ coalesce_ssa_name (void)
>    if (dump_file && (dump_flags & TDF_DETAILS))
>      dump_var_map (dump_file, map);
>
> -  /* Don't calculate live ranges for variables not in the coalesce list.  */
> -  partition_view_bitmap (map, used_in_copies, true);
> +  partition_view_bitmap (map, used_in_copies);
> +
> +  if (flag_tree_coalesce_vars)
> +    compute_optimized_partition_bases (map, used_in_copies, cl);
> +  else
> +    compute_samebase_partition_bases (map);
> +
>    BITMAP_FREE (used_in_copies);
>
>    if (num_var_partitions (map) < 1)
> @@ -1343,8 +1708,7 @@ coalesce_ssa_name (void)
>
>    /* Now coalesce everything in the list.  */
>    coalesce_partitions (map, graph, cl,
> -                      ((dump_flags & TDF_DETAILS) ? dump_file
> -                                                  : NULL));
> +                      ((dump_flags & TDF_DETAILS) ? dump_file : NULL));
>
>    delete_coalesce_list (cl);
>    ssa_conflicts_delete (graph);
> diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h
> index 99b188a..ae289b4 100644
> --- a/gcc/tree-ssa-coalesce.h
> +++ b/gcc/tree-ssa-coalesce.h
> @@ -21,5 +21,6 @@ along with GCC; see the file COPYING3.  If not see
>  #define GCC_TREE_SSA_COALESCE_H
>
>  extern var_map coalesce_ssa_name (void);
> +extern bool gimple_can_coalesce_p (tree, tree);
>
>  #endif /* GCC_TREE_SSA_COALESCE_H */
> diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c
> deleted file mode 100644
> index aeb7f28..0000000
> --- a/gcc/tree-ssa-copyrename.c
> +++ /dev/null
> @@ -1,475 +0,0 @@
> -/* Rename SSA copies.
> -   Copyright (C) 2004-2015 Free Software Foundation, Inc.
> -   Contributed by Andrew MacLeod <amacleod@redhat.com>
> -
> -This file is part of GCC.
> -
> -GCC is free software; you can redistribute it and/or modify
> -it under the terms of the GNU General Public License as published by
> -the Free Software Foundation; either version 3, or (at your option)
> -any later version.
> -
> -GCC is distributed in the hope that it will be useful,
> -but WITHOUT ANY WARRANTY; without even the implied warranty of
> -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> -GNU General Public License for more details.
> -
> -You should have received a copy of the GNU General Public License
> -along with GCC; see the file COPYING3.  If not see
> -<http://www.gnu.org/licenses/>.  */
> -
> -#include "config.h"
> -#include "system.h"
> -#include "coretypes.h"
> -#include "backend.h"
> -#include "tree.h"
> -#include "gimple.h"
> -#include "rtl.h"
> -#include "ssa.h"
> -#include "alias.h"
> -#include "fold-const.h"
> -#include "internal-fn.h"
> -#include "gimple-iterator.h"
> -#include "flags.h"
> -#include "tree-pretty-print.h"
> -#include "insn-config.h"
> -#include "expmed.h"
> -#include "dojump.h"
> -#include "explow.h"
> -#include "calls.h"
> -#include "emit-rtl.h"
> -#include "varasm.h"
> -#include "stmt.h"
> -#include "expr.h"
> -#include "tree-dfa.h"
> -#include "tree-inline.h"
> -#include "tree-ssa-live.h"
> -#include "tree-pass.h"
> -#include "langhooks.h"
> -
> -static struct
> -{
> -  /* Number of copies coalesced.  */
> -  int coalesced;
> -} stats;
> -
> -/* The following routines implement the SSA copy renaming phase.
> -
> -   This optimization looks for copies between 2 SSA_NAMES, either through a
> -   direct copy, or an implicit one via a PHI node result and its arguments.
> -
> -   Each copy is examined to determine if it is possible to rename the base
> -   variable of one of the operands to the same variable as the other operand.
> -   i.e.
> -   T.3_5 = <blah>
> -   a_1 = T.3_5
> -
> -   If this copy couldn't be copy propagated, it could possibly remain in the
> -   program throughout the optimization phases.   After SSA->normal, it would
> -   become:
> -
> -   T.3 = <blah>
> -   a = T.3
> -
> -   Since T.3_5 is distinct from all other SSA versions of T.3, there is no
> -   fundamental reason why the base variable needs to be T.3, subject to
> -   certain restrictions.  This optimization attempts to determine if we can
> -   change the base variable on copies like this, and result in code such as:
> -
> -   a_5 = <blah>
> -   a_1 = a_5
> -
> -   This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is
> -   possible, the copy goes away completely. If it isn't possible, a new temp
> -   will be created for a_5, and you will end up with the exact same code:
> -
> -   a.8 = <blah>
> -   a = a.8
> -
> -   The other benefit of performing this optimization relates to what variables
> -   are chosen in copies.  Gimplification of the program uses temporaries for
> -   a lot of things. expressions like
> -
> -   a_1 = <blah>
> -   <blah2> = a_1
> -
> -   get turned into
> -
> -   T.3_5 = <blah>
> -   a_1 = T.3_5
> -   <blah2> = a_1
> -
> -   Copy propagation is done in a forward direction, and if we can propagate
> -   through the copy, we end up with:
> -
> -   T.3_5 = <blah>
> -   <blah2> = T.3_5
> -
> -   The copy is gone, but so is all reference to the user variable 'a'. By
> -   performing this optimization, we would see the sequence:
> -
> -   a_5 = <blah>
> -   a_1 = a_5
> -   <blah2> = a_1
> -
> -   which copy propagation would then turn into:
> -
> -   a_5 = <blah>
> -   <blah2> = a_5
> -
> -   and so we still retain the user variable whenever possible.  */
> -
> -
> -/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
> -   Choose a representative for the partition, and send debug info to DEBUG.  */
> -
> -static void
> -copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
> -{
> -  int p1, p2, p3;
> -  tree root1, root2;
> -  tree rep1, rep2;
> -  bool ign1, ign2, abnorm;
> -
> -  gcc_assert (TREE_CODE (var1) == SSA_NAME);
> -  gcc_assert (TREE_CODE (var2) == SSA_NAME);
> -
> -  register_ssa_partition (map, var1);
> -  register_ssa_partition (map, var2);
> -
> -  p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
> -  p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
> -
> -  if (debug)
> -    {
> -      fprintf (debug, "Try : ");
> -      print_generic_expr (debug, var1, TDF_SLIM);
> -      fprintf (debug, "(P%d) & ", p1);
> -      print_generic_expr (debug, var2, TDF_SLIM);
> -      fprintf (debug, "(P%d)", p2);
> -    }
> -
> -  gcc_assert (p1 != NO_PARTITION);
> -  gcc_assert (p2 != NO_PARTITION);
> -
> -  if (p1 == p2)
> -    {
> -      if (debug)
> -       fprintf (debug, " : Already coalesced.\n");
> -      return;
> -    }
> -
> -  rep1 = partition_to_var (map, p1);
> -  rep2 = partition_to_var (map, p2);
> -  root1 = SSA_NAME_VAR (rep1);
> -  root2 = SSA_NAME_VAR (rep2);
> -  if (!root1 && !root2)
> -    return;
> -
> -  /* Don't coalesce if one of the variables occurs in an abnormal PHI.  */
> -  abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1)
> -           || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2));
> -  if (abnorm)
> -    {
> -      if (debug)
> -       fprintf (debug, " : Abnormal PHI barrier.  No coalesce.\n");
> -      return;
> -    }
> -
> -  /* Partitions already have the same root, simply merge them.  */
> -  if (root1 == root2)
> -    {
> -      p1 = partition_union (map->var_partition, p1, p2);
> -      if (debug)
> -       fprintf (debug, " : Same root, coalesced --> P%d.\n", p1);
> -      return;
> -    }
> -
> -  /* Never attempt to coalesce 2 different parameters.  */
> -  if ((root1 && TREE_CODE (root1) == PARM_DECL)
> -      && (root2 && TREE_CODE (root2) == PARM_DECL))
> -    {
> -      if (debug)
> -        fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n");
> -      return;
> -    }
> -
> -  if ((root1 && TREE_CODE (root1) == RESULT_DECL)
> -      != (root2 && TREE_CODE (root2) == RESULT_DECL))
> -    {
> -      if (debug)
> -        fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n");
> -      return;
> -    }
> -
> -  ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1));
> -  ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2));
> -
> -  /* Refrain from coalescing user variables, if requested.  */
> -  if (!ign1 && !ign2)
> -    {
> -      if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2))
> -       ign2 = true;
> -      else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1))
> -       ign1 = true;
> -      else if (flag_ssa_coalesce_vars != 2)
> -       {
> -         if (debug)
> -           fprintf (debug, " : 2 different USER vars. No coalesce.\n");
> -         return;
> -       }
> -      else
> -       ign2 = true;
> -    }
> -
> -  /* If both values have default defs, we can't coalesce.  If only one has a
> -     tag, make sure that variable is the new root partition.  */
> -  if (root1 && ssa_default_def (cfun, root1))
> -    {
> -      if (root2 && ssa_default_def (cfun, root2))
> -       {
> -         if (debug)
> -           fprintf (debug, " : 2 default defs. No coalesce.\n");
> -         return;
> -       }
> -      else
> -        {
> -         ign2 = true;
> -         ign1 = false;
> -       }
> -    }
> -  else if (root2 && ssa_default_def (cfun, root2))
> -    {
> -      ign1 = true;
> -      ign2 = false;
> -    }
> -
> -  /* Do not coalesce if we cannot assign a symbol to the partition.  */
> -  if (!(!ign2 && root2)
> -      && !(!ign1 && root1))
> -    {
> -      if (debug)
> -       fprintf (debug, " : Choosen variable has no root.  No coalesce.\n");
> -      return;
> -    }
> -
> -  /* Don't coalesce if the new chosen root variable would be read-only.
> -     If both ign1 && ign2, then the root var of the larger partition
> -     wins, so reject in that case if any of the root vars is TREE_READONLY.
> -     Otherwise reject only if the root var, on which replace_ssa_name_symbol
> -     will be called below, is readonly.  */
> -  if (((root1 && TREE_READONLY (root1)) && ign2)
> -      || ((root2 && TREE_READONLY (root2)) && ign1))
> -    {
> -      if (debug)
> -       fprintf (debug, " : Readonly variable.  No coalesce.\n");
> -      return;
> -    }
> -
> -  /* Don't coalesce if the two variables aren't type compatible .  */
> -  if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2))
> -      /* There is a disconnect between the middle-end type-system and
> -         VRP, avoid coalescing enum types with different bounds.  */
> -      || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE
> -          || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE)
> -         && TREE_TYPE (var1) != TREE_TYPE (var2)))
> -    {
> -      if (debug)
> -       fprintf (debug, " : Incompatible types.  No coalesce.\n");
> -      return;
> -    }
> -
> -  /* Merge the two partitions.  */
> -  p3 = partition_union (map->var_partition, p1, p2);
> -
> -  /* Set the root variable of the partition to the better choice, if there is
> -     one.  */
> -  if (!ign2 && root2)
> -    replace_ssa_name_symbol (partition_to_var (map, p3), root2);
> -  else if (!ign1 && root1)
> -    replace_ssa_name_symbol (partition_to_var (map, p3), root1);
> -  else
> -    gcc_unreachable ();
> -
> -  if (debug)
> -    {
> -      fprintf (debug, " --> P%d ", p3);
> -      print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)),
> -                         TDF_SLIM);
> -      fprintf (debug, "\n");
> -    }
> -}
> -
> -
> -namespace {
> -
> -const pass_data pass_data_rename_ssa_copies =
> -{
> -  GIMPLE_PASS, /* type */
> -  "copyrename", /* name */
> -  OPTGROUP_NONE, /* optinfo_flags */
> -  TV_TREE_COPY_RENAME, /* tv_id */
> -  ( PROP_cfg | PROP_ssa ), /* properties_required */
> -  0, /* properties_provided */
> -  0, /* properties_destroyed */
> -  0, /* todo_flags_start */
> -  0, /* todo_flags_finish */
> -};
> -
> -class pass_rename_ssa_copies : public gimple_opt_pass
> -{
> -public:
> -  pass_rename_ssa_copies (gcc::context *ctxt)
> -    : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt)
> -  {}
> -
> -  /* opt_pass methods: */
> -  opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); }
> -  virtual bool gate (function *) { return flag_tree_copyrename != 0; }
> -  virtual unsigned int execute (function *);
> -
> -}; // class pass_rename_ssa_copies
> -
> -/* This function will make a pass through the IL, and attempt to coalesce any
> -   SSA versions which occur in PHI's or copies.  Coalescing is accomplished by
> -   changing the underlying root variable of all coalesced version.  This will
> -   then cause the SSA->normal pass to attempt to coalesce them all to the same
> -   variable.  */
> -
> -unsigned int
> -pass_rename_ssa_copies::execute (function *fun)
> -{
> -  var_map map;
> -  basic_block bb;
> -  tree var, part_var;
> -  gimple stmt;
> -  unsigned x;
> -  FILE *debug;
> -
> -  memset (&stats, 0, sizeof (stats));
> -
> -  if (dump_file && (dump_flags & TDF_DETAILS))
> -    debug = dump_file;
> -  else
> -    debug = NULL;
> -
> -  map = init_var_map (num_ssa_names);
> -
> -  FOR_EACH_BB_FN (bb, fun)
> -    {
> -      /* Scan for real copies.  */
> -      for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
> -          gsi_next (&gsi))
> -       {
> -         stmt = gsi_stmt (gsi);
> -         if (gimple_assign_ssa_name_copy_p (stmt))
> -           {
> -             tree lhs = gimple_assign_lhs (stmt);
> -             tree rhs = gimple_assign_rhs1 (stmt);
> -
> -             copy_rename_partition_coalesce (map, lhs, rhs, debug);
> -           }
> -       }
> -    }
> -
> -  FOR_EACH_BB_FN (bb, fun)
> -    {
> -      /* Treat PHI nodes as copies between the result and each argument.  */
> -      for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
> -          gsi_next (&gsi))
> -        {
> -          size_t i;
> -         tree res;
> -         gphi *phi = gsi.phi ();
> -         res = gimple_phi_result (phi);
> -
> -         /* Do not process virtual SSA_NAMES.  */
> -         if (virtual_operand_p (res))
> -           continue;
> -
> -         /* Make sure to only use the same partition for an argument
> -            as the result but never the other way around.  */
> -         if (SSA_NAME_VAR (res)
> -             && !DECL_IGNORED_P (SSA_NAME_VAR (res)))
> -           for (i = 0; i < gimple_phi_num_args (phi); i++)
> -             {
> -               tree arg = PHI_ARG_DEF (phi, i);
> -               if (TREE_CODE (arg) == SSA_NAME)
> -                 copy_rename_partition_coalesce (map, res, arg,
> -                                                 debug);
> -             }
> -         /* Else if all arguments are in the same partition try to merge
> -            it with the result.  */
> -         else
> -           {
> -             int all_p_same = -1;
> -             int p = -1;
> -             for (i = 0; i < gimple_phi_num_args (phi); i++)
> -               {
> -                 tree arg = PHI_ARG_DEF (phi, i);
> -                 if (TREE_CODE (arg) != SSA_NAME)
> -                   {
> -                     all_p_same = 0;
> -                     break;
> -                   }
> -                 else if (all_p_same == -1)
> -                   {
> -                     p = partition_find (map->var_partition,
> -                                         SSA_NAME_VERSION (arg));
> -                     all_p_same = 1;
> -                   }
> -                 else if (all_p_same == 1
> -                          && p != partition_find (map->var_partition,
> -                                                  SSA_NAME_VERSION (arg)))
> -                   {
> -                     all_p_same = 0;
> -                     break;
> -                   }
> -               }
> -             if (all_p_same == 1)
> -               copy_rename_partition_coalesce (map, res,
> -                                               PHI_ARG_DEF (phi, 0),
> -                                               debug);
> -           }
> -        }
> -    }
> -
> -  if (debug)
> -    dump_var_map (debug, map);
> -
> -  /* Now one more pass to make all elements of a partition share the same
> -     root variable.  */
> -
> -  for (x = 1; x < num_ssa_names; x++)
> -    {
> -      part_var = partition_to_var (map, x);
> -      if (!part_var)
> -        continue;
> -      var = ssa_name (x);
> -      if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var))
> -       continue;
> -      if (debug)
> -        {
> -         fprintf (debug, "Coalesced ");
> -         print_generic_expr (debug, var, TDF_SLIM);
> -         fprintf (debug, " to ");
> -         print_generic_expr (debug, part_var, TDF_SLIM);
> -         fprintf (debug, "\n");
> -       }
> -      stats.coalesced++;
> -      replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var));
> -    }
> -
> -  statistics_counter_event (fun, "copies coalesced",
> -                           stats.coalesced);
> -  delete_var_map (map);
> -  return 0;
> -}
> -
> -} // anon namespace
> -
> -gimple_opt_pass *
> -make_pass_rename_ssa_copies (gcc::context *ctxt)
> -{
> -  return new pass_rename_ssa_copies (ctxt);
> -}
> diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
> index 5b00f58..4772558 100644
> --- a/gcc/tree-ssa-live.c
> +++ b/gcc/tree-ssa-live.c
> @@ -70,88 +70,6 @@ static void  verify_live_on_entry (tree_live_info_p);
>     ssa_name or variable, and vice versa.  */
>
>
> -/* Hashtable helpers.  */
> -
> -struct tree_int_map_hasher : nofree_ptr_hash <tree_int_map>
> -{
> -  static inline hashval_t hash (const tree_int_map *);
> -  static inline bool equal (const tree_int_map *, const tree_int_map *);
> -};
> -
> -inline hashval_t
> -tree_int_map_hasher::hash (const tree_int_map *v)
> -{
> -  return tree_map_base_hash (v);
> -}
> -
> -inline bool
> -tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
> -{
> -  return tree_int_map_eq (v, c);
> -}
> -
> -
> -/* This routine will initialize the basevar fields of MAP.  */
> -
> -static void
> -var_map_base_init (var_map map)
> -{
> -  int x, num_part;
> -  tree var;
> -  struct tree_int_map *m, *mapstorage;
> -
> -  num_part = num_var_partitions (map);
> -  hash_table<tree_int_map_hasher> tree_to_index (num_part);
> -  /* We can have at most num_part entries in the hash tables, so it's
> -     enough to allocate so many map elements once, saving some malloc
> -     calls.  */
> -  mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
> -
> -  /* If a base table already exists, clear it, otherwise create it.  */
> -  free (map->partition_to_base_index);
> -  map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
> -
> -  /* Build the base variable list, and point partitions at their bases.  */
> -  for (x = 0; x < num_part; x++)
> -    {
> -      struct tree_int_map **slot;
> -      unsigned baseindex;
> -      var = partition_to_var (map, x);
> -      if (SSA_NAME_VAR (var)
> -         && (!VAR_P (SSA_NAME_VAR (var))
> -             || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
> -       m->base.from = SSA_NAME_VAR (var);
> -      else
> -       /* This restricts what anonymous SSA names we can coalesce
> -          as it restricts the sets we compute conflicts for.
> -          Using TREE_TYPE to generate sets is the easies as
> -          type equivalency also holds for SSA names with the same
> -          underlying decl.
> -
> -          Check gimple_can_coalesce_p when changing this code.  */
> -       m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
> -                       ? TYPE_CANONICAL (TREE_TYPE (var))
> -                       : TREE_TYPE (var));
> -      /* If base variable hasn't been seen, set it up.  */
> -      slot = tree_to_index.find_slot (m, INSERT);
> -      if (!*slot)
> -       {
> -         baseindex = m - mapstorage;
> -         m->to = baseindex;
> -         *slot = m;
> -         m++;
> -       }
> -      else
> -       baseindex = (*slot)->to;
> -      map->partition_to_base_index[x] = baseindex;
> -    }
> -
> -  map->num_basevars = m - mapstorage;
> -
> -  free (mapstorage);
> -}
> -
> -
>  /* Remove the base table in MAP.  */
>
>  static void
> @@ -329,21 +247,17 @@ partition_view_fini (var_map map, bitmap selected)
>  }
>
>
> -/* Create a partition view which includes all the used partitions in MAP.  If
> -   WANT_BASES is true, create the base variable map as well.  */
> +/* Create a partition view which includes all the used partitions in MAP.  */
>
>  void
> -partition_view_normal (var_map map, bool want_bases)
> +partition_view_normal (var_map map)
>  {
>    bitmap used;
>
>    used = partition_view_init (map);
>    partition_view_fini (map, used);
>
> -  if (want_bases)
> -    var_map_base_init (map);
> -  else
> -    var_map_base_fini (map);
> +  var_map_base_fini (map);
>  }
>
>
> @@ -352,7 +266,7 @@ partition_view_normal (var_map map, bool want_bases)
>     as well.  */
>
>  void
> -partition_view_bitmap (var_map map, bitmap only, bool want_bases)
> +partition_view_bitmap (var_map map, bitmap only)
>  {
>    bitmap used;
>    bitmap new_partitions = BITMAP_ALLOC (NULL);
> @@ -368,10 +282,7 @@ partition_view_bitmap (var_map map, bitmap only, bool want_bases)
>      }
>    partition_view_fini (map, new_partitions);
>
> -  if (want_bases)
> -    var_map_base_init (map);
> -  else
> -    var_map_base_fini (map);
> +  var_map_base_fini (map);
>  }
>
>
> diff --git a/gcc/tree-ssa-live.h b/gcc/tree-ssa-live.h
> index d5d7820..1f88358 100644
> --- a/gcc/tree-ssa-live.h
> +++ b/gcc/tree-ssa-live.h
> @@ -71,8 +71,8 @@ typedef struct _var_map
>  extern var_map init_var_map (int);
>  extern void delete_var_map (var_map);
>  extern int var_union (var_map, tree, tree);
> -extern void partition_view_normal (var_map, bool);
> -extern void partition_view_bitmap (var_map, bitmap, bool);
> +extern void partition_view_normal (var_map);
> +extern void partition_view_bitmap (var_map, bitmap);
>  extern void dump_scope_blocks (FILE *, int);
>  extern void debug_scope_block (tree, int);
>  extern void debug_scope_blocks (int);
> diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c
> index 437f69d..1fbd71e 100644
> --- a/gcc/tree-ssa-uncprop.c
> +++ b/gcc/tree-ssa-uncprop.c
> @@ -38,6 +38,11 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-pass.h"
>  #include "tree-ssa-propagate.h"
>  #include "tree-hash-traits.h"
> +#include "bitmap.h"
> +#include "stringpool.h"
> +#include "tree-ssanames.h"
> +#include "tree-ssa-live.h"
> +#include "tree-ssa-coalesce.h"
>
>  /* The basic structure describing an equivalency created by traversing
>     an edge.  Traversing the edge effectively means that we can assume
> diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
> index b5b0cb6..e10f775 100644
> --- a/gcc/var-tracking.c
> +++ b/gcc/var-tracking.c
> @@ -4909,12 +4909,16 @@ dataflow_set_remove_mem_locs (variable_def **slot, dataflow_set *set)
>     registers, as well as associations between MEMs and VALUEs.  */
>
>  static void
> -dataflow_set_clear_at_call (dataflow_set *set)
> +dataflow_set_clear_at_call (dataflow_set *set, rtx_insn *call_insn)
>  {
>    unsigned int r;
>    hard_reg_set_iterator hrsi;
> +  HARD_REG_SET invalidated_regs;
>
> -  EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, r, hrsi)
> +  get_call_reg_set_usage (call_insn, &invalidated_regs,
> +                         regs_invalidated_by_call);
> +
> +  EXECUTE_IF_SET_IN_HARD_REG_SET (invalidated_regs, 0, r, hrsi)
>      var_regno_delete (set, r);
>
>    if (MAY_HAVE_DEBUG_INSNS)
> @@ -6698,7 +6702,7 @@ compute_bb_dataflow (basic_block bb)
>        switch (mo->type)
>         {
>           case MO_CALL:
> -           dataflow_set_clear_at_call (out);
> +           dataflow_set_clear_at_call (out, insn);
>             break;
>
>           case MO_USE:
> @@ -9160,7 +9164,7 @@ emit_notes_in_bb (basic_block bb, dataflow_set *set)
>        switch (mo->type)
>         {
>           case MO_CALL:
> -           dataflow_set_clear_at_call (set);
> +           dataflow_set_clear_at_call (set, insn);
>             emit_notes_for_changes (insn, EMIT_NOTE_AFTER_CALL_INSN, set->vars);
>             {
>               rtx arguments = mo->u.loc, *p = &arguments;
>
>
>
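
To make the base-index step in compute_optimized_partition_bases above
easier to follow, here is a minimal standalone sketch of the idea: run a
tentative union-find over everything that could be coalesced, then
number the resulting classes densely and give every partition the number
of its class.  This is an illustration only, not code from the patch:
UnionFind and assign_base_indices are hypothetical names, the union-find
is a simple stand-in for GCC's partition type, and the sketch walks all
partitions in a single loop instead of only the SSA versions recorded in
used_in_copies.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical stand-in for GCC's `partition` type: a plain union-find.
struct UnionFind {
  std::vector<int> parent;

  explicit UnionFind(int n) : parent(n) {
    std::iota(parent.begin(), parent.end(), 0);  // each element is its own root
  }

  int find(int x) {
    while (parent[x] != x) {
      parent[x] = parent[parent[x]];  // path halving
      x = parent[x];
    }
    return x;
  }

  void unite(int a, int b) { parent[find(a)] = find(b); }
};

// Give every partition the dense index of the tentative class it ended up
// in, numbering classes in order of first appearance.  The return value is
// the number of distinct classes (the role num_basevars plays in the patch).
int assign_base_indices(UnionFind &tentative, std::vector<int> &base_index) {
  const int parts = static_cast<int>(tentative.parent.size());
  const int no_part = -1;
  std::vector<int> index_map(parts, no_part);  // class root -> dense index
  base_index.assign(parts, no_part);

  int count = 0;
  for (int p = 0; p < parts; ++p) {
    int root = tentative.find(p);
    if (index_map[root] == no_part)
      index_map[root] = count++;
    assert(index_map[root] < count);
    base_index[p] = index_map[root];
  }
  return count;
}
```

With five partitions where 0 and 2 are tentatively coalescible, and so
are 3 and 4, this yields three classes, and conflicts then only need to
be computed among partitions that share an index.
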
> These are the incremental fixes:
>
> diff --git a/gcc/explow.c b/gcc/explow.c
> index 6dba6e5..6941f4e 100644
> --- a/gcc/explow.c
> +++ b/gcc/explow.c
> @@ -852,6 +852,13 @@ promote_ssa_mode (const_tree name, int *punsignedp)
>  {
>    gcc_assert (TREE_CODE (name) == SSA_NAME);
>
> +  /* Partitions holding parms and results must be promoted as expected
> +     by function.c.  */
> +  if (SSA_NAME_VAR (name)
> +      && (TREE_CODE (SSA_NAME_VAR (name)) == PARM_DECL
> +         || TREE_CODE (SSA_NAME_VAR (name)) == RESULT_DECL))
> +    return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
> +
>    tree type = TREE_TYPE (name);
>    int unsignedp = TYPE_UNSIGNED (type);
>    machine_mode mode = TYPE_MODE (type);
> diff --git a/gcc/function.c b/gcc/function.c
> index 840f4a2..753d889 100644
> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -2920,14 +2920,16 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>      {
>        DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
>        stack_parm = rtl_for_parm (all, parm);
> -      if (!stack_parm)
> -       stack_parm = assign_stack_local (BLKmode, size_stored,
> -                                        DECL_ALIGN (parm));
> -      else
> +      if (stack_parm)
>         stack_parm = copy_rtx (stack_parm);
> -      if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
> -       PUT_MODE (stack_parm, GET_MODE (entry_parm));
> -      set_mem_attributes (stack_parm, parm, 1);
> +      else
> +       {
> +         stack_parm = assign_stack_local (BLKmode, size_stored,
> +                                          DECL_ALIGN (parm));
> +         if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
> +           PUT_MODE (stack_parm, GET_MODE (entry_parm));
> +         set_mem_attributes (stack_parm, parm, 1);
> +       }
>      }
>
>    /* If a BLKmode arrives in registers, copy it to a stack slot.  Handle
>
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
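
For contrast with the optimized path, the non-optimized path in the
quoted patch (compute_samebase_partition_bases) groups partitions by a
key, the SSA_NAME_VAR or else the canonical type, and hands out a fresh
index the first time each key is seen.  A rough standalone sketch of
that numbering follows.  It assumes strings in place of trees and
std::unordered_map in place of GCC's hash_table, and
assign_samebase_indices is a hypothetical name, not something in the
patch.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// KEYS holds one entry per partition: the name of its base variable or,
// for anonymous SSA names, of its type (strings stand in for trees here).
// On return BASE_INDEX[x] is the index of partition x's key, with indices
// assigned in order of first appearance.  Returns the number of distinct
// keys.
int assign_samebase_indices(const std::vector<std::string> &keys,
                            std::vector<int> &base_index) {
  std::unordered_map<std::string, int> key_to_index;
  base_index.clear();
  base_index.reserve(keys.size());

  for (const std::string &key : keys) {
    const int next = static_cast<int>(key_to_index.size());
    // emplace leaves an existing entry untouched, so a key that was
    // already seen keeps the index it got the first time.
    auto result = key_to_index.emplace(key, next);
    base_index.push_back(result.first->second);
  }

  assert(base_index.size() == keys.size());
  return static_cast<int>(key_to_index.size());
}
```

Partitions that end up with different indices are never tested against
each other for conflicts, which is why this grouping has to stay in sync
with gimple_can_coalesce_p.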

* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-16  8:50                     ` Richard Biener
@ 2015-07-16 21:33                       ` Alexandre Oliva
  2015-07-18  8:26                         ` Alexandre Oliva
  0 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-07-16 21:33 UTC (permalink / raw)
  To: Richard Biener
  Cc: Jeff Law, GCC Patches, Christophe Lyon, David Edelsohn, Eric Botcazou

On Jul 16, 2015, Richard Biener <richard.guenther@gmail.com> wrote:

>> Is this ok to install?

> Yes.

So, I decided to run a ppc64le-linux-gnu bootstrap, just in case, and
there are issues with split complex parms that caused go and fortran
libs to fail the build.

I will refrain from installing this for now, and I'll post a followup as
soon as I sort that out.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-16 21:33                       ` Alexandre Oliva
@ 2015-07-18  8:26                         ` Alexandre Oliva
  2015-07-21 13:25                           ` Richard Biener
  0 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-07-18  8:26 UTC (permalink / raw)
  To: Richard Biener
  Cc: Jeff Law, GCC Patches, Christophe Lyon, David Edelsohn, Eric Botcazou

On Jul 16, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:

> So, I decided to run a ppc64le-linux-gnu bootstrap, just in case, and
> there are issues with split complex parms that caused go and fortran
> libs to fail the build.

This incremental patch, along with the previously-posted patches, fixes
split complex args handling with preassigned args RTL, and enables
ppc64le-linux-gnu bootstrap to succeed.

I'm not particularly happy with the abuse of DECL_CONTEXT to recognize
split complex args and leave their RTL alone, but that was the best that
occurred to me.  Any other suggestions?

Is the combined patch ok, assuming further (re)testing of embedded
targets passes?

for  gcc/ChangeLog (to be integrated with the approved patches)

	* function.c (split_complex_args): Take assign_parm_data_all
	argument.  Pass it to rtl_for_parm.  Set up rtl and context
	for split args.
	(assign_parms_augmented_arg_list): Adjust.
	(maybe_reset_rtl_for_parm): Recognize split complex args.
	* stor-layout.c (layout_decl): Don't set mem attributes of
	non-MEMs.
---
 gcc/function.c    |   39 +++++++++++++++++++++++++++++++++++++--
 gcc/stor-layout.c |    3 ++-
 2 files changed, 39 insertions(+), 3 deletions(-)

diff --git a/gcc/function.c b/gcc/function.c
index 753d889..6fba001 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -151,6 +151,8 @@ static bool contains (const_rtx, hash_table<insn_cache_hasher> *);
 static void prepare_function_start (void);
 static void do_clobber_return_reg (rtx, void *);
 static void do_use_return_reg (rtx, void *);
+static rtx rtl_for_parm (struct assign_parm_data_all *, tree);
+
 \f
 /* Stack of nested functions.  */
 /* Keep track of the cfun stack.  */
@@ -2267,7 +2269,7 @@ assign_parms_initialize_all (struct assign_parm_data_all *all)
    needed, else the old list.  */
 
 static void
-split_complex_args (vec<tree> *args)
+split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
 {
   unsigned i;
   tree p;
@@ -2278,6 +2280,7 @@ split_complex_args (vec<tree> *args)
       if (TREE_CODE (type) == COMPLEX_TYPE
 	  && targetm.calls.split_complex_arg (type))
 	{
+	  tree cparm = p;
 	  tree decl;
 	  tree subtype = TREE_TYPE (type);
 	  bool addressable = TREE_ADDRESSABLE (p);
@@ -2296,6 +2299,9 @@ split_complex_args (vec<tree> *args)
 	  DECL_ARTIFICIAL (p) = addressable;
 	  DECL_IGNORED_P (p) = addressable;
 	  TREE_ADDRESSABLE (p) = 0;
+	  /* Reset the RTL before layout_decl, or it may change the
+	     mode of the RTL of the original argument copied to P.  */
+	  SET_DECL_RTL (p, NULL_RTX);
 	  layout_decl (p, 0);
 	  (*args)[i] = p;
 
@@ -2307,6 +2313,25 @@ split_complex_args (vec<tree> *args)
 	  DECL_IGNORED_P (decl) = addressable;
 	  layout_decl (decl, 0);
 	  args->safe_insert (++i, decl);
+
+	  /* If we are assigning parameters for a function, rather
+	     than for a call, propagate the RTL of the complex parm to
+	     the split declarations, and set their contexts so that
+	     maybe_reset_rtl_for_parm can recognize them and refrain
+	     from resetting their RTL.  */
+	  if (cfun->gimple_df)
+	    {
+	      rtx rtl = rtl_for_parm (all, cparm);
+	      gcc_assert (!rtl || GET_CODE (rtl) == CONCAT);
+	      if (rtl)
+		{
+		  SET_DECL_RTL (p, XEXP (rtl, 0));
+		  SET_DECL_RTL (decl, XEXP (rtl, 1));
+
+		  DECL_CONTEXT (p) = cparm;
+		  DECL_CONTEXT (decl) = cparm;
+		}
+	    }
 	}
     }
 }
@@ -2369,7 +2394,7 @@ assign_parms_augmented_arg_list (struct assign_parm_data_all *all)
 
   /* If the target wants to split complex arguments into scalars, do so.  */
   if (targetm.calls.split_complex_arg)
-    split_complex_args (&fnargs);
+    split_complex_args (all, &fnargs);
 
   return fnargs;
 }
@@ -2823,6 +2848,16 @@ maybe_reset_rtl_for_parm (tree parm)
 {
   gcc_assert (TREE_CODE (parm) == PARM_DECL
 	      || TREE_CODE (parm) == RESULT_DECL);
+
+  /* This is a split complex parameter, and its context was set to its
+     original PARM_DECL in split_complex_args so that we could
+     recognize it here and not reset its RTL.  */
+  if (DECL_CONTEXT (parm) && TREE_CODE (DECL_CONTEXT (parm)) == PARM_DECL)
+    {
+      DECL_CONTEXT (parm) = DECL_CONTEXT (DECL_CONTEXT (parm));
+      return;
+    }
+
   if ((flag_tree_coalesce_vars
        || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx))
       && is_gimple_reg (parm))
diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index 0d4f4a4..288227a 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -794,7 +794,8 @@ layout_decl (tree decl, unsigned int known_align)
     {
       PUT_MODE (rtl, DECL_MODE (decl));
       SET_DECL_RTL (decl, 0);
-      set_mem_attributes (rtl, decl, 1);
+      if (MEM_P (rtl))
+	set_mem_attributes (rtl, decl, 1);
       SET_DECL_RTL (decl, rtl);
     }
 }


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-18  8:26                         ` Alexandre Oliva
@ 2015-07-21 13:25                           ` Richard Biener
  2015-07-22 17:13                             ` Alexandre Oliva
  2015-07-22 17:43                             ` Alexandre Oliva
  0 siblings, 2 replies; 127+ messages in thread
From: Richard Biener @ 2015-07-21 13:25 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Jeff Law, GCC Patches, Christophe Lyon, David Edelsohn, Eric Botcazou

On Sat, Jul 18, 2015 at 9:37 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Jul 16, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
>
>> So, I decided to run a ppc64le-linux-gnu bootstrap, just in case, and
>> there are issues with split complex parms that caused go and fortran
>> libs to fail the build.
>
> This incremental patch, along with the previously-posted patches, fixes
> split complex args handling with preassigned args RTL, and enables
> ppc64le-linux-gnu bootstrap to succeed.
>
> I'm not particularly happy with the abuse of DECL_CONTEXT to recognize
> split complex args and leave their RTL alone, but that was the best that
> occurred to me.  Any other suggestions?
>
> Is the combined patch ok, assuming further (re)testing of embedded
> targets passes?
>
> for  gcc/ChangeLog (to be integrated with the approved patches)
>
>         * function.c (split_complex_args): Take assign_parm_data_all
>         argument.  Pass it to rtl_for_parm.  Set up rtl and context
>         for split args.
>         (assign_parms_augmented_arg_list): Adjust.
>         (maybe_reset_rtl_for_parm): Recognize split complex args.
>         * stor-layout.c (layout_decl): Don't set mem attributes of
>         non-MEMs.
> ---
>  gcc/function.c    |   39 +++++++++++++++++++++++++++++++++++++--
>  gcc/stor-layout.c |    3 ++-
>  2 files changed, 39 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/function.c b/gcc/function.c
> index 753d889..6fba001 100644
> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -151,6 +151,8 @@ static bool contains (const_rtx, hash_table<insn_cache_hasher> *);
>  static void prepare_function_start (void);
>  static void do_clobber_return_reg (rtx, void *);
>  static void do_use_return_reg (rtx, void *);
> +static rtx rtl_for_parm (struct assign_parm_data_all *, tree);
> +
>
>  /* Stack of nested functions.  */
>  /* Keep track of the cfun stack.  */
> @@ -2267,7 +2269,7 @@ assign_parms_initialize_all (struct assign_parm_data_all *all)
>     needed, else the old list.  */
>
>  static void
> -split_complex_args (vec<tree> *args)
> +split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
>  {
>    unsigned i;
>    tree p;
> @@ -2278,6 +2280,7 @@ split_complex_args (vec<tree> *args)
>        if (TREE_CODE (type) == COMPLEX_TYPE
>           && targetm.calls.split_complex_arg (type))
>         {
> +         tree cparm = p;
>           tree decl;
>           tree subtype = TREE_TYPE (type);
>           bool addressable = TREE_ADDRESSABLE (p);
> @@ -2296,6 +2299,9 @@ split_complex_args (vec<tree> *args)
>           DECL_ARTIFICIAL (p) = addressable;
>           DECL_IGNORED_P (p) = addressable;
>           TREE_ADDRESSABLE (p) = 0;
> +         /* Reset the RTL before layout_decl, or it may change the
> +            mode of the RTL of the original argument copied to P.  */
> +         SET_DECL_RTL (p, NULL_RTX);
>           layout_decl (p, 0);
>           (*args)[i] = p;
>
> @@ -2307,6 +2313,25 @@ split_complex_args (vec<tree> *args)
>           DECL_IGNORED_P (decl) = addressable;
>           layout_decl (decl, 0);
>           args->safe_insert (++i, decl);
> +
> +         /* If we are assigning parameters for a function, rather
> +            than for a call, propagate the RTL of the complex parm to
> +            the split declarations, and set their contexts so that
> +            maybe_reset_rtl_for_parm can recognize them and refrain
> +            from resetting their RTL.  */
> +         if (cfun->gimple_df)

If the cfun->gimple_df check is meant to decide whether this is a call or a
function, then no, this can't work reliably.  What else is this test for?

You pass another argument to split_complex_args, so why not pass in a bool
saying which of the two cases we are splitting for?

Richard.

> +           {
> +             rtx rtl = rtl_for_parm (all, cparm);
> +             gcc_assert (!rtl || GET_CODE (rtl) == CONCAT);
> +             if (rtl)
> +               {
> +                 SET_DECL_RTL (p, XEXP (rtl, 0));
> +                 SET_DECL_RTL (decl, XEXP (rtl, 1));
> +
> +                 DECL_CONTEXT (p) = cparm;
> +                 DECL_CONTEXT (decl) = cparm;
> +               }
> +           }
>         }
>      }
>  }
> @@ -2369,7 +2394,7 @@ assign_parms_augmented_arg_list (struct assign_parm_data_all *all)
>
>    /* If the target wants to split complex arguments into scalars, do so.  */
>    if (targetm.calls.split_complex_arg)
> -    split_complex_args (&fnargs);
> +    split_complex_args (all, &fnargs);
>
>    return fnargs;
>  }
> @@ -2823,6 +2848,16 @@ maybe_reset_rtl_for_parm (tree parm)
>  {
>    gcc_assert (TREE_CODE (parm) == PARM_DECL
>               || TREE_CODE (parm) == RESULT_DECL);
> +
> +  /* This is a split complex parameter, and its context was set to its
> +     original PARM_DECL in split_complex_args so that we could
> +     recognize it here and not reset its RTL.  */
> +  if (DECL_CONTEXT (parm) && TREE_CODE (DECL_CONTEXT (parm)) == PARM_DECL)
> +    {
> +      DECL_CONTEXT (parm) = DECL_CONTEXT (DECL_CONTEXT (parm));
> +      return;
> +    }
> +
>    if ((flag_tree_coalesce_vars
>         || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx))
>        && is_gimple_reg (parm))
> diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
> index 0d4f4a4..288227a 100644
> --- a/gcc/stor-layout.c
> +++ b/gcc/stor-layout.c
> @@ -794,7 +794,8 @@ layout_decl (tree decl, unsigned int known_align)
>      {
>        PUT_MODE (rtl, DECL_MODE (decl));
>        SET_DECL_RTL (decl, 0);
> -      set_mem_attributes (rtl, decl, 1);
> +      if (MEM_P (rtl))
> +       set_mem_attributes (rtl, decl, 1);
>        SET_DECL_RTL (decl, rtl);
>      }
>  }

* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-21 13:25                           ` Richard Biener
@ 2015-07-22 17:13                             ` Alexandre Oliva
  2015-07-22 17:43                             ` Alexandre Oliva
  1 sibling, 0 replies; 127+ messages in thread
From: Alexandre Oliva @ 2015-07-22 17:13 UTC (permalink / raw)
  To: Richard Biener
  Cc: Jeff Law, GCC Patches, Christophe Lyon, David Edelsohn, Eric Botcazou

On Jul 21, 2015, Richard Biener <richard.guenther@gmail.com> wrote:

> On Sat, Jul 18, 2015 at 9:37 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
>> On Jul 16, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
>> +         /* If we are assigning parameters for a function, rather
>> +            than for a call, propagate the RTL of the complex parm to
>> +            the split declarations, and set their contexts so that
>> +            maybe_reset_rtl_for_parm can recognize them and refrain
>> +            from resetting their RTL.  */
>> +         if (cfun->gimple_df)

> If the cfun->gimple_df check is meant to decide whether this is a call or a
> function, then no, this can't work reliably.  What else is this test for?

That was the reason: call or function.

> You pass another argument to split_complex_args, so why not pass in a bool
> saying which of the two cases we are splitting for?

There's only one call to split_complex_args.  I'll try to figure out
where the paths converge and see if it's reasonable to pass an argument
all the way to tell the two cases apart.

Thanks for the suggestion,

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-21 13:25                           ` Richard Biener
  2015-07-22 17:13                             ` Alexandre Oliva
@ 2015-07-22 17:43                             ` Alexandre Oliva
  2015-07-23 11:04                               ` Richard Biener
  1 sibling, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-07-22 17:43 UTC (permalink / raw)
  To: Richard Biener
  Cc: Jeff Law, GCC Patches, Christophe Lyon, David Edelsohn, Eric Botcazou

On Jul 21, 2015, Richard Biener <richard.guenther@gmail.com> wrote:

> On Sat, Jul 18, 2015 at 9:37 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
>> +         if (cfun->gimple_df)

> If the cfun->gimple_df check is meant to decide whether this is a call or a
> function, then no, this can't work reliably.  What else is this test for?

It turns out it's not call or function, as I thought at first, but
gimplifying or expanding the function.  split_complex_args is not used
for calls.  So the above might actually work (minus the misleading
comments I wrote), and I think it's cleaner than adding a bool
expanding_p arg to split_complex_args and
assign_parms_augmented_arg_list, called from gimplify_parameters (during
gimplification of a function) and assign_parms (during its expansion).
Do you agree, or would you prefer the explicit argument?
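
For comparison, the two alternatives can be sketched in standalone C.
This is only a model under stated assumptions: in_expand_phase is a
hypothetical global standing in for whatever state-derived test is
used, and every function name below is invented for illustration
rather than taken from function.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical global phase marker, true only while expanding.  */
static bool in_expand_phase = false;

/* Alternative A: the shared helper consults global state.  Returns 1
   if it would propagate RTL to the split parts, 0 otherwise.  */
static int
split_args_implicit (void)
{
  return in_expand_phase ? 1 : 0;
}

/* Alternative B: the phase is an explicit argument, which has to be
   threaded through every intermediate caller.  */
static int
split_args_explicit (bool expanding_p)
{
  return expanding_p ? 1 : 0;
}

static int
augmented_arg_list_explicit (bool expanding_p)
{
  return split_args_explicit (expanding_p);
}

/* The two entry points that converge on the shared helper.  */
static int
gimplify_path (void)
{
  return split_args_implicit ();
}

static int
expand_path (void)
{
  in_expand_phase = true;
  int result = split_args_implicit ();
  in_expand_phase = false;
  return result;
}
```

Alternative A keeps the signatures unchanged but relies on the flag
being set and cleared at the right points; alternative B makes the
dependency visible at each call site at the cost of one more parameter
on every function along the path.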

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-22 17:43                             ` Alexandre Oliva
@ 2015-07-23 11:04                               ` Richard Biener
  2015-07-23 15:42                                 ` Alexandre Oliva
  0 siblings, 1 reply; 127+ messages in thread
From: Richard Biener @ 2015-07-23 11:04 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Jeff Law, GCC Patches, Christophe Lyon, David Edelsohn, Eric Botcazou

On Wed, Jul 22, 2015 at 7:33 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Jul 21, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
>
>> On Sat, Jul 18, 2015 at 9:37 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
>>> +         if (cfun->gimple_df)
>
>> If the cfun->gimple_df check is meant to decide whether this is a call or a
>> function, then no, this can't work reliably.  What else is this test for?
>
> It turns out it's not call or function, as I thought at first, but
> gimplifying or expanding the function.  split_complex_args is not used
> for calls.  So the above might actually work (minus the misleading
> comments I wrote), and I think it's cleaner than adding a bool
> expanding_p arg to split_complex_args and
> assign_parms_augmented_arg_list, called from gimplify_parameters (during
> gimplification of a function) and assign_parms (during its expansion).
> Do you agree, or would you prefer the explicit argument?

Hmm, ok.  Does using

   if (currently_expanding_to_rtl)

work?  I think it's slightly more descriptive.

Ok with that change.

Thanks,
Richard.

* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-23 11:04                               ` Richard Biener
@ 2015-07-23 15:42                                 ` Alexandre Oliva
  2015-07-23 20:35                                   ` Segher Boessenkool
  0 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-07-23 15:42 UTC (permalink / raw)
  To: Richard Biener
  Cc: Jeff Law, GCC Patches, Christophe Lyon, David Edelsohn, Eric Botcazou

On Jul 23, 2015, Richard Biener <richard.guenther@gmail.com> wrote:

> Hmm, ok.  Does using

>    if (currently_expanding_to_rtl)

> work?  I think it's slightly more descriptive.

Yeah.  Thanks, I've tested it with this change, and I'm now checking
this in (full patch first; adjusted incremental patch at the end):

[PR64164] Drop copyrename, use coalescible partition as base when optimizing.

From: Alexandre Oliva <aoliva@redhat.com>

for  gcc/ChangeLog

	PR rtl-optimization/64164
	* Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
	* tree-ssa-copyrename.c: Removed.
	* opts.c (default_options_table): Drop -ftree-copyrename.  Add
	-ftree-coalesce-vars.
	* passes.def: Drop all occurrences of pass_rename_ssa_copies.
	* common.opt (ftree-copyrename): Ignore.
	(ftree-coalesce-inlined-vars): Likewise.
	* doc/invoke.texi: Remove the ignored options above.
	* gimple-expr.h (gimple_can_coalesce_p): Move declaration
	* tree-ssa-coalesce.h: ... here.
	* tree-ssa-uncprop.c: Include tree-ssa-coalesce.h and other
	headers required by it.
	* gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
	across variables when flag_tree_coalesce_vars.  Check register
	use and promoted modes to allow coalescing.  Moved to
	tree-ssa-coalesce.c.
	* tree-ssa-live.c (struct tree_int_map_hasher): Move along
	with its member functions to tree-ssa-coalesce.c.
	(var_map_base_init): Likewise.  Renamed to
	compute_samebase_partition_bases.
	(partition_view_normal): Drop want_bases parameter.
	(partition_view_bitmap): Likewise.
	* tree-ssa-live.h: Adjust declarations.
	* tree-ssa-coalesce.c: Include explow.h.
	(build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs'
	default defs at the entry point.
	(dump_part_var_map): New.
	(compute_optimized_partition_bases): New, called by...
	(coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
	of compute_samebase_partition_bases.  Adjust.
	* alias.c (nonoverlapping_memrefs_p): Disregard gimple-regs.
	* cfgexpand.c (leader_merge): New.
	(get_rtl_for_parm_ssa_default_def): New.
	(set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
	vars.  Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
	(expand_one_stack_var_at): Handle anonymous SSA_NAMEs.  Drop
	redundant MEM attr setting.
	(expand_one_stack_var_1): Handle anonymous SSA_NAMEs.  Renamed
	from...
	(expand_one_stack_var): ... this.  New wrapper to check and
	skip already expanded SSA partitions.
	(record_alignment_for_reg_var): New, factored out of...
	(expand_one_var): ... this.
	(expand_one_ssa_partition): New.
	(adjust_one_expanded_partition_var): New.
	(expand_one_register_var): Check and skip already expanded SSA
	partitions.
	(expand_used_vars): Don't create DECLs for anonymous SSA
	names.  Expand all SSA partitions, then adjust all SSA names.
	(pass::execute): Replace the loops that set
	SA.partition_to_pseudo from partition leaders and cleared
	DECL_RTL for multi-location variables, and that which used to
	rename vars and set attrs, with one that clears DECL_RTL and
	checks that PARMs and RESULTs default_defs match DECL_RTL.
	* cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
	* emit-rtl.c (set_reg_attrs_for_parm): Handle NULL decl.
	* explow.c (promote_ssa_mode): New.
	* explow.h (promote_ssa_mode): Declare.
	* expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
	* function.c: Include cfgexpand.h.
	(use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
	(use_register_for_parm_decl): Wrapper for the above to
	special-case the result_ptr.
	(rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
	(split_complex_args): Take assign_parm_data_all argument.
	Pass it to rtl_for_parm.  Set up rtl and context for split
	args.
	(assign_parms_augmented_arg_list): Adjust.
	(maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
	multiple locations.  Recognize split complex args.
	(assign_parm_adjust_stack_rtl): Add all and parm arguments,
	for rtl_for_parm.  For SSA-assigned parms, zero stack_parm.
	(assign_parm_setup_block): Prefer SSA-assigned location.
	(assign_parm_setup_reg): Likewise.  Use entry_parm for equiv
	if stack_parm is NULL.
	(assign_parm_setup_stack): Prefer SSA-assigned location.
	(assign_parms): Maybe reset DECL_RTL of params.  Adjust stack
	rtl before testing for pointer bounds.  Special-case result_ptr.
	(expand_function_start): Maybe reset DECL_RTL of result.
	Prefer SSA-assigned location for result and static chain.
	Factor out DECL_RESULT and SET_DECL_RTL.
	* tree-outof-ssa.c (insert_value_copy_on_edge): Handle
	anonymous SSA names.  Use promote_ssa_mode.
	(get_temp_reg): Likewise.
	(remove_ssa_form): Adjust.
	* stor-layout.c (layout_decl): Don't set mem attributes of
	non-MEMs.
	* var-tracking.c (dataflow_set_clear_at_call): Take call_insn
	and get its reg_usage for reg invalidation.
	(compute_bb_dataflow): Pass it insn.
	(emit_notes_in_bb): Likewise.

for  gcc/testsuite/ChangeLog

	PR rtl-optimization/64164
	* gcc.dg/guality/pr54200.c: Add -fno-tree-coalesce-vars.
	* gcc.dg/ssp-1.c: Make counter a register.
	* gcc.dg/ssp-2.c: Likewise.
	* gcc.dg/torture/parm-coalesce.c: New.
---
 gcc/Makefile.in                              |    1 
 gcc/alias.c                                  |   13 +
 gcc/cfgexpand.c                              |  370 +++++++++++++++-----
 gcc/cfgexpand.h                              |    2 
 gcc/common.opt                               |   12 -
 gcc/doc/invoke.texi                          |   48 +--
 gcc/emit-rtl.c                               |    5 
 gcc/explow.c                                 |   29 ++
 gcc/explow.h                                 |    3 
 gcc/expr.c                                   |   39 +-
 gcc/function.c                               |  275 ++++++++++++---
 gcc/gimple-expr.c                            |   39 --
 gcc/gimple-expr.h                            |    1 
 gcc/opts.c                                   |    2 
 gcc/passes.def                               |    5 
 gcc/stor-layout.c                            |    3 
 gcc/testsuite/gcc.dg/guality/pr54200.c       |    2 
 gcc/testsuite/gcc.dg/ssp-1.c                 |    2 
 gcc/testsuite/gcc.dg/ssp-2.c                 |    2 
 gcc/testsuite/gcc.dg/torture/parm-coalesce.c |   40 ++
 gcc/tree-outof-ssa.c                         |   16 -
 gcc/tree-ssa-coalesce.c                      |  378 ++++++++++++++++++++-
 gcc/tree-ssa-coalesce.h                      |    1 
 gcc/tree-ssa-copyrename.c                    |  475 --------------------------
 gcc/tree-ssa-live.c                          |   99 -----
 gcc/tree-ssa-live.h                          |    4 
 gcc/tree-ssa-uncprop.c                       |    5 
 gcc/var-tracking.c                           |   12 -
 28 files changed, 1030 insertions(+), 853 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/parm-coalesce.c
 delete mode 100644 gcc/tree-ssa-copyrename.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 333461b..16d5582 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1444,7 +1444,6 @@ OBJS = \
 	tree-ssa-ccp.o \
 	tree-ssa-coalesce.o \
 	tree-ssa-copy.o \
-	tree-ssa-copyrename.o \
 	tree-ssa-dce.o \
 	tree-ssa-dom.o \
 	tree-ssa-dse.o \
diff --git a/gcc/alias.c b/gcc/alias.c
index 3203722..69e3732 100644
--- a/gcc/alias.c
+++ b/gcc/alias.c
@@ -2507,6 +2507,19 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
   if (! DECL_P (exprx) || ! DECL_P (expry))
     return 0;
 
+  /* If we refer to different gimple registers, or one gimple register
+     and one non-gimple-register, we know they can't overlap.  First,
+     gimple registers don't have their addresses taken.  Now, there
+     could be more than one stack slot for (different versions of) the
+     same gimple register, but we can presumably tell they don't
+     overlap based on offsets from stack base addresses elsewhere.
+     It's important that we don't proceed to DECL_RTL, because gimple
+     registers may not pass DECL_RTL_SET_P, and make_decl_rtl won't be
+     able to do anything about them since no SSA information will have
+     remained to guide it.  */
+  if (is_gimple_reg (exprx) || is_gimple_reg (expry))
+    return exprx != expry;
+
   /* With invalid code we can end up storing into the constant pool.
      Bail out to avoid ICEing when creating RTL for this.
      See gfortran.dg/lto/20091028-2_0.f90.  */
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index a047632..0b19953 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -150,21 +150,121 @@ gimple_assign_rhs_to_tree (gimple stmt)
 
 #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
 
+/* Choose either CUR or NEXT as the leader DECL for a partition.
+   Prefer ignored decls, to simplify debug dumps and reduce ambiguity
+   out of the same user variable being in multiple partitions (this is
+   less likely for compiler-introduced temps).  */
+
+static tree
+leader_merge (tree cur, tree next)
+{
+  if (cur == NULL || cur == next)
+    return next;
+
+  if (DECL_P (cur) && DECL_IGNORED_P (cur))
+    return cur;
+
+  if (DECL_P (next) && DECL_IGNORED_P (next))
+    return next;
+
+  return cur;
+}
+
+
+/* Return the RTL for the default SSA def of a PARM or RESULT, if
+   there is one.  */
+
+rtx
+get_rtl_for_parm_ssa_default_def (tree var)
+{
+  gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL);
+
+  if (!is_gimple_reg (var))
+    return NULL_RTX;
+
+  /* If we've already determined RTL for the decl, use it.  This is
+     not just an optimization: if VAR is a PARM whose incoming value
+     is unused, we won't find a default def to use its partition, but
+     we still want to use the location of the parm, if it was used at
+     all.  During assign_parms, until a location is assigned for the
+     VAR, RTL can only be set for a parm or result if we're not coalescing
+     across variables, when we know we're coalescing all SSA_NAMEs of
+     each parm or result, and we're not coalescing them with names
+     pertaining to other variables, such as other parms' default
+     defs.  */
+  if (DECL_RTL_SET_P (var))
+    {
+      gcc_assert (DECL_RTL (var) != pc_rtx);
+      return DECL_RTL (var);
+    }
+
+  tree name = ssa_default_def (cfun, var);
+
+  if (!name)
+    return NULL_RTX;
+
+  int part = var_to_partition (SA.map, name);
+  if (part == NO_PARTITION)
+    return NULL_RTX;
+
+  return SA.partition_to_pseudo[part];
+}
+
 /* Associate declaration T with storage space X.  If T is no
    SSA name this is exactly SET_DECL_RTL, otherwise make the
    partition of T associated with X.  */
 static inline void
 set_rtl (tree t, rtx x)
 {
+  if (x && SSAVAR (t))
+    {
+      bool skip = false;
+      tree cur = NULL_TREE;
+
+      if (MEM_P (x))
+	cur = MEM_EXPR (x);
+      else if (REG_P (x))
+	cur = REG_EXPR (x);
+      else if (GET_CODE (x) == CONCAT
+	       && REG_P (XEXP (x, 0)))
+	cur = REG_EXPR (XEXP (x, 0));
+      else if (GET_CODE (x) == PARALLEL)
+	cur = REG_EXPR (XVECEXP (x, 0, 0));
+      else if (x == pc_rtx)
+	skip = true;
+      else
+	gcc_unreachable ();
+
+      tree next = skip ? cur : leader_merge (cur, SSAVAR (t));
+
+      if (cur != next)
+	{
+	  if (MEM_P (x))
+	    set_mem_attributes (x, next, true);
+	  else
+	    set_reg_attrs_for_decl_rtl (next, x);
+	}
+    }
+
   if (TREE_CODE (t) == SSA_NAME)
     {
-      SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x;
-      if (x && !MEM_P (x))
-	set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x);
-      /* For the benefit of debug information at -O0 (where vartracking
-         doesn't run) record the place also in the base DECL if it's
-	 a normal variable (not a parameter).  */
-      if (x && x != pc_rtx && TREE_CODE (SSA_NAME_VAR (t)) == VAR_DECL)
+      int part = var_to_partition (SA.map, t);
+      if (part != NO_PARTITION)
+	{
+	  if (SA.partition_to_pseudo[part])
+	    gcc_assert (SA.partition_to_pseudo[part] == x);
+	  else
+	    SA.partition_to_pseudo[part] = x;
+	}
+      /* For the benefit of debug information at -O0 (where
+         vartracking doesn't run) record the place also in the base
+         DECL.  For PARMs and RESULTs, we may end up resetting these
+         in function.c:maybe_reset_rtl_for_parm, but in some rare
+         cases we may need them (unused and overwritten incoming
+         value, that at -O0 must share the location with the other
+         uses in spite of the missing default def), and this may be
+         the only chance to preserve them.  */
+      if (x && x != pc_rtx && SSA_NAME_VAR (t))
 	{
 	  tree var = SSA_NAME_VAR (t);
 	  /* If we don't yet have something recorded, just record it now.  */
@@ -862,7 +962,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
   gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
 
   x = plus_constant (Pmode, base, offset);
-  x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x);
+  x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
+		   ? TYPE_MODE (TREE_TYPE (decl))
+		   : DECL_MODE (SSAVAR (decl)), x);
 
   if (TREE_CODE (decl) != SSA_NAME)
     {
@@ -884,7 +986,6 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
       DECL_USER_ALIGN (decl) = 0;
     }
 
-  set_mem_attributes (x, SSAVAR (decl), true);
   set_rtl (decl, x);
 }
 
@@ -1099,13 +1200,22 @@ account_stack_vars (void)
    to a variable to be allocated in the stack frame.  */
 
 static void
-expand_one_stack_var (tree var)
+expand_one_stack_var_1 (tree var)
 {
   HOST_WIDE_INT size, offset;
   unsigned byte_align;
 
-  size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
-  byte_align = align_local_variable (SSAVAR (var));
+  if (TREE_CODE (var) == SSA_NAME)
+    {
+      tree type = TREE_TYPE (var);
+      size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
+      byte_align = TYPE_ALIGN_UNIT (type);
+    }
+  else
+    {
+      size = tree_to_uhwi (DECL_SIZE_UNIT (var));
+      byte_align = align_local_variable (var);
+    }
 
   /* We handle highly aligned variables in expand_stack_vars.  */
   gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
@@ -1116,6 +1226,27 @@ expand_one_stack_var (tree var)
 			   crtl->max_used_stack_slot_alignment, offset);
 }
 
+/* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are
+   already assigned some MEM.  */
+
+static void
+expand_one_stack_var (tree var)
+{
+  if (TREE_CODE (var) == SSA_NAME)
+    {
+      int part = var_to_partition (SA.map, var);
+      if (part != NO_PARTITION)
+	{
+	  rtx x = SA.partition_to_pseudo[part];
+	  gcc_assert (x);
+	  gcc_assert (MEM_P (x));
+	  return;
+	}
+    }
+
+  return expand_one_stack_var_1 (var);
+}
+
 /* A subroutine of expand_one_var.  Called to assign rtl to a VAR_DECL
    that will reside in a hard register.  */
 
@@ -1125,13 +1256,114 @@ expand_one_hard_reg_var (tree var)
   rest_of_decl_compilation (var, 0, 0);
 }
 
+/* Record the alignment requirements of some variable assigned to a
+   pseudo.  */
+
+static void
+record_alignment_for_reg_var (unsigned int align)
+{
+  if (SUPPORTS_STACK_ALIGNMENT
+      && crtl->stack_alignment_estimated < align)
+    {
+      /* stack_alignment_estimated shouldn't change after stack
+         realign decision made */
+      gcc_assert (!crtl->stack_realign_processed);
+      crtl->stack_alignment_estimated = align;
+    }
+
+  /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
+     So here we only make sure stack_alignment_needed >= align.  */
+  if (crtl->stack_alignment_needed < align)
+    crtl->stack_alignment_needed = align;
+  if (crtl->max_used_stack_slot_alignment < align)
+    crtl->max_used_stack_slot_alignment = align;
+}
+
+/* Create RTL for an SSA partition.  */
+
+static void
+expand_one_ssa_partition (tree var)
+{
+  int part = var_to_partition (SA.map, var);
+  gcc_assert (part != NO_PARTITION);
+
+  if (SA.partition_to_pseudo[part])
+    return;
+
+  if (!use_register_for_decl (var))
+    {
+      expand_one_stack_var_1 (var);
+      return;
+    }
+
+  unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (var),
+					  TYPE_MODE (TREE_TYPE (var)),
+					  TYPE_ALIGN (TREE_TYPE (var)));
+
+  /* If the variable alignment is very large we'll dynamically allocate
+     it, which means that the in-frame portion is just a pointer.  */
+  if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
+    align = POINTER_SIZE;
+
+  record_alignment_for_reg_var (align);
+
+  machine_mode reg_mode = promote_ssa_mode (var, NULL);
+
+  rtx x = gen_reg_rtx (reg_mode);
+
+  set_rtl (var, x);
+}
+
+/* Record the association between the RTL generated for a partition
+   and the underlying variable of the SSA_NAME.  */
+
+static void
+adjust_one_expanded_partition_var (tree var)
+{
+  if (!var)
+    return;
+
+  tree decl = SSA_NAME_VAR (var);
+
+  int part = var_to_partition (SA.map, var);
+  if (part == NO_PARTITION)
+    return;
+
+  rtx x = SA.partition_to_pseudo[part];
+
+  set_rtl (var, x);
+
+  if (!REG_P (x))
+    return;
+
+  /* Note if the object is a user variable.  */
+  if (decl && !DECL_ARTIFICIAL (decl))
+    mark_user_reg (x);
+
+  if (POINTER_TYPE_P (decl ? TREE_TYPE (decl) : TREE_TYPE (var)))
+    mark_reg_pointer (x, get_pointer_alignment (var));
+}
+
 /* A subroutine of expand_one_var.  Called to assign rtl to a VAR_DECL
    that will reside in a pseudo register.  */
 
 static void
 expand_one_register_var (tree var)
 {
-  tree decl = SSAVAR (var);
+  if (TREE_CODE (var) == SSA_NAME)
+    {
+      int part = var_to_partition (SA.map, var);
+      if (part != NO_PARTITION)
+	{
+	  rtx x = SA.partition_to_pseudo[part];
+	  gcc_assert (x);
+	  gcc_assert (REG_P (x));
+	  return;
+	}
+      gcc_unreachable ();
+    }
+
+  tree decl = var;
   tree type = TREE_TYPE (decl);
   machine_mode reg_mode = promote_decl_mode (decl, NULL);
   rtx x = gen_reg_rtx (reg_mode);
@@ -1265,21 +1497,7 @@ expand_one_var (tree var, bool toplevel, bool really_expand)
 	align = POINTER_SIZE;
     }
 
-  if (SUPPORTS_STACK_ALIGNMENT
-      && crtl->stack_alignment_estimated < align)
-    {
-      /* stack_alignment_estimated shouldn't change after stack
-         realign decision made */
-      gcc_assert (!crtl->stack_realign_processed);
-      crtl->stack_alignment_estimated = align;
-    }
-
-  /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
-     So here we only make sure stack_alignment_needed >= align.  */
-  if (crtl->stack_alignment_needed < align)
-    crtl->stack_alignment_needed = align;
-  if (crtl->max_used_stack_slot_alignment < align)
-    crtl->max_used_stack_slot_alignment = align;
+  record_alignment_for_reg_var (align);
 
   if (TREE_CODE (origvar) == SSA_NAME)
     {
@@ -1713,48 +1931,18 @@ expand_used_vars (void)
   if (targetm.use_pseudo_pic_reg ())
     pic_offset_table_rtx = gen_reg_rtx (Pmode);
 
-  hash_map<tree, tree> ssa_name_decls;
   for (i = 0; i < SA.map->num_partitions; i++)
     {
       tree var = partition_to_var (SA.map, i);
 
       gcc_assert (!virtual_operand_p (var));
 
-      /* Assign decls to each SSA name partition, share decls for partitions
-         we could have coalesced (those with the same type).  */
-      if (SSA_NAME_VAR (var) == NULL_TREE)
-	{
-	  tree *slot = &ssa_name_decls.get_or_insert (TREE_TYPE (var));
-	  if (!*slot)
-	    *slot = create_tmp_reg (TREE_TYPE (var));
-	  replace_ssa_name_symbol (var, *slot);
-	}
-
-      /* Always allocate space for partitions based on VAR_DECLs.  But for
-	 those based on PARM_DECLs or RESULT_DECLs and which matter for the
-	 debug info, there is no need to do so if optimization is disabled
-	 because all the SSA_NAMEs based on these DECLs have been coalesced
-	 into a single partition, which is thus assigned the canonical RTL
-	 location of the DECLs.  If in_lto_p, we can't rely on optimize,
-	 a function could be compiled with -O1 -flto first and only the
-	 link performed at -O0.  */
-      if (TREE_CODE (SSA_NAME_VAR (var)) == VAR_DECL)
-	expand_one_var (var, true, true);
-      else if (DECL_IGNORED_P (SSA_NAME_VAR (var)) || optimize || in_lto_p)
-	{
-	  /* This is a PARM_DECL or RESULT_DECL.  For those partitions that
-	     contain the default def (representing the parm or result itself)
-	     we don't do anything here.  But those which don't contain the
-	     default def (representing a temporary based on the parm/result)
-	     we need to allocate space just like for normal VAR_DECLs.  */
-	  if (!bitmap_bit_p (SA.partition_has_default_def, i))
-	    {
-	      expand_one_var (var, true, true);
-	      gcc_assert (SA.partition_to_pseudo[i]);
-	    }
-	}
+      expand_one_ssa_partition (var);
     }
 
+  for (i = 1; i < num_ssa_names; i++)
+    adjust_one_expanded_partition_var (ssa_name (i));
+
   if (flag_stack_protect == SPCT_FLAG_STRONG)
       gen_stack_protect_signal
 	= stack_protect_decl_p () || stack_protect_return_slot_p ();
@@ -5928,35 +6116,6 @@ pass_expand::execute (function *fun)
       parm_birth_insn = var_seq;
     }
 
-  /* Now that we also have the parameter RTXs, copy them over to our
-     partitions.  */
-  for (i = 0; i < SA.map->num_partitions; i++)
-    {
-      tree var = SSA_NAME_VAR (partition_to_var (SA.map, i));
-
-      if (TREE_CODE (var) != VAR_DECL
-	  && !SA.partition_to_pseudo[i])
-	SA.partition_to_pseudo[i] = DECL_RTL_IF_SET (var);
-      gcc_assert (SA.partition_to_pseudo[i]);
-
-      /* If this decl was marked as living in multiple places, reset
-	 this now to NULL.  */
-      if (DECL_RTL_IF_SET (var) == pc_rtx)
-	SET_DECL_RTL (var, NULL);
-
-      /* Some RTL parts really want to look at DECL_RTL(x) when x
-	 was a decl marked in REG_ATTR or MEM_ATTR.  We could use
-	 SET_DECL_RTL here making this available, but that would mean
-	 to select one of the potentially many RTLs for one DECL.  Instead
-	 of doing that we simply reset the MEM_EXPR of the RTL in question,
-	 then nobody can get at it and hence nobody can call DECL_RTL on it.  */
-      if (!DECL_RTL_SET_P (var))
-	{
-	  if (MEM_P (SA.partition_to_pseudo[i]))
-	    set_mem_expr (SA.partition_to_pseudo[i], NULL);
-	}
-    }
-
   /* If we have a class containing differently aligned pointers
      we need to merge those into the corresponding RTL pointer
      alignment.  */
@@ -5964,7 +6123,6 @@ pass_expand::execute (function *fun)
     {
       tree name = ssa_name (i);
       int part;
-      rtx r;
 
       if (!name
 	  /* We might have generated new SSA names in
@@ -5977,20 +6135,24 @@ pass_expand::execute (function *fun)
       if (part == NO_PARTITION)
 	continue;
 
-      /* Adjust all partition members to get the underlying decl of
-	 the representative which we might have created in expand_one_var.  */
-      if (SSA_NAME_VAR (name) == NULL_TREE)
+      gcc_assert (SA.partition_to_pseudo[part]);
+
+      /* If this decl was marked as living in multiple places, reset
+	 this now to NULL.  */
+      tree var = SSA_NAME_VAR (name);
+      if (var && DECL_RTL_IF_SET (var) == pc_rtx)
+	SET_DECL_RTL (var, NULL);
+      /* Check that the pseudos chosen by assign_parms are those of
+	 the corresponding default defs.  */
+      else if (SSA_NAME_IS_DEFAULT_DEF (name)
+	       && (TREE_CODE (var) == PARM_DECL
+		   || TREE_CODE (var) == RESULT_DECL))
 	{
-	  tree leader = partition_to_var (SA.map, part);
-	  gcc_assert (SSA_NAME_VAR (leader) != NULL_TREE);
-	  replace_ssa_name_symbol (name, SSA_NAME_VAR (leader));
+	  rtx in = DECL_RTL_IF_SET (var);
+	  gcc_assert (in);
+	  rtx out = SA.partition_to_pseudo[part];
+	  gcc_assert (in == out || rtx_equal_p (in, out));
 	}
-      if (!POINTER_TYPE_P (TREE_TYPE (name)))
-	continue;
-
-      r = SA.partition_to_pseudo[part];
-      if (REG_P (r))
-	mark_reg_pointer (r, get_pointer_alignment (name));
     }
 
   /* If this function is `main', emit a call to `__main'
diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
index a0b6e3e..602579d 100644
--- a/gcc/cfgexpand.h
+++ b/gcc/cfgexpand.h
@@ -22,5 +22,7 @@ along with GCC; see the file COPYING3.  If not see
 
 extern tree gimple_assign_rhs_to_tree (gimple);
 extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
+extern rtx get_rtl_for_parm_ssa_default_def (tree var);
+
 
 #endif /* GCC_CFGEXPAND_H */
diff --git a/gcc/common.opt b/gcc/common.opt
index 8f25f8b..6d47e94 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2230,16 +2230,16 @@ Common Report Var(flag_tree_ch) Optimization
 Enable loop header copying on trees
 
 ftree-coalesce-inlined-vars
-Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization
-Enable coalescing of copy-related user variables that are inlined
+Common Ignore RejectNegative
+Does nothing.  Preserved for backward compatibility.
 
 ftree-coalesce-vars
-Common Report Var(flag_ssa_coalesce_vars,2) Optimization
-Enable coalescing of all copy-related user variables
+Common Report Var(flag_tree_coalesce_vars) Optimization
+Enable SSA coalescing of user variables
 
 ftree-copyrename
-Common Report Var(flag_tree_copyrename) Optimization
-Replace SSA temporaries with better names in copies
+Common Ignore
+Does nothing.  Preserved for backward compatibility.
 
 ftree-copy-prop
 Common Report Var(flag_tree_copy_prop) Optimization
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index a62c8b3..4b4ed2d 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -339,7 +339,6 @@ Objective-C and Objective-C++ Dialects}.
 -fdump-tree-phiprop@r{[}-@var{n}@r{]} @gol
 -fdump-tree-phiopt@r{[}-@var{n}@r{]} @gol
 -fdump-tree-forwprop@r{[}-@var{n}@r{]} @gol
--fdump-tree-copyrename@r{[}-@var{n}@r{]} @gol
 -fdump-tree-nrv -fdump-tree-vect @gol
 -fdump-tree-sink @gol
 -fdump-tree-sra@r{[}-@var{n}@r{]} @gol
@@ -445,9 +444,8 @@ Objective-C and Objective-C++ Dialects}.
 -fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol
 -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
 -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
--ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
--ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
--ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
+-ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol
+-ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
 -ftree-loop-if-convert-stores -ftree-loop-im @gol
 -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
 -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
@@ -7078,11 +7076,6 @@ name is made by appending @file{.phiopt} to the source file name.
 Dump each function after forward propagating single use variables.  The file
 name is made by appending @file{.forwprop} to the source file name.
 
-@item copyrename
-@opindex fdump-tree-copyrename
-Dump each function after applying the copy rename optimization.  The file
-name is made by appending @file{.copyrename} to the source file name.
-
 @item nrv
 @opindex fdump-tree-nrv
 Dump each function after applying the named return value optimization on
@@ -7547,8 +7540,8 @@ compilation time.
 -ftree-ccp @gol
 -fssa-phiopt @gol
 -ftree-ch @gol
+-ftree-coalesce-vars @gol
 -ftree-copy-prop @gol
--ftree-copyrename @gol
 -ftree-dce @gol
 -ftree-dominator-opts @gol
 -ftree-dse @gol
@@ -8812,6 +8805,15 @@ be parallelized.  Parallelize all the loops that can be analyzed to
 not contain loop carried dependences without checking that it is
 profitable to parallelize the loops.
 
+@item -ftree-coalesce-vars
+@opindex ftree-coalesce-vars
+Tell the compiler to attempt to combine small user-defined variables
+too, instead of just compiler temporaries.  This may severely limit the
+ability to debug an optimized program compiled with
+@option{-fno-var-tracking-assignments}.  In the negated form, this flag
+prevents SSA coalescing of user variables.  This option is enabled by
+default if optimization is enabled.
+
 @item -ftree-loop-if-convert
 @opindex ftree-loop-if-convert
 Attempt to transform conditional jumps in the innermost loops to
@@ -8925,32 +8927,6 @@ Perform scalar replacement of aggregates.  This pass replaces structure
 references with scalars to prevent committing structures to memory too
 early.  This flag is enabled by default at @option{-O} and higher.
 
-@item -ftree-copyrename
-@opindex ftree-copyrename
-Perform copy renaming on trees.  This pass attempts to rename compiler
-temporaries to other variables at copy locations, usually resulting in
-variable names which more closely resemble the original variables.  This flag
-is enabled by default at @option{-O} and higher.
-
-@item -ftree-coalesce-inlined-vars
-@opindex ftree-coalesce-inlined-vars
-Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
-combine small user-defined variables too, but only if they are inlined
-from other functions.  It is a more limited form of
-@option{-ftree-coalesce-vars}.  This may harm debug information of such
-inlined variables, but it keeps variables of the inlined-into
-function apart from each other, such that they are more likely to
-contain the expected values in a debugging session.
-
-@item -ftree-coalesce-vars
-@opindex ftree-coalesce-vars
-Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
-combine small user-defined variables too, instead of just compiler
-temporaries.  This may severely limit the ability to debug an optimized
-program compiled with @option{-fno-var-tracking-assignments}.  In the
-negated form, this flag prevents SSA coalescing of user variables,
-including inlined ones.  This option is enabled by default.
-
 @item -ftree-ter
 @opindex ftree-ter
 Perform temporary expression replacement during the SSA->normal phase.  Single
diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index ed2b30b..0648af6 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -1232,6 +1232,9 @@ set_reg_attrs_for_parm (rtx parm_rtx, rtx mem)
 void
 set_reg_attrs_for_decl_rtl (tree t, rtx x)
 {
+  if (!t)
+    return;
+  tree tdecl = t;
   if (GET_CODE (x) == SUBREG)
     {
       gcc_assert (subreg_lowpart_p (x));
@@ -1240,7 +1243,7 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x)
   if (REG_P (x))
     REG_ATTRS (x)
       = get_reg_attrs (t, byte_lowpart_offset (GET_MODE (x),
-					       DECL_MODE (t)));
+					       DECL_MODE (tdecl)));
   if (GET_CODE (x) == CONCAT)
     {
       if (REG_P (XEXP (x, 0)))
diff --git a/gcc/explow.c b/gcc/explow.c
index bd342c1..6941f4e 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -842,6 +842,35 @@ promote_decl_mode (const_tree decl, int *punsignedp)
   return pmode;
 }
 
+/* Return the promoted mode for NAME.  If it is a named SSA_NAME, it
+   is the same as promote_decl_mode.  Otherwise, it is the promoted
+   mode of a temp decl of the same type as the SSA_NAME, had we created
+   one.  */
+
+machine_mode
+promote_ssa_mode (const_tree name, int *punsignedp)
+{
+  gcc_assert (TREE_CODE (name) == SSA_NAME);
+
+  /* Partitions holding parms and results must be promoted as expected
+     by function.c.  */
+  if (SSA_NAME_VAR (name)
+      && (TREE_CODE (SSA_NAME_VAR (name)) == PARM_DECL
+	  || TREE_CODE (SSA_NAME_VAR (name)) == RESULT_DECL))
+    return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
+
+  tree type = TREE_TYPE (name);
+  int unsignedp = TYPE_UNSIGNED (type);
+  machine_mode mode = TYPE_MODE (type);
+
+  machine_mode pmode = promote_mode (type, mode, &unsignedp);
+  if (punsignedp)
+    *punsignedp = unsignedp;
+
+  return pmode;
+}
+
+
 \f
 /* Controls the behaviour of {anti_,}adjust_stack.  */
 static bool suppress_reg_args_size;
diff --git a/gcc/explow.h b/gcc/explow.h
index 94613de..52113db 100644
--- a/gcc/explow.h
+++ b/gcc/explow.h
@@ -57,6 +57,9 @@ extern machine_mode promote_mode (const_tree, machine_mode, int *);
 /* Return mode and signedness to use when object is promoted.  */
 machine_mode promote_decl_mode (const_tree, int *);
 
+/* Return mode and signedness to use when an SSA_NAME is promoted.  */
+machine_mode promote_ssa_mode (const_tree, int *);
+
 /* Remove some bytes from the stack.  An rtx says how many.  */
 extern void adjust_stack (rtx);
 
diff --git a/gcc/expr.c b/gcc/expr.c
index 899a42c..d601129 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9246,7 +9246,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
   rtx op0, op1, temp, decl_rtl;
   tree type;
   int unsignedp;
-  machine_mode mode;
+  machine_mode mode, dmode;
   enum tree_code code = TREE_CODE (exp);
   rtx subtarget, original_target;
   int ignore;
@@ -9377,7 +9377,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
       if (g == NULL
 	  && modifier == EXPAND_INITIALIZER
 	  && !SSA_NAME_IS_DEFAULT_DEF (exp)
-	  && (optimize || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
+	  && (optimize || !SSA_NAME_VAR (exp)
+	      || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
 	  && stmt_is_replaceable_p (SSA_NAME_DEF_STMT (exp)))
 	g = SSA_NAME_DEF_STMT (exp);
       if (g)
@@ -9456,15 +9457,18 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
       /* Ensure variable marked as used even if it doesn't go through
 	 a parser.  If it hasn't be used yet, write out an external
 	 definition.  */
-      TREE_USED (exp) = 1;
+      if (exp)
+	TREE_USED (exp) = 1;
 
       /* Show we haven't gotten RTL for this yet.  */
       temp = 0;
 
       /* Variables inherited from containing functions should have
 	 been lowered by this point.  */
-      context = decl_function_context (exp);
-      gcc_assert (SCOPE_FILE_SCOPE_P (context)
+      if (exp)
+	context = decl_function_context (exp);
+      gcc_assert (!exp
+		  || SCOPE_FILE_SCOPE_P (context)
 		  || context == current_function_decl
 		  || TREE_STATIC (exp)
 		  || DECL_EXTERNAL (exp)
@@ -9488,7 +9492,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 	  decl_rtl = use_anchored_address (decl_rtl);
 	  if (modifier != EXPAND_CONST_ADDRESS
 	      && modifier != EXPAND_SUM
-	      && !memory_address_addr_space_p (DECL_MODE (exp),
+	      && !memory_address_addr_space_p (exp ? DECL_MODE (exp)
+					       : GET_MODE (decl_rtl),
 					       XEXP (decl_rtl, 0),
 					       MEM_ADDR_SPACE (decl_rtl)))
 	    temp = replace_equiv_address (decl_rtl,
@@ -9499,12 +9504,17 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 	 if the address is a register.  */
       if (temp != 0)
 	{
-	  if (MEM_P (temp) && REG_P (XEXP (temp, 0)))
+	  if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0)))
 	    mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp));
 
 	  return temp;
 	}
 
+      if (exp)
+	dmode = DECL_MODE (exp);
+      else
+	dmode = TYPE_MODE (TREE_TYPE (ssa_name));
+
       /* If the mode of DECL_RTL does not match that of the decl,
 	 there are two cases: we are dealing with a BLKmode value
 	 that is returned in a register, or we are dealing with
@@ -9512,22 +9522,23 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 	 of the wanted mode, but mark it so that we know that it
 	 was already extended.  */
       if (REG_P (decl_rtl)
-	  && DECL_MODE (exp) != BLKmode
-	  && GET_MODE (decl_rtl) != DECL_MODE (exp))
+	  && dmode != BLKmode
+	  && GET_MODE (decl_rtl) != dmode)
 	{
 	  machine_mode pmode;
 
 	  /* Get the signedness to be used for this variable.  Ensure we get
 	     the same mode we got when the variable was declared.  */
-	  if (code == SSA_NAME
-	      && (g = SSA_NAME_DEF_STMT (ssa_name))
-	      && gimple_code (g) == GIMPLE_CALL
-	      && !gimple_call_internal_p (g))
+	  if (code != SSA_NAME)
+	    pmode = promote_decl_mode (exp, &unsignedp);
+	  else if ((g = SSA_NAME_DEF_STMT (ssa_name))
+		   && gimple_code (g) == GIMPLE_CALL
+		   && !gimple_call_internal_p (g))
 	    pmode = promote_function_mode (type, mode, &unsignedp,
 					   gimple_call_fntype (g),
 					   2);
 	  else
-	    pmode = promote_decl_mode (exp, &unsignedp);
+	    pmode = promote_ssa_mode (ssa_name, &unsignedp);
 	  gcc_assert (GET_MODE (decl_rtl) == pmode);
 
 	  temp = gen_lowpart_SUBREG (mode, decl_rtl);
diff --git a/gcc/function.c b/gcc/function.c
index f9d11bf4..c3d00cd 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -72,6 +72,9 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfganal.h"
 #include "cfgbuild.h"
 #include "cfgcleanup.h"
+#include "cfgexpand.h"
+#include "basic-block.h"
+#include "df.h"
 #include "params.h"
 #include "bb-reorder.h"
 #include "shrink-wrap.h"
@@ -148,6 +151,8 @@ static bool contains (const_rtx, hash_table<insn_cache_hasher> *);
 static void prepare_function_start (void);
 static void do_clobber_return_reg (rtx, void *);
 static void do_use_return_reg (rtx, void *);
+static rtx rtl_for_parm (struct assign_parm_data_all *, tree);
+
 \f
 /* Stack of nested functions.  */
 /* Keep track of the cfun stack.  */
@@ -2105,6 +2110,30 @@ aggregate_value_p (const_tree exp, const_tree fntype)
 bool
 use_register_for_decl (const_tree decl)
 {
+  if (TREE_CODE (decl) == SSA_NAME)
+    {
+      /* We often try to use the SSA_NAME, instead of its underlying
+	 decl, to get type information and guide decisions, to avoid
+	 differences of behavior between anonymous and named
+	 variables, but in this one case we have to go for the actual
+	 variable if there is one.  The main reason is that, at least
+	 at -O0, we want to place user variables on the stack, but we
+	 don't mind using pseudos for anonymous or ignored temps.
+	 Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
+	 should go in pseudos, whereas their corresponding variables
+	 might have to go on the stack.  So, disregarding the decl
+	 here would negatively impact debug info at -O0, enable
+	 coalescing between SSA_NAMEs that ought to get different
+	 stack/pseudo assignments, and get the incoming argument
+	 processing thoroughly confused by PARM_DECLs expected to live
+	 in stack slots but assigned to pseudos.  */
+      if (!SSA_NAME_VAR (decl))
+	return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
+	  && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
+
+      decl = SSA_NAME_VAR (decl);
+    }
+
   if (!targetm.calls.allocate_stack_slots_for_args ())
     return true;
 
@@ -2240,7 +2269,7 @@ assign_parms_initialize_all (struct assign_parm_data_all *all)
    needed, else the old list.  */
 
 static void
-split_complex_args (vec<tree> *args)
+split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
 {
   unsigned i;
   tree p;
@@ -2251,6 +2280,7 @@ split_complex_args (vec<tree> *args)
       if (TREE_CODE (type) == COMPLEX_TYPE
 	  && targetm.calls.split_complex_arg (type))
 	{
+	  tree cparm = p;
 	  tree decl;
 	  tree subtype = TREE_TYPE (type);
 	  bool addressable = TREE_ADDRESSABLE (p);
@@ -2269,6 +2299,9 @@ split_complex_args (vec<tree> *args)
 	  DECL_ARTIFICIAL (p) = addressable;
 	  DECL_IGNORED_P (p) = addressable;
 	  TREE_ADDRESSABLE (p) = 0;
+	  /* Reset the RTL before layout_decl, or it may change the
+	     mode of the RTL of the original argument copied to P.  */
+	  SET_DECL_RTL (p, NULL_RTX);
 	  layout_decl (p, 0);
 	  (*args)[i] = p;
 
@@ -2280,6 +2313,25 @@ split_complex_args (vec<tree> *args)
 	  DECL_IGNORED_P (decl) = addressable;
 	  layout_decl (decl, 0);
 	  args->safe_insert (++i, decl);
+
+	  /* If we are expanding a function, rather than gimplifying
+	     it, propagate the RTL of the complex parm to the split
+	     declarations, and set their contexts so that
+	     maybe_reset_rtl_for_parm can recognize them and refrain
+	     from resetting their RTL.  */
+	  if (currently_expanding_to_rtl)
+	    {
+	      rtx rtl = rtl_for_parm (all, cparm);
+	      gcc_assert (!rtl || GET_CODE (rtl) == CONCAT);
+	      if (rtl)
+		{
+		  SET_DECL_RTL (p, XEXP (rtl, 0));
+		  SET_DECL_RTL (decl, XEXP (rtl, 1));
+
+		  DECL_CONTEXT (p) = cparm;
+		  DECL_CONTEXT (decl) = cparm;
+		}
+	    }
 	}
     }
 }
@@ -2342,7 +2394,7 @@ assign_parms_augmented_arg_list (struct assign_parm_data_all *all)
 
   /* If the target wants to split complex arguments into scalars, do so.  */
   if (targetm.calls.split_complex_arg)
-    split_complex_args (&fnargs);
+    split_complex_args (all, &fnargs);
 
   return fnargs;
 }
@@ -2745,23 +2797,98 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data)
   data->entry_parm = entry_parm;
 }
 
+/* Wrapper for use_register_for_decl that special-cases the
+   .result_ptr as the function's RESULT_DECL when the RESULT_DECL is
+   passed by reference.  */
+
+static bool
+use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm)
+{
+  if (parm == all->function_result_decl)
+    {
+      tree result = DECL_RESULT (current_function_decl);
+
+      if (DECL_BY_REFERENCE (result))
+	parm = result;
+    }
+
+  return use_register_for_decl (parm);
+}
+
+/* Wrapper for get_rtl_for_parm_ssa_default_def that special-cases
+   the .result_ptr as the function's RESULT_DECL when the RESULT_DECL
+   is passed by reference.  */
+
+static rtx
+rtl_for_parm (struct assign_parm_data_all *all, tree parm)
+{
+  if (parm == all->function_result_decl)
+    {
+      tree result = DECL_RESULT (current_function_decl);
+
+      if (!DECL_BY_REFERENCE (result))
+	return NULL_RTX;
+
+      parm = result;
+    }
+
+  return get_rtl_for_parm_ssa_default_def (parm);
+}
+
+/* Reset the location of PARM_DECLs and RESULT_DECLs that had
+   SSA_NAMEs in multiple partitions, so that assign_parms will choose
+   the default def, if it exists, or create new RTL to hold the unused
+   entry value.  If we are coalescing across variables, we want to
+   reset the location too, because a parm without a default def
+   (incoming value unused) might be coalesced with one with a default
+   def, and then assign_parms would copy both incoming values to the
+   same location, which might cause the wrong value to survive.  */
+static void
+maybe_reset_rtl_for_parm (tree parm)
+{
+  gcc_assert (TREE_CODE (parm) == PARM_DECL
+	      || TREE_CODE (parm) == RESULT_DECL);
+
+  /* This is a split complex parameter, and its context was set to its
+     original PARM_DECL in split_complex_args so that we could
+     recognize it here and not reset its RTL.  */
+  if (DECL_CONTEXT (parm) && TREE_CODE (DECL_CONTEXT (parm)) == PARM_DECL)
+    {
+      DECL_CONTEXT (parm) = DECL_CONTEXT (DECL_CONTEXT (parm));
+      return;
+    }
+
+  if ((flag_tree_coalesce_vars
+       || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx))
+      && is_gimple_reg (parm))
+    SET_DECL_RTL (parm, NULL_RTX);
+}
+
 /* A subroutine of assign_parms.  Adjust DATA->STACK_RTL such that it's
    always valid and properly aligned.  */
 
 static void
-assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data)
+assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm,
+			      struct assign_parm_data_one *data)
 {
   rtx stack_parm = data->stack_parm;
 
+  /* If out-of-SSA assigned RTL to the parm default def, make sure we
+     don't use what we might have computed before.  */
+  rtx ssa_assigned = rtl_for_parm (all, parm);
+  if (ssa_assigned)
+    stack_parm = NULL;
+
   /* If we can't trust the parm stack slot to be aligned enough for its
      ultimate type, don't use that slot after entry.  We'll make another
      stack slot, if we need one.  */
-  if (stack_parm
-      && ((STRICT_ALIGNMENT
-	   && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm))
-	  || (data->nominal_type
-	      && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
-	      && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
+  else if (stack_parm
+	   && ((STRICT_ALIGNMENT
+		&& (GET_MODE_ALIGNMENT (data->nominal_mode)
+		    > MEM_ALIGN (stack_parm)))
+	       || (data->nominal_type
+		   && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
+		   && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
     stack_parm = NULL;
 
   /* If parm was passed in memory, and we need to convert it on entry,
@@ -2823,14 +2950,21 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
 
   size = int_size_in_bytes (data->passed_type);
   size_stored = CEIL_ROUND (size, UNITS_PER_WORD);
+
   if (stack_parm == 0)
     {
       DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
-      stack_parm = assign_stack_local (BLKmode, size_stored,
-				       DECL_ALIGN (parm));
-      if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
-	PUT_MODE (stack_parm, GET_MODE (entry_parm));
-      set_mem_attributes (stack_parm, parm, 1);
+      stack_parm = rtl_for_parm (all, parm);
+      if (stack_parm)
+	stack_parm = copy_rtx (stack_parm);
+      else
+	{
+	  stack_parm = assign_stack_local (BLKmode, size_stored,
+					   DECL_ALIGN (parm));
+	  if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
+	    PUT_MODE (stack_parm, GET_MODE (entry_parm));
+	  set_mem_attributes (stack_parm, parm, 1);
+	}
     }
 
   /* If a BLKmode arrives in registers, copy it to a stack slot.  Handle
@@ -2968,10 +3102,19 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
     = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
 			     TREE_TYPE (current_function_decl), 2);
 
-  parmreg = gen_reg_rtx (promoted_nominal_mode);
+  rtx from_expand = rtl_for_parm (all, parm);
 
-  if (!DECL_ARTIFICIAL (parm))
-    mark_user_reg (parmreg);
+  if (from_expand && !data->passed_pointer)
+    {
+      parmreg = from_expand;
+      gcc_assert (GET_MODE (parmreg) == promoted_nominal_mode);
+    }
+  else
+    {
+      parmreg = gen_reg_rtx (promoted_nominal_mode);
+      if (!DECL_ARTIFICIAL (parm))
+	mark_user_reg (parmreg);
+    }
 
   /* If this was an item that we received a pointer to,
      set DECL_RTL appropriately.  */
@@ -2990,6 +3133,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
      assign_parm_find_data_types and expand_expr_real_1.  */
 
   equiv_stack_parm = data->stack_parm;
+  if (!equiv_stack_parm)
+    equiv_stack_parm = data->entry_parm;
   validated_mem = validize_mem (copy_rtx (data->entry_parm));
 
   need_conversion = (data->nominal_mode != data->passed_mode
@@ -3130,11 +3275,17 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 
   /* If we were passed a pointer but the actual value can safely live
      in a register, retrieve it and use it directly.  */
-  if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode)
+  if (data->passed_pointer
+      && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode))
     {
       /* We can't use nominal_mode, because it will have been set to
 	 Pmode above.  We must use the actual mode of the parm.  */
-      if (use_register_for_decl (parm))
+      if (from_expand)
+	{
+	  parmreg = from_expand;
+	  gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm)));
+	}
+      else if (use_register_for_decl (parm))
 	{
 	  parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm)));
 	  mark_user_reg (parmreg);
@@ -3174,7 +3325,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 
       /* STACK_PARM is the pointer, not the parm, and PARMREG is
 	 now the parm.  */
-      data->stack_parm = NULL;
+      data->stack_parm = equiv_stack_parm = NULL;
     }
 
   /* Mark the register as eliminable if we did no conversion and it was
@@ -3184,11 +3335,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
      make here would screw up life analysis for it.  */
   if (data->nominal_mode == data->passed_mode
       && !did_conversion
-      && data->stack_parm != 0
-      && MEM_P (data->stack_parm)
+      && equiv_stack_parm != 0
+      && MEM_P (equiv_stack_parm)
       && data->locate.offset.var == 0
       && reg_mentioned_p (virtual_incoming_args_rtx,
-			  XEXP (data->stack_parm, 0)))
+			  XEXP (equiv_stack_parm, 0)))
     {
       rtx_insn *linsn = get_last_insn ();
       rtx_insn *sinsn;
@@ -3201,8 +3352,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 	    = GET_MODE_INNER (GET_MODE (parmreg));
 	  int regnor = REGNO (XEXP (parmreg, 0));
 	  int regnoi = REGNO (XEXP (parmreg, 1));
-	  rtx stackr = adjust_address_nv (data->stack_parm, submode, 0);
-	  rtx stacki = adjust_address_nv (data->stack_parm, submode,
+	  rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0);
+	  rtx stacki = adjust_address_nv (equiv_stack_parm, submode,
 					  GET_MODE_SIZE (submode));
 
 	  /* Scan backwards for the set of the real and
@@ -3275,6 +3426,13 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
 
       if (data->stack_parm == 0)
 	{
+	  rtx x = data->stack_parm = rtl_for_parm (all, parm);
+	  if (x)
+	    gcc_assert (GET_MODE (x) == GET_MODE (data->entry_parm));
+	}
+
+      if (data->stack_parm == 0)
+	{
 	  int align = STACK_SLOT_ALIGNMENT (data->passed_type,
 					    GET_MODE (data->entry_parm),
 					    TYPE_ALIGN (data->passed_type));
@@ -3531,6 +3689,8 @@ assign_parms (tree fndecl)
 	  DECL_INCOMING_RTL (parm) = DECL_RTL (parm);
 	  continue;
 	}
+      else
+	maybe_reset_rtl_for_parm (parm);
 
       /* Estimate stack alignment from parameter alignment.  */
       if (SUPPORTS_STACK_ALIGNMENT)
@@ -3580,7 +3740,9 @@ assign_parms (tree fndecl)
       else
 	set_decl_incoming_rtl (parm, data.entry_parm, false);
 
-      /* Boudns should be loaded in the particular order to
+      assign_parm_adjust_stack_rtl (&all, parm, &data);
+
+      /* Bounds should be loaded in the particular order to
 	 have registers allocated correctly.  Collect info about
 	 input bounds and load them later.  */
       if (POINTER_BOUNDS_TYPE_P (data.passed_type))
@@ -3597,11 +3759,10 @@ assign_parms (tree fndecl)
 	}
       else
 	{
-	  assign_parm_adjust_stack_rtl (&data);
-
 	  if (assign_parm_setup_block_p (&data))
 	    assign_parm_setup_block (&all, parm, &data);
-	  else if (data.passed_pointer || use_register_for_decl (parm))
+	  else if (data.passed_pointer
+		   || use_register_for_parm_decl (&all, parm))
 	    assign_parm_setup_reg (&all, parm, &data);
 	  else
 	    assign_parm_setup_stack (&all, parm, &data);
@@ -4932,7 +5093,9 @@ expand_function_start (tree subr)
      before any library calls that assign parms might generate.  */
 
   /* Decide whether to return the value in memory or in a register.  */
-  if (aggregate_value_p (DECL_RESULT (subr), subr))
+  tree res = DECL_RESULT (subr);
+  maybe_reset_rtl_for_parm (res);
+  if (aggregate_value_p (res, subr))
     {
       /* Returning something that won't go in a register.  */
       rtx value_address = 0;
@@ -4940,7 +5103,7 @@ expand_function_start (tree subr)
 #ifdef PCC_STATIC_STRUCT_RETURN
       if (cfun->returns_pcc_struct)
 	{
-	  int size = int_size_in_bytes (TREE_TYPE (DECL_RESULT (subr)));
+	  int size = int_size_in_bytes (TREE_TYPE (res));
 	  value_address = assemble_static_space (size);
 	}
       else
@@ -4952,36 +5115,45 @@ expand_function_start (tree subr)
 	     it.  */
 	  if (sv)
 	    {
-	      value_address = gen_reg_rtx (Pmode);
+	      if (DECL_BY_REFERENCE (res))
+		value_address = get_rtl_for_parm_ssa_default_def (res);
+	      if (!value_address)
+		value_address = gen_reg_rtx (Pmode);
 	      emit_move_insn (value_address, sv);
 	    }
 	}
       if (value_address)
 	{
 	  rtx x = value_address;
-	  if (!DECL_BY_REFERENCE (DECL_RESULT (subr)))
+	  if (!DECL_BY_REFERENCE (res))
 	    {
-	      x = gen_rtx_MEM (DECL_MODE (DECL_RESULT (subr)), x);
-	      set_mem_attributes (x, DECL_RESULT (subr), 1);
+	      x = get_rtl_for_parm_ssa_default_def (res);
+	      if (!x)
+		{
+		  x = gen_rtx_MEM (DECL_MODE (res), value_address);
+		  set_mem_attributes (x, res, 1);
+		}
 	    }
-	  SET_DECL_RTL (DECL_RESULT (subr), x);
+	  SET_DECL_RTL (res, x);
 	}
     }
-  else if (DECL_MODE (DECL_RESULT (subr)) == VOIDmode)
+  else if (DECL_MODE (res) == VOIDmode)
     /* If return mode is void, this decl rtl should not be used.  */
-    SET_DECL_RTL (DECL_RESULT (subr), NULL_RTX);
+    SET_DECL_RTL (res, NULL_RTX);
   else
     {
       /* Compute the return values into a pseudo reg, which we will copy
 	 into the true return register after the cleanups are done.  */
-      tree return_type = TREE_TYPE (DECL_RESULT (subr));
-      if (TYPE_MODE (return_type) != BLKmode
-	  && targetm.calls.return_in_msb (return_type))
+      tree return_type = TREE_TYPE (res);
+      rtx x = get_rtl_for_parm_ssa_default_def (res);
+      if (x)
+	/* Use it.  */;
+      else if (TYPE_MODE (return_type) != BLKmode
+	       && targetm.calls.return_in_msb (return_type))
 	/* expand_function_end will insert the appropriate padding in
 	   this case.  Use the return value's natural (unpadded) mode
 	   within the function proper.  */
-	SET_DECL_RTL (DECL_RESULT (subr),
-		      gen_reg_rtx (TYPE_MODE (return_type)));
+	x = gen_reg_rtx (TYPE_MODE (return_type));
       else
 	{
 	  /* In order to figure out what mode to use for the pseudo, we
@@ -4992,25 +5164,26 @@ expand_function_start (tree subr)
 	  /* Structures that are returned in registers are not
 	     aggregate_value_p, so we may see a PARALLEL or a REG.  */
 	  if (REG_P (hard_reg))
-	    SET_DECL_RTL (DECL_RESULT (subr),
-			  gen_reg_rtx (GET_MODE (hard_reg)));
+	    x = gen_reg_rtx (GET_MODE (hard_reg));
 	  else
 	    {
 	      gcc_assert (GET_CODE (hard_reg) == PARALLEL);
-	      SET_DECL_RTL (DECL_RESULT (subr), gen_group_rtx (hard_reg));
+	      x = gen_group_rtx (hard_reg);
 	    }
 	}
 
+      SET_DECL_RTL (res, x);
+
       /* Set DECL_REGISTER flag so that expand_function_end will copy the
 	 result to the real return register(s).  */
-      DECL_REGISTER (DECL_RESULT (subr)) = 1;
+      DECL_REGISTER (res) = 1;
 
       if (chkp_function_instrumented_p (current_function_decl))
 	{
-	  tree return_type = TREE_TYPE (DECL_RESULT (subr));
+	  tree return_type = TREE_TYPE (res);
 	  rtx bounds = targetm.calls.chkp_function_value_bounds (return_type,
 								 subr, 1);
-	  SET_DECL_BOUNDS_RTL (DECL_RESULT (subr), bounds);
+	  SET_DECL_BOUNDS_RTL (res, bounds);
 	}
     }
 
@@ -5025,7 +5198,9 @@ expand_function_start (tree subr)
       rtx local, chain;
       rtx_insn *insn;
 
-      local = gen_reg_rtx (Pmode);
+      local = get_rtl_for_parm_ssa_default_def (parm);
+      if (!local)
+	local = gen_reg_rtx (Pmode);
       chain = targetm.calls.static_chain (current_function_decl, true);
 
       set_decl_incoming_rtl (parm, chain, false);
diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
index b558d90..baed630 100644
--- a/gcc/gimple-expr.c
+++ b/gcc/gimple-expr.c
@@ -375,45 +375,6 @@ copy_var_decl (tree var, tree name, tree type)
   return copy;
 }
 
-/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
-   coalescing together, false otherwise.
-
-   This must stay consistent with var_map_base_init in tree-ssa-live.c.  */
-
-bool
-gimple_can_coalesce_p (tree name1, tree name2)
-{
-  /* First check the SSA_NAME's associated DECL.  We only want to
-     coalesce if they have the same DECL or both have no associated DECL.  */
-  tree var1 = SSA_NAME_VAR (name1);
-  tree var2 = SSA_NAME_VAR (name2);
-  var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
-  var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
-  if (var1 != var2)
-    return false;
-
-  /* Now check the types.  If the types are the same, then we should
-     try to coalesce V1 and V2.  */
-  tree t1 = TREE_TYPE (name1);
-  tree t2 = TREE_TYPE (name2);
-  if (t1 == t2)
-    return true;
-
-  /* If the types are not the same, check for a canonical type match.  This
-     (for example) allows coalescing when the types are fundamentally the
-     same, but just have different names. 
-
-     Note pointer types with different address spaces may have the same
-     canonical type.  Those are rejected for coalescing by the
-     types_compatible_p check.  */
-  if (TYPE_CANONICAL (t1)
-      && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
-      && types_compatible_p (t1, t2))
-    return true;
-
-  return false;
-}
-
 /* Strip off a legitimate source ending from the input string NAME of
    length LEN.  Rather than having to know the names used by all of
    our front ends, we strip off an ending of a period followed by
diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
index ed23eb2..3d1c89f 100644
--- a/gcc/gimple-expr.h
+++ b/gcc/gimple-expr.h
@@ -28,7 +28,6 @@ extern gimple_seq gimple_body (tree);
 extern bool gimple_has_body_p (tree);
 extern const char *gimple_decl_printable_name (tree, int);
 extern tree copy_var_decl (tree, tree, tree);
-extern bool gimple_can_coalesce_p (tree, tree);
 extern tree create_tmp_var_name (const char *);
 extern tree create_tmp_var_raw (tree, const char * = NULL);
 extern tree create_tmp_var (tree, const char * = NULL);
diff --git a/gcc/opts.c b/gcc/opts.c
index 468a802..f22edd3 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -445,12 +445,12 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_1_PLUS, OPT_fsplit_wide_types, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_ccp, NULL, 1 },
     { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_bit_ccp, NULL, 1 },
+    { OPT_LEVELS_1_PLUS, OPT_ftree_coalesce_vars, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_dce, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_dominator_opts, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 },
     { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 },
-    { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 },
diff --git a/gcc/passes.def b/gcc/passes.def
index 6b66f8f..64fc4d9 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -77,7 +77,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_all_early_optimizations);
       PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
 	  NEXT_PASS (pass_remove_cgraph_callee_edges);
-	  NEXT_PASS (pass_rename_ssa_copies);
 	  NEXT_PASS (pass_object_sizes);
 	  NEXT_PASS (pass_ccp);
 	  /* After CCP we rewrite no longer addressed locals into SSA
@@ -155,7 +154,6 @@ along with GCC; see the file COPYING3.  If not see
       /* Initial scalar cleanups before alias computation.
 	 They ensure memory accesses are not indirect wherever possible.  */
       NEXT_PASS (pass_strip_predict_hints);
-      NEXT_PASS (pass_rename_ssa_copies);
       NEXT_PASS (pass_ccp);
       /* After CCP we rewrite no longer addressed locals into SSA
 	 form if possible.  */
@@ -183,7 +181,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_ch);
       NEXT_PASS (pass_lower_complex);
       NEXT_PASS (pass_sra);
-      NEXT_PASS (pass_rename_ssa_copies);
       /* The dom pass will also resolve all __builtin_constant_p calls
          that are still there to 0.  This has to be done after some
 	 propagations have already run, but before some more dead code
@@ -293,7 +290,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_fold_builtins);
       NEXT_PASS (pass_optimize_widening_mul);
       NEXT_PASS (pass_tail_calls);
-      NEXT_PASS (pass_rename_ssa_copies);
       /* FIXME: If DCE is not run before checking for uninitialized uses,
 	 we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
 	 However, this also causes us to misdiagnose cases that should be
@@ -328,7 +324,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_dce);
       NEXT_PASS (pass_asan);
       NEXT_PASS (pass_tsan);
-      NEXT_PASS (pass_rename_ssa_copies);
       /* ???  We do want some kind of loop invariant motion, but we possibly
          need to adjust LIM to be more friendly towards preserving accurate
 	 debug information here.  */
diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index 0d4f4a4..288227a 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -794,7 +794,8 @@ layout_decl (tree decl, unsigned int known_align)
     {
       PUT_MODE (rtl, DECL_MODE (decl));
       SET_DECL_RTL (decl, 0);
-      set_mem_attributes (rtl, decl, 1);
+      if (MEM_P (rtl))
+	set_mem_attributes (rtl, decl, 1);
       SET_DECL_RTL (decl, rtl);
     }
 }
diff --git a/gcc/testsuite/gcc.dg/guality/pr54200.c b/gcc/testsuite/gcc.dg/guality/pr54200.c
index 9b17187..e1e7293 100644
--- a/gcc/testsuite/gcc.dg/guality/pr54200.c
+++ b/gcc/testsuite/gcc.dg/guality/pr54200.c
@@ -1,6 +1,6 @@
 /* PR tree-optimization/54200 */
 /* { dg-do run } */
-/* { dg-options "-g -fno-var-tracking-assignments" } */
+/* { dg-options "-g -fno-var-tracking-assignments -fno-tree-coalesce-vars" } */
 
 int o __attribute__((used));
 
diff --git a/gcc/testsuite/gcc.dg/ssp-1.c b/gcc/testsuite/gcc.dg/ssp-1.c
index 5467f4d..db69332 100644
--- a/gcc/testsuite/gcc.dg/ssp-1.c
+++ b/gcc/testsuite/gcc.dg/ssp-1.c
@@ -12,7 +12,7 @@ __stack_chk_fail (void)
 
 int main ()
 {
-  int i;
+  register int i;
   char foo[255];
 
   // smash stack
diff --git a/gcc/testsuite/gcc.dg/ssp-2.c b/gcc/testsuite/gcc.dg/ssp-2.c
index 9a7ac32..752fe53 100644
--- a/gcc/testsuite/gcc.dg/ssp-2.c
+++ b/gcc/testsuite/gcc.dg/ssp-2.c
@@ -14,7 +14,7 @@ __stack_chk_fail (void)
 void
 overflow()
 {
-  int i = 0;
+  register int i = 0;
   char foo[30];
 
   /* Overflow buffer.  */
diff --git a/gcc/testsuite/gcc.dg/torture/parm-coalesce.c b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
new file mode 100644
index 0000000..dbd81c1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+
+/* Make sure we don't coalesce both incoming parms, one of whose
+   incoming values is unused, to the same location, which would
+   overwrite one of them with the incoming value of the other.  */
+
+int __attribute__((noinline, noclone))
+foo (int i, int j)
+{
+  j = i; /* The incoming value for J is unused.  */
+  i = 2;
+  if (j)
+    j++;
+  j += i + 1;
+  return j;
+}
+
+/* Same as foo, but with swapped parameters.  */
+int __attribute__((noinline, noclone))
+bar (int j, int i)
+{
+  j = i; /* The incoming value for J is unused.  */
+  i = 2;
+  if (j)
+    j++;
+  j += i + 1;
+  return j;
+}
+
+int
+main (void)
+{
+  if (foo (0, 1) != 3)
+    abort ();
+  if (bar (1, 0) != 3)
+    abort ();
+  return 0;
+}
diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
index 7b747ab9..978476c 100644
--- a/gcc/tree-outof-ssa.c
+++ b/gcc/tree-outof-ssa.c
@@ -279,7 +279,6 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
   rtx dest_rtx, seq, x;
   machine_mode dest_mode, src_mode;
   int unsignedp;
-  tree var;
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     {
@@ -301,12 +300,12 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
 
   start_sequence ();
 
-  var = SSA_NAME_VAR (partition_to_var (SA.map, dest));
+  tree name = partition_to_var (SA.map, dest);
   src_mode = TYPE_MODE (TREE_TYPE (src));
   dest_mode = GET_MODE (dest_rtx);
-  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var)));
+  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (name)));
   gcc_assert (!REG_P (dest_rtx)
-	      || dest_mode == promote_decl_mode (var, &unsignedp));
+	      || dest_mode == promote_ssa_mode (name, &unsignedp));
 
   if (src_mode != dest_mode)
     {
@@ -682,13 +681,12 @@ elim_backward (elim_graph g, int T)
 static rtx
 get_temp_reg (tree name)
 {
-  tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
-  tree type = TREE_TYPE (var);
+  tree type = TREE_TYPE (name);
   int unsignedp;
-  machine_mode reg_mode = promote_decl_mode (var, &unsignedp);
+  machine_mode reg_mode = promote_ssa_mode (name, &unsignedp);
   rtx x = gen_reg_rtx (reg_mode);
   if (POINTER_TYPE_P (type))
-    mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var))));
+    mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type)));
   return x;
 }
 
@@ -988,7 +986,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
 
   /* Return to viewing the variable list as just all reference variables after
      coalescing has been performed.  */
-  partition_view_normal (map, false);
+  partition_view_normal (map);
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     {
diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index bf8983f..a622728 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -36,6 +36,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-iterator.h"
 #include "tree-ssa-live.h"
 #include "tree-ssa-coalesce.h"
+#include "explow.h"
 #include "diagnostic-core.h"
 
 
@@ -806,6 +807,16 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
   basic_block bb;
   ssa_op_iter iter;
   live_track_p live;
+  basic_block entry;
+
+  /* If inter-variable coalescing is enabled, we may attempt to
+     coalesce SSA names from different base variables, including
+     different parameters, so we have to make sure default defs live
+     at the entry block conflict with each other.  */
+  if (flag_tree_coalesce_vars)
+    entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+  else
+    entry = NULL;
 
   map = live_var_map (liveinfo);
   graph = ssa_conflicts_new (num_var_partitions (map));
@@ -864,6 +875,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
 	    live_track_process_def (live, result, graph);
 	}
 
+      /* Pretend there are defs for params' default defs at the start
+	 of the (post-)entry block.  */
+      if (bb == entry)
+	{
+	  unsigned base;
+	  bitmap_iterator bi;
+	  EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi)
+	    {
+	      bitmap_iterator bi2;
+	      unsigned part;
+	      EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base],
+					0, part, bi2)
+		{
+		  tree var = partition_to_var (map, part);
+		  if (!SSA_NAME_VAR (var)
+		      || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL
+			  && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL)
+		      || !SSA_NAME_IS_DEFAULT_DEF (var))
+		    continue;
+		  live_track_process_def (live, var, graph);
+		}
+	    }
+	}
+
       live_track_clear_base_vars (live);
     }
 
@@ -1132,6 +1167,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
     {
       var1 = partition_to_var (map, p1);
       var2 = partition_to_var (map, p2);
+
       z = var_union (map, var1, var2);
       if (z == NO_PARTITION)
 	{
@@ -1149,6 +1185,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
 
       if (debug)
 	fprintf (debug, ": Success -> %d\n", z);
+
       return true;
     }
 
@@ -1244,6 +1281,328 @@ ssa_name_var_hash::equal (const tree_node *n1, const tree_node *n2)
 }
 
 
+/* Output partition map MAP with coalescing plan PART to file F.  */
+
+void
+dump_part_var_map (FILE *f, partition part, var_map map)
+{
+  int t;
+  unsigned x, y;
+  int p;
+
+  fprintf (f, "\nCoalescible Partition map \n\n");
+
+  for (x = 0; x < map->num_partitions; x++)
+    {
+      if (map->view_to_partition != NULL)
+	p = map->view_to_partition[x];
+      else
+	p = x;
+
+      if (ssa_name (p) == NULL_TREE
+	  || virtual_operand_p (ssa_name (p)))
+	continue;
+
+      t = 0;
+      for (y = 1; y < num_ssa_names; y++)
+	{
+	  tree var = version_to_var (map, y);
+	  if (!var)
+	    continue;
+	  int q = var_to_partition (map, var);
+	  p = partition_find (part, q);
+	  gcc_assert (map->partition_to_base_index[q]
+		      == map->partition_to_base_index[p]);
+
+	  if (p == (int)x)
+	    {
+	      if (t++ == 0)
+	        {
+		  fprintf (f, "Partition %d, base %d (", x,
+			   map->partition_to_base_index[q]);
+		  print_generic_expr (f, partition_to_var (map, q), TDF_SLIM);
+		  fprintf (f, " - ");
+		}
+	      fprintf (f, "%d ", y);
+	    }
+	}
+      if (t != 0)
+	fprintf (f, ")\n");
+    }
+  fprintf (f, "\n");
+}
+
+/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
+   coalescing together, false otherwise.
+
+   This must stay consistent with var_map_base_init in tree-ssa-live.c.  */
+
+bool
+gimple_can_coalesce_p (tree name1, tree name2)
+{
+  /* First check the SSA_NAME's associated DECL.  Without
+     optimization, we only want to coalesce if they have the same DECL
+     or both have no associated DECL.  */
+  tree var1 = SSA_NAME_VAR (name1);
+  tree var2 = SSA_NAME_VAR (name2);
+  var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
+  var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
+  if (var1 != var2 && !flag_tree_coalesce_vars)
+    return false;
+
+  /* Now check the types.  If the types are the same, then we should
+     try to coalesce V1 and V2.  */
+  tree t1 = TREE_TYPE (name1);
+  tree t2 = TREE_TYPE (name2);
+  if (t1 == t2)
+    {
+    check_modes:
+      /* If the base variables are the same, we're good: none of the
+	 other tests below could possibly fail.  */
+      var1 = SSA_NAME_VAR (name1);
+      var2 = SSA_NAME_VAR (name2);
+      if (var1 == var2)
+	return true;
+
+      /* We don't want to coalesce two SSA names if one of the base
+	 variables is supposed to be a register while the other is
+	 supposed to be on the stack.  Anonymous SSA names take
+	 registers, but when not optimizing, user variables should go
+	 on the stack, so coalescing them with the anonymous variable
+	 as the partition leader would end up assigning the user
+	 variable to a register.  Don't do that!  */
+      bool reg1 = !var1 || use_register_for_decl (var1);
+      bool reg2 = !var2 || use_register_for_decl (var2);
+      if (reg1 != reg2)
+	return false;
+
+      /* Check that the promoted modes are the same.  We don't want to
+	 coalesce if the promoted modes would be different.  Only
+	 PARM_DECLs and RESULT_DECLs have different promotion rules,
+	 so skip the test if both are variables or anonymous
+	 SSA_NAMEs.  */
+      return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
+	|| promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL);
+    }
+
+  /* If the types are not the same, check for a canonical type match.  This
+     (for example) allows coalescing when the types are fundamentally the
+     same, but just have different names. 
+
+     Note pointer types with different address spaces may have the same
+     canonical type.  Those are rejected for coalescing by the
+     types_compatible_p check.  */
+  if (TYPE_CANONICAL (t1)
+      && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
+      && types_compatible_p (t1, t2))
+    goto check_modes;
+
+  return false;
+}
+
+/* Fill in MAP's partition_to_base_index, with one index for each
+   partition of the SSA names in USED_IN_COPIES that are related by
+   the coalesce possibilities in CL.  This must match
+   gimple_can_coalesce_p in the optimized case.  */
+
+static void
+compute_optimized_partition_bases (var_map map, bitmap used_in_copies,
+				   coalesce_list_p cl)
+{
+  int parts = num_var_partitions (map);
+  partition tentative = partition_new (parts);
+
+  /* Partition the SSA versions so that, for each coalescible
+     pair, both of its members are in the same partition in
+     TENTATIVE.  */
+  gcc_assert (!cl->sorted);
+  coalesce_pair_p node;
+  coalesce_iterator_type ppi;
+  FOR_EACH_PARTITION_PAIR (node, ppi, cl)
+    {
+      tree v1 = ssa_name (node->first_element);
+      int p1 = partition_find (tentative, var_to_partition (map, v1));
+      tree v2 = ssa_name (node->second_element);
+      int p2 = partition_find (tentative, var_to_partition (map, v2));
+
+      if (p1 == p2)
+	continue;
+
+      partition_union (tentative, p1, p2);
+    }
+
+  /* We have to deal with cost one pairs too.  */
+  for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next)
+    {
+      tree v1 = ssa_name (co->first_element);
+      int p1 = partition_find (tentative, var_to_partition (map, v1));
+      tree v2 = ssa_name (co->second_element);
+      int p2 = partition_find (tentative, var_to_partition (map, v2));
+
+      if (p1 == p2)
+	continue;
+
+      partition_union (tentative, p1, p2);
+    }
+
+  /* And also with abnormal edges.  */
+  basic_block bb;
+  edge e;
+  edge_iterator ei;
+  FOR_EACH_BB_FN (bb, cfun)
+    {
+      FOR_EACH_EDGE (e, ei, bb->preds)
+	if (e->flags & EDGE_ABNORMAL)
+	  {
+	    gphi_iterator gsi;
+	    for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
+		 gsi_next (&gsi))
+	      {
+		gphi *phi = gsi.phi ();
+		tree arg = PHI_ARG_DEF (phi, e->dest_idx);
+		if (SSA_NAME_IS_DEFAULT_DEF (arg)
+		    && (!SSA_NAME_VAR (arg)
+			|| TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL))
+		  continue;
+
+		tree res = PHI_RESULT (phi);
+
+		int p1 = partition_find (tentative, var_to_partition (map, res));
+		int p2 = partition_find (tentative, var_to_partition (map, arg));
+
+		if (p1 == p2)
+		  continue;
+
+		partition_union (tentative, p1, p2);
+	      }
+	  }
+    }
+
+  map->partition_to_base_index = XCNEWVEC (int, parts);
+  auto_vec<unsigned int> index_map (parts);
+  if (parts)
+    index_map.quick_grow (parts);
+
+  const unsigned no_part = -1;
+  unsigned count = parts;
+  while (count)
+    index_map[--count] = no_part;
+
+  /* Initialize MAP's mapping from partition to base index, using
+     as base indices an enumeration of the TENTATIVE partitions in
+     which each SSA version ended up, so that we compute conflicts
+     between all SSA versions that ended up in the same potential
+     coalesce partition.  */
+  bitmap_iterator bi;
+  unsigned i;
+  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
+    {
+      int pidx = var_to_partition (map, ssa_name (i));
+      int base = partition_find (tentative, pidx);
+      if (index_map[base] != no_part)
+	continue;
+      index_map[base] = count++;
+    }
+
+  map->num_basevars = count;
+
+  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
+    {
+      int pidx = var_to_partition (map, ssa_name (i));
+      int base = partition_find (tentative, pidx);
+      gcc_assert (index_map[base] < count);
+      map->partition_to_base_index[pidx] = index_map[base];
+    }
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    dump_part_var_map (dump_file, tentative, map);
+
+  partition_delete (tentative);
+}
+
+/* Hashtable helpers.  */
+
+struct tree_int_map_hasher : nofree_ptr_hash <tree_int_map>
+{
+  static inline hashval_t hash (const tree_int_map *);
+  static inline bool equal (const tree_int_map *, const tree_int_map *);
+};
+
+inline hashval_t
+tree_int_map_hasher::hash (const tree_int_map *v)
+{
+  return tree_map_base_hash (v);
+}
+
+inline bool
+tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
+{
+  return tree_int_map_eq (v, c);
+}
+
+/* This routine will initialize the basevar fields of MAP with base
+   names.  Partitions will share the same base if they have the same
+   SSA_NAME_VAR, or, being anonymous variables, the same type.  This
+   must match gimple_can_coalesce_p in the non-optimized case.  */
+
+static void
+compute_samebase_partition_bases (var_map map)
+{
+  int x, num_part;
+  tree var;
+  struct tree_int_map *m, *mapstorage;
+
+  num_part = num_var_partitions (map);
+  hash_table<tree_int_map_hasher> tree_to_index (num_part);
+  /* We can have at most num_part entries in the hash table, so it's
+     enough to allocate that many map elements once, saving some malloc
+     calls.  */
+  mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
+
+  /* If a base table already exists, free it; then allocate a new one.  */
+  free (map->partition_to_base_index);
+  map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
+
+  /* Build the base variable list, and point partitions at their bases.  */
+  for (x = 0; x < num_part; x++)
+    {
+      struct tree_int_map **slot;
+      unsigned baseindex;
+      var = partition_to_var (map, x);
+      if (SSA_NAME_VAR (var)
+	  && (!VAR_P (SSA_NAME_VAR (var))
+	      || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
+	m->base.from = SSA_NAME_VAR (var);
+      else
+	/* This restricts what anonymous SSA names we can coalesce
+	   as it restricts the sets we compute conflicts for.
+	   Using TREE_TYPE to generate sets is the easiest as
+	   type equivalency also holds for SSA names with the same
+	   underlying decl.
+
+	   Check gimple_can_coalesce_p when changing this code.  */
+	m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
+			? TYPE_CANONICAL (TREE_TYPE (var))
+			: TREE_TYPE (var));
+      /* If base variable hasn't been seen, set it up.  */
+      slot = tree_to_index.find_slot (m, INSERT);
+      if (!*slot)
+	{
+	  baseindex = m - mapstorage;
+	  m->to = baseindex;
+	  *slot = m;
+	  m++;
+	}
+      else
+	baseindex = (*slot)->to;
+      map->partition_to_base_index[x] = baseindex;
+    }
+
+  map->num_basevars = m - mapstorage;
+
+  free (mapstorage);
+}
+
 /* Reduce the number of copies by coalescing variables in the function.  Return
    a partition map with the resulting coalesces.  */
 
@@ -1260,9 +1619,10 @@ coalesce_ssa_name (void)
   cl = create_coalesce_list ();
   map = create_outofssa_var_map (cl, used_in_copies);
 
-  /* If optimization is disabled, we need to coalesce all the names originating
-     from the same SSA_NAME_VAR so debug info remains undisturbed.  */
-  if (!optimize)
+  /* If inter-variable coalescing is disabled, we need to coalesce all
+     the names originating from the same SSA_NAME_VAR so debug info
+     remains undisturbed.  */
+  if (!flag_tree_coalesce_vars)
     {
       hash_table<ssa_name_var_hash> ssa_name_hash (10);
 
@@ -1303,8 +1663,13 @@ coalesce_ssa_name (void)
   if (dump_file && (dump_flags & TDF_DETAILS))
     dump_var_map (dump_file, map);
 
-  /* Don't calculate live ranges for variables not in the coalesce list.  */
-  partition_view_bitmap (map, used_in_copies, true);
+  partition_view_bitmap (map, used_in_copies);
+
+  if (flag_tree_coalesce_vars)
+    compute_optimized_partition_bases (map, used_in_copies, cl);
+  else
+    compute_samebase_partition_bases (map);
+
   BITMAP_FREE (used_in_copies);
 
   if (num_var_partitions (map) < 1)
@@ -1343,8 +1708,7 @@ coalesce_ssa_name (void)
 
   /* Now coalesce everything in the list.  */
   coalesce_partitions (map, graph, cl,
-		       ((dump_flags & TDF_DETAILS) ? dump_file
-						   : NULL));
+		       ((dump_flags & TDF_DETAILS) ? dump_file : NULL));
 
   delete_coalesce_list (cl);
   ssa_conflicts_delete (graph);
diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h
index 99b188a..ae289b4 100644
--- a/gcc/tree-ssa-coalesce.h
+++ b/gcc/tree-ssa-coalesce.h
@@ -21,5 +21,6 @@ along with GCC; see the file COPYING3.  If not see
 #define GCC_TREE_SSA_COALESCE_H
 
 extern var_map coalesce_ssa_name (void);
+extern bool gimple_can_coalesce_p (tree, tree);
 
 #endif /* GCC_TREE_SSA_COALESCE_H */
diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c
deleted file mode 100644
index aeb7f28..0000000
--- a/gcc/tree-ssa-copyrename.c
+++ /dev/null
@@ -1,475 +0,0 @@
-/* Rename SSA copies.
-   Copyright (C) 2004-2015 Free Software Foundation, Inc.
-   Contributed by Andrew MacLeod <amacleod@redhat.com>
-
-This file is part of GCC.
-
-GCC is free software; you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation; either version 3, or (at your option)
-any later version.
-
-GCC is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-GNU General Public License for more details.
-
-You should have received a copy of the GNU General Public License
-along with GCC; see the file COPYING3.  If not see
-<http://www.gnu.org/licenses/>.  */
-
-#include "config.h"
-#include "system.h"
-#include "coretypes.h"
-#include "backend.h"
-#include "tree.h"
-#include "gimple.h"
-#include "rtl.h"
-#include "ssa.h"
-#include "alias.h"
-#include "fold-const.h"
-#include "internal-fn.h"
-#include "gimple-iterator.h"
-#include "flags.h"
-#include "tree-pretty-print.h"
-#include "insn-config.h"
-#include "expmed.h"
-#include "dojump.h"
-#include "explow.h"
-#include "calls.h"
-#include "emit-rtl.h"
-#include "varasm.h"
-#include "stmt.h"
-#include "expr.h"
-#include "tree-dfa.h"
-#include "tree-inline.h"
-#include "tree-ssa-live.h"
-#include "tree-pass.h"
-#include "langhooks.h"
-
-static struct
-{
-  /* Number of copies coalesced.  */
-  int coalesced;
-} stats;
-
-/* The following routines implement the SSA copy renaming phase.
-
-   This optimization looks for copies between 2 SSA_NAMES, either through a
-   direct copy, or an implicit one via a PHI node result and its arguments.
-
-   Each copy is examined to determine if it is possible to rename the base
-   variable of one of the operands to the same variable as the other operand.
-   i.e.
-   T.3_5 = <blah>
-   a_1 = T.3_5
-
-   If this copy couldn't be copy propagated, it could possibly remain in the
-   program throughout the optimization phases.   After SSA->normal, it would
-   become:
-
-   T.3 = <blah>
-   a = T.3
-
-   Since T.3_5 is distinct from all other SSA versions of T.3, there is no
-   fundamental reason why the base variable needs to be T.3, subject to
-   certain restrictions.  This optimization attempts to determine if we can
-   change the base variable on copies like this, and result in code such as:
-
-   a_5 = <blah>
-   a_1 = a_5
-
-   This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is
-   possible, the copy goes away completely. If it isn't possible, a new temp
-   will be created for a_5, and you will end up with the exact same code:
-
-   a.8 = <blah>
-   a = a.8
-
-   The other benefit of performing this optimization relates to what variables
-   are chosen in copies.  Gimplification of the program uses temporaries for
-   a lot of things. expressions like
-
-   a_1 = <blah>
-   <blah2> = a_1
-
-   get turned into
-
-   T.3_5 = <blah>
-   a_1 = T.3_5
-   <blah2> = a_1
-
-   Copy propagation is done in a forward direction, and if we can propagate
-   through the copy, we end up with:
-
-   T.3_5 = <blah>
-   <blah2> = T.3_5
-
-   The copy is gone, but so is all reference to the user variable 'a'. By
-   performing this optimization, we would see the sequence:
-
-   a_5 = <blah>
-   a_1 = a_5
-   <blah2> = a_1
-
-   which copy propagation would then turn into:
-
-   a_5 = <blah>
-   <blah2> = a_5
-
-   and so we still retain the user variable whenever possible.  */
-
-
-/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
-   Choose a representative for the partition, and send debug info to DEBUG.  */
-
-static void
-copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
-{
-  int p1, p2, p3;
-  tree root1, root2;
-  tree rep1, rep2;
-  bool ign1, ign2, abnorm;
-
-  gcc_assert (TREE_CODE (var1) == SSA_NAME);
-  gcc_assert (TREE_CODE (var2) == SSA_NAME);
-
-  register_ssa_partition (map, var1);
-  register_ssa_partition (map, var2);
-
-  p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
-  p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
-
-  if (debug)
-    {
-      fprintf (debug, "Try : ");
-      print_generic_expr (debug, var1, TDF_SLIM);
-      fprintf (debug, "(P%d) & ", p1);
-      print_generic_expr (debug, var2, TDF_SLIM);
-      fprintf (debug, "(P%d)", p2);
-    }
-
-  gcc_assert (p1 != NO_PARTITION);
-  gcc_assert (p2 != NO_PARTITION);
-
-  if (p1 == p2)
-    {
-      if (debug)
-	fprintf (debug, " : Already coalesced.\n");
-      return;
-    }
-
-  rep1 = partition_to_var (map, p1);
-  rep2 = partition_to_var (map, p2);
-  root1 = SSA_NAME_VAR (rep1);
-  root2 = SSA_NAME_VAR (rep2);
-  if (!root1 && !root2)
-    return;
-
-  /* Don't coalesce if one of the variables occurs in an abnormal PHI.  */
-  abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1)
-	    || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2));
-  if (abnorm)
-    {
-      if (debug)
-	fprintf (debug, " : Abnormal PHI barrier.  No coalesce.\n");
-      return;
-    }
-
-  /* Partitions already have the same root, simply merge them.  */
-  if (root1 == root2)
-    {
-      p1 = partition_union (map->var_partition, p1, p2);
-      if (debug)
-	fprintf (debug, " : Same root, coalesced --> P%d.\n", p1);
-      return;
-    }
-
-  /* Never attempt to coalesce 2 different parameters.  */
-  if ((root1 && TREE_CODE (root1) == PARM_DECL)
-      && (root2 && TREE_CODE (root2) == PARM_DECL))
-    {
-      if (debug)
-        fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n");
-      return;
-    }
-
-  if ((root1 && TREE_CODE (root1) == RESULT_DECL)
-      != (root2 && TREE_CODE (root2) == RESULT_DECL))
-    {
-      if (debug)
-        fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n");
-      return;
-    }
-
-  ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1));
-  ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2));
-
-  /* Refrain from coalescing user variables, if requested.  */
-  if (!ign1 && !ign2)
-    {
-      if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2))
-	ign2 = true;
-      else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1))
-	ign1 = true;
-      else if (flag_ssa_coalesce_vars != 2)
-	{
-	  if (debug)
-	    fprintf (debug, " : 2 different USER vars. No coalesce.\n");
-	  return;
-	}
-      else
-	ign2 = true;
-    }
-
-  /* If both values have default defs, we can't coalesce.  If only one has a
-     tag, make sure that variable is the new root partition.  */
-  if (root1 && ssa_default_def (cfun, root1))
-    {
-      if (root2 && ssa_default_def (cfun, root2))
-	{
-	  if (debug)
-	    fprintf (debug, " : 2 default defs. No coalesce.\n");
-	  return;
-	}
-      else
-        {
-	  ign2 = true;
-	  ign1 = false;
-	}
-    }
-  else if (root2 && ssa_default_def (cfun, root2))
-    {
-      ign1 = true;
-      ign2 = false;
-    }
-
-  /* Do not coalesce if we cannot assign a symbol to the partition.  */
-  if (!(!ign2 && root2)
-      && !(!ign1 && root1))
-    {
-      if (debug)
-	fprintf (debug, " : Choosen variable has no root.  No coalesce.\n");
-      return;
-    }
-
-  /* Don't coalesce if the new chosen root variable would be read-only.
-     If both ign1 && ign2, then the root var of the larger partition
-     wins, so reject in that case if any of the root vars is TREE_READONLY.
-     Otherwise reject only if the root var, on which replace_ssa_name_symbol
-     will be called below, is readonly.  */
-  if (((root1 && TREE_READONLY (root1)) && ign2)
-      || ((root2 && TREE_READONLY (root2)) && ign1))
-    {
-      if (debug)
-	fprintf (debug, " : Readonly variable.  No coalesce.\n");
-      return;
-    }
-
-  /* Don't coalesce if the two variables aren't type compatible .  */
-  if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2))
-      /* There is a disconnect between the middle-end type-system and
-         VRP, avoid coalescing enum types with different bounds.  */
-      || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE
-	   || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE)
-	  && TREE_TYPE (var1) != TREE_TYPE (var2)))
-    {
-      if (debug)
-	fprintf (debug, " : Incompatible types.  No coalesce.\n");
-      return;
-    }
-
-  /* Merge the two partitions.  */
-  p3 = partition_union (map->var_partition, p1, p2);
-
-  /* Set the root variable of the partition to the better choice, if there is
-     one.  */
-  if (!ign2 && root2)
-    replace_ssa_name_symbol (partition_to_var (map, p3), root2);
-  else if (!ign1 && root1)
-    replace_ssa_name_symbol (partition_to_var (map, p3), root1);
-  else
-    gcc_unreachable ();
-
-  if (debug)
-    {
-      fprintf (debug, " --> P%d ", p3);
-      print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)),
-			  TDF_SLIM);
-      fprintf (debug, "\n");
-    }
-}
-
-
-namespace {
-
-const pass_data pass_data_rename_ssa_copies =
-{
-  GIMPLE_PASS, /* type */
-  "copyrename", /* name */
-  OPTGROUP_NONE, /* optinfo_flags */
-  TV_TREE_COPY_RENAME, /* tv_id */
-  ( PROP_cfg | PROP_ssa ), /* properties_required */
-  0, /* properties_provided */
-  0, /* properties_destroyed */
-  0, /* todo_flags_start */
-  0, /* todo_flags_finish */
-};
-
-class pass_rename_ssa_copies : public gimple_opt_pass
-{
-public:
-  pass_rename_ssa_copies (gcc::context *ctxt)
-    : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt)
-  {}
-
-  /* opt_pass methods: */
-  opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); }
-  virtual bool gate (function *) { return flag_tree_copyrename != 0; }
-  virtual unsigned int execute (function *);
-
-}; // class pass_rename_ssa_copies
-
-/* This function will make a pass through the IL, and attempt to coalesce any
-   SSA versions which occur in PHI's or copies.  Coalescing is accomplished by
-   changing the underlying root variable of all coalesced version.  This will
-   then cause the SSA->normal pass to attempt to coalesce them all to the same
-   variable.  */
-
-unsigned int
-pass_rename_ssa_copies::execute (function *fun)
-{
-  var_map map;
-  basic_block bb;
-  tree var, part_var;
-  gimple stmt;
-  unsigned x;
-  FILE *debug;
-
-  memset (&stats, 0, sizeof (stats));
-
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    debug = dump_file;
-  else
-    debug = NULL;
-
-  map = init_var_map (num_ssa_names);
-
-  FOR_EACH_BB_FN (bb, fun)
-    {
-      /* Scan for real copies.  */
-      for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
-	   gsi_next (&gsi))
-	{
-	  stmt = gsi_stmt (gsi);
-	  if (gimple_assign_ssa_name_copy_p (stmt))
-	    {
-	      tree lhs = gimple_assign_lhs (stmt);
-	      tree rhs = gimple_assign_rhs1 (stmt);
-
-	      copy_rename_partition_coalesce (map, lhs, rhs, debug);
-	    }
-	}
-    }
-
-  FOR_EACH_BB_FN (bb, fun)
-    {
-      /* Treat PHI nodes as copies between the result and each argument.  */
-      for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
-	   gsi_next (&gsi))
-        {
-          size_t i;
-	  tree res;
-	  gphi *phi = gsi.phi ();
-	  res = gimple_phi_result (phi);
-
-	  /* Do not process virtual SSA_NAMES.  */
-	  if (virtual_operand_p (res))
-	    continue;
-
-	  /* Make sure to only use the same partition for an argument
-	     as the result but never the other way around.  */
-	  if (SSA_NAME_VAR (res)
-	      && !DECL_IGNORED_P (SSA_NAME_VAR (res)))
-	    for (i = 0; i < gimple_phi_num_args (phi); i++)
-	      {
-		tree arg = PHI_ARG_DEF (phi, i);
-		if (TREE_CODE (arg) == SSA_NAME)
-		  copy_rename_partition_coalesce (map, res, arg,
-						  debug);
-	      }
-	  /* Else if all arguments are in the same partition try to merge
-	     it with the result.  */
-	  else
-	    {
-	      int all_p_same = -1;
-	      int p = -1;
-	      for (i = 0; i < gimple_phi_num_args (phi); i++)
-		{
-		  tree arg = PHI_ARG_DEF (phi, i);
-		  if (TREE_CODE (arg) != SSA_NAME)
-		    {
-		      all_p_same = 0;
-		      break;
-		    }
-		  else if (all_p_same == -1)
-		    {
-		      p = partition_find (map->var_partition,
-					  SSA_NAME_VERSION (arg));
-		      all_p_same = 1;
-		    }
-		  else if (all_p_same == 1
-			   && p != partition_find (map->var_partition,
-						   SSA_NAME_VERSION (arg)))
-		    {
-		      all_p_same = 0;
-		      break;
-		    }
-		}
-	      if (all_p_same == 1)
-		copy_rename_partition_coalesce (map, res,
-						PHI_ARG_DEF (phi, 0),
-						debug);
-	    }
-        }
-    }
-
-  if (debug)
-    dump_var_map (debug, map);
-
-  /* Now one more pass to make all elements of a partition share the same
-     root variable.  */
-
-  for (x = 1; x < num_ssa_names; x++)
-    {
-      part_var = partition_to_var (map, x);
-      if (!part_var)
-        continue;
-      var = ssa_name (x);
-      if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var))
-	continue;
-      if (debug)
-        {
-	  fprintf (debug, "Coalesced ");
-	  print_generic_expr (debug, var, TDF_SLIM);
-	  fprintf (debug, " to ");
-	  print_generic_expr (debug, part_var, TDF_SLIM);
-	  fprintf (debug, "\n");
-	}
-      stats.coalesced++;
-      replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var));
-    }
-
-  statistics_counter_event (fun, "copies coalesced",
-			    stats.coalesced);
-  delete_var_map (map);
-  return 0;
-}
-
-} // anon namespace
-
-gimple_opt_pass *
-make_pass_rename_ssa_copies (gcc::context *ctxt)
-{
-  return new pass_rename_ssa_copies (ctxt);
-}
diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
index 5b00f58..4772558 100644
--- a/gcc/tree-ssa-live.c
+++ b/gcc/tree-ssa-live.c
@@ -70,88 +70,6 @@ static void  verify_live_on_entry (tree_live_info_p);
    ssa_name or variable, and vice versa.  */
 
 
-/* Hashtable helpers.  */
-
-struct tree_int_map_hasher : nofree_ptr_hash <tree_int_map>
-{
-  static inline hashval_t hash (const tree_int_map *);
-  static inline bool equal (const tree_int_map *, const tree_int_map *);
-};
-
-inline hashval_t
-tree_int_map_hasher::hash (const tree_int_map *v)
-{
-  return tree_map_base_hash (v);
-}
-
-inline bool
-tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
-{
-  return tree_int_map_eq (v, c);
-}
-
-
-/* This routine will initialize the basevar fields of MAP.  */
-
-static void
-var_map_base_init (var_map map)
-{
-  int x, num_part;
-  tree var;
-  struct tree_int_map *m, *mapstorage;
-
-  num_part = num_var_partitions (map);
-  hash_table<tree_int_map_hasher> tree_to_index (num_part);
-  /* We can have at most num_part entries in the hash tables, so it's
-     enough to allocate so many map elements once, saving some malloc
-     calls.  */
-  mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
-
-  /* If a base table already exists, clear it, otherwise create it.  */
-  free (map->partition_to_base_index);
-  map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
-
-  /* Build the base variable list, and point partitions at their bases.  */
-  for (x = 0; x < num_part; x++)
-    {
-      struct tree_int_map **slot;
-      unsigned baseindex;
-      var = partition_to_var (map, x);
-      if (SSA_NAME_VAR (var)
-	  && (!VAR_P (SSA_NAME_VAR (var))
-	      || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
-	m->base.from = SSA_NAME_VAR (var);
-      else
-	/* This restricts what anonymous SSA names we can coalesce
-	   as it restricts the sets we compute conflicts for.
-	   Using TREE_TYPE to generate sets is the easies as
-	   type equivalency also holds for SSA names with the same
-	   underlying decl. 
-
-	   Check gimple_can_coalesce_p when changing this code.  */
-	m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
-			? TYPE_CANONICAL (TREE_TYPE (var))
-			: TREE_TYPE (var));
-      /* If base variable hasn't been seen, set it up.  */
-      slot = tree_to_index.find_slot (m, INSERT);
-      if (!*slot)
-	{
-	  baseindex = m - mapstorage;
-	  m->to = baseindex;
-	  *slot = m;
-	  m++;
-	}
-      else
-	baseindex = (*slot)->to;
-      map->partition_to_base_index[x] = baseindex;
-    }
-
-  map->num_basevars = m - mapstorage;
-
-  free (mapstorage);
-}
-
-
 /* Remove the base table in MAP.  */
 
 static void
@@ -329,21 +247,17 @@ partition_view_fini (var_map map, bitmap selected)
 }
 
 
-/* Create a partition view which includes all the used partitions in MAP.  If
-   WANT_BASES is true, create the base variable map as well.  */
+/* Create a partition view which includes all the used partitions in MAP.  */
 
 void
-partition_view_normal (var_map map, bool want_bases)
+partition_view_normal (var_map map)
 {
   bitmap used;
 
   used = partition_view_init (map);
   partition_view_fini (map, used);
 
-  if (want_bases)
-    var_map_base_init (map);
-  else
-    var_map_base_fini (map);
+  var_map_base_fini (map);
 }
 
 
@@ -352,7 +266,7 @@ partition_view_normal (var_map map, bool want_bases)
    as well.  */
 
 void
-partition_view_bitmap (var_map map, bitmap only, bool want_bases)
+partition_view_bitmap (var_map map, bitmap only)
 {
   bitmap used;
   bitmap new_partitions = BITMAP_ALLOC (NULL);
@@ -368,10 +282,7 @@ partition_view_bitmap (var_map map, bitmap only, bool want_bases)
     }
   partition_view_fini (map, new_partitions);
 
-  if (want_bases)
-    var_map_base_init (map);
-  else
-    var_map_base_fini (map);
+  var_map_base_fini (map);
 }
 
 
diff --git a/gcc/tree-ssa-live.h b/gcc/tree-ssa-live.h
index d5d7820..1f88358 100644
--- a/gcc/tree-ssa-live.h
+++ b/gcc/tree-ssa-live.h
@@ -71,8 +71,8 @@ typedef struct _var_map
 extern var_map init_var_map (int);
 extern void delete_var_map (var_map);
 extern int var_union (var_map, tree, tree);
-extern void partition_view_normal (var_map, bool);
-extern void partition_view_bitmap (var_map, bitmap, bool);
+extern void partition_view_normal (var_map);
+extern void partition_view_bitmap (var_map, bitmap);
 extern void dump_scope_blocks (FILE *, int);
 extern void debug_scope_block (tree, int);
 extern void debug_scope_blocks (int);
diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c
index 437f69d..1fbd71e 100644
--- a/gcc/tree-ssa-uncprop.c
+++ b/gcc/tree-ssa-uncprop.c
@@ -38,6 +38,11 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-pass.h"
 #include "tree-ssa-propagate.h"
 #include "tree-hash-traits.h"
+#include "bitmap.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-ssa-live.h"
+#include "tree-ssa-coalesce.h"
 
 /* The basic structure describing an equivalency created by traversing
    an edge.  Traversing the edge effectively means that we can assume
diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
index da9de28..a31a137 100644
--- a/gcc/var-tracking.c
+++ b/gcc/var-tracking.c
@@ -4856,12 +4856,16 @@ dataflow_set_remove_mem_locs (variable_def **slot, dataflow_set *set)
    registers, as well as associations between MEMs and VALUEs.  */
 
 static void
-dataflow_set_clear_at_call (dataflow_set *set)
+dataflow_set_clear_at_call (dataflow_set *set, rtx_insn *call_insn)
 {
   unsigned int r;
   hard_reg_set_iterator hrsi;
+  HARD_REG_SET invalidated_regs;
 
-  EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, r, hrsi)
+  get_call_reg_set_usage (call_insn, &invalidated_regs,
+			  regs_invalidated_by_call);
+
+  EXECUTE_IF_SET_IN_HARD_REG_SET (invalidated_regs, 0, r, hrsi)
     var_regno_delete (set, r);
 
   if (MAY_HAVE_DEBUG_INSNS)
@@ -6645,7 +6649,7 @@ compute_bb_dataflow (basic_block bb)
       switch (mo->type)
 	{
 	  case MO_CALL:
-	    dataflow_set_clear_at_call (out);
+	    dataflow_set_clear_at_call (out, insn);
 	    break;
 
 	  case MO_USE:
@@ -9107,7 +9111,7 @@ emit_notes_in_bb (basic_block bb, dataflow_set *set)
       switch (mo->type)
 	{
 	  case MO_CALL:
-	    dataflow_set_clear_at_call (set);
+	    dataflow_set_clear_at_call (set, insn);
 	    emit_notes_for_changes (insn, EMIT_NOTE_AFTER_CALL_INSN, set->vars);
 	    {
 	      rtx arguments = mo->u.loc, *p = &arguments;
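
Editorial sketch (not part of the patch): the deleted tree-ssa-copyrename.c merged SSA names through a partition map, calling partition_union for each copy or PHI argument that passed its checks. The toy below illustrates only that merging step, using a plain union-find. ToyPartitions and coalesce_copies are hypothetical names, and none of the pass's restrictions (abnormal PHIs, PARM_DECL/RESULT_DECL roots, read-only roots, type compatibility) are modelled. The choice of surviving leader is also a simplification made for the toy.

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Toy stand-in for a partition map: one element per SSA version number.
// Hypothetical type; this is not the tree-ssa-live.h API.
struct ToyPartitions {
  std::vector<int> parent;

  explicit ToyPartitions(int n) : parent(n) {
    std::iota(parent.begin(), parent.end(), 0);  // every name starts alone
  }

  // Return the leader of the partition containing X, halving the path.
  int find(int x) {
    assert(x >= 0 && x < static_cast<int>(parent.size()));
    while (parent[x] != x) {
      parent[x] = parent[parent[x]];
      x = parent[x];
    }
    return x;
  }

  // Merge the partitions of A and B.  In this toy the leader of A's
  // partition always survives.
  int unite(int a, int b) {
    int ra = find(a), rb = find(b);
    if (ra != rb)
      parent[rb] = ra;
    return ra;
  }
};

// Treat every (lhs, rhs) pair as a copy and coalesce both sides
// unconditionally.  The real pass rejected many pairs before merging.
ToyPartitions coalesce_copies(int num_names,
                              const std::vector<std::pair<int, int>> &copies) {
  ToyPartitions parts(num_names);
  for (const auto &c : copies)
    parts.unite(c.first, c.second);
  return parts;
}
```

With versions numbered as in the pass's own comment, a copy such as a_1 = T.3_5 would be passed as the pair {1, 5}; afterwards find(1) and find(5) return the same leader, which is the condition the removed final loop used to decide which names receive a common base variable.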


Adjusted incremental patch:

---
 gcc/function.c    |   39 +++++++++++++++++++++++++++++++++++++--
 gcc/stor-layout.c |    3 ++-
 2 files changed, 39 insertions(+), 3 deletions(-)

diff --git a/gcc/function.c b/gcc/function.c
index 753d889..c3d00cd 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -151,6 +151,8 @@ static bool contains (const_rtx, hash_table<insn_cache_hasher> *);
 static void prepare_function_start (void);
 static void do_clobber_return_reg (rtx, void *);
 static void do_use_return_reg (rtx, void *);
+static rtx rtl_for_parm (struct assign_parm_data_all *, tree);
+
 \f
 /* Stack of nested functions.  */
 /* Keep track of the cfun stack.  */
@@ -2267,7 +2269,7 @@ assign_parms_initialize_all (struct assign_parm_data_all *all)
    needed, else the old list.  */
 
 static void
-split_complex_args (vec<tree> *args)
+split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
 {
   unsigned i;
   tree p;
@@ -2278,6 +2280,7 @@ split_complex_args (vec<tree> *args)
       if (TREE_CODE (type) == COMPLEX_TYPE
 	  && targetm.calls.split_complex_arg (type))
 	{
+	  tree cparm = p;
 	  tree decl;
 	  tree subtype = TREE_TYPE (type);
 	  bool addressable = TREE_ADDRESSABLE (p);
@@ -2296,6 +2299,9 @@ split_complex_args (vec<tree> *args)
 	  DECL_ARTIFICIAL (p) = addressable;
 	  DECL_IGNORED_P (p) = addressable;
 	  TREE_ADDRESSABLE (p) = 0;
+	  /* Reset the RTL before layout_decl, or it may change the
+	     mode of the RTL of the original argument copied to P.  */
+	  SET_DECL_RTL (p, NULL_RTX);
 	  layout_decl (p, 0);
 	  (*args)[i] = p;
 
@@ -2307,6 +2313,25 @@ split_complex_args (vec<tree> *args)
 	  DECL_IGNORED_P (decl) = addressable;
 	  layout_decl (decl, 0);
 	  args->safe_insert (++i, decl);
+
+	  /* If we are expanding a function, rather than gimplifying
+	     it, propagate the RTL of the complex parm to the split
+	     declarations, and set their contexts so that
+	     maybe_reset_rtl_for_parm can recognize them and refrain
+	     from resetting their RTL.  */
+	  if (currently_expanding_to_rtl)
+	    {
+	      rtx rtl = rtl_for_parm (all, cparm);
+	      gcc_assert (!rtl || GET_CODE (rtl) == CONCAT);
+	      if (rtl)
+		{
+		  SET_DECL_RTL (p, XEXP (rtl, 0));
+		  SET_DECL_RTL (decl, XEXP (rtl, 1));
+
+		  DECL_CONTEXT (p) = cparm;
+		  DECL_CONTEXT (decl) = cparm;
+		}
+	    }
 	}
     }
 }
@@ -2369,7 +2394,7 @@ assign_parms_augmented_arg_list (struct assign_parm_data_all *all)
 
   /* If the target wants to split complex arguments into scalars, do so.  */
   if (targetm.calls.split_complex_arg)
-    split_complex_args (&fnargs);
+    split_complex_args (all, &fnargs);
 
   return fnargs;
 }
@@ -2823,6 +2848,16 @@ maybe_reset_rtl_for_parm (tree parm)
 {
   gcc_assert (TREE_CODE (parm) == PARM_DECL
 	      || TREE_CODE (parm) == RESULT_DECL);
+
+  /* This is a split complex parameter, and its context was set to its
+     original PARM_DECL in split_complex_args so that we could
+     recognize it here and not reset its RTL.  */
+  if (DECL_CONTEXT (parm) && TREE_CODE (DECL_CONTEXT (parm)) == PARM_DECL)
+    {
+      DECL_CONTEXT (parm) = DECL_CONTEXT (DECL_CONTEXT (parm));
+      return;
+    }
+
   if ((flag_tree_coalesce_vars
        || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx))
       && is_gimple_reg (parm))
diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index 0d4f4a4..288227a 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -794,7 +794,8 @@ layout_decl (tree decl, unsigned int known_align)
     {
       PUT_MODE (rtl, DECL_MODE (decl));
       SET_DECL_RTL (decl, 0);
-      set_mem_attributes (rtl, decl, 1);
+      if (MEM_P (rtl))
+	set_mem_attributes (rtl, decl, 1);
       SET_DECL_RTL (decl, rtl);
     }
 }
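
Editorial sketch (not part of the patch): the function.c hunks above hand the two halves of a complex parameter's CONCAT to the split scalar decls, then point DECL_CONTEXT of those decls at the original PARM_DECL so that maybe_reset_rtl_for_parm can recognize them, restore the context, and leave their RTL alone. The toy below mirrors that handshake with hypothetical types (ToyRtl, ToyDecl) and hypothetical function names. It assumes a reset that is otherwise unconditional, whereas the real function also checks flag_tree_coalesce_vars, pc_rtx and is_gimple_reg.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-ins for GCC's rtx and tree nodes.
struct ToyRtl {
  bool is_concat = false;
  ToyRtl *part[2] = {nullptr, nullptr};  // real and imaginary halves
};

struct ToyDecl {
  bool is_parm = false;
  ToyDecl *context = nullptr;  // enclosing function, or the marker below
  ToyRtl *rtl = nullptr;
};

// Give the two scalar decls the halves of the complex parm's location and
// mark them by pointing their context at the original parm.
void propagate_split_rtl(ToyDecl &cparm, ToyDecl &real, ToyDecl &imag) {
  ToyRtl *rtl = cparm.rtl;
  assert(!rtl || rtl->is_concat);
  if (!rtl)
    return;
  real.rtl = rtl->part[0];
  imag.rtl = rtl->part[1];
  real.context = &cparm;
  imag.context = &cparm;
}

// Reset a parm's location unless it is a marked split half.  A marked half
// keeps its location and gets its real context back.  Returns true when
// the location was reset.
bool maybe_reset_rtl(ToyDecl &parm) {
  if (parm.context && parm.context->is_parm) {
    parm.context = parm.context->context;
    return false;
  }
  parm.rtl = nullptr;
  return true;
}
```

The marker is consumed on first use: after one call the split half has its ordinary context again, so a second call treats it like any other parameter. That one-shot behaviour follows from the toy as written and from the hunk's DECL_CONTEXT restoration; whether the real function is ever called twice on the same decl is not something the patch text states.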


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-23 15:42                                 ` Alexandre Oliva
@ 2015-07-23 20:35                                   ` Segher Boessenkool
  2015-07-23 21:24                                     ` H.J. Lu
                                                       ` (2 more replies)
  0 siblings, 3 replies; 127+ messages in thread
From: Segher Boessenkool @ 2015-07-23 20:35 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Richard Biener, Jeff Law, GCC Patches, Christophe Lyon,
	David Edelsohn, Eric Botcazou

On Thu, Jul 23, 2015 at 12:29:14PM -0300, Alexandre Oliva wrote:
> Yeah.  Thanks, I've tested it with this change, and I'm now checking
> this in (full patch first; adjusted incremental patch at the end):

Unfortunately it causes about a thousand test fails on powerpc64-linux
(at least, it seems to be this patch, I haven't actually checked).

Some representative backtraces:


/home/segher/src/gcc/gcc/testsuite/gcc.c-torture/compile/pr54713-1.c: In function 'f1':
/home/segher/src/gcc/gcc/testsuite/gcc.c-torture/compile/pr54713-1.c:13:1: internal compiler error: in expand_one_stack_var_1, at cfgexpand.c:1221
0x1030eae7 expand_one_stack_var_1
	/home/segher/src/gcc/gcc/cfgexpand.c:1221
0x10320a23 expand_one_ssa_partition
	/home/segher/src/gcc/gcc/cfgexpand.c:1295
0x10320a23 expand_used_vars
	/home/segher/src/gcc/gcc/cfgexpand.c:1940
0x10322ea3 execute
	/home/segher/src/gcc/gcc/cfgexpand.c:6084


/home/segher/src/gcc/gcc/testsuite/gcc.c-torture/compile/pr39928-1.c: In function 'vq_nbest':
/home/segher/src/gcc/gcc/testsuite/gcc.c-torture/compile/pr39928-1.c:6:1: internal compiler error: in emit_move_insn, at expr.c:3552
0x1046f587 emit_move_insn(rtx_def*, rtx_def*)
	/home/segher/src/gcc/gcc/expr.c:3551
0x104daa67 assign_parm_setup_reg
	/home/segher/src/gcc/gcc/function.c:3322
0x104dd063 assign_parms
	/home/segher/src/gcc/gcc/function.c:3766
0x104e0aa7 expand_function_start(tree_node*)
	/home/segher/src/gcc/gcc/function.c:5192
0x10322f07 execute
	/home/segher/src/gcc/gcc/cfgexpand.c:6105


I have the full testsuite logs if you want them.


Segher


* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-23 20:35                                   ` Segher Boessenkool
@ 2015-07-23 21:24                                     ` H.J. Lu
  2015-07-23 22:11                                       ` H.J. Lu
  2015-07-24 18:21                                     ` Alexandre Oliva
  2015-07-29 20:32                                     ` Alexandre Oliva
  2 siblings, 1 reply; 127+ messages in thread
From: H.J. Lu @ 2015-07-23 21:24 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Alexandre Oliva, Richard Biener, Jeff Law, GCC Patches,
	Christophe Lyon, David Edelsohn, Eric Botcazou

On Thu, Jul 23, 2015 at 1:31 PM, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
> On Thu, Jul 23, 2015 at 12:29:14PM -0300, Alexandre Oliva wrote:
>> Yeah.  Thanks, I've tested it with this change, and I'm now checking
>> this in (full patch first; adjusted incremental patch at the end):
>
> Unfortunately it causes about a thousand test fails on powerpc64-linux
> (at least, it seems to be this patch, I haven't actually checked).
>

It also caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66978

-- 
H.J.


* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-23 21:24                                     ` H.J. Lu
@ 2015-07-23 22:11                                       ` H.J. Lu
  2015-07-24  1:31                                         ` David Edelsohn
                                                           ` (2 more replies)
  0 siblings, 3 replies; 127+ messages in thread
From: H.J. Lu @ 2015-07-23 22:11 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Alexandre Oliva, Richard Biener, Jeff Law, GCC Patches,
	Christophe Lyon, David Edelsohn, Eric Botcazou

On Thu, Jul 23, 2015 at 1:57 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Thu, Jul 23, 2015 at 1:31 PM, Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
>> On Thu, Jul 23, 2015 at 12:29:14PM -0300, Alexandre Oliva wrote:
>>> Yeah.  Thanks, I've tested it with this change, and I'm now checking
>>> this in (full patch first; adjusted incremental patch at the end):
>>
>> Unfortunately it causes about a thousand test fails on powerpc64-linux
>> (at least, it seems to be this patch, I haven't actually checked).
>>
>
> It also caused:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66978
>

and maybe:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983

-- 
H.J.


* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-23 22:11                                       ` H.J. Lu
@ 2015-07-24  1:31                                         ` David Edelsohn
  2015-07-24  5:08                                           ` H.J. Lu
  2015-07-24 20:20                                           ` Alexandre Oliva
  2015-07-24 18:51                                         ` Alexandre Oliva
  2015-07-29 20:52                                         ` Alexandre Oliva
  2 siblings, 2 replies; 127+ messages in thread
From: David Edelsohn @ 2015-07-24  1:31 UTC (permalink / raw)
  To: Alexandre Oliva, Jeff Law
  Cc: Segher Boessenkool, Richard Biener, GCC Patches, Christophe Lyon,
	Eric Botcazou, H.J. Lu

On Thu, Jul 23, 2015 at 5:59 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Thu, Jul 23, 2015 at 1:57 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Thu, Jul 23, 2015 at 1:31 PM, Segher Boessenkool
>> <segher@kernel.crashing.org> wrote:
>>> On Thu, Jul 23, 2015 at 12:29:14PM -0300, Alexandre Oliva wrote:
>>>> Yeah.  Thanks, I've tested it with this change, and I'm now checking
>>>> this in (full patch first; adjusted incremental patch at the end):
>>>
>>> Unfortunately it causes about a thousand test fails on powerpc64-linux
>>> (at least, it seems to be this patch, I haven't actually checked).
>>>
>>
>> It also caused:
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66978
>>
>
> and maybe:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983

I request that this patch be reverted (again).

Thanks, David


* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-24  1:31                                         ` David Edelsohn
@ 2015-07-24  5:08                                           ` H.J. Lu
  2015-07-24  9:26                                             ` Richard Biener
  2015-07-24 20:20                                           ` Alexandre Oliva
  1 sibling, 1 reply; 127+ messages in thread
From: H.J. Lu @ 2015-07-24  5:08 UTC (permalink / raw)
  To: David Edelsohn
  Cc: Alexandre Oliva, Jeff Law, Segher Boessenkool, Richard Biener,
	GCC Patches, Christophe Lyon, Eric Botcazou

On Thu, Jul 23, 2015 at 4:14 PM, David Edelsohn <dje.gcc@gmail.com> wrote:
> On Thu, Jul 23, 2015 at 5:59 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Thu, Jul 23, 2015 at 1:57 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> On Thu, Jul 23, 2015 at 1:31 PM, Segher Boessenkool
>>> <segher@kernel.crashing.org> wrote:
>>>> On Thu, Jul 23, 2015 at 12:29:14PM -0300, Alexandre Oliva wrote:
>>>>> Yeah.  Thanks, I've tested it with this change, and I'm now checking
>>>>> this in (full patch first; adjusted incremental patch at the end):
>>>>
>>>> Unfortunately it causes about a thousand test fails on powerpc64-linux
>>>> (at least, it seems to be this patch, I haven't actually checked).
>>>>
>>>
>>> It also caused:
>>>
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66978
>>>
>>
>> and maybe:
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983
>
> I request that this patch be reverted (again).

And I request that any new patches be tested under x32 before they are checked in.
You can use Ubuntu 14 to test x32.

Thanks.

-- 
H.J.


* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-24  5:08                                           ` H.J. Lu
@ 2015-07-24  9:26                                             ` Richard Biener
  2015-07-24 12:50                                               ` H.J. Lu
  0 siblings, 1 reply; 127+ messages in thread
From: Richard Biener @ 2015-07-24  9:26 UTC (permalink / raw)
  To: H.J. Lu
  Cc: David Edelsohn, Alexandre Oliva, Jeff Law, Segher Boessenkool,
	GCC Patches, Christophe Lyon, Eric Botcazou

On Fri, Jul 24, 2015 at 1:19 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Thu, Jul 23, 2015 at 4:14 PM, David Edelsohn <dje.gcc@gmail.com> wrote:
>> On Thu, Jul 23, 2015 at 5:59 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> On Thu, Jul 23, 2015 at 1:57 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>> On Thu, Jul 23, 2015 at 1:31 PM, Segher Boessenkool
>>>> <segher@kernel.crashing.org> wrote:
>>>>> On Thu, Jul 23, 2015 at 12:29:14PM -0300, Alexandre Oliva wrote:
>>>>>> Yeah.  Thanks, I've tested it with this change, and I'm now checking
>>>>>> this in (full patch first; adjusted incremental patch at the end):
>>>>>
>>>>> Unfortunately it causes about a thousand test fails on powerpc64-linux
>>>>> (at least, it seems to be this patch, I haven't actually checked).
>>>>>
>>>>
>>>> It also caused:
>>>>
>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66978
>>>>
>>>
>>> and maybe:
>>>
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983
>>
>> I request that this patch be reverted (again).
>
> And I request that any new patches be tested under x32 before checking in.
> You can use Ubuntu 14 to test x32.

x32 is neither primary nor secondary arch.

Richard.

> Thanks.
>
> --
> H.J.

* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-24  9:26                                             ` Richard Biener
@ 2015-07-24 12:50                                               ` H.J. Lu
  0 siblings, 0 replies; 127+ messages in thread
From: H.J. Lu @ 2015-07-24 12:50 UTC (permalink / raw)
  To: Richard Biener
  Cc: David Edelsohn, Alexandre Oliva, Jeff Law, Segher Boessenkool,
	GCC Patches, Christophe Lyon, Eric Botcazou

On Fri, Jul 24, 2015 at 2:22 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Fri, Jul 24, 2015 at 1:19 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Thu, Jul 23, 2015 at 4:14 PM, David Edelsohn <dje.gcc@gmail.com> wrote:
>>> On Thu, Jul 23, 2015 at 5:59 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>> On Thu, Jul 23, 2015 at 1:57 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>> On Thu, Jul 23, 2015 at 1:31 PM, Segher Boessenkool
>>>>> <segher@kernel.crashing.org> wrote:
>>>>>> On Thu, Jul 23, 2015 at 12:29:14PM -0300, Alexandre Oliva wrote:
>>>>>>> Yeah.  Thanks, I've tested it with this change, and I'm now checking
>>>>>>> this in (full patch first; adjusted incremental patch at the end):
>>>>>>
>>>>>> Unfortunately it causes about a thousand test fails on powerpc64-linux
>>>>>> (at least, it seems to be this patch, I haven't actually checked).
>>>>>>
>>>>>
>>>>> It also caused:
>>>>>
>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66978
>>>>>
>>>>
>>>> and maybe:
>>>>
>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983
>>>
>>> I request that this patch be reverted (again).
>>
>> And I request that any new patches be tested under x32 before checking in.
>> You can use Ubuntu 14 to test x32.
>
> x32 is neither primary nor secondary arch.
>

I suggested a way to reproduce the problem.  I checked in this testcase so
that the problem will show up on Linux/x86-64.

-- 
H.J.
---
Index: ChangeLog
===================================================================
--- ChangeLog (revision 226149)
+++ ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2015-07-24  H.J. Lu  <hongjiu.lu@intel.com>
+
+ PR bootstrap/66978
+ * gcc.target/i386/pr66978.c: New test.
+
 2015-07-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

  * gcc.target/s390/gpr2fprsavecfi.c: New test.
Index: gcc.target/i386/pr66978.c
===================================================================
--- gcc.target/i386/pr66978.c (revision 0)
+++ gcc.target/i386/pr66978.c (working copy)
@@ -0,0 +1,15 @@
+/* { dg-do compile { target { ! { ia32 } } } } */
+/* { dg-require-effective-target maybe_x32 } */
+/* { dg-options "-O2 -mx32 -maddress-mode=short" } */
+
+extern int foo (int *);
+int
+bar (int *p)
+{
+  __attribute__ ((noinline, noclone))
+  int hack_digit (void)
+    {
+      return foo (p);
+    }
+  return hack_digit ();
+}

* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-23 20:35                                   ` Segher Boessenkool
  2015-07-23 21:24                                     ` H.J. Lu
@ 2015-07-24 18:21                                     ` Alexandre Oliva
  2015-07-29 20:32                                     ` Alexandre Oliva
  2 siblings, 0 replies; 127+ messages in thread
From: Alexandre Oliva @ 2015-07-24 18:21 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Richard Biener, Jeff Law, GCC Patches, Christophe Lyon,
	David Edelsohn, Eric Botcazou

On Jul 23, 2015, Segher Boessenkool <segher@kernel.crashing.org> wrote:

> On Thu, Jul 23, 2015 at 12:29:14PM -0300, Alexandre Oliva wrote:
>> Yeah.  Thanks, I've tested it with this change, and I'm now checking
>> this in (full patch first; adjusted incremental patch at the end):

> Unfortunately it causes about a thousand test fails on powerpc64-linux
> (at least, it seems to be this patch, I haven't actually checked).

Yeah, the backtrace suggests very strongly that it's my patch.
Apologies for the breakage.  I'm looking into it right now.

> I have the full testsuite logs if you want them.

Preprocessed testcases would probably help more than the logs, but at
least one of the testcases you mentioned doesn't require any libraries,
so I'm going to start with them.  Then, I'll find some ppc64-linux-gnu
box in the build farm and run some bootstrap and regression testing
there.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-23 22:11                                       ` H.J. Lu
  2015-07-24  1:31                                         ` David Edelsohn
@ 2015-07-24 18:51                                         ` Alexandre Oliva
  2015-07-24 19:12                                           ` H.J. Lu
  2015-07-29 20:52                                         ` Alexandre Oliva
  2 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-07-24 18:51 UTC (permalink / raw)
  To: H.J. Lu
  Cc: Segher Boessenkool, Richard Biener, Jeff Law, GCC Patches,
	Christophe Lyon, David Edelsohn, Eric Botcazou

On Jul 23, 2015, "H.J. Lu" <hjl.tools@gmail.com> wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983

My test logs with the patch have:

PASS: c-c++-common/dfp/func-vararg-dfp.c execution test

so I think this one is not caused by the PR64164 patch.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-24 18:51                                         ` Alexandre Oliva
@ 2015-07-24 19:12                                           ` H.J. Lu
  2015-07-24 19:31                                             ` David Edelsohn
  2015-07-24 20:47                                             ` Alexandre Oliva
  0 siblings, 2 replies; 127+ messages in thread
From: H.J. Lu @ 2015-07-24 19:12 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Segher Boessenkool, Richard Biener, Jeff Law, GCC Patches,
	Christophe Lyon, David Edelsohn, Eric Botcazou

On Fri, Jul 24, 2015 at 11:43 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Jul 23, 2015, "H.J. Lu" <hjl.tools@gmail.com> wrote:
>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983
>
> My test logs with the patch have:
>
> PASS: c-c++-common/dfp/func-vararg-dfp.c execution test
>
> so I think this one is not caused by the PR64164 patch.

I double checked.  r226113 failed and r226112 passed.

-- 
H.J.

* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-24 19:12                                           ` H.J. Lu
@ 2015-07-24 19:31                                             ` David Edelsohn
  2015-07-24 20:43                                               ` Alexandre Oliva
  2015-07-24 20:47                                             ` Alexandre Oliva
  1 sibling, 1 reply; 127+ messages in thread
From: David Edelsohn @ 2015-07-24 19:31 UTC (permalink / raw)
  To: H.J. Lu, Alexandre Oliva
  Cc: Segher Boessenkool, Richard Biener, Jeff Law, GCC Patches,
	Christophe Lyon, Eric Botcazou

On Fri, Jul 24, 2015 at 3:10 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Fri, Jul 24, 2015 at 11:43 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
>> On Jul 23, 2015, "H.J. Lu" <hjl.tools@gmail.com> wrote:
>>
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983
>>
>> My test logs with the patch have:
>>
>> PASS: c-c++-common/dfp/func-vararg-dfp.c execution test
>>
>> so I think this one is not caused by the PR64164 patch.
>
> I double checked.  r226113 failed and r226112 passed.

Alexandre,

Did you commit the final, complete version of the patches?

Did you test the version of the patches that you committed?

Thanks, David

* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-24  1:31                                         ` David Edelsohn
  2015-07-24  5:08                                           ` H.J. Lu
@ 2015-07-24 20:20                                           ` Alexandre Oliva
  2015-07-25  2:37                                             ` David Edelsohn
  1 sibling, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-07-24 20:20 UTC (permalink / raw)
  To: David Edelsohn
  Cc: Jeff Law, Segher Boessenkool, Richard Biener, GCC Patches,
	Christophe Lyon, Eric Botcazou, H.J. Lu

On Jul 23, 2015, David Edelsohn <dje.gcc@gmail.com> wrote:

> I request that this patch be reverted (again).

Might I kindly ask you to please do so for me.  I've just found out
that, after yesterday's memory upgrade on my local build machine, the
filesystem that I normally use for GCC development got corrupted, and I
don't want to mess with it before running an fsck which will take me a
while.

Apologies for the breakage.  I'll add ppc64-linux-gnu regression testing
to the test list for this patchset.  Thanks,

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-24 19:31                                             ` David Edelsohn
@ 2015-07-24 20:43                                               ` Alexandre Oliva
  0 siblings, 0 replies; 127+ messages in thread
From: Alexandre Oliva @ 2015-07-24 20:43 UTC (permalink / raw)
  To: David Edelsohn
  Cc: H.J. Lu, Segher Boessenkool, Richard Biener, Jeff Law,
	GCC Patches, Christophe Lyon, Eric Botcazou

On Jul 24, 2015, David Edelsohn <dje.gcc@gmail.com> wrote:

> Did you commit the final, complete version of the patches?

Yes, I have double-checked that reverting the patch I posted on the
r225979 tree I used for testing reverts it to the base revision, except
for unrelated libgfortran configury changes that I used to avoid halting
configure in cross builds, to increase the amount of code built during
the cross testing.

> Did you test the version of the patches that you committed?

Yes, I have just double-checked that the version I tested is identical
to the change introduced by commit 226113, except for the ChangeLogs and
the libgfortran configury change.

It could be that some other change between 225979 and 226112, combined
with 226113, caused the regression.  Or it could be that my
i686-pc-linux-gnu "native" testing on x86_64-pc-linux-gnu is somehow
faulty.  I'll look into it and try to find out what the source of the
difference in test results H.J. and I get could be.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-24 19:12                                           ` H.J. Lu
  2015-07-24 19:31                                             ` David Edelsohn
@ 2015-07-24 20:47                                             ` Alexandre Oliva
  2015-07-24 21:53                                               ` H.J. Lu
  1 sibling, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-07-24 20:47 UTC (permalink / raw)
  To: H.J. Lu
  Cc: Segher Boessenkool, Richard Biener, Jeff Law, GCC Patches,
	Christophe Lyon, David Edelsohn, Eric Botcazou

On Jul 24, 2015, "H.J. Lu" <hjl.tools@gmail.com> wrote:

> On Fri, Jul 24, 2015 at 11:43 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
>> On Jul 23, 2015, "H.J. Lu" <hjl.tools@gmail.com> wrote:
>> 
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983
>> 
>> My test logs with the patch have:
>> 
>> PASS: c-c++-common/dfp/func-vararg-dfp.c execution test
>> 
>> so I think this one is not caused by the PR64164 patch.

> I double checked.  r226113 failed and r226112 passed.

Weird.  Here's the asm I got, after re-running the command used to
compile this test with -save-temps.  I checked that the produced
executable is an ELF32 executable, and that it completes execution
successfully.

How does it compare to yours?


[-- Attachment #2: func-vararg-dfp.s.gz --]
[-- Type: application/gzip, Size: 1288 bytes --]


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-24 20:47                                             ` Alexandre Oliva
@ 2015-07-24 21:53                                               ` H.J. Lu
  2015-07-25  7:17                                                 ` Richard Biener
  0 siblings, 1 reply; 127+ messages in thread
From: H.J. Lu @ 2015-07-24 21:53 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Segher Boessenkool, Richard Biener, Jeff Law, GCC Patches,
	Christophe Lyon, David Edelsohn, Eric Botcazou

On Fri, Jul 24, 2015 at 1:36 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Jul 24, 2015, "H.J. Lu" <hjl.tools@gmail.com> wrote:
>
>> On Fri, Jul 24, 2015 at 11:43 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
>>> On Jul 23, 2015, "H.J. Lu" <hjl.tools@gmail.com> wrote:
>>>
>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983
>>>
>>> My test logs with the patch have:
>>>
>>> PASS: c-c++-common/dfp/func-vararg-dfp.c execution test
>>>
>>> so I think this one is not caused by the PR64164 patch.
>
>> I double checked.  r226113 failed and r226112 passed.
>
> Weird.  Here's the asm I got, after re-running the command used to
> compile this test with -save-temps.  I checked that the produced
> executable is an ELF32 executable, and that it completes execution
> successfully.
>
> How does it compare to yours?

Please add -msse2:

[hjl@gnu-ivb-1 gcc]$
/export/gnu/import/git/gcc-regression/master/226113/bld/gcc/xgcc
-B/export/gnu/import/git/gcc-regression/master/226113/bld/gcc/
/export/gnu/import/git/gcc-regression/gcc/gcc/testsuite/c-c++-common/dfp/func-vararg-dfp.c
-m32 -fno-diagnostics-show-caret -fdiagnostics-color=never -std=gnu99
-march=i686  -msse2
[hjl@gnu-ivb-1 gcc]$ ./a.out
Segmentation fault (core dumped)
[hjl@gnu-ivb-1 gcc]$
/export/gnu/import/git/gcc-regression/master/226113/bld/gcc/xgcc
-B/export/gnu/import/git/gcc-regression/master/226113/bld/gcc/
/export/gnu/import/git/gcc-regression/gcc/gcc/testsuite/c-c++-common/dfp/func-vararg-dfp.c
-m32 -fno-diagnostics-show-caret -fdiagnostics-color=never -std=gnu99
-march=i686
[hjl@gnu-ivb-1 gcc]$ ./a.out
[hjl@gnu-ivb-1 gcc]$


-- 
H.J.

* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-24 20:20                                           ` Alexandre Oliva
@ 2015-07-25  2:37                                             ` David Edelsohn
  2015-07-27 22:16                                               ` Alexandre Oliva
  0 siblings, 1 reply; 127+ messages in thread
From: David Edelsohn @ 2015-07-25  2:37 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Jeff Law, Segher Boessenkool, Richard Biener, GCC Patches,
	Christophe Lyon, Eric Botcazou, H.J. Lu

On Fri, Jul 24, 2015 at 4:02 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Jul 23, 2015, David Edelsohn <dje.gcc@gmail.com> wrote:
>
>> I request that this patch be reverted (again).
>
> Might I kindly ask you to please do so for me.  I've just found out
> that, after yesterday's memory upgrade on my local build machine, the
> filesystem that I normally use for GCC development got corrupted, and I
> don't want to mess with it before running an fsck which will take me a
> while.

I have reverted the patch.

- David

* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-24 21:53                                               ` H.J. Lu
@ 2015-07-25  7:17                                                 ` Richard Biener
  0 siblings, 0 replies; 127+ messages in thread
From: Richard Biener @ 2015-07-25  7:17 UTC (permalink / raw)
  To: H.J. Lu, Alexandre Oliva
  Cc: Segher Boessenkool, Jeff Law, GCC Patches, Christophe Lyon,
	David Edelsohn, Eric Botcazou

On July 24, 2015 10:47:37 PM GMT+02:00, "H.J. Lu" <hjl.tools@gmail.com> wrote:
>On Fri, Jul 24, 2015 at 1:36 PM, Alexandre Oliva <aoliva@redhat.com>
>wrote:
>> On Jul 24, 2015, "H.J. Lu" <hjl.tools@gmail.com> wrote:
>>
>>> On Fri, Jul 24, 2015 at 11:43 AM, Alexandre Oliva
><aoliva@redhat.com> wrote:
>>>> On Jul 23, 2015, "H.J. Lu" <hjl.tools@gmail.com> wrote:
>>>>
>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983
>>>>
>>>> My test logs with the patch have:
>>>>
>>>> PASS: c-c++-common/dfp/func-vararg-dfp.c execution test
>>>>
>>>> so I think this one is not caused by the PR64164 patch.
>>
>>> I double checked.  r226113 failed and r226112 passed.
>>
>> Weird.  Here's the asm I got, after re-running the command used to
>> compile this test with -save-temps.  I checked that the produced
>> executable is an ELF32 executable, and that it completes execution
>> successfully.
>>
>> How does it compare to yours?
>
>Please add -msse2:

Yes, the fails appear with -m32 multilib testing on x86_64.

Richard.

>[hjl@gnu-ivb-1 gcc]$
>/export/gnu/import/git/gcc-regression/master/226113/bld/gcc/xgcc
>-B/export/gnu/import/git/gcc-regression/master/226113/bld/gcc/
>/export/gnu/import/git/gcc-regression/gcc/gcc/testsuite/c-c++-common/dfp/func-vararg-dfp.c
>-m32 -fno-diagnostics-show-caret -fdiagnostics-color=never -std=gnu99
>-march=i686  -msse2
>[hjl@gnu-ivb-1 gcc]$ ./a.out
>Segmentation fault (core dumped)
>[hjl@gnu-ivb-1 gcc]$
>/export/gnu/import/git/gcc-regression/master/226113/bld/gcc/xgcc
>-B/export/gnu/import/git/gcc-regression/master/226113/bld/gcc/
>/export/gnu/import/git/gcc-regression/gcc/gcc/testsuite/c-c++-common/dfp/func-vararg-dfp.c
>-m32 -fno-diagnostics-show-caret -fdiagnostics-color=never -std=gnu99
>-march=i686
>[hjl@gnu-ivb-1 gcc]$ ./a.out
>[hjl@gnu-ivb-1 gcc]$


* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-25  2:37                                             ` David Edelsohn
@ 2015-07-27 22:16                                               ` Alexandre Oliva
  2015-07-27 22:31                                                 ` H.J. Lu
  0 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-07-27 22:16 UTC (permalink / raw)
  To: David Edelsohn
  Cc: Jeff Law, Segher Boessenkool, Richard Biener, GCC Patches,
	Christophe Lyon, Eric Botcazou, H.J. Lu

On Jul 24, 2015, David Edelsohn <dje.gcc@gmail.com> wrote:

> On Fri, Jul 24, 2015 at 4:02 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
>> On Jul 23, 2015, David Edelsohn <dje.gcc@gmail.com> wrote:
>> 
>>> I request that this patch be reverted (again).
>> 
>> Might I kindly ask you to please do so for me.  I've just found out
>> that, after yesterday's memory upgrade on my local build machine, the
>> filesystem that I normally use for GCC development got corrupted, and I
>> don't want to mess with it before running an fsck which will take me a
>> while.

> I have reverted the patch.

Thank you very much.  Long story short, the filesystem got corrupted
beyond repair before I realized something was wrong, so I spent my
weekend backing up the bits I still could and recreating it all from
scratch.  *fun* :-/

I even ran memtest before booting up, but everything was fine in the
single-threaded tests it runs by default.  It was only with all cores
actively using memory intensely that something overheated (memory
modules?  chipset?  cpu?  no clue) and started randomly corrupting bits.
So, I'm now back at lower memory clock speeds, and everything appears to
be rock solid again.  Phew!  So, I'm back to debugging the
newly-reported problems and thinking how much further I should extend
testing coverage so that the next round doesn't have to be reverted
again ;-)

Thanks again,

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-27 22:16                                               ` Alexandre Oliva
@ 2015-07-27 22:31                                                 ` H.J. Lu
  0 siblings, 0 replies; 127+ messages in thread
From: H.J. Lu @ 2015-07-27 22:31 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: David Edelsohn, Jeff Law, Segher Boessenkool, Richard Biener,
	GCC Patches, Christophe Lyon, Eric Botcazou

On Mon, Jul 27, 2015 at 2:22 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Jul 24, 2015, David Edelsohn <dje.gcc@gmail.com> wrote:
>
>> On Fri, Jul 24, 2015 at 4:02 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
>>> On Jul 23, 2015, David Edelsohn <dje.gcc@gmail.com> wrote:
>>>
>>>> I request that this patch be reverted (again).
>>>
>>> Might I kindly ask you to please do so for me.  I've just found out
>>> that, after yesterday's memory upgrade on my local build machine, the
>>> filesystem that I normally use for GCC development got corrupted, and I
>>> don't want to mess with it before running an fsck which will take me a
>>> while.
>
>> I have reverted the patch.
>
> Thank you very much.  Long story short, the filesystem got corrupted
> beyond repair before I realized something was wrong, so I spent my
> weekend backing up the bits I still could and recreating it all from
> scratch.  *fun* :-/
>
> I even ran memtest before booting up, but everything was fine in the
> single-threaded tests it runs by default.  It was only with all cores
> actively using memory intensely that something overheated (memory
> modules?  chipset?  cpu?  no clue) and started randomly corrupting bits.
> So, I'm now back at lower memory clock speeds, and everything appears to
> be rock solid again.  Phew!  So, I'm back to debugging the

Exactly the same thing happened to my machine.  It took me
several weeks before I lowered memory clock.  My machine has
been running fine for over a year under very heavy load.

BTW, this is what I use to test ia32 on Intel64:

PATH=/usr/local32/bin:/bin:/usr/bin
.../configure --prefix=/usr/6.0.0 --enable-clocale=gnu
--with-system-zlib --enable-shared --with-demangler-in-ld
--enable-libmpx i686-linux --with-fpmath=sse
--enable-languages=c,c++,fortran,java,lto,objc

where /usr/local32/bin has ia32 binutils.

-- 
H.J.

* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-23 20:35                                   ` Segher Boessenkool
  2015-07-23 21:24                                     ` H.J. Lu
  2015-07-24 18:21                                     ` Alexandre Oliva
@ 2015-07-29 20:32                                     ` Alexandre Oliva
  2 siblings, 0 replies; 127+ messages in thread
From: Alexandre Oliva @ 2015-07-29 20:32 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Richard Biener, Jeff Law, GCC Patches, Christophe Lyon,
	David Edelsohn, Eric Botcazou

On Jul 23, 2015, Segher Boessenkool <segher@kernel.crashing.org> wrote:

> On Thu, Jul 23, 2015 at 12:29:14PM -0300, Alexandre Oliva wrote:
>> Yeah.  Thanks, I've tested it with this change, and I'm now checking
>> this in (full patch first; adjusted incremental patch at the end):

> Unfortunately it causes about a thousand test fails on powerpc64-linux
> (at least, it seems to be this patch, I haven't actually checked).

> Some representative backtraces:

Thanks, both of these are now fixed (at least in that they don't ICE any
more) in the git branch aoliva/pr64164, but I'm going to investigate a
few more issues before I start a regression test on gcc110 and gcc112.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-23 22:11                                       ` H.J. Lu
  2015-07-24  1:31                                         ` David Edelsohn
  2015-07-24 18:51                                         ` Alexandre Oliva
@ 2015-07-29 20:52                                         ` Alexandre Oliva
  2015-07-29 21:06                                           ` H.J. Lu
  2 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-07-29 20:52 UTC (permalink / raw)
  To: H.J. Lu
  Cc: Segher Boessenkool, Richard Biener, Jeff Law, GCC Patches,
	Christophe Lyon, David Edelsohn, Eric Botcazou

On Jul 23, 2015, "H.J. Lu" <hjl.tools@gmail.com> wrote:

>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66978

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983

Thanks, both of these are also fixed (I merged your patch for x32, and I
verified manually that another fix I just wrote fixes all the -m32
-msse2 regressions) in the git branch aoliva/pr64164, but I'm going to
investigate a few more issues affecting other targets before I start
full regression tests all over the build farm ;-)

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-29 20:52                                         ` Alexandre Oliva
@ 2015-07-29 21:06                                           ` H.J. Lu
  2015-07-30 17:47                                             ` H.J. Lu
  0 siblings, 1 reply; 127+ messages in thread
From: H.J. Lu @ 2015-07-29 21:06 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Segher Boessenkool, Richard Biener, Jeff Law, GCC Patches,
	Christophe Lyon, David Edelsohn, Eric Botcazou

On Wed, Jul 29, 2015 at 1:13 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Jul 23, 2015, "H.J. Lu" <hjl.tools@gmail.com> wrote:
>
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66978
>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983
>
> Thanks, both of these are also fixed (I merged your patch for x32, and I
> verified manually that another fix I just wrote fixes all the -m32
> -msse2 regressions) in the git branch aoliva/pr64164, but I'm going to
> investigate a few more issues affecting other targets before I start
> full regression tests all over the build farm ;-)
>

I am building x32 on aoliva/pr64164 now.

-- 
H.J.

* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-29 21:06                                           ` H.J. Lu
@ 2015-07-30 17:47                                             ` H.J. Lu
  2015-08-03 23:46                                               ` Alexandre Oliva
  0 siblings, 1 reply; 127+ messages in thread
From: H.J. Lu @ 2015-07-30 17:47 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Segher Boessenkool, Richard Biener, Jeff Law, GCC Patches,
	Christophe Lyon, David Edelsohn, Eric Botcazou

On Wed, Jul 29, 2015 at 1:23 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Wed, Jul 29, 2015 at 1:13 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
>> On Jul 23, 2015, "H.J. Lu" <hjl.tools@gmail.com> wrote:
>>
>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66978
>>
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983
>>
>> Thanks, both of these are also fixed (I merged your patch for x32, and I
>> verified manually that another fix I just wrote fixes all the -m32
>> -msse2 regressions) in the git branch aoliva/pr64164, but I'm going to
>> investigate a few more issues affecting other targets before I start
>> full regression tests all over the build farm ;-)
>>
>
> I am building x32 on aoliva/pr64164 now.

aoliva/pr64164  is fine on x32.

Thanks.

-- 
H.J.

* Re: [PR64164] drop copyrename, integrate into expand
  2015-07-30 17:47                                             ` H.J. Lu
@ 2015-08-03 23:46                                               ` Alexandre Oliva
  2015-08-04  9:48                                                 ` Richard Biener
  2015-08-10  8:24                                                 ` James Greenhalgh
  0 siblings, 2 replies; 127+ messages in thread
From: Alexandre Oliva @ 2015-08-03 23:46 UTC (permalink / raw)
  To: H.J. Lu
  Cc: Segher Boessenkool, Richard Biener, Jeff Law, GCC Patches,
	Christophe Lyon, David Edelsohn, Eric Botcazou

On Jul 30, 2015, "H.J. Lu" <hjl.tools@gmail.com> wrote:

> aoliva/pr64164  is fine on x32.

Thanks.  I have made a large number of changes since you tested it,
fixing all the reported issues and then some.  Now, x86_64-linux-gnu
(-m64 and -m32), i686-pc-linux-gnu, powerpc64-linux-gnu and
powerpc64el-linux-gnu pass regstrap (r226317), and the many tens of
targets I cross-tested still get the same 'make all' errors that the
pristine tree did.

The bulk of the incremental changes had to do with handling splitting
and unsplitting of complex args, and BLKmode types passed by reference.

For the former, I had naively assumed complex args would always be
represented as CONCATs.  I now use read_complex_part to split the
expand-assigned parm rtl into components, and I use it again at unsplit
time to make sure the expand-assigned parm rtl matches that of the split
components.
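
As a rough user-level analogy only (this is not GCC's RTL code, and the
helper names below are hypothetical), the idea of splitting a complex
value into its parts and putting it back together without assuming how
it is held can be sketched in plain C.  The sketch relies solely on the
C99 guarantee that a complex object has the same layout as an array of
two elements of the corresponding real type:

```c
#include <string.h>

/* Hypothetical helper, not part of GCC: copy the real and imaginary
   parts of *Z into PARTS[0] and PARTS[1].  It relies only on the C99
   rule that a complex object is laid out like a two-element array of
   the corresponding real type (real part first).  */
static void
toy_split_complex (const double _Complex *z, double parts[2])
{
  memcpy (parts, z, 2 * sizeof (double));
}

/* Hypothetical inverse: rebuild *Z from the two parts, so that a
   split followed by an unsplit gives back the same value.  */
static void
toy_unsplit_complex (double _Complex *z, const double parts[2])
{
  memcpy (z, parts, 2 * sizeof (double));
}
```

Going through one accessor for both directions is what keeps the split
and unsplit views consistent, whatever the underlying representation.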

The latter, in turn, almost required me to give up the entire notion of
coalescing parms.  The problem is that, for arguments passed as a
BLKmode pointer, copying the argument to an expand-assigned stack slot
is not only wasteful, it doesn't really work: we'd expand the copy in
assign_parms* and insert it before the stack allocation performed by
expand_user_vars, so we'd initialize the pseudo holding the address of
the stack slot only after its first use.

The solution I came up with was to detect BLKmode parms and NOT allow
them to coalesce with other variables, so that we can easily detect
partitions that need special handling.  The special handling amounts to
not allocating a stack slot for the partition holding the param default
def, and leaving it for assign_parms to do so.  We do, however, allocate
a MEM, in theory assigned to all partition members (though they're all
the same parm ATM, but not necessarily all SSA_NAMEs of the same parm,
since optimization causes different versions to conflict).  We leave the
address of that MEM unset, so that assign_parm knows it is to fill it in
with a pseudo holding a copy of the incoming parm address, or with the
address of the local stack slot created to hold a copy of the parameter.

It took me several rounds of trial and error to get these to pass all
complex and vector tests on x86 and ppc.  The last remaining failure was
a regression in gcc.target/powerpc/pr16458-4, caused by our inability to
hold SSA_NAMEs as REG_EXPRs in pseudos.  emit_case_decision_tree
attempted to preserve the decl as the REG_EXPR of a pseudo holding a
copy of the switch expr, and its type appears to be used to decide
whether to emit signed or unsigned compares, even though we explicitly
pass mode and unsignedp down to the cmp_and_jump expanders.  I figured
there was no good reason to prevent SSA_NAMEs in REG_EXPRs, just like
MEM_EXPRs, so I went ahead and adjusted the DECL_MODE use that prevented it,
and now we expand the case decision tree as intended.

Here's a consolidated patch, followed by the consolidated incremental
patch.  I don't intend to install it this week, even if approved,
because I'm going to be away Aug 5-10, and I would like to be around
should any further problems arise.  So, ok to install when I return?

[PR64164] Drop copyrename, use coalescible partition as base when optimizing.

From: Alexandre Oliva <aoliva@redhat.com>

for  gcc/ChangeLog

	PR rtl-optimization/64164
	PR bootstrap/66978
	PR middle-end/66983
	PR rtl-optimization/67000
	PR middle-end/67034
	PR middle-end/67035
	* Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
	* tree-ssa-copyrename.c: Removed.
	* opts.c (default_options_table): Drop -ftree-copyrename.  Add
	-ftree-coalesce-vars.
	* passes.def: Drop all occurrences of pass_rename_ssa_copies.
	* common.opt (ftree-copyrename): Ignore.
	(ftree-coalesce-inlined-vars): Likewise.
	* doc/invoke.texi: Remove the ignored options above.
	* gimple-expr.h (gimple_can_coalesce_p): Move declaration
	* tree-ssa-coalesce.h: ... here.
	* tree-ssa-uncprop.c: Include tree-ssa-coalesce.h and other
	headers required by it.
	* gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
	across variables when flag_tree_coalesce_vars.  Check register
	use and promoted modes to allow coalescing.  Do not coalesce
	maybe-byref parms with SSA_NAMEs of other variables, or
	anonymous SSA_NAMEs.  Moved to tree-ssa-coalesce.c.
	* tree-ssa-live.c (struct tree_int_map_hasher): Move along
	with its member functions to tree-ssa-coalesce.c.
	(var_map_base_init): Likewise.  Renamed to
	compute_samebase_partition_bases.
	(partition_view_normal): Drop want_bases parameter.
	(partition_view_bitmap): Likewise.
	* tree-ssa-live.h: Adjust declarations.
	* tree-ssa-coalesce.c: Include explow.h and cfgexpand.h.
	(build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs's
	default defs at the entry point.
	(dump_part_var_map): New.
	(compute_optimized_partition_bases): New, called by...
	(coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
	of compute_samebase_partition_bases.  Adjust.
	* alias.c (nonoverlapping_memrefs_p): Disregard gimple-regs.
	* cfgexpand.c (leader_merge, parm_maybe_byref_p): New.
	(ssa_default_def_partition): New.
	(get_rtl_for_parm_ssa_default_def): New.
	(align_local_variable, add_stack_var): Support anonymous SSA
	names.
	(defer_stack_allocation): Likewise.  Declare earlier.
	(set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
	vars.  Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
	Do not record deferred-allocation marker in
	SA.partition_to_pseudo.
	(expand_stack_vars): Adjust check for the marker in it.
	(expand_one_stack_var_at): Handle anonymous SSA_NAMEs.  Drop
	redundant MEM attr setting.
	(expand_one_stack_var_1): Handle anonymous SSA_NAMEs.  Renamed
	from...
	(expand_one_stack_var): ... this.  New wrapper to check and
	skip already expanded SSA partitions.
	(record_alignment_for_reg_var): New, factored out of...
	(expand_one_var): ... this.
	(expand_one_ssa_partition): New.
	(adjust_one_expanded_partition_var): New.
	(expand_one_register_var): Check and skip already expanded SSA
	partitions.
	(expand_used_vars): Don't create DECLs for anonymous SSA
	names.  Expand all SSA partitions, then adjust all SSA names.
	(pass::execute): Replace the loops that set
	SA.partition_to_pseudo from partition leaders and cleared
	DECL_RTL for multi-location variables, and that which used to
	rename vars and set attrs, with one that clears DECL_RTL and
	checks that PARMs and RESULTs default_defs match DECL_RTL.
	* cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
	* emit-rtl.c: Include stor-layout.h.
	(set_reg_attrs_for_parm): Handle NULL decl.
	(set_reg_attrs_for_decl_rtl): Take mode from expression if
	it's not a DECL.
	* stmt.c (emit_case_decision_tree): Pass it the SSA_NAME
	rather than its possibly-NULL DECL.
	* explow.c (promote_ssa_mode): New.
	* explow.h (promote_ssa_mode): Declare.
	* expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
	(read_complex_part): Export.
	* expr.h (read_complex_part): Declare.
	* cfgexpand.h (parm_maybe_byref_p): Declare.
	* function.c: Include cfgexpand.h.
	(use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
	(use_register_for_parm_decl): Wrapper for the above to
	special-case the result_ptr.
	(rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
	(split_complex_args): Take assign_parm_data_all argument.
	Pass it to rtl_for_parm.  Set up rtl and context for split
	args.  Reset complex parm before fetching its default decl
	rtl.
	(assign_parms_unsplit_complex): Use the default-def complex
	parm rtl if it matches the components.
	(assign_parms_augmented_arg_list): Adjust.
	(maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
	multiple locations.  Recognize split complex args.
	(assign_parm_adjust_stack_rtl): Add all and parm arguments,
	for rtl_for_parm.  For SSA-assigned parms, zero stack_parm.
	(assign_parm_setup_block): Prefer SSA-assigned location, and
	fill in its address if the memory location of a maybe-byref
	parm was not assigned by cfgexpand.
	(assign_parm_setup_reg): Likewise.  Adjust its mode as
	needed.  Use entry_parm for equiv if stack_parm is NULL.  Make
	sure passed_pointer parms don't need conversion.  Copy address
	or value as needed.
	(assign_parm_setup_stack): Prefer SSA-assigned location.
	(assign_parms): Maybe reset DECL_RTL of params.  Adjust stack
	rtl before testing for pointer bounds.  Special-case result_ptr.
	(expand_function_start): Maybe reset DECL_RTL of result.
	Prefer SSA-assigned location for result and static chain.
	Factor out DECL_RESULT and SET_DECL_RTL.  Convert static chain
	to Pmode if needed, from H.J. Lu  <hongjiu.lu@intel.com>.
	* tree-outof-ssa.c (insert_value_copy_on_edge): Handle
	anonymous SSA names.  Use promote_ssa_mode.
	(get_temp_reg): Likewise.
	(remove_ssa_form): Adjust.
	* stor-layout.c (layout_decl): Don't set mem attributes of
	non-MEMs.
	* var-tracking.c (dataflow_set_clear_at_call): Take call_insn
	and get its reg_usage for reg invalidation.
	(compute_bb_dataflow): Pass it insn.
	(emit_notes_in_bb): Likewise.

for  gcc/testsuite/ChangeLog

	* gcc.dg/guality/pr54200.c: Add -fno-tree-coalesce-vars.
	* gcc.dg/ssp-1.c: Make counter a register.
	* gcc.dg/ssp-2.c: Likewise.
	* gcc.dg/torture/parm-coalesce.c: New.
---
 gcc/Makefile.in                              |    1 
 gcc/alias.c                                  |   13 +
 gcc/cfgexpand.c                              |  471 +++++++++++++++++++-------
 gcc/cfgexpand.h                              |    3 
 gcc/common.opt                               |   12 -
 gcc/doc/invoke.texi                          |   48 +--
 gcc/emit-rtl.c                               |    8 
 gcc/explow.c                                 |   29 ++
 gcc/explow.h                                 |    3 
 gcc/expr.c                                   |   41 +-
 gcc/expr.h                                   |    1 
 gcc/function.c                               |  341 +++++++++++++++----
 gcc/gimple-expr.c                            |   39 --
 gcc/gimple-expr.h                            |    1 
 gcc/opts.c                                   |    2 
 gcc/passes.def                               |    5 
 gcc/stmt.c                                   |    2 
 gcc/stor-layout.c                            |    3 
 gcc/testsuite/gcc.dg/guality/pr54200.c       |    2 
 gcc/testsuite/gcc.dg/ssp-1.c                 |    2 
 gcc/testsuite/gcc.dg/ssp-2.c                 |    2 
 gcc/testsuite/gcc.dg/torture/parm-coalesce.c |   40 ++
 gcc/tree-outof-ssa.c                         |   16 -
 gcc/tree-ssa-coalesce.c                      |  384 +++++++++++++++++++++
 gcc/tree-ssa-coalesce.h                      |    1 
 gcc/tree-ssa-copyrename.c                    |  475 --------------------------
 gcc/tree-ssa-live.c                          |   99 -----
 gcc/tree-ssa-live.h                          |    4 
 gcc/tree-ssa-uncprop.c                       |    5 
 gcc/var-tracking.c                           |   12 -
 30 files changed, 1187 insertions(+), 878 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/parm-coalesce.c
 delete mode 100644 gcc/tree-ssa-copyrename.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index be259e8..6079acc 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1444,7 +1444,6 @@ OBJS = \
 	tree-ssa-ccp.o \
 	tree-ssa-coalesce.o \
 	tree-ssa-copy.o \
-	tree-ssa-copyrename.o \
 	tree-ssa-dce.o \
 	tree-ssa-dom.o \
 	tree-ssa-dse.o \
diff --git a/gcc/alias.c b/gcc/alias.c
index fa7d5d8..4681e3f 100644
--- a/gcc/alias.c
+++ b/gcc/alias.c
@@ -2507,6 +2507,19 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
   if (! DECL_P (exprx) || ! DECL_P (expry))
     return 0;
 
+  /* If we refer to different gimple registers, or one gimple register
+     and one non-gimple-register, we know they can't overlap.  First,
+     gimple registers don't have their addresses taken.  Now, there
+     could be more than one stack slot for (different versions of) the
+     same gimple register, but we can presumably tell they don't
+     overlap based on offsets from stack base addresses elsewhere.
+     It's important that we don't proceed to DECL_RTL, because gimple
+     registers may not pass DECL_RTL_SET_P, and make_decl_rtl won't be
+     able to do anything about them since no SSA information will have
+     remained to guide it.  */
+  if (is_gimple_reg (exprx) || is_gimple_reg (expry))
+    return exprx != expry;
+
   /* With invalid code we can end up storing into the constant pool.
      Bail out to avoid ICEing when creating RTL for this.
      See gfortran.dg/lto/20091028-2_0.f90.  */
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index a047632..8f6caf6 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -97,6 +97,8 @@ gimple currently_expanding_gimple_stmt;
 
 static rtx expand_debug_expr (tree);
 
+static bool defer_stack_allocation (tree, bool);
+
 /* Return an expression tree corresponding to the RHS of GIMPLE
    statement STMT.  */
 
@@ -150,21 +152,149 @@ gimple_assign_rhs_to_tree (gimple stmt)
 
 #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
 
+/* Choose either CUR or NEXT as the leader DECL for a partition.
+   Prefer ignored decls, to simplify debug dumps and reduce ambiguity
+   out of the same user variable being in multiple partitions (this is
+   less likely for compiler-introduced temps).  */
+
+static tree
+leader_merge (tree cur, tree next)
+{
+  if (cur == NULL || cur == next)
+    return next;
+
+  if (DECL_P (cur) && DECL_IGNORED_P (cur))
+    return cur;
+
+  if (DECL_P (next) && DECL_IGNORED_P (next))
+    return next;
+
+  return cur;
+}
+
+/* Return true if VAR is a PARM_DECL or a RESULT_DECL of type BLKmode.
+   Such parameters are likely passed as a pointer to the value, rather
+   than as a value, and so we must not coalesce them, nor allocate
+   stack space for them before determining the calling conventions for
+   them.  For their SSA_NAMEs, expand_one_ssa_partition emits RTL as
+   MEMs with pc_rtx as the address, and then it replaces the pc_rtx
+   with NULL so as to make sure the MEM is not used before it is
+   adjusted in assign_parm_setup_reg.  */
+
+bool
+parm_maybe_byref_p (tree var)
+{
+  if (!var || VAR_P (var))
+    return false;
+
+  gcc_assert (TREE_CODE (var) == PARM_DECL
+	      || TREE_CODE (var) == RESULT_DECL);
+
+  return TYPE_MODE (TREE_TYPE (var)) == BLKmode;
+}
+
+/* Return the partition of the default SSA_DEF for decl VAR.  */
+
+static int
+ssa_default_def_partition (tree var)
+{
+  tree name = ssa_default_def (cfun, var);
+
+  if (!name)
+    return NO_PARTITION;
+
+  return var_to_partition (SA.map, name);
+}
+
+/* Return the RTL for the default SSA def of a PARM or RESULT, if
+   there is one.  */
+
+rtx
+get_rtl_for_parm_ssa_default_def (tree var)
+{
+  gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL);
+
+  if (!is_gimple_reg (var))
+    return NULL_RTX;
+
+  /* If we've already determined RTL for the decl, use it.  This is
+     not just an optimization: if VAR is a PARM whose incoming value
+     is unused, we won't find a default def to use its partition, but
+     we still want to use the location of the parm, if it was used at
+     all.  During assign_parms, until a location is assigned for the
+     VAR, RTL can only be set for a parm or result if we're not coalescing
+     across variables, when we know we're coalescing all SSA_NAMEs of
+     each parm or result, and we're not coalescing them with names
+     pertaining to other variables, such as other parms' default
+     defs.  */
+  if (DECL_RTL_SET_P (var))
+    {
+      gcc_assert (DECL_RTL (var) != pc_rtx);
+      return DECL_RTL (var);
+    }
+
+  int part = ssa_default_def_partition (var);
+  if (part == NO_PARTITION)
+    return NULL_RTX;
+
+  return SA.partition_to_pseudo[part];
+}
+
 /* Associate declaration T with storage space X.  If T is no
    SSA name this is exactly SET_DECL_RTL, otherwise make the
    partition of T associated with X.  */
 static inline void
 set_rtl (tree t, rtx x)
 {
+  if (x && SSAVAR (t))
+    {
+      bool skip = false;
+      tree cur = NULL_TREE;
+
+      if (MEM_P (x))
+	cur = MEM_EXPR (x);
+      else if (REG_P (x))
+	cur = REG_EXPR (x);
+      else if (GET_CODE (x) == CONCAT
+	       && REG_P (XEXP (x, 0)))
+	cur = REG_EXPR (XEXP (x, 0));
+      else if (GET_CODE (x) == PARALLEL)
+	cur = REG_EXPR (XVECEXP (x, 0, 0));
+      else if (x == pc_rtx)
+	skip = true;
+      else
+	gcc_unreachable ();
+
+      tree next = skip ? cur : leader_merge (cur, SSAVAR (t));
+
+      if (cur != next)
+	{
+	  if (MEM_P (x))
+	    set_mem_attributes (x, next, true);
+	  else
+	    set_reg_attrs_for_decl_rtl (next, x);
+	}
+    }
+
   if (TREE_CODE (t) == SSA_NAME)
     {
-      SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x;
-      if (x && !MEM_P (x))
-	set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x);
-      /* For the benefit of debug information at -O0 (where vartracking
-         doesn't run) record the place also in the base DECL if it's
-	 a normal variable (not a parameter).  */
-      if (x && x != pc_rtx && TREE_CODE (SSA_NAME_VAR (t)) == VAR_DECL)
+      int part = var_to_partition (SA.map, t);
+      if (part != NO_PARTITION)
+	{
+	  if (SA.partition_to_pseudo[part])
+	    gcc_assert (SA.partition_to_pseudo[part] == x);
+	  else if (x != pc_rtx)
+	    SA.partition_to_pseudo[part] = x;
+	}
+      /* For the benefit of debug information at -O0 (where
+         vartracking doesn't run) record the place also in the base
+         DECL.  For PARMs and RESULTs, we may end up resetting these
+         in function.c:maybe_reset_rtl_for_parm, but in some rare
+         cases we may need them (unused and overwritten incoming
+         value, that at -O0 must share the location with the other
+         uses in spite of the missing default def), and this may be
+         the only chance to preserve them.  */
+      if (x && x != pc_rtx && SSA_NAME_VAR (t))
 	{
 	  tree var = SSA_NAME_VAR (t);
 	  /* If we don't yet have something recorded, just record it now.  */
@@ -248,8 +378,15 @@ static bool has_short_buffer;
 static unsigned int
 align_local_variable (tree decl)
 {
-  unsigned int align = LOCAL_DECL_ALIGNMENT (decl);
-  DECL_ALIGN (decl) = align;
+  unsigned int align;
+
+  if (TREE_CODE (decl) == SSA_NAME)
+    align = TYPE_ALIGN (TREE_TYPE (decl));
+  else
+    {
+      align = LOCAL_DECL_ALIGNMENT (decl);
+      DECL_ALIGN (decl) = align;
+    }
   return align / BITS_PER_UNIT;
 }
 
@@ -315,12 +452,15 @@ add_stack_var (tree decl)
   decl_to_stack_part->put (decl, stack_vars_num);
 
   v->decl = decl;
-  v->size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (decl)));
+  tree size = TREE_CODE (decl) == SSA_NAME
+    ? TYPE_SIZE_UNIT (TREE_TYPE (decl))
+    : DECL_SIZE_UNIT (decl);
+  v->size = tree_to_uhwi (size);
   /* Ensure that all variables have size, so that &a != &b for any two
      variables that are simultaneously live.  */
   if (v->size == 0)
     v->size = 1;
-  v->alignb = align_local_variable (SSAVAR (decl));
+  v->alignb = align_local_variable (decl);
   /* An alignment of zero can mightily confuse us later.  */
   gcc_assert (v->alignb != 0);
 
@@ -862,7 +1002,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
   gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
 
   x = plus_constant (Pmode, base, offset);
-  x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x);
+  x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
+		   ? TYPE_MODE (TREE_TYPE (decl))
+		   : DECL_MODE (SSAVAR (decl)), x);
 
   if (TREE_CODE (decl) != SSA_NAME)
     {
@@ -884,7 +1026,6 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
       DECL_USER_ALIGN (decl) = 0;
     }
 
-  set_mem_attributes (x, SSAVAR (decl), true);
   set_rtl (decl, x);
 }
 
@@ -950,9 +1091,9 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
 	  /* Skip variables that have already had rtl assigned.  See also
 	     add_stack_var where we perpetrate this pc_rtx hack.  */
 	  decl = stack_vars[i].decl;
-	  if ((TREE_CODE (decl) == SSA_NAME
-	      ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)]
-	      : DECL_RTL (decl)) != pc_rtx)
+	  if (TREE_CODE (decl) == SSA_NAME
+	      ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)] != NULL_RTX
+	      : DECL_RTL (decl) != pc_rtx)
 	    continue;
 
 	  large_size += alignb - 1;
@@ -981,9 +1122,9 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
       /* Skip variables that have already had rtl assigned.  See also
 	 add_stack_var where we perpetrate this pc_rtx hack.  */
       decl = stack_vars[i].decl;
-      if ((TREE_CODE (decl) == SSA_NAME
-	   ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)]
-	   : DECL_RTL (decl)) != pc_rtx)
+      if (TREE_CODE (decl) == SSA_NAME
+	  ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)] != NULL_RTX
+	  : DECL_RTL (decl) != pc_rtx)
 	continue;
 
       /* Check the predicate to see whether this variable should be
@@ -1099,13 +1240,22 @@ account_stack_vars (void)
    to a variable to be allocated in the stack frame.  */
 
 static void
-expand_one_stack_var (tree var)
+expand_one_stack_var_1 (tree var)
 {
   HOST_WIDE_INT size, offset;
   unsigned byte_align;
 
-  size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
-  byte_align = align_local_variable (SSAVAR (var));
+  if (TREE_CODE (var) == SSA_NAME)
+    {
+      tree type = TREE_TYPE (var);
+      size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
+      byte_align = TYPE_ALIGN_UNIT (type);
+    }
+  else
+    {
+      size = tree_to_uhwi (DECL_SIZE_UNIT (var));
+      byte_align = align_local_variable (var);
+    }
 
   /* We handle highly aligned variables in expand_stack_vars.  */
   gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
@@ -1116,6 +1266,27 @@ expand_one_stack_var (tree var)
 			   crtl->max_used_stack_slot_alignment, offset);
 }
 
+/* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are
+   already assigned some MEM.  */
+
+static void
+expand_one_stack_var (tree var)
+{
+  if (TREE_CODE (var) == SSA_NAME)
+    {
+      int part = var_to_partition (SA.map, var);
+      if (part != NO_PARTITION)
+	{
+	  rtx x = SA.partition_to_pseudo[part];
+	  gcc_assert (x);
+	  gcc_assert (MEM_P (x));
+	  return;
+	}
+    }
+
+  return expand_one_stack_var_1 (var);
+}
+
 /* A subroutine of expand_one_var.  Called to assign rtl to a VAR_DECL
    that will reside in a hard register.  */
 
@@ -1125,13 +1296,136 @@ expand_one_hard_reg_var (tree var)
   rest_of_decl_compilation (var, 0, 0);
 }
 
+/* Record the alignment requirements of some variable assigned to a
+   pseudo.  */
+
+static void
+record_alignment_for_reg_var (unsigned int align)
+{
+  if (SUPPORTS_STACK_ALIGNMENT
+      && crtl->stack_alignment_estimated < align)
+    {
+      /* stack_alignment_estimated shouldn't change after stack
+         realign decision made */
+      gcc_assert (!crtl->stack_realign_processed);
+      crtl->stack_alignment_estimated = align;
+    }
+
+  /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
+     So here we only make sure stack_alignment_needed >= align.  */
+  if (crtl->stack_alignment_needed < align)
+    crtl->stack_alignment_needed = align;
+  if (crtl->max_used_stack_slot_alignment < align)
+    crtl->max_used_stack_slot_alignment = align;
+}
+
+/* Create RTL for an SSA partition.  */
+
+static void
+expand_one_ssa_partition (tree var)
+{
+  int part = var_to_partition (SA.map, var);
+  gcc_assert (part != NO_PARTITION);
+
+  if (SA.partition_to_pseudo[part])
+    return;
+
+  unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (var),
+					  TYPE_MODE (TREE_TYPE (var)),
+					  TYPE_ALIGN (TREE_TYPE (var)));
+
+  /* If the variable alignment is very large we'll dynamically allocate
+     it, which means that in-frame portion is just a pointer.  */
+  if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
+    align = POINTER_SIZE;
+
+  record_alignment_for_reg_var (align);
+
+  if (!use_register_for_decl (var))
+    {
+      if (parm_maybe_byref_p (SSA_NAME_VAR (var))
+	  && ssa_default_def_partition (SSA_NAME_VAR (var)) == part)
+	{
+	  expand_one_stack_var_at (var, pc_rtx, 0, 0);
+	  rtx x = SA.partition_to_pseudo[part];
+	  gcc_assert (GET_CODE (x) == MEM);
+	  gcc_assert (GET_MODE (x) == BLKmode);
+	  gcc_assert (XEXP (x, 0) == pc_rtx);
+	  /* Reset the address, so that any attempt to use it will
+	     ICE.  It will be adjusted in assign_parm_setup_reg.  */
+	  XEXP (x, 0) = NULL_RTX;
+	}
+      else if (defer_stack_allocation (var, true))
+	add_stack_var (var);
+      else
+	expand_one_stack_var_1 (var);
+      return;
+    }
+
+  machine_mode reg_mode = promote_ssa_mode (var, NULL);
+
+  rtx x = gen_reg_rtx (reg_mode);
+
+  set_rtl (var, x);
+}
+
+/* Record the association between the RTL generated for a partition
+   and the underlying variable of the SSA_NAME.  */
+
+static void
+adjust_one_expanded_partition_var (tree var)
+{
+  if (!var)
+    return;
+
+  tree decl = SSA_NAME_VAR (var);
+
+  int part = var_to_partition (SA.map, var);
+  if (part == NO_PARTITION)
+    return;
+
+  rtx x = SA.partition_to_pseudo[part];
+
+  if (!x)
+    {
+      /* This var will get a stack slot later.  */
+      gcc_assert (defer_stack_allocation (var, true));
+      return;
+    }
+
+  set_rtl (var, x);
+
+  if (!REG_P (x))
+    return;
+
+  /* Note if the object is a user variable.  */
+  if (decl && !DECL_ARTIFICIAL (decl))
+    mark_user_reg (x);
+
+  if (POINTER_TYPE_P (decl ? TREE_TYPE (decl) : TREE_TYPE (var)))
+    mark_reg_pointer (x, get_pointer_alignment (var));
+}
+
 /* A subroutine of expand_one_var.  Called to assign rtl to a VAR_DECL
    that will reside in a pseudo register.  */
 
 static void
 expand_one_register_var (tree var)
 {
-  tree decl = SSAVAR (var);
+  if (TREE_CODE (var) == SSA_NAME)
+    {
+      int part = var_to_partition (SA.map, var);
+      if (part != NO_PARTITION)
+	{
+	  rtx x = SA.partition_to_pseudo[part];
+	  gcc_assert (x);
+	  gcc_assert (REG_P (x));
+	  return;
+	}
+      gcc_unreachable ();
+    }
+
+  tree decl = var;
   tree type = TREE_TYPE (decl);
   machine_mode reg_mode = promote_decl_mode (decl, NULL);
   rtx x = gen_reg_rtx (reg_mode);
@@ -1177,10 +1471,14 @@ expand_one_error_var (tree var)
 static bool
 defer_stack_allocation (tree var, bool toplevel)
 {
+  tree size_unit = TREE_CODE (var) == SSA_NAME
+    ? TYPE_SIZE_UNIT (TREE_TYPE (var))
+    : DECL_SIZE_UNIT (var);
+
   /* Whether the variable is small enough for immediate allocation not to be
      a problem with regard to the frame size.  */
   bool smallish
-    = ((HOST_WIDE_INT) tree_to_uhwi (DECL_SIZE_UNIT (var))
+    = ((HOST_WIDE_INT) tree_to_uhwi (size_unit)
        < PARAM_VALUE (PARAM_MIN_SIZE_FOR_STACK_SHARING));
 
   /* If stack protection is enabled, *all* stack variables must be deferred,
@@ -1189,16 +1487,24 @@ defer_stack_allocation (tree var, bool toplevel)
   if (flag_stack_protect || ((flag_sanitize & SANITIZE_ADDRESS) && ASAN_STACK))
     return true;
 
+  unsigned int align = TREE_CODE (var) == SSA_NAME
+    ? TYPE_ALIGN (TREE_TYPE (var))
+    : DECL_ALIGN (var);
+
   /* We handle "large" alignment via dynamic allocation.  We want to handle
      this extra complication in only one place, so defer them.  */
-  if (DECL_ALIGN (var) > MAX_SUPPORTED_STACK_ALIGNMENT)
+  if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
     return true;
 
+  bool ignored = TREE_CODE (var) == SSA_NAME
+    ? !SSAVAR (var) || DECL_IGNORED_P (SSA_NAME_VAR (var))
+    : DECL_IGNORED_P (var);
+
   /* When optimization is enabled, DECL_IGNORED_P variables originally scoped
      might be detached from their block and appear at toplevel when we reach
      here.  We want to coalesce them with variables from other blocks when
      the immediate contribution to the frame size would be noticeable.  */
-  if (toplevel && optimize > 0 && DECL_IGNORED_P (var) && !smallish)
+  if (toplevel && optimize > 0 && ignored && !smallish)
     return true;
 
   /* Variables declared in the outermost scope automatically conflict
@@ -1265,21 +1571,7 @@ expand_one_var (tree var, bool toplevel, bool really_expand)
 	align = POINTER_SIZE;
     }
 
-  if (SUPPORTS_STACK_ALIGNMENT
-      && crtl->stack_alignment_estimated < align)
-    {
-      /* stack_alignment_estimated shouldn't change after stack
-         realign decision made */
-      gcc_assert (!crtl->stack_realign_processed);
-      crtl->stack_alignment_estimated = align;
-    }
-
-  /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
-     So here we only make sure stack_alignment_needed >= align.  */
-  if (crtl->stack_alignment_needed < align)
-    crtl->stack_alignment_needed = align;
-  if (crtl->max_used_stack_slot_alignment < align)
-    crtl->max_used_stack_slot_alignment = align;
+  record_alignment_for_reg_var (align);
 
   if (TREE_CODE (origvar) == SSA_NAME)
     {
@@ -1713,48 +2005,18 @@ expand_used_vars (void)
   if (targetm.use_pseudo_pic_reg ())
     pic_offset_table_rtx = gen_reg_rtx (Pmode);
 
-  hash_map<tree, tree> ssa_name_decls;
   for (i = 0; i < SA.map->num_partitions; i++)
     {
       tree var = partition_to_var (SA.map, i);
 
       gcc_assert (!virtual_operand_p (var));
 
-      /* Assign decls to each SSA name partition, share decls for partitions
-         we could have coalesced (those with the same type).  */
-      if (SSA_NAME_VAR (var) == NULL_TREE)
-	{
-	  tree *slot = &ssa_name_decls.get_or_insert (TREE_TYPE (var));
-	  if (!*slot)
-	    *slot = create_tmp_reg (TREE_TYPE (var));
-	  replace_ssa_name_symbol (var, *slot);
-	}
-
-      /* Always allocate space for partitions based on VAR_DECLs.  But for
-	 those based on PARM_DECLs or RESULT_DECLs and which matter for the
-	 debug info, there is no need to do so if optimization is disabled
-	 because all the SSA_NAMEs based on these DECLs have been coalesced
-	 into a single partition, which is thus assigned the canonical RTL
-	 location of the DECLs.  If in_lto_p, we can't rely on optimize,
-	 a function could be compiled with -O1 -flto first and only the
-	 link performed at -O0.  */
-      if (TREE_CODE (SSA_NAME_VAR (var)) == VAR_DECL)
-	expand_one_var (var, true, true);
-      else if (DECL_IGNORED_P (SSA_NAME_VAR (var)) || optimize || in_lto_p)
-	{
-	  /* This is a PARM_DECL or RESULT_DECL.  For those partitions that
-	     contain the default def (representing the parm or result itself)
-	     we don't do anything here.  But those which don't contain the
-	     default def (representing a temporary based on the parm/result)
-	     we need to allocate space just like for normal VAR_DECLs.  */
-	  if (!bitmap_bit_p (SA.partition_has_default_def, i))
-	    {
-	      expand_one_var (var, true, true);
-	      gcc_assert (SA.partition_to_pseudo[i]);
-	    }
-	}
+      expand_one_ssa_partition (var);
     }
 
+  for (i = 1; i < num_ssa_names; i++)
+    adjust_one_expanded_partition_var (ssa_name (i));
+
   if (flag_stack_protect == SPCT_FLAG_STRONG)
       gen_stack_protect_signal
 	= stack_protect_decl_p () || stack_protect_return_slot_p ();
@@ -5928,35 +6190,6 @@ pass_expand::execute (function *fun)
       parm_birth_insn = var_seq;
     }
 
-  /* Now that we also have the parameter RTXs, copy them over to our
-     partitions.  */
-  for (i = 0; i < SA.map->num_partitions; i++)
-    {
-      tree var = SSA_NAME_VAR (partition_to_var (SA.map, i));
-
-      if (TREE_CODE (var) != VAR_DECL
-	  && !SA.partition_to_pseudo[i])
-	SA.partition_to_pseudo[i] = DECL_RTL_IF_SET (var);
-      gcc_assert (SA.partition_to_pseudo[i]);
-
-      /* If this decl was marked as living in multiple places, reset
-	 this now to NULL.  */
-      if (DECL_RTL_IF_SET (var) == pc_rtx)
-	SET_DECL_RTL (var, NULL);
-
-      /* Some RTL parts really want to look at DECL_RTL(x) when x
-	 was a decl marked in REG_ATTR or MEM_ATTR.  We could use
-	 SET_DECL_RTL here making this available, but that would mean
-	 to select one of the potentially many RTLs for one DECL.  Instead
-	 of doing that we simply reset the MEM_EXPR of the RTL in question,
-	 then nobody can get at it and hence nobody can call DECL_RTL on it.  */
-      if (!DECL_RTL_SET_P (var))
-	{
-	  if (MEM_P (SA.partition_to_pseudo[i]))
-	    set_mem_expr (SA.partition_to_pseudo[i], NULL);
-	}
-    }
-
   /* If we have a class containing differently aligned pointers
      we need to merge those into the corresponding RTL pointer
      alignment.  */
@@ -5964,7 +6197,6 @@ pass_expand::execute (function *fun)
     {
       tree name = ssa_name (i);
       int part;
-      rtx r;
 
       if (!name
 	  /* We might have generated new SSA names in
@@ -5977,20 +6209,25 @@ pass_expand::execute (function *fun)
       if (part == NO_PARTITION)
 	continue;
 
-      /* Adjust all partition members to get the underlying decl of
-	 the representative which we might have created in expand_one_var.  */
-      if (SSA_NAME_VAR (name) == NULL_TREE)
+      gcc_assert (SA.partition_to_pseudo[part]
+		  || defer_stack_allocation (name, true));
+
+      /* If this decl was marked as living in multiple places, reset
+	 this now to NULL.  */
+      tree var = SSA_NAME_VAR (name);
+      if (var && DECL_RTL_IF_SET (var) == pc_rtx)
+	SET_DECL_RTL (var, NULL);
+      /* Check that the pseudos chosen by assign_parms are those of
+	 the corresponding default defs.  */
+      else if (SSA_NAME_IS_DEFAULT_DEF (name)
+	       && (TREE_CODE (var) == PARM_DECL
+		   || TREE_CODE (var) == RESULT_DECL))
 	{
-	  tree leader = partition_to_var (SA.map, part);
-	  gcc_assert (SSA_NAME_VAR (leader) != NULL_TREE);
-	  replace_ssa_name_symbol (name, SSA_NAME_VAR (leader));
+	  rtx in = DECL_RTL_IF_SET (var);
+	  gcc_assert (in);
+	  rtx out = SA.partition_to_pseudo[part];
+	  gcc_assert (in == out || rtx_equal_p (in, out));
 	}
-      if (!POINTER_TYPE_P (TREE_TYPE (name)))
-	continue;
-
-      r = SA.partition_to_pseudo[part];
-      if (REG_P (r))
-	mark_reg_pointer (r, get_pointer_alignment (name));
     }
 
   /* If this function is `main', emit a call to `__main'
diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
index a0b6e3e..987cf356 100644
--- a/gcc/cfgexpand.h
+++ b/gcc/cfgexpand.h
@@ -22,5 +22,8 @@ along with GCC; see the file COPYING3.  If not see
 
 extern tree gimple_assign_rhs_to_tree (gimple);
 extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
+extern bool parm_maybe_byref_p (tree);
+extern rtx get_rtl_for_parm_ssa_default_def (tree var);
+
 
 #endif /* GCC_CFGEXPAND_H */
diff --git a/gcc/common.opt b/gcc/common.opt
index 8f25f8b..6d47e94 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2230,16 +2230,16 @@ Common Report Var(flag_tree_ch) Optimization
 Enable loop header copying on trees
 
 ftree-coalesce-inlined-vars
-Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization
-Enable coalescing of copy-related user variables that are inlined
+Common Ignore RejectNegative
+Does nothing.  Preserved for backward compatibility.
 
 ftree-coalesce-vars
-Common Report Var(flag_ssa_coalesce_vars,2) Optimization
-Enable coalescing of all copy-related user variables
+Common Report Var(flag_tree_coalesce_vars) Optimization
+Enable SSA coalescing of user variables
 
 ftree-copyrename
-Common Report Var(flag_tree_copyrename) Optimization
-Replace SSA temporaries with better names in copies
+Common Ignore
+Does nothing.  Preserved for backward compatibility.
 
 ftree-copy-prop
 Common Report Var(flag_tree_copy_prop) Optimization
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 7b5d86b..fcb1d36 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -341,7 +341,6 @@ Objective-C and Objective-C++ Dialects}.
 -fdump-tree-phiprop@r{[}-@var{n}@r{]} @gol
 -fdump-tree-phiopt@r{[}-@var{n}@r{]} @gol
 -fdump-tree-forwprop@r{[}-@var{n}@r{]} @gol
--fdump-tree-copyrename@r{[}-@var{n}@r{]} @gol
 -fdump-tree-nrv -fdump-tree-vect @gol
 -fdump-tree-sink @gol
 -fdump-tree-sra@r{[}-@var{n}@r{]} @gol
@@ -447,9 +446,8 @@ Objective-C and Objective-C++ Dialects}.
 -fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol
 -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
 -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
--ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
--ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
--ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
+-ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol
+-ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
 -ftree-loop-if-convert-stores -ftree-loop-im @gol
 -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
 -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
@@ -7114,11 +7112,6 @@ name is made by appending @file{.phiopt} to the source file name.
 Dump each function after forward propagating single use variables.  The file
 name is made by appending @file{.forwprop} to the source file name.
 
-@item copyrename
-@opindex fdump-tree-copyrename
-Dump each function after applying the copy rename optimization.  The file
-name is made by appending @file{.copyrename} to the source file name.
-
 @item nrv
 @opindex fdump-tree-nrv
 Dump each function after applying the named return value optimization on
@@ -7583,8 +7576,8 @@ compilation time.
 -ftree-ccp @gol
 -fssa-phiopt @gol
 -ftree-ch @gol
+-ftree-coalesce-vars @gol
 -ftree-copy-prop @gol
--ftree-copyrename @gol
 -ftree-dce @gol
 -ftree-dominator-opts @gol
 -ftree-dse @gol
@@ -8848,6 +8841,15 @@ be parallelized.  Parallelize all the loops that can be analyzed to
 not contain loop carried dependences without checking that it is
 profitable to parallelize the loops.
 
+@item -ftree-coalesce-vars
+@opindex ftree-coalesce-vars
+Tell the compiler to attempt to combine small user-defined variables
+too, instead of just compiler temporaries.  This may severely limit the
+ability to debug an optimized program compiled with
+@option{-fno-var-tracking-assignments}.  In the negated form, this flag
+prevents SSA coalescing of user variables.  This option is enabled by
+default if optimization is enabled.
+
 @item -ftree-loop-if-convert
 @opindex ftree-loop-if-convert
 Attempt to transform conditional jumps in the innermost loops to
@@ -8961,32 +8963,6 @@ Perform scalar replacement of aggregates.  This pass replaces structure
 references with scalars to prevent committing structures to memory too
 early.  This flag is enabled by default at @option{-O} and higher.
 
-@item -ftree-copyrename
-@opindex ftree-copyrename
-Perform copy renaming on trees.  This pass attempts to rename compiler
-temporaries to other variables at copy locations, usually resulting in
-variable names which more closely resemble the original variables.  This flag
-is enabled by default at @option{-O} and higher.
-
-@item -ftree-coalesce-inlined-vars
-@opindex ftree-coalesce-inlined-vars
-Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
-combine small user-defined variables too, but only if they are inlined
-from other functions.  It is a more limited form of
-@option{-ftree-coalesce-vars}.  This may harm debug information of such
-inlined variables, but it keeps variables of the inlined-into
-function apart from each other, such that they are more likely to
-contain the expected values in a debugging session.
-
-@item -ftree-coalesce-vars
-@opindex ftree-coalesce-vars
-Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
-combine small user-defined variables too, instead of just compiler
-temporaries.  This may severely limit the ability to debug an optimized
-program compiled with @option{-fno-var-tracking-assignments}.  In the
-negated form, this flag prevents SSA coalescing of user variables,
-including inlined ones.  This option is enabled by default.
-
 @item -ftree-ter
 @opindex ftree-ter
 Perform temporary expression replacement during the SSA->normal phase.  Single
diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index ed2b30b..3b95c5d 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -63,6 +63,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "target.h"
 #include "builtins.h"
 #include "rtl-iter.h"
+#include "stor-layout.h"
 
 struct target_rtl default_target_rtl;
 #if SWITCHABLE_TARGET
@@ -1232,6 +1233,9 @@ set_reg_attrs_for_parm (rtx parm_rtx, rtx mem)
 void
 set_reg_attrs_for_decl_rtl (tree t, rtx x)
 {
+  if (!t)
+    return;
+  tree tdecl = t;
   if (GET_CODE (x) == SUBREG)
     {
       gcc_assert (subreg_lowpart_p (x));
@@ -1240,7 +1244,9 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x)
   if (REG_P (x))
     REG_ATTRS (x)
       = get_reg_attrs (t, byte_lowpart_offset (GET_MODE (x),
-					       DECL_MODE (t)));
+					       DECL_P (tdecl)
+					       ? DECL_MODE (tdecl)
+					       : TYPE_MODE (TREE_TYPE (tdecl))));
   if (GET_CODE (x) == CONCAT)
     {
       if (REG_P (XEXP (x, 0)))
diff --git a/gcc/explow.c b/gcc/explow.c
index bd342c1..6941f4e 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -842,6 +842,35 @@ promote_decl_mode (const_tree decl, int *punsignedp)
   return pmode;
 }
 
+/* Return the promoted mode for NAME.  If NAME's underlying variable
+   is a PARM_DECL or RESULT_DECL, use promote_decl_mode on it.
+   Otherwise, return the promoted mode that a temp decl of the same
+   type as the SSA_NAME would get, if we had created one.  */
+
+machine_mode
+promote_ssa_mode (const_tree name, int *punsignedp)
+{
+  gcc_assert (TREE_CODE (name) == SSA_NAME);
+
+  /* Partitions holding parms and results must be promoted as expected
+     by function.c.  */
+  if (SSA_NAME_VAR (name)
+      && (TREE_CODE (SSA_NAME_VAR (name)) == PARM_DECL
+	  || TREE_CODE (SSA_NAME_VAR (name)) == RESULT_DECL))
+    return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
+
+  tree type = TREE_TYPE (name);
+  int unsignedp = TYPE_UNSIGNED (type);
+  machine_mode mode = TYPE_MODE (type);
+
+  machine_mode pmode = promote_mode (type, mode, &unsignedp);
+  if (punsignedp)
+    *punsignedp = unsignedp;
+
+  return pmode;
+}
+
+
 \f
 /* Controls the behaviour of {anti_,}adjust_stack.  */
 static bool suppress_reg_args_size;
diff --git a/gcc/explow.h b/gcc/explow.h
index 94613de..52113db 100644
--- a/gcc/explow.h
+++ b/gcc/explow.h
@@ -57,6 +57,9 @@ extern machine_mode promote_mode (const_tree, machine_mode, int *);
 /* Return mode and signedness to use when object is promoted.  */
 machine_mode promote_decl_mode (const_tree, int *);
 
+/* Return mode and signedness to use when an SSA_NAME is promoted.  */
+machine_mode promote_ssa_mode (const_tree, int *);
+
 /* Remove some bytes from the stack.  An rtx says how many.  */
 extern void adjust_stack (rtx);
 
diff --git a/gcc/expr.c b/gcc/expr.c
index 899a42c..fc49f92 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -3022,7 +3022,7 @@ write_complex_part (rtx cplx, rtx val, bool imag_p)
 /* Extract one of the components of the complex value CPLX.  Extract the
    real part if IMAG_P is false, and the imaginary part if it's true.  */
 
-static rtx
+rtx
 read_complex_part (rtx cplx, bool imag_p)
 {
   machine_mode cmode, imode;
@@ -9246,7 +9246,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
   rtx op0, op1, temp, decl_rtl;
   tree type;
   int unsignedp;
-  machine_mode mode;
+  machine_mode mode, dmode;
   enum tree_code code = TREE_CODE (exp);
   rtx subtarget, original_target;
   int ignore;
@@ -9377,7 +9377,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
       if (g == NULL
 	  && modifier == EXPAND_INITIALIZER
 	  && !SSA_NAME_IS_DEFAULT_DEF (exp)
-	  && (optimize || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
+	  && (optimize || !SSA_NAME_VAR (exp)
+	      || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
 	  && stmt_is_replaceable_p (SSA_NAME_DEF_STMT (exp)))
 	g = SSA_NAME_DEF_STMT (exp);
       if (g)
@@ -9456,15 +9457,18 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
       /* Ensure variable marked as used even if it doesn't go through
 	 a parser.  If it hasn't be used yet, write out an external
 	 definition.  */
-      TREE_USED (exp) = 1;
+      if (exp)
+	TREE_USED (exp) = 1;
 
       /* Show we haven't gotten RTL for this yet.  */
       temp = 0;
 
       /* Variables inherited from containing functions should have
 	 been lowered by this point.  */
-      context = decl_function_context (exp);
-      gcc_assert (SCOPE_FILE_SCOPE_P (context)
+      if (exp)
+	context = decl_function_context (exp);
+      gcc_assert (!exp
+		  || SCOPE_FILE_SCOPE_P (context)
 		  || context == current_function_decl
 		  || TREE_STATIC (exp)
 		  || DECL_EXTERNAL (exp)
@@ -9488,7 +9492,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 	  decl_rtl = use_anchored_address (decl_rtl);
 	  if (modifier != EXPAND_CONST_ADDRESS
 	      && modifier != EXPAND_SUM
-	      && !memory_address_addr_space_p (DECL_MODE (exp),
+	      && !memory_address_addr_space_p (exp ? DECL_MODE (exp)
+					       : GET_MODE (decl_rtl),
 					       XEXP (decl_rtl, 0),
 					       MEM_ADDR_SPACE (decl_rtl)))
 	    temp = replace_equiv_address (decl_rtl,
@@ -9499,12 +9504,17 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 	 if the address is a register.  */
       if (temp != 0)
 	{
-	  if (MEM_P (temp) && REG_P (XEXP (temp, 0)))
+	  if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0)))
 	    mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp));
 
 	  return temp;
 	}
 
+      if (exp)
+	dmode = DECL_MODE (exp);
+      else
+	dmode = TYPE_MODE (TREE_TYPE (ssa_name));
+
       /* If the mode of DECL_RTL does not match that of the decl,
 	 there are two cases: we are dealing with a BLKmode value
 	 that is returned in a register, or we are dealing with
@@ -9512,22 +9522,23 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 	 of the wanted mode, but mark it so that we know that it
 	 was already extended.  */
       if (REG_P (decl_rtl)
-	  && DECL_MODE (exp) != BLKmode
-	  && GET_MODE (decl_rtl) != DECL_MODE (exp))
+	  && dmode != BLKmode
+	  && GET_MODE (decl_rtl) != dmode)
 	{
 	  machine_mode pmode;
 
 	  /* Get the signedness to be used for this variable.  Ensure we get
 	     the same mode we got when the variable was declared.  */
-	  if (code == SSA_NAME
-	      && (g = SSA_NAME_DEF_STMT (ssa_name))
-	      && gimple_code (g) == GIMPLE_CALL
-	      && !gimple_call_internal_p (g))
+	  if (code != SSA_NAME)
+	    pmode = promote_decl_mode (exp, &unsignedp);
+	  else if ((g = SSA_NAME_DEF_STMT (ssa_name))
+		   && gimple_code (g) == GIMPLE_CALL
+		   && !gimple_call_internal_p (g))
 	    pmode = promote_function_mode (type, mode, &unsignedp,
 					   gimple_call_fntype (g),
 					   2);
 	  else
-	    pmode = promote_decl_mode (exp, &unsignedp);
+	    pmode = promote_ssa_mode (ssa_name, &unsignedp);
 	  gcc_assert (GET_MODE (decl_rtl) == pmode);
 
 	  temp = gen_lowpart_SUBREG (mode, decl_rtl);
diff --git a/gcc/expr.h b/gcc/expr.h
index 32d1707..a2c8e1d 100644
--- a/gcc/expr.h
+++ b/gcc/expr.h
@@ -210,6 +210,7 @@ extern rtx_insn *emit_move_insn_1 (rtx, rtx);
 
 extern rtx_insn *emit_move_complex_push (machine_mode, rtx, rtx);
 extern rtx_insn *emit_move_complex_parts (rtx, rtx);
+extern rtx read_complex_part (rtx, bool);
 extern void write_complex_part (rtx, rtx, bool);
 extern rtx emit_move_resolve_push (machine_mode, rtx);
 
diff --git a/gcc/function.c b/gcc/function.c
index f9d11bf4..1d98ede 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -72,6 +72,9 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfganal.h"
 #include "cfgbuild.h"
 #include "cfgcleanup.h"
+#include "cfgexpand.h"
+#include "basic-block.h"
+#include "df.h"
 #include "params.h"
 #include "bb-reorder.h"
 #include "shrink-wrap.h"
@@ -148,6 +151,9 @@ static bool contains (const_rtx, hash_table<insn_cache_hasher> *);
 static void prepare_function_start (void);
 static void do_clobber_return_reg (rtx, void *);
 static void do_use_return_reg (rtx, void *);
+static rtx rtl_for_parm (struct assign_parm_data_all *, tree);
+static void maybe_reset_rtl_for_parm (tree);
+
 \f
 /* Stack of nested functions.  */
 /* Keep track of the cfun stack.  */
@@ -2105,6 +2111,30 @@ aggregate_value_p (const_tree exp, const_tree fntype)
 bool
 use_register_for_decl (const_tree decl)
 {
+  if (TREE_CODE (decl) == SSA_NAME)
+    {
+      /* We often try to use the SSA_NAME, instead of its underlying
+	 decl, to get type information and guide decisions, to avoid
+	 differences of behavior between anonymous and named
+	 variables, but in this one case we have to go for the actual
+	 variable if there is one.  The main reason is that, at least
+	 at -O0, we want to place user variables on the stack, but we
+	 don't mind using pseudos for anonymous or ignored temps.
+	 Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
+	 should go in pseudos, whereas their corresponding variables
+	 might have to go on the stack.  So, disregarding the decl
+	 here would negatively impact debug info at -O0, enable
+	 coalescing between SSA_NAMEs that ought to get different
+	 stack/pseudo assignments, and get the incoming argument
+	 processing thoroughly confused by PARM_DECLs expected to live
+	 in stack slots but assigned to pseudos.  */
+      if (!SSA_NAME_VAR (decl))
+	return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
+	  && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
+
+      decl = SSA_NAME_VAR (decl);
+    }
+
   if (!targetm.calls.allocate_stack_slots_for_args ())
     return true;
 
@@ -2240,7 +2270,7 @@ assign_parms_initialize_all (struct assign_parm_data_all *all)
    needed, else the old list.  */
 
 static void
-split_complex_args (vec<tree> *args)
+split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
 {
   unsigned i;
   tree p;
@@ -2251,6 +2281,7 @@ split_complex_args (vec<tree> *args)
       if (TREE_CODE (type) == COMPLEX_TYPE
 	  && targetm.calls.split_complex_arg (type))
 	{
+	  tree cparm = p;
 	  tree decl;
 	  tree subtype = TREE_TYPE (type);
 	  bool addressable = TREE_ADDRESSABLE (p);
@@ -2269,6 +2300,9 @@ split_complex_args (vec<tree> *args)
 	  DECL_ARTIFICIAL (p) = addressable;
 	  DECL_IGNORED_P (p) = addressable;
 	  TREE_ADDRESSABLE (p) = 0;
+	  /* Reset the RTL before layout_decl, or it may change the
+	     mode of the RTL of the original argument copied to P.  */
+	  SET_DECL_RTL (p, NULL_RTX);
 	  layout_decl (p, 0);
 	  (*args)[i] = p;
 
@@ -2280,6 +2314,25 @@ split_complex_args (vec<tree> *args)
 	  DECL_IGNORED_P (decl) = addressable;
 	  layout_decl (decl, 0);
 	  args->safe_insert (++i, decl);
+
+	  /* If we are expanding a function, rather than gimplifying
+	     it, propagate the RTL of the complex parm to the split
+	     declarations, and set their contexts so that
+	     maybe_reset_rtl_for_parm can recognize them and refrain
+	     from resetting their RTL.  */
+	  if (currently_expanding_to_rtl)
+	    {
+	      maybe_reset_rtl_for_parm (cparm);
+	      rtx rtl = rtl_for_parm (all, cparm);
+	      if (rtl)
+		{
+		  SET_DECL_RTL (p, read_complex_part (rtl, false));
+		  SET_DECL_RTL (decl, read_complex_part (rtl, true));
+
+		  DECL_CONTEXT (p) = cparm;
+		  DECL_CONTEXT (decl) = cparm;
+		}
+	    }
 	}
     }
 }
@@ -2342,7 +2395,7 @@ assign_parms_augmented_arg_list (struct assign_parm_data_all *all)
 
   /* If the target wants to split complex arguments into scalars, do so.  */
   if (targetm.calls.split_complex_arg)
-    split_complex_args (&fnargs);
+    split_complex_args (all, &fnargs);
 
   return fnargs;
 }
@@ -2745,23 +2798,98 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data)
   data->entry_parm = entry_parm;
 }
 
+/* Wrapper for use_register_for_decl that special-cases the
+   .result_ptr as the function's RESULT_DECL when the RESULT_DECL is
+   passed by reference.  */
+
+static bool
+use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm)
+{
+  if (parm == all->function_result_decl)
+    {
+      tree result = DECL_RESULT (current_function_decl);
+
+      if (DECL_BY_REFERENCE (result))
+	parm = result;
+    }
+
+  return use_register_for_decl (parm);
+}
+
+/* Wrapper for get_rtl_for_parm_ssa_default_def that special-cases
+   the .result_ptr as the function's RESULT_DECL when the RESULT_DECL
+   is passed by reference.  */
+
+static rtx
+rtl_for_parm (struct assign_parm_data_all *all, tree parm)
+{
+  if (parm == all->function_result_decl)
+    {
+      tree result = DECL_RESULT (current_function_decl);
+
+      if (!DECL_BY_REFERENCE (result))
+	return NULL_RTX;
+
+      parm = result;
+    }
+
+  return get_rtl_for_parm_ssa_default_def (parm);
+}
+
+/* Reset the location of PARM_DECLs and RESULT_DECLs that had
+   SSA_NAMEs in multiple partitions, so that assign_parms will choose
+   the default def, if it exists, or create new RTL to hold the unused
+   entry value.  If we are coalescing across variables, we want to
+   reset the location too, because a parm without a default def
+   (incoming value unused) might be coalesced with one with a default
+   def, and then assign_parms would copy both incoming values to the
+   same location, which might cause the wrong value to survive.  */
+static void
+maybe_reset_rtl_for_parm (tree parm)
+{
+  gcc_assert (TREE_CODE (parm) == PARM_DECL
+	      || TREE_CODE (parm) == RESULT_DECL);
+
+  /* This is a split complex parameter, and its context was set to its
+     original PARM_DECL in split_complex_args so that we could
+     recognize it here and not reset its RTL.  */
+  if (DECL_CONTEXT (parm) && TREE_CODE (DECL_CONTEXT (parm)) == PARM_DECL)
+    {
+      DECL_CONTEXT (parm) = DECL_CONTEXT (DECL_CONTEXT (parm));
+      return;
+    }
+
+  if ((flag_tree_coalesce_vars
+       || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx))
+      && is_gimple_reg (parm))
+    SET_DECL_RTL (parm, NULL_RTX);
+}
+
 /* A subroutine of assign_parms.  Adjust DATA->STACK_RTL such that it's
    always valid and properly aligned.  */
 
 static void
-assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data)
+assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm,
+			      struct assign_parm_data_one *data)
 {
   rtx stack_parm = data->stack_parm;
 
+  /* If out-of-SSA assigned RTL to the parm default def, make sure we
+     don't use what we might have computed before.  */
+  rtx ssa_assigned = rtl_for_parm (all, parm);
+  if (ssa_assigned)
+    stack_parm = NULL;
+
   /* If we can't trust the parm stack slot to be aligned enough for its
      ultimate type, don't use that slot after entry.  We'll make another
      stack slot, if we need one.  */
-  if (stack_parm
-      && ((STRICT_ALIGNMENT
-	   && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm))
-	  || (data->nominal_type
-	      && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
-	      && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
+  else if (stack_parm
+	   && ((STRICT_ALIGNMENT
+		&& (GET_MODE_ALIGNMENT (data->nominal_mode)
+		    > MEM_ALIGN (stack_parm)))
+	       || (data->nominal_type
+		   && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
+		   && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
     stack_parm = NULL;
 
   /* If parm was passed in memory, and we need to convert it on entry,
@@ -2823,14 +2951,32 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
 
   size = int_size_in_bytes (data->passed_type);
   size_stored = CEIL_ROUND (size, UNITS_PER_WORD);
+
   if (stack_parm == 0)
     {
       DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
-      stack_parm = assign_stack_local (BLKmode, size_stored,
-				       DECL_ALIGN (parm));
-      if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
-	PUT_MODE (stack_parm, GET_MODE (entry_parm));
-      set_mem_attributes (stack_parm, parm, 1);
+      rtx from_expand = rtl_for_parm (all, parm);
+      if (from_expand && (!parm_maybe_byref_p (parm)
+			  || XEXP (from_expand, 0) != NULL_RTX))
+	stack_parm = copy_rtx (from_expand);
+      else
+	{
+	  stack_parm = assign_stack_local (BLKmode, size_stored,
+					   DECL_ALIGN (parm));
+	  if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
+	    PUT_MODE (stack_parm, GET_MODE (entry_parm));
+	  if (from_expand)
+	    {
+	      gcc_assert (GET_CODE (stack_parm) == MEM);
+	      gcc_assert (GET_CODE (from_expand) == MEM);
+	      gcc_assert (XEXP (from_expand, 0) == NULL_RTX);
+	      XEXP (from_expand, 0) = XEXP (stack_parm, 0);
+	      PUT_MODE (from_expand, GET_MODE (stack_parm));
+	      stack_parm = copy_rtx (from_expand);
+	    }
+	  else
+	    set_mem_attributes (stack_parm, parm, 1);
+	}
     }
 
   /* If a BLKmode arrives in registers, copy it to a stack slot.  Handle
@@ -2968,14 +3114,34 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
     = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
 			     TREE_TYPE (current_function_decl), 2);
 
-  parmreg = gen_reg_rtx (promoted_nominal_mode);
+  rtx from_expand = parmreg = rtl_for_parm (all, parm);
 
-  if (!DECL_ARTIFICIAL (parm))
-    mark_user_reg (parmreg);
+  if (from_expand && !data->passed_pointer)
+    {
+      if (GET_MODE (parmreg) != promoted_nominal_mode)
+	parmreg = gen_lowpart (promoted_nominal_mode, parmreg);
+    }
+  else if (!from_expand || parm_maybe_byref_p (parm))
+    {
+      parmreg = gen_reg_rtx (promoted_nominal_mode);
+      if (!DECL_ARTIFICIAL (parm))
+	mark_user_reg (parmreg);
+
+      if (from_expand)
+	{
+	  gcc_assert (data->passed_pointer);
+	  gcc_assert (GET_CODE (from_expand) == MEM
+		      && GET_MODE (from_expand) == BLKmode
+		      && XEXP (from_expand, 0) == NULL_RTX);
+	  XEXP (from_expand, 0) = parmreg;
+	}
+    }
 
   /* If this was an item that we received a pointer to,
      set DECL_RTL appropriately.  */
-  if (data->passed_pointer)
+  if (from_expand)
+    SET_DECL_RTL (parm, from_expand);
+  else if (data->passed_pointer)
     {
       rtx x = gen_rtx_MEM (TYPE_MODE (TREE_TYPE (data->passed_type)), parmreg);
       set_mem_attributes (x, parm, 1);
@@ -2990,10 +3156,13 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
      assign_parm_find_data_types and expand_expr_real_1.  */
 
   equiv_stack_parm = data->stack_parm;
+  if (!equiv_stack_parm)
+    equiv_stack_parm = data->entry_parm;
   validated_mem = validize_mem (copy_rtx (data->entry_parm));
 
   need_conversion = (data->nominal_mode != data->passed_mode
 		     || promoted_nominal_mode != data->promoted_mode);
+  gcc_assert (!(need_conversion && data->passed_pointer && from_expand));
   moved = false;
 
   if (need_conversion
@@ -3125,16 +3294,28 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 
       did_conversion = true;
     }
-  else
+  /* We don't want to copy the incoming pointer to a parmreg expected
+     to hold the value rather than the pointer.  */
+  else if (!data->passed_pointer || parmreg != from_expand)
     emit_move_insn (parmreg, validated_mem);
 
   /* If we were passed a pointer but the actual value can safely live
      in a register, retrieve it and use it directly.  */
-  if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode)
+  if (data->passed_pointer
+      && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode))
     {
+      rtx src = DECL_RTL (parm);
+
       /* We can't use nominal_mode, because it will have been set to
 	 Pmode above.  We must use the actual mode of the parm.  */
-      if (use_register_for_decl (parm))
+      if (from_expand)
+	{
+	  parmreg = from_expand;
+	  gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm)));
+	  src = gen_rtx_MEM (GET_MODE (parmreg), validated_mem);
+	  set_mem_attributes (src, parm, 1);
+	}
+      else if (use_register_for_decl (parm))
 	{
 	  parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm)));
 	  mark_user_reg (parmreg);
@@ -3151,14 +3332,14 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 	  set_mem_attributes (parmreg, parm, 1);
 	}
 
-      if (GET_MODE (parmreg) != GET_MODE (DECL_RTL (parm)))
+      if (GET_MODE (parmreg) != GET_MODE (src))
 	{
-	  rtx tempreg = gen_reg_rtx (GET_MODE (DECL_RTL (parm)));
+	  rtx tempreg = gen_reg_rtx (GET_MODE (src));
 	  int unsigned_p = TYPE_UNSIGNED (TREE_TYPE (parm));
 
 	  push_to_sequence2 (all->first_conversion_insn,
 			     all->last_conversion_insn);
-	  emit_move_insn (tempreg, DECL_RTL (parm));
+	  emit_move_insn (tempreg, src);
 	  tempreg = convert_to_mode (GET_MODE (parmreg), tempreg, unsigned_p);
 	  emit_move_insn (parmreg, tempreg);
 	  all->first_conversion_insn = get_insns ();
@@ -3167,14 +3348,16 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 
 	  did_conversion = true;
 	}
+      else if (GET_MODE (parmreg) == BLKmode)
+	gcc_assert (parm_maybe_byref_p (parm));
       else
-	emit_move_insn (parmreg, DECL_RTL (parm));
+	emit_move_insn (parmreg, src);
 
       SET_DECL_RTL (parm, parmreg);
 
       /* STACK_PARM is the pointer, not the parm, and PARMREG is
 	 now the parm.  */
-      data->stack_parm = NULL;
+      data->stack_parm = equiv_stack_parm = NULL;
     }
 
   /* Mark the register as eliminable if we did no conversion and it was
@@ -3184,11 +3367,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
      make here would screw up life analysis for it.  */
   if (data->nominal_mode == data->passed_mode
       && !did_conversion
-      && data->stack_parm != 0
-      && MEM_P (data->stack_parm)
+      && equiv_stack_parm != 0
+      && MEM_P (equiv_stack_parm)
       && data->locate.offset.var == 0
       && reg_mentioned_p (virtual_incoming_args_rtx,
-			  XEXP (data->stack_parm, 0)))
+			  XEXP (equiv_stack_parm, 0)))
     {
       rtx_insn *linsn = get_last_insn ();
       rtx_insn *sinsn;
@@ -3201,8 +3384,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 	    = GET_MODE_INNER (GET_MODE (parmreg));
 	  int regnor = REGNO (XEXP (parmreg, 0));
 	  int regnoi = REGNO (XEXP (parmreg, 1));
-	  rtx stackr = adjust_address_nv (data->stack_parm, submode, 0);
-	  rtx stacki = adjust_address_nv (data->stack_parm, submode,
+	  rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0);
+	  rtx stacki = adjust_address_nv (equiv_stack_parm, submode,
 					  GET_MODE_SIZE (submode));
 
 	  /* Scan backwards for the set of the real and
@@ -3275,6 +3458,13 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
 
       if (data->stack_parm == 0)
 	{
+	  rtx x = data->stack_parm = rtl_for_parm (all, parm);
+	  if (x)
+	    gcc_assert (GET_MODE (x) == GET_MODE (data->entry_parm));
+	}
+
+      if (data->stack_parm == 0)
+	{
 	  int align = STACK_SLOT_ALIGNMENT (data->passed_type,
 					    GET_MODE (data->entry_parm),
 					    TYPE_ALIGN (data->passed_type));
@@ -3337,11 +3527,21 @@ assign_parms_unsplit_complex (struct assign_parm_data_all *all,
 	  imag = DECL_RTL (fnargs[i + 1]);
 	  if (inner != GET_MODE (real))
 	    {
-	      real = gen_lowpart_SUBREG (inner, real);
-	      imag = gen_lowpart_SUBREG (inner, imag);
+	      real = simplify_gen_subreg (inner, real, GET_MODE (real),
+					  subreg_lowpart_offset
+					  (inner, GET_MODE (real)));
+	      imag = simplify_gen_subreg (inner, imag, GET_MODE (imag),
+					  subreg_lowpart_offset
+					  (inner, GET_MODE (imag)));
 	    }
 
-	  if (TREE_ADDRESSABLE (parm))
+	  if ((tmp = rtl_for_parm (all, parm)) != NULL_RTX
+	      && rtx_equal_p (real,
+			      read_complex_part (tmp, false))
+	      && rtx_equal_p (imag,
+			      read_complex_part (tmp, true)))
+	    ; /* We now have the right rtl in tmp.  */
+	  else if (TREE_ADDRESSABLE (parm))
 	    {
 	      rtx rmem, imem;
 	      HOST_WIDE_INT size = int_size_in_bytes (TREE_TYPE (parm));
@@ -3487,7 +3687,7 @@ assign_bounds (vec<bounds_parm_data> &bndargs,
 	  assign_parm_setup_block (&all, pbdata->bounds_parm,
 				   &pbdata->parm_data);
 	else if (pbdata->parm_data.passed_pointer
-		 || use_register_for_decl (pbdata->bounds_parm))
+		 || use_register_for_parm_decl (&all, pbdata->bounds_parm))
 	  assign_parm_setup_reg (&all, pbdata->bounds_parm,
 				 &pbdata->parm_data);
 	else
@@ -3531,6 +3731,8 @@ assign_parms (tree fndecl)
 	  DECL_INCOMING_RTL (parm) = DECL_RTL (parm);
 	  continue;
 	}
+      else
+	maybe_reset_rtl_for_parm (parm);
 
       /* Estimate stack alignment from parameter alignment.  */
       if (SUPPORTS_STACK_ALIGNMENT)
@@ -3580,7 +3782,9 @@ assign_parms (tree fndecl)
       else
 	set_decl_incoming_rtl (parm, data.entry_parm, false);
 
-      /* Boudns should be loaded in the particular order to
+      assign_parm_adjust_stack_rtl (&all, parm, &data);
+
+      /* Bounds should be loaded in the particular order to
 	 have registers allocated correctly.  Collect info about
 	 input bounds and load them later.  */
       if (POINTER_BOUNDS_TYPE_P (data.passed_type))
@@ -3597,11 +3801,10 @@ assign_parms (tree fndecl)
 	}
       else
 	{
-	  assign_parm_adjust_stack_rtl (&data);
-
 	  if (assign_parm_setup_block_p (&data))
 	    assign_parm_setup_block (&all, parm, &data);
-	  else if (data.passed_pointer || use_register_for_decl (parm))
+	  else if (data.passed_pointer
+		   || use_register_for_parm_decl (&all, parm))
 	    assign_parm_setup_reg (&all, parm, &data);
 	  else
 	    assign_parm_setup_stack (&all, parm, &data);
@@ -4932,7 +5135,9 @@ expand_function_start (tree subr)
      before any library calls that assign parms might generate.  */
 
   /* Decide whether to return the value in memory or in a register.  */
-  if (aggregate_value_p (DECL_RESULT (subr), subr))
+  tree res = DECL_RESULT (subr);
+  maybe_reset_rtl_for_parm (res);
+  if (aggregate_value_p (res, subr))
     {
       /* Returning something that won't go in a register.  */
       rtx value_address = 0;
@@ -4940,7 +5145,7 @@ expand_function_start (tree subr)
 #ifdef PCC_STATIC_STRUCT_RETURN
       if (cfun->returns_pcc_struct)
 	{
-	  int size = int_size_in_bytes (TREE_TYPE (DECL_RESULT (subr)));
+	  int size = int_size_in_bytes (TREE_TYPE (res));
 	  value_address = assemble_static_space (size);
 	}
       else
@@ -4952,36 +5157,45 @@ expand_function_start (tree subr)
 	     it.  */
 	  if (sv)
 	    {
-	      value_address = gen_reg_rtx (Pmode);
+	      if (DECL_BY_REFERENCE (res))
+		value_address = get_rtl_for_parm_ssa_default_def (res);
+	      if (!value_address)
+		value_address = gen_reg_rtx (Pmode);
 	      emit_move_insn (value_address, sv);
 	    }
 	}
       if (value_address)
 	{
 	  rtx x = value_address;
-	  if (!DECL_BY_REFERENCE (DECL_RESULT (subr)))
+	  if (!DECL_BY_REFERENCE (res))
 	    {
-	      x = gen_rtx_MEM (DECL_MODE (DECL_RESULT (subr)), x);
-	      set_mem_attributes (x, DECL_RESULT (subr), 1);
+	      x = get_rtl_for_parm_ssa_default_def (res);
+	      if (!x)
+		{
+		  x = gen_rtx_MEM (DECL_MODE (res), value_address);
+		  set_mem_attributes (x, res, 1);
+		}
 	    }
-	  SET_DECL_RTL (DECL_RESULT (subr), x);
+	  SET_DECL_RTL (res, x);
 	}
     }
-  else if (DECL_MODE (DECL_RESULT (subr)) == VOIDmode)
+  else if (DECL_MODE (res) == VOIDmode)
     /* If return mode is void, this decl rtl should not be used.  */
-    SET_DECL_RTL (DECL_RESULT (subr), NULL_RTX);
+    SET_DECL_RTL (res, NULL_RTX);
   else
     {
       /* Compute the return values into a pseudo reg, which we will copy
 	 into the true return register after the cleanups are done.  */
-      tree return_type = TREE_TYPE (DECL_RESULT (subr));
-      if (TYPE_MODE (return_type) != BLKmode
-	  && targetm.calls.return_in_msb (return_type))
+      tree return_type = TREE_TYPE (res);
+      rtx x = get_rtl_for_parm_ssa_default_def (res);
+      if (x)
+	/* Use it.  */;
+      else if (TYPE_MODE (return_type) != BLKmode
+	       && targetm.calls.return_in_msb (return_type))
 	/* expand_function_end will insert the appropriate padding in
 	   this case.  Use the return value's natural (unpadded) mode
 	   within the function proper.  */
-	SET_DECL_RTL (DECL_RESULT (subr),
-		      gen_reg_rtx (TYPE_MODE (return_type)));
+	x = gen_reg_rtx (TYPE_MODE (return_type));
       else
 	{
 	  /* In order to figure out what mode to use for the pseudo, we
@@ -4992,25 +5206,26 @@ expand_function_start (tree subr)
 	  /* Structures that are returned in registers are not
 	     aggregate_value_p, so we may see a PARALLEL or a REG.  */
 	  if (REG_P (hard_reg))
-	    SET_DECL_RTL (DECL_RESULT (subr),
-			  gen_reg_rtx (GET_MODE (hard_reg)));
+	    x = gen_reg_rtx (GET_MODE (hard_reg));
 	  else
 	    {
 	      gcc_assert (GET_CODE (hard_reg) == PARALLEL);
-	      SET_DECL_RTL (DECL_RESULT (subr), gen_group_rtx (hard_reg));
+	      x = gen_group_rtx (hard_reg);
 	    }
 	}
 
+      SET_DECL_RTL (res, x);
+
       /* Set DECL_REGISTER flag so that expand_function_end will copy the
 	 result to the real return register(s).  */
-      DECL_REGISTER (DECL_RESULT (subr)) = 1;
+      DECL_REGISTER (res) = 1;
 
       if (chkp_function_instrumented_p (current_function_decl))
 	{
-	  tree return_type = TREE_TYPE (DECL_RESULT (subr));
+	  tree return_type = TREE_TYPE (res);
 	  rtx bounds = targetm.calls.chkp_function_value_bounds (return_type,
 								 subr, 1);
-	  SET_DECL_BOUNDS_RTL (DECL_RESULT (subr), bounds);
+	  SET_DECL_BOUNDS_RTL (res, bounds);
 	}
     }
 
@@ -5025,13 +5240,19 @@ expand_function_start (tree subr)
       rtx local, chain;
      rtx_insn *insn;
 
-      local = gen_reg_rtx (Pmode);
+      local = get_rtl_for_parm_ssa_default_def (parm);
+      if (!local)
+	local = gen_reg_rtx (Pmode);
       chain = targetm.calls.static_chain (current_function_decl, true);
 
       set_decl_incoming_rtl (parm, chain, false);
       SET_DECL_RTL (parm, local);
       mark_reg_pointer (local, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (parm))));
 
+      if (GET_MODE (local) != Pmode)
+	local = convert_to_mode (Pmode, local,
+				 TYPE_UNSIGNED (TREE_TYPE (parm)));
+
       insn = emit_move_insn (local, chain);
 
       /* Mark the register as eliminable, similar to parameters.  */
diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
index b558d90..baed630 100644
--- a/gcc/gimple-expr.c
+++ b/gcc/gimple-expr.c
@@ -375,45 +375,6 @@ copy_var_decl (tree var, tree name, tree type)
   return copy;
 }
 
-/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
-   coalescing together, false otherwise.
-
-   This must stay consistent with var_map_base_init in tree-ssa-live.c.  */
-
-bool
-gimple_can_coalesce_p (tree name1, tree name2)
-{
-  /* First check the SSA_NAME's associated DECL.  We only want to
-     coalesce if they have the same DECL or both have no associated DECL.  */
-  tree var1 = SSA_NAME_VAR (name1);
-  tree var2 = SSA_NAME_VAR (name2);
-  var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
-  var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
-  if (var1 != var2)
-    return false;
-
-  /* Now check the types.  If the types are the same, then we should
-     try to coalesce V1 and V2.  */
-  tree t1 = TREE_TYPE (name1);
-  tree t2 = TREE_TYPE (name2);
-  if (t1 == t2)
-    return true;
-
-  /* If the types are not the same, check for a canonical type match.  This
-     (for example) allows coalescing when the types are fundamentally the
-     same, but just have different names. 
-
-     Note pointer types with different address spaces may have the same
-     canonical type.  Those are rejected for coalescing by the
-     types_compatible_p check.  */
-  if (TYPE_CANONICAL (t1)
-      && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
-      && types_compatible_p (t1, t2))
-    return true;
-
-  return false;
-}
-
 /* Strip off a legitimate source ending from the input string NAME of
    length LEN.  Rather than having to know the names used by all of
    our front ends, we strip off an ending of a period followed by
diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
index ed23eb2..3d1c89f 100644
--- a/gcc/gimple-expr.h
+++ b/gcc/gimple-expr.h
@@ -28,7 +28,6 @@ extern gimple_seq gimple_body (tree);
 extern bool gimple_has_body_p (tree);
 extern const char *gimple_decl_printable_name (tree, int);
 extern tree copy_var_decl (tree, tree, tree);
-extern bool gimple_can_coalesce_p (tree, tree);
 extern tree create_tmp_var_name (const char *);
 extern tree create_tmp_var_raw (tree, const char * = NULL);
 extern tree create_tmp_var (tree, const char * = NULL);
diff --git a/gcc/opts.c b/gcc/opts.c
index 468a802..f22edd3 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -445,12 +445,12 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_1_PLUS, OPT_fsplit_wide_types, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_ccp, NULL, 1 },
     { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_bit_ccp, NULL, 1 },
+    { OPT_LEVELS_1_PLUS, OPT_ftree_coalesce_vars, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_dce, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_dominator_opts, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 },
     { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 },
-    { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 },
diff --git a/gcc/passes.def b/gcc/passes.def
index 6b66f8f..64fc4d9 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -77,7 +77,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_all_early_optimizations);
       PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
 	  NEXT_PASS (pass_remove_cgraph_callee_edges);
-	  NEXT_PASS (pass_rename_ssa_copies);
 	  NEXT_PASS (pass_object_sizes);
 	  NEXT_PASS (pass_ccp);
 	  /* After CCP we rewrite no longer addressed locals into SSA
@@ -155,7 +154,6 @@ along with GCC; see the file COPYING3.  If not see
       /* Initial scalar cleanups before alias computation.
 	 They ensure memory accesses are not indirect wherever possible.  */
       NEXT_PASS (pass_strip_predict_hints);
-      NEXT_PASS (pass_rename_ssa_copies);
       NEXT_PASS (pass_ccp);
       /* After CCP we rewrite no longer addressed locals into SSA
 	 form if possible.  */
@@ -183,7 +181,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_ch);
       NEXT_PASS (pass_lower_complex);
       NEXT_PASS (pass_sra);
-      NEXT_PASS (pass_rename_ssa_copies);
       /* The dom pass will also resolve all __builtin_constant_p calls
          that are still there to 0.  This has to be done after some
 	 propagations have already run, but before some more dead code
@@ -293,7 +290,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_fold_builtins);
       NEXT_PASS (pass_optimize_widening_mul);
       NEXT_PASS (pass_tail_calls);
-      NEXT_PASS (pass_rename_ssa_copies);
       /* FIXME: If DCE is not run before checking for uninitialized uses,
 	 we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
 	 However, this also causes us to misdiagnose cases that should be
@@ -328,7 +324,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_dce);
       NEXT_PASS (pass_asan);
       NEXT_PASS (pass_tsan);
-      NEXT_PASS (pass_rename_ssa_copies);
       /* ???  We do want some kind of loop invariant motion, but we possibly
          need to adjust LIM to be more friendly towards preserving accurate
 	 debug information here.  */
diff --git a/gcc/stmt.c b/gcc/stmt.c
index 391686c..e7f7dd4 100644
--- a/gcc/stmt.c
+++ b/gcc/stmt.c
@@ -891,7 +891,7 @@ emit_case_decision_tree (tree index_expr, tree index_type,
     {
       index = copy_to_reg (index);
       if (TREE_CODE (index_expr) == SSA_NAME)
-	set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (index_expr), index);
+	set_reg_attrs_for_decl_rtl (index_expr, index);
     }
 
   balance_case_nodes (&case_list, NULL);
diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index 0d4f4a4..288227a 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -794,7 +794,8 @@ layout_decl (tree decl, unsigned int known_align)
     {
       PUT_MODE (rtl, DECL_MODE (decl));
       SET_DECL_RTL (decl, 0);
-      set_mem_attributes (rtl, decl, 1);
+      if (MEM_P (rtl))
+	set_mem_attributes (rtl, decl, 1);
       SET_DECL_RTL (decl, rtl);
     }
 }
diff --git a/gcc/testsuite/gcc.dg/guality/pr54200.c b/gcc/testsuite/gcc.dg/guality/pr54200.c
index 9b17187..e1e7293 100644
--- a/gcc/testsuite/gcc.dg/guality/pr54200.c
+++ b/gcc/testsuite/gcc.dg/guality/pr54200.c
@@ -1,6 +1,6 @@
 /* PR tree-optimization/54200 */
 /* { dg-do run } */
-/* { dg-options "-g -fno-var-tracking-assignments" } */
+/* { dg-options "-g -fno-var-tracking-assignments -fno-tree-coalesce-vars" } */
 
 int o __attribute__((used));
 
diff --git a/gcc/testsuite/gcc.dg/ssp-1.c b/gcc/testsuite/gcc.dg/ssp-1.c
index 5467f4d..db69332 100644
--- a/gcc/testsuite/gcc.dg/ssp-1.c
+++ b/gcc/testsuite/gcc.dg/ssp-1.c
@@ -12,7 +12,7 @@ __stack_chk_fail (void)
 
 int main ()
 {
-  int i;
+  register int i;
   char foo[255];
 
   // smash stack
diff --git a/gcc/testsuite/gcc.dg/ssp-2.c b/gcc/testsuite/gcc.dg/ssp-2.c
index 9a7ac32..752fe53 100644
--- a/gcc/testsuite/gcc.dg/ssp-2.c
+++ b/gcc/testsuite/gcc.dg/ssp-2.c
@@ -14,7 +14,7 @@ __stack_chk_fail (void)
 void
 overflow()
 {
-  int i = 0;
+  register int i = 0;
   char foo[30];
 
   /* Overflow buffer.  */
diff --git a/gcc/testsuite/gcc.dg/torture/parm-coalesce.c b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
new file mode 100644
index 0000000..dbd81c1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+
+/* Make sure we don't coalesce both incoming parms, one whose incoming
+   value is unused, to the same location, so as to overwrite one of
+   them with the incoming value of the other.  */
+
+int __attribute__((noinline, noclone))
+foo (int i, int j)
+{
+  j = i; /* The incoming value for J is unused.  */
+  i = 2;
+  if (j)
+    j++;
+  j += i + 1;
+  return j;
+}
+
+/* Same as foo, but with swapped parameters.  */
+int __attribute__((noinline, noclone))
+bar (int j, int i)
+{
+  j = i; /* The incoming value for J is unused.  */
+  i = 2;
+  if (j)
+    j++;
+  j += i + 1;
+  return j;
+}
+
+int
+main (void)
+{
+  if (foo (0, 1) != 3)
+    abort ();
+  if (bar (1, 0) != 3)
+    abort ();
+  return 0;
+}
diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
index 7b747ab9..978476c 100644
--- a/gcc/tree-outof-ssa.c
+++ b/gcc/tree-outof-ssa.c
@@ -279,7 +279,6 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
   rtx dest_rtx, seq, x;
   machine_mode dest_mode, src_mode;
   int unsignedp;
-  tree var;
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     {
@@ -301,12 +300,12 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
 
   start_sequence ();
 
-  var = SSA_NAME_VAR (partition_to_var (SA.map, dest));
+  tree name = partition_to_var (SA.map, dest);
   src_mode = TYPE_MODE (TREE_TYPE (src));
   dest_mode = GET_MODE (dest_rtx);
-  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var)));
+  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (name)));
   gcc_assert (!REG_P (dest_rtx)
-	      || dest_mode == promote_decl_mode (var, &unsignedp));
+	      || dest_mode == promote_ssa_mode (name, &unsignedp));
 
   if (src_mode != dest_mode)
     {
@@ -682,13 +681,12 @@ elim_backward (elim_graph g, int T)
 static rtx
 get_temp_reg (tree name)
 {
-  tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
-  tree type = TREE_TYPE (var);
+  tree type = TREE_TYPE (name);
   int unsignedp;
-  machine_mode reg_mode = promote_decl_mode (var, &unsignedp);
+  machine_mode reg_mode = promote_ssa_mode (name, &unsignedp);
   rtx x = gen_reg_rtx (reg_mode);
   if (POINTER_TYPE_P (type))
-    mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var))));
+    mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type)));
   return x;
 }
 
@@ -988,7 +986,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
 
   /* Return to viewing the variable list as just all reference variables after
      coalescing has been performed.  */
-  partition_view_normal (map, false);
+  partition_view_normal (map);
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     {
diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index bf8983f..08ce72c 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -36,6 +36,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-iterator.h"
 #include "tree-ssa-live.h"
 #include "tree-ssa-coalesce.h"
+#include "cfgexpand.h"
+#include "explow.h"
 #include "diagnostic-core.h"
 
 
@@ -806,6 +808,16 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
   basic_block bb;
   ssa_op_iter iter;
   live_track_p live;
+  basic_block entry;
+
+  /* If inter-variable coalescing is enabled, we may attempt to
+     coalesce variables from different base variables, including
+     different parameters, so we have to make sure default defs live
+     at the entry block conflict with each other.  */
+  if (flag_tree_coalesce_vars)
+    entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+  else
+    entry = NULL;
 
   map = live_var_map (liveinfo);
   graph = ssa_conflicts_new (num_var_partitions (map));
@@ -864,6 +876,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
 	    live_track_process_def (live, result, graph);
 	}
 
+      /* Pretend there are defs for params' default defs at the start
+	 of the (post-)entry block.  */
+      if (bb == entry)
+	{
+	  unsigned base;
+	  bitmap_iterator bi;
+	  EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi)
+	    {
+	      bitmap_iterator bi2;
+	      unsigned part;
+	      EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base],
+					0, part, bi2)
+		{
+		  tree var = partition_to_var (map, part);
+		  if (!SSA_NAME_VAR (var)
+		      || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL
+			  && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL)
+		      || !SSA_NAME_IS_DEFAULT_DEF (var))
+		    continue;
+		  live_track_process_def (live, var, graph);
+		}
+	    }
+	}
+
      live_track_clear_base_vars (live);
     }
 
@@ -1132,6 +1168,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
     {
       var1 = partition_to_var (map, p1);
       var2 = partition_to_var (map, p2);
+
       z = var_union (map, var1, var2);
       if (z == NO_PARTITION)
 	{
@@ -1149,6 +1186,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
 
       if (debug)
 	fprintf (debug, ": Success -> %d\n", z);
+
       return true;
     }
 
@@ -1244,6 +1282,333 @@ ssa_name_var_hash::equal (const tree_node *n1, const tree_node *n2)
 }
 
 
+/* Output partition map MAP with coalescing plan PART to file F.  */
+
+void
+dump_part_var_map (FILE *f, partition part, var_map map)
+{
+  int t;
+  unsigned x, y;
+  int p;
+
+  fprintf (f, "\nCoalescible Partition map \n\n");
+
+  for (x = 0; x < map->num_partitions; x++)
+    {
+      if (map->view_to_partition != NULL)
+	p = map->view_to_partition[x];
+      else
+	p = x;
+
+      if (ssa_name (p) == NULL_TREE
+	  || virtual_operand_p (ssa_name (p)))
+        continue;
+
+      t = 0;
+      for (y = 1; y < num_ssa_names; y++)
+        {
+	  tree var = version_to_var (map, y);
+	  if (!var)
+	    continue;
+	  int q = var_to_partition (map, var);
+	  p = partition_find (part, q);
+	  gcc_assert (map->partition_to_base_index[q]
+		      == map->partition_to_base_index[p]);
+
+	  if (p == (int)x)
+	    {
+	      if (t++ == 0)
+	        {
+		  fprintf (f, "Partition %d, base %d (", x,
+			   map->partition_to_base_index[q]);
+		  print_generic_expr (f, partition_to_var (map, q), TDF_SLIM);
+		  fprintf (f, " - ");
+		}
+	      fprintf (f, "%d ", y);
+	    }
+	}
+      if (t != 0)
+	fprintf (f, ")\n");
+    }
+  fprintf (f, "\n");
+}
+
+/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
+   coalescing together, false otherwise.
+
+   This must stay consistent with var_map_base_init in tree-ssa-live.c.  */
+
+bool
+gimple_can_coalesce_p (tree name1, tree name2)
+{
+  /* First check the SSA_NAME's associated DECL.  Without
+     optimization, we only want to coalesce if they have the same DECL
+     or both have no associated DECL.  */
+  tree var1 = SSA_NAME_VAR (name1);
+  tree var2 = SSA_NAME_VAR (name2);
+  var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
+  var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
+  if (var1 != var2 && !flag_tree_coalesce_vars)
+    return false;
+
+  /* Now check the types.  If the types are the same, then we should
+     try to coalesce V1 and V2.  */
+  tree t1 = TREE_TYPE (name1);
+  tree t2 = TREE_TYPE (name2);
+  if (t1 == t2)
+    {
+    check_modes:
+      /* If the base variables are the same, we're good: none of the
+	 other tests below could possibly fail.  */
+      var1 = SSA_NAME_VAR (name1);
+      var2 = SSA_NAME_VAR (name2);
+      if (var1 == var2)
+	return true;
+
+      /* We don't want to coalesce two SSA names if one of the base
+	 variables is supposed to be a register while the other is
+	 supposed to be on the stack.  Anonymous SSA names take
+	 registers, but when not optimizing, user variables should go
+	 on the stack, so coalescing them with the anonymous variable
+	 as the partition leader would end up assigning the user
+	 variable to a register.  Don't do that!  */
+      bool reg1 = !var1 || use_register_for_decl (var1);
+      bool reg2 = !var2 || use_register_for_decl (var2);
+      if (reg1 != reg2)
+	return false;
+
+      /* Check that the promoted modes are the same.  We don't want to
+	 coalesce if the promoted modes would be different.  Only
+	 PARM_DECLs and RESULT_DECLs have different promotion rules,
+	 so skip the test if both are variables, or both are anonymous
+	 SSA_NAMEs.  Now, if a parm or result has BLKmode, do not
+	 coalesce its SSA versions with those of any other variables,
+	 because it may be passed by reference.  */
+      return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
+	|| (/* The case var1 == var2 is already covered above.  */
+	    !parm_maybe_byref_p (var1)
+	    && !parm_maybe_byref_p (var2)
+	    && promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL));
+    }
+
+  /* If the types are not the same, check for a canonical type match.  This
+     (for example) allows coalescing when the types are fundamentally the
+     same, but just have different names. 
+
+     Note pointer types with different address spaces may have the same
+     canonical type.  Those are rejected for coalescing by the
+     types_compatible_p check.  */
+  if (TYPE_CANONICAL (t1)
+      && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
+      && types_compatible_p (t1, t2))
+    goto check_modes;
+
+  return false;
+}
+
+/* Fill in MAP's partition_to_base_index, with one index for each
+   partition of SSA names USED_IN_COPIES and related by CL coalesce
+   possibilities.  This must match gimple_can_coalesce_p in the
+   optimized case.  */
+
+static void
+compute_optimized_partition_bases (var_map map, bitmap used_in_copies,
+				   coalesce_list_p cl)
+{
+  int parts = num_var_partitions (map);
+  partition tentative = partition_new (parts);
+
+  /* Partition the SSA versions so that, for each coalescible
+     pair, both of its members are in the same partition in
+     TENTATIVE.  */
+  gcc_assert (!cl->sorted);
+  coalesce_pair_p node;
+  coalesce_iterator_type ppi;
+  FOR_EACH_PARTITION_PAIR (node, ppi, cl)
+    {
+      tree v1 = ssa_name (node->first_element);
+      int p1 = partition_find (tentative, var_to_partition (map, v1));
+      tree v2 = ssa_name (node->second_element);
+      int p2 = partition_find (tentative, var_to_partition (map, v2));
+
+      if (p1 == p2)
+	continue;
+
+      partition_union (tentative, p1, p2);
+    }
+
+  /* We have to deal with cost one pairs too.  */
+  for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next)
+    {
+      tree v1 = ssa_name (co->first_element);
+      int p1 = partition_find (tentative, var_to_partition (map, v1));
+      tree v2 = ssa_name (co->second_element);
+      int p2 = partition_find (tentative, var_to_partition (map, v2));
+
+      if (p1 == p2)
+	continue;
+
+      partition_union (tentative, p1, p2);
+    }
+
+  /* And also with abnormal edges.  */
+  basic_block bb;
+  edge e;
+  edge_iterator ei;
+  FOR_EACH_BB_FN (bb, cfun)
+    {
+      FOR_EACH_EDGE (e, ei, bb->preds)
+	if (e->flags & EDGE_ABNORMAL)
+	  {
+	    gphi_iterator gsi;
+	    for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
+		 gsi_next (&gsi))
+	      {
+		gphi *phi = gsi.phi ();
+		tree arg = PHI_ARG_DEF (phi, e->dest_idx);
+		if (SSA_NAME_IS_DEFAULT_DEF (arg)
+		    && (!SSA_NAME_VAR (arg)
+			|| TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL))
+		  continue;
+
+		tree res = PHI_RESULT (phi);
+
+		int p1 = partition_find (tentative, var_to_partition (map, res));
+		int p2 = partition_find (tentative, var_to_partition (map, arg));
+
+		if (p1 == p2)
+		  continue;
+
+		partition_union (tentative, p1, p2);
+	      }
+	  }
+    }
+
+  map->partition_to_base_index = XCNEWVEC (int, parts);
+  auto_vec<unsigned int> index_map (parts);
+  if (parts)
+    index_map.quick_grow (parts);
+
+  const unsigned no_part = -1;
+  unsigned count = parts;
+  while (count)
+    index_map[--count] = no_part;
+
+  /* Initialize MAP's mapping from partition to base index, using
+     as base indices an enumeration of the TENTATIVE partitions in
+     which each SSA version ended up, so that we compute conflicts
+     between all SSA versions that ended up in the same potential
+     coalesce partition.  */
+  bitmap_iterator bi;
+  unsigned i;
+  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
+    {
+      int pidx = var_to_partition (map, ssa_name (i));
+      int base = partition_find (tentative, pidx);
+      if (index_map[base] != no_part)
+	continue;
+      index_map[base] = count++;
+    }
+
+  map->num_basevars = count;
+
+  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
+    {
+      int pidx = var_to_partition (map, ssa_name (i));
+      int base = partition_find (tentative, pidx);
+      gcc_assert (index_map[base] < count);
+      map->partition_to_base_index[pidx] = index_map[base];
+    }
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    dump_part_var_map (dump_file, tentative, map);
+
+  partition_delete (tentative);
+}
+
+/* Hashtable helpers.  */
+
+struct tree_int_map_hasher : nofree_ptr_hash <tree_int_map>
+{
+  static inline hashval_t hash (const tree_int_map *);
+  static inline bool equal (const tree_int_map *, const tree_int_map *);
+};
+
+inline hashval_t
+tree_int_map_hasher::hash (const tree_int_map *v)
+{
+  return tree_map_base_hash (v);
+}
+
+inline bool
+tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
+{
+  return tree_int_map_eq (v, c);
+}
+
+/* This routine will initialize the basevar fields of MAP with base
+   names.  Partitions will share the same base if they have the same
+   SSA_NAME_VAR, or, being anonymous variables, the same type.  This
+   must match gimple_can_coalesce_p in the non-optimized case.  */
+
+static void
+compute_samebase_partition_bases (var_map map)
+{
+  int x, num_part;
+  tree var;
+  struct tree_int_map *m, *mapstorage;
+
+  num_part = num_var_partitions (map);
+  hash_table<tree_int_map_hasher> tree_to_index (num_part);
+  /* We can have at most num_part entries in the hash tables, so it's
+     enough to allocate so many map elements once, saving some malloc
+     calls.  */
+  mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
+
+  /* If a base table already exists, clear it, otherwise create it.  */
+  free (map->partition_to_base_index);
+  map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
+
+  /* Build the base variable list, and point partitions at their bases.  */
+  for (x = 0; x < num_part; x++)
+    {
+      struct tree_int_map **slot;
+      unsigned baseindex;
+      var = partition_to_var (map, x);
+      if (SSA_NAME_VAR (var)
+	  && (!VAR_P (SSA_NAME_VAR (var))
+	      || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
+	m->base.from = SSA_NAME_VAR (var);
+      else
+	/* This restricts what anonymous SSA names we can coalesce
+	   as it restricts the sets we compute conflicts for.
+	   Using TREE_TYPE to generate sets is the easiest as
+	   type equivalency also holds for SSA names with the same
+	   underlying decl.
+
+	   Check gimple_can_coalesce_p when changing this code.  */
+	m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
+			? TYPE_CANONICAL (TREE_TYPE (var))
+			: TREE_TYPE (var));
+      /* If base variable hasn't been seen, set it up.  */
+      slot = tree_to_index.find_slot (m, INSERT);
+      if (!*slot)
+	{
+	  baseindex = m - mapstorage;
+	  m->to = baseindex;
+	  *slot = m;
+	  m++;
+	}
+      else
+	baseindex = (*slot)->to;
+      map->partition_to_base_index[x] = baseindex;
+    }
+
+  map->num_basevars = m - mapstorage;
+
+  free (mapstorage);
+}
+
 /* Reduce the number of copies by coalescing variables in the function.  Return
    a partition map with the resulting coalesces.  */
 
@@ -1260,9 +1625,10 @@ coalesce_ssa_name (void)
   cl = create_coalesce_list ();
   map = create_outofssa_var_map (cl, used_in_copies);
 
-  /* If optimization is disabled, we need to coalesce all the names originating
-     from the same SSA_NAME_VAR so debug info remains undisturbed.  */
-  if (!optimize)
+  /* If this optimization is disabled, we need to coalesce all the
+     names originating from the same SSA_NAME_VAR so debug info
+     remains undisturbed.  */
+  if (!flag_tree_coalesce_vars)
     {
       hash_table<ssa_name_var_hash> ssa_name_hash (10);
 
@@ -1303,8 +1669,13 @@ coalesce_ssa_name (void)
   if (dump_file && (dump_flags & TDF_DETAILS))
     dump_var_map (dump_file, map);
 
-  /* Don't calculate live ranges for variables not in the coalesce list.  */
-  partition_view_bitmap (map, used_in_copies, true);
+  partition_view_bitmap (map, used_in_copies);
+
+  if (flag_tree_coalesce_vars)
+    compute_optimized_partition_bases (map, used_in_copies, cl);
+  else
+    compute_samebase_partition_bases (map);
+
   BITMAP_FREE (used_in_copies);
 
   if (num_var_partitions (map) < 1)
@@ -1343,8 +1714,7 @@ coalesce_ssa_name (void)
 
   /* Now coalesce everything in the list.  */
   coalesce_partitions (map, graph, cl,
-		       ((dump_flags & TDF_DETAILS) ? dump_file
-						   : NULL));
+		       ((dump_flags & TDF_DETAILS) ? dump_file : NULL));
 
   delete_coalesce_list (cl);
   ssa_conflicts_delete (graph);
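
Reader's note, not part of the patch: the base-index computation in
compute_optimized_partition_bases above boils down to "union every
coalescible pair, then number the resulting sets densely".  Below is a
minimal standalone sketch of that idea, assuming a toy union-find in
place of libiberty's partition type.  The names uf_find, uf_union,
struct pair and assign_base_indices are hypothetical and exist only for
this illustration.  The real code also walks cost-one pairs and abnormal
PHI edges, and only numbers partitions that appear in used_in_copies.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a coalesce pair: two partition numbers.  */
struct pair { int first, second; };

/* Find the representative of X, halving paths as we go.  */
static int
uf_find (int *parent, int x)
{
  while (parent[x] != x)
    {
      parent[x] = parent[parent[x]];
      x = parent[x];
    }
  return x;
}

/* Merge the sets containing A and B.  */
static void
uf_union (int *parent, int a, int b)
{
  a = uf_find (parent, a);
  b = uf_find (parent, b);
  if (a != b)
    parent[b] = a;
}

/* Fill BASE_INDEX[0..NPARTS-1] so that two partitions get the same
   index iff some chain of PAIRS connects them.  Indices are handed
   out densely, in order of first appearance.  Returns the number of
   distinct indices used.  */
static int
assign_base_indices (int nparts, const struct pair *pairs, int npairs,
		     int *base_index)
{
  if (nparts <= 0)
    return 0;

  int *parent = malloc (nparts * sizeof *parent);
  int *index_map = malloc (nparts * sizeof *index_map);
  if (!parent || !index_map)
    abort ();

  for (int i = 0; i < nparts; i++)
    {
      parent[i] = i;		/* Every partition starts alone.  */
      index_map[i] = -1;	/* No index assigned to this root yet.  */
    }

  for (int i = 0; i < npairs; i++)
    {
      assert (pairs[i].first >= 0 && pairs[i].first < nparts);
      assert (pairs[i].second >= 0 && pairs[i].second < nparts);
      uf_union (parent, pairs[i].first, pairs[i].second);
    }

  int count = 0;
  for (int i = 0; i < nparts; i++)
    {
      int root = uf_find (parent, i);
      if (index_map[root] < 0)
	index_map[root] = count++;
      base_index[i] = index_map[root];
    }

  free (parent);
  free (index_map);
  return count;
}
```

With six partitions and the pairs (0,1), (1,2) and (4,5), partitions
0 to 2 share one index, partition 3 gets its own, and 4 and 5 share a
third.  Conflicts then only need to be computed among partitions that
share an index, which is how the patch uses partition_to_base_index.
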
diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h
index 99b188a..ae289b4 100644
--- a/gcc/tree-ssa-coalesce.h
+++ b/gcc/tree-ssa-coalesce.h
@@ -21,5 +21,6 @@ along with GCC; see the file COPYING3.  If not see
 #define GCC_TREE_SSA_COALESCE_H
 
 extern var_map coalesce_ssa_name (void);
+extern bool gimple_can_coalesce_p (tree, tree);
 
 #endif /* GCC_TREE_SSA_COALESCE_H */
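
Reader's note, not part of the patch: the decision order in the new
gimple_can_coalesce_p (now declared in tree-ssa-coalesce.h) can be
easier to follow as a toy model.  The sketch below is an assumption-laden
simplification: struct base_decl, struct name and can_coalesce are
hypothetical, type compatibility is reduced to comparing an integer id,
and the boolean fields merely stand in for DECL_IGNORED_P,
use_register_for_decl, parm_maybe_byref_p and promote_ssa_mode.  It
mirrors the order of the tests, not GCC's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum base_kind { BASE_VAR, BASE_PARM, BASE_RESULT };

/* Hypothetical model of the decl behind an SSA name.  */
struct base_decl
{
  enum base_kind kind;
  bool ignored;		/* Stand-in for DECL_IGNORED_P (VAR only).  */
  bool in_register;	/* Stand-in for use_register_for_decl.  */
  bool maybe_byref;	/* Stand-in for parm_maybe_byref_p.  */
};

/* Hypothetical model of an SSA name.  */
struct name
{
  const struct base_decl *base;	/* NULL for an anonymous name.  */
  int type_id;			/* Equal ids model compatible types.  */
  int promoted_mode;		/* Stand-in for promote_ssa_mode.  */
};

/* Ignored VAR bases count as "no base" for the first test.  */
static const struct base_decl *
visible_base (const struct name *n)
{
  if (n->base && n->base->kind == BASE_VAR && n->base->ignored)
    return NULL;
  return n->base;
}

static bool
var_or_anon_p (const struct base_decl *b)
{
  return !b || b->kind == BASE_VAR;
}

static bool
can_coalesce (const struct name *a, const struct name *b,
	      bool coalesce_vars)
{
  assert (a && b);

  /* Without inter-variable coalescing, bases must match.  */
  if (visible_base (a) != visible_base (b) && !coalesce_vars)
    return false;

  /* Types must be compatible in either mode.  */
  if (a->type_id != b->type_id)
    return false;

  /* Same base: nothing below can fail.  */
  if (a->base == b->base)
    return true;

  /* Do not mix register-bound and stack-bound bases.  */
  bool reg_a = !a->base || a->base->in_register;
  bool reg_b = !b->base || b->base->in_register;
  if (reg_a != reg_b)
    return false;

  /* Only parms and results have distinct promotion rules.  */
  if (var_or_anon_p (a->base) && var_or_anon_p (b->base))
    return true;

  return !(a->base && a->base->maybe_byref)
	 && !(b->base && b->base->maybe_byref)
	 && a->promoted_mode == b->promoted_mode;
}
```

In this model, two names based on different user variables are rejected
when coalesce_vars is false and accepted when it is true, provided both
would live in registers.  A parm that may be passed by reference never
coalesces with a name that has a different base.
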
diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c
deleted file mode 100644
index aeb7f28..0000000
--- a/gcc/tree-ssa-copyrename.c
+++ /dev/null
@@ -1,475 +0,0 @@
-/* Rename SSA copies.
-   Copyright (C) 2004-2015 Free Software Foundation, Inc.
-   Contributed by Andrew MacLeod <amacleod@redhat.com>
-
-This file is part of GCC.
-
-GCC is free software; you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation; either version 3, or (at your option)
-any later version.
-
-GCC is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-GNU General Public License for more details.
-
-You should have received a copy of the GNU General Public License
-along with GCC; see the file COPYING3.  If not see
-<http://www.gnu.org/licenses/>.  */
-
-#include "config.h"
-#include "system.h"
-#include "coretypes.h"
-#include "backend.h"
-#include "tree.h"
-#include "gimple.h"
-#include "rtl.h"
-#include "ssa.h"
-#include "alias.h"
-#include "fold-const.h"
-#include "internal-fn.h"
-#include "gimple-iterator.h"
-#include "flags.h"
-#include "tree-pretty-print.h"
-#include "insn-config.h"
-#include "expmed.h"
-#include "dojump.h"
-#include "explow.h"
-#include "calls.h"
-#include "emit-rtl.h"
-#include "varasm.h"
-#include "stmt.h"
-#include "expr.h"
-#include "tree-dfa.h"
-#include "tree-inline.h"
-#include "tree-ssa-live.h"
-#include "tree-pass.h"
-#include "langhooks.h"
-
-static struct
-{
-  /* Number of copies coalesced.  */
-  int coalesced;
-} stats;
-
-/* The following routines implement the SSA copy renaming phase.
-
-   This optimization looks for copies between 2 SSA_NAMES, either through a
-   direct copy, or an implicit one via a PHI node result and its arguments.
-
-   Each copy is examined to determine if it is possible to rename the base
-   variable of one of the operands to the same variable as the other operand.
-   i.e.
-   T.3_5 = <blah>
-   a_1 = T.3_5
-
-   If this copy couldn't be copy propagated, it could possibly remain in the
-   program throughout the optimization phases.   After SSA->normal, it would
-   become:
-
-   T.3 = <blah>
-   a = T.3
-
-   Since T.3_5 is distinct from all other SSA versions of T.3, there is no
-   fundamental reason why the base variable needs to be T.3, subject to
-   certain restrictions.  This optimization attempts to determine if we can
-   change the base variable on copies like this, and result in code such as:
-
-   a_5 = <blah>
-   a_1 = a_5
-
-   This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is
-   possible, the copy goes away completely. If it isn't possible, a new temp
-   will be created for a_5, and you will end up with the exact same code:
-
-   a.8 = <blah>
-   a = a.8
-
-   The other benefit of performing this optimization relates to what variables
-   are chosen in copies.  Gimplification of the program uses temporaries for
-   a lot of things. expressions like
-
-   a_1 = <blah>
-   <blah2> = a_1
-
-   get turned into
-
-   T.3_5 = <blah>
-   a_1 = T.3_5
-   <blah2> = a_1
-
-   Copy propagation is done in a forward direction, and if we can propagate
-   through the copy, we end up with:
-
-   T.3_5 = <blah>
-   <blah2> = T.3_5
-
-   The copy is gone, but so is all reference to the user variable 'a'. By
-   performing this optimization, we would see the sequence:
-
-   a_5 = <blah>
-   a_1 = a_5
-   <blah2> = a_1
-
-   which copy propagation would then turn into:
-
-   a_5 = <blah>
-   <blah2> = a_5
-
-   and so we still retain the user variable whenever possible.  */
-
-
-/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
-   Choose a representative for the partition, and send debug info to DEBUG.  */
-
-static void
-copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
-{
-  int p1, p2, p3;
-  tree root1, root2;
-  tree rep1, rep2;
-  bool ign1, ign2, abnorm;
-
-  gcc_assert (TREE_CODE (var1) == SSA_NAME);
-  gcc_assert (TREE_CODE (var2) == SSA_NAME);
-
-  register_ssa_partition (map, var1);
-  register_ssa_partition (map, var2);
-
-  p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
-  p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
-
-  if (debug)
-    {
-      fprintf (debug, "Try : ");
-      print_generic_expr (debug, var1, TDF_SLIM);
-      fprintf (debug, "(P%d) & ", p1);
-      print_generic_expr (debug, var2, TDF_SLIM);
-      fprintf (debug, "(P%d)", p2);
-    }
-
-  gcc_assert (p1 != NO_PARTITION);
-  gcc_assert (p2 != NO_PARTITION);
-
-  if (p1 == p2)
-    {
-      if (debug)
-	fprintf (debug, " : Already coalesced.\n");
-      return;
-    }
-
-  rep1 = partition_to_var (map, p1);
-  rep2 = partition_to_var (map, p2);
-  root1 = SSA_NAME_VAR (rep1);
-  root2 = SSA_NAME_VAR (rep2);
-  if (!root1 && !root2)
-    return;
-
-  /* Don't coalesce if one of the variables occurs in an abnormal PHI.  */
-  abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1)
-	    || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2));
-  if (abnorm)
-    {
-      if (debug)
-	fprintf (debug, " : Abnormal PHI barrier.  No coalesce.\n");
-      return;
-    }
-
-  /* Partitions already have the same root, simply merge them.  */
-  if (root1 == root2)
-    {
-      p1 = partition_union (map->var_partition, p1, p2);
-      if (debug)
-	fprintf (debug, " : Same root, coalesced --> P%d.\n", p1);
-      return;
-    }
-
-  /* Never attempt to coalesce 2 different parameters.  */
-  if ((root1 && TREE_CODE (root1) == PARM_DECL)
-      && (root2 && TREE_CODE (root2) == PARM_DECL))
-    {
-      if (debug)
-        fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n");
-      return;
-    }
-
-  if ((root1 && TREE_CODE (root1) == RESULT_DECL)
-      != (root2 && TREE_CODE (root2) == RESULT_DECL))
-    {
-      if (debug)
-        fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n");
-      return;
-    }
-
-  ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1));
-  ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2));
-
-  /* Refrain from coalescing user variables, if requested.  */
-  if (!ign1 && !ign2)
-    {
-      if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2))
-	ign2 = true;
-      else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1))
-	ign1 = true;
-      else if (flag_ssa_coalesce_vars != 2)
-	{
-	  if (debug)
-	    fprintf (debug, " : 2 different USER vars. No coalesce.\n");
-	  return;
-	}
-      else
-	ign2 = true;
-    }
-
-  /* If both values have default defs, we can't coalesce.  If only one has a
-     tag, make sure that variable is the new root partition.  */
-  if (root1 && ssa_default_def (cfun, root1))
-    {
-      if (root2 && ssa_default_def (cfun, root2))
-	{
-	  if (debug)
-	    fprintf (debug, " : 2 default defs. No coalesce.\n");
-	  return;
-	}
-      else
-        {
-	  ign2 = true;
-	  ign1 = false;
-	}
-    }
-  else if (root2 && ssa_default_def (cfun, root2))
-    {
-      ign1 = true;
-      ign2 = false;
-    }
-
-  /* Do not coalesce if we cannot assign a symbol to the partition.  */
-  if (!(!ign2 && root2)
-      && !(!ign1 && root1))
-    {
-      if (debug)
-	fprintf (debug, " : Choosen variable has no root.  No coalesce.\n");
-      return;
-    }
-
-  /* Don't coalesce if the new chosen root variable would be read-only.
-     If both ign1 && ign2, then the root var of the larger partition
-     wins, so reject in that case if any of the root vars is TREE_READONLY.
-     Otherwise reject only if the root var, on which replace_ssa_name_symbol
-     will be called below, is readonly.  */
-  if (((root1 && TREE_READONLY (root1)) && ign2)
-      || ((root2 && TREE_READONLY (root2)) && ign1))
-    {
-      if (debug)
-	fprintf (debug, " : Readonly variable.  No coalesce.\n");
-      return;
-    }
-
-  /* Don't coalesce if the two variables aren't type compatible .  */
-  if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2))
-      /* There is a disconnect between the middle-end type-system and
-         VRP, avoid coalescing enum types with different bounds.  */
-      || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE
-	   || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE)
-	  && TREE_TYPE (var1) != TREE_TYPE (var2)))
-    {
-      if (debug)
-	fprintf (debug, " : Incompatible types.  No coalesce.\n");
-      return;
-    }
-
-  /* Merge the two partitions.  */
-  p3 = partition_union (map->var_partition, p1, p2);
-
-  /* Set the root variable of the partition to the better choice, if there is
-     one.  */
-  if (!ign2 && root2)
-    replace_ssa_name_symbol (partition_to_var (map, p3), root2);
-  else if (!ign1 && root1)
-    replace_ssa_name_symbol (partition_to_var (map, p3), root1);
-  else
-    gcc_unreachable ();
-
-  if (debug)
-    {
-      fprintf (debug, " --> P%d ", p3);
-      print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)),
-			  TDF_SLIM);
-      fprintf (debug, "\n");
-    }
-}
-
-
-namespace {
-
-const pass_data pass_data_rename_ssa_copies =
-{
-  GIMPLE_PASS, /* type */
-  "copyrename", /* name */
-  OPTGROUP_NONE, /* optinfo_flags */
-  TV_TREE_COPY_RENAME, /* tv_id */
-  ( PROP_cfg | PROP_ssa ), /* properties_required */
-  0, /* properties_provided */
-  0, /* properties_destroyed */
-  0, /* todo_flags_start */
-  0, /* todo_flags_finish */
-};
-
-class pass_rename_ssa_copies : public gimple_opt_pass
-{
-public:
-  pass_rename_ssa_copies (gcc::context *ctxt)
-    : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt)
-  {}
-
-  /* opt_pass methods: */
-  opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); }
-  virtual bool gate (function *) { return flag_tree_copyrename != 0; }
-  virtual unsigned int execute (function *);
-
-}; // class pass_rename_ssa_copies
-
-/* This function will make a pass through the IL, and attempt to coalesce any
-   SSA versions which occur in PHI's or copies.  Coalescing is accomplished by
-   changing the underlying root variable of all coalesced version.  This will
-   then cause the SSA->normal pass to attempt to coalesce them all to the same
-   variable.  */
-
-unsigned int
-pass_rename_ssa_copies::execute (function *fun)
-{
-  var_map map;
-  basic_block bb;
-  tree var, part_var;
-  gimple stmt;
-  unsigned x;
-  FILE *debug;
-
-  memset (&stats, 0, sizeof (stats));
-
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    debug = dump_file;
-  else
-    debug = NULL;
-
-  map = init_var_map (num_ssa_names);
-
-  FOR_EACH_BB_FN (bb, fun)
-    {
-      /* Scan for real copies.  */
-      for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
-	   gsi_next (&gsi))
-	{
-	  stmt = gsi_stmt (gsi);
-	  if (gimple_assign_ssa_name_copy_p (stmt))
-	    {
-	      tree lhs = gimple_assign_lhs (stmt);
-	      tree rhs = gimple_assign_rhs1 (stmt);
-
-	      copy_rename_partition_coalesce (map, lhs, rhs, debug);
-	    }
-	}
-    }
-
-  FOR_EACH_BB_FN (bb, fun)
-    {
-      /* Treat PHI nodes as copies between the result and each argument.  */
-      for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
-	   gsi_next (&gsi))
-        {
-          size_t i;
-	  tree res;
-	  gphi *phi = gsi.phi ();
-	  res = gimple_phi_result (phi);
-
-	  /* Do not process virtual SSA_NAMES.  */
-	  if (virtual_operand_p (res))
-	    continue;
-
-	  /* Make sure to only use the same partition for an argument
-	     as the result but never the other way around.  */
-	  if (SSA_NAME_VAR (res)
-	      && !DECL_IGNORED_P (SSA_NAME_VAR (res)))
-	    for (i = 0; i < gimple_phi_num_args (phi); i++)
-	      {
-		tree arg = PHI_ARG_DEF (phi, i);
-		if (TREE_CODE (arg) == SSA_NAME)
-		  copy_rename_partition_coalesce (map, res, arg,
-						  debug);
-	      }
-	  /* Else if all arguments are in the same partition try to merge
-	     it with the result.  */
-	  else
-	    {
-	      int all_p_same = -1;
-	      int p = -1;
-	      for (i = 0; i < gimple_phi_num_args (phi); i++)
-		{
-		  tree arg = PHI_ARG_DEF (phi, i);
-		  if (TREE_CODE (arg) != SSA_NAME)
-		    {
-		      all_p_same = 0;
-		      break;
-		    }
-		  else if (all_p_same == -1)
-		    {
-		      p = partition_find (map->var_partition,
-					  SSA_NAME_VERSION (arg));
-		      all_p_same = 1;
-		    }
-		  else if (all_p_same == 1
-			   && p != partition_find (map->var_partition,
-						   SSA_NAME_VERSION (arg)))
-		    {
-		      all_p_same = 0;
-		      break;
-		    }
-		}
-	      if (all_p_same == 1)
-		copy_rename_partition_coalesce (map, res,
-						PHI_ARG_DEF (phi, 0),
-						debug);
-	    }
-        }
-    }
-
-  if (debug)
-    dump_var_map (debug, map);
-
-  /* Now one more pass to make all elements of a partition share the same
-     root variable.  */
-
-  for (x = 1; x < num_ssa_names; x++)
-    {
-      part_var = partition_to_var (map, x);
-      if (!part_var)
-        continue;
-      var = ssa_name (x);
-      if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var))
-	continue;
-      if (debug)
-        {
-	  fprintf (debug, "Coalesced ");
-	  print_generic_expr (debug, var, TDF_SLIM);
-	  fprintf (debug, " to ");
-	  print_generic_expr (debug, part_var, TDF_SLIM);
-	  fprintf (debug, "\n");
-	}
-      stats.coalesced++;
-      replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var));
-    }
-
-  statistics_counter_event (fun, "copies coalesced",
-			    stats.coalesced);
-  delete_var_map (map);
-  return 0;
-}
-
-} // anon namespace
-
-gimple_opt_pass *
-make_pass_rename_ssa_copies (gcc::context *ctxt)
-{
-  return new pass_rename_ssa_copies (ctxt);
-}
diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
index 5b00f58..4772558 100644
--- a/gcc/tree-ssa-live.c
+++ b/gcc/tree-ssa-live.c
@@ -70,88 +70,6 @@ static void  verify_live_on_entry (tree_live_info_p);
    ssa_name or variable, and vice versa.  */
 
 
-/* Hashtable helpers.  */
-
-struct tree_int_map_hasher : nofree_ptr_hash <tree_int_map>
-{
-  static inline hashval_t hash (const tree_int_map *);
-  static inline bool equal (const tree_int_map *, const tree_int_map *);
-};
-
-inline hashval_t
-tree_int_map_hasher::hash (const tree_int_map *v)
-{
-  return tree_map_base_hash (v);
-}
-
-inline bool
-tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
-{
-  return tree_int_map_eq (v, c);
-}
-
-
-/* This routine will initialize the basevar fields of MAP.  */
-
-static void
-var_map_base_init (var_map map)
-{
-  int x, num_part;
-  tree var;
-  struct tree_int_map *m, *mapstorage;
-
-  num_part = num_var_partitions (map);
-  hash_table<tree_int_map_hasher> tree_to_index (num_part);
-  /* We can have at most num_part entries in the hash tables, so it's
-     enough to allocate so many map elements once, saving some malloc
-     calls.  */
-  mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
-
-  /* If a base table already exists, clear it, otherwise create it.  */
-  free (map->partition_to_base_index);
-  map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
-
-  /* Build the base variable list, and point partitions at their bases.  */
-  for (x = 0; x < num_part; x++)
-    {
-      struct tree_int_map **slot;
-      unsigned baseindex;
-      var = partition_to_var (map, x);
-      if (SSA_NAME_VAR (var)
-	  && (!VAR_P (SSA_NAME_VAR (var))
-	      || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
-	m->base.from = SSA_NAME_VAR (var);
-      else
-	/* This restricts what anonymous SSA names we can coalesce
-	   as it restricts the sets we compute conflicts for.
-	   Using TREE_TYPE to generate sets is the easies as
-	   type equivalency also holds for SSA names with the same
-	   underlying decl. 
-
-	   Check gimple_can_coalesce_p when changing this code.  */
-	m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
-			? TYPE_CANONICAL (TREE_TYPE (var))
-			: TREE_TYPE (var));
-      /* If base variable hasn't been seen, set it up.  */
-      slot = tree_to_index.find_slot (m, INSERT);
-      if (!*slot)
-	{
-	  baseindex = m - mapstorage;
-	  m->to = baseindex;
-	  *slot = m;
-	  m++;
-	}
-      else
-	baseindex = (*slot)->to;
-      map->partition_to_base_index[x] = baseindex;
-    }
-
-  map->num_basevars = m - mapstorage;
-
-  free (mapstorage);
-}
-
-
 /* Remove the base table in MAP.  */
 
 static void
@@ -329,21 +247,17 @@ partition_view_fini (var_map map, bitmap selected)
 }
 
 
-/* Create a partition view which includes all the used partitions in MAP.  If
-   WANT_BASES is true, create the base variable map as well.  */
+/* Create a partition view which includes all the used partitions in MAP.  */
 
 void
-partition_view_normal (var_map map, bool want_bases)
+partition_view_normal (var_map map)
 {
   bitmap used;
 
   used = partition_view_init (map);
   partition_view_fini (map, used);
 
-  if (want_bases)
-    var_map_base_init (map);
-  else
-    var_map_base_fini (map);
+  var_map_base_fini (map);
 }
 
 
@@ -352,7 +266,7 @@ partition_view_normal (var_map map, bool want_bases)
    as well.  */
 
 void
-partition_view_bitmap (var_map map, bitmap only, bool want_bases)
+partition_view_bitmap (var_map map, bitmap only)
 {
   bitmap used;
   bitmap new_partitions = BITMAP_ALLOC (NULL);
@@ -368,10 +282,7 @@ partition_view_bitmap (var_map map, bitmap only, bool want_bases)
     }
   partition_view_fini (map, new_partitions);
 
-  if (want_bases)
-    var_map_base_init (map);
-  else
-    var_map_base_fini (map);
+  var_map_base_fini (map);
 }
 
 
diff --git a/gcc/tree-ssa-live.h b/gcc/tree-ssa-live.h
index d5d7820..1f88358 100644
--- a/gcc/tree-ssa-live.h
+++ b/gcc/tree-ssa-live.h
@@ -71,8 +71,8 @@ typedef struct _var_map
 extern var_map init_var_map (int);
 extern void delete_var_map (var_map);
 extern int var_union (var_map, tree, tree);
-extern void partition_view_normal (var_map, bool);
-extern void partition_view_bitmap (var_map, bitmap, bool);
+extern void partition_view_normal (var_map);
+extern void partition_view_bitmap (var_map, bitmap);
 extern void dump_scope_blocks (FILE *, int);
 extern void debug_scope_block (tree, int);
 extern void debug_scope_blocks (int);
diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c
index 437f69d..1fbd71e 100644
--- a/gcc/tree-ssa-uncprop.c
+++ b/gcc/tree-ssa-uncprop.c
@@ -38,6 +38,11 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-pass.h"
 #include "tree-ssa-propagate.h"
 #include "tree-hash-traits.h"
+#include "bitmap.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-ssa-live.h"
+#include "tree-ssa-coalesce.h"
 
 /* The basic structure describing an equivalency created by traversing
    an edge.  Traversing the edge effectively means that we can assume
diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
index da9de28..a31a137 100644
--- a/gcc/var-tracking.c
+++ b/gcc/var-tracking.c
@@ -4856,12 +4856,16 @@ dataflow_set_remove_mem_locs (variable_def **slot, dataflow_set *set)
    registers, as well as associations between MEMs and VALUEs.  */
 
 static void
-dataflow_set_clear_at_call (dataflow_set *set)
+dataflow_set_clear_at_call (dataflow_set *set, rtx_insn *call_insn)
 {
   unsigned int r;
   hard_reg_set_iterator hrsi;
+  HARD_REG_SET invalidated_regs;
 
-  EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, r, hrsi)
+  get_call_reg_set_usage (call_insn, &invalidated_regs,
+			  regs_invalidated_by_call);
+
+  EXECUTE_IF_SET_IN_HARD_REG_SET (invalidated_regs, 0, r, hrsi)
     var_regno_delete (set, r);
 
   if (MAY_HAVE_DEBUG_INSNS)
@@ -6645,7 +6649,7 @@ compute_bb_dataflow (basic_block bb)
       switch (mo->type)
 	{
 	  case MO_CALL:
-	    dataflow_set_clear_at_call (out);
+	    dataflow_set_clear_at_call (out, insn);
 	    break;
 
 	  case MO_USE:
@@ -9107,7 +9111,7 @@ emit_notes_in_bb (basic_block bb, dataflow_set *set)
       switch (mo->type)
 	{
 	  case MO_CALL:
-	    dataflow_set_clear_at_call (set);
+	    dataflow_set_clear_at_call (set, insn);
 	    emit_notes_for_changes (insn, EMIT_NOTE_AFTER_CALL_INSN, set->vars);
 	    {
 	      rtx arguments = mo->u.loc, *p = &arguments;


incremental fixes

From: Alexandre Oliva <aoliva@redhat.com>

	* emit-rtl.c: Include stor-layout.h.
	(set_reg_attrs_for_decl_rtl): Take mode from expression if
	it's not a DECL.
	* stmt.c (emit_case_decision_tree): Pass it the SSA_NAME
	rather than its possibly-NULL DECL.
	PR bootstrap/66978
	* function.c (expand_function_start): Convert static chain to
	Pmode if needed.  From H.J. Lu  <hongjiu.lu@intel.com>.
	PR middle-end/66983
	PR middle-end/67035
	* cfgexpand.c (align_local_variable, add_stack_var): Support
	anonymous SSA names.
	(defer_stack_allocation): Likewise.  Declare earlier.
	(expand_one_ssa_partition): Record alignment before expanding
	stack vars.  Support deferred allocation.
	(set_rtl): Do not record deferred-allocation marker in
	SA.partition_to_pseudo.
	(expand_stack_vars): Adjust check for the marker in it.
	(adjust_one_expanded_partition_var): Skip deferred-alloc vars.
	PR middle-end/67034
	* cfgexpand.c (parm_maybe_byref_p): New.
	(expand_one_ssa_partition): Call it.  Expand maybe-byref
	parms' default defs with a placeholder for the mem addr.
	(ssa_default_def_partition): New.
	(get_rtl_for_parm_ssa_default_def): Use it.
	* function.c (assign_parm_setup_block): Replace the
	placeholder with the address of the newly-allocated block.
	(assign_parm_setup_reg): Replace the placeholder with a
	newly-created pseudo.  Arrange for the pseudo to be
	initialized from the incoming passed pointer.  Make sure
	passed_pointer parms don't need conversion.  Don't copy from
	validated_mem if parmreg is the value expression from expand
	and validated_mem is the passed pointer.  For passed pointers,
	copy from the mem referenced by validated_mem when using the
	expand-chosen rtl.
	* cfgexpand.h (parm_maybe_byref_p): Declare.
	* tree-ssa-coalesce.c: Include cfgexpand.h.
	(gimple_can_coalesce_p): Do not coalesce maybe-byref parms
	with SSA_NAMEs of other variables, or anonymous SSA_NAMEs.
	PR rtl-optimization/67000
	* expr.c (read_complex_part): Export.
	* expr.h (read_complex_part): Declare.
	* function.c (split_complex_args): Use it.  Reset complex parm
	before fetching its default decl rtl.
	(assign_parms_unsplit_complex): Use the preexisting complex
	parm rtl if it matches the components.
	(assign_parm_setup_reg): Drop assert on from_expand mode.
	Adjust it to the promoted_mode, if not byref.
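
As an illustration of the marker handling described in the cfgexpand.c
entries above (set_rtl, expand_stack_vars), here is a minimal
standalone sketch.  It is not GCC code: struct var, DEFERRED,
partition_rtl and has_rtl_already are hypothetical stand-ins, with
DEFERRED playing the role of pc_rtx and partition_rtl the role of
SA.partition_to_pseudo.  The assumption it models is that a decl whose
allocation is deferred carries the marker itself, whereas an SSA
name's partition slot is simply left empty, so the two cases need
different tests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of these are GCC types.  */
enum kind { KIND_DECL, KIND_SSA_NAME };

struct var
{
  enum kind kind;
  const void *decl_rtl;	/* Meaningful only when kind == KIND_DECL.  */
  size_t partition;	/* Meaningful only when kind == KIND_SSA_NAME.  */
};

/* Sentinel playing the role of pc_rtx: a decl whose stack slot has
   not been allocated yet points here.  */
static const char deferred_marker = 0;
#define DEFERRED ((const void *) &deferred_marker)

/* Per-partition RTL, playing the role of SA.partition_to_pseudo.  The
   marker is never stored here, so NULL means "allocation deferred".  */
#define NUM_PARTITIONS 4
static const void *partition_rtl[NUM_PARTITIONS];

/* Return true if V already has RTL assigned and must be skipped when
   laying out deferred stack variables.  */
static bool
has_rtl_already (const struct var *v)
{
  if (v->kind == KIND_SSA_NAME)
    {
      assert (v->partition < NUM_PARTITIONS);
      return partition_rtl[v->partition] != NULL;
    }
  return v->decl_rtl != DEFERRED;
}
```

A single comparison of either value against the marker would treat an
SSA name with an empty slot as already allocated, since NULL differs
from the marker; splitting the test per kind avoids that.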
---
 gcc/cfgexpand.c         |  129 +++++++++++++++++++++++++++++++++++++----------
 gcc/cfgexpand.h         |    1 
 gcc/emit-rtl.c          |    5 +-
 gcc/expr.c              |    2 -
 gcc/expr.h              |    1 
 gcc/function.c          |   88 ++++++++++++++++++++++++--------
 gcc/stmt.c              |    2 -
 gcc/tree-ssa-coalesce.c |   12 +++-
 8 files changed, 186 insertions(+), 54 deletions(-)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 0b19953..8f6caf6 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -97,6 +97,8 @@ gimple currently_expanding_gimple_stmt;
 
 static rtx expand_debug_expr (tree);
 
+static bool defer_stack_allocation (tree, bool);
+
 /* Return an expression tree corresponding to the RHS of GIMPLE
    statement STMT.  */
 
@@ -170,6 +172,39 @@ leader_merge (tree cur, tree next)
   return cur;
 }
 
+/* Return true if VAR is a PARM_DECL or a RESULT_DECL of type BLKmode.
+   Such parameters are likely passed as a pointer to the value, rather
+   than as a value, and so we must not coalesce them, nor allocate
+   stack space for them before determining the calling conventions for
+   them.  For their SSA_NAMEs, expand_one_ssa_partition emits RTL as
+   MEMs with pc_rtx as the address, and then it replaces the pc_rtx
+   with NULL so as to make sure the MEM is not used before it is
+   adjusted in assign_parm_setup_reg.  */
+
+bool
+parm_maybe_byref_p (tree var)
+{
+  if (!var || VAR_P (var))
+    return false;
+
+  gcc_assert (TREE_CODE (var) == PARM_DECL
+	      || TREE_CODE (var) == RESULT_DECL);
+
+  return TYPE_MODE (TREE_TYPE (var)) == BLKmode;
+}
+
+/* Return the partition of the default SSA_DEF for decl VAR.  */
+
+static int
+ssa_default_def_partition (tree var)
+{
+  tree name = ssa_default_def (cfun, var);
+
+  if (!name)
+    return NO_PARTITION;
+
+  return var_to_partition (SA.map, name);
+}
 
 /* Return the RTL for the default SSA def of a PARM or RESULT, if
    there is one.  */
@@ -198,12 +233,7 @@ get_rtl_for_parm_ssa_default_def (tree var)
       return DECL_RTL (var);
     }
 
-  tree name = ssa_default_def (cfun, var);
-
-  if (!name)
-    return NULL_RTX;
-
-  int part = var_to_partition (SA.map, name);
+  int part = ssa_default_def_partition (var);
   if (part == NO_PARTITION)
     return NULL_RTX;
 
@@ -253,7 +283,7 @@ set_rtl (tree t, rtx x)
 	{
 	  if (SA.partition_to_pseudo[part])
 	    gcc_assert (SA.partition_to_pseudo[part] == x);
-	  else
+	  else if (x != pc_rtx)
 	    SA.partition_to_pseudo[part] = x;
 	}
       /* For the benefit of debug information at -O0 (where
@@ -348,8 +378,15 @@ static bool has_short_buffer;
 static unsigned int
 align_local_variable (tree decl)
 {
-  unsigned int align = LOCAL_DECL_ALIGNMENT (decl);
-  DECL_ALIGN (decl) = align;
+  unsigned int align;
+
+  if (TREE_CODE (decl) == SSA_NAME)
+    align = TYPE_ALIGN (TREE_TYPE (decl));
+  else
+    {
+      align = LOCAL_DECL_ALIGNMENT (decl);
+      DECL_ALIGN (decl) = align;
+    }
   return align / BITS_PER_UNIT;
 }
 
@@ -415,12 +452,15 @@ add_stack_var (tree decl)
   decl_to_stack_part->put (decl, stack_vars_num);
 
   v->decl = decl;
-  v->size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (decl)));
+  tree size = TREE_CODE (decl) == SSA_NAME
+    ? TYPE_SIZE_UNIT (TREE_TYPE (decl))
+    : DECL_SIZE_UNIT (decl);
+  v->size = tree_to_uhwi (size);
   /* Ensure that all variables have size, so that &a != &b for any two
      variables that are simultaneously live.  */
   if (v->size == 0)
     v->size = 1;
-  v->alignb = align_local_variable (SSAVAR (decl));
+  v->alignb = align_local_variable (decl);
   /* An alignment of zero can mightily confuse us later.  */
   gcc_assert (v->alignb != 0);
 
@@ -1051,9 +1091,9 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
 	  /* Skip variables that have already had rtl assigned.  See also
 	     add_stack_var where we perpetrate this pc_rtx hack.  */
 	  decl = stack_vars[i].decl;
-	  if ((TREE_CODE (decl) == SSA_NAME
-	      ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)]
-	      : DECL_RTL (decl)) != pc_rtx)
+	  if (TREE_CODE (decl) == SSA_NAME
+	      ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)] != NULL_RTX
+	      : DECL_RTL (decl) != pc_rtx)
 	    continue;
 
 	  large_size += alignb - 1;
@@ -1082,9 +1122,9 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
       /* Skip variables that have already had rtl assigned.  See also
 	 add_stack_var where we perpetrate this pc_rtx hack.  */
       decl = stack_vars[i].decl;
-      if ((TREE_CODE (decl) == SSA_NAME
-	   ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)]
-	   : DECL_RTL (decl)) != pc_rtx)
+      if (TREE_CODE (decl) == SSA_NAME
+	  ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)] != NULL_RTX
+	  : DECL_RTL (decl) != pc_rtx)
 	continue;
 
       /* Check the predicate to see whether this variable should be
@@ -1290,12 +1330,6 @@ expand_one_ssa_partition (tree var)
   if (SA.partition_to_pseudo[part])
     return;
 
-  if (!use_register_for_decl (var))
-    {
-      expand_one_stack_var_1 (var);
-      return;
-    }
-
   unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (var),
 					  TYPE_MODE (TREE_TYPE (var)),
 					  TYPE_ALIGN (TREE_TYPE (var)));
@@ -1307,6 +1341,27 @@ expand_one_ssa_partition (tree var)
 
   record_alignment_for_reg_var (align);
 
+  if (!use_register_for_decl (var))
+    {
+      if (parm_maybe_byref_p (SSA_NAME_VAR (var))
+	  && ssa_default_def_partition (SSA_NAME_VAR (var)) == part)
+	{
+	  expand_one_stack_var_at (var, pc_rtx, 0, 0);
+	  rtx x = SA.partition_to_pseudo[part];
+	  gcc_assert (GET_CODE (x) == MEM);
+	  gcc_assert (GET_MODE (x) == BLKmode);
+	  gcc_assert (XEXP (x, 0) == pc_rtx);
+	  /* Reset the address, so that any attempt to use it will
+	     ICE.  It will be adjusted in assign_parm_setup_reg.  */
+	  XEXP (x, 0) = NULL_RTX;
+	}
+      else if (defer_stack_allocation (var, true))
+	add_stack_var (var);
+      else
+	expand_one_stack_var_1 (var);
+      return;
+    }
+
   machine_mode reg_mode = promote_ssa_mode (var, NULL);
 
   rtx x = gen_reg_rtx (reg_mode);
@@ -1331,6 +1386,13 @@ adjust_one_expanded_partition_var (tree var)
 
   rtx x = SA.partition_to_pseudo[part];
 
+  if (!x)
+    {
+      /* This var will get a stack slot later.  */
+      gcc_assert (defer_stack_allocation (var, true));
+      return;
+    }
+
   set_rtl (var, x);
 
   if (!REG_P (x))
@@ -1409,10 +1471,14 @@ expand_one_error_var (tree var)
 static bool
 defer_stack_allocation (tree var, bool toplevel)
 {
+  tree size_unit = TREE_CODE (var) == SSA_NAME
+    ? TYPE_SIZE_UNIT (TREE_TYPE (var))
+    : DECL_SIZE_UNIT (var);
+
   /* Whether the variable is small enough for immediate allocation not to be
      a problem with regard to the frame size.  */
   bool smallish
-    = ((HOST_WIDE_INT) tree_to_uhwi (DECL_SIZE_UNIT (var))
+    = ((HOST_WIDE_INT) tree_to_uhwi (size_unit)
        < PARAM_VALUE (PARAM_MIN_SIZE_FOR_STACK_SHARING));
 
   /* If stack protection is enabled, *all* stack variables must be deferred,
@@ -1421,16 +1487,24 @@ defer_stack_allocation (tree var, bool toplevel)
   if (flag_stack_protect || ((flag_sanitize & SANITIZE_ADDRESS) && ASAN_STACK))
     return true;
 
+  unsigned int align = TREE_CODE (var) == SSA_NAME
+    ? TYPE_ALIGN (TREE_TYPE (var))
+    : DECL_ALIGN (var);
+
   /* We handle "large" alignment via dynamic allocation.  We want to handle
      this extra complication in only one place, so defer them.  */
-  if (DECL_ALIGN (var) > MAX_SUPPORTED_STACK_ALIGNMENT)
+  if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
     return true;
 
+  bool ignored = TREE_CODE (var) == SSA_NAME
+    ? !SSAVAR (var) || DECL_IGNORED_P (SSA_NAME_VAR (var))
+    : DECL_IGNORED_P (var);
+
   /* When optimization is enabled, DECL_IGNORED_P variables originally scoped
      might be detached from their block and appear at toplevel when we reach
      here.  We want to coalesce them with variables from other blocks when
      the immediate contribution to the frame size would be noticeable.  */
-  if (toplevel && optimize > 0 && DECL_IGNORED_P (var) && !smallish)
+  if (toplevel && optimize > 0 && ignored && !smallish)
     return true;
 
   /* Variables declared in the outermost scope automatically conflict
@@ -6135,7 +6209,8 @@ pass_expand::execute (function *fun)
       if (part == NO_PARTITION)
 	continue;
 
-      gcc_assert (SA.partition_to_pseudo[part]);
+      gcc_assert (SA.partition_to_pseudo[part]
+		  || defer_stack_allocation (name, true));
 
       /* If this decl was marked as living in multiple places, reset
 	 this now to NULL.  */
diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
index 602579d..987cf356 100644
--- a/gcc/cfgexpand.h
+++ b/gcc/cfgexpand.h
@@ -22,6 +22,7 @@ along with GCC; see the file COPYING3.  If not see
 
 extern tree gimple_assign_rhs_to_tree (gimple);
 extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
+extern bool parm_maybe_byref_p (tree);
 extern rtx get_rtl_for_parm_ssa_default_def (tree var);
 
 
diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index 0648af6..3b95c5d 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -63,6 +63,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "target.h"
 #include "builtins.h"
 #include "rtl-iter.h"
+#include "stor-layout.h"
 
 struct target_rtl default_target_rtl;
 #if SWITCHABLE_TARGET
@@ -1243,7 +1244,9 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x)
   if (REG_P (x))
     REG_ATTRS (x)
       = get_reg_attrs (t, byte_lowpart_offset (GET_MODE (x),
-					       DECL_MODE (tdecl)));
+					       DECL_P (tdecl)
+					       ? DECL_MODE (tdecl)
+					       : TYPE_MODE (TREE_TYPE (tdecl))));
   if (GET_CODE (x) == CONCAT)
     {
       if (REG_P (XEXP (x, 0)))
diff --git a/gcc/expr.c b/gcc/expr.c
index d601129..fc49f92 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -3022,7 +3022,7 @@ write_complex_part (rtx cplx, rtx val, bool imag_p)
 /* Extract one of the components of the complex value CPLX.  Extract the
    real part if IMAG_P is false, and the imaginary part if it's true.  */
 
-static rtx
+rtx
 read_complex_part (rtx cplx, bool imag_p)
 {
   machine_mode cmode, imode;
diff --git a/gcc/expr.h b/gcc/expr.h
index 32d1707..a2c8e1d 100644
--- a/gcc/expr.h
+++ b/gcc/expr.h
@@ -210,6 +210,7 @@ extern rtx_insn *emit_move_insn_1 (rtx, rtx);
 
 extern rtx_insn *emit_move_complex_push (machine_mode, rtx, rtx);
 extern rtx_insn *emit_move_complex_parts (rtx, rtx);
+extern rtx read_complex_part (rtx, bool);
 extern void write_complex_part (rtx, rtx, bool);
 extern rtx emit_move_resolve_push (machine_mode, rtx);
 
diff --git a/gcc/function.c b/gcc/function.c
index c3d00cd..1d98ede 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -152,6 +152,7 @@ static void prepare_function_start (void);
 static void do_clobber_return_reg (rtx, void *);
 static void do_use_return_reg (rtx, void *);
 static rtx rtl_for_parm (struct assign_parm_data_all *, tree);
+static void maybe_reset_rtl_for_parm (tree);
 
 \f
 /* Stack of nested functions.  */
@@ -2321,12 +2322,12 @@ split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
 	     from resetting their RTL.  */
 	  if (currently_expanding_to_rtl)
 	    {
+	      maybe_reset_rtl_for_parm (cparm);
 	      rtx rtl = rtl_for_parm (all, cparm);
-	      gcc_assert (!rtl || GET_CODE (rtl) == CONCAT);
 	      if (rtl)
 		{
-		  SET_DECL_RTL (p, XEXP (rtl, 0));
-		  SET_DECL_RTL (decl, XEXP (rtl, 1));
+		  SET_DECL_RTL (p, read_complex_part (rtl, false));
+		  SET_DECL_RTL (decl, read_complex_part (rtl, true));
 
 		  DECL_CONTEXT (p) = cparm;
 		  DECL_CONTEXT (decl) = cparm;
@@ -2954,16 +2955,27 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
   if (stack_parm == 0)
     {
       DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
-      stack_parm = rtl_for_parm (all, parm);
-      if (stack_parm)
-	stack_parm = copy_rtx (stack_parm);
+      rtx from_expand = rtl_for_parm (all, parm);
+      if (from_expand && (!parm_maybe_byref_p (parm)
+			  || XEXP (from_expand, 0) != NULL_RTX))
+	stack_parm = copy_rtx (from_expand);
       else
 	{
 	  stack_parm = assign_stack_local (BLKmode, size_stored,
 					   DECL_ALIGN (parm));
 	  if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
 	    PUT_MODE (stack_parm, GET_MODE (entry_parm));
-	  set_mem_attributes (stack_parm, parm, 1);
+	  if (from_expand)
+	    {
+	      gcc_assert (GET_CODE (stack_parm) == MEM);
+	      gcc_assert (GET_CODE (from_expand) == MEM);
+	      gcc_assert (XEXP (from_expand, 0) == NULL_RTX);
+	      XEXP (from_expand, 0) = XEXP (stack_parm, 0);
+	      PUT_MODE (from_expand, GET_MODE (stack_parm));
+	      stack_parm = copy_rtx (from_expand);
+	    }
+	  else
+	    set_mem_attributes (stack_parm, parm, 1);
 	}
     }
 
@@ -3102,23 +3114,34 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
     = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
 			     TREE_TYPE (current_function_decl), 2);
 
-  rtx from_expand = rtl_for_parm (all, parm);
+  rtx from_expand = parmreg = rtl_for_parm (all, parm);
 
   if (from_expand && !data->passed_pointer)
     {
-      parmreg = from_expand;
-      gcc_assert (GET_MODE (parmreg) == promoted_nominal_mode);
+      if (GET_MODE (parmreg) != promoted_nominal_mode)
+	parmreg = gen_lowpart (promoted_nominal_mode, parmreg);
     }
-  else
+  else if (!from_expand || parm_maybe_byref_p (parm))
     {
       parmreg = gen_reg_rtx (promoted_nominal_mode);
       if (!DECL_ARTIFICIAL (parm))
 	mark_user_reg (parmreg);
+
+      if (from_expand)
+	{
+	  gcc_assert (data->passed_pointer);
+	  gcc_assert (GET_CODE (from_expand) == MEM
+		      && GET_MODE (from_expand) == BLKmode
+		      && XEXP (from_expand, 0) == NULL_RTX);
+	  XEXP (from_expand, 0) = parmreg;
+	}
     }
 
   /* If this was an item that we received a pointer to,
      set DECL_RTL appropriately.  */
-  if (data->passed_pointer)
+  if (from_expand)
+    SET_DECL_RTL (parm, from_expand);
+  else if (data->passed_pointer)
     {
       rtx x = gen_rtx_MEM (TYPE_MODE (TREE_TYPE (data->passed_type)), parmreg);
       set_mem_attributes (x, parm, 1);
@@ -3139,6 +3162,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 
   need_conversion = (data->nominal_mode != data->passed_mode
 		     || promoted_nominal_mode != data->promoted_mode);
+  gcc_assert (!(need_conversion && data->passed_pointer && from_expand));
   moved = false;
 
   if (need_conversion
@@ -3270,7 +3294,9 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 
       did_conversion = true;
     }
-  else
+  /* We don't want to copy the incoming pointer to a parmreg expected
+     to hold the value rather than the pointer.  */
+  else if (!data->passed_pointer || parmreg != from_expand)
     emit_move_insn (parmreg, validated_mem);
 
   /* If we were passed a pointer but the actual value can safely live
@@ -3278,12 +3304,16 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
   if (data->passed_pointer
       && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode))
     {
+      rtx src = DECL_RTL (parm);
+
       /* We can't use nominal_mode, because it will have been set to
 	 Pmode above.  We must use the actual mode of the parm.  */
       if (from_expand)
 	{
 	  parmreg = from_expand;
 	  gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm)));
+	  src = gen_rtx_MEM (GET_MODE (parmreg), validated_mem);
+	  set_mem_attributes (src, parm, 1);
 	}
       else if (use_register_for_decl (parm))
 	{
@@ -3302,14 +3332,14 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 	  set_mem_attributes (parmreg, parm, 1);
 	}
 
-      if (GET_MODE (parmreg) != GET_MODE (DECL_RTL (parm)))
+      if (GET_MODE (parmreg) != GET_MODE (src))
 	{
-	  rtx tempreg = gen_reg_rtx (GET_MODE (DECL_RTL (parm)));
+	  rtx tempreg = gen_reg_rtx (GET_MODE (src));
 	  int unsigned_p = TYPE_UNSIGNED (TREE_TYPE (parm));
 
 	  push_to_sequence2 (all->first_conversion_insn,
 			     all->last_conversion_insn);
-	  emit_move_insn (tempreg, DECL_RTL (parm));
+	  emit_move_insn (tempreg, src);
 	  tempreg = convert_to_mode (GET_MODE (parmreg), tempreg, unsigned_p);
 	  emit_move_insn (parmreg, tempreg);
 	  all->first_conversion_insn = get_insns ();
@@ -3318,8 +3348,10 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 
 	  did_conversion = true;
 	}
+      else if (GET_MODE (parmreg) == BLKmode)
+	gcc_assert (parm_maybe_byref_p (parm));
       else
-	emit_move_insn (parmreg, DECL_RTL (parm));
+	emit_move_insn (parmreg, src);
 
       SET_DECL_RTL (parm, parmreg);
 
@@ -3495,11 +3527,21 @@ assign_parms_unsplit_complex (struct assign_parm_data_all *all,
 	  imag = DECL_RTL (fnargs[i + 1]);
 	  if (inner != GET_MODE (real))
 	    {
-	      real = gen_lowpart_SUBREG (inner, real);
-	      imag = gen_lowpart_SUBREG (inner, imag);
+	      real = simplify_gen_subreg (inner, real, GET_MODE (real),
+					  subreg_lowpart_offset
+					  (inner, GET_MODE (real)));
+	      imag = simplify_gen_subreg (inner, imag, GET_MODE (imag),
+					  subreg_lowpart_offset
+					  (inner, GET_MODE (imag)));
 	    }
 
-	  if (TREE_ADDRESSABLE (parm))
+	  if ((tmp = rtl_for_parm (all, parm)) != NULL_RTX
+	      && rtx_equal_p (real,
+			      read_complex_part (tmp, false))
+	      && rtx_equal_p (imag,
+			      read_complex_part (tmp, true)))
+	    ; /* We now have the right rtl in tmp.  */
+	  else if (TREE_ADDRESSABLE (parm))
 	    {
 	      rtx rmem, imem;
 	      HOST_WIDE_INT size = int_size_in_bytes (TREE_TYPE (parm));
@@ -3645,7 +3687,7 @@ assign_bounds (vec<bounds_parm_data> &bndargs,
 	  assign_parm_setup_block (&all, pbdata->bounds_parm,
 				   &pbdata->parm_data);
 	else if (pbdata->parm_data.passed_pointer
-		 || use_register_for_decl (pbdata->bounds_parm))
+		 || use_register_for_parm_decl (&all, pbdata->bounds_parm))
 	  assign_parm_setup_reg (&all, pbdata->bounds_parm,
 				 &pbdata->parm_data);
 	else
@@ -5207,6 +5249,10 @@ expand_function_start (tree subr)
       SET_DECL_RTL (parm, local);
       mark_reg_pointer (local, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (parm))));
 
+      if (GET_MODE (local) != Pmode)
+	local = convert_to_mode (Pmode, local,
+				 TYPE_UNSIGNED (TREE_TYPE (parm)));
+
       insn = emit_move_insn (local, chain);
 
       /* Mark the register as eliminable, similar to parameters.  */
diff --git a/gcc/stmt.c b/gcc/stmt.c
index 391686c..e7f7dd4 100644
--- a/gcc/stmt.c
+++ b/gcc/stmt.c
@@ -891,7 +891,7 @@ emit_case_decision_tree (tree index_expr, tree index_type,
     {
       index = copy_to_reg (index);
       if (TREE_CODE (index_expr) == SSA_NAME)
-	set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (index_expr), index);
+	set_reg_attrs_for_decl_rtl (index_expr, index);
     }
 
   balance_case_nodes (&case_list, NULL);
diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index a622728..08ce72c 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -36,6 +36,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-iterator.h"
 #include "tree-ssa-live.h"
 #include "tree-ssa-coalesce.h"
+#include "cfgexpand.h"
 #include "explow.h"
 #include "diagnostic-core.h"
 
@@ -1379,10 +1380,15 @@ gimple_can_coalesce_p (tree name1, tree name2)
       /* Check that the promoted modes are the same.  We don't want to
 	 coalesce if the promoted modes would be different.  Only
 	 PARM_DECLs and RESULT_DECLs have different promotion rules,
-	 so skip the test if we both are variables or anonymous
-	 SSA_NAMEs.  */
+	 so skip the test if both are variables, or both are anonymous
+	 SSA_NAMEs.  Now, if a parm or result has BLKmode, do not
+	 coalesce its SSA versions with those of any other variables,
+	 because it may be passed by reference.  */
       return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
-	|| promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL);
+	|| (/* The case var1 == var2 is already covered above.  */
+	    !parm_maybe_byref_p (var1)
+	    && !parm_maybe_byref_p (var2)
+	    && promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL));
     }
 
   /* If the types are not the same, check for a canonical type match.  This


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-03 23:46                                               ` Alexandre Oliva
@ 2015-08-04  9:48                                                 ` Richard Biener
  2015-08-05  0:39                                                   ` Alexandre Oliva
  2015-08-10  8:24                                                 ` James Greenhalgh
  1 sibling, 1 reply; 127+ messages in thread
From: Richard Biener @ 2015-08-04  9:48 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: H.J. Lu, Segher Boessenkool, Jeff Law, GCC Patches,
	Christophe Lyon, David Edelsohn, Eric Botcazou

On Tue, Aug 4, 2015 at 1:45 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Jul 30, 2015, "H.J. Lu" <hjl.tools@gmail.com> wrote:
>
>> aoliva/pr64164  is fine on x32.
>
> Thanks.  I have made a large number of changes since you tested it,
> fixing all the reported issues and then some.  Now, x86_64-linux-gnu
> (-m64 and -m32), i686-pc-linux-gnu, powerpc64-linux-gnu and
> powerpc64el-linux-gnu pass regstrap (r226317), and the many tens of
> targets I cross-tested still get the same 'make all' errors that the
> pristine tree did.
>
> The bulk of the incremental changes had to do with handling splitting
> and unsplitting of complex args, and BLKmode types passed by reference.
>
> For the former, I had naively assumed complex args would always be
> represented as CONCATs.  I now use read_complex_part to split the
> expand-assigned parm rtl into components, and I use it again at unsplit
> time to make sure the expand-assigned parm rtl matches that of the split
> components.
>
> The latter, in turn, almost required me to give up the entire notion of
> coalescing parms.  The problem is that, for arguments passed as a
> BLKmode pointer, copying the argument to an expand-assigned stack slot
> is not only wasteful, it doesn't really work: we'd expand the copy in
> assign_parms* and insert it before the stack allocation performed by
> expand_user_vars, so we'd initialize the pseudo holding the address of
> the stack slot only after its first use.
>
> The solution I came up with was to detect BLKmode parms and NOT allow
> them to coalesce with other variables, so that we can easily detect
> partitions that need special handling.  The special handling amounts to
> not allocating a stack slot for the partition holding the param default
> def, and leaving it for assign_parms to do so.  We do, however, allocate
> a MEM, in theory assigned to all partition members (though they're all
> the same parm ATM, but not necessarily all SSA_NAMEs of the same parm,
> since optimization causes different versions to conflict).  We leave the
> address of that MEM unset, so that assign_parm knows it is to fill it in
> with a pseudo holding a copy of the incoming parm address, or with the
> address of the local stack slot created to hold a copy of the parameter.
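
For illustration only, the "do not coalesce maybe-byref parms" rule described
above can be sketched outside of GCC as a small stand-alone model.  Everything
named toy_* below is hypothetical and exists only for this sketch: the enums
stand in for tree codes and machine modes, a NULL decl stands in for an
anonymous SSA_NAME, and the promoted modes are passed in directly rather than
computed by promote_ssa_mode.  The var1 == var2 case, which the patch handles
earlier in gimple_can_coalesce_p, is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for GCC tree codes and machine modes.  */
enum toy_kind { TOY_VAR, TOY_PARM, TOY_RESULT };
enum toy_mode { TOY_SI, TOY_DI, TOY_BLK };

struct toy_decl
{
  enum toy_kind kind;
  enum toy_mode mode;	/* Mode of the decl's type.  */
};

/* Models parm_maybe_byref_p: true only for a parm or result whose
   type has BLKmode.  A NULL VAR models an anonymous SSA name.  */
static bool
toy_parm_maybe_byref_p (const struct toy_decl *var)
{
  if (!var || var->kind == TOY_VAR)
    return false;

  assert (var->kind == TOY_PARM || var->kind == TOY_RESULT);

  return var->mode == TOY_BLK;
}

/* Models the promoted-mode test at the end of gimple_can_coalesce_p:
   plain variables and anonymous names skip the test; otherwise both
   sides must be non-byref and agree on the promoted mode.  */
static bool
toy_modes_allow_coalescing (const struct toy_decl *var1,
			    enum toy_mode promoted1,
			    const struct toy_decl *var2,
			    enum toy_mode promoted2)
{
  bool plain1 = !var1 || var1->kind == TOY_VAR;
  bool plain2 = !var2 || var2->kind == TOY_VAR;

  return (plain1 && plain2)
	 || (!toy_parm_maybe_byref_p (var1)
	     && !toy_parm_maybe_byref_p (var2)
	     && promoted1 == promoted2);
}
```

In this model a BLKmode parm never coalesces with another variable, even when
the promoted modes agree, while a scalar parm still coalesces with a variable
of the same promoted mode.
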
>
> It took me several rounds of trial and error to get these to pass all
> complex and vector tests on x86 and ppc.  The last remaining failure was
> a regression in gcc.target/powerpc/pr16458-4, caused by our inability to
> hold SSA_NAMEs as REG_EXPRs in pseudos.  emit_case_decision_tree
> attempted to preserve the decl as the REG_EXPR of a pseudo holding a
> copy of the switch expr, and its type appears to be used to decide
> whether to emit signed or unsigned compares, even though we explicitly
> pass mode and unsignedp down to the cmp_and_jump expanders.  I figured
> there was no good reason to prevent SSA_NAMEs in REG_EXPRs, just like
> MEM_EXPRs, so I went ahead and adjusted the DECL_MODE that prevented it,
> and now we expand the case decision tree as intended.
>
> Here's a consolidated patch, followed by the consolidated incremental
> patch.  I don't intend to install it this week, even if approved,
> because I'm going to be away Aug 5-10, and I would like to be around
> should any further problems arise.  So, ok to install when I return?

Ok.

Though I wonder whether it would be worth splitting the patch into a first one
that disables coalescing of parms (their default defs(?)) and a followup that
implements the support for it.

Especially as, given the existing way GCC handles incoming parameters
during RTL expansion, there might be code generation issues that make
this coalescing not wanted (for certain kinds of parameters?).

So - is my observation correct that this is only about coalescing of the
default defs of parameters, not other SSA names based on parameter decls?
Do you think this splitting is feasible and my concern about the code-gen issues
warranted?

I don't want to force extra work on you but if you think this may make sense
(if only to better bisect code-gen issues to the relevant part) you may want to
give that a try.  If not - just go ahead with the whole patch.

Thanks,
Richard.

> [PR64164] Drop copyrename, use coalescible partition as base when optimizing.
>
> From: Alexandre Oliva <aoliva@redhat.com>
>
> for  gcc/ChangeLog
>
>         PR rtl-optimization/64164
>         PR bootstrap/66978
>         PR middle-end/66983
>         PR rtl-optimization/67000
>         PR middle-end/67034
>         PR middle-end/67035
>         * Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
>         * tree-ssa-copyrename.c: Removed.
>         * opts.c (default_options_table): Drop -ftree-copyrename.  Add
>         -ftree-coalesce-vars.
>         * passes.def: Drop all occurrences of pass_rename_ssa_copies.
>         * common.opt (ftree-copyrename): Ignore.
>         (ftree-coalesce-inlined-vars): Likewise.
>         * doc/invoke.texi: Remove the ignored options above.
>         * gimple-expr.h (gimple_can_coalesce_p): Move declaration
>         * tree-ssa-coalesce.h: ... here.
>         * tree-ssa-uncprop.c: Include tree-ssa-coalesce.h and other
>         headers required by it.
>         * gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
>         across variables when flag_tree_coalesce_vars.  Check register
>         use and promoted modes to allow coalescing.  Do not coalesce
>         maybe-byref parms with SSA_NAMEs of other variables, or
>         anonymous SSA_NAMEs.  Moved to tree-ssa-coalesce.c.
>         * tree-ssa-live.c (struct tree_int_map_hasher): Move along
>         with its member functions to tree-ssa-coalesce.c.
>         (var_map_base_init): Likewise.  Renamed to
>         compute_samebase_partition_bases.
>         (partition_view_normal): Drop want_bases parameter.
>         (partition_view_bitmap): Likewise.
>         * tree-ssa-live.h: Adjust declarations.
>         * tree-ssa-coalesce.c: Include explow.h and cfgexpand.h.
>         (build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs'
>         default defs at the entry point.
>         (dump_part_var_map): New.
>         (compute_optimized_partition_bases): New, called by...
>         (coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
>         of compute_samebase_partition_bases.  Adjust.
>         * alias.c (nonoverlapping_memrefs_p): Disregard gimple-regs.
>         * cfgexpand.c (leader_merge, parm_maybe_byref_p): New.
>         (ssa_default_def_partition): New.
>         (get_rtl_for_parm_ssa_default_def): New.
>         (align_local_variable, add_stack_var): Support anonymous SSA
>         names.
>         (defer_stack_allocation): Likewise.  Declare earlier.
>         (set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
>         vars.  Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
>         Do not record deferred-allocation marker in
>         SA.partition_to_pseudo.
>         (expand_stack_vars): Adjust check for the marker in it.
>         (expand_one_stack_var_at): Handle anonymous SSA_NAMEs.  Drop
>         redundant MEM attr setting.
>         (expand_one_stack_var_1): Handle anonymous SSA_NAMEs.  Renamed
>         from...
>         (expand_one_stack_var): ... this.  New wrapper to check and
>         skip already expanded SSA partitions.
>         (record_alignment_for_reg_var): New, factored out of...
>         (expand_one_var): ... this.
>         (expand_one_ssa_partition): New.
>         (adjust_one_expanded_partition_var): New.
>         (expand_one_register_var): Check and skip already expanded SSA
>         partitions.
>         (expand_used_vars): Don't create DECLs for anonymous SSA
>         names.  Expand all SSA partitions, then adjust all SSA names.
>         (pass::execute): Replace the loops that set
>         SA.partition_to_pseudo from partition leaders and cleared
>         DECL_RTL for multi-location variables, and that which used to
>         rename vars and set attrs, with one that clears DECL_RTL and
>         checks that PARMs and RESULTs default_defs match DECL_RTL.
>         * cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
>         * emit-rtl.c: Include stor-layout.h.
>         (set_reg_attrs_for_parm): Handle NULL decl.
>         (set_reg_attrs_for_decl_rtl): Take mode from expression if
>         it's not a DECL.
>         * stmt.c (emit_case_decision_tree): Pass it the SSA_NAME
>         rather than its possibly-NULL DECL.
>         * explow.c (promote_ssa_mode): New.
>         * explow.h (promote_ssa_mode): Declare.
>         * expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
>         (read_complex_part): Export.
>         * expr.h (read_complex_part): Declare.
>         * cfgexpand.h (parm_maybe_byref_p): Declare.
>         * function.c: Include cfgexpand.h.
>         (use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
>         (use_register_for_parm_decl): Wrapper for the above to
>         special-case the result_ptr.
>         (rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
>         (split_complex_args): Take assign_parm_data_all argument.
>         Pass it to rtl_for_parm.  Set up rtl and context for split
>         args.  Reset complex parm before fetching its default decl
>         rtl.
>         (assign_parms_unsplit_complex): Use the default-def complex
>         parm rtl if it matches the components.
>         (assign_parms_augmented_arg_list): Adjust.
>         (maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
>         multiple locations.  Recognize split complex args.
>         (assign_parm_adjust_stack_rtl): Add all and parm arguments,
>         for rtl_for_parm.  For SSA-assigned parms, zero stack_parm.
>         (assign_parm_setup_block): Prefer SSA-assigned location, and
>         fill in its address if the memory location of a maybe-byref
>         parm was not assigned by cfgexpand.
>         (assign_parm_setup_reg): Likewise.  Adjust its mode as
>         needed.  Use entry_parm for equiv if stack_parm is NULL.  Make
>         sure passed_pointer parms don't need conversion.  Copy address
>         or value as needed.
>         (assign_parm_setup_stack): Prefer SSA-assigned location.
>         (assign_parms): Maybe reset DECL_RTL of params.  Adjust stack
>         rtl before testing for pointer bounds.  Special-case result_ptr.
>         (expand_function_start): Maybe reset DECL_RTL of result.
>         Prefer SSA-assigned location for result and static chain.
>         Factor out DECL_RESULT and SET_DECL_RTL.  Convert static chain
>         to Pmode if needed, from H.J. Lu  <hongjiu.lu@intel.com>.
>         * tree-outof-ssa.c (insert_value_copy_on_edge): Handle
>         anonymous SSA names.  Use promote_ssa_mode.
>         (get_temp_reg): Likewise.
>         (remove_ssa_form): Adjust.
>         * stor-layout.c (layout_decl): Don't set mem attributes of
>         non-MEMs.
>         * var-tracking.c (dataflow_set_clear_at_call): Take call_insn
>         and get its reg_usage for reg invalidation.
>         (compute_bb_dataflow): Pass it insn.
>         (emit_notes_in_bb): Likewise.
>
> for  gcc/testsuite/ChangeLog
>
>         * gcc.dg/guality/pr54200.c: Add -fno-tree-coalesce-vars.
>         * gcc.dg/ssp-1.c: Make counter a register.
>         * gcc.dg/ssp-2.c: Likewise.
>         * gcc.dg/torture/parm-coalesce.c: New.
> ---
>  gcc/Makefile.in                              |    1
>  gcc/alias.c                                  |   13 +
>  gcc/cfgexpand.c                              |  471 +++++++++++++++++++-------
>  gcc/cfgexpand.h                              |    3
>  gcc/common.opt                               |   12 -
>  gcc/doc/invoke.texi                          |   48 +--
>  gcc/emit-rtl.c                               |    8
>  gcc/explow.c                                 |   29 ++
>  gcc/explow.h                                 |    3
>  gcc/expr.c                                   |   41 +-
>  gcc/expr.h                                   |    1
>  gcc/function.c                               |  341 +++++++++++++++----
>  gcc/gimple-expr.c                            |   39 --
>  gcc/gimple-expr.h                            |    1
>  gcc/opts.c                                   |    2
>  gcc/passes.def                               |    5
>  gcc/stmt.c                                   |    2
>  gcc/stor-layout.c                            |    3
>  gcc/testsuite/gcc.dg/guality/pr54200.c       |    2
>  gcc/testsuite/gcc.dg/ssp-1.c                 |    2
>  gcc/testsuite/gcc.dg/ssp-2.c                 |    2
>  gcc/testsuite/gcc.dg/torture/parm-coalesce.c |   40 ++
>  gcc/tree-outof-ssa.c                         |   16 -
>  gcc/tree-ssa-coalesce.c                      |  384 +++++++++++++++++++++
>  gcc/tree-ssa-coalesce.h                      |    1
>  gcc/tree-ssa-copyrename.c                    |  475 --------------------------
>  gcc/tree-ssa-live.c                          |   99 -----
>  gcc/tree-ssa-live.h                          |    4
>  gcc/tree-ssa-uncprop.c                       |    5
>  gcc/var-tracking.c                           |   12 -
>  30 files changed, 1187 insertions(+), 878 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/torture/parm-coalesce.c
>  delete mode 100644 gcc/tree-ssa-copyrename.c
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index be259e8..6079acc 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1444,7 +1444,6 @@ OBJS = \
>         tree-ssa-ccp.o \
>         tree-ssa-coalesce.o \
>         tree-ssa-copy.o \
> -       tree-ssa-copyrename.o \
>         tree-ssa-dce.o \
>         tree-ssa-dom.o \
>         tree-ssa-dse.o \
> diff --git a/gcc/alias.c b/gcc/alias.c
> index fa7d5d8..4681e3f 100644
> --- a/gcc/alias.c
> +++ b/gcc/alias.c
> @@ -2507,6 +2507,19 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
>    if (! DECL_P (exprx) || ! DECL_P (expry))
>      return 0;
>
> +  /* If we refer to different gimple registers, or one gimple register
> +     and one non-gimple-register, we know they can't overlap.  First,
> +     gimple registers don't have their addresses taken.  Now, there
> +     could be more than one stack slot for (different versions of) the
> +     same gimple register, but we can presumably tell they don't
> +     overlap based on offsets from stack base addresses elsewhere.
> +     It's important that we don't proceed to DECL_RTL, because gimple
> +     registers may not pass DECL_RTL_SET_P, and make_decl_rtl won't be
> +     able to do anything about them since no SSA information will have
> +     remained to guide it.  */
> +  if (is_gimple_reg (exprx) || is_gimple_reg (expry))
> +    return exprx != expry;
> +
>    /* With invalid code we can end up storing into the constant pool.
>       Bail out to avoid ICEing when creating RTL for this.
>       See gfortran.dg/lto/20091028-2_0.f90.  */
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index a047632..8f6caf6 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -97,6 +97,8 @@ gimple currently_expanding_gimple_stmt;
>
>  static rtx expand_debug_expr (tree);
>
> +static bool defer_stack_allocation (tree, bool);
> +
>  /* Return an expression tree corresponding to the RHS of GIMPLE
>     statement STMT.  */
>
> @@ -150,21 +152,149 @@ gimple_assign_rhs_to_tree (gimple stmt)
>
>  #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
>
> +/* Choose either CUR or NEXT as the leader DECL for a partition.
> +   Prefer ignored decls, to simplify debug dumps and reduce ambiguity
> +   out of the same user variable being in multiple partitions (this is
> +   less likely for compiler-introduced temps).  */
> +
> +static tree
> +leader_merge (tree cur, tree next)
> +{
> +  if (cur == NULL || cur == next)
> +    return next;
> +
> +  if (DECL_P (cur) && DECL_IGNORED_P (cur))
> +    return cur;
> +
> +  if (DECL_P (next) && DECL_IGNORED_P (next))
> +    return next;
> +
> +  return cur;
> +}
> +
> +/* Return true if VAR is a PARM_DECL or a RESULT_DECL of type BLKmode.
> +   Such parameters are likely passed as a pointer to the value, rather
> +   than as a value, and so we must not coalesce them, nor allocate
> +   stack space for them before determining the calling conventions for
> +   them.  For their SSA_NAMEs, expand_one_ssa_partition emits RTL as
> +   MEMs with pc_rtx as the address, and then it replaces the pc_rtx
> +   with NULL so as to make sure the MEM is not used before it is
> +   adjusted in assign_parm_setup_reg.  */
> +
> +bool
> +parm_maybe_byref_p (tree var)
> +{
> +  if (!var || VAR_P (var))
> +    return false;
> +
> +  gcc_assert (TREE_CODE (var) == PARM_DECL
> +             || TREE_CODE (var) == RESULT_DECL);
> +
> +  return TYPE_MODE (TREE_TYPE (var)) == BLKmode;
> +}
> +
> +/* Return the partition of the default SSA_DEF for decl VAR.  */
> +
> +static int
> +ssa_default_def_partition (tree var)
> +{
> +  tree name = ssa_default_def (cfun, var);
> +
> +  if (!name)
> +    return NO_PARTITION;
> +
> +  return var_to_partition (SA.map, name);
> +}
> +
> +/* Return the RTL for the default SSA def of a PARM or RESULT, if
> +   there is one.  */
> +
> +rtx
> +get_rtl_for_parm_ssa_default_def (tree var)
> +{
> +  gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL);
> +
> +  if (!is_gimple_reg (var))
> +    return NULL_RTX;
> +
> +  /* If we've already determined RTL for the decl, use it.  This is
> +     not just an optimization: if VAR is a PARM whose incoming value
> +     is unused, we won't find a default def to use its partition, but
> +     we still want to use the location of the parm, if it was used at
> +     all.  During assign_parms, until a location is assigned for the
> +     VAR, RTL can only be set for a parm or result if we're not coalescing
> +     across variables, when we know we're coalescing all SSA_NAMEs of
> +     each parm or result, and we're not coalescing them with names
> +     pertaining to other variables, such as other parms' default
> +     defs.  */
> +  if (DECL_RTL_SET_P (var))
> +    {
> +      gcc_assert (DECL_RTL (var) != pc_rtx);
> +      return DECL_RTL (var);
> +    }
> +
> +  int part = ssa_default_def_partition (var);
> +  if (part == NO_PARTITION)
> +    return NULL_RTX;
> +
> +  return SA.partition_to_pseudo[part];
> +}
> +
>  /* Associate declaration T with storage space X.  If T is no
>     SSA name this is exactly SET_DECL_RTL, otherwise make the
>     partition of T associated with X.  */
>  static inline void
>  set_rtl (tree t, rtx x)
>  {
> +  if (x && SSAVAR (t))
> +    {
> +      bool skip = false;
> +      tree cur = NULL_TREE;
> +
> +      if (MEM_P (x))
> +       cur = MEM_EXPR (x);
> +      else if (REG_P (x))
> +       cur = REG_EXPR (x);
> +      else if (GET_CODE (x) == CONCAT
> +              && REG_P (XEXP (x, 0)))
> +       cur = REG_EXPR (XEXP (x, 0));
> +      else if (GET_CODE (x) == PARALLEL)
> +       cur = REG_EXPR (XVECEXP (x, 0, 0));
> +      else if (x == pc_rtx)
> +       skip = true;
> +      else
> +       gcc_unreachable ();
> +
> +      tree next = skip ? cur : leader_merge (cur, SSAVAR (t));
> +
> +      if (cur != next)
> +       {
> +         if (MEM_P (x))
> +           set_mem_attributes (x, next, true);
> +         else
> +           set_reg_attrs_for_decl_rtl (next, x);
> +       }
> +    }
> +
>    if (TREE_CODE (t) == SSA_NAME)
>      {
> -      SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x;
> -      if (x && !MEM_P (x))
> -       set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x);
> -      /* For the benefit of debug information at -O0 (where vartracking
> -         doesn't run) record the place also in the base DECL if it's
> -        a normal variable (not a parameter).  */
> -      if (x && x != pc_rtx && TREE_CODE (SSA_NAME_VAR (t)) == VAR_DECL)
> +      int part = var_to_partition (SA.map, t);
> +      if (part != NO_PARTITION)
> +       {
> +         if (SA.partition_to_pseudo[part])
> +           gcc_assert (SA.partition_to_pseudo[part] == x);
> +         else if (x != pc_rtx)
> +           SA.partition_to_pseudo[part] = x;
> +       }
> +      /* For the benefit of debug information at -O0 (where
> +         vartracking doesn't run) record the place also in the base
> +         DECL.  For PARMs and RESULTs, we may end up resetting these
> +         in function.c:maybe_reset_rtl_for_parm, but in some rare
> +         cases we may need them (unused and overwritten incoming
> +         value, that at -O0 must share the location with the other
> +         uses in spite of the missing default def), and this may be
> +         the only chance to preserve them.  */
> +      if (x && x != pc_rtx && SSA_NAME_VAR (t))
>         {
>           tree var = SSA_NAME_VAR (t);
>           /* If we don't yet have something recorded, just record it now.  */
> @@ -248,8 +378,15 @@ static bool has_short_buffer;
>  static unsigned int
>  align_local_variable (tree decl)
>  {
> -  unsigned int align = LOCAL_DECL_ALIGNMENT (decl);
> -  DECL_ALIGN (decl) = align;
> +  unsigned int align;
> +
> +  if (TREE_CODE (decl) == SSA_NAME)
> +    align = TYPE_ALIGN (TREE_TYPE (decl));
> +  else
> +    {
> +      align = LOCAL_DECL_ALIGNMENT (decl);
> +      DECL_ALIGN (decl) = align;
> +    }
>    return align / BITS_PER_UNIT;
>  }
>
> @@ -315,12 +452,15 @@ add_stack_var (tree decl)
>    decl_to_stack_part->put (decl, stack_vars_num);
>
>    v->decl = decl;
> -  v->size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (decl)));
> +  tree size = TREE_CODE (decl) == SSA_NAME
> +    ? TYPE_SIZE_UNIT (TREE_TYPE (decl))
> +    : DECL_SIZE_UNIT (decl);
> +  v->size = tree_to_uhwi (size);
>    /* Ensure that all variables have size, so that &a != &b for any two
>       variables that are simultaneously live.  */
>    if (v->size == 0)
>      v->size = 1;
> -  v->alignb = align_local_variable (SSAVAR (decl));
> +  v->alignb = align_local_variable (decl);
>    /* An alignment of zero can mightily confuse us later.  */
>    gcc_assert (v->alignb != 0);
>
> @@ -862,7 +1002,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
>    gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
>
>    x = plus_constant (Pmode, base, offset);
> -  x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x);
> +  x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
> +                  ? TYPE_MODE (TREE_TYPE (decl))
> +                  : DECL_MODE (SSAVAR (decl)), x);
>
>    if (TREE_CODE (decl) != SSA_NAME)
>      {
> @@ -884,7 +1026,6 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
>        DECL_USER_ALIGN (decl) = 0;
>      }
>
> -  set_mem_attributes (x, SSAVAR (decl), true);
>    set_rtl (decl, x);
>  }
>
> @@ -950,9 +1091,9 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
>           /* Skip variables that have already had rtl assigned.  See also
>              add_stack_var where we perpetrate this pc_rtx hack.  */
>           decl = stack_vars[i].decl;
> -         if ((TREE_CODE (decl) == SSA_NAME
> -             ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)]
> -             : DECL_RTL (decl)) != pc_rtx)
> +         if (TREE_CODE (decl) == SSA_NAME
> +             ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)] != NULL_RTX
> +             : DECL_RTL (decl) != pc_rtx)
>             continue;
>
>           large_size += alignb - 1;
> @@ -981,9 +1122,9 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
>        /* Skip variables that have already had rtl assigned.  See also
>          add_stack_var where we perpetrate this pc_rtx hack.  */
>        decl = stack_vars[i].decl;
> -      if ((TREE_CODE (decl) == SSA_NAME
> -          ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)]
> -          : DECL_RTL (decl)) != pc_rtx)
> +      if (TREE_CODE (decl) == SSA_NAME
> +         ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)] != NULL_RTX
> +         : DECL_RTL (decl) != pc_rtx)
>         continue;
>
>        /* Check the predicate to see whether this variable should be
> @@ -1099,13 +1240,22 @@ account_stack_vars (void)
>     to a variable to be allocated in the stack frame.  */
>
>  static void
> -expand_one_stack_var (tree var)
> +expand_one_stack_var_1 (tree var)
>  {
>    HOST_WIDE_INT size, offset;
>    unsigned byte_align;
>
> -  size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
> -  byte_align = align_local_variable (SSAVAR (var));
> +  if (TREE_CODE (var) == SSA_NAME)
> +    {
> +      tree type = TREE_TYPE (var);
> +      size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
> +      byte_align = TYPE_ALIGN_UNIT (type);
> +    }
> +  else
> +    {
> +      size = tree_to_uhwi (DECL_SIZE_UNIT (var));
> +      byte_align = align_local_variable (var);
> +    }
>
>    /* We handle highly aligned variables in expand_stack_vars.  */
>    gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
> @@ -1116,6 +1266,27 @@ expand_one_stack_var (tree var)
>                            crtl->max_used_stack_slot_alignment, offset);
>  }
>
> +/* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are
> +   already assigned some MEM.  */
> +
> +static void
> +expand_one_stack_var (tree var)
> +{
> +  if (TREE_CODE (var) == SSA_NAME)
> +    {
> +      int part = var_to_partition (SA.map, var);
> +      if (part != NO_PARTITION)
> +       {
> +         rtx x = SA.partition_to_pseudo[part];
> +         gcc_assert (x);
> +         gcc_assert (MEM_P (x));
> +         return;
> +       }
> +    }
> +
> +  return expand_one_stack_var_1 (var);
> +}
> +
>  /* A subroutine of expand_one_var.  Called to assign rtl to a VAR_DECL
>     that will reside in a hard register.  */
>
> @@ -1125,13 +1296,136 @@ expand_one_hard_reg_var (tree var)
>    rest_of_decl_compilation (var, 0, 0);
>  }
>
> +/* Record the alignment requirements of some variable assigned to a
> +   pseudo.  */
> +
> +static void
> +record_alignment_for_reg_var (unsigned int align)
> +{
> +  if (SUPPORTS_STACK_ALIGNMENT
> +      && crtl->stack_alignment_estimated < align)
> +    {
> +      /* stack_alignment_estimated shouldn't change after stack
> +         realign decision made */
> +      gcc_assert (!crtl->stack_realign_processed);
> +      crtl->stack_alignment_estimated = align;
> +    }
> +
> +  /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
> +     So here we only make sure stack_alignment_needed >= align.  */
> +  if (crtl->stack_alignment_needed < align)
> +    crtl->stack_alignment_needed = align;
> +  if (crtl->max_used_stack_slot_alignment < align)
> +    crtl->max_used_stack_slot_alignment = align;
> +}
> +
> +/* Create RTL for an SSA partition.  */
> +
> +static void
> +expand_one_ssa_partition (tree var)
> +{
> +  int part = var_to_partition (SA.map, var);
> +  gcc_assert (part != NO_PARTITION);
> +
> +  if (SA.partition_to_pseudo[part])
> +    return;
> +
> +  unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (var),
> +                                         TYPE_MODE (TREE_TYPE (var)),
> +                                         TYPE_ALIGN (TREE_TYPE (var)));
> +
> +  /* If the variable alignment is very large we'll dynamically allocate
> +     it, which means that in-frame portion is just a pointer.  */
> +  if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
> +    align = POINTER_SIZE;
> +
> +  record_alignment_for_reg_var (align);
> +
> +  if (!use_register_for_decl (var))
> +    {
> +      if (parm_maybe_byref_p (SSA_NAME_VAR (var))
> +         && ssa_default_def_partition (SSA_NAME_VAR (var)) == part)
> +       {
> +         expand_one_stack_var_at (var, pc_rtx, 0, 0);
> +         rtx x = SA.partition_to_pseudo[part];
> +         gcc_assert (GET_CODE (x) == MEM);
> +         gcc_assert (GET_MODE (x) == BLKmode);
> +         gcc_assert (XEXP (x, 0) == pc_rtx);
> +         /* Reset the address, so that any attempt to use it will
> +            ICE.  It will be adjusted in assign_parm_setup_reg.  */
> +         XEXP (x, 0) = NULL_RTX;
> +       }
> +      else if (defer_stack_allocation (var, true))
> +       add_stack_var (var);
> +      else
> +       expand_one_stack_var_1 (var);
> +      return;
> +    }
> +
> +  machine_mode reg_mode = promote_ssa_mode (var, NULL);
> +
> +  rtx x = gen_reg_rtx (reg_mode);
> +
> +  set_rtl (var, x);
> +}
> +
> +/* Record the association between the RTL generated for a partition
> +   and the underlying variable of the SSA_NAME.  */
> +
> +static void
> +adjust_one_expanded_partition_var (tree var)
> +{
> +  if (!var)
> +    return;
> +
> +  tree decl = SSA_NAME_VAR (var);
> +
> +  int part = var_to_partition (SA.map, var);
> +  if (part == NO_PARTITION)
> +    return;
> +
> +  rtx x = SA.partition_to_pseudo[part];
> +
> +  if (!x)
> +    {
> +      /* This var will get a stack slot later.  */
> +      gcc_assert (defer_stack_allocation (var, true));
> +      return;
> +    }
> +
> +  set_rtl (var, x);
> +
> +  if (!REG_P (x))
> +    return;
> +
> +  /* Note if the object is a user variable.  */
> +  if (decl && !DECL_ARTIFICIAL (decl))
> +    mark_user_reg (x);
> +
> +  if (POINTER_TYPE_P (decl ? TREE_TYPE (decl) : TREE_TYPE (var)))
> +    mark_reg_pointer (x, get_pointer_alignment (var));
> +}
> +
>  /* A subroutine of expand_one_var.  Called to assign rtl to a VAR_DECL
>     that will reside in a pseudo register.  */
>
>  static void
>  expand_one_register_var (tree var)
>  {
> -  tree decl = SSAVAR (var);
> +  if (TREE_CODE (var) == SSA_NAME)
> +    {
> +      int part = var_to_partition (SA.map, var);
> +      if (part != NO_PARTITION)
> +       {
> +         rtx x = SA.partition_to_pseudo[part];
> +         gcc_assert (x);
> +         gcc_assert (REG_P (x));
> +         return;
> +       }
> +      gcc_unreachable ();
> +    }
> +
> +  tree decl = var;
>    tree type = TREE_TYPE (decl);
>    machine_mode reg_mode = promote_decl_mode (decl, NULL);
>    rtx x = gen_reg_rtx (reg_mode);
> @@ -1177,10 +1471,14 @@ expand_one_error_var (tree var)
>  static bool
>  defer_stack_allocation (tree var, bool toplevel)
>  {
> +  tree size_unit = TREE_CODE (var) == SSA_NAME
> +    ? TYPE_SIZE_UNIT (TREE_TYPE (var))
> +    : DECL_SIZE_UNIT (var);
> +
>    /* Whether the variable is small enough for immediate allocation not to be
>       a problem with regard to the frame size.  */
>    bool smallish
> -    = ((HOST_WIDE_INT) tree_to_uhwi (DECL_SIZE_UNIT (var))
> +    = ((HOST_WIDE_INT) tree_to_uhwi (size_unit)
>         < PARAM_VALUE (PARAM_MIN_SIZE_FOR_STACK_SHARING));
>
>    /* If stack protection is enabled, *all* stack variables must be deferred,
> @@ -1189,16 +1487,24 @@ defer_stack_allocation (tree var, bool toplevel)
>    if (flag_stack_protect || ((flag_sanitize & SANITIZE_ADDRESS) && ASAN_STACK))
>      return true;
>
> +  unsigned int align = TREE_CODE (var) == SSA_NAME
> +    ? TYPE_ALIGN (TREE_TYPE (var))
> +    : DECL_ALIGN (var);
> +
>    /* We handle "large" alignment via dynamic allocation.  We want to handle
>       this extra complication in only one place, so defer them.  */
> -  if (DECL_ALIGN (var) > MAX_SUPPORTED_STACK_ALIGNMENT)
> +  if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
>      return true;
>
> +  bool ignored = TREE_CODE (var) == SSA_NAME
> +    ? !SSAVAR (var) || DECL_IGNORED_P (SSA_NAME_VAR (var))
> +    : DECL_IGNORED_P (var);
> +
>    /* When optimization is enabled, DECL_IGNORED_P variables originally scoped
>       might be detached from their block and appear at toplevel when we reach
>       here.  We want to coalesce them with variables from other blocks when
>       the immediate contribution to the frame size would be noticeable.  */
> -  if (toplevel && optimize > 0 && DECL_IGNORED_P (var) && !smallish)
> +  if (toplevel && optimize > 0 && ignored && !smallish)
>      return true;
>
>    /* Variables declared in the outermost scope automatically conflict
> @@ -1265,21 +1571,7 @@ expand_one_var (tree var, bool toplevel, bool really_expand)
>         align = POINTER_SIZE;
>      }
>
> -  if (SUPPORTS_STACK_ALIGNMENT
> -      && crtl->stack_alignment_estimated < align)
> -    {
> -      /* stack_alignment_estimated shouldn't change after stack
> -         realign decision made */
> -      gcc_assert (!crtl->stack_realign_processed);
> -      crtl->stack_alignment_estimated = align;
> -    }
> -
> -  /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
> -     So here we only make sure stack_alignment_needed >= align.  */
> -  if (crtl->stack_alignment_needed < align)
> -    crtl->stack_alignment_needed = align;
> -  if (crtl->max_used_stack_slot_alignment < align)
> -    crtl->max_used_stack_slot_alignment = align;
> +  record_alignment_for_reg_var (align);
>
>    if (TREE_CODE (origvar) == SSA_NAME)
>      {
> @@ -1713,48 +2005,18 @@ expand_used_vars (void)
>    if (targetm.use_pseudo_pic_reg ())
>      pic_offset_table_rtx = gen_reg_rtx (Pmode);
>
> -  hash_map<tree, tree> ssa_name_decls;
>    for (i = 0; i < SA.map->num_partitions; i++)
>      {
>        tree var = partition_to_var (SA.map, i);
>
>        gcc_assert (!virtual_operand_p (var));
>
> -      /* Assign decls to each SSA name partition, share decls for partitions
> -         we could have coalesced (those with the same type).  */
> -      if (SSA_NAME_VAR (var) == NULL_TREE)
> -       {
> -         tree *slot = &ssa_name_decls.get_or_insert (TREE_TYPE (var));
> -         if (!*slot)
> -           *slot = create_tmp_reg (TREE_TYPE (var));
> -         replace_ssa_name_symbol (var, *slot);
> -       }
> -
> -      /* Always allocate space for partitions based on VAR_DECLs.  But for
> -        those based on PARM_DECLs or RESULT_DECLs and which matter for the
> -        debug info, there is no need to do so if optimization is disabled
> -        because all the SSA_NAMEs based on these DECLs have been coalesced
> -        into a single partition, which is thus assigned the canonical RTL
> -        location of the DECLs.  If in_lto_p, we can't rely on optimize,
> -        a function could be compiled with -O1 -flto first and only the
> -        link performed at -O0.  */
> -      if (TREE_CODE (SSA_NAME_VAR (var)) == VAR_DECL)
> -       expand_one_var (var, true, true);
> -      else if (DECL_IGNORED_P (SSA_NAME_VAR (var)) || optimize || in_lto_p)
> -       {
> -         /* This is a PARM_DECL or RESULT_DECL.  For those partitions that
> -            contain the default def (representing the parm or result itself)
> -            we don't do anything here.  But those which don't contain the
> -            default def (representing a temporary based on the parm/result)
> -            we need to allocate space just like for normal VAR_DECLs.  */
> -         if (!bitmap_bit_p (SA.partition_has_default_def, i))
> -           {
> -             expand_one_var (var, true, true);
> -             gcc_assert (SA.partition_to_pseudo[i]);
> -           }
> -       }
> +      expand_one_ssa_partition (var);
>      }
>
> +  for (i = 1; i < num_ssa_names; i++)
> +    adjust_one_expanded_partition_var (ssa_name (i));
> +
>    if (flag_stack_protect == SPCT_FLAG_STRONG)
>        gen_stack_protect_signal
>         = stack_protect_decl_p () || stack_protect_return_slot_p ();
> @@ -5928,35 +6190,6 @@ pass_expand::execute (function *fun)
>        parm_birth_insn = var_seq;
>      }
>
> -  /* Now that we also have the parameter RTXs, copy them over to our
> -     partitions.  */
> -  for (i = 0; i < SA.map->num_partitions; i++)
> -    {
> -      tree var = SSA_NAME_VAR (partition_to_var (SA.map, i));
> -
> -      if (TREE_CODE (var) != VAR_DECL
> -         && !SA.partition_to_pseudo[i])
> -       SA.partition_to_pseudo[i] = DECL_RTL_IF_SET (var);
> -      gcc_assert (SA.partition_to_pseudo[i]);
> -
> -      /* If this decl was marked as living in multiple places, reset
> -        this now to NULL.  */
> -      if (DECL_RTL_IF_SET (var) == pc_rtx)
> -       SET_DECL_RTL (var, NULL);
> -
> -      /* Some RTL parts really want to look at DECL_RTL(x) when x
> -        was a decl marked in REG_ATTR or MEM_ATTR.  We could use
> -        SET_DECL_RTL here making this available, but that would mean
> -        to select one of the potentially many RTLs for one DECL.  Instead
> -        of doing that we simply reset the MEM_EXPR of the RTL in question,
> -        then nobody can get at it and hence nobody can call DECL_RTL on it.  */
> -      if (!DECL_RTL_SET_P (var))
> -       {
> -         if (MEM_P (SA.partition_to_pseudo[i]))
> -           set_mem_expr (SA.partition_to_pseudo[i], NULL);
> -       }
> -    }
> -
>    /* If we have a class containing differently aligned pointers
>       we need to merge those into the corresponding RTL pointer
>       alignment.  */
> @@ -5964,7 +6197,6 @@ pass_expand::execute (function *fun)
>      {
>        tree name = ssa_name (i);
>        int part;
> -      rtx r;
>
>        if (!name
>           /* We might have generated new SSA names in
> @@ -5977,20 +6209,25 @@ pass_expand::execute (function *fun)
>        if (part == NO_PARTITION)
>         continue;
>
> -      /* Adjust all partition members to get the underlying decl of
> -        the representative which we might have created in expand_one_var.  */
> -      if (SSA_NAME_VAR (name) == NULL_TREE)
> +      gcc_assert (SA.partition_to_pseudo[part]
> +                 || defer_stack_allocation (name, true));
> +
> +      /* If this decl was marked as living in multiple places, reset
> +        this now to NULL.  */
> +      tree var = SSA_NAME_VAR (name);
> +      if (var && DECL_RTL_IF_SET (var) == pc_rtx)
> +       SET_DECL_RTL (var, NULL);
> +      /* Check that the pseudos chosen by assign_parms are those of
> +        the corresponding default defs.  */
> +      else if (SSA_NAME_IS_DEFAULT_DEF (name)
> +              && (TREE_CODE (var) == PARM_DECL
> +                  || TREE_CODE (var) == RESULT_DECL))
>         {
> -         tree leader = partition_to_var (SA.map, part);
> -         gcc_assert (SSA_NAME_VAR (leader) != NULL_TREE);
> -         replace_ssa_name_symbol (name, SSA_NAME_VAR (leader));
> +         rtx in = DECL_RTL_IF_SET (var);
> +         gcc_assert (in);
> +         rtx out = SA.partition_to_pseudo[part];
> +         gcc_assert (in == out || rtx_equal_p (in, out));
>         }
> -      if (!POINTER_TYPE_P (TREE_TYPE (name)))
> -       continue;
> -
> -      r = SA.partition_to_pseudo[part];
> -      if (REG_P (r))
> -       mark_reg_pointer (r, get_pointer_alignment (name));
>      }
>
>    /* If this function is `main', emit a call to `__main'
> diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
> index a0b6e3e..987cf356 100644
> --- a/gcc/cfgexpand.h
> +++ b/gcc/cfgexpand.h
> @@ -22,5 +22,8 @@ along with GCC; see the file COPYING3.  If not see
>
>  extern tree gimple_assign_rhs_to_tree (gimple);
>  extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
> +extern bool parm_maybe_byref_p (tree);
> +extern rtx get_rtl_for_parm_ssa_default_def (tree var);
> +
>
>  #endif /* GCC_CFGEXPAND_H */
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 8f25f8b..6d47e94 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2230,16 +2230,16 @@ Common Report Var(flag_tree_ch) Optimization
>  Enable loop header copying on trees
>
>  ftree-coalesce-inlined-vars
> -Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization
> -Enable coalescing of copy-related user variables that are inlined
> +Common Ignore RejectNegative
> +Does nothing.  Preserved for backward compatibility.
>
>  ftree-coalesce-vars
> -Common Report Var(flag_ssa_coalesce_vars,2) Optimization
> -Enable coalescing of all copy-related user variables
> +Common Report Var(flag_tree_coalesce_vars) Optimization
> +Enable SSA coalescing of user variables
>
>  ftree-copyrename
> -Common Report Var(flag_tree_copyrename) Optimization
> -Replace SSA temporaries with better names in copies
> +Common Ignore
> +Does nothing.  Preserved for backward compatibility.
>
>  ftree-copy-prop
>  Common Report Var(flag_tree_copy_prop) Optimization
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 7b5d86b..fcb1d36 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -341,7 +341,6 @@ Objective-C and Objective-C++ Dialects}.
>  -fdump-tree-phiprop@r{[}-@var{n}@r{]} @gol
>  -fdump-tree-phiopt@r{[}-@var{n}@r{]} @gol
>  -fdump-tree-forwprop@r{[}-@var{n}@r{]} @gol
> --fdump-tree-copyrename@r{[}-@var{n}@r{]} @gol
>  -fdump-tree-nrv -fdump-tree-vect @gol
>  -fdump-tree-sink @gol
>  -fdump-tree-sra@r{[}-@var{n}@r{]} @gol
> @@ -447,9 +446,8 @@ Objective-C and Objective-C++ Dialects}.
>  -fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol
>  -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
>  -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
> --ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
> --ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
> --ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
> +-ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol
> +-ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
>  -ftree-loop-if-convert-stores -ftree-loop-im @gol
>  -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
>  -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
> @@ -7114,11 +7112,6 @@ name is made by appending @file{.phiopt} to the source file name.
>  Dump each function after forward propagating single use variables.  The file
>  name is made by appending @file{.forwprop} to the source file name.
>
> -@item copyrename
> -@opindex fdump-tree-copyrename
> -Dump each function after applying the copy rename optimization.  The file
> -name is made by appending @file{.copyrename} to the source file name.
> -
>  @item nrv
>  @opindex fdump-tree-nrv
>  Dump each function after applying the named return value optimization on
> @@ -7583,8 +7576,8 @@ compilation time.
>  -ftree-ccp @gol
>  -fssa-phiopt @gol
>  -ftree-ch @gol
> +-ftree-coalesce-vars @gol
>  -ftree-copy-prop @gol
> --ftree-copyrename @gol
>  -ftree-dce @gol
>  -ftree-dominator-opts @gol
>  -ftree-dse @gol
> @@ -8848,6 +8841,15 @@ be parallelized.  Parallelize all the loops that can be analyzed to
>  not contain loop carried dependences without checking that it is
>  profitable to parallelize the loops.
>
> +@item -ftree-coalesce-vars
> +@opindex ftree-coalesce-vars
> +Tell the compiler to attempt to combine small user-defined variables
> +too, instead of just compiler temporaries.  This may severely limit the
> +ability to debug an optimized program compiled with
> +@option{-fno-var-tracking-assignments}.  In the negated form, this flag
> +prevents SSA coalescing of user variables.  This option is enabled by
> +default if optimization is enabled.
> +
>  @item -ftree-loop-if-convert
>  @opindex ftree-loop-if-convert
>  Attempt to transform conditional jumps in the innermost loops to
> @@ -8961,32 +8963,6 @@ Perform scalar replacement of aggregates.  This pass replaces structure
>  references with scalars to prevent committing structures to memory too
>  early.  This flag is enabled by default at @option{-O} and higher.
>
> -@item -ftree-copyrename
> -@opindex ftree-copyrename
> -Perform copy renaming on trees.  This pass attempts to rename compiler
> -temporaries to other variables at copy locations, usually resulting in
> -variable names which more closely resemble the original variables.  This flag
> -is enabled by default at @option{-O} and higher.
> -
> -@item -ftree-coalesce-inlined-vars
> -@opindex ftree-coalesce-inlined-vars
> -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
> -combine small user-defined variables too, but only if they are inlined
> -from other functions.  It is a more limited form of
> -@option{-ftree-coalesce-vars}.  This may harm debug information of such
> -inlined variables, but it keeps variables of the inlined-into
> -function apart from each other, such that they are more likely to
> -contain the expected values in a debugging session.
> -
> -@item -ftree-coalesce-vars
> -@opindex ftree-coalesce-vars
> -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
> -combine small user-defined variables too, instead of just compiler
> -temporaries.  This may severely limit the ability to debug an optimized
> -program compiled with @option{-fno-var-tracking-assignments}.  In the
> -negated form, this flag prevents SSA coalescing of user variables,
> -including inlined ones.  This option is enabled by default.
> -
>  @item -ftree-ter
>  @opindex ftree-ter
>  Perform temporary expression replacement during the SSA->normal phase.  Single
> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
> index ed2b30b..3b95c5d 100644
> --- a/gcc/emit-rtl.c
> +++ b/gcc/emit-rtl.c
> @@ -63,6 +63,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "target.h"
>  #include "builtins.h"
>  #include "rtl-iter.h"
> +#include "stor-layout.h"
>
>  struct target_rtl default_target_rtl;
>  #if SWITCHABLE_TARGET
> @@ -1232,6 +1233,9 @@ set_reg_attrs_for_parm (rtx parm_rtx, rtx mem)
>  void
>  set_reg_attrs_for_decl_rtl (tree t, rtx x)
>  {
> +  if (!t)
> +    return;
> +  tree tdecl = t;
>    if (GET_CODE (x) == SUBREG)
>      {
>        gcc_assert (subreg_lowpart_p (x));
> @@ -1240,7 +1244,9 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x)
>    if (REG_P (x))
>      REG_ATTRS (x)
>        = get_reg_attrs (t, byte_lowpart_offset (GET_MODE (x),
> -                                              DECL_MODE (t)));
> +                                              DECL_P (tdecl)
> +                                              ? DECL_MODE (tdecl)
> +                                              : TYPE_MODE (TREE_TYPE (tdecl))));
>    if (GET_CODE (x) == CONCAT)
>      {
>        if (REG_P (XEXP (x, 0)))
> diff --git a/gcc/explow.c b/gcc/explow.c
> index bd342c1..6941f4e 100644
> --- a/gcc/explow.c
> +++ b/gcc/explow.c
> @@ -842,6 +842,35 @@ promote_decl_mode (const_tree decl, int *punsignedp)
>    return pmode;
>  }
>
> +/* Return the promoted mode for name.  If it is a named SSA_NAME, it
> +   is the same as promote_decl_mode.  Otherwise, it is the promoted
> +   mode of a temp decl of same type as the SSA_NAME, if we had created
> +   one.  */
> +
> +machine_mode
> +promote_ssa_mode (const_tree name, int *punsignedp)
> +{
> +  gcc_assert (TREE_CODE (name) == SSA_NAME);
> +
> +  /* Partitions holding parms and results must be promoted as expected
> +     by function.c.  */
> +  if (SSA_NAME_VAR (name)
> +      && (TREE_CODE (SSA_NAME_VAR (name)) == PARM_DECL
> +         || TREE_CODE (SSA_NAME_VAR (name)) == RESULT_DECL))
> +    return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
> +
> +  tree type = TREE_TYPE (name);
> +  int unsignedp = TYPE_UNSIGNED (type);
> +  machine_mode mode = TYPE_MODE (type);
> +
> +  machine_mode pmode = promote_mode (type, mode, &unsignedp);
> +  if (punsignedp)
> +    *punsignedp = unsignedp;
> +
> +  return pmode;
> +}
> +
> +
>
>  /* Controls the behaviour of {anti_,}adjust_stack.  */
>  static bool suppress_reg_args_size;
> diff --git a/gcc/explow.h b/gcc/explow.h
> index 94613de..52113db 100644
> --- a/gcc/explow.h
> +++ b/gcc/explow.h
> @@ -57,6 +57,9 @@ extern machine_mode promote_mode (const_tree, machine_mode, int *);
>  /* Return mode and signedness to use when object is promoted.  */
>  machine_mode promote_decl_mode (const_tree, int *);
>
> +/* Return mode and signedness to use when object is promoted.  */
> +machine_mode promote_ssa_mode (const_tree, int *);
> +
>  /* Remove some bytes from the stack.  An rtx says how many.  */
>  extern void adjust_stack (rtx);
>
> diff --git a/gcc/expr.c b/gcc/expr.c
> index 899a42c..fc49f92 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -3022,7 +3022,7 @@ write_complex_part (rtx cplx, rtx val, bool imag_p)
>  /* Extract one of the components of the complex value CPLX.  Extract the
>     real part if IMAG_P is false, and the imaginary part if it's true.  */
>
> -static rtx
> +rtx
>  read_complex_part (rtx cplx, bool imag_p)
>  {
>    machine_mode cmode, imode;
> @@ -9246,7 +9246,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>    rtx op0, op1, temp, decl_rtl;
>    tree type;
>    int unsignedp;
> -  machine_mode mode;
> +  machine_mode mode, dmode;
>    enum tree_code code = TREE_CODE (exp);
>    rtx subtarget, original_target;
>    int ignore;
> @@ -9377,7 +9377,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>        if (g == NULL
>           && modifier == EXPAND_INITIALIZER
>           && !SSA_NAME_IS_DEFAULT_DEF (exp)
> -         && (optimize || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
> +         && (optimize || !SSA_NAME_VAR (exp)
> +             || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
>           && stmt_is_replaceable_p (SSA_NAME_DEF_STMT (exp)))
>         g = SSA_NAME_DEF_STMT (exp);
>        if (g)
> @@ -9456,15 +9457,18 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>        /* Ensure variable marked as used even if it doesn't go through
>          a parser.  If it hasn't be used yet, write out an external
>          definition.  */
> -      TREE_USED (exp) = 1;
> +      if (exp)
> +       TREE_USED (exp) = 1;
>
>        /* Show we haven't gotten RTL for this yet.  */
>        temp = 0;
>
>        /* Variables inherited from containing functions should have
>          been lowered by this point.  */
> -      context = decl_function_context (exp);
> -      gcc_assert (SCOPE_FILE_SCOPE_P (context)
> +      if (exp)
> +       context = decl_function_context (exp);
> +      gcc_assert (!exp
> +                 || SCOPE_FILE_SCOPE_P (context)
>                   || context == current_function_decl
>                   || TREE_STATIC (exp)
>                   || DECL_EXTERNAL (exp)
> @@ -9488,7 +9492,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>           decl_rtl = use_anchored_address (decl_rtl);
>           if (modifier != EXPAND_CONST_ADDRESS
>               && modifier != EXPAND_SUM
> -             && !memory_address_addr_space_p (DECL_MODE (exp),
> +             && !memory_address_addr_space_p (exp ? DECL_MODE (exp)
> +                                              : GET_MODE (decl_rtl),
>                                                XEXP (decl_rtl, 0),
>                                                MEM_ADDR_SPACE (decl_rtl)))
>             temp = replace_equiv_address (decl_rtl,
> @@ -9499,12 +9504,17 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>          if the address is a register.  */
>        if (temp != 0)
>         {
> -         if (MEM_P (temp) && REG_P (XEXP (temp, 0)))
> +         if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0)))
>             mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp));
>
>           return temp;
>         }
>
> +      if (exp)
> +       dmode = DECL_MODE (exp);
> +      else
> +       dmode = TYPE_MODE (TREE_TYPE (ssa_name));
> +
>        /* If the mode of DECL_RTL does not match that of the decl,
>          there are two cases: we are dealing with a BLKmode value
>          that is returned in a register, or we are dealing with
> @@ -9512,22 +9522,23 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>          of the wanted mode, but mark it so that we know that it
>          was already extended.  */
>        if (REG_P (decl_rtl)
> -         && DECL_MODE (exp) != BLKmode
> -         && GET_MODE (decl_rtl) != DECL_MODE (exp))
> +         && dmode != BLKmode
> +         && GET_MODE (decl_rtl) != dmode)
>         {
>           machine_mode pmode;
>
>           /* Get the signedness to be used for this variable.  Ensure we get
>              the same mode we got when the variable was declared.  */
> -         if (code == SSA_NAME
> -             && (g = SSA_NAME_DEF_STMT (ssa_name))
> -             && gimple_code (g) == GIMPLE_CALL
> -             && !gimple_call_internal_p (g))
> +         if (code != SSA_NAME)
> +           pmode = promote_decl_mode (exp, &unsignedp);
> +         else if ((g = SSA_NAME_DEF_STMT (ssa_name))
> +                  && gimple_code (g) == GIMPLE_CALL
> +                  && !gimple_call_internal_p (g))
>             pmode = promote_function_mode (type, mode, &unsignedp,
>                                            gimple_call_fntype (g),
>                                            2);
>           else
> -           pmode = promote_decl_mode (exp, &unsignedp);
> +           pmode = promote_ssa_mode (ssa_name, &unsignedp);
>           gcc_assert (GET_MODE (decl_rtl) == pmode);
>
>           temp = gen_lowpart_SUBREG (mode, decl_rtl);
> diff --git a/gcc/expr.h b/gcc/expr.h
> index 32d1707..a2c8e1d 100644
> --- a/gcc/expr.h
> +++ b/gcc/expr.h
> @@ -210,6 +210,7 @@ extern rtx_insn *emit_move_insn_1 (rtx, rtx);
>
>  extern rtx_insn *emit_move_complex_push (machine_mode, rtx, rtx);
>  extern rtx_insn *emit_move_complex_parts (rtx, rtx);
> +extern rtx read_complex_part (rtx, bool);
>  extern void write_complex_part (rtx, rtx, bool);
>  extern rtx emit_move_resolve_push (machine_mode, rtx);
>
> diff --git a/gcc/function.c b/gcc/function.c
> index f9d11bf4..1d98ede 100644
> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -72,6 +72,9 @@ along with GCC; see the file COPYING3.  If not see
>  #include "cfganal.h"
>  #include "cfgbuild.h"
>  #include "cfgcleanup.h"
> +#include "cfgexpand.h"
> +#include "basic-block.h"
> +#include "df.h"
>  #include "params.h"
>  #include "bb-reorder.h"
>  #include "shrink-wrap.h"
> @@ -148,6 +151,9 @@ static bool contains (const_rtx, hash_table<insn_cache_hasher> *);
>  static void prepare_function_start (void);
>  static void do_clobber_return_reg (rtx, void *);
>  static void do_use_return_reg (rtx, void *);
> +static rtx rtl_for_parm (struct assign_parm_data_all *, tree);
> +static void maybe_reset_rtl_for_parm (tree);
> +
>
>  /* Stack of nested functions.  */
>  /* Keep track of the cfun stack.  */
> @@ -2105,6 +2111,30 @@ aggregate_value_p (const_tree exp, const_tree fntype)
>  bool
>  use_register_for_decl (const_tree decl)
>  {
> +  if (TREE_CODE (decl) == SSA_NAME)
> +    {
> +      /* We often try to use the SSA_NAME, instead of its underlying
> +        decl, to get type information and guide decisions, to avoid
> +        differences of behavior between anonymous and named
> +        variables, but in this one case we have to go for the actual
> +        variable if there is one.  The main reason is that, at least
> +        at -O0, we want to place user variables on the stack, but we
> +        don't mind using pseudos for anonymous or ignored temps.
> +        Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
> +        should go in pseudos, whereas their corresponding variables
> +        might have to go on the stack.  So, disregarding the decl
> +        here would negatively impact debug info at -O0, enable
> +        coalescing between SSA_NAMEs that ought to get different
> +        stack/pseudo assignments, and get the incoming argument
> +        processing thoroughly confused by PARM_DECLs expected to live
> +        in stack slots but assigned to pseudos.  */
> +      if (!SSA_NAME_VAR (decl))
> +       return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
> +         && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
> +
> +      decl = SSA_NAME_VAR (decl);
> +    }
> +
>    if (!targetm.calls.allocate_stack_slots_for_args ())
>      return true;
>
> @@ -2240,7 +2270,7 @@ assign_parms_initialize_all (struct assign_parm_data_all *all)
>     needed, else the old list.  */
>
>  static void
> -split_complex_args (vec<tree> *args)
> +split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
>  {
>    unsigned i;
>    tree p;
> @@ -2251,6 +2281,7 @@ split_complex_args (vec<tree> *args)
>        if (TREE_CODE (type) == COMPLEX_TYPE
>           && targetm.calls.split_complex_arg (type))
>         {
> +         tree cparm = p;
>           tree decl;
>           tree subtype = TREE_TYPE (type);
>           bool addressable = TREE_ADDRESSABLE (p);
> @@ -2269,6 +2300,9 @@ split_complex_args (vec<tree> *args)
>           DECL_ARTIFICIAL (p) = addressable;
>           DECL_IGNORED_P (p) = addressable;
>           TREE_ADDRESSABLE (p) = 0;
> +         /* Reset the RTL before layout_decl, or it may change the
> +            mode of the RTL of the original argument copied to P.  */
> +         SET_DECL_RTL (p, NULL_RTX);
>           layout_decl (p, 0);
>           (*args)[i] = p;
>
> @@ -2280,6 +2314,25 @@ split_complex_args (vec<tree> *args)
>           DECL_IGNORED_P (decl) = addressable;
>           layout_decl (decl, 0);
>           args->safe_insert (++i, decl);
> +
> +         /* If we are expanding a function, rather than gimplifying
> +            it, propagate the RTL of the complex parm to the split
> +            declarations, and set their contexts so that
> +            maybe_reset_rtl_for_parm can recognize them and refrain
> +            from resetting their RTL.  */
> +         if (currently_expanding_to_rtl)
> +           {
> +             maybe_reset_rtl_for_parm (cparm);
> +             rtx rtl = rtl_for_parm (all, cparm);
> +             if (rtl)
> +               {
> +                 SET_DECL_RTL (p, read_complex_part (rtl, false));
> +                 SET_DECL_RTL (decl, read_complex_part (rtl, true));
> +
> +                 DECL_CONTEXT (p) = cparm;
> +                 DECL_CONTEXT (decl) = cparm;
> +               }
> +           }
>         }
>      }
>  }
> @@ -2342,7 +2395,7 @@ assign_parms_augmented_arg_list (struct assign_parm_data_all *all)
>
>    /* If the target wants to split complex arguments into scalars, do so.  */
>    if (targetm.calls.split_complex_arg)
> -    split_complex_args (&fnargs);
> +    split_complex_args (all, &fnargs);
>
>    return fnargs;
>  }
> @@ -2745,23 +2798,98 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data)
>    data->entry_parm = entry_parm;
>  }
>
> +/* Wrapper for use_register_for_decl, that special-cases the
> +   .result_ptr as the function's RESULT_DECL when the RESULT_DECL is
> +   passed by reference.  */
> +
> +static bool
> +use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm)
> +{
> +  if (parm == all->function_result_decl)
> +    {
> +      tree result = DECL_RESULT (current_function_decl);
> +
> +      if (DECL_BY_REFERENCE (result))
> +       parm = result;
> +    }
> +
> +  return use_register_for_decl (parm);
> +}
> +
> +/* Wrapper for get_rtl_for_parm_ssa_default_def, that special-cases
> +   the .result_ptr as the function's RESULT_DECL when the RESULT_DECL
> +   is passed by reference.  */
> +
> +static rtx
> +rtl_for_parm (struct assign_parm_data_all *all, tree parm)
> +{
> +  if (parm == all->function_result_decl)
> +    {
> +      tree result = DECL_RESULT (current_function_decl);
> +
> +      if (!DECL_BY_REFERENCE (result))
> +       return NULL_RTX;
> +
> +      parm = result;
> +    }
> +
> +  return get_rtl_for_parm_ssa_default_def (parm);
> +}
> +
> +/* Reset the location of PARM_DECLs and RESULT_DECLs that had
> +   SSA_NAMEs in multiple partitions, so that assign_parms will choose
> +   the default def, if it exists, or create new RTL to hold the unused
> +   entry value.  If we are coalescing across variables, we want to
> +   reset the location too, because a parm without a default def
> +   (incoming value unused) might be coalesced with one with a default
> +   def, and then assign_parms would copy both incoming values to the
> +   same location, which might cause the wrong value to survive.  */
> +static void
> +maybe_reset_rtl_for_parm (tree parm)
> +{
> +  gcc_assert (TREE_CODE (parm) == PARM_DECL
> +             || TREE_CODE (parm) == RESULT_DECL);
> +
> +  /* This is a split complex parameter, and its context was set to its
> +     original PARM_DECL in split_complex_args so that we could
> +     recognize it here and not reset its RTL.  */
> +  if (DECL_CONTEXT (parm) && TREE_CODE (DECL_CONTEXT (parm)) == PARM_DECL)
> +    {
> +      DECL_CONTEXT (parm) = DECL_CONTEXT (DECL_CONTEXT (parm));
> +      return;
> +    }
> +
> +  if ((flag_tree_coalesce_vars
> +       || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx))
> +      && is_gimple_reg (parm))
> +    SET_DECL_RTL (parm, NULL_RTX);
> +}
> +
>  /* A subroutine of assign_parms.  Adjust DATA->STACK_RTL such that it's
>     always valid and properly aligned.  */
>
>  static void
> -assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data)
> +assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm,
> +                             struct assign_parm_data_one *data)
>  {
>    rtx stack_parm = data->stack_parm;
>
> +  /* If out-of-SSA assigned RTL to the parm default def, make sure we
> +     don't use what we might have computed before.  */
> +  rtx ssa_assigned = rtl_for_parm (all, parm);
> +  if (ssa_assigned)
> +    stack_parm = NULL;
> +
>    /* If we can't trust the parm stack slot to be aligned enough for its
>       ultimate type, don't use that slot after entry.  We'll make another
>       stack slot, if we need one.  */
> -  if (stack_parm
> -      && ((STRICT_ALIGNMENT
> -          && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm))
> -         || (data->nominal_type
> -             && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
> -             && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
> +  else if (stack_parm
> +          && ((STRICT_ALIGNMENT
> +               && (GET_MODE_ALIGNMENT (data->nominal_mode)
> +                   > MEM_ALIGN (stack_parm)))
> +              || (data->nominal_type
> +                  && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
> +                  && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
>      stack_parm = NULL;
>
>    /* If parm was passed in memory, and we need to convert it on entry,
> @@ -2823,14 +2951,32 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>
>    size = int_size_in_bytes (data->passed_type);
>    size_stored = CEIL_ROUND (size, UNITS_PER_WORD);
> +
>    if (stack_parm == 0)
>      {
>        DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
> -      stack_parm = assign_stack_local (BLKmode, size_stored,
> -                                      DECL_ALIGN (parm));
> -      if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
> -       PUT_MODE (stack_parm, GET_MODE (entry_parm));
> -      set_mem_attributes (stack_parm, parm, 1);
> +      rtx from_expand = rtl_for_parm (all, parm);
> +      if (from_expand && (!parm_maybe_byref_p (parm)
> +                         || XEXP (from_expand, 0) != NULL_RTX))
> +       stack_parm = copy_rtx (from_expand);
> +      else
> +       {
> +         stack_parm = assign_stack_local (BLKmode, size_stored,
> +                                          DECL_ALIGN (parm));
> +         if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
> +           PUT_MODE (stack_parm, GET_MODE (entry_parm));
> +         if (from_expand)
> +           {
> +             gcc_assert (GET_CODE (stack_parm) == MEM);
> +             gcc_assert (GET_CODE (from_expand) == MEM);
> +             gcc_assert (XEXP (from_expand, 0) == NULL_RTX);
> +             XEXP (from_expand, 0) = XEXP (stack_parm, 0);
> +             PUT_MODE (from_expand, GET_MODE (stack_parm));
> +             stack_parm = copy_rtx (from_expand);
> +           }
> +         else
> +           set_mem_attributes (stack_parm, parm, 1);
> +       }
>      }
>
>    /* If a BLKmode arrives in registers, copy it to a stack slot.  Handle
> @@ -2968,14 +3114,34 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>      = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
>                              TREE_TYPE (current_function_decl), 2);
>
> -  parmreg = gen_reg_rtx (promoted_nominal_mode);
> +  rtx from_expand = parmreg = rtl_for_parm (all, parm);
>
> -  if (!DECL_ARTIFICIAL (parm))
> -    mark_user_reg (parmreg);
> +  if (from_expand && !data->passed_pointer)
> +    {
> +      if (GET_MODE (parmreg) != promoted_nominal_mode)
> +       parmreg = gen_lowpart (promoted_nominal_mode, parmreg);
> +    }
> +  else if (!from_expand || parm_maybe_byref_p (parm))
> +    {
> +      parmreg = gen_reg_rtx (promoted_nominal_mode);
> +      if (!DECL_ARTIFICIAL (parm))
> +       mark_user_reg (parmreg);
> +
> +      if (from_expand)
> +       {
> +         gcc_assert (data->passed_pointer);
> +         gcc_assert (GET_CODE (from_expand) == MEM
> +                     && GET_MODE (from_expand) == BLKmode
> +                     && XEXP (from_expand, 0) == NULL_RTX);
> +         XEXP (from_expand, 0) = parmreg;
> +       }
> +    }
>
>    /* If this was an item that we received a pointer to,
>       set DECL_RTL appropriately.  */
> -  if (data->passed_pointer)
> +  if (from_expand)
> +    SET_DECL_RTL (parm, from_expand);
> +  else if (data->passed_pointer)
>      {
>        rtx x = gen_rtx_MEM (TYPE_MODE (TREE_TYPE (data->passed_type)), parmreg);
>        set_mem_attributes (x, parm, 1);
> @@ -2990,10 +3156,13 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>       assign_parm_find_data_types and expand_expr_real_1.  */
>
>    equiv_stack_parm = data->stack_parm;
> +  if (!equiv_stack_parm)
> +    equiv_stack_parm = data->entry_parm;
>    validated_mem = validize_mem (copy_rtx (data->entry_parm));
>
>    need_conversion = (data->nominal_mode != data->passed_mode
>                      || promoted_nominal_mode != data->promoted_mode);
> +  gcc_assert (!(need_conversion && data->passed_pointer && from_expand));
>    moved = false;
>
>    if (need_conversion
> @@ -3125,16 +3294,28 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>
>        did_conversion = true;
>      }
> -  else
> +  /* We don't want to copy the incoming pointer to a parmreg expected
> +     to hold the value rather than the pointer.  */
> +  else if (!data->passed_pointer || parmreg != from_expand)
>      emit_move_insn (parmreg, validated_mem);
>
>    /* If we were passed a pointer but the actual value can safely live
>       in a register, retrieve it and use it directly.  */
> -  if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode)
> +  if (data->passed_pointer
> +      && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode))
>      {
> +      rtx src = DECL_RTL (parm);
> +
>        /* We can't use nominal_mode, because it will have been set to
>          Pmode above.  We must use the actual mode of the parm.  */
> -      if (use_register_for_decl (parm))
> +      if (from_expand)
> +       {
> +         parmreg = from_expand;
> +         gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm)));
> +         src = gen_rtx_MEM (GET_MODE (parmreg), validated_mem);
> +         set_mem_attributes (src, parm, 1);
> +       }
> +      else if (use_register_for_decl (parm))
>         {
>           parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm)));
>           mark_user_reg (parmreg);
> @@ -3151,14 +3332,14 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>           set_mem_attributes (parmreg, parm, 1);
>         }
>
> -      if (GET_MODE (parmreg) != GET_MODE (DECL_RTL (parm)))
> +      if (GET_MODE (parmreg) != GET_MODE (src))
>         {
> -         rtx tempreg = gen_reg_rtx (GET_MODE (DECL_RTL (parm)));
> +         rtx tempreg = gen_reg_rtx (GET_MODE (src));
>           int unsigned_p = TYPE_UNSIGNED (TREE_TYPE (parm));
>
>           push_to_sequence2 (all->first_conversion_insn,
>                              all->last_conversion_insn);
> -         emit_move_insn (tempreg, DECL_RTL (parm));
> +         emit_move_insn (tempreg, src);
>           tempreg = convert_to_mode (GET_MODE (parmreg), tempreg, unsigned_p);
>           emit_move_insn (parmreg, tempreg);
>           all->first_conversion_insn = get_insns ();
> @@ -3167,14 +3348,16 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>
>           did_conversion = true;
>         }
> +      else if (GET_MODE (parmreg) == BLKmode)
> +       gcc_assert (parm_maybe_byref_p (parm));
>        else
> -       emit_move_insn (parmreg, DECL_RTL (parm));
> +       emit_move_insn (parmreg, src);
>
>        SET_DECL_RTL (parm, parmreg);
>
>        /* STACK_PARM is the pointer, not the parm, and PARMREG is
>          now the parm.  */
> -      data->stack_parm = NULL;
> +      data->stack_parm = equiv_stack_parm = NULL;
>      }
>
>    /* Mark the register as eliminable if we did no conversion and it was
> @@ -3184,11 +3367,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>       make here would screw up life analysis for it.  */
>    if (data->nominal_mode == data->passed_mode
>        && !did_conversion
> -      && data->stack_parm != 0
> -      && MEM_P (data->stack_parm)
> +      && equiv_stack_parm != 0
> +      && MEM_P (equiv_stack_parm)
>        && data->locate.offset.var == 0
>        && reg_mentioned_p (virtual_incoming_args_rtx,
> -                         XEXP (data->stack_parm, 0)))
> +                         XEXP (equiv_stack_parm, 0)))
>      {
>        rtx_insn *linsn = get_last_insn ();
>        rtx_insn *sinsn;
> @@ -3201,8 +3384,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>             = GET_MODE_INNER (GET_MODE (parmreg));
>           int regnor = REGNO (XEXP (parmreg, 0));
>           int regnoi = REGNO (XEXP (parmreg, 1));
> -         rtx stackr = adjust_address_nv (data->stack_parm, submode, 0);
> -         rtx stacki = adjust_address_nv (data->stack_parm, submode,
> +         rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0);
> +         rtx stacki = adjust_address_nv (equiv_stack_parm, submode,
>                                           GET_MODE_SIZE (submode));
>
>           /* Scan backwards for the set of the real and
> @@ -3275,6 +3458,13 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
>
>        if (data->stack_parm == 0)
>         {
> +         rtx x = data->stack_parm = rtl_for_parm (all, parm);
> +         if (x)
> +           gcc_assert (GET_MODE (x) == GET_MODE (data->entry_parm));
> +       }
> +
> +      if (data->stack_parm == 0)
> +       {
>           int align = STACK_SLOT_ALIGNMENT (data->passed_type,
>                                             GET_MODE (data->entry_parm),
>                                             TYPE_ALIGN (data->passed_type));
> @@ -3337,11 +3527,21 @@ assign_parms_unsplit_complex (struct assign_parm_data_all *all,
>           imag = DECL_RTL (fnargs[i + 1]);
>           if (inner != GET_MODE (real))
>             {
> -             real = gen_lowpart_SUBREG (inner, real);
> -             imag = gen_lowpart_SUBREG (inner, imag);
> +             real = simplify_gen_subreg (inner, real, GET_MODE (real),
> +                                         subreg_lowpart_offset
> +                                         (inner, GET_MODE (real)));
> +             imag = simplify_gen_subreg (inner, imag, GET_MODE (imag),
> +                                         subreg_lowpart_offset
> +                                         (inner, GET_MODE (imag)));
>             }
>
> -         if (TREE_ADDRESSABLE (parm))
> +         if ((tmp = rtl_for_parm (all, parm)) != NULL_RTX
> +             && rtx_equal_p (real,
> +                             read_complex_part (tmp, false))
> +             && rtx_equal_p (imag,
> +                             read_complex_part (tmp, true)))
> +           ; /* We now have the right rtl in tmp.  */
> +         else if (TREE_ADDRESSABLE (parm))
>             {
>               rtx rmem, imem;
>               HOST_WIDE_INT size = int_size_in_bytes (TREE_TYPE (parm));
> @@ -3487,7 +3687,7 @@ assign_bounds (vec<bounds_parm_data> &bndargs,
>           assign_parm_setup_block (&all, pbdata->bounds_parm,
>                                    &pbdata->parm_data);
>         else if (pbdata->parm_data.passed_pointer
> -                || use_register_for_decl (pbdata->bounds_parm))
> +                || use_register_for_parm_decl (&all, pbdata->bounds_parm))
>           assign_parm_setup_reg (&all, pbdata->bounds_parm,
>                                  &pbdata->parm_data);
>         else
> @@ -3531,6 +3731,8 @@ assign_parms (tree fndecl)
>           DECL_INCOMING_RTL (parm) = DECL_RTL (parm);
>           continue;
>         }
> +      else
> +       maybe_reset_rtl_for_parm (parm);
>
>        /* Estimate stack alignment from parameter alignment.  */
>        if (SUPPORTS_STACK_ALIGNMENT)
> @@ -3580,7 +3782,9 @@ assign_parms (tree fndecl)
>        else
>         set_decl_incoming_rtl (parm, data.entry_parm, false);
>
> -      /* Boudns should be loaded in the particular order to
> +      assign_parm_adjust_stack_rtl (&all, parm, &data);
> +
> +      /* Bounds should be loaded in the particular order to
>          have registers allocated correctly.  Collect info about
>          input bounds and load them later.  */
>        if (POINTER_BOUNDS_TYPE_P (data.passed_type))
> @@ -3597,11 +3801,10 @@ assign_parms (tree fndecl)
>         }
>        else
>         {
> -         assign_parm_adjust_stack_rtl (&data);
> -
>           if (assign_parm_setup_block_p (&data))
>             assign_parm_setup_block (&all, parm, &data);
> -         else if (data.passed_pointer || use_register_for_decl (parm))
> +         else if (data.passed_pointer
> +                  || use_register_for_parm_decl (&all, parm))
>             assign_parm_setup_reg (&all, parm, &data);
>           else
>             assign_parm_setup_stack (&all, parm, &data);
> @@ -4932,7 +5135,9 @@ expand_function_start (tree subr)
>       before any library calls that assign parms might generate.  */
>
>    /* Decide whether to return the value in memory or in a register.  */
> -  if (aggregate_value_p (DECL_RESULT (subr), subr))
> +  tree res = DECL_RESULT (subr);
> +  maybe_reset_rtl_for_parm (res);
> +  if (aggregate_value_p (res, subr))
>      {
>        /* Returning something that won't go in a register.  */
>        rtx value_address = 0;
> @@ -4940,7 +5145,7 @@ expand_function_start (tree subr)
>  #ifdef PCC_STATIC_STRUCT_RETURN
>        if (cfun->returns_pcc_struct)
>         {
> -         int size = int_size_in_bytes (TREE_TYPE (DECL_RESULT (subr)));
> +         int size = int_size_in_bytes (TREE_TYPE (res));
>           value_address = assemble_static_space (size);
>         }
>        else
> @@ -4952,36 +5157,45 @@ expand_function_start (tree subr)
>              it.  */
>           if (sv)
>             {
> -             value_address = gen_reg_rtx (Pmode);
> +             if (DECL_BY_REFERENCE (res))
> +               value_address = get_rtl_for_parm_ssa_default_def (res);
> +             if (!value_address)
> +               value_address = gen_reg_rtx (Pmode);
>               emit_move_insn (value_address, sv);
>             }
>         }
>        if (value_address)
>         {
>           rtx x = value_address;
> -         if (!DECL_BY_REFERENCE (DECL_RESULT (subr)))
> +         if (!DECL_BY_REFERENCE (res))
>             {
> -             x = gen_rtx_MEM (DECL_MODE (DECL_RESULT (subr)), x);
> -             set_mem_attributes (x, DECL_RESULT (subr), 1);
> +             x = get_rtl_for_parm_ssa_default_def (res);
> +             if (!x)
> +               {
> +                 x = gen_rtx_MEM (DECL_MODE (res), value_address);
> +                 set_mem_attributes (x, res, 1);
> +               }
>             }
> -         SET_DECL_RTL (DECL_RESULT (subr), x);
> +         SET_DECL_RTL (res, x);
>         }
>      }
> -  else if (DECL_MODE (DECL_RESULT (subr)) == VOIDmode)
> +  else if (DECL_MODE (res) == VOIDmode)
>      /* If return mode is void, this decl rtl should not be used.  */
> -    SET_DECL_RTL (DECL_RESULT (subr), NULL_RTX);
> +    SET_DECL_RTL (res, NULL_RTX);
>    else
>      {
>        /* Compute the return values into a pseudo reg, which we will copy
>          into the true return register after the cleanups are done.  */
> -      tree return_type = TREE_TYPE (DECL_RESULT (subr));
> -      if (TYPE_MODE (return_type) != BLKmode
> -         && targetm.calls.return_in_msb (return_type))
> +      tree return_type = TREE_TYPE (res);
> +      rtx x = get_rtl_for_parm_ssa_default_def (res);
> +      if (x)
> +       /* Use it.  */;
> +      else if (TYPE_MODE (return_type) != BLKmode
> +              && targetm.calls.return_in_msb (return_type))
>         /* expand_function_end will insert the appropriate padding in
>            this case.  Use the return value's natural (unpadded) mode
>            within the function proper.  */
> -       SET_DECL_RTL (DECL_RESULT (subr),
> -                     gen_reg_rtx (TYPE_MODE (return_type)));
> +       x = gen_reg_rtx (TYPE_MODE (return_type));
>        else
>         {
>           /* In order to figure out what mode to use for the pseudo, we
> @@ -4992,25 +5206,26 @@ expand_function_start (tree subr)
>           /* Structures that are returned in registers are not
>              aggregate_value_p, so we may see a PARALLEL or a REG.  */
>           if (REG_P (hard_reg))
> -           SET_DECL_RTL (DECL_RESULT (subr),
> -                         gen_reg_rtx (GET_MODE (hard_reg)));
> +           x = gen_reg_rtx (GET_MODE (hard_reg));
>           else
>             {
>               gcc_assert (GET_CODE (hard_reg) == PARALLEL);
> -             SET_DECL_RTL (DECL_RESULT (subr), gen_group_rtx (hard_reg));
> +             x = gen_group_rtx (hard_reg);
>             }
>         }
>
> +      SET_DECL_RTL (res, x);
> +
>        /* Set DECL_REGISTER flag so that expand_function_end will copy the
>          result to the real return register(s).  */
> -      DECL_REGISTER (DECL_RESULT (subr)) = 1;
> +      DECL_REGISTER (res) = 1;
>
>        if (chkp_function_instrumented_p (current_function_decl))
>         {
> -         tree return_type = TREE_TYPE (DECL_RESULT (subr));
> +         tree return_type = TREE_TYPE (res);
>           rtx bounds = targetm.calls.chkp_function_value_bounds (return_type,
>                                                                  subr, 1);
> -         SET_DECL_BOUNDS_RTL (DECL_RESULT (subr), bounds);
> +         SET_DECL_BOUNDS_RTL (res, bounds);
>         }
>      }
>
> @@ -5025,13 +5240,19 @@ expand_function_start (tree subr)
>        rtx local, chain;
>       rtx_insn *insn;
>
> -      local = gen_reg_rtx (Pmode);
> +      local = get_rtl_for_parm_ssa_default_def (parm);
> +      if (!local)
> +       local = gen_reg_rtx (Pmode);
>        chain = targetm.calls.static_chain (current_function_decl, true);
>
>        set_decl_incoming_rtl (parm, chain, false);
>        SET_DECL_RTL (parm, local);
>        mark_reg_pointer (local, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (parm))));
>
> +      if (GET_MODE (local) != Pmode)
> +       local = convert_to_mode (Pmode, local,
> +                                TYPE_UNSIGNED (TREE_TYPE (parm)));
> +
>        insn = emit_move_insn (local, chain);
>
>        /* Mark the register as eliminable, similar to parameters.  */
> diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
> index b558d90..baed630 100644
> --- a/gcc/gimple-expr.c
> +++ b/gcc/gimple-expr.c
> @@ -375,45 +375,6 @@ copy_var_decl (tree var, tree name, tree type)
>    return copy;
>  }
>
> -/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
> -   coalescing together, false otherwise.
> -
> -   This must stay consistent with var_map_base_init in tree-ssa-live.c.  */
> -
> -bool
> -gimple_can_coalesce_p (tree name1, tree name2)
> -{
> -  /* First check the SSA_NAME's associated DECL.  We only want to
> -     coalesce if they have the same DECL or both have no associated DECL.  */
> -  tree var1 = SSA_NAME_VAR (name1);
> -  tree var2 = SSA_NAME_VAR (name2);
> -  var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
> -  var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
> -  if (var1 != var2)
> -    return false;
> -
> -  /* Now check the types.  If the types are the same, then we should
> -     try to coalesce V1 and V2.  */
> -  tree t1 = TREE_TYPE (name1);
> -  tree t2 = TREE_TYPE (name2);
> -  if (t1 == t2)
> -    return true;
> -
> -  /* If the types are not the same, check for a canonical type match.  This
> -     (for example) allows coalescing when the types are fundamentally the
> -     same, but just have different names.
> -
> -     Note pointer types with different address spaces may have the same
> -     canonical type.  Those are rejected for coalescing by the
> -     types_compatible_p check.  */
> -  if (TYPE_CANONICAL (t1)
> -      && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
> -      && types_compatible_p (t1, t2))
> -    return true;
> -
> -  return false;
> -}
> -
>  /* Strip off a legitimate source ending from the input string NAME of
>     length LEN.  Rather than having to know the names used by all of
>     our front ends, we strip off an ending of a period followed by
> diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
> index ed23eb2..3d1c89f 100644
> --- a/gcc/gimple-expr.h
> +++ b/gcc/gimple-expr.h
> @@ -28,7 +28,6 @@ extern gimple_seq gimple_body (tree);
>  extern bool gimple_has_body_p (tree);
>  extern const char *gimple_decl_printable_name (tree, int);
>  extern tree copy_var_decl (tree, tree, tree);
> -extern bool gimple_can_coalesce_p (tree, tree);
>  extern tree create_tmp_var_name (const char *);
>  extern tree create_tmp_var_raw (tree, const char * = NULL);
>  extern tree create_tmp_var (tree, const char * = NULL);
> diff --git a/gcc/opts.c b/gcc/opts.c
> index 468a802..f22edd3 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -445,12 +445,12 @@ static const struct default_options default_options_table[] =
>      { OPT_LEVELS_1_PLUS, OPT_fsplit_wide_types, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_ccp, NULL, 1 },
>      { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_bit_ccp, NULL, 1 },
> +    { OPT_LEVELS_1_PLUS, OPT_ftree_coalesce_vars, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_dce, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_dominator_opts, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 },
>      { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 },
> -    { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 },
> diff --git a/gcc/passes.def b/gcc/passes.def
> index 6b66f8f..64fc4d9 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -77,7 +77,6 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_all_early_optimizations);
>        PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
>           NEXT_PASS (pass_remove_cgraph_callee_edges);
> -         NEXT_PASS (pass_rename_ssa_copies);
>           NEXT_PASS (pass_object_sizes);
>           NEXT_PASS (pass_ccp);
>           /* After CCP we rewrite no longer addressed locals into SSA
> @@ -155,7 +154,6 @@ along with GCC; see the file COPYING3.  If not see
>        /* Initial scalar cleanups before alias computation.
>          They ensure memory accesses are not indirect wherever possible.  */
>        NEXT_PASS (pass_strip_predict_hints);
> -      NEXT_PASS (pass_rename_ssa_copies);
>        NEXT_PASS (pass_ccp);
>        /* After CCP we rewrite no longer addressed locals into SSA
>          form if possible.  */
> @@ -183,7 +181,6 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_ch);
>        NEXT_PASS (pass_lower_complex);
>        NEXT_PASS (pass_sra);
> -      NEXT_PASS (pass_rename_ssa_copies);
>        /* The dom pass will also resolve all __builtin_constant_p calls
>           that are still there to 0.  This has to be done after some
>          propagations have already run, but before some more dead code
> @@ -293,7 +290,6 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_fold_builtins);
>        NEXT_PASS (pass_optimize_widening_mul);
>        NEXT_PASS (pass_tail_calls);
> -      NEXT_PASS (pass_rename_ssa_copies);
>        /* FIXME: If DCE is not run before checking for uninitialized uses,
>          we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
>          However, this also causes us to misdiagnose cases that should be
> @@ -328,7 +324,6 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_dce);
>        NEXT_PASS (pass_asan);
>        NEXT_PASS (pass_tsan);
> -      NEXT_PASS (pass_rename_ssa_copies);
>        /* ???  We do want some kind of loop invariant motion, but we possibly
>           need to adjust LIM to be more friendly towards preserving accurate
>          debug information here.  */
> diff --git a/gcc/stmt.c b/gcc/stmt.c
> index 391686c..e7f7dd4 100644
> --- a/gcc/stmt.c
> +++ b/gcc/stmt.c
> @@ -891,7 +891,7 @@ emit_case_decision_tree (tree index_expr, tree index_type,
>      {
>        index = copy_to_reg (index);
>        if (TREE_CODE (index_expr) == SSA_NAME)
> -       set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (index_expr), index);
> +       set_reg_attrs_for_decl_rtl (index_expr, index);
>      }
>
>    balance_case_nodes (&case_list, NULL);
> diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
> index 0d4f4a4..288227a 100644
> --- a/gcc/stor-layout.c
> +++ b/gcc/stor-layout.c
> @@ -794,7 +794,8 @@ layout_decl (tree decl, unsigned int known_align)
>      {
>        PUT_MODE (rtl, DECL_MODE (decl));
>        SET_DECL_RTL (decl, 0);
> -      set_mem_attributes (rtl, decl, 1);
> +      if (MEM_P (rtl))
> +       set_mem_attributes (rtl, decl, 1);
>        SET_DECL_RTL (decl, rtl);
>      }
>  }
> diff --git a/gcc/testsuite/gcc.dg/guality/pr54200.c b/gcc/testsuite/gcc.dg/guality/pr54200.c
> index 9b17187..e1e7293 100644
> --- a/gcc/testsuite/gcc.dg/guality/pr54200.c
> +++ b/gcc/testsuite/gcc.dg/guality/pr54200.c
> @@ -1,6 +1,6 @@
>  /* PR tree-optimization/54200 */
>  /* { dg-do run } */
> -/* { dg-options "-g -fno-var-tracking-assignments" } */
> +/* { dg-options "-g -fno-var-tracking-assignments -fno-tree-coalesce-vars" } */
>
>  int o __attribute__((used));
>
> diff --git a/gcc/testsuite/gcc.dg/ssp-1.c b/gcc/testsuite/gcc.dg/ssp-1.c
> index 5467f4d..db69332 100644
> --- a/gcc/testsuite/gcc.dg/ssp-1.c
> +++ b/gcc/testsuite/gcc.dg/ssp-1.c
> @@ -12,7 +12,7 @@ __stack_chk_fail (void)
>
>  int main ()
>  {
> -  int i;
> +  register int i;
>    char foo[255];
>
>    // smash stack
> diff --git a/gcc/testsuite/gcc.dg/ssp-2.c b/gcc/testsuite/gcc.dg/ssp-2.c
> index 9a7ac32..752fe53 100644
> --- a/gcc/testsuite/gcc.dg/ssp-2.c
> +++ b/gcc/testsuite/gcc.dg/ssp-2.c
> @@ -14,7 +14,7 @@ __stack_chk_fail (void)
>  void
>  overflow()
>  {
> -  int i = 0;
> +  register int i = 0;
>    char foo[30];
>
>    /* Overflow buffer.  */
> diff --git a/gcc/testsuite/gcc.dg/torture/parm-coalesce.c b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
> new file mode 100644
> index 0000000..dbd81c1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
> @@ -0,0 +1,40 @@
> +/* { dg-do run } */
> +
> +#include <stdlib.h>
> +
> +/* Make sure we don't coalesce both incoming parms, one whose incoming
> +   value is unused, to the same location, so as to overwrite one of
> +   them with the incoming value of the other.  */
> +
> +int __attribute__((noinline, noclone))
> +foo (int i, int j)
> +{
> +  j = i; /* The incoming value for J is unused.  */
> +  i = 2;
> +  if (j)
> +    j++;
> +  j += i + 1;
> +  return j;
> +}
> +
> +/* Same as foo, but with swapped parameters.  */
> +int __attribute__((noinline, noclone))
> +bar (int j, int i)
> +{
> +  j = i; /* The incoming value for J is unused.  */
> +  i = 2;
> +  if (j)
> +    j++;
> +  j += i + 1;
> +  return j;
> +}
> +
> +int
> +main (void)
> +{
> +  if (foo (0, 1) != 3)
> +    abort ();
> +  if (bar (1, 0) != 3)
> +    abort ();
> +  return 0;
> +}
> diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
> index 7b747ab9..978476c 100644
> --- a/gcc/tree-outof-ssa.c
> +++ b/gcc/tree-outof-ssa.c
> @@ -279,7 +279,6 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
>    rtx dest_rtx, seq, x;
>    machine_mode dest_mode, src_mode;
>    int unsignedp;
> -  tree var;
>
>    if (dump_file && (dump_flags & TDF_DETAILS))
>      {
> @@ -301,12 +300,12 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
>
>    start_sequence ();
>
> -  var = SSA_NAME_VAR (partition_to_var (SA.map, dest));
> +  tree name = partition_to_var (SA.map, dest);
>    src_mode = TYPE_MODE (TREE_TYPE (src));
>    dest_mode = GET_MODE (dest_rtx);
> -  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var)));
> +  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (name)));
>    gcc_assert (!REG_P (dest_rtx)
> -             || dest_mode == promote_decl_mode (var, &unsignedp));
> +             || dest_mode == promote_ssa_mode (name, &unsignedp));
>
>    if (src_mode != dest_mode)
>      {
> @@ -682,13 +681,12 @@ elim_backward (elim_graph g, int T)
>  static rtx
>  get_temp_reg (tree name)
>  {
> -  tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
> -  tree type = TREE_TYPE (var);
> +  tree type = TREE_TYPE (name);
>    int unsignedp;
> -  machine_mode reg_mode = promote_decl_mode (var, &unsignedp);
> +  machine_mode reg_mode = promote_ssa_mode (name, &unsignedp);
>    rtx x = gen_reg_rtx (reg_mode);
>    if (POINTER_TYPE_P (type))
> -    mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var))));
> +    mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type)));
>    return x;
>  }
>
> @@ -988,7 +986,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
>
>    /* Return to viewing the variable list as just all reference variables after
>       coalescing has been performed.  */
> -  partition_view_normal (map, false);
> +  partition_view_normal (map);
>
>    if (dump_file && (dump_flags & TDF_DETAILS))
>      {
> diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
> index bf8983f..08ce72c 100644
> --- a/gcc/tree-ssa-coalesce.c
> +++ b/gcc/tree-ssa-coalesce.c
> @@ -36,6 +36,8 @@ along with GCC; see the file COPYING3.  If not see
>  #include "gimple-iterator.h"
>  #include "tree-ssa-live.h"
>  #include "tree-ssa-coalesce.h"
> +#include "cfgexpand.h"
> +#include "explow.h"
>  #include "diagnostic-core.h"
>
>
> @@ -806,6 +808,16 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
>    basic_block bb;
>    ssa_op_iter iter;
>    live_track_p live;
> +  basic_block entry;
> +
> +  /* If inter-variable coalescing is enabled, we may attempt to
> +     coalesce variables from different base variables, including
> +     different parameters, so we have to make sure default defs live
> +     at the entry block conflict with each other.  */
> +  if (flag_tree_coalesce_vars)
> +    entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
> +  else
> +    entry = NULL;
>
>    map = live_var_map (liveinfo);
>    graph = ssa_conflicts_new (num_var_partitions (map));
> @@ -864,6 +876,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
>             live_track_process_def (live, result, graph);
>         }
>
> +      /* Pretend there are defs for params' default defs at the start
> +        of the (post-)entry block.  */
> +      if (bb == entry)
> +       {
> +         unsigned base;
> +         bitmap_iterator bi;
> +         EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi)
> +           {
> +             bitmap_iterator bi2;
> +             unsigned part;
> +             EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base],
> +                                       0, part, bi2)
> +               {
> +                 tree var = partition_to_var (map, part);
> +                 if (!SSA_NAME_VAR (var)
> +                     || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL
> +                         && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL)
> +                     || !SSA_NAME_IS_DEFAULT_DEF (var))
> +                   continue;
> +                 live_track_process_def (live, var, graph);
> +               }
> +           }
> +       }
> +
>       live_track_clear_base_vars (live);
>      }
>
> @@ -1132,6 +1168,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
>      {
>        var1 = partition_to_var (map, p1);
>        var2 = partition_to_var (map, p2);
> +
>        z = var_union (map, var1, var2);
>        if (z == NO_PARTITION)
>         {
> @@ -1149,6 +1186,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
>
>        if (debug)
>         fprintf (debug, ": Success -> %d\n", z);
> +
>        return true;
>      }
>
> @@ -1244,6 +1282,333 @@ ssa_name_var_hash::equal (const tree_node *n1, const tree_node *n2)
>  }
>
>
> +/* Output partition map MAP with coalescing plan PART to file F.  */
> +
> +void
> +dump_part_var_map (FILE *f, partition part, var_map map)
> +{
> +  int t;
> +  unsigned x, y;
> +  int p;
> +
> +  fprintf (f, "\nCoalescible Partition map \n\n");
> +
> +  for (x = 0; x < map->num_partitions; x++)
> +    {
> +      if (map->view_to_partition != NULL)
> +       p = map->view_to_partition[x];
> +      else
> +       p = x;
> +
> +      if (ssa_name (p) == NULL_TREE
> +         || virtual_operand_p (ssa_name (p)))
> +        continue;
> +
> +      t = 0;
> +      for (y = 1; y < num_ssa_names; y++)
> +        {
> +         tree var = version_to_var (map, y);
> +         if (!var)
> +           continue;
> +         int q = var_to_partition (map, var);
> +         p = partition_find (part, q);
> +         gcc_assert (map->partition_to_base_index[q]
> +                     == map->partition_to_base_index[p]);
> +
> +         if (p == (int)x)
> +           {
> +             if (t++ == 0)
> +               {
> +                 fprintf (f, "Partition %d, base %d (", x,
> +                          map->partition_to_base_index[q]);
> +                 print_generic_expr (f, partition_to_var (map, q), TDF_SLIM);
> +                 fprintf (f, " - ");
> +               }
> +             fprintf (f, "%d ", y);
> +           }
> +       }
> +      if (t != 0)
> +       fprintf (f, ")\n");
> +    }
> +  fprintf (f, "\n");
> +}
> +
> +/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
> +   coalescing together, false otherwise.
> +
> +   This must stay consistent with var_map_base_init in tree-ssa-live.c.  */
> +
> +bool
> +gimple_can_coalesce_p (tree name1, tree name2)
> +{
> +  /* First check the SSA_NAME's associated DECL.  Without
> +     optimization, we only want to coalesce if they have the same DECL
> +     or both have no associated DECL.  */
> +  tree var1 = SSA_NAME_VAR (name1);
> +  tree var2 = SSA_NAME_VAR (name2);
> +  var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
> +  var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
> +  if (var1 != var2 && !flag_tree_coalesce_vars)
> +    return false;
> +
> +  /* Now check the types.  If the types are the same, then we should
> +     try to coalesce NAME1 and NAME2.  */
> +  tree t1 = TREE_TYPE (name1);
> +  tree t2 = TREE_TYPE (name2);
> +  if (t1 == t2)
> +    {
> +    check_modes:
> +      /* If the base variables are the same, we're good: none of the
> +        other tests below could possibly fail.  */
> +      var1 = SSA_NAME_VAR (name1);
> +      var2 = SSA_NAME_VAR (name2);
> +      if (var1 == var2)
> +       return true;
> +
> +      /* We don't want to coalesce two SSA names if one of the base
> +        variables is supposed to be a register while the other is
> +        supposed to be on the stack.  Anonymous SSA names take
> +        registers, but when not optimizing, user variables should go
> +        on the stack, so coalescing them with the anonymous variable
> +        as the partition leader would end up assigning the user
> +        variable to a register.  Don't do that!  */
> +      bool reg1 = !var1 || use_register_for_decl (var1);
> +      bool reg2 = !var2 || use_register_for_decl (var2);
> +      if (reg1 != reg2)
> +       return false;
> +
> +      /* Check that the promoted modes are the same.  We don't want to
> +        coalesce if the promoted modes would be different.  Only
> +        PARM_DECLs and RESULT_DECLs have different promotion rules,
> +        so skip the test if both are variables, or both are anonymous
> +        SSA_NAMEs.  Now, if a parm or result has BLKmode, do not
> +        coalesce its SSA versions with those of any other variables,
> +        because it may be passed by reference.  */
> +      return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
> +       || (/* The case var1 == var2 is already covered above.  */
> +           !parm_maybe_byref_p (var1)
> +           && !parm_maybe_byref_p (var2)
> +           && promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL));
> +    }
> +
> +  /* If the types are not the same, check for a canonical type match.  This
> +     (for example) allows coalescing when the types are fundamentally the
> +     same, but just have different names.
> +
> +     Note pointer types with different address spaces may have the same
> +     canonical type.  Those are rejected for coalescing by the
> +     types_compatible_p check.  */
> +  if (TYPE_CANONICAL (t1)
> +      && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
> +      && types_compatible_p (t1, t2))
> +    goto check_modes;
> +
> +  return false;
> +}
> +
> +/* Fill in MAP's partition_to_base_index, with one index for each
> +   partition of SSA names USED_IN_COPIES and related by CL coalesce
> +   possibilities.  This must match gimple_can_coalesce_p in the
> +   optimized case.  */
> +
> +static void
> +compute_optimized_partition_bases (var_map map, bitmap used_in_copies,
> +                                  coalesce_list_p cl)
> +{
> +  int parts = num_var_partitions (map);
> +  partition tentative = partition_new (parts);
> +
> +  /* Partition the SSA versions so that, for each coalescible
> +     pair, both of its members are in the same partition in
> +     TENTATIVE.  */
> +  gcc_assert (!cl->sorted);
> +  coalesce_pair_p node;
> +  coalesce_iterator_type ppi;
> +  FOR_EACH_PARTITION_PAIR (node, ppi, cl)
> +    {
> +      tree v1 = ssa_name (node->first_element);
> +      int p1 = partition_find (tentative, var_to_partition (map, v1));
> +      tree v2 = ssa_name (node->second_element);
> +      int p2 = partition_find (tentative, var_to_partition (map, v2));
> +
> +      if (p1 == p2)
> +       continue;
> +
> +      partition_union (tentative, p1, p2);
> +    }
> +
> +  /* We have to deal with cost one pairs too.  */
> +  for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next)
> +    {
> +      tree v1 = ssa_name (co->first_element);
> +      int p1 = partition_find (tentative, var_to_partition (map, v1));
> +      tree v2 = ssa_name (co->second_element);
> +      int p2 = partition_find (tentative, var_to_partition (map, v2));
> +
> +      if (p1 == p2)
> +       continue;
> +
> +      partition_union (tentative, p1, p2);
> +    }
> +
> +  /* And also with abnormal edges.  */
> +  basic_block bb;
> +  edge e;
> +  edge_iterator ei;
> +  FOR_EACH_BB_FN (bb, cfun)
> +    {
> +      FOR_EACH_EDGE (e, ei, bb->preds)
> +       if (e->flags & EDGE_ABNORMAL)
> +         {
> +           gphi_iterator gsi;
> +           for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
> +                gsi_next (&gsi))
> +             {
> +               gphi *phi = gsi.phi ();
> +               tree arg = PHI_ARG_DEF (phi, e->dest_idx);
> +               if (SSA_NAME_IS_DEFAULT_DEF (arg)
> +                   && (!SSA_NAME_VAR (arg)
> +                       || TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL))
> +                 continue;
> +
> +               tree res = PHI_RESULT (phi);
> +
> +               int p1 = partition_find (tentative, var_to_partition (map, res));
> +               int p2 = partition_find (tentative, var_to_partition (map, arg));
> +
> +               if (p1 == p2)
> +                 continue;
> +
> +               partition_union (tentative, p1, p2);
> +             }
> +         }
> +    }
> +
> +  map->partition_to_base_index = XCNEWVEC (int, parts);
> +  auto_vec<unsigned int> index_map (parts);
> +  if (parts)
> +    index_map.quick_grow (parts);
> +
> +  const unsigned no_part = -1;
> +  unsigned count = parts;
> +  while (count)
> +    index_map[--count] = no_part;
> +
> +  /* Initialize MAP's mapping from partition to base index, using
> +     as base indices an enumeration of the TENTATIVE partitions in
> +     which each SSA version ended up, so that we compute conflicts
> +     between all SSA versions that ended up in the same potential
> +     coalesce partition.  */
> +  bitmap_iterator bi;
> +  unsigned i;
> +  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
> +    {
> +      int pidx = var_to_partition (map, ssa_name (i));
> +      int base = partition_find (tentative, pidx);
> +      if (index_map[base] != no_part)
> +       continue;
> +      index_map[base] = count++;
> +    }
> +
> +  map->num_basevars = count;
> +
> +  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
> +    {
> +      int pidx = var_to_partition (map, ssa_name (i));
> +      int base = partition_find (tentative, pidx);
> +      gcc_assert (index_map[base] < count);
> +      map->partition_to_base_index[pidx] = index_map[base];
> +    }
> +
> +  if (dump_file && (dump_flags & TDF_DETAILS))
> +    dump_part_var_map (dump_file, tentative, map);
> +
> +  partition_delete (tentative);
> +}
> +
> +/* Hashtable helpers.  */
> +
> +struct tree_int_map_hasher : nofree_ptr_hash <tree_int_map>
> +{
> +  static inline hashval_t hash (const tree_int_map *);
> +  static inline bool equal (const tree_int_map *, const tree_int_map *);
> +};
> +
> +inline hashval_t
> +tree_int_map_hasher::hash (const tree_int_map *v)
> +{
> +  return tree_map_base_hash (v);
> +}
> +
> +inline bool
> +tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
> +{
> +  return tree_int_map_eq (v, c);
> +}
> +
> +/* This routine will initialize the basevar fields of MAP with base
> +   names.  Partitions will share the same base if they have the same
> +   SSA_NAME_VAR, or, being anonymous variables, the same type.  This
> +   must match gimple_can_coalesce_p in the non-optimized case.  */
> +
> +static void
> +compute_samebase_partition_bases (var_map map)
> +{
> +  int x, num_part;
> +  tree var;
> +  struct tree_int_map *m, *mapstorage;
> +
> +  num_part = num_var_partitions (map);
> +  hash_table<tree_int_map_hasher> tree_to_index (num_part);
> +  /* We can have at most num_part entries in the hash tables, so it's
> +     enough to allocate so many map elements once, saving some malloc
> +     calls.  */
> +  mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
> +
> +  /* If a base table already exists, clear it, otherwise create it.  */
> +  free (map->partition_to_base_index);
> +  map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
> +
> +  /* Build the base variable list, and point partitions at their bases.  */
> +  for (x = 0; x < num_part; x++)
> +    {
> +      struct tree_int_map **slot;
> +      unsigned baseindex;
> +      var = partition_to_var (map, x);
> +      if (SSA_NAME_VAR (var)
> +         && (!VAR_P (SSA_NAME_VAR (var))
> +             || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
> +       m->base.from = SSA_NAME_VAR (var);
> +      else
> +       /* This restricts what anonymous SSA names we can coalesce
> +          as it restricts the sets we compute conflicts for.
> +          Using TREE_TYPE to generate sets is the easiest as
> +          type equivalency also holds for SSA names with the same
> +          underlying decl.
> +
> +          Check gimple_can_coalesce_p when changing this code.  */
> +       m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
> +                       ? TYPE_CANONICAL (TREE_TYPE (var))
> +                       : TREE_TYPE (var));
> +      /* If base variable hasn't been seen, set it up.  */
> +      slot = tree_to_index.find_slot (m, INSERT);
> +      if (!*slot)
> +       {
> +         baseindex = m - mapstorage;
> +         m->to = baseindex;
> +         *slot = m;
> +         m++;
> +       }
> +      else
> +       baseindex = (*slot)->to;
> +      map->partition_to_base_index[x] = baseindex;
> +    }
> +
> +  map->num_basevars = m - mapstorage;
> +
> +  free (mapstorage);
> +}
> +
>  /* Reduce the number of copies by coalescing variables in the function.  Return
>     a partition map with the resulting coalesces.  */
>
> @@ -1260,9 +1625,10 @@ coalesce_ssa_name (void)
>    cl = create_coalesce_list ();
>    map = create_outofssa_var_map (cl, used_in_copies);
>
> -  /* If optimization is disabled, we need to coalesce all the names originating
> -     from the same SSA_NAME_VAR so debug info remains undisturbed.  */
> -  if (!optimize)
> +  /* If this optimization is disabled, we need to coalesce all the
> +     names originating from the same SSA_NAME_VAR so debug info
> +     remains undisturbed.  */
> +  if (!flag_tree_coalesce_vars)
>      {
>        hash_table<ssa_name_var_hash> ssa_name_hash (10);
>
> @@ -1303,8 +1669,13 @@ coalesce_ssa_name (void)
>    if (dump_file && (dump_flags & TDF_DETAILS))
>      dump_var_map (dump_file, map);
>
> -  /* Don't calculate live ranges for variables not in the coalesce list.  */
> -  partition_view_bitmap (map, used_in_copies, true);
> +  partition_view_bitmap (map, used_in_copies);
> +
> +  if (flag_tree_coalesce_vars)
> +    compute_optimized_partition_bases (map, used_in_copies, cl);
> +  else
> +    compute_samebase_partition_bases (map);
> +
>    BITMAP_FREE (used_in_copies);
>
>    if (num_var_partitions (map) < 1)
> @@ -1343,8 +1714,7 @@ coalesce_ssa_name (void)
>
>    /* Now coalesce everything in the list.  */
>    coalesce_partitions (map, graph, cl,
> -                      ((dump_flags & TDF_DETAILS) ? dump_file
> -                                                  : NULL));
> +                      ((dump_flags & TDF_DETAILS) ? dump_file : NULL));
>
>    delete_coalesce_list (cl);
>    ssa_conflicts_delete (graph);
> diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h
> index 99b188a..ae289b4 100644
> --- a/gcc/tree-ssa-coalesce.h
> +++ b/gcc/tree-ssa-coalesce.h
> @@ -21,5 +21,6 @@ along with GCC; see the file COPYING3.  If not see
>  #define GCC_TREE_SSA_COALESCE_H
>
>  extern var_map coalesce_ssa_name (void);
> +extern bool gimple_can_coalesce_p (tree, tree);
>
>  #endif /* GCC_TREE_SSA_COALESCE_H */
> diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c
> deleted file mode 100644
> index aeb7f28..0000000
> --- a/gcc/tree-ssa-copyrename.c
> +++ /dev/null
> @@ -1,475 +0,0 @@
> -/* Rename SSA copies.
> -   Copyright (C) 2004-2015 Free Software Foundation, Inc.
> -   Contributed by Andrew MacLeod <amacleod@redhat.com>
> -
> -This file is part of GCC.
> -
> -GCC is free software; you can redistribute it and/or modify
> -it under the terms of the GNU General Public License as published by
> -the Free Software Foundation; either version 3, or (at your option)
> -any later version.
> -
> -GCC is distributed in the hope that it will be useful,
> -but WITHOUT ANY WARRANTY; without even the implied warranty of
> -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> -GNU General Public License for more details.
> -
> -You should have received a copy of the GNU General Public License
> -along with GCC; see the file COPYING3.  If not see
> -<http://www.gnu.org/licenses/>.  */
> -
> -#include "config.h"
> -#include "system.h"
> -#include "coretypes.h"
> -#include "backend.h"
> -#include "tree.h"
> -#include "gimple.h"
> -#include "rtl.h"
> -#include "ssa.h"
> -#include "alias.h"
> -#include "fold-const.h"
> -#include "internal-fn.h"
> -#include "gimple-iterator.h"
> -#include "flags.h"
> -#include "tree-pretty-print.h"
> -#include "insn-config.h"
> -#include "expmed.h"
> -#include "dojump.h"
> -#include "explow.h"
> -#include "calls.h"
> -#include "emit-rtl.h"
> -#include "varasm.h"
> -#include "stmt.h"
> -#include "expr.h"
> -#include "tree-dfa.h"
> -#include "tree-inline.h"
> -#include "tree-ssa-live.h"
> -#include "tree-pass.h"
> -#include "langhooks.h"
> -
> -static struct
> -{
> -  /* Number of copies coalesced.  */
> -  int coalesced;
> -} stats;
> -
> -/* The following routines implement the SSA copy renaming phase.
> -
> -   This optimization looks for copies between 2 SSA_NAMES, either through a
> -   direct copy, or an implicit one via a PHI node result and its arguments.
> -
> -   Each copy is examined to determine if it is possible to rename the base
> -   variable of one of the operands to the same variable as the other operand.
> -   i.e.
> -   T.3_5 = <blah>
> -   a_1 = T.3_5
> -
> -   If this copy couldn't be copy propagated, it could possibly remain in the
> -   program throughout the optimization phases.   After SSA->normal, it would
> -   become:
> -
> -   T.3 = <blah>
> -   a = T.3
> -
> -   Since T.3_5 is distinct from all other SSA versions of T.3, there is no
> -   fundamental reason why the base variable needs to be T.3, subject to
> -   certain restrictions.  This optimization attempts to determine if we can
> -   change the base variable on copies like this, and result in code such as:
> -
> -   a_5 = <blah>
> -   a_1 = a_5
> -
> -   This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is
> -   possible, the copy goes away completely. If it isn't possible, a new temp
> -   will be created for a_5, and you will end up with the exact same code:
> -
> -   a.8 = <blah>
> -   a = a.8
> -
> -   The other benefit of performing this optimization relates to what variables
> -   are chosen in copies.  Gimplification of the program uses temporaries for
> -   a lot of things. expressions like
> -
> -   a_1 = <blah>
> -   <blah2> = a_1
> -
> -   get turned into
> -
> -   T.3_5 = <blah>
> -   a_1 = T.3_5
> -   <blah2> = a_1
> -
> -   Copy propagation is done in a forward direction, and if we can propagate
> -   through the copy, we end up with:
> -
> -   T.3_5 = <blah>
> -   <blah2> = T.3_5
> -
> -   The copy is gone, but so is all reference to the user variable 'a'. By
> -   performing this optimization, we would see the sequence:
> -
> -   a_5 = <blah>
> -   a_1 = a_5
> -   <blah2> = a_1
> -
> -   which copy propagation would then turn into:
> -
> -   a_5 = <blah>
> -   <blah2> = a_5
> -
> -   and so we still retain the user variable whenever possible.  */
> -
> -
> -/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
> -   Choose a representative for the partition, and send debug info to DEBUG.  */
> -
> -static void
> -copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
> -{
> -  int p1, p2, p3;
> -  tree root1, root2;
> -  tree rep1, rep2;
> -  bool ign1, ign2, abnorm;
> -
> -  gcc_assert (TREE_CODE (var1) == SSA_NAME);
> -  gcc_assert (TREE_CODE (var2) == SSA_NAME);
> -
> -  register_ssa_partition (map, var1);
> -  register_ssa_partition (map, var2);
> -
> -  p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
> -  p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
> -
> -  if (debug)
> -    {
> -      fprintf (debug, "Try : ");
> -      print_generic_expr (debug, var1, TDF_SLIM);
> -      fprintf (debug, "(P%d) & ", p1);
> -      print_generic_expr (debug, var2, TDF_SLIM);
> -      fprintf (debug, "(P%d)", p2);
> -    }
> -
> -  gcc_assert (p1 != NO_PARTITION);
> -  gcc_assert (p2 != NO_PARTITION);
> -
> -  if (p1 == p2)
> -    {
> -      if (debug)
> -       fprintf (debug, " : Already coalesced.\n");
> -      return;
> -    }
> -
> -  rep1 = partition_to_var (map, p1);
> -  rep2 = partition_to_var (map, p2);
> -  root1 = SSA_NAME_VAR (rep1);
> -  root2 = SSA_NAME_VAR (rep2);
> -  if (!root1 && !root2)
> -    return;
> -
> -  /* Don't coalesce if one of the variables occurs in an abnormal PHI.  */
> -  abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1)
> -           || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2));
> -  if (abnorm)
> -    {
> -      if (debug)
> -       fprintf (debug, " : Abnormal PHI barrier.  No coalesce.\n");
> -      return;
> -    }
> -
> -  /* Partitions already have the same root, simply merge them.  */
> -  if (root1 == root2)
> -    {
> -      p1 = partition_union (map->var_partition, p1, p2);
> -      if (debug)
> -       fprintf (debug, " : Same root, coalesced --> P%d.\n", p1);
> -      return;
> -    }
> -
> -  /* Never attempt to coalesce 2 different parameters.  */
> -  if ((root1 && TREE_CODE (root1) == PARM_DECL)
> -      && (root2 && TREE_CODE (root2) == PARM_DECL))
> -    {
> -      if (debug)
> -        fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n");
> -      return;
> -    }
> -
> -  if ((root1 && TREE_CODE (root1) == RESULT_DECL)
> -      != (root2 && TREE_CODE (root2) == RESULT_DECL))
> -    {
> -      if (debug)
> -        fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n");
> -      return;
> -    }
> -
> -  ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1));
> -  ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2));
> -
> -  /* Refrain from coalescing user variables, if requested.  */
> -  if (!ign1 && !ign2)
> -    {
> -      if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2))
> -       ign2 = true;
> -      else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1))
> -       ign1 = true;
> -      else if (flag_ssa_coalesce_vars != 2)
> -       {
> -         if (debug)
> -           fprintf (debug, " : 2 different USER vars. No coalesce.\n");
> -         return;
> -       }
> -      else
> -       ign2 = true;
> -    }
> -
> -  /* If both values have default defs, we can't coalesce.  If only one has a
> -     tag, make sure that variable is the new root partition.  */
> -  if (root1 && ssa_default_def (cfun, root1))
> -    {
> -      if (root2 && ssa_default_def (cfun, root2))
> -       {
> -         if (debug)
> -           fprintf (debug, " : 2 default defs. No coalesce.\n");
> -         return;
> -       }
> -      else
> -        {
> -         ign2 = true;
> -         ign1 = false;
> -       }
> -    }
> -  else if (root2 && ssa_default_def (cfun, root2))
> -    {
> -      ign1 = true;
> -      ign2 = false;
> -    }
> -
> -  /* Do not coalesce if we cannot assign a symbol to the partition.  */
> -  if (!(!ign2 && root2)
> -      && !(!ign1 && root1))
> -    {
> -      if (debug)
> -       fprintf (debug, " : Choosen variable has no root.  No coalesce.\n");
> -      return;
> -    }
> -
> -  /* Don't coalesce if the new chosen root variable would be read-only.
> -     If both ign1 && ign2, then the root var of the larger partition
> -     wins, so reject in that case if any of the root vars is TREE_READONLY.
> -     Otherwise reject only if the root var, on which replace_ssa_name_symbol
> -     will be called below, is readonly.  */
> -  if (((root1 && TREE_READONLY (root1)) && ign2)
> -      || ((root2 && TREE_READONLY (root2)) && ign1))
> -    {
> -      if (debug)
> -       fprintf (debug, " : Readonly variable.  No coalesce.\n");
> -      return;
> -    }
> -
> -  /* Don't coalesce if the two variables aren't type compatible .  */
> -  if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2))
> -      /* There is a disconnect between the middle-end type-system and
> -         VRP, avoid coalescing enum types with different bounds.  */
> -      || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE
> -          || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE)
> -         && TREE_TYPE (var1) != TREE_TYPE (var2)))
> -    {
> -      if (debug)
> -       fprintf (debug, " : Incompatible types.  No coalesce.\n");
> -      return;
> -    }
> -
> -  /* Merge the two partitions.  */
> -  p3 = partition_union (map->var_partition, p1, p2);
> -
> -  /* Set the root variable of the partition to the better choice, if there is
> -     one.  */
> -  if (!ign2 && root2)
> -    replace_ssa_name_symbol (partition_to_var (map, p3), root2);
> -  else if (!ign1 && root1)
> -    replace_ssa_name_symbol (partition_to_var (map, p3), root1);
> -  else
> -    gcc_unreachable ();
> -
> -  if (debug)
> -    {
> -      fprintf (debug, " --> P%d ", p3);
> -      print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)),
> -                         TDF_SLIM);
> -      fprintf (debug, "\n");
> -    }
> -}
> -
> -
> -namespace {
> -
> -const pass_data pass_data_rename_ssa_copies =
> -{
> -  GIMPLE_PASS, /* type */
> -  "copyrename", /* name */
> -  OPTGROUP_NONE, /* optinfo_flags */
> -  TV_TREE_COPY_RENAME, /* tv_id */
> -  ( PROP_cfg | PROP_ssa ), /* properties_required */
> -  0, /* properties_provided */
> -  0, /* properties_destroyed */
> -  0, /* todo_flags_start */
> -  0, /* todo_flags_finish */
> -};
> -
> -class pass_rename_ssa_copies : public gimple_opt_pass
> -{
> -public:
> -  pass_rename_ssa_copies (gcc::context *ctxt)
> -    : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt)
> -  {}
> -
> -  /* opt_pass methods: */
> -  opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); }
> -  virtual bool gate (function *) { return flag_tree_copyrename != 0; }
> -  virtual unsigned int execute (function *);
> -
> -}; // class pass_rename_ssa_copies
> -
> -/* This function will make a pass through the IL, and attempt to coalesce any
> -   SSA versions which occur in PHI's or copies.  Coalescing is accomplished by
> -   changing the underlying root variable of all coalesced version.  This will
> -   then cause the SSA->normal pass to attempt to coalesce them all to the same
> -   variable.  */
> -
> -unsigned int
> -pass_rename_ssa_copies::execute (function *fun)
> -{
> -  var_map map;
> -  basic_block bb;
> -  tree var, part_var;
> -  gimple stmt;
> -  unsigned x;
> -  FILE *debug;
> -
> -  memset (&stats, 0, sizeof (stats));
> -
> -  if (dump_file && (dump_flags & TDF_DETAILS))
> -    debug = dump_file;
> -  else
> -    debug = NULL;
> -
> -  map = init_var_map (num_ssa_names);
> -
> -  FOR_EACH_BB_FN (bb, fun)
> -    {
> -      /* Scan for real copies.  */
> -      for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
> -          gsi_next (&gsi))
> -       {
> -         stmt = gsi_stmt (gsi);
> -         if (gimple_assign_ssa_name_copy_p (stmt))
> -           {
> -             tree lhs = gimple_assign_lhs (stmt);
> -             tree rhs = gimple_assign_rhs1 (stmt);
> -
> -             copy_rename_partition_coalesce (map, lhs, rhs, debug);
> -           }
> -       }
> -    }
> -
> -  FOR_EACH_BB_FN (bb, fun)
> -    {
> -      /* Treat PHI nodes as copies between the result and each argument.  */
> -      for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
> -          gsi_next (&gsi))
> -        {
> -          size_t i;
> -         tree res;
> -         gphi *phi = gsi.phi ();
> -         res = gimple_phi_result (phi);
> -
> -         /* Do not process virtual SSA_NAMES.  */
> -         if (virtual_operand_p (res))
> -           continue;
> -
> -         /* Make sure to only use the same partition for an argument
> -            as the result but never the other way around.  */
> -         if (SSA_NAME_VAR (res)
> -             && !DECL_IGNORED_P (SSA_NAME_VAR (res)))
> -           for (i = 0; i < gimple_phi_num_args (phi); i++)
> -             {
> -               tree arg = PHI_ARG_DEF (phi, i);
> -               if (TREE_CODE (arg) == SSA_NAME)
> -                 copy_rename_partition_coalesce (map, res, arg,
> -                                                 debug);
> -             }
> -         /* Else if all arguments are in the same partition try to merge
> -            it with the result.  */
> -         else
> -           {
> -             int all_p_same = -1;
> -             int p = -1;
> -             for (i = 0; i < gimple_phi_num_args (phi); i++)
> -               {
> -                 tree arg = PHI_ARG_DEF (phi, i);
> -                 if (TREE_CODE (arg) != SSA_NAME)
> -                   {
> -                     all_p_same = 0;
> -                     break;
> -                   }
> -                 else if (all_p_same == -1)
> -                   {
> -                     p = partition_find (map->var_partition,
> -                                         SSA_NAME_VERSION (arg));
> -                     all_p_same = 1;
> -                   }
> -                 else if (all_p_same == 1
> -                          && p != partition_find (map->var_partition,
> -                                                  SSA_NAME_VERSION (arg)))
> -                   {
> -                     all_p_same = 0;
> -                     break;
> -                   }
> -               }
> -             if (all_p_same == 1)
> -               copy_rename_partition_coalesce (map, res,
> -                                               PHI_ARG_DEF (phi, 0),
> -                                               debug);
> -           }
> -        }
> -    }
> -
> -  if (debug)
> -    dump_var_map (debug, map);
> -
> -  /* Now one more pass to make all elements of a partition share the same
> -     root variable.  */
> -
> -  for (x = 1; x < num_ssa_names; x++)
> -    {
> -      part_var = partition_to_var (map, x);
> -      if (!part_var)
> -        continue;
> -      var = ssa_name (x);
> -      if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var))
> -       continue;
> -      if (debug)
> -        {
> -         fprintf (debug, "Coalesced ");
> -         print_generic_expr (debug, var, TDF_SLIM);
> -         fprintf (debug, " to ");
> -         print_generic_expr (debug, part_var, TDF_SLIM);
> -         fprintf (debug, "\n");
> -       }
> -      stats.coalesced++;
> -      replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var));
> -    }
> -
> -  statistics_counter_event (fun, "copies coalesced",
> -                           stats.coalesced);
> -  delete_var_map (map);
> -  return 0;
> -}
> -
> -} // anon namespace
> -
> -gimple_opt_pass *
> -make_pass_rename_ssa_copies (gcc::context *ctxt)
> -{
> -  return new pass_rename_ssa_copies (ctxt);
> -}
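
For readers following the removal: the pass deleted above treated each SSA copy, and each PHI result/argument pair, as a request to place two SSA versions in the same partition, and then gave every member of a partition the same root variable. Below is a minimal standalone sketch of that idea. It is not GCC code: `Partition`, `find`, `unite` and `coalesce_copies` are hypothetical stand-ins for `var_map`, `partition_find`, `partition_union` and the two scanning loops, and the compatibility checks the real pass performed (for example the enum-type test) are left out.

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical stand-in for GCC's var_map partition: a plain union-find
// over SSA version numbers.
struct Partition {
  std::vector<int> parent;

  explicit Partition(int n) : parent(n) {
    std::iota(parent.begin(), parent.end(), 0);
  }

  // Return the representative of X's partition, halving the path as we go.
  int find(int x) {
    while (parent[x] != x) {
      parent[x] = parent[parent[x]];
      x = parent[x];
    }
    return x;
  }

  // Merge the partitions of A and B; return the surviving representative.
  int unite(int a, int b) {
    int ra = find(a), rb = find(b);
    if (ra != rb)
      parent[rb] = ra;
    return ra;
  }
};

// Treat every (lhs, rhs) copy as a coalescing request, as the removed pass
// did for real copies and for PHI result/argument pairs.
inline Partition coalesce_copies(
    int num_versions, const std::vector<std::pair<int, int>> &copies) {
  Partition p(num_versions);
  for (const auto &c : copies)
    p.unite(c.first, c.second);
  return p;
}
```

With copies 1=2, 2=3 and 4=5, versions 1 to 3 end up in one partition and 4 and 5 in another. The patch moves this kind of merging into the out-of-ssa coalescer instead of running it as a separate pass.
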
> diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
> index 5b00f58..4772558 100644
> --- a/gcc/tree-ssa-live.c
> +++ b/gcc/tree-ssa-live.c
> @@ -70,88 +70,6 @@ static void  verify_live_on_entry (tree_live_info_p);
>     ssa_name or variable, and vice versa.  */
>
>
> -/* Hashtable helpers.  */
> -
> -struct tree_int_map_hasher : nofree_ptr_hash <tree_int_map>
> -{
> -  static inline hashval_t hash (const tree_int_map *);
> -  static inline bool equal (const tree_int_map *, const tree_int_map *);
> -};
> -
> -inline hashval_t
> -tree_int_map_hasher::hash (const tree_int_map *v)
> -{
> -  return tree_map_base_hash (v);
> -}
> -
> -inline bool
> -tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
> -{
> -  return tree_int_map_eq (v, c);
> -}
> -
> -
> -/* This routine will initialize the basevar fields of MAP.  */
> -
> -static void
> -var_map_base_init (var_map map)
> -{
> -  int x, num_part;
> -  tree var;
> -  struct tree_int_map *m, *mapstorage;
> -
> -  num_part = num_var_partitions (map);
> -  hash_table<tree_int_map_hasher> tree_to_index (num_part);
> -  /* We can have at most num_part entries in the hash tables, so it's
> -     enough to allocate so many map elements once, saving some malloc
> -     calls.  */
> -  mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
> -
> -  /* If a base table already exists, clear it, otherwise create it.  */
> -  free (map->partition_to_base_index);
> -  map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
> -
> -  /* Build the base variable list, and point partitions at their bases.  */
> -  for (x = 0; x < num_part; x++)
> -    {
> -      struct tree_int_map **slot;
> -      unsigned baseindex;
> -      var = partition_to_var (map, x);
> -      if (SSA_NAME_VAR (var)
> -         && (!VAR_P (SSA_NAME_VAR (var))
> -             || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
> -       m->base.from = SSA_NAME_VAR (var);
> -      else
> -       /* This restricts what anonymous SSA names we can coalesce
> -          as it restricts the sets we compute conflicts for.
> -          Using TREE_TYPE to generate sets is the easies as
> -          type equivalency also holds for SSA names with the same
> -          underlying decl.
> -
> -          Check gimple_can_coalesce_p when changing this code.  */
> -       m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
> -                       ? TYPE_CANONICAL (TREE_TYPE (var))
> -                       : TREE_TYPE (var));
> -      /* If base variable hasn't been seen, set it up.  */
> -      slot = tree_to_index.find_slot (m, INSERT);
> -      if (!*slot)
> -       {
> -         baseindex = m - mapstorage;
> -         m->to = baseindex;
> -         *slot = m;
> -         m++;
> -       }
> -      else
> -       baseindex = (*slot)->to;
> -      map->partition_to_base_index[x] = baseindex;
> -    }
> -
> -  map->num_basevars = m - mapstorage;
> -
> -  free (mapstorage);
> -}
> -
> -
>  /* Remove the base table in MAP.  */
>
>  static void
> @@ -329,21 +247,17 @@ partition_view_fini (var_map map, bitmap selected)
>  }
>
>
> -/* Create a partition view which includes all the used partitions in MAP.  If
> -   WANT_BASES is true, create the base variable map as well.  */
> +/* Create a partition view which includes all the used partitions in MAP.  */
>
>  void
> -partition_view_normal (var_map map, bool want_bases)
> +partition_view_normal (var_map map)
>  {
>    bitmap used;
>
>    used = partition_view_init (map);
>    partition_view_fini (map, used);
>
> -  if (want_bases)
> -    var_map_base_init (map);
> -  else
> -    var_map_base_fini (map);
> +  var_map_base_fini (map);
>  }
>
>
> @@ -352,7 +266,7 @@ partition_view_normal (var_map map, bool want_bases)
>     as well.  */
>
>  void
> -partition_view_bitmap (var_map map, bitmap only, bool want_bases)
> +partition_view_bitmap (var_map map, bitmap only)
>  {
>    bitmap used;
>    bitmap new_partitions = BITMAP_ALLOC (NULL);
> @@ -368,10 +282,7 @@ partition_view_bitmap (var_map map, bitmap only, bool want_bases)
>      }
>    partition_view_fini (map, new_partitions);
>
> -  if (want_bases)
> -    var_map_base_init (map);
> -  else
> -    var_map_base_fini (map);
> +  var_map_base_fini (map);
>  }
>
>
> diff --git a/gcc/tree-ssa-live.h b/gcc/tree-ssa-live.h
> index d5d7820..1f88358 100644
> --- a/gcc/tree-ssa-live.h
> +++ b/gcc/tree-ssa-live.h
> @@ -71,8 +71,8 @@ typedef struct _var_map
>  extern var_map init_var_map (int);
>  extern void delete_var_map (var_map);
>  extern int var_union (var_map, tree, tree);
> -extern void partition_view_normal (var_map, bool);
> -extern void partition_view_bitmap (var_map, bitmap, bool);
> +extern void partition_view_normal (var_map);
> +extern void partition_view_bitmap (var_map, bitmap);
>  extern void dump_scope_blocks (FILE *, int);
>  extern void debug_scope_block (tree, int);
>  extern void debug_scope_blocks (int);
> diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c
> index 437f69d..1fbd71e 100644
> --- a/gcc/tree-ssa-uncprop.c
> +++ b/gcc/tree-ssa-uncprop.c
> @@ -38,6 +38,11 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-pass.h"
>  #include "tree-ssa-propagate.h"
>  #include "tree-hash-traits.h"
> +#include "bitmap.h"
> +#include "stringpool.h"
> +#include "tree-ssanames.h"
> +#include "tree-ssa-live.h"
> +#include "tree-ssa-coalesce.h"
>
>  /* The basic structure describing an equivalency created by traversing
>     an edge.  Traversing the edge effectively means that we can assume
> diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
> index da9de28..a31a137 100644
> --- a/gcc/var-tracking.c
> +++ b/gcc/var-tracking.c
> @@ -4856,12 +4856,16 @@ dataflow_set_remove_mem_locs (variable_def **slot, dataflow_set *set)
>     registers, as well as associations between MEMs and VALUEs.  */
>
>  static void
> -dataflow_set_clear_at_call (dataflow_set *set)
> +dataflow_set_clear_at_call (dataflow_set *set, rtx_insn *call_insn)
>  {
>    unsigned int r;
>    hard_reg_set_iterator hrsi;
> +  HARD_REG_SET invalidated_regs;
>
> -  EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, r, hrsi)
> +  get_call_reg_set_usage (call_insn, &invalidated_regs,
> +                         regs_invalidated_by_call);
> +
> +  EXECUTE_IF_SET_IN_HARD_REG_SET (invalidated_regs, 0, r, hrsi)
>      var_regno_delete (set, r);
>
>    if (MAY_HAVE_DEBUG_INSNS)
> @@ -6645,7 +6649,7 @@ compute_bb_dataflow (basic_block bb)
>        switch (mo->type)
>         {
>           case MO_CALL:
> -           dataflow_set_clear_at_call (out);
> +           dataflow_set_clear_at_call (out, insn);
>             break;
>
>           case MO_USE:
> @@ -9107,7 +9111,7 @@ emit_notes_in_bb (basic_block bb, dataflow_set *set)
>        switch (mo->type)
>         {
>           case MO_CALL:
> -           dataflow_set_clear_at_call (set);
> +           dataflow_set_clear_at_call (set, insn);
>             emit_notes_for_changes (insn, EMIT_NOTE_AFTER_CALL_INSN, set->vars);
>             {
>               rtx arguments = mo->u.loc, *p = &arguments;
>
>
> incremental fixes
>
> From: Alexandre Oliva <aoliva@redhat.com>
>
>         * emit-rtl.c: Include stor-layout.h.
>         (set_reg_attrs_for_decl_rtl): Take mode from expression if
>         it's not a DECL.
>         * stmt.c (emit_case_decision_tree): Pass it the SSA_NAME
>         rather than its possibly-NULL DECL.
>         PR bootstrap/66978
>         * function.c (expand_function_start): Convert static chain to
>         Pmode if needed.  From H.J. Lu  <hongjiu.lu@intel.com>.
>         PR middle-end/66983
>         PR middle-end/67035
>         * cfgexpand.c (align_local_variable, add_stack_var): Support
>         anonymous SSA names.
>         (defer_stack_allocation): Likewise.  Declare earlier.
>         (expand_one_ssa_partition): Record alignment before expanding
>         stack vars.  Support deferred allocation.
>         (set_rtl): Do not record deferred-allocation marker in
>         SA.partition_to_pseudo.
>         (expand_stack_vars): Adjust check for the marker in it.
>         (adjust_one_expanded_partition_var): Skip deferred-alloc vars.
>         PR middle-end/67034
>         * cfgexpand.c (parm_maybe_byref_p): New.
>         (expand_one_ssa_partition): Call it.  Expand maybe-byref
>         parms' default defs with a placeholder for the mem addr.
>         (ssa_default_def_partition): New.
>         (get_rtl_for_parm_ssa_default_def): Use it.
>         * function.c (assign_parm_setup_block): Replace the
>         placeholder with the address of the newly-allocated block.
>         (assign_parm_setup_reg): Replace the placeholder with a
>         newly-created pseudo.  Arrange for the pseudo to be
>         initialized from the incoming passed pointer.  Make sure
>         passed_pointer parms don't need conversion.  Don't copy from
>         validated_mem if parmreg is the value expression from expand
>         and validated_mem is the passed pointer.  For passed pointers,
>         copy from the mem referenced by validated_mem when using the
>         expand-chosen rtl.
>         * cfgexpand.h (parm_maybe_byref_p): Declare.
>         * tree-ssa-coalesce.c: Include cfgexpand.h.
>         (gimple_can_coalesce_p): Do not coalesce maybe-byref parms
>         with SSA_NAMEs of other variables, or anonymous SSA_NAMEs.
>         PR rtl-optimization/67000
>         * expr.c (read_complex_part): Export.
>         * expr.h (read_complex_part): Declare.
>         * function.c (split_complex_args): Use it.  Reset complex parm
>         before fetching its default decl rtl.
>         (assign_parms_unsplit_complex): Use the preexisting complex
>         parm rtl if it matches the components.
>         (assign_parm_setup_reg): Drop assert on from_expand mode.
>         Adjust it to the promoted_mode, if not byref.
> ---
>  gcc/cfgexpand.c         |  129 +++++++++++++++++++++++++++++++++++++----------
>  gcc/cfgexpand.h         |    1
>  gcc/emit-rtl.c          |    5 +-
>  gcc/expr.c              |    2 -
>  gcc/expr.h              |    1
>  gcc/function.c          |   88 ++++++++++++++++++++++++--------
>  gcc/stmt.c              |    2 -
>  gcc/tree-ssa-coalesce.c |   12 +++-
>  8 files changed, 186 insertions(+), 54 deletions(-)
>
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index 0b19953..8f6caf6 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -97,6 +97,8 @@ gimple currently_expanding_gimple_stmt;
>
>  static rtx expand_debug_expr (tree);
>
> +static bool defer_stack_allocation (tree, bool);
> +
>  /* Return an expression tree corresponding to the RHS of GIMPLE
>     statement STMT.  */
>
> @@ -170,6 +172,39 @@ leader_merge (tree cur, tree next)
>    return cur;
>  }
>
> +/* Return true if VAR is a PARM_DECL or a RESULT_DECL of type BLKmode.
> +   Such parameters are likely passed as a pointer to the value, rather
> +   than as a value, and so we must not coalesce them, nor allocate
> +   stack space for them before determining the calling conventions for
> +   them.  For their SSA_NAMEs, expand_one_ssa_partition emits RTL as
> +   MEMs with pc_rtx as the address, and then it replaces the pc_rtx
> +   with NULL so as to make sure the MEM is not used before it is
> +   adjusted in assign_parm_setup_reg.  */
> +
> +bool
> +parm_maybe_byref_p (tree var)
> +{
> +  if (!var || VAR_P (var))
> +    return false;
> +
> +  gcc_assert (TREE_CODE (var) == PARM_DECL
> +             || TREE_CODE (var) == RESULT_DECL);
> +
> +  return TYPE_MODE (TREE_TYPE (var)) == BLKmode;
> +}
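
The comment above describes a two-step placeholder protocol: the MEM is first created with pc_rtx as its address, the address is then cleared so that any early use fails loudly, and the real address is filled in later during parameter setup. A minimal standalone sketch of that protocol follows, assuming a toy `Mem` type. All names here (`Mem`, `placeholder`, `make_unresolved_parm_mem`, `resolve_parm_mem`, `is_resolved`) are hypothetical and only model the shape of what `expand_one_ssa_partition` and `assign_parm_setup_reg` do in the patch.

```cpp
#include <cassert>

// Hypothetical miniature of an RTL MEM: only the address slot matters here.
struct Mem {
  const int *addr;
};

// Stand-in for pc_rtx: a unique sentinel that is never a real parm address.
inline const int *placeholder() {
  static const int sentinel = 0;
  return &sentinel;
}

// Step 1 (expand time): create the MEM with the placeholder address, check
// it, then clear the address so that accidental early uses are caught.
inline Mem make_unresolved_parm_mem() {
  Mem m{placeholder()};
  assert(m.addr == placeholder());
  m.addr = nullptr;
  return m;
}

// Step 2 (parameter setup): the address must still be unresolved; patch in
// the real one.
inline void resolve_parm_mem(Mem &m, const int *real_addr) {
  assert(m.addr == nullptr);
  m.addr = real_addr;
}

// True once step 2 has run.
inline bool is_resolved(const Mem &m) { return m.addr != nullptr; }
```

The point of clearing the address rather than leaving the sentinel in place is that a null address cannot be mistaken for a usable one, which matches the "any attempt to use it will ICE" intent stated later in the patch.
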
> +
> +/* Return the partition of the default SSA_DEF for decl VAR.  */
> +
> +static int
> +ssa_default_def_partition (tree var)
> +{
> +  tree name = ssa_default_def (cfun, var);
> +
> +  if (!name)
> +    return NO_PARTITION;
> +
> +  return var_to_partition (SA.map, name);
> +}
>
>  /* Return the RTL for the default SSA def of a PARM or RESULT, if
>     there is one.  */
> @@ -198,12 +233,7 @@ get_rtl_for_parm_ssa_default_def (tree var)
>        return DECL_RTL (var);
>      }
>
> -  tree name = ssa_default_def (cfun, var);
> -
> -  if (!name)
> -    return NULL_RTX;
> -
> -  int part = var_to_partition (SA.map, name);
> +  int part = ssa_default_def_partition (var);
>    if (part == NO_PARTITION)
>      return NULL_RTX;
>
> @@ -253,7 +283,7 @@ set_rtl (tree t, rtx x)
>         {
>           if (SA.partition_to_pseudo[part])
>             gcc_assert (SA.partition_to_pseudo[part] == x);
> -         else
> +         else if (x != pc_rtx)
>             SA.partition_to_pseudo[part] = x;
>         }
>        /* For the benefit of debug information at -O0 (where
> @@ -348,8 +378,15 @@ static bool has_short_buffer;
>  static unsigned int
>  align_local_variable (tree decl)
>  {
> -  unsigned int align = LOCAL_DECL_ALIGNMENT (decl);
> -  DECL_ALIGN (decl) = align;
> +  unsigned int align;
> +
> +  if (TREE_CODE (decl) == SSA_NAME)
> +    align = TYPE_ALIGN (TREE_TYPE (decl));
> +  else
> +    {
> +      align = LOCAL_DECL_ALIGNMENT (decl);
> +      DECL_ALIGN (decl) = align;
> +    }
>    return align / BITS_PER_UNIT;
>  }
>
> @@ -415,12 +452,15 @@ add_stack_var (tree decl)
>    decl_to_stack_part->put (decl, stack_vars_num);
>
>    v->decl = decl;
> -  v->size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (decl)));
> +  tree size = TREE_CODE (decl) == SSA_NAME
> +    ? TYPE_SIZE_UNIT (TREE_TYPE (decl))
> +    : DECL_SIZE_UNIT (decl);
> +  v->size = tree_to_uhwi (size);
>    /* Ensure that all variables have size, so that &a != &b for any two
>       variables that are simultaneously live.  */
>    if (v->size == 0)
>      v->size = 1;
> -  v->alignb = align_local_variable (SSAVAR (decl));
> +  v->alignb = align_local_variable (decl);
>    /* An alignment of zero can mightily confuse us later.  */
>    gcc_assert (v->alignb != 0);
>
> @@ -1051,9 +1091,9 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
>           /* Skip variables that have already had rtl assigned.  See also
>              add_stack_var where we perpetrate this pc_rtx hack.  */
>           decl = stack_vars[i].decl;
> -         if ((TREE_CODE (decl) == SSA_NAME
> -             ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)]
> -             : DECL_RTL (decl)) != pc_rtx)
> +         if (TREE_CODE (decl) == SSA_NAME
> +             ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)] != NULL_RTX
> +             : DECL_RTL (decl) != pc_rtx)
>             continue;
>
>           large_size += alignb - 1;
> @@ -1082,9 +1122,9 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
>        /* Skip variables that have already had rtl assigned.  See also
>          add_stack_var where we perpetrate this pc_rtx hack.  */
>        decl = stack_vars[i].decl;
> -      if ((TREE_CODE (decl) == SSA_NAME
> -          ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)]
> -          : DECL_RTL (decl)) != pc_rtx)
> +      if (TREE_CODE (decl) == SSA_NAME
> +         ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)] != NULL_RTX
> +         : DECL_RTL (decl) != pc_rtx)
>         continue;
>
>        /* Check the predicate to see whether this variable should be
> @@ -1290,12 +1330,6 @@ expand_one_ssa_partition (tree var)
>    if (SA.partition_to_pseudo[part])
>      return;
>
> -  if (!use_register_for_decl (var))
> -    {
> -      expand_one_stack_var_1 (var);
> -      return;
> -    }
> -
>    unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (var),
>                                           TYPE_MODE (TREE_TYPE (var)),
>                                           TYPE_ALIGN (TREE_TYPE (var)));
> @@ -1307,6 +1341,27 @@ expand_one_ssa_partition (tree var)
>
>    record_alignment_for_reg_var (align);
>
> +  if (!use_register_for_decl (var))
> +    {
> +      if (parm_maybe_byref_p (SSA_NAME_VAR (var))
> +         && ssa_default_def_partition (SSA_NAME_VAR (var)) == part)
> +       {
> +         expand_one_stack_var_at (var, pc_rtx, 0, 0);
> +         rtx x = SA.partition_to_pseudo[part];
> +         gcc_assert (GET_CODE (x) == MEM);
> +         gcc_assert (GET_MODE (x) == BLKmode);
> +         gcc_assert (XEXP (x, 0) == pc_rtx);
> +         /* Reset the address, so that any attempt to use it will
> +            ICE.  It will be adjusted in assign_parm_setup_reg.  */
> +         XEXP (x, 0) = NULL_RTX;
> +       }
> +      else if (defer_stack_allocation (var, true))
> +       add_stack_var (var);
> +      else
> +       expand_one_stack_var_1 (var);
> +      return;
> +    }
> +
>    machine_mode reg_mode = promote_ssa_mode (var, NULL);
>
>    rtx x = gen_reg_rtx (reg_mode);
> @@ -1331,6 +1386,13 @@ adjust_one_expanded_partition_var (tree var)
>
>    rtx x = SA.partition_to_pseudo[part];
>
> +  if (!x)
> +    {
> +      /* This var will get a stack slot later.  */
> +      gcc_assert (defer_stack_allocation (var, true));
> +      return;
> +    }
> +
>    set_rtl (var, x);
>
>    if (!REG_P (x))
> @@ -1409,10 +1471,14 @@ expand_one_error_var (tree var)
>  static bool
>  defer_stack_allocation (tree var, bool toplevel)
>  {
> +  tree size_unit = TREE_CODE (var) == SSA_NAME
> +    ? TYPE_SIZE_UNIT (TREE_TYPE (var))
> +    : DECL_SIZE_UNIT (var);
> +
>    /* Whether the variable is small enough for immediate allocation not to be
>       a problem with regard to the frame size.  */
>    bool smallish
> -    = ((HOST_WIDE_INT) tree_to_uhwi (DECL_SIZE_UNIT (var))
> +    = ((HOST_WIDE_INT) tree_to_uhwi (size_unit)
>         < PARAM_VALUE (PARAM_MIN_SIZE_FOR_STACK_SHARING));
>
>    /* If stack protection is enabled, *all* stack variables must be deferred,
> @@ -1421,16 +1487,24 @@ defer_stack_allocation (tree var, bool toplevel)
>    if (flag_stack_protect || ((flag_sanitize & SANITIZE_ADDRESS) && ASAN_STACK))
>      return true;
>
> +  unsigned int align = TREE_CODE (var) == SSA_NAME
> +    ? TYPE_ALIGN (TREE_TYPE (var))
> +    : DECL_ALIGN (var);
> +
>    /* We handle "large" alignment via dynamic allocation.  We want to handle
>       this extra complication in only one place, so defer them.  */
> -  if (DECL_ALIGN (var) > MAX_SUPPORTED_STACK_ALIGNMENT)
> +  if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
>      return true;
>
> +  bool ignored = TREE_CODE (var) == SSA_NAME
> +    ? !SSAVAR (var) || DECL_IGNORED_P (SSA_NAME_VAR (var))
> +    : DECL_IGNORED_P (var);
> +
>    /* When optimization is enabled, DECL_IGNORED_P variables originally scoped
>       might be detached from their block and appear at toplevel when we reach
>       here.  We want to coalesce them with variables from other blocks when
>       the immediate contribution to the frame size would be noticeable.  */
> -  if (toplevel && optimize > 0 && DECL_IGNORED_P (var) && !smallish)
> +  if (toplevel && optimize > 0 && ignored && !smallish)
>      return true;
>
>    /* Variables declared in the outermost scope automatically conflict
> @@ -6135,7 +6209,8 @@ pass_expand::execute (function *fun)
>        if (part == NO_PARTITION)
>         continue;
>
> -      gcc_assert (SA.partition_to_pseudo[part]);
> +      gcc_assert (SA.partition_to_pseudo[part]
> +                 || defer_stack_allocation (name, true));
>
>        /* If this decl was marked as living in multiple places, reset
>          this now to NULL.  */
> diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
> index 602579d..987cf356 100644
> --- a/gcc/cfgexpand.h
> +++ b/gcc/cfgexpand.h
> @@ -22,6 +22,7 @@ along with GCC; see the file COPYING3.  If not see
>
>  extern tree gimple_assign_rhs_to_tree (gimple);
>  extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
> +extern bool parm_maybe_byref_p (tree);
>  extern rtx get_rtl_for_parm_ssa_default_def (tree var);
>
>
> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
> index 0648af6..3b95c5d 100644
> --- a/gcc/emit-rtl.c
> +++ b/gcc/emit-rtl.c
> @@ -63,6 +63,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "target.h"
>  #include "builtins.h"
>  #include "rtl-iter.h"
> +#include "stor-layout.h"
>
>  struct target_rtl default_target_rtl;
>  #if SWITCHABLE_TARGET
> @@ -1243,7 +1244,9 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x)
>    if (REG_P (x))
>      REG_ATTRS (x)
>        = get_reg_attrs (t, byte_lowpart_offset (GET_MODE (x),
> -                                              DECL_MODE (tdecl)));
> +                                              DECL_P (tdecl)
> +                                              ? DECL_MODE (tdecl)
> +                                              : TYPE_MODE (TREE_TYPE (tdecl))));
>    if (GET_CODE (x) == CONCAT)
>      {
>        if (REG_P (XEXP (x, 0)))
> diff --git a/gcc/expr.c b/gcc/expr.c
> index d601129..fc49f92 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -3022,7 +3022,7 @@ write_complex_part (rtx cplx, rtx val, bool imag_p)
>  /* Extract one of the components of the complex value CPLX.  Extract the
>     real part if IMAG_P is false, and the imaginary part if it's true.  */
>
> -static rtx
> +rtx
>  read_complex_part (rtx cplx, bool imag_p)
>  {
>    machine_mode cmode, imode;
> diff --git a/gcc/expr.h b/gcc/expr.h
> index 32d1707..a2c8e1d 100644
> --- a/gcc/expr.h
> +++ b/gcc/expr.h
> @@ -210,6 +210,7 @@ extern rtx_insn *emit_move_insn_1 (rtx, rtx);
>
>  extern rtx_insn *emit_move_complex_push (machine_mode, rtx, rtx);
>  extern rtx_insn *emit_move_complex_parts (rtx, rtx);
> +extern rtx read_complex_part (rtx, bool);
>  extern void write_complex_part (rtx, rtx, bool);
>  extern rtx emit_move_resolve_push (machine_mode, rtx);
>
> diff --git a/gcc/function.c b/gcc/function.c
> index c3d00cd..1d98ede 100644
> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -152,6 +152,7 @@ static void prepare_function_start (void);
>  static void do_clobber_return_reg (rtx, void *);
>  static void do_use_return_reg (rtx, void *);
>  static rtx rtl_for_parm (struct assign_parm_data_all *, tree);
> +static void maybe_reset_rtl_for_parm (tree);
>
>
>  /* Stack of nested functions.  */
> @@ -2321,12 +2322,12 @@ split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
>              from resetting their RTL.  */
>           if (currently_expanding_to_rtl)
>             {
> +             maybe_reset_rtl_for_parm (cparm);
>               rtx rtl = rtl_for_parm (all, cparm);
> -             gcc_assert (!rtl || GET_CODE (rtl) == CONCAT);
>               if (rtl)
>                 {
> -                 SET_DECL_RTL (p, XEXP (rtl, 0));
> -                 SET_DECL_RTL (decl, XEXP (rtl, 1));
> +                 SET_DECL_RTL (p, read_complex_part (rtl, false));
> +                 SET_DECL_RTL (decl, read_complex_part (rtl, true));
>
>                   DECL_CONTEXT (p) = cparm;
>                   DECL_CONTEXT (decl) = cparm;
> @@ -2954,16 +2955,27 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>    if (stack_parm == 0)
>      {
>        DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
> -      stack_parm = rtl_for_parm (all, parm);
> -      if (stack_parm)
> -       stack_parm = copy_rtx (stack_parm);
> +      rtx from_expand = rtl_for_parm (all, parm);
> +      if (from_expand && (!parm_maybe_byref_p (parm)
> +                         || XEXP (from_expand, 0) != NULL_RTX))
> +       stack_parm = copy_rtx (from_expand);
>        else
>         {
>           stack_parm = assign_stack_local (BLKmode, size_stored,
>                                            DECL_ALIGN (parm));
>           if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
>             PUT_MODE (stack_parm, GET_MODE (entry_parm));
> -         set_mem_attributes (stack_parm, parm, 1);
> +         if (from_expand)
> +           {
> +             gcc_assert (GET_CODE (stack_parm) == MEM);
> +             gcc_assert (GET_CODE (from_expand) == MEM);
> +             gcc_assert (XEXP (from_expand, 0) == NULL_RTX);
> +             XEXP (from_expand, 0) = XEXP (stack_parm, 0);
> +             PUT_MODE (from_expand, GET_MODE (stack_parm));
> +             stack_parm = copy_rtx (from_expand);
> +           }
> +         else
> +           set_mem_attributes (stack_parm, parm, 1);
>         }
>      }
>
> @@ -3102,23 +3114,34 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>      = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
>                              TREE_TYPE (current_function_decl), 2);
>
> -  rtx from_expand = rtl_for_parm (all, parm);
> +  rtx from_expand = parmreg = rtl_for_parm (all, parm);
>
>    if (from_expand && !data->passed_pointer)
>      {
> -      parmreg = from_expand;
> -      gcc_assert (GET_MODE (parmreg) == promoted_nominal_mode);
> +      if (GET_MODE (parmreg) != promoted_nominal_mode)
> +       parmreg = gen_lowpart (promoted_nominal_mode, parmreg);
>      }
> -  else
> +  else if (!from_expand || parm_maybe_byref_p (parm))
>      {
>        parmreg = gen_reg_rtx (promoted_nominal_mode);
>        if (!DECL_ARTIFICIAL (parm))
>         mark_user_reg (parmreg);
> +
> +      if (from_expand)
> +       {
> +         gcc_assert (data->passed_pointer);
> +         gcc_assert (GET_CODE (from_expand) == MEM
> +                     && GET_MODE (from_expand) == BLKmode
> +                     && XEXP (from_expand, 0) == NULL_RTX);
> +         XEXP (from_expand, 0) = parmreg;
> +       }
>      }
>
>    /* If this was an item that we received a pointer to,
>       set DECL_RTL appropriately.  */
> -  if (data->passed_pointer)
> +  if (from_expand)
> +    SET_DECL_RTL (parm, from_expand);
> +  else if (data->passed_pointer)
>      {
>        rtx x = gen_rtx_MEM (TYPE_MODE (TREE_TYPE (data->passed_type)), parmreg);
>        set_mem_attributes (x, parm, 1);
> @@ -3139,6 +3162,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>
>    need_conversion = (data->nominal_mode != data->passed_mode
>                      || promoted_nominal_mode != data->promoted_mode);
> +  gcc_assert (!(need_conversion && data->passed_pointer && from_expand));
>    moved = false;
>
>    if (need_conversion
> @@ -3270,7 +3294,9 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>
>        did_conversion = true;
>      }
> -  else
> +  /* We don't want to copy the incoming pointer to a parmreg expected
> +     to hold the value rather than the pointer.  */
> +  else if (!data->passed_pointer || parmreg != from_expand)
>      emit_move_insn (parmreg, validated_mem);
>
>    /* If we were passed a pointer but the actual value can safely live
> @@ -3278,12 +3304,16 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>    if (data->passed_pointer
>        && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode))
>      {
> +      rtx src = DECL_RTL (parm);
> +
>        /* We can't use nominal_mode, because it will have been set to
>          Pmode above.  We must use the actual mode of the parm.  */
>        if (from_expand)
>         {
>           parmreg = from_expand;
>           gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm)));
> +         src = gen_rtx_MEM (GET_MODE (parmreg), validated_mem);
> +         set_mem_attributes (src, parm, 1);
>         }
>        else if (use_register_for_decl (parm))
>         {
> @@ -3302,14 +3332,14 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>           set_mem_attributes (parmreg, parm, 1);
>         }
>
> -      if (GET_MODE (parmreg) != GET_MODE (DECL_RTL (parm)))
> +      if (GET_MODE (parmreg) != GET_MODE (src))
>         {
> -         rtx tempreg = gen_reg_rtx (GET_MODE (DECL_RTL (parm)));
> +         rtx tempreg = gen_reg_rtx (GET_MODE (src));
>           int unsigned_p = TYPE_UNSIGNED (TREE_TYPE (parm));
>
>           push_to_sequence2 (all->first_conversion_insn,
>                              all->last_conversion_insn);
> -         emit_move_insn (tempreg, DECL_RTL (parm));
> +         emit_move_insn (tempreg, src);
>           tempreg = convert_to_mode (GET_MODE (parmreg), tempreg, unsigned_p);
>           emit_move_insn (parmreg, tempreg);
>           all->first_conversion_insn = get_insns ();
> @@ -3318,8 +3348,10 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>
>           did_conversion = true;
>         }
> +      else if (GET_MODE (parmreg) == BLKmode)
> +       gcc_assert (parm_maybe_byref_p (parm));
>        else
> -       emit_move_insn (parmreg, DECL_RTL (parm));
> +       emit_move_insn (parmreg, src);
>
>        SET_DECL_RTL (parm, parmreg);
>
> @@ -3495,11 +3527,21 @@ assign_parms_unsplit_complex (struct assign_parm_data_all *all,
>           imag = DECL_RTL (fnargs[i + 1]);
>           if (inner != GET_MODE (real))
>             {
> -             real = gen_lowpart_SUBREG (inner, real);
> -             imag = gen_lowpart_SUBREG (inner, imag);
> +             real = simplify_gen_subreg (inner, real, GET_MODE (real),
> +                                         subreg_lowpart_offset
> +                                         (inner, GET_MODE (real)));
> +             imag = simplify_gen_subreg (inner, imag, GET_MODE (imag),
> +                                         subreg_lowpart_offset
> +                                         (inner, GET_MODE (imag)));
>             }
>
> -         if (TREE_ADDRESSABLE (parm))
> +         if ((tmp = rtl_for_parm (all, parm)) != NULL_RTX
> +             && rtx_equal_p (real,
> +                             read_complex_part (tmp, false))
> +             && rtx_equal_p (imag,
> +                             read_complex_part (tmp, true)))
> +           ; /* We now have the right rtl in tmp.  */
> +         else if (TREE_ADDRESSABLE (parm))
>             {
>               rtx rmem, imem;
>               HOST_WIDE_INT size = int_size_in_bytes (TREE_TYPE (parm));
> @@ -3645,7 +3687,7 @@ assign_bounds (vec<bounds_parm_data> &bndargs,
>           assign_parm_setup_block (&all, pbdata->bounds_parm,
>                                    &pbdata->parm_data);
>         else if (pbdata->parm_data.passed_pointer
> -                || use_register_for_decl (pbdata->bounds_parm))
> +                || use_register_for_parm_decl (&all, pbdata->bounds_parm))
>           assign_parm_setup_reg (&all, pbdata->bounds_parm,
>                                  &pbdata->parm_data);
>         else
> @@ -5207,6 +5249,10 @@ expand_function_start (tree subr)
>        SET_DECL_RTL (parm, local);
>        mark_reg_pointer (local, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (parm))));
>
> +      if (GET_MODE (local) != Pmode)
> +       local = convert_to_mode (Pmode, local,
> +                                TYPE_UNSIGNED (TREE_TYPE (parm)));
> +
>        insn = emit_move_insn (local, chain);
>
>        /* Mark the register as eliminable, similar to parameters.  */
> diff --git a/gcc/stmt.c b/gcc/stmt.c
> index 391686c..e7f7dd4 100644
> --- a/gcc/stmt.c
> +++ b/gcc/stmt.c
> @@ -891,7 +891,7 @@ emit_case_decision_tree (tree index_expr, tree index_type,
>      {
>        index = copy_to_reg (index);
>        if (TREE_CODE (index_expr) == SSA_NAME)
> -       set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (index_expr), index);
> +       set_reg_attrs_for_decl_rtl (index_expr, index);
>      }
>
>    balance_case_nodes (&case_list, NULL);
> diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
> index a622728..08ce72c 100644
> --- a/gcc/tree-ssa-coalesce.c
> +++ b/gcc/tree-ssa-coalesce.c
> @@ -36,6 +36,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "gimple-iterator.h"
>  #include "tree-ssa-live.h"
>  #include "tree-ssa-coalesce.h"
> +#include "cfgexpand.h"
>  #include "explow.h"
>  #include "diagnostic-core.h"
>
> @@ -1379,10 +1380,15 @@ gimple_can_coalesce_p (tree name1, tree name2)
>        /* Check that the promoted modes are the same.  We don't want to
>          coalesce if the promoted modes would be different.  Only
>          PARM_DECLs and RESULT_DECLs have different promotion rules,
> -        so skip the test if we both are variables or anonymous
> -        SSA_NAMEs.  */
> +        so skip the test if both are variables, or both are anonymous
> +        SSA_NAMEs.  Now, if a parm or result has BLKmode, do not
> +        coalesce its SSA versions with those of any other variables,
> +        because it may be passed by reference.  */
>        return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
> -       || promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL);
> +       || (/* The case var1 == var2 is already covered above.  */
> +           !parm_maybe_byref_p (var1)
> +           && !parm_maybe_byref_p (var2)
> +           && promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL));
>      }
>
>    /* If the types are not the same, check for a canonical type match.  This
>
>
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-04  9:48                                                 ` Richard Biener
@ 2015-08-05  0:39                                                   ` Alexandre Oliva
  2015-08-05  9:14                                                     ` Richard Biener
  0 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-08-05  0:39 UTC (permalink / raw)
  To: Richard Biener
  Cc: H.J. Lu, Segher Boessenkool, Jeff Law, GCC Patches,
	Christophe Lyon, David Edelsohn, Eric Botcazou

On Aug  4, 2015, Richard Biener <richard.guenther@gmail.com> wrote:

> Though I wonder whether it would make sense to split the patch into a
> first one disabling coalescing of parms (their default defs(?)) and a
> followup implementing the support for that.

We can't disable coalescing of parms altogether.  With -O0, we must
coalesce all SSA_NAMEs referencing each parm to a single partition.
With optimization, we could coalesce parms in general, just not these
special cases in which the parm is to live in a caller-supplied memory
block.

Now, it's not coalescing parms proper that brought so much risk to the
patch, it is assigning rtl to SSA partitions, and having assign_parms*
use that assignment.  Considering that sometimes a single param
necessarily ends up in more than one partition, requiring two
assignments, and that assign_parms* can't deal with that, I don't see
how to easily disable the cfgexpand logic when it comes to parms, so as
to be able to leave assign_parms alone.

How about, if further problems arise that justify reverting the patch
one more time, I'll look into splitting the patch as you suggested, but
otherwise, I'll save myself the trouble, ok?

> So - is my observation correct that this is only about coalescing of the
> default defs of parameters, not other SSA names based on parameter decls?

It's more like the opposite, i.e., we *refrain* from coalescing other
SSA_NAMEs related to byref params, so that we can easily tell when a
partition references a byref param and whether that partition holds its
default def.  We could have coalesced any other names that ended up in
different partitions, and even the partition holding the default def, if
we had other means to identify partitions with default defs of byref
params.  For example, we could create a bitmap of byref param default
def versions, and then, after partitioning, map those to the partitions
they were assigned to.  In fact, I might do that as a followup.
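
A minimal sketch of that followup idea, in plain C rather than GCC internals. Everything here is hypothetical: the `toy_*` names, the 64-version limit, and the union-find standing in for the real partition map are assumptions for illustration only. It records byref default-def versions in a bitmap before partitioning, then maps each set bit to the leader of the partition it ended up in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not GCC code: SSA versions are small integers, and
   partitions are the sets of a union-find structure.  */
#define TOY_MAX_VERSIONS 64

struct toy_partitions
{
  int parent[TOY_MAX_VERSIONS];
};

/* Start with every version in a partition of its own.  */
static void
toy_init (struct toy_partitions *p)
{
  for (int i = 0; i < TOY_MAX_VERSIONS; i++)
    p->parent[i] = i;
}

/* Return the leader of the partition holding VERSION.  */
static int
toy_find (struct toy_partitions *p, int version)
{
  assert (version >= 0 && version < TOY_MAX_VERSIONS);
  while (p->parent[version] != version)
    {
      /* Path halving keeps later lookups short.  */
      p->parent[version] = p->parent[p->parent[version]];
      version = p->parent[version];
    }
  return version;
}

/* Coalesce the partitions holding A and B; A's leader stays leader.  */
static void
toy_union (struct toy_partitions *p, int a, int b)
{
  int leader_a = toy_find (p, a);
  int leader_b = toy_find (p, b);
  p->parent[leader_b] = leader_a;
}

/* BYREF_DEFS has bit V set if version V is the default def of a byref
   parm.  Return a bitmap with bit L set if the partition led by L
   holds at least one such default def.  */
static unsigned long long
toy_byref_partitions (struct toy_partitions *p, unsigned long long byref_defs)
{
  unsigned long long result = 0;
  for (int v = 0; v < TOY_MAX_VERSIONS; v++)
    if (byref_defs & (1ULL << v))
      result |= 1ULL << toy_find (p, v);
  return result;
}
```

With such a map, identifying a partition that holds a byref default def would no longer depend on keeping that default def alone in its partition.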

> Do you think this splitting is feasible and my concern about the
> code-gen issues warranted?

It is feasible but not exactly easy.

As for codegen, I hope to have covered all cases now, but should we find
out I haven't, I'll try the split and see what that gets us.  Did you
have any special cases in mind that it looks like I may have missed?

Thanks,

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-05  0:39                                                   ` Alexandre Oliva
@ 2015-08-05  9:14                                                     ` Richard Biener
  2015-08-05 23:03                                                       ` Alexandre Oliva
  0 siblings, 1 reply; 127+ messages in thread
From: Richard Biener @ 2015-08-05  9:14 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: H.J. Lu, Segher Boessenkool, Jeff Law, GCC Patches,
	Christophe Lyon, David Edelsohn, Eric Botcazou

On Wed, Aug 5, 2015 at 2:38 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Aug  4, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
>
>> Though I wonder whether it would make sense to split the patch into a
>> first one disabling coalescing of parms (their default defs(?)) and a
>> followup implementing the support for that.
>
> We can't disable coalescing of parms altogether.  With -O0, we must
> coalesce all SSA_NAMEs referencing each parm to a single partition.
> With optimization, we could coalesce parms in general, just not these
> special cases in which the parm is to live in a caller-supplied memory
> block.
>
> Now, it's not coalescing parms proper that brought so much risk to the
> patch, it is assigning rtl to SSA partitions, and having assign_parms*
> use that assignment.  Considering that sometimes a single param
> necessarily ends up in more than one partition, requiring two
> assignments, and that assign_parms* can't deal with that, I don't see
> how to easily disable the cfgexpand logic when it comes to parms, so as
> to be able to leave assign_parms alone.
>
> How about, if further problems arise that justify reverting the patch
> one more time, I'll look into splitting the patch as you suggested, but
> otherwise, I'll save myself the trouble, ok?

Sure.

>> So - is my observation correct that this is only about coalescing of the
>> default defs of parameters, not other SSA names based on parameter decls?
>
> It's more like the opposite, i.e., we *refrain* from coalescing other
> SSA_NAMEs related to byref params, so that we can easily tell when a
> partition references a byref param and whether that partition holds its
> default def.  We could have coalesced any other names that ended up in
> different partitions, and even the partition holding the default def, if
> we had other means to identify partitions with default defs of byref
> params.  For example, we could create a bitmap of byref param default
> def versions, and then, after partitioning, map those to the partitions
> they were assigned to.  In fact, I might do that as a followup.
>
>> Do you think this splitting is feasible and my concern about the
>> code-gen issues warranted?
>
> It is feasible but not exactly easy.
>
> As for codegen, I hope to have covered all cases now, but should we find
> out I haven't, I'll try the split and see what that gets us.  Did you
> have any special cases in mind that it looks like I may have missed?

It was just a hunch when you talked about BLKmode and params in memory.
As coalescing is about SSA name (thus register) coalescing I was thinking
that if you coalesce a register with incoming memory you'll end up with
more memory accesses?  But maybe I'm completely off here.

I also thought of the RTL expansion thing we do with at first copying
the hardreg incoming args to pseudos and how that interacts with
coalescing.

But I guess you have eyed code-gen changes a bit anyway.

Thanks,
Richard.



* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-05  9:14                                                     ` Richard Biener
@ 2015-08-05 23:03                                                       ` Alexandre Oliva
  0 siblings, 0 replies; 127+ messages in thread
From: Alexandre Oliva @ 2015-08-05 23:03 UTC (permalink / raw)
  To: Richard Biener
  Cc: H.J. Lu, Segher Boessenkool, Jeff Law, GCC Patches,
	Christophe Lyon, David Edelsohn, Eric Botcazou

On Aug  5, 2015, Richard Biener <richard.guenther@gmail.com> wrote:

> It was just a hunch when you talked about BLKmode and params in memory.
> As coalescing is about SSA name (thus register) coalescing I was thinking
> that if you coalesce a register with incoming memory you'll end up with
> more memory accesses?

Since we only coalesce variables whose promoted mode is the same, if one
of them gets BLKmode and has to live in memory, so would all the others
it might coalesce with.  So, even though we have gimple_regs, we can't
have pseudos.  This was observed with vector types for which no native
vector mode is available.

It would still make sense to share them when possible, to reduce the
number of mem-to-mem copies.  And we don't want to copy incoming BLKmode
parms to *another* memory location if we can help it.

Now, maybe you're concerned about incoming parms passed by reference
that *can* be held in pseudos.  For those, we will perform a load from
memory to a pseudo and use that, even if the pseudo ends up allocated in
memory.
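
A rough sketch of the promoted-mode rule discussed above, modelled loosely on the final test in gimple_can_coalesce_p. This is a toy, not the GCC implementation: the `toy_*` types and enumerators are hypothetical, and it assumes the earlier checks (same variable, compatible types, register use) have already been made.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the kind of base variable behind an SSA name and
   for machine modes.  */
enum toy_kind { TOY_ANON, TOY_VAR, TOY_PARM, TOY_RESULT };
enum toy_mode { TOY_SImode, TOY_DImode, TOY_BLKmode };

struct toy_name
{
  enum toy_kind kind;		/* What the SSA name is based on.  */
  enum toy_mode type_mode;	/* Mode of the declared type.  */
  enum toy_mode promoted_mode;	/* Mode after argument promotion.  */
};

/* True for anonymous names and ordinary variables, which all follow
   the same promotion rules.  */
static bool
toy_plain_p (const struct toy_name *n)
{
  return n->kind == TOY_ANON || n->kind == TOY_VAR;
}

/* Rough analogue of parm_maybe_byref_p: a parm or result whose type
   has BLKmode may be passed by reference.  */
static bool
toy_maybe_byref_p (const struct toy_name *n)
{
  if (toy_plain_p (n))
    return false;
  return n->type_mode == TOY_BLKmode;
}

/* Skip the promotion test when both names are plain; otherwise refuse
   maybe-byref parms and require equal promoted modes.  */
static bool
toy_can_coalesce_p (const struct toy_name *a, const struct toy_name *b)
{
  if (toy_plain_p (a) && toy_plain_p (b))
    return true;

  return (!toy_maybe_byref_p (a)
	  && !toy_maybe_byref_p (b)
	  && a->promoted_mode == b->promoted_mode);
}
```

In this model a BLKmode parm never coalesces with names of other variables, while a register-sized parm coalesces only with names that promote to the same mode.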

> I also thought of the RTL expansion thing we do with at first copying
> the hardreg incoming args to pseudos and how that interacts with
> coalescing.

Most of what changed now is who gets to choose the pseudo; it used to be
assign_parms, now it's cfgexpand.  The other significant change is that
now, when cfgexpand detects a BLKmode parm, it will choose MEM, but it
won't set up the address, so that assign_parms still does what it used
to, namely, copy the incoming hard reg to a pseudo, and then use the
pseudo as the MEM address.
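
The division of labour described above can be sketched in plain C. This is only a toy under stated assumptions: `toy_reg`, `toy_mem` and the functions below are hypothetical stand-ins for rtx values and for the cfgexpand and assign_parms steps, not the actual GCC data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins, not GCC's rtx: a register holds one word, and a MEM
   refers to the register that holds its address.  */
struct toy_reg { uintptr_t value; };
struct toy_mem { const struct toy_reg *addr; };

/* cfgexpand-like step: decide that the parm lives in memory, but
   leave the address unset so the MEM cannot be used by accident.  */
static struct toy_mem
toy_expand_blk_parm (void)
{
  struct toy_mem mem = { NULL };
  return mem;
}

/* True once the MEM has been given an address.  */
static bool
toy_mem_ready_p (const struct toy_mem *mem)
{
  return mem->addr != NULL;
}

/* assign_parms-like step: copy the incoming hard register to a
   pseudo, then use the pseudo as the MEM address.  */
static void
toy_assign_blk_parm (struct toy_mem *mem, const struct toy_reg *hard_reg,
		     struct toy_reg *pseudo)
{
  assert (!toy_mem_ready_p (mem));
  pseudo->value = hard_reg->value;
  mem->addr = pseudo;
}
```

The point of the unset address is that any use of the MEM before the second step is detectable, rather than silently reading through a stale location.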

> But I guess you have eyed code-gen changes a bit anyway.

Yeah.  Not much has changed in the area before parm_birth; the expected
changes have to do with pseudo numbering.  IIRC, anything else would
be unexpected.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-03 23:46                                               ` Alexandre Oliva
  2015-08-04  9:48                                                 ` Richard Biener
@ 2015-08-10  8:24                                                 ` James Greenhalgh
  2015-08-10 15:14                                                   ` Jeff Law
  1 sibling, 1 reply; 127+ messages in thread
From: James Greenhalgh @ 2015-08-10  8:24 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: H.J. Lu, Segher Boessenkool, Richard Biener, Jeff Law,
	GCC Patches, Christophe Lyon, David Edelsohn, Eric Botcazou

On Tue, Aug 04, 2015 at 12:45:28AM +0100, Alexandre Oliva wrote:
> On Jul 30, 2015, "H.J. Lu" <hjl.tools@gmail.com> wrote:
> 
> > aoliva/pr64164  is fine on x32.
> 
> Thanks.  I have made a large number of changes since you tested it,
> fixing all the reported issues and then some.  Now, x86_64-linux-gnu
> (-m64 and -m32), i686-pc-linux-gnu, powerpc64-linux-gnu and
> powerpc64el-linux-gnu pass regstrap (r226317), and the many tens of
> targets I cross-tested still get the same 'make all' errors that the
> pristine tree did.

For what it is worth, I bootstrapped and tested the consolidated patch
on arm-none-linux-gnueabihf and aarch64-none-linux-gnu with trunk at
r226516 over the weekend, and didn't see any new issues.

Thanks,
James


* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-10  8:24                                                 ` James Greenhalgh
@ 2015-08-10 15:14                                                   ` Jeff Law
  2015-08-11  4:53                                                     ` Patrick Marlier
  0 siblings, 1 reply; 127+ messages in thread
From: Jeff Law @ 2015-08-10 15:14 UTC (permalink / raw)
  To: James Greenhalgh, Alexandre Oliva
  Cc: H.J. Lu, Segher Boessenkool, Richard Biener, GCC Patches,
	Christophe Lyon, David Edelsohn, Eric Botcazou

On 08/10/2015 02:23 AM, James Greenhalgh wrote:
> On Tue, Aug 04, 2015 at 12:45:28AM +0100, Alexandre Oliva wrote:
>> On Jul 30, 2015, "H.J. Lu" <hjl.tools@gmail.com> wrote:
>>
>>> aoliva/pr64164  is fine on x32.
>>
>> Thanks.  I have made a large number of changes since you tested it,
>> fixing all the reported issues and then some.  Now, x86_64-linux-gnu
>> (-m64 and -m32), i686-pc-linux-gnu, powerpc64-linux-gnu and
>> powerpc64el-linux-gnu pass regstrap (r226317), and the many tens of
>> targets I cross-tested still get the same 'make all' errors that the
>> pristine tree did.
>
> For what it is worth, I bootstrapped and tested the consolidated patch
> on arm-none-linux-gnueabihf and aarch64-none-linux-gnu with trunk at
> r226516 over the weekend, and didn't see any new issues.
Thanks -- I know it's been a long road on this patch.  I don't think 
anyone would have ever guessed fixing 64164 would be so complex.

jeff


* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-10 15:14                                                   ` Jeff Law
@ 2015-08-11  4:53                                                     ` Patrick Marlier
  2015-08-14 19:03                                                       ` Alexandre Oliva
  0 siblings, 1 reply; 127+ messages in thread
From: Patrick Marlier @ 2015-08-11  4:53 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Jeff Law, James Greenhalgh, H.J. Lu, Segher Boessenkool,
	Richard Biener, GCC Patches, Christophe Lyon, David Edelsohn,
	Eric Botcazou

On Mon, Aug 10, 2015 at 5:14 PM, Jeff Law <law@redhat.com> wrote:
> On 08/10/2015 02:23 AM, James Greenhalgh wrote:
>>
>> On Tue, Aug 04, 2015 at 12:45:28AM +0100, Alexandre Oliva wrote:
>>>
>>> On Jul 30, 2015, "H.J. Lu" <hjl.tools@gmail.com> wrote:
>>>
>>>> aoliva/pr64164  is fine on x32.
>>>
>>>
>>> Thanks.  I have made a large number of changes since you tested it,
>>> fixing all the reported issues and then some.  Now, x86_64-linux-gnu
>>> (-m64 and -m32), i686-pc-linux-gnu, powerpc64-linux-gnu and
>>> powerpc64el-linux-gnu pass regstrap (r226317), and the many tens of
>>> targets I cross-tested still get the same 'make all' errors that the
>>> pristine tree did.
>>
>>
>> For what it is worth, I bootstrapped and tested the consolidated patch
>> on arm-none-linux-gnueabihf and aarch64-none-linux-gnu with trunk at
>> r226516 over the weekend, and didn't see any new issues.
>
> Thanks -- I know it's been a long road on this patch.  I don't think anyone
> would have ever guessed fixing 64164 would be so complex.

Especially as the bug reporter, I am impressed how a slight problem
can lead to such a patch! ;)
Thanks a lot Alexandre!

I feel like I owe you something for this hard work!
Feel free to ping me if I can help you with something, or I owe you at
least a beer when you are around Switzerland. :)
--
Pat


* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-11  4:53                                                     ` Patrick Marlier
@ 2015-08-14 19:03                                                       ` Alexandre Oliva
  2015-08-15  8:57                                                         ` Andreas Schwab
                                                                           ` (3 more replies)
  0 siblings, 4 replies; 127+ messages in thread
From: Alexandre Oliva @ 2015-08-14 19:03 UTC (permalink / raw)
  To: Patrick Marlier
  Cc: Jeff Law, James Greenhalgh, H.J. Lu, Segher Boessenkool,
	Richard Biener, GCC Patches, Christophe Lyon, David Edelsohn,
	Eric Botcazou

On Aug 11, 2015, Patrick Marlier <patrick.marlier@gmail.com> wrote:

> On Mon, Aug 10, 2015 at 5:14 PM, Jeff Law <law@redhat.com> wrote:
>> On 08/10/2015 02:23 AM, James Greenhalgh wrote:

>>> For what it is worth, I bootstrapped and tested the consolidated patch
>>> on arm-none-linux-gnueabihf and aarch64-none-linux-gnu with trunk at
>>> r226516 over the weekend, and didn't see any new issues.

Thanks!

> Especially as the bug reporter, I am impressed how a slight problem
> can lead to such a patch! ;)
> Thanks a lot Alexandre!

You're welcome.  I'm glad it appears to be working to everyone's
satisfaction now.  I've just committed it as r226901, with only a
context adjustment to account for a change in use_register_for_decl in
function.c.  /me crosses fingers :-)

Here's the patch as checked in:

for  gcc/ChangeLog

	PR rtl-optimization/64164
	PR bootstrap/66978
	PR middle-end/66983
	PR rtl-optimization/67000
	PR middle-end/67034
	PR middle-end/67035
	* Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
	* tree-ssa-copyrename.c: Removed.
	* opts.c (default_options_table): Drop -ftree-copyrename.  Add
	-ftree-coalesce-vars.
	* passes.def: Drop all occurrences of pass_rename_ssa_copies.
	* common.opt (ftree-copyrename): Ignore.
	(ftree-coalesce-inlined-vars): Likewise.
	* doc/invoke.texi: Remove the ignored options above.
	* gimple-expr.h (gimple_can_coalesce_p): Move declaration
	* tree-ssa-coalesce.h: ... here.
	* tree-ssa-uncprop.c: Include tree-ssa-coalesce.h and other
	headers required by it.
	* gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
	across variables when flag_tree_coalesce_vars.  Check register
	use and promoted modes to allow coalescing.  Do not coalesce
	maybe-byref parms with SSA_NAMEs of other variables, or
	anonymous SSA_NAMEs.  Moved to tree-ssa-coalesce.c.
	* tree-ssa-live.c (struct tree_int_map_hasher): Move along
	with its member functions to tree-ssa-coalesce.c.
	(var_map_base_init): Likewise.  Renamed to
	compute_samebase_partition_bases.
	(partition_view_normal): Drop want_bases parameter.
	(partition_view_bitmap): Likewise.
	* tree-ssa-live.h: Adjust declarations.
	* tree-ssa-coalesce.c: Include explow.h and cfgexpand.h.
	(build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs'
	default defs at the entry point.
	(dump_part_var_map): New.
	(compute_optimized_partition_bases): New, called by...
	(coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
	of compute_samebase_partition_bases.  Adjust.
	* alias.c (nonoverlapping_memrefs_p): Disregard gimple-regs.
	* cfgexpand.c (leader_merge, parm_maybe_byref_p): New.
	(ssa_default_def_partition): New.
	(get_rtl_for_parm_ssa_default_def): New.
	(align_local_variable, add_stack_var): Support anonymous SSA
	names.
	(defer_stack_allocation): Likewise.  Declare earlier.
	(set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
	vars.  Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
	Do not record deferred-allocation marker in
	SA.partition_to_pseudo.
	(expand_stack_vars): Adjust check for the marker in it.
	(expand_one_stack_var_at): Handle anonymous SSA_NAMEs.  Drop
	redundant MEM attr setting.
	(expand_one_stack_var_1): Handle anonymous SSA_NAMEs.  Renamed
	from...
	(expand_one_stack_var): ... this.  New wrapper to check and
	skip already expanded SSA partitions.
	(record_alignment_for_reg_var): New, factored out of...
	(expand_one_var): ... this.
	(expand_one_ssa_partition): New.
	(adjust_one_expanded_partition_var): New.
	(expand_one_register_var): Check and skip already expanded SSA
	partitions.
	(expand_used_vars): Don't create DECLs for anonymous SSA
	names.  Expand all SSA partitions, then adjust all SSA names.
	(pass::execute): Replace the loops that set
	SA.partition_to_pseudo from partition leaders and cleared
	DECL_RTL for multi-location variables, and that which used to
	rename vars and set attrs, with one that clears DECL_RTL and
	checks that PARMs and RESULTs default_defs match DECL_RTL.
	* cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
	* emit-rtl.c: Include stor-layout.h.
	(set_reg_attrs_for_parm): Handle NULL decl.
	(set_reg_attrs_for_decl_rtl): Take mode from expression if
	it's not a DECL.
	* stmt.c (emit_case_decision_tree): Pass it the SSA_NAME
	rather than its possibly-NULL DECL.
	* explow.c (promote_ssa_mode): New.
	* explow.h (promote_ssa_mode): Declare.
	* expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
	(read_complex_part): Export.
	* expr.h (read_complex_part): Declare.
	* cfgexpand.h (parm_maybe_byref_p): Declare.
	* function.c: Include cfgexpand.h.
	(use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
	(use_register_for_parm_decl): Wrapper for the above to
	special-case the result_ptr.
	(rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
	(split_complex_args): Take assign_parm_data_all argument.
	Pass it to rtl_for_parm.  Set up rtl and context for split
	args.  Reset complex parm before fetching its default decl
	rtl.
	(assign_parms_unsplit_complex): Use the default-def complex
	parm rtl if it matches the components.
	(assign_parms_augmented_arg_list): Adjust.
	(maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
	multiple locations.  Recognize split complex args.
	(assign_parm_adjust_stack_rtl): Add all and parm arguments,
	for rtl_for_parm.  For SSA-assigned parms, zero stack_parm.
	(assign_parm_setup_block): Prefer SSA-assigned location, and
	fill in its address if the memory location of a maybe-byref
	parm was not assigned by cfgexpand.
	(assign_parm_setup_reg): Likewise.  Adjust its mode as
	needed.  Use entry_parm for equiv if stack_parm is NULL.  Make
	sure passed_pointer parms don't need conversion.  Copy address
	or value as needed.
	(assign_parm_setup_stack): Prefer SSA-assigned location.
	(assign_parms): Maybe reset DECL_RTL of params.  Adjust stack
	rtl before testing for pointer bounds.  Special-case result_ptr.
	(expand_function_start): Maybe reset DECL_RTL of result.
	Prefer SSA-assigned location for result and static chain.
	Factor out DECL_RESULT and SET_DECL_RTL.  Convert static chain
	to Pmode if needed, from H.J. Lu  <hongjiu.lu@intel.com>.
	* tree-outof-ssa.c (insert_value_copy_on_edge): Handle
	anonymous SSA names.  Use promote_ssa_mode.
	(get_temp_reg): Likewise.
	(remove_ssa_form): Adjust.
	* stor-layout.c (layout_decl): Don't set mem attributes of
	non-MEMs.
	* var-tracking.c (dataflow_set_clear_at_call): Take call_insn
	and get its reg_usage for reg invalidation.
	(compute_bb_dataflow): Pass it insn.
	(emit_notes_in_bb): Likewise.

for  gcc/testsuite/ChangeLog

	* gcc.dg/guality/pr54200.c: Add -fno-tree-coalesce-vars.
	* gcc.dg/ssp-1.c: Make counter a register.
	* gcc.dg/ssp-2.c: Likewise.
	* gcc.dg/torture/parm-coalesce.c: New.
---
 gcc/Makefile.in                              |    1 
 gcc/alias.c                                  |   13 +
 gcc/cfgexpand.c                              |  471 +++++++++++++++++++-------
 gcc/cfgexpand.h                              |    3 
 gcc/common.opt                               |   12 -
 gcc/doc/invoke.texi                          |   48 +--
 gcc/emit-rtl.c                               |    8 
 gcc/explow.c                                 |   29 ++
 gcc/explow.h                                 |    3 
 gcc/expr.c                                   |   41 +-
 gcc/expr.h                                   |    1 
 gcc/function.c                               |  341 +++++++++++++++----
 gcc/gimple-expr.c                            |   39 --
 gcc/gimple-expr.h                            |    1 
 gcc/opts.c                                   |    2 
 gcc/passes.def                               |    5 
 gcc/stmt.c                                   |    2 
 gcc/stor-layout.c                            |    3 
 gcc/testsuite/gcc.dg/guality/pr54200.c       |    2 
 gcc/testsuite/gcc.dg/ssp-1.c                 |    2 
 gcc/testsuite/gcc.dg/ssp-2.c                 |    2 
 gcc/testsuite/gcc.dg/torture/parm-coalesce.c |   40 ++
 gcc/tree-outof-ssa.c                         |   16 -
 gcc/tree-ssa-coalesce.c                      |  384 +++++++++++++++++++++
 gcc/tree-ssa-coalesce.h                      |    1 
 gcc/tree-ssa-copyrename.c                    |  475 --------------------------
 gcc/tree-ssa-live.c                          |   99 -----
 gcc/tree-ssa-live.h                          |    4 
 gcc/tree-ssa-uncprop.c                       |    5 
 gcc/var-tracking.c                           |   12 -
 30 files changed, 1187 insertions(+), 878 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/parm-coalesce.c
 delete mode 100644 gcc/tree-ssa-copyrename.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index c1cb4ce..e298ecc 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1447,7 +1447,6 @@ OBJS = \
 	tree-ssa-ccp.o \
 	tree-ssa-coalesce.o \
 	tree-ssa-copy.o \
-	tree-ssa-copyrename.o \
 	tree-ssa-dce.o \
 	tree-ssa-dom.o \
 	tree-ssa-dse.o \
diff --git a/gcc/alias.c b/gcc/alias.c
index fa7d5d8..4681e3f 100644
--- a/gcc/alias.c
+++ b/gcc/alias.c
@@ -2507,6 +2507,19 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
   if (! DECL_P (exprx) || ! DECL_P (expry))
     return 0;
 
+  /* If we refer to different gimple registers, or one gimple register
+     and one non-gimple-register, we know they can't overlap.  First,
+     gimple registers don't have their addresses taken.  Now, there
+     could be more than one stack slot for (different versions of) the
+     same gimple register, but we can presumably tell they don't
+     overlap based on offsets from stack base addresses elsewhere.
+     It's important that we don't proceed to DECL_RTL, because gimple
+     registers may not pass DECL_RTL_SET_P, and make_decl_rtl won't be
+     able to do anything about them since no SSA information will have
+     remained to guide it.  */
+  if (is_gimple_reg (exprx) || is_gimple_reg (expry))
+    return exprx != expry;
+
   /* With invalid code we can end up storing into the constant pool.
      Bail out to avoid ICEing when creating RTL for this.
      See gfortran.dg/lto/20091028-2_0.f90.  */
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 7df9d06..0bc20f6 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -97,6 +97,8 @@ gimple currently_expanding_gimple_stmt;
 
 static rtx expand_debug_expr (tree);
 
+static bool defer_stack_allocation (tree, bool);
+
 /* Return an expression tree corresponding to the RHS of GIMPLE
    statement STMT.  */
 
@@ -150,21 +152,149 @@ gimple_assign_rhs_to_tree (gimple stmt)
 
 #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
 
+/* Choose either CUR or NEXT as the leader DECL for a partition.
+   Prefer ignored decls, to simplify debug dumps and reduce ambiguity
+   out of the same user variable being in multiple partitions (this is
+   less likely for compiler-introduced temps).  */
+
+static tree
+leader_merge (tree cur, tree next)
+{
+  if (cur == NULL || cur == next)
+    return next;
+
+  if (DECL_P (cur) && DECL_IGNORED_P (cur))
+    return cur;
+
+  if (DECL_P (next) && DECL_IGNORED_P (next))
+    return next;
+
+  return cur;
+}
+
+/* Return true if VAR is a PARM_DECL or a RESULT_DECL of type BLKmode.
+   Such parameters are likely passed as a pointer to the value, rather
+   than as a value, and so we must not coalesce them, nor allocate
+   stack space for them before determining the calling conventions for
+   them.  For their SSA_NAMEs, expand_one_ssa_partition emits RTL as
+   MEMs with pc_rtx as the address, and then it replaces the pc_rtx
+   with NULL so as to make sure the MEM is not used before it is
+   adjusted in assign_parm_setup_reg.  */
+
+bool
+parm_maybe_byref_p (tree var)
+{
+  if (!var || VAR_P (var))
+    return false;
+
+  gcc_assert (TREE_CODE (var) == PARM_DECL
+	      || TREE_CODE (var) == RESULT_DECL);
+
+  return TYPE_MODE (TREE_TYPE (var)) == BLKmode;
+}
+
+/* Return the partition of the default SSA_DEF for decl VAR.  */
+
+static int
+ssa_default_def_partition (tree var)
+{
+  tree name = ssa_default_def (cfun, var);
+
+  if (!name)
+    return NO_PARTITION;
+
+  return var_to_partition (SA.map, name);
+}
+
+/* Return the RTL for the default SSA def of a PARM or RESULT, if
+   there is one.  */
+
+rtx
+get_rtl_for_parm_ssa_default_def (tree var)
+{
+  gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL);
+
+  if (!is_gimple_reg (var))
+    return NULL_RTX;
+
+  /* If we've already determined RTL for the decl, use it.  This is
+     not just an optimization: if VAR is a PARM whose incoming value
+     is unused, we won't find a default def to use its partition, but
+     we still want to use the location of the parm, if it was used at
+     all.  During assign_parms, until a location is assigned for the
+     VAR, RTL can only be set for a parm or result if we're not coalescing
+     across variables, when we know we're coalescing all SSA_NAMEs of
+     each parm or result, and we're not coalescing them with names
+     pertaining to other variables, such as other parms' default
+     defs.  */
+  if (DECL_RTL_SET_P (var))
+    {
+      gcc_assert (DECL_RTL (var) != pc_rtx);
+      return DECL_RTL (var);
+    }
+
+  int part = ssa_default_def_partition (var);
+  if (part == NO_PARTITION)
+    return NULL_RTX;
+
+  return SA.partition_to_pseudo[part];
+}
+
 /* Associate declaration T with storage space X.  If T is no
    SSA name this is exactly SET_DECL_RTL, otherwise make the
    partition of T associated with X.  */
 static inline void
 set_rtl (tree t, rtx x)
 {
+  if (x && SSAVAR (t))
+    {
+      bool skip = false;
+      tree cur = NULL_TREE;
+
+      if (MEM_P (x))
+	cur = MEM_EXPR (x);
+      else if (REG_P (x))
+	cur = REG_EXPR (x);
+      else if (GET_CODE (x) == CONCAT
+	       && REG_P (XEXP (x, 0)))
+	cur = REG_EXPR (XEXP (x, 0));
+      else if (GET_CODE (x) == PARALLEL)
+	cur = REG_EXPR (XVECEXP (x, 0, 0));
+      else if (x == pc_rtx)
+	skip = true;
+      else
+	gcc_unreachable ();
+
+      tree next = skip ? cur : leader_merge (cur, SSAVAR (t));
+
+      if (cur != next)
+	{
+	  if (MEM_P (x))
+	    set_mem_attributes (x, next, true);
+	  else
+	    set_reg_attrs_for_decl_rtl (next, x);
+	}
+    }
+
   if (TREE_CODE (t) == SSA_NAME)
     {
-      SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x;
-      if (x && !MEM_P (x))
-	set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x);
-      /* For the benefit of debug information at -O0 (where vartracking
-         doesn't run) record the place also in the base DECL if it's
-	 a normal variable (not a parameter).  */
-      if (x && x != pc_rtx && TREE_CODE (SSA_NAME_VAR (t)) == VAR_DECL)
+      int part = var_to_partition (SA.map, t);
+      if (part != NO_PARTITION)
+	{
+	  if (SA.partition_to_pseudo[part])
+	    gcc_assert (SA.partition_to_pseudo[part] == x);
+	  else if (x != pc_rtx)
+	    SA.partition_to_pseudo[part] = x;
+	}
+      /* For the benefit of debug information at -O0 (where
+         vartracking doesn't run) record the place also in the base
+         DECL.  For PARMs and RESULTs, we may end up resetting these
+         in function.c:maybe_reset_rtl_for_parm, but in some rare
+         cases we may need them (unused and overwritten incoming
+         value, that at -O0 must share the location with the other
+         uses in spite of the missing default def), and this may be
+         the only chance to preserve them.  */
+      if (x && x != pc_rtx && SSA_NAME_VAR (t))
 	{
 	  tree var = SSA_NAME_VAR (t);
 	  /* If we don't yet have something recorded, just record it now.  */
@@ -248,8 +378,15 @@ static bool has_short_buffer;
 static unsigned int
 align_local_variable (tree decl)
 {
-  unsigned int align = LOCAL_DECL_ALIGNMENT (decl);
-  DECL_ALIGN (decl) = align;
+  unsigned int align;
+
+  if (TREE_CODE (decl) == SSA_NAME)
+    align = TYPE_ALIGN (TREE_TYPE (decl));
+  else
+    {
+      align = LOCAL_DECL_ALIGNMENT (decl);
+      DECL_ALIGN (decl) = align;
+    }
   return align / BITS_PER_UNIT;
 }
 
@@ -315,12 +452,15 @@ add_stack_var (tree decl)
   decl_to_stack_part->put (decl, stack_vars_num);
 
   v->decl = decl;
-  v->size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (decl)));
+  tree size = TREE_CODE (decl) == SSA_NAME
+    ? TYPE_SIZE_UNIT (TREE_TYPE (decl))
+    : DECL_SIZE_UNIT (decl);
+  v->size = tree_to_uhwi (size);
   /* Ensure that all variables have size, so that &a != &b for any two
      variables that are simultaneously live.  */
   if (v->size == 0)
     v->size = 1;
-  v->alignb = align_local_variable (SSAVAR (decl));
+  v->alignb = align_local_variable (decl);
   /* An alignment of zero can mightily confuse us later.  */
   gcc_assert (v->alignb != 0);
 
@@ -862,7 +1002,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
   gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
 
   x = plus_constant (Pmode, base, offset);
-  x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x);
+  x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
+		   ? TYPE_MODE (TREE_TYPE (decl))
+		   : DECL_MODE (SSAVAR (decl)), x);
 
   if (TREE_CODE (decl) != SSA_NAME)
     {
@@ -884,7 +1026,6 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
       DECL_USER_ALIGN (decl) = 0;
     }
 
-  set_mem_attributes (x, SSAVAR (decl), true);
   set_rtl (decl, x);
 }
 
@@ -950,9 +1091,9 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
 	  /* Skip variables that have already had rtl assigned.  See also
 	     add_stack_var where we perpetrate this pc_rtx hack.  */
 	  decl = stack_vars[i].decl;
-	  if ((TREE_CODE (decl) == SSA_NAME
-	      ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)]
-	      : DECL_RTL (decl)) != pc_rtx)
+	  if (TREE_CODE (decl) == SSA_NAME
+	      ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)] != NULL_RTX
+	      : DECL_RTL (decl) != pc_rtx)
 	    continue;
 
 	  large_size += alignb - 1;
@@ -981,9 +1122,9 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
       /* Skip variables that have already had rtl assigned.  See also
 	 add_stack_var where we perpetrate this pc_rtx hack.  */
       decl = stack_vars[i].decl;
-      if ((TREE_CODE (decl) == SSA_NAME
-	   ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)]
-	   : DECL_RTL (decl)) != pc_rtx)
+      if (TREE_CODE (decl) == SSA_NAME
+	  ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)] != NULL_RTX
+	  : DECL_RTL (decl) != pc_rtx)
 	continue;
 
       /* Check the predicate to see whether this variable should be
@@ -1099,13 +1240,22 @@ account_stack_vars (void)
    to a variable to be allocated in the stack frame.  */
 
 static void
-expand_one_stack_var (tree var)
+expand_one_stack_var_1 (tree var)
 {
   HOST_WIDE_INT size, offset;
   unsigned byte_align;
 
-  size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
-  byte_align = align_local_variable (SSAVAR (var));
+  if (TREE_CODE (var) == SSA_NAME)
+    {
+      tree type = TREE_TYPE (var);
+      size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
+      byte_align = TYPE_ALIGN_UNIT (type);
+    }
+  else
+    {
+      size = tree_to_uhwi (DECL_SIZE_UNIT (var));
+      byte_align = align_local_variable (var);
+    }
 
   /* We handle highly aligned variables in expand_stack_vars.  */
   gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
@@ -1116,6 +1266,27 @@ expand_one_stack_var (tree var)
 			   crtl->max_used_stack_slot_alignment, offset);
 }
 
+/* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are
+   already assigned some MEM.  */
+
+static void
+expand_one_stack_var (tree var)
+{
+  if (TREE_CODE (var) == SSA_NAME)
+    {
+      int part = var_to_partition (SA.map, var);
+      if (part != NO_PARTITION)
+	{
+	  rtx x = SA.partition_to_pseudo[part];
+	  gcc_assert (x);
+	  gcc_assert (MEM_P (x));
+	  return;
+	}
+    }
+
+  return expand_one_stack_var_1 (var);
+}
+
 /* A subroutine of expand_one_var.  Called to assign rtl to a VAR_DECL
    that will reside in a hard register.  */
 
@@ -1125,13 +1296,136 @@ expand_one_hard_reg_var (tree var)
   rest_of_decl_compilation (var, 0, 0);
 }
 
+/* Record the alignment requirements of some variable assigned to a
+   pseudo.  */
+
+static void
+record_alignment_for_reg_var (unsigned int align)
+{
+  if (SUPPORTS_STACK_ALIGNMENT
+      && crtl->stack_alignment_estimated < align)
+    {
+      /* stack_alignment_estimated shouldn't change after the stack
+         realign decision is made.  */
+      gcc_assert (!crtl->stack_realign_processed);
+      crtl->stack_alignment_estimated = align;
+    }
+
+  /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
+     So here we only make sure stack_alignment_needed >= align.  */
+  if (crtl->stack_alignment_needed < align)
+    crtl->stack_alignment_needed = align;
+  if (crtl->max_used_stack_slot_alignment < align)
+    crtl->max_used_stack_slot_alignment = align;
+}
+
+/* Create RTL for an SSA partition.  */
+
+static void
+expand_one_ssa_partition (tree var)
+{
+  int part = var_to_partition (SA.map, var);
+  gcc_assert (part != NO_PARTITION);
+
+  if (SA.partition_to_pseudo[part])
+    return;
+
+  unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (var),
+					  TYPE_MODE (TREE_TYPE (var)),
+					  TYPE_ALIGN (TREE_TYPE (var)));
+
+  /* If the variable alignment is very large we'll dynamically allocate
+     it, which means that the in-frame portion is just a pointer.  */
+  if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
+    align = POINTER_SIZE;
+
+  record_alignment_for_reg_var (align);
+
+  if (!use_register_for_decl (var))
+    {
+      if (parm_maybe_byref_p (SSA_NAME_VAR (var))
+	  && ssa_default_def_partition (SSA_NAME_VAR (var)) == part)
+	{
+	  expand_one_stack_var_at (var, pc_rtx, 0, 0);
+	  rtx x = SA.partition_to_pseudo[part];
+	  gcc_assert (GET_CODE (x) == MEM);
+	  gcc_assert (GET_MODE (x) == BLKmode);
+	  gcc_assert (XEXP (x, 0) == pc_rtx);
+	  /* Reset the address, so that any attempt to use it will
+	     ICE.  It will be adjusted in assign_parm_setup_reg.  */
+	  XEXP (x, 0) = NULL_RTX;
+	}
+      else if (defer_stack_allocation (var, true))
+	add_stack_var (var);
+      else
+	expand_one_stack_var_1 (var);
+      return;
+    }
+
+  machine_mode reg_mode = promote_ssa_mode (var, NULL);
+
+  rtx x = gen_reg_rtx (reg_mode);
+
+  set_rtl (var, x);
+}
+
+/* Record the association between the RTL generated for a partition
+   and the underlying variable of the SSA_NAME.  */
+
+static void
+adjust_one_expanded_partition_var (tree var)
+{
+  if (!var)
+    return;
+
+  tree decl = SSA_NAME_VAR (var);
+
+  int part = var_to_partition (SA.map, var);
+  if (part == NO_PARTITION)
+    return;
+
+  rtx x = SA.partition_to_pseudo[part];
+
+  if (!x)
+    {
+      /* This var will get a stack slot later.  */
+      gcc_assert (defer_stack_allocation (var, true));
+      return;
+    }
+
+  set_rtl (var, x);
+
+  if (!REG_P (x))
+    return;
+
+  /* Note if the object is a user variable.  */
+  if (decl && !DECL_ARTIFICIAL (decl))
+    mark_user_reg (x);
+
+  if (POINTER_TYPE_P (decl ? TREE_TYPE (decl) : TREE_TYPE (var)))
+    mark_reg_pointer (x, get_pointer_alignment (var));
+}
+
 /* A subroutine of expand_one_var.  Called to assign rtl to a VAR_DECL
    that will reside in a pseudo register.  */
 
 static void
 expand_one_register_var (tree var)
 {
-  tree decl = SSAVAR (var);
+  if (TREE_CODE (var) == SSA_NAME)
+    {
+      int part = var_to_partition (SA.map, var);
+      if (part != NO_PARTITION)
+	{
+	  rtx x = SA.partition_to_pseudo[part];
+	  gcc_assert (x);
+	  gcc_assert (REG_P (x));
+	  return;
+	}
+      gcc_unreachable ();
+    }
+
+  tree decl = var;
   tree type = TREE_TYPE (decl);
   machine_mode reg_mode = promote_decl_mode (decl, NULL);
   rtx x = gen_reg_rtx (reg_mode);
@@ -1177,10 +1471,14 @@ expand_one_error_var (tree var)
 static bool
 defer_stack_allocation (tree var, bool toplevel)
 {
+  tree size_unit = TREE_CODE (var) == SSA_NAME
+    ? TYPE_SIZE_UNIT (TREE_TYPE (var))
+    : DECL_SIZE_UNIT (var);
+
   /* Whether the variable is small enough for immediate allocation not to be
      a problem with regard to the frame size.  */
   bool smallish
-    = ((HOST_WIDE_INT) tree_to_uhwi (DECL_SIZE_UNIT (var))
+    = ((HOST_WIDE_INT) tree_to_uhwi (size_unit)
        < PARAM_VALUE (PARAM_MIN_SIZE_FOR_STACK_SHARING));
 
   /* If stack protection is enabled, *all* stack variables must be deferred,
@@ -1189,16 +1487,24 @@ defer_stack_allocation (tree var, bool toplevel)
   if (flag_stack_protect || ((flag_sanitize & SANITIZE_ADDRESS) && ASAN_STACK))
     return true;
 
+  unsigned int align = TREE_CODE (var) == SSA_NAME
+    ? TYPE_ALIGN (TREE_TYPE (var))
+    : DECL_ALIGN (var);
+
   /* We handle "large" alignment via dynamic allocation.  We want to handle
      this extra complication in only one place, so defer them.  */
-  if (DECL_ALIGN (var) > MAX_SUPPORTED_STACK_ALIGNMENT)
+  if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
     return true;
 
+  bool ignored = TREE_CODE (var) == SSA_NAME
+    ? !SSAVAR (var) || DECL_IGNORED_P (SSA_NAME_VAR (var))
+    : DECL_IGNORED_P (var);
+
   /* When optimization is enabled, DECL_IGNORED_P variables originally scoped
      might be detached from their block and appear at toplevel when we reach
      here.  We want to coalesce them with variables from other blocks when
      the immediate contribution to the frame size would be noticeable.  */
-  if (toplevel && optimize > 0 && DECL_IGNORED_P (var) && !smallish)
+  if (toplevel && optimize > 0 && ignored && !smallish)
     return true;
 
   /* Variables declared in the outermost scope automatically conflict
@@ -1265,21 +1571,7 @@ expand_one_var (tree var, bool toplevel, bool really_expand)
 	align = POINTER_SIZE;
     }
 
-  if (SUPPORTS_STACK_ALIGNMENT
-      && crtl->stack_alignment_estimated < align)
-    {
-      /* stack_alignment_estimated shouldn't change after stack
-         realign decision made */
-      gcc_assert (!crtl->stack_realign_processed);
-      crtl->stack_alignment_estimated = align;
-    }
-
-  /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
-     So here we only make sure stack_alignment_needed >= align.  */
-  if (crtl->stack_alignment_needed < align)
-    crtl->stack_alignment_needed = align;
-  if (crtl->max_used_stack_slot_alignment < align)
-    crtl->max_used_stack_slot_alignment = align;
+  record_alignment_for_reg_var (align);
 
   if (TREE_CODE (origvar) == SSA_NAME)
     {
@@ -1722,48 +2014,18 @@ expand_used_vars (void)
   if (targetm.use_pseudo_pic_reg ())
     pic_offset_table_rtx = gen_reg_rtx (Pmode);
 
-  hash_map<tree, tree> ssa_name_decls;
   for (i = 0; i < SA.map->num_partitions; i++)
     {
       tree var = partition_to_var (SA.map, i);
 
       gcc_assert (!virtual_operand_p (var));
 
-      /* Assign decls to each SSA name partition, share decls for partitions
-         we could have coalesced (those with the same type).  */
-      if (SSA_NAME_VAR (var) == NULL_TREE)
-	{
-	  tree *slot = &ssa_name_decls.get_or_insert (TREE_TYPE (var));
-	  if (!*slot)
-	    *slot = create_tmp_reg (TREE_TYPE (var));
-	  replace_ssa_name_symbol (var, *slot);
-	}
-
-      /* Always allocate space for partitions based on VAR_DECLs.  But for
-	 those based on PARM_DECLs or RESULT_DECLs and which matter for the
-	 debug info, there is no need to do so if optimization is disabled
-	 because all the SSA_NAMEs based on these DECLs have been coalesced
-	 into a single partition, which is thus assigned the canonical RTL
-	 location of the DECLs.  If in_lto_p, we can't rely on optimize,
-	 a function could be compiled with -O1 -flto first and only the
-	 link performed at -O0.  */
-      if (TREE_CODE (SSA_NAME_VAR (var)) == VAR_DECL)
-	expand_one_var (var, true, true);
-      else if (DECL_IGNORED_P (SSA_NAME_VAR (var)) || optimize || in_lto_p)
-	{
-	  /* This is a PARM_DECL or RESULT_DECL.  For those partitions that
-	     contain the default def (representing the parm or result itself)
-	     we don't do anything here.  But those which don't contain the
-	     default def (representing a temporary based on the parm/result)
-	     we need to allocate space just like for normal VAR_DECLs.  */
-	  if (!bitmap_bit_p (SA.partition_has_default_def, i))
-	    {
-	      expand_one_var (var, true, true);
-	      gcc_assert (SA.partition_to_pseudo[i]);
-	    }
-	}
+      expand_one_ssa_partition (var);
     }
 
+  for (i = 1; i < num_ssa_names; i++)
+    adjust_one_expanded_partition_var (ssa_name (i));
+
   if (flag_stack_protect == SPCT_FLAG_STRONG)
       gen_stack_protect_signal
 	= stack_protect_decl_p () || stack_protect_return_slot_p ();
@@ -5928,35 +6190,6 @@ pass_expand::execute (function *fun)
       parm_birth_insn = var_seq;
     }
 
-  /* Now that we also have the parameter RTXs, copy them over to our
-     partitions.  */
-  for (i = 0; i < SA.map->num_partitions; i++)
-    {
-      tree var = SSA_NAME_VAR (partition_to_var (SA.map, i));
-
-      if (TREE_CODE (var) != VAR_DECL
-	  && !SA.partition_to_pseudo[i])
-	SA.partition_to_pseudo[i] = DECL_RTL_IF_SET (var);
-      gcc_assert (SA.partition_to_pseudo[i]);
-
-      /* If this decl was marked as living in multiple places, reset
-	 this now to NULL.  */
-      if (DECL_RTL_IF_SET (var) == pc_rtx)
-	SET_DECL_RTL (var, NULL);
-
-      /* Some RTL parts really want to look at DECL_RTL(x) when x
-	 was a decl marked in REG_ATTR or MEM_ATTR.  We could use
-	 SET_DECL_RTL here making this available, but that would mean
-	 to select one of the potentially many RTLs for one DECL.  Instead
-	 of doing that we simply reset the MEM_EXPR of the RTL in question,
-	 then nobody can get at it and hence nobody can call DECL_RTL on it.  */
-      if (!DECL_RTL_SET_P (var))
-	{
-	  if (MEM_P (SA.partition_to_pseudo[i]))
-	    set_mem_expr (SA.partition_to_pseudo[i], NULL);
-	}
-    }
-
   /* If we have a class containing differently aligned pointers
      we need to merge those into the corresponding RTL pointer
      alignment.  */
@@ -5964,7 +6197,6 @@ pass_expand::execute (function *fun)
     {
       tree name = ssa_name (i);
       int part;
-      rtx r;
 
       if (!name
 	  /* We might have generated new SSA names in
@@ -5977,20 +6209,25 @@ pass_expand::execute (function *fun)
       if (part == NO_PARTITION)
 	continue;
 
-      /* Adjust all partition members to get the underlying decl of
-	 the representative which we might have created in expand_one_var.  */
-      if (SSA_NAME_VAR (name) == NULL_TREE)
+      gcc_assert (SA.partition_to_pseudo[part]
+		  || defer_stack_allocation (name, true));
+
+      /* If this decl was marked as living in multiple places, reset
+	 this now to NULL.  */
+      tree var = SSA_NAME_VAR (name);
+      if (var && DECL_RTL_IF_SET (var) == pc_rtx)
+	SET_DECL_RTL (var, NULL);
+      /* Check that the pseudos chosen by assign_parms are those of
+	 the corresponding default defs.  */
+      else if (SSA_NAME_IS_DEFAULT_DEF (name)
+	       && (TREE_CODE (var) == PARM_DECL
+		   || TREE_CODE (var) == RESULT_DECL))
 	{
-	  tree leader = partition_to_var (SA.map, part);
-	  gcc_assert (SSA_NAME_VAR (leader) != NULL_TREE);
-	  replace_ssa_name_symbol (name, SSA_NAME_VAR (leader));
+	  rtx in = DECL_RTL_IF_SET (var);
+	  gcc_assert (in);
+	  rtx out = SA.partition_to_pseudo[part];
+	  gcc_assert (in == out || rtx_equal_p (in, out));
 	}
-      if (!POINTER_TYPE_P (TREE_TYPE (name)))
-	continue;
-
-      r = SA.partition_to_pseudo[part];
-      if (REG_P (r))
-	mark_reg_pointer (r, get_pointer_alignment (name));
     }
 
   /* If this function is `main', emit a call to `__main'
diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
index a0b6e3e..987cf356 100644
--- a/gcc/cfgexpand.h
+++ b/gcc/cfgexpand.h
@@ -22,5 +22,8 @@ along with GCC; see the file COPYING3.  If not see
 
 extern tree gimple_assign_rhs_to_tree (gimple);
 extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
+extern bool parm_maybe_byref_p (tree);
+extern rtx get_rtl_for_parm_ssa_default_def (tree var);
+
 
 #endif /* GCC_CFGEXPAND_H */
diff --git a/gcc/common.opt b/gcc/common.opt
index e80eadf..dd59ff3 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2234,16 +2234,16 @@ Common Report Var(flag_tree_ch) Optimization
 Enable loop header copying on trees
 
 ftree-coalesce-inlined-vars
-Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization
-Enable coalescing of copy-related user variables that are inlined
+Common Ignore RejectNegative
+Does nothing.  Preserved for backward compatibility.
 
 ftree-coalesce-vars
-Common Report Var(flag_ssa_coalesce_vars,2) Optimization
-Enable coalescing of all copy-related user variables
+Common Report Var(flag_tree_coalesce_vars) Optimization
+Enable SSA coalescing of user variables
 
 ftree-copyrename
-Common Report Var(flag_tree_copyrename) Optimization
-Replace SSA temporaries with better names in copies
+Common Ignore
+Does nothing.  Preserved for backward compatibility.
 
 ftree-copy-prop
 Common Report Var(flag_tree_copy_prop) Optimization
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 2871337..27be317 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -342,7 +342,6 @@ Objective-C and Objective-C++ Dialects}.
 -fdump-tree-phiprop@r{[}-@var{n}@r{]} @gol
 -fdump-tree-phiopt@r{[}-@var{n}@r{]} @gol
 -fdump-tree-forwprop@r{[}-@var{n}@r{]} @gol
--fdump-tree-copyrename@r{[}-@var{n}@r{]} @gol
 -fdump-tree-nrv -fdump-tree-vect @gol
 -fdump-tree-sink @gol
 -fdump-tree-sra@r{[}-@var{n}@r{]} @gol
@@ -448,9 +447,8 @@ Objective-C and Objective-C++ Dialects}.
 -fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol
 -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
 -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
--ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
--ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
--ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
+-ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol
+-ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
 -ftree-loop-if-convert-stores -ftree-loop-im @gol
 -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
 -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
@@ -7133,11 +7131,6 @@ name is made by appending @file{.phiopt} to the source file name.
 Dump each function after forward propagating single use variables.  The file
 name is made by appending @file{.forwprop} to the source file name.
 
-@item copyrename
-@opindex fdump-tree-copyrename
-Dump each function after applying the copy rename optimization.  The file
-name is made by appending @file{.copyrename} to the source file name.
-
 @item nrv
 @opindex fdump-tree-nrv
 Dump each function after applying the named return value optimization on
@@ -7602,8 +7595,8 @@ compilation time.
 -ftree-ccp @gol
 -fssa-phiopt @gol
 -ftree-ch @gol
+-ftree-coalesce-vars @gol
 -ftree-copy-prop @gol
--ftree-copyrename @gol
 -ftree-dce @gol
 -ftree-dominator-opts @gol
 -ftree-dse @gol
@@ -8867,6 +8860,15 @@ be parallelized.  Parallelize all the loops that can be analyzed to
 not contain loop carried dependences without checking that it is
 profitable to parallelize the loops.
 
+@item -ftree-coalesce-vars
+@opindex ftree-coalesce-vars
+Tell the compiler to attempt to combine small user-defined variables
+too, instead of just compiler temporaries.  This may severely limit the
+ability to debug an optimized program compiled with
+@option{-fno-var-tracking-assignments}.  In the negated form, this flag
+prevents SSA coalescing of user variables.  This option is enabled by
+default if optimization is enabled.
+
 @item -ftree-loop-if-convert
 @opindex ftree-loop-if-convert
 Attempt to transform conditional jumps in the innermost loops to
@@ -8980,32 +8982,6 @@ Perform scalar replacement of aggregates.  This pass replaces structure
 references with scalars to prevent committing structures to memory too
 early.  This flag is enabled by default at @option{-O} and higher.
 
-@item -ftree-copyrename
-@opindex ftree-copyrename
-Perform copy renaming on trees.  This pass attempts to rename compiler
-temporaries to other variables at copy locations, usually resulting in
-variable names which more closely resemble the original variables.  This flag
-is enabled by default at @option{-O} and higher.
-
-@item -ftree-coalesce-inlined-vars
-@opindex ftree-coalesce-inlined-vars
-Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
-combine small user-defined variables too, but only if they are inlined
-from other functions.  It is a more limited form of
-@option{-ftree-coalesce-vars}.  This may harm debug information of such
-inlined variables, but it keeps variables of the inlined-into
-function apart from each other, such that they are more likely to
-contain the expected values in a debugging session.
-
-@item -ftree-coalesce-vars
-@opindex ftree-coalesce-vars
-Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
-combine small user-defined variables too, instead of just compiler
-temporaries.  This may severely limit the ability to debug an optimized
-program compiled with @option{-fno-var-tracking-assignments}.  In the
-negated form, this flag prevents SSA coalescing of user variables,
-including inlined ones.  This option is enabled by default.
-
 @item -ftree-ter
 @opindex ftree-ter
 Perform temporary expression replacement during the SSA->normal phase.  Single
diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index d211e6b0..a6ef154 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -63,6 +63,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "target.h"
 #include "builtins.h"
 #include "rtl-iter.h"
+#include "stor-layout.h"
 
 struct target_rtl default_target_rtl;
 #if SWITCHABLE_TARGET
@@ -1233,6 +1234,9 @@ set_reg_attrs_for_parm (rtx parm_rtx, rtx mem)
 void
 set_reg_attrs_for_decl_rtl (tree t, rtx x)
 {
+  if (!t)
+    return;
+  tree tdecl = t;
   if (GET_CODE (x) == SUBREG)
     {
       gcc_assert (subreg_lowpart_p (x));
@@ -1241,7 +1245,9 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x)
   if (REG_P (x))
     REG_ATTRS (x)
       = get_reg_attrs (t, byte_lowpart_offset (GET_MODE (x),
-					       DECL_MODE (t)));
+					       DECL_P (tdecl)
+					       ? DECL_MODE (tdecl)
+					       : TYPE_MODE (TREE_TYPE (tdecl))));
   if (GET_CODE (x) == CONCAT)
     {
       if (REG_P (XEXP (x, 0)))
diff --git a/gcc/explow.c b/gcc/explow.c
index bd342c1..6941f4e 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -842,6 +842,35 @@ promote_decl_mode (const_tree decl, int *punsignedp)
   return pmode;
 }
 
+/* Return the promoted mode for name.  If it is a named SSA_NAME, it
+   is the same as promote_decl_mode.  Otherwise, it is the promoted
+   mode of a temp decl of same type as the SSA_NAME, if we had created
+   one.  */
+
+machine_mode
+promote_ssa_mode (const_tree name, int *punsignedp)
+{
+  gcc_assert (TREE_CODE (name) == SSA_NAME);
+
+  /* Partitions holding parms and results must be promoted as expected
+     by function.c.  */
+  if (SSA_NAME_VAR (name)
+      && (TREE_CODE (SSA_NAME_VAR (name)) == PARM_DECL
+	  || TREE_CODE (SSA_NAME_VAR (name)) == RESULT_DECL))
+    return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
+
+  tree type = TREE_TYPE (name);
+  int unsignedp = TYPE_UNSIGNED (type);
+  machine_mode mode = TYPE_MODE (type);
+
+  machine_mode pmode = promote_mode (type, mode, &unsignedp);
+  if (punsignedp)
+    *punsignedp = unsignedp;
+
+  return pmode;
+}
+
+
 \f
 /* Controls the behaviour of {anti_,}adjust_stack.  */
 static bool suppress_reg_args_size;
diff --git a/gcc/explow.h b/gcc/explow.h
index 94613de..52113db 100644
--- a/gcc/explow.h
+++ b/gcc/explow.h
@@ -57,6 +57,9 @@ extern machine_mode promote_mode (const_tree, machine_mode, int *);
 /* Return mode and signedness to use when object is promoted.  */
 machine_mode promote_decl_mode (const_tree, int *);
 
+/* Return mode and signedness to use when an SSA_NAME is promoted.  */
+machine_mode promote_ssa_mode (const_tree, int *);
+
 /* Remove some bytes from the stack.  An rtx says how many.  */
 extern void adjust_stack (rtx);
 
diff --git a/gcc/expr.c b/gcc/expr.c
index 31b4573..f604f52 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -3022,7 +3022,7 @@ write_complex_part (rtx cplx, rtx val, bool imag_p)
 /* Extract one of the components of the complex value CPLX.  Extract the
    real part if IMAG_P is false, and the imaginary part if it's true.  */
 
-static rtx
+rtx
 read_complex_part (rtx cplx, bool imag_p)
 {
   machine_mode cmode, imode;
@@ -9236,7 +9236,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
   rtx op0, op1, temp, decl_rtl;
   tree type;
   int unsignedp;
-  machine_mode mode;
+  machine_mode mode, dmode;
   enum tree_code code = TREE_CODE (exp);
   rtx subtarget, original_target;
   int ignore;
@@ -9367,7 +9367,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
       if (g == NULL
 	  && modifier == EXPAND_INITIALIZER
 	  && !SSA_NAME_IS_DEFAULT_DEF (exp)
-	  && (optimize || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
+	  && (optimize || !SSA_NAME_VAR (exp)
+	      || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
 	  && stmt_is_replaceable_p (SSA_NAME_DEF_STMT (exp)))
 	g = SSA_NAME_DEF_STMT (exp);
       if (g)
@@ -9446,15 +9447,18 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
       /* Ensure variable marked as used even if it doesn't go through
 	 a parser.  If it hasn't be used yet, write out an external
 	 definition.  */
-      TREE_USED (exp) = 1;
+      if (exp)
+	TREE_USED (exp) = 1;
 
       /* Show we haven't gotten RTL for this yet.  */
       temp = 0;
 
       /* Variables inherited from containing functions should have
 	 been lowered by this point.  */
-      context = decl_function_context (exp);
-      gcc_assert (SCOPE_FILE_SCOPE_P (context)
+      if (exp)
+	context = decl_function_context (exp);
+      gcc_assert (!exp
+		  || SCOPE_FILE_SCOPE_P (context)
 		  || context == current_function_decl
 		  || TREE_STATIC (exp)
 		  || DECL_EXTERNAL (exp)
@@ -9478,7 +9482,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 	  decl_rtl = use_anchored_address (decl_rtl);
 	  if (modifier != EXPAND_CONST_ADDRESS
 	      && modifier != EXPAND_SUM
-	      && !memory_address_addr_space_p (DECL_MODE (exp),
+	      && !memory_address_addr_space_p (exp ? DECL_MODE (exp)
+					       : GET_MODE (decl_rtl),
 					       XEXP (decl_rtl, 0),
 					       MEM_ADDR_SPACE (decl_rtl)))
 	    temp = replace_equiv_address (decl_rtl,
@@ -9489,12 +9494,17 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 	 if the address is a register.  */
       if (temp != 0)
 	{
-	  if (MEM_P (temp) && REG_P (XEXP (temp, 0)))
+	  if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0)))
 	    mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp));
 
 	  return temp;
 	}
 
+      if (exp)
+	dmode = DECL_MODE (exp);
+      else
+	dmode = TYPE_MODE (TREE_TYPE (ssa_name));
+
       /* If the mode of DECL_RTL does not match that of the decl,
 	 there are two cases: we are dealing with a BLKmode value
 	 that is returned in a register, or we are dealing with
@@ -9502,22 +9512,23 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 	 of the wanted mode, but mark it so that we know that it
 	 was already extended.  */
       if (REG_P (decl_rtl)
-	  && DECL_MODE (exp) != BLKmode
-	  && GET_MODE (decl_rtl) != DECL_MODE (exp))
+	  && dmode != BLKmode
+	  && GET_MODE (decl_rtl) != dmode)
 	{
 	  machine_mode pmode;
 
 	  /* Get the signedness to be used for this variable.  Ensure we get
 	     the same mode we got when the variable was declared.  */
-	  if (code == SSA_NAME
-	      && (g = SSA_NAME_DEF_STMT (ssa_name))
-	      && gimple_code (g) == GIMPLE_CALL
-	      && !gimple_call_internal_p (g))
+	  if (code != SSA_NAME)
+	    pmode = promote_decl_mode (exp, &unsignedp);
+	  else if ((g = SSA_NAME_DEF_STMT (ssa_name))
+		   && gimple_code (g) == GIMPLE_CALL
+		   && !gimple_call_internal_p (g))
 	    pmode = promote_function_mode (type, mode, &unsignedp,
 					   gimple_call_fntype (g),
 					   2);
 	  else
-	    pmode = promote_decl_mode (exp, &unsignedp);
+	    pmode = promote_ssa_mode (ssa_name, &unsignedp);
 	  gcc_assert (GET_MODE (decl_rtl) == pmode);
 
 	  temp = gen_lowpart_SUBREG (mode, decl_rtl);
diff --git a/gcc/expr.h b/gcc/expr.h
index 32d1707..a2c8e1d 100644
--- a/gcc/expr.h
+++ b/gcc/expr.h
@@ -210,6 +210,7 @@ extern rtx_insn *emit_move_insn_1 (rtx, rtx);
 
 extern rtx_insn *emit_move_complex_push (machine_mode, rtx, rtx);
 extern rtx_insn *emit_move_complex_parts (rtx, rtx);
+extern rtx read_complex_part (rtx, bool);
 extern void write_complex_part (rtx, rtx, bool);
 extern rtx emit_move_resolve_push (machine_mode, rtx);
 
diff --git a/gcc/function.c b/gcc/function.c
index 20bf3b3..715c19f 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -72,6 +72,9 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfganal.h"
 #include "cfgbuild.h"
 #include "cfgcleanup.h"
+#include "cfgexpand.h"
+#include "basic-block.h"
+#include "df.h"
 #include "params.h"
 #include "bb-reorder.h"
 #include "shrink-wrap.h"
@@ -148,6 +151,9 @@ static bool contains (const_rtx, hash_table<insn_cache_hasher> *);
 static void prepare_function_start (void);
 static void do_clobber_return_reg (rtx, void *);
 static void do_use_return_reg (rtx, void *);
+static rtx rtl_for_parm (struct assign_parm_data_all *, tree);
+static void maybe_reset_rtl_for_parm (tree);
+
 \f
 /* Stack of nested functions.  */
 /* Keep track of the cfun stack.  */
@@ -2105,6 +2111,30 @@ aggregate_value_p (const_tree exp, const_tree fntype)
 bool
 use_register_for_decl (const_tree decl)
 {
+  if (TREE_CODE (decl) == SSA_NAME)
+    {
+      /* We often try to use the SSA_NAME, instead of its underlying
+	 decl, to get type information and guide decisions, to avoid
+	 differences of behavior between anonymous and named
+	 variables, but in this one case we have to go for the actual
+	 variable if there is one.  The main reason is that, at least
+	 at -O0, we want to place user variables on the stack, but we
+	 don't mind using pseudos for anonymous or ignored temps.
+	 Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
+	 should go in pseudos, whereas their corresponding variables
+	 might have to go on the stack.  So, disregarding the decl
+	 here would negatively impact debug info at -O0, enable
+	 coalescing between SSA_NAMEs that ought to get different
+	 stack/pseudo assignments, and get the incoming argument
+	 processing thoroughly confused by PARM_DECLs expected to live
+	 in stack slots but assigned to pseudos.  */
+      if (!SSA_NAME_VAR (decl))
+	return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
+	  && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
+
+      decl = SSA_NAME_VAR (decl);
+    }
+
   /* Honor volatile.  */
   if (TREE_SIDE_EFFECTS (decl))
     return false;
@@ -2240,7 +2270,7 @@ assign_parms_initialize_all (struct assign_parm_data_all *all)
    needed, else the old list.  */
 
 static void
-split_complex_args (vec<tree> *args)
+split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
 {
   unsigned i;
   tree p;
@@ -2251,6 +2281,7 @@ split_complex_args (vec<tree> *args)
       if (TREE_CODE (type) == COMPLEX_TYPE
 	  && targetm.calls.split_complex_arg (type))
 	{
+	  tree cparm = p;
 	  tree decl;
 	  tree subtype = TREE_TYPE (type);
 	  bool addressable = TREE_ADDRESSABLE (p);
@@ -2269,6 +2300,9 @@ split_complex_args (vec<tree> *args)
 	  DECL_ARTIFICIAL (p) = addressable;
 	  DECL_IGNORED_P (p) = addressable;
 	  TREE_ADDRESSABLE (p) = 0;
+	  /* Reset the RTL before layout_decl, or it may change the
+	     mode of the RTL of the original argument copied to P.  */
+	  SET_DECL_RTL (p, NULL_RTX);
 	  layout_decl (p, 0);
 	  (*args)[i] = p;
 
@@ -2280,6 +2314,25 @@ split_complex_args (vec<tree> *args)
 	  DECL_IGNORED_P (decl) = addressable;
 	  layout_decl (decl, 0);
 	  args->safe_insert (++i, decl);
+
+	  /* If we are expanding a function, rather than gimplifying
+	     it, propagate the RTL of the complex parm to the split
+	     declarations, and set their contexts so that
+	     maybe_reset_rtl_for_parm can recognize them and refrain
+	     from resetting their RTL.  */
+	  if (currently_expanding_to_rtl)
+	    {
+	      maybe_reset_rtl_for_parm (cparm);
+	      rtx rtl = rtl_for_parm (all, cparm);
+	      if (rtl)
+		{
+		  SET_DECL_RTL (p, read_complex_part (rtl, false));
+		  SET_DECL_RTL (decl, read_complex_part (rtl, true));
+
+		  DECL_CONTEXT (p) = cparm;
+		  DECL_CONTEXT (decl) = cparm;
+		}
+	    }
 	}
     }
 }
@@ -2342,7 +2395,7 @@ assign_parms_augmented_arg_list (struct assign_parm_data_all *all)
 
   /* If the target wants to split complex arguments into scalars, do so.  */
   if (targetm.calls.split_complex_arg)
-    split_complex_args (&fnargs);
+    split_complex_args (all, &fnargs);
 
   return fnargs;
 }
@@ -2745,23 +2798,98 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data)
   data->entry_parm = entry_parm;
 }
 
+/* Wrapper for use_register_for_decl that special-cases the
+   .result_ptr as the function's RESULT_DECL when the RESULT_DECL is
+   passed by reference.  */
+
+static bool
+use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm)
+{
+  if (parm == all->function_result_decl)
+    {
+      tree result = DECL_RESULT (current_function_decl);
+
+      if (DECL_BY_REFERENCE (result))
+	parm = result;
+    }
+
+  return use_register_for_decl (parm);
+}
+
+/* Wrapper for get_rtl_for_parm_ssa_default_def that special-cases
+   the .result_ptr as the function's RESULT_DECL when the RESULT_DECL
+   is passed by reference.  */
+
+static rtx
+rtl_for_parm (struct assign_parm_data_all *all, tree parm)
+{
+  if (parm == all->function_result_decl)
+    {
+      tree result = DECL_RESULT (current_function_decl);
+
+      if (!DECL_BY_REFERENCE (result))
+	return NULL_RTX;
+
+      parm = result;
+    }
+
+  return get_rtl_for_parm_ssa_default_def (parm);
+}
+
+/* Reset the location of PARM_DECLs and RESULT_DECLs that had
+   SSA_NAMEs in multiple partitions, so that assign_parms will choose
+   the default def, if it exists, or create new RTL to hold the unused
+   entry value.  If we are coalescing across variables, we want to
+   reset the location too, because a parm without a default def
+   (incoming value unused) might be coalesced with one with a default
+   def, and then assign_parms would copy both incoming values to the
+   same location, which might cause the wrong value to survive.  */
+static void
+maybe_reset_rtl_for_parm (tree parm)
+{
+  gcc_assert (TREE_CODE (parm) == PARM_DECL
+	      || TREE_CODE (parm) == RESULT_DECL);
+
+  /* This is a split complex parameter, and its context was set to its
+     original PARM_DECL in split_complex_args so that we could
+     recognize it here and not reset its RTL.  */
+  if (DECL_CONTEXT (parm) && TREE_CODE (DECL_CONTEXT (parm)) == PARM_DECL)
+    {
+      DECL_CONTEXT (parm) = DECL_CONTEXT (DECL_CONTEXT (parm));
+      return;
+    }
+
+  if ((flag_tree_coalesce_vars
+       || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx))
+      && is_gimple_reg (parm))
+    SET_DECL_RTL (parm, NULL_RTX);
+}
+
 /* A subroutine of assign_parms.  Adjust DATA->STACK_RTL such that it's
    always valid and properly aligned.  */
 
 static void
-assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data)
+assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm,
+			      struct assign_parm_data_one *data)
 {
   rtx stack_parm = data->stack_parm;
 
+  /* If out-of-SSA assigned RTL to the parm default def, make sure we
+     don't use what we might have computed before.  */
+  rtx ssa_assigned = rtl_for_parm (all, parm);
+  if (ssa_assigned)
+    stack_parm = NULL;
+
   /* If we can't trust the parm stack slot to be aligned enough for its
      ultimate type, don't use that slot after entry.  We'll make another
      stack slot, if we need one.  */
-  if (stack_parm
-      && ((STRICT_ALIGNMENT
-	   && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm))
-	  || (data->nominal_type
-	      && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
-	      && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
+  else if (stack_parm
+	   && ((STRICT_ALIGNMENT
+		&& (GET_MODE_ALIGNMENT (data->nominal_mode)
+		    > MEM_ALIGN (stack_parm)))
+	       || (data->nominal_type
+		   && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
+		   && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
     stack_parm = NULL;
 
   /* If parm was passed in memory, and we need to convert it on entry,
@@ -2823,14 +2951,32 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
 
   size = int_size_in_bytes (data->passed_type);
   size_stored = CEIL_ROUND (size, UNITS_PER_WORD);
+
   if (stack_parm == 0)
     {
       DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
-      stack_parm = assign_stack_local (BLKmode, size_stored,
-				       DECL_ALIGN (parm));
-      if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
-	PUT_MODE (stack_parm, GET_MODE (entry_parm));
-      set_mem_attributes (stack_parm, parm, 1);
+      rtx from_expand = rtl_for_parm (all, parm);
+      if (from_expand && (!parm_maybe_byref_p (parm)
+			  || XEXP (from_expand, 0) != NULL_RTX))
+	stack_parm = copy_rtx (from_expand);
+      else
+	{
+	  stack_parm = assign_stack_local (BLKmode, size_stored,
+					   DECL_ALIGN (parm));
+	  if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
+	    PUT_MODE (stack_parm, GET_MODE (entry_parm));
+	  if (from_expand)
+	    {
+	      gcc_assert (GET_CODE (stack_parm) == MEM);
+	      gcc_assert (GET_CODE (from_expand) == MEM);
+	      gcc_assert (XEXP (from_expand, 0) == NULL_RTX);
+	      XEXP (from_expand, 0) = XEXP (stack_parm, 0);
+	      PUT_MODE (from_expand, GET_MODE (stack_parm));
+	      stack_parm = copy_rtx (from_expand);
+	    }
+	  else
+	    set_mem_attributes (stack_parm, parm, 1);
+	}
     }
 
   /* If a BLKmode arrives in registers, copy it to a stack slot.  Handle
@@ -2968,14 +3114,34 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
     = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
 			     TREE_TYPE (current_function_decl), 2);
 
-  parmreg = gen_reg_rtx (promoted_nominal_mode);
+  rtx from_expand = parmreg = rtl_for_parm (all, parm);
 
-  if (!DECL_ARTIFICIAL (parm))
-    mark_user_reg (parmreg);
+  if (from_expand && !data->passed_pointer)
+    {
+      if (GET_MODE (parmreg) != promoted_nominal_mode)
+	parmreg = gen_lowpart (promoted_nominal_mode, parmreg);
+    }
+  else if (!from_expand || parm_maybe_byref_p (parm))
+    {
+      parmreg = gen_reg_rtx (promoted_nominal_mode);
+      if (!DECL_ARTIFICIAL (parm))
+	mark_user_reg (parmreg);
+
+      if (from_expand)
+	{
+	  gcc_assert (data->passed_pointer);
+	  gcc_assert (GET_CODE (from_expand) == MEM
+		      && GET_MODE (from_expand) == BLKmode
+		      && XEXP (from_expand, 0) == NULL_RTX);
+	  XEXP (from_expand, 0) = parmreg;
+	}
+    }
 
   /* If this was an item that we received a pointer to,
      set DECL_RTL appropriately.  */
-  if (data->passed_pointer)
+  if (from_expand)
+    SET_DECL_RTL (parm, from_expand);
+  else if (data->passed_pointer)
     {
       rtx x = gen_rtx_MEM (TYPE_MODE (TREE_TYPE (data->passed_type)), parmreg);
       set_mem_attributes (x, parm, 1);
@@ -2990,10 +3156,13 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
      assign_parm_find_data_types and expand_expr_real_1.  */
 
   equiv_stack_parm = data->stack_parm;
+  if (!equiv_stack_parm)
+    equiv_stack_parm = data->entry_parm;
   validated_mem = validize_mem (copy_rtx (data->entry_parm));
 
   need_conversion = (data->nominal_mode != data->passed_mode
 		     || promoted_nominal_mode != data->promoted_mode);
+  gcc_assert (!(need_conversion && data->passed_pointer && from_expand));
   moved = false;
 
   if (need_conversion
@@ -3125,16 +3294,28 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 
       did_conversion = true;
     }
-  else
+  /* We don't want to copy the incoming pointer to a parmreg expected
+     to hold the value rather than the pointer.  */
+  else if (!data->passed_pointer || parmreg != from_expand)
     emit_move_insn (parmreg, validated_mem);
 
   /* If we were passed a pointer but the actual value can safely live
      in a register, retrieve it and use it directly.  */
-  if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode)
+  if (data->passed_pointer
+      && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode))
     {
+      rtx src = DECL_RTL (parm);
+
       /* We can't use nominal_mode, because it will have been set to
 	 Pmode above.  We must use the actual mode of the parm.  */
-      if (use_register_for_decl (parm))
+      if (from_expand)
+	{
+	  parmreg = from_expand;
+	  gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm)));
+	  src = gen_rtx_MEM (GET_MODE (parmreg), validated_mem);
+	  set_mem_attributes (src, parm, 1);
+	}
+      else if (use_register_for_decl (parm))
 	{
 	  parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm)));
 	  mark_user_reg (parmreg);
@@ -3151,14 +3332,14 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 	  set_mem_attributes (parmreg, parm, 1);
 	}
 
-      if (GET_MODE (parmreg) != GET_MODE (DECL_RTL (parm)))
+      if (GET_MODE (parmreg) != GET_MODE (src))
 	{
-	  rtx tempreg = gen_reg_rtx (GET_MODE (DECL_RTL (parm)));
+	  rtx tempreg = gen_reg_rtx (GET_MODE (src));
 	  int unsigned_p = TYPE_UNSIGNED (TREE_TYPE (parm));
 
 	  push_to_sequence2 (all->first_conversion_insn,
 			     all->last_conversion_insn);
-	  emit_move_insn (tempreg, DECL_RTL (parm));
+	  emit_move_insn (tempreg, src);
 	  tempreg = convert_to_mode (GET_MODE (parmreg), tempreg, unsigned_p);
 	  emit_move_insn (parmreg, tempreg);
 	  all->first_conversion_insn = get_insns ();
@@ -3167,14 +3348,16 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 
 	  did_conversion = true;
 	}
+      else if (GET_MODE (parmreg) == BLKmode)
+	gcc_assert (parm_maybe_byref_p (parm));
       else
-	emit_move_insn (parmreg, DECL_RTL (parm));
+	emit_move_insn (parmreg, src);
 
       SET_DECL_RTL (parm, parmreg);
 
       /* STACK_PARM is the pointer, not the parm, and PARMREG is
 	 now the parm.  */
-      data->stack_parm = NULL;
+      data->stack_parm = equiv_stack_parm = NULL;
     }
 
   /* Mark the register as eliminable if we did no conversion and it was
@@ -3184,11 +3367,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
      make here would screw up life analysis for it.  */
   if (data->nominal_mode == data->passed_mode
       && !did_conversion
-      && data->stack_parm != 0
-      && MEM_P (data->stack_parm)
+      && equiv_stack_parm != 0
+      && MEM_P (equiv_stack_parm)
       && data->locate.offset.var == 0
       && reg_mentioned_p (virtual_incoming_args_rtx,
-			  XEXP (data->stack_parm, 0)))
+			  XEXP (equiv_stack_parm, 0)))
     {
       rtx_insn *linsn = get_last_insn ();
       rtx_insn *sinsn;
@@ -3201,8 +3384,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 	    = GET_MODE_INNER (GET_MODE (parmreg));
 	  int regnor = REGNO (XEXP (parmreg, 0));
 	  int regnoi = REGNO (XEXP (parmreg, 1));
-	  rtx stackr = adjust_address_nv (data->stack_parm, submode, 0);
-	  rtx stacki = adjust_address_nv (data->stack_parm, submode,
+	  rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0);
+	  rtx stacki = adjust_address_nv (equiv_stack_parm, submode,
 					  GET_MODE_SIZE (submode));
 
 	  /* Scan backwards for the set of the real and
@@ -3275,6 +3458,13 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
 
       if (data->stack_parm == 0)
 	{
+	  rtx x = data->stack_parm = rtl_for_parm (all, parm);
+	  if (x)
+	    gcc_assert (GET_MODE (x) == GET_MODE (data->entry_parm));
+	}
+
+      if (data->stack_parm == 0)
+	{
 	  int align = STACK_SLOT_ALIGNMENT (data->passed_type,
 					    GET_MODE (data->entry_parm),
 					    TYPE_ALIGN (data->passed_type));
@@ -3337,11 +3527,21 @@ assign_parms_unsplit_complex (struct assign_parm_data_all *all,
 	  imag = DECL_RTL (fnargs[i + 1]);
 	  if (inner != GET_MODE (real))
 	    {
-	      real = gen_lowpart_SUBREG (inner, real);
-	      imag = gen_lowpart_SUBREG (inner, imag);
+	      real = simplify_gen_subreg (inner, real, GET_MODE (real),
+					  subreg_lowpart_offset
+					  (inner, GET_MODE (real)));
+	      imag = simplify_gen_subreg (inner, imag, GET_MODE (imag),
+					  subreg_lowpart_offset
+					  (inner, GET_MODE (imag)));
 	    }
 
-	  if (TREE_ADDRESSABLE (parm))
+	  if ((tmp = rtl_for_parm (all, parm)) != NULL_RTX
+	      && rtx_equal_p (real,
+			      read_complex_part (tmp, false))
+	      && rtx_equal_p (imag,
+			      read_complex_part (tmp, true)))
+	    ; /* We now have the right rtl in tmp.  */
+	  else if (TREE_ADDRESSABLE (parm))
 	    {
 	      rtx rmem, imem;
 	      HOST_WIDE_INT size = int_size_in_bytes (TREE_TYPE (parm));
@@ -3487,7 +3687,7 @@ assign_bounds (vec<bounds_parm_data> &bndargs,
 	  assign_parm_setup_block (&all, pbdata->bounds_parm,
 				   &pbdata->parm_data);
 	else if (pbdata->parm_data.passed_pointer
-		 || use_register_for_decl (pbdata->bounds_parm))
+		 || use_register_for_parm_decl (&all, pbdata->bounds_parm))
 	  assign_parm_setup_reg (&all, pbdata->bounds_parm,
 				 &pbdata->parm_data);
 	else
@@ -3531,6 +3731,8 @@ assign_parms (tree fndecl)
 	  DECL_INCOMING_RTL (parm) = DECL_RTL (parm);
 	  continue;
 	}
+      else
+	maybe_reset_rtl_for_parm (parm);
 
       /* Estimate stack alignment from parameter alignment.  */
       if (SUPPORTS_STACK_ALIGNMENT)
@@ -3580,7 +3782,9 @@ assign_parms (tree fndecl)
       else
 	set_decl_incoming_rtl (parm, data.entry_parm, false);
 
-      /* Boudns should be loaded in the particular order to
+      assign_parm_adjust_stack_rtl (&all, parm, &data);
+
+      /* Bounds should be loaded in the particular order to
 	 have registers allocated correctly.  Collect info about
 	 input bounds and load them later.  */
       if (POINTER_BOUNDS_TYPE_P (data.passed_type))
@@ -3597,11 +3801,10 @@ assign_parms (tree fndecl)
 	}
       else
 	{
-	  assign_parm_adjust_stack_rtl (&data);
-
 	  if (assign_parm_setup_block_p (&data))
 	    assign_parm_setup_block (&all, parm, &data);
-	  else if (data.passed_pointer || use_register_for_decl (parm))
+	  else if (data.passed_pointer
+		   || use_register_for_parm_decl (&all, parm))
 	    assign_parm_setup_reg (&all, parm, &data);
 	  else
 	    assign_parm_setup_stack (&all, parm, &data);
@@ -4932,7 +5135,9 @@ expand_function_start (tree subr)
      before any library calls that assign parms might generate.  */
 
   /* Decide whether to return the value in memory or in a register.  */
-  if (aggregate_value_p (DECL_RESULT (subr), subr))
+  tree res = DECL_RESULT (subr);
+  maybe_reset_rtl_for_parm (res);
+  if (aggregate_value_p (res, subr))
     {
       /* Returning something that won't go in a register.  */
       rtx value_address = 0;
@@ -4940,7 +5145,7 @@ expand_function_start (tree subr)
 #ifdef PCC_STATIC_STRUCT_RETURN
       if (cfun->returns_pcc_struct)
 	{
-	  int size = int_size_in_bytes (TREE_TYPE (DECL_RESULT (subr)));
+	  int size = int_size_in_bytes (TREE_TYPE (res));
 	  value_address = assemble_static_space (size);
 	}
       else
@@ -4952,36 +5157,45 @@ expand_function_start (tree subr)
 	     it.  */
 	  if (sv)
 	    {
-	      value_address = gen_reg_rtx (Pmode);
+	      if (DECL_BY_REFERENCE (res))
+		value_address = get_rtl_for_parm_ssa_default_def (res);
+	      if (!value_address)
+		value_address = gen_reg_rtx (Pmode);
 	      emit_move_insn (value_address, sv);
 	    }
 	}
       if (value_address)
 	{
 	  rtx x = value_address;
-	  if (!DECL_BY_REFERENCE (DECL_RESULT (subr)))
+	  if (!DECL_BY_REFERENCE (res))
 	    {
-	      x = gen_rtx_MEM (DECL_MODE (DECL_RESULT (subr)), x);
-	      set_mem_attributes (x, DECL_RESULT (subr), 1);
+	      x = get_rtl_for_parm_ssa_default_def (res);
+	      if (!x)
+		{
+		  x = gen_rtx_MEM (DECL_MODE (res), value_address);
+		  set_mem_attributes (x, res, 1);
+		}
 	    }
-	  SET_DECL_RTL (DECL_RESULT (subr), x);
+	  SET_DECL_RTL (res, x);
 	}
     }
-  else if (DECL_MODE (DECL_RESULT (subr)) == VOIDmode)
+  else if (DECL_MODE (res) == VOIDmode)
     /* If return mode is void, this decl rtl should not be used.  */
-    SET_DECL_RTL (DECL_RESULT (subr), NULL_RTX);
+    SET_DECL_RTL (res, NULL_RTX);
   else
     {
       /* Compute the return values into a pseudo reg, which we will copy
 	 into the true return register after the cleanups are done.  */
-      tree return_type = TREE_TYPE (DECL_RESULT (subr));
-      if (TYPE_MODE (return_type) != BLKmode
-	  && targetm.calls.return_in_msb (return_type))
+      tree return_type = TREE_TYPE (res);
+      rtx x = get_rtl_for_parm_ssa_default_def (res);
+      if (x)
+	/* Use it.  */;
+      else if (TYPE_MODE (return_type) != BLKmode
+	       && targetm.calls.return_in_msb (return_type))
 	/* expand_function_end will insert the appropriate padding in
 	   this case.  Use the return value's natural (unpadded) mode
 	   within the function proper.  */
-	SET_DECL_RTL (DECL_RESULT (subr),
-		      gen_reg_rtx (TYPE_MODE (return_type)));
+	x = gen_reg_rtx (TYPE_MODE (return_type));
       else
 	{
 	  /* In order to figure out what mode to use for the pseudo, we
@@ -4992,25 +5206,26 @@ expand_function_start (tree subr)
 	  /* Structures that are returned in registers are not
 	     aggregate_value_p, so we may see a PARALLEL or a REG.  */
 	  if (REG_P (hard_reg))
-	    SET_DECL_RTL (DECL_RESULT (subr),
-			  gen_reg_rtx (GET_MODE (hard_reg)));
+	    x = gen_reg_rtx (GET_MODE (hard_reg));
 	  else
 	    {
 	      gcc_assert (GET_CODE (hard_reg) == PARALLEL);
-	      SET_DECL_RTL (DECL_RESULT (subr), gen_group_rtx (hard_reg));
+	      x = gen_group_rtx (hard_reg);
 	    }
 	}
 
+      SET_DECL_RTL (res, x);
+
       /* Set DECL_REGISTER flag so that expand_function_end will copy the
 	 result to the real return register(s).  */
-      DECL_REGISTER (DECL_RESULT (subr)) = 1;
+      DECL_REGISTER (res) = 1;
 
       if (chkp_function_instrumented_p (current_function_decl))
 	{
-	  tree return_type = TREE_TYPE (DECL_RESULT (subr));
+	  tree return_type = TREE_TYPE (res);
 	  rtx bounds = targetm.calls.chkp_function_value_bounds (return_type,
 								 subr, 1);
-	  SET_DECL_BOUNDS_RTL (DECL_RESULT (subr), bounds);
+	  SET_DECL_BOUNDS_RTL (res, bounds);
 	}
     }
 
@@ -5025,13 +5240,19 @@ expand_function_start (tree subr)
       rtx local, chain;
      rtx_insn *insn;
 
-      local = gen_reg_rtx (Pmode);
+      local = get_rtl_for_parm_ssa_default_def (parm);
+      if (!local)
+	local = gen_reg_rtx (Pmode);
       chain = targetm.calls.static_chain (current_function_decl, true);
 
       set_decl_incoming_rtl (parm, chain, false);
       SET_DECL_RTL (parm, local);
       mark_reg_pointer (local, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (parm))));
 
+      if (GET_MODE (local) != Pmode)
+	local = convert_to_mode (Pmode, local,
+				 TYPE_UNSIGNED (TREE_TYPE (parm)));
+
       insn = emit_move_insn (local, chain);
 
       /* Mark the register as eliminable, similar to parameters.  */
diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
index b558d90..baed630 100644
--- a/gcc/gimple-expr.c
+++ b/gcc/gimple-expr.c
@@ -375,45 +375,6 @@ copy_var_decl (tree var, tree name, tree type)
   return copy;
 }
 
-/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
-   coalescing together, false otherwise.
-
-   This must stay consistent with var_map_base_init in tree-ssa-live.c.  */
-
-bool
-gimple_can_coalesce_p (tree name1, tree name2)
-{
-  /* First check the SSA_NAME's associated DECL.  We only want to
-     coalesce if they have the same DECL or both have no associated DECL.  */
-  tree var1 = SSA_NAME_VAR (name1);
-  tree var2 = SSA_NAME_VAR (name2);
-  var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
-  var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
-  if (var1 != var2)
-    return false;
-
-  /* Now check the types.  If the types are the same, then we should
-     try to coalesce V1 and V2.  */
-  tree t1 = TREE_TYPE (name1);
-  tree t2 = TREE_TYPE (name2);
-  if (t1 == t2)
-    return true;
-
-  /* If the types are not the same, check for a canonical type match.  This
-     (for example) allows coalescing when the types are fundamentally the
-     same, but just have different names. 
-
-     Note pointer types with different address spaces may have the same
-     canonical type.  Those are rejected for coalescing by the
-     types_compatible_p check.  */
-  if (TYPE_CANONICAL (t1)
-      && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
-      && types_compatible_p (t1, t2))
-    return true;
-
-  return false;
-}
-
 /* Strip off a legitimate source ending from the input string NAME of
    length LEN.  Rather than having to know the names used by all of
    our front ends, we strip off an ending of a period followed by
diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
index ed23eb2..3d1c89f 100644
--- a/gcc/gimple-expr.h
+++ b/gcc/gimple-expr.h
@@ -28,7 +28,6 @@ extern gimple_seq gimple_body (tree);
 extern bool gimple_has_body_p (tree);
 extern const char *gimple_decl_printable_name (tree, int);
 extern tree copy_var_decl (tree, tree, tree);
-extern bool gimple_can_coalesce_p (tree, tree);
 extern tree create_tmp_var_name (const char *);
 extern tree create_tmp_var_raw (tree, const char * = NULL);
 extern tree create_tmp_var (tree, const char * = NULL);
diff --git a/gcc/opts.c b/gcc/opts.c
index 9d5de96..32de605 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -445,12 +445,12 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_1_PLUS, OPT_fsplit_wide_types, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_ccp, NULL, 1 },
     { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_bit_ccp, NULL, 1 },
+    { OPT_LEVELS_1_PLUS, OPT_ftree_coalesce_vars, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_dce, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_dominator_opts, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 },
     { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 },
-    { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 },
diff --git a/gcc/passes.def b/gcc/passes.def
index 6b66f8f..64fc4d9 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -77,7 +77,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_all_early_optimizations);
       PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
 	  NEXT_PASS (pass_remove_cgraph_callee_edges);
-	  NEXT_PASS (pass_rename_ssa_copies);
 	  NEXT_PASS (pass_object_sizes);
 	  NEXT_PASS (pass_ccp);
 	  /* After CCP we rewrite no longer addressed locals into SSA
@@ -155,7 +154,6 @@ along with GCC; see the file COPYING3.  If not see
       /* Initial scalar cleanups before alias computation.
 	 They ensure memory accesses are not indirect wherever possible.  */
       NEXT_PASS (pass_strip_predict_hints);
-      NEXT_PASS (pass_rename_ssa_copies);
       NEXT_PASS (pass_ccp);
       /* After CCP we rewrite no longer addressed locals into SSA
 	 form if possible.  */
@@ -183,7 +181,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_ch);
       NEXT_PASS (pass_lower_complex);
       NEXT_PASS (pass_sra);
-      NEXT_PASS (pass_rename_ssa_copies);
       /* The dom pass will also resolve all __builtin_constant_p calls
          that are still there to 0.  This has to be done after some
 	 propagations have already run, but before some more dead code
@@ -293,7 +290,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_fold_builtins);
       NEXT_PASS (pass_optimize_widening_mul);
       NEXT_PASS (pass_tail_calls);
-      NEXT_PASS (pass_rename_ssa_copies);
       /* FIXME: If DCE is not run before checking for uninitialized uses,
 	 we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
 	 However, this also causes us to misdiagnose cases that should be
@@ -328,7 +324,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_dce);
       NEXT_PASS (pass_asan);
       NEXT_PASS (pass_tsan);
-      NEXT_PASS (pass_rename_ssa_copies);
       /* ???  We do want some kind of loop invariant motion, but we possibly
          need to adjust LIM to be more friendly towards preserving accurate
 	 debug information here.  */
diff --git a/gcc/stmt.c b/gcc/stmt.c
index 391686c..e7f7dd4 100644
--- a/gcc/stmt.c
+++ b/gcc/stmt.c
@@ -891,7 +891,7 @@ emit_case_decision_tree (tree index_expr, tree index_type,
     {
       index = copy_to_reg (index);
       if (TREE_CODE (index_expr) == SSA_NAME)
-	set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (index_expr), index);
+	set_reg_attrs_for_decl_rtl (index_expr, index);
     }
 
   balance_case_nodes (&case_list, NULL);
diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index 9757777..938e54b 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -782,7 +782,8 @@ layout_decl (tree decl, unsigned int known_align)
     {
       PUT_MODE (rtl, DECL_MODE (decl));
       SET_DECL_RTL (decl, 0);
-      set_mem_attributes (rtl, decl, 1);
+      if (MEM_P (rtl))
+	set_mem_attributes (rtl, decl, 1);
       SET_DECL_RTL (decl, rtl);
     }
 }
diff --git a/gcc/testsuite/gcc.dg/guality/pr54200.c b/gcc/testsuite/gcc.dg/guality/pr54200.c
index 9b17187..e1e7293 100644
--- a/gcc/testsuite/gcc.dg/guality/pr54200.c
+++ b/gcc/testsuite/gcc.dg/guality/pr54200.c
@@ -1,6 +1,6 @@
 /* PR tree-optimization/54200 */
 /* { dg-do run } */
-/* { dg-options "-g -fno-var-tracking-assignments" } */
+/* { dg-options "-g -fno-var-tracking-assignments -fno-tree-coalesce-vars" } */
 
 int o __attribute__((used));
 
diff --git a/gcc/testsuite/gcc.dg/ssp-1.c b/gcc/testsuite/gcc.dg/ssp-1.c
index 5467f4d..db69332 100644
--- a/gcc/testsuite/gcc.dg/ssp-1.c
+++ b/gcc/testsuite/gcc.dg/ssp-1.c
@@ -12,7 +12,7 @@ __stack_chk_fail (void)
 
 int main ()
 {
-  int i;
+  register int i;
   char foo[255];
 
   // smash stack
diff --git a/gcc/testsuite/gcc.dg/ssp-2.c b/gcc/testsuite/gcc.dg/ssp-2.c
index 9a7ac32..752fe53 100644
--- a/gcc/testsuite/gcc.dg/ssp-2.c
+++ b/gcc/testsuite/gcc.dg/ssp-2.c
@@ -14,7 +14,7 @@ __stack_chk_fail (void)
 void
 overflow()
 {
-  int i = 0;
+  register int i = 0;
   char foo[30];
 
   /* Overflow buffer.  */
diff --git a/gcc/testsuite/gcc.dg/torture/parm-coalesce.c b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
new file mode 100644
index 0000000..dbd81c1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+
+/* Make sure we don't coalesce both incoming parms, one whose incoming
+   value is unused, to the same location, so as to overwrite one of
+   them with the incoming value of the other.  */
+
+int __attribute__((noinline, noclone))
+foo (int i, int j)
+{
+  j = i; /* The incoming value for J is unused.  */
+  i = 2;
+  if (j)
+    j++;
+  j += i + 1;
+  return j;
+}
+
+/* Same as foo, but with swapped parameters.  */
+int __attribute__((noinline, noclone))
+bar (int j, int i)
+{
+  j = i; /* The incoming value for J is unused.  */
+  i = 2;
+  if (j)
+    j++;
+  j += i + 1;
+  return j;
+}
+
+int
+main (void)
+{
+  if (foo (0, 1) != 3)
+    abort ();
+  if (bar (1, 0) != 3)
+    abort ();
+  return 0;
+}
diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
index 7b747ab9..978476c 100644
--- a/gcc/tree-outof-ssa.c
+++ b/gcc/tree-outof-ssa.c
@@ -279,7 +279,6 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
   rtx dest_rtx, seq, x;
   machine_mode dest_mode, src_mode;
   int unsignedp;
-  tree var;
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     {
@@ -301,12 +300,12 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
 
   start_sequence ();
 
-  var = SSA_NAME_VAR (partition_to_var (SA.map, dest));
+  tree name = partition_to_var (SA.map, dest);
   src_mode = TYPE_MODE (TREE_TYPE (src));
   dest_mode = GET_MODE (dest_rtx);
-  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var)));
+  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (name)));
   gcc_assert (!REG_P (dest_rtx)
-	      || dest_mode == promote_decl_mode (var, &unsignedp));
+	      || dest_mode == promote_ssa_mode (name, &unsignedp));
 
   if (src_mode != dest_mode)
     {
@@ -682,13 +681,12 @@ elim_backward (elim_graph g, int T)
 static rtx
 get_temp_reg (tree name)
 {
-  tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
-  tree type = TREE_TYPE (var);
+  tree type = TREE_TYPE (name);
   int unsignedp;
-  machine_mode reg_mode = promote_decl_mode (var, &unsignedp);
+  machine_mode reg_mode = promote_ssa_mode (name, &unsignedp);
   rtx x = gen_reg_rtx (reg_mode);
   if (POINTER_TYPE_P (type))
-    mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var))));
+    mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type)));
   return x;
 }
 
@@ -988,7 +986,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
 
   /* Return to viewing the variable list as just all reference variables after
      coalescing has been performed.  */
-  partition_view_normal (map, false);
+  partition_view_normal (map);
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     {
diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index bf8983f..08ce72c 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -36,6 +36,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-iterator.h"
 #include "tree-ssa-live.h"
 #include "tree-ssa-coalesce.h"
+#include "cfgexpand.h"
+#include "explow.h"
 #include "diagnostic-core.h"
 
 
@@ -806,6 +808,16 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
   basic_block bb;
   ssa_op_iter iter;
   live_track_p live;
+  basic_block entry;
+
+  /* If inter-variable coalescing is enabled, we may attempt to
+     coalesce variables from different base variables, including
+     different parameters, so we have to make sure default defs live
+     at the entry block conflict with each other.  */
+  if (flag_tree_coalesce_vars)
+    entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+  else
+    entry = NULL;
 
   map = live_var_map (liveinfo);
   graph = ssa_conflicts_new (num_var_partitions (map));
@@ -864,6 +876,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
 	    live_track_process_def (live, result, graph);
 	}
 
+      /* Pretend there are defs for params' default defs at the start
+	 of the (post-)entry block.  */
+      if (bb == entry)
+	{
+	  unsigned base;
+	  bitmap_iterator bi;
+	  EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi)
+	    {
+	      bitmap_iterator bi2;
+	      unsigned part;
+	      EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base],
+					0, part, bi2)
+		{
+		  tree var = partition_to_var (map, part);
+		  if (!SSA_NAME_VAR (var)
+		      || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL
+			  && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL)
+		      || !SSA_NAME_IS_DEFAULT_DEF (var))
+		    continue;
+		  live_track_process_def (live, var, graph);
+		}
+	    }
+	}
+
      live_track_clear_base_vars (live);
     }
 
@@ -1132,6 +1168,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
     {
       var1 = partition_to_var (map, p1);
       var2 = partition_to_var (map, p2);
+
       z = var_union (map, var1, var2);
       if (z == NO_PARTITION)
 	{
@@ -1149,6 +1186,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
 
       if (debug)
 	fprintf (debug, ": Success -> %d\n", z);
+
       return true;
     }
 
@@ -1244,6 +1282,333 @@ ssa_name_var_hash::equal (const tree_node *n1, const tree_node *n2)
 }
 
 
+/* Output partition map MAP with coalescing plan PART to file F.  */
+
+void
+dump_part_var_map (FILE *f, partition part, var_map map)
+{
+  int t;
+  unsigned x, y;
+  int p;
+
+  fprintf (f, "\nCoalescible Partition map \n\n");
+
+  for (x = 0; x < map->num_partitions; x++)
+    {
+      if (map->view_to_partition != NULL)
+	p = map->view_to_partition[x];
+      else
+	p = x;
+
+      if (ssa_name (p) == NULL_TREE
+	  || virtual_operand_p (ssa_name (p)))
+        continue;
+
+      t = 0;
+      for (y = 1; y < num_ssa_names; y++)
+        {
+	  tree var = version_to_var (map, y);
+	  if (!var)
+	    continue;
+	  int q = var_to_partition (map, var);
+	  p = partition_find (part, q);
+	  gcc_assert (map->partition_to_base_index[q]
+		      == map->partition_to_base_index[p]);
+
+	  if (p == (int)x)
+	    {
+	      if (t++ == 0)
+	        {
+		  fprintf (f, "Partition %d, base %d (", x,
+			   map->partition_to_base_index[q]);
+		  print_generic_expr (f, partition_to_var (map, q), TDF_SLIM);
+		  fprintf (f, " - ");
+		}
+	      fprintf (f, "%d ", y);
+	    }
+	}
+      if (t != 0)
+	fprintf (f, ")\n");
+    }
+  fprintf (f, "\n");
+}
+
+/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
+   coalescing together, false otherwise.  This must stay consistent with
+   compute_samebase_partition_bases and compute_optimized_partition_bases
+   below.  */
+
+bool
+gimple_can_coalesce_p (tree name1, tree name2)
+{
+  /* First check the SSA_NAME's associated DECL.  Unless
+     -ftree-coalesce-vars is enabled, we only want to coalesce if they
+     have the same DECL or both have no associated DECL.  */
+  tree var1 = SSA_NAME_VAR (name1);
+  tree var2 = SSA_NAME_VAR (name2);
+  var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
+  var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
+  if (var1 != var2 && !flag_tree_coalesce_vars)
+    return false;
+
+  /* Now check the types.  If the types are the same, then we should
+     try to coalesce NAME1 and NAME2.  */
+  tree t1 = TREE_TYPE (name1);
+  tree t2 = TREE_TYPE (name2);
+  if (t1 == t2)
+    {
+    check_modes:
+      /* If the base variables are the same, we're good: none of the
+	 other tests below could possibly fail.  */
+      var1 = SSA_NAME_VAR (name1);
+      var2 = SSA_NAME_VAR (name2);
+      if (var1 == var2)
+	return true;
+
+      /* We don't want to coalesce two SSA names if one of the base
+	 variables is supposed to be a register while the other is
+	 supposed to be on the stack.  Anonymous SSA names take
+	 registers, but when not optimizing, user variables should go
+	 on the stack, so coalescing them with the anonymous variable
+	 as the partition leader would end up assigning the user
+	 variable to a register.  Don't do that!  */
+      bool reg1 = !var1 || use_register_for_decl (var1);
+      bool reg2 = !var2 || use_register_for_decl (var2);
+      if (reg1 != reg2)
+	return false;
+
+      /* Check that the promoted modes are the same.  We don't want to
+	 coalesce if the promoted modes would be different.  Only
+	 PARM_DECLs and RESULT_DECLs have different promotion rules,
+	 so skip the test if both are variables, or both are anonymous
+	 SSA_NAMEs.  Now, if a parm or result has BLKmode, do not
+	 coalesce its SSA versions with those of any other variables,
+	 because it may be passed by reference.  */
+      return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
+	|| (/* The case var1 == var2 is already covered above.  */
+	    !parm_maybe_byref_p (var1)
+	    && !parm_maybe_byref_p (var2)
+	    && promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL));
+    }
+
+  /* If the types are not the same, check for a canonical type match.  This
+     (for example) allows coalescing when the types are fundamentally the
+     same, but just have different names.
+
+     Note pointer types with different address spaces may have the same
+     canonical type.  Those are rejected for coalescing by the
+     types_compatible_p check.  */
+  if (TYPE_CANONICAL (t1)
+      && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
+      && types_compatible_p (t1, t2))
+    goto check_modes;
+
+  return false;
+}
+
+/* Fill in MAP's partition_to_base_index, with one index for each
+   partition of SSA names USED_IN_COPIES and related by CL coalesce
+   possibilities.  This must match gimple_can_coalesce_p in the
+   optimized case.  */
+
+static void
+compute_optimized_partition_bases (var_map map, bitmap used_in_copies,
+				   coalesce_list_p cl)
+{
+  int parts = num_var_partitions (map);
+  partition tentative = partition_new (parts);
+
+  /* Partition the SSA versions so that, for each coalescible
+     pair, both of its members are in the same partition in
+     TENTATIVE.  */
+  gcc_assert (!cl->sorted);
+  coalesce_pair_p node;
+  coalesce_iterator_type ppi;
+  FOR_EACH_PARTITION_PAIR (node, ppi, cl)
+    {
+      tree v1 = ssa_name (node->first_element);
+      int p1 = partition_find (tentative, var_to_partition (map, v1));
+      tree v2 = ssa_name (node->second_element);
+      int p2 = partition_find (tentative, var_to_partition (map, v2));
+
+      if (p1 == p2)
+	continue;
+
+      partition_union (tentative, p1, p2);
+    }
+
+  /* We have to deal with cost one pairs too.  */
+  for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next)
+    {
+      tree v1 = ssa_name (co->first_element);
+      int p1 = partition_find (tentative, var_to_partition (map, v1));
+      tree v2 = ssa_name (co->second_element);
+      int p2 = partition_find (tentative, var_to_partition (map, v2));
+
+      if (p1 == p2)
+	continue;
+
+      partition_union (tentative, p1, p2);
+    }
+
+  /* And also with abnormal edges.  */
+  basic_block bb;
+  edge e;
+  edge_iterator ei;
+  FOR_EACH_BB_FN (bb, cfun)
+    {
+      FOR_EACH_EDGE (e, ei, bb->preds)
+	if (e->flags & EDGE_ABNORMAL)
+	  {
+	    gphi_iterator gsi;
+	    for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
+		 gsi_next (&gsi))
+	      {
+		gphi *phi = gsi.phi ();
+		tree arg = PHI_ARG_DEF (phi, e->dest_idx);
+		if (SSA_NAME_IS_DEFAULT_DEF (arg)
+		    && (!SSA_NAME_VAR (arg)
+			|| TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL))
+		  continue;
+
+		tree res = PHI_RESULT (phi);
+
+		int p1 = partition_find (tentative, var_to_partition (map, res));
+		int p2 = partition_find (tentative, var_to_partition (map, arg));
+
+		if (p1 == p2)
+		  continue;
+
+		partition_union (tentative, p1, p2);
+	      }
+	  }
+    }
+
+  map->partition_to_base_index = XCNEWVEC (int, parts);
+  auto_vec<unsigned int> index_map (parts);
+  if (parts)
+    index_map.quick_grow (parts);
+
+  const unsigned no_part = -1;
+  unsigned count = parts;
+  while (count)
+    index_map[--count] = no_part;
+
+  /* Initialize MAP's mapping from partition to base index, using
+     as base indices an enumeration of the TENTATIVE partitions in
+     which each SSA version ended up, so that we compute conflicts
+     between all SSA versions that ended up in the same potential
+     coalesce partition.  */
+  bitmap_iterator bi;
+  unsigned i;
+  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
+    {
+      int pidx = var_to_partition (map, ssa_name (i));
+      int base = partition_find (tentative, pidx);
+      if (index_map[base] != no_part)
+	continue;
+      index_map[base] = count++;
+    }
+
+  map->num_basevars = count;
+
+  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
+    {
+      int pidx = var_to_partition (map, ssa_name (i));
+      int base = partition_find (tentative, pidx);
+      gcc_assert (index_map[base] < count);
+      map->partition_to_base_index[pidx] = index_map[base];
+    }
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    dump_part_var_map (dump_file, tentative, map);
+
+  partition_delete (tentative);
+}
+
+/* Hashtable helpers.  */
+
+struct tree_int_map_hasher : nofree_ptr_hash <tree_int_map>
+{
+  static inline hashval_t hash (const tree_int_map *);
+  static inline bool equal (const tree_int_map *, const tree_int_map *);
+};
+
+inline hashval_t
+tree_int_map_hasher::hash (const tree_int_map *v)
+{
+  return tree_map_base_hash (v);
+}
+
+inline bool
+tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
+{
+  return tree_int_map_eq (v, c);
+}
+
+/* This routine will initialize the basevar fields of MAP with base
+   names.  Partitions will share the same base if they have the same
+   SSA_NAME_VAR, or, being anonymous variables, the same type.  This
+   must match gimple_can_coalesce_p in the non-optimized case.  */
+
+static void
+compute_samebase_partition_bases (var_map map)
+{
+  int x, num_part;
+  tree var;
+  struct tree_int_map *m, *mapstorage;
+
+  num_part = num_var_partitions (map);
+  hash_table<tree_int_map_hasher> tree_to_index (num_part);
+  /* We can have at most num_part entries in the hash tables, so it's
+     enough to allocate so many map elements once, saving some malloc
+     calls.  */
+  mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
+
+  /* If a base table already exists, clear it, otherwise create it.  */
+  free (map->partition_to_base_index);
+  map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
+
+  /* Build the base variable list, and point partitions at their bases.  */
+  for (x = 0; x < num_part; x++)
+    {
+      struct tree_int_map **slot;
+      unsigned baseindex;
+      var = partition_to_var (map, x);
+      if (SSA_NAME_VAR (var)
+	  && (!VAR_P (SSA_NAME_VAR (var))
+	      || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
+	m->base.from = SSA_NAME_VAR (var);
+      else
+	/* This restricts what anonymous SSA names we can coalesce
+	   as it restricts the sets we compute conflicts for.
+	   Using TREE_TYPE to generate sets is the easiest as
+	   type equivalency also holds for SSA names with the same
+	   underlying decl.
+
+	   Check gimple_can_coalesce_p when changing this code.  */
+	m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
+			? TYPE_CANONICAL (TREE_TYPE (var))
+			: TREE_TYPE (var));
+      /* If base variable hasn't been seen, set it up.  */
+      slot = tree_to_index.find_slot (m, INSERT);
+      if (!*slot)
+	{
+	  baseindex = m - mapstorage;
+	  m->to = baseindex;
+	  *slot = m;
+	  m++;
+	}
+      else
+	baseindex = (*slot)->to;
+      map->partition_to_base_index[x] = baseindex;
+    }
+
+  map->num_basevars = m - mapstorage;
+
+  free (mapstorage);
+}
+
 /* Reduce the number of copies by coalescing variables in the function.  Return
    a partition map with the resulting coalesces.  */
 
@@ -1260,9 +1625,10 @@ coalesce_ssa_name (void)
   cl = create_coalesce_list ();
   map = create_outofssa_var_map (cl, used_in_copies);
 
-  /* If optimization is disabled, we need to coalesce all the names originating
-     from the same SSA_NAME_VAR so debug info remains undisturbed.  */
-  if (!optimize)
+  /* If this optimization is disabled, we need to coalesce all the
+     names originating from the same SSA_NAME_VAR so debug info
+     remains undisturbed.  */
+  if (!flag_tree_coalesce_vars)
     {
       hash_table<ssa_name_var_hash> ssa_name_hash (10);
 
@@ -1303,8 +1669,13 @@ coalesce_ssa_name (void)
   if (dump_file && (dump_flags & TDF_DETAILS))
     dump_var_map (dump_file, map);
 
-  /* Don't calculate live ranges for variables not in the coalesce list.  */
-  partition_view_bitmap (map, used_in_copies, true);
+  partition_view_bitmap (map, used_in_copies);
+
+  if (flag_tree_coalesce_vars)
+    compute_optimized_partition_bases (map, used_in_copies, cl);
+  else
+    compute_samebase_partition_bases (map);
+
   BITMAP_FREE (used_in_copies);
 
   if (num_var_partitions (map) < 1)
@@ -1343,8 +1714,7 @@ coalesce_ssa_name (void)
 
   /* Now coalesce everything in the list.  */
   coalesce_partitions (map, graph, cl,
-		       ((dump_flags & TDF_DETAILS) ? dump_file
-						   : NULL));
+		       ((dump_flags & TDF_DETAILS) ? dump_file : NULL));
 
   delete_coalesce_list (cl);
   ssa_conflicts_delete (graph);
diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h
index 99b188a..ae289b4 100644
--- a/gcc/tree-ssa-coalesce.h
+++ b/gcc/tree-ssa-coalesce.h
@@ -21,5 +21,6 @@ along with GCC; see the file COPYING3.  If not see
 #define GCC_TREE_SSA_COALESCE_H
 
 extern var_map coalesce_ssa_name (void);
+extern bool gimple_can_coalesce_p (tree, tree);
 
 #endif /* GCC_TREE_SSA_COALESCE_H */
diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c
deleted file mode 100644
index aeb7f28..0000000
--- a/gcc/tree-ssa-copyrename.c
+++ /dev/null
@@ -1,475 +0,0 @@
-/* Rename SSA copies.
-   Copyright (C) 2004-2015 Free Software Foundation, Inc.
-   Contributed by Andrew MacLeod <amacleod@redhat.com>
-
-This file is part of GCC.
-
-GCC is free software; you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation; either version 3, or (at your option)
-any later version.
-
-GCC is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-GNU General Public License for more details.
-
-You should have received a copy of the GNU General Public License
-along with GCC; see the file COPYING3.  If not see
-<http://www.gnu.org/licenses/>.  */
-
-#include "config.h"
-#include "system.h"
-#include "coretypes.h"
-#include "backend.h"
-#include "tree.h"
-#include "gimple.h"
-#include "rtl.h"
-#include "ssa.h"
-#include "alias.h"
-#include "fold-const.h"
-#include "internal-fn.h"
-#include "gimple-iterator.h"
-#include "flags.h"
-#include "tree-pretty-print.h"
-#include "insn-config.h"
-#include "expmed.h"
-#include "dojump.h"
-#include "explow.h"
-#include "calls.h"
-#include "emit-rtl.h"
-#include "varasm.h"
-#include "stmt.h"
-#include "expr.h"
-#include "tree-dfa.h"
-#include "tree-inline.h"
-#include "tree-ssa-live.h"
-#include "tree-pass.h"
-#include "langhooks.h"
-
-static struct
-{
-  /* Number of copies coalesced.  */
-  int coalesced;
-} stats;
-
-/* The following routines implement the SSA copy renaming phase.
-
-   This optimization looks for copies between 2 SSA_NAMES, either through a
-   direct copy, or an implicit one via a PHI node result and its arguments.
-
-   Each copy is examined to determine if it is possible to rename the base
-   variable of one of the operands to the same variable as the other operand.
-   i.e.
-   T.3_5 = <blah>
-   a_1 = T.3_5
-
-   If this copy couldn't be copy propagated, it could possibly remain in the
-   program throughout the optimization phases.   After SSA->normal, it would
-   become:
-
-   T.3 = <blah>
-   a = T.3
-
-   Since T.3_5 is distinct from all other SSA versions of T.3, there is no
-   fundamental reason why the base variable needs to be T.3, subject to
-   certain restrictions.  This optimization attempts to determine if we can
-   change the base variable on copies like this, and result in code such as:
-
-   a_5 = <blah>
-   a_1 = a_5
-
-   This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is
-   possible, the copy goes away completely. If it isn't possible, a new temp
-   will be created for a_5, and you will end up with the exact same code:
-
-   a.8 = <blah>
-   a = a.8
-
-   The other benefit of performing this optimization relates to what variables
-   are chosen in copies.  Gimplification of the program uses temporaries for
-   a lot of things. expressions like
-
-   a_1 = <blah>
-   <blah2> = a_1
-
-   get turned into
-
-   T.3_5 = <blah>
-   a_1 = T.3_5
-   <blah2> = a_1
-
-   Copy propagation is done in a forward direction, and if we can propagate
-   through the copy, we end up with:
-
-   T.3_5 = <blah>
-   <blah2> = T.3_5
-
-   The copy is gone, but so is all reference to the user variable 'a'. By
-   performing this optimization, we would see the sequence:
-
-   a_5 = <blah>
-   a_1 = a_5
-   <blah2> = a_1
-
-   which copy propagation would then turn into:
-
-   a_5 = <blah>
-   <blah2> = a_5
-
-   and so we still retain the user variable whenever possible.  */
-
-
-/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
-   Choose a representative for the partition, and send debug info to DEBUG.  */
-
-static void
-copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
-{
-  int p1, p2, p3;
-  tree root1, root2;
-  tree rep1, rep2;
-  bool ign1, ign2, abnorm;
-
-  gcc_assert (TREE_CODE (var1) == SSA_NAME);
-  gcc_assert (TREE_CODE (var2) == SSA_NAME);
-
-  register_ssa_partition (map, var1);
-  register_ssa_partition (map, var2);
-
-  p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
-  p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
-
-  if (debug)
-    {
-      fprintf (debug, "Try : ");
-      print_generic_expr (debug, var1, TDF_SLIM);
-      fprintf (debug, "(P%d) & ", p1);
-      print_generic_expr (debug, var2, TDF_SLIM);
-      fprintf (debug, "(P%d)", p2);
-    }
-
-  gcc_assert (p1 != NO_PARTITION);
-  gcc_assert (p2 != NO_PARTITION);
-
-  if (p1 == p2)
-    {
-      if (debug)
-	fprintf (debug, " : Already coalesced.\n");
-      return;
-    }
-
-  rep1 = partition_to_var (map, p1);
-  rep2 = partition_to_var (map, p2);
-  root1 = SSA_NAME_VAR (rep1);
-  root2 = SSA_NAME_VAR (rep2);
-  if (!root1 && !root2)
-    return;
-
-  /* Don't coalesce if one of the variables occurs in an abnormal PHI.  */
-  abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1)
-	    || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2));
-  if (abnorm)
-    {
-      if (debug)
-	fprintf (debug, " : Abnormal PHI barrier.  No coalesce.\n");
-      return;
-    }
-
-  /* Partitions already have the same root, simply merge them.  */
-  if (root1 == root2)
-    {
-      p1 = partition_union (map->var_partition, p1, p2);
-      if (debug)
-	fprintf (debug, " : Same root, coalesced --> P%d.\n", p1);
-      return;
-    }
-
-  /* Never attempt to coalesce 2 different parameters.  */
-  if ((root1 && TREE_CODE (root1) == PARM_DECL)
-      && (root2 && TREE_CODE (root2) == PARM_DECL))
-    {
-      if (debug)
-        fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n");
-      return;
-    }
-
-  if ((root1 && TREE_CODE (root1) == RESULT_DECL)
-      != (root2 && TREE_CODE (root2) == RESULT_DECL))
-    {
-      if (debug)
-        fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n");
-      return;
-    }
-
-  ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1));
-  ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2));
-
-  /* Refrain from coalescing user variables, if requested.  */
-  if (!ign1 && !ign2)
-    {
-      if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2))
-	ign2 = true;
-      else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1))
-	ign1 = true;
-      else if (flag_ssa_coalesce_vars != 2)
-	{
-	  if (debug)
-	    fprintf (debug, " : 2 different USER vars. No coalesce.\n");
-	  return;
-	}
-      else
-	ign2 = true;
-    }
-
-  /* If both values have default defs, we can't coalesce.  If only one has a
-     tag, make sure that variable is the new root partition.  */
-  if (root1 && ssa_default_def (cfun, root1))
-    {
-      if (root2 && ssa_default_def (cfun, root2))
-	{
-	  if (debug)
-	    fprintf (debug, " : 2 default defs. No coalesce.\n");
-	  return;
-	}
-      else
-        {
-	  ign2 = true;
-	  ign1 = false;
-	}
-    }
-  else if (root2 && ssa_default_def (cfun, root2))
-    {
-      ign1 = true;
-      ign2 = false;
-    }
-
-  /* Do not coalesce if we cannot assign a symbol to the partition.  */
-  if (!(!ign2 && root2)
-      && !(!ign1 && root1))
-    {
-      if (debug)
-	fprintf (debug, " : Choosen variable has no root.  No coalesce.\n");
-      return;
-    }
-
-  /* Don't coalesce if the new chosen root variable would be read-only.
-     If both ign1 && ign2, then the root var of the larger partition
-     wins, so reject in that case if any of the root vars is TREE_READONLY.
-     Otherwise reject only if the root var, on which replace_ssa_name_symbol
-     will be called below, is readonly.  */
-  if (((root1 && TREE_READONLY (root1)) && ign2)
-      || ((root2 && TREE_READONLY (root2)) && ign1))
-    {
-      if (debug)
-	fprintf (debug, " : Readonly variable.  No coalesce.\n");
-      return;
-    }
-
-  /* Don't coalesce if the two variables aren't type compatible .  */
-  if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2))
-      /* There is a disconnect between the middle-end type-system and
-         VRP, avoid coalescing enum types with different bounds.  */
-      || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE
-	   || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE)
-	  && TREE_TYPE (var1) != TREE_TYPE (var2)))
-    {
-      if (debug)
-	fprintf (debug, " : Incompatible types.  No coalesce.\n");
-      return;
-    }
-
-  /* Merge the two partitions.  */
-  p3 = partition_union (map->var_partition, p1, p2);
-
-  /* Set the root variable of the partition to the better choice, if there is
-     one.  */
-  if (!ign2 && root2)
-    replace_ssa_name_symbol (partition_to_var (map, p3), root2);
-  else if (!ign1 && root1)
-    replace_ssa_name_symbol (partition_to_var (map, p3), root1);
-  else
-    gcc_unreachable ();
-
-  if (debug)
-    {
-      fprintf (debug, " --> P%d ", p3);
-      print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)),
-			  TDF_SLIM);
-      fprintf (debug, "\n");
-    }
-}
-
-
-namespace {
-
-const pass_data pass_data_rename_ssa_copies =
-{
-  GIMPLE_PASS, /* type */
-  "copyrename", /* name */
-  OPTGROUP_NONE, /* optinfo_flags */
-  TV_TREE_COPY_RENAME, /* tv_id */
-  ( PROP_cfg | PROP_ssa ), /* properties_required */
-  0, /* properties_provided */
-  0, /* properties_destroyed */
-  0, /* todo_flags_start */
-  0, /* todo_flags_finish */
-};
-
-class pass_rename_ssa_copies : public gimple_opt_pass
-{
-public:
-  pass_rename_ssa_copies (gcc::context *ctxt)
-    : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt)
-  {}
-
-  /* opt_pass methods: */
-  opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); }
-  virtual bool gate (function *) { return flag_tree_copyrename != 0; }
-  virtual unsigned int execute (function *);
-
-}; // class pass_rename_ssa_copies
-
-/* This function will make a pass through the IL, and attempt to coalesce any
-   SSA versions which occur in PHI's or copies.  Coalescing is accomplished by
-   changing the underlying root variable of all coalesced version.  This will
-   then cause the SSA->normal pass to attempt to coalesce them all to the same
-   variable.  */
-
-unsigned int
-pass_rename_ssa_copies::execute (function *fun)
-{
-  var_map map;
-  basic_block bb;
-  tree var, part_var;
-  gimple stmt;
-  unsigned x;
-  FILE *debug;
-
-  memset (&stats, 0, sizeof (stats));
-
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    debug = dump_file;
-  else
-    debug = NULL;
-
-  map = init_var_map (num_ssa_names);
-
-  FOR_EACH_BB_FN (bb, fun)
-    {
-      /* Scan for real copies.  */
-      for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
-	   gsi_next (&gsi))
-	{
-	  stmt = gsi_stmt (gsi);
-	  if (gimple_assign_ssa_name_copy_p (stmt))
-	    {
-	      tree lhs = gimple_assign_lhs (stmt);
-	      tree rhs = gimple_assign_rhs1 (stmt);
-
-	      copy_rename_partition_coalesce (map, lhs, rhs, debug);
-	    }
-	}
-    }
-
-  FOR_EACH_BB_FN (bb, fun)
-    {
-      /* Treat PHI nodes as copies between the result and each argument.  */
-      for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
-	   gsi_next (&gsi))
-        {
-          size_t i;
-	  tree res;
-	  gphi *phi = gsi.phi ();
-	  res = gimple_phi_result (phi);
-
-	  /* Do not process virtual SSA_NAMES.  */
-	  if (virtual_operand_p (res))
-	    continue;
-
-	  /* Make sure to only use the same partition for an argument
-	     as the result but never the other way around.  */
-	  if (SSA_NAME_VAR (res)
-	      && !DECL_IGNORED_P (SSA_NAME_VAR (res)))
-	    for (i = 0; i < gimple_phi_num_args (phi); i++)
-	      {
-		tree arg = PHI_ARG_DEF (phi, i);
-		if (TREE_CODE (arg) == SSA_NAME)
-		  copy_rename_partition_coalesce (map, res, arg,
-						  debug);
-	      }
-	  /* Else if all arguments are in the same partition try to merge
-	     it with the result.  */
-	  else
-	    {
-	      int all_p_same = -1;
-	      int p = -1;
-	      for (i = 0; i < gimple_phi_num_args (phi); i++)
-		{
-		  tree arg = PHI_ARG_DEF (phi, i);
-		  if (TREE_CODE (arg) != SSA_NAME)
-		    {
-		      all_p_same = 0;
-		      break;
-		    }
-		  else if (all_p_same == -1)
-		    {
-		      p = partition_find (map->var_partition,
-					  SSA_NAME_VERSION (arg));
-		      all_p_same = 1;
-		    }
-		  else if (all_p_same == 1
-			   && p != partition_find (map->var_partition,
-						   SSA_NAME_VERSION (arg)))
-		    {
-		      all_p_same = 0;
-		      break;
-		    }
-		}
-	      if (all_p_same == 1)
-		copy_rename_partition_coalesce (map, res,
-						PHI_ARG_DEF (phi, 0),
-						debug);
-	    }
-        }
-    }
-
-  if (debug)
-    dump_var_map (debug, map);
-
-  /* Now one more pass to make all elements of a partition share the same
-     root variable.  */
-
-  for (x = 1; x < num_ssa_names; x++)
-    {
-      part_var = partition_to_var (map, x);
-      if (!part_var)
-        continue;
-      var = ssa_name (x);
-      if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var))
-	continue;
-      if (debug)
-        {
-	  fprintf (debug, "Coalesced ");
-	  print_generic_expr (debug, var, TDF_SLIM);
-	  fprintf (debug, " to ");
-	  print_generic_expr (debug, part_var, TDF_SLIM);
-	  fprintf (debug, "\n");
-	}
-      stats.coalesced++;
-      replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var));
-    }
-
-  statistics_counter_event (fun, "copies coalesced",
-			    stats.coalesced);
-  delete_var_map (map);
-  return 0;
-}
-
-} // anon namespace
-
-gimple_opt_pass *
-make_pass_rename_ssa_copies (gcc::context *ctxt)
-{
-  return new pass_rename_ssa_copies (ctxt);
-}
diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
index 5b00f58..4772558 100644
--- a/gcc/tree-ssa-live.c
+++ b/gcc/tree-ssa-live.c
@@ -70,88 +70,6 @@ static void  verify_live_on_entry (tree_live_info_p);
    ssa_name or variable, and vice versa.  */
 
 
-/* Hashtable helpers.  */
-
-struct tree_int_map_hasher : nofree_ptr_hash <tree_int_map>
-{
-  static inline hashval_t hash (const tree_int_map *);
-  static inline bool equal (const tree_int_map *, const tree_int_map *);
-};
-
-inline hashval_t
-tree_int_map_hasher::hash (const tree_int_map *v)
-{
-  return tree_map_base_hash (v);
-}
-
-inline bool
-tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
-{
-  return tree_int_map_eq (v, c);
-}
-
-
-/* This routine will initialize the basevar fields of MAP.  */
-
-static void
-var_map_base_init (var_map map)
-{
-  int x, num_part;
-  tree var;
-  struct tree_int_map *m, *mapstorage;
-
-  num_part = num_var_partitions (map);
-  hash_table<tree_int_map_hasher> tree_to_index (num_part);
-  /* We can have at most num_part entries in the hash tables, so it's
-     enough to allocate so many map elements once, saving some malloc
-     calls.  */
-  mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
-
-  /* If a base table already exists, clear it, otherwise create it.  */
-  free (map->partition_to_base_index);
-  map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
-
-  /* Build the base variable list, and point partitions at their bases.  */
-  for (x = 0; x < num_part; x++)
-    {
-      struct tree_int_map **slot;
-      unsigned baseindex;
-      var = partition_to_var (map, x);
-      if (SSA_NAME_VAR (var)
-	  && (!VAR_P (SSA_NAME_VAR (var))
-	      || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
-	m->base.from = SSA_NAME_VAR (var);
-      else
-	/* This restricts what anonymous SSA names we can coalesce
-	   as it restricts the sets we compute conflicts for.
-	   Using TREE_TYPE to generate sets is the easies as
-	   type equivalency also holds for SSA names with the same
-	   underlying decl. 
-
-	   Check gimple_can_coalesce_p when changing this code.  */
-	m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
-			? TYPE_CANONICAL (TREE_TYPE (var))
-			: TREE_TYPE (var));
-      /* If base variable hasn't been seen, set it up.  */
-      slot = tree_to_index.find_slot (m, INSERT);
-      if (!*slot)
-	{
-	  baseindex = m - mapstorage;
-	  m->to = baseindex;
-	  *slot = m;
-	  m++;
-	}
-      else
-	baseindex = (*slot)->to;
-      map->partition_to_base_index[x] = baseindex;
-    }
-
-  map->num_basevars = m - mapstorage;
-
-  free (mapstorage);
-}
-
-
 /* Remove the base table in MAP.  */
 
 static void
@@ -329,21 +247,17 @@ partition_view_fini (var_map map, bitmap selected)
 }
 
 
-/* Create a partition view which includes all the used partitions in MAP.  If
-   WANT_BASES is true, create the base variable map as well.  */
+/* Create a partition view which includes all the used partitions in MAP.  */
 
 void
-partition_view_normal (var_map map, bool want_bases)
+partition_view_normal (var_map map)
 {
   bitmap used;
 
   used = partition_view_init (map);
   partition_view_fini (map, used);
 
-  if (want_bases)
-    var_map_base_init (map);
-  else
-    var_map_base_fini (map);
+  var_map_base_fini (map);
 }
 
 
@@ -352,7 +266,7 @@ partition_view_normal (var_map map, bool want_bases)
    as well.  */
 
 void
-partition_view_bitmap (var_map map, bitmap only, bool want_bases)
+partition_view_bitmap (var_map map, bitmap only)
 {
   bitmap used;
   bitmap new_partitions = BITMAP_ALLOC (NULL);
@@ -368,10 +282,7 @@ partition_view_bitmap (var_map map, bitmap only, bool want_bases)
     }
   partition_view_fini (map, new_partitions);
 
-  if (want_bases)
-    var_map_base_init (map);
-  else
-    var_map_base_fini (map);
+  var_map_base_fini (map);
 }
 
 
diff --git a/gcc/tree-ssa-live.h b/gcc/tree-ssa-live.h
index d5d7820..1f88358 100644
--- a/gcc/tree-ssa-live.h
+++ b/gcc/tree-ssa-live.h
@@ -71,8 +71,8 @@ typedef struct _var_map
 extern var_map init_var_map (int);
 extern void delete_var_map (var_map);
 extern int var_union (var_map, tree, tree);
-extern void partition_view_normal (var_map, bool);
-extern void partition_view_bitmap (var_map, bitmap, bool);
+extern void partition_view_normal (var_map);
+extern void partition_view_bitmap (var_map, bitmap);
 extern void dump_scope_blocks (FILE *, int);
 extern void debug_scope_block (tree, int);
 extern void debug_scope_blocks (int);
diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c
index 437f69d..1fbd71e 100644
--- a/gcc/tree-ssa-uncprop.c
+++ b/gcc/tree-ssa-uncprop.c
@@ -38,6 +38,11 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-pass.h"
 #include "tree-ssa-propagate.h"
 #include "tree-hash-traits.h"
+#include "bitmap.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-ssa-live.h"
+#include "tree-ssa-coalesce.h"
 
 /* The basic structure describing an equivalency created by traversing
    an edge.  Traversing the edge effectively means that we can assume
diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
index da9de28..a31a137 100644
--- a/gcc/var-tracking.c
+++ b/gcc/var-tracking.c
@@ -4856,12 +4856,16 @@ dataflow_set_remove_mem_locs (variable_def **slot, dataflow_set *set)
    registers, as well as associations between MEMs and VALUEs.  */
 
 static void
-dataflow_set_clear_at_call (dataflow_set *set)
+dataflow_set_clear_at_call (dataflow_set *set, rtx_insn *call_insn)
 {
   unsigned int r;
   hard_reg_set_iterator hrsi;
+  HARD_REG_SET invalidated_regs;
 
-  EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, r, hrsi)
+  get_call_reg_set_usage (call_insn, &invalidated_regs,
+			  regs_invalidated_by_call);
+
+  EXECUTE_IF_SET_IN_HARD_REG_SET (invalidated_regs, 0, r, hrsi)
     var_regno_delete (set, r);
 
   if (MAY_HAVE_DEBUG_INSNS)
@@ -6645,7 +6649,7 @@ compute_bb_dataflow (basic_block bb)
       switch (mo->type)
 	{
 	  case MO_CALL:
-	    dataflow_set_clear_at_call (out);
+	    dataflow_set_clear_at_call (out, insn);
 	    break;
 
 	  case MO_USE:
@@ -9107,7 +9111,7 @@ emit_notes_in_bb (basic_block bb, dataflow_set *set)
       switch (mo->type)
 	{
 	  case MO_CALL:
-	    dataflow_set_clear_at_call (set);
+	    dataflow_set_clear_at_call (set, insn);
 	    emit_notes_for_changes (insn, EMIT_NOTE_AFTER_CALL_INSN, set->vars);
 	    {
 	      rtx arguments = mo->u.loc, *p = &arguments;


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-14 19:03                                                       ` Alexandre Oliva
@ 2015-08-15  8:57                                                         ` Andreas Schwab
  2015-08-16 13:00                                                           ` Alexandre Oliva
  2015-08-16 16:42                                                         ` Andreas Schwab
                                                                           ` (2 subsequent siblings)
  3 siblings, 1 reply; 127+ messages in thread
From: Andreas Schwab @ 2015-08-15  8:57 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Patrick Marlier, Jeff Law, James Greenhalgh, H.J. Lu,
	Segher Boessenkool, Richard Biener, GCC Patches, Christophe Lyon,
	David Edelsohn, Eric Botcazou

FAIL: gcc.target/aarch64/target_attr_crypto_ice_1.c (internal compiler error)

In file included from /opt/gcc/gcc-20150815/gcc/testsuite/gcc.target/aarch64/target_attr_crypto_ice_1.c:4:0:
/opt/gcc/gcc-20150815/Build/gcc/include/arm_neon.h: In function 'test_vsha1cq_u32':
/opt/gcc/gcc-20150815/Build/gcc/include/arm_neon.h:21076:10: internal compiler error: in expand_expr_real_1, at expr.c:9532
0x7f060b expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
        ../../gcc/expr.c:9532
0xdb1027 expand_normal
        ../../gcc/expr.h:261
0xdb1027 aarch64_simd_expand_args
        ../../gcc/config/aarch64/aarch64-builtins.c:944
0xdb1027 aarch64_simd_expand_builtin(int, tree_node*, rtx_def*)
        ../../gcc/config/aarch64/aarch64-builtins.c:1118
0x6cc667 expand_builtin(tree_node*, rtx_def*, rtx_def*, machine_mode, int)
        ../../gcc/builtins.c:5931
0x7ecab7 expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
        ../../gcc/expr.c:10360
0x7f8547 store_expr_with_bounds(tree_node*, rtx_def*, int, bool, tree_node*)
        ../../gcc/expr.c:5398
0x7fa9d3 expand_assignment(tree_node*, tree_node*, bool)
        ../../gcc/expr.c:5170
0x6f435f expand_call_stmt
        ../../gcc/cfgexpand.c:2621
0x6f435f expand_gimple_stmt_1
        ../../gcc/cfgexpand.c:3510
0x6f435f expand_gimple_stmt
        ../../gcc/cfgexpand.c:3671
0x6f69c7 expand_gimple_tailcall
        ../../gcc/cfgexpand.c:3718
0x6f69c7 expand_gimple_basic_block
        ../../gcc/cfgexpand.c:5651
0x6fc777 execute
        ../../gcc/cfgexpand.c:6260

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-15  8:57                                                         ` Andreas Schwab
@ 2015-08-16 13:00                                                           ` Alexandre Oliva
       [not found]                                                             ` <m2k2sv8s21.fsf@linux-m68k.org>
  0 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-08-16 13:00 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Patrick Marlier, Jeff Law, James Greenhalgh, H.J. Lu,
	Segher Boessenkool, Richard Biener, GCC Patches, Christophe Lyon,
	David Edelsohn, Eric Botcazou

On Aug 15, 2015, Andreas Schwab <schwab@linux-m68k.org> wrote:

> FAIL: gcc.target/aarch64/target_attr_crypto_ice_1.c (internal compiler error)

> In file included from
> /opt/gcc/gcc-20150815/gcc/testsuite/gcc.target/aarch64/target_attr_crypto_ice_1.c:4:0:

Are you sure this is a regression introduced by my patch?  The comments
at the top of this file seem to indicate it is a known problem in the
expansion of the crypto builtin, which is precisely what we see in the
backtrace.

If it is indeed a regression, would you please provide me with a
preprocessed testcase so that I can look into it without a native
environment?

Thanks in advance,

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-14 19:03                                                       ` Alexandre Oliva
  2015-08-15  8:57                                                         ` Andreas Schwab
@ 2015-08-16 16:42                                                         ` Andreas Schwab
  2015-08-17  2:57                                                           ` Alexandre Oliva
  2015-08-17  7:48                                                         ` Christophe Lyon
  2015-09-02 17:09                                                         ` Alan Lawrence
  3 siblings, 1 reply; 127+ messages in thread
From: Andreas Schwab @ 2015-08-16 16:42 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Patrick Marlier, Jeff Law, James Greenhalgh, H.J. Lu,
	Segher Boessenkool, Richard Biener, GCC Patches, Christophe Lyon,
	David Edelsohn, Eric Botcazou

On m68k:

FAIL: gcc.c-torture/execute/20050316-1.c   -O0  execution test
FAIL: gcc.c-torture/execute/20050316-2.c   -O0  execution test
FAIL: gcc.c-torture/execute/20050316-3.c   -O0  execution test
FAIL: gcc.c-torture/execute/simd-4.c   -O0  execution test
FAIL: gcc.c-torture/execute/simd-6.c   -O0  execution test
FAIL: gcc.dg/compat/vector-1 c_compat_x_tst.o-c_compat_y_tst.o execute

--- 20050316-1.s-good
+++ 20050316-1.s-bad
@@ -15,8 +15,17 @@
	.type	test2, @function
 test2:
	link.w %fp,#0
-	move.l 8(%fp),%d0
-	move.l 12(%fp),%d1
+	move.l 8(%fp),(%a0)
+	move.l 12(%fp),4(%a0)
+	lea (-16,%sp),%sp
+	move.l %sp,%d0
+	addq.l #7,%d0
+	lsr.l #3,%d0
+	move.l %d0,%d1
+	lsl.l #3,%d1
+	move.l %d1,%a0
+	move.l (%a0),%d0
+	move.l 4(%a0),%d1
	move.l %d1,%d0
	unlk %fp
	rts
@@ -37,8 +46,9 @@
	.globl	test4
	.type	test4, @function
 test4:
-	link.w %fp,#0
-	move.l 8(%fp),%d0
+	link.w %fp,#-4
+	move.l 8(%fp),-4(%fp)
+	move.l -4(%fp),%d0
	move.l %d0,%d1
	smi %d0
	extb.l %d0
@@ -54,8 +64,17 @@
	.type	test5, @function
 test5:
	link.w %fp,#0
-	move.l 8(%fp),%a0
-	move.l 12(%fp),%a1
+	move.l 8(%fp),(%a0)
+	move.l 12(%fp),4(%a0)
+	lea (-16,%sp),%sp
+	move.l %sp,%d0
+	addq.l #7,%d0
+	lsr.l #3,%d0
+	move.l %d0,%d1
+	lsl.l #3,%d1
+	move.l %d1,%a0
+	move.l 4(%a0),%a1
+	move.l (%a0),%a0
	move.l %a0,%d0
	move.l %a1,%d1
	unlk %fp

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-16 16:42                                                         ` Andreas Schwab
@ 2015-08-17  2:57                                                           ` Alexandre Oliva
  2015-08-17  8:23                                                             ` Andreas Schwab
  0 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-08-17  2:57 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Patrick Marlier, Jeff Law, James Greenhalgh, H.J. Lu,
	Segher Boessenkool, Richard Biener, GCC Patches, Christophe Lyon,
	David Edelsohn, Eric Botcazou

On Aug 16, 2015, Andreas Schwab <schwab@linux-m68k.org> wrote:

> On m68k:
> FAIL: gcc.c-torture/execute/20050316-1.c   -O0  execution test
> FAIL: gcc.c-torture/execute/20050316-2.c   -O0  execution test
> FAIL: gcc.c-torture/execute/20050316-3.c   -O0  execution test
> FAIL: gcc.c-torture/execute/simd-4.c   -O0  execution test
> FAIL: gcc.c-torture/execute/simd-6.c   -O0  execution test
> FAIL: gcc.dg/compat/vector-1 c_compat_x_tst.o-c_compat_y_tst.o execute

Thanks.  Interesting.  This exposes a more general situation than the
one I covered with the byref params: the params need not be passed by
reference; it is enough that they require a stack address.  If cfgexpand
determines that address, it is computed too late for assign_parms' use.
The following patch appears to fix the problem.  It applies the same
logic of limited coalescing and deferred address assignment to all
params that can't live in pseudos, and it extends assign_parms'
remaining case of copying incoming params to new stack slots so that it
fills in the blank address with that of the newly-allocated stack slot.
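
As a rough illustration of the deferred address assignment described
above, here is a toy C model (not GCC code).  The names toy_mem,
toy_parm, toy_expand_partition, toy_in_unassigned_mem_p and
toy_assign_parm are all hypothetical; they only mimic the roles of the
MEM set up by expand_one_ssa_partition, the check in
parm_in_unassigned_mem_p, and the fill-in done by assign_parms.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an RTL MEM: only the address matters here.
   A NULL address means "not assigned yet".  */
struct toy_mem
{
  const void *addr;
};

/* Toy stand-in for a parameter.  NEEDS_STACK_SLOT plays the role of
   !use_register_for_decl (parm) in the real patch (an assumption of
   this sketch, not a copy of GCC's data structures).  */
struct toy_parm
{
  const char *name;
  int needs_stack_slot;
  struct toy_mem *rtl;
};

/* Expand phase: associate a MEM with the parameter, but leave its
   address blank so that any premature use can be detected.  */
void
toy_expand_partition (struct toy_parm *parm, struct toy_mem *mem)
{
  mem->addr = NULL;
  parm->rtl = mem;
}

/* Return nonzero if PARM has a MEM whose address is still blank.  */
int
toy_in_unassigned_mem_p (const struct toy_parm *parm)
{
  int result = parm->rtl != NULL && parm->rtl->addr == NULL;

  /* In this toy model only stack-slot parms may be in that state.  */
  assert (!result || parm->needs_stack_slot);
  return result;
}

/* Parameter-assignment phase: once the stack slot is known, fill in
   the blank address.  Parms that already have an address are left
   alone.  */
void
toy_assign_parm (struct toy_parm *parm, const void *slot)
{
  if (toy_in_unassigned_mem_p (parm))
    parm->rtl->addr = slot;
}
```

Calling toy_expand_partition and then toy_assign_parm on the same
toy_parm leaves its MEM pointing at the slot chosen in the second
step, which is the ordering the patch relies on.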

Would you be so kind as to give it a spin on a m68k native?  TIA,


diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 0bc20f6..56571ce 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -172,17 +172,23 @@ leader_merge (tree cur, tree next)
   return cur;
 }
 
-/* Return true if VAR is a PARM_DECL or a RESULT_DECL of type BLKmode.
+/* Return true if VAR is a PARM_DECL or a RESULT_DECL that ought to be
+   assigned to a stack slot.  We can't have expand_one_ssa_partition
+   choose their address: the pseudo holding the address would be set
+   up too late for assign_parms to copy the parameter if needed.
+
    Such parameters are likely passed as a pointer to the value, rather
    than as a value, and so we must not coalesce them, nor allocate
    stack space for them before determining the calling conventions for
-   them.  For their SSA_NAMEs, expand_one_ssa_partition emits RTL as
-   MEMs with pc_rtx as the address, and then it replaces the pc_rtx
-   with NULL so as to make sure the MEM is not used before it is
-   adjusted in assign_parm_setup_reg.  */
+   them.
+
+   For their SSA_NAMEs, expand_one_ssa_partition emits RTL as MEMs
+   with pc_rtx as the address, and then it replaces the pc_rtx with
+   NULL so as to make sure the MEM is not used before it is adjusted
+   in assign_parm_setup_reg.  */
 
 bool
-parm_maybe_byref_p (tree var)
+parm_in_stack_slot_p (tree var)
 {
   if (!var || VAR_P (var))
     return false;
@@ -190,7 +196,7 @@ parm_maybe_byref_p (tree var)
   gcc_assert (TREE_CODE (var) == PARM_DECL
 	      || TREE_CODE (var) == RESULT_DECL);
 
-  return TYPE_MODE (TREE_TYPE (var)) == BLKmode;
+  return !use_register_for_decl (var);
 }
 
 /* Return the partition of the default SSA_DEF for decl VAR.  */
@@ -1343,13 +1349,15 @@ expand_one_ssa_partition (tree var)
 
   if (!use_register_for_decl (var))
     {
-      if (parm_maybe_byref_p (SSA_NAME_VAR (var))
+      /* We can't risk having the parm assigned to a MEM location
+	 whose address references a pseudo, for the pseudo will only
+	 be set up after arguments are copied to the stack slot.  */
+      if (parm_in_stack_slot_p (SSA_NAME_VAR (var))
 	  && ssa_default_def_partition (SSA_NAME_VAR (var)) == part)
 	{
 	  expand_one_stack_var_at (var, pc_rtx, 0, 0);
 	  rtx x = SA.partition_to_pseudo[part];
 	  gcc_assert (GET_CODE (x) == MEM);
-	  gcc_assert (GET_MODE (x) == BLKmode);
 	  gcc_assert (XEXP (x, 0) == pc_rtx);
 	  /* Reset the address, so that any attempt to use it will
 	     ICE.  It will be adjusted in assign_parm_setup_reg.  */
diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
index 987cf356..d168672 100644
--- a/gcc/cfgexpand.h
+++ b/gcc/cfgexpand.h
@@ -22,7 +22,7 @@ along with GCC; see the file COPYING3.  If not see
 
 extern tree gimple_assign_rhs_to_tree (gimple);
 extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
-extern bool parm_maybe_byref_p (tree);
+extern bool parm_in_stack_slot_p (tree);
 extern rtx get_rtl_for_parm_ssa_default_def (tree var);
 
 
diff --git a/gcc/function.c b/gcc/function.c
index 715c19f..eccd8c6 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -2934,6 +2934,16 @@ assign_parm_setup_block_p (struct assign_parm_data_one *data)
   return false;
 }
 
+static bool
+parm_in_unassigned_mem_p (tree decl, rtx from_expand)
+{
+  bool result = MEM_P (from_expand) && !XEXP (from_expand, 0);
+
+  gcc_assert (result == parm_in_stack_slot_p (decl));
+
+  return result;
+}
+
 /* A subroutine of assign_parms.  Arrange for the parameter to be
    present and valid in DATA->STACK_RTL.  */
 
@@ -2956,8 +2966,7 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
     {
       DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
       rtx from_expand = rtl_for_parm (all, parm);
-      if (from_expand && (!parm_maybe_byref_p (parm)
-			  || XEXP (from_expand, 0) != NULL_RTX))
+      if (from_expand && !parm_in_unassigned_mem_p (parm, from_expand))
 	stack_parm = copy_rtx (from_expand);
       else
 	{
@@ -2968,8 +2977,7 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
 	  if (from_expand)
 	    {
 	      gcc_assert (GET_CODE (stack_parm) == MEM);
-	      gcc_assert (GET_CODE (from_expand) == MEM);
-	      gcc_assert (XEXP (from_expand, 0) == NULL_RTX);
+	      gcc_assert (parm_in_unassigned_mem_p (parm, from_expand));
 	      XEXP (from_expand, 0) = XEXP (stack_parm, 0);
 	      PUT_MODE (from_expand, GET_MODE (stack_parm));
 	      stack_parm = copy_rtx (from_expand);
@@ -3121,7 +3129,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
       if (GET_MODE (parmreg) != promoted_nominal_mode)
 	parmreg = gen_lowpart (promoted_nominal_mode, parmreg);
     }
-  else if (!from_expand || parm_maybe_byref_p (parm))
+  else if (!from_expand || parm_in_unassigned_mem_p (parm, from_expand))
     {
       parmreg = gen_reg_rtx (promoted_nominal_mode);
       if (!DECL_ARTIFICIAL (parm))
@@ -3349,7 +3357,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 	  did_conversion = true;
 	}
       else if (GET_MODE (parmreg) == BLKmode)
-	gcc_assert (parm_maybe_byref_p (parm));
+	gcc_assert (parm_in_stack_slot_p (parm));
       else
 	emit_move_insn (parmreg, src);
 
@@ -3455,12 +3463,15 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
   if (data->entry_parm != data->stack_parm)
     {
       rtx src, dest;
+      rtx from_expand = NULL_RTX;
 
       if (data->stack_parm == 0)
 	{
-	  rtx x = data->stack_parm = rtl_for_parm (all, parm);
-	  if (x)
-	    gcc_assert (GET_MODE (x) == GET_MODE (data->entry_parm));
+	  from_expand = rtl_for_parm (all, parm);
+	  if (from_expand)
+	    gcc_assert (GET_MODE (from_expand) == GET_MODE (data->entry_parm));
+	  if (from_expand && !parm_in_unassigned_mem_p (parm, from_expand))
+	    data->stack_parm = from_expand;
 	}
 
       if (data->stack_parm == 0)
@@ -3472,7 +3483,16 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
 	    = assign_stack_local (GET_MODE (data->entry_parm),
 				  GET_MODE_SIZE (GET_MODE (data->entry_parm)),
 				  align);
-	  set_mem_attributes (data->stack_parm, parm, 1);
+	  if (!from_expand)
+	    set_mem_attributes (data->stack_parm, parm, 1);
+	  else
+	    {
+	      gcc_assert (GET_CODE (data->stack_parm) == MEM);
+	      gcc_assert (parm_in_unassigned_mem_p (parm, from_expand));
+	      XEXP (from_expand, 0) = XEXP (data->stack_parm, 0);
+	      PUT_MODE (from_expand, GET_MODE (data->stack_parm));
+	      data->stack_parm = copy_rtx (from_expand);
+	    }
 	}
 
       dest = validize_mem (copy_rtx (data->stack_parm));
diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index 08ce72c..6468012 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -1386,8 +1386,8 @@ gimple_can_coalesce_p (tree name1, tree name2)
 	 because it may be passed by reference.  */
       return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
 	|| (/* The case var1 == var2 is already covered above.  */
-	    !parm_maybe_byref_p (var1)
-	    && !parm_maybe_byref_p (var2)
+	    !parm_in_stack_slot_p (var1)
+	    && !parm_in_stack_slot_p (var2)
 	    && promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL));
     }
 


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


* Re: [PR64164] drop copyrename, integrate into expand
       [not found]                                                             ` <m2k2sv8s21.fsf@linux-m68k.org>
@ 2015-08-17  5:05                                                               ` Alexandre Oliva
  2015-08-17  9:29                                                                 ` Kyrill Tkachov
  2015-08-18 16:18                                                                 ` Kyrill Tkachov
  0 siblings, 2 replies; 127+ messages in thread
From: Alexandre Oliva @ 2015-08-17  5:05 UTC (permalink / raw)
  To: Andreas Schwab, Kyrylo Tkachov
  Cc: Patrick Marlier, Jeff Law, James Greenhalgh, H.J. Lu,
	Segher Boessenkool, Richard Biener, GCC Patches, Christophe Lyon,
	David Edelsohn, Eric Botcazou

On Aug 16, 2015, Andreas Schwab <schwab@linux-m68k.org> wrote:

> Alexandre Oliva <aoliva@redhat.com> writes:
>> On Aug 15, 2015, Andreas Schwab <schwab@linux-m68k.org> wrote:
>> 
>>> FAIL: gcc.target/aarch64/target_attr_crypto_ice_1.c (internal compiler error)
>> 
>>> In file included from
>>> /opt/gcc/gcc-20150815/gcc/testsuite/gcc.target/aarch64/target_attr_crypto_ice_1.c:4:0:
>> 
>> Are you sure this is a regression introduced by my patch?

> Yes, it reintroduces the ICE.

Ugh.  I see this testcase was introduced very recently, so presumably it
wasn't present in the tree that James Greenhalgh tested when he
confirmed there were no regressions.

The hack in aarch64-builtins.c looks risky IMHO.  Changing the mode of a
decl after RTL is assigned to it (or to its SSA partitions) seems fishy.
The assert is doing just what it was supposed to do.  The only surprise
to me is that it didn't catch this unexpected and unsupported change
before.

Presumably if we just dropped the assert in expand_expr_real_1, this
case would work just fine, although the unsignedp bit would be
meaningless and thus confusing, since the subreg isn't about a
promotion, but about reflecting the mode change that was made from under
us.

May I suggest that you guys find (or introduce) other means to change
the layout and mode of the decl *before* RTL is assigned to the params?
I think this would save us a ton of trouble down the road.  Just think
how much trouble you'd get if the different modes had different calling
conventions, alignment requirements, valid register assignments, or
anything that might make coalescing their SSA names with those of other
variables invalid.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-14 19:03                                                       ` Alexandre Oliva
  2015-08-15  8:57                                                         ` Andreas Schwab
  2015-08-16 16:42                                                         ` Andreas Schwab
@ 2015-08-17  7:48                                                         ` Christophe Lyon
  2015-08-17 12:43                                                           ` Alexandre Oliva
  2015-09-02 17:09                                                         ` Alan Lawrence
  3 siblings, 1 reply; 127+ messages in thread
From: Christophe Lyon @ 2015-08-17  7:48 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Patrick Marlier, Jeff Law, James Greenhalgh, H.J. Lu,
	Segher Boessenkool, Richard Biener, GCC Patches, David Edelsohn,
	Eric Botcazou

On 14 August 2015 at 20:57, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Aug 11, 2015, Patrick Marlier <patrick.marlier@gmail.com> wrote:
>
>> On Mon, Aug 10, 2015 at 5:14 PM, Jeff Law <law@redhat.com> wrote:
>>> On 08/10/2015 02:23 AM, James Greenhalgh wrote:
>
>>>> For what it is worth, I bootstrapped and tested the consolidated patch
>>>> on arm-none-linux-gnueabihf and aarch64-none-linux-gnu with trunk at
>>>> r226516 over the weekend, and didn't see any new issues.
>
> Thanks!
>
>> Especially as the bug reporter, I am impressed how a slight problem
>> can lead to such a patch! ;)
>> Thanks a lot Alexandre!
>
> You're welcome.  I'm glad it appears to be working to everyone's
> satisfaction now.  I've just committed it as r226901, with only a
> context adjustment to account for a change in use_register_for_decl in
> function.c.  /me crosses fingers :-)
>
> Here's the patch as checked in:
>

Hi,

Since this was committed (r226901), I can see that the compiler build
fails for armeb targets, when building libgcc:
In file included from
/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/fixed-bit.c:55:0:
/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/fixed-bit.c:
In function '__gnu_addha3':
/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/fixed-bit.h:450:31:
internal compiler error: in simplify_subreg, at simplify-rtx.c:5790
 #define FIXED_OP(OP,MODE,NUM) __gnu_ ## OP ## MODE ## NUM
                               ^
/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/fixed-bit.h:460:30:
note: in expansion of macro 'FIXED_OP'
 #define FIXED_ADD_TEMP(NAME) FIXED_OP(add,NAME,3)
                              ^
/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/fixed-bit.h:492:19:
note: in expansion of macro 'FIXED_ADD_TEMP'
 #define FIXED_ADD FIXED_ADD_TEMP(MODE_NAME_S)
                   ^
/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/fixed-bit.c:59:1:
note: in expansion of macro 'FIXED_ADD'
 FIXED_ADD (FIXED_C_TYPE a, FIXED_C_TYPE b)
 ^
0xa4bbc3 simplify_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)
        /tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/simplify-rtx.c:5790
0xa4bbc3 simplify_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)
        /tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/simplify-rtx.c:5790
0xa4ce2d simplify_gen_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)
        /tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/simplify-rtx.c:6013
0xa4ce2d simplify_gen_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)
        /tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/simplify-rtx.c:6013
0x784385 move_block_from_reg(int, rtx_def*, int)
        /tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:1536
0x784385 move_block_from_reg(int, rtx_def*, int)
        /tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:1536
0x7e165d assign_parm_setup_block
        /tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/function.c:3076
0x7e165d assign_parm_setup_block
        /tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/function.c:3076
0x7e813a assign_parms
        /tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/function.c:3805
0x7e813a assign_parms
        /tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/function.c:3805
0x7e8f2e expand_function_start(tree_node*)
        /tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/function.c:5234
0x7e8f2e expand_function_start(tree_node*)
        /tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/function.c:5234

Christophe.

> for  gcc/ChangeLog
>
>         PR rtl-optimization/64164
>         PR bootstrap/66978
>         PR middle-end/66983
>         PR rtl-optimization/67000
>         PR middle-end/67034
>         PR middle-end/67035
>         * Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
>         * tree-ssa-copyrename.c: Removed.
>         * opts.c (default_options_table): Drop -ftree-copyrename.  Add
>         -ftree-coalesce-vars.
>         * passes.def: Drop all occurrences of pass_rename_ssa_copies.
>         * common.opt (ftree-copyrename): Ignore.
>         (ftree-coalesce-inlined-vars): Likewise.
>         * doc/invoke.texi: Remove the ignored options above.
>         * gimple-expr.h (gimple_can_coalesce_p): Move declaration
>         * tree-ssa-coalesce.h: ... here.
>         * tree-ssa-uncprop.c: Include tree-ssa-coalesce.h and other
>         headers required by it.
>         * gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
>         across variables when flag_tree_coalesce_vars.  Check register
>         use and promoted modes to allow coalescing.  Do not coalesce
>         maybe-byref parms with SSA_NAMEs of other variables, or
>         anonymous SSA_NAMEs.  Moved to tree-ssa-coalesce.c.
>         * tree-ssa-live.c (struct tree_int_map_hasher): Move along
>         with its member functions to tree-ssa-coalesce.c.
>         (var_map_base_init): Likewise.  Renamed to
>         compute_samebase_partition_bases.
>         (partition_view_normal): Drop want_bases parameter.
>         (partition_view_bitmap): Likewise.
>         * tree-ssa-live.h: Adjust declarations.
>         * tree-ssa-coalesce.c: Include explow.h and cfgexpand.h.
>         (build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs'
>         default defs at the entry point.
>         (dump_part_var_map): New.
>         (compute_optimized_partition_bases): New, called by...
>         (coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
>         of compute_samebase_partition_bases.  Adjust.
>         * alias.c (nonoverlapping_memrefs_p): Disregard gimple-regs.
>         * cfgexpand.c (leader_merge, parm_maybe_byref_p): New.
>         (ssa_default_def_partition): New.
>         (get_rtl_for_parm_ssa_default_def): New.
>         (align_local_variable, add_stack_var): Support anonymous SSA
>         names.
>         (defer_stack_allocation): Likewise.  Declare earlier.
>         (set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
>         vars.  Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
>         Do not record deferred-allocation marker in
>         SA.partition_to_pseudo.
>         (expand_stack_vars): Adjust check for the marker in it.
>         (expand_one_stack_var_at): Handle anonymous SSA_NAMEs.  Drop
>         redundant MEM attr setting.
>         (expand_one_stack_var_1): Handle anonymous SSA_NAMEs.  Renamed
>         from...
>         (expand_one_stack_var): ... this.  New wrapper to check and
>         skip already expanded SSA partitions.
>         (record_alignment_for_reg_var): New, factored out of...
>         (expand_one_var): ... this.
>         (expand_one_ssa_partition): New.
>         (adjust_one_expanded_partition_var): New.
>         (expand_one_register_var): Check and skip already expanded SSA
>         partitions.
>         (expand_used_vars): Don't create DECLs for anonymous SSA
>         names.  Expand all SSA partitions, then adjust all SSA names.
>         (pass::execute): Replace the loops that set
>         SA.partition_to_pseudo from partition leaders and cleared
>         DECL_RTL for multi-location variables, and that which used to
>         rename vars and set attrs, with one that clears DECL_RTL and
>         checks that PARMs and RESULTs default_defs match DECL_RTL.
>         * cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
>         * emit-rtl.c: Include stor-layout.h.
>         (set_reg_attrs_for_parm): Handle NULL decl.
>         (set_reg_attrs_for_decl_rtl): Take mode from expression if
>         it's not a DECL.
>         * stmt.c (emit_case_decision_tree): Pass it the SSA_NAME
>         rather than its possibly-NULL DECL.
>         * explow.c (promote_ssa_mode): New.
>         * explow.h (promote_ssa_mode): Declare.
>         * expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
>         (read_complex_part): Export.
>         * expr.h (read_complex_part): Declare.
>         * cfgexpand.h (parm_maybe_byref_p): Declare.
>         * function.c: Include cfgexpand.h.
>         (use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
>         (use_register_for_parm_decl): Wrapper for the above to
>         special-case the result_ptr.
>         (rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
>         (split_complex_args): Take assign_parm_data_all argument.
>         Pass it to rtl_for_parm.  Set up rtl and context for split
>         args.  Reset complex parm before fetching its default decl
>         rtl.
>         (assign_parms_unsplit_complex): Use the default-def complex
>         parm rtl if it matches the components.
>         (assign_parms_augmented_arg_list): Adjust.
>         (maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
>         multiple locations.  Recognize split complex args.
>         (assign_parm_adjust_stack_rtl): Add all and parm arguments,
>         for rtl_for_parm.  For SSA-assigned parms, zero stack_parm.
>         (assign_parm_setup_block): Prefer SSA-assigned location, and
>         fill in its address if the memory location of a maybe-byref
>         parm was not assigned by cfgexpand.
>         (assign_parm_setup_reg): Likewise.  Adjust its mode as
>         needed.  Use entry_parm for equiv if stack_parm is NULL.  Make
>         sure passed_pointer parms don't need conversion.  Copy address
>         or value as needed.
>         (assign_parm_setup_stack): Prefer SSA-assigned location.
>         (assign_parms): Maybe reset DECL_RTL of params.  Adjust stack
>         rtl before testing for pointer bounds.  Special-case result_ptr.
>         (expand_function_start): Maybe reset DECL_RTL of result.
>         Prefer SSA-assigned location for result and static chain.
>         Factor out DECL_RESULT and SET_DECL_RTL.  Convert static chain
>         to Pmode if needed, from H.J. Lu  <hongjiu.lu@intel.com>.
>         * tree-outof-ssa.c (insert_value_copy_on_edge): Handle
>         anonymous SSA names.  Use promote_ssa_mode.
>         (get_temp_reg): Likewise.
>         (remove_ssa_form): Adjust.
>         * stor-layout.c (layout_decl): Don't set mem attributes of
>         non-MEMs.
>         * var-tracking.c (dataflow_set_clear_at_call): Take call_insn
>         and get its reg_usage for reg invalidation.
>         (compute_bb_dataflow): Pass it insn.
>         (emit_notes_in_bb): Likewise.
>
> for  gcc/testsuite/ChangeLog
>
>         * gcc.dg/guality/pr54200.c: Add -fno-tree-coalesce-vars.
>         * gcc.dg/ssp-1.c: Make counter a register.
>         * gcc.dg/ssp-2.c: Likewise.
>         * gcc.dg/torture/parm-coalesce.c: New.
> ---
>  gcc/Makefile.in                              |    1
>  gcc/alias.c                                  |   13 +
>  gcc/cfgexpand.c                              |  471 +++++++++++++++++++-------
>  gcc/cfgexpand.h                              |    3
>  gcc/common.opt                               |   12 -
>  gcc/doc/invoke.texi                          |   48 +--
>  gcc/emit-rtl.c                               |    8
>  gcc/explow.c                                 |   29 ++
>  gcc/explow.h                                 |    3
>  gcc/expr.c                                   |   41 +-
>  gcc/expr.h                                   |    1
>  gcc/function.c                               |  341 +++++++++++++++----
>  gcc/gimple-expr.c                            |   39 --
>  gcc/gimple-expr.h                            |    1
>  gcc/opts.c                                   |    2
>  gcc/passes.def                               |    5
>  gcc/stmt.c                                   |    2
>  gcc/stor-layout.c                            |    3
>  gcc/testsuite/gcc.dg/guality/pr54200.c       |    2
>  gcc/testsuite/gcc.dg/ssp-1.c                 |    2
>  gcc/testsuite/gcc.dg/ssp-2.c                 |    2
>  gcc/testsuite/gcc.dg/torture/parm-coalesce.c |   40 ++
>  gcc/tree-outof-ssa.c                         |   16 -
>  gcc/tree-ssa-coalesce.c                      |  384 +++++++++++++++++++++
>  gcc/tree-ssa-coalesce.h                      |    1
>  gcc/tree-ssa-copyrename.c                    |  475 --------------------------
>  gcc/tree-ssa-live.c                          |   99 -----
>  gcc/tree-ssa-live.h                          |    4
>  gcc/tree-ssa-uncprop.c                       |    5
>  gcc/var-tracking.c                           |   12 -
>  30 files changed, 1187 insertions(+), 878 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/torture/parm-coalesce.c
>  delete mode 100644 gcc/tree-ssa-copyrename.c
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index c1cb4ce..e298ecc 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1447,7 +1447,6 @@ OBJS = \
>         tree-ssa-ccp.o \
>         tree-ssa-coalesce.o \
>         tree-ssa-copy.o \
> -       tree-ssa-copyrename.o \
>         tree-ssa-dce.o \
>         tree-ssa-dom.o \
>         tree-ssa-dse.o \
> diff --git a/gcc/alias.c b/gcc/alias.c
> index fa7d5d8..4681e3f 100644
> --- a/gcc/alias.c
> +++ b/gcc/alias.c
> @@ -2507,6 +2507,19 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
>    if (! DECL_P (exprx) || ! DECL_P (expry))
>      return 0;
>
> +  /* If we refer to different gimple registers, or one gimple register
> +     and one non-gimple-register, we know they can't overlap.  First,
> +     gimple registers don't have their addresses taken.  Now, there
> +     could be more than one stack slot for (different versions of) the
> +     same gimple register, but we can presumably tell they don't
> +     overlap based on offsets from stack base addresses elsewhere.
> +     It's important that we don't proceed to DECL_RTL, because gimple
> +     registers may not pass DECL_RTL_SET_P, and make_decl_rtl won't be
> +     able to do anything about them since no SSA information will have
> +     remained to guide it.  */
> +  if (is_gimple_reg (exprx) || is_gimple_reg (expry))
> +    return exprx != expry;
> +
>    /* With invalid code we can end up storing into the constant pool.
>       Bail out to avoid ICEing when creating RTL for this.
>       See gfortran.dg/lto/20091028-2_0.f90.  */
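The early exit added to nonoverlapping_memrefs_p above can be modelled in isolation. The following is only a minimal sketch, not GCC code: struct toy_decl, its gimple_reg flag and toy_gimple_reg_check are hypothetical stand-ins for a decl tree, is_gimple_reg and the new test. A third "undecided" result models falling through to the existing DECL_RTL-based checks.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a decl; NOT GCC's tree type.  */
struct toy_decl
{
  const char *name;
  bool gimple_reg;   /* models is_gimple_reg (decl) */
};

enum toy_answer
{
  TOY_MAY_OVERLAP,   /* the patch returns 0 */
  TOY_NO_OVERLAP,    /* the patch returns 1 */
  TOY_UNDECIDED      /* fall through to the later checks */
};

/* If either reference is a gimple register, distinct decls cannot
   overlap and identical decls may; DECL_RTL is never consulted.  */
enum toy_answer
toy_gimple_reg_check (const struct toy_decl *x, const struct toy_decl *y)
{
  if (x->gimple_reg || y->gimple_reg)
    return x != y ? TOY_NO_OVERLAP : TOY_MAY_OVERLAP;

  return TOY_UNDECIDED;
}
```

Note that one gimple register paired with one addressable variable is already enough to answer "no overlap"; only two non-registers reach the later checks.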
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index 7df9d06..0bc20f6 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -97,6 +97,8 @@ gimple currently_expanding_gimple_stmt;
>
>  static rtx expand_debug_expr (tree);
>
> +static bool defer_stack_allocation (tree, bool);
> +
>  /* Return an expression tree corresponding to the RHS of GIMPLE
>     statement STMT.  */
>
> @@ -150,21 +152,149 @@ gimple_assign_rhs_to_tree (gimple stmt)
>
>  #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
>
> +/* Choose either CUR or NEXT as the leader DECL for a partition.
> +   Prefer ignored decls, to simplify debug dumps and reduce ambiguity
> +   out of the same user variable being in multiple partitions (this is
> +   less likely for compiler-introduced temps).  */
> +
> +static tree
> +leader_merge (tree cur, tree next)
> +{
> +  if (cur == NULL || cur == next)
> +    return next;
> +
> +  if (DECL_P (cur) && DECL_IGNORED_P (cur))
> +    return cur;
> +
> +  if (DECL_P (next) && DECL_IGNORED_P (next))
> +    return next;
> +
> +  return cur;
> +}
> +
> +/* Return true if VAR is a PARM_DECL or a RESULT_DECL of type BLKmode.
> +   Such parameters are likely passed as a pointer to the value, rather
> +   than as a value, and so we must not coalesce them, nor allocate
> +   stack space for them before determining the calling conventions for
> +   them.  For their SSA_NAMEs, expand_one_ssa_partition emits RTL as
> +   MEMs with pc_rtx as the address, and then it replaces the pc_rtx
> +   with NULL so as to make sure the MEM is not used before it is
> +   adjusted in assign_parm_setup_reg.  */
> +
> +bool
> +parm_maybe_byref_p (tree var)
> +{
> +  if (!var || VAR_P (var))
> +    return false;
> +
> +  gcc_assert (TREE_CODE (var) == PARM_DECL
> +             || TREE_CODE (var) == RESULT_DECL);
> +
> +  return TYPE_MODE (TREE_TYPE (var)) == BLKmode;
> +}
> +
> +/* Return the partition of the default SSA_DEF for decl VAR.  */
> +
> +static int
> +ssa_default_def_partition (tree var)
> +{
> +  tree name = ssa_default_def (cfun, var);
> +
> +  if (!name)
> +    return NO_PARTITION;
> +
> +  return var_to_partition (SA.map, name);
> +}
> +
> +/* Return the RTL for the default SSA def of a PARM or RESULT, if
> +   there is one.  */
> +
> +rtx
> +get_rtl_for_parm_ssa_default_def (tree var)
> +{
> +  gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL);
> +
> +  if (!is_gimple_reg (var))
> +    return NULL_RTX;
> +
> +  /* If we've already determined RTL for the decl, use it.  This is
> +     not just an optimization: if VAR is a PARM whose incoming value
> +     is unused, we won't find a default def to use its partition, but
> +     we still want to use the location of the parm, if it was used at
> +     all.  During assign_parms, until a location is assigned for the
> +     VAR, RTL can only be set for a parm or result if we're not
> +     coalescing across variables, when we know we're coalescing all
> +     SSA_NAMEs of each parm or result, and we're not coalescing them
> +     with names pertaining to other variables, such as other parms'
> +     default defs.  */
> +  if (DECL_RTL_SET_P (var))
> +    {
> +      gcc_assert (DECL_RTL (var) != pc_rtx);
> +      return DECL_RTL (var);
> +    }
> +
> +  int part = ssa_default_def_partition (var);
> +  if (part == NO_PARTITION)
> +    return NULL_RTX;
> +
> +  return SA.partition_to_pseudo[part];
> +}
> +
>  /* Associate declaration T with storage space X.  If T is no
>     SSA name this is exactly SET_DECL_RTL, otherwise make the
>     partition of T associated with X.  */
>  static inline void
>  set_rtl (tree t, rtx x)
>  {
> +  if (x && SSAVAR (t))
> +    {
> +      bool skip = false;
> +      tree cur = NULL_TREE;
> +
> +      if (MEM_P (x))
> +       cur = MEM_EXPR (x);
> +      else if (REG_P (x))
> +       cur = REG_EXPR (x);
> +      else if (GET_CODE (x) == CONCAT
> +              && REG_P (XEXP (x, 0)))
> +       cur = REG_EXPR (XEXP (x, 0));
> +      else if (GET_CODE (x) == PARALLEL)
> +       cur = REG_EXPR (XVECEXP (x, 0, 0));
> +      else if (x == pc_rtx)
> +       skip = true;
> +      else
> +       gcc_unreachable ();
> +
> +      tree next = skip ? cur : leader_merge (cur, SSAVAR (t));
> +
> +      if (cur != next)
> +       {
> +         if (MEM_P (x))
> +           set_mem_attributes (x, next, true);
> +         else
> +           set_reg_attrs_for_decl_rtl (next, x);
> +       }
> +    }
> +
>    if (TREE_CODE (t) == SSA_NAME)
>      {
> -      SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x;
> -      if (x && !MEM_P (x))
> -       set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x);
> -      /* For the benefit of debug information at -O0 (where vartracking
> -         doesn't run) record the place also in the base DECL if it's
> -        a normal variable (not a parameter).  */
> -      if (x && x != pc_rtx && TREE_CODE (SSA_NAME_VAR (t)) == VAR_DECL)
> +      int part = var_to_partition (SA.map, t);
> +      if (part != NO_PARTITION)
> +       {
> +         if (SA.partition_to_pseudo[part])
> +           gcc_assert (SA.partition_to_pseudo[part] == x);
> +         else if (x != pc_rtx)
> +           SA.partition_to_pseudo[part] = x;
> +       }
> +      /* For the benefit of debug information at -O0 (where
> +         vartracking doesn't run) record the place also in the base
> +         DECL.  For PARMs and RESULTs, we may end up resetting these
> +         in function.c:maybe_reset_rtl_for_parm, but in some rare
> +         cases we may need them (unused and overwritten incoming
> +         value, that at -O0 must share the location with the other
> +         uses in spite of the missing default def), and this may be
> +         the only chance to preserve them.  */
> +      if (x && x != pc_rtx && SSA_NAME_VAR (t))
>         {
>           tree var = SSA_NAME_VAR (t);
>           /* If we don't yet have something recorded, just record it now.  */
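The preference order in leader_merge from this hunk can be shown with a small standalone model. This is a sketch under stated assumptions, not the patch's code: struct toy_var and its ignored flag are hypothetical and stand in for a decl and DECL_IGNORED_P, and a null pointer models "no leader recorded yet".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a decl; NOT GCC's tree type.  */
struct toy_var
{
  const char *name;
  bool ignored;   /* models DECL_IGNORED_P (decl) */
};

/* Pick the leader for a partition: take NEXT when there is no current
   leader, otherwise prefer whichever of the two is an ignored
   (compiler-introduced) variable, keeping CUR on a tie.  */
const struct toy_var *
toy_leader_merge (const struct toy_var *cur, const struct toy_var *next)
{
  if (cur == NULL || cur == next)
    return next;

  if (cur->ignored)
    return cur;

  if (next != NULL && next->ignored)
    return next;

  return cur;
}
```

With a user variable u and a temporary t, merging in either order yields t, so the attributes recorded on the partition's RTL do not depend on which name reaches set_rtl first.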
> @@ -248,8 +378,15 @@ static bool has_short_buffer;
>  static unsigned int
>  align_local_variable (tree decl)
>  {
> -  unsigned int align = LOCAL_DECL_ALIGNMENT (decl);
> -  DECL_ALIGN (decl) = align;
> +  unsigned int align;
> +
> +  if (TREE_CODE (decl) == SSA_NAME)
> +    align = TYPE_ALIGN (TREE_TYPE (decl));
> +  else
> +    {
> +      align = LOCAL_DECL_ALIGNMENT (decl);
> +      DECL_ALIGN (decl) = align;
> +    }
>    return align / BITS_PER_UNIT;
>  }
>
> @@ -315,12 +452,15 @@ add_stack_var (tree decl)
>    decl_to_stack_part->put (decl, stack_vars_num);
>
>    v->decl = decl;
> -  v->size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (decl)));
> +  tree size = TREE_CODE (decl) == SSA_NAME
> +    ? TYPE_SIZE_UNIT (TREE_TYPE (decl))
> +    : DECL_SIZE_UNIT (decl);
> +  v->size = tree_to_uhwi (size);
>    /* Ensure that all variables have size, so that &a != &b for any two
>       variables that are simultaneously live.  */
>    if (v->size == 0)
>      v->size = 1;
> -  v->alignb = align_local_variable (SSAVAR (decl));
> +  v->alignb = align_local_variable (decl);
>    /* An alignment of zero can mightily confuse us later.  */
>    gcc_assert (v->alignb != 0);
>
> @@ -862,7 +1002,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
>    gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
>
>    x = plus_constant (Pmode, base, offset);
> -  x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x);
> +  x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
> +                  ? TYPE_MODE (TREE_TYPE (decl))
> +                  : DECL_MODE (SSAVAR (decl)), x);
>
>    if (TREE_CODE (decl) != SSA_NAME)
>      {
> @@ -884,7 +1026,6 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
>        DECL_USER_ALIGN (decl) = 0;
>      }
>
> -  set_mem_attributes (x, SSAVAR (decl), true);
>    set_rtl (decl, x);
>  }
>
> @@ -950,9 +1091,9 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
>           /* Skip variables that have already had rtl assigned.  See also
>              add_stack_var where we perpetrate this pc_rtx hack.  */
>           decl = stack_vars[i].decl;
> -         if ((TREE_CODE (decl) == SSA_NAME
> -             ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)]
> -             : DECL_RTL (decl)) != pc_rtx)
> +         if (TREE_CODE (decl) == SSA_NAME
> +             ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)] != NULL_RTX
> +             : DECL_RTL (decl) != pc_rtx)
>             continue;
>
>           large_size += alignb - 1;
> @@ -981,9 +1122,9 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
>        /* Skip variables that have already had rtl assigned.  See also
>          add_stack_var where we perpetrate this pc_rtx hack.  */
>        decl = stack_vars[i].decl;
> -      if ((TREE_CODE (decl) == SSA_NAME
> -          ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)]
> -          : DECL_RTL (decl)) != pc_rtx)
> +      if (TREE_CODE (decl) == SSA_NAME
> +         ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)] != NULL_RTX
> +         : DECL_RTL (decl) != pc_rtx)
>         continue;
>
>        /* Check the predicate to see whether this variable should be
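Both expand_stack_vars hunks above change which value marks "allocation still deferred": decls keep using pc_rtx, while SSA partitions now use a null entry in SA.partition_to_pseudo, since set_rtl no longer stores pc_rtx there. A minimal sketch of that predicate, with hypothetical names (toy_stack_var, TOY_PC_RTX, toy_already_assigned) that are not part of GCC:

```c
#include <stdbool.h>
#include <stddef.h>

enum toy_kind { TOY_SSA_NAME, TOY_DECL };

/* Hypothetical model of a stack_vars entry plus its recorded RTL.  */
struct toy_stack_var
{
  enum toy_kind kind;
  const void *rtl;   /* partition_to_pseudo[] entry or DECL_RTL */
};

/* Unique sentinel address standing in for pc_rtx.  */
const char toy_pc_rtx_marker = 0;
const void *const TOY_PC_RTX = &toy_pc_rtx_marker;

/* True when the variable must be skipped because RTL was already
   assigned: SSA names are pending while their slot is NULL, decls
   are pending while DECL_RTL is the pc_rtx marker.  */
bool
toy_already_assigned (const struct toy_stack_var *v)
{
  return v->kind == TOY_SSA_NAME
	 ? v->rtl != NULL
	 : v->rtl != TOY_PC_RTX;
}
```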
> @@ -1099,13 +1240,22 @@ account_stack_vars (void)
>     to a variable to be allocated in the stack frame.  */
>
>  static void
> -expand_one_stack_var (tree var)
> +expand_one_stack_var_1 (tree var)
>  {
>    HOST_WIDE_INT size, offset;
>    unsigned byte_align;
>
> -  size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
> -  byte_align = align_local_variable (SSAVAR (var));
> +  if (TREE_CODE (var) == SSA_NAME)
> +    {
> +      tree type = TREE_TYPE (var);
> +      size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
> +      byte_align = TYPE_ALIGN_UNIT (type);
> +    }
> +  else
> +    {
> +      size = tree_to_uhwi (DECL_SIZE_UNIT (var));
> +      byte_align = align_local_variable (var);
> +    }
>
>    /* We handle highly aligned variables in expand_stack_vars.  */
>    gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
> @@ -1116,6 +1266,27 @@ expand_one_stack_var (tree var)
>                            crtl->max_used_stack_slot_alignment, offset);
>  }
>
> +/* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are
> +   already assigned some MEM.  */
> +
> +static void
> +expand_one_stack_var (tree var)
> +{
> +  if (TREE_CODE (var) == SSA_NAME)
> +    {
> +      int part = var_to_partition (SA.map, var);
> +      if (part != NO_PARTITION)
> +       {
> +         rtx x = SA.partition_to_pseudo[part];
> +         gcc_assert (x);
> +         gcc_assert (MEM_P (x));
> +         return;
> +       }
> +    }
> +
> +  return expand_one_stack_var_1 (var);
> +}
> +
>  /* A subroutine of expand_one_var.  Called to assign rtl to a VAR_DECL
>     that will reside in a hard register.  */
>
> @@ -1125,13 +1296,136 @@ expand_one_hard_reg_var (tree var)
>    rest_of_decl_compilation (var, 0, 0);
>  }
>
> +/* Record the alignment requirements of some variable assigned to a
> +   pseudo.  */
> +
> +static void
> +record_alignment_for_reg_var (unsigned int align)
> +{
> +  if (SUPPORTS_STACK_ALIGNMENT
> +      && crtl->stack_alignment_estimated < align)
> +    {
> +      /* stack_alignment_estimated shouldn't change after stack
> +         realign decision made */
> +      gcc_assert (!crtl->stack_realign_processed);
> +      crtl->stack_alignment_estimated = align;
> +    }
> +
> +  /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
> +     So here we only make sure stack_alignment_needed >= align.  */
> +  if (crtl->stack_alignment_needed < align)
> +    crtl->stack_alignment_needed = align;
> +  if (crtl->max_used_stack_slot_alignment < align)
> +    crtl->max_used_stack_slot_alignment = align;
> +}
> +
> +/* Create RTL for an SSA partition.  */
> +
> +static void
> +expand_one_ssa_partition (tree var)
> +{
> +  int part = var_to_partition (SA.map, var);
> +  gcc_assert (part != NO_PARTITION);
> +
> +  if (SA.partition_to_pseudo[part])
> +    return;
> +
> +  unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (var),
> +                                         TYPE_MODE (TREE_TYPE (var)),
> +                                         TYPE_ALIGN (TREE_TYPE (var)));
> +
> +  /* If the variable alignment is very large we'll dynamically allocate
> +     it, which means that in-frame portion is just a pointer.  */
> +  if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
> +    align = POINTER_SIZE;
> +
> +  record_alignment_for_reg_var (align);
> +
> +  if (!use_register_for_decl (var))
> +    {
> +      if (parm_maybe_byref_p (SSA_NAME_VAR (var))
> +         && ssa_default_def_partition (SSA_NAME_VAR (var)) == part)
> +       {
> +         expand_one_stack_var_at (var, pc_rtx, 0, 0);
> +         rtx x = SA.partition_to_pseudo[part];
> +         gcc_assert (GET_CODE (x) == MEM);
> +         gcc_assert (GET_MODE (x) == BLKmode);
> +         gcc_assert (XEXP (x, 0) == pc_rtx);
> +         /* Reset the address, so that any attempt to use it will
> +            ICE.  It will be adjusted in assign_parm_setup_reg.  */
> +         XEXP (x, 0) = NULL_RTX;
> +       }
> +      else if (defer_stack_allocation (var, true))
> +       add_stack_var (var);
> +      else
> +       expand_one_stack_var_1 (var);
> +      return;
> +    }
> +
> +  machine_mode reg_mode = promote_ssa_mode (var, NULL);
> +
> +  rtx x = gen_reg_rtx (reg_mode);
> +
> +  set_rtl (var, x);
> +}
> +
> +/* Record the association between the RTL generated for a partition
> +   and the underlying variable of the SSA_NAME.  */
> +
> +static void
> +adjust_one_expanded_partition_var (tree var)
> +{
> +  if (!var)
> +    return;
> +
> +  tree decl = SSA_NAME_VAR (var);
> +
> +  int part = var_to_partition (SA.map, var);
> +  if (part == NO_PARTITION)
> +    return;
> +
> +  rtx x = SA.partition_to_pseudo[part];
> +
> +  if (!x)
> +    {
> +      /* This var will get a stack slot later.  */
> +      gcc_assert (defer_stack_allocation (var, true));
> +      return;
> +    }
> +
> +  set_rtl (var, x);
> +
> +  if (!REG_P (x))
> +    return;
> +
> +  /* Note if the object is a user variable.  */
> +  if (decl && !DECL_ARTIFICIAL (decl))
> +    mark_user_reg (x);
> +
> +  if (POINTER_TYPE_P (decl ? TREE_TYPE (decl) : TREE_TYPE (var)))
> +    mark_reg_pointer (x, get_pointer_alignment (var));
> +}
> +
>  /* A subroutine of expand_one_var.  Called to assign rtl to a VAR_DECL
>     that will reside in a pseudo register.  */
>
>  static void
>  expand_one_register_var (tree var)
>  {
> -  tree decl = SSAVAR (var);
> +  if (TREE_CODE (var) == SSA_NAME)
> +    {
> +      int part = var_to_partition (SA.map, var);
> +      if (part != NO_PARTITION)
> +       {
> +         rtx x = SA.partition_to_pseudo[part];
> +         gcc_assert (x);
> +         gcc_assert (REG_P (x));
> +         return;
> +       }
> +      gcc_unreachable ();
> +    }
> +
> +  tree decl = var;
>    tree type = TREE_TYPE (decl);
>    machine_mode reg_mode = promote_decl_mode (decl, NULL);
>    rtx x = gen_reg_rtx (reg_mode);
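The order of decisions in expand_one_ssa_partition from this hunk can be summarised as a small classifier. This is only a sketch of the control flow as it reads in the hunk: the boolean fields are hypothetical inputs standing in for the results of the real queries (SA.partition_to_pseudo[part], use_register_for_decl, parm_maybe_byref_p combined with ssa_default_def_partition, and defer_stack_allocation), and no RTL is created.

```c
#include <stdbool.h>

enum toy_placement
{
  TOY_KEEP_EXISTING,       /* partition already has RTL: return early */
  TOY_BYREF_PLACEHOLDER,   /* BLKmode MEM whose address is reset to NULL */
  TOY_DEFERRED_STACK,      /* add_stack_var */
  TOY_IMMEDIATE_STACK,     /* expand_one_stack_var_1 */
  TOY_PSEUDO               /* gen_reg_rtx in the promoted mode */
};

/* Hypothetical summary of the queries made for one partition.  */
struct toy_partition
{
  bool has_rtl;             /* partition_to_pseudo entry already set */
  bool use_register;        /* models use_register_for_decl (var) */
  bool byref_default_def;   /* maybe-byref parm/result, and this is the
			       partition of its default def */
  bool defer_stack;         /* models defer_stack_allocation (var, true) */
};

enum toy_placement
toy_classify_partition (const struct toy_partition *p)
{
  if (p->has_rtl)
    return TOY_KEEP_EXISTING;

  if (!p->use_register)
    {
      if (p->byref_default_def)
	return TOY_BYREF_PLACEHOLDER;
      if (p->defer_stack)
	return TOY_DEFERRED_STACK;
      return TOY_IMMEDIATE_STACK;
    }

  return TOY_PSEUDO;
}
```

The by-reference check comes before the deferral check, so a maybe-byref parm's default def never gets a stack slot from cfgexpand, even when stack allocation would otherwise be deferred.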
> @@ -1177,10 +1471,14 @@ expand_one_error_var (tree var)
>  static bool
>  defer_stack_allocation (tree var, bool toplevel)
>  {
> +  tree size_unit = TREE_CODE (var) == SSA_NAME
> +    ? TYPE_SIZE_UNIT (TREE_TYPE (var))
> +    : DECL_SIZE_UNIT (var);
> +
>    /* Whether the variable is small enough for immediate allocation not to be
>       a problem with regard to the frame size.  */
>    bool smallish
> -    = ((HOST_WIDE_INT) tree_to_uhwi (DECL_SIZE_UNIT (var))
> +    = ((HOST_WIDE_INT) tree_to_uhwi (size_unit)
>         < PARAM_VALUE (PARAM_MIN_SIZE_FOR_STACK_SHARING));
>
>    /* If stack protection is enabled, *all* stack variables must be deferred,
> @@ -1189,16 +1487,24 @@ defer_stack_allocation (tree var, bool toplevel)
>    if (flag_stack_protect || ((flag_sanitize & SANITIZE_ADDRESS) && ASAN_STACK))
>      return true;
>
> +  unsigned int align = TREE_CODE (var) == SSA_NAME
> +    ? TYPE_ALIGN (TREE_TYPE (var))
> +    : DECL_ALIGN (var);
> +
>    /* We handle "large" alignment via dynamic allocation.  We want to handle
>       this extra complication in only one place, so defer them.  */
> -  if (DECL_ALIGN (var) > MAX_SUPPORTED_STACK_ALIGNMENT)
> +  if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
>      return true;
>
> +  bool ignored = TREE_CODE (var) == SSA_NAME
> +    ? !SSAVAR (var) || DECL_IGNORED_P (SSA_NAME_VAR (var))
> +    : DECL_IGNORED_P (var);
> +
>    /* When optimization is enabled, DECL_IGNORED_P variables originally scoped
>       might be detached from their block and appear at toplevel when we reach
>       here.  We want to coalesce them with variables from other blocks when
>       the immediate contribution to the frame size would be noticeable.  */
> -  if (toplevel && optimize > 0 && DECL_IGNORED_P (var) && !smallish)
> +  if (toplevel && optimize > 0 && ignored && !smallish)
>      return true;
>
>    /* Variables declared in the outermost scope automatically conflict
> @@ -1265,21 +1571,7 @@ expand_one_var (tree var, bool toplevel, bool really_expand)
>         align = POINTER_SIZE;
>      }
>
> -  if (SUPPORTS_STACK_ALIGNMENT
> -      && crtl->stack_alignment_estimated < align)
> -    {
> -      /* stack_alignment_estimated shouldn't change after stack
> -         realign decision made */
> -      gcc_assert (!crtl->stack_realign_processed);
> -      crtl->stack_alignment_estimated = align;
> -    }
> -
> -  /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
> -     So here we only make sure stack_alignment_needed >= align.  */
> -  if (crtl->stack_alignment_needed < align)
> -    crtl->stack_alignment_needed = align;
> -  if (crtl->max_used_stack_slot_alignment < align)
> -    crtl->max_used_stack_slot_alignment = align;
> +  record_alignment_for_reg_var (align);
>
>    if (TREE_CODE (origvar) == SSA_NAME)
>      {
> @@ -1722,48 +2014,18 @@ expand_used_vars (void)
>    if (targetm.use_pseudo_pic_reg ())
>      pic_offset_table_rtx = gen_reg_rtx (Pmode);
>
> -  hash_map<tree, tree> ssa_name_decls;
>    for (i = 0; i < SA.map->num_partitions; i++)
>      {
>        tree var = partition_to_var (SA.map, i);
>
>        gcc_assert (!virtual_operand_p (var));
>
> -      /* Assign decls to each SSA name partition, share decls for partitions
> -         we could have coalesced (those with the same type).  */
> -      if (SSA_NAME_VAR (var) == NULL_TREE)
> -       {
> -         tree *slot = &ssa_name_decls.get_or_insert (TREE_TYPE (var));
> -         if (!*slot)
> -           *slot = create_tmp_reg (TREE_TYPE (var));
> -         replace_ssa_name_symbol (var, *slot);
> -       }
> -
> -      /* Always allocate space for partitions based on VAR_DECLs.  But for
> -        those based on PARM_DECLs or RESULT_DECLs and which matter for the
> -        debug info, there is no need to do so if optimization is disabled
> -        because all the SSA_NAMEs based on these DECLs have been coalesced
> -        into a single partition, which is thus assigned the canonical RTL
> -        location of the DECLs.  If in_lto_p, we can't rely on optimize,
> -        a function could be compiled with -O1 -flto first and only the
> -        link performed at -O0.  */
> -      if (TREE_CODE (SSA_NAME_VAR (var)) == VAR_DECL)
> -       expand_one_var (var, true, true);
> -      else if (DECL_IGNORED_P (SSA_NAME_VAR (var)) || optimize || in_lto_p)
> -       {
> -         /* This is a PARM_DECL or RESULT_DECL.  For those partitions that
> -            contain the default def (representing the parm or result itself)
> -            we don't do anything here.  But those which don't contain the
> -            default def (representing a temporary based on the parm/result)
> -            we need to allocate space just like for normal VAR_DECLs.  */
> -         if (!bitmap_bit_p (SA.partition_has_default_def, i))
> -           {
> -             expand_one_var (var, true, true);
> -             gcc_assert (SA.partition_to_pseudo[i]);
> -           }
> -       }
> +      expand_one_ssa_partition (var);
>      }
>
> +  for (i = 1; i < num_ssa_names; i++)
> +    adjust_one_expanded_partition_var (ssa_name (i));
> +
>    if (flag_stack_protect == SPCT_FLAG_STRONG)
>        gen_stack_protect_signal
>         = stack_protect_decl_p () || stack_protect_return_slot_p ();
> @@ -5928,35 +6190,6 @@ pass_expand::execute (function *fun)
>        parm_birth_insn = var_seq;
>      }
>
> -  /* Now that we also have the parameter RTXs, copy them over to our
> -     partitions.  */
> -  for (i = 0; i < SA.map->num_partitions; i++)
> -    {
> -      tree var = SSA_NAME_VAR (partition_to_var (SA.map, i));
> -
> -      if (TREE_CODE (var) != VAR_DECL
> -         && !SA.partition_to_pseudo[i])
> -       SA.partition_to_pseudo[i] = DECL_RTL_IF_SET (var);
> -      gcc_assert (SA.partition_to_pseudo[i]);
> -
> -      /* If this decl was marked as living in multiple places, reset
> -        this now to NULL.  */
> -      if (DECL_RTL_IF_SET (var) == pc_rtx)
> -       SET_DECL_RTL (var, NULL);
> -
> -      /* Some RTL parts really want to look at DECL_RTL(x) when x
> -        was a decl marked in REG_ATTR or MEM_ATTR.  We could use
> -        SET_DECL_RTL here making this available, but that would mean
> -        to select one of the potentially many RTLs for one DECL.  Instead
> -        of doing that we simply reset the MEM_EXPR of the RTL in question,
> -        then nobody can get at it and hence nobody can call DECL_RTL on it.  */
> -      if (!DECL_RTL_SET_P (var))
> -       {
> -         if (MEM_P (SA.partition_to_pseudo[i]))
> -           set_mem_expr (SA.partition_to_pseudo[i], NULL);
> -       }
> -    }
> -
>    /* If we have a class containing differently aligned pointers
>       we need to merge those into the corresponding RTL pointer
>       alignment.  */
> @@ -5964,7 +6197,6 @@ pass_expand::execute (function *fun)
>      {
>        tree name = ssa_name (i);
>        int part;
> -      rtx r;
>
>        if (!name
>           /* We might have generated new SSA names in
> @@ -5977,20 +6209,25 @@ pass_expand::execute (function *fun)
>        if (part == NO_PARTITION)
>         continue;
>
> -      /* Adjust all partition members to get the underlying decl of
> -        the representative which we might have created in expand_one_var.  */
> -      if (SSA_NAME_VAR (name) == NULL_TREE)
> +      gcc_assert (SA.partition_to_pseudo[part]
> +                 || defer_stack_allocation (name, true));
> +
> +      /* If this decl was marked as living in multiple places, reset
> +        this now to NULL.  */
> +      tree var = SSA_NAME_VAR (name);
> +      if (var && DECL_RTL_IF_SET (var) == pc_rtx)
> +       SET_DECL_RTL (var, NULL);
> +      /* Check that the pseudos chosen by assign_parms are those of
> +        the corresponding default defs.  */
> +      else if (SSA_NAME_IS_DEFAULT_DEF (name)
> +              && (TREE_CODE (var) == PARM_DECL
> +                  || TREE_CODE (var) == RESULT_DECL))
>         {
> -         tree leader = partition_to_var (SA.map, part);
> -         gcc_assert (SSA_NAME_VAR (leader) != NULL_TREE);
> -         replace_ssa_name_symbol (name, SSA_NAME_VAR (leader));
> +         rtx in = DECL_RTL_IF_SET (var);
> +         gcc_assert (in);
> +         rtx out = SA.partition_to_pseudo[part];
> +         gcc_assert (in == out || rtx_equal_p (in, out));
>         }
> -      if (!POINTER_TYPE_P (TREE_TYPE (name)))
> -       continue;
> -
> -      r = SA.partition_to_pseudo[part];
> -      if (REG_P (r))
> -       mark_reg_pointer (r, get_pointer_alignment (name));
>      }
>
>    /* If this function is `main', emit a call to `__main'
> diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
> index a0b6e3e..987cf356 100644
> --- a/gcc/cfgexpand.h
> +++ b/gcc/cfgexpand.h
> @@ -22,5 +22,8 @@ along with GCC; see the file COPYING3.  If not see
>
>  extern tree gimple_assign_rhs_to_tree (gimple);
>  extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
> +extern bool parm_maybe_byref_p (tree);
> +extern rtx get_rtl_for_parm_ssa_default_def (tree var);
> +
>
>  #endif /* GCC_CFGEXPAND_H */
> diff --git a/gcc/common.opt b/gcc/common.opt
> index e80eadf..dd59ff3 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2234,16 +2234,16 @@ Common Report Var(flag_tree_ch) Optimization
>  Enable loop header copying on trees
>
>  ftree-coalesce-inlined-vars
> -Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization
> -Enable coalescing of copy-related user variables that are inlined
> +Common Ignore RejectNegative
> +Does nothing.  Preserved for backward compatibility.
>
>  ftree-coalesce-vars
> -Common Report Var(flag_ssa_coalesce_vars,2) Optimization
> -Enable coalescing of all copy-related user variables
> +Common Report Var(flag_tree_coalesce_vars) Optimization
> +Enable SSA coalescing of user variables
>
>  ftree-copyrename
> -Common Report Var(flag_tree_copyrename) Optimization
> -Replace SSA temporaries with better names in copies
> +Common Ignore
> +Does nothing.  Preserved for backward compatibility.
>
>  ftree-copy-prop
>  Common Report Var(flag_tree_copy_prop) Optimization
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 2871337..27be317 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -342,7 +342,6 @@ Objective-C and Objective-C++ Dialects}.
>  -fdump-tree-phiprop@r{[}-@var{n}@r{]} @gol
>  -fdump-tree-phiopt@r{[}-@var{n}@r{]} @gol
>  -fdump-tree-forwprop@r{[}-@var{n}@r{]} @gol
> --fdump-tree-copyrename@r{[}-@var{n}@r{]} @gol
>  -fdump-tree-nrv -fdump-tree-vect @gol
>  -fdump-tree-sink @gol
>  -fdump-tree-sra@r{[}-@var{n}@r{]} @gol
> @@ -448,9 +447,8 @@ Objective-C and Objective-C++ Dialects}.
>  -fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol
>  -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
>  -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
> --ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
> --ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
> --ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
> +-ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol
> +-ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
>  -ftree-loop-if-convert-stores -ftree-loop-im @gol
>  -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
>  -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
> @@ -7133,11 +7131,6 @@ name is made by appending @file{.phiopt} to the source file name.
>  Dump each function after forward propagating single use variables.  The file
>  name is made by appending @file{.forwprop} to the source file name.
>
> -@item copyrename
> -@opindex fdump-tree-copyrename
> -Dump each function after applying the copy rename optimization.  The file
> -name is made by appending @file{.copyrename} to the source file name.
> -
>  @item nrv
>  @opindex fdump-tree-nrv
>  Dump each function after applying the named return value optimization on
> @@ -7602,8 +7595,8 @@ compilation time.
>  -ftree-ccp @gol
>  -fssa-phiopt @gol
>  -ftree-ch @gol
> +-ftree-coalesce-vars @gol
>  -ftree-copy-prop @gol
> --ftree-copyrename @gol
>  -ftree-dce @gol
>  -ftree-dominator-opts @gol
>  -ftree-dse @gol
> @@ -8867,6 +8860,15 @@ be parallelized.  Parallelize all the loops that can be analyzed to
>  not contain loop carried dependences without checking that it is
>  profitable to parallelize the loops.
>
> +@item -ftree-coalesce-vars
> +@opindex ftree-coalesce-vars
> +Tell the compiler to attempt to combine small user-defined variables
> +too, instead of just compiler temporaries.  This may severely limit the
> +ability to debug an optimized program compiled with
> +@option{-fno-var-tracking-assignments}.  In the negated form, this flag
> +prevents SSA coalescing of user variables.  This option is enabled by
> +default if optimization is enabled.
> +
>  @item -ftree-loop-if-convert
>  @opindex ftree-loop-if-convert
>  Attempt to transform conditional jumps in the innermost loops to
> @@ -8980,32 +8982,6 @@ Perform scalar replacement of aggregates.  This pass replaces structure
>  references with scalars to prevent committing structures to memory too
>  early.  This flag is enabled by default at @option{-O} and higher.
>
> -@item -ftree-copyrename
> -@opindex ftree-copyrename
> -Perform copy renaming on trees.  This pass attempts to rename compiler
> -temporaries to other variables at copy locations, usually resulting in
> -variable names which more closely resemble the original variables.  This flag
> -is enabled by default at @option{-O} and higher.
> -
> -@item -ftree-coalesce-inlined-vars
> -@opindex ftree-coalesce-inlined-vars
> -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
> -combine small user-defined variables too, but only if they are inlined
> -from other functions.  It is a more limited form of
> -@option{-ftree-coalesce-vars}.  This may harm debug information of such
> -inlined variables, but it keeps variables of the inlined-into
> -function apart from each other, such that they are more likely to
> -contain the expected values in a debugging session.
> -
> -@item -ftree-coalesce-vars
> -@opindex ftree-coalesce-vars
> -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
> -combine small user-defined variables too, instead of just compiler
> -temporaries.  This may severely limit the ability to debug an optimized
> -program compiled with @option{-fno-var-tracking-assignments}.  In the
> -negated form, this flag prevents SSA coalescing of user variables,
> -including inlined ones.  This option is enabled by default.
> -
>  @item -ftree-ter
>  @opindex ftree-ter
>  Perform temporary expression replacement during the SSA->normal phase.  Single
> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
> index d211e6b0..a6ef154 100644
> --- a/gcc/emit-rtl.c
> +++ b/gcc/emit-rtl.c
> @@ -63,6 +63,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "target.h"
>  #include "builtins.h"
>  #include "rtl-iter.h"
> +#include "stor-layout.h"
>
>  struct target_rtl default_target_rtl;
>  #if SWITCHABLE_TARGET
> @@ -1233,6 +1234,9 @@ set_reg_attrs_for_parm (rtx parm_rtx, rtx mem)
>  void
>  set_reg_attrs_for_decl_rtl (tree t, rtx x)
>  {
> +  if (!t)
> +    return;
> +  tree tdecl = t;
>    if (GET_CODE (x) == SUBREG)
>      {
>        gcc_assert (subreg_lowpart_p (x));
> @@ -1241,7 +1245,9 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x)
>    if (REG_P (x))
>      REG_ATTRS (x)
>        = get_reg_attrs (t, byte_lowpart_offset (GET_MODE (x),
> -                                              DECL_MODE (t)));
> +                                              DECL_P (tdecl)
> +                                              ? DECL_MODE (tdecl)
> +                                              : TYPE_MODE (TREE_TYPE (tdecl))));
>    if (GET_CODE (x) == CONCAT)
>      {
>        if (REG_P (XEXP (x, 0)))
> diff --git a/gcc/explow.c b/gcc/explow.c
> index bd342c1..6941f4e 100644
> --- a/gcc/explow.c
> +++ b/gcc/explow.c
> @@ -842,6 +842,35 @@ promote_decl_mode (const_tree decl, int *punsignedp)
>    return pmode;
>  }
>
> +/* Return the promoted mode for name.  If it is a named SSA_NAME, it
> +   is the same as promote_decl_mode.  Otherwise, it is the promoted
> +   mode of a temp decl of same type as the SSA_NAME, if we had created
> +   one.  */
> +
> +machine_mode
> +promote_ssa_mode (const_tree name, int *punsignedp)
> +{
> +  gcc_assert (TREE_CODE (name) == SSA_NAME);
> +
> +  /* Partitions holding parms and results must be promoted as expected
> +     by function.c.  */
> +  if (SSA_NAME_VAR (name)
> +      && (TREE_CODE (SSA_NAME_VAR (name)) == PARM_DECL
> +         || TREE_CODE (SSA_NAME_VAR (name)) == RESULT_DECL))
> +    return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
> +
> +  tree type = TREE_TYPE (name);
> +  int unsignedp = TYPE_UNSIGNED (type);
> +  machine_mode mode = TYPE_MODE (type);
> +
> +  machine_mode pmode = promote_mode (type, mode, &unsignedp);
> +  if (punsignedp)
> +    *punsignedp = unsignedp;
> +
> +  return pmode;
> +}
> +
> +
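
To make the decision order in promote_ssa_mode above easier to follow, here is a minimal standalone sketch. It is not GCC code: the toy_* types, the stored promoted_mode field and the "widen anything narrower than SI" rule are all assumptions made for illustration. It only mirrors the shape of the function: names backed by a PARM_DECL or RESULT_DECL take the promotion of the decl, everything else is promoted from the type of the SSA name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for GCC's tree codes and machine modes.  */
enum toy_code { TOY_VAR_DECL, TOY_PARM_DECL, TOY_RESULT_DECL };
enum toy_mode { TOY_QI, TOY_HI, TOY_SI };

/* promoted_mode models what promote_decl_mode would return for the decl.  */
struct toy_decl { enum toy_code code; enum toy_mode promoted_mode; };

/* var is NULL for an anonymous SSA name.  */
struct toy_ssa_name { const struct toy_decl *var; enum toy_mode type_mode; };

/* Assumed promotion rule: anything narrower than SI widens to SI.  */
static enum toy_mode
toy_promote_mode (enum toy_mode mode)
{
  return mode < TOY_SI ? TOY_SI : mode;
}

/* Parms and results follow the decl; all other names follow the type.  */
static enum toy_mode
toy_promote_ssa_mode (const struct toy_ssa_name *name)
{
  assert (name != NULL);

  if (name->var
      && (name->var->code == TOY_PARM_DECL
          || name->var->code == TOY_RESULT_DECL))
    return name->var->promoted_mode;

  return toy_promote_mode (name->type_mode);
}
```

In this model a name backed by an ordinary variable and an anonymous name of the same type get the same promoted mode, which is the property the patch relies on when coalescing across base variables.
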
>
>  /* Controls the behaviour of {anti_,}adjust_stack.  */
>  static bool suppress_reg_args_size;
> diff --git a/gcc/explow.h b/gcc/explow.h
> index 94613de..52113db 100644
> --- a/gcc/explow.h
> +++ b/gcc/explow.h
> @@ -57,6 +57,9 @@ extern machine_mode promote_mode (const_tree, machine_mode, int *);
>  /* Return mode and signedness to use when object is promoted.  */
>  machine_mode promote_decl_mode (const_tree, int *);
>
> +/* Return mode and signedness to use when an SSA_NAME is promoted.  */
> +machine_mode promote_ssa_mode (const_tree, int *);
> +
>  /* Remove some bytes from the stack.  An rtx says how many.  */
>  extern void adjust_stack (rtx);
>
> diff --git a/gcc/expr.c b/gcc/expr.c
> index 31b4573..f604f52 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -3022,7 +3022,7 @@ write_complex_part (rtx cplx, rtx val, bool imag_p)
>  /* Extract one of the components of the complex value CPLX.  Extract the
>     real part if IMAG_P is false, and the imaginary part if it's true.  */
>
> -static rtx
> +rtx
>  read_complex_part (rtx cplx, bool imag_p)
>  {
>    machine_mode cmode, imode;
> @@ -9236,7 +9236,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>    rtx op0, op1, temp, decl_rtl;
>    tree type;
>    int unsignedp;
> -  machine_mode mode;
> +  machine_mode mode, dmode;
>    enum tree_code code = TREE_CODE (exp);
>    rtx subtarget, original_target;
>    int ignore;
> @@ -9367,7 +9367,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>        if (g == NULL
>           && modifier == EXPAND_INITIALIZER
>           && !SSA_NAME_IS_DEFAULT_DEF (exp)
> -         && (optimize || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
> +         && (optimize || !SSA_NAME_VAR (exp)
> +             || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
>           && stmt_is_replaceable_p (SSA_NAME_DEF_STMT (exp)))
>         g = SSA_NAME_DEF_STMT (exp);
>        if (g)
> @@ -9446,15 +9447,18 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>        /* Ensure variable marked as used even if it doesn't go through
>          a parser.  If it hasn't be used yet, write out an external
>          definition.  */
> -      TREE_USED (exp) = 1;
> +      if (exp)
> +       TREE_USED (exp) = 1;
>
>        /* Show we haven't gotten RTL for this yet.  */
>        temp = 0;
>
>        /* Variables inherited from containing functions should have
>          been lowered by this point.  */
> -      context = decl_function_context (exp);
> -      gcc_assert (SCOPE_FILE_SCOPE_P (context)
> +      if (exp)
> +       context = decl_function_context (exp);
> +      gcc_assert (!exp
> +                 || SCOPE_FILE_SCOPE_P (context)
>                   || context == current_function_decl
>                   || TREE_STATIC (exp)
>                   || DECL_EXTERNAL (exp)
> @@ -9478,7 +9482,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>           decl_rtl = use_anchored_address (decl_rtl);
>           if (modifier != EXPAND_CONST_ADDRESS
>               && modifier != EXPAND_SUM
> -             && !memory_address_addr_space_p (DECL_MODE (exp),
> +             && !memory_address_addr_space_p (exp ? DECL_MODE (exp)
> +                                              : GET_MODE (decl_rtl),
>                                                XEXP (decl_rtl, 0),
>                                                MEM_ADDR_SPACE (decl_rtl)))
>             temp = replace_equiv_address (decl_rtl,
> @@ -9489,12 +9494,17 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>          if the address is a register.  */
>        if (temp != 0)
>         {
> -         if (MEM_P (temp) && REG_P (XEXP (temp, 0)))
> +         if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0)))
>             mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp));
>
>           return temp;
>         }
>
> +      if (exp)
> +       dmode = DECL_MODE (exp);
> +      else
> +       dmode = TYPE_MODE (TREE_TYPE (ssa_name));
> +
>        /* If the mode of DECL_RTL does not match that of the decl,
>          there are two cases: we are dealing with a BLKmode value
>          that is returned in a register, or we are dealing with
> @@ -9502,22 +9512,23 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>          of the wanted mode, but mark it so that we know that it
>          was already extended.  */
>        if (REG_P (decl_rtl)
> -         && DECL_MODE (exp) != BLKmode
> -         && GET_MODE (decl_rtl) != DECL_MODE (exp))
> +         && dmode != BLKmode
> +         && GET_MODE (decl_rtl) != dmode)
>         {
>           machine_mode pmode;
>
>           /* Get the signedness to be used for this variable.  Ensure we get
>              the same mode we got when the variable was declared.  */
> -         if (code == SSA_NAME
> -             && (g = SSA_NAME_DEF_STMT (ssa_name))
> -             && gimple_code (g) == GIMPLE_CALL
> -             && !gimple_call_internal_p (g))
> +         if (code != SSA_NAME)
> +           pmode = promote_decl_mode (exp, &unsignedp);
> +         else if ((g = SSA_NAME_DEF_STMT (ssa_name))
> +                  && gimple_code (g) == GIMPLE_CALL
> +                  && !gimple_call_internal_p (g))
>             pmode = promote_function_mode (type, mode, &unsignedp,
>                                            gimple_call_fntype (g),
>                                            2);
>           else
> -           pmode = promote_decl_mode (exp, &unsignedp);
> +           pmode = promote_ssa_mode (ssa_name, &unsignedp);
>           gcc_assert (GET_MODE (decl_rtl) == pmode);
>
>           temp = gen_lowpart_SUBREG (mode, decl_rtl);
> diff --git a/gcc/expr.h b/gcc/expr.h
> index 32d1707..a2c8e1d 100644
> --- a/gcc/expr.h
> +++ b/gcc/expr.h
> @@ -210,6 +210,7 @@ extern rtx_insn *emit_move_insn_1 (rtx, rtx);
>
>  extern rtx_insn *emit_move_complex_push (machine_mode, rtx, rtx);
>  extern rtx_insn *emit_move_complex_parts (rtx, rtx);
> +extern rtx read_complex_part (rtx, bool);
>  extern void write_complex_part (rtx, rtx, bool);
>  extern rtx emit_move_resolve_push (machine_mode, rtx);
>
> diff --git a/gcc/function.c b/gcc/function.c
> index 20bf3b3..715c19f 100644
> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -72,6 +72,9 @@ along with GCC; see the file COPYING3.  If not see
>  #include "cfganal.h"
>  #include "cfgbuild.h"
>  #include "cfgcleanup.h"
> +#include "cfgexpand.h"
> +#include "basic-block.h"
> +#include "df.h"
>  #include "params.h"
>  #include "bb-reorder.h"
>  #include "shrink-wrap.h"
> @@ -148,6 +151,9 @@ static bool contains (const_rtx, hash_table<insn_cache_hasher> *);
>  static void prepare_function_start (void);
>  static void do_clobber_return_reg (rtx, void *);
>  static void do_use_return_reg (rtx, void *);
> +static rtx rtl_for_parm (struct assign_parm_data_all *, tree);
> +static void maybe_reset_rtl_for_parm (tree);
> +
>
>  /* Stack of nested functions.  */
>  /* Keep track of the cfun stack.  */
> @@ -2105,6 +2111,30 @@ aggregate_value_p (const_tree exp, const_tree fntype)
>  bool
>  use_register_for_decl (const_tree decl)
>  {
> +  if (TREE_CODE (decl) == SSA_NAME)
> +    {
> +      /* We often try to use the SSA_NAME, instead of its underlying
> +        decl, to get type information and guide decisions, to avoid
> +        differences of behavior between anonymous and named
> +        variables, but in this one case we have to go for the actual
> +        variable if there is one.  The main reason is that, at least
> +        at -O0, we want to place user variables on the stack, but we
> +        don't mind using pseudos for anonymous or ignored temps.
> +        Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
> +        should go in pseudos, whereas their corresponding variables
> +        might have to go on the stack.  So, disregarding the decl
> +        here would negatively impact debug info at -O0, enable
> +        coalescing between SSA_NAMEs that ought to get different
> +        stack/pseudo assignments, and get the incoming argument
> +        processing thoroughly confused by PARM_DECLs expected to live
> +        in stack slots but assigned to pseudos.  */
> +      if (!SSA_NAME_VAR (decl))
> +       return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
> +         && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
> +
> +      decl = SSA_NAME_VAR (decl);
> +    }
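
The anonymous-name branch added above reduces to a two-part predicate. A minimal standalone sketch, assuming hypothetical sketch_* names in place of GCC's tree and mode accessors, with sketch_flag_float_store standing in for flag_float_store:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for machine modes; SKETCH_BLK models BLKmode.  */
enum sketch_mode { SKETCH_BLK, SKETCH_SI, SKETCH_DF };

struct sketch_type { enum sketch_mode mode; bool is_float; };

/* Stand-in for GCC's flag_float_store (-ffloat-store).  */
static bool sketch_flag_float_store = false;

/* An SSA name with no underlying decl goes in a pseudo unless its type
   has BLKmode, or it is a float type and -ffloat-store is in effect.  */
static bool
sketch_use_register_for_anon_ssa (const struct sketch_type *type)
{
  assert (type != NULL);

  return type->mode != SKETCH_BLK
         && !(sketch_flag_float_store && type->is_float);
}
```

Names that do have an underlying decl are not covered by this sketch; in the patch they fall through to the existing per-decl checks.
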
> +
>    /* Honor volatile.  */
>    if (TREE_SIDE_EFFECTS (decl))
>      return false;
> @@ -2240,7 +2270,7 @@ assign_parms_initialize_all (struct assign_parm_data_all *all)
>     needed, else the old list.  */
>
>  static void
> -split_complex_args (vec<tree> *args)
> +split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
>  {
>    unsigned i;
>    tree p;
> @@ -2251,6 +2281,7 @@ split_complex_args (vec<tree> *args)
>        if (TREE_CODE (type) == COMPLEX_TYPE
>           && targetm.calls.split_complex_arg (type))
>         {
> +         tree cparm = p;
>           tree decl;
>           tree subtype = TREE_TYPE (type);
>           bool addressable = TREE_ADDRESSABLE (p);
> @@ -2269,6 +2300,9 @@ split_complex_args (vec<tree> *args)
>           DECL_ARTIFICIAL (p) = addressable;
>           DECL_IGNORED_P (p) = addressable;
>           TREE_ADDRESSABLE (p) = 0;
> +         /* Reset the RTL before layout_decl, or it may change the
> +            mode of the RTL of the original argument copied to P.  */
> +         SET_DECL_RTL (p, NULL_RTX);
>           layout_decl (p, 0);
>           (*args)[i] = p;
>
> @@ -2280,6 +2314,25 @@ split_complex_args (vec<tree> *args)
>           DECL_IGNORED_P (decl) = addressable;
>           layout_decl (decl, 0);
>           args->safe_insert (++i, decl);
> +
> +         /* If we are expanding a function, rather than gimplifying
> +            it, propagate the RTL of the complex parm to the split
> +            declarations, and set their contexts so that
> +            maybe_reset_rtl_for_parm can recognize them and refrain
> +            from resetting their RTL.  */
> +         if (currently_expanding_to_rtl)
> +           {
> +             maybe_reset_rtl_for_parm (cparm);
> +             rtx rtl = rtl_for_parm (all, cparm);
> +             if (rtl)
> +               {
> +                 SET_DECL_RTL (p, read_complex_part (rtl, false));
> +                 SET_DECL_RTL (decl, read_complex_part (rtl, true));
> +
> +                 DECL_CONTEXT (p) = cparm;
> +                 DECL_CONTEXT (decl) = cparm;
> +               }
> +           }
>         }
>      }
>  }
> @@ -2342,7 +2395,7 @@ assign_parms_augmented_arg_list (struct assign_parm_data_all *all)
>
>    /* If the target wants to split complex arguments into scalars, do so.  */
>    if (targetm.calls.split_complex_arg)
> -    split_complex_args (&fnargs);
> +    split_complex_args (all, &fnargs);
>
>    return fnargs;
>  }
> @@ -2745,23 +2798,98 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data)
>    data->entry_parm = entry_parm;
>  }
>
> +/* Wrapper for use_register_for_decl that treats the .result_ptr
> +   parm as the function's RESULT_DECL when the RESULT_DECL is
> +   passed by reference.  */
> +
> +static bool
> +use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm)
> +{
> +  if (parm == all->function_result_decl)
> +    {
> +      tree result = DECL_RESULT (current_function_decl);
> +
> +      if (DECL_BY_REFERENCE (result))
> +       parm = result;
> +    }
> +
> +  return use_register_for_decl (parm);
> +}
> +
> +/* Wrapper for get_rtl_for_parm_ssa_default_def that maps the
> +   .result_ptr parm to the function's RESULT_DECL when that is passed
> +   by reference, and returns NULL_RTX for it otherwise.  */
> +
> +static rtx
> +rtl_for_parm (struct assign_parm_data_all *all, tree parm)
> +{
> +  if (parm == all->function_result_decl)
> +    {
> +      tree result = DECL_RESULT (current_function_decl);
> +
> +      if (!DECL_BY_REFERENCE (result))
> +       return NULL_RTX;
> +
> +      parm = result;
> +    }
> +
> +  return get_rtl_for_parm_ssa_default_def (parm);
> +}
> +
> +/* Reset the location of PARM_DECLs and RESULT_DECLs that had
> +   SSA_NAMEs in multiple partitions, so that assign_parms will choose
> +   the default def, if it exists, or create new RTL to hold the unused
> +   entry value.  If we are coalescing across variables, we want to
> +   reset the location too, because a parm without a default def
> +   (incoming value unused) might be coalesced with one with a default
> +   def, and then assign_parms would copy both incoming values to the
> +   same location, which might cause the wrong value to survive.  */
> +static void
> +maybe_reset_rtl_for_parm (tree parm)
> +{
> +  gcc_assert (TREE_CODE (parm) == PARM_DECL
> +             || TREE_CODE (parm) == RESULT_DECL);
> +
> +  /* If this is a split complex parameter, its context was set to its
> +     original PARM_DECL in split_complex_args so that we could
> +     recognize it here and not reset its RTL.  Restore the context.  */
> +  if (DECL_CONTEXT (parm) && TREE_CODE (DECL_CONTEXT (parm)) == PARM_DECL)
> +    {
> +      DECL_CONTEXT (parm) = DECL_CONTEXT (DECL_CONTEXT (parm));
> +      return;
> +    }
> +
> +  if ((flag_tree_coalesce_vars
> +       || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx))
> +      && is_gimple_reg (parm))
> +    SET_DECL_RTL (parm, NULL_RTX);
> +}
> +
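
The final condition in maybe_reset_rtl_for_parm above can be read as a small truth table. A minimal standalone sketch, assuming hypothetical model_* names: rtl_is_pc_rtx models "DECL_RTL is set and equals pc_rtx" (the multiple-places marker), is_gimple_reg models is_gimple_reg (parm), and the split-complex early return is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two properties the condition inspects.  */
struct model_parm
{
  bool rtl_is_pc_rtx;   /* RTL was marked as living in multiple places.  */
  bool is_gimple_reg;   /* The parm is a GIMPLE register.  */
};

/* Stand-in for flag_tree_coalesce_vars (-ftree-coalesce-vars).  */
static bool model_flag_tree_coalesce_vars = true;

/* Return true if the parm's RTL would be reset to NULL_RTX.  */
static bool
model_should_reset_rtl (const struct model_parm *parm)
{
  assert (parm != NULL);

  return (model_flag_tree_coalesce_vars || parm->rtl_is_pc_rtx)
         && parm->is_gimple_reg;
}
```

With coalescing of user variables enabled, every GIMPLE-register parm is reset in this model. With it disabled, only parms carrying the multiple-places marker are.
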
>  /* A subroutine of assign_parms.  Adjust DATA->STACK_RTL such that it's
>     always valid and properly aligned.  */
>
>  static void
> -assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data)
> +assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm,
> +                             struct assign_parm_data_one *data)
>  {
>    rtx stack_parm = data->stack_parm;
>
> +  /* If out-of-SSA assigned RTL to the parm default def, make sure we
> +     don't use what we might have computed before.  */
> +  rtx ssa_assigned = rtl_for_parm (all, parm);
> +  if (ssa_assigned)
> +    stack_parm = NULL;
> +
>    /* If we can't trust the parm stack slot to be aligned enough for its
>       ultimate type, don't use that slot after entry.  We'll make another
>       stack slot, if we need one.  */
> -  if (stack_parm
> -      && ((STRICT_ALIGNMENT
> -          && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm))
> -         || (data->nominal_type
> -             && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
> -             && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
> +  else if (stack_parm
> +          && ((STRICT_ALIGNMENT
> +               && (GET_MODE_ALIGNMENT (data->nominal_mode)
> +                   > MEM_ALIGN (stack_parm)))
> +              || (data->nominal_type
> +                  && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
> +                  && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
>      stack_parm = NULL;
>
>    /* If parm was passed in memory, and we need to convert it on entry,
> @@ -2823,14 +2951,32 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>
>    size = int_size_in_bytes (data->passed_type);
>    size_stored = CEIL_ROUND (size, UNITS_PER_WORD);
> +
>    if (stack_parm == 0)
>      {
>        DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
> -      stack_parm = assign_stack_local (BLKmode, size_stored,
> -                                      DECL_ALIGN (parm));
> -      if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
> -       PUT_MODE (stack_parm, GET_MODE (entry_parm));
> -      set_mem_attributes (stack_parm, parm, 1);
> +      rtx from_expand = rtl_for_parm (all, parm);
> +      if (from_expand && (!parm_maybe_byref_p (parm)
> +                         || XEXP (from_expand, 0) != NULL_RTX))
> +       stack_parm = copy_rtx (from_expand);
> +      else
> +       {
> +         stack_parm = assign_stack_local (BLKmode, size_stored,
> +                                          DECL_ALIGN (parm));
> +         if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
> +           PUT_MODE (stack_parm, GET_MODE (entry_parm));
> +         if (from_expand)
> +           {
> +             gcc_assert (GET_CODE (stack_parm) == MEM);
> +             gcc_assert (GET_CODE (from_expand) == MEM);
> +             gcc_assert (XEXP (from_expand, 0) == NULL_RTX);
> +             XEXP (from_expand, 0) = XEXP (stack_parm, 0);
> +             PUT_MODE (from_expand, GET_MODE (stack_parm));
> +             stack_parm = copy_rtx (from_expand);
> +           }
> +         else
> +           set_mem_attributes (stack_parm, parm, 1);
> +       }
>      }
>
>    /* If a BLKmode arrives in registers, copy it to a stack slot.  Handle
> @@ -2968,14 +3114,34 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>      = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
>                              TREE_TYPE (current_function_decl), 2);
>
> -  parmreg = gen_reg_rtx (promoted_nominal_mode);
> +  rtx from_expand = parmreg = rtl_for_parm (all, parm);
>
> -  if (!DECL_ARTIFICIAL (parm))
> -    mark_user_reg (parmreg);
> +  if (from_expand && !data->passed_pointer)
> +    {
> +      if (GET_MODE (parmreg) != promoted_nominal_mode)
> +       parmreg = gen_lowpart (promoted_nominal_mode, parmreg);
> +    }
> +  else if (!from_expand || parm_maybe_byref_p (parm))
> +    {
> +      parmreg = gen_reg_rtx (promoted_nominal_mode);
> +      if (!DECL_ARTIFICIAL (parm))
> +       mark_user_reg (parmreg);
> +
> +      if (from_expand)
> +       {
> +         gcc_assert (data->passed_pointer);
> +         gcc_assert (GET_CODE (from_expand) == MEM
> +                     && GET_MODE (from_expand) == BLKmode
> +                     && XEXP (from_expand, 0) == NULL_RTX);
> +         XEXP (from_expand, 0) = parmreg;
> +       }
> +    }
>
>    /* If this was an item that we received a pointer to,
>       set DECL_RTL appropriately.  */
> -  if (data->passed_pointer)
> +  if (from_expand)
> +    SET_DECL_RTL (parm, from_expand);
> +  else if (data->passed_pointer)
>      {
>        rtx x = gen_rtx_MEM (TYPE_MODE (TREE_TYPE (data->passed_type)), parmreg);
>        set_mem_attributes (x, parm, 1);
> @@ -2990,10 +3156,13 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>       assign_parm_find_data_types and expand_expr_real_1.  */
>
>    equiv_stack_parm = data->stack_parm;
> +  if (!equiv_stack_parm)
> +    equiv_stack_parm = data->entry_parm;
>    validated_mem = validize_mem (copy_rtx (data->entry_parm));
>
>    need_conversion = (data->nominal_mode != data->passed_mode
>                      || promoted_nominal_mode != data->promoted_mode);
> +  gcc_assert (!(need_conversion && data->passed_pointer && from_expand));
>    moved = false;
>
>    if (need_conversion
> @@ -3125,16 +3294,28 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>
>        did_conversion = true;
>      }
> -  else
> +  /* We don't want to copy the incoming pointer to a parmreg expected
> +     to hold the value rather than the pointer.  */
> +  else if (!data->passed_pointer || parmreg != from_expand)
>      emit_move_insn (parmreg, validated_mem);
>
>    /* If we were passed a pointer but the actual value can safely live
>       in a register, retrieve it and use it directly.  */
> -  if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode)
> +  if (data->passed_pointer
> +      && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode))
>      {
> +      rtx src = DECL_RTL (parm);
> +
>        /* We can't use nominal_mode, because it will have been set to
>          Pmode above.  We must use the actual mode of the parm.  */
> -      if (use_register_for_decl (parm))
> +      if (from_expand)
> +       {
> +         parmreg = from_expand;
> +         gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm)));
> +         src = gen_rtx_MEM (GET_MODE (parmreg), validated_mem);
> +         set_mem_attributes (src, parm, 1);
> +       }
> +      else if (use_register_for_decl (parm))
>         {
>           parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm)));
>           mark_user_reg (parmreg);
> @@ -3151,14 +3332,14 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>           set_mem_attributes (parmreg, parm, 1);
>         }
>
> -      if (GET_MODE (parmreg) != GET_MODE (DECL_RTL (parm)))
> +      if (GET_MODE (parmreg) != GET_MODE (src))
>         {
> -         rtx tempreg = gen_reg_rtx (GET_MODE (DECL_RTL (parm)));
> +         rtx tempreg = gen_reg_rtx (GET_MODE (src));
>           int unsigned_p = TYPE_UNSIGNED (TREE_TYPE (parm));
>
>           push_to_sequence2 (all->first_conversion_insn,
>                              all->last_conversion_insn);
> -         emit_move_insn (tempreg, DECL_RTL (parm));
> +         emit_move_insn (tempreg, src);
>           tempreg = convert_to_mode (GET_MODE (parmreg), tempreg, unsigned_p);
>           emit_move_insn (parmreg, tempreg);
>           all->first_conversion_insn = get_insns ();
> @@ -3167,14 +3348,16 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>
>           did_conversion = true;
>         }
> +      else if (GET_MODE (parmreg) == BLKmode)
> +       gcc_assert (parm_maybe_byref_p (parm));
>        else
> -       emit_move_insn (parmreg, DECL_RTL (parm));
> +       emit_move_insn (parmreg, src);
>
>        SET_DECL_RTL (parm, parmreg);
>
>        /* STACK_PARM is the pointer, not the parm, and PARMREG is
>          now the parm.  */
> -      data->stack_parm = NULL;
> +      data->stack_parm = equiv_stack_parm = NULL;
>      }
>
>    /* Mark the register as eliminable if we did no conversion and it was
> @@ -3184,11 +3367,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>       make here would screw up life analysis for it.  */
>    if (data->nominal_mode == data->passed_mode
>        && !did_conversion
> -      && data->stack_parm != 0
> -      && MEM_P (data->stack_parm)
> +      && equiv_stack_parm != 0
> +      && MEM_P (equiv_stack_parm)
>        && data->locate.offset.var == 0
>        && reg_mentioned_p (virtual_incoming_args_rtx,
> -                         XEXP (data->stack_parm, 0)))
> +                         XEXP (equiv_stack_parm, 0)))
>      {
>        rtx_insn *linsn = get_last_insn ();
>        rtx_insn *sinsn;
> @@ -3201,8 +3384,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>             = GET_MODE_INNER (GET_MODE (parmreg));
>           int regnor = REGNO (XEXP (parmreg, 0));
>           int regnoi = REGNO (XEXP (parmreg, 1));
> -         rtx stackr = adjust_address_nv (data->stack_parm, submode, 0);
> -         rtx stacki = adjust_address_nv (data->stack_parm, submode,
> +         rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0);
> +         rtx stacki = adjust_address_nv (equiv_stack_parm, submode,
>                                           GET_MODE_SIZE (submode));
>
>           /* Scan backwards for the set of the real and
> @@ -3275,6 +3458,13 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
>
>        if (data->stack_parm == 0)
>         {
> +         rtx x = data->stack_parm = rtl_for_parm (all, parm);
> +         if (x)
> +           gcc_assert (GET_MODE (x) == GET_MODE (data->entry_parm));
> +       }
> +
> +      if (data->stack_parm == 0)
> +       {
>           int align = STACK_SLOT_ALIGNMENT (data->passed_type,
>                                             GET_MODE (data->entry_parm),
>                                             TYPE_ALIGN (data->passed_type));
> @@ -3337,11 +3527,21 @@ assign_parms_unsplit_complex (struct assign_parm_data_all *all,
>           imag = DECL_RTL (fnargs[i + 1]);
>           if (inner != GET_MODE (real))
>             {
> -             real = gen_lowpart_SUBREG (inner, real);
> -             imag = gen_lowpart_SUBREG (inner, imag);
> +             real = simplify_gen_subreg (inner, real, GET_MODE (real),
> +                                         subreg_lowpart_offset
> +                                         (inner, GET_MODE (real)));
> +             imag = simplify_gen_subreg (inner, imag, GET_MODE (imag),
> +                                         subreg_lowpart_offset
> +                                         (inner, GET_MODE (imag)));
>             }
>
> -         if (TREE_ADDRESSABLE (parm))
> +         if ((tmp = rtl_for_parm (all, parm)) != NULL_RTX
> +             && rtx_equal_p (real,
> +                             read_complex_part (tmp, false))
> +             && rtx_equal_p (imag,
> +                             read_complex_part (tmp, true)))
> +           ; /* We now have the right rtl in tmp.  */
> +         else if (TREE_ADDRESSABLE (parm))
>             {
>               rtx rmem, imem;
>               HOST_WIDE_INT size = int_size_in_bytes (TREE_TYPE (parm));
> @@ -3487,7 +3687,7 @@ assign_bounds (vec<bounds_parm_data> &bndargs,
>           assign_parm_setup_block (&all, pbdata->bounds_parm,
>                                    &pbdata->parm_data);
>         else if (pbdata->parm_data.passed_pointer
> -                || use_register_for_decl (pbdata->bounds_parm))
> +                || use_register_for_parm_decl (&all, pbdata->bounds_parm))
>           assign_parm_setup_reg (&all, pbdata->bounds_parm,
>                                  &pbdata->parm_data);
>         else
> @@ -3531,6 +3731,8 @@ assign_parms (tree fndecl)
>           DECL_INCOMING_RTL (parm) = DECL_RTL (parm);
>           continue;
>         }
> +      else
> +       maybe_reset_rtl_for_parm (parm);
>
>        /* Estimate stack alignment from parameter alignment.  */
>        if (SUPPORTS_STACK_ALIGNMENT)
> @@ -3580,7 +3782,9 @@ assign_parms (tree fndecl)
>        else
>         set_decl_incoming_rtl (parm, data.entry_parm, false);
>
> -      /* Boudns should be loaded in the particular order to
> +      assign_parm_adjust_stack_rtl (&all, parm, &data);
> +
> +      /* Bounds should be loaded in the particular order to
>          have registers allocated correctly.  Collect info about
>          input bounds and load them later.  */
>        if (POINTER_BOUNDS_TYPE_P (data.passed_type))
> @@ -3597,11 +3801,10 @@ assign_parms (tree fndecl)
>         }
>        else
>         {
> -         assign_parm_adjust_stack_rtl (&data);
> -
>           if (assign_parm_setup_block_p (&data))
>             assign_parm_setup_block (&all, parm, &data);
> -         else if (data.passed_pointer || use_register_for_decl (parm))
> +         else if (data.passed_pointer
> +                  || use_register_for_parm_decl (&all, parm))
>             assign_parm_setup_reg (&all, parm, &data);
>           else
>             assign_parm_setup_stack (&all, parm, &data);
> @@ -4932,7 +5135,9 @@ expand_function_start (tree subr)
>       before any library calls that assign parms might generate.  */
>
>    /* Decide whether to return the value in memory or in a register.  */
> -  if (aggregate_value_p (DECL_RESULT (subr), subr))
> +  tree res = DECL_RESULT (subr);
> +  maybe_reset_rtl_for_parm (res);
> +  if (aggregate_value_p (res, subr))
>      {
>        /* Returning something that won't go in a register.  */
>        rtx value_address = 0;
> @@ -4940,7 +5145,7 @@ expand_function_start (tree subr)
>  #ifdef PCC_STATIC_STRUCT_RETURN
>        if (cfun->returns_pcc_struct)
>         {
> -         int size = int_size_in_bytes (TREE_TYPE (DECL_RESULT (subr)));
> +         int size = int_size_in_bytes (TREE_TYPE (res));
>           value_address = assemble_static_space (size);
>         }
>        else
> @@ -4952,36 +5157,45 @@ expand_function_start (tree subr)
>              it.  */
>           if (sv)
>             {
> -             value_address = gen_reg_rtx (Pmode);
> +             if (DECL_BY_REFERENCE (res))
> +               value_address = get_rtl_for_parm_ssa_default_def (res);
> +             if (!value_address)
> +               value_address = gen_reg_rtx (Pmode);
>               emit_move_insn (value_address, sv);
>             }
>         }
>        if (value_address)
>         {
>           rtx x = value_address;
> -         if (!DECL_BY_REFERENCE (DECL_RESULT (subr)))
> +         if (!DECL_BY_REFERENCE (res))
>             {
> -             x = gen_rtx_MEM (DECL_MODE (DECL_RESULT (subr)), x);
> -             set_mem_attributes (x, DECL_RESULT (subr), 1);
> +             x = get_rtl_for_parm_ssa_default_def (res);
> +             if (!x)
> +               {
> +                 x = gen_rtx_MEM (DECL_MODE (res), value_address);
> +                 set_mem_attributes (x, res, 1);
> +               }
>             }
> -         SET_DECL_RTL (DECL_RESULT (subr), x);
> +         SET_DECL_RTL (res, x);
>         }
>      }
> -  else if (DECL_MODE (DECL_RESULT (subr)) == VOIDmode)
> +  else if (DECL_MODE (res) == VOIDmode)
>      /* If return mode is void, this decl rtl should not be used.  */
> -    SET_DECL_RTL (DECL_RESULT (subr), NULL_RTX);
> +    SET_DECL_RTL (res, NULL_RTX);
>    else
>      {
>        /* Compute the return values into a pseudo reg, which we will copy
>          into the true return register after the cleanups are done.  */
> -      tree return_type = TREE_TYPE (DECL_RESULT (subr));
> -      if (TYPE_MODE (return_type) != BLKmode
> -         && targetm.calls.return_in_msb (return_type))
> +      tree return_type = TREE_TYPE (res);
> +      rtx x = get_rtl_for_parm_ssa_default_def (res);
> +      if (x)
> +       /* Use it.  */;
> +      else if (TYPE_MODE (return_type) != BLKmode
> +              && targetm.calls.return_in_msb (return_type))
>         /* expand_function_end will insert the appropriate padding in
>            this case.  Use the return value's natural (unpadded) mode
>            within the function proper.  */
> -       SET_DECL_RTL (DECL_RESULT (subr),
> -                     gen_reg_rtx (TYPE_MODE (return_type)));
> +       x = gen_reg_rtx (TYPE_MODE (return_type));
>        else
>         {
>           /* In order to figure out what mode to use for the pseudo, we
> @@ -4992,25 +5206,26 @@ expand_function_start (tree subr)
>           /* Structures that are returned in registers are not
>              aggregate_value_p, so we may see a PARALLEL or a REG.  */
>           if (REG_P (hard_reg))
> -           SET_DECL_RTL (DECL_RESULT (subr),
> -                         gen_reg_rtx (GET_MODE (hard_reg)));
> +           x = gen_reg_rtx (GET_MODE (hard_reg));
>           else
>             {
>               gcc_assert (GET_CODE (hard_reg) == PARALLEL);
> -             SET_DECL_RTL (DECL_RESULT (subr), gen_group_rtx (hard_reg));
> +             x = gen_group_rtx (hard_reg);
>             }
>         }
>
> +      SET_DECL_RTL (res, x);
> +
>        /* Set DECL_REGISTER flag so that expand_function_end will copy the
>          result to the real return register(s).  */
> -      DECL_REGISTER (DECL_RESULT (subr)) = 1;
> +      DECL_REGISTER (res) = 1;
>
>        if (chkp_function_instrumented_p (current_function_decl))
>         {
> -         tree return_type = TREE_TYPE (DECL_RESULT (subr));
> +         tree return_type = TREE_TYPE (res);
>           rtx bounds = targetm.calls.chkp_function_value_bounds (return_type,
>                                                                  subr, 1);
> -         SET_DECL_BOUNDS_RTL (DECL_RESULT (subr), bounds);
> +         SET_DECL_BOUNDS_RTL (res, bounds);
>         }
>      }
>
> @@ -5025,13 +5240,19 @@ expand_function_start (tree subr)
>        rtx local, chain;
>       rtx_insn *insn;
>
> -      local = gen_reg_rtx (Pmode);
> +      local = get_rtl_for_parm_ssa_default_def (parm);
> +      if (!local)
> +       local = gen_reg_rtx (Pmode);
>        chain = targetm.calls.static_chain (current_function_decl, true);
>
>        set_decl_incoming_rtl (parm, chain, false);
>        SET_DECL_RTL (parm, local);
>        mark_reg_pointer (local, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (parm))));
>
> +      if (GET_MODE (local) != Pmode)
> +       local = convert_to_mode (Pmode, local,
> +                                TYPE_UNSIGNED (TREE_TYPE (parm)));
> +
>        insn = emit_move_insn (local, chain);
>
>        /* Mark the register as eliminable, similar to parameters.  */
> diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
> index b558d90..baed630 100644
> --- a/gcc/gimple-expr.c
> +++ b/gcc/gimple-expr.c
> @@ -375,45 +375,6 @@ copy_var_decl (tree var, tree name, tree type)
>    return copy;
>  }
>
> -/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
> -   coalescing together, false otherwise.
> -
> -   This must stay consistent with var_map_base_init in tree-ssa-live.c.  */
> -
> -bool
> -gimple_can_coalesce_p (tree name1, tree name2)
> -{
> -  /* First check the SSA_NAME's associated DECL.  We only want to
> -     coalesce if they have the same DECL or both have no associated DECL.  */
> -  tree var1 = SSA_NAME_VAR (name1);
> -  tree var2 = SSA_NAME_VAR (name2);
> -  var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
> -  var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
> -  if (var1 != var2)
> -    return false;
> -
> -  /* Now check the types.  If the types are the same, then we should
> -     try to coalesce V1 and V2.  */
> -  tree t1 = TREE_TYPE (name1);
> -  tree t2 = TREE_TYPE (name2);
> -  if (t1 == t2)
> -    return true;
> -
> -  /* If the types are not the same, check for a canonical type match.  This
> -     (for example) allows coalescing when the types are fundamentally the
> -     same, but just have different names.
> -
> -     Note pointer types with different address spaces may have the same
> -     canonical type.  Those are rejected for coalescing by the
> -     types_compatible_p check.  */
> -  if (TYPE_CANONICAL (t1)
> -      && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
> -      && types_compatible_p (t1, t2))
> -    return true;
> -
> -  return false;
> -}
> -
>  /* Strip off a legitimate source ending from the input string NAME of
>     length LEN.  Rather than having to know the names used by all of
>     our front ends, we strip off an ending of a period followed by
> diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
> index ed23eb2..3d1c89f 100644
> --- a/gcc/gimple-expr.h
> +++ b/gcc/gimple-expr.h
> @@ -28,7 +28,6 @@ extern gimple_seq gimple_body (tree);
>  extern bool gimple_has_body_p (tree);
>  extern const char *gimple_decl_printable_name (tree, int);
>  extern tree copy_var_decl (tree, tree, tree);
> -extern bool gimple_can_coalesce_p (tree, tree);
>  extern tree create_tmp_var_name (const char *);
>  extern tree create_tmp_var_raw (tree, const char * = NULL);
>  extern tree create_tmp_var (tree, const char * = NULL);
> diff --git a/gcc/opts.c b/gcc/opts.c
> index 9d5de96..32de605 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -445,12 +445,12 @@ static const struct default_options default_options_table[] =
>      { OPT_LEVELS_1_PLUS, OPT_fsplit_wide_types, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_ccp, NULL, 1 },
>      { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_bit_ccp, NULL, 1 },
> +    { OPT_LEVELS_1_PLUS, OPT_ftree_coalesce_vars, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_dce, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_dominator_opts, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 },
>      { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 },
> -    { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 },
> diff --git a/gcc/passes.def b/gcc/passes.def
> index 6b66f8f..64fc4d9 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -77,7 +77,6 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_all_early_optimizations);
>        PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
>           NEXT_PASS (pass_remove_cgraph_callee_edges);
> -         NEXT_PASS (pass_rename_ssa_copies);
>           NEXT_PASS (pass_object_sizes);
>           NEXT_PASS (pass_ccp);
>           /* After CCP we rewrite no longer addressed locals into SSA
> @@ -155,7 +154,6 @@ along with GCC; see the file COPYING3.  If not see
>        /* Initial scalar cleanups before alias computation.
>          They ensure memory accesses are not indirect wherever possible.  */
>        NEXT_PASS (pass_strip_predict_hints);
> -      NEXT_PASS (pass_rename_ssa_copies);
>        NEXT_PASS (pass_ccp);
>        /* After CCP we rewrite no longer addressed locals into SSA
>          form if possible.  */
> @@ -183,7 +181,6 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_ch);
>        NEXT_PASS (pass_lower_complex);
>        NEXT_PASS (pass_sra);
> -      NEXT_PASS (pass_rename_ssa_copies);
>        /* The dom pass will also resolve all __builtin_constant_p calls
>           that are still there to 0.  This has to be done after some
>          propagations have already run, but before some more dead code
> @@ -293,7 +290,6 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_fold_builtins);
>        NEXT_PASS (pass_optimize_widening_mul);
>        NEXT_PASS (pass_tail_calls);
> -      NEXT_PASS (pass_rename_ssa_copies);
>        /* FIXME: If DCE is not run before checking for uninitialized uses,
>          we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
>          However, this also causes us to misdiagnose cases that should be
> @@ -328,7 +324,6 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_dce);
>        NEXT_PASS (pass_asan);
>        NEXT_PASS (pass_tsan);
> -      NEXT_PASS (pass_rename_ssa_copies);
>        /* ???  We do want some kind of loop invariant motion, but we possibly
>           need to adjust LIM to be more friendly towards preserving accurate
>          debug information here.  */
> diff --git a/gcc/stmt.c b/gcc/stmt.c
> index 391686c..e7f7dd4 100644
> --- a/gcc/stmt.c
> +++ b/gcc/stmt.c
> @@ -891,7 +891,7 @@ emit_case_decision_tree (tree index_expr, tree index_type,
>      {
>        index = copy_to_reg (index);
>        if (TREE_CODE (index_expr) == SSA_NAME)
> -       set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (index_expr), index);
> +       set_reg_attrs_for_decl_rtl (index_expr, index);
>      }
>
>    balance_case_nodes (&case_list, NULL);
> diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
> index 9757777..938e54b 100644
> --- a/gcc/stor-layout.c
> +++ b/gcc/stor-layout.c
> @@ -782,7 +782,8 @@ layout_decl (tree decl, unsigned int known_align)
>      {
>        PUT_MODE (rtl, DECL_MODE (decl));
>        SET_DECL_RTL (decl, 0);
> -      set_mem_attributes (rtl, decl, 1);
> +      if (MEM_P (rtl))
> +       set_mem_attributes (rtl, decl, 1);
>        SET_DECL_RTL (decl, rtl);
>      }
>  }
> diff --git a/gcc/testsuite/gcc.dg/guality/pr54200.c b/gcc/testsuite/gcc.dg/guality/pr54200.c
> index 9b17187..e1e7293 100644
> --- a/gcc/testsuite/gcc.dg/guality/pr54200.c
> +++ b/gcc/testsuite/gcc.dg/guality/pr54200.c
> @@ -1,6 +1,6 @@
>  /* PR tree-optimization/54200 */
>  /* { dg-do run } */
> -/* { dg-options "-g -fno-var-tracking-assignments" } */
> +/* { dg-options "-g -fno-var-tracking-assignments -fno-tree-coalesce-vars" } */
>
>  int o __attribute__((used));
>
> diff --git a/gcc/testsuite/gcc.dg/ssp-1.c b/gcc/testsuite/gcc.dg/ssp-1.c
> index 5467f4d..db69332 100644
> --- a/gcc/testsuite/gcc.dg/ssp-1.c
> +++ b/gcc/testsuite/gcc.dg/ssp-1.c
> @@ -12,7 +12,7 @@ __stack_chk_fail (void)
>
>  int main ()
>  {
> -  int i;
> +  register int i;
>    char foo[255];
>
>    // smash stack
> diff --git a/gcc/testsuite/gcc.dg/ssp-2.c b/gcc/testsuite/gcc.dg/ssp-2.c
> index 9a7ac32..752fe53 100644
> --- a/gcc/testsuite/gcc.dg/ssp-2.c
> +++ b/gcc/testsuite/gcc.dg/ssp-2.c
> @@ -14,7 +14,7 @@ __stack_chk_fail (void)
>  void
>  overflow()
>  {
> -  int i = 0;
> +  register int i = 0;
>    char foo[30];
>
>    /* Overflow buffer.  */
> diff --git a/gcc/testsuite/gcc.dg/torture/parm-coalesce.c b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
> new file mode 100644
> index 0000000..dbd81c1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
> @@ -0,0 +1,40 @@
> +/* { dg-do run } */
> +
> +#include <stdlib.h>
> +
> +/* Make sure we don't coalesce both incoming parms, one whose incoming
> +   value is unused, to the same location, so as to overwrite one of
> +   them with the incoming value of the other.  */
> +
> +int __attribute__((noinline, noclone))
> +foo (int i, int j)
> +{
> +  j = i; /* The incoming value for J is unused.  */
> +  i = 2;
> +  if (j)
> +    j++;
> +  j += i + 1;
> +  return j;
> +}
> +
> +/* Same as foo, but with swapped parameters.  */
> +int __attribute__((noinline, noclone))
> +bar (int j, int i)
> +{
> +  j = i; /* The incoming value for J is unused.  */
> +  i = 2;
> +  if (j)
> +    j++;
> +  j += i + 1;
> +  return j;
> +}
> +
> +int
> +main (void)
> +{
> +  if (foo (0, 1) != 3)
> +    abort ();
> +  if (bar (1, 0) != 3)
> +    abort ();
> +  return 0;
> +}
> diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
> index 7b747ab9..978476c 100644
> --- a/gcc/tree-outof-ssa.c
> +++ b/gcc/tree-outof-ssa.c
> @@ -279,7 +279,6 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
>    rtx dest_rtx, seq, x;
>    machine_mode dest_mode, src_mode;
>    int unsignedp;
> -  tree var;
>
>    if (dump_file && (dump_flags & TDF_DETAILS))
>      {
> @@ -301,12 +300,12 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
>
>    start_sequence ();
>
> -  var = SSA_NAME_VAR (partition_to_var (SA.map, dest));
> +  tree name = partition_to_var (SA.map, dest);
>    src_mode = TYPE_MODE (TREE_TYPE (src));
>    dest_mode = GET_MODE (dest_rtx);
> -  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var)));
> +  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (name)));
>    gcc_assert (!REG_P (dest_rtx)
> -             || dest_mode == promote_decl_mode (var, &unsignedp));
> +             || dest_mode == promote_ssa_mode (name, &unsignedp));
>
>    if (src_mode != dest_mode)
>      {
> @@ -682,13 +681,12 @@ elim_backward (elim_graph g, int T)
>  static rtx
>  get_temp_reg (tree name)
>  {
> -  tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
> -  tree type = TREE_TYPE (var);
> +  tree type = TREE_TYPE (name);
>    int unsignedp;
> -  machine_mode reg_mode = promote_decl_mode (var, &unsignedp);
> +  machine_mode reg_mode = promote_ssa_mode (name, &unsignedp);
>    rtx x = gen_reg_rtx (reg_mode);
>    if (POINTER_TYPE_P (type))
> -    mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var))));
> +    mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type)));
>    return x;
>  }
>
> @@ -988,7 +986,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
>
>    /* Return to viewing the variable list as just all reference variables after
>       coalescing has been performed.  */
> -  partition_view_normal (map, false);
> +  partition_view_normal (map);
>
>    if (dump_file && (dump_flags & TDF_DETAILS))
>      {
> diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
> index bf8983f..08ce72c 100644
> --- a/gcc/tree-ssa-coalesce.c
> +++ b/gcc/tree-ssa-coalesce.c
> @@ -36,6 +36,8 @@ along with GCC; see the file COPYING3.  If not see
>  #include "gimple-iterator.h"
>  #include "tree-ssa-live.h"
>  #include "tree-ssa-coalesce.h"
> +#include "cfgexpand.h"
> +#include "explow.h"
>  #include "diagnostic-core.h"
>
>
> @@ -806,6 +808,16 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
>    basic_block bb;
>    ssa_op_iter iter;
>    live_track_p live;
> +  basic_block entry;
> +
> +  /* If inter-variable coalescing is enabled, we may attempt to
> +     coalesce variables from different base variables, including
> +     different parameters, so we have to make sure default defs live
> +     at the entry block conflict with each other.  */
> +  if (flag_tree_coalesce_vars)
> +    entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
> +  else
> +    entry = NULL;
>
>    map = live_var_map (liveinfo);
>    graph = ssa_conflicts_new (num_var_partitions (map));
> @@ -864,6 +876,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
>             live_track_process_def (live, result, graph);
>         }
>
> +      /* Pretend there are defs for params' default defs at the start
> +        of the (post-)entry block.  */
> +      if (bb == entry)
> +       {
> +         unsigned base;
> +         bitmap_iterator bi;
> +         EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi)
> +           {
> +             bitmap_iterator bi2;
> +             unsigned part;
> +             EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base],
> +                                       0, part, bi2)
> +               {
> +                 tree var = partition_to_var (map, part);
> +                 if (!SSA_NAME_VAR (var)
> +                     || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL
> +                         && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL)
> +                     || !SSA_NAME_IS_DEFAULT_DEF (var))
> +                   continue;
> +                 live_track_process_def (live, var, graph);
> +               }
> +           }
> +       }
> +
>       live_track_clear_base_vars (live);
>      }
>
> @@ -1132,6 +1168,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
>      {
>        var1 = partition_to_var (map, p1);
>        var2 = partition_to_var (map, p2);
> +
>        z = var_union (map, var1, var2);
>        if (z == NO_PARTITION)
>         {
> @@ -1149,6 +1186,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
>
>        if (debug)
>         fprintf (debug, ": Success -> %d\n", z);
> +
>        return true;
>      }
>
> @@ -1244,6 +1282,333 @@ ssa_name_var_hash::equal (const tree_node *n1, const tree_node *n2)
>  }
>
>
> +/* Output partition map MAP with coalescing plan PART to file F.  */
> +
> +void
> +dump_part_var_map (FILE *f, partition part, var_map map)
> +{
> +  int t;
> +  unsigned x, y;
> +  int p;
> +
> +  fprintf (f, "\nCoalescible Partition map \n\n");
> +
> +  for (x = 0; x < map->num_partitions; x++)
> +    {
> +      if (map->view_to_partition != NULL)
> +       p = map->view_to_partition[x];
> +      else
> +       p = x;
> +
> +      if (ssa_name (p) == NULL_TREE
> +         || virtual_operand_p (ssa_name (p)))
> +        continue;
> +
> +      t = 0;
> +      for (y = 1; y < num_ssa_names; y++)
> +        {
> +         tree var = version_to_var (map, y);
> +         if (!var)
> +           continue;
> +         int q = var_to_partition (map, var);
> +         p = partition_find (part, q);
> +         gcc_assert (map->partition_to_base_index[q]
> +                     == map->partition_to_base_index[p]);
> +
> +         if (p == (int)x)
> +           {
> +             if (t++ == 0)
> +               {
> +                 fprintf (f, "Partition %d, base %d (", x,
> +                          map->partition_to_base_index[q]);
> +                 print_generic_expr (f, partition_to_var (map, q), TDF_SLIM);
> +                 fprintf (f, " - ");
> +               }
> +             fprintf (f, "%d ", y);
> +           }
> +       }
> +      if (t != 0)
> +       fprintf (f, ")\n");
> +    }
> +  fprintf (f, "\n");
> +}
> +
> +/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
> +   coalescing together, false otherwise.
> +
> +   This must stay consistent with var_map_base_init in tree-ssa-live.c.  */
> +
> +bool
> +gimple_can_coalesce_p (tree name1, tree name2)
> +{
> +  /* First check the SSA_NAME's associated DECL.  Without
> +     optimization, we only want to coalesce if they have the same DECL
> +     or both have no associated DECL.  */
> +  tree var1 = SSA_NAME_VAR (name1);
> +  tree var2 = SSA_NAME_VAR (name2);
> +  var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
> +  var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
> +  if (var1 != var2 && !flag_tree_coalesce_vars)
> +    return false;
> +
> +  /* Now check the types.  If the types are the same, then we should
> +     try to coalesce V1 and V2.  */
> +  tree t1 = TREE_TYPE (name1);
> +  tree t2 = TREE_TYPE (name2);
> +  if (t1 == t2)
> +    {
> +    check_modes:
> +      /* If the base variables are the same, we're good: none of the
> +        other tests below could possibly fail.  */
> +      var1 = SSA_NAME_VAR (name1);
> +      var2 = SSA_NAME_VAR (name2);
> +      if (var1 == var2)
> +       return true;
> +
> +      /* We don't want to coalesce two SSA names if one of the base
> +        variables is supposed to be a register while the other is
> +        supposed to be on the stack.  Anonymous SSA names take
> +        registers, but when not optimizing, user variables should go
> +        on the stack, so coalescing them with the anonymous variable
> +        as the partition leader would end up assigning the user
> +        variable to a register.  Don't do that!  */
> +      bool reg1 = !var1 || use_register_for_decl (var1);
> +      bool reg2 = !var2 || use_register_for_decl (var2);
> +      if (reg1 != reg2)
> +       return false;
> +
> +      /* Check that the promoted modes are the same.  We don't want to
> +        coalesce if the promoted modes would be different.  Only
> +        PARM_DECLs and RESULT_DECLs have different promotion rules,
> +        so skip the test if both are variables, or both are anonymous
> +        SSA_NAMEs.  Now, if a parm or result has BLKmode, do not
> +        coalesce its SSA versions with those of any other variables,
> +        because it may be passed by reference.  */
> +      return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
> +       || (/* The case var1 == var2 is already covered above.  */
> +           !parm_maybe_byref_p (var1)
> +           && !parm_maybe_byref_p (var2)
> +           && promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL));
> +    }
> +
> +  /* If the types are not the same, check for a canonical type match.  This
> +     (for example) allows coalescing when the types are fundamentally the
> +     same, but just have different names.
> +
> +     Note pointer types with different address spaces may have the same
> +     canonical type.  Those are rejected for coalescing by the
> +     types_compatible_p check.  */
> +  if (TYPE_CANONICAL (t1)
> +      && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
> +      && types_compatible_p (t1, t2))
> +    goto check_modes;
> +
> +  return false;
> +}
> +
> +/* Fill in MAP's partition_to_base_index, with one index for each
> +   partition of SSA names USED_IN_COPIES and related by CL coalesce
> +   possibilities.  This must match gimple_can_coalesce_p in the
> +   optimized case.  */
> +
> +static void
> +compute_optimized_partition_bases (var_map map, bitmap used_in_copies,
> +                                  coalesce_list_p cl)
> +{
> +  int parts = num_var_partitions (map);
> +  partition tentative = partition_new (parts);
> +
> +  /* Partition the SSA versions so that, for each coalescible
> +     pair, both of its members are in the same partition in
> +     TENTATIVE.  */
> +  gcc_assert (!cl->sorted);
> +  coalesce_pair_p node;
> +  coalesce_iterator_type ppi;
> +  FOR_EACH_PARTITION_PAIR (node, ppi, cl)
> +    {
> +      tree v1 = ssa_name (node->first_element);
> +      int p1 = partition_find (tentative, var_to_partition (map, v1));
> +      tree v2 = ssa_name (node->second_element);
> +      int p2 = partition_find (tentative, var_to_partition (map, v2));
> +
> +      if (p1 == p2)
> +       continue;
> +
> +      partition_union (tentative, p1, p2);
> +    }
> +
> +  /* We have to deal with cost one pairs too.  */
> +  for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next)
> +    {
> +      tree v1 = ssa_name (co->first_element);
> +      int p1 = partition_find (tentative, var_to_partition (map, v1));
> +      tree v2 = ssa_name (co->second_element);
> +      int p2 = partition_find (tentative, var_to_partition (map, v2));
> +
> +      if (p1 == p2)
> +       continue;
> +
> +      partition_union (tentative, p1, p2);
> +    }
> +
> +  /* And also with abnormal edges.  */
> +  basic_block bb;
> +  edge e;
> +  edge_iterator ei;
> +  FOR_EACH_BB_FN (bb, cfun)
> +    {
> +      FOR_EACH_EDGE (e, ei, bb->preds)
> +       if (e->flags & EDGE_ABNORMAL)
> +         {
> +           gphi_iterator gsi;
> +           for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
> +                gsi_next (&gsi))
> +             {
> +               gphi *phi = gsi.phi ();
> +               tree arg = PHI_ARG_DEF (phi, e->dest_idx);
> +               if (SSA_NAME_IS_DEFAULT_DEF (arg)
> +                   && (!SSA_NAME_VAR (arg)
> +                       || TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL))
> +                 continue;
> +
> +               tree res = PHI_RESULT (phi);
> +
> +               int p1 = partition_find (tentative, var_to_partition (map, res));
> +               int p2 = partition_find (tentative, var_to_partition (map, arg));
> +
> +               if (p1 == p2)
> +                 continue;
> +
> +               partition_union (tentative, p1, p2);
> +             }
> +         }
> +    }
> +
> +  map->partition_to_base_index = XCNEWVEC (int, parts);
> +  auto_vec<unsigned int> index_map (parts);
> +  if (parts)
> +    index_map.quick_grow (parts);
> +
> +  const unsigned no_part = -1;
> +  unsigned count = parts;
> +  while (count)
> +    index_map[--count] = no_part;
> +
> +  /* Initialize MAP's mapping from partition to base index, using
> +     as base indices an enumeration of the TENTATIVE partitions in
> +     which each SSA version ended up, so that we compute conflicts
> +     between all SSA versions that ended up in the same potential
> +     coalesce partition.  */
> +  bitmap_iterator bi;
> +  unsigned i;
> +  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
> +    {
> +      int pidx = var_to_partition (map, ssa_name (i));
> +      int base = partition_find (tentative, pidx);
> +      if (index_map[base] != no_part)
> +       continue;
> +      index_map[base] = count++;
> +    }
> +
> +  map->num_basevars = count;
> +
> +  EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
> +    {
> +      int pidx = var_to_partition (map, ssa_name (i));
> +      int base = partition_find (tentative, pidx);
> +      gcc_assert (index_map[base] < count);
> +      map->partition_to_base_index[pidx] = index_map[base];
> +    }
> +
> +  if (dump_file && (dump_flags & TDF_DETAILS))
> +    dump_part_var_map (dump_file, tentative, map);
> +
> +  partition_delete (tentative);
> +}
> +
> +/* Hashtable helpers.  */
> +
> +struct tree_int_map_hasher : nofree_ptr_hash <tree_int_map>
> +{
> +  static inline hashval_t hash (const tree_int_map *);
> +  static inline bool equal (const tree_int_map *, const tree_int_map *);
> +};
> +
> +inline hashval_t
> +tree_int_map_hasher::hash (const tree_int_map *v)
> +{
> +  return tree_map_base_hash (v);
> +}
> +
> +inline bool
> +tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
> +{
> +  return tree_int_map_eq (v, c);
> +}
> +
> +/* This routine will initialize the basevar fields of MAP with base
> +   names.  Partitions will share the same base if they have the same
> +   SSA_NAME_VAR, or, being anonymous variables, the same type.  This
> +   must match gimple_can_coalesce_p in the non-optimized case.  */
> +
> +static void
> +compute_samebase_partition_bases (var_map map)
> +{
> +  int x, num_part;
> +  tree var;
> +  struct tree_int_map *m, *mapstorage;
> +
> +  num_part = num_var_partitions (map);
> +  hash_table<tree_int_map_hasher> tree_to_index (num_part);
> +  /* We can have at most num_part entries in the hash tables, so it's
> +     enough to allocate so many map elements once, saving some malloc
> +     calls.  */
> +  mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
> +
> +  /* If a base table already exists, clear it, otherwise create it.  */
> +  free (map->partition_to_base_index);
> +  map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
> +
> +  /* Build the base variable list, and point partitions at their bases.  */
> +  for (x = 0; x < num_part; x++)
> +    {
> +      struct tree_int_map **slot;
> +      unsigned baseindex;
> +      var = partition_to_var (map, x);
> +      if (SSA_NAME_VAR (var)
> +         && (!VAR_P (SSA_NAME_VAR (var))
> +             || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
> +       m->base.from = SSA_NAME_VAR (var);
> +      else
> +       /* This restricts what anonymous SSA names we can coalesce
> +          as it restricts the sets we compute conflicts for.
> +          Using TREE_TYPE to generate sets is the easiest as
> +          type equivalency also holds for SSA names with the same
> +          underlying decl.
> +
> +          Check gimple_can_coalesce_p when changing this code.  */
> +       m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
> +                       ? TYPE_CANONICAL (TREE_TYPE (var))
> +                       : TREE_TYPE (var));
> +      /* If base variable hasn't been seen, set it up.  */
> +      slot = tree_to_index.find_slot (m, INSERT);
> +      if (!*slot)
> +       {
> +         baseindex = m - mapstorage;
> +         m->to = baseindex;
> +         *slot = m;
> +         m++;
> +       }
> +      else
> +       baseindex = (*slot)->to;
> +      map->partition_to_base_index[x] = baseindex;
> +    }
> +
> +  map->num_basevars = m - mapstorage;
> +
> +  free (mapstorage);
> +}
> +
>  /* Reduce the number of copies by coalescing variables in the function.  Return
>     a partition map with the resulting coalesces.  */
>
> @@ -1260,9 +1625,10 @@ coalesce_ssa_name (void)
>    cl = create_coalesce_list ();
>    map = create_outofssa_var_map (cl, used_in_copies);
>
> -  /* If optimization is disabled, we need to coalesce all the names originating
> -     from the same SSA_NAME_VAR so debug info remains undisturbed.  */
> -  if (!optimize)
> +  /* If this optimization is disabled, we need to coalesce all the
> +     names originating from the same SSA_NAME_VAR so debug info
> +     remains undisturbed.  */
> +  if (!flag_tree_coalesce_vars)
>      {
>        hash_table<ssa_name_var_hash> ssa_name_hash (10);
>
> @@ -1303,8 +1669,13 @@ coalesce_ssa_name (void)
>    if (dump_file && (dump_flags & TDF_DETAILS))
>      dump_var_map (dump_file, map);
>
> -  /* Don't calculate live ranges for variables not in the coalesce list.  */
> -  partition_view_bitmap (map, used_in_copies, true);
> +  partition_view_bitmap (map, used_in_copies);
> +
> +  if (flag_tree_coalesce_vars)
> +    compute_optimized_partition_bases (map, used_in_copies, cl);
> +  else
> +    compute_samebase_partition_bases (map);
> +
>    BITMAP_FREE (used_in_copies);
>
>    if (num_var_partitions (map) < 1)
> @@ -1343,8 +1714,7 @@ coalesce_ssa_name (void)
>
>    /* Now coalesce everything in the list.  */
>    coalesce_partitions (map, graph, cl,
> -                      ((dump_flags & TDF_DETAILS) ? dump_file
> -                                                  : NULL));
> +                      ((dump_flags & TDF_DETAILS) ? dump_file : NULL));
>
>    delete_coalesce_list (cl);
>    ssa_conflicts_delete (graph);
> diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h
> index 99b188a..ae289b4 100644
> --- a/gcc/tree-ssa-coalesce.h
> +++ b/gcc/tree-ssa-coalesce.h
> @@ -21,5 +21,6 @@ along with GCC; see the file COPYING3.  If not see
>  #define GCC_TREE_SSA_COALESCE_H
>
>  extern var_map coalesce_ssa_name (void);
> +extern bool gimple_can_coalesce_p (tree, tree);
>
>  #endif /* GCC_TREE_SSA_COALESCE_H */
> diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c
> deleted file mode 100644
> index aeb7f28..0000000
> --- a/gcc/tree-ssa-copyrename.c
> +++ /dev/null
> @@ -1,475 +0,0 @@
> -/* Rename SSA copies.
> -   Copyright (C) 2004-2015 Free Software Foundation, Inc.
> -   Contributed by Andrew MacLeod <amacleod@redhat.com>
> -
> -This file is part of GCC.
> -
> -GCC is free software; you can redistribute it and/or modify
> -it under the terms of the GNU General Public License as published by
> -the Free Software Foundation; either version 3, or (at your option)
> -any later version.
> -
> -GCC is distributed in the hope that it will be useful,
> -but WITHOUT ANY WARRANTY; without even the implied warranty of
> -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> -GNU General Public License for more details.
> -
> -You should have received a copy of the GNU General Public License
> -along with GCC; see the file COPYING3.  If not see
> -<http://www.gnu.org/licenses/>.  */
> -
> -#include "config.h"
> -#include "system.h"
> -#include "coretypes.h"
> -#include "backend.h"
> -#include "tree.h"
> -#include "gimple.h"
> -#include "rtl.h"
> -#include "ssa.h"
> -#include "alias.h"
> -#include "fold-const.h"
> -#include "internal-fn.h"
> -#include "gimple-iterator.h"
> -#include "flags.h"
> -#include "tree-pretty-print.h"
> -#include "insn-config.h"
> -#include "expmed.h"
> -#include "dojump.h"
> -#include "explow.h"
> -#include "calls.h"
> -#include "emit-rtl.h"
> -#include "varasm.h"
> -#include "stmt.h"
> -#include "expr.h"
> -#include "tree-dfa.h"
> -#include "tree-inline.h"
> -#include "tree-ssa-live.h"
> -#include "tree-pass.h"
> -#include "langhooks.h"
> -
> -static struct
> -{
> -  /* Number of copies coalesced.  */
> -  int coalesced;
> -} stats;
> -
> -/* The following routines implement the SSA copy renaming phase.
> -
> -   This optimization looks for copies between 2 SSA_NAMES, either through a
> -   direct copy, or an implicit one via a PHI node result and its arguments.
> -
> -   Each copy is examined to determine if it is possible to rename the base
> -   variable of one of the operands to the same variable as the other operand.
> -   i.e.
> -   T.3_5 = <blah>
> -   a_1 = T.3_5
> -
> -   If this copy couldn't be copy propagated, it could possibly remain in the
> -   program throughout the optimization phases.   After SSA->normal, it would
> -   become:
> -
> -   T.3 = <blah>
> -   a = T.3
> -
> -   Since T.3_5 is distinct from all other SSA versions of T.3, there is no
> -   fundamental reason why the base variable needs to be T.3, subject to
> -   certain restrictions.  This optimization attempts to determine if we can
> -   change the base variable on copies like this, and result in code such as:
> -
> -   a_5 = <blah>
> -   a_1 = a_5
> -
> -   This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is
> -   possible, the copy goes away completely. If it isn't possible, a new temp
> -   will be created for a_5, and you will end up with the exact same code:
> -
> -   a.8 = <blah>
> -   a = a.8
> -
> -   The other benefit of performing this optimization relates to what variables
> -   are chosen in copies.  Gimplification of the program uses temporaries for
> -   a lot of things. expressions like
> -
> -   a_1 = <blah>
> -   <blah2> = a_1
> -
> -   get turned into
> -
> -   T.3_5 = <blah>
> -   a_1 = T.3_5
> -   <blah2> = a_1
> -
> -   Copy propagation is done in a forward direction, and if we can propagate
> -   through the copy, we end up with:
> -
> -   T.3_5 = <blah>
> -   <blah2> = T.3_5
> -
> -   The copy is gone, but so is all reference to the user variable 'a'. By
> -   performing this optimization, we would see the sequence:
> -
> -   a_5 = <blah>
> -   a_1 = a_5
> -   <blah2> = a_1
> -
> -   which copy propagation would then turn into:
> -
> -   a_5 = <blah>
> -   <blah2> = a_5
> -
> -   and so we still retain the user variable whenever possible.  */
> -
> -
> -/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
> -   Choose a representative for the partition, and send debug info to DEBUG.  */
> -
> -static void
> -copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
> -{
> -  int p1, p2, p3;
> -  tree root1, root2;
> -  tree rep1, rep2;
> -  bool ign1, ign2, abnorm;
> -
> -  gcc_assert (TREE_CODE (var1) == SSA_NAME);
> -  gcc_assert (TREE_CODE (var2) == SSA_NAME);
> -
> -  register_ssa_partition (map, var1);
> -  register_ssa_partition (map, var2);
> -
> -  p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
> -  p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
> -
> -  if (debug)
> -    {
> -      fprintf (debug, "Try : ");
> -      print_generic_expr (debug, var1, TDF_SLIM);
> -      fprintf (debug, "(P%d) & ", p1);
> -      print_generic_expr (debug, var2, TDF_SLIM);
> -      fprintf (debug, "(P%d)", p2);
> -    }
> -
> -  gcc_assert (p1 != NO_PARTITION);
> -  gcc_assert (p2 != NO_PARTITION);
> -
> -  if (p1 == p2)
> -    {
> -      if (debug)
> -       fprintf (debug, " : Already coalesced.\n");
> -      return;
> -    }
> -
> -  rep1 = partition_to_var (map, p1);
> -  rep2 = partition_to_var (map, p2);
> -  root1 = SSA_NAME_VAR (rep1);
> -  root2 = SSA_NAME_VAR (rep2);
> -  if (!root1 && !root2)
> -    return;
> -
> -  /* Don't coalesce if one of the variables occurs in an abnormal PHI.  */
> -  abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1)
> -           || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2));
> -  if (abnorm)
> -    {
> -      if (debug)
> -       fprintf (debug, " : Abnormal PHI barrier.  No coalesce.\n");
> -      return;
> -    }
> -
> -  /* Partitions already have the same root, simply merge them.  */
> -  if (root1 == root2)
> -    {
> -      p1 = partition_union (map->var_partition, p1, p2);
> -      if (debug)
> -       fprintf (debug, " : Same root, coalesced --> P%d.\n", p1);
> -      return;
> -    }
> -
> -  /* Never attempt to coalesce 2 different parameters.  */
> -  if ((root1 && TREE_CODE (root1) == PARM_DECL)
> -      && (root2 && TREE_CODE (root2) == PARM_DECL))
> -    {
> -      if (debug)
> -        fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n");
> -      return;
> -    }
> -
> -  if ((root1 && TREE_CODE (root1) == RESULT_DECL)
> -      != (root2 && TREE_CODE (root2) == RESULT_DECL))
> -    {
> -      if (debug)
> -        fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n");
> -      return;
> -    }
> -
> -  ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1));
> -  ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2));
> -
> -  /* Refrain from coalescing user variables, if requested.  */
> -  if (!ign1 && !ign2)
> -    {
> -      if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2))
> -       ign2 = true;
> -      else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1))
> -       ign1 = true;
> -      else if (flag_ssa_coalesce_vars != 2)
> -       {
> -         if (debug)
> -           fprintf (debug, " : 2 different USER vars. No coalesce.\n");
> -         return;
> -       }
> -      else
> -       ign2 = true;
> -    }
> -
> -  /* If both values have default defs, we can't coalesce.  If only one has a
> -     tag, make sure that variable is the new root partition.  */
> -  if (root1 && ssa_default_def (cfun, root1))
> -    {
> -      if (root2 && ssa_default_def (cfun, root2))
> -       {
> -         if (debug)
> -           fprintf (debug, " : 2 default defs. No coalesce.\n");
> -         return;
> -       }
> -      else
> -        {
> -         ign2 = true;
> -         ign1 = false;
> -       }
> -    }
> -  else if (root2 && ssa_default_def (cfun, root2))
> -    {
> -      ign1 = true;
> -      ign2 = false;
> -    }
> -
> -  /* Do not coalesce if we cannot assign a symbol to the partition.  */
> -  if (!(!ign2 && root2)
> -      && !(!ign1 && root1))
> -    {
> -      if (debug)
> -       fprintf (debug, " : Choosen variable has no root.  No coalesce.\n");
> -      return;
> -    }
> -
> -  /* Don't coalesce if the new chosen root variable would be read-only.
> -     If both ign1 && ign2, then the root var of the larger partition
> -     wins, so reject in that case if any of the root vars is TREE_READONLY.
> -     Otherwise reject only if the root var, on which replace_ssa_name_symbol
> -     will be called below, is readonly.  */
> -  if (((root1 && TREE_READONLY (root1)) && ign2)
> -      || ((root2 && TREE_READONLY (root2)) && ign1))
> -    {
> -      if (debug)
> -       fprintf (debug, " : Readonly variable.  No coalesce.\n");
> -      return;
> -    }
> -
> -  /* Don't coalesce if the two variables aren't type compatible .  */
> -  if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2))
> -      /* There is a disconnect between the middle-end type-system and
> -         VRP, avoid coalescing enum types with different bounds.  */
> -      || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE
> -          || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE)
> -         && TREE_TYPE (var1) != TREE_TYPE (var2)))
> -    {
> -      if (debug)
> -       fprintf (debug, " : Incompatible types.  No coalesce.\n");
> -      return;
> -    }
> -
> -  /* Merge the two partitions.  */
> -  p3 = partition_union (map->var_partition, p1, p2);
> -
> -  /* Set the root variable of the partition to the better choice, if there is
> -     one.  */
> -  if (!ign2 && root2)
> -    replace_ssa_name_symbol (partition_to_var (map, p3), root2);
> -  else if (!ign1 && root1)
> -    replace_ssa_name_symbol (partition_to_var (map, p3), root1);
> -  else
> -    gcc_unreachable ();
> -
> -  if (debug)
> -    {
> -      fprintf (debug, " --> P%d ", p3);
> -      print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)),
> -                         TDF_SLIM);
> -      fprintf (debug, "\n");
> -    }
> -}
> -
> -
> -namespace {
> -
> -const pass_data pass_data_rename_ssa_copies =
> -{
> -  GIMPLE_PASS, /* type */
> -  "copyrename", /* name */
> -  OPTGROUP_NONE, /* optinfo_flags */
> -  TV_TREE_COPY_RENAME, /* tv_id */
> -  ( PROP_cfg | PROP_ssa ), /* properties_required */
> -  0, /* properties_provided */
> -  0, /* properties_destroyed */
> -  0, /* todo_flags_start */
> -  0, /* todo_flags_finish */
> -};
> -
> -class pass_rename_ssa_copies : public gimple_opt_pass
> -{
> -public:
> -  pass_rename_ssa_copies (gcc::context *ctxt)
> -    : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt)
> -  {}
> -
> -  /* opt_pass methods: */
> -  opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); }
> -  virtual bool gate (function *) { return flag_tree_copyrename != 0; }
> -  virtual unsigned int execute (function *);
> -
> -}; // class pass_rename_ssa_copies
> -
> -/* This function will make a pass through the IL, and attempt to coalesce any
> -   SSA versions which occur in PHI's or copies.  Coalescing is accomplished by
> -   changing the underlying root variable of all coalesced version.  This will
> -   then cause the SSA->normal pass to attempt to coalesce them all to the same
> -   variable.  */
> -
> -unsigned int
> -pass_rename_ssa_copies::execute (function *fun)
> -{
> -  var_map map;
> -  basic_block bb;
> -  tree var, part_var;
> -  gimple stmt;
> -  unsigned x;
> -  FILE *debug;
> -
> -  memset (&stats, 0, sizeof (stats));
> -
> -  if (dump_file && (dump_flags & TDF_DETAILS))
> -    debug = dump_file;
> -  else
> -    debug = NULL;
> -
> -  map = init_var_map (num_ssa_names);
> -
> -  FOR_EACH_BB_FN (bb, fun)
> -    {
> -      /* Scan for real copies.  */
> -      for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
> -          gsi_next (&gsi))
> -       {
> -         stmt = gsi_stmt (gsi);
> -         if (gimple_assign_ssa_name_copy_p (stmt))
> -           {
> -             tree lhs = gimple_assign_lhs (stmt);
> -             tree rhs = gimple_assign_rhs1 (stmt);
> -
> -             copy_rename_partition_coalesce (map, lhs, rhs, debug);
> -           }
> -       }
> -    }
> -
> -  FOR_EACH_BB_FN (bb, fun)
> -    {
> -      /* Treat PHI nodes as copies between the result and each argument.  */
> -      for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
> -          gsi_next (&gsi))
> -        {
> -          size_t i;
> -         tree res;
> -         gphi *phi = gsi.phi ();
> -         res = gimple_phi_result (phi);
> -
> -         /* Do not process virtual SSA_NAMES.  */
> -         if (virtual_operand_p (res))
> -           continue;
> -
> -         /* Make sure to only use the same partition for an argument
> -            as the result but never the other way around.  */
> -         if (SSA_NAME_VAR (res)
> -             && !DECL_IGNORED_P (SSA_NAME_VAR (res)))
> -           for (i = 0; i < gimple_phi_num_args (phi); i++)
> -             {
> -               tree arg = PHI_ARG_DEF (phi, i);
> -               if (TREE_CODE (arg) == SSA_NAME)
> -                 copy_rename_partition_coalesce (map, res, arg,
> -                                                 debug);
> -             }
> -         /* Else if all arguments are in the same partition try to merge
> -            it with the result.  */
> -         else
> -           {
> -             int all_p_same = -1;
> -             int p = -1;
> -             for (i = 0; i < gimple_phi_num_args (phi); i++)
> -               {
> -                 tree arg = PHI_ARG_DEF (phi, i);
> -                 if (TREE_CODE (arg) != SSA_NAME)
> -                   {
> -                     all_p_same = 0;
> -                     break;
> -                   }
> -                 else if (all_p_same == -1)
> -                   {
> -                     p = partition_find (map->var_partition,
> -                                         SSA_NAME_VERSION (arg));
> -                     all_p_same = 1;
> -                   }
> -                 else if (all_p_same == 1
> -                          && p != partition_find (map->var_partition,
> -                                                  SSA_NAME_VERSION (arg)))
> -                   {
> -                     all_p_same = 0;
> -                     break;
> -                   }
> -               }
> -             if (all_p_same == 1)
> -               copy_rename_partition_coalesce (map, res,
> -                                               PHI_ARG_DEF (phi, 0),
> -                                               debug);
> -           }
> -        }
> -    }
> -
> -  if (debug)
> -    dump_var_map (debug, map);
> -
> -  /* Now one more pass to make all elements of a partition share the same
> -     root variable.  */
> -
> -  for (x = 1; x < num_ssa_names; x++)
> -    {
> -      part_var = partition_to_var (map, x);
> -      if (!part_var)
> -        continue;
> -      var = ssa_name (x);
> -      if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var))
> -       continue;
> -      if (debug)
> -        {
> -         fprintf (debug, "Coalesced ");
> -         print_generic_expr (debug, var, TDF_SLIM);
> -         fprintf (debug, " to ");
> -         print_generic_expr (debug, part_var, TDF_SLIM);
> -         fprintf (debug, "\n");
> -       }
> -      stats.coalesced++;
> -      replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var));
> -    }
> -
> -  statistics_counter_event (fun, "copies coalesced",
> -                           stats.coalesced);
> -  delete_var_map (map);
> -  return 0;
> -}
> -
> -} // anon namespace
> -
> -gimple_opt_pass *
> -make_pass_rename_ssa_copies (gcc::context *ctxt)
> -{
> -  return new pass_rename_ssa_copies (ctxt);
> -}
> diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
> index 5b00f58..4772558 100644
> --- a/gcc/tree-ssa-live.c
> +++ b/gcc/tree-ssa-live.c
> @@ -70,88 +70,6 @@ static void  verify_live_on_entry (tree_live_info_p);
>     ssa_name or variable, and vice versa.  */
>
>
> -/* Hashtable helpers.  */
> -
> -struct tree_int_map_hasher : nofree_ptr_hash <tree_int_map>
> -{
> -  static inline hashval_t hash (const tree_int_map *);
> -  static inline bool equal (const tree_int_map *, const tree_int_map *);
> -};
> -
> -inline hashval_t
> -tree_int_map_hasher::hash (const tree_int_map *v)
> -{
> -  return tree_map_base_hash (v);
> -}
> -
> -inline bool
> -tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
> -{
> -  return tree_int_map_eq (v, c);
> -}
> -
> -
> -/* This routine will initialize the basevar fields of MAP.  */
> -
> -static void
> -var_map_base_init (var_map map)
> -{
> -  int x, num_part;
> -  tree var;
> -  struct tree_int_map *m, *mapstorage;
> -
> -  num_part = num_var_partitions (map);
> -  hash_table<tree_int_map_hasher> tree_to_index (num_part);
> -  /* We can have at most num_part entries in the hash tables, so it's
> -     enough to allocate so many map elements once, saving some malloc
> -     calls.  */
> -  mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
> -
> -  /* If a base table already exists, clear it, otherwise create it.  */
> -  free (map->partition_to_base_index);
> -  map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
> -
> -  /* Build the base variable list, and point partitions at their bases.  */
> -  for (x = 0; x < num_part; x++)
> -    {
> -      struct tree_int_map **slot;
> -      unsigned baseindex;
> -      var = partition_to_var (map, x);
> -      if (SSA_NAME_VAR (var)
> -         && (!VAR_P (SSA_NAME_VAR (var))
> -             || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
> -       m->base.from = SSA_NAME_VAR (var);
> -      else
> -       /* This restricts what anonymous SSA names we can coalesce
> -          as it restricts the sets we compute conflicts for.
> -          Using TREE_TYPE to generate sets is the easies as
> -          type equivalency also holds for SSA names with the same
> -          underlying decl.
> -
> -          Check gimple_can_coalesce_p when changing this code.  */
> -       m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
> -                       ? TYPE_CANONICAL (TREE_TYPE (var))
> -                       : TREE_TYPE (var));
> -      /* If base variable hasn't been seen, set it up.  */
> -      slot = tree_to_index.find_slot (m, INSERT);
> -      if (!*slot)
> -       {
> -         baseindex = m - mapstorage;
> -         m->to = baseindex;
> -         *slot = m;
> -         m++;
> -       }
> -      else
> -       baseindex = (*slot)->to;
> -      map->partition_to_base_index[x] = baseindex;
> -    }
> -
> -  map->num_basevars = m - mapstorage;
> -
> -  free (mapstorage);
> -}
> -
> -
>  /* Remove the base table in MAP.  */
>
>  static void
> @@ -329,21 +247,17 @@ partition_view_fini (var_map map, bitmap selected)
>  }
>
>
> -/* Create a partition view which includes all the used partitions in MAP.  If
> -   WANT_BASES is true, create the base variable map as well.  */
> +/* Create a partition view which includes all the used partitions in MAP.  */
>
>  void
> -partition_view_normal (var_map map, bool want_bases)
> +partition_view_normal (var_map map)
>  {
>    bitmap used;
>
>    used = partition_view_init (map);
>    partition_view_fini (map, used);
>
> -  if (want_bases)
> -    var_map_base_init (map);
> -  else
> -    var_map_base_fini (map);
> +  var_map_base_fini (map);
>  }
>
>
> @@ -352,7 +266,7 @@ partition_view_normal (var_map map, bool want_bases)
>     as well.  */
>
>  void
> -partition_view_bitmap (var_map map, bitmap only, bool want_bases)
> +partition_view_bitmap (var_map map, bitmap only)
>  {
>    bitmap used;
>    bitmap new_partitions = BITMAP_ALLOC (NULL);
> @@ -368,10 +282,7 @@ partition_view_bitmap (var_map map, bitmap only, bool want_bases)
>      }
>    partition_view_fini (map, new_partitions);
>
> -  if (want_bases)
> -    var_map_base_init (map);
> -  else
> -    var_map_base_fini (map);
> +  var_map_base_fini (map);
>  }
>
>
> diff --git a/gcc/tree-ssa-live.h b/gcc/tree-ssa-live.h
> index d5d7820..1f88358 100644
> --- a/gcc/tree-ssa-live.h
> +++ b/gcc/tree-ssa-live.h
> @@ -71,8 +71,8 @@ typedef struct _var_map
>  extern var_map init_var_map (int);
>  extern void delete_var_map (var_map);
>  extern int var_union (var_map, tree, tree);
> -extern void partition_view_normal (var_map, bool);
> -extern void partition_view_bitmap (var_map, bitmap, bool);
> +extern void partition_view_normal (var_map);
> +extern void partition_view_bitmap (var_map, bitmap);
>  extern void dump_scope_blocks (FILE *, int);
>  extern void debug_scope_block (tree, int);
>  extern void debug_scope_blocks (int);
> diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c
> index 437f69d..1fbd71e 100644
> --- a/gcc/tree-ssa-uncprop.c
> +++ b/gcc/tree-ssa-uncprop.c
> @@ -38,6 +38,11 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-pass.h"
>  #include "tree-ssa-propagate.h"
>  #include "tree-hash-traits.h"
> +#include "bitmap.h"
> +#include "stringpool.h"
> +#include "tree-ssanames.h"
> +#include "tree-ssa-live.h"
> +#include "tree-ssa-coalesce.h"
>
>  /* The basic structure describing an equivalency created by traversing
>     an edge.  Traversing the edge effectively means that we can assume
> diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
> index da9de28..a31a137 100644
> --- a/gcc/var-tracking.c
> +++ b/gcc/var-tracking.c
> @@ -4856,12 +4856,16 @@ dataflow_set_remove_mem_locs (variable_def **slot, dataflow_set *set)
>     registers, as well as associations between MEMs and VALUEs.  */
>
>  static void
> -dataflow_set_clear_at_call (dataflow_set *set)
> +dataflow_set_clear_at_call (dataflow_set *set, rtx_insn *call_insn)
>  {
>    unsigned int r;
>    hard_reg_set_iterator hrsi;
> +  HARD_REG_SET invalidated_regs;
>
> -  EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, r, hrsi)
> +  get_call_reg_set_usage (call_insn, &invalidated_regs,
> +                         regs_invalidated_by_call);
> +
> +  EXECUTE_IF_SET_IN_HARD_REG_SET (invalidated_regs, 0, r, hrsi)
>      var_regno_delete (set, r);
>
>    if (MAY_HAVE_DEBUG_INSNS)
> @@ -6645,7 +6649,7 @@ compute_bb_dataflow (basic_block bb)
>        switch (mo->type)
>         {
>           case MO_CALL:
> -           dataflow_set_clear_at_call (out);
> +           dataflow_set_clear_at_call (out, insn);
>             break;
>
>           case MO_USE:
> @@ -9107,7 +9111,7 @@ emit_notes_in_bb (basic_block bb, dataflow_set *set)
>        switch (mo->type)
>         {
>           case MO_CALL:
> -           dataflow_set_clear_at_call (set);
> +           dataflow_set_clear_at_call (set, insn);
>             emit_notes_for_changes (insn, EMIT_NOTE_AFTER_CALL_INSN, set->vars);
>             {
>               rtx arguments = mo->u.loc, *p = &arguments;
>
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
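
For readers following the compute_optimized_partition_bases hunk quoted above: its tail renumbers the roots of the TENTATIVE partition into a dense 0..N-1 range in two passes over the names used in copies. Below is a minimal standalone sketch of that idea, not the GCC code itself. The names toy_partition and dense_base_indices are hypothetical stand-ins for GCC's partition type and for the loop bodies in the patch, and the sketch assumes the "used" elements are already partition indices (the patch instead maps SSA versions through var_to_partition first).

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical stand-in for GCC's union-find "partition" type.
struct toy_partition
{
  std::vector<int> parent;

  explicit toy_partition (int n) : parent (n)
  {
    std::iota (parent.begin (), parent.end (), 0);
  }

  // Return the representative of X's class, halving paths on the way.
  int find (int x)
  {
    while (parent[x] != x)
      {
        parent[x] = parent[parent[x]];
        x = parent[x];
      }
    return x;
  }

  // Merge the classes containing A and B.
  void unite (int a, int b) { parent[find (a)] = find (b); }
};

// Give every element of USED a base index such that two elements share
// an index exactly when they share a root in TENTATIVE.  NUM_BASES
// receives the number of distinct indices handed out.  Elements not in
// USED keep index 0, mirroring the zero-filled XCNEWVEC array in the
// patch.
std::vector<int>
dense_base_indices (toy_partition &tentative, const std::vector<int> &used,
                    unsigned &num_bases)
{
  const unsigned no_part = static_cast<unsigned> (-1);
  std::vector<unsigned> index_map (tentative.parent.size (), no_part);
  std::vector<int> base_index (tentative.parent.size (), 0);
  unsigned count = 0;

  // Pass 1: number each root the first time it is seen.
  for (int p : used)
    {
      int root = tentative.find (p);
      if (index_map[root] == no_part)
        index_map[root] = count++;
    }

  // Pass 2: point every used element at its root's number.
  for (int p : used)
    {
      int root = tentative.find (p);
      assert (index_map[root] < count);
      base_index[p] = static_cast<int> (index_map[root]);
    }

  num_bases = count;
  return base_index;
}
```

With six elements where {1, 4, 5} and {0, 3} have been united and element 2 is unused, calling dense_base_indices on the used set {0, 1, 3, 4, 5} hands out two base indices: one shared by 0 and 3, the other shared by 1, 4 and 5. Conflicts then only need to be computed between names that share an index.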


* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-17  2:57                                                           ` Alexandre Oliva
@ 2015-08-17  8:23                                                             ` Andreas Schwab
  2015-08-17  9:21                                                               ` Andreas Schwab
  2015-08-17 11:58                                                               ` Alexandre Oliva
  0 siblings, 2 replies; 127+ messages in thread
From: Andreas Schwab @ 2015-08-17  8:23 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Patrick Marlier, Jeff Law, James Greenhalgh, H.J. Lu,
	Segher Boessenkool, Richard Biener, GCC Patches, Christophe Lyon,
	David Edelsohn, Eric Botcazou

Alexandre Oliva <aoliva@redhat.com> writes:

> Would you be so kind as to give it a spin on a m68k native?  TIA,

I tried it on ia64, and it falls flat on the floor.

../../../libgcc/config/ia64/unwind-ia64.c: In function ‘_Unwind_SetGR’:
../../../libgcc/config/ia64/unwind-ia64.c:1683:1: internal compiler error: Segmentation fault
 _Unwind_SetGR (struct _Unwind_Context *context, int index, _Unwind_Word val)
 ^
0x4000000001807edf crash_signal
        ../../gcc/toplev.c:352
0x4000000000d0ed60 parm_in_unassigned_mem_p
        ../../gcc/function.c:2940
0x4000000000d23e8f assign_parm_setup_stack
        ../../gcc/function.c:3473
0x4000000000d2b43f assign_parms
        ../../gcc/function.c:3830
0x4000000000d2e24f expand_function_start(tree_node*)
        ../../gcc/function.c:5254
0x40000000007bdabf execute
        ../../gcc/cfgexpand.c:6187

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-17  8:23                                                             ` Andreas Schwab
@ 2015-08-17  9:21                                                               ` Andreas Schwab
  2015-08-17 11:58                                                               ` Alexandre Oliva
  1 sibling, 0 replies; 127+ messages in thread
From: Andreas Schwab @ 2015-08-17  9:21 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Patrick Marlier, Jeff Law, James Greenhalgh, H.J. Lu,
	Segher Boessenkool, Richard Biener, GCC Patches, Christophe Lyon,
	David Edelsohn, Eric Botcazou

Andreas Schwab <schwab@linux-m68k.org> writes:

> Alexandre Oliva <aoliva@redhat.com> writes:
>
>> Would you be so kind as to give it a spin on a m68k native?  TIA,
>
> I tried it on ia64, and it falls flat on the floor.

It fixes the m68k failures, though.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-17  5:05                                                               ` Alexandre Oliva
@ 2015-08-17  9:29                                                                 ` Kyrill Tkachov
  2015-08-17 16:23                                                                   ` Andrew Pinski
  2015-08-18 16:18                                                                 ` Kyrill Tkachov
  1 sibling, 1 reply; 127+ messages in thread
From: Kyrill Tkachov @ 2015-08-17  9:29 UTC (permalink / raw)
  To: Alexandre Oliva, Andreas Schwab
  Cc: Patrick Marlier, Jeff Law, James Greenhalgh, H.J. Lu,
	Segher Boessenkool, Richard Biener, GCC Patches, Christophe Lyon,
	David Edelsohn, Eric Botcazou

Hi Alexandre,

On 17/08/15 03:56, Alexandre Oliva wrote:
> On Aug 16, 2015, Andreas Schwab <schwab@linux-m68k.org> wrote:
>
>> Alexandre Oliva <aoliva@redhat.com> writes:
>>> On Aug 15, 2015, Andreas Schwab <schwab@linux-m68k.org> wrote:
>>>
>>>> FAIL: gcc.target/aarch64/target_attr_crypto_ice_1.c (internal compiler error)
>>>> In file included from
>>>> /opt/gcc/gcc-20150815/gcc/testsuite/gcc.target/aarch64/target_attr_crypto_ice_1.c:4:0:
>>> Are you sure this is a regression introduced by my patch?
>> Yes, it reintroduces the ICE.
> Ugh.  I see this testcase was introduced very recently, so presumably it
> wasn't present in the tree that James Greenhalgh tested and confirmed
> there were no regressions.

Yeah, I introduced it as part of the SWITCHABLE_TARGET
work for aarch64. A bit of a mid-air collision :(

> The hack in aarch64-builtins.c looks risky IMHO.  Changing the mode of a
> decl after RTL is assigned to it (or to its SSA partitions) seems fishy.
> The assert is doing just what it was supposed to do.  The only surprise
> to me is that it didn't catch this unexpected and unsupported change
> before.
>
> Presumably if we just dropped the assert in expand_expr_real_1, this
> case would work just fine, although the unsignedp bit would be
> meaningless and thus confusing, since the subreg isn't about a
> promotion, but about reflecting the mode change that was made from under
> us.
>
> May I suggest that you guys find (or introduce) other means to change
> the layout and mode of the decl *before* RTL is assigned to the params?
> I think this would save us a ton of trouble down the road.  Just think
> how much trouble you'd get if the different modes had different calling
> conventions, alignment requirements, valid register assignments, or
> anything that might make coalescing their SSA names with those of other
> variables invalid.
>
I'm not familiar with the intricacies in this area but
I'll have a look.
Perhaps we can somehow re-layout the SIMD types when
switching from a non-simd to a simd target...
Can you, or Andreas please file a PR so we don't forget?

Thanks,
Kyrill

* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-17  8:23                                                             ` Andreas Schwab
  2015-08-17  9:21                                                               ` Andreas Schwab
@ 2015-08-17 11:58                                                               ` Alexandre Oliva
  1 sibling, 0 replies; 127+ messages in thread
From: Alexandre Oliva @ 2015-08-17 11:58 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Patrick Marlier, Jeff Law, James Greenhalgh, H.J. Lu,
	Segher Boessenkool, Richard Biener, GCC Patches, Christophe Lyon,
	David Edelsohn, Eric Botcazou

On Aug 17, 2015, Andreas Schwab <schwab@linux-m68k.org> wrote:

> Alexandre Oliva <aoliva@redhat.com> writes:
>> Would you be so kind as to give it a spin on a m68k native?  TIA,

> I tried it on ia64, and it falls flat on the floor.

Doh, I see a logic flaw in the patch I posted.  The hunk in
assign_parm_setup_stack that looked like this:

+	  if (from_expand)
+	    gcc_assert (GET_MODE (from_expand) == GET_MODE (data->entry_parm));
+	  else if (!parm_in_unassigned_mem_p (parm, from_expand))
+	    data->stack_parm = from_expand;

should look like this:

+	  if (from_expand)
+	    gcc_assert (GET_MODE (from_expand) == GET_MODE (data->entry_parm));
+	  if (from_expand && !parm_in_unassigned_mem_p (parm, from_expand))
+	    data->stack_parm = from_expand;

I'll give it some more testing before submitting a formal patch.

Meanwhile, thanks for confirming the m68k issues are fixed by that one;
this one shouldn't regress them; it would only fix the unintended crashes.
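
The difference between the two hunks can be sketched outside of GCC.  This
is only a minimal model, as I read the change: struct fake_rtx,
in_unassigned_mem and the two pick_stack_parm_* helpers are hypothetical
stand-ins, not GCC names, and the mode assertion is left as a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an rtx: it only records whether the MEM
   address is still waiting to be filled in.  */
struct fake_rtx
{
  bool address_unassigned;
};

/* Set when the predicate is handed a null pointer.  The sketch records
   that instead of dereferencing it.  */
static bool predicate_saw_null;

/* Hypothetical model of the predicate used in the hunk.  */
static bool
in_unassigned_mem (const struct fake_rtx *x)
{
  if (x == NULL)
    {
      predicate_saw_null = true;
      return true;
    }
  return x->address_unassigned;
}

/* Shape of the hunk as first posted: the else-if arm is reached only
   when from_expand is null, so a usable from_expand is never kept.  */
static const struct fake_rtx *
pick_stack_parm_posted (const struct fake_rtx *from_expand)
{
  const struct fake_rtx *stack_parm = NULL;
  if (from_expand)
    {
      /* The mode assertion would go here.  */
    }
  else if (!in_unassigned_mem (from_expand))
    stack_parm = from_expand;
  return stack_parm;
}

/* Shape of the corrected hunk: two independent tests, and the
   predicate only ever sees a non-null from_expand.  */
static const struct fake_rtx *
pick_stack_parm_fixed (const struct fake_rtx *from_expand)
{
  const struct fake_rtx *stack_parm = NULL;
  if (from_expand)
    {
      /* The mode assertion would go here.  */
    }
  if (from_expand && !in_unassigned_mem (from_expand))
    stack_parm = from_expand;
  return stack_parm;
}
```

In this model the posted form never returns from_expand, and calls the
predicate with a null pointer whenever expand assigned no rtl; the
corrected form returns from_expand exactly when its address is already
assigned.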

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-17  7:48                                                         ` Christophe Lyon
@ 2015-08-17 12:43                                                           ` Alexandre Oliva
  2015-08-17 13:39                                                             ` Christophe Lyon
  0 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-08-17 12:43 UTC (permalink / raw)
  To: Christophe Lyon
  Cc: Patrick Marlier, Jeff Law, James Greenhalgh, H.J. Lu,
	Segher Boessenkool, Richard Biener, GCC Patches, David Edelsohn,
	Eric Botcazou

On Aug 17, 2015, Christophe Lyon <christophe.lyon@linaro.org> wrote:

> Since this was committed (r226901), I can see that the compiler build
> fails for armeb targets, when building libgcc:

Any chance you could get me a preprocessed testcase for this failure, please?

Thanks in advance,

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-17 12:43                                                           ` Alexandre Oliva
@ 2015-08-17 13:39                                                             ` Christophe Lyon
  2015-08-18  6:53                                                               ` Alexandre Oliva
  0 siblings, 1 reply; 127+ messages in thread
From: Christophe Lyon @ 2015-08-17 13:39 UTC (permalink / raw)
  To: Alexandre Oliva, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 800 bytes --]

On 17 August 2015 at 13:58, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Aug 17, 2015, Christophe Lyon <christophe.lyon@linaro.org> wrote:
>
>> Since this was committed (r226901), I can see that the compiler build
>> fails for armeb targets, when building libgcc:
>
> Any chance you could get me a preprocessed testcase for this failure, please?
>
Yes, here it is, attached.

My gcc is configured with:
--target=armeb-linux-gnueabihf --with-mode=arm --with-cpu=cortex-a9
--with-fpu=neon

Thanks,

Christophe.

> Thanks in advance,
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

[-- Attachment #2: fixed-bit.i.xz --]
[-- Type: application/force-download, Size: 14204 bytes --]

* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-17  9:29                                                                 ` Kyrill Tkachov
@ 2015-08-17 16:23                                                                   ` Andrew Pinski
  0 siblings, 0 replies; 127+ messages in thread
From: Andrew Pinski @ 2015-08-17 16:23 UTC (permalink / raw)
  To: Kyrill Tkachov
  Cc: Alexandre Oliva, Andreas Schwab, Patrick Marlier, Jeff Law,
	James Greenhalgh, H.J. Lu, Segher Boessenkool, Richard Biener,
	GCC Patches, Christophe Lyon, David Edelsohn, Eric Botcazou

On Mon, Aug 17, 2015 at 5:20 PM, Kyrill Tkachov
<kyrylo.tkachov@foss.arm.com> wrote:
> Hi Alexandre,
>
> On 17/08/15 03:56, Alexandre Oliva wrote:
>>
>> On Aug 16, 2015, Andreas Schwab <schwab@linux-m68k.org> wrote:
>>
>>> Alexandre Oliva <aoliva@redhat.com> writes:
>>>>
>>>> On Aug 15, 2015, Andreas Schwab <schwab@linux-m68k.org> wrote:
>>>>
>>>>> FAIL: gcc.target/aarch64/target_attr_crypto_ice_1.c (internal compiler
>>>>> error)
>>>>> In file included from
>>>>>
>>>>> /opt/gcc/gcc-20150815/gcc/testsuite/gcc.target/aarch64/target_attr_crypto_ice_1.c:4:0:
>>>>
>>>> Are you sure this is a regression introduced by my patch?
>>>
>>> Yes, it reintroduces the ICE.
>>
>> Ugh.  I see this testcase was introduced very recently, so presumably it
>> wasn't present in the tree that James Greenhalgh tested and confirmed
>> there were no regressions.
>
>
> Yeah, I introduced it as part of the SWITCHABLE_TARGET
> work for aarch64. A bit of a mid-air collision :(
>
>> The hack in aarch64-builtins.c looks risky IMHO.  Changing the mode of a
>> decl after RTL is assigned to it (or to its SSA partitions) seems fishy.
>> The assert is doing just what it was supposed to do.  The only surprise
>> to me is that it didn't catch this unexpected and unsupported change
>> before.
>>
>> Presumably if we just dropped the assert in expand_expr_real_1, this
>> case would work just fine, although the unsignedp bit would be
>> meaningless and thus confusing, since the subreg isn't about a
>> promotion, but about reflecting the mode change that was made from under
>> us.
>>
>> May I suggest that you guys find (or introduce) other means to change
>> the layout and mode of the decl *before* RTL is assigned to the params?
>> I think this would save us a ton of trouble down the road.  Just think
>> how much trouble you'd get if the different modes had different calling
>> conventions, alignment requirements, valid register assignments, or
>> anything that might make coalescing their SSA names with those of other
>> variables invalid.
>>
> I'm not familiar with the intricacies in this area but
> I'll have a look.
> Perhaps we can somehow re-layout the SIMD types when
> switching from a non-simd to a simd target...
> Can you, or Andreas please file a PR so we don't forget?

How does x86 handle this case?  Because it should be handling this case somehow.

Thanks,
Andrew


>
> Thanks,
> Kyrill

* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-17 13:39                                                             ` Christophe Lyon
@ 2015-08-18  6:53                                                               ` Alexandre Oliva
  2015-08-19  6:50                                                                 ` Alexandre Oliva
  0 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-08-18  6:53 UTC (permalink / raw)
  To: Christophe Lyon; +Cc: GCC Patches

On Aug 17, 2015, Christophe Lyon <christophe.lyon@linaro.org> wrote:

> On 17 August 2015 at 13:58, Alexandre Oliva <aoliva@redhat.com> wrote:
>> On Aug 17, 2015, Christophe Lyon <christophe.lyon@linaro.org> wrote:
>> 
>>> Since this was committed (r226901), I can see that the compiler build
>>> fails for armeb targets, when building libgcc:
>> 
>> Any chance you could get me a preprocessed testcase for this failure, please?
>> 
> Yes, here it is, attached.

Thanks.

This patch fixes this particular case.  I'll also add this configuration
to the cross build tests I'm going to rerun shortly, before submitting a
followup formally, to see whether other non-MEM mems need to be handled
explicitly.


--- a/gcc/function.c
+++ b/gcc/function.c
@@ -3017,6 +3017,11 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
       else if (size == 0)
 	;
 
+      /* MEM may be a REG if coalescing assigns the param's partition
+	 to a pseudo.  */
+      else if (REG_P (mem))
+	emit_move_insn (mem, entry_parm);
+
       /* If SIZE is that of a mode no bigger than a word, just use
 	 that mode's store operation.  */
       else if (size <= UNITS_PER_WORD)
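
The ordering this hunk sets up can be modelled on its own.  This is a
rough sketch under stated assumptions, not the code in function.c: the
enum labels and choose_copy_strategy are hypothetical names, the chain in
the real function has earlier arms that are not shown in the hunk, and
whatever follows the word-sized store is lumped into a single case.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the arms of the chain; not GCC names.  */
enum copy_strategy
{
  COPY_NOTHING,      /* size == 0: nothing to store */
  COPY_REG_MOVE,     /* destination is a pseudo: a plain move */
  COPY_WORD_STORE,   /* no bigger than a word: that mode's store */
  COPY_LATER_CASES   /* whatever arms follow in the real chain */
};

/* Pick an arm in the same order as the patched chain.  The register
   test has to precede the size test so that a pseudo destination is
   never treated as a memory destination.  */
static enum copy_strategy
choose_copy_strategy (unsigned size, bool dest_is_reg,
		      unsigned units_per_word)
{
  if (size == 0)
    return COPY_NOTHING;
  else if (dest_is_reg)
    return COPY_REG_MOVE;
  else if (size <= units_per_word)
    return COPY_WORD_STORE;
  else
    return COPY_LATER_CASES;
}
```

For example, a word-sized parameter whose partition was coalesced into a
pseudo takes the move arm rather than the word-sized store arm.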


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-17  5:05                                                               ` Alexandre Oliva
  2015-08-17  9:29                                                                 ` Kyrill Tkachov
@ 2015-08-18 16:18                                                                 ` Kyrill Tkachov
  1 sibling, 0 replies; 127+ messages in thread
From: Kyrill Tkachov @ 2015-08-18 16:18 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: GCC Patches


On 17/08/15 03:56, Alexandre Oliva wrote:
> On Aug 16, 2015, Andreas Schwab <schwab@linux-m68k.org> wrote:
>
>> Alexandre Oliva <aoliva@redhat.com> writes:
>>> On Aug 15, 2015, Andreas Schwab <schwab@linux-m68k.org> wrote:
>>>
>>>> FAIL: gcc.target/aarch64/target_attr_crypto_ice_1.c (internal compiler error)
>>>> In file included from
>>>> /opt/gcc/gcc-20150815/gcc/testsuite/gcc.target/aarch64/target_attr_crypto_ice_1.c:4:0:
>>> Are you sure this is a regression introduced by my patch?
>> Yes, it reintroduces the ICE.
> Ugh.  I see this testcase was introduced very recently, so presumably it
> wasn't present in the tree that James Greenhalgh tested and confirmed
> there were no regressions.
>
> The hack in aarch64-builtins.c looks risky IMHO.  Changing the mode of a
> decl after RTL is assigned to it (or to its SSA partitions) seems fishy.
> The assert is doing just what it was supposed to do.  The only surprise
> to me is that it didn't catch this unexpected and unsupported change
> before.
>
> Presumably if we just dropped the assert in expand_expr_real_1, this
> case would work just fine, although the unsignedp bit would be
> meaningless and thus confusing, since the subreg isn't about a
> promotion, but about reflecting the mode change that was made from under
> us.
>
> May I suggest that you guys find (or introduce) other means to change
> the layout and mode of the decl *before* RTL is assigned to the params?

Hmm, if I re-lay out the param decls in TARGET_SET_CURRENT_FUNCTION,
which is called fairly early on to set up cfun, then it seems to work.
Will do some more testing...


> I think this would save us a ton of trouble down the road.  Just think
> how much trouble you'd get if the different modes had different calling
> conventions, alignment requirements, valid register assignments, or
> anything that might make coalescing their SSA names with those of other
> variables invalid.
>

* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-18  6:53                                                               ` Alexandre Oliva
@ 2015-08-19  6:50                                                                 ` Alexandre Oliva
  2015-08-19 10:17                                                                   ` Richard Biener
  2015-08-19 13:35                                                                   ` Andreas Schwab
  0 siblings, 2 replies; 127+ messages in thread
From: Alexandre Oliva @ 2015-08-19  6:50 UTC (permalink / raw)
  To: Christophe Lyon, Andreas Schwab
  Cc: GCC Patches, Patrick Marlier, Jeff Law, James Greenhalgh,
	H.J. Lu, Segher Boessenkool, Richard Biener, GCC Patches,
	Christophe Lyon, David Edelsohn, Eric Botcazou

On Aug 18, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:

>>> On Aug 17, 2015, Christophe Lyon <christophe.lyon@linaro.org> wrote:
>>>> Since this was committed (r226901), I can see that the compiler build
>>>> fails for armeb targets, when building libgcc:

> This patch fixes this particular case.  I'll also add this configuration
> to the cross build tests I'm going to rerun shortly, before submitting a
> followup formally, to see whether other non-MEM mems need to be handled
> explicitly.

On Aug 17, 2015, Andreas Schwab <schwab@linux-m68k.org> wrote:

> Andreas Schwab <schwab@linux-m68k.org> writes:

>> Alexandre Oliva <aoliva@redhat.com> writes:
>> 
>>> Would you be so kind as to give it a spin on a m68k native?  TIA,
>> 
>> I tried it on ia64, and it falls flat on the floor.

> It fixes the m68k failures, though.

On Aug 17, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:

>> I tried it on ia64, and it falls flat on the floor.

> Doh, I see a logic flaw in the patch I posted.

There were other shortcomings in the snippets I posted before, revealed
by testing on various other targets: remaining BLKmode asserts,
failure to deal with parms without a default def and split complex args
with an unassigned stack address.  This patch fixes them all.

It was regstrapped on x86_64-linux-gnu, i686-linux-gnu, ppc64-linux-gnu,
ppc64el-linux-gnu, and further tested with a compile-only 'make all' on
a binutils+gcc+newlib tree on all tens of cross targets mentioned
before, plus the armeb configuration Christophe mentioned.

Ok to install?


[PR64164] fix regressions reported on m68k and armeb

From: Alexandre Oliva <aoliva@redhat.com>

Defer stack slot address assignment for all parms that can't live in
pseudos, and accept pseudo assignments in assign_parm_setup_block.

for  gcc/ChangeLog

	PR rtl-optimization/64164
	* cfgexpand.c (parm_maybe_byref_p): Renamed to...
	(parm_in_stack_slot_p): ... this.  Disregard mode, what
	matters is whether the parm will live in a pseudo or a stack
	slot.
	(expand_one_ssa_partition): Deal with params without a default
	def.  Disregard mode.
	* cfgexpand.h: Renamed function declaration.
	* tree-ssa-coalesce.c: Adjust.
	* function.c (split_complex_args): Allocate stack slot for
	unassigned parms before splitting.
	(parm_in_unassigned_mem_p): New.  Use it instead of
	parm_maybe_byref_p throughout this file.
	(assign_parm_setup_block): Use it.  Accept pseudos in the
	expand-assigned rtl.
	(assign_parm_setup_reg): Drop BLKmode requirement.
	(assign_parm_setup_stack): Allocate and fill in the address of
	unassigned MEM parms.
---
 gcc/cfgexpand.c         |   44 ++++++++++++++++++++++------
 gcc/cfgexpand.h         |    2 +
 gcc/function.c          |   74 ++++++++++++++++++++++++++++++++++++++++-------
 gcc/tree-ssa-coalesce.c |    4 +--
 4 files changed, 100 insertions(+), 24 deletions(-)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 0bc20f6..d567a87 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -172,17 +172,23 @@ leader_merge (tree cur, tree next)
   return cur;
 }
 
-/* Return true if VAR is a PARM_DECL or a RESULT_DECL of type BLKmode.
+/* Return true if VAR is a PARM_DECL or a RESULT_DECL that ought to be
+   assigned to a stack slot.  We can't have expand_one_ssa_partition
+   choose their address: the pseudo holding the address would be set
+   up too late for assign_params to copy the parameter if needed.
+
    Such parameters are likely passed as a pointer to the value, rather
    than as a value, and so we must not coalesce them, nor allocate
    stack space for them before determining the calling conventions for
-   them.  For their SSA_NAMEs, expand_one_ssa_partition emits RTL as
-   MEMs with pc_rtx as the address, and then it replaces the pc_rtx
-   with NULL so as to make sure the MEM is not used before it is
-   adjusted in assign_parm_setup_reg.  */
+   them.
+
+   For their SSA_NAMEs, expand_one_ssa_partition emits RTL as MEMs
+   with pc_rtx as the address, and then it replaces the pc_rtx with
+   NULL so as to make sure the MEM is not used before it is adjusted
+   in assign_parm_setup_reg.  */
 
 bool
-parm_maybe_byref_p (tree var)
+parm_in_stack_slot_p (tree var)
 {
   if (!var || VAR_P (var))
     return false;
@@ -190,7 +196,7 @@ parm_maybe_byref_p (tree var)
   gcc_assert (TREE_CODE (var) == PARM_DECL
 	      || TREE_CODE (var) == RESULT_DECL);
 
-  return TYPE_MODE (TREE_TYPE (var)) == BLKmode;
+  return !use_register_for_decl (var);
 }
 
 /* Return the partition of the default SSA_DEF for decl VAR.  */
@@ -1343,17 +1349,35 @@ expand_one_ssa_partition (tree var)
 
   if (!use_register_for_decl (var))
     {
-      if (parm_maybe_byref_p (SSA_NAME_VAR (var))
-	  && ssa_default_def_partition (SSA_NAME_VAR (var)) == part)
+      /* We can't risk having the parm assigned to a MEM location
+	 whose address references a pseudo, for the pseudo will only
+	 be set up after arguments are copied to the stack slot.
+
+	 If the parm doesn't have a default def (e.g., because its
+	 incoming value is unused), then we want to let assign_params
+	 do the allocation, too.  In this case we want to make sure
+	 SSA_NAMEs associated with the parm don't get assigned to more
+	 than one partition, lest we'd create two unassigned stack
+	 slots for the same parm, thus the assert at the end of the
+	 block.  */
+      if (parm_in_stack_slot_p (SSA_NAME_VAR (var))
+	  && (ssa_default_def_partition (SSA_NAME_VAR (var)) == part
+	      || !ssa_default_def (cfun, SSA_NAME_VAR (var))))
 	{
 	  expand_one_stack_var_at (var, pc_rtx, 0, 0);
 	  rtx x = SA.partition_to_pseudo[part];
 	  gcc_assert (GET_CODE (x) == MEM);
-	  gcc_assert (GET_MODE (x) == BLKmode);
 	  gcc_assert (XEXP (x, 0) == pc_rtx);
 	  /* Reset the address, so that any attempt to use it will
 	     ICE.  It will be adjusted in assign_parm_setup_reg.  */
 	  XEXP (x, 0) = NULL_RTX;
+	  /* If the RTL associated with the parm is not what we have
+	     just created, the parm has been split over multiple
+	     partitions.  In order for this to work, we must have a
+	     default def for the parm, otherwise assign_params won't
+	     know what to do.  */
+	  gcc_assert (DECL_RTL_IF_SET (SSA_NAME_VAR (var)) == x
+		      || ssa_default_def (cfun, SSA_NAME_VAR (var)));
 	}
       else if (defer_stack_allocation (var, true))
 	add_stack_var (var);
diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
index 987cf356..d168672 100644
--- a/gcc/cfgexpand.h
+++ b/gcc/cfgexpand.h
@@ -22,7 +22,7 @@ along with GCC; see the file COPYING3.  If not see
 
 extern tree gimple_assign_rhs_to_tree (gimple);
 extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
-extern bool parm_maybe_byref_p (tree);
+extern bool parm_in_stack_slot_p (tree);
 extern rtx get_rtl_for_parm_ssa_default_def (tree var);
 
 
diff --git a/gcc/function.c b/gcc/function.c
index 715c19f..222c76f 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -153,6 +153,7 @@ static void do_clobber_return_reg (rtx, void *);
 static void do_use_return_reg (rtx, void *);
 static rtx rtl_for_parm (struct assign_parm_data_all *, tree);
 static void maybe_reset_rtl_for_parm (tree);
+static bool parm_in_unassigned_mem_p (tree, rtx);
 
 \f
 /* Stack of nested functions.  */
@@ -2326,6 +2327,22 @@ split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
 	      rtx rtl = rtl_for_parm (all, cparm);
 	      if (rtl)
 		{
+		  /* If this parm is unassigned, assign it now: the
+		     newly-created decls wouldn't expect the need for
+		     assignment, and if they were assigned
+		     independently, they might not end up in adjacent
+		     slots, so unsplit wouldn't be able to fill in the
+		     unassigned address of the complex MEM.  */
+		  if (parm_in_unassigned_mem_p (cparm, rtl))
+		    {
+		      int align = STACK_SLOT_ALIGNMENT
+			(TREE_TYPE (cparm), GET_MODE (rtl), MEM_ALIGN (rtl));
+		      rtx loc = assign_stack_local
+			(GET_MODE (rtl), GET_MODE_SIZE (GET_MODE (rtl)),
+			 align);
+		      XEXP (rtl, 0) = XEXP (loc, 0);
+		    }
+
 		  SET_DECL_RTL (p, read_complex_part (rtl, false));
 		  SET_DECL_RTL (decl, read_complex_part (rtl, true));
 
@@ -2934,6 +2951,27 @@ assign_parm_setup_block_p (struct assign_parm_data_one *data)
   return false;
 }
 
+/* Return true if FROM_EXPAND is a MEM with an address to be filled in
+   by assign_params.  This should be the case if, and only if,
+   parm_in_stack_slot_p holds for the parm DECL that expanded to
+   FROM_EXPAND, so we check that, too.  */
+
+static bool
+parm_in_unassigned_mem_p (tree decl, rtx from_expand)
+{
+  bool result = MEM_P (from_expand) && !XEXP (from_expand, 0);
+
+  gcc_assert (result == parm_in_stack_slot_p (decl)
+	      /* Maybe it was already assigned.  That's ok, especially
+		 for split complex args.  */
+	      || (!result && MEM_P (from_expand)
+		  && (XEXP (from_expand, 0) == virtual_stack_vars_rtx
+		      || (GET_CODE (XEXP (from_expand, 0)) == PLUS
+			  && XEXP (XEXP (from_expand, 0), 0) == virtual_stack_vars_rtx))));
+
+  return result;
+}
+
 /* A subroutine of assign_parms.  Arrange for the parameter to be
    present and valid in DATA->STACK_RTL.  */
 
@@ -2956,8 +2994,7 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
     {
       DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
       rtx from_expand = rtl_for_parm (all, parm);
-      if (from_expand && (!parm_maybe_byref_p (parm)
-			  || XEXP (from_expand, 0) != NULL_RTX))
+      if (from_expand && !parm_in_unassigned_mem_p (parm, from_expand))
 	stack_parm = copy_rtx (from_expand);
       else
 	{
@@ -2968,8 +3005,7 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
 	  if (from_expand)
 	    {
 	      gcc_assert (GET_CODE (stack_parm) == MEM);
-	      gcc_assert (GET_CODE (from_expand) == MEM);
-	      gcc_assert (XEXP (from_expand, 0) == NULL_RTX);
+	      gcc_assert (parm_in_unassigned_mem_p (parm, from_expand));
 	      XEXP (from_expand, 0) = XEXP (stack_parm, 0);
 	      PUT_MODE (from_expand, GET_MODE (stack_parm));
 	      stack_parm = copy_rtx (from_expand);
@@ -3017,6 +3053,11 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
       else if (size == 0)
 	;
 
+      /* MEM may be a REG if coalescing assigns the param's partition
+	 to a pseudo.  */
+      else if (REG_P (mem))
+	emit_move_insn (mem, entry_parm);
+
       /* If SIZE is that of a mode no bigger than a word, just use
 	 that mode's store operation.  */
       else if (size <= UNITS_PER_WORD)
@@ -3121,7 +3162,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
       if (GET_MODE (parmreg) != promoted_nominal_mode)
 	parmreg = gen_lowpart (promoted_nominal_mode, parmreg);
     }
-  else if (!from_expand || parm_maybe_byref_p (parm))
+  else if (!from_expand || parm_in_unassigned_mem_p (parm, from_expand))
     {
       parmreg = gen_reg_rtx (promoted_nominal_mode);
       if (!DECL_ARTIFICIAL (parm))
@@ -3131,7 +3172,6 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 	{
 	  gcc_assert (data->passed_pointer);
 	  gcc_assert (GET_CODE (from_expand) == MEM
-		      && GET_MODE (from_expand) == BLKmode
 		      && XEXP (from_expand, 0) == NULL_RTX);
 	  XEXP (from_expand, 0) = parmreg;
 	}
@@ -3349,7 +3389,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 	  did_conversion = true;
 	}
       else if (GET_MODE (parmreg) == BLKmode)
-	gcc_assert (parm_maybe_byref_p (parm));
+	gcc_assert (parm_in_stack_slot_p (parm));
       else
 	emit_move_insn (parmreg, src);
 
@@ -3455,12 +3495,15 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
   if (data->entry_parm != data->stack_parm)
     {
       rtx src, dest;
+      rtx from_expand = NULL_RTX;
 
       if (data->stack_parm == 0)
 	{
-	  rtx x = data->stack_parm = rtl_for_parm (all, parm);
-	  if (x)
-	    gcc_assert (GET_MODE (x) == GET_MODE (data->entry_parm));
+	  from_expand = rtl_for_parm (all, parm);
+	  if (from_expand)
+	    gcc_assert (GET_MODE (from_expand) == GET_MODE (data->entry_parm));
+	  if (from_expand && !parm_in_unassigned_mem_p (parm, from_expand))
+	    data->stack_parm = from_expand;
 	}
 
       if (data->stack_parm == 0)
@@ -3472,7 +3515,16 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
 	    = assign_stack_local (GET_MODE (data->entry_parm),
 				  GET_MODE_SIZE (GET_MODE (data->entry_parm)),
 				  align);
-	  set_mem_attributes (data->stack_parm, parm, 1);
+	  if (!from_expand)
+	    set_mem_attributes (data->stack_parm, parm, 1);
+	  else
+	    {
+	      gcc_assert (GET_CODE (data->stack_parm) == MEM);
+	      gcc_assert (parm_in_unassigned_mem_p (parm, from_expand));
+	      XEXP (from_expand, 0) = XEXP (data->stack_parm, 0);
+	      PUT_MODE (from_expand, GET_MODE (data->stack_parm));
+	      data->stack_parm = copy_rtx (from_expand);
+	    }
 	}
 
       dest = validize_mem (copy_rtx (data->stack_parm));
diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index 08ce72c..6468012 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -1386,8 +1386,8 @@ gimple_can_coalesce_p (tree name1, tree name2)
 	 because it may be passed by reference.  */
       return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
 	|| (/* The case var1 == var2 is already covered above.  */
-	    !parm_maybe_byref_p (var1)
-	    && !parm_maybe_byref_p (var2)
+	    !parm_in_stack_slot_p (var1)
+	    && !parm_in_stack_slot_p (var2)
 	    && promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL));
     }
 


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-19  6:50                                                                 ` Alexandre Oliva
@ 2015-08-19 10:17                                                                   ` Richard Biener
  2015-08-19 13:35                                                                   ` Andreas Schwab
  1 sibling, 0 replies; 127+ messages in thread
From: Richard Biener @ 2015-08-19 10:17 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Christophe Lyon, Andreas Schwab, GCC Patches, Patrick Marlier,
	Jeff Law, James Greenhalgh, H.J. Lu, Segher Boessenkool,
	David Edelsohn, Eric Botcazou

On Wed, Aug 19, 2015 at 8:45 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Aug 18, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
>
>>>> On Aug 17, 2015, Christophe Lyon <christophe.lyon@linaro.org> wrote:
>>>>> Since this was committed (r226901), I can see that the compiler build
>>>>> fails for armeb targets, when building libgcc:
>
>> This patch fixes this particular case.  I'll also add this configuration
>> to the cross build tests I'm going to rerun shortly, before submitting a
>> followup formally, to see whether other non-MEM mems need to be handled
>> explicitly.
>
> On Aug 17, 2015, Andreas Schwab <schwab@linux-m68k.org> wrote:
>
>> Andreas Schwab <schwab@linux-m68k.org> writes:
>
>>> Alexandre Oliva <aoliva@redhat.com> writes:
>>>
>>>> Would you be so kind as to give it a spin on a m68k native?  TIA,
>>>
>>> I tried it on ia64, and it falls flat on the floor.
>
>> It fixes the m68k failures, though.
>
> On Aug 17, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
>
>>> I tried it on ia64, and it falls flat on the floor.
>
>> Doh, I see a logic flaw in the patch I posted.
>
> There were other shortcomings in the snippets I posted before, revealed
> by testing on various other targets: remaining BLKmode asserts,
> failure to deal with parms without a default def and split complex args
> with an unassigned stack address.  This patch fixes them all.
>
> It was regstrapped on x86_64-linux-gnu, i686-linux-gnu, ppc64-linux-gnu,
> ppc64el-linux-gnu, and further tested with a compile-only 'make all' on
> a binutils+gcc+newlib tree on all tens of cross targets mentioned
> before, plus the armeb configuration Christophe mentioned.
>
> Ok to install?

Ok.

Thanks,
Richard.

>
> [PR64164] fix regressions reported on m68k and armeb
>
> From: Alexandre Oliva <aoliva@redhat.com>
>
> Defer stack slot address assignment for all parms that can't live in
> pseudos, and accept pseudo assignments in assign_parm_setup_block.
>
> for  gcc/ChangeLog
>
>         PR rtl-optimization/64164
>         * cfgexpand.c (parm_maybe_byref_p): Renamed to...
>         (parm_in_stack_slot_p): ... this.  Disregard mode, what
>         matters is whether the parm will live in a pseudo or a stack
>         slot.
>         (expand_one_ssa_partition): Deal with params without a default
>         def.  Disregard mode.
>         * cfgexpand.h: Renamed function declaration.
>         * tree-ssa-coalesce.c: Adjust.
>         * function.c (split_complex_args): Allocate stack slot for
>         unassigned parms before splitting.
>         (parm_in_unassigned_mem_p): New.  Use it instead of
>         parm_maybe_byref_p throughout this file.
>         (assign_parm_setup_block): Use it.  Accept pseudos in the
>         expand-assigned rtl.
>         (assign_parm_setup_reg): Drop BLKmode requirement.
>         (assign_parm_setup_stack): Allocate and fill in the address of
>         unassigned MEM parms.
> ---
>  gcc/cfgexpand.c         |   44 ++++++++++++++++++++++------
>  gcc/cfgexpand.h         |    2 +
>  gcc/function.c          |   74 ++++++++++++++++++++++++++++++++++++++++-------
>  gcc/tree-ssa-coalesce.c |    4 +--
>  4 files changed, 100 insertions(+), 24 deletions(-)
>
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index 0bc20f6..d567a87 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -172,17 +172,23 @@ leader_merge (tree cur, tree next)
>    return cur;
>  }
>
> -/* Return true if VAR is a PARM_DECL or a RESULT_DECL of type BLKmode.
> +/* Return true if VAR is a PARM_DECL or a RESULT_DECL that ought to be
> +   assigned to a stack slot.  We can't have expand_one_ssa_partition
> +   choose their address: the pseudo holding the address would be set
> +   up too late for assign_params to copy the parameter if needed.
> +
>     Such parameters are likely passed as a pointer to the value, rather
>     than as a value, and so we must not coalesce them, nor allocate
>     stack space for them before determining the calling conventions for
> -   them.  For their SSA_NAMEs, expand_one_ssa_partition emits RTL as
> -   MEMs with pc_rtx as the address, and then it replaces the pc_rtx
> -   with NULL so as to make sure the MEM is not used before it is
> -   adjusted in assign_parm_setup_reg.  */
> +   them.
> +
> +   For their SSA_NAMEs, expand_one_ssa_partition emits RTL as MEMs
> +   with pc_rtx as the address, and then it replaces the pc_rtx with
> +   NULL so as to make sure the MEM is not used before it is adjusted
> +   in assign_parm_setup_reg.  */
>
>  bool
> -parm_maybe_byref_p (tree var)
> +parm_in_stack_slot_p (tree var)
>  {
>    if (!var || VAR_P (var))
>      return false;
> @@ -190,7 +196,7 @@ parm_maybe_byref_p (tree var)
>    gcc_assert (TREE_CODE (var) == PARM_DECL
>               || TREE_CODE (var) == RESULT_DECL);
>
> -  return TYPE_MODE (TREE_TYPE (var)) == BLKmode;
> +  return !use_register_for_decl (var);
>  }
>
>  /* Return the partition of the default SSA_DEF for decl VAR.  */
> @@ -1343,17 +1349,35 @@ expand_one_ssa_partition (tree var)
>
>    if (!use_register_for_decl (var))
>      {
> -      if (parm_maybe_byref_p (SSA_NAME_VAR (var))
> -         && ssa_default_def_partition (SSA_NAME_VAR (var)) == part)
> +      /* We can't risk having the parm assigned to a MEM location
> +        whose address references a pseudo, for the pseudo will only
> +        be set up after arguments are copied to the stack slot.
> +
> +        If the parm doesn't have a default def (e.g., because its
> +        incoming value is unused), then we want to let assign_params
> +        do the allocation, too.  In this case we want to make sure
> +        SSA_NAMEs associated with the parm don't get assigned to more
> +        than one partition, lest we'd create two unassigned stack
> +        slots for the same parm, thus the assert at the end of the
> +        block.  */
> +      if (parm_in_stack_slot_p (SSA_NAME_VAR (var))
> +         && (ssa_default_def_partition (SSA_NAME_VAR (var)) == part
> +             || !ssa_default_def (cfun, SSA_NAME_VAR (var))))
>         {
>           expand_one_stack_var_at (var, pc_rtx, 0, 0);
>           rtx x = SA.partition_to_pseudo[part];
>           gcc_assert (GET_CODE (x) == MEM);
> -         gcc_assert (GET_MODE (x) == BLKmode);
>           gcc_assert (XEXP (x, 0) == pc_rtx);
>           /* Reset the address, so that any attempt to use it will
>              ICE.  It will be adjusted in assign_parm_setup_reg.  */
>           XEXP (x, 0) = NULL_RTX;
> +         /* If the RTL associated with the parm is not what we have
> +            just created, the parm has been split over multiple
> +            partitions.  In order for this to work, we must have a
> +            default def for the parm, otherwise assign_params won't
> +            know what to do.  */
> +         gcc_assert (DECL_RTL_IF_SET (SSA_NAME_VAR (var)) == x
> +                     || ssa_default_def (cfun, SSA_NAME_VAR (var)));
>         }
>        else if (defer_stack_allocation (var, true))
>         add_stack_var (var);
> diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
> index 987cf356..d168672 100644
> --- a/gcc/cfgexpand.h
> +++ b/gcc/cfgexpand.h
> @@ -22,7 +22,7 @@ along with GCC; see the file COPYING3.  If not see
>
>  extern tree gimple_assign_rhs_to_tree (gimple);
>  extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
> -extern bool parm_maybe_byref_p (tree);
> +extern bool parm_in_stack_slot_p (tree);
>  extern rtx get_rtl_for_parm_ssa_default_def (tree var);
>
>
> diff --git a/gcc/function.c b/gcc/function.c
> index 715c19f..222c76f 100644
> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -153,6 +153,7 @@ static void do_clobber_return_reg (rtx, void *);
>  static void do_use_return_reg (rtx, void *);
>  static rtx rtl_for_parm (struct assign_parm_data_all *, tree);
>  static void maybe_reset_rtl_for_parm (tree);
> +static bool parm_in_unassigned_mem_p (tree, rtx);
>
>
>  /* Stack of nested functions.  */
> @@ -2326,6 +2327,22 @@ split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
>               rtx rtl = rtl_for_parm (all, cparm);
>               if (rtl)
>                 {
> +                 /* If this parm is unassigned, assign it now: the
> +                    newly-created decls wouldn't expect the need for
> +                    assignment, and if they were assigned
> +                    independently, they might not end up in adjacent
> +                    slots, so unsplit wouldn't be able to fill in the
> +                    unassigned address of the complex MEM.  */
> +                 if (parm_in_unassigned_mem_p (cparm, rtl))
> +                   {
> +                     int align = STACK_SLOT_ALIGNMENT
> +                       (TREE_TYPE (cparm), GET_MODE (rtl), MEM_ALIGN (rtl));
> +                     rtx loc = assign_stack_local
> +                       (GET_MODE (rtl), GET_MODE_SIZE (GET_MODE (rtl)),
> +                        align);
> +                     XEXP (rtl, 0) = XEXP (loc, 0);
> +                   }
> +
>                   SET_DECL_RTL (p, read_complex_part (rtl, false));
>                   SET_DECL_RTL (decl, read_complex_part (rtl, true));
>
> @@ -2934,6 +2951,27 @@ assign_parm_setup_block_p (struct assign_parm_data_one *data)
>    return false;
>  }
>
> +/* Return true if FROM_EXPAND is a MEM with an address to be filled in
> +   by assign_params.  This should be the case if, and only if,
> +   parm_in_stack_slot_p holds for the parm DECL that expanded to
> +   FROM_EXPAND, so we check that, too.  */
> +
> +static bool
> +parm_in_unassigned_mem_p (tree decl, rtx from_expand)
> +{
> +  bool result = MEM_P (from_expand) && !XEXP (from_expand, 0);
> +
> +  gcc_assert (result == parm_in_stack_slot_p (decl)
> +             /* Maybe it was already assigned.  That's ok, especially
> +                for split complex args.  */
> +             || (!result && MEM_P (from_expand)
> +                 && (XEXP (from_expand, 0) == virtual_stack_vars_rtx
> +                     || (GET_CODE (XEXP (from_expand, 0)) == PLUS
> +                         && XEXP (XEXP (from_expand, 0), 0) == virtual_stack_vars_rtx))));
> +
> +  return result;
> +}
> +
>  /* A subroutine of assign_parms.  Arrange for the parameter to be
>     present and valid in DATA->STACK_RTL.  */
>
> @@ -2956,8 +2994,7 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>      {
>        DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
>        rtx from_expand = rtl_for_parm (all, parm);
> -      if (from_expand && (!parm_maybe_byref_p (parm)
> -                         || XEXP (from_expand, 0) != NULL_RTX))
> +      if (from_expand && !parm_in_unassigned_mem_p (parm, from_expand))
>         stack_parm = copy_rtx (from_expand);
>        else
>         {
> @@ -2968,8 +3005,7 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>           if (from_expand)
>             {
>               gcc_assert (GET_CODE (stack_parm) == MEM);
> -             gcc_assert (GET_CODE (from_expand) == MEM);
> -             gcc_assert (XEXP (from_expand, 0) == NULL_RTX);
> +             gcc_assert (parm_in_unassigned_mem_p (parm, from_expand));
>               XEXP (from_expand, 0) = XEXP (stack_parm, 0);
>               PUT_MODE (from_expand, GET_MODE (stack_parm));
>               stack_parm = copy_rtx (from_expand);
> @@ -3017,6 +3053,11 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>        else if (size == 0)
>         ;
>
> +      /* MEM may be a REG if coalescing assigns the param's partition
> +        to a pseudo.  */
> +      else if (REG_P (mem))
> +       emit_move_insn (mem, entry_parm);
> +
>        /* If SIZE is that of a mode no bigger than a word, just use
>          that mode's store operation.  */
>        else if (size <= UNITS_PER_WORD)
> @@ -3121,7 +3162,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>        if (GET_MODE (parmreg) != promoted_nominal_mode)
>         parmreg = gen_lowpart (promoted_nominal_mode, parmreg);
>      }
> -  else if (!from_expand || parm_maybe_byref_p (parm))
> +  else if (!from_expand || parm_in_unassigned_mem_p (parm, from_expand))
>      {
>        parmreg = gen_reg_rtx (promoted_nominal_mode);
>        if (!DECL_ARTIFICIAL (parm))
> @@ -3131,7 +3172,6 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>         {
>           gcc_assert (data->passed_pointer);
>           gcc_assert (GET_CODE (from_expand) == MEM
> -                     && GET_MODE (from_expand) == BLKmode
>                       && XEXP (from_expand, 0) == NULL_RTX);
>           XEXP (from_expand, 0) = parmreg;
>         }
> @@ -3349,7 +3389,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>           did_conversion = true;
>         }
>        else if (GET_MODE (parmreg) == BLKmode)
> -       gcc_assert (parm_maybe_byref_p (parm));
> +       gcc_assert (parm_in_stack_slot_p (parm));
>        else
>         emit_move_insn (parmreg, src);
>
> @@ -3455,12 +3495,15 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
>    if (data->entry_parm != data->stack_parm)
>      {
>        rtx src, dest;
> +      rtx from_expand = NULL_RTX;
>
>        if (data->stack_parm == 0)
>         {
> -         rtx x = data->stack_parm = rtl_for_parm (all, parm);
> -         if (x)
> -           gcc_assert (GET_MODE (x) == GET_MODE (data->entry_parm));
> +         from_expand = rtl_for_parm (all, parm);
> +         if (from_expand)
> +           gcc_assert (GET_MODE (from_expand) == GET_MODE (data->entry_parm));
> +         if (from_expand && !parm_in_unassigned_mem_p (parm, from_expand))
> +           data->stack_parm = from_expand;
>         }
>
>        if (data->stack_parm == 0)
> @@ -3472,7 +3515,16 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
>             = assign_stack_local (GET_MODE (data->entry_parm),
>                                   GET_MODE_SIZE (GET_MODE (data->entry_parm)),
>                                   align);
> -         set_mem_attributes (data->stack_parm, parm, 1);
> +         if (!from_expand)
> +           set_mem_attributes (data->stack_parm, parm, 1);
> +         else
> +           {
> +             gcc_assert (GET_CODE (data->stack_parm) == MEM);
> +             gcc_assert (parm_in_unassigned_mem_p (parm, from_expand));
> +             XEXP (from_expand, 0) = XEXP (data->stack_parm, 0);
> +             PUT_MODE (from_expand, GET_MODE (data->stack_parm));
> +             data->stack_parm = copy_rtx (from_expand);
> +           }
>         }
>
>        dest = validize_mem (copy_rtx (data->stack_parm));
> diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
> index 08ce72c..6468012 100644
> --- a/gcc/tree-ssa-coalesce.c
> +++ b/gcc/tree-ssa-coalesce.c
> @@ -1386,8 +1386,8 @@ gimple_can_coalesce_p (tree name1, tree name2)
>          because it may be passed by reference.  */
>        return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
>         || (/* The case var1 == var2 is already covered above.  */
> -           !parm_maybe_byref_p (var1)
> -           && !parm_maybe_byref_p (var2)
> +           !parm_in_stack_slot_p (var1)
> +           && !parm_in_stack_slot_p (var2)
>             && promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL));
>      }
>
>
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-19  6:50                                                                 ` Alexandre Oliva
  2015-08-19 10:17                                                                   ` Richard Biener
@ 2015-08-19 13:35                                                                   ` Andreas Schwab
  2015-08-19 13:45                                                                     ` Andreas Schwab
  1 sibling, 1 reply; 127+ messages in thread
From: Andreas Schwab @ 2015-08-19 13:35 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Christophe Lyon, GCC Patches, Patrick Marlier, Jeff Law,
	James Greenhalgh, H.J. Lu, Segher Boessenkool, Richard Biener,
	David Edelsohn, Eric Botcazou

Alexandre Oliva <aoliva@redhat.com> writes:

> [PR64164] fix regressions reported on m68k and armeb
>
> From: Alexandre Oliva <aoliva@redhat.com>
>
> Defer stack slot address assignment for all parms that can't live in
> pseudos, and accept pseudo assignments in assign_parm_setup_block.

That doesn't fix the ia64 Ada miscompilation though.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-19 13:35                                                                   ` Andreas Schwab
@ 2015-08-19 13:45                                                                     ` Andreas Schwab
  2015-08-19 17:48                                                                       ` Alexandre Oliva
  0 siblings, 1 reply; 127+ messages in thread
From: Andreas Schwab @ 2015-08-19 13:45 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Christophe Lyon, GCC Patches, Patrick Marlier, Jeff Law,
	James Greenhalgh, H.J. Lu, Segher Boessenkool, Richard Biener,
	David Edelsohn, Eric Botcazou

Andreas Schwab <schwab@linux-m68k.org> writes:

> Alexandre Oliva <aoliva@redhat.com> writes:
>
>> [PR64164] fix regressions reported on m68k and armeb
>>
>> From: Alexandre Oliva <aoliva@redhat.com>
>>
>> Defer stack slot address assignment for all parms that can't live in
>> pseudos, and accept pseudo assignments in assign_parm_setup_block.
>
> That doesn't fix the ia64 Ada miscompilation though.

I mean miscomparison, not miscompilation.  The difference is only in the
insn scheduling.

--- x1	2015-08-19 15:26:41.000000000 +0200
+++ x2	2015-08-19 15:26:46.000000000 +0200
@@ -1,5 +1,5 @@
 
-stage2-gcc/ada/par.o:     file format elf64-ia64-little
+stage3-gcc/ada/par.o:     file format elf64-ia64-little
 
 
 Disassembly of section .text:
@@ -29467,25 +29467,25 @@
 			214b2: PCREL21B	atree__new_node
    214b6:	00 00 00 02 00 00 	            nop.i 0x0
    214bc:	08 00 00 50       	            br.call.sptk.many b0=214b0 <par__ch6__p_formal_part.2186+0xa30>
-   214c0:	08 78 e0 01 80 24 	[MMI]       mov r15=16504
-   214c6:	e0 80 03 00 49 20 	            mov r14=16496
-   214cc:	00 06 04 92       	            mov r1=16608
-   214d0:	0a 80 23 00 08 20 	[MMI]       addp4 r112=r8,r0;;
-   214d6:	f0 78 30 00 40 c0 	            add r15=r15,r12
-   214dc:	e1 60 00 80       	            add r14=r14,r12
-   214e0:	0a 08 04 18 00 20 	[MMI]       add r1=r1,r12;;
-   214e6:	f0 00 3c 20 20 00 	            ld4 r15=[r15]
-   214ec:	00 00 04 00       	            nop.i 0x0
+   214c0:	08 70 c0 01 80 24 	[MMI]       mov r14=16496
+   214c6:	00 00 00 02 00 e0 	            nop.m 0x0
+   214cc:	81 07 00 92       	            mov r15=16504
+   214d0:	09 08 80 01 81 24 	[MMI]       mov r1=16608
+   214d6:	00 00 00 02 00 00 	            nop.m 0x0
+   214dc:	8e 00 20 80       	            addp4 r112=r8,r0;;
+   214e0:	09 70 38 18 00 20 	[MMI]       add r14=r14,r12
+   214e6:	f0 78 30 00 40 20 	            add r15=r15,r12
+   214ec:	10 60 00 80       	            add r1=r1,r12;;
    214f0:	09 00 20 1c 90 11 	[MMI]       st4 [r14]=r8
    214f6:	10 00 04 30 20 00 	            ld8 r1=[r1]
    214fc:	00 00 04 00       	            nop.i 0x0;;
-   21500:	01 00 00 00 01 00 	[MII]       nop.m 0x0
-   21506:	e0 00 3c 2c 00 e0 	            sxt4 r14=r15
-   2150c:	01 61 00 84       	            adds r15=16,r12;;
-   21510:	0b 70 38 00 11 20 	[MMI]       shladd r14=r14,2,r0;;
-   21516:	e0 78 38 00 40 00 	            add r14=r15,r14
+   21500:	02 78 00 1e 10 10 	[MII]       ld4 r15=[r15]
+   21506:	00 00 00 02 00 c0 	            nop.i 0x0;;
+   2150c:	01 78 58 00       	            sxt4 r14=r15
+   21510:	0b 78 40 18 00 21 	[MMI]       adds r15=16,r12;;
+   21516:	e0 70 00 22 40 00 	            shladd r14=r14,2,r0
    2151c:	00 00 04 00       	            nop.i 0x0;;
-   21520:	09 00 00 00 01 00 	[MMI]       nop.m 0x0
+   21520:	0b 70 3c 1c 00 20 	[MMI]       add r14=r15,r14;;
    21526:	e0 e0 3b 7e 46 00 	            adds r14=-4,r14
    2152c:	00 00 04 00       	            nop.i 0x0;;
    21530:	10 88 03 1c 10 10 	[MIB]       ld4 r113=[r14]


Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-19 13:45                                                                     ` Andreas Schwab
@ 2015-08-19 17:48                                                                       ` Alexandre Oliva
  2015-08-20  1:44                                                                         ` Alexandre Oliva
  0 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-08-19 17:48 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Christophe Lyon, GCC Patches, Patrick Marlier, Jeff Law,
	James Greenhalgh, H.J. Lu, Segher Boessenkool, Richard Biener,
	David Edelsohn, Eric Botcazou

On Aug 19, 2015, Andreas Schwab <schwab@linux-m68k.org> wrote:

> Andreas Schwab <schwab@linux-m68k.org> writes:
>> Alexandre Oliva <aoliva@redhat.com> writes:
>> 
>>> [PR64164] fix regressions reported on m68k and armeb
>>> 
>>> From: Alexandre Oliva <aoliva@redhat.com>
>>> 
>>> Defer stack slot address assignment for all parms that can't live in
>>> pseudos, and accept pseudo assignments in assign_parm_setup_block.
>> 
>> That doesn't fix the ia64 Ada miscompilation though.

That's not surprising, it's the first I hear of it ;-)

> I mean miscomparison, not miscompilation.  The difference is only in the
> insn scheduling.

Interesting.  I have a hard time figuring out how this could follow from
the patchset at hand, but...  let's try to figure it out.

I'm having some difficulty getting access to an ia64 box ATM, and for
ada bootstraps, a cross won't do, so...  if you still have that build
tree around, any chance you could recompile par.o with both stage1 and
stage2, with -fdump-rtl-expand-details, and email me the compiler dump
files?  Maybe that will suffice to figure out where the difference might
come from.

Thanks in advance,

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-19 17:48                                                                       ` Alexandre Oliva
@ 2015-08-20  1:44                                                                         ` Alexandre Oliva
  2015-08-20 17:03                                                                           ` Jeff Law
                                                                                             ` (2 more replies)
  0 siblings, 3 replies; 127+ messages in thread
From: Alexandre Oliva @ 2015-08-20  1:44 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Christophe Lyon, GCC Patches, Patrick Marlier, Jeff Law,
	James Greenhalgh, H.J. Lu, Segher Boessenkool, Richard Biener,
	David Edelsohn, Eric Botcazou

On Aug 19, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:

> I'm having some difficulty getting access to an ia64 box ATM, and for
> ada bootstraps, a cross won't do, so...  if you still have that build
> tree around, any chance you could recompile par.o with both stage1 and
> stage2, with -fdump-rtl-expand-details, and email me the compiler dump
> files?

Thanks!

In the meantime, I have been able to duplicate the problem myself.  As
you say, it is triggered by -gtoggle.  However, it has nothing
whatsoever to do with the recent patches I installed.  At most they
expose some latent problem in the scheduler.

I have verified in the expand dumps that both the gimple and the rtl
representation in the relevant parts of the code are identical, except
for the presence of debug stmts and insns.  Indeed, compiling with
-fno-schedule-insns{,2}, no differences arise.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-20  1:44                                                                         ` Alexandre Oliva
@ 2015-08-20 17:03                                                                           ` Jeff Law
  2015-08-21  7:57                                                                           ` Alexandre Oliva
  2015-08-21  8:11                                                                           ` Alexandre Oliva
  2 siblings, 0 replies; 127+ messages in thread
From: Jeff Law @ 2015-08-20 17:03 UTC (permalink / raw)
  To: Alexandre Oliva, Andreas Schwab
  Cc: Christophe Lyon, GCC Patches, Patrick Marlier, James Greenhalgh,
	H.J. Lu, Segher Boessenkool, Richard Biener, David Edelsohn,
	Eric Botcazou

On 08/19/2015 06:00 PM, Alexandre Oliva wrote:
> On Aug 19, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
>
>> I'm having some difficulty getting access to an ia64 box ATM, and for
>> ada bootstraps, a cross won't do, so...  if you still have that build
>> tree around, any chance you could recompile par.o with both stage1 and
>> stage2, with -fdump-rtl-expand-details, and email me the compiler dump
>> files?
>
> Thanks!
>
> In the meantime, I have been able to duplicate the problem myself.  As
> you say, it is triggered by -gtoggle.  However, it has nothing
> whatsoever to do with the recent patches I installed.  At most they
> expose some latent problem in the scheduler.
>
> I have verified in the expand dumps that both the gimple and the rtl
> representation in the relevant parts of the code are identical, except
> for the presence of debug stmts and insns.  Indeed, compiling with
> -fno-schedule-insns{,2}, no differences arise.
We did a couple fixes to this code earlier this year.  Presumably 
there's something still subtly wrong in there that your changes are 
exposing.

See Maxim's changes from Feb.

You might also look and see if any of those insns have SCHED_GROUP_P set.

Jeff

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-20  1:44                                                                         ` Alexandre Oliva
  2015-08-20 17:03                                                                           ` Jeff Law
@ 2015-08-21  7:57                                                                           ` Alexandre Oliva
  2015-08-21  8:38                                                                             ` Richard Biener
  2015-08-21 12:17                                                                             ` Andreas Schwab
  2015-08-21  8:11                                                                           ` Alexandre Oliva
  2 siblings, 2 replies; 127+ messages in thread
From: Alexandre Oliva @ 2015-08-21  7:57 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Christophe Lyon, GCC Patches, Patrick Marlier, Jeff Law,
	James Greenhalgh, H.J. Lu, Segher Boessenkool, Richard Biener,
	David Edelsohn, Eric Botcazou

On Aug 19, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:

> On Aug 19, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
>> I'm having some difficulty getting access to an ia64 box ATM, and for
>> ada bootstraps, a cross won't do, so...  if you still have that build
>> tree around, any chance you could recompile par.o with both stage1 and
>> stage2, with -fdump-rtl-expand-details, and email me the compiler dump
>> files?

> Thanks!

> In the meantime, I have been able to duplicate the problem myself.  As
> you say, it is triggered by -gtoggle.  However, it has nothing
> whatsoever to do with the recent patches I installed.  At most they
> expose some latent problem in the scheduler.

The above was more likely wrong than right.  There may have been a
latent problem in the scheduler indeed, but the patch actually made it
worse, or even introduced it.

The scheduler relies on alias analysis to tell whether a given pair of
insns that read or modify memory should have a dependence set between
them.  It looks both at the RTL proper (including cselib values) and at
the MEM attrs.

The problem was that, for ada/par.o, we computed different dependencies,
and thus different sched priorities, for a pair of insns.  Specifically,
one wrote to a stack spill slot, and another read from a neighbor spill
slot.  Both had MEM_EXPRs pointing at a %sfp decl, and different
offsets.  In the stage3 (-g) compilation, there were debug insns between
them.

They caused additional equivalent expressions to be added to some
values, which in turn caused memrefs_conflict_p to return different
values.

In the stage2 compilation, both VALUEs resolved to PLUSes of two VALUEs,
the first of each resolved to a constant, while the latter of each
resolved to %sfp.  When the second operands of PLUSes match, we recurse
and compare the first operands, resolving both to CONST_INTs and, in
this case, concluding that there's no possible overlap.

In the stage3 compilation, one VALUE resolved to a PLUS of a VALUE and a
CONST_INT, whereas the other resolved to a PLUS of two VALUEs.  Without
canonicalization of VALUE order in PLUSes, it just so happened that the
VALUE that appeared as the second operand in the second PLUS was moved
to the first operand in the first PLUS, and so memrefs_conflict_p
couldn't tell whether or not there was an overlap.

Before the initial pr64164 patch, we had another chance to detect the
non-overlap analyzing the MEM attrs in nonoverlapping_memrefs_p: given
the same MEM_EXPR, but different offsets, we used to conclude there was
no overlap, so this got true_dependence to return the same value in both
compilations.

The pr64164 patch introduced an early exit from nonoverlapping_memrefs_p
when either operand is a gimple_reg, because some of these wouldn't have
a DECL_RTL set, and creating RTL for them at such points would not be
appropriate.  The problem is that the early exit would only return false
if the exprs were different.  If they were the same, we'd conclude an
overlap was possible, even if offsets were enough to tell otherwise.

My thought back then was that such exprs were not addressable anyway, so
we'd always access the entire object, so offsets couldn't possibly be
different.  Right?  Well, no!  Think spill slots: %sfp (a gimple_reg
decl) + constant offset!  Same base gimple_reg, non-overlapping memory
addresses!


This patch improves memrefs_conflict_p so as to handle more combinations
of VALUEs in PLUSes: if both incoming addresses are PLUSes, check one
operand of one against the other operand of the other; if one address is
a PLUS and the other isn't, test the other against both operands of the
PLUS.  This causes memrefs_conflict_p to return consistent results for
that given pair of insns in both stage2 and stage3 compilation.
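
To illustrate the operand matching described above, here is a minimal
standalone sketch.  It does not use GCC's rtx at all: struct expr,
zero_expr, known_delta and demo_swapped_operands are hypothetical names
made up for this example, and the toy function computes a constant
difference between two addresses rather than answering the overlap
question that memrefs_conflict_p answers.  The point is only that
pointer-identical operands are matched in any position of a PLUS, so
16 + sfp and sfp + 24 resolve the same way as sfp + 16 and sfp + 24.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy address expressions.  These are NOT GCC's rtx; every name in
   this block is hypothetical and exists only for illustration.  */
enum kind { K_CONST, K_VALUE, K_PLUS };

struct expr
{
  enum kind kind;
  long value;                  /* Meaningful for K_CONST only.  */
  const struct expr *op[2];    /* Meaningful for K_PLUS only.  */
};

/* Plays the role const0_rtx plays in the patch.  */
static const struct expr zero_expr = { K_CONST, 0, { NULL, NULL } };

/* If the constant Y - X can be determined, store it in *DELTA and
   return true.  Shared operands are recognized by pointer identity,
   whatever position they occupy in a PLUS.  */
bool
known_delta (const struct expr *x, const struct expr *y, long *delta)
{
  if (x == y)
    {
      *delta = 0;
      return true;
    }
  if (x->kind == K_CONST && y->kind == K_CONST)
    {
      *delta = y->value - x->value;
      return true;
    }
  if (x->kind == K_PLUS)
    for (int i = 0; i < 2; i++)
      {
        /* X is Y plus something: compare that something with zero.  */
        if (x->op[i] == y && known_delta (x->op[1 - i], &zero_expr, delta))
          return true;
        /* Both are PLUSes: try every pairing of operands, not just
           the same-position ones.  */
        if (y->kind == K_PLUS)
          for (int j = 0; j < 2; j++)
            if (x->op[i] == y->op[j]
                && known_delta (x->op[1 - i], y->op[1 - j], delta))
              return true;
      }
  if (y->kind == K_PLUS)
    for (int j = 0; j < 2; j++)
      /* Y is X plus something.  */
      if (y->op[j] == x && known_delta (&zero_expr, y->op[1 - j], delta))
        return true;
  return false;
}

/* Shapes loosely modeled on the stage3 case: the shared VALUE sits in
   different operand positions of the two PLUSes.  */
bool
demo_swapped_operands (void)
{
  struct expr sfp = { K_VALUE, 0, { NULL, NULL } };
  struct expr c16 = { K_CONST, 16, { NULL, NULL } };
  struct expr c24 = { K_CONST, 24, { NULL, NULL } };
  struct expr a = { K_PLUS, 0, { &c16, &sfp } };   /* 16 + sfp */
  struct expr b = { K_PLUS, 0, { &sfp, &c24 } };   /* sfp + 24 */
  long delta = 0;
  return known_delta (&a, &b, &delta) && delta == 8;
}
```

A matcher that only compared same-position operands would give up on
the pair built in demo_swapped_operands, which is the kind of
inconsistency between the stage2 and stage3 compilations described
earlier.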

Additionally, it fixes the regression in nonoverlapping_memrefs_p,
adding code to check for non-overlapping offsets when the base expr is
the same (as long as offsets and sizes are known for both MEMs).
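
As a rough sketch of that offset test, assuming a simplified stand-in
for the MEM attributes: struct mem_ref, ranges_overlap and
provably_disjoint below are hypothetical names, not GCC's, and the
first test in provably_disjoint mirrors only the gimple_reg branch
touched by the patch, where distinct exprs are already known not to
overlap.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the attributes of a MEM: the identity of
   its base expr, plus an offset and a size that may be unknown.  */
struct mem_ref
{
  const void *expr;
  bool offset_known;
  long offset;
  bool size_known;
  long size;
};

/* DELTA is the second offset minus the first.  Return true if
   [0, XSIZE) and [DELTA, DELTA + YSIZE) share at least one byte.
   Sizes are assumed to be positive.  */
bool
ranges_overlap (long delta, long xsize, long ysize)
{
  assert (xsize > 0 && ysize > 0);
  return delta >= 0 ? xsize > delta : ysize > -delta;
}

/* Return true only if X and Y are known not to overlap.  With the same
   base expr, that requires known offsets and sizes for both.  */
bool
provably_disjoint (const struct mem_ref *x, const struct mem_ref *y)
{
  if (x->expr != y->expr)
    return true;
  return (x->offset_known && y->offset_known
          && x->size_known && y->size_known
          && !ranges_overlap (y->offset - x->offset, x->size, y->size));
}
```

For instance, two 4-byte slots at illustrative offsets 16496 and 16504
from the same base come out as disjoint, while an unknown offset or
size on either side falls back to assuming a possible overlap.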

Either one would suffice to fix this particular case.  The latter would
fix the regression proper, but the former is sufficiently lightweight
(since comparing pointers is enough) that it's probably worth adding to
get more accurate and consistent results earlier.

I'm bootstrapping this on ia64-linux-gnu.  Ok to install?


fix sched compare regression

From: Alexandre Oliva <aoliva@redhat.com>

for  gcc/ChangeLog

	PR rtl-optimization/64164
	PR rtl-optimization/67227
	* alias.c (memrefs_conflict_p): Handle VALUEs in PLUS better.
	(nonoverlapping_memrefs_p): Test offsets and sizes when given
	identical gimple_reg exprs.
---
 gcc/alias.c |   23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/gcc/alias.c b/gcc/alias.c
index 4681e3f..f12d9d1 100644
--- a/gcc/alias.c
+++ b/gcc/alias.c
@@ -2228,6 +2228,13 @@ memrefs_conflict_p (int xsize, rtx x, int ysize, rtx y, HOST_WIDE_INT c)
       rtx x0 = XEXP (x, 0);
       rtx x1 = XEXP (x, 1);
 
+      /* However, VALUEs might end up in different positions even in
+	 canonical PLUSes.  Comparing their addresses is enough.  */
+      if (x0 == y)
+	return memrefs_conflict_p (xsize, x1, ysize, const0_rtx, c);
+      else if (x1 == y)
+	return memrefs_conflict_p (xsize, x0, ysize, const0_rtx, c);
+
       if (GET_CODE (y) == PLUS)
 	{
 	  /* The fact that Y is canonicalized means that this
@@ -2235,6 +2242,11 @@ memrefs_conflict_p (int xsize, rtx x, int ysize, rtx y, HOST_WIDE_INT c)
 	  rtx y0 = XEXP (y, 0);
 	  rtx y1 = XEXP (y, 1);
 
+	  if (x0 == y1)
+	    return memrefs_conflict_p (xsize, x1, ysize, y0, c);
+	  if (x1 == y0)
+	    return memrefs_conflict_p (xsize, x0, ysize, y1, c);
+
 	  if (rtx_equal_for_memref_p (x1, y1))
 	    return memrefs_conflict_p (xsize, x0, ysize, y0, c);
 	  if (rtx_equal_for_memref_p (x0, y0))
@@ -2263,6 +2275,11 @@ memrefs_conflict_p (int xsize, rtx x, int ysize, rtx y, HOST_WIDE_INT c)
       rtx y0 = XEXP (y, 0);
       rtx y1 = XEXP (y, 1);
 
+      if (x == y0)
+	return memrefs_conflict_p (xsize, const0_rtx, ysize, y1, c);
+      if (x == y1)
+	return memrefs_conflict_p (xsize, const0_rtx, ysize, y0, c);
+
       if (CONST_INT_P (y1))
 	return memrefs_conflict_p (xsize, x, ysize, y0, c + INTVAL (y1));
       else
@@ -2518,7 +2535,11 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
      able to do anything about them since no SSA information will have
      remained to guide it.  */
   if (is_gimple_reg (exprx) || is_gimple_reg (expry))
-    return exprx != expry;
+    return exprx != expry
+      || (moffsetx_known_p && moffsety_known_p
+	  && MEM_SIZE_KNOWN_P (x) && MEM_SIZE_KNOWN_P (y)
+	  && !offset_overlap_p (moffsety - moffsetx,
+				MEM_SIZE (x), MEM_SIZE (y)));
 
   /* With invalid code we can end up storing into the constant pool.
      Bail out to avoid ICEing when creating RTL for this.


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-20  1:44                                                                         ` Alexandre Oliva
  2015-08-20 17:03                                                                           ` Jeff Law
  2015-08-21  7:57                                                                           ` Alexandre Oliva
@ 2015-08-21  8:11                                                                           ` Alexandre Oliva
  2015-08-21  8:37                                                                             ` Richard Biener
  2 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-08-21  8:11 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Christophe Lyon, GCC Patches, Patrick Marlier, Jeff Law,
	James Greenhalgh, H.J. Lu, Segher Boessenkool, Richard Biener,
	David Edelsohn, Eric Botcazou

On Aug 19, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:

> I have verified in the expand dumps that both the gimple and the rtl
> representation in the relevant parts of the code are identical, except
> for the presence of debug stmts and insns.

While comparing the dumps, I noticed -fdump-unnumbered-links no longer
worked like it did back when I introduced it, with the very purpose of
making it easier to compare dumps with and without debug insns.

When the insn_uid was moved out of the u[] array, the indices that
print-rtl tested to tell whether to omit the ids of the prev and next
insns got off by one.

This patch updates the test to match the current indices.
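
As a rough sketch of the off-by-one (not the print-rtl.c code itself),
assume the uid used to sit at operand index 0 with the prev and next
links at indices 1 and 2, and that with the uid moved out the links now
sit at indices 0 and 1.  The function names below are hypothetical.

```c
#include <stdbool.h>

/* Old test: matches operand indices 1 and 2, where the prev/next links
   lived while the uid occupied index 0.  I is a non-negative index.  */
static bool
is_link_index_old (int i)
{
  return i == 1 || i == 2;
}

/* New test: with the uid out of the operand array, the prev/next links
   are the first two operands, i.e. indices 0 and 1.  */
static bool
is_link_index_new (int i)
{
  return i <= 1;
}
```

Under that assumption the old test missed the prev link at index 0 and
wrongly matched whatever operand now sits at index 2.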

Bootstrapping on ia64-linux-gnu.  Ok to install?

fix -fdump-unnumbered-links

From: Alexandre Oliva <aoliva@redhat.com>

for  gcc/ChangeLog

	* print-rtl.c (print_rtx): Check the correct range for
	flag_dump_unnumbered_links to behave as documented.
---
 gcc/print-rtl.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/print-rtl.c b/gcc/print-rtl.c
index aacadbb..b541d83 100644
--- a/gcc/print-rtl.c
+++ b/gcc/print-rtl.c
@@ -550,7 +550,7 @@ print_rtx (const_rtx in_rtx)
 	      }
 
 	    if (flag_dump_unnumbered
-		|| (flag_dump_unnumbered_links && (i == 1 || i == 2)
+		|| (flag_dump_unnumbered_links && i <= 1
 		    && (INSN_P (in_rtx) || NOTE_P (in_rtx)
 			|| LABEL_P (in_rtx) || BARRIER_P (in_rtx))))
 	      fputs (" #", outfile);


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-21  8:11                                                                           ` Alexandre Oliva
@ 2015-08-21  8:37                                                                             ` Richard Biener
  0 siblings, 0 replies; 127+ messages in thread
From: Richard Biener @ 2015-08-21  8:37 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Andreas Schwab, Christophe Lyon, GCC Patches, Patrick Marlier,
	Jeff Law, James Greenhalgh, H.J. Lu, Segher Boessenkool,
	David Edelsohn, Eric Botcazou

On Fri, Aug 21, 2015 at 9:57 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Aug 19, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
>
>> I have verified in the expand dumps that both the gimple and the rtl
>> representation in the relevant parts of the code are identical, except
>> for the presence of debug stmts and insns.
>
> While comparing the dumps, I noticed -fdump-unnumbered-links no longer
> worked like it did back when I introduced it, with the very purpose of
> making it easier to compare dumps with and without debug insns.
>
> When the insn_uid was moved out of the u[] array, the indices that
> print-rtl tested to tell whether to omit the ids of the prev and next
> insns got off by one.
>
> This patch updates the test to match the current indices.
>
> Bootstrapping on ia64-linux-gnu.  Ok to install?

Ok.

Thanks,
Richard.

> fix -fdump-unnumbered-links
>
> From: Alexandre Oliva <aoliva@redhat.com>
>
> for  gcc/ChangeLog
>
>         * print-rtl.c (print_rtx): Check the correct range for
>         flag_dump_unnumbered_links to behave as documented.
> ---
>  gcc/print-rtl.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/print-rtl.c b/gcc/print-rtl.c
> index aacadbb..b541d83 100644
> --- a/gcc/print-rtl.c
> +++ b/gcc/print-rtl.c
> @@ -550,7 +550,7 @@ print_rtx (const_rtx in_rtx)
>               }
>
>             if (flag_dump_unnumbered
> -               || (flag_dump_unnumbered_links && (i == 1 || i == 2)
> +               || (flag_dump_unnumbered_links && i <= 1
>                     && (INSN_P (in_rtx) || NOTE_P (in_rtx)
>                         || LABEL_P (in_rtx) || BARRIER_P (in_rtx))))
>               fputs (" #", outfile);
>
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-21  7:57                                                                           ` Alexandre Oliva
@ 2015-08-21  8:38                                                                             ` Richard Biener
  2015-08-21 12:17                                                                             ` Andreas Schwab
  1 sibling, 0 replies; 127+ messages in thread
From: Richard Biener @ 2015-08-21  8:38 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Andreas Schwab, Christophe Lyon, GCC Patches, Patrick Marlier,
	Jeff Law, James Greenhalgh, H.J. Lu, Segher Boessenkool,
	David Edelsohn, Eric Botcazou

On Fri, Aug 21, 2015 at 9:46 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Aug 19, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
>
>> On Aug 19, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
>>> I'm having some difficulty getting access to an ia64 box ATM, and for
>>> ada bootstraps, a cross won't do, so...  if you still have that build
>>> tree around, any chance you could recompile par.o with both stage1 and
>>> stage2, with -fdump-rtl-expand-details, and email me the compiler dump
>>> files?
>
>> Thanks!
>
>> In the mean time, I have been able to duplicate the problem myself.  As
>> you say, it is triggered by -gtoggle.  However, it has nothing
>> whatsoever to do with the recent patches I installed.  At most they
>> expose some latent problem in the scheduler.
>
> The above was more likely wrong than right.  There may have been a
> latent problem in the scheduler indeed, but the patch actually made it
> worse, or even introduced it.
>
> The scheduler relies on alias analysis to tell whether a given pair of
> insns that read or modify memory should have a dependence set between
> them.  It looks both at the RTL proper (including cselib values) and at
> the MEM attrs.
>
> The problem was that, for ada/par.o, we computed different dependencies,
> and thus different sched priorities, for a pair of insns.  Specifically,
> one wrote to a stack spill slot, and another read from a neighbor spill
> slot.  Both had MEM_EXPRs pointing at a %sfp decl, and different
> offsets.  In the stage3 (-g) compilation, there were debug insns between
> them.
>
> They caused additional equivalent expressions to be added to some
> values, which in turn caused memrefs_conflict_p to return different
> values.
>
> In the stage2 compilation, both VALUEs resolved to PLUSes of two VALUEs,
> the first of each resolved to a constant, while the latter of each
> resolved to %sfp.  When the second operands of PLUSes match, we recurse
> and compare the first operands, resolving both to CONST_INTs and, in
> this case, concluding that there's no possible overlap.
>
> In the stage3 compilation, one VALUE resolved to a PLUS of a VALUE and a
> CONST_INT, whereas the other resolved to a PLUS of two VALUEs.  Without
> canonicalization of VALUE order in PLUSes, it just so happened that the
> VALUE that appeared as the second operand in the second PLUS was moved
> to the first operand in the first PLUS, and so memrefs_conflict_p
> couldn't tell whether or not there was an overlap.
>
> Before the initial pr64164 patch, we had another chance to detect the
> non-overlap analyzing the MEM attrs in nonoverlapping_memrefs_p: given
> the same MEM_EXPR, but different offsets, we used to conclude there was
> no overlap, so this got true_dependence to return the same value in both
> compilations.
>
> The pr64164 patch introduced an early exit from nonoverlapping_memrefs_p
> when either operand is a gimple_reg, because some of these wouldn't have
> a DECL_RTL set, and creating RTL for them at such points would not be
> appropriate.  The problem is that the early exit would only return false
> if the exprs were different.  If they were the same, we'd conclude an
> overlap was possible, even if offsets were enough to tell otherwise.
>
> My thought back then was that such exprs were not addressable anyway, so
> we'd always access the entire object, so offsets couldn't possibly be
> different.  Right?  Well, no!  Think spill slots: %sfp (a gimple_reg
> decl) + constant offset!  Same base gimple_reg, non-overlapping memory
> addresses!
>
>
> This patch improves memrefs_conflict_p so as to handle more combinations
> of VALUEs in PLUSes: if both incoming addresses are PLUSes, check one
> operand of one against the other operand of the other; if one address is
> a PLUS and the other isn't, test the other against both operands of the
> PLUS.  This causes memrefs_conflict_p to return consistent results for
> that given pair of insns in both stage2 and stage3 compilation.
>
> Additionally, it fixes the regression in nonoverlapping_memrefs_p,
> adding code to check for non-overlapping offsets when the base expr is
> the same (as long as offsets and sizes are known for both MEMs).
>
> Either one would suffice to fix this particular case.  The latter would
> fix the regression proper, but the former is sufficiently lightweight
> (since comparing pointers is enough) that it's probably worth adding to
> get more accurate and consistent results earlier.
>
> I'm bootstrapping this on ia64-linux-gnu.  Ok to install?

Looks ok to me.

Thanks,
Richard.

>
> fix sched compare regression
>
> From: Alexandre Oliva <aoliva@redhat.com>
>
> for  gcc/ChangeLog
>
>         PR rtl-optimization/64164
>         PR rtl-optimization/67227
>         * alias.c (memrefs_conflict_p): Handle VALUEs in PLUS better.
>         (nonoverlapping_memrefs_p): Test offsets and sizes when given
>         identical gimple_reg exprs.
> ---
>  gcc/alias.c |   23 ++++++++++++++++++++++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/alias.c b/gcc/alias.c
> index 4681e3f..f12d9d1 100644
> --- a/gcc/alias.c
> +++ b/gcc/alias.c
> @@ -2228,6 +2228,13 @@ memrefs_conflict_p (int xsize, rtx x, int ysize, rtx y, HOST_WIDE_INT c)
>        rtx x0 = XEXP (x, 0);
>        rtx x1 = XEXP (x, 1);
>
> +      /* However, VALUEs might end up in different positions even in
> +        canonical PLUSes.  Comparing their addresses is enough.  */
> +      if (x0 == y)
> +       return memrefs_conflict_p (xsize, x1, ysize, const0_rtx, c);
> +      else if (x1 == y)
> +       return memrefs_conflict_p (xsize, x0, ysize, const0_rtx, c);
> +
>        if (GET_CODE (y) == PLUS)
>         {
>           /* The fact that Y is canonicalized means that this
> @@ -2235,6 +2242,11 @@ memrefs_conflict_p (int xsize, rtx x, int ysize, rtx y, HOST_WIDE_INT c)
>           rtx y0 = XEXP (y, 0);
>           rtx y1 = XEXP (y, 1);
>
> +         if (x0 == y1)
> +           return memrefs_conflict_p (xsize, x1, ysize, y0, c);
> +         if (x1 == y0)
> +           return memrefs_conflict_p (xsize, x0, ysize, y1, c);
> +
>           if (rtx_equal_for_memref_p (x1, y1))
>             return memrefs_conflict_p (xsize, x0, ysize, y0, c);
>           if (rtx_equal_for_memref_p (x0, y0))
> @@ -2263,6 +2275,11 @@ memrefs_conflict_p (int xsize, rtx x, int ysize, rtx y, HOST_WIDE_INT c)
>        rtx y0 = XEXP (y, 0);
>        rtx y1 = XEXP (y, 1);
>
> +      if (x == y0)
> +       return memrefs_conflict_p (xsize, const0_rtx, ysize, y1, c);
> +      if (x == y1)
> +       return memrefs_conflict_p (xsize, const0_rtx, ysize, y0, c);
> +
>        if (CONST_INT_P (y1))
>         return memrefs_conflict_p (xsize, x, ysize, y0, c + INTVAL (y1));
>        else
> @@ -2518,7 +2535,11 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
>       able to do anything about them since no SSA information will have
>       remained to guide it.  */
>    if (is_gimple_reg (exprx) || is_gimple_reg (expry))
> -    return exprx != expry;
> +    return exprx != expry
> +      || (moffsetx_known_p && moffsety_known_p
> +         && MEM_SIZE_KNOWN_P (x) && MEM_SIZE_KNOWN_P (y)
> +         && !offset_overlap_p (moffsety - moffsetx,
> +                               MEM_SIZE (x), MEM_SIZE (y)));
>
>    /* With invalid code we can end up storing into the constant pool.
>       Bail out to avoid ICEing when creating RTL for this.
>
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-21  7:57                                                                           ` Alexandre Oliva
  2015-08-21  8:38                                                                             ` Richard Biener
@ 2015-08-21 12:17                                                                             ` Andreas Schwab
  1 sibling, 0 replies; 127+ messages in thread
From: Andreas Schwab @ 2015-08-21 12:17 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Christophe Lyon, GCC Patches, Patrick Marlier, Jeff Law,
	James Greenhalgh, H.J. Lu, Segher Boessenkool, Richard Biener,
	David Edelsohn, Eric Botcazou

Alexandre Oliva <aoliva@redhat.com> writes:

> 	PR rtl-optimization/64164
> 	PR rtl-optimization/67227
> 	* alias.c (memrefs_conflict_p): Handle VALUEs in PLUS better.
> 	(nonoverlapping_memrefs_p): Test offsets and sizes when given
> 	identical gimple_reg exprs.

I can confirm that this fixes the bootstrap.

Thanks, Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

* Re: [PR64164] drop copyrename, integrate into expand
  2015-08-14 19:03                                                       ` Alexandre Oliva
                                                                           ` (2 preceding siblings ...)
  2015-08-17  7:48                                                         ` Christophe Lyon
@ 2015-09-02 17:09                                                         ` Alan Lawrence
  2015-09-02 22:34                                                           ` Alexandre Oliva
  3 siblings, 1 reply; 127+ messages in thread
From: Alan Lawrence @ 2015-09-02 17:09 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Jeff Law, James Greenhalgh, H.J. Lu, Segher Boessenkool,
	Richard Biener, GCC Patches, Christophe Lyon, David Edelsohn,
	Eric Botcazou

On 14/08/15 19:57, Alexandre Oliva wrote:
>
> I'm glad it appears to be working to everyone's
> satisfaction now.  I've just committed it as r226901, with only a
> context adjustment to account for a change in use_register_for_decl in
> function.c.  /me crosses fingers :-)
>
> Here's the patch as checked in:

One more failure to report, I'm afraid. On AArch64 Bigendian, 
aapcs64/func-ret-4.c ICEs in simplify_subreg (line refs here are from r227348):

In file included from /work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/func-ret-4.c:14:0:
/work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/func-ret-4.c: In function 'func_return_val_10':
/work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest-2.h:12:24: internal compiler error: in simplify_subreg, at simplify-rtx.c:5808
/work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest-2.h:13:40: note: in definition of macro 'FUNC_NAME_COMBINE'
/work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest-2.h:15:27: note: in expansion of macro 'FUNC_NAME_1'
/work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest-2.h:15:39: note: in expansion of macro 'FUNC_BASE_NAME'
/work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest-2.h:69:33: note: in expansion of macro 'FUNC_NAME'
/work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/func-ret-4.c:23:1: note: in expansion of macro 'FUNC_VAL_CHECK'
0xa7ba44 simplify_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)
         /work/alalaw01/src/gcc/gcc/simplify-rtx.c:5808
0xa7c4ef simplify_gen_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)
         /work/alalaw01/src/gcc/gcc/simplify-rtx.c:6031
0x7ad097 operand_subword(rtx_def*, unsigned int, int, machine_mode)
         /work/alalaw01/src/gcc/gcc/emit-rtl.c:1611
0x7def4e move_block_from_reg(int, rtx_def*, int)
         /work/alalaw01/src/gcc/gcc/expr.c:1536
0x83a494 assign_parm_setup_block
         /work/alalaw01/src/gcc/gcc/function.c:3117
0x841a43 assign_parms
         /work/alalaw01/src/gcc/gcc/function.c:3857
0x842ffa expand_function_start(tree_node*)
         /work/alalaw01/src/gcc/gcc/function.c:5286
0x6e7496 execute
         /work/alalaw01/src/gcc/gcc/cfgexpand.c:6203
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
FAIL: gcc.target/aarch64/aapcs64/func-ret-4.c compilation,  -O1  (internal compiler error)

Also at -O2, -O3 -g, -Og -g, -Os. -O0 is OK.

simplify_subreg is called with outermode=DImode, op=

(concat:CHI (reg:HI 76 [ t ])
     (reg:HI 77 [ t+2 ]))

innermode = BLKmode (which violates the assertion), byte=0.

move_block_from_reg (in expr.c) calls operand_subword(x, i, 1, BLKmode), here 
i=0 and x is the concat:CHI above, and operand_subword doesn't handle that case 
(well, it passes it onto simplify_subreg).

In assign_parm_setup_block, I see 'mem = validize_mem (copy_rtx (stack_parm))' 
where stack_parm is again the same concat:CHI.

This should be easily reproducible with a stage 1 compiler (aarch64_be-none-elf).

--Alan

* Re: [PR64164] drop copyrename, integrate into expand
  2015-09-02 17:09                                                         ` Alan Lawrence
@ 2015-09-02 22:34                                                           ` Alexandre Oliva
  2015-09-03 10:58                                                             ` Alan Lawrence
  2015-09-18 15:49                                                             ` Alan Lawrence
  0 siblings, 2 replies; 127+ messages in thread
From: Alexandre Oliva @ 2015-09-02 22:34 UTC (permalink / raw)
  To: Alan Lawrence
  Cc: Jeff Law, James Greenhalgh, H.J. Lu, Segher Boessenkool,
	Richard Biener, GCC Patches, Christophe Lyon, David Edelsohn,
	Eric Botcazou

On Sep  2, 2015, Alan Lawrence <alan.lawrence@arm.com> wrote:

> One more failure to report, I'm afraid. On AArch64 Bigendian,
> aapcs64/func-ret-4.c ICEs in simplify_subreg (line refs here are from
> r227348):

Thanks.  The failure mode was different in the current, revamped git
branch aoliva/pr64164, but I've just fixed it there.

I'm almost ready to post a new patch, with a new, simpler, less fragile
and more maintainable approach to integrate cfgexpand and assign_parms'
RTL assignment, so if you could give it a spin on big and little endian
aarch64 natives, that would be very much appreciated!

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

* Re: [PR64164] drop copyrename, integrate into expand
  2015-09-02 22:34                                                           ` Alexandre Oliva
@ 2015-09-03 10:58                                                             ` Alan Lawrence
  2015-09-18 15:49                                                             ` Alan Lawrence
  1 sibling, 0 replies; 127+ messages in thread
From: Alan Lawrence @ 2015-09-03 10:58 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Jeff Law, James Greenhalgh, H.J. Lu, Segher Boessenkool,
	Richard Biener, GCC Patches, Christophe Lyon, David Edelsohn,
	Eric Botcazou

On 02/09/15 23:12, Alexandre Oliva wrote:
> On Sep  2, 2015, Alan Lawrence <alan.lawrence@arm.com> wrote:
>
>> One more failure to report, I'm afraid. On AArch64 Bigendian,
>> aapcs64/func-ret-4.c ICEs in simplify_subreg (line refs here are from
>> r227348):
>
> Thanks.  The failure mode was different in the current, revamped git
> branch aoliva/pr64164, but I've just fixed it there.
>
> I'm almost ready to post a new patch, with a new, simpler, less fragile
> and more maintainable approach to integrate cfgexpand and assign_parms'
> RTL assignment, so if you could give it a spin on big and little endian
> aarch64 natives, that would be very much appreciated!
>

On aarch64_be, that branch fixes the ICE - but func-ret-4.c fails on execution, 
and now func-ret-3.c does too! Also it causes a bunch of errors building newlib 
using cross-built binutils, which I haven't tracked down yet:

/work/alalaw01/src2/binutils-gdb/newlib/libc/locale/locale.c: In function '__get_locale_env':
/work/alalaw01/src2/binutils-gdb/newlib/libc/locale/locale.c:911:1: internal compiler error: in insert_value_copy_on_edge, at tree-outof-ssa.c:308
  __get_locale_env(struct _reent *p, int category)
  ^
0xb4ecc4 insert_value_copy_on_edge
	/work/alalaw01/src2/gcc/gcc/tree-outof-ssa.c:307
0xb4ecc4 eliminate_phi
	/work/alalaw01/src2/gcc/gcc/tree-outof-ssa.c:780
0xb4ecc4 expand_phi_nodes(ssaexpand*)
	/work/alalaw01/src2/gcc/gcc/tree-outof-ssa.c:943
0x6e74a6 execute
	/work/alalaw01/src2/gcc/gcc/cfgexpand.c:6242
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
make[7]: *** [lib_a-locale.o] Error 1

--Alan

* Re: [PR64164] drop copyrename, integrate into expand
  2015-09-02 22:34                                                           ` Alexandre Oliva
  2015-09-03 10:58                                                             ` Alan Lawrence
@ 2015-09-18 15:49                                                             ` Alan Lawrence
  2015-09-23 20:44                                                               ` Alexandre Oliva
  1 sibling, 1 reply; 127+ messages in thread
From: Alan Lawrence @ 2015-09-18 15:49 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Jeff Law, James Greenhalgh, H.J. Lu, Segher Boessenkool,
	Richard Biener, GCC Patches, Christophe Lyon, David Edelsohn,
	Eric Botcazou

On 02/09/15 23:12, Alexandre Oliva wrote:
> On Sep  2, 2015, Alan Lawrence <alan.lawrence@arm.com> wrote:
>
>> One more failure to report, I'm afraid. On AArch64 Bigendian,
>> aapcs64/func-ret-4.c ICEs in simplify_subreg (line refs here are from
>> r227348):
>
> Thanks.  The failure mode was different in the current, revamped git
> branch aoliva/pr64164, but I've just fixed it there.
>
> I'm almost ready to post a new patch, with a new, simpler, less fragile
> and more maintainable approach to integrate cfgexpand and assign_parms'
> RTL assignment, so if you could give it a spin on big and little endian
> aarch64 natives, that would be very much appreciated!
>

On trunk, aarch64_be is still ICEing in gcc.target/aarch64/aapcs64/func-ret-4.c 
(complex numbers).

With the latest git commit 2b27ef197ece54c4573c5a748b0d40076e35412c on branch 
aoliva/pr64164, I am now able to build a cross toolchain for aarch64 and 
aarch64_be, and can confirm the ABI failure is fixed on the branch.

HTH,
Alan

* Re: [PR64164] drop copyrename, integrate into expand
  2015-09-18 15:49                                                             ` Alan Lawrence
@ 2015-09-23 20:44                                                               ` Alexandre Oliva
  2015-09-25 11:39                                                                 ` Richard Biener
                                                                                   ` (2 more replies)
  0 siblings, 3 replies; 127+ messages in thread
From: Alexandre Oliva @ 2015-09-23 20:44 UTC (permalink / raw)
  To: Alan Lawrence
  Cc: Jeff Law, James Greenhalgh, H.J. Lu, Segher Boessenkool,
	Richard Biener, GCC Patches, Christophe Lyon, David Edelsohn,
	Eric Botcazou

On Sep 18, 2015, Alan Lawrence <alan.lawrence@arm.com> wrote:

> With the latest git commit 2b27ef197ece54c4573c5a748b0d40076e35412c on
> branch aoliva/pr64164, I am now able to build a cross toolchain for
> aarch64 and aarch64_be, and can confirm the ABI failure is fixed on
> the branch.

Thanks for the confirmation.  I've made one further tweak for cris and
lm32, dropping the assert that caused build failures for libstdc++
atomics parms that required more alignment than
MAX_SUPPORTED_STACK_ALIGNMENT, consolidated the patchset and retested it
with a more recent baseline (r228019), with native regstraps on
x86_64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu,
powerpc64le-linux-gnu, and cross toolchain builds for the following 73
platforms: aarch64_be-elf aarch64-elf arm-eabi armeb-eabihf
arm-symbianelf avr-elf bfin-elf c6x-elf cr16-elf cris-elf crisv32-elf
epiphany-elf fido-elf fr30-elf frv-elf ft32-elf h8300-elf i686-elf
ia64-elf iq2000-elf lm32-elf m32c-elf m32r-elf m32rle-elf m68k-elf
mcore-elf mep-elf microblaze-elf mips64el-elf mips64-elf mips64orion-elf
mips64vr-elf mipsel-elf mipsisa32-elfoabi mipsisa64-elfoabi
mipsisa64r2el-elf mipsisa64r2-sde-elf mipsisa64sb1-elf
mipsisa64sr71k-elf mipstx39-elf mn10300-elf moxie-elf msp430-elf
nds32be-elf nds32le-elf nios2-elf pdp11-aout powerpc-eabialtivec
powerpc-eabi powerpc-eabisimaltivec powerpc-eabisim powerpc-eabispe
powerpcle-eabi powerpcle-eabisim powerpcle-elf powerpc-xilinx-eabi
ppc64-eabi ppc-eabi ppc-elf rl78-elf rx-elf sh64-elf sh-elf
sh-superh-elf sparc64-elf sparc-elf sparc-leon-elf spu-elf v850e-elf
v850-elf visium-elf xstormy16-elf xtensa-elf.  Not all of them succeeded
in building, but those that didn't failed at the very same spots before
and after this patch.


This patch doesn't really add much functionality.  It rather
reimplements a lot of the ugly and fragile stuff I put into the
previous big patchset in a far more robust and pleasant way.  It fixes a
number of regressions in the process, mainly because, instead of
modifying assign_parms so as to let cfgexpand do part of its job, it
reverts all of the RTL assignment for parameters and results to
assign_parms.  cfgexpand now leaves the RTL assignment of partitions
containing default defs or parms and results to assign_parms, and
assign_parms uses a single callback, set_parm_rtl, to tell cfgexpand the
assignment for the partition containing the default def of each
parameter.

This required introducing default defs for all parms and results, even
if unused; we could refrain from creating them, and refrain from
initializing those parameters (at least when optimizing), but that would
require messing with the fragile bits in assign_parms again, and it
would bring little benefit, since RTL optimization will likely notice
the initialization is unused and drop it anyway.  Besides, adding the
default defs was actually needed to fix a regression in the previous
patch, and even with the current patch it helps make sure we don't
assign more than one default def to the same SSA partition (the previous
patch attempted to do that, but there was a bug, fixed in the current
patch).  Having unused default defs makes it easier for us to decide
whether to use an entry_value rtx for the initial debug insn of a parm.
We track partitions holding default defs for parms and results with a
bitmap; we used to have a bitmap that tracked partitions holding default
defs, but it was unused!  I just renamed it and repurposed it.

I've also added checking asserts to set_rtl, to verify that, when we
expect a REG, we get a REG, and that it has the expected mode.  set_rtl
was also adjusted to record anonymous SSA names or their base types in
attrs of REGs or MEMs, respectively, so that code that relied on the
attrs to detect properties of the decl types no longer regress just
because we no longer generate decls for anonymous SSA names.  Since
there were prior uses of types in MEM attrs, that was expected to go
smoothly, but I was surprised at how smoothly adding SSA names to REG
attrs went.  No adjustments required!

I also tightened a bit the conditions for coalescing: we used to require
the same canonical type; I've added tests for same alignment
requirements, and for same signedness.  OTOH, I've added a few more
coalesce candidates for RESULT_DECLs and the newly-added default defs of
parms and results.
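
A minimal sketch of the tightened type test, for illustration only:
struct ssa_type_info and coalescible_types_p are hypothetical names, and
GCC reads these properties from the tree types of the SSA names rather
than from a struct like this.  It covers only the three conditions named
above, not conflict detection or the choice of candidates.

```c
#include <stdbool.h>

/* Hypothetical summary of the properties compared when deciding
   whether two SSA names may share a partition.  */
struct ssa_type_info
{
  int canonical_type_id;   /* stands in for the canonical type */
  unsigned align_bits;     /* alignment requirement, in bits */
  bool is_unsigned;        /* signedness */
};

/* Same canonical type was the previous requirement; same alignment
   and same signedness are the newly added ones.  */
static bool
coalescible_types_p (const struct ssa_type_info *a,
		     const struct ssa_type_info *b)
{
  return a->canonical_type_id == b->canonical_type_id
	 && a->align_bits == b->align_bits
	 && a->is_unsigned == b->is_unsigned;
}
```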

Other relevant changes were in mode promotion.  TYPE_MODE would often
return BLKmode for some vector types, which was fine for some return
decl RTL with PARALLEL, but that didn't quite work for SSA partitions.
There were other cases of mode promotion of result decls that failed the
asserts in set_rtl, that revealed promote_decl_mode didn't call
promote_function_mode as expected for results.

The new asserts brought additional requirements: promoting the mode of
the RTL generated for the static chain, arranging for result decls to be
assigned to a pseudo where it would formerly have got a BLKmode PARALLEL
(as mentioned above), and arranging for parms set up by
assign_parm_setup_block, that would always get a MEM, to instead get a
REG when use_register_for_decl called for it.  In a few cases involving
complex parms, I couldn't figure out how to avoid a temporary MEM, used
to adjust padding of the parms, but although undesired, this is not a
regression, for we used to use the MEM; now we'll just load them to
(coalescible) pseudos and use the pseudos, instead of coalescing
other vars that expected pseudos to the same MEM.

Is this ok to install?



revert to assign_parms assignments using default defs

From: Alexandre Oliva <aoliva@redhat.com>

Revert the fragile and complicated changes to assign_parms designed to
enable it to use RTL assignments chosen by cfgexpand, and instead have
cfgexpand use the RTL assignments by assign_parms, keying them off of
the default defs that are now necessarily introduced for each parm and
result.  The possible lack of a default def was already a problem, and
the fallbacks in place were not enough, as shown by PR67312.  We now
have checking asserts in set_rtl that verify that we're assigning to
each var a piece of RTL that matches the expectations set forth by
use_register_for_decl.

for  gcc/ChangeLog

	PR rtl-optimization/64164
	PR tree-optimization/67312
	PR middle-end/67340
	PR middle-end/67490
	PR bootstrap/67597
	* cfgexpand.c (parm_in_stack_slot_p): Remove.
	(ssa_default_def_partition): Remove.
	(get_rtl_for_parm_ssa_default_def): Remove.
	(set_rtl): Check that RTL assignments match expectations.
	Loop on SUBREGs, CONCATs and PARALLELs subexprs.  Set only the
	default def location for params and results.  Record SSA names
	or types in REG and MEM attrs, respectively.
	(set_parm_rtl): New.
	(expand_one_ssa_partition): Drop logic that assigned MEMs with
	unassigned addresses.
	(adjust_one_expanded_partition_var): Don't accept NULL RTL on
	deferred stack alloc vars.
	(expand_used_vars): Skip partitions holding parm default defs.
	Move adjust_one_expanded_partition_var loop...
	(pass_expand::execute): ... here.  Drop redundant assert.
	Adjust comments before the final loop over all ssa names.
	Require assigned rtl of parms and results to match exactly.
	Reset its attributes to match them, not any other variables in
	the same partition.
	(expand_debug_expr): Use entry value for PARM's default defs
	only if they have zero nondebug uses.
	* cfgexpand.h (parm_in_stack_slot_p): Remove.
	(get_rtl_for_parm_ssa_default_def): Remove.
	(set_parm_rtl): Declare.
	* doc/invoke.texi: Improve wording.
	* explow.c (promote_decl_mode): Fix promote_function_mode for
	result decls not by reference.
	(promote_ssa_mode): Disregard BLKmode from promote_decl, and
	bypass TYPE_MODE to get the actual vector mode.
	* function.c: Include tree-dfa.h.  Revert 2015-08-14's and
	2015-08-19's changes as follows.  Drop include of
	basic-block.h and df.h.
	(rtl_for_parm): Remove.
	(maybe_reset_rtl_for_parm): Remove.
	(parm_in_unassigned_mem_p): Remove.
	(use_register_for_decl): Add logic for RESULT_DECLs matching
	assign_parms' behavior.
	(split_complex_args): Revert.
	(assign_parms_augmented_arg_list): Revert.  Add comment
	referencing the logic above.
	(assign_parm_adjust_stack_rtl): Revert.
	(assign_parm_setup_block): Revert.  Use set_parm_rtl instead
	of SET_DECL_RTL.  Set up a REG if the parm demands so.
	(assign_parm_setup_reg): Revert.  Consolidated SET_DECL_RTL
	calls into a single set_parm_rtl.  Set up temporary RTL
	for expand_assignment.
	(assign_parm_setup_stack): Revert.  Use set_parm_rtl.
	(assign_parms_unsplit_complex): Revert.  Use set_parm_rtl.
	(assign_bounds): Revert.
	(assign_parms): Revert.  Use set_parm_rtl.
	(allocate_struct_function): Relayout result and parms of
	non-abstract functions.
	(expand_function_start): Revert.  Use set_parm_rtl.  If the
	result is not a hard reg, create a pseudo from the promoted
	mode of the default def.  Promote static chain mode.
	* tree-outof-ssa.c (remove_ssa_form): Drop unused
	partition_has_default_def.  Set up
	partitions_for_parm_default_defs.
	(finish_out_of_ssa): Remove partition_has_default_def.
	Release partitions_for_parm_default_defs.
	* tree-outof-ssa.h (struct ssaexpand): Remove
	partition_has_default_def.  Add
	partitions_for_parm_default_defs.
	* tree-ssa-coalesce.c: Include tree-dfa.h, tm_p.h and
	stor-layout.h.
	(build_ssa_conflict_graph): Fix conflict detection for default
	defs of params and results, even unused ones.
	(for_all_parms): New.
	(create_default_def): New.
	(register_default_def): New.
	(coalesce_with_default): New.
	(create_outofssa_var_map): Create default defs for all parms
	and results, and register their partitions.  Add GIMPLE_RETURN
	operands as coalesce candidates with results.  Add default
	defs of each parm or result as coalesce candidates with its
	other defs.  Mark each result def, and each default def of
	parms, as used_in_copy.
	(gimple_can_coalesce_p): Call it.  Call use_register_for_decl
	with the ssa names, even anonymous ones.  Drop
	parm_in_stack_slot_p calls.  Require same signedness and
	alignment.
	(coalesce_ssa_name): Add coalesce candidates for all defs of
	each parm and result, even unused ones.
	(parm_default_def_partition_arg): New type.
	(set_parm_default_def_partition): New.
	(get_parm_default_def_partitions): New.
	* tree-ssa-coalesce.h (get_parm_default_def_partitions): New.
	* tree-ssa-live.c (partition_view_init): Regard unused defs of
	parms and results as used.
	(verify_live_on_entry): Don't error out just because they're
	not live.

for  gcc/testsuite/ChangeLog

	PR rtl-optimization/64164
	PR tree-optimization/67312
	* gcc.dg/pr67312.c: New.  From Zdenek Sojka.
	* gcc.target/i386/stackalign/return-4.c: Add -O.
---
 gcc/cfgexpand.c                                    |  332 +++++++-------
 gcc/cfgexpand.h                                    |    3 
 gcc/doc/invoke.texi                                |    9 
 gcc/explow.c                                       |   19 +
 gcc/function.c                                     |  477 +++++++-------------
 gcc/testsuite/gcc.dg/pr67312.c                     |    7 
 .../gcc.target/i386/stackalign/return-4.c          |    9 
 gcc/tree-outof-ssa.c                               |   15 -
 gcc/tree-outof-ssa.h                               |    6 
 gcc/tree-ssa-coalesce.c                            |  231 ++++++++--
 gcc/tree-ssa-coalesce.h                            |    1 
 gcc/tree-ssa-live.c                                |   10 
 12 files changed, 582 insertions(+), 537 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr67312.c

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 6c9284f..58e55d2 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -99,6 +99,8 @@ static rtx expand_debug_expr (tree);
 
 static bool defer_stack_allocation (tree, bool);
 
+static void record_alignment_for_reg_var (unsigned int);
+
 /* Return an expression tree corresponding to the RHS of GIMPLE
    statement STMT.  */
 
@@ -172,111 +174,86 @@ leader_merge (tree cur, tree next)
   return cur;
 }
 
-/* Return true if VAR is a PARM_DECL or a RESULT_DECL that ought to be
-   assigned to a stack slot.  We can't have expand_one_ssa_partition
-   choose their address: the pseudo holding the address would be set
-   up too late for assign_params to copy the parameter if needed.
-
-   Such parameters are likely passed as a pointer to the value, rather
-   than as a value, and so we must not coalesce them, nor allocate
-   stack space for them before determining the calling conventions for
-   them.
-
-   For their SSA_NAMEs, expand_one_ssa_partition emits RTL as MEMs
-   with pc_rtx as the address, and then it replaces the pc_rtx with
-   NULL so as to make sure the MEM is not used before it is adjusted
-   in assign_parm_setup_reg.  */
-
-bool
-parm_in_stack_slot_p (tree var)
-{
-  if (!var || VAR_P (var))
-    return false;
-
-  gcc_assert (TREE_CODE (var) == PARM_DECL
-	      || TREE_CODE (var) == RESULT_DECL);
-
-  return !use_register_for_decl (var);
-}
-
-/* Return the partition of the default SSA_DEF for decl VAR.  */
-
-static int
-ssa_default_def_partition (tree var)
-{
-  tree name = ssa_default_def (cfun, var);
-
-  if (!name)
-    return NO_PARTITION;
-
-  return var_to_partition (SA.map, name);
-}
-
-/* Return the RTL for the default SSA def of a PARM or RESULT, if
-   there is one.  */
-
-rtx
-get_rtl_for_parm_ssa_default_def (tree var)
-{
-  gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL);
-
-  if (!is_gimple_reg (var))
-    return NULL_RTX;
-
-  /* If we've already determined RTL for the decl, use it.  This is
-     not just an optimization: if VAR is a PARM whose incoming value
-     is unused, we won't find a default def to use its partition, but
-     we still want to use the location of the parm, if it was used at
-     all.  During assign_parms, until a location is assigned for the
-     VAR, RTL can only for a parm or result if we're not coalescing
-     across variables, when we know we're coalescing all SSA_NAMEs of
-     each parm or result, and we're not coalescing them with names
-     pertaining to other variables, such as other parms' default
-     defs.  */
-  if (DECL_RTL_SET_P (var))
-    {
-      gcc_assert (DECL_RTL (var) != pc_rtx);
-      return DECL_RTL (var);
-    }
-
-  int part = ssa_default_def_partition (var);
-  if (part == NO_PARTITION)
-    return NULL_RTX;
-
-  return SA.partition_to_pseudo[part];
-}
-
 /* Associate declaration T with storage space X.  If T is no
    SSA name this is exactly SET_DECL_RTL, otherwise make the
    partition of T associated with X.  */
 static inline void
 set_rtl (tree t, rtx x)
 {
-  if (x && SSAVAR (t))
+  gcc_checking_assert (!x
+		       || !(TREE_CODE (t) == SSA_NAME || is_gimple_reg (t))
+		       || (use_register_for_decl (t)
+			   ? (REG_P (x)
+			      || (GET_CODE (x) == CONCAT
+				  && (REG_P (XEXP (x, 0))
+				      || SUBREG_P (XEXP (x, 0)))
+				  && (REG_P (XEXP (x, 1))
+				      || SUBREG_P (XEXP (x, 1))))
+			      || (GET_CODE (x) == PARALLEL
+				  && SSAVAR (t)
+				  && TREE_CODE (SSAVAR (t)) == RESULT_DECL
+				  && !flag_tree_coalesce_vars))
+			   : (MEM_P (x) || x == pc_rtx
+			      || (GET_CODE (x) == CONCAT
+				  && MEM_P (XEXP (x, 0))
+				  && MEM_P (XEXP (x, 1))))));
+  /* Check that the RTL for SSA_NAMEs and gimple-reg PARM_DECLs and
+     RESULT_DECLs has the expected mode.  For memory, we accept
+     unpromoted modes, since that's what we're likely to get.  For
+     PARM_DECLs and RESULT_DECLs, we'll have been called by
+     set_parm_rtl, which will give us the default def, so we don't
+     have to compute it ourselves.  For RESULT_DECLs, we accept mode
+     mismatches too, as long as we're not coalescing across variables,
+     so that we don't reject BLKmode PARALLELs or unpromoted REGs.  */
+  gcc_checking_assert (!x || x == pc_rtx || TREE_CODE (t) != SSA_NAME
+		       || (SSAVAR (t) && TREE_CODE (SSAVAR (t)) == RESULT_DECL
+			   && !flag_tree_coalesce_vars)
+		       || !use_register_for_decl (t)
+		       || GET_MODE (x) == promote_ssa_mode (t, NULL));
+
+  if (x)
     {
       bool skip = false;
       tree cur = NULL_TREE;
-
-      if (MEM_P (x))
-	cur = MEM_EXPR (x);
-      else if (REG_P (x))
-	cur = REG_EXPR (x);
-      else if (GET_CODE (x) == CONCAT
-	       && REG_P (XEXP (x, 0)))
-	cur = REG_EXPR (XEXP (x, 0));
-      else if (GET_CODE (x) == PARALLEL)
-	cur = REG_EXPR (XVECEXP (x, 0, 0));
-      else if (x == pc_rtx)
+      rtx xm = x;
+
+    retry:
+      if (MEM_P (xm))
+	cur = MEM_EXPR (xm);
+      else if (REG_P (xm))
+	cur = REG_EXPR (xm);
+      else if (SUBREG_P (xm))
+	{
+	  gcc_assert (subreg_lowpart_p (xm));
+	  xm = SUBREG_REG (xm);
+	  goto retry;
+	}
+      else if (GET_CODE (xm) == CONCAT)
+	{
+	  xm = XEXP (xm, 0);
+	  goto retry;
+	}
+      else if (GET_CODE (xm) == PARALLEL)
+	{
+	  xm = XVECEXP (xm, 0, 0);
+	  gcc_assert (GET_CODE (xm) == EXPR_LIST);
+	  xm = XEXP (xm, 0);
+	  goto retry;
+	}
+      else if (xm == pc_rtx)
 	skip = true;
       else
 	gcc_unreachable ();
 
-      tree next = skip ? cur : leader_merge (cur, SSAVAR (t));
+      tree next = skip ? cur : leader_merge (cur, SSAVAR (t) ? SSAVAR (t) : t);
 
       if (cur != next)
 	{
 	  if (MEM_P (x))
-	    set_mem_attributes (x, next, true);
+	    set_mem_attributes (x,
+				next && TREE_CODE (next) == SSA_NAME
+				? TREE_TYPE (next)
+				: next, true);
 	  else
 	    set_reg_attrs_for_decl_rtl (next, x);
 	}
@@ -294,13 +271,11 @@ set_rtl (tree t, rtx x)
 	}
       /* For the benefit of debug information at -O0 (where
          vartracking doesn't run) record the place also in the base
-         DECL.  For PARMs and RESULTs, we may end up resetting these
-         in function.c:maybe_reset_rtl_for_parm, but in some rare
-         cases we may need them (unused and overwritten incoming
-         value, that at -O0 must share the location with the other
-         uses in spite of the missing default def), and this may be
-         the only chance to preserve them.  */
-      if (x && x != pc_rtx && SSA_NAME_VAR (t))
+         DECL.  For PARMs and RESULTs, do so only when setting the
+         default def.  */
+      if (x && x != pc_rtx && SSA_NAME_VAR (t)
+	  && (VAR_P (SSA_NAME_VAR (t))
+	      || SSA_NAME_IS_DEFAULT_DEF (t)))
 	{
 	  tree var = SSA_NAME_VAR (t);
 	  /* If we don't yet have something recorded, just record it now.  */
@@ -1242,6 +1217,49 @@ account_stack_vars (void)
   return size;
 }
 
+/* Record the RTL assignment X for the default def of PARM.  */
+
+extern void
+set_parm_rtl (tree parm, rtx x)
+{
+  gcc_assert (TREE_CODE (parm) == PARM_DECL
+	      || TREE_CODE (parm) == RESULT_DECL);
+
+  if (x && !MEM_P (x))
+    {
+      unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (parm),
+					      TYPE_MODE (TREE_TYPE (parm)),
+					      TYPE_ALIGN (TREE_TYPE (parm)));
+
+      /* If the variable alignment is very large we'll dynamically
+	 allocate it, which means that in-frame portion is just a
+	 pointer.  ??? We've got a pseudo for sure here, do we
+	 actually dynamically allocate its spilling area if needed?
+	 ??? Isn't it a problem when POINTER_SIZE also exceeds
+	 MAX_SUPPORTED_STACK_ALIGNMENT, as on cris and lm32?  */
+      if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
+	align = POINTER_SIZE;
+
+      record_alignment_for_reg_var (align);
+    }
+
+  if (!is_gimple_reg (parm))
+    return set_rtl (parm, x);
+
+  tree ssa = ssa_default_def (cfun, parm);
+  if (!ssa)
+    return set_rtl (parm, x);
+
+  int part = var_to_partition (SA.map, ssa);
+  gcc_assert (part != NO_PARTITION);
+
+  bool changed = bitmap_bit_p (SA.partitions_for_parm_default_defs, part);
+  gcc_assert (changed);
+
+  set_rtl (ssa, x);
+  gcc_assert (DECL_RTL (parm) == x);
+}
+
 /* A subroutine of expand_one_var.  Called to immediately assign rtl
    to a variable to be allocated in the stack frame.  */
 
@@ -1349,37 +1367,7 @@ expand_one_ssa_partition (tree var)
 
   if (!use_register_for_decl (var))
     {
-      /* We can't risk having the parm assigned to a MEM location
-	 whose address references a pseudo, for the pseudo will only
-	 be set up after arguments are copied to the stack slot.
-
-	 If the parm doesn't have a default def (e.g., because its
-	 incoming value is unused), then we want to let assign_params
-	 do the allocation, too.  In this case we want to make sure
-	 SSA_NAMEs associated with the parm don't get assigned to more
-	 than one partition, lest we'd create two unassigned stac
-	 slots for the same parm, thus the assert at the end of the
-	 block.  */
-      if (parm_in_stack_slot_p (SSA_NAME_VAR (var))
-	  && (ssa_default_def_partition (SSA_NAME_VAR (var)) == part
-	      || !ssa_default_def (cfun, SSA_NAME_VAR (var))))
-	{
-	  expand_one_stack_var_at (var, pc_rtx, 0, 0);
-	  rtx x = SA.partition_to_pseudo[part];
-	  gcc_assert (GET_CODE (x) == MEM);
-	  gcc_assert (XEXP (x, 0) == pc_rtx);
-	  /* Reset the address, so that any attempt to use it will
-	     ICE.  It will be adjusted in assign_parm_setup_reg.  */
-	  XEXP (x, 0) = NULL_RTX;
-	  /* If the RTL associated with the parm is not what we have
-	     just created, the parm has been split over multiple
-	     partitions.  In order for this to work, we must have a
-	     default def for the parm, otherwise assign_params won't
-	     know what to do.  */
-	  gcc_assert (DECL_RTL_IF_SET (SSA_NAME_VAR (var)) == x
-		      || ssa_default_def (cfun, SSA_NAME_VAR (var)));
-	}
-      else if (defer_stack_allocation (var, true))
+      if (defer_stack_allocation (var, true))
 	add_stack_var (var);
       else
 	expand_one_stack_var_1 (var);
@@ -1393,8 +1381,8 @@ expand_one_ssa_partition (tree var)
   set_rtl (var, x);
 }
 
-/* Record the association between the RTL generated for a partition
-   and the underlying variable of the SSA_NAME.  */
+/* Record the association between the RTL generated for partition PART
+   and the underlying variable of the SSA_NAME VAR.  */
 
 static void
 adjust_one_expanded_partition_var (tree var)
@@ -1410,12 +1398,7 @@ adjust_one_expanded_partition_var (tree var)
 
   rtx x = SA.partition_to_pseudo[part];
 
-  if (!x)
-    {
-      /* This var will get a stack slot later.  */
-      gcc_assert (defer_stack_allocation (var, true));
-      return;
-    }
+  gcc_assert (x);
 
   set_rtl (var, x);
 
@@ -2040,6 +2023,9 @@ expand_used_vars (void)
 
   for (i = 0; i < SA.map->num_partitions; i++)
     {
+      if (bitmap_bit_p (SA.partitions_for_parm_default_defs, i))
+	continue;
+
       tree var = partition_to_var (SA.map, i);
 
       gcc_assert (!virtual_operand_p (var));
@@ -2047,9 +2033,6 @@ expand_used_vars (void)
       expand_one_ssa_partition (var);
     }
 
-  for (i = 1; i < num_ssa_names; i++)
-    adjust_one_expanded_partition_var (ssa_name (i));
-
   if (flag_stack_protect == SPCT_FLAG_STRONG)
       gen_stack_protect_signal
 	= stack_protect_decl_p () || stack_protect_return_slot_p ();
@@ -4947,26 +4930,27 @@ expand_debug_expr (tree exp)
 	  }
 	else
 	  {
+	    /* If this is a reference to an incoming value of
+	       parameter that is never used in the code or where the
+	       incoming value is never used in the code, use
+	       PARM_DECL's DECL_RTL if set.  */
+	    if (SSA_NAME_IS_DEFAULT_DEF (exp)
+		&& SSA_NAME_VAR (exp)
+		&& TREE_CODE (SSA_NAME_VAR (exp)) == PARM_DECL
+		&& has_zero_uses (exp))
+	      {
+		op0 = expand_debug_parm_decl (SSA_NAME_VAR (exp));
+		if (op0)
+		  goto adjust_mode;
+		op0 = expand_debug_expr (SSA_NAME_VAR (exp));
+		if (op0)
+		  goto adjust_mode;
+	      }
+
 	    int part = var_to_partition (SA.map, exp);
 
 	    if (part == NO_PARTITION)
-	      {
-		/* If this is a reference to an incoming value of parameter
-		   that is never used in the code or where the incoming
-		   value is never used in the code, use PARM_DECL's
-		   DECL_RTL if set.  */
-		if (SSA_NAME_IS_DEFAULT_DEF (exp)
-		    && TREE_CODE (SSA_NAME_VAR (exp)) == PARM_DECL)
-		  {
-		    op0 = expand_debug_parm_decl (SSA_NAME_VAR (exp));
-		    if (op0)
-		      goto adjust_mode;
-		    op0 = expand_debug_expr (SSA_NAME_VAR (exp));
-		    if (op0)
-		      goto adjust_mode;
-		  }
-		return NULL;
-	      }
+	      return NULL;
 
 	    gcc_assert (part >= 0 && (unsigned)part < SA.map->num_partitions);
 
@@ -6216,9 +6200,26 @@ pass_expand::execute (function *fun)
       parm_birth_insn = var_seq;
     }
 
-  /* If we have a class containing differently aligned pointers
-     we need to merge those into the corresponding RTL pointer
-     alignment.  */
+  /* Now propagate the RTL assignment of each partition to the
+     underlying var of each SSA_NAME.  */
+  for (i = 1; i < num_ssa_names; i++)
+    {
+      tree name = ssa_name (i);
+
+      if (!name
+	  /* We might have generated new SSA names in
+	     update_alias_info_with_stack_vars.  They will have a NULL
+	     defining statements, and won't be part of the partitioning,
+	     so ignore those.  */
+	  || !SSA_NAME_DEF_STMT (name))
+	continue;
+
+      adjust_one_expanded_partition_var (name);
+    }
+
+  /* Clean up RTL of variables that straddle across multiple
+     partitions, and check that the rtl of any PARM_DECLs that are not
+     cleaned up is that of their default defs.  */
   for (i = 1; i < num_ssa_names; i++)
     {
       tree name = ssa_name (i);
@@ -6235,9 +6236,6 @@ pass_expand::execute (function *fun)
       if (part == NO_PARTITION)
 	continue;
 
-      gcc_assert (SA.partition_to_pseudo[part]
-		  || defer_stack_allocation (name, true));
-
       /* If this decl was marked as living in multiple places, reset
 	 this now to NULL.  */
       tree var = SSA_NAME_VAR (name);
@@ -6252,7 +6250,19 @@ pass_expand::execute (function *fun)
 	  rtx in = DECL_RTL_IF_SET (var);
 	  gcc_assert (in);
 	  rtx out = SA.partition_to_pseudo[part];
-	  gcc_assert (in == out || rtx_equal_p (in, out));
+	  gcc_assert (in == out);
+
+	  /* Now reset VAR's RTL to IN, so that the _EXPR attrs match
+	     those expected by debug backends for each parm and for
+	     the result.  This is particularly important for stabs,
+	     whose register elimination from parm's DECL_RTL may cause
+	     -fcompare-debug differences as SET_DECL_RTL changes reg's
+	     attrs.  So, make sure the RTL already has the parm as the
+	     EXPR, so that it won't change.  */
+	  SET_DECL_RTL (var, NULL_RTX);
+	  if (MEM_P (in))
+	    set_mem_attributes (in, var, true);
+	  SET_DECL_RTL (var, in);
 	}
     }
 
diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
index ff7f4bef..8852411 100644
--- a/gcc/cfgexpand.h
+++ b/gcc/cfgexpand.h
@@ -22,8 +22,7 @@ along with GCC; see the file COPYING3.  If not see
 
 extern tree gimple_assign_rhs_to_tree (gimple *);
 extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
-extern bool parm_in_stack_slot_p (tree);
-extern rtx get_rtl_for_parm_ssa_default_def (tree var);
+extern void set_parm_rtl (tree, rtx);
 
 
 #endif /* GCC_CFGEXPAND_H */
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 09c58ee..aefb061 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -8866,12 +8866,13 @@ profitable to parallelize the loops.
 
 @item -ftree-coalesce-vars
 @opindex ftree-coalesce-vars
-Tell the compiler to attempt to combine small user-defined variables
-too, instead of just compiler temporaries.  This may severely limit the
-ability to debug an optimized program compiled with
+While transforming the program out of the SSA representation, attempt to
+reduce copying by coalescing versions of different user-defined
+variables, instead of just compiler temporaries.  This may severely
+limit the ability to debug an optimized program compiled with
 @option{-fno-var-tracking-assignments}.  In the negated form, this flag
 prevents SSA coalescing of user variables.  This option is enabled by
-default if optimization is enabled.
+default if optimization is enabled, and it does very little otherwise.
 
 @item -ftree-loop-if-convert
 @opindex ftree-loop-if-convert
diff --git a/gcc/explow.c b/gcc/explow.c
index 6941f4e..d104a79 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -830,8 +830,10 @@ promote_decl_mode (const_tree decl, int *punsignedp)
   machine_mode mode = DECL_MODE (decl);
   machine_mode pmode;
 
-  if (TREE_CODE (decl) == RESULT_DECL
-      || TREE_CODE (decl) == PARM_DECL)
+  if (TREE_CODE (decl) == RESULT_DECL && !DECL_BY_REFERENCE (decl))
+    pmode = promote_function_mode (type, mode, &unsignedp,
+                                   TREE_TYPE (current_function_decl), 1);
+  else if (TREE_CODE (decl) == RESULT_DECL || TREE_CODE (decl) == PARM_DECL)
     pmode = promote_function_mode (type, mode, &unsignedp,
                                    TREE_TYPE (current_function_decl), 2);
   else
@@ -857,12 +859,23 @@ promote_ssa_mode (const_tree name, int *punsignedp)
   if (SSA_NAME_VAR (name)
       && (TREE_CODE (SSA_NAME_VAR (name)) == PARM_DECL
 	  || TREE_CODE (SSA_NAME_VAR (name)) == RESULT_DECL))
-    return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
+    {
+      machine_mode mode = promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
+      if (mode != BLKmode)
+	return mode;
+    }
 
   tree type = TREE_TYPE (name);
   int unsignedp = TYPE_UNSIGNED (type);
   machine_mode mode = TYPE_MODE (type);
 
+  /* Bypass TYPE_MODE when it maps vector modes to BLKmode.  */
+  if (mode == BLKmode)
+    {
+      gcc_assert (VECTOR_TYPE_P (type));
+      mode = type->type_common.mode;
+    }
+
   machine_mode pmode = promote_mode (type, mode, &unsignedp);
   if (punsignedp)
     *punsignedp = unsignedp;
diff --git a/gcc/function.c b/gcc/function.c
index 9b4c2b9..21304689 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -74,8 +74,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfgbuild.h"
 #include "cfgcleanup.h"
 #include "cfgexpand.h"
-#include "basic-block.h"
-#include "df.h"
 #include "params.h"
 #include "bb-reorder.h"
 #include "shrink-wrap.h"
@@ -83,6 +81,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "rtl-iter.h"
 #include "tree-chkp.h"
 #include "rtl-chkp.h"
+#include "tree-dfa.h"
 
 /* So we can assign to cfun in this file.  */
 #undef cfun
@@ -152,9 +151,6 @@ static bool contains (const_rtx, hash_table<insn_cache_hasher> *);
 static void prepare_function_start (void);
 static void do_clobber_return_reg (rtx, void *);
 static void do_use_return_reg (rtx, void *);
-static rtx rtl_for_parm (struct assign_parm_data_all *, tree);
-static void maybe_reset_rtl_for_parm (tree);
-static bool parm_in_unassigned_mem_p (tree, rtx);
 
 \f
 /* Stack of nested functions.  */
@@ -2145,6 +2141,47 @@ use_register_for_decl (const_tree decl)
   if (TREE_ADDRESSABLE (decl))
     return false;
 
+  /* RESULT_DECLs are a bit special in that they're assigned without
+     regard to use_register_for_decl, but we generally only store in
+     them.  If we coalesce their SSA NAMEs, we'd better return a
+     result that matches the assignment in expand_function_start.  */
+  if (TREE_CODE (decl) == RESULT_DECL)
+    {
+      /* If it's not an aggregate, we're going to use a REG or a
+	 PARALLEL containing a REG.  */
+      if (!aggregate_value_p (decl, current_function_decl))
+	return true;
+
+      /* If expand_function_start determines the return value, we'll
+	 use MEM if it's not by reference.  */
+      if (cfun->returns_pcc_struct
+	  || (targetm.calls.struct_value_rtx
+	      (TREE_TYPE (current_function_decl), 1)))
+	return DECL_BY_REFERENCE (decl);
+
+      /* Otherwise, we're taking an extra all.function_result_decl
+	 argument.  It's set up in assign_parms_augmented_arg_list,
+	 under the (negated) conditions above, and then it's used to
+	 set up the RESULT_DECL rtl in assign_params, after looping
+	 over all parameters.  Now, if the RESULT_DECL is not by
+	 reference, we'll use a MEM either way.  */
+      if (!DECL_BY_REFERENCE (decl))
+	return false;
+
+      /* Otherwise, if RESULT_DECL is DECL_BY_REFERENCE, it will take
+	 the function_result_decl's assignment.  Since it's a pointer,
+	 we can short-circuit a number of the tests below, and we must
+	 duplicate them because we don't have the
+	 function_result_decl to test.  */
+      if (!targetm.calls.allocate_stack_slots_for_args ())
+	return true;
+      /* We don't set DECL_IGNORED_P for the function_result_decl.  */
+      if (optimize)
+	return true;
+      /* We don't set DECL_REGISTER for the function_result_decl.  */
+      return false;
+    }
+
   /* Decl is implicitly addressible by bound stores and loads
      if it is an aggregate holding bounds.  */
   if (chkp_function_instrumented_p (current_function_decl)
@@ -2272,7 +2309,7 @@ assign_parms_initialize_all (struct assign_parm_data_all *all)
    needed, else the old list.  */
 
 static void
-split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
+split_complex_args (vec<tree> *args)
 {
   unsigned i;
   tree p;
@@ -2283,7 +2320,6 @@ split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
       if (TREE_CODE (type) == COMPLEX_TYPE
 	  && targetm.calls.split_complex_arg (type))
 	{
-	  tree cparm = p;
 	  tree decl;
 	  tree subtype = TREE_TYPE (type);
 	  bool addressable = TREE_ADDRESSABLE (p);
@@ -2302,9 +2338,6 @@ split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
 	  DECL_ARTIFICIAL (p) = addressable;
 	  DECL_IGNORED_P (p) = addressable;
 	  TREE_ADDRESSABLE (p) = 0;
-	  /* Reset the RTL before layout_decl, or it may change the
-	     mode of the RTL of the original argument copied to P.  */
-	  SET_DECL_RTL (p, NULL_RTX);
 	  layout_decl (p, 0);
 	  (*args)[i] = p;
 
@@ -2316,41 +2349,6 @@ split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
 	  DECL_IGNORED_P (decl) = addressable;
 	  layout_decl (decl, 0);
 	  args->safe_insert (++i, decl);
-
-	  /* If we are expanding a function, rather than gimplifying
-	     it, propagate the RTL of the complex parm to the split
-	     declarations, and set their contexts so that
-	     maybe_reset_rtl_for_parm can recognize them and refrain
-	     from resetting their RTL.  */
-	  if (currently_expanding_to_rtl)
-	    {
-	      maybe_reset_rtl_for_parm (cparm);
-	      rtx rtl = rtl_for_parm (all, cparm);
-	      if (rtl)
-		{
-		  /* If this is parm is unassigned, assign it now: the
-		     newly-created decls wouldn't expect the need for
-		     assignment, and if they were assigned
-		     independently, they might not end up in adjacent
-		     slots, so unsplit wouldn't be able to fill in the
-		     unassigned address of the complex MEM.  */
-		  if (parm_in_unassigned_mem_p (cparm, rtl))
-		    {
-		      int align = STACK_SLOT_ALIGNMENT
-			(TREE_TYPE (cparm), GET_MODE (rtl), MEM_ALIGN (rtl));
-		      rtx loc = assign_stack_local
-			(GET_MODE (rtl), GET_MODE_SIZE (GET_MODE (rtl)),
-			 align);
-		      XEXP (rtl, 0) = XEXP (loc, 0);
-		    }
-
-		  SET_DECL_RTL (p, read_complex_part (rtl, false));
-		  SET_DECL_RTL (decl, read_complex_part (rtl, true));
-
-		  DECL_CONTEXT (p) = cparm;
-		  DECL_CONTEXT (decl) = cparm;
-		}
-	    }
 	}
     }
 }
@@ -2386,6 +2384,9 @@ assign_parms_augmented_arg_list (struct assign_parm_data_all *all)
       DECL_ARTIFICIAL (decl) = 1;
       DECL_NAMELESS (decl) = 1;
       TREE_CONSTANT (decl) = 1;
+      /* We don't set DECL_IGNORED_P or DECL_REGISTER here.  If this
+	 changes, the end of the RESULT_DECL handling block in
+	 use_register_for_decl must be adjusted to match.  */
 
       DECL_CHAIN (decl) = all->orig_fnargs;
       all->orig_fnargs = decl;
@@ -2413,7 +2414,7 @@ assign_parms_augmented_arg_list (struct assign_parm_data_all *all)
 
   /* If the target wants to split complex arguments into scalars, do so.  */
   if (targetm.calls.split_complex_arg)
-    split_complex_args (all, &fnargs);
+    split_complex_args (&fnargs);
 
   return fnargs;
 }
@@ -2816,98 +2817,23 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data)
   data->entry_parm = entry_parm;
 }
 
-/* Wrapper for use_register_for_decl, that special-cases the
-   .result_ptr as the function's RESULT_DECL when the RESULT_DECL is
-   passed by reference.  */
-
-static bool
-use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm)
-{
-  if (parm == all->function_result_decl)
-    {
-      tree result = DECL_RESULT (current_function_decl);
-
-      if (DECL_BY_REFERENCE (result))
-	parm = result;
-    }
-
-  return use_register_for_decl (parm);
-}
-
-/* Wrapper for get_rtl_for_parm_ssa_default_def, that special-cases
-   the .result_ptr as the function's RESULT_DECL when the RESULT_DECL
-   is passed by reference.  */
-
-static rtx
-rtl_for_parm (struct assign_parm_data_all *all, tree parm)
-{
-  if (parm == all->function_result_decl)
-    {
-      tree result = DECL_RESULT (current_function_decl);
-
-      if (!DECL_BY_REFERENCE (result))
-	return NULL_RTX;
-
-      parm = result;
-    }
-
-  return get_rtl_for_parm_ssa_default_def (parm);
-}
-
-/* Reset the location of PARM_DECLs and RESULT_DECLs that had
-   SSA_NAMEs in multiple partitions, so that assign_parms will choose
-   the default def, if it exists, or create new RTL to hold the unused
-   entry value.  If we are coalescing across variables, we want to
-   reset the location too, because a parm without a default def
-   (incoming value unused) might be coalesced with one with a default
-   def, and then assign_parms would copy both incoming values to the
-   same location, which might cause the wrong value to survive.  */
-static void
-maybe_reset_rtl_for_parm (tree parm)
-{
-  gcc_assert (TREE_CODE (parm) == PARM_DECL
-	      || TREE_CODE (parm) == RESULT_DECL);
-
-  /* This is a split complex parameter, and its context was set to its
-     original PARM_DECL in split_complex_args so that we could
-     recognize it here and not reset its RTL.  */
-  if (DECL_CONTEXT (parm) && TREE_CODE (DECL_CONTEXT (parm)) == PARM_DECL)
-    {
-      DECL_CONTEXT (parm) = DECL_CONTEXT (DECL_CONTEXT (parm));
-      return;
-    }
-
-  if ((flag_tree_coalesce_vars
-       || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx))
-      && is_gimple_reg (parm))
-    SET_DECL_RTL (parm, NULL_RTX);
-}
-
 /* A subroutine of assign_parms.  Adjust DATA->STACK_RTL such that it's
    always valid and properly aligned.  */
 
 static void
-assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm,
-			      struct assign_parm_data_one *data)
+assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data)
 {
   rtx stack_parm = data->stack_parm;
 
-  /* If out-of-SSA assigned RTL to the parm default def, make sure we
-     don't use what we might have computed before.  */
-  rtx ssa_assigned = rtl_for_parm (all, parm);
-  if (ssa_assigned)
-    stack_parm = NULL;
-
   /* If we can't trust the parm stack slot to be aligned enough for its
      ultimate type, don't use that slot after entry.  We'll make another
      stack slot, if we need one.  */
-  else if (stack_parm
-	   && ((STRICT_ALIGNMENT
-		&& (GET_MODE_ALIGNMENT (data->nominal_mode)
-		    > MEM_ALIGN (stack_parm)))
-	       || (data->nominal_type
-		   && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
-		   && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
+  if (stack_parm
+      && ((STRICT_ALIGNMENT
+	   && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm))
+	  || (data->nominal_type
+	      && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
+	      && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
     stack_parm = NULL;
 
   /* If parm was passed in memory, and we need to convert it on entry,
@@ -2952,27 +2878,6 @@ assign_parm_setup_block_p (struct assign_parm_data_one *data)
   return false;
 }
 
-/* Return true if FROM_EXPAND is a MEM with an address to be filled in
-   by assign_params.  This should be the case if, and only if,
-   parm_in_stack_slot_p holds for the parm DECL that expanded to
-   FROM_EXPAND, so we check that, too.  */
-
-static bool
-parm_in_unassigned_mem_p (tree decl, rtx from_expand)
-{
-  bool result = MEM_P (from_expand) && !XEXP (from_expand, 0);
-
-  gcc_assert (result == parm_in_stack_slot_p (decl)
-	      /* Maybe it was already assigned.  That's ok, especially
-		 for split complex args.  */
-	      || (!result && MEM_P (from_expand)
-		  && (XEXP (from_expand, 0) == virtual_stack_vars_rtx
-		      || (GET_CODE (XEXP (from_expand, 0)) == PLUS
-			  && XEXP (XEXP (from_expand, 0), 0) == virtual_stack_vars_rtx))));
-
-  return result;
-}
-
 /* A subroutine of assign_parms.  Arrange for the parameter to be
    present and valid in DATA->STACK_RTL.  */
 
@@ -2982,38 +2887,39 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
 {
   rtx entry_parm = data->entry_parm;
   rtx stack_parm = data->stack_parm;
+  rtx target_reg = NULL_RTX;
   HOST_WIDE_INT size;
   HOST_WIDE_INT size_stored;
 
   if (GET_CODE (entry_parm) == PARALLEL)
     entry_parm = emit_group_move_into_temps (entry_parm);
 
+  /* If we want the parameter in a pseudo, don't use a stack slot.  */
+  if (is_gimple_reg (parm) && use_register_for_decl (parm))
+    {
+      tree def = ssa_default_def (cfun, parm);
+      gcc_assert (def);
+      machine_mode mode = promote_ssa_mode (def, NULL);
+      rtx reg = gen_reg_rtx (mode);
+      if (GET_CODE (reg) != CONCAT)
+	stack_parm = reg;
+      else
+	/* This will use or allocate a stack slot that we'd rather
+	   avoid.  FIXME: Could we avoid it in more cases?  */
+	target_reg = reg;
+      data->stack_parm = NULL;
+    }
+
   size = int_size_in_bytes (data->passed_type);
   size_stored = CEIL_ROUND (size, UNITS_PER_WORD);
-
   if (stack_parm == 0)
     {
       DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
-      rtx from_expand = rtl_for_parm (all, parm);
-      if (from_expand && !parm_in_unassigned_mem_p (parm, from_expand))
-	stack_parm = copy_rtx (from_expand);
-      else
-	{
-	  stack_parm = assign_stack_local (BLKmode, size_stored,
-					   DECL_ALIGN (parm));
-	  if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
-	    PUT_MODE (stack_parm, GET_MODE (entry_parm));
-	  if (from_expand)
-	    {
-	      gcc_assert (GET_CODE (stack_parm) == MEM);
-	      gcc_assert (parm_in_unassigned_mem_p (parm, from_expand));
-	      XEXP (from_expand, 0) = XEXP (stack_parm, 0);
-	      PUT_MODE (from_expand, GET_MODE (stack_parm));
-	      stack_parm = copy_rtx (from_expand);
-	    }
-	  else
-	    set_mem_attributes (stack_parm, parm, 1);
-	}
+      stack_parm = assign_stack_local (BLKmode, size_stored,
+				       DECL_ALIGN (parm));
+      if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
+	PUT_MODE (stack_parm, GET_MODE (entry_parm));
+      set_mem_attributes (stack_parm, parm, 1);
     }
 
   /* If a BLKmode arrives in registers, copy it to a stack slot.  Handle
@@ -3054,11 +2960,6 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
       else if (size == 0)
 	;
 
-      /* MEM may be a REG if coalescing assigns the param's partition
-	 to a pseudo.  */
-      else if (REG_P (mem))
-	emit_move_insn (mem, entry_parm);
-
       /* If SIZE is that of a mode no bigger than a word, just use
 	 that mode's store operation.  */
       else if (size <= UNITS_PER_WORD)
@@ -3113,10 +3014,14 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
 	      tem = change_address (mem, word_mode, 0);
 	      emit_move_insn (tem, x);
 	    }
+	  else if (!MEM_P (mem))
+	    emit_move_insn (mem, entry_parm);
 	  else
 	    move_block_from_reg (REGNO (entry_parm), mem,
 				 size_stored / UNITS_PER_WORD);
 	}
+      else if (!MEM_P (mem))
+	emit_move_insn (mem, entry_parm);
       else
 	move_block_from_reg (REGNO (entry_parm), mem,
 			     size_stored / UNITS_PER_WORD);
@@ -3131,8 +3036,14 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
       end_sequence ();
     }
 
+  if (target_reg)
+    {
+      emit_move_insn (target_reg, stack_parm);
+      stack_parm = target_reg;
+    }
+
   data->stack_parm = stack_parm;
-  SET_DECL_RTL (parm, stack_parm);
+  set_parm_rtl (parm, stack_parm);
 }
 
 /* A subroutine of assign_parms.  Allocate a pseudo to hold the current
@@ -3148,6 +3059,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
   int unsignedp = TYPE_UNSIGNED (TREE_TYPE (parm));
   bool did_conversion = false;
   bool need_conversion, moved;
+  rtx rtl;
 
   /* Store the parm in a pseudoregister during the function, but we may
      need to do it in a wider mode.  Using 2 here makes the result
@@ -3156,40 +3068,19 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
     = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
 			     TREE_TYPE (current_function_decl), 2);
 
-  rtx from_expand = parmreg = rtl_for_parm (all, parm);
-
-  if (from_expand && !data->passed_pointer)
-    {
-      if (GET_MODE (parmreg) != promoted_nominal_mode)
-	parmreg = gen_lowpart (promoted_nominal_mode, parmreg);
-    }
-  else if (!from_expand || parm_in_unassigned_mem_p (parm, from_expand))
-    {
-      parmreg = gen_reg_rtx (promoted_nominal_mode);
-      if (!DECL_ARTIFICIAL (parm))
-	mark_user_reg (parmreg);
-
-      if (from_expand)
-	{
-	  gcc_assert (data->passed_pointer);
-	  gcc_assert (GET_CODE (from_expand) == MEM
-		      && XEXP (from_expand, 0) == NULL_RTX);
-	  XEXP (from_expand, 0) = parmreg;
-	}
-    }
+  parmreg = gen_reg_rtx (promoted_nominal_mode);
+  if (!DECL_ARTIFICIAL (parm))
+    mark_user_reg (parmreg);
 
   /* If this was an item that we received a pointer to,
-     set DECL_RTL appropriately.  */
-  if (from_expand)
-    SET_DECL_RTL (parm, from_expand);
-  else if (data->passed_pointer)
+     set rtl appropriately.  */
+  if (data->passed_pointer)
     {
-      rtx x = gen_rtx_MEM (TYPE_MODE (TREE_TYPE (data->passed_type)), parmreg);
-      set_mem_attributes (x, parm, 1);
-      SET_DECL_RTL (parm, x);
+      rtl = gen_rtx_MEM (TYPE_MODE (TREE_TYPE (data->passed_type)), parmreg);
+      set_mem_attributes (rtl, parm, 1);
     }
   else
-    SET_DECL_RTL (parm, parmreg);
+    rtl = parmreg;
 
   assign_parm_remove_parallels (data);
 
@@ -3197,13 +3088,10 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
      assign_parm_find_data_types and expand_expr_real_1.  */
 
   equiv_stack_parm = data->stack_parm;
-  if (!equiv_stack_parm)
-    equiv_stack_parm = data->entry_parm;
   validated_mem = validize_mem (copy_rtx (data->entry_parm));
 
   need_conversion = (data->nominal_mode != data->passed_mode
 		     || promoted_nominal_mode != data->promoted_mode);
-  gcc_assert (!(need_conversion && data->passed_pointer && from_expand));
   moved = false;
 
   if (need_conversion
@@ -3327,7 +3215,9 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 
       /* TREE_USED gets set erroneously during expand_assignment.  */
       save_tree_used = TREE_USED (parm);
+      SET_DECL_RTL (parm, rtl);
       expand_assignment (parm, make_tree (data->nominal_type, tempreg), false);
+      SET_DECL_RTL (parm, NULL_RTX);
       TREE_USED (parm) = save_tree_used;
       all->first_conversion_insn = get_insns ();
       all->last_conversion_insn = get_last_insn ();
@@ -3335,28 +3225,16 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 
       did_conversion = true;
     }
-  /* We don't want to copy the incoming pointer to a parmreg expected
-     to hold the value rather than the pointer.  */
-  else if (!data->passed_pointer || parmreg != from_expand)
+  else
     emit_move_insn (parmreg, validated_mem);
 
   /* If we were passed a pointer but the actual value can safely live
      in a register, retrieve it and use it directly.  */
-  if (data->passed_pointer
-      && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode))
+  if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode)
     {
-      rtx src = DECL_RTL (parm);
-
       /* We can't use nominal_mode, because it will have been set to
 	 Pmode above.  We must use the actual mode of the parm.  */
-      if (from_expand)
-	{
-	  parmreg = from_expand;
-	  gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm)));
-	  src = gen_rtx_MEM (GET_MODE (parmreg), validated_mem);
-	  set_mem_attributes (src, parm, 1);
-	}
-      else if (use_register_for_decl (parm))
+      if (use_register_for_decl (parm))
 	{
 	  parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm)));
 	  mark_user_reg (parmreg);
@@ -3373,14 +3251,14 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 	  set_mem_attributes (parmreg, parm, 1);
 	}
 
-      if (GET_MODE (parmreg) != GET_MODE (src))
+      if (GET_MODE (parmreg) != GET_MODE (rtl))
 	{
-	  rtx tempreg = gen_reg_rtx (GET_MODE (src));
+	  rtx tempreg = gen_reg_rtx (GET_MODE (rtl));
 	  int unsigned_p = TYPE_UNSIGNED (TREE_TYPE (parm));
 
 	  push_to_sequence2 (all->first_conversion_insn,
 			     all->last_conversion_insn);
-	  emit_move_insn (tempreg, src);
+	  emit_move_insn (tempreg, rtl);
 	  tempreg = convert_to_mode (GET_MODE (parmreg), tempreg, unsigned_p);
 	  emit_move_insn (parmreg, tempreg);
 	  all->first_conversion_insn = get_insns ();
@@ -3389,18 +3267,18 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 
 	  did_conversion = true;
 	}
-      else if (GET_MODE (parmreg) == BLKmode)
-	gcc_assert (parm_in_stack_slot_p (parm));
       else
-	emit_move_insn (parmreg, src);
+	emit_move_insn (parmreg, rtl);
 
-      SET_DECL_RTL (parm, parmreg);
+      rtl = parmreg;
 
       /* STACK_PARM is the pointer, not the parm, and PARMREG is
 	 now the parm.  */
-      data->stack_parm = equiv_stack_parm = NULL;
+      data->stack_parm = NULL;
     }
 
+  set_parm_rtl (parm, rtl);
+
   /* Mark the register as eliminable if we did no conversion and it was
      copied from memory at a fixed offset, and the arg pointer was not
      copied to a pseudo-reg.  If the arg pointer is a pseudo reg or the
@@ -3408,11 +3286,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
      make here would screw up life analysis for it.  */
   if (data->nominal_mode == data->passed_mode
       && !did_conversion
-      && equiv_stack_parm != 0
-      && MEM_P (equiv_stack_parm)
+      && data->stack_parm != 0
+      && MEM_P (data->stack_parm)
       && data->locate.offset.var == 0
       && reg_mentioned_p (virtual_incoming_args_rtx,
-			  XEXP (equiv_stack_parm, 0)))
+			  XEXP (data->stack_parm, 0)))
     {
       rtx_insn *linsn = get_last_insn ();
       rtx_insn *sinsn;
@@ -3425,8 +3303,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 	    = GET_MODE_INNER (GET_MODE (parmreg));
 	  int regnor = REGNO (XEXP (parmreg, 0));
 	  int regnoi = REGNO (XEXP (parmreg, 1));
-	  rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0);
-	  rtx stacki = adjust_address_nv (equiv_stack_parm, submode,
+	  rtx stackr = adjust_address_nv (data->stack_parm, submode, 0);
+	  rtx stacki = adjust_address_nv (data->stack_parm, submode,
 					  GET_MODE_SIZE (submode));
 
 	  /* Scan backwards for the set of the real and
@@ -3444,7 +3322,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 		set_unique_reg_note (sinsn, REG_EQUIV, stackr);
 	    }
 	}
-      else 
+      else
 	set_dst_reg_note (linsn, REG_EQUIV, equiv_stack_parm, parmreg);
     }
 
@@ -3496,16 +3374,6 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
   if (data->entry_parm != data->stack_parm)
     {
       rtx src, dest;
-      rtx from_expand = NULL_RTX;
-
-      if (data->stack_parm == 0)
-	{
-	  from_expand = rtl_for_parm (all, parm);
-	  if (from_expand)
-	    gcc_assert (GET_MODE (from_expand) == GET_MODE (data->entry_parm));
-	  if (from_expand && !parm_in_unassigned_mem_p (parm, from_expand))
-	    data->stack_parm = from_expand;
-	}
 
       if (data->stack_parm == 0)
 	{
@@ -3516,16 +3384,7 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
 	    = assign_stack_local (GET_MODE (data->entry_parm),
 				  GET_MODE_SIZE (GET_MODE (data->entry_parm)),
 				  align);
-	  if (!from_expand)
-	    set_mem_attributes (data->stack_parm, parm, 1);
-	  else
-	    {
-	      gcc_assert (GET_CODE (data->stack_parm) == MEM);
-	      gcc_assert (parm_in_unassigned_mem_p (parm, from_expand));
-	      XEXP (from_expand, 0) = XEXP (data->stack_parm, 0);
-	      PUT_MODE (from_expand, GET_MODE (data->stack_parm));
-	      data->stack_parm = copy_rtx (from_expand);
-	    }
+	  set_mem_attributes (data->stack_parm, parm, 1);
 	}
 
       dest = validize_mem (copy_rtx (data->stack_parm));
@@ -3554,7 +3413,7 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
       end_sequence ();
     }
 
-  SET_DECL_RTL (parm, data->stack_parm);
+  set_parm_rtl (parm, data->stack_parm);
 }
 
 /* A subroutine of assign_parms.  If the ABI splits complex arguments, then
@@ -3580,21 +3439,11 @@ assign_parms_unsplit_complex (struct assign_parm_data_all *all,
 	  imag = DECL_RTL (fnargs[i + 1]);
 	  if (inner != GET_MODE (real))
 	    {
-	      real = simplify_gen_subreg (inner, real, GET_MODE (real),
-					  subreg_lowpart_offset
-					  (inner, GET_MODE (real)));
-	      imag = simplify_gen_subreg (inner, imag, GET_MODE (imag),
-					  subreg_lowpart_offset
-					  (inner, GET_MODE (imag)));
+	      real = gen_lowpart_SUBREG (inner, real);
+	      imag = gen_lowpart_SUBREG (inner, imag);
 	    }
 
-	  if ((tmp = rtl_for_parm (all, parm)) != NULL_RTX
-	      && rtx_equal_p (real,
-			      read_complex_part (tmp, false))
-	      && rtx_equal_p (imag,
-			      read_complex_part (tmp, true)))
-	    ; /* We now have the right rtl in tmp.  */
-	  else if (TREE_ADDRESSABLE (parm))
+	  if (TREE_ADDRESSABLE (parm))
 	    {
 	      rtx rmem, imem;
 	      HOST_WIDE_INT size = int_size_in_bytes (TREE_TYPE (parm));
@@ -3618,7 +3467,7 @@ assign_parms_unsplit_complex (struct assign_parm_data_all *all,
 	    }
 	  else
 	    tmp = gen_rtx_CONCAT (DECL_MODE (parm), real, imag);
-	  SET_DECL_RTL (parm, tmp);
+	  set_parm_rtl (parm, tmp);
 
 	  real = DECL_INCOMING_RTL (fnargs[i]);
 	  imag = DECL_INCOMING_RTL (fnargs[i + 1]);
@@ -3740,7 +3589,7 @@ assign_bounds (vec<bounds_parm_data> &bndargs,
 	  assign_parm_setup_block (&all, pbdata->bounds_parm,
 				   &pbdata->parm_data);
 	else if (pbdata->parm_data.passed_pointer
-		 || use_register_for_parm_decl (&all, pbdata->bounds_parm))
+		 || use_register_for_decl (pbdata->bounds_parm))
 	  assign_parm_setup_reg (&all, pbdata->bounds_parm,
 				 &pbdata->parm_data);
 	else
@@ -3784,8 +3633,6 @@ assign_parms (tree fndecl)
 	  DECL_INCOMING_RTL (parm) = DECL_RTL (parm);
 	  continue;
 	}
-      else
-	maybe_reset_rtl_for_parm (parm);
 
       /* Estimate stack alignment from parameter alignment.  */
       if (SUPPORTS_STACK_ALIGNMENT)
@@ -3835,7 +3682,7 @@ assign_parms (tree fndecl)
       else
 	set_decl_incoming_rtl (parm, data.entry_parm, false);
 
-      assign_parm_adjust_stack_rtl (&all, parm, &data);
+      assign_parm_adjust_stack_rtl (&data);
 
       /* Bounds should be loaded in the particular order to
 	 have registers allocated correctly.  Collect info about
@@ -3856,8 +3703,7 @@ assign_parms (tree fndecl)
 	{
 	  if (assign_parm_setup_block_p (&data))
 	    assign_parm_setup_block (&all, parm, &data);
-	  else if (data.passed_pointer
-		   || use_register_for_parm_decl (&all, parm))
+	  else if (data.passed_pointer || use_register_for_decl (parm))
 	    assign_parm_setup_reg (&all, parm, &data);
 	  else
 	    assign_parm_setup_stack (&all, parm, &data);
@@ -3954,7 +3800,7 @@ assign_parms (tree fndecl)
 
       DECL_HAS_VALUE_EXPR_P (result) = 1;
 
-      SET_DECL_RTL (result, x);
+      set_parm_rtl (result, x);
     }
 
   /* We have aligned all the args, so add space for the pretend args.  */
@@ -4986,6 +4832,18 @@ allocate_struct_function (tree fndecl, bool abstract_p)
   if (fndecl != NULL_TREE)
     {
       tree result = DECL_RESULT (fndecl);
+
+      if (!abstract_p)
+	{
+	  /* Now that we have activated any function-specific attributes
+	     that might affect layout, particularly vector modes, relayout
+	     each of the parameters and the result.  */
+	  relayout_decl (result);
+	  for (tree parm = DECL_ARGUMENTS (fndecl); parm;
+	       parm = DECL_CHAIN (parm))
+	    relayout_decl (parm);
+	}
+
       if (!abstract_p && aggregate_value_p (result, fndecl))
 	{
 #ifdef PCC_STATIC_STRUCT_RETURN
@@ -5189,7 +5047,6 @@ expand_function_start (tree subr)
 
   /* Decide whether to return the value in memory or in a register.  */
   tree res = DECL_RESULT (subr);
-  maybe_reset_rtl_for_parm (res);
   if (aggregate_value_p (res, subr))
     {
       /* Returning something that won't go in a register.  */
@@ -5210,10 +5067,7 @@ expand_function_start (tree subr)
 	     it.  */
 	  if (sv)
 	    {
-	      if (DECL_BY_REFERENCE (res))
-		value_address = get_rtl_for_parm_ssa_default_def (res);
-	      if (!value_address)
-		value_address = gen_reg_rtx (Pmode);
+	      value_address = gen_reg_rtx (Pmode);
 	      emit_move_insn (value_address, sv);
 	    }
 	}
@@ -5222,33 +5076,35 @@ expand_function_start (tree subr)
 	  rtx x = value_address;
 	  if (!DECL_BY_REFERENCE (res))
 	    {
-	      x = get_rtl_for_parm_ssa_default_def (res);
-	      if (!x)
-		{
-		  x = gen_rtx_MEM (DECL_MODE (res), value_address);
-		  set_mem_attributes (x, res, 1);
-		}
+	      x = gen_rtx_MEM (DECL_MODE (res), x);
+	      set_mem_attributes (x, res, 1);
 	    }
-	  SET_DECL_RTL (res, x);
+	  set_parm_rtl (res, x);
 	}
     }
   else if (DECL_MODE (res) == VOIDmode)
     /* If return mode is void, this decl rtl should not be used.  */
-    SET_DECL_RTL (res, NULL_RTX);
-  else
+    set_parm_rtl (res, NULL_RTX);
+  else
     {
       /* Compute the return values into a pseudo reg, which we will copy
 	 into the true return register after the cleanups are done.  */
       tree return_type = TREE_TYPE (res);
-      rtx x = get_rtl_for_parm_ssa_default_def (res);
-      if (x)
-	/* Use it.  */;
+      /* If we may coalesce this result, make sure it has the expected
+	 mode.  */
+      if (flag_tree_coalesce_vars && is_gimple_reg (res))
+	{
+	  tree def = ssa_default_def (cfun, res);
+	  gcc_assert (def);
+	  machine_mode mode = promote_ssa_mode (def, NULL);
+	  set_parm_rtl (res, gen_reg_rtx (mode));
+	}
       else if (TYPE_MODE (return_type) != BLKmode
 	       && targetm.calls.return_in_msb (return_type))
 	/* expand_function_end will insert the appropriate padding in
 	   this case.  Use the return value's natural (unpadded) mode
 	   within the function proper.  */
-	x = gen_reg_rtx (TYPE_MODE (return_type));
+	set_parm_rtl (res, gen_reg_rtx (TYPE_MODE (return_type)));
       else
 	{
 	  /* In order to figure out what mode to use for the pseudo, we
@@ -5259,16 +5115,14 @@ expand_function_start (tree subr)
 	  /* Structures that are returned in registers are not
 	     aggregate_value_p, so we may see a PARALLEL or a REG.  */
 	  if (REG_P (hard_reg))
-	    x = gen_reg_rtx (GET_MODE (hard_reg));
+	    set_parm_rtl (res, gen_reg_rtx (GET_MODE (hard_reg)));
 	  else
 	    {
 	      gcc_assert (GET_CODE (hard_reg) == PARALLEL);
-	      x = gen_group_rtx (hard_reg);
+	      set_parm_rtl (res, gen_group_rtx (hard_reg));
 	    }
 	}
 
-      SET_DECL_RTL (res, x);
-
       /* Set DECL_REGISTER flag so that expand_function_end will copy the
 	 result to the real return register(s).  */
       DECL_REGISTER (res) = 1;
@@ -5291,22 +5145,23 @@ expand_function_start (tree subr)
     {
       tree parm = cfun->static_chain_decl;
       rtx local, chain;
-     rtx_insn *insn;
+      rtx_insn *insn;
+      int unsignedp;
 
-      local = get_rtl_for_parm_ssa_default_def (parm);
-      if (!local)
-	local = gen_reg_rtx (Pmode);
+      local = gen_reg_rtx (promote_decl_mode (parm, &unsignedp));
       chain = targetm.calls.static_chain (current_function_decl, true);
 
       set_decl_incoming_rtl (parm, chain, false);
-      SET_DECL_RTL (parm, local);
+      set_parm_rtl (parm, local);
       mark_reg_pointer (local, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (parm))));
 
-      if (GET_MODE (local) != Pmode)
-	local = convert_to_mode (Pmode, local,
-				 TYPE_UNSIGNED (TREE_TYPE (parm)));
-
-      insn = emit_move_insn (local, chain);
+      if (GET_MODE (local) != GET_MODE (chain))
+	{
+	  convert_move (local, chain, unsignedp);
+	  insn = get_last_insn ();
+	}
+      else
+	insn = emit_move_insn (local, chain);
 
       /* Mark the register as eliminable, similar to parameters.  */
       if (MEM_P (chain)
diff --git a/gcc/testsuite/gcc.dg/pr67312.c b/gcc/testsuite/gcc.dg/pr67312.c
new file mode 100644
index 0000000..f1c9fde
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr67312.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-O0 -ftree-coalesce-vars" } */
+
+void foo (int x, int y)
+{
+    y = x;
+}
diff --git a/gcc/testsuite/gcc.target/i386/stackalign/return-4.c b/gcc/testsuite/gcc.target/i386/stackalign/return-4.c
index a1e35dc..d14eb2f 100644
--- a/gcc/testsuite/gcc.target/i386/stackalign/return-4.c
+++ b/gcc/testsuite/gcc.target/i386/stackalign/return-4.c
@@ -1,6 +1,13 @@
 /* { dg-do compile } */
-/* { dg-options "-mpreferred-stack-boundary=4" } */
+/* { dg-options "-mpreferred-stack-boundary=4 -O" } */
 /* { dg-final { scan-assembler-not "and\[lq\]?\[^\\n\]*-64,\[^\\n\]*sp" } } */
+/* We only guarantee we won't generate the stack alignment when
+   optimizing.  When not optimizing, the return value will be assigned
+   to a pseudo with the specified alignment, which in turn will force
+   stack alignment since the pseudo might have to be spilled.  Without
+   optimization, we wouldn't compute the actual stack requirements
+   after register allocation and reload, and just use the conservative
+   estimate.  */
 
 /* This compile only test is to detect an assertion failure in stack branch
    development.  */
diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
index fd00883..8dc4908 100644
--- a/gcc/tree-outof-ssa.c
+++ b/gcc/tree-outof-ssa.c
@@ -980,7 +980,6 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
 {
   bitmap values = NULL;
   var_map map;
-  unsigned i;
 
   map = coalesce_ssa_name ();
 
@@ -1005,17 +1004,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
 
   sa->map = map;
   sa->values = values;
-  sa->partition_has_default_def = BITMAP_ALLOC (NULL);
-  for (i = 1; i < num_ssa_names; i++)
-    {
-      tree t = ssa_name (i);
-      if (t && SSA_NAME_IS_DEFAULT_DEF (t))
-	{
-	  int p = var_to_partition (map, t);
-	  if (p != NO_PARTITION)
-	    bitmap_set_bit (sa->partition_has_default_def, p);
-	}
-    }
+  sa->partitions_for_parm_default_defs = get_parm_default_def_partitions (map);
 }
 
 
@@ -1190,7 +1179,7 @@ finish_out_of_ssa (struct ssaexpand *sa)
   if (sa->values)
     BITMAP_FREE (sa->values);
   delete_var_map (sa->map);
-  BITMAP_FREE (sa->partition_has_default_def);
+  BITMAP_FREE (sa->partitions_for_parm_default_defs);
   memset (sa, 0, sizeof *sa);
 }
 
diff --git a/gcc/tree-outof-ssa.h b/gcc/tree-outof-ssa.h
index 687e5a5..60b6379 100644
--- a/gcc/tree-outof-ssa.h
+++ b/gcc/tree-outof-ssa.h
@@ -39,9 +39,9 @@ struct ssaexpand
      a pseudos REG).  */
   rtx *partition_to_pseudo;
 
-  /* If partition I contains an SSA name that has a default def,
-     bit I will be set in this bitmap.  */
-  bitmap partition_has_default_def;
+  /* If partition I contains an SSA name that has a default def for a
+     parameter, bit I will be set in this bitmap.  */
+  bitmap partitions_for_parm_default_defs;
 };
 
 /* This is the singleton described above.  */
diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index 8af6583..ff75877 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -39,7 +39,9 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfgexpand.h"
 #include "explow.h"
 #include "diagnostic-core.h"
-
+#include "tree-dfa.h"
+#include "tm_p.h"
+#include "stor-layout.h"
 
 /* This set of routines implements a coalesce_list.  This is an object which
    is used to track pairs of ssa_names which are desirable to coalesce
@@ -877,26 +879,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
 	}
 
       /* Pretend there are defs for params' default defs at the start
-	 of the (post-)entry block.  */
+	 of the (post-)entry block.  This will prevent PARM_DECLs from
+	 coalescing into the same partition.  Although RESULT_DECLs'
+	 default defs don't have a useful initial value, we have to
+	 prevent them from coalescing with PARM_DECLs' default defs
+	 too, otherwise assign_parms would attempt to assign different
+	 RTL to the same partition.  */
       if (bb == entry)
 	{
-	  unsigned base;
-	  bitmap_iterator bi;
-	  EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi)
+	  unsigned i;
+	  for (i = 1; i < num_ssa_names; i++)
 	    {
-	      bitmap_iterator bi2;
-	      unsigned part;
-	      EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base],
-					0, part, bi2)
-		{
-		  tree var = partition_to_var (map, part);
-		  if (!SSA_NAME_VAR (var)
-		      || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL
-			  && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL)
-		      || !SSA_NAME_IS_DEFAULT_DEF (var))
-		    continue;
-		  live_track_process_def (live, var, graph);
-		}
+	      tree var = ssa_name (i);
+
+	      if (!var
+		  || !SSA_NAME_IS_DEFAULT_DEF (var)
+		  || !SSA_NAME_VAR (var)
+		  || VAR_P (SSA_NAME_VAR (var)))
+		continue;
+
+	      live_track_process_def (live, var, graph);
+	      /* Process a use too, so that it remains live and
+		 conflicts with other parms' default defs, even unused
+		 ones.  */
+	      live_track_process_use (live, var);
 	    }
 	}
 
@@ -937,6 +943,71 @@ fail_abnormal_edge_coalesce (int x, int y)
   internal_error ("SSA corruption");
 }
 
+/* Call CALLBACK for all PARM_DECLs and RESULT_DECLs for which
+   assign_parms may ask for a default partition.  */
+
+static void
+for_all_parms (void (*callback)(tree var, void *arg), void *arg)
+{
+  for (tree var = DECL_ARGUMENTS (current_function_decl); var;
+       var = DECL_CHAIN (var))
+    callback (var, arg);
+  if (!VOID_TYPE_P (TREE_TYPE (DECL_RESULT (current_function_decl))))
+    callback (DECL_RESULT (current_function_decl), arg);
+  if (cfun->static_chain_decl)
+    callback (cfun->static_chain_decl, arg);
+}
+
+/* Create a default def for VAR.  */
+
+static void
+create_default_def (tree var, void *arg ATTRIBUTE_UNUSED)
+{
+  if (!is_gimple_reg (var))
+    return;
+
+  tree ssa = get_or_create_ssa_default_def (cfun, var);
+  gcc_assert (ssa);
+}
+
+/* Register VAR's default def in MAP.  */
+
+static void
+register_default_def (tree var, void *map_)
+{
+  var_map map = (var_map)map_;
+
+  if (!is_gimple_reg (var))
+    return;
+
+  tree ssa = ssa_default_def (cfun, var);
+  gcc_assert (ssa);
+
+  register_ssa_partition (map, ssa);
+}
+
+/* If VAR is an SSA_NAME associated with a PARM_DECL or a RESULT_DECL,
+   and the DECL's default def is unused (i.e., it was introduced by
+   create_default_def), mark VAR and the default def for
+   coalescing.  */
+
+static void
+coalesce_with_default (tree var, coalesce_list_p cl, bitmap used_in_copy)
+{
+  if (SSA_NAME_IS_DEFAULT_DEF (var)
+      || !SSA_NAME_VAR (var)
+      || VAR_P (SSA_NAME_VAR (var)))
+    return;
+
+  tree ssa = ssa_default_def (cfun, SSA_NAME_VAR (var));
+  if (!has_zero_uses (ssa))
+    return;
+
+  add_cost_one_coalesce (cl, SSA_NAME_VERSION (ssa), SSA_NAME_VERSION (var));
+  bitmap_set_bit (used_in_copy, SSA_NAME_VERSION (var));
+  /* Default defs will have their used_in_copy bits set at the end of
+     create_outofssa_var_map.  */
+}
 
 /* This function creates a var_map for the current function as well as creating
    a coalesce list for use later in the out of ssa process.  */
@@ -954,8 +1025,12 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
   int v1, v2, cost;
   unsigned i;
 
+  for_all_parms (create_default_def, NULL);
+
   map = init_var_map (num_ssa_names);
 
+  for_all_parms (register_default_def, map);
+
   FOR_EACH_BB_FN (bb, cfun)
     {
       tree arg;
@@ -1034,6 +1109,30 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
 	      }
 	      break;
 
+	    case GIMPLE_RETURN:
+	      {
+		tree res = DECL_RESULT (current_function_decl);
+		if (VOID_TYPE_P (TREE_TYPE (res))
+		    || !is_gimple_reg (res))
+		  break;
+		tree rhs1 = gimple_return_retval (as_a <greturn *> (stmt));
+		if (!rhs1)
+		  break;
+		tree lhs = ssa_default_def (cfun, res);
+		gcc_assert (lhs);
+		if (TREE_CODE (rhs1) == SSA_NAME
+		    && gimple_can_coalesce_p (lhs, rhs1))
+		  {
+		    v1 = SSA_NAME_VERSION (lhs);
+		    v2 = SSA_NAME_VERSION (rhs1);
+		    cost = coalesce_cost_bb (bb);
+		    add_coalesce (cl, v1, v2, cost);
+		    bitmap_set_bit (used_in_copy, v1);
+		    bitmap_set_bit (used_in_copy, v2);
+		  }
+		break;
+	      }
+
 	    case GIMPLE_ASM:
 	      {
 		gasm *asm_stmt = as_a <gasm *> (stmt);
@@ -1100,10 +1199,13 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
       var = ssa_name (i);
       if (var != NULL_TREE && !virtual_operand_p (var))
         {
+	  coalesce_with_default (var, cl, used_in_copy);
+
 	  /* Add coalesces between all the result decls.  */
 	  if (SSA_NAME_VAR (var)
 	      && TREE_CODE (SSA_NAME_VAR (var)) == RESULT_DECL)
 	    {
+	      bitmap_set_bit (used_in_copy, SSA_NAME_VERSION (var));
 	      if (first == NULL_TREE)
 		first = var;
 	      else
@@ -1111,8 +1213,6 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
 		  gcc_assert (gimple_can_coalesce_p (var, first));
 		  v1 = SSA_NAME_VERSION (first);
 		  v2 = SSA_NAME_VERSION (var);
-		  bitmap_set_bit (used_in_copy, v1);
-		  bitmap_set_bit (used_in_copy, v2);
 		  cost = coalesce_cost_bb (EXIT_BLOCK_PTR_FOR_FN (cfun));
 		  add_coalesce (cl, v1, v2, cost);
 		}
@@ -1121,7 +1221,9 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
 	     since they will have to be coalesced with the base variable.  If
 	     not marked as present, they won't be in the coalesce view. */
 	  if (SSA_NAME_IS_DEFAULT_DEF (var)
-	      && !has_zero_uses (var))
+	      && (!has_zero_uses (var)
+		  || (SSA_NAME_VAR (var)
+		      && !VAR_P (SSA_NAME_VAR (var)))))
 	    bitmap_set_bit (used_in_copy, SSA_NAME_VERSION (var));
 	}
     }
@@ -1367,30 +1469,38 @@ gimple_can_coalesce_p (tree name1, tree name2)
 
       /* We don't want to coalesce two SSA names if one of the base
 	 variables is supposed to be a register while the other is
-	 supposed to be on the stack.  Anonymous SSA names take
-	 registers, but when not optimizing, user variables should go
-	 on the stack, so coalescing them with the anonymous variable
-	 as the partition leader would end up assigning the user
-	 variable to a register.  Don't do that!  */
-      bool reg1 = !var1 || use_register_for_decl (var1);
-      bool reg2 = !var2 || use_register_for_decl (var2);
+	 supposed to be on the stack.  Anonymous SSA names most often
+	 take registers, but when not optimizing, user variables
+	 should go on the stack, so coalescing them with the anonymous
+	 variable as the partition leader would end up assigning the
+	 user variable to a register.  Don't do that!  */
+      bool reg1 = use_register_for_decl (name1);
+      bool reg2 = use_register_for_decl (name2);
       if (reg1 != reg2)
 	return false;
 
-      /* Check that the promoted modes are the same.  We don't want to
-	 coalesce if the promoted modes would be different.  Only
+      /* Check that the promoted modes and unsignedness are the same.
+	 We don't want to coalesce if the promoted modes would be
+	 different, or if they would sign-extend differently.  Only
 	 PARM_DECLs and RESULT_DECLs have different promotion rules,
 	 so skip the test if both are variables, or both are anonymous
-	 SSA_NAMEs.  Now, if a parm or result has BLKmode, do not
-	 coalesce its SSA versions with those of any other variables,
-	 because it may be passed by reference.  */
+	 SSA_NAMEs.  */
+      int unsigned1, unsigned2;
       return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
-	|| (/* The case var1 == var2 is already covered above.  */
-	    !parm_in_stack_slot_p (var1)
-	    && !parm_in_stack_slot_p (var2)
-	    && promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL));
+	|| ((promote_ssa_mode (name1, &unsigned1)
+	     == promote_ssa_mode (name2, &unsigned2))
+	    && unsigned1 == unsigned2);
     }
 
+  /* If alignment requirements are different, we can't coalesce.  */
+  if (MINIMUM_ALIGNMENT (t1,
+			 var1 ? DECL_MODE (var1) : TYPE_MODE (t1),
+			 var1 ? LOCAL_DECL_ALIGNMENT (var1) : TYPE_ALIGN (t1))
+      != MINIMUM_ALIGNMENT (t2,
+			    var2 ? DECL_MODE (var2) : TYPE_MODE (t2),
+			    var2 ? LOCAL_DECL_ALIGNMENT (var2) : TYPE_ALIGN (t2)))
+    return false;
+
   /* If the types are not the same, check for a canonical type match.  This
      (for example) allows coalescing when the types are fundamentally the
      same, but just have different names. 
@@ -1639,7 +1749,8 @@ coalesce_ssa_name (void)
 	  if (a
 	      && SSA_NAME_VAR (a)
 	      && !DECL_IGNORED_P (SSA_NAME_VAR (a))
-	      && (!has_zero_uses (a) || !SSA_NAME_IS_DEFAULT_DEF (a)))
+	      && (!has_zero_uses (a) || !SSA_NAME_IS_DEFAULT_DEF (a)
+		  || !VAR_P (SSA_NAME_VAR (a))))
 	    {
 	      tree *slot = ssa_name_hash.find_slot (a, INSERT);
 
@@ -1721,3 +1832,47 @@ coalesce_ssa_name (void)
 
   return map;
 }
+
+/* We need to pass two arguments to set_parm_default_def_partition,
+   but for_all_parms only supports one.  Use a pair.  */
+
+typedef std::pair<var_map, bitmap> parm_default_def_partition_arg;
+
+/* Set in ARG's PARTS bitmap the bit corresponding to the partition in
+   ARG's MAP containing VAR's default def.  */
+
+static void
+set_parm_default_def_partition (tree var, void *arg_)
+{
+  parm_default_def_partition_arg *arg = (parm_default_def_partition_arg *)arg_;
+  var_map map = arg->first;
+  bitmap parts = arg->second;
+
+  if (!is_gimple_reg (var))
+    return;
+
+  tree ssa = ssa_default_def (cfun, var);
+  gcc_assert (ssa);
+
+  int version = var_to_partition (map, ssa);
+  gcc_assert (version != NO_PARTITION);
+
+  bool changed = bitmap_set_bit (parts, version);
+  gcc_assert (changed);
+}
+
+/* Allocate and return a bitmap that has a bit set for each partition
+   that contains a default def for a parameter.  */
+
+extern bitmap
+get_parm_default_def_partitions (var_map map)
+{
+  bitmap parm_default_def_parts = BITMAP_ALLOC (NULL);
+
+  parm_default_def_partition_arg
+    arg = std::make_pair (map, parm_default_def_parts);
+
+  for_all_parms (set_parm_default_def_partition, &arg);
+
+  return parm_default_def_parts;
+}
diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h
index ae289b4..8316f34 100644
--- a/gcc/tree-ssa-coalesce.h
+++ b/gcc/tree-ssa-coalesce.h
@@ -22,5 +22,6 @@ along with GCC; see the file COPYING3.  If not see
 
 extern var_map coalesce_ssa_name (void);
 extern bool gimple_can_coalesce_p (tree, tree);
+extern bitmap get_parm_default_def_partitions (var_map);
 
 #endif /* GCC_TREE_SSA_COALESCE_H */
diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
index e031725..25b548b 100644
--- a/gcc/tree-ssa-live.c
+++ b/gcc/tree-ssa-live.c
@@ -200,7 +200,9 @@ partition_view_init (var_map map)
       tmp = partition_find (map->var_partition, x);
       if (ssa_name (tmp) != NULL_TREE && !virtual_operand_p (ssa_name (tmp))
 	  && (!has_zero_uses (ssa_name (tmp))
-	      || !SSA_NAME_IS_DEFAULT_DEF (ssa_name (tmp))))
+	      || !SSA_NAME_IS_DEFAULT_DEF (ssa_name (tmp))
+	      || (SSA_NAME_VAR (ssa_name (tmp))
+		  && !VAR_P (SSA_NAME_VAR (ssa_name (tmp))))))
 	bitmap_set_bit (used, tmp);
     }
 
@@ -1404,6 +1406,12 @@ verify_live_on_entry (tree_live_info_p live)
 		  }
 		if (ok)
 		  continue;
+		/* Expand adds unused default defs for PARM_DECLs and
+		   RESULT_DECLs.  They're ok.  */
+		if (has_zero_uses (var)
+		    && SSA_NAME_VAR (var)
+		    && !VAR_P (SSA_NAME_VAR (var)))
+		  continue;
 	        num++;
 		print_generic_expr (stderr, var, TDF_SLIM);
 		fprintf (stderr, " is not marked live-on-entry to entry BB%d ",


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


* Re: [PR64164] drop copyrename, integrate into expand
  2015-09-23 20:44                                                               ` Alexandre Oliva
@ 2015-09-25 11:39                                                                 ` Richard Biener
  2015-10-09  5:26                                                                   ` [PR67828] don't unswitch loops on undefined SSA values (was: Re: [PR64164] drop copyrename, integrate into expand) Alexandre Oliva
  2015-10-09  5:36                                                                   ` [PR67766] reorder return value copying from PARALLELs and CONCATs " Alexandre Oliva
  2015-09-29 11:31                                                                 ` [PR64164] drop copyrename, integrate into expand Szabolcs Nagy
  2015-11-05  5:09                                                                 ` Alexandre Oliva
  2 siblings, 2 replies; 127+ messages in thread
From: Richard Biener @ 2015-09-25 11:39 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Alan Lawrence, Jeff Law, James Greenhalgh, H.J. Lu,
	Segher Boessenkool, GCC Patches, Christophe Lyon, David Edelsohn,
	Eric Botcazou

On Wed, Sep 23, 2015 at 10:07 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Sep 18, 2015, Alan Lawrence <alan.lawrence@arm.com> wrote:
>
>> With the latest git commit 2b27ef197ece54c4573c5a748b0d40076e35412c on
>> branch aoliva/pr64164, I am now able to build a cross toolchain for
>> aarch64 and aarch64_be, and can confirm the ABI failure is fixed on
>> the branch.
>
> Thanks for the confirmation.  I've made one further tweak for cris and
> lm32, dropping the assert that caused build failures for libstdc++
> atomics parms that required more alignment than
> MAX_SUPPORTED_STACK_ALIGNMENT, consolidated the patchset and retested it
> with a more recent baseline (r228019), with native regstraps on
> x86_64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu,
> powerpc64le-linux-gnu, and cross toolchain builds for the following 73
> platforms: aarch64_be-elf aarch64-elf arm-eabi armeb-eabihf
> arm-symbianelf avr-elf bfin-elf c6x-elf cr16-elf cris-elf crisv32-elf
> epiphany-elf fido-elf fr30-elf frv-elf ft32-elf h8300-elf i686-elf
> ia64-elf iq2000-elf lm32-elf m32c-elf m32r-elf m32rle-elf m68k-elf
> mcore-elf mep-elf microblaze-elf mips64el-elf mips64-elf mips64orion-elf
> mips64vr-elf mipsel-elf mipsisa32-elfoabi mipsisa64-elfoabi
> mipsisa64r2el-elf mipsisa64r2-sde-elf mipsisa64sb1-elf
> mipsisa64sr71k-elf mipstx39-elf mn10300-elf moxie-elf msp430-elf
> nds32be-elf nds32le-elf nios2-elf pdp11-aout powerpc-eabialtivec
> powerpc-eabi powerpc-eabisimaltivec powerpc-eabisim powerpc-eabispe
> powerpcle-eabi powerpcle-eabisim powerpcle-elf powerpc-xilinx-eabi
> ppc64-eabi ppc-eabi ppc-elf rl78-elf rx-elf sh64-elf sh-elf
> sh-superh-elf sparc64-elf sparc-elf sparc-leon-elf spu-elf v850e-elf
> v850-elf visium-elf xstormy16-elf xtensa-elf.  Not all of them succeeded
> in building, but those that didn't failed at the very same spots before
> and after this patch.
>
>
> This patch doesn't really add much functionality.  It rather
> reimplements a lot of the ugly and fragile stuff I put in in the
> previous big patchset in a far more robust and pleasant way.  It fixes a
> number of regressions in the process, mainly because, instead of
> modifying assign_parms so as to let cfgexpand do part of its job, it
> reverts all of the RTL assignment for parameters and results to
> assign_parms.  cfgexpand now leaves the RTL assignment of partitions
> containing default defs or parms and results to assign_parms, and
> assign_parms uses a single callback, set_parm_rtl, to tell cfgexpand the
> assignment for the partition containing the default def of each
> parameter.
>
> This required introducing default defs for all parms and results, even
> if unused; we could refrain from creating them, and refrain from
> initializing those parameters (at least when optimizing), but that would
> require messing with the fragile bits in assign_parms again, and it
> would bring little benefit, since RTL optimization will likely notice
> the initialization is unused and drop it anyway.  Besides, adding the
> default defs was actually needed to fix a regression in the previous
> patch, and even with the current patch it helps make sure we don't
> assign more than one default def to the same SSA partition (the previous
> patch attempted to do that, but there was a bug, fixed in the current
> patch).  Having unused default defs makes it easier for us to decide
> whether to use an entry_value rtx for the initial debug insn of a parm.
> We track partitions holding default defs for parms and results with a
> bitmap; we used to have a bitmap that tracked partitions holding default
> defs, but it was unused!  I just renamed it and repurposed it.
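
A minimal sketch of the bitmap bookkeeping described above, written as a self-contained toy model rather than GCC code. `PartitionBitmap` and `register_parm_partitions` are hypothetical names; the only behavior borrowed from GCC is that `bitmap_set_bit` reports whether the bit changed, which is what lets the patch assert that no two default defs share a partition.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a bitmap indexed by SSA partition number.
struct PartitionBitmap {
  std::vector<bool> bits;

  explicit PartitionBitmap(std::size_t n) : bits(n, false) {}

  // Like GCC's bitmap_set_bit: returns true only if the bit was newly set.
  bool set(std::size_t part) {
    bool changed = !bits[part];
    bits[part] = true;
    return changed;
  }

  bool test(std::size_t part) const { return bits[part]; }
};

// Mark the partition holding each parm's default def.  Returns false as
// soon as two default defs turn out to live in the same partition, the
// situation the patch guards against with gcc_assert (changed).
bool register_parm_partitions(PartitionBitmap &bm,
                              const std::vector<std::size_t> &parts) {
  for (std::size_t part : parts)
    if (!bm.set(part))
      return false;
  return true;
}
```

A consumer such as the partition loop in expand_used_vars would then skip every partition for which `test` returns true and leave its RTL assignment to assign_parms.
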
>
> I've also added checking asserts to set_rtl, to verify that, when we
> expect a REG, we get a REG, and that it has the expected mode.  set_rtl
> was also adjusted to record anonymous SSA names or their base types in
> attrs of REGs or MEMs, respectively, so that code that relied on the
> attrs to detect properties of the decl types does not regress just
> because we no longer generate decls for anonymous SSA names.  Since
> there were prior uses of types in MEM attrs, that was expected to go
> smoothly, but I was surprised at how smoothly adding SSA names to REG
> attrs went.  No adjustments required!
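
The shape-and-mode check can be sketched roughly as follows. This is a simplified toy model, not the actual assert in set_rtl: `Rtx`, `Code`, `both_parts` and `matches_expectation` are hypothetical, modes are plain integers, and the SUBREG, PARALLEL and pc_rtx cases the real code accepts are left out.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, heavily reduced model of an RTL expression.
enum class Code { Reg, Mem, Concat };

struct Rtx {
  Code code;
  int mode;                      // stand-in for GET_MODE
  std::vector<const Rtx *> ops;  // two parts for Concat, empty otherwise
};

// True if X is a CONCAT whose two parts both have code WANT.
static bool both_parts(const Rtx &x, Code want) {
  return x.code == Code::Concat && x.ops.size() == 2
         && x.ops[0]->code == want && x.ops[1]->code == want;
}

// WANT_REGISTER models use_register_for_decl; PROMOTED_MODE models the
// result of promote_ssa_mode.  Registers must have the promoted mode;
// memory is accepted with any mode, since unpromoted modes are expected.
bool matches_expectation(const Rtx &x, bool want_register,
                         int promoted_mode) {
  if (want_register)
    return (x.code == Code::Reg || both_parts(x, Code::Reg))
           && x.mode == promoted_mode;
  return x.code == Code::Mem || both_parts(x, Code::Mem);
}
```

The asymmetry is the point of the sketch: a pseudo with the wrong mode is rejected, while a stack slot is only checked for being memory.
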
>
> I also tightened a bit the conditions for coalescing: we used to require
> the same canonical type; I've added tests for same alignment
> requirements, and for same signedness.  OTOH, I've added a few more
> coalesce candidates for RESULT_DECLs and the newly-added default defs of
> parms and results.
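
As an illustration only, the tightened test might be modelled like this. `NameInfo` and `can_coalesce` are hypothetical; each field stands in for a value the real gimple_can_coalesce_p derives from the tree (TYPE_CANONICAL, use_register_for_decl, promote_ssa_mode, MINIMUM_ALIGNMENT), and the checks are applied flatly rather than in the exact order the patch uses.

```cpp
#include <cassert>

// Hypothetical summary of the properties compared for one SSA name.
struct NameInfo {
  int canonical_type;      // stand-in for TYPE_CANONICAL
  bool in_register;        // stand-in for use_register_for_decl
  bool is_parm_or_result;  // base variable is a PARM_DECL or RESULT_DECL
  int promoted_mode;       // stand-in for promote_ssa_mode's return value
  bool promoted_unsigned;  // stand-in for its unsignedness output
  unsigned align_bits;     // stand-in for MINIMUM_ALIGNMENT
};

bool can_coalesce(const NameInfo &a, const NameInfo &b) {
  // Pre-existing requirement: same canonical type.
  if (a.canonical_type != b.canonical_type)
    return false;
  // New: alignment requirements must agree.
  if (a.align_bits != b.align_bits)
    return false;
  // A register candidate never coalesces with a stack candidate.
  if (a.in_register != b.in_register)
    return false;
  // Only parms and results have special promotion rules, so plain
  // variables and anonymous names skip the promotion test.
  if (!a.is_parm_or_result && !b.is_parm_or_result)
    return true;
  // New: promoted modes must match and extend with the same signedness.
  return a.promoted_mode == b.promoted_mode
         && a.promoted_unsigned == b.promoted_unsigned;
}
```

In this model, a parm that promotes as unsigned and an anonymous name that promotes as signed are kept apart even when their types and modes agree.
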
>
> Other relevant changes were in mode promotion.  TYPE_MODE would often
> return BLKmode for some vector types, which was fine for some return
> decl RTL with PARALLEL, but that didn't quite work for SSA partitions.
> There were other cases of mode promotion of result decls that failed the
> asserts in set_rtl, that revealed promote_decl_mode didn't call
> promote_function_mode as expected for results.
>
> The new asserts brought additional requirements: promoting the mode of
> the RTL generated for the static chain, arranging for result decls to be
> assigned to a pseudo where it would formerly have got a BLKmode PARALLEL
> (as mentioned above), and arranging for parms set up by
> assign_parm_setup_block, that would always get a MEM, to instead get a
> REG when use_register_for_decl called for it.  In a few cases involving
> complex parms, I couldn't figure out how to avoid a temporary MEM, used
> to adjust padding of the parms, but although undesired, this is not a
> regression: we used to use the MEM, and now we just load them into
> (coalescible) pseudos and use those, instead of coalescing
> other vars that expected pseudos to the same MEM.
>
> Is this ok to install?

Ok.

Thanks,
Richard.

>
>
> revert to assign_parms assignments using default defs
>
> From: Alexandre Oliva <aoliva@redhat.com>
>
> Revert the fragile and complicated changes to assign_parms designed to
> enable it to use RTL assignments chosen by cfgexpand, and instead have
> cfgexpand use the RTL assignments by assign_parms, keying them off of
> the default defs that are now necessarily introduced for each parm and
> result.  The possible lack of a default def was already a problem, and
> the fallbacks in place were not enough, as shown by PR67312.  We now
> have checking asserts in set_rtl that verify that we're assigning to
> each var a piece of RTL that matches the expectations set forth by
> use_register_for_decl.
>
> for  gcc/ChangeLog
>
>         PR rtl-optimization/64164
>         PR tree-optimization/67312
>         PR middle-end/67340
>         PR middle-end/67490
>         PR bootstrap/67597
>         * cfgexpand.c (parm_in_stack_slot_p): Remove.
>         (ssa_default_def_partition): Remove.
>         (get_rtl_for_parm_ssa_default_def): Remove.
>         (set_rtl): Check that RTL assignments match expectations.
>         Loop on SUBREGs, CONCATs and PARALLELs subexprs.  Set only the
>         default def location for params and results.  Record SSA names
>         or types in REG and MEM attrs, respectively.
>         (set_parm_rtl): New.
>         (expand_one_ssa_partition): Drop logic that assigned MEMs with
>         unassigned addresses.
>         (adjust_one_expanded_partition_var): Don't accept NULL RTL on
>         deferred stack alloc vars.
>         (expand_used_vars): Skip partitions holding parm default defs.
>         Move adjust_one_expanded_partition_var loop...
>         (pass_expand::execute): ... here.  Drop redundant assert.
>         Adjust comments before the final loop over all ssa names.
>         Require assigned rtl of parms and results to match exactly.
>         Reset its attributes to match them, not any other variables in
>         the same partition.
>         (expand_debug_expr): Use entry value for PARM's default defs
>         only iff they have zero nondebug uses.
>         * cfgexpand.h (parm_in_stack_slot_p): Remove.
>         (get_rtl_for_parm_ssa_default_def): Remove.
>         (set_parm_rtl): Declare.
>         * doc/invoke.texi: Improve wording.
>         * explow.c (promote_decl_mode): Fix promote_function_mode for
>         result decls not by reference.
>         (promote_ssa_mode): Disregard BLKmode from promote_decl, and
>         bypass TYPE_MODE to get the actual vector mode.
>         * function.c: Include tree-dfa.h.  Revert 2015-08-14's and
>         2015-08-19's changes as follows.  Drop include of
>         basic-block.h and df.h.
>         (rtl_for_parm): Remove.
>         (maybe_reset_rtl_for_parm): Remove.
>         (parm_in_unassigned_mem_p): Remove.
>         (use_register_for_decl): Add logic for RESULT_DECLs matching
>         assign_parms' behavior.
>         (split_complex_args): Revert.
>         (assign_parms_augmented_arg_list): Revert.  Add comment
>         referencing the logic above.
>         (assign_parm_adjust_stack_rtl): Revert.
>         (assign_parm_setup_block): Revert.  Use set_parm_rtl instead
>         of SET_DECL_RTL.  Set up a REG if the parm demands so.
>         (assign_parm_setup_reg): Revert.  Consolidated SET_DECL_RTL
>         calls into a single set_parm_rtl.  Set up a temporary RTL
>         temporarily for expand_assignment.
>         (assign_parm_setup_stack): Revert.  Use set_parm_rtl.
>         (assign_parms_unsplit_complex): Revert.  Use set_parm_rtl.
>         (assign_bounds): Revert.
>         (assign_parms): Revert.  Use set_parm_rtl.
>         (allocate_struct_function): Relayout result and parms of
>         non-abstract functions.
>         (expand_function_start): Revert.  Use set_parm_rtl.  If the
>         result is not a hard reg, create a pseudo from the promoted
>         mode of the default def.  Promote static chain mode.
>         * tree-outof-ssa.c (remove_ssa_form): Drop unused
>         partition_has_default_def.  Set up
>         partitions_for_parm_default_defs.
>         (finish_out_of_ssa): Remove partition_has_default_def.
>         Release partitions_for_parm_default_defs.
>         * tree-outof-ssa.h (struct ssaexpand): Remove
>         partition_has_default_def.  Add
>         partitions_for_parm_default_defs.
>         * tree-ssa-coalesce.c: Include tree-dfa.h, tm_p.h and
>         stor-layout.h.
>         (build_ssa_conflict_graph): Fix conflict-detection of default
>         defs of even unused default defs of params and results.
>         (for_all_parms): New.
>         (create_default_def): New.
>         (register_default_def): New.
>         (coalesce_with_default): New.
>         (create_outofssa_var_map): Create default defs for all parms
>         and results, and register their partitions.  Add GIMPLE_RETURN
>         operands as coalesce candidates with results.  Add default
>         defs of each parm or result as coalesce candidates with its
>         other defs.  Mark each result def, and each default def of
>         parms, as used_in_copy.
>         (gimple_can_coalesce_p): Call it.  Call use_register_for_decl
>         with the ssa names, even anonymous ones.  Drop
>         parm_in_stack_slot_p calls.  Require same signedness and
>         alignment.
>         (coalesce_ssa_name): Add coalesce candidates for all defs of
>         each parm and result, even unused ones.
>         (parm_default_def_partition_arg): New type.
>         (set_parm_default_def_partition): New.
>         (get_parm_default_def_partitions): New.
>         * tree-ssa-coalesce.h (get_parm_default_def_partitions): New.
>         * tree-ssa-live.c (partition_view_init): Regard unused defs of
>         parms and results as used.
>         (verify_live_on_entry): Don't error out just because they're
>         not live.
>
> for  gcc/testsuite/ChangeLog
>
>         PR rtl-optimization/64164
>         PR tree-optimization/67312
>         * gcc.dg/pr67312.c: New.  From Zdenek Sojka.
>         * gcc.target/i386/stackalign/return-4.c: Add -O.
> ---
>  gcc/cfgexpand.c                                    |  332 +++++++-------
>  gcc/cfgexpand.h                                    |    3
>  gcc/doc/invoke.texi                                |    9
>  gcc/explow.c                                       |   19 +
>  gcc/function.c                                     |  477 +++++++-------------
>  gcc/testsuite/gcc.dg/pr67312.c                     |    7
>  .../gcc.target/i386/stackalign/return-4.c          |    9
>  gcc/tree-outof-ssa.c                               |   15 -
>  gcc/tree-outof-ssa.h                               |    6
>  gcc/tree-ssa-coalesce.c                            |  231 ++++++++--
>  gcc/tree-ssa-coalesce.h                            |    1
>  gcc/tree-ssa-live.c                                |   10
>  12 files changed, 582 insertions(+), 537 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/pr67312.c
>
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index 6c9284f..58e55d2 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -99,6 +99,8 @@ static rtx expand_debug_expr (tree);
>
>  static bool defer_stack_allocation (tree, bool);
>
> +static void record_alignment_for_reg_var (unsigned int);
> +
>  /* Return an expression tree corresponding to the RHS of GIMPLE
>     statement STMT.  */
>
> @@ -172,111 +174,86 @@ leader_merge (tree cur, tree next)
>    return cur;
>  }
>
> -/* Return true if VAR is a PARM_DECL or a RESULT_DECL that ought to be
> -   assigned to a stack slot.  We can't have expand_one_ssa_partition
> -   choose their address: the pseudo holding the address would be set
> -   up too late for assign_params to copy the parameter if needed.
> -
> -   Such parameters are likely passed as a pointer to the value, rather
> -   than as a value, and so we must not coalesce them, nor allocate
> -   stack space for them before determining the calling conventions for
> -   them.
> -
> -   For their SSA_NAMEs, expand_one_ssa_partition emits RTL as MEMs
> -   with pc_rtx as the address, and then it replaces the pc_rtx with
> -   NULL so as to make sure the MEM is not used before it is adjusted
> -   in assign_parm_setup_reg.  */
> -
> -bool
> -parm_in_stack_slot_p (tree var)
> -{
> -  if (!var || VAR_P (var))
> -    return false;
> -
> -  gcc_assert (TREE_CODE (var) == PARM_DECL
> -             || TREE_CODE (var) == RESULT_DECL);
> -
> -  return !use_register_for_decl (var);
> -}
> -
> -/* Return the partition of the default SSA_DEF for decl VAR.  */
> -
> -static int
> -ssa_default_def_partition (tree var)
> -{
> -  tree name = ssa_default_def (cfun, var);
> -
> -  if (!name)
> -    return NO_PARTITION;
> -
> -  return var_to_partition (SA.map, name);
> -}
> -
> -/* Return the RTL for the default SSA def of a PARM or RESULT, if
> -   there is one.  */
> -
> -rtx
> -get_rtl_for_parm_ssa_default_def (tree var)
> -{
> -  gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL);
> -
> -  if (!is_gimple_reg (var))
> -    return NULL_RTX;
> -
> -  /* If we've already determined RTL for the decl, use it.  This is
> -     not just an optimization: if VAR is a PARM whose incoming value
> -     is unused, we won't find a default def to use its partition, but
> -     we still want to use the location of the parm, if it was used at
> -     all.  During assign_parms, until a location is assigned for the
> -     VAR, RTL can only for a parm or result if we're not coalescing
> -     across variables, when we know we're coalescing all SSA_NAMEs of
> -     each parm or result, and we're not coalescing them with names
> -     pertaining to other variables, such as other parms' default
> -     defs.  */
> -  if (DECL_RTL_SET_P (var))
> -    {
> -      gcc_assert (DECL_RTL (var) != pc_rtx);
> -      return DECL_RTL (var);
> -    }
> -
> -  int part = ssa_default_def_partition (var);
> -  if (part == NO_PARTITION)
> -    return NULL_RTX;
> -
> -  return SA.partition_to_pseudo[part];
> -}
> -
>  /* Associate declaration T with storage space X.  If T is no
>     SSA name this is exactly SET_DECL_RTL, otherwise make the
>     partition of T associated with X.  */
>  static inline void
>  set_rtl (tree t, rtx x)
>  {
> -  if (x && SSAVAR (t))
> +  gcc_checking_assert (!x
> +                      || !(TREE_CODE (t) == SSA_NAME || is_gimple_reg (t))
> +                      || (use_register_for_decl (t)
> +                          ? (REG_P (x)
> +                             || (GET_CODE (x) == CONCAT
> +                                 && (REG_P (XEXP (x, 0))
> +                                     || SUBREG_P (XEXP (x, 0)))
> +                                 && (REG_P (XEXP (x, 1))
> +                                     || SUBREG_P (XEXP (x, 1))))
> +                             || (GET_CODE (x) == PARALLEL
> +                                 && SSAVAR (t)
> +                                 && TREE_CODE (SSAVAR (t)) == RESULT_DECL
> +                                 && !flag_tree_coalesce_vars))
> +                          : (MEM_P (x) || x == pc_rtx
> +                             || (GET_CODE (x) == CONCAT
> +                                 && MEM_P (XEXP (x, 0))
> +                                 && MEM_P (XEXP (x, 1))))));
> +  /* Check that the RTL for SSA_NAMEs and gimple-reg PARM_DECLs and
> +     RESULT_DECLs has the expected mode.  For memory, we accept
> +     unpromoted modes, since that's what we're likely to get.  For
> +     PARM_DECLs and RESULT_DECLs, we'll have been called by
> +     set_parm_rtl, which will give us the default def, so we don't
> +     have to compute it ourselves.  For RESULT_DECLs, we accept mode
> +     mismatches too, as long as we're not coalescing across variables,
> +     so that we don't reject BLKmode PARALLELs or unpromoted REGs.  */
> +  gcc_checking_assert (!x || x == pc_rtx || TREE_CODE (t) != SSA_NAME
> +                      || (SSAVAR (t) && TREE_CODE (SSAVAR (t)) == RESULT_DECL
> +                          && !flag_tree_coalesce_vars)
> +                      || !use_register_for_decl (t)
> +                      || GET_MODE (x) == promote_ssa_mode (t, NULL));
> +
> +  if (x)
>      {
>        bool skip = false;
>        tree cur = NULL_TREE;
> -
> -      if (MEM_P (x))
> -       cur = MEM_EXPR (x);
> -      else if (REG_P (x))
> -       cur = REG_EXPR (x);
> -      else if (GET_CODE (x) == CONCAT
> -              && REG_P (XEXP (x, 0)))
> -       cur = REG_EXPR (XEXP (x, 0));
> -      else if (GET_CODE (x) == PARALLEL)
> -       cur = REG_EXPR (XVECEXP (x, 0, 0));
> -      else if (x == pc_rtx)
> +      rtx xm = x;
> +
> +    retry:
> +      if (MEM_P (xm))
> +       cur = MEM_EXPR (xm);
> +      else if (REG_P (xm))
> +       cur = REG_EXPR (xm);
> +      else if (SUBREG_P (xm))
> +       {
> +         gcc_assert (subreg_lowpart_p (xm));
> +         xm = SUBREG_REG (xm);
> +         goto retry;
> +       }
> +      else if (GET_CODE (xm) == CONCAT)
> +       {
> +         xm = XEXP (xm, 0);
> +         goto retry;
> +       }
> +      else if (GET_CODE (xm) == PARALLEL)
> +       {
> +         xm = XVECEXP (xm, 0, 0);
> +         gcc_assert (GET_CODE (xm) == EXPR_LIST);
> +         xm = XEXP (xm, 0);
> +         goto retry;
> +       }
> +      else if (xm == pc_rtx)
>         skip = true;
>        else
>         gcc_unreachable ();
>
> -      tree next = skip ? cur : leader_merge (cur, SSAVAR (t));
> +      tree next = skip ? cur : leader_merge (cur, SSAVAR (t) ? SSAVAR (t) : t);
>
>        if (cur != next)
>         {
>           if (MEM_P (x))
> -           set_mem_attributes (x, next, true);
> +           set_mem_attributes (x,
> +                               next && TREE_CODE (next) == SSA_NAME
> +                               ? TREE_TYPE (next)
> +                               : next, true);
>           else
>             set_reg_attrs_for_decl_rtl (next, x);
>         }
> @@ -294,13 +271,11 @@ set_rtl (tree t, rtx x)
>         }
>        /* For the benefit of debug information at -O0 (where
>           vartracking doesn't run) record the place also in the base
> -         DECL.  For PARMs and RESULTs, we may end up resetting these
> -         in function.c:maybe_reset_rtl_for_parm, but in some rare
> -         cases we may need them (unused and overwritten incoming
> -         value, that at -O0 must share the location with the other
> -         uses in spite of the missing default def), and this may be
> -         the only chance to preserve them.  */
> -      if (x && x != pc_rtx && SSA_NAME_VAR (t))
> +         DECL.  For PARMs and RESULTs, do so only when setting the
> +         default def.  */
> +      if (x && x != pc_rtx && SSA_NAME_VAR (t)
> +         && (VAR_P (SSA_NAME_VAR (t))
> +             || SSA_NAME_IS_DEFAULT_DEF (t)))
>         {
>           tree var = SSA_NAME_VAR (t);
>           /* If we don't yet have something recorded, just record it now.  */
> @@ -1242,6 +1217,49 @@ account_stack_vars (void)
>    return size;
>  }
>
> +/* Record the RTL assignment X for the default def of PARM.  */
> +
> +extern void
> +set_parm_rtl (tree parm, rtx x)
> +{
> +  gcc_assert (TREE_CODE (parm) == PARM_DECL
> +             || TREE_CODE (parm) == RESULT_DECL);
> +
> +  if (x && !MEM_P (x))
> +    {
> +      unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (parm),
> +                                             TYPE_MODE (TREE_TYPE (parm)),
> +                                             TYPE_ALIGN (TREE_TYPE (parm)));
> +
> +      /* If the variable alignment is very large we'll dynamically
> +        allocate it, which means that in-frame portion is just a
> +        pointer.  ??? We've got a pseudo for sure here, do we
> +        actually dynamically allocate its spilling area if needed?
> +        ??? Isn't it a problem when POINTER_SIZE also exceeds
> +        MAX_SUPPORTED_STACK_ALIGNMENT, as on cris and lm32?  */
> +      if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
> +       align = POINTER_SIZE;
> +
> +      record_alignment_for_reg_var (align);
> +    }
> +
> +  if (!is_gimple_reg (parm))
> +    return set_rtl (parm, x);
> +
> +  tree ssa = ssa_default_def (cfun, parm);
> +  if (!ssa)
> +    return set_rtl (parm, x);
> +
> +  int part = var_to_partition (SA.map, ssa);
> +  gcc_assert (part != NO_PARTITION);
> +
> +  bool changed = bitmap_bit_p (SA.partitions_for_parm_default_defs, part);
> +  gcc_assert (changed);
> +
> +  set_rtl (ssa, x);
> +  gcc_assert (DECL_RTL (parm) == x);
> +}
> +
>  /* A subroutine of expand_one_var.  Called to immediately assign rtl
>     to a variable to be allocated in the stack frame.  */
>
> @@ -1349,37 +1367,7 @@ expand_one_ssa_partition (tree var)
>
>    if (!use_register_for_decl (var))
>      {
> -      /* We can't risk having the parm assigned to a MEM location
> -        whose address references a pseudo, for the pseudo will only
> -        be set up after arguments are copied to the stack slot.
> -
> -        If the parm doesn't have a default def (e.g., because its
> -        incoming value is unused), then we want to let assign_params
> -        do the allocation, too.  In this case we want to make sure
> -        SSA_NAMEs associated with the parm don't get assigned to more
> -        than one partition, lest we'd create two unassigned stac
> -        slots for the same parm, thus the assert at the end of the
> -        block.  */
> -      if (parm_in_stack_slot_p (SSA_NAME_VAR (var))
> -         && (ssa_default_def_partition (SSA_NAME_VAR (var)) == part
> -             || !ssa_default_def (cfun, SSA_NAME_VAR (var))))
> -       {
> -         expand_one_stack_var_at (var, pc_rtx, 0, 0);
> -         rtx x = SA.partition_to_pseudo[part];
> -         gcc_assert (GET_CODE (x) == MEM);
> -         gcc_assert (XEXP (x, 0) == pc_rtx);
> -         /* Reset the address, so that any attempt to use it will
> -            ICE.  It will be adjusted in assign_parm_setup_reg.  */
> -         XEXP (x, 0) = NULL_RTX;
> -         /* If the RTL associated with the parm is not what we have
> -            just created, the parm has been split over multiple
> -            partitions.  In order for this to work, we must have a
> -            default def for the parm, otherwise assign_params won't
> -            know what to do.  */
> -         gcc_assert (DECL_RTL_IF_SET (SSA_NAME_VAR (var)) == x
> -                     || ssa_default_def (cfun, SSA_NAME_VAR (var)));
> -       }
> -      else if (defer_stack_allocation (var, true))
> +      if (defer_stack_allocation (var, true))
>         add_stack_var (var);
>        else
>         expand_one_stack_var_1 (var);
> @@ -1393,8 +1381,8 @@ expand_one_ssa_partition (tree var)
>    set_rtl (var, x);
>  }
>
> -/* Record the association between the RTL generated for a partition
> -   and the underlying variable of the SSA_NAME.  */
> +/* Record the association between the RTL generated for partition PART
> +   and the underlying variable of the SSA_NAME VAR.  */
>
>  static void
>  adjust_one_expanded_partition_var (tree var)
> @@ -1410,12 +1398,7 @@ adjust_one_expanded_partition_var (tree var)
>
>    rtx x = SA.partition_to_pseudo[part];
>
> -  if (!x)
> -    {
> -      /* This var will get a stack slot later.  */
> -      gcc_assert (defer_stack_allocation (var, true));
> -      return;
> -    }
> +  gcc_assert (x);
>
>    set_rtl (var, x);
>
> @@ -2040,6 +2023,9 @@ expand_used_vars (void)
>
>    for (i = 0; i < SA.map->num_partitions; i++)
>      {
> +      if (bitmap_bit_p (SA.partitions_for_parm_default_defs, i))
> +       continue;
> +
>        tree var = partition_to_var (SA.map, i);
>
>        gcc_assert (!virtual_operand_p (var));
> @@ -2047,9 +2033,6 @@ expand_used_vars (void)
>        expand_one_ssa_partition (var);
>      }
>
> -  for (i = 1; i < num_ssa_names; i++)
> -    adjust_one_expanded_partition_var (ssa_name (i));
> -
>    if (flag_stack_protect == SPCT_FLAG_STRONG)
>        gen_stack_protect_signal
>         = stack_protect_decl_p () || stack_protect_return_slot_p ();
> @@ -4947,26 +4930,27 @@ expand_debug_expr (tree exp)
>           }
>         else
>           {
> +           /* If this is a reference to an incoming value of
> +              parameter that is never used in the code or where the
> +              incoming value is never used in the code, use
> +              PARM_DECL's DECL_RTL if set.  */
> +           if (SSA_NAME_IS_DEFAULT_DEF (exp)
> +               && SSA_NAME_VAR (exp)
> +               && TREE_CODE (SSA_NAME_VAR (exp)) == PARM_DECL
> +               && has_zero_uses (exp))
> +             {
> +               op0 = expand_debug_parm_decl (SSA_NAME_VAR (exp));
> +               if (op0)
> +                 goto adjust_mode;
> +               op0 = expand_debug_expr (SSA_NAME_VAR (exp));
> +               if (op0)
> +                 goto adjust_mode;
> +             }
> +
>             int part = var_to_partition (SA.map, exp);
>
>             if (part == NO_PARTITION)
> -             {
> -               /* If this is a reference to an incoming value of parameter
> -                  that is never used in the code or where the incoming
> -                  value is never used in the code, use PARM_DECL's
> -                  DECL_RTL if set.  */
> -               if (SSA_NAME_IS_DEFAULT_DEF (exp)
> -                   && TREE_CODE (SSA_NAME_VAR (exp)) == PARM_DECL)
> -                 {
> -                   op0 = expand_debug_parm_decl (SSA_NAME_VAR (exp));
> -                   if (op0)
> -                     goto adjust_mode;
> -                   op0 = expand_debug_expr (SSA_NAME_VAR (exp));
> -                   if (op0)
> -                     goto adjust_mode;
> -                 }
> -               return NULL;
> -             }
> +             return NULL;
>
>             gcc_assert (part >= 0 && (unsigned)part < SA.map->num_partitions);
>
> @@ -6216,9 +6200,26 @@ pass_expand::execute (function *fun)
>        parm_birth_insn = var_seq;
>      }
>
> -  /* If we have a class containing differently aligned pointers
> -     we need to merge those into the corresponding RTL pointer
> -     alignment.  */
> +  /* Now propagate the RTL assignment of each partition to the
> +     underlying var of each SSA_NAME.  */
> +  for (i = 1; i < num_ssa_names; i++)
> +    {
> +      tree name = ssa_name (i);
> +
> +      if (!name
> +         /* We might have generated new SSA names in
> +            update_alias_info_with_stack_vars.  They will have a NULL
> +            defining statement, and won't be part of the partitioning,
> +            so ignore those.  */
> +         || !SSA_NAME_DEF_STMT (name))
> +       continue;
> +
> +      adjust_one_expanded_partition_var (name);
> +    }
> +
> +  /* Clean up RTL of variables that straddle across multiple
> +     partitions, and check that the rtl of any PARM_DECLs that are not
> +     cleaned up is that of their default defs.  */
>    for (i = 1; i < num_ssa_names; i++)
>      {
>        tree name = ssa_name (i);
> @@ -6235,9 +6236,6 @@ pass_expand::execute (function *fun)
>        if (part == NO_PARTITION)
>         continue;
>
> -      gcc_assert (SA.partition_to_pseudo[part]
> -                 || defer_stack_allocation (name, true));
> -
>        /* If this decl was marked as living in multiple places, reset
>          this now to NULL.  */
>        tree var = SSA_NAME_VAR (name);
> @@ -6252,7 +6250,19 @@ pass_expand::execute (function *fun)
>           rtx in = DECL_RTL_IF_SET (var);
>           gcc_assert (in);
>           rtx out = SA.partition_to_pseudo[part];
> -         gcc_assert (in == out || rtx_equal_p (in, out));
> +         gcc_assert (in == out);
> +
> +         /* Now reset VAR's RTL to IN, so that the _EXPR attrs match
> +            those expected by debug backends for each parm and for
> +            the result.  This is particularly important for stabs,
> +            whose register elimination from parm's DECL_RTL may cause
> +            -fcompare-debug differences as SET_DECL_RTL changes reg's
> +            attrs.  So, make sure the RTL already has the parm as the
> +            EXPR, so that it won't change.  */
> +         SET_DECL_RTL (var, NULL_RTX);
> +         if (MEM_P (in))
> +           set_mem_attributes (in, var, true);
> +         SET_DECL_RTL (var, in);
>         }
>      }
>
> diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
> index ff7f4bef..8852411 100644
> --- a/gcc/cfgexpand.h
> +++ b/gcc/cfgexpand.h
> @@ -22,8 +22,7 @@ along with GCC; see the file COPYING3.  If not see
>
>  extern tree gimple_assign_rhs_to_tree (gimple *);
>  extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
> -extern bool parm_in_stack_slot_p (tree);
> -extern rtx get_rtl_for_parm_ssa_default_def (tree var);
> +extern void set_parm_rtl (tree, rtx);
>
>
>  #endif /* GCC_CFGEXPAND_H */
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 09c58ee..aefb061 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -8866,12 +8866,13 @@ profitable to parallelize the loops.
>
>  @item -ftree-coalesce-vars
>  @opindex ftree-coalesce-vars
> -Tell the compiler to attempt to combine small user-defined variables
> -too, instead of just compiler temporaries.  This may severely limit the
> -ability to debug an optimized program compiled with
> +While transforming the program out of the SSA representation, attempt to
> +reduce copying by coalescing versions of different user-defined
> +variables, instead of just compiler temporaries.  This may severely
> +limit the ability to debug an optimized program compiled with
>  @option{-fno-var-tracking-assignments}.  In the negated form, this flag
>  prevents SSA coalescing of user variables.  This option is enabled by
> -default if optimization is enabled.
> +default if optimization is enabled, and it does very little otherwise.
>
>  @item -ftree-loop-if-convert
>  @opindex ftree-loop-if-convert
> diff --git a/gcc/explow.c b/gcc/explow.c
> index 6941f4e..d104a79 100644
> --- a/gcc/explow.c
> +++ b/gcc/explow.c
> @@ -830,8 +830,10 @@ promote_decl_mode (const_tree decl, int *punsignedp)
>    machine_mode mode = DECL_MODE (decl);
>    machine_mode pmode;
>
> -  if (TREE_CODE (decl) == RESULT_DECL
> -      || TREE_CODE (decl) == PARM_DECL)
> +  if (TREE_CODE (decl) == RESULT_DECL && !DECL_BY_REFERENCE (decl))
> +    pmode = promote_function_mode (type, mode, &unsignedp,
> +                                   TREE_TYPE (current_function_decl), 1);
> +  else if (TREE_CODE (decl) == RESULT_DECL || TREE_CODE (decl) == PARM_DECL)
>      pmode = promote_function_mode (type, mode, &unsignedp,
>                                     TREE_TYPE (current_function_decl), 2);
>    else
> @@ -857,12 +859,23 @@ promote_ssa_mode (const_tree name, int *punsignedp)
>    if (SSA_NAME_VAR (name)
>        && (TREE_CODE (SSA_NAME_VAR (name)) == PARM_DECL
>           || TREE_CODE (SSA_NAME_VAR (name)) == RESULT_DECL))
> -    return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
> +    {
> +      machine_mode mode = promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
> +      if (mode != BLKmode)
> +       return mode;
> +    }
>
>    tree type = TREE_TYPE (name);
>    int unsignedp = TYPE_UNSIGNED (type);
>    machine_mode mode = TYPE_MODE (type);
>
> +  /* Bypass TYPE_MODE when it maps vector modes to BLKmode.  */
> +  if (mode == BLKmode)
> +    {
> +      gcc_assert (VECTOR_TYPE_P (type));
> +      mode = type->type_common.mode;
> +    }
> +
>    machine_mode pmode = promote_mode (type, mode, &unsignedp);
>    if (punsignedp)
>      *punsignedp = unsignedp;
> diff --git a/gcc/function.c b/gcc/function.c
> index 9b4c2b9..21304689 100644
> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -74,8 +74,6 @@ along with GCC; see the file COPYING3.  If not see
>  #include "cfgbuild.h"
>  #include "cfgcleanup.h"
>  #include "cfgexpand.h"
> -#include "basic-block.h"
> -#include "df.h"
>  #include "params.h"
>  #include "bb-reorder.h"
>  #include "shrink-wrap.h"
> @@ -83,6 +81,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "rtl-iter.h"
>  #include "tree-chkp.h"
>  #include "rtl-chkp.h"
> +#include "tree-dfa.h"
>
>  /* So we can assign to cfun in this file.  */
>  #undef cfun
> @@ -152,9 +151,6 @@ static bool contains (const_rtx, hash_table<insn_cache_hasher> *);
>  static void prepare_function_start (void);
>  static void do_clobber_return_reg (rtx, void *);
>  static void do_use_return_reg (rtx, void *);
> -static rtx rtl_for_parm (struct assign_parm_data_all *, tree);
> -static void maybe_reset_rtl_for_parm (tree);
> -static bool parm_in_unassigned_mem_p (tree, rtx);
>
>
>  /* Stack of nested functions.  */
> @@ -2145,6 +2141,47 @@ use_register_for_decl (const_tree decl)
>    if (TREE_ADDRESSABLE (decl))
>      return false;
>
> +  /* RESULT_DECLs are a bit special in that they're assigned without
> +     regard to use_register_for_decl, but we generally only store in
> +     them.  If we coalesce their SSA NAMEs, we'd better return a
> +     result that matches the assignment in expand_function_start.  */
> +  if (TREE_CODE (decl) == RESULT_DECL)
> +    {
> +      /* If it's not an aggregate, we're going to use a REG or a
> +        PARALLEL containing a REG.  */
> +      if (!aggregate_value_p (decl, current_function_decl))
> +       return true;
> +
> +      /* If expand_function_start determines the return value, we'll
> +        use MEM if it's not by reference.  */
> +      if (cfun->returns_pcc_struct
> +         || (targetm.calls.struct_value_rtx
> +             (TREE_TYPE (current_function_decl), 1)))
> +       return DECL_BY_REFERENCE (decl);
> +
> +      /* Otherwise, we're taking an extra all.function_result_decl
> +        argument.  It's set up in assign_parms_augmented_arg_list,
> +        under the (negated) conditions above, and then it's used to
> +        set up the RESULT_DECL rtl in assign_params, after looping
> +        over all parameters.  Now, if the RESULT_DECL is not by
> +        reference, we'll use a MEM either way.  */
> +      if (!DECL_BY_REFERENCE (decl))
> +       return false;
> +
> +      /* Otherwise, if RESULT_DECL is DECL_BY_REFERENCE, it will take
> +        the function_result_decl's assignment.  Since it's a pointer,
> +        we can short-circuit a number of the tests below, and we must
> +        duplicate them because we don't have the
> +        function_result_decl to test.  */
> +      if (!targetm.calls.allocate_stack_slots_for_args ())
> +       return true;
> +      /* We don't set DECL_IGNORED_P for the function_result_decl.  */
> +      if (optimize)
> +       return true;
> +      /* We don't set DECL_REGISTER for the function_result_decl.  */
> +      return false;
> +    }
> +
>    /* Decl is implicitly addressible by bound stores and loads
>       if it is an aggregate holding bounds.  */
>    if (chkp_function_instrumented_p (current_function_decl)
> @@ -2272,7 +2309,7 @@ assign_parms_initialize_all (struct assign_parm_data_all *all)
>     needed, else the old list.  */
>
>  static void
> -split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
> +split_complex_args (vec<tree> *args)
>  {
>    unsigned i;
>    tree p;
> @@ -2283,7 +2320,6 @@ split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
>        if (TREE_CODE (type) == COMPLEX_TYPE
>           && targetm.calls.split_complex_arg (type))
>         {
> -         tree cparm = p;
>           tree decl;
>           tree subtype = TREE_TYPE (type);
>           bool addressable = TREE_ADDRESSABLE (p);
> @@ -2302,9 +2338,6 @@ split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
>           DECL_ARTIFICIAL (p) = addressable;
>           DECL_IGNORED_P (p) = addressable;
>           TREE_ADDRESSABLE (p) = 0;
> -         /* Reset the RTL before layout_decl, or it may change the
> -            mode of the RTL of the original argument copied to P.  */
> -         SET_DECL_RTL (p, NULL_RTX);
>           layout_decl (p, 0);
>           (*args)[i] = p;
>
> @@ -2316,41 +2349,6 @@ split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
>           DECL_IGNORED_P (decl) = addressable;
>           layout_decl (decl, 0);
>           args->safe_insert (++i, decl);
> -
> -         /* If we are expanding a function, rather than gimplifying
> -            it, propagate the RTL of the complex parm to the split
> -            declarations, and set their contexts so that
> -            maybe_reset_rtl_for_parm can recognize them and refrain
> -            from resetting their RTL.  */
> -         if (currently_expanding_to_rtl)
> -           {
> -             maybe_reset_rtl_for_parm (cparm);
> -             rtx rtl = rtl_for_parm (all, cparm);
> -             if (rtl)
> -               {
> -                 /* If this is parm is unassigned, assign it now: the
> -                    newly-created decls wouldn't expect the need for
> -                    assignment, and if they were assigned
> -                    independently, they might not end up in adjacent
> -                    slots, so unsplit wouldn't be able to fill in the
> -                    unassigned address of the complex MEM.  */
> -                 if (parm_in_unassigned_mem_p (cparm, rtl))
> -                   {
> -                     int align = STACK_SLOT_ALIGNMENT
> -                       (TREE_TYPE (cparm), GET_MODE (rtl), MEM_ALIGN (rtl));
> -                     rtx loc = assign_stack_local
> -                       (GET_MODE (rtl), GET_MODE_SIZE (GET_MODE (rtl)),
> -                        align);
> -                     XEXP (rtl, 0) = XEXP (loc, 0);
> -                   }
> -
> -                 SET_DECL_RTL (p, read_complex_part (rtl, false));
> -                 SET_DECL_RTL (decl, read_complex_part (rtl, true));
> -
> -                 DECL_CONTEXT (p) = cparm;
> -                 DECL_CONTEXT (decl) = cparm;
> -               }
> -           }
>         }
>      }
>  }
> @@ -2386,6 +2384,9 @@ assign_parms_augmented_arg_list (struct assign_parm_data_all *all)
>        DECL_ARTIFICIAL (decl) = 1;
>        DECL_NAMELESS (decl) = 1;
>        TREE_CONSTANT (decl) = 1;
> +      /* We don't set DECL_IGNORED_P or DECL_REGISTER here.  If this
> +        changes, the end of the RESULT_DECL handling block in
> +        use_register_for_decl must be adjusted to match.  */
>
>        DECL_CHAIN (decl) = all->orig_fnargs;
>        all->orig_fnargs = decl;
> @@ -2413,7 +2414,7 @@ assign_parms_augmented_arg_list (struct assign_parm_data_all *all)
>
>    /* If the target wants to split complex arguments into scalars, do so.  */
>    if (targetm.calls.split_complex_arg)
> -    split_complex_args (all, &fnargs);
> +    split_complex_args (&fnargs);
>
>    return fnargs;
>  }
> @@ -2816,98 +2817,23 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data)
>    data->entry_parm = entry_parm;
>  }
>
> -/* Wrapper for use_register_for_decl, that special-cases the
> -   .result_ptr as the function's RESULT_DECL when the RESULT_DECL is
> -   passed by reference.  */
> -
> -static bool
> -use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm)
> -{
> -  if (parm == all->function_result_decl)
> -    {
> -      tree result = DECL_RESULT (current_function_decl);
> -
> -      if (DECL_BY_REFERENCE (result))
> -       parm = result;
> -    }
> -
> -  return use_register_for_decl (parm);
> -}
> -
> -/* Wrapper for get_rtl_for_parm_ssa_default_def, that special-cases
> -   the .result_ptr as the function's RESULT_DECL when the RESULT_DECL
> -   is passed by reference.  */
> -
> -static rtx
> -rtl_for_parm (struct assign_parm_data_all *all, tree parm)
> -{
> -  if (parm == all->function_result_decl)
> -    {
> -      tree result = DECL_RESULT (current_function_decl);
> -
> -      if (!DECL_BY_REFERENCE (result))
> -       return NULL_RTX;
> -
> -      parm = result;
> -    }
> -
> -  return get_rtl_for_parm_ssa_default_def (parm);
> -}
> -
> -/* Reset the location of PARM_DECLs and RESULT_DECLs that had
> -   SSA_NAMEs in multiple partitions, so that assign_parms will choose
> -   the default def, if it exists, or create new RTL to hold the unused
> -   entry value.  If we are coalescing across variables, we want to
> -   reset the location too, because a parm without a default def
> -   (incoming value unused) might be coalesced with one with a default
> -   def, and then assign_parms would copy both incoming values to the
> -   same location, which might cause the wrong value to survive.  */
> -static void
> -maybe_reset_rtl_for_parm (tree parm)
> -{
> -  gcc_assert (TREE_CODE (parm) == PARM_DECL
> -             || TREE_CODE (parm) == RESULT_DECL);
> -
> -  /* This is a split complex parameter, and its context was set to its
> -     original PARM_DECL in split_complex_args so that we could
> -     recognize it here and not reset its RTL.  */
> -  if (DECL_CONTEXT (parm) && TREE_CODE (DECL_CONTEXT (parm)) == PARM_DECL)
> -    {
> -      DECL_CONTEXT (parm) = DECL_CONTEXT (DECL_CONTEXT (parm));
> -      return;
> -    }
> -
> -  if ((flag_tree_coalesce_vars
> -       || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx))
> -      && is_gimple_reg (parm))
> -    SET_DECL_RTL (parm, NULL_RTX);
> -}
> -
>  /* A subroutine of assign_parms.  Adjust DATA->STACK_RTL such that it's
>     always valid and properly aligned.  */
>
>  static void
> -assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm,
> -                             struct assign_parm_data_one *data)
> +assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data)
>  {
>    rtx stack_parm = data->stack_parm;
>
> -  /* If out-of-SSA assigned RTL to the parm default def, make sure we
> -     don't use what we might have computed before.  */
> -  rtx ssa_assigned = rtl_for_parm (all, parm);
> -  if (ssa_assigned)
> -    stack_parm = NULL;
> -
>    /* If we can't trust the parm stack slot to be aligned enough for its
>       ultimate type, don't use that slot after entry.  We'll make another
>       stack slot, if we need one.  */
> -  else if (stack_parm
> -          && ((STRICT_ALIGNMENT
> -               && (GET_MODE_ALIGNMENT (data->nominal_mode)
> -                   > MEM_ALIGN (stack_parm)))
> -              || (data->nominal_type
> -                  && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
> -                  && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
> +  if (stack_parm
> +      && ((STRICT_ALIGNMENT
> +          && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm))
> +         || (data->nominal_type
> +             && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
> +             && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
>      stack_parm = NULL;
>
>    /* If parm was passed in memory, and we need to convert it on entry,
> @@ -2952,27 +2878,6 @@ assign_parm_setup_block_p (struct assign_parm_data_one *data)
>    return false;
>  }
>
> -/* Return true if FROM_EXPAND is a MEM with an address to be filled in
> -   by assign_params.  This should be the case if, and only if,
> -   parm_in_stack_slot_p holds for the parm DECL that expanded to
> -   FROM_EXPAND, so we check that, too.  */
> -
> -static bool
> -parm_in_unassigned_mem_p (tree decl, rtx from_expand)
> -{
> -  bool result = MEM_P (from_expand) && !XEXP (from_expand, 0);
> -
> -  gcc_assert (result == parm_in_stack_slot_p (decl)
> -             /* Maybe it was already assigned.  That's ok, especially
> -                for split complex args.  */
> -             || (!result && MEM_P (from_expand)
> -                 && (XEXP (from_expand, 0) == virtual_stack_vars_rtx
> -                     || (GET_CODE (XEXP (from_expand, 0)) == PLUS
> -                         && XEXP (XEXP (from_expand, 0), 0) == virtual_stack_vars_rtx))));
> -
> -  return result;
> -}
> -
>  /* A subroutine of assign_parms.  Arrange for the parameter to be
>     present and valid in DATA->STACK_RTL.  */
>
> @@ -2982,38 +2887,39 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>  {
>    rtx entry_parm = data->entry_parm;
>    rtx stack_parm = data->stack_parm;
> +  rtx target_reg = NULL_RTX;
>    HOST_WIDE_INT size;
>    HOST_WIDE_INT size_stored;
>
>    if (GET_CODE (entry_parm) == PARALLEL)
>      entry_parm = emit_group_move_into_temps (entry_parm);
>
> +  /* If we want the parameter in a pseudo, don't use a stack slot.  */
> +  if (is_gimple_reg (parm) && use_register_for_decl (parm))
> +    {
> +      tree def = ssa_default_def (cfun, parm);
> +      gcc_assert (def);
> +      machine_mode mode = promote_ssa_mode (def, NULL);
> +      rtx reg = gen_reg_rtx (mode);
> +      if (GET_CODE (reg) != CONCAT)
> +       stack_parm = reg;
> +      else
> +       /* This will use or allocate a stack slot that we'd rather
> +          avoid.  FIXME: Could we avoid it in more cases?  */
> +       target_reg = reg;
> +      data->stack_parm = NULL;
> +    }
> +
>    size = int_size_in_bytes (data->passed_type);
>    size_stored = CEIL_ROUND (size, UNITS_PER_WORD);
> -
>    if (stack_parm == 0)
>      {
>        DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
> -      rtx from_expand = rtl_for_parm (all, parm);
> -      if (from_expand && !parm_in_unassigned_mem_p (parm, from_expand))
> -       stack_parm = copy_rtx (from_expand);
> -      else
> -       {
> -         stack_parm = assign_stack_local (BLKmode, size_stored,
> -                                          DECL_ALIGN (parm));
> -         if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
> -           PUT_MODE (stack_parm, GET_MODE (entry_parm));
> -         if (from_expand)
> -           {
> -             gcc_assert (GET_CODE (stack_parm) == MEM);
> -             gcc_assert (parm_in_unassigned_mem_p (parm, from_expand));
> -             XEXP (from_expand, 0) = XEXP (stack_parm, 0);
> -             PUT_MODE (from_expand, GET_MODE (stack_parm));
> -             stack_parm = copy_rtx (from_expand);
> -           }
> -         else
> -           set_mem_attributes (stack_parm, parm, 1);
> -       }
> +      stack_parm = assign_stack_local (BLKmode, size_stored,
> +                                      DECL_ALIGN (parm));
> +      if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
> +       PUT_MODE (stack_parm, GET_MODE (entry_parm));
> +      set_mem_attributes (stack_parm, parm, 1);
>      }
>
>    /* If a BLKmode arrives in registers, copy it to a stack slot.  Handle
> @@ -3054,11 +2960,6 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>        else if (size == 0)
>         ;
>
> -      /* MEM may be a REG if coalescing assigns the param's partition
> -        to a pseudo.  */
> -      else if (REG_P (mem))
> -       emit_move_insn (mem, entry_parm);
> -
>        /* If SIZE is that of a mode no bigger than a word, just use
>          that mode's store operation.  */
>        else if (size <= UNITS_PER_WORD)
> @@ -3113,10 +3014,14 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>               tem = change_address (mem, word_mode, 0);
>               emit_move_insn (tem, x);
>             }
> +         else if (!MEM_P (mem))
> +           emit_move_insn (mem, entry_parm);
>           else
>             move_block_from_reg (REGNO (entry_parm), mem,
>                                  size_stored / UNITS_PER_WORD);
>         }
> +      else if (!MEM_P (mem))
> +       emit_move_insn (mem, entry_parm);
>        else
>         move_block_from_reg (REGNO (entry_parm), mem,
>                              size_stored / UNITS_PER_WORD);
> @@ -3131,8 +3036,14 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>        end_sequence ();
>      }
>
> +  if (target_reg)
> +    {
> +      emit_move_insn (target_reg, stack_parm);
> +      stack_parm = target_reg;
> +    }
> +
>    data->stack_parm = stack_parm;
> -  SET_DECL_RTL (parm, stack_parm);
> +  set_parm_rtl (parm, stack_parm);
>  }
>
>  /* A subroutine of assign_parms.  Allocate a pseudo to hold the current
> @@ -3148,6 +3059,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>    int unsignedp = TYPE_UNSIGNED (TREE_TYPE (parm));
>    bool did_conversion = false;
>    bool need_conversion, moved;
> +  rtx rtl;
>
>    /* Store the parm in a pseudoregister during the function, but we may
>       need to do it in a wider mode.  Using 2 here makes the result
> @@ -3156,40 +3068,19 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>      = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
>                              TREE_TYPE (current_function_decl), 2);
>
> -  rtx from_expand = parmreg = rtl_for_parm (all, parm);
> -
> -  if (from_expand && !data->passed_pointer)
> -    {
> -      if (GET_MODE (parmreg) != promoted_nominal_mode)
> -       parmreg = gen_lowpart (promoted_nominal_mode, parmreg);
> -    }
> -  else if (!from_expand || parm_in_unassigned_mem_p (parm, from_expand))
> -    {
> -      parmreg = gen_reg_rtx (promoted_nominal_mode);
> -      if (!DECL_ARTIFICIAL (parm))
> -       mark_user_reg (parmreg);
> -
> -      if (from_expand)
> -       {
> -         gcc_assert (data->passed_pointer);
> -         gcc_assert (GET_CODE (from_expand) == MEM
> -                     && XEXP (from_expand, 0) == NULL_RTX);
> -         XEXP (from_expand, 0) = parmreg;
> -       }
> -    }
> +  parmreg = gen_reg_rtx (promoted_nominal_mode);
> +  if (!DECL_ARTIFICIAL (parm))
> +    mark_user_reg (parmreg);
>
>    /* If this was an item that we received a pointer to,
> -     set DECL_RTL appropriately.  */
> -  if (from_expand)
> -    SET_DECL_RTL (parm, from_expand);
> -  else if (data->passed_pointer)
> +     set rtl appropriately.  */
> +  if (data->passed_pointer)
>      {
> -      rtx x = gen_rtx_MEM (TYPE_MODE (TREE_TYPE (data->passed_type)), parmreg);
> -      set_mem_attributes (x, parm, 1);
> -      SET_DECL_RTL (parm, x);
> +      rtl = gen_rtx_MEM (TYPE_MODE (TREE_TYPE (data->passed_type)), parmreg);
> +      set_mem_attributes (rtl, parm, 1);
>      }
>    else
> -    SET_DECL_RTL (parm, parmreg);
> +    rtl = parmreg;
>
>    assign_parm_remove_parallels (data);
>
> @@ -3197,13 +3088,10 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>       assign_parm_find_data_types and expand_expr_real_1.  */
>
>    equiv_stack_parm = data->stack_parm;
> -  if (!equiv_stack_parm)
> -    equiv_stack_parm = data->entry_parm;
>    validated_mem = validize_mem (copy_rtx (data->entry_parm));
>
>    need_conversion = (data->nominal_mode != data->passed_mode
>                      || promoted_nominal_mode != data->promoted_mode);
> -  gcc_assert (!(need_conversion && data->passed_pointer && from_expand));
>    moved = false;
>
>    if (need_conversion
> @@ -3327,7 +3215,9 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>
>        /* TREE_USED gets set erroneously during expand_assignment.  */
>        save_tree_used = TREE_USED (parm);
> +      SET_DECL_RTL (parm, rtl);
>        expand_assignment (parm, make_tree (data->nominal_type, tempreg), false);
> +      SET_DECL_RTL (parm, NULL_RTX);
>        TREE_USED (parm) = save_tree_used;
>        all->first_conversion_insn = get_insns ();
>        all->last_conversion_insn = get_last_insn ();
> @@ -3335,28 +3225,16 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>
>        did_conversion = true;
>      }
> -  /* We don't want to copy the incoming pointer to a parmreg expected
> -     to hold the value rather than the pointer.  */
> -  else if (!data->passed_pointer || parmreg != from_expand)
> +  else
>      emit_move_insn (parmreg, validated_mem);
>
>    /* If we were passed a pointer but the actual value can safely live
>       in a register, retrieve it and use it directly.  */
> -  if (data->passed_pointer
> -      && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode))
> +  if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode)
>      {
> -      rtx src = DECL_RTL (parm);
> -
>        /* We can't use nominal_mode, because it will have been set to
>          Pmode above.  We must use the actual mode of the parm.  */
> -      if (from_expand)
> -       {
> -         parmreg = from_expand;
> -         gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm)));
> -         src = gen_rtx_MEM (GET_MODE (parmreg), validated_mem);
> -         set_mem_attributes (src, parm, 1);
> -       }
> -      else if (use_register_for_decl (parm))
> +      if (use_register_for_decl (parm))
>         {
>           parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm)));
>           mark_user_reg (parmreg);
> @@ -3373,14 +3251,14 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>           set_mem_attributes (parmreg, parm, 1);
>         }
>
> -      if (GET_MODE (parmreg) != GET_MODE (src))
> +      if (GET_MODE (parmreg) != GET_MODE (rtl))
>         {
> -         rtx tempreg = gen_reg_rtx (GET_MODE (src));
> +         rtx tempreg = gen_reg_rtx (GET_MODE (rtl));
>           int unsigned_p = TYPE_UNSIGNED (TREE_TYPE (parm));
>
>           push_to_sequence2 (all->first_conversion_insn,
>                              all->last_conversion_insn);
> -         emit_move_insn (tempreg, src);
> +         emit_move_insn (tempreg, rtl);
>           tempreg = convert_to_mode (GET_MODE (parmreg), tempreg, unsigned_p);
>           emit_move_insn (parmreg, tempreg);
>           all->first_conversion_insn = get_insns ();
> @@ -3389,18 +3267,18 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>
>           did_conversion = true;
>         }
> -      else if (GET_MODE (parmreg) == BLKmode)
> -       gcc_assert (parm_in_stack_slot_p (parm));
>        else
> -       emit_move_insn (parmreg, src);
> +       emit_move_insn (parmreg, rtl);
>
> -      SET_DECL_RTL (parm, parmreg);
> +      rtl = parmreg;
>
>        /* STACK_PARM is the pointer, not the parm, and PARMREG is
>          now the parm.  */
> -      data->stack_parm = equiv_stack_parm = NULL;
> +      data->stack_parm = NULL;
>      }
>
> +  set_parm_rtl (parm, rtl);
> +
>    /* Mark the register as eliminable if we did no conversion and it was
>       copied from memory at a fixed offset, and the arg pointer was not
>       copied to a pseudo-reg.  If the arg pointer is a pseudo reg or the
> @@ -3408,11 +3286,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>       make here would screw up life analysis for it.  */
>    if (data->nominal_mode == data->passed_mode
>        && !did_conversion
> -      && equiv_stack_parm != 0
> -      && MEM_P (equiv_stack_parm)
> +      && data->stack_parm != 0
> +      && MEM_P (data->stack_parm)
>        && data->locate.offset.var == 0
>        && reg_mentioned_p (virtual_incoming_args_rtx,
> -                         XEXP (equiv_stack_parm, 0)))
> +                         XEXP (data->stack_parm, 0)))
>      {
>        rtx_insn *linsn = get_last_insn ();
>        rtx_insn *sinsn;
> @@ -3425,8 +3303,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>             = GET_MODE_INNER (GET_MODE (parmreg));
>           int regnor = REGNO (XEXP (parmreg, 0));
>           int regnoi = REGNO (XEXP (parmreg, 1));
> -         rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0);
> -         rtx stacki = adjust_address_nv (equiv_stack_parm, submode,
> +         rtx stackr = adjust_address_nv (data->stack_parm, submode, 0);
> +         rtx stacki = adjust_address_nv (data->stack_parm, submode,
>                                           GET_MODE_SIZE (submode));
>
>           /* Scan backwards for the set of the real and
> @@ -3444,7 +3322,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>                 set_unique_reg_note (sinsn, REG_EQUIV, stackr);
>             }
>         }
> -      else
> +      else
>         set_dst_reg_note (linsn, REG_EQUIV, equiv_stack_parm, parmreg);
>      }
>
> @@ -3496,16 +3374,6 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
>    if (data->entry_parm != data->stack_parm)
>      {
>        rtx src, dest;
> -      rtx from_expand = NULL_RTX;
> -
> -      if (data->stack_parm == 0)
> -       {
> -         from_expand = rtl_for_parm (all, parm);
> -         if (from_expand)
> -           gcc_assert (GET_MODE (from_expand) == GET_MODE (data->entry_parm));
> -         if (from_expand && !parm_in_unassigned_mem_p (parm, from_expand))
> -           data->stack_parm = from_expand;
> -       }
>
>        if (data->stack_parm == 0)
>         {
> @@ -3516,16 +3384,7 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
>             = assign_stack_local (GET_MODE (data->entry_parm),
>                                   GET_MODE_SIZE (GET_MODE (data->entry_parm)),
>                                   align);
> -         if (!from_expand)
> -           set_mem_attributes (data->stack_parm, parm, 1);
> -         else
> -           {
> -             gcc_assert (GET_CODE (data->stack_parm) == MEM);
> -             gcc_assert (parm_in_unassigned_mem_p (parm, from_expand));
> -             XEXP (from_expand, 0) = XEXP (data->stack_parm, 0);
> -             PUT_MODE (from_expand, GET_MODE (data->stack_parm));
> -             data->stack_parm = copy_rtx (from_expand);
> -           }
> +         set_mem_attributes (data->stack_parm, parm, 1);
>         }
>
>        dest = validize_mem (copy_rtx (data->stack_parm));
> @@ -3554,7 +3413,7 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
>        end_sequence ();
>      }
>
> -  SET_DECL_RTL (parm, data->stack_parm);
> +  set_parm_rtl (parm, data->stack_parm);
>  }
>
>  /* A subroutine of assign_parms.  If the ABI splits complex arguments, then
> @@ -3580,21 +3439,11 @@ assign_parms_unsplit_complex (struct assign_parm_data_all *all,
>           imag = DECL_RTL (fnargs[i + 1]);
>           if (inner != GET_MODE (real))
>             {
> -             real = simplify_gen_subreg (inner, real, GET_MODE (real),
> -                                         subreg_lowpart_offset
> -                                         (inner, GET_MODE (real)));
> -             imag = simplify_gen_subreg (inner, imag, GET_MODE (imag),
> -                                         subreg_lowpart_offset
> -                                         (inner, GET_MODE (imag)));
> +             real = gen_lowpart_SUBREG (inner, real);
> +             imag = gen_lowpart_SUBREG (inner, imag);
>             }
>
> -         if ((tmp = rtl_for_parm (all, parm)) != NULL_RTX
> -             && rtx_equal_p (real,
> -                             read_complex_part (tmp, false))
> -             && rtx_equal_p (imag,
> -                             read_complex_part (tmp, true)))
> -           ; /* We now have the right rtl in tmp.  */
> -         else if (TREE_ADDRESSABLE (parm))
> +         if (TREE_ADDRESSABLE (parm))
>             {
>               rtx rmem, imem;
>               HOST_WIDE_INT size = int_size_in_bytes (TREE_TYPE (parm));
> @@ -3618,7 +3467,7 @@ assign_parms_unsplit_complex (struct assign_parm_data_all *all,
>             }
>           else
>             tmp = gen_rtx_CONCAT (DECL_MODE (parm), real, imag);
> -         SET_DECL_RTL (parm, tmp);
> +         set_parm_rtl (parm, tmp);
>
>           real = DECL_INCOMING_RTL (fnargs[i]);
>           imag = DECL_INCOMING_RTL (fnargs[i + 1]);
> @@ -3740,7 +3589,7 @@ assign_bounds (vec<bounds_parm_data> &bndargs,
>           assign_parm_setup_block (&all, pbdata->bounds_parm,
>                                    &pbdata->parm_data);
>         else if (pbdata->parm_data.passed_pointer
> -                || use_register_for_parm_decl (&all, pbdata->bounds_parm))
> +                || use_register_for_decl (pbdata->bounds_parm))
>           assign_parm_setup_reg (&all, pbdata->bounds_parm,
>                                  &pbdata->parm_data);
>         else
> @@ -3784,8 +3633,6 @@ assign_parms (tree fndecl)
>           DECL_INCOMING_RTL (parm) = DECL_RTL (parm);
>           continue;
>         }
> -      else
> -       maybe_reset_rtl_for_parm (parm);
>
>        /* Estimate stack alignment from parameter alignment.  */
>        if (SUPPORTS_STACK_ALIGNMENT)
> @@ -3835,7 +3682,7 @@ assign_parms (tree fndecl)
>        else
>         set_decl_incoming_rtl (parm, data.entry_parm, false);
>
> -      assign_parm_adjust_stack_rtl (&all, parm, &data);
> +      assign_parm_adjust_stack_rtl (&data);
>
>        /* Bounds should be loaded in the particular order to
>          have registers allocated correctly.  Collect info about
> @@ -3856,8 +3703,7 @@ assign_parms (tree fndecl)
>         {
>           if (assign_parm_setup_block_p (&data))
>             assign_parm_setup_block (&all, parm, &data);
> -         else if (data.passed_pointer
> -                  || use_register_for_parm_decl (&all, parm))
> +         else if (data.passed_pointer || use_register_for_decl (parm))
>             assign_parm_setup_reg (&all, parm, &data);
>           else
>             assign_parm_setup_stack (&all, parm, &data);
> @@ -3954,7 +3800,7 @@ assign_parms (tree fndecl)
>
>        DECL_HAS_VALUE_EXPR_P (result) = 1;
>
> -      SET_DECL_RTL (result, x);
> +      set_parm_rtl (result, x);
>      }
>
>    /* We have aligned all the args, so add space for the pretend args.  */
> @@ -4986,6 +4832,18 @@ allocate_struct_function (tree fndecl, bool abstract_p)
>    if (fndecl != NULL_TREE)
>      {
>        tree result = DECL_RESULT (fndecl);
> +
> +      if (!abstract_p)
> +       {
> +         /* Now that we have activated any function-specific attributes
> +            that might affect layout, particularly vector modes, relayout
> +            each of the parameters and the result.  */
> +         relayout_decl (result);
> +         for (tree parm = DECL_ARGUMENTS (fndecl); parm;
> +              parm = DECL_CHAIN (parm))
> +           relayout_decl (parm);
> +       }
> +
>        if (!abstract_p && aggregate_value_p (result, fndecl))
>         {
>  #ifdef PCC_STATIC_STRUCT_RETURN
> @@ -5189,7 +5047,6 @@ expand_function_start (tree subr)
>
>    /* Decide whether to return the value in memory or in a register.  */
>    tree res = DECL_RESULT (subr);
> -  maybe_reset_rtl_for_parm (res);
>    if (aggregate_value_p (res, subr))
>      {
>        /* Returning something that won't go in a register.  */
> @@ -5210,10 +5067,7 @@ expand_function_start (tree subr)
>              it.  */
>           if (sv)
>             {
> -             if (DECL_BY_REFERENCE (res))
> -               value_address = get_rtl_for_parm_ssa_default_def (res);
> -             if (!value_address)
> -               value_address = gen_reg_rtx (Pmode);
> +             value_address = gen_reg_rtx (Pmode);
>               emit_move_insn (value_address, sv);
>             }
>         }
> @@ -5222,33 +5076,35 @@ expand_function_start (tree subr)
>           rtx x = value_address;
>           if (!DECL_BY_REFERENCE (res))
>             {
> -             x = get_rtl_for_parm_ssa_default_def (res);
> -             if (!x)
> -               {
> -                 x = gen_rtx_MEM (DECL_MODE (res), value_address);
> -                 set_mem_attributes (x, res, 1);
> -               }
> +             x = gen_rtx_MEM (DECL_MODE (res), x);
> +             set_mem_attributes (x, res, 1);
>             }
> -         SET_DECL_RTL (res, x);
> +         set_parm_rtl (res, x);
>         }
>      }
>    else if (DECL_MODE (res) == VOIDmode)
>      /* If return mode is void, this decl rtl should not be used.  */
> -    SET_DECL_RTL (res, NULL_RTX);
> -  else
> +    set_parm_rtl (res, NULL_RTX);
> +  else
>      {
>        /* Compute the return values into a pseudo reg, which we will copy
>          into the true return register after the cleanups are done.  */
>        tree return_type = TREE_TYPE (res);
> -      rtx x = get_rtl_for_parm_ssa_default_def (res);
> -      if (x)
> -       /* Use it.  */;
> +      /* If we may coalesce this result, make sure it has the expected
> +        mode.  */
> +      if (flag_tree_coalesce_vars && is_gimple_reg (res))
> +       {
> +         tree def = ssa_default_def (cfun, res);
> +         gcc_assert (def);
> +         machine_mode mode = promote_ssa_mode (def, NULL);
> +         set_parm_rtl (res, gen_reg_rtx (mode));
> +       }
>        else if (TYPE_MODE (return_type) != BLKmode
>                && targetm.calls.return_in_msb (return_type))
>         /* expand_function_end will insert the appropriate padding in
>            this case.  Use the return value's natural (unpadded) mode
>            within the function proper.  */
> -       x = gen_reg_rtx (TYPE_MODE (return_type));
> +       set_parm_rtl (res, gen_reg_rtx (TYPE_MODE (return_type)));
>        else
>         {
>           /* In order to figure out what mode to use for the pseudo, we
> @@ -5259,16 +5115,14 @@ expand_function_start (tree subr)
>           /* Structures that are returned in registers are not
>              aggregate_value_p, so we may see a PARALLEL or a REG.  */
>           if (REG_P (hard_reg))
> -           x = gen_reg_rtx (GET_MODE (hard_reg));
> +           set_parm_rtl (res, gen_reg_rtx (GET_MODE (hard_reg)));
>           else
>             {
>               gcc_assert (GET_CODE (hard_reg) == PARALLEL);
> -             x = gen_group_rtx (hard_reg);
> +             set_parm_rtl (res, gen_group_rtx (hard_reg));
>             }
>         }
>
> -      SET_DECL_RTL (res, x);
> -
>        /* Set DECL_REGISTER flag so that expand_function_end will copy the
>          result to the real return register(s).  */
>        DECL_REGISTER (res) = 1;
> @@ -5291,22 +5145,23 @@ expand_function_start (tree subr)
>      {
>        tree parm = cfun->static_chain_decl;
>        rtx local, chain;
> -     rtx_insn *insn;
> +      rtx_insn *insn;
> +      int unsignedp;
>
> -      local = get_rtl_for_parm_ssa_default_def (parm);
> -      if (!local)
> -       local = gen_reg_rtx (Pmode);
> +      local = gen_reg_rtx (promote_decl_mode (parm, &unsignedp));
>        chain = targetm.calls.static_chain (current_function_decl, true);
>
>        set_decl_incoming_rtl (parm, chain, false);
> -      SET_DECL_RTL (parm, local);
> +      set_parm_rtl (parm, local);
>        mark_reg_pointer (local, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (parm))));
>
> -      if (GET_MODE (local) != Pmode)
> -       local = convert_to_mode (Pmode, local,
> -                                TYPE_UNSIGNED (TREE_TYPE (parm)));
> -
> -      insn = emit_move_insn (local, chain);
> +      if (GET_MODE (local) != GET_MODE (chain))
> +       {
> +         convert_move (local, chain, unsignedp);
> +         insn = get_last_insn ();
> +       }
> +      else
> +       insn = emit_move_insn (local, chain);
>
>        /* Mark the register as eliminable, similar to parameters.  */
>        if (MEM_P (chain)
> diff --git a/gcc/testsuite/gcc.dg/pr67312.c b/gcc/testsuite/gcc.dg/pr67312.c
> new file mode 100644
> index 0000000..f1c9fde
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr67312.c
> @@ -0,0 +1,7 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O0 -ftree-coalesce-vars" } */
> +
> +void foo (int x, int y)
> +{
> +    y = x;
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/stackalign/return-4.c b/gcc/testsuite/gcc.target/i386/stackalign/return-4.c
> index a1e35dc..d14eb2f 100644
> --- a/gcc/testsuite/gcc.target/i386/stackalign/return-4.c
> +++ b/gcc/testsuite/gcc.target/i386/stackalign/return-4.c
> @@ -1,6 +1,13 @@
>  /* { dg-do compile } */
> -/* { dg-options "-mpreferred-stack-boundary=4" } */
> +/* { dg-options "-mpreferred-stack-boundary=4 -O" } */
>  /* { dg-final { scan-assembler-not "and\[lq\]?\[^\\n\]*-64,\[^\\n\]*sp" } } */
> +/* We only guarantee we won't generate the stack alignment when
> +   optimizing.  When not optimizing, the return value will be assigned
> +   to a pseudo with the specified alignment, which in turn will force
> +   stack alignment since the pseudo might have to be spilled.  Without
> +   optimization, we wouldn't compute the actual stack requirements
> +   after register allocation and reload, and just use the conservative
> +   estimate.  */
>
>  /* This compile only test is to detect an assertion failure in stack branch
>     development.  */
> diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
> index fd00883..8dc4908 100644
> --- a/gcc/tree-outof-ssa.c
> +++ b/gcc/tree-outof-ssa.c
> @@ -980,7 +980,6 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
>  {
>    bitmap values = NULL;
>    var_map map;
> -  unsigned i;
>
>    map = coalesce_ssa_name ();
>
> @@ -1005,17 +1004,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
>
>    sa->map = map;
>    sa->values = values;
> -  sa->partition_has_default_def = BITMAP_ALLOC (NULL);
> -  for (i = 1; i < num_ssa_names; i++)
> -    {
> -      tree t = ssa_name (i);
> -      if (t && SSA_NAME_IS_DEFAULT_DEF (t))
> -       {
> -         int p = var_to_partition (map, t);
> -         if (p != NO_PARTITION)
> -           bitmap_set_bit (sa->partition_has_default_def, p);
> -       }
> -    }
> +  sa->partitions_for_parm_default_defs = get_parm_default_def_partitions (map);
>  }
>
>
> @@ -1190,7 +1179,7 @@ finish_out_of_ssa (struct ssaexpand *sa)
>    if (sa->values)
>      BITMAP_FREE (sa->values);
>    delete_var_map (sa->map);
> -  BITMAP_FREE (sa->partition_has_default_def);
> +  BITMAP_FREE (sa->partitions_for_parm_default_defs);
>    memset (sa, 0, sizeof *sa);
>  }
>
> diff --git a/gcc/tree-outof-ssa.h b/gcc/tree-outof-ssa.h
> index 687e5a5..60b6379 100644
> --- a/gcc/tree-outof-ssa.h
> +++ b/gcc/tree-outof-ssa.h
> @@ -39,9 +39,9 @@ struct ssaexpand
>       a pseudos REG).  */
>    rtx *partition_to_pseudo;
>
> -  /* If partition I contains an SSA name that has a default def,
> -     bit I will be set in this bitmap.  */
> -  bitmap partition_has_default_def;
> +  /* If partition I contains an SSA name that has a default def for a
> +     parameter, bit I will be set in this bitmap.  */
> +  bitmap partitions_for_parm_default_defs;
>  };
>
>  /* This is the singleton described above.  */
> diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
> index 8af6583..ff75877 100644
> --- a/gcc/tree-ssa-coalesce.c
> +++ b/gcc/tree-ssa-coalesce.c
> @@ -39,7 +39,9 @@ along with GCC; see the file COPYING3.  If not see
>  #include "cfgexpand.h"
>  #include "explow.h"
>  #include "diagnostic-core.h"
> -
> +#include "tree-dfa.h"
> +#include "tm_p.h"
> +#include "stor-layout.h"
>
>  /* This set of routines implements a coalesce_list.  This is an object which
>     is used to track pairs of ssa_names which are desirable to coalesce
> @@ -877,26 +879,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
>         }
>
>        /* Pretend there are defs for params' default defs at the start
> -        of the (post-)entry block.  */
> +        of the (post-)entry block.  This will prevent PARM_DECLs from
> +        coalescing into the same partition.  Although RESULT_DECLs'
> +        default defs don't have a useful initial value, we have to
> +        prevent them from coalescing with PARM_DECLs' default defs
> +        too, otherwise assign_parms would attempt to assign different
> +        RTL to the same partition.  */
>        if (bb == entry)
>         {
> -         unsigned base;
> -         bitmap_iterator bi;
> -         EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi)
> +         unsigned i;
> +         for (i = 1; i < num_ssa_names; i++)
>             {
> -             bitmap_iterator bi2;
> -             unsigned part;
> -             EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base],
> -                                       0, part, bi2)
> -               {
> -                 tree var = partition_to_var (map, part);
> -                 if (!SSA_NAME_VAR (var)
> -                     || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL
> -                         && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL)
> -                     || !SSA_NAME_IS_DEFAULT_DEF (var))
> -                   continue;
> -                 live_track_process_def (live, var, graph);
> -               }
> +             tree var = ssa_name (i);
> +
> +             if (!var
> +                 || !SSA_NAME_IS_DEFAULT_DEF (var)
> +                 || !SSA_NAME_VAR (var)
> +                 || VAR_P (SSA_NAME_VAR (var)))
> +               continue;
> +
> +             live_track_process_def (live, var, graph);
> +             /* Process a use too, so that it remains live and
> +                conflicts with other parms' default defs, even unused
> +                ones.  */
> +             live_track_process_use (live, var);
>             }
>         }
>
> @@ -937,6 +943,71 @@ fail_abnormal_edge_coalesce (int x, int y)
>    internal_error ("SSA corruption");
>  }
>
> +/* Call CALLBACK for all PARM_DECLs and RESULT_DECLs for which
> +   assign_parms may ask for a default partition.  */
> +
> +static void
> +for_all_parms (void (*callback)(tree var, void *arg), void *arg)
> +{
> +  for (tree var = DECL_ARGUMENTS (current_function_decl); var;
> +       var = DECL_CHAIN (var))
> +    callback (var, arg);
> +  if (!VOID_TYPE_P (TREE_TYPE (DECL_RESULT (current_function_decl))))
> +    callback (DECL_RESULT (current_function_decl), arg);
> +  if (cfun->static_chain_decl)
> +    callback (cfun->static_chain_decl, arg);
> +}
> +
> +/* Create a default def for VAR.  */
> +
> +static void
> +create_default_def (tree var, void *arg ATTRIBUTE_UNUSED)
> +{
> +  if (!is_gimple_reg (var))
> +    return;
> +
> +  tree ssa = get_or_create_ssa_default_def (cfun, var);
> +  gcc_assert (ssa);
> +}
> +
> +/* Register VAR's default def in MAP.  */
> +
> +static void
> +register_default_def (tree var, void *map_)
> +{
> +  var_map map = (var_map)map_;
> +
> +  if (!is_gimple_reg (var))
> +    return;
> +
> +  tree ssa = ssa_default_def (cfun, var);
> +  gcc_assert (ssa);
> +
> +  register_ssa_partition (map, ssa);
> +}
> +
> +/* If VAR is an SSA_NAME associated with a PARM_DECL or a RESULT_DECL,
> +   and the DECL's default def is unused (i.e., it was introduced by
> +   create_default_def), mark VAR and the default def for
> +   coalescing.  */
> +
> +static void
> +coalesce_with_default (tree var, coalesce_list_p cl, bitmap used_in_copy)
> +{
> +  if (SSA_NAME_IS_DEFAULT_DEF (var)
> +      || !SSA_NAME_VAR (var)
> +      || VAR_P (SSA_NAME_VAR (var)))
> +    return;
> +
> +  tree ssa = ssa_default_def (cfun, SSA_NAME_VAR (var));
> +  if (!has_zero_uses (ssa))
> +    return;
> +
> +  add_cost_one_coalesce (cl, SSA_NAME_VERSION (ssa), SSA_NAME_VERSION (var));
> +  bitmap_set_bit (used_in_copy, SSA_NAME_VERSION (var));
> +  /* Default defs will have their used_in_copy bits set at the end of
> +     create_outofssa_var_map.  */
> +}
>
>  /* This function creates a var_map for the current function as well as creating
>     a coalesce list for use later in the out of ssa process.  */
> @@ -954,8 +1025,12 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
>    int v1, v2, cost;
>    unsigned i;
>
> +  for_all_parms (create_default_def, NULL);
> +
>    map = init_var_map (num_ssa_names);
>
> +  for_all_parms (register_default_def, map);
> +
>    FOR_EACH_BB_FN (bb, cfun)
>      {
>        tree arg;
> @@ -1034,6 +1109,30 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
>               }
>               break;
>
> +           case GIMPLE_RETURN:
> +             {
> +               tree res = DECL_RESULT (current_function_decl);
> +               if (VOID_TYPE_P (TREE_TYPE (res))
> +                   || !is_gimple_reg (res))
> +                 break;
> +               tree rhs1 = gimple_return_retval (as_a <greturn *> (stmt));
> +               if (!rhs1)
> +                 break;
> +               tree lhs = ssa_default_def (cfun, res);
> +               gcc_assert (lhs);
> +               if (TREE_CODE (rhs1) == SSA_NAME
> +                   && gimple_can_coalesce_p (lhs, rhs1))
> +                 {
> +                   v1 = SSA_NAME_VERSION (lhs);
> +                   v2 = SSA_NAME_VERSION (rhs1);
> +                   cost = coalesce_cost_bb (bb);
> +                   add_coalesce (cl, v1, v2, cost);
> +                   bitmap_set_bit (used_in_copy, v1);
> +                   bitmap_set_bit (used_in_copy, v2);
> +                 }
> +               break;
> +             }
> +
>             case GIMPLE_ASM:
>               {
>                 gasm *asm_stmt = as_a <gasm *> (stmt);
> @@ -1100,10 +1199,13 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
>        var = ssa_name (i);
>        if (var != NULL_TREE && !virtual_operand_p (var))
>          {
> +         coalesce_with_default (var, cl, used_in_copy);
> +
>           /* Add coalesces between all the result decls.  */
>           if (SSA_NAME_VAR (var)
>               && TREE_CODE (SSA_NAME_VAR (var)) == RESULT_DECL)
>             {
> +             bitmap_set_bit (used_in_copy, SSA_NAME_VERSION (var));
>               if (first == NULL_TREE)
>                 first = var;
>               else
> @@ -1111,8 +1213,6 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
>                   gcc_assert (gimple_can_coalesce_p (var, first));
>                   v1 = SSA_NAME_VERSION (first);
>                   v2 = SSA_NAME_VERSION (var);
> -                 bitmap_set_bit (used_in_copy, v1);
> -                 bitmap_set_bit (used_in_copy, v2);
>                   cost = coalesce_cost_bb (EXIT_BLOCK_PTR_FOR_FN (cfun));
>                   add_coalesce (cl, v1, v2, cost);
>                 }
> @@ -1121,7 +1221,9 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
>              since they will have to be coalesced with the base variable.  If
>              not marked as present, they won't be in the coalesce view. */
>           if (SSA_NAME_IS_DEFAULT_DEF (var)
> -             && !has_zero_uses (var))
> +             && (!has_zero_uses (var)
> +                 || (SSA_NAME_VAR (var)
> +                     && !VAR_P (SSA_NAME_VAR (var)))))
>             bitmap_set_bit (used_in_copy, SSA_NAME_VERSION (var));
>         }
>      }
> @@ -1367,30 +1469,38 @@ gimple_can_coalesce_p (tree name1, tree name2)
>
>        /* We don't want to coalesce two SSA names if one of the base
>          variables is supposed to be a register while the other is
> -        supposed to be on the stack.  Anonymous SSA names take
> -        registers, but when not optimizing, user variables should go
> -        on the stack, so coalescing them with the anonymous variable
> -        as the partition leader would end up assigning the user
> -        variable to a register.  Don't do that!  */
> -      bool reg1 = !var1 || use_register_for_decl (var1);
> -      bool reg2 = !var2 || use_register_for_decl (var2);
> +        supposed to be on the stack.  Anonymous SSA names most often
> +        take registers, but when not optimizing, user variables
> +        should go on the stack, so coalescing them with the anonymous
> +        variable as the partition leader would end up assigning the
> +        user variable to a register.  Don't do that!  */
> +      bool reg1 = use_register_for_decl (name1);
> +      bool reg2 = use_register_for_decl (name2);
>        if (reg1 != reg2)
>         return false;
>
> -      /* Check that the promoted modes are the same.  We don't want to
> -        coalesce if the promoted modes would be different.  Only
> +      /* Check that the promoted modes and unsignedness are the same.
> +        We don't want to coalesce if the promoted modes would be
> +        different, or if they would sign-extend differently.  Only
>          PARM_DECLs and RESULT_DECLs have different promotion rules,
>          so skip the test if both are variables, or both are anonymous
> -        SSA_NAMEs.  Now, if a parm or result has BLKmode, do not
> -        coalesce its SSA versions with those of any other variables,
> -        because it may be passed by reference.  */
> +        SSA_NAMEs.  */
> +      int unsigned1, unsigned2;
>        return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
> -       || (/* The case var1 == var2 is already covered above.  */
> -           !parm_in_stack_slot_p (var1)
> -           && !parm_in_stack_slot_p (var2)
> -           && promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL));
> +       || ((promote_ssa_mode (name1, &unsigned1)
> +            == promote_ssa_mode (name2, &unsigned2))
> +           && unsigned1 == unsigned2);
>      }
>
> +  /* If alignment requirements are different, we can't coalesce.  */
> +  if (MINIMUM_ALIGNMENT (t1,
> +                        var1 ? DECL_MODE (var1) : TYPE_MODE (t1),
> +                        var1 ? LOCAL_DECL_ALIGNMENT (var1) : TYPE_ALIGN (t1))
> +      != MINIMUM_ALIGNMENT (t2,
> +                           var2 ? DECL_MODE (var2) : TYPE_MODE (t2),
> +                           var2 ? LOCAL_DECL_ALIGNMENT (var2) : TYPE_ALIGN (t2)))
> +    return false;
> +
>    /* If the types are not the same, check for a canonical type match.  This
>       (for example) allows coalescing when the types are fundamentally the
>       same, but just have different names.
> @@ -1639,7 +1749,8 @@ coalesce_ssa_name (void)
>           if (a
>               && SSA_NAME_VAR (a)
>               && !DECL_IGNORED_P (SSA_NAME_VAR (a))
> -             && (!has_zero_uses (a) || !SSA_NAME_IS_DEFAULT_DEF (a)))
> +             && (!has_zero_uses (a) || !SSA_NAME_IS_DEFAULT_DEF (a)
> +                 || !VAR_P (SSA_NAME_VAR (a))))
>             {
>               tree *slot = ssa_name_hash.find_slot (a, INSERT);
>
> @@ -1721,3 +1832,47 @@ coalesce_ssa_name (void)
>
>    return map;
>  }
> +
> +/* We need to pass two arguments to set_parm_default_def_partition,
> +   but for_all_parms only supports one.  Use a pair.  */
> +
> +typedef std::pair<var_map, bitmap> parm_default_def_partition_arg;
> +
> +/* Set in ARG's PARTS bitmap the bit corresponding to the partition in
> +   ARG's MAP containing VAR's default def.  */
> +
> +static void
> +set_parm_default_def_partition (tree var, void *arg_)
> +{
> +  parm_default_def_partition_arg *arg = (parm_default_def_partition_arg *)arg_;
> +  var_map map = arg->first;
> +  bitmap parts = arg->second;
> +
> +  if (!is_gimple_reg (var))
> +    return;
> +
> +  tree ssa = ssa_default_def (cfun, var);
> +  gcc_assert (ssa);
> +
> +  int version = var_to_partition (map, ssa);
> +  gcc_assert (version != NO_PARTITION);
> +
> +  bool changed = bitmap_set_bit (parts, version);
> +  gcc_assert (changed);
> +}
> +
> +/* Allocate and return a bitmap that has a bit set for each partition
> +   that contains a default def for a parameter.  */
> +
> +extern bitmap
> +get_parm_default_def_partitions (var_map map)
> +{
> +  bitmap parm_default_def_parts = BITMAP_ALLOC (NULL);
> +
> +  parm_default_def_partition_arg
> +    arg = std::make_pair (map, parm_default_def_parts);
> +
> +  for_all_parms (set_parm_default_def_partition, &arg);
> +
> +  return parm_default_def_parts;
> +}
> diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h
> index ae289b4..8316f34 100644
> --- a/gcc/tree-ssa-coalesce.h
> +++ b/gcc/tree-ssa-coalesce.h
> @@ -22,5 +22,6 @@ along with GCC; see the file COPYING3.  If not see
>
>  extern var_map coalesce_ssa_name (void);
>  extern bool gimple_can_coalesce_p (tree, tree);
> +extern bitmap get_parm_default_def_partitions (var_map);
>
>  #endif /* GCC_TREE_SSA_COALESCE_H */
> diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
> index e031725..25b548b 100644
> --- a/gcc/tree-ssa-live.c
> +++ b/gcc/tree-ssa-live.c
> @@ -200,7 +200,9 @@ partition_view_init (var_map map)
>        tmp = partition_find (map->var_partition, x);
>        if (ssa_name (tmp) != NULL_TREE && !virtual_operand_p (ssa_name (tmp))
>           && (!has_zero_uses (ssa_name (tmp))
> -             || !SSA_NAME_IS_DEFAULT_DEF (ssa_name (tmp))))
> +             || !SSA_NAME_IS_DEFAULT_DEF (ssa_name (tmp))
> +             || (SSA_NAME_VAR (ssa_name (tmp))
> +                 && !VAR_P (SSA_NAME_VAR (ssa_name (tmp))))))
>         bitmap_set_bit (used, tmp);
>      }
>
> @@ -1404,6 +1406,12 @@ verify_live_on_entry (tree_live_info_p live)
>                   }
>                 if (ok)
>                   continue;
> +               /* Expand adds unused default defs for PARM_DECLs and
> +                  RESULT_DECLs.  They're ok.  */
> +               if (has_zero_uses (var)
> +                   && SSA_NAME_VAR (var)
> +                   && !VAR_P (SSA_NAME_VAR (var)))
> +                 continue;
>                 num++;
>                 print_generic_expr (stderr, var, TDF_SLIM);
>                 fprintf (stderr, " is not marked live-on-entry to entry BB%d ",
>
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

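For readers following the gimple_can_coalesce_p hunk quoted above, the
register/stack test and the promoted-mode test it shows can be modelled
in isolation.  What follows is only a sketch under stated assumptions,
not code from the patch: Name, Kind, Mode, plain and can_coalesce are
hypothetical stand-ins invented for illustration, in_register models
what use_register_for_decl would answer, and promoted/is_unsigned model
what promote_ssa_mode would report.  The alignment and canonical-type
checks from the same function are left out.

```cpp
#include <cassert>

// Hypothetical stand-ins; none of these types exist in GCC.
enum class Kind { Anonymous, Variable, Parm, Result };
enum class Mode { QI, HI, SI, DI };

struct Name {
  Kind kind;        // what the SSA name's base variable is, if any
  bool in_register; // models use_register_for_decl (name)
  Mode promoted;    // models the mode returned by promote_ssa_mode
  bool is_unsigned; // models the signedness promote_ssa_mode reports
};

// Anonymous names and plain variables share the same promotion rules,
// so the promotion test is skipped when both names are of this sort.
static bool plain(const Name &n) {
  return n.kind == Kind::Anonymous || n.kind == Kind::Variable;
}

// Sketch of the two tests visible in the hunk (assumed ordering).
static bool can_coalesce(const Name &a, const Name &b) {
  // Never mix a register candidate with a stack candidate.
  if (a.in_register != b.in_register)
    return false;
  // Neither name is a parameter or result: nothing more to compare.
  if (plain(a) && plain(b))
    return true;
  // A parameter or result is involved: promoted mode and signedness
  // must both agree.
  return a.promoted == b.promoted && a.is_unsigned == b.is_unsigned;
}
```

In this model an anonymous name and a user variable that both live in
registers coalesce whatever their modes, while a parameter only
coalesces with a name whose promoted mode and signedness both match.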

* Re: [PR64164] drop copyrename, integrate into expand
  2015-09-23 20:44                                                               ` Alexandre Oliva
  2015-09-25 11:39                                                                 ` Richard Biener
@ 2015-09-29 11:31                                                                 ` Szabolcs Nagy
  2015-10-07 22:37                                                                   ` Alexandre Oliva
  2015-11-05  5:09                                                                 ` Alexandre Oliva
  2 siblings, 1 reply; 127+ messages in thread
From: Szabolcs Nagy @ 2015-09-29 11:31 UTC (permalink / raw)
  To: Alexandre Oliva, Alan Lawrence
  Cc: Jeff Law, James Greenhalgh, H.J. Lu, Segher Boessenkool,
	Richard Biener, GCC Patches, Christophe Lyon, David Edelsohn,
	Eric Botcazou

[-- Attachment #1: Type: text/plain, Size: 90530 bytes --]

On 23/09/15 21:07, Alexandre Oliva wrote:
> On Sep 18, 2015, Alan Lawrence <alan.lawrence@arm.com> wrote:
>
>> With the latest git commit 2b27ef197ece54c4573c5a748b0d40076e35412c on
>> branch aoliva/pr64164, I am now able to build a cross toolchain for
>> aarch64 and aarch64_be, and can confirm the ABI failure is fixed on
>> the branch.
>

This commit

commit 33cc9081157a8c90460e4c0bdda2ac461a3822cc
Author: aoliva <aoliva@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   2015-09-27 09:02:00 +0000

     revert to assign_parms assignments using default defs
     ...

introduced a test failure on arm-none-eabi (using newlib, compiling
with -mthumb -march=armv8-a -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard):

FAIL: gcc.target/arm/pr43920-2.c scan-assembler-times pop 2

spawn arm-none-eabi-size pr43920-2.o
    text    data     bss     dec     hex filename
      56       0       0      56      38 pr43920-2.o
text size is 56
FAIL: gcc.target/arm/pr43920-2.c object-size text <= 54

(I haven't looked into the failure; asm output from before and after
is attached.)

> Thanks for the confirmation.  I've made one further tweak for cris and
> lm32, dropping the assert that caused build failures for libstdc++
> atomics parms that required more alignment than
> MAX_SUPPORTED_STACK_ALIGNMENT, consolidated the patchset and retested it
> with a more recent baseline (r228019), with native regstraps on
> x86_64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu,
> powerpc64le-linux-gnu, and cross toolchain builds for the following 73
> platforms: aarch64_be-elf aarch64-elf arm-eabi armeb-eabihf
> arm-symbianelf avr-elf bfin-elf c6x-elf cr16-elf cris-elf crisv32-elf
> epiphany-elf fido-elf fr30-elf frv-elf ft32-elf h8300-elf i686-elf
> ia64-elf iq2000-elf lm32-elf m32c-elf m32r-elf m32rle-elf m68k-elf
> mcore-elf mep-elf microblaze-elf mips64el-elf mips64-elf mips64orion-elf
> mips64vr-elf mipsel-elf mipsisa32-elfoabi mipsisa64-elfoabi
> mipsisa64r2el-elf mipsisa64r2-sde-elf mipsisa64sb1-elf
> mipsisa64sr71k-elf mipstx39-elf mn10300-elf moxie-elf msp430-elf
> nds32be-elf nds32le-elf nios2-elf pdp11-aout powerpc-eabialtivec
> powerpc-eabi powerpc-eabisimaltivec powerpc-eabisim powerpc-eabispe
> powerpcle-eabi powerpcle-eabisim powerpcle-elf powerpc-xilinx-eabi
> ppc64-eabi ppc-eabi ppc-elf rl78-elf rx-elf sh64-elf sh-elf
> sh-superh-elf sparc64-elf sparc-elf sparc-leon-elf spu-elf v850e-elf
> v850-elf visium-elf xstormy16-elf xtensa-elf.  Not all of them succeeded
> in building, but those that didn't failed at the very same spots before
> and after this patch.
>
>
> This patch doesn't really add much functionality.  It rather
> reimplements a lot of the ugly and fragile stuff I put in in the
> previous big patchset in a far more robust and pleasant way.  It fixes a
> number of regressions in the process, mainly because, instead of
> modifying assign_parms so as to let cfgexpand do part of its job, it
> reverts all of the RTL assignment for parameters and results to
> assign_parms.  cfgexpand now leaves the RTL assignment of partitions
> containing default defs or parms and results to assign_parms, and
> assign_parms uses a single callback, set_parm_rtl, to tell cfgexpand the
> assignment for the partition containing the default def of each
> parameter.
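
For readers following the thread, the division of labour described in the
paragraph above can be modelled roughly as follows.  This is only a toy
sketch in plain C, not GCC code: the arrays, the string "RTL" values and
every name ending in _sketch are made up for illustration.  They stand in
for SA.partition_to_pseudo, SA.partitions_for_parm_default_defs and
set_parm_rtl respectively.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_PARTITIONS 4

/* Hypothetical model: one "RTL" slot per SSA partition, kept as an
   opaque string.  NULL means nothing has been assigned yet.  */
static const char *partition_rtl[NUM_PARTITIONS];

/* Hypothetical model of the bitmap that tracks partitions holding the
   default def of a parm or result.  */
static bool holds_parm_default_def[NUM_PARTITIONS];

/* Model of the expander's loop: assign RTL to every partition except
   those reserved for parms and results.  Returns how many partitions
   were assigned here.  */
static int
expand_partitions_sketch (void)
{
  int assigned = 0;
  for (int i = 0; i < NUM_PARTITIONS; i++)
    {
      if (holds_parm_default_def[i])
        continue;               /* Left to the parm-assignment code.  */
      partition_rtl[i] = "pseudo";
      assigned++;
    }
  return assigned;
}

/* Model of the single callback: the parm-assignment code reports the
   location it chose for the partition of a parm's default def.  */
static void
set_parm_rtl_sketch (int part, const char *x)
{
  assert (part >= 0 && part < NUM_PARTITIONS);
  assert (holds_parm_default_def[part]);
  partition_rtl[part] = x;
}
```

The point of the model is only the ordering: the reserved partitions stay
unassigned after the expander's loop and are filled in exclusively through
the callback.
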
>
> This required introducing default defs for all parms and results, even
> if unused; we could refrain from creating them, and refrain from
> initializing those parameters (at least when optimizing), but that would
> require messing with the fragile bits in assign_parms again, and it
> would bring little benefit, since RTL optimization will likely notice
> the initialization is unused and drop it anyway.  Besides, adding the
> default defs was actually needed to fix a regression in the previous
> patch, and even with the current patch it helps make sure we don't
> assign more than one default def to the same SSA partition (the previous
> patch attempted to do that, but there was a bug, fixed in the current
> patch).  Having unused default defs makes it easier for us to decide
> whether to use an entry_value rtx for the initial debug insn of a parm.
> We track partitions holding default defs for parms and results with a
> bitmap; we used to have a bitmap that tracked partitions holding default
> defs, but it was unused!  I just renamed it and repurposed it.
>
> I've also added checking asserts to set_rtl, to verify that, when we
> expect a REG, we get a REG, and that it has the expected mode.  set_rtl
> was also adjusted to record anonymous SSA names or their base types in
> attrs of REGs or MEMs, respectively, so that code that relied on the
> attrs to detect properties of the decl types no longer regress just
> because we no longer generate decls for anonymous SSA names.  Since
> there were prior uses of types in MEM attrs, that was expected to go
> smoothly, but I was surprised at how smoothly adding SSA names to REG
> attrs went.  No adjustments required!
>
> I also tightened a bit the conditions for coalescing: we used to require
> the same canonical type; I've added tests for same alignment
> requirements, and for same signedness.  OTOH, I've added a few more
> coalesce candidates for RESULT_DECLs and the newly-added default defs of
> parms and results.
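
As a rough illustration of the tightened test described above (same
canonical type, same alignment, same signedness), here is a minimal sketch
in plain C.  The struct and function names are hypothetical and do not
exist in GCC; the real check in gimple_can_coalesce_p works on trees and
also consults use_register_for_decl, which this sketch leaves out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the properties of an SSA name
   that the coalescing test compares.  */
struct name_props
{
  int canonical_type_id;    /* Stands in for the canonical type.  */
  unsigned int align_bits;  /* Alignment requirement, in bits.  */
  bool is_unsigned;         /* Signedness of the type.  */
};

/* Return true if A and B remain coalescing candidates under the three
   conditions named in the text.  */
static bool
can_coalesce_sketch (const struct name_props *a, const struct name_props *b)
{
  assert (a && b);
  if (a->canonical_type_id != b->canonical_type_id)
    return false;
  if (a->align_bits != b->align_bits)
    return false;
  return a->is_unsigned == b->is_unsigned;
}
```

Under this model, two names of the same type that differ only in alignment
or only in signedness are no longer candidates.
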
>
> Other relevant changes were in mode promotion.  TYPE_MODE would often
> return BLKmode for some vector types, which was fine for some return
> decl RTL with PARALLEL, but that didn't quite work for SSA partitions.
> There were other cases of mode promotion of result decls that failed the
> asserts in set_rtl, that revealed promote_decl_mode didn't call
> promote_function_mode as expected for results.
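
The BLKmode fallback described above can be sketched as below.  This is a
simplified model in plain C, assuming a made-up three-value mode enum and a
made-up type record; it mirrors the shape of the promote_ssa_mode change in
the patch but omits the final promote_mode step and everything about
signedness.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of machine modes.  */
enum mode_sketch { BLK_MODE, SI_MODE, V4SI_MODE };

/* Hypothetical type record: REPORTED_MODE models what a TYPE_MODE-like
   query returns (possibly BLK_MODE for vectors), RAW_MODE models the
   mode stored in the type node itself.  */
struct type_sketch
{
  bool is_vector;
  enum mode_sketch reported_mode;
  enum mode_sketch raw_mode;
};

/* Pick the mode for an SSA name.  If the name belongs to a parm or
   result and the decl-based mode is usable, take it; otherwise fall
   back to the type, bypassing the reported mode when it is BLK_MODE.  */
static enum mode_sketch
ssa_mode_sketch (bool has_parm_or_result_decl, enum mode_sketch decl_mode,
                 const struct type_sketch *type)
{
  if (has_parm_or_result_decl && decl_mode != BLK_MODE)
    return decl_mode;

  enum mode_sketch mode = type->reported_mode;
  if (mode == BLK_MODE)
    {
      /* Only vector types are expected to hit this path.  */
      assert (type->is_vector);
      mode = type->raw_mode;
    }
  return mode;
}
```
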
>
> The new asserts brought additional requirements: promoting the mode of
> the RTL generated for the static chain, arranging for result decls to be
> assigned to a pseudo where it would formerly have got a BLKmode PARALLEL
> (as mentioned above), and arranging for parms set up by
> assign_parm_setup_block, that would always get a MEM, to instead get a
> REG when use_register_for_decl called for it.  In a few cases involving
> complex parms, I couldn't figure out how to avoid a temporary MEM, used
> to adjust padding of the parms.  Although undesired, this is not a
> regression: we used to use the MEM; now we just load the parms into
> (coalescible) pseudos and use those, rather than coalescing other vars
> that expected pseudos to the same MEM.
>
> Is this ok to install?
>
>
>
> revert to assign_parms assignments using default defs
>
> From: Alexandre Oliva <aoliva@redhat.com>
>
> Revert the fragile and complicated changes to assign_parms designed to
> enable it to use RTL assignments chosen by cfgexpand, and instead have
> cfgexpand use the RTL assignments by assign_parms, keying them off of
> the default defs that are now necessarily introduced for each parm and
> result.  The possible lack of a default def was already a problem, and
> the fallbacks in place were not enough, as shown by PR67312.  We now
> have checking asserts in set_rtl that verify that we're assigning to
> each var a piece of RTL that matches the expectations set forth by
> use_register_for_decl.
>
> for  gcc/ChangeLog
>
>          PR rtl-optimization/64164
>          PR tree-optimization/67312
>          PR middle-end/67340
>          PR middle-end/67490
>          PR bootstrap/67597
>          * cfgexpand.c (parm_in_stack_slot_p): Remove.
>          (ssa_default_def_partition): Remove.
>          (get_rtl_for_parm_ssa_default_def): Remove.
>          (set_rtl): Check that RTL assignments match expectations.
>          Loop on SUBREGs, CONCATs and PARALLELs subexprs.  Set only the
>          default def location for params and results.  Record SSA names
>          or types in REG and MEM attrs, respectively.
>          (set_parm_rtl): New.
>          (expand_one_ssa_partition): Drop logic that assigned MEMs with
>          unassigned addresses.
>          (adjust_one_expanded_partition_var): Don't accept NULL RTL on
>          deferred stack alloc vars.
>          (expand_used_vars): Skip partitions holding parm default defs.
>          Move adjust_one_expanded_partition_var loop...
>          (pass_expand::execute): ... here.  Drop redundant assert.
>          Adjust comments before the final loop over all ssa names.
>          Require assigned rtl of parms and results to match exactly.
>          Reset its attributes to match them, not any other variables in
>          the same partition.
>          (expand_debug_expr): Use entry value for PARM's default defs
>          only iff they have zero nondebug uses.
>          * cfgexpand.h (parm_in_stack_slot_p): Remove.
>          (get_rtl_for_parm_ssa_default_def): Remove.
>          (set_parm_rtl): Declare.
>          * doc/invoke.texi: Improve wording.
>          * explow.c (promote_decl_mode): Fix promote_function_mode for
>          result decls not by reference.
>          (promote_ssa_mode): Disregard BLKmode from promote_decl, and
>          bypass TYPE_MODE to get the actual vector mode.
>          * function.c: Include tree-dfa.h.  Revert 2015-08-14's and
>          2015-08-19's changes as follows.  Drop include of
>          basic-block.h and df.h.
>          (rtl_for_parm): Remove.
>          (maybe_reset_rtl_for_parm): Remove.
>          (parm_in_unassigned_mem_p): Remove.
>          (use_register_for_decl): Add logic for RESULT_DECLs matching
>          assign_parms' behavior.
>          (split_complex_args): Revert.
>          (assign_parms_augmented_arg_list): Revert.  Add comment
>          referencing the logic above.
>          (assign_parm_adjust_stack_rtl): Revert.
>          (assign_parm_setup_block): Revert.  Use set_parm_rtl instead
>          of SET_DECL_RTL.  Set up a REG if the parm demands so.
>          (assign_parm_setup_reg): Revert.  Consolidated SET_DECL_RTL
>          calls into a single set_parm_rtl.  Set up a temporary RTL
>          temporarily for expand_assignment.
>          (assign_parm_setup_stack): Revert.  Use set_parm_rtl.
>          (assign_parms_unsplit_complex): Revert.  Use set_parm_rtl.
>          (assign_bounds): Revert.
>          (assign_parms): Revert.  Use set_parm_rtl.
>          (allocate_struct_function): Relayout result and parms of
>          non-abstract functions.
>          (expand_function_start): Revert.  Use set_parm_rtl.  If the
>          result is not a hard reg, create a pseudo from the promoted
>          mode of the default def.  Promote static chain mode.
>          * tree-outof-ssa.c (remove_ssa_form): Drop unused
>          partition_has_default_def.  Set up
>          partitions_for_parm_default_defs.
>          (finish_out_of_ssa): Remove partition_has_default_def.
>          Release partitions_for_parm_default_defs.
>          * tree-outof-ssa.h (struct ssaexpand): Remove
>          partition_has_default_def.  Add
>          partitions_for_parm_default_defs.
>          * tree-ssa-coalesce.c: Include tree-dfa.h, tm_p.h and
>          stor-layout.h.
>          (build_ssa_conflict_graph): Fix conflict-detection of default
>          defs of even unused default defs of params and results.
>          (for_all_parms): New.
>          (create_default_def): New.
>          (register_default_def): New.
>          (coalesce_with_default): New.
>          (create_outofssa_var_map): Create default defs for all parms
>          and results, and register their partitions.  Add GIMPLE_RETURN
>          operands as coalesce candidates with results.  Add default
>          defs of each parm or result as coalesce candidates with its
>          other defs.  Mark each result def, and each default def of
>          parms, as used_in_copy.
>          (gimple_can_coalesce_p): Call it.  Call use_register_for_decl
>          with the ssa names, even anonymous ones.  Drop
>          parm_in_stack_slot_p calls.  Require same signedness and
>          alignment.
>          (coalesce_ssa_name): Add coalesce candidates for all defs of
>          each parm and result, even unused ones.
>          (parm_default_def_partition_arg): New type.
>          (set_parm_default_def_partition): New.
>          (get_parm_default_def_partitions): New.
>          * tree-ssa-coalesce.h (get_parm_default_def_partitions): New.
>          * tree-ssa-live.c (partition_view_init): Regard unused defs of
>          parms and results as used.
>          (verify_live_on_entry): Don't error out just because they're
>          not live.
>
> for  gcc/testsuite/ChangeLog
>
>          PR rtl-optimization/64164
>          PR tree-optimization/67312
>          * gcc.dg/pr67312.c: New.  From Zdenek Sojka.
>          * gcc.target/i386/stackalign/return-4.c: Add -O.
> ---
>   gcc/cfgexpand.c                                    |  332 +++++++-------
>   gcc/cfgexpand.h                                    |    3
>   gcc/doc/invoke.texi                                |    9
>   gcc/explow.c                                       |   19 +
>   gcc/function.c                                     |  477 +++++++-------------
>   gcc/testsuite/gcc.dg/pr67312.c                     |    7
>   .../gcc.target/i386/stackalign/return-4.c          |    9
>   gcc/tree-outof-ssa.c                               |   15 -
>   gcc/tree-outof-ssa.h                               |    6
>   gcc/tree-ssa-coalesce.c                            |  231 ++++++++--
>   gcc/tree-ssa-coalesce.h                            |    1
>   gcc/tree-ssa-live.c                                |   10
>   12 files changed, 582 insertions(+), 537 deletions(-)
>   create mode 100644 gcc/testsuite/gcc.dg/pr67312.c
>
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index 6c9284f..58e55d2 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -99,6 +99,8 @@ static rtx expand_debug_expr (tree);
>
>   static bool defer_stack_allocation (tree, bool);
>
> +static void record_alignment_for_reg_var (unsigned int);
> +
>   /* Return an expression tree corresponding to the RHS of GIMPLE
>      statement STMT.  */
>
> @@ -172,111 +174,86 @@ leader_merge (tree cur, tree next)
>     return cur;
>   }
>
> -/* Return true if VAR is a PARM_DECL or a RESULT_DECL that ought to be
> -   assigned to a stack slot.  We can't have expand_one_ssa_partition
> -   choose their address: the pseudo holding the address would be set
> -   up too late for assign_params to copy the parameter if needed.
> -
> -   Such parameters are likely passed as a pointer to the value, rather
> -   than as a value, and so we must not coalesce them, nor allocate
> -   stack space for them before determining the calling conventions for
> -   them.
> -
> -   For their SSA_NAMEs, expand_one_ssa_partition emits RTL as MEMs
> -   with pc_rtx as the address, and then it replaces the pc_rtx with
> -   NULL so as to make sure the MEM is not used before it is adjusted
> -   in assign_parm_setup_reg.  */
> -
> -bool
> -parm_in_stack_slot_p (tree var)
> -{
> -  if (!var || VAR_P (var))
> -    return false;
> -
> -  gcc_assert (TREE_CODE (var) == PARM_DECL
> -             || TREE_CODE (var) == RESULT_DECL);
> -
> -  return !use_register_for_decl (var);
> -}
> -
> -/* Return the partition of the default SSA_DEF for decl VAR.  */
> -
> -static int
> -ssa_default_def_partition (tree var)
> -{
> -  tree name = ssa_default_def (cfun, var);
> -
> -  if (!name)
> -    return NO_PARTITION;
> -
> -  return var_to_partition (SA.map, name);
> -}
> -
> -/* Return the RTL for the default SSA def of a PARM or RESULT, if
> -   there is one.  */
> -
> -rtx
> -get_rtl_for_parm_ssa_default_def (tree var)
> -{
> -  gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL);
> -
> -  if (!is_gimple_reg (var))
> -    return NULL_RTX;
> -
> -  /* If we've already determined RTL for the decl, use it.  This is
> -     not just an optimization: if VAR is a PARM whose incoming value
> -     is unused, we won't find a default def to use its partition, but
> -     we still want to use the location of the parm, if it was used at
> -     all.  During assign_parms, until a location is assigned for the
> -     VAR, RTL can only for a parm or result if we're not coalescing
> -     across variables, when we know we're coalescing all SSA_NAMEs of
> -     each parm or result, and we're not coalescing them with names
> -     pertaining to other variables, such as other parms' default
> -     defs.  */
> -  if (DECL_RTL_SET_P (var))
> -    {
> -      gcc_assert (DECL_RTL (var) != pc_rtx);
> -      return DECL_RTL (var);
> -    }
> -
> -  int part = ssa_default_def_partition (var);
> -  if (part == NO_PARTITION)
> -    return NULL_RTX;
> -
> -  return SA.partition_to_pseudo[part];
> -}
> -
>   /* Associate declaration T with storage space X.  If T is no
>      SSA name this is exactly SET_DECL_RTL, otherwise make the
>      partition of T associated with X.  */
>   static inline void
>   set_rtl (tree t, rtx x)
>   {
> -  if (x && SSAVAR (t))
> +  gcc_checking_assert (!x
> +                      || !(TREE_CODE (t) == SSA_NAME || is_gimple_reg (t))
> +                      || (use_register_for_decl (t)
> +                          ? (REG_P (x)
> +                             || (GET_CODE (x) == CONCAT
> +                                 && (REG_P (XEXP (x, 0))
> +                                     || SUBREG_P (XEXP (x, 0)))
> +                                 && (REG_P (XEXP (x, 1))
> +                                     || SUBREG_P (XEXP (x, 1))))
> +                             || (GET_CODE (x) == PARALLEL
> +                                 && SSAVAR (t)
> +                                 && TREE_CODE (SSAVAR (t)) == RESULT_DECL
> +                                 && !flag_tree_coalesce_vars))
> +                          : (MEM_P (x) || x == pc_rtx
> +                             || (GET_CODE (x) == CONCAT
> +                                 && MEM_P (XEXP (x, 0))
> +                                 && MEM_P (XEXP (x, 1))))));
> +  /* Check that the RTL for SSA_NAMEs and gimple-reg PARM_DECLs and
> +     RESULT_DECLs has the expected mode.  For memory, we accept
> +     unpromoted modes, since that's what we're likely to get.  For
> +     PARM_DECLs and RESULT_DECLs, we'll have been called by
> +     set_parm_rtl, which will give us the default def, so we don't
> +     have to compute it ourselves.  For RESULT_DECLs, we accept mode
> +     mismatches too, as long as we're not coalescing across variables,
> +     so that we don't reject BLKmode PARALLELs or unpromoted REGs.  */
> +  gcc_checking_assert (!x || x == pc_rtx || TREE_CODE (t) != SSA_NAME
> +                      || (SSAVAR (t) && TREE_CODE (SSAVAR (t)) == RESULT_DECL
> +                          && !flag_tree_coalesce_vars)
> +                      || !use_register_for_decl (t)
> +                      || GET_MODE (x) == promote_ssa_mode (t, NULL));
> +
> +  if (x)
>       {
>         bool skip = false;
>         tree cur = NULL_TREE;
> -
> -      if (MEM_P (x))
> -       cur = MEM_EXPR (x);
> -      else if (REG_P (x))
> -       cur = REG_EXPR (x);
> -      else if (GET_CODE (x) == CONCAT
> -              && REG_P (XEXP (x, 0)))
> -       cur = REG_EXPR (XEXP (x, 0));
> -      else if (GET_CODE (x) == PARALLEL)
> -       cur = REG_EXPR (XVECEXP (x, 0, 0));
> -      else if (x == pc_rtx)
> +      rtx xm = x;
> +
> +    retry:
> +      if (MEM_P (xm))
> +       cur = MEM_EXPR (xm);
> +      else if (REG_P (xm))
> +       cur = REG_EXPR (xm);
> +      else if (SUBREG_P (xm))
> +       {
> +         gcc_assert (subreg_lowpart_p (xm));
> +         xm = SUBREG_REG (xm);
> +         goto retry;
> +       }
> +      else if (GET_CODE (xm) == CONCAT)
> +       {
> +         xm = XEXP (xm, 0);
> +         goto retry;
> +       }
> +      else if (GET_CODE (xm) == PARALLEL)
> +       {
> +         xm = XVECEXP (xm, 0, 0);
> +         gcc_assert (GET_CODE (xm) == EXPR_LIST);
> +         xm = XEXP (xm, 0);
> +         goto retry;
> +       }
> +      else if (xm == pc_rtx)
>          skip = true;
>         else
>          gcc_unreachable ();
>
> -      tree next = skip ? cur : leader_merge (cur, SSAVAR (t));
> +      tree next = skip ? cur : leader_merge (cur, SSAVAR (t) ? SSAVAR (t) : t);
>
>         if (cur != next)
>          {
>            if (MEM_P (x))
> -           set_mem_attributes (x, next, true);
> +           set_mem_attributes (x,
> +                               next && TREE_CODE (next) == SSA_NAME
> +                               ? TREE_TYPE (next)
> +                               : next, true);
>            else
>              set_reg_attrs_for_decl_rtl (next, x);
>          }
> @@ -294,13 +271,11 @@ set_rtl (tree t, rtx x)
>          }
>         /* For the benefit of debug information at -O0 (where
>            vartracking doesn't run) record the place also in the base
> -         DECL.  For PARMs and RESULTs, we may end up resetting these
> -         in function.c:maybe_reset_rtl_for_parm, but in some rare
> -         cases we may need them (unused and overwritten incoming
> -         value, that at -O0 must share the location with the other
> -         uses in spite of the missing default def), and this may be
> -         the only chance to preserve them.  */
> -      if (x && x != pc_rtx && SSA_NAME_VAR (t))
> +         DECL.  For PARMs and RESULTs, do so only when setting the
> +         default def.  */
> +      if (x && x != pc_rtx && SSA_NAME_VAR (t)
> +         && (VAR_P (SSA_NAME_VAR (t))
> +             || SSA_NAME_IS_DEFAULT_DEF (t)))
>          {
>            tree var = SSA_NAME_VAR (t);
>            /* If we don't yet have something recorded, just record it now.  */
> @@ -1242,6 +1217,49 @@ account_stack_vars (void)
>     return size;
>   }
>
> +/* Record the RTL assignment X for the default def of PARM.  */
> +
> +extern void
> +set_parm_rtl (tree parm, rtx x)
> +{
> +  gcc_assert (TREE_CODE (parm) == PARM_DECL
> +             || TREE_CODE (parm) == RESULT_DECL);
> +
> +  if (x && !MEM_P (x))
> +    {
> +      unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (parm),
> +                                             TYPE_MODE (TREE_TYPE (parm)),
> +                                             TYPE_ALIGN (TREE_TYPE (parm)));
> +
> +      /* If the variable alignment is very large we'll dynamically
> +        allocate it, which means that in-frame portion is just a
> +        pointer.  ??? We've got a pseudo for sure here, do we
> +        actually dynamically allocate its spilling area if needed?
> +        ??? Isn't it a problem when POINTER_SIZE also exceeds
> +        MAX_SUPPORTED_STACK_ALIGNMENT, as on cris and lm32?  */
> +      if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
> +       align = POINTER_SIZE;
> +
> +      record_alignment_for_reg_var (align);
> +    }
> +
> +  if (!is_gimple_reg (parm))
> +    return set_rtl (parm, x);
> +
> +  tree ssa = ssa_default_def (cfun, parm);
> +  if (!ssa)
> +    return set_rtl (parm, x);
> +
> +  int part = var_to_partition (SA.map, ssa);
> +  gcc_assert (part != NO_PARTITION);
> +
> +  bool changed = bitmap_bit_p (SA.partitions_for_parm_default_defs, part);
> +  gcc_assert (changed);
> +
> +  set_rtl (ssa, x);
> +  gcc_assert (DECL_RTL (parm) == x);
> +}
> +
>   /* A subroutine of expand_one_var.  Called to immediately assign rtl
>      to a variable to be allocated in the stack frame.  */
>
> @@ -1349,37 +1367,7 @@ expand_one_ssa_partition (tree var)
>
>     if (!use_register_for_decl (var))
>       {
> -      /* We can't risk having the parm assigned to a MEM location
> -        whose address references a pseudo, for the pseudo will only
> -        be set up after arguments are copied to the stack slot.
> -
> -        If the parm doesn't have a default def (e.g., because its
> -        incoming value is unused), then we want to let assign_params
> -        do the allocation, too.  In this case we want to make sure
> -        SSA_NAMEs associated with the parm don't get assigned to more
> -        than one partition, lest we'd create two unassigned stac
> -        slots for the same parm, thus the assert at the end of the
> -        block.  */
> -      if (parm_in_stack_slot_p (SSA_NAME_VAR (var))
> -         && (ssa_default_def_partition (SSA_NAME_VAR (var)) == part
> -             || !ssa_default_def (cfun, SSA_NAME_VAR (var))))
> -       {
> -         expand_one_stack_var_at (var, pc_rtx, 0, 0);
> -         rtx x = SA.partition_to_pseudo[part];
> -         gcc_assert (GET_CODE (x) == MEM);
> -         gcc_assert (XEXP (x, 0) == pc_rtx);
> -         /* Reset the address, so that any attempt to use it will
> -            ICE.  It will be adjusted in assign_parm_setup_reg.  */
> -         XEXP (x, 0) = NULL_RTX;
> -         /* If the RTL associated with the parm is not what we have
> -            just created, the parm has been split over multiple
> -            partitions.  In order for this to work, we must have a
> -            default def for the parm, otherwise assign_params won't
> -            know what to do.  */
> -         gcc_assert (DECL_RTL_IF_SET (SSA_NAME_VAR (var)) == x
> -                     || ssa_default_def (cfun, SSA_NAME_VAR (var)));
> -       }
> -      else if (defer_stack_allocation (var, true))
> +      if (defer_stack_allocation (var, true))
>          add_stack_var (var);
>         else
>          expand_one_stack_var_1 (var);
> @@ -1393,8 +1381,8 @@ expand_one_ssa_partition (tree var)
>     set_rtl (var, x);
>   }
>
> -/* Record the association between the RTL generated for a partition
> -   and the underlying variable of the SSA_NAME.  */
> +/* Record the association between the RTL generated for partition PART
> +   and the underlying variable of the SSA_NAME VAR.  */
>
>   static void
>   adjust_one_expanded_partition_var (tree var)
> @@ -1410,12 +1398,7 @@ adjust_one_expanded_partition_var (tree var)
>
>     rtx x = SA.partition_to_pseudo[part];
>
> -  if (!x)
> -    {
> -      /* This var will get a stack slot later.  */
> -      gcc_assert (defer_stack_allocation (var, true));
> -      return;
> -    }
> +  gcc_assert (x);
>
>     set_rtl (var, x);
>
> @@ -2040,6 +2023,9 @@ expand_used_vars (void)
>
>     for (i = 0; i < SA.map->num_partitions; i++)
>       {
> +      if (bitmap_bit_p (SA.partitions_for_parm_default_defs, i))
> +       continue;
> +
>         tree var = partition_to_var (SA.map, i);
>
>         gcc_assert (!virtual_operand_p (var));
> @@ -2047,9 +2033,6 @@ expand_used_vars (void)
>         expand_one_ssa_partition (var);
>       }
>
> -  for (i = 1; i < num_ssa_names; i++)
> -    adjust_one_expanded_partition_var (ssa_name (i));
> -
>     if (flag_stack_protect == SPCT_FLAG_STRONG)
>         gen_stack_protect_signal
>          = stack_protect_decl_p () || stack_protect_return_slot_p ();
> @@ -4947,26 +4930,27 @@ expand_debug_expr (tree exp)
>            }
>          else
>            {
> +           /* If this is a reference to an incoming value of
> +              parameter that is never used in the code or where the
> +              incoming value is never used in the code, use
> +              PARM_DECL's DECL_RTL if set.  */
> +           if (SSA_NAME_IS_DEFAULT_DEF (exp)
> +               && SSA_NAME_VAR (exp)
> +               && TREE_CODE (SSA_NAME_VAR (exp)) == PARM_DECL
> +               && has_zero_uses (exp))
> +             {
> +               op0 = expand_debug_parm_decl (SSA_NAME_VAR (exp));
> +               if (op0)
> +                 goto adjust_mode;
> +               op0 = expand_debug_expr (SSA_NAME_VAR (exp));
> +               if (op0)
> +                 goto adjust_mode;
> +             }
> +
>              int part = var_to_partition (SA.map, exp);
>
>              if (part == NO_PARTITION)
> -             {
> -               /* If this is a reference to an incoming value of parameter
> -                  that is never used in the code or where the incoming
> -                  value is never used in the code, use PARM_DECL's
> -                  DECL_RTL if set.  */
> -               if (SSA_NAME_IS_DEFAULT_DEF (exp)
> -                   && TREE_CODE (SSA_NAME_VAR (exp)) == PARM_DECL)
> -                 {
> -                   op0 = expand_debug_parm_decl (SSA_NAME_VAR (exp));
> -                   if (op0)
> -                     goto adjust_mode;
> -                   op0 = expand_debug_expr (SSA_NAME_VAR (exp));
> -                   if (op0)
> -                     goto adjust_mode;
> -                 }
> -               return NULL;
> -             }
> +             return NULL;
>
>              gcc_assert (part >= 0 && (unsigned)part < SA.map->num_partitions);
>
> @@ -6216,9 +6200,26 @@ pass_expand::execute (function *fun)
>         parm_birth_insn = var_seq;
>       }
>
> -  /* If we have a class containing differently aligned pointers
> -     we need to merge those into the corresponding RTL pointer
> -     alignment.  */
> +  /* Now propagate the RTL assignment of each partition to the
> +     underlying var of each SSA_NAME.  */
> +  for (i = 1; i < num_ssa_names; i++)
> +    {
> +      tree name = ssa_name (i);
> +
> +      if (!name
> +         /* We might have generated new SSA names in
> +            update_alias_info_with_stack_vars.  They will have a NULL
> +            defining statements, and won't be part of the partitioning,
> +            so ignore those.  */
> +         || !SSA_NAME_DEF_STMT (name))
> +       continue;
> +
> +      adjust_one_expanded_partition_var (name);
> +    }
> +
> +  /* Clean up RTL of variables that straddle across multiple
> +     partitions, and check that the rtl of any PARM_DECLs that are not
> +     cleaned up is that of their default defs.  */
>     for (i = 1; i < num_ssa_names; i++)
>       {
>         tree name = ssa_name (i);
> @@ -6235,9 +6236,6 @@ pass_expand::execute (function *fun)
>         if (part == NO_PARTITION)
>          continue;
>
> -      gcc_assert (SA.partition_to_pseudo[part]
> -                 || defer_stack_allocation (name, true));
> -
>         /* If this decl was marked as living in multiple places, reset
>           this now to NULL.  */
>         tree var = SSA_NAME_VAR (name);
> @@ -6252,7 +6250,19 @@ pass_expand::execute (function *fun)
>            rtx in = DECL_RTL_IF_SET (var);
>            gcc_assert (in);
>            rtx out = SA.partition_to_pseudo[part];
> -         gcc_assert (in == out || rtx_equal_p (in, out));
> +         gcc_assert (in == out);
> +
> +         /* Now reset VAR's RTL to IN, so that the _EXPR attrs match
> +            those expected by debug backends for each parm and for
> +            the result.  This is particularly important for stabs,
> +            whose register elimination from parm's DECL_RTL may cause
> +            -fcompare-debug differences as SET_DECL_RTL changes reg's
> +            attrs.  So, make sure the RTL already has the parm as the
> +            EXPR, so that it won't change.  */
> +         SET_DECL_RTL (var, NULL_RTX);
> +         if (MEM_P (in))
> +           set_mem_attributes (in, var, true);
> +         SET_DECL_RTL (var, in);
>          }
>       }
>
> diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
> index ff7f4bef..8852411 100644
> --- a/gcc/cfgexpand.h
> +++ b/gcc/cfgexpand.h
> @@ -22,8 +22,7 @@ along with GCC; see the file COPYING3.  If not see
>
>   extern tree gimple_assign_rhs_to_tree (gimple *);
>   extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
> -extern bool parm_in_stack_slot_p (tree);
> -extern rtx get_rtl_for_parm_ssa_default_def (tree var);
> +extern void set_parm_rtl (tree, rtx);
>
>
>   #endif /* GCC_CFGEXPAND_H */
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 09c58ee..aefb061 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -8866,12 +8866,13 @@ profitable to parallelize the loops.
>
>   @item -ftree-coalesce-vars
>   @opindex ftree-coalesce-vars
> -Tell the compiler to attempt to combine small user-defined variables
> -too, instead of just compiler temporaries.  This may severely limit the
> -ability to debug an optimized program compiled with
> +While transforming the program out of the SSA representation, attempt to
> +reduce copying by coalescing versions of different user-defined
> +variables, instead of just compiler temporaries.  This may severely
> +limit the ability to debug an optimized program compiled with
>   @option{-fno-var-tracking-assignments}.  In the negated form, this flag
>   prevents SSA coalescing of user variables.  This option is enabled by
> -default if optimization is enabled.
> +default if optimization is enabled, and it does very little otherwise.
>
>   @item -ftree-loop-if-convert
>   @opindex ftree-loop-if-convert
> diff --git a/gcc/explow.c b/gcc/explow.c
> index 6941f4e..d104a79 100644
> --- a/gcc/explow.c
> +++ b/gcc/explow.c
> @@ -830,8 +830,10 @@ promote_decl_mode (const_tree decl, int *punsignedp)
>     machine_mode mode = DECL_MODE (decl);
>     machine_mode pmode;
>
> -  if (TREE_CODE (decl) == RESULT_DECL
> -      || TREE_CODE (decl) == PARM_DECL)
> +  if (TREE_CODE (decl) == RESULT_DECL && !DECL_BY_REFERENCE (decl))
> +    pmode = promote_function_mode (type, mode, &unsignedp,
> +                                   TREE_TYPE (current_function_decl), 1);
> +  else if (TREE_CODE (decl) == RESULT_DECL || TREE_CODE (decl) == PARM_DECL)
>       pmode = promote_function_mode (type, mode, &unsignedp,
>                                      TREE_TYPE (current_function_decl), 2);
>     else
> @@ -857,12 +859,23 @@ promote_ssa_mode (const_tree name, int *punsignedp)
>     if (SSA_NAME_VAR (name)
>         && (TREE_CODE (SSA_NAME_VAR (name)) == PARM_DECL
>            || TREE_CODE (SSA_NAME_VAR (name)) == RESULT_DECL))
> -    return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
> +    {
> +      machine_mode mode = promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
> +      if (mode != BLKmode)
> +       return mode;
> +    }
>
>     tree type = TREE_TYPE (name);
>     int unsignedp = TYPE_UNSIGNED (type);
>     machine_mode mode = TYPE_MODE (type);
>
> +  /* Bypass TYPE_MODE when it maps vector modes to BLKmode.  */
> +  if (mode == BLKmode)
> +    {
> +      gcc_assert (VECTOR_TYPE_P (type));
> +      mode = type->type_common.mode;
> +    }
> +
>     machine_mode pmode = promote_mode (type, mode, &unsignedp);
>     if (punsignedp)
>       *punsignedp = unsignedp;
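
To make the new control flow in promote_ssa_mode easier to follow, here is a minimal standalone sketch of the mode selection only. It is a model, not GCC code: the toy_* names, the enum values and the struct are hypothetical stand-ins for the tree and machine_mode queries in the hunk above, and the final promote_mode call is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a few machine_mode values.  */
enum toy_mode { TOY_BLKmode, TOY_SImode, TOY_V4SImode };

/* Hypothetical summary of what the hunk asks about an SSA name.  */
struct toy_ssa_name
{
  bool has_parm_or_result_var; /* SSA_NAME_VAR is a PARM_DECL or RESULT_DECL.  */
  enum toy_mode decl_mode;     /* Assumed result of promote_decl_mode.  */
  bool vector_type;            /* VECTOR_TYPE_P (TREE_TYPE (name)).  */
  enum toy_mode type_mode;     /* TYPE_MODE (type); may be BLKmode.  */
  enum toy_mode raw_type_mode; /* type->type_common.mode.  */
};

/* Return either the promoted decl mode (first branch) or the mode that
   would be passed on to promote_mode (fallback path).  */
static enum toy_mode
toy_pick_ssa_mode (const struct toy_ssa_name *name)
{
  /* Parm and result names use the decl's promoted mode, unless that
     turned out to be BLKmode.  */
  if (name->has_parm_or_result_var && name->decl_mode != TOY_BLKmode)
    return name->decl_mode;

  enum toy_mode mode = name->type_mode;

  /* Bypass TYPE_MODE when it maps a vector mode to BLKmode.  */
  if (mode == TOY_BLKmode)
    {
      assert (name->vector_type);
      mode = name->raw_type_mode;
    }

  return mode;
}
```

In this model, a parm whose decl mode is BLKmode but whose type is a vector falls through to the raw mode stored in the type instead of returning BLKmode.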
> diff --git a/gcc/function.c b/gcc/function.c
> index 9b4c2b9..21304689 100644
> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -74,8 +74,6 @@ along with GCC; see the file COPYING3.  If not see
>   #include "cfgbuild.h"
>   #include "cfgcleanup.h"
>   #include "cfgexpand.h"
> -#include "basic-block.h"
> -#include "df.h"
>   #include "params.h"
>   #include "bb-reorder.h"
>   #include "shrink-wrap.h"
> @@ -83,6 +81,7 @@ along with GCC; see the file COPYING3.  If not see
>   #include "rtl-iter.h"
>   #include "tree-chkp.h"
>   #include "rtl-chkp.h"
> +#include "tree-dfa.h"
>
>   /* So we can assign to cfun in this file.  */
>   #undef cfun
> @@ -152,9 +151,6 @@ static bool contains (const_rtx, hash_table<insn_cache_hasher> *);
>   static void prepare_function_start (void);
>   static void do_clobber_return_reg (rtx, void *);
>   static void do_use_return_reg (rtx, void *);
> -static rtx rtl_for_parm (struct assign_parm_data_all *, tree);
> -static void maybe_reset_rtl_for_parm (tree);
> -static bool parm_in_unassigned_mem_p (tree, rtx);
>
>
>   /* Stack of nested functions.  */
> @@ -2145,6 +2141,47 @@ use_register_for_decl (const_tree decl)
>     if (TREE_ADDRESSABLE (decl))
>       return false;
>
> +  /* RESULT_DECLs are a bit special in that they're assigned without
> +     regard to use_register_for_decl, but we generally only store in
> +     them.  If we coalesce their SSA NAMEs, we'd better return a
> +     result that matches the assignment in expand_function_start.  */
> +  if (TREE_CODE (decl) == RESULT_DECL)
> +    {
> +      /* If it's not an aggregate, we're going to use a REG or a
> +        PARALLEL containing a REG.  */
> +      if (!aggregate_value_p (decl, current_function_decl))
> +       return true;
> +
> +      /* If expand_function_start determines the return value, we'll
> +        use MEM if it's not by reference.  */
> +      if (cfun->returns_pcc_struct
> +         || (targetm.calls.struct_value_rtx
> +             (TREE_TYPE (current_function_decl), 1)))
> +       return DECL_BY_REFERENCE (decl);
> +
> +      /* Otherwise, we're taking an extra all.function_result_decl
> +        argument.  It's set up in assign_parms_augmented_arg_list,
> +        under the (negated) conditions above, and then it's used to
> +        set up the RESULT_DECL rtl in assign_parms, after looping
> +        over all parameters.  Now, if the RESULT_DECL is not by
> +        reference, we'll use a MEM either way.  */
> +      if (!DECL_BY_REFERENCE (decl))
> +       return false;
> +
> +      /* Otherwise, if RESULT_DECL is DECL_BY_REFERENCE, it will take
> +        the function_result_decl's assignment.  Since it's a pointer,
> +        we can short-circuit a number of the tests below, and we must
> +        duplicate them because we don't have the
> +        function_result_decl to test.  */
> +      if (!targetm.calls.allocate_stack_slots_for_args ())
> +       return true;
> +      /* We don't set DECL_IGNORED_P for the function_result_decl.  */
> +      if (optimize)
> +       return true;
> +      /* We don't set DECL_REGISTER for the function_result_decl.  */
> +      return false;
> +    }
> +
>     /* Decl is implicitly addressible by bound stores and loads
>        if it is an aggregate holding bounds.  */
>     if (chkp_function_instrumented_p (current_function_decl)
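
The RESULT_DECL block added to use_register_for_decl is a cascade of early returns. A minimal sketch of that cascade as a pure function may help when checking it against expand_function_start. Everything named toy_* is hypothetical: each field stands for one query from the hunk, and the TREE_ADDRESSABLE test that precedes the block is not modelled.

```c
#include <stdbool.h>

/* Hypothetical inputs, one per query made in the RESULT_DECL block.  */
struct toy_result_ctx
{
  bool aggregate_value;      /* aggregate_value_p (decl, current_function_decl) */
  bool struct_value_known;   /* cfun->returns_pcc_struct, or struct_value_rtx
                                returned non-null */
  bool by_reference;         /* DECL_BY_REFERENCE (decl) */
  bool stack_slots_for_args; /* targetm.calls.allocate_stack_slots_for_args () */
  bool optimizing;           /* optimize != 0 */
};

/* Mirror the order of the tests in the hunk: true means "use a
   register", false means "use a MEM".  */
static bool
toy_result_uses_register (const struct toy_result_ctx *c)
{
  /* Not an aggregate: a REG, or a PARALLEL containing a REG.  */
  if (!c->aggregate_value)
    return true;

  /* expand_function_start determines the return value: MEM unless the
     result is by reference.  */
  if (c->struct_value_known)
    return c->by_reference;

  /* Extra function_result_decl argument, result not by reference:
     MEM either way.  */
  if (!c->by_reference)
    return false;

  /* By reference: follow the pointer argument's assignment.  */
  if (!c->stack_slots_for_args)
    return true;
  return c->optimizing;
}
```

With these inputs, a by-reference aggregate result at -O0 on a target that allocates stack slots for args is the only by-reference case that ends up in memory.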
> @@ -2272,7 +2309,7 @@ assign_parms_initialize_all (struct assign_parm_data_all *all)
>      needed, else the old list.  */
>
>   static void
> -split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
> +split_complex_args (vec<tree> *args)
>   {
>     unsigned i;
>     tree p;
> @@ -2283,7 +2320,6 @@ split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
>         if (TREE_CODE (type) == COMPLEX_TYPE
>            && targetm.calls.split_complex_arg (type))
>          {
> -         tree cparm = p;
>            tree decl;
>            tree subtype = TREE_TYPE (type);
>            bool addressable = TREE_ADDRESSABLE (p);
> @@ -2302,9 +2338,6 @@ split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
>            DECL_ARTIFICIAL (p) = addressable;
>            DECL_IGNORED_P (p) = addressable;
>            TREE_ADDRESSABLE (p) = 0;
> -         /* Reset the RTL before layout_decl, or it may change the
> -            mode of the RTL of the original argument copied to P.  */
> -         SET_DECL_RTL (p, NULL_RTX);
>            layout_decl (p, 0);
>            (*args)[i] = p;
>
> @@ -2316,41 +2349,6 @@ split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
>            DECL_IGNORED_P (decl) = addressable;
>            layout_decl (decl, 0);
>            args->safe_insert (++i, decl);
> -
> -         /* If we are expanding a function, rather than gimplifying
> -            it, propagate the RTL of the complex parm to the split
> -            declarations, and set their contexts so that
> -            maybe_reset_rtl_for_parm can recognize them and refrain
> -            from resetting their RTL.  */
> -         if (currently_expanding_to_rtl)
> -           {
> -             maybe_reset_rtl_for_parm (cparm);
> -             rtx rtl = rtl_for_parm (all, cparm);
> -             if (rtl)
> -               {
> -                 /* If this is parm is unassigned, assign it now: the
> -                    newly-created decls wouldn't expect the need for
> -                    assignment, and if they were assigned
> -                    independently, they might not end up in adjacent
> -                    slots, so unsplit wouldn't be able to fill in the
> -                    unassigned address of the complex MEM.  */
> -                 if (parm_in_unassigned_mem_p (cparm, rtl))
> -                   {
> -                     int align = STACK_SLOT_ALIGNMENT
> -                       (TREE_TYPE (cparm), GET_MODE (rtl), MEM_ALIGN (rtl));
> -                     rtx loc = assign_stack_local
> -                       (GET_MODE (rtl), GET_MODE_SIZE (GET_MODE (rtl)),
> -                        align);
> -                     XEXP (rtl, 0) = XEXP (loc, 0);
> -                   }
> -
> -                 SET_DECL_RTL (p, read_complex_part (rtl, false));
> -                 SET_DECL_RTL (decl, read_complex_part (rtl, true));
> -
> -                 DECL_CONTEXT (p) = cparm;
> -                 DECL_CONTEXT (decl) = cparm;
> -               }
> -           }
>          }
>       }
>   }
> @@ -2386,6 +2384,9 @@ assign_parms_augmented_arg_list (struct assign_parm_data_all *all)
>         DECL_ARTIFICIAL (decl) = 1;
>         DECL_NAMELESS (decl) = 1;
>         TREE_CONSTANT (decl) = 1;
> +      /* We don't set DECL_IGNORED_P or DECL_REGISTER here.  If this
> +        changes, the end of the RESULT_DECL handling block in
> +        use_register_for_decl must be adjusted to match.  */
>
>         DECL_CHAIN (decl) = all->orig_fnargs;
>         all->orig_fnargs = decl;
> @@ -2413,7 +2414,7 @@ assign_parms_augmented_arg_list (struct assign_parm_data_all *all)
>
>     /* If the target wants to split complex arguments into scalars, do so.  */
>     if (targetm.calls.split_complex_arg)
> -    split_complex_args (all, &fnargs);
> +    split_complex_args (&fnargs);
>
>     return fnargs;
>   }
> @@ -2816,98 +2817,23 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data)
>     data->entry_parm = entry_parm;
>   }
>
> -/* Wrapper for use_register_for_decl, that special-cases the
> -   .result_ptr as the function's RESULT_DECL when the RESULT_DECL is
> -   passed by reference.  */
> -
> -static bool
> -use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm)
> -{
> -  if (parm == all->function_result_decl)
> -    {
> -      tree result = DECL_RESULT (current_function_decl);
> -
> -      if (DECL_BY_REFERENCE (result))
> -       parm = result;
> -    }
> -
> -  return use_register_for_decl (parm);
> -}
> -
> -/* Wrapper for get_rtl_for_parm_ssa_default_def, that special-cases
> -   the .result_ptr as the function's RESULT_DECL when the RESULT_DECL
> -   is passed by reference.  */
> -
> -static rtx
> -rtl_for_parm (struct assign_parm_data_all *all, tree parm)
> -{
> -  if (parm == all->function_result_decl)
> -    {
> -      tree result = DECL_RESULT (current_function_decl);
> -
> -      if (!DECL_BY_REFERENCE (result))
> -       return NULL_RTX;
> -
> -      parm = result;
> -    }
> -
> -  return get_rtl_for_parm_ssa_default_def (parm);
> -}
> -
> -/* Reset the location of PARM_DECLs and RESULT_DECLs that had
> -   SSA_NAMEs in multiple partitions, so that assign_parms will choose
> -   the default def, if it exists, or create new RTL to hold the unused
> -   entry value.  If we are coalescing across variables, we want to
> -   reset the location too, because a parm without a default def
> -   (incoming value unused) might be coalesced with one with a default
> -   def, and then assign_parms would copy both incoming values to the
> -   same location, which might cause the wrong value to survive.  */
> -static void
> -maybe_reset_rtl_for_parm (tree parm)
> -{
> -  gcc_assert (TREE_CODE (parm) == PARM_DECL
> -             || TREE_CODE (parm) == RESULT_DECL);
> -
> -  /* This is a split complex parameter, and its context was set to its
> -     original PARM_DECL in split_complex_args so that we could
> -     recognize it here and not reset its RTL.  */
> -  if (DECL_CONTEXT (parm) && TREE_CODE (DECL_CONTEXT (parm)) == PARM_DECL)
> -    {
> -      DECL_CONTEXT (parm) = DECL_CONTEXT (DECL_CONTEXT (parm));
> -      return;
> -    }
> -
> -  if ((flag_tree_coalesce_vars
> -       || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx))
> -      && is_gimple_reg (parm))
> -    SET_DECL_RTL (parm, NULL_RTX);
> -}
> -
>   /* A subroutine of assign_parms.  Adjust DATA->STACK_RTL such that it's
>      always valid and properly aligned.  */
>
>   static void
> -assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm,
> -                             struct assign_parm_data_one *data)
> +assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data)
>   {
>     rtx stack_parm = data->stack_parm;
>
> -  /* If out-of-SSA assigned RTL to the parm default def, make sure we
> -     don't use what we might have computed before.  */
> -  rtx ssa_assigned = rtl_for_parm (all, parm);
> -  if (ssa_assigned)
> -    stack_parm = NULL;
> -
>     /* If we can't trust the parm stack slot to be aligned enough for its
>        ultimate type, don't use that slot after entry.  We'll make another
>        stack slot, if we need one.  */
> -  else if (stack_parm
> -          && ((STRICT_ALIGNMENT
> -               && (GET_MODE_ALIGNMENT (data->nominal_mode)
> -                   > MEM_ALIGN (stack_parm)))
> -              || (data->nominal_type
> -                  && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
> -                  && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
> +  if (stack_parm
> +      && ((STRICT_ALIGNMENT
> +          && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm))
> +         || (data->nominal_type
> +             && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
> +             && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
>       stack_parm = NULL;
>
>     /* If parm was passed in memory, and we need to convert it on entry,
> @@ -2952,27 +2878,6 @@ assign_parm_setup_block_p (struct assign_parm_data_one *data)
>     return false;
>   }
>
> -/* Return true if FROM_EXPAND is a MEM with an address to be filled in
> -   by assign_params.  This should be the case if, and only if,
> -   parm_in_stack_slot_p holds for the parm DECL that expanded to
> -   FROM_EXPAND, so we check that, too.  */
> -
> -static bool
> -parm_in_unassigned_mem_p (tree decl, rtx from_expand)
> -{
> -  bool result = MEM_P (from_expand) && !XEXP (from_expand, 0);
> -
> -  gcc_assert (result == parm_in_stack_slot_p (decl)
> -             /* Maybe it was already assigned.  That's ok, especially
> -                for split complex args.  */
> -             || (!result && MEM_P (from_expand)
> -                 && (XEXP (from_expand, 0) == virtual_stack_vars_rtx
> -                     || (GET_CODE (XEXP (from_expand, 0)) == PLUS
> -                         && XEXP (XEXP (from_expand, 0), 0) == virtual_stack_vars_rtx))));
> -
> -  return result;
> -}
> -
>   /* A subroutine of assign_parms.  Arrange for the parameter to be
>      present and valid in DATA->STACK_RTL.  */
>
> @@ -2982,38 +2887,39 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>   {
>     rtx entry_parm = data->entry_parm;
>     rtx stack_parm = data->stack_parm;
> +  rtx target_reg = NULL_RTX;
>     HOST_WIDE_INT size;
>     HOST_WIDE_INT size_stored;
>
>     if (GET_CODE (entry_parm) == PARALLEL)
>       entry_parm = emit_group_move_into_temps (entry_parm);
>
> +  /* If we want the parameter in a pseudo, don't use a stack slot.  */
> +  if (is_gimple_reg (parm) && use_register_for_decl (parm))
> +    {
> +      tree def = ssa_default_def (cfun, parm);
> +      gcc_assert (def);
> +      machine_mode mode = promote_ssa_mode (def, NULL);
> +      rtx reg = gen_reg_rtx (mode);
> +      if (GET_CODE (reg) != CONCAT)
> +       stack_parm = reg;
> +      else
> +       /* This will use or allocate a stack slot that we'd rather
> +          avoid.  FIXME: Could we avoid it in more cases?  */
> +       target_reg = reg;
> +      data->stack_parm = NULL;
> +    }
> +
>     size = int_size_in_bytes (data->passed_type);
>     size_stored = CEIL_ROUND (size, UNITS_PER_WORD);
> -
>     if (stack_parm == 0)
>       {
>         DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
> -      rtx from_expand = rtl_for_parm (all, parm);
> -      if (from_expand && !parm_in_unassigned_mem_p (parm, from_expand))
> -       stack_parm = copy_rtx (from_expand);
> -      else
> -       {
> -         stack_parm = assign_stack_local (BLKmode, size_stored,
> -                                          DECL_ALIGN (parm));
> -         if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
> -           PUT_MODE (stack_parm, GET_MODE (entry_parm));
> -         if (from_expand)
> -           {
> -             gcc_assert (GET_CODE (stack_parm) == MEM);
> -             gcc_assert (parm_in_unassigned_mem_p (parm, from_expand));
> -             XEXP (from_expand, 0) = XEXP (stack_parm, 0);
> -             PUT_MODE (from_expand, GET_MODE (stack_parm));
> -             stack_parm = copy_rtx (from_expand);
> -           }
> -         else
> -           set_mem_attributes (stack_parm, parm, 1);
> -       }
> +      stack_parm = assign_stack_local (BLKmode, size_stored,
> +                                      DECL_ALIGN (parm));
> +      if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
> +       PUT_MODE (stack_parm, GET_MODE (entry_parm));
> +      set_mem_attributes (stack_parm, parm, 1);
>       }
>
>     /* If a BLKmode arrives in registers, copy it to a stack slot.  Handle
> @@ -3054,11 +2960,6 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>         else if (size == 0)
>          ;
>
> -      /* MEM may be a REG if coalescing assigns the param's partition
> -        to a pseudo.  */
> -      else if (REG_P (mem))
> -       emit_move_insn (mem, entry_parm);
> -
>         /* If SIZE is that of a mode no bigger than a word, just use
>           that mode's store operation.  */
>         else if (size <= UNITS_PER_WORD)
> @@ -3113,10 +3014,14 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>                tem = change_address (mem, word_mode, 0);
>                emit_move_insn (tem, x);
>              }
> +         else if (!MEM_P (mem))
> +           emit_move_insn (mem, entry_parm);
>            else
>              move_block_from_reg (REGNO (entry_parm), mem,
>                                   size_stored / UNITS_PER_WORD);
>          }
> +      else if (!MEM_P (mem))
> +       emit_move_insn (mem, entry_parm);
>         else
>          move_block_from_reg (REGNO (entry_parm), mem,
>                               size_stored / UNITS_PER_WORD);
> @@ -3131,8 +3036,14 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>         end_sequence ();
>       }
>
> +  if (target_reg)
> +    {
> +      emit_move_insn (target_reg, stack_parm);
> +      stack_parm = target_reg;
> +    }
> +
>     data->stack_parm = stack_parm;
> -  SET_DECL_RTL (parm, stack_parm);
> +  set_parm_rtl (parm, stack_parm);
>   }
>
>   /* A subroutine of assign_parms.  Allocate a pseudo to hold the current
> @@ -3148,6 +3059,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>     int unsignedp = TYPE_UNSIGNED (TREE_TYPE (parm));
>     bool did_conversion = false;
>     bool need_conversion, moved;
> +  rtx rtl;
>
>     /* Store the parm in a pseudoregister during the function, but we may
>        need to do it in a wider mode.  Using 2 here makes the result
> @@ -3156,40 +3068,19 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>       = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
>                               TREE_TYPE (current_function_decl), 2);
>
> -  rtx from_expand = parmreg = rtl_for_parm (all, parm);
> -
> -  if (from_expand && !data->passed_pointer)
> -    {
> -      if (GET_MODE (parmreg) != promoted_nominal_mode)
> -       parmreg = gen_lowpart (promoted_nominal_mode, parmreg);
> -    }
> -  else if (!from_expand || parm_in_unassigned_mem_p (parm, from_expand))
> -    {
> -      parmreg = gen_reg_rtx (promoted_nominal_mode);
> -      if (!DECL_ARTIFICIAL (parm))
> -       mark_user_reg (parmreg);
> -
> -      if (from_expand)
> -       {
> -         gcc_assert (data->passed_pointer);
> -         gcc_assert (GET_CODE (from_expand) == MEM
> -                     && XEXP (from_expand, 0) == NULL_RTX);
> -         XEXP (from_expand, 0) = parmreg;
> -       }
> -    }
> +  parmreg = gen_reg_rtx (promoted_nominal_mode);
> +  if (!DECL_ARTIFICIAL (parm))
> +    mark_user_reg (parmreg);
>
>     /* If this was an item that we received a pointer to,
> -     set DECL_RTL appropriately.  */
> -  if (from_expand)
> -    SET_DECL_RTL (parm, from_expand);
> -  else if (data->passed_pointer)
> +     set rtl appropriately.  */
> +  if (data->passed_pointer)
>       {
> -      rtx x = gen_rtx_MEM (TYPE_MODE (TREE_TYPE (data->passed_type)), parmreg);
> -      set_mem_attributes (x, parm, 1);
> -      SET_DECL_RTL (parm, x);
> +      rtl = gen_rtx_MEM (TYPE_MODE (TREE_TYPE (data->passed_type)), parmreg);
> +      set_mem_attributes (rtl, parm, 1);
>       }
>     else
> -    SET_DECL_RTL (parm, parmreg);
> +    rtl = parmreg;
>
>     assign_parm_remove_parallels (data);
>
> @@ -3197,13 +3088,10 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>        assign_parm_find_data_types and expand_expr_real_1.  */
>
>     equiv_stack_parm = data->stack_parm;
> -  if (!equiv_stack_parm)
> -    equiv_stack_parm = data->entry_parm;
>     validated_mem = validize_mem (copy_rtx (data->entry_parm));
>
>     need_conversion = (data->nominal_mode != data->passed_mode
>                       || promoted_nominal_mode != data->promoted_mode);
> -  gcc_assert (!(need_conversion && data->passed_pointer && from_expand));
>     moved = false;
>
>     if (need_conversion
> @@ -3327,7 +3215,9 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>
>         /* TREE_USED gets set erroneously during expand_assignment.  */
>         save_tree_used = TREE_USED (parm);
> +      SET_DECL_RTL (parm, rtl);
>         expand_assignment (parm, make_tree (data->nominal_type, tempreg), false);
> +      SET_DECL_RTL (parm, NULL_RTX);
>         TREE_USED (parm) = save_tree_used;
>         all->first_conversion_insn = get_insns ();
>         all->last_conversion_insn = get_last_insn ();
> @@ -3335,28 +3225,16 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>
>         did_conversion = true;
>       }
> -  /* We don't want to copy the incoming pointer to a parmreg expected
> -     to hold the value rather than the pointer.  */
> -  else if (!data->passed_pointer || parmreg != from_expand)
> +  else
>       emit_move_insn (parmreg, validated_mem);
>
>     /* If we were passed a pointer but the actual value can safely live
>        in a register, retrieve it and use it directly.  */
> -  if (data->passed_pointer
> -      && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode))
> +  if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode)
>       {
> -      rtx src = DECL_RTL (parm);
> -
>         /* We can't use nominal_mode, because it will have been set to
>           Pmode above.  We must use the actual mode of the parm.  */
> -      if (from_expand)
> -       {
> -         parmreg = from_expand;
> -         gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm)));
> -         src = gen_rtx_MEM (GET_MODE (parmreg), validated_mem);
> -         set_mem_attributes (src, parm, 1);
> -       }
> -      else if (use_register_for_decl (parm))
> +      if (use_register_for_decl (parm))
>          {
>            parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm)));
>            mark_user_reg (parmreg);
> @@ -3373,14 +3251,14 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>            set_mem_attributes (parmreg, parm, 1);
>          }
>
> -      if (GET_MODE (parmreg) != GET_MODE (src))
> +      if (GET_MODE (parmreg) != GET_MODE (rtl))
>          {
> -         rtx tempreg = gen_reg_rtx (GET_MODE (src));
> +         rtx tempreg = gen_reg_rtx (GET_MODE (rtl));
>            int unsigned_p = TYPE_UNSIGNED (TREE_TYPE (parm));
>
>            push_to_sequence2 (all->first_conversion_insn,
>                               all->last_conversion_insn);
> -         emit_move_insn (tempreg, src);
> +         emit_move_insn (tempreg, rtl);
>            tempreg = convert_to_mode (GET_MODE (parmreg), tempreg, unsigned_p);
>            emit_move_insn (parmreg, tempreg);
>            all->first_conversion_insn = get_insns ();
> @@ -3389,18 +3267,18 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>
>            did_conversion = true;
>          }
> -      else if (GET_MODE (parmreg) == BLKmode)
> -       gcc_assert (parm_in_stack_slot_p (parm));
>         else
> -       emit_move_insn (parmreg, src);
> +       emit_move_insn (parmreg, rtl);
>
> -      SET_DECL_RTL (parm, parmreg);
> +      rtl = parmreg;
>
>         /* STACK_PARM is the pointer, not the parm, and PARMREG is
>           now the parm.  */
> -      data->stack_parm = equiv_stack_parm = NULL;
> +      data->stack_parm = NULL;
>       }
>
> +  set_parm_rtl (parm, rtl);
> +
>     /* Mark the register as eliminable if we did no conversion and it was
>        copied from memory at a fixed offset, and the arg pointer was not
>        copied to a pseudo-reg.  If the arg pointer is a pseudo reg or the
> @@ -3408,11 +3286,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>        make here would screw up life analysis for it.  */
>     if (data->nominal_mode == data->passed_mode
>         && !did_conversion
> -      && equiv_stack_parm != 0
> -      && MEM_P (equiv_stack_parm)
> +      && data->stack_parm != 0
> +      && MEM_P (data->stack_parm)
>         && data->locate.offset.var == 0
>         && reg_mentioned_p (virtual_incoming_args_rtx,
> -                         XEXP (equiv_stack_parm, 0)))
> +                         XEXP (data->stack_parm, 0)))
>       {
>         rtx_insn *linsn = get_last_insn ();
>         rtx_insn *sinsn;
> @@ -3425,8 +3303,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>              = GET_MODE_INNER (GET_MODE (parmreg));
>            int regnor = REGNO (XEXP (parmreg, 0));
>            int regnoi = REGNO (XEXP (parmreg, 1));
> -         rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0);
> -         rtx stacki = adjust_address_nv (equiv_stack_parm, submode,
> +         rtx stackr = adjust_address_nv (data->stack_parm, submode, 0);
> +         rtx stacki = adjust_address_nv (data->stack_parm, submode,
>                                            GET_MODE_SIZE (submode));
>
>            /* Scan backwards for the set of the real and
> @@ -3444,7 +3322,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>                  set_unique_reg_note (sinsn, REG_EQUIV, stackr);
>              }
>          }
> -      else
> +      else
>          set_dst_reg_note (linsn, REG_EQUIV, equiv_stack_parm, parmreg);
>       }
>
> @@ -3496,16 +3374,6 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
>     if (data->entry_parm != data->stack_parm)
>       {
>         rtx src, dest;
> -      rtx from_expand = NULL_RTX;
> -
> -      if (data->stack_parm == 0)
> -       {
> -         from_expand = rtl_for_parm (all, parm);
> -         if (from_expand)
> -           gcc_assert (GET_MODE (from_expand) == GET_MODE (data->entry_parm));
> -         if (from_expand && !parm_in_unassigned_mem_p (parm, from_expand))
> -           data->stack_parm = from_expand;
> -       }
>
>         if (data->stack_parm == 0)
>          {
> @@ -3516,16 +3384,7 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
>              = assign_stack_local (GET_MODE (data->entry_parm),
>                                    GET_MODE_SIZE (GET_MODE (data->entry_parm)),
>                                    align);
> -         if (!from_expand)
> -           set_mem_attributes (data->stack_parm, parm, 1);
> -         else
> -           {
> -             gcc_assert (GET_CODE (data->stack_parm) == MEM);
> -             gcc_assert (parm_in_unassigned_mem_p (parm, from_expand));
> -             XEXP (from_expand, 0) = XEXP (data->stack_parm, 0);
> -             PUT_MODE (from_expand, GET_MODE (data->stack_parm));
> -             data->stack_parm = copy_rtx (from_expand);
> -           }
> +         set_mem_attributes (data->stack_parm, parm, 1);
>          }
>
>         dest = validize_mem (copy_rtx (data->stack_parm));
> @@ -3554,7 +3413,7 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
>         end_sequence ();
>       }
>
> -  SET_DECL_RTL (parm, data->stack_parm);
> +  set_parm_rtl (parm, data->stack_parm);
>   }
>
>   /* A subroutine of assign_parms.  If the ABI splits complex arguments, then
> @@ -3580,21 +3439,11 @@ assign_parms_unsplit_complex (struct assign_parm_data_all *all,
>            imag = DECL_RTL (fnargs[i + 1]);
>            if (inner != GET_MODE (real))
>              {
> -             real = simplify_gen_subreg (inner, real, GET_MODE (real),
> -                                         subreg_lowpart_offset
> -                                         (inner, GET_MODE (real)));
> -             imag = simplify_gen_subreg (inner, imag, GET_MODE (imag),
> -                                         subreg_lowpart_offset
> -                                         (inner, GET_MODE (imag)));
> +             real = gen_lowpart_SUBREG (inner, real);
> +             imag = gen_lowpart_SUBREG (inner, imag);
>              }
>
> -         if ((tmp = rtl_for_parm (all, parm)) != NULL_RTX
> -             && rtx_equal_p (real,
> -                             read_complex_part (tmp, false))
> -             && rtx_equal_p (imag,
> -                             read_complex_part (tmp, true)))
> -           ; /* We now have the right rtl in tmp.  */
> -         else if (TREE_ADDRESSABLE (parm))
> +         if (TREE_ADDRESSABLE (parm))
>              {
>                rtx rmem, imem;
>                HOST_WIDE_INT size = int_size_in_bytes (TREE_TYPE (parm));
> @@ -3618,7 +3467,7 @@ assign_parms_unsplit_complex (struct assign_parm_data_all *all,
>              }
>            else
>              tmp = gen_rtx_CONCAT (DECL_MODE (parm), real, imag);
> -         SET_DECL_RTL (parm, tmp);
> +         set_parm_rtl (parm, tmp);
>
>            real = DECL_INCOMING_RTL (fnargs[i]);
>            imag = DECL_INCOMING_RTL (fnargs[i + 1]);
> @@ -3740,7 +3589,7 @@ assign_bounds (vec<bounds_parm_data> &bndargs,
>            assign_parm_setup_block (&all, pbdata->bounds_parm,
>                                     &pbdata->parm_data);
>          else if (pbdata->parm_data.passed_pointer
> -                || use_register_for_parm_decl (&all, pbdata->bounds_parm))
> +                || use_register_for_decl (pbdata->bounds_parm))
>            assign_parm_setup_reg (&all, pbdata->bounds_parm,
>                                   &pbdata->parm_data);
>          else
> @@ -3784,8 +3633,6 @@ assign_parms (tree fndecl)
>            DECL_INCOMING_RTL (parm) = DECL_RTL (parm);
>            continue;
>          }
> -      else
> -       maybe_reset_rtl_for_parm (parm);
>
>         /* Estimate stack alignment from parameter alignment.  */
>         if (SUPPORTS_STACK_ALIGNMENT)
> @@ -3835,7 +3682,7 @@ assign_parms (tree fndecl)
>         else
>          set_decl_incoming_rtl (parm, data.entry_parm, false);
>
> -      assign_parm_adjust_stack_rtl (&all, parm, &data);
> +      assign_parm_adjust_stack_rtl (&data);
>
>         /* Bounds should be loaded in the particular order to
>           have registers allocated correctly.  Collect info about
> @@ -3856,8 +3703,7 @@ assign_parms (tree fndecl)
>          {
>            if (assign_parm_setup_block_p (&data))
>              assign_parm_setup_block (&all, parm, &data);
> -         else if (data.passed_pointer
> -                  || use_register_for_parm_decl (&all, parm))
> +         else if (data.passed_pointer || use_register_for_decl (parm))
>              assign_parm_setup_reg (&all, parm, &data);
>            else
>              assign_parm_setup_stack (&all, parm, &data);
> @@ -3954,7 +3800,7 @@ assign_parms (tree fndecl)
>
>         DECL_HAS_VALUE_EXPR_P (result) = 1;
>
> -      SET_DECL_RTL (result, x);
> +      set_parm_rtl (result, x);
>       }
>
>     /* We have aligned all the args, so add space for the pretend args.  */
> @@ -4986,6 +4832,18 @@ allocate_struct_function (tree fndecl, bool abstract_p)
>     if (fndecl != NULL_TREE)
>       {
>         tree result = DECL_RESULT (fndecl);
> +
> +      if (!abstract_p)
> +       {
> +         /* Now that we have activated any function-specific attributes
> +            that might affect layout, particularly vector modes, relayout
> +            each of the parameters and the result.  */
> +         relayout_decl (result);
> +         for (tree parm = DECL_ARGUMENTS (fndecl); parm;
> +              parm = DECL_CHAIN (parm))
> +           relayout_decl (parm);
> +       }
> +
>         if (!abstract_p && aggregate_value_p (result, fndecl))
>          {
>   #ifdef PCC_STATIC_STRUCT_RETURN
> @@ -5189,7 +5047,6 @@ expand_function_start (tree subr)
>
>     /* Decide whether to return the value in memory or in a register.  */
>     tree res = DECL_RESULT (subr);
> -  maybe_reset_rtl_for_parm (res);
>     if (aggregate_value_p (res, subr))
>       {
>         /* Returning something that won't go in a register.  */
> @@ -5210,10 +5067,7 @@ expand_function_start (tree subr)
>               it.  */
>            if (sv)
>              {
> -             if (DECL_BY_REFERENCE (res))
> -               value_address = get_rtl_for_parm_ssa_default_def (res);
> -             if (!value_address)
> -               value_address = gen_reg_rtx (Pmode);
> +             value_address = gen_reg_rtx (Pmode);
>                emit_move_insn (value_address, sv);
>              }
>          }
> @@ -5222,33 +5076,35 @@ expand_function_start (tree subr)
>            rtx x = value_address;
>            if (!DECL_BY_REFERENCE (res))
>              {
> -             x = get_rtl_for_parm_ssa_default_def (res);
> -             if (!x)
> -               {
> -                 x = gen_rtx_MEM (DECL_MODE (res), value_address);
> -                 set_mem_attributes (x, res, 1);
> -               }
> +             x = gen_rtx_MEM (DECL_MODE (res), x);
> +             set_mem_attributes (x, res, 1);
>              }
> -         SET_DECL_RTL (res, x);
> +         set_parm_rtl (res, x);
>          }
>       }
>     else if (DECL_MODE (res) == VOIDmode)
>       /* If return mode is void, this decl rtl should not be used.  */
> -    SET_DECL_RTL (res, NULL_RTX);
> -  else
> +    set_parm_rtl (res, NULL_RTX);
> +  else
>       {
>         /* Compute the return values into a pseudo reg, which we will copy
>           into the true return register after the cleanups are done.  */
>         tree return_type = TREE_TYPE (res);
> -      rtx x = get_rtl_for_parm_ssa_default_def (res);
> -      if (x)
> -       /* Use it.  */;
> +      /* If we may coalesce this result, make sure it has the expected
> +        mode.  */
> +      if (flag_tree_coalesce_vars && is_gimple_reg (res))
> +       {
> +         tree def = ssa_default_def (cfun, res);
> +         gcc_assert (def);
> +         machine_mode mode = promote_ssa_mode (def, NULL);
> +         set_parm_rtl (res, gen_reg_rtx (mode));
> +       }
>         else if (TYPE_MODE (return_type) != BLKmode
>                 && targetm.calls.return_in_msb (return_type))
>          /* expand_function_end will insert the appropriate padding in
>             this case.  Use the return value's natural (unpadded) mode
>             within the function proper.  */
> -       x = gen_reg_rtx (TYPE_MODE (return_type));
> +       set_parm_rtl (res, gen_reg_rtx (TYPE_MODE (return_type)));
>         else
>          {
>            /* In order to figure out what mode to use for the pseudo, we
> @@ -5259,16 +5115,14 @@ expand_function_start (tree subr)
>            /* Structures that are returned in registers are not
>               aggregate_value_p, so we may see a PARALLEL or a REG.  */
>            if (REG_P (hard_reg))
> -           x = gen_reg_rtx (GET_MODE (hard_reg));
> +           set_parm_rtl (res, gen_reg_rtx (GET_MODE (hard_reg)));
>            else
>              {
>                gcc_assert (GET_CODE (hard_reg) == PARALLEL);
> -             x = gen_group_rtx (hard_reg);
> +             set_parm_rtl (res, gen_group_rtx (hard_reg));
>              }
>          }
>
> -      SET_DECL_RTL (res, x);
> -
>         /* Set DECL_REGISTER flag so that expand_function_end will copy the
>           result to the real return register(s).  */
>         DECL_REGISTER (res) = 1;
> @@ -5291,22 +5145,23 @@ expand_function_start (tree subr)
>       {
>         tree parm = cfun->static_chain_decl;
>         rtx local, chain;
> -     rtx_insn *insn;
> +      rtx_insn *insn;
> +      int unsignedp;
>
> -      local = get_rtl_for_parm_ssa_default_def (parm);
> -      if (!local)
> -       local = gen_reg_rtx (Pmode);
> +      local = gen_reg_rtx (promote_decl_mode (parm, &unsignedp));
>         chain = targetm.calls.static_chain (current_function_decl, true);
>
>         set_decl_incoming_rtl (parm, chain, false);
> -      SET_DECL_RTL (parm, local);
> +      set_parm_rtl (parm, local);
>         mark_reg_pointer (local, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (parm))));
>
> -      if (GET_MODE (local) != Pmode)
> -       local = convert_to_mode (Pmode, local,
> -                                TYPE_UNSIGNED (TREE_TYPE (parm)));
> -
> -      insn = emit_move_insn (local, chain);
> +      if (GET_MODE (local) != GET_MODE (chain))
> +       {
> +         convert_move (local, chain, unsignedp);
> +         insn = get_last_insn ();
> +       }
> +      else
> +       insn = emit_move_insn (local, chain);
>
>         /* Mark the register as eliminable, similar to parameters.  */
>         if (MEM_P (chain)
> diff --git a/gcc/testsuite/gcc.dg/pr67312.c b/gcc/testsuite/gcc.dg/pr67312.c
> new file mode 100644
> index 0000000..f1c9fde
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr67312.c
> @@ -0,0 +1,7 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O0 -ftree-coalesce-vars" } */
> +
> +void foo (int x, int y)
> +{
> +    y = x;
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/stackalign/return-4.c b/gcc/testsuite/gcc.target/i386/stackalign/return-4.c
> index a1e35dc..d14eb2f 100644
> --- a/gcc/testsuite/gcc.target/i386/stackalign/return-4.c
> +++ b/gcc/testsuite/gcc.target/i386/stackalign/return-4.c
> @@ -1,6 +1,13 @@
>   /* { dg-do compile } */
> -/* { dg-options "-mpreferred-stack-boundary=4" } */
> +/* { dg-options "-mpreferred-stack-boundary=4 -O" } */
>   /* { dg-final { scan-assembler-not "and\[lq\]?\[^\\n\]*-64,\[^\\n\]*sp" } } */
> +/* We only guarantee we won't generate the stack alignment when
> +   optimizing.  When not optimizing, the return value will be assigned
> +   to a pseudo with the specified alignment, which in turn will force
> +   stack alignment since the pseudo might have to be spilled.  Without
> +   optimization, we wouldn't compute the actual stack requirements
> +   after register allocation and reload, and just use the conservative
> +   estimate.  */
>
>   /* This compile only test is to detect an assertion failure in stack branch
>      development.  */
> diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
> index fd00883..8dc4908 100644
> --- a/gcc/tree-outof-ssa.c
> +++ b/gcc/tree-outof-ssa.c
> @@ -980,7 +980,6 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
>   {
>     bitmap values = NULL;
>     var_map map;
> -  unsigned i;
>
>     map = coalesce_ssa_name ();
>
> @@ -1005,17 +1004,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
>
>     sa->map = map;
>     sa->values = values;
> -  sa->partition_has_default_def = BITMAP_ALLOC (NULL);
> -  for (i = 1; i < num_ssa_names; i++)
> -    {
> -      tree t = ssa_name (i);
> -      if (t && SSA_NAME_IS_DEFAULT_DEF (t))
> -       {
> -         int p = var_to_partition (map, t);
> -         if (p != NO_PARTITION)
> -           bitmap_set_bit (sa->partition_has_default_def, p);
> -       }
> -    }
> +  sa->partitions_for_parm_default_defs = get_parm_default_def_partitions (map);
>   }
>
>
> @@ -1190,7 +1179,7 @@ finish_out_of_ssa (struct ssaexpand *sa)
>     if (sa->values)
>       BITMAP_FREE (sa->values);
>     delete_var_map (sa->map);
> -  BITMAP_FREE (sa->partition_has_default_def);
> +  BITMAP_FREE (sa->partitions_for_parm_default_defs);
>     memset (sa, 0, sizeof *sa);
>   }
>
> diff --git a/gcc/tree-outof-ssa.h b/gcc/tree-outof-ssa.h
> index 687e5a5..60b6379 100644
> --- a/gcc/tree-outof-ssa.h
> +++ b/gcc/tree-outof-ssa.h
> @@ -39,9 +39,9 @@ struct ssaexpand
>        a pseudos REG).  */
>     rtx *partition_to_pseudo;
>
> -  /* If partition I contains an SSA name that has a default def,
> -     bit I will be set in this bitmap.  */
> -  bitmap partition_has_default_def;
> +  /* If partition I contains an SSA name that has a default def for a
> +     parameter, bit I will be set in this bitmap.  */
> +  bitmap partitions_for_parm_default_defs;
>   };
>
>   /* This is the singleton described above.  */
> diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
> index 8af6583..ff75877 100644
> --- a/gcc/tree-ssa-coalesce.c
> +++ b/gcc/tree-ssa-coalesce.c
> @@ -39,7 +39,9 @@ along with GCC; see the file COPYING3.  If not see
>   #include "cfgexpand.h"
>   #include "explow.h"
>   #include "diagnostic-core.h"
> -
> +#include "tree-dfa.h"
> +#include "tm_p.h"
> +#include "stor-layout.h"
>
>   /* This set of routines implements a coalesce_list.  This is an object which
>      is used to track pairs of ssa_names which are desirable to coalesce
> @@ -877,26 +879,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
>          }
>
>         /* Pretend there are defs for params' default defs at the start
> -        of the (post-)entry block.  */
> +        of the (post-)entry block.  This will prevent PARM_DECLs from
> +        coalescing into the same partition.  Although RESULT_DECLs'
> +        default defs don't have a useful initial value, we have to
> +        prevent them from coalescing with PARM_DECLs' default defs
> +        too, otherwise assign_parms would attempt to assign different
> +        RTL to the same partition.  */
>         if (bb == entry)
>          {
> -         unsigned base;
> -         bitmap_iterator bi;
> -         EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi)
> +         unsigned i;
> +         for (i = 1; i < num_ssa_names; i++)
>              {
> -             bitmap_iterator bi2;
> -             unsigned part;
> -             EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base],
> -                                       0, part, bi2)
> -               {
> -                 tree var = partition_to_var (map, part);
> -                 if (!SSA_NAME_VAR (var)
> -                     || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL
> -                         && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL)
> -                     || !SSA_NAME_IS_DEFAULT_DEF (var))
> -                   continue;
> -                 live_track_process_def (live, var, graph);
> -               }
> +             tree var = ssa_name (i);
> +
> +             if (!var
> +                 || !SSA_NAME_IS_DEFAULT_DEF (var)
> +                 || !SSA_NAME_VAR (var)
> +                 || VAR_P (SSA_NAME_VAR (var)))
> +               continue;
> +
> +             live_track_process_def (live, var, graph);
> +             /* Process a use too, so that it remains live and
> +                conflicts with other parms' default defs, even unused
> +                ones.  */
> +             live_track_process_use (live, var);
>              }
>          }
>
> @@ -937,6 +943,71 @@ fail_abnormal_edge_coalesce (int x, int y)
>     internal_error ("SSA corruption");
>   }
>
> +/* Call CALLBACK for all PARM_DECLs and RESULT_DECLs for which
> +   assign_parms may ask for a default partition.  */
> +
> +static void
> +for_all_parms (void (*callback)(tree var, void *arg), void *arg)
> +{
> +  for (tree var = DECL_ARGUMENTS (current_function_decl); var;
> +       var = DECL_CHAIN (var))
> +    callback (var, arg);
> +  if (!VOID_TYPE_P (TREE_TYPE (DECL_RESULT (current_function_decl))))
> +    callback (DECL_RESULT (current_function_decl), arg);
> +  if (cfun->static_chain_decl)
> +    callback (cfun->static_chain_decl, arg);
> +}
> +
> +/* Create a default def for VAR.  */
> +
> +static void
> +create_default_def (tree var, void *arg ATTRIBUTE_UNUSED)
> +{
> +  if (!is_gimple_reg (var))
> +    return;
> +
> +  tree ssa = get_or_create_ssa_default_def (cfun, var);
> +  gcc_assert (ssa);
> +}
> +
> +/* Register VAR's default def in MAP.  */
> +
> +static void
> +register_default_def (tree var, void *map_)
> +{
> +  var_map map = (var_map)map_;
> +
> +  if (!is_gimple_reg (var))
> +    return;
> +
> +  tree ssa = ssa_default_def (cfun, var);
> +  gcc_assert (ssa);
> +
> +  register_ssa_partition (map, ssa);
> +}
> +
> +/* If VAR is an SSA_NAME associated with a PARM_DECL or a RESULT_DECL,
> +   and the DECL's default def is unused (i.e., it was introduced by
> +   create_default_def), mark VAR and the default def for
> +   coalescing.  */
> +
> +static void
> +coalesce_with_default (tree var, coalesce_list_p cl, bitmap used_in_copy)
> +{
> +  if (SSA_NAME_IS_DEFAULT_DEF (var)
> +      || !SSA_NAME_VAR (var)
> +      || VAR_P (SSA_NAME_VAR (var)))
> +    return;
> +
> +  tree ssa = ssa_default_def (cfun, SSA_NAME_VAR (var));
> +  if (!has_zero_uses (ssa))
> +    return;
> +
> +  add_cost_one_coalesce (cl, SSA_NAME_VERSION (ssa), SSA_NAME_VERSION (var));
> +  bitmap_set_bit (used_in_copy, SSA_NAME_VERSION (var));
> +  /* Default defs will have their used_in_copy bits set at the end of
> +     create_outofssa_var_map.  */
> +}
>
>   /* This function creates a var_map for the current function as well as creating
>      a coalesce list for use later in the out of ssa process.  */
> @@ -954,8 +1025,12 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
>     int v1, v2, cost;
>     unsigned i;
>
> +  for_all_parms (create_default_def, NULL);
> +
>     map = init_var_map (num_ssa_names);
>
> +  for_all_parms (register_default_def, map);
> +
>     FOR_EACH_BB_FN (bb, cfun)
>       {
>         tree arg;
> @@ -1034,6 +1109,30 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
>                }
>                break;
>
> +           case GIMPLE_RETURN:
> +             {
> +               tree res = DECL_RESULT (current_function_decl);
> +               if (VOID_TYPE_P (TREE_TYPE (res))
> +                   || !is_gimple_reg (res))
> +                 break;
> +               tree rhs1 = gimple_return_retval (as_a <greturn *> (stmt));
> +               if (!rhs1)
> +                 break;
> +               tree lhs = ssa_default_def (cfun, res);
> +               gcc_assert (lhs);
> +               if (TREE_CODE (rhs1) == SSA_NAME
> +                   && gimple_can_coalesce_p (lhs, rhs1))
> +                 {
> +                   v1 = SSA_NAME_VERSION (lhs);
> +                   v2 = SSA_NAME_VERSION (rhs1);
> +                   cost = coalesce_cost_bb (bb);
> +                   add_coalesce (cl, v1, v2, cost);
> +                   bitmap_set_bit (used_in_copy, v1);
> +                   bitmap_set_bit (used_in_copy, v2);
> +                 }
> +               break;
> +             }
> +
>              case GIMPLE_ASM:
>                {
>                  gasm *asm_stmt = as_a <gasm *> (stmt);
> @@ -1100,10 +1199,13 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
>         var = ssa_name (i);
>         if (var != NULL_TREE && !virtual_operand_p (var))
>           {
> +         coalesce_with_default (var, cl, used_in_copy);
> +
>            /* Add coalesces between all the result decls.  */
>            if (SSA_NAME_VAR (var)
>                && TREE_CODE (SSA_NAME_VAR (var)) == RESULT_DECL)
>              {
> +             bitmap_set_bit (used_in_copy, SSA_NAME_VERSION (var));
>                if (first == NULL_TREE)
>                  first = var;
>                else
> @@ -1111,8 +1213,6 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
>                    gcc_assert (gimple_can_coalesce_p (var, first));
>                    v1 = SSA_NAME_VERSION (first);
>                    v2 = SSA_NAME_VERSION (var);
> -                 bitmap_set_bit (used_in_copy, v1);
> -                 bitmap_set_bit (used_in_copy, v2);
>                    cost = coalesce_cost_bb (EXIT_BLOCK_PTR_FOR_FN (cfun));
>                    add_coalesce (cl, v1, v2, cost);
>                  }
> @@ -1121,7 +1221,9 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
>               since they will have to be coalesced with the base variable.  If
>               not marked as present, they won't be in the coalesce view. */
>            if (SSA_NAME_IS_DEFAULT_DEF (var)
> -             && !has_zero_uses (var))
> +             && (!has_zero_uses (var)
> +                 || (SSA_NAME_VAR (var)
> +                     && !VAR_P (SSA_NAME_VAR (var)))))
>              bitmap_set_bit (used_in_copy, SSA_NAME_VERSION (var));
>          }
>       }
> @@ -1367,30 +1469,38 @@ gimple_can_coalesce_p (tree name1, tree name2)
>
>         /* We don't want to coalesce two SSA names if one of the base
>           variables is supposed to be a register while the other is
> -        supposed to be on the stack.  Anonymous SSA names take
> -        registers, but when not optimizing, user variables should go
> -        on the stack, so coalescing them with the anonymous variable
> -        as the partition leader would end up assigning the user
> -        variable to a register.  Don't do that!  */
> -      bool reg1 = !var1 || use_register_for_decl (var1);
> -      bool reg2 = !var2 || use_register_for_decl (var2);
> +        supposed to be on the stack.  Anonymous SSA names most often
> +        take registers, but when not optimizing, user variables
> +        should go on the stack, so coalescing them with the anonymous
> +        variable as the partition leader would end up assigning the
> +        user variable to a register.  Don't do that!  */
> +      bool reg1 = use_register_for_decl (name1);
> +      bool reg2 = use_register_for_decl (name2);
>         if (reg1 != reg2)
>          return false;
>
> -      /* Check that the promoted modes are the same.  We don't want to
> -        coalesce if the promoted modes would be different.  Only
> +      /* Check that the promoted modes and unsignedness are the same.
> +        We don't want to coalesce if the promoted modes would be
> +        different, or if they would sign-extend differently.  Only
>           PARM_DECLs and RESULT_DECLs have different promotion rules,
>           so skip the test if both are variables, or both are anonymous
> -        SSA_NAMEs.  Now, if a parm or result has BLKmode, do not
> -        coalesce its SSA versions with those of any other variables,
> -        because it may be passed by reference.  */
> +        SSA_NAMEs.  */
> +      int unsigned1, unsigned2;
>         return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
> -       || (/* The case var1 == var2 is already covered above.  */
> -           !parm_in_stack_slot_p (var1)
> -           && !parm_in_stack_slot_p (var2)
> -           && promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL));
> +       || ((promote_ssa_mode (name1, &unsigned1)
> +            == promote_ssa_mode (name2, &unsigned2))
> +           && unsigned1 == unsigned2);
>       }
>
> +  /* If alignment requirements are different, we can't coalesce.  */
> +  if (MINIMUM_ALIGNMENT (t1,
> +                        var1 ? DECL_MODE (var1) : TYPE_MODE (t1),
> +                        var1 ? LOCAL_DECL_ALIGNMENT (var1) : TYPE_ALIGN (t1))
> +      != MINIMUM_ALIGNMENT (t2,
> +                           var2 ? DECL_MODE (var2) : TYPE_MODE (t2),
> +                           var2 ? LOCAL_DECL_ALIGNMENT (var2) : TYPE_ALIGN (t2)))
> +    return false;
> +
>     /* If the types are not the same, check for a canonical type match.  This
>        (for example) allows coalescing when the types are fundamentally the
>        same, but just have different names.
> @@ -1639,7 +1749,8 @@ coalesce_ssa_name (void)
>            if (a
>                && SSA_NAME_VAR (a)
>                && !DECL_IGNORED_P (SSA_NAME_VAR (a))
> -             && (!has_zero_uses (a) || !SSA_NAME_IS_DEFAULT_DEF (a)))
> +             && (!has_zero_uses (a) || !SSA_NAME_IS_DEFAULT_DEF (a)
> +                 || !VAR_P (SSA_NAME_VAR (a))))
>              {
>                tree *slot = ssa_name_hash.find_slot (a, INSERT);
>
> @@ -1721,3 +1832,47 @@ coalesce_ssa_name (void)
>
>     return map;
>   }
> +
> +/* We need to pass two arguments to set_parm_default_def_partition,
> +   but for_all_parms only supports one.  Use a pair.  */
> +
> +typedef std::pair<var_map, bitmap> parm_default_def_partition_arg;
> +
> +/* Set in ARG's PARTS bitmap the bit corresponding to the partition in
> +   ARG's MAP containing VAR's default def.  */
> +
> +static void
> +set_parm_default_def_partition (tree var, void *arg_)
> +{
> +  parm_default_def_partition_arg *arg = (parm_default_def_partition_arg *)arg_;
> +  var_map map = arg->first;
> +  bitmap parts = arg->second;
> +
> +  if (!is_gimple_reg (var))
> +    return;
> +
> +  tree ssa = ssa_default_def (cfun, var);
> +  gcc_assert (ssa);
> +
> +  int version = var_to_partition (map, ssa);
> +  gcc_assert (version != NO_PARTITION);
> +
> +  bool changed = bitmap_set_bit (parts, version);
> +  gcc_assert (changed);
> +}
> +
> +/* Allocate and return a bitmap that has a bit set for each partition
> +   that contains a default def for a parameter.  */
> +
> +extern bitmap
> +get_parm_default_def_partitions (var_map map)
> +{
> +  bitmap parm_default_def_parts = BITMAP_ALLOC (NULL);
> +
> +  parm_default_def_partition_arg
> +    arg = std::make_pair (map, parm_default_def_parts);
> +
> +  for_all_parms (set_parm_default_def_partition, &arg);
> +
> +  return parm_default_def_parts;
> +}
> diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h
> index ae289b4..8316f34 100644
> --- a/gcc/tree-ssa-coalesce.h
> +++ b/gcc/tree-ssa-coalesce.h
> @@ -22,5 +22,6 @@ along with GCC; see the file COPYING3.  If not see
>
>   extern var_map coalesce_ssa_name (void);
>   extern bool gimple_can_coalesce_p (tree, tree);
> +extern bitmap get_parm_default_def_partitions (var_map);
>
>   #endif /* GCC_TREE_SSA_COALESCE_H */
> diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
> index e031725..25b548b 100644
> --- a/gcc/tree-ssa-live.c
> +++ b/gcc/tree-ssa-live.c
> @@ -200,7 +200,9 @@ partition_view_init (var_map map)
>         tmp = partition_find (map->var_partition, x);
>         if (ssa_name (tmp) != NULL_TREE && !virtual_operand_p (ssa_name (tmp))
>            && (!has_zero_uses (ssa_name (tmp))
> -             || !SSA_NAME_IS_DEFAULT_DEF (ssa_name (tmp))))
> +             || !SSA_NAME_IS_DEFAULT_DEF (ssa_name (tmp))
> +             || (SSA_NAME_VAR (ssa_name (tmp))
> +                 && !VAR_P (SSA_NAME_VAR (ssa_name (tmp))))))
>          bitmap_set_bit (used, tmp);
>       }
>
> @@ -1404,6 +1406,12 @@ verify_live_on_entry (tree_live_info_p live)
>                    }
>                  if (ok)
>                    continue;
> +               /* Expand adds unused default defs for PARM_DECLs and
> +                  RESULT_DECLs.  They're ok.  */
> +               if (has_zero_uses (var)
> +                   && SSA_NAME_VAR (var)
> +                   && !VAR_P (SSA_NAME_VAR (var)))
> +                 continue;
>                  num++;
>                  print_generic_expr (stderr, var, TDF_SLIM);
>                  fprintf (stderr, " is not marked live-on-entry to entry BB%d ",
>
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
>


[-- Attachment #2: pr43920-2.s.after --]
[-- Type: text/plain, Size: 1038 bytes --]

	.arch armv8-a
	.eabi_attribute 28, 1
	.fpu crypto-neon-fp-armv8
	.eabi_attribute 20, 1
	.eabi_attribute 21, 1
	.eabi_attribute 23, 3
	.eabi_attribute 24, 1
	.eabi_attribute 25, 1
	.eabi_attribute 26, 1
	.eabi_attribute 30, 4
	.eabi_attribute 34, 1
	.eabi_attribute 18, 4
	.file	"pr43920-2.c"
	.text
	.align	1
	.global	getFileStartAndLength
	.syntax unified
	.thumb
	.thumb_func
	.type	getFileStartAndLength, %function
getFileStartAndLength:
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	push	{r3, r4, r5, r6, r7, lr}
	mov	r7, r0
	mov	r6, r1
	mov	r5, r2
	movs	r1, #0
	movs	r2, #1
	bl	lseek
	movs	r2, #2
	mov	r4, r0
	movs	r1, #0
	mov	r0, r7
	bl	lseek
	adds	r2, r4, #1
	beq	.L1
	adds	r3, r0, #1
	beq	.L4
	subs	r0, r0, r4
	beq	.L4
	str	r4, [r6]
	movs	r4, #0
	str	r0, [r5]
	b	.L1
.L4:
	mov	r4, #-1
.L1:
	mov	r0, r4
	pop	{r3, r4, r5, r6, r7, pc}
	.size	getFileStartAndLength, .-getFileStartAndLength
	.ident	"GCC: (unknown) 6.0.0 20150927 (experimental)"

[-- Attachment #3: pr43920-2.s.before --]
[-- Type: text/plain, Size: 1048 bytes --]

	.arch armv8-a
	.eabi_attribute 28, 1
	.fpu crypto-neon-fp-armv8
	.eabi_attribute 20, 1
	.eabi_attribute 21, 1
	.eabi_attribute 23, 3
	.eabi_attribute 24, 1
	.eabi_attribute 25, 1
	.eabi_attribute 26, 1
	.eabi_attribute 30, 4
	.eabi_attribute 34, 1
	.eabi_attribute 18, 4
	.file	"pr43920-2.c"
	.text
	.align	1
	.global	getFileStartAndLength
	.syntax unified
	.thumb
	.thumb_func
	.type	getFileStartAndLength, %function
getFileStartAndLength:
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	push	{r3, r4, r5, r6, r7, lr}
	mov	r7, r0
	mov	r6, r1
	mov	r5, r2
	movs	r1, #0
	movs	r2, #1
	bl	lseek
	movs	r2, #2
	mov	r4, r0
	movs	r1, #0
	mov	r0, r7
	bl	lseek
	adds	r2, r4, #1
	beq	.L4
	adds	r3, r0, #1
	beq	.L2
	subs	r0, r0, r4
	beq	.L4
	str	r4, [r6]
	str	r0, [r5]
	movs	r0, #0
	pop	{r3, r4, r5, r6, r7, pc}
.L4:
	mov	r0, #-1
.L2:
	pop	{r3, r4, r5, r6, r7, pc}
	.size	getFileStartAndLength, .-getFileStartAndLength
	.ident	"GCC: (unknown) 6.0.0 20150927 (experimental)"


* Re: [PR64164] drop copyrename, integrate into expand
  2015-09-29 11:31                                                                 ` [PR64164] drop copyrename, integrate into expand Szabolcs Nagy
@ 2015-10-07 22:37                                                                   ` Alexandre Oliva
  2015-10-08 10:00                                                                     ` Richard Biener
  2015-10-09 21:10                                                                     ` Jeff Law
  0 siblings, 2 replies; 127+ messages in thread
From: Alexandre Oliva @ 2015-10-07 22:37 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: Alan Lawrence, Jeff Law, James Greenhalgh, H.J. Lu,
	Segher Boessenkool, Richard Biener, GCC Patches, Christophe Lyon,
	David Edelsohn, Eric Botcazou

On Sep 29, 2015, Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:

> this commit

> commit 33cc9081157a8c90460e4c0bdda2ac461a3822cc
> Author: aoliva <aoliva@138bc75d-0d04-0410-961f-82ee72b054a4>
> Date:   2015-09-27 09:02:00 +0000

>     revert to assign_parms assignments using default defs
>     ...

> introduced a test failure on arm-none-eabi (using newlib, compiling
> with -mthumb -march=armv8-a -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard ):

> FAIL: gcc.target/arm/pr43920-2.c scan-assembler-times pop 2

Thanks for the report.

The problem here is that we don't allocate the pseudo assigned to
<retval> to r0.  That's because we coalesce <retval> versions with
another variable that crosses a function call.  We do that because
uncprop brings these unrelated variables, that happen to contain the
same -1 value we want to return, into the PHI node with the final
<retval> value.
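
For readers without the testcase at hand, the C sketch below shows the
shape of function that the two attached listings suggest: two seek
calls, a comparison of each result against -1, and a subtraction.  It is
a reconstruction under assumptions, not the source of pr43920-2.c.  The
names fake_seek, set_fake_results and get_start_and_length are
hypothetical, and fake_seek merely stands in for lseek so that the block
is self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for lseek: returns canned results in call
   order.  */
static long fake_results[2];
static size_t fake_calls;

static void
set_fake_results (long first, long second)
{
  fake_results[0] = first;
  fake_results[1] = second;
  fake_calls = 0;
}

static long
fake_seek (int fd, long offset, int whence)
{
  (void) fd;
  (void) offset;
  (void) whence;
  assert (fake_calls < 2);
  return fake_results[fake_calls++];
}

/* Assumed shape of the function under test.  */
int
get_start_and_length (int fd, long *start_out, long *length_out)
{
  /* The listings pass 1 and 2 as the third argument.  */
  long start = fake_seek (fd, 0L, 1);
  long end = fake_seek (fd, 0L, 2);

  /* On the edge taken when START is -1, the returned constant has the
     same value as START, and likewise for END.  Per the explanation
     above, uncprop may put those names, rather than the literal -1,
     into the PHI node that feeds the return value.  */
  if (start == -1 || end == -1)
    return -1;

  long length = end - start;
  if (length == 0)
    return -1;

  *start_out = start;
  *length_out = length;
  return 0;
}
```

In this shape, start is computed before the second call and stays live
across it, while end is not live across any call.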

We can't coalesce both start and end with <retval>, because start and
end conflict, but by chance we try start first, and that succeeds.  If
we tried end first (e.g., by giving it a higher coalesce priority,
because fewer calls are crossed by its value in the path to the relevant
edge), we could have got the coalesced variable assigned to r0, and that
would enable us to optimize out the copy to r0 before return, and so
merge the return-only basic block with other blocks.  But ATM we don't
take the definition point or path to the edge into account when
computing coalesce costs, so we can't deterministically do better for
this testcase, and I'm not sure using this additional information would
make it better overall.
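
The order dependence described above can be illustrated with a toy
model.  This is only a sketch, assuming a greedy pass that tries
candidate pairs in list order and refuses a merge when any two members
of the partitions conflict.  It is not the code in tree-ssa-coalesce.c,
and every name in it (RETVAL, START, END, coalesce_in_order) is
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { RETVAL, START, END, NUM_NAMES };

/* START and END are live at the same time, so they conflict.  */
static const bool conflicts[NUM_NAMES][NUM_NAMES] = {
  [START][END] = true,
  [END][START] = true,
};

/* True if some member of partition PA conflicts with some member of
   partition PB.  */
static bool
partitions_conflict (const int part[NUM_NAMES], int pa, int pb)
{
  for (int i = 0; i < NUM_NAMES; i++)
    for (int j = 0; j < NUM_NAMES; j++)
      if (part[i] == pa && part[j] == pb && conflicts[i][j])
        return true;
  return false;
}

/* Try each candidate pair in the order given and merge it when no
   conflict forbids that.  PART[i] receives the partition of name i.  */
void
coalesce_in_order (const int pairs[][2], size_t npairs, int part[NUM_NAMES])
{
  for (int i = 0; i < NUM_NAMES; i++)
    part[i] = i;

  for (size_t k = 0; k < npairs; k++)
    {
      int a = pairs[k][0], b = pairs[k][1];
      assert (a >= 0 && a < NUM_NAMES && b >= 0 && b < NUM_NAMES);

      int pa = part[a], pb = part[b];
      if (pa == pb || partitions_conflict (part, pa, pb))
        continue;

      /* Fold partition PB into PA.  */
      for (int i = 0; i < NUM_NAMES; i++)
        if (part[i] == pb)
          part[i] = pa;
    }
}
```

Feeding the pairs as (RETVAL, START) then (RETVAL, END) leaves END in
its own partition; swapping the two pairs leaves START out instead.
Which pair comes first is exactly what a cost heuristic would decide.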

Compiling with -fno-tree-dominator-opts skips uncprop so that we don't
even try to coalesce other variables with <retval>, so we get the code
expected by the testcase.  But we obviously don't want to disable this
optimization in general.

Any other thoughts, anyone?

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


* Re: [PR64164] drop copyrename, integrate into expand
  2015-10-07 22:37                                                                   ` Alexandre Oliva
@ 2015-10-08 10:00                                                                     ` Richard Biener
  2015-10-09 21:10                                                                     ` Jeff Law
  1 sibling, 0 replies; 127+ messages in thread
From: Richard Biener @ 2015-10-08 10:00 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Szabolcs Nagy, Alan Lawrence, Jeff Law, James Greenhalgh,
	H.J. Lu, Segher Boessenkool, GCC Patches, Christophe Lyon,
	David Edelsohn, Eric Botcazou

On Thu, Oct 8, 2015 at 12:36 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Sep 29, 2015, Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
>
>> this commit
>
>> commit 33cc9081157a8c90460e4c0bdda2ac461a3822cc
>> Author: aoliva <aoliva@138bc75d-0d04-0410-961f-82ee72b054a4>
>> Date:   2015-09-27 09:02:00 +0000
>
>>     revert to assign_parms assignments using default defs
>>     ...
>
>> introduced a test failure on arm-none-eabi (using newlib, compiling
>> with -mthumb -march=armv8-a -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard ):
>
>> FAIL: gcc.target/arm/pr43920-2.c scan-assembler-times pop 2
>
> Thanks for the report.
>
> The problem here is that we don't allocate the pseudo assigned to
> <retval> to r0.  That's because we coalesce <retval> versions with
> another variable that crosses a function call.  We do that because
> uncprop brings these unrelated variables, that happen to contain the
> same -1 value we want to return, into the PHI node with the final
> <retval> value.
>
> We can't coalesce both start and end with <retval>, because start and
> end conflict, but by chance we try start first, and that succeeds.  If
> we tried end first (e.g., by giving it a higher coalesce priority,
> because fewer calls are crossed by its value in the path to the relevant
> edge), we could have got the coalesced variable assigned to r0, and that
> would enable us to optimize out the copy to r0 before return, and so
> merge the return-only basic block with other blocks.  But ATM we don't
> take the definition point or path to the edge into account when
> computing coalesce costs, so we can't deterministically do better for
> this testcase, and I'm not sure using this additional information would
> make it better overall.
>
> Compiling with -fno-tree-dominator-opts skips uncprop so that we don't
> even try to coalesce other variables with <retval>, so we get the code
> expected by the testcase.  But we obviously don't want to disable this
> optimization in general.
>
> Any other thoughts, anyone?

Bad luck?  Add some heuristics that always help?

Ok, that wasn't really useful :/

Richard.

>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

^ permalink raw reply	[flat|nested] 127+ messages in thread

* [PR67828] don't unswitch loops on undefined SSA values         (was: Re: [PR64164] drop copyrename, integrate into expand)
  2015-09-25 11:39                                                                 ` Richard Biener
@ 2015-10-09  5:26                                                                   ` Alexandre Oliva
  2015-10-09  9:35                                                                     ` Richard Biener
  2015-10-09  5:36                                                                   ` [PR67766] reorder return value copying from PARALLELs and CONCATs " Alexandre Oliva
  1 sibling, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-10-09  5:26 UTC (permalink / raw)
  To: Zhendong Su, Richard Biener
  Cc: Alan Lawrence, Jeff Law, James Greenhalgh, H.J. Lu,
	Segher Boessenkool, GCC Patches, Christophe Lyon, David Edelsohn,
	Eric Botcazou

This patch fixes a latent bug in loop unswitching exposed by the PR64164
changes.

We would move a test out of a loop that might never have been executed,
and that accessed an uninitialized variable.  The uninitialized SSA
name, due to uncprop, now gets coalesced with other SSA names,
expanding the ill effects of the undefined behavior we introduce.  Later
rtl stages do zero-initialize the uninitialized pseudo, but by then
we've already expanded a PHI node that referenced the uninitialized
variable, on the edge coming from a path in which it would necessarily
be zero, to a copy from the coalesced pseudo.  That pseudo gets modified
between the zero-initialization and the copy, so the copied zero is no
longer zero.  Oops.

We might want to be stricter in coalesce conflict detection to avoid
this sort of problem, and perhaps to avoid undefined values in uncprop,
but this would all be attempting to limit the effects of undefined
behavior, which is probably a waste of effort.  As long as we avoid
introducing undefined behavior ourselves, we shouldn't have to do any of
that.  So, this patch fixes loop unswitching so as to not introduce
undefined behavior.
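
To make the hazard concrete, here is a minimal standalone sketch of what
unswitching does to a loop shaped like the one in the testcase.  It is
not GCC code: read_j, reads_of_j and the two loop functions are made-up
names, and j is passed in as an ordinary defined value so that the
sketch itself stays well-defined.  The counter only records how often
the invariant condition is read.

```c
#include <assert.h>

/* Counts reads of the loop-invariant condition.  This stands in for
   the uninitialized j of the testcase; all names here are hypothetical
   and exist only for this illustration.  */
int reads_of_j;

static int
read_j (int j)
{
  reads_of_j++;
  return j;
}

/* Original shape: j is read only when the loop body runs and b is
   nonzero.  */
int
loop_original (int n, int b, int j)
{
  int work = 0;
  for (int i = 0; i < n; i++)
    if (b)
      {
        work++;
        if (read_j (j))
          work++;
      }
  return work;
}

/* Unswitched shape: the test on j is hoisted out of the loop, so j is
   read exactly once, even if n is zero or b is zero.  */
int
loop_unswitched (int n, int b, int j)
{
  int work = 0;
  if (read_j (j))
    {
      for (int i = 0; i < n; i++)
        if (b)
          work += 2;
    }
  else
    {
      for (int i = 0; i < n; i++)
        if (b)
          work++;
    }
  return work;
}

/* Both shapes compute the same result whenever j is defined.  */
void
check_same_result (int n, int b, int j)
{
  assert (loop_original (n, b, j) == loop_unswitched (n, b, j));
}
```

With b == 0 the original shape never reads j, while the unswitched
shape reads it once.  If j were uninitialized, that one read is the
undefined behavior the transformation would introduce.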

Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  Ok to install?


[PR67828] don't unswitch on default defs of non-parms

From: Alexandre Oliva <aoliva@redhat.com>

for  gcc/ChangeLog

	PR rtl-optimization/67828
	* tree-ssa-loop-unswitch.c: Include tree-ssa.h.
	(tree_may_unswitch_on): Don't unswitch on expressions
	involving undefined values.

for  gcc/testsuite/ChangeLog

	PR rtl-optimization/67828
	* gcc.dg/torture/pr67828.c: New.
---
 gcc/testsuite/gcc.dg/torture/pr67828.c |   43 ++++++++++++++++++++++++++++++++
 gcc/tree-ssa-loop-unswitch.c           |    5 ++++
 2 files changed, 48 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr67828.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr67828.c b/gcc/testsuite/gcc.dg/torture/pr67828.c
new file mode 100644
index 0000000..c7b6965
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr67828.c
@@ -0,0 +1,43 @@
+/* Check that we don't misoptimize the final value of d.  We used to
+   apply loop unswitching on if(j), introducing undefined behavior
+   that the original code wouldn't exercise, and this undefined
+   behavior would get later passes to misoptimize the loop.  */
+
+/* { dg-do run } */
+
+#include <stdio.h>
+#include <stdlib.h>
+
+int x;
+
+int __attribute__ ((noinline, noclone))
+xprintf (int d) {
+  if (d)
+    {
+      if (x)
+	printf ("%d", d);
+      abort ();
+    }
+}
+
+int a, b;
+short c;
+
+int
+main ()
+{
+  int j, d = 1;
+  for (; c >= 0; c++)
+    {
+      a = d;
+      d = 0;
+      if (b)
+	{
+	  xprintf (0);
+	  if (j)
+	    xprintf (0);
+	}
+    }
+  xprintf (d);
+  exit (0);
+}
diff --git a/gcc/tree-ssa-loop-unswitch.c b/gcc/tree-ssa-loop-unswitch.c
index 4328d6a..d6faa37 100644
--- a/gcc/tree-ssa-loop-unswitch.c
+++ b/gcc/tree-ssa-loop-unswitch.c
@@ -32,6 +32,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "internal-fn.h"
 #include "gimplify.h"
 #include "tree-cfg.h"
+#include "tree-ssa.h"
 #include "tree-ssa-loop-niter.h"
 #include "tree-ssa-loop.h"
 #include "tree-into-ssa.h"
@@ -139,6 +140,10 @@ tree_may_unswitch_on (basic_block bb, struct loop *loop)
   /* Condition must be invariant.  */
   FOR_EACH_SSA_TREE_OPERAND (use, stmt, iter, SSA_OP_USE)
     {
+      /* Unswitching on undefined values would introduce undefined
+	 behavior that the original program might never exercise.  */
+      if (ssa_undefined_value_p (use, true))
+	return NULL_TREE;
       def = SSA_NAME_DEF_STMT (use);
       def_bb = gimple_bb (def);
       if (def_bb


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

^ permalink raw reply	[flat|nested] 127+ messages in thread

* [PR67766] reorder return value copying from PARALLELs and CONCATs         (was: Re: [PR64164] drop copyrename, integrate into expand)
  2015-09-25 11:39                                                                 ` Richard Biener
  2015-10-09  5:26                                                                   ` [PR67828] don't unswitch loops on undefined SSA values (was: Re: [PR64164] drop copyrename, integrate into expand) Alexandre Oliva
@ 2015-10-09  5:36                                                                   ` Alexandre Oliva
  2015-10-09  7:33                                                                     ` [PR67891] drop is_gimple_reg test from set_parm_rtl (was: [PR67766] reorder return value copying from PARALLELs and CONCATs) Alexandre Oliva
  2015-10-09  9:36                                                                     ` [PR67766] reorder return value copying from PARALLELs and CONCATs (was: Re: [PR64164] drop copyrename, integrate into expand) Richard Biener
  1 sibling, 2 replies; 127+ messages in thread
From: Alexandre Oliva @ 2015-10-09  5:36 UTC (permalink / raw)
  To: Uroš Bizjak, Richard Biener
  Cc: Alan Lawrence, Jeff Law, James Greenhalgh, H.J. Lu,
	Segher Boessenkool, GCC Patches, Christophe Lyon, David Edelsohn,
	Eric Botcazou

This fixes fallout from the PR64164 expander revamp.  On alpha, PARALLEL
hard return values may be modeless, and this confuses the code that
wants to copy the pseudo/s in the returned value to the return hard
regs.

It used to work because PARALLELs and CONCATs used to lead to DECL_RTL
with the same mode, but now we try harder to create a pseudo or MEM with
a reasonable mode.

The solution was as simple as moving down the code that handled mode
differences, so that PARALLELs and CONCATs are handled as they should.
Since AFAICT they don't ever have to deal with mode promotion anyway, we
should be fine with this simple change, that Uroš kindly tested with an
alpha-linux-gnu regstrap.  I tested it myself on x86_64-linux-gnu and
i686-linux-gnu.
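
For illustration only, the effect of the reordering can be modeled with
a toy else-if chain.  This is a sketch under simplifying assumptions,
not the code in function.c: the toy_* enums and both pick_* functions
are invented here, only the PARALLEL arm is modeled, and the CONCAT arm
and the earlier shift_return_value arm are left out.

```c
/* Toy stand-ins for rtx codes and machine modes; hypothetical names,
   not GCC's.  */
enum toy_code { TOY_REG, TOY_PARALLEL };
enum toy_mode { TOY_VOIDmode, TOY_SImode, TOY_DImode };

/* Which arm of the chain ends up handling the return value copy.  */
enum toy_action { ACT_CONVERT_MOVE, ACT_PARALLEL, ACT_PLAIN_MOVE };

/* Old order: the mode-difference test comes first, so a modeless
   PARALLEL is mistaken for a promoted value.  */
enum toy_action
pick_old (enum toy_code real_code, enum toy_mode real_mode,
          enum toy_mode decl_mode)
{
  if (real_mode != decl_mode)
    return ACT_CONVERT_MOVE;
  else if (real_code == TOY_PARALLEL)
    return ACT_PARALLEL;
  else
    return ACT_PLAIN_MOVE;
}

/* New order: PARALLELs are recognized before modes are compared.  */
enum toy_action
pick_new (enum toy_code real_code, enum toy_mode real_mode,
          enum toy_mode decl_mode)
{
  if (real_code == TOY_PARALLEL)
    return ACT_PARALLEL;
  else if (real_mode != decl_mode)
    return ACT_CONVERT_MOVE;
  else
    return ACT_PLAIN_MOVE;
}
```

In this model a modeless PARALLEL paired with a DImode pseudo takes the
convert_move arm under the old order and the PARALLEL arm under the new
one, while an ordinary promoted scalar is handled the same way by both.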

Ok to install?


[PR67766] reorder handling of parallels, concats and promoted values in return

From: Alexandre Oliva <aoliva@redhat.com>

for  gcc/ChangeLog

	PR middle-end/67766
	* function.c (expand_function_end): Move return value
	promotion past the handling of PARALLELs and CONCATs.
---
 gcc/function.c |   24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/gcc/function.c b/gcc/function.c
index e76ba2b..d16d6d8 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -5446,18 +5446,6 @@ expand_function_end (void)
 			      decl_rtl);
 	      shift_return_value (GET_MODE (decl_rtl), true, real_decl_rtl);
 	    }
-	  /* If a named return value dumped decl_return to memory, then
-	     we may need to re-do the PROMOTE_MODE signed/unsigned
-	     extension.  */
-	  else if (GET_MODE (real_decl_rtl) != GET_MODE (decl_rtl))
-	    {
-	      int unsignedp = TYPE_UNSIGNED (TREE_TYPE (decl_result));
-	      promote_function_mode (TREE_TYPE (decl_result),
-				     GET_MODE (decl_rtl), &unsignedp,
-				     TREE_TYPE (current_function_decl), 1);
-
-	      convert_move (real_decl_rtl, decl_rtl, unsignedp);
-	    }
 	  else if (GET_CODE (real_decl_rtl) == PARALLEL)
 	    {
 	      /* If expand_function_start has created a PARALLEL for decl_rtl,
@@ -5488,6 +5476,18 @@ expand_function_end (void)
 	      emit_move_insn (tmp, decl_rtl);
 	      emit_move_insn (real_decl_rtl, tmp);
 	    }
+	  /* If a named return value dumped decl_return to memory, then
+	     we may need to re-do the PROMOTE_MODE signed/unsigned
+	     extension.  */
+	  else if (GET_MODE (real_decl_rtl) != GET_MODE (decl_rtl))
+	    {
+	      int unsignedp = TYPE_UNSIGNED (TREE_TYPE (decl_result));
+	      promote_function_mode (TREE_TYPE (decl_result),
+				     GET_MODE (decl_rtl), &unsignedp,
+				     TREE_TYPE (current_function_decl), 1);
+
+	      convert_move (real_decl_rtl, decl_rtl, unsignedp);
+	    }
 	  else
 	    emit_move_insn (real_decl_rtl, decl_rtl);
 	}


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

^ permalink raw reply	[flat|nested] 127+ messages in thread

* [PR67891] drop is_gimple_reg test from set_parm_rtl         (was: [PR67766] reorder return value copying from PARALLELs and CONCATs)
  2015-10-09  5:36                                                                   ` [PR67766] reorder return value copying from PARALLELs and CONCATs " Alexandre Oliva
@ 2015-10-09  7:33                                                                     ` Alexandre Oliva
  2015-10-09  9:40                                                                       ` Richard Biener
  2015-10-09  9:36                                                                     ` [PR67766] reorder return value copying from PARALLELs and CONCATs (was: Re: [PR64164] drop copyrename, integrate into expand) Richard Biener
  1 sibling, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-10-09  7:33 UTC (permalink / raw)
  To: Uroš Bizjak
  Cc: Richard Biener, Alan Lawrence, Jeff Law, James Greenhalgh,
	H.J. Lu, Segher Boessenkool, GCC Patches, Christophe Lyon,
	David Edelsohn, Eric Botcazou

On Oct  9, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:

> This fixes fallout from the PR64164 expander revamp.

> Uroš kindly tested with an alpha-linux-gnu regstrap.

The one regression he mentioned from that run was gcc.dg/pr43300.c.  The
vector parameter there is handled by the emit_block_move case of
assign_parms_setup_block.  Alas, emit_block_move marks the decl as
addressable, which causes the subsequent is_gimple_reg test in
set_parm_rtl to return false.  This causes us to call set_rtl with the
parm decl, instead of its default def, and the latter would be required
to store the RTL in the partition holding the default def.

The good news is that we don't really need to call is_gimple_reg there,
though; testing whether there is a default def in place is enough, and
ssa_default_def will find the default def in spite of the parm's no
longer passing is_gimple_reg, and it won't complain if given a decl that
was never a gimple reg.
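
As a rough standalone model of the control flow (not the cfgexpand.c
code): the toy_* types, the integer standing in for an rtx, and both
function names below are assumptions made up for this sketch, and
"addressable" is the only property the toy is_gimple_reg looks at.

```c
#include <stddef.h>

/* Toy default definition: only records the RTL filed for the partition
   holding it (0 means none).  Hypothetical type.  */
struct toy_ssa
{
  int partition_rtl;
};

/* Toy parm decl.  Hypothetical type.  */
struct toy_parm
{
  int addressable;              /* set once the decl is marked addressable */
  struct toy_ssa *default_def;  /* NULL if the parm has no default def */
  int decl_rtl;                 /* RTL filed on the decl itself, 0 if none */
};

/* In this model an addressable decl no longer counts as a gimple reg.  */
static int
toy_is_gimple_reg (const struct toy_parm *parm)
{
  return !parm->addressable;
}

/* Before the patch: an addressable parm bails out early, so the
   partition of its default def never gets the RTL.  */
void
toy_set_parm_rtl_old (struct toy_parm *parm, int x)
{
  if (!toy_is_gimple_reg (parm) || parm->default_def == NULL)
    {
      parm->decl_rtl = x;
      return;
    }
  parm->default_def->partition_rtl = x;
}

/* After the patch: only the presence of a default def matters.  */
void
toy_set_parm_rtl_new (struct toy_parm *parm, int x)
{
  if (parm->default_def == NULL)
    {
      parm->decl_rtl = x;
      return;
    }
  parm->default_def->partition_rtl = x;
}
```

For a parm that has a default def and was later marked addressable, the
old variant files the RTL on the decl only, and the new variant files
it with the default def's partition.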

So, I'm dropping the test.  Regstrapped on x86_64-linux-gnu and
i686-linux-gnu.  Ok to install?


[PR67891] don't test is_gimple_reg after parm expansion

From: Alexandre Oliva <aoliva@redhat.com>

for  gcc/ChangeLog

	PR middle-end/67891
	* cfgexpand.c (set_parm_rtl): Drop is_gimple_reg test.
---
 gcc/cfgexpand.c |    3 ---
 1 file changed, 3 deletions(-)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 58e55d2..eaad859 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -1243,9 +1243,6 @@ set_parm_rtl (tree parm, rtx x)
       record_alignment_for_reg_var (align);
     }
 
-  if (!is_gimple_reg (parm))
-    return set_rtl (parm, x);
-
   tree ssa = ssa_default_def (cfun, parm);
   if (!ssa)
     return set_rtl (parm, x);


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR67828] don't unswitch loops on undefined SSA values (was: Re: [PR64164] drop copyrename, integrate into expand)
  2015-10-09  5:26                                                                   ` [PR67828] don't unswitch loops on undefined SSA values (was: Re: [PR64164] drop copyrename, integrate into expand) Alexandre Oliva
@ 2015-10-09  9:35                                                                     ` Richard Biener
  0 siblings, 0 replies; 127+ messages in thread
From: Richard Biener @ 2015-10-09  9:35 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Zhendong Su, Alan Lawrence, Jeff Law, James Greenhalgh, H.J. Lu,
	Segher Boessenkool, GCC Patches, Christophe Lyon, David Edelsohn,
	Eric Botcazou

On Fri, Oct 9, 2015 at 7:26 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> This patch fixes a latent bug in loop unswitching exposed by the PR64164
> changes.
>
> We would move a test out of a loop that might never have been executed,
> and that accessed an uninitialized variable.  The uninitialized SSA
> name, due to uncprop, now gets coalesced with other SSA names,
> expanding the ill effects of the undefined behavior we introduce.  Later
> rtl stages do zero-initialize the uninitialized pseudo, but by then
> we've already expanded a PHI node that referenced the uninitialized
> variable, on the edge coming from a path in which it would necessarily
> be zero, to a copy from the coalesced pseudo.  That pseudo gets modified
> between the zero-initialization and the copy, so the copied zero is no
> longer zero.  Oops.
>
> We might want to be stricter in coalesce conflict detection to avoid
> this sort of problem, and perhaps to avoid undefined values in uncprop,
> but this would all be attempting to limit the effects of undefined
> behavior, which is probably a waste of effort.  As long as we avoid
> introducing undefined behavior ourselves, we shouldn't have to do any of
> that.  So, this patch fixes loop unswitching so as to not introduce
> undefined behavior.
>
> Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  Ok to install?

Ok.

Thanks,
Richard.

>
> [PR67828] don't unswitch on default defs of non-parms
>
> From: Alexandre Oliva <aoliva@redhat.com>
>
> for  gcc/ChangeLog
>
>         PR rtl-optimization/67828
>         * tree-ssa-loop-unswitch.c: Include tree-ssa.h.
>         (tree_may_unswitch_on): Don't unswitch on expressions
>         involving undefined values.
>
> for  gcc/testsuite/ChangeLog
>
>         PR rtl-optimization/67828
>         * gcc.dg/torture/pr67828.c: New.
> ---
>  gcc/testsuite/gcc.dg/torture/pr67828.c |   43 ++++++++++++++++++++++++++++++++
>  gcc/tree-ssa-loop-unswitch.c           |    5 ++++
>  2 files changed, 48 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/torture/pr67828.c
>
> diff --git a/gcc/testsuite/gcc.dg/torture/pr67828.c b/gcc/testsuite/gcc.dg/torture/pr67828.c
> new file mode 100644
> index 0000000..c7b6965
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr67828.c
> @@ -0,0 +1,43 @@
> +/* Check that we don't misoptimize the final value of d.  We used to
> +   apply loop unswitching on if(j), introducing undefined behavior
> +   that the original code wouldn't exercise, and this undefined
> +   behavior would get later passes to misoptimize the loop.  */
> +
> +/* { dg-do run } */
> +
> +#include <stdio.h>
> +#include <stdlib.h>
> +
> +int x;
> +
> +int __attribute__ ((noinline, noclone))
> +xprintf (int d) {
> +  if (d)
> +    {
> +      if (x)
> +       printf ("%d", d);
> +      abort ();
> +    }
> +}
> +
> +int a, b;
> +short c;
> +
> +int
> +main ()
> +{
> +  int j, d = 1;
> +  for (; c >= 0; c++)
> +    {
> +      a = d;
> +      d = 0;
> +      if (b)
> +       {
> +         xprintf (0);
> +         if (j)
> +           xprintf (0);
> +       }
> +    }
> +  xprintf (d);
> +  exit (0);
> +}
> diff --git a/gcc/tree-ssa-loop-unswitch.c b/gcc/tree-ssa-loop-unswitch.c
> index 4328d6a..d6faa37 100644
> --- a/gcc/tree-ssa-loop-unswitch.c
> +++ b/gcc/tree-ssa-loop-unswitch.c
> @@ -32,6 +32,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "internal-fn.h"
>  #include "gimplify.h"
>  #include "tree-cfg.h"
> +#include "tree-ssa.h"
>  #include "tree-ssa-loop-niter.h"
>  #include "tree-ssa-loop.h"
>  #include "tree-into-ssa.h"
> @@ -139,6 +140,10 @@ tree_may_unswitch_on (basic_block bb, struct loop *loop)
>    /* Condition must be invariant.  */
>    FOR_EACH_SSA_TREE_OPERAND (use, stmt, iter, SSA_OP_USE)
>      {
> +      /* Unswitching on undefined values would introduce undefined
> +        behavior that the original program might never exercise.  */
> +      if (ssa_undefined_value_p (use, true))
> +       return NULL_TREE;
>        def = SSA_NAME_DEF_STMT (use);
>        def_bb = gimple_bb (def);
>        if (def_bb
>
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR67766] reorder return value copying from PARALLELs and CONCATs (was: Re: [PR64164] drop copyrename, integrate into expand)
  2015-10-09  5:36                                                                   ` [PR67766] reorder return value copying from PARALLELs and CONCATs " Alexandre Oliva
  2015-10-09  7:33                                                                     ` [PR67891] drop is_gimple_reg test from set_parm_rtl (was: [PR67766] reorder return value copying from PARALLELs and CONCATs) Alexandre Oliva
@ 2015-10-09  9:36                                                                     ` Richard Biener
  1 sibling, 0 replies; 127+ messages in thread
From: Richard Biener @ 2015-10-09  9:36 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Uroš Bizjak, Alan Lawrence, Jeff Law, James Greenhalgh,
	H.J. Lu, Segher Boessenkool, GCC Patches, Christophe Lyon,
	David Edelsohn, Eric Botcazou

On Fri, Oct 9, 2015 at 7:36 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> This fixes fallout from the PR64164 expander revamp.  On alpha, PARALLEL
> hard return values may be modeless, and this confuses the code that
> wants to copy the pseudo/s in the returned value to the return hard
> regs.
>
> It used to work because PARALLELs and CONCATs used to lead to DECL_RTL
> with the same mode, but now we try harder to create a pseudo or MEM with
> a reasonable mode.
>
> The solution was as simple as moving down the code that handled mode
> differences, so that PARALLELs and CONCATs are handled as they should.
> Since AFAICT they don't ever have to deal with mode promotion anyway, we
> should be fine with this simple change, that Uroš kindly tested with an
> alpha-linux-gnu regstrap.  I tested it myself on x86_64-linux-gnu and
> i686-linux-gnu.
>
> Ok to install?

Ok.

Richard.

>
> [PR67766] reorder handling of parallels, concats and promoted values in return
>
> From: Alexandre Oliva <aoliva@redhat.com>
>
> for  gcc/ChangeLog
>
>         PR middle-end/67766
>         * function.c (expand_function_end): Move return value
>         promotion past the handling of PARALLELs and CONCATs.
> ---
>  gcc/function.c |   24 ++++++++++++------------
>  1 file changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/gcc/function.c b/gcc/function.c
> index e76ba2b..d16d6d8 100644
> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -5446,18 +5446,6 @@ expand_function_end (void)
>                               decl_rtl);
>               shift_return_value (GET_MODE (decl_rtl), true, real_decl_rtl);
>             }
> -         /* If a named return value dumped decl_return to memory, then
> -            we may need to re-do the PROMOTE_MODE signed/unsigned
> -            extension.  */
> -         else if (GET_MODE (real_decl_rtl) != GET_MODE (decl_rtl))
> -           {
> -             int unsignedp = TYPE_UNSIGNED (TREE_TYPE (decl_result));
> -             promote_function_mode (TREE_TYPE (decl_result),
> -                                    GET_MODE (decl_rtl), &unsignedp,
> -                                    TREE_TYPE (current_function_decl), 1);
> -
> -             convert_move (real_decl_rtl, decl_rtl, unsignedp);
> -           }
>           else if (GET_CODE (real_decl_rtl) == PARALLEL)
>             {
>               /* If expand_function_start has created a PARALLEL for decl_rtl,
> @@ -5488,6 +5476,18 @@ expand_function_end (void)
>               emit_move_insn (tmp, decl_rtl);
>               emit_move_insn (real_decl_rtl, tmp);
>             }
> +         /* If a named return value dumped decl_return to memory, then
> +            we may need to re-do the PROMOTE_MODE signed/unsigned
> +            extension.  */
> +         else if (GET_MODE (real_decl_rtl) != GET_MODE (decl_rtl))
> +           {
> +             int unsignedp = TYPE_UNSIGNED (TREE_TYPE (decl_result));
> +             promote_function_mode (TREE_TYPE (decl_result),
> +                                    GET_MODE (decl_rtl), &unsignedp,
> +                                    TREE_TYPE (current_function_decl), 1);
> +
> +             convert_move (real_decl_rtl, decl_rtl, unsignedp);
> +           }
>           else
>             emit_move_insn (real_decl_rtl, decl_rtl);
>         }
>
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR67891] drop is_gimple_reg test from set_parm_rtl (was: [PR67766] reorder return value copying from PARALLELs and CONCATs)
  2015-10-09  7:33                                                                     ` [PR67891] drop is_gimple_reg test from set_parm_rtl (was: [PR67766] reorder return value copying from PARALLELs and CONCATs) Alexandre Oliva
@ 2015-10-09  9:40                                                                       ` Richard Biener
  2015-10-10 13:20                                                                         ` [PR67891] drop is_gimple_reg test from set_parm_rtl Alexandre Oliva
  0 siblings, 1 reply; 127+ messages in thread
From: Richard Biener @ 2015-10-09  9:40 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Uroš Bizjak, Alan Lawrence, Jeff Law, James Greenhalgh,
	H.J. Lu, Segher Boessenkool, GCC Patches, Christophe Lyon,
	David Edelsohn, Eric Botcazou

On Fri, Oct 9, 2015 at 9:33 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Oct  9, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
>
>> This fixes fallout from the PR64164 expander revamp.
>
>> Uroš kindly tested with an alpha-linux-gnu regstrap.
>
> The one regression he mentioned from that run was gcc.dg/pr43300.c.  The
> vector parameter there is handled by the emit_block_move case of
> assign_parms_setup_block.  Alas, emit_block_move marks the decl as
> addressable, which causes the subsequent is_gimple_reg test in
> set_parm_rtl to return false.  This causes us to call set_rtl with the
> parm decl, instead of its default def, and the latter would be required
> to store the RTL in the partition holding the default def.
>
> The good news is that we don't really need to call is_gimple_reg there,
> though; testing whether there is a default def in place is enough, and
> ssa_default_def will find the default def in spite of the parm's no
> longer passing is_gimple_reg, and it won't complain if given a decl that
> was never a gimple reg.
>
> So, I'm dropping the test.  Regstrapped on x86_64-linux-gnu and
> i686-linux-gnu.  Ok to install?

Ok.  Note that I think emit_block_move shouldn't mess with the addressable flag.

Thanks,
Richard.

>
> [PR67891] don't test is_gimple_reg after parm expansion
>
> From: Alexandre Oliva <aoliva@redhat.com>
>
> for  gcc/ChangeLog
>
>         PR middle-end/67891
>         * cfgexpand.c (set_parm_rtl): Drop is_gimple_reg test.
> ---
>  gcc/cfgexpand.c |    3 ---
>  1 file changed, 3 deletions(-)
>
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index 58e55d2..eaad859 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -1243,9 +1243,6 @@ set_parm_rtl (tree parm, rtx x)
>        record_alignment_for_reg_var (align);
>      }
>
> -  if (!is_gimple_reg (parm))
> -    return set_rtl (parm, x);
> -
>    tree ssa = ssa_default_def (cfun, parm);
>    if (!ssa)
>      return set_rtl (parm, x);
>
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR64164] drop copyrename, integrate into expand
  2015-10-07 22:37                                                                   ` Alexandre Oliva
  2015-10-08 10:00                                                                     ` Richard Biener
@ 2015-10-09 21:10                                                                     ` Jeff Law
  1 sibling, 0 replies; 127+ messages in thread
From: Jeff Law @ 2015-10-09 21:10 UTC (permalink / raw)
  To: Alexandre Oliva, Szabolcs Nagy
  Cc: Alan Lawrence, James Greenhalgh, H.J. Lu, Segher Boessenkool,
	Richard Biener, GCC Patches, Christophe Lyon, David Edelsohn,
	Eric Botcazou

On 10/07/2015 04:36 PM, Alexandre Oliva wrote:
> On Sep 29, 2015, Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
>
>> this commit
>
>> commit 33cc9081157a8c90460e4c0bdda2ac461a3822cc
>> Author: aoliva <aoliva@138bc75d-0d04-0410-961f-82ee72b054a4>
>> Date:   2015-09-27 09:02:00 +0000
>
>>      revert to assign_parms assignments using default defs
>>      ...
>
>> introduced a test failure on arm-none-eabi (using newlib, compiling
>> with -mthumb -march=armv8-a -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard ):
>
>> FAIL: gcc.target/arm/pr43920-2.c scan-assembler-times pop 2
>
> Thanks for the report.
>
> The problem here is that we don't allocate the pseudo assigned to
> <retval> to r0.  That's because we coalesce <retval> versions with
> another variable that crosses a function call.  We do that because
> uncprop brings these unrelated variables, that happen to contain the
> same -1 value we want to return, into the PHI node with the final
> <retval> value.
>
> We can't coalesce both start and end with <retval>, because start and
> end conflict, but by chance we try start first, and that succeeds.  If
> we tried end first (e.g., by giving it a higher coalesce priority,
> because fewer calls are crossed by its value in the path to the relevant
> edge), we could have got the coalesced variable assigned to r0, and that
> would enable us to optimize out the copy to r0 before return, and so
> merge the return-only basic block with other blocks.  But ATM we don't
> take the definition point or path to the edge into account when
> computing coalesce costs, so we can't deterministically do better for
> this testcase, and I'm not sure using this additional information would
> make it better overall.
>
> Compiling with -fno-tree-dominator-opts skips uncprop so that we don't
> even try to coalesce other variables with <retval>, so we get the code
> expected by the testcase.  But we obviously don't want to disable this
> optimization in general.
>
> Any other thoughts, anyone?
I keep coming back to my idea to avoid uncprop when doing so creates 
conflicts.  See c#22, c#24 & c#28.

It looks like I tossed my WIP around those ideas when you fixed 64164.

Jeff

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR67891] drop is_gimple_reg test from set_parm_rtl
  2015-10-09  9:40                                                                       ` Richard Biener
@ 2015-10-10 13:20                                                                         ` Alexandre Oliva
  2015-10-12 10:22                                                                           ` Richard Biener
  0 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-10-10 13:20 UTC (permalink / raw)
  To: Richard Biener
  Cc: Uroš Bizjak, Alan Lawrence, Jeff Law, James Greenhalgh,
	H.J. Lu, Segher Boessenkool, GCC Patches, Christophe Lyon,
	David Edelsohn, Eric Botcazou

On Oct  9, 2015, Richard Biener <richard.guenther@gmail.com> wrote:

> Ok.  Note that I think emit_block_move shouldn't mess with the addressable flag.

I have successfully tested a patch that stops it from doing so,
reverting https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49429#c11 but
according to bugs 49429 and 49454, it looks like removing it would mess
with escape analysis introduced in r175063 for bug 44194.  The thread
that introduces the mark_addressable calls suggests some discomfort with
this solution, and even a suggestion that the markings should be
deferred past the end of expand, but in the end there was agreement to
go with it.  https://gcc.gnu.org/ml/gcc-patches/2011-06/msg01746.html

I'm leaving it alone, since I can't reasonably test on the platforms
where the problems showed up.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR67891] drop is_gimple_reg test from set_parm_rtl
  2015-10-10 13:20                                                                         ` [PR67891] drop is_gimple_reg test from set_parm_rtl Alexandre Oliva
@ 2015-10-12 10:22                                                                           ` Richard Biener
  2015-10-14  3:25                                                                             ` Alexandre Oliva
  0 siblings, 1 reply; 127+ messages in thread
From: Richard Biener @ 2015-10-12 10:22 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Uroš Bizjak, Alan Lawrence, Jeff Law, James Greenhalgh,
	H.J. Lu, Segher Boessenkool, GCC Patches, Christophe Lyon,
	David Edelsohn, Eric Botcazou

On Sat, Oct 10, 2015 at 3:16 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Oct  9, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
>
>> Ok.  Note that I think emit_block_move shouldn't mess with the addressable flag.
>
> I have successfully tested a patch that stops it from doing so,
> reverting https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49429#c11 but
> according to bugs 49429 and 49454, it looks like removing it would mess
> with escape analysis introduced in r175063 for bug 44194.  The thread
> that introduces the mark_addressable calls suggests some discomfort with
> this solution, and even a suggestion that the markings should be
> deferred past the end of expand, but in the end there was agreement to
> go with it.  https://gcc.gnu.org/ml/gcc-patches/2011-06/msg01746.html

Aww, indeed.  Of course the issue is that we don't track pointers to the
stack introduced during RTL properly.

> I'm leaving it alone, since I can't reasonably test on the platforms
> where the problems showed up.

Yeah.

Thanks for checking.  Might want to add a comment before that
addressable setting now that you've done the archeology.

Richard.


^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR67891] drop is_gimple_reg test from set_parm_rtl
  2015-10-12 10:22                                                                           ` Richard Biener
@ 2015-10-14  3:25                                                                             ` Alexandre Oliva
  2015-10-14  9:28                                                                               ` Richard Biener
  0 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-10-14  3:25 UTC (permalink / raw)
  To: Richard Biener
  Cc: Uroš Bizjak, Alan Lawrence, Jeff Law, James Greenhalgh,
	H.J. Lu, Segher Boessenkool, GCC Patches, Christophe Lyon,
	David Edelsohn, Eric Botcazou

On Oct 12, 2015, Richard Biener <richard.guenther@gmail.com> wrote:

> On Sat, Oct 10, 2015 at 3:16 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
>> On Oct  9, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
>> 
>>> Ok.  Note that I think emit_block_move shouldn't mess with the addressable flag.
>> 
>> I have successfully tested a patch that stops it from doing so,
>> reverting https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49429#c11 but
>> according to bugs 49429 and 49454, it looks like removing it would mess
>> with escape analysis introduced in r175063 for bug 44194.  The thread
>> that introduces the mark_addressable calls suggests some discomfort with
>> this solution, and even a suggestion that the markings should be
>> deferred past the end of expand, but in the end there was agreement to
>> go with it.  https://gcc.gnu.org/ml/gcc-patches/2011-06/msg01746.html

> Aww, indeed.  Of course the issue is that we don't track pointers to the
> stack introduced during RTL properly.

> Thanks for checking.  Might want to add a comment before that
> addressable setting now that you've done the archeology.

I decided to give the following approach a try instead.  The following
patch was regstrapped on x86_64-linux-gnu and i686-linux-gnu.
Ok to install?

Would anyone with access to hpux (pa and ia64 are both affected) give it
a spin?


defer mark_addressable calls during expand till the end of expand

From: Alexandre Oliva <aoliva@redhat.com>

for  gcc/ChangeLog

	* gimple-expr.c: Include hash-set.h and rtl.h.
	(mark_addressable_queue): New var.
	(mark_addressable): Factor actual marking into...
	(mark_addressable_1): ... this.  Queue it up during expand.
	(mark_addressable_2): New.
	(flush_mark_addressable_queue): New.
	* gimple-expr.h (flush_mark_addressable_queue): Declare.
	* cfgexpand.c: Include gimple-expr.h.
	(pass_expand::execute): Flush mark_addressable queue.
---
 gcc/cfgexpand.c   |    3 +++
 gcc/gimple-expr.c |   50 ++++++++++++++++++++++++++++++++++++++++++++++++--
 gcc/gimple-expr.h |    1 +
 3 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index eaad859..a362e17 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -51,6 +51,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "internal-fn.h"
 #include "tree-eh.h"
 #include "gimple-iterator.h"
+#include "gimple-expr.h"
 #include "gimple-walk.h"
 #include "cgraph.h"
 #include "tree-cfg.h"
@@ -6373,6 +6374,8 @@ pass_expand::execute (function *fun)
   /* We're done expanding trees to RTL.  */
   currently_expanding_to_rtl = 0;
 
+  flush_mark_addressable_queue ();
+
   FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR_FOR_FN (fun)->next_bb,
 		  EXIT_BLOCK_PTR_FOR_FN (fun), next_bb)
     {
diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
index 2a6ba1a..db249a3 100644
--- a/gcc/gimple-expr.c
+++ b/gcc/gimple-expr.c
@@ -35,6 +35,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimplify.h"
 #include "stor-layout.h"
 #include "demangle.h"
+#include "hash-set.h"
+#include "rtl.h"
 
 /* ----- Type related -----  */
 
@@ -823,6 +825,50 @@ is_gimple_mem_ref_addr (tree t)
 		  || decl_address_invariant_p (TREE_OPERAND (t, 0)))));
 }
 
+/* Hold trees marked addressable during expand.  */
+
+static hash_set<tree> *mark_addressable_queue;
+
+/* Mark X as addressable or queue it up if called during expand.  */
+
+static void
+mark_addressable_1 (tree x)
+{
+  if (!currently_expanding_to_rtl)
+    {
+      TREE_ADDRESSABLE (x) = 1;
+      return;
+    }
+
+  if (!mark_addressable_queue)
+    mark_addressable_queue = new hash_set<tree>();
+  mark_addressable_queue->add (x);
+}
+
+/* Adaptor for mark_addressable_1 for use in hash_set traversal.  */
+
+bool
+mark_addressable_2 (tree const &x, void * ATTRIBUTE_UNUSED = NULL)
+{
+  mark_addressable_1 (x);
+  return false;
+}
+
+/* Mark all queued trees as addressable, and empty the queue.  To be
+   called right after clearing CURRENTLY_EXPANDING_TO_RTL.  */
+
+void
+flush_mark_addressable_queue ()
+{
+  gcc_assert (!currently_expanding_to_rtl);
+  if (mark_addressable_queue)
+    {
+      mark_addressable_queue->traverse<void*, mark_addressable_2> (NULL);
+      delete mark_addressable_queue;
+      mark_addressable_queue = NULL;
+    }
+}
+
 /* Mark X addressable.  Unlike the langhook we expect X to be in gimple
    form and we don't do any syntax checking.  */
 
@@ -838,7 +884,7 @@ mark_addressable (tree x)
       && TREE_CODE (x) != PARM_DECL
       && TREE_CODE (x) != RESULT_DECL)
     return;
-  TREE_ADDRESSABLE (x) = 1;
+  mark_addressable_1 (x);
 
   /* Also mark the artificial SSA_NAME that points to the partition of X.  */
   if (TREE_CODE (x) == VAR_DECL
@@ -849,7 +895,7 @@ mark_addressable (tree x)
     {
       tree *namep = cfun->gimple_df->decls_to_pointers->get (x);
       if (namep)
-	TREE_ADDRESSABLE (*namep) = 1;
+	mark_addressable_1 (*namep);
     }
 }
 
diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
index 3d1c89f..2917d2752c 100644
--- a/gcc/gimple-expr.h
+++ b/gcc/gimple-expr.h
@@ -52,6 +52,7 @@ extern bool is_gimple_asm_val (tree);
 extern bool is_gimple_min_lval (tree);
 extern bool is_gimple_call_addr (tree);
 extern bool is_gimple_mem_ref_addr (tree);
+extern void flush_mark_addressable_queue (void);
 extern void mark_addressable (tree);
 extern bool is_gimple_reg_rhs (tree);
 


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR67891] drop is_gimple_reg test from set_parm_rtl
  2015-10-14  3:25                                                                             ` Alexandre Oliva
@ 2015-10-14  9:28                                                                               ` Richard Biener
  2015-11-03  1:11                                                                                 ` Alexandre Oliva
  0 siblings, 1 reply; 127+ messages in thread
From: Richard Biener @ 2015-10-14  9:28 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Uroš Bizjak, Alan Lawrence, Jeff Law, James Greenhalgh,
	H.J. Lu, Segher Boessenkool, GCC Patches, Christophe Lyon,
	David Edelsohn, Eric Botcazou

On Wed, Oct 14, 2015 at 5:25 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Oct 12, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
>
>> On Sat, Oct 10, 2015 at 3:16 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
>>> On Oct  9, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
>>>
>>>> Ok.  Note that I think emit_block_move shouldn't mess with the addressable flag.
>>>
>>> I have successfully tested a patch that stops it from doing so,
>>> reverting https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49429#c11 but
>>> according to bugs 49429 and 49454, it looks like removing it would mess
>>> with escape analysis introduced in r175063 for bug 44194.  The thread
>>> that introduces the mark_addressable calls suggests some discomfort with
>>> this solution, and even a suggestion that the markings should be
>>> deferred past the end of expand, but in the end there was agreement to
>>> go with it.  https://gcc.gnu.org/ml/gcc-patches/2011-06/msg01746.html
>
>> Aww, indeed.  Of course the issue is that we don't track pointers to the
>> stack introduced during RTL properly.
>
>> Thanks for checking.  Might want to add a comment before that
>> addressable setting now that you've done the archeology.
>
> I decided to give the following approach a try instead.  The following
> patch was regstrapped on x86_64-linux-gnu and i686-linux-gnu.
> Ok to install?

It looks ok to me but lacks a comment in mark_addressable_1 why we
do this queueing when currently expanding to RTL.

Richard.


^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR67891] drop is_gimple_reg test from set_parm_rtl
  2015-10-14  9:28                                                                               ` Richard Biener
@ 2015-11-03  1:11                                                                                 ` Alexandre Oliva
  2015-11-03  3:14                                                                                   ` Jeff Law
  0 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-11-03  1:11 UTC (permalink / raw)
  To: Richard Biener
  Cc: Uroš Bizjak, Alan Lawrence, Jeff Law, James Greenhalgh,
	H.J. Lu, Segher Boessenkool, GCC Patches, Christophe Lyon,
	David Edelsohn, Eric Botcazou

On Oct 14, 2015, Richard Biener <richard.guenther@gmail.com> wrote:

> On Wed, Oct 14, 2015 at 5:25 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
>> On Oct 12, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
>> 
>>> On Sat, Oct 10, 2015 at 3:16 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
>>>> On Oct  9, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
>>>> 
>>>>> Ok.  Note that I think emit_block_move shouldn't mess with the addressable flag.
>>>> 
>>>> I have successfully tested a patch that stops it from doing so,
>>>> reverting https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49429#c11 but
>>>> according to bugs 49429 and 49454, it looks like removing it would mess
>>>> with escape analysis introduced in r175063 for bug 44194.  The thread
>>>> that introduces the mark_addressable calls suggests some discomfort with
>>>> this solution, and even a suggestion that the markings should be
>>>> deferred past the end of expand, but in the end there was agreement to
>>>> go with it.  https://gcc.gnu.org/ml/gcc-patches/2011-06/msg01746.html
>> 
>>> Aww, indeed.  Of course the issue is that we don't track pointers to the
>>> stack introduced during RTL properly.
>> 
>>> Thanks for checking.  Might want to add a comment before that
>>> addressable setting now that you've done the archeology.
>> 
>> I decided to give the following approach a try instead.  The following
>> patch was regstrapped on x86_64-linux-gnu and i686-linux-gnu.
>> Ok to install?

> It looks ok to me but lacks a comment in mark_addressable_1 why we
> do this queueing when currently expanding to RTL.

>> +/* Mark X as addressable or queue it up if called during expand.  */
>> +
>> +static void
>> +mark_addressable_1 (tree x)

How about this:

/* Mark X as addressable or queue it up if called during expand.  We
   don't want to apply it immediately during expand because decls are
   made addressable at that point due to RTL-only concerns, such as
   uses of memcpy for block moves, and TREE_ADDRESSABLE changes
   is_gimple_reg, which might make it seem like a variable that used
   to be a gimple_reg shouldn't have been an SSA name.  So we queue up
   this flag setting and only apply it when we're done with GIMPLE and
   only RTL issues matter.  */
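
To illustrate the queueing that comment describes, here is a minimal
standalone sketch, not the GCC code itself.  Node, expanding, pending,
mark, flush and demo are hypothetical stand-ins for tree,
currently_expanding_to_rtl, mark_addressable_queue, mark_addressable_1,
flush_mark_addressable_queue and a caller, and std::unordered_set is
assumed in place of GCC's hash_set:

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical stand-in for a tree node; only the flag matters here.
struct Node {
  bool addressable = false;
};

// Hypothetical stand-in for currently_expanding_to_rtl.
static bool expanding = false;

// Nodes whose flag update was requested during the "expand" phase.
static std::unordered_set<Node *> pending;

// Set the flag right away, or queue the request while expanding.
void mark(Node *n) {
  if (!expanding) {
    n->addressable = true;
    return;
  }
  pending.insert(n);
}

// Apply all queued requests; only valid once expansion is over.
void flush() {
  assert(!expanding);
  for (Node *n : pending)
    n->addressable = true;
  pending.clear();
}

// Walk through the phases: returns true if the flag stayed unset
// during expansion and became set after the flush.
bool demo() {
  Node n;
  expanding = true;
  mark(&n);
  bool hidden_during_expand = !n.addressable;
  expanding = false;
  flush();
  return hidden_during_expand && n.addressable;
}
```

The point of the set is that code still running on the GIMPLE view
during expand never observes the flag change, while repeated requests
for the same node collapse into one entry.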

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR67891] drop is_gimple_reg test from set_parm_rtl
  2015-11-03  1:11                                                                                 ` Alexandre Oliva
@ 2015-11-03  3:14                                                                                   ` Jeff Law
  2015-11-03  4:29                                                                                     ` Alexandre Oliva
  0 siblings, 1 reply; 127+ messages in thread
From: Jeff Law @ 2015-11-03  3:14 UTC (permalink / raw)
  To: Alexandre Oliva, Richard Biener
  Cc: Uroš Bizjak, Alan Lawrence, James Greenhalgh, H.J. Lu,
	Segher Boessenkool, GCC Patches, Christophe Lyon, David Edelsohn,
	Eric Botcazou

On 11/02/2015 06:11 PM, Alexandre Oliva wrote:
> On Oct 14, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
>
>> On Wed, Oct 14, 2015 at 5:25 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
>>> On Oct 12, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
>>>
>>>> On Sat, Oct 10, 2015 at 3:16 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
>>>>> On Oct  9, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
>>>>>
>>>>>> Ok.  Note that I think emit_block_move shouldn't mess with the addressable flag.
>>>>>
>>>>> I have successfully tested a patch that stops it from doing so,
>>>>> reverting https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49429#c11 but
>>>>> according to bugs 49429 and 49454, it looks like removing it would mess
>>>>> with escape analysis introduced in r175063 for bug 44194.  The thread
>>>>> that introduces the mark_addressable calls suggests some discomfort with
>>>>> this solution, and even a suggestion that the markings should be
>>>>> deferred past the end of expand, but in the end there was agreement to
>>>>> go with it.  https://gcc.gnu.org/ml/gcc-patches/2011-06/msg01746.html
>>>
>>>> Aww, indeed.  Of course the issue is that we don't track pointers to the
>>>> stack introduced during RTL properly.
>>>
>>>> Thanks for checking.  Might want to add a comment before that
>>>> addressable setting now that you've done the archeology.
>>>
>>> I decided to give the following approach a try instead.  The following
>>> patch was regstrapped on x86_64-linux-gnu and i686-linux-gnu.
>>> Ok to install?
>
>> It looks ok to me but lacks a comment in mark_addressable_1 why we
>> do this queueing when currently expanding to RTL.
>
>>> +/* Mark X as addressable or queue it up if called during expand.  */
>>> +
>>> +static void
>>> +mark_addressable_1 (tree x)
>
> How about this:
>
> /* Mark X as addressable or queue it up if called during expand.  We
>     don't want to apply it immediately during expand because decls are
>     made addressable at that point due to RTL-only concerns, such as
>     uses of memcpy for block moves, and TREE_ADDRESSABLE changes
>     is_gimple_reg, which might make it seem like a variable that used
>     to be a gimple_reg shouldn't have been an SSA name.  So we queue up
>     this flag setting and only apply it when we're done with GIMPLE and
>     only RTL issues matter.  */
Sounds good.
jeff

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR67891] drop is_gimple_reg test from set_parm_rtl
  2015-11-03  3:14                                                                                   ` Jeff Law
@ 2015-11-03  4:29                                                                                     ` Alexandre Oliva
  2022-10-17 12:08                                                                                       ` Tag 'gcc/gimple-expr.cc:mark_addressable_2' as 'static' (was: [PR67891] drop is_gimple_reg test from set_parm_rtl) Thomas Schwinge
  0 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-11-03  4:29 UTC (permalink / raw)
  To: Jeff Law
  Cc: Richard Biener, Uroš Bizjak, Alan Lawrence,
	James Greenhalgh, H.J. Lu, Segher Boessenkool, GCC Patches,
	Christophe Lyon, David Edelsohn, Eric Botcazou

On Nov  3, 2015, Jeff Law <law@redhat.com> wrote:

> On 11/02/2015 06:11 PM, Alexandre Oliva wrote:
>> On Oct 14, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
>>> It looks ok to me but lacks a comment in mark_addressable_1 why we
>>> do this queueing when currently expanding to RTL.
>> 
>>>> +/* Mark X as addressable or queue it up if called during expand.  */
>>>> +
>>>> +static void
>>>> +mark_addressable_1 (tree x)
>> 
>> How about this:
>> 
>> /* Mark X as addressable or queue it up if called during expand.  We
>> don't want to apply it immediately during expand because decls are
>> made addressable at that point due to RTL-only concerns, such as
>> uses of memcpy for block moves, and TREE_ADDRESSABLE changes
>> is_gimple_reg, which might make it seem like a variable that used
>> to be a gimple_reg shouldn't have been an SSA name.  So we queue up
>> this flag setting and only apply it when we're done with GIMPLE and
>> only RTL issues matter.  */
> Sounds good.

Thanks, here's the patch as just installed.

for  gcc/ChangeLog

	* gimple-expr.c: Include hash-set.h and rtl.h.
	(mark_addressable_queue): New var.
	(mark_addressable): Factor actual marking into...
	(mark_addressable_1): ... this.  Queue it up during expand.
	(mark_addressable_2): New.
	(flush_mark_addressable_queue): New.
	* gimple-expr.h (flush_mark_addressable_queue): Declare.
	* cfgexpand.c: Include gimple-expr.h.
	(pass_expand::execute): Flush mark_addressable queue.
---
 gcc/cfgexpand.c   |    3 +++
 gcc/gimple-expr.c |   57 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 gcc/gimple-expr.h |    1 +
 3 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index b59ea02..bfbc958 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -51,6 +51,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "internal-fn.h"
 #include "tree-eh.h"
 #include "gimple-iterator.h"
+#include "gimple-expr.h"
 #include "gimple-walk.h"
 #include "tree-cfg.h"
 #include "tree-dfa.h"
@@ -6368,6 +6369,8 @@ pass_expand::execute (function *fun)
   /* We're done expanding trees to RTL.  */
   currently_expanding_to_rtl = 0;
 
+  flush_mark_addressable_queue ();
+
   FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR_FOR_FN (fun)->next_bb,
 		  EXIT_BLOCK_PTR_FOR_FN (fun), next_bb)
     {
diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
index 44749b81..f5f9e87 100644
--- a/gcc/gimple-expr.c
+++ b/gcc/gimple-expr.c
@@ -32,6 +32,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimplify.h"
 #include "stor-layout.h"
 #include "demangle.h"
+#include "hash-set.h"
+#include "rtl.h"
 
 /* ----- Type related -----  */
 
@@ -817,6 +819,57 @@ is_gimple_mem_ref_addr (tree t)
 		  || decl_address_invariant_p (TREE_OPERAND (t, 0)))));
 }
 
+/* Hold trees marked addressable during expand.  */
+
+static hash_set<tree> *mark_addressable_queue;
+
+/* Mark X as addressable or queue it up if called during expand.  We
+   don't want to apply it immediately during expand because decls are
+   made addressable at that point due to RTL-only concerns, such as
+   uses of memcpy for block moves, and TREE_ADDRESSABLE changes
+   is_gimple_reg, which might make it seem like a variable that used
+   to be a gimple_reg shouldn't have been an SSA name.  So we queue up
+   this flag setting and only apply it when we're done with GIMPLE and
+   only RTL issues matter.  */
+
+static void
+mark_addressable_1 (tree x)
+{
+  if (!currently_expanding_to_rtl)
+    {
+      TREE_ADDRESSABLE (x) = 1;
+      return;
+    }
+
+  if (!mark_addressable_queue)
+    mark_addressable_queue = new hash_set<tree>();
+  mark_addressable_queue->add (x);
+}
+
+/* Adaptor for mark_addressable_1 for use in hash_set traversal.  */
+
+bool
+mark_addressable_2 (tree const &x, void * ATTRIBUTE_UNUSED = NULL)
+{
+  mark_addressable_1 (x);
+  return false;
+}
+
+/* Mark all queued trees as addressable, and empty the queue.  To be
+   called right after clearing CURRENTLY_EXPANDING_TO_RTL.  */
+
+void
+flush_mark_addressable_queue ()
+{
+  gcc_assert (!currently_expanding_to_rtl);
+  if (mark_addressable_queue)
+    {
+      mark_addressable_queue->traverse<void*, mark_addressable_2> (NULL);
+      delete mark_addressable_queue;
+      mark_addressable_queue = NULL;
+    }
+}
+
 /* Mark X addressable.  Unlike the langhook we expect X to be in gimple
    form and we don't do any syntax checking.  */
 
@@ -832,7 +885,7 @@ mark_addressable (tree x)
       && TREE_CODE (x) != PARM_DECL
       && TREE_CODE (x) != RESULT_DECL)
     return;
-  TREE_ADDRESSABLE (x) = 1;
+  mark_addressable_1 (x);
 
   /* Also mark the artificial SSA_NAME that points to the partition of X.  */
   if (TREE_CODE (x) == VAR_DECL
@@ -843,7 +896,7 @@ mark_addressable (tree x)
     {
       tree *namep = cfun->gimple_df->decls_to_pointers->get (x);
       if (namep)
-	TREE_ADDRESSABLE (*namep) = 1;
+	mark_addressable_1 (*namep);
     }
 }
 
diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
index 3d1c89f..2917d2752c 100644
--- a/gcc/gimple-expr.h
+++ b/gcc/gimple-expr.h
@@ -52,6 +52,7 @@ extern bool is_gimple_asm_val (tree);
 extern bool is_gimple_min_lval (tree);
 extern bool is_gimple_call_addr (tree);
 extern bool is_gimple_mem_ref_addr (tree);
+extern void flush_mark_addressable_queue (void);
 extern void mark_addressable (tree);
 extern bool is_gimple_reg_rhs (tree);
 


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR64164] drop copyrename, integrate into expand
  2015-09-23 20:44                                                               ` Alexandre Oliva
  2015-09-25 11:39                                                                 ` Richard Biener
  2015-09-29 11:31                                                                 ` [PR64164] drop copyrename, integrate into expand Szabolcs Nagy
@ 2015-11-05  5:09                                                                 ` Alexandre Oliva
  2015-11-05 13:44                                                                   ` Richard Biener
  2015-11-10 15:31                                                                   ` Alan Lawrence
  2 siblings, 2 replies; 127+ messages in thread
From: Alexandre Oliva @ 2015-11-05  5:09 UTC (permalink / raw)
  To: Alan Lawrence
  Cc: Jeff Law, James Greenhalgh, H.J. Lu, Segher Boessenkool,
	Richard Biener, GCC Patches, Christophe Lyon, David Edelsohn,
	Eric Botcazou

On Sep 23, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:

> @@ -2982,38 +2887,39 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
[snip]
> +      if (GET_CODE (reg) != CONCAT)
> +        stack_parm = reg;
> +      else
> +        /* This will use or allocate a stack slot that we'd rather
> +           avoid.  FIXME: Could we avoid it in more cases?  */
> +        target_reg = reg;

It turns out that we can, and that helps fix PR67753.  In the end, I
used the ABI-reserved stack slot if there is one, but just allocating
an unsplit complex pseudo fixes all remaining cases that used to
require the allocation of a stack slot.  Yay!

As for PR67753 proper, we emitted the store of the PARALLEL entry_parm
into the stack parm only in the conversion seq, which was ultimately
emitted after the copy from stack_parm to target_reg that was supposed
to copy the value originally in entry_parm.  So we copied an
uninitialized stack slot, and the subsequent store in the conversion seq
was optimized out as dead.

This caused a number of regressions on hppa-linux-gnu.  The fix for this
is to arrange for the copy to target_reg to be emitted in the conversion
seq if the copy to stack_parm was.  I can't determine whether this fixes
all reported regressions, but from visual inspection of the generated
code I'm pretty sure it fixes at least gcc.c-torture/execute/pr38969.c.
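
The ordering problem can be modelled with a small standalone sketch.
It is only an illustration under simplifying assumptions: State, Insn
and run are hypothetical names, the two vectors stand in for the normal
and conversion insn sequences, and -1 stands for an uninitialized
value.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical machine state: an incoming argument, a stack slot and
// the final register.  -1 stands for "uninitialized".
struct State {
  int entry_parm = 42;
  int stack_parm = -1;
  int target_reg = -1;
};

using Insn = std::function<void(State &)>;

// Run the normal sequence first, then the conversion sequence, which
// mirrors the emission order described above.
int run(bool copy_in_conversion_seq) {
  std::vector<Insn> normal_seq, conversion_seq;

  // The store from entry_parm to the stack slot is deferred.
  conversion_seq.push_back([](State &s) { s.stack_parm = s.entry_parm; });

  Insn copy = [](State &s) { s.target_reg = s.stack_parm; };
  // The fix: emit the copy in the same sequence as the store it reads.
  (copy_in_conversion_seq ? conversion_seq : normal_seq).push_back(copy);

  State s;
  for (auto &insn : normal_seq) insn(s);
  for (auto &insn : conversion_seq) insn(s);
  return s.target_reg;
}
```

With the copy left in the normal sequence, run(false) reads the slot
before it is written; run(true) places the copy after the deferred
store and sees the incoming value.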


When we do NOT have an ABI-reserved stack slot, the store of the
PARALLEL entry_parm into the intermediate pseudo doesn't need to go in
the conversion seq (emit_group_store from a PARALLEL to a pseudo only
uses registers, according to another comment in function.c), so I've
simplified that case.


This was regstrapped on x86_64-linux-gnu, i686-linux-gnu,
ppc64-linux-gnu, ppc64el-linux-gnu, and cross-build-tested for all
targets for which I've tested the earlier patches in the patchset.
Ok to install?



[PR67753] fix copy of PARALLEL entry_parm to CONCAT target_reg

From: Alexandre Oliva <aoliva@redhat.com>

In assign_parm_setup_block, the copy of args in PARALLELs from
entry_parm to stack_parm is deferred to the parm conversion insn seq,
but the copy from stack_parm to target_reg was inserted in the normal
copy seq, which is executed before the conversion insn seq.  Oops.

We could do away with the need for an actual stack_parm in general,
which would have avoided the need for emitting the copy to target_reg
in the conversion seq, but at least on pa, because copies between SI
and SF modes have to go through the stack, it seems like using the
reserved stack slot is beneficial, so I put in logic to use a
pre-reserved stack slot when there is one, and emit the copy to
target_reg in the conversion seq if stack_parm was set up there.

for  gcc/ChangeLog

	PR rtl-optimization/67753
	PR rtl-optimization/64164
	* function.c (assign_parm_setup_block): Avoid allocating a
	stack slot if we don't have an ABI-reserved one.  Emit the
	copy to target_reg in the conversion seq if the copy from
	entry_parm is in it too.  Don't use the conversion seq to copy
	a PARALLEL to a REG or a CONCAT.
---
 gcc/function.c |   39 ++++++++++++++++++++++++++++++++++-----
 1 file changed, 34 insertions(+), 5 deletions(-)

diff --git a/gcc/function.c b/gcc/function.c
index aaf49a4..156c72b 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -2879,6 +2879,7 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
   rtx entry_parm = data->entry_parm;
   rtx stack_parm = data->stack_parm;
   rtx target_reg = NULL_RTX;
+  bool in_conversion_seq = false;
   HOST_WIDE_INT size;
   HOST_WIDE_INT size_stored;
 
@@ -2895,9 +2896,23 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
       if (GET_CODE (reg) != CONCAT)
 	stack_parm = reg;
       else
-	/* This will use or allocate a stack slot that we'd rather
-	   avoid.  FIXME: Could we avoid it in more cases?  */
-	target_reg = reg;
+	{
+	  target_reg = reg;
+	  /* Avoid allocating a stack slot, if there isn't one
+	     preallocated by the ABI.  It might seem like we should
+	     always prefer a pseudo, but converting between
+	     floating-point and integer modes goes through the stack
+	     on various machines, so it's better to use the reserved
+	     stack slot than to risk wasting it and allocating more
+	     for the conversion.  */
+	  if (stack_parm == NULL_RTX)
+	    {
+	      int save = generating_concat_p;
+	      generating_concat_p = 0;
+	      stack_parm = gen_reg_rtx (mode);
+	      generating_concat_p = save;
+	    }
+	}
       data->stack_parm = NULL;
     }
 
@@ -2938,7 +2953,9 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
       mem = validize_mem (copy_rtx (stack_parm));
 
       /* Handle values in multiple non-contiguous locations.  */
-      if (GET_CODE (entry_parm) == PARALLEL)
+      if (GET_CODE (entry_parm) == PARALLEL && !MEM_P (mem))
+	emit_group_store (mem, entry_parm, data->passed_type, size);
+      else if (GET_CODE (entry_parm) == PARALLEL)
 	{
 	  push_to_sequence2 (all->first_conversion_insn,
 			     all->last_conversion_insn);
@@ -2946,6 +2963,7 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
 	  all->first_conversion_insn = get_insns ();
 	  all->last_conversion_insn = get_last_insn ();
 	  end_sequence ();
+	  in_conversion_seq = true;
 	}
 
       else if (size == 0)
@@ -3025,11 +3043,22 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
       all->first_conversion_insn = get_insns ();
       all->last_conversion_insn = get_last_insn ();
       end_sequence ();
+      in_conversion_seq = true;
     }
 
   if (target_reg)
     {
-      emit_move_insn (target_reg, stack_parm);
+      if (!in_conversion_seq)
+	emit_move_insn (target_reg, stack_parm);
+      else
+	{
+	  push_to_sequence2 (all->first_conversion_insn,
+			     all->last_conversion_insn);
+	  emit_move_insn (target_reg, stack_parm);
+	  all->first_conversion_insn = get_insns ();
+	  all->last_conversion_insn = get_last_insn ();
+	  end_sequence ();
+	}
       stack_parm = target_reg;
     }
 


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR64164] drop copyrename, integrate into expand
  2015-11-05  5:09                                                                 ` Alexandre Oliva
@ 2015-11-05 13:44                                                                   ` Richard Biener
  2015-11-10 15:31                                                                   ` Alan Lawrence
  1 sibling, 0 replies; 127+ messages in thread
From: Richard Biener @ 2015-11-05 13:44 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Alan Lawrence, Jeff Law, James Greenhalgh, H.J. Lu,
	Segher Boessenkool, GCC Patches, Christophe Lyon, David Edelsohn,
	Eric Botcazou

On Thu, Nov 5, 2015 at 6:08 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Sep 23, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
>
>> @@ -2982,38 +2887,39 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
> [snip]
>> +      if (GET_CODE (reg) != CONCAT)
>> +        stack_parm = reg;
>> +      else
>> +        /* This will use or allocate a stack slot that we'd rather
>> +           avoid.  FIXME: Could we avoid it in more cases?  */
>> +        target_reg = reg;
>
> It turns out that we can, and that helps fix PR67753.  I ended up
> using the ABI-reserved stack slot if there is one, but just
> allocating an unsplit complex pseudo fixes all remaining cases that used
> to require the allocation of a stack slot.  Yay!
>
> As for pr67753 proper, we emitted the store of the PARALLEL entry_parm
> into the stack parm only in the conversion seq, which was ultimately
> emitted after the copy from stack_parm to target_reg that was supposed
> to copy the value originally in entry_parm.  So we copied an
> uninitialized stack slot, and the subsequent store in the conversion seq
> was optimized out as dead.
>
> This caused a number of regressions on hppa-linux-gnu.  The fix for this
> is to arrange for the copy to target_reg to be emitted in the conversion
> seq if the copy to stack_parm was.  I can't determine whether this fixes
> all reported regressions, but from visual inspection of the generated
> code I'm pretty sure it fixes at least gcc.c-torture/execute/pr38969.c.
>
>
> When we do NOT have an ABI-reserved stack slot, the store of the
> PARALLEL entry_parm into the intermediate pseudo doesn't need to go in
> the conversion seq (emit_group_store from a PARALLEL to a pseudo only
> uses registers, according to another comment in function.c), so I've
> simplified that case.
>
>
> This was regstrapped on x86_64-linux-gnu, i686-linux-gnu,
> ppc64-linux-gnu, ppc64el-linux-gnu, and cross-build-tested for all
> targets for which I've tested the earlier patches in the patchset.
> Ok to install?

Ok.

Thanks,
Richard.

>
>
> [PR67753] fix copy of PARALLEL entry_parm to CONCAT target_reg
>
> From: Alexandre Oliva <aoliva@redhat.com>
>
> In assign_parm_setup_block, the copy of args in PARALLELs from
> entry_parm to stack_parm is deferred to the parm conversion insn seq,
> but the copy from stack_parm to target_reg was inserted in the normal
> copy seq, that is executed before the conversion insn seq.  Oops.
>
> We could do away with the need for an actual stack_parm in general,
> which would have avoided the need for emitting the copy to target_reg
> in the conversion seq, but at least on pa, due to the need for stack
> to copy between SI and SF modes, it seems like using the reserved
> stack slot is beneficial, so I put in logic to use a pre-reserved
> stack slot when there is one, and emit the copy to target_reg in the
> conversion seq if stack_parm was set up there.
>
> for  gcc/ChangeLog
>
>         PR rtl-optimization/67753
>         PR rtl-optimization/64164
>         * function.c (assign_parm_setup_block): Avoid allocating a
>         stack slot if we don't have an ABI-reserved one.  Emit the
>         copy to target_reg in the conversion seq if the copy from
>         entry_parm is in it too.  Don't use the conversion seq to copy
>         a PARALLEL to a REG or a CONCAT.
> ---
>  gcc/function.c |   39 ++++++++++++++++++++++++++++++++++-----
>  1 file changed, 34 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/function.c b/gcc/function.c
> index aaf49a4..156c72b 100644
> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -2879,6 +2879,7 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>    rtx entry_parm = data->entry_parm;
>    rtx stack_parm = data->stack_parm;
>    rtx target_reg = NULL_RTX;
> +  bool in_conversion_seq = false;
>    HOST_WIDE_INT size;
>    HOST_WIDE_INT size_stored;
>
> @@ -2895,9 +2896,23 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>        if (GET_CODE (reg) != CONCAT)
>         stack_parm = reg;
>        else
> -       /* This will use or allocate a stack slot that we'd rather
> -          avoid.  FIXME: Could we avoid it in more cases?  */
> -       target_reg = reg;
> +       {
> +         target_reg = reg;
> +         /* Avoid allocating a stack slot, if there isn't one
> +            preallocated by the ABI.  It might seem like we should
> +            always prefer a pseudo, but converting between
> +            floating-point and integer modes goes through the stack
> +            on various machines, so it's better to use the reserved
> +            stack slot than to risk wasting it and allocating more
> +            for the conversion.  */
> +         if (stack_parm == NULL_RTX)
> +           {
> +             int save = generating_concat_p;
> +             generating_concat_p = 0;
> +             stack_parm = gen_reg_rtx (mode);
> +             generating_concat_p = save;
> +           }
> +       }
>        data->stack_parm = NULL;
>      }
>
> @@ -2938,7 +2953,9 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>        mem = validize_mem (copy_rtx (stack_parm));
>
>        /* Handle values in multiple non-contiguous locations.  */
> -      if (GET_CODE (entry_parm) == PARALLEL)
> +      if (GET_CODE (entry_parm) == PARALLEL && !MEM_P (mem))
> +       emit_group_store (mem, entry_parm, data->passed_type, size);
> +      else if (GET_CODE (entry_parm) == PARALLEL)
>         {
>           push_to_sequence2 (all->first_conversion_insn,
>                              all->last_conversion_insn);
> @@ -2946,6 +2963,7 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>           all->first_conversion_insn = get_insns ();
>           all->last_conversion_insn = get_last_insn ();
>           end_sequence ();
> +         in_conversion_seq = true;
>         }
>
>        else if (size == 0)
> @@ -3025,11 +3043,22 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>        all->first_conversion_insn = get_insns ();
>        all->last_conversion_insn = get_last_insn ();
>        end_sequence ();
> +      in_conversion_seq = true;
>      }
>
>    if (target_reg)
>      {
> -      emit_move_insn (target_reg, stack_parm);
> +      if (!in_conversion_seq)
> +       emit_move_insn (target_reg, stack_parm);
> +      else
> +       {
> +         push_to_sequence2 (all->first_conversion_insn,
> +                            all->last_conversion_insn);
> +         emit_move_insn (target_reg, stack_parm);
> +         all->first_conversion_insn = get_insns ();
> +         all->last_conversion_insn = get_last_insn ();
> +         end_sequence ();
> +       }
>        stack_parm = target_reg;
>      }
>
>
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR64164] drop copyrename, integrate into expand
  2015-11-05  5:09                                                                 ` Alexandre Oliva
  2015-11-05 13:44                                                                   ` Richard Biener
@ 2015-11-10 15:31                                                                   ` Alan Lawrence
  2015-11-10 22:59                                                                     ` Alexandre Oliva
  1 sibling, 1 reply; 127+ messages in thread
From: Alan Lawrence @ 2015-11-10 15:31 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: gcc-patches, Marcus Shawcroft, James Greenhalgh

On 05/11/15 05:08, Alexandre Oliva wrote:
> [PR67753] fix copy of PARALLEL entry_parm to CONCAT target_reg
> for  gcc/ChangeLog
>
> 	PR rtl-optimization/67753
> 	PR rtl-optimization/64164
> 	* function.c (assign_parm_setup_block): Avoid allocating a
> 	stack slot if we don't have an ABI-reserved one.  Emit the
> 	copy to target_reg in the conversion seq if the copy from
> 	entry_parm is in it too.  Don't use the conversion seq to copy
> 	a PARALLEL to a REG or a CONCAT.

Since this change, we have on aarch64_be:

FAIL: gcc.target/aarch64/aapcs64/func-ret-4.c execution,  -O1
FAIL: gcc.target/aarch64/aapcs64/func-ret-4.c execution,  -O2
FAIL: gcc.target/aarch64/aapcs64/func-ret-4.c execution,  -O3 -g
FAIL: gcc.target/aarch64/aapcs64/func-ret-4.c execution,  -Os
FAIL: gcc.target/aarch64/aapcs64/func-ret-4.c execution,  -Og -g

The difference in the assembler looks as follows (this is at -Og):

  func_return_val_10:
-	sub	sp, sp, #16
-	lsr	x2, x1, 48
-	lsr	x1, x1, 32
+	ubfx	x2, x1, 16, 16
  	fmov	x3, d0
  	// Start of user assembly
  // 23 "func-ret-4.c" 1
  	mov x0, x30
  // 0 "" 2
  	// End of user assembly
  	adrp	x3, saved_return_address
  	str	x0, [x3, #:lo12:saved_return_address]
  	adrp	x0, myfunc
  	add	x0, x0, :lo12:myfunc
  	// Start of user assembly
  // 23 "func-ret-4.c" 1
  	mov x30, x0
  // 0 "" 2
  	// End of user assembly
  	bfi	w0, w2, 16, 16
  	bfi	w0, w1, 0, 16
  	lsl	x0, x0, 32
-	add	sp, sp, 16

(ubfx is a bitfield extract: the first immediate is the least significant
bit, the second the width; lsr is a logical shift right.)  In the RTL dump,
this (before the patch):

(insn 4 3 5 2 (set (mem/c:DI (plus:DI (reg/f:DI 68 virtual-stack-vars)
                 (const_int -8 [0xfffffffffffffff8])) [0 t+0 S8 A64])
         (reg:DI 1 x1)) func-ret-4.c:23 -1
      (nil))
(insn 5 4 6 2 (set (reg:HI 78 [ t ])
         (mem/c:HI (plus:DI (reg/f:DI 68 virtual-stack-vars)
                 (const_int -8 [0xfffffffffffffff8])) [0 t+0 S2 A64])) func-ret-4.c:23 -1
      (nil))
(insn 6 5 7 2 (set (reg:HI 79 [ t+2 ])
         (mem/c:HI (plus:DI (reg/f:DI 68 virtual-stack-vars)
                 (const_int -6 [0xfffffffffffffffa])) [0 t+2 S2 A16])) func-ret-4.c:23 -1
      (nil))

becomes (after the patch):

(insn 4 3 5 2 (set (subreg:SI (reg:CHI 80) 0)
         (reg:SI 1 x1 [ t ])) func-ret-4.c:23 -1
      (nil))
(insn 5 4 6 2 (set (reg:SI 81)
         (subreg:SI (reg:CHI 80) 0)) func-ret-4.c:23 -1
      (nil))
(insn 6 5 7 2 (set (subreg:DI (reg:HI 82) 0)
         (zero_extract:DI (subreg:DI (reg:SI 81) 0)
             (const_int 16 [0x10])
             (const_int 16 [0x10]))) func-ret-4.c:23 -1
      (nil))
(insn 7 6 8 2 (set (reg:HI 78 [ t ])
         (reg:HI 82)) func-ret-4.c:23 -1
      (nil))
(insn 8 7 9 2 (set (reg:SI 83)
         (subreg:SI (reg:CHI 80) 0)) func-ret-4.c:23 -1
      (nil))
(insn 9 8 10 2 (set (reg:HI 79 [ t+2 ])
         (subreg:HI (reg:SI 83) 2)) func-ret-4.c:23 -1
      (nil))

--Alan

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR64164] drop copyrename, integrate into expand
  2015-11-10 15:31                                                                   ` Alan Lawrence
@ 2015-11-10 22:59                                                                     ` Alexandre Oliva
  2015-11-10 23:43                                                                       ` Jeff Law
  0 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-11-10 22:59 UTC (permalink / raw)
  To: Alan Lawrence; +Cc: gcc-patches, Marcus Shawcroft, James Greenhalgh

On Nov 10, 2015, Alan Lawrence <alan.lawrence@arm.com> wrote:

> FAIL: gcc.target/aarch64/aapcs64/func-ret-4.c execution,  -O2

Ugh, sorry.  I even checked that testcase by hand before submitting the
patch, because I knew it took the paths I was changing, but I didn't
realize that the stack store and load amounted to a shift, which went
missing once the stack slot was bypassed.

With the following patch, we get a lsr and a ubfx, without the sp
adjustments.  Please let me know if it causes any further problems.  So
far, I've tested it on x86_64-linux-gnu, i686-linux-gnu, and
ppc64le-linux-gnu; the ppc64-linux-gnu test run is running slower and
probably won't be done before I call it a day, but I wanted to give you
something before taking off for the day.

Is this ok to install if ppc64-linux-gnu also regstraps successfully?


[PR67753] adjust for padding when bypassing memory in assign_parm_setup_block

From: Alexandre Oliva <aoliva@redhat.com>

Storing a register in memory as a full word and then accessing the
same memory address under a smaller-than-word mode amounts to
right-shifting the register word on big-endian machines.  So, if
BLOCK_REG_PADDING chooses upward padding for BYTES_BIG_ENDIAN, and
we're copying from the entry_parm REG directly to a pseudo, bypassing
any stack slot, perform the shifting explicitly.

This fixes the miscompile of func_return_val_10 in
gcc.target/aarch64/aapcs64/func-ret-4.c for target aarch64_be-elf
introduced in the first patch for 67753.

for  gcc/ChangeLog

	PR rtl-optimization/67753
	PR rtl-optimization/64164
	* function.c (assign_parm_setup_block): Right-shift
	upward-padded big-endian args when bypassing the stack slot.
---
 gcc/function.c |   44 +++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 41 insertions(+), 3 deletions(-)

diff --git a/gcc/function.c b/gcc/function.c
index a637cb3..1ee092c 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -3002,6 +3002,38 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
 	      emit_move_insn (change_address (mem, mode, 0), reg);
 	    }
 
+#ifdef BLOCK_REG_PADDING
+	  /* Storing the register in memory as a full word, as
+	     move_block_from_reg below would do, and then using the
+	     MEM in a smaller mode, has the effect of shifting right
+	     if BYTES_BIG_ENDIAN.  If we're bypassing memory, the
+	     shifting must be explicit.  */
+	  else if (!MEM_P (mem))
+	    {
+	      rtx x;
+
+	      /* If the assert below fails, we should have taken the
+		 mode != BLKmode path above, unless we have downward
+		 padding of smaller-than-word arguments on a machine
+		 with little-endian bytes, which would likely require
+		 additional changes to work correctly.  */
+	      gcc_checking_assert (BYTES_BIG_ENDIAN
+				   && (BLOCK_REG_PADDING (mode,
+							  data->passed_type, 1)
+				       == upward));
+
+	      int by = (UNITS_PER_WORD - size) * BITS_PER_UNIT;
+
+	      x = gen_rtx_REG (word_mode, REGNO (entry_parm));
+	      x = expand_shift (RSHIFT_EXPR, word_mode, x, by,
+				NULL_RTX, 1);
+	      x = force_reg (word_mode, x);
+	      x = gen_lowpart_SUBREG (GET_MODE (mem), x);
+
+	      emit_move_insn (mem, x);
+	    }
+#endif
+
 	  /* Blocks smaller than a word on a BYTES_BIG_ENDIAN
 	     machine must be aligned to the left before storing
 	     to memory.  Note that the previous test doesn't
@@ -3023,14 +3055,20 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
 	      tem = change_address (mem, word_mode, 0);
 	      emit_move_insn (tem, x);
 	    }
-	  else if (!MEM_P (mem))
-	    emit_move_insn (mem, entry_parm);
 	  else
 	    move_block_from_reg (REGNO (entry_parm), mem,
 				 size_stored / UNITS_PER_WORD);
 	}
       else if (!MEM_P (mem))
-	emit_move_insn (mem, entry_parm);
+	{
+	  gcc_checking_assert (size > UNITS_PER_WORD);
+#ifdef BLOCK_REG_PADDING
+	  gcc_checking_assert (BLOCK_REG_PADDING (GET_MODE (mem),
+						  data->passed_type, 0)
+			       == upward);
+#endif
+	  emit_move_insn (mem, entry_parm);
+	}
       else
 	move_block_from_reg (REGNO (entry_parm), mem,
 			     size_stored / UNITS_PER_WORD);


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR64164] drop copyrename, integrate into expand
  2015-11-10 22:59                                                                     ` Alexandre Oliva
@ 2015-11-10 23:43                                                                       ` Jeff Law
  2015-11-11 18:10                                                                         ` Alexandre Oliva
  0 siblings, 1 reply; 127+ messages in thread
From: Jeff Law @ 2015-11-10 23:43 UTC (permalink / raw)
  To: Alexandre Oliva, Alan Lawrence
  Cc: gcc-patches, Marcus Shawcroft, James Greenhalgh

On 11/10/2015 03:58 PM, Alexandre Oliva wrote:
> On Nov 10, 2015, Alan Lawrence <alan.lawrence@arm.com> wrote:
>
>> FAIL: gcc.target/aarch64/aapcs64/func-ret-4.c execution,  -O2
>
> Ugh, sorry.  I even checked that testcase by hand before submitting the
> patch, because I knew it took the paths I was changing, but I didn't
> realize the stack store and load would amount to shifts when the stack
> slot was bypassed.
>
> With the following patch, we get a lsr and a ubfx, without the sp
> adjustments.  Please let me know if it causes any further problems.  So
> far, I've tested it on x86_64-linux-gnu, i686-linux-gnu, and
> ppc64le-linux-gnu; the ppc64-linux-gnu test run is running slower and
> probably won't be done before I call it a day, but I wanted to give you
> something before taking off for the day.
>
> Is this ok to install if ppc64-linux-gnu also regstraps successfully?
>
>
> [PR67753] adjust for padding when bypassing memory in assign_parm_setup_block
>
> From: Alexandre Oliva <aoliva@redhat.com>
>
> Storing a register in memory as a full word and then accessing the
> same memory address under a smaller-than-word mode amounts to
> right-shifting of the register word on big endian machines.  So, if
> BLOCK_REG_PADDING chooses upward padding for BYTES_BIG_ENDIAN, and
> we're copying from the entry_parm REG directly to a pseudo, bypassing
> any stack slot, perform the shifting explicitly.
>
> This fixes the miscompile of function_return_val_10 in
> gcc.target/aarch64/aapcs64/func-ret-4.c for target aarch64_be-elf
> introduced in the first patch for 67753.
>
> for  gcc/ChangeLog
>
> 	PR rtl-optimization/67753
> 	PR rtl-optimization/64164
> 	* function.c (assign_parm_setup_block): Right-shift
> 	upward-padded big-endian args when bypassing the stack slot.
Don't you need to check the value of BLOCK_REG_PADDING at runtime?  The 
padding is essentially allowed to vary.

If you look at the other places where BLOCK_REG_PADDING is used, it's
checked in an #ifdef, then again inside an if conditional.



Jeff

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR64164] drop copyrename, integrate into expand
  2015-11-10 23:43                                                                       ` Jeff Law
@ 2015-11-11 18:10                                                                         ` Alexandre Oliva
  2015-11-13  6:33                                                                           ` Jeff Law
  0 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-11-11 18:10 UTC (permalink / raw)
  To: Jeff Law; +Cc: Alan Lawrence, gcc-patches, Marcus Shawcroft, James Greenhalgh

On Nov 10, 2015, Jeff Law <law@redhat.com> wrote:

>> * function.c (assign_parm_setup_block): Right-shift
>> upward-padded big-endian args when bypassing the stack slot.
> Don't you need to check the value of BLOCK_REG_PADDING at runtime?
> The padding is essentially allowed to vary.

Well, yeah, it's the result of BLOCK_REG_PADDING that tells whether
upward-padding occurred and shifting is required.

> If you  look at the other places where BLOCK_REG_PADDING is used, it's
> checked in a #ifdef, then again inside a if conditional.

That's what I do in the patch too.

That said, the initial conditions in the if/else-if/else chain for the
no-larger-than-a-word case cover all of the non-BLOCK_REG_PADDING cases
correctly, so that, if BLOCK_REG_PADDING is not defined, we can just
skip the !MEM_P block altogether.  That's also the reason why we can go
straight to shifting when we get there.

I tried to document my reasoning in the comments, but maybe it was still
too obscure?

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR64164] drop copyrename, integrate into expand
  2015-11-11 18:10                                                                         ` Alexandre Oliva
@ 2015-11-13  6:33                                                                           ` Jeff Law
  2015-11-17  0:07                                                                             ` Alexandre Oliva
  0 siblings, 1 reply; 127+ messages in thread
From: Jeff Law @ 2015-11-13  6:33 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Alan Lawrence, gcc-patches, Marcus Shawcroft, James Greenhalgh

On 11/11/2015 11:10 AM, Alexandre Oliva wrote:
> On Nov 10, 2015, Jeff Law <law@redhat.com> wrote:
>
>>> * function.c (assign_parm_setup_block): Right-shift
>>> upward-padded big-endian args when bypassing the stack slot.
>> Don't you need to check the value of BLOCK_REG_PADDING at runtime?
>> The padding is essentially allowed to vary.
>
> Well, yeah, it's the result of BLOCK_REG_PADDING that tells whether
> upward-padding occurred and shifting is required.
>
>> If you  look at the other places where BLOCK_REG_PADDING is used, it's
>> checked in a #ifdef, then again inside a if conditional.
>
> That's what I do in the patch too.
?  I don't see the runtime check in your patch.  I see a couple 
gcc_asserts, but no runtime check of BLOCK_REG_PADDING.

>
> That said, the initial conditions in the if/else-if/else chain for the
> no-larger-than-a-word case cover all of the non-BLOCK_REG_PADDING cases
> correctly, so that, if BLOCK_REG_PADDING is not defined, we can just
> skip the !MEM_P block altogether.  That's also the reason why we can go
> straight to shifting when we get there.
>
> I tried to document my reasoning in the comments, but maybe it was still
> too obscure?
Certainly seems that way.  Is it your assertion that the new code is 
what we want regardless of the *value* of BLOCK_REG_PADDING?
Essentially meaning the check in the IF is covering both cases?

What am I missing here?

Jeff

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR64164] drop copyrename, integrate into expand
  2015-11-13  6:33                                                                           ` Jeff Law
@ 2015-11-17  0:07                                                                             ` Alexandre Oliva
  2015-11-24  5:41                                                                               ` Jeff Law
  0 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-11-17  0:07 UTC (permalink / raw)
  To: Jeff Law; +Cc: Alan Lawrence, gcc-patches, Marcus Shawcroft, James Greenhalgh

On Nov 13, 2015, Jeff Law <law@redhat.com> wrote:

> On 11/11/2015 11:10 AM, Alexandre Oliva wrote:
>> On Nov 10, 2015, Jeff Law <law@redhat.com> wrote:
>> 
>>>> * function.c (assign_parm_setup_block): Right-shift
>>>> upward-padded big-endian args when bypassing the stack slot.
>>> Don't you need to check the value of BLOCK_REG_PADDING at runtime?
>>> The padding is essentially allowed to vary.
>> 
>> Well, yeah, it's the result of BLOCK_REG_PADDING that tells whether
>> upward-padding occurred and shifting is required.
>> 
>>> If you  look at the other places where BLOCK_REG_PADDING is used, it's
>>> checked in a #ifdef, then again inside a if conditional.
>> 
>> That's what I do in the patch too.
> ?  I don't see the runtime check in your patch.  I see a couple
> gcc_asserts, but no runtime check of BLOCK_REG_PADDING.

The check is not in my patch, indeed.  That's because the previous block
performs the runtime check, and it only lets through two cases: the one
we handle, and the one nobody uses.

The previous block tests this:

	  if (mode != BLKmode
#ifdef BLOCK_REG_PADDING
	      && (size == UNITS_PER_WORD
		  || (BLOCK_REG_PADDING (mode, data->passed_type, 1)
		      != (BYTES_BIG_ENDIAN ? upward : downward)))
#endif
	      )

i.e., whether we know the mode of the passed value, and it's word-sized,
or it's padded such that the passed value is in the lowpart of the word.
Since this is in a block that runs when size <= UNITS_PER_WORD, this
catches (and works for) all cases of default padding (when
BLOCK_REG_PADDING is not defined), as well as cases in which
BLOCK_REG_PADDING is defined so as to behave like the default padding,
at least for smaller-than-word modes.
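
To make that test easier to follow, here is a minimal standalone sketch
of it.  This is not GCC code: enum pad_dir and value_in_lowpart are
hypothetical names, and the sketch assumes a target that defines
BLOCK_REG_PADDING and an argument with size <= UNITS_PER_WORD.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the padding direction that
   BLOCK_REG_PADDING returns (upward or downward).  */
enum pad_dir { PAD_UPWARD, PAD_DOWNWARD };

/* Model of the quoted condition.  Returns true when the argument is
   known to sit in the low part of the register word, so that a plain
   lowpart move is enough; returns false when one of the later blocks
   in the if/else-if chain has to deal with it.  */
static bool
value_in_lowpart (bool mode_is_blk, int size, int units_per_word,
                  bool bytes_big_endian, enum pad_dir padding)
{
  /* The sketch only covers the no-larger-than-a-word case.  */
  assert (size > 0 && size <= units_per_word);

  /* The padding direction that would push a smaller-than-word value
     out of the low part for this byte order.  */
  enum pad_dir shifting_dir = bytes_big_endian ? PAD_UPWARD : PAD_DOWNWARD;

  return !mode_is_blk
         && (size == units_per_word || padding != shifting_dir);
}
```

For a 2-byte argument in an 8-byte word, this returns true for
little-endian with upward padding and for big-endian with downward
padding, and false for the other two combinations.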

So, since this handles little-endian bytes with upward padding and
big-endian bytes with downward padding, what remains to be handled is
little-endian bytes with downward padding and big-endian bytes with
upward padding.  I found no evidence that the former is ever used
anywhere, or why anyone would ever force shifting for both REG and MEM
use, and I don't see how the code would have dealt with this case
anyway, so I left it unhandled.  The other case, big-endian bytes with
upward padding, is precisely the one that my previous patch broke on
AArch64: we have the passed values pushed to the upper part of the REG
so that it could be stored in memory as a whole word and then accessed
in the smaller mode at the same address.  After checking that this is
the case at hand, we shift the value as if we stored it in memory as a
word and loaded it in the value mode.
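
The equivalence relied on here can be checked outside of GCC.  The
following is a minimal sketch, assuming a 64-bit word and 8-bit units;
store_word_be, load_be, via_memory and via_shift are hypothetical
helpers, not GCC functions.  via_memory models storing the full word in
big-endian byte order and reading back `size' bytes at the same
address; via_shift models an explicit logical right shift by
(UNITS_PER_WORD - size) * BITS_PER_UNIT bits, as done when the stack
slot is bypassed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed word geometry for this sketch only.  */
enum { SKETCH_UNITS_PER_WORD = 8, SKETCH_BITS_PER_UNIT = 8 };

/* Store WORD into BUF most-significant byte first, as a big-endian
   machine would.  */
static void
store_word_be (uint8_t *buf, uint64_t word)
{
  for (int i = 0; i < SKETCH_UNITS_PER_WORD; i++)
    buf[i] = (uint8_t) (word >> (SKETCH_BITS_PER_UNIT
                                 * (SKETCH_UNITS_PER_WORD - 1 - i)));
}

/* Load SIZE bytes starting at BUF as a big-endian integer.  */
static uint64_t
load_be (const uint8_t *buf, int size)
{
  uint64_t v = 0;
  for (int i = 0; i < size; i++)
    v = (v << SKETCH_BITS_PER_UNIT) | buf[i];
  return v;
}

/* Through-memory path: full-word store, then a narrower load at the
   same address.  */
static uint64_t
via_memory (uint64_t word, int size)
{
  uint8_t slot[SKETCH_UNITS_PER_WORD];
  assert (size > 0 && size <= SKETCH_UNITS_PER_WORD);
  store_word_be (slot, word);
  return load_be (slot, size);
}

/* Bypass path: no memory involved, so the shift must be explicit.  */
static uint64_t
via_shift (uint64_t word, int size)
{
  assert (size > 0 && size <= SKETCH_UNITS_PER_WORD);
  int by = (SKETCH_UNITS_PER_WORD - size) * SKETCH_BITS_PER_UNIT;
  return word >> by;
}
```

Both paths yield the upper `size' bytes of the word, which is where an
upward-padded argument lives on a big-endian target.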

>> That said, the initial conditions in the if/else-if/else chain for the
>> no-larger-than-a-word case cover all of the non-BLOCK_REG_PADDING cases
>> correctly, so that, if BLOCK_REG_PADDING is not defined, we can just
>> skip the !MEM_P block altogether.  That's also the reason why we can go
>> straight to shifting when we get there.
>> 
>> I tried to document my reasoning in the comments, but maybe it was still
>> too obscure?
> Certainly seems that way.  Is it your assertion that the new code is
> what we want regardless of the *value* of BLOCK_REG_PADDING?

Sort of.  If we get to that point, there's only one reasonable value of
BLOCK_REG_PADDING (*), although there's another possible value that we
historically haven't handled and that makes very little sense to
support.  We'd have silently corrupted it before, while now we'd get an
assertion failure.  I count that as an improvement, though it's unlikely
we'd ever hit it: anyone trying to define BLOCK_REG_PADDING so as to pad
small args downward on little-endian bytes would AFAICT soon find out it
doesn't work.

(*) unless mode is BLKmode, which the newly-added code implicitly
excludes by testing that we don't have a MEM, but rather a REG.

> Essentially meaning the check in the IF is covering both cases?

Among all cases for arguments that are word-sized or smaller, the
initial IF (not present in the patch) covers all of the "usual" cases.
The remaining blocks, including the one I added, cover the remaining
handled cases, namely, BYTES_BIG_ENDIAN and upward BLOCK_REG_PADDING, or
BLKmode BYTES_BIG_ENDIAN and downward BLOCK_REG_PADDING (that needs the
opposite padding when storing to big-endian mem, so that the value can
be accessed at the address in which the full word is stored).

> What am I missing here?

I agree that the way the remaining tests are written doesn't make it
clear that they're all handling a single case, which makes things
confusing.  In part, that's because they really aren't; they also deal
with BLKmode MEMs with "usual" padding.  But that's not a case that the
patch affects, because we don't have BLKmode REGs.

Any suggestions on how to improve the comments so that they convey
enough of this reasoning to make sense, without our having to write a
book :-) on the topic?

Thanks,

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR64164] drop copyrename, integrate into expand
  2015-11-17  0:07                                                                             ` Alexandre Oliva
@ 2015-11-24  5:41                                                                               ` Jeff Law
  0 siblings, 0 replies; 127+ messages in thread
From: Jeff Law @ 2015-11-24  5:41 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Alan Lawrence, gcc-patches, Marcus Shawcroft, James Greenhalgh

On 11/16/2015 05:07 PM, Alexandre Oliva wrote:
>
> The check is not in my patch, indeed.  That's because the previous block
> performs the runtime check, and it only lets through two cases: the one
> we handle, and the one nobody uses.
That was the conclusion I was starting to come to, but expressed so 
poorly in my last message.  Sadly it was non-obvious from staring at the 
current code.  Though I must admit that after a week, I can see it 
better now.  Maybe that's a result of re-reading your message a 
half-dozen more times with the current code and your patch all visible 
in windows next to each other :-)


Prior to your change we'd just blindly copy from ENTRY_PARM to MEM, 
which would result in missing the implicit shift if MEM wasn't actually 
a memory.

You're just moving that conditional up and handling the shift 
explicitly.  You've got asserts for the cases you're not handling (and 
no, I'm not aware of the need for this on any LE architecture, while I 
am aware of BE architectures that align in both directions).
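
To make the "implicit shift" concrete, here is a small stand-alone sketch.
It is not GCC code: the helper names and the 8-byte word size are
assumptions chosen purely for illustration.  It models a big-endian word
store and shows that a value sitting in the least significant bytes of a
register only lands at the start of its memory slot if it is first shifted
left by the padding amount.  When the destination is a register rather
than memory, no address arithmetic can hide that adjustment, so the shift
has to be emitted explicitly.

```cpp
#include <cassert>
#include <cstdint>

// Assumed word size for this illustration only.
constexpr int WORD_BYTES = 8;

// Store WORD into BUF in big-endian byte order (most significant byte
// first), independent of the host's endianness.
void
store_word_be (unsigned char *buf, std::uint64_t word)
{
  for (int i = 0; i < WORD_BYTES; i++)
    buf[i] = static_cast<unsigned char> ((word >> (8 * (WORD_BYTES - 1 - i))) & 0xff);
}

// Number of bits by which a SIZE-byte value held in the least significant
// end of a word must be shifted left so that a big-endian word store puts
// its first byte at offset 0 of the slot.
int
padding_shift_bits (int size)
{
  assert (size >= 1 && size <= WORD_BYTES);
  return (WORD_BYTES - size) * 8;
}
```

With a 3-byte value 0xAABBCC, storing the unshifted word leaves the bytes
at offsets 5..7 of the slot; storing the word shifted left by
padding_shift_bits (3) leaves them at offsets 0..2.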


> Any suggestions on how to improve the comments so that they convey
> enough of this reasoning to make sense, without our having to write a
> book :-) on the topic?
Refer back to this thread? :-)  Seriously though, looking at things a 
week later, I can see it much better now.  Thanks for your patience on this.

OK for the trunk,
jeff

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR64164] drop copyrename, integrate into expand
  2015-03-27 18:04 [PR64164] drop copyrename, integrate into expand Alexandre Oliva
  2015-03-27 18:11 ` Alexandre Oliva
  2015-03-28 19:22 ` Alexandre Oliva
@ 2015-12-04 12:45 ` Dominik Vogt
  2 siblings, 0 replies; 127+ messages in thread
From: Dominik Vogt @ 2015-12-04 12:45 UTC (permalink / raw)
  To: gcc-patches; +Cc: Alexandre Oliva, Andreas Krebbel

On Fri, Mar 27, 2015 at 03:04:05PM -0300, Alexandre Oliva wrote:
> This patch reworks the out-of-ssa expander to enable coalescing of SSA
> partitions that don't share the same base name.  This is done only when
> optimizing.
> 
> The test we use to tell whether two partitions can be merged no longer
> demands them to have the same base variable when optimizing, so they
> become eligible for coalescing, as they would after copyrename.  We then
> compute the partitioning we'd get if all coalescible partitions were
> coalesced, using this partition assignment to assign base vars numbers.
> These base var numbers are then used to identify conflicts, which used
> to be based on shared base vars or base types.
> 
> We now propagate base var names during coalescing proper, only towards
> the leader variable.  I'm no longer sure this is still needed, but
> something about handling variables and results led me this way and I
> didn't revisit it.  I might rework that with a later patch, or a later
> revision of this patch; it would require other means to identify
> partitions holding result_decls during merging, or allow that and deal
> with param and result decls in a different way during expand proper.
> 
> I had to fix two lingering bugs in order for the whole thing to work: we
> perform conflict detection after abnormal coalescing, but we computed
> live ranges involving only the partition leaders, so conflicts with
> other names already coalesced wouldn't be detected.  The other problem
> was that we didn't track default defs for parms as live at entry, so
> they might end up coalesced.  I guess none of these problems would have
> been exercised in practice, because we wouldn't even consider merging
> ssa names associated with different variables.
> 
> In the end, I verified that this fixed the codegen regression in the
> PR64164 testcase, that failed to merge two partitions that could in
> theory be merged, but that wasn't even considered due to differences in
> the SSA var names.
> 
> I'd agree that disregarding the var names and dropping 4 passes is too
> much of a change to fix this one problem, but...  it's something we
> should have long tackled, and it gets this and other jobs done, so...
> 
> Regstrapped on x86_64-linux-gnu native and on i686-pc-linux-gnu native
> on x86_64, so without lto.  Is this ok to install?

The patch that got committed as a result of this discussion causes
a performance regression on s390[x].  Bug report:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68695

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Tag 'gcc/gimple-expr.cc:mark_addressable_2' as 'static' (was: [PR67891] drop is_gimple_reg test from set_parm_rtl)
  2015-11-03  4:29                                                                                     ` Alexandre Oliva
@ 2022-10-17 12:08                                                                                       ` Thomas Schwinge
  0 siblings, 0 replies; 127+ messages in thread
From: Thomas Schwinge @ 2022-10-17 12:08 UTC (permalink / raw)
  To: Alexandre Oliva, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1396 bytes --]

Hi!

On 2015-11-03T02:29:41-0200, Alexandre Oliva <aoliva@redhat.com> wrote:
> Thanks, here's the patch as just installed.

> --- a/gcc/gimple-expr.c
> +++ b/gcc/gimple-expr.c

> +static void
> +mark_addressable_1 (tree x)
> +{
> +  [...]
> +}
> +
> +/* Adaptor for mark_addressable_1 for use in hash_set traversal.  */
> +
> +bool
> +mark_addressable_2 (tree const &x, void * ATTRIBUTE_UNUSED = NULL)
> +{
> +  mark_addressable_1 (x);
> +  return false;
> +}

I found this a while ago and have now pushed it to the master branch in
commit aeb1e2bff95ae17717026905ef404699d91f5c61
"Tag 'gcc/gimple-expr.cc:mark_addressable_2' as 'static'", see attached.


Regards
 Thomas


> +void
> +flush_mark_addressable_queue ()
> +{
> +  gcc_assert (!currently_expanding_to_rtl);
> +  if (mark_addressable_queue)
> +    {
> +      mark_addressable_queue->traverse<void*, mark_addressable_2> (NULL);
> +      delete mark_addressable_queue;
> +      mark_addressable_queue = NULL;
> +    }
> +}

> --- a/gcc/gimple-expr.h
> +++ b/gcc/gimple-expr.h

> +extern void flush_mark_addressable_queue (void);
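
The pattern in the quoted hunks (a worker, a thin adaptor used only as a
traversal callback, and a public flush routine) can be sketched outside of
GCC as follows.  This is only an illustration: std::unordered_set stands in
for GCC's hash_set, and node, marked_nodes, pending_marks, expanding_now,
queue_mark and flush_mark_queue are hypothetical names.  The point is that
the adaptor is referenced only from within its own file, so it can have
internal linkage, while only the flush routine needs to be visible to
other files.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a tree node; a plain integer id suffices here.
using node = int;

// Records which nodes have been processed, so the effect is observable.
static std::vector<node> marked_nodes;

// Deferred work queue, allocated lazily on first use.
static std::unordered_set<node> *pending_marks = nullptr;

// Models a phase during which flushing is not allowed.
static bool expanding_now = false;

// Worker: does the actual marking.  Internal linkage.
static void
mark_1 (node x)
{
  marked_nodes.push_back (x);
}

// Adaptor with the callback shape used for set traversal.  It is used
// only in this file, hence 'static'.  The return value is ignored by the
// loop below.
static bool
mark_2 (node const &x, void * = nullptr)
{
  mark_1 (x);
  return false;
}

// Defer marking of X until the queue is flushed.
void
queue_mark (node x)
{
  if (!pending_marks)
    pending_marks = new std::unordered_set<node>;
  pending_marks->insert (x);
}

// Process and release the queue; the only entry point other files need.
void
flush_mark_queue ()
{
  assert (!expanding_now);
  if (pending_marks)
    {
      for (node const &x : *pending_marks)
        mark_2 (x);
      delete pending_marks;
      pending_marks = nullptr;
    }
}
```

Queueing the same node twice marks it once, and a second flush is a no-op
because the queue has already been released.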



[-- Attachment #2: 0001-Tag-gcc-gimple-expr.cc-mark_addressable_2-as-static.patch --]
[-- Type: text/x-diff, Size: 944 bytes --]

From aeb1e2bff95ae17717026905ef404699d91f5c61 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Wed, 15 Dec 2021 22:00:53 +0100
Subject: [PATCH] Tag 'gcc/gimple-expr.cc:mark_addressable_2' as 'static'

Added in 2015 r229696 (commit 1b223a9f3489296c625bdb7cc764196d04fd9231)
"defer mark_addressable calls during expand till the end of expand",
it has never been used 'extern'ally.

	gcc/
	* gimple-expr.cc (mark_addressable_2): Tag as 'static'.
---
 gcc/gimple-expr.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/gimple-expr.cc b/gcc/gimple-expr.cc
index c9c7285efbc..4fbce9369c7 100644
--- a/gcc/gimple-expr.cc
+++ b/gcc/gimple-expr.cc
@@ -912,7 +912,7 @@ mark_addressable_1 (tree x)
 
 /* Adaptor for mark_addressable_1 for use in hash_set traversal.  */
 
-bool
+static bool
 mark_addressable_2 (tree const &x, void * ATTRIBUTE_UNUSED = NULL)
 {
   mark_addressable_1 (x);
-- 
2.35.1


^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR64164] drop copyrename, integrate into expand
  2015-06-09 21:36     ` Eric Botcazou
@ 2015-06-09 21:38       ` David Edelsohn
  0 siblings, 0 replies; 127+ messages in thread
From: David Edelsohn @ 2015-06-09 21:38 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: Alexandre Oliva, GCC Patches, Christophe Lyon,
	William J. Schmidt, Michael Meissner

Alex, I sent you a pre-processed file off-list.  You could try
bootstrapping on PPC64 on the GCC Compile Farm.

The SPARC failure reports a different error than PPC and ARM.  The PPC
and ARM failures are the same message, but seem to be on different
files.

After the breakage from Aldy's patch all weekend, this failure is very
frustrating.  If this is not fixed within 24 hours, your patch must be
reverted.  This patch clearly should have been tested on more
architectures than x86 before being approved and merged.

Thanks, David


On Tue, Jun 9, 2015 at 5:27 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> I'll look into cross-building some embedded targets and see if any
>> further issues surface.
>
> SPARC is also broken, see my message and the testcase under the PR.
>
> --
> Eric Botcazou

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR64164] drop copyrename, integrate into expand
  2015-06-09 20:24   ` Alexandre Oliva
  2015-06-09 20:59     ` Jakub Jelinek
@ 2015-06-09 21:36     ` Eric Botcazou
  2015-06-09 21:38       ` David Edelsohn
  1 sibling, 1 reply; 127+ messages in thread
From: Eric Botcazou @ 2015-06-09 21:36 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: gcc-patches, Christophe Lyon, David Edelsohn, William J. Schmidt,
	Michael Meissner

> I'll look into cross-building some embedded targets and see if any
> further issues surface.

SPARC is also broken, see my message and the testcase under the PR.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR64164] drop copyrename, integrate into expand
  2015-06-09 20:24   ` Alexandre Oliva
@ 2015-06-09 20:59     ` Jakub Jelinek
  2015-06-09 21:36     ` Eric Botcazou
  1 sibling, 0 replies; 127+ messages in thread
From: Jakub Jelinek @ 2015-06-09 20:59 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Christophe Lyon, David Edelsohn, GCC Patches, William J. Schmidt,
	Michael Meissner

On Tue, Jun 09, 2015 at 05:11:45PM -0300, Alexandre Oliva wrote:
> On Jun  9, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
> 
> > On Jun  9, 2015, David Edelsohn <dje.gcc@gmail.com> wrote:
> >> This also broke bootstrap on PPC64 LE Linux with the same error.
> 
> > Thanks for your reports.  I'm looking into the problem.
> 
> > I'd appreciate a preprocessed testcase from either of you to confirm the
> > fix, if not to help debug it.
> 
> The first potential source for this problem that jumped out at me would
> be silenced with this change:
> 
> diff --git a/gcc/function.c b/gcc/function.c
> index 8bcc352..9201ed9 100644
> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -2974,7 +2974,8 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>  	stack_parm = copy_rtx (stack_parm);
>        if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
>  	PUT_MODE (stack_parm, GET_MODE (entry_parm));
> -      set_mem_attributes (stack_parm, parm, 1);
> +      if (GET_CODE (stack_parm) == MEM)

FYI, this is preferably written as if (MEM_P (stack_parm)) these days.

> +	set_mem_attributes (stack_parm, parm, 1);
>      }
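
For readers unfamiliar with the idiom: MEM_P is a predicate macro in GCC's
rtl.h that expands to the same GET_CODE comparison, so the two spellings
test the same thing.  The following stand-alone miniature shows the shape
of the guard.  Only the two macro definitions mirror GCC; the toy_rtx
type, the attrs_set field and maybe_set_attrs are hypothetical and exist
just to make the sketch compile on its own.

```cpp
#include <cassert>

// Hypothetical miniature of an RTL-like node, for illustration only.
enum toy_code { TOY_REG, TOY_MEM };

struct toy_rtx
{
  toy_code code;
  bool attrs_set;
};

// Same shape as the GCC macros: the predicate is a GET_CODE comparison.
#define GET_CODE(X) ((X)->code)
#define MEM_P(X) (GET_CODE (X) == TOY_MEM)

// Set memory attributes only when X really is a memory reference.
// Returns true if the attributes were set.
bool
maybe_set_attrs (toy_rtx *x)
{
  assert (x != nullptr);
  if (MEM_P (x))
    {
      x->attrs_set = true;
      return true;
    }
  return false;
}
```

A register-like node passes through untouched, which is the behaviour the
guard in the quoted hunk is after.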

	Jakub

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR64164] drop copyrename, integrate into expand
  2015-06-09 18:36 ` Alexandre Oliva
@ 2015-06-09 20:24   ` Alexandre Oliva
  2015-06-09 20:59     ` Jakub Jelinek
  2015-06-09 21:36     ` Eric Botcazou
  0 siblings, 2 replies; 127+ messages in thread
From: Alexandre Oliva @ 2015-06-09 20:24 UTC (permalink / raw)
  To: Christophe Lyon
  Cc: David Edelsohn, GCC Patches, William J. Schmidt, Michael Meissner

On Jun  9, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:

> On Jun  9, 2015, David Edelsohn <dje.gcc@gmail.com> wrote:
>> This also broke bootstrap on PPC64 LE Linux with the same error.

> Thanks for your reports.  I'm looking into the problem.

> I'd appreciate a preprocessed testcase from either of you to confirm the
> fix, if not to help debug it.

The first potential source for this problem that jumped out at me would
be silenced with this change:

diff --git a/gcc/function.c b/gcc/function.c
index 8bcc352..9201ed9 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -2974,7 +2974,8 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
 	stack_parm = copy_rtx (stack_parm);
       if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
 	PUT_MODE (stack_parm, GET_MODE (entry_parm));
-      set_mem_attributes (stack_parm, parm, 1);
+      if (GET_CODE (stack_parm) == MEM)
+	set_mem_attributes (stack_parm, parm, 1);
     }
 
   /* If a BLKmode arrives in registers, copy it to a stack slot.  Handle

but I suspect there might be other similar issues lurking in function.c
after my attempt to turn parm assignment upside down ;-)

(namely, it used to assume it could pick stack slots and pseudos on a
whim, but after this change it must give way to out-of-SSA's partition
assignments.)

I'll look into cross-building some embedded targets and see if any
further issues surface.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR64164] drop copyrename, integrate into expand
  2015-06-09 16:19 David Edelsohn
@ 2015-06-09 18:36 ` Alexandre Oliva
  2015-06-09 20:24   ` Alexandre Oliva
  0 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-06-09 18:36 UTC (permalink / raw)
  To: Christophe Lyon, David Edelsohn
  Cc: GCC Patches, William J. Schmidt, Michael Meissner

On Jun  9, 2015, David Edelsohn <dje.gcc@gmail.com> wrote:

> This also broke bootstrap on PPC64 LE Linux with the same error.

Thanks for your reports.  I'm looking into the problem.

I'd appreciate a preprocessed testcase from either of you to confirm the
fix, if not to help debug it.

Thanks in advance,

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PR64164] drop copyrename, integrate into expand
@ 2015-06-09 16:19 David Edelsohn
  2015-06-09 18:36 ` Alexandre Oliva
  0 siblings, 1 reply; 127+ messages in thread
From: David Edelsohn @ 2015-06-09 16:19 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Christophe Lyon, GCC Patches, William J. Schmidt, Michael Meissner

This also broke bootstrap on PPC64 LE Linux with the same error.

- David

^ permalink raw reply	[flat|nested] 127+ messages in thread

end of thread, other threads:[~2022-10-17 12:08 UTC | newest]

Thread overview: 127+ messages (download: mbox.gz / follow: Atom feed)
2015-03-27 18:04 [PR64164] drop copyrename, integrate into expand Alexandre Oliva
2015-03-27 18:11 ` Alexandre Oliva
2015-03-28 19:22 ` Alexandre Oliva
2015-03-31  5:11   ` Jeff Law
2015-04-03 13:17     ` Alexandre Oliva
2015-04-06 16:08       ` Jeff Law
2015-04-24  1:56         ` Alexandre Oliva
2015-04-27 11:39           ` Richard Biener
2015-06-06  5:12             ` Alexandre Oliva
2015-06-08  8:16               ` Richard Biener
2015-06-09  8:58                 ` Christophe Lyon
2015-06-10  0:28               ` Alexandre Oliva
2015-06-10 13:36                 ` Richard Biener
2015-07-16  7:58                   ` Alexandre Oliva
2015-07-16  8:50                     ` Richard Biener
2015-07-16 21:33                       ` Alexandre Oliva
2015-07-18  8:26                         ` Alexandre Oliva
2015-07-21 13:25                           ` Richard Biener
2015-07-22 17:13                             ` Alexandre Oliva
2015-07-22 17:43                             ` Alexandre Oliva
2015-07-23 11:04                               ` Richard Biener
2015-07-23 15:42                                 ` Alexandre Oliva
2015-07-23 20:35                                   ` Segher Boessenkool
2015-07-23 21:24                                     ` H.J. Lu
2015-07-23 22:11                                       ` H.J. Lu
2015-07-24  1:31                                         ` David Edelsohn
2015-07-24  5:08                                           ` H.J. Lu
2015-07-24  9:26                                             ` Richard Biener
2015-07-24 12:50                                               ` H.J. Lu
2015-07-24 20:20                                           ` Alexandre Oliva
2015-07-25  2:37                                             ` David Edelsohn
2015-07-27 22:16                                               ` Alexandre Oliva
2015-07-27 22:31                                                 ` H.J. Lu
2015-07-24 18:51                                         ` Alexandre Oliva
2015-07-24 19:12                                           ` H.J. Lu
2015-07-24 19:31                                             ` David Edelsohn
2015-07-24 20:43                                               ` Alexandre Oliva
2015-07-24 20:47                                             ` Alexandre Oliva
2015-07-24 21:53                                               ` H.J. Lu
2015-07-25  7:17                                                 ` Richard Biener
2015-07-29 20:52                                         ` Alexandre Oliva
2015-07-29 21:06                                           ` H.J. Lu
2015-07-30 17:47                                             ` H.J. Lu
2015-08-03 23:46                                               ` Alexandre Oliva
2015-08-04  9:48                                                 ` Richard Biener
2015-08-05  0:39                                                   ` Alexandre Oliva
2015-08-05  9:14                                                     ` Richard Biener
2015-08-05 23:03                                                       ` Alexandre Oliva
2015-08-10  8:24                                                 ` James Greenhalgh
2015-08-10 15:14                                                   ` Jeff Law
2015-08-11  4:53                                                     ` Patrick Marlier
2015-08-14 19:03                                                       ` Alexandre Oliva
2015-08-15  8:57                                                         ` Andreas Schwab
2015-08-16 13:00                                                           ` Alexandre Oliva
     [not found]                                                             ` <m2k2sv8s21.fsf@linux-m68k.org>
2015-08-17  5:05                                                               ` Alexandre Oliva
2015-08-17  9:29                                                                 ` Kyrill Tkachov
2015-08-17 16:23                                                                   ` Andrew Pinski
2015-08-18 16:18                                                                 ` Kyrill Tkachov
2015-08-16 16:42                                                         ` Andreas Schwab
2015-08-17  2:57                                                           ` Alexandre Oliva
2015-08-17  8:23                                                             ` Andreas Schwab
2015-08-17  9:21                                                               ` Andreas Schwab
2015-08-17 11:58                                                               ` Alexandre Oliva
2015-08-17  7:48                                                         ` Christophe Lyon
2015-08-17 12:43                                                           ` Alexandre Oliva
2015-08-17 13:39                                                             ` Christophe Lyon
2015-08-18  6:53                                                               ` Alexandre Oliva
2015-08-19  6:50                                                                 ` Alexandre Oliva
2015-08-19 10:17                                                                   ` Richard Biener
2015-08-19 13:35                                                                   ` Andreas Schwab
2015-08-19 13:45                                                                     ` Andreas Schwab
2015-08-19 17:48                                                                       ` Alexandre Oliva
2015-08-20  1:44                                                                         ` Alexandre Oliva
2015-08-20 17:03                                                                           ` Jeff Law
2015-08-21  7:57                                                                           ` Alexandre Oliva
2015-08-21  8:38                                                                             ` Richard Biener
2015-08-21 12:17                                                                             ` Andreas Schwab
2015-08-21  8:11                                                                           ` Alexandre Oliva
2015-08-21  8:37                                                                             ` Richard Biener
2015-09-02 17:09                                                         ` Alan Lawrence
2015-09-02 22:34                                                           ` Alexandre Oliva
2015-09-03 10:58                                                             ` Alan Lawrence
2015-09-18 15:49                                                             ` Alan Lawrence
2015-09-23 20:44                                                               ` Alexandre Oliva
2015-09-25 11:39                                                                 ` Richard Biener
2015-10-09  5:26                                                                   ` [PR67828] don't unswitch loops on undefined SSA values (was: Re: [PR64164] drop copyrename, integrate into expand) Alexandre Oliva
2015-10-09  9:35                                                                     ` Richard Biener
2015-10-09  5:36                                                                   ` [PR67766] reorder return value copying from PARALLELs and CONCATs " Alexandre Oliva
2015-10-09  7:33                                                                     ` [PR67891] drop is_gimple_reg test from set_parm_rtl (was: [PR67766] reorder return value copying from PARALLELs and CONCATs) Alexandre Oliva
2015-10-09  9:40                                                                       ` Richard Biener
2015-10-10 13:20                                                                         ` [PR67891] drop is_gimple_reg test from set_parm_rtl Alexandre Oliva
2015-10-12 10:22                                                                           ` Richard Biener
2015-10-14  3:25                                                                             ` Alexandre Oliva
2015-10-14  9:28                                                                               ` Richard Biener
2015-11-03  1:11                                                                                 ` Alexandre Oliva
2015-11-03  3:14                                                                                   ` Jeff Law
2015-11-03  4:29                                                                                     ` Alexandre Oliva
2022-10-17 12:08                                                                                       ` Tag 'gcc/gimple-expr.cc:mark_addressable_2' as 'static' (was: [PR67891] drop is_gimple_reg test from set_parm_rtl) Thomas Schwinge
2015-10-09  9:36                                                                     ` [PR67766] reorder return value copying from PARALLELs and CONCATs (was: Re: [PR64164] drop copyrename, integrate into expand) Richard Biener
2015-09-29 11:31                                                                 ` [PR64164] drop copyrename, integrate into expand Szabolcs Nagy
2015-10-07 22:37                                                                   ` Alexandre Oliva
2015-10-08 10:00                                                                     ` Richard Biener
2015-10-09 21:10                                                                     ` Jeff Law
2015-11-05  5:09                                                                 ` Alexandre Oliva
2015-11-05 13:44                                                                   ` Richard Biener
2015-11-10 15:31                                                                   ` Alan Lawrence
2015-11-10 22:59                                                                     ` Alexandre Oliva
2015-11-10 23:43                                                                       ` Jeff Law
2015-11-11 18:10                                                                         ` Alexandre Oliva
2015-11-13  6:33                                                                           ` Jeff Law
2015-11-17  0:07                                                                             ` Alexandre Oliva
2015-11-24  5:41                                                                               ` Jeff Law
2015-07-24 18:21                                     ` Alexandre Oliva
2015-07-29 20:32                                     ` Alexandre Oliva
2015-04-29  3:51           ` Jeff Law
2015-03-31  6:55   ` Steven Bosscher
2015-03-31 13:30     ` Richard Biener
2015-03-31 14:06   ` Richard Biener
2015-04-03 13:30     ` Alexandre Oliva
2015-04-06 15:57       ` Jeff Law
2015-12-04 12:45 ` Dominik Vogt
2015-06-09 16:19 David Edelsohn
2015-06-09 18:36 ` Alexandre Oliva
2015-06-09 20:24   ` Alexandre Oliva
2015-06-09 20:59     ` Jakub Jelinek
2015-06-09 21:36     ` Eric Botcazou
2015-06-09 21:38       ` David Edelsohn
