* [PR64164] drop copyrename, integrate into expand
From: Alexandre Oliva @ 2015-03-27 18:04 UTC
To: gcc-patches
This patch reworks the out-of-ssa expander to enable coalescing of SSA
partitions that don't share the same base name. This is done only when
optimizing.
The test we use to tell whether two partitions can be merged no longer
requires them to have the same base variable when optimizing, so they
become eligible for coalescing, as they would after copyrename. We then
compute the partitioning we'd get if all coalescible partitions were
coalesced, and use this partition assignment to assign base var numbers.
These base var numbers are then used to identify conflicts, which used
to be based on shared base vars or base types.
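A minimal sketch of the tentative partitioning described above, not the GCC code itself: it unions every coalescible pair in a toy union-find and then numbers the resulting classes densely, so that each partition gets a base number. All `toy_*` names, the fixed partition count and the pair structure are hypothetical; the real code works on GCC's own partition and coalesce list types and only enumerates names used in copies.

```c
#include <assert.h>

#define TOY_NPARTS 6        /* hypothetical number of SSA partitions */
#define TOY_NO_BASE (-1)    /* marks a class without a base number yet */

/* A coalescible pair of partition numbers (hypothetical type).  */
struct toy_pair { int first, second; };

/* Toy union-find standing in for the tentative partition.  */
static int toy_parent[TOY_NPARTS];

static int
toy_find (int x)
{
  while (toy_parent[x] != x)
    {
      toy_parent[x] = toy_parent[toy_parent[x]];  /* path halving */
      x = toy_parent[x];
    }
  return x;
}

static void
toy_union (int a, int b)
{
  a = toy_find (a);
  b = toy_find (b);
  if (a != b)
    toy_parent[b] = a;
}

/* Union all coalescible PAIRS, then store in BASE[i] a dense index
   identifying the tentative class partition i ended up in.  Return
   the number of distinct classes.  */
static int
toy_assign_bases (const struct toy_pair *pairs, int npairs, int *base)
{
  int index_map[TOY_NPARTS];
  int count = 0;

  for (int i = 0; i < TOY_NPARTS; i++)
    {
      toy_parent[i] = i;
      index_map[i] = TOY_NO_BASE;
    }

  for (int i = 0; i < npairs; i++)
    {
      assert (pairs[i].first >= 0 && pairs[i].first < TOY_NPARTS);
      assert (pairs[i].second >= 0 && pairs[i].second < TOY_NPARTS);
      toy_union (pairs[i].first, pairs[i].second);
    }

  for (int i = 0; i < TOY_NPARTS; i++)
    {
      int root = toy_find (i);
      if (index_map[root] == TOY_NO_BASE)
        index_map[root] = count++;
      base[i] = index_map[root];
    }

  return count;
}
```

In this sketch, two partitions share a base number only if some chain of coalescible pairs links them, so conflicts only need to be computed among partitions with the same base number.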
We now propagate base var names during coalescing proper, only towards
the leader variable. I'm no longer sure this is still needed, but
something about handling variables and results led me this way and I
didn't revisit it. I might rework that with a later patch, or a later
revision of this patch; it would require other means to identify
partitions holding result_decls during merging, or allow that and deal
with param and result decls in a different way during expand proper.
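The leader preference can be sketched roughly as follows, assuming the order given in the ChangeLog entry for attempt_coalesce below (parms and results, then named vars, then ignored vars, then anonymous names). The `toy_kind` enum and `toy_pick_leader` are hypothetical simplifications: they look only at the kind of each partition's base variable, and they assume the two partitions have different base variables.

```c
#include <assert.h>

/* Hypothetical classification of a partition's base variable.  */
enum toy_kind
{
  TOY_ANON,          /* no base variable at all */
  TOY_IGNORED_VAR,   /* compiler temporary (ignored for debug info) */
  TOY_NAMED_VAR,     /* user-visible variable */
  TOY_PARM,          /* parameter without a default def in the partition */
  TOY_PARM_DEFAULT,  /* parameter whose default def is in the partition */
  TOY_RESULT         /* the function's result */
};

/* Higher rank means stronger claim to name the merged partition.
   Default-def parms and results are pinned: they must keep their name.  */
static int
toy_rank (enum toy_kind k)
{
  switch (k)
    {
    case TOY_PARM_DEFAULT:
    case TOY_RESULT:
      return 4;
    case TOY_PARM:
      return 3;
    case TOY_NAMED_VAR:
      return 2;
    case TOY_IGNORED_VAR:
      return 1;
    default:
      return 0;
    }
}

/* Return 1 if the first partition's variable becomes the leader, 2 if
   the second one does, or 0 if the merge must be refused because both
   sides are pinned.  Ties go to the first partition.  */
static int
toy_pick_leader (enum toy_kind k1, enum toy_kind k2)
{
  int r1 = toy_rank (k1);
  int r2 = toy_rank (k2);

  if (r1 == 4 && r2 == 4)
    return 0;
  return r1 >= r2 ? 1 : 2;
}
```

The refusal case corresponds to the "Cannot coalesce PARM_DECL and RESULT_DECL" message in the patch; two default-def parms should already have been kept apart by the conflict graph, so the sketch simply refuses them as well.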
I had to fix two lingering bugs in order for the whole thing to work.
First, we perform conflict detection after abnormal coalescing, but we
computed live ranges involving only the partition leaders, so conflicts
with other names already coalesced wouldn't be detected. Second, we
didn't track default defs for parms as live at entry, so they might end
up coalesced. I guess neither of these problems would have been
exercised in practice, because we wouldn't even consider merging
ssa names associated with different variables.
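To illustrate the second fix, here is a simplified sketch of treating parameter default defs as defined at function entry: each such name conflicts with every other name live at that point. The `toy_name` structure, `TOY_MAX` and the boolean conflict matrix are hypothetical stand-ins; the patch itself does this through live_track_process_def on the conflict graph, within each base class.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX 8   /* hypothetical upper bound on tracked names */

/* What the sketch needs to know about each SSA name.  */
struct toy_name
{
  bool parm_default_def;  /* default definition of a parameter */
  bool live_at_entry;     /* live at the start of the first real block */
};

/* Pretend each parameter default def is defined at entry, and record
   a symmetric conflict between it and every other name live there.  */
static void
toy_add_entry_conflicts (const struct toy_name *names, int n,
                         bool conflict[][TOY_MAX])
{
  assert (n >= 0 && n <= TOY_MAX);

  for (int i = 0; i < n; i++)
    {
      if (!names[i].parm_default_def || !names[i].live_at_entry)
        continue;

      for (int j = 0; j < n; j++)
        {
          if (j == i || !names[j].live_at_entry)
            continue;
          conflict[i][j] = true;
          conflict[j][i] = true;
        }
    }
}
```

With conflicts recorded this way, two different parameters whose incoming values are both live at entry can no longer end up in the same partition.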
In the end, I verified that this fixed the codegen regression in the
PR64164 testcase, which failed to merge two partitions that could in
theory be merged; the merge wasn't even considered due to differences in
the SSA var names.
I'd agree that disregarding the var names and dropping 5 pass instances
is too much of a change to fix this one problem, but... it's something we
should have tackled long ago, and it gets this and other jobs done, so...
Regstrapped on x86_64-linux-gnu native and on i686-pc-linux-gnu native
on x86_64, so without lto. Is this ok to install?
for gcc/ChangeLog
PR rtl-optimization/64164
* Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
* tree-ssa-copyrename.c: Removed.
* opts.c (default_options_table): Drop -ftree-copyrename.
* passes.def: Drop all occurrences of pass_rename_ssa_copies.
* common.opt (ftree-copyrename): Ignore.
(ftree-coalesce-vars, ftree-coalesce-inlined-vars): Likewise.
* doc/invoke.texi: Remove the ignored options above.
* gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
across base variables when optimizing.
* tree-ssa-coalesce.c (build_ssa_conflict_graph): Process
PARM_DECLs' default defs at the entry point.
(attempt_coalesce): Add param_defaults argument, and
track the presence of default defs for params in each
partition. Propagate base var to leader on merge, preferring
parms and results, named vars, ignored vars, and then anon
vars. Refuse to merge a RESULT_DECL partition with a default
PARM_DECL one.
(perform_abnormal_coalescing): Add param_defaults argument,
and pass it to attempt_coalesce.
(coalesce_partitions): Likewise.
(dump_part_var_map): New.
(compute_optimized_partition_bases): New, called by...
(coalesce_ssa_name): ... when optimizing, disabling
partition_view_bitmap's base assignment. Pass local
param_defaults to coalescer functions.
* tree-ssa-live.c (var_map_base_init): Note use only when not
optimizing.
(calculate_live_ranges): Initialize for all SSA names, not
just partition leaders.
---
gcc/Makefile.in | 1
gcc/common.opt | 12 +
gcc/doc/invoke.texi | 29 ---
gcc/gimple-expr.c | 7 -
gcc/opts.c | 1
gcc/passes.def | 5
gcc/tree-ssa-coalesce.c | 336 ++++++++++++++++++++++++++++++
gcc/tree-ssa-copyrename.c | 499 ---------------------------------------------
gcc/tree-ssa-live.c | 11 +
9 files changed, 347 insertions(+), 554 deletions(-)
delete mode 100644 gcc/tree-ssa-copyrename.c
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index de1f3b6..b3149ba 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1427,7 +1427,6 @@ OBJS = \
tree-ssa-ccp.o \
tree-ssa-coalesce.o \
tree-ssa-copy.o \
- tree-ssa-copyrename.o \
tree-ssa-dce.o \
tree-ssa-dom.o \
tree-ssa-dse.o \
diff --git a/gcc/common.opt b/gcc/common.opt
index b49ac46..fefaee7 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2207,16 +2207,16 @@ Common Report Var(flag_tree_ch) Optimization
Enable loop header copying on trees
ftree-coalesce-inlined-vars
-Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization
-Enable coalescing of copy-related user variables that are inlined
+Common Ignore RejectNegative
+Does nothing. Preserved for backward compatibility.
ftree-coalesce-vars
-Common Report Var(flag_ssa_coalesce_vars,2) Optimization
-Enable coalescing of all copy-related user variables
+Common Ignore
+Does nothing. Preserved for backward compatibility.
ftree-copyrename
-Common Report Var(flag_tree_copyrename) Optimization
-Replace SSA temporaries with better names in copies
+Common Ignore
+Does nothing. Preserved for backward compatibility.
ftree-copy-prop
Common Report Var(flag_tree_copy_prop) Optimization
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 08ce074..2f6acb5 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -442,8 +442,7 @@ Objective-C and Objective-C++ Dialects}.
-fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol
-fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
-ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
--ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
--ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
+-ftree-copy-prop -ftree-dce -ftree-dominator-opts -ftree-dse @gol
-ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
-ftree-loop-if-convert-stores -ftree-loop-im @gol
-ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
@@ -8800,32 +8799,6 @@ Perform scalar replacement of aggregates. This pass replaces structure
references with scalars to prevent committing structures to memory too
early. This flag is enabled by default at @option{-O} and higher.
-@item -ftree-copyrename
-@opindex ftree-copyrename
-Perform copy renaming on trees. This pass attempts to rename compiler
-temporaries to other variables at copy locations, usually resulting in
-variable names which more closely resemble the original variables. This flag
-is enabled by default at @option{-O} and higher.
-
-@item -ftree-coalesce-inlined-vars
-@opindex ftree-coalesce-inlined-vars
-Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
-combine small user-defined variables too, but only if they are inlined
-from other functions. It is a more limited form of
-@option{-ftree-coalesce-vars}. This may harm debug information of such
-inlined variables, but it keeps variables of the inlined-into
-function apart from each other, such that they are more likely to
-contain the expected values in a debugging session.
-
-@item -ftree-coalesce-vars
-@opindex ftree-coalesce-vars
-Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
-combine small user-defined variables too, instead of just compiler
-temporaries. This may severely limit the ability to debug an optimized
-program compiled with @option{-fno-var-tracking-assignments}. In the
-negated form, this flag prevents SSA coalescing of user variables,
-including inlined ones. This option is enabled by default.
-
@item -ftree-ter
@opindex ftree-ter
Perform temporary expression replacement during the SSA->normal phase. Single
diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
index efc93b7..62ae577 100644
--- a/gcc/gimple-expr.c
+++ b/gcc/gimple-expr.c
@@ -399,13 +399,14 @@ copy_var_decl (tree var, tree name, tree type)
bool
gimple_can_coalesce_p (tree name1, tree name2)
{
- /* First check the SSA_NAME's associated DECL. We only want to
- coalesce if they have the same DECL or both have no associated DECL. */
+ /* First check the SSA_NAME's associated DECL. Without
+ optimization, we only want to coalesce if they have the same DECL
+ or both have no associated DECL. */
tree var1 = SSA_NAME_VAR (name1);
tree var2 = SSA_NAME_VAR (name2);
var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
- if (var1 != var2)
+ if (var1 != var2 && !optimize)
return false;
/* Now check the types. If the types are the same, then we should
diff --git a/gcc/opts.c b/gcc/opts.c
index 39c190d..8149421 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -453,7 +453,6 @@ static const struct default_options default_options_table[] =
{ OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 },
{ OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 },
- { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 },
diff --git a/gcc/passes.def b/gcc/passes.def
index 2bc5dcd..345f451 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -76,7 +76,6 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_all_early_optimizations);
PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
NEXT_PASS (pass_remove_cgraph_callee_edges);
- NEXT_PASS (pass_rename_ssa_copies);
NEXT_PASS (pass_ccp);
/* After CCP we rewrite no longer addressed locals into SSA
form if possible. */
@@ -152,7 +151,6 @@ along with GCC; see the file COPYING3. If not see
/* Initial scalar cleanups before alias computation.
They ensure memory accesses are not indirect wherever possible. */
NEXT_PASS (pass_strip_predict_hints);
- NEXT_PASS (pass_rename_ssa_copies);
NEXT_PASS (pass_ccp);
/* After CCP we rewrite no longer addressed locals into SSA
form if possible. */
@@ -180,7 +178,6 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_stdarg);
NEXT_PASS (pass_lower_complex);
NEXT_PASS (pass_sra);
- NEXT_PASS (pass_rename_ssa_copies);
/* The dom pass will also resolve all __builtin_constant_p calls
that are still there to 0. This has to be done after some
propagations have already run, but before some more dead code
@@ -289,7 +286,6 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_fold_builtins);
NEXT_PASS (pass_optimize_widening_mul);
NEXT_PASS (pass_tail_calls);
- NEXT_PASS (pass_rename_ssa_copies);
/* FIXME: If DCE is not run before checking for uninitialized uses,
we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
However, this also causes us to misdiagnose cases that should be
@@ -324,7 +320,6 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_dce);
NEXT_PASS (pass_asan);
NEXT_PASS (pass_tsan);
- NEXT_PASS (pass_rename_ssa_copies);
/* ??? We do want some kind of loop invariant motion, but we possibly
need to adjust LIM to be more friendly towards preserving accurate
debug information here. */
diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index dd6b9c0..48a723c 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -833,6 +833,16 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
basic_block bb;
ssa_op_iter iter;
live_track_p live;
+ basic_block entry;
+
+ /* If we are optimizing, we may attempt to coalesce variables from
+ different base variables, including different parameters, so we
+ have to make sure default defs live at the entry block conflict
+ with each other. */
+ if (optimize)
+ entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+ else
+ entry = NULL;
map = live_var_map (liveinfo);
graph = ssa_conflicts_new (num_var_partitions (map));
@@ -891,6 +901,33 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
live_track_process_def (live, result, graph);
}
+ /* Pretend there are defs for params' default defs at the start
+ of the (post-)entry block. We run after abnormal coalescing,
+ so we can't assume the leader variable is the default
+ definition, but because of SSA_NAME_VAR adjustments in
+ attempt_coalesce, we can assume that if there is any
+ PARM_DECL in the partition, it will be the leader's
+ SSA_NAME_VAR. */
+ if (bb == entry)
+ {
+ unsigned part;
+ bitmap_iterator bi;
+ EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, part, bi)
+ {
+ bitmap_iterator bi2;
+ unsigned v;
+ EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[part],
+ 0, v, bi2)
+ {
+ tree var = partition_to_var (map, v);
+ if (!SSA_NAME_VAR (var)
+ || TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL)
+ continue;
+ live_track_process_def (live, var, graph);
+ }
+ }
+ }
+
live_track_clear_base_vars (live);
}
@@ -1127,11 +1164,12 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
static inline bool
attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
- FILE *debug)
+ bitmap param_defaults, FILE *debug)
{
int z;
tree var1, var2;
int p1, p2;
+ bool default_def = false;
p1 = var_to_partition (map, ssa_name (x));
p2 = var_to_partition (map, ssa_name (y));
@@ -1160,6 +1198,70 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
{
var1 = partition_to_var (map, p1);
var2 = partition_to_var (map, p2);
+
+ tree leader;
+
+ if (var1 == var2 || !SSA_NAME_VAR (var2)
+ || SSA_NAME_VAR (var1) == SSA_NAME_VAR (var2))
+ {
+ leader = SSA_NAME_VAR (var1);
+ default_def = (leader && TREE_CODE (leader) == PARM_DECL
+ && (SSA_NAME_IS_DEFAULT_DEF (var1)
+ || bitmap_bit_p (param_defaults, p1)));
+ }
+ else if (!SSA_NAME_VAR (var1))
+ {
+ leader = SSA_NAME_VAR (var2);
+ default_def = (leader && TREE_CODE (leader) == PARM_DECL
+ && (SSA_NAME_IS_DEFAULT_DEF (var2)
+ || bitmap_bit_p (param_defaults, p2)));
+ }
+ else if ((TREE_CODE (SSA_NAME_VAR (var1)) == PARM_DECL
+ && (SSA_NAME_IS_DEFAULT_DEF (var1)
+ || bitmap_bit_p (param_defaults, p1)))
+ || TREE_CODE (SSA_NAME_VAR (var1)) == RESULT_DECL)
+ {
+ if ((TREE_CODE (SSA_NAME_VAR (var2)) == PARM_DECL
+ && (SSA_NAME_IS_DEFAULT_DEF (var2)
+ || bitmap_bit_p (param_defaults, p2)))
+ || TREE_CODE (SSA_NAME_VAR (var2)) == RESULT_DECL)
+ {
+ /* We only have one RESULT_DECL, and two PARM_DECL
+ DEFAULT_DEFs would have conflicted, so we know either
+ one of var1 or var2 is a PARM_DECL, and the other is
+ a RESULT_DECL. */
+ if (debug)
+ fprintf (debug, ": Cannot coalesce PARM_DECL and RESULT_DECL\n");
+ return false;
+ }
+ leader = SSA_NAME_VAR (var1);
+ default_def = TREE_CODE (leader) == PARM_DECL;
+ }
+ else if ((TREE_CODE (SSA_NAME_VAR (var2)) == PARM_DECL
+ && (SSA_NAME_IS_DEFAULT_DEF (var2)
+ || bitmap_bit_p (param_defaults, p2)))
+ || TREE_CODE (SSA_NAME_VAR (var2)) == RESULT_DECL)
+ {
+ leader = SSA_NAME_VAR (var2);
+ default_def = TREE_CODE (leader) == PARM_DECL;
+ }
+ else if (TREE_CODE (SSA_NAME_VAR (var1)) == PARM_DECL)
+ leader = SSA_NAME_VAR (var1);
+ else if (TREE_CODE (SSA_NAME_VAR (var2)) == PARM_DECL)
+ leader = SSA_NAME_VAR (var2);
+ else if (TREE_CODE (SSA_NAME_VAR (var1)) == VAR_DECL
+ && !DECL_IGNORED_P (SSA_NAME_VAR (var1)))
+ leader = SSA_NAME_VAR (var1);
+ else if (TREE_CODE (SSA_NAME_VAR (var2)) == VAR_DECL
+ && !DECL_IGNORED_P (SSA_NAME_VAR (var2)))
+ leader = SSA_NAME_VAR (var2);
+ else if (TREE_CODE (SSA_NAME_VAR (var1)) == VAR_DECL)
+ leader = SSA_NAME_VAR (var1);
+ else if (TREE_CODE (SSA_NAME_VAR (var2)) == VAR_DECL)
+ leader = SSA_NAME_VAR (var2);
+ else /* What else could it be? */
+ gcc_unreachable ();
+
z = var_union (map, var1, var2);
if (z == NO_PARTITION)
{
@@ -1178,8 +1280,46 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
ssa_conflicts_merge (graph, p2, p1);
}
+ if (z == p1)
+ {
+ if (SSA_NAME_VAR (var1) != leader)
+ {
+ replace_ssa_name_symbol (var1, leader);
+ if (debug)
+ {
+ fprintf (debug, ": Renamed ");
+ print_generic_expr (debug, var1, TDF_SLIM);
+ }
+ }
+ if (default_def)
+ {
+ if (SSA_NAME_IS_DEFAULT_DEF (var2))
+ bitmap_clear_bit (param_defaults, p2);
+ bitmap_set_bit (param_defaults, p1);
+ }
+ }
+ else
+ {
+ if (SSA_NAME_VAR (var2) != leader)
+ {
+ replace_ssa_name_symbol (var2, leader);
+ if (debug)
+ {
+ fprintf (debug, ": Renamed ");
+ print_generic_expr (debug, var2, TDF_SLIM);
+ }
+ }
+ if (default_def)
+ {
+ if (SSA_NAME_IS_DEFAULT_DEF (var1))
+ bitmap_clear_bit (param_defaults, p1);
+ bitmap_set_bit (param_defaults, p2);
+ }
+ }
+
if (debug)
fprintf (debug, ": Success -> %d\n", z);
+
return true;
}
@@ -1194,7 +1334,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
Debug output is sent to DEBUG if it is non-NULL. */
static void
-perform_abnormal_coalescing (var_map map, FILE *debug)
+perform_abnormal_coalescing (var_map map, bitmap param_defaults, FILE *debug)
{
basic_block bb;
edge e;
@@ -1223,7 +1363,7 @@ perform_abnormal_coalescing (var_map map, FILE *debug)
if (debug)
fprintf (debug, "Abnormal coalesce: ");
- if (!attempt_coalesce (map, NULL, v1, v2, debug))
+ if (!attempt_coalesce (map, NULL, v1, v2, param_defaults, debug))
fail_abnormal_edge_coalesce (v1, v2);
}
}
@@ -1235,7 +1375,7 @@ perform_abnormal_coalescing (var_map map, FILE *debug)
static void
coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
- FILE *debug)
+ bitmap param_defaults, FILE *debug)
{
int x = 0, y = 0;
tree var1, var2;
@@ -1253,7 +1393,7 @@ coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
if (debug)
fprintf (debug, "Coalesce list: ");
- attempt_coalesce (map, graph, x, y, debug);
+ attempt_coalesce (map, graph, x, y, param_defaults, debug);
}
}
@@ -1281,6 +1421,178 @@ ssa_name_var_hash::equal (const value_type *n1, const compare_type *n2)
}
+/* Output partition map MAP with coalescing plan PART to file F. */
+
+void
+dump_part_var_map (FILE *f, partition part, var_map map)
+{
+ int t;
+ unsigned x, y;
+ int p;
+
+ fprintf (f, "\nCoalescible Partition map \n\n");
+
+ for (x = 0; x < map->num_partitions; x++)
+ {
+ if (map->view_to_partition != NULL)
+ p = map->view_to_partition[x];
+ else
+ p = x;
+
+ if (ssa_name (p) == NULL_TREE
+ || virtual_operand_p (ssa_name (p)))
+ continue;
+
+ t = 0;
+ for (y = 1; y < num_ssa_names; y++)
+ {
+ tree var = version_to_var (map, y);
+ if (!var)
+ continue;
+ int q = var_to_partition (map, var);
+ p = partition_find (part, q);
+ gcc_assert (map->partition_to_base_index[q]
+ == map->partition_to_base_index[p]);
+
+ if (p == (int)x)
+ {
+ if (t++ == 0)
+ {
+ fprintf (f, "Partition %d, base %d (", x,
+ map->partition_to_base_index[q]);
+ print_generic_expr (f, partition_to_var (map, q), TDF_SLIM);
+ fprintf (f, " - ");
+ }
+ fprintf (f, "%d ", y);
+ }
+ }
+ if (t != 0)
+ fprintf (f, ")\n");
+ }
+ fprintf (f, "\n");
+}
+
+/* Fill in MAP's partition_to_base_index, with one index for each
+ partition of SSA names USED_IN_COPIES and related by CL
+ coalesce possibilities. */
+
+static void
+compute_optimized_partition_bases (var_map map, bitmap used_in_copies,
+ coalesce_list_p cl)
+{
+ int parts = num_var_partitions (map);
+ partition tentative = partition_new (parts);
+
+ /* Partition the SSA versions so that, for each coalescible
+ pair, both of its members are in the same partition in
+ TENTATIVE. */
+ gcc_assert (!cl->sorted);
+ coalesce_pair_p node;
+ coalesce_iterator_type ppi;
+ FOR_EACH_PARTITION_PAIR (node, ppi, cl)
+ {
+ tree v1 = ssa_name (node->first_element);
+ int p1 = partition_find (tentative, var_to_partition (map, v1));
+ tree v2 = ssa_name (node->second_element);
+ int p2 = partition_find (tentative, var_to_partition (map, v2));
+
+ if (p1 == p2)
+ continue;
+
+ partition_union (tentative, p1, p2);
+ }
+
+ /* We have to deal with cost one pairs too. */
+ for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next)
+ {
+ tree v1 = ssa_name (co->first_element);
+ int p1 = partition_find (tentative, var_to_partition (map, v1));
+ tree v2 = ssa_name (co->second_element);
+ int p2 = partition_find (tentative, var_to_partition (map, v2));
+
+ if (p1 == p2)
+ continue;
+
+ partition_union (tentative, p1, p2);
+ }
+
+ /* And also with abnormal edges. */
+ basic_block bb;
+ edge e;
+ edge_iterator ei;
+ FOR_EACH_BB_FN (bb, cfun)
+ {
+ FOR_EACH_EDGE (e, ei, bb->preds)
+ if (e->flags & EDGE_ABNORMAL)
+ {
+ gphi_iterator gsi;
+ for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
+ gsi_next (&gsi))
+ {
+ gphi *phi = gsi.phi ();
+ tree arg = PHI_ARG_DEF (phi, e->dest_idx);
+ if (SSA_NAME_IS_DEFAULT_DEF (arg)
+ && (!SSA_NAME_VAR (arg)
+ || TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL))
+ continue;
+
+ tree res = PHI_RESULT (phi);
+
+ int p1 = partition_find (tentative, var_to_partition (map, res));
+ int p2 = partition_find (tentative, var_to_partition (map, arg));
+
+ if (p1 == p2)
+ continue;
+
+ partition_union (tentative, p1, p2);
+ }
+ }
+ }
+
+
+ map->partition_to_base_index = XCNEWVEC (int, parts);
+ auto_vec<unsigned int> index_map (parts);
+ if (parts)
+ index_map.quick_grow (parts);
+
+ const unsigned no_part = -1;
+ unsigned count = parts;
+ while (count)
+ index_map[--count] = no_part;
+
+ /* Initialize MAP's mapping from partition to base index, using
+ as base indices an enumeration of the TENTATIVE partitions in
+ which each SSA version ended up, so that we compute conflicts
+ between all SSA versions that ended up in the same potential
+ coalesce partition. */
+ bitmap_iterator bi;
+ unsigned i;
+ EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
+ {
+ int pidx = var_to_partition (map, ssa_name (i));
+ int base = partition_find (tentative, pidx);
+ if (index_map[base] != no_part)
+ continue;
+ index_map[base] = count++;
+ }
+
+ map->num_basevars = count;
+
+ EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
+ {
+ int pidx = var_to_partition (map, ssa_name (i));
+ int base = partition_find (tentative, pidx);
+ gcc_assert (index_map[base] < count);
+ map->partition_to_base_index[pidx] = index_map[base];
+ }
+
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ dump_part_var_map (dump_file, tentative, map);
+
+ partition_delete (tentative);
+}
+
+
/* Reduce the number of copies by coalescing variables in the function. Return
a partition map with the resulting coalesces. */
@@ -1346,7 +1658,12 @@ coalesce_ssa_name (void)
dump_var_map (dump_file, map);
/* Don't calculate live ranges for variables not in the coalesce list. */
- partition_view_bitmap (map, used_in_copies, true);
+ partition_view_bitmap (map, used_in_copies, !optimize);
+
+ /* If we are optimizing, compute the base indices ourselves. */
+ if (optimize)
+ compute_optimized_partition_bases (map, used_in_copies, cl);
+
BITMAP_FREE (used_in_copies);
if (num_var_partitions (map) < 1)
@@ -1355,13 +1672,15 @@ coalesce_ssa_name (void)
return map;
}
+ bitmap param_defaults = BITMAP_ALLOC (NULL);
+
/* First, coalesce all the copies across abnormal edges. These are not placed
in the coalesce list because they do not need to be sorted, and simply
consume extra memory/compilation time in large programs.
Performing abnormal coalescing also needs no live/conflict computation
because it must succeed (but we lose checking that it indeed does).
Still for PR63155 this reduces memory usage from 10GB to zero. */
- perform_abnormal_coalescing (map,
+ perform_abnormal_coalescing (map, param_defaults,
((dump_flags & TDF_DETAILS) ? dump_file : NULL));
if (dump_file && (dump_flags & TDF_DETAILS))
@@ -1393,9 +1712,10 @@ coalesce_ssa_name (void)
dump_var_map (dump_file, map);
/* Now coalesce everything in the list. */
- coalesce_partitions (map, graph, cl,
+ coalesce_partitions (map, graph, cl, param_defaults,
((dump_flags & TDF_DETAILS) ? dump_file : NULL));
+ BITMAP_FREE (param_defaults);
delete_coalesce_list (cl);
ssa_conflicts_delete (graph);
diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c
deleted file mode 100644
index f3cb56e..0000000
--- a/gcc/tree-ssa-copyrename.c
+++ /dev/null
@@ -1,499 +0,0 @@
-/* Rename SSA copies.
- Copyright (C) 2004-2015 Free Software Foundation, Inc.
- Contributed by Andrew MacLeod <amacleod@redhat.com>
-
-This file is part of GCC.
-
-GCC is free software; you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation; either version 3, or (at your option)
-any later version.
-
-GCC is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-GNU General Public License for more details.
-
-You should have received a copy of the GNU General Public License
-along with GCC; see the file COPYING3. If not see
-<http://www.gnu.org/licenses/>. */
-
-#include "config.h"
-#include "system.h"
-#include "coretypes.h"
-#include "tm.h"
-#include "hash-set.h"
-#include "machmode.h"
-#include "vec.h"
-#include "double-int.h"
-#include "input.h"
-#include "alias.h"
-#include "symtab.h"
-#include "wide-int.h"
-#include "inchash.h"
-#include "tree.h"
-#include "fold-const.h"
-#include "predict.h"
-#include "hard-reg-set.h"
-#include "function.h"
-#include "dominance.h"
-#include "cfg.h"
-#include "basic-block.h"
-#include "tree-ssa-alias.h"
-#include "internal-fn.h"
-#include "gimple-expr.h"
-#include "is-a.h"
-#include "gimple.h"
-#include "gimple-iterator.h"
-#include "flags.h"
-#include "tree-pretty-print.h"
-#include "bitmap.h"
-#include "gimple-ssa.h"
-#include "stringpool.h"
-#include "tree-ssanames.h"
-#include "hashtab.h"
-#include "rtl.h"
-#include "statistics.h"
-#include "real.h"
-#include "fixed-value.h"
-#include "insn-config.h"
-#include "expmed.h"
-#include "dojump.h"
-#include "explow.h"
-#include "calls.h"
-#include "emit-rtl.h"
-#include "varasm.h"
-#include "stmt.h"
-#include "expr.h"
-#include "tree-dfa.h"
-#include "tree-inline.h"
-#include "tree-ssa-live.h"
-#include "tree-pass.h"
-#include "langhooks.h"
-
-static struct
-{
- /* Number of copies coalesced. */
- int coalesced;
-} stats;
-
-/* The following routines implement the SSA copy renaming phase.
-
- This optimization looks for copies between 2 SSA_NAMES, either through a
- direct copy, or an implicit one via a PHI node result and its arguments.
-
- Each copy is examined to determine if it is possible to rename the base
- variable of one of the operands to the same variable as the other operand.
- i.e.
- T.3_5 = <blah>
- a_1 = T.3_5
-
- If this copy couldn't be copy propagated, it could possibly remain in the
- program throughout the optimization phases. After SSA->normal, it would
- become:
-
- T.3 = <blah>
- a = T.3
-
- Since T.3_5 is distinct from all other SSA versions of T.3, there is no
- fundamental reason why the base variable needs to be T.3, subject to
- certain restrictions. This optimization attempts to determine if we can
- change the base variable on copies like this, and result in code such as:
-
- a_5 = <blah>
- a_1 = a_5
-
- This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is
- possible, the copy goes away completely. If it isn't possible, a new temp
- will be created for a_5, and you will end up with the exact same code:
-
- a.8 = <blah>
- a = a.8
-
- The other benefit of performing this optimization relates to what variables
- are chosen in copies. Gimplification of the program uses temporaries for
- a lot of things. expressions like
-
- a_1 = <blah>
- <blah2> = a_1
-
- get turned into
-
- T.3_5 = <blah>
- a_1 = T.3_5
- <blah2> = a_1
-
- Copy propagation is done in a forward direction, and if we can propagate
- through the copy, we end up with:
-
- T.3_5 = <blah>
- <blah2> = T.3_5
-
- The copy is gone, but so is all reference to the user variable 'a'. By
- performing this optimization, we would see the sequence:
-
- a_5 = <blah>
- a_1 = a_5
- <blah2> = a_1
-
- which copy propagation would then turn into:
-
- a_5 = <blah>
- <blah2> = a_5
-
- and so we still retain the user variable whenever possible. */
-
-
-/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
- Choose a representative for the partition, and send debug info to DEBUG. */
-
-static void
-copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
-{
- int p1, p2, p3;
- tree root1, root2;
- tree rep1, rep2;
- bool ign1, ign2, abnorm;
-
- gcc_assert (TREE_CODE (var1) == SSA_NAME);
- gcc_assert (TREE_CODE (var2) == SSA_NAME);
-
- register_ssa_partition (map, var1);
- register_ssa_partition (map, var2);
-
- p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
- p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
-
- if (debug)
- {
- fprintf (debug, "Try : ");
- print_generic_expr (debug, var1, TDF_SLIM);
- fprintf (debug, "(P%d) & ", p1);
- print_generic_expr (debug, var2, TDF_SLIM);
- fprintf (debug, "(P%d)", p2);
- }
-
- gcc_assert (p1 != NO_PARTITION);
- gcc_assert (p2 != NO_PARTITION);
-
- if (p1 == p2)
- {
- if (debug)
- fprintf (debug, " : Already coalesced.\n");
- return;
- }
-
- rep1 = partition_to_var (map, p1);
- rep2 = partition_to_var (map, p2);
- root1 = SSA_NAME_VAR (rep1);
- root2 = SSA_NAME_VAR (rep2);
- if (!root1 && !root2)
- return;
-
- /* Don't coalesce if one of the variables occurs in an abnormal PHI. */
- abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1)
- || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2));
- if (abnorm)
- {
- if (debug)
- fprintf (debug, " : Abnormal PHI barrier. No coalesce.\n");
- return;
- }
-
- /* Partitions already have the same root, simply merge them. */
- if (root1 == root2)
- {
- p1 = partition_union (map->var_partition, p1, p2);
- if (debug)
- fprintf (debug, " : Same root, coalesced --> P%d.\n", p1);
- return;
- }
-
- /* Never attempt to coalesce 2 different parameters. */
- if ((root1 && TREE_CODE (root1) == PARM_DECL)
- && (root2 && TREE_CODE (root2) == PARM_DECL))
- {
- if (debug)
- fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n");
- return;
- }
-
- if ((root1 && TREE_CODE (root1) == RESULT_DECL)
- != (root2 && TREE_CODE (root2) == RESULT_DECL))
- {
- if (debug)
- fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n");
- return;
- }
-
- ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1));
- ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2));
-
- /* Refrain from coalescing user variables, if requested. */
- if (!ign1 && !ign2)
- {
- if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2))
- ign2 = true;
- else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1))
- ign1 = true;
- else if (flag_ssa_coalesce_vars != 2)
- {
- if (debug)
- fprintf (debug, " : 2 different USER vars. No coalesce.\n");
- return;
- }
- else
- ign2 = true;
- }
-
- /* If both values have default defs, we can't coalesce. If only one has a
- tag, make sure that variable is the new root partition. */
- if (root1 && ssa_default_def (cfun, root1))
- {
- if (root2 && ssa_default_def (cfun, root2))
- {
- if (debug)
- fprintf (debug, " : 2 default defs. No coalesce.\n");
- return;
- }
- else
- {
- ign2 = true;
- ign1 = false;
- }
- }
- else if (root2 && ssa_default_def (cfun, root2))
- {
- ign1 = true;
- ign2 = false;
- }
-
- /* Do not coalesce if we cannot assign a symbol to the partition. */
- if (!(!ign2 && root2)
- && !(!ign1 && root1))
- {
- if (debug)
- fprintf (debug, " : Choosen variable has no root. No coalesce.\n");
- return;
- }
-
- /* Don't coalesce if the new chosen root variable would be read-only.
- If both ign1 && ign2, then the root var of the larger partition
- wins, so reject in that case if any of the root vars is TREE_READONLY.
- Otherwise reject only if the root var, on which replace_ssa_name_symbol
- will be called below, is readonly. */
- if (((root1 && TREE_READONLY (root1)) && ign2)
- || ((root2 && TREE_READONLY (root2)) && ign1))
- {
- if (debug)
- fprintf (debug, " : Readonly variable. No coalesce.\n");
- return;
- }
-
- /* Don't coalesce if the two variables aren't type compatible . */
- if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2))
- /* There is a disconnect between the middle-end type-system and
- VRP, avoid coalescing enum types with different bounds. */
- || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE
- || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE)
- && TREE_TYPE (var1) != TREE_TYPE (var2)))
- {
- if (debug)
- fprintf (debug, " : Incompatible types. No coalesce.\n");
- return;
- }
-
- /* Merge the two partitions. */
- p3 = partition_union (map->var_partition, p1, p2);
-
- /* Set the root variable of the partition to the better choice, if there is
- one. */
- if (!ign2 && root2)
- replace_ssa_name_symbol (partition_to_var (map, p3), root2);
- else if (!ign1 && root1)
- replace_ssa_name_symbol (partition_to_var (map, p3), root1);
- else
- gcc_unreachable ();
-
- if (debug)
- {
- fprintf (debug, " --> P%d ", p3);
- print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)),
- TDF_SLIM);
- fprintf (debug, "\n");
- }
-}
-
-
-namespace {
-
-const pass_data pass_data_rename_ssa_copies =
-{
- GIMPLE_PASS, /* type */
- "copyrename", /* name */
- OPTGROUP_NONE, /* optinfo_flags */
- TV_TREE_COPY_RENAME, /* tv_id */
- ( PROP_cfg | PROP_ssa ), /* properties_required */
- 0, /* properties_provided */
- 0, /* properties_destroyed */
- 0, /* todo_flags_start */
- 0, /* todo_flags_finish */
-};
-
-class pass_rename_ssa_copies : public gimple_opt_pass
-{
-public:
- pass_rename_ssa_copies (gcc::context *ctxt)
- : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt)
- {}
-
- /* opt_pass methods: */
- opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); }
- virtual bool gate (function *) { return flag_tree_copyrename != 0; }
- virtual unsigned int execute (function *);
-
-}; // class pass_rename_ssa_copies
-
-/* This function will make a pass through the IL, and attempt to coalesce any
- SSA versions which occur in PHI's or copies. Coalescing is accomplished by
- changing the underlying root variable of all coalesced version. This will
- then cause the SSA->normal pass to attempt to coalesce them all to the same
- variable. */
-
-unsigned int
-pass_rename_ssa_copies::execute (function *fun)
-{
- var_map map;
- basic_block bb;
- tree var, part_var;
- gimple stmt;
- unsigned x;
- FILE *debug;
-
- memset (&stats, 0, sizeof (stats));
-
- if (dump_file && (dump_flags & TDF_DETAILS))
- debug = dump_file;
- else
- debug = NULL;
-
- map = init_var_map (num_ssa_names);
-
- FOR_EACH_BB_FN (bb, fun)
- {
- /* Scan for real copies. */
- for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
- gsi_next (&gsi))
- {
- stmt = gsi_stmt (gsi);
- if (gimple_assign_ssa_name_copy_p (stmt))
- {
- tree lhs = gimple_assign_lhs (stmt);
- tree rhs = gimple_assign_rhs1 (stmt);
-
- copy_rename_partition_coalesce (map, lhs, rhs, debug);
- }
- }
- }
-
- FOR_EACH_BB_FN (bb, fun)
- {
- /* Treat PHI nodes as copies between the result and each argument. */
- for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
- gsi_next (&gsi))
- {
- size_t i;
- tree res;
- gphi *phi = gsi.phi ();
- res = gimple_phi_result (phi);
-
- /* Do not process virtual SSA_NAMES. */
- if (virtual_operand_p (res))
- continue;
-
- /* Make sure to only use the same partition for an argument
- as the result but never the other way around. */
- if (SSA_NAME_VAR (res)
- && !DECL_IGNORED_P (SSA_NAME_VAR (res)))
- for (i = 0; i < gimple_phi_num_args (phi); i++)
- {
- tree arg = PHI_ARG_DEF (phi, i);
- if (TREE_CODE (arg) == SSA_NAME)
- copy_rename_partition_coalesce (map, res, arg,
- debug);
- }
- /* Else if all arguments are in the same partition try to merge
- it with the result. */
- else
- {
- int all_p_same = -1;
- int p = -1;
- for (i = 0; i < gimple_phi_num_args (phi); i++)
- {
- tree arg = PHI_ARG_DEF (phi, i);
- if (TREE_CODE (arg) != SSA_NAME)
- {
- all_p_same = 0;
- break;
- }
- else if (all_p_same == -1)
- {
- p = partition_find (map->var_partition,
- SSA_NAME_VERSION (arg));
- all_p_same = 1;
- }
- else if (all_p_same == 1
- && p != partition_find (map->var_partition,
- SSA_NAME_VERSION (arg)))
- {
- all_p_same = 0;
- break;
- }
- }
- if (all_p_same == 1)
- copy_rename_partition_coalesce (map, res,
- PHI_ARG_DEF (phi, 0),
- debug);
- }
- }
- }
-
- if (debug)
- dump_var_map (debug, map);
-
- /* Now one more pass to make all elements of a partition share the same
- root variable. */
-
- for (x = 1; x < num_ssa_names; x++)
- {
- part_var = partition_to_var (map, x);
- if (!part_var)
- continue;
- var = ssa_name (x);
- if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var))
- continue;
- if (debug)
- {
- fprintf (debug, "Coalesced ");
- print_generic_expr (debug, var, TDF_SLIM);
- fprintf (debug, " to ");
- print_generic_expr (debug, part_var, TDF_SLIM);
- fprintf (debug, "\n");
- }
- stats.coalesced++;
- replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var));
- }
-
- statistics_counter_event (fun, "copies coalesced",
- stats.coalesced);
- delete_var_map (map);
- return 0;
-}
-
-} // anon namespace
-
-gimple_opt_pass *
-make_pass_rename_ssa_copies (gcc::context *ctxt)
-{
- return new pass_rename_ssa_copies (ctxt);
-}
diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
index e0c42669..d8f2b08 100644
--- a/gcc/tree-ssa-live.c
+++ b/gcc/tree-ssa-live.c
@@ -119,7 +119,10 @@ tree_int_map_hasher::equal (const value_type *v, const compare_type *c)
}
-/* This routine will initialize the basevar fields of MAP. */
+/* This routine will initialize the basevar fields of MAP with base
+ names, when we are not optimizing. When optimizing, we'll use
+ partition numbers as base index numbers, see coalesce_ssa_name in
+ tree-ssa-coalesce.c. */
static void
var_map_base_init (var_map map)
@@ -1233,9 +1236,11 @@ calculate_live_ranges (var_map map, bool want_livein)
tree_live_info_p live;
live = new_tree_live_info (map);
- for (i = 0; i < num_var_partitions (map); i++)
+ /* We have already coalesced abnormal SSA names, so iterate over all
+ names, so as to cover all variables in each partition. */
+ for (i = 1; i < num_ssa_names; i++)
{
- var = partition_to_var (map, i);
+ var = ssa_name (i);
if (var != NULL_TREE)
set_var_live_on_entry (var, live);
}
--
Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/ FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
* Re: [PR64164] drop copyrename, integrate into expand
2015-03-27 18:04 [PR64164] drop copyrename, integrate into expand Alexandre Oliva
@ 2015-03-27 18:11 ` Alexandre Oliva
2015-03-28 19:22 ` Alexandre Oliva
2015-12-04 12:45 ` Dominik Vogt
2 siblings, 0 replies; 127+ messages in thread
From: Alexandre Oliva @ 2015-03-27 18:11 UTC (permalink / raw)
To: gcc-patches
On Mar 27, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
> Regstrapped on x86_64-linux-gnu native and on i686-pc-linux-gnu native
> on x86_64, so without lto. Is this ok to install?
Err, sorry, I didn't mean to post that message yet. I was just drafting
it while several of the issues were still fresh, and a wrong keystroke
sent it out. I still have a couple of ICEs to look into, and a number
of guality regressions to analyze. Apologies for setting false
expectations, but I will get there, and when I do, I'll post a revised
patch.
Comments on the one I posted are welcome nevertheless.
* Re: [PR64164] drop copyrename, integrate into expand
2015-03-27 18:04 [PR64164] drop copyrename, integrate into expand Alexandre Oliva
2015-03-27 18:11 ` Alexandre Oliva
@ 2015-03-28 19:22 ` Alexandre Oliva
2015-03-31 5:11 ` Jeff Law
` (2 more replies)
2015-12-04 12:45 ` Dominik Vogt
2 siblings, 3 replies; 127+ messages in thread
From: Alexandre Oliva @ 2015-03-28 19:22 UTC (permalink / raw)
To: gcc-patches
On Mar 27, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
> This patch reworks the out-of-ssa expander to enable coalescing of SSA
> partitions that don't share the same base name. This is done only when
> optimizing.
> The test we use to tell whether two partitions can be merged no longer
> demands them to have the same base variable when optimizing, so they
> become eligible for coalescing, as they would after copyrename. We then
> compute the partitioning we'd get if all coalescible partitions were
> coalesced, using this partition assignment to assign base vars numbers.
> These base var numbers are then used to identify conflicts, which used
> to be based on shared base vars or base types.
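To make the quoted paragraph concrete, here is a minimal, self-contained sketch of the idea: union every coalescible pair in a tentative partition, then enumerate the resulting classes and use those numbers as base indices. It is not the code in the patch. `struct tentative`, `struct coalesce_pair` and `assign_base_indices` are hypothetical names, and a plain array-based union-find stands in for GCC's partition type.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PARTS 64

/* Hypothetical stand-in for a partition: a plain union-find.  */
struct tentative
{
  int parent[MAX_PARTS];
  int n;
};

/* Hypothetical coalescible pair of partition numbers.  */
struct coalesce_pair
{
  int first;
  int second;
};

static void
tentative_init (struct tentative *t, int n)
{
  assert (n >= 0 && n <= MAX_PARTS);
  t->n = n;
  for (int i = 0; i < n; i++)
    t->parent[i] = i;
}

static int
tentative_find (struct tentative *t, int x)
{
  assert (x >= 0 && x < t->n);
  while (t->parent[x] != x)
    {
      /* Path halving keeps the trees shallow.  */
      t->parent[x] = t->parent[t->parent[x]];
      x = t->parent[x];
    }
  return x;
}

static void
tentative_union (struct tentative *t, int a, int b)
{
  int ra = tentative_find (t, a);
  int rb = tentative_find (t, b);
  if (ra != rb)
    t->parent[rb] = ra;
}

/* Union every coalescible pair, then number the resulting classes.
   BASE[i] receives the base index of partition i.  Returns the number
   of distinct base indices.  */
static int
assign_base_indices (int nparts, const struct coalesce_pair *pairs,
		     size_t npairs, int *base)
{
  struct tentative t;
  int index_of_root[MAX_PARTS];
  int count = 0;

  tentative_init (&t, nparts);
  for (size_t i = 0; i < npairs; i++)
    tentative_union (&t, pairs[i].first, pairs[i].second);

  for (int i = 0; i < nparts; i++)
    index_of_root[i] = -1;

  /* Enumerate classes in order of first appearance.  */
  for (int i = 0; i < nparts; i++)
    {
      int root = tentative_find (&t, i);
      if (index_of_root[root] < 0)
	index_of_root[root] = count++;
      base[i] = index_of_root[root];
    }
  return count;
}
```

Two partitions that get the same base index are the only ones that could ever be coalesced, so conflicts only need to be computed among them.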
> We now propagate base var names during coalescing proper, only towards
> the leader variable. I'm no longer sure this is still needed, but
> something about handling variables and results led me this way and I
> didn't revisit it. I might rework that with a later patch, or a later
> revision of this patch; it would require other means to identify
> partitions holding result_decls during merging, or allow that and deal
> with param and result decls in a different way during expand proper.
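As a rough illustration of how the leader's base variable is chosen on merge, the sketch below follows the preference order given in the ChangeLog entry for attempt_coalesce: parms and results first, then named vars, then ignored vars, then anonymous names. It is a simplification under stated assumptions. `enum base_kind`, `kind_rank` and `choose_leader` are hypothetical names, and the caller is assumed to have already handled two names that share the same base variable.

```c
#include <assert.h>

/* Hypothetical classification of an SSA name's base variable.  */
enum base_kind
{
  BASE_ANON,		/* No associated decl.  */
  BASE_IGNORED_VAR,	/* VAR_DECL with DECL_IGNORED_P set.  */
  BASE_NAMED_VAR,	/* User-visible VAR_DECL.  */
  BASE_PARM,		/* PARM_DECL, partition has no default def.  */
  BASE_DEFAULT_PARM,	/* PARM_DECL, partition holds its default def.  */
  BASE_RESULT		/* RESULT_DECL.  */
};

/* Higher rank wins.  Default-def parms and results share the top rank.  */
static int
kind_rank (enum base_kind k)
{
  switch (k)
    {
    case BASE_ANON:
      return 0;
    case BASE_IGNORED_VAR:
      return 1;
    case BASE_NAMED_VAR:
      return 2;
    case BASE_PARM:
      return 3;
    case BASE_DEFAULT_PARM:
    case BASE_RESULT:
      return 4;
    }
  assert (!"unknown base kind");
  return -1;
}

/* Return 0 if the first partition supplies the leader's base variable,
   1 if the second one does, or -1 if the merge must be refused because
   both sides hold a default-def parm or a result.  */
static int
choose_leader (enum base_kind k1, enum base_kind k2)
{
  if (kind_rank (k1) == 4 && kind_rank (k2) == 4)
    return -1;
  return kind_rank (k2) > kind_rank (k1) ? 1 : 0;
}
```

On ties the first partition keeps its base variable, which avoids a needless rename.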
> I had to fix two lingering bugs in order for the whole thing to work: we
> perform conflict detection after abnormal coalescing, but we computed
> live ranges involving only the partition leaders, so conflicts with
> other names already coalesced wouldn't be detected.
This early abnormal coalescing was only present in the trunk for a few
days, and I was lucky enough to start working on a tree that had it.
The fix I made for it thus turned out to be unnecessary, so I dropped
it. It was that fix, which didn't cover the live range check, that
caused the two ICEs I saw in the regression tests. Since both the
ultimate cause of the problem and the change that introduced the check
failures are gone, both problems went *poof* after I updated the tree,
resolved the conflicts and dropped the redundant code.
> The other problem was that we didn't track default defs for parms as
> live at entry, so they might end up coalesced.
I improved this a little bit, using the bitmap of partitions containing
default params to check that we only process function-entry defs for
them, rather than for all param decls in case they end up in other
partitions.
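The check described above can be sketched as a small predicate: a partition gets a pretend definition at function entry only if its base variable is a parameter and the partition is known to contain that parameter's default def, either because its leader is the default def or because its bit is set in the bitmap. This is an illustration, not the patch's code. `struct part_info` and `needs_entry_def` are hypothetical, and a 64-bit mask stands in for GCC's bitmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-partition summary.  */
struct part_info
{
  bool base_is_parm;		/* Leader's base variable is a PARM_DECL.  */
  bool leader_is_default_def;	/* Leader is itself the default def.  */
};

/* Return true if partition PART should be treated as defined at the
   start of the entry block.  PARAM_DEFAULTS has one bit per partition,
   set when a default def for a parm was merged into it.  */
static bool
needs_entry_def (const struct part_info *parts, int nparts,
		 uint64_t param_defaults, int part)
{
  assert (nparts <= 64 && part >= 0 && part < nparts);

  if (!parts[part].base_is_parm)
    return false;

  return (parts[part].leader_is_default_def
	  || ((param_defaults >> part) & 1) != 0);
}
```

A parameter-based partition that holds no default def is skipped, which is the refinement over processing every param decl.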
> I guess none of these problems would have been exercised in practice,
> because we wouldn't even consider merging ssa names associated with
> different variables.
> In the end, I verified that this fixed the codegen regression in the
> PR64164 testcase, that failed to merge two partitions that could in
> theory be merged, but that wasn't even considered due to differences in
> the SSA var names.
> I'd agree that disregarding the var names and dropping 4 passes is too
> much of a change to fix this one problem, but... it's something we
> should have long tackled, and it gets this and other jobs done, so...
Regstrapped on x86_64-linux-gnu native and on i686-pc-linux-gnu native
on x86_64, so without lto plugin. The only regression is in
gcc.dg/guality/pr54200.c, which explicitly disables VTA. When
optimization is enabled, the different coalescing we perform now causes
VTA-less variable tracking to lose track of variable "z". This
regression in non-VTA var-tracking is expected and, as richi put it in
PR 64164, I guess we don't care about that, do we? :-)
The other guality regressions I mentioned in my other email turned out
not to be regressions, but preexisting failures that somehow did not
make it into the test_summary of my earlier pristine build.
Is this ok to install?
for gcc/ChangeLog
PR rtl-optimization/64164
* Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
* tree-ssa-copyrename.c: Removed.
* opts.c (default_options_table): Drop -ftree-copyrename.
* passes.def: Drop all occurrences of pass_rename_ssa_copies.
* common.opt (ftree-copyrename): Ignore.
(ftree-coalesce-vars, ftree-coalesce-inlined-vars): Likewise.
* doc/invoke.texi: Remove the ignored options above.
* gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
across base variables when optimizing.
* tree-ssa-coalesce.c (build_ssa_conflict_graph): Add
param_defaults argument. Process PARM_DECLs' default defs at
the entry point.
(attempt_coalesce): Add param_defaults argument, and
track the presence of default defs for params in each
partition. Propagate base var to leader on merge, preferring
parms and results, named vars, ignored vars, and then anon
vars. Refuse to merge a RESULT_DECL partition with a default
PARM_DECL one.
(coalesce_partitions): Add param_defaults argument,
and pass it to attempt_coalesce.
(dump_part_var_map): New.
(compute_optimized_partition_bases): New, called by...
(coalesce_ssa_name): ... when optimizing, disabling
partition_view_bitmap's base assignment. Pass local
param_defaults to coalescer functions.
* tree-ssa-live.c (var_map_base_init): Note use only when not
optimizing.
---
gcc/Makefile.in | 1
gcc/common.opt | 12 +
gcc/doc/invoke.texi | 29 ---
gcc/gimple-expr.c | 7 -
gcc/opts.c | 1
gcc/passes.def | 5
gcc/tree-ssa-coalesce.c | 342 ++++++++++++++++++++++++++++++-
gcc/tree-ssa-copyrename.c | 499 ---------------------------------------------
gcc/tree-ssa-live.c | 5
9 files changed, 347 insertions(+), 554 deletions(-)
delete mode 100644 gcc/tree-ssa-copyrename.c
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index f924fb8..990c4e9 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1428,7 +1428,6 @@ OBJS = \
tree-ssa-ccp.o \
tree-ssa-coalesce.o \
tree-ssa-copy.o \
- tree-ssa-copyrename.o \
tree-ssa-dce.o \
tree-ssa-dom.o \
tree-ssa-dse.o \
diff --git a/gcc/common.opt b/gcc/common.opt
index b49ac46..fefaee7 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2207,16 +2207,16 @@ Common Report Var(flag_tree_ch) Optimization
Enable loop header copying on trees
ftree-coalesce-inlined-vars
-Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization
-Enable coalescing of copy-related user variables that are inlined
+Common Ignore RejectNegative
+Does nothing. Preserved for backward compatibility.
ftree-coalesce-vars
-Common Report Var(flag_ssa_coalesce_vars,2) Optimization
-Enable coalescing of all copy-related user variables
+Common Ignore
+Does nothing. Preserved for backward compatibility.
ftree-copyrename
-Common Report Var(flag_tree_copyrename) Optimization
-Replace SSA temporaries with better names in copies
+Common Ignore
+Does nothing. Preserved for backward compatibility.
ftree-copy-prop
Common Report Var(flag_tree_copy_prop) Optimization
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 9749727..5d2c516 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -442,8 +442,7 @@ Objective-C and Objective-C++ Dialects}.
-fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol
-fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
-ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
--ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
--ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
+-ftree-copy-prop -ftree-dce -ftree-dominator-opts -ftree-dse @gol
-ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
-ftree-loop-if-convert-stores -ftree-loop-im @gol
-ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
@@ -8822,32 +8821,6 @@ Perform scalar replacement of aggregates. This pass replaces structure
references with scalars to prevent committing structures to memory too
early. This flag is enabled by default at @option{-O} and higher.
-@item -ftree-copyrename
-@opindex ftree-copyrename
-Perform copy renaming on trees. This pass attempts to rename compiler
-temporaries to other variables at copy locations, usually resulting in
-variable names which more closely resemble the original variables. This flag
-is enabled by default at @option{-O} and higher.
-
-@item -ftree-coalesce-inlined-vars
-@opindex ftree-coalesce-inlined-vars
-Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
-combine small user-defined variables too, but only if they are inlined
-from other functions. It is a more limited form of
-@option{-ftree-coalesce-vars}. This may harm debug information of such
-inlined variables, but it keeps variables of the inlined-into
-function apart from each other, such that they are more likely to
-contain the expected values in a debugging session.
-
-@item -ftree-coalesce-vars
-@opindex ftree-coalesce-vars
-Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
-combine small user-defined variables too, instead of just compiler
-temporaries. This may severely limit the ability to debug an optimized
-program compiled with @option{-fno-var-tracking-assignments}. In the
-negated form, this flag prevents SSA coalescing of user variables,
-including inlined ones. This option is enabled by default.
-
@item -ftree-ter
@opindex ftree-ter
Perform temporary expression replacement during the SSA->normal phase. Single
diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
index efc93b7..62ae577 100644
--- a/gcc/gimple-expr.c
+++ b/gcc/gimple-expr.c
@@ -399,13 +399,14 @@ copy_var_decl (tree var, tree name, tree type)
bool
gimple_can_coalesce_p (tree name1, tree name2)
{
- /* First check the SSA_NAME's associated DECL. We only want to
- coalesce if they have the same DECL or both have no associated DECL. */
+ /* First check the SSA_NAME's associated DECL. Without
+ optimization, we only want to coalesce if they have the same DECL
+ or both have no associated DECL. */
tree var1 = SSA_NAME_VAR (name1);
tree var2 = SSA_NAME_VAR (name2);
var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
- if (var1 != var2)
+ if (var1 != var2 && !optimize)
return false;
/* Now check the types. If the types are the same, then we should
diff --git a/gcc/opts.c b/gcc/opts.c
index 39c190d..8149421 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -453,7 +453,6 @@ static const struct default_options default_options_table[] =
{ OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 },
{ OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 },
- { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 },
diff --git a/gcc/passes.def b/gcc/passes.def
index 1d598b2..f8fd0ef 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -77,7 +77,6 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_all_early_optimizations);
PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
NEXT_PASS (pass_remove_cgraph_callee_edges);
- NEXT_PASS (pass_rename_ssa_copies);
NEXT_PASS (pass_object_sizes);
NEXT_PASS (pass_ccp);
/* After CCP we rewrite no longer addressed locals into SSA
@@ -154,7 +153,6 @@ along with GCC; see the file COPYING3. If not see
/* Initial scalar cleanups before alias computation.
They ensure memory accesses are not indirect wherever possible. */
NEXT_PASS (pass_strip_predict_hints);
- NEXT_PASS (pass_rename_ssa_copies);
NEXT_PASS (pass_ccp);
/* After CCP we rewrite no longer addressed locals into SSA
form if possible. */
@@ -182,7 +180,6 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_stdarg);
NEXT_PASS (pass_lower_complex);
NEXT_PASS (pass_sra);
- NEXT_PASS (pass_rename_ssa_copies);
/* The dom pass will also resolve all __builtin_constant_p calls
that are still there to 0. This has to be done after some
propagations have already run, but before some more dead code
@@ -291,7 +288,6 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_fold_builtins);
NEXT_PASS (pass_optimize_widening_mul);
NEXT_PASS (pass_tail_calls);
- NEXT_PASS (pass_rename_ssa_copies);
/* FIXME: If DCE is not run before checking for uninitialized uses,
we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
However, this also causes us to misdiagnose cases that should be
@@ -326,7 +322,6 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_dce);
NEXT_PASS (pass_asan);
NEXT_PASS (pass_tsan);
- NEXT_PASS (pass_rename_ssa_copies);
/* ??? We do want some kind of loop invariant motion, but we possibly
need to adjust LIM to be more friendly towards preserving accurate
debug information here. */
diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index 1afeefe..8557d84 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -825,13 +825,23 @@ live_track_clear_base_vars (live_track_p ptr)
base variable are added. */
static ssa_conflicts_p
-build_ssa_conflict_graph (tree_live_info_p liveinfo)
+build_ssa_conflict_graph (tree_live_info_p liveinfo, bitmap param_defaults)
{
ssa_conflicts_p graph;
var_map map;
basic_block bb;
ssa_op_iter iter;
live_track_p live;
+ basic_block entry;
+
+ /* If we are optimizing, we may attempt to coalesce variables from
+ different base variables, including different parameters, so we
+ have to make sure default defs live at the entry block conflict
+ with each other. */
+ if (optimize)
+ entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+ else
+ entry = NULL;
map = live_var_map (liveinfo);
graph = ssa_conflicts_new (num_var_partitions (map));
@@ -890,6 +900,36 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
live_track_process_def (live, result, graph);
}
+ /* Pretend there are defs for params' default defs at the start
+ of the (post-)entry block. We run after abnormal coalescing,
+ so we can't assume the leader variable is the default
+ definition, but because of SSA_NAME_VAR adjustments in
+ attempt_coalesce, we can assume that if there is any
+ PARM_DECL in the partition, it will be the leader's
+ SSA_NAME_VAR. */
+ if (bb == entry)
+ {
+ unsigned base;
+ bitmap_iterator bi;
+ EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi)
+ {
+ bitmap_iterator bi2;
+ unsigned part;
+ EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base],
+ 0, part, bi2)
+ {
+ tree var = partition_to_var (map, part);
+ if (!SSA_NAME_VAR (var)
+ || TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL
+ || !(SSA_NAME_IS_DEFAULT_DEF (var)
+ || (param_defaults
+ && bitmap_bit_p (param_defaults, part))))
+ continue;
+ live_track_process_def (live, var, graph);
+ }
+ }
+ }
+
live_track_clear_base_vars (live);
}
@@ -1126,11 +1166,12 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
static inline bool
attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
- FILE *debug)
+ bitmap param_defaults, FILE *debug)
{
int z;
tree var1, var2;
int p1, p2;
+ bool default_def = false;
p1 = var_to_partition (map, ssa_name (x));
p2 = var_to_partition (map, ssa_name (y));
@@ -1158,6 +1199,70 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
{
var1 = partition_to_var (map, p1);
var2 = partition_to_var (map, p2);
+
+ tree leader;
+
+ if (var1 == var2 || !SSA_NAME_VAR (var2)
+ || SSA_NAME_VAR (var1) == SSA_NAME_VAR (var2))
+ {
+ leader = SSA_NAME_VAR (var1);
+ default_def = (leader && TREE_CODE (leader) == PARM_DECL
+ && (SSA_NAME_IS_DEFAULT_DEF (var1)
+ || bitmap_bit_p (param_defaults, p1)));
+ }
+ else if (!SSA_NAME_VAR (var1))
+ {
+ leader = SSA_NAME_VAR (var2);
+ default_def = (leader && TREE_CODE (leader) == PARM_DECL
+ && (SSA_NAME_IS_DEFAULT_DEF (var2)
+ || bitmap_bit_p (param_defaults, p2)));
+ }
+ else if ((TREE_CODE (SSA_NAME_VAR (var1)) == PARM_DECL
+ && (SSA_NAME_IS_DEFAULT_DEF (var1)
+ || bitmap_bit_p (param_defaults, p1)))
+ || TREE_CODE (SSA_NAME_VAR (var1)) == RESULT_DECL)
+ {
+ if ((TREE_CODE (SSA_NAME_VAR (var2)) == PARM_DECL
+ && (SSA_NAME_IS_DEFAULT_DEF (var2)
+ || bitmap_bit_p (param_defaults, p2)))
+ || TREE_CODE (SSA_NAME_VAR (var2)) == RESULT_DECL)
+ {
+ /* We only have one RESULT_DECL, and two PARM_DECL
+ DEFAULT_DEFs would have conflicted, so we know either
+ one of var1 or var2 is a PARM_DECL, and the other is
+ a RESULT_DECL. */
+ if (debug)
+ fprintf (debug, ": Cannot coalesce PARM_DECL and RESULT_DECL\n");
+ return false;
+ }
+ leader = SSA_NAME_VAR (var1);
+ default_def = TREE_CODE (leader) == PARM_DECL;
+ }
+ else if ((TREE_CODE (SSA_NAME_VAR (var2)) == PARM_DECL
+ && (SSA_NAME_IS_DEFAULT_DEF (var2)
+ || bitmap_bit_p (param_defaults, p2)))
+ || TREE_CODE (SSA_NAME_VAR (var2)) == RESULT_DECL)
+ {
+ leader = SSA_NAME_VAR (var2);
+ default_def = TREE_CODE (leader) == PARM_DECL;
+ }
+ else if (TREE_CODE (SSA_NAME_VAR (var1)) == PARM_DECL)
+ leader = SSA_NAME_VAR (var1);
+ else if (TREE_CODE (SSA_NAME_VAR (var2)) == PARM_DECL)
+ leader = SSA_NAME_VAR (var2);
+ else if (TREE_CODE (SSA_NAME_VAR (var1)) == VAR_DECL
+ && !DECL_IGNORED_P (SSA_NAME_VAR (var1)))
+ leader = SSA_NAME_VAR (var1);
+ else if (TREE_CODE (SSA_NAME_VAR (var2)) == VAR_DECL
+ && !DECL_IGNORED_P (SSA_NAME_VAR (var2)))
+ leader = SSA_NAME_VAR (var2);
+ else if (TREE_CODE (SSA_NAME_VAR (var1)) == VAR_DECL)
+ leader = SSA_NAME_VAR (var1);
+ else if (TREE_CODE (SSA_NAME_VAR (var2)) == VAR_DECL)
+ leader = SSA_NAME_VAR (var2);
+ else /* What else could it be? */
+ gcc_unreachable ();
+
z = var_union (map, var1, var2);
if (z == NO_PARTITION)
{
@@ -1173,8 +1278,46 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
else
ssa_conflicts_merge (graph, p2, p1);
+ if (z == p1)
+ {
+ if (SSA_NAME_VAR (var1) != leader)
+ {
+ replace_ssa_name_symbol (var1, leader);
+ if (debug)
+ {
+ fprintf (debug, ": Renamed ");
+ print_generic_expr (debug, var1, TDF_SLIM);
+ }
+ }
+ if (default_def)
+ {
+ if (SSA_NAME_IS_DEFAULT_DEF (var2))
+ bitmap_clear_bit (param_defaults, p2);
+ bitmap_set_bit (param_defaults, p1);
+ }
+ }
+ else
+ {
+ if (SSA_NAME_VAR (var2) != leader)
+ {
+ replace_ssa_name_symbol (var2, leader);
+ if (debug)
+ {
+ fprintf (debug, ": Renamed ");
+ print_generic_expr (debug, var2, TDF_SLIM);
+ }
+ }
+ if (default_def)
+ {
+ if (SSA_NAME_IS_DEFAULT_DEF (var1))
+ bitmap_clear_bit (param_defaults, p1);
+ bitmap_set_bit (param_defaults, p2);
+ }
+ }
+
if (debug)
fprintf (debug, ": Success -> %d\n", z);
+
return true;
}
@@ -1190,7 +1333,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
static void
coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
- FILE *debug)
+ bitmap param_defaults, FILE *debug)
{
int x = 0, y = 0;
tree var1, var2;
@@ -1226,7 +1369,7 @@ coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
if (debug)
fprintf (debug, "Abnormal coalesce: ");
- if (!attempt_coalesce (map, graph, v1, v2, debug))
+ if (!attempt_coalesce (map, graph, v1, v2, param_defaults, debug))
fail_abnormal_edge_coalesce (v1, v2);
}
}
@@ -1244,7 +1387,7 @@ coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
if (debug)
fprintf (debug, "Coalesce list: ");
- attempt_coalesce (map, graph, x, y, debug);
+ attempt_coalesce (map, graph, x, y, param_defaults, debug);
}
}
@@ -1272,6 +1415,178 @@ ssa_name_var_hash::equal (const value_type *n1, const compare_type *n2)
}
+/* Output partition map MAP with coalescing plan PART to file F. */
+
+void
+dump_part_var_map (FILE *f, partition part, var_map map)
+{
+ int t;
+ unsigned x, y;
+ int p;
+
+ fprintf (f, "\nCoalescible Partition map \n\n");
+
+ for (x = 0; x < map->num_partitions; x++)
+ {
+ if (map->view_to_partition != NULL)
+ p = map->view_to_partition[x];
+ else
+ p = x;
+
+ if (ssa_name (p) == NULL_TREE
+ || virtual_operand_p (ssa_name (p)))
+ continue;
+
+ t = 0;
+ for (y = 1; y < num_ssa_names; y++)
+ {
+ tree var = version_to_var (map, y);
+ if (!var)
+ continue;
+ int q = var_to_partition (map, var);
+ p = partition_find (part, q);
+ gcc_assert (map->partition_to_base_index[q]
+ == map->partition_to_base_index[p]);
+
+ if (p == (int)x)
+ {
+ if (t++ == 0)
+ {
+ fprintf (f, "Partition %d, base %d (", x,
+ map->partition_to_base_index[q]);
+ print_generic_expr (f, partition_to_var (map, q), TDF_SLIM);
+ fprintf (f, " - ");
+ }
+ fprintf (f, "%d ", y);
+ }
+ }
+ if (t != 0)
+ fprintf (f, ")\n");
+ }
+ fprintf (f, "\n");
+}
+
+/* Fill in MAP's partition_to_base_index, with one index for each
+ partition of SSA names USED_IN_COPIES and related by CL
+ coalesce possibilities. */
+
+static void
+compute_optimized_partition_bases (var_map map, bitmap used_in_copies,
+ coalesce_list_p cl)
+{
+ int parts = num_var_partitions (map);
+ partition tentative = partition_new (parts);
+
+ /* Partition the SSA versions so that, for each coalescible
+ pair, both of its members are in the same partition in
+ TENTATIVE. */
+ gcc_assert (!cl->sorted);
+ coalesce_pair_p node;
+ coalesce_iterator_type ppi;
+ FOR_EACH_PARTITION_PAIR (node, ppi, cl)
+ {
+ tree v1 = ssa_name (node->first_element);
+ int p1 = partition_find (tentative, var_to_partition (map, v1));
+ tree v2 = ssa_name (node->second_element);
+ int p2 = partition_find (tentative, var_to_partition (map, v2));
+
+ if (p1 == p2)
+ continue;
+
+ partition_union (tentative, p1, p2);
+ }
+
+ /* We have to deal with cost one pairs too. */
+ for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next)
+ {
+ tree v1 = ssa_name (co->first_element);
+ int p1 = partition_find (tentative, var_to_partition (map, v1));
+ tree v2 = ssa_name (co->second_element);
+ int p2 = partition_find (tentative, var_to_partition (map, v2));
+
+ if (p1 == p2)
+ continue;
+
+ partition_union (tentative, p1, p2);
+ }
+
+ /* And also with abnormal edges. */
+ basic_block bb;
+ edge e;
+ edge_iterator ei;
+ FOR_EACH_BB_FN (bb, cfun)
+ {
+ FOR_EACH_EDGE (e, ei, bb->preds)
+ if (e->flags & EDGE_ABNORMAL)
+ {
+ gphi_iterator gsi;
+ for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
+ gsi_next (&gsi))
+ {
+ gphi *phi = gsi.phi ();
+ tree arg = PHI_ARG_DEF (phi, e->dest_idx);
+ if (SSA_NAME_IS_DEFAULT_DEF (arg)
+ && (!SSA_NAME_VAR (arg)
+ || TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL))
+ continue;
+
+ tree res = PHI_RESULT (phi);
+
+ int p1 = partition_find (tentative, var_to_partition (map, res));
+ int p2 = partition_find (tentative, var_to_partition (map, arg));
+
+ if (p1 == p2)
+ continue;
+
+ partition_union (tentative, p1, p2);
+ }
+ }
+ }
+
+
+ map->partition_to_base_index = XCNEWVEC (int, parts);
+ auto_vec<unsigned int> index_map (parts);
+ if (parts)
+ index_map.quick_grow (parts);
+
+ const unsigned no_part = -1;
+ unsigned count = parts;
+ while (count)
+ index_map[--count] = no_part;
+
+ /* Initialize MAP's mapping from partition to base index, using
+ as base indices an enumeration of the TENTATIVE partitions in
+ which each SSA version ended up, so that we compute conflicts
+ between all SSA versions that ended up in the same potential
+ coalesce partition. */
+ bitmap_iterator bi;
+ unsigned i;
+ EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
+ {
+ int pidx = var_to_partition (map, ssa_name (i));
+ int base = partition_find (tentative, pidx);
+ if (index_map[base] != no_part)
+ continue;
+ index_map[base] = count++;
+ }
+
+ map->num_basevars = count;
+
+ EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
+ {
+ int pidx = var_to_partition (map, ssa_name (i));
+ int base = partition_find (tentative, pidx);
+ gcc_assert (index_map[base] < count);
+ map->partition_to_base_index[pidx] = index_map[base];
+ }
+
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ dump_part_var_map (dump_file, tentative, map);
+
+ partition_delete (tentative);
+}
+
+
/* Reduce the number of copies by coalescing variables in the function. Return
a partition map with the resulting coalesces. */
@@ -1332,7 +1647,12 @@ coalesce_ssa_name (void)
dump_var_map (dump_file, map);
/* Don't calculate live ranges for variables not in the coalesce list. */
- partition_view_bitmap (map, used_in_copies, true);
+ partition_view_bitmap (map, used_in_copies, !optimize);
+
+ /* If we are optimizing, compute the base indices ourselves. */
+ if (optimize)
+ compute_optimized_partition_bases (map, used_in_copies, cl);
+
BITMAP_FREE (used_in_copies);
if (num_var_partitions (map) < 1)
@@ -1341,6 +1661,8 @@ coalesce_ssa_name (void)
return map;
}
+ bitmap param_defaults = BITMAP_ALLOC (NULL);
+
if (dump_file && (dump_flags & TDF_DETAILS))
dump_var_map (dump_file, map);
@@ -1350,7 +1672,7 @@ coalesce_ssa_name (void)
dump_live_info (dump_file, liveinfo, LIVEDUMP_ENTRY);
/* Build a conflict graph. */
- graph = build_ssa_conflict_graph (liveinfo);
+ graph = build_ssa_conflict_graph (liveinfo, param_defaults);
delete_tree_live_info (liveinfo);
if (dump_file && (dump_flags & TDF_DETAILS))
ssa_conflicts_dump (dump_file, graph);
@@ -1370,10 +1692,10 @@ coalesce_ssa_name (void)
dump_var_map (dump_file, map);
/* Now coalesce everything in the list. */
- coalesce_partitions (map, graph, cl,
- ((dump_flags & TDF_DETAILS) ? dump_file
- : NULL));
+ coalesce_partitions (map, graph, cl, param_defaults,
+ ((dump_flags & TDF_DETAILS) ? dump_file : NULL));
+ BITMAP_FREE (param_defaults);
delete_coalesce_list (cl);
ssa_conflicts_delete (graph);
diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c
deleted file mode 100644
index f3cb56e..0000000
--- a/gcc/tree-ssa-copyrename.c
+++ /dev/null
@@ -1,499 +0,0 @@
-/* Rename SSA copies.
- Copyright (C) 2004-2015 Free Software Foundation, Inc.
- Contributed by Andrew MacLeod <amacleod@redhat.com>
-
-This file is part of GCC.
-
-GCC is free software; you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation; either version 3, or (at your option)
-any later version.
-
-GCC is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-GNU General Public License for more details.
-
-You should have received a copy of the GNU General Public License
-along with GCC; see the file COPYING3. If not see
-<http://www.gnu.org/licenses/>. */
-
-#include "config.h"
-#include "system.h"
-#include "coretypes.h"
-#include "tm.h"
-#include "hash-set.h"
-#include "machmode.h"
-#include "vec.h"
-#include "double-int.h"
-#include "input.h"
-#include "alias.h"
-#include "symtab.h"
-#include "wide-int.h"
-#include "inchash.h"
-#include "tree.h"
-#include "fold-const.h"
-#include "predict.h"
-#include "hard-reg-set.h"
-#include "function.h"
-#include "dominance.h"
-#include "cfg.h"
-#include "basic-block.h"
-#include "tree-ssa-alias.h"
-#include "internal-fn.h"
-#include "gimple-expr.h"
-#include "is-a.h"
-#include "gimple.h"
-#include "gimple-iterator.h"
-#include "flags.h"
-#include "tree-pretty-print.h"
-#include "bitmap.h"
-#include "gimple-ssa.h"
-#include "stringpool.h"
-#include "tree-ssanames.h"
-#include "hashtab.h"
-#include "rtl.h"
-#include "statistics.h"
-#include "real.h"
-#include "fixed-value.h"
-#include "insn-config.h"
-#include "expmed.h"
-#include "dojump.h"
-#include "explow.h"
-#include "calls.h"
-#include "emit-rtl.h"
-#include "varasm.h"
-#include "stmt.h"
-#include "expr.h"
-#include "tree-dfa.h"
-#include "tree-inline.h"
-#include "tree-ssa-live.h"
-#include "tree-pass.h"
-#include "langhooks.h"
-
-static struct
-{
- /* Number of copies coalesced. */
- int coalesced;
-} stats;
-
-/* The following routines implement the SSA copy renaming phase.
-
- This optimization looks for copies between 2 SSA_NAMES, either through a
- direct copy, or an implicit one via a PHI node result and its arguments.
-
- Each copy is examined to determine if it is possible to rename the base
- variable of one of the operands to the same variable as the other operand.
- i.e.
- T.3_5 = <blah>
- a_1 = T.3_5
-
- If this copy couldn't be copy propagated, it could possibly remain in the
- program throughout the optimization phases. After SSA->normal, it would
- become:
-
- T.3 = <blah>
- a = T.3
-
- Since T.3_5 is distinct from all other SSA versions of T.3, there is no
- fundamental reason why the base variable needs to be T.3, subject to
- certain restrictions. This optimization attempts to determine if we can
- change the base variable on copies like this, and result in code such as:
-
- a_5 = <blah>
- a_1 = a_5
-
- This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is
- possible, the copy goes away completely. If it isn't possible, a new temp
- will be created for a_5, and you will end up with the exact same code:
-
- a.8 = <blah>
- a = a.8
-
- The other benefit of performing this optimization relates to what variables
- are chosen in copies. Gimplification of the program uses temporaries for
- a lot of things. expressions like
-
- a_1 = <blah>
- <blah2> = a_1
-
- get turned into
-
- T.3_5 = <blah>
- a_1 = T.3_5
- <blah2> = a_1
-
- Copy propagation is done in a forward direction, and if we can propagate
- through the copy, we end up with:
-
- T.3_5 = <blah>
- <blah2> = T.3_5
-
- The copy is gone, but so is all reference to the user variable 'a'. By
- performing this optimization, we would see the sequence:
-
- a_5 = <blah>
- a_1 = a_5
- <blah2> = a_1
-
- which copy propagation would then turn into:
-
- a_5 = <blah>
- <blah2> = a_5
-
- and so we still retain the user variable whenever possible. */
-
-
-/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
- Choose a representative for the partition, and send debug info to DEBUG. */
-
-static void
-copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
-{
- int p1, p2, p3;
- tree root1, root2;
- tree rep1, rep2;
- bool ign1, ign2, abnorm;
-
- gcc_assert (TREE_CODE (var1) == SSA_NAME);
- gcc_assert (TREE_CODE (var2) == SSA_NAME);
-
- register_ssa_partition (map, var1);
- register_ssa_partition (map, var2);
-
- p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
- p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
-
- if (debug)
- {
- fprintf (debug, "Try : ");
- print_generic_expr (debug, var1, TDF_SLIM);
- fprintf (debug, "(P%d) & ", p1);
- print_generic_expr (debug, var2, TDF_SLIM);
- fprintf (debug, "(P%d)", p2);
- }
-
- gcc_assert (p1 != NO_PARTITION);
- gcc_assert (p2 != NO_PARTITION);
-
- if (p1 == p2)
- {
- if (debug)
- fprintf (debug, " : Already coalesced.\n");
- return;
- }
-
- rep1 = partition_to_var (map, p1);
- rep2 = partition_to_var (map, p2);
- root1 = SSA_NAME_VAR (rep1);
- root2 = SSA_NAME_VAR (rep2);
- if (!root1 && !root2)
- return;
-
- /* Don't coalesce if one of the variables occurs in an abnormal PHI. */
- abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1)
- || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2));
- if (abnorm)
- {
- if (debug)
- fprintf (debug, " : Abnormal PHI barrier. No coalesce.\n");
- return;
- }
-
- /* Partitions already have the same root, simply merge them. */
- if (root1 == root2)
- {
- p1 = partition_union (map->var_partition, p1, p2);
- if (debug)
- fprintf (debug, " : Same root, coalesced --> P%d.\n", p1);
- return;
- }
-
- /* Never attempt to coalesce 2 different parameters. */
- if ((root1 && TREE_CODE (root1) == PARM_DECL)
- && (root2 && TREE_CODE (root2) == PARM_DECL))
- {
- if (debug)
- fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n");
- return;
- }
-
- if ((root1 && TREE_CODE (root1) == RESULT_DECL)
- != (root2 && TREE_CODE (root2) == RESULT_DECL))
- {
- if (debug)
- fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n");
- return;
- }
-
- ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1));
- ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2));
-
- /* Refrain from coalescing user variables, if requested. */
- if (!ign1 && !ign2)
- {
- if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2))
- ign2 = true;
- else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1))
- ign1 = true;
- else if (flag_ssa_coalesce_vars != 2)
- {
- if (debug)
- fprintf (debug, " : 2 different USER vars. No coalesce.\n");
- return;
- }
- else
- ign2 = true;
- }
-
- /* If both values have default defs, we can't coalesce. If only one has a
- tag, make sure that variable is the new root partition. */
- if (root1 && ssa_default_def (cfun, root1))
- {
- if (root2 && ssa_default_def (cfun, root2))
- {
- if (debug)
- fprintf (debug, " : 2 default defs. No coalesce.\n");
- return;
- }
- else
- {
- ign2 = true;
- ign1 = false;
- }
- }
- else if (root2 && ssa_default_def (cfun, root2))
- {
- ign1 = true;
- ign2 = false;
- }
-
- /* Do not coalesce if we cannot assign a symbol to the partition. */
- if (!(!ign2 && root2)
- && !(!ign1 && root1))
- {
- if (debug)
- fprintf (debug, " : Choosen variable has no root. No coalesce.\n");
- return;
- }
-
- /* Don't coalesce if the new chosen root variable would be read-only.
- If both ign1 && ign2, then the root var of the larger partition
- wins, so reject in that case if any of the root vars is TREE_READONLY.
- Otherwise reject only if the root var, on which replace_ssa_name_symbol
- will be called below, is readonly. */
- if (((root1 && TREE_READONLY (root1)) && ign2)
- || ((root2 && TREE_READONLY (root2)) && ign1))
- {
- if (debug)
- fprintf (debug, " : Readonly variable. No coalesce.\n");
- return;
- }
-
- /* Don't coalesce if the two variables aren't type compatible . */
- if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2))
- /* There is a disconnect between the middle-end type-system and
- VRP, avoid coalescing enum types with different bounds. */
- || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE
- || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE)
- && TREE_TYPE (var1) != TREE_TYPE (var2)))
- {
- if (debug)
- fprintf (debug, " : Incompatible types. No coalesce.\n");
- return;
- }
-
- /* Merge the two partitions. */
- p3 = partition_union (map->var_partition, p1, p2);
-
- /* Set the root variable of the partition to the better choice, if there is
- one. */
- if (!ign2 && root2)
- replace_ssa_name_symbol (partition_to_var (map, p3), root2);
- else if (!ign1 && root1)
- replace_ssa_name_symbol (partition_to_var (map, p3), root1);
- else
- gcc_unreachable ();
-
- if (debug)
- {
- fprintf (debug, " --> P%d ", p3);
- print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)),
- TDF_SLIM);
- fprintf (debug, "\n");
- }
-}
-
-
-namespace {
-
-const pass_data pass_data_rename_ssa_copies =
-{
- GIMPLE_PASS, /* type */
- "copyrename", /* name */
- OPTGROUP_NONE, /* optinfo_flags */
- TV_TREE_COPY_RENAME, /* tv_id */
- ( PROP_cfg | PROP_ssa ), /* properties_required */
- 0, /* properties_provided */
- 0, /* properties_destroyed */
- 0, /* todo_flags_start */
- 0, /* todo_flags_finish */
-};
-
-class pass_rename_ssa_copies : public gimple_opt_pass
-{
-public:
- pass_rename_ssa_copies (gcc::context *ctxt)
- : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt)
- {}
-
- /* opt_pass methods: */
- opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); }
- virtual bool gate (function *) { return flag_tree_copyrename != 0; }
- virtual unsigned int execute (function *);
-
-}; // class pass_rename_ssa_copies
-
-/* This function will make a pass through the IL, and attempt to coalesce any
- SSA versions which occur in PHI's or copies. Coalescing is accomplished by
- changing the underlying root variable of all coalesced version. This will
- then cause the SSA->normal pass to attempt to coalesce them all to the same
- variable. */
-
-unsigned int
-pass_rename_ssa_copies::execute (function *fun)
-{
- var_map map;
- basic_block bb;
- tree var, part_var;
- gimple stmt;
- unsigned x;
- FILE *debug;
-
- memset (&stats, 0, sizeof (stats));
-
- if (dump_file && (dump_flags & TDF_DETAILS))
- debug = dump_file;
- else
- debug = NULL;
-
- map = init_var_map (num_ssa_names);
-
- FOR_EACH_BB_FN (bb, fun)
- {
- /* Scan for real copies. */
- for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
- gsi_next (&gsi))
- {
- stmt = gsi_stmt (gsi);
- if (gimple_assign_ssa_name_copy_p (stmt))
- {
- tree lhs = gimple_assign_lhs (stmt);
- tree rhs = gimple_assign_rhs1 (stmt);
-
- copy_rename_partition_coalesce (map, lhs, rhs, debug);
- }
- }
- }
-
- FOR_EACH_BB_FN (bb, fun)
- {
- /* Treat PHI nodes as copies between the result and each argument. */
- for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
- gsi_next (&gsi))
- {
- size_t i;
- tree res;
- gphi *phi = gsi.phi ();
- res = gimple_phi_result (phi);
-
- /* Do not process virtual SSA_NAMES. */
- if (virtual_operand_p (res))
- continue;
-
- /* Make sure to only use the same partition for an argument
- as the result but never the other way around. */
- if (SSA_NAME_VAR (res)
- && !DECL_IGNORED_P (SSA_NAME_VAR (res)))
- for (i = 0; i < gimple_phi_num_args (phi); i++)
- {
- tree arg = PHI_ARG_DEF (phi, i);
- if (TREE_CODE (arg) == SSA_NAME)
- copy_rename_partition_coalesce (map, res, arg,
- debug);
- }
- /* Else if all arguments are in the same partition try to merge
- it with the result. */
- else
- {
- int all_p_same = -1;
- int p = -1;
- for (i = 0; i < gimple_phi_num_args (phi); i++)
- {
- tree arg = PHI_ARG_DEF (phi, i);
- if (TREE_CODE (arg) != SSA_NAME)
- {
- all_p_same = 0;
- break;
- }
- else if (all_p_same == -1)
- {
- p = partition_find (map->var_partition,
- SSA_NAME_VERSION (arg));
- all_p_same = 1;
- }
- else if (all_p_same == 1
- && p != partition_find (map->var_partition,
- SSA_NAME_VERSION (arg)))
- {
- all_p_same = 0;
- break;
- }
- }
- if (all_p_same == 1)
- copy_rename_partition_coalesce (map, res,
- PHI_ARG_DEF (phi, 0),
- debug);
- }
- }
- }
-
- if (debug)
- dump_var_map (debug, map);
-
- /* Now one more pass to make all elements of a partition share the same
- root variable. */
-
- for (x = 1; x < num_ssa_names; x++)
- {
- part_var = partition_to_var (map, x);
- if (!part_var)
- continue;
- var = ssa_name (x);
- if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var))
- continue;
- if (debug)
- {
- fprintf (debug, "Coalesced ");
- print_generic_expr (debug, var, TDF_SLIM);
- fprintf (debug, " to ");
- print_generic_expr (debug, part_var, TDF_SLIM);
- fprintf (debug, "\n");
- }
- stats.coalesced++;
- replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var));
- }
-
- statistics_counter_event (fun, "copies coalesced",
- stats.coalesced);
- delete_var_map (map);
- return 0;
-}
-
-} // anon namespace
-
-gimple_opt_pass *
-make_pass_rename_ssa_copies (gcc::context *ctxt)
-{
- return new pass_rename_ssa_copies (ctxt);
-}
diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
index e0c42669..46b1869 100644
--- a/gcc/tree-ssa-live.c
+++ b/gcc/tree-ssa-live.c
@@ -119,7 +119,10 @@ tree_int_map_hasher::equal (const value_type *v, const compare_type *c)
}
-/* This routine will initialize the basevar fields of MAP. */
+/* This routine will initialize the basevar fields of MAP with base
+ names, when we are not optimizing. When optimizing, we'll use
+ partition numbers as base index numbers, see coalesce_ssa_name in
+ tree-ssa-coalesce.c. */
static void
var_map_base_init (var_map map)
--
Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/ FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
* Re: [PR64164] drop copyrename, integrate into expand
2015-03-28 19:22 ` Alexandre Oliva
@ 2015-03-31 5:11 ` Jeff Law
2015-04-03 13:17 ` Alexandre Oliva
2015-03-31 6:55 ` Steven Bosscher
2015-03-31 14:06 ` Richard Biener
2 siblings, 1 reply; 127+ messages in thread
From: Jeff Law @ 2015-03-31 5:11 UTC (permalink / raw)
To: Alexandre Oliva, gcc-patches
On 03/28/2015 01:21 PM, Alexandre Oliva wrote:
> On Mar 27, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
>
>> This patch reworks the out-of-ssa expander to enable coalescing of SSA
>> partitions that don't share the same base name. This is done only when
>> optimizing.
>
>> The test we use to tell whether two partitions can be merged no longer
>> demands them to have the same base variable when optimizing, so they
>> become eligible for coalescing, as they would after copyrename. We then
>> compute the partitioning we'd get if all coalescible partitions were
>> coalesced, using this partition assignment to assign base vars numbers.
>> These base var numbers are then used to identify conflicts, which used
>> to be based on shared base vars or base types.
>
>> We now propagate base var names during coalescing proper, only towards
>> the leader variable. I'm no longer sure this is still needed, but
>> something about handling variables and results led me this way and I
>> didn't revisit it. I might rework that with a later patch, or a later
>> revision of this patch; it would require other means to identify
>> partitions holding result_decls during merging, or allow that and deal
>> with param and result decls in a different way during expand proper.
>
>> I had to fix two lingering bugs in order for the whole thing to work: we
>> perform conflict detection after abnormal coalescing, but we computed
>> live ranges involving only the partition leaders, so conflicts with
>> other names already coalesced wouldn't be detected.
>
> This early abnormal coalescing was only present for a few days in the
> trunk, and I was lucky enough to start working on a tree that had it.
> It turns out that the fix for it was thus rendered unnecessary, so I
> dropped it. It was the fix for it, that didn't cover the live range
> check, that caused the two ICEs I saw in the regressions tests. Since
> the ultimate cause of the problem is gone, and the change that
> introduced the check failures, both problems went *poof* after I updated
> the tree, resolved the conflicts and dropped the redundant code.
>
>> The other problem was that we didn't track default defs for parms as
>> live at entry, so they might end up coalesced.
>
> I improved this a little bit, using the bitmap of partitions containing
> default params to check that we only process function-entry defs for
> them, rather than for all param decls in case they end up in other
> partitions.
>
>> I guess none of these problems would have been exercised in practice,
>> because we wouldn't even consider merging ssa names associated with
>> different variables.
>
>> In the end, I verified that this fixed the codegen regression in the
>> PR64164 testcase, that failed to merge two partitions that could in
>> theory be merged, but that wasn't even considered due to differences in
>> the SSA var names.
>
>> I'd agree that disregarding the var names and dropping 4 passes is too
>> much of a change to fix this one problem, but... it's something we
>> should have long tackled, and it gets this and other jobs done, so...
>
> Regstrapped on x86_64-linux-gnu native and on i686-pc-linux-gnu native
> on x86_64, so without lto plugin. The only regression is in
> gcc.dg/guality/pr54200.c, that explicitly disables VTA. When
> optimization is enabled, the different coalescing we perform now causes
> VTA-less variable tracking to lose track of variable "z". This
> regression in non-VTA var-tracking is expected and, as richi put it in
> PR 64164, I guess we don't care about that, do we? :-)
>
> The other guality regressions I mentioned in my other email turned out
> not to be regressions, but preexisting failures that somehow did not
> make to the test_summary of my earlier pristine build.
>
> Is this ok to install?
>
>
> for gcc/ChangeLog
>
> PR rtl-optimization/64164
> * Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
> * tree-ssa-copyrename.c: Removed.
> * opts.c (default_options_table): Drop -ftree-copyrename.
> * passes.def: Drop all occurrences of pass_rename_ssa_copies.
> * common.opt (ftree-copyrename): Ignore.
> (ftree-coalesce-vars, ftree-coalesce-inlined-vars): Likewise.
> * doc/invoke.texi: Remove the ignored options above.
> * gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
> across base variables when optimizing.
> * tree-ssa-coalesce.c (build_ssa_conflict_graph): Add
> param_defaults argument. Process PARM_DECLs's default defs at
> the entry point.
> (attempt_coalesce): Add param_defaults argument, and
> track the presence of default defs for params in each
> partition. Propagate base var to leader on merge, preferring
> parms and results, named vars, ignored vars, and then anon
> vars. Refuse to merge a RESULT_DECL partition with a default
> PARM_DECL one.
> (coalesce_partitions): Add param_defaults argument,
> and pass it to attempt_coalesce.
> (dump_part_var_map): New.
> (compute_optimized_partition_bases): New, called by...
> (coalesce_ssa_name): ... when optimizing, disabling
> partition_view_bitmap's base assignment. Pass local
> param_defaults to coalescer functions.
> * tree-ssa-live.c (var_map_base_init): Note use only when not
> optimizing.
> ---
> gcc/Makefile.in | 1
> gcc/common.opt | 12 +
> gcc/doc/invoke.texi | 29 ---
> gcc/gimple-expr.c | 7 -
> gcc/opts.c | 1
> gcc/passes.def | 5
> gcc/tree-ssa-coalesce.c | 342 ++++++++++++++++++++++++++++++-
> gcc/tree-ssa-copyrename.c | 499 ---------------------------------------------
> gcc/tree-ssa-live.c | 5
> 9 files changed, 347 insertions(+), 554 deletions(-)
> delete mode 100644 gcc/tree-ssa-copyrename.c
>
> diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
> index efc93b7..62ae577 100644
> --- a/gcc/gimple-expr.c
> +++ b/gcc/gimple-expr.c
> @@ -399,13 +399,14 @@ copy_var_decl (tree var, tree name, tree type)
> bool
> gimple_can_coalesce_p (tree name1, tree name2)
> {
> - /* First check the SSA_NAME's associated DECL. We only want to
> - coalesce if they have the same DECL or both have no associated DECL. */
> + /* First check the SSA_NAME's associated DECL. Without
> + optimization, we only want to coalesce if they have the same DECL
> + or both have no associated DECL. */
> tree var1 = SSA_NAME_VAR (name1);
> tree var2 = SSA_NAME_VAR (name2);
> var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
> var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
> - if (var1 != var2)
> + if (var1 != var2 && !optimize)
> return false;
So when the base variables are different and we are optimizing,
this allows coalescing, right?
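To make the question concrete, here is a minimal standalone model of just the base-variable test in the hunk above. It is a sketch, not GCC code: `Decl`, `SsaName`, `effective_decl` and `base_vars_allow_coalesce` are hypothetical stand-ins, and whatever checks the real gimple_can_coalesce_p performs after this test are not shown in the quoted hunk and are not modeled.

```cpp
#include <cassert>

// Hypothetical stand-in for a declaration; not GCC's tree type.
struct Decl
{
  bool is_var;   // models VAR_P
  bool ignored;  // models DECL_IGNORED_P
};

// Hypothetical stand-in for an SSA name; 'var' is null for anonymous names.
struct SsaName
{
  const Decl *var;
};

// Mirrors the normalization in the hunk: an ignored variable counts as
// "no associated DECL".
static const Decl *
effective_decl (const SsaName &n)
{
  const Decl *v = n.var;
  return (v && (!v->is_var || !v->ignored)) ? v : nullptr;
}

// Models only the base-variable test; later checks are out of scope.
static bool
base_vars_allow_coalesce (const SsaName &a, const SsaName &b, bool optimize)
{
  if (effective_decl (a) != effective_decl (b) && !optimize)
    return false;
  return true;
}
```

Under this model, two names with different user variables are rejected without optimization and pass the test with it, while an ignored temporary and an anonymous name pass in both modes.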
What I don't see is a corresponding change to var_map_base_init to
ensure we build a conflict graph which includes objects when
SSA_NAME_VARs are not the same. I see a vague reference in
var_map_base_init's header comment that refers us to coalesce_ssa_name.
It appears that compute_optimized_partition_bases handles this by
creating partitions of things that are related by copies/phis
regardless of their underlying named object, type, etc. Right?
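As I read it, the scheme amounts to the following standalone sketch. This is an illustration under assumptions, not the patch itself: `UnionFind` stands in for GCC's partition type, `compute_bases` and its parameters are hypothetical, and the copy/PHI relations are reduced to a plain list of index pairs.

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Minimal union-find, standing in for GCC's partition type.
struct UnionFind
{
  std::vector<int> parent;

  explicit UnionFind (int n) : parent (n)
  {
    std::iota (parent.begin (), parent.end (), 0);
  }

  int find (int x)
  {
    while (parent[x] != x)
      x = parent[x] = parent[parent[x]];  // path halving
    return x;
  }

  void unite (int a, int b)
  {
    a = find (a);
    b = find (b);
    if (a != b)
      parent[b] = a;
  }
};

// Returns a base index for each partition listed in USED (and -1 for the
// rest); NUM_BASES receives the number of distinct base indices.
static std::vector<int>
compute_bases (int parts, const std::vector<std::pair<int, int>> &copies,
               const std::vector<int> &used, int &num_bases)
{
  // Tentatively merge everything related by a copy, ignoring conflicts.
  UnionFind tentative (parts);
  for (const auto &c : copies)
    tentative.unite (c.first, c.second);

  const int no_part = -1;
  std::vector<int> index_map (parts, no_part);
  std::vector<int> base_index (parts, no_part);
  int count = 0;

  // First walk: give each tentative class a dense number.
  for (int p : used)
    {
      int root = tentative.find (p);
      if (index_map[root] == no_part)
        index_map[root] = count++;
    }

  // Second walk: record the class number as each partition's base index.
  for (int p : used)
    {
      int root = tentative.find (p);
      assert (index_map[root] < count);
      base_index[p] = index_map[root];
    }

  num_bases = count;
  return base_index;
}
```

Two partitions then share a base index exactly when some chain of copies connects them, so conflicts only need to be computed within each such group.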
> index 1d598b2..f8fd0ef 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
[ ... ]
Hard to argue with removing a pass that gets called 5 times!
> @@ -890,6 +900,36 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
> live_track_process_def (live, result, graph);
> }
>
> + /* Pretend there are defs for params' default defs at the start
> + of the (post-)entry block. We run after abnormal coalescing,
> + so we can't assume the leader variable is the default
> + definition, but because of SSA_NAME_VAR adjustments in
> + attempt_coalesce, we can assume that if there is any
> + PARM_DECL in the partition, it will be the leader's
> + SSA_NAME_VAR. */
So the issue here is you want to iterate over the objects live at the
entry block, which would include any SSA_NAMEs which result from
PARM_DECLs. I don't guess there's an easier way to do that other than
iterating over everything live in that initial block?
And the second EXECUTE_IF_SET_IN_BITMAP iterates over everything
in the partitions associated with the SSA_NAMES that are live at the
entry block, right?
I don't guess it'd be more efficient to walk over the SSA_NAMEs looking
for anything marked as a default definition, then map that back to a
partition, since we'd have to look at every SSA_NAME, whereas your code
only looks at partitions that are live in the entry block, then looks at
the elements in those partitions.
> @@ -1126,11 +1166,12 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
>
> static inline bool
> attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
> - FILE *debug)
> + bitmap param_defaults, FILE *debug)
[ ... ]
So the bulk of the changes into this routine are really about picking a
good leader, which presumably is how we're able to get the desired
effects on debuginfo that we used to get from tree-ssa-copyrename.c?
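For reference, the preference order stated in the ChangeLog entry for attempt_coalesce (parms and results, then named vars, then ignored vars, then anon vars) and the one refusal it mentions can be modeled as below. This is only a sketch of the stated rules: `BaseKind`, `PartInfo`, `may_merge` and `merged_base_kind` are hypothetical names, and tie-breaking within a category is an assumption (the first argument wins).

```cpp
#include <cassert>

// Hypothetical classification of a partition's base variable, listed in
// the preference order given in the ChangeLog (most preferred first).
enum class BaseKind
{
  ParmOrResult = 0,
  NamedVar = 1,
  IgnoredVar = 2,
  Anonymous = 3
};

// Hypothetical per-partition summary; not a GCC data structure.
struct PartInfo
{
  BaseKind kind;
  bool holds_result_decl;    // partition's base is a RESULT_DECL
  bool holds_param_default;  // partition contains a PARM_DECL default def
};

// The ChangeLog says a RESULT_DECL partition is never merged with a
// default PARM_DECL one.
static bool
may_merge (const PartInfo &a, const PartInfo &b)
{
  return !((a.holds_result_decl && b.holds_param_default)
           || (b.holds_result_decl && a.holds_param_default));
}

// The merged partition keeps the more preferred base kind; on a tie the
// first argument wins (an assumption of this sketch).
static BaseKind
merged_base_kind (const PartInfo &a, const PartInfo &b)
{
  assert (may_merge (a, b));
  return static_cast<int> (a.kind) <= static_cast<int> (b.kind) ? a.kind
                                                                : b.kind;
}
```

In this model a user-visible name always survives a merge with a temporary, which is the effect copyrename used to provide by renaming the temporary's base variable.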
> @@ -1158,6 +1199,70 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
> {
> var1 = partition_to_var (map, p1);
> var2 = partition_to_var (map, p2);
> +
> + tree leader;
> +
> + if (var1 == var2 || !SSA_NAME_VAR (var2)
> + || SSA_NAME_VAR (var1) == SSA_NAME_VAR (var2))
> + {
> + leader = SSA_NAME_VAR (var1);
> + default_def = (leader && TREE_CODE (leader) == PARM_DECL
> + && (SSA_NAME_IS_DEFAULT_DEF (var1)
> + || bitmap_bit_p (param_defaults, p1)));
> + }
So some comments about the various cases here might help. I can sort
them out if I read the code, but one could argue that a block comment on
the rules for how to select the partition leader would be better.
Is the special casing of PARM_DECLs + RESULT_DECLs really a failing of
not handling one or both properly when computing liveness information?
I'm not aware of an inherent reason why a PARM_DECL couldn't coalesce
with a related RESULT_DECL if they are otherwise non-conflicting and
related by a copy/phi.
> @@ -1272,6 +1415,178 @@ ssa_name_var_hash::equal (const value_type *n1, const compare_type *n2)
> +
> +/* Fill in MAP's partition_to_base_index, with one index for each
> + partition of SSA names USED_IN_COPIES and related by CL
> + coalesce possibilities. */
> +
> +static void
> +compute_optimized_partition_bases (var_map map, bitmap used_in_copies,
> + coalesce_list_p cl)
Presumably ordering of unioning of the partitions doesn't matter here as
we're looking at coalesce possibilities rather than things we have
actually coalesced? Thus it's OK (?) to handle the names occurring in
abnormal PHIs after those names that are associated by a copy.
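The order-independence is easy to check on a toy model, assuming the tentative unions are unconditional (no conflict test between them), as they appear to be in the patch. The helper below is hypothetical and standalone; it labels every element with the smallest member of its class so that results from different union orders can be compared directly.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Apply the unions in the given order and return, for each element, the
// smallest member of its class.  The label does not depend on which
// element ended up as the union-find root.
static std::vector<int>
classes_after_unions (int n, const std::vector<std::pair<int, int>> &pairs)
{
  std::vector<int> parent (n);
  std::iota (parent.begin (), parent.end (), 0);

  auto find = [&] (int x) {
    while (parent[x] != x)
      x = parent[x];
    return x;
  };

  for (const auto &p : pairs)
    {
      int a = find (p.first);
      int b = find (p.second);
      if (a != b)
        parent[b] = a;
    }

  // Compute the smallest member of every class, indexed by root.
  std::vector<int> smallest (n, n);
  for (int i = 0; i < n; i++)
    {
      int r = find (i);
      smallest[r] = std::min (smallest[r], i);
    }

  std::vector<int> label (n);
  for (int i = 0; i < n; i++)
    label[i] = smallest[find (i)];
  return label;
}
```

Feeding the same pairs in a different order can change which element becomes the root, but not the membership of the classes, and only membership matters when the classes are used to group names for conflict computation.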
This is all probably OK, but I want to make sure I understand what's
happening before a final approval.
jeff
* Re: [PR64164] drop copyrename, integrate into expand
2015-03-28 19:22 ` Alexandre Oliva
2015-03-31 5:11 ` Jeff Law
@ 2015-03-31 6:55 ` Steven Bosscher
2015-03-31 13:30 ` Richard Biener
2015-03-31 14:06 ` Richard Biener
2 siblings, 1 reply; 127+ messages in thread
From: Steven Bosscher @ 2015-03-31 6:55 UTC (permalink / raw)
To: Alexandre Oliva; +Cc: GCC Patches
On Sat, Mar 28, 2015 at 8:21 PM, Alexandre Oliva wrote:
> Regstrapped on x86_64-linux-gnu native and on i686-pc-linux-gnu native
> on x86_64, so without lto plugin. The only regression is in
> gcc.dg/guality/pr54200.c, that explicitly disables VTA.
What about memory footprint? IIRC this pass was in part introduced to
reduce the number of VAR_DECLs.
Ciao!
Steven
* Re: [PR64164] drop copyrename, integrate into expand
2015-03-31 6:55 ` Steven Bosscher
@ 2015-03-31 13:30 ` Richard Biener
0 siblings, 0 replies; 127+ messages in thread
From: Richard Biener @ 2015-03-31 13:30 UTC (permalink / raw)
To: Steven Bosscher; +Cc: Alexandre Oliva, GCC Patches
On Tue, Mar 31, 2015 at 8:55 AM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> On Sat, Mar 28, 2015 at 8:21 PM, Alexandre Oliva wrote:
>> Regstrapped on x86_64-linux-gnu native and on i686-pc-linux-gnu native
>> on x86_64, so without lto plugin. The only regression is in
>> gcc.dg/guality/pr54200.c, that explicitly disables VTA.
>
> What about memory footprint? IIRC this pass was in part introduced to
> reduce the number of VAR_DECLs.
That's no longer necessary as we now drop VAR_DECLs from non-user vars
completely at into-SSA time. We have "anonymous" SSA names without
associated decls.
Richard.
> Ciao!
> Steven
* Re: [PR64164] drop copyrename, integrate into expand
2015-03-28 19:22 ` Alexandre Oliva
2015-03-31 5:11 ` Jeff Law
2015-03-31 6:55 ` Steven Bosscher
@ 2015-03-31 14:06 ` Richard Biener
2015-04-03 13:30 ` Alexandre Oliva
2 siblings, 1 reply; 127+ messages in thread
From: Richard Biener @ 2015-03-31 14:06 UTC (permalink / raw)
To: Alexandre Oliva; +Cc: GCC Patches
On Sat, Mar 28, 2015 at 8:21 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Mar 27, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
>
>> This patch reworks the out-of-ssa expander to enable coalescing of SSA
>> partitions that don't share the same base name. This is done only when
>> optimizing.
>
>> The test we use to tell whether two partitions can be merged no longer
>> demands them to have the same base variable when optimizing, so they
>> become eligible for coalescing, as they would after copyrename. We then
>> compute the partitioning we'd get if all coalescible partitions were
>> coalesced, using this partition assignment to assign base vars numbers.
>> These base var numbers are then used to identify conflicts, which used
>> to be based on shared base vars or base types.
>
>> We now propagate base var names during coalescing proper, only towards
>> the leader variable. I'm no longer sure this is still needed, but
>> something about handling variables and results led me this way and I
>> didn't revisit it. I might rework that with a later patch, or a later
>> revision of this patch; it would require other means to identify
>> partitions holding result_decls during merging, or allow that and deal
>> with param and result decls in a different way during expand proper.
>
>> I had to fix two lingering bugs in order for the whole thing to work: we
>> perform conflict detection after abnormal coalescing, but we computed
>> live ranges involving only the partition leaders, so conflicts with
>> other names already coalesced wouldn't be detected.
>
> This early abnormal coalescing was only present for a few days in the
> trunk, and I was lucky enough to start working on a tree that had it.
> It turns out that the fix for it was thus rendered unnecessary, so I
> dropped it. It was the fix for it, that didn't cover the live range
> check, that caused the two ICEs I saw in the regressions tests. Since
> the ultimate cause of the problem is gone, and the change that
> introduced the check failures, both problems went *poof* after I updated
> the tree, resolved the conflicts and dropped the redundant code.
>
>> The other problem was that we didn't track default defs for parms as
>> live at entry, so they might end up coalesced.
>
> I improved this a little bit, using the bitmap of partitions containing
> default params to check that we only process function-entry defs for
> them, rather than for all param decls in case they end up in other
> partitions.
>
>> I guess none of these problems would have been exercised in practice,
>> because we wouldn't even consider merging ssa names associated with
>> different variables.
>
>> In the end, I verified that this fixed the codegen regression in the
>> PR64164 testcase, that failed to merge two partitions that could in
>> theory be merged, but that wasn't even considered due to differences in
>> the SSA var names.
>
>> I'd agree that disregarding the var names and dropping 4 passes is too
>> much of a change to fix this one problem, but... it's something we
>> should have long tackled, and it gets this and other jobs done, so...
>
> Regstrapped on x86_64-linux-gnu native and on i686-pc-linux-gnu native
> on x86_64, so without lto plugin. The only regression is in
> gcc.dg/guality/pr54200.c, that explicitly disables VTA. When
> optimization is enabled, the different coalescing we perform now causes
> VTA-less variable tracking to lose track of variable "z". This
> regression in non-VTA var-tracking is expected and, as richi put it in
> PR 64164, I guess we don't care about that, do we? :-)
Apart from at -O0, yes.
> The other guality regressions I mentioned in my other email turned out
> not to be regressions, but preexisting failures that somehow did not
> make it to the test_summary of my earlier pristine build.
>
> Is this ok to install?
I think this is stage1 material. Some comments in-line
>
> for gcc/ChangeLog
>
> PR rtl-optimization/64164
> * Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
> * tree-ssa-copyrename.c: Removed.
> * opts.c (default_options_table): Drop -ftree-copyrename.
> * passes.def: Drop all occurrences of pass_rename_ssa_copies.
> * common.opt (ftree-copyrename): Ignore.
> (ftree-coalesce-vars, ftree-coalesce-inlined-vars): Likewise.
> * doc/invoke.texi: Remove the ignored options above.
> * gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
> across base variables when optimizing.
> * tree-ssa-coalesce.c (build_ssa_conflict_graph): Add
> param_defaults argument. Process PARM_DECLs's default defs at
> the entry point.
> (attempt_coalesce): Add param_defaults argument, and
> track the presence of default defs for params in each
> partition. Propagate base var to leader on merge, preferring
> parms and results, named vars, ignored vars, and then anon
> vars. Refuse to merge a RESULT_DECL partition with a default
> PARM_DECL one.
> (coalesce_partitions): Add param_defaults argument,
> and pass it to attempt_coalesce.
> (dump_part_var_map): New.
> (compute_optimized_partition_bases): New, called by...
> (coalesce_ssa_name): ... when optimizing, disabling
> partition_view_bitmap's base assignment. Pass local
> param_defaults to coalescer functions.
> * tree-ssa-live.c (var_map_base_init): Note use only when not
> optimizing.
> ---
> gcc/Makefile.in | 1
> gcc/common.opt | 12 +
> gcc/doc/invoke.texi | 29 ---
> gcc/gimple-expr.c | 7 -
> gcc/opts.c | 1
> gcc/passes.def | 5
> gcc/tree-ssa-coalesce.c | 342 ++++++++++++++++++++++++++++++-
> gcc/tree-ssa-copyrename.c | 499 ---------------------------------------------
> gcc/tree-ssa-live.c | 5
> 9 files changed, 347 insertions(+), 554 deletions(-)
> delete mode 100644 gcc/tree-ssa-copyrename.c
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index f924fb8..990c4e9 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1428,7 +1428,6 @@ OBJS = \
> tree-ssa-ccp.o \
> tree-ssa-coalesce.o \
> tree-ssa-copy.o \
> - tree-ssa-copyrename.o \
> tree-ssa-dce.o \
> tree-ssa-dom.o \
> tree-ssa-dse.o \
> diff --git a/gcc/common.opt b/gcc/common.opt
> index b49ac46..fefaee7 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2207,16 +2207,16 @@ Common Report Var(flag_tree_ch) Optimization
> Enable loop header copying on trees
>
> ftree-coalesce-inlined-vars
> -Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization
> -Enable coalescing of copy-related user variables that are inlined
> +Common Ignore RejectNegative
> +Does nothing. Preserved for backward compatibility.
>
> ftree-coalesce-vars
> -Common Report Var(flag_ssa_coalesce_vars,2) Optimization
> -Enable coalescing of all copy-related user variables
> +Common Ignore
> +Does nothing. Preserved for backward compatibility.
>
> ftree-copyrename
> -Common Report Var(flag_tree_copyrename) Optimization
> -Replace SSA temporaries with better names in copies
> +Common Ignore
> +Does nothing. Preserved for backward compatibility.
>
> ftree-copy-prop
> Common Report Var(flag_tree_copy_prop) Optimization
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 9749727..5d2c516 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -442,8 +442,7 @@ Objective-C and Objective-C++ Dialects}.
> -fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol
> -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
> -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
> --ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
> --ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
> +-ftree-copy-prop -ftree-dce -ftree-dominator-opts -ftree-dse @gol
> -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
> -ftree-loop-if-convert-stores -ftree-loop-im @gol
> -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
> @@ -8822,32 +8821,6 @@ Perform scalar replacement of aggregates. This pass replaces structure
> references with scalars to prevent committing structures to memory too
> early. This flag is enabled by default at @option{-O} and higher.
>
> -@item -ftree-copyrename
> -@opindex ftree-copyrename
> -Perform copy renaming on trees. This pass attempts to rename compiler
> -temporaries to other variables at copy locations, usually resulting in
> -variable names which more closely resemble the original variables. This flag
> -is enabled by default at @option{-O} and higher.
> -
> -@item -ftree-coalesce-inlined-vars
> -@opindex ftree-coalesce-inlined-vars
> -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
> -combine small user-defined variables too, but only if they are inlined
> -from other functions. It is a more limited form of
> -@option{-ftree-coalesce-vars}. This may harm debug information of such
> -inlined variables, but it keeps variables of the inlined-into
> -function apart from each other, such that they are more likely to
> -contain the expected values in a debugging session.
> -
> -@item -ftree-coalesce-vars
> -@opindex ftree-coalesce-vars
> -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
> -combine small user-defined variables too, instead of just compiler
> -temporaries. This may severely limit the ability to debug an optimized
> -program compiled with @option{-fno-var-tracking-assignments}. In the
> -negated form, this flag prevents SSA coalescing of user variables,
> -including inlined ones. This option is enabled by default.
> -
> @item -ftree-ter
> @opindex ftree-ter
> Perform temporary expression replacement during the SSA->normal phase. Single
> diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
> index efc93b7..62ae577 100644
> --- a/gcc/gimple-expr.c
> +++ b/gcc/gimple-expr.c
> @@ -399,13 +399,14 @@ copy_var_decl (tree var, tree name, tree type)
> bool
> gimple_can_coalesce_p (tree name1, tree name2)
> {
> - /* First check the SSA_NAME's associated DECL. We only want to
> - coalesce if they have the same DECL or both have no associated DECL. */
> + /* First check the SSA_NAME's associated DECL. Without
> + optimization, we only want to coalesce if they have the same DECL
> + or both have no associated DECL. */
> tree var1 = SSA_NAME_VAR (name1);
> tree var2 = SSA_NAME_VAR (name2);
> var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
> var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
> - if (var1 != var2)
> + if (var1 != var2 && !optimize)
> return false;
>
> /* Now check the types. If the types are the same, then we should
> diff --git a/gcc/opts.c b/gcc/opts.c
> index 39c190d..8149421 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -453,7 +453,6 @@ static const struct default_options default_options_table[] =
> { OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 },
> { OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 },
> { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 },
> - { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 },
> { OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 },
> { OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 },
> { OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 },
> diff --git a/gcc/passes.def b/gcc/passes.def
> index 1d598b2..f8fd0ef 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -77,7 +77,6 @@ along with GCC; see the file COPYING3. If not see
> NEXT_PASS (pass_all_early_optimizations);
> PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
> NEXT_PASS (pass_remove_cgraph_callee_edges);
> - NEXT_PASS (pass_rename_ssa_copies);
> NEXT_PASS (pass_object_sizes);
> NEXT_PASS (pass_ccp);
> /* After CCP we rewrite no longer addressed locals into SSA
> @@ -154,7 +153,6 @@ along with GCC; see the file COPYING3. If not see
> /* Initial scalar cleanups before alias computation.
> They ensure memory accesses are not indirect wherever possible. */
> NEXT_PASS (pass_strip_predict_hints);
> - NEXT_PASS (pass_rename_ssa_copies);
> NEXT_PASS (pass_ccp);
> /* After CCP we rewrite no longer addressed locals into SSA
> form if possible. */
> @@ -182,7 +180,6 @@ along with GCC; see the file COPYING3. If not see
> NEXT_PASS (pass_stdarg);
> NEXT_PASS (pass_lower_complex);
> NEXT_PASS (pass_sra);
> - NEXT_PASS (pass_rename_ssa_copies);
> /* The dom pass will also resolve all __builtin_constant_p calls
> that are still there to 0. This has to be done after some
> propagations have already run, but before some more dead code
> @@ -291,7 +288,6 @@ along with GCC; see the file COPYING3. If not see
> NEXT_PASS (pass_fold_builtins);
> NEXT_PASS (pass_optimize_widening_mul);
> NEXT_PASS (pass_tail_calls);
> - NEXT_PASS (pass_rename_ssa_copies);
> /* FIXME: If DCE is not run before checking for uninitialized uses,
> we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
> However, this also causes us to misdiagnose cases that should be
> @@ -326,7 +322,6 @@ along with GCC; see the file COPYING3. If not see
> NEXT_PASS (pass_dce);
> NEXT_PASS (pass_asan);
> NEXT_PASS (pass_tsan);
> - NEXT_PASS (pass_rename_ssa_copies);
> /* ??? We do want some kind of loop invariant motion, but we possibly
> need to adjust LIM to be more friendly towards preserving accurate
> debug information here. */
> diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
> index 1afeefe..8557d84 100644
> --- a/gcc/tree-ssa-coalesce.c
> +++ b/gcc/tree-ssa-coalesce.c
> @@ -825,13 +825,23 @@ live_track_clear_base_vars (live_track_p ptr)
> base variable are added. */
>
> static ssa_conflicts_p
> -build_ssa_conflict_graph (tree_live_info_p liveinfo)
> +build_ssa_conflict_graph (tree_live_info_p liveinfo, bitmap param_defaults)
> {
> ssa_conflicts_p graph;
> var_map map;
> basic_block bb;
> ssa_op_iter iter;
> live_track_p live;
> + basic_block entry;
> +
> + /* If we are optimizing, we may attempt to coalesce variables from
> + different base variables, including different parameters, so we
> + have to make sure default defs live at the entry block conflict
> + with each other. */
> + if (optimize)
> + entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
> + else
> + entry = NULL;
>
> map = live_var_map (liveinfo);
> graph = ssa_conflicts_new (num_var_partitions (map));
> @@ -890,6 +900,36 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
> live_track_process_def (live, result, graph);
> }
>
> + /* Pretend there are defs for params' default defs at the start
> + of the (post-)entry block. We run after abnormal coalescing,
> + so we can't assume the leader variable is the default
> + definition, but because of SSA_NAME_VAR adjustments in
> + attempt_coalesce, we can assume that if there is any
> + PARM_DECL in the partition, it will be the leader's
> + SSA_NAME_VAR. */
> + if (bb == entry)
> + {
> + unsigned base;
> + bitmap_iterator bi;
> + EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi)
> + {
> + bitmap_iterator bi2;
> + unsigned part;
> + EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base],
> + 0, part, bi2)
> + {
> + tree var = partition_to_var (map, part);
> + if (!SSA_NAME_VAR (var)
> + || TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL
> + || !(SSA_NAME_IS_DEFAULT_DEF (var)
> + || (param_defaults
> + && bitmap_bit_p (param_defaults, part))))
> + continue;
> + live_track_process_def (live, var, graph);
> + }
> + }
> + }
> +
This looks somewhat awkward to me ;) Is it really important to allow
coalescing PARM_DECL-based SSA vars with something else? At least
abnormal coalescing doesn't need to do that, so just walking over
the function decl's parameters and making their default defs live
should be enough?
That is, the param_defaults bitmap looks ugly to me.
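To make the suggestion concrete, here is a rough standalone sketch of "walk the parameters and make their default defs live at entry". It is a toy model, not GCC code: struct toy_conflicts and every toy_* function are made up for illustration, and it ignores the per-base-var grouping that live_track_process_def applies in the real conflict builder.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_MAX_PARTS 32

/* Toy conflict graph: edge[a][b] is true when partitions a and b
   must not be coalesced.  Hypothetical stand-in for ssa_conflicts_p.  */
struct toy_conflicts {
  bool edge[TOY_MAX_PARTS][TOY_MAX_PARTS];
};

void toy_conflicts_init(struct toy_conflicts *g)
{
  memset(g, 0, sizeof *g);
}

void toy_add_conflict(struct toy_conflicts *g, int a, int b)
{
  assert(a >= 0 && a < TOY_MAX_PARTS);
  assert(b >= 0 && b < TOY_MAX_PARTS);
  if (a != b)
    g->edge[a][b] = g->edge[b][a] = true;
}

/* default_def_part[i] holds the partition of parameter i's default
   def, or -1 when the parameter has none (e.g. it is unused).  All
   default defs are live at function entry at the same time, so every
   pair of them conflicts.  */
void toy_params_live_at_entry(struct toy_conflicts *g,
                              const int *default_def_part, int nparams)
{
  for (int i = 0; i < nparams; i++)
    {
      if (default_def_part[i] < 0)
        continue;
      for (int j = i + 1; j < nparams; j++)
        if (default_def_part[j] >= 0)
          toy_add_conflict(g, default_def_part[i], default_def_part[j]);
    }
}
```

In this model no extra bitmap is needed: the parameter list alone says which partitions to treat as defined at entry.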
> live_track_clear_base_vars (live);
> }
>
> @@ -1126,11 +1166,12 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
>
> static inline bool
> attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
> - FILE *debug)
> + bitmap param_defaults, FILE *debug)
> {
> int z;
> tree var1, var2;
> int p1, p2;
> + bool default_def = false;
>
> p1 = var_to_partition (map, ssa_name (x));
> p2 = var_to_partition (map, ssa_name (y));
> @@ -1158,6 +1199,70 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
> {
> var1 = partition_to_var (map, p1);
> var2 = partition_to_var (map, p2);
> +
> + tree leader;
> +
> + if (var1 == var2 || !SSA_NAME_VAR (var2)
> + || SSA_NAME_VAR (var1) == SSA_NAME_VAR (var2))
> + {
> + leader = SSA_NAME_VAR (var1);
> + default_def = (leader && TREE_CODE (leader) == PARM_DECL
> + && (SSA_NAME_IS_DEFAULT_DEF (var1)
> + || bitmap_bit_p (param_defaults, p1)));
> + }
> + else if (!SSA_NAME_VAR (var1))
> + {
> + leader = SSA_NAME_VAR (var2);
> + default_def = (leader && TREE_CODE (leader) == PARM_DECL
> + && (SSA_NAME_IS_DEFAULT_DEF (var2)
> + || bitmap_bit_p (param_defaults, p2)));
> + }
> + else if ((TREE_CODE (SSA_NAME_VAR (var1)) == PARM_DECL
> + && (SSA_NAME_IS_DEFAULT_DEF (var1)
> + || bitmap_bit_p (param_defaults, p1)))
> + || TREE_CODE (SSA_NAME_VAR (var1)) == RESULT_DECL)
> + {
> + if ((TREE_CODE (SSA_NAME_VAR (var2)) == PARM_DECL
> + && (SSA_NAME_IS_DEFAULT_DEF (var2)
> + || bitmap_bit_p (param_defaults, p2)))
> + || TREE_CODE (SSA_NAME_VAR (var2)) == RESULT_DECL)
> + {
> + /* We only have one RESULT_DECL, and two PARM_DECL
> + DEFAULT_DEFs would have conflicted, so we know either
> + one of var1 or var2 is a PARM_DECL, and the other is
> + a RESULT_DECL. */
> + if (debug)
> + fprintf (debug, ": Cannot coalesce PARM_DECL and RESULT_DECL\n");
> + return false;
> + }
> + leader = SSA_NAME_VAR (var1);
> + default_def = TREE_CODE (leader) == PARM_DECL;
> + }
> + else if ((TREE_CODE (SSA_NAME_VAR (var2)) == PARM_DECL
> + && (SSA_NAME_IS_DEFAULT_DEF (var2)
> + || bitmap_bit_p (param_defaults, p2)))
> + || TREE_CODE (SSA_NAME_VAR (var2)) == RESULT_DECL)
> + {
> + leader = SSA_NAME_VAR (var2);
> + default_def = TREE_CODE (leader) == PARM_DECL;
> + }
> + else if (TREE_CODE (SSA_NAME_VAR (var1)) == PARM_DECL)
> + leader = SSA_NAME_VAR (var1);
> + else if (TREE_CODE (SSA_NAME_VAR (var2)) == PARM_DECL)
> + leader = SSA_NAME_VAR (var2);
> + else if (TREE_CODE (SSA_NAME_VAR (var1)) == VAR_DECL
> + && !DECL_IGNORED_P (SSA_NAME_VAR (var1)))
> + leader = SSA_NAME_VAR (var1);
> + else if (TREE_CODE (SSA_NAME_VAR (var2)) == VAR_DECL
> + && !DECL_IGNORED_P (SSA_NAME_VAR (var2)))
> + leader = SSA_NAME_VAR (var2);
> + else if (TREE_CODE (SSA_NAME_VAR (var1)) == VAR_DECL)
> + leader = SSA_NAME_VAR (var1);
> + else if (TREE_CODE (SSA_NAME_VAR (var2)) == VAR_DECL)
> + leader = SSA_NAME_VAR (var2);
> + else /* What else could it be? */
> + gcc_unreachable ();
> +
Comments are definitely missing in this spaghetti...
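For what it's worth, my reading of the cascade above boils down to a ranking, which could be what the comment documents. The following is a standalone toy restatement, not a proposed replacement: enum toy_kind, struct toy_base and the toy_* functions are hypothetical, and the tracking of param_defaults is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of a partition's base variable, ordered
   from least to most preferred as leader.  */
enum toy_kind {
  TOY_ANON,          /* no SSA_NAME_VAR at all */
  TOY_IGNORED_VAR,   /* VAR_DECL with DECL_IGNORED_P set */
  TOY_NAMED_VAR,     /* user-visible VAR_DECL */
  TOY_PARM,          /* PARM_DECL, partition holds no default def */
  TOY_PARM_DEFAULT,  /* PARM_DECL, partition holds its default def */
  TOY_RESULT         /* RESULT_DECL */
};

struct toy_base {
  int decl;            /* made-up decl id; -1 when anonymous */
  enum toy_kind kind;
};

/* Default-def parms and results are "pinned": two different pinned
   bases must never end up in the same partition.  */
bool toy_pinned(enum toy_kind k)
{
  return k == TOY_PARM_DEFAULT || k == TOY_RESULT;
}

/* Both pinned kinds share the top rank.  */
int toy_rank(enum toy_kind k)
{
  return toy_pinned(k) ? TOY_PARM_DEFAULT : k;
}

/* Return 1 if A's base becomes the leader, 2 if B's does, and 0 if
   the merge has to be refused.  Ties go to A, as in the patch.  */
int toy_pick_leader(struct toy_base a, struct toy_base b)
{
  if (a.decl == b.decl)          /* same base, or both anonymous */
    return 1;
  if (toy_pinned(a.kind) && toy_pinned(b.kind))
    return 0;                    /* PARM_DECL default def vs RESULT_DECL */
  return toy_rank(b.kind) > toy_rank(a.kind) ? 2 : 1;
}
```

If that reading is right, a comment stating the order (pinned parm/result, then other parms, named vars, ignored vars, anonymous names) would already help a lot.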
> z = var_union (map, var1, var2);
> if (z == NO_PARTITION)
> {
> @@ -1173,8 +1278,46 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
> else
> ssa_conflicts_merge (graph, p2, p1);
>
> + if (z == p1)
> + {
> + if (SSA_NAME_VAR (var1) != leader)
> + {
> + replace_ssa_name_symbol (var1, leader);
> + if (debug)
> + {
> + fprintf (debug, ": Renamed ");
> + print_generic_expr (debug, var1, TDF_SLIM);
> + }
> + }
> + if (default_def)
> + {
> + if (SSA_NAME_IS_DEFAULT_DEF (var2))
> + bitmap_clear_bit (param_defaults, p2);
> + bitmap_set_bit (param_defaults, p1);
> + }
> + }
> + else
> + {
> + if (SSA_NAME_VAR (var2) != leader)
> + {
> + replace_ssa_name_symbol (var2, leader);
> + if (debug)
> + {
> + fprintf (debug, ": Renamed ");
> + print_generic_expr (debug, var2, TDF_SLIM);
> + }
> + }
> + if (default_def)
> + {
> + if (SSA_NAME_IS_DEFAULT_DEF (var1))
> + bitmap_clear_bit (param_defaults, p1);
> + bitmap_set_bit (param_defaults, p2);
> + }
> + }
Or, seeing this, why coalesce default defs at all? Either they are param values
or they have indeterminate values (and thus we can and do pick whatever is
available at expansion time)?
> +
> if (debug)
> fprintf (debug, ": Success -> %d\n", z);
> +
> return true;
> }
>
> @@ -1190,7 +1333,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
>
> static void
> coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
> - FILE *debug)
> + bitmap param_defaults, FILE *debug)
> {
> int x = 0, y = 0;
> tree var1, var2;
> @@ -1226,7 +1369,7 @@ coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
> if (debug)
> fprintf (debug, "Abnormal coalesce: ");
>
> - if (!attempt_coalesce (map, graph, v1, v2, debug))
> + if (!attempt_coalesce (map, graph, v1, v2, param_defaults, debug))
> fail_abnormal_edge_coalesce (v1, v2);
> }
> }
> @@ -1244,7 +1387,7 @@ coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
>
> if (debug)
> fprintf (debug, "Coalesce list: ");
> - attempt_coalesce (map, graph, x, y, debug);
> + attempt_coalesce (map, graph, x, y, param_defaults, debug);
> }
> }
>
> @@ -1272,6 +1415,178 @@ ssa_name_var_hash::equal (const value_type *n1, const compare_type *n2)
> }
>
>
> +/* Output partition map MAP with coalescing plan PART to file F. */
> +
> +void
> +dump_part_var_map (FILE *f, partition part, var_map map)
> +{
> + int t;
> + unsigned x, y;
> + int p;
> +
> + fprintf (f, "\nCoalescible Partition map \n\n");
> +
> + for (x = 0; x < map->num_partitions; x++)
> + {
> + if (map->view_to_partition != NULL)
> + p = map->view_to_partition[x];
> + else
> + p = x;
> +
> + if (ssa_name (p) == NULL_TREE
> + || virtual_operand_p (ssa_name (p)))
> + continue;
> +
> + t = 0;
> + for (y = 1; y < num_ssa_names; y++)
> + {
> + tree var = version_to_var (map, y);
> + if (!var)
> + continue;
> + int q = var_to_partition (map, var);
> + p = partition_find (part, q);
> + gcc_assert (map->partition_to_base_index[q]
> + == map->partition_to_base_index[p]);
> +
> + if (p == (int)x)
> + {
> + if (t++ == 0)
> + {
> + fprintf (f, "Partition %d, base %d (", x,
> + map->partition_to_base_index[q]);
> + print_generic_expr (f, partition_to_var (map, q), TDF_SLIM);
> + fprintf (f, " - ");
> + }
> + fprintf (f, "%d ", y);
> + }
> + }
> + if (t != 0)
> + fprintf (f, ")\n");
> + }
> + fprintf (f, "\n");
> +}
> +
> +/* Fill in MAP's partition_to_base_index, with one index for each
> + partition of SSA names USED_IN_COPIES and related by CL
> + coalesce possibilities. */
> +
> +static void
> +compute_optimized_partition_bases (var_map map, bitmap used_in_copies,
> + coalesce_list_p cl)
> +{
> + int parts = num_var_partitions (map);
> + partition tentative = partition_new (parts);
> +
> + /* Partition the SSA versions so that, for each coalescible
> + pair, both of its members are in the same partition in
> + TENTATIVE. */
> + gcc_assert (!cl->sorted);
> + coalesce_pair_p node;
> + coalesce_iterator_type ppi;
> + FOR_EACH_PARTITION_PAIR (node, ppi, cl)
> + {
> + tree v1 = ssa_name (node->first_element);
> + int p1 = partition_find (tentative, var_to_partition (map, v1));
> + tree v2 = ssa_name (node->second_element);
> + int p2 = partition_find (tentative, var_to_partition (map, v2));
> +
> + if (p1 == p2)
> + continue;
> +
> + partition_union (tentative, p1, p2);
> + }
> +
> + /* We have to deal with cost one pairs too. */
> + for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next)
> + {
> + tree v1 = ssa_name (co->first_element);
> + int p1 = partition_find (tentative, var_to_partition (map, v1));
> + tree v2 = ssa_name (co->second_element);
> + int p2 = partition_find (tentative, var_to_partition (map, v2));
> +
> + if (p1 == p2)
> + continue;
> +
> + partition_union (tentative, p1, p2);
> + }
> +
> + /* And also with abnormal edges. */
> + basic_block bb;
> + edge e;
> + edge_iterator ei;
> + FOR_EACH_BB_FN (bb, cfun)
> + {
> + FOR_EACH_EDGE (e, ei, bb->preds)
> + if (e->flags & EDGE_ABNORMAL)
> + {
> + gphi_iterator gsi;
> + for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
> + gsi_next (&gsi))
> + {
> + gphi *phi = gsi.phi ();
> + tree arg = PHI_ARG_DEF (phi, e->dest_idx);
> + if (SSA_NAME_IS_DEFAULT_DEF (arg)
> + && (!SSA_NAME_VAR (arg)
> + || TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL))
> + continue;
> +
> + tree res = PHI_RESULT (phi);
> +
> + int p1 = partition_find (tentative, var_to_partition (map, res));
> + int p2 = partition_find (tentative, var_to_partition (map, arg));
> +
> + if (p1 == p2)
> + continue;
> +
> + partition_union (tentative, p1, p2);
> + }
> + }
> + }
So the above does full coalescing ignoring conflicts.
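In toy form, as I understand the idea: union every candidate pair without consulting the conflict graph, then number the resulting classes so that conflicts only need to be computed among names sharing a number. The sketch below is standalone C, not GCC code; struct toy_partition, struct toy_pair and the toy_* functions are invented stand-ins for partition, the coalesce list and partition_to_base_index.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_NAMES 64

/* Minimal union-find over partition indices 0..n-1.  */
struct toy_partition {
  int parent[TOY_MAX_NAMES];
  int n;
};

/* One coalesce candidate: two partitions related by a copy or PHI.  */
struct toy_pair {
  int first, second;
};

void toy_partition_init(struct toy_partition *p, int n)
{
  assert(n >= 0 && n <= TOY_MAX_NAMES);
  p->n = n;
  for (int i = 0; i < n; i++)
    p->parent[i] = i;
}

int toy_find(struct toy_partition *p, int x)
{
  assert(x >= 0 && x < p->n);
  while (p->parent[x] != x)
    {
      p->parent[x] = p->parent[p->parent[x]];  /* path halving */
      x = p->parent[x];
    }
  return x;
}

void toy_union(struct toy_partition *p, int a, int b)
{
  int ra = toy_find(p, a), rb = toy_find(p, b);
  if (ra != rb)
    p->parent[rb] = ra;
}

/* Merge all candidate pairs regardless of conflicts, then give each
   resulting class a dense index.  BASE_INDEX[i] receives the index of
   partition i's class; the return value is the number of classes.  */
int toy_assign_bases(struct toy_partition *p, const struct toy_pair *pairs,
                     size_t npairs, int *base_index)
{
  for (size_t i = 0; i < npairs; i++)
    toy_union(p, pairs[i].first, pairs[i].second);

  int index_of_root[TOY_MAX_NAMES];
  for (int i = 0; i < p->n; i++)
    index_of_root[i] = -1;

  int count = 0;
  for (int i = 0; i < p->n; i++)
    {
      int root = toy_find(p, i);
      if (index_of_root[root] < 0)
        index_of_root[root] = count++;
      base_index[i] = index_of_root[root];
    }
  return count;
}
```

With five partitions and candidate pairs (0,1), (1,2) and (3,4), this produces two bases: one shared by partitions 0, 1 and 2, and one shared by 3 and 4.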
> +
> + map->partition_to_base_index = XCNEWVEC (int, parts);
> + auto_vec<unsigned int> index_map (parts);
> + if (parts)
> + index_map.quick_grow (parts);
> +
> + const unsigned no_part = -1;
> + unsigned count = parts;
> + while (count)
> + index_map[--count] = no_part;
> +
> + /* Initialize MAP's mapping from partition to base index, using
> + as base indices an enumeration of the TENTATIVE partitions in
> + which each SSA version ended up, so that we compute conflicts
> + between all SSA versions that ended up in the same potential
> + coalesce partition. */
> + bitmap_iterator bi;
> + unsigned i;
> + EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
> + {
> + int pidx = var_to_partition (map, ssa_name (i));
> + int base = partition_find (tentative, pidx);
> + if (index_map[base] != no_part)
> + continue;
> + index_map[base] = count++;
> + }
> +
> + map->num_basevars = count;
Did you do any statistics on how the number of basevars changes with your patch
compared to trunk?

So apart from possibly simplifying the patch by not dealing with default-def
coalesces of PARM_DECLs, and ignoring them for conflict purposes for others
(as tree-ssa-live.c does), the patch looks good to me.
Richard.
> + EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
> + {
> + int pidx = var_to_partition (map, ssa_name (i));
> + int base = partition_find (tentative, pidx);
> + gcc_assert (index_map[base] < count);
> + map->partition_to_base_index[pidx] = index_map[base];
> + }
> +
> + if (dump_file && (dump_flags & TDF_DETAILS))
> + dump_part_var_map (dump_file, tentative, map);
> +
> + partition_delete (tentative);
> +}
> +
> +
> /* Reduce the number of copies by coalescing variables in the function. Return
> a partition map with the resulting coalesces. */
>
> @@ -1332,7 +1647,12 @@ coalesce_ssa_name (void)
> dump_var_map (dump_file, map);
>
> /* Don't calculate live ranges for variables not in the coalesce list. */
> - partition_view_bitmap (map, used_in_copies, true);
> + partition_view_bitmap (map, used_in_copies, !optimize);
> +
> + /* If we are optimizing, compute the base indices ourselves. */
> + if (optimize)
> + compute_optimized_partition_bases (map, used_in_copies, cl);
> +
> BITMAP_FREE (used_in_copies);
>
> if (num_var_partitions (map) < 1)
> @@ -1341,6 +1661,8 @@ coalesce_ssa_name (void)
> return map;
> }
>
> + bitmap param_defaults = BITMAP_ALLOC (NULL);
> +
> if (dump_file && (dump_flags & TDF_DETAILS))
> dump_var_map (dump_file, map);
>
> @@ -1350,7 +1672,7 @@ coalesce_ssa_name (void)
> dump_live_info (dump_file, liveinfo, LIVEDUMP_ENTRY);
>
> /* Build a conflict graph. */
> - graph = build_ssa_conflict_graph (liveinfo);
> + graph = build_ssa_conflict_graph (liveinfo, param_defaults);
> delete_tree_live_info (liveinfo);
> if (dump_file && (dump_flags & TDF_DETAILS))
> ssa_conflicts_dump (dump_file, graph);
> @@ -1370,10 +1692,10 @@ coalesce_ssa_name (void)
> dump_var_map (dump_file, map);
>
> /* Now coalesce everything in the list. */
> - coalesce_partitions (map, graph, cl,
> - ((dump_flags & TDF_DETAILS) ? dump_file
> - : NULL));
> + coalesce_partitions (map, graph, cl, param_defaults,
> + ((dump_flags & TDF_DETAILS) ? dump_file : NULL));
>
> + BITMAP_FREE (param_defaults);
> delete_coalesce_list (cl);
> ssa_conflicts_delete (graph);
>
> diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c
> deleted file mode 100644
> index f3cb56e..0000000
> --- a/gcc/tree-ssa-copyrename.c
> +++ /dev/null
> @@ -1,499 +0,0 @@
> -/* Rename SSA copies.
> - Copyright (C) 2004-2015 Free Software Foundation, Inc.
> - Contributed by Andrew MacLeod <amacleod@redhat.com>
> -
> -This file is part of GCC.
> -
> -GCC is free software; you can redistribute it and/or modify
> -it under the terms of the GNU General Public License as published by
> -the Free Software Foundation; either version 3, or (at your option)
> -any later version.
> -
> -GCC is distributed in the hope that it will be useful,
> -but WITHOUT ANY WARRANTY; without even the implied warranty of
> -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> -GNU General Public License for more details.
> -
> -You should have received a copy of the GNU General Public License
> -along with GCC; see the file COPYING3. If not see
> -<http://www.gnu.org/licenses/>. */
> -
> -#include "config.h"
> -#include "system.h"
> -#include "coretypes.h"
> -#include "tm.h"
> -#include "hash-set.h"
> -#include "machmode.h"
> -#include "vec.h"
> -#include "double-int.h"
> -#include "input.h"
> -#include "alias.h"
> -#include "symtab.h"
> -#include "wide-int.h"
> -#include "inchash.h"
> -#include "tree.h"
> -#include "fold-const.h"
> -#include "predict.h"
> -#include "hard-reg-set.h"
> -#include "function.h"
> -#include "dominance.h"
> -#include "cfg.h"
> -#include "basic-block.h"
> -#include "tree-ssa-alias.h"
> -#include "internal-fn.h"
> -#include "gimple-expr.h"
> -#include "is-a.h"
> -#include "gimple.h"
> -#include "gimple-iterator.h"
> -#include "flags.h"
> -#include "tree-pretty-print.h"
> -#include "bitmap.h"
> -#include "gimple-ssa.h"
> -#include "stringpool.h"
> -#include "tree-ssanames.h"
> -#include "hashtab.h"
> -#include "rtl.h"
> -#include "statistics.h"
> -#include "real.h"
> -#include "fixed-value.h"
> -#include "insn-config.h"
> -#include "expmed.h"
> -#include "dojump.h"
> -#include "explow.h"
> -#include "calls.h"
> -#include "emit-rtl.h"
> -#include "varasm.h"
> -#include "stmt.h"
> -#include "expr.h"
> -#include "tree-dfa.h"
> -#include "tree-inline.h"
> -#include "tree-ssa-live.h"
> -#include "tree-pass.h"
> -#include "langhooks.h"
> -
> -static struct
> -{
> - /* Number of copies coalesced. */
> - int coalesced;
> -} stats;
> -
> -/* The following routines implement the SSA copy renaming phase.
> -
> - This optimization looks for copies between 2 SSA_NAMES, either through a
> - direct copy, or an implicit one via a PHI node result and its arguments.
> -
> - Each copy is examined to determine if it is possible to rename the base
> - variable of one of the operands to the same variable as the other operand.
> - i.e.
> - T.3_5 = <blah>
> - a_1 = T.3_5
> -
> - If this copy couldn't be copy propagated, it could possibly remain in the
> - program throughout the optimization phases. After SSA->normal, it would
> - become:
> -
> - T.3 = <blah>
> - a = T.3
> -
> - Since T.3_5 is distinct from all other SSA versions of T.3, there is no
> - fundamental reason why the base variable needs to be T.3, subject to
> - certain restrictions. This optimization attempts to determine if we can
> - change the base variable on copies like this, and result in code such as:
> -
> - a_5 = <blah>
> - a_1 = a_5
> -
> - This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is
> - possible, the copy goes away completely. If it isn't possible, a new temp
> - will be created for a_5, and you will end up with the exact same code:
> -
> - a.8 = <blah>
> - a = a.8
> -
> - The other benefit of performing this optimization relates to what variables
> - are chosen in copies. Gimplification of the program uses temporaries for
> - a lot of things. expressions like
> -
> - a_1 = <blah>
> - <blah2> = a_1
> -
> - get turned into
> -
> - T.3_5 = <blah>
> - a_1 = T.3_5
> - <blah2> = a_1
> -
> - Copy propagation is done in a forward direction, and if we can propagate
> - through the copy, we end up with:
> -
> - T.3_5 = <blah>
> - <blah2> = T.3_5
> -
> - The copy is gone, but so is all reference to the user variable 'a'. By
> - performing this optimization, we would see the sequence:
> -
> - a_5 = <blah>
> - a_1 = a_5
> - <blah2> = a_1
> -
> - which copy propagation would then turn into:
> -
> - a_5 = <blah>
> - <blah2> = a_5
> -
> - and so we still retain the user variable whenever possible. */
> -
> -
> -/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
> - Choose a representative for the partition, and send debug info to DEBUG. */
> -
> -static void
> -copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
> -{
> - int p1, p2, p3;
> - tree root1, root2;
> - tree rep1, rep2;
> - bool ign1, ign2, abnorm;
> -
> - gcc_assert (TREE_CODE (var1) == SSA_NAME);
> - gcc_assert (TREE_CODE (var2) == SSA_NAME);
> -
> - register_ssa_partition (map, var1);
> - register_ssa_partition (map, var2);
> -
> - p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
> - p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
> -
> - if (debug)
> - {
> - fprintf (debug, "Try : ");
> - print_generic_expr (debug, var1, TDF_SLIM);
> - fprintf (debug, "(P%d) & ", p1);
> - print_generic_expr (debug, var2, TDF_SLIM);
> - fprintf (debug, "(P%d)", p2);
> - }
> -
> - gcc_assert (p1 != NO_PARTITION);
> - gcc_assert (p2 != NO_PARTITION);
> -
> - if (p1 == p2)
> - {
> - if (debug)
> - fprintf (debug, " : Already coalesced.\n");
> - return;
> - }
> -
> - rep1 = partition_to_var (map, p1);
> - rep2 = partition_to_var (map, p2);
> - root1 = SSA_NAME_VAR (rep1);
> - root2 = SSA_NAME_VAR (rep2);
> - if (!root1 && !root2)
> - return;
> -
> - /* Don't coalesce if one of the variables occurs in an abnormal PHI. */
> - abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1)
> - || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2));
> - if (abnorm)
> - {
> - if (debug)
> - fprintf (debug, " : Abnormal PHI barrier. No coalesce.\n");
> - return;
> - }
> -
> - /* Partitions already have the same root, simply merge them. */
> - if (root1 == root2)
> - {
> - p1 = partition_union (map->var_partition, p1, p2);
> - if (debug)
> - fprintf (debug, " : Same root, coalesced --> P%d.\n", p1);
> - return;
> - }
> -
> - /* Never attempt to coalesce 2 different parameters. */
> - if ((root1 && TREE_CODE (root1) == PARM_DECL)
> - && (root2 && TREE_CODE (root2) == PARM_DECL))
> - {
> - if (debug)
> - fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n");
> - return;
> - }
> -
> - if ((root1 && TREE_CODE (root1) == RESULT_DECL)
> - != (root2 && TREE_CODE (root2) == RESULT_DECL))
> - {
> - if (debug)
> - fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n");
> - return;
> - }
> -
> - ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1));
> - ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2));
> -
> - /* Refrain from coalescing user variables, if requested. */
> - if (!ign1 && !ign2)
> - {
> - if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2))
> - ign2 = true;
> - else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1))
> - ign1 = true;
> - else if (flag_ssa_coalesce_vars != 2)
> - {
> - if (debug)
> - fprintf (debug, " : 2 different USER vars. No coalesce.\n");
> - return;
> - }
> - else
> - ign2 = true;
> - }
> -
> - /* If both values have default defs, we can't coalesce. If only one has a
> - tag, make sure that variable is the new root partition. */
> - if (root1 && ssa_default_def (cfun, root1))
> - {
> - if (root2 && ssa_default_def (cfun, root2))
> - {
> - if (debug)
> - fprintf (debug, " : 2 default defs. No coalesce.\n");
> - return;
> - }
> - else
> - {
> - ign2 = true;
> - ign1 = false;
> - }
> - }
> - else if (root2 && ssa_default_def (cfun, root2))
> - {
> - ign1 = true;
> - ign2 = false;
> - }
> -
> - /* Do not coalesce if we cannot assign a symbol to the partition. */
> - if (!(!ign2 && root2)
> - && !(!ign1 && root1))
> - {
> - if (debug)
> - fprintf (debug, " : Choosen variable has no root. No coalesce.\n");
> - return;
> - }
> -
> - /* Don't coalesce if the new chosen root variable would be read-only.
> - If both ign1 && ign2, then the root var of the larger partition
> - wins, so reject in that case if any of the root vars is TREE_READONLY.
> - Otherwise reject only if the root var, on which replace_ssa_name_symbol
> - will be called below, is readonly. */
> - if (((root1 && TREE_READONLY (root1)) && ign2)
> - || ((root2 && TREE_READONLY (root2)) && ign1))
> - {
> - if (debug)
> - fprintf (debug, " : Readonly variable. No coalesce.\n");
> - return;
> - }
> -
> - /* Don't coalesce if the two variables aren't type compatible . */
> - if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2))
> - /* There is a disconnect between the middle-end type-system and
> - VRP, avoid coalescing enum types with different bounds. */
> - || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE
> - || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE)
> - && TREE_TYPE (var1) != TREE_TYPE (var2)))
> - {
> - if (debug)
> - fprintf (debug, " : Incompatible types. No coalesce.\n");
> - return;
> - }
> -
> - /* Merge the two partitions. */
> - p3 = partition_union (map->var_partition, p1, p2);
> -
> - /* Set the root variable of the partition to the better choice, if there is
> - one. */
> - if (!ign2 && root2)
> - replace_ssa_name_symbol (partition_to_var (map, p3), root2);
> - else if (!ign1 && root1)
> - replace_ssa_name_symbol (partition_to_var (map, p3), root1);
> - else
> - gcc_unreachable ();
> -
> - if (debug)
> - {
> - fprintf (debug, " --> P%d ", p3);
> - print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)),
> - TDF_SLIM);
> - fprintf (debug, "\n");
> - }
> -}
> -
> -
> -namespace {
> -
> -const pass_data pass_data_rename_ssa_copies =
> -{
> - GIMPLE_PASS, /* type */
> - "copyrename", /* name */
> - OPTGROUP_NONE, /* optinfo_flags */
> - TV_TREE_COPY_RENAME, /* tv_id */
> - ( PROP_cfg | PROP_ssa ), /* properties_required */
> - 0, /* properties_provided */
> - 0, /* properties_destroyed */
> - 0, /* todo_flags_start */
> - 0, /* todo_flags_finish */
> -};
> -
> -class pass_rename_ssa_copies : public gimple_opt_pass
> -{
> -public:
> - pass_rename_ssa_copies (gcc::context *ctxt)
> - : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt)
> - {}
> -
> - /* opt_pass methods: */
> - opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); }
> - virtual bool gate (function *) { return flag_tree_copyrename != 0; }
> - virtual unsigned int execute (function *);
> -
> -}; // class pass_rename_ssa_copies
> -
> -/* This function will make a pass through the IL, and attempt to coalesce any
> - SSA versions which occur in PHI's or copies. Coalescing is accomplished by
> - changing the underlying root variable of all coalesced version. This will
> - then cause the SSA->normal pass to attempt to coalesce them all to the same
> - variable. */
> -
> -unsigned int
> -pass_rename_ssa_copies::execute (function *fun)
> -{
> - var_map map;
> - basic_block bb;
> - tree var, part_var;
> - gimple stmt;
> - unsigned x;
> - FILE *debug;
> -
> - memset (&stats, 0, sizeof (stats));
> -
> - if (dump_file && (dump_flags & TDF_DETAILS))
> - debug = dump_file;
> - else
> - debug = NULL;
> -
> - map = init_var_map (num_ssa_names);
> -
> - FOR_EACH_BB_FN (bb, fun)
> - {
> - /* Scan for real copies. */
> - for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
> - gsi_next (&gsi))
> - {
> - stmt = gsi_stmt (gsi);
> - if (gimple_assign_ssa_name_copy_p (stmt))
> - {
> - tree lhs = gimple_assign_lhs (stmt);
> - tree rhs = gimple_assign_rhs1 (stmt);
> -
> - copy_rename_partition_coalesce (map, lhs, rhs, debug);
> - }
> - }
> - }
> -
> - FOR_EACH_BB_FN (bb, fun)
> - {
> - /* Treat PHI nodes as copies between the result and each argument. */
> - for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
> - gsi_next (&gsi))
> - {
> - size_t i;
> - tree res;
> - gphi *phi = gsi.phi ();
> - res = gimple_phi_result (phi);
> -
> - /* Do not process virtual SSA_NAMES. */
> - if (virtual_operand_p (res))
> - continue;
> -
> - /* Make sure to only use the same partition for an argument
> - as the result but never the other way around. */
> - if (SSA_NAME_VAR (res)
> - && !DECL_IGNORED_P (SSA_NAME_VAR (res)))
> - for (i = 0; i < gimple_phi_num_args (phi); i++)
> - {
> - tree arg = PHI_ARG_DEF (phi, i);
> - if (TREE_CODE (arg) == SSA_NAME)
> - copy_rename_partition_coalesce (map, res, arg,
> - debug);
> - }
> - /* Else if all arguments are in the same partition try to merge
> - it with the result. */
> - else
> - {
> - int all_p_same = -1;
> - int p = -1;
> - for (i = 0; i < gimple_phi_num_args (phi); i++)
> - {
> - tree arg = PHI_ARG_DEF (phi, i);
> - if (TREE_CODE (arg) != SSA_NAME)
> - {
> - all_p_same = 0;
> - break;
> - }
> - else if (all_p_same == -1)
> - {
> - p = partition_find (map->var_partition,
> - SSA_NAME_VERSION (arg));
> - all_p_same = 1;
> - }
> - else if (all_p_same == 1
> - && p != partition_find (map->var_partition,
> - SSA_NAME_VERSION (arg)))
> - {
> - all_p_same = 0;
> - break;
> - }
> - }
> - if (all_p_same == 1)
> - copy_rename_partition_coalesce (map, res,
> - PHI_ARG_DEF (phi, 0),
> - debug);
> - }
> - }
> - }
> -
> - if (debug)
> - dump_var_map (debug, map);
> -
> - /* Now one more pass to make all elements of a partition share the same
> - root variable. */
> -
> - for (x = 1; x < num_ssa_names; x++)
> - {
> - part_var = partition_to_var (map, x);
> - if (!part_var)
> - continue;
> - var = ssa_name (x);
> - if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var))
> - continue;
> - if (debug)
> - {
> - fprintf (debug, "Coalesced ");
> - print_generic_expr (debug, var, TDF_SLIM);
> - fprintf (debug, " to ");
> - print_generic_expr (debug, part_var, TDF_SLIM);
> - fprintf (debug, "\n");
> - }
> - stats.coalesced++;
> - replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var));
> - }
> -
> - statistics_counter_event (fun, "copies coalesced",
> - stats.coalesced);
> - delete_var_map (map);
> - return 0;
> -}
> -
> -} // anon namespace
> -
> -gimple_opt_pass *
> -make_pass_rename_ssa_copies (gcc::context *ctxt)
> -{
> - return new pass_rename_ssa_copies (ctxt);
> -}
> diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
> index e0c42669..46b1869 100644
> --- a/gcc/tree-ssa-live.c
> +++ b/gcc/tree-ssa-live.c
> @@ -119,7 +119,10 @@ tree_int_map_hasher::equal (const value_type *v, const compare_type *c)
> }
>
>
> -/* This routine will initialize the basevar fields of MAP. */
> +/* This routine will initialize the basevar fields of MAP with base
> + names, when we are not optimizing. When optimizing, we'll use
> + partition numbers as base index numbers, see coalesce_ssa_name in
> + tree-ssa-coalesce.c. */
>
> static void
> var_map_base_init (var_map map)
>
>
> --
> Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/ FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: [PR64164] drop copyrename, integrate into expand
2015-03-31 5:11 ` Jeff Law
@ 2015-04-03 13:17 ` Alexandre Oliva
2015-04-06 16:08 ` Jeff Law
0 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-04-03 13:17 UTC (permalink / raw)
To: Jeff Law; +Cc: gcc-patches
On Mar 31, 2015, Jeff Law <law@redhat.com> wrote:
>> - if (var1 != var2)
>> + if (var1 != var2 && !optimize)
>> return false;
> So when the base variables are different and we are optimizing,
> this allows coalescing, right?
Yeah.
> What I don't see is a corresponding change to var_map_base_init to
> ensure we build a conflict graph which includes objects when
> SSA_NAME_VARs are not the same. I see a vague reference in
> var_map_base_init's header comment that refers us to
> coalesce_ssa_name.
> It appears that compute_optimized_partition_bases handles this by
> creating partitions of things that are related by copies/phis
> regardless of their underlying named object, type, etc. Right?
Correct. I guess it makes sense to move partition base computation to a
single location. Since compute_optimized_partition_bases relies on data
structures local to this source file, I'm moving the non-optimized
version to tree-ssa-coalesce.c, and dropping support for basevar
initialization from tree-ssa-live.c.
> Hard to argue with removing a pass that gets called 5 times!
:-)
>> @@ -890,6 +900,36 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
>> live_track_process_def (live, result, graph);
>> }
>>
>> + /* Pretend there are defs for params' default defs at the start
>> + of the (post-)entry block. We run after abnormal coalescing,
>> + so we can't assume the leader variable is the default
>> + definition, but because of SSA_NAME_VAR adjustments in
>> + attempt_coalesce, we can assume that if there is any
>> + PARM_DECL in the partition, it will be the leader's
>> + SSA_NAME_VAR. */
This comment is outdated. Since we no longer have abnormal coalescing
before building the conflict graph, we can just test whether each
SSA_NAME is a default def for a PARM_DECL and be done with it.
> So the issue here is you want to iterate over the objects live at the
> entry block, which would include any SSA_NAMEs which result from
> PARM_DECLs. I don't guess there's an easier way to do that other than
> iterating over everything live in that initial block?
We could iterate over all SSA_NAMEs, but that would probably be more
costly. There shouldn't be very many live variables at the function
entry, so using the live bitmaps is likely to save us time, especially
on functions with lots of SSA_NAMEs.
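
To make the idea concrete, here is a minimal sketch (not the GCC code) of walking only the live-on-entry set and pretending each param default def is defined at the top of the entry block. The conflict_graph type, the 64-partition limit and the function names are assumptions made up for illustration; the real code works on bitmaps per base var, which this sketch ignores.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PARTS 64

/* Hypothetical conflict graph: conflicts[i] has bit j set when
   partitions i and j must not share a location.  */
typedef struct
{
  uint64_t conflicts[MAX_PARTS];
} conflict_graph;

/* Record a symmetric conflict between partitions A and B.  */
static void
add_conflict (conflict_graph *g, unsigned a, unsigned b)
{
  assert (a < MAX_PARTS && b < MAX_PARTS);
  if (a == b)
    return;
  g->conflicts[a] |= (uint64_t) 1 << b;
  g->conflicts[b] |= (uint64_t) 1 << a;
}

/* Treat every param default def that is live on entry as if it were
   defined at the start of the entry block: it conflicts with every
   other partition live at that point.  Only the live set is walked,
   never the full list of names.  */
static void
add_entry_conflicts (conflict_graph *g, uint64_t live_on_entry,
                     uint64_t param_defaults)
{
  uint64_t defs = live_on_entry & param_defaults;

  for (unsigned d = 0; d < MAX_PARTS; d++)
    {
      if (!((defs >> d) & 1))
        continue;
      for (unsigned l = 0; l < MAX_PARTS; l++)
        if (((live_on_entry >> l) & 1) && l != d)
          add_conflict (g, d, l);
    }
}
```

With partitions 0, 1 and 3 live on entry and 0, 1, 2 being param default defs, only 0 and 1 act as entry defs; partition 2 is not live, so it picks up no conflicts.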
> And the second EXECUTE_IF_SET_IN_BITMAP iterates over
> everything in the partitions associated with the SSA_NAMES that are
> live at the entry block, right?
Yeah, we iterate over the bases in live_base_var, because the per-base
bitmaps are only accurate when the corresponding live_base_var bit is
set.
>> @@ -1126,11 +1166,12 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
>>
>> static inline bool
>> attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
>> - FILE *debug)
>> + bitmap param_defaults, FILE *debug)
> [ ... ]
> So the bulk of the changes into this routine are really about picking
> a good leader, which presumably is how we're able to get the desired
> effects on debuginfo that we used to get from tree-ssa-copyrename.c?
This has nothing to do with debuginfo, I'm afraid. We just had to keep
track of parm and result decls to avoid coalescing them together, and
parm decl default defs to promote them to leaders, because expand copies
incoming REGs to pseudos in PARM_DECL's DECL_RTL. We should fill that
in with the RTL created for the default def for the PARM_DECL. At the
end, I believe we also copy the RESULT_DECL DECL_RTL to the actual
return register or rtl. I didn't want to tackle the reworking of these
expanders to avoid problems out of copying incoming parms to one pseudo
and then reading it from another, as I observed before I made this
change. I'm now tackling that, so that we can refrain from touching the
base vars altogether, and not refrain from coalescing parms and results.
> So some comments about the various cases here might help. I can sort
> them out if I read the code, but one could argue that a block comment
> on the rules for how to select the partition leader would be better.
*nod*. I won't bother, though, if this code ends up gone in the next
iteration of the patch ;-)
> Is the special casing of PARM_DECLs + RESULT_DECLs really a failing of
> not handling one or both properly when computing liveness information?
No, it's about RTL assignment and copying to/from hard regs. We assign
RTL to PARM_DECLs and RESULT_DECLs explicitly in the expander, but we
can't assign different RTL to them if they are coalesced in a single
partition.
> I'm not aware of an inherent reason why a PARM_DECL couldn't coalesce
> with a related RESULT_DECL if they are otherwise non-conflicting and
> related by a copy/phi.
Indeed, there isn't any inherent reason. It was just a restriction I
carried over from copyrename, and that I postponed cleaning up.
> Presumably ordering of unioning of the partitions doesn't matter here
> as we're looking at coalesce possibilities rather than things we have
> actually coalesced? Thus it's OK (?) to handle the names occurring in
> abnormal PHIs after those names that are associated by a copy.
Yeah, they'll end up with the same basevar one way or another.
--
Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/ FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: [PR64164] drop copyrename, integrate into expand
2015-03-31 14:06 ` Richard Biener
@ 2015-04-03 13:30 ` Alexandre Oliva
2015-04-06 15:57 ` Jeff Law
0 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-04-03 13:30 UTC (permalink / raw)
To: Richard Biener; +Cc: GCC Patches
On Mar 31, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
>> + || !(SSA_NAME_IS_DEFAULT_DEF (var)
>> + || (param_defaults
>> + && bitmap_bit_p (param_defaults, part))))
> This looks somewhat awkward to me ;) Is it really important to allow
> coalescing PARM_DECL-based SSA vars with sth else?
It's a valid optimization. I can't say it's really important, but if
the only objection is to param_defaults, I'm getting rid of it.
> At least abnormal coalescing doesn't need to do that, so just walking
> over the function decls parameters and making their default-def live
> should be enough?
It should. We'd have to duplicate logic of parameters, including static
chain and whatnot. I figured this would make it more resilient to
changes elsewhere.
>> + else if (TREE_CODE (SSA_NAME_VAR (var2)) == VAR_DECL)
>> + leader = SSA_NAME_VAR (var2);
>> + else /* What else could it be? */
>> + gcc_unreachable ();
>> +
> definitely comments missing in this spaghetti...
I'm trying to remove the spaghetti now.
> or seeing this, why coalesce default-defs at all?
Why not? (the referenced code is gone from my local tree, BTW)
> Either they are param values or they have indeterminate values (and
> thus we can and do pick whatever is available at expansion time)?
If they are param values, we want to have them available; if they
aren't, whatever we coalesce with is good.
> So the above does full coalescing ignoring conflicts.
Yeah. We want to tell what we'd get if all coalesce possibilities are
taken, so that we can assign the same basevar to all partitions so that
we detect conflicts.
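
As a rough sketch of that step (an illustration under assumptions, not the code in the patch): union every coalescible pair while ignoring conflicts, then number the resulting classes densely and use that number as the base index. The partition_set type and the ps_* and assign_bases names are hypothetical.

```c
#include <assert.h>

#define MAX_NAMES 64

/* Hypothetical union-find over SSA name versions.  */
typedef struct
{
  int parent[MAX_NAMES];
  int n;
} partition_set;

static void
ps_init (partition_set *ps, int n)
{
  assert (n > 0 && n <= MAX_NAMES);
  ps->n = n;
  for (int i = 0; i < n; i++)
    ps->parent[i] = i;
}

/* Find the representative of X, halving paths along the way.  */
static int
ps_find (partition_set *ps, int x)
{
  while (ps->parent[x] != x)
    {
      ps->parent[x] = ps->parent[ps->parent[x]];
      x = ps->parent[x];
    }
  return x;
}

/* Merge the classes of A and B; conflicts are deliberately ignored.  */
static void
ps_union (partition_set *ps, int a, int b)
{
  a = ps_find (ps, a);
  b = ps_find (ps, b);
  if (a != b)
    ps->parent[b] = a;
}

/* Give every name the dense index of its class in BASE[] and return
   the number of distinct bases.  Names with different bases can never
   coalesce, so conflicts only matter within one base.  */
static int
assign_bases (partition_set *ps, int *base)
{
  int root_base[MAX_NAMES];
  int num = 0;

  for (int i = 0; i < ps->n; i++)
    root_base[i] = -1;
  for (int i = 0; i < ps->n; i++)
    {
      int r = ps_find (ps, i);
      if (root_base[r] < 0)
        root_base[r] = num++;
      base[i] = root_base[r];
    }
  return num;
}
```

Because the unions form equivalence classes, the order in which copies and PHIs are visited does not change the final base assignment.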
> Did you do any statistics on how the number of basevars changes with your patch
> compared to trunk?
'fraid I didn't run any statistics whatsoever. I didn't think it was
important, since it's pretty much just doing copyrename during coalesce.
Truth be told, this has since relaxed some of the constraints from
copyrename, and I'm going to relax some more in the next iteration, so I
guess some statistics wouldn't be a bad idea. Is there any specific
testcase you're interested in, or something like a GCC bootstrap or
somesuch?
--
Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/ FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: [PR64164] drop copyrename, integrate into expand
2015-04-03 13:30 ` Alexandre Oliva
@ 2015-04-06 15:57 ` Jeff Law
0 siblings, 0 replies; 127+ messages in thread
From: Jeff Law @ 2015-04-06 15:57 UTC (permalink / raw)
To: Alexandre Oliva, Richard Biener; +Cc: GCC Patches
On 04/03/2015 07:28 AM, Alexandre Oliva wrote:
> On Mar 31, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
>
>>> + || !(SSA_NAME_IS_DEFAULT_DEF (var)
>>> + || (param_defaults
>>> + && bitmap_bit_p (param_defaults, part))))
>
>> This looks somewhat awkward to me ;) Is it really important to allow
>> coalescing PARM_DECL-based SSA vars with sth else?
>
> It's a valid optimization. I can't say it's really important, but if
> the only objection is to param_defaults, I'm getting rid of it.
I doubt it's terribly important, but I agree it's a valid optimization.
Do you have a testcase where it triggers? Can we include that too so
that if someone wants to remove this later for some reason or another
we'd at least have a chance of seeing a regression.
ISTM it can only trigger when the PARM is tied to another object via a
copy and the PARM and other object have non-overlapping lifetimes. I'd
expect that this may happen at PHIs where the PARM appears on the RHS
and dies at that point -- the PARM and the PHI result are likely not
going to conflict and thus may coalesce.
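
A hypothetical source-level shape for that situation is sketched below. It is not taken from the testsuite or from the PR, the name pick is made up, and whether the parm and the PHI result actually end up coalesced has not been checked.

```c
/* Hypothetical example: on the else path the incoming value of X
   reaches the PHI for Y directly and X is not used afterwards, so the
   default def of X and the PHI result need not conflict.  */
int
pick (int x, int flag)
{
  int y;

  if (flag)
    y = x + 1;
  else
    y = x;	/* X feeds the PHI for Y and dies here.  */
  return y;
}
```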
>
>> At least abnormal coalescing doesn't need to do that, so just walking
>> over the function decls parameters and making their default-def live
>> should be enough?
>
> It should. We'd have to duplicate logic of parameters, including static
> chain and whatnot. I figured this would make it more resilient to
> changes elsewhere.
This ties in a bit to my comment about whether or not we've got proper
life information for PARMs. I'd generally prefer to see us get the life
information correct.
>
>>> + else if (TREE_CODE (SSA_NAME_VAR (var2)) == VAR_DECL)
>>> + leader = SSA_NAME_VAR (var2);
>>> + else /* What else could it be? */
>>> + gcc_unreachable ();
>>> +
>
>> definitely comments missing in this spaghetti...
>
> I'm trying to remove the spaghetti now.
Good :-)
>
>> or seeing this, why coalesce default-defs at all?
>
> Why not? (the referenced code is gone from my local tree, BTW)
>
>> Either they are param values or they have indeterminate values (and
>> thus we can and do pick whatever is available at expansion time)?
>
> If they are param values, we want to have them available; if they
> aren't, whatever we coalesce with is good.
Agreed. Didn't we recently change the coalescing code to allow
coalescing non-PARM default defs more aggressively:
Author: glisse <glisse@138bc75d-0d04-0410-961f-82ee72b054a4>
Date: Mon Nov 3 10:47:04 2014 +0000
2014-11-03 Marc Glisse <marc.glisse@inria.fr>
PR tree-optimization/60770
gcc/
* tree-into-ssa.c (rewrite_update_stmt): Return whether the
statement should be removed.
(maybe_register_def): Likewise. Replace clobbers with default
definitions.
(rewrite_dom_walker::before_dom_children): Remove statement if
rewrite_update_stmt says so.
* tree-ssa-live.c: Include tree-ssa.h.
(set_var_live_on_entry): Do not mark undefined variables as live.
(verify_live_on_entry): Do not check undefined variables.
* tree-ssa.h (ssa_undefined_value_p): New parameter for the case
of partially undefined variables.
* tree-ssa.c (ssa_undefined_value_p): Likewise.
(execute_update_addresses_taken): Do not drop clobbers.
gcc/testsuite/
* gcc.dg/tree-ssa/pr60770-1.c: New file.
> guess some statistics wouldn't be a bad idea. Is there any specific
> testcase you're interested in, or something like a GCC bootstrap or
> somesuch?
Not from me. bootstrap or .i files from gcc bootstrap would seem to be
sufficient to me.
jeff
^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: [PR64164] drop copyrename, integrate into expand
2015-04-03 13:17 ` Alexandre Oliva
@ 2015-04-06 16:08 ` Jeff Law
2015-04-24 1:56 ` Alexandre Oliva
0 siblings, 1 reply; 127+ messages in thread
From: Jeff Law @ 2015-04-06 16:08 UTC (permalink / raw)
To: Alexandre Oliva; +Cc: gcc-patches
On 04/03/2015 07:17 AM, Alexandre Oliva wrote:
>>> @@ -890,6 +900,36 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
>>> live_track_process_def (live, result, graph);
>>> }
>>>
>>> + /* Pretend there are defs for params' default defs at the start
>>> + of the (post-)entry block. We run after abnormal coalescing,
>>> + so we can't assume the leader variable is the default
>>> + definition, but because of SSA_NAME_VAR adjustments in
>>> + attempt_coalesce, we can assume that if there is any
>>> + PARM_DECL in the partition, it will be the leader's
>>> + SSA_NAME_VAR. */
>
> This comment is outdated. Since we no longer have abnormal coalescing
> before building the conflict graph, we can just test whether each
> SSA_NAME is a default def for a PARM_DECL and be done with it.
OK. Please update the comment :-0
>
>> So the issue here is you want to iterate over the objects live at the
>> entry block, which would include any SSA_NAMEs which result from
>> PARM_DECLs. I don't guess there's an easier way to do that other than
>> iterating over everything live in that initial block?
>
> We could iterate over all SSA_NAMEs, but that would probably be more
> costly. There shouldn't be very many live variables at the function
> entry, so using the live bitmaps is likely to save us time, especially
> on functions with lots of SSA_NAMEs.
Agreed. Iterating over the SSA_NAMEs was what came to mind when I
pondered this a bit more and I'd rejected it for the same reason.
Can we get to the SSA_NAMEs associated with the PARM_DECLs from the
function decl? I can't think of a way off the top of my head, but if we
could, then that'd avoid the iteration over the bitmap of live variables.
But then again, the bitmap of live variables ought to be small,
particularly if we're not marking non-PARM default defs as live anymore
(see patch reference in my prior message).
>> So the bulk of the changes into this routine are really about picking
>> a good leader, which presumably is how we're able to get the desired
>> effects on debuginfo that we used to get from tree-ssa-copyrename.c?
>
> This has nothing to do with debuginfo, I'm afraid. We just had to keep
> track of parm and result decls to avoid coalescing them together, and
> parm decl default defs to promote them to leaders, because expand copies
> incoming REGs to pseudos in PARM_DECL's DECL_RTL. We should fill that
> in with the RTL created for the default def for the PARM_DECL. At the
> end, I believe we also copy the RESULT_DECL DECL_RTL to the actual
> return register or rtl. I didn't want to tackle the reworking of these
> expanders to avoid problems out of copying incoming parms to one pseudo
> and then reading it from another, as I observed before I made this
> change. I'm now tackling that, so that we can refrain from touching the
> base vars altogether, and not refrain from coalescing parms and results.
Hmmm, so the real issue here is the expansion setup of parms and
results. I hadn't pondered that aspect. I'd encourage fixing the
expansion code too if you can see a path for that.
Basically I just don't like special casing things like this --
coalescing should be driven by life information/conflict graph and a
copy relationship between the two candidate objects.
Overall it looks like you're on the right path and we'll just need to
iterate a bit more.
jeff
^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: [PR64164] drop copyrename, integrate into expand
2015-04-06 16:08 ` Jeff Law
@ 2015-04-24 1:56 ` Alexandre Oliva
2015-04-27 11:39 ` Richard Biener
2015-04-29 3:51 ` Jeff Law
0 siblings, 2 replies; 127+ messages in thread
From: Alexandre Oliva @ 2015-04-24 1:56 UTC (permalink / raw)
To: Jeff Law, Richard Biener; +Cc: gcc-patches
On Apr 6, 2015, Jeff Law <law@redhat.com> wrote:
>>> So the bulk of the changes into this routine are really about picking
>>> a good leader, which presumably is how we're able to get the desired
>>> effects on debuginfo that we used to get from tree-ssa-copyrename.c?
>>
>> This has nothing to do with debuginfo, I'm afraid. We just had to keep
>> track of parm and result decls to avoid coalescing them together, and
>> parm decl default defs to promote them to leaders, because expand copies
>> incoming REGs to pseudos in PARM_DECL's DECL_RTL. We should fill that
>> in with the RTL created for the default def for the PARM_DECL. At the
>> end, I believe we also copy the RESULT_DECL DECL_RTL to the actual
>> return register or rtl. I didn't want to tackle the reworking of these
>> expanders to avoid problems out of copying incoming parms to one pseudo
>> and then reading it from another, as I observed before I made this
>> change. I'm now tackling that, so that we can refrain from touching the
>> base vars altogether, and not refrain from coalescing parms and results.
> Hmmm, so the real issue here is the expansion setup of parms and
> results. I hadn't pondered that aspect. I'd encourage fixing the
> expansion code too if you can see a path for that.
That was the trickiest bit of the patch: getting assign_parms to use the
out-of-SSA-chosen RTL for the (default def of the) param instead of
creating a pseudo or stack slot of its own, especially when we create a
.result_ptr decl and there is an incoming by-ref result_decl, in which
case we ought to use the same SSA-assigned pseudo for both. Another
case worth mentioning is that in which a param is unused: there is no
default def for it, but in the non-optimized case, we have to assign it
to the same location. I've used the DECL_RTL itself to carry the
information in this case, at least in the non-optimized case, in which
we know all SSA_NAMEs associated with each param *will* be assigned to
the same partition, and use the same RTL. If we do optimize, the param
may get multiple locations, and DECL_RTL will be cleared. That's fine:
the incoming value of the param will end up being copied to a separate
pseudo, so that there's no risk of messing with any other default def
(there's a new testcase for this), and the copy is likely to be
optimized out.
The other tricky bit was to fix all expander bits that required
SSA_NAMEs to have an associated decl. I've removed all such cases, so we
can now expand anonymous SSA decls directly, without having to create an
ignored decl. Doing that, we can coalesce variables and expand each
partition without worrying about choosing a preferred partition leader.
We just have to make sure we don't consider pairs of variables eligible
for coalescing if they should get different promoted modes, or a
different register-or-stack choice, and then expansion of partitions is
streamlined: we just expand each leader, and then adjust all SSA_NAMEs
to associate the RTL with their base variables, if any.
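
The eligibility rule can be pictured roughly as follows. This is only a sketch: expand_choice and choices_allow_coalescing are hypothetical names, and the integer promoted_mode merely stands in for a machine mode; the actual check lives in gimple_can_coalesce_p.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of what expand would decide for one SSA name.  */
typedef struct
{
  int promoted_mode;	/* stand-in for the mode after promotion */
  bool in_register;	/* true: pseudo register; false: stack slot */
} expand_choice;

/* Two names are only worth considering for coalescing when expand
   would give them the same promoted mode and the same
   register-or-stack treatment; otherwise one piece of RTL could not
   serve both.  */
static bool
choices_allow_coalescing (const expand_choice *a, const expand_choice *b)
{
  assert (a && b);
  return (a->promoted_mode == b->promoted_mode
	  && a->in_register == b->in_register);
}
```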
In this revision of the patch, I have retained -ftree-coalesce-vars, so
that its negated form can be used in testcases that formerly expected no
coalescing across user variables, but that explicitly disabled VTA.
As for testcases, while investigating test regressions, I found out
various guality failures had to do with var-tracking's lack of awareness of custom
calling conventions. Caller's variables saved in registers that are
normally call-clobbered, but that are call-saved in custom conventions
set up for a callee, would end up invalidating the entry-point location
associations. I've arranged for var-tracking to use custom calling
conventions for register invalidation at call insns, and this fixed not
only a few guality regressions due to changes in register assignment,
but a number of other long-standing guality failures. Yay! This could
be split out into a standalone patch.
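
In rough terms, the change amounts to picking the clobber set per call instead of always using the default one. The sketch below illustrates that idea only; regset, regs_invalidated_by_call and surviving_locations are made-up names, not var-tracking's API.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t regset;	/* hypothetical: one bit per hard register */

/* Registers whose tracked locations must be dropped at a call: the
   callee-specific clobber set when one is known, otherwise the
   default call-clobbered set.  */
static regset
regs_invalidated_by_call (regset default_clobbers,
			  const regset *callee_clobbers)
{
  return callee_clobbers != NULL ? *callee_clobbers : default_clobbers;
}

/* Locations still valid after the call.  */
static regset
surviving_locations (regset tracked, regset default_clobbers,
		     const regset *callee_clobbers)
{
  return tracked & ~regs_invalidated_by_call (default_clobbers,
					       callee_clobbers);
}
```

With the default set alone, variables held in registers 1 and 2 would lose their locations at every call; with a callee that is known to clobber only register 0, both survive.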
On Mar 31, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
> Did you do any statistics on how the number of basevars changes with your patch
> compared to trunk?
In this version of the patch, we no longer touch the base vars at all.
We just associate the piece of RTL generated for the partition with a
list of decls, if needed. (I've just realized that I never noticed a
list of decls show up anywhere, and looking into this, I saw a bug in
the leader_merge function, that causes it to fail to go from a single
entry to a list: it creates the list, but then returns the original
non-list entry; that's why I never saw them! I won't delay posting the
patch just because of this; I'm not even sure we want decl lists in REG
or MEM attrs to begin with.)
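
The shape of that bug, as described, looks something like the sketch below. This is not the leader_merge code; decl_node, merge_buggy and merge_fixed are hypothetical and only illustrate the pattern of building a list and then returning the wrong pointer.

```c
#include <stddef.h>

/* Hypothetical singly linked list of decl names.  */
typedef struct decl_node
{
  const char *name;
  struct decl_node *next;
} decl_node;

/* Buggy shape: the list is created, but the caller gets back the old
   single entry, so the new element is never seen.  */
static decl_node *
merge_buggy (decl_node *cur, decl_node *added)
{
  added->next = cur;	/* the list is built here ...  */
  return cur;		/* ... but the non-list entry is returned.  */
}

/* Fixed shape: return the head of the list that was just built.  */
static decl_node *
merge_fixed (decl_node *cur, decl_node *added)
{
  added->next = cur;
  return added;
}

/* Count the entries reachable from N.  */
static size_t
list_length (const decl_node *n)
{
  size_t len = 0;

  for (; n != NULL; n = n->next)
    len++;
  return len;
}
```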
I have collected some statistics on the effects of the patch in
compiling stage3-gcc/, before and after the patch, with and without
-fno-tree-coalesce-vars. I counted, per function:
b/a: before the patch, or after the patch
c/n: -ftree-coalesce-vars (default when optimizing) or
-fno-tree-coalesce-vars
cv: the coalescible var count, i.e., the active partition count prior to
coalescing. SSA_NAMEs not eligible for coalescing are not counted.
The more of these there are, the larger the conflict graph we have to
build.
base: the base variable count that guides the construction of the
conflict map. The more of these there are, the smaller the conflict
graph we have to build, but it is also a lower bound for the final
partition count.
part: the partition count after coalescing, not counting those of
SSA_NAMEs that were not eligible for coalescing to begin with.
abn: successful abnormal coalesce count. How many times
attempt_coalesce returned true as called in the abnormal coalesce loop.
same: successful normal coalesces of pairs of SSA_NAMEs that share the
same base variable (SSA_NAME_VAR, not the base index used to guide the
construction of the conflict graph). Ignored base decls are regarded as
NULL for purposes of this comparison. How many times attempt_coalesce
returned true for variables that share the same base variable. This may
count cases in which both vars are in the same partition already due to
earlier coalesces.
other: successful normal coalesces of pairs of SSA_NAMEs that do NOT
share the same base variable. Same caveats as above.
fail: failed attempts at normal coalesce. How many times
attempt_coalesce returned false.
b/a     c/n      cv      base    part    abn    same    other  fail
before  -fno-tr  570180  176682  221442  82076  370746      0  10542
before  -ftree-  577212  171581  221927  82076  378093      0  18654
after   -fno-tr  608533  179959  220948  82076  488119      0  11697
after   -ftree-  589243  202588  221817  82076  349373  41775  24124
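
For anyone wanting to derive per-row figures from these counters, a small helper along these lines may do. The coalesce_stats struct and both function names are made up for illustration. Note that, per the caveat above, successful coalesces may include pairs already in the same partition, so same + other need not equal cv - part.

```c
#include <assert.h>

/* Hypothetical container for one row of the table above.  */
typedef struct
{
  long cv, base, part, abn, same, other, fail;
} coalesce_stats;

/* Partitions removed by coalescing: initial count minus final count.  */
static long
partitions_eliminated (const coalesce_stats *s)
{
  assert (s->cv >= s->part);
  return s->cv - s->part;
}

/* Total calls to attempt_coalesce from the normal coalesce list.  */
static long
normal_attempts (const coalesce_stats *s)
{
  return s->same + s->other + s->fail;
}
```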
Here's (for reference only) the patch used to gather the data
consolidated above:
diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index eeac5a4..d9fe4cc 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -1199,6 +1199,11 @@ coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
edge e;
edge_iterator ei;
+ int abnormal = 0, samevar = 0, othervar = 0, failure = 0;
+ int initial_partitions = num_var_partitions (map);
+ int final_partitions = initial_partitions;
+ int p1, p2;
+
/* First, coalesce all the copies across abnormal edges. These are not placed
in the coalesce list because they do not need to be sorted, and simply
consume extra memory/compilation time in large programs. */
@@ -1226,8 +1231,17 @@ coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
if (debug)
fprintf (debug, "Abnormal coalesce: ");
+ p1 = var_to_partition (map, arg);
+ p2 = var_to_partition (map, res);
+
if (!attempt_coalesce (map, graph, v1, v2, debug))
fail_abnormal_edge_coalesce (v1, v2);
+ else
+ {
+ abnormal++;
+ if (p1 != p2)
+ final_partitions--;
+ }
}
}
}
@@ -1244,8 +1258,30 @@ coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
if (debug)
fprintf (debug, "Coalesce list: ");
- attempt_coalesce (map, graph, x, y, debug);
+
+ p1 = var_to_partition (map, var1);
+ p2 = var_to_partition (map, var2);
+
+ if (!attempt_coalesce (map, graph, x, y, debug))
+ failure++;
+ else
+ {
+ if (p1 != p2)
+ final_partitions--;
+ if ((SSA_NAME_VAR (var1) && !DECL_IGNORED_P (SSA_NAME_VAR (var1))
+ ? SSA_NAME_VAR (var1) : NULL)
+ == (SSA_NAME_VAR (var2) && !DECL_IGNORED_P (SSA_NAME_VAR (var2))
+ ? SSA_NAME_VAR (var2) : NULL))
+ samevar++;
+ else
+ othervar++;
+ }
}
+
+ inform (1,
+ "%i cv, %i base, %i part, %i abn, %i same, %i other, %i failed in %q+F",
+ initial_partitions, num_basevars (map), final_partitions,
+ abnormal, samevar, othervar, failure, current_function_decl);
}
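As a reading aid for the counters above, the "same" versus "other" split can be modelled outside GCC. This is only a sketch under stated assumptions: struct decl, struct ssa_name, effective_base and record_attempt are hypothetical names, not GCC types or functions. It mirrors the comparison in the instrumentation, where a missing or ignored base variable is treated as NULL, so two such names count as having the same base.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a declaration; only the "ignored" flag matters
   here (it stands in for DECL_IGNORED_P).  */
struct decl { bool ignored; };

/* Hypothetical model of an SSA name; VAR is NULL for anonymous names
   (it stands in for SSA_NAME_VAR).  */
struct ssa_name { const struct decl *var; };

/* Tallies corresponding to the samevar, othervar and failure counters.  */
struct coalesce_stats { int same, other, failed; };

/* Base variable used for the comparison: NULL when the name is
   anonymous or its variable is ignored for debug purposes.  */
static const struct decl *
effective_base (const struct ssa_name *name)
{
  return (name->var && !name->var->ignored) ? name->var : NULL;
}

/* Record the outcome of one coalesce attempt between N1 and N2.  */
static void
record_attempt (struct coalesce_stats *stats, bool coalesced,
                const struct ssa_name *n1, const struct ssa_name *n2)
{
  assert (stats != NULL);

  if (!coalesced)
    stats->failed++;
  else if (effective_base (n1) == effective_base (n2))
    stats->same++;
  else
    stats->other++;
}
```

In this model, two names based on different ignored temporaries land in "same", just as an anonymous name paired with an ignored temporary does; only pairs where at least one side has a distinct user-visible base end up in "other".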
And here's the actual patch I'm submitting for your appreciation (I was
gonna say for inclusion, but given the leader_merge brown paper bag bug,
I'd just like feedback on whether we want that or not; I'll then either
drop the list-building or, probably, post a revised patch that fixes the
fallout from lists appearing where decls are expected.)
No regressions, and many progressions, on x86_64-linux-gnu and
i686-pc-linux-gnu.
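For reviewers weighing the list-building question, here is a minimal sketch of the contract that the comment on leader_merge in the patch describes: return the current set unchanged if the new decl is already covered, otherwise append it at the end. It is not the GCC code. struct decl_list, decl_list_merge and decl_list_length are hypothetical stand-ins for TREE_LIST handling, and, unlike the patch, this model always uses a list, even for a single entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a TREE_LIST node; not GCC's tree type.  */
struct decl_list
{
  const void *value;        /* plays the role of TREE_VALUE */
  struct decl_list *chain;  /* plays the role of TREE_CHAIN */
};

/* Return LIST unchanged if DECL is already in it.  Otherwise append
   DECL at the end and return the head, which is a new node only when
   LIST was empty.  */
static struct decl_list *
decl_list_merge (struct decl_list *list, const void *decl)
{
  struct decl_list **tail = &list;

  /* Look for DECL, stopping at the link where a new node would go.  */
  for (; *tail; tail = &(*tail)->chain)
    if ((*tail)->value == decl)
      return list;

  struct decl_list *node = malloc (sizeof *node);
  assert (node != NULL);
  node->value = decl;
  node->chain = NULL;
  *tail = node;

  return list;
}

/* Count the entries in LIST.  */
static size_t
decl_list_length (const struct decl_list *list)
{
  size_t n = 0;

  for (; list; list = list->chain)
    n++;
  return n;
}
```

The point of the model is the return value: callers must always use what the merge returns, because the head changes when the first entry is added. Whether a bare decl or a one-element list should represent the single-entry case is exactly the design choice the feedback request above is about.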
[PR64164] Drop copyrename, use coalescible partition as base when optimizing.
for gcc/ChangeLog
PR rtl-optimization/64164
* Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
* tree-ssa-copyrename.c: Removed.
* opts.c (default_options_table): Drop -ftree-copyrename. Add
-ftree-coalesce-vars.
* passes.def: Drop all occurrences of pass_rename_ssa_copies.
* common.opt (ftree-copyrename): Ignore.
(ftree-coalesce-inlined-vars): Likewise.
* doc/invoke.texi: Remove the ignored options above.
* gimple-expr.h (gimple_can_coalesce_p): Note def location.
* gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
across variables when flag_tree_coalesce_vars. Check register
use and promoted modes to allow coalescing. Moved to
tree-ssa-coalesce.c.
* tree-ssa-live.c (struct tree_int_map_hasher): Move along
with its member functions to tree-ssa-coalesce.c.
(var_map_base_init): Likewise. Renamed to
compute_samebase_partition_bases.
(partition_view_normal): Drop want_bases parameter.
(partition_view_bitmap): Likewise.
* tree-ssa-live.h: Adjust declarations.
* tree-ssa-coalesce.c: Include explow.h.
(build_ssa_conflict_graph): Process PARM_DECLs' and RESULT_DECLs'
default defs at the entry point.
(dump_part_var_map): New.
(compute_optimized_partition_bases): New, called by...
(coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
of compute_samebase_partition_bases. Adjust.
* alias.c (nonoverlapping_memrefs_p): Special-case RTL-less
gimple-reg exprs.
* cfgexpand.c (leader_merge): New.
(get_rtl_for_parm_ssa_default_def): New.
(set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
vars. Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
(expand_one_stack_var_at): Handle anonymous SSA_NAMEs. Drop
redundant MEM attr setting.
(expand_one_stack_var_1): Handle anonymous SSA_NAMEs. Renamed
from...
(expand_one_stack_var): ... this. New wrapper to check and
skip already expanded SSA partitions.
(record_alignment_for_reg_var): New, factored out of...
(expand_one_var): ... this.
(expand_one_ssa_partition): New.
(adjust_one_expanded_partition_var): New.
(expand_one_register_var): Check and skip already expanded SSA
partitions.
(expand_used_vars): Don't create DECLs for anonymous SSA
names. Expand all SSA partitions, then adjust all SSA names.
(pass::execute): Replace the loops that set
SA.partition_to_pseudo from partition leaders and cleared
DECL_RTL for multi-location variables, and that which used to
rename vars and set attrs, with one that clears DECL_RTL and
checks that PARMs and RESULTs default_defs match DECL_RTL.
* cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
* emit-rtl.c (set_reg_attrs_for_parm): Handle NULL and
TREE_LIST decl.
* explow.c (promote_ssa_mode): New.
* explow.h (promote_ssa_mode): Declare.
* expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
* function.c: Include cfgexpand.h.
(use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
(use_register_for_parm_decl): Wrapper for the above to
special-case the result_ptr.
(rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
(maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
multiple locations.
(assign_parm_adjust_stack_rtl): Add all and parm arguments,
for rtl_for_parm. For SSA-assigned parms, zero stack_parm.
(assign_parm_setup_block): Prefer SSA-assigned location.
(assign_parm_setup_reg): Likewise. Use entry_parm for equiv
if stack_parm is NULL.
(assign_parm_setup_stack): Prefer SSA-assigned location.
(assign_parms): Maybe reset DECL_RTL of params. Adjust stack
rtl before testing for pointer bounds. Special-case result_ptr.
(expand_function_start): Maybe reset DECL_RTL of result.
Prefer SSA-assigned location for result and static chain.
Factor out DECL_RESULT and SET_DECL_RTL.
* tree-outof-ssa.c (insert_value_copy_on_edge): Handle
anonymous SSA names. Use promote_ssa_mode.
(get_temp_reg): Likewise.
(remove_ssa_form): Adjust.
* var-tracking.c (dataflow_set_clear_at_call): Take call_insn
and get its reg_usage for reg invalidation.
(compute_bb_dataflow): Pass it insn.
(emit_notes_in_bb): Likewise.
for gcc/testsuite/ChangeLog
* gcc.dg/guality/pr54200.c: Add -fno-tree-coalesce-vars.
* gcc.dg/ssp-1.c: Make counter a register.
* gcc.dg/ssp-2.c: Likewise.
* gcc.dg/torture/parm-coalesce.c: New.
---
gcc/Makefile.in | 1
gcc/alias.c | 12 +
gcc/cfgexpand.c | 383 +++++++++++++++-----
gcc/cfgexpand.h | 2
gcc/common.opt | 12 -
gcc/doc/invoke.texi | 48 +--
gcc/emit-rtl.c | 7
gcc/explow.c | 25 +
gcc/explow.h | 3
gcc/expr.c | 33 +-
gcc/function.c | 211 +++++++++--
gcc/gimple-expr.c | 39 --
gcc/gimple-expr.h | 5
gcc/opts.c | 2
gcc/passes.def | 5
gcc/testsuite/gcc.dg/guality/pr54200.c | 2
gcc/testsuite/gcc.dg/ssp-1.c | 2
gcc/testsuite/gcc.dg/ssp-2.c | 2
gcc/testsuite/gcc.dg/torture/parm-coalesce.c | 40 ++
gcc/tree-outof-ssa.c | 15 -
gcc/tree-ssa-coalesce.c | 380 +++++++++++++++++++-
gcc/tree-ssa-copyrename.c | 499 --------------------------
gcc/tree-ssa-live.c | 101 -----
gcc/tree-ssa-live.h | 4
gcc/var-tracking.c | 12 -
25 files changed, 980 insertions(+), 865 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/torture/parm-coalesce.c
delete mode 100644 gcc/tree-ssa-copyrename.c
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 80c91f0..6920ee7 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1428,7 +1428,6 @@ OBJS = \
tree-ssa-ccp.o \
tree-ssa-coalesce.o \
tree-ssa-copy.o \
- tree-ssa-copyrename.o \
tree-ssa-dce.o \
tree-ssa-dom.o \
tree-ssa-dse.o \
diff --git a/gcc/alias.c b/gcc/alias.c
index a7160f3..2100e8b 100644
--- a/gcc/alias.c
+++ b/gcc/alias.c
@@ -2365,6 +2365,18 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
if (! DECL_P (exprx) || ! DECL_P (expry))
return 0;
+ /* If we refer to different gimple registers, or one gimple register
+ and one non-gimple-register, we know they can't overlap. Now,
+ there could be more than one stack slot for (different versions
+ of) the same gimple register, but we can presumably tell they
+ don't overlap based on offsets from stack base addresses
+ elsewhere. It's important that we don't proceed to DECL_RTL,
+ because gimple registers may not pass DECL_RTL_SET_P, and
+ make_decl_rtl won't be able to do anything about them since no
+ SSA information will have remained to guide it. */
+ if (is_gimple_reg (exprx) || is_gimple_reg (expry))
+ return exprx != expry;
+
/* With invalid code we can end up storing into the constant pool.
Bail out to avoid ICEing when creating RTL for this.
See gfortran.dg/lto/20091028-2_0.f90. */
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index ca491a0..74190a6d 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -179,21 +179,137 @@ gimple_assign_rhs_to_tree (gimple stmt)
#define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
+/* NEXT is a DECL to be associated with some RTL, CUR is a DECL or a
+ TREE_LIST of DECLs. If NEXT is covered by CUR, return CUR
+ unchanged. Otherwise, return a list with all entries of CUR, with
+ NEXT at the end. If CUR was a list, it will be modified in
+ place. */
+
+static tree
+leader_merge (tree cur, tree next)
+{
+ if (cur == NULL || cur == next)
+ return next;
+
+ tree list;
+
+ if (TREE_CODE (cur) == TREE_LIST)
+ {
+ /* Look for NEXT in the list. Stop at the last node to insert
+ there. */
+ for (list = cur; ; list = TREE_CHAIN (list))
+ {
+ if (TREE_VALUE (list) == next)
+ return cur;
+ if (!TREE_CHAIN (list))
+ break;
+ }
+ }
+ else
+ /* Create the first node. */
+ list = build_tree_list (NULL, cur);
+
+ next = build_tree_list (NULL, next);
+ TREE_CHAIN (list) = next;
+
+ return cur;
+}
+
+
+/* Return the RTL for the default SSA def of a PARM or RESULT, if
+ there is one. */
+
+rtx
+get_rtl_for_parm_ssa_default_def (tree var)
+{
+ gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL);
+
+ if (!is_gimple_reg (var))
+ return NULL_RTX;
+
+ /* If we've already determined RTL for the decl, use it. This is
+ not just an optimization: if VAR is a PARM whose incoming value
+ is unused, we won't find a default def to use its partition, but
+ we still want to use the location of the parm, if it was used at
+ all. During assign_parms, until a location is assigned for the
+ VAR, RTL can only be set for a parm or result if we're not coalescing
+ across variables, when we know we're coalescing all SSA_NAMEs of
+ each parm or result, and we're not coalescing them with names
+ pertaining to other variables, such as other parms' default
+ defs. */
+ if (DECL_RTL_SET_P (var))
+ {
+ gcc_assert (DECL_RTL (var) != pc_rtx);
+ return DECL_RTL (var);
+ }
+
+ tree name = ssa_default_def (cfun, var);
+
+ if (!name)
+ return NULL_RTX;
+
+ int part = var_to_partition (SA.map, name);
+ if (part == NO_PARTITION)
+ return NULL_RTX;
+
+ return SA.partition_to_pseudo[part];
+}
+
/* Associate declaration T with storage space X. If T is no
SSA name this is exactly SET_DECL_RTL, otherwise make the
partition of T associated with X. */
static inline void
set_rtl (tree t, rtx x)
{
+ if (x && SSAVAR (t))
+ {
+ bool skip = false;
+ tree cur = NULL_TREE;
+
+ if (MEM_P (x))
+ cur = MEM_EXPR (x);
+ else if (REG_P (x))
+ cur = REG_EXPR (x);
+ else if (GET_CODE (x) == CONCAT
+ && REG_P (XEXP (x, 0)))
+ cur = REG_EXPR (XEXP (x, 0));
+ else if (GET_CODE (x) == PARALLEL)
+ cur = REG_EXPR (XVECEXP (x, 0, 0));
+ else if (x == pc_rtx)
+ skip = true;
+ else
+ gcc_unreachable ();
+
+ tree next = skip ? cur : leader_merge (cur, SSAVAR (t));
+
+ if (cur != next)
+ {
+ if (MEM_P (x))
+ set_mem_attributes (x, SSAVAR (t), true);
+ else
+ set_reg_attrs_for_decl_rtl (SSAVAR (t), x);
+ }
+ }
+
if (TREE_CODE (t) == SSA_NAME)
{
- SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x;
- if (x && !MEM_P (x))
- set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x);
- /* For the benefit of debug information at -O0 (where vartracking
- doesn't run) record the place also in the base DECL if it's
- a normal variable (not a parameter). */
- if (x && x != pc_rtx && TREE_CODE (SSA_NAME_VAR (t)) == VAR_DECL)
+ int part = var_to_partition (SA.map, t);
+ if (part != NO_PARTITION)
+ {
+ if (SA.partition_to_pseudo[part])
+ gcc_assert (SA.partition_to_pseudo[part] == x);
+ else
+ SA.partition_to_pseudo[part] = x;
+ }
+ /* For the benefit of debug information at -O0 (where
+ vartracking doesn't run) record the place also in the base
+ DECL. For PARMs and RESULTs, we may end up resetting these
+ in function.c:maybe_reset_rtl_for_parm, but in some rare
+ cases we may need them (unused and overwritten incoming
+ value, that at -O0 must share the location with the other
+ uses in spite of the missing default def), and this may be
+ the only chance to preserve them. */
+ if (x && x != pc_rtx && SSA_NAME_VAR (t))
{
tree var = SSA_NAME_VAR (t);
/* If we don't yet have something recorded, just record it now. */
@@ -909,7 +1025,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
x = plus_constant (Pmode, base, offset);
- x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x);
+ x = gen_rtx_MEM ((TREE_CODE (decl) != SSA_NAME || SSA_NAME_VAR (decl))
+ ? DECL_MODE (SSAVAR (decl))
+ : TYPE_MODE (TREE_TYPE (decl)), x);
if (TREE_CODE (decl) != SSA_NAME)
{
@@ -931,7 +1049,6 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
DECL_USER_ALIGN (decl) = 0;
}
- set_mem_attributes (x, SSAVAR (decl), true);
set_rtl (decl, x);
}
@@ -1146,13 +1263,22 @@ account_stack_vars (void)
to a variable to be allocated in the stack frame. */
static void
-expand_one_stack_var (tree var)
+expand_one_stack_var_1 (tree var)
{
HOST_WIDE_INT size, offset;
unsigned byte_align;
- size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
- byte_align = align_local_variable (SSAVAR (var));
+ if (TREE_CODE (var) != SSA_NAME || SSA_NAME_VAR (var))
+ {
+ size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
+ byte_align = align_local_variable (SSAVAR (var));
+ }
+ else
+ {
+ tree type = TREE_TYPE (var);
+ size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
+ byte_align = TYPE_ALIGN_UNIT (type);
+ }
/* We handle highly aligned variables in expand_stack_vars. */
gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
@@ -1163,6 +1289,27 @@ expand_one_stack_var (tree var)
crtl->max_used_stack_slot_alignment, offset);
}
+/* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are
+ already assigned some MEM. */
+
+static void
+expand_one_stack_var (tree var)
+{
+ if (TREE_CODE (var) == SSA_NAME)
+ {
+ int part = var_to_partition (SA.map, var);
+ if (part != NO_PARTITION)
+ {
+ rtx x = SA.partition_to_pseudo[part];
+ gcc_assert (x);
+ gcc_assert (MEM_P (x));
+ return;
+ }
+ }
+
+ return expand_one_stack_var_1 (var);
+}
+
/* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL
that will reside in a hard register. */
@@ -1172,12 +1319,112 @@ expand_one_hard_reg_var (tree var)
rest_of_decl_compilation (var, 0, 0);
}
+/* Record the alignment requirements of some variable assigned to a
+ pseudo. */
+
+static void
+record_alignment_for_reg_var (unsigned int align)
+{
+ if (SUPPORTS_STACK_ALIGNMENT
+ && crtl->stack_alignment_estimated < align)
+ {
+ /* stack_alignment_estimated shouldn't change after stack
+ realign decision made */
+ gcc_assert (!crtl->stack_realign_processed);
+ crtl->stack_alignment_estimated = align;
+ }
+
+ /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
+ So here we only make sure stack_alignment_needed >= align. */
+ if (crtl->stack_alignment_needed < align)
+ crtl->stack_alignment_needed = align;
+ if (crtl->max_used_stack_slot_alignment < align)
+ crtl->max_used_stack_slot_alignment = align;
+}
+
+/* Create RTL for an SSA partition. */
+
+static void
+expand_one_ssa_partition (tree var)
+{
+ int part = var_to_partition (SA.map, var);
+ gcc_assert (part != NO_PARTITION);
+
+ if (SA.partition_to_pseudo[part])
+ return;
+
+ if (!use_register_for_decl (var))
+ {
+ expand_one_stack_var_1 (var);
+ return;
+ }
+
+ unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (var),
+ TYPE_MODE (TREE_TYPE (var)),
+ TYPE_ALIGN (TREE_TYPE (var)));
+
+ /* If the variable alignment is very large we'll dynamically allocate
+ it, which means that in-frame portion is just a pointer. */
+ if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
+ align = POINTER_SIZE;
+
+ record_alignment_for_reg_var (align);
+
+ machine_mode reg_mode = promote_ssa_mode (var, NULL);
+
+ rtx x = gen_reg_rtx (reg_mode);
+
+ set_rtl (var, x);
+}
+
+/* Record the association between the RTL generated for a partition
+ and the underlying variable of the SSA_NAME. */
+
+static void
+adjust_one_expanded_partition_var (tree var)
+{
+ if (!var)
+ return;
+
+ tree decl = SSA_NAME_VAR (var);
+
+ int part = var_to_partition (SA.map, var);
+ if (part == NO_PARTITION)
+ return;
+
+ rtx x = SA.partition_to_pseudo[part];
+
+ set_rtl (var, x);
+
+ if (!REG_P (x))
+ return;
+
+ /* Note if the object is a user variable. */
+ if (decl && !DECL_ARTIFICIAL (decl))
+ mark_user_reg (x);
+
+ if (POINTER_TYPE_P (decl ? TREE_TYPE (decl) : TREE_TYPE (var)))
+ mark_reg_pointer (x, get_pointer_alignment (var));
+}
+
/* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL
that will reside in a pseudo register. */
static void
expand_one_register_var (tree var)
{
+ if (TREE_CODE (var) == SSA_NAME)
+ {
+ int part = var_to_partition (SA.map, var);
+ if (part != NO_PARTITION)
+ {
+ rtx x = SA.partition_to_pseudo[part];
+ gcc_assert (x);
+ gcc_assert (REG_P (x));
+ return;
+ }
+ }
+
tree decl = SSAVAR (var);
tree type = TREE_TYPE (decl);
machine_mode reg_mode = promote_decl_mode (decl, NULL);
@@ -1312,21 +1559,7 @@ expand_one_var (tree var, bool toplevel, bool really_expand)
align = POINTER_SIZE;
}
- if (SUPPORTS_STACK_ALIGNMENT
- && crtl->stack_alignment_estimated < align)
- {
- /* stack_alignment_estimated shouldn't change after stack
- realign decision made */
- gcc_assert (!crtl->stack_realign_processed);
- crtl->stack_alignment_estimated = align;
- }
-
- /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
- So here we only make sure stack_alignment_needed >= align. */
- if (crtl->stack_alignment_needed < align)
- crtl->stack_alignment_needed = align;
- if (crtl->max_used_stack_slot_alignment < align)
- crtl->max_used_stack_slot_alignment = align;
+ record_alignment_for_reg_var (align);
if (TREE_CODE (origvar) == SSA_NAME)
{
@@ -1760,48 +1993,18 @@ expand_used_vars (void)
if (targetm.use_pseudo_pic_reg ())
pic_offset_table_rtx = gen_reg_rtx (Pmode);
- hash_map<tree, tree> ssa_name_decls;
for (i = 0; i < SA.map->num_partitions; i++)
{
tree var = partition_to_var (SA.map, i);
gcc_assert (!virtual_operand_p (var));
- /* Assign decls to each SSA name partition, share decls for partitions
- we could have coalesced (those with the same type). */
- if (SSA_NAME_VAR (var) == NULL_TREE)
- {
- tree *slot = &ssa_name_decls.get_or_insert (TREE_TYPE (var));
- if (!*slot)
- *slot = create_tmp_reg (TREE_TYPE (var));
- replace_ssa_name_symbol (var, *slot);
- }
-
- /* Always allocate space for partitions based on VAR_DECLs. But for
- those based on PARM_DECLs or RESULT_DECLs and which matter for the
- debug info, there is no need to do so if optimization is disabled
- because all the SSA_NAMEs based on these DECLs have been coalesced
- into a single partition, which is thus assigned the canonical RTL
- location of the DECLs. If in_lto_p, we can't rely on optimize,
- a function could be compiled with -O1 -flto first and only the
- link performed at -O0. */
- if (TREE_CODE (SSA_NAME_VAR (var)) == VAR_DECL)
- expand_one_var (var, true, true);
- else if (DECL_IGNORED_P (SSA_NAME_VAR (var)) || optimize || in_lto_p)
- {
- /* This is a PARM_DECL or RESULT_DECL. For those partitions that
- contain the default def (representing the parm or result itself)
- we don't do anything here. But those which don't contain the
- default def (representing a temporary based on the parm/result)
- we need to allocate space just like for normal VAR_DECLs. */
- if (!bitmap_bit_p (SA.partition_has_default_def, i))
- {
- expand_one_var (var, true, true);
- gcc_assert (SA.partition_to_pseudo[i]);
- }
- }
+ expand_one_ssa_partition (var);
}
+ for (i = 1; i < num_ssa_names; i++)
+ adjust_one_expanded_partition_var (ssa_name (i));
+
if (flag_stack_protect == SPCT_FLAG_STRONG)
gen_stack_protect_signal
= stack_protect_decl_p () || stack_protect_return_slot_p ();
@@ -6033,35 +6236,6 @@ pass_expand::execute (function *fun)
parm_birth_insn = var_seq;
}
- /* Now that we also have the parameter RTXs, copy them over to our
- partitions. */
- for (i = 0; i < SA.map->num_partitions; i++)
- {
- tree var = SSA_NAME_VAR (partition_to_var (SA.map, i));
-
- if (TREE_CODE (var) != VAR_DECL
- && !SA.partition_to_pseudo[i])
- SA.partition_to_pseudo[i] = DECL_RTL_IF_SET (var);
- gcc_assert (SA.partition_to_pseudo[i]);
-
- /* If this decl was marked as living in multiple places, reset
- this now to NULL. */
- if (DECL_RTL_IF_SET (var) == pc_rtx)
- SET_DECL_RTL (var, NULL);
-
- /* Some RTL parts really want to look at DECL_RTL(x) when x
- was a decl marked in REG_ATTR or MEM_ATTR. We could use
- SET_DECL_RTL here making this available, but that would mean
- to select one of the potentially many RTLs for one DECL. Instead
- of doing that we simply reset the MEM_EXPR of the RTL in question,
- then nobody can get at it and hence nobody can call DECL_RTL on it. */
- if (!DECL_RTL_SET_P (var))
- {
- if (MEM_P (SA.partition_to_pseudo[i]))
- set_mem_expr (SA.partition_to_pseudo[i], NULL);
- }
- }
-
/* If we have a class containing differently aligned pointers
we need to merge those into the corresponding RTL pointer
alignment. */
@@ -6069,7 +6243,6 @@ pass_expand::execute (function *fun)
{
tree name = ssa_name (i);
int part;
- rtx r;
if (!name
/* We might have generated new SSA names in
@@ -6082,20 +6255,24 @@ pass_expand::execute (function *fun)
if (part == NO_PARTITION)
continue;
- /* Adjust all partition members to get the underlying decl of
- the representative which we might have created in expand_one_var. */
- if (SSA_NAME_VAR (name) == NULL_TREE)
+ gcc_assert (SA.partition_to_pseudo[part]);
+
+ /* If this decl was marked as living in multiple places, reset
+ this now to NULL. */
+ tree var = SSA_NAME_VAR (name);
+ if (var && DECL_RTL_IF_SET (var) == pc_rtx)
+ SET_DECL_RTL (var, NULL);
+ /* Check that the pseudos chosen by assign_parms are those of
+ the corresponding default defs. */
+ else if (SSA_NAME_IS_DEFAULT_DEF (name)
+ && (TREE_CODE (var) == PARM_DECL
+ || TREE_CODE (var) == RESULT_DECL))
{
- tree leader = partition_to_var (SA.map, part);
- gcc_assert (SSA_NAME_VAR (leader) != NULL_TREE);
- replace_ssa_name_symbol (name, SSA_NAME_VAR (leader));
+ rtx in = DECL_RTL_IF_SET (var);
+ gcc_assert (in);
+ rtx out = SA.partition_to_pseudo[part];
+ gcc_assert (in == out || rtx_equal_p (in, out));
}
- if (!POINTER_TYPE_P (TREE_TYPE (name)))
- continue;
-
- r = SA.partition_to_pseudo[part];
- if (REG_P (r))
- mark_reg_pointer (r, get_pointer_alignment (name));
}
/* If this function is `main', emit a call to `__main'
diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
index a0b6e3e..602579d 100644
--- a/gcc/cfgexpand.h
+++ b/gcc/cfgexpand.h
@@ -22,5 +22,7 @@ along with GCC; see the file COPYING3. If not see
extern tree gimple_assign_rhs_to_tree (gimple);
extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
+extern rtx get_rtl_for_parm_ssa_default_def (tree var);
+
#endif /* GCC_CFGEXPAND_H */
diff --git a/gcc/common.opt b/gcc/common.opt
index 380848c..2cdbea1 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2212,16 +2212,16 @@ Common Report Var(flag_tree_ch) Optimization
Enable loop header copying on trees
ftree-coalesce-inlined-vars
-Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization
-Enable coalescing of copy-related user variables that are inlined
+Common Ignore RejectNegative
+Does nothing. Preserved for backward compatibility.
ftree-coalesce-vars
-Common Report Var(flag_ssa_coalesce_vars,2) Optimization
-Enable coalescing of all copy-related user variables
+Common Report Var(flag_tree_coalesce_vars) Optimization
+Enable SSA coalescing of user variables
ftree-copyrename
-Common Report Var(flag_tree_copyrename) Optimization
-Replace SSA temporaries with better names in copies
+Common Ignore
+Does nothing. Preserved for backward compatibility.
ftree-copy-prop
Common Report Var(flag_tree_copy_prop) Optimization
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c20dd4d..0a3b930 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -337,7 +337,6 @@ Objective-C and Objective-C++ Dialects}.
-fdump-tree-phiprop@r{[}-@var{n}@r{]} @gol
-fdump-tree-phiopt@r{[}-@var{n}@r{]} @gol
-fdump-tree-forwprop@r{[}-@var{n}@r{]} @gol
--fdump-tree-copyrename@r{[}-@var{n}@r{]} @gol
-fdump-tree-nrv -fdump-tree-vect @gol
-fdump-tree-sink @gol
-fdump-tree-sra@r{[}-@var{n}@r{]} @gol
@@ -443,9 +442,8 @@ Objective-C and Objective-C++ Dialects}.
-fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol
-fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
-ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
--ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
--ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
--ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
+-ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol
+-ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
-ftree-loop-if-convert-stores -ftree-loop-im @gol
-ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
-ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
@@ -6989,11 +6987,6 @@ name is made by appending @file{.phiopt} to the source file name.
Dump each function after forward propagating single use variables. The file
name is made by appending @file{.forwprop} to the source file name.
-@item copyrename
-@opindex fdump-tree-copyrename
-Dump each function after applying the copy rename optimization. The file
-name is made by appending @file{.copyrename} to the source file name.
-
@item nrv
@opindex fdump-tree-nrv
Dump each function after applying the named return value optimization on
@@ -7458,8 +7451,8 @@ compilation time.
-ftree-ccp @gol
-fssa-phiopt @gol
-ftree-ch @gol
+-ftree-coalesce-vars @gol
-ftree-copy-prop @gol
--ftree-copyrename @gol
-ftree-dce @gol
-ftree-dominator-opts @gol
-ftree-dse @gol
@@ -8724,6 +8717,15 @@ profitable to parallelize the loops.
Compare the results of several data dependence analyzers. This option
is used for debugging the data dependence analyzers.
+@item -ftree-coalesce-vars
+@opindex ftree-coalesce-vars
+Tell the compiler to attempt to combine small user-defined variables
+too, instead of just compiler temporaries. This may severely limit the
+ability to debug an optimized program compiled with
+@option{-fno-var-tracking-assignments}. In the negated form, this flag
+prevents SSA coalescing of user variables. This option is enabled by
+default if optimization is enabled.
+
@item -ftree-loop-if-convert
@opindex ftree-loop-if-convert
Attempt to transform conditional jumps in the innermost loops to
@@ -8837,32 +8839,6 @@ Perform scalar replacement of aggregates. This pass replaces structure
references with scalars to prevent committing structures to memory too
early. This flag is enabled by default at @option{-O} and higher.
-@item -ftree-copyrename
-@opindex ftree-copyrename
-Perform copy renaming on trees. This pass attempts to rename compiler
-temporaries to other variables at copy locations, usually resulting in
-variable names which more closely resemble the original variables. This flag
-is enabled by default at @option{-O} and higher.
-
-@item -ftree-coalesce-inlined-vars
-@opindex ftree-coalesce-inlined-vars
-Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
-combine small user-defined variables too, but only if they are inlined
-from other functions. It is a more limited form of
-@option{-ftree-coalesce-vars}. This may harm debug information of such
-inlined variables, but it keeps variables of the inlined-into
-function apart from each other, such that they are more likely to
-contain the expected values in a debugging session.
-
-@item -ftree-coalesce-vars
-@opindex ftree-coalesce-vars
-Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
-combine small user-defined variables too, instead of just compiler
-temporaries. This may severely limit the ability to debug an optimized
-program compiled with @option{-fno-var-tracking-assignments}. In the
-negated form, this flag prevents SSA coalescing of user variables,
-including inlined ones. This option is enabled by default.
-
@item -ftree-ter
@opindex ftree-ter
Perform temporary expression replacement during the SSA->normal phase. Single
diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index b8dc7d5..ef31ba0f 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -1229,6 +1229,11 @@ set_reg_attrs_for_parm (rtx parm_rtx, rtx mem)
void
set_reg_attrs_for_decl_rtl (tree t, rtx x)
{
+ if (!t)
+ return;
+ tree tdecl = t;
+ if (TREE_CODE (t) == TREE_LIST)
+ tdecl = TREE_VALUE (t);
if (GET_CODE (x) == SUBREG)
{
gcc_assert (subreg_lowpart_p (x));
@@ -1237,7 +1242,7 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x)
if (REG_P (x))
REG_ATTRS (x)
= get_reg_attrs (t, byte_lowpart_offset (GET_MODE (x),
- DECL_MODE (t)));
+ DECL_MODE (tdecl)));
if (GET_CODE (x) == CONCAT)
{
if (REG_P (XEXP (x, 0)))
diff --git a/gcc/explow.c b/gcc/explow.c
index de446a9..b53a3b7 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -854,6 +854,31 @@ promote_decl_mode (const_tree decl, int *punsignedp)
return pmode;
}
+/* Return the promoted mode for name. If it is a named SSA_NAME, it
+ is the same as promote_decl_mode. Otherwise, it is the promoted
+ mode of a temp decl of same type as the SSA_NAME, if we had created
+ one. */
+
+machine_mode
+promote_ssa_mode (const_tree name, int *punsignedp)
+{
+ gcc_assert (TREE_CODE (name) == SSA_NAME);
+
+ if (SSA_NAME_VAR (name))
+ return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
+
+ tree type = TREE_TYPE (name);
+ int unsignedp = TYPE_UNSIGNED (type);
+ machine_mode mode = TYPE_MODE (type);
+
+ machine_mode pmode = promote_mode (type, mode, &unsignedp);
+ if (punsignedp)
+ *punsignedp = unsignedp;
+
+ return pmode;
+}
+
+
\f
/* Controls the behaviour of {anti_,}adjust_stack. */
static bool suppress_reg_args_size;
diff --git a/gcc/explow.h b/gcc/explow.h
index 48f1859..7b11e46 100644
--- a/gcc/explow.h
+++ b/gcc/explow.h
@@ -57,6 +57,9 @@ extern machine_mode promote_mode (const_tree, machine_mode, int *);
/* Return mode and signedness to use when object is promoted. */
machine_mode promote_decl_mode (const_tree, int *);
+/* Return mode and signedness to use when an SSA_NAME is promoted. */
+machine_mode promote_ssa_mode (const_tree, int *);
+
/* Remove some bytes from the stack. An rtx says how many. */
extern void adjust_stack (rtx);
diff --git a/gcc/expr.c b/gcc/expr.c
index 530a944..95a9bab 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9388,7 +9388,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
rtx op0, op1, temp, decl_rtl;
tree type;
int unsignedp;
- machine_mode mode;
+ machine_mode mode, dmode;
enum tree_code code = TREE_CODE (exp);
rtx subtarget, original_target;
int ignore;
@@ -9519,7 +9519,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
if (g == NULL
&& modifier == EXPAND_INITIALIZER
&& !SSA_NAME_IS_DEFAULT_DEF (exp)
- && (optimize || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
+ && (optimize || !SSA_NAME_VAR (exp)
+ || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
&& stmt_is_replaceable_p (SSA_NAME_DEF_STMT (exp)))
g = SSA_NAME_DEF_STMT (exp);
if (g)
@@ -9598,15 +9599,18 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
/* Ensure variable marked as used even if it doesn't go through
a parser. If it hasn't be used yet, write out an external
definition. */
- TREE_USED (exp) = 1;
+ if (exp)
+ TREE_USED (exp) = 1;
/* Show we haven't gotten RTL for this yet. */
temp = 0;
/* Variables inherited from containing functions should have
been lowered by this point. */
- context = decl_function_context (exp);
- gcc_assert (SCOPE_FILE_SCOPE_P (context)
+ if (exp)
+ context = decl_function_context (exp);
+ gcc_assert (!exp
+ || SCOPE_FILE_SCOPE_P (context)
|| context == current_function_decl
|| TREE_STATIC (exp)
|| DECL_EXTERNAL (exp)
@@ -9630,7 +9634,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
decl_rtl = use_anchored_address (decl_rtl);
if (modifier != EXPAND_CONST_ADDRESS
&& modifier != EXPAND_SUM
- && !memory_address_addr_space_p (DECL_MODE (exp),
+ && !memory_address_addr_space_p (exp ? DECL_MODE (exp)
+ : GET_MODE (decl_rtl),
XEXP (decl_rtl, 0),
MEM_ADDR_SPACE (decl_rtl)))
temp = replace_equiv_address (decl_rtl,
@@ -9641,12 +9646,17 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
if the address is a register. */
if (temp != 0)
{
- if (MEM_P (temp) && REG_P (XEXP (temp, 0)))
+ if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0)))
mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp));
return temp;
}
+ if (exp)
+ dmode = DECL_MODE (exp);
+ else
+ dmode = TYPE_MODE (TREE_TYPE (ssa_name));
+
/* If the mode of DECL_RTL does not match that of the decl,
there are two cases: we are dealing with a BLKmode value
that is returned in a register, or we are dealing with
@@ -9654,8 +9664,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
of the wanted mode, but mark it so that we know that it
was already extended. */
if (REG_P (decl_rtl)
- && DECL_MODE (exp) != BLKmode
- && GET_MODE (decl_rtl) != DECL_MODE (exp))
+ && dmode != BLKmode
+ && GET_MODE (decl_rtl) != dmode)
{
machine_mode pmode;
@@ -9668,6 +9678,11 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
pmode = promote_function_mode (type, mode, &unsignedp,
gimple_call_fntype (g),
2);
+ else if (!exp)
+ {
+ gcc_assert (code == SSA_NAME);
+ pmode = promote_ssa_mode (ssa_name, &unsignedp);
+ }
else
pmode = promote_decl_mode (exp, &unsignedp);
gcc_assert (GET_MODE (decl_rtl) == pmode);
diff --git a/gcc/function.c b/gcc/function.c
index 7d4df92..1f5296e 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -87,6 +87,7 @@ along with GCC; see the file COPYING3. If not see
#include "cfganal.h"
#include "cfgbuild.h"
#include "cfgcleanup.h"
+#include "cfgexpand.h"
#include "basic-block.h"
#include "df.h"
#include "params.h"
@@ -2121,6 +2122,15 @@ aggregate_value_p (const_tree exp, const_tree fntype)
bool
use_register_for_decl (const_tree decl)
{
+ if (TREE_CODE (decl) == SSA_NAME)
+ {
+ if (!SSA_NAME_VAR (decl))
+ return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
+ && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
+
+ decl = SSA_NAME_VAR (decl);
+ }
+
if (!targetm.calls.allocate_stack_slots_for_args ())
return true;
@@ -2804,23 +2814,88 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data)
data->entry_parm = entry_parm;
}
+/* Wrapper for use_register_for_decl, that special-cases the
+ .result_ptr as the function's RESULT_DECL when the RESULT_DECL is
+ passed by reference. */
+
+static bool
+use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm)
+{
+ if (parm == all->function_result_decl)
+ {
+ tree result = DECL_RESULT (current_function_decl);
+
+ if (DECL_BY_REFERENCE (result))
+ parm = result;
+ }
+
+ return use_register_for_decl (parm);
+}
+
+/* Wrapper for get_rtl_for_parm_ssa_default_def, that special-cases
+ the .result_ptr as the function's RESULT_DECL when the RESULT_DECL
+ is passed by reference. */
+
+static rtx
+rtl_for_parm (struct assign_parm_data_all *all, tree parm)
+{
+ if (parm == all->function_result_decl)
+ {
+ tree result = DECL_RESULT (current_function_decl);
+
+ if (!DECL_BY_REFERENCE (result))
+ return NULL_RTX;
+
+ parm = result;
+ }
+
+ return get_rtl_for_parm_ssa_default_def (parm);
+}
+
+/* Reset the location of PARM_DECLs and RESULT_DECLs that had
+ SSA_NAMEs in multiple partitions, so that assign_parms will choose
+ the default def, if it exists, or create new RTL to hold the unused
+ entry value. If we are coalescing across variables, we want to
+ reset the location too, because a parm without a default def
+ (incoming value unused) might be coalesced with one with a default
+ def, and then assign_parms would copy both incoming values to the
+ same location, which might cause the wrong value to survive. */
+static void
+maybe_reset_rtl_for_parm (tree parm)
+{
+ gcc_assert (TREE_CODE (parm) == PARM_DECL
+ || TREE_CODE (parm) == RESULT_DECL);
+ if ((flag_tree_coalesce_vars
+ || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx))
+ && is_gimple_reg (parm))
+ SET_DECL_RTL (parm, NULL_RTX);
+}
+
/* A subroutine of assign_parms. Adjust DATA->STACK_RTL such that it's
always valid and properly aligned. */
static void
-assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data)
+assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm,
+ struct assign_parm_data_one *data)
{
rtx stack_parm = data->stack_parm;
+ /* If out-of-SSA assigned RTL to the parm default def, make sure we
+ don't use what we might have computed before. */
+ rtx ssa_assigned = rtl_for_parm (all, parm);
+ if (ssa_assigned)
+ stack_parm = NULL;
+
/* If we can't trust the parm stack slot to be aligned enough for its
ultimate type, don't use that slot after entry. We'll make another
stack slot, if we need one. */
- if (stack_parm
- && ((STRICT_ALIGNMENT
- && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm))
- || (data->nominal_type
- && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
- && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
+ else if (stack_parm
+ && ((STRICT_ALIGNMENT
+ && (GET_MODE_ALIGNMENT (data->nominal_mode)
+ > MEM_ALIGN (stack_parm)))
+ || (data->nominal_type
+ && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
+ && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
stack_parm = NULL;
/* If parm was passed in memory, and we need to convert it on entry,
@@ -2882,11 +2957,16 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
size = int_size_in_bytes (data->passed_type);
size_stored = CEIL_ROUND (size, UNITS_PER_WORD);
+
if (stack_parm == 0)
{
DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
- stack_parm = assign_stack_local (BLKmode, size_stored,
- DECL_ALIGN (parm));
+ stack_parm = rtl_for_parm (all, parm);
+ if (!stack_parm)
+ stack_parm = assign_stack_local (BLKmode, size_stored,
+ DECL_ALIGN (parm));
+ else
+ stack_parm = copy_rtx (stack_parm);
if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
PUT_MODE (stack_parm, GET_MODE (entry_parm));
set_mem_attributes (stack_parm, parm, 1);
@@ -3027,10 +3107,19 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
= promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
TREE_TYPE (current_function_decl), 2);
- parmreg = gen_reg_rtx (promoted_nominal_mode);
+ rtx from_expand = rtl_for_parm (all, parm);
- if (!DECL_ARTIFICIAL (parm))
- mark_user_reg (parmreg);
+ if (from_expand && !data->passed_pointer)
+ {
+ parmreg = from_expand;
+ gcc_assert (GET_MODE (parmreg) == promoted_nominal_mode);
+ }
+ else
+ {
+ parmreg = gen_reg_rtx (promoted_nominal_mode);
+ if (!DECL_ARTIFICIAL (parm))
+ mark_user_reg (parmreg);
+ }
/* If this was an item that we received a pointer to,
set DECL_RTL appropriately. */
@@ -3049,6 +3138,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
assign_parm_find_data_types and expand_expr_real_1. */
equiv_stack_parm = data->stack_parm;
+ if (!equiv_stack_parm)
+ equiv_stack_parm = data->entry_parm;
validated_mem = validize_mem (copy_rtx (data->entry_parm));
need_conversion = (data->nominal_mode != data->passed_mode
@@ -3189,11 +3280,17 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
/* If we were passed a pointer but the actual value can safely live
in a register, retrieve it and use it directly. */
- if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode)
+ if (data->passed_pointer
+ && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode))
{
/* We can't use nominal_mode, because it will have been set to
Pmode above. We must use the actual mode of the parm. */
- if (use_register_for_decl (parm))
+ if (from_expand)
+ {
+ parmreg = from_expand;
+ gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm)));
+ }
+ else if (use_register_for_decl (parm))
{
parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm)));
mark_user_reg (parmreg);
@@ -3233,7 +3330,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
/* STACK_PARM is the pointer, not the parm, and PARMREG is
now the parm. */
- data->stack_parm = NULL;
+ data->stack_parm = equiv_stack_parm = NULL;
}
/* Mark the register as eliminable if we did no conversion and it was
@@ -3243,11 +3340,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
make here would screw up life analysis for it. */
if (data->nominal_mode == data->passed_mode
&& !did_conversion
- && data->stack_parm != 0
- && MEM_P (data->stack_parm)
+ && equiv_stack_parm != 0
+ && MEM_P (equiv_stack_parm)
&& data->locate.offset.var == 0
&& reg_mentioned_p (virtual_incoming_args_rtx,
- XEXP (data->stack_parm, 0)))
+ XEXP (equiv_stack_parm, 0)))
{
rtx_insn *linsn = get_last_insn ();
rtx_insn *sinsn;
@@ -3260,8 +3357,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
= GET_MODE_INNER (GET_MODE (parmreg));
int regnor = REGNO (XEXP (parmreg, 0));
int regnoi = REGNO (XEXP (parmreg, 1));
- rtx stackr = adjust_address_nv (data->stack_parm, submode, 0);
- rtx stacki = adjust_address_nv (data->stack_parm, submode,
+ rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0);
+ rtx stacki = adjust_address_nv (equiv_stack_parm, submode,
GET_MODE_SIZE (submode));
/* Scan backwards for the set of the real and
@@ -3334,6 +3431,13 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
if (data->stack_parm == 0)
{
+ rtx x = data->stack_parm = rtl_for_parm (all, parm);
+ if (x)
+ gcc_assert (GET_MODE (x) == GET_MODE (data->entry_parm));
+ }
+
+ if (data->stack_parm == 0)
+ {
int align = STACK_SLOT_ALIGNMENT (data->passed_type,
GET_MODE (data->entry_parm),
TYPE_ALIGN (data->passed_type));
@@ -3592,6 +3696,8 @@ assign_parms (tree fndecl)
DECL_INCOMING_RTL (parm) = DECL_RTL (parm);
continue;
}
+ else
+ maybe_reset_rtl_for_parm (parm);
/* Estimate stack alignment from parameter alignment. */
if (SUPPORTS_STACK_ALIGNMENT)
@@ -3641,7 +3747,9 @@ assign_parms (tree fndecl)
else
set_decl_incoming_rtl (parm, data.entry_parm, false);
- /* Boudns should be loaded in the particular order to
+ assign_parm_adjust_stack_rtl (&all, parm, &data);
+
+ /* Bounds should be loaded in the particular order to
have registers allocated correctly. Collect info about
input bounds and load them later. */
if (POINTER_BOUNDS_TYPE_P (data.passed_type))
@@ -3658,11 +3766,10 @@ assign_parms (tree fndecl)
}
else
{
- assign_parm_adjust_stack_rtl (&data);
-
if (assign_parm_setup_block_p (&data))
assign_parm_setup_block (&all, parm, &data);
- else if (data.passed_pointer || use_register_for_decl (parm))
+ else if (data.passed_pointer
+ || use_register_for_parm_decl (&all, parm))
assign_parm_setup_reg (&all, parm, &data);
else
assign_parm_setup_stack (&all, parm, &data);
@@ -5001,7 +5108,9 @@ expand_function_start (tree subr)
before any library calls that assign parms might generate. */
/* Decide whether to return the value in memory or in a register. */
- if (aggregate_value_p (DECL_RESULT (subr), subr))
+ tree res = DECL_RESULT (subr);
+ maybe_reset_rtl_for_parm (res);
+ if (aggregate_value_p (res, subr))
{
/* Returning something that won't go in a register. */
rtx value_address = 0;
@@ -5009,7 +5118,7 @@ expand_function_start (tree subr)
#ifdef PCC_STATIC_STRUCT_RETURN
if (cfun->returns_pcc_struct)
{
- int size = int_size_in_bytes (TREE_TYPE (DECL_RESULT (subr)));
+ int size = int_size_in_bytes (TREE_TYPE (res));
value_address = assemble_static_space (size);
}
else
@@ -5021,36 +5130,45 @@ expand_function_start (tree subr)
it. */
if (sv)
{
- value_address = gen_reg_rtx (Pmode);
+ if (DECL_BY_REFERENCE (res))
+ value_address = get_rtl_for_parm_ssa_default_def (res);
+ if (!value_address)
+ value_address = gen_reg_rtx (Pmode);
emit_move_insn (value_address, sv);
}
}
if (value_address)
{
rtx x = value_address;
- if (!DECL_BY_REFERENCE (DECL_RESULT (subr)))
+ if (!DECL_BY_REFERENCE (res))
{
- x = gen_rtx_MEM (DECL_MODE (DECL_RESULT (subr)), x);
- set_mem_attributes (x, DECL_RESULT (subr), 1);
+ x = get_rtl_for_parm_ssa_default_def (res);
+ if (!x)
+ {
+ x = gen_rtx_MEM (DECL_MODE (res), value_address);
+ set_mem_attributes (x, res, 1);
+ }
}
- SET_DECL_RTL (DECL_RESULT (subr), x);
+ SET_DECL_RTL (res, x);
}
}
- else if (DECL_MODE (DECL_RESULT (subr)) == VOIDmode)
+ else if (DECL_MODE (res) == VOIDmode)
/* If return mode is void, this decl rtl should not be used. */
- SET_DECL_RTL (DECL_RESULT (subr), NULL_RTX);
+ SET_DECL_RTL (res, NULL_RTX);
else
{
/* Compute the return values into a pseudo reg, which we will copy
into the true return register after the cleanups are done. */
- tree return_type = TREE_TYPE (DECL_RESULT (subr));
- if (TYPE_MODE (return_type) != BLKmode
- && targetm.calls.return_in_msb (return_type))
+ tree return_type = TREE_TYPE (res);
+ rtx x = get_rtl_for_parm_ssa_default_def (res);
+ if (x)
+ /* Use it. */;
+ else if (TYPE_MODE (return_type) != BLKmode
+ && targetm.calls.return_in_msb (return_type))
/* expand_function_end will insert the appropriate padding in
this case. Use the return value's natural (unpadded) mode
within the function proper. */
- SET_DECL_RTL (DECL_RESULT (subr),
- gen_reg_rtx (TYPE_MODE (return_type)));
+ x = gen_reg_rtx (TYPE_MODE (return_type));
else
{
/* In order to figure out what mode to use for the pseudo, we
@@ -5061,25 +5179,26 @@ expand_function_start (tree subr)
/* Structures that are returned in registers are not
aggregate_value_p, so we may see a PARALLEL or a REG. */
if (REG_P (hard_reg))
- SET_DECL_RTL (DECL_RESULT (subr),
- gen_reg_rtx (GET_MODE (hard_reg)));
+ x = gen_reg_rtx (GET_MODE (hard_reg));
else
{
gcc_assert (GET_CODE (hard_reg) == PARALLEL);
- SET_DECL_RTL (DECL_RESULT (subr), gen_group_rtx (hard_reg));
+ x = gen_group_rtx (hard_reg);
}
}
+ SET_DECL_RTL (res, x);
+
/* Set DECL_REGISTER flag so that expand_function_end will copy the
result to the real return register(s). */
- DECL_REGISTER (DECL_RESULT (subr)) = 1;
+ DECL_REGISTER (res) = 1;
if (chkp_function_instrumented_p (current_function_decl))
{
- tree return_type = TREE_TYPE (DECL_RESULT (subr));
+ tree return_type = TREE_TYPE (res);
rtx bounds = targetm.calls.chkp_function_value_bounds (return_type,
subr, 1);
- SET_DECL_BOUNDS_RTL (DECL_RESULT (subr), bounds);
+ SET_DECL_BOUNDS_RTL (res, bounds);
}
}
@@ -5093,7 +5212,9 @@ expand_function_start (tree subr)
tree parm = cfun->static_chain_decl;
rtx local, chain, insn;
- local = gen_reg_rtx (Pmode);
+ local = get_rtl_for_parm_ssa_default_def (parm);
+ if (!local)
+ local = gen_reg_rtx (Pmode);
chain = targetm.calls.static_chain (current_function_decl, true);
set_decl_incoming_rtl (parm, chain, false);
diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
index efc93b7..e29f300 100644
--- a/gcc/gimple-expr.c
+++ b/gcc/gimple-expr.c
@@ -391,45 +391,6 @@ copy_var_decl (tree var, tree name, tree type)
return copy;
}
-/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
- coalescing together, false otherwise.
-
- This must stay consistent with var_map_base_init in tree-ssa-live.c. */
-
-bool
-gimple_can_coalesce_p (tree name1, tree name2)
-{
- /* First check the SSA_NAME's associated DECL. We only want to
- coalesce if they have the same DECL or both have no associated DECL. */
- tree var1 = SSA_NAME_VAR (name1);
- tree var2 = SSA_NAME_VAR (name2);
- var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
- var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
- if (var1 != var2)
- return false;
-
- /* Now check the types. If the types are the same, then we should
- try to coalesce V1 and V2. */
- tree t1 = TREE_TYPE (name1);
- tree t2 = TREE_TYPE (name2);
- if (t1 == t2)
- return true;
-
- /* If the types are not the same, check for a canonical type match. This
- (for example) allows coalescing when the types are fundamentally the
- same, but just have different names.
-
- Note pointer types with different address spaces may have the same
- canonical type. Those are rejected for coalescing by the
- types_compatible_p check. */
- if (TYPE_CANONICAL (t1)
- && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
- && types_compatible_p (t1, t2))
- return true;
-
- return false;
-}
-
/* Strip off a legitimate source ending from the input string NAME of
length LEN. Rather than having to know the names used by all of
our front ends, we strip off an ending of a period followed by
diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
index a50a90a..b492137 100644
--- a/gcc/gimple-expr.h
+++ b/gcc/gimple-expr.h
@@ -28,7 +28,6 @@ extern gimple_seq gimple_body (tree);
extern bool gimple_has_body_p (tree);
extern const char *gimple_decl_printable_name (tree, int);
extern tree copy_var_decl (tree, tree, tree);
-extern bool gimple_can_coalesce_p (tree, tree);
extern tree create_tmp_var_name (const char *);
extern tree create_tmp_var_raw (tree, const char * = NULL);
extern tree create_tmp_var (tree, const char * = NULL);
@@ -56,6 +55,10 @@ extern bool is_gimple_mem_ref_addr (tree);
extern void mark_addressable (tree);
extern bool is_gimple_reg_rhs (tree);
+/* Defined in tree-ssa-coalesce.c. */
+extern bool gimple_can_coalesce_p (tree, tree);
+
+
/* Return true if a conversion from either type of TYPE1 and TYPE2
to the other is not required. Otherwise return false. */
diff --git a/gcc/opts.c b/gcc/opts.c
index 39c190d..7e41b1f 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -448,12 +448,12 @@ static const struct default_options default_options_table[] =
{ OPT_LEVELS_1_PLUS, OPT_fsplit_wide_types, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_ccp, NULL, 1 },
{ OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_bit_ccp, NULL, 1 },
+ { OPT_LEVELS_1_PLUS, OPT_ftree_coalesce_vars, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_dce, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_dominator_opts, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 },
{ OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 },
- { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 },
diff --git a/gcc/passes.def b/gcc/passes.def
index ffa63b5..4548b20 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -77,7 +77,6 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_all_early_optimizations);
PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
NEXT_PASS (pass_remove_cgraph_callee_edges);
- NEXT_PASS (pass_rename_ssa_copies);
NEXT_PASS (pass_object_sizes);
NEXT_PASS (pass_ccp);
/* After CCP we rewrite no longer addressed locals into SSA
@@ -154,7 +153,6 @@ along with GCC; see the file COPYING3. If not see
/* Initial scalar cleanups before alias computation.
They ensure memory accesses are not indirect wherever possible. */
NEXT_PASS (pass_strip_predict_hints);
- NEXT_PASS (pass_rename_ssa_copies);
NEXT_PASS (pass_ccp);
/* After CCP we rewrite no longer addressed locals into SSA
form if possible. */
@@ -182,7 +180,6 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_stdarg);
NEXT_PASS (pass_lower_complex);
NEXT_PASS (pass_sra);
- NEXT_PASS (pass_rename_ssa_copies);
/* The dom pass will also resolve all __builtin_constant_p calls
that are still there to 0. This has to be done after some
propagations have already run, but before some more dead code
@@ -291,7 +288,6 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_fold_builtins);
NEXT_PASS (pass_optimize_widening_mul);
NEXT_PASS (pass_tail_calls);
- NEXT_PASS (pass_rename_ssa_copies);
/* FIXME: If DCE is not run before checking for uninitialized uses,
we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
However, this also causes us to misdiagnose cases that should be
@@ -326,7 +322,6 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_dce);
NEXT_PASS (pass_asan);
NEXT_PASS (pass_tsan);
- NEXT_PASS (pass_rename_ssa_copies);
/* ??? We do want some kind of loop invariant motion, but we possibly
need to adjust LIM to be more friendly towards preserving accurate
debug information here. */
diff --git a/gcc/testsuite/gcc.dg/guality/pr54200.c b/gcc/testsuite/gcc.dg/guality/pr54200.c
index 9b17187..e1e7293 100644
--- a/gcc/testsuite/gcc.dg/guality/pr54200.c
+++ b/gcc/testsuite/gcc.dg/guality/pr54200.c
@@ -1,6 +1,6 @@
/* PR tree-optimization/54200 */
/* { dg-do run } */
-/* { dg-options "-g -fno-var-tracking-assignments" } */
+/* { dg-options "-g -fno-var-tracking-assignments -fno-tree-coalesce-vars" } */
int o __attribute__((used));
diff --git a/gcc/testsuite/gcc.dg/ssp-1.c b/gcc/testsuite/gcc.dg/ssp-1.c
index 5467f4d..db69332 100644
--- a/gcc/testsuite/gcc.dg/ssp-1.c
+++ b/gcc/testsuite/gcc.dg/ssp-1.c
@@ -12,7 +12,7 @@ __stack_chk_fail (void)
int main ()
{
- int i;
+ register int i;
char foo[255];
// smash stack
diff --git a/gcc/testsuite/gcc.dg/ssp-2.c b/gcc/testsuite/gcc.dg/ssp-2.c
index 9a7ac32..752fe53 100644
--- a/gcc/testsuite/gcc.dg/ssp-2.c
+++ b/gcc/testsuite/gcc.dg/ssp-2.c
@@ -14,7 +14,7 @@ __stack_chk_fail (void)
void
overflow()
{
- int i = 0;
+ register int i = 0;
char foo[30];
/* Overflow buffer. */
diff --git a/gcc/testsuite/gcc.dg/torture/parm-coalesce.c b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
new file mode 100644
index 0000000..dbd81c1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+
+/* Make sure we don't coalesce both incoming parms, one whose incoming
+ value is unused, to the same location, so as to overwrite one of
+ them with the incoming value of the other. */
+
+int __attribute__((noinline, noclone))
+foo (int i, int j)
+{
+ j = i; /* The incoming value for J is unused. */
+ i = 2;
+ if (j)
+ j++;
+ j += i + 1;
+ return j;
+}
+
+/* Same as foo, but with swapped parameters. */
+int __attribute__((noinline, noclone))
+bar (int j, int i)
+{
+ j = i; /* The incoming value for J is unused. */
+ i = 2;
+ if (j)
+ j++;
+ j += i + 1;
+ return j;
+}
+
+int
+main (void)
+{
+ if (foo (0, 1) != 3)
+ abort ();
+ if (bar (1, 0) != 3)
+ abort ();
+ return 0;
+}
diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
index e6310cd..e62f36b 100644
--- a/gcc/tree-outof-ssa.c
+++ b/gcc/tree-outof-ssa.c
@@ -330,12 +330,13 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
start_sequence ();
- var = SSA_NAME_VAR (partition_to_var (SA.map, dest));
+ tree name = partition_to_var (SA.map, dest);
+ var = SSA_NAME_VAR (name);
src_mode = TYPE_MODE (TREE_TYPE (src));
dest_mode = GET_MODE (dest_rtx);
- gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var)));
+ gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var ? var : name)));
gcc_assert (!REG_P (dest_rtx)
- || dest_mode == promote_decl_mode (var, &unsignedp));
+ || dest_mode == promote_ssa_mode (name, &unsignedp));
if (src_mode != dest_mode)
{
@@ -714,12 +715,12 @@ static rtx
get_temp_reg (tree name)
{
tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
- tree type = TREE_TYPE (var);
+ tree type = var ? TREE_TYPE (var) : TREE_TYPE (name);
int unsignedp;
- machine_mode reg_mode = promote_decl_mode (var, &unsignedp);
+ machine_mode reg_mode = promote_ssa_mode (name, &unsignedp);
rtx x = gen_reg_rtx (reg_mode);
if (POINTER_TYPE_P (type))
- mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var))));
+ mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type)));
return x;
}
@@ -1019,7 +1020,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
/* Return to viewing the variable list as just all reference variables after
coalescing has been performed. */
- partition_view_normal (map, false);
+ partition_view_normal (map);
if (dump_file && (dump_flags & TDF_DETAILS))
{
diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index eeac5a4..c2cdeef0 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -58,6 +58,7 @@ along with GCC; see the file COPYING3. If not see
#include "tree-ssanames.h"
#include "tree-ssa-live.h"
#include "tree-ssa-coalesce.h"
+#include "explow.h"
#include "diagnostic-core.h"
@@ -832,6 +833,16 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
basic_block bb;
ssa_op_iter iter;
live_track_p live;
+ basic_block entry;
+
+ /* If inter-variable coalescing is enabled, we may attempt to
+ coalesce variables from different base variables, including
+ different parameters, so we have to make sure default defs live
+ at the entry block conflict with each other. */
+ if (flag_tree_coalesce_vars)
+ entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+ else
+ entry = NULL;
map = live_var_map (liveinfo);
graph = ssa_conflicts_new (num_var_partitions (map));
@@ -890,6 +901,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
live_track_process_def (live, result, graph);
}
+ /* Pretend there are defs for params' default defs at the start
+ of the (post-)entry block. */
+ if (bb == entry)
+ {
+ unsigned base;
+ bitmap_iterator bi;
+ EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi)
+ {
+ bitmap_iterator bi2;
+ unsigned part;
+ EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base],
+ 0, part, bi2)
+ {
+ tree var = partition_to_var (map, part);
+ if (!SSA_NAME_VAR (var)
+ || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL
+ && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL)
+ || !SSA_NAME_IS_DEFAULT_DEF (var))
+ continue;
+ live_track_process_def (live, var, graph);
+ }
+ }
+ }
+
live_track_clear_base_vars (live);
}
@@ -1158,6 +1193,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
{
var1 = partition_to_var (map, p1);
var2 = partition_to_var (map, p2);
+
z = var_union (map, var1, var2);
if (z == NO_PARTITION)
{
@@ -1175,6 +1211,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
if (debug)
fprintf (debug, ": Success -> %d\n", z);
+
return true;
}
@@ -1272,6 +1309,330 @@ ssa_name_var_hash::equal (const tree_node *n1, const tree_node *n2)
}
+/* Output partition map MAP with coalescing plan PART to file F. */
+
+void
+dump_part_var_map (FILE *f, partition part, var_map map)
+{
+ int t;
+ unsigned x, y;
+ int p;
+
+ fprintf (f, "\nCoalescible Partition map \n\n");
+
+ for (x = 0; x < map->num_partitions; x++)
+ {
+ if (map->view_to_partition != NULL)
+ p = map->view_to_partition[x];
+ else
+ p = x;
+
+ if (ssa_name (p) == NULL_TREE
+ || virtual_operand_p (ssa_name (p)))
+ continue;
+
+ t = 0;
+ for (y = 1; y < num_ssa_names; y++)
+ {
+ tree var = version_to_var (map, y);
+ if (!var)
+ continue;
+ int q = var_to_partition (map, var);
+ p = partition_find (part, q);
+ gcc_assert (map->partition_to_base_index[q]
+ == map->partition_to_base_index[p]);
+
+ if (p == (int)x)
+ {
+ if (t++ == 0)
+ {
+ fprintf (f, "Partition %d, base %d (", x,
+ map->partition_to_base_index[q]);
+ print_generic_expr (f, partition_to_var (map, q), TDF_SLIM);
+ fprintf (f, " - ");
+ }
+ fprintf (f, "%d ", y);
+ }
+ }
+ if (t != 0)
+ fprintf (f, ")\n");
+ }
+ fprintf (f, "\n");
+}
+
+/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
+ coalescing together, false otherwise.
+
+ This must stay consistent with var_map_base_init in tree-ssa-live.c. */
+
+bool
+gimple_can_coalesce_p (tree name1, tree name2)
+{
+ /* First check the SSA_NAME's associated DECL. Without
+ optimization, we only want to coalesce if they have the same DECL
+ or both have no associated DECL. */
+ tree var1 = SSA_NAME_VAR (name1);
+ tree var2 = SSA_NAME_VAR (name2);
+ var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
+ var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
+ if (var1 != var2 && !flag_tree_coalesce_vars)
+ return false;
+
+ /* Now check the types. If the types are the same, then we should
+ try to coalesce V1 and V2. */
+ tree t1 = TREE_TYPE (name1);
+ tree t2 = TREE_TYPE (name2);
+ if (t1 == t2)
+ {
+ check_modes:
+ /* If the base variables are the same, we're good: none of the
+ other tests below could possibly fail. */
+ var1 = SSA_NAME_VAR (name1);
+ var2 = SSA_NAME_VAR (name2);
+ if (var1 == var2)
+ return true;
+
+ /* We don't want to coalesce two SSA names if one of the base
+ variables is supposed to be a register while the other is
+ supposed to be on the stack. Anonymous SSA names take
+ registers, but when not optimizing, user variables should go
+ on the stack, so coalescing them with the anonymous variable
+ as the partition leader would end up assigning the user
+ variable to a register. Don't do that! */
+ bool reg1 = !var1 || use_register_for_decl (var1);
+ bool reg2 = !var2 || use_register_for_decl (var2);
+ if (reg1 != reg2)
+ return false;
+
+ /* Check that the promoted modes are the same. We don't want to
+ coalesce if the promoted modes would be different. Only
+ PARM_DECLs and RESULT_DECLs have different promotion rules,
+ so skip the test if both are variables or anonymous
+ SSA_NAMEs. */
+ return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
+ || promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL);
+ }
+
+ /* If the types are not the same, check for a canonical type match. This
+ (for example) allows coalescing when the types are fundamentally the
+ same, but just have different names.
+
+ Note pointer types with different address spaces may have the same
+ canonical type. Those are rejected for coalescing by the
+ types_compatible_p check. */
+ if (TYPE_CANONICAL (t1)
+ && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
+ && types_compatible_p (t1, t2))
+ goto check_modes;
+
+ return false;
+}
+
+/* Fill in MAP's partition_to_base_index, with one index for each
+ partition of SSA names USED_IN_COPIES and related by CL coalesce
+ possibilities. This must match gimple_can_coalesce_p in the
+ optimized case. */
+
+static void
+compute_optimized_partition_bases (var_map map, bitmap used_in_copies,
+ coalesce_list_p cl)
+{
+ int parts = num_var_partitions (map);
+ partition tentative = partition_new (parts);
+
+ /* Partition the SSA versions so that, for each coalescible
+ pair, both of its members are in the same partition in
+ TENTATIVE. */
+ gcc_assert (!cl->sorted);
+ coalesce_pair_p node;
+ coalesce_iterator_type ppi;
+ FOR_EACH_PARTITION_PAIR (node, ppi, cl)
+ {
+ tree v1 = ssa_name (node->first_element);
+ int p1 = partition_find (tentative, var_to_partition (map, v1));
+ tree v2 = ssa_name (node->second_element);
+ int p2 = partition_find (tentative, var_to_partition (map, v2));
+
+ if (p1 == p2)
+ continue;
+
+ partition_union (tentative, p1, p2);
+ }
+
+ /* We have to deal with cost one pairs too. */
+ for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next)
+ {
+ tree v1 = ssa_name (co->first_element);
+ int p1 = partition_find (tentative, var_to_partition (map, v1));
+ tree v2 = ssa_name (co->second_element);
+ int p2 = partition_find (tentative, var_to_partition (map, v2));
+
+ if (p1 == p2)
+ continue;
+
+ partition_union (tentative, p1, p2);
+ }
+
+ /* And also with abnormal edges. */
+ basic_block bb;
+ edge e;
+ edge_iterator ei;
+ FOR_EACH_BB_FN (bb, cfun)
+ {
+ FOR_EACH_EDGE (e, ei, bb->preds)
+ if (e->flags & EDGE_ABNORMAL)
+ {
+ gphi_iterator gsi;
+ for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
+ gsi_next (&gsi))
+ {
+ gphi *phi = gsi.phi ();
+ tree arg = PHI_ARG_DEF (phi, e->dest_idx);
+ if (SSA_NAME_IS_DEFAULT_DEF (arg)
+ && (!SSA_NAME_VAR (arg)
+ || TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL))
+ continue;
+
+ tree res = PHI_RESULT (phi);
+
+ int p1 = partition_find (tentative, var_to_partition (map, res));
+ int p2 = partition_find (tentative, var_to_partition (map, arg));
+
+ if (p1 == p2)
+ continue;
+
+ partition_union (tentative, p1, p2);
+ }
+ }
+ }
+
+ map->partition_to_base_index = XCNEWVEC (int, parts);
+ auto_vec<unsigned int> index_map (parts);
+ if (parts)
+ index_map.quick_grow (parts);
+
+ const unsigned no_part = -1;
+ unsigned count = parts;
+ while (count)
+ index_map[--count] = no_part;
+
+ /* Initialize MAP's mapping from partition to base index, using
+ as base indices an enumeration of the TENTATIVE partitions in
+ which each SSA version ended up, so that we compute conflicts
+ between all SSA versions that ended up in the same potential
+ coalesce partition. */
+ bitmap_iterator bi;
+ unsigned i;
+ EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
+ {
+ int pidx = var_to_partition (map, ssa_name (i));
+ int base = partition_find (tentative, pidx);
+ if (index_map[base] != no_part)
+ continue;
+ index_map[base] = count++;
+ }
+
+ map->num_basevars = count;
+
+ EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
+ {
+ int pidx = var_to_partition (map, ssa_name (i));
+ int base = partition_find (tentative, pidx);
+ gcc_assert (index_map[base] < count);
+ map->partition_to_base_index[pidx] = index_map[base];
+ }
+
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ dump_part_var_map (dump_file, tentative, map);
+
+ partition_delete (tentative);
+}
+
+/* Hashtable helpers. */
+
+struct tree_int_map_hasher : typed_noop_remove <tree_int_map>
+{
+ typedef tree_int_map *value_type;
+ typedef tree_int_map *compare_type;
+ static inline hashval_t hash (const tree_int_map *);
+ static inline bool equal (const tree_int_map *, const tree_int_map *);
+};
+
+inline hashval_t
+tree_int_map_hasher::hash (const tree_int_map *v)
+{
+ return tree_map_base_hash (v);
+}
+
+inline bool
+tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
+{
+ return tree_int_map_eq (v, c);
+}
+
+/* This routine will initialize the basevar fields of MAP with base
+ names. Partitions will share the same base if they have the same
+ SSA_NAME_VAR, or, being anonymous variables, the same type. This
+ must match gimple_can_coalesce_p in the non-optimized case. */
+
+static void
+compute_samebase_partition_bases (var_map map)
+{
+ int x, num_part;
+ tree var;
+ struct tree_int_map *m, *mapstorage;
+
+ num_part = num_var_partitions (map);
+ hash_table<tree_int_map_hasher> tree_to_index (num_part);
+ /* We can have at most num_part entries in the hash tables, so it's
+ enough to allocate so many map elements once, saving some malloc
+ calls. */
+ mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
+
+ /* If a base table already exists, clear it, otherwise create it. */
+ free (map->partition_to_base_index);
+ map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
+
+ /* Build the base variable list, and point partitions at their bases. */
+ for (x = 0; x < num_part; x++)
+ {
+ struct tree_int_map **slot;
+ unsigned baseindex;
+ var = partition_to_var (map, x);
+ if (SSA_NAME_VAR (var)
+ && (!VAR_P (SSA_NAME_VAR (var))
+ || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
+ m->base.from = SSA_NAME_VAR (var);
+ else
+ /* This restricts what anonymous SSA names we can coalesce
+ as it restricts the sets we compute conflicts for.
+ Using TREE_TYPE to generate sets is the easies as
+ type equivalency also holds for SSA names with the same
+ underlying decl.
+
+ Check gimple_can_coalesce_p when changing this code. */
+ m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
+ ? TYPE_CANONICAL (TREE_TYPE (var))
+ : TREE_TYPE (var));
+ /* If base variable hasn't been seen, set it up. */
+ slot = tree_to_index.find_slot (m, INSERT);
+ if (!*slot)
+ {
+ baseindex = m - mapstorage;
+ m->to = baseindex;
+ *slot = m;
+ m++;
+ }
+ else
+ baseindex = (*slot)->to;
+ map->partition_to_base_index[x] = baseindex;
+ }
+
+ map->num_basevars = m - mapstorage;
+
+ free (mapstorage);
+}
+
/* Reduce the number of copies by coalescing variables in the function. Return
a partition map with the resulting coalesces. */
@@ -1288,9 +1649,10 @@ coalesce_ssa_name (void)
cl = create_coalesce_list ();
map = create_outofssa_var_map (cl, used_in_copies);
- /* If optimization is disabled, we need to coalesce all the names originating
- from the same SSA_NAME_VAR so debug info remains undisturbed. */
- if (!optimize)
+ /* If this optimization is disabled, we need to coalesce all the
+ names originating from the same SSA_NAME_VAR so debug info
+ remains undisturbed. */
+ if (!flag_tree_coalesce_vars)
{
hash_table<ssa_name_var_hash> ssa_name_hash (10);
@@ -1331,8 +1693,13 @@ coalesce_ssa_name (void)
if (dump_file && (dump_flags & TDF_DETAILS))
dump_var_map (dump_file, map);
- /* Don't calculate live ranges for variables not in the coalesce list. */
- partition_view_bitmap (map, used_in_copies, true);
+ partition_view_bitmap (map, used_in_copies);
+
+ if (flag_tree_coalesce_vars)
+ compute_optimized_partition_bases (map, used_in_copies, cl);
+ else
+ compute_samebase_partition_bases (map);
+
BITMAP_FREE (used_in_copies);
if (num_var_partitions (map) < 1)
@@ -1371,8 +1738,7 @@ coalesce_ssa_name (void)
/* Now coalesce everything in the list. */
coalesce_partitions (map, graph, cl,
- ((dump_flags & TDF_DETAILS) ? dump_file
- : NULL));
+ ((dump_flags & TDF_DETAILS) ? dump_file : NULL));
delete_coalesce_list (cl);
ssa_conflicts_delete (graph);
diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c
deleted file mode 100644
index f3cb56e..0000000
--- a/gcc/tree-ssa-copyrename.c
+++ /dev/null
@@ -1,499 +0,0 @@
-/* Rename SSA copies.
- Copyright (C) 2004-2015 Free Software Foundation, Inc.
- Contributed by Andrew MacLeod <amacleod@redhat.com>
-
-This file is part of GCC.
-
-GCC is free software; you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation; either version 3, or (at your option)
-any later version.
-
-GCC is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-GNU General Public License for more details.
-
-You should have received a copy of the GNU General Public License
-along with GCC; see the file COPYING3. If not see
-<http://www.gnu.org/licenses/>. */
-
-#include "config.h"
-#include "system.h"
-#include "coretypes.h"
-#include "tm.h"
-#include "hash-set.h"
-#include "machmode.h"
-#include "vec.h"
-#include "double-int.h"
-#include "input.h"
-#include "alias.h"
-#include "symtab.h"
-#include "wide-int.h"
-#include "inchash.h"
-#include "tree.h"
-#include "fold-const.h"
-#include "predict.h"
-#include "hard-reg-set.h"
-#include "function.h"
-#include "dominance.h"
-#include "cfg.h"
-#include "basic-block.h"
-#include "tree-ssa-alias.h"
-#include "internal-fn.h"
-#include "gimple-expr.h"
-#include "is-a.h"
-#include "gimple.h"
-#include "gimple-iterator.h"
-#include "flags.h"
-#include "tree-pretty-print.h"
-#include "bitmap.h"
-#include "gimple-ssa.h"
-#include "stringpool.h"
-#include "tree-ssanames.h"
-#include "hashtab.h"
-#include "rtl.h"
-#include "statistics.h"
-#include "real.h"
-#include "fixed-value.h"
-#include "insn-config.h"
-#include "expmed.h"
-#include "dojump.h"
-#include "explow.h"
-#include "calls.h"
-#include "emit-rtl.h"
-#include "varasm.h"
-#include "stmt.h"
-#include "expr.h"
-#include "tree-dfa.h"
-#include "tree-inline.h"
-#include "tree-ssa-live.h"
-#include "tree-pass.h"
-#include "langhooks.h"
-
-static struct
-{
- /* Number of copies coalesced. */
- int coalesced;
-} stats;
-
-/* The following routines implement the SSA copy renaming phase.
-
- This optimization looks for copies between 2 SSA_NAMES, either through a
- direct copy, or an implicit one via a PHI node result and its arguments.
-
- Each copy is examined to determine if it is possible to rename the base
- variable of one of the operands to the same variable as the other operand.
- i.e.
- T.3_5 = <blah>
- a_1 = T.3_5
-
- If this copy couldn't be copy propagated, it could possibly remain in the
- program throughout the optimization phases. After SSA->normal, it would
- become:
-
- T.3 = <blah>
- a = T.3
-
- Since T.3_5 is distinct from all other SSA versions of T.3, there is no
- fundamental reason why the base variable needs to be T.3, subject to
- certain restrictions. This optimization attempts to determine if we can
- change the base variable on copies like this, and result in code such as:
-
- a_5 = <blah>
- a_1 = a_5
-
- This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is
- possible, the copy goes away completely. If it isn't possible, a new temp
- will be created for a_5, and you will end up with the exact same code:
-
- a.8 = <blah>
- a = a.8
-
- The other benefit of performing this optimization relates to what variables
- are chosen in copies. Gimplification of the program uses temporaries for
- a lot of things. expressions like
-
- a_1 = <blah>
- <blah2> = a_1
-
- get turned into
-
- T.3_5 = <blah>
- a_1 = T.3_5
- <blah2> = a_1
-
- Copy propagation is done in a forward direction, and if we can propagate
- through the copy, we end up with:
-
- T.3_5 = <blah>
- <blah2> = T.3_5
-
- The copy is gone, but so is all reference to the user variable 'a'. By
- performing this optimization, we would see the sequence:
-
- a_5 = <blah>
- a_1 = a_5
- <blah2> = a_1
-
- which copy propagation would then turn into:
-
- a_5 = <blah>
- <blah2> = a_5
-
- and so we still retain the user variable whenever possible. */
-
-
-/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
- Choose a representative for the partition, and send debug info to DEBUG. */
-
-static void
-copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
-{
- int p1, p2, p3;
- tree root1, root2;
- tree rep1, rep2;
- bool ign1, ign2, abnorm;
-
- gcc_assert (TREE_CODE (var1) == SSA_NAME);
- gcc_assert (TREE_CODE (var2) == SSA_NAME);
-
- register_ssa_partition (map, var1);
- register_ssa_partition (map, var2);
-
- p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
- p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
-
- if (debug)
- {
- fprintf (debug, "Try : ");
- print_generic_expr (debug, var1, TDF_SLIM);
- fprintf (debug, "(P%d) & ", p1);
- print_generic_expr (debug, var2, TDF_SLIM);
- fprintf (debug, "(P%d)", p2);
- }
-
- gcc_assert (p1 != NO_PARTITION);
- gcc_assert (p2 != NO_PARTITION);
-
- if (p1 == p2)
- {
- if (debug)
- fprintf (debug, " : Already coalesced.\n");
- return;
- }
-
- rep1 = partition_to_var (map, p1);
- rep2 = partition_to_var (map, p2);
- root1 = SSA_NAME_VAR (rep1);
- root2 = SSA_NAME_VAR (rep2);
- if (!root1 && !root2)
- return;
-
- /* Don't coalesce if one of the variables occurs in an abnormal PHI. */
- abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1)
- || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2));
- if (abnorm)
- {
- if (debug)
- fprintf (debug, " : Abnormal PHI barrier. No coalesce.\n");
- return;
- }
-
- /* Partitions already have the same root, simply merge them. */
- if (root1 == root2)
- {
- p1 = partition_union (map->var_partition, p1, p2);
- if (debug)
- fprintf (debug, " : Same root, coalesced --> P%d.\n", p1);
- return;
- }
-
- /* Never attempt to coalesce 2 different parameters. */
- if ((root1 && TREE_CODE (root1) == PARM_DECL)
- && (root2 && TREE_CODE (root2) == PARM_DECL))
- {
- if (debug)
- fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n");
- return;
- }
-
- if ((root1 && TREE_CODE (root1) == RESULT_DECL)
- != (root2 && TREE_CODE (root2) == RESULT_DECL))
- {
- if (debug)
- fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n");
- return;
- }
-
- ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1));
- ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2));
-
- /* Refrain from coalescing user variables, if requested. */
- if (!ign1 && !ign2)
- {
- if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2))
- ign2 = true;
- else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1))
- ign1 = true;
- else if (flag_ssa_coalesce_vars != 2)
- {
- if (debug)
- fprintf (debug, " : 2 different USER vars. No coalesce.\n");
- return;
- }
- else
- ign2 = true;
- }
-
- /* If both values have default defs, we can't coalesce. If only one has a
- tag, make sure that variable is the new root partition. */
- if (root1 && ssa_default_def (cfun, root1))
- {
- if (root2 && ssa_default_def (cfun, root2))
- {
- if (debug)
- fprintf (debug, " : 2 default defs. No coalesce.\n");
- return;
- }
- else
- {
- ign2 = true;
- ign1 = false;
- }
- }
- else if (root2 && ssa_default_def (cfun, root2))
- {
- ign1 = true;
- ign2 = false;
- }
-
- /* Do not coalesce if we cannot assign a symbol to the partition. */
- if (!(!ign2 && root2)
- && !(!ign1 && root1))
- {
- if (debug)
- fprintf (debug, " : Choosen variable has no root. No coalesce.\n");
- return;
- }
-
- /* Don't coalesce if the new chosen root variable would be read-only.
- If both ign1 && ign2, then the root var of the larger partition
- wins, so reject in that case if any of the root vars is TREE_READONLY.
- Otherwise reject only if the root var, on which replace_ssa_name_symbol
- will be called below, is readonly. */
- if (((root1 && TREE_READONLY (root1)) && ign2)
- || ((root2 && TREE_READONLY (root2)) && ign1))
- {
- if (debug)
- fprintf (debug, " : Readonly variable. No coalesce.\n");
- return;
- }
-
- /* Don't coalesce if the two variables aren't type compatible . */
- if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2))
- /* There is a disconnect between the middle-end type-system and
- VRP, avoid coalescing enum types with different bounds. */
- || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE
- || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE)
- && TREE_TYPE (var1) != TREE_TYPE (var2)))
- {
- if (debug)
- fprintf (debug, " : Incompatible types. No coalesce.\n");
- return;
- }
-
- /* Merge the two partitions. */
- p3 = partition_union (map->var_partition, p1, p2);
-
- /* Set the root variable of the partition to the better choice, if there is
- one. */
- if (!ign2 && root2)
- replace_ssa_name_symbol (partition_to_var (map, p3), root2);
- else if (!ign1 && root1)
- replace_ssa_name_symbol (partition_to_var (map, p3), root1);
- else
- gcc_unreachable ();
-
- if (debug)
- {
- fprintf (debug, " --> P%d ", p3);
- print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)),
- TDF_SLIM);
- fprintf (debug, "\n");
- }
-}
-
-
-namespace {
-
-const pass_data pass_data_rename_ssa_copies =
-{
- GIMPLE_PASS, /* type */
- "copyrename", /* name */
- OPTGROUP_NONE, /* optinfo_flags */
- TV_TREE_COPY_RENAME, /* tv_id */
- ( PROP_cfg | PROP_ssa ), /* properties_required */
- 0, /* properties_provided */
- 0, /* properties_destroyed */
- 0, /* todo_flags_start */
- 0, /* todo_flags_finish */
-};
-
-class pass_rename_ssa_copies : public gimple_opt_pass
-{
-public:
- pass_rename_ssa_copies (gcc::context *ctxt)
- : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt)
- {}
-
- /* opt_pass methods: */
- opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); }
- virtual bool gate (function *) { return flag_tree_copyrename != 0; }
- virtual unsigned int execute (function *);
-
-}; // class pass_rename_ssa_copies
-
-/* This function will make a pass through the IL, and attempt to coalesce any
- SSA versions which occur in PHI's or copies. Coalescing is accomplished by
- changing the underlying root variable of all coalesced version. This will
- then cause the SSA->normal pass to attempt to coalesce them all to the same
- variable. */
-
-unsigned int
-pass_rename_ssa_copies::execute (function *fun)
-{
- var_map map;
- basic_block bb;
- tree var, part_var;
- gimple stmt;
- unsigned x;
- FILE *debug;
-
- memset (&stats, 0, sizeof (stats));
-
- if (dump_file && (dump_flags & TDF_DETAILS))
- debug = dump_file;
- else
- debug = NULL;
-
- map = init_var_map (num_ssa_names);
-
- FOR_EACH_BB_FN (bb, fun)
- {
- /* Scan for real copies. */
- for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
- gsi_next (&gsi))
- {
- stmt = gsi_stmt (gsi);
- if (gimple_assign_ssa_name_copy_p (stmt))
- {
- tree lhs = gimple_assign_lhs (stmt);
- tree rhs = gimple_assign_rhs1 (stmt);
-
- copy_rename_partition_coalesce (map, lhs, rhs, debug);
- }
- }
- }
-
- FOR_EACH_BB_FN (bb, fun)
- {
- /* Treat PHI nodes as copies between the result and each argument. */
- for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
- gsi_next (&gsi))
- {
- size_t i;
- tree res;
- gphi *phi = gsi.phi ();
- res = gimple_phi_result (phi);
-
- /* Do not process virtual SSA_NAMES. */
- if (virtual_operand_p (res))
- continue;
-
- /* Make sure to only use the same partition for an argument
- as the result but never the other way around. */
- if (SSA_NAME_VAR (res)
- && !DECL_IGNORED_P (SSA_NAME_VAR (res)))
- for (i = 0; i < gimple_phi_num_args (phi); i++)
- {
- tree arg = PHI_ARG_DEF (phi, i);
- if (TREE_CODE (arg) == SSA_NAME)
- copy_rename_partition_coalesce (map, res, arg,
- debug);
- }
- /* Else if all arguments are in the same partition try to merge
- it with the result. */
- else
- {
- int all_p_same = -1;
- int p = -1;
- for (i = 0; i < gimple_phi_num_args (phi); i++)
- {
- tree arg = PHI_ARG_DEF (phi, i);
- if (TREE_CODE (arg) != SSA_NAME)
- {
- all_p_same = 0;
- break;
- }
- else if (all_p_same == -1)
- {
- p = partition_find (map->var_partition,
- SSA_NAME_VERSION (arg));
- all_p_same = 1;
- }
- else if (all_p_same == 1
- && p != partition_find (map->var_partition,
- SSA_NAME_VERSION (arg)))
- {
- all_p_same = 0;
- break;
- }
- }
- if (all_p_same == 1)
- copy_rename_partition_coalesce (map, res,
- PHI_ARG_DEF (phi, 0),
- debug);
- }
- }
- }
-
- if (debug)
- dump_var_map (debug, map);
-
- /* Now one more pass to make all elements of a partition share the same
- root variable. */
-
- for (x = 1; x < num_ssa_names; x++)
- {
- part_var = partition_to_var (map, x);
- if (!part_var)
- continue;
- var = ssa_name (x);
- if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var))
- continue;
- if (debug)
- {
- fprintf (debug, "Coalesced ");
- print_generic_expr (debug, var, TDF_SLIM);
- fprintf (debug, " to ");
- print_generic_expr (debug, part_var, TDF_SLIM);
- fprintf (debug, "\n");
- }
- stats.coalesced++;
- replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var));
- }
-
- statistics_counter_event (fun, "copies coalesced",
- stats.coalesced);
- delete_var_map (map);
- return 0;
-}
-
-} // anon namespace
-
-gimple_opt_pass *
-make_pass_rename_ssa_copies (gcc::context *ctxt)
-{
- return new pass_rename_ssa_copies (ctxt);
-}
diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
index 2c7c072..821b2f4 100644
--- a/gcc/tree-ssa-live.c
+++ b/gcc/tree-ssa-live.c
@@ -100,90 +100,6 @@ static void verify_live_on_entry (tree_live_info_p);
ssa_name or variable, and vice versa. */
-/* Hashtable helpers. */
-
-struct tree_int_map_hasher : typed_noop_remove <tree_int_map>
-{
- typedef tree_int_map *value_type;
- typedef tree_int_map *compare_type;
- static inline hashval_t hash (const tree_int_map *);
- static inline bool equal (const tree_int_map *, const tree_int_map *);
-};
-
-inline hashval_t
-tree_int_map_hasher::hash (const tree_int_map *v)
-{
- return tree_map_base_hash (v);
-}
-
-inline bool
-tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
-{
- return tree_int_map_eq (v, c);
-}
-
-
-/* This routine will initialize the basevar fields of MAP. */
-
-static void
-var_map_base_init (var_map map)
-{
- int x, num_part;
- tree var;
- struct tree_int_map *m, *mapstorage;
-
- num_part = num_var_partitions (map);
- hash_table<tree_int_map_hasher> tree_to_index (num_part);
- /* We can have at most num_part entries in the hash tables, so it's
- enough to allocate so many map elements once, saving some malloc
- calls. */
- mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
-
- /* If a base table already exists, clear it, otherwise create it. */
- free (map->partition_to_base_index);
- map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
-
- /* Build the base variable list, and point partitions at their bases. */
- for (x = 0; x < num_part; x++)
- {
- struct tree_int_map **slot;
- unsigned baseindex;
- var = partition_to_var (map, x);
- if (SSA_NAME_VAR (var)
- && (!VAR_P (SSA_NAME_VAR (var))
- || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
- m->base.from = SSA_NAME_VAR (var);
- else
- /* This restricts what anonymous SSA names we can coalesce
- as it restricts the sets we compute conflicts for.
- Using TREE_TYPE to generate sets is the easies as
- type equivalency also holds for SSA names with the same
- underlying decl.
-
- Check gimple_can_coalesce_p when changing this code. */
- m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
- ? TYPE_CANONICAL (TREE_TYPE (var))
- : TREE_TYPE (var));
- /* If base variable hasn't been seen, set it up. */
- slot = tree_to_index.find_slot (m, INSERT);
- if (!*slot)
- {
- baseindex = m - mapstorage;
- m->to = baseindex;
- *slot = m;
- m++;
- }
- else
- baseindex = (*slot)->to;
- map->partition_to_base_index[x] = baseindex;
- }
-
- map->num_basevars = m - mapstorage;
-
- free (mapstorage);
-}
-
-
/* Remove the base table in MAP. */
static void
@@ -361,21 +277,17 @@ partition_view_fini (var_map map, bitmap selected)
}
-/* Create a partition view which includes all the used partitions in MAP. If
- WANT_BASES is true, create the base variable map as well. */
+/* Create a partition view which includes all the used partitions in MAP. */
void
-partition_view_normal (var_map map, bool want_bases)
+partition_view_normal (var_map map)
{
bitmap used;
used = partition_view_init (map);
partition_view_fini (map, used);
- if (want_bases)
- var_map_base_init (map);
- else
- var_map_base_fini (map);
+ var_map_base_fini (map);
}
@@ -384,7 +296,7 @@ partition_view_normal (var_map map, bool want_bases)
as well. */
void
-partition_view_bitmap (var_map map, bitmap only, bool want_bases)
+partition_view_bitmap (var_map map, bitmap only)
{
bitmap used;
bitmap new_partitions = BITMAP_ALLOC (NULL);
@@ -400,10 +312,7 @@ partition_view_bitmap (var_map map, bitmap only, bool want_bases)
}
partition_view_fini (map, new_partitions);
- if (want_bases)
- var_map_base_init (map);
- else
- var_map_base_fini (map);
+ var_map_base_fini (map);
}
diff --git a/gcc/tree-ssa-live.h b/gcc/tree-ssa-live.h
index d5d7820..1f88358 100644
--- a/gcc/tree-ssa-live.h
+++ b/gcc/tree-ssa-live.h
@@ -71,8 +71,8 @@ typedef struct _var_map
extern var_map init_var_map (int);
extern void delete_var_map (var_map);
extern int var_union (var_map, tree, tree);
-extern void partition_view_normal (var_map, bool);
-extern void partition_view_bitmap (var_map, bitmap, bool);
+extern void partition_view_normal (var_map);
+extern void partition_view_bitmap (var_map, bitmap);
extern void dump_scope_blocks (FILE *, int);
extern void debug_scope_block (tree, int);
extern void debug_scope_blocks (int);
diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
index 685fcc38c..447fcd9 100644
--- a/gcc/var-tracking.c
+++ b/gcc/var-tracking.c
@@ -4872,12 +4872,16 @@ dataflow_set_remove_mem_locs (variable_def **slot, dataflow_set *set)
registers, as well as associations between MEMs and VALUEs. */
static void
-dataflow_set_clear_at_call (dataflow_set *set)
+dataflow_set_clear_at_call (dataflow_set *set, rtx_insn *call_insn)
{
unsigned int r;
hard_reg_set_iterator hrsi;
+ HARD_REG_SET invalidated_regs;
- EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, r, hrsi)
+ get_call_reg_set_usage (call_insn, &invalidated_regs,
+ regs_invalidated_by_call);
+
+ EXECUTE_IF_SET_IN_HARD_REG_SET (invalidated_regs, 0, r, hrsi)
var_regno_delete (set, r);
if (MAY_HAVE_DEBUG_INSNS)
@@ -6685,7 +6689,7 @@ compute_bb_dataflow (basic_block bb)
switch (mo->type)
{
case MO_CALL:
- dataflow_set_clear_at_call (out);
+ dataflow_set_clear_at_call (out, insn);
break;
case MO_USE:
@@ -9152,7 +9156,7 @@ emit_notes_in_bb (basic_block bb, dataflow_set *set)
switch (mo->type)
{
case MO_CALL:
- dataflow_set_clear_at_call (set);
+ dataflow_set_clear_at_call (set, insn);
emit_notes_for_changes (insn, EMIT_NOTE_AFTER_CALL_INSN, set->vars);
{
rtx arguments = mo->u.loc, *p = &arguments;
--
Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/ FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
* Re: [PR64164] drop copyrename, integrate into expand
2015-04-24 1:56 ` Alexandre Oliva
@ 2015-04-27 11:39 ` Richard Biener
2015-06-06 5:12 ` Alexandre Oliva
2015-04-29 3:51 ` Jeff Law
1 sibling, 1 reply; 127+ messages in thread
From: Richard Biener @ 2015-04-27 11:39 UTC (permalink / raw)
To: Alexandre Oliva; +Cc: Jeff Law, GCC Patches
On Fri, Apr 24, 2015 at 3:56 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Apr 6, 2015, Jeff Law <law@redhat.com> wrote:
>
>>>> So the bulk of the changes into this routine are really about picking
>>>> a good leader, which presumably is how we're able to get the desired
>>>> effects on debuginfo that we used to get from tree-ssa-copyrename.c?
>>>
>>> This has nothing to do with debuginfo, I'm afraid. We just had to keep
>>> track of parm and result decls to avoid coalescing them together, and
>>> parm decl default defs to promote them to leaders, because expand copies
>>> incoming REGs to pseudos in PARM_DECL's DECL_RTL. We should fill that
>>> in with the RTL created for the default def for the PARM_DECL. At the
>>> end, I believe we also copy the RESULT_DECL DECL_RTL to the actual
>>> return register or rtl. I didn't want to tackle the reworking of these
>>> expanders to avoid problems out of copying incoming parms to one pseudo
>>> and then reading it from another, as I observed before I made this
>>> change. I'm now tackling that, so that we can refrain from touching the
>>> base vars altogether, and not refrain from coalescing parms and results.
>
>> Hmmm, so the real issue here is the expansion setup of parms and
>> results. I hadn't pondered that aspect. I'd encourage fixing the
>> expansion code too if you can see a path for that.
>
> That was the trickiest bit of the patch: getting assign_parms to use the
> out-of-SSA-chosen RTL for the (default def of the) param instead of
> creating a pseudo or stack slot of its own, especially when we create a
> .result_ptr decl and there is an incoming by-ref result_decl, in which
> case we ought to use the same SSA-assigned pseudo for both. Another
> case worth mentioning is that in which a param is unused: there is no
> default def for it, but in the non-optimized case, we have to assign it
> to the same location. I've used the DECL_RTL itself to carry the
> information in this case, at least in the non-optimized case, in which
> we know all SSA_NAMEs associated with each param *will* be assigned to
> the same partition, and use the same RTL. If we do optimize, the param
> may get multiple locations, and DECL_RTL will be cleared. That's fine:
> the incoming value of the param will end up being copied to a separate
> pseudo, so that there's no risk of messing with any other default def
> (there's a new testcase for this), and the copy is likely to be
> optimized out.
>
> The other tricky bit was to fix all expander bits that required
> SSA_NAMEs to have an associated decl. I've removed all such cases, so we
> can now expand anonymous SSA decls directly, without having to create an
> ignored decl. Doing that, we can coalesce variables and expand each
> partition without worrying about choosing a preferred partition leader.
> We just have to make sure we don't consider pairs of variables eligible
> for coalescing if they should get different promoted modes, or a
> different register-or-stack choice, and then expansion of partitions is
> streamlined: we just expand each leader, and then adjust all SSA_NAMEs
> to associate the RTL with their base variables, if any.
>
>
> In this revision of the patch, I have retained -ftree-coalesce-vars, so
> that its negated form can be used in testcases that formerly expected no
> coalescing across user variables, but that explicitly disabled VTA.
>
> As for testcases, while investigating test regressions, I found out
> various guality failures had to do with VT's lack of awareness of custom
> calling conventions. Caller's variables saved in registers that are
> normally call-clobbered, but that are call-saved in custom conventions
> set up for a callee, would end up invalidating the entry-point location
> associations. I've arranged for var-tracking to use custom calling
> conventions for register invalidation at call insns, and this fixed not
> only a few guality regressions due to changes in register assignment,
> but a number of other long-standing guality failures. Yay! This could
> be split out into a standalone patch.
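
The var-tracking change described above can be pictured with a small standalone sketch. It is not the GCC code: RegSet, TrackedRegs, regs_clobbered_by and clear_at_call are hypothetical names, and an 8-register machine is assumed purely for illustration. The idea is that the set of registers invalidated at a call comes from the callee's own convention when one is known, and falls back to the default call-clobbered set otherwise.

```cpp
#include <array>
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical 8-register machine; none of these names exist in GCC.
constexpr std::size_t kNumHardRegs = 8;
using RegSet = std::bitset<kNumHardRegs>;
constexpr int kNoVar = -1;

// Which variable (by id) is currently known to live in each hard register.
struct TrackedRegs {
  std::array<int, kNumHardRegs> var_in_reg;
  TrackedRegs() { var_in_reg.fill(kNoVar); }
};

// Pick the clobber set for one call: the callee-specific set when the
// callee uses a custom convention, otherwise the ABI default.
RegSet regs_clobbered_by(const RegSet* callee_specific,
                         const RegSet& abi_default) {
  return callee_specific ? *callee_specific : abi_default;
}

// Drop every register/variable association the call may invalidate.
void clear_at_call(TrackedRegs& set, const RegSet& clobbered) {
  for (std::size_t r = 0; r < kNumHardRegs; ++r)
    if (clobbered.test(r))
      set.var_in_reg[r] = kNoVar;
}

// Bounds-checked lookup of the variable tracked in register R.
int var_at(const TrackedRegs& set, std::size_t r) {
  assert(r < kNumHardRegs);
  return set.var_in_reg[r];
}
```

With a default set covering registers 0 to 3 and a callee that clobbers only 0 and 1, a variable tracked in register 2 is lost under the default set but survives when the callee-specific set is used, which is the kind of entry-point association the guality tests depend on.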
>
>
> On Mar 31, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
>
>> Did you do any statistics on how the number of basevars changes with your patch
>> compared to trunk?
>
> In this version of the patch, we no longer touch the base vars at all.
> We just associate the piece of RTL generated for the partition with a
> list of decls, if needed. (I've just realized that I never noticed a
> list of decls show up anywhere, and looking into this, I saw a bug in
> the leader_merge function, that causes it to fail to go from a single
> entry to a list: it creates the list, but then returns the original
> non-list entry; that's why I never saw them! I won't delay posting the
> patch just because of this; I'm not even sure we want decl lists in REG
> or MEM attrs to begin with.)
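
For readers who have not looked at leader_merge, the class of bug described in the parenthesis can be illustrated as follows. This is only a sketch of the pattern under assumed types: DeclList, merge_buggy and merge_fixed are hypothetical stand-ins, and the real function operates on trees, not on vectors of strings.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for "a decl or a list of decls".
using DeclList = std::vector<std::string>;

// Buggy shape: the merged list is built, but the caller gets the
// original entry back, so the list never becomes visible.
DeclList merge_buggy(const DeclList& cur, const std::string& next) {
  if (std::find(cur.begin(), cur.end(), next) != cur.end())
    return cur;                 // already present, nothing to add
  DeclList merged = cur;
  merged.push_back(next);
  return cur;                   // bug: MERGED is dropped here
}

// Fixed shape: return what was just built.
DeclList merge_fixed(const DeclList& cur, const std::string& next) {
  if (std::find(cur.begin(), cur.end(), next) != cur.end())
    return cur;
  DeclList merged = cur;
  merged.push_back(next);
  assert(merged.size() == cur.size() + 1);
  return merged;
}
```

With the buggy variant, merging a second decl into a single-entry value leaves a single entry, which would explain never seeing a list show up anywhere.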
>
> I have collected some statistics on the effects of the patch in
> compiling stage3-gcc/, before and after the patch, with and without
> -fno-tree-coalesce-vars. I counted, per function:
>
> b/a: before the patch, or after the patch
>
> c/n: -ftree-coalesce-vars (default when optimizing) or
> -fno-tree-coalesce-vars
>
> cv: the coalescible var count, i.e., the active partition count prior to
> coalescing. SSA_NAMEs not eligible for coalescing are not counted.
> The more of these there are, the larger the conflict graph we have to
> build.
>
> base: the base variable count that guides the construction of the
> conflict map. The more of these there are, the smaller the conflict
> graph we have to build, but it is also a lower bound for the final
> partition count.
>
> part: the partition count after coalescing, not counting those of
> SSA_NAMEs that were not eligible for coalescing to begin with.
>
> abn: successful abnormal coalesce count. How many times
> attempt_coalesce returned true as called in the abnormal coalesce loop.
>
> same: successful normal coalesces of pairs of SSA_NAMEs that share the
> same base variable (SSA_NAME_VAR, not the base index used to guide the
> construction of the conflict graph). Ignored base decls are regarded as
> NULL for purposes of this comparison. How many times attempt_coalesce
> returned true for variables that share the same base variable. This may
> count cases in which both vars are in the same partition already due to
> earlier coalesces.
>
> other: successful normal coalesces of pairs of SSA_NAMEs that do NOT
> share the same base variable. Same caveats as above.
>
> fail: failed attempts at normal coalesce. How many times
> attempt_coalesce returned false.
>
> b/a     c/n          cv    base    part    abn    same  other   fail
>
> before  -fno-tr  570180  176682  221442  82076  370746      0  10542
> before  -ftree-  577212  171581  221927  82076  378093      0  18654
>
> after   -fno-tr  608533  179959  220948  82076  488119      0  11697
> after   -ftree-  589243  202588  221817  82076  349373  41775  24124
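
To make the "base" column concrete, here is a minimal standalone sketch of how base indices can be derived from a tentative partitioning, as I read the description of the optimized case. It assumes every partition takes part in the enumeration and uses a plain union-find; UnionFind, BaseAssignment and assign_bases are hypothetical names, not GCC's partition API.

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Minimal union-find over partition indices (a stand-in, not GCC's type).
struct UnionFind {
  std::vector<int> parent;
  explicit UnionFind(int n) : parent(n) {
    std::iota(parent.begin(), parent.end(), 0);
  }
  int find(int x) {
    while (parent[x] != x) {
      parent[x] = parent[parent[x]];  // path halving
      x = parent[x];
    }
    return x;
  }
  void unite(int a, int b) { parent[find(a)] = find(b); }
};

struct BaseAssignment {
  std::vector<int> base_of_partition;  // partition -> base index
  int num_bases;                       // count of distinct base indices
};

// COPIES lists the pairs that would be merged if no conflict existed.
// Each resulting tentative class gets one dense base index.
BaseAssignment assign_bases(int parts,
                            const std::vector<std::pair<int, int>>& copies) {
  UnionFind tentative(parts);
  for (const auto& c : copies)
    tentative.unite(c.first, c.second);

  const int no_base = -1;
  std::vector<int> index_of_root(parts, no_base);
  BaseAssignment out{std::vector<int>(parts), 0};
  for (int p = 0; p < parts; ++p) {
    int root = tentative.find(p);
    if (index_of_root[root] == no_base)
      index_of_root[root] = out.num_bases++;  // first time this class is seen
    assert(index_of_root[root] < out.num_bases);
    out.base_of_partition[p] = index_of_root[root];
  }
  return out;
}
```

Conflicts are then only computed between partitions that share a base index, which is why more bases mean a smaller conflict graph, and why the base count is a lower bound on the final partition count.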
>
>
> Here's (for reference only) the patch used to gather the data
> consolidated above:
>
> diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
> index eeac5a4..d9fe4cc 100644
> --- a/gcc/tree-ssa-coalesce.c
> +++ b/gcc/tree-ssa-coalesce.c
> @@ -1199,6 +1199,11 @@ coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
> edge e;
> edge_iterator ei;
>
> + int abnormal = 0, samevar = 0, othervar = 0, failure = 0;
> + int initial_partitions = num_var_partitions (map);
> + int final_partitions = initial_partitions;
> + int p1, p2;
> +
> /* First, coalesce all the copies across abnormal edges. These are not placed
> in the coalesce list because they do not need to be sorted, and simply
> consume extra memory/compilation time in large programs. */
> @@ -1226,8 +1231,17 @@ coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
> if (debug)
> fprintf (debug, "Abnormal coalesce: ");
>
> + p1 = var_to_partition (map, arg);
> + p2 = var_to_partition (map, res);
> +
> if (!attempt_coalesce (map, graph, v1, v2, debug))
> fail_abnormal_edge_coalesce (v1, v2);
> + else
> + {
> + abnormal++;
> + if (p1 != p2)
> + final_partitions--;
> + }
> }
> }
> }
> @@ -1244,8 +1258,30 @@ coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
>
> if (debug)
> fprintf (debug, "Coalesce list: ");
> - attempt_coalesce (map, graph, x, y, debug);
> +
> + p1 = var_to_partition (map, var1);
> + p2 = var_to_partition (map, var2);
> +
> + if (!attempt_coalesce (map, graph, x, y, debug))
> + failure++;
> + else
> + {
> + if (p1 != p2)
> + final_partitions--;
> + if ((SSA_NAME_VAR (var1) && !DECL_IGNORED_P (SSA_NAME_VAR (var1))
> + ? SSA_NAME_VAR (var1) : NULL)
> + == (SSA_NAME_VAR (var2) && !DECL_IGNORED_P (SSA_NAME_VAR (var2))
> + ? SSA_NAME_VAR (var2) : NULL))
> + samevar++;
> + else
> + othervar++;
> + }
> }
> +
> + inform (1,
> + "%i cv, %i base, %i part, %i abn, %i same, %i other, %i failed in %q+F",
> + initial_partitions, num_basevars (map), final_partitions,
> + abnormal, samevar, othervar, failure, current_function_decl);
> }
>
>
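The same/other split made by the counting patch above can be modeled in isolation. This is only a minimal sketch in plain C: `struct decl`, `struct ssa_name`, `effective_var` and `counts_as_same` are hypothetical stand-ins invented for illustration, not GCC types or functions. It mirrors the comparison in the quoted hunk, where a missing or DECL_IGNORED_P base variable is treated as NULL before the two sides are compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a GCC decl; only the field the
   classification looks at is modeled.  */
struct decl
{
  const char *name;
  bool ignored;			/* Models DECL_IGNORED_P.  */
};

/* Hypothetical stand-in for an SSA name.  VAR == NULL models an
   anonymous SSA_NAME, i.e. SSA_NAME_VAR returning NULL.  */
struct ssa_name
{
  const struct decl *var;
};

/* The variable that matters for the same/other split: the base
   variable, unless it is missing or ignored.  */
static const struct decl *
effective_var (const struct ssa_name *n)
{
  return (n->var && !n->var->ignored) ? n->var : NULL;
}

/* Return true if a successful coalesce of A and B would be counted
   under "same", false if it would be counted under "other".  */
static bool
counts_as_same (const struct ssa_name *a, const struct ssa_name *b)
{
  assert (a && b);
  return effective_var (a) == effective_var (b);
}
```

One consequence of this comparison is that two names whose base variables are both anonymous or ignored land in "same", even when the underlying decls differ, because both sides collapse to NULL.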
>
> And here's the actual patch I'm submitting for your appreciation (I was
> gonna say for inclusion, but given the leader_merge brown paper bag bug,
> I'll just want feedback on whether we want that or not, and either drop
> the list-building, or probably post a revised patch that fixes fallout
> from lists where decls are expected.)
>
> No regressions, and many progressions, on x86_64-linux-gnu and
> i686-pc-linux-gnu.
>
> [PR64164] Drop copyrename, use coalescible partition as base when optimizing.
>
> for gcc/ChangeLog
>
> PR rtl-optimization/64164
> * Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
> * tree-ssa-copyrename.c: Removed.
> * opts.c (default_options_table): Drop -ftree-copyrename. Add
> -ftree-coalesce-vars.
> * passes.def: Drop all occurrences of pass_rename_ssa_copies.
> * common.opt (ftree-copyrename): Ignore.
> (ftree-coalesce-inlined-vars): Likewise.
> * doc/invoke.texi: Remove the ignored options above.
> * gimple-expr.h (gimple_can_coalesce_p): Note def location.
> * gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
> across variables when flag_tree_coalesce_vars. Check register
> use and promoted modes to allow coalescing. Moved to
> tree-ssa-coalesce.c.
> * tree-ssa-live.c (struct tree_int_map_hasher): Move along
> with its member functions to tree-ssa-coalesce.c.
> (var_map_base_init): Likewise. Renamed to
> compute_samebase_partition_bases.
> (partition_view_normal): Drop want_bases parameter.
> (partition_view_bitmap): Likewise.
> * tree-ssa-live.h: Adjust declarations.
> * tree-ssa-coalesce.c: Include explow.h.
> (build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs'
> default defs at the entry point.
> (dump_part_var_map): New.
> (compute_optimized_partition_bases): New, called by...
> (coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
> of compute_samebase_partition_bases. Adjust.
> * alias.c (nonoverlapping_memrefs_p): Special-case RTL-less
> gimple-reg exprs.
> * cfgexpand.c (leader_merge): New.
> (get_rtl_for_parm_ssa_default_def): New.
> (set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
> vars. Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
> (expand_one_stack_var_at): Handle anonymous SSA_NAMEs. Drop
> redundant MEM attr setting.
> (expand_one_stack_var_1): Handle anonymous SSA_NAMEs. Renamed
> from...
> (expand_one_stack_var): ... this. New wrapper to check and
> skip already expanded SSA partitions.
> (record_alignment_for_reg_var): New, factored out of...
> (expand_one_var): ... this.
> (expand_one_ssa_partition): New.
> (adjust_one_expanded_partition_var): New.
> (expand_one_register_var): Check and skip already expanded SSA
> partitions.
> (expand_used_vars): Don't create DECLs for anonymous SSA
> names. Expand all SSA partitions, then adjust all SSA names.
> (pass::execute): Replace the loops that set
> SA.partition_to_pseudo from partition leaders and cleared
> DECL_RTL for multi-location variables, and that which used to
> rename vars and set attrs, with one that clears DECL_RTL and
> checks that PARMs and RESULTs default_defs match DECL_RTL.
> * cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
> * emit-rtl.c (set_reg_attrs_for_parm): Handle NULL and
> TREE_LIST decl.
> * explow.c (promote_ssa_mode): New.
> * explow.h (promote_ssa_mode): Declare.
> * expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
> * function.c: Include cfgexpand.h.
> (use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
> (use_register_for_parm_decl): Wrapper for the above to
> special-case the result_ptr.
> (rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
> (maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
> multiple locations.
> (assign_parm_adjust_stack_rtl): Add all and parm arguments,
> for rtl_for_parm. For SSA-assigned parms, zero stack_parm.
> (assign_parm_setup_block): Prefer SSA-assigned location.
> (assign_parm_setup_reg): Likewise. Use entry_parm for equiv
> if stack_parm is NULL.
> (assign_parm_setup_stack): Prefer SSA-assigned location.
> (assign_parms): Maybe reset DECL_RTL of params. Adjust stack
> rtl before testing for pointer bounds. Special-case result_ptr.
> (expand_function_start): Maybe reset DECL_RTL of result.
> Prefer SSA-assigned location for result and static chain.
> Factor out DECL_RESULT and SET_DECL_RTL.
> * tree-outof-ssa.c (insert_value_copy_on_edge): Handle
> anonymous SSA names. Use promote_ssa_mode.
> (get_temp_reg): Likewise.
> (remove_ssa_form): Adjust.
> * var-tracking.c (dataflow_set_clear_at_call): Take call_insn
> and get its reg_usage for reg invalidation.
> (compute_bb_dataflow): Pass it insn.
> (emit_notes_in_bb): Likewise.
>
> for gcc/testsuite/ChangeLog
>
> * gcc.dg/guality/pr54200.c: Add -fno-tree-coalesce-vars.
> * gcc.dg/ssp-1.c: Make counter a register.
> * gcc.dg/ssp-2.c: Likewise.
> * gcc.dg/torture/parm-coalesce.c: New.
> ---
> gcc/Makefile.in | 1
> gcc/alias.c | 12 +
> gcc/cfgexpand.c | 383 +++++++++++++++-----
> gcc/cfgexpand.h | 2
> gcc/common.opt | 12 -
> gcc/doc/invoke.texi | 48 +--
> gcc/emit-rtl.c | 7
> gcc/explow.c | 25 +
> gcc/explow.h | 3
> gcc/expr.c | 33 +-
> gcc/function.c | 211 +++++++++--
> gcc/gimple-expr.c | 39 --
> gcc/gimple-expr.h | 5
> gcc/opts.c | 2
> gcc/passes.def | 5
> gcc/testsuite/gcc.dg/guality/pr54200.c | 2
> gcc/testsuite/gcc.dg/ssp-1.c | 2
> gcc/testsuite/gcc.dg/ssp-2.c | 2
> gcc/testsuite/gcc.dg/torture/parm-coalesce.c | 40 ++
> gcc/tree-outof-ssa.c | 15 -
> gcc/tree-ssa-coalesce.c | 380 +++++++++++++++++++-
> gcc/tree-ssa-copyrename.c | 499 --------------------------
> gcc/tree-ssa-live.c | 101 -----
> gcc/tree-ssa-live.h | 4
> gcc/var-tracking.c | 12 -
> 25 files changed, 980 insertions(+), 865 deletions(-)
> create mode 100644 gcc/testsuite/gcc.dg/torture/parm-coalesce.c
> delete mode 100644 gcc/tree-ssa-copyrename.c
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index 80c91f0..6920ee7 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1428,7 +1428,6 @@ OBJS = \
> tree-ssa-ccp.o \
> tree-ssa-coalesce.o \
> tree-ssa-copy.o \
> - tree-ssa-copyrename.o \
> tree-ssa-dce.o \
> tree-ssa-dom.o \
> tree-ssa-dse.o \
> diff --git a/gcc/alias.c b/gcc/alias.c
> index a7160f3..2100e8b 100644
> --- a/gcc/alias.c
> +++ b/gcc/alias.c
> @@ -2365,6 +2365,18 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
> if (! DECL_P (exprx) || ! DECL_P (expry))
> return 0;
>
> + /* If we refer to different gimple registers, or one gimple register
> + and one non-gimple-register, we know they can't overlap. Now,
> + there could be more than one stack slot for (different versions
> + of) the same gimple register, but we can presumably tell they
> + don't overlap based on offsets from stack base addresses
> + elsewhere. It's important that we don't proceed to DECL_RTL,
> + because gimple registers may not pass DECL_RTL_SET_P, and
> + make_decl_rtl won't be able to do anything about them since no
> + SSA information will have remained to guide it. */
> + if (is_gimple_reg (exprx) || is_gimple_reg (expry))
> + return exprx != expry;
> +
This should also mention that is_gimple_reg vars do not have their
address taken.
> /* With invalid code we can end up storing into the constant pool.
> Bail out to avoid ICEing when creating RTL for this.
> See gfortran.dg/lto/20091028-2_0.f90. */
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index ca491a0..74190a6d 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -179,21 +179,137 @@ gimple_assign_rhs_to_tree (gimple stmt)
>
> #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
>
> +/* NEXT is a DECL to be associated with some RTL, CUR is a DECL or a
> + TREE_LIST of DECLs. If NEXT is covered by CUR, return CUR
> + unchanged. Otherwise, return a list with all entries of CUR, with
> + NEXT at the end. If CUR was a list, it will be modified in
> + place. */
> +
> +static tree
> +leader_merge (tree cur, tree next)
> +{
> + if (cur == NULL || cur == next)
> + return next;
> +
> + tree list;
> +
> + if (TREE_CODE (cur) == TREE_LIST)
> + {
> + /* Look for NEXT in the list. Stop at the last node to insert
> + there. */
> + for (list = cur; ; list = TREE_CHAIN (list))
> + {
> + if (TREE_VALUE (list) == next)
> + return cur;
> + if (!TREE_CHAIN (list))
> + break;
> + }
> + }
> + else
> + /* Create the first node. */
> + list = build_tree_list (NULL, cur);
> +
> + next = build_tree_list (NULL, next);
> + TREE_CHAIN (list) = next;
Ick - presumably you can't use something better than a TREE_LIST here?
First, the linear walk looks expensive, and second, well, TREE_LIST ...
> +
> + return cur;
> +}
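To make the list handling in leader_merge easier to discuss, here is a minimal standalone model of it in plain C. Everything in it is an assumption made for illustration: `struct node`, `make_list_node`, `leader_merge_model` and `list_length` are hypothetical names, with `value` and `chain` standing in for TREE_VALUE and TREE_CHAIN. The sketch keeps the linear walk of the quoted code. It differs in one place: the quoted version ends with `return cur`, which in the non-list case hands back the bare decl rather than the newly built list, whereas the model returns the list head, on the assumption that this is what was intended.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum node_kind { KIND_DECL, KIND_LIST };

/* Hypothetical stand-in for a tree that is either a decl or a
   TREE_LIST node.  */
struct node
{
  enum node_kind kind;
  const char *name;		/* KIND_DECL only.  */
  struct node *value;		/* KIND_LIST only: models TREE_VALUE.  */
  struct node *chain;		/* KIND_LIST only: models TREE_CHAIN.  */
};

/* Allocate a one-entry list node holding VALUE.  */
static struct node *
make_list_node (struct node *value)
{
  struct node *n = calloc (1, sizeof *n);
  assert (n);
  n->kind = KIND_LIST;
  n->value = value;
  return n;
}

/* Return CUR extended so that it covers NEXT.  CUR is NULL, a decl,
   or a list of decls; a list is modified in place.  */
static struct node *
leader_merge_model (struct node *cur, struct node *next)
{
  struct node *head, *tail;

  if (cur == NULL || cur == next)
    return next;

  if (cur->kind == KIND_LIST)
    {
      head = cur;
      /* Linear walk: return early if NEXT is already present,
	 otherwise stop at the last node so we can append there.  */
      for (tail = cur; ; tail = tail->chain)
	{
	  if (tail->value == next)
	    return cur;
	  if (!tail->chain)
	    break;
	}
    }
  else
    /* First merge of two distinct decls: wrap CUR in a list.  */
    head = tail = make_list_node (cur);

  tail->chain = make_list_node (next);

  /* Return the head, so that a freshly created list is not lost.  */
  return head;
}

/* Count the entries of LIST.  */
static size_t
list_length (const struct node *list)
{
  size_t n = 0;
  for (; list; list = list->chain)
    n++;
  return n;
}
```

Each merge into an existing list walks every entry, so merging n distinct decls into one location costs on the order of n squared comparisons in this model. The model never frees its nodes, which is acceptable only because it is a sketch.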
> +
> +
> +/* Return the RTL for the default SSA def of a PARM or RESULT, if
> + there is one. */
> +
> +rtx
> +get_rtl_for_parm_ssa_default_def (tree var)
> +{
> + gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL);
> +
> + if (!is_gimple_reg (var))
> + return NULL_RTX;
> +
> + /* If we've already determined RTL for the decl, use it. This is
> + not just an optimization: if VAR is a PARM whose incoming value
> + is unused, we won't find a default def to use its partition, but
> + we still want to use the location of the parm, if it was used at
> + all. During assign_parms, until a location is assigned for the
> + VAR, RTL can only be set for a parm or result if we're not coalescing
> + across variables, when we know we're coalescing all SSA_NAMEs of
> + each parm or result, and we're not coalescing them with names
> + pertaining to other variables, such as other parms' default
> + defs. */
> + if (DECL_RTL_SET_P (var))
> + {
> + gcc_assert (DECL_RTL (var) != pc_rtx);
> + return DECL_RTL (var);
> + }
> +
> + tree name = ssa_default_def (cfun, var);
> +
> + if (!name)
> + return NULL_RTX;
> +
> + int part = var_to_partition (SA.map, name);
> + if (part == NO_PARTITION)
> + return NULL_RTX;
> +
> + return SA.partition_to_pseudo[part];
> +}
> +
> /* Associate declaration T with storage space X. If T is no
> SSA name this is exactly SET_DECL_RTL, otherwise make the
> partition of T associated with X. */
> static inline void
> set_rtl (tree t, rtx x)
> {
> + if (x && SSAVAR (t))
> + {
> + bool skip = false;
> + tree cur = NULL_TREE;
> +
> + if (MEM_P (x))
> + cur = MEM_EXPR (x);
> + else if (REG_P (x))
> + cur = REG_EXPR (x);
> + else if (GET_CODE (x) == CONCAT
> + && REG_P (XEXP (x, 0)))
> + cur = REG_EXPR (XEXP (x, 0));
> + else if (GET_CODE (x) == PARALLEL)
> + cur = REG_EXPR (XVECEXP (x, 0, 0));
> + else if (x == pc_rtx)
> + skip = true;
> + else
> + gcc_unreachable ();
> +
> + tree next = skip ? cur : leader_merge (cur, SSAVAR (t));
> +
> + if (cur != next)
> + {
> + if (MEM_P (x))
> + set_mem_attributes (x, SSAVAR (t), true);
> + else
> + set_reg_attrs_for_decl_rtl (SSAVAR (t), x);
> + }
> + }
> +
> if (TREE_CODE (t) == SSA_NAME)
> {
> - SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x;
> - if (x && !MEM_P (x))
> - set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x);
> - /* For the benefit of debug information at -O0 (where vartracking
> - doesn't run) record the place also in the base DECL if it's
> - a normal variable (not a parameter). */
> - if (x && x != pc_rtx && TREE_CODE (SSA_NAME_VAR (t)) == VAR_DECL)
> + int part = var_to_partition (SA.map, t);
> + if (part != NO_PARTITION)
> + {
> + if (SA.partition_to_pseudo[part])
> + gcc_assert (SA.partition_to_pseudo[part] == x);
> + else
> + SA.partition_to_pseudo[part] = x;
> + }
> + /* For the benefit of debug information at -O0 (where
> + vartracking doesn't run) record the place also in the base
> + DECL. For PARMs and RESULTs, we may end up resetting these
> + in function.c:maybe_reset_rtl_for_parm, but in some rare
> + cases we may need them (unused and overwritten incoming
> + value, that at -O0 must share the location with the other
> + uses in spite of the missing default def), and this may be
> + the only chance to preserve them. */
> + if (x && x != pc_rtx && SSA_NAME_VAR (t))
> {
> tree var = SSA_NAME_VAR (t);
> /* If we don't yet have something recorded, just record it now. */
> @@ -909,7 +1025,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
> gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
>
> x = plus_constant (Pmode, base, offset);
> - x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x);
> + x = gen_rtx_MEM ((TREE_CODE (decl) != SSA_NAME || SSA_NAME_VAR (decl))
> + ? DECL_MODE (SSAVAR (decl))
> + : TYPE_MODE (TREE_TYPE (decl)), x);
>
> if (TREE_CODE (decl) != SSA_NAME)
> {
> @@ -931,7 +1049,6 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
> DECL_USER_ALIGN (decl) = 0;
> }
>
> - set_mem_attributes (x, SSAVAR (decl), true);
> set_rtl (decl, x);
> }
>
> @@ -1146,13 +1263,22 @@ account_stack_vars (void)
> to a variable to be allocated in the stack frame. */
>
> static void
> -expand_one_stack_var (tree var)
> +expand_one_stack_var_1 (tree var)
> {
> HOST_WIDE_INT size, offset;
> unsigned byte_align;
>
> - size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
> - byte_align = align_local_variable (SSAVAR (var));
> + if (TREE_CODE (var) != SSA_NAME || SSA_NAME_VAR (var))
> + {
> + size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
> + byte_align = align_local_variable (SSAVAR (var));
> + }
> + else
I'd go here for all TREE_CODE (var) == SSA_NAME (and get rid of
the SSAVAR macro?)
> + {
> + tree type = TREE_TYPE (var);
> + size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
> + byte_align = TYPE_ALIGN_UNIT (type);
> + }
>
> /* We handle highly aligned variables in expand_stack_vars. */
> gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
> @@ -1163,6 +1289,27 @@ expand_one_stack_var (tree var)
> crtl->max_used_stack_slot_alignment, offset);
> }
>
> +/* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are
> + already assigned some MEM. */
> +
> +static void
> +expand_one_stack_var (tree var)
> +{
> + if (TREE_CODE (var) == SSA_NAME)
> + {
> + int part = var_to_partition (SA.map, var);
> + if (part != NO_PARTITION)
> + {
> + rtx x = SA.partition_to_pseudo[part];
> + gcc_assert (x);
> + gcc_assert (MEM_P (x));
> + return;
> + }
> + }
> +
> + return expand_one_stack_var_1 (var);
> +}
> +
> /* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL
> that will reside in a hard register. */
>
> @@ -1172,12 +1319,112 @@ expand_one_hard_reg_var (tree var)
> rest_of_decl_compilation (var, 0, 0);
> }
>
> +/* Record the alignment requirements of some variable assigned to a
> + pseudo. */
> +
> +static void
> +record_alignment_for_reg_var (unsigned int align)
> +{
> + if (SUPPORTS_STACK_ALIGNMENT
> + && crtl->stack_alignment_estimated < align)
> + {
> + /* stack_alignment_estimated shouldn't change after stack
> + realign decision made */
> + gcc_assert (!crtl->stack_realign_processed);
> + crtl->stack_alignment_estimated = align;
> + }
> +
> + /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
> + So here we only make sure stack_alignment_needed >= align. */
> + if (crtl->stack_alignment_needed < align)
> + crtl->stack_alignment_needed = align;
> + if (crtl->max_used_stack_slot_alignment < align)
> + crtl->max_used_stack_slot_alignment = align;
> +}
> +
> +/* Create RTL for an SSA partition. */
> +
> +static void
> +expand_one_ssa_partition (tree var)
> +{
> + int part = var_to_partition (SA.map, var);
> + gcc_assert (part != NO_PARTITION);
> +
> + if (SA.partition_to_pseudo[part])
> + return;
> +
> + if (!use_register_for_decl (var))
> + {
> + expand_one_stack_var_1 (var);
> + return;
> + }
> +
> + unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (var),
> + TYPE_MODE (TREE_TYPE (var)),
> + TYPE_ALIGN (TREE_TYPE (var)));
> +
> + /* If the variable alignment is very large we'll dynamically allocate
> + it, which means that in-frame portion is just a pointer. */
> + if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
> + align = POINTER_SIZE;
> +
> + record_alignment_for_reg_var (align);
> +
> + machine_mode reg_mode = promote_ssa_mode (var, NULL);
> +
> + rtx x = gen_reg_rtx (reg_mode);
> +
> + set_rtl (var, x);
> +}
> +
> +/* Record the association between the RTL generated for a partition
> + and the underlying variable of the SSA_NAME. */
> +
> +static void
> +adjust_one_expanded_partition_var (tree var)
> +{
> + if (!var)
> + return;
> +
> + tree decl = SSA_NAME_VAR (var);
> +
> + int part = var_to_partition (SA.map, var);
> + if (part == NO_PARTITION)
> + return;
> +
> + rtx x = SA.partition_to_pseudo[part];
> +
> + set_rtl (var, x);
> +
> + if (!REG_P (x))
> + return;
> +
> + /* Note if the object is a user variable. */
> + if (decl && !DECL_ARTIFICIAL (decl))
> + mark_user_reg (x);
> +
> + if (POINTER_TYPE_P (decl ? TREE_TYPE (decl) : TREE_TYPE (var)))
> + mark_reg_pointer (x, get_pointer_alignment (var));
> +}
> +
> /* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL
> that will reside in a pseudo register. */
>
> static void
> expand_one_register_var (tree var)
> {
> + if (TREE_CODE (var) == SSA_NAME)
> + {
> + int part = var_to_partition (SA.map, var);
> + if (part != NO_PARTITION)
> + {
> + rtx x = SA.partition_to_pseudo[part];
> + gcc_assert (x);
> + gcc_assert (REG_P (x));
> + return;
> + }
> + }
> +
> tree decl = SSAVAR (var);
> tree type = TREE_TYPE (decl);
> machine_mode reg_mode = promote_decl_mode (decl, NULL);
> @@ -1312,21 +1559,7 @@ expand_one_var (tree var, bool toplevel, bool really_expand)
> align = POINTER_SIZE;
> }
>
> - if (SUPPORTS_STACK_ALIGNMENT
> - && crtl->stack_alignment_estimated < align)
> - {
> - /* stack_alignment_estimated shouldn't change after stack
> - realign decision made */
> - gcc_assert (!crtl->stack_realign_processed);
> - crtl->stack_alignment_estimated = align;
> - }
> -
> - /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
> - So here we only make sure stack_alignment_needed >= align. */
> - if (crtl->stack_alignment_needed < align)
> - crtl->stack_alignment_needed = align;
> - if (crtl->max_used_stack_slot_alignment < align)
> - crtl->max_used_stack_slot_alignment = align;
> + record_alignment_for_reg_var (align);
>
> if (TREE_CODE (origvar) == SSA_NAME)
> {
> @@ -1760,48 +1993,18 @@ expand_used_vars (void)
> if (targetm.use_pseudo_pic_reg ())
> pic_offset_table_rtx = gen_reg_rtx (Pmode);
>
> - hash_map<tree, tree> ssa_name_decls;
> for (i = 0; i < SA.map->num_partitions; i++)
> {
> tree var = partition_to_var (SA.map, i);
>
> gcc_assert (!virtual_operand_p (var));
>
> - /* Assign decls to each SSA name partition, share decls for partitions
> - we could have coalesced (those with the same type). */
> - if (SSA_NAME_VAR (var) == NULL_TREE)
> - {
> - tree *slot = &ssa_name_decls.get_or_insert (TREE_TYPE (var));
> - if (!*slot)
> - *slot = create_tmp_reg (TREE_TYPE (var));
> - replace_ssa_name_symbol (var, *slot);
> - }
> -
> - /* Always allocate space for partitions based on VAR_DECLs. But for
> - those based on PARM_DECLs or RESULT_DECLs and which matter for the
> - debug info, there is no need to do so if optimization is disabled
> - because all the SSA_NAMEs based on these DECLs have been coalesced
> - into a single partition, which is thus assigned the canonical RTL
> - location of the DECLs. If in_lto_p, we can't rely on optimize,
> - a function could be compiled with -O1 -flto first and only the
> - link performed at -O0. */
> - if (TREE_CODE (SSA_NAME_VAR (var)) == VAR_DECL)
> - expand_one_var (var, true, true);
> - else if (DECL_IGNORED_P (SSA_NAME_VAR (var)) || optimize || in_lto_p)
> - {
> - /* This is a PARM_DECL or RESULT_DECL. For those partitions that
> - contain the default def (representing the parm or result itself)
> - we don't do anything here. But those which don't contain the
> - default def (representing a temporary based on the parm/result)
> - we need to allocate space just like for normal VAR_DECLs. */
> - if (!bitmap_bit_p (SA.partition_has_default_def, i))
> - {
> - expand_one_var (var, true, true);
> - gcc_assert (SA.partition_to_pseudo[i]);
> - }
> - }
> + expand_one_ssa_partition (var);
> }
>
> + for (i = 1; i < num_ssa_names; i++)
> + adjust_one_expanded_partition_var (ssa_name (i));
> +
> if (flag_stack_protect == SPCT_FLAG_STRONG)
> gen_stack_protect_signal
> = stack_protect_decl_p () || stack_protect_return_slot_p ();
> @@ -6033,35 +6236,6 @@ pass_expand::execute (function *fun)
> parm_birth_insn = var_seq;
> }
>
> - /* Now that we also have the parameter RTXs, copy them over to our
> - partitions. */
> - for (i = 0; i < SA.map->num_partitions; i++)
> - {
> - tree var = SSA_NAME_VAR (partition_to_var (SA.map, i));
> -
> - if (TREE_CODE (var) != VAR_DECL
> - && !SA.partition_to_pseudo[i])
> - SA.partition_to_pseudo[i] = DECL_RTL_IF_SET (var);
> - gcc_assert (SA.partition_to_pseudo[i]);
> -
> - /* If this decl was marked as living in multiple places, reset
> - this now to NULL. */
> - if (DECL_RTL_IF_SET (var) == pc_rtx)
> - SET_DECL_RTL (var, NULL);
> -
> - /* Some RTL parts really want to look at DECL_RTL(x) when x
> - was a decl marked in REG_ATTR or MEM_ATTR. We could use
> - SET_DECL_RTL here making this available, but that would mean
> - to select one of the potentially many RTLs for one DECL. Instead
> - of doing that we simply reset the MEM_EXPR of the RTL in question,
> - then nobody can get at it and hence nobody can call DECL_RTL on it. */
> - if (!DECL_RTL_SET_P (var))
> - {
> - if (MEM_P (SA.partition_to_pseudo[i]))
> - set_mem_expr (SA.partition_to_pseudo[i], NULL);
> - }
> - }
> -
> /* If we have a class containing differently aligned pointers
> we need to merge those into the corresponding RTL pointer
> alignment. */
> @@ -6069,7 +6243,6 @@ pass_expand::execute (function *fun)
> {
> tree name = ssa_name (i);
> int part;
> - rtx r;
>
> if (!name
> /* We might have generated new SSA names in
> @@ -6082,20 +6255,24 @@ pass_expand::execute (function *fun)
> if (part == NO_PARTITION)
> continue;
>
> - /* Adjust all partition members to get the underlying decl of
> - the representative which we might have created in expand_one_var. */
> - if (SSA_NAME_VAR (name) == NULL_TREE)
> + gcc_assert (SA.partition_to_pseudo[part]);
> +
> + /* If this decl was marked as living in multiple places, reset
> + this now to NULL. */
> + tree var = SSA_NAME_VAR (name);
> + if (var && DECL_RTL_IF_SET (var) == pc_rtx)
> + SET_DECL_RTL (var, NULL);
> + /* Check that the pseudos chosen by assign_parms are those of
> + the corresponding default defs. */
> + else if (SSA_NAME_IS_DEFAULT_DEF (name)
> + && (TREE_CODE (var) == PARM_DECL
> + || TREE_CODE (var) == RESULT_DECL))
> {
> - tree leader = partition_to_var (SA.map, part);
> - gcc_assert (SSA_NAME_VAR (leader) != NULL_TREE);
> - replace_ssa_name_symbol (name, SSA_NAME_VAR (leader));
> + rtx in = DECL_RTL_IF_SET (var);
> + gcc_assert (in);
> + rtx out = SA.partition_to_pseudo[part];
> + gcc_assert (in == out || rtx_equal_p (in, out));
> }
> - if (!POINTER_TYPE_P (TREE_TYPE (name)))
> - continue;
> -
> - r = SA.partition_to_pseudo[part];
> - if (REG_P (r))
> - mark_reg_pointer (r, get_pointer_alignment (name));
> }
>
> /* If this function is `main', emit a call to `__main'
> diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
> index a0b6e3e..602579d 100644
> --- a/gcc/cfgexpand.h
> +++ b/gcc/cfgexpand.h
> @@ -22,5 +22,7 @@ along with GCC; see the file COPYING3. If not see
>
> extern tree gimple_assign_rhs_to_tree (gimple);
> extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
> +extern rtx get_rtl_for_parm_ssa_default_def (tree var);
> +
>
> #endif /* GCC_CFGEXPAND_H */
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 380848c..2cdbea1 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2212,16 +2212,16 @@ Common Report Var(flag_tree_ch) Optimization
> Enable loop header copying on trees
>
> ftree-coalesce-inlined-vars
> -Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization
> -Enable coalescing of copy-related user variables that are inlined
> +Common Ignore RejectNegative
> +Does nothing. Preserved for backward compatibility.
>
> ftree-coalesce-vars
> -Common Report Var(flag_ssa_coalesce_vars,2) Optimization
> -Enable coalescing of all copy-related user variables
> +Common Report Var(flag_tree_coalesce_vars) Optimization
> +Enable SSA coalescing of user variables
>
> ftree-copyrename
> -Common Report Var(flag_tree_copyrename) Optimization
> -Replace SSA temporaries with better names in copies
> +Common Ignore
> +Does nothing. Preserved for backward compatibility.
>
> ftree-copy-prop
> Common Report Var(flag_tree_copy_prop) Optimization
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index c20dd4d..0a3b930 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -337,7 +337,6 @@ Objective-C and Objective-C++ Dialects}.
> -fdump-tree-phiprop@r{[}-@var{n}@r{]} @gol
> -fdump-tree-phiopt@r{[}-@var{n}@r{]} @gol
> -fdump-tree-forwprop@r{[}-@var{n}@r{]} @gol
> --fdump-tree-copyrename@r{[}-@var{n}@r{]} @gol
> -fdump-tree-nrv -fdump-tree-vect @gol
> -fdump-tree-sink @gol
> -fdump-tree-sra@r{[}-@var{n}@r{]} @gol
> @@ -443,9 +442,8 @@ Objective-C and Objective-C++ Dialects}.
> -fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol
> -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
> -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
> --ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
> --ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
> --ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
> +-ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol
> +-ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
> -ftree-loop-if-convert-stores -ftree-loop-im @gol
> -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
> -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
> @@ -6989,11 +6987,6 @@ name is made by appending @file{.phiopt} to the source file name.
> Dump each function after forward propagating single use variables. The file
> name is made by appending @file{.forwprop} to the source file name.
>
> -@item copyrename
> -@opindex fdump-tree-copyrename
> -Dump each function after applying the copy rename optimization. The file
> -name is made by appending @file{.copyrename} to the source file name.
> -
> @item nrv
> @opindex fdump-tree-nrv
> Dump each function after applying the named return value optimization on
> @@ -7458,8 +7451,8 @@ compilation time.
> -ftree-ccp @gol
> -fssa-phiopt @gol
> -ftree-ch @gol
> +-ftree-coalesce-vars @gol
> -ftree-copy-prop @gol
> --ftree-copyrename @gol
> -ftree-dce @gol
> -ftree-dominator-opts @gol
> -ftree-dse @gol
> @@ -8724,6 +8717,15 @@ profitable to parallelize the loops.
> Compare the results of several data dependence analyzers. This option
> is used for debugging the data dependence analyzers.
>
> +@item -ftree-coalesce-vars
> +@opindex ftree-coalesce-vars
> +Tell the compiler to attempt to combine small user-defined variables
> +too, instead of just compiler temporaries. This may severely limit the
> +ability to debug an optimized program compiled with
> +@option{-fno-var-tracking-assignments}. In the negated form, this flag
> +prevents SSA coalescing of user variables. This option is enabled by
> +default if optimization is enabled.
> +
> @item -ftree-loop-if-convert
> @opindex ftree-loop-if-convert
> Attempt to transform conditional jumps in the innermost loops to
> @@ -8837,32 +8839,6 @@ Perform scalar replacement of aggregates. This pass replaces structure
> references with scalars to prevent committing structures to memory too
> early. This flag is enabled by default at @option{-O} and higher.
>
> -@item -ftree-copyrename
> -@opindex ftree-copyrename
> -Perform copy renaming on trees. This pass attempts to rename compiler
> -temporaries to other variables at copy locations, usually resulting in
> -variable names which more closely resemble the original variables. This flag
> -is enabled by default at @option{-O} and higher.
> -
> -@item -ftree-coalesce-inlined-vars
> -@opindex ftree-coalesce-inlined-vars
> -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
> -combine small user-defined variables too, but only if they are inlined
> -from other functions. It is a more limited form of
> -@option{-ftree-coalesce-vars}. This may harm debug information of such
> -inlined variables, but it keeps variables of the inlined-into
> -function apart from each other, such that they are more likely to
> -contain the expected values in a debugging session.
> -
> -@item -ftree-coalesce-vars
> -@opindex ftree-coalesce-vars
> -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
> -combine small user-defined variables too, instead of just compiler
> -temporaries. This may severely limit the ability to debug an optimized
> -program compiled with @option{-fno-var-tracking-assignments}. In the
> -negated form, this flag prevents SSA coalescing of user variables,
> -including inlined ones. This option is enabled by default.
> -
> @item -ftree-ter
> @opindex ftree-ter
> Perform temporary expression replacement during the SSA->normal phase. Single
> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
> index b8dc7d5..ef31ba0f 100644
> --- a/gcc/emit-rtl.c
> +++ b/gcc/emit-rtl.c
> @@ -1229,6 +1229,11 @@ set_reg_attrs_for_parm (rtx parm_rtx, rtx mem)
> void
> set_reg_attrs_for_decl_rtl (tree t, rtx x)
> {
> + if (!t)
> + return;
> + tree tdecl = t;
> + if (TREE_CODE (t) == TREE_LIST)
> + tdecl = TREE_VALUE (t);
So it only uses the "first" entry?...
> if (GET_CODE (x) == SUBREG)
> {
> gcc_assert (subreg_lowpart_p (x));
> @@ -1237,7 +1242,7 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x)
> if (REG_P (x))
> REG_ATTRS (x)
> = get_reg_attrs (t, byte_lowpart_offset (GET_MODE (x),
> - DECL_MODE (t)));
> + DECL_MODE (tdecl)));
> if (GET_CODE (x) == CONCAT)
> {
> if (REG_P (XEXP (x, 0)))
> diff --git a/gcc/explow.c b/gcc/explow.c
> index de446a9..b53a3b7 100644
> --- a/gcc/explow.c
> +++ b/gcc/explow.c
> @@ -854,6 +854,31 @@ promote_decl_mode (const_tree decl, int *punsignedp)
> return pmode;
> }
>
> +/* Return the promoted mode for name. If it is a named SSA_NAME, it
> + is the same as promote_decl_mode. Otherwise, it is the promoted
> + mode of a temp decl of same type as the SSA_NAME, if we had created
> + one. */
> +
> +machine_mode
> +promote_ssa_mode (const_tree name, int *punsignedp)
> +{
> + gcc_assert (TREE_CODE (name) == SSA_NAME);
> +
> + if (SSA_NAME_VAR (name))
> + return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
As above I'd rather not have different paths for anonymous vs. non-anonymous
vars (so just delete the above two lines).
> + tree type = TREE_TYPE (name);
> + int unsignedp = TYPE_UNSIGNED (type);
> + machine_mode mode = TYPE_MODE (type);
> +
> + machine_mode pmode = promote_mode (type, mode, &unsignedp);
> + if (punsignedp)
> + *punsignedp = unsignedp;
> +
> + return pmode;
> +}
> +
> +
>
> /* Controls the behaviour of {anti_,}adjust_stack. */
> static bool suppress_reg_args_size;
> diff --git a/gcc/explow.h b/gcc/explow.h
> index 48f1859..7b11e46 100644
> --- a/gcc/explow.h
> +++ b/gcc/explow.h
> @@ -57,6 +57,9 @@ extern machine_mode promote_mode (const_tree, machine_mode, int *);
> /* Return mode and signedness to use when object is promoted. */
> machine_mode promote_decl_mode (const_tree, int *);
>
> +/* Return mode and signedness to use when object is promoted. */
> +machine_mode promote_ssa_mode (const_tree, int *);
> +
> /* Remove some bytes from the stack. An rtx says how many. */
> extern void adjust_stack (rtx);
>
> diff --git a/gcc/expr.c b/gcc/expr.c
> index 530a944..95a9bab 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -9388,7 +9388,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
> rtx op0, op1, temp, decl_rtl;
> tree type;
> int unsignedp;
> - machine_mode mode;
> + machine_mode mode, dmode;
> enum tree_code code = TREE_CODE (exp);
> rtx subtarget, original_target;
> int ignore;
> @@ -9519,7 +9519,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
> if (g == NULL
> && modifier == EXPAND_INITIALIZER
> && !SSA_NAME_IS_DEFAULT_DEF (exp)
> - && (optimize || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
> + && (optimize || !SSA_NAME_VAR (exp)
> + || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
> && stmt_is_replaceable_p (SSA_NAME_DEF_STMT (exp)))
> g = SSA_NAME_DEF_STMT (exp);
> if (g)
> @@ -9598,15 +9599,18 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
> /* Ensure variable marked as used even if it doesn't go through
> a parser. If it hasn't be used yet, write out an external
> definition. */
> - TREE_USED (exp) = 1;
> + if (exp)
> + TREE_USED (exp) = 1;
>
> /* Show we haven't gotten RTL for this yet. */
> temp = 0;
>
> /* Variables inherited from containing functions should have
> been lowered by this point. */
> - context = decl_function_context (exp);
> - gcc_assert (SCOPE_FILE_SCOPE_P (context)
> + if (exp)
> + context = decl_function_context (exp);
> + gcc_assert (!exp
> + || SCOPE_FILE_SCOPE_P (context)
> || context == current_function_decl
> || TREE_STATIC (exp)
> || DECL_EXTERNAL (exp)
> @@ -9630,7 +9634,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
> decl_rtl = use_anchored_address (decl_rtl);
> if (modifier != EXPAND_CONST_ADDRESS
> && modifier != EXPAND_SUM
> - && !memory_address_addr_space_p (DECL_MODE (exp),
> + && !memory_address_addr_space_p (exp ? DECL_MODE (exp)
> + : GET_MODE (decl_rtl),
> XEXP (decl_rtl, 0),
> MEM_ADDR_SPACE (decl_rtl)))
> temp = replace_equiv_address (decl_rtl,
> @@ -9641,12 +9646,17 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
> if the address is a register. */
> if (temp != 0)
> {
> - if (MEM_P (temp) && REG_P (XEXP (temp, 0)))
> + if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0)))
> mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp));
>
> return temp;
> }
>
> + if (exp)
> + dmode = DECL_MODE (exp);
> + else
> + dmode = TYPE_MODE (TREE_TYPE (ssa_name));
> +
> /* If the mode of DECL_RTL does not match that of the decl,
> there are two cases: we are dealing with a BLKmode value
> that is returned in a register, or we are dealing with
> @@ -9654,8 +9664,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
> of the wanted mode, but mark it so that we know that it
> was already extended. */
> if (REG_P (decl_rtl)
> - && DECL_MODE (exp) != BLKmode
> - && GET_MODE (decl_rtl) != DECL_MODE (exp))
> + && dmode != BLKmode
> + && GET_MODE (decl_rtl) != dmode)
> {
> machine_mode pmode;
>
> @@ -9668,6 +9678,11 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
> pmode = promote_function_mode (type, mode, &unsignedp,
> gimple_call_fntype (g),
> 2);
> + else if (!exp)
> + {
> + gcc_assert (code == SSA_NAME);
promote_ssa_mode should be the one to assert this (it already checks
TREE_CODE (name) == SSA_NAME), so the assert here can go.
> + pmode = promote_ssa_mode (ssa_name, &unsignedp);
> + }
> else
> pmode = promote_decl_mode (exp, &unsignedp);
> gcc_assert (GET_MODE (decl_rtl) == pmode);
> diff --git a/gcc/function.c b/gcc/function.c
> index 7d4df92..1f5296e 100644
> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -87,6 +87,7 @@ along with GCC; see the file COPYING3. If not see
> #include "cfganal.h"
> #include "cfgbuild.h"
> #include "cfgcleanup.h"
> +#include "cfgexpand.h"
> #include "basic-block.h"
> #include "df.h"
> #include "params.h"
> @@ -2121,6 +2122,15 @@ aggregate_value_p (const_tree exp, const_tree fntype)
> bool
> use_register_for_decl (const_tree decl)
> {
> + if (TREE_CODE (decl) == SSA_NAME)
> + {
> + if (!SSA_NAME_VAR (decl))
> + return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
> + && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
> +
> + decl = SSA_NAME_VAR (decl);
See above. Please drop the SSA_NAME_VAR != NULL path.
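If I read the hunk right, dropping that path leaves the SSA_NAME case
deciding on the type alone. A minimal standalone sketch of that predicate
follows; the struct and the function name are hypothetical, and the two
booleans stand in for the TYPE_MODE (...) != BLKmode and FLOAT_TYPE_P
tests in the quoted code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of the two type properties the quoted test uses.  */
struct ssa_type_model
{
  bool is_blkmode;	/* models TYPE_MODE (type) == BLKmode */
  bool is_float;	/* models FLOAT_TYPE_P (type) */
};

/* Register unless the value is BLKmode, or it is floating point and
   -ffloat-store is in effect.  No SSA_NAME_VAR lookup is involved.  */
bool
use_register_for_ssa_name_model (const struct ssa_type_model *type,
				 bool float_store)
{
  return !type->is_blkmode && !(float_store && type->is_float);
}
```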
> + }
> +
> if (!targetm.calls.allocate_stack_slots_for_args ())
> return true;
>
> @@ -2804,23 +2814,88 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data)
> data->entry_parm = entry_parm;
> }
>
> +/* Wrapper for use_register_for_decl, that special-cases the
> + .result_ptr as the function's RESULT_DECL when the RESULT_DECL is
> + passed by reference. */
> +
> +static bool
> +use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm)
> +{
> + if (parm == all->function_result_decl)
> + {
> + tree result = DECL_RESULT (current_function_decl);
> +
> + if (DECL_BY_REFERENCE (result))
> + parm = result;
> + }
> +
> + return use_register_for_decl (parm);
> +}
> +
> +/* Wrapper for get_rtl_for_parm_ssa_default_def, that special-cases
> + the .result_ptr as the function's RESULT_DECL when the RESULT_DECL
> + is passed by reference. */
> +
> +static rtx
> +rtl_for_parm (struct assign_parm_data_all *all, tree parm)
> +{
> + if (parm == all->function_result_decl)
> + {
> + tree result = DECL_RESULT (current_function_decl);
> +
> + if (!DECL_BY_REFERENCE (result))
> + return NULL_RTX;
> +
> + parm = result;
> + }
> +
> + return get_rtl_for_parm_ssa_default_def (parm);
> +}
> +
> +/* Reset the location of PARM_DECLs and RESULT_DECLs that had
> + SSA_NAMEs in multiple partitions, so that assign_parms will choose
> + the default def, if it exists, or create new RTL to hold the unused
> + entry value. If we are coalescing across variables, we want to
> + reset the location too, because a parm without a default def
> + (incoming value unused) might be coalesced with one with a default
> + def, and then assign_parms would copy both incoming values to the
> + same location, which might cause the wrong value to survive. */
> +static void
> +maybe_reset_rtl_for_parm (tree parm)
> +{
> + gcc_assert (TREE_CODE (parm) == PARM_DECL
> + || TREE_CODE (parm) == RESULT_DECL);
> + if ((flag_tree_coalesce_vars
> + || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx))
> + && is_gimple_reg (parm))
> + SET_DECL_RTL (parm, NULL_RTX);
> +}
> +
> /* A subroutine of assign_parms. Adjust DATA->STACK_RTL such that it's
> always valid and properly aligned. */
>
> static void
> -assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data)
> +assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm,
> + struct assign_parm_data_one *data)
> {
> rtx stack_parm = data->stack_parm;
>
> + /* If out-of-SSA assigned RTL to the parm default def, make sure we
> + don't use what we might have computed before. */
> + rtx ssa_assigned = rtl_for_parm (all, parm);
> + if (ssa_assigned)
> + stack_parm = NULL;
> +
> /* If we can't trust the parm stack slot to be aligned enough for its
> ultimate type, don't use that slot after entry. We'll make another
> stack slot, if we need one. */
> - if (stack_parm
> - && ((STRICT_ALIGNMENT
> - && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm))
> - || (data->nominal_type
> - && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
> - && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
> + else if (stack_parm
> + && ((STRICT_ALIGNMENT
> + && (GET_MODE_ALIGNMENT (data->nominal_mode)
> + > MEM_ALIGN (stack_parm)))
> + || (data->nominal_type
> + && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
> + && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
> stack_parm = NULL;
>
> /* If parm was passed in memory, and we need to convert it on entry,
> @@ -2882,11 +2957,16 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>
> size = int_size_in_bytes (data->passed_type);
> size_stored = CEIL_ROUND (size, UNITS_PER_WORD);
> +
> if (stack_parm == 0)
> {
> DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
> - stack_parm = assign_stack_local (BLKmode, size_stored,
> - DECL_ALIGN (parm));
> + stack_parm = rtl_for_parm (all, parm);
> + if (!stack_parm)
> + stack_parm = assign_stack_local (BLKmode, size_stored,
> + DECL_ALIGN (parm));
> + else
> + stack_parm = copy_rtx (stack_parm);
> if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
> PUT_MODE (stack_parm, GET_MODE (entry_parm));
> set_mem_attributes (stack_parm, parm, 1);
> @@ -3027,10 +3107,19 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
> = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
> TREE_TYPE (current_function_decl), 2);
>
> - parmreg = gen_reg_rtx (promoted_nominal_mode);
> + rtx from_expand = rtl_for_parm (all, parm);
>
> - if (!DECL_ARTIFICIAL (parm))
> - mark_user_reg (parmreg);
> + if (from_expand && !data->passed_pointer)
> + {
> + parmreg = from_expand;
> + gcc_assert (GET_MODE (parmreg) == promoted_nominal_mode);
> + }
> + else
> + {
> + parmreg = gen_reg_rtx (promoted_nominal_mode);
> + if (!DECL_ARTIFICIAL (parm))
> + mark_user_reg (parmreg);
> + }
>
> /* If this was an item that we received a pointer to,
> set DECL_RTL appropriately. */
> @@ -3049,6 +3138,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
> assign_parm_find_data_types and expand_expr_real_1. */
>
> equiv_stack_parm = data->stack_parm;
> + if (!equiv_stack_parm)
> + equiv_stack_parm = data->entry_parm;
> validated_mem = validize_mem (copy_rtx (data->entry_parm));
>
> need_conversion = (data->nominal_mode != data->passed_mode
> @@ -3189,11 +3280,17 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>
> /* If we were passed a pointer but the actual value can safely live
> in a register, retrieve it and use it directly. */
> - if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode)
> + if (data->passed_pointer
> + && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode))
> {
> /* We can't use nominal_mode, because it will have been set to
> Pmode above. We must use the actual mode of the parm. */
> - if (use_register_for_decl (parm))
> + if (from_expand)
> + {
> + parmreg = from_expand;
> + gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm)));
> + }
> + else if (use_register_for_decl (parm))
> {
> parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm)));
> mark_user_reg (parmreg);
> @@ -3233,7 +3330,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>
> /* STACK_PARM is the pointer, not the parm, and PARMREG is
> now the parm. */
> - data->stack_parm = NULL;
> + data->stack_parm = equiv_stack_parm = NULL;
> }
>
> /* Mark the register as eliminable if we did no conversion and it was
> @@ -3243,11 +3340,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
> make here would screw up life analysis for it. */
> if (data->nominal_mode == data->passed_mode
> && !did_conversion
> - && data->stack_parm != 0
> - && MEM_P (data->stack_parm)
> + && equiv_stack_parm != 0
> + && MEM_P (equiv_stack_parm)
> && data->locate.offset.var == 0
> && reg_mentioned_p (virtual_incoming_args_rtx,
> - XEXP (data->stack_parm, 0)))
> + XEXP (equiv_stack_parm, 0)))
> {
> rtx_insn *linsn = get_last_insn ();
> rtx_insn *sinsn;
> @@ -3260,8 +3357,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
> = GET_MODE_INNER (GET_MODE (parmreg));
> int regnor = REGNO (XEXP (parmreg, 0));
> int regnoi = REGNO (XEXP (parmreg, 1));
> - rtx stackr = adjust_address_nv (data->stack_parm, submode, 0);
> - rtx stacki = adjust_address_nv (data->stack_parm, submode,
> + rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0);
> + rtx stacki = adjust_address_nv (equiv_stack_parm, submode,
> GET_MODE_SIZE (submode));
>
> /* Scan backwards for the set of the real and
> @@ -3334,6 +3431,13 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
>
> if (data->stack_parm == 0)
> {
> + rtx x = data->stack_parm = rtl_for_parm (all, parm);
> + if (x)
> + gcc_assert (GET_MODE (x) == GET_MODE (data->entry_parm));
> + }
> +
> + if (data->stack_parm == 0)
> + {
> int align = STACK_SLOT_ALIGNMENT (data->passed_type,
> GET_MODE (data->entry_parm),
> TYPE_ALIGN (data->passed_type));
> @@ -3592,6 +3696,8 @@ assign_parms (tree fndecl)
> DECL_INCOMING_RTL (parm) = DECL_RTL (parm);
> continue;
> }
> + else
> + maybe_reset_rtl_for_parm (parm);
>
> /* Estimate stack alignment from parameter alignment. */
> if (SUPPORTS_STACK_ALIGNMENT)
> @@ -3641,7 +3747,9 @@ assign_parms (tree fndecl)
> else
> set_decl_incoming_rtl (parm, data.entry_parm, false);
>
> - /* Boudns should be loaded in the particular order to
> + assign_parm_adjust_stack_rtl (&all, parm, &data);
> +
> + /* Bounds should be loaded in the particular order to
> have registers allocated correctly. Collect info about
> input bounds and load them later. */
> if (POINTER_BOUNDS_TYPE_P (data.passed_type))
> @@ -3658,11 +3766,10 @@ assign_parms (tree fndecl)
> }
> else
> {
> - assign_parm_adjust_stack_rtl (&data);
> -
> if (assign_parm_setup_block_p (&data))
> assign_parm_setup_block (&all, parm, &data);
> - else if (data.passed_pointer || use_register_for_decl (parm))
> + else if (data.passed_pointer
> + || use_register_for_parm_decl (&all, parm))
> assign_parm_setup_reg (&all, parm, &data);
> else
> assign_parm_setup_stack (&all, parm, &data);
> @@ -5001,7 +5108,9 @@ expand_function_start (tree subr)
> before any library calls that assign parms might generate. */
>
> /* Decide whether to return the value in memory or in a register. */
> - if (aggregate_value_p (DECL_RESULT (subr), subr))
> + tree res = DECL_RESULT (subr);
> + maybe_reset_rtl_for_parm (res);
> + if (aggregate_value_p (res, subr))
> {
> /* Returning something that won't go in a register. */
> rtx value_address = 0;
> @@ -5009,7 +5118,7 @@ expand_function_start (tree subr)
> #ifdef PCC_STATIC_STRUCT_RETURN
> if (cfun->returns_pcc_struct)
> {
> - int size = int_size_in_bytes (TREE_TYPE (DECL_RESULT (subr)));
> + int size = int_size_in_bytes (TREE_TYPE (res));
> value_address = assemble_static_space (size);
> }
> else
> @@ -5021,36 +5130,45 @@ expand_function_start (tree subr)
> it. */
> if (sv)
> {
> - value_address = gen_reg_rtx (Pmode);
> + if (DECL_BY_REFERENCE (res))
> + value_address = get_rtl_for_parm_ssa_default_def (res);
> + if (!value_address)
> + value_address = gen_reg_rtx (Pmode);
> emit_move_insn (value_address, sv);
> }
> }
> if (value_address)
> {
> rtx x = value_address;
> - if (!DECL_BY_REFERENCE (DECL_RESULT (subr)))
> + if (!DECL_BY_REFERENCE (res))
> {
> - x = gen_rtx_MEM (DECL_MODE (DECL_RESULT (subr)), x);
> - set_mem_attributes (x, DECL_RESULT (subr), 1);
> + x = get_rtl_for_parm_ssa_default_def (res);
> + if (!x)
> + {
> + x = gen_rtx_MEM (DECL_MODE (res), value_address);
> + set_mem_attributes (x, res, 1);
> + }
> }
> - SET_DECL_RTL (DECL_RESULT (subr), x);
> + SET_DECL_RTL (res, x);
> }
> }
> - else if (DECL_MODE (DECL_RESULT (subr)) == VOIDmode)
> + else if (DECL_MODE (res) == VOIDmode)
> /* If return mode is void, this decl rtl should not be used. */
> - SET_DECL_RTL (DECL_RESULT (subr), NULL_RTX);
> + SET_DECL_RTL (res, NULL_RTX);
> else
> {
> /* Compute the return values into a pseudo reg, which we will copy
> into the true return register after the cleanups are done. */
> - tree return_type = TREE_TYPE (DECL_RESULT (subr));
> - if (TYPE_MODE (return_type) != BLKmode
> - && targetm.calls.return_in_msb (return_type))
> + tree return_type = TREE_TYPE (res);
> + rtx x = get_rtl_for_parm_ssa_default_def (res);
> + if (x)
> + /* Use it. */;
> + else if (TYPE_MODE (return_type) != BLKmode
> + && targetm.calls.return_in_msb (return_type))
> /* expand_function_end will insert the appropriate padding in
> this case. Use the return value's natural (unpadded) mode
> within the function proper. */
> - SET_DECL_RTL (DECL_RESULT (subr),
> - gen_reg_rtx (TYPE_MODE (return_type)));
> + x = gen_reg_rtx (TYPE_MODE (return_type));
> else
> {
> /* In order to figure out what mode to use for the pseudo, we
> @@ -5061,25 +5179,26 @@ expand_function_start (tree subr)
> /* Structures that are returned in registers are not
> aggregate_value_p, so we may see a PARALLEL or a REG. */
> if (REG_P (hard_reg))
> - SET_DECL_RTL (DECL_RESULT (subr),
> - gen_reg_rtx (GET_MODE (hard_reg)));
> + x = gen_reg_rtx (GET_MODE (hard_reg));
> else
> {
> gcc_assert (GET_CODE (hard_reg) == PARALLEL);
> - SET_DECL_RTL (DECL_RESULT (subr), gen_group_rtx (hard_reg));
> + x = gen_group_rtx (hard_reg);
> }
> }
>
> + SET_DECL_RTL (res, x);
> +
> /* Set DECL_REGISTER flag so that expand_function_end will copy the
> result to the real return register(s). */
> - DECL_REGISTER (DECL_RESULT (subr)) = 1;
> + DECL_REGISTER (res) = 1;
>
> if (chkp_function_instrumented_p (current_function_decl))
> {
> - tree return_type = TREE_TYPE (DECL_RESULT (subr));
> + tree return_type = TREE_TYPE (res);
> rtx bounds = targetm.calls.chkp_function_value_bounds (return_type,
> subr, 1);
> - SET_DECL_BOUNDS_RTL (DECL_RESULT (subr), bounds);
> + SET_DECL_BOUNDS_RTL (res, bounds);
> }
> }
>
> @@ -5093,7 +5212,9 @@ expand_function_start (tree subr)
> tree parm = cfun->static_chain_decl;
> rtx local, chain, insn;
>
> - local = gen_reg_rtx (Pmode);
> + local = get_rtl_for_parm_ssa_default_def (parm);
> + if (!local)
> + local = gen_reg_rtx (Pmode);
> chain = targetm.calls.static_chain (current_function_decl, true);
>
> set_decl_incoming_rtl (parm, chain, false);
> diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
> index efc93b7..e29f300 100644
> --- a/gcc/gimple-expr.c
> +++ b/gcc/gimple-expr.c
> @@ -391,45 +391,6 @@ copy_var_decl (tree var, tree name, tree type)
> return copy;
> }
>
> -/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
> - coalescing together, false otherwise.
> -
> - This must stay consistent with var_map_base_init in tree-ssa-live.c. */
> -
> -bool
> -gimple_can_coalesce_p (tree name1, tree name2)
> -{
> - /* First check the SSA_NAME's associated DECL. We only want to
> - coalesce if they have the same DECL or both have no associated DECL. */
> - tree var1 = SSA_NAME_VAR (name1);
> - tree var2 = SSA_NAME_VAR (name2);
> - var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
> - var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
> - if (var1 != var2)
> - return false;
> -
> - /* Now check the types. If the types are the same, then we should
> - try to coalesce V1 and V2. */
> - tree t1 = TREE_TYPE (name1);
> - tree t2 = TREE_TYPE (name2);
> - if (t1 == t2)
> - return true;
> -
> - /* If the types are not the same, check for a canonical type match. This
> - (for example) allows coalescing when the types are fundamentally the
> - same, but just have different names.
> -
> - Note pointer types with different address spaces may have the same
> - canonical type. Those are rejected for coalescing by the
> - types_compatible_p check. */
> - if (TYPE_CANONICAL (t1)
> - && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
> - && types_compatible_p (t1, t2))
> - return true;
> -
> - return false;
> -}
> -
> /* Strip off a legitimate source ending from the input string NAME of
> length LEN. Rather than having to know the names used by all of
> our front ends, we strip off an ending of a period followed by
> diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
> index a50a90a..b492137 100644
> --- a/gcc/gimple-expr.h
> +++ b/gcc/gimple-expr.h
> @@ -28,7 +28,6 @@ extern gimple_seq gimple_body (tree);
> extern bool gimple_has_body_p (tree);
> extern const char *gimple_decl_printable_name (tree, int);
> extern tree copy_var_decl (tree, tree, tree);
> -extern bool gimple_can_coalesce_p (tree, tree);
> extern tree create_tmp_var_name (const char *);
> extern tree create_tmp_var_raw (tree, const char * = NULL);
> extern tree create_tmp_var (tree, const char * = NULL);
> @@ -56,6 +55,10 @@ extern bool is_gimple_mem_ref_addr (tree);
> extern void mark_addressable (tree);
> extern bool is_gimple_reg_rhs (tree);
>
> +/* Defined in tree-ssa-coalesce.c. */
Err, why not put the declaration in tree-ssa-coalesce.h instead?
> +extern bool gimple_can_coalesce_p (tree, tree);
> +
> +
> /* Return true if a conversion from either type of TYPE1 and TYPE2
> to the other is not required. Otherwise return false. */
>
> diff --git a/gcc/opts.c b/gcc/opts.c
> index 39c190d..7e41b1f 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -448,12 +448,12 @@ static const struct default_options default_options_table[] =
> { OPT_LEVELS_1_PLUS, OPT_fsplit_wide_types, NULL, 1 },
> { OPT_LEVELS_1_PLUS, OPT_ftree_ccp, NULL, 1 },
> { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_bit_ccp, NULL, 1 },
> + { OPT_LEVELS_1_PLUS, OPT_ftree_coalesce_vars, NULL, 1 },
> { OPT_LEVELS_1_PLUS, OPT_ftree_dce, NULL, 1 },
> { OPT_LEVELS_1_PLUS, OPT_ftree_dominator_opts, NULL, 1 },
> { OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 },
> { OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 },
> { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 },
> - { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 },
> { OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 },
> { OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 },
> { OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 },
> diff --git a/gcc/passes.def b/gcc/passes.def
> index ffa63b5..4548b20 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -77,7 +77,6 @@ along with GCC; see the file COPYING3. If not see
> NEXT_PASS (pass_all_early_optimizations);
> PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
> NEXT_PASS (pass_remove_cgraph_callee_edges);
> - NEXT_PASS (pass_rename_ssa_copies);
> NEXT_PASS (pass_object_sizes);
> NEXT_PASS (pass_ccp);
> /* After CCP we rewrite no longer addressed locals into SSA
> @@ -154,7 +153,6 @@ along with GCC; see the file COPYING3. If not see
> /* Initial scalar cleanups before alias computation.
> They ensure memory accesses are not indirect wherever possible. */
> NEXT_PASS (pass_strip_predict_hints);
> - NEXT_PASS (pass_rename_ssa_copies);
> NEXT_PASS (pass_ccp);
> /* After CCP we rewrite no longer addressed locals into SSA
> form if possible. */
> @@ -182,7 +180,6 @@ along with GCC; see the file COPYING3. If not see
> NEXT_PASS (pass_stdarg);
> NEXT_PASS (pass_lower_complex);
> NEXT_PASS (pass_sra);
> - NEXT_PASS (pass_rename_ssa_copies);
> /* The dom pass will also resolve all __builtin_constant_p calls
> that are still there to 0. This has to be done after some
> propagations have already run, but before some more dead code
> @@ -291,7 +288,6 @@ along with GCC; see the file COPYING3. If not see
> NEXT_PASS (pass_fold_builtins);
> NEXT_PASS (pass_optimize_widening_mul);
> NEXT_PASS (pass_tail_calls);
> - NEXT_PASS (pass_rename_ssa_copies);
> /* FIXME: If DCE is not run before checking for uninitialized uses,
> we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
> However, this also causes us to misdiagnose cases that should be
> @@ -326,7 +322,6 @@ along with GCC; see the file COPYING3. If not see
> NEXT_PASS (pass_dce);
> NEXT_PASS (pass_asan);
> NEXT_PASS (pass_tsan);
> - NEXT_PASS (pass_rename_ssa_copies);
> /* ??? We do want some kind of loop invariant motion, but we possibly
> need to adjust LIM to be more friendly towards preserving accurate
> debug information here. */
> diff --git a/gcc/testsuite/gcc.dg/guality/pr54200.c b/gcc/testsuite/gcc.dg/guality/pr54200.c
> index 9b17187..e1e7293 100644
> --- a/gcc/testsuite/gcc.dg/guality/pr54200.c
> +++ b/gcc/testsuite/gcc.dg/guality/pr54200.c
> @@ -1,6 +1,6 @@
> /* PR tree-optimization/54200 */
> /* { dg-do run } */
> -/* { dg-options "-g -fno-var-tracking-assignments" } */
> +/* { dg-options "-g -fno-var-tracking-assignments -fno-tree-coalesce-vars" } */
>
> int o __attribute__((used));
>
> diff --git a/gcc/testsuite/gcc.dg/ssp-1.c b/gcc/testsuite/gcc.dg/ssp-1.c
> index 5467f4d..db69332 100644
> --- a/gcc/testsuite/gcc.dg/ssp-1.c
> +++ b/gcc/testsuite/gcc.dg/ssp-1.c
> @@ -12,7 +12,7 @@ __stack_chk_fail (void)
>
> int main ()
> {
> - int i;
> + register int i;
> char foo[255];
>
> // smash stack
> diff --git a/gcc/testsuite/gcc.dg/ssp-2.c b/gcc/testsuite/gcc.dg/ssp-2.c
> index 9a7ac32..752fe53 100644
> --- a/gcc/testsuite/gcc.dg/ssp-2.c
> +++ b/gcc/testsuite/gcc.dg/ssp-2.c
> @@ -14,7 +14,7 @@ __stack_chk_fail (void)
> void
> overflow()
> {
> - int i = 0;
> + register int i = 0;
> char foo[30];
>
> /* Overflow buffer. */
> diff --git a/gcc/testsuite/gcc.dg/torture/parm-coalesce.c b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
> new file mode 100644
> index 0000000..dbd81c1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
> @@ -0,0 +1,40 @@
> +/* { dg-do run } */
> +
> +#include <stdlib.h>
> +
> +/* Make sure we don't coalesce both incoming parms, one whose incoming
> + value is unused, to the same location, so as to overwrite one of
> + them with the incoming value of the other. */
> +
> +int __attribute__((noinline, noclone))
> +foo (int i, int j)
> +{
> + j = i; /* The incoming value for J is unused. */
> + i = 2;
> + if (j)
> + j++;
> + j += i + 1;
> + return j;
> +}
> +
> +/* Same as foo, but with swapped parameters. */
> +int __attribute__((noinline, noclone))
> +bar (int j, int i)
> +{
> + j = i; /* The incoming value for J is unused. */
> + i = 2;
> + if (j)
> + j++;
> + j += i + 1;
> + return j;
> +}
> +
> +int
> +main (void)
> +{
> + if (foo (0, 1) != 3)
> + abort ();
> + if (bar (1, 0) != 3)
> + abort ();
> + return 0;
> +}
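For anyone following along, a hand model of the failure this test is meant
to catch. It assumes the entry copies run in declaration order, so that
when both parms share one location the later parm's incoming value
survives; that ordering and the _if_parms_shared function are my
illustration, not something the patch states.

```c
#include <assert.h>

/* foo from the test, unchanged apart from the name.  */
int
foo_as_written (int i, int j)
{
  j = i;	/* The incoming value for J is unused.  */
  i = 2;
  if (j)
    j++;
  j += i + 1;
  return j;
}

/* Hypothetical bad outcome: both incoming parms were assigned one
   location, and the unused incoming J was copied there last.  */
int
foo_if_parms_shared (int incoming_i, int incoming_j)
{
  int shared = incoming_i;
  shared = incoming_j;	/* clobbers I's incoming value */
  int j = shared;	/* "j = i" now reads the wrong value */
  int i = 2;
  if (j)
    j++;
  j += i + 1;
  return j;
}
```

With the arguments used in main, foo_as_written (0, 1) returns 3 while the
shared-location model returns 5. Under the same ordering assumption bar
would still pass, which is presumably why the test has both argument
orders.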
> diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
> index e6310cd..e62f36b 100644
> --- a/gcc/tree-outof-ssa.c
> +++ b/gcc/tree-outof-ssa.c
> @@ -330,12 +330,13 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
>
> start_sequence ();
>
> - var = SSA_NAME_VAR (partition_to_var (SA.map, dest));
> + tree name = partition_to_var (SA.map, dest);
> + var = SSA_NAME_VAR (name);
> src_mode = TYPE_MODE (TREE_TYPE (src));
> dest_mode = GET_MODE (dest_rtx);
> - gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var)));
> + gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var ? var : name)));
The TREE_TYPE of an SSA name and that of its SSA_NAME_VAR are always the
same, so just use TREE_TYPE (name) here.
> gcc_assert (!REG_P (dest_rtx)
> - || dest_mode == promote_decl_mode (var, &unsignedp));
> + || dest_mode == promote_ssa_mode (name, &unsignedp));
>
> if (src_mode != dest_mode)
> {
> @@ -714,12 +715,12 @@ static rtx
> get_temp_reg (tree name)
> {
> tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
> - tree type = TREE_TYPE (var);
> + tree type = var ? TREE_TYPE (var) : TREE_TYPE (name);
See above.
> int unsignedp;
> - machine_mode reg_mode = promote_decl_mode (var, &unsignedp);
> + machine_mode reg_mode = promote_ssa_mode (name, &unsignedp);
> rtx x = gen_reg_rtx (reg_mode);
> if (POINTER_TYPE_P (type))
> - mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var))));
> + mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type)));
> return x;
> }
>
> @@ -1019,7 +1020,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
>
> /* Return to viewing the variable list as just all reference variables after
> coalescing has been performed. */
> - partition_view_normal (map, false);
> + partition_view_normal (map);
>
> if (dump_file && (dump_flags & TDF_DETAILS))
> {
> diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
> index eeac5a4..c2cdeef0 100644
> --- a/gcc/tree-ssa-coalesce.c
> +++ b/gcc/tree-ssa-coalesce.c
> @@ -58,6 +58,7 @@ along with GCC; see the file COPYING3. If not see
> #include "tree-ssanames.h"
> #include "tree-ssa-live.h"
> #include "tree-ssa-coalesce.h"
> +#include "explow.h"
> #include "diagnostic-core.h"
>
>
> @@ -832,6 +833,16 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
> basic_block bb;
> ssa_op_iter iter;
> live_track_p live;
> + basic_block entry;
> +
> + /* If inter-variable coalescing is enabled, we may attempt to
> + coalesce variables from different base variables, including
> + different parameters, so we have to make sure default defs live
> + at the entry block conflict with each other. */
> + if (flag_tree_coalesce_vars)
> + entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
> + else
> + entry = NULL;
>
> map = live_var_map (liveinfo);
> graph = ssa_conflicts_new (num_var_partitions (map));
> @@ -890,6 +901,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
> live_track_process_def (live, result, graph);
> }
>
> + /* Pretend there are defs for params' default defs at the start
> + of the (post-)entry block. */
> + if (bb == entry)
> + {
> + unsigned base;
> + bitmap_iterator bi;
> + EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi)
> + {
> + bitmap_iterator bi2;
> + unsigned part;
> + EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base],
> + 0, part, bi2)
> + {
> + tree var = partition_to_var (map, part);
> + if (!SSA_NAME_VAR (var)
> + || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL
> + && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL)
> + || !SSA_NAME_IS_DEFAULT_DEF (var))
> + continue;
> + live_track_process_def (live, var, graph);
> + }
> + }
> + }
> +
> live_track_clear_base_vars (live);
> }
>
> @@ -1158,6 +1193,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
> {
> var1 = partition_to_var (map, p1);
> var2 = partition_to_var (map, p2);
> +
> z = var_union (map, var1, var2);
> if (z == NO_PARTITION)
> {
> @@ -1175,6 +1211,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
>
> if (debug)
> fprintf (debug, ": Success -> %d\n", z);
> +
> return true;
> }
>
> @@ -1272,6 +1309,330 @@ ssa_name_var_hash::equal (const tree_node *n1, const tree_node *n2)
> }
>
>
> +/* Output partition map MAP with coalescing plan PART to file F. */
> +
> +void
> +dump_part_var_map (FILE *f, partition part, var_map map)
> +{
> + int t;
> + unsigned x, y;
> + int p;
> +
> + fprintf (f, "\nCoalescible Partition map \n\n");
> +
> + for (x = 0; x < map->num_partitions; x++)
> + {
> + if (map->view_to_partition != NULL)
> + p = map->view_to_partition[x];
> + else
> + p = x;
> +
> + if (ssa_name (p) == NULL_TREE
> + || virtual_operand_p (ssa_name (p)))
> + continue;
> +
> + t = 0;
> + for (y = 1; y < num_ssa_names; y++)
> + {
> + tree var = version_to_var (map, y);
> + if (!var)
> + continue;
> + int q = var_to_partition (map, var);
> + p = partition_find (part, q);
> + gcc_assert (map->partition_to_base_index[q]
> + == map->partition_to_base_index[p]);
> +
> + if (p == (int)x)
> + {
> + if (t++ == 0)
> + {
> + fprintf (f, "Partition %d, base %d (", x,
> + map->partition_to_base_index[q]);
> + print_generic_expr (f, partition_to_var (map, q), TDF_SLIM);
> + fprintf (f, " - ");
> + }
> + fprintf (f, "%d ", y);
> + }
> + }
> + if (t != 0)
> + fprintf (f, ")\n");
> + }
> + fprintf (f, "\n");
> +}
> +
> +/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
> + coalescing together, false otherwise.
> +
> + This must stay consistent with var_map_base_init in tree-ssa-live.c. */
> +
> +bool
> +gimple_can_coalesce_p (tree name1, tree name2)
> +{
> + /* First check the SSA_NAME's associated DECL. Without
> + optimization, we only want to coalesce if they have the same DECL
> + or both have no associated DECL. */
> + tree var1 = SSA_NAME_VAR (name1);
> + tree var2 = SSA_NAME_VAR (name2);
> + var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
> + var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
> + if (var1 != var2 && !flag_tree_coalesce_vars)
> + return false;
> +
> + /* Now check the types. If the types are the same, then we should
> + try to coalesce V1 and V2. */
> + tree t1 = TREE_TYPE (name1);
> + tree t2 = TREE_TYPE (name2);
> + if (t1 == t2)
> + {
> + check_modes:
> + /* If the base variables are the same, we're good: none of the
> + other tests below could possibly fail. */
> + var1 = SSA_NAME_VAR (name1);
> + var2 = SSA_NAME_VAR (name2);
> + if (var1 == var2)
> + return true;
> +
> + /* We don't want to coalesce two SSA names if one of the base
> + variables is supposed to be a register while the other is
> + supposed to be on the stack. Anonymous SSA names take
> + registers, but when not optimizing, user variables should go
> + on the stack, so coalescing them with the anonymous variable
> + as the partition leader would end up assigning the user
> + variable to a register. Don't do that! */
> + bool reg1 = !var1 || use_register_for_decl (var1);
> + bool reg2 = !var2 || use_register_for_decl (var2);
> + if (reg1 != reg2)
> + return false;
> +
> + /* Check that the promoted modes are the same. We don't want to
> + coalesce if the promoted modes would be different. Only
> + PARM_DECLs and RESULT_DECLs have different promotion rules,
> + so skip the test if we both are variables or anonymous
> + SSA_NAMEs. */
> + return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
> + || promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL);
> + }
> +
> + /* If the types are not the same, check for a canonical type match. This
> + (for example) allows coalescing when the types are fundamentally the
> + same, but just have different names.
> +
> + Note pointer types with different address spaces may have the same
> + canonical type. Those are rejected for coalescing by the
> + types_compatible_p check. */
> + if (TYPE_CANONICAL (t1)
> + && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
> + && types_compatible_p (t1, t2))
> + goto check_modes;
> +
> + return false;
> +}
> +
> +/* Fill in MAP's partition_to_base_index, with one index for each
> + partition of SSA names USED_IN_COPIES and related by CL coalesce
> + possibilities. This must match gimple_can_coalesce_p in the
> + optimized case. */
> +
> +static void
> +compute_optimized_partition_bases (var_map map, bitmap used_in_copies,
> + coalesce_list_p cl)
> +{
> + int parts = num_var_partitions (map);
> + partition tentative = partition_new (parts);
> +
> + /* Partition the SSA versions so that, for each coalescible
> + pair, both of its members are in the same partition in
> + TENTATIVE. */
> + gcc_assert (!cl->sorted);
> + coalesce_pair_p node;
> + coalesce_iterator_type ppi;
> + FOR_EACH_PARTITION_PAIR (node, ppi, cl)
> + {
> + tree v1 = ssa_name (node->first_element);
> + int p1 = partition_find (tentative, var_to_partition (map, v1));
> + tree v2 = ssa_name (node->second_element);
> + int p2 = partition_find (tentative, var_to_partition (map, v2));
> +
> + if (p1 == p2)
> + continue;
> +
> + partition_union (tentative, p1, p2);
> + }
> +
> + /* We have to deal with cost one pairs too. */
> + for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next)
> + {
> + tree v1 = ssa_name (co->first_element);
> + int p1 = partition_find (tentative, var_to_partition (map, v1));
> + tree v2 = ssa_name (co->second_element);
> + int p2 = partition_find (tentative, var_to_partition (map, v2));
> +
> + if (p1 == p2)
> + continue;
> +
> + partition_union (tentative, p1, p2);
> + }
> +
> + /* And also with abnormal edges. */
> + basic_block bb;
> + edge e;
> + edge_iterator ei;
> + FOR_EACH_BB_FN (bb, cfun)
> + {
> + FOR_EACH_EDGE (e, ei, bb->preds)
> + if (e->flags & EDGE_ABNORMAL)
> + {
> + gphi_iterator gsi;
> + for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
> + gsi_next (&gsi))
> + {
> + gphi *phi = gsi.phi ();
> + tree arg = PHI_ARG_DEF (phi, e->dest_idx);
> + if (SSA_NAME_IS_DEFAULT_DEF (arg)
> + && (!SSA_NAME_VAR (arg)
> + || TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL))
> + continue;
> +
> + tree res = PHI_RESULT (phi);
> +
> + int p1 = partition_find (tentative, var_to_partition (map, res));
> + int p2 = partition_find (tentative, var_to_partition (map, arg));
> +
> + if (p1 == p2)
> + continue;
> +
> + partition_union (tentative, p1, p2);
> + }
> + }
> + }
> +
> + map->partition_to_base_index = XCNEWVEC (int, parts);
> + auto_vec<unsigned int> index_map (parts);
> + if (parts)
> + index_map.quick_grow (parts);
> +
> + const unsigned no_part = -1;
> + unsigned count = parts;
> + while (count)
> + index_map[--count] = no_part;
> +
> + /* Initialize MAP's mapping from partition to base index, using
> + as base indices an enumeration of the TENTATIVE partitions in
> + which each SSA version ended up, so that we compute conflicts
> + between all SSA versions that ended up in the same potential
> + coalesce partition. */
> + bitmap_iterator bi;
> + unsigned i;
> + EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
> + {
> + int pidx = var_to_partition (map, ssa_name (i));
> + int base = partition_find (tentative, pidx);
> + if (index_map[base] != no_part)
> + continue;
> + index_map[base] = count++;
> + }
> +
> + map->num_basevars = count;
> +
> + EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
> + {
> + int pidx = var_to_partition (map, ssa_name (i));
> + int base = partition_find (tentative, pidx);
> + gcc_assert (index_map[base] < count);
> + map->partition_to_base_index[pidx] = index_map[base];
> + }
> +
> + if (dump_file && (dump_flags & TDF_DETAILS))
> + dump_part_var_map (dump_file, tentative, map);
> +
> + partition_delete (tentative);
> +}
> +
> +/* Hashtable helpers. */
> +
> +struct tree_int_map_hasher : typed_noop_remove <tree_int_map>
> +{
> + typedef tree_int_map *value_type;
> + typedef tree_int_map *compare_type;
> + static inline hashval_t hash (const tree_int_map *);
> + static inline bool equal (const tree_int_map *, const tree_int_map *);
> +};
> +
> +inline hashval_t
> +tree_int_map_hasher::hash (const tree_int_map *v)
> +{
> + return tree_map_base_hash (v);
> +}
> +
> +inline bool
> +tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
> +{
> + return tree_int_map_eq (v, c);
> +}
> +
> +/* This routine will initialize the basevar fields of MAP with base
> + names. Partitions will share the same base if they have the same
> + SSA_NAME_VAR, or, being anonymous variables, the same type. This
> + must match gimple_can_coalesce_p in the non-optimized case. */
> +
> +static void
> +compute_samebase_partition_bases (var_map map)
> +{
> + int x, num_part;
> + tree var;
> + struct tree_int_map *m, *mapstorage;
> +
> + num_part = num_var_partitions (map);
> + hash_table<tree_int_map_hasher> tree_to_index (num_part);
> + /* We can have at most num_part entries in the hash tables, so it's
> + enough to allocate so many map elements once, saving some malloc
> + calls. */
> + mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
> +
> + /* If a base table already exists, clear it, otherwise create it. */
> + free (map->partition_to_base_index);
> + map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
> +
> + /* Build the base variable list, and point partitions at their bases. */
> + for (x = 0; x < num_part; x++)
> + {
> + struct tree_int_map **slot;
> + unsigned baseindex;
> + var = partition_to_var (map, x);
> + if (SSA_NAME_VAR (var)
> + && (!VAR_P (SSA_NAME_VAR (var))
> + || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
> + m->base.from = SSA_NAME_VAR (var);
> + else
> + /* This restricts what anonymous SSA names we can coalesce
> + as it restricts the sets we compute conflicts for.
> + Using TREE_TYPE to generate sets is the easiest as
> + type equivalency also holds for SSA names with the same
> + underlying decl.
> +
> + Check gimple_can_coalesce_p when changing this code. */
> + m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
> + ? TYPE_CANONICAL (TREE_TYPE (var))
> + : TREE_TYPE (var));
> + /* If base variable hasn't been seen, set it up. */
> + slot = tree_to_index.find_slot (m, INSERT);
> + if (!*slot)
> + {
> + baseindex = m - mapstorage;
> + m->to = baseindex;
> + *slot = m;
> + m++;
> + }
> + else
> + baseindex = (*slot)->to;
> + map->partition_to_base_index[x] = baseindex;
> + }
> +
> + map->num_basevars = m - mapstorage;
> +
> + free (mapstorage);
> +}
> +
> /* Reduce the number of copies by coalescing variables in the function. Return
> a partition map with the resulting coalesces. */
>
> @@ -1288,9 +1649,10 @@ coalesce_ssa_name (void)
> cl = create_coalesce_list ();
> map = create_outofssa_var_map (cl, used_in_copies);
>
> - /* If optimization is disabled, we need to coalesce all the names originating
> - from the same SSA_NAME_VAR so debug info remains undisturbed. */
> - if (!optimize)
> + /* If this optimization is disabled, we need to coalesce all the
> + names originating from the same SSA_NAME_VAR so debug info
> + remains undisturbed. */
> + if (!flag_tree_coalesce_vars)
> {
> hash_table<ssa_name_var_hash> ssa_name_hash (10);
>
> @@ -1331,8 +1693,13 @@ coalesce_ssa_name (void)
> if (dump_file && (dump_flags & TDF_DETAILS))
> dump_var_map (dump_file, map);
>
> - /* Don't calculate live ranges for variables not in the coalesce list. */
> - partition_view_bitmap (map, used_in_copies, true);
> + partition_view_bitmap (map, used_in_copies);
> +
> + if (flag_tree_coalesce_vars)
> + compute_optimized_partition_bases (map, used_in_copies, cl);
> + else
> + compute_samebase_partition_bases (map);
> +
> BITMAP_FREE (used_in_copies);
>
> if (num_var_partitions (map) < 1)
> @@ -1371,8 +1738,7 @@ coalesce_ssa_name (void)
>
> /* Now coalesce everything in the list. */
> coalesce_partitions (map, graph, cl,
> - ((dump_flags & TDF_DETAILS) ? dump_file
> - : NULL));
> + ((dump_flags & TDF_DETAILS) ? dump_file : NULL));
>
> delete_coalesce_list (cl);
> ssa_conflicts_delete (graph);
> diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c
> deleted file mode 100644
> index f3cb56e..0000000
> --- a/gcc/tree-ssa-copyrename.c
> +++ /dev/null
> @@ -1,499 +0,0 @@
> -/* Rename SSA copies.
> - Copyright (C) 2004-2015 Free Software Foundation, Inc.
> - Contributed by Andrew MacLeod <amacleod@redhat.com>
> -
> -This file is part of GCC.
> -
> -GCC is free software; you can redistribute it and/or modify
> -it under the terms of the GNU General Public License as published by
> -the Free Software Foundation; either version 3, or (at your option)
> -any later version.
> -
> -GCC is distributed in the hope that it will be useful,
> -but WITHOUT ANY WARRANTY; without even the implied warranty of
> -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> -GNU General Public License for more details.
> -
> -You should have received a copy of the GNU General Public License
> -along with GCC; see the file COPYING3. If not see
> -<http://www.gnu.org/licenses/>. */
> -
> -#include "config.h"
> -#include "system.h"
> -#include "coretypes.h"
> -#include "tm.h"
> -#include "hash-set.h"
> -#include "machmode.h"
> -#include "vec.h"
> -#include "double-int.h"
> -#include "input.h"
> -#include "alias.h"
> -#include "symtab.h"
> -#include "wide-int.h"
> -#include "inchash.h"
> -#include "tree.h"
> -#include "fold-const.h"
> -#include "predict.h"
> -#include "hard-reg-set.h"
> -#include "function.h"
> -#include "dominance.h"
> -#include "cfg.h"
> -#include "basic-block.h"
> -#include "tree-ssa-alias.h"
> -#include "internal-fn.h"
> -#include "gimple-expr.h"
> -#include "is-a.h"
> -#include "gimple.h"
> -#include "gimple-iterator.h"
> -#include "flags.h"
> -#include "tree-pretty-print.h"
> -#include "bitmap.h"
> -#include "gimple-ssa.h"
> -#include "stringpool.h"
> -#include "tree-ssanames.h"
> -#include "hashtab.h"
> -#include "rtl.h"
> -#include "statistics.h"
> -#include "real.h"
> -#include "fixed-value.h"
> -#include "insn-config.h"
> -#include "expmed.h"
> -#include "dojump.h"
> -#include "explow.h"
> -#include "calls.h"
> -#include "emit-rtl.h"
> -#include "varasm.h"
> -#include "stmt.h"
> -#include "expr.h"
> -#include "tree-dfa.h"
> -#include "tree-inline.h"
> -#include "tree-ssa-live.h"
> -#include "tree-pass.h"
> -#include "langhooks.h"
> -
> -static struct
> -{
> - /* Number of copies coalesced. */
> - int coalesced;
> -} stats;
> -
> -/* The following routines implement the SSA copy renaming phase.
> -
> - This optimization looks for copies between 2 SSA_NAMES, either through a
> - direct copy, or an implicit one via a PHI node result and its arguments.
> -
> - Each copy is examined to determine if it is possible to rename the base
> - variable of one of the operands to the same variable as the other operand.
> - i.e.
> - T.3_5 = <blah>
> - a_1 = T.3_5
> -
> - If this copy couldn't be copy propagated, it could possibly remain in the
> - program throughout the optimization phases. After SSA->normal, it would
> - become:
> -
> - T.3 = <blah>
> - a = T.3
> -
> - Since T.3_5 is distinct from all other SSA versions of T.3, there is no
> - fundamental reason why the base variable needs to be T.3, subject to
> - certain restrictions. This optimization attempts to determine if we can
> - change the base variable on copies like this, and result in code such as:
> -
> - a_5 = <blah>
> - a_1 = a_5
> -
> - This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is
> - possible, the copy goes away completely. If it isn't possible, a new temp
> - will be created for a_5, and you will end up with the exact same code:
> -
> - a.8 = <blah>
> - a = a.8
> -
> - The other benefit of performing this optimization relates to what variables
> - are chosen in copies. Gimplification of the program uses temporaries for
> - a lot of things. expressions like
> -
> - a_1 = <blah>
> - <blah2> = a_1
> -
> - get turned into
> -
> - T.3_5 = <blah>
> - a_1 = T.3_5
> - <blah2> = a_1
> -
> - Copy propagation is done in a forward direction, and if we can propagate
> - through the copy, we end up with:
> -
> - T.3_5 = <blah>
> - <blah2> = T.3_5
> -
> - The copy is gone, but so is all reference to the user variable 'a'. By
> - performing this optimization, we would see the sequence:
> -
> - a_5 = <blah>
> - a_1 = a_5
> - <blah2> = a_1
> -
> - which copy propagation would then turn into:
> -
> - a_5 = <blah>
> - <blah2> = a_5
> -
> - and so we still retain the user variable whenever possible. */
> -
> -
> -/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
> - Choose a representative for the partition, and send debug info to DEBUG. */
> -
> -static void
> -copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
> -{
> - int p1, p2, p3;
> - tree root1, root2;
> - tree rep1, rep2;
> - bool ign1, ign2, abnorm;
> -
> - gcc_assert (TREE_CODE (var1) == SSA_NAME);
> - gcc_assert (TREE_CODE (var2) == SSA_NAME);
> -
> - register_ssa_partition (map, var1);
> - register_ssa_partition (map, var2);
> -
> - p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
> - p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
> -
> - if (debug)
> - {
> - fprintf (debug, "Try : ");
> - print_generic_expr (debug, var1, TDF_SLIM);
> - fprintf (debug, "(P%d) & ", p1);
> - print_generic_expr (debug, var2, TDF_SLIM);
> - fprintf (debug, "(P%d)", p2);
> - }
> -
> - gcc_assert (p1 != NO_PARTITION);
> - gcc_assert (p2 != NO_PARTITION);
> -
> - if (p1 == p2)
> - {
> - if (debug)
> - fprintf (debug, " : Already coalesced.\n");
> - return;
> - }
> -
> - rep1 = partition_to_var (map, p1);
> - rep2 = partition_to_var (map, p2);
> - root1 = SSA_NAME_VAR (rep1);
> - root2 = SSA_NAME_VAR (rep2);
> - if (!root1 && !root2)
> - return;
> -
> - /* Don't coalesce if one of the variables occurs in an abnormal PHI. */
> - abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1)
> - || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2));
> - if (abnorm)
> - {
> - if (debug)
> - fprintf (debug, " : Abnormal PHI barrier. No coalesce.\n");
> - return;
> - }
> -
> - /* Partitions already have the same root, simply merge them. */
> - if (root1 == root2)
> - {
> - p1 = partition_union (map->var_partition, p1, p2);
> - if (debug)
> - fprintf (debug, " : Same root, coalesced --> P%d.\n", p1);
> - return;
> - }
> -
> - /* Never attempt to coalesce 2 different parameters. */
> - if ((root1 && TREE_CODE (root1) == PARM_DECL)
> - && (root2 && TREE_CODE (root2) == PARM_DECL))
> - {
> - if (debug)
> - fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n");
> - return;
> - }
> -
> - if ((root1 && TREE_CODE (root1) == RESULT_DECL)
> - != (root2 && TREE_CODE (root2) == RESULT_DECL))
> - {
> - if (debug)
> - fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n");
> - return;
> - }
> -
> - ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1));
> - ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2));
> -
> - /* Refrain from coalescing user variables, if requested. */
> - if (!ign1 && !ign2)
> - {
> - if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2))
> - ign2 = true;
> - else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1))
> - ign1 = true;
> - else if (flag_ssa_coalesce_vars != 2)
> - {
> - if (debug)
> - fprintf (debug, " : 2 different USER vars. No coalesce.\n");
> - return;
> - }
> - else
> - ign2 = true;
> - }
> -
> - /* If both values have default defs, we can't coalesce. If only one has a
> - tag, make sure that variable is the new root partition. */
> - if (root1 && ssa_default_def (cfun, root1))
> - {
> - if (root2 && ssa_default_def (cfun, root2))
> - {
> - if (debug)
> - fprintf (debug, " : 2 default defs. No coalesce.\n");
> - return;
> - }
> - else
> - {
> - ign2 = true;
> - ign1 = false;
> - }
> - }
> - else if (root2 && ssa_default_def (cfun, root2))
> - {
> - ign1 = true;
> - ign2 = false;
> - }
> -
> - /* Do not coalesce if we cannot assign a symbol to the partition. */
> - if (!(!ign2 && root2)
> - && !(!ign1 && root1))
> - {
> - if (debug)
> - fprintf (debug, " : Choosen variable has no root. No coalesce.\n");
> - return;
> - }
> -
> - /* Don't coalesce if the new chosen root variable would be read-only.
> - If both ign1 && ign2, then the root var of the larger partition
> - wins, so reject in that case if any of the root vars is TREE_READONLY.
> - Otherwise reject only if the root var, on which replace_ssa_name_symbol
> - will be called below, is readonly. */
> - if (((root1 && TREE_READONLY (root1)) && ign2)
> - || ((root2 && TREE_READONLY (root2)) && ign1))
> - {
> - if (debug)
> - fprintf (debug, " : Readonly variable. No coalesce.\n");
> - return;
> - }
> -
> - /* Don't coalesce if the two variables aren't type compatible . */
> - if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2))
> - /* There is a disconnect between the middle-end type-system and
> - VRP, avoid coalescing enum types with different bounds. */
> - || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE
> - || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE)
> - && TREE_TYPE (var1) != TREE_TYPE (var2)))
> - {
> - if (debug)
> - fprintf (debug, " : Incompatible types. No coalesce.\n");
> - return;
> - }
> -
> - /* Merge the two partitions. */
> - p3 = partition_union (map->var_partition, p1, p2);
> -
> - /* Set the root variable of the partition to the better choice, if there is
> - one. */
> - if (!ign2 && root2)
> - replace_ssa_name_symbol (partition_to_var (map, p3), root2);
> - else if (!ign1 && root1)
> - replace_ssa_name_symbol (partition_to_var (map, p3), root1);
> - else
> - gcc_unreachable ();
> -
> - if (debug)
> - {
> - fprintf (debug, " --> P%d ", p3);
> - print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)),
> - TDF_SLIM);
> - fprintf (debug, "\n");
> - }
> -}
> -
> -
> -namespace {
> -
> -const pass_data pass_data_rename_ssa_copies =
> -{
> - GIMPLE_PASS, /* type */
> - "copyrename", /* name */
> - OPTGROUP_NONE, /* optinfo_flags */
> - TV_TREE_COPY_RENAME, /* tv_id */
> - ( PROP_cfg | PROP_ssa ), /* properties_required */
> - 0, /* properties_provided */
> - 0, /* properties_destroyed */
> - 0, /* todo_flags_start */
> - 0, /* todo_flags_finish */
> -};
> -
> -class pass_rename_ssa_copies : public gimple_opt_pass
> -{
> -public:
> - pass_rename_ssa_copies (gcc::context *ctxt)
> - : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt)
> - {}
> -
> - /* opt_pass methods: */
> - opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); }
> - virtual bool gate (function *) { return flag_tree_copyrename != 0; }
> - virtual unsigned int execute (function *);
> -
> -}; // class pass_rename_ssa_copies
> -
> -/* This function will make a pass through the IL, and attempt to coalesce any
> - SSA versions which occur in PHI's or copies. Coalescing is accomplished by
> - changing the underlying root variable of all coalesced version. This will
> - then cause the SSA->normal pass to attempt to coalesce them all to the same
> - variable. */
> -
> -unsigned int
> -pass_rename_ssa_copies::execute (function *fun)
> -{
> - var_map map;
> - basic_block bb;
> - tree var, part_var;
> - gimple stmt;
> - unsigned x;
> - FILE *debug;
> -
> - memset (&stats, 0, sizeof (stats));
> -
> - if (dump_file && (dump_flags & TDF_DETAILS))
> - debug = dump_file;
> - else
> - debug = NULL;
> -
> - map = init_var_map (num_ssa_names);
> -
> - FOR_EACH_BB_FN (bb, fun)
> - {
> - /* Scan for real copies. */
> - for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
> - gsi_next (&gsi))
> - {
> - stmt = gsi_stmt (gsi);
> - if (gimple_assign_ssa_name_copy_p (stmt))
> - {
> - tree lhs = gimple_assign_lhs (stmt);
> - tree rhs = gimple_assign_rhs1 (stmt);
> -
> - copy_rename_partition_coalesce (map, lhs, rhs, debug);
> - }
> - }
> - }
> -
> - FOR_EACH_BB_FN (bb, fun)
> - {
> - /* Treat PHI nodes as copies between the result and each argument. */
> - for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
> - gsi_next (&gsi))
> - {
> - size_t i;
> - tree res;
> - gphi *phi = gsi.phi ();
> - res = gimple_phi_result (phi);
> -
> - /* Do not process virtual SSA_NAMES. */
> - if (virtual_operand_p (res))
> - continue;
> -
> - /* Make sure to only use the same partition for an argument
> - as the result but never the other way around. */
> - if (SSA_NAME_VAR (res)
> - && !DECL_IGNORED_P (SSA_NAME_VAR (res)))
> - for (i = 0; i < gimple_phi_num_args (phi); i++)
> - {
> - tree arg = PHI_ARG_DEF (phi, i);
> - if (TREE_CODE (arg) == SSA_NAME)
> - copy_rename_partition_coalesce (map, res, arg,
> - debug);
> - }
> - /* Else if all arguments are in the same partition try to merge
> - it with the result. */
> - else
> - {
> - int all_p_same = -1;
> - int p = -1;
> - for (i = 0; i < gimple_phi_num_args (phi); i++)
> - {
> - tree arg = PHI_ARG_DEF (phi, i);
> - if (TREE_CODE (arg) != SSA_NAME)
> - {
> - all_p_same = 0;
> - break;
> - }
> - else if (all_p_same == -1)
> - {
> - p = partition_find (map->var_partition,
> - SSA_NAME_VERSION (arg));
> - all_p_same = 1;
> - }
> - else if (all_p_same == 1
> - && p != partition_find (map->var_partition,
> - SSA_NAME_VERSION (arg)))
> - {
> - all_p_same = 0;
> - break;
> - }
> - }
> - if (all_p_same == 1)
> - copy_rename_partition_coalesce (map, res,
> - PHI_ARG_DEF (phi, 0),
> - debug);
> - }
> - }
> - }
> -
> - if (debug)
> - dump_var_map (debug, map);
> -
> - /* Now one more pass to make all elements of a partition share the same
> - root variable. */
> -
> - for (x = 1; x < num_ssa_names; x++)
> - {
> - part_var = partition_to_var (map, x);
> - if (!part_var)
> - continue;
> - var = ssa_name (x);
> - if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var))
> - continue;
> - if (debug)
> - {
> - fprintf (debug, "Coalesced ");
> - print_generic_expr (debug, var, TDF_SLIM);
> - fprintf (debug, " to ");
> - print_generic_expr (debug, part_var, TDF_SLIM);
> - fprintf (debug, "\n");
> - }
> - stats.coalesced++;
> - replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var));
> - }
> -
> - statistics_counter_event (fun, "copies coalesced",
> - stats.coalesced);
> - delete_var_map (map);
> - return 0;
> -}
> -
> -} // anon namespace
> -
> -gimple_opt_pass *
> -make_pass_rename_ssa_copies (gcc::context *ctxt)
> -{
> - return new pass_rename_ssa_copies (ctxt);
> -}
> diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
> index 2c7c072..821b2f4 100644
> --- a/gcc/tree-ssa-live.c
> +++ b/gcc/tree-ssa-live.c
> @@ -100,90 +100,6 @@ static void verify_live_on_entry (tree_live_info_p);
> ssa_name or variable, and vice versa. */
>
>
> -/* Hashtable helpers. */
> -
> -struct tree_int_map_hasher : typed_noop_remove <tree_int_map>
> -{
> - typedef tree_int_map *value_type;
> - typedef tree_int_map *compare_type;
> - static inline hashval_t hash (const tree_int_map *);
> - static inline bool equal (const tree_int_map *, const tree_int_map *);
> -};
> -
> -inline hashval_t
> -tree_int_map_hasher::hash (const tree_int_map *v)
> -{
> - return tree_map_base_hash (v);
> -}
> -
> -inline bool
> -tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
> -{
> - return tree_int_map_eq (v, c);
> -}
> -
> -
> -/* This routine will initialize the basevar fields of MAP. */
> -
> -static void
> -var_map_base_init (var_map map)
> -{
> - int x, num_part;
> - tree var;
> - struct tree_int_map *m, *mapstorage;
> -
> - num_part = num_var_partitions (map);
> - hash_table<tree_int_map_hasher> tree_to_index (num_part);
> - /* We can have at most num_part entries in the hash tables, so it's
> - enough to allocate so many map elements once, saving some malloc
> - calls. */
> - mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
> -
> - /* If a base table already exists, clear it, otherwise create it. */
> - free (map->partition_to_base_index);
> - map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
> -
> - /* Build the base variable list, and point partitions at their bases. */
> - for (x = 0; x < num_part; x++)
> - {
> - struct tree_int_map **slot;
> - unsigned baseindex;
> - var = partition_to_var (map, x);
> - if (SSA_NAME_VAR (var)
> - && (!VAR_P (SSA_NAME_VAR (var))
> - || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
> - m->base.from = SSA_NAME_VAR (var);
> - else
> - /* This restricts what anonymous SSA names we can coalesce
> - as it restricts the sets we compute conflicts for.
> - Using TREE_TYPE to generate sets is the easies as
> - type equivalency also holds for SSA names with the same
> - underlying decl.
> -
> - Check gimple_can_coalesce_p when changing this code. */
> - m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
> - ? TYPE_CANONICAL (TREE_TYPE (var))
> - : TREE_TYPE (var));
> - /* If base variable hasn't been seen, set it up. */
> - slot = tree_to_index.find_slot (m, INSERT);
> - if (!*slot)
> - {
> - baseindex = m - mapstorage;
> - m->to = baseindex;
> - *slot = m;
> - m++;
> - }
> - else
> - baseindex = (*slot)->to;
> - map->partition_to_base_index[x] = baseindex;
> - }
> -
> - map->num_basevars = m - mapstorage;
> -
> - free (mapstorage);
> -}
> -
> -
> /* Remove the base table in MAP. */
>
> static void
> @@ -361,21 +277,17 @@ partition_view_fini (var_map map, bitmap selected)
> }
>
>
> -/* Create a partition view which includes all the used partitions in MAP. If
> - WANT_BASES is true, create the base variable map as well. */
> +/* Create a partition view which includes all the used partitions in MAP. */
>
> void
> -partition_view_normal (var_map map, bool want_bases)
> +partition_view_normal (var_map map)
> {
> bitmap used;
>
> used = partition_view_init (map);
> partition_view_fini (map, used);
>
> - if (want_bases)
> - var_map_base_init (map);
> - else
> - var_map_base_fini (map);
> + var_map_base_fini (map);
> }
>
>
> @@ -384,7 +296,7 @@ partition_view_normal (var_map map, bool want_bases)
> as well. */
>
> void
> -partition_view_bitmap (var_map map, bitmap only, bool want_bases)
> +partition_view_bitmap (var_map map, bitmap only)
> {
> bitmap used;
> bitmap new_partitions = BITMAP_ALLOC (NULL);
> @@ -400,10 +312,7 @@ partition_view_bitmap (var_map map, bitmap only, bool want_bases)
> }
> partition_view_fini (map, new_partitions);
>
> - if (want_bases)
> - var_map_base_init (map);
> - else
> - var_map_base_fini (map);
> + var_map_base_fini (map);
> }
>
>
> diff --git a/gcc/tree-ssa-live.h b/gcc/tree-ssa-live.h
> index d5d7820..1f88358 100644
> --- a/gcc/tree-ssa-live.h
> +++ b/gcc/tree-ssa-live.h
> @@ -71,8 +71,8 @@ typedef struct _var_map
> extern var_map init_var_map (int);
> extern void delete_var_map (var_map);
> extern int var_union (var_map, tree, tree);
> -extern void partition_view_normal (var_map, bool);
> -extern void partition_view_bitmap (var_map, bitmap, bool);
> +extern void partition_view_normal (var_map);
> +extern void partition_view_bitmap (var_map, bitmap);
> extern void dump_scope_blocks (FILE *, int);
> extern void debug_scope_block (tree, int);
> extern void debug_scope_blocks (int);
> diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
> index 685fcc38c..447fcd9 100644
> --- a/gcc/var-tracking.c
> +++ b/gcc/var-tracking.c
> @@ -4872,12 +4872,16 @@ dataflow_set_remove_mem_locs (variable_def **slot, dataflow_set *set)
> registers, as well as associations between MEMs and VALUEs. */
>
> static void
> -dataflow_set_clear_at_call (dataflow_set *set)
> +dataflow_set_clear_at_call (dataflow_set *set, rtx_insn *call_insn)
> {
> unsigned int r;
> hard_reg_set_iterator hrsi;
> + HARD_REG_SET invalidated_regs;
>
> - EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, r, hrsi)
> + get_call_reg_set_usage (call_insn, &invalidated_regs,
> + regs_invalidated_by_call);
> +
> + EXECUTE_IF_SET_IN_HARD_REG_SET (invalidated_regs, 0, r, hrsi)
> var_regno_delete (set, r);
>
> if (MAY_HAVE_DEBUG_INSNS)
> @@ -6685,7 +6689,7 @@ compute_bb_dataflow (basic_block bb)
> switch (mo->type)
> {
> case MO_CALL:
> - dataflow_set_clear_at_call (out);
> + dataflow_set_clear_at_call (out, insn);
> break;
>
> case MO_USE:
> @@ -9152,7 +9156,7 @@ emit_notes_in_bb (basic_block bb, dataflow_set *set)
> switch (mo->type)
> {
> case MO_CALL:
> - dataflow_set_clear_at_call (set);
> + dataflow_set_clear_at_call (set, insn);
> emit_notes_for_changes (insn, EMIT_NOTE_AFTER_CALL_INSN, set->vars);
> {
> rtx arguments = mo->u.loc, *p = &arguments;
Otherwise this looks fine to me - I didn't really spot the TREE_LIST
uses though (apart from that first element use).
2nd eyes welcome.
Thanks,
Richard.
>
> --
> Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/ FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
* Re: [PR64164] drop copyrename, integrate into expand
2015-04-24 1:56 ` Alexandre Oliva
2015-04-27 11:39 ` Richard Biener
@ 2015-04-29 3:51 ` Jeff Law
1 sibling, 0 replies; 127+ messages in thread
From: Jeff Law @ 2015-04-29 3:51 UTC (permalink / raw)
To: Alexandre Oliva, Richard Biener; +Cc: gcc-patches
On 04/23/2015 07:56 PM, Alexandre Oliva wrote:
>
> The other tricky bit was to fix all expander bits that required
> SSA_NAMEs to have an associated decl. I've removed all such cases, so we
> can now expand anonymous SSA decls directly, without having to create an
> ignored decl. Doing that, we can coalesce variables and expand each
> partition without worrying about choosing a preferred partition leader.
> We just have to make sure we don't consider pairs of variables eligible
> for coalescing if they should get different promoted modes, or a
> different register-or-stack choice, and then expansion of partitions is
> streamlined: we just expand each leader, and then adjust all SSA_NAMEs
> to associate the RTL with their base variables, if any.
Nice.
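For readers following along, the eligibility rule described above can be sketched in isolation. This is only a toy model under stated assumptions, not the patch's code: `Name`, `Decl`, `Mode` and `can_coalesce` are made-up stand-ins for SSA_NAMEs, their base decls, promoted machine modes and gimple_can_coalesce_p, and the type-compatibility checks are left out entirely.

```cpp
#include <optional>

// Hypothetical stand-ins; none of these are GCC types.
enum class Mode { QI, HI, SI, DI };
enum class DeclKind { Var, Parm, Result };

struct Decl {
  DeclKind kind;
  bool in_register;  // models the register-or-stack choice for the decl
};

struct Name {
  std::optional<Decl> base;  // empty for an anonymous SSA name
  Mode promoted_mode;        // models the mode the name is promoted to
};

// Two names are candidates for coalescing only if they agree on the
// register-or-stack choice and, when a parm or result decl is involved,
// on the promoted mode.
bool can_coalesce(const Name &a, const Name &b) {
  // Anonymous names are assumed to live in registers.
  bool reg_a = !a.base || a.base->in_register;
  bool reg_b = !b.base || b.base->in_register;
  if (reg_a != reg_b)
    return false;

  // Plain variables and anonymous names share promotion rules, so the
  // mode comparison is only needed when a parm or result is involved.
  bool plain_a = !a.base || a.base->kind == DeclKind::Var;
  bool plain_b = !b.base || b.base->kind == DeclKind::Var;
  if (plain_a && plain_b)
    return true;

  return a.promoted_mode == b.promoted_mode;
}
```

In this model an anonymous name never coalesces with a variable that must live on the stack, which is the case the patch guards against when not optimizing.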
>
>
> In this revision of the patch, I have retained -ftree-coalesce-vars, so
> that its negated form can be used in testcases that formerly expected no
> coalescing across user variables, but that explicitly disabled VTA.
Seems reasonable.
>
> As for testcases, while investigating test regressions, I found out
> various guality failures had to do with VT's lack of awareness of custom
> calling conventions. Caller's variables saved in registers that are
> normally call-clobbered, but that are call-saved in custom conventions
> set up for a callee, would end up invalidating the entry-point location
> associations. I've arranged for var-tracking to use custom calling
> conventions for register invalidation at call insns, and this fixed not
> only a few guality regressions due to changes in register assignment,
> but a number of other long-standing guality failures. Yay! This could
> be split out into a standalone patch.
That might be wise -- I think we're going to need at least one more
iteration on the removal of copyrename.
> In this version of the patch, we no longer touch the base vars at all.
> We just associate the piece of RTL generated for the partition with a
> list of decls, if needed. (I've just realized that I never noticed a
> list of decls show up anywhere, and looking into this, I saw a bug in
> the leader_merge function, that causes it to fail to go from a single
> entry to a list: it creates the list, but then returns the original
> non-list entry; that's why I never saw them! I won't delay posting the
> patch just because of this; I'm not even sure we want decl lists in REG
> or MEM attrs to begin with)
Well, Richi noted the compile-time cost and poor data structure choice.
I'd ask the question, what's the benefit in tracking these as a list?
If we want to track, then how often do we need to actually traverse the
list, and how hard would it be to build a pathological case (from a
compile-time standpoint)? Presumably there's no way to sort the list to
make finding an entry cheap?
>
> I have collected some statistics on the effects of the patch in
> compiling stage3-gcc/, before and after the patch, with and without
> -fno-tree-coalesce-vars. I counted, per function:
>
> b/a: before the patch, or after the patch
>
> c/n: -ftree-coalesce-vars (default when optimizing) or
> -fno-tree-coalesce-vars
>
> cv: the coalescible var count, i.e., the active partition count prior to
> coalescing. SSA_NAMEs not eligible for coalescing are not counted.
> The more of these there are, the larger the conflict graph we have to
> build.
>
> base: the base variable count that guides the construction of the
> conflict map. The more of these there are, the smaller the conflict
> graph we have to build, but it is also a lower bound for the final
> partition count.
>
> part: the partition count after coalescing, not counting those of
> SSA_NAMEs that were not eligible for coalescing to begin with.
>
> abn: successful abnormal coalesce count. How many times
> attempt_coalesce returned true as called in the abnormal coalesce loop.
>
> same: successful normal coalesces of pairs of SSA_NAMEs that share the
> same base variable (SSA_NAME_VAR, not the base index used to guide the
> construction of the conflict graph). Ignored base decls are regarded as
> NULL for purposes of this comparison. How many times attempt_coalesce
> returned true for variables that share the same base variable. This may
> count cases in which both vars are in the same partition already due to
> earlier coalesces.
>
> other: successful normal coalesces of pairs of SSA_NAMEs that do NOT
> share the same base variable. Same caveats as above.
>
> fail: failed attempts at normal coalesce. How many times
> attempt_coalesce returned false.
>
> b/a    c/n     cv      base    part    abn    same    other  fail
>
> before -fno-tr 570180  176682  221442  82076  370746  0      10542
> before -ftree- 577212  171581  221927  82076  378093  0      18654
>
> after  -fno-tr 608533  179959  220948  82076  488119  0      11697
> after  -ftree- 589243  202588  221817  82076  349373  41775  24124
I've spent quite a bit of time trying to figure out what all this means.
I think the takeaway is we'll use a bit more memory, but we also
coalesce a bit better. Neither effect appears to be very large.
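As a rough check of that reading, the relative changes can be worked out from the two -ftree-coalesce-vars rows of the table. The helper below is just arithmetic; `pct_change` is a made-up name and the constants are copied from the "cv" and "part" columns. It gives about +2.1% for the coalescible var count and about -0.05% for the final partition count.

```c
#include <assert.h>

/* Percentage change from BEFORE to AFTER.  */
static double
pct_change (long before, long after)
{
  assert (before != 0);
  return 100.0 * (double) (after - before) / (double) before;
}

/* Figures copied from the -ftree-coalesce-vars rows of the table:
   "cv" is the coalescible var count, "part" the partition count
   after coalescing.  */
static const long cv_before = 577212, cv_after = 589243;
static const long part_before = 221927, part_after = 221817;
```

Calling pct_change (cv_before, cv_after) and pct_change (part_before, part_after) reproduces the two figures quoted above.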
>
>
>
> And here's the actual patch I'm submitting for your appreciation (I was
> gonna say for inclusion, but given the leader_merge brown paper bag bug,
> I'll just want feedback on whether we want that or not, and either drop
> the list-building, or probably post a revised patch that fixes fallout
> from lists where decls are expected.)
>
> No regressions, and many progressions, on x86_64-linux-gnu and
> i686-pc-linux-gnu.
>
> [PR64164] Drop copyrename, use coalescible partition as base when optimizing.
>
> for gcc/ChangeLog
>
> PR rtl-optimization/64164
> * Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
> * tree-ssa-copyrename.c: Removed.
> * opts.c (default_options_table): Drop -ftree-copyrename. Add
> -ftree-coalesce-vars.
> * passes.def: Drop all occurrences of pass_rename_ssa_copies.
> * common.opt (ftree-copyrename): Ignore.
> (ftree-coalesce-inlined-vars): Likewise.
> * doc/invoke.texi: Remove the ignored options above.
> * gimple-expr.h (gimple_can_coalesce_p): Note def location.
> * gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
> across variables when flag_tree_coalesce_vars. Check register
> use and promoted modes to allow coalescing. Moved to
> tree-ssa-coalesce.c.
> * tree-ssa-live.c (struct tree_int_map_hasher): Move along
> with its member functions to tree-ssa-coalesce.c.
> (var_map_base_init): Likewise. Renamed to
> compute_samebase_partition_bases.
> (partition_view_normal): Drop want_bases parameter.
> (partition_view_bitmap): Likewise.
> * tree-ssa-live.h: Adjust declarations.
> * tree-ssa-coalesce.c: Include explow.h.
> (build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs's
> default defs at the entry point.
> (dump_part_var_map): New.
> (compute_optimized_partition_bases): New, called by...
> (coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
> of compute_samebase_partition_bases. Adjust.
> * alias.c (nonoverlapping_memrefs_p): Special-case RTL-less
> gimple-reg exprs.
> * cfgexpand.c (leader_merge): New.
> (get_rtl_for_parm_ssa_default_def): New.
> (set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
> vars. Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
> (expand_one_stack_var_at): Handle anonymous SSA_NAMEs. Drop
> redundant MEM attr setting.
> (expand_one_stack_var_1): Handle anonymous SSA_NAMEs. Renamed
> from...
> (expand_one_stack_var): ... this. New wrapper to check and
> skip already expanded SSA partitions.
> (record_alignment_for_reg_var): New, factored out of...
> (expand_one_var): ... this.
> (expand_one_ssa_partition): New.
> (adjust_one_expanded_partition_var): New.
> (expand_one_register_var): Check and skip already expanded SSA
> partitions.
> (expand_used_vars): Don't create DECLs for anonymous SSA
> names. Expand all SSA partitions, then adjust all SSA names.
> (pass::execute): Replace the loops that set
> SA.partition_to_pseudo from partition leaders and cleared
> DECL_RTL for multi-location variables, and that which used to
> rename vars and set attrs, with one that clears DECL_RTL and
> checks that PARMs and RESULTs default_defs match DECL_RTL.
> * cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
> * emit-rtl.c (set_reg_attrs_for_parm): Handle NULL and
> TREE_LIST decl.
> * explow.c (promote_ssa_mode): New.
> * explow.h (promote_ssa_mode): Declare.
> * expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
> * function.c: Include cfgexpand.h.
> (use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
> (use_register_for_parm_decl): Wrapper for the above to
> special-case the result_ptr.
> (rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
> (maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
> multiple locations.
> (assign_parm_adjust_stack_rtl): Add all and parm arguments,
> for rtl_for_parm. For SSA-assigned parms, zero stack_parm.
> (assign_parm_setup_block): Prefer SSA-assigned location.
> (assign_parm_setup_reg): Likewise. Use entry_parm for equiv
> if stack_parm is NULL.
> (assign_parm_setup_stack): Prefer SSA-assigned location.
> (assign_parms): Maybe reset DECL_RTL of params. Adjust stack
> rtl before testing for pointer bounds. Special-case result_ptr.
> (expand_function_start): Maybe reset DECL_RTL of result.
> Prefer SSA-assigned location for result and static chain.
> Factor out DECL_RESULT and SET_DECL_RTL.
> * tree-outof-ssa.c (insert_value_copy_on_edge): Handle
> anonymous SSA names. Use promote_ssa_mode.
> (get_temp_reg): Likewise.
> (remove_ssa_form): Adjust.
> * var-tracking.c (dataflow_set_clear_at_call): Take call_insn
> and get its reg_usage for reg invalidation.
> (compute_bb_dataflow): Pass it insn.
> (emit_notes_in_bb): Likewise.
>
> for gcc/testsuite/ChangeLog
>
> * gcc.dg/guality/pr54200.c: Add -fno-tree-coalesce-vars.
> * gcc.dg/ssp-1.c: Make counter a register.
> * gcc.dg/ssp-2.c: Likewise.
> * gcc.dg/torture/parm-coalesce.c: New.
Just a few comments in addition to Richi's....
> ---
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index ca491a0..74190a6d 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -179,21 +179,137 @@ gimple_assign_rhs_to_tree (gimple stmt)
>
> #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
>
> +/* NEXT is a DECL to be associated with some RTL, CUR is a DECL or a
> + TREE_LIST of DECLs. If NEXT is covered by CUR, return CUR
> + unchanged. Otherwise, return a list with all entries of CUR, with
> + NEXT at the end. If CUR was a list, it will be modified in
> + place. */
> +
> +static tree
> +leader_merge (tree cur, tree next)
> +{
> + if (cur == NULL || cur == next)
> + return next;
> +
> + tree list;
> +
> + if (TREE_CODE (cur) == TREE_LIST)
> + {
> + /* Look for NEXT in the list. Stop at the last node to insert
> + there. */
> + for (list = cur; ; list = TREE_CHAIN (list))
> + {
> + if (TREE_VALUE (list) == next)
> + return cur;
> + if (!TREE_CHAIN (list))
> + break;
> + }
> + }
> + else
> + /* Create the first node. */
> + list = build_tree_list (NULL, cur);
> +
> + next = build_tree_list (NULL, next);
> + TREE_CHAIN (list) = next;
> +
> + return cur;
> +}
As Richi notes, avoid TREE_LIST :-) I suspect a vec would be an
improvement here. How often do we have more than one entry? How often
do we have to search this list and how hard is it to trigger
pathological behaviour here? If we're not gaining much, consider
dropping this completely. It's the most controversial part of the patch.
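To make the single-entry-to-list bug concrete, here is a minimal standalone C sketch. It is not GCC code: `struct node`, `make_list_node`, `merge_buggy` and `merge_fixed` are hypothetical stand-ins for tree, build_tree_list and leader_merge, and the "fixed" variant is only one possible repair, assuming the list were kept at all.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for GCC's tree: either a DECL or a LIST node.  */
enum kind { DECL, LIST };

struct node
{
  enum kind kind;
  struct node *value;   /* LIST only: the DECL held by this node.  */
  struct node *chain;   /* LIST only: next LIST node, or NULL.  */
};

static struct node *
make_list_node (struct node *value)
{
  struct node *n = calloc (1, sizeof *n);
  assert (n != NULL);
  n->kind = LIST;
  n->value = value;
  return n;
}

/* Mirrors the control flow of the quoted leader_merge.  When CUR is a
   single DECL, the freshly built list is lost on return.  */
static struct node *
merge_buggy (struct node *cur, struct node *next)
{
  if (cur == NULL || cur == next)
    return next;

  struct node *list;
  if (cur->kind == LIST)
    {
      for (list = cur; ; list = list->chain)
        {
          if (list->value == next)
            return cur;
          if (!list->chain)
            break;
        }
    }
  else
    list = make_list_node (cur);

  list->chain = make_list_node (next);
  return cur;   /* Bug: returns the DECL, not the new list head.  */
}

/* Same logic, but remembers the list head so it can be returned.  */
static struct node *
merge_fixed (struct node *cur, struct node *next)
{
  if (cur == NULL || cur == next)
    return next;

  struct node *head = cur->kind == LIST ? cur : make_list_node (cur);
  struct node *list;
  for (list = head; ; list = list->chain)
    {
      if (list->value == next)
        return head;
      if (!list->chain)
        break;
    }

  list->chain = make_list_node (next);
  return head;
}
```

Note that even the repaired version walks the whole list on every merge, which is the compile-time concern raised above.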
> +
> + /* If this decl was marked as living in multiple places, reset
> + this now to NULL. */
> + tree var = SSA_NAME_VAR (name);
> + if (var && DECL_RTL_IF_SET (var) == pc_rtx)
Do we document the special meaning of pc_rtx in DECL_RTL?
> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
> index b8dc7d5..ef31ba0f 100644
> --- a/gcc/emit-rtl.c
> +++ b/gcc/emit-rtl.c
> @@ -1229,6 +1229,11 @@ set_reg_attrs_for_parm (rtx parm_rtx, rtx mem)
> void
> set_reg_attrs_for_decl_rtl (tree t, rtx x)
> {
> + if (!t)
> + return;
> + tree tdecl = t;
> + if (TREE_CODE (t) == TREE_LIST)
> + tdecl = TREE_VALUE (t);
As Richi mentioned, we only use the first entry, which raises the
question: do we need the leader_merge bits at all?
Jeff
* Re: [PR64164] drop copyrename, integrate into expand
2015-04-27 11:39 ` Richard Biener
@ 2015-06-06 5:12 ` Alexandre Oliva
2015-06-08 8:16 ` Richard Biener
2015-06-10 0:28 ` Alexandre Oliva
0 siblings, 2 replies; 127+ messages in thread
From: Alexandre Oliva @ 2015-06-06 5:12 UTC (permalink / raw)
To: Richard Biener; +Cc: Jeff Law, GCC Patches
On Apr 27, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
> This should also mention that is_gimple_reg vars do not have their
> address taken.
check
>> +static tree
>> +leader_merge (tree cur, tree next)
> Ick - presumably you can't use sth better than a TREE_LIST here?
The list was an experiment that never really worked, and when I tried to
make it work after the patch, it proved to be unworkable, so I dropped
it, and rewrote leader_merge to choose either of the params, preferring
anonymous over ignored over named, so as to reduce the likelihood of
misreading of debug dumps, since that's all they're used for.
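For illustration only, the selection can be modelled in a few lines of standalone C. The sketch follows the control flow of the leader_merge posted further down in this message; `struct decl` and `choose_leader` are hypothetical names, and the `ignored` flag stands in for DECL_IGNORED_P.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a DECL; only the flag the choice depends
   on is modelled.  */
struct decl
{
  const char *name;
  bool ignored;   /* Models DECL_IGNORED_P.  */
};

/* Pick one of CUR and NEXT as the decl to record for a partition.
   A missing CUR yields NEXT; otherwise an ignored decl wins over a
   user-visible one, and CUR wins ties.  */
static const struct decl *
choose_leader (const struct decl *cur, const struct decl *next)
{
  if (cur == NULL || cur == next)
    return next;

  if (cur->ignored)
    return cur;

  if (next != NULL && next->ignored)
    return next;

  return cur;
}
```

Because the result is always one of the two inputs, nothing is allocated and there is no list to traverse.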
>> static void
>> -expand_one_stack_var (tree var)
>> +expand_one_stack_var_1 (tree var)
>> {
>> HOST_WIDE_INT size, offset;
>> unsigned byte_align;
>>
>> - size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
>> - byte_align = align_local_variable (SSAVAR (var));
>> + if (TREE_CODE (var) != SSA_NAME || SSA_NAME_VAR (var))
>> + {
>> + size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
>> + byte_align = align_local_variable (SSAVAR (var));
>> + }
>> + else
> I'd go here for all TREE_CODE (var) == SSA_NAME
Check
> (and get rid of the SSAVAR macro?)
There are remaining uses that don't seem worth dropping it for.
>> +/* Return the promoted mode for name. If it is a named SSA_NAME, it
>> + is the same as promote_decl_mode. Otherwise, it is the promoted
>> + mode of a temp decl of same type as the SSA_NAME, if we had created
>> + one. */
>> +
>> +machine_mode
>> +promote_ssa_mode (const_tree name, int *punsignedp)
>> +{
>> + gcc_assert (TREE_CODE (name) == SSA_NAME);
>> +
>> + if (SSA_NAME_VAR (name))
>> + return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
> As above I'd rather not have different paths for anonymous vs. non-anonymous
> vars (so just delete the above two lines).
Check
>> @@ -9668,6 +9678,11 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>> pmode = promote_function_mode (type, mode, &unsignedp,
>> gimple_call_fntype (g),
>> 2);
>> + else if (!exp)
>> + {
>> + gcc_assert (code == SSA_NAME);
> promote_ssa_mode should assert this.
>> + pmode = promote_ssa_mode (ssa_name, &unsignedp);
It does, so... check.
>> @@ -2121,6 +2122,15 @@ aggregate_value_p (const_tree exp, const_tree fntype)
>> bool
>> use_register_for_decl (const_tree decl)
>> {
>> + if (TREE_CODE (decl) == SSA_NAME)
>> + {
>> + if (!SSA_NAME_VAR (decl))
>> + return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
>> + && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
>> +
>> + decl = SSA_NAME_VAR (decl);
> See above. Please drop the SSA_NAME_VAR != NULL path.
Check, then taken back, after a bootstrap failure and some debugging
made me realize this would be wrong. Here are the newly-added comments
that explain why:
/* We often try to use the SSA_NAME, instead of its underlying
decl, to get type information and guide decisions, to avoid
differences of behavior between anonymous and named
variables, but in this one case we have to go for the actual
variable if there is one. The main reason is that, at least
at -O0, we want to place user variables on the stack, but we
don't mind using pseudos for anonymous or ignored temps.
Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
should go in pseudos, whereas their corresponding variables
might have to go on the stack. So, disregarding the decl
here would negatively impact debug info at -O0, enable
coalescing between SSA_NAMEs that ought to get different
stack/pseudo assignments, and get the incoming argument
processing thoroughly confused by PARM_DECLs expected to live
in stack slots but assigned to pseudos. */
>> +++ b/gcc/gimple-expr.h
>> +/* Defined in tree-ssa-coalesce.c. */
>> +extern bool gimple_can_coalesce_p (tree, tree);
> Err, put it to tree-ssa-coalesce.h?
Check. Lots of additional headers required to be able to include
tree-ssa-coalesce.h, though.
>> - gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var)));
>> + gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var ? var : name)));
> The TREE_TYPE of name and its SSA_NAME_VAR are always the same. So just
> use TREE_TYPE (name) here.
Check
>> gcc_assert (!REG_P (dest_rtx)
>> - || dest_mode == promote_decl_mode (var, &unsignedp));
>> + || dest_mode == promote_ssa_mode (name, &unsignedp));
>>
>> if (src_mode != dest_mode)
>> {
>> @@ -714,12 +715,12 @@ static rtx
>> get_temp_reg (tree name)
>> {
>> tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
>> - tree type = TREE_TYPE (var);
>> + tree type = var ? TREE_TYPE (var) : TREE_TYPE (name);
> See above.
Check
Here's the revised patch, regstrapped on x86_64-linux-gnu and
i686-linux-gnu. The first attempt failed to compile libjava on x86_64,
requiring the new change in tree-ssa-loop-niter.c to pass. It didn't
occur in the unpatched tree because the differences between anon or
named SSA_NAMEs in copyrename changed costs and caused different choices
in ivopts, which ultimately failed to expose the problem in loop-niter
during vrp.
At the end, I enclose the incremental changes since the previous
revision of the patch, to ease the incremental review.
Ok to install?
for gcc/ChangeLog
PR rtl-optimization/64164
* Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
* tree-ssa-copyrename.c: Removed.
* opts.c (default_options_table): Drop -ftree-copyrename. Add
-ftree-coalesce-vars.
* passes.def: Drop all occurrences of pass_rename_ssa_copies.
* common.opt (ftree-copyrename): Ignore.
(ftree-coalesce-inlined-vars): Likewise.
* doc/invoke.texi: Remove the ignored options above.
* gimple-expr.h (gimple_can_coalesce_p): Move declaration
* tree-ssa-coalesce.h: ... here.
* tree-ssa-uncprop.c: Include tree-ssa-coalesce.h and other
headers required by it.
* gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
across variables when flag_tree_coalesce_vars. Check register
use and promoted modes to allow coalescing. Moved to
tree-ssa-coalesce.c.
* tree-ssa-live.c (struct tree_int_map_hasher): Move along
with its member functions to tree-ssa-coalesce.c.
(var_map_base_init): Likewise. Renamed to
compute_samebase_partition_bases.
(partition_view_normal): Drop want_bases parameter.
(partition_view_bitmap): Likewise.
* tree-ssa-live.h: Adjust declarations.
* tree-ssa-coalesce.c: Include explow.h.
(build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs's
default defs at the entry point.
(dump_part_var_map): New.
(compute_optimized_partition_bases): New, called by...
(coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
of compute_samebase_partition_bases. Adjust.
* alias.c (nonoverlapping_memrefs_p): Disregard gimple-regs.
* cfgexpand.c (leader_merge): New.
(get_rtl_for_parm_ssa_default_def): New.
(set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
vars. Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
(expand_one_stack_var_at): Handle anonymous SSA_NAMEs. Drop
redundant MEM attr setting.
(expand_one_stack_var_1): Handle anonymous SSA_NAMEs. Renamed
from...
(expand_one_stack_var): ... this. New wrapper to check and
skip already expanded SSA partitions.
(record_alignment_for_reg_var): New, factored out of...
(expand_one_var): ... this.
(expand_one_ssa_partition): New.
(adjust_one_expanded_partition_var): New.
(expand_one_register_var): Check and skip already expanded SSA
partitions.
(expand_used_vars): Don't create DECLs for anonymous SSA
names. Expand all SSA partitions, then adjust all SSA names.
(pass::execute): Replace the loops that set
SA.partition_to_pseudo from partition leaders and cleared
DECL_RTL for multi-location variables, and that which used to
rename vars and set attrs, with one that clears DECL_RTL and
checks that PARMs and RESULTs default_defs match DECL_RTL.
* cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
* emit-rtl.c (set_reg_attrs_for_parm): Handle NULL decl.
* explow.c (promote_ssa_mode): New.
* explow.h (promote_ssa_mode): Declare.
* expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
* function.c: Include cfgexpand.h.
(use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
(use_register_for_parm_decl): Wrapper for the above to
special-case the result_ptr.
(rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
(maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
multiple locations.
(assign_parm_adjust_stack_rtl): Add all and parm arguments,
for rtl_for_parm. For SSA-assigned parms, zero stack_parm.
(assign_parm_setup_block): Prefer SSA-assigned location.
(assign_parm_setup_reg): Likewise. Use entry_parm for equiv
if stack_parm is NULL.
(assign_parm_setup_stack): Prefer SSA-assigned location.
(assign_parms): Maybe reset DECL_RTL of params. Adjust stack
rtl before testing for pointer bounds. Special-case result_ptr.
(expand_function_start): Maybe reset DECL_RTL of result.
Prefer SSA-assigned location for result and static chain.
Factor out DECL_RESULT and SET_DECL_RTL.
* tree-outof-ssa.c (insert_value_copy_on_edge): Handle
anonymous SSA names. Use promote_ssa_mode.
(get_temp_reg): Likewise.
(remove_ssa_form): Adjust.
* var-tracking.c (dataflow_set_clear_at_call): Take call_insn
and get its reg_usage for reg invalidation.
(compute_bb_dataflow): Pass it insn.
(emit_notes_in_bb): Likewise.
* tree-ssa-loop-niter.c (loop_exits_before_overflow): Don't
fail assert on conversion between unsigned types.
for gcc/testsuite/ChangeLog
* gcc.dg/guality/pr54200.c: Add -fno-tree-coalesce-vars.
* gcc.dg/ssp-1.c: Make counter a register.
* gcc.dg/ssp-2.c: Likewise.
* gcc.dg/torture/parm-coalesce.c: New.
---
gcc/Makefile.in | 1
gcc/alias.c | 13 +
gcc/cfgexpand.c | 370 ++++++++++++++-----
gcc/cfgexpand.h | 2
gcc/common.opt | 12 -
gcc/doc/invoke.texi | 48 +--
gcc/emit-rtl.c | 5
gcc/explow.c | 22 +
gcc/explow.h | 3
gcc/expr.c | 39 +-
gcc/function.c | 226 +++++++++---
gcc/gimple-expr.c | 39 --
gcc/gimple-expr.h | 1
gcc/opts.c | 2
gcc/passes.def | 5
gcc/testsuite/gcc.dg/guality/pr54200.c | 2
gcc/testsuite/gcc.dg/ssp-1.c | 2
gcc/testsuite/gcc.dg/ssp-2.c | 2
gcc/testsuite/gcc.dg/torture/parm-coalesce.c | 40 ++
gcc/tree-outof-ssa.c | 16 -
gcc/tree-ssa-coalesce.c | 380 +++++++++++++++++++-
gcc/tree-ssa-coalesce.h | 1
gcc/tree-ssa-copyrename.c | 499 --------------------------
gcc/tree-ssa-live.c | 101 -----
gcc/tree-ssa-live.h | 4
gcc/tree-ssa-loop-niter.c | 6
gcc/tree-ssa-uncprop.c | 5
gcc/var-tracking.c | 12 -
28 files changed, 984 insertions(+), 874 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/torture/parm-coalesce.c
delete mode 100644 gcc/tree-ssa-copyrename.c
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 3d14938..2a03223 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1441,7 +1441,6 @@ OBJS = \
tree-ssa-ccp.o \
tree-ssa-coalesce.o \
tree-ssa-copy.o \
- tree-ssa-copyrename.o \
tree-ssa-dce.o \
tree-ssa-dom.o \
tree-ssa-dse.o \
diff --git a/gcc/alias.c b/gcc/alias.c
index ea539c5..5a031d9 100644
--- a/gcc/alias.c
+++ b/gcc/alias.c
@@ -2552,6 +2552,19 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
if (! DECL_P (exprx) || ! DECL_P (expry))
return 0;
+ /* If we refer to different gimple registers, or one gimple register
+ and one non-gimple-register, we know they can't overlap. First,
+ gimple registers don't have their addresses taken. Now, there
+ could be more than one stack slot for (different versions of) the
+ same gimple register, but we can presumably tell they don't
+ overlap based on offsets from stack base addresses elsewhere.
+ It's important that we don't proceed to DECL_RTL, because gimple
+ registers may not pass DECL_RTL_SET_P, and make_decl_rtl won't be
+ able to do anything about them since no SSA information will have
+ remained to guide it. */
+ if (is_gimple_reg (exprx) || is_gimple_reg (expry))
+ return exprx != expry;
+
/* With invalid code we can end up storing into the constant pool.
Bail out to avoid ICEing when creating RTL for this.
See gfortran.dg/lto/20091028-2_0.f90. */
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index b190f91..bf972fc 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -179,21 +179,121 @@ gimple_assign_rhs_to_tree (gimple stmt)
#define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
+/* Choose either CUR or NEXT as the leader DECL for a partition.
+ Prefer ignored decls, to simplify debug dumps and reduce ambiguity
+ out of the same user variable being in multiple partitions (this is
+ less likely for compiler-introduced temps). */
+
+static tree
+leader_merge (tree cur, tree next)
+{
+ if (cur == NULL || cur == next)
+ return next;
+
+ if (DECL_P (cur) && DECL_IGNORED_P (cur))
+ return cur;
+
+ if (DECL_P (next) && DECL_IGNORED_P (next))
+ return next;
+
+ return cur;
+}
+
+
+/* Return the RTL for the default SSA def of a PARM or RESULT, if
+ there is one. */
+
+rtx
+get_rtl_for_parm_ssa_default_def (tree var)
+{
+ gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL);
+
+ if (!is_gimple_reg (var))
+ return NULL_RTX;
+
+ /* If we've already determined RTL for the decl, use it. This is
+ not just an optimization: if VAR is a PARM whose incoming value
+ is unused, we won't find a default def to use its partition, but
+ we still want to use the location of the parm, if it was used at
+ all. During assign_parms, until a location is assigned for the
+ VAR, RTL can only for a parm or result if we're not coalescing
+ across variables, when we know we're coalescing all SSA_NAMEs of
+ each parm or result, and we're not coalescing them with names
+ pertaining to other variables, such as other parms' default
+ defs. */
+ if (DECL_RTL_SET_P (var))
+ {
+ gcc_assert (DECL_RTL (var) != pc_rtx);
+ return DECL_RTL (var);
+ }
+
+ tree name = ssa_default_def (cfun, var);
+
+ if (!name)
+ return NULL_RTX;
+
+ int part = var_to_partition (SA.map, name);
+ if (part == NO_PARTITION)
+ return NULL_RTX;
+
+ return SA.partition_to_pseudo[part];
+}
+
/* Associate declaration T with storage space X. If T is no
SSA name this is exactly SET_DECL_RTL, otherwise make the
partition of T associated with X. */
static inline void
set_rtl (tree t, rtx x)
{
+ if (x && SSAVAR (t))
+ {
+ bool skip = false;
+ tree cur = NULL_TREE;
+
+ if (MEM_P (x))
+ cur = MEM_EXPR (x);
+ else if (REG_P (x))
+ cur = REG_EXPR (x);
+ else if (GET_CODE (x) == CONCAT
+ && REG_P (XEXP (x, 0)))
+ cur = REG_EXPR (XEXP (x, 0));
+ else if (GET_CODE (x) == PARALLEL)
+ cur = REG_EXPR (XVECEXP (x, 0, 0));
+ else if (x == pc_rtx)
+ skip = true;
+ else
+ gcc_unreachable ();
+
+ tree next = skip ? cur : leader_merge (cur, SSAVAR (t));
+
+ if (cur != next)
+ {
+ if (MEM_P (x))
+ set_mem_attributes (x, next, true);
+ else
+ set_reg_attrs_for_decl_rtl (next, x);
+ }
+ }
+
if (TREE_CODE (t) == SSA_NAME)
{
- SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x;
- if (x && !MEM_P (x))
- set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x);
- /* For the benefit of debug information at -O0 (where vartracking
- doesn't run) record the place also in the base DECL if it's
- a normal variable (not a parameter). */
- if (x && x != pc_rtx && TREE_CODE (SSA_NAME_VAR (t)) == VAR_DECL)
+ int part = var_to_partition (SA.map, t);
+ if (part != NO_PARTITION)
+ {
+ if (SA.partition_to_pseudo[part])
+ gcc_assert (SA.partition_to_pseudo[part] == x);
+ else
+ SA.partition_to_pseudo[part] = x;
+ }
+ /* For the benefit of debug information at -O0 (where
+ vartracking doesn't run) record the place also in the base
+ DECL. For PARMs and RESULTs, we may end up resetting these
+ in function.c:maybe_reset_rtl_for_parm, but in some rare
+ cases we may need them (unused and overwritten incoming
+ value, that at -O0 must share the location with the other
+ uses in spite of the missing default def), and this may be
+ the only chance to preserve them. */
+ if (x && x != pc_rtx && SSA_NAME_VAR (t))
{
tree var = SSA_NAME_VAR (t);
/* If we don't yet have something recorded, just record it now. */
@@ -909,7 +1009,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
x = plus_constant (Pmode, base, offset);
- x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x);
+ x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
+ ? TYPE_MODE (TREE_TYPE (decl))
+ : DECL_MODE (SSAVAR (decl)), x);
if (TREE_CODE (decl) != SSA_NAME)
{
@@ -931,7 +1033,6 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
DECL_USER_ALIGN (decl) = 0;
}
- set_mem_attributes (x, SSAVAR (decl), true);
set_rtl (decl, x);
}
@@ -1146,13 +1247,22 @@ account_stack_vars (void)
to a variable to be allocated in the stack frame. */
static void
-expand_one_stack_var (tree var)
+expand_one_stack_var_1 (tree var)
{
HOST_WIDE_INT size, offset;
unsigned byte_align;
- size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
- byte_align = align_local_variable (SSAVAR (var));
+ if (TREE_CODE (var) == SSA_NAME)
+ {
+ tree type = TREE_TYPE (var);
+ size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
+ byte_align = TYPE_ALIGN_UNIT (type);
+ }
+ else
+ {
+ size = tree_to_uhwi (DECL_SIZE_UNIT (var));
+ byte_align = align_local_variable (var);
+ }
/* We handle highly aligned variables in expand_stack_vars. */
gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
@@ -1163,6 +1273,27 @@ expand_one_stack_var (tree var)
crtl->max_used_stack_slot_alignment, offset);
}
+/* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are
+ already assigned some MEM. */
+
+static void
+expand_one_stack_var (tree var)
+{
+ if (TREE_CODE (var) == SSA_NAME)
+ {
+ int part = var_to_partition (SA.map, var);
+ if (part != NO_PARTITION)
+ {
+ rtx x = SA.partition_to_pseudo[part];
+ gcc_assert (x);
+ gcc_assert (MEM_P (x));
+ return;
+ }
+ }
+
+ return expand_one_stack_var_1 (var);
+}
+
/* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL
that will reside in a hard register. */
@@ -1172,13 +1303,114 @@ expand_one_hard_reg_var (tree var)
rest_of_decl_compilation (var, 0, 0);
}
+/* Record the alignment requirements of some variable assigned to a
+ pseudo. */
+
+static void
+record_alignment_for_reg_var (unsigned int align)
+{
+ if (SUPPORTS_STACK_ALIGNMENT
+ && crtl->stack_alignment_estimated < align)
+ {
+ /* stack_alignment_estimated shouldn't change after stack
+ realign decision made */
+ gcc_assert (!crtl->stack_realign_processed);
+ crtl->stack_alignment_estimated = align;
+ }
+
+ /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
+ So here we only make sure stack_alignment_needed >= align. */
+ if (crtl->stack_alignment_needed < align)
+ crtl->stack_alignment_needed = align;
+ if (crtl->max_used_stack_slot_alignment < align)
+ crtl->max_used_stack_slot_alignment = align;
+}
+
+/* Create RTL for an SSA partition. */
+
+static void
+expand_one_ssa_partition (tree var)
+{
+ int part = var_to_partition (SA.map, var);
+ gcc_assert (part != NO_PARTITION);
+
+ if (SA.partition_to_pseudo[part])
+ return;
+
+ if (!use_register_for_decl (var))
+ {
+ expand_one_stack_var_1 (var);
+ return;
+ }
+
+ unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (var),
+ TYPE_MODE (TREE_TYPE (var)),
+ TYPE_ALIGN (TREE_TYPE (var)));
+
+ /* If the variable alignment is very large we'll dynamicaly allocate
+ it, which means that in-frame portion is just a pointer. */
+ if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
+ align = POINTER_SIZE;
+
+ record_alignment_for_reg_var (align);
+
+ machine_mode reg_mode = promote_ssa_mode (var, NULL);
+
+ rtx x = gen_reg_rtx (reg_mode);
+
+ set_rtl (var, x);
+}
+
+/* Record the association between the RTL generated for a partition
+ and the underlying variable of the SSA_NAME. */
+
+static void
+adjust_one_expanded_partition_var (tree var)
+{
+ if (!var)
+ return;
+
+ tree decl = SSA_NAME_VAR (var);
+
+ int part = var_to_partition (SA.map, var);
+ if (part == NO_PARTITION)
+ return;
+
+ rtx x = SA.partition_to_pseudo[part];
+
+ set_rtl (var, x);
+
+ if (!REG_P (x))
+ return;
+
+ /* Note if the object is a user variable. */
+ if (decl && !DECL_ARTIFICIAL (decl))
+ mark_user_reg (x);
+
+ if (POINTER_TYPE_P (decl ? TREE_TYPE (decl) : TREE_TYPE (var)))
+ mark_reg_pointer (x, get_pointer_alignment (var));
+}
+
/* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL
that will reside in a pseudo register. */
static void
expand_one_register_var (tree var)
{
- tree decl = SSAVAR (var);
+ if (TREE_CODE (var) == SSA_NAME)
+ {
+ int part = var_to_partition (SA.map, var);
+ if (part != NO_PARTITION)
+ {
+ rtx x = SA.partition_to_pseudo[part];
+ gcc_assert (x);
+ gcc_assert (REG_P (x));
+ return;
+ }
+ gcc_unreachable ();
+ }
+
+ tree decl = var;
tree type = TREE_TYPE (decl);
machine_mode reg_mode = promote_decl_mode (decl, NULL);
rtx x = gen_reg_rtx (reg_mode);
@@ -1312,21 +1544,7 @@ expand_one_var (tree var, bool toplevel, bool really_expand)
align = POINTER_SIZE;
}
- if (SUPPORTS_STACK_ALIGNMENT
- && crtl->stack_alignment_estimated < align)
- {
- /* stack_alignment_estimated shouldn't change after stack
- realign decision made */
- gcc_assert (!crtl->stack_realign_processed);
- crtl->stack_alignment_estimated = align;
- }
-
- /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
- So here we only make sure stack_alignment_needed >= align. */
- if (crtl->stack_alignment_needed < align)
- crtl->stack_alignment_needed = align;
- if (crtl->max_used_stack_slot_alignment < align)
- crtl->max_used_stack_slot_alignment = align;
+ record_alignment_for_reg_var (align);
if (TREE_CODE (origvar) == SSA_NAME)
{
@@ -1760,48 +1978,18 @@ expand_used_vars (void)
if (targetm.use_pseudo_pic_reg ())
pic_offset_table_rtx = gen_reg_rtx (Pmode);
- hash_map<tree, tree> ssa_name_decls;
for (i = 0; i < SA.map->num_partitions; i++)
{
tree var = partition_to_var (SA.map, i);
gcc_assert (!virtual_operand_p (var));
- /* Assign decls to each SSA name partition, share decls for partitions
- we could have coalesced (those with the same type). */
- if (SSA_NAME_VAR (var) == NULL_TREE)
- {
- tree *slot = &ssa_name_decls.get_or_insert (TREE_TYPE (var));
- if (!*slot)
- *slot = create_tmp_reg (TREE_TYPE (var));
- replace_ssa_name_symbol (var, *slot);
- }
-
- /* Always allocate space for partitions based on VAR_DECLs. But for
- those based on PARM_DECLs or RESULT_DECLs and which matter for the
- debug info, there is no need to do so if optimization is disabled
- because all the SSA_NAMEs based on these DECLs have been coalesced
- into a single partition, which is thus assigned the canonical RTL
- location of the DECLs. If in_lto_p, we can't rely on optimize,
- a function could be compiled with -O1 -flto first and only the
- link performed at -O0. */
- if (TREE_CODE (SSA_NAME_VAR (var)) == VAR_DECL)
- expand_one_var (var, true, true);
- else if (DECL_IGNORED_P (SSA_NAME_VAR (var)) || optimize || in_lto_p)
- {
- /* This is a PARM_DECL or RESULT_DECL. For those partitions that
- contain the default def (representing the parm or result itself)
- we don't do anything here. But those which don't contain the
- default def (representing a temporary based on the parm/result)
- we need to allocate space just like for normal VAR_DECLs. */
- if (!bitmap_bit_p (SA.partition_has_default_def, i))
- {
- expand_one_var (var, true, true);
- gcc_assert (SA.partition_to_pseudo[i]);
- }
- }
+ expand_one_ssa_partition (var);
}
+ for (i = 1; i < num_ssa_names; i++)
+ adjust_one_expanded_partition_var (ssa_name (i));
+
if (flag_stack_protect == SPCT_FLAG_STRONG)
gen_stack_protect_signal
= stack_protect_decl_p () || stack_protect_return_slot_p ();
@@ -5961,35 +6149,6 @@ pass_expand::execute (function *fun)
parm_birth_insn = var_seq;
}
- /* Now that we also have the parameter RTXs, copy them over to our
- partitions. */
- for (i = 0; i < SA.map->num_partitions; i++)
- {
- tree var = SSA_NAME_VAR (partition_to_var (SA.map, i));
-
- if (TREE_CODE (var) != VAR_DECL
- && !SA.partition_to_pseudo[i])
- SA.partition_to_pseudo[i] = DECL_RTL_IF_SET (var);
- gcc_assert (SA.partition_to_pseudo[i]);
-
- /* If this decl was marked as living in multiple places, reset
- this now to NULL. */
- if (DECL_RTL_IF_SET (var) == pc_rtx)
- SET_DECL_RTL (var, NULL);
-
- /* Some RTL parts really want to look at DECL_RTL(x) when x
- was a decl marked in REG_ATTR or MEM_ATTR. We could use
- SET_DECL_RTL here making this available, but that would mean
- to select one of the potentially many RTLs for one DECL. Instead
- of doing that we simply reset the MEM_EXPR of the RTL in question,
- then nobody can get at it and hence nobody can call DECL_RTL on it. */
- if (!DECL_RTL_SET_P (var))
- {
- if (MEM_P (SA.partition_to_pseudo[i]))
- set_mem_expr (SA.partition_to_pseudo[i], NULL);
- }
- }
-
/* If we have a class containing differently aligned pointers
we need to merge those into the corresponding RTL pointer
alignment. */
@@ -5997,7 +6156,6 @@ pass_expand::execute (function *fun)
{
tree name = ssa_name (i);
int part;
- rtx r;
if (!name
/* We might have generated new SSA names in
@@ -6010,20 +6168,24 @@ pass_expand::execute (function *fun)
if (part == NO_PARTITION)
continue;
- /* Adjust all partition members to get the underlying decl of
- the representative which we might have created in expand_one_var. */
- if (SSA_NAME_VAR (name) == NULL_TREE)
+ gcc_assert (SA.partition_to_pseudo[part]);
+
+ /* If this decl was marked as living in multiple places, reset
+ this now to NULL. */
+ tree var = SSA_NAME_VAR (name);
+ if (var && DECL_RTL_IF_SET (var) == pc_rtx)
+ SET_DECL_RTL (var, NULL);
+ /* Check that the pseudos chosen by assign_parms are those of
+ the corresponding default defs. */
+ else if (SSA_NAME_IS_DEFAULT_DEF (name)
+ && (TREE_CODE (var) == PARM_DECL
+ || TREE_CODE (var) == RESULT_DECL))
{
- tree leader = partition_to_var (SA.map, part);
- gcc_assert (SSA_NAME_VAR (leader) != NULL_TREE);
- replace_ssa_name_symbol (name, SSA_NAME_VAR (leader));
+ rtx in = DECL_RTL_IF_SET (var);
+ gcc_assert (in);
+ rtx out = SA.partition_to_pseudo[part];
+ gcc_assert (in == out || rtx_equal_p (in, out));
}
- if (!POINTER_TYPE_P (TREE_TYPE (name)))
- continue;
-
- r = SA.partition_to_pseudo[part];
- if (REG_P (r))
- mark_reg_pointer (r, get_pointer_alignment (name));
}
/* If this function is `main', emit a call to `__main'
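
Reviewer's aside on the cfgexpand.c changes above: expand_used_vars now
works in two passes. It first calls expand_one_ssa_partition for each
partition, then walks every SSA name and calls
adjust_one_expanded_partition_var so that each name picks up the RTL of
its partition. The following is only a toy sketch of that shape, not
code from the patch: the arrays, the integer "pseudo" numbers and the
function names are made-up stand-ins for SA.map,
SA.partition_to_pseudo and the per-name RTL.

```c
#include <assert.h>

#define NO_PARTITION (-1)
#define NUM_NAMES 6
#define NUM_PARTITIONS 3

/* Toy stand-in for SA.map: SSA name index -> partition, or NO_PARTITION.  */
static const int name_partition[NUM_NAMES] = { 0, 0, 1, NO_PARTITION, 1, 2 };

/* Toy stand-ins for SA.partition_to_pseudo and the per-name RTL.  */
static int partition_pseudo[NUM_PARTITIONS];
static int name_pseudo[NUM_NAMES];

/* Pass 1: allocate exactly one pseudo register number per partition.
   FIRST_PSEUDO must be non-zero, since 0 means "no RTL" here.  */
static void
expand_partitions (int first_pseudo)
{
  for (int p = 0; p < NUM_PARTITIONS; p++)
    partition_pseudo[p] = first_pseudo + p;
}

/* Pass 2: every name inherits the pseudo of its partition; names that
   belong to no partition are left without RTL (0 here).  */
static void
adjust_names (void)
{
  for (int n = 0; n < NUM_NAMES; n++)
    {
      int part = name_partition[n];
      if (part == NO_PARTITION)
        {
          name_pseudo[n] = 0;
          continue;
        }
      /* Mirrors the gcc_assert on the partition's pseudo.  */
      assert (partition_pseudo[part] != 0);
      name_pseudo[n] = partition_pseudo[part];
    }
}
```

With the map above, names 0 and 1 end up sharing one pseudo, names 2 and
4 share another, and name 3 gets none, which is the property the second
loop in expand_used_vars is meant to establish.
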
diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
index a0b6e3e..602579d 100644
--- a/gcc/cfgexpand.h
+++ b/gcc/cfgexpand.h
@@ -22,5 +22,7 @@ along with GCC; see the file COPYING3. If not see
extern tree gimple_assign_rhs_to_tree (gimple);
extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
+extern rtx get_rtl_for_parm_ssa_default_def (tree var);
+
#endif /* GCC_CFGEXPAND_H */
diff --git a/gcc/common.opt b/gcc/common.opt
index 32b416a..051f824 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2227,16 +2227,16 @@ Common Report Var(flag_tree_ch) Optimization
Enable loop header copying on trees
ftree-coalesce-inlined-vars
-Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization
-Enable coalescing of copy-related user variables that are inlined
+Common Ignore RejectNegative
+Does nothing. Preserved for backward compatibility.
ftree-coalesce-vars
-Common Report Var(flag_ssa_coalesce_vars,2) Optimization
-Enable coalescing of all copy-related user variables
+Common Report Var(flag_tree_coalesce_vars) Optimization
+Enable SSA coalescing of user variables
ftree-copyrename
-Common Report Var(flag_tree_copyrename) Optimization
-Replace SSA temporaries with better names in copies
+Common Ignore
+Does nothing. Preserved for backward compatibility.
ftree-copy-prop
Common Report Var(flag_tree_copy_prop) Optimization
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index e25bd62..e359be2 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -339,7 +339,6 @@ Objective-C and Objective-C++ Dialects}.
-fdump-tree-phiprop@r{[}-@var{n}@r{]} @gol
-fdump-tree-phiopt@r{[}-@var{n}@r{]} @gol
-fdump-tree-forwprop@r{[}-@var{n}@r{]} @gol
--fdump-tree-copyrename@r{[}-@var{n}@r{]} @gol
-fdump-tree-nrv -fdump-tree-vect @gol
-fdump-tree-sink @gol
-fdump-tree-sra@r{[}-@var{n}@r{]} @gol
@@ -445,9 +444,8 @@ Objective-C and Objective-C++ Dialects}.
-fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol
-fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
-ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
--ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
--ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
--ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
+-ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol
+-ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
-ftree-loop-if-convert-stores -ftree-loop-im @gol
-ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
-ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
@@ -7076,11 +7074,6 @@ name is made by appending @file{.phiopt} to the source file name.
Dump each function after forward propagating single use variables. The file
name is made by appending @file{.forwprop} to the source file name.
-@item copyrename
-@opindex fdump-tree-copyrename
-Dump each function after applying the copy rename optimization. The file
-name is made by appending @file{.copyrename} to the source file name.
-
@item nrv
@opindex fdump-tree-nrv
Dump each function after applying the named return value optimization on
@@ -7545,8 +7538,8 @@ compilation time.
-ftree-ccp @gol
-fssa-phiopt @gol
-ftree-ch @gol
+-ftree-coalesce-vars @gol
-ftree-copy-prop @gol
--ftree-copyrename @gol
-ftree-dce @gol
-ftree-dominator-opts @gol
-ftree-dse @gol
@@ -8815,6 +8808,15 @@ profitable to parallelize the loops.
Compare the results of several data dependence analyzers. This option
is used for debugging the data dependence analyzers.
+@item -ftree-coalesce-vars
+@opindex ftree-coalesce-vars
+Tell the compiler to attempt to combine small user-defined variables
+too, instead of just compiler temporaries. This may severely limit the
+ability to debug an optimized program compiled with
+@option{-fno-var-tracking-assignments}. In the negated form, this flag
+prevents SSA coalescing of user variables. This option is enabled by
+default if optimization is enabled.
+
@item -ftree-loop-if-convert
@opindex ftree-loop-if-convert
Attempt to transform conditional jumps in the innermost loops to
@@ -8928,32 +8930,6 @@ Perform scalar replacement of aggregates. This pass replaces structure
references with scalars to prevent committing structures to memory too
early. This flag is enabled by default at @option{-O} and higher.
-@item -ftree-copyrename
-@opindex ftree-copyrename
-Perform copy renaming on trees. This pass attempts to rename compiler
-temporaries to other variables at copy locations, usually resulting in
-variable names which more closely resemble the original variables. This flag
-is enabled by default at @option{-O} and higher.
-
-@item -ftree-coalesce-inlined-vars
-@opindex ftree-coalesce-inlined-vars
-Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
-combine small user-defined variables too, but only if they are inlined
-from other functions. It is a more limited form of
-@option{-ftree-coalesce-vars}. This may harm debug information of such
-inlined variables, but it keeps variables of the inlined-into
-function apart from each other, such that they are more likely to
-contain the expected values in a debugging session.
-
-@item -ftree-coalesce-vars
-@opindex ftree-coalesce-vars
-Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
-combine small user-defined variables too, instead of just compiler
-temporaries. This may severely limit the ability to debug an optimized
-program compiled with @option{-fno-var-tracking-assignments}. In the
-negated form, this flag prevents SSA coalescing of user variables,
-including inlined ones. This option is enabled by default.
-
@item -ftree-ter
@opindex ftree-ter
Perform temporary expression replacement during the SSA->normal phase. Single
diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index 49a1509..2b98946 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -1249,6 +1249,9 @@ set_reg_attrs_for_parm (rtx parm_rtx, rtx mem)
void
set_reg_attrs_for_decl_rtl (tree t, rtx x)
{
+ if (!t)
+ return;
+ tree tdecl = t;
if (GET_CODE (x) == SUBREG)
{
gcc_assert (subreg_lowpart_p (x));
@@ -1257,7 +1260,7 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x)
if (REG_P (x))
REG_ATTRS (x)
= get_reg_attrs (t, byte_lowpart_offset (GET_MODE (x),
- DECL_MODE (t)));
+ DECL_MODE (tdecl)));
if (GET_CODE (x) == CONCAT)
{
if (REG_P (XEXP (x, 0)))
diff --git a/gcc/explow.c b/gcc/explow.c
index 8745aea..5b0d49c 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -856,6 +856,28 @@ promote_decl_mode (const_tree decl, int *punsignedp)
return pmode;
}
+/* Return the promoted mode for NAME. If it is a named SSA_NAME, it
+ is the same as promote_decl_mode. Otherwise, it is the promoted
+ mode of a temp decl of the same type as the SSA_NAME, if we had created
+ one. */
+
+machine_mode
+promote_ssa_mode (const_tree name, int *punsignedp)
+{
+ gcc_assert (TREE_CODE (name) == SSA_NAME);
+
+ tree type = TREE_TYPE (name);
+ int unsignedp = TYPE_UNSIGNED (type);
+ machine_mode mode = TYPE_MODE (type);
+
+ machine_mode pmode = promote_mode (type, mode, &unsignedp);
+ if (punsignedp)
+ *punsignedp = unsignedp;
+
+ return pmode;
+}
+
+
\f
/* Controls the behaviour of {anti_,}adjust_stack. */
static bool suppress_reg_args_size;
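
Reviewer's aside on promote_ssa_mode: the new function derives the mode
and signedness purely from the SSA name's type and writes the
signedness back only when the caller passes a non-null pointer. A
self-contained sketch of that calling convention follows. The toy_mode
enum, the toy_type struct and the promotion rule (anything narrower
than 32 bits widens to 32 bits and becomes signed) are assumptions made
up for illustration; they are not GCC's machine modes or any real
target's promote_mode.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a machine-mode enum; values are bit widths.  */
enum toy_mode { TOY_QI = 8, TOY_HI = 16, TOY_SI = 32, TOY_DI = 64 };

/* Hypothetical stand-in for the type of an SSA name.  */
struct toy_type
{
  enum toy_mode mode;
  int is_unsigned;
};

/* Assumed target rule: modes narrower than TOY_SI are widened to
   TOY_SI and treated as signed afterwards; wider modes are unchanged.  */
static enum toy_mode
toy_promote_mode (enum toy_mode mode, int *unsignedp)
{
  if (mode < TOY_SI)
    {
      *unsignedp = 0;
      return TOY_SI;
    }
  return mode;
}

/* Same shape as promote_ssa_mode: take everything from the type, and
   report signedness only if the caller asked for it.  */
static enum toy_mode
toy_promote_ssa_mode (const struct toy_type *type, int *punsignedp)
{
  int unsignedp = type->is_unsigned;
  enum toy_mode pmode = toy_promote_mode (type->mode, &unsignedp);
  if (punsignedp)
    *punsignedp = unsignedp;
  return pmode;
}
```

Callers that only need the mode, as expand_one_ssa_partition does with
promote_ssa_mode (var, NULL), can pass a null pointer for the second
argument.
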
diff --git a/gcc/explow.h b/gcc/explow.h
index 94613de..52113db 100644
--- a/gcc/explow.h
+++ b/gcc/explow.h
@@ -57,6 +57,9 @@ extern machine_mode promote_mode (const_tree, machine_mode, int *);
/* Return mode and signedness to use when object is promoted. */
machine_mode promote_decl_mode (const_tree, int *);
+/* Return mode and signedness to use when an SSA_NAME is promoted. */
+machine_mode promote_ssa_mode (const_tree, int *);
+
/* Remove some bytes from the stack. An rtx says how many. */
extern void adjust_stack (rtx);
diff --git a/gcc/expr.c b/gcc/expr.c
index 5a931dc..5b6e16e 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9301,7 +9301,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
rtx op0, op1, temp, decl_rtl;
tree type;
int unsignedp;
- machine_mode mode;
+ machine_mode mode, dmode;
enum tree_code code = TREE_CODE (exp);
rtx subtarget, original_target;
int ignore;
@@ -9432,7 +9432,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
if (g == NULL
&& modifier == EXPAND_INITIALIZER
&& !SSA_NAME_IS_DEFAULT_DEF (exp)
- && (optimize || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
+ && (optimize || !SSA_NAME_VAR (exp)
+ || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
&& stmt_is_replaceable_p (SSA_NAME_DEF_STMT (exp)))
g = SSA_NAME_DEF_STMT (exp);
if (g)
@@ -9511,15 +9512,18 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
/* Ensure variable marked as used even if it doesn't go through
a parser. If it hasn't be used yet, write out an external
definition. */
- TREE_USED (exp) = 1;
+ if (exp)
+ TREE_USED (exp) = 1;
/* Show we haven't gotten RTL for this yet. */
temp = 0;
/* Variables inherited from containing functions should have
been lowered by this point. */
- context = decl_function_context (exp);
- gcc_assert (SCOPE_FILE_SCOPE_P (context)
+ if (exp)
+ context = decl_function_context (exp);
+ gcc_assert (!exp
+ || SCOPE_FILE_SCOPE_P (context)
|| context == current_function_decl
|| TREE_STATIC (exp)
|| DECL_EXTERNAL (exp)
@@ -9543,7 +9547,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
decl_rtl = use_anchored_address (decl_rtl);
if (modifier != EXPAND_CONST_ADDRESS
&& modifier != EXPAND_SUM
- && !memory_address_addr_space_p (DECL_MODE (exp),
+ && !memory_address_addr_space_p (exp ? DECL_MODE (exp)
+ : GET_MODE (decl_rtl),
XEXP (decl_rtl, 0),
MEM_ADDR_SPACE (decl_rtl)))
temp = replace_equiv_address (decl_rtl,
@@ -9554,12 +9559,17 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
if the address is a register. */
if (temp != 0)
{
- if (MEM_P (temp) && REG_P (XEXP (temp, 0)))
+ if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0)))
mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp));
return temp;
}
+ if (exp)
+ dmode = DECL_MODE (exp);
+ else
+ dmode = TYPE_MODE (TREE_TYPE (ssa_name));
+
/* If the mode of DECL_RTL does not match that of the decl,
there are two cases: we are dealing with a BLKmode value
that is returned in a register, or we are dealing with
@@ -9567,22 +9577,23 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
of the wanted mode, but mark it so that we know that it
was already extended. */
if (REG_P (decl_rtl)
- && DECL_MODE (exp) != BLKmode
- && GET_MODE (decl_rtl) != DECL_MODE (exp))
+ && dmode != BLKmode
+ && GET_MODE (decl_rtl) != dmode)
{
machine_mode pmode;
/* Get the signedness to be used for this variable. Ensure we get
the same mode we got when the variable was declared. */
- if (code == SSA_NAME
- && (g = SSA_NAME_DEF_STMT (ssa_name))
- && gimple_code (g) == GIMPLE_CALL
- && !gimple_call_internal_p (g))
+ if (code != SSA_NAME)
+ pmode = promote_decl_mode (exp, &unsignedp);
+ else if ((g = SSA_NAME_DEF_STMT (ssa_name))
+ && gimple_code (g) == GIMPLE_CALL
+ && !gimple_call_internal_p (g))
pmode = promote_function_mode (type, mode, &unsignedp,
gimple_call_fntype (g),
2);
else
- pmode = promote_decl_mode (exp, &unsignedp);
+ pmode = promote_ssa_mode (ssa_name, &unsignedp);
gcc_assert (GET_MODE (decl_rtl) == pmode);
temp = gen_lowpart_SUBREG (mode, decl_rtl);
diff --git a/gcc/function.c b/gcc/function.c
index 7d2d7e4..58e2498 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -87,6 +87,7 @@ along with GCC; see the file COPYING3. If not see
#include "cfganal.h"
#include "cfgbuild.h"
#include "cfgcleanup.h"
+#include "cfgexpand.h"
#include "basic-block.h"
#include "df.h"
#include "params.h"
@@ -2121,6 +2122,30 @@ aggregate_value_p (const_tree exp, const_tree fntype)
bool
use_register_for_decl (const_tree decl)
{
+ if (TREE_CODE (decl) == SSA_NAME)
+ {
+ /* We often try to use the SSA_NAME, instead of its underlying
+ decl, to get type information and guide decisions, to avoid
+ differences of behavior between anonymous and named
+ variables, but in this one case we have to go for the actual
+ variable if there is one. The main reason is that, at least
+ at -O0, we want to place user variables on the stack, but we
+ don't mind using pseudos for anonymous or ignored temps.
+ Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
+ should go in pseudos, whereas their corresponding variables
+ might have to go on the stack. So, disregarding the decl
+ here would negatively impact debug info at -O0, enable
+ coalescing between SSA_NAMEs that ought to get different
+ stack/pseudo assignments, and get the incoming argument
+ processing thoroughly confused by PARM_DECLs expected to live
+ in stack slots but assigned to pseudos. */
+ if (!SSA_NAME_VAR (decl))
+ return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
+ && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
+
+ decl = SSA_NAME_VAR (decl);
+ }
+
if (!targetm.calls.allocate_stack_slots_for_args ())
return true;
@@ -2804,23 +2829,88 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data)
data->entry_parm = entry_parm;
}
+/* Wrapper for use_register_for_decl, that special-cases the
+ .result_ptr as the function's RESULT_DECL when the RESULT_DECL is
+ passed by reference. */
+
+static bool
+use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm)
+{
+ if (parm == all->function_result_decl)
+ {
+ tree result = DECL_RESULT (current_function_decl);
+
+ if (DECL_BY_REFERENCE (result))
+ parm = result;
+ }
+
+ return use_register_for_decl (parm);
+}
+
+/* Wrapper for get_rtl_for_parm_ssa_default_def, that special-cases
+ the .result_ptr as the function's RESULT_DECL when the RESULT_DECL
+ is passed by reference. */
+
+static rtx
+rtl_for_parm (struct assign_parm_data_all *all, tree parm)
+{
+ if (parm == all->function_result_decl)
+ {
+ tree result = DECL_RESULT (current_function_decl);
+
+ if (!DECL_BY_REFERENCE (result))
+ return NULL_RTX;
+
+ parm = result;
+ }
+
+ return get_rtl_for_parm_ssa_default_def (parm);
+}
+
+/* Reset the location of PARM_DECLs and RESULT_DECLs that had
+ SSA_NAMEs in multiple partitions, so that assign_parms will choose
+ the default def, if it exists, or create new RTL to hold the unused
+ entry value. If we are coalescing across variables, we want to
+ reset the location too, because a parm without a default def
+ (incoming value unused) might be coalesced with one with a default
+ def, and then assign_parms would copy both incoming values to the
+ same location, which might cause the wrong value to survive. */
+static void
+maybe_reset_rtl_for_parm (tree parm)
+{
+ gcc_assert (TREE_CODE (parm) == PARM_DECL
+ || TREE_CODE (parm) == RESULT_DECL);
+ if ((flag_tree_coalesce_vars
+ || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx))
+ && is_gimple_reg (parm))
+ SET_DECL_RTL (parm, NULL_RTX);
+}
+
/* A subroutine of assign_parms. Adjust DATA->STACK_RTL such that it's
always valid and properly aligned. */
static void
-assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data)
+assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm,
+ struct assign_parm_data_one *data)
{
rtx stack_parm = data->stack_parm;
+ /* If out-of-SSA assigned RTL to the parm default def, make sure we
+ don't use what we might have computed before. */
+ rtx ssa_assigned = rtl_for_parm (all, parm);
+ if (ssa_assigned)
+ stack_parm = NULL;
+
/* If we can't trust the parm stack slot to be aligned enough for its
ultimate type, don't use that slot after entry. We'll make another
stack slot, if we need one. */
- if (stack_parm
- && ((STRICT_ALIGNMENT
- && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm))
- || (data->nominal_type
- && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
- && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
+ else if (stack_parm
+ && ((STRICT_ALIGNMENT
+ && (GET_MODE_ALIGNMENT (data->nominal_mode)
+ > MEM_ALIGN (stack_parm)))
+ || (data->nominal_type
+ && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
+ && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
stack_parm = NULL;
/* If parm was passed in memory, and we need to convert it on entry,
@@ -2882,11 +2972,16 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
size = int_size_in_bytes (data->passed_type);
size_stored = CEIL_ROUND (size, UNITS_PER_WORD);
+
if (stack_parm == 0)
{
DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
- stack_parm = assign_stack_local (BLKmode, size_stored,
- DECL_ALIGN (parm));
+ stack_parm = rtl_for_parm (all, parm);
+ if (!stack_parm)
+ stack_parm = assign_stack_local (BLKmode, size_stored,
+ DECL_ALIGN (parm));
+ else
+ stack_parm = copy_rtx (stack_parm);
if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
PUT_MODE (stack_parm, GET_MODE (entry_parm));
set_mem_attributes (stack_parm, parm, 1);
@@ -3027,10 +3122,19 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
= promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
TREE_TYPE (current_function_decl), 2);
- parmreg = gen_reg_rtx (promoted_nominal_mode);
+ rtx from_expand = rtl_for_parm (all, parm);
- if (!DECL_ARTIFICIAL (parm))
- mark_user_reg (parmreg);
+ if (from_expand && !data->passed_pointer)
+ {
+ parmreg = from_expand;
+ gcc_assert (GET_MODE (parmreg) == promoted_nominal_mode);
+ }
+ else
+ {
+ parmreg = gen_reg_rtx (promoted_nominal_mode);
+ if (!DECL_ARTIFICIAL (parm))
+ mark_user_reg (parmreg);
+ }
/* If this was an item that we received a pointer to,
set DECL_RTL appropriately. */
@@ -3049,6 +3153,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
assign_parm_find_data_types and expand_expr_real_1. */
equiv_stack_parm = data->stack_parm;
+ if (!equiv_stack_parm)
+ equiv_stack_parm = data->entry_parm;
validated_mem = validize_mem (copy_rtx (data->entry_parm));
need_conversion = (data->nominal_mode != data->passed_mode
@@ -3189,11 +3295,17 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
/* If we were passed a pointer but the actual value can safely live
in a register, retrieve it and use it directly. */
- if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode)
+ if (data->passed_pointer
+ && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode))
{
/* We can't use nominal_mode, because it will have been set to
Pmode above. We must use the actual mode of the parm. */
- if (use_register_for_decl (parm))
+ if (from_expand)
+ {
+ parmreg = from_expand;
+ gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm)));
+ }
+ else if (use_register_for_decl (parm))
{
parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm)));
mark_user_reg (parmreg);
@@ -3233,7 +3345,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
/* STACK_PARM is the pointer, not the parm, and PARMREG is
now the parm. */
- data->stack_parm = NULL;
+ data->stack_parm = equiv_stack_parm = NULL;
}
/* Mark the register as eliminable if we did no conversion and it was
@@ -3243,11 +3355,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
make here would screw up life analysis for it. */
if (data->nominal_mode == data->passed_mode
&& !did_conversion
- && data->stack_parm != 0
- && MEM_P (data->stack_parm)
+ && equiv_stack_parm != 0
+ && MEM_P (equiv_stack_parm)
&& data->locate.offset.var == 0
&& reg_mentioned_p (virtual_incoming_args_rtx,
- XEXP (data->stack_parm, 0)))
+ XEXP (equiv_stack_parm, 0)))
{
rtx_insn *linsn = get_last_insn ();
rtx_insn *sinsn;
@@ -3260,8 +3372,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
= GET_MODE_INNER (GET_MODE (parmreg));
int regnor = REGNO (XEXP (parmreg, 0));
int regnoi = REGNO (XEXP (parmreg, 1));
- rtx stackr = adjust_address_nv (data->stack_parm, submode, 0);
- rtx stacki = adjust_address_nv (data->stack_parm, submode,
+ rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0);
+ rtx stacki = adjust_address_nv (equiv_stack_parm, submode,
GET_MODE_SIZE (submode));
/* Scan backwards for the set of the real and
@@ -3334,6 +3446,13 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
if (data->stack_parm == 0)
{
+ rtx x = data->stack_parm = rtl_for_parm (all, parm);
+ if (x)
+ gcc_assert (GET_MODE (x) == GET_MODE (data->entry_parm));
+ }
+
+ if (data->stack_parm == 0)
+ {
int align = STACK_SLOT_ALIGNMENT (data->passed_type,
GET_MODE (data->entry_parm),
TYPE_ALIGN (data->passed_type));
@@ -3592,6 +3711,8 @@ assign_parms (tree fndecl)
DECL_INCOMING_RTL (parm) = DECL_RTL (parm);
continue;
}
+ else
+ maybe_reset_rtl_for_parm (parm);
/* Estimate stack alignment from parameter alignment. */
if (SUPPORTS_STACK_ALIGNMENT)
@@ -3641,7 +3762,9 @@ assign_parms (tree fndecl)
else
set_decl_incoming_rtl (parm, data.entry_parm, false);
- /* Boudns should be loaded in the particular order to
+ assign_parm_adjust_stack_rtl (&all, parm, &data);
+
+ /* Bounds should be loaded in the particular order to
have registers allocated correctly. Collect info about
input bounds and load them later. */
if (POINTER_BOUNDS_TYPE_P (data.passed_type))
@@ -3658,11 +3781,10 @@ assign_parms (tree fndecl)
}
else
{
- assign_parm_adjust_stack_rtl (&data);
-
if (assign_parm_setup_block_p (&data))
assign_parm_setup_block (&all, parm, &data);
- else if (data.passed_pointer || use_register_for_decl (parm))
+ else if (data.passed_pointer
+ || use_register_for_parm_decl (&all, parm))
assign_parm_setup_reg (&all, parm, &data);
else
assign_parm_setup_stack (&all, parm, &data);
@@ -5004,7 +5126,9 @@ expand_function_start (tree subr)
before any library calls that assign parms might generate. */
/* Decide whether to return the value in memory or in a register. */
- if (aggregate_value_p (DECL_RESULT (subr), subr))
+ tree res = DECL_RESULT (subr);
+ maybe_reset_rtl_for_parm (res);
+ if (aggregate_value_p (res, subr))
{
/* Returning something that won't go in a register. */
rtx value_address = 0;
@@ -5012,7 +5136,7 @@ expand_function_start (tree subr)
#ifdef PCC_STATIC_STRUCT_RETURN
if (cfun->returns_pcc_struct)
{
- int size = int_size_in_bytes (TREE_TYPE (DECL_RESULT (subr)));
+ int size = int_size_in_bytes (TREE_TYPE (res));
value_address = assemble_static_space (size);
}
else
@@ -5024,36 +5148,45 @@ expand_function_start (tree subr)
it. */
if (sv)
{
- value_address = gen_reg_rtx (Pmode);
+ if (DECL_BY_REFERENCE (res))
+ value_address = get_rtl_for_parm_ssa_default_def (res);
+ if (!value_address)
+ value_address = gen_reg_rtx (Pmode);
emit_move_insn (value_address, sv);
}
}
if (value_address)
{
rtx x = value_address;
- if (!DECL_BY_REFERENCE (DECL_RESULT (subr)))
+ if (!DECL_BY_REFERENCE (res))
{
- x = gen_rtx_MEM (DECL_MODE (DECL_RESULT (subr)), x);
- set_mem_attributes (x, DECL_RESULT (subr), 1);
+ x = get_rtl_for_parm_ssa_default_def (res);
+ if (!x)
+ {
+ x = gen_rtx_MEM (DECL_MODE (res), value_address);
+ set_mem_attributes (x, res, 1);
+ }
}
- SET_DECL_RTL (DECL_RESULT (subr), x);
+ SET_DECL_RTL (res, x);
}
}
- else if (DECL_MODE (DECL_RESULT (subr)) == VOIDmode)
+ else if (DECL_MODE (res) == VOIDmode)
/* If return mode is void, this decl rtl should not be used. */
- SET_DECL_RTL (DECL_RESULT (subr), NULL_RTX);
+ SET_DECL_RTL (res, NULL_RTX);
else
{
/* Compute the return values into a pseudo reg, which we will copy
into the true return register after the cleanups are done. */
- tree return_type = TREE_TYPE (DECL_RESULT (subr));
- if (TYPE_MODE (return_type) != BLKmode
- && targetm.calls.return_in_msb (return_type))
+ tree return_type = TREE_TYPE (res);
+ rtx x = get_rtl_for_parm_ssa_default_def (res);
+ if (x)
+ /* Use it. */;
+ else if (TYPE_MODE (return_type) != BLKmode
+ && targetm.calls.return_in_msb (return_type))
/* expand_function_end will insert the appropriate padding in
this case. Use the return value's natural (unpadded) mode
within the function proper. */
- SET_DECL_RTL (DECL_RESULT (subr),
- gen_reg_rtx (TYPE_MODE (return_type)));
+ x = gen_reg_rtx (TYPE_MODE (return_type));
else
{
/* In order to figure out what mode to use for the pseudo, we
@@ -5064,25 +5197,26 @@ expand_function_start (tree subr)
/* Structures that are returned in registers are not
aggregate_value_p, so we may see a PARALLEL or a REG. */
if (REG_P (hard_reg))
- SET_DECL_RTL (DECL_RESULT (subr),
- gen_reg_rtx (GET_MODE (hard_reg)));
+ x = gen_reg_rtx (GET_MODE (hard_reg));
else
{
gcc_assert (GET_CODE (hard_reg) == PARALLEL);
- SET_DECL_RTL (DECL_RESULT (subr), gen_group_rtx (hard_reg));
+ x = gen_group_rtx (hard_reg);
}
}
+ SET_DECL_RTL (res, x);
+
/* Set DECL_REGISTER flag so that expand_function_end will copy the
result to the real return register(s). */
- DECL_REGISTER (DECL_RESULT (subr)) = 1;
+ DECL_REGISTER (res) = 1;
if (chkp_function_instrumented_p (current_function_decl))
{
- tree return_type = TREE_TYPE (DECL_RESULT (subr));
+ tree return_type = TREE_TYPE (res);
rtx bounds = targetm.calls.chkp_function_value_bounds (return_type,
subr, 1);
- SET_DECL_BOUNDS_RTL (DECL_RESULT (subr), bounds);
+ SET_DECL_BOUNDS_RTL (res, bounds);
}
}
@@ -5097,7 +5231,9 @@ expand_function_start (tree subr)
rtx local, chain;
rtx_insn *insn;
- local = gen_reg_rtx (Pmode);
+ local = get_rtl_for_parm_ssa_default_def (parm);
+ if (!local)
+ local = gen_reg_rtx (Pmode);
chain = targetm.calls.static_chain (current_function_decl, true);
set_decl_incoming_rtl (parm, chain, false);
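
Reviewer's aside on the function.c changes above: a recurring pattern
in assign_parm_setup_reg, assign_parm_setup_stack and
expand_function_start is "ask out-of-SSA whether it already picked RTL
for this parm's default def, and only create fresh RTL if it did not".
The sketch below models just that decision with integers in place of
rtx values. toy_rtl_for_parm, toy_gen_reg and the ssa_assigned table
are hypothetical stand-ins for rtl_for_parm /
get_rtl_for_parm_ssa_default_def and gen_reg_rtx, not the real
interfaces.

```c
#include <assert.h>

#define NO_RTL 0

/* Next fresh pseudo number handed out by toy_gen_reg.  */
static int next_pseudo = 200;

/* Hypothetical stand-in for gen_reg_rtx: return a new pseudo number.  */
static int
toy_gen_reg (void)
{
  return next_pseudo++;
}

/* Hypothetical lookup: the pseudo that out-of-SSA chose for PARM's
   default def, or NO_RTL if it chose none.  */
static int
toy_rtl_for_parm (const int *ssa_assigned, int parm)
{
  return ssa_assigned[parm];
}

/* Reuse the pseudo from expand when there is one; otherwise fall back
   to creating a new pseudo, as the pre-patch code always did.  */
static int
toy_assign_parm_reg (const int *ssa_assigned, int parm)
{
  int reg = toy_rtl_for_parm (ssa_assigned, parm);
  if (reg == NO_RTL)
    reg = toy_gen_reg ();
  return reg;
}
```

Only parms without an assignment from expand consume a fresh pseudo, so
the incoming value and the default def's partition end up in the same
place whenever expand has already made a choice.
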
diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
index 4d683d6..d3d1c5f 100644
--- a/gcc/gimple-expr.c
+++ b/gcc/gimple-expr.c
@@ -391,45 +391,6 @@ copy_var_decl (tree var, tree name, tree type)
return copy;
}
-/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
- coalescing together, false otherwise.
-
- This must stay consistent with var_map_base_init in tree-ssa-live.c. */
-
-bool
-gimple_can_coalesce_p (tree name1, tree name2)
-{
- /* First check the SSA_NAME's associated DECL. We only want to
- coalesce if they have the same DECL or both have no associated DECL. */
- tree var1 = SSA_NAME_VAR (name1);
- tree var2 = SSA_NAME_VAR (name2);
- var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
- var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
- if (var1 != var2)
- return false;
-
- /* Now check the types. If the types are the same, then we should
- try to coalesce V1 and V2. */
- tree t1 = TREE_TYPE (name1);
- tree t2 = TREE_TYPE (name2);
- if (t1 == t2)
- return true;
-
- /* If the types are not the same, check for a canonical type match. This
- (for example) allows coalescing when the types are fundamentally the
- same, but just have different names.
-
- Note pointer types with different address spaces may have the same
- canonical type. Those are rejected for coalescing by the
- types_compatible_p check. */
- if (TYPE_CANONICAL (t1)
- && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
- && types_compatible_p (t1, t2))
- return true;
-
- return false;
-}
-
/* Strip off a legitimate source ending from the input string NAME of
length LEN. Rather than having to know the names used by all of
our front ends, we strip off an ending of a period followed by
diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
index ed23eb2..3d1c89f 100644
--- a/gcc/gimple-expr.h
+++ b/gcc/gimple-expr.h
@@ -28,7 +28,6 @@ extern gimple_seq gimple_body (tree);
extern bool gimple_has_body_p (tree);
extern const char *gimple_decl_printable_name (tree, int);
extern tree copy_var_decl (tree, tree, tree);
-extern bool gimple_can_coalesce_p (tree, tree);
extern tree create_tmp_var_name (const char *);
extern tree create_tmp_var_raw (tree, const char * = NULL);
extern tree create_tmp_var (tree, const char * = NULL);
diff --git a/gcc/opts.c b/gcc/opts.c
index 9793999..5305299 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -448,12 +448,12 @@ static const struct default_options default_options_table[] =
{ OPT_LEVELS_1_PLUS, OPT_fsplit_wide_types, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_ccp, NULL, 1 },
{ OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_bit_ccp, NULL, 1 },
+ { OPT_LEVELS_1_PLUS, OPT_ftree_coalesce_vars, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_dce, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_dominator_opts, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 },
{ OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 },
- { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 },
diff --git a/gcc/passes.def b/gcc/passes.def
index 4690e23..230e089 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -77,7 +77,6 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_all_early_optimizations);
PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
NEXT_PASS (pass_remove_cgraph_callee_edges);
- NEXT_PASS (pass_rename_ssa_copies);
NEXT_PASS (pass_object_sizes);
NEXT_PASS (pass_ccp);
/* After CCP we rewrite no longer addressed locals into SSA
@@ -155,7 +154,6 @@ along with GCC; see the file COPYING3. If not see
/* Initial scalar cleanups before alias computation.
They ensure memory accesses are not indirect wherever possible. */
NEXT_PASS (pass_strip_predict_hints);
- NEXT_PASS (pass_rename_ssa_copies);
NEXT_PASS (pass_ccp);
/* After CCP we rewrite no longer addressed locals into SSA
form if possible. */
@@ -183,7 +181,6 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_ch);
NEXT_PASS (pass_lower_complex);
NEXT_PASS (pass_sra);
- NEXT_PASS (pass_rename_ssa_copies);
/* The dom pass will also resolve all __builtin_constant_p calls
that are still there to 0. This has to be done after some
propagations have already run, but before some more dead code
@@ -291,7 +288,6 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_fold_builtins);
NEXT_PASS (pass_optimize_widening_mul);
NEXT_PASS (pass_tail_calls);
- NEXT_PASS (pass_rename_ssa_copies);
/* FIXME: If DCE is not run before checking for uninitialized uses,
we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
However, this also causes us to misdiagnose cases that should be
@@ -326,7 +322,6 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_dce);
NEXT_PASS (pass_asan);
NEXT_PASS (pass_tsan);
- NEXT_PASS (pass_rename_ssa_copies);
/* ??? We do want some kind of loop invariant motion, but we possibly
need to adjust LIM to be more friendly towards preserving accurate
debug information here. */
diff --git a/gcc/testsuite/gcc.dg/guality/pr54200.c b/gcc/testsuite/gcc.dg/guality/pr54200.c
index 9b17187..e1e7293 100644
--- a/gcc/testsuite/gcc.dg/guality/pr54200.c
+++ b/gcc/testsuite/gcc.dg/guality/pr54200.c
@@ -1,6 +1,6 @@
/* PR tree-optimization/54200 */
/* { dg-do run } */
-/* { dg-options "-g -fno-var-tracking-assignments" } */
+/* { dg-options "-g -fno-var-tracking-assignments -fno-tree-coalesce-vars" } */
int o __attribute__((used));
diff --git a/gcc/testsuite/gcc.dg/ssp-1.c b/gcc/testsuite/gcc.dg/ssp-1.c
index 5467f4d..db69332 100644
--- a/gcc/testsuite/gcc.dg/ssp-1.c
+++ b/gcc/testsuite/gcc.dg/ssp-1.c
@@ -12,7 +12,7 @@ __stack_chk_fail (void)
int main ()
{
- int i;
+ register int i;
char foo[255];
// smash stack
diff --git a/gcc/testsuite/gcc.dg/ssp-2.c b/gcc/testsuite/gcc.dg/ssp-2.c
index 9a7ac32..752fe53 100644
--- a/gcc/testsuite/gcc.dg/ssp-2.c
+++ b/gcc/testsuite/gcc.dg/ssp-2.c
@@ -14,7 +14,7 @@ __stack_chk_fail (void)
void
overflow()
{
- int i = 0;
+ register int i = 0;
char foo[30];
/* Overflow buffer. */
diff --git a/gcc/testsuite/gcc.dg/torture/parm-coalesce.c b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
new file mode 100644
index 0000000..dbd81c1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+
+/* Make sure we don't coalesce both incoming parms, one whose incoming
+ value is unused, to the same location, so as to overwrite one of
+ them with the incoming value of the other. */
+
+int __attribute__((noinline, noclone))
+foo (int i, int j)
+{
+ j = i; /* The incoming value for J is unused. */
+ i = 2;
+ if (j)
+ j++;
+ j += i + 1;
+ return j;
+}
+
+/* Same as foo, but with swapped parameters. */
+int __attribute__((noinline, noclone))
+bar (int j, int i)
+{
+ j = i; /* The incoming value for J is unused. */
+ i = 2;
+ if (j)
+ j++;
+ j += i + 1;
+ return j;
+}
+
+int
+main (void)
+{
+ if (foo (0, 1) != 3)
+ abort ();
+ if (bar (1, 0) != 3)
+ abort ();
+ return 0;
+}
diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
index e23bc0b..59d91c6 100644
--- a/gcc/tree-outof-ssa.c
+++ b/gcc/tree-outof-ssa.c
@@ -305,7 +305,6 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
rtx dest_rtx, seq, x;
machine_mode dest_mode, src_mode;
int unsignedp;
- tree var;
if (dump_file && (dump_flags & TDF_DETAILS))
{
@@ -327,12 +326,12 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
start_sequence ();
- var = SSA_NAME_VAR (partition_to_var (SA.map, dest));
+ tree name = partition_to_var (SA.map, dest);
src_mode = TYPE_MODE (TREE_TYPE (src));
dest_mode = GET_MODE (dest_rtx);
- gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var)));
+ gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (name)));
gcc_assert (!REG_P (dest_rtx)
- || dest_mode == promote_decl_mode (var, &unsignedp));
+ || dest_mode == promote_ssa_mode (name, &unsignedp));
if (src_mode != dest_mode)
{
@@ -708,13 +707,12 @@ elim_backward (elim_graph g, int T)
static rtx
get_temp_reg (tree name)
{
- tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
- tree type = TREE_TYPE (var);
+ tree type = TREE_TYPE (name);
int unsignedp;
- machine_mode reg_mode = promote_decl_mode (var, &unsignedp);
+ machine_mode reg_mode = promote_ssa_mode (name, &unsignedp);
rtx x = gen_reg_rtx (reg_mode);
if (POINTER_TYPE_P (type))
- mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var))));
+ mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type)));
return x;
}
@@ -1014,7 +1012,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
/* Return to viewing the variable list as just all reference variables after
coalescing has been performed. */
- partition_view_normal (map, false);
+ partition_view_normal (map);
if (dump_file && (dump_flags & TDF_DETAILS))
{
diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index b05a860..9ffa3f1 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -58,6 +58,7 @@ along with GCC; see the file COPYING3. If not see
#include "tree-ssanames.h"
#include "tree-ssa-live.h"
#include "tree-ssa-coalesce.h"
+#include "explow.h"
#include "diagnostic-core.h"
@@ -830,6 +831,16 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
basic_block bb;
ssa_op_iter iter;
live_track_p live;
+ basic_block entry;
+
+ /* If inter-variable coalescing is enabled, we may attempt to
+ coalesce variables from different base variables, including
+ different parameters, so we have to make sure default defs live
+ at the entry block conflict with each other. */
+ if (flag_tree_coalesce_vars)
+ entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+ else
+ entry = NULL;
map = live_var_map (liveinfo);
graph = ssa_conflicts_new (num_var_partitions (map));
@@ -888,6 +899,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
live_track_process_def (live, result, graph);
}
+ /* Pretend there are defs for params' default defs at the start
+ of the (post-)entry block. */
+ if (bb == entry)
+ {
+ unsigned base;
+ bitmap_iterator bi;
+ EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi)
+ {
+ bitmap_iterator bi2;
+ unsigned part;
+ EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base],
+ 0, part, bi2)
+ {
+ tree var = partition_to_var (map, part);
+ if (!SSA_NAME_VAR (var)
+ || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL
+ && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL)
+ || !SSA_NAME_IS_DEFAULT_DEF (var))
+ continue;
+ live_track_process_def (live, var, graph);
+ }
+ }
+ }
+
live_track_clear_base_vars (live);
}
@@ -1156,6 +1191,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
{
var1 = partition_to_var (map, p1);
var2 = partition_to_var (map, p2);
+
z = var_union (map, var1, var2);
if (z == NO_PARTITION)
{
@@ -1173,6 +1209,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
if (debug)
fprintf (debug, ": Success -> %d\n", z);
+
return true;
}
@@ -1270,6 +1307,330 @@ ssa_name_var_hash::equal (const tree_node *n1, const tree_node *n2)
}
+/* Output partition map MAP with coalescing plan PART to file F. */
+
+void
+dump_part_var_map (FILE *f, partition part, var_map map)
+{
+ int t;
+ unsigned x, y;
+ int p;
+
+ fprintf (f, "\nCoalescible Partition map \n\n");
+
+ for (x = 0; x < map->num_partitions; x++)
+ {
+ if (map->view_to_partition != NULL)
+ p = map->view_to_partition[x];
+ else
+ p = x;
+
+ if (ssa_name (p) == NULL_TREE
+ || virtual_operand_p (ssa_name (p)))
+ continue;
+
+ t = 0;
+ for (y = 1; y < num_ssa_names; y++)
+ {
+ tree var = version_to_var (map, y);
+ if (!var)
+ continue;
+ int q = var_to_partition (map, var);
+ p = partition_find (part, q);
+ gcc_assert (map->partition_to_base_index[q]
+ == map->partition_to_base_index[p]);
+
+ if (p == (int)x)
+ {
+ if (t++ == 0)
+ {
+ fprintf (f, "Partition %d, base %d (", x,
+ map->partition_to_base_index[q]);
+ print_generic_expr (f, partition_to_var (map, q), TDF_SLIM);
+ fprintf (f, " - ");
+ }
+ fprintf (f, "%d ", y);
+ }
+ }
+ if (t != 0)
+ fprintf (f, ")\n");
+ }
+ fprintf (f, "\n");
+}
+
+/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
+ coalescing together, false otherwise.
+
+ This must stay consistent with var_map_base_init in tree-ssa-live.c. */
+
+bool
+gimple_can_coalesce_p (tree name1, tree name2)
+{
+ /* First check the SSA_NAME's associated DECL. Without
+ optimization, we only want to coalesce if they have the same DECL
+ or both have no associated DECL. */
+ tree var1 = SSA_NAME_VAR (name1);
+ tree var2 = SSA_NAME_VAR (name2);
+ var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
+ var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
+ if (var1 != var2 && !flag_tree_coalesce_vars)
+ return false;
+
+ /* Now check the types. If the types are the same, then we should
+ try to coalesce V1 and V2. */
+ tree t1 = TREE_TYPE (name1);
+ tree t2 = TREE_TYPE (name2);
+ if (t1 == t2)
+ {
+ check_modes:
+ /* If the base variables are the same, we're good: none of the
+ other tests below could possibly fail. */
+ var1 = SSA_NAME_VAR (name1);
+ var2 = SSA_NAME_VAR (name2);
+ if (var1 == var2)
+ return true;
+
+ /* We don't want to coalesce two SSA names if one of the base
+ variables is supposed to be a register while the other is
+ supposed to be on the stack. Anonymous SSA names take
+ registers, but when not optimizing, user variables should go
+ on the stack, so coalescing them with the anonymous variable
+ as the partition leader would end up assigning the user
+ variable to a register. Don't do that! */
+ bool reg1 = !var1 || use_register_for_decl (var1);
+ bool reg2 = !var2 || use_register_for_decl (var2);
+ if (reg1 != reg2)
+ return false;
+
+ /* Check that the promoted modes are the same. We don't want to
+ coalesce if the promoted modes would be different. Only
+ PARM_DECLs and RESULT_DECLs have different promotion rules,
+ so skip the test if both are variables or anonymous
+ SSA_NAMEs. */
+ return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
+ || promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL);
+ }
+
+ /* If the types are not the same, check for a canonical type match. This
+ (for example) allows coalescing when the types are fundamentally the
+ same, but just have different names.
+
+ Note pointer types with different address spaces may have the same
+ canonical type. Those are rejected for coalescing by the
+ types_compatible_p check. */
+ if (TYPE_CANONICAL (t1)
+ && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
+ && types_compatible_p (t1, t2))
+ goto check_modes;
+
+ return false;
+}
+
+/* Fill in MAP's partition_to_base_index, with one index for each
+ partition of SSA names USED_IN_COPIES and related by CL coalesce
+ possibilities. This must match gimple_can_coalesce_p in the
+ optimized case. */
+
+static void
+compute_optimized_partition_bases (var_map map, bitmap used_in_copies,
+ coalesce_list_p cl)
+{
+ int parts = num_var_partitions (map);
+ partition tentative = partition_new (parts);
+
+ /* Partition the SSA versions so that, for each coalescible
+ pair, both of its members are in the same partition in
+ TENTATIVE. */
+ gcc_assert (!cl->sorted);
+ coalesce_pair_p node;
+ coalesce_iterator_type ppi;
+ FOR_EACH_PARTITION_PAIR (node, ppi, cl)
+ {
+ tree v1 = ssa_name (node->first_element);
+ int p1 = partition_find (tentative, var_to_partition (map, v1));
+ tree v2 = ssa_name (node->second_element);
+ int p2 = partition_find (tentative, var_to_partition (map, v2));
+
+ if (p1 == p2)
+ continue;
+
+ partition_union (tentative, p1, p2);
+ }
+
+ /* We have to deal with cost one pairs too. */
+ for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next)
+ {
+ tree v1 = ssa_name (co->first_element);
+ int p1 = partition_find (tentative, var_to_partition (map, v1));
+ tree v2 = ssa_name (co->second_element);
+ int p2 = partition_find (tentative, var_to_partition (map, v2));
+
+ if (p1 == p2)
+ continue;
+
+ partition_union (tentative, p1, p2);
+ }
+
+ /* And also with abnormal edges. */
+ basic_block bb;
+ edge e;
+ edge_iterator ei;
+ FOR_EACH_BB_FN (bb, cfun)
+ {
+ FOR_EACH_EDGE (e, ei, bb->preds)
+ if (e->flags & EDGE_ABNORMAL)
+ {
+ gphi_iterator gsi;
+ for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
+ gsi_next (&gsi))
+ {
+ gphi *phi = gsi.phi ();
+ tree arg = PHI_ARG_DEF (phi, e->dest_idx);
+ if (SSA_NAME_IS_DEFAULT_DEF (arg)
+ && (!SSA_NAME_VAR (arg)
+ || TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL))
+ continue;
+
+ tree res = PHI_RESULT (phi);
+
+ int p1 = partition_find (tentative, var_to_partition (map, res));
+ int p2 = partition_find (tentative, var_to_partition (map, arg));
+
+ if (p1 == p2)
+ continue;
+
+ partition_union (tentative, p1, p2);
+ }
+ }
+ }
+
+ map->partition_to_base_index = XCNEWVEC (int, parts);
+ auto_vec<unsigned int> index_map (parts);
+ if (parts)
+ index_map.quick_grow (parts);
+
+ const unsigned no_part = -1;
+ unsigned count = parts;
+ while (count)
+ index_map[--count] = no_part;
+
+ /* Initialize MAP's mapping from partition to base index, using
+ as base indices an enumeration of the TENTATIVE partitions in
+ which each SSA version ended up, so that we compute conflicts
+ between all SSA versions that ended up in the same potential
+ coalesce partition. */
+ bitmap_iterator bi;
+ unsigned i;
+ EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
+ {
+ int pidx = var_to_partition (map, ssa_name (i));
+ int base = partition_find (tentative, pidx);
+ if (index_map[base] != no_part)
+ continue;
+ index_map[base] = count++;
+ }
+
+ map->num_basevars = count;
+
+ EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
+ {
+ int pidx = var_to_partition (map, ssa_name (i));
+ int base = partition_find (tentative, pidx);
+ gcc_assert (index_map[base] < count);
+ map->partition_to_base_index[pidx] = index_map[base];
+ }
+
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ dump_part_var_map (dump_file, tentative, map);
+
+ partition_delete (tentative);
+}
+
+/* Hashtable helpers. */
+
+struct tree_int_map_hasher : typed_noop_remove <tree_int_map>
+{
+ typedef tree_int_map *value_type;
+ typedef tree_int_map *compare_type;
+ static inline hashval_t hash (const tree_int_map *);
+ static inline bool equal (const tree_int_map *, const tree_int_map *);
+};
+
+inline hashval_t
+tree_int_map_hasher::hash (const tree_int_map *v)
+{
+ return tree_map_base_hash (v);
+}
+
+inline bool
+tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
+{
+ return tree_int_map_eq (v, c);
+}
+
+/* This routine will initialize the basevar fields of MAP with base
+ names. Partitions will share the same base if they have the same
+ SSA_NAME_VAR, or, being anonymous variables, the same type. This
+ must match gimple_can_coalesce_p in the non-optimized case. */
+
+static void
+compute_samebase_partition_bases (var_map map)
+{
+ int x, num_part;
+ tree var;
+ struct tree_int_map *m, *mapstorage;
+
+ num_part = num_var_partitions (map);
+ hash_table<tree_int_map_hasher> tree_to_index (num_part);
+ /* We can have at most num_part entries in the hash table, so it's
+ enough to allocate that many map elements once, saving some malloc
+ calls. */
+ mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
+
+ /* If a base table already exists, clear it, otherwise create it. */
+ free (map->partition_to_base_index);
+ map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
+
+ /* Build the base variable list, and point partitions at their bases. */
+ for (x = 0; x < num_part; x++)
+ {
+ struct tree_int_map **slot;
+ unsigned baseindex;
+ var = partition_to_var (map, x);
+ if (SSA_NAME_VAR (var)
+ && (!VAR_P (SSA_NAME_VAR (var))
+ || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
+ m->base.from = SSA_NAME_VAR (var);
+ else
+ /* This restricts what anonymous SSA names we can coalesce
+ as it restricts the sets we compute conflicts for.
+ Using TREE_TYPE to generate sets is the easiest as
+ type equivalency also holds for SSA names with the same
+ underlying decl.
+
+ Check gimple_can_coalesce_p when changing this code. */
+ m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
+ ? TYPE_CANONICAL (TREE_TYPE (var))
+ : TREE_TYPE (var));
+ /* If base variable hasn't been seen, set it up. */
+ slot = tree_to_index.find_slot (m, INSERT);
+ if (!*slot)
+ {
+ baseindex = m - mapstorage;
+ m->to = baseindex;
+ *slot = m;
+ m++;
+ }
+ else
+ baseindex = (*slot)->to;
+ map->partition_to_base_index[x] = baseindex;
+ }
+
+ map->num_basevars = m - mapstorage;
+
+ free (mapstorage);
+}
+
/* Reduce the number of copies by coalescing variables in the function. Return
a partition map with the resulting coalesces. */
@@ -1286,9 +1647,10 @@ coalesce_ssa_name (void)
cl = create_coalesce_list ();
map = create_outofssa_var_map (cl, used_in_copies);
- /* If optimization is disabled, we need to coalesce all the names originating
- from the same SSA_NAME_VAR so debug info remains undisturbed. */
- if (!optimize)
+ /* If this optimization is disabled, we need to coalesce all the
+ names originating from the same SSA_NAME_VAR so debug info
+ remains undisturbed. */
+ if (!flag_tree_coalesce_vars)
{
hash_table<ssa_name_var_hash> ssa_name_hash (10);
@@ -1329,8 +1691,13 @@ coalesce_ssa_name (void)
if (dump_file && (dump_flags & TDF_DETAILS))
dump_var_map (dump_file, map);
- /* Don't calculate live ranges for variables not in the coalesce list. */
- partition_view_bitmap (map, used_in_copies, true);
+ partition_view_bitmap (map, used_in_copies);
+
+ if (flag_tree_coalesce_vars)
+ compute_optimized_partition_bases (map, used_in_copies, cl);
+ else
+ compute_samebase_partition_bases (map);
+
BITMAP_FREE (used_in_copies);
if (num_var_partitions (map) < 1)
@@ -1369,8 +1736,7 @@ coalesce_ssa_name (void)
/* Now coalesce everything in the list. */
coalesce_partitions (map, graph, cl,
- ((dump_flags & TDF_DETAILS) ? dump_file
- : NULL));
+ ((dump_flags & TDF_DETAILS) ? dump_file : NULL));
delete_coalesce_list (cl);
ssa_conflicts_delete (graph);
diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h
index 99b188a..ae289b4 100644
--- a/gcc/tree-ssa-coalesce.h
+++ b/gcc/tree-ssa-coalesce.h
@@ -21,5 +21,6 @@ along with GCC; see the file COPYING3. If not see
#define GCC_TREE_SSA_COALESCE_H
extern var_map coalesce_ssa_name (void);
+extern bool gimple_can_coalesce_p (tree, tree);
#endif /* GCC_TREE_SSA_COALESCE_H */
diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c
deleted file mode 100644
index f3cb56e..0000000
--- a/gcc/tree-ssa-copyrename.c
+++ /dev/null
@@ -1,499 +0,0 @@
-/* Rename SSA copies.
- Copyright (C) 2004-2015 Free Software Foundation, Inc.
- Contributed by Andrew MacLeod <amacleod@redhat.com>
-
-This file is part of GCC.
-
-GCC is free software; you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation; either version 3, or (at your option)
-any later version.
-
-GCC is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-GNU General Public License for more details.
-
-You should have received a copy of the GNU General Public License
-along with GCC; see the file COPYING3. If not see
-<http://www.gnu.org/licenses/>. */
-
-#include "config.h"
-#include "system.h"
-#include "coretypes.h"
-#include "tm.h"
-#include "hash-set.h"
-#include "machmode.h"
-#include "vec.h"
-#include "double-int.h"
-#include "input.h"
-#include "alias.h"
-#include "symtab.h"
-#include "wide-int.h"
-#include "inchash.h"
-#include "tree.h"
-#include "fold-const.h"
-#include "predict.h"
-#include "hard-reg-set.h"
-#include "function.h"
-#include "dominance.h"
-#include "cfg.h"
-#include "basic-block.h"
-#include "tree-ssa-alias.h"
-#include "internal-fn.h"
-#include "gimple-expr.h"
-#include "is-a.h"
-#include "gimple.h"
-#include "gimple-iterator.h"
-#include "flags.h"
-#include "tree-pretty-print.h"
-#include "bitmap.h"
-#include "gimple-ssa.h"
-#include "stringpool.h"
-#include "tree-ssanames.h"
-#include "hashtab.h"
-#include "rtl.h"
-#include "statistics.h"
-#include "real.h"
-#include "fixed-value.h"
-#include "insn-config.h"
-#include "expmed.h"
-#include "dojump.h"
-#include "explow.h"
-#include "calls.h"
-#include "emit-rtl.h"
-#include "varasm.h"
-#include "stmt.h"
-#include "expr.h"
-#include "tree-dfa.h"
-#include "tree-inline.h"
-#include "tree-ssa-live.h"
-#include "tree-pass.h"
-#include "langhooks.h"
-
-static struct
-{
- /* Number of copies coalesced. */
- int coalesced;
-} stats;
-
-/* The following routines implement the SSA copy renaming phase.
-
- This optimization looks for copies between 2 SSA_NAMES, either through a
- direct copy, or an implicit one via a PHI node result and its arguments.
-
- Each copy is examined to determine if it is possible to rename the base
- variable of one of the operands to the same variable as the other operand.
- i.e.
- T.3_5 = <blah>
- a_1 = T.3_5
-
- If this copy couldn't be copy propagated, it could possibly remain in the
- program throughout the optimization phases. After SSA->normal, it would
- become:
-
- T.3 = <blah>
- a = T.3
-
- Since T.3_5 is distinct from all other SSA versions of T.3, there is no
- fundamental reason why the base variable needs to be T.3, subject to
- certain restrictions. This optimization attempts to determine if we can
- change the base variable on copies like this, and result in code such as:
-
- a_5 = <blah>
- a_1 = a_5
-
- This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is
- possible, the copy goes away completely. If it isn't possible, a new temp
- will be created for a_5, and you will end up with the exact same code:
-
- a.8 = <blah>
- a = a.8
-
- The other benefit of performing this optimization relates to what variables
- are chosen in copies. Gimplification of the program uses temporaries for
- a lot of things. expressions like
-
- a_1 = <blah>
- <blah2> = a_1
-
- get turned into
-
- T.3_5 = <blah>
- a_1 = T.3_5
- <blah2> = a_1
-
- Copy propagation is done in a forward direction, and if we can propagate
- through the copy, we end up with:
-
- T.3_5 = <blah>
- <blah2> = T.3_5
-
- The copy is gone, but so is all reference to the user variable 'a'. By
- performing this optimization, we would see the sequence:
-
- a_5 = <blah>
- a_1 = a_5
- <blah2> = a_1
-
- which copy propagation would then turn into:
-
- a_5 = <blah>
- <blah2> = a_5
-
- and so we still retain the user variable whenever possible. */
-
-
-/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
- Choose a representative for the partition, and send debug info to DEBUG. */
-
-static void
-copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
-{
- int p1, p2, p3;
- tree root1, root2;
- tree rep1, rep2;
- bool ign1, ign2, abnorm;
-
- gcc_assert (TREE_CODE (var1) == SSA_NAME);
- gcc_assert (TREE_CODE (var2) == SSA_NAME);
-
- register_ssa_partition (map, var1);
- register_ssa_partition (map, var2);
-
- p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
- p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
-
- if (debug)
- {
- fprintf (debug, "Try : ");
- print_generic_expr (debug, var1, TDF_SLIM);
- fprintf (debug, "(P%d) & ", p1);
- print_generic_expr (debug, var2, TDF_SLIM);
- fprintf (debug, "(P%d)", p2);
- }
-
- gcc_assert (p1 != NO_PARTITION);
- gcc_assert (p2 != NO_PARTITION);
-
- if (p1 == p2)
- {
- if (debug)
- fprintf (debug, " : Already coalesced.\n");
- return;
- }
-
- rep1 = partition_to_var (map, p1);
- rep2 = partition_to_var (map, p2);
- root1 = SSA_NAME_VAR (rep1);
- root2 = SSA_NAME_VAR (rep2);
- if (!root1 && !root2)
- return;
-
- /* Don't coalesce if one of the variables occurs in an abnormal PHI. */
- abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1)
- || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2));
- if (abnorm)
- {
- if (debug)
- fprintf (debug, " : Abnormal PHI barrier. No coalesce.\n");
- return;
- }
-
- /* Partitions already have the same root, simply merge them. */
- if (root1 == root2)
- {
- p1 = partition_union (map->var_partition, p1, p2);
- if (debug)
- fprintf (debug, " : Same root, coalesced --> P%d.\n", p1);
- return;
- }
-
- /* Never attempt to coalesce 2 different parameters. */
- if ((root1 && TREE_CODE (root1) == PARM_DECL)
- && (root2 && TREE_CODE (root2) == PARM_DECL))
- {
- if (debug)
- fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n");
- return;
- }
-
- if ((root1 && TREE_CODE (root1) == RESULT_DECL)
- != (root2 && TREE_CODE (root2) == RESULT_DECL))
- {
- if (debug)
- fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n");
- return;
- }
-
- ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1));
- ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2));
-
- /* Refrain from coalescing user variables, if requested. */
- if (!ign1 && !ign2)
- {
- if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2))
- ign2 = true;
- else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1))
- ign1 = true;
- else if (flag_ssa_coalesce_vars != 2)
- {
- if (debug)
- fprintf (debug, " : 2 different USER vars. No coalesce.\n");
- return;
- }
- else
- ign2 = true;
- }
-
- /* If both values have default defs, we can't coalesce. If only one has a
- tag, make sure that variable is the new root partition. */
- if (root1 && ssa_default_def (cfun, root1))
- {
- if (root2 && ssa_default_def (cfun, root2))
- {
- if (debug)
- fprintf (debug, " : 2 default defs. No coalesce.\n");
- return;
- }
- else
- {
- ign2 = true;
- ign1 = false;
- }
- }
- else if (root2 && ssa_default_def (cfun, root2))
- {
- ign1 = true;
- ign2 = false;
- }
-
- /* Do not coalesce if we cannot assign a symbol to the partition. */
- if (!(!ign2 && root2)
- && !(!ign1 && root1))
- {
- if (debug)
- fprintf (debug, " : Choosen variable has no root. No coalesce.\n");
- return;
- }
-
- /* Don't coalesce if the new chosen root variable would be read-only.
- If both ign1 && ign2, then the root var of the larger partition
- wins, so reject in that case if any of the root vars is TREE_READONLY.
- Otherwise reject only if the root var, on which replace_ssa_name_symbol
- will be called below, is readonly. */
- if (((root1 && TREE_READONLY (root1)) && ign2)
- || ((root2 && TREE_READONLY (root2)) && ign1))
- {
- if (debug)
- fprintf (debug, " : Readonly variable. No coalesce.\n");
- return;
- }
-
- /* Don't coalesce if the two variables aren't type compatible . */
- if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2))
- /* There is a disconnect between the middle-end type-system and
- VRP, avoid coalescing enum types with different bounds. */
- || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE
- || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE)
- && TREE_TYPE (var1) != TREE_TYPE (var2)))
- {
- if (debug)
- fprintf (debug, " : Incompatible types. No coalesce.\n");
- return;
- }
-
- /* Merge the two partitions. */
- p3 = partition_union (map->var_partition, p1, p2);
-
- /* Set the root variable of the partition to the better choice, if there is
- one. */
- if (!ign2 && root2)
- replace_ssa_name_symbol (partition_to_var (map, p3), root2);
- else if (!ign1 && root1)
- replace_ssa_name_symbol (partition_to_var (map, p3), root1);
- else
- gcc_unreachable ();
-
- if (debug)
- {
- fprintf (debug, " --> P%d ", p3);
- print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)),
- TDF_SLIM);
- fprintf (debug, "\n");
- }
-}
-
-
-namespace {
-
-const pass_data pass_data_rename_ssa_copies =
-{
- GIMPLE_PASS, /* type */
- "copyrename", /* name */
- OPTGROUP_NONE, /* optinfo_flags */
- TV_TREE_COPY_RENAME, /* tv_id */
- ( PROP_cfg | PROP_ssa ), /* properties_required */
- 0, /* properties_provided */
- 0, /* properties_destroyed */
- 0, /* todo_flags_start */
- 0, /* todo_flags_finish */
-};
-
-class pass_rename_ssa_copies : public gimple_opt_pass
-{
-public:
- pass_rename_ssa_copies (gcc::context *ctxt)
- : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt)
- {}
-
- /* opt_pass methods: */
- opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); }
- virtual bool gate (function *) { return flag_tree_copyrename != 0; }
- virtual unsigned int execute (function *);
-
-}; // class pass_rename_ssa_copies
-
-/* This function will make a pass through the IL, and attempt to coalesce any
- SSA versions which occur in PHI's or copies. Coalescing is accomplished by
- changing the underlying root variable of all coalesced version. This will
- then cause the SSA->normal pass to attempt to coalesce them all to the same
- variable. */
-
-unsigned int
-pass_rename_ssa_copies::execute (function *fun)
-{
- var_map map;
- basic_block bb;
- tree var, part_var;
- gimple stmt;
- unsigned x;
- FILE *debug;
-
- memset (&stats, 0, sizeof (stats));
-
- if (dump_file && (dump_flags & TDF_DETAILS))
- debug = dump_file;
- else
- debug = NULL;
-
- map = init_var_map (num_ssa_names);
-
- FOR_EACH_BB_FN (bb, fun)
- {
- /* Scan for real copies. */
- for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
- gsi_next (&gsi))
- {
- stmt = gsi_stmt (gsi);
- if (gimple_assign_ssa_name_copy_p (stmt))
- {
- tree lhs = gimple_assign_lhs (stmt);
- tree rhs = gimple_assign_rhs1 (stmt);
-
- copy_rename_partition_coalesce (map, lhs, rhs, debug);
- }
- }
- }
-
- FOR_EACH_BB_FN (bb, fun)
- {
- /* Treat PHI nodes as copies between the result and each argument. */
- for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
- gsi_next (&gsi))
- {
- size_t i;
- tree res;
- gphi *phi = gsi.phi ();
- res = gimple_phi_result (phi);
-
- /* Do not process virtual SSA_NAMES. */
- if (virtual_operand_p (res))
- continue;
-
- /* Make sure to only use the same partition for an argument
- as the result but never the other way around. */
- if (SSA_NAME_VAR (res)
- && !DECL_IGNORED_P (SSA_NAME_VAR (res)))
- for (i = 0; i < gimple_phi_num_args (phi); i++)
- {
- tree arg = PHI_ARG_DEF (phi, i);
- if (TREE_CODE (arg) == SSA_NAME)
- copy_rename_partition_coalesce (map, res, arg,
- debug);
- }
- /* Else if all arguments are in the same partition try to merge
- it with the result. */
- else
- {
- int all_p_same = -1;
- int p = -1;
- for (i = 0; i < gimple_phi_num_args (phi); i++)
- {
- tree arg = PHI_ARG_DEF (phi, i);
- if (TREE_CODE (arg) != SSA_NAME)
- {
- all_p_same = 0;
- break;
- }
- else if (all_p_same == -1)
- {
- p = partition_find (map->var_partition,
- SSA_NAME_VERSION (arg));
- all_p_same = 1;
- }
- else if (all_p_same == 1
- && p != partition_find (map->var_partition,
- SSA_NAME_VERSION (arg)))
- {
- all_p_same = 0;
- break;
- }
- }
- if (all_p_same == 1)
- copy_rename_partition_coalesce (map, res,
- PHI_ARG_DEF (phi, 0),
- debug);
- }
- }
- }
-
- if (debug)
- dump_var_map (debug, map);
-
- /* Now one more pass to make all elements of a partition share the same
- root variable. */
-
- for (x = 1; x < num_ssa_names; x++)
- {
- part_var = partition_to_var (map, x);
- if (!part_var)
- continue;
- var = ssa_name (x);
- if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var))
- continue;
- if (debug)
- {
- fprintf (debug, "Coalesced ");
- print_generic_expr (debug, var, TDF_SLIM);
- fprintf (debug, " to ");
- print_generic_expr (debug, part_var, TDF_SLIM);
- fprintf (debug, "\n");
- }
- stats.coalesced++;
- replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var));
- }
-
- statistics_counter_event (fun, "copies coalesced",
- stats.coalesced);
- delete_var_map (map);
- return 0;
-}
-
-} // anon namespace
-
-gimple_opt_pass *
-make_pass_rename_ssa_copies (gcc::context *ctxt)
-{
- return new pass_rename_ssa_copies (ctxt);
-}
diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
index 2c7c072..821b2f4 100644
--- a/gcc/tree-ssa-live.c
+++ b/gcc/tree-ssa-live.c
@@ -100,90 +100,6 @@ static void verify_live_on_entry (tree_live_info_p);
ssa_name or variable, and vice versa. */
-/* Hashtable helpers. */
-
-struct tree_int_map_hasher : typed_noop_remove <tree_int_map>
-{
- typedef tree_int_map *value_type;
- typedef tree_int_map *compare_type;
- static inline hashval_t hash (const tree_int_map *);
- static inline bool equal (const tree_int_map *, const tree_int_map *);
-};
-
-inline hashval_t
-tree_int_map_hasher::hash (const tree_int_map *v)
-{
- return tree_map_base_hash (v);
-}
-
-inline bool
-tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
-{
- return tree_int_map_eq (v, c);
-}
-
-
-/* This routine will initialize the basevar fields of MAP. */
-
-static void
-var_map_base_init (var_map map)
-{
- int x, num_part;
- tree var;
- struct tree_int_map *m, *mapstorage;
-
- num_part = num_var_partitions (map);
- hash_table<tree_int_map_hasher> tree_to_index (num_part);
- /* We can have at most num_part entries in the hash tables, so it's
- enough to allocate so many map elements once, saving some malloc
- calls. */
- mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
-
- /* If a base table already exists, clear it, otherwise create it. */
- free (map->partition_to_base_index);
- map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
-
- /* Build the base variable list, and point partitions at their bases. */
- for (x = 0; x < num_part; x++)
- {
- struct tree_int_map **slot;
- unsigned baseindex;
- var = partition_to_var (map, x);
- if (SSA_NAME_VAR (var)
- && (!VAR_P (SSA_NAME_VAR (var))
- || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
- m->base.from = SSA_NAME_VAR (var);
- else
- /* This restricts what anonymous SSA names we can coalesce
- as it restricts the sets we compute conflicts for.
- Using TREE_TYPE to generate sets is the easies as
- type equivalency also holds for SSA names with the same
- underlying decl.
-
- Check gimple_can_coalesce_p when changing this code. */
- m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
- ? TYPE_CANONICAL (TREE_TYPE (var))
- : TREE_TYPE (var));
- /* If base variable hasn't been seen, set it up. */
- slot = tree_to_index.find_slot (m, INSERT);
- if (!*slot)
- {
- baseindex = m - mapstorage;
- m->to = baseindex;
- *slot = m;
- m++;
- }
- else
- baseindex = (*slot)->to;
- map->partition_to_base_index[x] = baseindex;
- }
-
- map->num_basevars = m - mapstorage;
-
- free (mapstorage);
-}
-
-
/* Remove the base table in MAP. */
static void
@@ -361,21 +277,17 @@ partition_view_fini (var_map map, bitmap selected)
}
-/* Create a partition view which includes all the used partitions in MAP. If
- WANT_BASES is true, create the base variable map as well. */
+/* Create a partition view which includes all the used partitions in MAP. */
void
-partition_view_normal (var_map map, bool want_bases)
+partition_view_normal (var_map map)
{
bitmap used;
used = partition_view_init (map);
partition_view_fini (map, used);
- if (want_bases)
- var_map_base_init (map);
- else
- var_map_base_fini (map);
+ var_map_base_fini (map);
}
@@ -384,7 +296,7 @@ partition_view_normal (var_map map, bool want_bases)
as well. */
void
-partition_view_bitmap (var_map map, bitmap only, bool want_bases)
+partition_view_bitmap (var_map map, bitmap only)
{
bitmap used;
bitmap new_partitions = BITMAP_ALLOC (NULL);
@@ -400,10 +312,7 @@ partition_view_bitmap (var_map map, bitmap only, bool want_bases)
}
partition_view_fini (map, new_partitions);
- if (want_bases)
- var_map_base_init (map);
- else
- var_map_base_fini (map);
+ var_map_base_fini (map);
}
diff --git a/gcc/tree-ssa-live.h b/gcc/tree-ssa-live.h
index d5d7820..1f88358 100644
--- a/gcc/tree-ssa-live.h
+++ b/gcc/tree-ssa-live.h
@@ -71,8 +71,8 @@ typedef struct _var_map
extern var_map init_var_map (int);
extern void delete_var_map (var_map);
extern int var_union (var_map, tree, tree);
-extern void partition_view_normal (var_map, bool);
-extern void partition_view_bitmap (var_map, bitmap, bool);
+extern void partition_view_normal (var_map);
+extern void partition_view_bitmap (var_map, bitmap);
extern void dump_scope_blocks (FILE *, int);
extern void debug_scope_block (tree, int);
extern void debug_scope_blocks (int);
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 3f6bebe..7bef8cf 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -3964,7 +3964,11 @@ loop_exits_before_overflow (tree base, tree step,
if (!CONVERT_EXPR_P (e) || !operand_equal_p (e, unsigned_base, 0))
continue;
e = TREE_OPERAND (e, 0);
- gcc_assert (operand_equal_p (e, base, 0));
+ /* If E has an unsigned type, the operand equality test below
+ would fail, but the equality test above would have already
+ verified the equality, so we can proceed with it. */
+ gcc_assert (TYPE_UNSIGNED (TREE_TYPE (e))
+ || operand_equal_p (e, base, 0));
if (tree_int_cst_sign_bit (step))
{
code = LT_EXPR;
diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c
index f75a7f1..0982305 100644
--- a/gcc/tree-ssa-uncprop.c
+++ b/gcc/tree-ssa-uncprop.c
@@ -59,6 +59,11 @@ along with GCC; see the file COPYING3. If not see
#include "domwalk.h"
#include "tree-pass.h"
#include "tree-ssa-propagate.h"
+#include "bitmap.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-ssa-live.h"
+#include "tree-ssa-coalesce.h"
/* The basic structure describing an equivalency created by traversing
an edge. Traversing the edge effectively means that we can assume
diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
index 0b24007..acdcd46 100644
--- a/gcc/var-tracking.c
+++ b/gcc/var-tracking.c
@@ -4931,12 +4931,16 @@ dataflow_set_remove_mem_locs (variable_def **slot, dataflow_set *set)
registers, as well as associations between MEMs and VALUEs. */
static void
-dataflow_set_clear_at_call (dataflow_set *set)
+dataflow_set_clear_at_call (dataflow_set *set, rtx_insn *call_insn)
{
unsigned int r;
hard_reg_set_iterator hrsi;
+ HARD_REG_SET invalidated_regs;
- EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, r, hrsi)
+ get_call_reg_set_usage (call_insn, &invalidated_regs,
+ regs_invalidated_by_call);
+
+ EXECUTE_IF_SET_IN_HARD_REG_SET (invalidated_regs, 0, r, hrsi)
var_regno_delete (set, r);
if (MAY_HAVE_DEBUG_INSNS)
@@ -6720,7 +6724,7 @@ compute_bb_dataflow (basic_block bb)
switch (mo->type)
{
case MO_CALL:
- dataflow_set_clear_at_call (out);
+ dataflow_set_clear_at_call (out, insn);
break;
case MO_USE:
@@ -9182,7 +9186,7 @@ emit_notes_in_bb (basic_block bb, dataflow_set *set)
switch (mo->type)
{
case MO_CALL:
- dataflow_set_clear_at_call (set);
+ dataflow_set_clear_at_call (set, insn);
emit_notes_for_changes (insn, EMIT_NOTE_AFTER_CALL_INSN, set->vars);
{
rtx arguments = mo->u.loc, *p = &arguments;
And here's the incremental patch:
---
gcc/alias.c | 17 +++++++------
gcc/cfgexpand.c | 57 +++++++++++++++++----------------------------
gcc/emit-rtl.c | 2 --
gcc/explow.c | 3 --
gcc/expr.c | 16 +++++--------
gcc/function.c | 15 ++++++++++++
gcc/gimple-expr.h | 4 ---
gcc/tree-outof-ssa.c | 7 ++----
gcc/tree-ssa-coalesce.h | 1 +
gcc/tree-ssa-loop-niter.c | 6 ++++-
gcc/tree-ssa-uncprop.c | 5 ++++
11 files changed, 64 insertions(+), 69 deletions(-)
diff --git a/gcc/alias.c b/gcc/alias.c
index 7a74e81..5a031d9 100644
--- a/gcc/alias.c
+++ b/gcc/alias.c
@@ -2553,14 +2553,15 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
return 0;
/* If we refer to different gimple registers, or one gimple register
- and one non-gimple-register, we know they can't overlap. Now,
- there could be more than one stack slot for (different versions
- of) the same gimple register, but we can presumably tell they
- don't overlap based on offsets from stack base addresses
- elsewhere. It's important that we don't proceed to DECL_RTL,
- because gimple registers may not pass DECL_RTL_SET_P, and
- make_decl_rtl won't be able to do anything about them since no
- SSA information will have remained to guide it. */
+ and one non-gimple-register, we know they can't overlap. First,
+ gimple registers don't have their addresses taken. Now, there
+ could be more than one stack slot for (different versions of) the
+ same gimple register, but we can presumably tell they don't
+ overlap based on offsets from stack base addresses elsewhere.
+ It's important that we don't proceed to DECL_RTL, because gimple
+ registers may not pass DECL_RTL_SET_P, and make_decl_rtl won't be
+ able to do anything about them since no SSA information will have
+ remained to guide it. */
if (is_gimple_reg (exprx) || is_gimple_reg (expry))
return exprx != expry;
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 3e80b4a..bf972fc 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -179,11 +179,10 @@ gimple_assign_rhs_to_tree (gimple stmt)
#define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
-/* NEXT is a DECL to be associated with some RTL, CUR is a DECL or a
- TREE_LIST of DECLs. If NEXT is covered by CUR, return CUR
- unchanged. Otherwise, return a list with all entries of CUR, with
- NEXT at the end. If CUR was a list, it will be modified in
- place. */
+/* Choose either CUR or NEXT as the leader DECL for a partition.
+ Prefer ignored decls, to simplify debug dumps and reduce ambiguity
+ out of the same user variable being in multiple partitions (this is
+ less likely for compiler-introduced temps). */
static tree
leader_merge (tree cur, tree next)
@@ -191,26 +190,11 @@ leader_merge (tree cur, tree next)
if (cur == NULL || cur == next)
return next;
- tree list;
+ if (DECL_P (cur) && DECL_IGNORED_P (cur))
+ return cur;
- if (TREE_CODE (cur) == TREE_LIST)
- {
- /* Look for NEXT in the list. Stop at the last node to insert
- there. */
- for (list = cur; ; list = TREE_CHAIN (list))
- {
- if (TREE_VALUE (list) == next)
- return cur;
- if (!TREE_CHAIN (list))
- break;
- }
- }
- else
- /* Create the first node. */
- list = build_tree_list (NULL, cur);
-
- next = build_tree_list (NULL, next);
- TREE_CHAIN (list) = next;
+ if (DECL_P (next) && DECL_IGNORED_P (next))
+ return next;
return cur;
}
@@ -285,9 +269,9 @@ set_rtl (tree t, rtx x)
if (cur != next)
{
if (MEM_P (x))
- set_mem_attributes (x, SSAVAR (t), true);
+ set_mem_attributes (x, next, true);
else
- set_reg_attrs_for_decl_rtl (SSAVAR (t), x);
+ set_reg_attrs_for_decl_rtl (next, x);
}
}
@@ -1025,9 +1009,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
x = plus_constant (Pmode, base, offset);
- x = gen_rtx_MEM ((TREE_CODE (decl) != SSA_NAME || SSA_NAME_VAR (decl))
- ? DECL_MODE (SSAVAR (decl))
- : TYPE_MODE (TREE_TYPE (decl)), x);
+ x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
+ ? TYPE_MODE (TREE_TYPE (decl))
+ : DECL_MODE (SSAVAR (decl)), x);
if (TREE_CODE (decl) != SSA_NAME)
{
@@ -1268,17 +1252,17 @@ expand_one_stack_var_1 (tree var)
HOST_WIDE_INT size, offset;
unsigned byte_align;
- if (TREE_CODE (var) != SSA_NAME || SSA_NAME_VAR (var))
- {
- size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
- byte_align = align_local_variable (SSAVAR (var));
- }
- else
+ if (TREE_CODE (var) == SSA_NAME)
{
tree type = TREE_TYPE (var);
size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
byte_align = TYPE_ALIGN_UNIT (type);
}
+ else
+ {
+ size = tree_to_uhwi (DECL_SIZE_UNIT (var));
+ byte_align = align_local_variable (var);
+ }
/* We handle highly aligned variables in expand_stack_vars. */
gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
@@ -1423,9 +1407,10 @@ expand_one_register_var (tree var)
gcc_assert (REG_P (x));
return;
}
+ gcc_unreachable ();
}
- tree decl = SSAVAR (var);
+ tree decl = var;
tree type = TREE_TYPE (decl);
machine_mode reg_mode = promote_decl_mode (decl, NULL);
rtx x = gen_reg_rtx (reg_mode);
diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index 308da40..2b98946 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -1252,8 +1252,6 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x)
if (!t)
return;
tree tdecl = t;
- if (TREE_CODE (t) == TREE_LIST)
- tdecl = TREE_VALUE (t);
if (GET_CODE (x) == SUBREG)
{
gcc_assert (subreg_lowpart_p (x));
diff --git a/gcc/explow.c b/gcc/explow.c
index e09c032e1..5b0d49c 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -866,9 +866,6 @@ promote_ssa_mode (const_tree name, int *punsignedp)
{
gcc_assert (TREE_CODE (name) == SSA_NAME);
- if (SSA_NAME_VAR (name))
- return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
-
tree type = TREE_TYPE (name);
int unsignedp = TYPE_UNSIGNED (type);
machine_mode mode = TYPE_MODE (type);
diff --git a/gcc/expr.c b/gcc/expr.c
index effe379..5b6e16e 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9584,20 +9584,16 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
/* Get the signedness to be used for this variable. Ensure we get
the same mode we got when the variable was declared. */
- if (code == SSA_NAME
- && (g = SSA_NAME_DEF_STMT (ssa_name))
- && gimple_code (g) == GIMPLE_CALL
- && !gimple_call_internal_p (g))
+ if (code != SSA_NAME)
+ pmode = promote_decl_mode (exp, &unsignedp);
+ else if ((g = SSA_NAME_DEF_STMT (ssa_name))
+ && gimple_code (g) == GIMPLE_CALL
+ && !gimple_call_internal_p (g))
pmode = promote_function_mode (type, mode, &unsignedp,
gimple_call_fntype (g),
2);
- else if (!exp)
- {
- gcc_assert (code == SSA_NAME);
- pmode = promote_ssa_mode (ssa_name, &unsignedp);
- }
else
- pmode = promote_decl_mode (exp, &unsignedp);
+ pmode = promote_ssa_mode (ssa_name, &unsignedp);
gcc_assert (GET_MODE (decl_rtl) == pmode);
temp = gen_lowpart_SUBREG (mode, decl_rtl);
diff --git a/gcc/function.c b/gcc/function.c
index dc9e77f..58e2498 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -2124,6 +2124,21 @@ use_register_for_decl (const_tree decl)
{
if (TREE_CODE (decl) == SSA_NAME)
{
+ /* We often try to use the SSA_NAME, instead of its underlying
+ decl, to get type information and guide decisions, to avoid
+ differences of behavior between anonymous and named
+ variables, but in this one case we have to go for the actual
+ variable if there is one. The main reason is that, at least
+ at -O0, we want to place user variables on the stack, but we
+ don't mind using pseudos for anonymous or ignored temps.
+ Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
+ should go in pseudos, whereas their corresponding variables
+ might have to go on the stack. So, disregarding the decl
+ here would negatively impact debug info at -O0, enable
+ coalescing between SSA_NAMEs that ought to get different
+ stack/pseudo assignments, and get the incoming argument
+ processing thoroughly confused by PARM_DECLs expected to live
+ in stack slots but assigned to pseudos. */
if (!SSA_NAME_VAR (decl))
return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
&& !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
index 146cede..3d1c89f 100644
--- a/gcc/gimple-expr.h
+++ b/gcc/gimple-expr.h
@@ -55,10 +55,6 @@ extern bool is_gimple_mem_ref_addr (tree);
extern void mark_addressable (tree);
extern bool is_gimple_reg_rhs (tree);
-/* Defined in tree-ssa-coalesce.c. */
-extern bool gimple_can_coalesce_p (tree, tree);
-
-
/* Return true if a conversion from either type of TYPE1 and TYPE2
to the other is not required. Otherwise return false. */
diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
index dda9973..59d91c6 100644
--- a/gcc/tree-outof-ssa.c
+++ b/gcc/tree-outof-ssa.c
@@ -305,7 +305,6 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
rtx dest_rtx, seq, x;
machine_mode dest_mode, src_mode;
int unsignedp;
- tree var;
if (dump_file && (dump_flags & TDF_DETAILS))
{
@@ -328,10 +327,9 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
start_sequence ();
tree name = partition_to_var (SA.map, dest);
- var = SSA_NAME_VAR (name);
src_mode = TYPE_MODE (TREE_TYPE (src));
dest_mode = GET_MODE (dest_rtx);
- gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var ? var : name)));
+ gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (name)));
gcc_assert (!REG_P (dest_rtx)
|| dest_mode == promote_ssa_mode (name, &unsignedp));
@@ -709,8 +707,7 @@ elim_backward (elim_graph g, int T)
static rtx
get_temp_reg (tree name)
{
- tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
- tree type = var ? TREE_TYPE (var) : TREE_TYPE (name);
+ tree type = TREE_TYPE (name);
int unsignedp;
machine_mode reg_mode = promote_ssa_mode (name, &unsignedp);
rtx x = gen_reg_rtx (reg_mode);
diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h
index 99b188a..ae289b4 100644
--- a/gcc/tree-ssa-coalesce.h
+++ b/gcc/tree-ssa-coalesce.h
@@ -21,5 +21,6 @@ along with GCC; see the file COPYING3. If not see
#define GCC_TREE_SSA_COALESCE_H
extern var_map coalesce_ssa_name (void);
+extern bool gimple_can_coalesce_p (tree, tree);
#endif /* GCC_TREE_SSA_COALESCE_H */
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 3f6bebe..7bef8cf 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -3964,7 +3964,11 @@ loop_exits_before_overflow (tree base, tree step,
if (!CONVERT_EXPR_P (e) || !operand_equal_p (e, unsigned_base, 0))
continue;
e = TREE_OPERAND (e, 0);
- gcc_assert (operand_equal_p (e, base, 0));
+ /* If E has an unsigned type, the operand equality test below
+ would fail, but the equality test above would have already
+ verified the equality, so we can proceed with it. */
+ gcc_assert (TYPE_UNSIGNED (TREE_TYPE (e))
+ || operand_equal_p (e, base, 0));
if (tree_int_cst_sign_bit (step))
{
code = LT_EXPR;
diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c
index f75a7f1..0982305 100644
--- a/gcc/tree-ssa-uncprop.c
+++ b/gcc/tree-ssa-uncprop.c
@@ -59,6 +59,11 @@ along with GCC; see the file COPYING3. If not see
#include "domwalk.h"
#include "tree-pass.h"
#include "tree-ssa-propagate.h"
+#include "bitmap.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-ssa-live.h"
+#include "tree-ssa-coalesce.h"
/* The basic structure describing an equivalency created by traversing
an edge. Traversing the edge effectively means that we can assume
--
Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/ FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
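
For readers following the leader_merge rewrite in the incremental patch
above, here is a minimal standalone sketch of the selection policy. It
assumes a toy decl type: struct toy_decl, its "ignored" field (standing in
for DECL_IGNORED_P) and toy_leader_merge are hypothetical names used only
for illustration, not GCC's tree API. The assert reflects an assumption
that the caller never passes a null NEXT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a GCC decl; this is not the real tree type.  */
struct toy_decl
{
  const char *name;
  bool ignored;   /* Models DECL_IGNORED_P.  */
};

/* Choose either CUR or NEXT as the leader of a partition, following the
   control flow of the rewritten leader_merge: take NEXT when there is no
   leader yet, otherwise prefer an ignored (compiler-introduced) decl, and
   keep the current leader when neither candidate is ignored.  */
static const struct toy_decl *
toy_leader_merge (const struct toy_decl *cur, const struct toy_decl *next)
{
  /* Assumption of this sketch: callers always supply a candidate.  */
  assert (next != NULL);

  if (cur == NULL || cur == next)
    return next;

  if (cur->ignored)
    return cur;

  if (next->ignored)
    return next;

  return cur;
}
```

Folding the candidates a, b, tmp, b (with only tmp ignored) through this
function in that order leaves tmp as the leader: the first user variable
wins until an ignored temp shows up, and the temp is then kept.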
* Re: [PR64164] drop copyrename, integrate into expand
2015-06-06 5:12 ` Alexandre Oliva
@ 2015-06-08 8:16 ` Richard Biener
2015-06-09 8:58 ` Christophe Lyon
2015-06-10 0:28 ` Alexandre Oliva
1 sibling, 1 reply; 127+ messages in thread
From: Richard Biener @ 2015-06-08 8:16 UTC (permalink / raw)
To: Alexandre Oliva; +Cc: Jeff Law, GCC Patches
On Sat, Jun 6, 2015 at 3:14 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Apr 27, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
>
>> This should also mention that is_gimple_reg vars do not have their
>> address taken.
>
> check
>
>>> +static tree
>>> +leader_merge (tree cur, tree next)
>
>> Ick - presumably you can't use sth better than a TREE_LIST here?
>
> The list was an experiment that never really worked, and when I tried to
> make it work after the patch, it proved to be unworkable, so I dropped
> it, and rewrote leader_merge to choose either of the params, preferring
> anonymous over ignored over named, so as to reduce the likelihood of
> misreading of debug dumps, since that's all they're used for.
>
>>> static void
>>> -expand_one_stack_var (tree var)
>>> +expand_one_stack_var_1 (tree var)
>>> {
>>> HOST_WIDE_INT size, offset;
>>> unsigned byte_align;
>>>
>>> - size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
>>> - byte_align = align_local_variable (SSAVAR (var));
>>> + if (TREE_CODE (var) != SSA_NAME || SSA_NAME_VAR (var))
>>> + {
>>> + size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
>>> + byte_align = align_local_variable (SSAVAR (var));
>>> + }
>>> + else
>
>> I'd go here for all TREE_CODE (var) == SSA_NAME
>
> Check
>
>> (and get rid of the SSAVAR macro?)
>
> There are remaining uses that don't seem worth dropping it for.
>
>>> +/* Return the promoted mode for name. If it is a named SSA_NAME, it
>>> + is the same as promote_decl_mode. Otherwise, it is the promoted
>>> + mode of a temp decl of same type as the SSA_NAME, if we had created
>>> + one. */
>>> +
>>> +machine_mode
>>> +promote_ssa_mode (const_tree name, int *punsignedp)
>>> +{
>>> + gcc_assert (TREE_CODE (name) == SSA_NAME);
>>> +
>>> + if (SSA_NAME_VAR (name))
>>> + return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
>
>> As above I'd rather not have different paths for anonymous vs. non-anonymous
>> vars (so just delete the above two lines).
>
> Check
>
>>> @@ -9668,6 +9678,11 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>>> pmode = promote_function_mode (type, mode, &unsignedp,
>>> gimple_call_fntype (g),
>>> 2);
>>> + else if (!exp)
>>> + {
>>> + gcc_assert (code == SSA_NAME);
>
>> promote_ssa_mode should assert this.
>
>>> + pmode = promote_ssa_mode (ssa_name, &unsignedp);
>
> It does, so... check.
>
>
>>> @@ -2121,6 +2122,15 @@ aggregate_value_p (const_tree exp, const_tree fntype)
>>> bool
>>> use_register_for_decl (const_tree decl)
>>> {
>>> + if (TREE_CODE (decl) == SSA_NAME)
>>> + {
>>> + if (!SSA_NAME_VAR (decl))
>>> + return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
>>> + && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
>>> +
>>> + decl = SSA_NAME_VAR (decl);
>
>> See above. Please drop the SSA_NAME_VAR != NULL path.
>
> Check, then taken back, after a bootstrap failure and some debugging
> made me realize this would be wrong. Here are the newly-added comments
> that explain why:
>
> /* We often try to use the SSA_NAME, instead of its underlying
> decl, to get type information and guide decisions, to avoid
> differences of behavior between anonymous and named
> variables, but in this one case we have to go for the actual
> variable if there is one. The main reason is that, at least
> at -O0, we want to place user variables on the stack, but we
> don't mind using pseudos for anonymous or ignored temps.
> Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
> should go in pseudos, whereas their corresponding variables
> might have to go on the stack. So, disregarding the decl
> here would negatively impact debug info at -O0, enable
> coalescing between SSA_NAMEs that ought to get different
> stack/pseudo assignments, and get the incoming argument
> processing thoroughly confused by PARM_DECLs expected to live
> in stack slots but assigned to pseudos. */
>
>
>>> +++ b/gcc/gimple-expr.h
>>> +/* Defined in tree-ssa-coalesce.c. */
>>> +extern bool gimple_can_coalesce_p (tree, tree);
>
>> Err, put it to tree-ssa-coalesce.h?
>
> Check. Lots of additional headers required to be able to include
> tree-ssa-coalesce.h, though.
>
>
>>> - gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var)));
>>> + gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var ? var : name)));
>
>> The TREE_TYPE of name and its SSA_NAME_VAR are always the same. So just
>> use TREE_TYPE (name) here.
>
> Check
>
>>> gcc_assert (!REG_P (dest_rtx)
>>> - || dest_mode == promote_decl_mode (var, &unsignedp));
>>> + || dest_mode == promote_ssa_mode (name, &unsignedp));
>>>
>>> if (src_mode != dest_mode)
>>> {
>>> @@ -714,12 +715,12 @@ static rtx
>>> get_temp_reg (tree name)
>>> {
>>> tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
>>> - tree type = TREE_TYPE (var);
>>> + tree type = var ? TREE_TYPE (var) : TREE_TYPE (name);
>
>> See above.
>
> Check
>
>
> Here's the revised patch, regstrapped on x86_64-linux-gnu and
> i686-linux-gnu. The first attempt failed to compile libjava on x86_64,
> requiring the new change in tree-ssa-loop-niter.c to pass. It didn't
> occur in the unpatched tree because the differences between anon or
> named SSA_NAMEs in copyrename changed costs and caused different choices
> in ivopts, which ultimately failed to expose the problem in loop-niter
> during vrp.
>
> At the end, I enclose the incremental changes since the previous
> revision of the patch, to ease the incremental review.
>
> Ok to install?
Ok.
Thanks,
Richard.
>
> for gcc/ChangeLog
>
> PR rtl-optimization/64164
> * Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
> * tree-ssa-copyrename.c: Removed.
> * opts.c (default_options_table): Drop -ftree-copyrename. Add
> -ftree-coalesce-vars.
> * passes.def: Drop all occurrences of pass_rename_ssa_copies.
> * common.opt (ftree-copyrename): Ignore.
> (ftree-coalesce-inlined-vars): Likewise.
> * doc/invoke.texi: Remove the ignored options above.
> * gimple-expr.h (gimple_can_coalesce_p): Move declaration
> * tree-ssa-coalesce.h: ... here.
> * tree-ssa-uncprop.c: Include tree-ssa-coalesce.h and other
> headers required by it.
> * gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
> across variables when flag_tree_coalesce_vars. Check register
> use and promoted modes to allow coalescing. Moved to
> tree-ssa-coalesce.c.
> * tree-ssa-live.c (struct tree_int_map_hasher): Move along
> with its member functions to tree-ssa-coalesce.c.
> (var_map_base_init): Likewise. Renamed to
> compute_samebase_partition_bases.
> (partition_view_normal): Drop want_bases parameter.
> (partition_view_bitmap): Likewise.
> * tree-ssa-live.h: Adjust declarations.
> * tree-ssa-coalesce.c: Include explow.h.
> (build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs's
> default defs at the entry point.
> (dump_part_var_map): New.
> (compute_optimized_partition_bases): New, called by...
> (coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
> of compute_samebase_partition_bases. Adjust.
> * alias.c (nonoverlapping_memrefs_p): Disregard gimple-regs.
> * cfgexpand.c (leader_merge): New.
> (get_rtl_for_parm_ssa_default_def): New.
> (set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
> vars. Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
> (expand_one_stack_var_at): Handle anonymous SSA_NAMEs. Drop
> redundant MEM attr setting.
> (expand_one_stack_var_1): Handle anonymous SSA_NAMEs. Renamed
> from...
> (expand_one_stack_var): ... this. New wrapper to check and
> skip already expanded SSA partitions.
> (record_alignment_for_reg_var): New, factored out of...
> (expand_one_var): ... this.
> (expand_one_ssa_partition): New.
> (adjust_one_expanded_partition_var): New.
> (expand_one_register_var): Check and skip already expanded SSA
> partitions.
> (expand_used_vars): Don't create DECLs for anonymous SSA
> names. Expand all SSA partitions, then adjust all SSA names.
> (pass::execute): Replace the loops that set
> SA.partition_to_pseudo from partition leaders and cleared
> DECL_RTL for multi-location variables, and that which used to
> rename vars and set attrs, with one that clears DECL_RTL and
> checks that PARMs and RESULTs default_defs match DECL_RTL.
> * cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
> * emit-rtl.c (set_reg_attrs_for_parm): Handle NULL decl.
> * explow.c (promote_ssa_mode): New.
> * explow.h (promote_ssa_mode): Declare.
> * expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
> * function.c: Include cfgexpand.h.
> (use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
> (use_register_for_parm_decl): Wrapper for the above to
> special-case the result_ptr.
> (rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
> (maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
> multiple locations.
> (assign_parm_adjust_stack_rtl): Add all and parm arguments,
> for rtl_for_parm. For SSA-assigned parms, zero stack_parm.
> (assign_parm_setup_block): Prefer SSA-assigned location.
> (assign_parm_setup_reg): Likewise. Use entry_parm for equiv
> if stack_parm is NULL.
> (assign_parm_setup_stack): Prefer SSA-assigned location.
> (assign_parms): Maybe reset DECL_RTL of params. Adjust stack
> rtl before testing for pointer bounds. Special-case result_ptr.
> (expand_function_start): Maybe reset DECL_RTL of result.
> Prefer SSA-assigned location for result and static chain.
> Factor out DECL_RESULT and SET_DECL_RTL.
> * tree-outof-ssa.c (insert_value_copy_on_edge): Handle
> anonymous SSA names. Use promote_ssa_mode.
> (get_temp_reg): Likewise.
> (remove_ssa_form): Adjust.
> * var-tracking.c (dataflow_set_clear_at_call): Take call_insn
> and get its reg_usage for reg invalidation.
> (compute_bb_dataflow): Pass it insn.
> (emit_notes_in_bb): Likewise.
> * tree-ssa-loop-niter.c (loop_exits_before_overflow): Don't
> fail assert on conversion between unsigned types.
>
> for gcc/testsuite/ChangeLog
>
> * gcc.dg/guality/pr54200.c: Add -fno-tree-coalesce-vars.
> * gcc.dg/ssp-1.c: Make counter a register.
> * gcc.dg/ssp-2.c: Likewise.
> * gcc.dg/torture/parm-coalesce.c: New.
> ---
> gcc/Makefile.in | 1
> gcc/alias.c | 13 +
> gcc/cfgexpand.c | 370 ++++++++++++++-----
> gcc/cfgexpand.h | 2
> gcc/common.opt | 12 -
> gcc/doc/invoke.texi | 48 +--
> gcc/emit-rtl.c | 5
> gcc/explow.c | 22 +
> gcc/explow.h | 3
> gcc/expr.c | 39 +-
> gcc/function.c | 226 +++++++++---
> gcc/gimple-expr.c | 39 --
> gcc/gimple-expr.h | 1
> gcc/opts.c | 2
> gcc/passes.def | 5
> gcc/testsuite/gcc.dg/guality/pr54200.c | 2
> gcc/testsuite/gcc.dg/ssp-1.c | 2
> gcc/testsuite/gcc.dg/ssp-2.c | 2
> gcc/testsuite/gcc.dg/torture/parm-coalesce.c | 40 ++
> gcc/tree-outof-ssa.c | 16 -
> gcc/tree-ssa-coalesce.c | 380 +++++++++++++++++++-
> gcc/tree-ssa-coalesce.h | 1
> gcc/tree-ssa-copyrename.c | 499 --------------------------
> gcc/tree-ssa-live.c | 101 -----
> gcc/tree-ssa-live.h | 4
> gcc/tree-ssa-loop-niter.c | 6
> gcc/tree-ssa-uncprop.c | 5
> gcc/var-tracking.c | 12 -
> 28 files changed, 984 insertions(+), 874 deletions(-)
> create mode 100644 gcc/testsuite/gcc.dg/torture/parm-coalesce.c
> delete mode 100644 gcc/tree-ssa-copyrename.c
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index 3d14938..2a03223 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1441,7 +1441,6 @@ OBJS = \
> tree-ssa-ccp.o \
> tree-ssa-coalesce.o \
> tree-ssa-copy.o \
> - tree-ssa-copyrename.o \
> tree-ssa-dce.o \
> tree-ssa-dom.o \
> tree-ssa-dse.o \
> diff --git a/gcc/alias.c b/gcc/alias.c
> index ea539c5..5a031d9 100644
> --- a/gcc/alias.c
> +++ b/gcc/alias.c
> @@ -2552,6 +2552,19 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
> if (! DECL_P (exprx) || ! DECL_P (expry))
> return 0;
>
> + /* If we refer to different gimple registers, or one gimple register
> + and one non-gimple-register, we know they can't overlap. First,
> + gimple registers don't have their addresses taken. Now, there
> + could be more than one stack slot for (different versions of) the
> + same gimple register, but we can presumably tell they don't
> + overlap based on offsets from stack base addresses elsewhere.
> + It's important that we don't proceed to DECL_RTL, because gimple
> + registers may not pass DECL_RTL_SET_P, and make_decl_rtl won't be
> + able to do anything about them since no SSA information will have
> + remained to guide it. */
> + if (is_gimple_reg (exprx) || is_gimple_reg (expry))
> + return exprx != expry;
> +
> /* With invalid code we can end up storing into the constant pool.
> Bail out to avoid ICEing when creating RTL for this.
> See gfortran.dg/lto/20091028-2_0.f90. */
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index b190f91..bf972fc 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -179,21 +179,121 @@ gimple_assign_rhs_to_tree (gimple stmt)
>
> #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
>
> +/* Choose either CUR or NEXT as the leader DECL for a partition.
> + Prefer ignored decls, to simplify debug dumps and reduce ambiguity
> + out of the same user variable being in multiple partitions (this is
> + less likely for compiler-introduced temps). */
> +
> +static tree
> +leader_merge (tree cur, tree next)
> +{
> + if (cur == NULL || cur == next)
> + return next;
> +
> + if (DECL_P (cur) && DECL_IGNORED_P (cur))
> + return cur;
> +
> + if (DECL_P (next) && DECL_IGNORED_P (next))
> + return next;
> +
> + return cur;
> +}
> +
> +
> +/* Return the RTL for the default SSA def of a PARM or RESULT, if
> + there is one. */
> +
> +rtx
> +get_rtl_for_parm_ssa_default_def (tree var)
> +{
> + gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL);
> +
> + if (!is_gimple_reg (var))
> + return NULL_RTX;
> +
> + /* If we've already determined RTL for the decl, use it. This is
> + not just an optimization: if VAR is a PARM whose incoming value
> + is unused, we won't find a default def to use its partition, but
> + we still want to use the location of the parm, if it was used at
> + all. During assign_parms, until a location is assigned for the
> + VAR, RTL can only be set for a parm or result if we're not coalescing
> + across variables, when we know we're coalescing all SSA_NAMEs of
> + each parm or result, and we're not coalescing them with names
> + pertaining to other variables, such as other parms' default
> + defs. */
> + if (DECL_RTL_SET_P (var))
> + {
> + gcc_assert (DECL_RTL (var) != pc_rtx);
> + return DECL_RTL (var);
> + }
> +
> + tree name = ssa_default_def (cfun, var);
> +
> + if (!name)
> + return NULL_RTX;
> +
> + int part = var_to_partition (SA.map, name);
> + if (part == NO_PARTITION)
> + return NULL_RTX;
> +
> + return SA.partition_to_pseudo[part];
> +}
> +
> /* Associate declaration T with storage space X. If T is no
> SSA name this is exactly SET_DECL_RTL, otherwise make the
> partition of T associated with X. */
> static inline void
> set_rtl (tree t, rtx x)
> {
> + if (x && SSAVAR (t))
> + {
> + bool skip = false;
> + tree cur = NULL_TREE;
> +
> + if (MEM_P (x))
> + cur = MEM_EXPR (x);
> + else if (REG_P (x))
> + cur = REG_EXPR (x);
> + else if (GET_CODE (x) == CONCAT
> + && REG_P (XEXP (x, 0)))
> + cur = REG_EXPR (XEXP (x, 0));
> + else if (GET_CODE (x) == PARALLEL)
> + cur = REG_EXPR (XVECEXP (x, 0, 0));
> + else if (x == pc_rtx)
> + skip = true;
> + else
> + gcc_unreachable ();
> +
> + tree next = skip ? cur : leader_merge (cur, SSAVAR (t));
> +
> + if (cur != next)
> + {
> + if (MEM_P (x))
> + set_mem_attributes (x, next, true);
> + else
> + set_reg_attrs_for_decl_rtl (next, x);
> + }
> + }
> +
> if (TREE_CODE (t) == SSA_NAME)
> {
> - SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x;
> - if (x && !MEM_P (x))
> - set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x);
> - /* For the benefit of debug information at -O0 (where vartracking
> - doesn't run) record the place also in the base DECL if it's
> - a normal variable (not a parameter). */
> - if (x && x != pc_rtx && TREE_CODE (SSA_NAME_VAR (t)) == VAR_DECL)
> + int part = var_to_partition (SA.map, t);
> + if (part != NO_PARTITION)
> + {
> + if (SA.partition_to_pseudo[part])
> + gcc_assert (SA.partition_to_pseudo[part] == x);
> + else
> + SA.partition_to_pseudo[part] = x;
> + }
> + /* For the benefit of debug information at -O0 (where
> + vartracking doesn't run) record the place also in the base
> + DECL. For PARMs and RESULTs, we may end up resetting these
> + in function.c:maybe_reset_rtl_for_parm, but in some rare
> + cases we may need them (unused and overwritten incoming
> + value, that at -O0 must share the location with the other
> + uses in spite of the missing default def), and this may be
> + the only chance to preserve them. */
> + if (x && x != pc_rtx && SSA_NAME_VAR (t))
> {
> tree var = SSA_NAME_VAR (t);
> /* If we don't yet have something recorded, just record it now. */
> @@ -909,7 +1009,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
> gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
>
> x = plus_constant (Pmode, base, offset);
> - x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x);
> + x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
> + ? TYPE_MODE (TREE_TYPE (decl))
> + : DECL_MODE (SSAVAR (decl)), x);
>
> if (TREE_CODE (decl) != SSA_NAME)
> {
> @@ -931,7 +1033,6 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
> DECL_USER_ALIGN (decl) = 0;
> }
>
> - set_mem_attributes (x, SSAVAR (decl), true);
> set_rtl (decl, x);
> }
>
> @@ -1146,13 +1247,22 @@ account_stack_vars (void)
> to a variable to be allocated in the stack frame. */
>
> static void
> -expand_one_stack_var (tree var)
> +expand_one_stack_var_1 (tree var)
> {
> HOST_WIDE_INT size, offset;
> unsigned byte_align;
>
> - size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
> - byte_align = align_local_variable (SSAVAR (var));
> + if (TREE_CODE (var) == SSA_NAME)
> + {
> + tree type = TREE_TYPE (var);
> + size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
> + byte_align = TYPE_ALIGN_UNIT (type);
> + }
> + else
> + {
> + size = tree_to_uhwi (DECL_SIZE_UNIT (var));
> + byte_align = align_local_variable (var);
> + }
>
> /* We handle highly aligned variables in expand_stack_vars. */
> gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
> @@ -1163,6 +1273,27 @@ expand_one_stack_var (tree var)
> crtl->max_used_stack_slot_alignment, offset);
> }
>
> +/* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are
> + already assigned some MEM. */
> +
> +static void
> +expand_one_stack_var (tree var)
> +{
> + if (TREE_CODE (var) == SSA_NAME)
> + {
> + int part = var_to_partition (SA.map, var);
> + if (part != NO_PARTITION)
> + {
> + rtx x = SA.partition_to_pseudo[part];
> + gcc_assert (x);
> + gcc_assert (MEM_P (x));
> + return;
> + }
> + }
> +
> + return expand_one_stack_var_1 (var);
> +}
> +
> /* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL
> that will reside in a hard register. */
>
> @@ -1172,13 +1303,114 @@ expand_one_hard_reg_var (tree var)
> rest_of_decl_compilation (var, 0, 0);
> }
>
> +/* Record the alignment requirements of some variable assigned to a
> + pseudo. */
> +
> +static void
> +record_alignment_for_reg_var (unsigned int align)
> +{
> + if (SUPPORTS_STACK_ALIGNMENT
> + && crtl->stack_alignment_estimated < align)
> + {
> + /* stack_alignment_estimated shouldn't change after the stack
> + realign decision is made. */
> + gcc_assert (!crtl->stack_realign_processed);
> + crtl->stack_alignment_estimated = align;
> + }
> +
> + /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
> + So here we only make sure stack_alignment_needed >= align. */
> + if (crtl->stack_alignment_needed < align)
> + crtl->stack_alignment_needed = align;
> + if (crtl->max_used_stack_slot_alignment < align)
> + crtl->max_used_stack_slot_alignment = align;
> +}
> +
> +/* Create RTL for an SSA partition. */
> +
> +static void
> +expand_one_ssa_partition (tree var)
> +{
> + int part = var_to_partition (SA.map, var);
> + gcc_assert (part != NO_PARTITION);
> +
> + if (SA.partition_to_pseudo[part])
> + return;
> +
> + if (!use_register_for_decl (var))
> + {
> + expand_one_stack_var_1 (var);
> + return;
> + }
> +
> + unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (var),
> + TYPE_MODE (TREE_TYPE (var)),
> + TYPE_ALIGN (TREE_TYPE (var)));
> +
> + /* If the variable alignment is very large we'll dynamically allocate
> + it, which means that the in-frame portion is just a pointer. */
> + if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
> + align = POINTER_SIZE;
> +
> + record_alignment_for_reg_var (align);
> +
> + machine_mode reg_mode = promote_ssa_mode (var, NULL);
> +
> + rtx x = gen_reg_rtx (reg_mode);
> +
> + set_rtl (var, x);
> +}
> +
> +/* Record the association between the RTL generated for a partition
> + and the underlying variable of the SSA_NAME. */
> +
> +static void
> +adjust_one_expanded_partition_var (tree var)
> +{
> + if (!var)
> + return;
> +
> + tree decl = SSA_NAME_VAR (var);
> +
> + int part = var_to_partition (SA.map, var);
> + if (part == NO_PARTITION)
> + return;
> +
> + rtx x = SA.partition_to_pseudo[part];
> +
> + set_rtl (var, x);
> +
> + if (!REG_P (x))
> + return;
> +
> + /* Note if the object is a user variable. */
> + if (decl && !DECL_ARTIFICIAL (decl))
> + mark_user_reg (x);
> +
> + if (POINTER_TYPE_P (decl ? TREE_TYPE (decl) : TREE_TYPE (var)))
> + mark_reg_pointer (x, get_pointer_alignment (var));
> +}
> +
> /* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL
> that will reside in a pseudo register. */
>
> static void
> expand_one_register_var (tree var)
> {
> - tree decl = SSAVAR (var);
> + if (TREE_CODE (var) == SSA_NAME)
> + {
> + int part = var_to_partition (SA.map, var);
> + if (part != NO_PARTITION)
> + {
> + rtx x = SA.partition_to_pseudo[part];
> + gcc_assert (x);
> + gcc_assert (REG_P (x));
> + return;
> + }
> + gcc_unreachable ();
> + }
> +
> + tree decl = var;
> tree type = TREE_TYPE (decl);
> machine_mode reg_mode = promote_decl_mode (decl, NULL);
> rtx x = gen_reg_rtx (reg_mode);
> @@ -1312,21 +1544,7 @@ expand_one_var (tree var, bool toplevel, bool really_expand)
> align = POINTER_SIZE;
> }
>
> - if (SUPPORTS_STACK_ALIGNMENT
> - && crtl->stack_alignment_estimated < align)
> - {
> - /* stack_alignment_estimated shouldn't change after stack
> - realign decision made */
> - gcc_assert (!crtl->stack_realign_processed);
> - crtl->stack_alignment_estimated = align;
> - }
> -
> - /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
> - So here we only make sure stack_alignment_needed >= align. */
> - if (crtl->stack_alignment_needed < align)
> - crtl->stack_alignment_needed = align;
> - if (crtl->max_used_stack_slot_alignment < align)
> - crtl->max_used_stack_slot_alignment = align;
> + record_alignment_for_reg_var (align);
>
> if (TREE_CODE (origvar) == SSA_NAME)
> {
> @@ -1760,48 +1978,18 @@ expand_used_vars (void)
> if (targetm.use_pseudo_pic_reg ())
> pic_offset_table_rtx = gen_reg_rtx (Pmode);
>
> - hash_map<tree, tree> ssa_name_decls;
> for (i = 0; i < SA.map->num_partitions; i++)
> {
> tree var = partition_to_var (SA.map, i);
>
> gcc_assert (!virtual_operand_p (var));
>
> - /* Assign decls to each SSA name partition, share decls for partitions
> - we could have coalesced (those with the same type). */
> - if (SSA_NAME_VAR (var) == NULL_TREE)
> - {
> - tree *slot = &ssa_name_decls.get_or_insert (TREE_TYPE (var));
> - if (!*slot)
> - *slot = create_tmp_reg (TREE_TYPE (var));
> - replace_ssa_name_symbol (var, *slot);
> - }
> -
> - /* Always allocate space for partitions based on VAR_DECLs. But for
> - those based on PARM_DECLs or RESULT_DECLs and which matter for the
> - debug info, there is no need to do so if optimization is disabled
> - because all the SSA_NAMEs based on these DECLs have been coalesced
> - into a single partition, which is thus assigned the canonical RTL
> - location of the DECLs. If in_lto_p, we can't rely on optimize,
> - a function could be compiled with -O1 -flto first and only the
> - link performed at -O0. */
> - if (TREE_CODE (SSA_NAME_VAR (var)) == VAR_DECL)
> - expand_one_var (var, true, true);
> - else if (DECL_IGNORED_P (SSA_NAME_VAR (var)) || optimize || in_lto_p)
> - {
> - /* This is a PARM_DECL or RESULT_DECL. For those partitions that
> - contain the default def (representing the parm or result itself)
> - we don't do anything here. But those which don't contain the
> - default def (representing a temporary based on the parm/result)
> - we need to allocate space just like for normal VAR_DECLs. */
> - if (!bitmap_bit_p (SA.partition_has_default_def, i))
> - {
> - expand_one_var (var, true, true);
> - gcc_assert (SA.partition_to_pseudo[i]);
> - }
> - }
> + expand_one_ssa_partition (var);
> }
>
> + for (i = 1; i < num_ssa_names; i++)
> + adjust_one_expanded_partition_var (ssa_name (i));
> +
> if (flag_stack_protect == SPCT_FLAG_STRONG)
> gen_stack_protect_signal
> = stack_protect_decl_p () || stack_protect_return_slot_p ();
> @@ -5961,35 +6149,6 @@ pass_expand::execute (function *fun)
> parm_birth_insn = var_seq;
> }
>
> - /* Now that we also have the parameter RTXs, copy them over to our
> - partitions. */
> - for (i = 0; i < SA.map->num_partitions; i++)
> - {
> - tree var = SSA_NAME_VAR (partition_to_var (SA.map, i));
> -
> - if (TREE_CODE (var) != VAR_DECL
> - && !SA.partition_to_pseudo[i])
> - SA.partition_to_pseudo[i] = DECL_RTL_IF_SET (var);
> - gcc_assert (SA.partition_to_pseudo[i]);
> -
> - /* If this decl was marked as living in multiple places, reset
> - this now to NULL. */
> - if (DECL_RTL_IF_SET (var) == pc_rtx)
> - SET_DECL_RTL (var, NULL);
> -
> - /* Some RTL parts really want to look at DECL_RTL(x) when x
> - was a decl marked in REG_ATTR or MEM_ATTR. We could use
> - SET_DECL_RTL here making this available, but that would mean
> - to select one of the potentially many RTLs for one DECL. Instead
> - of doing that we simply reset the MEM_EXPR of the RTL in question,
> - then nobody can get at it and hence nobody can call DECL_RTL on it. */
> - if (!DECL_RTL_SET_P (var))
> - {
> - if (MEM_P (SA.partition_to_pseudo[i]))
> - set_mem_expr (SA.partition_to_pseudo[i], NULL);
> - }
> - }
> -
> /* If we have a class containing differently aligned pointers
> we need to merge those into the corresponding RTL pointer
> alignment. */
> @@ -5997,7 +6156,6 @@ pass_expand::execute (function *fun)
> {
> tree name = ssa_name (i);
> int part;
> - rtx r;
>
> if (!name
> /* We might have generated new SSA names in
> @@ -6010,20 +6168,24 @@ pass_expand::execute (function *fun)
> if (part == NO_PARTITION)
> continue;
>
> - /* Adjust all partition members to get the underlying decl of
> - the representative which we might have created in expand_one_var. */
> - if (SSA_NAME_VAR (name) == NULL_TREE)
> + gcc_assert (SA.partition_to_pseudo[part]);
> +
> + /* If this decl was marked as living in multiple places, reset
> + this now to NULL. */
> + tree var = SSA_NAME_VAR (name);
> + if (var && DECL_RTL_IF_SET (var) == pc_rtx)
> + SET_DECL_RTL (var, NULL);
> + /* Check that the pseudos chosen by assign_parms are those of
> + the corresponding default defs. */
> + else if (SSA_NAME_IS_DEFAULT_DEF (name)
> + && (TREE_CODE (var) == PARM_DECL
> + || TREE_CODE (var) == RESULT_DECL))
> {
> - tree leader = partition_to_var (SA.map, part);
> - gcc_assert (SSA_NAME_VAR (leader) != NULL_TREE);
> - replace_ssa_name_symbol (name, SSA_NAME_VAR (leader));
> + rtx in = DECL_RTL_IF_SET (var);
> + gcc_assert (in);
> + rtx out = SA.partition_to_pseudo[part];
> + gcc_assert (in == out || rtx_equal_p (in, out));
> }
> - if (!POINTER_TYPE_P (TREE_TYPE (name)))
> - continue;
> -
> - r = SA.partition_to_pseudo[part];
> - if (REG_P (r))
> - mark_reg_pointer (r, get_pointer_alignment (name));
> }
>
> /* If this function is `main', emit a call to `__main'
> diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
> index a0b6e3e..602579d 100644
> --- a/gcc/cfgexpand.h
> +++ b/gcc/cfgexpand.h
> @@ -22,5 +22,7 @@ along with GCC; see the file COPYING3. If not see
>
> extern tree gimple_assign_rhs_to_tree (gimple);
> extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
> +extern rtx get_rtl_for_parm_ssa_default_def (tree var);
> +
>
> #endif /* GCC_CFGEXPAND_H */
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 32b416a..051f824 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2227,16 +2227,16 @@ Common Report Var(flag_tree_ch) Optimization
> Enable loop header copying on trees
>
> ftree-coalesce-inlined-vars
> -Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization
> -Enable coalescing of copy-related user variables that are inlined
> +Common Ignore RejectNegative
> +Does nothing. Preserved for backward compatibility.
>
> ftree-coalesce-vars
> -Common Report Var(flag_ssa_coalesce_vars,2) Optimization
> -Enable coalescing of all copy-related user variables
> +Common Report Var(flag_tree_coalesce_vars) Optimization
> +Enable SSA coalescing of user variables
>
> ftree-copyrename
> -Common Report Var(flag_tree_copyrename) Optimization
> -Replace SSA temporaries with better names in copies
> +Common Ignore
> +Does nothing. Preserved for backward compatibility.
>
> ftree-copy-prop
> Common Report Var(flag_tree_copy_prop) Optimization
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index e25bd62..e359be2 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -339,7 +339,6 @@ Objective-C and Objective-C++ Dialects}.
> -fdump-tree-phiprop@r{[}-@var{n}@r{]} @gol
> -fdump-tree-phiopt@r{[}-@var{n}@r{]} @gol
> -fdump-tree-forwprop@r{[}-@var{n}@r{]} @gol
> --fdump-tree-copyrename@r{[}-@var{n}@r{]} @gol
> -fdump-tree-nrv -fdump-tree-vect @gol
> -fdump-tree-sink @gol
> -fdump-tree-sra@r{[}-@var{n}@r{]} @gol
> @@ -445,9 +444,8 @@ Objective-C and Objective-C++ Dialects}.
> -fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol
> -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
> -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
> --ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
> --ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
> --ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
> +-ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol
> +-ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
> -ftree-loop-if-convert-stores -ftree-loop-im @gol
> -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
> -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
> @@ -7076,11 +7074,6 @@ name is made by appending @file{.phiopt} to the source file name.
> Dump each function after forward propagating single use variables. The file
> name is made by appending @file{.forwprop} to the source file name.
>
> -@item copyrename
> -@opindex fdump-tree-copyrename
> -Dump each function after applying the copy rename optimization. The file
> -name is made by appending @file{.copyrename} to the source file name.
> -
> @item nrv
> @opindex fdump-tree-nrv
> Dump each function after applying the named return value optimization on
> @@ -7545,8 +7538,8 @@ compilation time.
> -ftree-ccp @gol
> -fssa-phiopt @gol
> -ftree-ch @gol
> +-ftree-coalesce-vars @gol
> -ftree-copy-prop @gol
> --ftree-copyrename @gol
> -ftree-dce @gol
> -ftree-dominator-opts @gol
> -ftree-dse @gol
> @@ -8815,6 +8808,15 @@ profitable to parallelize the loops.
> Compare the results of several data dependence analyzers. This option
> is used for debugging the data dependence analyzers.
>
> +@item -ftree-coalesce-vars
> +@opindex ftree-coalesce-vars
> +Tell the compiler to attempt to combine small user-defined variables
> +too, instead of just compiler temporaries. This may severely limit the
> +ability to debug an optimized program compiled with
> +@option{-fno-var-tracking-assignments}. In the negated form, this flag
> +prevents SSA coalescing of user variables. This option is enabled by
> +default if optimization is enabled.
> +
> @item -ftree-loop-if-convert
> @opindex ftree-loop-if-convert
> Attempt to transform conditional jumps in the innermost loops to
> @@ -8928,32 +8930,6 @@ Perform scalar replacement of aggregates. This pass replaces structure
> references with scalars to prevent committing structures to memory too
> early. This flag is enabled by default at @option{-O} and higher.
>
> -@item -ftree-copyrename
> -@opindex ftree-copyrename
> -Perform copy renaming on trees. This pass attempts to rename compiler
> -temporaries to other variables at copy locations, usually resulting in
> -variable names which more closely resemble the original variables. This flag
> -is enabled by default at @option{-O} and higher.
> -
> -@item -ftree-coalesce-inlined-vars
> -@opindex ftree-coalesce-inlined-vars
> -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
> -combine small user-defined variables too, but only if they are inlined
> -from other functions. It is a more limited form of
> -@option{-ftree-coalesce-vars}. This may harm debug information of such
> -inlined variables, but it keeps variables of the inlined-into
> -function apart from each other, such that they are more likely to
> -contain the expected values in a debugging session.
> -
> -@item -ftree-coalesce-vars
> -@opindex ftree-coalesce-vars
> -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
> -combine small user-defined variables too, instead of just compiler
> -temporaries. This may severely limit the ability to debug an optimized
> -program compiled with @option{-fno-var-tracking-assignments}. In the
> -negated form, this flag prevents SSA coalescing of user variables,
> -including inlined ones. This option is enabled by default.
> -
> @item -ftree-ter
> @opindex ftree-ter
> Perform temporary expression replacement during the SSA->normal phase. Single
> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
> index 49a1509..2b98946 100644
> --- a/gcc/emit-rtl.c
> +++ b/gcc/emit-rtl.c
> @@ -1249,6 +1249,9 @@ set_reg_attrs_for_parm (rtx parm_rtx, rtx mem)
> void
> set_reg_attrs_for_decl_rtl (tree t, rtx x)
> {
> + if (!t)
> + return;
> + tree tdecl = t;
> if (GET_CODE (x) == SUBREG)
> {
> gcc_assert (subreg_lowpart_p (x));
> @@ -1257,7 +1260,7 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x)
> if (REG_P (x))
> REG_ATTRS (x)
> = get_reg_attrs (t, byte_lowpart_offset (GET_MODE (x),
> - DECL_MODE (t)));
> + DECL_MODE (tdecl)));
> if (GET_CODE (x) == CONCAT)
> {
> if (REG_P (XEXP (x, 0)))
> diff --git a/gcc/explow.c b/gcc/explow.c
> index 8745aea..5b0d49c 100644
> --- a/gcc/explow.c
> +++ b/gcc/explow.c
> @@ -856,6 +856,28 @@ promote_decl_mode (const_tree decl, int *punsignedp)
> return pmode;
> }
>
> +/* Return the promoted mode for NAME. If it is a named SSA_NAME, it
> + is the same as promote_decl_mode. Otherwise, it is the promoted
> + mode of a temp decl of same type as the SSA_NAME, if we had created
> + one. */
> +
> +machine_mode
> +promote_ssa_mode (const_tree name, int *punsignedp)
> +{
> + gcc_assert (TREE_CODE (name) == SSA_NAME);
> +
> + tree type = TREE_TYPE (name);
> + int unsignedp = TYPE_UNSIGNED (type);
> + machine_mode mode = TYPE_MODE (type);
> +
> + machine_mode pmode = promote_mode (type, mode, &unsignedp);
> + if (punsignedp)
> + *punsignedp = unsignedp;
> +
> + return pmode;
> +}
> +
> +
>
> /* Controls the behaviour of {anti_,}adjust_stack. */
> static bool suppress_reg_args_size;
> diff --git a/gcc/explow.h b/gcc/explow.h
> index 94613de..52113db 100644
> --- a/gcc/explow.h
> +++ b/gcc/explow.h
> @@ -57,6 +57,9 @@ extern machine_mode promote_mode (const_tree, machine_mode, int *);
> /* Return mode and signedness to use when object is promoted. */
> machine_mode promote_decl_mode (const_tree, int *);
>
> +/* Return mode and signedness to use when an SSA_NAME is promoted. */
> +machine_mode promote_ssa_mode (const_tree, int *);
> +
> /* Remove some bytes from the stack. An rtx says how many. */
> extern void adjust_stack (rtx);
>
> diff --git a/gcc/expr.c b/gcc/expr.c
> index 5a931dc..5b6e16e 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -9301,7 +9301,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
> rtx op0, op1, temp, decl_rtl;
> tree type;
> int unsignedp;
> - machine_mode mode;
> + machine_mode mode, dmode;
> enum tree_code code = TREE_CODE (exp);
> rtx subtarget, original_target;
> int ignore;
> @@ -9432,7 +9432,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
> if (g == NULL
> && modifier == EXPAND_INITIALIZER
> && !SSA_NAME_IS_DEFAULT_DEF (exp)
> - && (optimize || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
> + && (optimize || !SSA_NAME_VAR (exp)
> + || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
> && stmt_is_replaceable_p (SSA_NAME_DEF_STMT (exp)))
> g = SSA_NAME_DEF_STMT (exp);
> if (g)
> @@ -9511,15 +9512,18 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
> /* Ensure variable marked as used even if it doesn't go through
> a parser. If it hasn't be used yet, write out an external
> definition. */
> - TREE_USED (exp) = 1;
> + if (exp)
> + TREE_USED (exp) = 1;
>
> /* Show we haven't gotten RTL for this yet. */
> temp = 0;
>
> /* Variables inherited from containing functions should have
> been lowered by this point. */
> - context = decl_function_context (exp);
> - gcc_assert (SCOPE_FILE_SCOPE_P (context)
> + if (exp)
> + context = decl_function_context (exp);
> + gcc_assert (!exp
> + || SCOPE_FILE_SCOPE_P (context)
> || context == current_function_decl
> || TREE_STATIC (exp)
> || DECL_EXTERNAL (exp)
> @@ -9543,7 +9547,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
> decl_rtl = use_anchored_address (decl_rtl);
> if (modifier != EXPAND_CONST_ADDRESS
> && modifier != EXPAND_SUM
> - && !memory_address_addr_space_p (DECL_MODE (exp),
> + && !memory_address_addr_space_p (exp ? DECL_MODE (exp)
> + : GET_MODE (decl_rtl),
> XEXP (decl_rtl, 0),
> MEM_ADDR_SPACE (decl_rtl)))
> temp = replace_equiv_address (decl_rtl,
> @@ -9554,12 +9559,17 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
> if the address is a register. */
> if (temp != 0)
> {
> - if (MEM_P (temp) && REG_P (XEXP (temp, 0)))
> + if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0)))
> mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp));
>
> return temp;
> }
>
> + if (exp)
> + dmode = DECL_MODE (exp);
> + else
> + dmode = TYPE_MODE (TREE_TYPE (ssa_name));
> +
> /* If the mode of DECL_RTL does not match that of the decl,
> there are two cases: we are dealing with a BLKmode value
> that is returned in a register, or we are dealing with
> @@ -9567,22 +9577,23 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
> of the wanted mode, but mark it so that we know that it
> was already extended. */
> if (REG_P (decl_rtl)
> - && DECL_MODE (exp) != BLKmode
> - && GET_MODE (decl_rtl) != DECL_MODE (exp))
> + && dmode != BLKmode
> + && GET_MODE (decl_rtl) != dmode)
> {
> machine_mode pmode;
>
> /* Get the signedness to be used for this variable. Ensure we get
> the same mode we got when the variable was declared. */
> - if (code == SSA_NAME
> - && (g = SSA_NAME_DEF_STMT (ssa_name))
> - && gimple_code (g) == GIMPLE_CALL
> - && !gimple_call_internal_p (g))
> + if (code != SSA_NAME)
> + pmode = promote_decl_mode (exp, &unsignedp);
> + else if ((g = SSA_NAME_DEF_STMT (ssa_name))
> + && gimple_code (g) == GIMPLE_CALL
> + && !gimple_call_internal_p (g))
> pmode = promote_function_mode (type, mode, &unsignedp,
> gimple_call_fntype (g),
> 2);
> else
> - pmode = promote_decl_mode (exp, &unsignedp);
> + pmode = promote_ssa_mode (ssa_name, &unsignedp);
> gcc_assert (GET_MODE (decl_rtl) == pmode);
>
> temp = gen_lowpart_SUBREG (mode, decl_rtl);
> diff --git a/gcc/function.c b/gcc/function.c
> index 7d2d7e4..58e2498 100644
> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -87,6 +87,7 @@ along with GCC; see the file COPYING3. If not see
> #include "cfganal.h"
> #include "cfgbuild.h"
> #include "cfgcleanup.h"
> +#include "cfgexpand.h"
> #include "basic-block.h"
> #include "df.h"
> #include "params.h"
> @@ -2121,6 +2122,30 @@ aggregate_value_p (const_tree exp, const_tree fntype)
> bool
> use_register_for_decl (const_tree decl)
> {
> + if (TREE_CODE (decl) == SSA_NAME)
> + {
> + /* We often try to use the SSA_NAME, instead of its underlying
> + decl, to get type information and guide decisions, to avoid
> + differences of behavior between anonymous and named
> + variables, but in this one case we have to go for the actual
> + variable if there is one. The main reason is that, at least
> + at -O0, we want to place user variables on the stack, but we
> + don't mind using pseudos for anonymous or ignored temps.
> + Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
> + should go in pseudos, whereas their corresponding variables
> + might have to go on the stack. So, disregarding the decl
> + here would negatively impact debug info at -O0, enable
> + coalescing between SSA_NAMEs that ought to get different
> + stack/pseudo assignments, and get the incoming argument
> + processing thoroughly confused by PARM_DECLs expected to live
> + in stack slots but assigned to pseudos. */
> + if (!SSA_NAME_VAR (decl))
> + return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
> + && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
> +
> + decl = SSA_NAME_VAR (decl);
> + }
> +
> if (!targetm.calls.allocate_stack_slots_for_args ())
> return true;
>
> @@ -2804,23 +2829,88 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data)
> data->entry_parm = entry_parm;
> }
>
> +/* Wrapper for use_register_for_decl, that special-cases the
> + .result_ptr as the function's RESULT_DECL when the RESULT_DECL is
> + passed by reference. */
> +
> +static bool
> +use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm)
> +{
> + if (parm == all->function_result_decl)
> + {
> + tree result = DECL_RESULT (current_function_decl);
> +
> + if (DECL_BY_REFERENCE (result))
> + parm = result;
> + }
> +
> + return use_register_for_decl (parm);
> +}
> +
> +/* Wrapper for get_rtl_for_parm_ssa_default_def, that special-cases
> + the .result_ptr as the function's RESULT_DECL when the RESULT_DECL
> + is passed by reference. */
> +
> +static rtx
> +rtl_for_parm (struct assign_parm_data_all *all, tree parm)
> +{
> + if (parm == all->function_result_decl)
> + {
> + tree result = DECL_RESULT (current_function_decl);
> +
> + if (!DECL_BY_REFERENCE (result))
> + return NULL_RTX;
> +
> + parm = result;
> + }
> +
> + return get_rtl_for_parm_ssa_default_def (parm);
> +}
> +
> +/* Reset the location of PARM_DECLs and RESULT_DECLs that had
> + SSA_NAMEs in multiple partitions, so that assign_parms will choose
> + the default def, if it exists, or create new RTL to hold the unused
> + entry value. If we are coalescing across variables, we want to
> + reset the location too, because a parm without a default def
> + (incoming value unused) might be coalesced with one with a default
> + def, and then assign_parms would copy both incoming values to the
> + same location, which might cause the wrong value to survive. */
> +static void
> +maybe_reset_rtl_for_parm (tree parm)
> +{
> + gcc_assert (TREE_CODE (parm) == PARM_DECL
> + || TREE_CODE (parm) == RESULT_DECL);
> + if ((flag_tree_coalesce_vars
> + || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx))
> + && is_gimple_reg (parm))
> + SET_DECL_RTL (parm, NULL_RTX);
> +}
> +
> /* A subroutine of assign_parms. Adjust DATA->STACK_RTL such that it's
> always valid and properly aligned. */
>
> static void
> -assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data)
> +assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm,
> + struct assign_parm_data_one *data)
> {
> rtx stack_parm = data->stack_parm;
>
> + /* If out-of-SSA assigned RTL to the parm default def, make sure we
> + don't use what we might have computed before. */
> + rtx ssa_assigned = rtl_for_parm (all, parm);
> + if (ssa_assigned)
> + stack_parm = NULL;
> +
> /* If we can't trust the parm stack slot to be aligned enough for its
> ultimate type, don't use that slot after entry. We'll make another
> stack slot, if we need one. */
> - if (stack_parm
> - && ((STRICT_ALIGNMENT
> - && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm))
> - || (data->nominal_type
> - && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
> - && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
> + else if (stack_parm
> + && ((STRICT_ALIGNMENT
> + && (GET_MODE_ALIGNMENT (data->nominal_mode)
> + > MEM_ALIGN (stack_parm)))
> + || (data->nominal_type
> + && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
> + && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
> stack_parm = NULL;
>
> /* If parm was passed in memory, and we need to convert it on entry,
> @@ -2882,11 +2972,16 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>
> size = int_size_in_bytes (data->passed_type);
> size_stored = CEIL_ROUND (size, UNITS_PER_WORD);
> +
> if (stack_parm == 0)
> {
> DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
> - stack_parm = assign_stack_local (BLKmode, size_stored,
> - DECL_ALIGN (parm));
> + stack_parm = rtl_for_parm (all, parm);
> + if (!stack_parm)
> + stack_parm = assign_stack_local (BLKmode, size_stored,
> + DECL_ALIGN (parm));
> + else
> + stack_parm = copy_rtx (stack_parm);
> if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
> PUT_MODE (stack_parm, GET_MODE (entry_parm));
> set_mem_attributes (stack_parm, parm, 1);
> @@ -3027,10 +3122,19 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
> = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
> TREE_TYPE (current_function_decl), 2);
>
> - parmreg = gen_reg_rtx (promoted_nominal_mode);
> + rtx from_expand = rtl_for_parm (all, parm);
>
> - if (!DECL_ARTIFICIAL (parm))
> - mark_user_reg (parmreg);
> + if (from_expand && !data->passed_pointer)
> + {
> + parmreg = from_expand;
> + gcc_assert (GET_MODE (parmreg) == promoted_nominal_mode);
> + }
> + else
> + {
> + parmreg = gen_reg_rtx (promoted_nominal_mode);
> + if (!DECL_ARTIFICIAL (parm))
> + mark_user_reg (parmreg);
> + }
>
> /* If this was an item that we received a pointer to,
> set DECL_RTL appropriately. */
> @@ -3049,6 +3153,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
> assign_parm_find_data_types and expand_expr_real_1. */
>
> equiv_stack_parm = data->stack_parm;
> + if (!equiv_stack_parm)
> + equiv_stack_parm = data->entry_parm;
> validated_mem = validize_mem (copy_rtx (data->entry_parm));
>
> need_conversion = (data->nominal_mode != data->passed_mode
> @@ -3189,11 +3295,17 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>
> /* If we were passed a pointer but the actual value can safely live
> in a register, retrieve it and use it directly. */
> - if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode)
> + if (data->passed_pointer
> + && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode))
> {
> /* We can't use nominal_mode, because it will have been set to
> Pmode above. We must use the actual mode of the parm. */
> - if (use_register_for_decl (parm))
> + if (from_expand)
> + {
> + parmreg = from_expand;
> + gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm)));
> + }
> + else if (use_register_for_decl (parm))
> {
> parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm)));
> mark_user_reg (parmreg);
> @@ -3233,7 +3345,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>
> /* STACK_PARM is the pointer, not the parm, and PARMREG is
> now the parm. */
> - data->stack_parm = NULL;
> + data->stack_parm = equiv_stack_parm = NULL;
> }
>
> /* Mark the register as eliminable if we did no conversion and it was
> @@ -3243,11 +3355,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
> make here would screw up life analysis for it. */
> if (data->nominal_mode == data->passed_mode
> && !did_conversion
> - && data->stack_parm != 0
> - && MEM_P (data->stack_parm)
> + && equiv_stack_parm != 0
> + && MEM_P (equiv_stack_parm)
> && data->locate.offset.var == 0
> && reg_mentioned_p (virtual_incoming_args_rtx,
> - XEXP (data->stack_parm, 0)))
> + XEXP (equiv_stack_parm, 0)))
> {
> rtx_insn *linsn = get_last_insn ();
> rtx_insn *sinsn;
> @@ -3260,8 +3372,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
> = GET_MODE_INNER (GET_MODE (parmreg));
> int regnor = REGNO (XEXP (parmreg, 0));
> int regnoi = REGNO (XEXP (parmreg, 1));
> - rtx stackr = adjust_address_nv (data->stack_parm, submode, 0);
> - rtx stacki = adjust_address_nv (data->stack_parm, submode,
> + rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0);
> + rtx stacki = adjust_address_nv (equiv_stack_parm, submode,
> GET_MODE_SIZE (submode));
>
> /* Scan backwards for the set of the real and
> @@ -3334,6 +3446,13 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
>
> if (data->stack_parm == 0)
> {
> + rtx x = data->stack_parm = rtl_for_parm (all, parm);
> + if (x)
> + gcc_assert (GET_MODE (x) == GET_MODE (data->entry_parm));
> + }
> +
> + if (data->stack_parm == 0)
> + {
> int align = STACK_SLOT_ALIGNMENT (data->passed_type,
> GET_MODE (data->entry_parm),
> TYPE_ALIGN (data->passed_type));
> @@ -3592,6 +3711,8 @@ assign_parms (tree fndecl)
> DECL_INCOMING_RTL (parm) = DECL_RTL (parm);
> continue;
> }
> + else
> + maybe_reset_rtl_for_parm (parm);
>
> /* Estimate stack alignment from parameter alignment. */
> if (SUPPORTS_STACK_ALIGNMENT)
> @@ -3641,7 +3762,9 @@ assign_parms (tree fndecl)
> else
> set_decl_incoming_rtl (parm, data.entry_parm, false);
>
> - /* Boudns should be loaded in the particular order to
> + assign_parm_adjust_stack_rtl (&all, parm, &data);
> +
> + /* Bounds should be loaded in the particular order to
> have registers allocated correctly. Collect info about
> input bounds and load them later. */
> if (POINTER_BOUNDS_TYPE_P (data.passed_type))
> @@ -3658,11 +3781,10 @@ assign_parms (tree fndecl)
> }
> else
> {
> - assign_parm_adjust_stack_rtl (&data);
> -
> if (assign_parm_setup_block_p (&data))
> assign_parm_setup_block (&all, parm, &data);
> - else if (data.passed_pointer || use_register_for_decl (parm))
> + else if (data.passed_pointer
> + || use_register_for_parm_decl (&all, parm))
> assign_parm_setup_reg (&all, parm, &data);
> else
> assign_parm_setup_stack (&all, parm, &data);
> @@ -5004,7 +5126,9 @@ expand_function_start (tree subr)
> before any library calls that assign parms might generate. */
>
> /* Decide whether to return the value in memory or in a register. */
> - if (aggregate_value_p (DECL_RESULT (subr), subr))
> + tree res = DECL_RESULT (subr);
> + maybe_reset_rtl_for_parm (res);
> + if (aggregate_value_p (res, subr))
> {
> /* Returning something that won't go in a register. */
> rtx value_address = 0;
> @@ -5012,7 +5136,7 @@ expand_function_start (tree subr)
> #ifdef PCC_STATIC_STRUCT_RETURN
> if (cfun->returns_pcc_struct)
> {
> - int size = int_size_in_bytes (TREE_TYPE (DECL_RESULT (subr)));
> + int size = int_size_in_bytes (TREE_TYPE (res));
> value_address = assemble_static_space (size);
> }
> else
> @@ -5024,36 +5148,45 @@ expand_function_start (tree subr)
> it. */
> if (sv)
> {
> - value_address = gen_reg_rtx (Pmode);
> + if (DECL_BY_REFERENCE (res))
> + value_address = get_rtl_for_parm_ssa_default_def (res);
> + if (!value_address)
> + value_address = gen_reg_rtx (Pmode);
> emit_move_insn (value_address, sv);
> }
> }
> if (value_address)
> {
> rtx x = value_address;
> - if (!DECL_BY_REFERENCE (DECL_RESULT (subr)))
> + if (!DECL_BY_REFERENCE (res))
> {
> - x = gen_rtx_MEM (DECL_MODE (DECL_RESULT (subr)), x);
> - set_mem_attributes (x, DECL_RESULT (subr), 1);
> + x = get_rtl_for_parm_ssa_default_def (res);
> + if (!x)
> + {
> + x = gen_rtx_MEM (DECL_MODE (res), value_address);
> + set_mem_attributes (x, res, 1);
> + }
> }
> - SET_DECL_RTL (DECL_RESULT (subr), x);
> + SET_DECL_RTL (res, x);
> }
> }
> - else if (DECL_MODE (DECL_RESULT (subr)) == VOIDmode)
> + else if (DECL_MODE (res) == VOIDmode)
> /* If return mode is void, this decl rtl should not be used. */
> - SET_DECL_RTL (DECL_RESULT (subr), NULL_RTX);
> + SET_DECL_RTL (res, NULL_RTX);
> else
> {
> /* Compute the return values into a pseudo reg, which we will copy
> into the true return register after the cleanups are done. */
> - tree return_type = TREE_TYPE (DECL_RESULT (subr));
> - if (TYPE_MODE (return_type) != BLKmode
> - && targetm.calls.return_in_msb (return_type))
> + tree return_type = TREE_TYPE (res);
> + rtx x = get_rtl_for_parm_ssa_default_def (res);
> + if (x)
> + /* Use it. */;
> + else if (TYPE_MODE (return_type) != BLKmode
> + && targetm.calls.return_in_msb (return_type))
> /* expand_function_end will insert the appropriate padding in
> this case. Use the return value's natural (unpadded) mode
> within the function proper. */
> - SET_DECL_RTL (DECL_RESULT (subr),
> - gen_reg_rtx (TYPE_MODE (return_type)));
> + x = gen_reg_rtx (TYPE_MODE (return_type));
> else
> {
> /* In order to figure out what mode to use for the pseudo, we
> @@ -5064,25 +5197,26 @@ expand_function_start (tree subr)
> /* Structures that are returned in registers are not
> aggregate_value_p, so we may see a PARALLEL or a REG. */
> if (REG_P (hard_reg))
> - SET_DECL_RTL (DECL_RESULT (subr),
> - gen_reg_rtx (GET_MODE (hard_reg)));
> + x = gen_reg_rtx (GET_MODE (hard_reg));
> else
> {
> gcc_assert (GET_CODE (hard_reg) == PARALLEL);
> - SET_DECL_RTL (DECL_RESULT (subr), gen_group_rtx (hard_reg));
> + x = gen_group_rtx (hard_reg);
> }
> }
>
> + SET_DECL_RTL (res, x);
> +
> /* Set DECL_REGISTER flag so that expand_function_end will copy the
> result to the real return register(s). */
> - DECL_REGISTER (DECL_RESULT (subr)) = 1;
> + DECL_REGISTER (res) = 1;
>
> if (chkp_function_instrumented_p (current_function_decl))
> {
> - tree return_type = TREE_TYPE (DECL_RESULT (subr));
> + tree return_type = TREE_TYPE (res);
> rtx bounds = targetm.calls.chkp_function_value_bounds (return_type,
> subr, 1);
> - SET_DECL_BOUNDS_RTL (DECL_RESULT (subr), bounds);
> + SET_DECL_BOUNDS_RTL (res, bounds);
> }
> }
>
> @@ -5097,7 +5231,9 @@ expand_function_start (tree subr)
> rtx local, chain;
> rtx_insn *insn;
>
> - local = gen_reg_rtx (Pmode);
> + local = get_rtl_for_parm_ssa_default_def (parm);
> + if (!local)
> + local = gen_reg_rtx (Pmode);
> chain = targetm.calls.static_chain (current_function_decl, true);
>
> set_decl_incoming_rtl (parm, chain, false);
> diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
> index 4d683d6..d3d1c5f 100644
> --- a/gcc/gimple-expr.c
> +++ b/gcc/gimple-expr.c
> @@ -391,45 +391,6 @@ copy_var_decl (tree var, tree name, tree type)
> return copy;
> }
>
> -/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
> - coalescing together, false otherwise.
> -
> - This must stay consistent with var_map_base_init in tree-ssa-live.c. */
> -
> -bool
> -gimple_can_coalesce_p (tree name1, tree name2)
> -{
> - /* First check the SSA_NAME's associated DECL. We only want to
> - coalesce if they have the same DECL or both have no associated DECL. */
> - tree var1 = SSA_NAME_VAR (name1);
> - tree var2 = SSA_NAME_VAR (name2);
> - var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
> - var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
> - if (var1 != var2)
> - return false;
> -
> - /* Now check the types. If the types are the same, then we should
> - try to coalesce V1 and V2. */
> - tree t1 = TREE_TYPE (name1);
> - tree t2 = TREE_TYPE (name2);
> - if (t1 == t2)
> - return true;
> -
> - /* If the types are not the same, check for a canonical type match. This
> - (for example) allows coalescing when the types are fundamentally the
> - same, but just have different names.
> -
> - Note pointer types with different address spaces may have the same
> - canonical type. Those are rejected for coalescing by the
> - types_compatible_p check. */
> - if (TYPE_CANONICAL (t1)
> - && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
> - && types_compatible_p (t1, t2))
> - return true;
> -
> - return false;
> -}
> -
> /* Strip off a legitimate source ending from the input string NAME of
> length LEN. Rather than having to know the names used by all of
> our front ends, we strip off an ending of a period followed by
> diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
> index ed23eb2..3d1c89f 100644
> --- a/gcc/gimple-expr.h
> +++ b/gcc/gimple-expr.h
> @@ -28,7 +28,6 @@ extern gimple_seq gimple_body (tree);
> extern bool gimple_has_body_p (tree);
> extern const char *gimple_decl_printable_name (tree, int);
> extern tree copy_var_decl (tree, tree, tree);
> -extern bool gimple_can_coalesce_p (tree, tree);
> extern tree create_tmp_var_name (const char *);
> extern tree create_tmp_var_raw (tree, const char * = NULL);
> extern tree create_tmp_var (tree, const char * = NULL);
> diff --git a/gcc/opts.c b/gcc/opts.c
> index 9793999..5305299 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -448,12 +448,12 @@ static const struct default_options default_options_table[] =
> { OPT_LEVELS_1_PLUS, OPT_fsplit_wide_types, NULL, 1 },
> { OPT_LEVELS_1_PLUS, OPT_ftree_ccp, NULL, 1 },
> { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_bit_ccp, NULL, 1 },
> + { OPT_LEVELS_1_PLUS, OPT_ftree_coalesce_vars, NULL, 1 },
> { OPT_LEVELS_1_PLUS, OPT_ftree_dce, NULL, 1 },
> { OPT_LEVELS_1_PLUS, OPT_ftree_dominator_opts, NULL, 1 },
> { OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 },
> { OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 },
> { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 },
> - { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 },
> { OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 },
> { OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 },
> { OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 },
> diff --git a/gcc/passes.def b/gcc/passes.def
> index 4690e23..230e089 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -77,7 +77,6 @@ along with GCC; see the file COPYING3. If not see
> NEXT_PASS (pass_all_early_optimizations);
> PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
> NEXT_PASS (pass_remove_cgraph_callee_edges);
> - NEXT_PASS (pass_rename_ssa_copies);
> NEXT_PASS (pass_object_sizes);
> NEXT_PASS (pass_ccp);
> /* After CCP we rewrite no longer addressed locals into SSA
> @@ -155,7 +154,6 @@ along with GCC; see the file COPYING3. If not see
> /* Initial scalar cleanups before alias computation.
> They ensure memory accesses are not indirect wherever possible. */
> NEXT_PASS (pass_strip_predict_hints);
> - NEXT_PASS (pass_rename_ssa_copies);
> NEXT_PASS (pass_ccp);
> /* After CCP we rewrite no longer addressed locals into SSA
> form if possible. */
> @@ -183,7 +181,6 @@ along with GCC; see the file COPYING3. If not see
> NEXT_PASS (pass_ch);
> NEXT_PASS (pass_lower_complex);
> NEXT_PASS (pass_sra);
> - NEXT_PASS (pass_rename_ssa_copies);
> /* The dom pass will also resolve all __builtin_constant_p calls
> that are still there to 0. This has to be done after some
> propagations have already run, but before some more dead code
> @@ -291,7 +288,6 @@ along with GCC; see the file COPYING3. If not see
> NEXT_PASS (pass_fold_builtins);
> NEXT_PASS (pass_optimize_widening_mul);
> NEXT_PASS (pass_tail_calls);
> - NEXT_PASS (pass_rename_ssa_copies);
> /* FIXME: If DCE is not run before checking for uninitialized uses,
> we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
> However, this also causes us to misdiagnose cases that should be
> @@ -326,7 +322,6 @@ along with GCC; see the file COPYING3. If not see
> NEXT_PASS (pass_dce);
> NEXT_PASS (pass_asan);
> NEXT_PASS (pass_tsan);
> - NEXT_PASS (pass_rename_ssa_copies);
> /* ??? We do want some kind of loop invariant motion, but we possibly
> need to adjust LIM to be more friendly towards preserving accurate
> debug information here. */
> diff --git a/gcc/testsuite/gcc.dg/guality/pr54200.c b/gcc/testsuite/gcc.dg/guality/pr54200.c
> index 9b17187..e1e7293 100644
> --- a/gcc/testsuite/gcc.dg/guality/pr54200.c
> +++ b/gcc/testsuite/gcc.dg/guality/pr54200.c
> @@ -1,6 +1,6 @@
> /* PR tree-optimization/54200 */
> /* { dg-do run } */
> -/* { dg-options "-g -fno-var-tracking-assignments" } */
> +/* { dg-options "-g -fno-var-tracking-assignments -fno-tree-coalesce-vars" } */
>
> int o __attribute__((used));
>
> diff --git a/gcc/testsuite/gcc.dg/ssp-1.c b/gcc/testsuite/gcc.dg/ssp-1.c
> index 5467f4d..db69332 100644
> --- a/gcc/testsuite/gcc.dg/ssp-1.c
> +++ b/gcc/testsuite/gcc.dg/ssp-1.c
> @@ -12,7 +12,7 @@ __stack_chk_fail (void)
>
> int main ()
> {
> - int i;
> + register int i;
> char foo[255];
>
> // smash stack
> diff --git a/gcc/testsuite/gcc.dg/ssp-2.c b/gcc/testsuite/gcc.dg/ssp-2.c
> index 9a7ac32..752fe53 100644
> --- a/gcc/testsuite/gcc.dg/ssp-2.c
> +++ b/gcc/testsuite/gcc.dg/ssp-2.c
> @@ -14,7 +14,7 @@ __stack_chk_fail (void)
> void
> overflow()
> {
> - int i = 0;
> + register int i = 0;
> char foo[30];
>
> /* Overflow buffer. */
> diff --git a/gcc/testsuite/gcc.dg/torture/parm-coalesce.c b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
> new file mode 100644
> index 0000000..dbd81c1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
> @@ -0,0 +1,40 @@
> +/* { dg-do run } */
> +
> +#include <stdlib.h>
> +
> +/* Make sure we don't coalesce both incoming parms, one whose incoming
> + value is unused, to the same location, so as to overwrite one of
> + them with the incoming value of the other. */
> +
> +int __attribute__((noinline, noclone))
> +foo (int i, int j)
> +{
> + j = i; /* The incoming value for J is unused. */
> + i = 2;
> + if (j)
> + j++;
> + j += i + 1;
> + return j;
> +}
> +
> +/* Same as foo, but with swapped parameters. */
> +int __attribute__((noinline, noclone))
> +bar (int j, int i)
> +{
> + j = i; /* The incoming value for J is unused. */
> + i = 2;
> + if (j)
> + j++;
> + j += i + 1;
> + return j;
> +}
> +
> +int
> +main (void)
> +{
> + if (foo (0, 1) != 3)
> + abort ();
> + if (bar (1, 0) != 3)
> + abort ();
> + return 0;
> +}
> diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
> index e23bc0b..59d91c6 100644
> --- a/gcc/tree-outof-ssa.c
> +++ b/gcc/tree-outof-ssa.c
> @@ -305,7 +305,6 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
> rtx dest_rtx, seq, x;
> machine_mode dest_mode, src_mode;
> int unsignedp;
> - tree var;
>
> if (dump_file && (dump_flags & TDF_DETAILS))
> {
> @@ -327,12 +326,12 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
>
> start_sequence ();
>
> - var = SSA_NAME_VAR (partition_to_var (SA.map, dest));
> + tree name = partition_to_var (SA.map, dest);
> src_mode = TYPE_MODE (TREE_TYPE (src));
> dest_mode = GET_MODE (dest_rtx);
> - gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var)));
> + gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (name)));
> gcc_assert (!REG_P (dest_rtx)
> - || dest_mode == promote_decl_mode (var, &unsignedp));
> + || dest_mode == promote_ssa_mode (name, &unsignedp));
>
> if (src_mode != dest_mode)
> {
> @@ -708,13 +707,12 @@ elim_backward (elim_graph g, int T)
> static rtx
> get_temp_reg (tree name)
> {
> - tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
> - tree type = TREE_TYPE (var);
> + tree type = TREE_TYPE (name);
> int unsignedp;
> - machine_mode reg_mode = promote_decl_mode (var, &unsignedp);
> + machine_mode reg_mode = promote_ssa_mode (name, &unsignedp);
> rtx x = gen_reg_rtx (reg_mode);
> if (POINTER_TYPE_P (type))
> - mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var))));
> + mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type)));
> return x;
> }
>
> @@ -1014,7 +1012,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
>
> /* Return to viewing the variable list as just all reference variables after
> coalescing has been performed. */
> - partition_view_normal (map, false);
> + partition_view_normal (map);
>
> if (dump_file && (dump_flags & TDF_DETAILS))
> {
> diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
> index b05a860..9ffa3f1 100644
> --- a/gcc/tree-ssa-coalesce.c
> +++ b/gcc/tree-ssa-coalesce.c
> @@ -58,6 +58,7 @@ along with GCC; see the file COPYING3. If not see
> #include "tree-ssanames.h"
> #include "tree-ssa-live.h"
> #include "tree-ssa-coalesce.h"
> +#include "explow.h"
> #include "diagnostic-core.h"
>
>
> @@ -830,6 +831,16 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
> basic_block bb;
> ssa_op_iter iter;
> live_track_p live;
> + basic_block entry;
> +
> + /* If inter-variable coalescing is enabled, we may attempt to
> + coalesce variables from different base variables, including
> + different parameters, so we have to make sure default defs live
> + at the entry block conflict with each other. */
> + if (flag_tree_coalesce_vars)
> + entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
> + else
> + entry = NULL;
>
> map = live_var_map (liveinfo);
> graph = ssa_conflicts_new (num_var_partitions (map));
> @@ -888,6 +899,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
> live_track_process_def (live, result, graph);
> }
>
> + /* Pretend there are defs for params' default defs at the start
> + of the (post-)entry block. */
> + if (bb == entry)
> + {
> + unsigned base;
> + bitmap_iterator bi;
> + EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi)
> + {
> + bitmap_iterator bi2;
> + unsigned part;
> + EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base],
> + 0, part, bi2)
> + {
> + tree var = partition_to_var (map, part);
> + if (!SSA_NAME_VAR (var)
> + || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL
> + && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL)
> + || !SSA_NAME_IS_DEFAULT_DEF (var))
> + continue;
> + live_track_process_def (live, var, graph);
> + }
> + }
> + }
> +
> live_track_clear_base_vars (live);
> }
>
> @@ -1156,6 +1191,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
> {
> var1 = partition_to_var (map, p1);
> var2 = partition_to_var (map, p2);
> +
> z = var_union (map, var1, var2);
> if (z == NO_PARTITION)
> {
> @@ -1173,6 +1209,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
>
> if (debug)
> fprintf (debug, ": Success -> %d\n", z);
> +
> return true;
> }
>
> @@ -1270,6 +1307,330 @@ ssa_name_var_hash::equal (const tree_node *n1, const tree_node *n2)
> }
>
>
> +/* Output partition map MAP with coalescing plan PART to file F. */
> +
> +void
> +dump_part_var_map (FILE *f, partition part, var_map map)
> +{
> + int t;
> + unsigned x, y;
> + int p;
> +
> + fprintf (f, "\nCoalescible Partition map \n\n");
> +
> + for (x = 0; x < map->num_partitions; x++)
> + {
> + if (map->view_to_partition != NULL)
> + p = map->view_to_partition[x];
> + else
> + p = x;
> +
> + if (ssa_name (p) == NULL_TREE
> + || virtual_operand_p (ssa_name (p)))
> + continue;
> +
> + t = 0;
> + for (y = 1; y < num_ssa_names; y++)
> + {
> + tree var = version_to_var (map, y);
> + if (!var)
> + continue;
> + int q = var_to_partition (map, var);
> + p = partition_find (part, q);
> + gcc_assert (map->partition_to_base_index[q]
> + == map->partition_to_base_index[p]);
> +
> + if (p == (int)x)
> + {
> + if (t++ == 0)
> + {
> + fprintf (f, "Partition %d, base %d (", x,
> + map->partition_to_base_index[q]);
> + print_generic_expr (f, partition_to_var (map, q), TDF_SLIM);
> + fprintf (f, " - ");
> + }
> + fprintf (f, "%d ", y);
> + }
> + }
> + if (t != 0)
> + fprintf (f, ")\n");
> + }
> + fprintf (f, "\n");
> +}
> +
> +/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
> + coalescing together, false otherwise.
> +
> + This must stay consistent with var_map_base_init in tree-ssa-live.c. */
> +
> +bool
> +gimple_can_coalesce_p (tree name1, tree name2)
> +{
> + /* First check the SSA_NAME's associated DECL. Without
> + optimization, we only want to coalesce if they have the same DECL
> + or both have no associated DECL. */
> + tree var1 = SSA_NAME_VAR (name1);
> + tree var2 = SSA_NAME_VAR (name2);
> + var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
> + var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
> + if (var1 != var2 && !flag_tree_coalesce_vars)
> + return false;
> +
> + /* Now check the types. If the types are the same, then we should
> + try to coalesce V1 and V2. */
> + tree t1 = TREE_TYPE (name1);
> + tree t2 = TREE_TYPE (name2);
> + if (t1 == t2)
> + {
> + check_modes:
> + /* If the base variables are the same, we're good: none of the
> + other tests below could possibly fail. */
> + var1 = SSA_NAME_VAR (name1);
> + var2 = SSA_NAME_VAR (name2);
> + if (var1 == var2)
> + return true;
> +
> + /* We don't want to coalesce two SSA names if one of the base
> + variables is supposed to be a register while the other is
> + supposed to be on the stack. Anonymous SSA names take
> + registers, but when not optimizing, user variables should go
> + on the stack, so coalescing them with the anonymous variable
> + as the partition leader would end up assigning the user
> + variable to a register. Don't do that! */
> + bool reg1 = !var1 || use_register_for_decl (var1);
> + bool reg2 = !var2 || use_register_for_decl (var2);
> + if (reg1 != reg2)
> + return false;
> +
> + /* Check that the promoted modes are the same. We don't want to
> + coalesce if the promoted modes would be different. Only
> + PARM_DECLs and RESULT_DECLs have different promotion rules,
> + so skip the test if both are variables or anonymous
> + SSA_NAMEs. */
> + return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
> + || promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL);
> + }
> +
> + /* If the types are not the same, check for a canonical type match. This
> + (for example) allows coalescing when the types are fundamentally the
> + same, but just have different names.
> +
> + Note pointer types with different address spaces may have the same
> + canonical type. Those are rejected for coalescing by the
> + types_compatible_p check. */
> + if (TYPE_CANONICAL (t1)
> + && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
> + && types_compatible_p (t1, t2))
> + goto check_modes;
> +
> + return false;
> +}
> +
> +/* Fill in MAP's partition_to_base_index, with one index for each
> + partition of SSA names USED_IN_COPIES and related by CL coalesce
> + possibilities. This must match gimple_can_coalesce_p in the
> + optimized case. */
> +
> +static void
> +compute_optimized_partition_bases (var_map map, bitmap used_in_copies,
> + coalesce_list_p cl)
> +{
> + int parts = num_var_partitions (map);
> + partition tentative = partition_new (parts);
> +
> + /* Partition the SSA versions so that, for each coalescible
> + pair, both of its members are in the same partition in
> + TENTATIVE. */
> + gcc_assert (!cl->sorted);
> + coalesce_pair_p node;
> + coalesce_iterator_type ppi;
> + FOR_EACH_PARTITION_PAIR (node, ppi, cl)
> + {
> + tree v1 = ssa_name (node->first_element);
> + int p1 = partition_find (tentative, var_to_partition (map, v1));
> + tree v2 = ssa_name (node->second_element);
> + int p2 = partition_find (tentative, var_to_partition (map, v2));
> +
> + if (p1 == p2)
> + continue;
> +
> + partition_union (tentative, p1, p2);
> + }
> +
> + /* We have to deal with cost one pairs too. */
> + for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next)
> + {
> + tree v1 = ssa_name (co->first_element);
> + int p1 = partition_find (tentative, var_to_partition (map, v1));
> + tree v2 = ssa_name (co->second_element);
> + int p2 = partition_find (tentative, var_to_partition (map, v2));
> +
> + if (p1 == p2)
> + continue;
> +
> + partition_union (tentative, p1, p2);
> + }
> +
> + /* And also with abnormal edges. */
> + basic_block bb;
> + edge e;
> + edge_iterator ei;
> + FOR_EACH_BB_FN (bb, cfun)
> + {
> + FOR_EACH_EDGE (e, ei, bb->preds)
> + if (e->flags & EDGE_ABNORMAL)
> + {
> + gphi_iterator gsi;
> + for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
> + gsi_next (&gsi))
> + {
> + gphi *phi = gsi.phi ();
> + tree arg = PHI_ARG_DEF (phi, e->dest_idx);
> + if (SSA_NAME_IS_DEFAULT_DEF (arg)
> + && (!SSA_NAME_VAR (arg)
> + || TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL))
> + continue;
> +
> + tree res = PHI_RESULT (phi);
> +
> + int p1 = partition_find (tentative, var_to_partition (map, res));
> + int p2 = partition_find (tentative, var_to_partition (map, arg));
> +
> + if (p1 == p2)
> + continue;
> +
> + partition_union (tentative, p1, p2);
> + }
> + }
> + }
> +
> + map->partition_to_base_index = XCNEWVEC (int, parts);
> + auto_vec<unsigned int> index_map (parts);
> + if (parts)
> + index_map.quick_grow (parts);
> +
> + const unsigned no_part = -1;
> + unsigned count = parts;
> + while (count)
> + index_map[--count] = no_part;
> +
> + /* Initialize MAP's mapping from partition to base index, using
> + as base indices an enumeration of the TENTATIVE partitions in
> + which each SSA version ended up, so that we compute conflicts
> + between all SSA versions that ended up in the same potential
> + coalesce partition. */
> + bitmap_iterator bi;
> + unsigned i;
> + EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
> + {
> + int pidx = var_to_partition (map, ssa_name (i));
> + int base = partition_find (tentative, pidx);
> + if (index_map[base] != no_part)
> + continue;
> + index_map[base] = count++;
> + }
> +
> + map->num_basevars = count;
> +
> + EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
> + {
> + int pidx = var_to_partition (map, ssa_name (i));
> + int base = partition_find (tentative, pidx);
> + gcc_assert (index_map[base] < count);
> + map->partition_to_base_index[pidx] = index_map[base];
> + }
> +
> + if (dump_file && (dump_flags & TDF_DETAILS))
> + dump_part_var_map (dump_file, tentative, map);
> +
> + partition_delete (tentative);
> +}
> +
> +/* Hashtable helpers. */
> +
> +struct tree_int_map_hasher : typed_noop_remove <tree_int_map>
> +{
> + typedef tree_int_map *value_type;
> + typedef tree_int_map *compare_type;
> + static inline hashval_t hash (const tree_int_map *);
> + static inline bool equal (const tree_int_map *, const tree_int_map *);
> +};
> +
> +inline hashval_t
> +tree_int_map_hasher::hash (const tree_int_map *v)
> +{
> + return tree_map_base_hash (v);
> +}
> +
> +inline bool
> +tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
> +{
> + return tree_int_map_eq (v, c);
> +}
> +
> +/* This routine will initialize the basevar fields of MAP with base
> + names. Partitions will share the same base if they have the same
> + SSA_NAME_VAR, or, being anonymous variables, the same type. This
> + must match gimple_can_coalesce_p in the non-optimized case. */
> +
> +static void
> +compute_samebase_partition_bases (var_map map)
> +{
> + int x, num_part;
> + tree var;
> + struct tree_int_map *m, *mapstorage;
> +
> + num_part = num_var_partitions (map);
> + hash_table<tree_int_map_hasher> tree_to_index (num_part);
> + /* We can have at most num_part entries in the hash tables, so it's
> + enough to allocate so many map elements once, saving some malloc
> + calls. */
> + mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
> +
> + /* If a base table already exists, clear it, otherwise create it. */
> + free (map->partition_to_base_index);
> + map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
> +
> + /* Build the base variable list, and point partitions at their bases. */
> + for (x = 0; x < num_part; x++)
> + {
> + struct tree_int_map **slot;
> + unsigned baseindex;
> + var = partition_to_var (map, x);
> + if (SSA_NAME_VAR (var)
> + && (!VAR_P (SSA_NAME_VAR (var))
> + || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
> + m->base.from = SSA_NAME_VAR (var);
> + else
> + /* This restricts what anonymous SSA names we can coalesce
> + as it restricts the sets we compute conflicts for.
> + Using TREE_TYPE to generate sets is the easiest as
> + type equivalency also holds for SSA names with the same
> + underlying decl.
> +
> + Check gimple_can_coalesce_p when changing this code. */
> + m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
> + ? TYPE_CANONICAL (TREE_TYPE (var))
> + : TREE_TYPE (var));
> + /* If base variable hasn't been seen, set it up. */
> + slot = tree_to_index.find_slot (m, INSERT);
> + if (!*slot)
> + {
> + baseindex = m - mapstorage;
> + m->to = baseindex;
> + *slot = m;
> + m++;
> + }
> + else
> + baseindex = (*slot)->to;
> + map->partition_to_base_index[x] = baseindex;
> + }
> +
> + map->num_basevars = m - mapstorage;
> +
> + free (mapstorage);
> +}
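
For comparison with the optimized variant, the same-base numbering boils down to interning a key per partition (the SSA_NAME_VAR, or the canonical type for anonymous or ignored names) and reusing the index of the first partition that had that key. A rough sketch, assuming string keys and std::unordered_map in place of tree nodes and hash_table; assign_samebase_indices is a made-up name:

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// KEYS[x] plays the role of the base variable (or canonical type) of
// partition x.  Returns, for each partition, the index of its base;
// NUM_BASES receives the number of distinct bases.
std::vector<int> assign_samebase_indices(const std::vector<std::string> &keys,
                                         int &num_bases) {
  std::unordered_map<std::string, int> key_to_index;
  std::vector<int> base_index;
  base_index.reserve(keys.size());

  for (const std::string &key : keys) {
    // emplace inserts only when the key is new; the candidate index is
    // the table size before insertion, so indices stay dense.
    auto res = key_to_index.emplace(key,
                                    static_cast<int>(key_to_index.size()));
    base_index.push_back(res.first->second);
  }

  num_bases = static_cast<int>(key_to_index.size());
  return base_index;
}
```

The patch gets the same dense numbering from pointer arithmetic on mapstorage (m - mapstorage); the sketch uses the table size instead, which should be equivalent as long as entries are never removed.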
> +
> /* Reduce the number of copies by coalescing variables in the function. Return
> a partition map with the resulting coalesces. */
>
> @@ -1286,9 +1647,10 @@ coalesce_ssa_name (void)
> cl = create_coalesce_list ();
> map = create_outofssa_var_map (cl, used_in_copies);
>
> - /* If optimization is disabled, we need to coalesce all the names originating
> - from the same SSA_NAME_VAR so debug info remains undisturbed. */
> - if (!optimize)
> + /* If this optimization is disabled, we need to coalesce all the
> + names originating from the same SSA_NAME_VAR so debug info
> + remains undisturbed. */
> + if (!flag_tree_coalesce_vars)
> {
> hash_table<ssa_name_var_hash> ssa_name_hash (10);
>
> @@ -1329,8 +1691,13 @@ coalesce_ssa_name (void)
> if (dump_file && (dump_flags & TDF_DETAILS))
> dump_var_map (dump_file, map);
>
> - /* Don't calculate live ranges for variables not in the coalesce list. */
> - partition_view_bitmap (map, used_in_copies, true);
> + partition_view_bitmap (map, used_in_copies);
> +
> + if (flag_tree_coalesce_vars)
> + compute_optimized_partition_bases (map, used_in_copies, cl);
> + else
> + compute_samebase_partition_bases (map);
> +
> BITMAP_FREE (used_in_copies);
>
> if (num_var_partitions (map) < 1)
> @@ -1369,8 +1736,7 @@ coalesce_ssa_name (void)
>
> /* Now coalesce everything in the list. */
> coalesce_partitions (map, graph, cl,
> - ((dump_flags & TDF_DETAILS) ? dump_file
> - : NULL));
> + ((dump_flags & TDF_DETAILS) ? dump_file : NULL));
>
> delete_coalesce_list (cl);
> ssa_conflicts_delete (graph);
> diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h
> index 99b188a..ae289b4 100644
> --- a/gcc/tree-ssa-coalesce.h
> +++ b/gcc/tree-ssa-coalesce.h
> @@ -21,5 +21,6 @@ along with GCC; see the file COPYING3. If not see
> #define GCC_TREE_SSA_COALESCE_H
>
> extern var_map coalesce_ssa_name (void);
> +extern bool gimple_can_coalesce_p (tree, tree);
>
> #endif /* GCC_TREE_SSA_COALESCE_H */
> diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c
> deleted file mode 100644
> index f3cb56e..0000000
> --- a/gcc/tree-ssa-copyrename.c
> +++ /dev/null
> @@ -1,499 +0,0 @@
> -/* Rename SSA copies.
> - Copyright (C) 2004-2015 Free Software Foundation, Inc.
> - Contributed by Andrew MacLeod <amacleod@redhat.com>
> -
> -This file is part of GCC.
> -
> -GCC is free software; you can redistribute it and/or modify
> -it under the terms of the GNU General Public License as published by
> -the Free Software Foundation; either version 3, or (at your option)
> -any later version.
> -
> -GCC is distributed in the hope that it will be useful,
> -but WITHOUT ANY WARRANTY; without even the implied warranty of
> -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> -GNU General Public License for more details.
> -
> -You should have received a copy of the GNU General Public License
> -along with GCC; see the file COPYING3. If not see
> -<http://www.gnu.org/licenses/>. */
> -
> -#include "config.h"
> -#include "system.h"
> -#include "coretypes.h"
> -#include "tm.h"
> -#include "hash-set.h"
> -#include "machmode.h"
> -#include "vec.h"
> -#include "double-int.h"
> -#include "input.h"
> -#include "alias.h"
> -#include "symtab.h"
> -#include "wide-int.h"
> -#include "inchash.h"
> -#include "tree.h"
> -#include "fold-const.h"
> -#include "predict.h"
> -#include "hard-reg-set.h"
> -#include "function.h"
> -#include "dominance.h"
> -#include "cfg.h"
> -#include "basic-block.h"
> -#include "tree-ssa-alias.h"
> -#include "internal-fn.h"
> -#include "gimple-expr.h"
> -#include "is-a.h"
> -#include "gimple.h"
> -#include "gimple-iterator.h"
> -#include "flags.h"
> -#include "tree-pretty-print.h"
> -#include "bitmap.h"
> -#include "gimple-ssa.h"
> -#include "stringpool.h"
> -#include "tree-ssanames.h"
> -#include "hashtab.h"
> -#include "rtl.h"
> -#include "statistics.h"
> -#include "real.h"
> -#include "fixed-value.h"
> -#include "insn-config.h"
> -#include "expmed.h"
> -#include "dojump.h"
> -#include "explow.h"
> -#include "calls.h"
> -#include "emit-rtl.h"
> -#include "varasm.h"
> -#include "stmt.h"
> -#include "expr.h"
> -#include "tree-dfa.h"
> -#include "tree-inline.h"
> -#include "tree-ssa-live.h"
> -#include "tree-pass.h"
> -#include "langhooks.h"
> -
> -static struct
> -{
> - /* Number of copies coalesced. */
> - int coalesced;
> -} stats;
> -
> -/* The following routines implement the SSA copy renaming phase.
> -
> - This optimization looks for copies between 2 SSA_NAMES, either through a
> - direct copy, or an implicit one via a PHI node result and its arguments.
> -
> - Each copy is examined to determine if it is possible to rename the base
> - variable of one of the operands to the same variable as the other operand.
> - i.e.
> - T.3_5 = <blah>
> - a_1 = T.3_5
> -
> - If this copy couldn't be copy propagated, it could possibly remain in the
> - program throughout the optimization phases. After SSA->normal, it would
> - become:
> -
> - T.3 = <blah>
> - a = T.3
> -
> - Since T.3_5 is distinct from all other SSA versions of T.3, there is no
> - fundamental reason why the base variable needs to be T.3, subject to
> - certain restrictions. This optimization attempts to determine if we can
> - change the base variable on copies like this, and result in code such as:
> -
> - a_5 = <blah>
> - a_1 = a_5
> -
> - This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is
> - possible, the copy goes away completely. If it isn't possible, a new temp
> - will be created for a_5, and you will end up with the exact same code:
> -
> - a.8 = <blah>
> - a = a.8
> -
> - The other benefit of performing this optimization relates to what variables
> - are chosen in copies. Gimplification of the program uses temporaries for
> - a lot of things. expressions like
> -
> - a_1 = <blah>
> - <blah2> = a_1
> -
> - get turned into
> -
> - T.3_5 = <blah>
> - a_1 = T.3_5
> - <blah2> = a_1
> -
> - Copy propagation is done in a forward direction, and if we can propagate
> - through the copy, we end up with:
> -
> - T.3_5 = <blah>
> - <blah2> = T.3_5
> -
> - The copy is gone, but so is all reference to the user variable 'a'. By
> - performing this optimization, we would see the sequence:
> -
> - a_5 = <blah>
> - a_1 = a_5
> - <blah2> = a_1
> -
> - which copy propagation would then turn into:
> -
> - a_5 = <blah>
> - <blah2> = a_5
> -
> - and so we still retain the user variable whenever possible. */
> -
> -
> -/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
> - Choose a representative for the partition, and send debug info to DEBUG. */
> -
> -static void
> -copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
> -{
> - int p1, p2, p3;
> - tree root1, root2;
> - tree rep1, rep2;
> - bool ign1, ign2, abnorm;
> -
> - gcc_assert (TREE_CODE (var1) == SSA_NAME);
> - gcc_assert (TREE_CODE (var2) == SSA_NAME);
> -
> - register_ssa_partition (map, var1);
> - register_ssa_partition (map, var2);
> -
> - p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
> - p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
> -
> - if (debug)
> - {
> - fprintf (debug, "Try : ");
> - print_generic_expr (debug, var1, TDF_SLIM);
> - fprintf (debug, "(P%d) & ", p1);
> - print_generic_expr (debug, var2, TDF_SLIM);
> - fprintf (debug, "(P%d)", p2);
> - }
> -
> - gcc_assert (p1 != NO_PARTITION);
> - gcc_assert (p2 != NO_PARTITION);
> -
> - if (p1 == p2)
> - {
> - if (debug)
> - fprintf (debug, " : Already coalesced.\n");
> - return;
> - }
> -
> - rep1 = partition_to_var (map, p1);
> - rep2 = partition_to_var (map, p2);
> - root1 = SSA_NAME_VAR (rep1);
> - root2 = SSA_NAME_VAR (rep2);
> - if (!root1 && !root2)
> - return;
> -
> - /* Don't coalesce if one of the variables occurs in an abnormal PHI. */
> - abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1)
> - || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2));
> - if (abnorm)
> - {
> - if (debug)
> - fprintf (debug, " : Abnormal PHI barrier. No coalesce.\n");
> - return;
> - }
> -
> - /* Partitions already have the same root, simply merge them. */
> - if (root1 == root2)
> - {
> - p1 = partition_union (map->var_partition, p1, p2);
> - if (debug)
> - fprintf (debug, " : Same root, coalesced --> P%d.\n", p1);
> - return;
> - }
> -
> - /* Never attempt to coalesce 2 different parameters. */
> - if ((root1 && TREE_CODE (root1) == PARM_DECL)
> - && (root2 && TREE_CODE (root2) == PARM_DECL))
> - {
> - if (debug)
> - fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n");
> - return;
> - }
> -
> - if ((root1 && TREE_CODE (root1) == RESULT_DECL)
> - != (root2 && TREE_CODE (root2) == RESULT_DECL))
> - {
> - if (debug)
> - fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n");
> - return;
> - }
> -
> - ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1));
> - ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2));
> -
> - /* Refrain from coalescing user variables, if requested. */
> - if (!ign1 && !ign2)
> - {
> - if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2))
> - ign2 = true;
> - else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1))
> - ign1 = true;
> - else if (flag_ssa_coalesce_vars != 2)
> - {
> - if (debug)
> - fprintf (debug, " : 2 different USER vars. No coalesce.\n");
> - return;
> - }
> - else
> - ign2 = true;
> - }
> -
> - /* If both values have default defs, we can't coalesce. If only one has a
> - tag, make sure that variable is the new root partition. */
> - if (root1 && ssa_default_def (cfun, root1))
> - {
> - if (root2 && ssa_default_def (cfun, root2))
> - {
> - if (debug)
> - fprintf (debug, " : 2 default defs. No coalesce.\n");
> - return;
> - }
> - else
> - {
> - ign2 = true;
> - ign1 = false;
> - }
> - }
> - else if (root2 && ssa_default_def (cfun, root2))
> - {
> - ign1 = true;
> - ign2 = false;
> - }
> -
> - /* Do not coalesce if we cannot assign a symbol to the partition. */
> - if (!(!ign2 && root2)
> - && !(!ign1 && root1))
> - {
> - if (debug)
> - fprintf (debug, " : Choosen variable has no root. No coalesce.\n");
> - return;
> - }
> -
> - /* Don't coalesce if the new chosen root variable would be read-only.
> - If both ign1 && ign2, then the root var of the larger partition
> - wins, so reject in that case if any of the root vars is TREE_READONLY.
> - Otherwise reject only if the root var, on which replace_ssa_name_symbol
> - will be called below, is readonly. */
> - if (((root1 && TREE_READONLY (root1)) && ign2)
> - || ((root2 && TREE_READONLY (root2)) && ign1))
> - {
> - if (debug)
> - fprintf (debug, " : Readonly variable. No coalesce.\n");
> - return;
> - }
> -
> - /* Don't coalesce if the two variables aren't type compatible . */
> - if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2))
> - /* There is a disconnect between the middle-end type-system and
> - VRP, avoid coalescing enum types with different bounds. */
> - || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE
> - || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE)
> - && TREE_TYPE (var1) != TREE_TYPE (var2)))
> - {
> - if (debug)
> - fprintf (debug, " : Incompatible types. No coalesce.\n");
> - return;
> - }
> -
> - /* Merge the two partitions. */
> - p3 = partition_union (map->var_partition, p1, p2);
> -
> - /* Set the root variable of the partition to the better choice, if there is
> - one. */
> - if (!ign2 && root2)
> - replace_ssa_name_symbol (partition_to_var (map, p3), root2);
> - else if (!ign1 && root1)
> - replace_ssa_name_symbol (partition_to_var (map, p3), root1);
> - else
> - gcc_unreachable ();
> -
> - if (debug)
> - {
> - fprintf (debug, " --> P%d ", p3);
> - print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)),
> - TDF_SLIM);
> - fprintf (debug, "\n");
> - }
> -}
> -
> -
> -namespace {
> -
> -const pass_data pass_data_rename_ssa_copies =
> -{
> - GIMPLE_PASS, /* type */
> - "copyrename", /* name */
> - OPTGROUP_NONE, /* optinfo_flags */
> - TV_TREE_COPY_RENAME, /* tv_id */
> - ( PROP_cfg | PROP_ssa ), /* properties_required */
> - 0, /* properties_provided */
> - 0, /* properties_destroyed */
> - 0, /* todo_flags_start */
> - 0, /* todo_flags_finish */
> -};
> -
> -class pass_rename_ssa_copies : public gimple_opt_pass
> -{
> -public:
> - pass_rename_ssa_copies (gcc::context *ctxt)
> - : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt)
> - {}
> -
> - /* opt_pass methods: */
> - opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); }
> - virtual bool gate (function *) { return flag_tree_copyrename != 0; }
> - virtual unsigned int execute (function *);
> -
> -}; // class pass_rename_ssa_copies
> -
> -/* This function will make a pass through the IL, and attempt to coalesce any
> - SSA versions which occur in PHI's or copies. Coalescing is accomplished by
> - changing the underlying root variable of all coalesced version. This will
> - then cause the SSA->normal pass to attempt to coalesce them all to the same
> - variable. */
> -
> -unsigned int
> -pass_rename_ssa_copies::execute (function *fun)
> -{
> - var_map map;
> - basic_block bb;
> - tree var, part_var;
> - gimple stmt;
> - unsigned x;
> - FILE *debug;
> -
> - memset (&stats, 0, sizeof (stats));
> -
> - if (dump_file && (dump_flags & TDF_DETAILS))
> - debug = dump_file;
> - else
> - debug = NULL;
> -
> - map = init_var_map (num_ssa_names);
> -
> - FOR_EACH_BB_FN (bb, fun)
> - {
> - /* Scan for real copies. */
> - for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
> - gsi_next (&gsi))
> - {
> - stmt = gsi_stmt (gsi);
> - if (gimple_assign_ssa_name_copy_p (stmt))
> - {
> - tree lhs = gimple_assign_lhs (stmt);
> - tree rhs = gimple_assign_rhs1 (stmt);
> -
> - copy_rename_partition_coalesce (map, lhs, rhs, debug);
> - }
> - }
> - }
> -
> - FOR_EACH_BB_FN (bb, fun)
> - {
> - /* Treat PHI nodes as copies between the result and each argument. */
> - for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
> - gsi_next (&gsi))
> - {
> - size_t i;
> - tree res;
> - gphi *phi = gsi.phi ();
> - res = gimple_phi_result (phi);
> -
> - /* Do not process virtual SSA_NAMES. */
> - if (virtual_operand_p (res))
> - continue;
> -
> - /* Make sure to only use the same partition for an argument
> - as the result but never the other way around. */
> - if (SSA_NAME_VAR (res)
> - && !DECL_IGNORED_P (SSA_NAME_VAR (res)))
> - for (i = 0; i < gimple_phi_num_args (phi); i++)
> - {
> - tree arg = PHI_ARG_DEF (phi, i);
> - if (TREE_CODE (arg) == SSA_NAME)
> - copy_rename_partition_coalesce (map, res, arg,
> - debug);
> - }
> - /* Else if all arguments are in the same partition try to merge
> - it with the result. */
> - else
> - {
> - int all_p_same = -1;
> - int p = -1;
> - for (i = 0; i < gimple_phi_num_args (phi); i++)
> - {
> - tree arg = PHI_ARG_DEF (phi, i);
> - if (TREE_CODE (arg) != SSA_NAME)
> - {
> - all_p_same = 0;
> - break;
> - }
> - else if (all_p_same == -1)
> - {
> - p = partition_find (map->var_partition,
> - SSA_NAME_VERSION (arg));
> - all_p_same = 1;
> - }
> - else if (all_p_same == 1
> - && p != partition_find (map->var_partition,
> - SSA_NAME_VERSION (arg)))
> - {
> - all_p_same = 0;
> - break;
> - }
> - }
> - if (all_p_same == 1)
> - copy_rename_partition_coalesce (map, res,
> - PHI_ARG_DEF (phi, 0),
> - debug);
> - }
> - }
> - }
> -
> - if (debug)
> - dump_var_map (debug, map);
> -
> - /* Now one more pass to make all elements of a partition share the same
> - root variable. */
> -
> - for (x = 1; x < num_ssa_names; x++)
> - {
> - part_var = partition_to_var (map, x);
> - if (!part_var)
> - continue;
> - var = ssa_name (x);
> - if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var))
> - continue;
> - if (debug)
> - {
> - fprintf (debug, "Coalesced ");
> - print_generic_expr (debug, var, TDF_SLIM);
> - fprintf (debug, " to ");
> - print_generic_expr (debug, part_var, TDF_SLIM);
> - fprintf (debug, "\n");
> - }
> - stats.coalesced++;
> - replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var));
> - }
> -
> - statistics_counter_event (fun, "copies coalesced",
> - stats.coalesced);
> - delete_var_map (map);
> - return 0;
> -}
> -
> -} // anon namespace
> -
> -gimple_opt_pass *
> -make_pass_rename_ssa_copies (gcc::context *ctxt)
> -{
> - return new pass_rename_ssa_copies (ctxt);
> -}
> diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
> index 2c7c072..821b2f4 100644
> --- a/gcc/tree-ssa-live.c
> +++ b/gcc/tree-ssa-live.c
> @@ -100,90 +100,6 @@ static void verify_live_on_entry (tree_live_info_p);
> ssa_name or variable, and vice versa. */
>
>
> -/* Hashtable helpers. */
> -
> -struct tree_int_map_hasher : typed_noop_remove <tree_int_map>
> -{
> - typedef tree_int_map *value_type;
> - typedef tree_int_map *compare_type;
> - static inline hashval_t hash (const tree_int_map *);
> - static inline bool equal (const tree_int_map *, const tree_int_map *);
> -};
> -
> -inline hashval_t
> -tree_int_map_hasher::hash (const tree_int_map *v)
> -{
> - return tree_map_base_hash (v);
> -}
> -
> -inline bool
> -tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
> -{
> - return tree_int_map_eq (v, c);
> -}
> -
> -
> -/* This routine will initialize the basevar fields of MAP. */
> -
> -static void
> -var_map_base_init (var_map map)
> -{
> - int x, num_part;
> - tree var;
> - struct tree_int_map *m, *mapstorage;
> -
> - num_part = num_var_partitions (map);
> - hash_table<tree_int_map_hasher> tree_to_index (num_part);
> - /* We can have at most num_part entries in the hash tables, so it's
> - enough to allocate so many map elements once, saving some malloc
> - calls. */
> - mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
> -
> - /* If a base table already exists, clear it, otherwise create it. */
> - free (map->partition_to_base_index);
> - map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
> -
> - /* Build the base variable list, and point partitions at their bases. */
> - for (x = 0; x < num_part; x++)
> - {
> - struct tree_int_map **slot;
> - unsigned baseindex;
> - var = partition_to_var (map, x);
> - if (SSA_NAME_VAR (var)
> - && (!VAR_P (SSA_NAME_VAR (var))
> - || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
> - m->base.from = SSA_NAME_VAR (var);
> - else
> - /* This restricts what anonymous SSA names we can coalesce
> - as it restricts the sets we compute conflicts for.
> - Using TREE_TYPE to generate sets is the easies as
> - type equivalency also holds for SSA names with the same
> - underlying decl.
> -
> - Check gimple_can_coalesce_p when changing this code. */
> - m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
> - ? TYPE_CANONICAL (TREE_TYPE (var))
> - : TREE_TYPE (var));
> - /* If base variable hasn't been seen, set it up. */
> - slot = tree_to_index.find_slot (m, INSERT);
> - if (!*slot)
> - {
> - baseindex = m - mapstorage;
> - m->to = baseindex;
> - *slot = m;
> - m++;
> - }
> - else
> - baseindex = (*slot)->to;
> - map->partition_to_base_index[x] = baseindex;
> - }
> -
> - map->num_basevars = m - mapstorage;
> -
> - free (mapstorage);
> -}
> -
> -
> /* Remove the base table in MAP. */
>
> static void
> @@ -361,21 +277,17 @@ partition_view_fini (var_map map, bitmap selected)
> }
>
>
> -/* Create a partition view which includes all the used partitions in MAP. If
> - WANT_BASES is true, create the base variable map as well. */
> +/* Create a partition view which includes all the used partitions in MAP. */
>
> void
> -partition_view_normal (var_map map, bool want_bases)
> +partition_view_normal (var_map map)
> {
> bitmap used;
>
> used = partition_view_init (map);
> partition_view_fini (map, used);
>
> - if (want_bases)
> - var_map_base_init (map);
> - else
> - var_map_base_fini (map);
> + var_map_base_fini (map);
> }
>
>
> @@ -384,7 +296,7 @@ partition_view_normal (var_map map, bool want_bases)
> as well. */
>
> void
> -partition_view_bitmap (var_map map, bitmap only, bool want_bases)
> +partition_view_bitmap (var_map map, bitmap only)
> {
> bitmap used;
> bitmap new_partitions = BITMAP_ALLOC (NULL);
> @@ -400,10 +312,7 @@ partition_view_bitmap (var_map map, bitmap only, bool want_bases)
> }
> partition_view_fini (map, new_partitions);
>
> - if (want_bases)
> - var_map_base_init (map);
> - else
> - var_map_base_fini (map);
> + var_map_base_fini (map);
> }
>
>
> diff --git a/gcc/tree-ssa-live.h b/gcc/tree-ssa-live.h
> index d5d7820..1f88358 100644
> --- a/gcc/tree-ssa-live.h
> +++ b/gcc/tree-ssa-live.h
> @@ -71,8 +71,8 @@ typedef struct _var_map
> extern var_map init_var_map (int);
> extern void delete_var_map (var_map);
> extern int var_union (var_map, tree, tree);
> -extern void partition_view_normal (var_map, bool);
> -extern void partition_view_bitmap (var_map, bitmap, bool);
> +extern void partition_view_normal (var_map);
> +extern void partition_view_bitmap (var_map, bitmap);
> extern void dump_scope_blocks (FILE *, int);
> extern void debug_scope_block (tree, int);
> extern void debug_scope_blocks (int);
> diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
> index 3f6bebe..7bef8cf 100644
> --- a/gcc/tree-ssa-loop-niter.c
> +++ b/gcc/tree-ssa-loop-niter.c
> @@ -3964,7 +3964,11 @@ loop_exits_before_overflow (tree base, tree step,
> if (!CONVERT_EXPR_P (e) || !operand_equal_p (e, unsigned_base, 0))
> continue;
> e = TREE_OPERAND (e, 0);
> - gcc_assert (operand_equal_p (e, base, 0));
> + /* If E has an unsigned type, the operand equality test below
> + would fail, but the equality test above would have already
> + verified the equality, so we can proceed with it. */
> + gcc_assert (TYPE_UNSIGNED (TREE_TYPE (e))
> + || operand_equal_p (e, base, 0));
> if (tree_int_cst_sign_bit (step))
> {
> code = LT_EXPR;
> diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c
> index f75a7f1..0982305 100644
> --- a/gcc/tree-ssa-uncprop.c
> +++ b/gcc/tree-ssa-uncprop.c
> @@ -59,6 +59,11 @@ along with GCC; see the file COPYING3. If not see
> #include "domwalk.h"
> #include "tree-pass.h"
> #include "tree-ssa-propagate.h"
> +#include "bitmap.h"
> +#include "stringpool.h"
> +#include "tree-ssanames.h"
> +#include "tree-ssa-live.h"
> +#include "tree-ssa-coalesce.h"
>
> /* The basic structure describing an equivalency created by traversing
> an edge. Traversing the edge effectively means that we can assume
> diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
> index 0b24007..acdcd46 100644
> --- a/gcc/var-tracking.c
> +++ b/gcc/var-tracking.c
> @@ -4931,12 +4931,16 @@ dataflow_set_remove_mem_locs (variable_def **slot, dataflow_set *set)
> registers, as well as associations between MEMs and VALUEs. */
>
> static void
> -dataflow_set_clear_at_call (dataflow_set *set)
> +dataflow_set_clear_at_call (dataflow_set *set, rtx_insn *call_insn)
> {
> unsigned int r;
> hard_reg_set_iterator hrsi;
> + HARD_REG_SET invalidated_regs;
>
> - EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, r, hrsi)
> + get_call_reg_set_usage (call_insn, &invalidated_regs,
> + regs_invalidated_by_call);
> +
> + EXECUTE_IF_SET_IN_HARD_REG_SET (invalidated_regs, 0, r, hrsi)
> var_regno_delete (set, r);
>
> if (MAY_HAVE_DEBUG_INSNS)
> @@ -6720,7 +6724,7 @@ compute_bb_dataflow (basic_block bb)
> switch (mo->type)
> {
> case MO_CALL:
> - dataflow_set_clear_at_call (out);
> + dataflow_set_clear_at_call (out, insn);
> break;
>
> case MO_USE:
> @@ -9182,7 +9186,7 @@ emit_notes_in_bb (basic_block bb, dataflow_set *set)
> switch (mo->type)
> {
> case MO_CALL:
> - dataflow_set_clear_at_call (set);
> + dataflow_set_clear_at_call (set, insn);
> emit_notes_for_changes (insn, EMIT_NOTE_AFTER_CALL_INSN, set->vars);
> {
> rtx arguments = mo->u.loc, *p = &arguments;
>
>
>
> And here's the incremental patch:
>
> ---
> gcc/alias.c | 17 +++++++------
> gcc/cfgexpand.c | 57 +++++++++++++++++----------------------------
> gcc/emit-rtl.c | 2 --
> gcc/explow.c | 3 --
> gcc/expr.c | 16 +++++--------
> gcc/function.c | 15 ++++++++++++
> gcc/gimple-expr.h | 4 ---
> gcc/tree-outof-ssa.c | 7 ++----
> gcc/tree-ssa-coalesce.h | 1 +
> gcc/tree-ssa-loop-niter.c | 6 ++++-
> gcc/tree-ssa-uncprop.c | 5 ++++
> 11 files changed, 64 insertions(+), 69 deletions(-)
>
> diff --git a/gcc/alias.c b/gcc/alias.c
> index 7a74e81..5a031d9 100644
> --- a/gcc/alias.c
> +++ b/gcc/alias.c
> @@ -2553,14 +2553,15 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
> return 0;
>
> /* If we refer to different gimple registers, or one gimple register
> - and one non-gimple-register, we know they can't overlap. Now,
> - there could be more than one stack slot for (different versions
> - of) the same gimple register, but we can presumably tell they
> - don't overlap based on offsets from stack base addresses
> - elsewhere. It's important that we don't proceed to DECL_RTL,
> - because gimple registers may not pass DECL_RTL_SET_P, and
> - make_decl_rtl won't be able to do anything about them since no
> - SSA information will have remained to guide it. */
> + and one non-gimple-register, we know they can't overlap. First,
> + gimple registers don't have their addresses taken. Now, there
> + could be more than one stack slot for (different versions of) the
> + same gimple register, but we can presumably tell they don't
> + overlap based on offsets from stack base addresses elsewhere.
> + It's important that we don't proceed to DECL_RTL, because gimple
> + registers may not pass DECL_RTL_SET_P, and make_decl_rtl won't be
> + able to do anything about them since no SSA information will have
> + remained to guide it. */
> if (is_gimple_reg (exprx) || is_gimple_reg (expry))
> return exprx != expry;
>
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index 3e80b4a..bf972fc 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -179,11 +179,10 @@ gimple_assign_rhs_to_tree (gimple stmt)
>
> #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
>
> -/* NEXT is a DECL to be associated with some RTL, CUR is a DECL or a
> - TREE_LIST of DECLs. If NEXT is covered by CUR, return CUR
> - unchanged. Otherwise, return a list with all entries of CUR, with
> - NEXT at the end. If CUR was a list, it will be modified in
> - place. */
> +/* Choose either CUR or NEXT as the leader DECL for a partition.
> + Prefer ignored decls, to simplify debug dumps and reduce the
> + ambiguity that arises when the same user variable is in multiple
> + partitions (this is less likely for compiler-introduced temps). */
>
> static tree
> leader_merge (tree cur, tree next)
> @@ -191,26 +190,11 @@ leader_merge (tree cur, tree next)
> if (cur == NULL || cur == next)
> return next;
>
> - tree list;
> + if (DECL_P (cur) && DECL_IGNORED_P (cur))
> + return cur;
>
> - if (TREE_CODE (cur) == TREE_LIST)
> - {
> - /* Look for NEXT in the list. Stop at the last node to insert
> - there. */
> - for (list = cur; ; list = TREE_CHAIN (list))
> - {
> - if (TREE_VALUE (list) == next)
> - return cur;
> - if (!TREE_CHAIN (list))
> - break;
> - }
> - }
> - else
> - /* Create the first node. */
> - list = build_tree_list (NULL, cur);
> -
> - next = build_tree_list (NULL, next);
> - TREE_CHAIN (list) = next;
> + if (DECL_P (next) && DECL_IGNORED_P (next))
> + return next;
>
> return cur;
> }
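
If I read the new leader_merge correctly, the selection rule is: keep CUR when it is an ignored decl, otherwise switch to NEXT when that one is ignored, otherwise keep CUR. A toy model of that rule, with a hypothetical Decl struct standing in for tree nodes and DECL_IGNORED_P (the extra null check on NEXT is my addition, not something the patch does):

```cpp
// Hypothetical stand-in for a decl: only the bit the rule looks at.
struct Decl {
  const char *name;
  bool ignored;  // models DECL_IGNORED_P
};

// Pick the leader for a partition that currently has leader CUR and
// is being merged with something whose decl is NEXT.
const Decl *leader_merge_sketch(const Decl *cur, const Decl *next) {
  if (cur == nullptr || cur == next)
    return next;
  if (cur->ignored)
    return cur;
  if (next != nullptr && next->ignored)
    return next;
  return cur;
}
```

One consequence worth noting: once an ignored decl becomes the leader, no later merge replaces it, so the result depends on merge order only while both candidates are user-visible.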
> @@ -285,9 +269,9 @@ set_rtl (tree t, rtx x)
> if (cur != next)
> {
> if (MEM_P (x))
> - set_mem_attributes (x, SSAVAR (t), true);
> + set_mem_attributes (x, next, true);
> else
> - set_reg_attrs_for_decl_rtl (SSAVAR (t), x);
> + set_reg_attrs_for_decl_rtl (next, x);
> }
> }
>
> @@ -1025,9 +1009,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
> gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
>
> x = plus_constant (Pmode, base, offset);
> - x = gen_rtx_MEM ((TREE_CODE (decl) != SSA_NAME || SSA_NAME_VAR (decl))
> - ? DECL_MODE (SSAVAR (decl))
> - : TYPE_MODE (TREE_TYPE (decl)), x);
> + x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
> + ? TYPE_MODE (TREE_TYPE (decl))
> + : DECL_MODE (SSAVAR (decl)), x);
>
> if (TREE_CODE (decl) != SSA_NAME)
> {
> @@ -1268,17 +1252,17 @@ expand_one_stack_var_1 (tree var)
> HOST_WIDE_INT size, offset;
> unsigned byte_align;
>
> - if (TREE_CODE (var) != SSA_NAME || SSA_NAME_VAR (var))
> - {
> - size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
> - byte_align = align_local_variable (SSAVAR (var));
> - }
> - else
> + if (TREE_CODE (var) == SSA_NAME)
> {
> tree type = TREE_TYPE (var);
> size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
> byte_align = TYPE_ALIGN_UNIT (type);
> }
> + else
> + {
> + size = tree_to_uhwi (DECL_SIZE_UNIT (var));
> + byte_align = align_local_variable (var);
> + }
>
> /* We handle highly aligned variables in expand_stack_vars. */
> gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
> @@ -1423,9 +1407,10 @@ expand_one_register_var (tree var)
> gcc_assert (REG_P (x));
> return;
> }
> + gcc_unreachable ();
> }
>
> - tree decl = SSAVAR (var);
> + tree decl = var;
> tree type = TREE_TYPE (decl);
> machine_mode reg_mode = promote_decl_mode (decl, NULL);
> rtx x = gen_reg_rtx (reg_mode);
> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
> index 308da40..2b98946 100644
> --- a/gcc/emit-rtl.c
> +++ b/gcc/emit-rtl.c
> @@ -1252,8 +1252,6 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x)
> if (!t)
> return;
> tree tdecl = t;
> - if (TREE_CODE (t) == TREE_LIST)
> - tdecl = TREE_VALUE (t);
> if (GET_CODE (x) == SUBREG)
> {
> gcc_assert (subreg_lowpart_p (x));
> diff --git a/gcc/explow.c b/gcc/explow.c
> index e09c032e1..5b0d49c 100644
> --- a/gcc/explow.c
> +++ b/gcc/explow.c
> @@ -866,9 +866,6 @@ promote_ssa_mode (const_tree name, int *punsignedp)
> {
> gcc_assert (TREE_CODE (name) == SSA_NAME);
>
> - if (SSA_NAME_VAR (name))
> - return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
> -
> tree type = TREE_TYPE (name);
> int unsignedp = TYPE_UNSIGNED (type);
> machine_mode mode = TYPE_MODE (type);
> diff --git a/gcc/expr.c b/gcc/expr.c
> index effe379..5b6e16e 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -9584,20 +9584,16 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>
> /* Get the signedness to be used for this variable. Ensure we get
> the same mode we got when the variable was declared. */
> - if (code == SSA_NAME
> - && (g = SSA_NAME_DEF_STMT (ssa_name))
> - && gimple_code (g) == GIMPLE_CALL
> - && !gimple_call_internal_p (g))
> + if (code != SSA_NAME)
> + pmode = promote_decl_mode (exp, &unsignedp);
> + else if ((g = SSA_NAME_DEF_STMT (ssa_name))
> + && gimple_code (g) == GIMPLE_CALL
> + && !gimple_call_internal_p (g))
> pmode = promote_function_mode (type, mode, &unsignedp,
> gimple_call_fntype (g),
> 2);
> - else if (!exp)
> - {
> - gcc_assert (code == SSA_NAME);
> - pmode = promote_ssa_mode (ssa_name, &unsignedp);
> - }
> else
> - pmode = promote_decl_mode (exp, &unsignedp);
> + pmode = promote_ssa_mode (ssa_name, &unsignedp);
> gcc_assert (GET_MODE (decl_rtl) == pmode);
>
> temp = gen_lowpart_SUBREG (mode, decl_rtl);
> diff --git a/gcc/function.c b/gcc/function.c
> index dc9e77f..58e2498 100644
> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -2124,6 +2124,21 @@ use_register_for_decl (const_tree decl)
> {
> if (TREE_CODE (decl) == SSA_NAME)
> {
> + /* We often try to use the SSA_NAME, instead of its underlying
> + decl, to get type information and guide decisions, to avoid
> + differences of behavior between anonymous and named
> + variables, but in this one case we have to go for the actual
> + variable if there is one. The main reason is that, at least
> + at -O0, we want to place user variables on the stack, but we
> + don't mind using pseudos for anonymous or ignored temps.
> + Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
> + should go in pseudos, whereas their corresponding variables
> + might have to go on the stack. So, disregarding the decl
> + here would negatively impact debug info at -O0, enable
> + coalescing between SSA_NAMEs that ought to get different
> + stack/pseudo assignments, and get the incoming argument
> + processing thoroughly confused by PARM_DECLs expected to live
> + in stack slots but assigned to pseudos. */
> if (!SSA_NAME_VAR (decl))
> return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
> && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
> diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
> index 146cede..3d1c89f 100644
> --- a/gcc/gimple-expr.h
> +++ b/gcc/gimple-expr.h
> @@ -55,10 +55,6 @@ extern bool is_gimple_mem_ref_addr (tree);
> extern void mark_addressable (tree);
> extern bool is_gimple_reg_rhs (tree);
>
> -/* Defined in tree-ssa-coalesce.c. */
> -extern bool gimple_can_coalesce_p (tree, tree);
> -
> -
> /* Return true if a conversion from either type of TYPE1 and TYPE2
> to the other is not required. Otherwise return false. */
>
> diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
> index dda9973..59d91c6 100644
> --- a/gcc/tree-outof-ssa.c
> +++ b/gcc/tree-outof-ssa.c
> @@ -305,7 +305,6 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
> rtx dest_rtx, seq, x;
> machine_mode dest_mode, src_mode;
> int unsignedp;
> - tree var;
>
> if (dump_file && (dump_flags & TDF_DETAILS))
> {
> @@ -328,10 +327,9 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
> start_sequence ();
>
> tree name = partition_to_var (SA.map, dest);
> - var = SSA_NAME_VAR (name);
> src_mode = TYPE_MODE (TREE_TYPE (src));
> dest_mode = GET_MODE (dest_rtx);
> - gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var ? var : name)));
> + gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (name)));
> gcc_assert (!REG_P (dest_rtx)
> || dest_mode == promote_ssa_mode (name, &unsignedp));
>
> @@ -709,8 +707,7 @@ elim_backward (elim_graph g, int T)
> static rtx
> get_temp_reg (tree name)
> {
> - tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
> - tree type = var ? TREE_TYPE (var) : TREE_TYPE (name);
> + tree type = TREE_TYPE (name);
> int unsignedp;
> machine_mode reg_mode = promote_ssa_mode (name, &unsignedp);
> rtx x = gen_reg_rtx (reg_mode);
> diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h
> index 99b188a..ae289b4 100644
> --- a/gcc/tree-ssa-coalesce.h
> +++ b/gcc/tree-ssa-coalesce.h
> @@ -21,5 +21,6 @@ along with GCC; see the file COPYING3. If not see
> #define GCC_TREE_SSA_COALESCE_H
>
> extern var_map coalesce_ssa_name (void);
> +extern bool gimple_can_coalesce_p (tree, tree);
>
> #endif /* GCC_TREE_SSA_COALESCE_H */
> diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
> index 3f6bebe..7bef8cf 100644
> --- a/gcc/tree-ssa-loop-niter.c
> +++ b/gcc/tree-ssa-loop-niter.c
> @@ -3964,7 +3964,11 @@ loop_exits_before_overflow (tree base, tree step,
> if (!CONVERT_EXPR_P (e) || !operand_equal_p (e, unsigned_base, 0))
> continue;
> e = TREE_OPERAND (e, 0);
> - gcc_assert (operand_equal_p (e, base, 0));
> + /* If E has an unsigned type, the operand equality test below
> + would fail, but the equality test above would have already
> + verified the equality, so we can proceed with it. */
> + gcc_assert (TYPE_UNSIGNED (TREE_TYPE (e))
> + || operand_equal_p (e, base, 0));
> if (tree_int_cst_sign_bit (step))
> {
> code = LT_EXPR;
> diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c
> index f75a7f1..0982305 100644
> --- a/gcc/tree-ssa-uncprop.c
> +++ b/gcc/tree-ssa-uncprop.c
> @@ -59,6 +59,11 @@ along with GCC; see the file COPYING3. If not see
> #include "domwalk.h"
> #include "tree-pass.h"
> #include "tree-ssa-propagate.h"
> +#include "bitmap.h"
> +#include "stringpool.h"
> +#include "tree-ssanames.h"
> +#include "tree-ssa-live.h"
> +#include "tree-ssa-coalesce.h"
>
> /* The basic structure describing an equivalency created by traversing
> an edge. Traversing the edge effectively means that we can assume
>
>
>
> --
> Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/ FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
* Re: [PR64164] drop copyrename, integrate into expand
2015-06-08 8:16 ` Richard Biener
@ 2015-06-09 8:58 ` Christophe Lyon
0 siblings, 0 replies; 127+ messages in thread
From: Christophe Lyon @ 2015-06-09 8:58 UTC (permalink / raw)
To: Alexandre Oliva; +Cc: GCC Patches
On 8 June 2015 at 10:14, Richard Biener <richard.guenther@gmail.com> wrote:
> On Sat, Jun 6, 2015 at 3:14 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
>> On Apr 27, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
>>
>>> This should also mention that is_gimple_reg vars do not have their
>>> address taken.
>>
>> check
>>
>>>> +static tree
>>>> +leader_merge (tree cur, tree next)
>>
>>> Ick - presumably you can't use sth better than a TREE_LIST here?
>>
>> The list was an experiment that never really worked, and when I tried to
>> make it work after the patch, it proved to be unworkable, so I dropped
>> it, and rewrote leader_merge to choose either of the params, preferring
>> anonymous over ignored over named, so as to reduce the likelihood of
>> misreading of debug dumps, since that's all they're used for.
>>
>>>> static void
>>>> -expand_one_stack_var (tree var)
>>>> +expand_one_stack_var_1 (tree var)
>>>> {
>>>> HOST_WIDE_INT size, offset;
>>>> unsigned byte_align;
>>>>
>>>> - size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
>>>> - byte_align = align_local_variable (SSAVAR (var));
>>>> + if (TREE_CODE (var) != SSA_NAME || SSA_NAME_VAR (var))
>>>> + {
>>>> + size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
>>>> + byte_align = align_local_variable (SSAVAR (var));
>>>> + }
>>>> + else
>>
>>> I'd go here for all TREE_CODE (var) == SSA_NAME
>>
>> Check
>>
>>> (and get rid of the SSAVAR macro?)
>>
>> There are remaining uses that don't seem worth dropping it for.
>>
>>>> +/* Return the promoted mode for name. If it is a named SSA_NAME, it
>>>> + is the same as promote_decl_mode. Otherwise, it is the promoted
>>>> + mode of a temp decl of same type as the SSA_NAME, if we had created
>>>> + one. */
>>>> +
>>>> +machine_mode
>>>> +promote_ssa_mode (const_tree name, int *punsignedp)
>>>> +{
>>>> + gcc_assert (TREE_CODE (name) == SSA_NAME);
>>>> +
>>>> + if (SSA_NAME_VAR (name))
>>>> + return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
>>
>>> As above I'd rather not have different paths for anonymous vs. non-anonymous
>>> vars (so just delete the above two lines).
>>
>> Check
>>
>>>> @@ -9668,6 +9678,11 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>>>> pmode = promote_function_mode (type, mode, &unsignedp,
>>>> gimple_call_fntype (g),
>>>> 2);
>>>> + else if (!exp)
>>>> + {
>>>> + gcc_assert (code == SSA_NAME);
>>
>>> promote_ssa_mode should assert this.
>>
>>>> + pmode = promote_ssa_mode (ssa_name, &unsignedp);
>>
>> It does, so... check.
>>
>>
>>>> @@ -2121,6 +2122,15 @@ aggregate_value_p (const_tree exp, const_tree fntype)
>>>> bool
>>>> use_register_for_decl (const_tree decl)
>>>> {
>>>> + if (TREE_CODE (decl) == SSA_NAME)
>>>> + {
>>>> + if (!SSA_NAME_VAR (decl))
>>>> + return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
>>>> + && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
>>>> +
>>>> + decl = SSA_NAME_VAR (decl);
>>
>>> See above. Please drop the SSA_NAME_VAR != NULL path.
>>
>> Check, then taken back, after a bootstrap failure and some debugging
>> made me realize this would be wrong. Here are the newly-added comments
>> that explain why:
>>
>> /* We often try to use the SSA_NAME, instead of its underlying
>> decl, to get type information and guide decisions, to avoid
>> differences of behavior between anonymous and named
>> variables, but in this one case we have to go for the actual
>> variable if there is one. The main reason is that, at least
>> at -O0, we want to place user variables on the stack, but we
>> don't mind using pseudos for anonymous or ignored temps.
>> Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
>> should go in pseudos, whereas their corresponding variables
>> might have to go on the stack. So, disregarding the decl
>> here would negatively impact debug info at -O0, enable
>> coalescing between SSA_NAMEs that ought to get different
>> stack/pseudo assignments, and get the incoming argument
>> processing thoroughly confused by PARM_DECLs expected to live
>> in stack slots but assigned to pseudos. */
>>
>>
>>>> +++ b/gcc/gimple-expr.h
>>>> +/* Defined in tree-ssa-coalesce.c. */
>>>> +extern bool gimple_can_coalesce_p (tree, tree);
>>
>>> Err, put it to tree-ssa-coalesce.h?
>>
>> Check. Lots of additional headers required to be able to include
>> tree-ssa-coalesce.h, though.
>>
>>
>>>> - gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var)));
>>>> + gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var ? var : name)));
>>
>>> The TREE_TYPE of name and its SSA_NAME_VAR are always the same. So just
>>> use TREE_TYPE (name) here.
>>
>> Check
>>
>>>> gcc_assert (!REG_P (dest_rtx)
>>>> - || dest_mode == promote_decl_mode (var, &unsignedp));
>>>> + || dest_mode == promote_ssa_mode (name, &unsignedp));
>>>>
>>>> if (src_mode != dest_mode)
>>>> {
>>>> @@ -714,12 +715,12 @@ static rtx
>>>> get_temp_reg (tree name)
>>>> {
>>>> tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
>>>> - tree type = TREE_TYPE (var);
>>>> + tree type = var ? TREE_TYPE (var) : TREE_TYPE (name);
>>
>>> See above.
>>
>> Check
>>
>>
>> Here's the revised patch, regstrapped on x86_64-linux-gnu and
>> i686-linux-gnu. The first attempt failed to compile libjava on x86_64,
>> requiring the new change in tree-ssa-loop-niter.c to pass. It didn't
>> occur in the unpatched tree because the differences between anon or
>> named SSA_NAMEs in copyrename changed costs and caused different choices
>> in ivopts, which ultimately failed to expose the problem in loop-niter
>> during vrp.
>>
>> At the end, I enclose the incremental changes since the previous
>> revision of the patch, to ease the incremental review.
>>
>> Ok to install?
>
> Ok.
>
> Thanks,
> Richard.
>
>>
>> for gcc/ChangeLog
>>
>> PR rtl-optimization/64164
>> * Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
>> * tree-ssa-copyrename.c: Removed.
>> * opts.c (default_options_table): Drop -ftree-copyrename. Add
>> -ftree-coalesce-vars.
>> * passes.def: Drop all occurrences of pass_rename_ssa_copies.
>> * common.opt (ftree-copyrename): Ignore.
>> (ftree-coalesce-inlined-vars): Likewise.
>> * doc/invoke.texi: Remove the ignored options above.
>> * gimple-expr.h (gimple_can_coalesce_p): Move declaration
>> * tree-ssa-coalesce.h: ... here.
>> * tree-ssa-uncprop.c: Include tree-ssa-coalesce.h and other
>> headers required by it.
>> * gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
>> across variables when flag_tree_coalesce_vars. Check register
>> use and promoted modes to allow coalescing. Moved to
>> tree-ssa-coalesce.c.
>> * tree-ssa-live.c (struct tree_int_map_hasher): Move along
>> with its member functions to tree-ssa-coalesce.c.
>> (var_map_base_init): Likewise. Renamed to
>> compute_samebase_partition_bases.
>> (partition_view_normal): Drop want_bases parameter.
>> (partition_view_bitmap): Likewise.
>> * tree-ssa-live.h: Adjust declarations.
>> * tree-ssa-coalesce.c: Include explow.h.
>> (build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs's
>> default defs at the entry point.
>> (dump_part_var_map): New.
>> (compute_optimized_partition_bases): New, called by...
>> (coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
>> of compute_samebase_partition_bases. Adjust.
>> * alias.c (nonoverlapping_memrefs_p): Disregard gimple-regs.
>> * cfgexpand.c (leader_merge): New.
>> (get_rtl_for_parm_ssa_default_def): New.
>> (set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
>> vars. Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
>> (expand_one_stack_var_at): Handle anonymous SSA_NAMEs. Drop
>> redundant MEM attr setting.
>> (expand_one_stack_var_1): Handle anonymous SSA_NAMEs. Renamed
>> from...
>> (expand_one_stack_var): ... this. New wrapper to check and
>> skip already expanded SSA partitions.
>> (record_alignment_for_reg_var): New, factored out of...
>> (expand_one_var): ... this.
>> (expand_one_ssa_partition): New.
>> (adjust_one_expanded_partition_var): New.
>> (expand_one_register_var): Check and skip already expanded SSA
>> partitions.
>> (expand_used_vars): Don't create DECLs for anonymous SSA
>> names. Expand all SSA partitions, then adjust all SSA names.
>> (pass::execute): Replace the loops that set
>> SA.partition_to_pseudo from partition leaders and cleared
>> DECL_RTL for multi-location variables, and that which used to
>> rename vars and set attrs, with one that clears DECL_RTL and
>> checks that PARMs and RESULTs default_defs match DECL_RTL.
>> * cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
>> * emit-rtl.c (set_reg_attrs_for_parm): Handle NULL decl.
>> * explow.c (promote_ssa_mode): New.
>> * explow.h (promote_ssa_mode): Declare.
>> * expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
>> * function.c: Include cfgexpand.h.
>> (use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
>> (use_register_for_parm_decl): Wrapper for the above to
>> special-case the result_ptr.
>> (rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
>> (maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
>> multiple locations.
>> (assign_parm_adjust_stack_rtl): Add all and parm arguments,
>> for rtl_for_parm. For SSA-assigned parms, zero stack_parm.
>> (assign_parm_setup_block): Prefer SSA-assigned location.
>> (assign_parm_setup_reg): Likewise. Use entry_parm for equiv
>> if stack_parm is NULL.
>> (assign_parm_setup_stack): Prefer SSA-assigned location.
>> (assign_parms): Maybe reset DECL_RTL of params. Adjust stack
>> rtl before testing for pointer bounds. Special-case result_ptr.
>> (expand_function_start): Maybe reset DECL_RTL of result.
>> Prefer SSA-assigned location for result and static chain.
>> Factor out DECL_RESULT and SET_DECL_RTL.
>> * tree-outof-ssa.c (insert_value_copy_on_edge): Handle
>> anonymous SSA names. Use promote_ssa_mode.
>> (get_temp_reg): Likewise.
>> (remove_ssa_form): Adjust.
>> * var-tracking.c (dataflow_set_clear_at_call): Take call_insn
>> and get its reg_usage for reg invalidation.
>> (compute_bb_dataflow): Pass it insn.
>> (emit_notes_in_bb): Likewise.
>> * tree-ssa-loop-niter.c (loop_exits_before_overflow): Don't
>> fail assert on conversion between unsigned types.
>>
Hi,
This patch causes a GCC build failure when configured for target
armeb-linux-gnueabihf with --with-mode=arm --with-cpu=cortex-a9
--with-fpu=neon.
The failure happens during the libgcc compilation.
Here are the failing command and the backtrace I get:
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/builds/gcc-fsf-trunk/obj-armeb-none-linux-gnueabihf/gcc1/./gcc/xgcc
-B/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/builds/gcc-fsf-trunk/obj-armeb-none-linux-gnueabihf/gcc1/./gcc/
-B/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/builds/gcc-fsf-trunk/tools/armeb-none-linux-gnueabihf/bin/
-B/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/builds/gcc-fsf-trunk/tools/armeb-none-linux-gnueabihf/lib/
-isystem /media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/builds/gcc-fsf-trunk/tools/armeb-none-linux-gnueabihf/include
-isystem /media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/builds/gcc-fsf-trunk/tools/armeb-none-linux-gnueabihf/sys-include
-g -O2 -O2 -g -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -W -Wall
-Wno-narrowing -Wwrite-strings -Wcast-qual -Wstrict-prototypes
-Wmissing-prototypes -Wold-style-definition -isystem ./include
-fPIC -fno-inline -g -DIN_LIBGCC2 -fbuilding-libgcc
-fno-stack-protector -Dinhibit_libc -fPIC -fno-inline -I. -I.
-I../.././gcc -I/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc
-I/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/.
-I/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/../gcc
-I/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/../include
-DHAVE_CC_TLS -o _addQQ.o -MT _addQQ.o -MD -MP -MF _addQQ.dep
-DL_add -DQQ_MODE -c
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/fixed-bit.c
-fvisibility=hidden -DHIDE_EXPORTS
In file included from
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/fixed-bit.c:55:0:
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/fixed-bit.c:
In function '__gnu_addqq3':
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/fixed-bit.h:450:31:
internal compiler error: RTL flag check: MEM_VOLATILE_P used with
unexpected rtx code 'reg' in set_mem_attributes_minus_bitpos, at
emit-rtl.c:1787
#define FIXED_OP(OP,MODE,NUM) __gnu_ ## OP ## MODE ## NUM
^
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/fixed-bit.h:460:30:
note: in expansion of macro 'FIXED_OP'
#define FIXED_ADD_TEMP(NAME) FIXED_OP(add,NAME,3)
^
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/fixed-bit.h:492:19:
note: in expansion of macro 'FIXED_ADD_TEMP'
#define FIXED_ADD FIXED_ADD_TEMP(MODE_NAME_S)
^
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/fixed-bit.c:59:1:
note: in expansion of macro 'FIXED_ADD'
FIXED_ADD (FIXED_C_TYPE a, FIXED_C_TYPE b)
^
0xa6eb52 rtl_check_failed_flag(char const*, rtx_def const*, char
const*, int, char const*)
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/gcc/rtl.c:800
0x771fc7 set_mem_attributes_minus_bitpos(rtx_def*, tree_node*, int, long)
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/gcc/emit-rtl.c:1787
0x805294 assign_parm_setup_block
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/gcc/function.c:2977
0x80b65c assign_parms
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/gcc/function.c:3775
0x80e087 expand_function_start(tree_node*)
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/gcc/function.c:5215
0x6a77ed execute
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/gcc/cfgexpand.c:6127
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/fixed-obj.mk:27:
recipe for target '_addQQ.o' failed
make[2]: *** [_addQQ.o] Error 1
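For readers less familiar with RTL flag checking, here is a toy model of
what the ICE above reports: per the message and backtrace, a MEM-only flag
accessor (MEM_VOLATILE_P) was applied to an rtx whose code is 'reg', in
set_mem_attributes_minus_bitpos called from assign_parm_setup_block. This
is only a sketch: every name below (toy_rtx, toy_mem_volatile_flag,
toy_set_mem_volatile) is hypothetical and none of it is GCC's actual code.
The toy accessor returns NULL where the real checker aborts with an ICE.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for rtx codes; only the two relevant to the ICE.  */
enum toy_code { TOY_REG, TOY_MEM };

/* Hypothetical, heavily simplified rtx: a code plus one flag bit.  */
struct toy_rtx
{
  enum toy_code code;
  bool volatil;
};

/* Checked accessor in the spirit of a flag-checked MEM_VOLATILE_P:
   hand out the flag only when X is a MEM.  Returns NULL otherwise
   (an assumption made for this sketch; the real check aborts).  */
static bool *
toy_mem_volatile_flag (struct toy_rtx *x)
{
  if (x->code != TOY_MEM)
    return NULL;
  return &x->volatil;
}

/* Caller in the spirit of a MEM-attribute setter.  It guards on the
   accessor's result, so a REG is rejected instead of being modified.
   Returns true when the flag was set, false when X is not a MEM.  */
static bool
toy_set_mem_volatile (struct toy_rtx *x, bool is_volatile)
{
  bool *flag = toy_mem_volatile_flag (x);
  if (!flag)
    return false;
  *flag = is_volatile;
  return true;
}
```

Calling toy_set_mem_volatile on a TOY_REG leaves the object untouched and
returns false, while the same call on a TOY_MEM sets the flag. Whether the
right fix for the failure above is a guard of this kind or a change in the
location chosen for the parm is not something this sketch tries to answer.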
>> for gcc/testsuite/ChangeLog
>>
>> * gcc.dg/guality/pr54200.c: Add -fno-tree-coalesce-vars.
>> * gcc.dg/ssp-1.c: Make counter a register.
>> * gcc.dg/ssp-2.c: Likewise.
>> * gcc.dg/torture/parm-coalesce.c: New.
>> ---
>> gcc/Makefile.in | 1
>> gcc/alias.c | 13 +
>> gcc/cfgexpand.c | 370 ++++++++++++++-----
>> gcc/cfgexpand.h | 2
>> gcc/common.opt | 12 -
>> gcc/doc/invoke.texi | 48 +--
>> gcc/emit-rtl.c | 5
>> gcc/explow.c | 22 +
>> gcc/explow.h | 3
>> gcc/expr.c | 39 +-
>> gcc/function.c | 226 +++++++++---
>> gcc/gimple-expr.c | 39 --
>> gcc/gimple-expr.h | 1
>> gcc/opts.c | 2
>> gcc/passes.def | 5
>> gcc/testsuite/gcc.dg/guality/pr54200.c | 2
>> gcc/testsuite/gcc.dg/ssp-1.c | 2
>> gcc/testsuite/gcc.dg/ssp-2.c | 2
>> gcc/testsuite/gcc.dg/torture/parm-coalesce.c | 40 ++
>> gcc/tree-outof-ssa.c | 16 -
>> gcc/tree-ssa-coalesce.c | 380 +++++++++++++++++++-
>> gcc/tree-ssa-coalesce.h | 1
>> gcc/tree-ssa-copyrename.c | 499 --------------------------
>> gcc/tree-ssa-live.c | 101 -----
>> gcc/tree-ssa-live.h | 4
>> gcc/tree-ssa-loop-niter.c | 6
>> gcc/tree-ssa-uncprop.c | 5
>> gcc/var-tracking.c | 12 -
>> 28 files changed, 984 insertions(+), 874 deletions(-)
>> create mode 100644 gcc/testsuite/gcc.dg/torture/parm-coalesce.c
>> delete mode 100644 gcc/tree-ssa-copyrename.c
>>
>> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
>> index 3d14938..2a03223 100644
>> --- a/gcc/Makefile.in
>> +++ b/gcc/Makefile.in
>> @@ -1441,7 +1441,6 @@ OBJS = \
>> tree-ssa-ccp.o \
>> tree-ssa-coalesce.o \
>> tree-ssa-copy.o \
>> - tree-ssa-copyrename.o \
>> tree-ssa-dce.o \
>> tree-ssa-dom.o \
>> tree-ssa-dse.o \
>> diff --git a/gcc/alias.c b/gcc/alias.c
>> index ea539c5..5a031d9 100644
>> --- a/gcc/alias.c
>> +++ b/gcc/alias.c
>> @@ -2552,6 +2552,19 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
>> if (! DECL_P (exprx) || ! DECL_P (expry))
>> return 0;
>>
>> + /* If we refer to different gimple registers, or one gimple register
>> + and one non-gimple-register, we know they can't overlap. First,
>> + gimple registers don't have their addresses taken. Now, there
>> + could be more than one stack slot for (different versions of) the
>> + same gimple register, but we can presumably tell they don't
>> + overlap based on offsets from stack base addresses elsewhere.
>> + It's important that we don't proceed to DECL_RTL, because gimple
>> + registers may not pass DECL_RTL_SET_P, and make_decl_rtl won't be
>> + able to do anything about them since no SSA information will have
>> + remained to guide it. */
>> + if (is_gimple_reg (exprx) || is_gimple_reg (expry))
>> + return exprx != expry;
>> +
>> /* With invalid code we can end up storing into the constant pool.
>> Bail out to avoid ICEing when creating RTL for this.
>> See gfortran.dg/lto/20091028-2_0.f90. */
>> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
>> index b190f91..bf972fc 100644
>> --- a/gcc/cfgexpand.c
>> +++ b/gcc/cfgexpand.c
>> @@ -179,21 +179,121 @@ gimple_assign_rhs_to_tree (gimple stmt)
>>
>> #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
>>
>> +/* Choose either CUR or NEXT as the leader DECL for a partition.
>> + Prefer ignored decls, to simplify debug dumps and reduce ambiguity
>> + out of the same user variable being in multiple partitions (this is
>> + less likely for compiler-introduced temps). */
>> +
>> +static tree
>> +leader_merge (tree cur, tree next)
>> +{
>> + if (cur == NULL || cur == next)
>> + return next;
>> +
>> + if (DECL_P (cur) && DECL_IGNORED_P (cur))
>> + return cur;
>> +
>> + if (DECL_P (next) && DECL_IGNORED_P (next))
>> + return next;
>> +
>> + return cur;
>> +}
>> +
>> +
>> +/* Return the RTL for the default SSA def of a PARM or RESULT, if
>> + there is one. */
>> +
>> +rtx
>> +get_rtl_for_parm_ssa_default_def (tree var)
>> +{
>> + gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL);
>> +
>> + if (!is_gimple_reg (var))
>> + return NULL_RTX;
>> +
>> + /* If we've already determined RTL for the decl, use it. This is
>> + not just an optimization: if VAR is a PARM whose incoming value
>> + is unused, we won't find a default def to use its partition, but
>> + we still want to use the location of the parm, if it was used at
>> + all. During assign_parms, until a location is assigned for the
>> + VAR, RTL can only be set for a parm or result if we're not coalescing
>> + across variables, when we know we're coalescing all SSA_NAMEs of
>> + each parm or result, and we're not coalescing them with names
>> + pertaining to other variables, such as other parms' default
>> + defs. */
>> + if (DECL_RTL_SET_P (var))
>> + {
>> + gcc_assert (DECL_RTL (var) != pc_rtx);
>> + return DECL_RTL (var);
>> + }
>> +
>> + tree name = ssa_default_def (cfun, var);
>> +
>> + if (!name)
>> + return NULL_RTX;
>> +
>> + int part = var_to_partition (SA.map, name);
>> + if (part == NO_PARTITION)
>> + return NULL_RTX;
>> +
>> + return SA.partition_to_pseudo[part];
>> +}
>> +
>> /* Associate declaration T with storage space X. If T is no
>> SSA name this is exactly SET_DECL_RTL, otherwise make the
>> partition of T associated with X. */
>> static inline void
>> set_rtl (tree t, rtx x)
>> {
>> + if (x && SSAVAR (t))
>> + {
>> + bool skip = false;
>> + tree cur = NULL_TREE;
>> +
>> + if (MEM_P (x))
>> + cur = MEM_EXPR (x);
>> + else if (REG_P (x))
>> + cur = REG_EXPR (x);
>> + else if (GET_CODE (x) == CONCAT
>> + && REG_P (XEXP (x, 0)))
>> + cur = REG_EXPR (XEXP (x, 0));
>> + else if (GET_CODE (x) == PARALLEL)
>> + cur = REG_EXPR (XVECEXP (x, 0, 0));
>> + else if (x == pc_rtx)
>> + skip = true;
>> + else
>> + gcc_unreachable ();
>> +
>> + tree next = skip ? cur : leader_merge (cur, SSAVAR (t));
>> +
>> + if (cur != next)
>> + {
>> + if (MEM_P (x))
>> + set_mem_attributes (x, next, true);
>> + else
>> + set_reg_attrs_for_decl_rtl (next, x);
>> + }
>> + }
>> +
>> if (TREE_CODE (t) == SSA_NAME)
>> {
>> - SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x;
>> - if (x && !MEM_P (x))
>> - set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x);
>> - /* For the benefit of debug information at -O0 (where vartracking
>> - doesn't run) record the place also in the base DECL if it's
>> - a normal variable (not a parameter). */
>> - if (x && x != pc_rtx && TREE_CODE (SSA_NAME_VAR (t)) == VAR_DECL)
>> + int part = var_to_partition (SA.map, t);
>> + if (part != NO_PARTITION)
>> + {
>> + if (SA.partition_to_pseudo[part])
>> + gcc_assert (SA.partition_to_pseudo[part] == x);
>> + else
>> + SA.partition_to_pseudo[part] = x;
>> + }
>> + /* For the benefit of debug information at -O0 (where
>> + vartracking doesn't run) record the place also in the base
>> + DECL. For PARMs and RESULTs, we may end up resetting these
>> + in function.c:maybe_reset_rtl_for_parm, but in some rare
>> + cases we may need them (unused and overwritten incoming
>> + value, that at -O0 must share the location with the other
>> + uses in spite of the missing default def), and this may be
>> + the only chance to preserve them. */
>> + if (x && x != pc_rtx && SSA_NAME_VAR (t))
>> {
>> tree var = SSA_NAME_VAR (t);
>> /* If we don't yet have something recorded, just record it now. */
>> @@ -909,7 +1009,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
>> gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
>>
>> x = plus_constant (Pmode, base, offset);
>> - x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x);
>> + x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
>> + ? TYPE_MODE (TREE_TYPE (decl))
>> + : DECL_MODE (SSAVAR (decl)), x);
>>
>> if (TREE_CODE (decl) != SSA_NAME)
>> {
>> @@ -931,7 +1033,6 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
>> DECL_USER_ALIGN (decl) = 0;
>> }
>>
>> - set_mem_attributes (x, SSAVAR (decl), true);
>> set_rtl (decl, x);
>> }
>>
>> @@ -1146,13 +1247,22 @@ account_stack_vars (void)
>> to a variable to be allocated in the stack frame. */
>>
>> static void
>> -expand_one_stack_var (tree var)
>> +expand_one_stack_var_1 (tree var)
>> {
>> HOST_WIDE_INT size, offset;
>> unsigned byte_align;
>>
>> - size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
>> - byte_align = align_local_variable (SSAVAR (var));
>> + if (TREE_CODE (var) == SSA_NAME)
>> + {
>> + tree type = TREE_TYPE (var);
>> + size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
>> + byte_align = TYPE_ALIGN_UNIT (type);
>> + }
>> + else
>> + {
>> + size = tree_to_uhwi (DECL_SIZE_UNIT (var));
>> + byte_align = align_local_variable (var);
>> + }
>>
>> /* We handle highly aligned variables in expand_stack_vars. */
>> gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
>> @@ -1163,6 +1273,27 @@ expand_one_stack_var (tree var)
>> crtl->max_used_stack_slot_alignment, offset);
>> }
>>
>> +/* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are
>> + already assigned some MEM. */
>> +
>> +static void
>> +expand_one_stack_var (tree var)
>> +{
>> + if (TREE_CODE (var) == SSA_NAME)
>> + {
>> + int part = var_to_partition (SA.map, var);
>> + if (part != NO_PARTITION)
>> + {
>> + rtx x = SA.partition_to_pseudo[part];
>> + gcc_assert (x);
>> + gcc_assert (MEM_P (x));
>> + return;
>> + }
>> + }
>> +
>> + return expand_one_stack_var_1 (var);
>> +}
>> +
>> /* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL
>> that will reside in a hard register. */
>>
>> @@ -1172,13 +1303,114 @@ expand_one_hard_reg_var (tree var)
>> rest_of_decl_compilation (var, 0, 0);
>> }
>>
>> +/* Record the alignment requirements of some variable assigned to a
>> + pseudo. */
>> +
>> +static void
>> +record_alignment_for_reg_var (unsigned int align)
>> +{
>> + if (SUPPORTS_STACK_ALIGNMENT
>> + && crtl->stack_alignment_estimated < align)
>> + {
>> + /* stack_alignment_estimated shouldn't change after stack
>> + realign decision made */
>> + gcc_assert (!crtl->stack_realign_processed);
>> + crtl->stack_alignment_estimated = align;
>> + }
>> +
>> + /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
>> + So here we only make sure stack_alignment_needed >= align. */
>> + if (crtl->stack_alignment_needed < align)
>> + crtl->stack_alignment_needed = align;
>> + if (crtl->max_used_stack_slot_alignment < align)
>> + crtl->max_used_stack_slot_alignment = align;
>> +}
>> +
>> +/* Create RTL for an SSA partition. */
>> +
>> +static void
>> +expand_one_ssa_partition (tree var)
>> +{
>> + int part = var_to_partition (SA.map, var);
>> + gcc_assert (part != NO_PARTITION);
>> +
>> + if (SA.partition_to_pseudo[part])
>> + return;
>> +
>> + if (!use_register_for_decl (var))
>> + {
>> + expand_one_stack_var_1 (var);
>> + return;
>> + }
>> +
>> + unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (var),
>> + TYPE_MODE (TREE_TYPE (var)),
>> + TYPE_ALIGN (TREE_TYPE (var)));
>> +
>> + /* If the variable alignment is very large we'll dynamically allocate
>> + it, which means that in-frame portion is just a pointer. */
>> + if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
>> + align = POINTER_SIZE;
>> +
>> + record_alignment_for_reg_var (align);
>> +
>> + machine_mode reg_mode = promote_ssa_mode (var, NULL);
>> +
>> + rtx x = gen_reg_rtx (reg_mode);
>> +
>> + set_rtl (var, x);
>> +}
>> +
>> +/* Record the association between the RTL generated for a partition
>> + and the underlying variable of the SSA_NAME. */
>> +
>> +static void
>> +adjust_one_expanded_partition_var (tree var)
>> +{
>> + if (!var)
>> + return;
>> +
>> + tree decl = SSA_NAME_VAR (var);
>> +
>> + int part = var_to_partition (SA.map, var);
>> + if (part == NO_PARTITION)
>> + return;
>> +
>> + rtx x = SA.partition_to_pseudo[part];
>> +
>> + set_rtl (var, x);
>> +
>> + if (!REG_P (x))
>> + return;
>> +
>> + /* Note if the object is a user variable. */
>> + if (decl && !DECL_ARTIFICIAL (decl))
>> + mark_user_reg (x);
>> +
>> + if (POINTER_TYPE_P (decl ? TREE_TYPE (decl) : TREE_TYPE (var)))
>> + mark_reg_pointer (x, get_pointer_alignment (var));
>> +}
>> +
>> /* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL
>> that will reside in a pseudo register. */
>>
>> static void
>> expand_one_register_var (tree var)
>> {
>> - tree decl = SSAVAR (var);
>> + if (TREE_CODE (var) == SSA_NAME)
>> + {
>> + int part = var_to_partition (SA.map, var);
>> + if (part != NO_PARTITION)
>> + {
>> + rtx x = SA.partition_to_pseudo[part];
>> + gcc_assert (x);
>> + gcc_assert (REG_P (x));
>> + return;
>> + }
>> + gcc_unreachable ();
>> + }
>> +
>> + tree decl = var;
>> tree type = TREE_TYPE (decl);
>> machine_mode reg_mode = promote_decl_mode (decl, NULL);
>> rtx x = gen_reg_rtx (reg_mode);
>> @@ -1312,21 +1544,7 @@ expand_one_var (tree var, bool toplevel, bool really_expand)
>> align = POINTER_SIZE;
>> }
>>
>> - if (SUPPORTS_STACK_ALIGNMENT
>> - && crtl->stack_alignment_estimated < align)
>> - {
>> - /* stack_alignment_estimated shouldn't change after stack
>> - realign decision made */
>> - gcc_assert (!crtl->stack_realign_processed);
>> - crtl->stack_alignment_estimated = align;
>> - }
>> -
>> - /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
>> - So here we only make sure stack_alignment_needed >= align. */
>> - if (crtl->stack_alignment_needed < align)
>> - crtl->stack_alignment_needed = align;
>> - if (crtl->max_used_stack_slot_alignment < align)
>> - crtl->max_used_stack_slot_alignment = align;
>> + record_alignment_for_reg_var (align);
>>
>> if (TREE_CODE (origvar) == SSA_NAME)
>> {
>> @@ -1760,48 +1978,18 @@ expand_used_vars (void)
>> if (targetm.use_pseudo_pic_reg ())
>> pic_offset_table_rtx = gen_reg_rtx (Pmode);
>>
>> - hash_map<tree, tree> ssa_name_decls;
>> for (i = 0; i < SA.map->num_partitions; i++)
>> {
>> tree var = partition_to_var (SA.map, i);
>>
>> gcc_assert (!virtual_operand_p (var));
>>
>> - /* Assign decls to each SSA name partition, share decls for partitions
>> - we could have coalesced (those with the same type). */
>> - if (SSA_NAME_VAR (var) == NULL_TREE)
>> - {
>> - tree *slot = &ssa_name_decls.get_or_insert (TREE_TYPE (var));
>> - if (!*slot)
>> - *slot = create_tmp_reg (TREE_TYPE (var));
>> - replace_ssa_name_symbol (var, *slot);
>> - }
>> -
>> - /* Always allocate space for partitions based on VAR_DECLs. But for
>> - those based on PARM_DECLs or RESULT_DECLs and which matter for the
>> - debug info, there is no need to do so if optimization is disabled
>> - because all the SSA_NAMEs based on these DECLs have been coalesced
>> - into a single partition, which is thus assigned the canonical RTL
>> - location of the DECLs. If in_lto_p, we can't rely on optimize,
>> - a function could be compiled with -O1 -flto first and only the
>> - link performed at -O0. */
>> - if (TREE_CODE (SSA_NAME_VAR (var)) == VAR_DECL)
>> - expand_one_var (var, true, true);
>> - else if (DECL_IGNORED_P (SSA_NAME_VAR (var)) || optimize || in_lto_p)
>> - {
>> - /* This is a PARM_DECL or RESULT_DECL. For those partitions that
>> - contain the default def (representing the parm or result itself)
>> - we don't do anything here. But those which don't contain the
>> - default def (representing a temporary based on the parm/result)
>> - we need to allocate space just like for normal VAR_DECLs. */
>> - if (!bitmap_bit_p (SA.partition_has_default_def, i))
>> - {
>> - expand_one_var (var, true, true);
>> - gcc_assert (SA.partition_to_pseudo[i]);
>> - }
>> - }
>> + expand_one_ssa_partition (var);
>> }
>>
>> + for (i = 1; i < num_ssa_names; i++)
>> + adjust_one_expanded_partition_var (ssa_name (i));
>> +
>> if (flag_stack_protect == SPCT_FLAG_STRONG)
>> gen_stack_protect_signal
>> = stack_protect_decl_p () || stack_protect_return_slot_p ();
>> @@ -5961,35 +6149,6 @@ pass_expand::execute (function *fun)
>> parm_birth_insn = var_seq;
>> }
>>
>> - /* Now that we also have the parameter RTXs, copy them over to our
>> - partitions. */
>> - for (i = 0; i < SA.map->num_partitions; i++)
>> - {
>> - tree var = SSA_NAME_VAR (partition_to_var (SA.map, i));
>> -
>> - if (TREE_CODE (var) != VAR_DECL
>> - && !SA.partition_to_pseudo[i])
>> - SA.partition_to_pseudo[i] = DECL_RTL_IF_SET (var);
>> - gcc_assert (SA.partition_to_pseudo[i]);
>> -
>> - /* If this decl was marked as living in multiple places, reset
>> - this now to NULL. */
>> - if (DECL_RTL_IF_SET (var) == pc_rtx)
>> - SET_DECL_RTL (var, NULL);
>> -
>> - /* Some RTL parts really want to look at DECL_RTL(x) when x
>> - was a decl marked in REG_ATTR or MEM_ATTR. We could use
>> - SET_DECL_RTL here making this available, but that would mean
>> - to select one of the potentially many RTLs for one DECL. Instead
>> - of doing that we simply reset the MEM_EXPR of the RTL in question,
>> - then nobody can get at it and hence nobody can call DECL_RTL on it. */
>> - if (!DECL_RTL_SET_P (var))
>> - {
>> - if (MEM_P (SA.partition_to_pseudo[i]))
>> - set_mem_expr (SA.partition_to_pseudo[i], NULL);
>> - }
>> - }
>> -
>> /* If we have a class containing differently aligned pointers
>> we need to merge those into the corresponding RTL pointer
>> alignment. */
>> @@ -5997,7 +6156,6 @@ pass_expand::execute (function *fun)
>> {
>> tree name = ssa_name (i);
>> int part;
>> - rtx r;
>>
>> if (!name
>> /* We might have generated new SSA names in
>> @@ -6010,20 +6168,24 @@ pass_expand::execute (function *fun)
>> if (part == NO_PARTITION)
>> continue;
>>
>> - /* Adjust all partition members to get the underlying decl of
>> - the representative which we might have created in expand_one_var. */
>> - if (SSA_NAME_VAR (name) == NULL_TREE)
>> + gcc_assert (SA.partition_to_pseudo[part]);
>> +
>> + /* If this decl was marked as living in multiple places, reset
>> + this now to NULL. */
>> + tree var = SSA_NAME_VAR (name);
>> + if (var && DECL_RTL_IF_SET (var) == pc_rtx)
>> + SET_DECL_RTL (var, NULL);
>> + /* Check that the pseudos chosen by assign_parms are those of
>> + the corresponding default defs. */
>> + else if (SSA_NAME_IS_DEFAULT_DEF (name)
>> + && (TREE_CODE (var) == PARM_DECL
>> + || TREE_CODE (var) == RESULT_DECL))
>> {
>> - tree leader = partition_to_var (SA.map, part);
>> - gcc_assert (SSA_NAME_VAR (leader) != NULL_TREE);
>> - replace_ssa_name_symbol (name, SSA_NAME_VAR (leader));
>> + rtx in = DECL_RTL_IF_SET (var);
>> + gcc_assert (in);
>> + rtx out = SA.partition_to_pseudo[part];
>> + gcc_assert (in == out || rtx_equal_p (in, out));
>> }
>> - if (!POINTER_TYPE_P (TREE_TYPE (name)))
>> - continue;
>> -
>> - r = SA.partition_to_pseudo[part];
>> - if (REG_P (r))
>> - mark_reg_pointer (r, get_pointer_alignment (name));
>> }
>>
>> /* If this function is `main', emit a call to `__main'
>> diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
>> index a0b6e3e..602579d 100644
>> --- a/gcc/cfgexpand.h
>> +++ b/gcc/cfgexpand.h
>> @@ -22,5 +22,7 @@ along with GCC; see the file COPYING3. If not see
>>
>> extern tree gimple_assign_rhs_to_tree (gimple);
>> extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
>> +extern rtx get_rtl_for_parm_ssa_default_def (tree var);
>> +
>>
>> #endif /* GCC_CFGEXPAND_H */
>> diff --git a/gcc/common.opt b/gcc/common.opt
>> index 32b416a..051f824 100644
>> --- a/gcc/common.opt
>> +++ b/gcc/common.opt
>> @@ -2227,16 +2227,16 @@ Common Report Var(flag_tree_ch) Optimization
>> Enable loop header copying on trees
>>
>> ftree-coalesce-inlined-vars
>> -Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization
>> -Enable coalescing of copy-related user variables that are inlined
>> +Common Ignore RejectNegative
>> +Does nothing. Preserved for backward compatibility.
>>
>> ftree-coalesce-vars
>> -Common Report Var(flag_ssa_coalesce_vars,2) Optimization
>> -Enable coalescing of all copy-related user variables
>> +Common Report Var(flag_tree_coalesce_vars) Optimization
>> +Enable SSA coalescing of user variables
>>
>> ftree-copyrename
>> -Common Report Var(flag_tree_copyrename) Optimization
>> -Replace SSA temporaries with better names in copies
>> +Common Ignore
>> +Does nothing. Preserved for backward compatibility.
>>
>> ftree-copy-prop
>> Common Report Var(flag_tree_copy_prop) Optimization
>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> index e25bd62..e359be2 100644
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -339,7 +339,6 @@ Objective-C and Objective-C++ Dialects}.
>> -fdump-tree-phiprop@r{[}-@var{n}@r{]} @gol
>> -fdump-tree-phiopt@r{[}-@var{n}@r{]} @gol
>> -fdump-tree-forwprop@r{[}-@var{n}@r{]} @gol
>> --fdump-tree-copyrename@r{[}-@var{n}@r{]} @gol
>> -fdump-tree-nrv -fdump-tree-vect @gol
>> -fdump-tree-sink @gol
>> -fdump-tree-sra@r{[}-@var{n}@r{]} @gol
>> @@ -445,9 +444,8 @@ Objective-C and Objective-C++ Dialects}.
>> -fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol
>> -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
>> -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
>> --ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
>> --ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
>> --ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
>> +-ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol
>> +-ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
>> -ftree-loop-if-convert-stores -ftree-loop-im @gol
>> -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
>> -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
>> @@ -7076,11 +7074,6 @@ name is made by appending @file{.phiopt} to the source file name.
>> Dump each function after forward propagating single use variables. The file
>> name is made by appending @file{.forwprop} to the source file name.
>>
>> -@item copyrename
>> -@opindex fdump-tree-copyrename
>> -Dump each function after applying the copy rename optimization. The file
>> -name is made by appending @file{.copyrename} to the source file name.
>> -
>> @item nrv
>> @opindex fdump-tree-nrv
>> Dump each function after applying the named return value optimization on
>> @@ -7545,8 +7538,8 @@ compilation time.
>> -ftree-ccp @gol
>> -fssa-phiopt @gol
>> -ftree-ch @gol
>> +-ftree-coalesce-vars @gol
>> -ftree-copy-prop @gol
>> --ftree-copyrename @gol
>> -ftree-dce @gol
>> -ftree-dominator-opts @gol
>> -ftree-dse @gol
>> @@ -8815,6 +8808,15 @@ profitable to parallelize the loops.
>> Compare the results of several data dependence analyzers. This option
>> is used for debugging the data dependence analyzers.
>>
>> +@item -ftree-coalesce-vars
>> +@opindex ftree-coalesce-vars
>> +Tell the compiler to attempt to combine small user-defined variables
>> +too, instead of just compiler temporaries. This may severely limit the
>> +ability to debug an optimized program compiled with
>> +@option{-fno-var-tracking-assignments}. In the negated form, this flag
>> +prevents SSA coalescing of user variables. This option is enabled by
>> +default if optimization is enabled.
>> +
>> @item -ftree-loop-if-convert
>> @opindex ftree-loop-if-convert
>> Attempt to transform conditional jumps in the innermost loops to
>> @@ -8928,32 +8930,6 @@ Perform scalar replacement of aggregates. This pass replaces structure
>> references with scalars to prevent committing structures to memory too
>> early. This flag is enabled by default at @option{-O} and higher.
>>
>> -@item -ftree-copyrename
>> -@opindex ftree-copyrename
>> -Perform copy renaming on trees. This pass attempts to rename compiler
>> -temporaries to other variables at copy locations, usually resulting in
>> -variable names which more closely resemble the original variables. This flag
>> -is enabled by default at @option{-O} and higher.
>> -
>> -@item -ftree-coalesce-inlined-vars
>> -@opindex ftree-coalesce-inlined-vars
>> -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
>> -combine small user-defined variables too, but only if they are inlined
>> -from other functions. It is a more limited form of
>> -@option{-ftree-coalesce-vars}. This may harm debug information of such
>> -inlined variables, but it keeps variables of the inlined-into
>> -function apart from each other, such that they are more likely to
>> -contain the expected values in a debugging session.
>> -
>> -@item -ftree-coalesce-vars
>> -@opindex ftree-coalesce-vars
>> -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
>> -combine small user-defined variables too, instead of just compiler
>> -temporaries. This may severely limit the ability to debug an optimized
>> -program compiled with @option{-fno-var-tracking-assignments}. In the
>> -negated form, this flag prevents SSA coalescing of user variables,
>> -including inlined ones. This option is enabled by default.
>> -
>> @item -ftree-ter
>> @opindex ftree-ter
>> Perform temporary expression replacement during the SSA->normal phase. Single
>> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
>> index 49a1509..2b98946 100644
>> --- a/gcc/emit-rtl.c
>> +++ b/gcc/emit-rtl.c
>> @@ -1249,6 +1249,9 @@ set_reg_attrs_for_parm (rtx parm_rtx, rtx mem)
>> void
>> set_reg_attrs_for_decl_rtl (tree t, rtx x)
>> {
>> + if (!t)
>> + return;
>> + tree tdecl = t;
>> if (GET_CODE (x) == SUBREG)
>> {
>> gcc_assert (subreg_lowpart_p (x));
>> @@ -1257,7 +1260,7 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x)
>> if (REG_P (x))
>> REG_ATTRS (x)
>> = get_reg_attrs (t, byte_lowpart_offset (GET_MODE (x),
>> - DECL_MODE (t)));
>> + DECL_MODE (tdecl)));
>> if (GET_CODE (x) == CONCAT)
>> {
>> if (REG_P (XEXP (x, 0)))
>> diff --git a/gcc/explow.c b/gcc/explow.c
>> index 8745aea..5b0d49c 100644
>> --- a/gcc/explow.c
>> +++ b/gcc/explow.c
>> @@ -856,6 +856,28 @@ promote_decl_mode (const_tree decl, int *punsignedp)
>> return pmode;
>> }
>>
>> +/* Return the promoted mode for name. If it is a named SSA_NAME, it
>> + is the same as promote_decl_mode. Otherwise, it is the promoted
>> + mode of a temp decl of same type as the SSA_NAME, if we had created
>> + one. */
>> +
>> +machine_mode
>> +promote_ssa_mode (const_tree name, int *punsignedp)
>> +{
>> + gcc_assert (TREE_CODE (name) == SSA_NAME);
>> +
>> + tree type = TREE_TYPE (name);
>> + int unsignedp = TYPE_UNSIGNED (type);
>> + machine_mode mode = TYPE_MODE (type);
>> +
>> + machine_mode pmode = promote_mode (type, mode, &unsignedp);
>> + if (punsignedp)
>> + *punsignedp = unsignedp;
>> +
>> + return pmode;
>> +}
>> +
>> +
>>
>> /* Controls the behaviour of {anti_,}adjust_stack. */
>> static bool suppress_reg_args_size;
>> diff --git a/gcc/explow.h b/gcc/explow.h
>> index 94613de..52113db 100644
>> --- a/gcc/explow.h
>> +++ b/gcc/explow.h
>> @@ -57,6 +57,9 @@ extern machine_mode promote_mode (const_tree, machine_mode, int *);
>> /* Return mode and signedness to use when object is promoted. */
>> machine_mode promote_decl_mode (const_tree, int *);
>>
>> +/* Return mode and signedness to use when an SSA_NAME is promoted. */
>> +machine_mode promote_ssa_mode (const_tree, int *);
>> +
>> /* Remove some bytes from the stack. An rtx says how many. */
>> extern void adjust_stack (rtx);
>>
>> diff --git a/gcc/expr.c b/gcc/expr.c
>> index 5a931dc..5b6e16e 100644
>> --- a/gcc/expr.c
>> +++ b/gcc/expr.c
>> @@ -9301,7 +9301,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>> rtx op0, op1, temp, decl_rtl;
>> tree type;
>> int unsignedp;
>> - machine_mode mode;
>> + machine_mode mode, dmode;
>> enum tree_code code = TREE_CODE (exp);
>> rtx subtarget, original_target;
>> int ignore;
>> @@ -9432,7 +9432,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>> if (g == NULL
>> && modifier == EXPAND_INITIALIZER
>> && !SSA_NAME_IS_DEFAULT_DEF (exp)
>> - && (optimize || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
>> + && (optimize || !SSA_NAME_VAR (exp)
>> + || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
>> && stmt_is_replaceable_p (SSA_NAME_DEF_STMT (exp)))
>> g = SSA_NAME_DEF_STMT (exp);
>> if (g)
>> @@ -9511,15 +9512,18 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>> /* Ensure variable marked as used even if it doesn't go through
>> a parser. If it hasn't be used yet, write out an external
>> definition. */
>> - TREE_USED (exp) = 1;
>> + if (exp)
>> + TREE_USED (exp) = 1;
>>
>> /* Show we haven't gotten RTL for this yet. */
>> temp = 0;
>>
>> /* Variables inherited from containing functions should have
>> been lowered by this point. */
>> - context = decl_function_context (exp);
>> - gcc_assert (SCOPE_FILE_SCOPE_P (context)
>> + if (exp)
>> + context = decl_function_context (exp);
>> + gcc_assert (!exp
>> + || SCOPE_FILE_SCOPE_P (context)
>> || context == current_function_decl
>> || TREE_STATIC (exp)
>> || DECL_EXTERNAL (exp)
>> @@ -9543,7 +9547,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>> decl_rtl = use_anchored_address (decl_rtl);
>> if (modifier != EXPAND_CONST_ADDRESS
>> && modifier != EXPAND_SUM
>> - && !memory_address_addr_space_p (DECL_MODE (exp),
>> + && !memory_address_addr_space_p (exp ? DECL_MODE (exp)
>> + : GET_MODE (decl_rtl),
>> XEXP (decl_rtl, 0),
>> MEM_ADDR_SPACE (decl_rtl)))
>> temp = replace_equiv_address (decl_rtl,
>> @@ -9554,12 +9559,17 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>> if the address is a register. */
>> if (temp != 0)
>> {
>> - if (MEM_P (temp) && REG_P (XEXP (temp, 0)))
>> + if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0)))
>> mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp));
>>
>> return temp;
>> }
>>
>> + if (exp)
>> + dmode = DECL_MODE (exp);
>> + else
>> + dmode = TYPE_MODE (TREE_TYPE (ssa_name));
>> +
>> /* If the mode of DECL_RTL does not match that of the decl,
>> there are two cases: we are dealing with a BLKmode value
>> that is returned in a register, or we are dealing with
>> @@ -9567,22 +9577,23 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>> of the wanted mode, but mark it so that we know that it
>> was already extended. */
>> if (REG_P (decl_rtl)
>> - && DECL_MODE (exp) != BLKmode
>> - && GET_MODE (decl_rtl) != DECL_MODE (exp))
>> + && dmode != BLKmode
>> + && GET_MODE (decl_rtl) != dmode)
>> {
>> machine_mode pmode;
>>
>> /* Get the signedness to be used for this variable. Ensure we get
>> the same mode we got when the variable was declared. */
>> - if (code == SSA_NAME
>> - && (g = SSA_NAME_DEF_STMT (ssa_name))
>> - && gimple_code (g) == GIMPLE_CALL
>> - && !gimple_call_internal_p (g))
>> + if (code != SSA_NAME)
>> + pmode = promote_decl_mode (exp, &unsignedp);
>> + else if ((g = SSA_NAME_DEF_STMT (ssa_name))
>> + && gimple_code (g) == GIMPLE_CALL
>> + && !gimple_call_internal_p (g))
>> pmode = promote_function_mode (type, mode, &unsignedp,
>> gimple_call_fntype (g),
>> 2);
>> else
>> - pmode = promote_decl_mode (exp, &unsignedp);
>> + pmode = promote_ssa_mode (ssa_name, &unsignedp);
>> gcc_assert (GET_MODE (decl_rtl) == pmode);
>>
>> temp = gen_lowpart_SUBREG (mode, decl_rtl);
>> diff --git a/gcc/function.c b/gcc/function.c
>> index 7d2d7e4..58e2498 100644
>> --- a/gcc/function.c
>> +++ b/gcc/function.c
>> @@ -87,6 +87,7 @@ along with GCC; see the file COPYING3. If not see
>> #include "cfganal.h"
>> #include "cfgbuild.h"
>> #include "cfgcleanup.h"
>> +#include "cfgexpand.h"
>> #include "basic-block.h"
>> #include "df.h"
>> #include "params.h"
>> @@ -2121,6 +2122,30 @@ aggregate_value_p (const_tree exp, const_tree fntype)
>> bool
>> use_register_for_decl (const_tree decl)
>> {
>> + if (TREE_CODE (decl) == SSA_NAME)
>> + {
>> + /* We often try to use the SSA_NAME, instead of its underlying
>> + decl, to get type information and guide decisions, to avoid
>> + differences of behavior between anonymous and named
>> + variables, but in this one case we have to go for the actual
>> + variable if there is one. The main reason is that, at least
>> + at -O0, we want to place user variables on the stack, but we
>> + don't mind using pseudos for anonymous or ignored temps.
>> + Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
>> + should go in pseudos, whereas their corresponding variables
>> + might have to go on the stack. So, disregarding the decl
>> + here would negatively impact debug info at -O0, enable
>> + coalescing between SSA_NAMEs that ought to get different
>> + stack/pseudo assignments, and get the incoming argument
>> + processing thoroughly confused by PARM_DECLs expected to live
>> + in stack slots but assigned to pseudos. */
>> + if (!SSA_NAME_VAR (decl))
>> + return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
>> + && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
>> +
>> + decl = SSA_NAME_VAR (decl);
>> + }
>> +
>> if (!targetm.calls.allocate_stack_slots_for_args ())
>> return true;
>>
>> @@ -2804,23 +2829,88 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data)
>> data->entry_parm = entry_parm;
>> }
>>
>> +/* Wrapper for use_register_for_decl, that special-cases the
>> + .result_ptr as the function's RESULT_DECL when the RESULT_DECL is
>> + passed by reference. */
>> +
>> +static bool
>> +use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm)
>> +{
>> + if (parm == all->function_result_decl)
>> + {
>> + tree result = DECL_RESULT (current_function_decl);
>> +
>> + if (DECL_BY_REFERENCE (result))
>> + parm = result;
>> + }
>> +
>> + return use_register_for_decl (parm);
>> +}
>> +
>> +/* Wrapper for get_rtl_for_parm_ssa_default_def, that special-cases
>> + the .result_ptr as the function's RESULT_DECL when the RESULT_DECL
>> + is passed by reference. */
>> +
>> +static rtx
>> +rtl_for_parm (struct assign_parm_data_all *all, tree parm)
>> +{
>> + if (parm == all->function_result_decl)
>> + {
>> + tree result = DECL_RESULT (current_function_decl);
>> +
>> + if (!DECL_BY_REFERENCE (result))
>> + return NULL_RTX;
>> +
>> + parm = result;
>> + }
>> +
>> + return get_rtl_for_parm_ssa_default_def (parm);
>> +}
>> +
>> +/* Reset the location of PARM_DECLs and RESULT_DECLs that had
>> + SSA_NAMEs in multiple partitions, so that assign_parms will choose
>> + the default def, if it exists, or create new RTL to hold the unused
>> + entry value. If we are coalescing across variables, we want to
>> + reset the location too, because a parm without a default def
>> + (incoming value unused) might be coalesced with one with a default
>> + def, and then assign_parms would copy both incoming values to the
>> + same location, which might cause the wrong value to survive. */
>> +static void
>> +maybe_reset_rtl_for_parm (tree parm)
>> +{
>> + gcc_assert (TREE_CODE (parm) == PARM_DECL
>> + || TREE_CODE (parm) == RESULT_DECL);
>> + if ((flag_tree_coalesce_vars
>> + || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx))
>> + && is_gimple_reg (parm))
>> + SET_DECL_RTL (parm, NULL_RTX);
>> +}
>> +
>> /* A subroutine of assign_parms. Adjust DATA->STACK_RTL such that it's
>> always valid and properly aligned. */
>>
>> static void
>> -assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data)
>> +assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm,
>> + struct assign_parm_data_one *data)
>> {
>> rtx stack_parm = data->stack_parm;
>>
>> + /* If out-of-SSA assigned RTL to the parm default def, make sure we
>> + don't use what we might have computed before. */
>> + rtx ssa_assigned = rtl_for_parm (all, parm);
>> + if (ssa_assigned)
>> + stack_parm = NULL;
>> +
>> /* If we can't trust the parm stack slot to be aligned enough for its
>> ultimate type, don't use that slot after entry. We'll make another
>> stack slot, if we need one. */
>> - if (stack_parm
>> - && ((STRICT_ALIGNMENT
>> - && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm))
>> - || (data->nominal_type
>> - && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
>> - && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
>> + else if (stack_parm
>> + && ((STRICT_ALIGNMENT
>> + && (GET_MODE_ALIGNMENT (data->nominal_mode)
>> + > MEM_ALIGN (stack_parm)))
>> + || (data->nominal_type
>> + && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
>> + && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
>> stack_parm = NULL;
>>
>> /* If parm was passed in memory, and we need to convert it on entry,
>> @@ -2882,11 +2972,16 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>>
>> size = int_size_in_bytes (data->passed_type);
>> size_stored = CEIL_ROUND (size, UNITS_PER_WORD);
>> +
>> if (stack_parm == 0)
>> {
>> DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
>> - stack_parm = assign_stack_local (BLKmode, size_stored,
>> - DECL_ALIGN (parm));
>> + stack_parm = rtl_for_parm (all, parm);
>> + if (!stack_parm)
>> + stack_parm = assign_stack_local (BLKmode, size_stored,
>> + DECL_ALIGN (parm));
>> + else
>> + stack_parm = copy_rtx (stack_parm);
>> if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
>> PUT_MODE (stack_parm, GET_MODE (entry_parm));
>> set_mem_attributes (stack_parm, parm, 1);
>> @@ -3027,10 +3122,19 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>> = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
>> TREE_TYPE (current_function_decl), 2);
>>
>> - parmreg = gen_reg_rtx (promoted_nominal_mode);
>> + rtx from_expand = rtl_for_parm (all, parm);
>>
>> - if (!DECL_ARTIFICIAL (parm))
>> - mark_user_reg (parmreg);
>> + if (from_expand && !data->passed_pointer)
>> + {
>> + parmreg = from_expand;
>> + gcc_assert (GET_MODE (parmreg) == promoted_nominal_mode);
>> + }
>> + else
>> + {
>> + parmreg = gen_reg_rtx (promoted_nominal_mode);
>> + if (!DECL_ARTIFICIAL (parm))
>> + mark_user_reg (parmreg);
>> + }
>>
>> /* If this was an item that we received a pointer to,
>> set DECL_RTL appropriately. */
>> @@ -3049,6 +3153,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>> assign_parm_find_data_types and expand_expr_real_1. */
>>
>> equiv_stack_parm = data->stack_parm;
>> + if (!equiv_stack_parm)
>> + equiv_stack_parm = data->entry_parm;
>> validated_mem = validize_mem (copy_rtx (data->entry_parm));
>>
>> need_conversion = (data->nominal_mode != data->passed_mode
>> @@ -3189,11 +3295,17 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>>
>> /* If we were passed a pointer but the actual value can safely live
>> in a register, retrieve it and use it directly. */
>> - if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode)
>> + if (data->passed_pointer
>> + && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode))
>> {
>> /* We can't use nominal_mode, because it will have been set to
>> Pmode above. We must use the actual mode of the parm. */
>> - if (use_register_for_decl (parm))
>> + if (from_expand)
>> + {
>> + parmreg = from_expand;
>> + gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm)));
>> + }
>> + else if (use_register_for_decl (parm))
>> {
>> parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm)));
>> mark_user_reg (parmreg);
>> @@ -3233,7 +3345,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>>
>> /* STACK_PARM is the pointer, not the parm, and PARMREG is
>> now the parm. */
>> - data->stack_parm = NULL;
>> + data->stack_parm = equiv_stack_parm = NULL;
>> }
>>
>> /* Mark the register as eliminable if we did no conversion and it was
>> @@ -3243,11 +3355,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>> make here would screw up life analysis for it. */
>> if (data->nominal_mode == data->passed_mode
>> && !did_conversion
>> - && data->stack_parm != 0
>> - && MEM_P (data->stack_parm)
>> + && equiv_stack_parm != 0
>> + && MEM_P (equiv_stack_parm)
>> && data->locate.offset.var == 0
>> && reg_mentioned_p (virtual_incoming_args_rtx,
>> - XEXP (data->stack_parm, 0)))
>> + XEXP (equiv_stack_parm, 0)))
>> {
>> rtx_insn *linsn = get_last_insn ();
>> rtx_insn *sinsn;
>> @@ -3260,8 +3372,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
>> = GET_MODE_INNER (GET_MODE (parmreg));
>> int regnor = REGNO (XEXP (parmreg, 0));
>> int regnoi = REGNO (XEXP (parmreg, 1));
>> - rtx stackr = adjust_address_nv (data->stack_parm, submode, 0);
>> - rtx stacki = adjust_address_nv (data->stack_parm, submode,
>> + rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0);
>> + rtx stacki = adjust_address_nv (equiv_stack_parm, submode,
>> GET_MODE_SIZE (submode));
>>
>> /* Scan backwards for the set of the real and
>> @@ -3334,6 +3446,13 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
>>
>> if (data->stack_parm == 0)
>> {
>> + rtx x = data->stack_parm = rtl_for_parm (all, parm);
>> + if (x)
>> + gcc_assert (GET_MODE (x) == GET_MODE (data->entry_parm));
>> + }
>> +
>> + if (data->stack_parm == 0)
>> + {
>> int align = STACK_SLOT_ALIGNMENT (data->passed_type,
>> GET_MODE (data->entry_parm),
>> TYPE_ALIGN (data->passed_type));
>> @@ -3592,6 +3711,8 @@ assign_parms (tree fndecl)
>> DECL_INCOMING_RTL (parm) = DECL_RTL (parm);
>> continue;
>> }
>> + else
>> + maybe_reset_rtl_for_parm (parm);
>>
>> /* Estimate stack alignment from parameter alignment. */
>> if (SUPPORTS_STACK_ALIGNMENT)
>> @@ -3641,7 +3762,9 @@ assign_parms (tree fndecl)
>> else
>> set_decl_incoming_rtl (parm, data.entry_parm, false);
>>
>> - /* Boudns should be loaded in the particular order to
>> + assign_parm_adjust_stack_rtl (&all, parm, &data);
>> +
>> + /* Bounds should be loaded in the particular order to
>> have registers allocated correctly. Collect info about
>> input bounds and load them later. */
>> if (POINTER_BOUNDS_TYPE_P (data.passed_type))
>> @@ -3658,11 +3781,10 @@ assign_parms (tree fndecl)
>> }
>> else
>> {
>> - assign_parm_adjust_stack_rtl (&data);
>> -
>> if (assign_parm_setup_block_p (&data))
>> assign_parm_setup_block (&all, parm, &data);
>> - else if (data.passed_pointer || use_register_for_decl (parm))
>> + else if (data.passed_pointer
>> + || use_register_for_parm_decl (&all, parm))
>> assign_parm_setup_reg (&all, parm, &data);
>> else
>> assign_parm_setup_stack (&all, parm, &data);
>> @@ -5004,7 +5126,9 @@ expand_function_start (tree subr)
>> before any library calls that assign parms might generate. */
>>
>> /* Decide whether to return the value in memory or in a register. */
>> - if (aggregate_value_p (DECL_RESULT (subr), subr))
>> + tree res = DECL_RESULT (subr);
>> + maybe_reset_rtl_for_parm (res);
>> + if (aggregate_value_p (res, subr))
>> {
>> /* Returning something that won't go in a register. */
>> rtx value_address = 0;
>> @@ -5012,7 +5136,7 @@ expand_function_start (tree subr)
>> #ifdef PCC_STATIC_STRUCT_RETURN
>> if (cfun->returns_pcc_struct)
>> {
>> - int size = int_size_in_bytes (TREE_TYPE (DECL_RESULT (subr)));
>> + int size = int_size_in_bytes (TREE_TYPE (res));
>> value_address = assemble_static_space (size);
>> }
>> else
>> @@ -5024,36 +5148,45 @@ expand_function_start (tree subr)
>> it. */
>> if (sv)
>> {
>> - value_address = gen_reg_rtx (Pmode);
>> + if (DECL_BY_REFERENCE (res))
>> + value_address = get_rtl_for_parm_ssa_default_def (res);
>> + if (!value_address)
>> + value_address = gen_reg_rtx (Pmode);
>> emit_move_insn (value_address, sv);
>> }
>> }
>> if (value_address)
>> {
>> rtx x = value_address;
>> - if (!DECL_BY_REFERENCE (DECL_RESULT (subr)))
>> + if (!DECL_BY_REFERENCE (res))
>> {
>> - x = gen_rtx_MEM (DECL_MODE (DECL_RESULT (subr)), x);
>> - set_mem_attributes (x, DECL_RESULT (subr), 1);
>> + x = get_rtl_for_parm_ssa_default_def (res);
>> + if (!x)
>> + {
>> + x = gen_rtx_MEM (DECL_MODE (res), value_address);
>> + set_mem_attributes (x, res, 1);
>> + }
>> }
>> - SET_DECL_RTL (DECL_RESULT (subr), x);
>> + SET_DECL_RTL (res, x);
>> }
>> }
>> - else if (DECL_MODE (DECL_RESULT (subr)) == VOIDmode)
>> + else if (DECL_MODE (res) == VOIDmode)
>> /* If return mode is void, this decl rtl should not be used. */
>> - SET_DECL_RTL (DECL_RESULT (subr), NULL_RTX);
>> + SET_DECL_RTL (res, NULL_RTX);
>> else
>> {
>> /* Compute the return values into a pseudo reg, which we will copy
>> into the true return register after the cleanups are done. */
>> - tree return_type = TREE_TYPE (DECL_RESULT (subr));
>> - if (TYPE_MODE (return_type) != BLKmode
>> - && targetm.calls.return_in_msb (return_type))
>> + tree return_type = TREE_TYPE (res);
>> + rtx x = get_rtl_for_parm_ssa_default_def (res);
>> + if (x)
>> + /* Use it. */;
>> + else if (TYPE_MODE (return_type) != BLKmode
>> + && targetm.calls.return_in_msb (return_type))
>> /* expand_function_end will insert the appropriate padding in
>> this case. Use the return value's natural (unpadded) mode
>> within the function proper. */
>> - SET_DECL_RTL (DECL_RESULT (subr),
>> - gen_reg_rtx (TYPE_MODE (return_type)));
>> + x = gen_reg_rtx (TYPE_MODE (return_type));
>> else
>> {
>> /* In order to figure out what mode to use for the pseudo, we
>> @@ -5064,25 +5197,26 @@ expand_function_start (tree subr)
>> /* Structures that are returned in registers are not
>> aggregate_value_p, so we may see a PARALLEL or a REG. */
>> if (REG_P (hard_reg))
>> - SET_DECL_RTL (DECL_RESULT (subr),
>> - gen_reg_rtx (GET_MODE (hard_reg)));
>> + x = gen_reg_rtx (GET_MODE (hard_reg));
>> else
>> {
>> gcc_assert (GET_CODE (hard_reg) == PARALLEL);
>> - SET_DECL_RTL (DECL_RESULT (subr), gen_group_rtx (hard_reg));
>> + x = gen_group_rtx (hard_reg);
>> }
>> }
>>
>> + SET_DECL_RTL (res, x);
>> +
>> /* Set DECL_REGISTER flag so that expand_function_end will copy the
>> result to the real return register(s). */
>> - DECL_REGISTER (DECL_RESULT (subr)) = 1;
>> + DECL_REGISTER (res) = 1;
>>
>> if (chkp_function_instrumented_p (current_function_decl))
>> {
>> - tree return_type = TREE_TYPE (DECL_RESULT (subr));
>> + tree return_type = TREE_TYPE (res);
>> rtx bounds = targetm.calls.chkp_function_value_bounds (return_type,
>> subr, 1);
>> - SET_DECL_BOUNDS_RTL (DECL_RESULT (subr), bounds);
>> + SET_DECL_BOUNDS_RTL (res, bounds);
>> }
>> }
>>
>> @@ -5097,7 +5231,9 @@ expand_function_start (tree subr)
>> rtx local, chain;
>> rtx_insn *insn;
>>
>> - local = gen_reg_rtx (Pmode);
>> + local = get_rtl_for_parm_ssa_default_def (parm);
>> + if (!local)
>> + local = gen_reg_rtx (Pmode);
>> chain = targetm.calls.static_chain (current_function_decl, true);
>>
>> set_decl_incoming_rtl (parm, chain, false);
>> diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
>> index 4d683d6..d3d1c5f 100644
>> --- a/gcc/gimple-expr.c
>> +++ b/gcc/gimple-expr.c
>> @@ -391,45 +391,6 @@ copy_var_decl (tree var, tree name, tree type)
>> return copy;
>> }
>>
>> -/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
>> - coalescing together, false otherwise.
>> -
>> - This must stay consistent with var_map_base_init in tree-ssa-live.c. */
>> -
>> -bool
>> -gimple_can_coalesce_p (tree name1, tree name2)
>> -{
>> - /* First check the SSA_NAME's associated DECL. We only want to
>> - coalesce if they have the same DECL or both have no associated DECL. */
>> - tree var1 = SSA_NAME_VAR (name1);
>> - tree var2 = SSA_NAME_VAR (name2);
>> - var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
>> - var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
>> - if (var1 != var2)
>> - return false;
>> -
>> - /* Now check the types. If the types are the same, then we should
>> - try to coalesce V1 and V2. */
>> - tree t1 = TREE_TYPE (name1);
>> - tree t2 = TREE_TYPE (name2);
>> - if (t1 == t2)
>> - return true;
>> -
>> - /* If the types are not the same, check for a canonical type match. This
>> - (for example) allows coalescing when the types are fundamentally the
>> - same, but just have different names.
>> -
>> - Note pointer types with different address spaces may have the same
>> - canonical type. Those are rejected for coalescing by the
>> - types_compatible_p check. */
>> - if (TYPE_CANONICAL (t1)
>> - && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
>> - && types_compatible_p (t1, t2))
>> - return true;
>> -
>> - return false;
>> -}
>> -
>> /* Strip off a legitimate source ending from the input string NAME of
>> length LEN. Rather than having to know the names used by all of
>> our front ends, we strip off an ending of a period followed by
>> diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
>> index ed23eb2..3d1c89f 100644
>> --- a/gcc/gimple-expr.h
>> +++ b/gcc/gimple-expr.h
>> @@ -28,7 +28,6 @@ extern gimple_seq gimple_body (tree);
>> extern bool gimple_has_body_p (tree);
>> extern const char *gimple_decl_printable_name (tree, int);
>> extern tree copy_var_decl (tree, tree, tree);
>> -extern bool gimple_can_coalesce_p (tree, tree);
>> extern tree create_tmp_var_name (const char *);
>> extern tree create_tmp_var_raw (tree, const char * = NULL);
>> extern tree create_tmp_var (tree, const char * = NULL);
>> diff --git a/gcc/opts.c b/gcc/opts.c
>> index 9793999..5305299 100644
>> --- a/gcc/opts.c
>> +++ b/gcc/opts.c
>> @@ -448,12 +448,12 @@ static const struct default_options default_options_table[] =
>> { OPT_LEVELS_1_PLUS, OPT_fsplit_wide_types, NULL, 1 },
>> { OPT_LEVELS_1_PLUS, OPT_ftree_ccp, NULL, 1 },
>> { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_bit_ccp, NULL, 1 },
>> + { OPT_LEVELS_1_PLUS, OPT_ftree_coalesce_vars, NULL, 1 },
>> { OPT_LEVELS_1_PLUS, OPT_ftree_dce, NULL, 1 },
>> { OPT_LEVELS_1_PLUS, OPT_ftree_dominator_opts, NULL, 1 },
>> { OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 },
>> { OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 },
>> { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 },
>> - { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 },
>> { OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 },
>> { OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 },
>> { OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 },
>> diff --git a/gcc/passes.def b/gcc/passes.def
>> index 4690e23..230e089 100644
>> --- a/gcc/passes.def
>> +++ b/gcc/passes.def
>> @@ -77,7 +77,6 @@ along with GCC; see the file COPYING3. If not see
>> NEXT_PASS (pass_all_early_optimizations);
>> PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
>> NEXT_PASS (pass_remove_cgraph_callee_edges);
>> - NEXT_PASS (pass_rename_ssa_copies);
>> NEXT_PASS (pass_object_sizes);
>> NEXT_PASS (pass_ccp);
>> /* After CCP we rewrite no longer addressed locals into SSA
>> @@ -155,7 +154,6 @@ along with GCC; see the file COPYING3. If not see
>> /* Initial scalar cleanups before alias computation.
>> They ensure memory accesses are not indirect wherever possible. */
>> NEXT_PASS (pass_strip_predict_hints);
>> - NEXT_PASS (pass_rename_ssa_copies);
>> NEXT_PASS (pass_ccp);
>> /* After CCP we rewrite no longer addressed locals into SSA
>> form if possible. */
>> @@ -183,7 +181,6 @@ along with GCC; see the file COPYING3. If not see
>> NEXT_PASS (pass_ch);
>> NEXT_PASS (pass_lower_complex);
>> NEXT_PASS (pass_sra);
>> - NEXT_PASS (pass_rename_ssa_copies);
>> /* The dom pass will also resolve all __builtin_constant_p calls
>> that are still there to 0. This has to be done after some
>> propagations have already run, but before some more dead code
>> @@ -291,7 +288,6 @@ along with GCC; see the file COPYING3. If not see
>> NEXT_PASS (pass_fold_builtins);
>> NEXT_PASS (pass_optimize_widening_mul);
>> NEXT_PASS (pass_tail_calls);
>> - NEXT_PASS (pass_rename_ssa_copies);
>> /* FIXME: If DCE is not run before checking for uninitialized uses,
>> we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
>> However, this also causes us to misdiagnose cases that should be
>> @@ -326,7 +322,6 @@ along with GCC; see the file COPYING3. If not see
>> NEXT_PASS (pass_dce);
>> NEXT_PASS (pass_asan);
>> NEXT_PASS (pass_tsan);
>> - NEXT_PASS (pass_rename_ssa_copies);
>> /* ??? We do want some kind of loop invariant motion, but we possibly
>> need to adjust LIM to be more friendly towards preserving accurate
>> debug information here. */
>> diff --git a/gcc/testsuite/gcc.dg/guality/pr54200.c b/gcc/testsuite/gcc.dg/guality/pr54200.c
>> index 9b17187..e1e7293 100644
>> --- a/gcc/testsuite/gcc.dg/guality/pr54200.c
>> +++ b/gcc/testsuite/gcc.dg/guality/pr54200.c
>> @@ -1,6 +1,6 @@
>> /* PR tree-optimization/54200 */
>> /* { dg-do run } */
>> -/* { dg-options "-g -fno-var-tracking-assignments" } */
>> +/* { dg-options "-g -fno-var-tracking-assignments -fno-tree-coalesce-vars" } */
>>
>> int o __attribute__((used));
>>
>> diff --git a/gcc/testsuite/gcc.dg/ssp-1.c b/gcc/testsuite/gcc.dg/ssp-1.c
>> index 5467f4d..db69332 100644
>> --- a/gcc/testsuite/gcc.dg/ssp-1.c
>> +++ b/gcc/testsuite/gcc.dg/ssp-1.c
>> @@ -12,7 +12,7 @@ __stack_chk_fail (void)
>>
>> int main ()
>> {
>> - int i;
>> + register int i;
>> char foo[255];
>>
>> // smash stack
>> diff --git a/gcc/testsuite/gcc.dg/ssp-2.c b/gcc/testsuite/gcc.dg/ssp-2.c
>> index 9a7ac32..752fe53 100644
>> --- a/gcc/testsuite/gcc.dg/ssp-2.c
>> +++ b/gcc/testsuite/gcc.dg/ssp-2.c
>> @@ -14,7 +14,7 @@ __stack_chk_fail (void)
>> void
>> overflow()
>> {
>> - int i = 0;
>> + register int i = 0;
>> char foo[30];
>>
>> /* Overflow buffer. */
>> diff --git a/gcc/testsuite/gcc.dg/torture/parm-coalesce.c b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
>> new file mode 100644
>> index 0000000..dbd81c1
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
>> @@ -0,0 +1,40 @@
>> +/* { dg-do run } */
>> +
>> +#include <stdlib.h>
>> +
>> +/* Make sure we don't coalesce both incoming parms, one whose incoming
>> + value is unused, to the same location, so as to overwrite one of
>> + them with the incoming value of the other. */
>> +
>> +int __attribute__((noinline, noclone))
>> +foo (int i, int j)
>> +{
>> + j = i; /* The incoming value for J is unused. */
>> + i = 2;
>> + if (j)
>> + j++;
>> + j += i + 1;
>> + return j;
>> +}
>> +
>> +/* Same as foo, but with swapped parameters. */
>> +int __attribute__((noinline, noclone))
>> +bar (int j, int i)
>> +{
>> + j = i; /* The incoming value for J is unused. */
>> + i = 2;
>> + if (j)
>> + j++;
>> + j += i + 1;
>> + return j;
>> +}
>> +
>> +int
>> +main (void)
>> +{
>> + if (foo (0, 1) != 3)
>> + abort ();
>> + if (bar (1, 0) != 3)
>> + abort ();
>> + return 0;
>> +}
>> diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
>> index e23bc0b..59d91c6 100644
>> --- a/gcc/tree-outof-ssa.c
>> +++ b/gcc/tree-outof-ssa.c
>> @@ -305,7 +305,6 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
>> rtx dest_rtx, seq, x;
>> machine_mode dest_mode, src_mode;
>> int unsignedp;
>> - tree var;
>>
>> if (dump_file && (dump_flags & TDF_DETAILS))
>> {
>> @@ -327,12 +326,12 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
>>
>> start_sequence ();
>>
>> - var = SSA_NAME_VAR (partition_to_var (SA.map, dest));
>> + tree name = partition_to_var (SA.map, dest);
>> src_mode = TYPE_MODE (TREE_TYPE (src));
>> dest_mode = GET_MODE (dest_rtx);
>> - gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var)));
>> + gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (name)));
>> gcc_assert (!REG_P (dest_rtx)
>> - || dest_mode == promote_decl_mode (var, &unsignedp));
>> + || dest_mode == promote_ssa_mode (name, &unsignedp));
>>
>> if (src_mode != dest_mode)
>> {
>> @@ -708,13 +707,12 @@ elim_backward (elim_graph g, int T)
>> static rtx
>> get_temp_reg (tree name)
>> {
>> - tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
>> - tree type = TREE_TYPE (var);
>> + tree type = TREE_TYPE (name);
>> int unsignedp;
>> - machine_mode reg_mode = promote_decl_mode (var, &unsignedp);
>> + machine_mode reg_mode = promote_ssa_mode (name, &unsignedp);
>> rtx x = gen_reg_rtx (reg_mode);
>> if (POINTER_TYPE_P (type))
>> - mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var))));
>> + mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type)));
>> return x;
>> }
>>
>> @@ -1014,7 +1012,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
>>
>> /* Return to viewing the variable list as just all reference variables after
>> coalescing has been performed. */
>> - partition_view_normal (map, false);
>> + partition_view_normal (map);
>>
>> if (dump_file && (dump_flags & TDF_DETAILS))
>> {
>> diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
>> index b05a860..9ffa3f1 100644
>> --- a/gcc/tree-ssa-coalesce.c
>> +++ b/gcc/tree-ssa-coalesce.c
>> @@ -58,6 +58,7 @@ along with GCC; see the file COPYING3. If not see
>> #include "tree-ssanames.h"
>> #include "tree-ssa-live.h"
>> #include "tree-ssa-coalesce.h"
>> +#include "explow.h"
>> #include "diagnostic-core.h"
>>
>>
>> @@ -830,6 +831,16 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
>> basic_block bb;
>> ssa_op_iter iter;
>> live_track_p live;
>> + basic_block entry;
>> +
>> + /* If inter-variable coalescing is enabled, we may attempt to
>> + coalesce variables from different base variables, including
>> + different parameters, so we have to make sure default defs live
>> + at the entry block conflict with each other. */
>> + if (flag_tree_coalesce_vars)
>> + entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
>> + else
>> + entry = NULL;
>>
>> map = live_var_map (liveinfo);
>> graph = ssa_conflicts_new (num_var_partitions (map));
>> @@ -888,6 +899,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
>> live_track_process_def (live, result, graph);
>> }
>>
>> + /* Pretend there are defs for params' default defs at the start
>> + of the (post-)entry block. */
>> + if (bb == entry)
>> + {
>> + unsigned base;
>> + bitmap_iterator bi;
>> + EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi)
>> + {
>> + bitmap_iterator bi2;
>> + unsigned part;
>> + EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base],
>> + 0, part, bi2)
>> + {
>> + tree var = partition_to_var (map, part);
>> + if (!SSA_NAME_VAR (var)
>> + || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL
>> + && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL)
>> + || !SSA_NAME_IS_DEFAULT_DEF (var))
>> + continue;
>> + live_track_process_def (live, var, graph);
>> + }
>> + }
>> + }
>> +
>> live_track_clear_base_vars (live);
>> }
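
For readers following the entry-block handling in the hunk above, here is a minimal standalone sketch of the idea, not GCC code. It assumes a simplified model in which partitions are small integers, the live set is a bitmask, and base-variable classes are ignored. The names conflict_graph, process_def and process_entry_defs are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

enum { MAX_PARTS = 32 };

/* conflicts[i] has bit j set when partitions i and j conflict.
   Hypothetical type, for illustration only.  */
typedef struct
{
  uint32_t conflicts[MAX_PARTS];
} conflict_graph;

/* Model of processing a definition while walking a block backwards:
   the defined partition conflicts with everything else still live,
   and is no longer live above the definition.  */
static void
process_def (conflict_graph *g, uint32_t *live, int part)
{
  assert (part >= 0 && part < MAX_PARTS);
  uint32_t bit = UINT32_C (1) << part;
  uint32_t others = *live & ~bit;

  g->conflicts[part] |= others;
  for (int j = 0; j < MAX_PARTS; j++)
    if (others & (UINT32_C (1) << j))
      g->conflicts[j] |= bit;
  *live &= ~bit;
}

/* At the top of the entry block, pretend each live parameter default
   def (the bits in PARM_DEFS) is defined there, so that each one
   conflicts with whatever else is live at that point.  */
static void
process_entry_defs (conflict_graph *g, uint32_t *live, uint32_t parm_defs)
{
  for (int p = 0; p < MAX_PARTS; p++)
    if ((*live & parm_defs) & (UINT32_C (1) << p))
      process_def (g, live, p);
}
```

In this model, two incoming parameters that are both live at entry always end up conflicting, which is the property the parm-coalesce.c test in this patch exercises.
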
>>
>> @@ -1156,6 +1191,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
>> {
>> var1 = partition_to_var (map, p1);
>> var2 = partition_to_var (map, p2);
>> +
>> z = var_union (map, var1, var2);
>> if (z == NO_PARTITION)
>> {
>> @@ -1173,6 +1209,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
>>
>> if (debug)
>> fprintf (debug, ": Success -> %d\n", z);
>> +
>> return true;
>> }
>>
>> @@ -1270,6 +1307,330 @@ ssa_name_var_hash::equal (const tree_node *n1, const tree_node *n2)
>> }
>>
>>
>> +/* Output partition map MAP with coalescing plan PART to file F. */
>> +
>> +void
>> +dump_part_var_map (FILE *f, partition part, var_map map)
>> +{
>> + int t;
>> + unsigned x, y;
>> + int p;
>> +
>> + fprintf (f, "\nCoalescible Partition map \n\n");
>> +
>> + for (x = 0; x < map->num_partitions; x++)
>> + {
>> + if (map->view_to_partition != NULL)
>> + p = map->view_to_partition[x];
>> + else
>> + p = x;
>> +
>> + if (ssa_name (p) == NULL_TREE
>> + || virtual_operand_p (ssa_name (p)))
>> + continue;
>> +
>> + t = 0;
>> + for (y = 1; y < num_ssa_names; y++)
>> + {
>> + tree var = version_to_var (map, y);
>> + if (!var)
>> + continue;
>> + int q = var_to_partition (map, var);
>> + p = partition_find (part, q);
>> + gcc_assert (map->partition_to_base_index[q]
>> + == map->partition_to_base_index[p]);
>> +
>> + if (p == (int)x)
>> + {
>> + if (t++ == 0)
>> + {
>> + fprintf (f, "Partition %d, base %d (", x,
>> + map->partition_to_base_index[q]);
>> + print_generic_expr (f, partition_to_var (map, q), TDF_SLIM);
>> + fprintf (f, " - ");
>> + }
>> + fprintf (f, "%d ", y);
>> + }
>> + }
>> + if (t != 0)
>> + fprintf (f, ")\n");
>> + }
>> + fprintf (f, "\n");
>> +}
>> +
>> +/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
>> + coalescing together, false otherwise.
>> +
>> + This must stay consistent with var_map_base_init in tree-ssa-live.c. */
>> +
>> +bool
>> +gimple_can_coalesce_p (tree name1, tree name2)
>> +{
>> + /* First check the SSA_NAME's associated DECL. Without
>> + optimization, we only want to coalesce if they have the same DECL
>> + or both have no associated DECL. */
>> + tree var1 = SSA_NAME_VAR (name1);
>> + tree var2 = SSA_NAME_VAR (name2);
>> + var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
>> + var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
>> + if (var1 != var2 && !flag_tree_coalesce_vars)
>> + return false;
>> +
>> + /* Now check the types. If the types are the same, then we should
>> + try to coalesce V1 and V2. */
>> + tree t1 = TREE_TYPE (name1);
>> + tree t2 = TREE_TYPE (name2);
>> + if (t1 == t2)
>> + {
>> + check_modes:
>> + /* If the base variables are the same, we're good: none of the
>> + other tests below could possibly fail. */
>> + var1 = SSA_NAME_VAR (name1);
>> + var2 = SSA_NAME_VAR (name2);
>> + if (var1 == var2)
>> + return true;
>> +
>> + /* We don't want to coalesce two SSA names if one of the base
>> + variables is supposed to be a register while the other is
>> + supposed to be on the stack. Anonymous SSA names take
>> + registers, but when not optimizing, user variables should go
>> + on the stack, so coalescing them with the anonymous variable
>> + as the partition leader would end up assigning the user
>> + variable to a register. Don't do that! */
>> + bool reg1 = !var1 || use_register_for_decl (var1);
>> + bool reg2 = !var2 || use_register_for_decl (var2);
>> + if (reg1 != reg2)
>> + return false;
>> +
>> + /* Check that the promoted modes are the same. We don't want to
>> + coalesce if the promoted modes would be different. Only
>> + PARM_DECLs and RESULT_DECLs have different promotion rules,
>> + so skip the test if we both are variables or anonymous
>> + SSA_NAMEs. */
>> + return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
>> + || promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL);
>> + }
>> +
>> + /* If the types are not the same, check for a canonical type match. This
>> + (for example) allows coalescing when the types are fundamentally the
>> + same, but just have different names.
>> +
>> + Note pointer types with different address spaces may have the same
>> + canonical type. Those are rejected for coalescing by the
>> + types_compatible_p check. */
>> + if (TYPE_CANONICAL (t1)
>> + && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
>> + && types_compatible_p (t1, t2))
>> + goto check_modes;
>> +
>> + return false;
>> +}
>> +
>> +/* Fill in MAP's partition_to_base_index, with one index for each
>> + partition of SSA names USED_IN_COPIES and related by CL coalesce
>> + possibilities. This must match gimple_can_coalesce_p in the
>> + optimized case. */
>> +
>> +static void
>> +compute_optimized_partition_bases (var_map map, bitmap used_in_copies,
>> + coalesce_list_p cl)
>> +{
>> + int parts = num_var_partitions (map);
>> + partition tentative = partition_new (parts);
>> +
>> + /* Partition the SSA versions so that, for each coalescible
>> + pair, both of its members are in the same partition in
>> + TENTATIVE. */
>> + gcc_assert (!cl->sorted);
>> + coalesce_pair_p node;
>> + coalesce_iterator_type ppi;
>> + FOR_EACH_PARTITION_PAIR (node, ppi, cl)
>> + {
>> + tree v1 = ssa_name (node->first_element);
>> + int p1 = partition_find (tentative, var_to_partition (map, v1));
>> + tree v2 = ssa_name (node->second_element);
>> + int p2 = partition_find (tentative, var_to_partition (map, v2));
>> +
>> + if (p1 == p2)
>> + continue;
>> +
>> + partition_union (tentative, p1, p2);
>> + }
>> +
>> + /* We have to deal with cost one pairs too. */
>> + for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next)
>> + {
>> + tree v1 = ssa_name (co->first_element);
>> + int p1 = partition_find (tentative, var_to_partition (map, v1));
>> + tree v2 = ssa_name (co->second_element);
>> + int p2 = partition_find (tentative, var_to_partition (map, v2));
>> +
>> + if (p1 == p2)
>> + continue;
>> +
>> + partition_union (tentative, p1, p2);
>> + }
>> +
>> + /* And also with abnormal edges. */
>> + basic_block bb;
>> + edge e;
>> + edge_iterator ei;
>> + FOR_EACH_BB_FN (bb, cfun)
>> + {
>> + FOR_EACH_EDGE (e, ei, bb->preds)
>> + if (e->flags & EDGE_ABNORMAL)
>> + {
>> + gphi_iterator gsi;
>> + for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
>> + gsi_next (&gsi))
>> + {
>> + gphi *phi = gsi.phi ();
>> + tree arg = PHI_ARG_DEF (phi, e->dest_idx);
>> + if (SSA_NAME_IS_DEFAULT_DEF (arg)
>> + && (!SSA_NAME_VAR (arg)
>> + || TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL))
>> + continue;
>> +
>> + tree res = PHI_RESULT (phi);
>> +
>> + int p1 = partition_find (tentative, var_to_partition (map, res));
>> + int p2 = partition_find (tentative, var_to_partition (map, arg));
>> +
>> + if (p1 == p2)
>> + continue;
>> +
>> + partition_union (tentative, p1, p2);
>> + }
>> + }
>> + }
>> +
>> + map->partition_to_base_index = XCNEWVEC (int, parts);
>> + auto_vec<unsigned int> index_map (parts);
>> + if (parts)
>> + index_map.quick_grow (parts);
>> +
>> + const unsigned no_part = -1;
>> + unsigned count = parts;
>> + while (count)
>> + index_map[--count] = no_part;
>> +
>> + /* Initialize MAP's mapping from partition to base index, using
>> + as base indices an enumeration of the TENTATIVE partitions in
>> + which each SSA version ended up, so that we compute conflicts
>> + between all SSA versions that ended up in the same potential
>> + coalesce partition. */
>> + bitmap_iterator bi;
>> + unsigned i;
>> + EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
>> + {
>> + int pidx = var_to_partition (map, ssa_name (i));
>> + int base = partition_find (tentative, pidx);
>> + if (index_map[base] != no_part)
>> + continue;
>> + index_map[base] = count++;
>> + }
>> +
>> + map->num_basevars = count;
>> +
>> + EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
>> + {
>> + int pidx = var_to_partition (map, ssa_name (i));
>> + int base = partition_find (tentative, pidx);
>> + gcc_assert (index_map[base] < count);
>> + map->partition_to_base_index[pidx] = index_map[base];
>> + }
>> +
>> + if (dump_file && (dump_flags & TDF_DETAILS))
>> + dump_part_var_map (dump_file, tentative, map);
>> +
>> + partition_delete (tentative);
>> +}
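
The base-index computation in compute_optimized_partition_bases can be read as "union every potentially coalescible pair, then number the resulting classes densely". Below is a minimal standalone sketch of that idea under simplifying assumptions: it does not use GCC's partition API, it numbers every element rather than only those in used_in_copies, and uf_find, uf_union and assign_base_indices are hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for GCC's partition type: a plain union-find
   over integer ids, with path halving.  */
static int
uf_find (int *parent, int x)
{
  while (parent[x] != x)
    {
      parent[x] = parent[parent[x]];
      x = parent[x];
    }
  return x;
}

static void
uf_union (int *parent, int a, int b)
{
  a = uf_find (parent, a);
  b = uf_find (parent, b);
  if (a != b)
    parent[b] = a;
}

/* Give every element the dense index of its class, numbering classes
   in the order their first member is seen.  Returns the number of
   classes, which plays the role of num_basevars.  */
static int
assign_base_indices (int *parent, int n, int *base_index)
{
  assert (n > 0);
  int *class_index = malloc (n * sizeof (int));
  assert (class_index != NULL);
  int count = 0;

  for (int i = 0; i < n; i++)
    class_index[i] = -1;	/* Class not numbered yet.  */

  for (int i = 0; i < n; i++)
    {
      int root = uf_find (parent, i);
      if (class_index[root] < 0)
	class_index[root] = count++;
      base_index[i] = class_index[root];
    }

  free (class_index);
  return count;
}
```

Elements that share a base index are the only ones for which conflicts need to be computed, since names in different classes are never candidates for coalescing with each other.
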
>> +
>> +/* Hashtable helpers. */
>> +
>> +struct tree_int_map_hasher : typed_noop_remove <tree_int_map>
>> +{
>> + typedef tree_int_map *value_type;
>> + typedef tree_int_map *compare_type;
>> + static inline hashval_t hash (const tree_int_map *);
>> + static inline bool equal (const tree_int_map *, const tree_int_map *);
>> +};
>> +
>> +inline hashval_t
>> +tree_int_map_hasher::hash (const tree_int_map *v)
>> +{
>> + return tree_map_base_hash (v);
>> +}
>> +
>> +inline bool
>> +tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
>> +{
>> + return tree_int_map_eq (v, c);
>> +}
>> +
>> +/* This routine will initialize the basevar fields of MAP with base
>> + names. Partitions will share the same base if they have the same
>> + SSA_NAME_VAR, or, being anonymous variables, the same type. This
>> + must match gimple_can_coalesce_p in the non-optimized case. */
>> +
>> +static void
>> +compute_samebase_partition_bases (var_map map)
>> +{
>> + int x, num_part;
>> + tree var;
>> + struct tree_int_map *m, *mapstorage;
>> +
>> + num_part = num_var_partitions (map);
>> + hash_table<tree_int_map_hasher> tree_to_index (num_part);
>> + /* We can have at most num_part entries in the hash tables, so it's
>> + enough to allocate so many map elements once, saving some malloc
>> + calls. */
>> + mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
>> +
>> + /* If a base table already exists, clear it, otherwise create it. */
>> + free (map->partition_to_base_index);
>> + map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
>> +
>> + /* Build the base variable list, and point partitions at their bases. */
>> + for (x = 0; x < num_part; x++)
>> + {
>> + struct tree_int_map **slot;
>> + unsigned baseindex;
>> + var = partition_to_var (map, x);
>> + if (SSA_NAME_VAR (var)
>> + && (!VAR_P (SSA_NAME_VAR (var))
>> + || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
>> + m->base.from = SSA_NAME_VAR (var);
>> + else
>> + /* This restricts what anonymous SSA names we can coalesce
>> + as it restricts the sets we compute conflicts for.
>> + Using TREE_TYPE to generate sets is the easiest as
>> + type equivalency also holds for SSA names with the same
>> + underlying decl.
>> +
>> + Check gimple_can_coalesce_p when changing this code. */
>> + m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
>> + ? TYPE_CANONICAL (TREE_TYPE (var))
>> + : TREE_TYPE (var));
>> + /* If base variable hasn't been seen, set it up. */
>> + slot = tree_to_index.find_slot (m, INSERT);
>> + if (!*slot)
>> + {
>> + baseindex = m - mapstorage;
>> + m->to = baseindex;
>> + *slot = m;
>> + m++;
>> + }
>> + else
>> + baseindex = (*slot)->to;
>> + map->partition_to_base_index[x] = baseindex;
>> + }
>> +
>> + map->num_basevars = m - mapstorage;
>> +
>> + free (mapstorage);
>> +}
>> +
>> /* Reduce the number of copies by coalescing variables in the function. Return
>> a partition map with the resulting coalesces. */
>>
>> @@ -1286,9 +1647,10 @@ coalesce_ssa_name (void)
>> cl = create_coalesce_list ();
>> map = create_outofssa_var_map (cl, used_in_copies);
>>
>> - /* If optimization is disabled, we need to coalesce all the names originating
>> - from the same SSA_NAME_VAR so debug info remains undisturbed. */
>> - if (!optimize)
>> + /* If this optimization is disabled, we need to coalesce all the
>> + names originating from the same SSA_NAME_VAR so debug info
>> + remains undisturbed. */
>> + if (!flag_tree_coalesce_vars)
>> {
>> hash_table<ssa_name_var_hash> ssa_name_hash (10);
>>
>> @@ -1329,8 +1691,13 @@ coalesce_ssa_name (void)
>> if (dump_file && (dump_flags & TDF_DETAILS))
>> dump_var_map (dump_file, map);
>>
>> - /* Don't calculate live ranges for variables not in the coalesce list. */
>> - partition_view_bitmap (map, used_in_copies, true);
>> + partition_view_bitmap (map, used_in_copies);
>> +
>> + if (flag_tree_coalesce_vars)
>> + compute_optimized_partition_bases (map, used_in_copies, cl);
>> + else
>> + compute_samebase_partition_bases (map);
>> +
>> BITMAP_FREE (used_in_copies);
>>
>> if (num_var_partitions (map) < 1)
>> @@ -1369,8 +1736,7 @@ coalesce_ssa_name (void)
>>
>> /* Now coalesce everything in the list. */
>> coalesce_partitions (map, graph, cl,
>> - ((dump_flags & TDF_DETAILS) ? dump_file
>> - : NULL));
>> + ((dump_flags & TDF_DETAILS) ? dump_file : NULL));
>>
>> delete_coalesce_list (cl);
>> ssa_conflicts_delete (graph);
>> diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h
>> index 99b188a..ae289b4 100644
>> --- a/gcc/tree-ssa-coalesce.h
>> +++ b/gcc/tree-ssa-coalesce.h
>> @@ -21,5 +21,6 @@ along with GCC; see the file COPYING3. If not see
>> #define GCC_TREE_SSA_COALESCE_H
>>
>> extern var_map coalesce_ssa_name (void);
>> +extern bool gimple_can_coalesce_p (tree, tree);
>>
>> #endif /* GCC_TREE_SSA_COALESCE_H */
>> diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c
>> deleted file mode 100644
>> index f3cb56e..0000000
>> --- a/gcc/tree-ssa-copyrename.c
>> +++ /dev/null
>> @@ -1,499 +0,0 @@
>> -/* Rename SSA copies.
>> - Copyright (C) 2004-2015 Free Software Foundation, Inc.
>> - Contributed by Andrew MacLeod <amacleod@redhat.com>
>> -
>> -This file is part of GCC.
>> -
>> -GCC is free software; you can redistribute it and/or modify
>> -it under the terms of the GNU General Public License as published by
>> -the Free Software Foundation; either version 3, or (at your option)
>> -any later version.
>> -
>> -GCC is distributed in the hope that it will be useful,
>> -but WITHOUT ANY WARRANTY; without even the implied warranty of
>> -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>> -GNU General Public License for more details.
>> -
>> -You should have received a copy of the GNU General Public License
>> -along with GCC; see the file COPYING3. If not see
>> -<http://www.gnu.org/licenses/>. */
>> -
>> -#include "config.h"
>> -#include "system.h"
>> -#include "coretypes.h"
>> -#include "tm.h"
>> -#include "hash-set.h"
>> -#include "machmode.h"
>> -#include "vec.h"
>> -#include "double-int.h"
>> -#include "input.h"
>> -#include "alias.h"
>> -#include "symtab.h"
>> -#include "wide-int.h"
>> -#include "inchash.h"
>> -#include "tree.h"
>> -#include "fold-const.h"
>> -#include "predict.h"
>> -#include "hard-reg-set.h"
>> -#include "function.h"
>> -#include "dominance.h"
>> -#include "cfg.h"
>> -#include "basic-block.h"
>> -#include "tree-ssa-alias.h"
>> -#include "internal-fn.h"
>> -#include "gimple-expr.h"
>> -#include "is-a.h"
>> -#include "gimple.h"
>> -#include "gimple-iterator.h"
>> -#include "flags.h"
>> -#include "tree-pretty-print.h"
>> -#include "bitmap.h"
>> -#include "gimple-ssa.h"
>> -#include "stringpool.h"
>> -#include "tree-ssanames.h"
>> -#include "hashtab.h"
>> -#include "rtl.h"
>> -#include "statistics.h"
>> -#include "real.h"
>> -#include "fixed-value.h"
>> -#include "insn-config.h"
>> -#include "expmed.h"
>> -#include "dojump.h"
>> -#include "explow.h"
>> -#include "calls.h"
>> -#include "emit-rtl.h"
>> -#include "varasm.h"
>> -#include "stmt.h"
>> -#include "expr.h"
>> -#include "tree-dfa.h"
>> -#include "tree-inline.h"
>> -#include "tree-ssa-live.h"
>> -#include "tree-pass.h"
>> -#include "langhooks.h"
>> -
>> -static struct
>> -{
>> - /* Number of copies coalesced. */
>> - int coalesced;
>> -} stats;
>> -
>> -/* The following routines implement the SSA copy renaming phase.
>> -
>> - This optimization looks for copies between 2 SSA_NAMES, either through a
>> - direct copy, or an implicit one via a PHI node result and its arguments.
>> -
>> - Each copy is examined to determine if it is possible to rename the base
>> - variable of one of the operands to the same variable as the other operand.
>> - i.e.
>> - T.3_5 = <blah>
>> - a_1 = T.3_5
>> -
>> - If this copy couldn't be copy propagated, it could possibly remain in the
>> - program throughout the optimization phases. After SSA->normal, it would
>> - become:
>> -
>> - T.3 = <blah>
>> - a = T.3
>> -
>> - Since T.3_5 is distinct from all other SSA versions of T.3, there is no
>> - fundamental reason why the base variable needs to be T.3, subject to
>> - certain restrictions. This optimization attempts to determine if we can
>> - change the base variable on copies like this, and result in code such as:
>> -
>> - a_5 = <blah>
>> - a_1 = a_5
>> -
>> - This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is
>> - possible, the copy goes away completely. If it isn't possible, a new temp
>> - will be created for a_5, and you will end up with the exact same code:
>> -
>> - a.8 = <blah>
>> - a = a.8
>> -
>> - The other benefit of performing this optimization relates to what variables
>> - are chosen in copies. Gimplification of the program uses temporaries for
>> - a lot of things. expressions like
>> -
>> - a_1 = <blah>
>> - <blah2> = a_1
>> -
>> - get turned into
>> -
>> - T.3_5 = <blah>
>> - a_1 = T.3_5
>> - <blah2> = a_1
>> -
>> - Copy propagation is done in a forward direction, and if we can propagate
>> - through the copy, we end up with:
>> -
>> - T.3_5 = <blah>
>> - <blah2> = T.3_5
>> -
>> - The copy is gone, but so is all reference to the user variable 'a'. By
>> - performing this optimization, we would see the sequence:
>> -
>> - a_5 = <blah>
>> - a_1 = a_5
>> - <blah2> = a_1
>> -
>> - which copy propagation would then turn into:
>> -
>> - a_5 = <blah>
>> - <blah2> = a_5
>> -
>> - and so we still retain the user variable whenever possible. */
>> -
>> -
>> -/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
>> - Choose a representative for the partition, and send debug info to DEBUG. */
>> -
>> -static void
>> -copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
>> -{
>> - int p1, p2, p3;
>> - tree root1, root2;
>> - tree rep1, rep2;
>> - bool ign1, ign2, abnorm;
>> -
>> - gcc_assert (TREE_CODE (var1) == SSA_NAME);
>> - gcc_assert (TREE_CODE (var2) == SSA_NAME);
>> -
>> - register_ssa_partition (map, var1);
>> - register_ssa_partition (map, var2);
>> -
>> - p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
>> - p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
>> -
>> - if (debug)
>> - {
>> - fprintf (debug, "Try : ");
>> - print_generic_expr (debug, var1, TDF_SLIM);
>> - fprintf (debug, "(P%d) & ", p1);
>> - print_generic_expr (debug, var2, TDF_SLIM);
>> - fprintf (debug, "(P%d)", p2);
>> - }
>> -
>> - gcc_assert (p1 != NO_PARTITION);
>> - gcc_assert (p2 != NO_PARTITION);
>> -
>> - if (p1 == p2)
>> - {
>> - if (debug)
>> - fprintf (debug, " : Already coalesced.\n");
>> - return;
>> - }
>> -
>> - rep1 = partition_to_var (map, p1);
>> - rep2 = partition_to_var (map, p2);
>> - root1 = SSA_NAME_VAR (rep1);
>> - root2 = SSA_NAME_VAR (rep2);
>> - if (!root1 && !root2)
>> - return;
>> -
>> - /* Don't coalesce if one of the variables occurs in an abnormal PHI. */
>> - abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1)
>> - || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2));
>> - if (abnorm)
>> - {
>> - if (debug)
>> - fprintf (debug, " : Abnormal PHI barrier. No coalesce.\n");
>> - return;
>> - }
>> -
>> - /* Partitions already have the same root, simply merge them. */
>> - if (root1 == root2)
>> - {
>> - p1 = partition_union (map->var_partition, p1, p2);
>> - if (debug)
>> - fprintf (debug, " : Same root, coalesced --> P%d.\n", p1);
>> - return;
>> - }
>> -
>> - /* Never attempt to coalesce 2 different parameters. */
>> - if ((root1 && TREE_CODE (root1) == PARM_DECL)
>> - && (root2 && TREE_CODE (root2) == PARM_DECL))
>> - {
>> - if (debug)
>> - fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n");
>> - return;
>> - }
>> -
>> - if ((root1 && TREE_CODE (root1) == RESULT_DECL)
>> - != (root2 && TREE_CODE (root2) == RESULT_DECL))
>> - {
>> - if (debug)
>> - fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n");
>> - return;
>> - }
>> -
>> - ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1));
>> - ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2));
>> -
>> - /* Refrain from coalescing user variables, if requested. */
>> - if (!ign1 && !ign2)
>> - {
>> - if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2))
>> - ign2 = true;
>> - else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1))
>> - ign1 = true;
>> - else if (flag_ssa_coalesce_vars != 2)
>> - {
>> - if (debug)
>> - fprintf (debug, " : 2 different USER vars. No coalesce.\n");
>> - return;
>> - }
>> - else
>> - ign2 = true;
>> - }
>> -
>> - /* If both values have default defs, we can't coalesce. If only one has a
>> - tag, make sure that variable is the new root partition. */
>> - if (root1 && ssa_default_def (cfun, root1))
>> - {
>> - if (root2 && ssa_default_def (cfun, root2))
>> - {
>> - if (debug)
>> - fprintf (debug, " : 2 default defs. No coalesce.\n");
>> - return;
>> - }
>> - else
>> - {
>> - ign2 = true;
>> - ign1 = false;
>> - }
>> - }
>> - else if (root2 && ssa_default_def (cfun, root2))
>> - {
>> - ign1 = true;
>> - ign2 = false;
>> - }
>> -
>> - /* Do not coalesce if we cannot assign a symbol to the partition. */
>> - if (!(!ign2 && root2)
>> - && !(!ign1 && root1))
>> - {
>> - if (debug)
>> - fprintf (debug, " : Choosen variable has no root. No coalesce.\n");
>> - return;
>> - }
>> -
>> - /* Don't coalesce if the new chosen root variable would be read-only.
>> - If both ign1 && ign2, then the root var of the larger partition
>> - wins, so reject in that case if any of the root vars is TREE_READONLY.
>> - Otherwise reject only if the root var, on which replace_ssa_name_symbol
>> - will be called below, is readonly. */
>> - if (((root1 && TREE_READONLY (root1)) && ign2)
>> - || ((root2 && TREE_READONLY (root2)) && ign1))
>> - {
>> - if (debug)
>> - fprintf (debug, " : Readonly variable. No coalesce.\n");
>> - return;
>> - }
>> -
>> - /* Don't coalesce if the two variables aren't type compatible . */
>> - if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2))
>> - /* There is a disconnect between the middle-end type-system and
>> - VRP, avoid coalescing enum types with different bounds. */
>> - || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE
>> - || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE)
>> - && TREE_TYPE (var1) != TREE_TYPE (var2)))
>> - {
>> - if (debug)
>> - fprintf (debug, " : Incompatible types. No coalesce.\n");
>> - return;
>> - }
>> -
>> - /* Merge the two partitions. */
>> - p3 = partition_union (map->var_partition, p1, p2);
>> -
>> - /* Set the root variable of the partition to the better choice, if there is
>> - one. */
>> - if (!ign2 && root2)
>> - replace_ssa_name_symbol (partition_to_var (map, p3), root2);
>> - else if (!ign1 && root1)
>> - replace_ssa_name_symbol (partition_to_var (map, p3), root1);
>> - else
>> - gcc_unreachable ();
>> -
>> - if (debug)
>> - {
>> - fprintf (debug, " --> P%d ", p3);
>> - print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)),
>> - TDF_SLIM);
>> - fprintf (debug, "\n");
>> - }
>> -}
>> -
>> -
>> -namespace {
>> -
>> -const pass_data pass_data_rename_ssa_copies =
>> -{
>> - GIMPLE_PASS, /* type */
>> - "copyrename", /* name */
>> - OPTGROUP_NONE, /* optinfo_flags */
>> - TV_TREE_COPY_RENAME, /* tv_id */
>> - ( PROP_cfg | PROP_ssa ), /* properties_required */
>> - 0, /* properties_provided */
>> - 0, /* properties_destroyed */
>> - 0, /* todo_flags_start */
>> - 0, /* todo_flags_finish */
>> -};
>> -
>> -class pass_rename_ssa_copies : public gimple_opt_pass
>> -{
>> -public:
>> - pass_rename_ssa_copies (gcc::context *ctxt)
>> - : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt)
>> - {}
>> -
>> - /* opt_pass methods: */
>> - opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); }
>> - virtual bool gate (function *) { return flag_tree_copyrename != 0; }
>> - virtual unsigned int execute (function *);
>> -
>> -}; // class pass_rename_ssa_copies
>> -
>> -/* This function will make a pass through the IL, and attempt to coalesce any
>> - SSA versions which occur in PHI's or copies. Coalescing is accomplished by
>> - changing the underlying root variable of all coalesced version. This will
>> - then cause the SSA->normal pass to attempt to coalesce them all to the same
>> - variable. */
>> -
>> -unsigned int
>> -pass_rename_ssa_copies::execute (function *fun)
>> -{
>> - var_map map;
>> - basic_block bb;
>> - tree var, part_var;
>> - gimple stmt;
>> - unsigned x;
>> - FILE *debug;
>> -
>> - memset (&stats, 0, sizeof (stats));
>> -
>> - if (dump_file && (dump_flags & TDF_DETAILS))
>> - debug = dump_file;
>> - else
>> - debug = NULL;
>> -
>> - map = init_var_map (num_ssa_names);
>> -
>> - FOR_EACH_BB_FN (bb, fun)
>> - {
>> - /* Scan for real copies. */
>> - for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
>> - gsi_next (&gsi))
>> - {
>> - stmt = gsi_stmt (gsi);
>> - if (gimple_assign_ssa_name_copy_p (stmt))
>> - {
>> - tree lhs = gimple_assign_lhs (stmt);
>> - tree rhs = gimple_assign_rhs1 (stmt);
>> -
>> - copy_rename_partition_coalesce (map, lhs, rhs, debug);
>> - }
>> - }
>> - }
>> -
>> - FOR_EACH_BB_FN (bb, fun)
>> - {
>> - /* Treat PHI nodes as copies between the result and each argument. */
>> - for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
>> - gsi_next (&gsi))
>> - {
>> - size_t i;
>> - tree res;
>> - gphi *phi = gsi.phi ();
>> - res = gimple_phi_result (phi);
>> -
>> - /* Do not process virtual SSA_NAMES. */
>> - if (virtual_operand_p (res))
>> - continue;
>> -
>> - /* Make sure to only use the same partition for an argument
>> - as the result but never the other way around. */
>> - if (SSA_NAME_VAR (res)
>> - && !DECL_IGNORED_P (SSA_NAME_VAR (res)))
>> - for (i = 0; i < gimple_phi_num_args (phi); i++)
>> - {
>> - tree arg = PHI_ARG_DEF (phi, i);
>> - if (TREE_CODE (arg) == SSA_NAME)
>> - copy_rename_partition_coalesce (map, res, arg,
>> - debug);
>> - }
>> - /* Else if all arguments are in the same partition try to merge
>> - it with the result. */
>> - else
>> - {
>> - int all_p_same = -1;
>> - int p = -1;
>> - for (i = 0; i < gimple_phi_num_args (phi); i++)
>> - {
>> - tree arg = PHI_ARG_DEF (phi, i);
>> - if (TREE_CODE (arg) != SSA_NAME)
>> - {
>> - all_p_same = 0;
>> - break;
>> - }
>> - else if (all_p_same == -1)
>> - {
>> - p = partition_find (map->var_partition,
>> - SSA_NAME_VERSION (arg));
>> - all_p_same = 1;
>> - }
>> - else if (all_p_same == 1
>> - && p != partition_find (map->var_partition,
>> - SSA_NAME_VERSION (arg)))
>> - {
>> - all_p_same = 0;
>> - break;
>> - }
>> - }
>> - if (all_p_same == 1)
>> - copy_rename_partition_coalesce (map, res,
>> - PHI_ARG_DEF (phi, 0),
>> - debug);
>> - }
>> - }
>> - }
>> -
>> - if (debug)
>> - dump_var_map (debug, map);
>> -
>> - /* Now one more pass to make all elements of a partition share the same
>> - root variable. */
>> -
>> - for (x = 1; x < num_ssa_names; x++)
>> - {
>> - part_var = partition_to_var (map, x);
>> - if (!part_var)
>> - continue;
>> - var = ssa_name (x);
>> - if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var))
>> - continue;
>> - if (debug)
>> - {
>> - fprintf (debug, "Coalesced ");
>> - print_generic_expr (debug, var, TDF_SLIM);
>> - fprintf (debug, " to ");
>> - print_generic_expr (debug, part_var, TDF_SLIM);
>> - fprintf (debug, "\n");
>> - }
>> - stats.coalesced++;
>> - replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var));
>> - }
>> -
>> - statistics_counter_event (fun, "copies coalesced",
>> - stats.coalesced);
>> - delete_var_map (map);
>> - return 0;
>> -}
>> -
>> -} // anon namespace
>> -
>> -gimple_opt_pass *
>> -make_pass_rename_ssa_copies (gcc::context *ctxt)
>> -{
>> - return new pass_rename_ssa_copies (ctxt);
>> -}
>> diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
>> index 2c7c072..821b2f4 100644
>> --- a/gcc/tree-ssa-live.c
>> +++ b/gcc/tree-ssa-live.c
>> @@ -100,90 +100,6 @@ static void verify_live_on_entry (tree_live_info_p);
>> ssa_name or variable, and vice versa. */
>>
>>
>> -/* Hashtable helpers. */
>> -
>> -struct tree_int_map_hasher : typed_noop_remove <tree_int_map>
>> -{
>> - typedef tree_int_map *value_type;
>> - typedef tree_int_map *compare_type;
>> - static inline hashval_t hash (const tree_int_map *);
>> - static inline bool equal (const tree_int_map *, const tree_int_map *);
>> -};
>> -
>> -inline hashval_t
>> -tree_int_map_hasher::hash (const tree_int_map *v)
>> -{
>> - return tree_map_base_hash (v);
>> -}
>> -
>> -inline bool
>> -tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
>> -{
>> - return tree_int_map_eq (v, c);
>> -}
>> -
>> -
>> -/* This routine will initialize the basevar fields of MAP. */
>> -
>> -static void
>> -var_map_base_init (var_map map)
>> -{
>> - int x, num_part;
>> - tree var;
>> - struct tree_int_map *m, *mapstorage;
>> -
>> - num_part = num_var_partitions (map);
>> - hash_table<tree_int_map_hasher> tree_to_index (num_part);
>> - /* We can have at most num_part entries in the hash tables, so it's
>> - enough to allocate so many map elements once, saving some malloc
>> - calls. */
>> - mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
>> -
>> - /* If a base table already exists, clear it, otherwise create it. */
>> - free (map->partition_to_base_index);
>> - map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
>> -
>> - /* Build the base variable list, and point partitions at their bases. */
>> - for (x = 0; x < num_part; x++)
>> - {
>> - struct tree_int_map **slot;
>> - unsigned baseindex;
>> - var = partition_to_var (map, x);
>> - if (SSA_NAME_VAR (var)
>> - && (!VAR_P (SSA_NAME_VAR (var))
>> - || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
>> - m->base.from = SSA_NAME_VAR (var);
>> - else
>> - /* This restricts what anonymous SSA names we can coalesce
>> - as it restricts the sets we compute conflicts for.
>> - Using TREE_TYPE to generate sets is the easies as
>> - type equivalency also holds for SSA names with the same
>> - underlying decl.
>> -
>> - Check gimple_can_coalesce_p when changing this code. */
>> - m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
>> - ? TYPE_CANONICAL (TREE_TYPE (var))
>> - : TREE_TYPE (var));
>> - /* If base variable hasn't been seen, set it up. */
>> - slot = tree_to_index.find_slot (m, INSERT);
>> - if (!*slot)
>> - {
>> - baseindex = m - mapstorage;
>> - m->to = baseindex;
>> - *slot = m;
>> - m++;
>> - }
>> - else
>> - baseindex = (*slot)->to;
>> - map->partition_to_base_index[x] = baseindex;
>> - }
>> -
>> - map->num_basevars = m - mapstorage;
>> -
>> - free (mapstorage);
>> -}
>> -
>> -
>> /* Remove the base table in MAP. */
>>
>> static void
>> @@ -361,21 +277,17 @@ partition_view_fini (var_map map, bitmap selected)
>> }
>>
>>
>> -/* Create a partition view which includes all the used partitions in MAP. If
>> - WANT_BASES is true, create the base variable map as well. */
>> +/* Create a partition view which includes all the used partitions in MAP. */
>>
>> void
>> -partition_view_normal (var_map map, bool want_bases)
>> +partition_view_normal (var_map map)
>> {
>> bitmap used;
>>
>> used = partition_view_init (map);
>> partition_view_fini (map, used);
>>
>> - if (want_bases)
>> - var_map_base_init (map);
>> - else
>> - var_map_base_fini (map);
>> + var_map_base_fini (map);
>> }
>>
>>
>> @@ -384,7 +296,7 @@ partition_view_normal (var_map map, bool want_bases)
>> as well. */
>>
>> void
>> -partition_view_bitmap (var_map map, bitmap only, bool want_bases)
>> +partition_view_bitmap (var_map map, bitmap only)
>> {
>> bitmap used;
>> bitmap new_partitions = BITMAP_ALLOC (NULL);
>> @@ -400,10 +312,7 @@ partition_view_bitmap (var_map map, bitmap only, bool want_bases)
>> }
>> partition_view_fini (map, new_partitions);
>>
>> - if (want_bases)
>> - var_map_base_init (map);
>> - else
>> - var_map_base_fini (map);
>> + var_map_base_fini (map);
>> }
>>
>>
>> diff --git a/gcc/tree-ssa-live.h b/gcc/tree-ssa-live.h
>> index d5d7820..1f88358 100644
>> --- a/gcc/tree-ssa-live.h
>> +++ b/gcc/tree-ssa-live.h
>> @@ -71,8 +71,8 @@ typedef struct _var_map
>> extern var_map init_var_map (int);
>> extern void delete_var_map (var_map);
>> extern int var_union (var_map, tree, tree);
>> -extern void partition_view_normal (var_map, bool);
>> -extern void partition_view_bitmap (var_map, bitmap, bool);
>> +extern void partition_view_normal (var_map);
>> +extern void partition_view_bitmap (var_map, bitmap);
>> extern void dump_scope_blocks (FILE *, int);
>> extern void debug_scope_block (tree, int);
>> extern void debug_scope_blocks (int);
>> diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
>> index 3f6bebe..7bef8cf 100644
>> --- a/gcc/tree-ssa-loop-niter.c
>> +++ b/gcc/tree-ssa-loop-niter.c
>> @@ -3964,7 +3964,11 @@ loop_exits_before_overflow (tree base, tree step,
>> if (!CONVERT_EXPR_P (e) || !operand_equal_p (e, unsigned_base, 0))
>> continue;
>> e = TREE_OPERAND (e, 0);
>> - gcc_assert (operand_equal_p (e, base, 0));
>> + /* If E has an unsigned type, the operand equality test below
>> + would fail, but the equality test above would have already
>> + verified the equality, so we can proceed with it. */
>> + gcc_assert (TYPE_UNSIGNED (TREE_TYPE (e))
>> + || operand_equal_p (e, base, 0));
>> if (tree_int_cst_sign_bit (step))
>> {
>> code = LT_EXPR;
>> diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c
>> index f75a7f1..0982305 100644
>> --- a/gcc/tree-ssa-uncprop.c
>> +++ b/gcc/tree-ssa-uncprop.c
>> @@ -59,6 +59,11 @@ along with GCC; see the file COPYING3. If not see
>> #include "domwalk.h"
>> #include "tree-pass.h"
>> #include "tree-ssa-propagate.h"
>> +#include "bitmap.h"
>> +#include "stringpool.h"
>> +#include "tree-ssanames.h"
>> +#include "tree-ssa-live.h"
>> +#include "tree-ssa-coalesce.h"
>>
>> /* The basic structure describing an equivalency created by traversing
>> an edge. Traversing the edge effectively means that we can assume
>> diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
>> index 0b24007..acdcd46 100644
>> --- a/gcc/var-tracking.c
>> +++ b/gcc/var-tracking.c
>> @@ -4931,12 +4931,16 @@ dataflow_set_remove_mem_locs (variable_def **slot, dataflow_set *set)
>> registers, as well as associations between MEMs and VALUEs. */
>>
>> static void
>> -dataflow_set_clear_at_call (dataflow_set *set)
>> +dataflow_set_clear_at_call (dataflow_set *set, rtx_insn *call_insn)
>> {
>> unsigned int r;
>> hard_reg_set_iterator hrsi;
>> + HARD_REG_SET invalidated_regs;
>>
>> - EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, r, hrsi)
>> + get_call_reg_set_usage (call_insn, &invalidated_regs,
>> + regs_invalidated_by_call);
>> +
>> + EXECUTE_IF_SET_IN_HARD_REG_SET (invalidated_regs, 0, r, hrsi)
>> var_regno_delete (set, r);
>>
>> if (MAY_HAVE_DEBUG_INSNS)
>> @@ -6720,7 +6724,7 @@ compute_bb_dataflow (basic_block bb)
>> switch (mo->type)
>> {
>> case MO_CALL:
>> - dataflow_set_clear_at_call (out);
>> + dataflow_set_clear_at_call (out, insn);
>> break;
>>
>> case MO_USE:
>> @@ -9182,7 +9186,7 @@ emit_notes_in_bb (basic_block bb, dataflow_set *set)
>> switch (mo->type)
>> {
>> case MO_CALL:
>> - dataflow_set_clear_at_call (set);
>> + dataflow_set_clear_at_call (set, insn);
>> emit_notes_for_changes (insn, EMIT_NOTE_AFTER_CALL_INSN, set->vars);
>> {
>> rtx arguments = mo->u.loc, *p = &arguments;
>>
>>
>>
>> And here's the incremental patch:
>>
>> ---
>> gcc/alias.c | 17 +++++++------
>> gcc/cfgexpand.c | 57 +++++++++++++++++----------------------------
>> gcc/emit-rtl.c | 2 --
>> gcc/explow.c | 3 --
>> gcc/expr.c | 16 +++++--------
>> gcc/function.c | 15 ++++++++++++
>> gcc/gimple-expr.h | 4 ---
>> gcc/tree-outof-ssa.c | 7 ++----
>> gcc/tree-ssa-coalesce.h | 1 +
>> gcc/tree-ssa-loop-niter.c | 6 ++++-
>> gcc/tree-ssa-uncprop.c | 5 ++++
>> 11 files changed, 64 insertions(+), 69 deletions(-)
>>
>> diff --git a/gcc/alias.c b/gcc/alias.c
>> index 7a74e81..5a031d9 100644
>> --- a/gcc/alias.c
>> +++ b/gcc/alias.c
>> @@ -2553,14 +2553,15 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
>> return 0;
>>
>> /* If we refer to different gimple registers, or one gimple register
>> - and one non-gimple-register, we know they can't overlap. Now,
>> - there could be more than one stack slot for (different versions
>> - of) the same gimple register, but we can presumably tell they
>> - don't overlap based on offsets from stack base addresses
>> - elsewhere. It's important that we don't proceed to DECL_RTL,
>> - because gimple registers may not pass DECL_RTL_SET_P, and
>> - make_decl_rtl won't be able to do anything about them since no
>> - SSA information will have remained to guide it. */
>> + and one non-gimple-register, we know they can't overlap. First,
>> + gimple registers don't have their addresses taken. Now, there
>> + could be more than one stack slot for (different versions of) the
>> + same gimple register, but we can presumably tell they don't
>> + overlap based on offsets from stack base addresses elsewhere.
>> + It's important that we don't proceed to DECL_RTL, because gimple
>> + registers may not pass DECL_RTL_SET_P, and make_decl_rtl won't be
>> + able to do anything about them since no SSA information will have
>> + remained to guide it. */
>> if (is_gimple_reg (exprx) || is_gimple_reg (expry))
>> return exprx != expry;
>>
>> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
>> index 3e80b4a..bf972fc 100644
>> --- a/gcc/cfgexpand.c
>> +++ b/gcc/cfgexpand.c
>> @@ -179,11 +179,10 @@ gimple_assign_rhs_to_tree (gimple stmt)
>>
>> #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
>>
>> -/* NEXT is a DECL to be associated with some RTL, CUR is a DECL or a
>> - TREE_LIST of DECLs. If NEXT is covered by CUR, return CUR
>> - unchanged. Otherwise, return a list with all entries of CUR, with
>> - NEXT at the end. If CUR was a list, it will be modified in
>> - place. */
>> +/* Choose either CUR or NEXT as the leader DECL for a partition.
>> + Prefer ignored decls, to simplify debug dumps and reduce ambiguity
>> + out of the same user variable being in multiple partitions (this is
>> + less likely for compiler-introduced temps). */
>>
>> static tree
>> leader_merge (tree cur, tree next)
>> @@ -191,26 +190,11 @@ leader_merge (tree cur, tree next)
>> if (cur == NULL || cur == next)
>> return next;
>>
>> - tree list;
>> + if (DECL_P (cur) && DECL_IGNORED_P (cur))
>> + return cur;
>>
>> - if (TREE_CODE (cur) == TREE_LIST)
>> - {
>> - /* Look for NEXT in the list. Stop at the last node to insert
>> - there. */
>> - for (list = cur; ; list = TREE_CHAIN (list))
>> - {
>> - if (TREE_VALUE (list) == next)
>> - return cur;
>> - if (!TREE_CHAIN (list))
>> - break;
>> - }
>> - }
>> - else
>> - /* Create the first node. */
>> - list = build_tree_list (NULL, cur);
>> -
>> - next = build_tree_list (NULL, next);
>> - TREE_CHAIN (list) = next;
>> + if (DECL_P (next) && DECL_IGNORED_P (next))
>> + return next;
>>
>> return cur;
>> }
>> @@ -285,9 +269,9 @@ set_rtl (tree t, rtx x)
>> if (cur != next)
>> {
>> if (MEM_P (x))
>> - set_mem_attributes (x, SSAVAR (t), true);
>> + set_mem_attributes (x, next, true);
>> else
>> - set_reg_attrs_for_decl_rtl (SSAVAR (t), x);
>> + set_reg_attrs_for_decl_rtl (next, x);
>> }
>> }
>>
>> @@ -1025,9 +1009,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
>> gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
>>
>> x = plus_constant (Pmode, base, offset);
>> - x = gen_rtx_MEM ((TREE_CODE (decl) != SSA_NAME || SSA_NAME_VAR (decl))
>> - ? DECL_MODE (SSAVAR (decl))
>> - : TYPE_MODE (TREE_TYPE (decl)), x);
>> + x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
>> + ? TYPE_MODE (TREE_TYPE (decl))
>> + : DECL_MODE (SSAVAR (decl)), x);
>>
>> if (TREE_CODE (decl) != SSA_NAME)
>> {
>> @@ -1268,17 +1252,17 @@ expand_one_stack_var_1 (tree var)
>> HOST_WIDE_INT size, offset;
>> unsigned byte_align;
>>
>> - if (TREE_CODE (var) != SSA_NAME || SSA_NAME_VAR (var))
>> - {
>> - size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
>> - byte_align = align_local_variable (SSAVAR (var));
>> - }
>> - else
>> + if (TREE_CODE (var) == SSA_NAME)
>> {
>> tree type = TREE_TYPE (var);
>> size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
>> byte_align = TYPE_ALIGN_UNIT (type);
>> }
>> + else
>> + {
>> + size = tree_to_uhwi (DECL_SIZE_UNIT (var));
>> + byte_align = align_local_variable (var);
>> + }
>>
>> /* We handle highly aligned variables in expand_stack_vars. */
>> gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
>> @@ -1423,9 +1407,10 @@ expand_one_register_var (tree var)
>> gcc_assert (REG_P (x));
>> return;
>> }
>> + gcc_unreachable ();
>> }
>>
>> - tree decl = SSAVAR (var);
>> + tree decl = var;
>> tree type = TREE_TYPE (decl);
>> machine_mode reg_mode = promote_decl_mode (decl, NULL);
>> rtx x = gen_reg_rtx (reg_mode);
>> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
>> index 308da40..2b98946 100644
>> --- a/gcc/emit-rtl.c
>> +++ b/gcc/emit-rtl.c
>> @@ -1252,8 +1252,6 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x)
>> if (!t)
>> return;
>> tree tdecl = t;
>> - if (TREE_CODE (t) == TREE_LIST)
>> - tdecl = TREE_VALUE (t);
>> if (GET_CODE (x) == SUBREG)
>> {
>> gcc_assert (subreg_lowpart_p (x));
>> diff --git a/gcc/explow.c b/gcc/explow.c
>> index e09c032e1..5b0d49c 100644
>> --- a/gcc/explow.c
>> +++ b/gcc/explow.c
>> @@ -866,9 +866,6 @@ promote_ssa_mode (const_tree name, int *punsignedp)
>> {
>> gcc_assert (TREE_CODE (name) == SSA_NAME);
>>
>> - if (SSA_NAME_VAR (name))
>> - return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
>> -
>> tree type = TREE_TYPE (name);
>> int unsignedp = TYPE_UNSIGNED (type);
>> machine_mode mode = TYPE_MODE (type);
>> diff --git a/gcc/expr.c b/gcc/expr.c
>> index effe379..5b6e16e 100644
>> --- a/gcc/expr.c
>> +++ b/gcc/expr.c
>> @@ -9584,20 +9584,16 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>>
>> /* Get the signedness to be used for this variable. Ensure we get
>> the same mode we got when the variable was declared. */
>> - if (code == SSA_NAME
>> - && (g = SSA_NAME_DEF_STMT (ssa_name))
>> - && gimple_code (g) == GIMPLE_CALL
>> - && !gimple_call_internal_p (g))
>> + if (code != SSA_NAME)
>> + pmode = promote_decl_mode (exp, &unsignedp);
>> + else if ((g = SSA_NAME_DEF_STMT (ssa_name))
>> + && gimple_code (g) == GIMPLE_CALL
>> + && !gimple_call_internal_p (g))
>> pmode = promote_function_mode (type, mode, &unsignedp,
>> gimple_call_fntype (g),
>> 2);
>> - else if (!exp)
>> - {
>> - gcc_assert (code == SSA_NAME);
>> - pmode = promote_ssa_mode (ssa_name, &unsignedp);
>> - }
>> else
>> - pmode = promote_decl_mode (exp, &unsignedp);
>> + pmode = promote_ssa_mode (ssa_name, &unsignedp);
>> gcc_assert (GET_MODE (decl_rtl) == pmode);
>>
>> temp = gen_lowpart_SUBREG (mode, decl_rtl);
>> diff --git a/gcc/function.c b/gcc/function.c
>> index dc9e77f..58e2498 100644
>> --- a/gcc/function.c
>> +++ b/gcc/function.c
>> @@ -2124,6 +2124,21 @@ use_register_for_decl (const_tree decl)
>> {
>> if (TREE_CODE (decl) == SSA_NAME)
>> {
>> + /* We often try to use the SSA_NAME, instead of its underlying
>> + decl, to get type information and guide decisions, to avoid
>> + differences of behavior between anonymous and named
>> + variables, but in this one case we have to go for the actual
>> + variable if there is one. The main reason is that, at least
>> + at -O0, we want to place user variables on the stack, but we
>> + don't mind using pseudos for anonymous or ignored temps.
>> + Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
>> + should go in pseudos, whereas their corresponding variables
>> + might have to go on the stack. So, disregarding the decl
>> + here would negatively impact debug info at -O0, enable
>> + coalescing between SSA_NAMEs that ought to get different
>> + stack/pseudo assignments, and get the incoming argument
>> + processing thoroughly confused by PARM_DECLs expected to live
>> + in stack slots but assigned to pseudos. */
>> if (!SSA_NAME_VAR (decl))
>> return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
>> && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
>> diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
>> index 146cede..3d1c89f 100644
>> --- a/gcc/gimple-expr.h
>> +++ b/gcc/gimple-expr.h
>> @@ -55,10 +55,6 @@ extern bool is_gimple_mem_ref_addr (tree);
>> extern void mark_addressable (tree);
>> extern bool is_gimple_reg_rhs (tree);
>>
>> -/* Defined in tree-ssa-coalesce.c. */
>> -extern bool gimple_can_coalesce_p (tree, tree);
>> -
>> -
>> /* Return true if a conversion from either type of TYPE1 and TYPE2
>> to the other is not required. Otherwise return false. */
>>
>> diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
>> index dda9973..59d91c6 100644
>> --- a/gcc/tree-outof-ssa.c
>> +++ b/gcc/tree-outof-ssa.c
>> @@ -305,7 +305,6 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
>> rtx dest_rtx, seq, x;
>> machine_mode dest_mode, src_mode;
>> int unsignedp;
>> - tree var;
>>
>> if (dump_file && (dump_flags & TDF_DETAILS))
>> {
>> @@ -328,10 +327,9 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
>> start_sequence ();
>>
>> tree name = partition_to_var (SA.map, dest);
>> - var = SSA_NAME_VAR (name);
>> src_mode = TYPE_MODE (TREE_TYPE (src));
>> dest_mode = GET_MODE (dest_rtx);
>> - gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var ? var : name)));
>> + gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (name)));
>> gcc_assert (!REG_P (dest_rtx)
>> || dest_mode == promote_ssa_mode (name, &unsignedp));
>>
>> @@ -709,8 +707,7 @@ elim_backward (elim_graph g, int T)
>> static rtx
>> get_temp_reg (tree name)
>> {
>> - tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
>> - tree type = var ? TREE_TYPE (var) : TREE_TYPE (name);
>> + tree type = TREE_TYPE (name);
>> int unsignedp;
>> machine_mode reg_mode = promote_ssa_mode (name, &unsignedp);
>> rtx x = gen_reg_rtx (reg_mode);
>> diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h
>> index 99b188a..ae289b4 100644
>> --- a/gcc/tree-ssa-coalesce.h
>> +++ b/gcc/tree-ssa-coalesce.h
>> @@ -21,5 +21,6 @@ along with GCC; see the file COPYING3. If not see
>> #define GCC_TREE_SSA_COALESCE_H
>>
>> extern var_map coalesce_ssa_name (void);
>> +extern bool gimple_can_coalesce_p (tree, tree);
>>
>> #endif /* GCC_TREE_SSA_COALESCE_H */
>> diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
>> index 3f6bebe..7bef8cf 100644
>> --- a/gcc/tree-ssa-loop-niter.c
>> +++ b/gcc/tree-ssa-loop-niter.c
>> @@ -3964,7 +3964,11 @@ loop_exits_before_overflow (tree base, tree step,
>> if (!CONVERT_EXPR_P (e) || !operand_equal_p (e, unsigned_base, 0))
>> continue;
>> e = TREE_OPERAND (e, 0);
>> - gcc_assert (operand_equal_p (e, base, 0));
>> + /* If E has an unsigned type, the operand equality test below
>> + would fail, but the equality test above would have already
>> + verified the equality, so we can proceed with it. */
>> + gcc_assert (TYPE_UNSIGNED (TREE_TYPE (e))
>> + || operand_equal_p (e, base, 0));
>> if (tree_int_cst_sign_bit (step))
>> {
>> code = LT_EXPR;
>> diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c
>> index f75a7f1..0982305 100644
>> --- a/gcc/tree-ssa-uncprop.c
>> +++ b/gcc/tree-ssa-uncprop.c
>> @@ -59,6 +59,11 @@ along with GCC; see the file COPYING3. If not see
>> #include "domwalk.h"
>> #include "tree-pass.h"
>> #include "tree-ssa-propagate.h"
>> +#include "bitmap.h"
>> +#include "stringpool.h"
>> +#include "tree-ssanames.h"
>> +#include "tree-ssa-live.h"
>> +#include "tree-ssa-coalesce.h"
>>
>> /* The basic structure describing an equivalency created by traversing
>> an edge. Traversing the edge effectively means that we can assume
>>
>>
>>
>> --
>> Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/
>> You must be the change you wish to see in the world. -- Gandhi
>> Be Free! -- http://FSFLA.org/ FSF Latin America board member
>> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
* Re: [PR64164] drop copyrename, integrate into expand
2015-06-06 5:12 ` Alexandre Oliva
2015-06-08 8:16 ` Richard Biener
@ 2015-06-10 0:28 ` Alexandre Oliva
2015-06-10 13:36 ` Richard Biener
1 sibling, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-06-10 0:28 UTC (permalink / raw)
To: Richard Biener
Cc: Jeff Law, GCC Patches, Christophe Lyon, David Edelsohn, Eric Botcazou
On Jun 5, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Apr 27, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
>>> +/* Return the promoted mode for name. If it is a named SSA_NAME, it
>>> + is the same as promote_decl_mode. Otherwise, it is the promoted
>>> + mode of a temp decl of same type as the SSA_NAME, if we had created
>>> + one. */
>>> +
>>> +machine_mode
>>> +promote_ssa_mode (const_tree name, int *punsignedp)
>>> +{
>>> + gcc_assert (TREE_CODE (name) == SSA_NAME);
>>> +
>>> + if (SSA_NAME_VAR (name))
>>> + return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
>> As above I'd rather not have different paths for anonymous vs. non-anonymous
>> vars (so just delete the above two lines).
> Check
This caused the sparc regression reported by Eric in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64164#c37
We need to match the mode of the rtl created for the partition and the
promoted mode expected for the parm. I recall working to make parm and
result decls the partition leaders, so that promote_ssa_mode would DTRT,
but this escaped my mind when revisiting the patch after some time on
another project.
So we either restore promote_ssa_mode's check for an underlying decl, at
least for PARM_ and RESULT_DECLs, or further massage function.c to deal
with the mode difference. Any preference?
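To make the first option concrete, here is a minimal sketch as a
self-contained toy model, not GCC code. Every type and function name
below (toy_mode, toy_decl, toy_ssa_name, toy_promote_ssa_mode) is
hypothetical, and the promoted modes are simply stored in the structs
instead of being computed from the target's promotion rules. It only
shows the shape of the check: names backed by a PARM_DECL or
RESULT_DECL take the decl's promoted mode, and everything else is
promoted from the type alone.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are NOT GCC's types.  */
enum toy_mode { TOY_QI, TOY_HI, TOY_SI, TOY_DI };
enum toy_decl_kind { TOY_VAR_DECL, TOY_PARM_DECL, TOY_RESULT_DECL };

struct toy_decl
{
  enum toy_decl_kind kind;
  /* Assumed to hold what promote_decl_mode would return for the decl.  */
  enum toy_mode promoted_mode;
};

struct toy_ssa_name
{
  /* Underlying decl, or NULL for an anonymous SSA name.  */
  const struct toy_decl *var;
  /* Assumed to hold the promoted mode derived from the type alone.  */
  enum toy_mode type_promoted_mode;
};

/* Return the promoted mode for NAME.  Only parameters and results
   defer to the underlying decl; user variables and anonymous names
   share the type-based path.  */
enum toy_mode
toy_promote_ssa_mode (const struct toy_ssa_name *name)
{
  assert (name != NULL);

  const struct toy_decl *var = name->var;
  if (var != NULL
      && (var->kind == TOY_PARM_DECL || var->kind == TOY_RESULT_DECL))
    return var->promoted_mode;

  return name->type_promoted_mode;
}
```

With this shape, a name based on a parm whose decl promotes to a wider
mode gets the wider mode, while a name based on an ordinary variable
gets the same answer as an anonymous name of the same type.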
I'm reverting the patch for now, so that we don't have to rush to a fix
on this, and I can have more time to test and fix other arches. It was
a terrible mistake to not do so before submitting the final version of
the patch, or at least before installing it. I apologize for that.
--
Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/ FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
* Re: [PR64164] drop copyrename, integrate into expand
2015-06-10 0:28 ` Alexandre Oliva
@ 2015-06-10 13:36 ` Richard Biener
2015-07-16 7:58 ` Alexandre Oliva
0 siblings, 1 reply; 127+ messages in thread
From: Richard Biener @ 2015-06-10 13:36 UTC (permalink / raw)
To: Alexandre Oliva
Cc: Jeff Law, GCC Patches, Christophe Lyon, David Edelsohn, Eric Botcazou
On Wed, Jun 10, 2015 at 2:24 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Jun 5, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
>
>> On Apr 27, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
>
>>>> +/* Return the promoted mode for name. If it is a named SSA_NAME, it
>>>> + is the same as promote_decl_mode. Otherwise, it is the promoted
>>>> + mode of a temp decl of same type as the SSA_NAME, if we had created
>>>> + one. */
>>>> +
>>>> +machine_mode
>>>> +promote_ssa_mode (const_tree name, int *punsignedp)
>>>> +{
>>>> + gcc_assert (TREE_CODE (name) == SSA_NAME);
>>>> +
>>>> + if (SSA_NAME_VAR (name))
>>>> + return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
>
>>> As above I'd rather not have different paths for anonymous vs. non-anonymous
>>> vars (so just delete the above two lines).
>
>> Check
>
> This caused the sparc regression reported by Eric in
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64164#c37
>
> We need to match the mode of the rtl created for the partition and the
> promoted mode expected for the parm. I recall working to make parm and
> result decls the partition leaders, so that promote_ssa_mode would DTRT,
> but this escaped my mind when revisiting the patch after some time on
> another project.
>
> So we either restore promote_ssa_mode's check for an underlying decl, at
> least for PARM_ and RESULT_DECLs, or further massage function.c to deal
> with the mode difference. Any preference?
Alternatively not coalesce SSA names when promote_decl_mode gives
different answers (for their underlying decl)? It sounds wrong to do that
(if that is really what happens).
Richard.
> I'm reverting the patch for now, so that we don't have to rush to a fix
> on this, and I can have more time to test and fix other arches. It was
> a terrible mistake to not do so before submitting the final version of
> the patch, or at least before installing it. I apologize for that.
>
> --
> Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/ FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
* Re: [PR64164] drop copyrename, integrate into expand
2015-06-10 13:36 ` Richard Biener
@ 2015-07-16 7:58 ` Alexandre Oliva
2015-07-16 8:50 ` Richard Biener
0 siblings, 1 reply; 127+ messages in thread
From: Alexandre Oliva @ 2015-07-16 7:58 UTC (permalink / raw)
To: Richard Biener
Cc: Jeff Law, GCC Patches, Christophe Lyon, David Edelsohn, Eric Botcazou
On Jun 10, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
> On Wed, Jun 10, 2015 at 2:24 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
>> This caused the sparc regression reported by Eric in
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64164#c37
>> We need to match the mode of the rtl created for the partition and the
>> promoted mode expected for the parm. I recall working to make parm and
>> result decls the partition leaders, so that promote_ssa_mode would DTRT,
>> but this escaped my mind when revisiting the patch after some time on
>> another project.
FWIW, during the development of this improvement, I dropped the notion
of making parm and result decls partition leaders, and instead only
considered eligible for coalescing into the same partition SSA_NAMEs
that promoted to the same mode.
> Alternatively not coalesce SSA names when promote_decl_mode gives
> different answers (for their underlying decl)? It sounds wrong to do that
> (if that is really what happens).
Exactly. I've now restored the promote_decl_mode behavior to
promote_ssa_mode for PARM_ and RESULT_DECLs, so that the strategy
described above works again. This fixed the sparc regression.
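To make the strategy concrete, here is a toy model in C. It is not GCC code: the toy_ types and functions, and the promotion rules of the imaginary target (general promotion leaves modes alone, while the ABI widens sub-word parms and results to a word), are assumptions made up for illustration. The real coalescing test checks more than the promoted mode. The sketch only shows why a parm's default def and an anonymous temp of the same type may have to stay in separate partitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for machine modes, ordered by width.  Hypothetical,
   not GCC's machine_mode.  */
enum toy_mode { TOY_QI, TOY_HI, TOY_SI, TOY_DI };

/* Kind of decl underlying an SSA name; TOY_NO_DECL is an anonymous name.  */
enum toy_decl_kind { TOY_NO_DECL, TOY_VAR_DECL, TOY_PARM_DECL, TOY_RESULT_DECL };

struct toy_ssa_name
{
  enum toy_decl_kind kind;   /* underlying decl, if any */
  enum toy_mode type_mode;   /* mode of the name's type */
};

/* General promotion on the imaginary target: modes are left alone.  */
static enum toy_mode
toy_promote_mode (enum toy_mode m)
{
  return m;
}

/* ABI promotion for parms and results on the imaginary target:
   anything narrower than a word is widened to TOY_SI.  */
static enum toy_mode
toy_promote_function_mode (enum toy_mode m)
{
  return m < TOY_SI ? TOY_SI : m;
}

/* Names based on parms or results promote the way the ABI expects;
   every other name, anonymous or not, promotes like a plain temp.  */
static enum toy_mode
toy_promote_ssa_mode (const struct toy_ssa_name *name)
{
  assert (name != NULL);

  if (name->kind == TOY_PARM_DECL || name->kind == TOY_RESULT_DECL)
    return toy_promote_function_mode (name->type_mode);

  return toy_promote_mode (name->type_mode);
}

/* Two names may share a partition only if the pseudo created for the
   partition would have the mode both of them expect.  */
static bool
toy_can_coalesce_p (const struct toy_ssa_name *a,
                    const struct toy_ssa_name *b)
{
  return toy_promote_ssa_mode (a) == toy_promote_ssa_mode (b);
}
```

On this imaginary target, a TOY_QI parm promotes to TOY_SI while an anonymous TOY_QI temp stays TOY_QI, so the two are never merged and the partition's pseudo always has the mode the parm setup code expects.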
On Jun 9, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Jun 9, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
>> On Jun 9, 2015, David Edelsohn <dje.gcc@gmail.com> wrote:
>>> This also broke bootstrap on PPC64 LE Linux with the same error.
>> Thanks for your reports. I'm looking into the problem.
>> I'd appreciate a preprocessed testcase from either of you to confirm the
>> fix, if not to help debug it.
> The first potential source for this problem that jumped at me would be
> silenced with this change:
> diff --git a/gcc/function.c b/gcc/function.c
> index 8bcc352..9201ed9 100644
> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -2974,7 +2974,8 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
> stack_parm = copy_rtx (stack_parm);
> if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
> PUT_MODE (stack_parm, GET_MODE (entry_parm));
> - set_mem_attributes (stack_parm, parm, 1);
> + if (GET_CODE (stack_parm) == MEM)
> + set_mem_attributes (stack_parm, parm, 1);
> }
> /* If a BLKmode arrives in registers, copy it to a stack slot. Handle
I ended up fixing this in a slightly different way, running the original
code above, from assign_stack_local to set_mem_attributes, only when
rtl_for_parm does not obtain an assignment set up by out-of-ssa.
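The shape of that fix can be sketched with a toy model in C. This is not the function.c code: toy_rtx, toy_set_mem_attributes and toy_setup_block_parm are hypothetical names, and the single static slot stands in for a real stack allocation. It only illustrates the control flow: reuse the location out-of-ssa picked if there is one, and take the allocate-and-set-attributes path only otherwise, so that memory attributes are never applied to a register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy RTL: a location is either a register or a memory slot.  */
enum toy_code { TOY_REG, TOY_MEM };

struct toy_rtx
{
  enum toy_code code;
  bool mem_attrs_set;
};

/* Stand-in for a freshly allocated stack slot (hypothetical).  */
static struct toy_rtx toy_fresh_slot;

/* Memory attributes only make sense on a MEM; applying them to a
   register is the kind of failure the guard in the quoted hunk
   would have silenced.  */
static void
toy_set_mem_attributes (struct toy_rtx *x)
{
  assert (x->code == TOY_MEM);
  x->mem_attrs_set = true;
}

/* FROM_EXPAND is the location chosen by out-of-ssa for the parm, or
   NULL if there is none.  Prefer it; otherwise fall back to the
   original path that allocates a slot and sets its attributes.  */
static struct toy_rtx *
toy_setup_block_parm (struct toy_rtx *from_expand)
{
  if (from_expand != NULL)
    return from_expand;

  toy_fresh_slot.code = TOY_MEM;
  toy_fresh_slot.mem_attrs_set = false;
  toy_set_mem_attributes (&toy_fresh_slot);
  return &toy_fresh_slot;
}
```

With this arrangement the attribute-setting call is unreachable for a register assigned by out-of-ssa, so no separate MEM test is needed at the call site.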
> but I suspect there might be other similar issues lurking in function.c
> after my attempt to turn parm assignment upside down ;-)
There weren't, after all.
On Jun 9, 2015, David Edelsohn <dje.gcc@gmail.com> wrote:
> This patch clearly should have been tested on more
> architectures than x86 before being approved and merged.
The following patch was regstrapped on x86_64-linux-gnu and
i686-pc-linux-gnu.
I've also cross-built all-target successfully for targets aarch64-elf,
arm-eabi, arm-symbianelf, avr-elf, bfin-elf, cr16-elf, cris-elf,
crisv32-elf, epiphany-elf, fido-elf, fr30-elf, frv-elf, i686-elf,
lm32-elf, m68k-elf, mcore-elf, microblaze-elf, mips64el-elf, mips64-elf,
mips64orion-elf, mipsel-elf, mipsisa32-elfoabi, mipsisa64-elfoabi,
mipsisa64r2el-elf, mipsisa64r2-sde-elf, mipsisa64sb1-elf, mipstx39-elf,
mn10300-elf, moxie-elf, nds32be-elf, nds32le-elf, nios2-elf,
powerpc-eabialtivec, powerpc-eabisimaltivec, powerpc-eabisim,
powerpc-eabispe, powerpc-eabi, powerpcle-eabisim, powerpcle-eabi,
powerpcle-elf, ppc-eabi, ppc-elf, rx-elf, sh-elf, sh-superh-elf,
sparc64-elf, sparc-elf, spu-elf, and visium-elf.
I got the same build failures before and after the patch with targets
c6x-elf, ft32-elf, h8300-elf, ia64-elf, iq2000-elf, m32c-elf, m32r-elf,
m32rle-elf, mep-elf, mips64vr-elf, mipsisa64sr71k-elf, msp430-elf,
pdp11-aout, powerpc-xilinx-eabi, ppc64-eabi, rl78-elf, sh64-elf,
sparc-leon-elf, v850e-elf, v850-elf, xstormy16-elf, and xtensa-elf.
One caveat for mips64vr-elf:
mips64vr-elf/mips16/newlib/libm/math/lib_a_e_hypot.o failed to build
with the patch and passed without it, but there were other "invalid
operand" failures for "lwu" insns without the patch, so I'm counting the
e_hypot failure as present but latent before.
This patch differs from the previous one as follows: I dropped the hunk I
had put in loop_exits_before_overflow, since that problem was noticed
and fixed independently (PR66638); I updated tree_int_map_hasher, which
had changed on the trunk in tree-ssa-live.c but which the patch moves to
tree-ssa-coalesce.c; I resolved other conflicts in files that had
#includes added both by the patch and by other changes; and I put in the
two fixes mentioned above. After the full updated patch, I enclose a diff
with just these two additional fixes, to ease the review.
Is this ok to install?
for gcc/ChangeLog
PR rtl-optimization/64164
* Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
* tree-ssa-copyrename.c: Removed.
* opts.c (default_options_table): Drop -ftree-copyrename. Add
-ftree-coalesce-vars.
* passes.def: Drop all occurrences of pass_rename_ssa_copies.
* common.opt (ftree-copyrename): Ignore.
(ftree-coalesce-inlined-vars): Likewise.
* doc/invoke.texi: Remove the ignored options above.
* gimple-expr.h (gimple_can_coalesce_p): Move declaration
* tree-ssa-coalesce.h: ... here.
* tree-ssa-uncprop.c: Include tree-ssa-coalesce.h and other
headers required by it.
* gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
across variables when flag_tree_coalesce_vars. Check register
use and promoted modes to allow coalescing. Moved to
tree-ssa-coalesce.c.
* tree-ssa-live.c (struct tree_int_map_hasher): Move along
with its member functions to tree-ssa-coalesce.c.
(var_map_base_init): Likewise. Renamed to
compute_samebase_partition_bases.
(partition_view_normal): Drop want_bases parameter.
(partition_view_bitmap): Likewise.
* tree-ssa-live.h: Adjust declarations.
* tree-ssa-coalesce.c: Include explow.h.
(build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs'
default defs at the entry point.
(dump_part_var_map): New.
(compute_optimized_partition_bases): New, called by...
(coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
of compute_samebase_partition_bases. Adjust.
* alias.c (nonoverlapping_memrefs_p): Disregard gimple-regs.
* cfgexpand.c (leader_merge): New.
(get_rtl_for_parm_ssa_default_def): New.
(set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
vars. Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
(expand_one_stack_var_at): Handle anonymous SSA_NAMEs. Drop
redundant MEM attr setting.
(expand_one_stack_var_1): Handle anonymous SSA_NAMEs. Renamed
from...
(expand_one_stack_var): ... this. New wrapper to check and
skip already expanded SSA partitions.
(record_alignment_for_reg_var): New, factored out of...
(expand_one_var): ... this.
(expand_one_ssa_partition): New.
(adjust_one_expanded_partition_var): New.
(expand_one_register_var): Check and skip already expanded SSA
partitions.
(expand_used_vars): Don't create DECLs for anonymous SSA
names. Expand all SSA partitions, then adjust all SSA names.
(pass::execute): Replace the loops that set
SA.partition_to_pseudo from partition leaders and cleared
DECL_RTL for multi-location variables, and that which used to
rename vars and set attrs, with one that clears DECL_RTL and
checks that PARMs and RESULTs default_defs match DECL_RTL.
* cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
* emit-rtl.c (set_reg_attrs_for_parm): Handle NULL decl.
* explow.c (promote_ssa_mode): New.
* explow.h (promote_ssa_mode): Declare.
* expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
* function.c: Include cfgexpand.h.
(use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
(use_register_for_parm_decl): Wrapper for the above to
special-case the result_ptr.
(rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
(maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
multiple locations.
(assign_parm_adjust_stack_rtl): Add all and parm arguments,
for rtl_for_parm. For SSA-assigned parms, zero stack_parm.
(assign_parm_setup_block): Prefer SSA-assigned location.
(assign_parm_setup_reg): Likewise. Use entry_parm for equiv
if stack_parm is NULL.
(assign_parm_setup_stack): Prefer SSA-assigned location.
(assign_parms): Maybe reset DECL_RTL of params. Adjust stack
rtl before testing for pointer bounds. Special-case result_ptr.
(expand_function_start): Maybe reset DECL_RTL of result.
Prefer SSA-assigned location for result and static chain.
Factor out DECL_RESULT and SET_DECL_RTL.
* tree-outof-ssa.c (insert_value_copy_on_edge): Handle
anonymous SSA names. Use promote_ssa_mode.
(get_temp_reg): Likewise.
(remove_ssa_form): Adjust.
* var-tracking.c (dataflow_set_clear_at_call): Take call_insn
and get its reg_usage for reg invalidation.
(compute_bb_dataflow): Pass it insn.
(emit_notes_in_bb): Likewise.
for gcc/testsuite/ChangeLog
* gcc.dg/guality/pr54200.c: Add -fno-tree-coalesce-vars.
* gcc.dg/ssp-1.c: Make counter a register.
* gcc.dg/ssp-2.c: Likewise.
* gcc.dg/torture/parm-coalesce.c: New.
---
gcc/Makefile.in | 1
gcc/alias.c | 13 +
gcc/cfgexpand.c | 370 +++++++++++++++-----
gcc/cfgexpand.h | 2
gcc/common.opt | 12 -
gcc/doc/invoke.texi | 48 +--
gcc/emit-rtl.c | 5
gcc/explow.c | 22 +
gcc/explow.h | 3
gcc/expr.c | 39 +-
gcc/function.c | 228 ++++++++++--
gcc/gimple-expr.c | 39 --
gcc/gimple-expr.h | 1
gcc/opts.c | 2
gcc/passes.def | 5
gcc/testsuite/gcc.dg/guality/pr54200.c | 2
gcc/testsuite/gcc.dg/ssp-1.c | 2
gcc/testsuite/gcc.dg/ssp-2.c | 2
gcc/testsuite/gcc.dg/torture/parm-coalesce.c | 40 ++
gcc/tree-outof-ssa.c | 16 -
gcc/tree-ssa-coalesce.c | 378 ++++++++++++++++++++-
gcc/tree-ssa-coalesce.h | 1
gcc/tree-ssa-copyrename.c | 475 --------------------------
gcc/tree-ssa-live.c | 99 -----
gcc/tree-ssa-live.h | 4
gcc/tree-ssa-uncprop.c | 5
gcc/var-tracking.c | 12 -
27 files changed, 979 insertions(+), 847 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/torture/parm-coalesce.c
delete mode 100644 gcc/tree-ssa-copyrename.c
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index bf2186a..b36f9c1 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1445,7 +1445,6 @@ OBJS = \
tree-ssa-ccp.o \
tree-ssa-coalesce.o \
tree-ssa-copy.o \
- tree-ssa-copyrename.o \
tree-ssa-dce.o \
tree-ssa-dom.o \
tree-ssa-dse.o \
diff --git a/gcc/alias.c b/gcc/alias.c
index 3203722..69e3732 100644
--- a/gcc/alias.c
+++ b/gcc/alias.c
@@ -2507,6 +2507,19 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
if (! DECL_P (exprx) || ! DECL_P (expry))
return 0;
+ /* If we refer to different gimple registers, or one gimple register
+ and one non-gimple-register, we know they can't overlap. First,
+ gimple registers don't have their addresses taken. Now, there
+ could be more than one stack slot for (different versions of) the
+ same gimple register, but we can presumably tell they don't
+ overlap based on offsets from stack base addresses elsewhere.
+ It's important that we don't proceed to DECL_RTL, because gimple
+ registers may not pass DECL_RTL_SET_P, and make_decl_rtl won't be
+ able to do anything about them since no SSA information will have
+ remained to guide it. */
+ if (is_gimple_reg (exprx) || is_gimple_reg (expry))
+ return exprx != expry;
+
/* With invalid code we can end up storing into the constant pool.
Bail out to avoid ICEing when creating RTL for this.
See gfortran.dg/lto/20091028-2_0.f90. */
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index a047632..0b19953 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -150,21 +150,121 @@ gimple_assign_rhs_to_tree (gimple stmt)
#define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
+/* Choose either CUR or NEXT as the leader DECL for a partition.
+ Prefer ignored decls, to simplify debug dumps and reduce ambiguity
+ out of the same user variable being in multiple partitions (this is
+ less likely for compiler-introduced temps). */
+
+static tree
+leader_merge (tree cur, tree next)
+{
+ if (cur == NULL || cur == next)
+ return next;
+
+ if (DECL_P (cur) && DECL_IGNORED_P (cur))
+ return cur;
+
+ if (DECL_P (next) && DECL_IGNORED_P (next))
+ return next;
+
+ return cur;
+}
+
+
+/* Return the RTL for the default SSA def of a PARM or RESULT, if
+ there is one. */
+
+rtx
+get_rtl_for_parm_ssa_default_def (tree var)
+{
+ gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL);
+
+ if (!is_gimple_reg (var))
+ return NULL_RTX;
+
+ /* If we've already determined RTL for the decl, use it. This is
+ not just an optimization: if VAR is a PARM whose incoming value
+ is unused, we won't find a default def to use its partition, but
+ we still want to use the location of the parm, if it was used at
+ all. During assign_parms, until a location is assigned for the
+ VAR, RTL can only be set for a parm or result if we're not coalescing
+ across variables, when we know we're coalescing all SSA_NAMEs of
+ each parm or result, and we're not coalescing them with names
+ pertaining to other variables, such as other parms' default
+ defs. */
+ if (DECL_RTL_SET_P (var))
+ {
+ gcc_assert (DECL_RTL (var) != pc_rtx);
+ return DECL_RTL (var);
+ }
+
+ tree name = ssa_default_def (cfun, var);
+
+ if (!name)
+ return NULL_RTX;
+
+ int part = var_to_partition (SA.map, name);
+ if (part == NO_PARTITION)
+ return NULL_RTX;
+
+ return SA.partition_to_pseudo[part];
+}
+
/* Associate declaration T with storage space X. If T is no
SSA name this is exactly SET_DECL_RTL, otherwise make the
partition of T associated with X. */
static inline void
set_rtl (tree t, rtx x)
{
+ if (x && SSAVAR (t))
+ {
+ bool skip = false;
+ tree cur = NULL_TREE;
+
+ if (MEM_P (x))
+ cur = MEM_EXPR (x);
+ else if (REG_P (x))
+ cur = REG_EXPR (x);
+ else if (GET_CODE (x) == CONCAT
+ && REG_P (XEXP (x, 0)))
+ cur = REG_EXPR (XEXP (x, 0));
+ else if (GET_CODE (x) == PARALLEL)
+ cur = REG_EXPR (XVECEXP (x, 0, 0));
+ else if (x == pc_rtx)
+ skip = true;
+ else
+ gcc_unreachable ();
+
+ tree next = skip ? cur : leader_merge (cur, SSAVAR (t));
+
+ if (cur != next)
+ {
+ if (MEM_P (x))
+ set_mem_attributes (x, next, true);
+ else
+ set_reg_attrs_for_decl_rtl (next, x);
+ }
+ }
+
if (TREE_CODE (t) == SSA_NAME)
{
- SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x;
- if (x && !MEM_P (x))
- set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x);
- /* For the benefit of debug information at -O0 (where vartracking
- doesn't run) record the place also in the base DECL if it's
- a normal variable (not a parameter). */
- if (x && x != pc_rtx && TREE_CODE (SSA_NAME_VAR (t)) == VAR_DECL)
+ int part = var_to_partition (SA.map, t);
+ if (part != NO_PARTITION)
+ {
+ if (SA.partition_to_pseudo[part])
+ gcc_assert (SA.partition_to_pseudo[part] == x);
+ else
+ SA.partition_to_pseudo[part] = x;
+ }
+ /* For the benefit of debug information at -O0 (where
+ vartracking doesn't run) record the place also in the base
+ DECL. For PARMs and RESULTs, we may end up resetting these
+ in function.c:maybe_reset_rtl_for_parm, but in some rare
+ cases we may need them (unused and overwritten incoming
+ value, that at -O0 must share the location with the other
+ uses in spite of the missing default def), and this may be
+ the only chance to preserve them. */
+ if (x && x != pc_rtx && SSA_NAME_VAR (t))
{
tree var = SSA_NAME_VAR (t);
/* If we don't yet have something recorded, just record it now. */
@@ -862,7 +962,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
x = plus_constant (Pmode, base, offset);
- x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x);
+ x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
+ ? TYPE_MODE (TREE_TYPE (decl))
+ : DECL_MODE (SSAVAR (decl)), x);
if (TREE_CODE (decl) != SSA_NAME)
{
@@ -884,7 +986,6 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
DECL_USER_ALIGN (decl) = 0;
}
- set_mem_attributes (x, SSAVAR (decl), true);
set_rtl (decl, x);
}
@@ -1099,13 +1200,22 @@ account_stack_vars (void)
to a variable to be allocated in the stack frame. */
static void
-expand_one_stack_var (tree var)
+expand_one_stack_var_1 (tree var)
{
HOST_WIDE_INT size, offset;
unsigned byte_align;
- size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
- byte_align = align_local_variable (SSAVAR (var));
+ if (TREE_CODE (var) == SSA_NAME)
+ {
+ tree type = TREE_TYPE (var);
+ size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
+ byte_align = TYPE_ALIGN_UNIT (type);
+ }
+ else
+ {
+ size = tree_to_uhwi (DECL_SIZE_UNIT (var));
+ byte_align = align_local_variable (var);
+ }
/* We handle highly aligned variables in expand_stack_vars. */
gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
@@ -1116,6 +1226,27 @@ expand_one_stack_var (tree var)
crtl->max_used_stack_slot_alignment, offset);
}
+/* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are
+ already assigned some MEM. */
+
+static void
+expand_one_stack_var (tree var)
+{
+ if (TREE_CODE (var) == SSA_NAME)
+ {
+ int part = var_to_partition (SA.map, var);
+ if (part != NO_PARTITION)
+ {
+ rtx x = SA.partition_to_pseudo[part];
+ gcc_assert (x);
+ gcc_assert (MEM_P (x));
+ return;
+ }
+ }
+
+ return expand_one_stack_var_1 (var);
+}
+
/* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL
that will reside in a hard register. */
@@ -1125,13 +1256,114 @@ expand_one_hard_reg_var (tree var)
rest_of_decl_compilation (var, 0, 0);
}
+/* Record the alignment requirements of some variable assigned to a
+ pseudo. */
+
+static void
+record_alignment_for_reg_var (unsigned int align)
+{
+ if (SUPPORTS_STACK_ALIGNMENT
+ && crtl->stack_alignment_estimated < align)
+ {
+ /* stack_alignment_estimated shouldn't change after stack
+ realign decision made */
+ gcc_assert (!crtl->stack_realign_processed);
+ crtl->stack_alignment_estimated = align;
+ }
+
+ /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
+ So here we only make sure stack_alignment_needed >= align. */
+ if (crtl->stack_alignment_needed < align)
+ crtl->stack_alignment_needed = align;
+ if (crtl->max_used_stack_slot_alignment < align)
+ crtl->max_used_stack_slot_alignment = align;
+}
+
+/* Create RTL for an SSA partition. */
+
+static void
+expand_one_ssa_partition (tree var)
+{
+ int part = var_to_partition (SA.map, var);
+ gcc_assert (part != NO_PARTITION);
+
+ if (SA.partition_to_pseudo[part])
+ return;
+
+ if (!use_register_for_decl (var))
+ {
+ expand_one_stack_var_1 (var);
+ return;
+ }
+
+ unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (var),
+ TYPE_MODE (TREE_TYPE (var)),
+ TYPE_ALIGN (TREE_TYPE (var)));
+
+ /* If the variable alignment is very large we'll dynamically allocate
+ it, which means that in-frame portion is just a pointer. */
+ if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
+ align = POINTER_SIZE;
+
+ record_alignment_for_reg_var (align);
+
+ machine_mode reg_mode = promote_ssa_mode (var, NULL);
+
+ rtx x = gen_reg_rtx (reg_mode);
+
+ set_rtl (var, x);
+}
+
+/* Record the association between the RTL generated for a partition
+ and the underlying variable of the SSA_NAME. */
+
+static void
+adjust_one_expanded_partition_var (tree var)
+{
+ if (!var)
+ return;
+
+ tree decl = SSA_NAME_VAR (var);
+
+ int part = var_to_partition (SA.map, var);
+ if (part == NO_PARTITION)
+ return;
+
+ rtx x = SA.partition_to_pseudo[part];
+
+ set_rtl (var, x);
+
+ if (!REG_P (x))
+ return;
+
+ /* Note if the object is a user variable. */
+ if (decl && !DECL_ARTIFICIAL (decl))
+ mark_user_reg (x);
+
+ if (POINTER_TYPE_P (decl ? TREE_TYPE (decl) : TREE_TYPE (var)))
+ mark_reg_pointer (x, get_pointer_alignment (var));
+}
+
/* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL
that will reside in a pseudo register. */
static void
expand_one_register_var (tree var)
{
- tree decl = SSAVAR (var);
+ if (TREE_CODE (var) == SSA_NAME)
+ {
+ int part = var_to_partition (SA.map, var);
+ if (part != NO_PARTITION)
+ {
+ rtx x = SA.partition_to_pseudo[part];
+ gcc_assert (x);
+ gcc_assert (REG_P (x));
+ return;
+ }
+ gcc_unreachable ();
+ }
+
+ tree decl = var;
tree type = TREE_TYPE (decl);
machine_mode reg_mode = promote_decl_mode (decl, NULL);
rtx x = gen_reg_rtx (reg_mode);
@@ -1265,21 +1497,7 @@ expand_one_var (tree var, bool toplevel, bool really_expand)
align = POINTER_SIZE;
}
- if (SUPPORTS_STACK_ALIGNMENT
- && crtl->stack_alignment_estimated < align)
- {
- /* stack_alignment_estimated shouldn't change after stack
- realign decision made */
- gcc_assert (!crtl->stack_realign_processed);
- crtl->stack_alignment_estimated = align;
- }
-
- /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
- So here we only make sure stack_alignment_needed >= align. */
- if (crtl->stack_alignment_needed < align)
- crtl->stack_alignment_needed = align;
- if (crtl->max_used_stack_slot_alignment < align)
- crtl->max_used_stack_slot_alignment = align;
+ record_alignment_for_reg_var (align);
if (TREE_CODE (origvar) == SSA_NAME)
{
@@ -1713,48 +1931,18 @@ expand_used_vars (void)
if (targetm.use_pseudo_pic_reg ())
pic_offset_table_rtx = gen_reg_rtx (Pmode);
- hash_map<tree, tree> ssa_name_decls;
for (i = 0; i < SA.map->num_partitions; i++)
{
tree var = partition_to_var (SA.map, i);
gcc_assert (!virtual_operand_p (var));
- /* Assign decls to each SSA name partition, share decls for partitions
- we could have coalesced (those with the same type). */
- if (SSA_NAME_VAR (var) == NULL_TREE)
- {
- tree *slot = &ssa_name_decls.get_or_insert (TREE_TYPE (var));
- if (!*slot)
- *slot = create_tmp_reg (TREE_TYPE (var));
- replace_ssa_name_symbol (var, *slot);
- }
-
- /* Always allocate space for partitions based on VAR_DECLs. But for
- those based on PARM_DECLs or RESULT_DECLs and which matter for the
- debug info, there is no need to do so if optimization is disabled
- because all the SSA_NAMEs based on these DECLs have been coalesced
- into a single partition, which is thus assigned the canonical RTL
- location of the DECLs. If in_lto_p, we can't rely on optimize,
- a function could be compiled with -O1 -flto first and only the
- link performed at -O0. */
- if (TREE_CODE (SSA_NAME_VAR (var)) == VAR_DECL)
- expand_one_var (var, true, true);
- else if (DECL_IGNORED_P (SSA_NAME_VAR (var)) || optimize || in_lto_p)
- {
- /* This is a PARM_DECL or RESULT_DECL. For those partitions that
- contain the default def (representing the parm or result itself)
- we don't do anything here. But those which don't contain the
- default def (representing a temporary based on the parm/result)
- we need to allocate space just like for normal VAR_DECLs. */
- if (!bitmap_bit_p (SA.partition_has_default_def, i))
- {
- expand_one_var (var, true, true);
- gcc_assert (SA.partition_to_pseudo[i]);
- }
- }
+ expand_one_ssa_partition (var);
}
+ for (i = 1; i < num_ssa_names; i++)
+ adjust_one_expanded_partition_var (ssa_name (i));
+
if (flag_stack_protect == SPCT_FLAG_STRONG)
gen_stack_protect_signal
= stack_protect_decl_p () || stack_protect_return_slot_p ();
@@ -5928,35 +6116,6 @@ pass_expand::execute (function *fun)
parm_birth_insn = var_seq;
}
- /* Now that we also have the parameter RTXs, copy them over to our
- partitions. */
- for (i = 0; i < SA.map->num_partitions; i++)
- {
- tree var = SSA_NAME_VAR (partition_to_var (SA.map, i));
-
- if (TREE_CODE (var) != VAR_DECL
- && !SA.partition_to_pseudo[i])
- SA.partition_to_pseudo[i] = DECL_RTL_IF_SET (var);
- gcc_assert (SA.partition_to_pseudo[i]);
-
- /* If this decl was marked as living in multiple places, reset
- this now to NULL. */
- if (DECL_RTL_IF_SET (var) == pc_rtx)
- SET_DECL_RTL (var, NULL);
-
- /* Some RTL parts really want to look at DECL_RTL(x) when x
- was a decl marked in REG_ATTR or MEM_ATTR. We could use
- SET_DECL_RTL here making this available, but that would mean
- to select one of the potentially many RTLs for one DECL. Instead
- of doing that we simply reset the MEM_EXPR of the RTL in question,
- then nobody can get at it and hence nobody can call DECL_RTL on it. */
- if (!DECL_RTL_SET_P (var))
- {
- if (MEM_P (SA.partition_to_pseudo[i]))
- set_mem_expr (SA.partition_to_pseudo[i], NULL);
- }
- }
-
/* If we have a class containing differently aligned pointers
we need to merge those into the corresponding RTL pointer
alignment. */
@@ -5964,7 +6123,6 @@ pass_expand::execute (function *fun)
{
tree name = ssa_name (i);
int part;
- rtx r;
if (!name
/* We might have generated new SSA names in
@@ -5977,20 +6135,24 @@ pass_expand::execute (function *fun)
if (part == NO_PARTITION)
continue;
- /* Adjust all partition members to get the underlying decl of
- the representative which we might have created in expand_one_var. */
- if (SSA_NAME_VAR (name) == NULL_TREE)
+ gcc_assert (SA.partition_to_pseudo[part]);
+
+ /* If this decl was marked as living in multiple places, reset
+ this now to NULL. */
+ tree var = SSA_NAME_VAR (name);
+ if (var && DECL_RTL_IF_SET (var) == pc_rtx)
+ SET_DECL_RTL (var, NULL);
+ /* Check that the pseudos chosen by assign_parms are those of
+ the corresponding default defs. */
+ else if (SSA_NAME_IS_DEFAULT_DEF (name)
+ && (TREE_CODE (var) == PARM_DECL
+ || TREE_CODE (var) == RESULT_DECL))
{
- tree leader = partition_to_var (SA.map, part);
- gcc_assert (SSA_NAME_VAR (leader) != NULL_TREE);
- replace_ssa_name_symbol (name, SSA_NAME_VAR (leader));
+ rtx in = DECL_RTL_IF_SET (var);
+ gcc_assert (in);
+ rtx out = SA.partition_to_pseudo[part];
+ gcc_assert (in == out || rtx_equal_p (in, out));
}
- if (!POINTER_TYPE_P (TREE_TYPE (name)))
- continue;
-
- r = SA.partition_to_pseudo[part];
- if (REG_P (r))
- mark_reg_pointer (r, get_pointer_alignment (name));
}
/* If this function is `main', emit a call to `__main'
diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
index a0b6e3e..602579d 100644
--- a/gcc/cfgexpand.h
+++ b/gcc/cfgexpand.h
@@ -22,5 +22,7 @@ along with GCC; see the file COPYING3. If not see
extern tree gimple_assign_rhs_to_tree (gimple);
extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
+extern rtx get_rtl_for_parm_ssa_default_def (tree var);
+
#endif /* GCC_CFGEXPAND_H */
diff --git a/gcc/common.opt b/gcc/common.opt
index 6b2ccbc..89dcabf 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2230,16 +2230,16 @@ Common Report Var(flag_tree_ch) Optimization
Enable loop header copying on trees
ftree-coalesce-inlined-vars
-Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization
-Enable coalescing of copy-related user variables that are inlined
+Common Ignore RejectNegative
+Does nothing. Preserved for backward compatibility.
ftree-coalesce-vars
-Common Report Var(flag_ssa_coalesce_vars,2) Optimization
-Enable coalescing of all copy-related user variables
+Common Report Var(flag_tree_coalesce_vars) Optimization
+Enable SSA coalescing of user variables
ftree-copyrename
-Common Report Var(flag_tree_copyrename) Optimization
-Replace SSA temporaries with better names in copies
+Common Ignore
+Does nothing. Preserved for backward compatibility.
ftree-copy-prop
Common Report Var(flag_tree_copy_prop) Optimization
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 522e924..681c33e 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -339,7 +339,6 @@ Objective-C and Objective-C++ Dialects}.
-fdump-tree-phiprop@r{[}-@var{n}@r{]} @gol
-fdump-tree-phiopt@r{[}-@var{n}@r{]} @gol
-fdump-tree-forwprop@r{[}-@var{n}@r{]} @gol
--fdump-tree-copyrename@r{[}-@var{n}@r{]} @gol
-fdump-tree-nrv -fdump-tree-vect @gol
-fdump-tree-sink @gol
-fdump-tree-sra@r{[}-@var{n}@r{]} @gol
@@ -445,9 +444,8 @@ Objective-C and Objective-C++ Dialects}.
-fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol
-fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
-ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
--ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
--ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
--ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
+-ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol
+-ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
-ftree-loop-if-convert-stores -ftree-loop-im @gol
-ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
-ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
@@ -7078,11 +7076,6 @@ name is made by appending @file{.phiopt} to the source file name.
Dump each function after forward propagating single use variables. The file
name is made by appending @file{.forwprop} to the source file name.
-@item copyrename
-@opindex fdump-tree-copyrename
-Dump each function after applying the copy rename optimization. The file
-name is made by appending @file{.copyrename} to the source file name.
-
@item nrv
@opindex fdump-tree-nrv
Dump each function after applying the named return value optimization on
@@ -7547,8 +7540,8 @@ compilation time.
-ftree-ccp @gol
-fssa-phiopt @gol
-ftree-ch @gol
+-ftree-coalesce-vars @gol
-ftree-copy-prop @gol
--ftree-copyrename @gol
-ftree-dce @gol
-ftree-dominator-opts @gol
-ftree-dse @gol
@@ -8817,6 +8810,15 @@ profitable to parallelize the loops.
Compare the results of several data dependence analyzers. This option
is used for debugging the data dependence analyzers.
+@item -ftree-coalesce-vars
+@opindex ftree-coalesce-vars
+Tell the compiler to attempt to combine small user-defined variables
+too, instead of just compiler temporaries. This may severely limit the
+ability to debug an optimized program compiled with
+@option{-fno-var-tracking-assignments}. In the negated form, this flag
+prevents SSA coalescing of user variables. This option is enabled by
+default if optimization is enabled.
+
@item -ftree-loop-if-convert
@opindex ftree-loop-if-convert
Attempt to transform conditional jumps in the innermost loops to
@@ -8930,32 +8932,6 @@ Perform scalar replacement of aggregates. This pass replaces structure
references with scalars to prevent committing structures to memory too
early. This flag is enabled by default at @option{-O} and higher.
-@item -ftree-copyrename
-@opindex ftree-copyrename
-Perform copy renaming on trees. This pass attempts to rename compiler
-temporaries to other variables at copy locations, usually resulting in
-variable names which more closely resemble the original variables. This flag
-is enabled by default at @option{-O} and higher.
-
-@item -ftree-coalesce-inlined-vars
-@opindex ftree-coalesce-inlined-vars
-Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
-combine small user-defined variables too, but only if they are inlined
-from other functions. It is a more limited form of
-@option{-ftree-coalesce-vars}. This may harm debug information of such
-inlined variables, but it keeps variables of the inlined-into
-function apart from each other, such that they are more likely to
-contain the expected values in a debugging session.
-
-@item -ftree-coalesce-vars
-@opindex ftree-coalesce-vars
-Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to
-combine small user-defined variables too, instead of just compiler
-temporaries. This may severely limit the ability to debug an optimized
-program compiled with @option{-fno-var-tracking-assignments}. In the
-negated form, this flag prevents SSA coalescing of user variables,
-including inlined ones. This option is enabled by default.
-
@item -ftree-ter
@opindex ftree-ter
Perform temporary expression replacement during the SSA->normal phase. Single
diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index ed2b30b..0648af6 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -1232,6 +1232,9 @@ set_reg_attrs_for_parm (rtx parm_rtx, rtx mem)
void
set_reg_attrs_for_decl_rtl (tree t, rtx x)
{
+ if (!t)
+ return;
+ tree tdecl = t;
if (GET_CODE (x) == SUBREG)
{
gcc_assert (subreg_lowpart_p (x));
@@ -1240,7 +1243,7 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x)
if (REG_P (x))
REG_ATTRS (x)
= get_reg_attrs (t, byte_lowpart_offset (GET_MODE (x),
- DECL_MODE (t)));
+ DECL_MODE (tdecl)));
if (GET_CODE (x) == CONCAT)
{
if (REG_P (XEXP (x, 0)))
diff --git a/gcc/explow.c b/gcc/explow.c
index bd342c1..6dba6e5 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -842,6 +842,28 @@ promote_decl_mode (const_tree decl, int *punsignedp)
return pmode;
}
+/* Return the promoted mode for NAME. If it is a named SSA_NAME, it
+ is the same as promote_decl_mode. Otherwise, it is the promoted
+ mode of a temp decl of the same type as the SSA_NAME, if we had
+ created one. */
+
+machine_mode
+promote_ssa_mode (const_tree name, int *punsignedp)
+{
+ gcc_assert (TREE_CODE (name) == SSA_NAME);
+
+ tree type = TREE_TYPE (name);
+ int unsignedp = TYPE_UNSIGNED (type);
+ machine_mode mode = TYPE_MODE (type);
+
+ machine_mode pmode = promote_mode (type, mode, &unsignedp);
+ if (punsignedp)
+ *punsignedp = unsignedp;
+
+ return pmode;
+}
+
+
\f
/* Controls the behaviour of {anti_,}adjust_stack. */
static bool suppress_reg_args_size;
diff --git a/gcc/explow.h b/gcc/explow.h
index 94613de..52113db 100644
--- a/gcc/explow.h
+++ b/gcc/explow.h
@@ -57,6 +57,9 @@ extern machine_mode promote_mode (const_tree, machine_mode, int *);
/* Return mode and signedness to use when object is promoted. */
machine_mode promote_decl_mode (const_tree, int *);
+/* Return mode and signedness to use when an SSA_NAME is promoted. */
+machine_mode promote_ssa_mode (const_tree, int *);
+
/* Remove some bytes from the stack. An rtx says how many. */
extern void adjust_stack (rtx);
diff --git a/gcc/expr.c b/gcc/expr.c
index 899a42c..d601129 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9246,7 +9246,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
rtx op0, op1, temp, decl_rtl;
tree type;
int unsignedp;
- machine_mode mode;
+ machine_mode mode, dmode;
enum tree_code code = TREE_CODE (exp);
rtx subtarget, original_target;
int ignore;
@@ -9377,7 +9377,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
if (g == NULL
&& modifier == EXPAND_INITIALIZER
&& !SSA_NAME_IS_DEFAULT_DEF (exp)
- && (optimize || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
+ && (optimize || !SSA_NAME_VAR (exp)
+ || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
&& stmt_is_replaceable_p (SSA_NAME_DEF_STMT (exp)))
g = SSA_NAME_DEF_STMT (exp);
if (g)
@@ -9456,15 +9457,18 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
/* Ensure variable marked as used even if it doesn't go through
a parser. If it hasn't be used yet, write out an external
definition. */
- TREE_USED (exp) = 1;
+ if (exp)
+ TREE_USED (exp) = 1;
/* Show we haven't gotten RTL for this yet. */
temp = 0;
/* Variables inherited from containing functions should have
been lowered by this point. */
- context = decl_function_context (exp);
- gcc_assert (SCOPE_FILE_SCOPE_P (context)
+ if (exp)
+ context = decl_function_context (exp);
+ gcc_assert (!exp
+ || SCOPE_FILE_SCOPE_P (context)
|| context == current_function_decl
|| TREE_STATIC (exp)
|| DECL_EXTERNAL (exp)
@@ -9488,7 +9492,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
decl_rtl = use_anchored_address (decl_rtl);
if (modifier != EXPAND_CONST_ADDRESS
&& modifier != EXPAND_SUM
- && !memory_address_addr_space_p (DECL_MODE (exp),
+ && !memory_address_addr_space_p (exp ? DECL_MODE (exp)
+ : GET_MODE (decl_rtl),
XEXP (decl_rtl, 0),
MEM_ADDR_SPACE (decl_rtl)))
temp = replace_equiv_address (decl_rtl,
@@ -9499,12 +9504,17 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
if the address is a register. */
if (temp != 0)
{
- if (MEM_P (temp) && REG_P (XEXP (temp, 0)))
+ if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0)))
mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp));
return temp;
}
+ if (exp)
+ dmode = DECL_MODE (exp);
+ else
+ dmode = TYPE_MODE (TREE_TYPE (ssa_name));
+
/* If the mode of DECL_RTL does not match that of the decl,
there are two cases: we are dealing with a BLKmode value
that is returned in a register, or we are dealing with
@@ -9512,22 +9522,23 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
of the wanted mode, but mark it so that we know that it
was already extended. */
if (REG_P (decl_rtl)
- && DECL_MODE (exp) != BLKmode
- && GET_MODE (decl_rtl) != DECL_MODE (exp))
+ && dmode != BLKmode
+ && GET_MODE (decl_rtl) != dmode)
{
machine_mode pmode;
/* Get the signedness to be used for this variable. Ensure we get
the same mode we got when the variable was declared. */
- if (code == SSA_NAME
- && (g = SSA_NAME_DEF_STMT (ssa_name))
- && gimple_code (g) == GIMPLE_CALL
- && !gimple_call_internal_p (g))
+ if (code != SSA_NAME)
+ pmode = promote_decl_mode (exp, &unsignedp);
+ else if ((g = SSA_NAME_DEF_STMT (ssa_name))
+ && gimple_code (g) == GIMPLE_CALL
+ && !gimple_call_internal_p (g))
pmode = promote_function_mode (type, mode, &unsignedp,
gimple_call_fntype (g),
2);
else
- pmode = promote_decl_mode (exp, &unsignedp);
+ pmode = promote_ssa_mode (ssa_name, &unsignedp);
gcc_assert (GET_MODE (decl_rtl) == pmode);
temp = gen_lowpart_SUBREG (mode, decl_rtl);
diff --git a/gcc/function.c b/gcc/function.c
index f9d11bf4..840f4a2 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -72,6 +72,9 @@ along with GCC; see the file COPYING3. If not see
#include "cfganal.h"
#include "cfgbuild.h"
#include "cfgcleanup.h"
+#include "cfgexpand.h"
+#include "basic-block.h"
+#include "df.h"
#include "params.h"
#include "bb-reorder.h"
#include "shrink-wrap.h"
@@ -2105,6 +2108,30 @@ aggregate_value_p (const_tree exp, const_tree fntype)
bool
use_register_for_decl (const_tree decl)
{
+ if (TREE_CODE (decl) == SSA_NAME)
+ {
+ /* We often try to use the SSA_NAME, instead of its underlying
+ decl, to get type information and guide decisions, to avoid
+ differences of behavior between anonymous and named
+ variables, but in this one case we have to go for the actual
+ variable if there is one. The main reason is that, at least
+ at -O0, we want to place user variables on the stack, but we
+ don't mind using pseudos for anonymous or ignored temps.
+ Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
+ should go in pseudos, whereas their corresponding variables
+ might have to go on the stack. So, disregarding the decl
+ here would negatively impact debug info at -O0, enable
+ coalescing between SSA_NAMEs that ought to get different
+ stack/pseudo assignments, and get the incoming argument
+ processing thoroughly confused by PARM_DECLs expected to live
+ in stack slots but assigned to pseudos. */
+ if (!SSA_NAME_VAR (decl))
+ return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
+ && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
+
+ decl = SSA_NAME_VAR (decl);
+ }
+
if (!targetm.calls.allocate_stack_slots_for_args ())
return true;
@@ -2745,23 +2772,88 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data)
data->entry_parm = entry_parm;
}
+/* Wrapper for use_register_for_decl, that special-cases the
+ .result_ptr as the function's RESULT_DECL when the RESULT_DECL is
+ passed by reference. */
+
+static bool
+use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm)
+{
+ if (parm == all->function_result_decl)
+ {
+ tree result = DECL_RESULT (current_function_decl);
+
+ if (DECL_BY_REFERENCE (result))
+ parm = result;
+ }
+
+ return use_register_for_decl (parm);
+}
+
+/* Wrapper for get_rtl_for_parm_ssa_default_def, that special-cases
+ the .result_ptr as the function's RESULT_DECL when the RESULT_DECL
+ is passed by reference. */
+
+static rtx
+rtl_for_parm (struct assign_parm_data_all *all, tree parm)
+{
+ if (parm == all->function_result_decl)
+ {
+ tree result = DECL_RESULT (current_function_decl);
+
+ if (!DECL_BY_REFERENCE (result))
+ return NULL_RTX;
+
+ parm = result;
+ }
+
+ return get_rtl_for_parm_ssa_default_def (parm);
+}
+
+/* Reset the location of PARM_DECLs and RESULT_DECLs that had
+ SSA_NAMEs in multiple partitions, so that assign_parms will choose
+ the default def, if it exists, or create new RTL to hold the unused
+ entry value. If we are coalescing across variables, we want to
+ reset the location too, because a parm without a default def
+ (incoming value unused) might be coalesced with one with a default
+ def, and then assign_parms would copy both incoming values to the
+ same location, which might cause the wrong value to survive. */
+static void
+maybe_reset_rtl_for_parm (tree parm)
+{
+ gcc_assert (TREE_CODE (parm) == PARM_DECL
+ || TREE_CODE (parm) == RESULT_DECL);
+ if ((flag_tree_coalesce_vars
+ || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx))
+ && is_gimple_reg (parm))
+ SET_DECL_RTL (parm, NULL_RTX);
+}
+
/* A subroutine of assign_parms. Adjust DATA->STACK_RTL such that it's
always valid and properly aligned. */
static void
-assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data)
+assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm,
+ struct assign_parm_data_one *data)
{
rtx stack_parm = data->stack_parm;
+ /* If out-of-SSA assigned RTL to the parm default def, make sure we
+ don't use what we might have computed before. */
+ rtx ssa_assigned = rtl_for_parm (all, parm);
+ if (ssa_assigned)
+ stack_parm = NULL;
+
/* If we can't trust the parm stack slot to be aligned enough for its
ultimate type, don't use that slot after entry. We'll make another
stack slot, if we need one. */
- if (stack_parm
- && ((STRICT_ALIGNMENT
- && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm))
- || (data->nominal_type
- && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
- && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
+ else if (stack_parm
+ && ((STRICT_ALIGNMENT
+ && (GET_MODE_ALIGNMENT (data->nominal_mode)
+ > MEM_ALIGN (stack_parm)))
+ || (data->nominal_type
+ && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
+ && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
stack_parm = NULL;
/* If parm was passed in memory, and we need to convert it on entry,
@@ -2823,11 +2915,16 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
size = int_size_in_bytes (data->passed_type);
size_stored = CEIL_ROUND (size, UNITS_PER_WORD);
+
if (stack_parm == 0)
{
DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
- stack_parm = assign_stack_local (BLKmode, size_stored,
- DECL_ALIGN (parm));
+ stack_parm = rtl_for_parm (all, parm);
+ if (!stack_parm)
+ stack_parm = assign_stack_local (BLKmode, size_stored,
+ DECL_ALIGN (parm));
+ else
+ stack_parm = copy_rtx (stack_parm);
if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
PUT_MODE (stack_parm, GET_MODE (entry_parm));
set_mem_attributes (stack_parm, parm, 1);
@@ -2968,10 +3065,19 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
= promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
TREE_TYPE (current_function_decl), 2);
- parmreg = gen_reg_rtx (promoted_nominal_mode);
+ rtx from_expand = rtl_for_parm (all, parm);
- if (!DECL_ARTIFICIAL (parm))
- mark_user_reg (parmreg);
+ if (from_expand && !data->passed_pointer)
+ {
+ parmreg = from_expand;
+ gcc_assert (GET_MODE (parmreg) == promoted_nominal_mode);
+ }
+ else
+ {
+ parmreg = gen_reg_rtx (promoted_nominal_mode);
+ if (!DECL_ARTIFICIAL (parm))
+ mark_user_reg (parmreg);
+ }
/* If this was an item that we received a pointer to,
set DECL_RTL appropriately. */
@@ -2990,6 +3096,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
assign_parm_find_data_types and expand_expr_real_1. */
equiv_stack_parm = data->stack_parm;
+ if (!equiv_stack_parm)
+ equiv_stack_parm = data->entry_parm;
validated_mem = validize_mem (copy_rtx (data->entry_parm));
need_conversion = (data->nominal_mode != data->passed_mode
@@ -3130,11 +3238,17 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
/* If we were passed a pointer but the actual value can safely live
in a register, retrieve it and use it directly. */
- if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode)
+ if (data->passed_pointer
+ && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode))
{
/* We can't use nominal_mode, because it will have been set to
Pmode above. We must use the actual mode of the parm. */
- if (use_register_for_decl (parm))
+ if (from_expand)
+ {
+ parmreg = from_expand;
+ gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm)));
+ }
+ else if (use_register_for_decl (parm))
{
parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm)));
mark_user_reg (parmreg);
@@ -3174,7 +3288,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
/* STACK_PARM is the pointer, not the parm, and PARMREG is
now the parm. */
- data->stack_parm = NULL;
+ data->stack_parm = equiv_stack_parm = NULL;
}
/* Mark the register as eliminable if we did no conversion and it was
@@ -3184,11 +3298,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
make here would screw up life analysis for it. */
if (data->nominal_mode == data->passed_mode
&& !did_conversion
- && data->stack_parm != 0
- && MEM_P (data->stack_parm)
+ && equiv_stack_parm != 0
+ && MEM_P (equiv_stack_parm)
&& data->locate.offset.var == 0
&& reg_mentioned_p (virtual_incoming_args_rtx,
- XEXP (data->stack_parm, 0)))
+ XEXP (equiv_stack_parm, 0)))
{
rtx_insn *linsn = get_last_insn ();
rtx_insn *sinsn;
@@ -3201,8 +3315,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
= GET_MODE_INNER (GET_MODE (parmreg));
int regnor = REGNO (XEXP (parmreg, 0));
int regnoi = REGNO (XEXP (parmreg, 1));
- rtx stackr = adjust_address_nv (data->stack_parm, submode, 0);
- rtx stacki = adjust_address_nv (data->stack_parm, submode,
+ rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0);
+ rtx stacki = adjust_address_nv (equiv_stack_parm, submode,
GET_MODE_SIZE (submode));
/* Scan backwards for the set of the real and
@@ -3275,6 +3389,13 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
if (data->stack_parm == 0)
{
+ rtx x = data->stack_parm = rtl_for_parm (all, parm);
+ if (x)
+ gcc_assert (GET_MODE (x) == GET_MODE (data->entry_parm));
+ }
+
+ if (data->stack_parm == 0)
+ {
int align = STACK_SLOT_ALIGNMENT (data->passed_type,
GET_MODE (data->entry_parm),
TYPE_ALIGN (data->passed_type));
@@ -3531,6 +3652,8 @@ assign_parms (tree fndecl)
DECL_INCOMING_RTL (parm) = DECL_RTL (parm);
continue;
}
+ else
+ maybe_reset_rtl_for_parm (parm);
/* Estimate stack alignment from parameter alignment. */
if (SUPPORTS_STACK_ALIGNMENT)
@@ -3580,7 +3703,9 @@ assign_parms (tree fndecl)
else
set_decl_incoming_rtl (parm, data.entry_parm, false);
- /* Boudns should be loaded in the particular order to
+ assign_parm_adjust_stack_rtl (&all, parm, &data);
+
+ /* Bounds should be loaded in the particular order to
have registers allocated correctly. Collect info about
input bounds and load them later. */
if (POINTER_BOUNDS_TYPE_P (data.passed_type))
@@ -3597,11 +3722,10 @@ assign_parms (tree fndecl)
}
else
{
- assign_parm_adjust_stack_rtl (&data);
-
if (assign_parm_setup_block_p (&data))
assign_parm_setup_block (&all, parm, &data);
- else if (data.passed_pointer || use_register_for_decl (parm))
+ else if (data.passed_pointer
+ || use_register_for_parm_decl (&all, parm))
assign_parm_setup_reg (&all, parm, &data);
else
assign_parm_setup_stack (&all, parm, &data);
@@ -4932,7 +5056,9 @@ expand_function_start (tree subr)
before any library calls that assign parms might generate. */
/* Decide whether to return the value in memory or in a register. */
- if (aggregate_value_p (DECL_RESULT (subr), subr))
+ tree res = DECL_RESULT (subr);
+ maybe_reset_rtl_for_parm (res);
+ if (aggregate_value_p (res, subr))
{
/* Returning something that won't go in a register. */
rtx value_address = 0;
@@ -4940,7 +5066,7 @@ expand_function_start (tree subr)
#ifdef PCC_STATIC_STRUCT_RETURN
if (cfun->returns_pcc_struct)
{
- int size = int_size_in_bytes (TREE_TYPE (DECL_RESULT (subr)));
+ int size = int_size_in_bytes (TREE_TYPE (res));
value_address = assemble_static_space (size);
}
else
@@ -4952,36 +5078,45 @@ expand_function_start (tree subr)
it. */
if (sv)
{
- value_address = gen_reg_rtx (Pmode);
+ if (DECL_BY_REFERENCE (res))
+ value_address = get_rtl_for_parm_ssa_default_def (res);
+ if (!value_address)
+ value_address = gen_reg_rtx (Pmode);
emit_move_insn (value_address, sv);
}
}
if (value_address)
{
rtx x = value_address;
- if (!DECL_BY_REFERENCE (DECL_RESULT (subr)))
+ if (!DECL_BY_REFERENCE (res))
{
- x = gen_rtx_MEM (DECL_MODE (DECL_RESULT (subr)), x);
- set_mem_attributes (x, DECL_RESULT (subr), 1);
+ x = get_rtl_for_parm_ssa_default_def (res);
+ if (!x)
+ {
+ x = gen_rtx_MEM (DECL_MODE (res), value_address);
+ set_mem_attributes (x, res, 1);
+ }
}
- SET_DECL_RTL (DECL_RESULT (subr), x);
+ SET_DECL_RTL (res, x);
}
}
- else if (DECL_MODE (DECL_RESULT (subr)) == VOIDmode)
+ else if (DECL_MODE (res) == VOIDmode)
/* If return mode is void, this decl rtl should not be used. */
- SET_DECL_RTL (DECL_RESULT (subr), NULL_RTX);
+ SET_DECL_RTL (res, NULL_RTX);
else
{
/* Compute the return values into a pseudo reg, which we will copy
into the true return register after the cleanups are done. */
- tree return_type = TREE_TYPE (DECL_RESULT (subr));
- if (TYPE_MODE (return_type) != BLKmode
- && targetm.calls.return_in_msb (return_type))
+ tree return_type = TREE_TYPE (res);
+ rtx x = get_rtl_for_parm_ssa_default_def (res);
+ if (x)
+ /* Use it. */;
+ else if (TYPE_MODE (return_type) != BLKmode
+ && targetm.calls.return_in_msb (return_type))
/* expand_function_end will insert the appropriate padding in
this case. Use the return value's natural (unpadded) mode
within the function proper. */
- SET_DECL_RTL (DECL_RESULT (subr),
- gen_reg_rtx (TYPE_MODE (return_type)));
+ x = gen_reg_rtx (TYPE_MODE (return_type));
else
{
/* In order to figure out what mode to use for the pseudo, we
@@ -4992,25 +5127,26 @@ expand_function_start (tree subr)
/* Structures that are returned in registers are not
aggregate_value_p, so we may see a PARALLEL or a REG. */
if (REG_P (hard_reg))
- SET_DECL_RTL (DECL_RESULT (subr),
- gen_reg_rtx (GET_MODE (hard_reg)));
+ x = gen_reg_rtx (GET_MODE (hard_reg));
else
{
gcc_assert (GET_CODE (hard_reg) == PARALLEL);
- SET_DECL_RTL (DECL_RESULT (subr), gen_group_rtx (hard_reg));
+ x = gen_group_rtx (hard_reg);
}
}
+ SET_DECL_RTL (res, x);
+
/* Set DECL_REGISTER flag so that expand_function_end will copy the
result to the real return register(s). */
- DECL_REGISTER (DECL_RESULT (subr)) = 1;
+ DECL_REGISTER (res) = 1;
if (chkp_function_instrumented_p (current_function_decl))
{
- tree return_type = TREE_TYPE (DECL_RESULT (subr));
+ tree return_type = TREE_TYPE (res);
rtx bounds = targetm.calls.chkp_function_value_bounds (return_type,
subr, 1);
- SET_DECL_BOUNDS_RTL (DECL_RESULT (subr), bounds);
+ SET_DECL_BOUNDS_RTL (res, bounds);
}
}
@@ -5025,7 +5161,9 @@ expand_function_start (tree subr)
rtx local, chain;
rtx_insn *insn;
- local = gen_reg_rtx (Pmode);
+ local = get_rtl_for_parm_ssa_default_def (parm);
+ if (!local)
+ local = gen_reg_rtx (Pmode);
chain = targetm.calls.static_chain (current_function_decl, true);
set_decl_incoming_rtl (parm, chain, false);
diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
index b558d90..baed630 100644
--- a/gcc/gimple-expr.c
+++ b/gcc/gimple-expr.c
@@ -375,45 +375,6 @@ copy_var_decl (tree var, tree name, tree type)
return copy;
}
-/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
- coalescing together, false otherwise.
-
- This must stay consistent with var_map_base_init in tree-ssa-live.c. */
-
-bool
-gimple_can_coalesce_p (tree name1, tree name2)
-{
- /* First check the SSA_NAME's associated DECL. We only want to
- coalesce if they have the same DECL or both have no associated DECL. */
- tree var1 = SSA_NAME_VAR (name1);
- tree var2 = SSA_NAME_VAR (name2);
- var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
- var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
- if (var1 != var2)
- return false;
-
- /* Now check the types. If the types are the same, then we should
- try to coalesce V1 and V2. */
- tree t1 = TREE_TYPE (name1);
- tree t2 = TREE_TYPE (name2);
- if (t1 == t2)
- return true;
-
- /* If the types are not the same, check for a canonical type match. This
- (for example) allows coalescing when the types are fundamentally the
- same, but just have different names.
-
- Note pointer types with different address spaces may have the same
- canonical type. Those are rejected for coalescing by the
- types_compatible_p check. */
- if (TYPE_CANONICAL (t1)
- && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
- && types_compatible_p (t1, t2))
- return true;
-
- return false;
-}
-
/* Strip off a legitimate source ending from the input string NAME of
length LEN. Rather than having to know the names used by all of
our front ends, we strip off an ending of a period followed by
diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
index ed23eb2..3d1c89f 100644
--- a/gcc/gimple-expr.h
+++ b/gcc/gimple-expr.h
@@ -28,7 +28,6 @@ extern gimple_seq gimple_body (tree);
extern bool gimple_has_body_p (tree);
extern const char *gimple_decl_printable_name (tree, int);
extern tree copy_var_decl (tree, tree, tree);
-extern bool gimple_can_coalesce_p (tree, tree);
extern tree create_tmp_var_name (const char *);
extern tree create_tmp_var_raw (tree, const char * = NULL);
extern tree create_tmp_var (tree, const char * = NULL);
diff --git a/gcc/opts.c b/gcc/opts.c
index 468a802..f22edd3 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -445,12 +445,12 @@ static const struct default_options default_options_table[] =
{ OPT_LEVELS_1_PLUS, OPT_fsplit_wide_types, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_ccp, NULL, 1 },
{ OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_bit_ccp, NULL, 1 },
+ { OPT_LEVELS_1_PLUS, OPT_ftree_coalesce_vars, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_dce, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_dominator_opts, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 },
{ OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 },
- { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 },
diff --git a/gcc/passes.def b/gcc/passes.def
index 5cd07ae..103fd2e 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -77,7 +77,6 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_all_early_optimizations);
PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
NEXT_PASS (pass_remove_cgraph_callee_edges);
- NEXT_PASS (pass_rename_ssa_copies);
NEXT_PASS (pass_object_sizes);
NEXT_PASS (pass_ccp);
/* After CCP we rewrite no longer addressed locals into SSA
@@ -155,7 +154,6 @@ along with GCC; see the file COPYING3. If not see
/* Initial scalar cleanups before alias computation.
They ensure memory accesses are not indirect wherever possible. */
NEXT_PASS (pass_strip_predict_hints);
- NEXT_PASS (pass_rename_ssa_copies);
NEXT_PASS (pass_ccp);
/* After CCP we rewrite no longer addressed locals into SSA
form if possible. */
@@ -183,7 +181,6 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_ch);
NEXT_PASS (pass_lower_complex);
NEXT_PASS (pass_sra);
- NEXT_PASS (pass_rename_ssa_copies);
/* The dom pass will also resolve all __builtin_constant_p calls
that are still there to 0. This has to be done after some
propagations have already run, but before some more dead code
@@ -294,7 +291,6 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_fold_builtins);
NEXT_PASS (pass_optimize_widening_mul);
NEXT_PASS (pass_tail_calls);
- NEXT_PASS (pass_rename_ssa_copies);
/* FIXME: If DCE is not run before checking for uninitialized uses,
we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
However, this also causes us to misdiagnose cases that should be
@@ -329,7 +325,6 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_dce);
NEXT_PASS (pass_asan);
NEXT_PASS (pass_tsan);
- NEXT_PASS (pass_rename_ssa_copies);
/* ??? We do want some kind of loop invariant motion, but we possibly
need to adjust LIM to be more friendly towards preserving accurate
debug information here. */
diff --git a/gcc/testsuite/gcc.dg/guality/pr54200.c b/gcc/testsuite/gcc.dg/guality/pr54200.c
index 9b17187..e1e7293 100644
--- a/gcc/testsuite/gcc.dg/guality/pr54200.c
+++ b/gcc/testsuite/gcc.dg/guality/pr54200.c
@@ -1,6 +1,6 @@
/* PR tree-optimization/54200 */
/* { dg-do run } */
-/* { dg-options "-g -fno-var-tracking-assignments" } */
+/* { dg-options "-g -fno-var-tracking-assignments -fno-tree-coalesce-vars" } */
int o __attribute__((used));
diff --git a/gcc/testsuite/gcc.dg/ssp-1.c b/gcc/testsuite/gcc.dg/ssp-1.c
index 5467f4d..db69332 100644
--- a/gcc/testsuite/gcc.dg/ssp-1.c
+++ b/gcc/testsuite/gcc.dg/ssp-1.c
@@ -12,7 +12,7 @@ __stack_chk_fail (void)
int main ()
{
- int i;
+ register int i;
char foo[255];
// smash stack
diff --git a/gcc/testsuite/gcc.dg/ssp-2.c b/gcc/testsuite/gcc.dg/ssp-2.c
index 9a7ac32..752fe53 100644
--- a/gcc/testsuite/gcc.dg/ssp-2.c
+++ b/gcc/testsuite/gcc.dg/ssp-2.c
@@ -14,7 +14,7 @@ __stack_chk_fail (void)
void
overflow()
{
- int i = 0;
+ register int i = 0;
char foo[30];
/* Overflow buffer. */
diff --git a/gcc/testsuite/gcc.dg/torture/parm-coalesce.c b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
new file mode 100644
index 0000000..dbd81c1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+
+/* Make sure we don't coalesce both incoming parms, one whose incoming
+ value is unused, to the same location, so as to overwrite one of
+ them with the incoming value of the other. */
+
+int __attribute__((noinline, noclone))
+foo (int i, int j)
+{
+ j = i; /* The incoming value for J is unused. */
+ i = 2;
+ if (j)
+ j++;
+ j += i + 1;
+ return j;
+}
+
+/* Same as foo, but with swapped parameters. */
+int __attribute__((noinline, noclone))
+bar (int j, int i)
+{
+ j = i; /* The incoming value for J is unused. */
+ i = 2;
+ if (j)
+ j++;
+ j += i + 1;
+ return j;
+}
+
+int
+main (void)
+{
+ if (foo (0, 1) != 3)
+ abort ();
+ if (bar (1, 0) != 3)
+ abort ();
+ return 0;
+}
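
For readers of the test above, here is a minimal sketch of the hazard it
guards against. It is not GCC code: expected_result, shared_slot_result
and check_model are hypothetical names, and the model assumes that the
entry code stores both incoming values into one shared location in
parameter order, so that the second store survives.

```c
#include <assert.h>

/* Source-level meaning of foo and bar from parm-coalesce.c: the
   result depends only on the incoming value of I.  */
static int
expected_result (int i)
{
  int j = i;
  i = 2;
  if (j)
    j++;
  j += i + 1;
  return j;
}

/* Hypothetical model of the failure: both parms share one location,
   and the incoming values are stored in parameter order (an
   assumption of this sketch), so the second store survives.  */
static int
shared_slot_result (int first_parm, int second_parm)
{
  int slot = first_parm;
  slot = second_parm;
  return expected_result (slot);
}

/* Self-check of the model; call it from main.  */
static void
check_model (void)
{
  assert (expected_result (0) == 3);
  /* foo (i = 0, j = 1): J's unused incoming value clobbers I.  */
  assert (shared_slot_result (0, 1) == 5);
  /* bar (j = 1, i = 0): I is stored last, so the result is right.  */
  assert (shared_slot_result (1, 0) == 3);
}
```

Under this model foo (0, 1) would return 5 rather than 3, while
bar (1, 0) would still return 3, which is presumably why the test
exercises both parameter orders.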
diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
index 7b747ab9..978476c 100644
--- a/gcc/tree-outof-ssa.c
+++ b/gcc/tree-outof-ssa.c
@@ -279,7 +279,6 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
rtx dest_rtx, seq, x;
machine_mode dest_mode, src_mode;
int unsignedp;
- tree var;
if (dump_file && (dump_flags & TDF_DETAILS))
{
@@ -301,12 +300,12 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus)
start_sequence ();
- var = SSA_NAME_VAR (partition_to_var (SA.map, dest));
+ tree name = partition_to_var (SA.map, dest);
src_mode = TYPE_MODE (TREE_TYPE (src));
dest_mode = GET_MODE (dest_rtx);
- gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var)));
+ gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (name)));
gcc_assert (!REG_P (dest_rtx)
- || dest_mode == promote_decl_mode (var, &unsignedp));
+ || dest_mode == promote_ssa_mode (name, &unsignedp));
if (src_mode != dest_mode)
{
@@ -682,13 +681,12 @@ elim_backward (elim_graph g, int T)
static rtx
get_temp_reg (tree name)
{
- tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
- tree type = TREE_TYPE (var);
+ tree type = TREE_TYPE (name);
int unsignedp;
- machine_mode reg_mode = promote_decl_mode (var, &unsignedp);
+ machine_mode reg_mode = promote_ssa_mode (name, &unsignedp);
rtx x = gen_reg_rtx (reg_mode);
if (POINTER_TYPE_P (type))
- mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var))));
+ mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type)));
return x;
}
@@ -988,7 +986,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
/* Return to viewing the variable list as just all reference variables after
coalescing has been performed. */
- partition_view_normal (map, false);
+ partition_view_normal (map);
if (dump_file && (dump_flags & TDF_DETAILS))
{
diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index bf8983f..a622728 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -36,6 +36,7 @@ along with GCC; see the file COPYING3. If not see
#include "gimple-iterator.h"
#include "tree-ssa-live.h"
#include "tree-ssa-coalesce.h"
+#include "explow.h"
#include "diagnostic-core.h"
@@ -806,6 +807,16 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
basic_block bb;
ssa_op_iter iter;
live_track_p live;
+ basic_block entry;
+
+ /* If inter-variable coalescing is enabled, we may attempt to
+ coalesce SSA names with different base variables, including
+ different parameters, so we have to make sure default defs live
+ at the entry block conflict with each other. */
+ if (flag_tree_coalesce_vars)
+ entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+ else
+ entry = NULL;
map = live_var_map (liveinfo);
graph = ssa_conflicts_new (num_var_partitions (map));
@@ -864,6 +875,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
live_track_process_def (live, result, graph);
}
+ /* Pretend there are defs for params' default defs at the start
+ of the (post-)entry block. */
+ if (bb == entry)
+ {
+ unsigned base;
+ bitmap_iterator bi;
+ EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi)
+ {
+ bitmap_iterator bi2;
+ unsigned part;
+ EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base],
+ 0, part, bi2)
+ {
+ tree var = partition_to_var (map, part);
+ if (!SSA_NAME_VAR (var)
+ || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL
+ && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL)
+ || !SSA_NAME_IS_DEFAULT_DEF (var))
+ continue;
+ live_track_process_def (live, var, graph);
+ }
+ }
+ }
+
live_track_clear_base_vars (live);
}
@@ -1132,6 +1167,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
{
var1 = partition_to_var (map, p1);
var2 = partition_to_var (map, p2);
+
z = var_union (map, var1, var2);
if (z == NO_PARTITION)
{
@@ -1149,6 +1185,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y,
if (debug)
fprintf (debug, ": Success -> %d\n", z);
+
return true;
}
@@ -1244,6 +1281,328 @@ ssa_name_var_hash::equal (const tree_node *n1, const tree_node *n2)
}
+/* Output partition map MAP with coalescing plan PART to file F. */
+
+void
+dump_part_var_map (FILE *f, partition part, var_map map)
+{
+ int t;
+ unsigned x, y;
+ int p;
+
+ fprintf (f, "\nCoalescible Partition map \n\n");
+
+ for (x = 0; x < map->num_partitions; x++)
+ {
+ if (map->view_to_partition != NULL)
+ p = map->view_to_partition[x];
+ else
+ p = x;
+
+ if (ssa_name (p) == NULL_TREE
+ || virtual_operand_p (ssa_name (p)))
+ continue;
+
+ t = 0;
+ for (y = 1; y < num_ssa_names; y++)
+ {
+ tree var = version_to_var (map, y);
+ if (!var)
+ continue;
+ int q = var_to_partition (map, var);
+ p = partition_find (part, q);
+ gcc_assert (map->partition_to_base_index[q]
+ == map->partition_to_base_index[p]);
+
+ if (p == (int)x)
+ {
+ if (t++ == 0)
+ {
+ fprintf (f, "Partition %d, base %d (", x,
+ map->partition_to_base_index[q]);
+ print_generic_expr (f, partition_to_var (map, q), TDF_SLIM);
+ fprintf (f, " - ");
+ }
+ fprintf (f, "%d ", y);
+ }
+ }
+ if (t != 0)
+ fprintf (f, ")\n");
+ }
+ fprintf (f, "\n");
+}
+
+/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
+ coalescing together, false otherwise. This must stay consistent with
+ compute_samebase_partition_bases and compute_optimized_partition_bases
+ below. */
+
+bool
+gimple_can_coalesce_p (tree name1, tree name2)
+{
+ /* First check the SSA_NAME's associated DECL. Without
+ optimization, we only want to coalesce if they have the same DECL
+ or both have no associated DECL. */
+ tree var1 = SSA_NAME_VAR (name1);
+ tree var2 = SSA_NAME_VAR (name2);
+ var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE;
+ var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE;
+ if (var1 != var2 && !flag_tree_coalesce_vars)
+ return false;
+
+ /* Now check the types. If the types are the same, then we should
+ try to coalesce V1 and V2. */
+ tree t1 = TREE_TYPE (name1);
+ tree t2 = TREE_TYPE (name2);
+ if (t1 == t2)
+ {
+ check_modes:
+ /* If the base variables are the same, we're good: none of the
+ other tests below could possibly fail. */
+ var1 = SSA_NAME_VAR (name1);
+ var2 = SSA_NAME_VAR (name2);
+ if (var1 == var2)
+ return true;
+
+ /* We don't want to coalesce two SSA names if one of the base
+ variables is supposed to be a register while the other is
+ supposed to be on the stack. Anonymous SSA names take
+ registers, but when not optimizing, user variables should go
+ on the stack, so coalescing them with the anonymous variable
+ as the partition leader would end up assigning the user
+ variable to a register. Don't do that! */
+ bool reg1 = !var1 || use_register_for_decl (var1);
+ bool reg2 = !var2 || use_register_for_decl (var2);
+ if (reg1 != reg2)
+ return false;
+
+ /* Check that the promoted modes are the same. We don't want to
+ coalesce if the promoted modes would be different. Only
+ PARM_DECLs and RESULT_DECLs have different promotion rules,
+ so skip the test if both are variables or anonymous
+ SSA_NAMEs. */
+ return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
+ || promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL);
+ }
+
+ /* If the types are not the same, check for a canonical type match. This
+ (for example) allows coalescing when the types are fundamentally the
+ same, but just have different names.
+
+ Note pointer types with different address spaces may have the same
+ canonical type. Those are rejected for coalescing by the
+ types_compatible_p check. */
+ if (TYPE_CANONICAL (t1)
+ && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
+ && types_compatible_p (t1, t2))
+ goto check_modes;
+
+ return false;
+}
+
+/* Fill in MAP's partition_to_base_index, with one index for each
+ partition of SSA names USED_IN_COPIES and related by CL coalesce
+ possibilities. This must match gimple_can_coalesce_p in the
+ optimized case. */
+
+static void
+compute_optimized_partition_bases (var_map map, bitmap used_in_copies,
+ coalesce_list_p cl)
+{
+ int parts = num_var_partitions (map);
+ partition tentative = partition_new (parts);
+
+ /* Partition the SSA versions so that, for each coalescible
+ pair, both of its members are in the same partition in
+ TENTATIVE. */
+ gcc_assert (!cl->sorted);
+ coalesce_pair_p node;
+ coalesce_iterator_type ppi;
+ FOR_EACH_PARTITION_PAIR (node, ppi, cl)
+ {
+ tree v1 = ssa_name (node->first_element);
+ int p1 = partition_find (tentative, var_to_partition (map, v1));
+ tree v2 = ssa_name (node->second_element);
+ int p2 = partition_find (tentative, var_to_partition (map, v2));
+
+ if (p1 == p2)
+ continue;
+
+ partition_union (tentative, p1, p2);
+ }
+
+ /* We have to deal with cost one pairs too. */
+ for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next)
+ {
+ tree v1 = ssa_name (co->first_element);
+ int p1 = partition_find (tentative, var_to_partition (map, v1));
+ tree v2 = ssa_name (co->second_element);
+ int p2 = partition_find (tentative, var_to_partition (map, v2));
+
+ if (p1 == p2)
+ continue;
+
+ partition_union (tentative, p1, p2);
+ }
+
+ /* And also with abnormal edges. */
+ basic_block bb;
+ edge e;
+ edge_iterator ei;
+ FOR_EACH_BB_FN (bb, cfun)
+ {
+ FOR_EACH_EDGE (e, ei, bb->preds)
+ if (e->flags & EDGE_ABNORMAL)
+ {
+ gphi_iterator gsi;
+ for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
+ gsi_next (&gsi))
+ {
+ gphi *phi = gsi.phi ();
+ tree arg = PHI_ARG_DEF (phi, e->dest_idx);
+ if (SSA_NAME_IS_DEFAULT_DEF (arg)
+ && (!SSA_NAME_VAR (arg)
+ || TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL))
+ continue;
+
+ tree res = PHI_RESULT (phi);
+
+ int p1 = partition_find (tentative, var_to_partition (map, res));
+ int p2 = partition_find (tentative, var_to_partition (map, arg));
+
+ if (p1 == p2)
+ continue;
+
+ partition_union (tentative, p1, p2);
+ }
+ }
+ }
+
+ map->partition_to_base_index = XCNEWVEC (int, parts);
+ auto_vec<unsigned int> index_map (parts);
+ if (parts)
+ index_map.quick_grow (parts);
+
+ const unsigned no_part = -1;
+ unsigned count = parts;
+ while (count)
+ index_map[--count] = no_part;
+
+ /* Initialize MAP's mapping from partition to base index, using
+ as base indices an enumeration of the TENTATIVE partitions in
+ which each SSA version ended up, so that we compute conflicts
+ between all SSA versions that ended up in the same potential
+ coalesce partition. */
+ bitmap_iterator bi;
+ unsigned i;
+ EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
+ {
+ int pidx = var_to_partition (map, ssa_name (i));
+ int base = partition_find (tentative, pidx);
+ if (index_map[base] != no_part)
+ continue;
+ index_map[base] = count++;
+ }
+
+ map->num_basevars = count;
+
+ EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi)
+ {
+ int pidx = var_to_partition (map, ssa_name (i));
+ int base = partition_find (tentative, pidx);
+ gcc_assert (index_map[base] < count);
+ map->partition_to_base_index[pidx] = index_map[base];
+ }
+
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ dump_part_var_map (dump_file, tentative, map);
+
+ partition_delete (tentative);
+}
+
+/* Hashtable helpers. */
+
+struct tree_int_map_hasher : nofree_ptr_hash <tree_int_map>
+{
+ static inline hashval_t hash (const tree_int_map *);
+ static inline bool equal (const tree_int_map *, const tree_int_map *);
+};
+
+inline hashval_t
+tree_int_map_hasher::hash (const tree_int_map *v)
+{
+ return tree_map_base_hash (v);
+}
+
+inline bool
+tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
+{
+ return tree_int_map_eq (v, c);
+}
+
+/* This routine will initialize the basevar fields of MAP with base
+ names. Partitions will share the same base if they have the same
+ SSA_NAME_VAR, or, being anonymous variables, the same type. This
+ must match gimple_can_coalesce_p in the non-optimized case. */
+
+static void
+compute_samebase_partition_bases (var_map map)
+{
+ int x, num_part;
+ tree var;
+ struct tree_int_map *m, *mapstorage;
+
+ num_part = num_var_partitions (map);
+ hash_table<tree_int_map_hasher> tree_to_index (num_part);
+ /* We can have at most num_part entries in the hash tables, so it's
+ enough to allocate so many map elements once, saving some malloc
+ calls. */
+ mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
+
+ /* If a base table already exists, clear it, otherwise create it. */
+ free (map->partition_to_base_index);
+ map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
+
+ /* Build the base variable list, and point partitions at their bases. */
+ for (x = 0; x < num_part; x++)
+ {
+ struct tree_int_map **slot;
+ unsigned baseindex;
+ var = partition_to_var (map, x);
+ if (SSA_NAME_VAR (var)
+ && (!VAR_P (SSA_NAME_VAR (var))
+ || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
+ m->base.from = SSA_NAME_VAR (var);
+ else
+ /* This restricts what anonymous SSA names we can coalesce
+ as it restricts the sets we compute conflicts for.
+ Using TREE_TYPE to generate sets is the easiest as
+ type equivalency also holds for SSA names with the same
+ underlying decl.
+
+ Check gimple_can_coalesce_p when changing this code. */
+ m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
+ ? TYPE_CANONICAL (TREE_TYPE (var))
+ : TREE_TYPE (var));
+ /* If base variable hasn't been seen, set it up. */
+ slot = tree_to_index.find_slot (m, INSERT);
+ if (!*slot)
+ {
+ baseindex = m - mapstorage;
+ m->to = baseindex;
+ *slot = m;
+ m++;
+ }
+ else
+ baseindex = (*slot)->to;
+ map->partition_to_base_index[x] = baseindex;
+ }
+
+ map->num_basevars = m - mapstorage;
+
+ free (mapstorage);
+}
+
/* Reduce the number of copies by coalescing variables in the function. Return
a partition map with the resulting coalesces. */
@@ -1260,9 +1619,10 @@ coalesce_ssa_name (void)
cl = create_coalesce_list ();
map = create_outofssa_var_map (cl, used_in_copies);
- /* If optimization is disabled, we need to coalesce all the names originating
- from the same SSA_NAME_VAR so debug info remains undisturbed. */
- if (!optimize)
+ /* If this optimization is disabled, we need to coalesce all the
+ names originating from the same SSA_NAME_VAR so debug info
+ remains undisturbed. */
+ if (!flag_tree_coalesce_vars)
{
hash_table<ssa_name_var_hash> ssa_name_hash (10);
@@ -1303,8 +1663,13 @@ coalesce_ssa_name (void)
if (dump_file && (dump_flags & TDF_DETAILS))
dump_var_map (dump_file, map);
- /* Don't calculate live ranges for variables not in the coalesce list. */
- partition_view_bitmap (map, used_in_copies, true);
+ partition_view_bitmap (map, used_in_copies);
+
+ if (flag_tree_coalesce_vars)
+ compute_optimized_partition_bases (map, used_in_copies, cl);
+ else
+ compute_samebase_partition_bases (map);
+
BITMAP_FREE (used_in_copies);
if (num_var_partitions (map) < 1)
@@ -1343,8 +1708,7 @@ coalesce_ssa_name (void)
/* Now coalesce everything in the list. */
coalesce_partitions (map, graph, cl,
- ((dump_flags & TDF_DETAILS) ? dump_file
- : NULL));
+ ((dump_flags & TDF_DETAILS) ? dump_file : NULL));
delete_coalesce_list (cl);
ssa_conflicts_delete (graph);
diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h
index 99b188a..ae289b4 100644
--- a/gcc/tree-ssa-coalesce.h
+++ b/gcc/tree-ssa-coalesce.h
@@ -21,5 +21,6 @@ along with GCC; see the file COPYING3. If not see
#define GCC_TREE_SSA_COALESCE_H
extern var_map coalesce_ssa_name (void);
+extern bool gimple_can_coalesce_p (tree, tree);
#endif /* GCC_TREE_SSA_COALESCE_H */
diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c
deleted file mode 100644
index aeb7f28..0000000
--- a/gcc/tree-ssa-copyrename.c
+++ /dev/null
@@ -1,475 +0,0 @@
-/* Rename SSA copies.
- Copyright (C) 2004-2015 Free Software Foundation, Inc.
- Contributed by Andrew MacLeod <amacleod@redhat.com>
-
-This file is part of GCC.
-
-GCC is free software; you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation; either version 3, or (at your option)
-any later version.
-
-GCC is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-GNU General Public License for more details.
-
-You should have received a copy of the GNU General Public License
-along with GCC; see the file COPYING3. If not see
-<http://www.gnu.org/licenses/>. */
-
-#include "config.h"
-#include "system.h"
-#include "coretypes.h"
-#include "backend.h"
-#include "tree.h"
-#include "gimple.h"
-#include "rtl.h"
-#include "ssa.h"
-#include "alias.h"
-#include "fold-const.h"
-#include "internal-fn.h"
-#include "gimple-iterator.h"
-#include "flags.h"
-#include "tree-pretty-print.h"
-#include "insn-config.h"
-#include "expmed.h"
-#include "dojump.h"
-#include "explow.h"
-#include "calls.h"
-#include "emit-rtl.h"
-#include "varasm.h"
-#include "stmt.h"
-#include "expr.h"
-#include "tree-dfa.h"
-#include "tree-inline.h"
-#include "tree-ssa-live.h"
-#include "tree-pass.h"
-#include "langhooks.h"
-
-static struct
-{
- /* Number of copies coalesced. */
- int coalesced;
-} stats;
-
-/* The following routines implement the SSA copy renaming phase.
-
- This optimization looks for copies between 2 SSA_NAMES, either through a
- direct copy, or an implicit one via a PHI node result and its arguments.
-
- Each copy is examined to determine if it is possible to rename the base
- variable of one of the operands to the same variable as the other operand.
- i.e.
- T.3_5 = <blah>
- a_1 = T.3_5
-
- If this copy couldn't be copy propagated, it could possibly remain in the
- program throughout the optimization phases. After SSA->normal, it would
- become:
-
- T.3 = <blah>
- a = T.3
-
- Since T.3_5 is distinct from all other SSA versions of T.3, there is no
- fundamental reason why the base variable needs to be T.3, subject to
- certain restrictions. This optimization attempts to determine if we can
- change the base variable on copies like this, and result in code such as:
-
- a_5 = <blah>
- a_1 = a_5
-
- This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is
- possible, the copy goes away completely. If it isn't possible, a new temp
- will be created for a_5, and you will end up with the exact same code:
-
- a.8 = <blah>
- a = a.8
-
- The other benefit of performing this optimization relates to what variables
- are chosen in copies. Gimplification of the program uses temporaries for
- a lot of things. expressions like
-
- a_1 = <blah>
- <blah2> = a_1
-
- get turned into
-
- T.3_5 = <blah>
- a_1 = T.3_5
- <blah2> = a_1
-
- Copy propagation is done in a forward direction, and if we can propagate
- through the copy, we end up with:
-
- T.3_5 = <blah>
- <blah2> = T.3_5
-
- The copy is gone, but so is all reference to the user variable 'a'. By
- performing this optimization, we would see the sequence:
-
- a_5 = <blah>
- a_1 = a_5
- <blah2> = a_1
-
- which copy propagation would then turn into:
-
- a_5 = <blah>
- <blah2> = a_5
-
- and so we still retain the user variable whenever possible. */
-
-
-/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
- Choose a representative for the partition, and send debug info to DEBUG. */
-
-static void
-copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
-{
- int p1, p2, p3;
- tree root1, root2;
- tree rep1, rep2;
- bool ign1, ign2, abnorm;
-
- gcc_assert (TREE_CODE (var1) == SSA_NAME);
- gcc_assert (TREE_CODE (var2) == SSA_NAME);
-
- register_ssa_partition (map, var1);
- register_ssa_partition (map, var2);
-
- p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
- p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
-
- if (debug)
- {
- fprintf (debug, "Try : ");
- print_generic_expr (debug, var1, TDF_SLIM);
- fprintf (debug, "(P%d) & ", p1);
- print_generic_expr (debug, var2, TDF_SLIM);
- fprintf (debug, "(P%d)", p2);
- }
-
- gcc_assert (p1 != NO_PARTITION);
- gcc_assert (p2 != NO_PARTITION);
-
- if (p1 == p2)
- {
- if (debug)
- fprintf (debug, " : Already coalesced.\n");
- return;
- }
-
- rep1 = partition_to_var (map, p1);
- rep2 = partition_to_var (map, p2);
- root1 = SSA_NAME_VAR (rep1);
- root2 = SSA_NAME_VAR (rep2);
- if (!root1 && !root2)
- return;
-
- /* Don't coalesce if one of the variables occurs in an abnormal PHI. */
- abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1)
- || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2));
- if (abnorm)
- {
- if (debug)
- fprintf (debug, " : Abnormal PHI barrier. No coalesce.\n");
- return;
- }
-
- /* Partitions already have the same root, simply merge them. */
- if (root1 == root2)
- {
- p1 = partition_union (map->var_partition, p1, p2);
- if (debug)
- fprintf (debug, " : Same root, coalesced --> P%d.\n", p1);
- return;
- }
-
- /* Never attempt to coalesce 2 different parameters. */
- if ((root1 && TREE_CODE (root1) == PARM_DECL)
- && (root2 && TREE_CODE (root2) == PARM_DECL))
- {
- if (debug)
- fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n");
- return;
- }
-
- if ((root1 && TREE_CODE (root1) == RESULT_DECL)
- != (root2 && TREE_CODE (root2) == RESULT_DECL))
- {
- if (debug)
- fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n");
- return;
- }
-
- ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1));
- ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2));
-
- /* Refrain from coalescing user variables, if requested. */
- if (!ign1 && !ign2)
- {
- if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2))
- ign2 = true;
- else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1))
- ign1 = true;
- else if (flag_ssa_coalesce_vars != 2)
- {
- if (debug)
- fprintf (debug, " : 2 different USER vars. No coalesce.\n");
- return;
- }
- else
- ign2 = true;
- }
-
- /* If both values have default defs, we can't coalesce. If only one has a
- tag, make sure that variable is the new root partition. */
- if (root1 && ssa_default_def (cfun, root1))
- {
- if (root2 && ssa_default_def (cfun, root2))
- {
- if (debug)
- fprintf (debug, " : 2 default defs. No coalesce.\n");
- return;
- }
- else
- {
- ign2 = true;
- ign1 = false;
- }
- }
- else if (root2 && ssa_default_def (cfun, root2))
- {
- ign1 = true;
- ign2 = false;
- }
-
- /* Do not coalesce if we cannot assign a symbol to the partition. */
- if (!(!ign2 && root2)
- && !(!ign1 && root1))
- {
- if (debug)
- fprintf (debug, " : Choosen variable has no root. No coalesce.\n");
- return;
- }
-
- /* Don't coalesce if the new chosen root variable would be read-only.
- If both ign1 && ign2, then the root var of the larger partition
- wins, so reject in that case if any of the root vars is TREE_READONLY.
- Otherwise reject only if the root var, on which replace_ssa_name_symbol
- will be called below, is readonly. */
- if (((root1 && TREE_READONLY (root1)) && ign2)
- || ((root2 && TREE_READONLY (root2)) && ign1))
- {
- if (debug)
- fprintf (debug, " : Readonly variable. No coalesce.\n");
- return;
- }
-
- /* Don't coalesce if the two variables aren't type compatible . */
- if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2))
- /* There is a disconnect between the middle-end type-system and
- VRP, avoid coalescing enum types with different bounds. */
- || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE
- || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE)
- && TREE_TYPE (var1) != TREE_TYPE (var2)))
- {
- if (debug)
- fprintf (debug, " : Incompatible types. No coalesce.\n");
- return;
- }
-
- /* Merge the two partitions. */
- p3 = partition_union (map->var_partition, p1, p2);
-
- /* Set the root variable of the partition to the better choice, if there is
- one. */
- if (!ign2 && root2)
- replace_ssa_name_symbol (partition_to_var (map, p3), root2);
- else if (!ign1 && root1)
- replace_ssa_name_symbol (partition_to_var (map, p3), root1);
- else
- gcc_unreachable ();
-
- if (debug)
- {
- fprintf (debug, " --> P%d ", p3);
- print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)),
- TDF_SLIM);
- fprintf (debug, "\n");
- }
-}
-
-
-namespace {
-
-const pass_data pass_data_rename_ssa_copies =
-{
- GIMPLE_PASS, /* type */
- "copyrename", /* name */
- OPTGROUP_NONE, /* optinfo_flags */
- TV_TREE_COPY_RENAME, /* tv_id */
- ( PROP_cfg | PROP_ssa ), /* properties_required */
- 0, /* properties_provided */
- 0, /* properties_destroyed */
- 0, /* todo_flags_start */
- 0, /* todo_flags_finish */
-};
-
-class pass_rename_ssa_copies : public gimple_opt_pass
-{
-public:
- pass_rename_ssa_copies (gcc::context *ctxt)
- : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt)
- {}
-
- /* opt_pass methods: */
- opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); }
- virtual bool gate (function *) { return flag_tree_copyrename != 0; }
- virtual unsigned int execute (function *);
-
-}; // class pass_rename_ssa_copies
-
-/* This function will make a pass through the IL, and attempt to coalesce any
- SSA versions which occur in PHI's or copies. Coalescing is accomplished by
- changing the underlying root variable of all coalesced version. This will
- then cause the SSA->normal pass to attempt to coalesce them all to the same
- variable. */
-
-unsigned int
-pass_rename_ssa_copies::execute (function *fun)
-{
- var_map map;
- basic_block bb;
- tree var, part_var;
- gimple stmt;
- unsigned x;
- FILE *debug;
-
- memset (&stats, 0, sizeof (stats));
-
- if (dump_file && (dump_flags & TDF_DETAILS))
- debug = dump_file;
- else
- debug = NULL;
-
- map = init_var_map (num_ssa_names);
-
- FOR_EACH_BB_FN (bb, fun)
- {
- /* Scan for real copies. */
- for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
- gsi_next (&gsi))
- {
- stmt = gsi_stmt (gsi);
- if (gimple_assign_ssa_name_copy_p (stmt))
- {
- tree lhs = gimple_assign_lhs (stmt);
- tree rhs = gimple_assign_rhs1 (stmt);
-
- copy_rename_partition_coalesce (map, lhs, rhs, debug);
- }
- }
- }
-
- FOR_EACH_BB_FN (bb, fun)
- {
- /* Treat PHI nodes as copies between the result and each argument. */
- for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
- gsi_next (&gsi))
- {
- size_t i;
- tree res;
- gphi *phi = gsi.phi ();
- res = gimple_phi_result (phi);
-
- /* Do not process virtual SSA_NAMES. */
- if (virtual_operand_p (res))
- continue;
-
- /* Make sure to only use the same partition for an argument
- as the result but never the other way around. */
- if (SSA_NAME_VAR (res)
- && !DECL_IGNORED_P (SSA_NAME_VAR (res)))
- for (i = 0; i < gimple_phi_num_args (phi); i++)
- {
- tree arg = PHI_ARG_DEF (phi, i);
- if (TREE_CODE (arg) == SSA_NAME)
- copy_rename_partition_coalesce (map, res, arg,
- debug);
- }
- /* Else if all arguments are in the same partition try to merge
- it with the result. */
- else
- {
- int all_p_same = -1;
- int p = -1;
- for (i = 0; i < gimple_phi_num_args (phi); i++)
- {
- tree arg = PHI_ARG_DEF (phi, i);
- if (TREE_CODE (arg) != SSA_NAME)
- {
- all_p_same = 0;
- break;
- }
- else if (all_p_same == -1)
- {
- p = partition_find (map->var_partition,
- SSA_NAME_VERSION (arg));
- all_p_same = 1;
- }
- else if (all_p_same == 1
- && p != partition_find (map->var_partition,
- SSA_NAME_VERSION (arg)))
- {
- all_p_same = 0;
- break;
- }
- }
- if (all_p_same == 1)
- copy_rename_partition_coalesce (map, res,
- PHI_ARG_DEF (phi, 0),
- debug);
- }
- }
- }
-
- if (debug)
- dump_var_map (debug, map);
-
- /* Now one more pass to make all elements of a partition share the same
- root variable. */
-
- for (x = 1; x < num_ssa_names; x++)
- {
- part_var = partition_to_var (map, x);
- if (!part_var)
- continue;
- var = ssa_name (x);
- if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var))
- continue;
- if (debug)
- {
- fprintf (debug, "Coalesced ");
- print_generic_expr (debug, var, TDF_SLIM);
- fprintf (debug, " to ");
- print_generic_expr (debug, part_var, TDF_SLIM);
- fprintf (debug, "\n");
- }
- stats.coalesced++;
- replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var));
- }
-
- statistics_counter_event (fun, "copies coalesced",
- stats.coalesced);
- delete_var_map (map);
- return 0;
-}
-
-} // anon namespace
-
-gimple_opt_pass *
-make_pass_rename_ssa_copies (gcc::context *ctxt)
-{
- return new pass_rename_ssa_copies (ctxt);
-}
diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
index 5b00f58..4772558 100644
--- a/gcc/tree-ssa-live.c
+++ b/gcc/tree-ssa-live.c
@@ -70,88 +70,6 @@ static void verify_live_on_entry (tree_live_info_p);
ssa_name or variable, and vice versa. */
-/* Hashtable helpers. */
-
-struct tree_int_map_hasher : nofree_ptr_hash <tree_int_map>
-{
- static inline hashval_t hash (const tree_int_map *);
- static inline bool equal (const tree_int_map *, const tree_int_map *);
-};
-
-inline hashval_t
-tree_int_map_hasher::hash (const tree_int_map *v)
-{
- return tree_map_base_hash (v);
-}
-
-inline bool
-tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c)
-{
- return tree_int_map_eq (v, c);
-}
-
-
-/* This routine will initialize the basevar fields of MAP. */
-
-static void
-var_map_base_init (var_map map)
-{
- int x, num_part;
- tree var;
- struct tree_int_map *m, *mapstorage;
-
- num_part = num_var_partitions (map);
- hash_table<tree_int_map_hasher> tree_to_index (num_part);
- /* We can have at most num_part entries in the hash tables, so it's
- enough to allocate so many map elements once, saving some malloc
- calls. */
- mapstorage = m = XNEWVEC (struct tree_int_map, num_part);
-
- /* If a base table already exists, clear it, otherwise create it. */
- free (map->partition_to_base_index);
- map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part);
-
- /* Build the base variable list, and point partitions at their bases. */
- for (x = 0; x < num_part; x++)
- {
- struct tree_int_map **slot;
- unsigned baseindex;
- var = partition_to_var (map, x);
- if (SSA_NAME_VAR (var)
- && (!VAR_P (SSA_NAME_VAR (var))
- || !DECL_IGNORED_P (SSA_NAME_VAR (var))))
- m->base.from = SSA_NAME_VAR (var);
- else
- /* This restricts what anonymous SSA names we can coalesce
- as it restricts the sets we compute conflicts for.
- Using TREE_TYPE to generate sets is the easies as
- type equivalency also holds for SSA names with the same
- underlying decl.
-
- Check gimple_can_coalesce_p when changing this code. */
- m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
- ? TYPE_CANONICAL (TREE_TYPE (var))
- : TREE_TYPE (var));
- /* If base variable hasn't been seen, set it up. */
- slot = tree_to_index.find_slot (m, INSERT);
- if (!*slot)
- {
- baseindex = m - mapstorage;
- m->to = baseindex;
- *slot = m;
- m++;
- }
- else
- baseindex = (*slot)->to;
- map->partition_to_base_index[x] = baseindex;
- }
-
- map->num_basevars = m - mapstorage;
-
- free (mapstorage);
-}
-
-
/* Remove the base table in MAP. */
static void
@@ -329,21 +247,17 @@ partition_view_fini (var_map map, bitmap selected)
}
-/* Create a partition view which includes all the used partitions in MAP. If
- WANT_BASES is true, create the base variable map as well. */
+/* Create a partition view which includes all the used partitions in MAP. */
void
-partition_view_normal (var_map map, bool want_bases)
+partition_view_normal (var_map map)
{
bitmap used;
used = partition_view_init (map);
partition_view_fini (map, used);
- if (want_bases)
- var_map_base_init (map);
- else
- var_map_base_fini (map);
+ var_map_base_fini (map);
}
@@ -352,7 +266,7 @@ partition_view_normal (var_map map, bool want_bases)
as well. */
void
-partition_view_bitmap (var_map map, bitmap only, bool want_bases)
+partition_view_bitmap (var_map map, bitmap only)
{
bitmap used;
bitmap new_partitions = BITMAP_ALLOC (NULL);
@@ -368,10 +282,7 @@ partition_view_bitmap (var_map map, bitmap only, bool want_bases)
}
partition_view_fini (map, new_partitions);
- if (want_bases)
- var_map_base_init (map);
- else
- var_map_base_fini (map);
+ var_map_base_fini (map);
}
diff --git a/gcc/tree-ssa-live.h b/gcc/tree-ssa-live.h
index d5d7820..1f88358 100644
--- a/gcc/tree-ssa-live.h
+++ b/gcc/tree-ssa-live.h
@@ -71,8 +71,8 @@ typedef struct _var_map
extern var_map init_var_map (int);
extern void delete_var_map (var_map);
extern int var_union (var_map, tree, tree);
-extern void partition_view_normal (var_map, bool);
-extern void partition_view_bitmap (var_map, bitmap, bool);
+extern void partition_view_normal (var_map);
+extern void partition_view_bitmap (var_map, bitmap);
extern void dump_scope_blocks (FILE *, int);
extern void debug_scope_block (tree, int);
extern void debug_scope_blocks (int);
diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c
index 437f69d..1fbd71e 100644
--- a/gcc/tree-ssa-uncprop.c
+++ b/gcc/tree-ssa-uncprop.c
@@ -38,6 +38,11 @@ along with GCC; see the file COPYING3. If not see
#include "tree-pass.h"
#include "tree-ssa-propagate.h"
#include "tree-hash-traits.h"
+#include "bitmap.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-ssa-live.h"
+#include "tree-ssa-coalesce.h"
/* The basic structure describing an equivalency created by traversing
an edge. Traversing the edge effectively means that we can assume
diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
index b5b0cb6..e10f775 100644
--- a/gcc/var-tracking.c
+++ b/gcc/var-tracking.c
@@ -4909,12 +4909,16 @@ dataflow_set_remove_mem_locs (variable_def **slot, dataflow_set *set)
registers, as well as associations between MEMs and VALUEs. */
static void
-dataflow_set_clear_at_call (dataflow_set *set)
+dataflow_set_clear_at_call (dataflow_set *set, rtx_insn *call_insn)
{
unsigned int r;
hard_reg_set_iterator hrsi;
+ HARD_REG_SET invalidated_regs;
- EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, r, hrsi)
+ get_call_reg_set_usage (call_insn, &invalidated_regs,
+ regs_invalidated_by_call);
+
+ EXECUTE_IF_SET_IN_HARD_REG_SET (invalidated_regs, 0, r, hrsi)
var_regno_delete (set, r);
if (MAY_HAVE_DEBUG_INSNS)
@@ -6698,7 +6702,7 @@ compute_bb_dataflow (basic_block bb)
switch (mo->type)
{
case MO_CALL:
- dataflow_set_clear_at_call (out);
+ dataflow_set_clear_at_call (out, insn);
break;
case MO_USE:
@@ -9160,7 +9164,7 @@ emit_notes_in_bb (basic_block bb, dataflow_set *set)
switch (mo->type)
{
case MO_CALL:
- dataflow_set_clear_at_call (set);
+ dataflow_set_clear_at_call (set, insn);
emit_notes_for_changes (insn, EMIT_NOTE_AFTER_CALL_INSN, set->vars);
{
rtx arguments = mo->u.loc, *p = &arguments;
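For readers who want the gist of compute_optimized_partition_bases without
walking the whole hunk, here is a rough standalone sketch. It is not part of
the patch: UnionFind, CopyPair and assign_bases are made-up names that stand
in for GCC's partition_* routines and the coalesce list, and the sketch
numbers every partition rather than only those in used_in_copies. It only
shows the idea: union every coalescible pair in a tentative partitioning,
then number the resulting classes densely and use those numbers as base
indices.

```cpp
#include <numeric>
#include <vector>

// Hypothetical stand-in for GCC's partition_new/partition_find/partition_union.
struct UnionFind {
  std::vector<int> parent;
  explicit UnionFind(int n) : parent(n) {
    std::iota(parent.begin(), parent.end(), 0);  // each element is its own root
  }
  int find(int x) {
    while (parent[x] != x) {
      parent[x] = parent[parent[x]];  // path halving
      x = parent[x];
    }
    return x;
  }
  void unite(int a, int b) {
    a = find(a);
    b = find(b);
    if (a != b)
      parent[b] = a;
  }
};

// Hypothetical stand-in for one entry of the coalesce list.
struct CopyPair { int first, second; };

struct BaseAssignment {
  std::vector<int> base_of;  // dense base index for each partition
  int num_bases;             // number of distinct base indices handed out
};

// Union all coalescible pairs, then enumerate the tentative classes in
// order of first appearance, so partitions that could end up coalesced
// share a base index and get conflicts computed against each other.
BaseAssignment assign_bases(int num_partitions,
                            const std::vector<CopyPair>& pairs) {
  UnionFind tentative(num_partitions);
  for (const CopyPair& p : pairs)
    tentative.unite(p.first, p.second);

  std::vector<int> index_of_root(num_partitions, -1);
  BaseAssignment out{std::vector<int>(num_partitions, -1), 0};
  for (int p = 0; p < num_partitions; ++p) {
    int root = tentative.find(p);
    if (index_of_root[root] == -1)
      index_of_root[root] = out.num_bases++;
    out.base_of[p] = index_of_root[root];
  }
  return out;
}
```

With five partitions and pairs (0,1) and (1,2), partitions 0, 1 and 2 share
one base index while 3 and 4 each get their own, so conflicts are only
computed within the first group.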
These are the incremental fixes:
diff --git a/gcc/explow.c b/gcc/explow.c
index 6dba6e5..6941f4e 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -852,6 +852,13 @@ promote_ssa_mode (const_tree name, int *punsignedp)
{
gcc_assert (TREE_CODE (name) == SSA_NAME);
+ /* Partitions holding parms and results must be promoted as expected
+ by function.c. */
+ if (SSA_NAME_VAR (name)
+ && (TREE_CODE (SSA_NAME_VAR (name)) == PARM_DECL
+ || TREE_CODE (SSA_NAME_VAR (name)) == RESULT_DECL))
+ return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
+
tree type = TREE_TYPE (name);
int unsignedp = TYPE_UNSIGNED (type);
machine_mode mode = TYPE_MODE (type);
diff --git a/gcc/function.c b/gcc/function.c
index 840f4a2..753d889 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -2920,14 +2920,16 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
{
DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
stack_parm = rtl_for_parm (all, parm);
- if (!stack_parm)
- stack_parm = assign_stack_local (BLKmode, size_stored,
- DECL_ALIGN (parm));
- else
+ if (stack_parm)
stack_parm = copy_rtx (stack_parm);
- if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
- PUT_MODE (stack_parm, GET_MODE (entry_parm));
- set_mem_attributes (stack_parm, parm, 1);
+ else
+ {
+ stack_parm = assign_stack_local (BLKmode, size_stored,
+ DECL_ALIGN (parm));
+ if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
+ PUT_MODE (stack_parm, GET_MODE (entry_parm));
+ set_mem_attributes (stack_parm, parm, 1);
+ }
}
/* If a BLKmode arrives in registers, copy it to a stack slot. Handle
--
Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/ FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
* Re: [PR64164] drop copyrename, integrate into expand
2015-07-16 7:58 ` Alexandre Oliva
@ 2015-07-16 8:50 ` Richard Biener
2015-07-16 21:33 ` Alexandre Oliva
0 siblings, 1 reply; 127+ messages in thread
From: Richard Biener @ 2015-07-16 8:50 UTC (permalink / raw)
To: Alexandre Oliva
Cc: Jeff Law, GCC Patches, Christophe Lyon, David Edelsohn, Eric Botcazou
On Thu, Jul 16, 2015 at 9:29 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Jun 10, 2015, Richard Biener <richard.guenther@gmail.com> wrote:
>
>> On Wed, Jun 10, 2015 at 2:24 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
>>> This caused the sparc regression reported by Eric in
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64164#c37
>
>>> We need to match the mode of the rtl created for the partition and the
>>> promoted mode expected for the parm. I recall working to make parm and
>>> result decls the partition leaders, so that promote_ssa_mode would DTRT,
>>> but this escaped my mind when revisiting the patch after some time on
>>> another project.
>
> FWIW, during the development of this improvement, I dropped the notion
> of making parm and result decls partition leaders, and instead only
> considered eligible for coalescing into the same partition SSA_NAMEs
> that promoted to the same mode.
>
>> Alternatively not coalesce SSA names when promote_decl_mode gives
>> different answers (for their underlying decl)? It sounds wrong to do that
>> (if that is really what happens).
>
> Exactly. I've now restored the promote_decl_mode behavior to
> promote_ssa_mode for PARM_ and RESULT_DECLs, so that the strategy
> described above works again. This fixed the sparc regression.
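The strategy described above can be sketched as follows. This is only a toy model, not GCC code: `toy_mode`, `toy_ssa_name`, `toy_promote_type_mode`, `toy_promote_ssa_mode` and `toy_can_coalesce` are hypothetical names, and the promotion rule (sub-word modes widen to `TOY_SI`) is an assumed target behavior. It models only the promoted-mode check, not the other conditions the real coalescing test applies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for machine modes and decl kinds.  */
enum toy_mode { TOY_QI, TOY_HI, TOY_SI, TOY_DI };
enum toy_decl_kind { TOY_NONE, TOY_VAR, TOY_PARM, TOY_RESULT };

struct toy_ssa_name
{
  enum toy_decl_kind base_kind;     /* Kind of the underlying decl, if any.  */
  enum toy_mode type_mode;          /* Mode of the name's own type.  */
  enum toy_mode decl_promoted_mode; /* Mode the ABI expects for a parm or
                                       result; unused for other names.  */
};

/* Assumed target rule: sub-word modes are widened to TOY_SI.  */
static enum toy_mode
toy_promote_type_mode (enum toy_mode mode)
{
  return (mode == TOY_QI || mode == TOY_HI) ? TOY_SI : mode;
}

/* Names based on parms and results take the promotion expected for the
   decl; every other name is promoted according to its own type.  */
static enum toy_mode
toy_promote_ssa_mode (const struct toy_ssa_name *name)
{
  assert (name != NULL);
  if (name->base_kind == TOY_PARM || name->base_kind == TOY_RESULT)
    return name->decl_promoted_mode;
  return toy_promote_type_mode (name->type_mode);
}

/* Two names are eligible for the same partition only if they promote
   to the same mode.  */
static bool
toy_can_coalesce (const struct toy_ssa_name *a, const struct toy_ssa_name *b)
{
  return toy_promote_ssa_mode (a) == toy_promote_ssa_mode (b);
}
```

In this model, a parm whose ABI mode stays `TOY_HI` is kept apart from an anonymous `TOY_HI` temporary that promotes to `TOY_SI`, so the partition's RTL never has a mode that differs from what the parm setup code expects.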
>
> On Jun 9, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
>
>> On Jun 9, 2015, Alexandre Oliva <aoliva@redhat.com> wrote:
>
>>> On Jun 9, 2015, David Edelsohn <dje.gcc@gmail.com> wrote:
>>>> This also broke bootstrap on PPC64 LE Linux with the same error.
>
>>> Thanks for your reports. I'm looking into the problem.
>
>>> I'd appreciate a preprocessed testcase from either of you to confirm the
>>> fix, if not to help debug it.
>
>> The first potential source for this problem that jumped at me would be
>> silenced with this change:
>
>> diff --git a/gcc/function.c b/gcc/function.c
>> index 8bcc352..9201ed9 100644
>> --- a/gcc/function.c
>> +++ b/gcc/function.c
>> @@ -2974,7 +2974,8 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>> stack_parm = copy_rtx (stack_parm);
>> if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
>> PUT_MODE (stack_parm, GET_MODE (entry_parm));
>> - set_mem_attributes (stack_parm, parm, 1);
>> + if (GET_CODE (stack_parm) == MEM)
>> + set_mem_attributes (stack_parm, parm, 1);
>> }
>
>> /* If a BLKmode arrives in registers, copy it to a stack slot. Handle
>
> I ended up fixing this in a slightly different way, running the original
> code above, from assign_stack_local to set_mem_attributes, only when
> rtl_for_parm does not obtain an assignment set up by out-of-ssa.
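The control flow of that fix can be sketched in isolation. This is a simplified model with hypothetical names (`toy_rtx`, `toy_setup_block_location`), not the function.c code: it assumes the location preassigned by out-of-ssa may be something other than a MEM, as the earlier proposed guard suggests, and that only a freshly allocated stack slot needs its memory attributes set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an RTL location.  */
enum toy_rtx_code { TOY_REG, TOY_MEM };

struct toy_rtx
{
  enum toy_rtx_code code;
  bool mem_attrs_set;
};

/* Choose the location for a block parm.  PREASSIGNED models the result
   of rtl_for_parm: NULL when out-of-ssa set nothing up.  */
static struct toy_rtx
toy_setup_block_location (const struct toy_rtx *preassigned)
{
  /* Prefer the preassigned location and leave it untouched, apart from
     taking a copy.  */
  if (preassigned != NULL)
    return *preassigned;

  /* Otherwise allocate a fresh stack slot; only this path adjusts the
     slot and sets memory attributes, so it always operates on a MEM.  */
  struct toy_rtx fresh = { TOY_MEM, false };
  assert (fresh.code == TOY_MEM);
  fresh.mem_attrs_set = true;
  return fresh;
}
```

Keeping the mode and attribute adjustments on the allocation path means they are never applied to a location the function did not create.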
>
>> but I suspect there might be other similar issues lurking in function.c
>> after my attempt to turn parm assignment upside down ;-)
>
> There weren't, after all.
>
> On Jun 9, 2015, David Edelsohn <dje.gcc@gmail.com> wrote:
>
>> This patch clearly should have been tested on more
>> architectures than x86 before being approved and merged.
>
> The following patch was regstrapped on x86_64-linux-gnu and
> i686-pc-linux-gnu. I've also cross-built all-target successfully for
> targets aarch64-elf, arm-eabi, arm-symbianelf, avr-elf, bfin-elf,
> cr16-elf, cris-elf, crisv32-elf, epiphany-elf, fido-elf, fr30-elf,
> frv-elf, i686-elf, lm32-elf, m68k-elf, mcore-elf, microblaze-elf,
> mips64el-elf, mips64-elf, mips64orion-elf, mipsel-elf,
> mipsisa32-elfoabi, mipsisa64-elfoabi, mipsisa64r2el-elf,
> mipsisa64r2-sde-elf, mipsisa64sb1-elf, mipstx39-elf, mn10300-elf,
> moxie-elf, nds32be-elf, nds32le-elf, nios2-elf, powerpc-eabialtivec,
> powerpc-eabisimaltivec, powerpc-eabisim, powerpc-eabispe, powerpc-eabi,
> powerpcle-eabisim, powerpcle-eabi, powerpcle-elf, ppc-eabi, ppc-elf,
> rx-elf, sh-elf, sh-superh-elf, sparc64-elf, sparc-elf, spu-elf, and
> visium-elf, and got the same build failures before and after the patch
> with targets c6x-elf, ft32-elf, h8300-elf, ia64-elf, iq2000-elf,
> m32c-elf, m32r-elf, m32rle-elf, mep-elf, mips64vr-elf
> (mips64vr-elf/mips16/newlib/libm/math/lib_a_e_hypot.o failed to build
> with the patch and passed without it, but there were other "invalid
> operand" failures for "lwu" insns without the patch, so I'm counting the
> e_hypot failure as present but latent before), mipsisa64sr71k-elf,
> msp430-elf, pdp11-aout, powerpc-xilinx-eabi, ppc64-eabi, rl78-elf,
> sh64-elf, sparc-leon-elf, v850e-elf, v850-elf, xstormy16-elf, and
> xtensa-elf.
>
> This patch differs from the previous one in that I dropped the hunk I
> had put in loop_exits_before_overflow, already noticed and fixed
> independently (PR66638); I updated tree_int_map_hasher, which was changed
> on the trunk in tree-ssa-live.c but which the patch moves to
> tree-ssa-coalesce.c; I resolved other conflicts in files that had
> #includes added by the patch and by other changes; and I put in the two
> fixes mentioned above. After the full updated patch, I enclose a diff
> with these two additional fixes, to ease the review.
>
> Is this ok to install?
Yes.
Thanks again for taking care of this!
Richard.
>
> for gcc/ChangeLog
>
> PR rtl-optimization/64164
> * Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
> * tree-ssa-copyrename.c: Removed.
> * opts.c (default_options_table): Drop -ftree-copyrename. Add
> -ftree-coalesce-vars.
> * passes.def: Drop all occurrences of pass_rename_ssa_copies.
> * common.opt (ftree-copyrename): Ignore.
> (ftree-coalesce-inlined-vars): Likewise.
> * doc/invoke.texi: Remove the ignored options above.
> * gimple-expr.h (gimple_can_coalesce_p): Move declaration
> * tree-ssa-coalesce.h: ... here.
> * tree-ssa-uncprop.c: Include tree-ssa-coalesce.h and other
> headers required by it.
> * gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
> across variables when flag_tree_coalesce_vars. Check register
> use and promoted modes to allow coalescing. Moved to
> tree-ssa-coalesce.c.
> * tree-ssa-live.c (struct tree_int_map_hasher): Move along
> with its member functions to tree-ssa-coalesce.c.
> (var_map_base_init): Likewise. Renamed to
> compute_samebase_partition_bases.
> (partition_view_normal): Drop want_bases parameter.
> (partition_view_bitmap): Likewise.
> * tree-ssa-live.h: Adjust declarations.
> * tree-ssa-coalesce.c: Include explow.h.
> (build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs's
> default defs at the entry point.
> (dump_part_var_map): New.
> (compute_optimized_partition_bases): New, called by...
> (coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
> of compute_samebase_partition_bases. Adjust.
> * alias.c (nonoverlapping_memrefs_p): Disregard gimple-regs.
> * cfgexpand.c (leader_merge): New.
> (get_rtl_for_parm_ssa_default_def): New.
> (set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
> vars. Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
> (expand_one_stack_var_at): Handle anonymous SSA_NAMEs. Drop
> redundant MEM attr setting.
> (expand_one_stack_var_1): Handle anonymous SSA_NAMEs. Renamed
> from...
> (expand_one_stack_var): ... this. New wrapper to check and
> skip already expanded SSA partitions.
> (record_alignment_for_reg_var): New, factored out of...
> (expand_one_var): ... this.
> (expand_one_ssa_partition): New.
> (adjust_one_expanded_partition_var): New.
> (expand_one_register_var): Check and skip already expanded SSA
> partitions.
> (expand_used_vars): Don't create DECLs for anonymous SSA
> names. Expand all SSA partitions, then adjust all SSA names.
> (pass::execute): Replace the loops that set
> SA.partition_to_pseudo from partition leaders and cleared
> DECL_RTL for multi-location variables, and that which used to
> rename vars and set attrs, with one that clears DECL_RTL and
> checks that PARMs and RESULTs default_defs match DECL_RTL.
> * cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
> * emit-rtl.c (set_reg_attrs_for_parm): Handle NULL decl.
> * explow.c (promote_ssa_mode): New.
> * explow.h (promote_ssa_mode): Declare.
> * expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
> * function.c: Include cfgexpand.h.
> (use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
> (use_register_for_parm_decl): Wrapper for the above to
> special-case the result_ptr.
> (rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
> (maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
> multiple locations.
> (assign_parm_adjust_stack_rtl): Add all and parm arguments,
> for rtl_for_parm. For SSA-assigned parms, zero stack_parm.
> (assign_parm_setup_block): Prefer SSA-assigned location.
> (assign_parm_setup_reg): Likewise. Use entry_parm for equiv
> if stack_parm is NULL.
> (assign_parm_setup_stack): Prefer SSA-assigned location.
> (assign_parms): Maybe reset DECL_RTL of params. Adjust stack
> rtl before testing for pointer bounds. Special-case result_ptr.
> (expand_function_start): Maybe reset DECL_RTL of result.
> Prefer SSA-assigned location for result and static chain.
> Factor out DECL_RESULT and SET_DECL_RTL.
> * tree-outof-ssa.c (insert_value_copy_on_edge): Handle
> anonymous SSA names. Use promote_ssa_mode.
> (get_temp_reg): Likewise.
> (remove_ssa_form): Adjust.
> * var-tracking.c (dataflow_set_clear_at_call): Take call_insn
> and get its reg_usage for reg invalidation.
> (compute_bb_dataflow): Pass it insn.
> (emit_notes_in_bb): Likewise.
>
> for gcc/testsuite/ChangeLog
>
> * gcc.dg/guality/pr54200.c: Add -fno-tree-coalesce-vars.
> * gcc.dg/ssp-1.c: Make counter a register.
> * gcc.dg/ssp-2.c: Likewise.
> * gcc.dg/torture/parm-coalesce.c: New.
> ---
> gcc/Makefile.in | 1
> gcc/alias.c | 13 +
> gcc/cfgexpand.c | 370 +++++++++++++++-----
> gcc/cfgexpand.h | 2
> gcc/common.opt | 12 -
> gcc/doc/invoke.texi | 48 +--
> gcc/emit-rtl.c | 5
> gcc/explow.c | 22 +
> gcc/explow.h | 3
> gcc/expr.c | 39 +-
> gcc/function.c | 228 ++++++++++--
> gcc/gimple-expr.c | 39 --
> gcc/gimple-expr.h | 1
> gcc/opts.c | 2
> gcc/passes.def | 5
> gcc/testsuite/gcc.dg/guality/pr54200.c | 2
> gcc/testsuite/gcc.dg/ssp-1.c | 2
> gcc/testsuite/gcc.dg/ssp-2.c | 2
> gcc/testsuite/gcc.dg/torture/parm-coalesce.c | 40 ++
> gcc/tree-outof-ssa.c | 16 -
> gcc/tree-ssa-coalesce.c | 378 ++++++++++++++++++++-
> gcc/tree-ssa-coalesce.h | 1
> gcc/tree-ssa-copyrename.c | 475 --------------------------
> gcc/tree-ssa-live.c | 99 -----
> gcc/tree-ssa-live.h | 4
> gcc/tree-ssa-uncprop.c | 5
> gcc/var-tracking.c | 12 -
> 27 files changed, 979 insertions(+), 847 deletions(-)
> create mode 100644 gcc/testsuite/gcc.dg/torture/parm-coalesce.c
> delete mode 100644 gcc/tree-ssa-copyrename.c
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index bf2186a..b36f9c1 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1445,7 +1445,6 @@ OBJS = \
> tree-ssa-ccp.o \
> tree-ssa-coalesce.o \
> tree-ssa-copy.o \
> - tree-ssa-copyrename.o \
> tree-ssa-dce.o \
> tree-ssa-dom.o \
> tree-ssa-dse.o \
> diff --git a/gcc/alias.c b/gcc/alias.c
> index 3203722..69e3732 100644
> --- a/gcc/alias.c
> +++ b/gcc/alias.c
> @@ -2507,6 +2507,19 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant)
> if (! DECL_P (exprx) || ! DECL_P (expry))
> return 0;
>
> + /* If we refer to different gimple registers, or one gimple register
> + and one non-gimple-register, we know they can't overlap. First,
> + gimple registers don't have their addresses taken. Now, there
> + could be more than one stack slot for (different versions of) the
> + same gimple register, but we can presumably tell they don't
> + overlap based on offsets from stack base addresses elsewhere.
> + It's important that we don't proceed to DECL_RTL, because gimple
> + registers may not pass DECL_RTL_SET_P, and make_decl_rtl won't be
> + able to do anything about them since no SSA information will have
> + remained to guide it. */
> + if (is_gimple_reg (exprx) || is_gimple_reg (expry))
> + return exprx != expry;
> +
> /* With invalid code we can end up storing into the constant pool.
> Bail out to avoid ICEing when creating RTL for this.
> See gfortran.dg/lto/20091028-2_0.f90. */
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index a047632..0b19953 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -150,21 +150,121 @@ gimple_assign_rhs_to_tree (gimple stmt)
>
> #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
>
> +/* Choose either CUR or NEXT as the leader DECL for a partition.
> + Prefer ignored decls, to simplify debug dumps and reduce ambiguity
> + out of the same user variable being in multiple partitions (this is
> + less likely for compiler-introduced temps). */
> +
> +static tree
> +leader_merge (tree cur, tree next)
> +{
> + if (cur == NULL || cur == next)
> + return next;
> +
> + if (DECL_P (cur) && DECL_IGNORED_P (cur))
> + return cur;
> +
> + if (DECL_P (next) && DECL_IGNORED_P (next))
> + return next;
> +
> + return cur;
> +}
> +
> +
> +/* Return the RTL for the default SSA def of a PARM or RESULT, if
> + there is one. */
> +
> +rtx
> +get_rtl_for_parm_ssa_default_def (tree var)
> +{
> + gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL);
> +
> + if (!is_gimple_reg (var))
> + return NULL_RTX;
> +
> + /* If we've already determined RTL for the decl, use it. This is
> + not just an optimization: if VAR is a PARM whose incoming value
> + is unused, we won't find a default def to use its partition, but
> + we still want to use the location of the parm, if it was used at
> + all. During assign_parms, until a location is assigned for the
> + VAR, RTL can only be set for a parm or result if we're not coalescing
> + across variables, when we know we're coalescing all SSA_NAMEs of
> + each parm or result, and we're not coalescing them with names
> + pertaining to other variables, such as other parms' default
> + defs. */
> + if (DECL_RTL_SET_P (var))
> + {
> + gcc_assert (DECL_RTL (var) != pc_rtx);
> + return DECL_RTL (var);
> + }
> +
> + tree name = ssa_default_def (cfun, var);
> +
> + if (!name)
> + return NULL_RTX;
> +
> + int part = var_to_partition (SA.map, name);
> + if (part == NO_PARTITION)
> + return NULL_RTX;
> +
> + return SA.partition_to_pseudo[part];
> +}
> +
> /* Associate declaration T with storage space X. If T is no
> SSA name this is exactly SET_DECL_RTL, otherwise make the
> partition of T associated with X. */
> static inline void
> set_rtl (tree t, rtx x)
> {
> + if (x && SSAVAR (t))
> + {
> + bool skip = false;
> + tree cur = NULL_TREE;
> +
> + if (MEM_P (x))
> + cur = MEM_EXPR (x);
> + else if (REG_P (x))
> + cur = REG_EXPR (x);
> + else if (GET_CODE (x) == CONCAT
> + && REG_P (XEXP (x, 0)))
> + cur = REG_EXPR (XEXP (x, 0));
> + else if (GET_CODE (x) == PARALLEL)
> + cur = REG_EXPR (XVECEXP (x, 0, 0));
> + else if (x == pc_rtx)
> + skip = true;
> + else
> + gcc_unreachable ();
> +
> + tree next = skip ? cur : leader_merge (cur, SSAVAR (t));
> +
> + if (cur != next)
> + {
> + if (MEM_P (x))
> + set_mem_attributes (x, next, true);
> + else
> + set_reg_attrs_for_decl_rtl (next, x);
> + }
> + }
> +
> if (TREE_CODE (t) == SSA_NAME)
> {
> - SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x;
> - if (x && !MEM_P (x))
> - set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x);
> - /* For the benefit of debug information at -O0 (where vartracking
> - doesn't run) record the place also in the base DECL if it's
> - a normal variable (not a parameter). */
> - if (x && x != pc_rtx && TREE_CODE (SSA_NAME_VAR (t)) == VAR_DECL)
> + int part = var_to_partition (SA.map, t);
> + if (part != NO_PARTITION)
> + {
> + if (SA.partition_to_pseudo[part])
> + gcc_assert (SA.partition_to_pseudo[part] == x);
> + else
> + SA.partition_to_pseudo[part] = x;
> + }
> + /* For the benefit of debug information at -O0 (where
> + vartracking doesn't run) record the place also in the base
> + DECL. For PARMs and RESULTs, we may end up resetting these
> + in function.c:maybe_reset_rtl_for_parm, but in some rare
> + cases we may need them (unused and overwritten incoming
> + value, that at -O0 must share the location with the other
> + uses in spite of the missing default def), and this may be
> + the only chance to preserve them. */
> + if (x && x != pc_rtx && SSA_NAME_VAR (t))
> {
> tree var = SSA_NAME_VAR (t);
> /* If we don't yet have something recorded, just record it now. */
> @@ -862,7 +962,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
> gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
>
> x = plus_constant (Pmode, base, offset);
> - x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x);
> + x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
> + ? TYPE_MODE (TREE_TYPE (decl))
> + : DECL_MODE (SSAVAR (decl)), x);
>
> if (TREE_CODE (decl) != SSA_NAME)
> {
> @@ -884,7 +986,6 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
> DECL_USER_ALIGN (decl) = 0;
> }
>
> - set_mem_attributes (x, SSAVAR (decl), true);
> set_rtl (decl, x);
> }
>
> @@ -1099,13 +1200,22 @@ account_stack_vars (void)
> to a variable to be allocated in the stack frame. */
>
> static void
> -expand_one_stack_var (tree var)
> +expand_one_stack_var_1 (tree var)
> {
> HOST_WIDE_INT size, offset;
> unsigned byte_align;
>
> - size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
> - byte_align = align_local_variable (SSAVAR (var));
> + if (TREE_CODE (var) == SSA_NAME)
> + {
> + tree type = TREE_TYPE (var);
> + size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
> + byte_align = TYPE_ALIGN_UNIT (type);
> + }
> + else
> + {
> + size = tree_to_uhwi (DECL_SIZE_UNIT (var));
> + byte_align = align_local_variable (var);
> + }
>
> /* We handle highly aligned variables in expand_stack_vars. */
> gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
> @@ -1116,6 +1226,27 @@ expand_one_stack_var (tree var)
> crtl->max_used_stack_slot_alignment, offset);
> }
>
> +/* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are
> + already assigned some MEM. */
> +
> +static void
> +expand_one_stack_var (tree var)
> +{
> + if (TREE_CODE (var) == SSA_NAME)
> + {
> + int part = var_to_partition (SA.map, var);
> + if (part != NO_PARTITION)
> + {
> + rtx x = SA.partition_to_pseudo[part];
> + gcc_assert (x);
> + gcc_assert (MEM_P (x));
> + return;
> + }
> + }
> +
> + return expand_one_stack_var_1 (var);
> +}
> +
> /* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL
> that will reside in a hard register. */
>
> @@ -1125,13 +1256,114 @@ expand_one_hard_reg_var (tree var)
> rest_of_decl_compilation (var, 0, 0);
> }
>
> +/* Record the alignment requirements of some variable assigned to a
> + pseudo. */
> +
> +static void
> +record_alignment_for_reg_var (unsigned int align)
> +{
> + if (SUPPORTS_STACK_ALIGNMENT
> + && crtl->stack_alignment_estimated < align)
> + {
> + /* stack_alignment_estimated shouldn't change after stack
> + realign decision made */
> + gcc_assert (!crtl->stack_realign_processed);
> + crtl->stack_alignment_estimated = align;
> + }
> +
> + /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
> + So here we only make sure stack_alignment_needed >= align. */
> + if (crtl->stack_alignment_needed < align)
> + crtl->stack_alignment_needed = align;
> + if (crtl->max_used_stack_slot_alignment < align)
> + crtl->max_used_stack_slot_alignment = align;
> +}
> +
> +/* Create RTL for an SSA partition. */
> +
> +static void
> +expand_one_ssa_partition (tree var)
> +{
> + int part = var_to_partition (SA.map, var);
> + gcc_assert (part != NO_PARTITION);
> +
> + if (SA.partition_to_pseudo[part])
> + return;
> +
> + if (!use_register_for_decl (var))
> + {
> + expand_one_stack_var_1 (var);
> + return;
> + }
> +
> + unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (var),
> + TYPE_MODE (TREE_TYPE (var)),
> + TYPE_ALIGN (TREE_TYPE (var)));
> +
> + /* If the variable alignment is very large we'll dynamically allocate
> + it, which means that in-frame portion is just a pointer. */
> + if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
> + align = POINTER_SIZE;
> +
> + record_alignment_for_reg_var (align);
> +
> + machine_mode reg_mode = promote_ssa_mode (var, NULL);
> +
> + rtx x = gen_reg_rtx (reg_mode);
> +
> + set_rtl (var, x);
> +}
> +
> +/* Record the association between the RTL generated for a partition
> + and the underlying variable of the SSA_NAME. */
> +
> +static void
> +adjust_one_expanded_partition_var (tree var)
> +{
> + if (!var)
> + return;
> +
> + tree decl = SSA_NAME_VAR (var);
> +
> + int part = var_to_partition (SA.map, var);
> + if (part == NO_PARTITION)
> + return;
> +
> + rtx x = SA.partition_to_pseudo[part];
> +
> + set_rtl (var, x);
> +
> + if (!REG_P (x))
> + return;
> +
> + /* Note if the object is a user variable. */
> + if (decl && !DECL_ARTIFICIAL (decl))
> + mark_user_reg (x);
> +
> + if (POINTER_TYPE_P (decl ? TREE_TYPE (decl) : TREE_TYPE (var)))
> + mark_reg_pointer (x, get_pointer_alignment (var));
> +}
> +
> /* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL
> that will reside in a pseudo register. */
>
> static void
> expand_one_register_var (tree var)
> {
> - tree decl = SSAVAR (var);
> + if (TREE_CODE (var) == SSA_NAME)
> + {
> + int part = var_to_partition (SA.map, var);
> + if (part != NO_PARTITION)
> + {
> + rtx x = SA.partition_to_pseudo[part];
> + gcc_assert (x);
> + gcc_assert (REG_P (x));
> + return;
> + }
> + gcc_unreachable ();
> + }
> +
> + tree decl = var;
> tree type = TREE_TYPE (decl);
> machine_mode reg_mode = promote_decl_mode (decl, NULL);
> rtx x = gen_reg_rtx (reg_mode);
> @@ -1265,21 +1497,7 @@ expand_one_var (tree var, bool toplevel, bool really_expand)
> align = POINTER_SIZE;
> }
>
> - if (SUPPORTS_STACK_ALIGNMENT
> - && crtl->stack_alignment_estimated < align)
> - {
> - /* stack_alignment_estimated shouldn't change after stack
> - realign decision made */
> - gcc_assert (!crtl->stack_realign_processed);
> - crtl->stack_alignment_estimated = align;
> - }
> -
> - /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
> - So here we only make sure stack_alignment_needed >= align. */
> - if (crtl->stack_alignment_needed < align)
> - crtl->stack_alignment_needed = align;
> - if (crtl->max_used_stack_slot_alignment < align)
> - crtl->max_used_stack_slot_alignment = align;
> + record_alignment_for_reg_var (align);
>
> if (TREE_CODE (origvar) == SSA_NAME)
> {
> @@ -1713,48 +1931,18 @@ expand_used_vars (void)
> if (targetm.use_pseudo_pic_reg ())
> pic_offset_table_rtx = gen_reg_rtx (Pmode);
>
> - hash_map<tree, tree> ssa_name_decls;
> for (i = 0; i < SA.map->num_partitions; i++)
> {
> tree var = partition_to_var (SA.map, i);
>
> gcc_assert (!virtual_operand_p (var));
>
> - /* Assign decls to each SSA name partition, share decls for partitions
> - we could have coalesced (those with the same type). */
> - if (SSA_NAME_VAR (var) == NULL_TREE)
> - {
> - tree *slot = &ssa_name_decls.get_or_insert (TREE_TYPE (var));
> - if (!*slot)
> - *slot = create_tmp_reg (TREE_TYPE (var));
> - replace_ssa_name_symbol (var, *slot);
> - }
> -
> - /* Always allocate space for partitions based on