* RFC: expand from SSA form (1/2)
From: Michael Matz @ 2009-04-13 20:50 UTC
To: gcc-patches

Hi,

This patch implements expanding directly from SSA form, i.e. without
first going out of SSA and producing GENERIC.  It doesn't yet get rid of
explicitly building trees instead of expanding GIMPLE directly; that's
for some later time.  But it does already make passing information from
tree-ssa land into RTL land somewhat easier (the obvious candidates are
debug information, because we don't generate new decls, and alias info).

This first patch actually implements the whole thing; the second patch
cleans up all the cruft ("if (0)", dead functions and the like).  I've
separated them so as not to clutter the important parts with the
deletions.

Anyway, here is how it works:

(1) in SSA form: create and coalesce partitions, as before, but don't
    rewrite anything.  Pass this info to cfgexpand.c.
(2) cfgexpand.c: allocate some space (in pseudos or stack) for each
    partition not containing a default def (these are the ones
    corresponding to VAR_DECLs).  Set interesting attributes for that
    RTX according to the underlying SSA_NAME_VAR.  I.e. we don't have to
    create artificial abc.24 variables for secondary partitions.
(3) cfgexpand.c: set up argument RTXs
(4) all partitions not yet having some RTL will now get the one from the
    underlying SSA_NAME_VAR (these are the ones containing a default def,
    corresponding to PARM_DECLs)

Now all partitions have some allocated space, which we're going to use
later in expr.c when we expand SSA_NAMEs (each SSA name belongs to a
partition, if it's used anywhere in program text).

(5) RTL cfg hooks are activated.  Note that basic blocks still are in
    gimple SSA form, and we still have PHI nodes.
(6) tree-outof-ssa.c: expand all phi nodes.
    This is similar to before, just that we aren't emitting gimple
    instructions, but already real RTL insns, on the edges.  Yep, on the
    edges we have RTL insns, while in the basic blocks we still have
    gimple.  Which also means we can't yet commit the insns on edges.
(7) all basic blocks are expanded like before (i.e. constructing trees of
    gimple, expanding those one by one; when we see an SSA_NAME we use
    the RTX expressions from above)

Now we have only RTL insns in BBs and edges.  Some basic blocks are
expanded to contain multiple jumps, so they really are super blocks.

(8) commit all insns from edges.  This possibly means splitting edges,
    which needs redirecting edges, which is a problem when the source
    block is really a super block.  Redirecting edges in RTL land
    rewrites the jump insn label.  As we might have multiple jumps in one
    super block we need to look into all of them.  I explicitly don't
    want to split all critical edges before expanding.

We're basically done now.

(9) clean up data structures (which formerly were passes in their own
    right), ensure some invariants in RTL land (no EDGE_EXECUTABLE for
    instance) and the like.

For the necessary interaction between outof-ssa, cfgexpand and expand
itself I've created a new header containing the necessary info (the
coalesced partitions, plus space for their RTL expressions and some
accessors to those).

TER is still supported and implemented by deferring expanding an
assignment that is TERed to the place where it's actually used.  That's
a change from before: we don't insert the tree of the RHS directly into
the tree that is going to be expanded.

The attempt to optimize placement of instructions on edges
(process_single_block_loop_latch, analyze_edges_for_bb) isn't
reimplemented for now.  Currently it's implemented by scanning the gimple
instructions looking for common sequences, which is more difficult in RTL
land as one move can possibly be emitted as multiple insns.
This commonizing should IMO be implemented somewhere else, namely as a
preconditioning pass before un-SSA.  The process_single_block_loop_latch
hack needs to be moved somewhere else too, which I'd also like to do
later.

I've also changed out-of-ssa to mostly work only with partition numbers
instead of partition variables (those are removed in the cleanup pass).
We also don't need to must-coalesce default defs with base variables or
in fact change the variables per partition in any way at all (i.e. a
partition always corresponds to exactly one SSA name, whose version is
the partition number).

As out-of-ssa and expand are now basically one pass there can't be any
passes working on non-SSA trees anymore.  Currently that affects only two
passes: tree-nrv, which is easily fixed, and mudflap, which I deactivated
for now.

This patch (and this one plus the cleanup patch) bootstraps fine on
x86_64-linux.  Regtesting shows only expected problems (in particular no
execute.exp fails), namely fallout from deactivating mudflap and fallout
because there's no .optimized treedump anymore.

I haven't yet checked the performance impact on SPEC or the like.  Nor
did I check memory use in detail (i.e. whether the cleanups indeed clean
up all unnecessary memory).  But I'm interested in any comments you might
have.  In particular I'm not exceptionally fond of the four
insert_*_on_edge() routines, though we indeed need four signatures:
partition to partition, tree to partition, rtx to partition and partition
to rtx.

No ChangeLog yet.


Ciao,
Michael.
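
For readers who want a picture of steps (2) to (4) before diving into
the patch, here is a toy model in plain C.  It is only a sketch and is
not GCC code: the toy_* names, the integer "rtx" and the struct layout
are all made up for illustration.  It assumes, as the cfgexpand.c hunks
seem to do, that partitions based on a VAR_DECL, and partitions without
a default def, get fresh space first, and that whatever is left takes
the RTL of its base PARM_DECL afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an rtx: a pseudo register number,
   with 0 meaning "no RTL assigned yet".  Not a GCC type.  */
typedef int toy_rtx;
#define TOY_NO_RTL 0

/* Kind of the decl underlying a partition (its SSA_NAME_VAR).  */
enum toy_base_kind { TOY_VAR_DECL, TOY_PARM_DECL };

struct toy_partition
{
  enum toy_base_kind base_kind;  /* kind of the base variable */
  bool has_default_def;          /* partition contains the default def */
  toy_rtx base_rtl;              /* RTL of the base decl once set up */
};

/* Counter for fresh pseudos; the start value is arbitrary.  */
static toy_rtx toy_next_pseudo = 100;

toy_rtx
toy_gen_reg (void)
{
  return toy_next_pseudo++;
}

/* Step (2): give fresh space to every partition that is based on a
   VAR_DECL or that does not contain a default def.  */
void
toy_allocate_partitions (const struct toy_partition *parts, size_t n,
                         toy_rtx *partition_to_pseudo)
{
  for (size_t i = 0; i < n; i++)
    if (parts[i].base_kind == TOY_VAR_DECL || !parts[i].has_default_def)
      partition_to_pseudo[i] = toy_gen_reg ();
}

/* Step (4): once argument RTXs exist (step (3)), every partition still
   without RTL takes the one of its base variable.  Afterwards each
   partition must have some RTL.  */
void
toy_inherit_base_rtl (const struct toy_partition *parts, size_t n,
                      toy_rtx *partition_to_pseudo)
{
  for (size_t i = 0; i < n; i++)
    {
      if (partition_to_pseudo[i] == TOY_NO_RTL)
        partition_to_pseudo[i] = parts[i].base_rtl;
      assert (partition_to_pseudo[i] != TOY_NO_RTL);
    }
}
```

Calling toy_allocate_partitions and then toy_inherit_base_rtl on the
same array leaves a parameter's own partition sharing the parameter's
RTL, while a temporary that merely is based on that parameter gets a
pseudo of its own.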
-- Index: builtins.c =================================================================== *** builtins.c.orig --- builtins.c *************** expand_builtin (tree exp, rtx target, rt *** 6237,6242 **** --- 6237,6244 ---- enum built_in_function fcode = DECL_FUNCTION_CODE (fndecl); enum machine_mode target_mode = TYPE_MODE (TREE_TYPE (exp)); + if (mode == VOIDmode) + mode = target_mode; if (DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_MD) return targetm.expand_builtin (exp, target, subtarget, mode, ignore); Index: tree-nrv.c =================================================================== *** tree-nrv.c.orig --- tree-nrv.c *************** struct nrv_data *** 56,61 **** --- 56,62 ---- /* This is the function's RESULT_DECL. We will replace all occurrences of VAR with RESULT_DECL when we apply this optimization. */ tree result; + int modified; }; static tree finalize_nrv_r (tree *, int *, void *); *************** finalize_nrv_r (tree *tp, int *walk_subt *** 83,89 **** /* Otherwise replace all occurrences of VAR with RESULT. */ else if (*tp == dp->var) ! *tp = dp->result; /* Keep iterating. */ return NULL_TREE; --- 84,90 ---- /* Otherwise replace all occurrences of VAR with RESULT. */ else if (*tp == dp->var) ! *tp = dp->result, dp->modified = 1; /* Keep iterating. */ return NULL_TREE; *************** tree_nrv (void) *** 110,115 **** --- 111,117 ---- basic_block bb; gimple_stmt_iterator gsi; struct nrv_data data; + int any_modified = 0; /* If this function does not return an aggregate type in memory, then there is nothing to do. */ *************** tree_nrv (void) *** 235,241 **** --- 237,246 ---- struct walk_stmt_info wi; memset (&wi, 0, sizeof (wi)); wi.info = &data; + data.modified = 0; walk_gimple_op (stmt, finalize_nrv_r, &wi); + if (data.modified) + update_stmt (stmt), any_modified = 1; gsi_next (&gsi); } } *************** tree_nrv (void) *** 243,248 **** --- 248,258 ---- /* FOUND is no longer used. Ensure it gets removed. 
*/ var_ann (found)->used = 0; + if (any_modified) + { + mark_sym_for_renaming (gimple_vop (cfun)); + return TODO_update_ssa; + } return 0; } *************** struct gimple_opt_pass pass_nrv = *** 263,269 **** NULL, /* next */ 0, /* static_pass_number */ TV_TREE_NRV, /* tv_id */ ! PROP_cfg, /* properties_required */ 0, /* properties_provided */ 0, /* properties_destroyed */ 0, /* todo_flags_start */ --- 273,279 ---- NULL, /* next */ 0, /* static_pass_number */ TV_TREE_NRV, /* tv_id */ ! PROP_ssa | PROP_cfg, /* properties_required */ 0, /* properties_provided */ 0, /* properties_destroyed */ 0, /* todo_flags_start */ Index: expr.c =================================================================== *** expr.c.orig --- expr.c *************** along with GCC; see the file COPYING3. *** 54,59 **** --- 54,60 ---- #include "timevar.h" #include "df.h" #include "diagnostic.h" + #include "ssaexpand.h" /* Decide whether a function's arguments should be processed from first to last or from last to first. *************** expand_expr_real_1 (tree exp, rtx target *** 7244,7251 **** } case SSA_NAME: ! return expand_expr_real_1 (SSA_NAME_VAR (exp), target, tmode, modifier, ! NULL); case PARM_DECL: case VAR_DECL: --- 7245,7265 ---- } case SSA_NAME: ! /* ??? ivopts calls expander, without any preparation from ! out-of-ssa. So fake instructions as if this was an access to the ! base variable. This unnecessarily allocates a pseudo, see how we can ! reuse it, if partition base vars have it set already. */ ! if (!currently_expanding_to_rtl) ! return expand_expr_real_1 (SSA_NAME_VAR (exp), target, tmode, modifier, NULL); ! { ! gimple g = get_gimple_for_ssa_name (exp); ! if (g) ! return expand_expr_real_1 (gimple_assign_rhs_to_tree (g), target, ! tmode, modifier, NULL); ! } ! decl_rtl = get_rtx_for_ssa_name (exp); ! exp = SSA_NAME_VAR (exp); ! 
goto expand_decl_rtl; case PARM_DECL: case VAR_DECL: *************** expand_expr_real_1 (tree exp, rtx target *** 7271,7276 **** --- 7285,7291 ---- case FUNCTION_DECL: case RESULT_DECL: decl_rtl = DECL_RTL (exp); + expand_decl_rtl: gcc_assert (decl_rtl); decl_rtl = copy_rtx (decl_rtl); Index: emit-rtl.c =================================================================== *** emit-rtl.c.orig --- emit-rtl.c *************** set_reg_attrs_for_parm (rtx parm_rtx, rt *** 1028,1034 **** /* Set the REG_ATTRS for registers in value X, given that X represents decl T. */ ! static void set_reg_attrs_for_decl_rtl (tree t, rtx x) { if (GET_CODE (x) == SUBREG) --- 1028,1034 ---- /* Set the REG_ATTRS for registers in value X, given that X represents decl T. */ ! void set_reg_attrs_for_decl_rtl (tree t, rtx x) { if (GET_CODE (x) == SUBREG) Index: cfgexpand.c =================================================================== *** cfgexpand.c.orig --- cfgexpand.c *************** along with GCC; see the file COPYING3. *** 42,49 **** --- 42,54 ---- #include "tree-inline.h" #include "value-prof.h" #include "target.h" + #include "ssaexpand.h" + /* This variable holds information helping the rewriting of SSA trees + into RTL. */ + struct ssaexpand SA; + /* Return an expression tree corresponding to the RHS of GIMPLE statement STMT. */ *************** failed: *** 423,428 **** --- 428,447 ---- #define STACK_ALIGNMENT_NEEDED 1 #endif + #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x) + + static inline void + set_rtl (tree t, rtx x) + { + if (TREE_CODE (t) == SSA_NAME) + { + SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x; + if (x && !MEM_P (x)) + set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x); + } + else + SET_DECL_RTL (t, x); + } /* This structure holds data relevant to one variable that will be placed in a stack slot. */ *************** add_stack_var (tree decl) *** 561,575 **** } stack_vars[stack_vars_num].decl = decl; stack_vars[stack_vars_num].offset = 0; ! 
stack_vars[stack_vars_num].size = tree_low_cst (DECL_SIZE_UNIT (decl), 1); ! stack_vars[stack_vars_num].alignb = get_decl_align_unit (decl); /* All variables are initially in their own partition. */ stack_vars[stack_vars_num].representative = stack_vars_num; stack_vars[stack_vars_num].next = EOC; /* Ensure that this decl doesn't get put onto the list twice. */ ! SET_DECL_RTL (decl, pc_rtx); stack_vars_num++; } --- 580,594 ---- } stack_vars[stack_vars_num].decl = decl; stack_vars[stack_vars_num].offset = 0; ! stack_vars[stack_vars_num].size = tree_low_cst (DECL_SIZE_UNIT (SSAVAR (decl)), 1); ! stack_vars[stack_vars_num].alignb = get_decl_align_unit (SSAVAR (decl)); /* All variables are initially in their own partition. */ stack_vars[stack_vars_num].representative = stack_vars_num; stack_vars[stack_vars_num].next = EOC; /* Ensure that this decl doesn't get put onto the list twice. */ ! set_rtl (decl, pc_rtx); stack_vars_num++; } *************** add_alias_set_conflicts (void) *** 688,709 **** } /* A subroutine of partition_stack_vars. A comparison function for qsort, ! sorting an array of indices by the size of the object. */ static int stack_var_size_cmp (const void *a, const void *b) { HOST_WIDE_INT sa = stack_vars[*(const size_t *)a].size; HOST_WIDE_INT sb = stack_vars[*(const size_t *)b].size; ! unsigned int uida = DECL_UID (stack_vars[*(const size_t *)a].decl); ! unsigned int uidb = DECL_UID (stack_vars[*(const size_t *)b].decl); if (sa < sb) return -1; if (sa > sb) return 1; ! /* For stack variables of the same size use the uid of the decl ! to make the sort stable. */ if (uida < uidb) return -1; if (uida > uidb) --- 707,743 ---- } /* A subroutine of partition_stack_vars. A comparison function for qsort, ! sorting an array of indices by the size and type of the object. */ static int stack_var_size_cmp (const void *a, const void *b) { HOST_WIDE_INT sa = stack_vars[*(const size_t *)a].size; HOST_WIDE_INT sb = stack_vars[*(const size_t *)b].size; ! 
tree decla, declb; ! unsigned int uida, uidb; if (sa < sb) return -1; if (sa > sb) return 1; ! decla = stack_vars[*(const size_t *)a].decl; ! declb = stack_vars[*(const size_t *)b].decl; ! /* For stack variables of the same size use and id of the decls ! to make the sort stable. Two SSA names are compared by their ! version, SSA names come before non-SSA names, and two normal ! decls are compared by their DECL_UID. */ ! if (TREE_CODE (decla) == SSA_NAME) ! { ! if (TREE_CODE (declb) == SSA_NAME) ! uida = SSA_NAME_VERSION (decla), uidb = SSA_NAME_VERSION (declb); ! else ! return -1; ! } ! else if (TREE_CODE (declb) == SSA_NAME) ! return 1; ! else ! uida = DECL_UID (decla), uidb = DECL_UID (declb); if (uida < uidb) return -1; if (uida > uidb) *************** expand_one_stack_var_at (tree decl, HOST *** 874,894 **** gcc_assert (offset == trunc_int_for_mode (offset, Pmode)); x = plus_constant (virtual_stack_vars_rtx, offset); ! x = gen_rtx_MEM (DECL_MODE (decl), x); ! /* Set alignment we actually gave this decl. */ ! offset -= frame_phase; ! align = offset & -offset; ! align *= BITS_PER_UNIT; ! if (align == 0) ! align = STACK_BOUNDARY; ! else if (align > MAX_SUPPORTED_STACK_ALIGNMENT) ! align = MAX_SUPPORTED_STACK_ALIGNMENT; ! DECL_ALIGN (decl) = align; ! DECL_USER_ALIGN (decl) = 0; ! set_mem_attributes (x, decl, true); ! SET_DECL_RTL (decl, x); } /* A subroutine of expand_used_vars. Give each partition representative --- 908,934 ---- gcc_assert (offset == trunc_int_for_mode (offset, Pmode)); x = plus_constant (virtual_stack_vars_rtx, offset); ! x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x); ! if (TREE_CODE (decl) != SSA_NAME) ! { ! /* Set alignment we actually gave this decl if it isn't an SSA name. ! If it is we generate stack slots only accidentally so it isn't as ! important, we'll simply use the alignment that is already set. */ ! offset -= frame_phase; ! align = offset & -offset; ! align *= BITS_PER_UNIT; ! if (align == 0) ! align = STACK_BOUNDARY; ! 
else if (align > MAX_SUPPORTED_STACK_ALIGNMENT) ! align = MAX_SUPPORTED_STACK_ALIGNMENT; ! ! DECL_ALIGN (decl) = align; ! DECL_USER_ALIGN (decl) = 0; ! } ! set_mem_attributes (x, SSAVAR (decl), true); ! set_rtl (decl, x); } /* A subroutine of expand_used_vars. Give each partition representative *************** expand_stack_vars (bool (*pred) (tree)) *** 912,918 **** /* Skip variables that have already had rtl assigned. See also add_stack_var where we perpetrate this pc_rtx hack. */ ! if (DECL_RTL (stack_vars[i].decl) != pc_rtx) continue; /* Check the predicate to see whether this variable should be --- 952,960 ---- /* Skip variables that have already had rtl assigned. See also add_stack_var where we perpetrate this pc_rtx hack. */ ! if ((TREE_CODE (stack_vars[i].decl) == SSA_NAME ! ? SA.partition_to_pseudo[var_to_partition (SA.map, stack_vars[i].decl)] ! : DECL_RTL (stack_vars[i].decl)) != pc_rtx) continue; /* Check the predicate to see whether this variable should be *************** account_stack_vars (void) *** 951,957 **** size += stack_vars[i].size; for (j = i; j != EOC; j = stack_vars[j].next) ! SET_DECL_RTL (stack_vars[j].decl, NULL); } return size; } --- 993,999 ---- size += stack_vars[i].size; for (j = i; j != EOC; j = stack_vars[j].next) ! set_rtl (stack_vars[j].decl, NULL); } return size; } *************** expand_one_stack_var (tree var) *** 964,971 **** { HOST_WIDE_INT size, offset, align; ! size = tree_low_cst (DECL_SIZE_UNIT (var), 1); ! align = get_decl_align_unit (var); offset = alloc_stack_frame_space (size, align); expand_one_stack_var_at (var, offset); --- 1006,1013 ---- { HOST_WIDE_INT size, offset, align; ! size = tree_low_cst (DECL_SIZE_UNIT (SSAVAR (var)), 1); ! align = get_decl_align_unit (SSAVAR (var)); offset = alloc_stack_frame_space (size, align); expand_one_stack_var_at (var, offset); *************** expand_one_hard_reg_var (tree var) *** 986,1005 **** static void expand_one_register_var (tree var) { ! 
tree type = TREE_TYPE (var); int unsignedp = TYPE_UNSIGNED (type); enum machine_mode reg_mode ! = promote_mode (type, DECL_MODE (var), &unsignedp, 0); rtx x = gen_reg_rtx (reg_mode); ! SET_DECL_RTL (var, x); /* Note if the object is a user variable. */ ! if (!DECL_ARTIFICIAL (var)) ! mark_user_reg (x); if (POINTER_TYPE_P (type)) ! mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var)))); } /* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL that --- 1028,1048 ---- static void expand_one_register_var (tree var) { ! tree decl = SSAVAR (var); ! tree type = TREE_TYPE (decl); int unsignedp = TYPE_UNSIGNED (type); enum machine_mode reg_mode ! = promote_mode (type, DECL_MODE (decl), &unsignedp, 0); rtx x = gen_reg_rtx (reg_mode); ! set_rtl (var, x); /* Note if the object is a user variable. */ ! if (!DECL_ARTIFICIAL (decl)) ! mark_user_reg (x); if (POINTER_TYPE_P (type)) ! mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type))); } /* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL that *************** defer_stack_allocation (tree var, bool t *** 1067,1072 **** --- 1110,1119 ---- static HOST_WIDE_INT expand_one_var (tree var, bool toplevel, bool really_expand) { + tree origvar = var; + if (TREE_CODE (var) == SSA_NAME) + var = SSA_NAME_VAR (var); + if (SUPPORTS_STACK_ALIGNMENT && TREE_TYPE (var) != error_mark_node && TREE_CODE (var) == VAR_DECL) *************** expand_one_var (tree var, bool toplevel, *** 1092,1098 **** } } ! if (TREE_CODE (var) != VAR_DECL) ; else if (DECL_EXTERNAL (var)) ; --- 1139,1156 ---- } } ! if (TREE_CODE (origvar) == SSA_NAME) ! { ! gcc_assert (TREE_CODE (var) != VAR_DECL ! || (!DECL_EXTERNAL (var) ! && !DECL_HAS_VALUE_EXPR_P (var) ! && !TREE_STATIC (var) ! && !DECL_RTL_SET_P (var) ! && TREE_TYPE (var) != error_mark_node ! && !DECL_HARD_REGISTER (var) ! && really_expand)); ! } ! 
if (TREE_CODE (var) != VAR_DECL && TREE_CODE (origvar) != SSA_NAME) ; else if (DECL_EXTERNAL (var)) ; *************** expand_one_var (tree var, bool toplevel, *** 1107,1113 **** if (really_expand) expand_one_error_var (var); } ! else if (DECL_HARD_REGISTER (var)) { if (really_expand) expand_one_hard_reg_var (var); --- 1165,1171 ---- if (really_expand) expand_one_error_var (var); } ! else if (TREE_CODE (var) == VAR_DECL && DECL_HARD_REGISTER (var)) { if (really_expand) expand_one_hard_reg_var (var); *************** expand_one_var (tree var, bool toplevel, *** 1115,1128 **** else if (use_register_for_decl (var)) { if (really_expand) ! expand_one_register_var (var); } else if (defer_stack_allocation (var, toplevel)) ! add_stack_var (var); else { if (really_expand) ! expand_one_stack_var (var); return tree_low_cst (DECL_SIZE_UNIT (var), 1); } return 0; --- 1173,1186 ---- else if (use_register_for_decl (var)) { if (really_expand) ! expand_one_register_var (origvar); } else if (defer_stack_allocation (var, toplevel)) ! add_stack_var (origvar); else { if (really_expand) ! expand_one_stack_var (origvar); return tree_low_cst (DECL_SIZE_UNIT (var), 1); } return 0; *************** static void *** 1441,1446 **** --- 1499,1505 ---- expand_used_vars (void) { tree t, next, outer_block = DECL_INITIAL (current_function_decl); + unsigned i; /* Compute the phase of the stack frame for this function. */ { *************** expand_used_vars (void) *** 1451,1456 **** --- 1510,1557 ---- init_vars_expansion (); + for (i = 0; i < SA.map->num_partitions; i++) + { + tree var = partition_to_var (SA.map, i); + + if (TREE_CODE (var) == SSA_NAME) + var = SSA_NAME_VAR (var); + gcc_assert (is_gimple_reg (var)); + if (TREE_CODE (var) == VAR_DECL) + expand_one_var (partition_to_var (SA.map, i), true, true); + else + { + /* This is a PARM_DECL or RESULT_DECL. For those partitions that + contain the default def (representing the parm or result itself) + we don't do anything here. 
But those which don't contain the + default def (representing a temporary based on the parm/result) + we need to allocate space just like for normal VAR_DECLs. */ + int j = i; + struct partition_elem *start, *elem; + int has_default = 0; + if (SA.map->view_to_partition) + j = SA.map->view_to_partition[j]; + j = partition_find (SA.map->var_partition, j); + start = elem = SA.map->var_partition->elements + j; + do + { + j = elem - SA.map->var_partition->elements; + elem = elem->next; + if (SSA_NAME_IS_DEFAULT_DEF (ssa_name (j))) + { + has_default = 1; + break; + } + } + while (elem != start); + if (!has_default) + { + expand_one_var (partition_to_var (SA.map, i), true, true); + gcc_assert (SA.partition_to_pseudo[i]); + } + } + } + /* At this point all variables on the local_decls with TREE_USED set are not associated with any block scope. Lay them out. */ t = cfun->local_decls; *************** expand_used_vars (void) *** 1462,1473 **** next = TREE_CHAIN (t); /* We didn't set a block for static or extern because it's hard to tell the difference between a global variable (re)declared in a local scope, and one that's really declared there to begin with. And it doesn't really matter much, since we're not giving them stack space. Expand them now. */ ! if (TREE_STATIC (var) || DECL_EXTERNAL (var)) expand_now = true; /* Any variable that could have been hoisted into an SSA_NAME --- 1563,1577 ---- next = TREE_CHAIN (t); + /* Expanded above already. */ + if (is_gimple_reg (var)) + ; /* We didn't set a block for static or extern because it's hard to tell the difference between a global variable (re)declared in a local scope, and one that's really declared there to begin with. And it doesn't really matter much, since we're not giving them stack space. Expand them now. */ ! 
else if (TREE_STATIC (var) || DECL_EXTERNAL (var)) expand_now = true; /* Any variable that could have been hoisted into an SSA_NAME *************** expand_gimple_cond (basic_block bb, gimp *** 1722,1727 **** --- 1826,1832 ---- if (BARRIER_P (BB_END (new_bb))) BB_END (new_bb) = PREV_INSN (BB_END (new_bb)); update_bb_for_insn (new_bb); + gcc_assert ((dest->flags & BB_RTL) || gimple_seq_empty_p (phi_nodes (dest))); maybe_dump_rtl_for_gimple_stmt (stmt, last2); *************** expand_gimple_basic_block (basic_block b *** 1854,1859 **** --- 1959,1965 ---- { gimple_stmt_iterator gsi; gimple_seq stmts; + gimple_seq phis; gimple stmt = NULL; rtx note, last; edge e; *************** expand_gimple_basic_block (basic_block b *** 1869,1874 **** --- 1975,1981 ---- block to be in GIMPLE, instead of RTL. Therefore, we need to access the BB sequence directly. */ stmts = bb_seq (bb); + phis = phi_nodes (bb); bb->il.gimple = NULL; rtl_profile_for_bb (bb); init_rtl_bb_info (bb); *************** expand_gimple_basic_block (basic_block b *** 1932,1951 **** NOTE_BASIC_BLOCK (note) = bb; - for (ei = ei_start (bb->succs); (e = ei_safe_edge (ei)); ) - { - /* Clear EDGE_EXECUTABLE. This flag is never used in the backend. */ - e->flags &= ~EDGE_EXECUTABLE; - - /* At the moment not all abnormal edges match the RTL representation. - It is safe to remove them here as find_many_sub_basic_blocks will - rediscover them. In the future we should get this fixed properly. */ - if (e->flags & EDGE_ABNORMAL) - remove_edge (e); - else - ei_next (&ei); - } - for (; !gsi_end_p (gsi); gsi_next (&gsi)) { gimple stmt = gsi_stmt (gsi); --- 2039,2044 ---- *************** expand_gimple_basic_block (basic_block b *** 1975,1981 **** } else if (gimple_code (stmt) != GIMPLE_CHANGE_DYNAMIC_TYPE) { ! tree stmt_tree = gimple_to_tree (stmt); last = get_last_insn (); expand_expr_stmt (stmt_tree); maybe_dump_rtl_for_gimple_stmt (stmt, last); --- 2068,2086 ---- } else if (gimple_code (stmt) != GIMPLE_CHANGE_DYNAMIC_TYPE) { ! 
def_operand_p def_p; ! tree stmt_tree; ! def_p = SINGLE_SSA_DEF_OPERAND (stmt, SSA_OP_DEF); ! ! if (def_p != NULL) ! { ! /* Mark this stmt for removal if it is the list of replaceable ! expressions. */ ! if (SA.values ! && SA.values[SSA_NAME_VERSION (DEF_FROM_PTR (def_p))]) ! continue; ! } ! stmt_tree = gimple_to_tree (stmt); last = get_last_insn (); expand_expr_stmt (stmt_tree); maybe_dump_rtl_for_gimple_stmt (stmt, last); *************** construct_init_block (void) *** 2058,2063 **** --- 2163,2170 ---- first_block = e->dest; redirect_edge_succ (e, init_block); e = make_edge (init_block, first_block, flags); + gcc_assert ((first_block->flags & BB_RTL) + || gimple_seq_empty_p (phi_nodes (first_block))); } else e = make_edge (init_block, EXIT_BLOCK_PTR, EDGE_FALLTHRU); *************** gimple_expand_cfg (void) *** 2286,2291 **** --- 2393,2406 ---- sbitmap blocks; edge_iterator ei; edge e; + unsigned i; + + rewrite_out_of_ssa (&SA); + SA.partition_to_pseudo = (rtx *)xcalloc (SA.map->num_partitions, + sizeof (rtx)); + /* XXX remove_unused_locals also removes var annotation which we rely on during + phi node elimination for now. */ + /*remove_unused_locals ();*/ /* Some backends want to know that we are expanding to RTL. */ currently_expanding_to_rtl = 1; *************** gimple_expand_cfg (void) *** 2339,2344 **** --- 2454,2484 ---- /* Set up parameters and prepare for return, for the function. */ expand_function_start (current_function_decl); + /* Now that we also have the parameter RTXs, copy them over to our + partitions. */ + for (i = 0; i < SA.map->num_partitions; i++) + { + tree var = partition_to_var (SA.map, i); + + if (TREE_CODE (var) == SSA_NAME) + var = SSA_NAME_VAR (var); + if (TREE_CODE (var) != VAR_DECL + && !SA.partition_to_pseudo[i]) + SA.partition_to_pseudo[i] = DECL_RTL_IF_SET (var); + gcc_assert (SA.partition_to_pseudo[i]); + /* Some RTL parts really want to look at DECL_RTL(x) when x + was a decl marked in REG_ATTR or MEM_ATTR. 
We could use + SET_DECL_RTL here making this available, but that would mean + to select one of the potentially many RTLs for one DECL. Instead + of doing that we simply reset the MEM_EXPR of the RTL in question, + then nobody can get at it and hence nobody can call DECL_RTL on it. */ + if (!DECL_RTL_SET_P (var)) + { + if (MEM_P (SA.partition_to_pseudo[i])) + set_mem_expr (SA.partition_to_pseudo[i], NULL); + } + } + /* If this function is `main', emit a call to `__main' to run global initializers, etc. */ if (DECL_NAME (current_function_decl) *************** gimple_expand_cfg (void) *** 2371,2380 **** /* Register rtl specific functions for cfg. */ rtl_register_cfg_hooks (); init_block = construct_init_block (); /* Clear EDGE_EXECUTABLE on the entry edge(s). It is cleaned from the ! remaining edges in expand_gimple_basic_block. */ FOR_EACH_EDGE (e, ei, ENTRY_BLOCK_PTR->succs) e->flags &= ~EDGE_EXECUTABLE; --- 2511,2522 ---- /* Register rtl specific functions for cfg. */ rtl_register_cfg_hooks (); + expand_phi_nodes (&SA); + init_block = construct_init_block (); /* Clear EDGE_EXECUTABLE on the entry edge(s). It is cleaned from the ! remaining edges later. */ FOR_EACH_EDGE (e, ei, ENTRY_BLOCK_PTR->succs) e->flags &= ~EDGE_EXECUTABLE; *************** gimple_expand_cfg (void) *** 2382,2387 **** --- 2524,2532 ---- FOR_BB_BETWEEN (bb, init_block->next_bb, EXIT_BLOCK_PTR, next_bb) bb = expand_gimple_basic_block (bb); + execute_free_datastructures (); + finish_out_of_ssa (&SA); + /* Expansion is used by optimization passes too, set maybe_hot_insn_p conservatively to true until they are all profile aware. 
*/
    pointer_map_destroy (lab_rtx_for_bb);

*************** gimple_expand_cfg (void)
*** 2401,2411 ****
    rebuild_jump_labels (get_insns ());
    find_exception_handler_labels ();

    blocks = sbitmap_alloc (last_basic_block);
    sbitmap_ones (blocks);
    find_many_sub_basic_blocks (blocks);
-   purge_all_dead_edges ();
    sbitmap_free (blocks);

    compact_blocks ();
--- 2546,2597 ----
    rebuild_jump_labels (get_insns ());
    find_exception_handler_labels ();

+   currently_expanding_to_rtl = 1;
+   FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR->next_bb, EXIT_BLOCK_PTR, next_bb)
+     {
+       edge e;
+       edge_iterator ei;
+       /* ??? commit_one_edge_insertion might reorder the edges of the
+          current block (when it needs to split an edge), so that we
+          might miss some edges with instructions on them.  Pff.
+          (see execute/ashldi-1.c).  */
+       VEC(edge,gc) *edges = VEC_copy (edge,gc,bb->preds);
+
+       for (ei = ei_start (edges); (e = ei_safe_edge (ei)); )
+         {
+           ei_next (&ei);
+           if (e->insns.r)
+             commit_one_edge_insertion (e);
+         }
+       VEC_free (edge, gc, edges);
+     }
+   currently_expanding_to_rtl = 0;
+   FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR->next_bb, EXIT_BLOCK_PTR, next_bb)
+     {
+       edge e;
+       edge_iterator ei;
+       for (ei = ei_start (bb->succs); (e = ei_safe_edge (ei)); )
+         {
+           /* Clear EDGE_EXECUTABLE.  This flag is never used in the backend.  */
+           e->flags &= ~EDGE_EXECUTABLE;
+
+           /* At the moment not all abnormal edges match the RTL
+              representation.  It is safe to remove them here as
+              find_many_sub_basic_blocks will rediscover them.
+              In the future we should get this fixed properly.  */
+           if ((e->flags & EDGE_ABNORMAL)
+               && !(e->flags & EDGE_SIBCALL))
+             remove_edge (e);
+           else
+             ei_next (&ei);
+         }
+     }
+
    blocks = sbitmap_alloc (last_basic_block);
    sbitmap_ones (blocks);
    find_many_sub_basic_blocks (blocks);
    sbitmap_free (blocks);
+   purge_all_dead_edges ();

    compact_blocks ();
*************** struct rtl_opt_pass pass_expand =
*** 2471,2480 ****
    0,                                    /* static_pass_number */
    TV_EXPAND,                            /* tv_id */
    /* ??? If TER is enabled, we actually receive GENERIC.  */
!   PROP_gimple_leh | PROP_cfg,           /* properties_required */
    PROP_rtl,                             /* properties_provided */
!   PROP_trees,                           /* properties_destroyed */
!   0,                                    /* todo_flags_start */
!   TODO_dump_func,                       /* todo_flags_finish */
   }
  };
--- 2657,2668 ----
    0,                                    /* static_pass_number */
    TV_EXPAND,                            /* tv_id */
    /* ??? If TER is enabled, we actually receive GENERIC.  */
!   PROP_ssa | PROP_gimple_leh | PROP_cfg, /* properties_required */
    PROP_rtl,                             /* properties_provided */
!   PROP_ssa | PROP_trees,                /* properties_destroyed */
!   TODO_verify_ssa | TODO_verify_flow
!     | TODO_verify_stmts,                /* todo_flags_start */
!   TODO_dump_func
!   | TODO_ggc_collect                    /* todo_flags_finish */
   }
  };
Index: tree-ssa-live.c
===================================================================
*** tree-ssa-live.c.orig
--- tree-ssa-live.c
*************** change_partition_var (var_map map, tree
*** 388,393 ****
--- 388,394 ----
  {
    var_ann_t ann;

+   return;
    gcc_assert (TREE_CODE (var) != SSA_NAME);

    ann = var_ann (var);
Index: tree-ssa.c
===================================================================
*** tree-ssa.c.orig
--- tree-ssa.c
*************** delete_tree_ssa (void)
*** 845,851 ****
            gimple_set_modified (stmt, true);
          }
!       set_phi_nodes (bb, NULL);
      }

    /* Remove annotations from every referenced local variable.  */
--- 845,852 ----
            gimple_set_modified (stmt, true);
          }
!       if (!(bb->flags & BB_RTL))
!         set_phi_nodes (bb, NULL);
      }

    /* Remove annotations from every referenced local variable.  */
Index: tree-optimize.c
===================================================================
*** tree-optimize.c.orig
--- tree-optimize.c
*************** struct gimple_opt_pass pass_cleanup_cfg_
*** 219,225 ****
  /* Pass: do the actions required to finish with tree-ssa optimization
     passes.  */

! static unsigned int
  execute_free_datastructures (void)
  {
    free_dominance_info (CDI_DOMINATORS);
--- 219,225 ----
  /* Pass: do the actions required to finish with tree-ssa optimization
     passes.  */

! unsigned int
  execute_free_datastructures (void)
  {
    free_dominance_info (CDI_DOMINATORS);
*************** execute_free_datastructures (void)
*** 228,233 ****
--- 228,237 ----
    /* Remove the ssa structures.  */
    if (cfun->gimple_df)
      delete_tree_ssa ();
+
+   /* And get rid of annotations we no longer need.  */
+   delete_tree_cfg_annotations ();
+
    return 0;
  }

*************** struct gimple_opt_pass pass_free_datastr
*** 254,262 ****
  static unsigned int
  execute_free_cfg_annotations (void)
  {
-   /* And get rid of annotations we no longer need.  */
-   delete_tree_cfg_annotations ();
-
    return 0;
  }

--- 258,263 ----
Index: tree-outof-ssa.c
===================================================================
*** tree-outof-ssa.c.orig
--- tree-outof-ssa.c
*************** along with GCC; see the file COPYING3.
*** 30,38 ****
  #include "tree-flow.h"
  #include "timevar.h"
  #include "tree-dump.h"
- #include "tree-ssa-live.h"
  #include "tree-pass.h"
  #include "toplev.h"


  /* Used to hold all the components required to do SSA PHI elimination.
--- 30,40 ----
  #include "tree-flow.h"
  #include "timevar.h"
  #include "tree-dump.h"
  #include "tree-pass.h"
  #include "toplev.h"
+ #include "rtl.h"
+ #include "expr.h"
+ #include "ssaexpand.h"


  /* Used to hold all the components required to do SSA PHI elimination.
*************** typedef struct _elim_graph {
*** 61,67 ****
    int size;

    /* List of nodes in the elimination graph.  */
!   VEC(tree,heap) *nodes;

    /* The predecessor and successor edge list.  */
    VEC(int,heap) *edge_list;
--- 63,69 ----
    int size;

    /* List of nodes in the elimination graph.  */
!   VEC(int,heap) *nodes;

    /* The predecessor and successor edge list.  */
    VEC(int,heap) *edge_list;
*************** typedef struct _elim_graph {
*** 79,84 ****
--- 81,87 ----
    edge e;

    /* List of constant copies to emit.  These are pushed on in pairs.  */
+   VEC(int,heap) *const_dests;
    VEC(tree,heap) *const_copies;
  } *elim_graph;

*************** create_temp (tree t)
*** 131,163 ****
  }


! /* This helper function fill insert a copy from a constant or variable SRC to
!    variable DEST on edge E.  */

  static void
! insert_copy_on_edge (edge e, tree dest, tree src)
  {
!   gimple copy;
!   copy = gimple_build_assign (dest, src);
!   set_is_used (dest);
!   if (TREE_CODE (src) == ADDR_EXPR)
!     src = TREE_OPERAND (src, 0);
!   if (TREE_CODE (src) == VAR_DECL || TREE_CODE (src) == PARM_DECL)
!     set_is_used (src);

    if (dump_file && (dump_flags & TDF_DETAILS))
      {
        fprintf (dump_file,
!                "Inserting a copy on edge BB%d->BB%d :",
                 e->src->index,
!                e->dest->index);
!       print_gimple_stmt (dump_file, copy, 0, dump_flags);
        fprintf (dump_file, "\n");
      }

!   gsi_insert_on_edge (e, copy);
  }


--- 134,253 ----
  }


! /* Insert a copy instruction from partition SRC to DEST onto edge E.  */

  static void
! insert_partition_copy_on_edge (edge e, int dest, int src)
  {
!   rtx seq;
!   if (dump_file && (dump_flags & TDF_DETAILS))
!     {
!       fprintf (dump_file,
!                "Inserting a partition copy on edge BB%d->BB%d :"
!                "PART.%d = PART.%d",
!                e->src->index,
!                e->dest->index, dest, src);
!       fprintf (dump_file, "\n");
!     }
!
!   gcc_assert (SA.partition_to_pseudo[dest]);
!   gcc_assert (SA.partition_to_pseudo[src]);
!
!   /* Partition copy between same base variables only, so it's the same mode,
!      hence we can use emit_move_insn.  */
!   start_sequence ();
!   emit_move_insn (SA.partition_to_pseudo[dest], SA.partition_to_pseudo[src]);
!   seq = get_insns ();
!   end_sequence ();

!   insert_insn_on_edge (seq, e);
! }

! /* Insert a copy instruction from expression SRC to partition DEST
!    onto edge E.  */

+ static void
+ insert_value_copy_on_edge (edge e, int dest, tree src)
+ {
+   rtx seq, x;
+   enum machine_mode mode;
    if (dump_file && (dump_flags & TDF_DETAILS))
      {
        fprintf (dump_file,
!                "Inserting a value copy on edge BB%d->BB%d : PART.%d = ",
                 e->src->index,
!                e->dest->index, dest);
!       print_generic_expr (dump_file, src, TDF_SLIM);
        fprintf (dump_file, "\n");
      }

!   gcc_assert (SA.partition_to_pseudo[dest]);
!
!   start_sequence ();
!   mode = GET_MODE (SA.partition_to_pseudo[dest]);
!   x = expand_expr (src, SA.partition_to_pseudo[dest], mode, EXPAND_NORMAL);
!   if (GET_MODE (x) != mode)
!     x = convert_to_mode (mode, x, TYPE_UNSIGNED (TREE_TYPE (src)));
!   if (x != SA.partition_to_pseudo[dest])
!     emit_move_insn (SA.partition_to_pseudo[dest], x);
!   seq = get_insns ();
!   end_sequence ();
!
!   insert_insn_on_edge (seq, e);
! }
!
! /* Insert a copy instruction from RTL expression SRC to partition DEST
!    onto edge E.  */
!
! static void
! insert_rtx_to_part_on_edge (edge e, int dest, rtx src)
! {
!   rtx seq;
!   if (dump_file && (dump_flags & TDF_DETAILS))
!     {
!       fprintf (dump_file,
!                "Inserting a temp copy on edge BB%d->BB%d : PART.%d = ",
!                e->src->index,
!                e->dest->index, dest);
!       print_simple_rtl (dump_file, src);
!       fprintf (dump_file, "\n");
!     }
!
!   gcc_assert (SA.partition_to_pseudo[dest]);
!   start_sequence ();
!   gcc_assert (GET_MODE (src) == GET_MODE (SA.partition_to_pseudo[dest]));
!   emit_move_insn (SA.partition_to_pseudo[dest], src);
!   seq = get_insns ();
!   end_sequence ();
!
!   insert_insn_on_edge (seq, e);
! }
!
! /* Insert a copy instruction from partition SRC to RTL lvalue DEST
!    onto edge E.  */
!
! static void
! insert_part_to_rtx_on_edge (edge e, rtx dest, int src)
! {
!   rtx seq;
!   if (dump_file && (dump_flags & TDF_DETAILS))
!     {
!       fprintf (dump_file,
!                "Inserting a temp copy on edge BB%d->BB%d : ",
!                e->src->index,
!                e->dest->index);
!       print_simple_rtl (dump_file, dest);
!       fprintf (dump_file, "= PART.%d\n", src);
!     }
!
!   gcc_assert (SA.partition_to_pseudo[src]);
!   start_sequence ();
!   gcc_assert (GET_MODE (dest) == GET_MODE (SA.partition_to_pseudo[src]));
!   emit_move_insn (dest, SA.partition_to_pseudo[src]);
!   seq = get_insns ();
!   end_sequence ();
!
!   insert_insn_on_edge (seq, e);
  }


*************** new_elim_graph (int size)
*** 169,175 ****
  {
    elim_graph g = (elim_graph) xmalloc (sizeof (struct _elim_graph));

!   g->nodes = VEC_alloc (tree, heap, 30);
    g->const_copies = VEC_alloc (tree, heap, 20);
    g->edge_list = VEC_alloc (int, heap, 20);
    g->stack = VEC_alloc (int, heap, 30);
--- 259,266 ----
  {
    elim_graph g = (elim_graph) xmalloc (sizeof (struct _elim_graph));

!   g->nodes = VEC_alloc (int, heap, 30);
!   g->const_dests = VEC_alloc (int, heap, 20);
    g->const_copies = VEC_alloc (tree, heap, 20);
    g->edge_list = VEC_alloc (int, heap, 20);
    g->stack = VEC_alloc (int, heap, 30);
*************** new_elim_graph (int size)
*** 185,191 ****
  static inline void
  clear_elim_graph (elim_graph g)
  {
!   VEC_truncate (tree, g->nodes, 0);
    VEC_truncate (int, g->edge_list, 0);
  }

--- 276,282 ----
  static inline void
  clear_elim_graph (elim_graph g)
  {
!   VEC_truncate (int, g->nodes, 0);
    VEC_truncate (int, g->edge_list, 0);
  }

*************** delete_elim_graph (elim_graph g)
*** 199,205 ****
    VEC_free (int, heap, g->stack);
    VEC_free (int, heap, g->edge_list);
    VEC_free (tree, heap, g->const_copies);
!   VEC_free (tree, heap, g->nodes);
    free (g);
  }

--- 290,297 ----
    VEC_free (int, heap, g->stack);
    VEC_free (int, heap, g->edge_list);
    VEC_free (tree, heap, g->const_copies);
!   VEC_free (int, heap, g->const_dests);
!   VEC_free (int, heap, g->nodes);
    free (g);
  }

*************** delete_elim_graph (elim_graph g)
*** 209,230 ****
  static inline int
  elim_graph_size (elim_graph g)
  {
!   return VEC_length (tree, g->nodes);
  }


  /* Add NODE to graph G, if it doesn't exist already.  */

  static inline void
! elim_graph_add_node (elim_graph g, tree node)
  {
    int x;
!   tree t;

!   for (x = 0; VEC_iterate (tree, g->nodes, x, t); x++)
      if (t == node)
        return;
!   VEC_safe_push (tree, heap, g->nodes, node);
  }


--- 301,322 ----
  static inline int
  elim_graph_size (elim_graph g)
  {
!   return VEC_length (int, g->nodes);
  }


  /* Add NODE to graph G, if it doesn't exist already.  */

  static inline void
! elim_graph_add_node (elim_graph g, int node)
  {
    int x;
!   int t;

!   for (x = 0; VEC_iterate (int, g->nodes, x, t); x++)
      if (t == node)
        return;
!   VEC_safe_push (int, heap, g->nodes, node);
  }


*************** do { \
*** 299,305 ****
  /* Add T to elimination graph G.  */

  static inline void
! eliminate_name (elim_graph g, tree T)
  {
    elim_graph_add_node (g, T);
  }
--- 391,397 ----
  /* Add T to elimination graph G.  */

  static inline void
! eliminate_name (elim_graph g, int T)
  {
    elim_graph_add_node (g, T);
  }
*************** eliminate_name (elim_graph g, tree T)
*** 309,315 ****
     G->e.  */

  static void
! eliminate_build (elim_graph g, basic_block B)
  {
    tree T0, Ti;
    int p0, pi;
--- 401,407 ----
     G->e.  */

  static void
! eliminate_build (elim_graph g)
  {
    tree T0, Ti;
    int p0, pi;
*************** eliminate_build (elim_graph g, basic_blo
*** 317,332 ****

    clear_elim_graph (g);

!   for (gsi = gsi_start_phis (B); !gsi_end_p (gsi); gsi_next (&gsi))
      {
        gimple phi = gsi_stmt (gsi);

!       T0 = var_to_partition_to_var (g->map, gimple_phi_result (phi));
!
        /* Ignore results which are not in partitions.  */
!       if (T0 == NULL_TREE)
          continue;

        Ti = PHI_ARG_DEF (phi, g->e->dest_idx);

        /* If this argument is a constant, or a SSA_NAME which is being
--- 409,424 ----

    clear_elim_graph (g);

!   for (gsi = gsi_start_phis (g->e->dest); !gsi_end_p (gsi); gsi_next (&gsi))
      {
        gimple phi = gsi_stmt (gsi);

!       p0 = var_to_partition (g->map, gimple_phi_result (phi));
        /* Ignore results which are not in partitions.  */
!       if (p0 == NO_PARTITION)
          continue;

+       T0 = partition_to_var (g->map, p0);
        Ti = PHI_ARG_DEF (phi, g->e->dest_idx);

        /* If this argument is a constant, or a SSA_NAME which is being
*************** eliminate_build (elim_graph g, basic_blo
*** 338,355 ****
          {
            /* Save constant copies until all other copies have been emitted
               on this edge.  */
!           VEC_safe_push (tree, heap, g->const_copies, T0);
            VEC_safe_push (tree, heap, g->const_copies, Ti);
          }
        else
          {
!           Ti = var_to_partition_to_var (g->map, Ti);
!           if (T0 != Ti)
              {
!               eliminate_name (g, T0);
!               eliminate_name (g, Ti);
!               p0 = var_to_partition (g->map, T0);
!               pi = var_to_partition (g->map, Ti);
                elim_graph_add_edge (g, p0, pi);
              }
          }
--- 430,448 ----
          {
            /* Save constant copies until all other copies have been emitted
               on this edge.  */
!           VEC_safe_push (int, heap, g->const_dests, p0);
            VEC_safe_push (tree, heap, g->const_copies, Ti);
          }
        else
          {
!           pi = var_to_partition (g->map, Ti);
!           if (p0 != pi)
              {
!               /*Ti = var_to_partition_to_var (g->map, Ti);*/
!               eliminate_name (g, p0);
!               eliminate_name (g, pi);
!               /*p0 = var_to_partition (g->map, T0);
!               pi = var_to_partition (g->map, Ti);*/
                elim_graph_add_edge (g, p0, pi);
              }
          }
*************** elim_backward (elim_graph g, int T)
*** 399,430 ****
        if (!TEST_BIT (g->visited, P))
          {
            elim_backward (g, P);
!           insert_copy_on_edge (g->e,
!                                partition_to_var (g->map, P),
!                                partition_to_var (g->map, T));
          }
      });
  }

  /* Insert required copies for T in graph G.  Check for a strongly connected
     region, and create a temporary to break the cycle if one is found.  */

  static void
  elim_create (elim_graph g, int T)
  {
-   tree U;
    int P, S;

    if (elim_unvisited_predecessor (g, T))
      {
!       U = create_temp (partition_to_var (g->map, T));
!       insert_copy_on_edge (g->e, U, partition_to_var (g->map, T));
        FOR_EACH_ELIM_GRAPH_PRED (g, T, P,
          {
            if (!TEST_BIT (g->visited, P))
              {
                elim_backward (g, P);
!               insert_copy_on_edge (g->e, partition_to_var (g->map, P), U);
              }
          });
      }
--- 492,534 ----
        if (!TEST_BIT (g->visited, P))
          {
            elim_backward (g, P);
!           insert_partition_copy_on_edge (g->e, P, T);
          }
      });
  }

+ static rtx
+ get_temp_reg (tree name)
+ {
+   tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
+   tree type = TREE_TYPE (var);
+   int unsignedp = TYPE_UNSIGNED (type);
+   enum machine_mode reg_mode
+     = promote_mode (type, DECL_MODE (var), &unsignedp, 0);
+   rtx x = gen_reg_rtx (reg_mode);
+   if (POINTER_TYPE_P (type))
+     mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var))));
+   return x;
+ }
+
  /* Insert required copies for T in graph G.  Check for a strongly connected
     region, and create a temporary to break the cycle if one is found.  */

  static void
  elim_create (elim_graph g, int T)
  {
    int P, S;

    if (elim_unvisited_predecessor (g, T))
      {
!       rtx U = get_temp_reg (partition_to_var (g->map, T));
!       insert_part_to_rtx_on_edge (g->e, U, T);
        FOR_EACH_ELIM_GRAPH_PRED (g, T, P,
          {
            if (!TEST_BIT (g->visited, P))
              {
                elim_backward (g, P);
!               insert_rtx_to_part_on_edge (g->e, P, U);
              }
          });
      }
*************** elim_create (elim_graph g, int T)
*** 434,445 ****
        if (S != -1)
          {
            SET_BIT (g->visited, T);
!           insert_copy_on_edge (g->e,
!                                partition_to_var (g->map, T),
!                                partition_to_var (g->map, S));
          }
      }
-
  }


--- 538,546 ----
        if (S != -1)
          {
            SET_BIT (g->visited, T);
!           insert_partition_copy_on_edge (g->e, T, S);
          }
      }
  }


*************** static void
*** 449,455 ****
  eliminate_phi (edge e, elim_graph g)
  {
    int x;
-   basic_block B = e->dest;

    gcc_assert (VEC_length (tree, g->const_copies) == 0);

--- 550,555 ----
*************** eliminate_phi (edge e, elim_graph g)
*** 459,478 ****

    g->e = e;

!   eliminate_build (g, B);

    if (elim_graph_size (g) != 0)
      {
!       tree var;

        sbitmap_zero (g->visited);
        VEC_truncate (int, g->stack, 0);

!       for (x = 0; VEC_iterate (tree, g->nodes, x, var); x++)
          {
!           int p = var_to_partition (g->map, var);
!           if (!TEST_BIT (g->visited, p))
!             elim_forward (g, p);
          }

        sbitmap_zero (g->visited);
--- 559,577 ----

    g->e = e;

!   eliminate_build (g);

    if (elim_graph_size (g) != 0)
      {
!       int part;

        sbitmap_zero (g->visited);
        VEC_truncate (int, g->stack, 0);

!       for (x = 0; VEC_iterate (int, g->nodes, x, part); x++)
          {
!           if (!TEST_BIT (g->visited, part))
!             elim_forward (g, part);
          }

        sbitmap_zero (g->visited);
*************** eliminate_phi (edge e, elim_graph g)
*** 487,496 ****
    /* If there are any pending constant copies, issue them now.  */
    while (VEC_length (tree, g->const_copies) > 0)
      {
!       tree src, dest;
        src = VEC_pop (tree, g->const_copies);
!       dest = VEC_pop (tree, g->const_copies);
!       insert_copy_on_edge (e, dest, src);
      }
  }

--- 586,596 ----
    /* If there are any pending constant copies, issue them now.  */
    while (VEC_length (tree, g->const_copies) > 0)
      {
!       int dest;
!       tree src;
        src = VEC_pop (tree, g->const_copies);
!       dest = VEC_pop (int, g->const_dests);
!       insert_value_copy_on_edge (e, dest, src);
      }
  }

*************** rewrite_trees (var_map map, gimple *valu
*** 750,755 ****
--- 850,856 ----
    g->map = map;
    FOR_EACH_BB (bb)
      {
+       if (0)
        for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); )
          {
            gimple stmt = gsi_stmt (gsi);
*************** rewrite_trees (var_map map, gimple *valu
*** 816,822 ****
          }

        phi = phi_nodes (bb);
!       if (phi)
          {
            edge_iterator ei;
            FOR_EACH_EDGE (e, ei, bb->preds)
--- 917,923 ----
          }

        phi = phi_nodes (bb);
!       if (0 && phi)
          {
            edge_iterator ei;
            FOR_EACH_EDGE (e, ei, bb->preds)
*************** rewrite_trees (var_map map, gimple *valu
*** 827,832 ****
--- 928,949 ----
    delete_elim_graph (g);
  }

+ void
+ expand_phi_nodes (struct ssaexpand *sa)
+ {
+   basic_block bb;
+   FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR->next_bb, EXIT_BLOCK_PTR, next_bb)
+     if (!gimple_seq_empty_p (phi_nodes (bb)))
+       {
+         edge e;
+         edge_iterator ei;
+         FOR_EACH_EDGE (e, ei, bb->preds)
+           eliminate_phi (e, (elim_graph)sa->elim_graph);
+         set_phi_nodes (bb, NULL);
+       }
+ }
+
+
+
  /* These are the local work structures used to determine the best place to
     insert the copies that were placed on edges by the SSA->normal pass..  */
  static VEC(edge,heap) *edge_leader;
*************** perform_edge_inserts (void)
*** 1339,1350 ****
     should also be used.  */

  static void
! remove_ssa_form (bool perform_ter)
  {
    basic_block bb;
    gimple *values = NULL;
    var_map map;
    gimple_stmt_iterator gsi;

    map = coalesce_ssa_name ();

--- 1456,1468 ----
     should also be used.  */

  static void
! remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
  {
    basic_block bb;
    gimple *values = NULL;
    var_map map;
    gimple_stmt_iterator gsi;
+   elim_graph g;

    map = coalesce_ssa_name ();

*************** remove_ssa_form (bool perform_ter)
*** 1366,1371 ****
--- 1484,1490 ----
      }

    /* Assign real variables to the partitions now.  */
+   if (0)
    assign_vars (map);

    if (dump_file && (dump_flags & TDF_DETAILS))
*************** remove_ssa_form (bool perform_ter)
*** 1376,1393 ****

    rewrite_trees (map, values);

!   if (values)
      free (values);

    /* Remove PHI nodes which have been translated back to real variables.  */
    FOR_EACH_BB (bb)
      for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);)
        remove_phi_node (&gsi, true);

    /* If any copies were inserted on edges, analyze and insert them now.  */
    perform_edge_inserts ();

    delete_var_map (map);
  }


--- 1495,1522 ----

    rewrite_trees (map, values);

!   if (0 && values)
      free (values);

    /* Remove PHI nodes which have been translated back to real variables.  */
+   if (0)
    FOR_EACH_BB (bb)
      for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);)
        remove_phi_node (&gsi, true);

    /* If any copies were inserted on edges, analyze and insert them now.  */
+   if (0)
    perform_edge_inserts ();

+   if (0)
    delete_var_map (map);
+
+   sa->map = map;
+   sa->values = values;
+
+   g = new_elim_graph (map->num_partitions);
+   sa->elim_graph = g;
+   g->map = map;
  }


*************** insert_backedge_copies (void)
*** 1477,1488 ****
      }
  }

  /* Take the current function out of SSA form, translating PHIs as described in
     R. Morgan, ``Building an Optimizing Compiler'',
     Butterworth-Heinemann, Boston, MA, 1998. pp 176-186.  */

! static unsigned int
! rewrite_out_of_ssa (void)
  {
    /* If elimination of a PHI requires inserting a copy on a backedge,
       then we will have to split the backedge which has numerous
--- 1606,1626 ----
      }
  }

+ void
+ finish_out_of_ssa (struct ssaexpand *sa)
+ {
+   delete_elim_graph ((elim_graph)sa->elim_graph);
+   if (sa->values)
+     free (sa->values);
+   memset (sa, 0, sizeof *sa);
+ }
+
  /* Take the current function out of SSA form, translating PHIs as described in
     R. Morgan, ``Building an Optimizing Compiler'',
     Butterworth-Heinemann, Boston, MA, 1998. pp 176-186.  */

! unsigned int
! rewrite_out_of_ssa (struct ssaexpand *sa)
  {
    /* If elimination of a PHI requires inserting a copy on a backedge,
       then we will have to split the backedge which has numerous
*************** rewrite_out_of_ssa (void)
*** 1499,1509 ****
    if (dump_file && (dump_flags & TDF_DETAILS))
      gimple_dump_cfg (dump_file, dump_flags & ~TDF_DETAILS);

!   remove_ssa_form (flag_tree_ter && !flag_mudflap);

    if (dump_file && (dump_flags & TDF_DETAILS))
      gimple_dump_cfg (dump_file, dump_flags & ~TDF_DETAILS);

    cfun->gimple_df->in_ssa_p = false;
    return 0;
  }
--- 1637,1648 ----
    if (dump_file && (dump_flags & TDF_DETAILS))
      gimple_dump_cfg (dump_file, dump_flags & ~TDF_DETAILS);

!   remove_ssa_form (flag_tree_ter && !flag_mudflap, sa);

    if (dump_file && (dump_flags & TDF_DETAILS))
      gimple_dump_cfg (dump_file, dump_flags & ~TDF_DETAILS);

+   if (0)
    cfun->gimple_df->in_ssa_p = false;
    return 0;
  }
*************** rewrite_out_of_ssa (void)
*** 1511,1516 ****
--- 1650,1656 ----

  /* Define the parameters of the out of SSA pass.  */

+ #if 0
  struct gimple_opt_pass pass_del_ssa =
  {
   {
*************** struct gimple_opt_pass pass_del_ssa =
*** 1533,1535 ****
--- 1673,1676 ----
      | TODO_remove_unused_locals         /* todo_flags_finish */
   }
  };
+ #endif
Index: ssaexpand.h
===================================================================
*** /dev/null
--- ssaexpand.h
***************
*** 0 ****
--- 1,59 ----
+ /* Routines for expanding from SSA form to RTL.
+    Copyright (C) 2009 Free Software Foundation, Inc.
+
+ This file is part of GCC.
+
+ GCC is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3, or (at your option)
+ any later version.
+
+ GCC is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with GCC; see the file COPYING3.  If not see
+ <http://www.gnu.org/licenses/>.  */
+
+
+ #ifndef _SSAEXPAND_H
+ #define _SSAEXPAND_H 1
+
+ #include "tree-ssa-live.h"
+
+ struct ssaexpand
+ {
+   var_map map;
+   gimple *values;
+   rtx *partition_to_pseudo;
+   void *elim_graph;
+ };
+
+ extern struct ssaexpand SA;
+
+ static inline rtx
+ get_rtx_for_ssa_name (tree exp)
+ {
+   int p = partition_find (SA.map->var_partition, SSA_NAME_VERSION (exp));
+   if (SA.map->partition_to_view)
+     p = SA.map->partition_to_view[p];
+   return SA.partition_to_pseudo[p];
+ }
+
+ static inline gimple
+ get_gimple_for_ssa_name (tree exp)
+ {
+   int v = SSA_NAME_VERSION (exp);
+   if (SA.values)
+     return SA.values[v];
+   return NULL;
+ }
+
+ /* In tree-outof-ssa.c.  */
+ void finish_out_of_ssa (struct ssaexpand *sa);
+ unsigned int rewrite_out_of_ssa (struct ssaexpand *sa);
+ void expand_phi_nodes (struct ssaexpand *sa);
+
+ #endif
Index: passes.c
===================================================================
*** passes.c.orig
--- passes.c
*************** init_optimization_passes (void)
*** 707,723 ****
            NEXT_PASS (pass_local_pure_const);
          }
        NEXT_PASS (pass_cleanup_eh);
-       NEXT_PASS (pass_del_ssa);
        NEXT_PASS (pass_nrv);
        NEXT_PASS (pass_mark_used_blocks);
        NEXT_PASS (pass_cleanup_cfg_post_optimizing);
-
    NEXT_PASS (pass_warn_function_noreturn);
!   NEXT_PASS (pass_free_datastructures);
!   NEXT_PASS (pass_mudflap_2);
!   NEXT_PASS (pass_free_cfg_annotations);
    NEXT_PASS (pass_expand);
    NEXT_PASS (pass_rest_of_compilation);
      {
        struct opt_pass **p = &pass_rest_of_compilation.pass.sub;
--- 707,723 ----
            NEXT_PASS (pass_local_pure_const);
          }
        NEXT_PASS (pass_cleanup_eh);
        NEXT_PASS (pass_nrv);
        NEXT_PASS (pass_mark_used_blocks);
        NEXT_PASS (pass_cleanup_cfg_post_optimizing);
    NEXT_PASS (pass_warn_function_noreturn);
!   /* NEXT_PASS (pass_mudflap_2); */
!   /* NEXT_PASS (pass_del_ssa);
!   NEXT_PASS (pass_free_datastructures);
!   NEXT_PASS (pass_free_cfg_annotations);*/
    NEXT_PASS (pass_expand);
+
    NEXT_PASS (pass_rest_of_compilation);
      {
        struct opt_pass **p = &pass_rest_of_compilation.pass.sub;
Index: cfgrtl.c
===================================================================
*** cfgrtl.c.orig
--- cfgrtl.c
*************** along with GCC; see the file COPYING3.
*** 64,70 ****

  static int can_delete_note_p (const_rtx);
  static int can_delete_label_p (const_rtx);
- static void commit_one_edge_insertion (edge);
  static basic_block rtl_split_edge (edge);
  static bool rtl_move_block_after (basic_block, basic_block);
  static int rtl_verify_flow_info (void);
--- 64,69 ----
*************** try_redirect_by_replacing_jump (edge e,
*** 872,902 ****
    return e;
  }

! /* Redirect edge representing branch of (un)conditional jump or tablejump,
!    NULL on failure  */
! static edge
! redirect_branch_edge (edge e, basic_block target)
  {
    rtx tmp;
-   rtx old_label = BB_HEAD (e->dest);
-   basic_block src = e->src;
-   rtx insn = BB_END (src);
-
-   /* We can only redirect non-fallthru edges of jump insn.  */
-   if (e->flags & EDGE_FALLTHRU)
-     return NULL;
-   else if (!JUMP_P (insn))
-     return NULL;
-
    /* Recognize a tablejump and adjust all matching cases.  */
    if (tablejump_p (insn, NULL, &tmp))
      {
        rtvec vec;
        int j;
!       rtx new_label = block_label (target);

!       if (target == EXIT_BLOCK_PTR)
!         return NULL;
        if (GET_CODE (PATTERN (tmp)) == ADDR_VEC)
          vec = XVEC (PATTERN (tmp), 0);
        else
--- 871,895 ----
    return e;
  }

! /* Subroutine of redirect_branch_edge that tries to patch the jump
!    instruction INSN so that it reaches block NEW.  Do this
!    only when it originally reached block OLD.  Return true if this
!    worked or the original target wasn't OLD, return false if redirection
!    doesn't work.  */
!
! static bool
! patch_jump_insn (rtx insn, rtx old_label, basic_block new_bb)
  {
    rtx tmp;
    /* Recognize a tablejump and adjust all matching cases.  */
    if (tablejump_p (insn, NULL, &tmp))
      {
        rtvec vec;
        int j;
!       rtx new_label = block_label (new_bb);

!       if (new_bb == EXIT_BLOCK_PTR)
!         return false;
        if (GET_CODE (PATTERN (tmp)) == ADDR_VEC)
          vec = XVEC (PATTERN (tmp), 0);
        else
*************** redirect_branch_edge (edge e, basic_bloc
*** 931,950 ****
        if (computed_jump_p (insn)
            /* A return instruction can't be redirected.  */
            || returnjump_p (insn))
!         return NULL;
!
!       /* If the insn doesn't go where we think, we're confused.  */
!       gcc_assert (JUMP_LABEL (insn) == old_label);

!       /* If the substitution doesn't succeed, die.  This can happen
!          if the back end emitted unrecognizable instructions or if
!          target is exit block on some arches.  */
!       if (!redirect_jump (insn, block_label (target), 0))
          {
!           gcc_assert (target == EXIT_BLOCK_PTR);
!           return NULL;
          }
      }

    if (dump_file)
      fprintf (dump_file, "Edge %i->%i redirected to %i\n",
--- 924,978 ----
        if (computed_jump_p (insn)
            /* A return instruction can't be redirected.  */
            || returnjump_p (insn))
!         return false;

!       if (!currently_expanding_to_rtl || JUMP_LABEL (insn) == old_label)
          {
!           /* If the insn doesn't go where we think, we're confused.  */
!           gcc_assert (JUMP_LABEL (insn) == old_label);
!
!           /* If the substitution doesn't succeed, die.  This can happen
!              if the back end emitted unrecognizable instructions or if
!              target is exit block on some arches.  */
!           if (!redirect_jump (insn, block_label (new_bb), 0))
!             {
!               gcc_assert (new_bb == EXIT_BLOCK_PTR);
!               return false;
!             }
          }
      }
+   return true;
+ }
+
+
+ /* Redirect edge representing branch of (un)conditional jump or tablejump,
+    NULL on failure  */
+ static edge
+ redirect_branch_edge (edge e, basic_block target)
+ {
+   rtx old_label = BB_HEAD (e->dest);
+   basic_block src = e->src;
+   rtx insn = BB_END (src);
+
+   /* We can only redirect non-fallthru edges of jump insn.  */
+   if (e->flags & EDGE_FALLTHRU)
+     return NULL;
+   else if (!JUMP_P (insn) && !currently_expanding_to_rtl)
+     return NULL;
+
+   if (!currently_expanding_to_rtl)
+     {
+       if (!patch_jump_insn (insn, old_label, target))
+         return NULL;
+     }
+   else
+     /* When expanding this BB might actually contain multiple
+        jumps (i.e. not yet split by find_many_sub_basic_blocks).
+        Redirect all of those that match our label.  */
+     for (insn = BB_HEAD (src); insn != NEXT_INSN (BB_END (src));
+          insn = NEXT_INSN (insn))
+       if (JUMP_P (insn) && !patch_jump_insn (insn, old_label, target))
+         return NULL;

    if (dump_file)
      fprintf (dump_file, "Edge %i->%i redirected to %i\n",
*************** insert_insn_on_edge (rtx pattern, edge e
*** 1329,1335 ****

  /* Update the CFG for the instructions queued on edge E.  */

! static void
  commit_one_edge_insertion (edge e)
  {
    rtx before = NULL_RTX, after = NULL_RTX, insns, tmp, last;
--- 1357,1363 ----

  /* Update the CFG for the instructions queued on edge E.  */

! void
  commit_one_edge_insertion (edge e)
  {
    rtx before = NULL_RTX, after = NULL_RTX, insns, tmp, last;
Index: tree-ssa-coalesce.c
===================================================================
*** tree-ssa-coalesce.c.orig
--- tree-ssa-coalesce.c
*************** compare_pairs (const void *p1, const voi
*** 314,320 ****
    const_coalesce_pair_p const *const pp2 = (const_coalesce_pair_p const *) p2;
    int result;

!   result = (* pp2)->cost - (* pp1)->cost;
    /* Since qsort does not guarantee stability we use the elements
       as a secondary key.  This provides us with independence from
       the host's implementation of the sorting algorithm.  */
--- 314,320 ----
    const_coalesce_pair_p const *const pp2 = (const_coalesce_pair_p const *) p2;
    int result;

!   result = (* pp1)->cost - (* pp2)->cost;
    /* Since qsort does not guarantee stability we use the elements
       as a secondary key.  This provides us with independence from
       the host's implementation of the sorting algorithm.  */
Index: Makefile.in
===================================================================
*** Makefile.in.orig
--- Makefile.in
*************** TREE_FLOW_H = tree-flow.h tree-flow-inli
*** 857,862 ****
--- 857,863 ----
                  $(HASHTAB_H) $(CGRAPH_H) $(IPA_REFERENCE_H) \
                  tree-ssa-alias.h
  TREE_SSA_LIVE_H = tree-ssa-live.h $(PARTITION_H) vecprim.h
+ SSAEXPAND_H = ssaexpand.h $(TREE_SSA_LIVE_H)
  PRETTY_PRINT_H = pretty-print.h $(INPUT_H) $(OBSTACK_H)
  DIAGNOSTIC_H = diagnostic.h diagnostic.def $(PRETTY_PRINT_H) options.h
  C_PRETTY_PRINT_H = c-pretty-print.h $(PRETTY_PRINT_H) $(C_COMMON_H) $(TREE_H)
*************** tree-ssa-coalesce.o : tree-ssa-coalesce.
*** 2095,2101 ****
     $(TREE_SSA_LIVE_H) $(BITMAP_H) $(FLAGS_H) $(HASHTAB_H) $(TOPLEV_H)
  tree-outof-ssa.o : tree-outof-ssa.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \
     $(TREE_H) $(DIAGNOSTIC_H) $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
!    tree-pass.h $(TREE_SSA_LIVE_H) $(BASIC_BLOCK_H) $(BITMAP_H) $(GGC_H) $(TOPLEV_H)
  tree-ssa-dse.o : tree-ssa-dse.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
     $(TM_H) $(GGC_H) $(TREE_H) $(RTL_H) $(TM_P_H) $(BASIC_BLOCK_H) \
     $(TREE_FLOW_H) tree-pass.h $(TREE_DUMP_H) domwalk.h $(FLAGS_H) \
--- 2096,2102 ----
     $(TREE_SSA_LIVE_H) $(BITMAP_H) $(FLAGS_H) $(HASHTAB_H) $(TOPLEV_H)
  tree-outof-ssa.o : tree-outof-ssa.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \
     $(TREE_H) $(DIAGNOSTIC_H) $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
!    tree-pass.h $(SSAEXPAND_H) $(BASIC_BLOCK_H) $(BITMAP_H) $(GGC_H) $(TOPLEV_H)
  tree-ssa-dse.o : tree-ssa-dse.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
     $(TM_H) $(GGC_H) $(TREE_H) $(RTL_H) $(TM_P_H) $(BASIC_BLOCK_H) \
     $(TREE_FLOW_H) tree-pass.h $(TREE_DUMP_H) domwalk.h $(FLAGS_H) \
*************** expr.o : expr.c $(CONFIG_H) $(SYSTEM_H)
*** 2509,2515 ****
     typeclass.h hard-reg-set.h $(TOPLEV_H) hard-reg-set.h $(EXCEPT_H) reload.h \
     $(GGC_H) langhooks.h intl.h $(TM_P_H) $(REAL_H) $(TARGET_H) \
     tree-iterator.h gt-expr.h $(MACHMODE_H) $(TIMEVAR_H) $(TREE_FLOW_H) \
!    tree-pass.h $(DF_H) $(DIAGNOSTIC_H) vecprim.h
  dojump.o : dojump.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(TREE_H) \
     $(FLAGS_H) $(FUNCTION_H) $(EXPR_H) $(OPTABS_H) $(INSN_ATTR_H) insn-config.h \
     langhooks.h $(GGC_H) gt-dojump.h vecprim.h $(BASIC_BLOCK_H)
--- 2510,2516 ----
     typeclass.h hard-reg-set.h $(TOPLEV_H) hard-reg-set.h $(EXCEPT_H) reload.h \
     $(GGC_H) langhooks.h intl.h $(TM_P_H) $(REAL_H) $(TARGET_H) \
     tree-iterator.h gt-expr.h $(MACHMODE_H) $(TIMEVAR_H) $(TREE_FLOW_H) \
!    tree-pass.h $(DF_H) $(DIAGNOSTIC_H) vecprim.h $(SSAEXPAND_H)
  dojump.o : dojump.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(TREE_H) \
     $(FLAGS_H) $(FUNCTION_H) $(EXPR_H) $(OPTABS_H) $(INSN_ATTR_H) insn-config.h \
     langhooks.h $(GGC_H) gt-dojump.h vecprim.h $(BASIC_BLOCK_H)
*************** cfgexpand.o : cfgexpand.c $(TREE_FLOW_H)
*** 2780,2786 ****
     $(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) $(FUNCTION_H) $(TIMEVAR_H) $(TM_H) \
     coretypes.h $(TREE_DUMP_H) $(EXCEPT_H) langhooks.h tree-pass.h $(RTL_H) \
     $(DIAGNOSTIC_H) $(TOPLEV_H) $(BASIC_BLOCK_H) $(FLAGS_H) debug.h $(PARAMS_H) \
!    value-prof.h $(TREE_INLINE_H) $(TARGET_H)
  cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
     $(FLAGS_H) insn-config.h $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h \
     output.h $(TOPLEV_H) $(FUNCTION_H) $(EXCEPT_H) $(TM_P_H) insn-config.h $(EXPR_H) \
--- 2781,2787 ----
     $(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) $(FUNCTION_H) $(TIMEVAR_H) $(TM_H) \
     coretypes.h $(TREE_DUMP_H) $(EXCEPT_H) langhooks.h tree-pass.h $(RTL_H) \
     $(DIAGNOSTIC_H) $(TOPLEV_H) $(BASIC_BLOCK_H) $(FLAGS_H) debug.h $(PARAMS_H) \
!    value-prof.h $(TREE_INLINE_H) $(TARGET_H) $(SSAEXPAND_H)
  cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
     $(FLAGS_H) insn-config.h $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h \
     output.h $(TOPLEV_H) $(FUNCTION_H) $(EXCEPT_H) $(TM_P_H) insn-config.h $(EXPR_H) \
Index: basic-block.h
===================================================================
*** basic-block.h.orig
--- basic-block.h
*************** extern void update_bb_for_insn (basic_bl
*** 512,517 ****
--- 512,518 ----

  extern void insert_insn_on_edge (rtx, edge);
  basic_block split_edge_and_insert (edge, rtx);

+ extern void commit_one_edge_insertion (edge e);
  extern void commit_edge_insertions (void);

  extern void remove_fake_edges (void);
Index: rtl.h
===================================================================
*** rtl.h.orig
--- rtl.h
*************** extern rtx emit_copy_of_insn_after (rtx,
*** 1495,1500 ****
--- 1495,1501 ----
  extern void set_reg_attrs_from_value (rtx, rtx);
  extern void set_mem_attrs_from_reg (rtx, rtx);
  extern void set_reg_attrs_for_parm (rtx, rtx);
+ extern void set_reg_attrs_for_decl_rtl (tree t, rtx x);
  extern void adjust_reg_mode (rtx, enum machine_mode);
  extern int mem_expr_equal_p (const_tree, const_tree);

Index: tree-flow.h
===================================================================
*** tree-flow.h.orig
--- tree-flow.h
*************** rtx addr_for_mem_ref (struct mem_address
*** 975,980 ****
--- 975,981 ----
  void get_address_description (tree, struct mem_address *);
  tree maybe_fold_tmr (tree);

+ unsigned int execute_free_datastructures (void);
  unsigned int execute_fixup_cfg (void);

  #include "tree-flow-inline.h"
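
A note on the tree-outof-ssa.c part of the patch: the elimination graph exists to turn the copies implied by the PHI nodes on one edge, which conceptually all happen at once, into an ordered list of moves, with a temporary to break a cycle (the comment on elim_create says as much, and get_temp_reg supplies the temporary). Below is a minimal standalone sketch of that idea only. It is not the patch's algorithm: it uses a simple worklist instead of the forward/backward graph walk, works on plain integers rather than partitions and RTL, and the names sequentialize_copies, apply_moves, struct move, MAX_COPIES and TEMP_REG are made up for illustration. It assumes every destination appears at most once.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_COPIES 32
#define TEMP_REG   (-1)   /* hypothetical id standing in for a scratch register */

struct move { int dest; int src; };

/* Return nonzero if P is still read by a pending copy other than SKIP.  */
static int
is_pending_source (const struct move *pending, size_t n, size_t skip, int p)
{
  size_t i;
  for (i = 0; i < n; i++)
    if (i != skip && pending[i].src == p)
      return 1;
  return 0;
}

/* Order the parallel copies DEST[i] <- SRC[i] (i < N) into OUT so that
   executing OUT one move at a time gives the same result.  OUT must have
   room for 2 * N moves.  Returns the number of moves written.  */
size_t
sequentialize_copies (const int *dest, const int *src, size_t n,
                      struct move *out)
{
  struct move pending[MAX_COPIES];
  size_t npending = 0, nout = 0, i;

  assert (n <= MAX_COPIES);

  /* Self copies are no-ops; drop them up front.  */
  for (i = 0; i < n; i++)
    if (dest[i] != src[i])
      {
        pending[npending].dest = dest[i];
        pending[npending].src = src[i];
        npending++;
      }

  while (npending > 0)
    {
      int progressed = 0;

      /* A copy is safe to emit once nothing pending still reads its
         destination.  */
      for (i = 0; i < npending; i++)
        if (!is_pending_source (pending, npending, i, pending[i].dest))
          {
            out[nout++] = pending[i];
            pending[i] = pending[--npending];
            progressed = 1;
            break;
          }

      if (!progressed)
        {
          /* Only cycles are left: save one destination in the scratch
             register and make its readers use the saved value.  */
          int saved = pending[0].dest;
          out[nout].dest = TEMP_REG;
          out[nout].src = saved;
          nout++;
          for (i = 0; i < npending; i++)
            if (pending[i].src == saved)
              pending[i].src = TEMP_REG;
        }
    }
  return nout;
}

/* Execute MOVES in order on REGS, modelling TEMP_REG as a local.  */
void
apply_moves (int *regs, const struct move *moves, size_t n)
{
  int temp = 0;
  size_t i;
  for (i = 0; i < n; i++)
    {
      int v = moves[i].src == TEMP_REG ? temp : regs[moves[i].src];
      if (moves[i].dest == TEMP_REG)
        temp = v;
      else
        regs[moves[i].dest] = v;
    }
}
```

For the two-way swap {0 <- 1, 1 <- 0} the sketch emits three moves: TEMP <- 0, 0 <- 1, 1 <- TEMP. In the patch the corresponding moves are real RTL insns queued on the edge with insert_insn_on_edge.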
* Re: RFC: expand from SSA form (1/2) 2009-04-13 20:50 RFC: expand from SSA form (1/2) Michael Matz @ 2009-04-21 18:23 ` Andrew MacLeod 2009-04-22 10:12 ` Paolo Bonzini 2009-04-22 10:54 ` Michael Matz 0 siblings, 2 replies; 63+ messages in thread From: Andrew MacLeod @ 2009-04-21 18:23 UTC (permalink / raw) To: Michael Matz; +Cc: gcc-patches [-- Attachment #1: Type: text/plain, Size: 2996 bytes --] Michael Matz wrote: > Hi, > > This patch implements expanding directly from SSA form, i.e. without first > going out and producing GENERIC. It doesn't yet get rid of explicitely > Sorry I'm so slow getting to this :-) This seems like a good approach, and is similar to what I figured was the most straightforward way of merging out-of-ssa and expand. Basically perform out-of-ssa and allow the ssa-names to pass through to to expand and deal with them there. > > Anyway, here is how it works: > (1) in SSA form: create and coalesce partitions, as before, but don't > rewrite anything. Pass this info to cfgexpand.c. > (2) cfgexpand.c: allocate some space (in pseudos or stack) for each > partition not containing a default def (these are the ones > corresponding to VAR_DECLs). Set interesting attributes for that RTX > according to the underlying SSA_NAME_VAR. I.e. we don't have to > create artificial abc.24 variables for secondary partition. > Do you still create stack space for the original SSA_NAME variable? It doesn't look like it, but I'm not real familiar with that code. It shouldn't be needed normally right? I'm just wondering which partitions without default defs would require stack space. > TER is still supported and implemented by deferring expanding an > assignment that is TERed to the place where it's actually used. That's a > change from before: we don't insert the tree of the RHS directly into the > tree that is going to be expanded. > yeah, but TER is basically just a large expression table indexed by ssa version. 
So you just carry that table through to expand and utilize it as you encounter the single use during expansion right? > > I've also changed out-of-ssa to mostly work only with partition numbers > instead of partition variables (those are removed in the cleanup pass). > We also don't need to must-coalesce default defs with base variables or in > fact change the variables per partition in any way at all (i.e. a > partition always corresponds to exactly one SSA name, whose version is the > partition number). > yes, this aspect is certainly simplified now. > As out-of-ssa and expand are now basically one pass there can't be any > passes working on non-SSA trees anymore. Currently that's only two > passes: tree-nrv, which is easily fixed, and mudflap, which I deactivated > for now. > Has there been any thought to turning mudflap into a plugin now that we have the machinery for that? Its seems like a ripe candidate. > > But I'm interested in any comments you might have. In particular I'm not > exceptionally fond of the four insert_*_on_edge() routines, though we > indeed need four signatures: partition to partition, tree to partition, > rtx to partition and partition to rtx. > yeah, but I'm not sure what you can usefully do about that either, with tree, rtx and int types to deal with. I think it looks pretty good. Just a few comments. Andrew [-- Attachment #2: outofssa-reply --] [-- Type: text/plain, Size: 5201 bytes --] > *************** expand_used_vars (void) > *** 1451,1456 **** > --- 1510,1557 ---- > > init_vars_expansion (); > > + for (i = 0; i < SA.map->num_partitions; i++) > + { > + tree var = partition_to_var (SA.map, i); > + > + if (TREE_CODE (var) == SSA_NAME) > + var = SSA_NAME_VAR (var); > + gcc_assert (is_gimple_reg (var)); > + if (TREE_CODE (var) == VAR_DECL) > + expand_one_var (partition_to_var (SA.map, i), true, true); > + else > + { > + /* This is a PARM_DECL or RESULT_DECL. 
For those partitions that > + contain the default def (representing the parm or result itself) > + we don't do anything here. But those which don't contain the > + default def (representing a temporary based on the parm/result) > + we need to allocate space just like for normal VAR_DECLs. */ I presume the allocated space for these temps is normally in a register? > + int j = i; > + struct partition_elem *start, *elem; > + int has_default = 0; > + if (SA.map->view_to_partition) > + j = SA.map->view_to_partition[j]; > + j = partition_find (SA.map->var_partition, j); > + start = elem = SA.map->var_partition->elements + j; > + do Ugg, do you have to expose the internal workings of the partition mechanism? I tried to use only the published API in case I ever wanted to change it for performance reasons, or whatever... Maybe add a bitmap to the expansion structure which indicates which partitions have a default_def in them, and initialize it at the end of the out-of-ssa process once partitions don't change any more. It'll also make the code here much clearer, just check if the bit is on. > ! def_operand_p def_p; > ! tree stmt_tree; > ! def_p = SINGLE_SSA_DEF_OPERAND (stmt, SSA_OP_DEF); > ! > ! if (def_p != NULL) > ! { > ! /* Mark this stmt for removal if it is the list of replaceable > ! expressions. */ The stmt isnt being marked for removal, it just isnt having any expansion done on it... > > + /* Now that we also have the parameter RTXs, copy them over to our > + partitions. */ > + for (i = 0; i < SA.map->num_partitions; i++) > + { > + tree var = partition_to_var (SA.map, i); > + > + if (TREE_CODE (var) == SSA_NAME) > + var = SSA_NAME_VAR (var); > + if (TREE_CODE (var) != VAR_DECL > + && !SA.partition_to_pseudo[i]) > + SA.partition_to_pseudo[i] = DECL_RTL_IF_SET (var); > + gcc_assert (SA.partition_to_pseudo[i]); > + /* Some RTL parts really want to look at DECL_RTL(x) when x > + was a decl marked in REG_ATTR or MEM_ATTR. 
We could use > + SET_DECL_RTL here making this available, but that would mean > + to select one of the potentially many RTLs for one DECL. Instead > + of doing that we simply reset the MEM_EXPR of the RTL in question, > + then nobody can get at it and hence nobody can call DECL_RTL on it. */ I presume that the MEM_EXPR is recreated somewhere? or it doesn't matter for some other reason? > + currently_expanding_to_rtl = 1; > + FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR->next_bb, EXIT_BLOCK_PTR, next_bb) Wasn't 'currently_expanding_to_rtl' already set earlier in gimple_expand_cfg()? > + { > + edge e; > + edge_iterator ei; > + /* ??? commit_one_edge_insertion might reorder the edges of the > + current block (when it needs to split an edge), so that we > + might miss some edges with instructions on them. Pff. > + (see execute/ashldi-1.c). */ > + VEC(edge,gc) *edges = VEC_copy (edge,gc,bb->preds); > + > + for (ei = ei_start (edges); (e = ei_safe_edge (ei)); ) > + { > + ei_next (&ei); > + if (e->insns.r) > + commit_one_edge_insertion (e); > + } > + VEC_free (edge, gc, edges); > + } how annoying eh. Why doesn't commit_edge_insertions() run into this problem as well? It just loops over the edges with a FOR... is it because it does SUCC's instead? could we use SUCCs here instead? > Index: tree-ssa-coalesce.c > =================================================================== > *** tree-ssa-coalesce.c.orig > --- tree-ssa-coalesce.c > *************** compare_pairs (const void *p1, const voi > *** 314,320 **** > const_coalesce_pair_p const *const pp2 = (const_coalesce_pair_p const *) p2; > int result; > > ! result = (* pp2)->cost - (* pp1)->cost; > /* Since qsort does not guarantee stability we use the elements as a secondary key. This provides us with independence from the host's implementation of the sorting algorithm. */ > --- 314,320 ---- > const_coalesce_pair_p const *const pp2 = (const_coalesce_pair_p const *) p2; > int result; > > ! 
result = (* pp1)->cost - (* pp2)->cost; > /* Since qsort does not guarantee stability we use the elements as a secondary key. This provides us with independence from the host's implementation of the sorting algorithm. */ Hmm. how long has this been backwards... its seems fine in 4.2, I rewrote it for 4.3, and jeez, its been backwards since 4.3.0 was released it seems. Zoinks. Wonder if there are any performance regressions due to that.... ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: RFC: expand from SSA form (1/2) 2009-04-21 18:23 ` Andrew MacLeod @ 2009-04-22 10:12 ` Paolo Bonzini 2009-04-22 10:16 ` Richard Guenther 2009-04-22 10:54 ` Michael Matz 1 sibling, 1 reply; 63+ messages in thread From: Paolo Bonzini @ 2009-04-22 10:12 UTC (permalink / raw) To: Andrew MacLeod; +Cc: Michael Matz, gcc-patches >> TER is still supported and implemented by deferring expanding an >> assignment that is TERed to the place where it's actually used. >> That's a change from before: we don't insert the tree of the RHS >> directly into the tree that is going to be expanded. > > yeah, but TER is basically just a large expression table indexed by ssa > version. So you just carry that table through to expand and utilize it > as you encounter the single use during expansion right? We could use SSA_NAME_VALUE here, maybe? Paolo ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: RFC: expand from SSA form (1/2) 2009-04-22 10:12 ` Paolo Bonzini @ 2009-04-22 10:16 ` Richard Guenther 0 siblings, 0 replies; 63+ messages in thread From: Richard Guenther @ 2009-04-22 10:16 UTC (permalink / raw) To: Paolo Bonzini; +Cc: Andrew MacLeod, Michael Matz, gcc-patches On Wed, Apr 22, 2009 at 12:12 PM, Paolo Bonzini <bonzini@gnu.org> wrote: > >>> TER is still supported and implemented by deferring expanding an >>> assignment that is TERed to the place where it's actually used. >>> That's a change from before: we don't insert the tree of the RHS >>> directly into the tree that is going to be expanded. >> >> yeah, but TER is basically just a large expression table indexed by ssa >> version. So you just carry that table through to expand and utilize it >> as you encounter the single use during expansion right? > > We could use SSA_NAME_VALUE here, maybe? No, we should get rid of that field ;) Richard. > Paolo > ^ permalink raw reply [flat|nested] 63+ messages in thread
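A minimal sketch of the deferred expansion discussed in this subthread, assuming a toy model rather than GCC's real data structures: statements are reduced to an LHS SSA version plus two operand versions, and the TER table is simply an array indexed by SSA version. All names here (`stmt_sim`, `ter_table`, `expand_block`) are hypothetical. A statement whose LHS is in the table is skipped where it is defined and expanded only when its single use is reached.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VERSIONS 32
#define NO_OPERAND (-1)

/* Toy statement: lhs = f (op0, op1); all fields are SSA version numbers. */
struct stmt_sim { int lhs; int op0; int op1; };

/* Hypothetical TER table: def[v] is non-NULL when version v is replaceable,
   and then points at the statement defining it.  */
struct ter_table { const struct stmt_sim *def[MAX_VERSIONS]; };

static void expand_stmt (const struct ter_table *ter, const struct stmt_sim *s,
                         int *order, size_t *n);

/* Expand a use of version V.  If V was TERed, its defining statement is
   expanded right here, at the point of use.  */
static void expand_operand (const struct ter_table *ter, int v,
                            int *order, size_t *n)
{
  if (v == NO_OPERAND)
    return;
  assert (v >= 0 && v < MAX_VERSIONS);
  if (ter->def[v] != NULL)
    expand_stmt (ter, ter->def[v], order, n);
}

/* "Emit" statement S: operands first, then record the LHS version in ORDER. */
static void expand_stmt (const struct ter_table *ter, const struct stmt_sim *s,
                         int *order, size_t *n)
{
  expand_operand (ter, s->op0, order, n);
  expand_operand (ter, s->op1, order, n);
  order[(*n)++] = s->lhs;
}

/* Walk the block in program order, skipping TERed definitions.  Returns the
   number of statements emitted; ORDER receives the LHS versions in emission
   order.  */
size_t expand_block (const struct ter_table *ter, const struct stmt_sim *stmts,
                     size_t count, int *order)
{
  size_t n = 0;
  for (size_t i = 0; i < count; i++)
    {
      const struct stmt_sim *s = &stmts[i];
      assert (s->lhs >= 0 && s->lhs < MAX_VERSIONS);
      if (ter->def[s->lhs] != NULL)
        continue;               /* deferred to its single use */
      expand_stmt (ter, s, order, &n);
    }
  return n;
}
```

In this model, marking a version as replaceable moves its emission from the point of definition to just before the statement that uses it, and nothing is emitted twice as long as each TERed version really has a single use.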
* Re: RFC: expand from SSA form (1/2) 2009-04-21 18:23 ` Andrew MacLeod 2009-04-22 10:12 ` Paolo Bonzini @ 2009-04-22 10:54 ` Michael Matz 2009-04-22 11:20 ` Richard Guenther ` (2 more replies) 1 sibling, 3 replies; 63+ messages in thread
From: Michael Matz @ 2009-04-22 10:54 UTC (permalink / raw)
To: Andrew MacLeod; +Cc: gcc-patches

Hi,

On Tue, 21 Apr 2009, Andrew MacLeod wrote:

> > Anyway, here is how it works:
> > (1) in SSA form: create and coalesce partitions, as before, but don't
> > rewrite anything. Pass this info to cfgexpand.c.
> > (2) cfgexpand.c: allocate some space (in pseudos or stack) for each
> > partition not containing a default def (these are the ones corresponding
> > to VAR_DECLs). Set interesting attributes for that RTX according to the
> > underlying SSA_NAME_VAR. I.e. we don't have to create artificial abc.24
> > variables for secondary partition.
>
> Do you still create stack space for the original SSA_NAME variable? It
> doesn't look like it, but I'm not real familiar with that code.

No, no space is allocated for the base variables themselves. Only for a
partition which happens to be based on a PARM_DECL, but which doesn't
contain the default def, do I have to allocate new space. These partitions
are the ones for which formerly new temporary variables would have been
created (and hence new space allocated) as they weren't coalescable with
the partition containing the default def.

> It shouldn't be needed normally right? I'm just wondering which
> partitions without default defs would require stack space.

All partitions without a default def require space. Whether that is stack
or pseudos is decided by use_register_for_decl().

> yeah, but TER is basically just a large expression table indexed by ssa
> version. So you just carry that table through to expand and utilize it
> as you encounter the single use during expansion right?

Exactly. So the effect of TER with those patches is (for now) only to
reduce the lifetime of the LHS (most probably a pseudo).
It doesn't start at the original point of definition anymore, but only
right before its use (where it then also dies). In the future we can make
use of the table directly in expand, i.e. insert the RHS into the
to-be-expanded expressions on the fly.

> > As out-of-ssa and expand are now basically one pass there can't be any
> > passes working on non-SSA trees anymore. Currently that's only two
> > passes: tree-nrv, which is easily fixed, and mudflap, which I
> > deactivated for now.
>
> Has there been any thought to turning mudflap into a plugin now that we
> have the machinery for that? Its seems like a ripe candidate.

At least not from my side. I went the sissy way and instead fixed mudflap
to work on SSA form :-|

Comments on the patch comments below :) I'll soon send a new version of
the patch that fixes all problems and testcases I encountered.

Ciao,
Michael.

> > + /* This is a PARM_DECL or RESULT_DECL. For those partitions that
> > +    contain the default def (representing the parm or result itself)
> > +    we don't do anything here. But those which don't contain the
> > +    default def (representing a temporary based on the parm/result)
> > +    we need to allocate space just like for normal VAR_DECLs. */
>
> I presume the allocated space for these temps is normally in a register?

Yes, when optimization is on. With -O0 almost everything will be committed
to the stack.

> > + int j = i;
> > + struct partition_elem *start, *elem;
> > + int has_default = 0;
> > + if (SA.map->view_to_partition)
> > +   j = SA.map->view_to_partition[j];
> > + j = partition_find (SA.map->var_partition, j);
> > + start = elem = SA.map->var_partition->elements + j;
> > + do
>
> Ugg, do you have to expose the internal workings of the partition
> mechanism? I tried to use only the published API in case I ever
> wanted to change it for performance reasons, or whatever...
> Maybe add a bitmap to the expansion structure which indicates which
> partitions have a default_def in them, and initialize it at the end of
> the out-of-ssa process once partitions don't change any more.

Yeah, I'm not terribly fond of the iteration above either. I somehow
wanted to avoid walking all SSA names to set this to-be-invented flag,
but I also didn't want to slow the partition unioning by merging the
flags. I'll probably bite the bullet and create a new bitmap, with one
bit per partition, as you suggested.

> > ! /* Mark this stmt for removal if it is the list of replaceable
> > !    expressions. */
>
> The stmt isnt being marked for removal, it just isnt having any expansion
> done on it...

Oh, right. Copy&pasted comments tend to become stale :)

> > + /* Some RTL parts really want to look at DECL_RTL(x) when x
> > +    was a decl marked in REG_ATTR or MEM_ATTR. We could use
> > +    SET_DECL_RTL here making this available, but that would mean
> > +    to select one of the potentially many RTLs for one DECL. Instead
> > +    of doing that we simply reset the MEM_EXPR of the RTL in question,
> > +    then nobody can get at it and hence nobody can call DECL_RTL on it. */
>
> I presume that the MEM_EXPR is recreated somewhere? or it doesn't
> matter for some other reason?

MEM_ATTR (and hence MEM_EXPR) are created by set_decl_rtl (via
set_mem_attr*). If it isn't set explicitly it isn't recreated lazily
anymore. Which is a good thing. It's used in the RTL alias analysis.
So given two RTL MEMs it looks up the MEM_EXPRs of both (getting at
e.g. tree _DECL nodes), and compares those. But it then _also_ looks at
DECL_RTL() of those nodes, getting back to the RTL expression (either
the original or one representing the base object, e.g. without offset).
So it first goes from RTL to tree and from there back to RTL.

The problem with that is that DECL_RTL() lazily tries to create the RTL
expression if it wasn't set yet.
And we don't set DECL_RTL for variables split into SSA partitions
(because there are multiple RTL expressions for each base var). So this
DECL_RTL lookup in alias.c would lazily try to create the RTL, which then
ICEs because such lazy generation is not acceptable for VAR_DECLs.

So, we have to break the RTL->tree->RTL lookup at one of the two steps.
Either fix all RTL passes to not look up DECL_RTL() on some tree, or
don't even provide the tree to start with. I chose the latter for the
following reason: the problem only occurs with MEMs (not REGs). We
generate MEMs for SSA names only if not optimizing or in exceptional
situations (an impossible machine mode for registers, for instance). If
not optimizing, the alias machinery isn't active, so it doesn't need the
MEM_EXPR. In the exceptional situations looking at the MEM RTL
expression itself is enough for disambiguation (and it happens very
seldom). So removing the MEM_EXPR from the MEM_ATTRs doesn't hinder
optimization, and is a central point to ensure that the later RTL
passes don't accidentally call DECL_RTL.

> > + currently_expanding_to_rtl = 1;
> > + FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR->next_bb, EXIT_BLOCK_PTR, next_bb)
>
> Wasn't 'currently_expanding_to_rtl' already set earlier in
> gimple_expand_cfg()?

Yeah. It's reset to 0 five lines above:
  currently_expanding_to_rtl = 0;
  convert_from_eh_region_ranges ();
  set_eh_throw_stmt_table (cfun, NULL);
  rebuild_jump_labels (get_insns ());
  find_exception_handler_labels ();
  currently_expanding_to_rtl = 1;
The four routines in between look as if they might be affected by the
setting of currently_expanding_to_rtl, but splitting edges needs to come
afterwards and needs to have it set. Grepping around now I see that
before the patch currently_expanding_to_rtl is only used in two
backends, so it's probably safe to remove the funny reset/set.

> > + /* ??? commit_one_edge_insertion might reorder the edges of the
> > +    current block (when it needs to split an edge), so that we
> > +    might miss some edges with instructions on them. Pff.
> > +    (see execute/ashldi-1.c). */
> > + VEC(edge,gc) *edges = VEC_copy (edge,gc,bb->preds);
> > +
> > + for (ei = ei_start (edges); (e = ei_safe_edge (ei)); )
> > +   {
> > +     ei_next (&ei);
> > +     if (e->insns.r)
> > +       commit_one_edge_insertion (e);
> > +   }
> > + VEC_free (edge, gc, edges);
> > + }
>
> how annoying eh. Why doesn't commit_edge_insertions() run into this
> problem as well?

Yeah, I was also confused about this.

> It just loops over the edges with a FOR... is it
> because it does SUCC's instead? could we use SUCCs here instead?

I tried using preds and succs, iterating from back or from front.
Nothing helped. If one thinks about it, it also can't help (the way
edges are removed/inserted to split them necessarily invalidates
iterators). The only theory I have is that in all places where
commit_edge_insertions() is currently called there either aren't that
many critical edges with insns on them, or at least there's only one
such edge per block.

The nature of the problem is the following: Suppose you have this edge
list (preds or succs doesn't matter):
  E2 E3* E4 E5*
(the * edges have insns, suppose E3 is critical). Now when inserting
for E3 we need to split it, let's call it E3'. For wiring this new edge
into the above list, we first remove E3 and end up with:
  E2 E5* E4
(that's the fast removal at work, fill up the empty slot with the last
element) and put back E3':
  E2 E5* E4 E3'
Voila, our iterator (now pointing to the next element, E4) won't come by
E5 anymore and will miss an insertion.

Thinking about it now I see a way to solve it. Simply not advancing the
iterator when we have something to insert should do the trick. Still
doesn't explain why commit_edge_insertions should work.

> > ! result = (* pp1)->cost - (* pp2)->cost;
> > /* Since qsort does not guarantee stability we use the elements
>    as a secondary key. This provides us with independence from
>    the host's implementation of the sorting algorithm. */
>
> Hmm. how long has this been backwards...
> its seems fine in 4.2, I rewrote it for 4.3, and jeez, its been backwards
> since 4.3.0 was released it seems. Zoinks.

:-) It's a bit non-obvious, because the predicate looks correct at
first sight and only becomes incorrect considering that the sorted array
then is walked from back to front ;)

^ permalink raw reply [flat|nested] 63+ messages in thread
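The missed-edge scenario from the message above can be replayed with a small simulation. This is only a sketch under stated assumptions: the edge vector is a plain array, removal uses the swap-with-last scheme described in the message, and "splitting" a critical edge is modelled as removing it and appending a replacement without pending insns. None of the names below (`edge_sim`, `commit_one`, and so on) are GCC functions.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EDGES 16

/* Toy edge: an id, whether insns are queued on it, whether it is critical. */
struct edge_sim { int id; int has_insns; int critical; };

struct edge_list { struct edge_sim e[MAX_EDGES]; size_t n; };

/* Fast removal: fill the hole at index I with the last element.  */
void unordered_remove (struct edge_list *l, size_t i)
{
  assert (i < l->n);
  l->e[i] = l->e[l->n - 1];
  l->n--;
}

/* Commit the insns on edge I.  A critical edge is "split": it is removed
   from the list and a replacement (id + 100) is appended at the end.  */
void commit_one (struct edge_list *l, size_t i)
{
  struct edge_sim e = l->e[i];
  e.has_insns = 0;
  if (e.critical)
    {
      unordered_remove (l, i);
      assert (l->n < MAX_EDGES);
      e.id += 100;
      e.critical = 0;
      l->e[l->n++] = e;
    }
  else
    l->e[i] = e;
}

/* Advance before committing, as in the quoted loop.  Returns the number of
   commits performed.  */
int commit_advance_first (struct edge_list *l)
{
  size_t i = 0;
  int done = 0;
  while (i < l->n)
    {
      size_t cur = i++;
      if (l->e[cur].has_insns)
        {
          commit_one (l, cur);
          done++;
        }
    }
  return done;
}

/* Proposed variant: stay on the same slot after a commit and only advance
   when the current edge has nothing pending.  */
int commit_hold_position (struct edge_list *l)
{
  size_t i = 0;
  int done = 0;
  while (i < l->n)
    {
      if (l->e[i].has_insns)
        {
          commit_one (l, i);
          done++;
        }
      else
        i++;
    }
  return done;
}

/* The list from the message: E2 E3* E4 E5*, with E3 critical.  */
void make_example (struct edge_list *l)
{
  static const struct edge_sim init[] = {
    {2, 0, 0}, {3, 1, 1}, {4, 0, 0}, {5, 1, 0}
  };
  l->n = sizeof init / sizeof init[0];
  for (size_t i = 0; i < l->n; i++)
    l->e[i] = init[i];
}

/* Number of edges that still have insns queued.  */
int count_pending (const struct edge_list *l)
{
  int pending = 0;
  for (size_t i = 0; i < l->n; i++)
    pending += l->e[i].has_insns;
  return pending;
}
```

On the example list, the advance-first loop leaves E5 uncommitted, while the hold-position loop commits both pending edges. The second loop terminates because every commit clears one pending flag and the replacement edge is appended without one.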
* Re: RFC: expand from SSA form (1/2) 2009-04-22 10:54 ` Michael Matz @ 2009-04-22 11:20 ` Richard Guenther 2009-04-22 12:34 ` Andrew MacLeod 2009-04-22 16:45 ` [RFA] " Michael Matz 2 siblings, 0 replies; 63+ messages in thread From: Richard Guenther @ 2009-04-22 11:20 UTC (permalink / raw) To: Michael Matz; +Cc: Andrew MacLeod, gcc-patches On Wed, Apr 22, 2009 at 12:54 PM, Michael Matz <matz@suse.de> wrote: > Hi, > > On Tue, 21 Apr 2009, Andrew MacLeod wrote: > >> > Anyway, here is how it works: >> > (1) in SSA form: create and coalesce partitions, as before, but don't >> > rewrite anything. Pass this info to cfgexpand.c. >> > (2) cfgexpand.c: allocate some space (in pseudos or stack) for each >> > partition not containing a default def (these are the ones corresponding >> > to VAR_DECLs). Set interesting attributes for that RTX according to the >> > underlying SSA_NAME_VAR. I.e. we don't have to create artificial abc.24 >> > variables for secondary partition. >> >> Do you still create stack space for the original SSA_NAME variable? It >> doesn't look like it, but I'm not real familiar with that code. > > No, no space is allocated for the base variables itself. Only for a > partition which happens to be based on a PARM_DECL, but which doesn't > contain the default def I have to allocate new space. These partitions > are the ones for which formerly new temporary variables would have been > created (and hence new space allocated) as they weren't coalescable with > the partition containing the default def. > >> It shouldn't be needed normally right? I'm just wondering which >> partitions without default defs would require stack space. > > All partitions without default def require space. If stack or pseudos is > decided by use_register_for_decl(). > >> yeah, but TER is basically just a large expression table indexed by ssa >> version. So you just carry that table through to expand and utilize it >> as you encounter the single use during expansion right? > > Exactly. 
So the effect of TER with those patches is (for now) only to > reduce the lifetime of the LHS (most probably a pseudo). It doesn't start > at the original point of definition anymore, but only right before it's > use (where it then also dies). In the future we can make use of the table > directly in expand, i.e. insert the RHS into the to-be-expanded > expressions on the fly. > >> > As out-of-ssa and expand are now basically one pass there can't be any >> > passes working on non-SSA trees anymore. Currently that's only two >> > passes: tree-nrv, which is easily fixed, and mudflap, which I >> > deactivated for now. >> >> Has there been any thought to turning mudflap into a plugin now that we >> have the machinery for that? Its seems like a ripe candidate. > > At least not from my side. I went the sissy way and instead fixed mudflap > to work on SSA form :-| > > Comments on the patch comments below :) I'll soon send a new version of > the patch that fixes all problems and testcases I encountered. > > > Ciao, > Michael. > >> > + /* This is a PARM_DECL or RESULT_DECL. For those partitions that >> > + contain the default def (representing the parm or result itself) >> > + we don't do anything here. But those which don't contain the >> > + default def (representing a temporary based on the parm/result) >> > + we need to allocate space just like for normal VAR_DECLs. */ >> >> I presume the allocated space for these temps is normally in a register? > > Yes, when optimization is on. With -O0 mostly everything will be comitted > to stack. > >> > + int j = i; >> > + struct partition_elem *start, *elem; >> > + int has_default = 0; >> > + if (SA.map->view_to_partition) >> > + j = SA.map->view_to_partition[j]; >> > + j = partition_find (SA.map->var_partition, j); >> > + start = elem = SA.map->var_partition->elements + j; >> > + do >> >> Ugg, do you have to expose the internal workings of the partition >> mechanism? 
I tried to use only the published API in case I ever >> wanted to change it for performance reasons, or whatever... Maybe add >> a bitmap to the expansion structure which indicates which partitions >> have a default_def in them, and initialize it at the end of the >> out-of-ssa process once partitions don't change any more. > > Yeah, I'm not terribly fond of the iteration above either. I somehow > wanted to avoid walking all SSA names to set this to-be-invented flag, > but I also didn't want to slow the partition unioning by merging the > flags. I probably bite the apple and create a new bitmap per partition > as you suggested. > >> > ! /* Mark this stmt for removal if it is the list of replaceable >> > ! expressions. */ >> >> The stmt isnt being marked for removal, it just isnt having any expansion >> done on it... > > Oh, right. copy&pasted comments tend to become stale :) > >> > + /* Some RTL parts really want to look at DECL_RTL(x) when x >> > + was a decl marked in REG_ATTR or MEM_ATTR. We could use >> > + SET_DECL_RTL here making this available, but that would mean >> > + to select one of the potentially many RTLs for one DECL. Instead >> > + of doing that we simply reset the MEM_EXPR of the RTL in question, >> > + then nobody can get at it and hence nobody can call DECL_RTL on it. */ >> >> I presume that the MEM_EXPR is recreated somewhere? or it doesn't >> matter for some other reason? > > MEM_ATTR (and hence MEM_EXPR) are created by > set_decl_rtl (via set_mem_attr*). If it isn't set explicitely it > isn't recreated lazily anymore. Which is a good thing. It's used in > the RTL alias analysis. So given two RTL MEMs it looks > up the MEM_EXPRs of both (getting at e.g. tree _DECL nodes), and > compares those. But it then _also_ looks at DECL_RTL() of those nodes, > getting back to the RTL expression (either the original or one > representing the base object, e.g. without offset). So it first goes > from RTL to tree and from there back to RTL. 
> > The problem with that is that DECL_RTL() lazily tries to create the RTL > expression if it weren't set yet. And we don't set DECL_RTL for > variables split into SSA partitions (because there are multiple RTL > expressions for each base var). So this DECL_RTL lookup in alias.c > would lazily try to create the RTL, which then ICEs because such lazy > generation is not acceptable for VAR_DECLs. > > So, we have to break the RTL->tree->RTL lookup at one of the two steps. > Either fix all RTL passes to not look up DECL_RTL() on some tree, or not > even providing the tree to start with. I chose the latter for the > following reason: the problem only occurs with MEMs (not REGs). We > generate MEMs for SSA names only if not optimizing or in exceptional > situations (impossible machine mode for registers for instance). If not > optimizing the alias machinery isn't active, so doesn't need the > MEM_EXPR. In the exceptional situations looking at the MEM RTL > expression itself is enough for disambiguation (and it happens very > seldomly). So removing the MEM_EXPR from the MEM_ATTRs doesn't hinder > optimization, and is a central point to ensure that the later RTL > passes don't accidentally call DECL_RTL. > >> > + currently_expanding_to_rtl = 1; >> > + FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR->next_bb, EXIT_BLOCK_PTR, next_bb) >> >> Wasn't 'currently_expanding_to_rtl' already set earlier in >> gimple_expand_cfg()? > > Yeah. It's reset to 0 five lines above: > currently_expanding_to_rtl = 0; > convert_from_eh_region_ranges (); > set_eh_throw_stmt_table (cfun, NULL); > rebuild_jump_labels (get_insns ()); > find_exception_handler_labels (); > currently_expanding_to_rtl = 1; > The four routines in between look as if they might be affected by the > setting of currently_expanding_to_rtl, but splitting edges needs to come > afterwards and needs to have it set. 
greping around now I see that > before the patch currently_expanding_to_rtl is only used in two > backends, so it's probably safe to remove the funny reset/set. > >> > + /* ??? commit_one_edge_insertion might reorder the edges of the >> > + current block (when it needs to split an edge), so that we >> > + might miss some edges with instructions on them. Pff. >> > + (see execute/ashldi-1.c). */ >> > + VEC(edge,gc) *edges = VEC_copy (edge,gc,bb->preds); >> > + >> > + for (ei = ei_start (edges); (e = ei_safe_edge (ei)); ) >> > + { >> > + ei_next (&ei); >> > + if (e->insns.r) >> > + commit_one_edge_insertion (e); >> > + } >> > + VEC_free (edge, gc, edges); >> > + } >> >> how annoying eh. Why doesn't commit_edge_insertions() run into this >> problem as well? > > Yeah, I was also confused about this. > >> It just loops over the edges with a FOR... is it >> because it does SUCC's instead? could we use SUCCs here instead? > > I tried using preds and succs, iterating from back or from front. > Nothing helped. If one thinks about it it also can't help (the way > edges are removed/inserted to split them necessarily invalidates > iterators). The only theory I have is, that in all places where > currently commit_edge_insertions() is called there either aren't that > many critical edges with insns on them, or at least there's only one > such edge per block. > > The nature of the problem is the following: Suppose you have this edge > list (preds or succs doesn't matter): E2 E3* E4 E5* > (the * edges have insns, suppose E3 is critical). Now when inserting > for E3 we need to split it, let's call it E3'. For wiring this new edge > into the above list, we first remove E3 and end up with: > E2 E5* E4 (that's the fast removal at work, fill up the empty slot > with the last element) > and put back E3' : E2 E5* E4 E3'. Voila, our iterator (now pointing to > the next element, E4) won't come by E5 anymore and miss an insertion. > > Thinking about it now I see a way to solve it. 
Simply not advancing the > iterator when we have something to insert should do the trick. Still > doesn't explain why commit_edge_insertions should work. We probably should fix commit_edge_insertions the same way, likewise the tree variant in gimple-iterator. Note that FOR_EACH_EDGE specifically says It must not be used when an element might be removed during the traversal, otherwise elements will be missed A separate patch for this is welcome. >> > ! result = (* pp1)->cost - (* pp2)->cost; >> > /* Since qsort does not guarantee stability we use the elements >> as a secondary key. This provides us with independence from >> the host's implementation of the sorting algorithm. */ >> >> Hmm. how long has this been backwards... >> its seems fine in 4.2, I rewrote it for 4.3, and jeez, its been backwards >> since 4.3.0 was released it seems. Zoinks. > > :-) It's a bit non-obvious, because the predicate looks correct at > first sight and only becomes incorrect considering that the sorted array > then is walked from back to front ;) Maybe fix this on trunk separately. Richard. ^ permalink raw reply [flat|nested] 63+ messages in thread
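The interaction between the comparison function and the direction of the walk can be illustrated in isolation. The following is a sketch, assuming a toy `pair_sim` record and helper names that are not taken from tree-ssa-coalesce.c: the array is sorted by ascending cost (element ids as secondary key, so the result does not depend on the host's qsort), and then consumed from the back so that the most expensive pair is handled first.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy coalesce pair: two element ids and the cost of not coalescing them. */
struct pair_sim { int first; int second; int cost; };

/* Three-way integer compare that cannot overflow. */
int cmp_int (int a, int b)
{
  return (a > b) - (a < b);
}

/* Ascending by cost; ids break ties so the order is fully determined. */
int compare_pairs_ascending (const void *p1, const void *p2)
{
  const struct pair_sim *a = (const struct pair_sim *) p1;
  const struct pair_sim *b = (const struct pair_sim *) p2;
  int result = cmp_int (a->cost, b->cost);
  if (result == 0)
    result = cmp_int (a->first, b->first);
  if (result == 0)
    result = cmp_int (a->second, b->second);
  return result;
}

/* Sort PAIRS, then walk from back to front, recording the costs in the
   order in which the pairs would be processed.  */
void coalesce_order (struct pair_sim *pairs, size_t n, int *out_costs)
{
  size_t k = 0;
  assert (out_costs != NULL);
  qsort (pairs, n, sizeof *pairs, compare_pairs_ascending);
  for (size_t i = n; i > 0; i--)
    out_costs[k++] = pairs[i - 1].cost;
}
```

With a descending comparison and the same back-to-front walk, the cheapest pairs would be processed first, which matches the symptom described in the thread.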
* Re: RFC: expand from SSA form (1/2) 2009-04-22 10:54 ` Michael Matz 2009-04-22 11:20 ` Richard Guenther @ 2009-04-22 12:34 ` Andrew MacLeod 2009-04-22 16:45 ` [RFA] " Michael Matz 2 siblings, 0 replies; 63+ messages in thread
From: Andrew MacLeod @ 2009-04-22 12:34 UTC (permalink / raw)
To: Michael Matz; +Cc: gcc-patches

Michael Matz wrote:
>> Has there been any thought to turning mudflap into a plugin now that we
>> have the machinery for that? Its seems like a ripe candidate.
>>
>
> At least not from my side. I went the sissy way and instead fixed mudflap
> to work on SSA form :-|
>
Which is also what I would have done :-)

> Yeah, I'm not terribly fond of the iteration above either. I somehow
> wanted to avoid walking all SSA names to set this to-be-invented flag,
> but I also didn't want to slow the partition unioning by merging the
> flags. I probably bite the apple and create a new bitmap per partition
> as you suggested.
>
I wasn't thinking of doing it in the unioning, but after coalescing was
done. I figured that since you end up walking each element of each
partition checking for default defs, it should be a similar cost to
simply loop over each ssa_version after coalescing, check for
default_def, and set a bit for the owning partition. Or something like
that.

> Thinking about it now I see a way to solve it. Simply not advancing the
> iterator when we have something to insert should do the trick. Still
> doesn't explain why commit_edge_insertions should work.
>
As richi says, perhaps it's just never had sufficient opportunity... odd
but possible. Maybe it could be fixed in commit_edge_insertions, and
have the initial part of that routine factored out since that's exactly
what you are doing here as well, and simply call the 'fixed' factored
routine. That should count as a future bug fixed :-)

>>> ! result = (* pp1)->cost - (* pp2)->cost;
>>> /* Since qsort does not guarantee stability we use the elements
>>>
>> as a secondary key.
This provides us with independence from
>> the host's implementation of the sorting algorithm.  */
>>
>> Hmm. how long has this been backwards...
>> it seems fine in 4.2, I rewrote it for 4.3, and jeez, it's been backwards
>> since 4.3.0 was released it seems.  Zoinks.
>>
>
> :-)  It's a bit non-obvious, because the predicate looks correct at
> first sight and only becomes incorrect considering that the sorted array
> then is walked from back to front ;)
>

Speaking of bugs fixed.  Yeah, well, it was immediately obvious to me when
I looked at a sample and the list of coalesces comes out in the same order
as the low-to-high cost list in .optimized.  I guess I never noticed the
forest through the trees.  Man that's lame.

Andrew
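The sort-direction mix-up discussed above is easy to reproduce in isolation.
The following is a sketch only: `struct pair`, `compare_pairs_asc` and
`most_expensive_first` are made-up names, not the GCC code, and the
comparator uses explicit comparisons instead of the subtraction in the patch
so that overflow is not a concern.  It sorts ascending by cost with the
elements as a secondary key, so a consumer that walks the array from back to
front sees the highest cost first:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a coalesce pair.  */
struct pair
{
  int first;
  int second;
  int cost;
};

/* Order ascending by cost.  Since qsort is not guaranteed to be stable,
   ties are broken by the elements, which makes the result independent
   of the host's qsort implementation.  */
int
compare_pairs_asc (const void *p1, const void *p2)
{
  const struct pair *a = p1;
  const struct pair *b = p2;

  if (a->cost != b->cost)
    return (a->cost > b->cost) - (a->cost < b->cost);
  if (a->first != b->first)
    return (a->first > b->first) - (a->first < b->first);
  return (a->second > b->second) - (a->second < b->second);
}

/* Sort V and return the pair that a back-to-front walk visits first,
   i.e. the last array element.  With an ascending sort this is the
   most expensive pair; with a descending sort it would be the
   cheapest one, which is the bug pattern described in the thread.  */
struct pair
most_expensive_first (struct pair *v, size_t n)
{
  assert (n > 0);
  qsort (v, n, sizeof *v, compare_pairs_asc);
  return v[n - 1];
}
```

Whether the comparator is "right" therefore depends on the direction in which
the sorted array is consumed, not on the comparator alone.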
* [RFA] expand from SSA form (1/2)
  2009-04-22 10:54 ` Michael Matz
  2009-04-22 11:20   ` Richard Guenther
  2009-04-22 12:34   ` Andrew MacLeod
@ 2009-04-22 16:45   ` Michael Matz
  2009-04-23 15:10     ` Andrew MacLeod
                       ` (4 more replies)
  2 siblings, 5 replies; 63+ messages in thread
From: Michael Matz @ 2009-04-22 16:45 UTC (permalink / raw)
  To: gcc-patches; +Cc: Andrew MacLeod, Andrey Belevantsev

On Wed, 22 Apr 2009, Michael Matz wrote:

> I'll soon send a new version of the patch that fixes all problems and
> testcases I encountered.

Like so.  This is the full patch, i.e. including the cleanups, but
excluding the testsuite changes.  It should incorporate all feedback.
Compared to the last version it adds comments for new functions, fixes
mudflap2, and fixes some other minor problems that showed up when I started
testing Ada, plus a bug reported by Andrey.

This patch (plus testsuite changes) was bootstrapped with Ada on
x86_64-linux.  There are no testsuite regressions; the remaining failures
are:

FAIL: gcc.dg/tree-prof/bb-reorg.c compilation,  -fprofile-use -D_PROFILE_USE
FAIL: gcc.dg/tree-prof/pr34999.c compilation,  -fprofile-use -D_PROFILE_USE
FAIL: gcc.target/i386/avx-vmovntdq-256-1.c (test for excess errors)
FAIL: gcc.target/i386/avx-vmovntpd-256-1.c (test for excess errors)
FAIL: gcc.target/i386/avx-vmovntps-256-1.c (test for excess errors)
FAIL: libmudflap.c++/pass41-frag.cxx execution test
FAIL: libmudflap.c++/pass41-frag.cxx (-static) execution test
FAIL: libmudflap.c++/pass41-frag.cxx (-O2) execution test
FAIL: libmudflap.c++/pass41-frag.cxx (-O3) execution test

All of these happen without the patch too (known bugs, old binutils, and
pass41-frag never seems to work anyway).

I'd like to ask for approval for the series.


Ciao,
Michael.
-- 
        * builtins.c (fold_builtin_next_arg): Handle SSA names.
        * tree-ssa-copyrename.c (rename_ssa_copies): Don't iterate
        beyond num_ssa_names, use ssa_name() directly.
        * tree-ssa-ter.c (free_temp_expr_table): Likewise.
        * tree-ssa-coalesce.c (create_outofssa_var_map): Likewise, mark
        only useful SSA names.
        (compare_pairs): Swap cost comparison.
        (coalesce_ssa_name): Don't use change_partition_var.
        * tree-nrv.c (struct nrv_data): Add modified member.
        (finalize_nrv_r): Set it.
        (tree_nrv): Use it to update statements.
        (pass_nrv): Require PROP_ssa.
        * tree-mudflap.c (create_referenced_var): New static helper.
        (mf_decl_cache_locals, mf_build_check_statement_for): Use it.
        (pass_mudflap_2): Require PROP_ssa, run ssa update at finish.
        * alias.c (find_base_decl): Handle SSA names.
        * emit-rtl.c (set_reg_attrs_for_decl_rtl): Make non-static.
        (component_ref_for_mem_expr): Don't leak SSA names into RTL.
        * rtl.h (set_reg_attrs_for_decl_rtl): Declare.
        * tree-optimize.c (pass_cleanup_cfg_post_optimizing): Rename to
        "optimized", remove unused locals at finish.
        (execute_free_datastructures): Make global, call
        delete_tree_cfg_annotations.
        (execute_free_cfg_annotations): Don't call
        delete_tree_cfg_annotations.
        * ssaexpand.h: New file.
        * expr.c (toplevel): Include ssaexpand.h.
        (expand_assignment): Handle SSA names the same as register
        variables.
        (expand_expr_real_1): Expand SSA names.
        * cfgexpand.c (toplevel): Include ssaexpand.h.
        (SA): New global variable.
        (gimple_cond_pred_to_tree): Fold TERed comparisons into predicates.
        (SSAVAR): New macro.
        (set_rtl): New helper function.
        (add_stack_var): Deal with SSA names, use set_rtl.
        (expand_one_stack_var_at): Likewise.
        (expand_one_stack_var): Deal with SSA names.
        (stack_var_size_cmp): Use code (SSA_NAME / DECL) as tie breaker
        before unique numbers.
        (expand_stack_vars): Use set_rtl.
        (expand_one_var): Accept SSA names, add asserts for them, feed them
        to above subroutines.
        (expand_used_vars): Expand all partitions (without default defs),
        then only the local decls (ignoring those expanded already).
        (expand_gimple_cond): Remove edges when jumpif() expands an
        unconditional jump.
        (expand_gimple_basic_block): Don't clear EDGE_EXECUTABLE here,
        or remove abnormal edges.
        Ignore insns setting the LHS of a TERed SSA name.
        (gimple_expand_cfg): Call into rewrite_out_of_ssa, initialize
        members of SA; deal with PARM_DECL partitions here; expand
        all PHI nodes, free tree datastructures and SA.  Commit instructions
        on edges, clear EDGE_EXECUTABLE and remove abnormal edges here.
        (pass_expand): Require and destroy PROP_ssa, verify SSA form, flow
        info and statements at start, collect garbage at finish.
        * tree-ssa-live.h (struct _var_map): Remove partition_to_var member.
        (VAR_ANN_PARTITION): Remove.
        (change_partition_var): Don't declare.
        (partition_to_var): Always return SSA names.
        (var_to_partition): Only accept SSA names.
        (register_ssa_partition): Only check argument.
        * tree-ssa-live.c (init_var_map): Don't allocate partition_to_var
        member.
        (delete_var_map): Don't free it.
        (var_union): Only accept SSA names, simplify.
        (partition_view_init): Mark only useful SSA names as used.
        (partition_view_fini): Only deal with SSA names.
        (change_partition_var): Remove.
        (dump_var_map): Use ssa_name instead of partition_to_var member.
        * tree-ssa.c (delete_tree_ssa): Don't remove PHI nodes on RTL
        basic blocks.
        * tree-outof-ssa.c (toplevel): Include ssaexpand.h and expr.h.
        (struct _elim_graph): New member const_dests; nodes member vector of
        ints.
        (set_location_for_edge): New static helper.
        (create_temp): Remove.
        (insert_partition_copy_on_edge, insert_part_to_rtx_on_edge,
        insert_value_copy_on_edge, insert_rtx_to_part_on_edge): New
        functions.
        (new_elim_graph): Allocate const_dests member.
        (clean_elim_graph): Truncate const_dests member.
        (delete_elim_graph): Free const_dests member.
        (elim_graph_size): Adapt to new type of nodes member.
        (elim_graph_add_node): Likewise.
        (eliminate_name): Likewise.
        (eliminate_build): Don't take basic block argument, deal only with
        partition numbers, not variables.
        (get_temp_reg): New static helper.
        (elim_create): Use it, deal with RTL temporaries instead of trees.
        (eliminate_phi): Adjust all calls to new signature.
        (assign_vars, replace_use_variable, replace_def_variable): Remove.
        (rewrite_trees): Only do checking.
        (edge_leader, stmt_list, leader_has_match, leader_match): Remove.
        (same_stmt_list_p, identical_copies_p, identical_stmt_lists_p,
        init_analyze_edges_for_bb, fini_analyze_edges_for_bb,
        contains_tree_r, MAX_STMTS_IN_LATCH,
        process_single_block_loop_latch, analyze_edges_for_bb,
        perform_edge_inserts): Remove.
        (expand_phi_nodes): New global function.
        (remove_ssa_form): Take ssaexpand parameter.  Don't call removed
        functions, initialize new parameter, remember partitions having a
        default def.
        (finish_out_of_ssa): New global function.
        (rewrite_out_of_ssa): Make global.  Adjust call to remove_ssa_form,
        don't reset in_ssa_p here.
        (pass_del_ssa): Remove.
        * tree-flow.h (struct var_ann_d): Remove out_of_ssa_tag and
        partition members.
        (execute_free_datastructures): Declare.
        * Makefile.in (SSAEXPAND_H): New variable.
        (tree-outof-ssa.o, expr.o, cfgexpand.o): Depend on SSAEXPAND_H.
        * basic-block.h (commit_one_edge_insertion): Declare.
        * passes.c (init_optimization_passes): Move pass_nrv and
        pass_mudflap2 before pass_cleanup_cfg_post_optimizing, remove
        pass_del_ssa, pass_free_datastructures, pass_free_cfg_annotations.
        * cfgrtl.c (commit_one_edge_insertion): Make global, don't declare.
        (redirect_branch_edge): Deal with super block when expanding, split
        out jump patching itself into ...
        (patch_jump_insn): ... here, new static helper.

Index: builtins.c
===================================================================
*** builtins.c  (revision 146576)
--- builtins.c  (working copy)
*************** fold_builtin_next_arg (tree exp, bool va
*** 11801,11806 ****
--- 11801,11809 ----
          arg = CALL_EXPR_ARG (exp, 0);
        }
  
+       if (TREE_CODE (arg) == SSA_NAME)
+         arg = SSA_NAME_VAR (arg);
+ 
        /* We destructively modify the call to be __builtin_va_start (ap, 0)
           or __builtin_next_arg (0) the first time we see it, after checking
           the arguments and if needed issuing a warning.
           */
Index: tree-ssa-copyrename.c
===================================================================
*** tree-ssa-copyrename.c       (revision 146576)
--- tree-ssa-copyrename.c       (working copy)
*************** rename_ssa_copies (void)
*** 291,297 ****
    else
      debug = NULL;
  
!   map = init_var_map (num_ssa_names + 1);
  
    FOR_EACH_BB (bb)
      {
--- 291,297 ----
    else
      debug = NULL;
  
!   map = init_var_map (num_ssa_names);
  
    FOR_EACH_BB (bb)
      {
*************** rename_ssa_copies (void)
*** 339,350 ****
    /* Now one more pass to make all elements of a partition share the same
       root variable.  */
  
!   for (x = 1; x <= num_ssa_names; x++)
      {
        part_var = partition_to_var (map, x);
        if (!part_var)
          continue;
!       var = map->partition_to_var[x];
        if (debug)
          {
            if (SSA_NAME_VAR (var) != SSA_NAME_VAR (part_var))
--- 339,350 ----
    /* Now one more pass to make all elements of a partition share the same
       root variable.  */
  
!   for (x = 1; x < num_ssa_names; x++)
      {
        part_var = partition_to_var (map, x);
        if (!part_var)
          continue;
!       var = ssa_name (x);
        if (debug)
          {
            if (SSA_NAME_VAR (var) != SSA_NAME_VAR (part_var))
Index: tree-nrv.c
===================================================================
*** tree-nrv.c  (revision 146576)
--- tree-nrv.c  (working copy)
*************** struct nrv_data
*** 56,61 ****
--- 56,62 ----
    /* This is the function's RESULT_DECL.  We will replace all occurrences
       of VAR with RESULT_DECL when we apply this optimization.  */
    tree result;
+   int modified;
  };
  
  static tree finalize_nrv_r (tree *, int *, void *);
*************** finalize_nrv_r (tree *tp, int *walk_subt
*** 83,89 ****
  
    /* Otherwise replace all occurrences of VAR with RESULT.  */
    else if (*tp == dp->var)
!     *tp = dp->result;
  
    /* Keep iterating.  */
    return NULL_TREE;
--- 84,93 ----
  
    /* Otherwise replace all occurrences of VAR with RESULT.  */
    else if (*tp == dp->var)
!     {
!       *tp = dp->result;
!       dp->modified = 1;
!     }
  
    /* Keep iterating.
     */
    return NULL_TREE;
*************** tree_nrv (void)
*** 229,241 ****
            if (gimple_assign_copy_p (stmt)
                && gimple_assign_lhs (stmt) == result
                && gimple_assign_rhs1 (stmt) == found)
!             gsi_remove (&gsi, true);
            else
              {
                struct walk_stmt_info wi;
                memset (&wi, 0, sizeof (wi));
                wi.info = &data;
                walk_gimple_op (stmt, finalize_nrv_r, &wi);
                gsi_next (&gsi);
              }
          }
--- 233,251 ----
            if (gimple_assign_copy_p (stmt)
                && gimple_assign_lhs (stmt) == result
                && gimple_assign_rhs1 (stmt) == found)
!             {
!               unlink_stmt_vdef (stmt);
!               gsi_remove (&gsi, true);
!             }
            else
              {
                struct walk_stmt_info wi;
                memset (&wi, 0, sizeof (wi));
                wi.info = &data;
+               data.modified = 0;
                walk_gimple_op (stmt, finalize_nrv_r, &wi);
+               if (data.modified)
+                 update_stmt (stmt);
                gsi_next (&gsi);
              }
          }
*************** struct gimple_opt_pass pass_nrv =
*** 263,269 ****
    NULL,                                 /* next */
    0,                                    /* static_pass_number */
    TV_TREE_NRV,                          /* tv_id */
!   PROP_cfg,                             /* properties_required */
    0,                                    /* properties_provided */
    0,                                    /* properties_destroyed */
    0,                                    /* todo_flags_start */
--- 273,279 ----
    NULL,                                 /* next */
    0,                                    /* static_pass_number */
    TV_TREE_NRV,                          /* tv_id */
!   PROP_ssa | PROP_cfg,                  /* properties_required */
    0,                                    /* properties_provided */
    0,                                    /* properties_destroyed */
    0,                                    /* todo_flags_start */
Index: expr.c
===================================================================
*** expr.c      (revision 146576)
--- expr.c      (working copy)
*************** along with GCC; see the file COPYING3.
*** 54,59 ****
--- 54,60 ----
  #include "timevar.h"
  #include "df.h"
  #include "diagnostic.h"
+ #include "ssaexpand.h"
  
  /* Decide whether a function's arguments should be processed
     from first to last or from last to first.
*************** expand_assignment (tree to, tree from, b
*** 4284,4295 ****
       Don't do this if TO is a VAR_DECL or PARM_DECL whose DECL_RTL is REG
       since it might be a promoted variable where the zero- or sign- extension
       needs to be done.  Handling this in the normal way is safe because no
!      computation is done before the call.
       */
    if (TREE_CODE (from) == CALL_EXPR && ! aggregate_value_p (from, from)
        && COMPLETE_TYPE_P (TREE_TYPE (from))
        && TREE_CODE (TYPE_SIZE (TREE_TYPE (from))) == INTEGER_CST
!       && ! ((TREE_CODE (to) == VAR_DECL || TREE_CODE (to) == PARM_DECL)
!             && REG_P (DECL_RTL (to))))
      {
        rtx value;
  
--- 4285,4297 ----
       Don't do this if TO is a VAR_DECL or PARM_DECL whose DECL_RTL is REG
       since it might be a promoted variable where the zero- or sign- extension
       needs to be done.  Handling this in the normal way is safe because no
!      computation is done before the call.  The same is true for SSA names.  */
    if (TREE_CODE (from) == CALL_EXPR && ! aggregate_value_p (from, from)
        && COMPLETE_TYPE_P (TREE_TYPE (from))
        && TREE_CODE (TYPE_SIZE (TREE_TYPE (from))) == INTEGER_CST
!       && ! (((TREE_CODE (to) == VAR_DECL || TREE_CODE (to) == PARM_DECL)
!              && REG_P (DECL_RTL (to)))
!             || TREE_CODE (to) == SSA_NAME))
      {
        rtx value;
  
*************** expand_expr_real_1 (tree exp, rtx target
*** 7223,7230 ****
        }
  
      case SSA_NAME:
!       return expand_expr_real_1 (SSA_NAME_VAR (exp), target, tmode, modifier,
!                                  NULL);
  
      case PARM_DECL:
      case VAR_DECL:
--- 7225,7245 ----
        }
  
      case SSA_NAME:
!       /* ??? ivopts calls expander, without any preparation from
!          out-of-ssa.  So fake instructions as if this was an access to the
!          base variable.  This unnecessarily allocates a pseudo, see how we can
!          reuse it, if partition base vars have it set already.  */
!       if (!currently_expanding_to_rtl)
!         return expand_expr_real_1 (SSA_NAME_VAR (exp), target, tmode, modifier, NULL);
!       {
!         gimple g = get_gimple_for_ssa_name (exp);
!         if (g)
!           return expand_expr_real_1 (gimple_assign_rhs_to_tree (g), target,
!                                      tmode, modifier, NULL);
!       }
!       decl_rtl = get_rtx_for_ssa_name (exp);
!       exp = SSA_NAME_VAR (exp);
!       goto expand_decl_rtl;
  
      case PARM_DECL:
      case VAR_DECL:
*************** expand_expr_real_1 (tree exp, rtx target
*** 7250,7255 ****
--- 7265,7271 ----
      case FUNCTION_DECL:
      case RESULT_DECL:
        decl_rtl = DECL_RTL (exp);
+     expand_decl_rtl:
        gcc_assert (decl_rtl);
        decl_rtl = copy_rtx (decl_rtl);
  
Index: alias.c
===================================================================
*** alias.c     (revision 146576)
--- alias.c     (working copy)
*************** find_base_decl (tree t)
*** 436,441 ****
--- 436,444 ----
    if (t == 0 || t == error_mark_node || ! POINTER_TYPE_P (TREE_TYPE (t)))
      return 0;
  
+   if (TREE_CODE (t) == SSA_NAME)
+     t = SSA_NAME_VAR (t);
+ 
    /* If this is a declaration, return it.  If T is based on a restrict
       qualified decl, return that decl.  */
    if (DECL_P (t))
Index: tree-ssa-coalesce.c
===================================================================
*** tree-ssa-coalesce.c (revision 146576)
--- tree-ssa-coalesce.c (working copy)
*************** compare_pairs (const void *p1, const voi
*** 314,320 ****
    const_coalesce_pair_p const *const pp2 = (const_coalesce_pair_p const *) p2;
    int result;
  
!   result = (* pp2)->cost - (* pp1)->cost;
    /* Since qsort does not guarantee stability we use the elements
       as a secondary key.  This provides us with independence from
       the host's implementation of the sorting algorithm.  */
--- 314,320 ----
    const_coalesce_pair_p const *const pp2 = (const_coalesce_pair_p const *) p2;
    int result;
  
!   result = (* pp1)->cost - (* pp2)->cost;
    /* Since qsort does not guarantee stability we use the elements
       as a secondary key.  This provides us with independence from
       the host's implementation of the sorting algorithm.  */
*************** create_outofssa_var_map (coalesce_list_p
*** 974,980 ****
    used_in_virtual_ops = BITMAP_ALLOC (NULL);
  #endif
  
!   map = init_var_map (num_ssa_names + 1);
  
    FOR_EACH_BB (bb)
      {
--- 974,980 ----
    used_in_virtual_ops = BITMAP_ALLOC (NULL);
  #endif
  
!   map = init_var_map (num_ssa_names);
  
    FOR_EACH_BB (bb)
      {
*************** create_outofssa_var_map (coalesce_list_p
*** 1126,1133 ****
    first = NULL_TREE;
    for (i = 1; i < num_ssa_names; i++)
      {
!       var = map->partition_to_var[i];
!       if (var != NULL_TREE)
          {
            /* Add coalesces between all the result decls.  */
            if (TREE_CODE (SSA_NAME_VAR (var)) == RESULT_DECL)
--- 1126,1133 ----
    first = NULL_TREE;
    for (i = 1; i < num_ssa_names; i++)
      {
!       var = ssa_name (i);
!       if (var != NULL_TREE && is_gimple_reg (var))
          {
            /* Add coalesces between all the result decls.  */
            if (TREE_CODE (SSA_NAME_VAR (var)) == RESULT_DECL)
*************** create_outofssa_var_map (coalesce_list_p
*** 1148,1154 ****
            /* Mark any default_def variables as being in the coalesce list
               since they will have to be coalesced with the base variable.  If
               not marked as present, they won't be in the coalesce view. */
!           if (gimple_default_def (cfun, SSA_NAME_VAR (var)) == var)
              bitmap_set_bit (used_in_copy, SSA_NAME_VERSION (var));
          }
      }
--- 1148,1155 ----
            /* Mark any default_def variables as being in the coalesce list
               since they will have to be coalesced with the base variable.  If
               not marked as present, they won't be in the coalesce view. */
!           if (gimple_default_def (cfun, SSA_NAME_VAR (var)) == var
!               && !has_zero_uses (var))
              bitmap_set_bit (used_in_copy, SSA_NAME_VERSION (var));
          }
      }
*************** eq_ssa_name_by_var (const void *p1, cons
*** 1329,1335 ****
  extern var_map
  coalesce_ssa_name (void)
  {
-   unsigned num, x;
    tree_live_info_p liveinfo;
    ssa_conflicts_p graph;
    coalesce_list_p cl;
--- 1330,1335 ----
*************** coalesce_ssa_name (void)
*** 1406,1436 ****
  
    /* First, coalesce all live on entry variables to their base variable.
       This will ensure the first use is coming from the correct location.
       */
  
-   num = num_var_partitions (map);
-   for (x = 0 ; x < num; x++)
-     {
-       tree var = partition_to_var (map, x);
-       tree root;
- 
-       if (TREE_CODE (var) != SSA_NAME)
-         continue;
- 
-       root = SSA_NAME_VAR (var);
-       if (gimple_default_def (cfun, root) == var)
-         {
-           /* This root variable should have not already been assigned
-              to another partition which is not coalesced with this one.  */
-           gcc_assert (!var_ann (root)->out_of_ssa_tag);
- 
-           if (dump_file && (dump_flags & TDF_DETAILS))
-             {
-               print_exprs (dump_file, "Must coalesce ", var,
-                            " with the root variable ", root, ".\n");
-             }
-           change_partition_var (map, root, x);
-         }
-     }
- 
    if (dump_file && (dump_flags & TDF_DETAILS))
      dump_var_map (dump_file, map);
  
--- 1406,1411 ----
Index: emit-rtl.c
===================================================================
*** emit-rtl.c  (revision 146576)
--- emit-rtl.c  (working copy)
*************** set_reg_attrs_for_parm (rtx parm_rtx, rt
*** 1028,1034 ****
  /* Set the REG_ATTRS for registers in value X, given that X represents
     decl T.  */
  
! static void
  set_reg_attrs_for_decl_rtl (tree t, rtx x)
  {
    if (GET_CODE (x) == SUBREG)
--- 1028,1034 ----
  /* Set the REG_ATTRS for registers in value X, given that X represents
     decl T.  */
  
! void
  set_reg_attrs_for_decl_rtl (tree t, rtx x)
  {
    if (GET_CODE (x) == SUBREG)
*************** component_ref_for_mem_expr (tree ref)
*** 1449,1455 ****
        inner = NULL_TREE;
      }
  
!   if (inner == TREE_OPERAND (ref, 0))
      return ref;
    else
      return build3 (COMPONENT_REF, TREE_TYPE (ref), inner,
--- 1449,1458 ----
        inner = NULL_TREE;
      }
  
!   if (inner == TREE_OPERAND (ref, 0)
!       /* Don't leak SSA-names in the third operand.  */
!       && (!TREE_OPERAND (ref, 2)
!           || TREE_CODE (TREE_OPERAND (ref, 2)) != SSA_NAME))
      return ref;
    else
      return build3 (COMPONENT_REF, TREE_TYPE (ref), inner,
Index: cfgexpand.c
===================================================================
*** cfgexpand.c (revision 146576)
--- cfgexpand.c (working copy)
*************** along with GCC; see the file COPYING3.
*** 42,49 **** --- 42,54 ---- #include "tree-inline.h" #include "value-prof.h" #include "target.h" + #include "ssaexpand.h" + /* This variable holds information helping the rewriting of SSA trees + into RTL. */ + struct ssaexpand SA; + /* Return an expression tree corresponding to the RHS of GIMPLE statement STMT. */ *************** gimple_assign_rhs_to_tree (gimple stmt) *** 78,85 **** static tree gimple_cond_pred_to_tree (gimple stmt) { return build2 (gimple_cond_code (stmt), boolean_type_node, ! gimple_cond_lhs (stmt), gimple_cond_rhs (stmt)); } /* Helper for gimple_to_tree. Set EXPR_LOCATION for every expression --- 83,104 ---- static tree gimple_cond_pred_to_tree (gimple stmt) { + /* We're sometimes presented with such code: + D.123_1 = x < y; + if (D.123_1 != 0) + ... + This would expand to two comparisons which then later might + be cleaned up by combine. But some pattern matchers like if-conversion + work better when there's only one compare, so make up for this + here as special exception if TER would have made the same change. */ + tree lhs = gimple_cond_lhs (stmt); + if (SA.values + && TREE_CODE (lhs) == SSA_NAME + && SA.values[SSA_NAME_VERSION (lhs)]) + lhs = gimple_assign_rhs_to_tree (SA.values[SSA_NAME_VERSION (lhs)]); + return build2 (gimple_cond_code (stmt), boolean_type_node, ! lhs, gimple_cond_rhs (stmt)); } /* Helper for gimple_to_tree. Set EXPR_LOCATION for every expression *************** failed: *** 423,428 **** --- 442,464 ---- #define STACK_ALIGNMENT_NEEDED 1 #endif + #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x) + + /* Associate declaration T with storage space X. If T is no + SSA name this is exactly SET_DECL_RTL, otherwise make the + partition of T associated with X. 
*/ + static inline void + set_rtl (tree t, rtx x) + { + if (TREE_CODE (t) == SSA_NAME) + { + SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x; + if (x && !MEM_P (x)) + set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x); + } + else + SET_DECL_RTL (t, x); + } /* This structure holds data relevant to one variable that will be placed in a stack slot. */ *************** add_stack_var (tree decl) *** 561,575 **** } stack_vars[stack_vars_num].decl = decl; stack_vars[stack_vars_num].offset = 0; ! stack_vars[stack_vars_num].size = tree_low_cst (DECL_SIZE_UNIT (decl), 1); ! stack_vars[stack_vars_num].alignb = get_decl_align_unit (decl); /* All variables are initially in their own partition. */ stack_vars[stack_vars_num].representative = stack_vars_num; stack_vars[stack_vars_num].next = EOC; /* Ensure that this decl doesn't get put onto the list twice. */ ! SET_DECL_RTL (decl, pc_rtx); stack_vars_num++; } --- 597,611 ---- } stack_vars[stack_vars_num].decl = decl; stack_vars[stack_vars_num].offset = 0; ! stack_vars[stack_vars_num].size = tree_low_cst (DECL_SIZE_UNIT (SSAVAR (decl)), 1); ! stack_vars[stack_vars_num].alignb = get_decl_align_unit (SSAVAR (decl)); /* All variables are initially in their own partition. */ stack_vars[stack_vars_num].representative = stack_vars_num; stack_vars[stack_vars_num].next = EOC; /* Ensure that this decl doesn't get put onto the list twice. */ ! set_rtl (decl, pc_rtx); stack_vars_num++; } *************** add_alias_set_conflicts (void) *** 688,709 **** } /* A subroutine of partition_stack_vars. A comparison function for qsort, ! sorting an array of indices by the size of the object. */ static int stack_var_size_cmp (const void *a, const void *b) { HOST_WIDE_INT sa = stack_vars[*(const size_t *)a].size; HOST_WIDE_INT sb = stack_vars[*(const size_t *)b].size; ! unsigned int uida = DECL_UID (stack_vars[*(const size_t *)a].decl); ! 
unsigned int uidb = DECL_UID (stack_vars[*(const size_t *)b].decl); if (sa < sb) return -1; if (sa > sb) return 1; ! /* For stack variables of the same size use the uid of the decl ! to make the sort stable. */ if (uida < uidb) return -1; if (uida > uidb) --- 724,760 ---- } /* A subroutine of partition_stack_vars. A comparison function for qsort, ! sorting an array of indices by the size and type of the object. */ static int stack_var_size_cmp (const void *a, const void *b) { HOST_WIDE_INT sa = stack_vars[*(const size_t *)a].size; HOST_WIDE_INT sb = stack_vars[*(const size_t *)b].size; ! tree decla, declb; ! unsigned int uida, uidb; if (sa < sb) return -1; if (sa > sb) return 1; ! decla = stack_vars[*(const size_t *)a].decl; ! declb = stack_vars[*(const size_t *)b].decl; ! /* For stack variables of the same size use and id of the decls ! to make the sort stable. Two SSA names are compared by their ! version, SSA names come before non-SSA names, and two normal ! decls are compared by their DECL_UID. */ ! if (TREE_CODE (decla) == SSA_NAME) ! { ! if (TREE_CODE (declb) == SSA_NAME) ! uida = SSA_NAME_VERSION (decla), uidb = SSA_NAME_VERSION (declb); ! else ! return -1; ! } ! else if (TREE_CODE (declb) == SSA_NAME) ! return 1; ! else ! uida = DECL_UID (decla), uidb = DECL_UID (declb); if (uida < uidb) return -1; if (uida > uidb) *************** expand_one_stack_var_at (tree decl, HOST *** 874,894 **** gcc_assert (offset == trunc_int_for_mode (offset, Pmode)); x = plus_constant (virtual_stack_vars_rtx, offset); ! x = gen_rtx_MEM (DECL_MODE (decl), x); ! /* Set alignment we actually gave this decl. */ ! offset -= frame_phase; ! align = offset & -offset; ! align *= BITS_PER_UNIT; ! if (align == 0) ! align = STACK_BOUNDARY; ! else if (align > MAX_SUPPORTED_STACK_ALIGNMENT) ! align = MAX_SUPPORTED_STACK_ALIGNMENT; ! DECL_ALIGN (decl) = align; ! DECL_USER_ALIGN (decl) = 0; ! set_mem_attributes (x, decl, true); ! SET_DECL_RTL (decl, x); } /* A subroutine of expand_used_vars. 
Give each partition representative --- 925,951 ---- gcc_assert (offset == trunc_int_for_mode (offset, Pmode)); x = plus_constant (virtual_stack_vars_rtx, offset); ! x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x); ! if (TREE_CODE (decl) != SSA_NAME) ! { ! /* Set alignment we actually gave this decl if it isn't an SSA name. ! If it is we generate stack slots only accidentally so it isn't as ! important, we'll simply use the alignment that is already set. */ ! offset -= frame_phase; ! align = offset & -offset; ! align *= BITS_PER_UNIT; ! if (align == 0) ! align = STACK_BOUNDARY; ! else if (align > MAX_SUPPORTED_STACK_ALIGNMENT) ! align = MAX_SUPPORTED_STACK_ALIGNMENT; ! DECL_ALIGN (decl) = align; ! DECL_USER_ALIGN (decl) = 0; ! } ! ! set_mem_attributes (x, SSAVAR (decl), true); ! set_rtl (decl, x); } /* A subroutine of expand_used_vars. Give each partition representative *************** expand_stack_vars (bool (*pred) (tree)) *** 912,918 **** /* Skip variables that have already had rtl assigned. See also add_stack_var where we perpetrate this pc_rtx hack. */ ! if (DECL_RTL (stack_vars[i].decl) != pc_rtx) continue; /* Check the predicate to see whether this variable should be --- 969,977 ---- /* Skip variables that have already had rtl assigned. See also add_stack_var where we perpetrate this pc_rtx hack. */ ! if ((TREE_CODE (stack_vars[i].decl) == SSA_NAME ! ? SA.partition_to_pseudo[var_to_partition (SA.map, stack_vars[i].decl)] ! : DECL_RTL (stack_vars[i].decl)) != pc_rtx) continue; /* Check the predicate to see whether this variable should be *************** account_stack_vars (void) *** 951,957 **** size += stack_vars[i].size; for (j = i; j != EOC; j = stack_vars[j].next) ! SET_DECL_RTL (stack_vars[j].decl, NULL); } return size; } --- 1010,1016 ---- size += stack_vars[i].size; for (j = i; j != EOC; j = stack_vars[j].next) ! 
set_rtl (stack_vars[j].decl, NULL); } return size; } *************** expand_one_stack_var (tree var) *** 964,971 **** { HOST_WIDE_INT size, offset, align; ! size = tree_low_cst (DECL_SIZE_UNIT (var), 1); ! align = get_decl_align_unit (var); offset = alloc_stack_frame_space (size, align); expand_one_stack_var_at (var, offset); --- 1023,1030 ---- { HOST_WIDE_INT size, offset, align; ! size = tree_low_cst (DECL_SIZE_UNIT (SSAVAR (var)), 1); ! align = get_decl_align_unit (SSAVAR (var)); offset = alloc_stack_frame_space (size, align); expand_one_stack_var_at (var, offset); *************** expand_one_hard_reg_var (tree var) *** 986,1005 **** static void expand_one_register_var (tree var) { ! tree type = TREE_TYPE (var); int unsignedp = TYPE_UNSIGNED (type); enum machine_mode reg_mode ! = promote_mode (type, DECL_MODE (var), &unsignedp, 0); rtx x = gen_reg_rtx (reg_mode); ! SET_DECL_RTL (var, x); /* Note if the object is a user variable. */ ! if (!DECL_ARTIFICIAL (var)) ! mark_user_reg (x); if (POINTER_TYPE_P (type)) ! mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var)))); } /* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL that --- 1045,1065 ---- static void expand_one_register_var (tree var) { ! tree decl = SSAVAR (var); ! tree type = TREE_TYPE (decl); int unsignedp = TYPE_UNSIGNED (type); enum machine_mode reg_mode ! = promote_mode (type, DECL_MODE (decl), &unsignedp, 0); rtx x = gen_reg_rtx (reg_mode); ! set_rtl (var, x); /* Note if the object is a user variable. */ ! if (!DECL_ARTIFICIAL (decl)) ! mark_user_reg (x); if (POINTER_TYPE_P (type)) ! mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type))); } /* A subroutine of expand_one_var. 
Called to assign rtl to a VAR_DECL that *************** defer_stack_allocation (tree var, bool t *** 1067,1072 **** --- 1127,1135 ---- static HOST_WIDE_INT expand_one_var (tree var, bool toplevel, bool really_expand) { + tree origvar = var; + var = SSAVAR (var); + if (SUPPORTS_STACK_ALIGNMENT && TREE_TYPE (var) != error_mark_node && TREE_CODE (var) == VAR_DECL) *************** expand_one_var (tree var, bool toplevel, *** 1092,1098 **** } } ! if (TREE_CODE (var) != VAR_DECL) ; else if (DECL_EXTERNAL (var)) ; --- 1155,1172 ---- } } ! if (TREE_CODE (origvar) == SSA_NAME) ! { ! gcc_assert (TREE_CODE (var) != VAR_DECL ! || (!DECL_EXTERNAL (var) ! && !DECL_HAS_VALUE_EXPR_P (var) ! && !TREE_STATIC (var) ! && !DECL_RTL_SET_P (var) ! && TREE_TYPE (var) != error_mark_node ! && !DECL_HARD_REGISTER (var) ! && really_expand)); ! } ! if (TREE_CODE (var) != VAR_DECL && TREE_CODE (origvar) != SSA_NAME) ; else if (DECL_EXTERNAL (var)) ; *************** expand_one_var (tree var, bool toplevel, *** 1107,1113 **** if (really_expand) expand_one_error_var (var); } ! else if (DECL_HARD_REGISTER (var)) { if (really_expand) expand_one_hard_reg_var (var); --- 1181,1187 ---- if (really_expand) expand_one_error_var (var); } ! else if (TREE_CODE (var) == VAR_DECL && DECL_HARD_REGISTER (var)) { if (really_expand) expand_one_hard_reg_var (var); *************** expand_one_var (tree var, bool toplevel, *** 1115,1128 **** else if (use_register_for_decl (var)) { if (really_expand) ! expand_one_register_var (var); } else if (defer_stack_allocation (var, toplevel)) ! add_stack_var (var); else { if (really_expand) ! expand_one_stack_var (var); return tree_low_cst (DECL_SIZE_UNIT (var), 1); } return 0; --- 1189,1202 ---- else if (use_register_for_decl (var)) { if (really_expand) ! expand_one_register_var (origvar); } else if (defer_stack_allocation (var, toplevel)) ! add_stack_var (origvar); else { if (really_expand) ! 
expand_one_stack_var (origvar); return tree_low_cst (DECL_SIZE_UNIT (var), 1); } return 0; *************** static void *** 1441,1446 **** --- 1515,1521 ---- expand_used_vars (void) { tree t, next, outer_block = DECL_INITIAL (current_function_decl); + unsigned i; /* Compute the phase of the stack frame for this function. */ { *************** expand_used_vars (void) *** 1451,1456 **** --- 1526,1553 ---- init_vars_expansion (); + for (i = 0; i < SA.map->num_partitions; i++) + { + tree var = partition_to_var (SA.map, i); + + gcc_assert (is_gimple_reg (var)); + if (TREE_CODE (SSA_NAME_VAR (var)) == VAR_DECL) + expand_one_var (var, true, true); + else + { + /* This is a PARM_DECL or RESULT_DECL. For those partitions that + contain the default def (representing the parm or result itself) + we don't do anything here. But those which don't contain the + default def (representing a temporary based on the parm/result) + we need to allocate space just like for normal VAR_DECLs. */ + if (!bitmap_bit_p (SA.partition_has_default_def, i)) + { + expand_one_var (var, true, true); + gcc_assert (SA.partition_to_pseudo[i]); + } + } + } + /* At this point all variables on the local_decls with TREE_USED set are not associated with any block scope. Lay them out. */ t = cfun->local_decls; *************** expand_used_vars (void) *** 1462,1480 **** next = TREE_CHAIN (t); /* We didn't set a block for static or extern because it's hard to tell the difference between a global variable (re)declared in a local scope, and one that's really declared there to begin with. And it doesn't really matter much, since we're not giving them stack space. Expand them now. */ ! if (TREE_STATIC (var) || DECL_EXTERNAL (var)) ! expand_now = true; ! ! /* Any variable that could have been hoisted into an SSA_NAME ! will have been propagated anywhere the optimizers chose, ! i.e. not confined to their original block. Allocate them ! as if they were defined in the outermost scope. */ ! 
else if (is_gimple_reg (var)) expand_now = true; /* If the variable is not associated with any block, then it --- 1559,1573 ---- next = TREE_CHAIN (t); + /* Expanded above already. */ + if (is_gimple_reg (var)) + ; /* We didn't set a block for static or extern because it's hard to tell the difference between a global variable (re)declared in a local scope, and one that's really declared there to begin with. And it doesn't really matter much, since we're not giving them stack space. Expand them now. */ ! else if (TREE_STATIC (var) || DECL_EXTERNAL (var)) expand_now = true; /* If the variable is not associated with any block, then it *************** expand_gimple_cond (basic_block bb, gimp *** 1674,1679 **** --- 1767,1785 ---- true_edge->goto_block = NULL; false_edge->flags |= EDGE_FALLTHRU; ggc_free (pred); + /* Special case: when jumpif decides that the condition is + trivial it emits an unconditional jump (and the necessary + barrier). But we still have two edges, the fallthru one is + wrong. purge_dead_edges would clean this up later. Unfortunately + we have to insert insns (and split edges) before + find_many_sub_basic_blocks and hence before purge_dead_edges. + But splitting edges might create new blocks which depend on the + fact that if there are two edges there's no barrier. So the + barrier would get lost and verify_flow_info would ICE. Instead + of auditing all edge splitters to care for the barrier (which + normally isn't there in a cleaned CFG), fix it here. 
*/
+       if (BARRIER_P (get_last_insn ()))
+         remove_edge (false_edge);
        return NULL;
      }
    if (true_edge->dest == bb->next_bb)
*************** expand_gimple_cond (basic_block bb, gimp
*** 1690,1695 ****
--- 1796,1803 ----
        false_edge->goto_block = NULL;
        true_edge->flags |= EDGE_FALLTHRU;
        ggc_free (pred);
+       if (BARRIER_P (get_last_insn ()))
+         remove_edge (true_edge);
        return NULL;
      }

*************** expand_gimple_basic_block (basic_block b
*** 1932,1951 ****

    NOTE_BASIC_BLOCK (note) = bb;

-   for (ei = ei_start (bb->succs); (e = ei_safe_edge (ei)); )
-     {
-       /* Clear EDGE_EXECUTABLE. This flag is never used in the backend. */
-       e->flags &= ~EDGE_EXECUTABLE;
-
-       /* At the moment not all abnormal edges match the RTL representation.
-          It is safe to remove them here as find_many_sub_basic_blocks will
-          rediscover them. In the future we should get this fixed properly. */
-       if (e->flags & EDGE_ABNORMAL)
-         remove_edge (e);
-       else
-         ei_next (&ei);
-     }
-
    for (; !gsi_end_p (gsi); gsi_next (&gsi))
      {
        gimple stmt = gsi_stmt (gsi);
--- 2040,2045 ----
*************** expand_gimple_basic_block (basic_block b
*** 1975,1981 ****
              }
            else if (gimple_code (stmt) != GIMPLE_CHANGE_DYNAMIC_TYPE)
              {
!               tree stmt_tree = gimple_to_tree (stmt);
                last = get_last_insn ();
                expand_expr_stmt (stmt_tree);
                maybe_dump_rtl_for_gimple_stmt (stmt, last);
--- 2069,2087 ----
              }
            else if (gimple_code (stmt) != GIMPLE_CHANGE_DYNAMIC_TYPE)
              {
!               def_operand_p def_p;
!               tree stmt_tree;
!               def_p = SINGLE_SSA_DEF_OPERAND (stmt, SSA_OP_DEF);
!
!               if (def_p != NULL)
!                 {
!                   /* Ignore this stmt if it is in the list of
!                      replaceable expressions. */
!                   if (SA.values
!                       && SA.values[SSA_NAME_VERSION (DEF_FROM_PTR (def_p))])
!                     continue;
!                 }
!               stmt_tree = gimple_to_tree (stmt);
                last = get_last_insn ();
                expand_expr_stmt (stmt_tree);
                maybe_dump_rtl_for_gimple_stmt (stmt, last);
*************** gimple_expand_cfg (void)
*** 2286,2291 ****
--- 2392,2402 ----
    sbitmap blocks;
    edge_iterator ei;
    edge e;
+   unsigned i;
+
+   rewrite_out_of_ssa (&SA);
+   SA.partition_to_pseudo = (rtx *)xcalloc (SA.map->num_partitions,
+                                            sizeof (rtx));

    /* Some backends want to know that we are expanding to RTL. */
    currently_expanding_to_rtl = 1;
*************** gimple_expand_cfg (void)
*** 2339,2344 ****
--- 2450,2478 ----
    /* Set up parameters and prepare for return, for the function. */
    expand_function_start (current_function_decl);

+   /* Now that we also have the parameter RTXs, copy them over to our
+      partitions. */
+   for (i = 0; i < SA.map->num_partitions; i++)
+     {
+       tree var = SSA_NAME_VAR (partition_to_var (SA.map, i));
+
+       if (TREE_CODE (var) != VAR_DECL
+           && !SA.partition_to_pseudo[i])
+         SA.partition_to_pseudo[i] = DECL_RTL_IF_SET (var);
+       gcc_assert (SA.partition_to_pseudo[i]);
+       /* Some RTL parts really want to look at DECL_RTL(x) when x
+          was a decl marked in REG_ATTR or MEM_ATTR. We could use
+          SET_DECL_RTL here making this available, but that would mean
+          to select one of the potentially many RTLs for one DECL. Instead
+          of doing that we simply reset the MEM_EXPR of the RTL in question,
+          then nobody can get at it and hence nobody can call DECL_RTL on it. */
+       if (!DECL_RTL_SET_P (var))
+         {
+           if (MEM_P (SA.partition_to_pseudo[i]))
+             set_mem_expr (SA.partition_to_pseudo[i], NULL);
+         }
+     }
+
    /* If this function is `main', emit a call to `__main'
       to run global initializers, etc. */
    if (DECL_NAME (current_function_decl)
*************** gimple_expand_cfg (void)
*** 2371,2380 ****
    /* Register rtl specific functions for cfg. */
    rtl_register_cfg_hooks ();

    init_block = construct_init_block ();

    /* Clear EDGE_EXECUTABLE on the entry edge(s). It is cleaned from the
!      remaining edges in expand_gimple_basic_block. */
    FOR_EACH_EDGE (e, ei, ENTRY_BLOCK_PTR->succs)
      e->flags &= ~EDGE_EXECUTABLE;

--- 2505,2516 ----
    /* Register rtl specific functions for cfg. */
    rtl_register_cfg_hooks ();

+   expand_phi_nodes (&SA);
+
    init_block = construct_init_block ();

    /* Clear EDGE_EXECUTABLE on the entry edge(s). It is cleaned from the
!      remaining edges later. */
    FOR_EACH_EDGE (e, ei, ENTRY_BLOCK_PTR->succs)
      e->flags &= ~EDGE_EXECUTABLE;

*************** gimple_expand_cfg (void)
*** 2382,2387 ****
--- 2518,2526 ----
    FOR_BB_BETWEEN (bb, init_block->next_bb, EXIT_BLOCK_PTR, next_bb)
      bb = expand_gimple_basic_block (bb);

+   execute_free_datastructures ();
+   finish_out_of_ssa (&SA);
+
    /* Expansion is used by optimization passes too, set maybe_hot_insn_p
       conservatively to true until they are all profile aware. */
    pointer_map_destroy (lab_rtx_for_bb);
*************** gimple_expand_cfg (void)
*** 2391,2399 ****
    set_curr_insn_block (DECL_INITIAL (current_function_decl));
    insn_locators_finalize ();

-   /* We're done expanding trees to RTL. */
-   currently_expanding_to_rtl = 0;
-
    /* Convert tree EH labels to RTL EH labels and zap the tree EH table. */
    convert_from_eh_region_ranges ();
    set_eh_throw_stmt_table (cfun, NULL);
--- 2530,2535 ----
*************** gimple_expand_cfg (void)
*** 2401,2411 ****
    rebuild_jump_labels (get_insns ());
    find_exception_handler_labels ();

    blocks = sbitmap_alloc (last_basic_block);
    sbitmap_ones (blocks);
    find_many_sub_basic_blocks (blocks);
-   purge_all_dead_edges ();
    sbitmap_free (blocks);

    compact_blocks ();

--- 2537,2584 ----
    rebuild_jump_labels (get_insns ());
    find_exception_handler_labels ();

+   FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, EXIT_BLOCK_PTR, next_bb)
+     {
+       edge e;
+       edge_iterator ei;
+       for (ei = ei_start (bb->succs); (e = ei_safe_edge (ei)); )
+         {
+           if (e->insns.r)
+             commit_one_edge_insertion (e);
+           else
+             ei_next (&ei);
+         }
+     }
+
+   /* We're done expanding trees to RTL. */
+   currently_expanding_to_rtl = 0;
+
+   FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR->next_bb, EXIT_BLOCK_PTR, next_bb)
+     {
+       edge e;
+       edge_iterator ei;
+       for (ei = ei_start (bb->succs); (e = ei_safe_edge (ei)); )
+         {
+           /* Clear EDGE_EXECUTABLE. This flag is never used in the backend. */
+           e->flags &= ~EDGE_EXECUTABLE;
+
+           /* At the moment not all abnormal edges match the RTL
+              representation. It is safe to remove them here as
+              find_many_sub_basic_blocks will rediscover them.
+              In the future we should get this fixed properly. */
+           if ((e->flags & EDGE_ABNORMAL)
+               && !(e->flags & EDGE_SIBCALL))
+             remove_edge (e);
+           else
+             ei_next (&ei);
+         }
+     }
+
    blocks = sbitmap_alloc (last_basic_block);
    sbitmap_ones (blocks);
    find_many_sub_basic_blocks (blocks);
    sbitmap_free (blocks);
+   purge_all_dead_edges ();

    compact_blocks ();

*************** struct rtl_opt_pass pass_expand =
*** 2471,2480 ****
    0,                                    /* static_pass_number */
    TV_EXPAND,                            /* tv_id */
    /* ??? If TER is enabled, we actually receive GENERIC. */
!   PROP_gimple_leh | PROP_cfg,           /* properties_required */
    PROP_rtl,                             /* properties_provided */
!   PROP_trees,                           /* properties_destroyed */
!   0,                                    /* todo_flags_start */
!   TODO_dump_func,                       /* todo_flags_finish */
   }
  };
--- 2644,2655 ----
    0,                                    /* static_pass_number */
    TV_EXPAND,                            /* tv_id */
    /* ??? If TER is enabled, we actually receive GENERIC. */
!   PROP_ssa | PROP_gimple_leh | PROP_cfg, /* properties_required */
    PROP_rtl,                             /* properties_provided */
!   PROP_ssa | PROP_trees,                /* properties_destroyed */
!   TODO_verify_ssa | TODO_verify_flow
!     | TODO_verify_stmts,                /* todo_flags_start */
!   TODO_dump_func
!   | TODO_ggc_collect                    /* todo_flags_finish */
   }
  };
Index: tree-ssa-live.c
===================================================================
*** tree-ssa-live.c (revision 146576)
--- tree-ssa-live.c (working copy)
*************** init_var_map (int size)
*** 136,144 ****

    map = (var_map) xmalloc (sizeof (struct _var_map));
    map->var_partition = partition_new (size);
-   map->partition_to_var
-               = (tree *)xmalloc (size * sizeof (tree));
-   memset (map->partition_to_var, 0, size * sizeof (tree));

    map->partition_to_view = NULL;
    map->view_to_partition = NULL;
--- 136,141 ----
*************** void
*** 157,163 ****
  delete_var_map (var_map map)
  {
    var_map_base_fini (map);
-   free (map->partition_to_var);
    partition_delete (map->var_partition);
    if (map->partition_to_view)
      free (map->partition_to_view);
--- 154,159 ----
*************** int
*** 175,215 ****
  var_union (var_map map, tree var1, tree var2)
  {
    int p1, p2, p3;
!   tree root_var = NULL_TREE;
!   tree other_var = NULL_TREE;

    /* This is independent of partition_to_view. If partition_to_view is
       on, then whichever one of these partitions is absorbed will never have a
       dereference into the partition_to_view array any more. */

!   if (TREE_CODE (var1) == SSA_NAME)
!     p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
!   else
!     {
!       p1 = var_to_partition (map, var1);
!       if (map->view_to_partition)
!         p1 = map->view_to_partition[p1];
!       root_var = var1;
!     }
!
!   if (TREE_CODE (var2) == SSA_NAME)
!     p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
!   else
!     {
!       p2 = var_to_partition (map, var2);
!       if (map->view_to_partition)
!         p2 = map->view_to_partition[p2];
!
!       /* If there is no root_var set, or it's not a user variable, set the
!          root_var to this one. */
!       if (!root_var || (DECL_P (root_var) && DECL_IGNORED_P (root_var)))
!         {
!           other_var = root_var;
!           root_var = var2;
!         }
!       else
!         other_var = var2;
!     }

    gcc_assert (p1 != NO_PARTITION);
    gcc_assert (p2 != NO_PARTITION);
--- 171,186 ----
  var_union (var_map map, tree var1, tree var2)
  {
    int p1, p2, p3;
!
!   gcc_assert (TREE_CODE (var1) == SSA_NAME);
!   gcc_assert (TREE_CODE (var2) == SSA_NAME);

    /* This is independent of partition_to_view. If partition_to_view is
       on, then whichever one of these partitions is absorbed will never have a
       dereference into the partition_to_view array any more. */

!   p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
!   p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));

    gcc_assert (p1 != NO_PARTITION);
    gcc_assert (p2 != NO_PARTITION);
*************** var_union (var_map map, tree var1, tree
*** 222,232 ****
    if (map->partition_to_view)
      p3 = map->partition_to_view[p3];

-   if (root_var)
-     change_partition_var (map, root_var, p3);
-   if (other_var)
-     change_partition_var (map, other_var, p3);
-
    return p3;
  }

--- 193,198 ----
*************** partition_view_init (var_map map)
*** 278,284 ****
    for (x = 0; x < map->partition_size; x++)
      {
        tmp = partition_find (map->var_partition, x);
!       if (map->partition_to_var[tmp] != NULL_TREE && !bitmap_bit_p (used, tmp))
          bitmap_set_bit (used, tmp);
      }

--- 244,252 ----
    for (x = 0; x < map->partition_size; x++)
      {
        tmp = partition_find (map->var_partition, x);
!       if (ssa_name (tmp) != NULL_TREE && is_gimple_reg (ssa_name (tmp))
!           && (!has_zero_uses (ssa_name (tmp))
!               || !SSA_NAME_IS_DEFAULT_DEF (ssa_name (tmp))))
          bitmap_set_bit (used, tmp);
      }

*************** partition_view_fini (var_map map, bitmap
*** 297,303 ****
  {
    bitmap_iterator bi;
    unsigned count, i, x, limit;
-   tree var;

    gcc_assert (selected);

--- 265,270 ----
*************** partition_view_fini (var_map map, bitmap
*** 317,327 ****
          {
            map->partition_to_view[x] = i;
            map->view_to_partition[i] = x;
-           var = map->partition_to_var[x];
-           /* If any one of the members of a partition is not an SSA_NAME, make
-              sure it is the representative. */
-           if (TREE_CODE (var) != SSA_NAME)
-             change_partition_var (map, var, i);
            i++;
          }
        gcc_assert (i == count);
--- 284,289 ----
*************** partition_view_bitmap (var_map map, bitm
*** 379,403 ****
  }


- /* This function is used to change the representative variable in MAP for VAR's
-    partition to a regular non-ssa variable. This allows partitions to be
-    mapped back to real variables. */
-
- void
- change_partition_var (var_map map, tree var, int part)
- {
-   var_ann_t ann;
-
-   gcc_assert (TREE_CODE (var) != SSA_NAME);
-
-   ann = var_ann (var);
-   ann->out_of_ssa_tag = 1;
-   VAR_ANN_PARTITION (ann) = part;
-   if (map->view_to_partition)
-     map->partition_to_var[map->view_to_partition[part]] = var;
- }
-
-
  static inline void mark_all_vars_used (tree *, void *data);

  /* Helper function for mark_all_vars_used, called via walk_tree. */
--- 341,346 ----
*************** dump_var_map (FILE *f, var_map map)
*** 1105,1111 ****
        else
          p = x;

!       if (map->partition_to_var[p] == NULL_TREE)
          continue;

        t = 0;
--- 1048,1054 ----
        else
          p = x;

!       if (ssa_name (p) == NULL_TREE)
          continue;

        t = 0;
Index: tree-ssa-live.h
===================================================================
*** tree-ssa-live.h (revision 146576)
--- tree-ssa-live.h (working copy)
*************** typedef struct _var_map
*** 60,68 ****
    int *partition_to_view;
    int *view_to_partition;

-   /* Mapping of partition numbers to variables. */
-   tree *partition_to_var;
-
    /* Current number of partitions in var_map based on the current view. */
    unsigned int num_partitions;

--- 60,65 ----
*************** typedef struct _var_map
*** 80,87 ****
  } *var_map;


- /* Partition number of a non ssa-name variable. */
- #define VAR_ANN_PARTITION(ann) (ann->partition)
  /* Index to the basevar table of a non ssa-name variable.
*/ #define VAR_ANN_BASE_INDEX(ann) (ann->base_index) --- 77,82 ---- *************** extern var_map init_var_map (int); *** 93,99 **** extern void delete_var_map (var_map); extern void dump_var_map (FILE *, var_map); extern int var_union (var_map, tree, tree); - extern void change_partition_var (var_map, tree, int); extern void partition_view_normal (var_map, bool); extern void partition_view_bitmap (var_map, bitmap, bool); #ifdef ENABLE_CHECKING --- 88,93 ---- *************** num_var_partitions (var_map map) *** 116,125 **** static inline tree partition_to_var (var_map map, int i) { if (map->view_to_partition) i = map->view_to_partition[i]; i = partition_find (map->var_partition, i); ! return map->partition_to_var[i]; } --- 110,121 ---- static inline tree partition_to_var (var_map map, int i) { + tree name; if (map->view_to_partition) i = map->view_to_partition[i]; i = partition_find (map->var_partition, i); ! name = ssa_name (i); ! return name; } *************** version_to_var (var_map map, int version *** 146,168 **** static inline int var_to_partition (var_map map, tree var) { - var_ann_t ann; int part; ! if (TREE_CODE (var) == SSA_NAME) ! { ! part = partition_find (map->var_partition, SSA_NAME_VERSION (var)); ! if (map->partition_to_view) ! part = map->partition_to_view[part]; ! } ! else ! { ! ann = var_ann (var); ! if (ann && ann->out_of_ssa_tag) ! part = VAR_ANN_PARTITION (ann); ! else ! part = NO_PARTITION; ! } return part; } --- 142,153 ---- static inline int var_to_partition (var_map map, tree var) { int part; ! gcc_assert (TREE_CODE (var) == SSA_NAME); ! part = partition_find (map->var_partition, SSA_NAME_VERSION (var)); ! if (map->partition_to_view) ! part = map->partition_to_view[part]; return part; } *************** num_basevars (var_map map) *** 207,223 **** partitions may be filtered out by a view later. */ static inline void ! 
register_ssa_partition (var_map map, tree ssa_var) { - int version; - #if defined ENABLE_CHECKING register_ssa_partition_check (ssa_var); #endif - - version = SSA_NAME_VERSION (ssa_var); - if (map->partition_to_var[version] == NULL_TREE) - map->partition_to_var[version] = ssa_var; } --- 192,202 ---- partitions may be filtered out by a view later. */ static inline void ! register_ssa_partition (var_map map ATTRIBUTE_UNUSED, tree ssa_var) { #if defined ENABLE_CHECKING register_ssa_partition_check (ssa_var); #endif } Index: tree-mudflap.c =================================================================== *** tree-mudflap.c (revision 146576) --- tree-mudflap.c (working copy) *************** execute_mudflap_function_ops (void) *** 447,452 **** --- 447,465 ---- return 0; } + /* Construct a new temporary variable with TYPE + as type and PREFIX as name prefix, add it to referenced vars + and mark it for renaming. */ + + static tree + create_referenced_var (tree type, const char *prefix) + { + tree var = create_tmp_var (type, prefix); + add_referenced_var (var); + mark_sym_for_renaming (var); + return var; + } + /* Create and initialize local shadow variables for the lookup cache globals. Put their decls in the *_l globals for use by mf_build_check_statement_for. */ *************** mf_decl_cache_locals (void) *** 459,469 **** /* Build the cache vars. */ mf_cache_shift_decl_l ! = mf_mark (create_tmp_var (TREE_TYPE (mf_cache_shift_decl), "__mf_lookup_shift_l")); mf_cache_mask_decl_l ! = mf_mark (create_tmp_var (TREE_TYPE (mf_cache_mask_decl), "__mf_lookup_mask_l")); /* Build initialization nodes for the cache vars. We just load the --- 472,482 ---- /* Build the cache vars. */ mf_cache_shift_decl_l ! = mf_mark (create_referenced_var (TREE_TYPE (mf_cache_shift_decl), "__mf_lookup_shift_l")); mf_cache_mask_decl_l ! = mf_mark (create_referenced_var (TREE_TYPE (mf_cache_mask_decl), "__mf_lookup_mask_l")); /* Build initialization nodes for the cache vars. 
We just load the *************** mf_build_check_statement_for (tree base, *** 546,554 **** } /* Build our local variables. */ ! mf_elem = create_tmp_var (mf_cache_structptr_type, "__mf_elem"); ! mf_base = create_tmp_var (mf_uintptr_type, "__mf_base"); ! mf_limit = create_tmp_var (mf_uintptr_type, "__mf_limit"); /* Build: __mf_base = (uintptr_t) <base address expression>. */ seq = gimple_seq_alloc (); --- 559,567 ---- } /* Build our local variables. */ ! mf_elem = create_referenced_var (mf_cache_structptr_type, "__mf_elem"); ! mf_base = create_referenced_var (mf_uintptr_type, "__mf_base"); ! mf_limit = create_referenced_var (mf_uintptr_type, "__mf_limit"); /* Build: __mf_base = (uintptr_t) <base address expression>. */ seq = gimple_seq_alloc (); *************** mf_build_check_statement_for (tree base, *** 627,633 **** t = build2 (TRUTH_OR_EXPR, boolean_type_node, t, u); t = force_gimple_operand (t, &stmts, false, NULL_TREE); gimple_seq_add_seq (&seq, stmts); ! cond = create_tmp_var (boolean_type_node, "__mf_unlikely_cond"); g = gimple_build_assign (cond, t); gimple_set_location (g, location); gimple_seq_add_stmt (&seq, g); --- 640,646 ---- t = build2 (TRUTH_OR_EXPR, boolean_type_node, t, u); t = force_gimple_operand (t, &stmts, false, NULL_TREE); gimple_seq_add_seq (&seq, stmts); ! cond = create_referenced_var (boolean_type_node, "__mf_unlikely_cond"); g = gimple_build_assign (cond, t); gimple_set_location (g, location); gimple_seq_add_stmt (&seq, g); *************** struct gimple_opt_pass pass_mudflap_2 = *** 1366,1377 **** NULL, /* next */ 0, /* static_pass_number */ TV_NONE, /* tv_id */ ! PROP_gimple_leh, /* properties_required */ 0, /* properties_provided */ 0, /* properties_destroyed */ 0, /* todo_flags_start */ TODO_verify_flow | TODO_verify_stmts ! | TODO_dump_func /* todo_flags_finish */ } }; --- 1379,1390 ---- NULL, /* next */ 0, /* static_pass_number */ TV_NONE, /* tv_id */ ! 
PROP_ssa | PROP_cfg | PROP_gimple_leh,/* properties_required */ 0, /* properties_provided */ 0, /* properties_destroyed */ 0, /* todo_flags_start */ TODO_verify_flow | TODO_verify_stmts ! | TODO_dump_func | TODO_update_ssa /* todo_flags_finish */ } }; Index: tree-ssa-ter.c =================================================================== *** tree-ssa-ter.c (revision 146576) --- tree-ssa-ter.c (working copy) *************** free_temp_expr_table (temp_expr_table_p *** 225,231 **** unsigned x; for (x = 0; x <= num_var_partitions (t->map); x++) gcc_assert (!t->kill_list[x]); ! for (x = 0; x < num_ssa_names + 1; x++) { gcc_assert (t->expr_decl_uids[x] == NULL); gcc_assert (t->partition_dependencies[x] == NULL); --- 225,231 ---- unsigned x; for (x = 0; x <= num_var_partitions (t->map); x++) gcc_assert (!t->kill_list[x]); ! for (x = 0; x < num_ssa_names; x++) { gcc_assert (t->expr_decl_uids[x] == NULL); gcc_assert (t->partition_dependencies[x] == NULL); Index: tree-ssa.c =================================================================== *** tree-ssa.c (revision 146576) --- tree-ssa.c (working copy) *************** delete_tree_ssa (void) *** 844,850 **** gimple_set_modified (stmt, true); } ! set_phi_nodes (bb, NULL); } /* Remove annotations from every referenced local variable. */ --- 844,851 ---- gimple_set_modified (stmt, true); } ! if (!(bb->flags & BB_RTL)) ! set_phi_nodes (bb, NULL); } /* Remove annotations from every referenced local variable. 
*/
Index: rtl.h
===================================================================
*** rtl.h (revision 146576)
--- rtl.h (working copy)
*************** extern rtx gen_int_mode (HOST_WIDE_INT,
*** 1494,1499 ****
--- 1494,1500 ----
  extern rtx emit_copy_of_insn_after (rtx, rtx);
  extern void set_reg_attrs_from_value (rtx, rtx);
  extern void set_reg_attrs_for_parm (rtx, rtx);
+ extern void set_reg_attrs_for_decl_rtl (tree t, rtx x);
  extern void adjust_reg_mode (rtx, enum machine_mode);
  extern int mem_expr_equal_p (const_tree, const_tree);

Index: tree-optimize.c
===================================================================
*** tree-optimize.c (revision 146576)
--- tree-optimize.c (working copy)
*************** struct gimple_opt_pass pass_cleanup_cfg_
*** 201,207 ****
  {
   {
    GIMPLE_PASS,
!   "final_cleanup",                      /* name */
    NULL,                                 /* gate */
    execute_cleanup_cfg_post_optimizing,  /* execute */
    NULL,                                 /* sub */
--- 201,207 ----
  {
   {
    GIMPLE_PASS,
!   "optimized",                          /* name */
    NULL,                                 /* gate */
    execute_cleanup_cfg_post_optimizing,  /* execute */
    NULL,                                 /* sub */
*************** struct gimple_opt_pass pass_cleanup_cfg_
*** 213,225 ****
    0,                                    /* properties_destroyed */
    0,                                    /* todo_flags_start */
    TODO_dump_func                        /* todo_flags_finish */
   }
  };

  /* Pass: do the actions required to finish with tree-ssa optimization
     passes. */

! static unsigned int
  execute_free_datastructures (void)
  {
    free_dominance_info (CDI_DOMINATORS);
--- 213,226 ----
    0,                                    /* properties_destroyed */
    0,                                    /* todo_flags_start */
    TODO_dump_func                        /* todo_flags_finish */
+     | TODO_remove_unused_locals
   }
  };

  /* Pass: do the actions required to finish with tree-ssa optimization
     passes. */

! unsigned int
  execute_free_datastructures (void)
  {
    free_dominance_info (CDI_DOMINATORS);
*************** execute_free_datastructures (void)
*** 228,233 ****
--- 229,238 ----
    /* Remove the ssa structures. */
    if (cfun->gimple_df)
      delete_tree_ssa ();
+
+   /* And get rid of annotations we no longer need.
*/ + delete_tree_cfg_annotations (); + return 0; } *************** struct gimple_opt_pass pass_free_datastr *** 254,262 **** static unsigned int execute_free_cfg_annotations (void) { - /* And get rid of annotations we no longer need. */ - delete_tree_cfg_annotations (); - return 0; } --- 259,264 ---- Index: tree-outof-ssa.c =================================================================== *** tree-outof-ssa.c (revision 146576) --- tree-outof-ssa.c (working copy) *************** along with GCC; see the file COPYING3. *** 30,38 **** #include "tree-flow.h" #include "timevar.h" #include "tree-dump.h" - #include "tree-ssa-live.h" #include "tree-pass.h" #include "toplev.h" /* Used to hold all the components required to do SSA PHI elimination. --- 30,39 ---- #include "tree-flow.h" #include "timevar.h" #include "tree-dump.h" #include "tree-pass.h" #include "toplev.h" + #include "expr.h" + #include "ssaexpand.h" /* Used to hold all the components required to do SSA PHI elimination. *************** typedef struct _elim_graph { *** 61,67 **** int size; /* List of nodes in the elimination graph. */ ! VEC(tree,heap) *nodes; /* The predecessor and successor edge list. */ VEC(int,heap) *edge_list; --- 62,68 ---- int size; /* List of nodes in the elimination graph. */ ! VEC(int,heap) *nodes; /* The predecessor and successor edge list. */ VEC(int,heap) *edge_list; *************** typedef struct _elim_graph { *** 79,163 **** edge e; /* List of constant copies to emit. These are pushed on in pairs. */ VEC(tree,heap) *const_copies; } *elim_graph; ! /* Create a temporary variable based on the type of variable T. Use T's name ! as the prefix. */ ! static tree ! create_temp (tree t) { ! tree tmp; ! const char *name = NULL; ! tree type; ! ! if (TREE_CODE (t) == SSA_NAME) ! t = SSA_NAME_VAR (t); ! ! gcc_assert (TREE_CODE (t) == VAR_DECL || TREE_CODE (t) == PARM_DECL); ! type = TREE_TYPE (t); ! tmp = DECL_NAME (t); ! if (tmp) ! name = IDENTIFIER_POINTER (tmp); ! if (name == NULL) ! 
name = "temp"; ! tmp = create_tmp_var (type, name); ! if (DECL_DEBUG_EXPR_IS_FROM (t) && DECL_DEBUG_EXPR (t)) { ! SET_DECL_DEBUG_EXPR (tmp, DECL_DEBUG_EXPR (t)); ! DECL_DEBUG_EXPR_IS_FROM (tmp) = 1; } ! else if (!DECL_IGNORED_P (t)) { ! SET_DECL_DEBUG_EXPR (tmp, t); ! DECL_DEBUG_EXPR_IS_FROM (tmp) = 1; } - DECL_ARTIFICIAL (tmp) = DECL_ARTIFICIAL (t); - DECL_IGNORED_P (tmp) = DECL_IGNORED_P (t); - DECL_GIMPLE_REG_P (tmp) = DECL_GIMPLE_REG_P (t); - add_referenced_var (tmp); ! /* We should never have copied variables in non-automatic storage ! or variables that have their address taken. So it is pointless ! to try to copy call-clobber state here. */ ! gcc_assert (!may_be_aliased (t) && !is_global_var (t)); ! return tmp; ! } ! /* This helper function fill insert a copy from a constant or variable SRC to ! variable DEST on edge E. */ static void ! insert_copy_on_edge (edge e, tree dest, tree src) { ! gimple copy; ! copy = gimple_build_assign (dest, src); ! set_is_used (dest); ! if (TREE_CODE (src) == ADDR_EXPR) ! src = TREE_OPERAND (src, 0); ! if (TREE_CODE (src) == VAR_DECL || TREE_CODE (src) == PARM_DECL) ! set_is_used (src); if (dump_file && (dump_flags & TDF_DETAILS)) { fprintf (dump_file, ! "Inserting a copy on edge BB%d->BB%d :", e->src->index, e->dest->index); ! print_gimple_stmt (dump_file, copy, 0, dump_flags); ! fprintf (dump_file, "\n"); } ! gsi_insert_on_edge (e, copy); } --- 80,255 ---- edge e; /* List of constant copies to emit. These are pushed on in pairs. */ + VEC(int,heap) *const_dests; VEC(tree,heap) *const_copies; } *elim_graph; ! /* For an edge E find out a good source location to associate with ! instructions inserted on edge E. If E has an implicit goto set, ! use its location. Otherwise search instructions in predecessors ! of E for a location, and use that one. That makes sense because ! we insert on edges for PHI nodes, and effects of PHIs happen on ! the end of the predecessor conceptually. */ ! static void ! set_location_for_edge (edge e) { ! 
if (e->goto_locus) ! { ! set_curr_insn_source_location (e->goto_locus); ! set_curr_insn_block (e->goto_block); ! } ! else ! { ! basic_block bb = e->src; ! gimple_stmt_iterator gsi; ! do ! { ! for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi); gsi_prev (&gsi)) ! { ! gimple stmt = gsi_stmt (gsi); ! if (gimple_has_location (stmt) || gimple_block (stmt)) ! { ! set_curr_insn_source_location (gimple_location (stmt)); ! set_curr_insn_block (gimple_block (stmt)); ! return; ! } ! } ! /* Nothing found in this basic block. Make a half-assed attempt ! to continue with another block. */ ! if (single_pred_p (bb)) ! bb = single_pred (bb); ! else ! bb = e->src; ! } ! while (bb != e->src); ! } ! } ! /* Insert a copy instruction from partition SRC to DEST onto edge E. */ ! static void ! insert_partition_copy_on_edge (edge e, int dest, int src) ! { ! rtx seq; ! if (dump_file && (dump_flags & TDF_DETAILS)) { ! fprintf (dump_file, ! "Inserting a partition copy on edge BB%d->BB%d :" ! "PART.%d = PART.%d", ! e->src->index, ! e->dest->index, dest, src); ! fprintf (dump_file, "\n"); } ! ! gcc_assert (SA.partition_to_pseudo[dest]); ! gcc_assert (SA.partition_to_pseudo[src]); ! ! set_location_for_edge (e); ! ! /* Partition copy between same base variables only, so it's the same mode, ! hence we can use emit_move_insn. */ ! start_sequence (); ! emit_move_insn (SA.partition_to_pseudo[dest], SA.partition_to_pseudo[src]); ! seq = get_insns (); ! end_sequence (); ! ! insert_insn_on_edge (seq, e); ! } ! ! /* Insert a copy instruction from expression SRC to partition DEST ! onto edge E. */ ! ! static void ! insert_value_copy_on_edge (edge e, int dest, tree src) ! { ! rtx seq, x; ! enum machine_mode mode; ! if (dump_file && (dump_flags & TDF_DETAILS)) { ! fprintf (dump_file, ! "Inserting a value copy on edge BB%d->BB%d : PART.%d = ", ! e->src->index, ! e->dest->index, dest); ! print_generic_expr (dump_file, src, TDF_SLIM); ! fprintf (dump_file, "\n"); } ! gcc_assert (SA.partition_to_pseudo[dest]); ! 
set_location_for_edge (e); + start_sequence (); + mode = GET_MODE (SA.partition_to_pseudo[dest]); + x = expand_expr (src, SA.partition_to_pseudo[dest], mode, EXPAND_NORMAL); + if (GET_MODE (x) != mode) + x = convert_to_mode (mode, x, TYPE_UNSIGNED (TREE_TYPE (src))); + if (x != SA.partition_to_pseudo[dest]) + emit_move_insn (SA.partition_to_pseudo[dest], x); + seq = get_insns (); + end_sequence (); ! insert_insn_on_edge (seq, e); ! } ! ! /* Insert a copy instruction from RTL expression SRC to partition DEST ! onto edge E. */ static void ! insert_rtx_to_part_on_edge (edge e, int dest, rtx src) { ! rtx seq; ! if (dump_file && (dump_flags & TDF_DETAILS)) ! { ! fprintf (dump_file, ! "Inserting a temp copy on edge BB%d->BB%d : PART.%d = ", ! e->src->index, ! e->dest->index, dest); ! print_simple_rtl (dump_file, src); ! fprintf (dump_file, "\n"); ! } ! ! gcc_assert (SA.partition_to_pseudo[dest]); ! set_location_for_edge (e); ! start_sequence (); ! gcc_assert (GET_MODE (src) == GET_MODE (SA.partition_to_pseudo[dest])); ! emit_move_insn (SA.partition_to_pseudo[dest], src); ! seq = get_insns (); ! end_sequence (); ! insert_insn_on_edge (seq, e); ! } ! ! /* Insert a copy instruction from partition SRC to RTL lvalue DEST ! onto edge E. */ + static void + insert_part_to_rtx_on_edge (edge e, rtx dest, int src) + { + rtx seq; if (dump_file && (dump_flags & TDF_DETAILS)) { fprintf (dump_file, ! "Inserting a temp copy on edge BB%d->BB%d : ", e->src->index, e->dest->index); ! print_simple_rtl (dump_file, dest); ! fprintf (dump_file, "= PART.%d\n", src); } ! gcc_assert (SA.partition_to_pseudo[src]); ! set_location_for_edge (e); ! ! start_sequence (); ! gcc_assert (GET_MODE (dest) == GET_MODE (SA.partition_to_pseudo[src])); ! emit_move_insn (dest, SA.partition_to_pseudo[src]); ! seq = get_insns (); ! end_sequence (); ! ! insert_insn_on_edge (seq, e); } *************** new_elim_graph (int size) *** 169,175 **** { elim_graph g = (elim_graph) xmalloc (sizeof (struct _elim_graph)); ! 
g->nodes = VEC_alloc (tree, heap, 30); g->const_copies = VEC_alloc (tree, heap, 20); g->edge_list = VEC_alloc (int, heap, 20); g->stack = VEC_alloc (int, heap, 30); --- 261,268 ---- { elim_graph g = (elim_graph) xmalloc (sizeof (struct _elim_graph)); ! g->nodes = VEC_alloc (int, heap, 30); ! g->const_dests = VEC_alloc (int, heap, 20); g->const_copies = VEC_alloc (tree, heap, 20); g->edge_list = VEC_alloc (int, heap, 20); g->stack = VEC_alloc (int, heap, 30); *************** new_elim_graph (int size) *** 185,191 **** static inline void clear_elim_graph (elim_graph g) { ! VEC_truncate (tree, g->nodes, 0); VEC_truncate (int, g->edge_list, 0); } --- 278,284 ---- static inline void clear_elim_graph (elim_graph g) { ! VEC_truncate (int, g->nodes, 0); VEC_truncate (int, g->edge_list, 0); } *************** delete_elim_graph (elim_graph g) *** 199,205 **** VEC_free (int, heap, g->stack); VEC_free (int, heap, g->edge_list); VEC_free (tree, heap, g->const_copies); ! VEC_free (tree, heap, g->nodes); free (g); } --- 292,299 ---- VEC_free (int, heap, g->stack); VEC_free (int, heap, g->edge_list); VEC_free (tree, heap, g->const_copies); ! VEC_free (int, heap, g->const_dests); ! VEC_free (int, heap, g->nodes); free (g); } *************** delete_elim_graph (elim_graph g) *** 209,230 **** static inline int elim_graph_size (elim_graph g) { ! return VEC_length (tree, g->nodes); } /* Add NODE to graph G, if it doesn't exist already. */ static inline void ! elim_graph_add_node (elim_graph g, tree node) { int x; ! tree t; ! for (x = 0; VEC_iterate (tree, g->nodes, x, t); x++) if (t == node) return; ! VEC_safe_push (tree, heap, g->nodes, node); } --- 303,324 ---- static inline int elim_graph_size (elim_graph g) { ! return VEC_length (int, g->nodes); } /* Add NODE to graph G, if it doesn't exist already. */ static inline void ! elim_graph_add_node (elim_graph g, int node) { int x; ! int t; ! for (x = 0; VEC_iterate (int, g->nodes, x, t); x++) if (t == node) return; ! 
VEC_safe_push (int, heap, g->nodes, node); } *************** do { \ *** 299,305 **** /* Add T to elimination graph G. */ static inline void ! eliminate_name (elim_graph g, tree T) { elim_graph_add_node (g, T); } --- 393,399 ---- /* Add T to elimination graph G. */ static inline void ! eliminate_name (elim_graph g, int T) { elim_graph_add_node (g, T); } *************** eliminate_name (elim_graph g, tree T) *** 309,330 **** G->e. */ static void ! eliminate_build (elim_graph g, basic_block B) { ! tree T0, Ti; int p0, pi; gimple_stmt_iterator gsi; clear_elim_graph (g); ! for (gsi = gsi_start_phis (B); !gsi_end_p (gsi); gsi_next (&gsi)) { gimple phi = gsi_stmt (gsi); ! T0 = var_to_partition_to_var (g->map, gimple_phi_result (phi)); ! /* Ignore results which are not in partitions. */ ! if (T0 == NULL_TREE) continue; Ti = PHI_ARG_DEF (phi, g->e->dest_idx); --- 403,423 ---- G->e. */ static void ! eliminate_build (elim_graph g) { ! tree Ti; int p0, pi; gimple_stmt_iterator gsi; clear_elim_graph (g); ! for (gsi = gsi_start_phis (g->e->dest); !gsi_end_p (gsi); gsi_next (&gsi)) { gimple phi = gsi_stmt (gsi); ! p0 = var_to_partition (g->map, gimple_phi_result (phi)); /* Ignore results which are not in partitions. */ ! if (p0 == NO_PARTITION) continue; Ti = PHI_ARG_DEF (phi, g->e->dest_idx); *************** eliminate_build (elim_graph g, basic_blo *** 338,355 **** { /* Save constant copies until all other copies have been emitted on this edge. */ ! VEC_safe_push (tree, heap, g->const_copies, T0); VEC_safe_push (tree, heap, g->const_copies, Ti); } else { ! Ti = var_to_partition_to_var (g->map, Ti); ! if (T0 != Ti) { ! eliminate_name (g, T0); ! eliminate_name (g, Ti); ! p0 = var_to_partition (g->map, T0); ! pi = var_to_partition (g->map, Ti); elim_graph_add_edge (g, p0, pi); } } --- 431,446 ---- { /* Save constant copies until all other copies have been emitted on this edge. */ ! 
VEC_safe_push (int, heap, g->const_dests, p0); VEC_safe_push (tree, heap, g->const_copies, Ti); } else { ! pi = var_to_partition (g->map, Ti); ! if (p0 != pi) { ! eliminate_name (g, p0); ! eliminate_name (g, pi); elim_graph_add_edge (g, p0, pi); } } *************** elim_backward (elim_graph g, int T) *** 399,430 **** if (!TEST_BIT (g->visited, P)) { elim_backward (g, P); ! insert_copy_on_edge (g->e, ! partition_to_var (g->map, P), ! partition_to_var (g->map, T)); } }); } /* Insert required copies for T in graph G. Check for a strongly connected region, and create a temporary to break the cycle if one is found. */ static void elim_create (elim_graph g, int T) { - tree U; int P, S; if (elim_unvisited_predecessor (g, T)) { ! U = create_temp (partition_to_var (g->map, T)); ! insert_copy_on_edge (g->e, U, partition_to_var (g->map, T)); FOR_EACH_ELIM_GRAPH_PRED (g, T, P, { if (!TEST_BIT (g->visited, P)) { elim_backward (g, P); ! insert_copy_on_edge (g->e, partition_to_var (g->map, P), U); } }); } --- 490,535 ---- if (!TEST_BIT (g->visited, P)) { elim_backward (g, P); ! insert_partition_copy_on_edge (g->e, P, T); } }); } + /* Allocate a new pseudo register usable for storing values sitting + in NAME (a decl or SSA name), i.e. with matching mode and attributes. */ + + static rtx + get_temp_reg (tree name) + { + tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name; + tree type = TREE_TYPE (var); + int unsignedp = TYPE_UNSIGNED (type); + enum machine_mode reg_mode + = promote_mode (type, DECL_MODE (var), &unsignedp, 0); + rtx x = gen_reg_rtx (reg_mode); + if (POINTER_TYPE_P (type)) + mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var)))); + return x; + } + /* Insert required copies for T in graph G. Check for a strongly connected region, and create a temporary to break the cycle if one is found. */ static void elim_create (elim_graph g, int T) { int P, S; if (elim_unvisited_predecessor (g, T)) { ! rtx U = get_temp_reg (partition_to_var (g->map, T)); ! 
insert_part_to_rtx_on_edge (g->e, U, T); FOR_EACH_ELIM_GRAPH_PRED (g, T, P, { if (!TEST_BIT (g->visited, P)) { elim_backward (g, P); ! insert_rtx_to_part_on_edge (g->e, P, U); } }); } *************** elim_create (elim_graph g, int T) *** 434,445 **** if (S != -1) { SET_BIT (g->visited, T); ! insert_copy_on_edge (g->e, ! partition_to_var (g->map, T), ! partition_to_var (g->map, S)); } } - } --- 539,547 ---- if (S != -1) { SET_BIT (g->visited, T); ! insert_partition_copy_on_edge (g->e, T, S); } } } *************** static void *** 449,455 **** eliminate_phi (edge e, elim_graph g) { int x; - basic_block B = e->dest; gcc_assert (VEC_length (tree, g->const_copies) == 0); --- 551,556 ---- *************** eliminate_phi (edge e, elim_graph g) *** 459,478 **** g->e = e; ! eliminate_build (g, B); if (elim_graph_size (g) != 0) { ! tree var; sbitmap_zero (g->visited); VEC_truncate (int, g->stack, 0); ! for (x = 0; VEC_iterate (tree, g->nodes, x, var); x++) { ! int p = var_to_partition (g->map, var); ! if (!TEST_BIT (g->visited, p)) ! elim_forward (g, p); } sbitmap_zero (g->visited); --- 560,578 ---- g->e = e; ! eliminate_build (g); if (elim_graph_size (g) != 0) { ! int part; sbitmap_zero (g->visited); VEC_truncate (int, g->stack, 0); ! for (x = 0; VEC_iterate (int, g->nodes, x, part); x++) { ! if (!TEST_BIT (g->visited, part)) ! elim_forward (g, part); } sbitmap_zero (g->visited); *************** eliminate_phi (edge e, elim_graph g) *** 487,607 **** /* If there are any pending constant copies, issue them now. */ while (VEC_length (tree, g->const_copies) > 0) { ! tree src, dest; src = VEC_pop (tree, g->const_copies); ! dest = VEC_pop (tree, g->const_copies); ! insert_copy_on_edge (e, dest, src); } } - /* Take the ssa-name var_map MAP, and assign real variables to each - partition. 
*/ - - static void - assign_vars (var_map map) - { - int x, num; - tree var, root; - var_ann_t ann; - - num = num_var_partitions (map); - for (x = 0; x < num; x++) - { - var = partition_to_var (map, x); - if (TREE_CODE (var) != SSA_NAME) - { - ann = var_ann (var); - /* It must already be coalesced. */ - gcc_assert (ann->out_of_ssa_tag == 1); - if (dump_file && (dump_flags & TDF_DETAILS)) - { - fprintf (dump_file, "partition %d already has variable ", x); - print_generic_expr (dump_file, var, TDF_SLIM); - fprintf (dump_file, " assigned to it.\n"); - } - } - else - { - root = SSA_NAME_VAR (var); - ann = var_ann (root); - /* If ROOT is already associated, create a new one. */ - if (ann->out_of_ssa_tag) - { - root = create_temp (root); - ann = var_ann (root); - } - /* ROOT has not been coalesced yet, so use it. */ - if (dump_file && (dump_flags & TDF_DETAILS)) - { - fprintf (dump_file, "Partition %d is assigned to var ", x); - print_generic_stmt (dump_file, root, TDF_SLIM); - } - change_partition_var (map, root, x); - } - } - } - - - /* Replace use operand P with whatever variable it has been rewritten to based - on the partitions in MAP. EXPR is an optional expression vector over SSA - versions which is used to replace P with an expression instead of a variable. - If the stmt is changed, return true. */ - - static inline bool - replace_use_variable (var_map map, use_operand_p p, gimple *expr) - { - tree new_var; - tree var = USE_FROM_PTR (p); - - /* Check if we are replacing this variable with an expression. */ - if (expr) - { - int version = SSA_NAME_VERSION (var); - if (expr[version]) - { - SET_USE (p, gimple_assign_rhs_to_tree (expr[version])); - return true; - } - } - - new_var = var_to_partition_to_var (map, var); - if (new_var) - { - SET_USE (p, new_var); - set_is_used (new_var); - return true; - } - return false; - } - - - /* Replace def operand DEF_P with whatever variable it has been rewritten to - based on the partitions in MAP. 
EXPR is an optional expression vector over - SSA versions which is used to replace DEF_P with an expression instead of a - variable. If the stmt is changed, return true. */ - - static inline bool - replace_def_variable (var_map map, def_operand_p def_p, tree *expr) - { - tree new_var; - tree var = DEF_FROM_PTR (def_p); - - /* Do nothing if we are replacing this variable with an expression. */ - if (expr && expr[SSA_NAME_VERSION (var)]) - return true; - - new_var = var_to_partition_to_var (map, var); - if (new_var) - { - SET_DEF (def_p, new_var); - set_is_used (new_var); - return true; - } - return false; - } - - /* Remove each argument from PHI. If an arg was the last use of an SSA_NAME, check to see if this allows another PHI node to be removed. */ --- 587,601 ---- /* If there are any pending constant copies, issue them now. */ while (VEC_length (tree, g->const_copies) > 0) { ! int dest; ! tree src; src = VEC_pop (tree, g->const_copies); ! dest = VEC_pop (int, g->const_dests); ! insert_value_copy_on_edge (e, dest, src); } } /* Remove each argument from PHI. If an arg was the last use of an SSA_NAME, check to see if this allows another PHI node to be removed. */ *************** eliminate_useless_phis (void) *** 704,724 **** variable. */ static void ! rewrite_trees (var_map map, gimple *values) { - elim_graph g; - basic_block bb; - gimple_stmt_iterator gsi; - edge e; - gimple_seq phi; - bool changed; - #ifdef ENABLE_CHECKING /* Search for PHIs where the destination has no partition, but one or more arguments has a partition. This should not happen and can create incorrect code. */ FOR_EACH_BB (bb) { for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi)) { gimple phi = gsi_stmt (gsi); --- 698,713 ---- variable. */ static void ! rewrite_trees (var_map map) { #ifdef ENABLE_CHECKING + basic_block bb; /* Search for PHIs where the destination has no partition, but one or more arguments has a partition. This should not happen and can create incorrect code. 
*/ FOR_EACH_BB (bb) { + gimple_stmt_iterator gsi; for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi)) { gimple phi = gsi_stmt (gsi); *************** rewrite_trees (var_map map, gimple *valu *** 744,1350 **** } } #endif - - /* Replace PHI nodes with any required copies. */ - g = new_elim_graph (map->num_partitions); - g->map = map; - FOR_EACH_BB (bb) - { - for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); ) - { - gimple stmt = gsi_stmt (gsi); - use_operand_p use_p, copy_use_p; - def_operand_p def_p; - bool remove = false, is_copy = false; - int num_uses = 0; - ssa_op_iter iter; - - changed = false; - - if (gimple_assign_copy_p (stmt)) - is_copy = true; - - copy_use_p = NULL_USE_OPERAND_P; - FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_USE) - { - if (replace_use_variable (map, use_p, values)) - changed = true; - copy_use_p = use_p; - num_uses++; - } - - if (num_uses != 1) - is_copy = false; - - def_p = SINGLE_SSA_DEF_OPERAND (stmt, SSA_OP_DEF); - - if (def_p != NULL) - { - /* Mark this stmt for removal if it is the list of replaceable - expressions. */ - if (values && values[SSA_NAME_VERSION (DEF_FROM_PTR (def_p))]) - remove = true; - else - { - if (replace_def_variable (map, def_p, NULL)) - changed = true; - /* If both SSA_NAMEs coalesce to the same variable, - mark the now redundant copy for removal. */ - if (is_copy) - { - gcc_assert (copy_use_p != NULL_USE_OPERAND_P); - if (DEF_FROM_PTR (def_p) == USE_FROM_PTR (copy_use_p)) - remove = true; - } - } - } - else - FOR_EACH_SSA_DEF_OPERAND (def_p, stmt, iter, SSA_OP_DEF) - if (replace_def_variable (map, def_p, NULL)) - changed = true; - - /* Remove any stmts marked for removal. 
*/ - if (remove) - gsi_remove (&gsi, true); - else - { - if (changed) - if (maybe_clean_or_replace_eh_stmt (stmt, stmt)) - gimple_purge_dead_eh_edges (bb); - gsi_next (&gsi); - } - } - - phi = phi_nodes (bb); - if (phi) - { - edge_iterator ei; - FOR_EACH_EDGE (e, ei, bb->preds) - eliminate_phi (e, g); - } - } - - delete_elim_graph (g); - } - - /* These are the local work structures used to determine the best place to - insert the copies that were placed on edges by the SSA->normal pass.. */ - static VEC(edge,heap) *edge_leader; - static VEC(gimple_seq,heap) *stmt_list; - static bitmap leader_has_match = NULL; - static edge leader_match = NULL; - - - /* Pass this function to make_forwarder_block so that all the edges with - matching PENDING_STMT lists to 'curr_stmt_list' get redirected. E is the - edge to test for a match. */ - - static inline bool - same_stmt_list_p (edge e) - { - return (e->aux == (PTR) leader_match) ? true : false; - } - - - /* Return TRUE if S1 and S2 are equivalent copies. */ - - static inline bool - identical_copies_p (const_gimple s1, const_gimple s2) - { - #ifdef ENABLE_CHECKING - gcc_assert (is_gimple_assign (s1)); - gcc_assert (is_gimple_assign (s2)); - gcc_assert (DECL_P (gimple_assign_lhs (s1))); - gcc_assert (DECL_P (gimple_assign_lhs (s2))); - #endif - - if (gimple_assign_lhs (s1) != gimple_assign_lhs (s2)) - return false; - - if (gimple_assign_rhs1 (s1) != gimple_assign_rhs1 (s2)) - return false; - - return true; - } - - - /* Compare the PENDING_STMT list for edges E1 and E2. Return true if the lists - contain the same sequence of copies. 
*/ - - static inline bool - identical_stmt_lists_p (const_edge e1, const_edge e2) - { - gimple_seq t1 = PENDING_STMT (e1); - gimple_seq t2 = PENDING_STMT (e2); - gimple_stmt_iterator gsi1, gsi2; - - for (gsi1 = gsi_start (t1), gsi2 = gsi_start (t2); - !gsi_end_p (gsi1) && !gsi_end_p (gsi2); - gsi_next (&gsi1), gsi_next (&gsi2)) - { - if (!identical_copies_p (gsi_stmt (gsi1), gsi_stmt (gsi2))) - break; - } - - if (!gsi_end_p (gsi1) || !gsi_end_p (gsi2)) - return false; - - return true; - } - - - /* Allocate data structures used in analyze_edges_for_bb. */ - - static void - init_analyze_edges_for_bb (void) - { - edge_leader = VEC_alloc (edge, heap, 25); - stmt_list = VEC_alloc (gimple_seq, heap, 25); - leader_has_match = BITMAP_ALLOC (NULL); - } - - - /* Free data structures used in analyze_edges_for_bb. */ - - static void - fini_analyze_edges_for_bb (void) - { - VEC_free (edge, heap, edge_leader); - VEC_free (gimple_seq, heap, stmt_list); - BITMAP_FREE (leader_has_match); - } - - /* A helper function to be called via walk_tree. Return DATA if it is - contained in subtree TP. */ - - static tree - contains_tree_r (tree * tp, int *walk_subtrees, void *data) - { - if (*tp == data) - { - *walk_subtrees = 0; - return (tree) data; - } - else - return NULL_TREE; - } - - /* A threshold for the number of insns contained in the latch block. - It is used to prevent blowing the loop with too many copies from - the latch. */ - #define MAX_STMTS_IN_LATCH 2 - - /* Return TRUE if the stmts on SINGLE-EDGE can be moved to the - body of the loop. This should be permitted only if SINGLE-EDGE is a - single-basic-block latch edge and thus cleaning the latch will help - to create a single-basic-block loop. Otherwise return FALSE. 
*/ - - static bool - process_single_block_loop_latch (edge single_edge) - { - gimple_seq stmts; - basic_block b_exit, b_pheader, b_loop = single_edge->src; - edge_iterator ei; - edge e; - gimple_stmt_iterator gsi, gsi_exit; - gimple_stmt_iterator tsi; - tree expr; - gimple stmt; - unsigned int count = 0; - - if (single_edge == NULL || (single_edge->dest != single_edge->src) - || (EDGE_COUNT (b_loop->succs) != 2) - || (EDGE_COUNT (b_loop->preds) != 2)) - return false; - - /* Get the stmts on the latch edge. */ - stmts = PENDING_STMT (single_edge); - - /* Find the successor edge which is not the latch edge. */ - FOR_EACH_EDGE (e, ei, b_loop->succs) - if (e->dest != b_loop) - break; - - b_exit = e->dest; - - /* Check that the exit block has only the loop as a predecessor, - and that there are no pending stmts on that edge as well. */ - if (EDGE_COUNT (b_exit->preds) != 1 || PENDING_STMT (e)) - return false; - - /* Find the predecessor edge which is not the latch edge. */ - FOR_EACH_EDGE (e, ei, b_loop->preds) - if (e->src != b_loop) - break; - - b_pheader = e->src; - - if (b_exit == b_pheader || b_exit == b_loop || b_pheader == b_loop) - return false; - - gsi_exit = gsi_after_labels (b_exit); - - /* Get the last stmt in the loop body. */ - gsi = gsi_last_bb (single_edge->src); - stmt = gsi_stmt (gsi); - - if (gimple_code (stmt) != GIMPLE_COND) - return false; - - - expr = build2 (gimple_cond_code (stmt), boolean_type_node, - gimple_cond_lhs (stmt), gimple_cond_rhs (stmt)); - /* Iterate over the insns on the latch and count them. */ - for (tsi = gsi_start (stmts); !gsi_end_p (tsi); gsi_next (&tsi)) - { - gimple stmt1 = gsi_stmt (tsi); - tree var; - - count++; - /* Check that the condition does not contain any new definition - created in the latch as the stmts from the latch intended - to precede it. 
*/ - if (gimple_code (stmt1) != GIMPLE_ASSIGN) - return false; - var = gimple_assign_lhs (stmt1); - if (TREE_THIS_VOLATILE (var) - || TYPE_VOLATILE (TREE_TYPE (var)) - || walk_tree (&expr, contains_tree_r, var, NULL)) - return false; - } - /* Check that the latch does not contain more than MAX_STMTS_IN_LATCH - insns. The purpose of this restriction is to prevent blowing the - loop with too many copies from the latch. */ - if (count > MAX_STMTS_IN_LATCH) - return false; - - /* Apply the transformation - clean up the latch block: - - var = something; - L1: - x1 = expr; - if (cond) goto L2 else goto L3; - L2: - var = x1; - goto L1 - L3: - ... - - ==> - - var = something; - L1: - x1 = expr; - tmp_var = var; - var = x1; - if (cond) goto L1 else goto L2; - L2: - var = tmp_var; - ... - */ - for (tsi = gsi_start (stmts); !gsi_end_p (tsi); gsi_next (&tsi)) - { - gimple stmt1 = gsi_stmt (tsi); - tree var, tmp_var; - gimple copy; - - /* Create a new variable to load back the value of var in case - we exit the loop. */ - var = gimple_assign_lhs (stmt1); - tmp_var = create_temp (var); - copy = gimple_build_assign (tmp_var, var); - set_is_used (tmp_var); - gsi_insert_before (&gsi, copy, GSI_SAME_STMT); - copy = gimple_build_assign (var, tmp_var); - gsi_insert_before (&gsi_exit, copy, GSI_SAME_STMT); - } - - PENDING_STMT (single_edge) = 0; - /* Insert the new stmts to the loop body. */ - gsi_insert_seq_before (&gsi, stmts, GSI_NEW_STMT); - - if (dump_file) - fprintf (dump_file, - "\nCleaned-up latch block of loop with single BB: %d\n\n", - single_edge->dest->index); - - return true; - } - - /* Look at all the incoming edges to block BB, and decide where the best place - to insert the stmts on each edge are, and perform those insertions. 
*/ - - static void - analyze_edges_for_bb (basic_block bb) - { - edge e; - edge_iterator ei; - int count; - unsigned int x; - bool have_opportunity; - gimple_stmt_iterator gsi; - gimple stmt; - edge single_edge = NULL; - bool is_label; - edge leader; - - count = 0; - - /* Blocks which contain at least one abnormal edge cannot use - make_forwarder_block. Look for these blocks, and commit any PENDING_STMTs - found on edges in these block. */ - have_opportunity = true; - FOR_EACH_EDGE (e, ei, bb->preds) - if (e->flags & EDGE_ABNORMAL) - { - have_opportunity = false; - break; - } - - if (!have_opportunity) - { - FOR_EACH_EDGE (e, ei, bb->preds) - if (PENDING_STMT (e)) - gsi_commit_one_edge_insert (e, NULL); - return; - } - - /* Find out how many edges there are with interesting pending stmts on them. - Commit the stmts on edges we are not interested in. */ - FOR_EACH_EDGE (e, ei, bb->preds) - { - if (PENDING_STMT (e)) - { - gcc_assert (!(e->flags & EDGE_ABNORMAL)); - if (e->flags & EDGE_FALLTHRU) - { - gsi = gsi_start_bb (e->src); - if (!gsi_end_p (gsi)) - { - stmt = gsi_stmt (gsi); - gsi_next (&gsi); - gcc_assert (stmt != NULL); - is_label = (gimple_code (stmt) == GIMPLE_LABEL); - /* Punt if it has non-label stmts, or isn't local. */ - if (!is_label - || DECL_NONLOCAL (gimple_label_label (stmt)) - || !gsi_end_p (gsi)) - { - gsi_commit_one_edge_insert (e, NULL); - continue; - } - } - } - single_edge = e; - count++; - } - } - - /* If there aren't at least 2 edges, no sharing will happen. */ - if (count < 2) - { - if (single_edge) - { - /* Add stmts to the edge unless processed specially as a - single-block loop latch edge. */ - if (!process_single_block_loop_latch (single_edge)) - gsi_commit_one_edge_insert (single_edge, NULL); - } - return; - } - - /* Ensure that we have empty worklists. 
*/ - #ifdef ENABLE_CHECKING - gcc_assert (VEC_length (edge, edge_leader) == 0); - gcc_assert (VEC_length (gimple_seq, stmt_list) == 0); - gcc_assert (bitmap_empty_p (leader_has_match)); - #endif - - /* Find the "leader" block for each set of unique stmt lists. Preference is - given to FALLTHRU blocks since they would need a GOTO to arrive at another - block. The leader edge destination is the block which all the other edges - with the same stmt list will be redirected to. */ - have_opportunity = false; - FOR_EACH_EDGE (e, ei, bb->preds) - { - if (PENDING_STMT (e)) - { - bool found = false; - - /* Look for the same stmt list in edge leaders list. */ - for (x = 0; VEC_iterate (edge, edge_leader, x, leader); x++) - { - if (identical_stmt_lists_p (leader, e)) - { - /* Give this edge the same stmt list pointer. */ - PENDING_STMT (e) = NULL; - e->aux = leader; - bitmap_set_bit (leader_has_match, x); - have_opportunity = found = true; - break; - } - } - - /* If no similar stmt list, add this edge to the leader list. */ - if (!found) - { - VEC_safe_push (edge, heap, edge_leader, e); - VEC_safe_push (gimple_seq, heap, stmt_list, PENDING_STMT (e)); - } - } - } - - /* If there are no similar lists, just issue the stmts. */ - if (!have_opportunity) - { - for (x = 0; VEC_iterate (edge, edge_leader, x, leader); x++) - gsi_commit_one_edge_insert (leader, NULL); - VEC_truncate (edge, edge_leader, 0); - VEC_truncate (gimple_seq, stmt_list, 0); - bitmap_clear (leader_has_match); - return; - } - - if (dump_file) - fprintf (dump_file, "\nOpportunities in BB %d for stmt/block reduction:\n", - bb->index); - - /* For each common list, create a forwarding block and issue the stmt's - in that block. 
*/ - for (x = 0; VEC_iterate (edge, edge_leader, x, leader); x++) - if (bitmap_bit_p (leader_has_match, x)) - { - edge new_edge; - gimple_stmt_iterator gsi; - gimple_seq curr_stmt_list; - - leader_match = leader; - - /* The tree_* cfg manipulation routines use the PENDING_EDGE field - for various PHI manipulations, so it gets cleared when calls are - made to make_forwarder_block(). So make sure the edge is clear, - and use the saved stmt list. */ - PENDING_STMT (leader) = NULL; - leader->aux = leader; - curr_stmt_list = VEC_index (gimple_seq, stmt_list, x); - - new_edge = make_forwarder_block (leader->dest, same_stmt_list_p, - NULL); - bb = new_edge->dest; - if (dump_file) - { - fprintf (dump_file, "Splitting BB %d for Common stmt list. ", - leader->dest->index); - fprintf (dump_file, "Original block is now BB%d.\n", bb->index); - print_gimple_seq (dump_file, curr_stmt_list, 0, TDF_VOPS); - } - - FOR_EACH_EDGE (e, ei, new_edge->src->preds) - { - e->aux = NULL; - if (dump_file) - fprintf (dump_file, " Edge (%d->%d) lands here.\n", - e->src->index, e->dest->index); - } - - gsi = gsi_last_bb (leader->dest); - gsi_insert_seq_after (&gsi, curr_stmt_list, GSI_NEW_STMT); - - leader_match = NULL; - /* We should never get a new block now. */ - } - else - { - PENDING_STMT (leader) = VEC_index (gimple_seq, stmt_list, x); - gsi_commit_one_edge_insert (leader, NULL); - } - - - /* Clear the working data structures. */ - VEC_truncate (edge, edge_leader, 0); - VEC_truncate (gimple_seq, stmt_list, 0); - bitmap_clear (leader_has_match); } ! /* This function will analyze the insertions which were performed on edges, ! and decide whether they should be left on that edge, or whether it is more ! efficient to emit some subset of them in a single block. All stmts are ! inserted somewhere. */ ! ! static void ! perform_edge_inserts (void) { basic_block bb; ! if (dump_file) ! fprintf(dump_file, "Analyzing Edge Insertions.\n"); ! ! 
/* analyze_edges_for_bb calls make_forwarder_block, which tries to ! incrementally update the dominator information. Since we don't ! need dominator information after this pass, go ahead and free the ! dominator information. */ ! free_dominance_info (CDI_DOMINATORS); ! free_dominance_info (CDI_POST_DOMINATORS); ! ! /* Allocate data structures used in analyze_edges_for_bb. */ ! init_analyze_edges_for_bb (); ! ! FOR_EACH_BB (bb) ! analyze_edges_for_bb (bb); ! ! analyze_edges_for_bb (EXIT_BLOCK_PTR); ! ! /* Free data structures used in analyze_edges_for_bb. */ ! fini_analyze_edges_for_bb (); ! ! #ifdef ENABLE_CHECKING ! { ! edge_iterator ei; ! edge e; ! FOR_EACH_BB (bb) { FOR_EACH_EDGE (e, ei, bb->preds) ! { ! if (PENDING_STMT (e)) ! error (" Pending stmts not issued on PRED edge (%d, %d)\n", ! e->src->index, e->dest->index); ! } ! FOR_EACH_EDGE (e, ei, bb->succs) ! { ! if (PENDING_STMT (e)) ! error (" Pending stmts not issued on SUCC edge (%d, %d)\n", ! e->src->index, e->dest->index); ! } ! } ! FOR_EACH_EDGE (e, ei, ENTRY_BLOCK_PTR->succs) ! { ! if (PENDING_STMT (e)) ! error (" Pending stmts not issued on ENTRY edge (%d, %d)\n", ! e->src->index, e->dest->index); } ! FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR->preds) ! { ! if (PENDING_STMT (e)) ! error (" Pending stmts not issued on EXIT edge (%d, %d)\n", ! e->src->index, e->dest->index); ! } ! } ! #endif } /* Remove the ssa-names in the current function and translate them into normal compiler variables. PERFORM_TER is true if Temporary Expression Replacement should also be used. */ static void ! remove_ssa_form (bool perform_ter) { - basic_block bb; gimple *values = NULL; var_map map; ! gimple_stmt_iterator gsi; map = coalesce_ssa_name (); --- 733,777 ---- } } #endif } + /* Given the out-of-ssa info object SA (with prepared partitions) + eliminate all phi nodes in all basic blocks. Afterwards no + basic block will have phi nodes anymore and there are possibly + some RTL instructions inserted on edges. */ ! void ! 
expand_phi_nodes (struct ssaexpand *sa) { basic_block bb; + elim_graph g = new_elim_graph (sa->map->num_partitions); + g->map = sa->map; ! FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR->next_bb, EXIT_BLOCK_PTR, next_bb) ! if (!gimple_seq_empty_p (phi_nodes (bb))) { + edge e; + edge_iterator ei; FOR_EACH_EDGE (e, ei, bb->preds) ! eliminate_phi (e, g); ! set_phi_nodes (bb, NULL); } ! ! delete_elim_graph (g); } + /* Remove the ssa-names in the current function and translate them into normal compiler variables. PERFORM_TER is true if Temporary Expression Replacement should also be used. */ static void ! remove_ssa_form (bool perform_ter, struct ssaexpand *sa) { gimple *values = NULL; var_map map; ! unsigned i; map = coalesce_ssa_name (); *************** remove_ssa_form (bool perform_ter) *** 1365,1393 **** dump_replaceable_exprs (dump_file, values); } ! /* Assign real variables to the partitions now. */ ! assign_vars (map); ! if (dump_file && (dump_flags & TDF_DETAILS)) { ! fprintf (dump_file, "After Base variable replacement:\n"); ! dump_var_map (dump_file, map); } - - rewrite_trees (map, values); - - if (values) - free (values); - - /* Remove PHI nodes which have been translated back to real variables. */ - FOR_EACH_BB (bb) - for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);) - remove_phi_node (&gsi, true); - - /* If any copies were inserted on edges, analyze and insert them now. */ - perform_edge_inserts (); - - delete_var_map (map); } --- 792,812 ---- dump_replaceable_exprs (dump_file, values); } ! rewrite_trees (map); ! sa->map = map; ! sa->values = values; ! sa->partition_has_default_def = BITMAP_ALLOC (NULL); ! for (i = 1; i < num_ssa_names; i++) { ! tree t = ssa_name (i); ! if (t && SSA_NAME_IS_DEFAULT_DEF (t)) ! { ! int p = var_to_partition (map, t); ! if (p != NO_PARTITION) ! bitmap_set_bit (sa->partition_has_default_def, p); ! 
} } } *************** insert_backedge_copies (void) *** 1477,1488 **** } } /* Take the current function out of SSA form, translating PHIs as described in R. Morgan, ``Building an Optimizing Compiler'', Butterworth-Heinemann, Boston, MA, 1998. pp 176-186. */ ! static unsigned int ! rewrite_out_of_ssa (void) { /* If elimination of a PHI requires inserting a copy on a backedge, then we will have to split the backedge which has numerous --- 896,921 ---- } } + /* Free all memory associated with going out of SSA form. SA is + the outof-SSA info object. */ + + void + finish_out_of_ssa (struct ssaexpand *sa) + { + free (sa->partition_to_pseudo); + if (sa->values) + free (sa->values); + delete_var_map (sa->map); + BITMAP_FREE (sa->partition_has_default_def); + memset (sa, 0, sizeof *sa); + } + /* Take the current function out of SSA form, translating PHIs as described in R. Morgan, ``Building an Optimizing Compiler'', Butterworth-Heinemann, Boston, MA, 1998. pp 176-186. */ ! unsigned int ! rewrite_out_of_ssa (struct ssaexpand *sa) { /* If elimination of a PHI requires inserting a copy on a backedge, then we will have to split the backedge which has numerous *************** rewrite_out_of_ssa (void) *** 1499,1535 **** if (dump_file && (dump_flags & TDF_DETAILS)) gimple_dump_cfg (dump_file, dump_flags & ~TDF_DETAILS); ! remove_ssa_form (flag_tree_ter && !flag_mudflap); if (dump_file && (dump_flags & TDF_DETAILS)) gimple_dump_cfg (dump_file, dump_flags & ~TDF_DETAILS); - cfun->gimple_df->in_ssa_p = false; return 0; } - - - /* Define the parameters of the out of SSA pass. */ - - struct gimple_opt_pass pass_del_ssa = - { - { - GIMPLE_PASS, - "optimized", /* name */ - NULL, /* gate */ - rewrite_out_of_ssa, /* execute */ - NULL, /* sub */ - NULL, /* next */ - 0, /* static_pass_number */ - TV_TREE_SSA_TO_NORMAL, /* tv_id */ - PROP_cfg | PROP_ssa, /* properties_required */ - 0, /* properties_provided */ - /* ??? If TER is enabled, we also kill gimple. 
*/ - PROP_ssa, /* properties_destroyed */ - TODO_verify_ssa | TODO_verify_flow - | TODO_verify_stmts, /* todo_flags_start */ - TODO_dump_func - | TODO_ggc_collect - | TODO_remove_unused_locals /* todo_flags_finish */ - } - }; --- 932,941 ---- if (dump_file && (dump_flags & TDF_DETAILS)) gimple_dump_cfg (dump_file, dump_flags & ~TDF_DETAILS); ! remove_ssa_form (flag_tree_ter && !flag_mudflap, sa); if (dump_file && (dump_flags & TDF_DETAILS)) gimple_dump_cfg (dump_file, dump_flags & ~TDF_DETAILS); return 0; } Index: ssaexpand.h =================================================================== *** ssaexpand.h (revision 0) --- ssaexpand.h (revision 0) *************** *** 0 **** --- 1,80 ---- + /* Routines for expanding from SSA form to RTL. + Copyright (C) 2009 Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + <http://www.gnu.org/licenses/>. */ + + + #ifndef _SSAEXPAND_H + #define _SSAEXPAND_H 1 + + #include "tree-ssa-live.h" + + /* This structure (of which only a singleton SA exists) is used to + pass around information between the outof-SSA functions, cfgexpand + and expand itself. */ + struct ssaexpand + { + /* The computed partitions of SSA names are stored here. */ + var_map map; + + /* For an SSA name version V values[V] contains the gimple statement + defining it iff TER decided that it should be forwarded, NULL + otherwise. 
*/ + gimple *values; + + /* For a partition number I partition_to_pseudo[I] contains the + RTL expression of the allocated space of it (either a MEM or + a pseudo REG). */ + rtx *partition_to_pseudo; + + /* If partition I contains an SSA name that has a default def, + bit I will be set in this bitmap. */ + bitmap partition_has_default_def; + }; + + /* This is the singleton described above. */ + extern struct ssaexpand SA; + + /* Returns the RTX expression representing the storage of the outof-SSA + partition that the SSA name EXP is a member of. */ + static inline rtx + get_rtx_for_ssa_name (tree exp) + { + int p = partition_find (SA.map->var_partition, SSA_NAME_VERSION (exp)); + if (SA.map->partition_to_view) + p = SA.map->partition_to_view[p]; + gcc_assert (p != NO_PARTITION); + return SA.partition_to_pseudo[p]; + } + + /* If TER decided to forward the definition of SSA name EXP this function + returns the defining statement, otherwise NULL. */ + static inline gimple + get_gimple_for_ssa_name (tree exp) + { + int v = SSA_NAME_VERSION (exp); + if (SA.values) + return SA.values[v]; + return NULL; + } + + /* In tree-outof-ssa.c. */ + void finish_out_of_ssa (struct ssaexpand *sa); + unsigned int rewrite_out_of_ssa (struct ssaexpand *sa); + void expand_phi_nodes (struct ssaexpand *sa); + + #endif Index: tree-flow.h =================================================================== *** tree-flow.h (revision 146576) --- tree-flow.h (working copy) *************** struct var_ann_d GTY(()) *** 209,218 **** { struct tree_ann_common_d common; - /* Used by the out of SSA pass to determine whether this variable has - been seen yet or not. */ - unsigned out_of_ssa_tag : 1; - /* Used when building base variable structures in a var_map. */ unsigned base_var_processed : 1; --- 209,214 ---- *************** struct var_ann_d GTY(()) *** 234,243 **** information on each attribute. 
*/ ENUM_BITFIELD (noalias_state) noalias_state : 2; - /* Used when going out of SSA form to indicate which partition this - variable represents storage for. */ - unsigned partition; - /* Used by var_map for the base index of ssa base variables. */ unsigned base_index; --- 230,235 ---- *************** rtx addr_for_mem_ref (struct mem_address *** 976,981 **** --- 968,974 ---- void get_address_description (tree, struct mem_address *); tree maybe_fold_tmr (tree); + unsigned int execute_free_datastructures (void); unsigned int execute_fixup_cfg (void); #include "tree-flow-inline.h" Index: Makefile.in =================================================================== *** Makefile.in (revision 146576) --- Makefile.in (working copy) *************** TREE_FLOW_H = tree-flow.h tree-flow-inli *** 864,869 **** --- 864,870 ---- $(HASHTAB_H) $(CGRAPH_H) $(IPA_REFERENCE_H) \ tree-ssa-alias.h TREE_SSA_LIVE_H = tree-ssa-live.h $(PARTITION_H) vecprim.h + SSAEXPAND_H = ssaexpand.h $(TREE_SSA_LIVE_H) PRETTY_PRINT_H = pretty-print.h $(INPUT_H) $(OBSTACK_H) DIAGNOSTIC_H = diagnostic.h diagnostic.def $(PRETTY_PRINT_H) options.h C_PRETTY_PRINT_H = c-pretty-print.h $(PRETTY_PRINT_H) $(C_COMMON_H) $(TREE_H) *************** tree-ssa-coalesce.o : tree-ssa-coalesce. *** 2106,2112 **** tree-outof-ssa.o : tree-outof-ssa.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \ $(TREE_H) $(DIAGNOSTIC_H) $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \ $(TREE_PASS_H) $(TREE_SSA_LIVE_H) $(BASIC_BLOCK_H) $(BITMAP_H) $(GGC_H) \ ! $(TOPLEV_H) tree-ssa-dse.o : tree-ssa-dse.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \ $(TM_H) $(GGC_H) $(TREE_H) $(RTL_H) $(TM_P_H) $(BASIC_BLOCK_H) \ $(TREE_FLOW_H) $(TREE_PASS_H) $(TREE_DUMP_H) domwalk.h $(FLAGS_H) \ --- 2107,2113 ---- tree-outof-ssa.o : tree-outof-ssa.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \ $(TREE_H) $(DIAGNOSTIC_H) $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \ $(TREE_PASS_H) $(TREE_SSA_LIVE_H) $(BASIC_BLOCK_H) $(BITMAP_H) $(GGC_H) \ ! 
$(TOPLEV_H) $(EXPR_H) $(SSAEXPAND_H) tree-ssa-dse.o : tree-ssa-dse.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \ $(TM_H) $(GGC_H) $(TREE_H) $(RTL_H) $(TM_P_H) $(BASIC_BLOCK_H) \ $(TREE_FLOW_H) $(TREE_PASS_H) $(TREE_DUMP_H) domwalk.h $(FLAGS_H) \ *************** expr.o : expr.c $(CONFIG_H) $(SYSTEM_H) *** 2533,2539 **** typeclass.h hard-reg-set.h $(TOPLEV_H) hard-reg-set.h $(EXCEPT_H) reload.h \ $(GGC_H) langhooks.h intl.h $(TM_P_H) $(REAL_H) $(TARGET_H) \ tree-iterator.h gt-expr.h $(MACHMODE_H) $(TIMEVAR_H) $(TREE_FLOW_H) \ ! $(TREE_PASS_H) $(DF_H) $(DIAGNOSTIC_H) vecprim.h dojump.o : dojump.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(TREE_H) \ $(FLAGS_H) $(FUNCTION_H) $(EXPR_H) $(OPTABS_H) $(INSN_ATTR_H) insn-config.h \ langhooks.h $(GGC_H) gt-dojump.h vecprim.h $(BASIC_BLOCK_H) --- 2534,2540 ---- typeclass.h hard-reg-set.h $(TOPLEV_H) hard-reg-set.h $(EXCEPT_H) reload.h \ $(GGC_H) langhooks.h intl.h $(TM_P_H) $(REAL_H) $(TARGET_H) \ tree-iterator.h gt-expr.h $(MACHMODE_H) $(TIMEVAR_H) $(TREE_FLOW_H) \ ! $(TREE_PASS_H) $(DF_H) $(DIAGNOSTIC_H) vecprim.h $(SSAEXPAND_H) dojump.o : dojump.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(TREE_H) \ $(FLAGS_H) $(FUNCTION_H) $(EXPR_H) $(OPTABS_H) $(INSN_ATTR_H) insn-config.h \ langhooks.h $(GGC_H) gt-dojump.h vecprim.h $(BASIC_BLOCK_H) *************** cfgexpand.o : cfgexpand.c $(TREE_FLOW_H) *** 2805,2811 **** $(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) $(FUNCTION_H) $(TIMEVAR_H) $(TM_H) \ coretypes.h $(TREE_DUMP_H) $(EXCEPT_H) langhooks.h $(TREE_PASS_H) $(RTL_H) \ $(DIAGNOSTIC_H) $(TOPLEV_H) $(BASIC_BLOCK_H) $(FLAGS_H) debug.h $(PARAMS_H) \ ! 
value-prof.h $(TREE_INLINE_H) $(TARGET_H) cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \ $(FLAGS_H) insn-config.h $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h \ output.h $(TOPLEV_H) $(FUNCTION_H) $(EXCEPT_H) $(TM_P_H) insn-config.h $(EXPR_H) \ --- 2806,2812 ---- $(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) $(FUNCTION_H) $(TIMEVAR_H) $(TM_H) \ coretypes.h $(TREE_DUMP_H) $(EXCEPT_H) langhooks.h $(TREE_PASS_H) $(RTL_H) \ $(DIAGNOSTIC_H) $(TOPLEV_H) $(BASIC_BLOCK_H) $(FLAGS_H) debug.h $(PARAMS_H) \ ! value-prof.h $(TREE_INLINE_H) $(TARGET_H) $(SSAEXPAND_H) cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \ $(FLAGS_H) insn-config.h $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h \ output.h $(TOPLEV_H) $(FUNCTION_H) $(EXCEPT_H) $(TM_P_H) insn-config.h $(EXPR_H) \ Index: basic-block.h =================================================================== *** basic-block.h (revision 146576) --- basic-block.h (working copy) *************** extern void update_bb_for_insn (basic_bl *** 512,517 **** --- 512,518 ---- extern void insert_insn_on_edge (rtx, edge); basic_block split_edge_and_insert (edge, rtx); + extern void commit_one_edge_insertion (edge e); extern void commit_edge_insertions (void); extern void remove_fake_edges (void); Index: passes.c =================================================================== *** passes.c (revision 146576) --- passes.c (working copy) *************** init_optimization_passes (void) *** 707,723 **** NEXT_PASS (pass_local_pure_const); } NEXT_PASS (pass_cleanup_eh); - NEXT_PASS (pass_del_ssa); NEXT_PASS (pass_nrv); NEXT_PASS (pass_mark_used_blocks); NEXT_PASS (pass_cleanup_cfg_post_optimizing); - NEXT_PASS (pass_warn_function_noreturn); - NEXT_PASS (pass_free_datastructures); - NEXT_PASS (pass_mudflap_2); ! 
NEXT_PASS (pass_free_cfg_annotations); NEXT_PASS (pass_expand); NEXT_PASS (pass_rest_of_compilation); { struct opt_pass **p = &pass_rest_of_compilation.pass.sub; --- 707,723 ---- NEXT_PASS (pass_local_pure_const); } NEXT_PASS (pass_cleanup_eh); NEXT_PASS (pass_nrv); + NEXT_PASS (pass_mudflap_2); NEXT_PASS (pass_mark_used_blocks); NEXT_PASS (pass_cleanup_cfg_post_optimizing); NEXT_PASS (pass_warn_function_noreturn); ! /* NEXT_PASS (pass_del_ssa); ! NEXT_PASS (pass_free_datastructures); ! NEXT_PASS (pass_free_cfg_annotations);*/ NEXT_PASS (pass_expand); + NEXT_PASS (pass_rest_of_compilation); { struct opt_pass **p = &pass_rest_of_compilation.pass.sub; Index: cfgrtl.c =================================================================== *** cfgrtl.c (revision 146576) --- cfgrtl.c (working copy) *************** along with GCC; see the file COPYING3. *** 64,70 **** static int can_delete_note_p (const_rtx); static int can_delete_label_p (const_rtx); - static void commit_one_edge_insertion (edge); static basic_block rtl_split_edge (edge); static bool rtl_move_block_after (basic_block, basic_block); static int rtl_verify_flow_info (void); --- 64,69 ---- *************** try_redirect_by_replacing_jump (edge e, *** 856,886 **** return e; } ! /* Redirect edge representing branch of (un)conditional jump or tablejump, ! NULL on failure */ ! static edge ! redirect_branch_edge (edge e, basic_block target) { rtx tmp; - rtx old_label = BB_HEAD (e->dest); - basic_block src = e->src; - rtx insn = BB_END (src); - - /* We can only redirect non-fallthru edges of jump insn. */ - if (e->flags & EDGE_FALLTHRU) - return NULL; - else if (!JUMP_P (insn)) - return NULL; - /* Recognize a tablejump and adjust all matching cases. */ if (tablejump_p (insn, NULL, &tmp)) { rtvec vec; int j; ! rtx new_label = block_label (target); ! if (target == EXIT_BLOCK_PTR) ! return NULL; if (GET_CODE (PATTERN (tmp)) == ADDR_VEC) vec = XVEC (PATTERN (tmp), 0); else --- 855,879 ---- return e; } ! 
/* Subroutine of redirect_branch_edge that tries to patch the jump ! instruction INSN so that it reaches block NEW. Do this ! only when it originally reached block OLD. Return true if this ! worked or the original target wasn't OLD, return false if redirection ! doesn't work. */ ! ! static bool ! patch_jump_insn (rtx insn, rtx old_label, basic_block new_bb) { rtx tmp; /* Recognize a tablejump and adjust all matching cases. */ if (tablejump_p (insn, NULL, &tmp)) { rtvec vec; int j; ! rtx new_label = block_label (new_bb); ! if (new_bb == EXIT_BLOCK_PTR) ! return false; if (GET_CODE (PATTERN (tmp)) == ADDR_VEC) vec = XVEC (PATTERN (tmp), 0); else *************** redirect_branch_edge (edge e, basic_bloc *** 915,934 **** if (computed_jump_p (insn) /* A return instruction can't be redirected. */ || returnjump_p (insn)) ! return NULL; ! ! /* If the insn doesn't go where we think, we're confused. */ ! gcc_assert (JUMP_LABEL (insn) == old_label); ! /* If the substitution doesn't succeed, die. This can happen ! if the back end emitted unrecognizable instructions or if ! target is exit block on some arches. */ ! if (!redirect_jump (insn, block_label (target), 0)) { ! gcc_assert (target == EXIT_BLOCK_PTR); ! return NULL; } } if (dump_file) fprintf (dump_file, "Edge %i->%i redirected to %i\n", --- 908,962 ---- if (computed_jump_p (insn) /* A return instruction can't be redirected. */ || returnjump_p (insn)) ! return false; ! if (!currently_expanding_to_rtl || JUMP_LABEL (insn) == old_label) { ! /* If the insn doesn't go where we think, we're confused. */ ! gcc_assert (JUMP_LABEL (insn) == old_label); ! ! /* If the substitution doesn't succeed, die. This can happen ! if the back end emitted unrecognizable instructions or if ! target is exit block on some arches. */ ! if (!redirect_jump (insn, block_label (new_bb), 0)) ! { ! gcc_assert (new_bb == EXIT_BLOCK_PTR); ! return false; ! 
} } } + return true; + } + + + /* Redirect edge representing branch of (un)conditional jump or tablejump, + NULL on failure */ + static edge + redirect_branch_edge (edge e, basic_block target) + { + rtx old_label = BB_HEAD (e->dest); + basic_block src = e->src; + rtx insn = BB_END (src); + + /* We can only redirect non-fallthru edges of jump insn. */ + if (e->flags & EDGE_FALLTHRU) + return NULL; + else if (!JUMP_P (insn) && !currently_expanding_to_rtl) + return NULL; + + if (!currently_expanding_to_rtl) + { + if (!patch_jump_insn (insn, old_label, target)) + return NULL; + } + else + /* When expanding this BB might actually contain multiple + jumps (i.e. not yet split by find_many_sub_basic_blocks). + Redirect all of those that match our label. */ + for (insn = BB_HEAD (src); insn != NEXT_INSN (BB_END (src)); + insn = NEXT_INSN (insn)) + if (JUMP_P (insn) && !patch_jump_insn (insn, old_label, target)) + return NULL; if (dump_file) fprintf (dump_file, "Edge %i->%i redirected to %i\n", *************** insert_insn_on_edge (rtx pattern, edge e *** 1313,1319 **** /* Update the CFG for the instructions queued on edge E. */ ! static void commit_one_edge_insertion (edge e) { rtx before = NULL_RTX, after = NULL_RTX, insns, tmp, last; --- 1341,1347 ---- /* Update the CFG for the instructions queued on edge E. */ ! void commit_one_edge_insertion (edge e) { rtx before = NULL_RTX, after = NULL_RTX, insns, tmp, last; ^ permalink raw reply [flat|nested] 63+ messages in thread
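Editorial sketch, not part of the patch: the super-block handling in redirect_branch_edge above can be modelled in a few lines of standalone C. All toy_* names are hypothetical stand-ins (there is no rtx, JUMP_LABEL or redirect_jump here). The only point is the control flow while expanding: every jump in the block is visited, jumps to other labels are left alone, and a single jump that cannot be redirected makes the whole redirection fail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an RTL insn; this is not GCC's rtx.  */
struct toy_insn
{
  bool is_jump;      /* true for jump insns */
  bool redirectable; /* false models computed jumps and returns */
  int label;         /* jump target label, ignored for non-jumps */
};

/* Retarget one jump from OLD_LABEL to NEW_LABEL.  Return true if that
   worked or if the jump did not point at OLD_LABEL in the first place,
   false if the jump cannot be redirected at all.  */
static bool
toy_patch_jump (struct toy_insn *insn, int old_label, int new_label)
{
  if (!insn->redirectable)
    return false;
  if (insn->label == old_label)
    insn->label = new_label;
  return true;
}

/* A block that has not been split yet may hold several jumps, so look
   at all N insns instead of only the last one.  */
static bool
toy_redirect_in_super_block (struct toy_insn *insns, size_t n,
                             int old_label, int new_label)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (insns[i].is_jump
        && !toy_patch_jump (&insns[i], old_label, new_label))
      return false;
  return true;
}
```

With a block holding jumps to labels 3, 5 and 3, redirecting 3 to 7 in this model rewrites the first and last jump and leaves the middle one untouched.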
* Re: [RFA] expand from SSA form (1/2)
  2009-04-22 16:45 ` [RFA] " Michael Matz
@ 2009-04-23 15:10   ` Andrew MacLeod
  2009-04-24  9:42     ` Richard Guenther
  2009-04-26 20:27     ` Michael Matz
  2009-04-24 14:32   ` Richard Guenther
  ` (3 subsequent siblings)
  4 siblings, 2 replies; 63+ messages in thread
From: Andrew MacLeod @ 2009-04-23 15:10 UTC
To: Michael Matz; +Cc: gcc-patches, Andrey Belevantsev

[-- Attachment #1: Type: text/plain, Size: 173 bytes --]

Michael Matz wrote:
> On Wed, 22 Apr 2009, Michael Matz wrote:
>
>
> I'd like to ask for approval for the series.
>
>
> Ciao,
> Michael.
>
Looks good to me.

Andrew

[-- Attachment #2: com --]
[-- Type: text/plain, Size: 1229 bytes --]

> --- 233,251 ----
            if (gimple_assign_copy_p (stmt)
                && gimple_assign_lhs (stmt) == result
                && gimple_assign_rhs1 (stmt) == found)
> !             {
> !               unlink_stmt_vdef (stmt);
> !               gsi_remove (&gsi, true);
> !             }
            else

Unrelated to your patch, but out of curiosity, shouldn't gsi_remove()
automatically do an unlink_stmt_vdef() if the remove_permanently flag is
true like it is here?  Seems like an oversight bug waiting to happen...

> --- 2537,2584 ----
>     rebuild_jump_labels (get_insns ());
>     find_exception_handler_labels ();
>
> +   FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, EXIT_BLOCK_PTR, next_bb)
> +     {
> +       edge e;
> +       edge_iterator ei;
> +       for (ei = ei_start (bb->succs); (e = ei_safe_edge (ei)); )
> +         {
> +           if (e->insns.r)
> +             commit_one_edge_insertion (e);
> +           else
> +             ei_next (&ei);
> +         }
> +     }

I do think this should be fixed in commit_edge_insertions and the common
code factored out and called from here, but that can be done separately.
Then commit_one_edge_insertion() could return to being a static function
as well.

I'm guessing current_ir_type() is not IR_RTL_CFGLAYOUT, or you could just
call commit_edge_insertions directly...
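Editorial sketch, not part of the thread: the loop quoted in the message above only advances the iterator when an edge has nothing queued. One way to read that, assuming (this is not stated in the thread) that committing an edge clears its queue and may change what that successor slot leads to, is that the slot gets looked at once more instead of being skipped. The toy_* names below are hypothetical and do not correspond to GCC's edge or edge_iterator types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a CFG edge; this is not GCC's edge type.  */
struct toy_edge
{
  int pending; /* number of insns queued on the edge */
  int dest;    /* index of the destination block */
};

/* Commit the queued insns of E by "splitting" it: the edge now leads to
   a fresh block numbered *NEXT_BLOCK and nothing remains queued.  */
static void
toy_commit_one_edge (struct toy_edge *e, int *next_block)
{
  e->dest = (*next_block)++;
  e->pending = 0;
}

/* Walk all N successor edges, stepping forward only past edges that
   have nothing queued.  Return the number of commits performed.  */
static int
toy_commit_all_edges (struct toy_edge *succs, size_t n, int *next_block)
{
  int commits = 0;
  size_t i = 0;

  while (i < n)
    {
      if (succs[i].pending)
        {
          /* Do not advance: the same slot is examined again.  */
          toy_commit_one_edge (&succs[i], next_block);
          commits++;
        }
      else
        i++;
    }
  return commits;
}
```

The loop terminates because every commit empties the queue of the edge it touched, so each slot is revisited at most once.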
* Re: [RFA] expand from SSA form (1/2)
  2009-04-23 15:10   ` Andrew MacLeod
@ 2009-04-24  9:42     ` Richard Guenther
  2009-04-26 20:27     ` Michael Matz
  1 sibling, 0 replies; 63+ messages in thread
From: Richard Guenther @ 2009-04-24 9:42 UTC
To: Andrew MacLeod; +Cc: Michael Matz, gcc-patches, Andrey Belevantsev

On Thu, Apr 23, 2009 at 4:30 PM, Andrew MacLeod <amacleod@redhat.com> wrote:
> Michael Matz wrote:
>>
>> On Wed, 22 Apr 2009, Michael Matz wrote:
>>
>> I'd like to ask for approval for the series.
>>
>>
>> Ciao,
>> Michael.
>>
>
> Looks good to me.
>
> Andrew
>
>
>> --- 233,251 ----
>            if (gimple_assign_copy_p (stmt)
>                && gimple_assign_lhs (stmt) == result
>                && gimple_assign_rhs1 (stmt) == found)
>> !             {
>> !               unlink_stmt_vdef (stmt);
>> !               gsi_remove (&gsi, true);
>> !             }
>            else
>
> Unrelated to your patch, but out of curiosity, shouldn't gsi_remove()
> automatically do an unlink_stmt_vdef() if the remove_permanently flag is
> true like it is here?  Seems like an oversight bug waiting to happen...

No. As with release_defs, unlink_stmt_vdef should not be done if you
plan to re-use the virtual operands of the stmt, for example when
replacing the stmt with a newly built one.

Richard.
* Re: [RFA] expand from SSA form (1/2)
  2009-04-23 15:10   ` Andrew MacLeod
  2009-04-24  9:42     ` Richard Guenther
@ 2009-04-26 20:27     ` Michael Matz
  2010-01-19 15:48       ` H.J. Lu
  1 sibling, 1 reply; 63+ messages in thread
From: Michael Matz @ 2009-04-26 20:27 UTC
To: Andrew MacLeod; +Cc: gcc-patches, Andrey Belevantsev

Hi,

On Thu, 23 Apr 2009, Andrew MacLeod wrote:

> Looks good to me.

It's now in.

> I do think this should be fixed in commit_edge_insertions and the common
> code factored out and called from here, but that can be done separately.
> Then commit_one_edge_insertion() could return to being a static function
> as well.

Yes, I plan to do that.  There are also some other cleanups on the plate.

> I'm guessing current_ir_type() is not IR_RTL_CFGLAYOUT, or you could just
> call commit_edge_insertions directly...

No, unfortunately it's not.


Ciao,
Michael.
* Re: [RFA] expand from SSA form (1/2)
  2009-04-26 20:27     ` Michael Matz
@ 2010-01-19 15:48       ` H.J. Lu
  0 siblings, 0 replies; 63+ messages in thread
From: H.J. Lu @ 2010-01-19 15:48 UTC
To: Michael Matz; +Cc: Andrew MacLeod, gcc-patches, Andrey Belevantsev

On Sun, Apr 26, 2009 at 12:21 PM, Michael Matz <matz@suse.de> wrote:
> Hi,
>
> On Thu, 23 Apr 2009, Andrew MacLeod wrote:
>
>> Looks good to me.
>
> It's now in.
>

It caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42800

--
H.J.
* Re: [RFA] expand from SSA form (1/2)
  2009-04-22 16:45 ` [RFA] " Michael Matz
  2009-04-23 15:10   ` Andrew MacLeod
@ 2009-04-24 14:32   ` Richard Guenther
  2009-04-24 14:46     ` Richard Guenther
  2009-04-27  5:47   ` H.J. Lu
  ` (2 subsequent siblings)
  4 siblings, 1 reply; 63+ messages in thread
From: Richard Guenther @ 2009-04-24 14:32 UTC
To: Michael Matz; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev

On Wed, Apr 22, 2009 at 6:42 PM, Michael Matz <matz@suse.de> wrote:
> On Wed, 22 Apr 2009, Michael Matz wrote:
>
>> I'll soon send a new version of the patch that fixes all problems and
>> testcases I encountered.
>
> Like so.  This is the full patch, i.e. including the cleanups, but
> excluding the testsuite changes.  It should incorporate all feedback.
> Compared to the last version it adds comments for new functions, fixes
> mudflap2, and generally fixes some other minor problems that showed up
> when I started testing Ada, plus a bug reported by Andrey.
>
> This patch (plus testsuite changes) was bootstrapped with Ada on
> x86_64-linux.  There are no testsuite regressions:
> FAIL: gcc.dg/tree-prof/bb-reorg.c compilation,  -fprofile-use -D_PROFILE_USE
> FAIL: gcc.dg/tree-prof/pr34999.c compilation,  -fprofile-use -D_PROFILE_USE
> FAIL: gcc.target/i386/avx-vmovntdq-256-1.c (test for excess errors)
> FAIL: gcc.target/i386/avx-vmovntpd-256-1.c (test for excess errors)
> FAIL: gcc.target/i386/avx-vmovntps-256-1.c (test for excess errors)
> FAIL: libmudflap.c++/pass41-frag.cxx execution test
> FAIL: libmudflap.c++/pass41-frag.cxx (-static) execution test
> FAIL: libmudflap.c++/pass41-frag.cxx (-O2) execution test
> FAIL: libmudflap.c++/pass41-frag.cxx (-O3) execution test
>
> All of these happen without the patch too (known bugs, old binutils, and
> pass41-frag never seems to work anyway).
>
> I'd like to ask for approval for the series.
>
>
> Ciao,
> Michael.
> --
>        * builtins.c (fold_builtin_next_arg): Handle SSA names.
> * tree-ssa-copyrename.c (rename_ssa_copies): Don't iterate > beyond num_ssa_names, use ssa_name() directly. > * tree-ssa-ter.c (free_temp_expr_table): Likewise. > * tree-ssa-coalesce.c (create_outofssa_var_map): Likewise, > mark only useful SSA names. > (compare_pairs): Swap cost comparison. > (coalesce_ssa_name): Don't use change_partition_var. > * tree-nrv.c (struct nrv_data): Add modified member. > (finalize_nrv_r): Set it. > (tree_nrv): Use it to update statements. > (pass_nrv): Require PROP_ssa. > * tree-mudflap.c (create_referenced_var): New static helper. > (mf_decl_cache_locals, mf_build_check_statement_for): Use it. > (pass_mudflap_2): Require PROP_ssa, run ssa update at finish. > * alias.c (find_base_decl): Handle SSA names. > * emit-rtl (set_reg_attrs_for_parm): Make non-static. > (component_ref_for_mem_expr): Don't leak SSA names into RTL. > * rtl.h (set_reg_attrs_for_parm): Declare. > * tree-optimize.c (pass_cleanup_cfg_post_optimizing): Rename > to "optimized", remove unused locals at finish. > (execute_free_datastructures): Make global, call > delete_tree_cfg_annotations. > (execute_free_cfg_annotations): Don't call > delete_tree_cfg_annotations. > > * ssaexpand.h: New file. > * expr.c (toplevel): Include ssaexpand.h. > (expand_assignment): Handle SSA names the same as register > variables. > (expand_expr_real_1): Expand SSA names. > * cfgexpand.c (toplevel): Include ssaexpand.h. > (SA): New global variable. > (gimple_cond_pred_to_tree): Fold TERed comparisons into predicates. > (SSAVAR): New macro. > (set_rtl): New helper function. > (add_stack_var): Deal with SSA names, use set_rtl. > (expand_one_stack_var_at): Likewise. > (expand_one_stack_var): Deal with SSA names. > (stack_var_size_cmp): Use code (SSA_NAME / DECL) as tie breaker > before unique numbers. > (expand_stack_vars): Use set_rtl. > (expand_one_var): Accept SSA names, add asserts for them, feed them > to above subroutines. 
> (expand_used_vars): Expand all partitions (without default defs), > then only the local decls (ignoring those expanded already). > (expand_gimple_cond): Remove edges when jumpif() expands an > unconditional jump. > (expand_gimple_basic_block): Don't clear EDGE_EXECUTABLE here, > or remove abnormal edges. Ignore insns setting the LHS of a TERed > SSA name. > (gimple_expand_cfg): Call into rewrite_out_of_ssa, initialize > members of SA; deal with PARM_DECL partitions here; expand > all PHI nodes, free tree datastructures and SA. Commit instructions > on edges, clear EDGE_EXECUTABLE and remove abnormal edges here. > (pass_expand): Require and destroy PROP_ssa, verify SSA form, flow > info and statements at start, collect garbage at finish. > * tree-ssa-live.h (struct _var_map): Remove partition_to_var member. > (VAR_ANN_PARTITION) Remove. > (change_partition_var): Don't declare. > (partition_to_var): Always return SSA names. > (var_to_partition): Only accept SSA names. > (register_ssa_partition): Only check argument. > * tree-ssa-live.c (init_var_map): Don't allocate partition_to_var > member. > (delete_var_map): Don't free it. > (var_union): Only accept SSA names, simplify. > (partition_view_init): Mark only useful SSA names as used. > (partition_view_fini): Only deal with SSA names. > (change_partition_var): Remove. > (dump_var_map): Use ssa_name instead of partition_to_var member. > * tree-ssa.c (delete_tree_ssa): Don't remove PHI nodes on RTL > basic blocks. > * tree-outof-ssa.c (toplevel): Include ssaexpand.h and expr.h. > (struct _elim_graph): New member const_dests; nodes member vector of > ints. > (set_location_for_edge): New static helper. > (create_temp): Remove. > (insert_partition_copy_on_edge, insert_part_to_rtx_on_edge, > insert_value_copy_on_edge, insert_rtx_to_part_on_edge): New > functions. > (new_elim_graph): Allocate const_dests member. > (clean_elim_graph): Truncate const_dests member. > (delete_elim_graph): Free const_dests member. 
> (elim_graph_size): Adapt to new type of nodes member. > (elim_graph_add_node): Likewise. > (eliminate_name): Likewise. > (eliminate_build): Don't take basic block argument, deal only with > partition numbers, not variables. > (get_temp_reg): New static helper. > (elim_create): Use it, deal with RTL temporaries instead of trees. > (eliminate_phi): Adjust all calls to new signature. > (assign_vars, replace_use_variable, replace_def_variable): Remove. > (rewrite_trees): Only do checking. > (edge_leader, stmt_list, leader_has_match, leader_match): Remove. > (same_stmt_list_p, identical_copies_p, identical_stmt_lists_p, > init_analyze_edges_for_bb, fini_analyze_edges_for_bb, > contains_tree_r, MAX_STMTS_IN_LATCH, > process_single_block_loop_latch, analyze_edges_for_bb, > perform_edge_inserts): Remove. > (expand_phi_nodes): New global function. > (remove_ssa_form): Take ssaexpand parameter. Don't call removed > functions, initialize new parameter, remember partitions having a > default def. > (finish_out_of_ssa): New global function. > (rewrite_out_of_ssa): Make global. Adjust call to remove_ssa_form, > don't reset in_ssa_p here. > (pass_del_ssa): Remove. > * tree-flow.h (struct var_ann_d): Remove out_of_ssa_tag and > partition members. > (execute_free_datastructures): Declare. > * Makefile.in (SSAEXPAND_H): New variable. > (tree-outof-ssa.o, expr.o, cfgexpand.o): Depend on SSAEXPAND_H. > * basic-block.h (commit_one_edge_insertion): Declare. > * passes.c (init_optimization_passes): Move pass_nrv and > pass_mudflap2 before pass_cleanup_cfg_post_optimizing, remove > pass_del_ssa, pass_free_datastructures, pass_free_cfg_annotations. > * cfgrtl.c (commit_one_edge_insertion): Make global, don't declare. > (redirect_branch_edge): Deal with super block when expanding, split > out jump patching itself into ... > (patch_jump_insn): ... here, new static helper. Some comments inline... 
> Index: tree-ssa-copyrename.c > =================================================================== > *** tree-ssa-copyrename.c (revision 146576) > --- tree-ssa-copyrename.c (working copy) > *************** rename_ssa_copies (void) > *** 291,297 **** > else > debug = NULL; > > ! map = init_var_map (num_ssa_names + 1); > > FOR_EACH_BB (bb) > { > --- 291,297 ---- > else > debug = NULL; > > ! map = init_var_map (num_ssa_names); > > FOR_EACH_BB (bb) > { > *************** rename_ssa_copies (void) > *** 339,350 **** > /* Now one more pass to make all elements of a partition share the same > root variable. */ > > ! for (x = 1; x <= num_ssa_names; x++) > { > part_var = partition_to_var (map, x); > if (!part_var) > continue; > ! var = map->partition_to_var[x]; > if (debug) > { > if (SSA_NAME_VAR (var) != SSA_NAME_VAR (part_var)) > --- 339,350 ---- > /* Now one more pass to make all elements of a partition share the same > root variable. */ > > ! for (x = 1; x < num_ssa_names; x++) > { > part_var = partition_to_var (map, x); > if (!part_var) > continue; > ! var = ssa_name (x); > if (debug) > { > if (SSA_NAME_VAR (var) != SSA_NAME_VAR (part_var)) This piece is ok as obvious. Please commit it separately. > Index: cfgexpand.c > =================================================================== > *** cfgexpand.c (revision 146576) > --- cfgexpand.c (working copy) > *************** along with GCC; see the file COPYING3. > *** 42,49 **** > --- 42,54 ---- > #include "tree-inline.h" > #include "value-prof.h" > #include "target.h" > + #include "ssaexpand.h" > > > + /* This variable holds information helping the rewriting of SSA trees > + into RTL. */ > + struct ssaexpand SA; > + > /* Return an expression tree corresponding to the RHS of GIMPLE > statement STMT. */ > > *************** gimple_assign_rhs_to_tree (gimple stmt) > *** 78,85 **** > static tree > gimple_cond_pred_to_tree (gimple stmt) > { > return build2 (gimple_cond_code (stmt), boolean_type_node, > ! 
gimple_cond_lhs (stmt), gimple_cond_rhs (stmt)); > } > > /* Helper for gimple_to_tree. Set EXPR_LOCATION for every expression > --- 83,104 ---- > static tree > gimple_cond_pred_to_tree (gimple stmt) > { > + /* We're sometimes presented with such code: > + D.123_1 = x < y; > + if (D.123_1 != 0) > + ... > + This would expand to two comparisons which then later might > + be cleaned up by combine. But some pattern matchers like if-conversion > + work better when there's only one compare, so make up for this > + here as special exception if TER would have made the same change. */ > + tree lhs = gimple_cond_lhs (stmt); > + if (SA.values > + && TREE_CODE (lhs) == SSA_NAME > + && SA.values[SSA_NAME_VERSION (lhs)]) > + lhs = gimple_assign_rhs_to_tree (SA.values[SSA_NAME_VERSION (lhs)]); Do we really need the SA.values array here? It seems that a bitmap would be enough, as the definition should be still reachable via SSA_NAME_DEF_STMT (lhs). (Thanks Paolo for noticing this) > + > return build2 (gimple_cond_code (stmt), boolean_type_node, > ! lhs, gimple_cond_rhs (stmt)); > } > > /* Helper for gimple_to_tree. Set EXPR_LOCATION for every expression > *************** failed: > *** 423,428 **** > --- 442,464 ---- > #define STACK_ALIGNMENT_NEEDED 1 > #endif > > + #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x) This should be in some public place, ideally with a more sound name. Any ideas? > + /* Associate declaration T with storage space X. If T is no > + SSA name this is exactly SET_DECL_RTL, otherwise make the > + partition of T associated with X. */ > + static inline void > + set_rtl (tree t, rtx x) > + { > + if (TREE_CODE (t) == SSA_NAME) > + { > + SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x; > + if (x && !MEM_P (x)) > + set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x); > + } > + else > + SET_DECL_RTL (t, x); > + } > > /* This structure holds data relevant to one variable that will be > placed in a stack slot. 
*/ > *************** add_stack_var (tree decl) > *** 561,575 **** > } > stack_vars[stack_vars_num].decl = decl; > stack_vars[stack_vars_num].offset = 0; > ! stack_vars[stack_vars_num].size = tree_low_cst (DECL_SIZE_UNIT (decl), 1); > ! stack_vars[stack_vars_num].alignb = get_decl_align_unit (decl); > > /* All variables are initially in their own partition. */ > stack_vars[stack_vars_num].representative = stack_vars_num; > stack_vars[stack_vars_num].next = EOC; > > /* Ensure that this decl doesn't get put onto the list twice. */ > ! SET_DECL_RTL (decl, pc_rtx); > > stack_vars_num++; > } > --- 597,611 ---- > } > stack_vars[stack_vars_num].decl = decl; > stack_vars[stack_vars_num].offset = 0; > ! stack_vars[stack_vars_num].size = tree_low_cst (DECL_SIZE_UNIT (SSAVAR (decl)), 1); > ! stack_vars[stack_vars_num].alignb = get_decl_align_unit (SSAVAR (decl)); > > /* All variables are initially in their own partition. */ > stack_vars[stack_vars_num].representative = stack_vars_num; > stack_vars[stack_vars_num].next = EOC; > > /* Ensure that this decl doesn't get put onto the list twice. */ > ! set_rtl (decl, pc_rtx); > > stack_vars_num++; > } > *************** add_alias_set_conflicts (void) > *** 688,709 **** > } > > /* A subroutine of partition_stack_vars. A comparison function for qsort, > ! sorting an array of indices by the size of the object. */ > > static int > stack_var_size_cmp (const void *a, const void *b) > { > HOST_WIDE_INT sa = stack_vars[*(const size_t *)a].size; > HOST_WIDE_INT sb = stack_vars[*(const size_t *)b].size; > ! unsigned int uida = DECL_UID (stack_vars[*(const size_t *)a].decl); > ! unsigned int uidb = DECL_UID (stack_vars[*(const size_t *)b].decl); > > if (sa < sb) > return -1; > if (sa > sb) > return 1; > ! /* For stack variables of the same size use the uid of the decl > ! to make the sort stable. */ > if (uida < uidb) > return -1; > if (uida > uidb) > --- 724,760 ---- > } > > /* A subroutine of partition_stack_vars. 
A comparison function for qsort, > ! sorting an array of indices by the size and type of the object. */ > > static int > stack_var_size_cmp (const void *a, const void *b) > { > HOST_WIDE_INT sa = stack_vars[*(const size_t *)a].size; > HOST_WIDE_INT sb = stack_vars[*(const size_t *)b].size; > ! tree decla, declb; > ! unsigned int uida, uidb; > > if (sa < sb) > return -1; > if (sa > sb) > return 1; > ! decla = stack_vars[*(const size_t *)a].decl; > ! declb = stack_vars[*(const size_t *)b].decl; > ! /* For stack variables of the same size use and id of the decls > ! to make the sort stable. Two SSA names are compared by their > ! version, SSA names come before non-SSA names, and two normal > ! decls are compared by their DECL_UID. */ > ! if (TREE_CODE (decla) == SSA_NAME) > ! { > ! if (TREE_CODE (declb) == SSA_NAME) > ! uida = SSA_NAME_VERSION (decla), uidb = SSA_NAME_VERSION (declb); > ! else > ! return -1; > ! } > ! else if (TREE_CODE (declb) == SSA_NAME) > ! return 1; > ! else > ! uida = DECL_UID (decla), uidb = DECL_UID (declb); > if (uida < uidb) > return -1; > if (uida > uidb) > *************** expand_one_stack_var_at (tree decl, HOST > *** 874,894 **** > gcc_assert (offset == trunc_int_for_mode (offset, Pmode)); > > x = plus_constant (virtual_stack_vars_rtx, offset); > ! x = gen_rtx_MEM (DECL_MODE (decl), x); > > ! /* Set alignment we actually gave this decl. */ > ! offset -= frame_phase; > ! align = offset & -offset; > ! align *= BITS_PER_UNIT; > ! if (align == 0) > ! align = STACK_BOUNDARY; > ! else if (align > MAX_SUPPORTED_STACK_ALIGNMENT) > ! align = MAX_SUPPORTED_STACK_ALIGNMENT; > ! DECL_ALIGN (decl) = align; > ! DECL_USER_ALIGN (decl) = 0; > > ! set_mem_attributes (x, decl, true); > ! SET_DECL_RTL (decl, x); > } > > /* A subroutine of expand_used_vars. Give each partition representative > --- 925,951 ---- > gcc_assert (offset == trunc_int_for_mode (offset, Pmode)); > > x = plus_constant (virtual_stack_vars_rtx, offset); > ! 
x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x); > > ! if (TREE_CODE (decl) != SSA_NAME) > ! { > ! /* Set alignment we actually gave this decl if it isn't an SSA name. > ! If it is we generate stack slots only accidentally so it isn't as > ! important, we'll simply use the alignment that is already set. */ > ! offset -= frame_phase; > ! align = offset & -offset; > ! align *= BITS_PER_UNIT; > ! if (align == 0) > ! align = STACK_BOUNDARY; > ! else if (align > MAX_SUPPORTED_STACK_ALIGNMENT) > ! align = MAX_SUPPORTED_STACK_ALIGNMENT; > > ! DECL_ALIGN (decl) = align; > ! DECL_USER_ALIGN (decl) = 0; > ! } > ! > ! set_mem_attributes (x, SSAVAR (decl), true); > ! set_rtl (decl, x); > } > > /* A subroutine of expand_used_vars. Give each partition representative > *************** expand_stack_vars (bool (*pred) (tree)) > *** 912,918 **** > > /* Skip variables that have already had rtl assigned. See also > add_stack_var where we perpetrate this pc_rtx hack. */ > ! if (DECL_RTL (stack_vars[i].decl) != pc_rtx) > continue; > > /* Check the predicate to see whether this variable should be > --- 969,977 ---- > > /* Skip variables that have already had rtl assigned. See also > add_stack_var where we perpetrate this pc_rtx hack. */ > ! if ((TREE_CODE (stack_vars[i].decl) == SSA_NAME > ! ? SA.partition_to_pseudo[var_to_partition (SA.map, stack_vars[i].decl)] > ! : DECL_RTL (stack_vars[i].decl)) != pc_rtx) > continue; > > /* Check the predicate to see whether this variable should be > *************** account_stack_vars (void) > *** 951,957 **** > > size += stack_vars[i].size; > for (j = i; j != EOC; j = stack_vars[j].next) > ! SET_DECL_RTL (stack_vars[j].decl, NULL); > } > return size; > } > --- 1010,1016 ---- > > size += stack_vars[i].size; > for (j = i; j != EOC; j = stack_vars[j].next) > ! set_rtl (stack_vars[j].decl, NULL); > } > return size; > } > *************** expand_one_stack_var (tree var) > *** 964,971 **** > { > HOST_WIDE_INT size, offset, align; > > ! 
size = tree_low_cst (DECL_SIZE_UNIT (var), 1); > ! align = get_decl_align_unit (var); > offset = alloc_stack_frame_space (size, align); > > expand_one_stack_var_at (var, offset); > --- 1023,1030 ---- > { > HOST_WIDE_INT size, offset, align; > > ! size = tree_low_cst (DECL_SIZE_UNIT (SSAVAR (var)), 1); > ! align = get_decl_align_unit (SSAVAR (var)); > offset = alloc_stack_frame_space (size, align); > > expand_one_stack_var_at (var, offset); > *************** expand_one_hard_reg_var (tree var) > *** 986,1005 **** > static void > expand_one_register_var (tree var) > { > ! tree type = TREE_TYPE (var); > int unsignedp = TYPE_UNSIGNED (type); > enum machine_mode reg_mode > ! = promote_mode (type, DECL_MODE (var), &unsignedp, 0); > rtx x = gen_reg_rtx (reg_mode); > > ! SET_DECL_RTL (var, x); > > /* Note if the object is a user variable. */ > ! if (!DECL_ARTIFICIAL (var)) > ! mark_user_reg (x); > > if (POINTER_TYPE_P (type)) > ! mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var)))); > } > > /* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL that > --- 1045,1065 ---- > static void > expand_one_register_var (tree var) > { > *************** struct rtl_opt_pass pass_expand = > *** 2471,2480 **** > 0, /* static_pass_number */ > TV_EXPAND, /* tv_id */ > /* ??? If TER is enabled, we actually receive GENERIC. */ > ! PROP_gimple_leh | PROP_cfg, /* properties_required */ > PROP_rtl, /* properties_provided */ > ! PROP_trees, /* properties_destroyed */ > ! 0, /* todo_flags_start */ > ! TODO_dump_func, /* todo_flags_finish */ > } > }; > --- 2644,2655 ---- > 0, /* static_pass_number */ > TV_EXPAND, /* tv_id */ > /* ??? If TER is enabled, we actually receive GENERIC. */ This is no longer true ;) > ! PROP_ssa | PROP_gimple_leh | PROP_cfg, /* properties_required */ > PROP_rtl, /* properties_provided */ > ! PROP_ssa | PROP_trees, /* properties_destroyed */ > ! TODO_verify_ssa | TODO_verify_flow > ! | TODO_verify_stmts, /* todo_flags_start */ > ! 
TODO_dump_func > ! | TODO_ggc_collect /* todo_flags_finish */ > } > }; > Index: tree-mudflap.c > =================================================================== > *** tree-mudflap.c (revision 146576) > --- tree-mudflap.c (working copy) > *************** execute_mudflap_function_ops (void) > *** 447,452 **** > --- 447,465 ---- > return 0; > } > > + /* Construct a new temporary variable with TYPE > + as type and PREFIX as name prefix, add it to referenced vars > + and mark it for renaming. */ > + > + static tree > + create_referenced_var (tree type, const char *prefix) > + { > + tree var = create_tmp_var (type, prefix); > + add_referenced_var (var); > + mark_sym_for_renaming (var); > + return var; > + } This is exactly the same as make_rename_temp () from tree-dfa.c, so use that. > /* Create and initialize local shadow variables for the lookup cache > globals. Put their decls in the *_l globals for use by > mf_build_check_statement_for. */ > *************** mf_decl_cache_locals (void) > *** 459,469 **** > > /* Build the cache vars. */ > mf_cache_shift_decl_l > ! = mf_mark (create_tmp_var (TREE_TYPE (mf_cache_shift_decl), > "__mf_lookup_shift_l")); > > mf_cache_mask_decl_l > ! = mf_mark (create_tmp_var (TREE_TYPE (mf_cache_mask_decl), > "__mf_lookup_mask_l")); > > /* Build initialization nodes for the cache vars. We just load the > --- 472,482 ---- > > /* Build the cache vars. */ > mf_cache_shift_decl_l > ! = mf_mark (create_referenced_var (TREE_TYPE (mf_cache_shift_decl), > "__mf_lookup_shift_l")); > > mf_cache_mask_decl_l > ! = mf_mark (create_referenced_var (TREE_TYPE (mf_cache_mask_decl), > "__mf_lookup_mask_l")); > > /* Build initialization nodes for the cache vars. We just load the > *************** mf_build_check_statement_for (tree base, > *** 546,554 **** > } > > /* Build our local variables. */ > ! mf_elem = create_tmp_var (mf_cache_structptr_type, "__mf_elem"); > ! mf_base = create_tmp_var (mf_uintptr_type, "__mf_base"); > ! 
mf_limit = create_tmp_var (mf_uintptr_type, "__mf_limit"); > > /* Build: __mf_base = (uintptr_t) <base address expression>. */ > seq = gimple_seq_alloc (); > --- 559,567 ---- > } > > /* Build our local variables. */ > ! mf_elem = create_referenced_var (mf_cache_structptr_type, "__mf_elem"); > ! mf_base = create_referenced_var (mf_uintptr_type, "__mf_base"); > ! mf_limit = create_referenced_var (mf_uintptr_type, "__mf_limit"); > > /* Build: __mf_base = (uintptr_t) <base address expression>. */ > seq = gimple_seq_alloc (); > *************** mf_build_check_statement_for (tree base, > *** 627,633 **** > t = build2 (TRUTH_OR_EXPR, boolean_type_node, t, u); > t = force_gimple_operand (t, &stmts, false, NULL_TREE); > gimple_seq_add_seq (&seq, stmts); > ! cond = create_tmp_var (boolean_type_node, "__mf_unlikely_cond"); > g = gimple_build_assign (cond, t); > gimple_set_location (g, location); > gimple_seq_add_stmt (&seq, g); > --- 640,646 ---- > t = build2 (TRUTH_OR_EXPR, boolean_type_node, t, u); > t = force_gimple_operand (t, &stmts, false, NULL_TREE); > gimple_seq_add_seq (&seq, stmts); > ! cond = create_referenced_var (boolean_type_node, "__mf_unlikely_cond"); > g = gimple_build_assign (cond, t); > gimple_set_location (g, location); > gimple_seq_add_stmt (&seq, g); > *************** struct gimple_opt_pass pass_mudflap_2 = > *** 1366,1377 **** > NULL, /* next */ > 0, /* static_pass_number */ > TV_NONE, /* tv_id */ > ! PROP_gimple_leh, /* properties_required */ > 0, /* properties_provided */ > 0, /* properties_destroyed */ > 0, /* todo_flags_start */ > TODO_verify_flow | TODO_verify_stmts > ! | TODO_dump_func /* todo_flags_finish */ > } > }; > > --- 1379,1390 ---- > NULL, /* next */ > 0, /* static_pass_number */ > TV_NONE, /* tv_id */ > ! PROP_ssa | PROP_cfg | PROP_gimple_leh,/* properties_required */ > 0, /* properties_provided */ > 0, /* properties_destroyed */ > 0, /* todo_flags_start */ > TODO_verify_flow | TODO_verify_stmts > ! 
| TODO_dump_func | TODO_update_ssa /* todo_flags_finish */ > } > }; > > Index: tree-ssa-ter.c > =================================================================== > *** tree-ssa-ter.c (revision 146576) > --- tree-ssa-ter.c (working copy) > *************** free_temp_expr_table (temp_expr_table_p > *** 225,231 **** > unsigned x; > for (x = 0; x <= num_var_partitions (t->map); x++) > gcc_assert (!t->kill_list[x]); > ! for (x = 0; x < num_ssa_names + 1; x++) > { > gcc_assert (t->expr_decl_uids[x] == NULL); > gcc_assert (t->partition_dependencies[x] == NULL); > --- 225,231 ---- > unsigned x; > for (x = 0; x <= num_var_partitions (t->map); x++) > gcc_assert (!t->kill_list[x]); > ! for (x = 0; x < num_ssa_names; x++) > { > gcc_assert (t->expr_decl_uids[x] == NULL); > gcc_assert (t->partition_dependencies[x] == NULL); Obvious, commit with the other similar piece separately. > Index: tree-ssa.c > =================================================================== > *** tree-ssa.c (revision 146576) > --- tree-ssa.c (working copy) > *************** delete_tree_ssa (void) > *** 844,850 **** > > gimple_set_modified (stmt, true); > } > ! set_phi_nodes (bb, NULL); > } > > /* Remove annotations from every referenced local variable. */ > --- 844,851 ---- > > gimple_set_modified (stmt, true); > } > ! if (!(bb->flags & BB_RTL)) > ! set_phi_nodes (bb, NULL); > } > > /* Remove annotations from every referenced local variable. 
*/ > Index: rtl.h > =================================================================== > *** rtl.h (revision 146576) > --- rtl.h (working copy) > *************** extern rtx gen_int_mode (HOST_WIDE_INT, > *** 1494,1499 **** > --- 1494,1500 ---- > extern rtx emit_copy_of_insn_after (rtx, rtx); > extern void set_reg_attrs_from_value (rtx, rtx); > extern void set_reg_attrs_for_parm (rtx, rtx); > + extern void set_reg_attrs_for_decl_rtl (tree t, rtx x); > extern void adjust_reg_mode (rtx, enum machine_mode); > extern int mem_expr_equal_p (const_tree, const_tree); > > Index: tree-optimize.c > =================================================================== > *** tree-optimize.c (revision 146576) > --- tree-optimize.c (working copy) > *************** struct gimple_opt_pass pass_cleanup_cfg_ > *** 201,207 **** > { > { > GIMPLE_PASS, > ! "final_cleanup", /* name */ > NULL, /* gate */ > execute_cleanup_cfg_post_optimizing, /* execute */ > NULL, /* sub */ > --- 201,207 ---- > { > { > GIMPLE_PASS, > ! "optimized", /* name */ > NULL, /* gate */ > execute_cleanup_cfg_post_optimizing, /* execute */ > NULL, /* sub */ > *************** struct gimple_opt_pass pass_cleanup_cfg_ > *** 213,225 **** > 0, /* properties_destroyed */ > 0, /* todo_flags_start */ > TODO_dump_func /* todo_flags_finish */ > } > }; > > /* Pass: do the actions required to finish with tree-ssa optimization > passes. */ > > ! static unsigned int > execute_free_datastructures (void) > { > free_dominance_info (CDI_DOMINATORS); > --- 213,226 ---- > 0, /* properties_destroyed */ > 0, /* todo_flags_start */ > TODO_dump_func /* todo_flags_finish */ > + | TODO_remove_unused_locals > } > }; > > /* Pass: do the actions required to finish with tree-ssa optimization > passes. */ > > ! unsigned int > execute_free_datastructures (void) > { > free_dominance_info (CDI_DOMINATORS); > *************** execute_free_datastructures (void) > *** 228,233 **** > --- 229,238 ---- > /* Remove the ssa structures. 
*/ > if (cfun->gimple_df) > delete_tree_ssa (); > + > + /* And get rid of annotations we no longer need. */ > + delete_tree_cfg_annotations (); > + > return 0; > } > > *************** struct gimple_opt_pass pass_free_datastr > *** 254,262 **** > static unsigned int > execute_free_cfg_annotations (void) > { > - /* And get rid of annotations we no longer need. */ > - delete_tree_cfg_annotations (); > - > return 0; > } It looks like the pass referencing this is now dead. Please remove this function and the pass structure. > [Message clipped] bah ... ;) Other half in next mail. Richard. ^ permalink raw reply [flat|nested] 63+ messages in thread
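For readers following the stack_var_size_cmp hunk quoted in this message, the tie-breaking order it describes (size first, then SSA names before ordinary decls, then the version or uid) can be tried outside GCC. What follows is only a minimal sketch under stated assumptions: struct stack_var_sketch, sketch_vars, sketch_size_cmp and sketch_sort are hypothetical stand-ins invented for illustration, not GCC's stack_vars array or any GCC API, and the is_ssa_name flag stands in for the TREE_CODE (decl) == SSA_NAME test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one stack variable entry.  "id" plays the
   role of SSA_NAME_VERSION for SSA names and of DECL_UID for decls.  */
struct stack_var_sketch
{
  long size;
  int is_ssa_name;
  unsigned int id;
};

/* Illustrative data only; none of these values come from the patch.  */
static const struct stack_var_sketch sketch_vars[] = {
  { 8, 0, 40 },   /* index 0: decl, uid 40 */
  { 8, 1, 7 },    /* index 1: SSA name, version 7 */
  { 4, 0, 12 },   /* index 2: decl, uid 12, smaller object */
  { 8, 1, 3 },    /* index 3: SSA name, version 3 */
  { 8, 0, 15 },   /* index 4: decl, uid 15 */
};

#define N_SKETCH_VARS (sizeof sketch_vars / sizeof sketch_vars[0])

/* qsort comparator over an array of indices into sketch_vars.
   Order: ascending size; for equal sizes SSA names sort before
   non-SSA entries; within one kind the smaller id sorts first.  */
static int
sketch_size_cmp (const void *a, const void *b)
{
  const struct stack_var_sketch *va = &sketch_vars[*(const size_t *) a];
  const struct stack_var_sketch *vb = &sketch_vars[*(const size_t *) b];

  if (va->size < vb->size)
    return -1;
  if (va->size > vb->size)
    return 1;

  /* The two id spaces are unrelated, so the kind decides first.  */
  if (va->is_ssa_name != vb->is_ssa_name)
    return va->is_ssa_name ? -1 : 1;

  if (va->id < vb->id)
    return -1;
  if (va->id > vb->id)
    return 1;
  return 0;
}

/* Fill ORDER with 0..N-1 and sort it with the comparator above.  */
static void
sketch_sort (size_t *order, size_t n)
{
  size_t i;

  assert (n <= N_SKETCH_VARS);
  for (i = 0; i < n; i++)
    order[i] = i;
  qsort (order, n, sizeof order[0], sketch_size_cmp);
}
```

Because no two entries compare equal as long as the ids within each kind are distinct, the result does not depend on how a particular qsort orders equal elements, which is the point of the tie-breaker in the quoted comment.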
* Re: [RFA] expand from SSA form (1/2) 2009-04-24 14:32 ` Richard Guenther @ 2009-04-24 14:46 ` Richard Guenther 2009-04-26 20:21 ` Michael Matz 0 siblings, 1 reply; 63+ messages in thread From: Richard Guenther @ 2009-04-24 14:46 UTC (permalink / raw) To: Michael Matz; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev On Fri, Apr 24, 2009 at 4:21 PM, Richard Guenther <richard.guenther@gmail.com> wrote: > On Wed, Apr 22, 2009 at 6:42 PM, Michael Matz <matz@suse.de> wrote: >> On Wed, 22 Apr 2009, Michael Matz wrote: >> >>> I'll soon send a new version of the patch that fixes all problems and >>> testcases I encountered. >> >> Like so. This is the full patch, i.e. including the cleanups, but >> excluding the testsuite changes. It should incorporate all feedback. >> Compared to the last version it adds comments for new functions, fixes >> mudflap2, and generally some other minor problems showing when I started >> testing Ada and a bug reported by Andrey. >> >> This patch (plus testsuite changes) was bootstrapped with Ada on >> x86_64-linux. There are no testsuite regressions: >> FAIL: gcc.dg/tree-prof/bb-reorg.c compilation, -fprofile-use -D_PROFILE_USE >> FAIL: gcc.dg/tree-prof/pr34999.c compilation, -fprofile-use -D_PROFILE_USE >> FAIL: gcc.target/i386/avx-vmovntdq-256-1.c (test for excess errors) >> FAIL: gcc.target/i386/avx-vmovntpd-256-1.c (test for excess errors) >> FAIL: gcc.target/i386/avx-vmovntps-256-1.c (test for excess errors) >> FAIL: libmudflap.c++/pass41-frag.cxx execution test >> FAIL: libmudflap.c++/pass41-frag.cxx (-static) execution test >> FAIL: libmudflap.c++/pass41-frag.cxx (-O2) execution test >> FAIL: libmudflap.c++/pass41-frag.cxx (-O3) execution test >> >> All of these happen without the patch too (known bugs, old binutils, and >> pass41-frag never seems to work anyway). >> >> I'd like to ask for approval for the series. >> >> >> Ciao, >> Michael. >> -- >> * builtins.c (fold_builtin_next_arg): Handle SSA names. 
>> * tree-ssa-copyrename.c (rename_ssa_copies): Don't iterate >> beyond num_ssa_names, use ssa_name() directly. >> * tree-ssa-ter.c (free_temp_expr_table): Likewise. >> * tree-ssa-coalesce.c (create_outofssa_var_map): Likewise, >> mark only useful SSA names. >> (compare_pairs): Swap cost comparison. >> (coalesce_ssa_name): Don't use change_partition_var. >> * tree-nrv.c (struct nrv_data): Add modified member. >> (finalize_nrv_r): Set it. >> (tree_nrv): Use it to update statements. >> (pass_nrv): Require PROP_ssa. >> * tree-mudflap.c (create_referenced_var): New static helper. >> (mf_decl_cache_locals, mf_build_check_statement_for): Use it. >> (pass_mudflap_2): Require PROP_ssa, run ssa update at finish. >> * alias.c (find_base_decl): Handle SSA names. >> * emit-rtl (set_reg_attrs_for_parm): Make non-static. >> (component_ref_for_mem_expr): Don't leak SSA names into RTL. >> * rtl.h (set_reg_attrs_for_parm): Declare. >> * tree-optimize.c (pass_cleanup_cfg_post_optimizing): Rename >> to "optimized", remove unused locals at finish. >> (execute_free_datastructures): Make global, call >> delete_tree_cfg_annotations. >> (execute_free_cfg_annotations): Don't call >> delete_tree_cfg_annotations. >> >> * ssaexpand.h: New file. >> * expr.c (toplevel): Include ssaexpand.h. >> (expand_assignment): Handle SSA names the same as register >> variables. >> (expand_expr_real_1): Expand SSA names. >> * cfgexpand.c (toplevel): Include ssaexpand.h. >> (SA): New global variable. >> (gimple_cond_pred_to_tree): Fold TERed comparisons into predicates. >> (SSAVAR): New macro. >> (set_rtl): New helper function. >> (add_stack_var): Deal with SSA names, use set_rtl. >> (expand_one_stack_var_at): Likewise. >> (expand_one_stack_var): Deal with SSA names. >> (stack_var_size_cmp): Use code (SSA_NAME / DECL) as tie breaker >> before unique numbers. >> (expand_stack_vars): Use set_rtl. >> (expand_one_var): Accept SSA names, add asserts for them, feed them >> to above subroutines. 
>> (expand_used_vars): Expand all partitions (without default defs), >> then only the local decls (ignoring those expanded already). >> (expand_gimple_cond): Remove edges when jumpif() expands an >> unconditional jump. >> (expand_gimple_basic_block): Don't clear EDGE_EXECUTABLE here, >> or remove abnormal edges. Ignore insns setting the LHS of a TERed >> SSA name. >> (gimple_expand_cfg): Call into rewrite_out_of_ssa, initialize >> members of SA; deal with PARM_DECL partitions here; expand >> all PHI nodes, free tree datastructures and SA. Commit instructions >> on edges, clear EDGE_EXECUTABLE and remove abnormal edges here. >> (pass_expand): Require and destroy PROP_ssa, verify SSA form, flow >> info and statements at start, collect garbage at finish. >> * tree-ssa-live.h (struct _var_map): Remove partition_to_var member. >> (VAR_ANN_PARTITION) Remove. >> (change_partition_var): Don't declare. >> (partition_to_var): Always return SSA names. >> (var_to_partition): Only accept SSA names. >> (register_ssa_partition): Only check argument. >> * tree-ssa-live.c (init_var_map): Don't allocate partition_to_var >> member. >> (delete_var_map): Don't free it. >> (var_union): Only accept SSA names, simplify. >> (partition_view_init): Mark only useful SSA names as used. >> (partition_view_fini): Only deal with SSA names. >> (change_partition_var): Remove. >> (dump_var_map): Use ssa_name instead of partition_to_var member. >> * tree-ssa.c (delete_tree_ssa): Don't remove PHI nodes on RTL >> basic blocks. >> * tree-outof-ssa.c (toplevel): Include ssaexpand.h and expr.h. >> (struct _elim_graph): New member const_dests; nodes member vector of >> ints. >> (set_location_for_edge): New static helper. >> (create_temp): Remove. >> (insert_partition_copy_on_edge, insert_part_to_rtx_on_edge, >> insert_value_copy_on_edge, insert_rtx_to_part_on_edge): New >> functions. >> (new_elim_graph): Allocate const_dests member. >> (clean_elim_graph): Truncate const_dests member. 
>> (delete_elim_graph): Free const_dests member. >> (elim_graph_size): Adapt to new type of nodes member. >> (elim_graph_add_node): Likewise. >> (eliminate_name): Likewise. >> (eliminate_build): Don't take basic block argument, deal only with >> partition numbers, not variables. >> (get_temp_reg): New static helper. >> (elim_create): Use it, deal with RTL temporaries instead of trees. >> (eliminate_phi): Adjust all calls to new signature. >> (assign_vars, replace_use_variable, replace_def_variable): Remove. >> (rewrite_trees): Only do checking. >> (edge_leader, stmt_list, leader_has_match, leader_match): Remove. >> (same_stmt_list_p, identical_copies_p, identical_stmt_lists_p, >> init_analyze_edges_for_bb, fini_analyze_edges_for_bb, >> contains_tree_r, MAX_STMTS_IN_LATCH, >> process_single_block_loop_latch, analyze_edges_for_bb, >> perform_edge_inserts): Remove. >> (expand_phi_nodes): New global function. >> (remove_ssa_form): Take ssaexpand parameter. Don't call removed >> functions, initialize new parameter, remember partitions having a >> default def. >> (finish_out_of_ssa): New global function. >> (rewrite_out_of_ssa): Make global. Adjust call to remove_ssa_form, >> don't reset in_ssa_p here. >> (pass_del_ssa): Remove. >> * tree-flow.h (struct var_ann_d): Remove out_of_ssa_tag and >> partition members. >> (execute_free_datastructures): Declare. >> * Makefile.in (SSAEXPAND_H): New variable. >> (tree-outof-ssa.o, expr.o, cfgexpand.o): Depend on SSAEXPAND_H. >> * basic-block.h (commit_one_edge_insertion): Declare. >> * passes.c (init_optimization_passes): Move pass_nrv and >> pass_mudflap2 before pass_cleanup_cfg_post_optimizing, remove >> pass_del_ssa, pass_free_datastructures, pass_free_cfg_annotations. >> * cfgrtl.c (commit_one_edge_insertion): Make global, don't declare. >> (redirect_branch_edge): Deal with super block when expanding, split >> out jump patching itself into ... >> (patch_jump_insn): ... here, new static helper. > > Some comments inline... 
> >> Index: tree-ssa-copyrename.c >> =================================================================== >> *** tree-ssa-copyrename.c (revision 146576) >> --- tree-ssa-copyrename.c (working copy) >> *************** rename_ssa_copies (void) >> *** 291,297 **** >> else >> debug = NULL; >> >> ! map = init_var_map (num_ssa_names + 1); >> >> FOR_EACH_BB (bb) >> { >> --- 291,297 ---- >> else >> debug = NULL; >> >> ! map = init_var_map (num_ssa_names); >> >> FOR_EACH_BB (bb) >> { >> *************** rename_ssa_copies (void) >> *** 339,350 **** >> /* Now one more pass to make all elements of a partition share the same >> root variable. */ >> >> ! for (x = 1; x <= num_ssa_names; x++) >> { >> part_var = partition_to_var (map, x); >> if (!part_var) >> continue; >> ! var = map->partition_to_var[x]; >> if (debug) >> { >> if (SSA_NAME_VAR (var) != SSA_NAME_VAR (part_var)) >> --- 339,350 ---- >> /* Now one more pass to make all elements of a partition share the same >> root variable. */ >> >> ! for (x = 1; x < num_ssa_names; x++) >> { >> part_var = partition_to_var (map, x); >> if (!part_var) >> continue; >> ! var = ssa_name (x); >> if (debug) >> { >> if (SSA_NAME_VAR (var) != SSA_NAME_VAR (part_var)) > > This piece is ok as obvious. Please commit it separately. > > >> Index: cfgexpand.c >> =================================================================== >> *** cfgexpand.c (revision 146576) >> --- cfgexpand.c (working copy) >> *************** along with GCC; see the file COPYING3. >> *** 42,49 **** >> --- 42,54 ---- >> #include "tree-inline.h" >> #include "value-prof.h" >> #include "target.h" >> + #include "ssaexpand.h" >> >> >> + /* This variable holds information helping the rewriting of SSA trees >> + into RTL. */ >> + struct ssaexpand SA; >> + >> /* Return an expression tree corresponding to the RHS of GIMPLE >> statement STMT. 
*/ >> >> *************** gimple_assign_rhs_to_tree (gimple stmt) >> *** 78,85 **** >> static tree >> gimple_cond_pred_to_tree (gimple stmt) >> { >> return build2 (gimple_cond_code (stmt), boolean_type_node, >> ! gimple_cond_lhs (stmt), gimple_cond_rhs (stmt)); >> } >> >> /* Helper for gimple_to_tree. Set EXPR_LOCATION for every expression >> --- 83,104 ---- >> static tree >> gimple_cond_pred_to_tree (gimple stmt) >> { >> + /* We're sometimes presented with such code: >> + D.123_1 = x < y; >> + if (D.123_1 != 0) >> + ... >> + This would expand to two comparisons which then later might >> + be cleaned up by combine. But some pattern matchers like if-conversion >> + work better when there's only one compare, so make up for this >> + here as special exception if TER would have made the same change. */ >> + tree lhs = gimple_cond_lhs (stmt); >> + if (SA.values >> + && TREE_CODE (lhs) == SSA_NAME >> + && SA.values[SSA_NAME_VERSION (lhs)]) >> + lhs = gimple_assign_rhs_to_tree (SA.values[SSA_NAME_VERSION (lhs)]); > > Do we really need the SA.values array here? It seems that > a bitmap would be enough, as the definition should be still > reachable via SSA_NAME_DEF_STMT (lhs). (Thanks Paolo > for noticing this) > >> + >> return build2 (gimple_cond_code (stmt), boolean_type_node, >> ! lhs, gimple_cond_rhs (stmt)); >> } >> >> /* Helper for gimple_to_tree. Set EXPR_LOCATION for every expression >> *************** failed: >> *** 423,428 **** >> --- 442,464 ---- >> #define STACK_ALIGNMENT_NEEDED 1 >> #endif >> >> + #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x) > > This should be in some public place, ideally with a more > sound name. Any ideas? > >> + /* Associate declaration T with storage space X. If T is no >> + SSA name this is exactly SET_DECL_RTL, otherwise make the >> + partition of T associated with X. 
*/ >> + static inline void >> + set_rtl (tree t, rtx x) >> + { >> + if (TREE_CODE (t) == SSA_NAME) >> + { >> + SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x; >> + if (x && !MEM_P (x)) >> + set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x); >> + } >> + else >> + SET_DECL_RTL (t, x); >> + } >> >> /* This structure holds data relevant to one variable that will be >> placed in a stack slot. */ >> *************** add_stack_var (tree decl) >> *** 561,575 **** >> } >> stack_vars[stack_vars_num].decl = decl; >> stack_vars[stack_vars_num].offset = 0; >> ! stack_vars[stack_vars_num].size = tree_low_cst (DECL_SIZE_UNIT (decl), 1); >> ! stack_vars[stack_vars_num].alignb = get_decl_align_unit (decl); >> >> /* All variables are initially in their own partition. */ >> stack_vars[stack_vars_num].representative = stack_vars_num; >> stack_vars[stack_vars_num].next = EOC; >> >> /* Ensure that this decl doesn't get put onto the list twice. */ >> ! SET_DECL_RTL (decl, pc_rtx); >> >> stack_vars_num++; >> } >> --- 597,611 ---- >> } >> stack_vars[stack_vars_num].decl = decl; >> stack_vars[stack_vars_num].offset = 0; >> ! stack_vars[stack_vars_num].size = tree_low_cst (DECL_SIZE_UNIT (SSAVAR (decl)), 1); >> ! stack_vars[stack_vars_num].alignb = get_decl_align_unit (SSAVAR (decl)); >> >> /* All variables are initially in their own partition. */ >> stack_vars[stack_vars_num].representative = stack_vars_num; >> stack_vars[stack_vars_num].next = EOC; >> >> /* Ensure that this decl doesn't get put onto the list twice. */ >> ! set_rtl (decl, pc_rtx); >> >> stack_vars_num++; >> } >> *************** add_alias_set_conflicts (void) >> *** 688,709 **** >> } >> >> /* A subroutine of partition_stack_vars. A comparison function for qsort, >> ! sorting an array of indices by the size of the object. 
*/ >> >> static int >> stack_var_size_cmp (const void *a, const void *b) >> { >> HOST_WIDE_INT sa = stack_vars[*(const size_t *)a].size; >> HOST_WIDE_INT sb = stack_vars[*(const size_t *)b].size; >> ! unsigned int uida = DECL_UID (stack_vars[*(const size_t *)a].decl); >> ! unsigned int uidb = DECL_UID (stack_vars[*(const size_t *)b].decl); >> >> if (sa < sb) >> return -1; >> if (sa > sb) >> return 1; >> ! /* For stack variables of the same size use the uid of the decl >> ! to make the sort stable. */ >> if (uida < uidb) >> return -1; >> if (uida > uidb) >> --- 724,760 ---- >> } >> >> /* A subroutine of partition_stack_vars. A comparison function for qsort, >> ! sorting an array of indices by the size and type of the object. */ >> >> static int >> stack_var_size_cmp (const void *a, const void *b) >> { >> HOST_WIDE_INT sa = stack_vars[*(const size_t *)a].size; >> HOST_WIDE_INT sb = stack_vars[*(const size_t *)b].size; >> ! tree decla, declb; >> ! unsigned int uida, uidb; >> >> if (sa < sb) >> return -1; >> if (sa > sb) >> return 1; >> ! decla = stack_vars[*(const size_t *)a].decl; >> ! declb = stack_vars[*(const size_t *)b].decl; >> ! /* For stack variables of the same size use and id of the decls >> ! to make the sort stable. Two SSA names are compared by their >> ! version, SSA names come before non-SSA names, and two normal >> ! decls are compared by their DECL_UID. */ >> ! if (TREE_CODE (decla) == SSA_NAME) >> ! { >> ! if (TREE_CODE (declb) == SSA_NAME) >> ! uida = SSA_NAME_VERSION (decla), uidb = SSA_NAME_VERSION (declb); >> ! else >> ! return -1; >> ! } >> ! else if (TREE_CODE (declb) == SSA_NAME) >> ! return 1; >> ! else >> ! uida = DECL_UID (decla), uidb = DECL_UID (declb); >> if (uida < uidb) >> return -1; >> if (uida > uidb) >> *************** expand_one_stack_var_at (tree decl, HOST >> *** 874,894 **** >> gcc_assert (offset == trunc_int_for_mode (offset, Pmode)); >> >> x = plus_constant (virtual_stack_vars_rtx, offset); >> ! 
x = gen_rtx_MEM (DECL_MODE (decl), x); >> >> ! /* Set alignment we actually gave this decl. */ >> ! offset -= frame_phase; >> ! align = offset & -offset; >> ! align *= BITS_PER_UNIT; >> ! if (align == 0) >> ! align = STACK_BOUNDARY; >> ! else if (align > MAX_SUPPORTED_STACK_ALIGNMENT) >> ! align = MAX_SUPPORTED_STACK_ALIGNMENT; >> ! DECL_ALIGN (decl) = align; >> ! DECL_USER_ALIGN (decl) = 0; >> >> ! set_mem_attributes (x, decl, true); >> ! SET_DECL_RTL (decl, x); >> } >> >> /* A subroutine of expand_used_vars. Give each partition representative >> --- 925,951 ---- >> gcc_assert (offset == trunc_int_for_mode (offset, Pmode)); >> >> x = plus_constant (virtual_stack_vars_rtx, offset); >> ! x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x); >> >> ! if (TREE_CODE (decl) != SSA_NAME) >> ! { >> ! /* Set alignment we actually gave this decl if it isn't an SSA name. >> ! If it is we generate stack slots only accidentally so it isn't as >> ! important, we'll simply use the alignment that is already set. */ >> ! offset -= frame_phase; >> ! align = offset & -offset; >> ! align *= BITS_PER_UNIT; >> ! if (align == 0) >> ! align = STACK_BOUNDARY; >> ! else if (align > MAX_SUPPORTED_STACK_ALIGNMENT) >> ! align = MAX_SUPPORTED_STACK_ALIGNMENT; >> >> ! DECL_ALIGN (decl) = align; >> ! DECL_USER_ALIGN (decl) = 0; >> ! } >> ! >> ! set_mem_attributes (x, SSAVAR (decl), true); >> ! set_rtl (decl, x); >> } >> >> /* A subroutine of expand_used_vars. Give each partition representative >> *************** expand_stack_vars (bool (*pred) (tree)) >> *** 912,918 **** >> >> /* Skip variables that have already had rtl assigned. See also >> add_stack_var where we perpetrate this pc_rtx hack. */ >> ! if (DECL_RTL (stack_vars[i].decl) != pc_rtx) >> continue; >> >> /* Check the predicate to see whether this variable should be >> --- 969,977 ---- >> >> /* Skip variables that have already had rtl assigned. See also >> add_stack_var where we perpetrate this pc_rtx hack. */ >> ! 
if ((TREE_CODE (stack_vars[i].decl) == SSA_NAME >> ! ? SA.partition_to_pseudo[var_to_partition (SA.map, stack_vars[i].decl)] >> ! : DECL_RTL (stack_vars[i].decl)) != pc_rtx) >> continue; >> >> /* Check the predicate to see whether this variable should be >> *************** account_stack_vars (void) >> *** 951,957 **** >> >> size += stack_vars[i].size; >> for (j = i; j != EOC; j = stack_vars[j].next) >> ! SET_DECL_RTL (stack_vars[j].decl, NULL); >> } >> return size; >> } >> --- 1010,1016 ---- >> >> size += stack_vars[i].size; >> for (j = i; j != EOC; j = stack_vars[j].next) >> ! set_rtl (stack_vars[j].decl, NULL); >> } >> return size; >> } >> *************** expand_one_stack_var (tree var) >> *** 964,971 **** >> { >> HOST_WIDE_INT size, offset, align; >> >> ! size = tree_low_cst (DECL_SIZE_UNIT (var), 1); >> ! align = get_decl_align_unit (var); >> offset = alloc_stack_frame_space (size, align); >> >> expand_one_stack_var_at (var, offset); >> --- 1023,1030 ---- >> { >> HOST_WIDE_INT size, offset, align; >> >> ! size = tree_low_cst (DECL_SIZE_UNIT (SSAVAR (var)), 1); >> ! align = get_decl_align_unit (SSAVAR (var)); >> offset = alloc_stack_frame_space (size, align); >> >> expand_one_stack_var_at (var, offset); >> *************** expand_one_hard_reg_var (tree var) >> *** 986,1005 **** >> static void >> expand_one_register_var (tree var) >> { >> ! tree type = TREE_TYPE (var); >> int unsignedp = TYPE_UNSIGNED (type); >> enum machine_mode reg_mode >> ! = promote_mode (type, DECL_MODE (var), &unsignedp, 0); >> rtx x = gen_reg_rtx (reg_mode); >> >> ! SET_DECL_RTL (var, x); >> >> /* Note if the object is a user variable. */ >> ! if (!DECL_ARTIFICIAL (var)) >> ! mark_user_reg (x); >> >> if (POINTER_TYPE_P (type)) >> ! mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var)))); >> } >> >> /* A subroutine of expand_one_var. 
Called to assign rtl to a VAR_DECL that >> --- 1045,1065 ---- >> static void >> expand_one_register_var (tree var) >> { > > >> *************** struct rtl_opt_pass pass_expand = >> *** 2471,2480 **** >> 0, /* static_pass_number */ >> TV_EXPAND, /* tv_id */ >> /* ??? If TER is enabled, we actually receive GENERIC. */ >> ! PROP_gimple_leh | PROP_cfg, /* properties_required */ >> PROP_rtl, /* properties_provided */ >> ! PROP_trees, /* properties_destroyed */ >> ! 0, /* todo_flags_start */ >> ! TODO_dump_func, /* todo_flags_finish */ >> } >> }; >> --- 2644,2655 ---- >> 0, /* static_pass_number */ >> TV_EXPAND, /* tv_id */ >> /* ??? If TER is enabled, we actually receive GENERIC. */ > > This is no longer true ;) > >> ! PROP_ssa | PROP_gimple_leh | PROP_cfg, /* properties_required */ >> PROP_rtl, /* properties_provided */ >> ! PROP_ssa | PROP_trees, /* properties_destroyed */ >> ! TODO_verify_ssa | TODO_verify_flow >> ! | TODO_verify_stmts, /* todo_flags_start */ >> ! TODO_dump_func >> ! | TODO_ggc_collect /* todo_flags_finish */ >> } >> }; > > >> Index: tree-mudflap.c >> =================================================================== >> *** tree-mudflap.c (revision 146576) >> --- tree-mudflap.c (working copy) >> *************** execute_mudflap_function_ops (void) >> *** 447,452 **** >> --- 447,465 ---- >> return 0; >> } >> >> + /* Construct a new temporary variable with TYPE >> + as type and PREFIX as name prefix, add it to referenced vars >> + and mark it for renaming. */ >> + >> + static tree >> + create_referenced_var (tree type, const char *prefix) >> + { >> + tree var = create_tmp_var (type, prefix); >> + add_referenced_var (var); >> + mark_sym_for_renaming (var); >> + return var; >> + } > > This is exactly the same as make_rename_temp () from tree-dfa.c, > so use that. > >> /* Create and initialize local shadow variables for the lookup cache >> globals. Put their decls in the *_l globals for use by >> mf_build_check_statement_for. 
*/ >> *************** mf_decl_cache_locals (void) >> *** 459,469 **** >> >> /* Build the cache vars. */ >> mf_cache_shift_decl_l >> ! = mf_mark (create_tmp_var (TREE_TYPE (mf_cache_shift_decl), >> "__mf_lookup_shift_l")); >> >> mf_cache_mask_decl_l >> ! = mf_mark (create_tmp_var (TREE_TYPE (mf_cache_mask_decl), >> "__mf_lookup_mask_l")); >> >> /* Build initialization nodes for the cache vars. We just load the >> --- 472,482 ---- >> >> /* Build the cache vars. */ >> mf_cache_shift_decl_l >> ! = mf_mark (create_referenced_var (TREE_TYPE (mf_cache_shift_decl), >> "__mf_lookup_shift_l")); >> >> mf_cache_mask_decl_l >> ! = mf_mark (create_referenced_var (TREE_TYPE (mf_cache_mask_decl), >> "__mf_lookup_mask_l")); >> >> /* Build initialization nodes for the cache vars. We just load the >> *************** mf_build_check_statement_for (tree base, >> *** 546,554 **** >> } >> >> /* Build our local variables. */ >> ! mf_elem = create_tmp_var (mf_cache_structptr_type, "__mf_elem"); >> ! mf_base = create_tmp_var (mf_uintptr_type, "__mf_base"); >> ! mf_limit = create_tmp_var (mf_uintptr_type, "__mf_limit"); >> >> /* Build: __mf_base = (uintptr_t) <base address expression>. */ >> seq = gimple_seq_alloc (); >> --- 559,567 ---- >> } >> >> /* Build our local variables. */ >> ! mf_elem = create_referenced_var (mf_cache_structptr_type, "__mf_elem"); >> ! mf_base = create_referenced_var (mf_uintptr_type, "__mf_base"); >> ! mf_limit = create_referenced_var (mf_uintptr_type, "__mf_limit"); >> >> /* Build: __mf_base = (uintptr_t) <base address expression>. */ >> seq = gimple_seq_alloc (); >> *************** mf_build_check_statement_for (tree base, >> *** 627,633 **** >> t = build2 (TRUTH_OR_EXPR, boolean_type_node, t, u); >> t = force_gimple_operand (t, &stmts, false, NULL_TREE); >> gimple_seq_add_seq (&seq, stmts); >> ! 
cond = create_tmp_var (boolean_type_node, "__mf_unlikely_cond"); >> g = gimple_build_assign (cond, t); >> gimple_set_location (g, location); >> gimple_seq_add_stmt (&seq, g); >> --- 640,646 ---- >> t = build2 (TRUTH_OR_EXPR, boolean_type_node, t, u); >> t = force_gimple_operand (t, &stmts, false, NULL_TREE); >> gimple_seq_add_seq (&seq, stmts); >> ! cond = create_referenced_var (boolean_type_node, "__mf_unlikely_cond"); >> g = gimple_build_assign (cond, t); >> gimple_set_location (g, location); >> gimple_seq_add_stmt (&seq, g); >> *************** struct gimple_opt_pass pass_mudflap_2 = >> *** 1366,1377 **** >> NULL, /* next */ >> 0, /* static_pass_number */ >> TV_NONE, /* tv_id */ >> ! PROP_gimple_leh, /* properties_required */ >> 0, /* properties_provided */ >> 0, /* properties_destroyed */ >> 0, /* todo_flags_start */ >> TODO_verify_flow | TODO_verify_stmts >> ! | TODO_dump_func /* todo_flags_finish */ >> } >> }; >> >> --- 1379,1390 ---- >> NULL, /* next */ >> 0, /* static_pass_number */ >> TV_NONE, /* tv_id */ >> ! PROP_ssa | PROP_cfg | PROP_gimple_leh,/* properties_required */ >> 0, /* properties_provided */ >> 0, /* properties_destroyed */ >> 0, /* todo_flags_start */ >> TODO_verify_flow | TODO_verify_stmts >> ! | TODO_dump_func | TODO_update_ssa /* todo_flags_finish */ >> } >> }; >> >> Index: tree-ssa-ter.c >> =================================================================== >> *** tree-ssa-ter.c (revision 146576) >> --- tree-ssa-ter.c (working copy) >> *************** free_temp_expr_table (temp_expr_table_p >> *** 225,231 **** >> unsigned x; >> for (x = 0; x <= num_var_partitions (t->map); x++) >> gcc_assert (!t->kill_list[x]); >> ! for (x = 0; x < num_ssa_names + 1; x++) >> { >> gcc_assert (t->expr_decl_uids[x] == NULL); >> gcc_assert (t->partition_dependencies[x] == NULL); >> --- 225,231 ---- >> unsigned x; >> for (x = 0; x <= num_var_partitions (t->map); x++) >> gcc_assert (!t->kill_list[x]); >> ! 
for (x = 0; x < num_ssa_names; x++) >> { >> gcc_assert (t->expr_decl_uids[x] == NULL); >> gcc_assert (t->partition_dependencies[x] == NULL); > > Obvious, commit with the other similar piece separately. > > >> Index: tree-ssa.c >> =================================================================== >> *** tree-ssa.c (revision 146576) >> --- tree-ssa.c (working copy) >> *************** delete_tree_ssa (void) >> *** 844,850 **** >> >> gimple_set_modified (stmt, true); >> } >> ! set_phi_nodes (bb, NULL); >> } >> >> /* Remove annotations from every referenced local variable. */ >> --- 844,851 ---- >> >> gimple_set_modified (stmt, true); >> } >> ! if (!(bb->flags & BB_RTL)) >> ! set_phi_nodes (bb, NULL); >> } >> >> /* Remove annotations from every referenced local variable. */ >> Index: rtl.h >> =================================================================== >> *** rtl.h (revision 146576) >> --- rtl.h (working copy) >> *************** extern rtx gen_int_mode (HOST_WIDE_INT, >> *** 1494,1499 **** >> --- 1494,1500 ---- >> extern rtx emit_copy_of_insn_after (rtx, rtx); >> extern void set_reg_attrs_from_value (rtx, rtx); >> extern void set_reg_attrs_for_parm (rtx, rtx); >> + extern void set_reg_attrs_for_decl_rtl (tree t, rtx x); >> extern void adjust_reg_mode (rtx, enum machine_mode); >> extern int mem_expr_equal_p (const_tree, const_tree); >> >> Index: tree-optimize.c >> =================================================================== >> *** tree-optimize.c (revision 146576) >> --- tree-optimize.c (working copy) >> *************** struct gimple_opt_pass pass_cleanup_cfg_ >> *** 201,207 **** >> { >> { >> GIMPLE_PASS, >> ! "final_cleanup", /* name */ >> NULL, /* gate */ >> execute_cleanup_cfg_post_optimizing, /* execute */ >> NULL, /* sub */ >> --- 201,207 ---- >> { >> { >> GIMPLE_PASS, >> ! 
"optimized", /* name */ >> NULL, /* gate */ >> execute_cleanup_cfg_post_optimizing, /* execute */ >> NULL, /* sub */ >> *************** struct gimple_opt_pass pass_cleanup_cfg_ >> *** 213,225 **** >> 0, /* properties_destroyed */ >> 0, /* todo_flags_start */ >> TODO_dump_func /* todo_flags_finish */ >> } >> }; >> >> /* Pass: do the actions required to finish with tree-ssa optimization >> passes. */ >> >> ! static unsigned int >> execute_free_datastructures (void) >> { >> free_dominance_info (CDI_DOMINATORS); >> --- 213,226 ---- >> 0, /* properties_destroyed */ >> 0, /* todo_flags_start */ >> TODO_dump_func /* todo_flags_finish */ >> + | TODO_remove_unused_locals >> } >> }; >> >> /* Pass: do the actions required to finish with tree-ssa optimization >> passes. */ >> >> ! unsigned int >> execute_free_datastructures (void) >> { >> free_dominance_info (CDI_DOMINATORS); >> *************** execute_free_datastructures (void) >> *** 228,233 **** >> --- 229,238 ---- >> /* Remove the ssa structures. */ >> if (cfun->gimple_df) >> delete_tree_ssa (); >> + >> + /* And get rid of annotations we no longer need. */ >> + delete_tree_cfg_annotations (); >> + >> return 0; >> } >> >> *************** struct gimple_opt_pass pass_free_datastr >> *** 254,262 **** >> static unsigned int >> execute_free_cfg_annotations (void) >> { >> - /* And get rid of annotations we no longer need. */ >> - delete_tree_cfg_annotations (); >> - >> return 0; >> } > > It looks like the pass referencing this is now dead. Please remove > this function and the pass structure. > >> [Message clipped] > > bah ... ;) Other half in next mail. Continued (without quoting for the patch) --- 932,941 ---- if (dump_file && (dump_flags & TDF_DETAILS)) gimple_dump_cfg (dump_file, dump_flags & ~TDF_DETAILS); ! 
remove_ssa_form (flag_tree_ter && !flag_mudflap, sa); if (dump_file && (dump_flags & TDF_DETAILS)) gimple_dump_cfg (dump_file, dump_flags & ~TDF_DETAILS); it looks like this restriction for !flag_mudflap can now be lifted. + struct ssaexpand + { + /* The computed partitions of SSA names are stored here. */ + var_map map; + + /* For a SSA name version V values[V] contains the gimple statement + defining it iff TER decided that it should be forwarded, NULL + otherwise. */ + gimple *values; as said above this could be a bitmap. *************** struct var_ann_d GTY(()) *** 234,243 **** information on each attribute. */ ENUM_BITFIELD (noalias_state) noalias_state : 2; - /* Used when going out of SSA form to indicate which partition this - variable represents storage for. */ - unsigned partition; - /* Used by var_map for the base index of ssa base variables. */ unsigned base_index; hmm, so what is needed to remove base_index as well? --- 707,723 ---- NEXT_PASS (pass_local_pure_const); } NEXT_PASS (pass_cleanup_eh); NEXT_PASS (pass_nrv); + NEXT_PASS (pass_mudflap_2); NEXT_PASS (pass_mark_used_blocks); NEXT_PASS (pass_cleanup_cfg_post_optimizing); NEXT_PASS (pass_warn_function_noreturn); pass_mark_unused_blocks is no longer needed now as you remove unused locals and that marks blocks used? Please remove this pass (and its code). The patches are ok with the suggested minor changes and with the larger cleanups as followup (or the same patch if you prefer). Thanks, Richard. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-24 14:46 ` Richard Guenther @ 2009-04-26 20:21 ` Michael Matz 2009-04-26 20:34 ` Richard Guenther ` (2 more replies) 0 siblings, 3 replies; 63+ messages in thread From: Michael Matz @ 2009-04-26 20:21 UTC (permalink / raw) To: Richard Guenther; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev [-- Attachment #1: Type: TEXT/PLAIN, Size: 3055 bytes --] Hi, On Fri, 24 Apr 2009, Richard Guenther wrote: > > ! map = init_var_map (num_ssa_names + 1); > > --- 291,297 ---- > > ! map = init_var_map (num_ssa_names); > > This piece is ok as obvious. Please commit it separately. This and the other two instances of num_ssa_names confusions committed as r146815. The rest of the patch (with adjustments except as noted below) committed as r146817. I had to apply the below additional change to be able to build Ada, as we can't redirect (and hence split) EH edges in RTL land. So I have to pre-split them when queuing insns. Reregstrapped on x86_64-linux. I haven't dealt with these comments: > > + if (SA.values > > + && TREE_CODE (lhs) == SSA_NAME > > + && SA.values[SSA_NAME_VERSION (lhs)]) > > + lhs = gimple_assign_rhs_to_tree (SA.values[SSA_NAME_VERSION (lhs)]); > > Do we really need the SA.values array here? It seems that > a bitmap would be enough, as the definition should be still > reachable via SSA_NAME_DEF_STMT (lhs). (Thanks Paolo > for noticing this) This should indeed be possible, but I want to do this as cleanup next week. > > + #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x) > > This should be in some public place, ideally with a more sound name. > Any ideas? No bright idea. STRIP_SSA perhaps? If someone has a better name or agrees to that one, I'd move it to tree.h and make a pass over the compiler to substitute the above pattern with the macro. > > execute_free_cfg_annotations (void) > > { > > - /* And get rid of annotations we no longer need.
*/ > > - delete_tree_cfg_annotations (); > > - > > It looks like the pass referencing this is now dead. Please remove > this function and the pass structure. It is, but done as followup. > *************** struct var_ann_d GTY(()) > *** 234,243 **** > /* Used by var_map for the base index of ssa base variables. */ > unsigned base_index; > > hmm, so what is needed to remove base_index as well? Rewriting var_map_base_init() to not use var annotations. SSA name coalescing and conflict building wants to have a DECL->dense-integer mapping for various reasons (two-stage approach for building conflicts). Currently that mapping is generated by storing the next available index into the var annotation (to be able to read it out again when the same basevar is seen for a different partition). But this whole info is strictly local to the above function, so it doesn't need to live in the annotation. I could very well implement this as an array indexed by DECL_UID. The UIDs shouldn't become exceptionally large, so that seems feasible. I wouldn't want to use a hash-table for fear of slowing down var_map_base_init(). > + NEXT_PASS (pass_mudflap_2); > NEXT_PASS (pass_mark_used_blocks); > > pass_mark_unused_blocks is no longer needed now as you remove unused > locals and that > marks blocks used? Please remove this pass (and its code). Also followup. Ciao, Michael. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-26 20:21 ` Michael Matz @ 2009-04-26 20:34 ` Richard Guenther 2009-04-26 20:53 ` Michael Matz 2009-04-26 21:42 ` Michael Matz 2009-04-27 12:34 ` Michael Matz 2 siblings, 1 reply; 63+ messages in thread From: Richard Guenther @ 2009-04-26 20:34 UTC (permalink / raw) To: Michael Matz; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev On Sun, Apr 26, 2009 at 10:19 PM, Michael Matz <matz@suse.de> wrote: > Hi, > > On Fri, 24 Apr 2009, Richard Guenther wrote: > >> *************** struct var_ann_d GTY(()) >> *** 234,243 **** >> /* Used by var_map for the base index of ssa base variables. */ >> unsigned base_index; >> >> hmm, so what is needed to remove base_index as well? > > Rewriting var_map_base_init() to not use var annotations. SSA name > coalescing and conflict building wants to have a DECL->dense-integer > mapping for various reasons (two-stage approach for building > conflicts). > > Currently that mapping is generated by storing the next available index > into the var annotation (to be able to read it out again when the same > basevar is seen for a different partition). But this whole info is > strictly local to the above function, so it doesn't need to live in the > annotation. I could very well implement this as an array indexed by > DECL_UID. The UIDs shouldn't become exceptionally large, so that seems Hm. DECL_UIDs are sparse (and global), it would be a bad idea to index an array with it. Is var_ann->common.aux already used by out-of-SSA? (also common.rn and common.value_handle are "local" values, but they should go as well in the end). > feasible. I wouldn't want to use a hash-table for fear of slowing down > var_map_base_init(). I think it would be not too bad ;) Richard. ^ permalink raw reply [flat|nested] 63+ messages in thread
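The trade-off discussed in this exchange (a dense array indexed directly by a sparse, global UID versus a hash table) can be illustrated with a small standalone sketch. Nothing below is GCC code: `struct uid_map`, `uid_map_init`, `uid_map_lookup` and `uid_map_free` are hypothetical names chosen for the illustration, the table has a fixed capacity, and UIDs are assumed to be smaller than `UINT_MAX`. It only shows how a sparse identifier can be mapped to the next free dense index on first sight and read back on later sightings, which is the behaviour described for the base index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a "sparse UID -> dense index" table.
   Open addressing with linear probing; fixed capacity, no resizing.  */
struct uid_map
{
  unsigned *keys;   /* Stored as uid + 1 so that 0 marks an empty slot.  */
  unsigned *index;  /* Dense index assigned to the uid in the same slot.  */
  size_t capacity;  /* Number of slots; must be a power of two.  */
  unsigned next;    /* Next dense index to hand out.  */
};

/* Initialize M with CAPACITY slots.  Returns 0 on success, -1 if
   CAPACITY is not a power of two or allocation fails.  */
static int
uid_map_init (struct uid_map *m, size_t capacity)
{
  if (capacity == 0 || (capacity & (capacity - 1)) != 0)
    return -1;
  m->keys = calloc (capacity, sizeof *m->keys);
  m->index = calloc (capacity, sizeof *m->index);
  m->capacity = capacity;
  m->next = 0;
  if (!m->keys || !m->index)
    {
      free (m->keys);
      free (m->index);
      return -1;
    }
  return 0;
}

/* Release the storage owned by M.  */
static void
uid_map_free (struct uid_map *m)
{
  free (m->keys);
  free (m->index);
  m->keys = NULL;
  m->index = NULL;
}

/* Return the dense index for UID, assigning the next free index the
   first time UID is seen.  Returns -1 if the table is full.
   Assumes UID < UINT_MAX because of the uid + 1 encoding.  */
static int
uid_map_lookup (struct uid_map *m, unsigned uid)
{
  size_t mask = m->capacity - 1;
  size_t slot = (uid * 2654435761u) & mask;  /* Multiplicative hash.  */
  size_t probes;

  for (probes = 0; probes < m->capacity; probes++)
    {
      if (m->keys[slot] == 0)
        {
          /* First sighting: claim the slot and hand out a new index.  */
          m->keys[slot] = uid + 1;
          m->index[slot] = m->next++;
          return (int) m->index[slot];
        }
      if (m->keys[slot] == uid + 1)
        return (int) m->index[slot];
      slot = (slot + 1) & mask;
    }
  return -1;
}
```

With this shape the memory used depends on the number of distinct UIDs seen rather than on the largest UID value, which is the concern raised about indexing an array by DECL_UID; whether the extra probing cost matters in var_map_base_init() is exactly what would need measuring.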
* Re: [RFA] expand from SSA form (1/2) 2009-04-26 20:34 ` Richard Guenther @ 2009-04-26 20:53 ` Michael Matz 2009-04-26 21:14 ` Richard Guenther 0 siblings, 1 reply; 63+ messages in thread From: Michael Matz @ 2009-04-26 20:53 UTC (permalink / raw) To: Richard Guenther; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev [-- Attachment #1: Type: TEXT/PLAIN, Size: 1084 bytes --] Hi, On Sun, 26 Apr 2009, Richard Guenther wrote: > > into the var annotation (to be able to read it out again when the same > > basevar is seen for a different partition). But this whole info is > > strictly local to the above function, so it doesn't need to live in the > > annotation. I could very well implement this as an array indexed by > > DECL_UID. The UIDs shouldn't become exceptionally large, so that seems > > Hm. DECL_UIDs are sparse (and global), it would be a bad idea to index > an array with it. Those few 100k entries ... nobody will notice. Hmm, well, maybe someone does :-) > Is var_ann->common.aux already used by out-of-SSA? Oh joy. This field is unused by out-of-SSA. Which makes sense, as - drumroll - nothing (!) at all uses it (in the sense of deleting it doesn't break compile). I'd rather remove that field and the fields in the var annotation. > > feasible. I wouldn't want to use a hash-table for fear of slowing down > > var_map_base_init(). > > I think it would be not too bad ;) I'll experiment a bit. Ciao, Michael. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-26 20:53 ` Michael Matz @ 2009-04-26 21:14 ` Richard Guenther 2009-04-26 21:15 ` Michael Matz 0 siblings, 1 reply; 63+ messages in thread From: Richard Guenther @ 2009-04-26 21:14 UTC (permalink / raw) To: Michael Matz; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev On Sun, Apr 26, 2009 at 10:47 PM, Michael Matz <matz@suse.de> wrote: > Hi, > > On Sun, 26 Apr 2009, Richard Guenther wrote: > >> > into the var annotation (to be able to read it out again when the same >> > basevar is seen for a different partition). But this whole info is >> > strictly local to the above function, so it doesn't need to live in the >> > annotation. I could very well implement this as an array indexed by >> > DECL_UID. The UIDs shouldn't become exceptionally large, so that seems >> >> Hm. DECL_UIDs are sparse (and global), it would be a bad idea to index >> an array with it. > > Those few 100k entries ... nobody will notice. Hmm, well, maybe someone > does :-) > >> Is var_ann->common.aux already used by out-of-SSA? > > Oh joy. This field is unused by out-of-SSA. Which makes sense, as - > drumroll - nothing (!) at all uses it (in the sense of deleting it doesn't > break compile). > > I'd rather remove that field and the fields in the var annotation. Indeed. I'm testing a patch to remove the aux field. The partial transition to tuples (the hack going back to GENERIC for expansion) also introduced two members, stmt and rn. Fixing that partial transition should shrink tree_ann_common_d further. In the end we would like to get rid of tree annotations completely of course. Richard. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-26 21:14 ` Richard Guenther @ 2009-04-26 21:15 ` Michael Matz 2009-04-26 21:17 ` Richard Guenther 0 siblings, 1 reply; 63+ messages in thread From: Michael Matz @ 2009-04-26 21:15 UTC (permalink / raw) To: Richard Guenther; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev Hi, On Sun, 26 Apr 2009, Richard Guenther wrote: > > I'd rather remove that field and the fields in the var annotation. > > Indeed. I'm testing a patch to remove the aux field. I had one in testing also removing the value_handle field. It's unused since some time too (it's moved over to the ssa_name structure directly). > The partial transition to tuples (the hack going back to GENERIC for > expansion) also introduced two members, stmt and rn. Fixing that > partial transition should shrink tree_ann_common_d further. Yes. Ciao, Michael. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-26 21:15 ` Michael Matz @ 2009-04-26 21:17 ` Richard Guenther 2009-04-26 22:21 ` Michael Matz 0 siblings, 1 reply; 63+ messages in thread From: Richard Guenther @ 2009-04-26 21:17 UTC (permalink / raw) To: Michael Matz; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev On Sun, Apr 26, 2009 at 11:13 PM, Michael Matz <matz@suse.de> wrote: > Hi, > > On Sun, 26 Apr 2009, Richard Guenther wrote: > >> > I'd rather remove that field and the fields in the var annotation. >> >> Indeed. I'm testing a patch to remove the aux field. > > I had one in testing also removing the value_handle field. It's unused > since some time too (it's moved over to the ssa_name structure directly). Pre-approved if it passes bootstrap. Thanks, Richard. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-26 21:17 ` Richard Guenther @ 2009-04-26 22:21 ` Michael Matz 0 siblings, 0 replies; 63+ messages in thread From: Michael Matz @ 2009-04-26 22:21 UTC (permalink / raw) To: Richard Guenther; +Cc: gcc-patches [-- Attachment #1: Type: TEXT/PLAIN, Size: 1013 bytes --] Hi, On Sun, 26 Apr 2009, Richard Guenther wrote: > > I had one in testing also removing the value_handle field. It's > > unused since some time too (it's moved over to the ssa_name structure > > directly). > > Pre-approved if it passes bootstrap. Committed this as r146820. Ciao, Michael. -- * tree-flow.h (tree_ann_common_d): Remove aux and value_handle members. Index: tree-flow.h =================================================================== --- tree-flow.h (Revision 146817) +++ tree-flow.h (Arbeitskopie) @@ -136,13 +136,6 @@ struct GTY(()) tree_ann_common_d { expansion (see gimple_to_tree). */ int rn; - /* Auxiliary info specific to a pass. At all times, this - should either point to valid data or be NULL. */ - PTR GTY ((skip (""))) aux; - - /* The value handle for this expression. Used by GVN-PRE. */ - tree GTY((skip)) value_handle; - /* Pointer to original GIMPLE statement. Used during RTL expansion (see gimple_to_tree). */ gimple stmt; ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-26 20:21 ` Michael Matz 2009-04-26 20:34 ` Richard Guenther @ 2009-04-26 21:42 ` Michael Matz 2009-04-26 22:15 ` Michael Matz 2009-04-27 12:34 ` Michael Matz 2 siblings, 1 reply; 63+ messages in thread From: Michael Matz @ 2009-04-26 21:42 UTC (permalink / raw) To: Richard Guenther; +Cc: gcc-patches Hi, On Sun, 26 Apr 2009, Michael Matz wrote: > > It looks like the pass referencing this is now dead. Please remove > > this function and the pass structure. > > It is, but done as followup. > > > pass_mark_unused_blocks is no longer needed now as you remove unused > > locals and that marks blocks used? Please remove this pass (and its > > code). > > Also followup. Like so. Will commit if regstrapping passes. Ciao, Michael. -- * tree-pass.h (pass_del_ssa, pass_mark_used_blocks, pass_free_cfg_annotations, pass_free_datastructures): Remove decls. * gimple-low.c (mark_blocks_with_used_vars, mark_used_blocks, pass_mark_used_blocks): Remove. * tree-optimize.c (pass_free_datastructures, execute_free_cfg_annotations, pass_free_cfg_annotations): Remove. * passes.c (init_optimization_passes): Don't call pass_mark_used_blocks, remove dead code. 
Index: tree-pass.h =================================================================== --- tree-pass.h (Revision 146806) +++ tree-pass.h (Arbeitskopie) @@ -344,7 +344,6 @@ extern struct gimple_opt_pass pass_ch; extern struct gimple_opt_pass pass_ccp; extern struct gimple_opt_pass pass_phi_only_cprop; extern struct gimple_opt_pass pass_build_ssa; -extern struct gimple_opt_pass pass_del_ssa; extern struct gimple_opt_pass pass_build_alias; extern struct gimple_opt_pass pass_dominator; extern struct gimple_opt_pass pass_dce; @@ -380,7 +379,6 @@ extern struct gimple_opt_pass pass_phipr extern struct gimple_opt_pass pass_tree_ifcombine; extern struct gimple_opt_pass pass_dse; extern struct gimple_opt_pass pass_nrv; -extern struct gimple_opt_pass pass_mark_used_blocks; extern struct gimple_opt_pass pass_rename_ssa_copies; extern struct gimple_opt_pass pass_rest_of_compilation; extern struct gimple_opt_pass pass_sink_code; @@ -414,8 +412,6 @@ extern struct simple_ipa_opt_pass pass_i extern struct gimple_opt_pass pass_all_optimizations; extern struct gimple_opt_pass pass_cleanup_cfg_post_optimizing; -extern struct gimple_opt_pass pass_free_cfg_annotations; -extern struct gimple_opt_pass pass_free_datastructures; extern struct gimple_opt_pass pass_init_datastructures; extern struct gimple_opt_pass pass_fixup_cfg; Index: gimple-low.c =================================================================== --- gimple-low.c (Revision 146806) +++ gimple-low.c (Arbeitskopie) @@ -900,61 +900,3 @@ record_vars (tree vars) { record_vars_into (vars, current_function_decl); } - - -/* Mark BLOCK used if it has a used variable in it, then recurse over its - subblocks. 
*/ - -static void -mark_blocks_with_used_vars (tree block) -{ - tree var; - tree subblock; - - if (!TREE_USED (block)) - { - for (var = BLOCK_VARS (block); - var; - var = TREE_CHAIN (var)) - { - if (TREE_USED (var)) - { - TREE_USED (block) = true; - break; - } - } - } - for (subblock = BLOCK_SUBBLOCKS (block); - subblock; - subblock = BLOCK_CHAIN (subblock)) - mark_blocks_with_used_vars (subblock); -} - -/* Mark the used attribute on blocks correctly. */ - -static unsigned int -mark_used_blocks (void) -{ - mark_blocks_with_used_vars (DECL_INITIAL (current_function_decl)); - return 0; -} - - -struct gimple_opt_pass pass_mark_used_blocks = -{ - { - GIMPLE_PASS, - "blocks", /* name */ - NULL, /* gate */ - mark_used_blocks, /* execute */ - NULL, /* sub */ - NULL, /* next */ - 0, /* static_pass_number */ - TV_NONE, /* tv_id */ - 0, /* properties_required */ - 0, /* properties_provided */ - 0, /* properties_destroyed */ - 0, /* todo_flags_start */ - TODO_dump_func /* todo_flags_finish */ - } -}; Index: tree-optimize.c =================================================================== --- tree-optimize.c (Revision 146817) +++ tree-optimize.c (Arbeitskopie) @@ -236,51 +236,6 @@ execute_free_datastructures (void) return 0; } -struct gimple_opt_pass pass_free_datastructures = -{ - { - GIMPLE_PASS, - NULL, /* name */ - NULL, /* gate */ - execute_free_datastructures, /* execute */ - NULL, /* sub */ - NULL, /* next */ - 0, /* static_pass_number */ - TV_NONE, /* tv_id */ - PROP_cfg, /* properties_required */ - 0, /* properties_provided */ - 0, /* properties_destroyed */ - 0, /* todo_flags_start */ - 0 /* todo_flags_finish */ - } -}; -/* Pass: free cfg annotations. 
*/ - -static unsigned int -execute_free_cfg_annotations (void) -{ - return 0; -} - -struct gimple_opt_pass pass_free_cfg_annotations = -{ - { - GIMPLE_PASS, - NULL, /* name */ - NULL, /* gate */ - execute_free_cfg_annotations, /* execute */ - NULL, /* sub */ - NULL, /* next */ - 0, /* static_pass_number */ - TV_NONE, /* tv_id */ - PROP_cfg, /* properties_required */ - 0, /* properties_provided */ - 0, /* properties_destroyed */ - 0, /* todo_flags_start */ - 0 /* todo_flags_finish */ - } -}; - /* Pass: fixup_cfg. IPA passes, compilation of earlier functions or inlining might have changed some properties, such as marked functions nothrow. Remove redundant edges and basic blocks, and create new ones if necessary. Index: passes.c =================================================================== --- passes.c (Revision 146817) +++ passes.c (Arbeitskopie) @@ -709,13 +709,9 @@ init_optimization_passes (void) NEXT_PASS (pass_cleanup_eh); NEXT_PASS (pass_nrv); NEXT_PASS (pass_mudflap_2); - NEXT_PASS (pass_mark_used_blocks); NEXT_PASS (pass_cleanup_cfg_post_optimizing); NEXT_PASS (pass_warn_function_noreturn); -/* NEXT_PASS (pass_del_ssa); - NEXT_PASS (pass_free_datastructures); - NEXT_PASS (pass_free_cfg_annotations);*/ NEXT_PASS (pass_expand); NEXT_PASS (pass_rest_of_compilation); ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-26 21:42 ` Michael Matz @ 2009-04-26 22:15 ` Michael Matz 0 siblings, 0 replies; 63+ messages in thread From: Michael Matz @ 2009-04-26 22:15 UTC (permalink / raw) To: gcc-patches Hi, On Sun, 26 Apr 2009, Michael Matz wrote: > Like so. Will commit if regstrapping passes. Which it did on x86_64-linux --> r146819. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-26 20:21 ` Michael Matz 2009-04-26 20:34 ` Richard Guenther 2009-04-26 21:42 ` Michael Matz @ 2009-04-27 12:34 ` Michael Matz 2 siblings, 0 replies; 63+ messages in thread From: Michael Matz @ 2009-04-27 12:34 UTC (permalink / raw) To: Richard Guenther; +Cc: gcc-patches [-- Attachment #1: Type: TEXT/PLAIN, Size: 7187 bytes --] Hi, On Sun, 26 Apr 2009, Michael Matz wrote: > > > + if (SA.values > > > + && TREE_CODE (lhs) == SSA_NAME > > > + && SA.values[SSA_NAME_VERSION (lhs)]) > > > + lhs = gimple_assign_rhs_to_tree (SA.values[SSA_NAME_VERSION (lhs)]); > > > > Do we really need the SA.values array here? It seems that > > a bitmap would be enough, as the definition should be still > > reachable via SSA_NAME_DEF_STMT (lhs). (Thanks Paolo > > for noticing this) > > This should indeed be possible, but I want to do this as cleanup next > week. Like so. Already regstrapped without regressions on x86_64-linux. Committed as r146837. Ciao, Michael. -- * ssaexpand.h (struct ssaexpand): Member 'values' is a bitmap. (get_gimple_for_ssa_name): Adjust, lookup using SSA_NAME_DEF_STMT. * tree-ssa-live.h: (find_replaceable_exprs): Return a bitmap. (dump_replaceable_exprs): Take a bitmap. * cfgexpand.c (gimple_cond_pred_to_tree): Handle bitmap instead of array. (expand_gimple_basic_block): Likewise. * tree-ssa-ter.c (struct temp_expr_table_d): Make replaceable_expressions member a bitmap. (free_temp_expr_table): Pass back and deal with bitmap, not gimple*. (mark_replaceable): Likewise. (find_replaceable_in_bb, dump_replaceable_exprs): Likewise. * tree-outof-ssa.c (remove_ssa_form): 'values' is a bitmap.
Index: cfgexpand.c =================================================================== --- cfgexpand.c (Revision 146817) +++ cfgexpand.c (Arbeitskopie) @@ -94,8 +94,8 @@ gimple_cond_pred_to_tree (gimple stmt) tree lhs = gimple_cond_lhs (stmt); if (SA.values && TREE_CODE (lhs) == SSA_NAME - && SA.values[SSA_NAME_VERSION (lhs)]) - lhs = gimple_assign_rhs_to_tree (SA.values[SSA_NAME_VERSION (lhs)]); + && bitmap_bit_p (SA.values, SSA_NAME_VERSION (lhs))) + lhs = gimple_assign_rhs_to_tree (SSA_NAME_DEF_STMT (lhs)); return build2 (gimple_cond_code (stmt), boolean_type_node, lhs, gimple_cond_rhs (stmt)); @@ -2078,7 +2078,8 @@ expand_gimple_basic_block (basic_block b /* Ignore this stmt if it is in the list of replaceable expressions. */ if (SA.values - && SA.values[SSA_NAME_VERSION (DEF_FROM_PTR (def_p))]) + && bitmap_bit_p (SA.values, + SSA_NAME_VERSION (DEF_FROM_PTR (def_p)))) continue; } stmt_tree = gimple_to_tree (stmt); Index: tree-ssa-live.h =================================================================== --- tree-ssa-live.h (Revision 146817) +++ tree-ssa-live.h (Arbeitskopie) @@ -340,8 +340,8 @@ extern var_map coalesce_ssa_name (void); /* From tree-ssa-ter.c */ -extern gimple *find_replaceable_exprs (var_map); -extern void dump_replaceable_exprs (FILE *, gimple *); +extern bitmap find_replaceable_exprs (var_map); +extern void dump_replaceable_exprs (FILE *, bitmap); #endif /* _TREE_SSA_LIVE_H */ Index: tree-ssa-ter.c =================================================================== --- tree-ssa-ter.c (Revision 146815) +++ tree-ssa-ter.c (Arbeitskopie) @@ -159,7 +159,7 @@ typedef struct temp_expr_table_d { var_map map; bitmap *partition_dependencies; /* Partitions expr is dependent on. */ - gimple *replaceable_expressions; /* Replacement expression table. */ + bitmap replaceable_expressions; /* Replacement expression table. */ bitmap *expr_decl_uids; /* Base uids of exprs. */ bitmap *kill_list; /* Expr's killed by a partition. 
*/ int virtual_partition; /* Pseudo partition for virtual ops. */ @@ -216,10 +216,10 @@ new_temp_expr_table (var_map map) /* Free TER table T. If there are valid replacements, return the expression vector. */ -static gimple * +static bitmap free_temp_expr_table (temp_expr_table_p t) { - gimple *ret = NULL; + bitmap ret = NULL; #ifdef ENABLE_CHECKING unsigned x; @@ -255,7 +255,7 @@ version_to_be_replaced_p (temp_expr_tabl { if (!tab->replaceable_expressions) return false; - return tab->replaceable_expressions[version] != NULL; + return bitmap_bit_p (tab->replaceable_expressions, version); } @@ -562,8 +562,8 @@ mark_replaceable (temp_expr_table_p tab, /* Set the replaceable expression. */ if (!tab->replaceable_expressions) - tab->replaceable_expressions = XCNEWVEC (gimple, num_ssa_names + 1); - tab->replaceable_expressions[version] = SSA_NAME_DEF_STMT (var); + tab->replaceable_expressions = BITMAP_ALLOC (NULL); + bitmap_set_bit (tab->replaceable_expressions, version); } @@ -653,12 +653,12 @@ find_replaceable_in_bb (temp_expr_table_ NULL is returned by the function, otherwise an expression vector indexed by SSA_NAME version numbers. */ -extern gimple * +extern bitmap find_replaceable_exprs (var_map map) { basic_block bb; temp_expr_table_p table; - gimple *ret; + bitmap ret; table = new_temp_expr_table (map); FOR_EACH_BB (bb) @@ -676,19 +676,19 @@ find_replaceable_exprs (var_map map) /* Dump TER expression table EXPR to file F. 
*/ void -dump_replaceable_exprs (FILE *f, gimple *expr) +dump_replaceable_exprs (FILE *f, bitmap expr) { tree var; unsigned x; fprintf (f, "\nReplacing Expressions\n"); for (x = 0; x < num_ssa_names; x++) - if (expr[x]) + if (bitmap_bit_p (expr, x)) { var = ssa_name (x); print_generic_expr (f, var, TDF_SLIM); fprintf (f, " replace with --> "); - print_gimple_stmt (f, expr[x], 0, TDF_SLIM); + print_gimple_stmt (f, SSA_NAME_DEF_STMT (var), 0, TDF_SLIM); fprintf (f, "\n"); } fprintf (f, "\n"); Index: tree-outof-ssa.c =================================================================== --- tree-outof-ssa.c (Revision 146817) +++ tree-outof-ssa.c (Arbeitskopie) @@ -791,7 +791,7 @@ expand_phi_nodes (struct ssaexpand *sa) static void remove_ssa_form (bool perform_ter, struct ssaexpand *sa) { - gimple *values = NULL; + bitmap values = NULL; var_map map; unsigned i; @@ -926,7 +926,7 @@ finish_out_of_ssa (struct ssaexpand *sa) { free (sa->partition_to_pseudo); if (sa->values) - free (sa->values); + BITMAP_FREE (sa->values); delete_var_map (sa->map); BITMAP_FREE (sa->partition_has_default_def); memset (sa, 0, sizeof *sa); Index: ssaexpand.h =================================================================== --- ssaexpand.h (Revision 146817) +++ ssaexpand.h (Arbeitskopie) @@ -31,10 +31,9 @@ struct ssaexpand /* The computed partitions of SSA names are stored here. */ var_map map; - /* For a SSA name version V values[V] contains the gimple statement - defining it iff TER decided that it should be forwarded, NULL - otherwise. */ - gimple *values; + /* For an SSA name version V bit V is set iff TER decided that + its definition should be forwarded. 
*/ + bitmap values; /* For a partition number I partition_to_pseudo[I] contains the RTL expression of the allocated space of it (either a MEM or @@ -67,8 +66,8 @@ static inline gimple get_gimple_for_ssa_name (tree exp) { int v = SSA_NAME_VERSION (exp); - if (SA.values) - return SA.values[v]; + if (SA.values && bitmap_bit_p (SA.values, v)) + return SSA_NAME_DEF_STMT (exp); return NULL; } ^ permalink raw reply [flat|nested] 63+ messages in thread
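The idea behind the patch above (replace a per-version array of statement pointers with one bit per version, and fetch the defining statement from the name itself when the bit is set) can be shown in a reduced, standalone form. This is only a sketch under stated assumptions and does not use GCC's bitmap API: `struct name`, `struct bitset`, `bitset_set`, `bitset_test` and `def_if_forwarded` are hypothetical names, statements are modelled as plain strings, and versions are assumed to be smaller than `MAX_NAMES`.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define WORD_BITS (sizeof (unsigned long) * CHAR_BIT)
#define MAX_NAMES 256  /* Illustrative upper bound on version numbers.  */

/* Hypothetical model of an SSA name: it already knows its own
   defining statement, so no side table of statements is needed.  */
struct name
{
  unsigned version;      /* Must be < MAX_NAMES.  */
  const char *def_stmt;  /* Stand-in for the defining statement.  */
};

/* Fixed-size bit set, one bit per version.  */
struct bitset
{
  unsigned long words[(MAX_NAMES + WORD_BITS - 1) / WORD_BITS];
};

/* Mark BIT as set in B.  */
static void
bitset_set (struct bitset *b, unsigned bit)
{
  b->words[bit / WORD_BITS] |= 1UL << (bit % WORD_BITS);
}

/* Return nonzero iff BIT is set in B.  */
static int
bitset_test (const struct bitset *b, unsigned bit)
{
  return (b->words[bit / WORD_BITS] >> (bit % WORD_BITS)) & 1UL;
}

/* Return the defining statement of N if its version is marked in
   VALUES, NULL otherwise.  VALUES may be NULL, meaning that nothing
   was marked at all.  */
static const char *
def_if_forwarded (const struct bitset *values, const struct name *n)
{
  if (values && bitset_test (values, n->version))
    return n->def_stmt;
  return NULL;
}
```

The side table shrinks from one pointer per name to one bit per name, at the price of an extra indirection through the name when the statement is actually needed.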
* Re: [RFA] expand from SSA form (1/2)
  2009-04-22 16:45 ` [RFA] " Michael Matz
  2009-04-23 15:10   ` Andrew MacLeod
  2009-04-24 14:32   ` Richard Guenther
@ 2009-04-27 5:47   ` H.J. Lu
  2009-04-28 23:49     ` H.J. Lu
  2009-04-27 7:22   ` Hans-Peter Nilsson
  2009-04-30 18:18   ` Steve Ellcey
  4 siblings, 1 reply; 63+ messages in thread
From: H.J. Lu @ 2009-04-27 5:47 UTC
  To: Michael Matz; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev

On Wed, Apr 22, 2009 at 9:42 AM, Michael Matz <matz@suse.de> wrote:
> On Wed, 22 Apr 2009, Michael Matz wrote:
>
>> I'll soon send a new version of the patch that fixes all problems and
>> testcases I encountered.
>
> Like so.  This is the full patch, i.e. including the cleanups, but
> excluding the testsuite changes.  It should incorporate all feedback.
> Compared to the last version it adds comments for new functions, fixes
> mudflap2, and generally some other minor problems showing when I started
> testing Ada and a bug reported by Andrey.
>
> This patch (plus testsuite changes) was bootstrapped with Ada on
> x86_64-linux.  There are no testsuite regressions:
> FAIL: gcc.dg/tree-prof/bb-reorg.c compilation,  -fprofile-use -D_PROFILE_USE
> FAIL: gcc.dg/tree-prof/pr34999.c compilation,  -fprofile-use -D_PROFILE_USE
> FAIL: gcc.target/i386/avx-vmovntdq-256-1.c (test for excess errors)
> FAIL: gcc.target/i386/avx-vmovntpd-256-1.c (test for excess errors)
> FAIL: gcc.target/i386/avx-vmovntps-256-1.c (test for excess errors)
> FAIL: libmudflap.c++/pass41-frag.cxx execution test
> FAIL: libmudflap.c++/pass41-frag.cxx (-static) execution test
> FAIL: libmudflap.c++/pass41-frag.cxx (-O2) execution test
> FAIL: libmudflap.c++/pass41-frag.cxx (-O3) execution test
>
> All of these happen without the patch too (known bugs, old binutils, and
> pass41-frag never seems to work anyway).
>
> I'd like to ask for approval for the series.
>
>
> Ciao,
> Michael.
> --
>       * builtins.c (fold_builtin_next_arg): Handle SSA names.
>       * tree-ssa-copyrename.c (rename_ssa_copies): Don't iterate
>       beyond num_ssa_names, use ssa_name() directly.
>       * tree-ssa-ter.c (free_temp_expr_table): Likewise.
>       * tree-ssa-coalesce.c (create_outofssa_var_map): Likewise,
>       mark only useful SSA names.
>       (compare_pairs): Swap cost comparison.
>       (coalesce_ssa_name): Don't use change_partition_var.
>       * tree-nrv.c (struct nrv_data): Add modified member.
>       (finalize_nrv_r): Set it.
>       (tree_nrv): Use it to update statements.
>       (pass_nrv): Require PROP_ssa.
>       * tree-mudflap.c (create_referenced_var): New static helper.
>       (mf_decl_cache_locals, mf_build_check_statement_for): Use it.
>       (pass_mudflap_2): Require PROP_ssa, run ssa update at finish.
>       * alias.c (find_base_decl): Handle SSA names.
>       * emit-rtl.c (set_reg_attrs_for_parm): Make non-static.
>       (component_ref_for_mem_expr): Don't leak SSA names into RTL.
>       * rtl.h (set_reg_attrs_for_parm): Declare.
>       * tree-optimize.c (pass_cleanup_cfg_post_optimizing): Rename
>       to "optimized", remove unused locals at finish.
>       (execute_free_datastructures): Make global, call
>       delete_tree_cfg_annotations.
>       (execute_free_cfg_annotations): Don't call
>       delete_tree_cfg_annotations.
>
>       * ssaexpand.h: New file.
>       * expr.c (toplevel): Include ssaexpand.h.
>       (expand_assignment): Handle SSA names the same as register
>       variables.
>       (expand_expr_real_1): Expand SSA names.
>       * cfgexpand.c (toplevel): Include ssaexpand.h.
>       (SA): New global variable.
>       (gimple_cond_pred_to_tree): Fold TERed comparisons into predicates.
>       (SSAVAR): New macro.
>       (set_rtl): New helper function.
>       (add_stack_var): Deal with SSA names, use set_rtl.
>       (expand_one_stack_var_at): Likewise.
>       (expand_one_stack_var): Deal with SSA names.
>       (stack_var_size_cmp): Use code (SSA_NAME / DECL) as tie breaker
>       before unique numbers.
>       (expand_stack_vars): Use set_rtl.
>       (expand_one_var): Accept SSA names, add asserts for them, feed them
>       to above subroutines.
>       (expand_used_vars): Expand all partitions (without default defs),
>       then only the local decls (ignoring those expanded already).
>       (expand_gimple_cond): Remove edges when jumpif() expands an
>       unconditional jump.
>       (expand_gimple_basic_block): Don't clear EDGE_EXECUTABLE here,
>       or remove abnormal edges.  Ignore insns setting the LHS of a TERed
>       SSA name.
>       (gimple_expand_cfg): Call into rewrite_out_of_ssa, initialize
>       members of SA; deal with PARM_DECL partitions here; expand
>       all PHI nodes, free tree datastructures and SA.  Commit instructions
>       on edges, clear EDGE_EXECUTABLE and remove abnormal edges here.
>       (pass_expand): Require and destroy PROP_ssa, verify SSA form, flow
>       info and statements at start, collect garbage at finish.
>       * tree-ssa-live.h (struct _var_map): Remove partition_to_var member.
>       (VAR_ANN_PARTITION): Remove.
>       (change_partition_var): Don't declare.
>       (partition_to_var): Always return SSA names.
>       (var_to_partition): Only accept SSA names.
>       (register_ssa_partition): Only check argument.
>       * tree-ssa-live.c (init_var_map): Don't allocate partition_to_var
>       member.
>       (delete_var_map): Don't free it.
>       (var_union): Only accept SSA names, simplify.
>       (partition_view_init): Mark only useful SSA names as used.
>       (partition_view_fini): Only deal with SSA names.
>       (change_partition_var): Remove.
>       (dump_var_map): Use ssa_name instead of partition_to_var member.
>       * tree-ssa.c (delete_tree_ssa): Don't remove PHI nodes on RTL
>       basic blocks.
>       * tree-outof-ssa.c (toplevel): Include ssaexpand.h and expr.h.
>       (struct _elim_graph): New member const_dests; nodes member vector of
>       ints.
>       (set_location_for_edge): New static helper.
>       (create_temp): Remove.
>       (insert_partition_copy_on_edge, insert_part_to_rtx_on_edge,
>       insert_value_copy_on_edge, insert_rtx_to_part_on_edge): New
>       functions.
>       (new_elim_graph): Allocate const_dests member.
>       (clean_elim_graph): Truncate const_dests member.
>       (delete_elim_graph): Free const_dests member.
>       (elim_graph_size): Adapt to new type of nodes member.
>       (elim_graph_add_node): Likewise.
>       (eliminate_name): Likewise.
>       (eliminate_build): Don't take basic block argument, deal only with
>       partition numbers, not variables.
>       (get_temp_reg): New static helper.
>       (elim_create): Use it, deal with RTL temporaries instead of trees.
>       (eliminate_phi): Adjust all calls to new signature.
>       (assign_vars, replace_use_variable, replace_def_variable): Remove.
>       (rewrite_trees): Only do checking.
>       (edge_leader, stmt_list, leader_has_match, leader_match): Remove.
>       (same_stmt_list_p, identical_copies_p, identical_stmt_lists_p,
>       init_analyze_edges_for_bb, fini_analyze_edges_for_bb,
>       contains_tree_r, MAX_STMTS_IN_LATCH,
>       process_single_block_loop_latch, analyze_edges_for_bb,
>       perform_edge_inserts): Remove.
>       (expand_phi_nodes): New global function.
>       (remove_ssa_form): Take ssaexpand parameter.  Don't call removed
>       functions, initialize new parameter, remember partitions having a
>       default def.
>       (finish_out_of_ssa): New global function.
>       (rewrite_out_of_ssa): Make global.  Adjust call to remove_ssa_form,
>       don't reset in_ssa_p here.
>       (pass_del_ssa): Remove.
>       * tree-flow.h (struct var_ann_d): Remove out_of_ssa_tag and
>       partition members.
>       (execute_free_datastructures): Declare.
>       * Makefile.in (SSAEXPAND_H): New variable.
>       (tree-outof-ssa.o, expr.o, cfgexpand.o): Depend on SSAEXPAND_H.
>       * basic-block.h (commit_one_edge_insertion): Declare.
>       * passes.c (init_optimization_passes): Move pass_nrv and
>       pass_mudflap_2 before pass_cleanup_cfg_post_optimizing, remove
>       pass_del_ssa, pass_free_datastructures, pass_free_cfg_annotations.
>       * cfgrtl.c (commit_one_edge_insertion): Make global, don't declare.
>       (redirect_branch_edge): Deal with super block when expanding, split
>       out jump patching itself into ...
>       (patch_jump_insn): ... here, new static helper.
>

This patch caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39922

You may need a 32bit host to see it since I didn't see it on
Linux/x86-64 with -m32.

--
H.J.

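One ChangeLog entry quoted in the message above, the one for `stack_var_size_cmp`, describes a comparator that uses the kind of object (SSA name or decl) as a tie breaker before falling back to unique numbers. A minimal standalone sketch of such a multi-key comparator follows. It is an illustration under assumptions, not the code from the patch: `struct stack_var`, `enum var_kind`, `stack_var_cmp` and `sort_stack_vars` are hypothetical names, and both the ascending order and the choice to put SSA names before decls are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical tag: is the variable an SSA name or an ordinary decl?  */
enum var_kind { KIND_SSA_NAME, KIND_DECL };

/* Hypothetical record for one stack variable.  */
struct stack_var
{
  long size;            /* primary sort key */
  enum var_kind kind;   /* first tie breaker */
  unsigned uid;         /* last tie breaker, unique within one kind */
};

/* qsort comparator: order by size, then by kind, then by uid.
   Two distinct records never compare equal, so the resulting order
   does not depend on how qsort treats equal elements.  */
int
stack_var_cmp (const void *pa, const void *pb)
{
  const struct stack_var *a = pa;
  const struct stack_var *b = pb;

  if (a->size != b->size)
    return a->size < b->size ? -1 : 1;
  if (a->kind != b->kind)
    return a->kind < b->kind ? -1 : 1;
  if (a->uid != b->uid)
    return a->uid < b->uid ? -1 : 1;
  return 0;
}

/* Sort N records in place.  */
void
sort_stack_vars (struct stack_var *vars, size_t n)
{
  assert (vars != NULL || n == 0);
  if (n > 1)
    qsort (vars, n, sizeof *vars, stack_var_cmp);
}
```

Comparing the kind before the unique number matters when the two kinds draw their numbers from different spaces: an SSA version and a decl UID may be the same number, so the number alone cannot order one against the other.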
* Re: [RFA] expand from SSA form (1/2)
  2009-04-27 5:47 ` H.J. Lu
@ 2009-04-28 23:49   ` H.J. Lu
  2009-04-29 0:21     ` Andrew Pinski
  2009-04-30 13:47     ` H.J. Lu
  0 siblings, 2 replies; 63+ messages in thread
From: H.J. Lu @ 2009-04-28 23:49 UTC
  To: Michael Matz; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev

On Sun, Apr 26, 2009 at 7:40 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Wed, Apr 22, 2009 at 9:42 AM, Michael Matz <matz@suse.de> wrote:
>> [...]
>
> This patch caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39922
>
> You may need a 32bit host to see it since I didn't see it on
> Linux/x86-64 with -m32.
>

This also caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39954

--
H.J.

* Re: [RFA] expand from SSA form (1/2)
  2009-04-28 23:49 ` H.J. Lu
@ 2009-04-29 0:21   ` Andrew Pinski
  2009-04-30 13:47   ` H.J. Lu
  1 sibling, 0 replies; 63+ messages in thread
From: Andrew Pinski @ 2009-04-29 0:21 UTC
  To: H.J. Lu; +Cc: Michael Matz, gcc-patches, Andrew MacLeod, Andrey Belevantsev

On Tue, Apr 28, 2009 at 4:44 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>
> This also caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39954

But as I mentioned in that bug report, I think taking the address here
is not really a valid thing to do.

--
Pinski

* Re: [RFA] expand from SSA form (1/2)
  2009-04-28 23:49 ` H.J. Lu
  2009-04-29 0:21   ` Andrew Pinski
@ 2009-04-30 13:47   ` H.J. Lu
  2009-05-29 3:47     ` H.J. Lu
  1 sibling, 1 reply; 63+ messages in thread
From: H.J. Lu @ 2009-04-30 13:47 UTC
  To: Michael Matz; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev

On Tue, Apr 28, 2009 at 4:44 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Sun, Apr 26, 2009 at 7:40 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Wed, Apr 22, 2009 at 9:42 AM, Michael Matz <matz@suse.de> wrote:
>>> [...]
>>
>> This patch caused:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39922
>>
>> You may need a 32bit host to see it since I didn't see it on
>> Linux/x86-64 with -m32.
>>
>
> This also caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39954
>

This also caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39973

and may cause:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39972

--
H.J.

* Re: [RFA] expand from SSA form (1/2)
  2009-04-30 13:47 ` H.J. Lu
@ 2009-05-29 3:47   ` H.J. Lu
  2010-10-20 18:02     ` H.J. Lu
  0 siblings, 1 reply; 63+ messages in thread
From: H.J. Lu @ 2009-05-29 3:47 UTC
  To: Michael Matz; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev

On Thu, Apr 30, 2009 at 6:06 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Tue, Apr 28, 2009 at 4:44 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Sun, Apr 26, 2009 at 7:40 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> On Wed, Apr 22, 2009 at 9:42 AM, Michael Matz <matz@suse.de> wrote:
>>>> [...]
>>>
>>> This patch caused:
>>>
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39922
>>>
>>> You may need a 32bit host to see it since I didn't see it on
>>> Linux/x86-64 with -m32.
>>>
>>
>> This also caused:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39954
>>
>
> This also caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39973
>
> and may cause:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39972
>

This also caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40012

--
H.J.

* Re: [RFA] expand from SSA form (1/2) 2009-05-29 3:47 ` H.J. Lu @ 2010-10-20 18:02 ` H.J. Lu 2011-02-14 18:02 ` H.J. Lu 0 siblings, 1 reply; 63+ messages in thread From: H.J. Lu @ 2010-10-20 18:02 UTC (permalink / raw) To: Michael Matz; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev On Thu, May 28, 2009 at 8:24 PM, H.J. Lu <hjl.tools@gmail.com> wrote: > On Thu, Apr 30, 2009 at 6:06 AM, H.J. Lu <hjl.tools@gmail.com> wrote: >> On Tue, Apr 28, 2009 at 4:44 PM, H.J. Lu <hjl.tools@gmail.com> wrote: >>> On Sun, Apr 26, 2009 at 7:40 PM, H.J. Lu <hjl.tools@gmail.com> wrote: >>>> On Wed, Apr 22, 2009 at 9:42 AM, Michael Matz <matz@suse.de> wrote: >>>>> On Wed, 22 Apr 2009, Michael Matz wrote: >>>>> >>>>>> I'll soon send a new version of the patch that fixes all problems and >>>>>> testcases I encountered. >>>>> >>>>> Like so. This is the full patch, i.e. including the cleanups, but >>>>> excluding the testsuite changes. It should incorporate all feedback. >>>>> Compared to the last version it adds comments for new functions, fixes >>>>> mudflap2, and generally some other minor problems showing when I started >>>>> testing Ada and a bug reported by Andrey. >>>>> >>>>> This patch (plus testsuite changes) was bootstrapped with Ada on >>>>> x86_64-linux.
There are no testsuite regressions: >>>>> FAIL: gcc.dg/tree-prof/bb-reorg.c compilation, -fprofile-use -D_PROFILE_USE >>>>> FAIL: gcc.dg/tree-prof/pr34999.c compilation, -fprofile-use -D_PROFILE_USE >>>>> FAIL: gcc.target/i386/avx-vmovntdq-256-1.c (test for excess errors) >>>>> FAIL: gcc.target/i386/avx-vmovntpd-256-1.c (test for excess errors) >>>>> FAIL: gcc.target/i386/avx-vmovntps-256-1.c (test for excess errors) >>>>> FAIL: libmudflap.c++/pass41-frag.cxx execution test >>>>> FAIL: libmudflap.c++/pass41-frag.cxx (-static) execution test >>>>> FAIL: libmudflap.c++/pass41-frag.cxx (-O2) execution test >>>>> FAIL: libmudflap.c++/pass41-frag.cxx (-O3) execution test >>>>> >>>>> All of these happen without the patch too (known bugs, old binutils, and >>>>> pass41-frag never seems to work anyway). >>>>> >>>>> I'd like to ask for approval for the series. >>>>> >>>>> >>>>> Ciao, >>>>> Michael. >>>>> -- >>>>> * builtins.c (fold_builtin_next_arg): Handle SSA names. >>>>> * tree-ssa-copyrename.c (rename_ssa_copies): Don't iterate >>>>> beyond num_ssa_names, use ssa_name() directly. >>>>> * tree-ssa-ter.c (free_temp_expr_table): Likewise. >>>>> * tree-ssa-coalesce.c (create_outofssa_var_map): Likewise, >>>>> mark only useful SSA names. >>>>> (compare_pairs): Swap cost comparison. >>>>> (coalesce_ssa_name): Don't use change_partition_var. >>>>> * tree-nrv.c (struct nrv_data): Add modified member. >>>>> (finalize_nrv_r): Set it. >>>>> (tree_nrv): Use it to update statements. >>>>> (pass_nrv): Require PROP_ssa. >>>>> * tree-mudflap.c (create_referenced_var): New static helper. >>>>> (mf_decl_cache_locals, mf_build_check_statement_for): Use it. >>>>> (pass_mudflap_2): Require PROP_ssa, run ssa update at finish. >>>>> * alias.c (find_base_decl): Handle SSA names. >>>>> * emit-rtl (set_reg_attrs_for_parm): Make non-static. >>>>> (component_ref_for_mem_expr): Don't leak SSA names into RTL. >>>>> * rtl.h (set_reg_attrs_for_parm): Declare. 
>>>>> * tree-optimize.c (pass_cleanup_cfg_post_optimizing): Rename >>>>> to "optimized", remove unused locals at finish. >>>>> (execute_free_datastructures): Make global, call >>>>> delete_tree_cfg_annotations. >>>>> (execute_free_cfg_annotations): Don't call >>>>> delete_tree_cfg_annotations. >>>>> >>>>> * ssaexpand.h: New file. >>>>> * expr.c (toplevel): Include ssaexpand.h. >>>>> (expand_assignment): Handle SSA names the same as register >>>>> variables. >>>>> (expand_expr_real_1): Expand SSA names. >>>>> * cfgexpand.c (toplevel): Include ssaexpand.h. >>>>> (SA): New global variable. >>>>> (gimple_cond_pred_to_tree): Fold TERed comparisons into predicates. >>>>> (SSAVAR): New macro. >>>>> (set_rtl): New helper function. >>>>> (add_stack_var): Deal with SSA names, use set_rtl. >>>>> (expand_one_stack_var_at): Likewise. >>>>> (expand_one_stack_var): Deal with SSA names. >>>>> (stack_var_size_cmp): Use code (SSA_NAME / DECL) as tie breaker >>>>> before unique numbers. >>>>> (expand_stack_vars): Use set_rtl. >>>>> (expand_one_var): Accept SSA names, add asserts for them, feed them >>>>> to above subroutines. >>>>> (expand_used_vars): Expand all partitions (without default defs), >>>>> then only the local decls (ignoring those expanded already). >>>>> (expand_gimple_cond): Remove edges when jumpif() expands an >>>>> unconditional jump. >>>>> (expand_gimple_basic_block): Don't clear EDGE_EXECUTABLE here, >>>>> or remove abnormal edges. Ignore insns setting the LHS of a TERed >>>>> SSA name. >>>>> (gimple_expand_cfg): Call into rewrite_out_of_ssa, initialize >>>>> members of SA; deal with PARM_DECL partitions here; expand >>>>> all PHI nodes, free tree datastructures and SA. Commit instructions >>>>> on edges, clear EDGE_EXECUTABLE and remove abnormal edges here. >>>>> (pass_expand): Require and destroy PROP_ssa, verify SSA form, flow >>>>> info and statements at start, collect garbage at finish. 
>>>>> * tree-ssa-live.h (struct _var_map): Remove partition_to_var member. >>>>> (VAR_ANN_PARTITION) Remove. >>>>> (change_partition_var): Don't declare. >>>>> (partition_to_var): Always return SSA names. >>>>> (var_to_partition): Only accept SSA names. >>>>> (register_ssa_partition): Only check argument. >>>>> * tree-ssa-live.c (init_var_map): Don't allocate partition_to_var >>>>> member. >>>>> (delete_var_map): Don't free it. >>>>> (var_union): Only accept SSA names, simplify. >>>>> (partition_view_init): Mark only useful SSA names as used. >>>>> (partition_view_fini): Only deal with SSA names. >>>>> (change_partition_var): Remove. >>>>> (dump_var_map): Use ssa_name instead of partition_to_var member. >>>>> * tree-ssa.c (delete_tree_ssa): Don't remove PHI nodes on RTL >>>>> basic blocks. >>>>> * tree-outof-ssa.c (toplevel): Include ssaexpand.h and expr.h. >>>>> (struct _elim_graph): New member const_dests; nodes member vector of >>>>> ints. >>>>> (set_location_for_edge): New static helper. >>>>> (create_temp): Remove. >>>>> (insert_partition_copy_on_edge, insert_part_to_rtx_on_edge, >>>>> insert_value_copy_on_edge, insert_rtx_to_part_on_edge): New >>>>> functions. >>>>> (new_elim_graph): Allocate const_dests member. >>>>> (clean_elim_graph): Truncate const_dests member. >>>>> (delete_elim_graph): Free const_dests member. >>>>> (elim_graph_size): Adapt to new type of nodes member. >>>>> (elim_graph_add_node): Likewise. >>>>> (eliminate_name): Likewise. >>>>> (eliminate_build): Don't take basic block argument, deal only with >>>>> partition numbers, not variables. >>>>> (get_temp_reg): New static helper. >>>>> (elim_create): Use it, deal with RTL temporaries instead of trees. >>>>> (eliminate_phi): Adjust all calls to new signature. >>>>> (assign_vars, replace_use_variable, replace_def_variable): Remove. >>>>> (rewrite_trees): Only do checking. >>>>> (edge_leader, stmt_list, leader_has_match, leader_match): Remove. 
>>>>> (same_stmt_list_p, identical_copies_p, identical_stmt_lists_p, >>>>> init_analyze_edges_for_bb, fini_analyze_edges_for_bb, >>>>> contains_tree_r, MAX_STMTS_IN_LATCH, >>>>> process_single_block_loop_latch, analyze_edges_for_bb, >>>>> perform_edge_inserts): Remove. >>>>> (expand_phi_nodes): New global function. >>>>> (remove_ssa_form): Take ssaexpand parameter. Don't call removed >>>>> functions, initialize new parameter, remember partitions having a >>>>> default def. >>>>> (finish_out_of_ssa): New global function. >>>>> (rewrite_out_of_ssa): Make global. Adjust call to remove_ssa_form, >>>>> don't reset in_ssa_p here. >>>>> (pass_del_ssa): Remove. >>>>> * tree-flow.h (struct var_ann_d): Remove out_of_ssa_tag and >>>>> partition members. >>>>> (execute_free_datastructures): Declare. >>>>> * Makefile.in (SSAEXPAND_H): New variable. >>>>> (tree-outof-ssa.o, expr.o, cfgexpand.o): Depend on SSAEXPAND_H. >>>>> * basic-block.h (commit_one_edge_insertion): Declare. >>>>> * passes.c (init_optimization_passes): Move pass_nrv and >>>>> pass_mudflap2 before pass_cleanup_cfg_post_optimizing, remove >>>>> pass_del_ssa, pass_free_datastructures, pass_free_cfg_annotations. >>>>> * cfgrtl.c (commit_one_edge_insertion): Make global, don't declare. >>>>> (redirect_branch_edge): Deal with super block when expanding, split >>>>> out jump patching itself into ... >>>>> (patch_jump_insn): ... here, new static helper. >>>>> >>>> >>>> This patch caused: >>>> >>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39922 >>>> >>>> You may need a 32bit host to see it since I didn't see it on >>>> Linux/x86-64 with -m32. 
>>>> >>> >>> This also caused: >>> >>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39954 >>> >> >> This also caused: >> >> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39973 >> >> and may cause: >> >> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39972 >> > > This also caused: > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40012 > > This also caused: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46098 -- H.J. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2010-10-20 18:02 ` H.J. Lu @ 2011-02-14 18:02 ` H.J. Lu 0 siblings, 0 replies; 63+ messages in thread From: H.J. Lu @ 2011-02-14 18:02 UTC (permalink / raw) To: Michael Matz; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev

On Wed, Oct 20, 2010 at 10:00 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Thu, May 28, 2009 at 8:24 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Thu, Apr 30, 2009 at 6:06 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> On Tue, Apr 28, 2009 at 4:44 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>> On Sun, Apr 26, 2009 at 7:40 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>> On Wed, Apr 22, 2009 at 9:42 AM, Michael Matz <matz@suse.de> wrote:
>>>>>> <long expand-from-ssa CL elided>
>>>>>
>>>>> This patch caused:
>>>>>
>>>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39922
>>>>>
>>>>> You may need a 32bit host to see it since I didn't see it on
>>>>> Linux/x86-64 with -m32.
>>>>
>>>> This also caused:
>>>>
>>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39954
>>>
>>> This also caused:
>>>
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39973
>>>
>>> and may cause:
>>>
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39972
>>
>> This also caused:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40012
>
> This also caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46098

This also caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47735

--
H.J.

^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-22 16:45 ` [RFA] " Michael Matz ` (2 preceding siblings ...) 2009-04-27 5:47 ` H.J. Lu @ 2009-04-27 7:22 ` Hans-Peter Nilsson 2009-04-30 18:18 ` Steve Ellcey 4 siblings, 0 replies; 63+ messages in thread From: Hans-Peter Nilsson @ 2009-04-27 7:22 UTC (permalink / raw) To: Michael Matz; +Cc: gcc-patches On Wed, 22 Apr 2009, Michael Matz wrote: > * builtins.c (fold_builtin_next_arg): Handle SSA names. <long expand-from-ssa CL elided> On the off-chance that matz-at-gcc doesn't reach you, a change in the range 146814:146820 (i.e. this patch) caused PR39927. No good deed goes unpunished! brgds, H-P ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-22 16:45 ` [RFA] " Michael Matz ` (3 preceding siblings ...) 2009-04-27 7:22 ` Hans-Peter Nilsson @ 2009-04-30 18:18 ` Steve Ellcey 2009-05-01 17:40 ` Michael Matz 4 siblings, 1 reply; 63+ messages in thread From: Steve Ellcey @ 2009-04-30 18:18 UTC (permalink / raw) To: Michael Matz; +Cc: gcc-patches, dave FYI: I have submitted a bug about the hppa2.0w-hp-hpux11.11 bootstrap not working after version r146817. The bug number is 39977. I get an ICE while compiling libgcc and there is a cutdown source program in the bug report. Steve Ellcey sje@cup.hp.com ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-30 18:18 ` Steve Ellcey @ 2009-05-01 17:40 ` Michael Matz 0 siblings, 0 replies; 63+ messages in thread From: Michael Matz @ 2009-05-01 17:40 UTC (permalink / raw) To: Steve Ellcey; +Cc: gcc-patches, dave Hi, On Thu, 30 Apr 2009, Steve Ellcey wrote: > FYI: I have submitted a bug about the hppa2.0w-hp-hpux11.11 bootstrap > not working after version r146817. The bug number is 39977. > > I get an ICE while compiling libgcc and there is a cutdown source > program in the bug report. I've attached a possible patch for this problem there, it would be nice if you could test it on a hppa machine (I don't have one available). Ciao, Michael. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) @ 2009-04-27 14:15 David Edelsohn 2009-04-27 14:43 ` H.J. Lu 2009-04-27 15:08 ` Michael Matz 0 siblings, 2 replies; 63+ messages in thread From: David Edelsohn @ 2009-04-27 14:15 UTC (permalink / raw) To: Michael Matz; +Cc: gcc-patches Michael, A patch in the last 24 hours has caused a mis-optimization on PowerPC. The error appears as a failure while compiling libobjc/linking.m. A function in cgraphunit.c is being mis-compiled, probably build_cdtor(). This causes cgraph_build_static_cdtor() to be called with an invalid priority of "-1", instead of 65535. The priority should not be negative. This negative priority value generates an invalid global file function name: _GLOBAL__I_-00001_0___objc_linking A function name should not contain a minus sign, which is a syntax error. Recompiling cgraphunit.c without optimization fixes the miscompilation. Darwin displays a similar error earlier in bootstrap. Jakub tested his patch on PowerPC and Geoff's tester failed before Honza's patch, so the "expand from SSA" patch appears to be the likely culprit. Both Darwin and AIX are bootstrapping in 32 bit mode. Richi suggested on IRC that this may be related to some known CONST_INT issue for which you already have a patch. Thanks, David ^ permalink raw reply [flat|nested] 63+ messages in thread
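The symptom reported above can be reproduced in isolation with a small C sketch. It rests on two assumptions that the thread does not confirm: that the bad priority is 65535 reinterpreted as a signed 16-bit value, and that the name is assembled with a printf-style format along the lines of "%c_%.5d_%d". The helpers sign_extend16, format_cdtor_name and name_has_minus are hypothetical names for illustration only and are not functions from cgraphunit.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Reinterpret the low 16 bits of V as a signed 16-bit quantity,
   using only well-defined arithmetic.  */
static int
sign_extend16 (unsigned int v)
{
  v &= 0xFFFFu;
  return (v & 0x8000u) ? (int) v - 0x10000 : (int) v;
}

/* Hypothetical helper: build a constructor name from a priority.
   The format string is an assumption made for this sketch, not a
   copy of GCC's code.  Returns nonzero if the name fit into BUF.  */
static int
format_cdtor_name (char *buf, size_t len, char which, int priority,
                   int counter, const char *suffix)
{
  int n = snprintf (buf, len, "_GLOBAL__%c_%.5d_%d_%s",
                    which, priority, counter, suffix);
  assert (n >= 0);
  return (size_t) n < len;
}

/* Report whether NAME contains a minus sign, the character the
   report above identifies as invalid in the generated name.  */
static int
name_has_minus (const char *name)
{
  return strchr (name, '-') != NULL;
}
```

Calling format_cdtor_name with priority 65535 and the suffix "__objc_linking" gives a name without a minus sign, while passing sign_extend16 (65535) gives the "-00001" form quoted in the report, because the "%.5d" precision pads the digits and leaves the sign in front.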
* Re: [RFA] expand from SSA form (1/2) 2009-04-27 14:15 David Edelsohn @ 2009-04-27 14:43 ` H.J. Lu 2009-04-27 15:08 ` Michael Matz 1 sibling, 0 replies; 63+ messages in thread From: H.J. Lu @ 2009-04-27 14:43 UTC (permalink / raw) To: David Edelsohn; +Cc: Michael Matz, gcc-patches On Mon, Apr 27, 2009 at 7:11 AM, David Edelsohn <dje.gcc@gmail.com> wrote: > Michael, > > A patch in the last 24 hours has caused a mis-optimization on PowerPC. > The error appears as a failure while compiling libobjc/linking.m. > > A function in cgraphunit.c is being mis-compiled, probably > build_cdtor(). This causes cgraph_build_static_cdtor() to be called > with an invalid priority of "-1", instead of 65535. The priority > should not be negative. This negative priority value generates an > invalid global file function name: > > _GLOBAL__I_-00001_0___objc_linking > > A function name should not contain a minus sign, which is a syntax > error. Recompiling cgraphunit.c without optimization fixes the > miscompilation. > > Darwin displays a similar error earlier in bootstrap. > > Jakub tested his patch on PowerPC and Geoff's tester failed before > Honza's patch, so the "expand from SSA" patch appears to be the likely > culprit. > > Both Darwin and AIX are bootstrapping in 32 bit mode. Richi suggested > on IRC that this may be related to some known CONST_INT issue for > which you already have a patch. > It could be: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39922 -- H.J. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-27 14:15 David Edelsohn 2009-04-27 14:43 ` H.J. Lu @ 2009-04-27 15:08 ` Michael Matz 2009-04-27 15:11 ` David Edelsohn 1 sibling, 1 reply; 63+ messages in thread From: Michael Matz @ 2009-04-27 15:08 UTC (permalink / raw) To: David Edelsohn; +Cc: gcc-patches

Hi,

On Mon, 27 Apr 2009, David Edelsohn wrote:

> A function in cgraphunit.c is being mis-compiled, probably
> build_cdtor(). This causes cgraph_build_static_cdtor() to be called
> with an invalid priority of "-1", instead of 65535. The priority should
> not be negative. This negative priority value generates an invalid
> global file function name:
> ...
> Both Darwin and AIX are bootstrapping in 32 bit mode. Richi suggested
> on IRC that this may be related to some known CONST_INT issue for
> which you already have a patch.

This confusion with the numbers seems indeed related to PR39922. If you
can test the below patch on powerpc that would be nice. I know that it
fixes the PR on i686-linux where something similar happens.


Ciao,
Michael.
--
	PR middle-end/39922
	* tree-outof-ssa.c (insert_value_copy_on_edge): Don't convert
	constants.

Index: tree-outof-ssa.c
===================================================================
--- tree-outof-ssa.c	(revision 146829)
+++ tree-outof-ssa.c	(working copy)
@@ -184,7 +184,7 @@ insert_value_copy_on_edge (edge e, int d
   start_sequence ();
   mode = GET_MODE (SA.partition_to_pseudo[dest]);
   x = expand_expr (src, SA.partition_to_pseudo[dest], mode, EXPAND_NORMAL);
-  if (GET_MODE (x) != mode)
+  if (GET_MODE (x) != VOIDmode && GET_MODE (x) != mode)
     x = convert_to_mode (mode, x, TYPE_UNSIGNED (TREE_TYPE (src)));
   if (x != SA.partition_to_pseudo[dest])
     emit_move_insn (SA.partition_to_pseudo[dest], x);

^ permalink raw reply [flat|nested] 63+ messages in thread
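The one-line change in the patch above can be illustrated with a toy model. It relies on the fact that integer constants in GCC's RTL carry no machine mode and report VOIDmode, so the old test "mode of x differs from the destination mode" is true for every constant. The enum, the struct and both predicates below are hypothetical stand-ins written for this sketch; they are not GCC's types or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for a few machine modes; not GCC's real enum.  */
enum toy_mode { TOY_VOIDmode, TOY_HImode, TOY_SImode };

/* Toy stand-in for an RTX: only the mode matters for the check.  */
struct toy_rtx
{
  enum toy_mode mode;
  long value;
};

/* Condition before the patch: any mode mismatch requests a
   conversion, which includes every modeless constant.  */
static bool
needs_conversion_old (const struct toy_rtx *x, enum toy_mode dest_mode)
{
  assert (x != NULL);
  return x->mode != dest_mode;
}

/* Condition after the patch: modeless (VOIDmode) values are left
   alone, only real mode mismatches are converted.  */
static bool
needs_conversion_new (const struct toy_rtx *x, enum toy_mode dest_mode)
{
  assert (x != NULL);
  return x->mode != TOY_VOIDmode && x->mode != dest_mode;
}
```

For a constant such as { TOY_VOIDmode, 65535 } and a destination in TOY_SImode, the old predicate asks for a conversion and the new one does not, while a TOY_HImode register copied into a TOY_SImode destination is converted under both.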
* Re: [RFA] expand from SSA form (1/2) 2009-04-27 15:08 ` Michael Matz @ 2009-04-27 15:11 ` David Edelsohn 2009-04-27 15:51 ` Michael Matz 0 siblings, 1 reply; 63+ messages in thread From: David Edelsohn @ 2009-04-27 15:11 UTC (permalink / raw) To: Michael Matz; +Cc: gcc-patches On Mon, Apr 27, 2009 at 10:43 AM, Michael Matz <matz@suse.de> wrote: > Hi, > > On Mon, 27 Apr 2009, David Edelsohn wrote: > >> A function in cgraphunit.c is being mis-compiled, probably >> build_cdtor(). This causes cgraph_build_static_cdtor() to be called >> with an invalid priority of "-1", instead of 65535. The priority should >> not be negative. This negative priority value generates an invalid >> global file function name: >> ... >> Both Darwin and AIX are bootstrapping in 32 bit mode. Richi suggested >> on IRC that this may be related to some known CONST_INT issue for >> which you already have a patch. > > This confusion with the numbers seems indeed related to PR39922. If you > can test the below patch on powerpc that would be nice. I know that it > fixes the PR on i686-linux where something similar happens. A quick test of the patch does not fix the failure on AIX. I will try a complete bootstrap in case the bug was causing other miscompilations. David ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-27 15:11 ` David Edelsohn @ 2009-04-27 15:51 ` Michael Matz 2009-04-27 17:03 ` David Edelsohn 2009-04-27 19:15 ` David Edelsohn 0 siblings, 2 replies; 63+ messages in thread From: Michael Matz @ 2009-04-27 15:51 UTC (permalink / raw) To: David Edelsohn; +Cc: gcc-patches Hi, On Mon, 27 Apr 2009, David Edelsohn wrote: > > This confusion with the numbers seems indeed related to PR39922. If > > you can test the below patch on powerpc that would be nice. I know > > that it fixes the PR on i686-linux where something similar happens. > > A quick test of the patch does not fix the failure on AIX. I will try a > complete bootstrap in case the bug was causing other miscompilations. Hmpf. There's another problem (promoted parameters), which might also cause this. The patch I just sent to Andreas Krebbel should fix that one. If it doesn't help AIX/darwin I fear I need a testcase. There probably are some failures in execute.exp which also would be miscompiled, that might be easier to extract than the miscompilation itself. Ciao, Michael. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-27 15:51 ` Michael Matz @ 2009-04-27 17:03 ` David Edelsohn 2009-04-27 17:27 ` David Edelsohn 2009-04-27 19:15 ` David Edelsohn 1 sibling, 1 reply; 63+ messages in thread From: David Edelsohn @ 2009-04-27 17:03 UTC (permalink / raw) To: Michael Matz; +Cc: gcc-patches On Mon, Apr 27, 2009 at 11:51 AM, Michael Matz <matz@suse.de> wrote: > Hi, > > On Mon, 27 Apr 2009, David Edelsohn wrote: > >> > This confusion with the numbers seems indeed related to PR39922. If >> > you can test the below patch on powerpc that would be nice. I know >> > that it fixes the PR on i686-linux where something similar happens. >> >> A quick test of the patch does not fix the failure on AIX. I will try a >> complete bootstrap in case the bug was causing other miscompilations. > > Hmpf. There's another problem (promoted parameters), which might also > cause this. The patch I just sent to Andreas Krebbel should fix that one. > If it doesn't help AIX/darwin I fear I need a testcase. There probably > are some failures in execute.exp which also would be miscompiled, that > might be easier to extract than the miscompilation itself. Bootstrapping from scratch with the patch you sent me still shows the problem. I will try with the patch you sent to Andreas. As I mentioned in the PR, for AIX one can reproduce the problem by compiling libobjc/linking.m. That is a fairly small file and produces invalid assembly due to ctor priority of -1. David ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-27 17:03 ` David Edelsohn @ 2009-04-27 17:27 ` David Edelsohn 0 siblings, 0 replies; 63+ messages in thread From: David Edelsohn @ 2009-04-27 17:27 UTC (permalink / raw) To: Michael Matz; +Cc: gcc-patches On Mon, Apr 27, 2009 at 12:57 PM, David Edelsohn <dje.gcc@gmail.com> wrote: > On Mon, Apr 27, 2009 at 11:51 AM, Michael Matz <matz@suse.de> wrote: >> Hi, >> >> On Mon, 27 Apr 2009, David Edelsohn wrote: >> >>> > This confusion with the numbers seems indeed related to PR39922. If >>> > you can test the below patch on powerpc that would be nice. I know >>> > that it fixes the PR on i686-linux where something similar happens. >>> >>> A quick test of the patch does not fix the failure on AIX. I will try a >>> complete bootstrap in case the bug was causing other miscompilations. >> >> Hmpf. There's another problem (promoted parameters), which might also >> cause this. The patch I just sent to Andreas Krebbel should fix that one. >> If it doesn't help AIX/darwin I fear I need a testcase. There probably >> are some failures in execute.exp which also would be miscompiled, that >> might be easier to extract than the miscompilation itself. > > Bootstrapping from scratch with the patch you sent me still shows the > problem. I will try with the patch you sent to Andreas. > > As I mentioned in the PR, for AIX one can reproduce the problem by > compiling libobjc/linking.m. That is a fairly small file and produces > invalid assembly due to ctor priority of -1. Unfortunately, the second patch does not fix the PowerPC failure either. I am running the C testsuite to see what new failures occur. David ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-27 15:51 ` Michael Matz 2009-04-27 17:03 ` David Edelsohn @ 2009-04-27 19:15 ` David Edelsohn 2009-04-28 0:48 ` Michael Matz 1 sibling, 1 reply; 63+ messages in thread From: David Edelsohn @ 2009-04-27 19:15 UTC (permalink / raw) To: Michael Matz; +Cc: gcc-patches gcc.c-torture/execute/20030916-1.c is a new testsuite failure. David ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-27 19:15 ` David Edelsohn @ 2009-04-28 0:48 ` Michael Matz 2009-04-28 0:54 ` Luis Machado 2009-04-28 16:05 ` David Edelsohn 0 siblings, 2 replies; 63+ messages in thread From: Michael Matz @ 2009-04-28 0:48 UTC (permalink / raw) To: David Edelsohn; +Cc: gcc-patches Hi David, On Mon, 27 Apr 2009, David Edelsohn wrote: > gcc.c-torture/execute/20030916-1.c is a new testsuite failure. Excellent! (or not, depending on the p.o.v. :-) ). I'll try to analyze this. I feared already that this wouldn't help powerpc, but this most probably is some confusion in the RTL move instructions generated on the edges. No doubt I overlooked some target peculiarity in generating them :-( Meanwhile Andrew has analyzed this somewhat (PR39929). If that turns out to be correct we would need to amend the target to check for currently_expanding_to_rtl, not just for current_ir_type(). That wouldn't influence the AIX breakage though. Ciao, Michael. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-28 0:48 ` Michael Matz @ 2009-04-28 0:54 ` Luis Machado 2009-04-28 1:22 ` Michael Matz 2009-04-28 16:05 ` David Edelsohn 1 sibling, 1 reply; 63+ messages in thread From: Luis Machado @ 2009-04-28 0:54 UTC (permalink / raw) To: Michael Matz; +Cc: David Edelsohn, gcc-patches Hi, On Tue, 2009-04-28 at 02:23 +0200, Michael Matz wrote: > Hi David, > > On Mon, 27 Apr 2009, David Edelsohn wrote: > > > gcc.c-torture/execute/20030916-1.c is a new testsuite failure. > > Excellent! (or not, depending on the p.o.v. :-) ). I'll try to analyze > this. I feared already that this wouldn't help powerpc, but this most > probably is some confusion in the RTL move instructions generated on the > edges. No doubt I overlooked some target peculiarity in generating them > :-( Speaking about powerpc, i've tracked down a 19% degradation on cpu2000's 32-bit sixtrack and found that revision 146817 caused/revealed it. I'll have more details on it soon. Regards, Luis ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-28 0:54 ` Luis Machado @ 2009-04-28 1:22 ` Michael Matz 2009-04-28 13:24 ` Luis Machado 2009-04-30 17:55 ` Luis Machado 0 siblings, 2 replies; 63+ messages in thread From: Michael Matz @ 2009-04-28 1:22 UTC (permalink / raw) To: Luis Machado; +Cc: David Edelsohn, gcc-patches Hi, On Mon, 27 Apr 2009, Luis Machado wrote: > Speaking about powerpc, i've tracked down a 19% degradation on cpu2000's > 32-bit sixtrack and found that revision 146817 caused/revealed it. > > I'll have more details on it soon. It seems also x86_64 is affected, so anything you find is very welcome. If I may speculate it could be related to the half TER we're now doing. As in, we're not feeding large trees to expand anymore, so there're no opportunities to cleverly expand them to short insn sequences. For cross-checking try to build with -fno-tree-ter (before the patch) and see if it's resulting in the same slowdown. Ciao, Michael. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-28 1:22 ` Michael Matz @ 2009-04-28 13:24 ` Luis Machado 2009-04-30 17:55 ` Luis Machado 1 sibling, 0 replies; 63+ messages in thread From: Luis Machado @ 2009-04-28 13:24 UTC (permalink / raw) To: Michael Matz; +Cc: David Edelsohn, gcc-patches Hi, > > Speaking about powerpc, i've tracked down a 19% degradation on cpu2000's > > 32-bit sixtrack and found that revision 146817 caused/revealed it. > > > > I'll have more details on it soon. > > It seems also x86_64 is affected, so anything you find is very welcome. > > If I may speculate it could be related to the half TER we're now doing. > As in, we're not feeding large trees to expand anymore, so there're no > opportunities to cleverly expand them to short insn sequences. For > cross-checking try to build with -fno-tree-ter (before the patch) and see > if it's resulting in the same slowdown. It results in some slowdown (~7%), but it's not as big as the one i've seen with the SSA patch. Regards, Luis ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-28 1:22 ` Michael Matz 2009-04-28 13:24 ` Luis Machado @ 2009-04-30 17:55 ` Luis Machado 2009-05-01 19:33 ` Richard Guenther 1 sibling, 1 reply; 63+ messages in thread From: Luis Machado @ 2009-04-30 17:55 UTC (permalink / raw) To: Michael Matz; +Cc: David Edelsohn, gcc-patches Hi, On Tue, 2009-04-28 at 02:48 +0200, Michael Matz wrote: > Hi, > > On Mon, 27 Apr 2009, Luis Machado wrote: > > > Speaking about powerpc, i've tracked down a 19% degradation on cpu2000's > > 32-bit sixtrack and found that revision 146817 caused/revealed it. > > > > I'll have more details on it soon. > > It seems also x86_64 is affected, so anything you find is very welcome. > > If I may speculate it could be related to the half TER we're now doing. > As in, we're not feeding large trees to expand anymore, so there're no > opportunities to cleverly expand them to short insn sequences. For > cross-checking try to build with -fno-tree-ter (before the patch) and see > if it's resulting in the same slowdown. I've tracked down the cause of the degradation on sixtrack. We have a hot spot in sixtrack: a loop in a function called thin6d. The old (pre-146817) gcc generates that loop as a single BB, so the only way into it is to fall through from the code before it. The post-146817 gcc splits the loop into two BBs, so that we can actually branch into the middle of the loop on the first iteration; after that the loop runs just as in pre-146817. The degradation comes from the fact that splitting the loop into two BBs prevents good scheduling of the instructions inside it, like this: Good code: All the fp load instructions are grouped in the upper portion of the code.
    fmul    f22,f11,f13
    fmul    f23,f11,f0
    addis   r12,r6,-27
    lfd     f3,0(r6)
    addi    r4,r6,8
    lfd     f1,9472(r12)
    addis   r12,r4,-27
    fmadd   f8,f12,f0,f22
    fmsub   f4,f12,f13,f23
    lfd     f22,9472(r12)
    lfd     f23,8(r6)
    addi    r6,r4,8
    fmul    f11,f8,f13
    fmul    f24,f8,f1
    fmul    f25,f8,f3
    fmul    f5,f8,f0
    fmadd   f11,f4,f0,f11
    fmadd   f21,f4,f3,f24
    fmsub   f2,f4,f1,f25
    fmsub   f12,f4,f13,f5
    fmul    f1,f11,f23
    fmul    f8,f11,f22
    fadd    f9,f9,f21
    fadd    f10,f10,f2
    fmsub   f24,f12,f22,f1
    fmadd   f25,f12,f23,f8
    fadd    f10,f10,f24
    fadd    f9,f9,f25
    bdnz    100ca878 <thin6d_+0x1018>

Bad code: The second pair of loads is pushed down into the second BB, causing slowdowns.

    fmul    f5,f8,f0
    addis   r3,r4,-27
    lfd     f22,8(r7)
    addi    r7,r4,8
    lfd     f6,9472(r3)
    fmadd   f10,f9,f0,f10
    fmsub   f23,f9,f13,f5
    fmul    f2,f10,f22
    fmul    f9,f10,f6
    fmr     f7,f23
    fmsub   f25,f23,f6,f2
    fmadd   f26,f23,f22,f9
    fadd    f12,f12,f25
    fadd    f11,f11,f26
    fmul    f8,f10,f13
>> BB mark
    fmul    f22,f10,f0
    addis   r3,r7,-27
    lfd     f21,0(r7)
    addi    r4,r7,8
    lfd     f25,9472(r3)
    fmadd   f8,f7,f0,f8
    fmsub   f9,f7,f13,f22
    fmul    f23,f8,f21
    fmul    f26,f8,f25
    fmsub   f24,f9,f25,f23
    fmadd   f7,f9,f21,f26
    fadd    f12,f12,f24
    fadd    f11,f11,f7
    bdnz    100c9fe0 <thin6d_+0xfd0>

I've opened bugzilla http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976 for this. Best regards, Luis ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-30 17:55 ` Luis Machado @ 2009-05-01 19:33 ` Richard Guenther 2009-05-04 13:38 ` Luis Machado 0 siblings, 1 reply; 63+ messages in thread From: Richard Guenther @ 2009-05-01 19:33 UTC (permalink / raw) To: luisgpm; +Cc: Michael Matz, David Edelsohn, gcc-patches On Thu, Apr 30, 2009 at 6:26 PM, Luis Machado <luisgpm@linux.vnet.ibm.com> wrote: > Hi, > > On Tue, 2009-04-28 at 02:48 +0200, Michael Matz wrote: >> Hi, >> >> On Mon, 27 Apr 2009, Luis Machado wrote: >> >> > Speaking about powerpc, i've tracked down a 19% degradation on cpu2000's >> > 32-bit sixtrack and found that revision 146817 caused/revealed it. >> > >> > I'll have more details on it soon. >> >> It seems also x86_64 is affected, so anything you find is very welcome. >> >> If I may speculate it could be related to the half TER we're now doing. >> As in, we're not feeding large trees to expand anymore, so there're no >> opportunities to cleverly expand them to short insn sequences. For >> cross-checking try to build with -fno-tree-ter (before the patch) and see >> if it's resulting in the same slowdown. > > I've tracked down the cause of the degradation on sixtrack. > > We have a hot spot on sixtrack in a function called thin6d. > > Such loop is generated by the old (pre-146817) gcc as a single BB, thus > the only way inside that loop is by executing instructions until we fall > into that code. > > The post-146817 gcc breaks that loop in two BB's, such that we can > actually branch to the middle of that loop in the first iteration, and > then the loop runs just like in pre-146817. > > The degradation comes from the fact that the creation of two BB's for > that single loop breaks good scheduling of instructions inside it, like > this: > > Good code: All the fp load instructions are grouped in the upper portion > of the code. 
> > fmul f22,f11,f13 > fmul f23,f11,f0 > addis r12,r6,-27 > lfd f3,0(r6) > addi r4,r6,8 > lfd f1,9472(r12) > addis r12,r4,-27 > fmadd f8,f12,f0,f22 > fmsub f4,f12,f13,f23 > lfd f22,9472(r12) > lfd f23,8(r6) > addi r6,r4,8 > fmul f11,f8,f13 > fmul f24,f8,f1 > fmul f25,f8,f3 > fmul f5,f8,f0 > fmadd f11,f4,f0,f11 > fmadd f21,f4,f3,f24 > fmsub f2,f4,f1,f25 > fmsub f12,f4,f13,f5 > fmul f1,f11,f23 > fmul f8,f11,f22 > fadd f9,f9,f21 > fadd f10,f10,f2 > fmsub f24,f12,f22,f1 > fmadd f25,f12,f23,f8 > fadd f10,f10,f24 > fadd f9,f9,f25 > bdnz 100ca878 <thin6d_+0x1018> > > Bad code: The second pair of loads are pushed down the second BB, > causing slowdowns. > > fmul f5,f8,f0 > addis r3,r4,-27 > lfd f22,8(r7) > addi r7,r4,8 > lfd f6,9472(r3) > fmadd f10,f9,f0,f10 > fmsub f23,f9,f13,f5 > fmul f2,f10,f22 > fmul f9,f10,f6 > fmr f7,f23 > fmsub f25,f23,f6,f2 > fmadd f26,f23,f22,f9 > fadd f12,f12,f25 > fadd f11,f11,f26 > fmul f8,f10,f13 >>> BB mark > fmul f22,f10,f0 > addis r3,r7,-27 > lfd f21,0(r7) > addi r4,r7,8 > lfd f25,9472(r3) > fmadd f8,f7,f0,f8 > fmsub f9,f7,f13,f22 > fmul f23,f8,f21 > fmul f26,f8,f25 > fmsub f24,f9,f25,f23 > fmadd f7,f9,f21,f26 > fadd f12,f12,f24 > fadd f11,f11,f7 > bdnz 100c9fe0 <thin6d_+0xfd0> > > I've opened bugzilla http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976 > for this. Does enabling the selective scheduler work around this problem? Richard. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-05-01 19:33 ` Richard Guenther @ 2009-05-04 13:38 ` Luis Machado 0 siblings, 0 replies; 63+ messages in thread From: Luis Machado @ 2009-05-04 13:38 UTC (permalink / raw) To: Richard Guenther; +Cc: Michael Matz, David Edelsohn, gcc-patches Hi, On Fri, 2009-05-01 at 21:33 +0200, Richard Guenther wrote: > Does enabling the selective scheduler work around this problem? > > Richard. Not really. The numbers are the same with -fselective-scheduling. Luis ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-28 0:48 ` Michael Matz 2009-04-28 0:54 ` Luis Machado @ 2009-04-28 16:05 ` David Edelsohn 2009-04-28 16:19 ` Michael Matz 2009-04-28 23:49 ` Michael Matz 1 sibling, 2 replies; 63+ messages in thread From: David Edelsohn @ 2009-04-28 16:05 UTC (permalink / raw) To: Michael Matz; +Cc: gcc-patches On Mon, Apr 27, 2009 at 8:23 PM, Michael Matz <matz@suse.de> wrote: > Hi David, > > On Mon, 27 Apr 2009, David Edelsohn wrote: > >> gcc.c-torture/execute/20030916-1.c is a new testsuite failure. > > Excellent! (or not, depending on the p.o.v. :-) ). I'll try to analyze > this. I feared already that this wouldn't help powerpc, but this most > probably is some confusion in the RTL move instructions generated on the > edges. No doubt I overlooked some target peculiarity in generating them > :-( > > Meanwhile Andrew has analyzed this somewhat (PR39929). If that turns out > to be correct we would need to amend the target to check for > currently_expanding_to_rtl, not just for current_ir_type(). That wouldn't > influence the AIX breakage though. I tried Andreas Krebbel's patch, but that, unfortunately, did not fix the AIX libobjc build. I am able to bootstrap C, C++, and Fortran, and run the testsuite. Objective C is not required on AIX, but the build problem is exposing a miscompilation of GCC on PowerPC after the expand SSA merge. David ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-28 16:05 ` David Edelsohn @ 2009-04-28 16:19 ` Michael Matz 2009-04-28 23:49 ` Michael Matz 1 sibling, 0 replies; 63+ messages in thread From: Michael Matz @ 2009-04-28 16:19 UTC (permalink / raw) To: David Edelsohn; +Cc: gcc-patches [-- Attachment #1: Type: TEXT/PLAIN, Size: 867 bytes --] Hi, On Tue, 28 Apr 2009, David Edelsohn wrote: > > Meanwhile Andrew has analyzed this somewhat (PR39929). If that turns > > out to be correct we would need to amend the target to check for > > currently_expanding_to_rtl, not just for current_ir_type(). That > > wouldn't influence the AIX breakage though. > > I tried Andreas Krebbel's patch, but that, unfortunately, did not fix > the AIX libobjc build. > > I am able to bootstrap C, C++, and Fortran, and run the testsuite. > Objective C is not required on AIX, but the build problem is exposing a > miscompilation of GCC on PowerPC after the expand SSA merge. Yeah, I also seem to see a miscompilation on powerpc64-linux somewhere in df-core.c (two times the same bitmap free'd which breaks the df_bitmap_obstack leading to breakage downstream). Will analyze this some more. Ciao, Michael. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-28 16:05 ` David Edelsohn 2009-04-28 16:19 ` Michael Matz @ 2009-04-28 23:49 ` Michael Matz 2009-04-29 5:50 ` David Edelsohn 1 sibling, 1 reply; 63+ messages in thread From: Michael Matz @ 2009-04-28 23:49 UTC (permalink / raw) To: David Edelsohn; +Cc: gcc-patches Hi, On Tue, 28 Apr 2009, David Edelsohn wrote: > I tried Andreas Krebbel's patch, but that, unfortunately, did not fix > the AIX libobjc build. > > I am able to bootstrap C, C++, and Fortran, and run the testsuite. > Objective C is not required on AIX, but the build problem is exposing a > miscompilation of GCC on PowerPC after the expand SSA merge. I tried to reproduce this on powerpc64-linux (default languages, multilibs) but failed. It just bootstraps, and testresults look mostly sane, the only additional FAIL that I didn't find in older gcc-testresults posts were execute/va-arg-22.c and dfp/pr3903[45].c :-( That is with Andreas' patch (the approved one). I'm at a loss now without a testcase. You said you found the miscompilation in cgraphunit.c:build_cdtor(), so it possibly helps if you provide the preprocessed version of that and the command line for compilation (perhaps also giving a snippet of asm where the wrong instructions are). Ciao, Michael. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-28 23:49 ` Michael Matz @ 2009-04-29 5:50 ` David Edelsohn 2009-04-29 12:48 ` Michael Matz 0 siblings, 1 reply; 63+ messages in thread From: David Edelsohn @ 2009-04-29 5:50 UTC (permalink / raw) To: Michael Matz; +Cc: gcc-patches [-- Attachment #1: Type: text/plain, Size: 4191 bytes --] On Tue, Apr 28, 2009 at 7:48 PM, Michael Matz <matz@suse.de> wrote: > Hi, > > On Tue, 28 Apr 2009, David Edelsohn wrote: > >> I tried Andreas Krebbel's patch, but that, unfortunately, did not fix >> the AIX libobjc build. >> >> I am able to bootstrap C, C++, and Fortran, and run the testsuite. >> Objective C is not required on AIX, but the build problem is exposing a >> miscompilation of GCC on PowerPC after the expand SSA merge. > > I tried to reproduce this on powerpc64-linux (default languages, > multilibs) but failed. It just bootstraps, and testresults look mostly > sane, the only additional FAIL that I didn't find in older gcc-testresults > posts were execute/va-arg-22.c and dfp/pr3903[45].c :-( > > That is with Andreas' patch (the approved one). > > I'm at a loss now without a testcase. You said you found the > miscompilation in cgraphunit.c:build_cdtor(), so it possibly helps if you > provide the preprocessed version of that and the command line for > compilation (perhaps also giving a snippet of asm where the wrong > instructions are). The command to compile libobjc/linking.m is in libobjc.sh. The pre-processed linking.m is attached as linking.i. The command to compile cgraphunit.c is in cgraphunit.sh The pre-processed cgraphunit.c is attached as cgraphunit.i The pre-processed tree.c is attached as tree.i I have not had an opportunity to investigate which function is miscompiled. cgraphunit.c compiled with -O1 or -O2 produces incorrect assembly language. The inlining and additional jumps introduced by optimization make pinpointing the exact wrong code sequence difficult.
GDB shows the priority returned by decl_init_priority_lookup() as 65535 but the value in the register is -1, which is the value that cgraph_build_static_cdtor() receives: priority = -1:

(gdb) n
220 cgraph_build_static_cdtor (ctor_p ? 'I' : 'D', body, priority);
(gdb) step
200 body = NULL_TREE;
(gdb)
decl_init_priority_lookup (decl=0x300e2980) at /farm/dje/src/src/gcc/tree.c:4347
4347 gcc_assert (VAR_OR_FUNCTION_DECL_P (decl));
(gdb) finish
Run till exit from #0 decl_init_priority_lookup (decl=0x300e2980) at /farm/dje/src/src/gcc/tree.c:4347
0x10609ce8 in build_cdtor (ctor_p=255 'ÿ', cdtors=0x30097b78, len=1) at /farm/dje/src/src/gcc/cgraphunit.c:200
200 body = NULL_TREE;
Value returned is $5 = 65535
(gdb) finish
Run till exit from #0 0x10609ce8 in build_cdtor (ctor_p=255 'ÿ', cdtors=0x30097b78, len=1) at /farm/dje/src/src/gcc/cgraphunit.c:200

Breakpoint 6, cgraph_build_static_cdtor (which=73 'I', body=0x3009e820, priority=-1) at /farm/dje/src/src/gcc/cgraphunit.c:1374
1374 sprintf (which_buf, "%c_%.5d_%d", which, priority, counter++);

It may be that cgraphunit.c is not miscompiled, but massages the value in extra ways that maintains the unsigned short and that somehow is lost with optimization enabled, assuming the value returned is properly zero-extended.

decl_init_priority_lookup() appears to be setting the wrong value:

/* An initialization priority. */
typedef unsigned short priority_type;

/* The initialization priority for entities for which no explicit initialization priority has been specified. */
#define DEFAULT_INIT_PRIORITY 65535

4349 h = (struct tree_priority_map *) htab_find (init_priority_for_decl, &in);
4350 return h ? h->init : DEFAULT_INIT_PRIORITY;

0x1002af5c <decl_init_priority_lookup+48>: bl 0x100494cc <htab_find>
0x1002af60 <decl_init_priority_lookup+52>: nop
0x1002af64 <decl_init_priority_lookup+56>: li r0,-1 <------------
0x1002af68 <decl_init_priority_lookup+60>: cmpwi r3,0
0x1002af6c <decl_init_priority_lookup+64>: beq- 0x1002af74 <decl_init_priority_lookup+72>
0x1002af70 <decl_init_priority_lookup+68>: lhz r0,4(r3)
0x1002af74 <decl_init_priority_lookup+72>: addi r1,r1,80
0x1002af78 <decl_init_priority_lookup+76>: mr r3,r0 <-------------

When compiling libobjc/linking.m, the default priority is used, which should be 65535, but it is being sign-extended to -1.

David

[-- Attachment #2: libobjc.sh --] [-- Type: application/x-sh, Size: 760 bytes --]
[-- Attachment #3: linking.i.bz2 --] [-- Type: application/x-bzip2, Size: 5450 bytes --]
[-- Attachment #4: cgraphunit.sh --] [-- Type: application/x-sh, Size: 787 bytes --]
[-- Attachment #5: cgraphunit.i.bz2 --] [-- Type: application/x-bzip2, Size: 132479 bytes --]
[-- Attachment #6: tree.i.bz2 --] [-- Type: application/x-bzip2, Size: 146244 bytes --]
^ permalink raw reply [flat|nested] 63+ messages in thread
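The sign-extension symptom reported in the message above can be sketched in isolation. This is a minimal sketch that assumes nothing about GCC internals: the helper names `widen_zero_extend`, `widen_sign_extend` and `cdtor_name` are hypothetical, and only the value 65535 and the `"%c_%.5d_%d"` format string are taken from the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Value quoted in the message above. */
#define DEFAULT_INIT_PRIORITY 65535

/* Hypothetical helper: widen a 16-bit pattern by zero-extension, which is
   what an unsigned short needs (and what `lhz` does on the h != NULL path). */
static int32_t widen_zero_extend(uint16_t bits)
{
    return (int32_t) bits;
}

/* Hypothetical helper: widen the same 16 bits by sign-extension, which is
   what loading the constant with `li r0,-1` amounts to. */
static int32_t widen_sign_extend(uint16_t bits)
{
    return bits >= 0x8000 ? (int32_t) bits - 0x10000 : (int32_t) bits;
}

/* Hypothetical helper: format a name with the format string seen at
   cgraphunit.c:1374.  Returns 1 if the result contains a '-', else 0. */
static int cdtor_name(char *buf, size_t len, char which, int priority,
                      int counter)
{
    int n = snprintf(buf, len, "%c_%.5d_%d", which, priority, counter);
    assert(n > 0 && (size_t) n < len);
    return strchr(buf, '-') != NULL;
}
```

With the zero-extended value the formatted name contains only digits after the prefix; with the sign-extended value it contains a minus sign, which is presumably what makes the emitted assembly invalid.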
* Re: [RFA] expand from SSA form (1/2) 2009-04-29 5:50 ` David Edelsohn @ 2009-04-29 12:48 ` Michael Matz 2009-04-29 13:21 ` David Edelsohn ` (2 more replies) 0 siblings, 3 replies; 63+ messages in thread From: Michael Matz @ 2009-04-29 12:48 UTC (permalink / raw) To: David Edelsohn; +Cc: gcc-patches Hi David, On Tue, 28 Apr 2009, David Edelsohn wrote: > decl_init_priority_lookup() appears to be setting the wrong value: > > 0x1002af64 <decl_init_priority_lookup+56>: li r0,-1 <------------ Thanks! That was very helpful. It's another case of RTL constants handled incorrectly because they are modeless :-( I've modified Andreas' patch somewhat to take care of this, see below. The important change from his patch is the hunk in insert_value_copy_on_edge(). I'm regstrapping this currently on x86_64-linux and powerpc64-linux (plus the fix for PR39955). Ciao, Michael. -- Index: tree-outof-ssa.c =================================================================== --- tree-outof-ssa.c (revision 146952) +++ tree-outof-ssa.c (working copy) @@ -128,6 +128,25 @@ set_location_for_edge (edge e) } } +/* Emit insns to copy SRC into DEST converting SRC if necessary. */ + +static inline rtx +emit_partition_copy (rtx dest, rtx src, int unsignedsrcp) +{ + rtx seq; + + start_sequence (); + + if (GET_MODE (src) != VOIDmode && GET_MODE (src) != GET_MODE (dest)) + src = convert_to_mode (GET_MODE (dest), src, unsignedsrcp); + emit_move_insn (dest, src); + + seq = get_insns (); + end_sequence (); + + return seq; +} + /* Insert a copy instruction from partition SRC to DEST onto edge E. */ static void @@ -149,12 +168,10 @@ insert_partition_copy_on_edge (edge e, i set_location_for_edge (e); - /* Partition copy between same base variables only, so it's the same mode, - hence we can use emit_move_insn.
*/ - start_sequence (); - emit_move_insn (SA.partition_to_pseudo[dest], SA.partition_to_pseudo[src]); - seq = get_insns (); - end_sequence (); + seq = emit_partition_copy (SA.partition_to_pseudo[dest], + SA.partition_to_pseudo[src], + TYPE_UNSIGNED (TREE_TYPE ( + partition_to_var (SA.map, src)))); insert_insn_on_edge (seq, e); } @@ -166,7 +183,6 @@ static void insert_value_copy_on_edge (edge e, int dest, tree src) { rtx seq, x; - enum machine_mode mode; if (dump_file && (dump_flags & TDF_DETAILS)) { fprintf (dump_file, @@ -186,6 +202,10 @@ insert_value_copy_on_edge (edge e, int d x = expand_expr (src, SA.partition_to_pseudo[dest], mode, EXPAND_NORMAL); if (GET_MODE (x) != VOIDmode && GET_MODE (x) != mode) x = convert_to_mode (mode, x, TYPE_UNSIGNED (TREE_TYPE (src))); + if (CONSTANT_P (x) && GET_MODE (x) == VOIDmode + && mode != TYPE_MODE (TREE_TYPE (src))) + x = convert_modes (mode, TYPE_MODE (TREE_TYPE (src)), + x, TYPE_UNSIGNED (TREE_TYPE (src))); if (x != SA.partition_to_pseudo[dest]) emit_move_insn (SA.partition_to_pseudo[dest], x); seq = get_insns (); @@ -198,7 +218,7 @@ insert_value_copy_on_edge (edge e, int d onto edge E. 
*/ static void -insert_rtx_to_part_on_edge (edge e, int dest, rtx src) +insert_rtx_to_part_on_edge (edge e, int dest, rtx src, int unsignedsrcp) { rtx seq; if (dump_file && (dump_flags & TDF_DETAILS)) @@ -214,11 +234,9 @@ insert_rtx_to_part_on_edge (edge e, int gcc_assert (SA.partition_to_pseudo[dest]); set_location_for_edge (e); - start_sequence (); - gcc_assert (GET_MODE (src) == GET_MODE (SA.partition_to_pseudo[dest])); - emit_move_insn (SA.partition_to_pseudo[dest], src); - seq = get_insns (); - end_sequence (); + seq = emit_partition_copy (SA.partition_to_pseudo[dest], + src, + unsignedsrcp); insert_insn_on_edge (seq, e); } @@ -243,11 +261,10 @@ insert_part_to_rtx_on_edge (edge e, rtx gcc_assert (SA.partition_to_pseudo[src]); set_location_for_edge (e); - start_sequence (); - gcc_assert (GET_MODE (dest) == GET_MODE (SA.partition_to_pseudo[src])); - emit_move_insn (dest, SA.partition_to_pseudo[src]); - seq = get_insns (); - end_sequence (); + seq = emit_partition_copy (dest, + SA.partition_to_pseudo[src], + TYPE_UNSIGNED (TREE_TYPE ( + partition_to_var (SA.map, src)))); insert_insn_on_edge (seq, e); } @@ -522,14 +539,17 @@ elim_create (elim_graph g, int T) if (elim_unvisited_predecessor (g, T)) { - rtx U = get_temp_reg (partition_to_var (g->map, T)); + tree var = partition_to_var (g->map, T); + rtx U = get_temp_reg (var); + int unsignedsrcp = TYPE_UNSIGNED (TREE_TYPE (var)); + insert_part_to_rtx_on_edge (g->e, U, T); FOR_EACH_ELIM_GRAPH_PRED (g, T, P, { if (!TEST_BIT (g->visited, P)) { elim_backward (g, P); - insert_rtx_to_part_on_edge (g->e, P, U); + insert_rtx_to_part_on_edge (g->e, P, U, unsignedsrcp); } }); } ^ permalink raw reply [flat|nested] 63+ messages in thread
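The reason a modeless constant needs special handling, as the message above describes, can be illustrated with a toy model. This is only a sketch under stated assumptions: `toy_convert_const` is a hypothetical stand-in, not GCC's `convert_modes`, and it models a constant as a bare integer whose width and signedness must be supplied by the caller, much as the patch passes `TYPE_MODE (TREE_TYPE (src))` and `TYPE_UNSIGNED (TREE_TYPE (src))`.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model (hypothetical, not GCC code): VALUE is a constant that carries
   no width of its own.  Interpret its low FROM_BITS bits as a value of the
   source type, signed or unsigned according to UNSIGNEDP, and return that
   value as it should appear in a wider TO_BITS-bit destination. */
static int64_t toy_convert_const(int64_t value, unsigned from_bits,
                                 unsigned to_bits, int unsignedp)
{
    assert(from_bits >= 1 && from_bits < to_bits && to_bits <= 63);

    uint64_t from_mask = (UINT64_C(1) << from_bits) - 1;
    uint64_t low = (uint64_t) value & from_mask;   /* truncate to source width */

    /* Signed source with the top bit set: the value is negative. */
    if (!unsignedp && ((low >> (from_bits - 1)) & 1))
        return (int64_t) low - (int64_t) (UINT64_C(1) << from_bits);

    /* Unsigned source, or non-negative signed source: zero-extend. */
    return (int64_t) low;
}
```

In this model the bare integer -1 becomes 65535 when the caller states that the source is an unsigned 16-bit type, and stays -1 when the source is signed; without that extra information the two cases cannot be told apart.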
* Re: [RFA] expand from SSA form (1/2) 2009-04-29 12:48 ` Michael Matz @ 2009-04-29 13:21 ` David Edelsohn 2009-04-29 13:35 ` Michael Matz 2009-04-29 14:38 ` David Edelsohn 2009-04-29 15:03 ` Andreas Krebbel 2 siblings, 1 reply; 63+ messages in thread From: David Edelsohn @ 2009-04-29 13:21 UTC (permalink / raw) To: Michael Matz; +Cc: gcc-patches On Wed, Apr 29, 2009 at 8:39 AM, Michael Matz <matz@suse.de> wrote: > @@ -166,7 +183,6 @@ static void > insert_value_copy_on_edge (edge e, int dest, tree src) > { > rtx seq, x; > - enum machine_mode mode; Did you intend to remove the declaration of mode? David ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-29 13:21 ` David Edelsohn @ 2009-04-29 13:35 ` Michael Matz 0 siblings, 0 replies; 63+ messages in thread From: Michael Matz @ 2009-04-29 13:35 UTC (permalink / raw) To: David Edelsohn; +Cc: gcc-patches [-- Attachment #1: Type: TEXT/PLAIN, Size: 406 bytes --] Hi, On Wed, 29 Apr 2009, David Edelsohn wrote: > > @@ -166,7 +183,6 @@ static void > > insert_value_copy_on_edge (edge e, int dest, tree src) > > { > > rtx seq, x; > > - enum machine_mode mode; > > Did you intend to remove the declaration of mode? Nope, oversight when merging the patches for sending, my testing trees have exactly this patch minus that hunk applied. Ciao, Michael. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-29 12:48 ` Michael Matz 2009-04-29 13:21 ` David Edelsohn @ 2009-04-29 14:38 ` David Edelsohn 2009-04-29 14:50 ` Richard Guenther 2009-04-29 15:03 ` Andreas Krebbel 2 siblings, 1 reply; 63+ messages in thread From: David Edelsohn @ 2009-04-29 14:38 UTC (permalink / raw) To: Michael Matz; +Cc: gcc-patches On Wed, Apr 29, 2009 at 8:39 AM, Michael Matz <matz@suse.de> wrote: > Thanks! That was very helpful. It's another case of RTL constants > handled incorrectly because they are modeless :-( I've modified Andreas' > patch somewhat to take care of this, see below. The important change from > his patch is the hunk in insert_value_copy_on_edge(). > > I'm regstrapping this currently on x86_64-linux and powerpc64-linux (plus > the fix for PR39955). This latest patch fixes the libobjc build problem on AIX. GCC now bootstraps successfully with all libraries. Thanks, David ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-29 14:38 ` David Edelsohn @ 2009-04-29 14:50 ` Richard Guenther 0 siblings, 0 replies; 63+ messages in thread From: Richard Guenther @ 2009-04-29 14:50 UTC (permalink / raw) To: David Edelsohn; +Cc: Michael Matz, gcc-patches On Wed, Apr 29, 2009 at 4:34 PM, David Edelsohn <dje.gcc@gmail.com> wrote: > On Wed, Apr 29, 2009 at 8:39 AM, Michael Matz <matz@suse.de> wrote: > >> Thanks! That was very helpful. It's another case of RTL constants >> handled incorrectly because they are modeless :-( I've modified Andreas' >> patch somewhat to take care of this, see below. The important change from >> his patch is the hunk in insert_value_copy_on_edge(). >> >> I'm regstrapping this currently on x86_64-linux and powerpc64-linux (plus >> the fix for PR39955). > > This latest patch fixes the libobjc build problem on AIX. GCC now > bootstraps successfully with all libraries. Thus, that patch is ok for mainline. Thanks, Richard. > Thanks, David > ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-29 12:48 ` Michael Matz 2009-04-29 13:21 ` David Edelsohn 2009-04-29 14:38 ` David Edelsohn @ 2009-04-29 15:03 ` Andreas Krebbel 2009-04-29 15:11 ` Michael Matz 2 siblings, 1 reply; 63+ messages in thread From: Andreas Krebbel @ 2009-04-29 15:03 UTC (permalink / raw) To: Michael Matz; +Cc: David Edelsohn, gcc-patches Hi Michael, > I'm regstrapping this currently on x86_64-linux and powerpc64-linux (plus > the fix for PR39955). I've bootstrapped the patch on s390x. On s390 I am seeing an RTL sharing bug which I haven't tracked down yet. Bye, -Andreas- ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-29 15:03 ` Andreas Krebbel @ 2009-04-29 15:11 ` Michael Matz 2009-04-29 15:40 ` Andreas Krebbel 0 siblings, 1 reply; 63+ messages in thread From: Michael Matz @ 2009-04-29 15:11 UTC (permalink / raw) To: Andreas Krebbel; +Cc: David Edelsohn, gcc-patches Hi, On Wed, 29 Apr 2009, Andreas Krebbel wrote: > > I'm regstrapping this currently on x86_64-linux and powerpc64-linux > > (plus the fix for PR39955). > > I've bootstrapped the patch on s390x. On s390 I am seeing an RTL > sharing bug which I haven't tracked down yet. Your original one, or the amended one (for AIX)? I'm asking because I noticed a problem when fiddling with the AIX problem. Your change to insert_value_copy_on_edge might not work in all cases as you're doing the expand_expr outside of a start_sequence. I decided to simply not use the helper function as I had to also change the code to use convert_modes. It shouldn't expose itself as an RTL sharing problem, but it definitely might have undesired effects. Unfortunately our s390 is off today, so I can't help without a testcase. Ciao, Michael. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-29 15:11 ` Michael Matz @ 2009-04-29 15:40 ` Andreas Krebbel 2009-04-29 17:33 ` Michael Matz 0 siblings, 1 reply; 63+ messages in thread From: Andreas Krebbel @ 2009-04-29 15:40 UTC (permalink / raw) To: Michael Matz; +Cc: David Edelsohn, gcc-patches > Your original one, or the amended one (for AIX)? I'm asking because I > noticed a problem when fiddling with the AIX problem. Your change to > insert_value_copy_on_edge might not work in all cases as you're doing the > expand_expr outside of a start_sequence. I decided to simply not use the > helper function as I had to also change the code to use convert_modes. > > It shouldn't expose itself as an RTL sharing problem, but it definitely > might have undesired effects. Unfortunately our s390 is off today, so I > can't help without a testcase. I've tested the modified version from your last posting (http://gcc.gnu.org/ml/gcc-patches/2009-04/msg02325.html). I already debugged a problem caused by my insert_value_copy_on_edge change - an infinite loop due to a miscompile of do_add in real.c arrgh. Bye, -Andreas- ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [RFA] expand from SSA form (1/2) 2009-04-29 15:40 ` Andreas Krebbel @ 2009-04-29 17:33 ` Michael Matz 2009-04-29 17:41 ` Michael Matz 0 siblings, 1 reply; 63+ messages in thread From: Michael Matz @ 2009-04-29 17:33 UTC (permalink / raw) To: Andreas Krebbel; +Cc: David Edelsohn, gcc-patches Hi, On Wed, 29 Apr 2009, Andreas Krebbel wrote: > > Your original one, or the amended one (for AIX)? I'm asking because I > > noticed a problem when fiddling with the AIX problem. Your change to > > insert_value_copy_on_edge might not work in all cases as you're doing the > > expand_expr outside of a start_sequence. I decided to simply not use the > > helper function as I had to also change the code to use convert_modes. > > > > It shouldn't expose itself as an RTL sharing problem, but it definitely > > might have undesired effects. Unfortunately our s390 is off today, so I > > can't help without a testcase. > > I've tested the modified version from your last posting > (http://gcc.gnu.org/ml/gcc-patches/2009-04/msg02325.html). I already > debugged a problem caused by my insert_value_copy_on_edge change - an > infinite loop due to a miscompile of do_add in real.c arrgh. The machine is online again, so I had a look. cprop creates this sharing, but it seems it's a preexisting bug (no idea why it didn't trigger before).
The problem happens in this insn:

(insn 26 25 27 7 ../../../gcc/libgomp/config/linux/affinity.c:82 (set (reg:DI 65 [ cpu+-4 ]) (ior:DI (ashift:DI (zero_extend:DI (truncate:SI (mod:DI (reg:DI 65 [ cpu+-4 ]) (sign_extend:DI (reg:SI 66 [ gomp_cpu_affinity_len ] ))))) (const_int 32 [0x20])) (zero_extend:DI (truncate:SI (div:DI (reg:DI 65 [ cpu+-4 ]) (sign_extend:DI (reg:SI 66 [ gomp_cpu_affinity_len ]))))))) 352 {divmoddisi3} (expr_list:REG_EQUAL (ior:DI (ashift:DI (zero_extend:DI (umod:SI (reg:SI 61) (mem/c/i:SI (plus:SI (reg:SI 12 %r12) (reg:SI 64)) [7 gomp_cpu_affinity_len+0 S4 A32]))) (const_int 32 [0x20])) (zero_extend:DI (udiv:SI (reg:SI 61) (mem/c/i:SI (plus:SI (reg:SI 12 %r12) (reg:SI 64)) [7 gomp_cpu_affinity_len+0 S4 A32])))) (nil)))

pseudo 64 is the interesting one. Note how it only occurs in the REG_EQUAL note, but it occurs there twice. try_replace_reg won't find anything to substitute in the PATTERN itself, but it unconditionally wants to replace occurrences in the notes as well:

if (note != 0 && REG_NOTE_KIND (note) == REG_EQUAL)
  set_unique_reg_note (insn, REG_EQUAL, simplify_replace_rtx (XEXP (note, 0), from, copy_rtx (to)));

Now, simplify_replace_rtx doesn't do any unsharing, and apply_change_group doesn't work for notes. So both occurrences of reg 64 are replaced by the same copy of this:

(const:SI (unspec:SI [ (symbol_ref:SI ("gomp_cpu_affinity_len") [flags 0x42] <var_decl0x40938540 gomp_cpu_affinity_len>) ] 112))

Voila, RTL sharing in the note. We need to unshare explicitly when using simplify_replace_rtx() unless we can be sure that FROM occurs only once or that TO is sharable. Like the patch below. It lets me build libgomp on s390.

On this basis I'm checking in our combined patch fixing the AIX problem. I'd be grateful for any further testing of this patch for the RTL sharing problem; I have not bootstrapped it on s390.

Ciao, Michael.
--
	* gcse.c (try_replace_reg): Unshare RTL when substituting into notes.
--- gcse.c	2009-04-29 14:34:41.000000000 +0200
+++ /suse/matz/gcse.c	2009-04-29 19:19:44.833381000 +0200
@@ -2272,9 +2272,14 @@ try_replace_reg (rtx from, rtx to, rtx i
   /* If there is already a REG_EQUAL note, update the expression in it
      with our replacement.  */
   if (note != 0 && REG_NOTE_KIND (note) == REG_EQUAL)
-    set_unique_reg_note (insn, REG_EQUAL,
-			 simplify_replace_rtx (XEXP (note, 0), from,
-					       copy_rtx (to)));
+    {
+      rtx x = XEXP (note, 0);
+      x = simplify_replace_rtx (x, from, copy_rtx (to));
+      reset_used_flags (x);
+      x = copy_rtx_if_shared (x);
+      set_unique_reg_note (insn, REG_EQUAL, x);
+    }
+
   if (!success && set && reg_mentioned_p (from, SET_SRC (set)))
     {
       /* If above failed and this is a single set, try to simplify the source of

^ permalink raw reply [flat|nested] 63+ messages in thread
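Editorial illustration of the sharing problem described in the message above. This is a toy model only, not GCC code: the node type and the helpers replace_reg, reset_used, copy_if_shared and count_revisits are hypothetical stand-ins for rtx, simplify_replace_rtx, reset_used_flags and copy_rtx_if_shared, and the toy assumes that no node kind may be shared (real RTL allows sharing of some codes). It sketches how substituting one TO object for two occurrences of FROM creates a shared subtree, and how a mark-and-copy walk removes it.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy expression node: a REG leaf, a SYM leaf, or a binary PLUS.  */
enum code { REG, SYM, PLUS };

struct node {
  enum code code;
  int value;            /* register number or symbol id */
  struct node *op[2];   /* operands, used only by PLUS */
  int used;             /* mark bit, playing the role of the "used" flag */
};

static struct node *make (enum code code, int value,
                          struct node *a, struct node *b)
{
  struct node *n = calloc (1, sizeof *n);
  assert (n != NULL);
  n->code = code;
  n->value = value;
  n->op[0] = a;
  n->op[1] = b;
  return n;
}

/* Replace every REG numbered REGNO by TO.  TO is never copied, so two
   occurrences of the register end up pointing at the same object.  */
static struct node *replace_reg (struct node *x, int regno, struct node *to)
{
  if (x->code == REG)
    return x->value == regno ? to : x;
  if (x->code == PLUS)
    {
      x->op[0] = replace_reg (x->op[0], regno, to);
      x->op[1] = replace_reg (x->op[1], regno, to);
    }
  return x;
}

/* Clear the mark bit on every node reachable from X.  */
static void reset_used (struct node *x)
{
  x->used = 0;
  if (x->code == PLUS)
    {
      reset_used (x->op[0]);
      reset_used (x->op[1]);
    }
}

/* Walk X, copying any node that is reached a second time.  */
static struct node *copy_if_shared (struct node *x)
{
  if (x->used)
    x = make (x->code, x->value, x->op[0], x->op[1]);
  x->used = 1;
  if (x->code == PLUS)
    {
      x->op[0] = copy_if_shared (x->op[0]);
      x->op[1] = copy_if_shared (x->op[1]);
    }
  return x;
}

/* Count how often an already-visited node is reached again.
   The caller must clear the mark bits first.  */
static int count_revisits (struct node *x)
{
  if (x->used)
    return 1;
  x->used = 1;
  if (x->code == PLUS)
    return count_revisits (x->op[0]) + count_revisits (x->op[1]);
  return 0;
}

/* Build (plus (reg 64) (plus (reg 61) (reg 64))), substitute reg 64, and
   report the number of shared references before and after unsharing.
   Nodes are deliberately leaked; this is only a demonstration.  */
void demo (int *shared_before, int *shared_after)
{
  struct node *note = make (PLUS, 0, make (REG, 64, NULL, NULL),
                            make (PLUS, 0, make (REG, 61, NULL, NULL),
                                  make (REG, 64, NULL, NULL)));
  struct node *to = make (SYM, 112, NULL, NULL);

  note = replace_reg (note, 64, to);
  reset_used (note);
  *shared_before = count_revisits (note);

  reset_used (note);
  note = copy_if_shared (note);
  reset_used (note);
  *shared_after = count_revisits (note);
}
```

Calling demo from a small main and printing the two counters shows the effect. Clearing the mark bits before the copying walk matters, for the same reason the patch calls reset_used_flags before copy_rtx_if_shared: stale marks would make the walk copy nodes that are not actually shared.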
* Re: [RFA] expand from SSA form (1/2) 2009-04-29 17:33 ` Michael Matz @ 2009-04-29 17:41 ` Michael Matz 0 siblings, 0 replies; 63+ messages in thread From: Michael Matz @ 2009-04-29 17:41 UTC (permalink / raw) To: Andreas Krebbel; +Cc: David Edelsohn, gcc-patches On Wed, 29 Apr 2009, Michael Matz wrote: > The machine is online again, so I had a look. cprop creates this > sharing, but it seems it's a preexisting bug (no idea why it didn't > trigger before). Bah, wires crossing :-) I retract the unsharing patch then. Ciao, Michael. ^ permalink raw reply [flat|nested] 63+ messages in thread