Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
References: <551A2C7C.8060005@redhat.com> <5522AF73.5000706@redhat.com>
Date: Thu, 16 Jul 2015 08:50:00 -0000
Subject: Re: [PR64164] drop copyrename, integrate into expand
From: Richard Biener
To: Alexandre Oliva
Cc: Jeff Law , GCC Patches , Christophe Lyon , David Edelsohn , Eric Botcazou

On Thu, Jul 16, 2015 at 9:29 AM, Alexandre Oliva wrote: > On Jun 10, 2015, Richard Biener wrote: > >> On Wed, Jun 10, 2015 at 2:24 AM, Alexandre Oliva wrote: >>> This caused the sparc regression reported by Eric in >>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64164#c37 > >>> We need to match the mode of the rtl created for the partition and the >>> promoted mode expected for the parm.
I recall working to make parm and >>> result decls the partition leaders, so that promote_ssa_mode would DTRT, >>> but this escaped my mind when revisiting the patch after some time on >>> another project. > > FWIW, during the development of this improvement, I dropped the notion > of making parm and result decls partition leaders, and instead only > considered eligible for coalescing into the same partition SSA_NAMEs > that promoted to the same mode. > >> Alternatively not coalesce SSA names when promote_decl_mode gives >> different answers (for their underlying decl)? It sounds wrong to do that >> (if that is really what happens). > > Exactly. I've now restored the promote_decl_mode behavior to > promote_ssa_mode for PARM_ and RESULT_DECLs, so that the strategy > described above works again. This fixed the sparc regression. > > On Jun 9, 2015, Alexandre Oliva wrote: > >> On Jun 9, 2015, Alexandre Oliva wrote: > >>> On Jun 9, 2015, David Edelsohn wrote: >>>> This also broke bootstrap on PPC64 LE Linux with the same error. > >>> Thanks for your reports. I'm looking into the problem. > >>> I'd appreciate a preprocessed testcase from either of you to confirm the >>> fix, if not to help debug it. > >> The first potential source for this problem that jumped at me would be >> silenced with this change: > >> diff --git a/gcc/function.c b/gcc/function.c >> index 8bcc352..9201ed9 100644 >> --- a/gcc/function.c >> +++ b/gcc/function.c >> @@ -2974,7 +2974,8 @@ assign_parm_setup_block (struct assign_parm_data_all *all, >> stack_parm = copy_rtx (stack_parm); >> if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size) >> PUT_MODE (stack_parm, GET_MODE (entry_parm)); >> - set_mem_attributes (stack_parm, parm, 1); >> + if (GET_CODE (stack_parm) == MEM) >> + set_mem_attributes (stack_parm, parm, 1); >> } > >> /* If a BLKmode arrives in registers, copy it to a stack slot. 
Handle > > I ended up fixing this in a slightly different way, running the original > code above, from assign_stack_local to set_mem_attributes, only when > rtl_for_parm does not obtain an assignment set up by out-of-ssa. > >> but I suspect there might be other similar issues lurking in function.c >> after my attempt to turn parm assignment upside down ;-) > > There weren't, after all. > > On Jun 9, 2015, David Edelsohn wrote: > >> This patch clearly should have been tested on more >> architectures than x86 before being approved and merged. > > The following patch was regstrapped on x86_64-linux-gnu and > i686-pc-linux-gnu. I've also cross-built all-target successfully for > targets aarch64-elf, arm-eabi, arm-symbianelf, avr-elf, bfin-elf, > cr16-elf, cris-elf, crisv32-elf, epiphany-elf, fido-elf, fr30-elf, > frv-elf, i686-elf, lm32-elf, m68k-elf, mcore-elf, microblaze-elf, > mips64el-elf, mips64-elf, mips64orion-elf, mipsel-elf, > mipsisa32-elfoabi, mipsisa64-elfoabi, mipsisa64r2el-elf, > mipsisa64r2-sde-elf, mipsisa64sb1-elf, mipstx39-elf, mn10300-elf, > moxie-elf, nds32be-elf, nds32le-elf, nios2-elf, powerpc-eabialtivec, > powerpc-eabisimaltivec, powerpc-eabisim, powerpc-eabispe, powerpc-eabi, > powerpcle-eabisim, powerpcle-eabi, powerpcle-elf, ppc-eabi, ppc-elf, > rx-elf, sh-elf, sh-superh-elf, sparc64-elf, sparc-elf, spu-elf, and > visium-elf, and got the same build failures before and after the patch > with targets c6x-elf, ft32-elf, h8300-elf, ia64-elf, iq2000-elf, > m32c-elf, m32r-elf, m32rle-elf, mep-elf, mips64vr-elf > (mips64vr-elf/mips16/newlib/libm/math/lib_a_e_hypot.o failed to build > with the patch and passed without it, but there were other "invalid > operand" failures for "lwu" insns without the patch, so I'm counting the > e_hypot failure as present but latent before), mipsisa64sr71k-elf, > msp430-elf, pdp11-aout, powerpc-xilinx-eabi, ppc64-eabi, rl78-elf, > sh64-elf, sparc-leon-elf, v850e-elf, v850-elf, xstormy16-elf, and > xtensa-elf. 
> > This patch differs from the previous one in that I dropped the hunk I > had put in loop_exits_before_overflow, already noticed and fixed > independently (PR66638); I updated tree_int_map_hasher, that was updated > in the trunk in tree-ssa-live.c, but that the patch moved to > tree-ssa-coalesce.c; I resolved other conflicts in files that had > #includes added by the patch and by other changes; and I put in the two > fixes mentioned above. After the full updated patch, I enclose a diff > with these two additional fixes, to ease the review. > > Is this ok to install? Yes. Thanks again for taking care of this! Richard. > > for gcc/ChangeLog > > PR rtl-optimization/64164 > * Makefile.in (OBJS): Drop tree-ssa-copyrename.o. > * tree-ssa-copyrename.c: Removed. > * opts.c (default_options_table): Drop -ftree-copyrename. Add > -ftree-coalesce-vars. > * passes.def: Drop all occurrences of pass_rename_ssa_copies. > * common.opt (ftree-copyrename): Ignore. > (ftree-coalesce-inlined-vars): Likewise. > * doc/invoke.texi: Remove the ignored options above. > * gimple-expr.h (gimple_can_coalesce_p): Move declaration > * tree-ssa-coalesce.h: ... here. > * tree-ssa-uncprop.c: Include tree-ssa-coalesce.h and other > headers required by it. > * gimple-expr.c (gimple_can_coalesce_p): Allow coalescing > across variables when flag_tree_coalesce_vars. Check register > use and promoted modes to allow coalescing. Moved to > tree-ssa-coalesce.c. > * tree-ssa-live.c (struct tree_int_map_hasher): Move along > with its member functions to tree-ssa-coalesce.c. > (var_map_base_init): Likewise. Renamed to > compute_samebase_partition_bases. > (partition_view_normal): Drop want_bases parameter. > (partition_view_bitmap): Likewise. > * tree-ssa-live.h: Adjust declarations. > * tree-ssa-coalesce.c: Include explow.h. > (build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs's > default defs at the entry point. > (dump_part_var_map): New. > (compute_optimized_partition_bases): New, called by... 
> (coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead > of compute_samebase_partition_bases. Adjust. > * alias.c (nonoverlapping_memrefs_p): Disregard gimple-regs. > * cfgexpand.c (leader_merge): New. > (get_rtl_for_parm_ssa_default_def): New. > (set_rtl): Merge exprs and attrs, even for MEMs and non-SSA > vars. Update DECL_RTL for PARM_DECLs and RESULT_DECLs too. > (expand_one_stack_var_at): Handle anonymous SSA_NAMEs. Drop > redundant MEM attr setting. > (expand_one_stack_var_1): Handle anonymous SSA_NAMEs. Renamed > from... > (expand_one_stack_var): ... this. New wrapper to check and > skip already expanded SSA partitions. > (record_alignment_for_reg_var): New, factored out of... > (expand_one_var): ... this. > (expand_one_ssa_partition): New. > (adjust_one_expanded_partition_var): New. > (expand_one_register_var): Check and skip already expanded SSA > partitions. > (expand_used_vars): Don't create DECLs for anonymous SSA > names. Expand all SSA partitions, then adjust all SSA names. > (pass::execute): Replace the loops that set > SA.partition_to_pseudo from partition leaders and cleared > DECL_RTL for multi-location variables, and that which used to > rename vars and set attrs, with one that clears DECL_RTL and > checks that PARMs and RESULTs default_defs match DECL_RTL. > * cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare. > * emit-rtl.c (set_reg_attrs_for_parm): Handle NULL decl. > * explow.c (promote_ssa_mode): New. > * explow.h (promote_ssa_mode): Declare. > * expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs. > * function.c: Include cfgexpand.h. > (use_register_for_decl): Handle SSA_NAMEs, anonymous or not. > (use_register_for_parm_decl): Wrapper for the above to > special-case the result_ptr. > (rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def. > (maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with > multiple locations. > (assign_parm_adjust_stack_rtl): Add all and parm arguments, > for rtl_for_parm. 
For SSA-assigned parms, zero stack_parm. > (assign_parm_setup_block): Prefer SSA-assigned location. > (assign_parm_setup_reg): Likewise. Use entry_parm for equiv > if stack_parm is NULL. > (assign_parm_setup_stack): Prefer SSA-assigned location. > (assign_parms): Maybe reset DECL_RTL of params. Adjust stack > rtl before testing for pointer bounds. Special-case result_ptr. > (expand_function_start): Maybe reset DECL_RTL of result. > Prefer SSA-assigned location for result and static chain. > Factor out DECL_RESULT and SET_DECL_RTL. > * tree-outof-ssa.c (insert_value_copy_on_edge): Handle > anonymous SSA names. Use promote_ssa_mode. > (get_temp_reg): Likewise. > (remove_ssa_form): Adjust. > * var-tracking.c (dataflow_set_clear_at_call): Take call_insn > and get its reg_usage for reg invalidation. > (compute_bb_dataflow): Pass it insn. > (emit_notes_in_bb): Likewise. > > for gcc/testsuite/ChangeLog > > * gcc.dg/guality/pr54200.c: Add -fno-tree-coalesce-vars. > * gcc.dg/ssp-1.c: Make counter a register. > * gcc.dg/ssp-2.c: Likewise. > * gcc.dg/torture/parm-coalesce.c: New. 
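The ChangeLog entry for gimple_can_coalesce_p says coalescing is allowed only after checking register use and promoted modes. As a rough illustration only, and not code from the patch, that rule can be modelled in standalone C. Every name below (toy_mode, toy_name, toy_promote, toy_can_coalesce, toy_partition) is hypothetical, the promotion rule is an assumed stand-in for what is really a target-dependent decision, and live-range conflicts, which real coalescing must also respect, are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for machine modes; these are NOT GCC's machine_mode.  */
enum toy_mode { TOY_QI, TOY_HI, TOY_SI, TOY_DI };

/* Toy stand-in for an SSA name; NOT GCC's tree.  */
struct toy_name
{
  const char *label;     /* Name used only for display.  */
  enum toy_mode mode;    /* Natural mode of the name's type.  */
  bool in_register;      /* Would it live in a register or on the stack?  */
};

/* Assumed promotion rule: anything narrower than TOY_SI widens to
   TOY_SI.  Real targets decide this themselves.  */
static enum toy_mode
toy_promote (enum toy_mode m)
{
  return m < TOY_SI ? TOY_SI : m;
}

/* Two names are eligible for the same partition only if they agree on
   register use and on promoted mode.  */
static bool
toy_can_coalesce (const struct toy_name *a, const struct toy_name *b)
{
  return a->in_register == b->in_register
	 && toy_promote (a->mode) == toy_promote (b->mode);
}

/* Give each name a partition id: reuse the partition of the first
   earlier name it may coalesce with, otherwise open a new partition.
   Returns the number of partitions created.  */
static size_t
toy_partition (const struct toy_name *names, size_t n, size_t *part)
{
  assert (names != NULL && part != NULL);

  size_t count = 0;
  for (size_t i = 0; i < n; i++)
    {
      size_t j;
      for (j = 0; j < i; j++)
	if (toy_can_coalesce (&names[i], &names[j]))
	  break;
      part[i] = (j < i) ? part[j] : count++;
    }
  return count;
}
```

Under these assumptions, a register char, a register int, a register long and a stack-allocated short would end up in three partitions: the char and the int share one, while the long and the short each get their own.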
> --- > gcc/Makefile.in | 1 > gcc/alias.c | 13 + > gcc/cfgexpand.c | 370 +++++++++++++++----- > gcc/cfgexpand.h | 2 > gcc/common.opt | 12 - > gcc/doc/invoke.texi | 48 +-- > gcc/emit-rtl.c | 5 > gcc/explow.c | 22 + > gcc/explow.h | 3 > gcc/expr.c | 39 +- > gcc/function.c | 228 ++++++++++-- > gcc/gimple-expr.c | 39 -- > gcc/gimple-expr.h | 1 > gcc/opts.c | 2 > gcc/passes.def | 5 > gcc/testsuite/gcc.dg/guality/pr54200.c | 2 > gcc/testsuite/gcc.dg/ssp-1.c | 2 > gcc/testsuite/gcc.dg/ssp-2.c | 2 > gcc/testsuite/gcc.dg/torture/parm-coalesce.c | 40 ++ > gcc/tree-outof-ssa.c | 16 - > gcc/tree-ssa-coalesce.c | 378 ++++++++++++++++++++- > gcc/tree-ssa-coalesce.h | 1 > gcc/tree-ssa-copyrename.c | 475 -------------------------- > gcc/tree-ssa-live.c | 99 ----- > gcc/tree-ssa-live.h | 4 > gcc/tree-ssa-uncprop.c | 5 > gcc/var-tracking.c | 12 - > 27 files changed, 979 insertions(+), 847 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/torture/parm-coalesce.c > delete mode 100644 gcc/tree-ssa-copyrename.c > > diff --git a/gcc/Makefile.in b/gcc/Makefile.in > index bf2186a..b36f9c1 100644 > --- a/gcc/Makefile.in > +++ b/gcc/Makefile.in > @@ -1445,7 +1445,6 @@ OBJS = \ > tree-ssa-ccp.o \ > tree-ssa-coalesce.o \ > tree-ssa-copy.o \ > - tree-ssa-copyrename.o \ > tree-ssa-dce.o \ > tree-ssa-dom.o \ > tree-ssa-dse.o \ > diff --git a/gcc/alias.c b/gcc/alias.c > index 3203722..69e3732 100644 > --- a/gcc/alias.c > +++ b/gcc/alias.c > @@ -2507,6 +2507,19 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant) > if (! DECL_P (exprx) || ! DECL_P (expry)) > return 0; > > + /* If we refer to different gimple registers, or one gimple register > + and one non-gimple-register, we know they can't overlap. First, > + gimple registers don't have their addresses taken. Now, there > + could be more than one stack slot for (different versions of) the > + same gimple register, but we can presumably tell they don't > + overlap based on offsets from stack base addresses elsewhere. 
> + It's important that we don't proceed to DECL_RTL, because gimple > + registers may not pass DECL_RTL_SET_P, and make_decl_rtl won't be > + able to do anything about them since no SSA information will have > + remained to guide it. */ > + if (is_gimple_reg (exprx) || is_gimple_reg (expry)) > + return exprx != expry; > + > /* With invalid code we can end up storing into the constant pool. > Bail out to avoid ICEing when creating RTL for this. > See gfortran.dg/lto/20091028-2_0.f90. */ > diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c > index a047632..0b19953 100644 > --- a/gcc/cfgexpand.c > +++ b/gcc/cfgexpand.c > @@ -150,21 +150,121 @@ gimple_assign_rhs_to_tree (gimple stmt) > > #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x) > > +/* Choose either CUR or NEXT as the leader DECL for a partition. > + Prefer ignored decls, to simplify debug dumps and reduce ambiguity > + out of the same user variable being in multiple partitions (this is > + less likely for compiler-introduced temps). */ > + > +static tree > +leader_merge (tree cur, tree next) > +{ > + if (cur == NULL || cur == next) > + return next; > + > + if (DECL_P (cur) && DECL_IGNORED_P (cur)) > + return cur; > + > + if (DECL_P (next) && DECL_IGNORED_P (next)) > + return next; > + > + return cur; > +} > + > + > +/* Return the RTL for the default SSA def of a PARM or RESULT, if > + there is one. */ > + > +rtx > +get_rtl_for_parm_ssa_default_def (tree var) > +{ > + gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL); > + > + if (!is_gimple_reg (var)) > + return NULL_RTX; > + > + /* If we've already determined RTL for the decl, use it. This is > + not just an optimization: if VAR is a PARM whose incoming value > + is unused, we won't find a default def to use its partition, but > + we still want to use the location of the parm, if it was used at > + all. 
During assign_parms, until a location is assigned for the > + VAR, RTL can only for a parm or result if we're not coalescing > + across variables, when we know we're coalescing all SSA_NAMEs of > + each parm or result, and we're not coalescing them with names > + pertaining to other variables, such as other parms' default > + defs. */ > + if (DECL_RTL_SET_P (var)) > + { > + gcc_assert (DECL_RTL (var) != pc_rtx); > + return DECL_RTL (var); > + } > + > + tree name = ssa_default_def (cfun, var); > + > + if (!name) > + return NULL_RTX; > + > + int part = var_to_partition (SA.map, name); > + if (part == NO_PARTITION) > + return NULL_RTX; > + > + return SA.partition_to_pseudo[part]; > +} > + > /* Associate declaration T with storage space X. If T is no > SSA name this is exactly SET_DECL_RTL, otherwise make the > partition of T associated with X. */ > static inline void > set_rtl (tree t, rtx x) > { > + if (x && SSAVAR (t)) > + { > + bool skip = false; > + tree cur = NULL_TREE; > + > + if (MEM_P (x)) > + cur = MEM_EXPR (x); > + else if (REG_P (x)) > + cur = REG_EXPR (x); > + else if (GET_CODE (x) == CONCAT > + && REG_P (XEXP (x, 0))) > + cur = REG_EXPR (XEXP (x, 0)); > + else if (GET_CODE (x) == PARALLEL) > + cur = REG_EXPR (XVECEXP (x, 0, 0)); > + else if (x == pc_rtx) > + skip = true; > + else > + gcc_unreachable (); > + > + tree next = skip ? cur : leader_merge (cur, SSAVAR (t)); > + > + if (cur != next) > + { > + if (MEM_P (x)) > + set_mem_attributes (x, next, true); > + else > + set_reg_attrs_for_decl_rtl (next, x); > + } > + } > + > if (TREE_CODE (t) == SSA_NAME) > { > - SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x; > - if (x && !MEM_P (x)) > - set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x); > - /* For the benefit of debug information at -O0 (where vartracking > - doesn't run) record the place also in the base DECL if it's > - a normal variable (not a parameter). 
*/ > - if (x && x != pc_rtx && TREE_CODE (SSA_NAME_VAR (t)) == VAR_DECL) > + int part = var_to_partition (SA.map, t); > + if (part != NO_PARTITION) > + { > + if (SA.partition_to_pseudo[part]) > + gcc_assert (SA.partition_to_pseudo[part] == x); > + else > + SA.partition_to_pseudo[part] = x; > + } > + /* For the benefit of debug information at -O0 (where > + vartracking doesn't run) record the place also in the base > + DECL. For PARMs and RESULTs, we may end up resetting these > + in function.c:maybe_reset_rtl_for_parm, but in some rare > + cases we may need them (unused and overwritten incoming > + value, that at -O0 must share the location with the other > + uses in spite of the missing default def), and this may be > + the only chance to preserve them. */ > + if (x && x != pc_rtx && SSA_NAME_VAR (t)) > { > tree var = SSA_NAME_VAR (t); > /* If we don't yet have something recorded, just record it now. */ > @@ -862,7 +962,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align, > gcc_assert (offset == trunc_int_for_mode (offset, Pmode)); > > x = plus_constant (Pmode, base, offset); > - x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x); > + x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME > + ? TYPE_MODE (TREE_TYPE (decl)) > + : DECL_MODE (SSAVAR (decl)), x); > > if (TREE_CODE (decl) != SSA_NAME) > { > @@ -884,7 +986,6 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align, > DECL_USER_ALIGN (decl) = 0; > } > > - set_mem_attributes (x, SSAVAR (decl), true); > set_rtl (decl, x); > } > > @@ -1099,13 +1200,22 @@ account_stack_vars (void) > to a variable to be allocated in the stack frame. 
*/ > > static void > -expand_one_stack_var (tree var) > +expand_one_stack_var_1 (tree var) > { > HOST_WIDE_INT size, offset; > unsigned byte_align; > > - size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var))); > - byte_align = align_local_variable (SSAVAR (var)); > + if (TREE_CODE (var) == SSA_NAME) > + { > + tree type = TREE_TYPE (var); > + size = tree_to_uhwi (TYPE_SIZE_UNIT (type)); > + byte_align = TYPE_ALIGN_UNIT (type); > + } > + else > + { > + size = tree_to_uhwi (DECL_SIZE_UNIT (var)); > + byte_align = align_local_variable (var); > + } > > /* We handle highly aligned variables in expand_stack_vars. */ > gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT); > @@ -1116,6 +1226,27 @@ expand_one_stack_var (tree var) > crtl->max_used_stack_slot_alignment, offset); > } > > +/* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are > + already assigned some MEM. */ > + > +static void > +expand_one_stack_var (tree var) > +{ > + if (TREE_CODE (var) == SSA_NAME) > + { > + int part = var_to_partition (SA.map, var); > + if (part != NO_PARTITION) > + { > + rtx x = SA.partition_to_pseudo[part]; > + gcc_assert (x); > + gcc_assert (MEM_P (x)); > + return; > + } > + } > + > + return expand_one_stack_var_1 (var); > +} > + > /* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL > that will reside in a hard register. */ > > @@ -1125,13 +1256,114 @@ expand_one_hard_reg_var (tree var) > rest_of_decl_compilation (var, 0, 0); > } > > +/* Record the alignment requirements of some variable assigned to a > + pseudo. */ > + > +static void > +record_alignment_for_reg_var (unsigned int align) > +{ > + if (SUPPORTS_STACK_ALIGNMENT > + && crtl->stack_alignment_estimated < align) > + { > + /* stack_alignment_estimated shouldn't change after stack > + realign decision made */ > + gcc_assert (!crtl->stack_realign_processed); > + crtl->stack_alignment_estimated = align; > + } > + > + /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted. 
> + So here we only make sure stack_alignment_needed >= align. */ > + if (crtl->stack_alignment_needed < align) > + crtl->stack_alignment_needed = align; > + if (crtl->max_used_stack_slot_alignment < align) > + crtl->max_used_stack_slot_alignment = align; > +} > + > +/* Create RTL for an SSA partition. */ > + > +static void > +expand_one_ssa_partition (tree var) > +{ > + int part = var_to_partition (SA.map, var); > + gcc_assert (part != NO_PARTITION); > + > + if (SA.partition_to_pseudo[part]) > + return; > + > + if (!use_register_for_decl (var)) > + { > + expand_one_stack_var_1 (var); > + return; > + } > + > + unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (var), > + TYPE_MODE (TREE_TYPE (var)), > + TYPE_ALIGN (TREE_TYPE (var))); > + > + /* If the variable alignment is very large we'll dynamicaly allocate > + it, which means that in-frame portion is just a pointer. */ > + if (align > MAX_SUPPORTED_STACK_ALIGNMENT) > + align = POINTER_SIZE; > + > + record_alignment_for_reg_var (align); > + > + machine_mode reg_mode = promote_ssa_mode (var, NULL); > + > + rtx x = gen_reg_rtx (reg_mode); > + > + set_rtl (var, x); > +} > + > +/* Record the association between the RTL generated for a partition > + and the underlying variable of the SSA_NAME. */ > + > +static void > +adjust_one_expanded_partition_var (tree var) > +{ > + if (!var) > + return; > + > + tree decl = SSA_NAME_VAR (var); > + > + int part = var_to_partition (SA.map, var); > + if (part == NO_PARTITION) > + return; > + > + rtx x = SA.partition_to_pseudo[part]; > + > + set_rtl (var, x); > + > + if (!REG_P (x)) > + return; > + > + /* Note if the object is a user variable. */ > + if (decl && !DECL_ARTIFICIAL (decl)) > + mark_user_reg (x); > + > + if (POINTER_TYPE_P (decl ? TREE_TYPE (decl) : TREE_TYPE (var))) > + mark_reg_pointer (x, get_pointer_alignment (var)); > +} > + > /* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL > that will reside in a pseudo register. 
*/ > > static void > expand_one_register_var (tree var) > { > - tree decl = SSAVAR (var); > + if (TREE_CODE (var) == SSA_NAME) > + { > + int part = var_to_partition (SA.map, var); > + if (part != NO_PARTITION) > + { > + rtx x = SA.partition_to_pseudo[part]; > + gcc_assert (x); > + gcc_assert (REG_P (x)); > + return; > + } > + gcc_unreachable (); > + } > + > + tree decl = var; > tree type = TREE_TYPE (decl); > machine_mode reg_mode = promote_decl_mode (decl, NULL); > rtx x = gen_reg_rtx (reg_mode); > @@ -1265,21 +1497,7 @@ expand_one_var (tree var, bool toplevel, bool really_expand) > align = POINTER_SIZE; > } > > - if (SUPPORTS_STACK_ALIGNMENT > - && crtl->stack_alignment_estimated < align) > - { > - /* stack_alignment_estimated shouldn't change after stack > - realign decision made */ > - gcc_assert (!crtl->stack_realign_processed); > - crtl->stack_alignment_estimated = align; > - } > - > - /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted. > - So here we only make sure stack_alignment_needed >= align. */ > - if (crtl->stack_alignment_needed < align) > - crtl->stack_alignment_needed = align; > - if (crtl->max_used_stack_slot_alignment < align) > - crtl->max_used_stack_slot_alignment = align; > + record_alignment_for_reg_var (align); > > if (TREE_CODE (origvar) == SSA_NAME) > { > @@ -1713,48 +1931,18 @@ expand_used_vars (void) > if (targetm.use_pseudo_pic_reg ()) > pic_offset_table_rtx = gen_reg_rtx (Pmode); > > - hash_map ssa_name_decls; > for (i = 0; i < SA.map->num_partitions; i++) > { > tree var = partition_to_var (SA.map, i); > > gcc_assert (!virtual_operand_p (var)); > > - /* Assign decls to each SSA name partition, share decls for partitions > - we could have coalesced (those with the same type). 
*/ > - if (SSA_NAME_VAR (var) == NULL_TREE) > - { > - tree *slot = &ssa_name_decls.get_or_insert (TREE_TYPE (var)); > - if (!*slot) > - *slot = create_tmp_reg (TREE_TYPE (var)); > - replace_ssa_name_symbol (var, *slot); > - } > - > - /* Always allocate space for partitions based on VAR_DECLs. But for > - those based on PARM_DECLs or RESULT_DECLs and which matter for the > - debug info, there is no need to do so if optimization is disabled > - because all the SSA_NAMEs based on these DECLs have been coalesced > - into a single partition, which is thus assigned the canonical RTL > - location of the DECLs. If in_lto_p, we can't rely on optimize, > - a function could be compiled with -O1 -flto first and only the > - link performed at -O0. */ > - if (TREE_CODE (SSA_NAME_VAR (var)) == VAR_DECL) > - expand_one_var (var, true, true); > - else if (DECL_IGNORED_P (SSA_NAME_VAR (var)) || optimize || in_lto_p) > - { > - /* This is a PARM_DECL or RESULT_DECL. For those partitions that > - contain the default def (representing the parm or result itself) > - we don't do anything here. But those which don't contain the > - default def (representing a temporary based on the parm/result) > - we need to allocate space just like for normal VAR_DECLs. */ > - if (!bitmap_bit_p (SA.partition_has_default_def, i)) > - { > - expand_one_var (var, true, true); > - gcc_assert (SA.partition_to_pseudo[i]); > - } > - } > + expand_one_ssa_partition (var); > } > > + for (i = 1; i < num_ssa_names; i++) > + adjust_one_expanded_partition_var (ssa_name (i)); > + > if (flag_stack_protect == SPCT_FLAG_STRONG) > gen_stack_protect_signal > = stack_protect_decl_p () || stack_protect_return_slot_p (); > @@ -5928,35 +6116,6 @@ pass_expand::execute (function *fun) > parm_birth_insn = var_seq; > } > > - /* Now that we also have the parameter RTXs, copy them over to our > - partitions. 
*/ > - for (i = 0; i < SA.map->num_partitions; i++) > - { > - tree var = SSA_NAME_VAR (partition_to_var (SA.map, i)); > - > - if (TREE_CODE (var) != VAR_DECL > - && !SA.partition_to_pseudo[i]) > - SA.partition_to_pseudo[i] = DECL_RTL_IF_SET (var); > - gcc_assert (SA.partition_to_pseudo[i]); > - > - /* If this decl was marked as living in multiple places, reset > - this now to NULL. */ > - if (DECL_RTL_IF_SET (var) == pc_rtx) > - SET_DECL_RTL (var, NULL); > - > - /* Some RTL parts really want to look at DECL_RTL(x) when x > - was a decl marked in REG_ATTR or MEM_ATTR. We could use > - SET_DECL_RTL here making this available, but that would mean > - to select one of the potentially many RTLs for one DECL. Instead > - of doing that we simply reset the MEM_EXPR of the RTL in question, > - then nobody can get at it and hence nobody can call DECL_RTL on it. */ > - if (!DECL_RTL_SET_P (var)) > - { > - if (MEM_P (SA.partition_to_pseudo[i])) > - set_mem_expr (SA.partition_to_pseudo[i], NULL); > - } > - } > - > /* If we have a class containing differently aligned pointers > we need to merge those into the corresponding RTL pointer > alignment. */ > @@ -5964,7 +6123,6 @@ pass_expand::execute (function *fun) > { > tree name = ssa_name (i); > int part; > - rtx r; > > if (!name > /* We might have generated new SSA names in > @@ -5977,20 +6135,24 @@ pass_expand::execute (function *fun) > if (part == NO_PARTITION) > continue; > > - /* Adjust all partition members to get the underlying decl of > - the representative which we might have created in expand_one_var. */ > - if (SSA_NAME_VAR (name) == NULL_TREE) > + gcc_assert (SA.partition_to_pseudo[part]); > + > + /* If this decl was marked as living in multiple places, reset > + this now to NULL. */ > + tree var = SSA_NAME_VAR (name); > + if (var && DECL_RTL_IF_SET (var) == pc_rtx) > + SET_DECL_RTL (var, NULL); > + /* Check that the pseudos chosen by assign_parms are those of > + the corresponding default defs. 
*/ > + else if (SSA_NAME_IS_DEFAULT_DEF (name) > + && (TREE_CODE (var) == PARM_DECL > + || TREE_CODE (var) == RESULT_DECL)) > { > - tree leader = partition_to_var (SA.map, part); > - gcc_assert (SSA_NAME_VAR (leader) != NULL_TREE); > - replace_ssa_name_symbol (name, SSA_NAME_VAR (leader)); > + rtx in = DECL_RTL_IF_SET (var); > + gcc_assert (in); > + rtx out = SA.partition_to_pseudo[part]; > + gcc_assert (in == out || rtx_equal_p (in, out)); > } > - if (!POINTER_TYPE_P (TREE_TYPE (name))) > - continue; > - > - r = SA.partition_to_pseudo[part]; > - if (REG_P (r)) > - mark_reg_pointer (r, get_pointer_alignment (name)); > } > > /* If this function is `main', emit a call to `__main' > diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h > index a0b6e3e..602579d 100644 > --- a/gcc/cfgexpand.h > +++ b/gcc/cfgexpand.h > @@ -22,5 +22,7 @@ along with GCC; see the file COPYING3. If not see > > extern tree gimple_assign_rhs_to_tree (gimple); > extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *); > +extern rtx get_rtl_for_parm_ssa_default_def (tree var); > + > > #endif /* GCC_CFGEXPAND_H */ > diff --git a/gcc/common.opt b/gcc/common.opt > index 6b2ccbc..89dcabf 100644 > --- a/gcc/common.opt > +++ b/gcc/common.opt > @@ -2230,16 +2230,16 @@ Common Report Var(flag_tree_ch) Optimization > Enable loop header copying on trees > > ftree-coalesce-inlined-vars > -Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization > -Enable coalescing of copy-related user variables that are inlined > +Common Ignore RejectNegative > +Does nothing. Preserved for backward compatibility. 
> > ftree-coalesce-vars > -Common Report Var(flag_ssa_coalesce_vars,2) Optimization > -Enable coalescing of all copy-related user variables > +Common Report Var(flag_tree_coalesce_vars) Optimization > +Enable SSA coalescing of user variables > > ftree-copyrename > -Common Report Var(flag_tree_copyrename) Optimization > -Replace SSA temporaries with better names in copies > +Common Ignore > +Does nothing. Preserved for backward compatibility. > > ftree-copy-prop > Common Report Var(flag_tree_copy_prop) Optimization > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > index 522e924..681c33e 100644 > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -339,7 +339,6 @@ Objective-C and Objective-C++ Dialects}. > -fdump-tree-phiprop@r{[}-@var{n}@r{]} @gol > -fdump-tree-phiopt@r{[}-@var{n}@r{]} @gol > -fdump-tree-forwprop@r{[}-@var{n}@r{]} @gol > --fdump-tree-copyrename@r{[}-@var{n}@r{]} @gol > -fdump-tree-nrv -fdump-tree-vect @gol > -fdump-tree-sink @gol > -fdump-tree-sra@r{[}-@var{n}@r{]} @gol > @@ -445,9 +444,8 @@ Objective-C and Objective-C++ Dialects}. > -fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol > -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol > -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol > --ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol > --ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol > --ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol > +-ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol > +-ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol > -ftree-loop-if-convert-stores -ftree-loop-im @gol > -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol > -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol > @@ -7078,11 +7076,6 @@ name is made by appending @file{.phiopt} to the source file name. > Dump each function after forward propagating single use variables. 
The file > name is made by appending @file{.forwprop} to the source file name. > > -@item copyrename > -@opindex fdump-tree-copyrename > -Dump each function after applying the copy rename optimization. The file > -name is made by appending @file{.copyrename} to the source file name. > - > @item nrv > @opindex fdump-tree-nrv > Dump each function after applying the named return value optimization on > @@ -7547,8 +7540,8 @@ compilation time. > -ftree-ccp @gol > -fssa-phiopt @gol > -ftree-ch @gol > +-ftree-coalesce-vars @gol > -ftree-copy-prop @gol > --ftree-copyrename @gol > -ftree-dce @gol > -ftree-dominator-opts @gol > -ftree-dse @gol > @@ -8817,6 +8810,15 @@ profitable to parallelize the loops. > Compare the results of several data dependence analyzers. This option > is used for debugging the data dependence analyzers. > > +@item -ftree-coalesce-vars > +@opindex ftree-coalesce-vars > +Tell the compiler to attempt to combine small user-defined variables > +too, instead of just compiler temporaries. This may severely limit the > +ability to debug an optimized program compiled with > +@option{-fno-var-tracking-assignments}. In the negated form, this flag > +prevents SSA coalescing of user variables. This option is enabled by > +default if optimization is enabled. > + > @item -ftree-loop-if-convert > @opindex ftree-loop-if-convert > Attempt to transform conditional jumps in the innermost loops to > @@ -8930,32 +8932,6 @@ Perform scalar replacement of aggregates. This pass replaces structure > references with scalars to prevent committing structures to memory too > early. This flag is enabled by default at @option{-O} and higher. > > -@item -ftree-copyrename > -@opindex ftree-copyrename > -Perform copy renaming on trees. This pass attempts to rename compiler > -temporaries to other variables at copy locations, usually resulting in > -variable names which more closely resemble the original variables. This flag > -is enabled by default at @option{-O} and higher. 
> - > -@item -ftree-coalesce-inlined-vars > -@opindex ftree-coalesce-inlined-vars > -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to > -combine small user-defined variables too, but only if they are inlined > -from other functions. It is a more limited form of > -@option{-ftree-coalesce-vars}. This may harm debug information of such > -inlined variables, but it keeps variables of the inlined-into > -function apart from each other, such that they are more likely to > -contain the expected values in a debugging session. > - > -@item -ftree-coalesce-vars > -@opindex ftree-coalesce-vars > -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to > -combine small user-defined variables too, instead of just compiler > -temporaries. This may severely limit the ability to debug an optimized > -program compiled with @option{-fno-var-tracking-assignments}. In the > -negated form, this flag prevents SSA coalescing of user variables, > -including inlined ones. This option is enabled by default. > - > @item -ftree-ter > @opindex ftree-ter > Perform temporary expression replacement during the SSA->normal phase. 
Single > diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c > index ed2b30b..0648af6 100644 > --- a/gcc/emit-rtl.c > +++ b/gcc/emit-rtl.c > @@ -1232,6 +1232,9 @@ set_reg_attrs_for_parm (rtx parm_rtx, rtx mem) > void > set_reg_attrs_for_decl_rtl (tree t, rtx x) > { > + if (!t) > + return; > + tree tdecl = t; > if (GET_CODE (x) == SUBREG) > { > gcc_assert (subreg_lowpart_p (x)); > @@ -1240,7 +1243,7 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x) > if (REG_P (x)) > REG_ATTRS (x) > = get_reg_attrs (t, byte_lowpart_offset (GET_MODE (x), > - DECL_MODE (t))); > + DECL_MODE (tdecl))); > if (GET_CODE (x) == CONCAT) > { > if (REG_P (XEXP (x, 0))) > diff --git a/gcc/explow.c b/gcc/explow.c > index bd342c1..6dba6e5 100644 > --- a/gcc/explow.c > +++ b/gcc/explow.c > @@ -842,6 +842,28 @@ promote_decl_mode (const_tree decl, int *punsignedp) > return pmode; > } > > +/* Return the promoted mode for name. If it is a named SSA_NAME, it > + is the same as promote_decl_mode. Otherwise, it is the promoted > + mode of a temp decl of same type as the SSA_NAME, if we had created > + one. */ > + > +machine_mode > +promote_ssa_mode (const_tree name, int *punsignedp) > +{ > + gcc_assert (TREE_CODE (name) == SSA_NAME); > + > + tree type = TREE_TYPE (name); > + int unsignedp = TYPE_UNSIGNED (type); > + machine_mode mode = TYPE_MODE (type); > + > + machine_mode pmode = promote_mode (type, mode, &unsignedp); > + if (punsignedp) > + *punsignedp = unsignedp; > + > + return pmode; > +} > + > + > > /* Controls the behaviour of {anti_,}adjust_stack. */ > static bool suppress_reg_args_size; > diff --git a/gcc/explow.h b/gcc/explow.h > index 94613de..52113db 100644 > --- a/gcc/explow.h > +++ b/gcc/explow.h > @@ -57,6 +57,9 @@ extern machine_mode promote_mode (const_tree, machine_mode, int *); > /* Return mode and signedness to use when object is promoted. */ > machine_mode promote_decl_mode (const_tree, int *); > > +/* Return mode and signedness to use when object is promoted. 
*/ > +machine_mode promote_ssa_mode (const_tree, int *); > + > /* Remove some bytes from the stack. An rtx says how many. */ > extern void adjust_stack (rtx); > > diff --git a/gcc/expr.c b/gcc/expr.c > index 899a42c..d601129 100644 > --- a/gcc/expr.c > +++ b/gcc/expr.c > @@ -9246,7 +9246,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, > rtx op0, op1, temp, decl_rtl; > tree type; > int unsignedp; > - machine_mode mode; > + machine_mode mode, dmode; > enum tree_code code = TREE_CODE (exp); > rtx subtarget, original_target; > int ignore; > @@ -9377,7 +9377,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, > if (g == NULL > && modifier == EXPAND_INITIALIZER > && !SSA_NAME_IS_DEFAULT_DEF (exp) > - && (optimize || DECL_IGNORED_P (SSA_NAME_VAR (exp))) > + && (optimize || !SSA_NAME_VAR (exp) > + || DECL_IGNORED_P (SSA_NAME_VAR (exp))) > && stmt_is_replaceable_p (SSA_NAME_DEF_STMT (exp))) > g = SSA_NAME_DEF_STMT (exp); > if (g) > @@ -9456,15 +9457,18 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, > /* Ensure variable marked as used even if it doesn't go through > a parser. If it hasn't be used yet, write out an external > definition. */ > - TREE_USED (exp) = 1; > + if (exp) > + TREE_USED (exp) = 1; > > /* Show we haven't gotten RTL for this yet. */ > temp = 0; > > /* Variables inherited from containing functions should have > been lowered by this point. 
*/ > - context = decl_function_context (exp); > - gcc_assert (SCOPE_FILE_SCOPE_P (context) > + if (exp) > + context = decl_function_context (exp); > + gcc_assert (!exp > + || SCOPE_FILE_SCOPE_P (context) > || context == current_function_decl > || TREE_STATIC (exp) > || DECL_EXTERNAL (exp) > @@ -9488,7 +9492,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, > decl_rtl = use_anchored_address (decl_rtl); > if (modifier != EXPAND_CONST_ADDRESS > && modifier != EXPAND_SUM > - && !memory_address_addr_space_p (DECL_MODE (exp), > + && !memory_address_addr_space_p (exp ? DECL_MODE (exp) > + : GET_MODE (decl_rtl), > XEXP (decl_rtl, 0), > MEM_ADDR_SPACE (decl_rtl))) > temp = replace_equiv_address (decl_rtl, > @@ -9499,12 +9504,17 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, > if the address is a register. */ > if (temp != 0) > { > - if (MEM_P (temp) && REG_P (XEXP (temp, 0))) > + if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0))) > mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp)); > > return temp; > } > > + if (exp) > + dmode = DECL_MODE (exp); > + else > + dmode = TYPE_MODE (TREE_TYPE (ssa_name)); > + > /* If the mode of DECL_RTL does not match that of the decl, > there are two cases: we are dealing with a BLKmode value > that is returned in a register, or we are dealing with > @@ -9512,22 +9522,23 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, > of the wanted mode, but mark it so that we know that it > was already extended. */ > if (REG_P (decl_rtl) > - && DECL_MODE (exp) != BLKmode > - && GET_MODE (decl_rtl) != DECL_MODE (exp)) > + && dmode != BLKmode > + && GET_MODE (decl_rtl) != dmode) > { > machine_mode pmode; > > /* Get the signedness to be used for this variable. Ensure we get > the same mode we got when the variable was declared. 
*/ > - if (code == SSA_NAME > - && (g = SSA_NAME_DEF_STMT (ssa_name)) > - && gimple_code (g) == GIMPLE_CALL > - && !gimple_call_internal_p (g)) > + if (code != SSA_NAME) > + pmode = promote_decl_mode (exp, &unsignedp); > + else if ((g = SSA_NAME_DEF_STMT (ssa_name)) > + && gimple_code (g) == GIMPLE_CALL > + && !gimple_call_internal_p (g)) > pmode = promote_function_mode (type, mode, &unsignedp, > gimple_call_fntype (g), > 2); > else > - pmode = promote_decl_mode (exp, &unsignedp); > + pmode = promote_ssa_mode (ssa_name, &unsignedp); > gcc_assert (GET_MODE (decl_rtl) == pmode); > > temp = gen_lowpart_SUBREG (mode, decl_rtl); > diff --git a/gcc/function.c b/gcc/function.c > index f9d11bf4..840f4a2 100644 > --- a/gcc/function.c > +++ b/gcc/function.c > @@ -72,6 +72,9 @@ along with GCC; see the file COPYING3. If not see > #include "cfganal.h" > #include "cfgbuild.h" > #include "cfgcleanup.h" > +#include "cfgexpand.h" > +#include "basic-block.h" > +#include "df.h" > #include "params.h" > #include "bb-reorder.h" > #include "shrink-wrap.h" > @@ -2105,6 +2108,30 @@ aggregate_value_p (const_tree exp, const_tree fntype) > bool > use_register_for_decl (const_tree decl) > { > + if (TREE_CODE (decl) == SSA_NAME) > + { > + /* We often try to use the SSA_NAME, instead of its underlying > + decl, to get type information and guide decisions, to avoid > + differences of behavior between anonymous and named > + variables, but in this one case we have to go for the actual > + variable if there is one. The main reason is that, at least > + at -O0, we want to place user variables on the stack, but we > + don't mind using pseudos for anonymous or ignored temps. > + Should we take the SSA_NAME, we'd conclude all SSA_NAMEs > + should go in pseudos, whereas their corresponding variables > + might have to go on the stack. 
So, disregarding the decl > + here would negatively impact debug info at -O0, enable > + coalescing between SSA_NAMEs that ought to get different > + stack/pseudo assignments, and get the incoming argument > + processing thoroughly confused by PARM_DECLs expected to live > + in stack slots but assigned to pseudos. */ > + if (!SSA_NAME_VAR (decl)) > + return TYPE_MODE (TREE_TYPE (decl)) != BLKmode > + && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl))); > + > + decl = SSA_NAME_VAR (decl); > + } > + > if (!targetm.calls.allocate_stack_slots_for_args ()) > return true; > > @@ -2745,23 +2772,88 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data) > data->entry_parm = entry_parm; > } > > +/* Wrapper for use_register_for_decl, that special-cases the > + .result_ptr as the function's RESULT_DECL when the RESULT_DECL is > + passed by reference. */ > + > +static bool > +use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm) > +{ > + if (parm == all->function_result_decl) > + { > + tree result = DECL_RESULT (current_function_decl); > + > + if (DECL_BY_REFERENCE (result)) > + parm = result; > + } > + > + return use_register_for_decl (parm); > +} > + > +/* Wrapper for get_rtl_for_parm_ssa_default_def, that special-cases > + the .result_ptr as the function's RESULT_DECL when the RESULT_DECL > + is passed by reference. */ > + > +static rtx > +rtl_for_parm (struct assign_parm_data_all *all, tree parm) > +{ > + if (parm == all->function_result_decl) > + { > + tree result = DECL_RESULT (current_function_decl); > + > + if (!DECL_BY_REFERENCE (result)) > + return NULL_RTX; > + > + parm = result; > + } > + > + return get_rtl_for_parm_ssa_default_def (parm); > +} > + > +/* Reset the location of PARM_DECLs and RESULT_DECLs that had > + SSA_NAMEs in multiple partitions, so that assign_parms will choose > + the default def, if it exists, or create new RTL to hold the unused > + entry value. 
If we are coalescing across variables, we want to > + reset the location too, because a parm without a default def > + (incoming value unused) might be coalesced with one with a default > + def, and then assign_parms would copy both incoming values to the > + same location, which might cause the wrong value to survive. */ > +static void > +maybe_reset_rtl_for_parm (tree parm) > +{ > + gcc_assert (TREE_CODE (parm) == PARM_DECL > + || TREE_CODE (parm) == RESULT_DECL); > + if ((flag_tree_coalesce_vars > + || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx)) > + && is_gimple_reg (parm)) > + SET_DECL_RTL (parm, NULL_RTX); > +} > + > /* A subroutine of assign_parms. Adjust DATA->STACK_RTL such that it's > always valid and properly aligned. */ > > static void > -assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data) > +assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm, > + struct assign_parm_data_one *data) > { > rtx stack_parm = data->stack_parm; > > + /* If out-of-SSA assigned RTL to the parm default def, make sure we > + don't use what we might have computed before. */ > + rtx ssa_assigned = rtl_for_parm (all, parm); > + if (ssa_assigned) > + stack_parm = NULL; > + > /* If we can't trust the parm stack slot to be aligned enough for its > ultimate type, don't use that slot after entry. We'll make another > stack slot, if we need one. 
*/ > - if (stack_parm > - && ((STRICT_ALIGNMENT > - && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm)) > - || (data->nominal_type > - && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm) > - && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY))) > + else if (stack_parm > + && ((STRICT_ALIGNMENT > + && (GET_MODE_ALIGNMENT (data->nominal_mode) > + > MEM_ALIGN (stack_parm))) > + || (data->nominal_type > + && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm) > + && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY))) > stack_parm = NULL; > > /* If parm was passed in memory, and we need to convert it on entry, > @@ -2823,11 +2915,16 @@ assign_parm_setup_block (struct assign_parm_data_all *all, > > size = int_size_in_bytes (data->passed_type); > size_stored = CEIL_ROUND (size, UNITS_PER_WORD); > + > if (stack_parm == 0) > { > DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD); > - stack_parm = assign_stack_local (BLKmode, size_stored, > - DECL_ALIGN (parm)); > + stack_parm = rtl_for_parm (all, parm); > + if (!stack_parm) > + stack_parm = assign_stack_local (BLKmode, size_stored, > + DECL_ALIGN (parm)); > + else > + stack_parm = copy_rtx (stack_parm); > if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size) > PUT_MODE (stack_parm, GET_MODE (entry_parm)); > set_mem_attributes (stack_parm, parm, 1); > @@ -2968,10 +3065,19 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, > = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp, > TREE_TYPE (current_function_decl), 2); > > - parmreg = gen_reg_rtx (promoted_nominal_mode); > + rtx from_expand = rtl_for_parm (all, parm); > > - if (!DECL_ARTIFICIAL (parm)) > - mark_user_reg (parmreg); > + if (from_expand && !data->passed_pointer) > + { > + parmreg = from_expand; > + gcc_assert (GET_MODE (parmreg) == promoted_nominal_mode); > + } > + else > + { > + parmreg = gen_reg_rtx (promoted_nominal_mode); > + if (!DECL_ARTIFICIAL (parm)) > + mark_user_reg 
(parmreg); > + } > > /* If this was an item that we received a pointer to, > set DECL_RTL appropriately. */ > @@ -2990,6 +3096,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, > assign_parm_find_data_types and expand_expr_real_1. */ > > equiv_stack_parm = data->stack_parm; > + if (!equiv_stack_parm) > + equiv_stack_parm = data->entry_parm; > validated_mem = validize_mem (copy_rtx (data->entry_parm)); > > need_conversion = (data->nominal_mode != data->passed_mode > @@ -3130,11 +3238,17 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, > > /* If we were passed a pointer but the actual value can safely live > in a register, retrieve it and use it directly. */ > - if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode) > + if (data->passed_pointer > + && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode)) > { > /* We can't use nominal_mode, because it will have been set to > Pmode above. We must use the actual mode of the parm. */ > - if (use_register_for_decl (parm)) > + if (from_expand) > + { > + parmreg = from_expand; > + gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm))); > + } > + else if (use_register_for_decl (parm)) > { > parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm))); > mark_user_reg (parmreg); > @@ -3174,7 +3288,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, > > /* STACK_PARM is the pointer, not the parm, and PARMREG is > now the parm. */ > - data->stack_parm = NULL; > + data->stack_parm = equiv_stack_parm = NULL; > } > > /* Mark the register as eliminable if we did no conversion and it was > @@ -3184,11 +3298,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, > make here would screw up life analysis for it. 
*/ > if (data->nominal_mode == data->passed_mode > && !did_conversion > - && data->stack_parm != 0 > - && MEM_P (data->stack_parm) > + && equiv_stack_parm != 0 > + && MEM_P (equiv_stack_parm) > && data->locate.offset.var == 0 > && reg_mentioned_p (virtual_incoming_args_rtx, > - XEXP (data->stack_parm, 0))) > + XEXP (equiv_stack_parm, 0))) > { > rtx_insn *linsn = get_last_insn (); > rtx_insn *sinsn; > @@ -3201,8 +3315,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, > = GET_MODE_INNER (GET_MODE (parmreg)); > int regnor = REGNO (XEXP (parmreg, 0)); > int regnoi = REGNO (XEXP (parmreg, 1)); > - rtx stackr = adjust_address_nv (data->stack_parm, submode, 0); > - rtx stacki = adjust_address_nv (data->stack_parm, submode, > + rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0); > + rtx stacki = adjust_address_nv (equiv_stack_parm, submode, > GET_MODE_SIZE (submode)); > > /* Scan backwards for the set of the real and > @@ -3275,6 +3389,13 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm, > > if (data->stack_parm == 0) > { > + rtx x = data->stack_parm = rtl_for_parm (all, parm); > + if (x) > + gcc_assert (GET_MODE (x) == GET_MODE (data->entry_parm)); > + } > + > + if (data->stack_parm == 0) > + { > int align = STACK_SLOT_ALIGNMENT (data->passed_type, > GET_MODE (data->entry_parm), > TYPE_ALIGN (data->passed_type)); > @@ -3531,6 +3652,8 @@ assign_parms (tree fndecl) > DECL_INCOMING_RTL (parm) = DECL_RTL (parm); > continue; > } > + else > + maybe_reset_rtl_for_parm (parm); > > /* Estimate stack alignment from parameter alignment. */ > if (SUPPORTS_STACK_ALIGNMENT) > @@ -3580,7 +3703,9 @@ assign_parms (tree fndecl) > else > set_decl_incoming_rtl (parm, data.entry_parm, false); > > - /* Boudns should be loaded in the particular order to > + assign_parm_adjust_stack_rtl (&all, parm, &data); > + > + /* Bounds should be loaded in the particular order to > have registers allocated correctly. 
Collect info about > input bounds and load them later. */ > if (POINTER_BOUNDS_TYPE_P (data.passed_type)) > @@ -3597,11 +3722,10 @@ assign_parms (tree fndecl) > } > else > { > - assign_parm_adjust_stack_rtl (&data); > - > if (assign_parm_setup_block_p (&data)) > assign_parm_setup_block (&all, parm, &data); > - else if (data.passed_pointer || use_register_for_decl (parm)) > + else if (data.passed_pointer > + || use_register_for_parm_decl (&all, parm)) > assign_parm_setup_reg (&all, parm, &data); > else > assign_parm_setup_stack (&all, parm, &data); > @@ -4932,7 +5056,9 @@ expand_function_start (tree subr) > before any library calls that assign parms might generate. */ > > /* Decide whether to return the value in memory or in a register. */ > - if (aggregate_value_p (DECL_RESULT (subr), subr)) > + tree res = DECL_RESULT (subr); > + maybe_reset_rtl_for_parm (res); > + if (aggregate_value_p (res, subr)) > { > /* Returning something that won't go in a register. */ > rtx value_address = 0; > @@ -4940,7 +5066,7 @@ expand_function_start (tree subr) > #ifdef PCC_STATIC_STRUCT_RETURN > if (cfun->returns_pcc_struct) > { > - int size = int_size_in_bytes (TREE_TYPE (DECL_RESULT (subr))); > + int size = int_size_in_bytes (TREE_TYPE (res)); > value_address = assemble_static_space (size); > } > else > @@ -4952,36 +5078,45 @@ expand_function_start (tree subr) > it. 
*/ > if (sv) > { > - value_address = gen_reg_rtx (Pmode); > + if (DECL_BY_REFERENCE (res)) > + value_address = get_rtl_for_parm_ssa_default_def (res); > + if (!value_address) > + value_address = gen_reg_rtx (Pmode); > emit_move_insn (value_address, sv); > } > } > if (value_address) > { > rtx x = value_address; > - if (!DECL_BY_REFERENCE (DECL_RESULT (subr))) > + if (!DECL_BY_REFERENCE (res)) > { > - x = gen_rtx_MEM (DECL_MODE (DECL_RESULT (subr)), x); > - set_mem_attributes (x, DECL_RESULT (subr), 1); > + x = get_rtl_for_parm_ssa_default_def (res); > + if (!x) > + { > + x = gen_rtx_MEM (DECL_MODE (res), value_address); > + set_mem_attributes (x, res, 1); > + } > } > - SET_DECL_RTL (DECL_RESULT (subr), x); > + SET_DECL_RTL (res, x); > } > } > - else if (DECL_MODE (DECL_RESULT (subr)) == VOIDmode) > + else if (DECL_MODE (res) == VOIDmode) > /* If return mode is void, this decl rtl should not be used. */ > - SET_DECL_RTL (DECL_RESULT (subr), NULL_RTX); > + SET_DECL_RTL (res, NULL_RTX); > else > { > /* Compute the return values into a pseudo reg, which we will copy > into the true return register after the cleanups are done. */ > - tree return_type = TREE_TYPE (DECL_RESULT (subr)); > - if (TYPE_MODE (return_type) != BLKmode > - && targetm.calls.return_in_msb (return_type)) > + tree return_type = TREE_TYPE (res); > + rtx x = get_rtl_for_parm_ssa_default_def (res); > + if (x) > + /* Use it. */; > + else if (TYPE_MODE (return_type) != BLKmode > + && targetm.calls.return_in_msb (return_type)) > /* expand_function_end will insert the appropriate padding in > this case. Use the return value's natural (unpadded) mode > within the function proper. 
*/ > - SET_DECL_RTL (DECL_RESULT (subr), > - gen_reg_rtx (TYPE_MODE (return_type))); > + x = gen_reg_rtx (TYPE_MODE (return_type)); > else > { > /* In order to figure out what mode to use for the pseudo, we > @@ -4992,25 +5127,26 @@ expand_function_start (tree subr) > /* Structures that are returned in registers are not > aggregate_value_p, so we may see a PARALLEL or a REG. */ > if (REG_P (hard_reg)) > - SET_DECL_RTL (DECL_RESULT (subr), > - gen_reg_rtx (GET_MODE (hard_reg))); > + x = gen_reg_rtx (GET_MODE (hard_reg)); > else > { > gcc_assert (GET_CODE (hard_reg) == PARALLEL); > - SET_DECL_RTL (DECL_RESULT (subr), gen_group_rtx (hard_reg)); > + x = gen_group_rtx (hard_reg); > } > } > > + SET_DECL_RTL (res, x); > + > /* Set DECL_REGISTER flag so that expand_function_end will copy the > result to the real return register(s). */ > - DECL_REGISTER (DECL_RESULT (subr)) = 1; > + DECL_REGISTER (res) = 1; > > if (chkp_function_instrumented_p (current_function_decl)) > { > - tree return_type = TREE_TYPE (DECL_RESULT (subr)); > + tree return_type = TREE_TYPE (res); > rtx bounds = targetm.calls.chkp_function_value_bounds (return_type, > subr, 1); > - SET_DECL_BOUNDS_RTL (DECL_RESULT (subr), bounds); > + SET_DECL_BOUNDS_RTL (res, bounds); > } > } > > @@ -5025,7 +5161,9 @@ expand_function_start (tree subr) > rtx local, chain; > rtx_insn *insn; > > - local = gen_reg_rtx (Pmode); > + local = get_rtl_for_parm_ssa_default_def (parm); > + if (!local) > + local = gen_reg_rtx (Pmode); > chain = targetm.calls.static_chain (current_function_decl, true); > > set_decl_incoming_rtl (parm, chain, false); > diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c > index b558d90..baed630 100644 > --- a/gcc/gimple-expr.c > +++ b/gcc/gimple-expr.c > @@ -375,45 +375,6 @@ copy_var_decl (tree var, tree name, tree type) > return copy; > } > > -/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for > - coalescing together, false otherwise. 
> - > - This must stay consistent with var_map_base_init in tree-ssa-live.c. */ > - > -bool > -gimple_can_coalesce_p (tree name1, tree name2) > -{ > - /* First check the SSA_NAME's associated DECL. We only want to > - coalesce if they have the same DECL or both have no associated DECL. */ > - tree var1 = SSA_NAME_VAR (name1); > - tree var2 = SSA_NAME_VAR (name2); > - var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE; > - var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE; > - if (var1 != var2) > - return false; > - > - /* Now check the types. If the types are the same, then we should > - try to coalesce V1 and V2. */ > - tree t1 = TREE_TYPE (name1); > - tree t2 = TREE_TYPE (name2); > - if (t1 == t2) > - return true; > - > - /* If the types are not the same, check for a canonical type match. This > - (for example) allows coalescing when the types are fundamentally the > - same, but just have different names. > - > - Note pointer types with different address spaces may have the same > - canonical type. Those are rejected for coalescing by the > - types_compatible_p check. */ > - if (TYPE_CANONICAL (t1) > - && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2) > - && types_compatible_p (t1, t2)) > - return true; > - > - return false; > -} > - > /* Strip off a legitimate source ending from the input string NAME of > length LEN. 
Rather than having to know the names used by all of > our front ends, we strip off an ending of a period followed by > diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h > index ed23eb2..3d1c89f 100644 > --- a/gcc/gimple-expr.h > +++ b/gcc/gimple-expr.h > @@ -28,7 +28,6 @@ extern gimple_seq gimple_body (tree); > extern bool gimple_has_body_p (tree); > extern const char *gimple_decl_printable_name (tree, int); > extern tree copy_var_decl (tree, tree, tree); > -extern bool gimple_can_coalesce_p (tree, tree); > extern tree create_tmp_var_name (const char *); > extern tree create_tmp_var_raw (tree, const char * = NULL); > extern tree create_tmp_var (tree, const char * = NULL); > diff --git a/gcc/opts.c b/gcc/opts.c > index 468a802..f22edd3 100644 > --- a/gcc/opts.c > +++ b/gcc/opts.c > @@ -445,12 +445,12 @@ static const struct default_options default_options_table[] = > { OPT_LEVELS_1_PLUS, OPT_fsplit_wide_types, NULL, 1 }, > { OPT_LEVELS_1_PLUS, OPT_ftree_ccp, NULL, 1 }, > { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_bit_ccp, NULL, 1 }, > + { OPT_LEVELS_1_PLUS, OPT_ftree_coalesce_vars, NULL, 1 }, > { OPT_LEVELS_1_PLUS, OPT_ftree_dce, NULL, 1 }, > { OPT_LEVELS_1_PLUS, OPT_ftree_dominator_opts, NULL, 1 }, > { OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 }, > { OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 }, > { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 }, > - { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 }, > { OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 }, > { OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 }, > { OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 }, > diff --git a/gcc/passes.def b/gcc/passes.def > index 5cd07ae..103fd2e 100644 > --- a/gcc/passes.def > +++ b/gcc/passes.def > @@ -77,7 +77,6 @@ along with GCC; see the file COPYING3. 
If not see > NEXT_PASS (pass_all_early_optimizations); > PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations) > NEXT_PASS (pass_remove_cgraph_callee_edges); > - NEXT_PASS (pass_rename_ssa_copies); > NEXT_PASS (pass_object_sizes); > NEXT_PASS (pass_ccp); > /* After CCP we rewrite no longer addressed locals into SSA > @@ -155,7 +154,6 @@ along with GCC; see the file COPYING3. If not see > /* Initial scalar cleanups before alias computation. > They ensure memory accesses are not indirect wherever possible. */ > NEXT_PASS (pass_strip_predict_hints); > - NEXT_PASS (pass_rename_ssa_copies); > NEXT_PASS (pass_ccp); > /* After CCP we rewrite no longer addressed locals into SSA > form if possible. */ > @@ -183,7 +181,6 @@ along with GCC; see the file COPYING3. If not see > NEXT_PASS (pass_ch); > NEXT_PASS (pass_lower_complex); > NEXT_PASS (pass_sra); > - NEXT_PASS (pass_rename_ssa_copies); > /* The dom pass will also resolve all __builtin_constant_p calls > that are still there to 0. This has to be done after some > propagations have already run, but before some more dead code > @@ -294,7 +291,6 @@ along with GCC; see the file COPYING3. If not see > NEXT_PASS (pass_fold_builtins); > NEXT_PASS (pass_optimize_widening_mul); > NEXT_PASS (pass_tail_calls); > - NEXT_PASS (pass_rename_ssa_copies); > /* FIXME: If DCE is not run before checking for uninitialized uses, > we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c). > However, this also causes us to misdiagnose cases that should be > @@ -329,7 +325,6 @@ along with GCC; see the file COPYING3. If not see > NEXT_PASS (pass_dce); > NEXT_PASS (pass_asan); > NEXT_PASS (pass_tsan); > - NEXT_PASS (pass_rename_ssa_copies); > /* ??? We do want some kind of loop invariant motion, but we possibly > need to adjust LIM to be more friendly towards preserving accurate > debug information here. 
*/ > diff --git a/gcc/testsuite/gcc.dg/guality/pr54200.c b/gcc/testsuite/gcc.dg/guality/pr54200.c > index 9b17187..e1e7293 100644 > --- a/gcc/testsuite/gcc.dg/guality/pr54200.c > +++ b/gcc/testsuite/gcc.dg/guality/pr54200.c > @@ -1,6 +1,6 @@ > /* PR tree-optimization/54200 */ > /* { dg-do run } */ > -/* { dg-options "-g -fno-var-tracking-assignments" } */ > +/* { dg-options "-g -fno-var-tracking-assignments -fno-tree-coalesce-vars" } */ > > int o __attribute__((used)); > > diff --git a/gcc/testsuite/gcc.dg/ssp-1.c b/gcc/testsuite/gcc.dg/ssp-1.c > index 5467f4d..db69332 100644 > --- a/gcc/testsuite/gcc.dg/ssp-1.c > +++ b/gcc/testsuite/gcc.dg/ssp-1.c > @@ -12,7 +12,7 @@ __stack_chk_fail (void) > > int main () > { > - int i; > + register int i; > char foo[255]; > > // smash stack > diff --git a/gcc/testsuite/gcc.dg/ssp-2.c b/gcc/testsuite/gcc.dg/ssp-2.c > index 9a7ac32..752fe53 100644 > --- a/gcc/testsuite/gcc.dg/ssp-2.c > +++ b/gcc/testsuite/gcc.dg/ssp-2.c > @@ -14,7 +14,7 @@ __stack_chk_fail (void) > void > overflow() > { > - int i = 0; > + register int i = 0; > char foo[30]; > > /* Overflow buffer. */ > diff --git a/gcc/testsuite/gcc.dg/torture/parm-coalesce.c b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c > new file mode 100644 > index 0000000..dbd81c1 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c > @@ -0,0 +1,40 @@ > +/* { dg-do run } */ > + > +#include <stdlib.h> > + > +/* Make sure we don't coalesce both incoming parms, one whose incoming > + value is unused, to the same location, so as to overwrite one of > + them with the incoming value of the other. */ > + > +int __attribute__((noinline, noclone)) > +foo (int i, int j) > +{ > + j = i; /* The incoming value for J is unused. */ > + i = 2; > + if (j) > + j++; > + j += i + 1; > + return j; > +} > + > +/* Same as foo, but with swapped parameters. */ > +int __attribute__((noinline, noclone)) > +bar (int j, int i) > +{ > + j = i; /* The incoming value for J is unused.
*/ > + i = 2; > + if (j) > + j++; > + j += i + 1; > + return j; > +} > + > +int > +main (void) > +{ > + if (foo (0, 1) != 3) > + abort (); > + if (bar (1, 0) != 3) > + abort (); > + return 0; > +} > diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c > index 7b747ab9..978476c 100644 > --- a/gcc/tree-outof-ssa.c > +++ b/gcc/tree-outof-ssa.c > @@ -279,7 +279,6 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus) > rtx dest_rtx, seq, x; > machine_mode dest_mode, src_mode; > int unsignedp; > - tree var; > > if (dump_file && (dump_flags & TDF_DETAILS)) > { > @@ -301,12 +300,12 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus) > > start_sequence (); > > - var = SSA_NAME_VAR (partition_to_var (SA.map, dest)); > + tree name = partition_to_var (SA.map, dest); > src_mode = TYPE_MODE (TREE_TYPE (src)); > dest_mode = GET_MODE (dest_rtx); > - gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var))); > + gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (name))); > gcc_assert (!REG_P (dest_rtx) > - || dest_mode == promote_decl_mode (var, &unsignedp)); > + || dest_mode == promote_ssa_mode (name, &unsignedp)); > > if (src_mode != dest_mode) > { > @@ -682,13 +681,12 @@ elim_backward (elim_graph g, int T) > static rtx > get_temp_reg (tree name) > { > - tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name; > - tree type = TREE_TYPE (var); > + tree type = TREE_TYPE (name); > int unsignedp; > - machine_mode reg_mode = promote_decl_mode (var, &unsignedp); > + machine_mode reg_mode = promote_ssa_mode (name, &unsignedp); > rtx x = gen_reg_rtx (reg_mode); > if (POINTER_TYPE_P (type)) > - mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var)))); > + mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type))); > return x; > } > > @@ -988,7 +986,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa) > > /* Return to viewing the variable list as just all reference variables after > coalescing has been performed. 
*/ > - partition_view_normal (map, false); > + partition_view_normal (map); > > if (dump_file && (dump_flags & TDF_DETAILS)) > { > diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c > index bf8983f..a622728 100644 > --- a/gcc/tree-ssa-coalesce.c > +++ b/gcc/tree-ssa-coalesce.c > @@ -36,6 +36,7 @@ along with GCC; see the file COPYING3. If not see > #include "gimple-iterator.h" > #include "tree-ssa-live.h" > #include "tree-ssa-coalesce.h" > +#include "explow.h" > #include "diagnostic-core.h" > > > @@ -806,6 +807,16 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo) > basic_block bb; > ssa_op_iter iter; > live_track_p live; > + basic_block entry; > + > + /* If inter-variable coalescing is enabled, we may attempt to > + coalesce variables from different base variables, including > + different parameters, so we have to make sure default defs live > + at the entry block conflict with each other. */ > + if (flag_tree_coalesce_vars) > + entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); > + else > + entry = NULL; > > map = live_var_map (liveinfo); > graph = ssa_conflicts_new (num_var_partitions (map)); > @@ -864,6 +875,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo) > live_track_process_def (live, result, graph); > } > > + /* Pretend there are defs for params' default defs at the start > + of the (post-)entry block. 
*/ > + if (bb == entry) > + { > + unsigned base; > + bitmap_iterator bi; > + EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi) > + { > + bitmap_iterator bi2; > + unsigned part; > + EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base], > + 0, part, bi2) > + { > + tree var = partition_to_var (map, part); > + if (!SSA_NAME_VAR (var) > + || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL > + && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL) > + || !SSA_NAME_IS_DEFAULT_DEF (var)) > + continue; > + live_track_process_def (live, var, graph); > + } > + } > + } > + > live_track_clear_base_vars (live); > } > > @@ -1132,6 +1167,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y, > { > var1 = partition_to_var (map, p1); > var2 = partition_to_var (map, p2); > + > z = var_union (map, var1, var2); > if (z == NO_PARTITION) > { > @@ -1149,6 +1185,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y, > > if (debug) > fprintf (debug, ": Success -> %d\n", z); > + > return true; > } > > @@ -1244,6 +1281,328 @@ ssa_name_var_hash::equal (const tree_node *n1, const tree_node *n2) > } > > > +/* Output partition map MAP with coalescing plan PART to file F. 
*/ > + > +void > +dump_part_var_map (FILE *f, partition part, var_map map) > +{ > + int t; > + unsigned x, y; > + int p; > + > + fprintf (f, "\nCoalescible Partition map \n\n"); > + > + for (x = 0; x < map->num_partitions; x++) > + { > + if (map->view_to_partition != NULL) > + p = map->view_to_partition[x]; > + else > + p = x; > + > + if (ssa_name (p) == NULL_TREE > + || virtual_operand_p (ssa_name (p))) > + continue; > + > + t = 0; > + for (y = 1; y < num_ssa_names; y++) > + { > + tree var = version_to_var (map, y); > + if (!var) > + continue; > + int q = var_to_partition (map, var); > + p = partition_find (part, q); > + gcc_assert (map->partition_to_base_index[q] > + == map->partition_to_base_index[p]); > + > + if (p == (int)x) > + { > + if (t++ == 0) > + { > + fprintf (f, "Partition %d, base %d (", x, > + map->partition_to_base_index[q]); > + print_generic_expr (f, partition_to_var (map, q), TDF_SLIM); > + fprintf (f, " - "); > + } > + fprintf (f, "%d ", y); > + } > + } > + if (t != 0) > + fprintf (f, ")\n"); > + } > + fprintf (f, "\n"); > +} > + > +/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for > + coalescing together, false otherwise. > + > + This must stay consistent with var_map_base_init in tree-ssa-live.c. */ > + > +bool > +gimple_can_coalesce_p (tree name1, tree name2) > +{ > + /* First check the SSA_NAME's associated DECL. Without > + optimization, we only want to coalesce if they have the same DECL > + or both have no associated DECL. */ > + tree var1 = SSA_NAME_VAR (name1); > + tree var2 = SSA_NAME_VAR (name2); > + var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE; > + var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE; > + if (var1 != var2 && !flag_tree_coalesce_vars) > + return false; > + > + /* Now check the types. If the types are the same, then we should > + try to coalesce V1 and V2. 
*/ > + tree t1 = TREE_TYPE (name1); > + tree t2 = TREE_TYPE (name2); > + if (t1 == t2) > + { > + check_modes: > + /* If the base variables are the same, we're good: none of the > + other tests below could possibly fail. */ > + var1 = SSA_NAME_VAR (name1); > + var2 = SSA_NAME_VAR (name2); > + if (var1 == var2) > + return true; > + > + /* We don't want to coalesce two SSA names if one of the base > + variables is supposed to be a register while the other is > + supposed to be on the stack. Anonymous SSA names take > + registers, but when not optimizing, user variables should go > + on the stack, so coalescing them with the anonymous variable > + as the partition leader would end up assigning the user > + variable to a register. Don't do that! */ > + bool reg1 = !var1 || use_register_for_decl (var1); > + bool reg2 = !var2 || use_register_for_decl (var2); > + if (reg1 != reg2) > + return false; > + > + /* Check that the promoted modes are the same. We don't want to > + coalesce if the promoted modes would be different. Only > + PARM_DECLs and RESULT_DECLs have different promotion rules, > + so skip the test if we both are variables or anonymous > + SSA_NAMEs. */ > + return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2))) > + || promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL); > + } > + > + /* If the types are not the same, check for a canonical type match. This > + (for example) allows coalescing when the types are fundamentally the > + same, but just have different names. > + > + Note pointer types with different address spaces may have the same > + canonical type. Those are rejected for coalescing by the > + types_compatible_p check. 
*/ > + if (TYPE_CANONICAL (t1) > + && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2) > + && types_compatible_p (t1, t2)) > + goto check_modes; > + > + return false; > +} > + > +/* Fill in MAP's partition_to_base_index, with one index for each > + partition of SSA names USED_IN_COPIES and related by CL coalesce > + possibilities. This must match gimple_can_coalesce_p in the > + optimized case. */ > + > +static void > +compute_optimized_partition_bases (var_map map, bitmap used_in_copies, > + coalesce_list_p cl) > +{ > + int parts = num_var_partitions (map); > + partition tentative = partition_new (parts); > + > + /* Partition the SSA versions so that, for each coalescible > + pair, both of its members are in the same partition in > + TENTATIVE. */ > + gcc_assert (!cl->sorted); > + coalesce_pair_p node; > + coalesce_iterator_type ppi; > + FOR_EACH_PARTITION_PAIR (node, ppi, cl) > + { > + tree v1 = ssa_name (node->first_element); > + int p1 = partition_find (tentative, var_to_partition (map, v1)); > + tree v2 = ssa_name (node->second_element); > + int p2 = partition_find (tentative, var_to_partition (map, v2)); > + > + if (p1 == p2) > + continue; > + > + partition_union (tentative, p1, p2); > + } > + > + /* We have to deal with cost one pairs too. */ > + for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next) > + { > + tree v1 = ssa_name (co->first_element); > + int p1 = partition_find (tentative, var_to_partition (map, v1)); > + tree v2 = ssa_name (co->second_element); > + int p2 = partition_find (tentative, var_to_partition (map, v2)); > + > + if (p1 == p2) > + continue; > + > + partition_union (tentative, p1, p2); > + } > + > + /* And also with abnormal edges. 
*/ > + basic_block bb; > + edge e; > + edge_iterator ei; > + FOR_EACH_BB_FN (bb, cfun) > + { > + FOR_EACH_EDGE (e, ei, bb->preds) > + if (e->flags & EDGE_ABNORMAL) > + { > + gphi_iterator gsi; > + for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); > + gsi_next (&gsi)) > + { > + gphi *phi = gsi.phi (); > + tree arg = PHI_ARG_DEF (phi, e->dest_idx); > + if (SSA_NAME_IS_DEFAULT_DEF (arg) > + && (!SSA_NAME_VAR (arg) > + || TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL)) > + continue; > + > + tree res = PHI_RESULT (phi); > + > + int p1 = partition_find (tentative, var_to_partition (map, res)); > + int p2 = partition_find (tentative, var_to_partition (map, arg)); > + > + if (p1 == p2) > + continue; > + > + partition_union (tentative, p1, p2); > + } > + } > + } > + > + map->partition_to_base_index = XCNEWVEC (int, parts); > + auto_vec<unsigned int> index_map (parts); > + if (parts) > + index_map.quick_grow (parts); > + > + const unsigned no_part = -1; > + unsigned count = parts; > + while (count) > + index_map[--count] = no_part; > + > + /* Initialize MAP's mapping from partition to base index, using > + as base indices an enumeration of the TENTATIVE partitions in > + which each SSA version ended up, so that we compute conflicts > + between all SSA versions that ended up in the same potential > + coalesce partition.
*/ > + bitmap_iterator bi; > + unsigned i; > + EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi) > + { > + int pidx = var_to_partition (map, ssa_name (i)); > + int base = partition_find (tentative, pidx); > + if (index_map[base] != no_part) > + continue; > + index_map[base] = count++; > + } > + > + map->num_basevars = count; > + > + EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi) > + { > + int pidx = var_to_partition (map, ssa_name (i)); > + int base = partition_find (tentative, pidx); > + gcc_assert (index_map[base] < count); > + map->partition_to_base_index[pidx] = index_map[base]; > + } > + > + if (dump_file && (dump_flags & TDF_DETAILS)) > + dump_part_var_map (dump_file, tentative, map); > + > + partition_delete (tentative); > +} > + > +/* Hashtable helpers. */ > + > +struct tree_int_map_hasher : nofree_ptr_hash <tree_int_map> > +{ > + static inline hashval_t hash (const tree_int_map *); > + static inline bool equal (const tree_int_map *, const tree_int_map *); > +}; > + > +inline hashval_t > +tree_int_map_hasher::hash (const tree_int_map *v) > +{ > + return tree_map_base_hash (v); > +} > + > +inline bool > +tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c) > +{ > + return tree_int_map_eq (v, c); > +} > + > +/* This routine will initialize the basevar fields of MAP with base > + names. Partitions will share the same base if they have the same > + SSA_NAME_VAR, or, being anonymous variables, the same type. This > + must match gimple_can_coalesce_p in the non-optimized case. */ > + > +static void > +compute_samebase_partition_bases (var_map map) > +{ > + int x, num_part; > + tree var; > + struct tree_int_map *m, *mapstorage; > + > + num_part = num_var_partitions (map); > + hash_table<tree_int_map_hasher> tree_to_index (num_part); > + /* We can have at most num_part entries in the hash tables, so it's > + enough to allocate so many map elements once, saving some malloc > + calls.
*/ > + mapstorage = m = XNEWVEC (struct tree_int_map, num_part); > + > + /* If a base table already exists, clear it, otherwise create it. */ > + free (map->partition_to_base_index); > + map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part); > + > + /* Build the base variable list, and point partitions at their bases. */ > + for (x = 0; x < num_part; x++) > + { > + struct tree_int_map **slot; > + unsigned baseindex; > + var = partition_to_var (map, x); > + if (SSA_NAME_VAR (var) > + && (!VAR_P (SSA_NAME_VAR (var)) > + || !DECL_IGNORED_P (SSA_NAME_VAR (var)))) > + m->base.from = SSA_NAME_VAR (var); > + else > + /* This restricts what anonymous SSA names we can coalesce > + as it restricts the sets we compute conflicts for. > + Using TREE_TYPE to generate sets is the easies as > + type equivalency also holds for SSA names with the same > + underlying decl. > + > + Check gimple_can_coalesce_p when changing this code. */ > + m->base.from = (TYPE_CANONICAL (TREE_TYPE (var)) > + ? TYPE_CANONICAL (TREE_TYPE (var)) > + : TREE_TYPE (var)); > + /* If base variable hasn't been seen, set it up. */ > + slot = tree_to_index.find_slot (m, INSERT); > + if (!*slot) > + { > + baseindex = m - mapstorage; > + m->to = baseindex; > + *slot = m; > + m++; > + } > + else > + baseindex = (*slot)->to; > + map->partition_to_base_index[x] = baseindex; > + } > + > + map->num_basevars = m - mapstorage; > + > + free (mapstorage); > +} > + > /* Reduce the number of copies by coalescing variables in the function. Return > a partition map with the resulting coalesces. */ > > @@ -1260,9 +1619,10 @@ coalesce_ssa_name (void) > cl = create_coalesce_list (); > map = create_outofssa_var_map (cl, used_in_copies); > > - /* If optimization is disabled, we need to coalesce all the names originating > - from the same SSA_NAME_VAR so debug info remains undisturbed. 
*/ > - if (!optimize) > + /* If this optimization is disabled, we need to coalesce all the > + names originating from the same SSA_NAME_VAR so debug info > + remains undisturbed. */ > + if (!flag_tree_coalesce_vars) > { > hash_table<ssa_name_var_hash> ssa_name_hash (10); > > @@ -1303,8 +1663,13 @@ coalesce_ssa_name (void) > if (dump_file && (dump_flags & TDF_DETAILS)) > dump_var_map (dump_file, map); > > - /* Don't calculate live ranges for variables not in the coalesce list. */ > - partition_view_bitmap (map, used_in_copies, true); > + partition_view_bitmap (map, used_in_copies); > + > + if (flag_tree_coalesce_vars) > + compute_optimized_partition_bases (map, used_in_copies, cl); > + else > + compute_samebase_partition_bases (map); > + > BITMAP_FREE (used_in_copies); > > if (num_var_partitions (map) < 1) > @@ -1343,8 +1708,7 @@ coalesce_ssa_name (void) > > /* Now coalesce everything in the list. */ > coalesce_partitions (map, graph, cl, > - ((dump_flags & TDF_DETAILS) ? dump_file > - : NULL)); > + ((dump_flags & TDF_DETAILS) ? dump_file : NULL)); > > delete_coalesce_list (cl); > ssa_conflicts_delete (graph); > diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h > index 99b188a..ae289b4 100644 > --- a/gcc/tree-ssa-coalesce.h > +++ b/gcc/tree-ssa-coalesce.h > @@ -21,5 +21,6 @@ along with GCC; see the file COPYING3. If not see > #define GCC_TREE_SSA_COALESCE_H > > extern var_map coalesce_ssa_name (void); > +extern bool gimple_can_coalesce_p (tree, tree); > > #endif /* GCC_TREE_SSA_COALESCE_H */ > diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c > deleted file mode 100644 > index aeb7f28..0000000 > --- a/gcc/tree-ssa-copyrename.c > +++ /dev/null > @@ -1,475 +0,0 @@ > -/* Rename SSA copies. > - Copyright (C) 2004-2015 Free Software Foundation, Inc. > - Contributed by Andrew MacLeod > - > -This file is part of GCC.
> - > -GCC is free software; you can redistribute it and/or modify > -it under the terms of the GNU General Public License as published by > -the Free Software Foundation; either version 3, or (at your option) > -any later version. > - > -GCC is distributed in the hope that it will be useful, > -but WITHOUT ANY WARRANTY; without even the implied warranty of > -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > -GNU General Public License for more details. > - > -You should have received a copy of the GNU General Public License > -along with GCC; see the file COPYING3. If not see > -<http://www.gnu.org/licenses/>. */ > - > -#include "config.h" > -#include "system.h" > -#include "coretypes.h" > -#include "backend.h" > -#include "tree.h" > -#include "gimple.h" > -#include "rtl.h" > -#include "ssa.h" > -#include "alias.h" > -#include "fold-const.h" > -#include "internal-fn.h" > -#include "gimple-iterator.h" > -#include "flags.h" > -#include "tree-pretty-print.h" > -#include "insn-config.h" > -#include "expmed.h" > -#include "dojump.h" > -#include "explow.h" > -#include "calls.h" > -#include "emit-rtl.h" > -#include "varasm.h" > -#include "stmt.h" > -#include "expr.h" > -#include "tree-dfa.h" > -#include "tree-inline.h" > -#include "tree-ssa-live.h" > -#include "tree-pass.h" > -#include "langhooks.h" > - > -static struct > -{ > - /* Number of copies coalesced. */ > - int coalesced; > -} stats; > - > -/* The following routines implement the SSA copy renaming phase. > - > - This optimization looks for copies between 2 SSA_NAMES, either through a > - direct copy, or an implicit one via a PHI node result and its arguments. > - > - Each copy is examined to determine if it is possible to rename the base > - variable of one of the operands to the same variable as the other operand. > - i.e. > - T.3_5 = <blah> > - a_1 = T.3_5 > - > - If this copy couldn't be copy propagated, it could possibly remain in the > - program throughout the optimization phases.
After SSA->normal, it would > - become: > - > - T.3 = <blah> > - a = T.3 > - > - Since T.3_5 is distinct from all other SSA versions of T.3, there is no > - fundamental reason why the base variable needs to be T.3, subject to > - certain restrictions. This optimization attempts to determine if we can > - change the base variable on copies like this, and result in code such as: > - > - a_5 = <blah> > - a_1 = a_5 > - > - This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is > - possible, the copy goes away completely. If it isn't possible, a new temp > - will be created for a_5, and you will end up with the exact same code: > - > - a.8 = <blah> > - a = a.8 > - > - The other benefit of performing this optimization relates to what variables > - are chosen in copies. Gimplification of the program uses temporaries for > - a lot of things. expressions like > - > - a_1 = <blah> > - <blah2> = a_1 > - > - get turned into > - > - T.3_5 = <blah> > - a_1 = T.3_5 > - <blah2> = a_1 > - > - Copy propagation is done in a forward direction, and if we can propagate > - through the copy, we end up with: > - > - T.3_5 = <blah> > - <blah2> = T.3_5 > - > - The copy is gone, but so is all reference to the user variable 'a'. By > - performing this optimization, we would see the sequence: > - > - a_5 = <blah> > - a_1 = a_5 > - <blah2> = a_1 > - > - which copy propagation would then turn into: > - > - a_5 = <blah> > - <blah2> = a_5 > - > - and so we still retain the user variable whenever possible. */ > - > - > -/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid. > - Choose a representative for the partition, and send debug info to DEBUG.
*/ > - > -static void > -copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug) > -{ > - int p1, p2, p3; > - tree root1, root2; > - tree rep1, rep2; > - bool ign1, ign2, abnorm; > - > - gcc_assert (TREE_CODE (var1) == SSA_NAME); > - gcc_assert (TREE_CODE (var2) == SSA_NAME); > - > - register_ssa_partition (map, var1); > - register_ssa_partition (map, var2); > - > - p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1)); > - p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2)); > - > - if (debug) > - { > - fprintf (debug, "Try : "); > - print_generic_expr (debug, var1, TDF_SLIM); > - fprintf (debug, "(P%d) & ", p1); > - print_generic_expr (debug, var2, TDF_SLIM); > - fprintf (debug, "(P%d)", p2); > - } > - > - gcc_assert (p1 != NO_PARTITION); > - gcc_assert (p2 != NO_PARTITION); > - > - if (p1 == p2) > - { > - if (debug) > - fprintf (debug, " : Already coalesced.\n"); > - return; > - } > - > - rep1 = partition_to_var (map, p1); > - rep2 = partition_to_var (map, p2); > - root1 = SSA_NAME_VAR (rep1); > - root2 = SSA_NAME_VAR (rep2); > - if (!root1 && !root2) > - return; > - > - /* Don't coalesce if one of the variables occurs in an abnormal PHI. */ > - abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1) > - || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2)); > - if (abnorm) > - { > - if (debug) > - fprintf (debug, " : Abnormal PHI barrier. No coalesce.\n"); > - return; > - } > - > - /* Partitions already have the same root, simply merge them. */ > - if (root1 == root2) > - { > - p1 = partition_union (map->var_partition, p1, p2); > - if (debug) > - fprintf (debug, " : Same root, coalesced --> P%d.\n", p1); > - return; > - } > - > - /* Never attempt to coalesce 2 different parameters. */ > - if ((root1 && TREE_CODE (root1) == PARM_DECL) > - && (root2 && TREE_CODE (root2) == PARM_DECL)) > - { > - if (debug) > - fprintf (debug, " : 2 different PARM_DECLS. 
No coalesce.\n"); > - return; > - } > - > - if ((root1 && TREE_CODE (root1) == RESULT_DECL) > - != (root2 && TREE_CODE (root2) == RESULT_DECL)) > - { > - if (debug) > - fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n"); > - return; > - } > - > - ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1)); > - ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2)); > - > - /* Refrain from coalescing user variables, if requested. */ > - if (!ign1 && !ign2) > - { > - if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2)) > - ign2 = true; > - else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1)) > - ign1 = true; > - else if (flag_ssa_coalesce_vars != 2) > - { > - if (debug) > - fprintf (debug, " : 2 different USER vars. No coalesce.\n"); > - return; > - } > - else > - ign2 = true; > - } > - > - /* If both values have default defs, we can't coalesce. If only one has a > - tag, make sure that variable is the new root partition. */ > - if (root1 && ssa_default_def (cfun, root1)) > - { > - if (root2 && ssa_default_def (cfun, root2)) > - { > - if (debug) > - fprintf (debug, " : 2 default defs. No coalesce.\n"); > - return; > - } > - else > - { > - ign2 = true; > - ign1 = false; > - } > - } > - else if (root2 && ssa_default_def (cfun, root2)) > - { > - ign1 = true; > - ign2 = false; > - } > - > - /* Do not coalesce if we cannot assign a symbol to the partition. */ > - if (!(!ign2 && root2) > - && !(!ign1 && root1)) > - { > - if (debug) > - fprintf (debug, " : Choosen variable has no root. No coalesce.\n"); > - return; > - } > - > - /* Don't coalesce if the new chosen root variable would be read-only. > - If both ign1 && ign2, then the root var of the larger partition > - wins, so reject in that case if any of the root vars is TREE_READONLY. > - Otherwise reject only if the root var, on which replace_ssa_name_symbol > - will be called below, is readonly. 
*/ > - if (((root1 && TREE_READONLY (root1)) && ign2) > - || ((root2 && TREE_READONLY (root2)) && ign1)) > - { > - if (debug) > - fprintf (debug, " : Readonly variable. No coalesce.\n"); > - return; > - } > - > - /* Don't coalesce if the two variables aren't type compatible . */ > - if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2)) > - /* There is a disconnect between the middle-end type-system and > - VRP, avoid coalescing enum types with different bounds. */ > - || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE > - || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE) > - && TREE_TYPE (var1) != TREE_TYPE (var2))) > - { > - if (debug) > - fprintf (debug, " : Incompatible types. No coalesce.\n"); > - return; > - } > - > - /* Merge the two partitions. */ > - p3 = partition_union (map->var_partition, p1, p2); > - > - /* Set the root variable of the partition to the better choice, if there is > - one. */ > - if (!ign2 && root2) > - replace_ssa_name_symbol (partition_to_var (map, p3), root2); > - else if (!ign1 && root1) > - replace_ssa_name_symbol (partition_to_var (map, p3), root1); > - else > - gcc_unreachable (); > - > - if (debug) > - { > - fprintf (debug, " --> P%d ", p3); > - print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)), > - TDF_SLIM); > - fprintf (debug, "\n"); > - } > -} > - > - > -namespace { > - > -const pass_data pass_data_rename_ssa_copies = > -{ > - GIMPLE_PASS, /* type */ > - "copyrename", /* name */ > - OPTGROUP_NONE, /* optinfo_flags */ > - TV_TREE_COPY_RENAME, /* tv_id */ > - ( PROP_cfg | PROP_ssa ), /* properties_required */ > - 0, /* properties_provided */ > - 0, /* properties_destroyed */ > - 0, /* todo_flags_start */ > - 0, /* todo_flags_finish */ > -}; > - > -class pass_rename_ssa_copies : public gimple_opt_pass > -{ > -public: > - pass_rename_ssa_copies (gcc::context *ctxt) > - : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt) > - {} > - > - /* opt_pass methods: */ > - opt_pass * clone () { return new 
pass_rename_ssa_copies (m_ctxt); } > - virtual bool gate (function *) { return flag_tree_copyrename != 0; } > - virtual unsigned int execute (function *); > - > -}; // class pass_rename_ssa_copies > - > -/* This function will make a pass through the IL, and attempt to coalesce any > - SSA versions which occur in PHI's or copies. Coalescing is accomplished by > - changing the underlying root variable of all coalesced version. This will > - then cause the SSA->normal pass to attempt to coalesce them all to the same > - variable. */ > - > -unsigned int > -pass_rename_ssa_copies::execute (function *fun) > -{ > - var_map map; > - basic_block bb; > - tree var, part_var; > - gimple stmt; > - unsigned x; > - FILE *debug; > - > - memset (&stats, 0, sizeof (stats)); > - > - if (dump_file && (dump_flags & TDF_DETAILS)) > - debug = dump_file; > - else > - debug = NULL; > - > - map = init_var_map (num_ssa_names); > - > - FOR_EACH_BB_FN (bb, fun) > - { > - /* Scan for real copies. */ > - for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi); > - gsi_next (&gsi)) > - { > - stmt = gsi_stmt (gsi); > - if (gimple_assign_ssa_name_copy_p (stmt)) > - { > - tree lhs = gimple_assign_lhs (stmt); > - tree rhs = gimple_assign_rhs1 (stmt); > - > - copy_rename_partition_coalesce (map, lhs, rhs, debug); > - } > - } > - } > - > - FOR_EACH_BB_FN (bb, fun) > - { > - /* Treat PHI nodes as copies between the result and each argument. */ > - for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi); > - gsi_next (&gsi)) > - { > - size_t i; > - tree res; > - gphi *phi = gsi.phi (); > - res = gimple_phi_result (phi); > - > - /* Do not process virtual SSA_NAMES. */ > - if (virtual_operand_p (res)) > - continue; > - > - /* Make sure to only use the same partition for an argument > - as the result but never the other way around. 
*/ > - if (SSA_NAME_VAR (res) > - && !DECL_IGNORED_P (SSA_NAME_VAR (res))) > - for (i = 0; i < gimple_phi_num_args (phi); i++) > - { > - tree arg = PHI_ARG_DEF (phi, i); > - if (TREE_CODE (arg) == SSA_NAME) > - copy_rename_partition_coalesce (map, res, arg, > - debug); > - } > - /* Else if all arguments are in the same partition try to merge > - it with the result. */ > - else > - { > - int all_p_same = -1; > - int p = -1; > - for (i = 0; i < gimple_phi_num_args (phi); i++) > - { > - tree arg = PHI_ARG_DEF (phi, i); > - if (TREE_CODE (arg) != SSA_NAME) > - { > - all_p_same = 0; > - break; > - } > - else if (all_p_same == -1) > - { > - p = partition_find (map->var_partition, > - SSA_NAME_VERSION (arg)); > - all_p_same = 1; > - } > - else if (all_p_same == 1 > - && p != partition_find (map->var_partition, > - SSA_NAME_VERSION (arg))) > - { > - all_p_same = 0; > - break; > - } > - } > - if (all_p_same == 1) > - copy_rename_partition_coalesce (map, res, > - PHI_ARG_DEF (phi, 0), > - debug); > - } > - } > - } > - > - if (debug) > - dump_var_map (debug, map); > - > - /* Now one more pass to make all elements of a partition share the same > - root variable. 
*/ > - > - for (x = 1; x < num_ssa_names; x++) > - { > - part_var = partition_to_var (map, x); > - if (!part_var) > - continue; > - var = ssa_name (x); > - if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var)) > - continue; > - if (debug) > - { > - fprintf (debug, "Coalesced "); > - print_generic_expr (debug, var, TDF_SLIM); > - fprintf (debug, " to "); > - print_generic_expr (debug, part_var, TDF_SLIM); > - fprintf (debug, "\n"); > - } > - stats.coalesced++; > - replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var)); > - } > - > - statistics_counter_event (fun, "copies coalesced", > - stats.coalesced); > - delete_var_map (map); > - return 0; > -} > - > -} // anon namespace > - > -gimple_opt_pass * > -make_pass_rename_ssa_copies (gcc::context *ctxt) > -{ > - return new pass_rename_ssa_copies (ctxt); > -} > diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c > index 5b00f58..4772558 100644 > --- a/gcc/tree-ssa-live.c > +++ b/gcc/tree-ssa-live.c > @@ -70,88 +70,6 @@ static void verify_live_on_entry (tree_live_info_p); > ssa_name or variable, and vice versa. */ > > > -/* Hashtable helpers. */ > - > -struct tree_int_map_hasher : nofree_ptr_hash <tree_int_map> > -{ > - static inline hashval_t hash (const tree_int_map *); > - static inline bool equal (const tree_int_map *, const tree_int_map *); > -}; > - > -inline hashval_t > -tree_int_map_hasher::hash (const tree_int_map *v) > -{ > - return tree_map_base_hash (v); > -} > - > -inline bool > -tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c) > -{ > - return tree_int_map_eq (v, c); > -} > - > - > -/* This routine will initialize the basevar fields of MAP.
*/ > - > -static void > -var_map_base_init (var_map map) > -{ > - int x, num_part; > - tree var; > - struct tree_int_map *m, *mapstorage; > - > - num_part = num_var_partitions (map); > - hash_table<tree_int_map_hasher> tree_to_index (num_part); > - /* We can have at most num_part entries in the hash tables, so it's > - enough to allocate so many map elements once, saving some malloc > - calls. */ > - mapstorage = m = XNEWVEC (struct tree_int_map, num_part); > - > - /* If a base table already exists, clear it, otherwise create it. */ > - free (map->partition_to_base_index); > - map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part); > - > - /* Build the base variable list, and point partitions at their bases. */ > - for (x = 0; x < num_part; x++) > - { > - struct tree_int_map **slot; > - unsigned baseindex; > - var = partition_to_var (map, x); > - if (SSA_NAME_VAR (var) > - && (!VAR_P (SSA_NAME_VAR (var)) > - || !DECL_IGNORED_P (SSA_NAME_VAR (var)))) > - m->base.from = SSA_NAME_VAR (var); > - else > - /* This restricts what anonymous SSA names we can coalesce > - as it restricts the sets we compute conflicts for. > - Using TREE_TYPE to generate sets is the easies as > - type equivalency also holds for SSA names with the same > - underlying decl. > - > - Check gimple_can_coalesce_p when changing this code. */ > - m->base.from = (TYPE_CANONICAL (TREE_TYPE (var)) > - ? TYPE_CANONICAL (TREE_TYPE (var)) > - : TREE_TYPE (var)); > - /* If base variable hasn't been seen, set it up. */ > - slot = tree_to_index.find_slot (m, INSERT); > - if (!*slot) > - { > - baseindex = m - mapstorage; > - m->to = baseindex; > - *slot = m; > - m++; > - } > - else > - baseindex = (*slot)->to; > - map->partition_to_base_index[x] = baseindex; > - } > - > - map->num_basevars = m - mapstorage; > - > - free (mapstorage); > -} > - > - > /* Remove the base table in MAP.
*/ > > static void > @@ -329,21 +247,17 @@ partition_view_fini (var_map map, bitmap selected) > } > > > -/* Create a partition view which includes all the used partitions in MAP. If > - WANT_BASES is true, create the base variable map as well. */ > +/* Create a partition view which includes all the used partitions in MAP. */ > > void > -partition_view_normal (var_map map, bool want_bases) > +partition_view_normal (var_map map) > { > bitmap used; > > used = partition_view_init (map); > partition_view_fini (map, used); > > - if (want_bases) > - var_map_base_init (map); > - else > - var_map_base_fini (map); > + var_map_base_fini (map); > } > > > @@ -352,7 +266,7 @@ partition_view_normal (var_map map, bool want_bases) > as well. */ > > void > -partition_view_bitmap (var_map map, bitmap only, bool want_bases) > +partition_view_bitmap (var_map map, bitmap only) > { > bitmap used; > bitmap new_partitions = BITMAP_ALLOC (NULL); > @@ -368,10 +282,7 @@ partition_view_bitmap (var_map map, bitmap only, bool want_bases) > } > partition_view_fini (map, new_partitions); > > - if (want_bases) > - var_map_base_init (map); > - else > - var_map_base_fini (map); > + var_map_base_fini (map); > } > > > diff --git a/gcc/tree-ssa-live.h b/gcc/tree-ssa-live.h > index d5d7820..1f88358 100644 > --- a/gcc/tree-ssa-live.h > +++ b/gcc/tree-ssa-live.h > @@ -71,8 +71,8 @@ typedef struct _var_map > extern var_map init_var_map (int); > extern void delete_var_map (var_map); > extern int var_union (var_map, tree, tree); > -extern void partition_view_normal (var_map, bool); > -extern void partition_view_bitmap (var_map, bitmap, bool); > +extern void partition_view_normal (var_map); > +extern void partition_view_bitmap (var_map, bitmap); > extern void dump_scope_blocks (FILE *, int); > extern void debug_scope_block (tree, int); > extern void debug_scope_blocks (int); > diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c > index 437f69d..1fbd71e 100644 > --- a/gcc/tree-ssa-uncprop.c > +++ 
b/gcc/tree-ssa-uncprop.c > @@ -38,6 +38,11 @@ along with GCC; see the file COPYING3. If not see > #include "tree-pass.h" > #include "tree-ssa-propagate.h" > #include "tree-hash-traits.h" > +#include "bitmap.h" > +#include "stringpool.h" > +#include "tree-ssanames.h" > +#include "tree-ssa-live.h" > +#include "tree-ssa-coalesce.h" > > /* The basic structure describing an equivalency created by traversing > an edge. Traversing the edge effectively means that we can assume > diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c > index b5b0cb6..e10f775 100644 > --- a/gcc/var-tracking.c > +++ b/gcc/var-tracking.c > @@ -4909,12 +4909,16 @@ dataflow_set_remove_mem_locs (variable_def **slot, dataflow_set *set) > registers, as well as associations between MEMs and VALUEs. */ > > static void > -dataflow_set_clear_at_call (dataflow_set *set) > +dataflow_set_clear_at_call (dataflow_set *set, rtx_insn *call_insn) > { > unsigned int r; > hard_reg_set_iterator hrsi; > + HARD_REG_SET invalidated_regs; > > - EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, r, hrsi) > + get_call_reg_set_usage (call_insn, &invalidated_regs, > + regs_invalidated_by_call); > + > + EXECUTE_IF_SET_IN_HARD_REG_SET (invalidated_regs, 0, r, hrsi) > var_regno_delete (set, r); > > if (MAY_HAVE_DEBUG_INSNS) > @@ -6698,7 +6702,7 @@ compute_bb_dataflow (basic_block bb) > switch (mo->type) > { > case MO_CALL: > - dataflow_set_clear_at_call (out); > + dataflow_set_clear_at_call (out, insn); > break; > > case MO_USE: > @@ -9160,7 +9164,7 @@ emit_notes_in_bb (basic_block bb, dataflow_set *set) > switch (mo->type) > { > case MO_CALL: > - dataflow_set_clear_at_call (set); > + dataflow_set_clear_at_call (set, insn); > emit_notes_for_changes (insn, EMIT_NOTE_AFTER_CALL_INSN, set->vars); > { > rtx arguments = mo->u.loc, *p = &arguments; > > > > These are the incremental fixes: > > diff --git a/gcc/explow.c b/gcc/explow.c > index 6dba6e5..6941f4e 100644 > --- a/gcc/explow.c > +++ b/gcc/explow.c > @@ -852,6 
+852,13 @@ promote_ssa_mode (const_tree name, int *punsignedp) > { > gcc_assert (TREE_CODE (name) == SSA_NAME); > > + /* Partitions holding parms and results must be promoted as expected > + by function.c. */ > + if (SSA_NAME_VAR (name) > + && (TREE_CODE (SSA_NAME_VAR (name)) == PARM_DECL > + || TREE_CODE (SSA_NAME_VAR (name)) == RESULT_DECL)) > + return promote_decl_mode (SSA_NAME_VAR (name), punsignedp); > + > tree type = TREE_TYPE (name); > int unsignedp = TYPE_UNSIGNED (type); > machine_mode mode = TYPE_MODE (type); > diff --git a/gcc/function.c b/gcc/function.c > index 840f4a2..753d889 100644 > --- a/gcc/function.c > +++ b/gcc/function.c > @@ -2920,14 +2920,16 @@ assign_parm_setup_block (struct assign_parm_data_all *all, > { > DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD); > stack_parm = rtl_for_parm (all, parm); > - if (!stack_parm) > - stack_parm = assign_stack_local (BLKmode, size_stored, > - DECL_ALIGN (parm)); > - else > + if (stack_parm) > stack_parm = copy_rtx (stack_parm); > - if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size) > - PUT_MODE (stack_parm, GET_MODE (entry_parm)); > - set_mem_attributes (stack_parm, parm, 1); > + else > + { > + stack_parm = assign_stack_local (BLKmode, size_stored, > + DECL_ALIGN (parm)); > + if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size) > + PUT_MODE (stack_parm, GET_MODE (entry_parm)); > + set_mem_attributes (stack_parm, parm, 1); > + } > } > > /* If a BLKmode arrives in registers, copy it to a stack slot. Handle > > > -- > Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/ > You must be the change you wish to see in the world. -- Gandhi > Be Free! -- http://FSFLA.org/ FSF Latin America board member > Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
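
For anyone skimming the quoted patch: the base-index computation in compute_optimized_partition_bases boils down to a union-find over the coalescible pairs, followed by an enumeration of the resulting classes. Below is a rough standalone sketch of that idea only. It does not use GCC's data structures; UnionFind, compute_bases and their parameters are hypothetical names made up for illustration, and the real code additionally handles cost-one pairs and PHIs on abnormal edges.

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical stand-in for GCC's `partition` type: a plain union-find.
struct UnionFind {
  std::vector<int> parent;

  explicit UnionFind(int n) : parent(n) {
    std::iota(parent.begin(), parent.end(), 0);  // each element is its own root
  }

  int find(int x) {
    while (parent[x] != x) {
      parent[x] = parent[parent[x]];  // path halving
      x = parent[x];
    }
    return x;
  }

  void unite(int a, int b) { parent[find(a)] = find(b); }
};

// Illustrative only: given `parts` partitions, the pairs that might be
// coalesced, and the partitions that take part in copies (`used`),
// return a base index per partition (-1 if the partition is not used).
// Partitions that could end up coalesced share a base index, so
// conflicts only need to be computed within one base.
std::vector<int> compute_bases(int parts,
                               const std::vector<std::pair<int, int>>& pairs,
                               const std::vector<int>& used,
                               int* num_bases) {
  UnionFind tentative(parts);
  for (const auto& p : pairs)
    tentative.unite(p.first, p.second);

  std::vector<int> index_of_root(parts, -1);  // root -> base index
  std::vector<int> base(parts, -1);
  int count = 0;
  for (int p : used) {
    int root = tentative.find(p);
    if (index_of_root[root] == -1)
      index_of_root[root] = count++;  // first time this class is seen
    base[p] = index_of_root[root];
  }
  *num_bases = count;
  return base;
}
```

With five partitions and candidate pairs (0,1) and (3,4), partitions 0 and 1 share one base, 2 gets its own, and 3 and 4 share a third. The patch does the enumeration in two passes over used_in_copies; the sketch folds them into one, which gives the same numbering.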