On 23/09/15 21:07, Alexandre Oliva wrote:
> On Sep 18, 2015, Alan Lawrence wrote:
>
>> With the latest git commit 2b27ef197ece54c4573c5a748b0d40076e35412c on
>> branch aoliva/pr64164, I am now able to build a cross toolchain for
>> aarch64 and aarch64_be, and can confirm the ABI failure is fixed on
>> the branch.
>

This commit

commit 33cc9081157a8c90460e4c0bdda2ac461a3822cc
Author: aoliva
Date: 2015-09-27 09:02:00 +0000

    revert to assign_parms assignments using default defs
    ...

introduced a test failure on arm-none-eabi (using newlib, compiling with
-mthumb -march=armv8-a -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard):

FAIL: gcc.target/arm/pr43920-2.c scan-assembler-times pop 2
spawn arm-none-eabi-size pr43920-2.o
   text    data     bss     dec     hex filename
     56       0       0      56      38 pr43920-2.o
text size is 56
FAIL: gcc.target/arm/pr43920-2.c object-size text <= 54

(I haven't looked into the failure; the asm output before and after is
attached.)

> Thanks for the confirmation. I've made one further tweak for cris and
> lm32, dropping the assert that caused build failures for libstdc++
> atomics parms that required more alignment than
> MAX_SUPPORTED_STACK_ALIGNMENT, consolidated the patchset and retested it
> with a more recent baseline (r228019), with native regstraps on
> x86_64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu,
> powerpc64le-linux-gnu, and cross toolchain builds for the following 73
> platforms: aarch64_be-elf aarch64-elf arm-eabi armeb-eabihf
> arm-symbianelf avr-elf bfin-elf c6x-elf cr16-elf cris-elf crisv32-elf
> epiphany-elf fido-elf fr30-elf frv-elf ft32-elf h8300-elf i686-elf
> ia64-elf iq2000-elf lm32-elf m32c-elf m32r-elf m32rle-elf m68k-elf
> mcore-elf mep-elf microblaze-elf mips64el-elf mips64-elf mips64orion-elf
> mips64vr-elf mipsel-elf mipsisa32-elfoabi mipsisa64-elfoabi
> mipsisa64r2el-elf mipsisa64r2-sde-elf mipsisa64sb1-elf
> mipsisa64sr71k-elf mipstx39-elf mn10300-elf moxie-elf msp430-elf
> nds32be-elf nds32le-elf nios2-elf pdp11-aout powerpc-eabialtivec
> powerpc-eabi powerpc-eabisimaltivec powerpc-eabisim powerpc-eabispe
> powerpcle-eabi powerpcle-eabisim powerpcle-elf powerpc-xilinx-eabi
> ppc64-eabi ppc-eabi ppc-elf rl78-elf rx-elf sh64-elf sh-elf
> sh-superh-elf sparc64-elf sparc-elf sparc-leon-elf spu-elf v850e-elf
> v850-elf visium-elf xstormy16-elf xtensa-elf. Not all of them succeeded
> in building, but those that didn't failed at the very same spots before
> and after this patch.
>
>
> This patch doesn't really add much functionality. It rather
> reimplements a lot of the ugly and fragile stuff I put in in the
> previous big patchset in a far more robust and pleasant way. It fixes a
> number of regressions in the process, mainly because, instead of
> modifying assign_parms so as to let cfgexpand do part of its job, it
> reverts all of the RTL assignment for parameters and results to
> assign_parms. cfgexpand now leaves the RTL assignment of partitions
> containing default defs or parms and results to assign_parms, and
> assign_parms uses a single callback, set_parm_rtl, to tell cfgexpand the
> assignment for the partition containing the default def of each
> parameter.
>
> This required introducing default defs for all parms and results, even
> if unused; we could refrain from creating them, and refrain from
> initializing those parameters (at least when optimizing), but that would
> require messing with the fragile bits in assign_parms again, and it
> would bring little benefit, since RTL optimization will likely notice
> the initialization is unused and drop it anyway. Besides, adding the
> default defs was actually needed to fix a regression in the previous
> patch, and even with the current patch it helps make sure we don't
> assign more than one default def to the same SSA partition (the previous
> patch attempted to do that, but there was a bug, fixed in the current
> patch). Having unused default defs makes it easier for us to decide
> whether to use an entry_value rtx for the initial debug insn of a parm.
> We track partitions holding default defs for parms and results with a
> bitmap; we used to have a bitmap that tracked partitions holding default
> defs, but it was unused! I just renamed it and repurposed it.
>
> I've also added checking asserts to set_rtl, to verify that, when we
> expect a REG, we get a REG, and that it has the expected mode. set_rtl
> was also adjusted to record anonymous SSA names or their base types in
> attrs of REGs or MEMs, respectively, so that code that relied on the
> attrs to detect properties of the decl types no longer regress just
> because we no longer generate decls for anonymous SSA names. Since
> there were prior uses of types in MEM attrs, that was expected to go
> smoothly, but I was surprised at how smoothly adding SSA names to REG
> attrs went. No adjustments required!
>
> I also tightened a bit the conditions for coalescing: we used to require
> the same canonical type; I've added tests for same alignment
> requirements, and for same signedness. OTOH, I've added a few more
> coalesce candidates for RESULT_DECLs and the newly-added default defs of
> parms and results.
>
> Other relevant changes were in mode promotion. TYPE_MODE would often
> return BLKmode for some vector types, which was fine for some return
> decl RTL with PARALLEL, but that didn't quite work for SSA partitions.
> There were other cases of mode promotion of result decls that failed the
> asserts in set_rtl, that revealed promote_decl_mode didn't call
> promote_function_mode as expected for results.
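
On the tightened coalescing conditions quoted above (same canonical type,
plus the new same-alignment and same-signedness tests): a minimal sketch of
that rule, purely for illustration. It does not use GCC's tree nodes, and it
is not the code from the patch; `struct sketch_type`, its fields, and
`sketch_can_coalesce_p` are hypothetical names made up for this note.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the type properties that the
   coalescing test compares.  These are NOT GCC's tree nodes; every
   name here is an assumption made for this sketch.  */
struct sketch_type
{
  int canonical_id;      /* stands in for the canonical type */
  unsigned align_bits;   /* alignment requirement, in bits */
  bool is_unsigned;      /* signedness */
};

/* Return true if two names whose types are A and B may share a
   partition under the tightened rules: same canonical type (the
   pre-existing test), same alignment and same signedness (the two
   tests the message says were added).  */
bool
sketch_can_coalesce_p (const struct sketch_type *a,
                       const struct sketch_type *b)
{
  if (a->canonical_id != b->canonical_id)
    return false;
  if (a->align_bits != b->align_bits)
    return false;
  if (a->is_unsigned != b->is_unsigned)
    return false;
  return true;
}
```

With this sketch, two names that agree on canonical type but differ only in
alignment (say 32 vs. 64 bits) or only in signedness are kept in separate
partitions, which is the behaviour the description implies.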
>
> The new asserts brought additional requirements: promoting the mode of
> the RTL generated for the static chain, arranging for result decls to be
> assigned to a pseudo where it would formerly have got a BLKmode PARALLEL
> (as mentioned above), and arranging for parms set up by
> assign_parm_setup_block, that would always get a MEM, to instead get a
> REG when use_register_for_decl called for it. In a few cases involving
> complex parms, I couldn't figure out how to avoid a temporary MEM, used
> to adjust padding of the parms, but although undesired, this is not a
> regression, for we used to use the MEM, we'll just load them to
> (coalescible) pseudos and use the pseudos instead of coalescing
> other vars that expected pseudos to the same MEM.
>
> Is this ok to install?
>
>
>
> revert to assign_parms assignments using default defs
>
> From: Alexandre Oliva
>
> Revert the fragile and complicated changes to assign_parms designed to
> enable it to use RTL assignments chosen by cfgexpand, and instead have
> cfgexpand use the RTL assignments by assign_parms, keying them off of
> the default defs that are now necessarily introduced for each parm and
> result. The possible lack of a default def was already a problem, and
> the fallbacks in place were not enough, as shown by PR67312. We now
> have checking asserts in set_rtl that verify that we're assigning to
> each var a piece of RTL that matches the expectations set forth by
> use_register_for_decl.
>
> for gcc/ChangeLog
>
> PR rtl-optimization/64164
> PR tree-optimization/67312
> PR middle-end/67340
> PR middle-end/67490
> PR bootstrap/67597
> * cfgexpand.c (parm_in_stack_slot_p): Remove.
> (ssa_default_def_partition): Remove.
> (get_rtl_for_parm_ssa_default_def): Remove.
> (set_rtl): Check that RTL assignments match expectations.
> Loop on SUBREGs, CONCATs and PARALLELs subexprs. Set only the
> default def location for params and results. Record SSA names
> or types in REG and MEM attrs, respectively.
> (set_parm_rtl): New. > (expand_one_ssa_partition): Drop logic that assigned MEMs with > unassigned addresses. > (adjust_one_expanded_partition_var): Don't accept NULL RTL on > deferred stack alloc vars. > (expand_used_vars): Skip partitions holding parm default defs. > Move adjust_one_expanded_partition_var loop... > (pass_expand::execute): ... here. Drop redundant assert. > Adjust comments before the final loop over all ssa names. > Require assigned rtl of parms and results to match exactly. > Reset its attributes to match them, not any other variables in > the same partition. > (expand_debug_expr): Use entry value for PARM's default defs > only iff they have zero nondebug uses. > * cfgexpand.h (parm_in_stack_slot_p): Remove. > (get_rtl_for_parm_ssa_default_def): Remove. > (set_parm_rtl): Declare. > * doc/invoke.texi: Improve wording. > * explow.c (promote_decl_mode): Fix promote_function_mode for > result decls not by reference. > (promote_ssa_mode): Disregard BLKmode from promote_decl, and > bypass TYPE_MODE to get the actual vector mode. > * function.c: Include tree-dfa.h. Revert 2015-08-14's and > 2015-08-19's changes as follows. Drop include of > basic-block.h and df.h. > (rtl_for_parm): Remove. > (maybe_reset_rtl_for_parm): Remove. > (parm_in_unassigned_mem_p): Remove. > (use_register_for_decl): Add logic for RESULT_DECLs matching > assign_parms' behavior. > (split_complex_args): Revert. > (assign_parms_augmented_arg_list): Revert. Add comment > referencing the logic above. > (assign_parm_adjust_stack_rtl): Revert. > (assign_parm_setup_block): Revert. Use set_parm_rtl instead > of SET_DECL_RTL. Set up a REG if the parm demands so. > (assign_parm_setup_reg): Revert. Consolidated SET_DECL_RTL > calls into a single set_parm_rtl. Set up a temporary RTL > temporarily for expand_assignment. > (assign_parm_setup_stack): Revert. Use set_parm_rtl. > (assign_parms_unsplit_complex): Revert. Use set_parm_rtl. > (assign_bounds): Revert. > (assign_parms): Revert. 
Use set_parm_rtl. > (allocate_struct_function): Relayout result and parms of > non-abstruct functions. > (expand_function_start): Revert. Use set_parm_rtl. If the > result is not a hard reg, create a pseudo from the promoted > mode of the default def. Promote static chain mode. > * tree-outof-ssa.c (remove_ssa_form): Drop unused > partition_has_default_def. Set up > partitions_for_parm_default_defs. > (finish_out_of_ssa): Remove partition_has_default_def. > Release partitions_for_parm_default_defs. > * tree-outof-ssa.h (struct ssaexpand): Remove > partition_has_default_def. Add > partitions_for_parm_default_defs. > * tree-ssa-coalesce.c: Include tree-dfa.h, tm_p.h and > stor-layout.h. > (build_ssa_conflict_graph): Fix conflict-detection of default > defs of even unused default defs of params and results. > (for_all_parms): New. > (create_default_def): New. > (register_default_def): New. > (coalesce_with_default): New. > (create_outofssa_var_map): Create default defs for all parms > and results, and register their partitions. Add GIMPLE_RETURN > operands as coalesce candidates with results. Add default > defs of each parm or result as coalesce candidates with its > other defs. Mark each result def, and each default def of > parms, as used_in_copy. > (gimple_can_coalesce_p): Call it. Call use_register_for_decl > with the ssa names, even anonymous ones. Drop > parm_in_stack_slot_p calls. Require same signedness and > alignment. > (coalesce_ssa_name): Add coalesce candidates for all defs of > each parm and result, even unused ones. > (parm_default_def_partition_arg): New type. > (set_parm_default_def_partition): New. > (get_parm_default_def_partitions): New. > * tree-ssa-coalesce.h (get_parm_default_def_partitions): New. > * tree-ssa-live.c (partition_view_init): Regard unused defs of > parms and results as used. > (verify_live_on_entry): Don't error out just because they're > not live. 
> > for gcc/testsuite/ChangeLog > > PR rtl-optimization/64164 > PR tree-optimization/67312 > * gcc.dg/pr67312.c: New. From Zdenek Sojka. > * gcc.target/i386/stackalign/return-4.c: Add -O. > --- > gcc/cfgexpand.c | 332 +++++++------- > gcc/cfgexpand.h | 3 > gcc/doc/invoke.texi | 9 > gcc/explow.c | 19 + > gcc/function.c | 477 +++++++------------- > gcc/testsuite/gcc.dg/pr67312.c | 7 > .../gcc.target/i386/stackalign/return-4.c | 9 > gcc/tree-outof-ssa.c | 15 - > gcc/tree-outof-ssa.h | 6 > gcc/tree-ssa-coalesce.c | 231 ++++++++-- > gcc/tree-ssa-coalesce.h | 1 > gcc/tree-ssa-live.c | 10 > 12 files changed, 582 insertions(+), 537 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/pr67312.c > > diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c > index 6c9284f..58e55d2 100644 > --- a/gcc/cfgexpand.c > +++ b/gcc/cfgexpand.c > @@ -99,6 +99,8 @@ static rtx expand_debug_expr (tree); > > static bool defer_stack_allocation (tree, bool); > > +static void record_alignment_for_reg_var (unsigned int); > + > /* Return an expression tree corresponding to the RHS of GIMPLE > statement STMT. */ > > @@ -172,111 +174,86 @@ leader_merge (tree cur, tree next) > return cur; > } > > -/* Return true if VAR is a PARM_DECL or a RESULT_DECL that ought to be > - assigned to a stack slot. We can't have expand_one_ssa_partition > - choose their address: the pseudo holding the address would be set > - up too late for assign_params to copy the parameter if needed. > - > - Such parameters are likely passed as a pointer to the value, rather > - than as a value, and so we must not coalesce them, nor allocate > - stack space for them before determining the calling conventions for > - them. > - > - For their SSA_NAMEs, expand_one_ssa_partition emits RTL as MEMs > - with pc_rtx as the address, and then it replaces the pc_rtx with > - NULL so as to make sure the MEM is not used before it is adjusted > - in assign_parm_setup_reg. 
*/ > - > -bool > -parm_in_stack_slot_p (tree var) > -{ > - if (!var || VAR_P (var)) > - return false; > - > - gcc_assert (TREE_CODE (var) == PARM_DECL > - || TREE_CODE (var) == RESULT_DECL); > - > - return !use_register_for_decl (var); > -} > - > -/* Return the partition of the default SSA_DEF for decl VAR. */ > - > -static int > -ssa_default_def_partition (tree var) > -{ > - tree name = ssa_default_def (cfun, var); > - > - if (!name) > - return NO_PARTITION; > - > - return var_to_partition (SA.map, name); > -} > - > -/* Return the RTL for the default SSA def of a PARM or RESULT, if > - there is one. */ > - > -rtx > -get_rtl_for_parm_ssa_default_def (tree var) > -{ > - gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL); > - > - if (!is_gimple_reg (var)) > - return NULL_RTX; > - > - /* If we've already determined RTL for the decl, use it. This is > - not just an optimization: if VAR is a PARM whose incoming value > - is unused, we won't find a default def to use its partition, but > - we still want to use the location of the parm, if it was used at > - all. During assign_parms, until a location is assigned for the > - VAR, RTL can only for a parm or result if we're not coalescing > - across variables, when we know we're coalescing all SSA_NAMEs of > - each parm or result, and we're not coalescing them with names > - pertaining to other variables, such as other parms' default > - defs. */ > - if (DECL_RTL_SET_P (var)) > - { > - gcc_assert (DECL_RTL (var) != pc_rtx); > - return DECL_RTL (var); > - } > - > - int part = ssa_default_def_partition (var); > - if (part == NO_PARTITION) > - return NULL_RTX; > - > - return SA.partition_to_pseudo[part]; > -} > - > /* Associate declaration T with storage space X. If T is no > SSA name this is exactly SET_DECL_RTL, otherwise make the > partition of T associated with X. 
*/ > static inline void > set_rtl (tree t, rtx x) > { > - if (x && SSAVAR (t)) > + gcc_checking_assert (!x > + || !(TREE_CODE (t) == SSA_NAME || is_gimple_reg (t)) > + || (use_register_for_decl (t) > + ? (REG_P (x) > + || (GET_CODE (x) == CONCAT > + && (REG_P (XEXP (x, 0)) > + || SUBREG_P (XEXP (x, 0))) > + && (REG_P (XEXP (x, 1)) > + || SUBREG_P (XEXP (x, 1)))) > + || (GET_CODE (x) == PARALLEL > + && SSAVAR (t) > + && TREE_CODE (SSAVAR (t)) == RESULT_DECL > + && !flag_tree_coalesce_vars)) > + : (MEM_P (x) || x == pc_rtx > + || (GET_CODE (x) == CONCAT > + && MEM_P (XEXP (x, 0)) > + && MEM_P (XEXP (x, 1)))))); > + /* Check that the RTL for SSA_NAMEs and gimple-reg PARM_DECLs and > + RESULT_DECLs has the expected mode. For memory, we accept > + unpromoted modes, since that's what we're likely to get. For > + PARM_DECLs and RESULT_DECLs, we'll have been called by > + set_parm_rtl, which will give us the default def, so we don't > + have to compute it ourselves. For RESULT_DECLs, we accept mode > + mismatches too, as long as we're not coalescing across variables, > + so that we don't reject BLKmode PARALLELs or unpromoted REGs. 
*/ > + gcc_checking_assert (!x || x == pc_rtx || TREE_CODE (t) != SSA_NAME > + || (SSAVAR (t) && TREE_CODE (SSAVAR (t)) == RESULT_DECL > + && !flag_tree_coalesce_vars) > + || !use_register_for_decl (t) > + || GET_MODE (x) == promote_ssa_mode (t, NULL)); > + > + if (x) > { > bool skip = false; > tree cur = NULL_TREE; > - > - if (MEM_P (x)) > - cur = MEM_EXPR (x); > - else if (REG_P (x)) > - cur = REG_EXPR (x); > - else if (GET_CODE (x) == CONCAT > - && REG_P (XEXP (x, 0))) > - cur = REG_EXPR (XEXP (x, 0)); > - else if (GET_CODE (x) == PARALLEL) > - cur = REG_EXPR (XVECEXP (x, 0, 0)); > - else if (x == pc_rtx) > + rtx xm = x; > + > + retry: > + if (MEM_P (xm)) > + cur = MEM_EXPR (xm); > + else if (REG_P (xm)) > + cur = REG_EXPR (xm); > + else if (SUBREG_P (xm)) > + { > + gcc_assert (subreg_lowpart_p (xm)); > + xm = SUBREG_REG (xm); > + goto retry; > + } > + else if (GET_CODE (xm) == CONCAT) > + { > + xm = XEXP (xm, 0); > + goto retry; > + } > + else if (GET_CODE (xm) == PARALLEL) > + { > + xm = XVECEXP (xm, 0, 0); > + gcc_assert (GET_CODE (xm) == EXPR_LIST); > + xm = XEXP (xm, 0); > + goto retry; > + } > + else if (xm == pc_rtx) > skip = true; > else > gcc_unreachable (); > > - tree next = skip ? cur : leader_merge (cur, SSAVAR (t)); > + tree next = skip ? cur : leader_merge (cur, SSAVAR (t) ? SSAVAR (t) : t); > > if (cur != next) > { > if (MEM_P (x)) > - set_mem_attributes (x, next, true); > + set_mem_attributes (x, > + next && TREE_CODE (next) == SSA_NAME > + ? TREE_TYPE (next) > + : next, true); > else > set_reg_attrs_for_decl_rtl (next, x); > } > @@ -294,13 +271,11 @@ set_rtl (tree t, rtx x) > } > /* For the benefit of debug information at -O0 (where > vartracking doesn't run) record the place also in the base > - DECL. 
For PARMs and RESULTs, we may end up resetting these > - in function.c:maybe_reset_rtl_for_parm, but in some rare > - cases we may need them (unused and overwritten incoming > - value, that at -O0 must share the location with the other > - uses in spite of the missing default def), and this may be > - the only chance to preserve them. */ > - if (x && x != pc_rtx && SSA_NAME_VAR (t)) > + DECL. For PARMs and RESULTs, do so only when setting the > + default def. */ > + if (x && x != pc_rtx && SSA_NAME_VAR (t) > + && (VAR_P (SSA_NAME_VAR (t)) > + || SSA_NAME_IS_DEFAULT_DEF (t))) > { > tree var = SSA_NAME_VAR (t); > /* If we don't yet have something recorded, just record it now. */ > @@ -1242,6 +1217,49 @@ account_stack_vars (void) > return size; > } > > +/* Record the RTL assignment X for the default def of PARM. */ > + > +extern void > +set_parm_rtl (tree parm, rtx x) > +{ > + gcc_assert (TREE_CODE (parm) == PARM_DECL > + || TREE_CODE (parm) == RESULT_DECL); > + > + if (x && !MEM_P (x)) > + { > + unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (parm), > + TYPE_MODE (TREE_TYPE (parm)), > + TYPE_ALIGN (TREE_TYPE (parm))); > + > + /* If the variable alignment is very large we'll dynamicaly > + allocate it, which means that in-frame portion is just a > + pointer. ??? We've got a pseudo for sure here, do we > + actually dynamically allocate its spilling area if needed? > + ??? Isn't it a problem when POINTER_SIZE also exceeds > + MAX_SUPPORTED_STACK_ALIGNMENT, as on cris and lm32? 
*/ > + if (align > MAX_SUPPORTED_STACK_ALIGNMENT) > + align = POINTER_SIZE; > + > + record_alignment_for_reg_var (align); > + } > + > + if (!is_gimple_reg (parm)) > + return set_rtl (parm, x); > + > + tree ssa = ssa_default_def (cfun, parm); > + if (!ssa) > + return set_rtl (parm, x); > + > + int part = var_to_partition (SA.map, ssa); > + gcc_assert (part != NO_PARTITION); > + > + bool changed = bitmap_bit_p (SA.partitions_for_parm_default_defs, part); > + gcc_assert (changed); > + > + set_rtl (ssa, x); > + gcc_assert (DECL_RTL (parm) == x); > +} > + > /* A subroutine of expand_one_var. Called to immediately assign rtl > to a variable to be allocated in the stack frame. */ > > @@ -1349,37 +1367,7 @@ expand_one_ssa_partition (tree var) > > if (!use_register_for_decl (var)) > { > - /* We can't risk having the parm assigned to a MEM location > - whose address references a pseudo, for the pseudo will only > - be set up after arguments are copied to the stack slot. > - > - If the parm doesn't have a default def (e.g., because its > - incoming value is unused), then we want to let assign_params > - do the allocation, too. In this case we want to make sure > - SSA_NAMEs associated with the parm don't get assigned to more > - than one partition, lest we'd create two unassigned stac > - slots for the same parm, thus the assert at the end of the > - block. */ > - if (parm_in_stack_slot_p (SSA_NAME_VAR (var)) > - && (ssa_default_def_partition (SSA_NAME_VAR (var)) == part > - || !ssa_default_def (cfun, SSA_NAME_VAR (var)))) > - { > - expand_one_stack_var_at (var, pc_rtx, 0, 0); > - rtx x = SA.partition_to_pseudo[part]; > - gcc_assert (GET_CODE (x) == MEM); > - gcc_assert (XEXP (x, 0) == pc_rtx); > - /* Reset the address, so that any attempt to use it will > - ICE. It will be adjusted in assign_parm_setup_reg. */ > - XEXP (x, 0) = NULL_RTX; > - /* If the RTL associated with the parm is not what we have > - just created, the parm has been split over multiple > - partitions. 
In order for this to work, we must have a > - default def for the parm, otherwise assign_params won't > - know what to do. */ > - gcc_assert (DECL_RTL_IF_SET (SSA_NAME_VAR (var)) == x > - || ssa_default_def (cfun, SSA_NAME_VAR (var))); > - } > - else if (defer_stack_allocation (var, true)) > + if (defer_stack_allocation (var, true)) > add_stack_var (var); > else > expand_one_stack_var_1 (var); > @@ -1393,8 +1381,8 @@ expand_one_ssa_partition (tree var) > set_rtl (var, x); > } > > -/* Record the association between the RTL generated for a partition > - and the underlying variable of the SSA_NAME. */ > +/* Record the association between the RTL generated for partition PART > + and the underlying variable of the SSA_NAME VAR. */ > > static void > adjust_one_expanded_partition_var (tree var) > @@ -1410,12 +1398,7 @@ adjust_one_expanded_partition_var (tree var) > > rtx x = SA.partition_to_pseudo[part]; > > - if (!x) > - { > - /* This var will get a stack slot later. */ > - gcc_assert (defer_stack_allocation (var, true)); > - return; > - } > + gcc_assert (x); > > set_rtl (var, x); > > @@ -2040,6 +2023,9 @@ expand_used_vars (void) > > for (i = 0; i < SA.map->num_partitions; i++) > { > + if (bitmap_bit_p (SA.partitions_for_parm_default_defs, i)) > + continue; > + > tree var = partition_to_var (SA.map, i); > > gcc_assert (!virtual_operand_p (var)); > @@ -2047,9 +2033,6 @@ expand_used_vars (void) > expand_one_ssa_partition (var); > } > > - for (i = 1; i < num_ssa_names; i++) > - adjust_one_expanded_partition_var (ssa_name (i)); > - > if (flag_stack_protect == SPCT_FLAG_STRONG) > gen_stack_protect_signal > = stack_protect_decl_p () || stack_protect_return_slot_p (); > @@ -4947,26 +4930,27 @@ expand_debug_expr (tree exp) > } > else > { > + /* If this is a reference to an incoming value of > + parameter that is never used in the code or where the > + incoming value is never used in the code, use > + PARM_DECL's DECL_RTL if set. 
*/ > + if (SSA_NAME_IS_DEFAULT_DEF (exp) > + && SSA_NAME_VAR (exp) > + && TREE_CODE (SSA_NAME_VAR (exp)) == PARM_DECL > + && has_zero_uses (exp)) > + { > + op0 = expand_debug_parm_decl (SSA_NAME_VAR (exp)); > + if (op0) > + goto adjust_mode; > + op0 = expand_debug_expr (SSA_NAME_VAR (exp)); > + if (op0) > + goto adjust_mode; > + } > + > int part = var_to_partition (SA.map, exp); > > if (part == NO_PARTITION) > - { > - /* If this is a reference to an incoming value of parameter > - that is never used in the code or where the incoming > - value is never used in the code, use PARM_DECL's > - DECL_RTL if set. */ > - if (SSA_NAME_IS_DEFAULT_DEF (exp) > - && TREE_CODE (SSA_NAME_VAR (exp)) == PARM_DECL) > - { > - op0 = expand_debug_parm_decl (SSA_NAME_VAR (exp)); > - if (op0) > - goto adjust_mode; > - op0 = expand_debug_expr (SSA_NAME_VAR (exp)); > - if (op0) > - goto adjust_mode; > - } > - return NULL; > - } > + return NULL; > > gcc_assert (part >= 0 && (unsigned)part < SA.map->num_partitions); > > @@ -6216,9 +6200,26 @@ pass_expand::execute (function *fun) > parm_birth_insn = var_seq; > } > > - /* If we have a class containing differently aligned pointers > - we need to merge those into the corresponding RTL pointer > - alignment. */ > + /* Now propagate the RTL assignment of each partition to the > + underlying var of each SSA_NAME. */ > + for (i = 1; i < num_ssa_names; i++) > + { > + tree name = ssa_name (i); > + > + if (!name > + /* We might have generated new SSA names in > + update_alias_info_with_stack_vars. They will have a NULL > + defining statements, and won't be part of the partitioning, > + so ignore those. */ > + || !SSA_NAME_DEF_STMT (name)) > + continue; > + > + adjust_one_expanded_partition_var (name); > + } > + > + /* Clean up RTL of variables that straddle across multiple > + partitions, and check that the rtl of any PARM_DECLs that are not > + cleaned up is that of their default defs. 
*/ > for (i = 1; i < num_ssa_names; i++) > { > tree name = ssa_name (i); > @@ -6235,9 +6236,6 @@ pass_expand::execute (function *fun) > if (part == NO_PARTITION) > continue; > > - gcc_assert (SA.partition_to_pseudo[part] > - || defer_stack_allocation (name, true)); > - > /* If this decl was marked as living in multiple places, reset > this now to NULL. */ > tree var = SSA_NAME_VAR (name); > @@ -6252,7 +6250,19 @@ pass_expand::execute (function *fun) > rtx in = DECL_RTL_IF_SET (var); > gcc_assert (in); > rtx out = SA.partition_to_pseudo[part]; > - gcc_assert (in == out || rtx_equal_p (in, out)); > + gcc_assert (in == out); > + > + /* Now reset VAR's RTL to IN, so that the _EXPR attrs match > + those expected by debug backends for each parm and for > + the result. This is particularly important for stabs, > + whose register elimination from parm's DECL_RTL may cause > + -fcompare-debug differences as SET_DECL_RTL changes reg's > + attrs. So, make sure the RTL already has the parm as the > + EXPR, so that it won't change. */ > + SET_DECL_RTL (var, NULL_RTX); > + if (MEM_P (in)) > + set_mem_attributes (in, var, true); > + SET_DECL_RTL (var, in); > } > } > > diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h > index ff7f4bef..8852411 100644 > --- a/gcc/cfgexpand.h > +++ b/gcc/cfgexpand.h > @@ -22,8 +22,7 @@ along with GCC; see the file COPYING3. If not see > > extern tree gimple_assign_rhs_to_tree (gimple *); > extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *); > -extern bool parm_in_stack_slot_p (tree); > -extern rtx get_rtl_for_parm_ssa_default_def (tree var); > +extern void set_parm_rtl (tree, rtx); > > > #endif /* GCC_CFGEXPAND_H */ > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > index 09c58ee..aefb061 100644 > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -8866,12 +8866,13 @@ profitable to parallelize the loops. 
> > @item -ftree-coalesce-vars > @opindex ftree-coalesce-vars > -Tell the compiler to attempt to combine small user-defined variables > -too, instead of just compiler temporaries. This may severely limit the > -ability to debug an optimized program compiled with > +While transforming the program out of the SSA representation, attempt to > +reduce copying by coalescing versions of different user-defined > +variables, instead of just compiler temporaries. This may severely > +limit the ability to debug an optimized program compiled with > @option{-fno-var-tracking-assignments}. In the negated form, this flag > prevents SSA coalescing of user variables. This option is enabled by > -default if optimization is enabled. > +default if optimization is enabled, and it does very little otherwise. > > @item -ftree-loop-if-convert > @opindex ftree-loop-if-convert > diff --git a/gcc/explow.c b/gcc/explow.c > index 6941f4e..d104a79 100644 > --- a/gcc/explow.c > +++ b/gcc/explow.c > @@ -830,8 +830,10 @@ promote_decl_mode (const_tree decl, int *punsignedp) > machine_mode mode = DECL_MODE (decl); > machine_mode pmode; > > - if (TREE_CODE (decl) == RESULT_DECL > - || TREE_CODE (decl) == PARM_DECL) > + if (TREE_CODE (decl) == RESULT_DECL && !DECL_BY_REFERENCE (decl)) > + pmode = promote_function_mode (type, mode, &unsignedp, > + TREE_TYPE (current_function_decl), 1); > + else if (TREE_CODE (decl) == RESULT_DECL || TREE_CODE (decl) == PARM_DECL) > pmode = promote_function_mode (type, mode, &unsignedp, > TREE_TYPE (current_function_decl), 2); > else > @@ -857,12 +859,23 @@ promote_ssa_mode (const_tree name, int *punsignedp) > if (SSA_NAME_VAR (name) > && (TREE_CODE (SSA_NAME_VAR (name)) == PARM_DECL > || TREE_CODE (SSA_NAME_VAR (name)) == RESULT_DECL)) > - return promote_decl_mode (SSA_NAME_VAR (name), punsignedp); > + { > + machine_mode mode = promote_decl_mode (SSA_NAME_VAR (name), punsignedp); > + if (mode != BLKmode) > + return mode; > + } > > tree type = TREE_TYPE (name); > int 
unsignedp = TYPE_UNSIGNED (type); > machine_mode mode = TYPE_MODE (type); > > + /* Bypass TYPE_MODE when it maps vector modes to BLKmode. */ > + if (mode == BLKmode) > + { > + gcc_assert (VECTOR_TYPE_P (type)); > + mode = type->type_common.mode; > + } > + > machine_mode pmode = promote_mode (type, mode, &unsignedp); > if (punsignedp) > *punsignedp = unsignedp; > diff --git a/gcc/function.c b/gcc/function.c > index 9b4c2b9..21304689 100644 > --- a/gcc/function.c > +++ b/gcc/function.c > @@ -74,8 +74,6 @@ along with GCC; see the file COPYING3. If not see > #include "cfgbuild.h" > #include "cfgcleanup.h" > #include "cfgexpand.h" > -#include "basic-block.h" > -#include "df.h" > #include "params.h" > #include "bb-reorder.h" > #include "shrink-wrap.h" > @@ -83,6 +81,7 @@ along with GCC; see the file COPYING3. If not see > #include "rtl-iter.h" > #include "tree-chkp.h" > #include "rtl-chkp.h" > +#include "tree-dfa.h" > > /* So we can assign to cfun in this file. */ > #undef cfun > @@ -152,9 +151,6 @@ static bool contains (const_rtx, hash_table *); > static void prepare_function_start (void); > static void do_clobber_return_reg (rtx, void *); > static void do_use_return_reg (rtx, void *); > -static rtx rtl_for_parm (struct assign_parm_data_all *, tree); > -static void maybe_reset_rtl_for_parm (tree); > -static bool parm_in_unassigned_mem_p (tree, rtx); > > > /* Stack of nested functions. */ > @@ -2145,6 +2141,47 @@ use_register_for_decl (const_tree decl) > if (TREE_ADDRESSABLE (decl)) > return false; > > + /* RESULT_DECLs are a bit special in that they're assigned without > + regard to use_register_for_decl, but we generally only store in > + them. If we coalesce their SSA NAMEs, we'd better return a > + result that matches the assignment in expand_function_start. */ > + if (TREE_CODE (decl) == RESULT_DECL) > + { > + /* If it's not an aggregate, we're going to use a REG or a > + PARALLEL containing a REG. 
*/ > + if (!aggregate_value_p (decl, current_function_decl)) > + return true; > + > + /* If expand_function_start determines the return value, we'll > + use MEM if it's not by reference. */ > + if (cfun->returns_pcc_struct > + || (targetm.calls.struct_value_rtx > + (TREE_TYPE (current_function_decl), 1))) > + return DECL_BY_REFERENCE (decl); > + > + /* Otherwise, we're taking an extra all.function_result_decl > + argument. It's set up in assign_parms_augmented_arg_list, > + under the (negated) conditions above, and then it's used to > + set up the RESULT_DECL rtl in assign_params, after looping > + over all parameters. Now, if the RESULT_DECL is not by > + reference, we'll use a MEM either way. */ > + if (!DECL_BY_REFERENCE (decl)) > + return false; > + > + /* Otherwise, if RESULT_DECL is DECL_BY_REFERENCE, it will take > + the function_result_decl's assignment. Since it's a pointer, > + we can short-circuit a number of the tests below, and we must > + duplicat e them because we don't have the > + function_result_decl to test. */ > + if (!targetm.calls.allocate_stack_slots_for_args ()) > + return true; > + /* We don't set DECL_IGNORED_P for the function_result_decl. */ > + if (optimize) > + return true; > + /* We don't set DECL_REGISTER for the function_result_decl. */ > + return false; > + } > + > /* Decl is implicitly addressible by bound stores and loads > if it is an aggregate holding bounds. */ > if (chkp_function_instrumented_p (current_function_decl) > @@ -2272,7 +2309,7 @@ assign_parms_initialize_all (struct assign_parm_data_all *all) > needed, else the old list. 
*/ > > static void > -split_complex_args (struct assign_parm_data_all *all, vec<tree> *args) > +split_complex_args (vec<tree> *args) > { > unsigned i; > tree p; > @@ -2283,7 +2320,6 @@ split_complex_args (struct assign_parm_data_all *all, vec<tree> *args) > if (TREE_CODE (type) == COMPLEX_TYPE > && targetm.calls.split_complex_arg (type)) > { > - tree cparm = p; > tree decl; > tree subtype = TREE_TYPE (type); > bool addressable = TREE_ADDRESSABLE (p); > @@ -2302,9 +2338,6 @@ split_complex_args (struct assign_parm_data_all *all, vec<tree> *args) > DECL_ARTIFICIAL (p) = addressable; > DECL_IGNORED_P (p) = addressable; > TREE_ADDRESSABLE (p) = 0; > - /* Reset the RTL before layout_decl, or it may change the > - mode of the RTL of the original argument copied to P. */ > - SET_DECL_RTL (p, NULL_RTX); > layout_decl (p, 0); > (*args)[i] = p; > > @@ -2316,41 +2349,6 @@ split_complex_args (struct assign_parm_data_all *all, vec<tree> *args) > DECL_IGNORED_P (decl) = addressable; > layout_decl (decl, 0); > args->safe_insert (++i, decl); > - > - /* If we are expanding a function, rather than gimplifying > - it, propagate the RTL of the complex parm to the split > - declarations, and set their contexts so that > - maybe_reset_rtl_for_parm can recognize them and refrain > - from resetting their RTL. */ > - if (currently_expanding_to_rtl) > - { > - maybe_reset_rtl_for_parm (cparm); > - rtx rtl = rtl_for_parm (all, cparm); > - if (rtl) > - { > - /* If this is parm is unassigned, assign it now: the > - newly-created decls wouldn't expect the need for > - assignment, and if they were assigned > - independently, they might not end up in adjacent > - slots, so unsplit wouldn't be able to fill in the > - unassigned address of the complex MEM.
*/ > - if (parm_in_unassigned_mem_p (cparm, rtl)) > - { > - int align = STACK_SLOT_ALIGNMENT > - (TREE_TYPE (cparm), GET_MODE (rtl), MEM_ALIGN (rtl)); > - rtx loc = assign_stack_local > - (GET_MODE (rtl), GET_MODE_SIZE (GET_MODE (rtl)), > - align); > - XEXP (rtl, 0) = XEXP (loc, 0); > - } > - > - SET_DECL_RTL (p, read_complex_part (rtl, false)); > - SET_DECL_RTL (decl, read_complex_part (rtl, true)); > - > - DECL_CONTEXT (p) = cparm; > - DECL_CONTEXT (decl) = cparm; > - } > - } > } > } > } > @@ -2386,6 +2384,9 @@ assign_parms_augmented_arg_list (struct assign_parm_data_all *all) > DECL_ARTIFICIAL (decl) = 1; > DECL_NAMELESS (decl) = 1; > TREE_CONSTANT (decl) = 1; > + /* We don't set DECL_IGNORED_P or DECL_REGISTER here. If this > + changes, the end of the RESULT_DECL handling block in > + use_register_for_decl must be adjusted to match. */ > > DECL_CHAIN (decl) = all->orig_fnargs; > all->orig_fnargs = decl; > @@ -2413,7 +2414,7 @@ assign_parms_augmented_arg_list (struct assign_parm_data_all *all) > > /* If the target wants to split complex arguments into scalars, do so. */ > if (targetm.calls.split_complex_arg) > - split_complex_args (all, &fnargs); > + split_complex_args (&fnargs); > > return fnargs; > } > @@ -2816,98 +2817,23 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data) > data->entry_parm = entry_parm; > } > > -/* Wrapper for use_register_for_decl, that special-cases the > - .result_ptr as the function's RESULT_DECL when the RESULT_DECL is > - passed by reference. 
*/ > - > -static bool > -use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm) > -{ > - if (parm == all->function_result_decl) > - { > - tree result = DECL_RESULT (current_function_decl); > - > - if (DECL_BY_REFERENCE (result)) > - parm = result; > - } > - > - return use_register_for_decl (parm); > -} > - > -/* Wrapper for get_rtl_for_parm_ssa_default_def, that special-cases > - the .result_ptr as the function's RESULT_DECL when the RESULT_DECL > - is passed by reference. */ > - > -static rtx > -rtl_for_parm (struct assign_parm_data_all *all, tree parm) > -{ > - if (parm == all->function_result_decl) > - { > - tree result = DECL_RESULT (current_function_decl); > - > - if (!DECL_BY_REFERENCE (result)) > - return NULL_RTX; > - > - parm = result; > - } > - > - return get_rtl_for_parm_ssa_default_def (parm); > -} > - > -/* Reset the location of PARM_DECLs and RESULT_DECLs that had > - SSA_NAMEs in multiple partitions, so that assign_parms will choose > - the default def, if it exists, or create new RTL to hold the unused > - entry value. If we are coalescing across variables, we want to > - reset the location too, because a parm without a default def > - (incoming value unused) might be coalesced with one with a default > - def, and then assign_parms would copy both incoming values to the > - same location, which might cause the wrong value to survive. */ > -static void > -maybe_reset_rtl_for_parm (tree parm) > -{ > - gcc_assert (TREE_CODE (parm) == PARM_DECL > - || TREE_CODE (parm) == RESULT_DECL); > - > - /* This is a split complex parameter, and its context was set to its > - original PARM_DECL in split_complex_args so that we could > - recognize it here and not reset its RTL. 
*/ > - if (DECL_CONTEXT (parm) && TREE_CODE (DECL_CONTEXT (parm)) == PARM_DECL) > - { > - DECL_CONTEXT (parm) = DECL_CONTEXT (DECL_CONTEXT (parm)); > - return; > - } > - > - if ((flag_tree_coalesce_vars > - || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx)) > - && is_gimple_reg (parm)) > - SET_DECL_RTL (parm, NULL_RTX); > -} > - > /* A subroutine of assign_parms. Adjust DATA->STACK_RTL such that it's > always valid and properly aligned. */ > > static void > -assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm, > - struct assign_parm_data_one *data) > +assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data) > { > rtx stack_parm = data->stack_parm; > > - /* If out-of-SSA assigned RTL to the parm default def, make sure we > - don't use what we might have computed before. */ > - rtx ssa_assigned = rtl_for_parm (all, parm); > - if (ssa_assigned) > - stack_parm = NULL; > - > /* If we can't trust the parm stack slot to be aligned enough for its > ultimate type, don't use that slot after entry. We'll make another > stack slot, if we need one. */ > - else if (stack_parm > - && ((STRICT_ALIGNMENT > - && (GET_MODE_ALIGNMENT (data->nominal_mode) > - > MEM_ALIGN (stack_parm))) > - || (data->nominal_type > - && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm) > - && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY))) > + if (stack_parm > + && ((STRICT_ALIGNMENT > + && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm)) > + || (data->nominal_type > + && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm) > + && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY))) > stack_parm = NULL; > > /* If parm was passed in memory, and we need to convert it on entry, > @@ -2952,27 +2878,6 @@ assign_parm_setup_block_p (struct assign_parm_data_one *data) > return false; > } > > -/* Return true if FROM_EXPAND is a MEM with an address to be filled in > - by assign_params. 
This should be the case if, and only if, > - parm_in_stack_slot_p holds for the parm DECL that expanded to > - FROM_EXPAND, so we check that, too. */ > - > -static bool > -parm_in_unassigned_mem_p (tree decl, rtx from_expand) > -{ > - bool result = MEM_P (from_expand) && !XEXP (from_expand, 0); > - > - gcc_assert (result == parm_in_stack_slot_p (decl) > - /* Maybe it was already assigned. That's ok, especially > - for split complex args. */ > - || (!result && MEM_P (from_expand) > - && (XEXP (from_expand, 0) == virtual_stack_vars_rtx > - || (GET_CODE (XEXP (from_expand, 0)) == PLUS > - && XEXP (XEXP (from_expand, 0), 0) == virtual_stack_vars_rtx)))); > - > - return result; > -} > - > /* A subroutine of assign_parms. Arrange for the parameter to be > present and valid in DATA->STACK_RTL. */ > > @@ -2982,38 +2887,39 @@ assign_parm_setup_block (struct assign_parm_data_all *all, > { > rtx entry_parm = data->entry_parm; > rtx stack_parm = data->stack_parm; > + rtx target_reg = NULL_RTX; > HOST_WIDE_INT size; > HOST_WIDE_INT size_stored; > > if (GET_CODE (entry_parm) == PARALLEL) > entry_parm = emit_group_move_into_temps (entry_parm); > > + /* If we want the parameter in a pseudo, don't use a stack slot. */ > + if (is_gimple_reg (parm) && use_register_for_decl (parm)) > + { > + tree def = ssa_default_def (cfun, parm); > + gcc_assert (def); > + machine_mode mode = promote_ssa_mode (def, NULL); > + rtx reg = gen_reg_rtx (mode); > + if (GET_CODE (reg) != CONCAT) > + stack_parm = reg; > + else > + /* This will use or allocate a stack slot that we'd rather > + avoid. FIXME: Could we avoid it in more cases? 
*/ > + target_reg = reg; > + data->stack_parm = NULL; > + } > + > size = int_size_in_bytes (data->passed_type); > size_stored = CEIL_ROUND (size, UNITS_PER_WORD); > - > if (stack_parm == 0) > { > DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD); > - rtx from_expand = rtl_for_parm (all, parm); > - if (from_expand && !parm_in_unassigned_mem_p (parm, from_expand)) > - stack_parm = copy_rtx (from_expand); > - else > - { > - stack_parm = assign_stack_local (BLKmode, size_stored, > - DECL_ALIGN (parm)); > - if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size) > - PUT_MODE (stack_parm, GET_MODE (entry_parm)); > - if (from_expand) > - { > - gcc_assert (GET_CODE (stack_parm) == MEM); > - gcc_assert (parm_in_unassigned_mem_p (parm, from_expand)); > - XEXP (from_expand, 0) = XEXP (stack_parm, 0); > - PUT_MODE (from_expand, GET_MODE (stack_parm)); > - stack_parm = copy_rtx (from_expand); > - } > - else > - set_mem_attributes (stack_parm, parm, 1); > - } > + stack_parm = assign_stack_local (BLKmode, size_stored, > + DECL_ALIGN (parm)); > + if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size) > + PUT_MODE (stack_parm, GET_MODE (entry_parm)); > + set_mem_attributes (stack_parm, parm, 1); > } > > /* If a BLKmode arrives in registers, copy it to a stack slot. Handle > @@ -3054,11 +2960,6 @@ assign_parm_setup_block (struct assign_parm_data_all *all, > else if (size == 0) > ; > > - /* MEM may be a REG if coalescing assigns the param's partition > - to a pseudo. */ > - else if (REG_P (mem)) > - emit_move_insn (mem, entry_parm); > - > /* If SIZE is that of a mode no bigger than a word, just use > that mode's store operation. 
*/ > else if (size <= UNITS_PER_WORD) > @@ -3113,10 +3014,14 @@ assign_parm_setup_block (struct assign_parm_data_all *all, > tem = change_address (mem, word_mode, 0); > emit_move_insn (tem, x); > } > + else if (!MEM_P (mem)) > + emit_move_insn (mem, entry_parm); > else > move_block_from_reg (REGNO (entry_parm), mem, > size_stored / UNITS_PER_WORD); > } > + else if (!MEM_P (mem)) > + emit_move_insn (mem, entry_parm); > else > move_block_from_reg (REGNO (entry_parm), mem, > size_stored / UNITS_PER_WORD); > @@ -3131,8 +3036,14 @@ assign_parm_setup_block (struct assign_parm_data_all *all, > end_sequence (); > } > > + if (target_reg) > + { > + emit_move_insn (target_reg, stack_parm); > + stack_parm = target_reg; > + } > + > data->stack_parm = stack_parm; > - SET_DECL_RTL (parm, stack_parm); > + set_parm_rtl (parm, stack_parm); > } > > /* A subroutine of assign_parms. Allocate a pseudo to hold the current > @@ -3148,6 +3059,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, > int unsignedp = TYPE_UNSIGNED (TREE_TYPE (parm)); > bool did_conversion = false; > bool need_conversion, moved; > + rtx rtl; > > /* Store the parm in a pseudoregister during the function, but we may > need to do it in a wider mode. 
Using 2 here makes the result > @@ -3156,40 +3068,19 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, > = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp, > TREE_TYPE (current_function_decl), 2); > > - rtx from_expand = parmreg = rtl_for_parm (all, parm); > - > - if (from_expand && !data->passed_pointer) > - { > - if (GET_MODE (parmreg) != promoted_nominal_mode) > - parmreg = gen_lowpart (promoted_nominal_mode, parmreg); > - } > - else if (!from_expand || parm_in_unassigned_mem_p (parm, from_expand)) > - { > - parmreg = gen_reg_rtx (promoted_nominal_mode); > - if (!DECL_ARTIFICIAL (parm)) > - mark_user_reg (parmreg); > - > - if (from_expand) > - { > - gcc_assert (data->passed_pointer); > - gcc_assert (GET_CODE (from_expand) == MEM > - && XEXP (from_expand, 0) == NULL_RTX); > - XEXP (from_expand, 0) = parmreg; > - } > - } > + parmreg = gen_reg_rtx (promoted_nominal_mode); > + if (!DECL_ARTIFICIAL (parm)) > + mark_user_reg (parmreg); > > /* If this was an item that we received a pointer to, > - set DECL_RTL appropriately. */ > - if (from_expand) > - SET_DECL_RTL (parm, from_expand); > - else if (data->passed_pointer) > + set rtl appropriately. */ > + if (data->passed_pointer) > { > - rtx x = gen_rtx_MEM (TYPE_MODE (TREE_TYPE (data->passed_type)), parmreg); > - set_mem_attributes (x, parm, 1); > - SET_DECL_RTL (parm, x); > + rtl = gen_rtx_MEM (TYPE_MODE (TREE_TYPE (data->passed_type)), parmreg); > + set_mem_attributes (rtl, parm, 1); > } > else > - SET_DECL_RTL (parm, parmreg); > + rtl = parmreg; > > assign_parm_remove_parallels (data); > > @@ -3197,13 +3088,10 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, > assign_parm_find_data_types and expand_expr_real_1. 
*/ > > equiv_stack_parm = data->stack_parm; > - if (!equiv_stack_parm) > - equiv_stack_parm = data->entry_parm; > validated_mem = validize_mem (copy_rtx (data->entry_parm)); > > need_conversion = (data->nominal_mode != data->passed_mode > || promoted_nominal_mode != data->promoted_mode); > - gcc_assert (!(need_conversion && data->passed_pointer && from_expand)); > moved = false; > > if (need_conversion > @@ -3327,7 +3215,9 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, > > /* TREE_USED gets set erroneously during expand_assignment. */ > save_tree_used = TREE_USED (parm); > + SET_DECL_RTL (parm, rtl); > expand_assignment (parm, make_tree (data->nominal_type, tempreg), false); > + SET_DECL_RTL (parm, NULL_RTX); > TREE_USED (parm) = save_tree_used; > all->first_conversion_insn = get_insns (); > all->last_conversion_insn = get_last_insn (); > @@ -3335,28 +3225,16 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, > > did_conversion = true; > } > - /* We don't want to copy the incoming pointer to a parmreg expected > - to hold the value rather than the pointer. */ > - else if (!data->passed_pointer || parmreg != from_expand) > + else > emit_move_insn (parmreg, validated_mem); > > /* If we were passed a pointer but the actual value can safely live > in a register, retrieve it and use it directly. */ > - if (data->passed_pointer > - && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode)) > + if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode) > { > - rtx src = DECL_RTL (parm); > - > /* We can't use nominal_mode, because it will have been set to > Pmode above. We must use the actual mode of the parm. 
*/ > - if (from_expand) > - { > - parmreg = from_expand; > - gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm))); > - src = gen_rtx_MEM (GET_MODE (parmreg), validated_mem); > - set_mem_attributes (src, parm, 1); > - } > - else if (use_register_for_decl (parm)) > + if (use_register_for_decl (parm)) > { > parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm))); > mark_user_reg (parmreg); > @@ -3373,14 +3251,14 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, > set_mem_attributes (parmreg, parm, 1); > } > > - if (GET_MODE (parmreg) != GET_MODE (src)) > + if (GET_MODE (parmreg) != GET_MODE (rtl)) > { > - rtx tempreg = gen_reg_rtx (GET_MODE (src)); > + rtx tempreg = gen_reg_rtx (GET_MODE (rtl)); > int unsigned_p = TYPE_UNSIGNED (TREE_TYPE (parm)); > > push_to_sequence2 (all->first_conversion_insn, > all->last_conversion_insn); > - emit_move_insn (tempreg, src); > + emit_move_insn (tempreg, rtl); > tempreg = convert_to_mode (GET_MODE (parmreg), tempreg, unsigned_p); > emit_move_insn (parmreg, tempreg); > all->first_conversion_insn = get_insns (); > @@ -3389,18 +3267,18 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, > > did_conversion = true; > } > - else if (GET_MODE (parmreg) == BLKmode) > - gcc_assert (parm_in_stack_slot_p (parm)); > else > - emit_move_insn (parmreg, src); > + emit_move_insn (parmreg, rtl); > > - SET_DECL_RTL (parm, parmreg); > + rtl = parmreg; > > /* STACK_PARM is the pointer, not the parm, and PARMREG is > now the parm. */ > - data->stack_parm = equiv_stack_parm = NULL; > + data->stack_parm = NULL; > } > > + set_parm_rtl (parm, rtl); > + > /* Mark the register as eliminable if we did no conversion and it was > copied from memory at a fixed offset, and the arg pointer was not > copied to a pseudo-reg. If the arg pointer is a pseudo reg or the > @@ -3408,11 +3286,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, > make here would screw up life analysis for it. 
*/ > if (data->nominal_mode == data->passed_mode > && !did_conversion > - && equiv_stack_parm != 0 > - && MEM_P (equiv_stack_parm) > + && data->stack_parm != 0 > + && MEM_P (data->stack_parm) > && data->locate.offset.var == 0 > && reg_mentioned_p (virtual_incoming_args_rtx, > - XEXP (equiv_stack_parm, 0))) > + XEXP (data->stack_parm, 0))) > { > rtx_insn *linsn = get_last_insn (); > rtx_insn *sinsn; > @@ -3425,8 +3303,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, > = GET_MODE_INNER (GET_MODE (parmreg)); > int regnor = REGNO (XEXP (parmreg, 0)); > int regnoi = REGNO (XEXP (parmreg, 1)); > - rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0); > - rtx stacki = adjust_address_nv (equiv_stack_parm, submode, > + rtx stackr = adjust_address_nv (data->stack_parm, submode, 0); > + rtx stacki = adjust_address_nv (data->stack_parm, submode, > GET_MODE_SIZE (submode)); > > /* Scan backwards for the set of the real and > @@ -3444,7 +3322,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, > set_unique_reg_note (sinsn, REG_EQUIV, stackr); > } > } > - else > + else > set_dst_reg_note (linsn, REG_EQUIV, equiv_stack_parm, parmreg); > } > > @@ -3496,16 +3374,6 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm, > if (data->entry_parm != data->stack_parm) > { > rtx src, dest; > - rtx from_expand = NULL_RTX; > - > - if (data->stack_parm == 0) > - { > - from_expand = rtl_for_parm (all, parm); > - if (from_expand) > - gcc_assert (GET_MODE (from_expand) == GET_MODE (data->entry_parm)); > - if (from_expand && !parm_in_unassigned_mem_p (parm, from_expand)) > - data->stack_parm = from_expand; > - } > > if (data->stack_parm == 0) > { > @@ -3516,16 +3384,7 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm, > = assign_stack_local (GET_MODE (data->entry_parm), > GET_MODE_SIZE (GET_MODE (data->entry_parm)), > align); > - if (!from_expand) > - set_mem_attributes (data->stack_parm, parm, 1); > - 
else > - { > - gcc_assert (GET_CODE (data->stack_parm) == MEM); > - gcc_assert (parm_in_unassigned_mem_p (parm, from_expand)); > - XEXP (from_expand, 0) = XEXP (data->stack_parm, 0); > - PUT_MODE (from_expand, GET_MODE (data->stack_parm)); > - data->stack_parm = copy_rtx (from_expand); > - } > + set_mem_attributes (data->stack_parm, parm, 1); > } > > dest = validize_mem (copy_rtx (data->stack_parm)); > @@ -3554,7 +3413,7 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm, > end_sequence (); > } > > - SET_DECL_RTL (parm, data->stack_parm); > + set_parm_rtl (parm, data->stack_parm); > } > > /* A subroutine of assign_parms. If the ABI splits complex arguments, then > @@ -3580,21 +3439,11 @@ assign_parms_unsplit_complex (struct assign_parm_data_all *all, > imag = DECL_RTL (fnargs[i + 1]); > if (inner != GET_MODE (real)) > { > - real = simplify_gen_subreg (inner, real, GET_MODE (real), > - subreg_lowpart_offset > - (inner, GET_MODE (real))); > - imag = simplify_gen_subreg (inner, imag, GET_MODE (imag), > - subreg_lowpart_offset > - (inner, GET_MODE (imag))); > + real = gen_lowpart_SUBREG (inner, real); > + imag = gen_lowpart_SUBREG (inner, imag); > } > > - if ((tmp = rtl_for_parm (all, parm)) != NULL_RTX > - && rtx_equal_p (real, > - read_complex_part (tmp, false)) > - && rtx_equal_p (imag, > - read_complex_part (tmp, true))) > - ; /* We now have the right rtl in tmp. 
*/ > - else if (TREE_ADDRESSABLE (parm)) > + if (TREE_ADDRESSABLE (parm)) > { > rtx rmem, imem; > HOST_WIDE_INT size = int_size_in_bytes (TREE_TYPE (parm)); > @@ -3618,7 +3467,7 @@ assign_parms_unsplit_complex (struct assign_parm_data_all *all, > } > else > tmp = gen_rtx_CONCAT (DECL_MODE (parm), real, imag); > - SET_DECL_RTL (parm, tmp); > + set_parm_rtl (parm, tmp); > > real = DECL_INCOMING_RTL (fnargs[i]); > imag = DECL_INCOMING_RTL (fnargs[i + 1]); > @@ -3740,7 +3589,7 @@ assign_bounds (vec &bndargs, > assign_parm_setup_block (&all, pbdata->bounds_parm, > &pbdata->parm_data); > else if (pbdata->parm_data.passed_pointer > - || use_register_for_parm_decl (&all, pbdata->bounds_parm)) > + || use_register_for_decl (pbdata->bounds_parm)) > assign_parm_setup_reg (&all, pbdata->bounds_parm, > &pbdata->parm_data); > else > @@ -3784,8 +3633,6 @@ assign_parms (tree fndecl) > DECL_INCOMING_RTL (parm) = DECL_RTL (parm); > continue; > } > - else > - maybe_reset_rtl_for_parm (parm); > > /* Estimate stack alignment from parameter alignment. */ > if (SUPPORTS_STACK_ALIGNMENT) > @@ -3835,7 +3682,7 @@ assign_parms (tree fndecl) > else > set_decl_incoming_rtl (parm, data.entry_parm, false); > > - assign_parm_adjust_stack_rtl (&all, parm, &data); > + assign_parm_adjust_stack_rtl (&data); > > /* Bounds should be loaded in the particular order to > have registers allocated correctly. 
Collect info about > @@ -3856,8 +3703,7 @@ assign_parms (tree fndecl) > { > if (assign_parm_setup_block_p (&data)) > assign_parm_setup_block (&all, parm, &data); > - else if (data.passed_pointer > - || use_register_for_parm_decl (&all, parm)) > + else if (data.passed_pointer || use_register_for_decl (parm)) > assign_parm_setup_reg (&all, parm, &data); > else > assign_parm_setup_stack (&all, parm, &data); > @@ -3954,7 +3800,7 @@ assign_parms (tree fndecl) > > DECL_HAS_VALUE_EXPR_P (result) = 1; > > - SET_DECL_RTL (result, x); > + set_parm_rtl (result, x); > } > > /* We have aligned all the args, so add space for the pretend args. */ > @@ -4986,6 +4832,18 @@ allocate_struct_function (tree fndecl, bool abstract_p) > if (fndecl != NULL_TREE) > { > tree result = DECL_RESULT (fndecl); > + > + if (!abstract_p) > + { > + /* Now that we have activated any function-specific attributes > + that might affect layout, particularly vector modes, relayout > + each of the parameters and the result. */ > + relayout_decl (result); > + for (tree parm = DECL_ARGUMENTS (fndecl); parm; > + parm = DECL_CHAIN (parm)) > + relayout_decl (parm); > + } > + > if (!abstract_p && aggregate_value_p (result, fndecl)) > { > #ifdef PCC_STATIC_STRUCT_RETURN > @@ -5189,7 +5047,6 @@ expand_function_start (tree subr) > > /* Decide whether to return the value in memory or in a register. */ > tree res = DECL_RESULT (subr); > - maybe_reset_rtl_for_parm (res); > if (aggregate_value_p (res, subr)) > { > /* Returning something that won't go in a register. */ > @@ -5210,10 +5067,7 @@ expand_function_start (tree subr) > it. 
*/ > if (sv) > { > - if (DECL_BY_REFERENCE (res)) > - value_address = get_rtl_for_parm_ssa_default_def (res); > - if (!value_address) > - value_address = gen_reg_rtx (Pmode); > + value_address = gen_reg_rtx (Pmode); > emit_move_insn (value_address, sv); > } > } > @@ -5222,33 +5076,35 @@ expand_function_start (tree subr) > rtx x = value_address; > if (!DECL_BY_REFERENCE (res)) > { > - x = get_rtl_for_parm_ssa_default_def (res); > - if (!x) > - { > - x = gen_rtx_MEM (DECL_MODE (res), value_address); > - set_mem_attributes (x, res, 1); > - } > + x = gen_rtx_MEM (DECL_MODE (res), x); > + set_mem_attributes (x, res, 1); > } > - SET_DECL_RTL (res, x); > + set_parm_rtl (res, x); > } > } > else if (DECL_MODE (res) == VOIDmode) > /* If return mode is void, this decl rtl should not be used. */ > - SET_DECL_RTL (res, NULL_RTX); > - else > + set_parm_rtl (res, NULL_RTX); > + else > { > /* Compute the return values into a pseudo reg, which we will copy > into the true return register after the cleanups are done. */ > tree return_type = TREE_TYPE (res); > - rtx x = get_rtl_for_parm_ssa_default_def (res); > - if (x) > - /* Use it. */; > + /* If we may coalesce this result, make sure it has the expected > + mode. */ > + if (flag_tree_coalesce_vars && is_gimple_reg (res)) > + { > + tree def = ssa_default_def (cfun, res); > + gcc_assert (def); > + machine_mode mode = promote_ssa_mode (def, NULL); > + set_parm_rtl (res, gen_reg_rtx (mode)); > + } > else if (TYPE_MODE (return_type) != BLKmode > && targetm.calls.return_in_msb (return_type)) > /* expand_function_end will insert the appropriate padding in > this case. Use the return value's natural (unpadded) mode > within the function proper. 
*/ > - x = gen_reg_rtx (TYPE_MODE (return_type)); > + set_parm_rtl (res, gen_reg_rtx (TYPE_MODE (return_type))); > else > { > /* In order to figure out what mode to use for the pseudo, we > @@ -5259,16 +5115,14 @@ expand_function_start (tree subr) > /* Structures that are returned in registers are not > aggregate_value_p, so we may see a PARALLEL or a REG. */ > if (REG_P (hard_reg)) > - x = gen_reg_rtx (GET_MODE (hard_reg)); > + set_parm_rtl (res, gen_reg_rtx (GET_MODE (hard_reg))); > else > { > gcc_assert (GET_CODE (hard_reg) == PARALLEL); > - x = gen_group_rtx (hard_reg); > + set_parm_rtl (res, gen_group_rtx (hard_reg)); > } > } > > - SET_DECL_RTL (res, x); > - > /* Set DECL_REGISTER flag so that expand_function_end will copy the > result to the real return register(s). */ > DECL_REGISTER (res) = 1; > @@ -5291,22 +5145,23 @@ expand_function_start (tree subr) > { > tree parm = cfun->static_chain_decl; > rtx local, chain; > - rtx_insn *insn; > + rtx_insn *insn; > + int unsignedp; > > - local = get_rtl_for_parm_ssa_default_def (parm); > - if (!local) > - local = gen_reg_rtx (Pmode); > + local = gen_reg_rtx (promote_decl_mode (parm, &unsignedp)); > chain = targetm.calls.static_chain (current_function_decl, true); > > set_decl_incoming_rtl (parm, chain, false); > - SET_DECL_RTL (parm, local); > + set_parm_rtl (parm, local); > mark_reg_pointer (local, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (parm)))); > > - if (GET_MODE (local) != Pmode) > - local = convert_to_mode (Pmode, local, > - TYPE_UNSIGNED (TREE_TYPE (parm))); > - > - insn = emit_move_insn (local, chain); > + if (GET_MODE (local) != GET_MODE (chain)) > + { > + convert_move (local, chain, unsignedp); > + insn = get_last_insn (); > + } > + else > + insn = emit_move_insn (local, chain); > > /* Mark the register as eliminable, similar to parameters. 
*/ > if (MEM_P (chain) > diff --git a/gcc/testsuite/gcc.dg/pr67312.c b/gcc/testsuite/gcc.dg/pr67312.c > new file mode 100644 > index 0000000..f1c9fde > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/pr67312.c > @@ -0,0 +1,7 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O0 -ftree-coalesce-vars" } */ > + > +void foo (int x, int y) > +{ > + y = x; > +} > diff --git a/gcc/testsuite/gcc.target/i386/stackalign/return-4.c b/gcc/testsuite/gcc.target/i386/stackalign/return-4.c > index a1e35dc..d14eb2f 100644 > --- a/gcc/testsuite/gcc.target/i386/stackalign/return-4.c > +++ b/gcc/testsuite/gcc.target/i386/stackalign/return-4.c > @@ -1,6 +1,13 @@ > /* { dg-do compile } */ > -/* { dg-options "-mpreferred-stack-boundary=4" } */ > +/* { dg-options "-mpreferred-stack-boundary=4 -O" } */ > /* { dg-final { scan-assembler-not "and\[lq\]?\[^\\n\]*-64,\[^\\n\]*sp" } } */ > +/* We only guarantee we won't generate the stack alignment when > + optimizing. When not optimizing, the return value will be assigned > + to a pseudo with the specified alignment, which in turn will force > + stack alignment since the pseudo might have to be spilled. Without > + optimization, we wouldn't compute the actual stack requirements > + after register allocation and reload, and just use the conservative > + estimate. */ > > /* This compile only test is to detect an assertion failure in stack branch > development. 
*/ > diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c > index fd00883..8dc4908 100644 > --- a/gcc/tree-outof-ssa.c > +++ b/gcc/tree-outof-ssa.c > @@ -980,7 +980,6 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa) > { > bitmap values = NULL; > var_map map; > - unsigned i; > > map = coalesce_ssa_name (); > > @@ -1005,17 +1004,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa) > > sa->map = map; > sa->values = values; > - sa->partition_has_default_def = BITMAP_ALLOC (NULL); > - for (i = 1; i < num_ssa_names; i++) > - { > - tree t = ssa_name (i); > - if (t && SSA_NAME_IS_DEFAULT_DEF (t)) > - { > - int p = var_to_partition (map, t); > - if (p != NO_PARTITION) > - bitmap_set_bit (sa->partition_has_default_def, p); > - } > - } > + sa->partitions_for_parm_default_defs = get_parm_default_def_partitions (map); > } > > > @@ -1190,7 +1179,7 @@ finish_out_of_ssa (struct ssaexpand *sa) > if (sa->values) > BITMAP_FREE (sa->values); > delete_var_map (sa->map); > - BITMAP_FREE (sa->partition_has_default_def); > + BITMAP_FREE (sa->partitions_for_parm_default_defs); > memset (sa, 0, sizeof *sa); > } > > diff --git a/gcc/tree-outof-ssa.h b/gcc/tree-outof-ssa.h > index 687e5a5..60b6379 100644 > --- a/gcc/tree-outof-ssa.h > +++ b/gcc/tree-outof-ssa.h > @@ -39,9 +39,9 @@ struct ssaexpand > a pseudos REG). */ > rtx *partition_to_pseudo; > > - /* If partition I contains an SSA name that has a default def, > - bit I will be set in this bitmap. */ > - bitmap partition_has_default_def; > + /* If partition I contains an SSA name that has a default def for a > + parameter, bit I will be set in this bitmap. */ > + bitmap partitions_for_parm_default_defs; > }; > > /* This is the singleton described above. */ > diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c > index 8af6583..ff75877 100644 > --- a/gcc/tree-ssa-coalesce.c > +++ b/gcc/tree-ssa-coalesce.c > @@ -39,7 +39,9 @@ along with GCC; see the file COPYING3. 
If not see > #include "cfgexpand.h" > #include "explow.h" > #include "diagnostic-core.h" > - > +#include "tree-dfa.h" > +#include "tm_p.h" > +#include "stor-layout.h" > > /* This set of routines implements a coalesce_list. This is an object which > is used to track pairs of ssa_names which are desirable to coalesce > @@ -877,26 +879,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo) > } > > /* Pretend there are defs for params' default defs at the start > - of the (post-)entry block. */ > + of the (post-)entry block. This will prevent PARM_DECLs from > + coalescing into the same partition. Although RESULT_DECLs' > + default defs don't have a useful initial value, we have to > + prevent them from coalescing with PARM_DECLs' default defs > + too, otherwise assign_parms would attempt to assign different > + RTL to the same partition. */ > if (bb == entry) > { > - unsigned base; > - bitmap_iterator bi; > - EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi) > + unsigned i; > + for (i = 1; i < num_ssa_names; i++) > { > - bitmap_iterator bi2; > - unsigned part; > - EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base], > - 0, part, bi2) > - { > - tree var = partition_to_var (map, part); > - if (!SSA_NAME_VAR (var) > - || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL > - && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL) > - || !SSA_NAME_IS_DEFAULT_DEF (var)) > - continue; > - live_track_process_def (live, var, graph); > - } > + tree var = ssa_name (i); > + > + if (!var > + || !SSA_NAME_IS_DEFAULT_DEF (var) > + || !SSA_NAME_VAR (var) > + || VAR_P (SSA_NAME_VAR (var))) > + continue; > + > + live_track_process_def (live, var, graph); > + /* Process a use too, so that it remains live and > + conflicts with other parms' default defs, even unused > + ones. 
*/ > + live_track_process_use (live, var); > } > } > > @@ -937,6 +943,71 @@ fail_abnormal_edge_coalesce (int x, int y) > internal_error ("SSA corruption"); > } > > +/* Call CALLBACK for all PARM_DECLs and RESULT_DECLs for which > + assign_parms may ask for a default partition. */ > + > +static void > +for_all_parms (void (*callback)(tree var, void *arg), void *arg) > +{ > + for (tree var = DECL_ARGUMENTS (current_function_decl); var; > + var = DECL_CHAIN (var)) > + callback (var, arg); > + if (!VOID_TYPE_P (TREE_TYPE (DECL_RESULT (current_function_decl)))) > + callback (DECL_RESULT (current_function_decl), arg); > + if (cfun->static_chain_decl) > + callback (cfun->static_chain_decl, arg); > +} > + > +/* Create a default def for VAR. */ > + > +static void > +create_default_def (tree var, void *arg ATTRIBUTE_UNUSED) > +{ > + if (!is_gimple_reg (var)) > + return; > + > + tree ssa = get_or_create_ssa_default_def (cfun, var); > + gcc_assert (ssa); > +} > + > +/* Register VAR's default def in MAP. */ > + > +static void > +register_default_def (tree var, void *map_) > +{ > + var_map map = (var_map)map_; > + > + if (!is_gimple_reg (var)) > + return; > + > + tree ssa = ssa_default_def (cfun, var); > + gcc_assert (ssa); > + > + register_ssa_partition (map, ssa); > +} > + > +/* If VAR is an SSA_NAME associated with a PARM_DECL or a RESULT_DECL, > + and the DECL's default def is unused (i.e., it was introduced by > + create_default_def), mark VAR and the default def for > + coalescing. 
*/ > + > +static void > +coalesce_with_default (tree var, coalesce_list_p cl, bitmap used_in_copy) > +{ > + if (SSA_NAME_IS_DEFAULT_DEF (var) > + || !SSA_NAME_VAR (var) > + || VAR_P (SSA_NAME_VAR (var))) > + return; > + > + tree ssa = ssa_default_def (cfun, SSA_NAME_VAR (var)); > + if (!has_zero_uses (ssa)) > + return; > + > + add_cost_one_coalesce (cl, SSA_NAME_VERSION (ssa), SSA_NAME_VERSION (var)); > + bitmap_set_bit (used_in_copy, SSA_NAME_VERSION (var)); > + /* Default defs will have their used_in_copy bits set at the end of > + create_outofssa_var_map. */ > +} > > /* This function creates a var_map for the current function as well as creating > a coalesce list for use later in the out of ssa process. */ > @@ -954,8 +1025,12 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy) > int v1, v2, cost; > unsigned i; > > + for_all_parms (create_default_def, NULL); > + > map = init_var_map (num_ssa_names); > > + for_all_parms (register_default_def, map); > + > FOR_EACH_BB_FN (bb, cfun) > { > tree arg; > @@ -1034,6 +1109,30 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy) > } > break; > > + case GIMPLE_RETURN: > + { > + tree res = DECL_RESULT (current_function_decl); > + if (VOID_TYPE_P (TREE_TYPE (res)) > + || !is_gimple_reg (res)) > + break; > + tree rhs1 = gimple_return_retval (as_a <greturn *> (stmt)); > + if (!rhs1) > + break; > + tree lhs = ssa_default_def (cfun, res); > + gcc_assert (lhs); > + if (TREE_CODE (rhs1) == SSA_NAME > + && gimple_can_coalesce_p (lhs, rhs1)) > + { > + v1 = SSA_NAME_VERSION (lhs); > + v2 = SSA_NAME_VERSION (rhs1); > + cost = coalesce_cost_bb (bb); > + add_coalesce (cl, v1, v2, cost); > + bitmap_set_bit (used_in_copy, v1); > + bitmap_set_bit (used_in_copy, v2); > + } > + break; > + } > + > case GIMPLE_ASM: > { > gasm *asm_stmt = as_a <gasm *> (stmt); > @@ -1100,10 +1199,13 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy) > var = ssa_name (i); > if (var != NULL_TREE && !virtual_operand_p (var)) > { > +
coalesce_with_default (var, cl, used_in_copy); > + > /* Add coalesces between all the result decls. */ > if (SSA_NAME_VAR (var) > && TREE_CODE (SSA_NAME_VAR (var)) == RESULT_DECL) > { > + bitmap_set_bit (used_in_copy, SSA_NAME_VERSION (var)); > if (first == NULL_TREE) > first = var; > else > @@ -1111,8 +1213,6 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy) > gcc_assert (gimple_can_coalesce_p (var, first)); > v1 = SSA_NAME_VERSION (first); > v2 = SSA_NAME_VERSION (var); > - bitmap_set_bit (used_in_copy, v1); > - bitmap_set_bit (used_in_copy, v2); > cost = coalesce_cost_bb (EXIT_BLOCK_PTR_FOR_FN (cfun)); > add_coalesce (cl, v1, v2, cost); > } > @@ -1121,7 +1221,9 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy) > since they will have to be coalesced with the base variable. If > not marked as present, they won't be in the coalesce view. */ > if (SSA_NAME_IS_DEFAULT_DEF (var) > - && !has_zero_uses (var)) > + && (!has_zero_uses (var) > + || (SSA_NAME_VAR (var) > + && !VAR_P (SSA_NAME_VAR (var))))) > bitmap_set_bit (used_in_copy, SSA_NAME_VERSION (var)); > } > } > @@ -1367,30 +1469,38 @@ gimple_can_coalesce_p (tree name1, tree name2) > > /* We don't want to coalesce two SSA names if one of the base > variables is supposed to be a register while the other is > - supposed to be on the stack. Anonymous SSA names take > - registers, but when not optimizing, user variables should go > - on the stack, so coalescing them with the anonymous variable > - as the partition leader would end up assigning the user > - variable to a register. Don't do that! */ > - bool reg1 = !var1 || use_register_for_decl (var1); > - bool reg2 = !var2 || use_register_for_decl (var2); > + supposed to be on the stack. 
Anonymous SSA names most often > + take registers, but when not optimizing, user variables > + should go on the stack, so coalescing them with the anonymous > + variable as the partition leader would end up assigning the > + user variable to a register. Don't do that! */ > + bool reg1 = use_register_for_decl (name1); > + bool reg2 = use_register_for_decl (name2); > if (reg1 != reg2) > return false; > > - /* Check that the promoted modes are the same. We don't want to > - coalesce if the promoted modes would be different. Only > + /* Check that the promoted modes and unsignedness are the same. > + We don't want to coalesce if the promoted modes would be > + different, or if they would sign-extend differently. Only > PARM_DECLs and RESULT_DECLs have different promotion rules, > so skip the test if both are variables, or both are anonymous > - SSA_NAMEs. Now, if a parm or result has BLKmode, do not > - coalesce its SSA versions with those of any other variables, > - because it may be passed by reference. */ > + SSA_NAMEs. */ > + int unsigned1, unsigned2; > return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2))) > - || (/* The case var1 == var2 is already covered above. */ > - !parm_in_stack_slot_p (var1) > - && !parm_in_stack_slot_p (var2) > - && promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL)); > + || ((promote_ssa_mode (name1, &unsigned1) > + == promote_ssa_mode (name2, &unsigned2)) > + && unsigned1 == unsigned2); > } > > + /* If alignment requirements are different, we can't coalesce. */ > + if (MINIMUM_ALIGNMENT (t1, > + var1 ? DECL_MODE (var1) : TYPE_MODE (t1), > + var1 ? LOCAL_DECL_ALIGNMENT (var1) : TYPE_ALIGN (t1)) > + != MINIMUM_ALIGNMENT (t2, > + var2 ? DECL_MODE (var2) : TYPE_MODE (t2), > + var2 ? LOCAL_DECL_ALIGNMENT (var2) : TYPE_ALIGN (t2))) > + return false; > + > /* If the types are not the same, check for a canonical type match. 
This > (for example) allows coalescing when the types are fundamentally the > same, but just have different names. > @@ -1639,7 +1749,8 @@ coalesce_ssa_name (void) > if (a > && SSA_NAME_VAR (a) > && !DECL_IGNORED_P (SSA_NAME_VAR (a)) > - && (!has_zero_uses (a) || !SSA_NAME_IS_DEFAULT_DEF (a))) > + && (!has_zero_uses (a) || !SSA_NAME_IS_DEFAULT_DEF (a) > + || !VAR_P (SSA_NAME_VAR (a)))) > { > tree *slot = ssa_name_hash.find_slot (a, INSERT); > > @@ -1721,3 +1832,47 @@ coalesce_ssa_name (void) > > return map; > } > + > +/* We need to pass two arguments to set_parm_default_def_partition, > + but for_all_parms only supports one. Use a pair. */ > + > +typedef std::pair<var_map, bitmap> parm_default_def_partition_arg; > + > +/* Set in ARG's PARTS bitmap the bit corresponding to the partition in > + ARG's MAP containing VAR's default def. */ > + > +static void > +set_parm_default_def_partition (tree var, void *arg_) > +{ > + parm_default_def_partition_arg *arg = (parm_default_def_partition_arg *)arg_; > + var_map map = arg->first; > + bitmap parts = arg->second; > + > + if (!is_gimple_reg (var)) > + return; > + > + tree ssa = ssa_default_def (cfun, var); > + gcc_assert (ssa); > + > + int version = var_to_partition (map, ssa); > + gcc_assert (version != NO_PARTITION); > + > + bool changed = bitmap_set_bit (parts, version); > + gcc_assert (changed); > +} > + > +/* Allocate and return a bitmap that has a bit set for each partition > + that contains a default def for a parameter.
*/ > + > +extern bitmap > +get_parm_default_def_partitions (var_map map) > +{ > + bitmap parm_default_def_parts = BITMAP_ALLOC (NULL); > + > + parm_default_def_partition_arg > + arg = std::make_pair (map, parm_default_def_parts); > + > + for_all_parms (set_parm_default_def_partition, &arg); > + > + return parm_default_def_parts; > +} > diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h > index ae289b4..8316f34 100644 > --- a/gcc/tree-ssa-coalesce.h > +++ b/gcc/tree-ssa-coalesce.h > @@ -22,5 +22,6 @@ along with GCC; see the file COPYING3. If not see > > extern var_map coalesce_ssa_name (void); > extern bool gimple_can_coalesce_p (tree, tree); > +extern bitmap get_parm_default_def_partitions (var_map); > > #endif /* GCC_TREE_SSA_COALESCE_H */ > diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c > index e031725..25b548b 100644 > --- a/gcc/tree-ssa-live.c > +++ b/gcc/tree-ssa-live.c > @@ -200,7 +200,9 @@ partition_view_init (var_map map) > tmp = partition_find (map->var_partition, x); > if (ssa_name (tmp) != NULL_TREE && !virtual_operand_p (ssa_name (tmp)) > && (!has_zero_uses (ssa_name (tmp)) > - || !SSA_NAME_IS_DEFAULT_DEF (ssa_name (tmp)))) > + || !SSA_NAME_IS_DEFAULT_DEF (ssa_name (tmp)) > + || (SSA_NAME_VAR (ssa_name (tmp)) > + && !VAR_P (SSA_NAME_VAR (ssa_name (tmp)))))) > bitmap_set_bit (used, tmp); > } > > @@ -1404,6 +1406,12 @@ verify_live_on_entry (tree_live_info_p live) > } > if (ok) > continue; > + /* Expand adds unused default defs for PARM_DECLs and > + RESULT_DECLs. They're ok. */ > + if (has_zero_uses (var) > + && SSA_NAME_VAR (var) > + && !VAR_P (SSA_NAME_VAR (var))) > + continue; > num++; > print_generic_expr (stderr, var, TDF_SLIM); > fprintf (stderr, " is not marked live-on-entry to entry BB%d ", > > > -- > Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/ > You must be the change you wish to see in the world. -- Gandhi > Be Free! 
-- http://FSFLA.org/ FSF Latin America board member > Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer >
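For anyone following the callback plumbing in the quoted patch (for_all_parms forwarding a single void * that set_parm_default_def_partition unpacks as a pair), here is a rough standalone sketch of the same pattern. It is only an illustration under stated assumptions: toy_map, toy_bitmap, toy_arg, for_all_toy_parms, set_toy_partition and get_toy_partitions are hypothetical stand-ins that use plain STL containers instead of GCC's var_map and bitmap, and nothing here is code from the branch.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-ins, not GCC types: a "parm" is just an index,
// and the "map" sends a parm index to a partition number.
typedef std::vector<int> toy_map;    // index: parm, value: partition
typedef std::set<int> toy_bitmap;    // set of partition numbers

// Two arguments packed into one object, so a single void * can carry both.
typedef std::pair<const toy_map *, toy_bitmap *> toy_arg;

// Same shape as for_all_parms: one opaque pointer is forwarded to CALLBACK.
static void
for_all_toy_parms (const std::vector<int> &parms,
                   void (*callback) (int parm, void *arg), void *arg)
{
  for (std::size_t i = 0; i < parms.size (); i++)
    callback (parms[i], arg);
}

// Unpack the pair and record the partition that holds PARM.  As in the
// quoted patch, assert that no other parm already claimed that partition.
static void
set_toy_partition (int parm, void *arg_)
{
  toy_arg *arg = static_cast<toy_arg *> (arg_);
  const toy_map &map = *arg->first;
  toy_bitmap &parts = *arg->second;

  bool changed = parts.insert (map[parm]).second;
  assert (changed);
}

// Collect the partitions of all PARMS under MAP.
static toy_bitmap
get_toy_partitions (const toy_map &map, const std::vector<int> &parms)
{
  toy_bitmap parts;
  toy_arg arg = std::make_pair (&map, &parts);
  for_all_toy_parms (parms, set_toy_partition, &arg);
  return parts;
}
```

The assert on the insert result mirrors the gcc_assert (changed) in the patch: if two parms ever landed in the same partition, the second insert would report no change and the check would fire.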