References: <551A2C7C.8060005@redhat.com> <5522AF73.5000706@redhat.com>
Date: Tue, 09 Jun 2015 08:58:00 -0000
Subject: Re: [PR64164] drop copyrename, integrate into expand
From: 
Christophe Lyon
To: Alexandre Oliva
Cc: GCC Patches
Content-Type: text/plain; charset=UTF-8
X-IsSubscribed: yes
X-SW-Source: 2015-06/txt/msg00638.txt.bz2

On 8 June 2015 at 10:14, Richard Biener wrote:
> On Sat, Jun 6, 2015 at 3:14 AM, Alexandre Oliva wrote:
>> On Apr 27, 2015, Richard Biener wrote:
>>
>>> This should also mention that is_gimple_reg vars do not have their
>>> address taken.
>>
>> check
>>
>>>> +static tree
>>>> +leader_merge (tree cur, tree next)
>>
>>> Ick - presumably you can't use sth better than a TREE_LIST here?
>>
>> The list was an experiment that never really worked, and when I tried to
>> make it work after the patch, it proved to be unworkable, so I dropped
>> it, and rewrote leader_merge to choose either of the params, preferring
>> anonymous over ignored over named, so as to reduce the likelihood of
>> misreading of debug dumps, since that's all they're used for.
>>
>>>> static void
>>>> -expand_one_stack_var (tree var)
>>>> +expand_one_stack_var_1 (tree var)
>>>> {
>>>> HOST_WIDE_INT size, offset;
>>>> unsigned byte_align;
>>>>
>>>> - size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
>>>> - byte_align = align_local_variable (SSAVAR (var));
>>>> + if (TREE_CODE (var) != SSA_NAME || SSA_NAME_VAR (var))
>>>> + {
>>>> + size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
>>>> + byte_align = align_local_variable (SSAVAR (var));
>>>> + }
>>>> + else
>>
>>> I'd go here for all TREE_CODE (var) == SSA_NAME
>>
>> Check
>>
>>> (and get rid of the SSAVAR macro?)
>>
>> There are remaining uses that don't seem worth dropping it for.
>>
>>>> +/* Return the promoted mode for name. If it is a named SSA_NAME, it
>>>> + is the same as promote_decl_mode. Otherwise, it is the promoted
>>>> + mode of a temp decl of same type as the SSA_NAME, if we had created
>>>> + one. */
>>>> +
>>>> +machine_mode
>>>> +promote_ssa_mode (const_tree name, int *punsignedp)
>>>> +{
>>>> + gcc_assert (TREE_CODE (name) == SSA_NAME);
>>>> +
>>>> + if (SSA_NAME_VAR (name))
>>>> + return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
>>
>>> As above I'd rather not have different paths for anonymous vs. non-anonymous
>>> vars (so just delete the above two lines).
>>
>> Check
>>
>>>> @@ -9668,6 +9678,11 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>>>> pmode = promote_function_mode (type, mode, &unsignedp,
>>>> gimple_call_fntype (g),
>>>> 2);
>>>> + else if (!exp)
>>>> + {
>>>> + gcc_assert (code == SSA_NAME);
>>
>>> promote_ssa_mode should assert this.
>>
>>>> + pmode = promote_ssa_mode (ssa_name, &unsignedp);
>>
>> It does, so... check.
>>
>>
>>>> @@ -2121,6 +2122,15 @@ aggregate_value_p (const_tree exp, const_tree fntype)
>>>> bool
>>>> use_register_for_decl (const_tree decl)
>>>> {
>>>> + if (TREE_CODE (decl) == SSA_NAME)
>>>> + {
>>>> + if (!SSA_NAME_VAR (decl))
>>>> + return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
>>>> + && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl)));
>>>> +
>>>> + decl = SSA_NAME_VAR (decl);
>>
>>> See above. Please drop the SSA_NAME_VAR != NULL path.
>>
>> Check, then taken back, after a bootstrap failure and some debugging
>> made me realize this would be wrong. Here are the nearly-added comments
>> that explain why:
>>
>> /* We often try to use the SSA_NAME, instead of its underlying
>> decl, to get type information and guide decisions, to avoid
>> differences of behavior between anonymous and named
>> variables, but in this one case we have to go for the actual
>> variable if there is one. The main reason is that, at least
>> at -O0, we want to place user variables on the stack, but we
>> don't mind using pseudos for anonymous or ignored temps.
>> Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
>> should go in pseudos, whereas their corresponding variables
>> might have to go on the stack. So, disregarding the decl
>> here would negatively impact debug info at -O0, enable
>> coalescing between SSA_NAMEs that ought to get different
>> stack/pseudo assignments, and get the incoming argument
>> processing thoroughly confused by PARM_DECLs expected to live
>> in stack slots but assigned to pseudos. */
>>
>>
>>>> +++ b/gcc/gimple-expr.h
>>>> +/* Defined in tree-ssa-coalesce.c. */
>>>> +extern bool gimple_can_coalesce_p (tree, tree);
>>
>>> Err, put it to tree-ssa-coalesce.h?
>>
>> Check. Lots of additional headers required to be able to include
>> tree-ssa-coalesce.h, though.
>>
>>
>>>> - gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var)));
>>>> + gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var ? var : name)));
>>
>>> The TREE_TYPE of name and its SSA_NAME_VAR are always the same. So just
>>> use TREE_TYPE (name) here.
>>
>> Check
>>
>>>> gcc_assert (!REG_P (dest_rtx)
>>>> - || dest_mode == promote_decl_mode (var, &unsignedp));
>>>> + || dest_mode == promote_ssa_mode (name, &unsignedp));
>>>>
>>>> if (src_mode != dest_mode)
>>>> {
>>>> @@ -714,12 +715,12 @@ static rtx
>>>> get_temp_reg (tree name)
>>>> {
>>>> tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
>>>> - tree type = TREE_TYPE (var);
>>>> + tree type = var ? TREE_TYPE (var) : TREE_TYPE (name);
>>
>>> See above.
>>
>> Check
>>
>>
>> Here's the revised patch, regstrapped on x86_64-linux-gnu and
>> i686-linux-gnu. The first attempt failed to compile libjava on x86_64,
>> requiring the new change in tree-ssa-loop-niter.c to pass. It didn't
>> occur in the unpatched tree because the differences between anon or
>> named SSA_NAMEs in copyrename changed costs and caused different choices
>> in ivopts, which ultimately failed to expose the problem in loop-niter
>> during vrp.
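
For readers following the use_register_for_decl exchange quoted above, here is a rough toy model of the rule the quoted comment describes. It is only a sketch and not GCC code: struct toy_decl, struct toy_ssa_name and toy_use_register are hypothetical names, the BLKmode and -ffloat-store tests are taken from the quoted hunk, and the "user variable goes on the stack at -O0" branch is a simplification of what the comment says.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a decl: only the property the quoted
   comment talks about.  */
struct toy_decl
{
  bool ignored;                 /* compiler temp, not relevant for debug info */
};

/* Hypothetical stand-in for an SSA name.  */
struct toy_ssa_name
{
  const struct toy_decl *var;   /* NULL for an anonymous SSA name */
  bool blk_mode;                /* type has BLKmode */
  bool is_float;                /* type is a floating-point type */
};

/* Return true if NAME may live in a pseudo, false if it needs a
   stack slot.  The decision follows the underlying variable when
   there is one, which is the point of the quoted comment.  */
static bool
toy_use_register (const struct toy_ssa_name *name, int optimize,
                  bool float_store)
{
  assert (name != NULL);

  /* Mirrors the two tests in the quoted hunk.  */
  if (name->blk_mode)
    return false;
  if (float_store && name->is_float)
    return false;

  /* Anonymous or ignored temps: a pseudo is fine even at -O0.  */
  if (name->var == NULL || name->var->ignored)
    return true;

  /* Simplification: a user variable gets a pseudo only when optimizing.  */
  return optimize > 0;
}
```

With this model, an SSA name of a user variable at -O0 is sent to the stack while an anonymous name of the same type is not, which is why the two cannot simply share one code path keyed on the SSA name's type.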
>>
>> At the end, I enclose the incremental changes since the previous
>> revision of the patch, to ease the incremental review.
>>
>> Ok to install?
>
> Ok.
>
> Thanks,
> Richard.
>
>>
>> for gcc/ChangeLog
>>
>> PR rtl-optimization/64164
>> * Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
>> * tree-ssa-copyrename.c: Removed.
>> * opts.c (default_options_table): Drop -ftree-copyrename. Add
>> -ftree-coalesce-vars.
>> * passes.def: Drop all occurrences of pass_rename_ssa_copies.
>> * common.opt (ftree-copyrename): Ignore.
>> (ftree-coalesce-inlined-vars): Likewise.
>> * doc/invoke.texi: Remove the ignored options above.
>> * gimple-expr.h (gimple_can_coalesce_p): Move declaration
>> * tree-ssa-coalesce.h: ... here.
>> * tree-ssa-uncprop.c: Include tree-ssa-coalesce.h and other
>> headers required by it.
>> * gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
>> across variables when flag_tree_coalesce_vars. Check register
>> use and promoted modes to allow coalescing. Moved to
>> tree-ssa-coalesce.c.
>> * tree-ssa-live.c (struct tree_int_map_hasher): Move along
>> with its member functions to tree-ssa-coalesce.c.
>> (var_map_base_init): Likewise. Renamed to
>> compute_samebase_partition_bases.
>> (partition_view_normal): Drop want_bases parameter.
>> (partition_view_bitmap): Likewise.
>> * tree-ssa-live.h: Adjust declarations.
>> * tree-ssa-coalesce.c: Include explow.h.
>> (build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs's
>> default defs at the entry point.
>> (dump_part_var_map): New.
>> (compute_optimized_partition_bases): New, called by...
>> (coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
>> of compute_samebase_partition_bases. Adjust.
>> * alias.c (nonoverlapping_memrefs_p): Disregard gimple-regs.
>> * cfgexpand.c (leader_merge): New.
>> (get_rtl_for_parm_ssa_default_def): New.
>> (set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
>> vars. Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
>> (expand_one_stack_var_at): Handle anonymous SSA_NAMEs. Drop
>> redundant MEM attr setting.
>> (expand_one_stack_var_1): Handle anonymous SSA_NAMEs. Renamed
>> from...
>> (expand_one_stack_var): ... this. New wrapper to check and
>> skip already expanded SSA partitions.
>> (record_alignment_for_reg_var): New, factored out of...
>> (expand_one_var): ... this.
>> (expand_one_ssa_partition): New.
>> (adjust_one_expanded_partition_var): New.
>> (expand_one_register_var): Check and skip already expanded SSA
>> partitions.
>> (expand_used_vars): Don't create DECLs for anonymous SSA
>> names. Expand all SSA partitions, then adjust all SSA names.
>> (pass::execute): Replace the loops that set
>> SA.partition_to_pseudo from partition leaders and cleared
>> DECL_RTL for multi-location variables, and that which used to
>> rename vars and set attrs, with one that clears DECL_RTL and
>> checks that PARMs and RESULTs default_defs match DECL_RTL.
>> * cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
>> * emit-rtl.c (set_reg_attrs_for_parm): Handle NULL decl.
>> * explow.c (promote_ssa_mode): New.
>> * explow.h (promote_ssa_mode): Declare.
>> * expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
>> * function.c: Include cfgexpand.h.
>> (use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
>> (use_register_for_parm_decl): Wrapper for the above to
>> special-case the result_ptr.
>> (rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
>> (maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
>> multiple locations.
>> (assign_parm_adjust_stack_rtl): Add all and parm arguments,
>> for rtl_for_parm. For SSA-assigned parms, zero stack_parm.
>> (assign_parm_setup_block): Prefer SSA-assigned location.
>> (assign_parm_setup_reg): Likewise. Use entry_parm for equiv
>> if stack_parm is NULL.
>> (assign_parm_setup_stack): Prefer SSA-assigned location.
>> (assign_parms): Maybe reset DECL_RTL of params. Adjust stack
>> rtl before testing for pointer bounds. Special-case result_ptr.
>> (expand_function_start): Maybe reset DECL_RTL of result.
>> Prefer SSA-assigned location for result and static chain.
>> Factor out DECL_RESULT and SET_DECL_RTL.
>> * tree-outof-ssa.c (insert_value_copy_on_edge): Handle
>> anonymous SSA names. Use promote_ssa_mode.
>> (get_temp_reg): Likewise.
>> (remove_ssa_form): Adjust.
>> * var-tracking.c (dataflow_set_clear_at_call): Take call_insn
>> and get its reg_usage for reg invalidation.
>> (compute_bb_dataflow): Pass it insn.
>> (emit_notes_in_bb): Likewise.
>> * tree-ssa-loop-niter.c (loop_exits_before_overflow): Don't
>> fail assert on conversion between unsigned types.
>>

Hi,

This patch causes a GCC build failure for target armeb-linux-gnueabihf
(configured with --with-mode=arm --with-cpu=cortex-a9 --with-fpu=neon)
during the libgcc compilation.

Here is the failing command and the backtrace I have:

/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/builds/gcc-fsf-trunk/obj-armeb-none-linux-gnueabihf/gcc1/./gcc/xgcc -B/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/builds/gcc-fsf-trunk/obj-armeb-none-linux-gnueabihf/gcc1/./gcc/ -B/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/builds/gcc-fsf-trunk/tools/armeb-none-linux-gnueabihf/bin/ -B/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/builds/gcc-fsf-trunk/tools/armeb-none-linux-gnueabihf/lib/ -isystem /media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/builds/gcc-fsf-trunk/tools/armeb-none-linux-gnueabihf/include -isystem /media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/builds/gcc-fsf-trunk/tools/armeb-none-linux-gnueabihf/sys-include -g -O2 -O2 -g -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -isystem ./include -fPIC -fno-inline -g -DIN_LIBGCC2 -fbuilding-libgcc -fno-stack-protector -Dinhibit_libc -fPIC -fno-inline -I. -I. -I../.././gcc -I/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc -I/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/. -I/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/../gcc -I/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/../include -DHAVE_CC_TLS -o _addQQ.o -MT _addQQ.o -MD -MP -MF _addQQ.dep -DL_add -DQQ_MODE -c /media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/fixed-bit.c -fvisibility=hidden -DHIDE_EXPORTS
In file included from /media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/fixed-bit.c:55:0:
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/fixed-bit.c: In function '__gnu_addqq3':
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/fixed-bit.h:450:31: internal compiler error: RTL flag check: MEM_VOLATILE_P used with unexpected rtx code 'reg' in set_mem_attributes_minus_bitpos, at emit-rtl.c:1787
 #define FIXED_OP(OP,MODE,NUM) __gnu_ ## OP ## MODE ## NUM
                               ^
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/fixed-bit.h:460:30: note: in expansion of macro 'FIXED_OP'
 #define FIXED_ADD_TEMP(NAME) FIXED_OP(add,NAME,3)
                              ^
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/fixed-bit.h:492:19: note: in expansion of macro 'FIXED_ADD_TEMP'
 #define FIXED_ADD FIXED_ADD_TEMP(MODE_NAME_S)
                   ^
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/fixed-bit.c:59:1: note: in expansion of macro 'FIXED_ADD'
 FIXED_ADD (FIXED_C_TYPE a, FIXED_C_TYPE b)
 ^
0xa6eb52 rtl_check_failed_flag(char const*, rtx_def const*, char const*, int, char const*)
        /media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/gcc/rtl.c:800
0x771fc7 set_mem_attributes_minus_bitpos(rtx_def*, tree_node*, int, long)
        /media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/gcc/emit-rtl.c:1787
0x805294 assign_parm_setup_block
        /media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/gcc/function.c:2977
0x80b65c assign_parms
        /media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/gcc/function.c:3775
0x80e087 expand_function_start(tree_node*)
        /media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/gcc/function.c:5215
0x6a77ed execute
        /media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/gcc/cfgexpand.c:6127
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See for instructions.
/media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/libgcc/fixed-obj.mk:27: recipe for target '_addQQ.o' failed
make[2]: *** [_addQQ.o] Error 1

>> for gcc/testsuite/ChangeLog
>>
>> * gcc.dg/guality/pr54200.c: Add -fno-tree-coalesce-vars.
>> * gcc.dg/ssp-1.c: Make counter a register.
>> * gcc.dg/ssp-2.c: Likewise.
>> * gcc.dg/torture/parm-coalesce.c: New.
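
To spell out what the ICE message above says: the RTL checking code rejected a MEM-only flag accessor (MEM_VOLATILE_P) because the rtx it was handed, from assign_parm_setup_block via set_mem_attributes_minus_bitpos, was a REG. The toy model below only illustrates that kind of check. All names in it (toy_rtx, toy_mem_volatile_ok, toy_set_mem_attributes) are hypothetical, it is not the GCC implementation, and it says nothing about what the right fix for this failure is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal stand-in for an rtx: a code plus one flag.  */
enum toy_code { TOY_REG, TOY_MEM };

struct toy_rtx
{
  enum toy_code code;
  bool volatile_flag;           /* only meaningful when code == TOY_MEM */
};

/* Model of the flag check: the volatile flag may only be touched on a
   MEM.  A checking compiler aborts where this returns false.  */
static bool
toy_mem_volatile_ok (const struct toy_rtx *x)
{
  assert (x != NULL);
  return x->code == TOY_MEM;
}

/* Model of an attribute setter that assumes it is given a MEM.
   Returns 0 on success and -1 where the real compiler would ICE.  */
static int
toy_set_mem_attributes (struct toy_rtx *x, bool is_volatile)
{
  if (!toy_mem_volatile_ok (x))
    return -1;
  x->volatile_flag = is_volatile;
  return 0;
}
```

In this model the failure corresponds to a parameter whose assigned location is a pseudo (TOY_REG) reaching a code path that still treats it as a stack slot (TOY_MEM).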
>> --- >> gcc/Makefile.in | 1 >> gcc/alias.c | 13 + >> gcc/cfgexpand.c | 370 ++++++++++++++----- >> gcc/cfgexpand.h | 2 >> gcc/common.opt | 12 - >> gcc/doc/invoke.texi | 48 +-- >> gcc/emit-rtl.c | 5 >> gcc/explow.c | 22 + >> gcc/explow.h | 3 >> gcc/expr.c | 39 +- >> gcc/function.c | 226 +++++++++--- >> gcc/gimple-expr.c | 39 -- >> gcc/gimple-expr.h | 1 >> gcc/opts.c | 2 >> gcc/passes.def | 5 >> gcc/testsuite/gcc.dg/guality/pr54200.c | 2 >> gcc/testsuite/gcc.dg/ssp-1.c | 2 >> gcc/testsuite/gcc.dg/ssp-2.c | 2 >> gcc/testsuite/gcc.dg/torture/parm-coalesce.c | 40 ++ >> gcc/tree-outof-ssa.c | 16 - >> gcc/tree-ssa-coalesce.c | 380 +++++++++++++++++++- >> gcc/tree-ssa-coalesce.h | 1 >> gcc/tree-ssa-copyrename.c | 499 -------------------------- >> gcc/tree-ssa-live.c | 101 ----- >> gcc/tree-ssa-live.h | 4 >> gcc/tree-ssa-loop-niter.c | 6 >> gcc/tree-ssa-uncprop.c | 5 >> gcc/var-tracking.c | 12 - >> 28 files changed, 984 insertions(+), 874 deletions(-) >> create mode 100644 gcc/testsuite/gcc.dg/torture/parm-coalesce.c >> delete mode 100644 gcc/tree-ssa-copyrename.c >> >> diff --git a/gcc/Makefile.in b/gcc/Makefile.in >> index 3d14938..2a03223 100644 >> --- a/gcc/Makefile.in >> +++ b/gcc/Makefile.in >> @@ -1441,7 +1441,6 @@ OBJS = \ >> tree-ssa-ccp.o \ >> tree-ssa-coalesce.o \ >> tree-ssa-copy.o \ >> - tree-ssa-copyrename.o \ >> tree-ssa-dce.o \ >> tree-ssa-dom.o \ >> tree-ssa-dse.o \ >> diff --git a/gcc/alias.c b/gcc/alias.c >> index ea539c5..5a031d9 100644 >> --- a/gcc/alias.c >> +++ b/gcc/alias.c >> @@ -2552,6 +2552,19 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant) >> if (! DECL_P (exprx) || ! DECL_P (expry)) >> return 0; >> >> + /* If we refer to different gimple registers, or one gimple register >> + and one non-gimple-register, we know they can't overlap. First, >> + gimple registers don't have their addresses taken. 
Now, there >> + could be more than one stack slot for (different versions of) the >> + same gimple register, but we can presumably tell they don't >> + overlap based on offsets from stack base addresses elsewhere. >> + It's important that we don't proceed to DECL_RTL, because gimple >> + registers may not pass DECL_RTL_SET_P, and make_decl_rtl won't be >> + able to do anything about them since no SSA information will have >> + remained to guide it. */ >> + if (is_gimple_reg (exprx) || is_gimple_reg (expry)) >> + return exprx != expry; >> + >> /* With invalid code we can end up storing into the constant pool. >> Bail out to avoid ICEing when creating RTL for this. >> See gfortran.dg/lto/20091028-2_0.f90. */ >> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c >> index b190f91..bf972fc 100644 >> --- a/gcc/cfgexpand.c >> +++ b/gcc/cfgexpand.c >> @@ -179,21 +179,121 @@ gimple_assign_rhs_to_tree (gimple stmt) >> >> #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x) >> >> +/* Choose either CUR or NEXT as the leader DECL for a partition. >> + Prefer ignored decls, to simplify debug dumps and reduce ambiguity >> + out of the same user variable being in multiple partitions (this is >> + less likely for compiler-introduced temps). */ >> + >> +static tree >> +leader_merge (tree cur, tree next) >> +{ >> + if (cur == NULL || cur == next) >> + return next; >> + >> + if (DECL_P (cur) && DECL_IGNORED_P (cur)) >> + return cur; >> + >> + if (DECL_P (next) && DECL_IGNORED_P (next)) >> + return next; >> + >> + return cur; >> +} >> + >> + >> +/* Return the RTL for the default SSA def of a PARM or RESULT, if >> + there is one. */ >> + >> +rtx >> +get_rtl_for_parm_ssa_default_def (tree var) >> +{ >> + gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL); >> + >> + if (!is_gimple_reg (var)) >> + return NULL_RTX; >> + >> + /* If we've already determined RTL for the decl, use it. 
This is >> + not just an optimization: if VAR is a PARM whose incoming value >> + is unused, we won't find a default def to use its partition, but >> + we still want to use the location of the parm, if it was used at >> + all. During assign_parms, until a location is assigned for the >> + VAR, RTL can only for a parm or result if we're not coalescing >> + across variables, when we know we're coalescing all SSA_NAMEs of >> + each parm or result, and we're not coalescing them with names >> + pertaining to other variables, such as other parms' default >> + defs. */ >> + if (DECL_RTL_SET_P (var)) >> + { >> + gcc_assert (DECL_RTL (var) != pc_rtx); >> + return DECL_RTL (var); >> + } >> + >> + tree name = ssa_default_def (cfun, var); >> + >> + if (!name) >> + return NULL_RTX; >> + >> + int part = var_to_partition (SA.map, name); >> + if (part == NO_PARTITION) >> + return NULL_RTX; >> + >> + return SA.partition_to_pseudo[part]; >> +} >> + >> /* Associate declaration T with storage space X. If T is no >> SSA name this is exactly SET_DECL_RTL, otherwise make the >> partition of T associated with X. */ >> static inline void >> set_rtl (tree t, rtx x) >> { >> + if (x && SSAVAR (t)) >> + { >> + bool skip = false; >> + tree cur = NULL_TREE; >> + >> + if (MEM_P (x)) >> + cur = MEM_EXPR (x); >> + else if (REG_P (x)) >> + cur = REG_EXPR (x); >> + else if (GET_CODE (x) == CONCAT >> + && REG_P (XEXP (x, 0))) >> + cur = REG_EXPR (XEXP (x, 0)); >> + else if (GET_CODE (x) == PARALLEL) >> + cur = REG_EXPR (XVECEXP (x, 0, 0)); >> + else if (x == pc_rtx) >> + skip = true; >> + else >> + gcc_unreachable (); >> + >> + tree next = skip ? 
cur : leader_merge (cur, SSAVAR (t)); >> + >> + if (cur != next) >> + { >> + if (MEM_P (x)) >> + set_mem_attributes (x, next, true); >> + else >> + set_reg_attrs_for_decl_rtl (next, x); >> + } >> + } >> + >> if (TREE_CODE (t) == SSA_NAME) >> { >> - SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x; >> - if (x && !MEM_P (x)) >> - set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x); >> - /* For the benefit of debug information at -O0 (where vartracking >> - doesn't run) record the place also in the base DECL if it's >> - a normal variable (not a parameter). */ >> - if (x && x != pc_rtx && TREE_CODE (SSA_NAME_VAR (t)) == VAR_DECL) >> + int part = var_to_partition (SA.map, t); >> + if (part != NO_PARTITION) >> + { >> + if (SA.partition_to_pseudo[part]) >> + gcc_assert (SA.partition_to_pseudo[part] == x); >> + else >> + SA.partition_to_pseudo[part] = x; >> + } >> + /* For the benefit of debug information at -O0 (where >> + vartracking doesn't run) record the place also in the base >> + DECL. For PARMs and RESULTs, we may end up resetting these >> + in function.c:maybe_reset_rtl_for_parm, but in some rare >> + cases we may need them (unused and overwritten incoming >> + value, that at -O0 must share the location with the other >> + uses in spite of the missing default def), and this may be >> + the only chance to preserve them. */ >> + if (x && x != pc_rtx && SSA_NAME_VAR (t)) >> { >> tree var = SSA_NAME_VAR (t); >> /* If we don't yet have something recorded, just record it now. */ >> @@ -909,7 +1009,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align, >> gcc_assert (offset == trunc_int_for_mode (offset, Pmode)); >> >> x = plus_constant (Pmode, base, offset); >> - x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x); >> + x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME >> + ? 
TYPE_MODE (TREE_TYPE (decl)) >> + : DECL_MODE (SSAVAR (decl)), x); >> >> if (TREE_CODE (decl) != SSA_NAME) >> { >> @@ -931,7 +1033,6 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align, >> DECL_USER_ALIGN (decl) = 0; >> } >> >> - set_mem_attributes (x, SSAVAR (decl), true); >> set_rtl (decl, x); >> } >> >> @@ -1146,13 +1247,22 @@ account_stack_vars (void) >> to a variable to be allocated in the stack frame. */ >> >> static void >> -expand_one_stack_var (tree var) >> +expand_one_stack_var_1 (tree var) >> { >> HOST_WIDE_INT size, offset; >> unsigned byte_align; >> >> - size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var))); >> - byte_align = align_local_variable (SSAVAR (var)); >> + if (TREE_CODE (var) == SSA_NAME) >> + { >> + tree type = TREE_TYPE (var); >> + size = tree_to_uhwi (TYPE_SIZE_UNIT (type)); >> + byte_align = TYPE_ALIGN_UNIT (type); >> + } >> + else >> + { >> + size = tree_to_uhwi (DECL_SIZE_UNIT (var)); >> + byte_align = align_local_variable (var); >> + } >> >> /* We handle highly aligned variables in expand_stack_vars. */ >> gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT); >> @@ -1163,6 +1273,27 @@ expand_one_stack_var (tree var) >> crtl->max_used_stack_slot_alignment, offset); >> } >> >> +/* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are >> + already assigned some MEM. */ >> + >> +static void >> +expand_one_stack_var (tree var) >> +{ >> + if (TREE_CODE (var) == SSA_NAME) >> + { >> + int part = var_to_partition (SA.map, var); >> + if (part != NO_PARTITION) >> + { >> + rtx x = SA.partition_to_pseudo[part]; >> + gcc_assert (x); >> + gcc_assert (MEM_P (x)); >> + return; >> + } >> + } >> + >> + return expand_one_stack_var_1 (var); >> +} >> + >> /* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL >> that will reside in a hard register. 
*/ >> >> @@ -1172,13 +1303,114 @@ expand_one_hard_reg_var (tree var) >> rest_of_decl_compilation (var, 0, 0); >> } >> >> +/* Record the alignment requirements of some variable assigned to a >> + pseudo. */ >> + >> +static void >> +record_alignment_for_reg_var (unsigned int align) >> +{ >> + if (SUPPORTS_STACK_ALIGNMENT >> + && crtl->stack_alignment_estimated < align) >> + { >> + /* stack_alignment_estimated shouldn't change after stack >> + realign decision made */ >> + gcc_assert (!crtl->stack_realign_processed); >> + crtl->stack_alignment_estimated = align; >> + } >> + >> + /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted. >> + So here we only make sure stack_alignment_needed >= align. */ >> + if (crtl->stack_alignment_needed < align) >> + crtl->stack_alignment_needed = align; >> + if (crtl->max_used_stack_slot_alignment < align) >> + crtl->max_used_stack_slot_alignment = align; >> +} >> + >> +/* Create RTL for an SSA partition. */ >> + >> +static void >> +expand_one_ssa_partition (tree var) >> +{ >> + int part = var_to_partition (SA.map, var); >> + gcc_assert (part != NO_PARTITION); >> + >> + if (SA.partition_to_pseudo[part]) >> + return; >> + >> + if (!use_register_for_decl (var)) >> + { >> + expand_one_stack_var_1 (var); >> + return; >> + } >> + >> + unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (var), >> + TYPE_MODE (TREE_TYPE (var)), >> + TYPE_ALIGN (TREE_TYPE (var))); >> + >> + /* If the variable alignment is very large we'll dynamicaly allocate >> + it, which means that in-frame portion is just a pointer. */ >> + if (align > MAX_SUPPORTED_STACK_ALIGNMENT) >> + align = POINTER_SIZE; >> + >> + record_alignment_for_reg_var (align); >> + >> + machine_mode reg_mode = promote_ssa_mode (var, NULL); >> + >> + rtx x = gen_reg_rtx (reg_mode); >> + >> + set_rtl (var, x); >> +} >> + >> +/* Record the association between the RTL generated for a partition >> + and the underlying variable of the SSA_NAME. 
*/ >> + >> +static void >> +adjust_one_expanded_partition_var (tree var) >> +{ >> + if (!var) >> + return; >> + >> + tree decl = SSA_NAME_VAR (var); >> + >> + int part = var_to_partition (SA.map, var); >> + if (part == NO_PARTITION) >> + return; >> + >> + rtx x = SA.partition_to_pseudo[part]; >> + >> + set_rtl (var, x); >> + >> + if (!REG_P (x)) >> + return; >> + >> + /* Note if the object is a user variable. */ >> + if (decl && !DECL_ARTIFICIAL (decl)) >> + mark_user_reg (x); >> + >> + if (POINTER_TYPE_P (decl ? TREE_TYPE (decl) : TREE_TYPE (var))) >> + mark_reg_pointer (x, get_pointer_alignment (var)); >> +} >> + >> /* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL >> that will reside in a pseudo register. */ >> >> static void >> expand_one_register_var (tree var) >> { >> - tree decl = SSAVAR (var); >> + if (TREE_CODE (var) == SSA_NAME) >> + { >> + int part = var_to_partition (SA.map, var); >> + if (part != NO_PARTITION) >> + { >> + rtx x = SA.partition_to_pseudo[part]; >> + gcc_assert (x); >> + gcc_assert (REG_P (x)); >> + return; >> + } >> + gcc_unreachable (); >> + } >> + >> + tree decl = var; >> tree type = TREE_TYPE (decl); >> machine_mode reg_mode = promote_decl_mode (decl, NULL); >> rtx x = gen_reg_rtx (reg_mode); >> @@ -1312,21 +1544,7 @@ expand_one_var (tree var, bool toplevel, bool really_expand) >> align = POINTER_SIZE; >> } >> >> - if (SUPPORTS_STACK_ALIGNMENT >> - && crtl->stack_alignment_estimated < align) >> - { >> - /* stack_alignment_estimated shouldn't change after stack >> - realign decision made */ >> - gcc_assert (!crtl->stack_realign_processed); >> - crtl->stack_alignment_estimated = align; >> - } >> - >> - /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted. >> - So here we only make sure stack_alignment_needed >= align. 
*/ >> - if (crtl->stack_alignment_needed < align) >> - crtl->stack_alignment_needed = align; >> - if (crtl->max_used_stack_slot_alignment < align) >> - crtl->max_used_stack_slot_alignment = align; >> + record_alignment_for_reg_var (align); >> >> if (TREE_CODE (origvar) == SSA_NAME) >> { >> @@ -1760,48 +1978,18 @@ expand_used_vars (void) >> if (targetm.use_pseudo_pic_reg ()) >> pic_offset_table_rtx = gen_reg_rtx (Pmode); >> >> - hash_map ssa_name_decls; >> for (i = 0; i < SA.map->num_partitions; i++) >> { >> tree var = partition_to_var (SA.map, i); >> >> gcc_assert (!virtual_operand_p (var)); >> >> - /* Assign decls to each SSA name partition, share decls for partitions >> - we could have coalesced (those with the same type). */ >> - if (SSA_NAME_VAR (var) == NULL_TREE) >> - { >> - tree *slot = &ssa_name_decls.get_or_insert (TREE_TYPE (var)); >> - if (!*slot) >> - *slot = create_tmp_reg (TREE_TYPE (var)); >> - replace_ssa_name_symbol (var, *slot); >> - } >> - >> - /* Always allocate space for partitions based on VAR_DECLs. But for >> - those based on PARM_DECLs or RESULT_DECLs and which matter for the >> - debug info, there is no need to do so if optimization is disabled >> - because all the SSA_NAMEs based on these DECLs have been coalesced >> - into a single partition, which is thus assigned the canonical RTL >> - location of the DECLs. If in_lto_p, we can't rely on optimize, >> - a function could be compiled with -O1 -flto first and only the >> - link performed at -O0. */ >> - if (TREE_CODE (SSA_NAME_VAR (var)) == VAR_DECL) >> - expand_one_var (var, true, true); >> - else if (DECL_IGNORED_P (SSA_NAME_VAR (var)) || optimize || in_lto_p) >> - { >> - /* This is a PARM_DECL or RESULT_DECL. For those partitions that >> - contain the default def (representing the parm or result itself) >> - we don't do anything here. 
But those which don't contain the >> - default def (representing a temporary based on the parm/result) >> - we need to allocate space just like for normal VAR_DECLs. */ >> - if (!bitmap_bit_p (SA.partition_has_default_def, i)) >> - { >> - expand_one_var (var, true, true); >> - gcc_assert (SA.partition_to_pseudo[i]); >> - } >> - } >> + expand_one_ssa_partition (var); >> } >> >> + for (i = 1; i < num_ssa_names; i++) >> + adjust_one_expanded_partition_var (ssa_name (i)); >> + >> if (flag_stack_protect == SPCT_FLAG_STRONG) >> gen_stack_protect_signal >> = stack_protect_decl_p () || stack_protect_return_slot_p (); >> @@ -5961,35 +6149,6 @@ pass_expand::execute (function *fun) >> parm_birth_insn = var_seq; >> } >> >> - /* Now that we also have the parameter RTXs, copy them over to our >> - partitions. */ >> - for (i = 0; i < SA.map->num_partitions; i++) >> - { >> - tree var = SSA_NAME_VAR (partition_to_var (SA.map, i)); >> - >> - if (TREE_CODE (var) != VAR_DECL >> - && !SA.partition_to_pseudo[i]) >> - SA.partition_to_pseudo[i] = DECL_RTL_IF_SET (var); >> - gcc_assert (SA.partition_to_pseudo[i]); >> - >> - /* If this decl was marked as living in multiple places, reset >> - this now to NULL. */ >> - if (DECL_RTL_IF_SET (var) == pc_rtx) >> - SET_DECL_RTL (var, NULL); >> - >> - /* Some RTL parts really want to look at DECL_RTL(x) when x >> - was a decl marked in REG_ATTR or MEM_ATTR. We could use >> - SET_DECL_RTL here making this available, but that would mean >> - to select one of the potentially many RTLs for one DECL. Instead >> - of doing that we simply reset the MEM_EXPR of the RTL in question, >> - then nobody can get at it and hence nobody can call DECL_RTL on it. */ >> - if (!DECL_RTL_SET_P (var)) >> - { >> - if (MEM_P (SA.partition_to_pseudo[i])) >> - set_mem_expr (SA.partition_to_pseudo[i], NULL); >> - } >> - } >> - >> /* If we have a class containing differently aligned pointers >> we need to merge those into the corresponding RTL pointer >> alignment. 
*/ >> @@ -5997,7 +6156,6 @@ pass_expand::execute (function *fun) >> { >> tree name = ssa_name (i); >> int part; >> - rtx r; >> >> if (!name >> /* We might have generated new SSA names in >> @@ -6010,20 +6168,24 @@ pass_expand::execute (function *fun) >> if (part == NO_PARTITION) >> continue; >> >> - /* Adjust all partition members to get the underlying decl of >> - the representative which we might have created in expand_one_var. */ >> - if (SSA_NAME_VAR (name) == NULL_TREE) >> + gcc_assert (SA.partition_to_pseudo[part]); >> + >> + /* If this decl was marked as living in multiple places, reset >> + this now to NULL. */ >> + tree var = SSA_NAME_VAR (name); >> + if (var && DECL_RTL_IF_SET (var) == pc_rtx) >> + SET_DECL_RTL (var, NULL); >> + /* Check that the pseudos chosen by assign_parms are those of >> + the corresponding default defs. */ >> + else if (SSA_NAME_IS_DEFAULT_DEF (name) >> + && (TREE_CODE (var) == PARM_DECL >> + || TREE_CODE (var) == RESULT_DECL)) >> { >> - tree leader = partition_to_var (SA.map, part); >> - gcc_assert (SSA_NAME_VAR (leader) != NULL_TREE); >> - replace_ssa_name_symbol (name, SSA_NAME_VAR (leader)); >> + rtx in = DECL_RTL_IF_SET (var); >> + gcc_assert (in); >> + rtx out = SA.partition_to_pseudo[part]; >> + gcc_assert (in == out || rtx_equal_p (in, out)); >> } >> - if (!POINTER_TYPE_P (TREE_TYPE (name))) >> - continue; >> - >> - r = SA.partition_to_pseudo[part]; >> - if (REG_P (r)) >> - mark_reg_pointer (r, get_pointer_alignment (name)); >> } >> >> /* If this function is `main', emit a call to `__main' >> diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h >> index a0b6e3e..602579d 100644 >> --- a/gcc/cfgexpand.h >> +++ b/gcc/cfgexpand.h >> @@ -22,5 +22,7 @@ along with GCC; see the file COPYING3. 
If not see >> >> extern tree gimple_assign_rhs_to_tree (gimple); >> extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *); >> +extern rtx get_rtl_for_parm_ssa_default_def (tree var); >> + >> >> #endif /* GCC_CFGEXPAND_H */ >> diff --git a/gcc/common.opt b/gcc/common.opt >> index 32b416a..051f824 100644 >> --- a/gcc/common.opt >> +++ b/gcc/common.opt >> @@ -2227,16 +2227,16 @@ Common Report Var(flag_tree_ch) Optimization >> Enable loop header copying on trees >> >> ftree-coalesce-inlined-vars >> -Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization >> -Enable coalescing of copy-related user variables that are inlined >> +Common Ignore RejectNegative >> +Does nothing. Preserved for backward compatibility. >> >> ftree-coalesce-vars >> -Common Report Var(flag_ssa_coalesce_vars,2) Optimization >> -Enable coalescing of all copy-related user variables >> +Common Report Var(flag_tree_coalesce_vars) Optimization >> +Enable SSA coalescing of user variables >> >> ftree-copyrename >> -Common Report Var(flag_tree_copyrename) Optimization >> -Replace SSA temporaries with better names in copies >> +Common Ignore >> +Does nothing. Preserved for backward compatibility. >> >> ftree-copy-prop >> Common Report Var(flag_tree_copy_prop) Optimization >> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi >> index e25bd62..e359be2 100644 >> --- a/gcc/doc/invoke.texi >> +++ b/gcc/doc/invoke.texi >> @@ -339,7 +339,6 @@ Objective-C and Objective-C++ Dialects}. >> -fdump-tree-phiprop@r{[}-@var{n}@r{]} @gol >> -fdump-tree-phiopt@r{[}-@var{n}@r{]} @gol >> -fdump-tree-forwprop@r{[}-@var{n}@r{]} @gol >> --fdump-tree-copyrename@r{[}-@var{n}@r{]} @gol >> -fdump-tree-nrv -fdump-tree-vect @gol >> -fdump-tree-sink @gol >> -fdump-tree-sra@r{[}-@var{n}@r{]} @gol >> @@ -445,9 +444,8 @@ Objective-C and Objective-C++ Dialects}. 
>> -fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol >> -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol >> -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol >> --ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol >> --ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol >> --ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol >> +-ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol >> +-ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol >> -ftree-loop-if-convert-stores -ftree-loop-im @gol >> -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol >> -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol >> @@ -7076,11 +7074,6 @@ name is made by appending @file{.phiopt} to the source file name. >> Dump each function after forward propagating single use variables. The file >> name is made by appending @file{.forwprop} to the source file name. >> >> -@item copyrename >> -@opindex fdump-tree-copyrename >> -Dump each function after applying the copy rename optimization. The file >> -name is made by appending @file{.copyrename} to the source file name. >> - >> @item nrv >> @opindex fdump-tree-nrv >> Dump each function after applying the named return value optimization on >> @@ -7545,8 +7538,8 @@ compilation time. >> -ftree-ccp @gol >> -fssa-phiopt @gol >> -ftree-ch @gol >> +-ftree-coalesce-vars @gol >> -ftree-copy-prop @gol >> --ftree-copyrename @gol >> -ftree-dce @gol >> -ftree-dominator-opts @gol >> -ftree-dse @gol >> @@ -8815,6 +8808,15 @@ profitable to parallelize the loops. >> Compare the results of several data dependence analyzers. This option >> is used for debugging the data dependence analyzers. >> >> +@item -ftree-coalesce-vars >> +@opindex ftree-coalesce-vars >> +Tell the compiler to attempt to combine small user-defined variables >> +too, instead of just compiler temporaries. 
This may severely limit the >> +ability to debug an optimized program compiled with >> +@option{-fno-var-tracking-assignments}. In the negated form, this flag >> +prevents SSA coalescing of user variables. This option is enabled by >> +default if optimization is enabled. >> + >> @item -ftree-loop-if-convert >> @opindex ftree-loop-if-convert >> Attempt to transform conditional jumps in the innermost loops to >> @@ -8928,32 +8930,6 @@ Perform scalar replacement of aggregates. This pass replaces structure >> references with scalars to prevent committing structures to memory too >> early. This flag is enabled by default at @option{-O} and higher. >> >> -@item -ftree-copyrename >> -@opindex ftree-copyrename >> -Perform copy renaming on trees. This pass attempts to rename compiler >> -temporaries to other variables at copy locations, usually resulting in >> -variable names which more closely resemble the original variables. This flag >> -is enabled by default at @option{-O} and higher. >> - >> -@item -ftree-coalesce-inlined-vars >> -@opindex ftree-coalesce-inlined-vars >> -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to >> -combine small user-defined variables too, but only if they are inlined >> -from other functions. It is a more limited form of >> -@option{-ftree-coalesce-vars}. This may harm debug information of such >> -inlined variables, but it keeps variables of the inlined-into >> -function apart from each other, such that they are more likely to >> -contain the expected values in a debugging session. >> - >> -@item -ftree-coalesce-vars >> -@opindex ftree-coalesce-vars >> -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to >> -combine small user-defined variables too, instead of just compiler >> -temporaries. This may severely limit the ability to debug an optimized >> -program compiled with @option{-fno-var-tracking-assignments}. 
In the >> -negated form, this flag prevents SSA coalescing of user variables, >> -including inlined ones. This option is enabled by default. >> - >> @item -ftree-ter >> @opindex ftree-ter >> Perform temporary expression replacement during the SSA->normal phase. Single >> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c >> index 49a1509..2b98946 100644 >> --- a/gcc/emit-rtl.c >> +++ b/gcc/emit-rtl.c >> @@ -1249,6 +1249,9 @@ set_reg_attrs_for_parm (rtx parm_rtx, rtx mem) >> void >> set_reg_attrs_for_decl_rtl (tree t, rtx x) >> { >> + if (!t) >> + return; >> + tree tdecl = t; >> if (GET_CODE (x) == SUBREG) >> { >> gcc_assert (subreg_lowpart_p (x)); >> @@ -1257,7 +1260,7 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x) >> if (REG_P (x)) >> REG_ATTRS (x) >> = get_reg_attrs (t, byte_lowpart_offset (GET_MODE (x), >> - DECL_MODE (t))); >> + DECL_MODE (tdecl))); >> if (GET_CODE (x) == CONCAT) >> { >> if (REG_P (XEXP (x, 0))) >> diff --git a/gcc/explow.c b/gcc/explow.c >> index 8745aea..5b0d49c 100644 >> --- a/gcc/explow.c >> +++ b/gcc/explow.c >> @@ -856,6 +856,28 @@ promote_decl_mode (const_tree decl, int *punsignedp) >> return pmode; >> } >> >> +/* Return the promoted mode for name. If it is a named SSA_NAME, it >> + is the same as promote_decl_mode. Otherwise, it is the promoted >> + mode of a temp decl of same type as the SSA_NAME, if we had created >> + one. */ >> + >> +machine_mode >> +promote_ssa_mode (const_tree name, int *punsignedp) >> +{ >> + gcc_assert (TREE_CODE (name) == SSA_NAME); >> + >> + tree type = TREE_TYPE (name); >> + int unsignedp = TYPE_UNSIGNED (type); >> + machine_mode mode = TYPE_MODE (type); >> + >> + machine_mode pmode = promote_mode (type, mode, &unsignedp); >> + if (punsignedp) >> + *punsignedp = unsignedp; >> + >> + return pmode; >> +} >> + >> + >> >> /* Controls the behaviour of {anti_,}adjust_stack. 
*/ >> static bool suppress_reg_args_size; >> diff --git a/gcc/explow.h b/gcc/explow.h >> index 94613de..52113db 100644 >> --- a/gcc/explow.h >> +++ b/gcc/explow.h >> @@ -57,6 +57,9 @@ extern machine_mode promote_mode (const_tree, machine_mode, int *); >> /* Return mode and signedness to use when object is promoted. */ >> machine_mode promote_decl_mode (const_tree, int *); >> >> +/* Return mode and signedness to use when object is promoted. */ >> +machine_mode promote_ssa_mode (const_tree, int *); >> + >> /* Remove some bytes from the stack. An rtx says how many. */ >> extern void adjust_stack (rtx); >> >> diff --git a/gcc/expr.c b/gcc/expr.c >> index 5a931dc..5b6e16e 100644 >> --- a/gcc/expr.c >> +++ b/gcc/expr.c >> @@ -9301,7 +9301,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, >> rtx op0, op1, temp, decl_rtl; >> tree type; >> int unsignedp; >> - machine_mode mode; >> + machine_mode mode, dmode; >> enum tree_code code = TREE_CODE (exp); >> rtx subtarget, original_target; >> int ignore; >> @@ -9432,7 +9432,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, >> if (g == NULL >> && modifier == EXPAND_INITIALIZER >> && !SSA_NAME_IS_DEFAULT_DEF (exp) >> - && (optimize || DECL_IGNORED_P (SSA_NAME_VAR (exp))) >> + && (optimize || !SSA_NAME_VAR (exp) >> + || DECL_IGNORED_P (SSA_NAME_VAR (exp))) >> && stmt_is_replaceable_p (SSA_NAME_DEF_STMT (exp))) >> g = SSA_NAME_DEF_STMT (exp); >> if (g) >> @@ -9511,15 +9512,18 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, >> /* Ensure variable marked as used even if it doesn't go through >> a parser. If it hasn't be used yet, write out an external >> definition. */ >> - TREE_USED (exp) = 1; >> + if (exp) >> + TREE_USED (exp) = 1; >> >> /* Show we haven't gotten RTL for this yet. */ >> temp = 0; >> >> /* Variables inherited from containing functions should have >> been lowered by this point. 
*/ >> - context = decl_function_context (exp); >> - gcc_assert (SCOPE_FILE_SCOPE_P (context) >> + if (exp) >> + context = decl_function_context (exp); >> + gcc_assert (!exp >> + || SCOPE_FILE_SCOPE_P (context) >> || context == current_function_decl >> || TREE_STATIC (exp) >> || DECL_EXTERNAL (exp) >> @@ -9543,7 +9547,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, >> decl_rtl = use_anchored_address (decl_rtl); >> if (modifier != EXPAND_CONST_ADDRESS >> && modifier != EXPAND_SUM >> - && !memory_address_addr_space_p (DECL_MODE (exp), >> + && !memory_address_addr_space_p (exp ? DECL_MODE (exp) >> + : GET_MODE (decl_rtl), >> XEXP (decl_rtl, 0), >> MEM_ADDR_SPACE (decl_rtl))) >> temp = replace_equiv_address (decl_rtl, >> @@ -9554,12 +9559,17 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, >> if the address is a register. */ >> if (temp != 0) >> { >> - if (MEM_P (temp) && REG_P (XEXP (temp, 0))) >> + if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0))) >> mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp)); >> >> return temp; >> } >> >> + if (exp) >> + dmode = DECL_MODE (exp); >> + else >> + dmode = TYPE_MODE (TREE_TYPE (ssa_name)); >> + >> /* If the mode of DECL_RTL does not match that of the decl, >> there are two cases: we are dealing with a BLKmode value >> that is returned in a register, or we are dealing with >> @@ -9567,22 +9577,23 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, >> of the wanted mode, but mark it so that we know that it >> was already extended. */ >> if (REG_P (decl_rtl) >> - && DECL_MODE (exp) != BLKmode >> - && GET_MODE (decl_rtl) != DECL_MODE (exp)) >> + && dmode != BLKmode >> + && GET_MODE (decl_rtl) != dmode) >> { >> machine_mode pmode; >> >> /* Get the signedness to be used for this variable. Ensure we get >> the same mode we got when the variable was declared. 
*/ >> - if (code == SSA_NAME >> - && (g = SSA_NAME_DEF_STMT (ssa_name)) >> - && gimple_code (g) == GIMPLE_CALL >> - && !gimple_call_internal_p (g)) >> + if (code != SSA_NAME) >> + pmode = promote_decl_mode (exp, &unsignedp); >> + else if ((g = SSA_NAME_DEF_STMT (ssa_name)) >> + && gimple_code (g) == GIMPLE_CALL >> + && !gimple_call_internal_p (g)) >> pmode = promote_function_mode (type, mode, &unsignedp, >> gimple_call_fntype (g), >> 2); >> else >> - pmode = promote_decl_mode (exp, &unsignedp); >> + pmode = promote_ssa_mode (ssa_name, &unsignedp); >> gcc_assert (GET_MODE (decl_rtl) == pmode); >> >> temp = gen_lowpart_SUBREG (mode, decl_rtl); >> diff --git a/gcc/function.c b/gcc/function.c >> index 7d2d7e4..58e2498 100644 >> --- a/gcc/function.c >> +++ b/gcc/function.c >> @@ -87,6 +87,7 @@ along with GCC; see the file COPYING3. If not see >> #include "cfganal.h" >> #include "cfgbuild.h" >> #include "cfgcleanup.h" >> +#include "cfgexpand.h" >> #include "basic-block.h" >> #include "df.h" >> #include "params.h" >> @@ -2121,6 +2122,30 @@ aggregate_value_p (const_tree exp, const_tree fntype) >> bool >> use_register_for_decl (const_tree decl) >> { >> + if (TREE_CODE (decl) == SSA_NAME) >> + { >> + /* We often try to use the SSA_NAME, instead of its underlying >> + decl, to get type information and guide decisions, to avoid >> + differences of behavior between anonymous and named >> + variables, but in this one case we have to go for the actual >> + variable if there is one. The main reason is that, at least >> + at -O0, we want to place user variables on the stack, but we >> + don't mind using pseudos for anonymous or ignored temps. >> + Should we take the SSA_NAME, we'd conclude all SSA_NAMEs >> + should go in pseudos, whereas their corresponding variables >> + might have to go on the stack. 
So, disregarding the decl >> + here would negatively impact debug info at -O0, enable >> + coalescing between SSA_NAMEs that ought to get different >> + stack/pseudo assignments, and get the incoming argument >> + processing thoroughly confused by PARM_DECLs expected to live >> + in stack slots but assigned to pseudos. */ >> + if (!SSA_NAME_VAR (decl)) >> + return TYPE_MODE (TREE_TYPE (decl)) != BLKmode >> + && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl))); >> + >> + decl = SSA_NAME_VAR (decl); >> + } >> + >> if (!targetm.calls.allocate_stack_slots_for_args ()) >> return true; >> >> @@ -2804,23 +2829,88 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data) >> data->entry_parm = entry_parm; >> } >> >> +/* Wrapper for use_register_for_decl, that special-cases the >> + .result_ptr as the function's RESULT_DECL when the RESULT_DECL is >> + passed by reference. */ >> + >> +static bool >> +use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm) >> +{ >> + if (parm == all->function_result_decl) >> + { >> + tree result = DECL_RESULT (current_function_decl); >> + >> + if (DECL_BY_REFERENCE (result)) >> + parm = result; >> + } >> + >> + return use_register_for_decl (parm); >> +} >> + >> +/* Wrapper for get_rtl_for_parm_ssa_default_def, that special-cases >> + the .result_ptr as the function's RESULT_DECL when the RESULT_DECL >> + is passed by reference. */ >> + >> +static rtx >> +rtl_for_parm (struct assign_parm_data_all *all, tree parm) >> +{ >> + if (parm == all->function_result_decl) >> + { >> + tree result = DECL_RESULT (current_function_decl); >> + >> + if (!DECL_BY_REFERENCE (result)) >> + return NULL_RTX; >> + >> + parm = result; >> + } >> + >> + return get_rtl_for_parm_ssa_default_def (parm); >> +} >> + >> +/* Reset the location of PARM_DECLs and RESULT_DECLs that had >> + SSA_NAMEs in multiple partitions, so that assign_parms will choose >> + the default def, if it exists, or create new RTL to hold the unused >> + entry value. 
If we are coalescing across variables, we want to >> + reset the location too, because a parm without a default def >> + (incoming value unused) might be coalesced with one with a default >> + def, and then assign_parms would copy both incoming values to the >> + same location, which might cause the wrong value to survive. */ >> +static void >> +maybe_reset_rtl_for_parm (tree parm) >> +{ >> + gcc_assert (TREE_CODE (parm) == PARM_DECL >> + || TREE_CODE (parm) == RESULT_DECL); >> + if ((flag_tree_coalesce_vars >> + || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx)) >> + && is_gimple_reg (parm)) >> + SET_DECL_RTL (parm, NULL_RTX); >> +} >> + >> /* A subroutine of assign_parms. Adjust DATA->STACK_RTL such that it's >> always valid and properly aligned. */ >> >> static void >> -assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data) >> +assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm, >> + struct assign_parm_data_one *data) >> { >> rtx stack_parm = data->stack_parm; >> >> + /* If out-of-SSA assigned RTL to the parm default def, make sure we >> + don't use what we might have computed before. */ >> + rtx ssa_assigned = rtl_for_parm (all, parm); >> + if (ssa_assigned) >> + stack_parm = NULL; >> + >> /* If we can't trust the parm stack slot to be aligned enough for its >> ultimate type, don't use that slot after entry. We'll make another >> stack slot, if we need one. 
*/ >> - if (stack_parm >> - && ((STRICT_ALIGNMENT >> - && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm)) >> - || (data->nominal_type >> - && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm) >> - && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY))) >> + else if (stack_parm >> + && ((STRICT_ALIGNMENT >> + && (GET_MODE_ALIGNMENT (data->nominal_mode) >> + > MEM_ALIGN (stack_parm))) >> + || (data->nominal_type >> + && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm) >> + && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY))) >> stack_parm = NULL; >> >> /* If parm was passed in memory, and we need to convert it on entry, >> @@ -2882,11 +2972,16 @@ assign_parm_setup_block (struct assign_parm_data_all *all, >> >> size = int_size_in_bytes (data->passed_type); >> size_stored = CEIL_ROUND (size, UNITS_PER_WORD); >> + >> if (stack_parm == 0) >> { >> DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD); >> - stack_parm = assign_stack_local (BLKmode, size_stored, >> - DECL_ALIGN (parm)); >> + stack_parm = rtl_for_parm (all, parm); >> + if (!stack_parm) >> + stack_parm = assign_stack_local (BLKmode, size_stored, >> + DECL_ALIGN (parm)); >> + else >> + stack_parm = copy_rtx (stack_parm); >> if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size) >> PUT_MODE (stack_parm, GET_MODE (entry_parm)); >> set_mem_attributes (stack_parm, parm, 1); >> @@ -3027,10 +3122,19 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, >> = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp, >> TREE_TYPE (current_function_decl), 2); >> >> - parmreg = gen_reg_rtx (promoted_nominal_mode); >> + rtx from_expand = rtl_for_parm (all, parm); >> >> - if (!DECL_ARTIFICIAL (parm)) >> - mark_user_reg (parmreg); >> + if (from_expand && !data->passed_pointer) >> + { >> + parmreg = from_expand; >> + gcc_assert (GET_MODE (parmreg) == promoted_nominal_mode); >> + } >> + else >> + { >> + parmreg = gen_reg_rtx (promoted_nominal_mode); >> + 
if (!DECL_ARTIFICIAL (parm)) >> + mark_user_reg (parmreg); >> + } >> >> /* If this was an item that we received a pointer to, >> set DECL_RTL appropriately. */ >> @@ -3049,6 +3153,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, >> assign_parm_find_data_types and expand_expr_real_1. */ >> >> equiv_stack_parm = data->stack_parm; >> + if (!equiv_stack_parm) >> + equiv_stack_parm = data->entry_parm; >> validated_mem = validize_mem (copy_rtx (data->entry_parm)); >> >> need_conversion = (data->nominal_mode != data->passed_mode >> @@ -3189,11 +3295,17 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, >> >> /* If we were passed a pointer but the actual value can safely live >> in a register, retrieve it and use it directly. */ >> - if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode) >> + if (data->passed_pointer >> + && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode)) >> { >> /* We can't use nominal_mode, because it will have been set to >> Pmode above. We must use the actual mode of the parm. */ >> - if (use_register_for_decl (parm)) >> + if (from_expand) >> + { >> + parmreg = from_expand; >> + gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm))); >> + } >> + else if (use_register_for_decl (parm)) >> { >> parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm))); >> mark_user_reg (parmreg); >> @@ -3233,7 +3345,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, >> >> /* STACK_PARM is the pointer, not the parm, and PARMREG is >> now the parm. */ >> - data->stack_parm = NULL; >> + data->stack_parm = equiv_stack_parm = NULL; >> } >> >> /* Mark the register as eliminable if we did no conversion and it was >> @@ -3243,11 +3355,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, >> make here would screw up life analysis for it. 
*/ >> if (data->nominal_mode == data->passed_mode >> && !did_conversion >> - && data->stack_parm != 0 >> - && MEM_P (data->stack_parm) >> + && equiv_stack_parm != 0 >> + && MEM_P (equiv_stack_parm) >> && data->locate.offset.var == 0 >> && reg_mentioned_p (virtual_incoming_args_rtx, >> - XEXP (data->stack_parm, 0))) >> + XEXP (equiv_stack_parm, 0))) >> { >> rtx_insn *linsn = get_last_insn (); >> rtx_insn *sinsn; >> @@ -3260,8 +3372,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, >> = GET_MODE_INNER (GET_MODE (parmreg)); >> int regnor = REGNO (XEXP (parmreg, 0)); >> int regnoi = REGNO (XEXP (parmreg, 1)); >> - rtx stackr = adjust_address_nv (data->stack_parm, submode, 0); >> - rtx stacki = adjust_address_nv (data->stack_parm, submode, >> + rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0); >> + rtx stacki = adjust_address_nv (equiv_stack_parm, submode, >> GET_MODE_SIZE (submode)); >> >> /* Scan backwards for the set of the real and >> @@ -3334,6 +3446,13 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm, >> >> if (data->stack_parm == 0) >> { >> + rtx x = data->stack_parm = rtl_for_parm (all, parm); >> + if (x) >> + gcc_assert (GET_MODE (x) == GET_MODE (data->entry_parm)); >> + } >> + >> + if (data->stack_parm == 0) >> + { >> int align = STACK_SLOT_ALIGNMENT (data->passed_type, >> GET_MODE (data->entry_parm), >> TYPE_ALIGN (data->passed_type)); >> @@ -3592,6 +3711,8 @@ assign_parms (tree fndecl) >> DECL_INCOMING_RTL (parm) = DECL_RTL (parm); >> continue; >> } >> + else >> + maybe_reset_rtl_for_parm (parm); >> >> /* Estimate stack alignment from parameter alignment. 
*/ >> if (SUPPORTS_STACK_ALIGNMENT) >> @@ -3641,7 +3762,9 @@ assign_parms (tree fndecl) >> else >> set_decl_incoming_rtl (parm, data.entry_parm, false); >> >> - /* Boudns should be loaded in the particular order to >> + assign_parm_adjust_stack_rtl (&all, parm, &data); >> + >> + /* Bounds should be loaded in the particular order to >> have registers allocated correctly. Collect info about >> input bounds and load them later. */ >> if (POINTER_BOUNDS_TYPE_P (data.passed_type)) >> @@ -3658,11 +3781,10 @@ assign_parms (tree fndecl) >> } >> else >> { >> - assign_parm_adjust_stack_rtl (&data); >> - >> if (assign_parm_setup_block_p (&data)) >> assign_parm_setup_block (&all, parm, &data); >> - else if (data.passed_pointer || use_register_for_decl (parm)) >> + else if (data.passed_pointer >> + || use_register_for_parm_decl (&all, parm)) >> assign_parm_setup_reg (&all, parm, &data); >> else >> assign_parm_setup_stack (&all, parm, &data); >> @@ -5004,7 +5126,9 @@ expand_function_start (tree subr) >> before any library calls that assign parms might generate. */ >> >> /* Decide whether to return the value in memory or in a register. */ >> - if (aggregate_value_p (DECL_RESULT (subr), subr)) >> + tree res = DECL_RESULT (subr); >> + maybe_reset_rtl_for_parm (res); >> + if (aggregate_value_p (res, subr)) >> { >> /* Returning something that won't go in a register. */ >> rtx value_address = 0; >> @@ -5012,7 +5136,7 @@ expand_function_start (tree subr) >> #ifdef PCC_STATIC_STRUCT_RETURN >> if (cfun->returns_pcc_struct) >> { >> - int size = int_size_in_bytes (TREE_TYPE (DECL_RESULT (subr))); >> + int size = int_size_in_bytes (TREE_TYPE (res)); >> value_address = assemble_static_space (size); >> } >> else >> @@ -5024,36 +5148,45 @@ expand_function_start (tree subr) >> it. 
*/ >> if (sv) >> { >> - value_address = gen_reg_rtx (Pmode); >> + if (DECL_BY_REFERENCE (res)) >> + value_address = get_rtl_for_parm_ssa_default_def (res); >> + if (!value_address) >> + value_address = gen_reg_rtx (Pmode); >> emit_move_insn (value_address, sv); >> } >> } >> if (value_address) >> { >> rtx x = value_address; >> - if (!DECL_BY_REFERENCE (DECL_RESULT (subr))) >> + if (!DECL_BY_REFERENCE (res)) >> { >> - x = gen_rtx_MEM (DECL_MODE (DECL_RESULT (subr)), x); >> - set_mem_attributes (x, DECL_RESULT (subr), 1); >> + x = get_rtl_for_parm_ssa_default_def (res); >> + if (!x) >> + { >> + x = gen_rtx_MEM (DECL_MODE (res), value_address); >> + set_mem_attributes (x, res, 1); >> + } >> } >> - SET_DECL_RTL (DECL_RESULT (subr), x); >> + SET_DECL_RTL (res, x); >> } >> } >> - else if (DECL_MODE (DECL_RESULT (subr)) == VOIDmode) >> + else if (DECL_MODE (res) == VOIDmode) >> /* If return mode is void, this decl rtl should not be used. */ >> - SET_DECL_RTL (DECL_RESULT (subr), NULL_RTX); >> + SET_DECL_RTL (res, NULL_RTX); >> else >> { >> /* Compute the return values into a pseudo reg, which we will copy >> into the true return register after the cleanups are done. */ >> - tree return_type = TREE_TYPE (DECL_RESULT (subr)); >> - if (TYPE_MODE (return_type) != BLKmode >> - && targetm.calls.return_in_msb (return_type)) >> + tree return_type = TREE_TYPE (res); >> + rtx x = get_rtl_for_parm_ssa_default_def (res); >> + if (x) >> + /* Use it. */; >> + else if (TYPE_MODE (return_type) != BLKmode >> + && targetm.calls.return_in_msb (return_type)) >> /* expand_function_end will insert the appropriate padding in >> this case. Use the return value's natural (unpadded) mode >> within the function proper. 
*/ >> - SET_DECL_RTL (DECL_RESULT (subr), >> - gen_reg_rtx (TYPE_MODE (return_type))); >> + x = gen_reg_rtx (TYPE_MODE (return_type)); >> else >> { >> /* In order to figure out what mode to use for the pseudo, we >> @@ -5064,25 +5197,26 @@ expand_function_start (tree subr) >> /* Structures that are returned in registers are not >> aggregate_value_p, so we may see a PARALLEL or a REG. */ >> if (REG_P (hard_reg)) >> - SET_DECL_RTL (DECL_RESULT (subr), >> - gen_reg_rtx (GET_MODE (hard_reg))); >> + x = gen_reg_rtx (GET_MODE (hard_reg)); >> else >> { >> gcc_assert (GET_CODE (hard_reg) == PARALLEL); >> - SET_DECL_RTL (DECL_RESULT (subr), gen_group_rtx (hard_reg)); >> + x = gen_group_rtx (hard_reg); >> } >> } >> >> + SET_DECL_RTL (res, x); >> + >> /* Set DECL_REGISTER flag so that expand_function_end will copy the >> result to the real return register(s). */ >> - DECL_REGISTER (DECL_RESULT (subr)) = 1; >> + DECL_REGISTER (res) = 1; >> >> if (chkp_function_instrumented_p (current_function_decl)) >> { >> - tree return_type = TREE_TYPE (DECL_RESULT (subr)); >> + tree return_type = TREE_TYPE (res); >> rtx bounds = targetm.calls.chkp_function_value_bounds (return_type, >> subr, 1); >> - SET_DECL_BOUNDS_RTL (DECL_RESULT (subr), bounds); >> + SET_DECL_BOUNDS_RTL (res, bounds); >> } >> } >> >> @@ -5097,7 +5231,9 @@ expand_function_start (tree subr) >> rtx local, chain; >> rtx_insn *insn; >> >> - local = gen_reg_rtx (Pmode); >> + local = get_rtl_for_parm_ssa_default_def (parm); >> + if (!local) >> + local = gen_reg_rtx (Pmode); >> chain = targetm.calls.static_chain (current_function_decl, true); >> >> set_decl_incoming_rtl (parm, chain, false); >> diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c >> index 4d683d6..d3d1c5f 100644 >> --- a/gcc/gimple-expr.c >> +++ b/gcc/gimple-expr.c >> @@ -391,45 +391,6 @@ copy_var_decl (tree var, tree name, tree type) >> return copy; >> } >> >> -/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for >> - coalescing together, 
false otherwise. >> - >> - This must stay consistent with var_map_base_init in tree-ssa-live.c. */ >> - >> -bool >> -gimple_can_coalesce_p (tree name1, tree name2) >> -{ >> - /* First check the SSA_NAME's associated DECL. We only want to >> - coalesce if they have the same DECL or both have no associated DECL. */ >> - tree var1 = SSA_NAME_VAR (name1); >> - tree var2 = SSA_NAME_VAR (name2); >> - var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE; >> - var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE; >> - if (var1 != var2) >> - return false; >> - >> - /* Now check the types. If the types are the same, then we should >> - try to coalesce V1 and V2. */ >> - tree t1 = TREE_TYPE (name1); >> - tree t2 = TREE_TYPE (name2); >> - if (t1 == t2) >> - return true; >> - >> - /* If the types are not the same, check for a canonical type match. This >> - (for example) allows coalescing when the types are fundamentally the >> - same, but just have different names. >> - >> - Note pointer types with different address spaces may have the same >> - canonical type. Those are rejected for coalescing by the >> - types_compatible_p check. */ >> - if (TYPE_CANONICAL (t1) >> - && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2) >> - && types_compatible_p (t1, t2)) >> - return true; >> - >> - return false; >> -} >> - >> /* Strip off a legitimate source ending from the input string NAME of >> length LEN. 
Rather than having to know the names used by all of >> our front ends, we strip off an ending of a period followed by >> diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h >> index ed23eb2..3d1c89f 100644 >> --- a/gcc/gimple-expr.h >> +++ b/gcc/gimple-expr.h >> @@ -28,7 +28,6 @@ extern gimple_seq gimple_body (tree); >> extern bool gimple_has_body_p (tree); >> extern const char *gimple_decl_printable_name (tree, int); >> extern tree copy_var_decl (tree, tree, tree); >> -extern bool gimple_can_coalesce_p (tree, tree); >> extern tree create_tmp_var_name (const char *); >> extern tree create_tmp_var_raw (tree, const char * = NULL); >> extern tree create_tmp_var (tree, const char * = NULL); >> diff --git a/gcc/opts.c b/gcc/opts.c >> index 9793999..5305299 100644 >> --- a/gcc/opts.c >> +++ b/gcc/opts.c >> @@ -448,12 +448,12 @@ static const struct default_options default_options_table[] = >> { OPT_LEVELS_1_PLUS, OPT_fsplit_wide_types, NULL, 1 }, >> { OPT_LEVELS_1_PLUS, OPT_ftree_ccp, NULL, 1 }, >> { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_bit_ccp, NULL, 1 }, >> + { OPT_LEVELS_1_PLUS, OPT_ftree_coalesce_vars, NULL, 1 }, >> { OPT_LEVELS_1_PLUS, OPT_ftree_dce, NULL, 1 }, >> { OPT_LEVELS_1_PLUS, OPT_ftree_dominator_opts, NULL, 1 }, >> { OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 }, >> { OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 }, >> { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 }, >> - { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 }, >> { OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 }, >> { OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 }, >> { OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 }, >> diff --git a/gcc/passes.def b/gcc/passes.def >> index 4690e23..230e089 100644 >> --- a/gcc/passes.def >> +++ b/gcc/passes.def >> @@ -77,7 +77,6 @@ along with GCC; see the file COPYING3. 
If not see >> NEXT_PASS (pass_all_early_optimizations); >> PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations) >> NEXT_PASS (pass_remove_cgraph_callee_edges); >> - NEXT_PASS (pass_rename_ssa_copies); >> NEXT_PASS (pass_object_sizes); >> NEXT_PASS (pass_ccp); >> /* After CCP we rewrite no longer addressed locals into SSA >> @@ -155,7 +154,6 @@ along with GCC; see the file COPYING3. If not see >> /* Initial scalar cleanups before alias computation. >> They ensure memory accesses are not indirect wherever possible. */ >> NEXT_PASS (pass_strip_predict_hints); >> - NEXT_PASS (pass_rename_ssa_copies); >> NEXT_PASS (pass_ccp); >> /* After CCP we rewrite no longer addressed locals into SSA >> form if possible. */ >> @@ -183,7 +181,6 @@ along with GCC; see the file COPYING3. If not see >> NEXT_PASS (pass_ch); >> NEXT_PASS (pass_lower_complex); >> NEXT_PASS (pass_sra); >> - NEXT_PASS (pass_rename_ssa_copies); >> /* The dom pass will also resolve all __builtin_constant_p calls >> that are still there to 0. This has to be done after some >> propagations have already run, but before some more dead code >> @@ -291,7 +288,6 @@ along with GCC; see the file COPYING3. If not see >> NEXT_PASS (pass_fold_builtins); >> NEXT_PASS (pass_optimize_widening_mul); >> NEXT_PASS (pass_tail_calls); >> - NEXT_PASS (pass_rename_ssa_copies); >> /* FIXME: If DCE is not run before checking for uninitialized uses, >> we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c). >> However, this also causes us to misdiagnose cases that should be >> @@ -326,7 +322,6 @@ along with GCC; see the file COPYING3. If not see >> NEXT_PASS (pass_dce); >> NEXT_PASS (pass_asan); >> NEXT_PASS (pass_tsan); >> - NEXT_PASS (pass_rename_ssa_copies); >> /* ??? We do want some kind of loop invariant motion, but we possibly >> need to adjust LIM to be more friendly towards preserving accurate >> debug information here. 
*/ >> diff --git a/gcc/testsuite/gcc.dg/guality/pr54200.c b/gcc/testsuite/gcc.dg/guality/pr54200.c >> index 9b17187..e1e7293 100644 >> --- a/gcc/testsuite/gcc.dg/guality/pr54200.c >> +++ b/gcc/testsuite/gcc.dg/guality/pr54200.c >> @@ -1,6 +1,6 @@ >> /* PR tree-optimization/54200 */ >> /* { dg-do run } */ >> -/* { dg-options "-g -fno-var-tracking-assignments" } */ >> +/* { dg-options "-g -fno-var-tracking-assignments -fno-tree-coalesce-vars" } */ >> >> int o __attribute__((used)); >> >> diff --git a/gcc/testsuite/gcc.dg/ssp-1.c b/gcc/testsuite/gcc.dg/ssp-1.c >> index 5467f4d..db69332 100644 >> --- a/gcc/testsuite/gcc.dg/ssp-1.c >> +++ b/gcc/testsuite/gcc.dg/ssp-1.c >> @@ -12,7 +12,7 @@ __stack_chk_fail (void) >> >> int main () >> { >> - int i; >> + register int i; >> char foo[255]; >> >> // smash stack >> diff --git a/gcc/testsuite/gcc.dg/ssp-2.c b/gcc/testsuite/gcc.dg/ssp-2.c >> index 9a7ac32..752fe53 100644 >> --- a/gcc/testsuite/gcc.dg/ssp-2.c >> +++ b/gcc/testsuite/gcc.dg/ssp-2.c >> @@ -14,7 +14,7 @@ __stack_chk_fail (void) >> void >> overflow() >> { >> - int i = 0; >> + register int i = 0; >> char foo[30]; >> >> /* Overflow buffer. */ >> diff --git a/gcc/testsuite/gcc.dg/torture/parm-coalesce.c b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c >> new file mode 100644 >> index 0000000..dbd81c1 >> --- /dev/null >> +++ b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c >> @@ -0,0 +1,40 @@ >> +/* { dg-do run } */ >> + >> +#include >> + >> +/* Make sure we don't coalesce both incoming parms, one whose incoming >> + value is unused, to the same location, so as to overwrite one of >> + them with the incoming value of the other. */ >> + >> +int __attribute__((noinline, noclone)) >> +foo (int i, int j) >> +{ >> + j = i; /* The incoming value for J is unused. */ >> + i = 2; >> + if (j) >> + j++; >> + j += i + 1; >> + return j; >> +} >> + >> +/* Same as foo, but with swapped parameters. 
*/ >> +int __attribute__((noinline, noclone)) >> +bar (int j, int i) >> +{ >> + j = i; /* The incoming value for J is unused. */ >> + i = 2; >> + if (j) >> + j++; >> + j += i + 1; >> + return j; >> +} >> + >> +int >> +main (void) >> +{ >> + if (foo (0, 1) != 3) >> + abort (); >> + if (bar (1, 0) != 3) >> + abort (); >> + return 0; >> +} >> diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c >> index e23bc0b..59d91c6 100644 >> --- a/gcc/tree-outof-ssa.c >> +++ b/gcc/tree-outof-ssa.c >> @@ -305,7 +305,6 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus) >> rtx dest_rtx, seq, x; >> machine_mode dest_mode, src_mode; >> int unsignedp; >> - tree var; >> >> if (dump_file && (dump_flags & TDF_DETAILS)) >> { >> @@ -327,12 +326,12 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus) >> >> start_sequence (); >> >> - var = SSA_NAME_VAR (partition_to_var (SA.map, dest)); >> + tree name = partition_to_var (SA.map, dest); >> src_mode = TYPE_MODE (TREE_TYPE (src)); >> dest_mode = GET_MODE (dest_rtx); >> - gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var))); >> + gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (name))); >> gcc_assert (!REG_P (dest_rtx) >> - || dest_mode == promote_decl_mode (var, &unsignedp)); >> + || dest_mode == promote_ssa_mode (name, &unsignedp)); >> >> if (src_mode != dest_mode) >> { >> @@ -708,13 +707,12 @@ elim_backward (elim_graph g, int T) >> static rtx >> get_temp_reg (tree name) >> { >> - tree var = TREE_CODE (name) == SSA_NAME ? 
SSA_NAME_VAR (name) : name; >> - tree type = TREE_TYPE (var); >> + tree type = TREE_TYPE (name); >> int unsignedp; >> - machine_mode reg_mode = promote_decl_mode (var, &unsignedp); >> + machine_mode reg_mode = promote_ssa_mode (name, &unsignedp); >> rtx x = gen_reg_rtx (reg_mode); >> if (POINTER_TYPE_P (type)) >> - mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var)))); >> + mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type))); >> return x; >> } >> >> @@ -1014,7 +1012,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa) >> >> /* Return to viewing the variable list as just all reference variables after >> coalescing has been performed. */ >> - partition_view_normal (map, false); >> + partition_view_normal (map); >> >> if (dump_file && (dump_flags & TDF_DETAILS)) >> { >> diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c >> index b05a860..9ffa3f1 100644 >> --- a/gcc/tree-ssa-coalesce.c >> +++ b/gcc/tree-ssa-coalesce.c >> @@ -58,6 +58,7 @@ along with GCC; see the file COPYING3. If not see >> #include "tree-ssanames.h" >> #include "tree-ssa-live.h" >> #include "tree-ssa-coalesce.h" >> +#include "explow.h" >> #include "diagnostic-core.h" >> >> >> @@ -830,6 +831,16 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo) >> basic_block bb; >> ssa_op_iter iter; >> live_track_p live; >> + basic_block entry; >> + >> + /* If inter-variable coalescing is enabled, we may attempt to >> + coalesce variables from different base variables, including >> + different parameters, so we have to make sure default defs live >> + at the entry block conflict with each other. 
*/ >> + if (flag_tree_coalesce_vars) >> + entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); >> + else >> + entry = NULL; >> >> map = live_var_map (liveinfo); >> graph = ssa_conflicts_new (num_var_partitions (map)); >> @@ -888,6 +899,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo) >> live_track_process_def (live, result, graph); >> } >> >> + /* Pretend there are defs for params' default defs at the start >> + of the (post-)entry block. */ >> + if (bb == entry) >> + { >> + unsigned base; >> + bitmap_iterator bi; >> + EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi) >> + { >> + bitmap_iterator bi2; >> + unsigned part; >> + EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base], >> + 0, part, bi2) >> + { >> + tree var = partition_to_var (map, part); >> + if (!SSA_NAME_VAR (var) >> + || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL >> + && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL) >> + || !SSA_NAME_IS_DEFAULT_DEF (var)) >> + continue; >> + live_track_process_def (live, var, graph); >> + } >> + } >> + } >> + >> live_track_clear_base_vars (live); >> } >> >> @@ -1156,6 +1191,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y, >> { >> var1 = partition_to_var (map, p1); >> var2 = partition_to_var (map, p2); >> + >> z = var_union (map, var1, var2); >> if (z == NO_PARTITION) >> { >> @@ -1173,6 +1209,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y, >> >> if (debug) >> fprintf (debug, ": Success -> %d\n", z); >> + >> return true; >> } >> >> @@ -1270,6 +1307,330 @@ ssa_name_var_hash::equal (const tree_node *n1, const tree_node *n2) >> } >> >> >> +/* Output partition map MAP with coalescing plan PART to file F. 
*/ >> + >> +void >> +dump_part_var_map (FILE *f, partition part, var_map map) >> +{ >> + int t; >> + unsigned x, y; >> + int p; >> + >> + fprintf (f, "\nCoalescible Partition map \n\n"); >> + >> + for (x = 0; x < map->num_partitions; x++) >> + { >> + if (map->view_to_partition != NULL) >> + p = map->view_to_partition[x]; >> + else >> + p = x; >> + >> + if (ssa_name (p) == NULL_TREE >> + || virtual_operand_p (ssa_name (p))) >> + continue; >> + >> + t = 0; >> + for (y = 1; y < num_ssa_names; y++) >> + { >> + tree var = version_to_var (map, y); >> + if (!var) >> + continue; >> + int q = var_to_partition (map, var); >> + p = partition_find (part, q); >> + gcc_assert (map->partition_to_base_index[q] >> + == map->partition_to_base_index[p]); >> + >> + if (p == (int)x) >> + { >> + if (t++ == 0) >> + { >> + fprintf (f, "Partition %d, base %d (", x, >> + map->partition_to_base_index[q]); >> + print_generic_expr (f, partition_to_var (map, q), TDF_SLIM); >> + fprintf (f, " - "); >> + } >> + fprintf (f, "%d ", y); >> + } >> + } >> + if (t != 0) >> + fprintf (f, ")\n"); >> + } >> + fprintf (f, "\n"); >> +} >> + >> +/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for >> + coalescing together, false otherwise. >> + >> + This must stay consistent with var_map_base_init in tree-ssa-live.c. */ >> + >> +bool >> +gimple_can_coalesce_p (tree name1, tree name2) >> +{ >> + /* First check the SSA_NAME's associated DECL. Without >> + optimization, we only want to coalesce if they have the same DECL >> + or both have no associated DECL. */ >> + tree var1 = SSA_NAME_VAR (name1); >> + tree var2 = SSA_NAME_VAR (name2); >> + var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE; >> + var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE; >> + if (var1 != var2 && !flag_tree_coalesce_vars) >> + return false; >> + >> + /* Now check the types. If the types are the same, then we should >> + try to coalesce V1 and V2. 
*/ >> + tree t1 = TREE_TYPE (name1); >> + tree t2 = TREE_TYPE (name2); >> + if (t1 == t2) >> + { >> + check_modes: >> + /* If the base variables are the same, we're good: none of the >> + other tests below could possibly fail. */ >> + var1 = SSA_NAME_VAR (name1); >> + var2 = SSA_NAME_VAR (name2); >> + if (var1 == var2) >> + return true; >> + >> + /* We don't want to coalesce two SSA names if one of the base >> + variables is supposed to be a register while the other is >> + supposed to be on the stack. Anonymous SSA names take >> + registers, but when not optimizing, user variables should go >> + on the stack, so coalescing them with the anonymous variable >> + as the partition leader would end up assigning the user >> + variable to a register. Don't do that! */ >> + bool reg1 = !var1 || use_register_for_decl (var1); >> + bool reg2 = !var2 || use_register_for_decl (var2); >> + if (reg1 != reg2) >> + return false; >> + >> + /* Check that the promoted modes are the same. We don't want to >> + coalesce if the promoted modes would be different. Only >> + PARM_DECLs and RESULT_DECLs have different promotion rules, >> + so skip the test if we both are variables or anonymous >> + SSA_NAMEs. */ >> + return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2))) >> + || promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL); >> + } >> + >> + /* If the types are not the same, check for a canonical type match. This >> + (for example) allows coalescing when the types are fundamentally the >> + same, but just have different names. >> + >> + Note pointer types with different address spaces may have the same >> + canonical type. Those are rejected for coalescing by the >> + types_compatible_p check. 
*/ >> + if (TYPE_CANONICAL (t1) >> + && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2) >> + && types_compatible_p (t1, t2)) >> + goto check_modes; >> + >> + return false; >> +} >> + >> +/* Fill in MAP's partition_to_base_index, with one index for each >> + partition of SSA names USED_IN_COPIES and related by CL coalesce >> + possibilities. This must match gimple_can_coalesce_p in the >> + optimized case. */ >> + >> +static void >> +compute_optimized_partition_bases (var_map map, bitmap used_in_copies, >> + coalesce_list_p cl) >> +{ >> + int parts = num_var_partitions (map); >> + partition tentative = partition_new (parts); >> + >> + /* Partition the SSA versions so that, for each coalescible >> + pair, both of its members are in the same partition in >> + TENTATIVE. */ >> + gcc_assert (!cl->sorted); >> + coalesce_pair_p node; >> + coalesce_iterator_type ppi; >> + FOR_EACH_PARTITION_PAIR (node, ppi, cl) >> + { >> + tree v1 = ssa_name (node->first_element); >> + int p1 = partition_find (tentative, var_to_partition (map, v1)); >> + tree v2 = ssa_name (node->second_element); >> + int p2 = partition_find (tentative, var_to_partition (map, v2)); >> + >> + if (p1 == p2) >> + continue; >> + >> + partition_union (tentative, p1, p2); >> + } >> + >> + /* We have to deal with cost one pairs too. */ >> + for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next) >> + { >> + tree v1 = ssa_name (co->first_element); >> + int p1 = partition_find (tentative, var_to_partition (map, v1)); >> + tree v2 = ssa_name (co->second_element); >> + int p2 = partition_find (tentative, var_to_partition (map, v2)); >> + >> + if (p1 == p2) >> + continue; >> + >> + partition_union (tentative, p1, p2); >> + } >> + >> + /* And also with abnormal edges. 
*/ >> + basic_block bb; >> + edge e; >> + edge_iterator ei; >> + FOR_EACH_BB_FN (bb, cfun) >> + { >> + FOR_EACH_EDGE (e, ei, bb->preds) >> + if (e->flags & EDGE_ABNORMAL) >> + { >> + gphi_iterator gsi; >> + for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); >> + gsi_next (&gsi)) >> + { >> + gphi *phi = gsi.phi (); >> + tree arg = PHI_ARG_DEF (phi, e->dest_idx); >> + if (SSA_NAME_IS_DEFAULT_DEF (arg) >> + && (!SSA_NAME_VAR (arg) >> + || TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL)) >> + continue; >> + >> + tree res = PHI_RESULT (phi); >> + >> + int p1 = partition_find (tentative, var_to_partition (map, res)); >> + int p2 = partition_find (tentative, var_to_partition (map, arg)); >> + >> + if (p1 == p2) >> + continue; >> + >> + partition_union (tentative, p1, p2); >> + } >> + } >> + } >> + >> + map->partition_to_base_index = XCNEWVEC (int, parts); >> + auto_vec index_map (parts); >> + if (parts) >> + index_map.quick_grow (parts); >> + >> + const unsigned no_part = -1; >> + unsigned count = parts; >> + while (count) >> + index_map[--count] = no_part; >> + >> + /* Initialize MAP's mapping from partition to base index, using >> + as base indices an enumeration of the TENTATIVE partitions in >> + which each SSA version ended up, so that we compute conflicts >> + between all SSA versions that ended up in the same potential >> + coalesce partition. 
*/ >> + bitmap_iterator bi; >> + unsigned i; >> + EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi) >> + { >> + int pidx = var_to_partition (map, ssa_name (i)); >> + int base = partition_find (tentative, pidx); >> + if (index_map[base] != no_part) >> + continue; >> + index_map[base] = count++; >> + } >> + >> + map->num_basevars = count; >> + >> + EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi) >> + { >> + int pidx = var_to_partition (map, ssa_name (i)); >> + int base = partition_find (tentative, pidx); >> + gcc_assert (index_map[base] < count); >> + map->partition_to_base_index[pidx] = index_map[base]; >> + } >> + >> + if (dump_file && (dump_flags & TDF_DETAILS)) >> + dump_part_var_map (dump_file, tentative, map); >> + >> + partition_delete (tentative); >> +} >> + >> +/* Hashtable helpers. */ >> + >> +struct tree_int_map_hasher : typed_noop_remove >> +{ >> + typedef tree_int_map *value_type; >> + typedef tree_int_map *compare_type; >> + static inline hashval_t hash (const tree_int_map *); >> + static inline bool equal (const tree_int_map *, const tree_int_map *); >> +}; >> + >> +inline hashval_t >> +tree_int_map_hasher::hash (const tree_int_map *v) >> +{ >> + return tree_map_base_hash (v); >> +} >> + >> +inline bool >> +tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c) >> +{ >> + return tree_int_map_eq (v, c); >> +} >> + >> +/* This routine will initialize the basevar fields of MAP with base >> + names. Partitions will share the same base if they have the same >> + SSA_NAME_VAR, or, being anonymous variables, the same type. This >> + must match gimple_can_coalesce_p in the non-optimized case. 
*/ >> + >> +static void >> +compute_samebase_partition_bases (var_map map) >> +{ >> + int x, num_part; >> + tree var; >> + struct tree_int_map *m, *mapstorage; >> + >> + num_part = num_var_partitions (map); >> + hash_table tree_to_index (num_part); >> + /* We can have at most num_part entries in the hash tables, so it's >> + enough to allocate so many map elements once, saving some malloc >> + calls. */ >> + mapstorage = m = XNEWVEC (struct tree_int_map, num_part); >> + >> + /* If a base table already exists, clear it, otherwise create it. */ >> + free (map->partition_to_base_index); >> + map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part); >> + >> + /* Build the base variable list, and point partitions at their bases. */ >> + for (x = 0; x < num_part; x++) >> + { >> + struct tree_int_map **slot; >> + unsigned baseindex; >> + var = partition_to_var (map, x); >> + if (SSA_NAME_VAR (var) >> + && (!VAR_P (SSA_NAME_VAR (var)) >> + || !DECL_IGNORED_P (SSA_NAME_VAR (var)))) >> + m->base.from = SSA_NAME_VAR (var); >> + else >> + /* This restricts what anonymous SSA names we can coalesce >> + as it restricts the sets we compute conflicts for. >> + Using TREE_TYPE to generate sets is the easies as >> + type equivalency also holds for SSA names with the same >> + underlying decl. >> + >> + Check gimple_can_coalesce_p when changing this code. */ >> + m->base.from = (TYPE_CANONICAL (TREE_TYPE (var)) >> + ? TYPE_CANONICAL (TREE_TYPE (var)) >> + : TREE_TYPE (var)); >> + /* If base variable hasn't been seen, set it up. */ >> + slot = tree_to_index.find_slot (m, INSERT); >> + if (!*slot) >> + { >> + baseindex = m - mapstorage; >> + m->to = baseindex; >> + *slot = m; >> + m++; >> + } >> + else >> + baseindex = (*slot)->to; >> + map->partition_to_base_index[x] = baseindex; >> + } >> + >> + map->num_basevars = m - mapstorage; >> + >> + free (mapstorage); >> +} >> + >> /* Reduce the number of copies by coalescing variables in the function. 
Return >> a partition map with the resulting coalesces. */ >> >> @@ -1286,9 +1647,10 @@ coalesce_ssa_name (void) >> cl = create_coalesce_list (); >> map = create_outofssa_var_map (cl, used_in_copies); >> >> - /* If optimization is disabled, we need to coalesce all the names originating >> - from the same SSA_NAME_VAR so debug info remains undisturbed. */ >> - if (!optimize) >> + /* If this optimization is disabled, we need to coalesce all the >> + names originating from the same SSA_NAME_VAR so debug info >> + remains undisturbed. */ >> + if (!flag_tree_coalesce_vars) >> { >> hash_table ssa_name_hash (10); >> >> @@ -1329,8 +1691,13 @@ coalesce_ssa_name (void) >> if (dump_file && (dump_flags & TDF_DETAILS)) >> dump_var_map (dump_file, map); >> >> - /* Don't calculate live ranges for variables not in the coalesce list. */ >> - partition_view_bitmap (map, used_in_copies, true); >> + partition_view_bitmap (map, used_in_copies); >> + >> + if (flag_tree_coalesce_vars) >> + compute_optimized_partition_bases (map, used_in_copies, cl); >> + else >> + compute_samebase_partition_bases (map); >> + >> BITMAP_FREE (used_in_copies); >> >> if (num_var_partitions (map) < 1) >> @@ -1369,8 +1736,7 @@ coalesce_ssa_name (void) >> >> /* Now coalesce everything in the list. */ >> coalesce_partitions (map, graph, cl, >> - ((dump_flags & TDF_DETAILS) ? dump_file >> - : NULL)); >> + ((dump_flags & TDF_DETAILS) ? dump_file : NULL)); >> >> delete_coalesce_list (cl); >> ssa_conflicts_delete (graph); >> diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h >> index 99b188a..ae289b4 100644 >> --- a/gcc/tree-ssa-coalesce.h >> +++ b/gcc/tree-ssa-coalesce.h >> @@ -21,5 +21,6 @@ along with GCC; see the file COPYING3. 
If not see >> #define GCC_TREE_SSA_COALESCE_H >> >> extern var_map coalesce_ssa_name (void); >> +extern bool gimple_can_coalesce_p (tree, tree); >> >> #endif /* GCC_TREE_SSA_COALESCE_H */ >> diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c >> deleted file mode 100644 >> index f3cb56e..0000000 >> --- a/gcc/tree-ssa-copyrename.c >> +++ /dev/null >> @@ -1,499 +0,0 @@ >> -/* Rename SSA copies. >> - Copyright (C) 2004-2015 Free Software Foundation, Inc. >> - Contributed by Andrew MacLeod >> - >> -This file is part of GCC. >> - >> -GCC is free software; you can redistribute it and/or modify >> -it under the terms of the GNU General Public License as published by >> -the Free Software Foundation; either version 3, or (at your option) >> -any later version. >> - >> -GCC is distributed in the hope that it will be useful, >> -but WITHOUT ANY WARRANTY; without even the implied warranty of >> -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the >> -GNU General Public License for more details. >> - >> -You should have received a copy of the GNU General Public License >> -along with GCC; see the file COPYING3. If not see >> -. 
*/ >> - >> -#include "config.h" >> -#include "system.h" >> -#include "coretypes.h" >> -#include "tm.h" >> -#include "hash-set.h" >> -#include "machmode.h" >> -#include "vec.h" >> -#include "double-int.h" >> -#include "input.h" >> -#include "alias.h" >> -#include "symtab.h" >> -#include "wide-int.h" >> -#include "inchash.h" >> -#include "tree.h" >> -#include "fold-const.h" >> -#include "predict.h" >> -#include "hard-reg-set.h" >> -#include "function.h" >> -#include "dominance.h" >> -#include "cfg.h" >> -#include "basic-block.h" >> -#include "tree-ssa-alias.h" >> -#include "internal-fn.h" >> -#include "gimple-expr.h" >> -#include "is-a.h" >> -#include "gimple.h" >> -#include "gimple-iterator.h" >> -#include "flags.h" >> -#include "tree-pretty-print.h" >> -#include "bitmap.h" >> -#include "gimple-ssa.h" >> -#include "stringpool.h" >> -#include "tree-ssanames.h" >> -#include "hashtab.h" >> -#include "rtl.h" >> -#include "statistics.h" >> -#include "real.h" >> -#include "fixed-value.h" >> -#include "insn-config.h" >> -#include "expmed.h" >> -#include "dojump.h" >> -#include "explow.h" >> -#include "calls.h" >> -#include "emit-rtl.h" >> -#include "varasm.h" >> -#include "stmt.h" >> -#include "expr.h" >> -#include "tree-dfa.h" >> -#include "tree-inline.h" >> -#include "tree-ssa-live.h" >> -#include "tree-pass.h" >> -#include "langhooks.h" >> - >> -static struct >> -{ >> - /* Number of copies coalesced. */ >> - int coalesced; >> -} stats; >> - >> -/* The following routines implement the SSA copy renaming phase. >> - >> - This optimization looks for copies between 2 SSA_NAMES, either through a >> - direct copy, or an implicit one via a PHI node result and its arguments. >> - >> - Each copy is examined to determine if it is possible to rename the base >> - variable of one of the operands to the same variable as the other operand. >> - i.e. 
>> - T.3_5 = >> - a_1 = T.3_5 >> - >> - If this copy couldn't be copy propagated, it could possibly remain in the >> - program throughout the optimization phases. After SSA->normal, it would >> - become: >> - >> - T.3 = >> - a = T.3 >> - >> - Since T.3_5 is distinct from all other SSA versions of T.3, there is no >> - fundamental reason why the base variable needs to be T.3, subject to >> - certain restrictions. This optimization attempts to determine if we can >> - change the base variable on copies like this, and result in code such as: >> - >> - a_5 = >> - a_1 = a_5 >> - >> - This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is >> - possible, the copy goes away completely. If it isn't possible, a new temp >> - will be created for a_5, and you will end up with the exact same code: >> - >> - a.8 = >> - a = a.8 >> - >> - The other benefit of performing this optimization relates to what variables >> - are chosen in copies. Gimplification of the program uses temporaries for >> - a lot of things. expressions like >> - >> - a_1 = >> - = a_1 >> - >> - get turned into >> - >> - T.3_5 = >> - a_1 = T.3_5 >> - = a_1 >> - >> - Copy propagation is done in a forward direction, and if we can propagate >> - through the copy, we end up with: >> - >> - T.3_5 = >> - = T.3_5 >> - >> - The copy is gone, but so is all reference to the user variable 'a'. By >> - performing this optimization, we would see the sequence: >> - >> - a_5 = >> - a_1 = a_5 >> - = a_1 >> - >> - which copy propagation would then turn into: >> - >> - a_5 = >> - = a_5 >> - >> - and so we still retain the user variable whenever possible. */ >> - >> - >> -/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid. >> - Choose a representative for the partition, and send debug info to DEBUG. 
*/ >> - >> -static void >> -copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug) >> -{ >> - int p1, p2, p3; >> - tree root1, root2; >> - tree rep1, rep2; >> - bool ign1, ign2, abnorm; >> - >> - gcc_assert (TREE_CODE (var1) == SSA_NAME); >> - gcc_assert (TREE_CODE (var2) == SSA_NAME); >> - >> - register_ssa_partition (map, var1); >> - register_ssa_partition (map, var2); >> - >> - p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1)); >> - p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2)); >> - >> - if (debug) >> - { >> - fprintf (debug, "Try : "); >> - print_generic_expr (debug, var1, TDF_SLIM); >> - fprintf (debug, "(P%d) & ", p1); >> - print_generic_expr (debug, var2, TDF_SLIM); >> - fprintf (debug, "(P%d)", p2); >> - } >> - >> - gcc_assert (p1 != NO_PARTITION); >> - gcc_assert (p2 != NO_PARTITION); >> - >> - if (p1 == p2) >> - { >> - if (debug) >> - fprintf (debug, " : Already coalesced.\n"); >> - return; >> - } >> - >> - rep1 = partition_to_var (map, p1); >> - rep2 = partition_to_var (map, p2); >> - root1 = SSA_NAME_VAR (rep1); >> - root2 = SSA_NAME_VAR (rep2); >> - if (!root1 && !root2) >> - return; >> - >> - /* Don't coalesce if one of the variables occurs in an abnormal PHI. */ >> - abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1) >> - || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2)); >> - if (abnorm) >> - { >> - if (debug) >> - fprintf (debug, " : Abnormal PHI barrier. No coalesce.\n"); >> - return; >> - } >> - >> - /* Partitions already have the same root, simply merge them. */ >> - if (root1 == root2) >> - { >> - p1 = partition_union (map->var_partition, p1, p2); >> - if (debug) >> - fprintf (debug, " : Same root, coalesced --> P%d.\n", p1); >> - return; >> - } >> - >> - /* Never attempt to coalesce 2 different parameters. */ >> - if ((root1 && TREE_CODE (root1) == PARM_DECL) >> - && (root2 && TREE_CODE (root2) == PARM_DECL)) >> - { >> - if (debug) >> - fprintf (debug, " : 2 different PARM_DECLS. 
No coalesce.\n"); >> - return; >> - } >> - >> - if ((root1 && TREE_CODE (root1) == RESULT_DECL) >> - != (root2 && TREE_CODE (root2) == RESULT_DECL)) >> - { >> - if (debug) >> - fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n"); >> - return; >> - } >> - >> - ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1)); >> - ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2)); >> - >> - /* Refrain from coalescing user variables, if requested. */ >> - if (!ign1 && !ign2) >> - { >> - if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2)) >> - ign2 = true; >> - else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1)) >> - ign1 = true; >> - else if (flag_ssa_coalesce_vars != 2) >> - { >> - if (debug) >> - fprintf (debug, " : 2 different USER vars. No coalesce.\n"); >> - return; >> - } >> - else >> - ign2 = true; >> - } >> - >> - /* If both values have default defs, we can't coalesce. If only one has a >> - tag, make sure that variable is the new root partition. */ >> - if (root1 && ssa_default_def (cfun, root1)) >> - { >> - if (root2 && ssa_default_def (cfun, root2)) >> - { >> - if (debug) >> - fprintf (debug, " : 2 default defs. No coalesce.\n"); >> - return; >> - } >> - else >> - { >> - ign2 = true; >> - ign1 = false; >> - } >> - } >> - else if (root2 && ssa_default_def (cfun, root2)) >> - { >> - ign1 = true; >> - ign2 = false; >> - } >> - >> - /* Do not coalesce if we cannot assign a symbol to the partition. */ >> - if (!(!ign2 && root2) >> - && !(!ign1 && root1)) >> - { >> - if (debug) >> - fprintf (debug, " : Choosen variable has no root. No coalesce.\n"); >> - return; >> - } >> - >> - /* Don't coalesce if the new chosen root variable would be read-only. >> - If both ign1 && ign2, then the root var of the larger partition >> - wins, so reject in that case if any of the root vars is TREE_READONLY. >> - Otherwise reject only if the root var, on which replace_ssa_name_symbol >> - will be called below, is readonly. 
*/ >> - if (((root1 && TREE_READONLY (root1)) && ign2) >> - || ((root2 && TREE_READONLY (root2)) && ign1)) >> - { >> - if (debug) >> - fprintf (debug, " : Readonly variable. No coalesce.\n"); >> - return; >> - } >> - >> - /* Don't coalesce if the two variables aren't type compatible . */ >> - if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2)) >> - /* There is a disconnect between the middle-end type-system and >> - VRP, avoid coalescing enum types with different bounds. */ >> - || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE >> - || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE) >> - && TREE_TYPE (var1) != TREE_TYPE (var2))) >> - { >> - if (debug) >> - fprintf (debug, " : Incompatible types. No coalesce.\n"); >> - return; >> - } >> - >> - /* Merge the two partitions. */ >> - p3 = partition_union (map->var_partition, p1, p2); >> - >> - /* Set the root variable of the partition to the better choice, if there is >> - one. */ >> - if (!ign2 && root2) >> - replace_ssa_name_symbol (partition_to_var (map, p3), root2); >> - else if (!ign1 && root1) >> - replace_ssa_name_symbol (partition_to_var (map, p3), root1); >> - else >> - gcc_unreachable (); >> - >> - if (debug) >> - { >> - fprintf (debug, " --> P%d ", p3); >> - print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)), >> - TDF_SLIM); >> - fprintf (debug, "\n"); >> - } >> -} >> - >> - >> -namespace { >> - >> -const pass_data pass_data_rename_ssa_copies = >> -{ >> - GIMPLE_PASS, /* type */ >> - "copyrename", /* name */ >> - OPTGROUP_NONE, /* optinfo_flags */ >> - TV_TREE_COPY_RENAME, /* tv_id */ >> - ( PROP_cfg | PROP_ssa ), /* properties_required */ >> - 0, /* properties_provided */ >> - 0, /* properties_destroyed */ >> - 0, /* todo_flags_start */ >> - 0, /* todo_flags_finish */ >> -}; >> - >> -class pass_rename_ssa_copies : public gimple_opt_pass >> -{ >> -public: >> - pass_rename_ssa_copies (gcc::context *ctxt) >> - : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt) >> - {} >> - >> - /* 
opt_pass methods: */ >> - opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); } >> - virtual bool gate (function *) { return flag_tree_copyrename != 0; } >> - virtual unsigned int execute (function *); >> - >> -}; // class pass_rename_ssa_copies >> - >> -/* This function will make a pass through the IL, and attempt to coalesce any >> - SSA versions which occur in PHI's or copies. Coalescing is accomplished by >> - changing the underlying root variable of all coalesced version. This will >> - then cause the SSA->normal pass to attempt to coalesce them all to the same >> - variable. */ >> - >> -unsigned int >> -pass_rename_ssa_copies::execute (function *fun) >> -{ >> - var_map map; >> - basic_block bb; >> - tree var, part_var; >> - gimple stmt; >> - unsigned x; >> - FILE *debug; >> - >> - memset (&stats, 0, sizeof (stats)); >> - >> - if (dump_file && (dump_flags & TDF_DETAILS)) >> - debug = dump_file; >> - else >> - debug = NULL; >> - >> - map = init_var_map (num_ssa_names); >> - >> - FOR_EACH_BB_FN (bb, fun) >> - { >> - /* Scan for real copies. */ >> - for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi); >> - gsi_next (&gsi)) >> - { >> - stmt = gsi_stmt (gsi); >> - if (gimple_assign_ssa_name_copy_p (stmt)) >> - { >> - tree lhs = gimple_assign_lhs (stmt); >> - tree rhs = gimple_assign_rhs1 (stmt); >> - >> - copy_rename_partition_coalesce (map, lhs, rhs, debug); >> - } >> - } >> - } >> - >> - FOR_EACH_BB_FN (bb, fun) >> - { >> - /* Treat PHI nodes as copies between the result and each argument. */ >> - for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi); >> - gsi_next (&gsi)) >> - { >> - size_t i; >> - tree res; >> - gphi *phi = gsi.phi (); >> - res = gimple_phi_result (phi); >> - >> - /* Do not process virtual SSA_NAMES. */ >> - if (virtual_operand_p (res)) >> - continue; >> - >> - /* Make sure to only use the same partition for an argument >> - as the result but never the other way around. 
*/ >> - if (SSA_NAME_VAR (res) >> - && !DECL_IGNORED_P (SSA_NAME_VAR (res))) >> - for (i = 0; i < gimple_phi_num_args (phi); i++) >> - { >> - tree arg = PHI_ARG_DEF (phi, i); >> - if (TREE_CODE (arg) == SSA_NAME) >> - copy_rename_partition_coalesce (map, res, arg, >> - debug); >> - } >> - /* Else if all arguments are in the same partition try to merge >> - it with the result. */ >> - else >> - { >> - int all_p_same = -1; >> - int p = -1; >> - for (i = 0; i < gimple_phi_num_args (phi); i++) >> - { >> - tree arg = PHI_ARG_DEF (phi, i); >> - if (TREE_CODE (arg) != SSA_NAME) >> - { >> - all_p_same = 0; >> - break; >> - } >> - else if (all_p_same == -1) >> - { >> - p = partition_find (map->var_partition, >> - SSA_NAME_VERSION (arg)); >> - all_p_same = 1; >> - } >> - else if (all_p_same == 1 >> - && p != partition_find (map->var_partition, >> - SSA_NAME_VERSION (arg))) >> - { >> - all_p_same = 0; >> - break; >> - } >> - } >> - if (all_p_same == 1) >> - copy_rename_partition_coalesce (map, res, >> - PHI_ARG_DEF (phi, 0), >> - debug); >> - } >> - } >> - } >> - >> - if (debug) >> - dump_var_map (debug, map); >> - >> - /* Now one more pass to make all elements of a partition share the same >> - root variable. 
*/ >> - >> - for (x = 1; x < num_ssa_names; x++) >> - { >> - part_var = partition_to_var (map, x); >> - if (!part_var) >> - continue; >> - var = ssa_name (x); >> - if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var)) >> - continue; >> - if (debug) >> - { >> - fprintf (debug, "Coalesced "); >> - print_generic_expr (debug, var, TDF_SLIM); >> - fprintf (debug, " to "); >> - print_generic_expr (debug, part_var, TDF_SLIM); >> - fprintf (debug, "\n"); >> - } >> - stats.coalesced++; >> - replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var)); >> - } >> - >> - statistics_counter_event (fun, "copies coalesced", >> - stats.coalesced); >> - delete_var_map (map); >> - return 0; >> -} >> - >> -} // anon namespace >> - >> -gimple_opt_pass * >> -make_pass_rename_ssa_copies (gcc::context *ctxt) >> -{ >> - return new pass_rename_ssa_copies (ctxt); >> -} >> diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c >> index 2c7c072..821b2f4 100644 >> --- a/gcc/tree-ssa-live.c >> +++ b/gcc/tree-ssa-live.c >> @@ -100,90 +100,6 @@ static void verify_live_on_entry (tree_live_info_p); >> ssa_name or variable, and vice versa. */ >> >> >> -/* Hashtable helpers. */ >> - >> -struct tree_int_map_hasher : typed_noop_remove >> -{ >> - typedef tree_int_map *value_type; >> - typedef tree_int_map *compare_type; >> - static inline hashval_t hash (const tree_int_map *); >> - static inline bool equal (const tree_int_map *, const tree_int_map *); >> -}; >> - >> -inline hashval_t >> -tree_int_map_hasher::hash (const tree_int_map *v) >> -{ >> - return tree_map_base_hash (v); >> -} >> - >> -inline bool >> -tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c) >> -{ >> - return tree_int_map_eq (v, c); >> -} >> - >> - >> -/* This routine will initialize the basevar fields of MAP. 
*/ >> - >> -static void >> -var_map_base_init (var_map map) >> -{ >> - int x, num_part; >> - tree var; >> - struct tree_int_map *m, *mapstorage; >> - >> - num_part = num_var_partitions (map); >> - hash_table tree_to_index (num_part); >> - /* We can have at most num_part entries in the hash tables, so it's >> - enough to allocate so many map elements once, saving some malloc >> - calls. */ >> - mapstorage = m = XNEWVEC (struct tree_int_map, num_part); >> - >> - /* If a base table already exists, clear it, otherwise create it. */ >> - free (map->partition_to_base_index); >> - map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part); >> - >> - /* Build the base variable list, and point partitions at their bases. */ >> - for (x = 0; x < num_part; x++) >> - { >> - struct tree_int_map **slot; >> - unsigned baseindex; >> - var = partition_to_var (map, x); >> - if (SSA_NAME_VAR (var) >> - && (!VAR_P (SSA_NAME_VAR (var)) >> - || !DECL_IGNORED_P (SSA_NAME_VAR (var)))) >> - m->base.from = SSA_NAME_VAR (var); >> - else >> - /* This restricts what anonymous SSA names we can coalesce >> - as it restricts the sets we compute conflicts for. >> - Using TREE_TYPE to generate sets is the easies as >> - type equivalency also holds for SSA names with the same >> - underlying decl. >> - >> - Check gimple_can_coalesce_p when changing this code. */ >> - m->base.from = (TYPE_CANONICAL (TREE_TYPE (var)) >> - ? TYPE_CANONICAL (TREE_TYPE (var)) >> - : TREE_TYPE (var)); >> - /* If base variable hasn't been seen, set it up. */ >> - slot = tree_to_index.find_slot (m, INSERT); >> - if (!*slot) >> - { >> - baseindex = m - mapstorage; >> - m->to = baseindex; >> - *slot = m; >> - m++; >> - } >> - else >> - baseindex = (*slot)->to; >> - map->partition_to_base_index[x] = baseindex; >> - } >> - >> - map->num_basevars = m - mapstorage; >> - >> - free (mapstorage); >> -} >> - >> - >> /* Remove the base table in MAP. 
*/ >> >> static void >> @@ -361,21 +277,17 @@ partition_view_fini (var_map map, bitmap selected) >> } >> >> >> -/* Create a partition view which includes all the used partitions in MAP. If >> - WANT_BASES is true, create the base variable map as well. */ >> +/* Create a partition view which includes all the used partitions in MAP. */ >> >> void >> -partition_view_normal (var_map map, bool want_bases) >> +partition_view_normal (var_map map) >> { >> bitmap used; >> >> used = partition_view_init (map); >> partition_view_fini (map, used); >> >> - if (want_bases) >> - var_map_base_init (map); >> - else >> - var_map_base_fini (map); >> + var_map_base_fini (map); >> } >> >> >> @@ -384,7 +296,7 @@ partition_view_normal (var_map map, bool want_bases) >> as well. */ >> >> void >> -partition_view_bitmap (var_map map, bitmap only, bool want_bases) >> +partition_view_bitmap (var_map map, bitmap only) >> { >> bitmap used; >> bitmap new_partitions = BITMAP_ALLOC (NULL); >> @@ -400,10 +312,7 @@ partition_view_bitmap (var_map map, bitmap only, bool want_bases) >> } >> partition_view_fini (map, new_partitions); >> >> - if (want_bases) >> - var_map_base_init (map); >> - else >> - var_map_base_fini (map); >> + var_map_base_fini (map); >> } >> >> >> diff --git a/gcc/tree-ssa-live.h b/gcc/tree-ssa-live.h >> index d5d7820..1f88358 100644 >> --- a/gcc/tree-ssa-live.h >> +++ b/gcc/tree-ssa-live.h >> @@ -71,8 +71,8 @@ typedef struct _var_map >> extern var_map init_var_map (int); >> extern void delete_var_map (var_map); >> extern int var_union (var_map, tree, tree); >> -extern void partition_view_normal (var_map, bool); >> -extern void partition_view_bitmap (var_map, bitmap, bool); >> +extern void partition_view_normal (var_map); >> +extern void partition_view_bitmap (var_map, bitmap); >> extern void dump_scope_blocks (FILE *, int); >> extern void debug_scope_block (tree, int); >> extern void debug_scope_blocks (int); >> diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c >> 
index 3f6bebe..7bef8cf 100644 >> --- a/gcc/tree-ssa-loop-niter.c >> +++ b/gcc/tree-ssa-loop-niter.c >> @@ -3964,7 +3964,11 @@ loop_exits_before_overflow (tree base, tree step, >> if (!CONVERT_EXPR_P (e) || !operand_equal_p (e, unsigned_base, 0)) >> continue; >> e = TREE_OPERAND (e, 0); >> - gcc_assert (operand_equal_p (e, base, 0)); >> + /* If E has an unsigned type, the operand equality test below >> + would fail, but the equality test above would have already >> + verified the equality, so we can proceed with it. */ >> + gcc_assert (TYPE_UNSIGNED (TREE_TYPE (e)) >> + || operand_equal_p (e, base, 0)); >> if (tree_int_cst_sign_bit (step)) >> { >> code = LT_EXPR; >> diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c >> index f75a7f1..0982305 100644 >> --- a/gcc/tree-ssa-uncprop.c >> +++ b/gcc/tree-ssa-uncprop.c >> @@ -59,6 +59,11 @@ along with GCC; see the file COPYING3. If not see >> #include "domwalk.h" >> #include "tree-pass.h" >> #include "tree-ssa-propagate.h" >> +#include "bitmap.h" >> +#include "stringpool.h" >> +#include "tree-ssanames.h" >> +#include "tree-ssa-live.h" >> +#include "tree-ssa-coalesce.h" >> >> /* The basic structure describing an equivalency created by traversing >> an edge. Traversing the edge effectively means that we can assume >> diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c >> index 0b24007..acdcd46 100644 >> --- a/gcc/var-tracking.c >> +++ b/gcc/var-tracking.c >> @@ -4931,12 +4931,16 @@ dataflow_set_remove_mem_locs (variable_def **slot, dataflow_set *set) >> registers, as well as associations between MEMs and VALUEs. 
*/ >> >> static void >> -dataflow_set_clear_at_call (dataflow_set *set) >> +dataflow_set_clear_at_call (dataflow_set *set, rtx_insn *call_insn) >> { >> unsigned int r; >> hard_reg_set_iterator hrsi; >> + HARD_REG_SET invalidated_regs; >> >> - EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, r, hrsi) >> + get_call_reg_set_usage (call_insn, &invalidated_regs, >> + regs_invalidated_by_call); >> + >> + EXECUTE_IF_SET_IN_HARD_REG_SET (invalidated_regs, 0, r, hrsi) >> var_regno_delete (set, r); >> >> if (MAY_HAVE_DEBUG_INSNS) >> @@ -6720,7 +6724,7 @@ compute_bb_dataflow (basic_block bb) >> switch (mo->type) >> { >> case MO_CALL: >> - dataflow_set_clear_at_call (out); >> + dataflow_set_clear_at_call (out, insn); >> break; >> >> case MO_USE: >> @@ -9182,7 +9186,7 @@ emit_notes_in_bb (basic_block bb, dataflow_set *set) >> switch (mo->type) >> { >> case MO_CALL: >> - dataflow_set_clear_at_call (set); >> + dataflow_set_clear_at_call (set, insn); >> emit_notes_for_changes (insn, EMIT_NOTE_AFTER_CALL_INSN, set->vars); >> { >> rtx arguments = mo->u.loc, *p = &arguments; >> >> >> >> And here's the incremental patch: >> >> --- >> gcc/alias.c | 17 +++++++------ >> gcc/cfgexpand.c | 57 +++++++++++++++++---------------------------- >> gcc/emit-rtl.c | 2 -- >> gcc/explow.c | 3 -- >> gcc/expr.c | 16 +++++-------- >> gcc/function.c | 15 ++++++++++++ >> gcc/gimple-expr.h | 4 --- >> gcc/tree-outof-ssa.c | 7 ++---- >> gcc/tree-ssa-coalesce.h | 1 + >> gcc/tree-ssa-loop-niter.c | 6 ++++- >> gcc/tree-ssa-uncprop.c | 5 ++++ >> 11 files changed, 64 insertions(+), 69 deletions(-) >> >> diff --git a/gcc/alias.c b/gcc/alias.c >> index 7a74e81..5a031d9 100644 >> --- a/gcc/alias.c >> +++ b/gcc/alias.c >> @@ -2553,14 +2553,15 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant) >> return 0; >> >> /* If we refer to different gimple registers, or one gimple register >> - and one non-gimple-register, we know they can't overlap. 
Now, >> - there could be more than one stack slot for (different versions >> - of) the same gimple register, but we can presumably tell they >> - don't overlap based on offsets from stack base addresses >> - elsewhere. It's important that we don't proceed to DECL_RTL, >> - because gimple registers may not pass DECL_RTL_SET_P, and >> - make_decl_rtl won't be able to do anything about them since no >> - SSA information will have remained to guide it. */ >> + and one non-gimple-register, we know they can't overlap. First, >> + gimple registers don't have their addresses taken. Now, there >> + could be more than one stack slot for (different versions of) the >> + same gimple register, but we can presumably tell they don't >> + overlap based on offsets from stack base addresses elsewhere. >> + It's important that we don't proceed to DECL_RTL, because gimple >> + registers may not pass DECL_RTL_SET_P, and make_decl_rtl won't be >> + able to do anything about them since no SSA information will have >> + remained to guide it. */ >> if (is_gimple_reg (exprx) || is_gimple_reg (expry)) >> return exprx != expry; >> >> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c >> index 3e80b4a..bf972fc 100644 >> --- a/gcc/cfgexpand.c >> +++ b/gcc/cfgexpand.c >> @@ -179,11 +179,10 @@ gimple_assign_rhs_to_tree (gimple stmt) >> >> #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x) >> >> -/* NEXT is a DECL to be associated with some RTL, CUR is a DECL or a >> - TREE_LIST of DECLs. If NEXT is covered by CUR, return CUR >> - unchanged. Otherwise, return a list with all entries of CUR, with >> - NEXT at the end. If CUR was a list, it will be modified in >> - place. */ >> +/* Choose either CUR or NEXT as the leader DECL for a partition. >> + Prefer ignored decls, to simplify debug dumps and reduce ambiguity >> + out of the same user variable being in multiple partitions (this is >> + less likely for compiler-introduced temps). 
*/ >> >> static tree >> leader_merge (tree cur, tree next) >> @@ -191,26 +190,11 @@ leader_merge (tree cur, tree next) >> if (cur == NULL || cur == next) >> return next; >> >> - tree list; >> + if (DECL_P (cur) && DECL_IGNORED_P (cur)) >> + return cur; >> >> - if (TREE_CODE (cur) == TREE_LIST) >> - { >> - /* Look for NEXT in the list. Stop at the last node to insert >> - there. */ >> - for (list = cur; ; list = TREE_CHAIN (list)) >> - { >> - if (TREE_VALUE (list) == next) >> - return cur; >> - if (!TREE_CHAIN (list)) >> - break; >> - } >> - } >> - else >> - /* Create the first node. */ >> - list = build_tree_list (NULL, cur); >> - >> - next = build_tree_list (NULL, next); >> - TREE_CHAIN (list) = next; >> + if (DECL_P (next) && DECL_IGNORED_P (next)) >> + return next; >> >> return cur; >> } >> @@ -285,9 +269,9 @@ set_rtl (tree t, rtx x) >> if (cur != next) >> { >> if (MEM_P (x)) >> - set_mem_attributes (x, SSAVAR (t), true); >> + set_mem_attributes (x, next, true); >> else >> - set_reg_attrs_for_decl_rtl (SSAVAR (t), x); >> + set_reg_attrs_for_decl_rtl (next, x); >> } >> } >> >> @@ -1025,9 +1009,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align, >> gcc_assert (offset == trunc_int_for_mode (offset, Pmode)); >> >> x = plus_constant (Pmode, base, offset); >> - x = gen_rtx_MEM ((TREE_CODE (decl) != SSA_NAME || SSA_NAME_VAR (decl)) >> - ? DECL_MODE (SSAVAR (decl)) >> - : TYPE_MODE (TREE_TYPE (decl)), x); >> + x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME >> + ? 
TYPE_MODE (TREE_TYPE (decl)) >> + : DECL_MODE (SSAVAR (decl)), x); >> >> if (TREE_CODE (decl) != SSA_NAME) >> { >> @@ -1268,17 +1252,17 @@ expand_one_stack_var_1 (tree var) >> HOST_WIDE_INT size, offset; >> unsigned byte_align; >> >> - if (TREE_CODE (var) != SSA_NAME || SSA_NAME_VAR (var)) >> - { >> - size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var))); >> - byte_align = align_local_variable (SSAVAR (var)); >> - } >> - else >> + if (TREE_CODE (var) == SSA_NAME) >> { >> tree type = TREE_TYPE (var); >> size = tree_to_uhwi (TYPE_SIZE_UNIT (type)); >> byte_align = TYPE_ALIGN_UNIT (type); >> } >> + else >> + { >> + size = tree_to_uhwi (DECL_SIZE_UNIT (var)); >> + byte_align = align_local_variable (var); >> + } >> >> /* We handle highly aligned variables in expand_stack_vars. */ >> gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT); >> @@ -1423,9 +1407,10 @@ expand_one_register_var (tree var) >> gcc_assert (REG_P (x)); >> return; >> } >> + gcc_unreachable (); >> } >> >> - tree decl = SSAVAR (var); >> + tree decl = var; >> tree type = TREE_TYPE (decl); >> machine_mode reg_mode = promote_decl_mode (decl, NULL); >> rtx x = gen_reg_rtx (reg_mode); >> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c >> index 308da40..2b98946 100644 >> --- a/gcc/emit-rtl.c >> +++ b/gcc/emit-rtl.c >> @@ -1252,8 +1252,6 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x) >> if (!t) >> return; >> tree tdecl = t; >> - if (TREE_CODE (t) == TREE_LIST) >> - tdecl = TREE_VALUE (t); >> if (GET_CODE (x) == SUBREG) >> { >> gcc_assert (subreg_lowpart_p (x)); >> diff --git a/gcc/explow.c b/gcc/explow.c >> index e09c032e1..5b0d49c 100644 >> --- a/gcc/explow.c >> +++ b/gcc/explow.c >> @@ -866,9 +866,6 @@ promote_ssa_mode (const_tree name, int *punsignedp) >> { >> gcc_assert (TREE_CODE (name) == SSA_NAME); >> >> - if (SSA_NAME_VAR (name)) >> - return promote_decl_mode (SSA_NAME_VAR (name), punsignedp); >> - >> tree type = TREE_TYPE (name); >> int unsignedp = TYPE_UNSIGNED (type); >> machine_mode 
mode = TYPE_MODE (type); >> diff --git a/gcc/expr.c b/gcc/expr.c >> index effe379..5b6e16e 100644 >> --- a/gcc/expr.c >> +++ b/gcc/expr.c >> @@ -9584,20 +9584,16 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, >> >> /* Get the signedness to be used for this variable. Ensure we get >> the same mode we got when the variable was declared. */ >> - if (code == SSA_NAME >> - && (g = SSA_NAME_DEF_STMT (ssa_name)) >> - && gimple_code (g) == GIMPLE_CALL >> - && !gimple_call_internal_p (g)) >> + if (code != SSA_NAME) >> + pmode = promote_decl_mode (exp, &unsignedp); >> + else if ((g = SSA_NAME_DEF_STMT (ssa_name)) >> + && gimple_code (g) == GIMPLE_CALL >> + && !gimple_call_internal_p (g)) >> pmode = promote_function_mode (type, mode, &unsignedp, >> gimple_call_fntype (g), >> 2); >> - else if (!exp) >> - { >> - gcc_assert (code == SSA_NAME); >> - pmode = promote_ssa_mode (ssa_name, &unsignedp); >> - } >> else >> - pmode = promote_decl_mode (exp, &unsignedp); >> + pmode = promote_ssa_mode (ssa_name, &unsignedp); >> gcc_assert (GET_MODE (decl_rtl) == pmode); >> >> temp = gen_lowpart_SUBREG (mode, decl_rtl); >> diff --git a/gcc/function.c b/gcc/function.c >> index dc9e77f..58e2498 100644 >> --- a/gcc/function.c >> +++ b/gcc/function.c >> @@ -2124,6 +2124,21 @@ use_register_for_decl (const_tree decl) >> { >> if (TREE_CODE (decl) == SSA_NAME) >> { >> + /* We often try to use the SSA_NAME, instead of its underlying >> + decl, to get type information and guide decisions, to avoid >> + differences of behavior between anonymous and named >> + variables, but in this one case we have to go for the actual >> + variable if there is one. The main reason is that, at least >> + at -O0, we want to place user variables on the stack, but we >> + don't mind using pseudos for anonymous or ignored temps. >> + Should we take the SSA_NAME, we'd conclude all SSA_NAMEs >> + should go in pseudos, whereas their corresponding variables >> + might have to go on the stack. 
So, disregarding the decl >> + here would negatively impact debug info at -O0, enable >> + coalescing between SSA_NAMEs that ought to get different >> + stack/pseudo assignments, and get the incoming argument >> + processing thoroughly confused by PARM_DECLs expected to live >> + in stack slots but assigned to pseudos. */ >> if (!SSA_NAME_VAR (decl)) >> return TYPE_MODE (TREE_TYPE (decl)) != BLKmode >> && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl))); >> diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h >> index 146cede..3d1c89f 100644 >> --- a/gcc/gimple-expr.h >> +++ b/gcc/gimple-expr.h >> @@ -55,10 +55,6 @@ extern bool is_gimple_mem_ref_addr (tree); >> extern void mark_addressable (tree); >> extern bool is_gimple_reg_rhs (tree); >> >> -/* Defined in tree-ssa-coalesce.c. */ >> -extern bool gimple_can_coalesce_p (tree, tree); >> - >> - >> /* Return true if a conversion from either type of TYPE1 and TYPE2 >> to the other is not required. Otherwise return false. */ >> >> diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c >> index dda9973..59d91c6 100644 >> --- a/gcc/tree-outof-ssa.c >> +++ b/gcc/tree-outof-ssa.c >> @@ -305,7 +305,6 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus) >> rtx dest_rtx, seq, x; >> machine_mode dest_mode, src_mode; >> int unsignedp; >> - tree var; >> >> if (dump_file && (dump_flags & TDF_DETAILS)) >> { >> @@ -328,10 +327,9 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus) >> start_sequence (); >> >> tree name = partition_to_var (SA.map, dest); >> - var = SSA_NAME_VAR (name); >> src_mode = TYPE_MODE (TREE_TYPE (src)); >> dest_mode = GET_MODE (dest_rtx); >> - gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var ? 
var : name))); >> + gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (name))); >> gcc_assert (!REG_P (dest_rtx) >> || dest_mode == promote_ssa_mode (name, &unsignedp)); >> >> @@ -709,8 +707,7 @@ elim_backward (elim_graph g, int T) >> static rtx >> get_temp_reg (tree name) >> { >> - tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name; >> - tree type = var ? TREE_TYPE (var) : TREE_TYPE (name); >> + tree type = TREE_TYPE (name); >> int unsignedp; >> machine_mode reg_mode = promote_ssa_mode (name, &unsignedp); >> rtx x = gen_reg_rtx (reg_mode); >> diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h >> index 99b188a..ae289b4 100644 >> --- a/gcc/tree-ssa-coalesce.h >> +++ b/gcc/tree-ssa-coalesce.h >> @@ -21,5 +21,6 @@ along with GCC; see the file COPYING3. If not see >> #define GCC_TREE_SSA_COALESCE_H >> >> extern var_map coalesce_ssa_name (void); >> +extern bool gimple_can_coalesce_p (tree, tree); >> >> #endif /* GCC_TREE_SSA_COALESCE_H */ >> diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c >> index 3f6bebe..7bef8cf 100644 >> --- a/gcc/tree-ssa-loop-niter.c >> +++ b/gcc/tree-ssa-loop-niter.c >> @@ -3964,7 +3964,11 @@ loop_exits_before_overflow (tree base, tree step, >> if (!CONVERT_EXPR_P (e) || !operand_equal_p (e, unsigned_base, 0)) >> continue; >> e = TREE_OPERAND (e, 0); >> - gcc_assert (operand_equal_p (e, base, 0)); >> + /* If E has an unsigned type, the operand equality test below >> + would fail, but the equality test above would have already >> + verified the equality, so we can proceed with it. */ >> + gcc_assert (TYPE_UNSIGNED (TREE_TYPE (e)) >> + || operand_equal_p (e, base, 0)); >> if (tree_int_cst_sign_bit (step)) >> { >> code = LT_EXPR; >> diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c >> index f75a7f1..0982305 100644 >> --- a/gcc/tree-ssa-uncprop.c >> +++ b/gcc/tree-ssa-uncprop.c >> @@ -59,6 +59,11 @@ along with GCC; see the file COPYING3. 
If not see >> #include "domwalk.h" >> #include "tree-pass.h" >> #include "tree-ssa-propagate.h" >> +#include "bitmap.h" >> +#include "stringpool.h" >> +#include "tree-ssanames.h" >> +#include "tree-ssa-live.h" >> +#include "tree-ssa-coalesce.h" >> >> /* The basic structure describing an equivalency created by traversing >> an edge. Traversing the edge effectively means that we can assume >> >> >> >> -- >> Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/ >> You must be the change you wish to see in the world. -- Gandhi >> Be Free! -- http://FSFLA.org/ FSF Latin America board member >> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
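
For anyone skimming the incremental patch quoted above, the new leader_merge policy in cfgexpand.c can be modelled in isolation. What follows is only a minimal sketch under stated assumptions, not GCC code: struct toy_decl, its ignored flag (standing in for DECL_IGNORED_P) and toy_leader_merge are hypothetical names made up for illustration, and the NULL check on next is an addition of the sketch rather than something the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a GCC decl; it carries only what the
   sketch needs.  The real code works on 'tree' nodes.  */
struct toy_decl
{
  const char *name;
  bool ignored;			/* Models DECL_IGNORED_P.  */
};

/* Pick the leader for a partition out of CUR (the current leader, or
   NULL if none was chosen yet) and NEXT (a new candidate).  An ignored
   decl wins over a user-visible one; if neither is ignored, the
   current leader is kept.  */
static const struct toy_decl *
toy_leader_merge (const struct toy_decl *cur, const struct toy_decl *next)
{
  /* No leader yet, or the same decl again: take NEXT as is.  */
  if (cur == NULL || cur == next)
    return next;

  /* An ignored current leader stays in place.  */
  if (cur->ignored)
    return cur;

  /* Otherwise an ignored candidate replaces it.  The NULL test is an
     assumption of this sketch only.  */
  if (next != NULL && next->ignored)
    return next;

  return cur;
}
```

With a user variable x (not ignored) and a compiler temporary (ignored), toy_leader_merge returns the temporary whichever of the two is passed as cur. With two user variables it keeps the one already chosen, so the result depends on the order in which partition members are visited.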