From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 23494 invoked by alias); 27 May 2011 19:10:11 -0000 Received: (qmail 23449 invoked by uid 22791); 27 May 2011 19:10:06 -0000 X-SWARE-Spam-Status: No, hits=-0.8 required=5.0 tests=AWL,BAYES_40,T_RP_MATCHES_RCVD X-Spam-Check-By: sourceware.org Received: from mail.codesourcery.com (HELO mail.codesourcery.com) (38.113.113.100) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 27 May 2011 19:09:50 +0000 Received: (qmail 7579 invoked from network); 27 May 2011 19:09:49 -0000 Received: from unknown (HELO ?84.152.188.204?) (bernds@127.0.0.2) by mail.codesourcery.com with ESMTPA; 27 May 2011 19:09:49 -0000 Message-ID: <4DDFF6F9.3090605@codesourcery.com> Date: Fri, 27 May 2011 20:10:00 -0000 From: Bernd Schmidt User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.17) Gecko/20110516 Lightning/1.0b3pre Thunderbird/3.1.10 MIME-Version: 1.0 To: GCC Patches Subject: [1/2] Rename across ebbs Content-Type: multipart/mixed; boundary="------------020602050901060306040705" Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org X-SW-Source: 2011-05/txt/msg02203.txt.bz2 This is a multi-part message in MIME format. --------------020602050901060306040705 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-length: 1353 While working on a C6X scheduling patch, I found myself wondering - what would be involved in making the register renamer operate on extended basic blocks rather than simple bbs? Somewhat surprisingly, the answer turns out to be "not much". After the last rewrite, all the conflict tests are based on id numbers and bitmaps, not live ranges. The following two patches add support for saving and restoring renamer state at basic block boundaries. Below is the initial patch, which just reorders the code a little without changing functionality. Besides moving and splitting out functions, I've only removed one redundant structure field and some #if 0'ed code. Bootstrapped and regression tested (with both patches applied) on i686-linux, with @@ -5106,6 +5106,9 @@ static const struct default_options ix86 +#ifdef INSN_SCHEDULING + { OPT_LEVELS_2_PLUS, OPT_frename_registers, NULL, 1 }, +#endif to ensure it gets tested. Also tested with a 4.5 c6x-elf toolchain, which has -frename-registers enabled by default. With a few other changes to make full use of it, I get up to 10% speedup on certain testcases on c6x. It would be nice if someone could run benchmarks on other targets where this could affect performance. I may or may not manage an i686 SPEC run over the weekend; it's unlikely to be affected much due to not using sched_ebb. Bernd --------------020602050901060306040705 Content-Type: text/plain; name="rnreg-prelim.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="rnreg-prelim.diff" Content-length: 15866 * regrename.c (struct du_head): Remove member terminated. (create_new_chain): Don't initialize it. (scan_rtx_reg): Don't set or test it, test the open_chains_set bitmap instead. (tick, this_tick): New global variables, moved out of regrename_optimize. (current_id, open_chains, closed_chains, open_chains_set, live_in_chains, live_hard_regs): Reorder declarations. (dump_def_use_chain): Move function earlier in the file. (rename_chains): New static function, broken out of regrename_optimize. (regrename_optimize): Use it. Remove #if 0'ed code. Index: gcc/regrename.c =================================================================== --- gcc/regrename.c.orig +++ gcc/regrename.c @@ -85,8 +85,6 @@ struct du_head /* Conflicts with untracked hard registers. */ HARD_REG_SET hard_conflicts; - /* Nonzero if the chain is finished; zero if it is still open. */ - unsigned int terminated:1; /* Nonzero if the chain crosses a call. */ unsigned int need_caller_save_reg:1; /* Nonzero if the register is used in a way that prevents renaming, @@ -132,6 +130,11 @@ static const char * const scan_actions_n "mark_access" }; +/* TICK and THIS_TICK are used to record the last time we saw each + register. */ +static int tick[FIRST_PSEUDO_REGISTER]; +static int this_tick = 0; + static struct obstack rename_obstack; static void do_replace (struct du_head *, int); @@ -147,8 +150,49 @@ static void dump_def_use_chain (struct d typedef struct du_head *du_head_p; DEF_VEC_P (du_head_p); DEF_VEC_ALLOC_P (du_head_p, heap); + +/* The id to be given to the next opened chain. */ +static unsigned current_id; + +/* A mapping of unique id numbers to chains. */ static VEC(du_head_p, heap) *id_to_chain; +/* List of currently open chains, and closed chains that can be renamed. */ +static struct du_head *open_chains; +static struct du_head *closed_chains; + +/* Bitmap of open chains. The bits set always match the list found in + open_chains. */ +static bitmap_head open_chains_set; + +/* Record the registers being tracked in open_chains. */ +static HARD_REG_SET live_in_chains; + +/* Record the registers that are live but not tracked. The intersection + between this and live_in_chains is empty. */ +static HARD_REG_SET live_hard_regs; + +/* Dump all def/use chains in CHAINS to DUMP_FILE. */ + +static void +dump_def_use_chain (struct du_head *head) +{ + while (head) + { + struct du_chain *this_du = head->first; + fprintf (dump_file, "Register %s (%d):", + reg_names[head->regno], head->nregs); + while (this_du) + { + fprintf (dump_file, " %d [%s]", INSN_UID (this_du->insn), + reg_class_names[this_du->cl]); + this_du = this_du->next_use; + } + fprintf (dump_file, "\n"); + head = head->next_chain; + } +} + static void free_chain_data (void) { @@ -225,13 +269,160 @@ check_new_reg_p (int reg ATTRIBUTE_UNUSE return true; } +/* Process the closed chains starting with ALL_CHAINS and rename + registers if possible. */ +static void +rename_chains (du_head_p all_chains) +{ + HARD_REG_SET unavailable; + + + CLEAR_HARD_REG_SET (unavailable); + /* Don't clobber traceback for noreturn functions. */ + if (frame_pointer_needed) + { + add_to_hard_reg_set (&unavailable, Pmode, FRAME_POINTER_REGNUM); +#if !HARD_FRAME_POINTER_IS_FRAME_POINTER + add_to_hard_reg_set (&unavailable, Pmode, HARD_FRAME_POINTER_REGNUM); +#endif + } + + while (all_chains) + { + int new_reg, best_new_reg, best_nregs; + int n_uses; + struct du_head *this_head = all_chains; + struct du_chain *tmp; + HARD_REG_SET this_unavailable; + int reg = this_head->regno; + int pass; + enum reg_class super_class = NO_REGS; + enum reg_class preferred_class; + bool has_preferred_class; + + all_chains = this_head->next_chain; + + if (this_head->cannot_rename) + continue; + + best_new_reg = reg; + best_nregs = this_head->nregs; + + if (fixed_regs[reg] || global_regs[reg] +#if !HARD_FRAME_POINTER_IS_FRAME_POINTER + || (frame_pointer_needed && reg == HARD_FRAME_POINTER_REGNUM) +#else + || (frame_pointer_needed && reg == FRAME_POINTER_REGNUM) +#endif + ) + continue; + + COPY_HARD_REG_SET (this_unavailable, unavailable); + + /* Iterate over elements in the chain in order to: + 1. Count number of uses, and narrow the set of registers we can + use for renaming. + 2. Compute the superunion of register classes in this chain. */ + n_uses = 0; + super_class = NO_REGS; + for (tmp = this_head->first; tmp; tmp = tmp->next_use) + { + if (DEBUG_INSN_P (tmp->insn)) + continue; + n_uses++; + IOR_COMPL_HARD_REG_SET (this_unavailable, + reg_class_contents[tmp->cl]); + super_class + = reg_class_superunion[(int) super_class][(int) tmp->cl]; + } + + if (n_uses < 2) + continue; + + /* Further narrow the set of registers we can use for renaming. + If the chain needs a call-saved register, mark the call-used + registers as unavailable. */ + if (this_head->need_caller_save_reg) + IOR_HARD_REG_SET (this_unavailable, call_used_reg_set); + + /* And mark registers that overlap its lifetime as unavailable. */ + merge_overlapping_regs (&this_unavailable, this_head); + + /* Compute preferred rename class of super union of all the classes + in the chain. */ + preferred_class + = (enum reg_class) targetm.preferred_rename_class (super_class); + + /* If PREFERRED_CLASS is not NO_REGS, we iterate in the first pass + over registers that belong to PREFERRED_CLASS and try to find the + best register within the class. If that failed, we iterate in + the second pass over registers that don't belong to the class. + If PREFERRED_CLASS is NO_REGS, we iterate over all registers in + ascending order without any preference. */ + has_preferred_class = (preferred_class != NO_REGS); + for (pass = (has_preferred_class ? 0 : 1); pass < 2; pass++) + { + for (new_reg = 0; new_reg < FIRST_PSEUDO_REGISTER; new_reg++) + { + if (has_preferred_class + && (pass == 0) + != TEST_HARD_REG_BIT + (reg_class_contents[preferred_class], new_reg)) + continue; + + /* In the first pass, we force the renaming of registers that + don't belong to PREFERRED_CLASS to registers that do, even + though the latters were used not very long ago. */ + if (check_new_reg_p (reg, new_reg, this_head, + this_unavailable) + && ((pass == 0 + && !TEST_HARD_REG_BIT + (reg_class_contents[preferred_class], + best_new_reg)) + || tick[best_new_reg] > tick[new_reg])) + { + enum machine_mode mode + = GET_MODE (*this_head->first->loc); + best_new_reg = new_reg; + best_nregs = hard_regno_nregs[new_reg][mode]; + } + } + if (pass == 0 && best_new_reg != reg) + break; + } + + if (dump_file) + { + fprintf (dump_file, "Register %s in insn %d", + reg_names[reg], INSN_UID (this_head->first->insn)); + if (this_head->need_caller_save_reg) + fprintf (dump_file, " crosses a call"); + } + + if (best_new_reg == reg) + { + tick[reg] = ++this_tick; + if (dump_file) + fprintf (dump_file, "; no available better choice\n"); + continue; + } + + if (dump_file) + fprintf (dump_file, ", renamed as %s\n", reg_names[best_new_reg]); + + do_replace (this_head, best_new_reg); + this_head->regno = best_new_reg; + this_head->nregs = best_nregs; + tick[best_new_reg] = ++this_tick; + df_set_regs_ever_live (best_new_reg, true); + } +} + /* Perform register renaming on the current function. */ static unsigned int regrename_optimize (void) { - int tick[FIRST_PSEUDO_REGISTER]; - int this_tick = 0; basic_block bb; char *first_obj; @@ -248,16 +439,9 @@ regrename_optimize (void) FOR_EACH_BB (bb) { struct du_head *all_chains = 0; - HARD_REG_SET unavailable; -#if 0 - HARD_REG_SET regs_seen; - CLEAR_HARD_REG_SET (regs_seen); -#endif id_to_chain = VEC_alloc (du_head_p, heap, 0); - CLEAR_HARD_REG_SET (unavailable); - if (dump_file) fprintf (dump_file, "\nBasic block %d:\n", bb->index); @@ -266,154 +450,7 @@ regrename_optimize (void) if (dump_file) dump_def_use_chain (all_chains); - CLEAR_HARD_REG_SET (unavailable); - /* Don't clobber traceback for noreturn functions. */ - if (frame_pointer_needed) - { - add_to_hard_reg_set (&unavailable, Pmode, FRAME_POINTER_REGNUM); -#if !HARD_FRAME_POINTER_IS_FRAME_POINTER - add_to_hard_reg_set (&unavailable, Pmode, HARD_FRAME_POINTER_REGNUM); -#endif - } - - while (all_chains) - { - int new_reg, best_new_reg, best_nregs; - int n_uses; - struct du_head *this_head = all_chains; - struct du_chain *tmp; - HARD_REG_SET this_unavailable; - int reg = this_head->regno; - int pass; - enum reg_class super_class = NO_REGS; - enum reg_class preferred_class; - bool has_preferred_class; - - all_chains = this_head->next_chain; - - if (this_head->cannot_rename) - continue; - - best_new_reg = reg; - best_nregs = this_head->nregs; - -#if 0 /* This just disables optimization opportunities. */ - /* Only rename once we've seen the reg more than once. */ - if (! TEST_HARD_REG_BIT (regs_seen, reg)) - { - SET_HARD_REG_BIT (regs_seen, reg); - continue; - } -#endif - - if (fixed_regs[reg] || global_regs[reg] -#if !HARD_FRAME_POINTER_IS_FRAME_POINTER - || (frame_pointer_needed && reg == HARD_FRAME_POINTER_REGNUM) -#else - || (frame_pointer_needed && reg == FRAME_POINTER_REGNUM) -#endif - ) - continue; - - COPY_HARD_REG_SET (this_unavailable, unavailable); - - /* Iterate over elements in the chain in order to: - 1. Count number of uses, and narrow the set of registers we can - use for renaming. - 2. Compute the superunion of register classes in this chain. */ - n_uses = 0; - super_class = NO_REGS; - for (tmp = this_head->first; tmp; tmp = tmp->next_use) - { - if (DEBUG_INSN_P (tmp->insn)) - continue; - n_uses++; - IOR_COMPL_HARD_REG_SET (this_unavailable, - reg_class_contents[tmp->cl]); - super_class - = reg_class_superunion[(int) super_class][(int) tmp->cl]; - } - - if (n_uses < 2) - continue; - - /* Further narrow the set of registers we can use for renaming. - If the chain needs a call-saved register, mark the call-used - registers as unavailable. */ - if (this_head->need_caller_save_reg) - IOR_HARD_REG_SET (this_unavailable, call_used_reg_set); - - /* And mark registers that overlap its lifetime as unavailable. */ - merge_overlapping_regs (&this_unavailable, this_head); - - /* Compute preferred rename class of super union of all the classes - in the chain. */ - preferred_class - = (enum reg_class) targetm.preferred_rename_class (super_class); - - /* If PREFERRED_CLASS is not NO_REGS, we iterate in the first pass - over registers that belong to PREFERRED_CLASS and try to find the - best register within the class. If that failed, we iterate in - the second pass over registers that don't belong to the class. - If PREFERRED_CLASS is NO_REGS, we iterate over all registers in - ascending order without any preference. */ - has_preferred_class = (preferred_class != NO_REGS); - for (pass = (has_preferred_class ? 0 : 1); pass < 2; pass++) - { - for (new_reg = 0; new_reg < FIRST_PSEUDO_REGISTER; new_reg++) - { - if (has_preferred_class - && (pass == 0) - != TEST_HARD_REG_BIT - (reg_class_contents[preferred_class], new_reg)) - continue; - - /* In the first pass, we force the renaming of registers that - don't belong to PREFERRED_CLASS to registers that do, even - though the latters were used not very long ago. */ - if (check_new_reg_p (reg, new_reg, this_head, - this_unavailable) - && ((pass == 0 - && !TEST_HARD_REG_BIT - (reg_class_contents[preferred_class], - best_new_reg)) - || tick[best_new_reg] > tick[new_reg])) - { - enum machine_mode mode - = GET_MODE (*this_head->first->loc); - best_new_reg = new_reg; - best_nregs = hard_regno_nregs[new_reg][mode]; - } - } - if (pass == 0 && best_new_reg != reg) - break; - } - - if (dump_file) - { - fprintf (dump_file, "Register %s in insn %d", - reg_names[reg], INSN_UID (this_head->first->insn)); - if (this_head->need_caller_save_reg) - fprintf (dump_file, " crosses a call"); - } - - if (best_new_reg == reg) - { - tick[reg] = ++this_tick; - if (dump_file) - fprintf (dump_file, "; no available better choice\n"); - continue; - } - - if (dump_file) - fprintf (dump_file, ", renamed as %s\n", reg_names[best_new_reg]); - - do_replace (this_head, best_new_reg); - this_head->regno = best_new_reg; - this_head->nregs = best_nregs; - tick[best_new_reg] = ++this_tick; - df_set_regs_ever_live (best_new_reg, true); - } + rename_chains (all_chains); free_chain_data (); obstack_free (&rename_obstack, first_obj); @@ -507,24 +544,6 @@ mark_conflict (struct du_head *chains, u without renaming. */ static bool fail_current_block; -/* The id to be given to the next opened chain. */ -static unsigned current_id; - -/* List of currently open chains, and closed chains that can be renamed. */ -static struct du_head *open_chains; -static struct du_head *closed_chains; - -/* Bitmap of open chains. The bits set always match the list found in - open_chains. */ -static bitmap_head open_chains_set; - -/* Record the registers being tracked in open_chains. */ -static HARD_REG_SET live_in_chains; - -/* Record the registers that are live but not tracked. The intersection - between this and live_in_chains is empty. */ -static HARD_REG_SET live_hard_regs; - /* Return true if OP is a reg for which all bits are set in PSET, false if all bits are clear. In other cases, set fail_current_block and return false. */ @@ -603,7 +622,6 @@ create_new_chain (unsigned this_regno, u head->nregs = this_nregs; head->need_caller_save_reg = 0; head->cannot_rename = 0; - head->terminated = 0; VEC_safe_push (du_head_p, heap, id_to_chain, head); head->id = current_id++; @@ -682,7 +700,7 @@ scan_rtx_reg (rtx insn, rtx *loc, enum r int subset = (this_regno >= head->regno && this_regno + this_nregs <= head->regno + head->nregs); - if (head->terminated + if (!bitmap_bit_p (&open_chains_set, head->id) || head->regno + head->nregs <= this_regno || this_regno + this_nregs <= head->regno) { @@ -757,7 +775,6 @@ scan_rtx_reg (rtx insn, rtx *loc, enum r { unsigned nregs; - head->terminated = 1; head->next_chain = closed_chains; closed_chains = head; bitmap_clear_bit (&open_chains_set, head->id); @@ -1407,29 +1424,6 @@ build_def_use (basic_block bb) is still open lives past the basic block, so it can't be renamed. */ return closed_chains; } - -/* Dump all def/use chains in CHAINS to DUMP_FILE. They are - printed in reverse order as that's how we build them. */ - -static void -dump_def_use_chain (struct du_head *head) -{ - while (head) - { - struct du_chain *this_du = head->first; - fprintf (dump_file, "Register %s (%d):", - reg_names[head->regno], head->nregs); - while (this_du) - { - fprintf (dump_file, " %d [%s]", INSN_UID (this_du->insn), - reg_class_names[this_du->cl]); - this_du = this_du->next_use; - } - fprintf (dump_file, "\n"); - head = head->next_chain; - } -} - static bool gate_handle_regrename (void) --------------020602050901060306040705--