* [RFC][PATCH] Preferred rename register in regrename pass
@ 2015-09-17 14:40 Robert Suchanek
  2015-09-17 17:11 ` Jeff Law
  2015-09-18 15:29 ` Bernd Schmidt
  0 siblings, 2 replies; 21+ messages in thread
From: Robert Suchanek @ 2015-09-17 14:40 UTC (permalink / raw)
  To: gcc-patches

Hi,

We came across a situation on MIPS64 where moves for sign-extension were
not converted into a nop because IRA spilled some of the allocnos and
assigned a different hard register to the output operand of the move.
LRA does not fix this up, most likely because the move was not introduced
by LRA itself. I found it hard to fix this in LRA and looked for an
alternative solution; the regrename pass appeared to be the best candidate.

The patch below introduces the notion of a preferred rename register and
attempts to use the output register for an input register iff the input
register dies in an instruction. The preferred register is validated and,
if validation fails, the pass falls back to the old technique of finding
the best rename register.

Of course, it has a slightly limited scope of use as the pass is not enabled
by default; however, when targeting performance one is likely to enable it
via -funroll-loops or -fpeel-loops.

I did some experiments with -funroll-loops on x86_64-unknown-linux-gnu and
the code size improved by almost 0.4% on average. I haven't done extensive
performance testing, but SPEC2006 appeared to show a minor improvement on
average, which could be a real improvement or just noise.

On MIPS64 with -funroll-loops, there were a number of cases in CSiBE where
the unnecessary moves turned into a nop. MIPS32 also improved marginally,
but to a lesser degree.

The patch successfully passed x86_64-unknown-linux-gnu, mips-mti-linux-gnu
and mips-img-linux-gnu.

I'm not sure whether this is something that should be enabled by default for
everyone or whether a target hook should be added. Any other
comments/suggestions?

Regards,
Robert

gcc/
        * regrename.c (find_preferred_rename_reg): New function.
        (record_operand_use): Remove assertion.  Allocate or resize heads and
        chains vectors, if necessary.
        (find_best_rename_reg): Use the new function and validate chosen
        register.
        (build_def_use): Don't allocate and initialize space of size 0.
        (regrename_finish): Free heads and chains vectors.
        (regrename_optimize): Pass true to initializing function.
        * regrename.h (struct operand_rr_info): Replace arrays of heads and
        chains with vectors.
---
 gcc/regrename.c | 86 +++++++++++++++++++++++++++++++++++++++++++++++++++++----
 gcc/regrename.h |  4 +--
 2 files changed, 82 insertions(+), 8 deletions(-)

diff --git a/gcc/regrename.c b/gcc/regrename.c
index c328c1b..90fee98 100644
--- a/gcc/regrename.c
+++ b/gcc/regrename.c
@@ -174,6 +174,51 @@ dump_def_use_chain (int from)
     }
 }
 
+/* Return a preferred rename register for HEAD.  */
+
+static int
+find_preferred_rename_reg (du_head_p head)
+{
+  struct du_chain *this_du;
+  int preferred_reg = -1;
+
+  for (this_du = head->first; this_du; this_du = this_du->next_use)
+    {
+      rtx note;
+      insn_rr_info *p;
+
+      /* The preferred rename register is an output register iff an input
+         register dies in an instruction but the candidate must be validated by
+         check_new_reg_p.  */
+      for (note = REG_NOTES (this_du->insn); note; note = XEXP (note, 1))
+        if (insn_rr.exists()
+            && REG_NOTE_KIND (note) == REG_DEAD
+            && REGNO (XEXP (note, 0)) == head->regno
+            && (p = &insn_rr[INSN_UID (this_du->insn)])
+            && p->op_info)
+          {
+            int i;
+            for (i = 0; i < p->op_info->n_chains; i++)
+              {
+                struct du_head *next_head = p->op_info->heads[i];
+                if (head != next_head)
+                  {
+                    preferred_reg = next_head->regno;
+                    if (dump_file)
+                      fprintf (dump_file,
+                               "Chain %s (%d) has preferred rename register"
+                               " %s for insn %d [%s]\n",
+                               reg_names[head->regno], head->id,
+                               reg_names[preferred_reg],
+                               INSN_UID (this_du->insn),
+                               reg_class_names[this_du->cl]);
+                  }
+              }
+          }
+    }
+  return preferred_reg;
+}
+
 static void
 free_chain_data (void)
 {
@@ -206,7 +251,16 @@ record_operand_use (struct du_head *head, struct du_chain *this_du)
 {
   if (cur_operand == NULL)
     return;
-  gcc_assert (cur_operand->n_chains < MAX_REGS_PER_ADDRESS);
+
+  if (!cur_operand->heads.exists ())
+    cur_operand->heads.create (0);
+  if (!cur_operand->chains.exists ())
+    cur_operand->chains.create (0);
+  if (cur_operand->heads.length () <= (unsigned) cur_operand->n_chains)
+    cur_operand->heads.safe_grow_cleared (cur_operand->n_chains + 1);
+  if (cur_operand->chains.length () <= (unsigned) cur_operand->n_chains)
+    cur_operand->chains.safe_grow_cleared (cur_operand->n_chains + 1);
+
   cur_operand->heads[cur_operand->n_chains] = head;
   cur_operand->chains[cur_operand->n_chains++] = this_du;
 }
@@ -355,6 +409,7 @@ find_rename_reg (du_head_p this_head, enum reg_class super_class,
   enum reg_class preferred_class;
   int pass;
   int best_new_reg = old_reg;
+  int preferred_reg = -1;
 
   /* Further narrow the set of registers we can use for renaming.
      If the chain needs a call-saved register, mark the call-used
@@ -370,6 +425,11 @@ find_rename_reg (du_head_p this_head, enum reg_class super_class,
   preferred_class
     = (enum reg_class) targetm.preferred_rename_class (super_class);
 
+  /* Try to find a preferred rename register for THIS_HEAD.  */
+  if ((preferred_reg = find_preferred_rename_reg (this_head)) != -1
+      && check_new_reg_p (old_reg, preferred_reg, this_head, *unavailable))
+    return preferred_reg;
+
   /* If PREFERRED_CLASS is not NO_REGS, we iterate in the first pass
      over registers that belong to PREFERRED_CLASS and try to find the
      best register within the class.  If that failed, we iterate in
@@ -1588,10 +1648,14 @@ build_def_use (basic_block bb)
           if (insn_rr.exists ())
             {
               insn_info = &insn_rr[INSN_UID (insn)];
-              insn_info->op_info = XOBNEWVEC (&rename_obstack, operand_rr_info,
-                                              recog_data.n_operands);
-              memset (insn_info->op_info, 0,
-                      sizeof (operand_rr_info) * recog_data.n_operands);
+              if (recog_data.n_operands > 0)
+                {
+                  insn_info->op_info = XOBNEWVEC (&rename_obstack,
+                                                  operand_rr_info,
+                                                  recog_data.n_operands);
+                  memset (insn_info->op_info, 0,
+                          sizeof (operand_rr_info) * recog_data.n_operands);
+                }
             }
 
           /* Simplify the code below by promoting OP_OUT to OP_INOUT in
@@ -1811,6 +1875,16 @@ regrename_init (bool insn_info)
 void
 regrename_finish (void)
 {
+  int i;
+  struct insn_rr_info *item;
+
+  FOR_EACH_VEC_ELT (insn_rr, i, item)
+    if (item->op_info)
+      {
+        item->op_info->heads.release ();
+        item->op_info->chains.release ();
+      }
+
   insn_rr.release ();
   free_chain_data ();
   obstack_free (&rename_obstack, NULL);
@@ -1826,7 +1900,7 @@ regrename_optimize (void)
   df_analyze ();
   df_set_flags (DF_DEFER_INSN_RESCAN);
 
-  regrename_init (false);
+  regrename_init (true);
 
   regrename_analyze (NULL);
 
diff --git a/gcc/regrename.h b/gcc/regrename.h
index bbe156d..2e3bc20 100644
--- a/gcc/regrename.h
+++ b/gcc/regrename.h
@@ -71,8 +71,8 @@ struct operand_rr_info
   int n_chains;
   /* Holds either the chain for the operand itself, or for the registers in
      a memory operand.  */
-  struct du_chain *chains[MAX_REGS_PER_ADDRESS];
-  struct du_head *heads[MAX_REGS_PER_ADDRESS];
+  vec<struct du_chain *> chains;
+  vec<struct du_head *> heads;
 };
 
 /* A struct to hold a vector of operand_rr_info structures describing the
-- 
2.4.5

^ permalink raw reply	[flat|nested] 21+ messages in thread
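
As a rough illustration of the idea in the patch above (not the GCC implementation), the selection can be modelled on a toy instruction list: look for a move in which the chain's register dies, take that move's destination as the preferred register, validate it, and otherwise fall back. All names here (`toy_move`, `toy_preferred_reg`, `toy_choose_reg`) are hypothetical, registers are assumed to be numbered below 32, the bitmask is only a stand-in for `check_new_reg_p`, and taking the lowest free register is an assumed placeholder for the existing best-register search.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a register-to-register move insn.  */
struct toy_move
{
  int dest;     /* Hard register written by the move.  */
  int src;      /* Hard register read by the move.  */
  int src_dies; /* Nonzero if SRC would carry a REG_DEAD note here.  */
};

/* Return the destination of the last move in which REGNO dies, or -1 if
   there is none.  This mirrors the shape of the search, nothing more.  */
int
toy_preferred_reg (const struct toy_move *insns, size_t n, int regno)
{
  int preferred = -1;
  assert (insns != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (insns[i].src == regno && insns[i].src_dies)
      preferred = insns[i].dest;
  return preferred;
}

/* Choose a rename register.  AVAILABLE is a bitmask of registers that
   pass validation (an assumed stand-in for check_new_reg_p).  Try
   PREFERRED first; otherwise take the lowest available register (an
   assumed fallback); otherwise keep OLD_REG.  */
int
toy_choose_reg (int old_reg, int preferred, unsigned available)
{
  if (preferred >= 0 && preferred < 32 && (available & (1u << preferred)))
    return preferred;
  for (int r = 0; r < 32; r++)
    if (available & (1u << r))
      return r;
  return old_reg;
}
```

If the chain for register 2 dies in a move whose destination is register 5, `toy_preferred_reg` reports 5, and `toy_choose_reg` returns 5 only when the mask allows it, so a rejected candidate costs nothing beyond the normal search.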

* Re: [RFC][PATCH] Preferred rename register in regrename pass
  2015-09-17 14:40 [RFC][PATCH] Preferred rename register in regrename pass Robert Suchanek
@ 2015-09-17 17:11 ` Jeff Law
  2015-09-17 21:28   ` Eric Botcazou
  2015-09-18 15:29 ` Bernd Schmidt
  1 sibling, 1 reply; 21+ messages in thread
From: Jeff Law @ 2015-09-17 17:11 UTC (permalink / raw)
  To: Robert Suchanek, gcc-patches

On 09/17/2015 08:38 AM, Robert Suchanek wrote:
> Hi,
>
> We came across a situation for MIPS64 where moves for sign-extension were
> not converted into a nop because of IRA spilled some of the allocnos and
> assigned different hard register for the output operand in the move.
> LRA is not fixing this up as most likely the move was not introduced by
> the LRA itself.  I found it hard to fix this in LRA and looked at
> an alternative solution where regrename pass appeared to be the best candidate.

Yea, we've never been great at tying the source & destination of
sign/zero extensions. The inherently different modes often caused the
old allocator (and pre-allocator bits like regmove, and post-allocator
bits like reload) to 'give up'.

>
> I'm not sure if this is something that should be enabled by default for everyone
> or a target hook should be added. Any other comments/suggestions?

I'll let Bernd comment on the patch itself. But I would say that if
you're setting up cases where we can tie the source/dest of an extension
together, then it's a good thing. It'll cause more of them to turn into
NOPs and it'll make the redundant extension elimination pass more
effective as well.

This would be something that I'd expect to benefit most architectures
(obviously to varying degrees).

Jeff

* Re: [RFC][PATCH] Preferred rename register in regrename pass
  2015-09-17 17:11 ` Jeff Law
@ 2015-09-17 21:28   ` Eric Botcazou
  0 siblings, 0 replies; 21+ messages in thread
From: Eric Botcazou @ 2015-09-17 21:28 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches, Robert Suchanek

> I'll let Bernd comment on the patch itself.  But I would say that if
> you're setting up cases where we can tie the source/dest of an extension
> together, then it's a good thing.  It'll cause more of them to turn into
> NOPs and it'll make the redundant extension elimination pass more
> effective as well.

Not if you do it in regrename though, as it's run very late in the pipeline.
If you want to make REE more effective, this would need to be done during RA.

-- 
Eric Botcazou

* Re: [RFC][PATCH] Preferred rename register in regrename pass
  2015-09-17 14:40 [RFC][PATCH] Preferred rename register in regrename pass Robert Suchanek
  2015-09-17 17:11 ` Jeff Law
@ 2015-09-18 15:29 ` Bernd Schmidt
  2015-10-09  7:10   ` Robert Suchanek
  1 sibling, 1 reply; 21+ messages in thread
From: Bernd Schmidt @ 2015-09-18 15:29 UTC (permalink / raw)
  To: Robert Suchanek, gcc-patches

On 09/17/2015 04:38 PM, Robert Suchanek wrote:
> We came across a situation for MIPS64 where moves for sign-extension were
> not converted into a nop because of IRA spilled some of the allocnos and
> assigned different hard register for the output operand in the move.
> LRA is not fixing this up as most likely the move was not introduced by
> the LRA itself.  I found it hard to fix this in LRA and looked at
> an alternative solution where regrename pass appeared to be the best candidate.

For reference, please post examples of the insn pattern(s) where you
would hope to get an improvement. Do they use matching constraints
between the input and output operands in at least one alternative?

So this does look like something that could be addressed in regrename,
but I think the patch is not quite the way to do it.

> +/* Return a preferred rename register for HEAD.  */

Function comments ideally ought to be a little more detailed. Preferred
how and why?

> +static int
> +find_preferred_rename_reg (du_head_p head)
> +{
> +  struct du_chain *this_du;
> +  int preferred_reg = -1;
> +
> +  for (this_du = head->first; this_du; this_du = this_du->next_use)

This loop seems to search for the insn where the chain terminates (i.e.
the register dies). It seems strange to do this here rather than during
the initial scan in record_out_operands where we visit every insn and
already look for REG_DEAD notes.

> +      rtx note;
> +      insn_rr_info *p;
> +
> +      /* The preferred rename register is an output register iff an input
> +         register dies in an instruction but the candidate must be validated by
> +         check_new_reg_p.  */
> +      for (note = REG_NOTES (this_du->insn); note; note = XEXP (note, 1))
> +        if (insn_rr.exists()
> +            && REG_NOTE_KIND (note) == REG_DEAD
> +            && REGNO (XEXP (note, 0)) == head->regno
> +            && (p = &insn_rr[INSN_UID (this_du->insn)])
> +            && p->op_info)
> +          {
> +            int i;
> +            for (i = 0; i < p->op_info->n_chains; i++)
> +              {
> +                struct du_head *next_head = p->op_info->heads[i];
> +                if (head != next_head)

Here you're not actually verifying the chosen preferred reg is an
output? Is the use of plain "p->op_info" (which is actually an array)
intentional as a guess that operand 0 is the output? I'm not thrilled
with this, and at the very least it should be "p->op_info[0]." to avoid
reader confusion.

It's also not verifying that this is indeed a case where choosing a
preferred reg has a beneficial effect at all.

The use of insn_rr would probably also be unnecessary if this was done
during the scan phase.

> +                    preferred_reg = next_head->regno;

The problem here is that there's an ordering issue. What if next_head
gets renamed afterwards? The choice of preferred register hasn't bought
us anything in that case.

For all these reasons I'd suggest a different approach, looking for such
situations during the scan. Try to detect a situation where
  * we have a REG_DEAD note for an existing chain
  * the insn fulfils certain conditions (i.e. it's a move, or maybe one
    of the alternatives has a matching constraint). After all, there's
    not much point in tying if the reg that dies was used in a memory
    address.
  * a new chain is started for a single output
Then, instead of picking a best register, mark the two chains as tied.
Then, when choosing a rename register, see if a tied chain already was
renamed, and try to pick the same register first.

> @@ -1826,7 +1900,7 @@ regrename_optimize (void)
>     df_analyze ();
>     df_set_flags (DF_DEFER_INSN_RESCAN);
>
> -  regrename_init (false);
> +  regrename_init (true);

It would be good to avoid this as it makes the renamer more expensive.
I expect that if you follow the strategy described above, this won't be
necessary.

> -  struct du_chain *chains[MAX_REGS_PER_ADDRESS];
> -  struct du_head *heads[MAX_REGS_PER_ADDRESS];
> +  vec<struct du_chain *> chains;
> +  vec<struct du_head *> heads;

Given that MAX_REGS_PER_ADDRESS tends to be 1 or 2 this appears to make
things more heavyweight, especially with the extra loop needed to free
the vecs. If possible, try to avoid this. (Again, AFAICS this
information shouldn't really be necessary for what you're trying to do).


Bernd
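
The three conditions listed in the review above can be sketched as a small predicate over a per-insn summary. This is only a toy model under stated assumptions: `toy_chain`, `toy_insn`, `toy_can_tie` and `toy_try_tie` are hypothetical names, and the summary fields are assumed to have been filled in by whatever scan visits each insn; none of this is regrename's actual data structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical def-use chain: just a register and an optional partner.  */
struct toy_chain
{
  int regno;
  struct toy_chain *tied;
};

/* Hypothetical summary of one insn as seen during the scan.  */
struct toy_insn
{
  bool is_reg_move;        /* The insn is a plain reg-to-reg move.  */
  int n_outputs;           /* Number of output operands.  */
  struct toy_chain *dying; /* Chain closed by a REG_DEAD note, or NULL.  */
  bool dying_in_address;   /* The dying reg is only used in an address.  */
};

/* Return true if all three conditions from the review hold.  */
bool
toy_can_tie (const struct toy_insn *insn)
{
  return insn->dying != NULL      /* REG_DEAD note for an existing chain.  */
         && insn->is_reg_move     /* The insn is a move.  */
         && !insn->dying_in_address
         && insn->n_outputs == 1; /* New chain for a single output.  */
}

/* Tie NEW_CHAIN (the chain opened for the output) to the dying chain if
   the insn qualifies.  Return true if the chains were tied.  */
bool
toy_try_tie (const struct toy_insn *insn, struct toy_chain *new_chain)
{
  assert (new_chain != NULL);
  if (!toy_can_tie (insn))
    return false;
  new_chain->tied = insn->dying;
  insn->dying->tied = new_chain;
  return true;
}
```

Recording a symmetric link rather than a register number is what sidesteps the ordering issue raised above: whichever chain is processed first, the other can still consult its partner's current register.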

* RE: [RFC][PATCH] Preferred rename register in regrename pass
  2015-09-18 15:29 ` Bernd Schmidt
@ 2015-10-09  7:10   ` Robert Suchanek
  2015-10-09 11:06     ` Bernd Schmidt
  0 siblings, 1 reply; 21+ messages in thread
From: Robert Suchanek @ 2015-10-09 7:10 UTC (permalink / raw)
  To: Bernd Schmidt, ebotcazou, gcc-patches

Hi Bernd,

Thanks for the comments, much appreciated. Comments inlined and a reworked
patch attached.

> On 09/17/2015 04:38 PM, Robert Suchanek wrote:
> > We came across a situation for MIPS64 where moves for sign-extension were
> > not converted into a nop because of IRA spilled some of the allocnos and
> > assigned different hard register for the output operand in the move.
> > LRA is not fixing this up as most likely the move was not introduced by
> > the LRA itself.  I found it hard to fix this in LRA and looked at
> > an alternative solution where regrename pass appeared to be the best candidate.
>
> For reference, please post examples of the insn pattern(s) where you
> would hope to get an improvement. Do they use matching constraints
> between the input and output operands in at least one alternative?

It all started with the 'extendmn2' SPN on MIPS64, where IRA broke the tie
between the input and output registers, causing the move to stay around.
Generally, I was expecting some of the moves to disappear. Initially I thought
that the approach would also benefit matching constraints, but while reworking
the patch and analysing it in more detail I couldn't find such a case.

> So this does look like something that could be addressed in regrename,
> but I think the patch is not quite the way to do it.
>
> > +/* Return a preferred rename register for HEAD.  */
>
> Function comments ideally ought to be a little more detailed. Preferred
> how and why?

Noted. Will provide a better description in the future.

>
> > +static int
> > +find_preferred_rename_reg (du_head_p head)
> > +{
> > +  struct du_chain *this_du;
> > +  int preferred_reg = -1;
> > +
> > +  for (this_du = head->first; this_du; this_du = this_du->next_use)
>
> This loop seems to search for the insn where the chain terminates (i.e.
> the register dies). It seems strange to do this here rather than during
> the initial scan in record_out_operands where we visit every insn and
> already look for REG_DEAD notes.

At the time, it seemed OK to do this separately.

>
> > +      rtx note;
> > +      insn_rr_info *p;
> > +
> > +      /* The preferred rename register is an output register iff an input
> > +         register dies in an instruction but the candidate must be validated by
> > +         check_new_reg_p.  */
> > +      for (note = REG_NOTES (this_du->insn); note; note = XEXP (note, 1))
> > +        if (insn_rr.exists()
> > +            && REG_NOTE_KIND (note) == REG_DEAD
> > +            && REGNO (XEXP (note, 0)) == head->regno
> > +            && (p = &insn_rr[INSN_UID (this_du->insn)])
> > +            && p->op_info)
> > +          {
> > +            int i;
> > +            for (i = 0; i < p->op_info->n_chains; i++)
> > +              {
> > +                struct du_head *next_head = p->op_info->heads[i];
> > +                if (head != next_head)
>
> Here you're not actually verifying the chosen preferred reg is an
> output? Is the use of plain "p->op_info" (which is actually an array)
> intentional as a guess that operand 0 is the output? I'm not thrilled
> with this, and at the very least it should be "p->op_info[0]." to avoid
> reader confusion.
> It's also not verifying that this is indeed a case where choosing a
> preferred reg has a beneficial effect at all.

I realized that the check done in find_rename_reg should have been moved here
as, indeed, it is rather confusing. This is more like finding a candidate
than validating it.

AFAICS, "p->op_info" is a pointer to struct operand_rr_info and it can be null
if a chain is opened but not used in a BB. It appears this happens if
a register is live across a BB but not used in the currently processed one.

>
> The use of insn_rr would probably also be unnecessary if this was done
> during the scan phase.
>
> > +                    preferred_reg = next_head->regno;
>
> The problem here is that there's an ordering issue. What if next_head
> gets renamed afterwards? The choice of preferred register hasn't bought
> us anything in that case.
>
> For all these reasons I'd suggest a different approach, looking for such
> situations during the scan. Try to detect a situation where
>   * we have a REG_DEAD note for an existing chain
>   * the insn fulfils certain conditions (i.e. it's a move, or maybe one
>     of the alternatives has a matching constraint). After all, there's
>     not much point in tying if the reg that dies was used in a memory
>     address.
>   * a new chain is started for a single output
> Then, instead of picking a best register, mark the two chains as tied.
> Then, when choosing a rename register, see if a tied chain already was
> renamed, and try to pick the same register first.

I had something similar in mind at the beginning but I didn't see how this
would fit into the scan. I think I now have a better understanding of the
regrename pass, so the attached patch is likely the way to go. I limited
the conditions to moves only as I didn't see any tying for matching
constraints.

>
> > @@ -1826,7 +1900,7 @@ regrename_optimize (void)
> >     df_analyze ();
> >     df_set_flags (DF_DEFER_INSN_RESCAN);
> >
> > -  regrename_init (false);
> > +  regrename_init (true);
>
> It would be good to avoid this as it makes the renamer more expensive. I
> expect that if you follow the strategy described above, this won't be
> necessary.
>
> > -  struct du_chain *chains[MAX_REGS_PER_ADDRESS];
> > -  struct du_head *heads[MAX_REGS_PER_ADDRESS];
> > +  vec<struct du_chain *> chains;
> > +  vec<struct du_head *> heads;
>
> Given that MAX_REGS_PER_ADDRESS tends to be 1 or 2 this appears to make
> things more heavyweight, especially with the extra loop needed to free
> the vecs. If possible, try to avoid this. (Again, AFAICS this
> information shouldn't really be necessary for what you're trying to do).

It happens that MIPS defines the macro to 1, and one of the DSP tests in
Dejagnu failed. I wasn't sure whether the macro used here is the correct one
or whether it should rather rely on NREGS, but that cannot be determined
statically. Alternatively, a new macro like MAX_REGS_PER_OPERAND would
probably be better, but this change is now redundant.

>
>
> Bernd

> > I'll let Bernd comment on the patch itself.  But I would say that if
> > you're setting up cases where we can tie the source/dest of an extension
> > together, then it's a good thing.  It'll cause more of them to turn into
> > NOPs and it'll make the redundant extension elimination pass more
> > effective as well.
>
> Not if you do it in regrename though, as it's run very late in the pipeline.
> If you want to make REE more effective, this would need to be done during RA.
>
> --
> Eric Botcazou

My immediate thought on this is to add a target hook and enable/disable this
on a per-port basis if this hurts performance.

I tried to fix this in IRA using a technique similar to the one in this patch
but couldn't find the right place to apply it. I'll come back to this if I
have some more time.

Regards,
Robert

gcc/
        * regrename.c (create_new_chain): Initialize terminated_dead,
        renamed and tied_chain.
        (find_best_rename_reg): Pick and check register from the tied chain.
        (regrename_do_replace): Mark head as renamed.
        (scan_rtx_reg): Tie chains in move insns.  Set terminate_dead flag.
        * regrename.h (struct du_head): Add tied_chain, renamed and
        terminated_dead members.
---
 gcc/regrename.c | 48 +++++++++++++++++++++++++++++++++++++++++++++++-
 gcc/regrename.h |  6 ++++++
 2 files changed, 53 insertions(+), 1 deletion(-)

diff --git a/gcc/regrename.c b/gcc/regrename.c
index 6517f4e..7356a20 100644
--- a/gcc/regrename.c
+++ b/gcc/regrename.c
@@ -230,6 +230,9 @@ create_new_chain (unsigned this_regno, unsigned this_nregs, rtx *loc,
   head->nregs = this_nregs;
   head->need_caller_save_reg = 0;
   head->cannot_rename = 0;
+  head->terminated_dead = 0;
+  head->renamed = 0;
+  head->tied_chain = NULL;
 
   id_to_chain.safe_push (head);
   head->id = current_id++;
@@ -373,6 +376,13 @@ find_best_rename_reg (du_head_p this_head, enum reg_class super_class,
   preferred_class
     = (enum reg_class) targetm.preferred_rename_class (super_class);
 
+  /* Pick and check the register from the tied chain iff the tied chain
+     is not renamed.  */
+  if (this_head->tied_chain && !this_head->tied_chain->renamed
+      && check_new_reg_p (old_reg, this_head->tied_chain->regno,
+                          this_head, *unavailable))
+    return this_head->tied_chain->regno;
+
   /* If PREFERRED_CLASS is not NO_REGS, we iterate in the first pass
      over registers that belong to PREFERRED_CLASS and try to find the
      best register within the class.  If that failed, we iterate in
@@ -952,6 +962,7 @@ regrename_do_replace (struct du_head *head, int reg)
     }
 
   mode = GET_MODE (*head->first->loc);
+  head->renamed = 1;
   head->regno = reg;
   head->nregs = hard_regno_nregs[reg][mode];
 }
@@ -1035,7 +1046,40 @@ scan_rtx_reg (rtx insn, rtx *loc, enum reg_class cl, enum scan_actions action,
       if (action == mark_write)
         {
           if (type == OP_OUT)
-            create_new_chain (this_regno, this_nregs, loc, insn, cl);
+            {
+              int i;
+              du_head_p c;
+              du_head_p head;
+              rtx pat = PATTERN (insn);
+
+              c = create_new_chain (this_regno, this_nregs, loc, insn, cl);
+
+              /* We try to tie chains in a move instruction for
+                 a single output.  */
+              if (recog_data.n_operands == 2
+                  && GET_CODE (pat) == SET
+                  && GET_CODE (SET_DEST (pat)) == REG
+                  && GET_CODE (SET_SRC (pat)) == REG)
+                {
+                  /* Find the input chain.  */
+                  for (i = c->id - 1; id_to_chain.iterate (i, &head); i--)
+                    if (head->last && head->last->insn == insn
+                        && head->terminated_dead)
+                      {
+                        gcc_assert (head->regno == REGNO (recog_data.operand[1]));
+                        c->tied_chain = head;
+                        head->tied_chain = c;
+
+                        if (dump_file)
+                          fprintf (dump_file, "Tying chain %s (%d) with %s (%d)\n",
+                                   reg_names[c->regno], c->id,
+                                   reg_names[head->regno], head->id);
+                        /* Once tied, we're done.  */
+                        break;
+                      }
+                }
+            }
+
           return;
         }
 
@@ -1143,6 +1187,8 @@ scan_rtx_reg (rtx insn, rtx *loc, enum reg_class cl, enum scan_actions action,
                   SET_HARD_REG_BIT (live_hard_regs, head->regno + nregs);
             }
 
+          if (action == terminate_dead)
+            (*p)->terminated_dead = 1;
           *p = next;
           if (dump_file)
             fprintf (dump_file,
diff --git a/gcc/regrename.h b/gcc/regrename.h
index 9a611f0..a61c5bd 100644
--- a/gcc/regrename.h
+++ b/gcc/regrename.h
@@ -28,6 +28,8 @@ struct du_head
   struct du_head *next_chain;
   /* The first and last elements of this chain.  */
   struct du_chain *first, *last;
+  /* The chain that this chain is tied to.  */
+  struct du_head *tied_chain;
   /* Describes the register being tracked.  */
   unsigned regno;
   int nregs;
@@ -45,6 +47,10 @@ struct du_head
      such as the SET_DEST of a CALL_INSN or an asm operand that used
      to be a hard register.  */
   unsigned int cannot_rename:1;
+  /* Nonzero if the chain is renamed.  */
+  unsigned int renamed:1;
+  /* Nonzero if the chain is marked as dead.  */
+  unsigned int terminated_dead:1;
 };
 
 typedef struct du_head *du_head_p;
-- 
2.4.5
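
The selection step added to the rename-register search in the reworked patch can be modelled in isolation. The following is a minimal sketch, not the GCC code: `toy_head`, `toy_find_rename_reg` and `toy_do_replace` are hypothetical names, registers are assumed to be numbered below 32, the `available` bitmask is an assumed stand-in for `check_new_reg_p`, and picking the lowest free register is only a placeholder for the existing class-based search.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical chain head carrying the two fields the patch relies on.  */
struct toy_head
{
  int regno;                   /* Register currently used by the chain.  */
  bool renamed;                /* Set once the chain has been renamed.  */
  struct toy_head *tied_chain; /* Partner chain from a move, or NULL.  */
};

/* Choose a register for H.  AVAILABLE is a bitmask of registers that
   would pass validation (an assumed stand-in for check_new_reg_p).  */
int
toy_find_rename_reg (const struct toy_head *h, unsigned available)
{
  assert (h != NULL);
  const struct toy_head *t = h->tied_chain;

  /* As in the patch: use the tied chain's register only while that
     chain has not been renamed, and only if it validates.  */
  if (t != NULL && !t->renamed
      && t->regno >= 0 && t->regno < 32
      && (available & (1u << t->regno)))
    return t->regno;

  /* Assumed fallback: the lowest available register, else no change.  */
  for (int r = 0; r < 32; r++)
    if (available & (1u << r))
      return r;
  return h->regno;
}

/* Record a rename, marking the chain the way the patch marks the head.  */
void
toy_do_replace (struct toy_head *h, int reg)
{
  assert (h != NULL);
  h->renamed = true;
  h->regno = reg;
}
```

With an input chain on register 2 tied to an output chain on register 9, the input chain is steered to 9 while the output chain is untouched; once the output chain has been renamed, the shortcut is skipped and the ordinary search decides.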

* Re: [RFC][PATCH] Preferred rename register in regrename pass
  2015-10-09  7:10   ` Robert Suchanek
@ 2015-10-09 11:06     ` Bernd Schmidt
  2015-10-09 11:20       ` Robert Suchanek
  2015-11-09 13:32       ` Robert Suchanek
  0 siblings, 2 replies; 21+ messages in thread
From: Bernd Schmidt @ 2015-10-09 11:06 UTC (permalink / raw)
  To: Robert Suchanek, ebotcazou, gcc-patches

Hi Robert,
> gcc/
>         * regrename.c (create_new_chain): Initialize terminated_dead,
>         renamed and tied_chain.
>         (find_best_rename_reg): Pick and check register from the tied chain.
>         (regrename_do_replace): Mark head as renamed.
>         (scan_rtx_reg): Tie chains in move insns.  Set terminate_dead flag.
>         * regrename.h (struct du_head): Add tied_chain, renamed and
>         terminated_dead members.

Thanks - this looks a lot better already. You didn't say how it was
bootstrapped and tested; please include this information for future
submissions. For a patch like this, some data on the improvement you got
would also be appreciated.

I'd still like to investigate the possibility of further simplification:

> +                {
> +                  /* Find the input chain.  */
> +                  for (i = c->id - 1; id_to_chain.iterate (i, &head); i--)
> +                    if (head->last && head->last->insn == insn
> +                        && head->terminated_dead)
> +                      {
> +                        gcc_assert (head->regno == REGNO (recog_data.operand[1]));
> +                        c->tied_chain = head;
> +                        head->tied_chain = c;
> +
> +                        if (dump_file)
> +                          fprintf (dump_file, "Tying chain %s (%d) with %s (%d)\n",
> +                                   reg_names[c->regno], c->id,
> +                                   reg_names[head->regno], head->id);
> +                        /* Once tied, we're done.  */
> +                        break;
> +                      }
> +                }
> +            }
> +

This looks like it's a little more complicated than necessary. Couldn't
you add a static var "terminated_this_insn" which gets initialized to
NULL and set when a reg dies, and then you check this here rather than
having a loop? That would also eliminate the new "terminated_dead" field.

Other than that I'm pretty happy with this.


Bernd
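
The simplification suggested above, a static variable replacing the backward search over chain ids, might look roughly like the following toy model. Only the variable name `terminated_this_insn` comes from the message; `toy_head`, `toy_begin_insn`, `toy_note_death` and `toy_tie_output` are hypothetical helpers, and the comparison against the move's source register is an assumption mirroring the `gcc_assert` in the quoted loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical chain head; only the fields needed for tying.  */
struct toy_head
{
  int id;
  int regno;
  struct toy_head *tied_chain;
};

/* Chain whose register died in the insn being scanned, or NULL.  */
static struct toy_head *terminated_this_insn;

/* Reset the per-insn state before scanning a new insn.  */
void
toy_begin_insn (void)
{
  terminated_this_insn = NULL;
}

/* Called when the scan closes chain H because its register dies.  */
void
toy_note_death (struct toy_head *h)
{
  terminated_this_insn = h;
}

/* Called when chain OUT is opened for the output of a reg-to-reg move
   whose source is SRC_REGNO.  Tie it to the chain that died in this
   insn, if any; no search over earlier chains is needed.  */
bool
toy_tie_output (struct toy_head *out, int src_regno)
{
  assert (out != NULL);
  struct toy_head *in = terminated_this_insn;
  if (in == NULL || in->regno != src_regno)
    return false;
  out->tied_chain = in;
  in->tied_chain = out;
  return true;
}
```

The per-chain `terminated_dead` bit becomes unnecessary in this model because the information is only needed for the duration of a single insn.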

* RE: [RFC][PATCH] Preferred rename register in regrename pass
  2015-10-09 11:06     ` Bernd Schmidt
@ 2015-10-09 11:20       ` Robert Suchanek
  2015-11-09 13:32       ` Robert Suchanek
  1 sibling, 0 replies; 21+ messages in thread
From: Robert Suchanek @ 2015-10-09 11:20 UTC (permalink / raw)
  To: Bernd Schmidt, ebotcazou, gcc-patches

Hi Bernd,

> Hi Robert,
> > gcc/
> >         * regrename.c (create_new_chain): Initialize terminated_dead,
> >         renamed and tied_chain.
> >         (find_best_rename_reg): Pick and check register from the tied chain.
> >         (regrename_do_replace): Mark head as renamed.
> >         (scan_rtx_reg): Tie chains in move insns.  Set terminate_dead flag.
> >         * regrename.h (struct du_head): Add tied_chain, renamed and
> >         terminated_dead members.
>
> Thanks - this looks a lot better already. You didn't say how it was
> bootstrapped and tested; please include this information for future
> submissions. For a patch like this, some data on the improvement you got
> would also be appreciated.

Ah, sorry. I bootstrapped on x86_64-unknown-linux-gnu and ran the Dejagnu
with -frename-registers. All looked fine. As for the data, I'll do the
comparison and will update this thread by next week.

>
> I'd still like to investigate the possibility of further simplification:
>
> > +                {
> > +                  /* Find the input chain.  */
> > +                  for (i = c->id - 1; id_to_chain.iterate (i, &head); i--)
> > +                    if (head->last && head->last->insn == insn
> > +                        && head->terminated_dead)
> > +                      {
> > +                        gcc_assert (head->regno == REGNO (recog_data.operand[1]));
> > +                        c->tied_chain = head;
> > +                        head->tied_chain = c;
> > +
> > +                        if (dump_file)
> > +                          fprintf (dump_file, "Tying chain %s (%d) with %s (%d)\n",
> > +                                   reg_names[c->regno], c->id,
> > +                                   reg_names[head->regno], head->id);
> > +                        /* Once tied, we're done.  */
> > +                        break;
> > +                      }
> > +                }
> > +            }
> > +
> This looks like it's a little more complicated than necessary. Couldn't
> you add a static var "terminated_this_insn" which gets initialized to
> NULL and set when a reg dies, and then you check this here rather than
> having a loop? That would also eliminate the new "terminated_dead" field.

That is a good idea. I'll add the changes and update together with the
results.

> Other than that I'm pretty happy with this.
>
>
> Bernd

Regards,
Robert

* RE: [RFC][PATCH] Preferred rename register in regrename pass
  2015-10-09 11:06     ` Bernd Schmidt
  2015-10-09 11:20       ` Robert Suchanek
@ 2015-11-09 13:32       ` Robert Suchanek
  2015-11-09 16:30         ` Bernd Schmidt
  2015-11-10 12:10         ` Bernd Schmidt
  1 sibling, 2 replies; 21+ messages in thread
From: Robert Suchanek @ 2015-11-09 13:32 UTC (permalink / raw)
  To: Bernd Schmidt, ebotcazou, gcc-patches

Hi Bernd,

Sorry for the late reply.

The updated patch was bootstrapped on x86_64-unknown-linux-gnu and cross-tested
on mips-img-linux-gnu using r229786.

The results below were generated for the CSiBE benchmark. The numbers in the
columns are in bytes, in the format 'net (gain/loss)', and show the difference
with and without the patch when the -frename-registers switch is used.

I looked at the gains, especially for MIPS and 'teem', and it appears that
renaming registers affects the rtl_dce pass, i.e. makes it less effective.
However, on average the patch appears to reduce the code size slightly and
moves are genuinely removed.

I haven't tested the performance extensively but the SPEC benchmarks showed
almost the same results, which could be just noise.

               | MIPS n64 -Os   | MIPS o32 -Os   | x86_64 -Os       |
---------------+----------------+----------------+------------------+
bzip2-1.0.2    | -32 (0/-32)    | -24 (0/-24)    | -34 (1/-35)      |
cg_compiler    | -172 (0/-172)  | -156 (0/-156)  | -46 (0/-46)      |
compiler       | -36 (0/-36)    | -24 (0/-24)    | -6 (0/-6)        |
flex-2.5.31    | -68 (0/-68)    | -80 (0/-80)    | -98 (7/-105)     |
jikespg-1.3    | -284 (0/-284)  | -204 (0/-204)  | -127 (9/-136)    |
jpeg-6b        | -52 (8/-60)    | -20 (0/-20)    | -80 (11/-91)     |
libmspack      | -136 (0/-136)  | -28 (0/-28)    | -33 (23/-56)     |
libpng-1.2.5   | -72 (0/-72)    | -64 (0/-64)    | -176 (14/-190)   |
linux-2.4.23   | -700 (20/-720) | -384 (0/-384)  | -691 (44/-735)   |
lwip-0.5.3     | -4 (0/-4)      | -4 (0/-4)      | +4 (13/-9)       |
mpeg2dec-0.3.1 | -16 (0/-16)    |                | -142 (6/-148)    |
mpgcut-1.1     | -24 (0/-24)    | -12 (4/-16)    | -2 (0/-2)        |
OpenTCP-1.0.4  | -28 (0/-28)    | -12 (0/-12)    | -1 (0/-1)        |
replaypc-0.4.0 | -32 (0/-32)    | -12 (0/-12)    | -4 (2/-6)        |
teem-1.6.0     | -88 (480/-568) | +108 (564/-456)| -1272 (117/-1389)|
ttt-0.10.1     | -24 (0/-24)    | -20 (0/-20)    | -16 (0/-16)      |
unrarlib-0.4.0 | -20 (0/-20)    | -8 (0/-8)      | -59 (9/-68)      |
zlib-1.1.4     | -12 (0/-12)    | -4 (0/-4)      | -23 (8/-31)      |

               | MIPS n64 -O2   | MIPS o32 -O2   | x86_64 -O2       |
---------------+----------------+----------------+------------------+
bzip2-1.0.2    | -104 (0/-104)  | -48 (0/-48)    | -55 (0/-55)      |
cg_compiler    | -184 (4/-188)  | -232 (0/-232)  | -31 (5/-36)      |
compiler       | -32 (0/-32)    | -12 (0/-12)    | -4 (1/-5)        |
flex-2.5.31    | -96 (0/-96)    | -112 (0/-112)  | -12 (34/-46)     |
jikespg-1.3    | -540 (20/-560) | -476 (4/-480)  | -154 (30/-184)   |
jpeg-6b        | -112 (16/-128) | -60 (0/-60)    | -136 (84/-220)   |
libmspack      | -164 (0/-164)  | -40 (0/-40)    | -87 (32/-119)    |
libpng-1.2.5   | -120 (8/-128)  | -92 (4/-96)    | -140 (53/-193)   |
linux-2.4.23   | -596 (12/-608) | -320 (8/-328)  | -794 (285/-1079) |
lwip-0.5.3     | -8 (0/-8)      | -8 (0/-8)      | +2 (4/-2)        |
mpeg2dec-0.3.1 | -44 (0/-44)    | -4 (0/-4)      | -122 (8/-130)    |
mpgcut-1.1     | -8 (0/-8)      | -8 (0/-8)      | +28 (32/-4)      |
OpenTCP-1.0.4  | -4 (0/-4)      | -4 (0/-4)      | -2 (0/-2)        |
replaypc-0.4.0 | -20 (0/-20)    | -24 (0/-24)    | -13 (0/-13)      |
teem-1.6.0     | +100 (740/-640)| +84 (736/-652) | -1998 (168/-2166)|
ttt-0.10.1     | -16 (0/-16)    |                |                  |
unrarlib-0.4.0 | -16 (0/-16)    | -8 (0/-8)      | +19 (37/-18)     |
zlib-1.1.4     | -12 (0/-12)    | -4 (0/-4)      | -15 (1/-16)      |

Regards,
Robert

> Hi Robert,
> > gcc/
> >         * regrename.c (create_new_chain): Initialize terminated_dead,
> >         renamed and tied_chain.
> >         (find_best_rename_reg): Pick and check register from the tied chain.
> >         (regrename_do_replace): Mark head as renamed.
> >         (scan_rtx_reg): Tie chains in move insns.  Set terminate_dead flag.
> >         * regrename.h (struct du_head): Add tied_chain, renamed and
> >         terminated_dead members.
>
> Thanks - this looks a lot better already. You didn't say how it was
> bootstrapped and tested; please include this information for future
> submissions. For a patch like this, some data on the improvement you got
> would also be appreciated.
>
> I'd still like to investigate the possibility of further simplification:
>
> > +                {
> > +                  /* Find the input chain.  */
> > +                  for (i = c->id - 1; id_to_chain.iterate (i, &head); i--)
> > +                    if (head->last && head->last->insn == insn
> > +                        && head->terminated_dead)
> > +                      {
> > +                        gcc_assert (head->regno == REGNO (recog_data.operand[1]));
> > +                        c->tied_chain = head;
> > +                        head->tied_chain = c;
> > +
> > +                        if (dump_file)
> > +                          fprintf (dump_file, "Tying chain %s (%d) with %s (%d)\n",
> > +                                   reg_names[c->regno], c->id,
> > +                                   reg_names[head->regno], head->id);
> > +                        /* Once tied, we're done.  */
> > +                        break;
> > +                      }
> > +                }
> > +            }
> > +
> This looks like it's a little more complicated than necessary. Couldn't
> you add a static var "terminated_this_insn" which gets initialized to
> NULL and set when a reg dies, and then you check this here rather than
> having a loop? That would also eliminate the new "terminated_dead" field.
>
> Other than that I'm pretty happy with this.
>
>
> Bernd

gcc/
        * regrename.c (create_new_chain): Initialize renamed and tied_chain.
(build_def_use): Initialize terminated_this_insn. (find_best_rename_reg): Pick and check register from the tied chain. (regrename_do_replace): Mark head as renamed. (struct du_head *terminated_this_insn). New static variable. (scan_rtx_reg): Tie chains in move insns. Set terminated_this_insn. * regrename.h (struct du_head): Add tied_chain, renamed members. --- gcc/regrename.c | 46 +++++++++++++++++++++++++++++++++++++++++++++- gcc/regrename.h | 4 ++++ 2 files changed, 49 insertions(+), 1 deletion(-) diff --git a/gcc/regrename.c b/gcc/regrename.c index 5f383fc..d3f9951 100644 --- a/gcc/regrename.c +++ b/gcc/regrename.c @@ -130,6 +130,9 @@ static HARD_REG_SET live_hard_regs; record_operand_use. */ static operand_rr_info *cur_operand; +/* Set while scanning RTL if a register dies. Used to tie chains. */ +static struct du_head *terminated_this_insn; + /* Return the chain corresponding to id number ID. Take into account that chains may have been merged. */ du_head_p @@ -224,6 +227,8 @@ create_new_chain (unsigned this_regno, unsigned this_nregs, rtx *loc, head->nregs = this_nregs; head->need_caller_save_reg = 0; head->cannot_rename = 0; + head->renamed = 0; + head->tied_chain = NULL; id_to_chain.safe_push (head); head->id = current_id++; @@ -366,6 +371,13 @@ find_rename_reg (du_head_p this_head, enum reg_class super_class, preferred_class = (enum reg_class) targetm.preferred_rename_class (super_class); + /* Pick and check the register from the tied chain iff the tied chain + is not renamed. */ + if (this_head->tied_chain && !this_head->tied_chain->renamed + && check_new_reg_p (old_reg, this_head->tied_chain->regno, + this_head, *unavailable)) + return this_head->tied_chain->regno; + /* If PREFERRED_CLASS is not NO_REGS, we iterate in the first pass over registers that belong to PREFERRED_CLASS and try to find the best register within the class. 
If that failed, we iterate in @@ -960,6 +972,7 @@ regrename_do_replace (struct du_head *head, int reg) return false; mode = GET_MODE (*head->first->loc); + head->renamed = 1; head->regno = reg; head->nregs = hard_regno_nregs[reg][mode]; return true; @@ -1043,7 +1056,34 @@ scan_rtx_reg (rtx_insn *insn, rtx *loc, enum reg_class cl, enum scan_actions act if (action == mark_write) { if (type == OP_OUT) - create_new_chain (this_regno, this_nregs, loc, insn, cl); + { + du_head_p c; + rtx pat = PATTERN (insn); + + c = create_new_chain (this_regno, this_nregs, loc, insn, cl); + + /* We try to tie chains in a move instruction for + a single output. */ + if (recog_data.n_operands == 2 + && GET_CODE (pat) == SET + && GET_CODE (SET_DEST (pat)) == REG + && GET_CODE (SET_SRC (pat)) == REG + && terminated_this_insn) + { + gcc_assert + (terminated_this_insn->regno == REGNO (recog_data.operand[1])); + + c->tied_chain = terminated_this_insn; + terminated_this_insn->tied_chain = c; + + if (dump_file) + fprintf (dump_file, "Tying chain %s (%d) with %s (%d)\n", + reg_names[c->regno], c->id, + reg_names[terminated_this_insn->regno], + terminated_this_insn->id); + } + } + return; } @@ -1151,6 +1191,8 @@ scan_rtx_reg (rtx_insn *insn, rtx *loc, enum reg_class cl, enum scan_actions act SET_HARD_REG_BIT (live_hard_regs, head->regno + nregs); } + if (action == terminate_dead) + terminated_this_insn = *p; *p = next; if (dump_file) fprintf (dump_file, @@ -1707,6 +1749,8 @@ build_def_use (basic_block bb) scan_rtx (insn, &XEXP (note, 0), ALL_REGS, mark_read, OP_INOUT); + terminated_this_insn = NULL; + /* Step 4: Close chains for registers that die here, unless the register is mentioned in a REG_UNUSED note. In that case we keep the chain open until step #7 below to ensure diff --git a/gcc/regrename.h b/gcc/regrename.h index bbe156d..9c72181 100644 --- a/gcc/regrename.h +++ b/gcc/regrename.h @@ -28,6 +28,8 @@ struct du_head struct du_head *next_chain; /* The first and last elements of this chain. 
*/ struct du_chain *first, *last; + /* The chain that this chain is tied to. */ + struct du_head *tied_chain; /* Describes the register being tracked. */ unsigned regno; int nregs; @@ -45,6 +47,8 @@ struct du_head such as the SET_DEST of a CALL_INSN or an asm operand that used to be a hard register. */ unsigned int cannot_rename:1; + /* Nonzero if the chain is renamed. */ + unsigned int renamed:1; }; typedef struct du_head *du_head_p; -- 2.4.5 ^ permalink raw reply [flat|nested] 21+ messages in thread
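The preference that the patch above adds to find_rename_reg can be looked at in isolation. What follows is only a toy model, not GCC code: `struct toy_chain`, `toy_reg_ok` and `toy_find_rename_reg` are hypothetical names, the bitmask stands in for whatever check_new_reg_p decides, and the "lowest available register" fallback is a placeholder for the real best-register search.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct du_head; only the fields the patch talks
   about are modelled.  */
struct toy_chain
{
  unsigned regno;               /* register currently used by the chain */
  bool renamed;                 /* chain has already been renamed */
  struct toy_chain *tied_chain; /* chain tied to this one, or NULL */
};

/* Hypothetical validity check playing the role of check_new_reg_p:
   bit REG set in AVAILABLE means REG may be used.  */
static bool
toy_reg_ok (unsigned reg, unsigned long available)
{
  return reg < 8 * sizeof available && ((available >> reg) & 1UL) != 0;
}

/* Prefer the register of the tied chain, provided that chain has not
   been renamed itself and its register passes the check.  Otherwise
   fall back to the lowest available register (placeholder policy),
   or -1 if there is none.  */
static int
toy_find_rename_reg (const struct toy_chain *head, unsigned long available)
{
  if (head->tied_chain && !head->tied_chain->renamed
      && toy_reg_ok (head->tied_chain->regno, available))
    return (int) head->tied_chain->regno;

  for (unsigned r = 0; r < 8 * sizeof available; r++)
    if (toy_reg_ok (r, available))
      return (int) r;

  return -1;
}
```

With an input chain in register 5 tied to an output chain in register 12, the model picks 12 whenever 12 is allowed, which is what would let the move become a no-op; once the output chain is marked as renamed, the preference is skipped and the ordinary search is used.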
* Re: [RFC][PATCH] Preferred rename register in regrename pass 2015-11-09 13:32 ` Robert Suchanek @ 2015-11-09 16:30 ` Bernd Schmidt 2015-11-09 17:01 ` Robert Suchanek 2015-11-10 12:10 ` Bernd Schmidt 1 sibling, 1 reply; 21+ messages in thread From: Bernd Schmidt @ 2015-11-09 16:30 UTC (permalink / raw) To: Robert Suchanek, ebotcazou, gcc-patches

On 11/09/2015 02:32 PM, Robert Suchanek wrote:
> The results below were generated for CSiBE benchmark and the numbers in
> columns express bytes in format 'net (gain/loss)' to show the difference
> with and without the patch when -frename-registers switch is used.

I'm not entirely sure what the numbers represent. I can see how you'd
measure a net size change (I assume a negative net is the intended
goal), but how did you arrive at gain/loss numbers?

In any case, assuming negative is good, the results seem pretty decent.

> + gcc_assert
> + (terminated_this_insn->regno == REGNO (recog_data.operand[1]));

Maybe break the line before the == so that you can start the arguments
on the same line as the assert.

> + /* Nonzero if the chain is renamed. */
> + unsigned int renamed:1;

I'd write "has already been renamed" since that is maybe slightly less
ambiguous.

Ok with those changes.


Bernd

^ permalink raw reply [flat|nested] 21+ messages in thread
* RE: [RFC][PATCH] Preferred rename register in regrename pass 2015-11-09 16:30 ` Bernd Schmidt @ 2015-11-09 17:01 ` Robert Suchanek 2015-11-10 11:21 ` Christophe Lyon 0 siblings, 1 reply; 21+ messages in thread From: Robert Suchanek @ 2015-11-09 17:01 UTC (permalink / raw) To: Bernd Schmidt, ebotcazou, gcc-patches

Hi,

> On 11/09/2015 02:32 PM, Robert Suchanek wrote:
> > The results below were generated for CSiBE benchmark and the numbers in
> > columns express bytes in format 'net (gain/loss)' to show the difference
> > with and without the patch when -frename-registers switch is used.
>
> I'm not entirely sure what the numbers represent. I can see how you'd
> measure at a net size change (I assume a negative net is the intended
> goal), but how did you arrive at gain/loss numbers?
>
> In any case, assuming negative is good, the results seem pretty decent.

The gain/loss was calculated from a per-function analysis.
Each flavour, e.g. MIPS n64 -Os, was run with and without the patch and
compared to the base, i.e. the build without the patch. The patched version
of each function may show a positive difference (larger code size), a
negative difference, or no difference in code size. The gain/loss in a cell
is the sum of all positive/negative numbers for a test. The negatives, as
you said, are the good ones.

>
> > + gcc_assert
> > + (terminated_this_insn->regno == REGNO (recog_data.operand[1]));
>
> Maybe break the line before the == so that you can start the arguments
> on the same line as the assert.
>
> > + /* Nonzero if the chain is renamed. */
> > + unsigned int renamed:1;
>
> I'd write "has already been renamed" since that is maybe slightly less
> ambiguous.
>
> Ok with those changes.
>
>
> Bernd

Will do the changes and apply.

Regards,
Robert

^ permalink raw reply [flat|nested] 21+ messages in thread
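The 'net (gain/loss)' calculation described in the message above can be sketched as follows. This is only an illustration of the arithmetic, not the script that produced the tables; `struct size_summary` and `summarize_sizes` are hypothetical names, and the inputs are assumed to be per-function sizes in bytes for one benchmark.

```c
#include <assert.h>
#include <stddef.h>

/* One table cell: 'net (gain/loss)' in bytes.  */
struct size_summary
{
  long net;   /* sum of all per-function differences */
  long gain;  /* sum of the positive differences (code got larger) */
  long loss;  /* sum of the negative differences (code got smaller) */
};

/* BASE[i] and PATCHED[i] are the sizes of function i without and with
   the patch.  Functions whose size did not change contribute nothing.  */
static struct size_summary
summarize_sizes (const long *base, const long *patched, size_t n)
{
  struct size_summary s = { 0, 0, 0 };

  assert (n == 0 || (base != NULL && patched != NULL));

  for (size_t i = 0; i < n; i++)
    {
      long diff = patched[i] - base[i];

      if (diff > 0)
        s.gain += diff;
      else
        s.loss += diff;
      s.net += diff;
    }

  return s;
}
```

By construction net equals gain plus loss, which matches the cells in the tables, for example -52 (8/-60) for jpeg-6b on MIPS n64 -Os.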
* Re: [RFC][PATCH] Preferred rename register in regrename pass 2015-11-09 17:01 ` Robert Suchanek @ 2015-11-10 11:21 ` Christophe Lyon 2015-11-10 11:41 ` Robert Suchanek 0 siblings, 1 reply; 21+ messages in thread From: Christophe Lyon @ 2015-11-10 11:21 UTC (permalink / raw) To: Robert Suchanek; +Cc: Bernd Schmidt, ebotcazou, gcc-patches On 9 November 2015 at 18:01, Robert Suchanek <Robert.Suchanek@imgtec.com> wrote: > Hi, > >> On 11/09/2015 02:32 PM, Robert Suchanek wrote: >> > The results below were generated for CSiBE benchmark and the numbers in >> > columns express bytes in format 'net (gain/loss)' to show the difference >> > with and without the patch when -frename-registers switch is used. >> >> I'm not entirely sure what the numbers represent. I can see how you'd >> measure at a net size change (I assume a negative net is the intended >> goal), but how did you arrive at gain/loss numbers? >> >> In any case, assuming negative is good, the results seem pretty decent. > > The gain/loss was calculated based on per function analysis. > Each flavour e.g. MIPS n64 -Os was ran with/without the patch and compared to > the base i.e. without the patch. The patched version of each function may > show either positive (larger code size), negative or no difference to > the code size. The gain/loss in a cell is the sum of all positive/negative > numbers for a test. The negatives, as you said, are the good ones. > >> >> > + gcc_assert >> > + (terminated_this_insn->regno == REGNO (recog_data.operand[1])); >> >> Maybe break the line before the == so that you can start the arguments >> on the same line as the assert. >> >> > + /* Nonzero if the chain is renamed. */ >> > + unsigned int renamed:1; >> >> I'd write "has already been renamed" since that is maybe slightly less >> ambiguous. >> >> Ok with those changes. >> >> >> Bernd > > Will do the changes and apply. 
>

Hi,

Since you committed this (r230087 if I'm correct), I can see that GCC
fails to build
libgfortran for target arm-none-linux-gnueabi --with-cpu=cortex-a9.

The backtrace is:
/tmp/8079076_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgfortran/generated/matmul_i8.c: In function 'matmul_i8':
/tmp/8079076_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgfortran/generated/matmul_i8.c:374:1: internal compiler error: in scan_rtx_reg, at regrename.c:1074
 }
 ^
0xa13940 scan_rtx_reg
    /tmp/8079076_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/regrename.c:1074
0xa1451d record_out_operands
    /tmp/8079076_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/regrename.c:1554
0xa14d12 build_def_use
    /tmp/8079076_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/regrename.c:1802
0xa1533e regrename_analyze(bitmap_head*)
    /tmp/8079076_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/regrename.c:726
0xa161f9 regrename_optimize
    /tmp/8079076_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/regrename.c:1871
0xa161f9 execute
    /tmp/8079076_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/regrename.c:1908
Please submit a full bug report,

Can you have a look?

> Regards,
> Robert
>

^ permalink raw reply [flat|nested] 21+ messages in thread
* RE: [RFC][PATCH] Preferred rename register in regrename pass 2015-11-10 11:21 ` Christophe Lyon @ 2015-11-10 11:41 ` Robert Suchanek 2015-11-10 16:22 ` Christophe Lyon 0 siblings, 1 reply; 21+ messages in thread From: Robert Suchanek @ 2015-11-10 11:41 UTC (permalink / raw) To: Christophe Lyon; +Cc: Bernd Schmidt, ebotcazou, gcc-patches

Hi Christophe,

> Hi,
>
> Since you committed this (r230087 if I'm correct), I can see that GCC
> fails to build
> libgfortran for target arm-none-linux-gnueabi --with-cpu=cortex-a9.
...
>
> Can you have a look?

Sorry for the breakage. I see that my assertion is being triggered.
I'll investigate this and check whether the assertion is correct or
something else needs to be done.

Robert

^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC][PATCH] Preferred rename register in regrename pass 2015-11-10 11:41 ` Robert Suchanek @ 2015-11-10 16:22 ` Christophe Lyon 2015-11-10 17:43 ` James Greenhalgh 0 siblings, 1 reply; 21+ messages in thread From: Christophe Lyon @ 2015-11-10 16:22 UTC (permalink / raw) To: Robert Suchanek; +Cc: Bernd Schmidt, ebotcazou, gcc-patches On 10 November 2015 at 12:41, Robert Suchanek <Robert.Suchanek@imgtec.com> wrote: > Hi Christophe, > >> Hi, >> >> Since you committed this (r230087 if I'm correct), I can see that GCC >> fails to build >> ligfortran for target arm-none-linuxgnueabi --with-cpu=cortex-a9. > ... >> >> Can you have a look? > > Sorry for the breakage. I see that my assertion is being triggered. > I'll investigate this and check whether the assertion is correct or > something else needs to be done. > Now that 'make check' has had enough time to run, I can see several regressions in the configurations where GCC still builds. For more details: http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/230087/report-build-info.html > Robert ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC][PATCH] Preferred rename register in regrename pass 2015-11-10 16:22 ` Christophe Lyon @ 2015-11-10 17:43 ` James Greenhalgh 2015-11-10 22:33 ` Robert Suchanek 0 siblings, 1 reply; 21+ messages in thread From: James Greenhalgh @ 2015-11-10 17:43 UTC (permalink / raw) To: Christophe Lyon; +Cc: Robert Suchanek, Bernd Schmidt, ebotcazou, gcc-patches

On Tue, Nov 10, 2015 at 05:22:40PM +0100, Christophe Lyon wrote:
> On 10 November 2015 at 12:41, Robert Suchanek
> <Robert.Suchanek@imgtec.com> wrote:
> > Hi Christophe,
> >
> >> Hi,
> >>
> >> Since you committed this (r230087 if I'm correct), I can see that GCC
> >> fails to build
> >> libgfortran for target arm-none-linux-gnueabi --with-cpu=cortex-a9.
> > ...
> >>
> >> Can you have a look?
> >
> > Sorry for the breakage. I see that my assertion is being triggered.
> > I'll investigate this and check whether the assertion is correct or
> > something else needs to be done.
> >
>
> Now that 'make check' has had enough time to run, I can see several
> regressions in the configurations where GCC still builds.
> For more details:
> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/230087/report-build-info.html
>

This also causes failures for AArch64 -mcpu=cortex-a57 targets. This
testcase:

  void
  foo (unsigned char *out, const unsigned char *in, int a)
  {
    for (int i = 0; i < a; i++)
      {
        out[0] = in[2];
        out[1] = in[1];
        out[2] = in[0];
        in += 3;
        out += 3;
      }
  }

Fails as so:

  foo.c: In function 'void foo(unsigned char*, const unsigned char*, int)':
  foo.c:12:1: internal compiler error: in scan_rtx_reg, at regrename.c:1074
   }
   ^

  0xbe00f8 scan_rtx_reg
    ..../gcc/regrename.c:1073
  0xbe0ad5 scan_rtx
    ..../gcc/regrename.c:1401
  0xbe1038 record_out_operands
    ..../gcc/regrename.c:1554
  0xbe1f50 build_def_use
    ..../gcc/regrename.c:1802
  0xbe1f50 regrename_analyze(bitmap_head*)
    ..../gcc/regrename.c:726
  0xf7a0c7 func_fma_steering::execute_fma_steering()
    ..../gcc/config/aarch64/cortex-a57-fma-steering.c:1026
  0xf7a9c1 pass_fma_steering::execute(function*)
    ..../gcc/config/aarch64/cortex-a57-fma-steering.c:1063
  Please submit a full bug report,
  with preprocessed source if appropriate.
  Please include the complete backtrace with any bug report.
  See <http://gcc.gnu.org/bugs.html> for instructions.

When compiled with:

  <gcc-aarch64> -O3 -mcpu=cortex-a57 foo.c

Thanks,
James

^ permalink raw reply [flat|nested] 21+ messages in thread
* RE: [RFC][PATCH] Preferred rename register in regrename pass 2015-11-10 17:43 ` James Greenhalgh @ 2015-11-10 22:33 ` Robert Suchanek 2015-11-10 22:57 ` Bernd Schmidt 0 siblings, 1 reply; 21+ messages in thread From: Robert Suchanek @ 2015-11-10 22:33 UTC (permalink / raw) To: James Greenhalgh, Christophe Lyon; +Cc: Bernd Schmidt, ebotcazou, gcc-patches

Hi all,

> > Now that 'make check' has had enough time to run, I can see several
> > regressions in the configurations where GCC still builds.
> > For more details:
> > http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/230087/report-build-info.html
> >
>
> This also causes failures for AArch64 -mcpu=cortex-a57 targets. This
> testcase:
>
>   void
>   foo (unsigned char *out, const unsigned char *in, int a)
>   {
>     for (int i = 0; i < a; i++)
>       {
>         out[0] = in[2];
>         out[1] = in[1];
>         out[2] = in[0];
>         in += 3;
>         out += 3;
>       }
>   }
>
> Fails as so:
>
>   foo.c: In function 'void foo(unsigned char*, const unsigned char*, int)':
>   foo.c:12:1: internal compiler error: in scan_rtx_reg, at regrename.c:1074
>    }
>    ^
>
>   0xbe00f8 scan_rtx_reg
>     ..../gcc/regrename.c:1073
>   0xbe0ad5 scan_rtx
>     ..../gcc/regrename.c:1401
>   0xbe1038 record_out_operands
>     ..../gcc/regrename.c:1554
>   0xbe1f50 build_def_use
>     ..../gcc/regrename.c:1802
>   0xbe1f50 regrename_analyze(bitmap_head*)
>     ..../gcc/regrename.c:726
>   0xf7a0c7 func_fma_steering::execute_fma_steering()
>     ..../gcc/config/aarch64/cortex-a57-fma-steering.c:1026
>   0xf7a9c1 pass_fma_steering::execute(function*)
>     ..../gcc/config/aarch64/cortex-a57-fma-steering.c:1063
>   Please submit a full bug report,
>   with preprocessed source if appropriate.
>   Please include the complete backtrace with any bug report.
>   See <http://gcc.gnu.org/bugs.html> for instructions.
>
> When compiled with:
>
>   <gcc-aarch64> -O3 -mcpu=cortex-a57 foo.c
>
> Thanks,
> James

Thanks for the test case.

It appears that I managed to run only those tests that didn't expose the
assertion error, and there is at least one more port, powerpc64, showing
similar ICEs when -funroll-loops and/or -fpeel-loops is used, which
enables the regrename pass. In both the AArch64 and ARM cases I found the
same insufficient checks when chains are tied, and it seems that this is
the root cause behind all failures.

With the attached patch I built arm-none-linux-gnueabi without failures,
checked a number of cases shown on Christophe's page and the above test
case, and it would appear that the problem is solved.

The reason behind the failures is that terminated_this_insn had a
different number of consecutive registers (and mode) from the input
operand in a move currently being considered for tying. In the fix, I
allow tying only if the number of registers (NREGS) matches.

Bernd, do you think that this check would be sufficient and safe?
I'm not sure what would be better: check the mode, nregs plus perhaps
consider tying only if nregs == 1.

Regards,
Robert

gcc/
    * regrename.c (scan_rtx_reg): Check the matching number of consecutive
    registers when tying chains.
---
 gcc/regrename.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/regrename.c b/gcc/regrename.c
index d727dd9..0b8f032 100644
--- a/gcc/regrename.c
+++ b/gcc/regrename.c
@@ -1068,7 +1068,9 @@ scan_rtx_reg (rtx_insn *insn, rtx *loc, enum reg_class cl, enum scan_actions act
               && GET_CODE (pat) == SET
               && GET_CODE (SET_DEST (pat)) == REG
               && GET_CODE (SET_SRC (pat)) == REG
-              && terminated_this_insn)
+              && terminated_this_insn
+              && terminated_this_insn->nregs
+                 == REG_NREGS (recog_data.operand[1]))
             {
               gcc_assert (terminated_this_insn->regno
                           == REGNO (recog_data.operand[1]));
--
2.4.5

^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC][PATCH] Preferred rename register in regrename pass 2015-11-10 22:33 ` Robert Suchanek @ 2015-11-10 22:57 ` Bernd Schmidt 2015-11-11 0:06 ` Robert Suchanek 2015-11-11 8:50 ` Robert Suchanek 0 siblings, 2 replies; 21+ messages in thread From: Bernd Schmidt @ 2015-11-10 22:57 UTC (permalink / raw) To: Robert Suchanek, James Greenhalgh, Christophe Lyon; +Cc: ebotcazou, gcc-patches On 11/10/2015 11:33 PM, Robert Suchanek wrote: > > The reason behind the failures is that the terminated_this_insn had > a different number of consecutive registers (and mode) to the input > operand in a move currently being considered for tying. In the fix, > I allow tying only if there is matching number of NREGS. > > Bernd, do you think that this check would be sufficient and safe? > I'm not sure what would be better: check the mode, nregs plus perhaps > consider tying only if nregs == 1. Hmm, but shouldn't the regno still be the same? Or is this a case where we have a multi-word chain like ax/dx and then something like a "set bx, dx" involving only a part of it, but the entire chain dies? I guess this is ok to stop the failures for now, but you may want to move the check to the point where we set terminated_this_insn. Also, as I pointed out earlier, clearing terminated_this_insn should probably happen earlier. Bernd ^ permalink raw reply [flat|nested] 21+ messages in thread
* RE: [RFC][PATCH] Preferred rename register in regrename pass 2015-11-10 22:57 ` Bernd Schmidt @ 2015-11-11 0:06 ` Robert Suchanek 2015-11-11 8:50 ` Robert Suchanek 1 sibling, 0 replies; 21+ messages in thread From: Robert Suchanek @ 2015-11-11 0:06 UTC (permalink / raw) To: Bernd Schmidt, James Greenhalgh, Christophe Lyon; +Cc: ebotcazou, gcc-patches

Hi,

> > Bernd, do you think that this check would be sufficient and safe?
> > I'm not sure what would be better: check the mode, nregs plus perhaps
> > consider tying only if nregs == 1.
>
> Hmm, but shouldn't the regno still be the same? Or is this a case where
> we have a multi-word chain like ax/dx and then something like a "set bx,
> dx" involving only a part of it, but the entire chain dies?

The more I stare at this the more confusing it is.

Yes, it appears to be a multi-word chain and when a subset dies then the
whole chain dies.

Let's consider the following snippet:

...
(insn 1467 1465 1466 68 (set (reg:DI 4 r4 [626])
        (mult:DI (zero_extend:DI (reg:SI 1 r1 [orig:698 bbase_yn ] [698]))
            (zero_extend:DI (reg:SI 12 ip [orig:700 _302 ] [700])))) /scratch2/check-other-ports/src/gcc/libgfortran/generated/matmul_i8.c:284 54 {*umulsidi3_v6}
     (nil))
(insn 1466 1467 4288 68 (set (reg:SI 2 r2 [625])
        (plus:SI (mult:SI (reg:SI 12 ip [orig:700 _302 ] [700])
                (reg:SI 0 r0 [orig:699 bbase_yn+4 ] [699]))
            (reg:SI 2 r2 [624]))) /scratch2/check-other-ports/src/gcc/libgfortran/generated/matmul_i8.c:284 43 {*mulsi3addsi_v6}
     (expr_list:REG_DEAD (reg:SI 12 ip [orig:700 _302 ] [700])
        (nil)))
(insn 4288 1466 1469 68 (set (reg:SI 12 ip [1933])
        (reg:SI 5 r5 [+4 ])) /scratch2/check-other-ports/src/gcc/libgfortran/generated/matmul_i8.c:284 174 {*arm_movsi_insn}
     (expr_list:REG_DEAD (reg:SI 5 r5 [+4 ])
        (nil)))
...

When the input operand in insn 4288 is terminated as dead then the
terminated_this_insn->regno points to register 4 but this_regno is 5.
terminated_this_insn->last->insn points to insn 1467.

I presume "[+4 ]" for register 5 in the dump indicates that this is a
part of the multi-word register. When a new chain is created for the
output operand with register 12 and tying is attempted then we get an
assertion error.

> I guess this is ok to stop the failures for now, but you may want to
> move the check to the point where we set terminated_this_insn. Also, as
> I pointed out earlier, clearing terminated_this_insn should probably
> happen earlier.
>
> Bernd

Ah yes, I forgot to move this. I'll move it and commit the patch in the
morning.

Regards,
Robert

^ permalink raw reply [flat|nested] 21+ messages in thread
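The situation described in the message above can be reduced to a small toy model. It is not GCC code: `struct toy_span` and the three helpers are hypothetical names, and the register numbers are taken from the dump on the assumption that the DImode value set by insn 1467 occupies the two consecutive registers r4 and r5.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy description of a run of consecutive hard registers, standing in
   for the regno/nregs pair of a chain or of an operand.  */
struct toy_span
{
  unsigned regno;  /* first hard register */
  unsigned nregs;  /* number of consecutive registers */
};

/* True if S includes hard register REG.  */
static bool
toy_covers (const struct toy_span *s, unsigned reg)
{
  return reg >= s->regno && reg - s->regno < s->nregs;
}

/* The condition the original gcc_assert expected to hold when tying.  */
static bool
toy_same_regno (const struct toy_span *terminated,
                const struct toy_span *src)
{
  return terminated->regno == src->regno;
}

/* Toy form of the guard added by the fix: only tie when the chain that
   died and the source operand of the move span the same number of
   registers.  */
static bool
toy_can_tie (const struct toy_span *terminated, const struct toy_span *src)
{
  return terminated != NULL && terminated->nregs == src->nregs;
}
```

In this model the chain that dies in insn 4288 is `{ 4, 2 }` and the source operand is `{ 5, 1 }`: the chain covers r5, `toy_same_regno` is false (the condition that tripped the assertion), and `toy_can_tie` rejects the pair so the assertion is never reached. A single-register chain `{ 5, 1 }` would still be tied.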
* RE: [RFC][PATCH] Preferred rename register in regrename pass 2015-11-10 22:57 ` Bernd Schmidt 2015-11-11 0:06 ` Robert Suchanek @ 2015-11-11 8:50 ` Robert Suchanek 2015-11-12 7:47 ` Christophe Lyon 1 sibling, 1 reply; 21+ messages in thread From: Robert Suchanek @ 2015-11-11 8:50 UTC (permalink / raw) To: Bernd Schmidt, James Greenhalgh, Christophe Lyon; +Cc: ebotcazou, gcc-patches

Hi,

> I guess this is ok to stop the failures for now, but you may want to
> move the check to the point where we set terminated_this_insn. Also, as
> I pointed out earlier, clearing terminated_this_insn should probably
> happen earlier.

Here is the updated patch that I'm about to commit once the bootstrap
finishes.

Regards,
Robert

gcc/
    * regrename.c (scan_rtx_reg): Check the matching number of consecutive
    registers when tying chains.
    (build_def_use): Move terminated_this_insn earlier in the function.
---
 gcc/regrename.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/gcc/regrename.c b/gcc/regrename.c
index d727dd9..d41410a 100644
--- a/gcc/regrename.c
+++ b/gcc/regrename.c
@@ -1068,7 +1068,9 @@ scan_rtx_reg (rtx_insn *insn, rtx *loc, enum reg_class cl, enum scan_actions act
               && GET_CODE (pat) == SET
               && GET_CODE (SET_DEST (pat)) == REG
               && GET_CODE (SET_SRC (pat)) == REG
-              && terminated_this_insn)
+              && terminated_this_insn
+              && terminated_this_insn->nregs
+                 == REG_NREGS (recog_data.operand[1]))
             {
               gcc_assert (terminated_this_insn->regno
                           == REGNO (recog_data.operand[1]));
@@ -1593,6 +1595,7 @@ build_def_use (basic_block bb)
           enum rtx_code set_code = SET;
           enum rtx_code clobber_code = CLOBBER;
           insn_rr_info *insn_info = NULL;
+          terminated_this_insn = NULL;

           /* Process the insn, determining its effect on the def-use
              chains and live hard registers. We perform the following
@@ -1749,8 +1752,6 @@ build_def_use (basic_block bb)
                 scan_rtx (insn, &XEXP (note, 0), ALL_REGS, mark_read,
                           OP_INOUT);

-          terminated_this_insn = NULL;
-
           /* Step 4: Close chains for registers that die here, unless
              the register is mentioned in a REG_UNUSED note. In that
              case we keep the chain open until step #7 below to ensure
--
2.4.

^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC][PATCH] Preferred rename register in regrename pass 2015-11-11 8:50 ` Robert Suchanek @ 2015-11-12 7:47 ` Christophe Lyon 2015-11-12 12:57 ` Robert Suchanek 0 siblings, 1 reply; 21+ messages in thread From: Christophe Lyon @ 2015-11-12 7:47 UTC (permalink / raw) To: Robert Suchanek; +Cc: Bernd Schmidt, James Greenhalgh, ebotcazou, gcc-patches On 11 November 2015 at 09:50, Robert Suchanek <Robert.Suchanek@imgtec.com> wrote: > Hi, > >> I guess this is ok to stop the failures for now, but you may want to >> move the check to the point where we set terminated_this_insn. Also, as >> I pointed out earlier, clearing terminated_this_insn should probably >> happen earlier. > > Here is the updated patch that I'm about to commit once the bootstrap > finishes. > Hi, I confirm that this fixes the build errors I was seeing. Thanks. > Regards, > Robert > > gcc/ > * regname.c (scan_rtx_reg): Check the matching number of consecutive > registers when tying chains. > (build_def_use): Move terminated_this_insn earlier in the function. > --- > gcc/regrename.c | 7 ++++--- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff --git a/gcc/regrename.c b/gcc/regrename.c > index d727dd9..d41410a 100644 > --- a/gcc/regrename.c > +++ b/gcc/regrename.c > @@ -1068,7 +1068,9 @@ scan_rtx_reg (rtx_insn *insn, rtx *loc, enum reg_class cl, enum scan_actions act > && GET_CODE (pat) == SET > && GET_CODE (SET_DEST (pat)) == REG > && GET_CODE (SET_SRC (pat)) == REG > - && terminated_this_insn) > + && terminated_this_insn > + && terminated_this_insn->nregs > + == REG_NREGS (recog_data.operand[1])) > { > gcc_assert (terminated_this_insn->regno > == REGNO (recog_data.operand[1])); > @@ -1593,6 +1595,7 @@ build_def_use (basic_block bb) > enum rtx_code set_code = SET; > enum rtx_code clobber_code = CLOBBER; > insn_rr_info *insn_info = NULL; > + terminated_this_insn = NULL; > > /* Process the insn, determining its effect on the def-use > chains and live hard registers. 
We perform the following > @@ -1749,8 +1752,6 @@ build_def_use (basic_block bb) > scan_rtx (insn, &XEXP (note, 0), ALL_REGS, mark_read, > OP_INOUT); > > - terminated_this_insn = NULL; > - > /* Step 4: Close chains for registers that die here, unless > the register is mentioned in a REG_UNUSED note. In that > case we keep the chain open until step #7 below to ensure > -- > 2.4. ^ permalink raw reply [flat|nested] 21+ messages in thread
* RE: [RFC][PATCH] Preferred rename register in regrename pass 2015-11-12 7:47 ` Christophe Lyon @ 2015-11-12 12:57 ` Robert Suchanek 0 siblings, 0 replies; 21+ messages in thread From: Robert Suchanek @ 2015-11-12 12:57 UTC (permalink / raw) To: Christophe Lyon; +Cc: Bernd Schmidt, James Greenhalgh, ebotcazou, gcc-patches

Hi Christophe,

> >
> Hi,
> I confirm that this fixes the build errors I was seeing.
> Thanks.
>

Thanks for checking this.

I'm still seeing a number of ICEs on the gcc-testresults mailing list
across various ports but these are likely to be caused by another patch.
They are already reported as PR68293 and PR68296.

Regards,
Robert

^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC][PATCH] Preferred rename register in regrename pass 2015-11-09 13:32 ` Robert Suchanek 2015-11-09 16:30 ` Bernd Schmidt @ 2015-11-10 12:10 ` Bernd Schmidt 1 sibling, 0 replies; 21+ messages in thread From: Bernd Schmidt @ 2015-11-10 12:10 UTC (permalink / raw) To: Robert Suchanek, ebotcazou, gcc-patches On 11/09/2015 02:32 PM, Robert Suchanek wrote: > @@ -1707,6 +1749,8 @@ build_def_use (basic_block bb) > scan_rtx (insn, &XEXP (note, 0), ALL_REGS, mark_read, > OP_INOUT); > > + terminated_this_insn = NULL; > + > /* Step 4: Close chains for registers that die here, unless > the register is mentioned in a REG_UNUSED note. In that > case we keep the chain open until step #7 below to ensure I suspect you'll want to move this earlier, just before step 1. My guess would be that the reported failure was for an earlyclobber operand. Bernd ^ permalink raw reply [flat|nested] 21+ messages in thread