* [PATCH] rs6000: Use SAVE_MULTIPLE only if we restore what it saves (PR80938) @ 2017-08-09 21:16 Segher Boessenkool 2017-08-10 4:18 ` Alan Modra 0 siblings, 1 reply; 11+ messages in thread From: Segher Boessenkool @ 2017-08-09 21:16 UTC (permalink / raw) To: gcc-patches; +Cc: dje.gcc, Segher Boessenkool We can have SAVE_MULTIPLE while we do not have REST_MULTIPLE. If the inline restore does not restore all registers, the CFI for the save and restore can conflict if things are shrink-wrapped. We could restore all registers that are saved (not ideal), or emit the CFI notes to say we did (which will work fine, but is also not so great); instead, let's not save the registers that are unused. Tested on powerpc64-linux {-m32,-m64}; committing to trunk. Segher 2017-08-09 Segher Boessenkool <segher@kernel.crashing.org> PR target/80938 * config/rs6000/rs6000.c (rs6000_savres_strategy): Don't use SAVE_MULTIPLE if not all the registers that saves, should be saved. --- gcc/config/rs6000/rs6000.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 0f0b1ff..e8cdd25 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -24437,6 +24437,21 @@ rs6000_savres_strategy (rs6000_stack_t *info, else if (!lr_save_p && info->first_gp_reg_save > 29) strategy |= SAVE_INLINE_GPRS | REST_INLINE_GPRS; + /* We can only use save multiple if we need to save all the registers from + first_gp_reg_save. Otherwise, the CFI gets messed up (we save some + register we do not restore). */ + if (strategy & SAVE_MULTIPLE) + { + int i; + + for (i = info->first_gp_reg_save; i < 32; i++) + if (fixed_reg_p (i) || !save_reg_p (i)) + { + strategy &= ~SAVE_MULTIPLE; + break; + } + } + /* We can only use load multiple or the out-of-line routines to restore gprs if we've saved all the registers from first_gp_reg_save. Otherwise, we risk loading garbage. -- 1.9.3 ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] rs6000: Use SAVE_MULTIPLE only if we restore what it saves (PR80938) 2017-08-09 21:16 [PATCH] rs6000: Use SAVE_MULTIPLE only if we restore what it saves (PR80938) Segher Boessenkool @ 2017-08-10 4:18 ` Alan Modra 2017-08-10 4:44 ` Segher Boessenkool 0 siblings, 1 reply; 11+ messages in thread From: Alan Modra @ 2017-08-10 4:18 UTC (permalink / raw) To: Segher Boessenkool; +Cc: gcc-patches, dje.gcc On Wed, Aug 09, 2017 at 09:06:18PM +0000, Segher Boessenkool wrote: > We can have SAVE_MULTIPLE while we do not have REST_MULTIPLE. If the > inline restore does not restore all registers, the CFI for the save > and restore can conflict if things are shrink-wrapped. > > We could restore all registers that are saved (not ideal), No, we can't do that. A -ffixed-reg register should not be restored even if it is saved, as such regs should never be written. For example, a fixed reg might be counting interrupts. If you restore it, you may lose count. Another example is a fixed reg used as some sort of context pointer. If you restore in a function that sets a new context you're obviously breaking user code. > or emit > the CFI notes to say we did (which will work fine, No, you can't do that either. Unwinding might then restore a -ffixed-reg register. > but is also not > so great); instead, let's not save the registers that are unused. Ick, looks like papering over the real problem to me, and will no doubt cause -Os size regressions. Also, if SAVE_MULTIPLE causes this bad interaction with shrink-wrap, does using the out-of-line register save functions cause the same problem? I haven't looked yet, but at a guess I suspect the correct solution is to modify cfa_restores in rs6000_emit_epilogue. -- Alan Modra Australia Development Lab, IBM ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] rs6000: Use SAVE_MULTIPLE only if we restore what it saves (PR80938) 2017-08-10 4:18 ` Alan Modra @ 2017-08-10 4:44 ` Segher Boessenkool 2017-08-10 5:26 ` Alan Modra 0 siblings, 1 reply; 11+ messages in thread From: Segher Boessenkool @ 2017-08-10 4:44 UTC (permalink / raw) To: Alan Modra; +Cc: gcc-patches, dje.gcc On Thu, Aug 10, 2017 at 10:33:05AM +0930, Alan Modra wrote: > On Wed, Aug 09, 2017 at 09:06:18PM +0000, Segher Boessenkool wrote: > > We can have SAVE_MULTIPLE while we do not have REST_MULTIPLE. If the > > inline restore does not restore all registers, the CFI for the save > > and restore can conflict if things are shrink-wrapped. > > > > We could restore all registers that are saved (not ideal), > > No, we can't do that. A -ffixed-reg register should not be restored > even if it is saved, as such regs should never be written. For > example, a fixed reg might be counting interrupts. If you restore it, > you may lose count. Another example is a fixed reg used as some sort > of context pointer. If you restore in a function that sets a new > context you're obviously breaking user code. Yes, sorry for glossing over this. We need to handle fixed registers specially in most other prologue/epilogue things, too (and we hopefully do everywhere it is needed). > > or emit > > the CFI notes to say we did (which will work fine, > > No, you can't do that either. Unwinding might then restore a > -ffixed-reg register. Yep. > > but is also not > > so great); instead, let's not save the registers that are unused. > > Ick, looks like papering over the real problem to me, and will no > doubt cause -Os size regressions. I think it is very directly solving the real problem. It isn't likely to cause size regressions (look how long it took for this PR to show up, so this cannot happen often). This only happens if r30 (the PIC reg) is used but r31 isn't; which means that a) there are no other registers to save, or b) the function is marked as needing a hard frame pointer but eventually doesn't need one. (RA picks the registers r31, r30, ... in sequence). In the case in the PR, this patch replaces one insn by one (cheaper) insn. > Also, if SAVE_MULTIPLE causes this > bad interaction with shrink-wrap, does using the out-of-line register > save functions cause the same problem? I do not know. Not with the code in the PR (restoring only one or two registers isn't done out-of-line, and it's a sibcall exit as well). Segher ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] rs6000: Use SAVE_MULTIPLE only if we restore what it saves (PR80938) 2017-08-10 4:44 ` Segher Boessenkool @ 2017-08-10 5:26 ` Alan Modra 2017-08-10 13:46 ` Segher Boessenkool 0 siblings, 1 reply; 11+ messages in thread From: Alan Modra @ 2017-08-10 5:26 UTC (permalink / raw) To: Segher Boessenkool; +Cc: gcc-patches, dje.gcc [-- Attachment #1: Type: text/plain, Size: 2566 bytes --] On Wed, Aug 09, 2017 at 09:28:22PM -0500, Segher Boessenkool wrote: > On Thu, Aug 10, 2017 at 10:33:05AM +0930, Alan Modra wrote: > > On Wed, Aug 09, 2017 at 09:06:18PM +0000, Segher Boessenkool wrote: > > > We can have SAVE_MULTIPLE while we do not have REST_MULTIPLE. If the > > > inline restore does not restore all registers, the CFI for the save > > > and restore can conflict if things are shrink-wrapped. > > > > > > We could restore all registers that are saved (not ideal), > > > > No, we can't do that. A -ffixed-reg register should not be restored > > even if it is saved, as such regs should never be written. For > > example, a fixed reg might be counting interrupts. If you restore it, > > you may lose count. Another example is a fixed reg used as some sort > > of context pointer. If you restore in a function that sets a new > > context you're obviously breaking user code. > > Yes, sorry for glossing over this. We need to handle fixed registers > specially in most other prologue/epilogue things, too (and we hopefully > do everywhere it is needed). We don't. :-( I have a fix and will post after testing. > > > or emit > > > the CFI notes to say we did (which will work fine, > > > > No, you can't do that either. Unwinding might then restore a > > -ffixed-reg register. > > Yep. > > > > but is also not > > > so great); instead, let's not save the registers that are unused. > > > > Ick, looks like papering over the real problem to me, and will no > > doubt cause -Os size regressions. > > I think it is very directly solving the real problem. It isn't likely > to cause size regressions (look how long it took for this PR to show > up, so this cannot happen often). > > This only happens if r30 (the PIC reg) is used but r31 isn't; which means > that a) there are no other registers to save, or b) the function is marked > as needing a hard frame pointer but eventually doesn't need one. > > (RA picks the registers r31, r30, ... in sequence). > > In the case in the PR, this patch replaces one insn by one (cheaper) > insn. And in other cases your patch will prevent stmw when it should be used. Testcase attached. It shows the wrong use of lmw too. > > Also, if SAVE_MULTIPLE causes this > > bad interaction with shrink-wrap, does using the out-of-line register > > save functions cause the same problem? > > I do not know. Not with the code in the PR (restoring only one or two > registers isn't done out-of-line, and it's a sibcall exit as well). > > > Segher -- Alan Modra Australia Development Lab, IBM [-- Attachment #2: savres.c --] [-- Type: text/x-csrc, Size: 947 bytes --] #ifdef USER_R26 register long r26 __asm__ ("r26"); #endif int f1 (void) { register long r25 __asm__ ("r25"); register long r27 __asm__ ("r27"); register long r28 __asm__ ("r28"); register long r29 __asm__ ("r29"); register long r31 __asm__ ("r31"); __asm__ __volatile__ ("#uses %0 %1 %2 %3 %4" : "=r" (r25), "=r" (r27), "=r" (r28), "=r" (r29), "=r" (r31)); return 0; } void f2 (char *s) { int col = 0; char c; void f3 (char c) { extern int f4 (char); register long r25 __asm__ ("r25"); register long r27 __asm__ ("r27"); register long r28 __asm__ ("r28"); register long r29 __asm__ ("r29"); register long r31 __asm__ ("r31"); if (c == '\t') do f3 (' '); while (col % 8 != 0); else { __asm__ __volatile__ ("#uses %0 %1 %2 %3 %4" : "=r" (r25), "=r" (r27), "=r" (r28), "=r" (r29), "=r" (r31)); f4 (c); col++; } } while ((c = *s++) != 0) f3 (c); } ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] rs6000: Use SAVE_MULTIPLE only if we restore what it saves (PR80938) 2017-08-10 5:26 ` Alan Modra @ 2017-08-10 13:46 ` Segher Boessenkool 2017-08-11 4:58 ` [RS6000] PR 80938, Don't emit eh_frame for regs that don't need saving Alan Modra 0 siblings, 1 reply; 11+ messages in thread From: Segher Boessenkool @ 2017-08-10 13:46 UTC (permalink / raw) To: Alan Modra; +Cc: gcc-patches, dje.gcc On Thu, Aug 10, 2017 at 02:17:40PM +0930, Alan Modra wrote: > > > Ick, looks like papering over the real problem to me, and will no > > > doubt cause -Os size regressions. > > > > I think it is very directly solving the real problem. It isn't likely > > to cause size regressions (look how long it took for this PR to show > > up, so this cannot happen often). > > > > This only happens if r30 (the PIC reg) is used but r31 isn't; which means > > that a) there are no other registers to save, or b) the function is marked > > as needing a hard frame pointer but eventually doesn't need one. > > > > (RA picks the registers r31, r30, ... in sequence). > > > > In the case in the PR, this patch replaces one insn by one (cheaper) > > insn. > > And in other cases your patch will prevent stmw when it should be > used. Testcase attached. It shows the wrong use of lmw too. I dunno... If you change r25 to r14 in that testcase it will use stmw 14 with my patch reverted. Not very reasonable imho (but then again, people using register asm like this get what they deserve anyway). Segher ^ permalink raw reply [flat|nested] 11+ messages in thread
* [RS6000] PR 80938, Don't emit eh_frame for regs that don't need saving 2017-08-10 13:46 ` Segher Boessenkool @ 2017-08-11 4:58 ` Alan Modra 2017-08-15 10:40 ` Segher Boessenkool 0 siblings, 1 reply; 11+ messages in thread From: Alan Modra @ 2017-08-11 4:58 UTC (permalink / raw) To: Segher Boessenkool; +Cc: gcc-patches, dje.gcc It is possible when using out-of-line register saves or store multiple to save some registers unnecessarily, for example one reg in the block saved might be unused. We don't need to emit eh_frame info for those registers as that just bloats the eh_frame info, and also can result in an ICE when shrink-wrap gives multiple paths through the function saving different sets of registers. All the join points need to have identical eh_frame register save state. This patch reverts the previous fix for PR80939 "Use SAVE_MULTIPLE only if we restore what it saves (PR80938)" and instead fixes the PR by correcting the eh_frame info. The change to rs6000_savres_strategy is an optimization, but note that it hides the underlying problem in the PR testcase. Bootstrapped and regression tested powerpc64-linux (-m32 too) and powerpc64le-linux, with https://gcc.gnu.org/ml/gcc-patches/2017-08/msg00774.html and https://gcc.gnu.org/ml/gcc-patches/2017-08/msg00775.html applied. OK to apply? PR target/80938 * config/rs6000/rs6000.c (rs6000_savres_strategy): Revert 2017-08-09. Don't use store multiple if only one reg needs saving. (rs6000_frame_related): Don't emit eh_frame for regs that don't need saving. (rs6000_emit_epilogue): Likewise. diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 2070648..abc55bd 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -24432,20 +24432,37 @@ rs6000_savres_strategy (rs6000_stack_t *info, && flag_shrink_wrap_separate && optimize_function_for_speed_p (cfun))) { - /* Prefer store multiple for saves over out-of-line routines, - since the store-multiple instruction will always be smaller. */ - strategy |= SAVE_INLINE_GPRS | SAVE_MULTIPLE; - - /* The situation is more complicated with load multiple. We'd - prefer to use the out-of-line routines for restores, since the - "exit" out-of-line routines can handle the restore of LR and the - frame teardown. However if doesn't make sense to use the - out-of-line routine if that is the only reason we'd need to save - LR, and we can't use the "exit" out-of-line gpr restore if we - have saved some fprs; In those cases it is advantageous to use - load multiple when available. */ - if (info->first_fp_reg_save != 64 || !lr_save_p) - strategy |= REST_INLINE_GPRS | REST_MULTIPLE; + int count; + + for (count = 0, i = info->first_gp_reg_save; i < 32; i++) + if (save_reg_p (i)) + count += 1; + + if (count <= 1) + /* Don't use store multiple if only one reg needs to be + saved. This can occur for example when the ABI_V4 pic reg + (r30) needs to be saved to make calls, but r31 is not + used. */ + strategy |= SAVE_INLINE_GPRS | REST_INLINE_GPRS; + else + { + /* Prefer store multiple for saves over out-of-line + routines, since the store-multiple instruction will + always be smaller. */ + strategy |= SAVE_INLINE_GPRS | SAVE_MULTIPLE; + + /* The situation is more complicated with load multiple. + We'd prefer to use the out-of-line routines for restores, + since the "exit" out-of-line routines can handle the + restore of LR and the frame teardown. However if doesn't + make sense to use the out-of-line routine if that is the + only reason we'd need to save LR, and we can't use the + "exit" out-of-line gpr restore if we have saved some + fprs; In those cases it is advantageous to use load + multiple when available. */ + if (info->first_fp_reg_save != 64 || !lr_save_p) + strategy |= REST_INLINE_GPRS | REST_MULTIPLE; + } } /* Using the "exit" out-of-line routine does not improve code size @@ -24454,21 +24471,6 @@ rs6000_savres_strategy (rs6000_stack_t *info, else if (!lr_save_p && info->first_gp_reg_save > 29) strategy |= SAVE_INLINE_GPRS | REST_INLINE_GPRS; - /* We can only use save multiple if we need to save all the registers from - first_gp_reg_save. Otherwise, the CFI gets messed up (we save some - register we do not restore). */ - if (strategy & SAVE_MULTIPLE) - { - int i; - - for (i = info->first_gp_reg_save; i < 32; i++) - if (fixed_reg_p (i) || !save_reg_p (i)) - { - strategy &= ~SAVE_MULTIPLE; - break; - } - } - /* Don't ever restore fixed regs. */ if ((strategy & (REST_INLINE_GPRS | REST_MULTIPLE)) != REST_INLINE_GPRS) for (i = info->first_gp_reg_save; i < 32; i++) @@ -25681,9 +25683,15 @@ rs6000_frame_related (rtx_insn *insn, rtx reg, HOST_WIDE_INT val, register save functions, or store multiple, then omit eh_frame info for any user-defined global regs. If eh_frame info is supplied, frame unwinding will - restore a user reg. */ + restore a user reg. Also omit eh_frame info for any + reg we don't need to save, as that bloats eh_frame + and can cause problems with shrink wrapping. Saves + of r0 are actually saving LR, so don't omit those. */ if (!REG_P (SET_SRC (set)) - || !fixed_reg_p (REGNO (SET_SRC (set)))) + || REGNO (SET_SRC (set)) == 0 + || REGNO (SET_SRC (set)) == CR2_REGNO + || (!fixed_reg_p (REGNO (SET_SRC (set))) + && save_reg_p (REGNO (SET_SRC (set))))) RTX_FRAME_RELATED_P (set) = 1; } RTX_FRAME_RELATED_P (insn) = 1; @@ -25720,9 +25728,13 @@ rs6000_frame_related (rtx_insn *insn, rtx reg, HOST_WIDE_INT val, set = simplify_replace_rtx (set, reg2, repl2); XVECEXP (pat, 0, i) = set; - /* Omit eh_frame info for any user-defined global regs. */ + /* Omit eh_frame info for any user-defined global regs or + regs that don't need to be saved. */ if (!REG_P (SET_SRC (set)) - || !fixed_reg_p (REGNO (SET_SRC (set)))) + || REGNO (SET_SRC (set)) == 0 + || REGNO (SET_SRC (set)) == CR2_REGNO + || (!fixed_reg_p (REGNO (SET_SRC (set))) + && save_reg_p (REGNO (SET_SRC (set))))) RTX_FRAME_RELATED_P (set) = 1; } } @@ -27945,7 +27957,8 @@ rs6000_emit_epilogue (int sibcall) RTVEC_ELT (p, j++) = gen_frame_load (reg, frame_reg_rtx, info->gp_save_offset + reg_size * i); - if (flag_shrink_wrap) + if (flag_shrink_wrap + && save_reg_p (info->first_gp_reg_save + i)) cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores); } for (i = 0; info->first_altivec_reg_save + i <= LAST_ALTIVEC_REGNO; i++) @@ -27954,7 +27967,8 @@ rs6000_emit_epilogue (int sibcall) RTVEC_ELT (p, j++) = gen_frame_load (reg, frame_reg_rtx, info->altivec_save_offset + 16 * i); - if (flag_shrink_wrap) + if (flag_shrink_wrap + && save_reg_p (info->first_altivec_reg_save + i)) cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores); } for (i = 0; info->first_fp_reg_save + i <= 63; i++) @@ -27964,7 +27978,8 @@ rs6000_emit_epilogue (int sibcall) info->first_fp_reg_save + i); RTVEC_ELT (p, j++) = gen_frame_load (reg, frame_reg_rtx, info->fp_save_offset + 8 * i); - if (flag_shrink_wrap) + if (flag_shrink_wrap + && save_reg_p (info->first_fp_reg_save + i)) cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores); } RTVEC_ELT (p, j++) @@ -28085,7 +28100,8 @@ rs6000_emit_epilogue (int sibcall) && (flag_shrink_wrap || (offset_below_red_zone_p (info->altivec_save_offset - + 16 * (i - info->first_altivec_reg_save))))) + + 16 * (i - info->first_altivec_reg_save)))) + && save_reg_p (i)) { rtx reg = gen_rtx_REG (V4SImode, i); cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores); @@ -28297,7 +28313,8 @@ rs6000_emit_epilogue (int sibcall) for (i = info->first_altivec_reg_save; i <= LAST_ALTIVEC_REGNO; ++i) if (((strategy & REST_INLINE_VRS) == 0 || (info->vrsave_mask & ALTIVEC_REG_BIT (i)) != 0) - && (DEFAULT_ABI == ABI_V4 || flag_shrink_wrap)) + && (DEFAULT_ABI == ABI_V4 || flag_shrink_wrap) + && save_reg_p (i)) { rtx reg = gen_rtx_REG (V4SImode, i); cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores); @@ -28643,7 +28660,8 @@ rs6000_emit_epilogue (int sibcall) RTVEC_ELT (p, elt++) = gen_frame_load (reg, sp_reg_rtx, info->fp_save_offset + 8 * i); - if (flag_shrink_wrap) + if (flag_shrink_wrap + && save_reg_p (info->first_fp_reg_save + i)) cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores); } -- Alan Modra Australia Development Lab, IBM ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RS6000] PR 80938, Don't emit eh_frame for regs that don't need saving 2017-08-11 4:58 ` [RS6000] PR 80938, Don't emit eh_frame for regs that don't need saving Alan Modra @ 2017-08-15 10:40 ` Segher Boessenkool 2017-08-16 0:13 ` Alan Modra 0 siblings, 1 reply; 11+ messages in thread From: Segher Boessenkool @ 2017-08-15 10:40 UTC (permalink / raw) To: Alan Modra; +Cc: gcc-patches, dje.gcc Hi! On Fri, Aug 11, 2017 at 12:40:11PM +0930, Alan Modra wrote: > It is possible when using out-of-line register saves or store multiple > to save some registers unnecessarily, for example one reg in the block > saved might be unused. We don't need to emit eh_frame info for those > registers as that just bloats the eh_frame info, and also can result > in an ICE when shrink-wrap gives multiple paths through the function > saving different sets of registers. All the join points need to have > identical eh_frame register save state. It isn't just eh_frame, we might have never noticed if it was. It is also the DWARF call frame info used for debugging. > PR target/80938 > * config/rs6000/rs6000.c (rs6000_savres_strategy): Revert 2017-08-09. > Don't use store multiple if only one reg needs saving. > (rs6000_frame_related): Don't emit eh_frame for regs that > don't need saving. > (rs6000_emit_epilogue): Likewise. > + int count; > + > + for (count = 0, i = info->first_gp_reg_save; i < 32; i++) > + if (save_reg_p (i)) > + count += 1; int count = 0; for (i = info->first_gp_reg_save; i < 32; i++) if (save_reg_p (i)) count++; > @@ -25681,9 +25683,15 @@ rs6000_frame_related (rtx_insn *insn, rtx reg, HOST_WIDE_INT val, > register save functions, or store multiple, then omit > eh_frame info for any user-defined global regs. If > eh_frame info is supplied, frame unwinding will > - restore a user reg. */ > + restore a user reg. Also omit eh_frame info for any > + reg we don't need to save, as that bloats eh_frame > + and can cause problems with shrink wrapping. Saves > + of r0 are actually saving LR, so don't omit those. */ > if (!REG_P (SET_SRC (set)) > - || !fixed_reg_p (REGNO (SET_SRC (set)))) > + || REGNO (SET_SRC (set)) == 0 > + || REGNO (SET_SRC (set)) == CR2_REGNO > + || (!fixed_reg_p (REGNO (SET_SRC (set))) > + && save_reg_p (REGNO (SET_SRC (set))))) > RTX_FRAME_RELATED_P (set) = 1; > } > RTX_FRAME_RELATED_P (insn) = 1; That "0" and "CR2_REGNO" is non-obvious, it needs a big fat comment (it's not clear that just CR2_REGNO is correct, either). Oh, and please factor out "REGNO (SET_SRC (set))". The rest looks good, but please repost. Segher ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RS6000] PR 80938, Don't emit eh_frame for regs that don't need saving 2017-08-15 10:40 ` Segher Boessenkool @ 2017-08-16 0:13 ` Alan Modra 2017-08-17 2:23 ` Segher Boessenkool 0 siblings, 1 reply; 11+ messages in thread From: Alan Modra @ 2017-08-16 0:13 UTC (permalink / raw) To: Segher Boessenkool; +Cc: gcc-patches, dje.gcc Repost with requested changes. I've extracted the logic that omits frame saves from rs6000_frame_related to a new function, because it's easier to document that way. The logic has been simplified a little too: fixed_reg_p doesn't need to be called there. PR target/80938 * config/rs6000/rs6000.c (rs6000_savres_strategy): Revert 2017-08-09. Don't use store multiple if only one reg needs saving. (interesting_frame_related_regno): New function. (rs6000_frame_related): Don't emit eh_frame for regs that don't need saving. (rs6000_emit_epilogue): Likewise. diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index f9aa13b..178632e 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -24445,20 +24445,36 @@ rs6000_savres_strategy (rs6000_stack_t *info, && flag_shrink_wrap_separate && optimize_function_for_speed_p (cfun))) { - /* Prefer store multiple for saves over out-of-line routines, - since the store-multiple instruction will always be smaller. */ - strategy |= SAVE_INLINE_GPRS | SAVE_MULTIPLE; - - /* The situation is more complicated with load multiple. We'd - prefer to use the out-of-line routines for restores, since the - "exit" out-of-line routines can handle the restore of LR and the - frame teardown. However if doesn't make sense to use the - out-of-line routine if that is the only reason we'd need to save - LR, and we can't use the "exit" out-of-line gpr restore if we - have saved some fprs; In those cases it is advantageous to use - load multiple when available. */ - if (info->first_fp_reg_save != 64 || !lr_save_p) - strategy |= REST_INLINE_GPRS | REST_MULTIPLE; + int count = 0; + for (int i = info->first_gp_reg_save; i < 32; i++) + if (save_reg_p (i)) + count++; + + if (count <= 1) + /* Don't use store multiple if only one reg needs to be + saved. This can occur for example when the ABI_V4 pic reg + (r30) needs to be saved to make calls, but r31 is not + used. */ + strategy |= SAVE_INLINE_GPRS | REST_INLINE_GPRS; + else + { + /* Prefer store multiple for saves over out-of-line + routines, since the store-multiple instruction will + always be smaller. */ + strategy |= SAVE_INLINE_GPRS | SAVE_MULTIPLE; + + /* The situation is more complicated with load multiple. + We'd prefer to use the out-of-line routines for restores, + since the "exit" out-of-line routines can handle the + restore of LR and the frame teardown. However if doesn't + make sense to use the out-of-line routine if that is the + only reason we'd need to save LR, and we can't use the + "exit" out-of-line gpr restore if we have saved some + fprs; In those cases it is advantageous to use load + multiple when available. */ + if (info->first_fp_reg_save != 64 || !lr_save_p) + strategy |= REST_INLINE_GPRS | REST_MULTIPLE; + } } /* Using the "exit" out-of-line routine does not improve code size @@ -24467,21 +24483,6 @@ rs6000_savres_strategy (rs6000_stack_t *info, else if (!lr_save_p && info->first_gp_reg_save > 29) strategy |= SAVE_INLINE_GPRS | REST_INLINE_GPRS; - /* We can only use save multiple if we need to save all the registers from - first_gp_reg_save. Otherwise, the CFI gets messed up (we save some - register we do not restore). */ - if (strategy & SAVE_MULTIPLE) - { - int i; - - for (i = info->first_gp_reg_save; i < 32; i++) - if (fixed_reg_p (i) || !save_reg_p (i)) - { - strategy &= ~SAVE_MULTIPLE; - break; - } - } - /* Don't ever restore fixed regs. */ if ((strategy & (REST_INLINE_GPRS | REST_MULTIPLE)) != REST_INLINE_GPRS) for (int i = info->first_gp_reg_save; i < 32; i++) @@ -25654,6 +25655,32 @@ output_probe_stack_range (rtx reg1, rtx reg2) return ""; } +/* This function is called when rs6000_frame_related is processing + SETs within a PARALLEL, and returns whether the REGNO save ought to + be marked RTX_FRAME_RELATED_P. The PARALLELs involved are those + for out-of-line register save functions, store multiple, and the + Darwin world_save. They may contain registers that don't really + need saving. */ + +static bool +interesting_frame_related_regno (unsigned int regno) +{ + /* Saves apparently of r0 are actually saving LR. */ + if (regno == 0) + return true; + /* If we see CR2 then we are here on a Darwin world save. Saves of + CR2 signify the whole CR is being saved. */ + if (regno == CR2_REGNO) + return true; + /* Omit frame info for any user-defined global regs. If frame info + is supplied for them, frame unwinding will restore a user reg. + Also omit frame info for any reg we don't need to save, as that + bloats eh_frame and can cause problems with shrink wrapping. + Since global regs won't be seen as needing to be saved, both of + these conditions are covered by save_reg_p. */ + return save_reg_p (regno); +} + /* Add to 'insn' a note which is PATTERN (INSN) but with REG replaced with (plus:P (reg 1) VAL), and with REG2 replaced with REPL2 if REG2 is not NULL. It would be nice if dwarf2out_frame_debug_expr could @@ -25688,13 +25715,8 @@ rs6000_frame_related (rtx_insn *insn, rtx reg, HOST_WIDE_INT val, { rtx set = XVECEXP (pat, 0, i); - /* If this PARALLEL has been emitted for out-of-line - register save functions, or store multiple, then omit - eh_frame info for any user-defined global regs. If - eh_frame info is supplied, frame unwinding will - restore a user reg. */ if (!REG_P (SET_SRC (set)) - || !fixed_reg_p (REGNO (SET_SRC (set)))) + || interesting_frame_related_regno (REGNO (SET_SRC (set)))) RTX_FRAME_RELATED_P (set) = 1; } RTX_FRAME_RELATED_P (insn) = 1; @@ -25731,9 +25753,8 @@ rs6000_frame_related (rtx_insn *insn, rtx reg, HOST_WIDE_INT val, set = simplify_replace_rtx (set, reg2, repl2); XVECEXP (pat, 0, i) = set; - /* Omit eh_frame info for any user-defined global regs. */ if (!REG_P (SET_SRC (set)) - || !fixed_reg_p (REGNO (SET_SRC (set)))) + || interesting_frame_related_regno (REGNO (SET_SRC (set)))) RTX_FRAME_RELATED_P (set) = 1; } } @@ -27956,7 +27977,8 @@ rs6000_emit_epilogue (int sibcall) RTVEC_ELT (p, j++) = gen_frame_load (reg, frame_reg_rtx, info->gp_save_offset + reg_size * i); - if (flag_shrink_wrap) + if (flag_shrink_wrap + && save_reg_p (info->first_gp_reg_save + i)) cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores); } for (i = 0; info->first_altivec_reg_save + i <= LAST_ALTIVEC_REGNO; i++) @@ -27965,7 +27987,8 @@ rs6000_emit_epilogue (int sibcall) RTVEC_ELT (p, j++) = gen_frame_load (reg, frame_reg_rtx, info->altivec_save_offset + 16 * i); - if (flag_shrink_wrap) + if (flag_shrink_wrap + && save_reg_p (info->first_altivec_reg_save + i)) cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores); } for (i = 0; info->first_fp_reg_save + i <= 63; i++) @@ -27975,7 +27998,8 @@ rs6000_emit_epilogue (int sibcall) info->first_fp_reg_save + i); RTVEC_ELT (p, j++) = gen_frame_load (reg, frame_reg_rtx, info->fp_save_offset + 8 * i); - if (flag_shrink_wrap) + if (flag_shrink_wrap + && save_reg_p (info->first_fp_reg_save + i)) cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores); } RTVEC_ELT (p, j++) @@ -28096,7 +28120,8 @@ rs6000_emit_epilogue (int sibcall) && (flag_shrink_wrap || (offset_below_red_zone_p (info->altivec_save_offset - + 16 * (i - info->first_altivec_reg_save))))) + + 16 * (i - info->first_altivec_reg_save)))) + && save_reg_p (i)) { rtx reg = gen_rtx_REG (V4SImode, i); cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores); @@ -28308,7 +28333,8 @@ rs6000_emit_epilogue (int sibcall) for (i = info->first_altivec_reg_save; i <= LAST_ALTIVEC_REGNO; ++i) if (((strategy & REST_INLINE_VRS) == 0 || (info->vrsave_mask & ALTIVEC_REG_BIT (i)) != 0) - && (DEFAULT_ABI == ABI_V4 || flag_shrink_wrap)) + && (DEFAULT_ABI == ABI_V4 || flag_shrink_wrap) + && save_reg_p (i)) { rtx reg = gen_rtx_REG (V4SImode, i); cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores); @@ -28654,7 +28680,8 @@ rs6000_emit_epilogue (int sibcall) RTVEC_ELT (p, elt++) = gen_frame_load (reg, sp_reg_rtx, info->fp_save_offset + 8 * i); - if (flag_shrink_wrap) + if (flag_shrink_wrap + && save_reg_p (info->first_fp_reg_save + i)) cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores); } -- Alan Modra Australia Development Lab, IBM ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RS6000] PR 80938, Don't emit eh_frame for regs that don't need saving 2017-08-16 0:13 ` Alan Modra @ 2017-08-17 2:23 ` Segher Boessenkool 2017-08-17 5:28 ` Alan Modra 0 siblings, 1 reply; 11+ messages in thread From: Segher Boessenkool @ 2017-08-17 2:23 UTC (permalink / raw) To: Alan Modra; +Cc: gcc-patches, dje.gcc Hi! On Wed, Aug 16, 2017 at 08:05:04AM +0930, Alan Modra wrote: > Repost with requested changes. I've extracted the logic that omits > frame saves from rs6000_frame_related to a new function, because it's > easier to document that way. The logic has been simplified a little > too: fixed_reg_p doesn't need to be called there. > > PR target/80938 > * config/rs6000/rs6000.c (rs6000_savres_strategy): Revert 2017-08-09. > Don't use store multiple if only one reg needs saving. > (interesting_frame_related_regno): New function. > (rs6000_frame_related): Don't emit eh_frame for regs that > don't need saving. > (rs6000_emit_epilogue): Likewise. It's not just eh_frame, it's all call frame information. > +/* This function is called when rs6000_frame_related is processing > + SETs within a PARALLEL, and returns whether the REGNO save ought to > + be marked RTX_FRAME_RELATED_P. The PARALLELs involved are those > + for out-of-line register save functions, store multiple, and the > + Darwin world_save. They may contain registers that don't really > + need saving. */ > + > +static bool > +interesting_frame_related_regno (unsigned int regno) > +{ > + /* Saves apparently of r0 are actually saving LR. */ > + if (regno == 0) > + return true; > + /* If we see CR2 then we are here on a Darwin world save. Saves of > + CR2 signify the whole CR is being saved. */ > + if (regno == CR2_REGNO) > + return true; > + /* Omit frame info for any user-defined global regs. If frame info > + is supplied for them, frame unwinding will restore a user reg. > + Also omit frame info for any reg we don't need to save, as that > + bloats eh_frame and can cause problems with shrink wrapping. > + Since global regs won't be seen as needing to be saved, both of > + these conditions are covered by save_reg_p. */ > + return save_reg_p (regno); > +} The function name isn't so great, doesn't say what it does at all. Not that I can think of anything better. Maybe whatever is creating those instructions should set RTX_FRAME_RELATED_P by itself? Not sure if that is nicer. Both this CR2 and R0 handling are pretty nasty hacks. Could you add a comment saying that? Okay for trunk with those improvements (eh_frame and hack comment). Thanks! Segher ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RS6000] PR 80938, Don't emit eh_frame for regs that don't need saving 2017-08-17 2:23 ` Segher Boessenkool @ 2017-08-17 5:28 ` Alan Modra 2017-08-17 5:50 ` Segher Boessenkool 0 siblings, 1 reply; 11+ messages in thread From: Alan Modra @ 2017-08-17 5:28 UTC (permalink / raw) To: Segher Boessenkool; +Cc: gcc-patches, dje.gcc On Wed, Aug 16, 2017 at 06:23:13PM -0500, Segher Boessenkool wrote: > Hi! > > On Wed, Aug 16, 2017 at 08:05:04AM +0930, Alan Modra wrote: > > Repost with requested changes. I've extracted the logic that omits > > frame saves from rs6000_frame_related to a new function, because it's > > easier to document that way. The logic has been simplified a little > > too: fixed_reg_p doesn't need to be called there. > > > > PR target/80938 > > * config/rs6000/rs6000.c (rs6000_savres_strategy): Revert 2017-08-09. > > Don't use store multiple if only one reg needs saving. > > (interesting_frame_related_regno): New function. > > (rs6000_frame_related): Don't emit eh_frame for regs that > > don't need saving. > > (rs6000_emit_epilogue): Likewise. > > It's not just eh_frame, it's all call frame information. Sure, I meant to fix that. > > +/* This function is called when rs6000_frame_related is processing > > + SETs within a PARALLEL, and returns whether the REGNO save ought to > > + be marked RTX_FRAME_RELATED_P. The PARALLELs involved are those > > + for out-of-line register save functions, store multiple, and the > > + Darwin world_save. They may contain registers that don't really > > + need saving. */ > > + > > +static bool > > +interesting_frame_related_regno (unsigned int regno) > > +{ > > + /* Saves apparently of r0 are actually saving LR. */ > > + if (regno == 0) > > + return true; > > + /* If we see CR2 then we are here on a Darwin world save. Saves of > > + CR2 signify the whole CR is being saved. */ > > + if (regno == CR2_REGNO) > > + return true; > > + /* Omit frame info for any user-defined global regs. If frame info > > + is supplied for them, frame unwinding will restore a user reg. > > + Also omit frame info for any reg we don't need to save, as that > > + bloats eh_frame and can cause problems with shrink wrapping. > > + Since global regs won't be seen as needing to be saved, both of > > + these conditions are covered by save_reg_p. */ > > + return save_reg_p (regno); > > +} > > The function name isn't so great, doesn't say what it does at all. Not > that I can think of anything better. > > Maybe whatever is creating those instructions should set RTX_FRAME_RELATED_P > by itself? Not sure if that is nicer. > > Both this CR2 and R0 handling are pretty nasty hacks. Could you add a > comment saying that? I wouldn't say the R0 handling is a nasty hack at all. You can't save LR directly, storing to the stack must go via a gpr. I'm 100% sure you know that, and so would anyone else working on powerpc gcc support. It so happens that we use r0 in every case we hit this code. *That* fact is commented. I don't really know what else you want. Hmm, maybe I'm just too close to this code. I'll go with expounding some of the things known, as follows. /* Saves apparently of r0 are actually saving LR. It doesn't make sense to substitute the regno here to test save_reg_p (LR_REGNO). We *know* LR needs saving, and dwarf2cfi.c is able to deduce that (set (mem) (r0)) is saving LR from a prior (set (r0) (lr)) marked as frame related. */ (Incidentally, the dwarf2cfi.c behaviour is described by In addition, if a register has previously been saved to a different register, Yup, great comment that one! Dates back to 2004, commit 60ea93bb72.) The CR2 thing is a long-standing convention for frame info about CR, a wart fixed by ELFv2. See elsewhere /* CR register traditionally saved as CR2. */ and /* In other ABIs, by convention, we use a single CR regnum to represent the fact that all call-saved CR fields are saved. We use CR2_REGNO to be compatible with gcc-2.95 on Linux. */ I'l go with: /* If we see CR2 then we are here on a Darwin world save. Saves of CR2 signify the whole CR is being saved. This is a long-standing ABI wart fixed by ELFv2. As for r0/lr there is no need to check that CR needs to be saved. */ > Okay for trunk with those improvements (eh_frame and hack comment). > Thanks! > > > Segher -- Alan Modra Australia Development Lab, IBM ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RS6000] PR 80938, Don't emit eh_frame for regs that don't need saving 2017-08-17 5:28 ` Alan Modra @ 2017-08-17 5:50 ` Segher Boessenkool 0 siblings, 0 replies; 11+ messages in thread From: Segher Boessenkool @ 2017-08-17 5:50 UTC (permalink / raw) To: Alan Modra; +Cc: gcc-patches, dje.gcc On Thu, Aug 17, 2017 at 11:25:27AM +0930, Alan Modra wrote: > On Wed, Aug 16, 2017 at 06:23:13PM -0500, Segher Boessenkool wrote: > > Maybe whatever is creating those instructions should set RTX_FRAME_RELATED_P > > by itself? Not sure if that is nicer. > > > > Both this CR2 and R0 handling are pretty nasty hacks. Could you add a > > comment saying that? > > I wouldn't say the R0 handling is a nasty hack at all. You can't save > LR directly, storing to the stack must go via a gpr. I'm 100% sure > you know that, and so would anyone else working on powerpc gcc > support. It so happens that we use r0 in every case we hit this > code. *That* fact is commented. I don't really know what else you > want. But R0 isn't necessarily used for LR here. That is true currently in all cases I can see, but it can be used for other things. The ABI doesn't force things here. The separate shrink-wrapping code does use R0 for other things, but it manually sets the CFI notes anyway, so no problem there. Maybe I should just stop worrying. > (Incidentally, the dwarf2cfi.c behaviour is described by > > In addition, if a register has previously been saved to a different > register, > > Yup, great comment that one! Dates back to 2004, commit 60ea93bb72.) Perhaps inspired by the REG_CFA_REGISTER name for the RTL note ;-) > The CR2 thing is a long-standing convention for frame info about CR, a > wart fixed by ELFv2. See elsewhere That ELFv2 fix is why I still haven't submitted the separate shrink-wrapping for CR fields code -- it complicates things a lot :-/ > I'l go with: > > /* If we see CR2 then we are here on a Darwin world save. Saves of > CR2 signify the whole CR is being saved. This is a long-standing > ABI wart fixed by ELFv2. As for r0/lr there is no need to check > that CR needs to be saved. */ That is fine, thanks again. Segher ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2017-08-17 2:23 UTC | newest] Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-08-09 21:16 [PATCH] rs6000: Use SAVE_MULTIPLE only if we restore what it saves (PR80938) Segher Boessenkool 2017-08-10 4:18 ` Alan Modra 2017-08-10 4:44 ` Segher Boessenkool 2017-08-10 5:26 ` Alan Modra 2017-08-10 13:46 ` Segher Boessenkool 2017-08-11 4:58 ` [RS6000] PR 80938, Don't emit eh_frame for regs that don't need saving Alan Modra 2017-08-15 10:40 ` Segher Boessenkool 2017-08-16 0:13 ` Alan Modra 2017-08-17 2:23 ` Segher Boessenkool 2017-08-17 5:28 ` Alan Modra 2017-08-17 5:50 ` Segher Boessenkool
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).