public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] rs6000: Use SAVE_MULTIPLE only if we restore what it saves (PR80938)
@ 2017-08-09 21:16 Segher Boessenkool
  2017-08-10  4:18 ` Alan Modra
  0 siblings, 1 reply; 11+ messages in thread
From: Segher Boessenkool @ 2017-08-09 21:16 UTC (permalink / raw)
  To: gcc-patches; +Cc: dje.gcc, Segher Boessenkool

We can have SAVE_MULTIPLE while we do not have REST_MULTIPLE.  If the
inline restore does not restore all registers, the CFI for the save
and restore can conflict if things are shrink-wrapped.

We could restore all registers that are saved (not ideal), or emit
the CFI notes to say we did (which will work fine, but is also not
so great); instead, let's not save the registers that are unused.

Tested on powerpc64-linux {-m32,-m64}; committing to trunk.


Segher


2017-08-09  Segher Boessenkool  <segher@kernel.crashing.org>

	PR target/80938
	* config/rs6000/rs6000.c (rs6000_savres_strategy): Don't use
	SAVE_MULTIPLE if not all the registers that saves, should be saved.

---
 gcc/config/rs6000/rs6000.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 0f0b1ff..e8cdd25 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -24437,6 +24437,21 @@ rs6000_savres_strategy (rs6000_stack_t *info,
   else if (!lr_save_p && info->first_gp_reg_save > 29)
     strategy |= SAVE_INLINE_GPRS | REST_INLINE_GPRS;
 
+  /* We can only use save multiple if we need to save all the registers from
+     first_gp_reg_save.  Otherwise, the CFI gets messed up (we save some
+     register we do not restore).  */
+  if (strategy & SAVE_MULTIPLE)
+    {
+      int i;
+
+      for (i = info->first_gp_reg_save; i < 32; i++)
+	if (fixed_reg_p (i) || !save_reg_p (i))
+	  {
+	    strategy &= ~SAVE_MULTIPLE;
+	    break;
+	  }
+    }
+
   /* We can only use load multiple or the out-of-line routines to
      restore gprs if we've saved all the registers from
      first_gp_reg_save.  Otherwise, we risk loading garbage.
-- 
1.9.3

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] rs6000: Use SAVE_MULTIPLE only if we restore what it saves (PR80938)
  2017-08-09 21:16 [PATCH] rs6000: Use SAVE_MULTIPLE only if we restore what it saves (PR80938) Segher Boessenkool
@ 2017-08-10  4:18 ` Alan Modra
  2017-08-10  4:44   ` Segher Boessenkool
  0 siblings, 1 reply; 11+ messages in thread
From: Alan Modra @ 2017-08-10  4:18 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, dje.gcc

On Wed, Aug 09, 2017 at 09:06:18PM +0000, Segher Boessenkool wrote:
> We can have SAVE_MULTIPLE while we do not have REST_MULTIPLE.  If the
> inline restore does not restore all registers, the CFI for the save
> and restore can conflict if things are shrink-wrapped.
> 
> We could restore all registers that are saved (not ideal),

No, we can't do that.  A -ffixed-reg register should not be restored
even if it is saved, as such regs should never be written.  For
example, a fixed reg might be counting interrupts.  If you restore it,
you may lose count.  Another example is a fixed reg used as some sort
of context pointer.  If you restore in a function that sets a new
context you're obviously breaking user code.

> or emit
> the CFI notes to say we did (which will work fine,

No, you can't do that either.  Unwinding might then restore a
-ffixed-reg register.

> but is also not
> so great); instead, let's not save the registers that are unused.

Ick, looks like papering over the real problem to me, and will no
doubt cause -Os size regressions.  Also, if SAVE_MULTIPLE causes this
bad interaction with shrink-wrap, does using the out-of-line register
save functions cause the same problem?

I haven't looked yet, but at a guess I suspect the correct solution is
to modify cfa_restores in rs6000_emit_epilogue.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] rs6000: Use SAVE_MULTIPLE only if we restore what it saves (PR80938)
  2017-08-10  4:18 ` Alan Modra
@ 2017-08-10  4:44   ` Segher Boessenkool
  2017-08-10  5:26     ` Alan Modra
  0 siblings, 1 reply; 11+ messages in thread
From: Segher Boessenkool @ 2017-08-10  4:44 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches, dje.gcc

On Thu, Aug 10, 2017 at 10:33:05AM +0930, Alan Modra wrote:
> On Wed, Aug 09, 2017 at 09:06:18PM +0000, Segher Boessenkool wrote:
> > We can have SAVE_MULTIPLE while we do not have REST_MULTIPLE.  If the
> > inline restore does not restore all registers, the CFI for the save
> > and restore can conflict if things are shrink-wrapped.
> > 
> > We could restore all registers that are saved (not ideal),
> 
> No, we can't do that.  A -ffixed-reg register should not be restored
> even if it is saved, as such regs should never be written.  For
> example, a fixed reg might be counting interrupts.  If you restore it,
> you may lose count.  Another example is a fixed reg used as some sort
> of context pointer.  If you restore in a function that sets a new
> context you're obviously breaking user code.

Yes, sorry for glossing over this.  We need to handle fixed registers
specially in most other prologue/epilogue things, too (and we hopefully
do everywhere it is needed).

> > or emit
> > the CFI notes to say we did (which will work fine,
> 
> No, you can't do that either.  Unwinding might then restore a
> -ffixed-reg register.

Yep.

> > but is also not
> > so great); instead, let's not save the registers that are unused.
> 
> Ick, looks like papering over the real problem to me, and will no
> doubt cause -Os size regressions.

I think it is very directly solving the real problem.  It isn't likely
to cause size regressions (look how long it took for this PR to show
up, so this cannot happen often).

This only happens if r30 (the PIC reg) is used but r31 isn't; which means
that a) there are no other registers to save, or b) the function is marked
as needing a hard frame pointer but eventually doesn't need one.

(RA picks the registers r31, r30, ... in sequence).

In the case in the PR, this patch replaces one insn by one (cheaper)
insn.

> Also, if SAVE_MULTIPLE causes this
> bad interaction with shrink-wrap, does using the out-of-line register
> save functions cause the same problem?

I do not know.  Not with the code in the PR (restoring only one or two
registers isn't done out-of-line, and it's a sibcall exit as well).


Segher

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] rs6000: Use SAVE_MULTIPLE only if we restore what it saves (PR80938)
  2017-08-10  4:44   ` Segher Boessenkool
@ 2017-08-10  5:26     ` Alan Modra
  2017-08-10 13:46       ` Segher Boessenkool
  0 siblings, 1 reply; 11+ messages in thread
From: Alan Modra @ 2017-08-10  5:26 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, dje.gcc

[-- Attachment #1: Type: text/plain, Size: 2566 bytes --]

On Wed, Aug 09, 2017 at 09:28:22PM -0500, Segher Boessenkool wrote:
> On Thu, Aug 10, 2017 at 10:33:05AM +0930, Alan Modra wrote:
> > On Wed, Aug 09, 2017 at 09:06:18PM +0000, Segher Boessenkool wrote:
> > > We can have SAVE_MULTIPLE while we do not have REST_MULTIPLE.  If the
> > > inline restore does not restore all registers, the CFI for the save
> > > and restore can conflict if things are shrink-wrapped.
> > > 
> > > We could restore all registers that are saved (not ideal),
> > 
> > No, we can't do that.  A -ffixed-reg register should not be restored
> > even if it is saved, as such regs should never be written.  For
> > example, a fixed reg might be counting interrupts.  If you restore it,
> > you may lose count.  Another example is a fixed reg used as some sort
> > of context pointer.  If you restore in a function that sets a new
> > context you're obviously breaking user code.
> 
> Yes, sorry for glossing over this.  We need to handle fixed registers
> specially in most other prologue/epilogue things, too (and we hopefully
> do everywhere it is needed).

We don't.  :-(  I have a fix and will post after testing.

> > > or emit
> > > the CFI notes to say we did (which will work fine,
> > 
> > No, you can't do that either.  Unwinding might then restore a
> > -ffixed-reg register.
> 
> Yep.
> 
> > > but is also not
> > > so great); instead, let's not save the registers that are unused.
> > 
> > Ick, looks like papering over the real problem to me, and will no
> > doubt cause -Os size regressions.
> 
> I think it is very directly solving the real problem.  It isn't likely
> to cause size regressions (look how long it took for this PR to show
> up, so this cannot happen often).
> 
> This only happens if r30 (the PIC reg) is used but r31 isn't; which means
> that a) there are no other registers to save, or b) the function is marked
> as needing a hard frame pointer but eventually doesn't need one.
> 
> (RA picks the registers r31, r30, ... in sequence).
> 
> In the case in the PR, this patch replaces one insn by one (cheaper)
> insn.

And in other cases your patch will prevent stmw when it should be
used.  Testcase attached.  It shows the wrong use of lmw too.

> > Also, if SAVE_MULTIPLE causes this
> > bad interaction with shrink-wrap, does using the out-of-line register
> > save functions cause the same problem?
> 
> I do not know.  Not with the code in the PR (restoring only one or two
> registers isn't done out-of-line, and it's a sibcall exit as well).
> 
> 
> Segher

-- 
Alan Modra
Australia Development Lab, IBM

[-- Attachment #2: savres.c --]
[-- Type: text/x-csrc, Size: 947 bytes --]

#ifdef USER_R26
register long r26 __asm__ ("r26");
#endif

int f1 (void)
{
  register long r25 __asm__ ("r25");
  register long r27 __asm__ ("r27");
  register long r28 __asm__ ("r28");
  register long r29 __asm__ ("r29");
  register long r31 __asm__ ("r31");
  __asm__ __volatile__
    ("#uses %0 %1 %2 %3 %4"
     : "=r" (r25), "=r" (r27), "=r" (r28), "=r" (r29), "=r" (r31));
  return 0;
}

void f2 (char *s)
{
  int col = 0;
  char c;

  void f3 (char c)
  {
    extern int f4 (char);
    register long r25 __asm__ ("r25");
    register long r27 __asm__ ("r27");
    register long r28 __asm__ ("r28");
    register long r29 __asm__ ("r29");
    register long r31 __asm__ ("r31");
    if (c == '\t')
      do f3 (' '); while (col % 8 != 0);
    else
      {
	__asm__ __volatile__
	  ("#uses %0 %1 %2 %3 %4"
	   : "=r" (r25), "=r" (r27), "=r" (r28), "=r" (r29), "=r" (r31));
	f4 (c);
	col++;
      }
  }

  while ((c = *s++) != 0)
    f3 (c);
}

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] rs6000: Use SAVE_MULTIPLE only if we restore what it saves (PR80938)
  2017-08-10  5:26     ` Alan Modra
@ 2017-08-10 13:46       ` Segher Boessenkool
  2017-08-11  4:58         ` [RS6000] PR 80938, Don't emit eh_frame for regs that don't need saving Alan Modra
  0 siblings, 1 reply; 11+ messages in thread
From: Segher Boessenkool @ 2017-08-10 13:46 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches, dje.gcc

On Thu, Aug 10, 2017 at 02:17:40PM +0930, Alan Modra wrote:
> > > Ick, looks like papering over the real problem to me, and will no
> > > doubt cause -Os size regressions.
> > 
> > I think it is very directly solving the real problem.  It isn't likely
> > to cause size regressions (look how long it took for this PR to show
> > up, so this cannot happen often).
> > 
> > This only happens if r30 (the PIC reg) is used but r31 isn't; which means
> > that a) there are no other registers to save, or b) the function is marked
> > as needing a hard frame pointer but eventually doesn't need one.
> > 
> > (RA picks the registers r31, r30, ... in sequence).
> > 
> > In the case in the PR, this patch replaces one insn by one (cheaper)
> > insn.
> 
> And in other cases your patch will prevent stmw when it should be
> used.  Testcase attached.  It shows the wrong use of lmw too.

I dunno...  If you change r25 to r14 in that testcase it will use stmw 14
with my patch reverted.  Not very reasonable imho (but then again, people
using register asm like this get what they deserve anyway).


Segher

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [RS6000] PR 80938, Don't emit eh_frame for regs that don't need saving
  2017-08-10 13:46       ` Segher Boessenkool
@ 2017-08-11  4:58         ` Alan Modra
  2017-08-15 10:40           ` Segher Boessenkool
  0 siblings, 1 reply; 11+ messages in thread
From: Alan Modra @ 2017-08-11  4:58 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, dje.gcc

It is possible when using out-of-line register saves or store multiple
to save some registers unnecessarily, for example one reg in the block
saved might be unused.  We don't need to emit eh_frame info for those
registers as that just bloats the eh_frame info, and also can result
in an ICE when shrink-wrap gives multiple paths through the function
saving different sets of registers.  All the join points need to have
identical eh_frame register save state.

This patch reverts the previous fix for PR80939 "Use SAVE_MULTIPLE
only if we restore what it saves (PR80938)" and instead fixes the PR
by correcting the eh_frame info.  The change to rs6000_savres_strategy
is an optimization, but note that it hides the underlying problem in
the PR testcase.

Bootstrapped and regression tested powerpc64-linux (-m32 too) and
powerpc64le-linux, with
https://gcc.gnu.org/ml/gcc-patches/2017-08/msg00774.html and
https://gcc.gnu.org/ml/gcc-patches/2017-08/msg00775.html applied.
OK to apply?

	PR target/80938
	* config/rs6000/rs6000.c (rs6000_savres_strategy): Revert 2017-08-09.
	Don't use store multiple if only one reg needs saving.
	(rs6000_frame_related): Don't emit eh_frame for regs that
	don't need saving.
	(rs6000_emit_epilogue): Likewise.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 2070648..abc55bd 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -24432,20 +24432,37 @@ rs6000_savres_strategy (rs6000_stack_t *info,
 	   && flag_shrink_wrap_separate
 	   && optimize_function_for_speed_p (cfun)))
     {
-      /* Prefer store multiple for saves over out-of-line routines,
-	 since the store-multiple instruction will always be smaller.  */
-      strategy |= SAVE_INLINE_GPRS | SAVE_MULTIPLE;
-
-      /* The situation is more complicated with load multiple.  We'd
-	 prefer to use the out-of-line routines for restores, since the
-	 "exit" out-of-line routines can handle the restore of LR and the
-	 frame teardown.  However if doesn't make sense to use the
-	 out-of-line routine if that is the only reason we'd need to save
-	 LR, and we can't use the "exit" out-of-line gpr restore if we
-	 have saved some fprs; In those cases it is advantageous to use
-	 load multiple when available.  */
-      if (info->first_fp_reg_save != 64 || !lr_save_p)
-	strategy |= REST_INLINE_GPRS | REST_MULTIPLE;
+      int count;
+
+      for (count = 0, i = info->first_gp_reg_save; i < 32; i++)
+	if (save_reg_p (i))
+	  count += 1;
+
+      if (count <= 1)
+	/* Don't use store multiple if only one reg needs to be
+	   saved.  This can occur for example when the ABI_V4 pic reg
+	   (r30) needs to be saved to make calls, but r31 is not
+	   used.  */
+	strategy |= SAVE_INLINE_GPRS | REST_INLINE_GPRS;
+      else
+	{
+	  /* Prefer store multiple for saves over out-of-line
+	     routines, since the store-multiple instruction will
+	     always be smaller.  */
+	  strategy |= SAVE_INLINE_GPRS | SAVE_MULTIPLE;
+
+	  /* The situation is more complicated with load multiple.
+	     We'd prefer to use the out-of-line routines for restores,
+	     since the "exit" out-of-line routines can handle the
+	     restore of LR and the frame teardown.  However if doesn't
+	     make sense to use the out-of-line routine if that is the
+	     only reason we'd need to save LR, and we can't use the
+	     "exit" out-of-line gpr restore if we have saved some
+	     fprs; In those cases it is advantageous to use load
+	     multiple when available.  */
+	  if (info->first_fp_reg_save != 64 || !lr_save_p)
+	    strategy |= REST_INLINE_GPRS | REST_MULTIPLE;
+	}
     }
 
   /* Using the "exit" out-of-line routine does not improve code size
@@ -24454,21 +24471,6 @@ rs6000_savres_strategy (rs6000_stack_t *info,
   else if (!lr_save_p && info->first_gp_reg_save > 29)
     strategy |= SAVE_INLINE_GPRS | REST_INLINE_GPRS;
 
-  /* We can only use save multiple if we need to save all the registers from
-     first_gp_reg_save.  Otherwise, the CFI gets messed up (we save some
-     register we do not restore).  */
-  if (strategy & SAVE_MULTIPLE)
-    {
-      int i;
-
-      for (i = info->first_gp_reg_save; i < 32; i++)
-	if (fixed_reg_p (i) || !save_reg_p (i))
-	  {
-	    strategy &= ~SAVE_MULTIPLE;
-	    break;
-	  }
-    }
-
   /* Don't ever restore fixed regs.  */
   if ((strategy & (REST_INLINE_GPRS | REST_MULTIPLE)) != REST_INLINE_GPRS)
     for (i = info->first_gp_reg_save; i < 32; i++)
@@ -25681,9 +25683,15 @@ rs6000_frame_related (rtx_insn *insn, rtx reg, HOST_WIDE_INT val,
 		 register save functions, or store multiple, then omit
 		 eh_frame info for any user-defined global regs.  If
 		 eh_frame info is supplied, frame unwinding will
-		 restore a user reg.  */
+		 restore a user reg.  Also omit eh_frame info for any
+		 reg we don't need to save, as that bloats eh_frame
+		 and can cause problems with shrink wrapping.  Saves
+		 of r0 are actually saving LR, so don't omit those.   */
 	      if (!REG_P (SET_SRC (set))
-		  || !fixed_reg_p (REGNO (SET_SRC (set))))
+		  || REGNO (SET_SRC (set)) == 0
+		  || REGNO (SET_SRC (set)) == CR2_REGNO
+		  || (!fixed_reg_p (REGNO (SET_SRC (set)))
+		      && save_reg_p (REGNO (SET_SRC (set)))))
 		RTX_FRAME_RELATED_P (set) = 1;
 	    }
       RTX_FRAME_RELATED_P (insn) = 1;
@@ -25720,9 +25728,13 @@ rs6000_frame_related (rtx_insn *insn, rtx reg, HOST_WIDE_INT val,
 	      set = simplify_replace_rtx (set, reg2, repl2);
 	    XVECEXP (pat, 0, i) = set;
 
-	    /* Omit eh_frame info for any user-defined global regs.  */
+	    /* Omit eh_frame info for any user-defined global regs or
+	       regs that don't need to be saved.  */
 	    if (!REG_P (SET_SRC (set))
-		|| !fixed_reg_p (REGNO (SET_SRC (set))))
+		|| REGNO (SET_SRC (set)) == 0
+		|| REGNO (SET_SRC (set)) == CR2_REGNO
+		|| (!fixed_reg_p (REGNO (SET_SRC (set)))
+		    && save_reg_p (REGNO (SET_SRC (set)))))
 	      RTX_FRAME_RELATED_P (set) = 1;
 	  }
     }
@@ -27945,7 +27957,8 @@ rs6000_emit_epilogue (int sibcall)
 	  RTVEC_ELT (p, j++)
 	    = gen_frame_load (reg,
 			      frame_reg_rtx, info->gp_save_offset + reg_size * i);
-	  if (flag_shrink_wrap)
+	  if (flag_shrink_wrap
+	      && save_reg_p (info->first_gp_reg_save + i))
 	    cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores);
 	}
       for (i = 0; info->first_altivec_reg_save + i <= LAST_ALTIVEC_REGNO; i++)
@@ -27954,7 +27967,8 @@ rs6000_emit_epilogue (int sibcall)
 	  RTVEC_ELT (p, j++)
 	    = gen_frame_load (reg,
 			      frame_reg_rtx, info->altivec_save_offset + 16 * i);
-	  if (flag_shrink_wrap)
+	  if (flag_shrink_wrap
+	      && save_reg_p (info->first_altivec_reg_save + i))
 	    cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores);
 	}
       for (i = 0; info->first_fp_reg_save + i <= 63; i++)
@@ -27964,7 +27978,8 @@ rs6000_emit_epilogue (int sibcall)
 				 info->first_fp_reg_save + i);
 	  RTVEC_ELT (p, j++)
 	    = gen_frame_load (reg, frame_reg_rtx, info->fp_save_offset + 8 * i);
-	  if (flag_shrink_wrap)
+	  if (flag_shrink_wrap
+	      && save_reg_p (info->first_fp_reg_save + i))
 	    cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores);
 	}
       RTVEC_ELT (p, j++)
@@ -28085,7 +28100,8 @@ rs6000_emit_epilogue (int sibcall)
 	    && (flag_shrink_wrap
 		|| (offset_below_red_zone_p
 		    (info->altivec_save_offset
-		     + 16 * (i - info->first_altivec_reg_save)))))
+		     + 16 * (i - info->first_altivec_reg_save))))
+	    && save_reg_p (i))
 	  {
 	    rtx reg = gen_rtx_REG (V4SImode, i);
 	    cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores);
@@ -28297,7 +28313,8 @@ rs6000_emit_epilogue (int sibcall)
       for (i = info->first_altivec_reg_save; i <= LAST_ALTIVEC_REGNO; ++i)
 	if (((strategy & REST_INLINE_VRS) == 0
 	     || (info->vrsave_mask & ALTIVEC_REG_BIT (i)) != 0)
-	    && (DEFAULT_ABI == ABI_V4 || flag_shrink_wrap))
+	    && (DEFAULT_ABI == ABI_V4 || flag_shrink_wrap)
+	    && save_reg_p (i))
 	  {
 	    rtx reg = gen_rtx_REG (V4SImode, i);
 	    cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores);
@@ -28643,7 +28660,8 @@ rs6000_emit_epilogue (int sibcall)
 
 	  RTVEC_ELT (p, elt++)
 	    = gen_frame_load (reg, sp_reg_rtx, info->fp_save_offset + 8 * i);
-	  if (flag_shrink_wrap)
+	  if (flag_shrink_wrap
+	      && save_reg_p (info->first_fp_reg_save + i))
 	    cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores);
 	}
 


-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RS6000] PR 80938, Don't emit eh_frame for regs that don't need saving
  2017-08-11  4:58         ` [RS6000] PR 80938, Don't emit eh_frame for regs that don't need saving Alan Modra
@ 2017-08-15 10:40           ` Segher Boessenkool
  2017-08-16  0:13             ` Alan Modra
  0 siblings, 1 reply; 11+ messages in thread
From: Segher Boessenkool @ 2017-08-15 10:40 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches, dje.gcc

Hi!

On Fri, Aug 11, 2017 at 12:40:11PM +0930, Alan Modra wrote:
> It is possible when using out-of-line register saves or store multiple
> to save some registers unnecessarily, for example one reg in the block
> saved might be unused.  We don't need to emit eh_frame info for those
> registers as that just bloats the eh_frame info, and also can result
> in an ICE when shrink-wrap gives multiple paths through the function
> saving different sets of registers.  All the join points need to have
> identical eh_frame register save state.

It isn't just eh_frame, we might have never noticed if it was.  It is
also the DWARF call frame info used for debugging.

> 	PR target/80938
> 	* config/rs6000/rs6000.c (rs6000_savres_strategy): Revert 2017-08-09.
> 	Don't use store multiple if only one reg needs saving.
> 	(rs6000_frame_related): Don't emit eh_frame for regs that
> 	don't need saving.
> 	(rs6000_emit_epilogue): Likewise.

> +      int count;
> +
> +      for (count = 0, i = info->first_gp_reg_save; i < 32; i++)
> +	if (save_reg_p (i))
> +	  count += 1;

      int count = 0;
      for (i = info->first_gp_reg_save; i < 32; i++)
	if (save_reg_p (i))
	  count++;

> @@ -25681,9 +25683,15 @@ rs6000_frame_related (rtx_insn *insn, rtx reg, HOST_WIDE_INT val,
>  		 register save functions, or store multiple, then omit
>  		 eh_frame info for any user-defined global regs.  If
>  		 eh_frame info is supplied, frame unwinding will
> -		 restore a user reg.  */
> +		 restore a user reg.  Also omit eh_frame info for any
> +		 reg we don't need to save, as that bloats eh_frame
> +		 and can cause problems with shrink wrapping.  Saves
> +		 of r0 are actually saving LR, so don't omit those.   */
>  	      if (!REG_P (SET_SRC (set))
> -		  || !fixed_reg_p (REGNO (SET_SRC (set))))
> +		  || REGNO (SET_SRC (set)) == 0
> +		  || REGNO (SET_SRC (set)) == CR2_REGNO
> +		  || (!fixed_reg_p (REGNO (SET_SRC (set)))
> +		      && save_reg_p (REGNO (SET_SRC (set)))))
>  		RTX_FRAME_RELATED_P (set) = 1;
>  	    }
>        RTX_FRAME_RELATED_P (insn) = 1;

That "0" and "CR2_REGNO" is non-obvious, it needs a big fat comment (it's
not clear that just CR2_REGNO is correct, either).  Oh, and please factor
out "REGNO (SET_SRC (set))".

The rest looks good, but please repost.


Segher

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RS6000] PR 80938, Don't emit eh_frame for regs that don't need saving
  2017-08-15 10:40           ` Segher Boessenkool
@ 2017-08-16  0:13             ` Alan Modra
  2017-08-17  2:23               ` Segher Boessenkool
  0 siblings, 1 reply; 11+ messages in thread
From: Alan Modra @ 2017-08-16  0:13 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, dje.gcc

Repost with requested changes.  I've extracted the logic that omits
frame saves from rs6000_frame_related to a new function, because it's
easier to document that way.  The logic has been simplified a little
too: fixed_reg_p doesn't need to be called there.

	PR target/80938
	* config/rs6000/rs6000.c (rs6000_savres_strategy): Revert 2017-08-09.
	Don't use store multiple if only one reg needs saving.
	(interesting_frame_related_regno): New function.
	(rs6000_frame_related): Don't emit eh_frame for regs that
	don't need saving.
	(rs6000_emit_epilogue): Likewise.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index f9aa13b..178632e 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -24445,20 +24445,36 @@ rs6000_savres_strategy (rs6000_stack_t *info,
 	   && flag_shrink_wrap_separate
 	   && optimize_function_for_speed_p (cfun)))
     {
-      /* Prefer store multiple for saves over out-of-line routines,
-	 since the store-multiple instruction will always be smaller.  */
-      strategy |= SAVE_INLINE_GPRS | SAVE_MULTIPLE;
-
-      /* The situation is more complicated with load multiple.  We'd
-	 prefer to use the out-of-line routines for restores, since the
-	 "exit" out-of-line routines can handle the restore of LR and the
-	 frame teardown.  However if doesn't make sense to use the
-	 out-of-line routine if that is the only reason we'd need to save
-	 LR, and we can't use the "exit" out-of-line gpr restore if we
-	 have saved some fprs; In those cases it is advantageous to use
-	 load multiple when available.  */
-      if (info->first_fp_reg_save != 64 || !lr_save_p)
-	strategy |= REST_INLINE_GPRS | REST_MULTIPLE;
+      int count = 0;
+      for (int i = info->first_gp_reg_save; i < 32; i++)
+	if (save_reg_p (i))
+	  count++;
+
+      if (count <= 1)
+	/* Don't use store multiple if only one reg needs to be
+	   saved.  This can occur for example when the ABI_V4 pic reg
+	   (r30) needs to be saved to make calls, but r31 is not
+	   used.  */
+	strategy |= SAVE_INLINE_GPRS | REST_INLINE_GPRS;
+      else
+	{
+	  /* Prefer store multiple for saves over out-of-line
+	     routines, since the store-multiple instruction will
+	     always be smaller.  */
+	  strategy |= SAVE_INLINE_GPRS | SAVE_MULTIPLE;
+
+	  /* The situation is more complicated with load multiple.
+	     We'd prefer to use the out-of-line routines for restores,
+	     since the "exit" out-of-line routines can handle the
+	     restore of LR and the frame teardown.  However if doesn't
+	     make sense to use the out-of-line routine if that is the
+	     only reason we'd need to save LR, and we can't use the
+	     "exit" out-of-line gpr restore if we have saved some
+	     fprs; In those cases it is advantageous to use load
+	     multiple when available.  */
+	  if (info->first_fp_reg_save != 64 || !lr_save_p)
+	    strategy |= REST_INLINE_GPRS | REST_MULTIPLE;
+	}
     }
 
   /* Using the "exit" out-of-line routine does not improve code size
@@ -24467,21 +24483,6 @@ rs6000_savres_strategy (rs6000_stack_t *info,
   else if (!lr_save_p && info->first_gp_reg_save > 29)
     strategy |= SAVE_INLINE_GPRS | REST_INLINE_GPRS;
 
-  /* We can only use save multiple if we need to save all the registers from
-     first_gp_reg_save.  Otherwise, the CFI gets messed up (we save some
-     register we do not restore).  */
-  if (strategy & SAVE_MULTIPLE)
-    {
-      int i;
-
-      for (i = info->first_gp_reg_save; i < 32; i++)
-	if (fixed_reg_p (i) || !save_reg_p (i))
-	  {
-	    strategy &= ~SAVE_MULTIPLE;
-	    break;
-	  }
-    }
-
   /* Don't ever restore fixed regs.  */
   if ((strategy & (REST_INLINE_GPRS | REST_MULTIPLE)) != REST_INLINE_GPRS)
     for (int i = info->first_gp_reg_save; i < 32; i++)
@@ -25654,6 +25655,32 @@ output_probe_stack_range (rtx reg1, rtx reg2)
   return "";
 }
 
+/* This function is called when rs6000_frame_related is processing
+   SETs within a PARALLEL, and returns whether the REGNO save ought to
+   be marked RTX_FRAME_RELATED_P.  The PARALLELs involved are those
+   for out-of-line register save functions, store multiple, and the
+   Darwin world_save.  They may contain registers that don't really
+   need saving.  */
+
+static bool
+interesting_frame_related_regno (unsigned int regno)
+{
+  /* Saves apparently of r0 are actually saving LR.  */
+  if (regno == 0)
+    return true;
+  /* If we see CR2 then we are here on a Darwin world save.  Saves of
+     CR2 signify the whole CR is being saved.  */
+  if (regno == CR2_REGNO)
+    return true;
+  /* Omit frame info for any user-defined global regs.  If frame info
+     is supplied for them, frame unwinding will restore a user reg.
+     Also omit frame info for any reg we don't need to save, as that
+     bloats eh_frame and can cause problems with shrink wrapping.
+     Since global regs won't be seen as needing to be saved, both of
+     these conditions are covered by save_reg_p.  */
+  return save_reg_p (regno);
+}
+
 /* Add to 'insn' a note which is PATTERN (INSN) but with REG replaced
    with (plus:P (reg 1) VAL), and with REG2 replaced with REPL2 if REG2
    is not NULL.  It would be nice if dwarf2out_frame_debug_expr could
@@ -25688,13 +25715,8 @@ rs6000_frame_related (rtx_insn *insn, rtx reg, HOST_WIDE_INT val,
 	    {
 	      rtx set = XVECEXP (pat, 0, i);
 
-	      /* If this PARALLEL has been emitted for out-of-line
-		 register save functions, or store multiple, then omit
-		 eh_frame info for any user-defined global regs.  If
-		 eh_frame info is supplied, frame unwinding will
-		 restore a user reg.  */
 	      if (!REG_P (SET_SRC (set))
-		  || !fixed_reg_p (REGNO (SET_SRC (set))))
+		  || interesting_frame_related_regno (REGNO (SET_SRC (set))))
 		RTX_FRAME_RELATED_P (set) = 1;
 	    }
       RTX_FRAME_RELATED_P (insn) = 1;
@@ -25731,9 +25753,8 @@ rs6000_frame_related (rtx_insn *insn, rtx reg, HOST_WIDE_INT val,
 	      set = simplify_replace_rtx (set, reg2, repl2);
 	    XVECEXP (pat, 0, i) = set;
 
-	    /* Omit eh_frame info for any user-defined global regs.  */
 	    if (!REG_P (SET_SRC (set))
-		|| !fixed_reg_p (REGNO (SET_SRC (set))))
+		|| interesting_frame_related_regno (REGNO (SET_SRC (set))))
 	      RTX_FRAME_RELATED_P (set) = 1;
 	  }
     }
@@ -27956,7 +27977,8 @@ rs6000_emit_epilogue (int sibcall)
 	  RTVEC_ELT (p, j++)
 	    = gen_frame_load (reg,
 			      frame_reg_rtx, info->gp_save_offset + reg_size * i);
-	  if (flag_shrink_wrap)
+	  if (flag_shrink_wrap
+	      && save_reg_p (info->first_gp_reg_save + i))
 	    cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores);
 	}
       for (i = 0; info->first_altivec_reg_save + i <= LAST_ALTIVEC_REGNO; i++)
@@ -27965,7 +27987,8 @@ rs6000_emit_epilogue (int sibcall)
 	  RTVEC_ELT (p, j++)
 	    = gen_frame_load (reg,
 			      frame_reg_rtx, info->altivec_save_offset + 16 * i);
-	  if (flag_shrink_wrap)
+	  if (flag_shrink_wrap
+	      && save_reg_p (info->first_altivec_reg_save + i))
 	    cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores);
 	}
       for (i = 0; info->first_fp_reg_save + i <= 63; i++)
@@ -27975,7 +27998,8 @@ rs6000_emit_epilogue (int sibcall)
 				 info->first_fp_reg_save + i);
 	  RTVEC_ELT (p, j++)
 	    = gen_frame_load (reg, frame_reg_rtx, info->fp_save_offset + 8 * i);
-	  if (flag_shrink_wrap)
+	  if (flag_shrink_wrap
+	      && save_reg_p (info->first_fp_reg_save + i))
 	    cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores);
 	}
       RTVEC_ELT (p, j++)
@@ -28096,7 +28120,8 @@ rs6000_emit_epilogue (int sibcall)
 	    && (flag_shrink_wrap
 		|| (offset_below_red_zone_p
 		    (info->altivec_save_offset
-		     + 16 * (i - info->first_altivec_reg_save)))))
+		     + 16 * (i - info->first_altivec_reg_save))))
+	    && save_reg_p (i))
 	  {
 	    rtx reg = gen_rtx_REG (V4SImode, i);
 	    cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores);
@@ -28308,7 +28333,8 @@ rs6000_emit_epilogue (int sibcall)
       for (i = info->first_altivec_reg_save; i <= LAST_ALTIVEC_REGNO; ++i)
 	if (((strategy & REST_INLINE_VRS) == 0
 	     || (info->vrsave_mask & ALTIVEC_REG_BIT (i)) != 0)
-	    && (DEFAULT_ABI == ABI_V4 || flag_shrink_wrap))
+	    && (DEFAULT_ABI == ABI_V4 || flag_shrink_wrap)
+	    && save_reg_p (i))
 	  {
 	    rtx reg = gen_rtx_REG (V4SImode, i);
 	    cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores);
@@ -28654,7 +28680,8 @@ rs6000_emit_epilogue (int sibcall)
 
 	  RTVEC_ELT (p, elt++)
 	    = gen_frame_load (reg, sp_reg_rtx, info->fp_save_offset + 8 * i);
-	  if (flag_shrink_wrap)
+	  if (flag_shrink_wrap
+	      && save_reg_p (info->first_fp_reg_save + i))
 	    cfa_restores = alloc_reg_note (REG_CFA_RESTORE, reg, cfa_restores);
 	}
 

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RS6000] PR 80938, Don't emit eh_frame for regs that don't need saving
  2017-08-16  0:13             ` Alan Modra
@ 2017-08-17  2:23               ` Segher Boessenkool
  2017-08-17  5:28                 ` Alan Modra
  0 siblings, 1 reply; 11+ messages in thread
From: Segher Boessenkool @ 2017-08-17  2:23 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches, dje.gcc

Hi!

On Wed, Aug 16, 2017 at 08:05:04AM +0930, Alan Modra wrote:
> Repost with requested changes.  I've extracted the logic that omits
> frame saves from rs6000_frame_related to a new function, because it's
> easier to document that way.  The logic has been simplified a little
> too: fixed_reg_p doesn't need to be called there.
> 
> 	PR target/80938
> 	* config/rs6000/rs6000.c (rs6000_savres_strategy): Revert 2017-08-09.
> 	Don't use store multiple if only one reg needs saving.
> 	(interesting_frame_related_regno): New function.
> 	(rs6000_frame_related): Don't emit eh_frame for regs that
> 	don't need saving.
> 	(rs6000_emit_epilogue): Likewise.

It's not just eh_frame, it's all call frame information.

> +/* This function is called when rs6000_frame_related is processing
> +   SETs within a PARALLEL, and returns whether the REGNO save ought to
> +   be marked RTX_FRAME_RELATED_P.  The PARALLELs involved are those
> +   for out-of-line register save functions, store multiple, and the
> +   Darwin world_save.  They may contain registers that don't really
> +   need saving.  */
> +
> +static bool
> +interesting_frame_related_regno (unsigned int regno)
> +{
> +  /* Saves apparently of r0 are actually saving LR.  */
> +  if (regno == 0)
> +    return true;
> +  /* If we see CR2 then we are here on a Darwin world save.  Saves of
> +     CR2 signify the whole CR is being saved.  */
> +  if (regno == CR2_REGNO)
> +    return true;
> +  /* Omit frame info for any user-defined global regs.  If frame info
> +     is supplied for them, frame unwinding will restore a user reg.
> +     Also omit frame info for any reg we don't need to save, as that
> +     bloats eh_frame and can cause problems with shrink wrapping.
> +     Since global regs won't be seen as needing to be saved, both of
> +     these conditions are covered by save_reg_p.  */
> +  return save_reg_p (regno);
> +}

The function name isn't so great, doesn't say what it does at all.  Not
that I can think of anything better.

Maybe whatever is creating those instructions should set RTX_FRAME_RELATED_P
by itself?  Not sure if that is nicer.

Both this CR2 and R0 handling are pretty nasty hacks.  Could you add a
comment saying that?

Okay for trunk with those improvements (eh_frame and hack comment).
Thanks!


Segher

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RS6000] PR 80938, Don't emit eh_frame for regs that don't need saving
  2017-08-17  2:23               ` Segher Boessenkool
@ 2017-08-17  5:28                 ` Alan Modra
  2017-08-17  5:50                   ` Segher Boessenkool
  0 siblings, 1 reply; 11+ messages in thread
From: Alan Modra @ 2017-08-17  5:28 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, dje.gcc

On Wed, Aug 16, 2017 at 06:23:13PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Aug 16, 2017 at 08:05:04AM +0930, Alan Modra wrote:
> > Repost with requested changes.  I've extracted the logic that omits
> > frame saves from rs6000_frame_related to a new function, because it's
> > easier to document that way.  The logic has been simplified a little
> > too: fixed_reg_p doesn't need to be called there.
> > 
> > 	PR target/80938
> > 	* config/rs6000/rs6000.c (rs6000_savres_strategy): Revert 2017-08-09.
> > 	Don't use store multiple if only one reg needs saving.
> > 	(interesting_frame_related_regno): New function.
> > 	(rs6000_frame_related): Don't emit eh_frame for regs that
> > 	don't need saving.
> > 	(rs6000_emit_epilogue): Likewise.
> 
> It's not just eh_frame, it's all call frame information.

Sure, I meant to fix that.

> > +/* This function is called when rs6000_frame_related is processing
> > +   SETs within a PARALLEL, and returns whether the REGNO save ought to
> > +   be marked RTX_FRAME_RELATED_P.  The PARALLELs involved are those
> > +   for out-of-line register save functions, store multiple, and the
> > +   Darwin world_save.  They may contain registers that don't really
> > +   need saving.  */
> > +
> > +static bool
> > +interesting_frame_related_regno (unsigned int regno)
> > +{
> > +  /* Saves apparently of r0 are actually saving LR.  */
> > +  if (regno == 0)
> > +    return true;
> > +  /* If we see CR2 then we are here on a Darwin world save.  Saves of
> > +     CR2 signify the whole CR is being saved.  */
> > +  if (regno == CR2_REGNO)
> > +    return true;
> > +  /* Omit frame info for any user-defined global regs.  If frame info
> > +     is supplied for them, frame unwinding will restore a user reg.
> > +     Also omit frame info for any reg we don't need to save, as that
> > +     bloats eh_frame and can cause problems with shrink wrapping.
> > +     Since global regs won't be seen as needing to be saved, both of
> > +     these conditions are covered by save_reg_p.  */
> > +  return save_reg_p (regno);
> > +}
> 
> The function name isn't so great, doesn't say what it does at all.  Not
> that I can think of anything better.
> 
> Maybe whatever is creating those instructions should set RTX_FRAME_RELATED_P
> by itself?  Not sure if that is nicer.
> 
> Both this CR2 and R0 handling are pretty nasty hacks.  Could you add a
> comment saying that?

I wouldn't say the R0 handling is a nasty hack at all.  You can't save
LR directly, storing to the stack must go via a gpr.  I'm 100% sure
you know that, and so would anyone else working on powerpc gcc
support.  It so happens that we use r0 in every case we hit this 
code.  *That* fact is commented.  I don't really know what else you
want.  Hmm, maybe I'm just too close to this code.  I'll go with
expounding some of the things known, as follows. 

  /* Saves apparently of r0 are actually saving LR.  It doesn't make
     sense to substitute the regno here to test save_reg_p (LR_REGNO).
     We *know* LR needs saving, and dwarf2cfi.c is able to deduce that
     (set (mem) (r0)) is saving LR from a prior (set (r0) (lr)) marked
     as frame related.  */

(Incidentally, the dwarf2cfi.c behaviour is described by

  In addition, if a register has previously been saved to a different
  register,

Yup, great comment that one!  Dates back to 2004, commit 60ea93bb72.)

The CR2 thing is a long-standing convention for frame info about CR, a
wart fixed by ELFv2.  See elsewhere

      /* CR register traditionally saved as CR2.  */
and
	  /* In other ABIs, by convention, we use a single CR regnum to
	     represent the fact that all call-saved CR fields are saved.
	     We use CR2_REGNO to be compatible with gcc-2.95 on Linux.  */

I'l go with:

  /* If we see CR2 then we are here on a Darwin world save.  Saves of
     CR2 signify the whole CR is being saved.  This is a long-standing
     ABI wart fixed by ELFv2.  As for r0/lr there is no need to check
     that CR needs to be saved.  */

> Okay for trunk with those improvements (eh_frame and hack comment).
> Thanks!
> 
> 
> Segher

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RS6000] PR 80938, Don't emit eh_frame for regs that don't need saving
  2017-08-17  5:28                 ` Alan Modra
@ 2017-08-17  5:50                   ` Segher Boessenkool
  0 siblings, 0 replies; 11+ messages in thread
From: Segher Boessenkool @ 2017-08-17  5:50 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches, dje.gcc

On Thu, Aug 17, 2017 at 11:25:27AM +0930, Alan Modra wrote:
> On Wed, Aug 16, 2017 at 06:23:13PM -0500, Segher Boessenkool wrote:
> > Maybe whatever is creating those instructions should set RTX_FRAME_RELATED_P
> > by itself?  Not sure if that is nicer.
> > 
> > Both this CR2 and R0 handling are pretty nasty hacks.  Could you add a
> > comment saying that?
> 
> I wouldn't say the R0 handling is a nasty hack at all.  You can't save
> LR directly, storing to the stack must go via a gpr.  I'm 100% sure
> you know that, and so would anyone else working on powerpc gcc
> support.  It so happens that we use r0 in every case we hit this 
> code.  *That* fact is commented.  I don't really know what else you
> want.

But R0 isn't necessarily used for LR here.  That is true currently in
all cases I can see, but it can be used for other things.  The ABI
doesn't force things here.

The separate shrink-wrapping code does use R0 for other things, but it
manually sets the CFI notes anyway, so no problem there.  Maybe I should
just stop worrying.

> (Incidentally, the dwarf2cfi.c behaviour is described by
> 
>   In addition, if a register has previously been saved to a different
>   register,
> 
> Yup, great comment that one!  Dates back to 2004, commit 60ea93bb72.)

Perhaps inspired by the REG_CFA_REGISTER name for the RTL note ;-)

> The CR2 thing is a long-standing convention for frame info about CR, a
> wart fixed by ELFv2.  See elsewhere

That ELFv2 fix is why I still haven't submitted the separate shrink-wrapping
for CR fields code -- it complicates things a lot :-/

> I'l go with:
> 
>   /* If we see CR2 then we are here on a Darwin world save.  Saves of
>      CR2 signify the whole CR is being saved.  This is a long-standing
>      ABI wart fixed by ELFv2.  As for r0/lr there is no need to check
>      that CR needs to be saved.  */

That is fine, thanks again.


Segher

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2017-08-17  2:23 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-09 21:16 [PATCH] rs6000: Use SAVE_MULTIPLE only if we restore what it saves (PR80938) Segher Boessenkool
2017-08-10  4:18 ` Alan Modra
2017-08-10  4:44   ` Segher Boessenkool
2017-08-10  5:26     ` Alan Modra
2017-08-10 13:46       ` Segher Boessenkool
2017-08-11  4:58         ` [RS6000] PR 80938, Don't emit eh_frame for regs that don't need saving Alan Modra
2017-08-15 10:40           ` Segher Boessenkool
2017-08-16  0:13             ` Alan Modra
2017-08-17  2:23               ` Segher Boessenkool
2017-08-17  5:28                 ` Alan Modra
2017-08-17  5:50                   ` Segher Boessenkool

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).