public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH][AArch64] Simplify frame layout for stack probing
@ 2017-07-25 13:58 Wilco Dijkstra
  2017-07-26 19:16 ` Jeff Law
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Wilco Dijkstra @ 2017-07-25 13:58 UTC (permalink / raw)
  To: GCC Patches, James Greenhalgh, Jeff Law; +Cc: nd

This patch makes some changes to the frame layout in order to simplify
stack probing.  We want to use the save of LR as a probe in any non-leaf
function.  With shrinkwrapping we may only save LR before a call, so it
is useful to define a fixed location in the callee-saves. So force LR at
the bottom of the callee-saves even with -fomit-frame-pointer.

Also remove a rarely used frame layout that saves the callee-saves first
with -fomit-frame-pointer.

OK for commit (and backport to GCC7)?

ChangeLog:
2017-07-25  Wilco Dijkstra  <wdijkstr@arm.com>

	* config/aarch64/aarch64.c (aarch64_layout_frame):
	Ensure LR is always stored at the bottom of the callee-saves.
	Remove frame option which saves callee-saves at top of frame.

--
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index b8a4160d9de8e689ccd26cb9f0ce046ee65e0ef4..3fc36ae28d18b9635480fd99f1fa7719267e66e4 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -2875,7 +2875,8 @@ aarch64_frame_pointer_required (void)
 
 /* Mark the registers that need to be saved by the callee and calculate
    the size of the callee-saved registers area and frame record (both FP
-   and LR may be omitted).  */
+   and LR may be omitted).  If the function is not a leaf, ensure LR is
+   saved at the bottom of the callee-save area.  */
 static void
 aarch64_layout_frame (void)
 {
@@ -2926,7 +2927,14 @@ aarch64_layout_frame (void)
       cfun->machine->frame.wb_candidate1 = R29_REGNUM;
       cfun->machine->frame.reg_offset[R30_REGNUM] = UNITS_PER_WORD;
       cfun->machine->frame.wb_candidate2 = R30_REGNUM;
-      offset += 2 * UNITS_PER_WORD;
+      offset = 2 * UNITS_PER_WORD;
+    }
+  else if (!crtl->is_leaf)
+    {
+      /* Ensure LR is saved at the bottom of the callee-saves.  */
+      cfun->machine->frame.reg_offset[R30_REGNUM] = 0;
+      cfun->machine->frame.wb_candidate1 = R30_REGNUM;
+      offset = UNITS_PER_WORD;
     }
 
   /* Now assign stack slots for them.  */
@@ -3025,20 +3033,6 @@ aarch64_layout_frame (void)
       cfun->machine->frame.final_adjust
 	= cfun->machine->frame.frame_size - cfun->machine->frame.callee_adjust;
     }
-  else if (!frame_pointer_needed
-	   && varargs_and_saved_regs_size < max_push_offset)
-    {
-      /* Frame with large local area and outgoing arguments (this pushes the
-	 callee-saves first, followed by the locals and outgoing area):
-	 stp reg1, reg2, [sp, -varargs_and_saved_regs_size]!
-	 stp reg3, reg4, [sp, 16]
-	 sub sp, sp, frame_size - varargs_and_saved_regs_size  */
-      cfun->machine->frame.callee_adjust = varargs_and_saved_regs_size;
-      cfun->machine->frame.final_adjust
-	= cfun->machine->frame.frame_size - cfun->machine->frame.callee_adjust;
-      cfun->machine->frame.hard_fp_offset = cfun->machine->frame.callee_adjust;
-      cfun->machine->frame.locals_offset = cfun->machine->frame.hard_fp_offset;
-    }
   else
     {
       /* Frame with large local area and outgoing arguments using frame pointer:

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH][AArch64] Simplify frame layout for stack probing
  2017-07-25 13:58 [PATCH][AArch64] Simplify frame layout for stack probing Wilco Dijkstra
@ 2017-07-26 19:16 ` Jeff Law
  2017-08-01 10:18 ` Wilco Dijkstra
  2017-10-26 15:23 ` James Greenhalgh
  2 siblings, 0 replies; 7+ messages in thread
From: Jeff Law @ 2017-07-26 19:16 UTC (permalink / raw)
  To: Wilco Dijkstra, GCC Patches, James Greenhalgh; +Cc: nd

On 07/25/2017 07:58 AM, Wilco Dijkstra wrote:
> This patch makes some changes to the frame layout in order to simplify
> stack probing.  We want to use the save of LR as a probe in any non-leaf
> function.  With shrinkwrapping we may only save LR before a call, so it
> is useful to define a fixed location in the callee-saves. So force LR at
> the bottom of the callee-saves even with -fomit-frame-pointer.
> 
> Also remove a rarely used frame layout that saves the callee-saves first
> with -fomit-frame-pointer.
> 
> OK for commit (and backport to GCC7)?
> 
> ChangeLog:
> 2017-07-25  Wilco Dijkstra  <wdijkstr@arm.com>
> 
> 	* config/aarch64/aarch64.c (aarch64_layout_frame):
> 	Ensure LR is always stored at the bottom of the callee-saves.
> 	Remove frame option which saves callee-saves at top of frame.
I'll let the appropriate aarch64 maintainers comment on correctness.
But I wanted to give an explicit thanks for simplifying this so that we
can rely on it.

Jeff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH][AArch64] Simplify frame layout for stack probing
  2017-07-25 13:58 [PATCH][AArch64] Simplify frame layout for stack probing Wilco Dijkstra
  2017-07-26 19:16 ` Jeff Law
@ 2017-08-01 10:18 ` Wilco Dijkstra
  2017-08-15 16:27   ` Wilco Dijkstra
  2017-10-26 15:23 ` James Greenhalgh
  2 siblings, 1 reply; 7+ messages in thread
From: Wilco Dijkstra @ 2017-08-01 10:18 UTC (permalink / raw)
  To: GCC Patches, James Greenhalgh, Jeff Law; +Cc: nd


ping


From: Wilco Dijkstra
Sent: 25 July 2017 14:58
To: GCC Patches; James Greenhalgh; Jeff Law
Cc: nd
Subject: [PATCH][AArch64] Simplify frame layout for stack probing
    
This patch makes some changes to the frame layout in order to simplify
stack probing.  We want to use the save of LR as a probe in any non-leaf
function.  With shrinkwrapping we may only save LR before a call, so it
is useful to define a fixed location in the callee-saves. So force LR at
the bottom of the callee-saves even with -fomit-frame-pointer.

Also remove a rarely used frame layout that saves the callee-saves first
with -fomit-frame-pointer.

OK for commit (and backport to GCC7)?

ChangeLog:
2017-07-25  Wilco Dijkstra  <wdijkstr@arm.com>

        * config/aarch64/aarch64.c (aarch64_layout_frame):
        Ensure LR is always stored at the bottom of the callee-saves.
        Remove frame option which saves callee-saves at top of frame.

--
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index b8a4160d9de8e689ccd26cb9f0ce046ee65e0ef4..3fc36ae28d18b9635480fd99f1fa7719267e66e4 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -2875,7 +2875,8 @@ aarch64_frame_pointer_required (void)
 
 /* Mark the registers that need to be saved by the callee and calculate
    the size of the callee-saved registers area and frame record (both FP
-   and LR may be omitted).  */
+   and LR may be omitted).  If the function is not a leaf, ensure LR is
+   saved at the bottom of the callee-save area.  */
 static void
 aarch64_layout_frame (void)
 {
@@ -2926,7 +2927,14 @@ aarch64_layout_frame (void)
       cfun->machine->frame.wb_candidate1 = R29_REGNUM;
       cfun->machine->frame.reg_offset[R30_REGNUM] = UNITS_PER_WORD;
       cfun->machine->frame.wb_candidate2 = R30_REGNUM;
-      offset += 2 * UNITS_PER_WORD;
+      offset = 2 * UNITS_PER_WORD;
+    }
+  else if (!crtl->is_leaf)
+    {
+      /* Ensure LR is saved at the bottom of the callee-saves.  */
+      cfun->machine->frame.reg_offset[R30_REGNUM] = 0;
+      cfun->machine->frame.wb_candidate1 = R30_REGNUM;
+      offset = UNITS_PER_WORD;
     }
 
   /* Now assign stack slots for them.  */
@@ -3025,20 +3033,6 @@ aarch64_layout_frame (void)
       cfun->machine->frame.final_adjust
         = cfun->machine->frame.frame_size - cfun->machine->frame.callee_adjust;
     }
-  else if (!frame_pointer_needed
-          && varargs_and_saved_regs_size < max_push_offset)
-    {
-      /* Frame with large local area and outgoing arguments (this pushes the
-        callee-saves first, followed by the locals and outgoing area):
-        stp reg1, reg2, [sp, -varargs_and_saved_regs_size]!
-        stp reg3, reg4, [sp, 16]
-        sub sp, sp, frame_size - varargs_and_saved_regs_size  */
-      cfun->machine->frame.callee_adjust = varargs_and_saved_regs_size;
-      cfun->machine->frame.final_adjust
-       = cfun->machine->frame.frame_size - cfun->machine->frame.callee_adjust;
-      cfun->machine->frame.hard_fp_offset = cfun->machine->frame.callee_adjust;
-      cfun->machine->frame.locals_offset = cfun->machine->frame.hard_fp_offset;
-    }
   else
     {
       /* Frame with large local area and outgoing arguments using frame pointer:
    

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH][AArch64] Simplify frame layout for stack probing
  2017-08-01 10:18 ` Wilco Dijkstra
@ 2017-08-15 16:27   ` Wilco Dijkstra
  0 siblings, 0 replies; 7+ messages in thread
From: Wilco Dijkstra @ 2017-08-15 16:27 UTC (permalink / raw)
  To: GCC Patches, James Greenhalgh, Jeff Law; +Cc: nd


  
ping


From: Wilco Dijkstra
Sent: 25 July 2017 14:58
To: GCC Patches; James Greenhalgh; Jeff Law
Cc: nd
Subject: [PATCH][AArch64] Simplify frame layout for stack probing
    
This patch makes some changes to the frame layout in order to simplify
stack probing.  We want to use the save of LR as a probe in any non-leaf
function.  With shrinkwrapping we may only save LR before a call, so it
is useful to define a fixed location in the callee-saves. So force LR at
the bottom of the callee-saves even with -fomit-frame-pointer.

Also remove a rarely used frame layout that saves the callee-saves first
with -fomit-frame-pointer.

OK for commit (and backport to GCC7)?

ChangeLog:
2017-07-25  Wilco Dijkstra  <wdijkstr@arm.com>

        * config/aarch64/aarch64.c (aarch64_layout_frame):
        Ensure LR is always stored at the bottom of the callee-saves.
        Remove frame option which saves callee-saves at top of frame.

--
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index b8a4160d9de8e689ccd26cb9f0ce046ee65e0ef4..3fc36ae28d18b9635480fd99f1fa7719267e66e4 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -2875,7 +2875,8 @@ aarch64_frame_pointer_required (void)
 
 /* Mark the registers that need to be saved by the callee and calculate
    the size of the callee-saved registers area and frame record (both FP
-   and LR may be omitted).  */
+   and LR may be omitted).  If the function is not a leaf, ensure LR is
+   saved at the bottom of the callee-save area.  */
 static void
 aarch64_layout_frame (void)
 {
@@ -2926,7 +2927,14 @@ aarch64_layout_frame (void)
       cfun->machine->frame.wb_candidate1 = R29_REGNUM;
       cfun->machine->frame.reg_offset[R30_REGNUM] = UNITS_PER_WORD;
       cfun->machine->frame.wb_candidate2 = R30_REGNUM;
-      offset += 2 * UNITS_PER_WORD;
+      offset = 2 * UNITS_PER_WORD;
+    }
+  else if (!crtl->is_leaf)
+    {
+      /* Ensure LR is saved at the bottom of the callee-saves.  */
+      cfun->machine->frame.reg_offset[R30_REGNUM] = 0;
+      cfun->machine->frame.wb_candidate1 = R30_REGNUM;
+      offset = UNITS_PER_WORD;
     }
 
   /* Now assign stack slots for them.  */
@@ -3025,20 +3033,6 @@ aarch64_layout_frame (void)
       cfun->machine->frame.final_adjust
         = cfun->machine->frame.frame_size - cfun->machine->frame.callee_adjust;
     }
-  else if (!frame_pointer_needed
-          && varargs_and_saved_regs_size < max_push_offset)
-    {
-      /* Frame with large local area and outgoing arguments (this pushes the
-        callee-saves first, followed by the locals and outgoing area):
-        stp reg1, reg2, [sp, -varargs_and_saved_regs_size]!
-        stp reg3, reg4, [sp, 16]
-        sub sp, sp, frame_size - varargs_and_saved_regs_size  */
-      cfun->machine->frame.callee_adjust = varargs_and_saved_regs_size;
-      cfun->machine->frame.final_adjust
-       = cfun->machine->frame.frame_size - cfun->machine->frame.callee_adjust;
-      cfun->machine->frame.hard_fp_offset = cfun->machine->frame.callee_adjust;
-      cfun->machine->frame.locals_offset = cfun->machine->frame.hard_fp_offset;
-    }
   else
     {
       /* Frame with large local area and outgoing arguments using frame pointer:
        

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH][AArch64] Simplify frame layout for stack probing
  2017-07-25 13:58 [PATCH][AArch64] Simplify frame layout for stack probing Wilco Dijkstra
  2017-07-26 19:16 ` Jeff Law
  2017-08-01 10:18 ` Wilco Dijkstra
@ 2017-10-26 15:23 ` James Greenhalgh
  2017-10-27  9:26   ` James Greenhalgh
  2 siblings, 1 reply; 7+ messages in thread
From: James Greenhalgh @ 2017-10-26 15:23 UTC (permalink / raw)
  To: Wilco Dijkstra; +Cc: GCC Patches, Jeff Law, nd

On Tue, Jul 25, 2017 at 02:58:04PM +0100, Wilco Dijkstra wrote:
> This patch makes some changes to the frame layout in order to simplify
> stack probing.  We want to use the save of LR as a probe in any non-leaf
> function.  With shrinkwrapping we may only save LR before a call, so it
> is useful to define a fixed location in the callee-saves. So force LR at
> the bottom of the callee-saves even with -fomit-frame-pointer.
> 
> Also remove a rarely used frame layout that saves the callee-saves first
> with -fomit-frame-pointer.
> 
> OK for commit (and backport to GCC7)?

OK. Leave it a week before backporting.

Reviewed by: James Greenhalgh <james.greenhalgh@arm.com>

Thanks,
James

> 
> ChangeLog:
> 2017-07-25  Wilco Dijkstra  <wdijkstr@arm.com>
> 
> 	* config/aarch64/aarch64.c (aarch64_layout_frame):
> 	Ensure LR is always stored at the bottom of the callee-saves.
> 	Remove frame option which saves callee-saves at top of frame.
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH][AArch64] Simplify frame layout for stack probing
  2017-10-26 15:23 ` James Greenhalgh
@ 2017-10-27  9:26   ` James Greenhalgh
  2017-11-03 16:47     ` Wilco Dijkstra
  0 siblings, 1 reply; 7+ messages in thread
From: James Greenhalgh @ 2017-10-27  9:26 UTC (permalink / raw)
  To: Wilco Dijkstra; +Cc: GCC Patches, Jeff Law, nd

On Thu, Oct 26, 2017 at 04:19:35PM +0100, James Greenhalgh wrote:
> On Tue, Jul 25, 2017 at 02:58:04PM +0100, Wilco Dijkstra wrote:
> > This patch makes some changes to the frame layout in order to simplify
> > stack probing.  We want to use the save of LR as a probe in any non-leaf
> > function.  With shrinkwrapping we may only save LR before a call, so it
> > is useful to define a fixed location in the callee-saves. So force LR at
> > the bottom of the callee-saves even with -fomit-frame-pointer.
> > 
> > Also remove a rarely used frame layout that saves the callee-saves first
> > with -fomit-frame-pointer.
> > 
> > OK for commit (and backport to GCC7)?
> 
> OK. Leave it a week before backporting.

This caused:

  Failures:
	gcc.target/aarch64/test_frame_4.c
	gcc.target/aarch64/test_frame_2.c
	gcc.target/aarch64/test_frame_7.c
	gcc.target/aarch64/test_frame_10.c
	
  Bisected to: 

  Author: wilco
  Date:   Thu Oct 26 16:40:25 2017 +0000

    Simplify frame layout for stack probing
    
    This patch makes some changes to the frame layout in order to simplify
    stack probing.  We want to use the save of LR as a probe in any non-leaf
    function.  With shrinkwrapping we may only save LR before a call, so it
    is useful to define a fixed location in the callee-saves. So force LR at
    the bottom of the callee-saves even with -fomit-frame-pointer.
    
    Also remove a rarely used frame layout that saves the callee-saves first
    with -fomit-frame-pointer.  Doing so allows the store of LR to be used as
    a valid stack probe in all frames.
    
        gcc/
    	* config/aarch64/aarch64.c (aarch64_layout_frame):
            Ensure LR is always stored at the bottom of the callee-saves.
            Remove rarely used frame layout which saves callee-saves at top of
            frame, so the store of LR can be used as a valid probe in all cases.
    
    
    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@254112

Please look in to this.

This will also block the request to backport the patch until after the
failures have been resolved.

There's no reason we shouldn't be catching bugs like this (simple
scan assembler tests which have been in the port for years, that will
obviously never pass after your changes) before the patch makes it to
trunk. How was this patch tested?

Thanks,
James

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH][AArch64] Simplify frame layout for stack probing
  2017-10-27  9:26   ` James Greenhalgh
@ 2017-11-03 16:47     ` Wilco Dijkstra
  0 siblings, 0 replies; 7+ messages in thread
From: Wilco Dijkstra @ 2017-11-03 16:47 UTC (permalink / raw)
  To: James Greenhalgh; +Cc: GCC Patches, Jeff Law, nd

James Greenhalgh wrote:
>
> This caused:
>
>  Failures:
>        gcc.target/aarch64/test_frame_4.c
>        gcc.target/aarch64/test_frame_2.c
>        gcc.target/aarch64/test_frame_7.c
>        gcc.target/aarch64/test_frame_10.c

Sorry, I missed that in testing. I've reverted part of the patch that caused this.
The tests are definitely too picky but they also uncovered a real code generation
inefficiency, so I need to look into that further.

I've committed this:

2017-11-03  Wilco Dijkstra  <wdijkstr@arm.com>

        PR target/82786
        * config/aarch64/aarch64.c (aarch64_layout_frame):
        Undo forcing of LR at bottom of frame.
--
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 2fc7db4..949f3cb 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2017-11-03  Wilco Dijkstra  <wdijkstr@arm.com>
+
+       PR target/82786
+       * config/aarch64/aarch64.c (aarch64_layout_frame):
+       Undo forcing of LR at bottom of frame.
+
 2017-11-03  Jeff Law  <law@redhat.com>
 
        * cfganal.c (single_pred_edge_ignoring_loop_edges): New function
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 1e12645..12f247d 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -2908,8 +2908,7 @@ aarch64_frame_pointer_required (void)
 
 /* Mark the registers that need to be saved by the callee and calculate
    the size of the callee-saved registers area and frame record (both FP
-   and LR may be omitted).  If the function is not a leaf, ensure LR is
-   saved at the bottom of the callee-save area.  */
+   and LR may be omitted).  */
 static void
 aarch64_layout_frame (void)
 {
@@ -2966,13 +2965,6 @@ aarch64_layout_frame (void)
       cfun->machine->frame.wb_candidate2 = R30_REGNUM;
       offset = 2 * UNITS_PER_WORD;
     }
-  else if (!crtl->is_leaf)
-    {
-      /* Ensure LR is saved at the bottom of the callee-saves.  */
-      cfun->machine->frame.reg_offset[R30_REGNUM] = 0;
-      cfun->machine->frame.wb_candidate1 = R30_REGNUM;
-      offset = UNITS_PER_WORD;
-    }
 
   /* Now assign stack slots for them.  */
   for (regno = R0_REGNUM; regno <= R30_REGNUM; regno++)



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2017-11-03 16:47 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-25 13:58 [PATCH][AArch64] Simplify frame layout for stack probing Wilco Dijkstra
2017-07-26 19:16 ` Jeff Law
2017-08-01 10:18 ` Wilco Dijkstra
2017-08-15 16:27   ` Wilco Dijkstra
2017-10-26 15:23 ` James Greenhalgh
2017-10-27  9:26   ` James Greenhalgh
2017-11-03 16:47     ` Wilco Dijkstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).