public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH, arm][PR88167] Fix __builtin_return_address returns invalid address
@ 2018-12-18  9:32 Mihail Ionescu
  2018-12-18 12:54 ` Mihail Ionescu
  0 siblings, 1 reply; 4+ messages in thread
From: Mihail Ionescu @ 2018-12-18  9:32 UTC (permalink / raw)
  To: Ramana Radhakrishnan, gcc-patches, Richard Earnshaw, Kyrylo Tkachov

Hi All,

In Thumb mode when the function prologue gets expanded, in case of a 
multiple register push, additional mov instructions are generated to 
save the high registers which result in lr getting overwritten before 
it's value can be used to retrieve the return address.

The fix consists of detecting if lr is alive after the prologue, in 
which case, the lr register won't be used as a scratch.

Regression tested on arm-none-eabi.

gcc/ChangeLog:
2018-11-23  Mihail Ionescu  <mihail.ionescu@arm.com>

	PR target/88167
	* config/arm/arm.c: Add lr liveness check.

gcc/testsuite/ChangeLog
2018-11-23  Mihail Ionescu  <mihail.ionescu@arm.com>

	PR target/88167
	* gcc.target/arm/pr88167.c: New test.

If everything is ok for trunk, could someone commit it on my behalf?

Best regards,
    Mihail

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH, arm][PR88167] Fix __builtin_return_address returns invalid address
  2018-12-18  9:32 [PATCH, arm][PR88167] Fix __builtin_return_address returns invalid address Mihail Ionescu
@ 2018-12-18 12:54 ` Mihail Ionescu
  2019-04-26  9:26   ` [PATCH, arm, ping][PR88167] " Mihail Ionescu
  2019-05-08 14:39   ` [PATCH, arm][PR88167] " Richard Earnshaw (lists)
  0 siblings, 2 replies; 4+ messages in thread
From: Mihail Ionescu @ 2018-12-18 12:54 UTC (permalink / raw)
  To: Ramana Radhakrishnan, gcc-patches, Richard Earnshaw, Kyrylo Tkachov

[-- Attachment #1: Type: text/plain, Size: 1051 bytes --]



On 12/18/2018 09:32 AM, Mihail Ionescu wrote:
> Hi All,
> 
> In Thumb mode when the function prologue gets expanded, in case of a 
> multiple register push, additional mov instructions are generated to 
> save the high registers which result in lr getting overwritten before 
> it's value can be used to retrieve the return address.
> 
> The fix consists of detecting if lr is alive after the prologue, in 
> which case, the lr register won't be used as a scratch.
> 
> Regression tested on arm-none-eabi.
> 
> gcc/ChangeLog:
> 2018-11-23  Mihail Ionescu  <mihail.ionescu@arm.com>
> 
>      PR target/88167
>      * config/arm/arm.c: Add lr liveness check.
> 
> gcc/testsuite/ChangeLog
> 2018-11-23  Mihail Ionescu  <mihail.ionescu@arm.com>
> 
>      PR target/88167
>      * gcc.target/arm/pr88167.c: New test.
> 
> If everything is ok for trunk, could someone commit it on my behalf?
> 
> Best regards,
>     Mihail

Hi All,

Sorry, I forgot to attach the diff.

Regards,
  Mihail

[-- Attachment #2: diff --]
[-- Type: text/plain, Size: 1470 bytes --]

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 8393f0b87f34c04c9dcc89c63d2e9bbd042c969c..b5c5942791530bc83f54ec96ed3c9c3838080e0f 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -25186,7 +25186,10 @@ thumb1_expand_prologue (void)
 	 even if they can be pushed.  This is to avoid using them to stash the high
 	 registers.  Such kind of stash may clobber the use of arguments.  */
       pushable_regs = l_mask & (~arg_regs_mask);
-      if (lr_needs_saving)
+      bool lr_alive = REGNO_REG_SET_P (df_get_live_out (
+					ENTRY_BLOCK_PTR_FOR_FN (cfun)), LR_REGNUM);
+
+      if (lr_needs_saving || lr_alive)
 	pushable_regs &= ~(1 << LR_REGNUM);
 
       if (pushable_regs == 0)
diff --git a/gcc/testsuite/gcc.target/arm/pr88167.c b/gcc/testsuite/gcc.target/arm/pr88167.c
new file mode 100644
index 0000000000000000000000000000000000000000..e0023716e0010ef3f09878fd6fa1a70f727228b4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr88167.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_thumb1_ok } */
+/* { dg-options "-mcpu=cortex-m0 -O2" }  */
+
+__attribute__ ((used))
+void *retaddr;
+
+__attribute__ ((noinline))
+void foo (void) {
+  retaddr = __builtin_return_address (0);
+
+  /* Used for enforcing registers stacking.  */
+  asm volatile ("" : : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
+			 "r8", "r9", "r10", "r11", "r12");
+}
+
+/* { dg-final { scan-assembler-not "mov\tlr," } } */

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH, arm, ping][PR88167] Fix __builtin_return_address returns invalid address
  2018-12-18 12:54 ` Mihail Ionescu
@ 2019-04-26  9:26   ` Mihail Ionescu
  2019-05-08 14:39   ` [PATCH, arm][PR88167] " Richard Earnshaw (lists)
  1 sibling, 0 replies; 4+ messages in thread
From: Mihail Ionescu @ 2019-04-26  9:26 UTC (permalink / raw)
  To: Ramana Radhakrishnan, gcc-patches, Richard Earnshaw, Kyrylo Tkachov

Ping?

Best regards,

Mihail

On 12/18/2018 12:53 PM, Mihail Ionescu wrote:
> 
> 
> On 12/18/2018 09:32 AM, Mihail Ionescu wrote:
>> Hi All,
>>
>> In Thumb mode when the function prologue gets expanded, in case of a 
>> multiple register push, additional mov instructions are generated to 
>> save the high registers which result in lr getting overwritten before 
>> it's value can be used to retrieve the return address.
>>
>> The fix consists of detecting if lr is alive after the prologue, in 
>> which case, the lr register won't be used as a scratch.
>>
>> Regression tested on arm-none-eabi.
>>
>> gcc/ChangeLog:
>> 2018-11-23  Mihail Ionescu  <mihail.ionescu@arm.com>
>>
>>      PR target/88167
>>      * config/arm/arm.c: Add lr liveness check.
>>
>> gcc/testsuite/ChangeLog
>> 2018-11-23  Mihail Ionescu  <mihail.ionescu@arm.com>
>>
>>      PR target/88167
>>      * gcc.target/arm/pr88167.c: New test.
>>
>> If everything is ok for trunk, could someone commit it on my behalf?
>>
>> Best regards,
>>     Mihail
> 
> Hi All,
> 
> Sorry, I forgot to attach the diff.
> 
> Regards,
>   Mihail

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH, arm][PR88167] Fix __builtin_return_address returns invalid address
  2018-12-18 12:54 ` Mihail Ionescu
  2019-04-26  9:26   ` [PATCH, arm, ping][PR88167] " Mihail Ionescu
@ 2019-05-08 14:39   ` Richard Earnshaw (lists)
  1 sibling, 0 replies; 4+ messages in thread
From: Richard Earnshaw (lists) @ 2019-05-08 14:39 UTC (permalink / raw)
  To: Mihail Ionescu, Ramana Radhakrishnan, gcc-patches, Kyrylo Tkachov

[-- Attachment #1: Type: text/plain, Size: 3145 bytes --]

On 18/12/2018 12:53, Mihail Ionescu wrote:
> 
> 
> On 12/18/2018 09:32 AM, Mihail Ionescu wrote:
>> Hi All,
>>
>> In Thumb mode when the function prologue gets expanded, in case of a
>> multiple register push, additional mov instructions are generated to
>> save the high registers which result in lr getting overwritten before
>> it's value can be used to retrieve the return address.
>>
>> The fix consists of detecting if lr is alive after the prologue, in
>> which case, the lr register won't be used as a scratch.
>>
>> Regression tested on arm-none-eabi.
>>
>> gcc/ChangeLog:
>> 2018-11-23  Mihail Ionescu  <mihail.ionescu@arm.com>
>>
>>      PR target/88167
>>      * config/arm/arm.c: Add lr liveness check.
>>
>> gcc/testsuite/ChangeLog
>> 2018-11-23  Mihail Ionescu  <mihail.ionescu@arm.com>
>>
>>      PR target/88167
>>      * gcc.target/arm/pr88167.c: New test.
>>
>> If everything is ok for trunk, could someone commit it on my behalf?
>>
>> Best regards,
>>     Mihail

Hi Mihail,

Thanks for the patch.  There are a couple of minor issues with your
patch.  Firstly, the name of the variable doesn't really describe its
use clearly.  It wasn't obvious from your changes what was the case
being considered and it had to be derived from looking at the testcase
and bug report.  A comment on the code would have helped clarify things
somewhat.

Secondly the testcase cannot simply add options like "-mcpu=cortex-m0"
to the command-line options, even after testing for thumb1 being OK.
The arm_thumb1_ok test says that it is OK to add -mthumb to the options,
but it doesn't allow you to change much else, especially anything
related to the CPU or architecture.

Whilst reviewing this patch I noticed that we could probably do better
when generating prologue and epilogue code in situations similar to
this.  Specifically, that we could often make more use of the low
registers rather than forcing a work register and then pushing (and
popping) every high register one at a time.  So what I'm committing
below adds this to your original changes (with some slight tweaking).

2019-05-08  Mihail Ionescu  <mihail.ionescu@arm.com>
	    Richard Earnshaw  <rearnsha@arm.com>

gcc:

	PR target/88167
	* config/arm/arm.c (thumb1_prologue_unused_call_clobbered_lo_regs): New
	function.
	(thumb1_epilogue_unused_call_clobbered_lo_regs): New function.
	(thumb1_compute_save_core_reg_mask): Don't force a spare work
	register if both the epilogue and prologue can use call-clobbered
	regs.
	(thumb1_unexpanded_epilogue): Use
thumb1_epilogue_unused_call_clobbered_lo_regs.
	Reverse the logic for picking temporaries for restoring high regs
	to match that of the prologue where possible.
	(thumb1_expand_prologue): Add any usable call-clobbered low
	registers to the list of work registers.  Detect if the return
	address is still live at the end of the prologue and avoid using
	it for a work register if so.  If the return address is not
	live, add LR to the list of pushable regs after the first pass.

testsuite:

	PR target/88167
	* gcc.target/arm/pr88167-1.c: New test.
	* gcc.target/arm/pr88167-2.c: New test.

R.

[-- Attachment #2: pr88167.patch --]
[-- Type: text/x-patch, Size: 7026 bytes --]

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 45abcd89963..91bb65130b8 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -19670,6 +19670,35 @@ arm_compute_save_core_reg_mask (void)
   return save_reg_mask;
 }
 
+/* Return a mask for the call-clobbered low registers that are unused
+   at the end of the prologue.  */
+static unsigned long
+thumb1_prologue_unused_call_clobbered_lo_regs (void)
+{
+  unsigned long mask = 0;
+
+  for (int reg = 0; reg <= LAST_LO_REGNUM; reg++)
+    if (!callee_saved_reg_p (reg)
+	&& !REGNO_REG_SET_P (df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN (cfun)),
+			     reg))
+      mask |= 1 << reg;
+  return mask;
+}
+
+/* Similarly for the start of the epilogue.  */
+static unsigned long
+thumb1_epilogue_unused_call_clobbered_lo_regs (void)
+{
+  unsigned long mask = 0;
+
+  for (int reg = 0; reg <= LAST_LO_REGNUM; reg++)
+    if (!callee_saved_reg_p (reg)
+	&& !REGNO_REG_SET_P (df_get_live_in (EXIT_BLOCK_PTR_FOR_FN (cfun)),
+			     reg))
+      mask |= 1 << reg;
+  return mask;
+}
+
 /* Compute a bit mask of which core registers need to be
    saved on the stack for the current function.  */
 static unsigned long
@@ -19701,10 +19730,19 @@ thumb1_compute_save_core_reg_mask (void)
   if (mask & 0xff || thumb_force_lr_save ())
     mask |= (1 << LR_REGNUM);
 
-  /* Make sure we have a low work register if we need one.
-     We will need one if we are going to push a high register,
-     but we are not currently intending to push a low register.  */
+  bool call_clobbered_scratch
+    = (thumb1_prologue_unused_call_clobbered_lo_regs ()
+       && thumb1_epilogue_unused_call_clobbered_lo_regs ());
+
+  /* Make sure we have a low work register if we need one.  We will
+     need one if we are going to push a high register, but we are not
+     currently intending to push a low register.  However if both the
+     prologue and epilogue have a spare call-clobbered low register,
+     then we won't need to find an additional work register.  It does
+     not need to be the same register in the prologue and
+     epilogue.  */
   if ((mask & 0xff) == 0
+      && !call_clobbered_scratch
       && ((mask & 0x0f00) || TARGET_BACKTRACE))
     {
       /* Use thumb_find_work_register to choose which register
@@ -24930,12 +24968,7 @@ thumb1_unexpanded_epilogue (void)
       unsigned long mask = live_regs_mask & 0xff;
       int next_hi_reg;
 
-      /* The available low registers depend on the size of the value we are
-         returning.  */
-      if (size <= 12)
-	mask |=  1 << 3;
-      if (size <= 8)
-	mask |= 1 << 2;
+      mask |= thumb1_epilogue_unused_call_clobbered_lo_regs ();
 
       if (mask == 0)
 	/* Oh dear!  We have no low registers into which we can pop
@@ -24943,7 +24976,7 @@ thumb1_unexpanded_epilogue (void)
 	internal_error
 	  ("no low registers available for popping high registers");
 
-      for (next_hi_reg = 8; next_hi_reg < 13; next_hi_reg++)
+      for (next_hi_reg = 12; next_hi_reg > LAST_LO_REGNUM; next_hi_reg--)
 	if (live_regs_mask & (1 << next_hi_reg))
 	  break;
 
@@ -24951,7 +24984,7 @@ thumb1_unexpanded_epilogue (void)
 	{
 	  /* Find lo register(s) into which the high register(s) can
              be popped.  */
-	  for (regno = 0; regno <= LAST_LO_REGNUM; regno++)
+	  for (regno = LAST_LO_REGNUM; regno >= 0; regno--)
 	    {
 	      if (mask & (1 << regno))
 		high_regs_pushed--;
@@ -24959,20 +24992,22 @@ thumb1_unexpanded_epilogue (void)
 		break;
 	    }
 
-	  mask &= (2 << regno) - 1;	/* A noop if regno == 8 */
+	  if (high_regs_pushed == 0 && regno >= 0)
+	    mask &= ~((1 << regno) - 1);
 
 	  /* Pop the values into the low register(s).  */
 	  thumb_pop (asm_out_file, mask);
 
 	  /* Move the value(s) into the high registers.  */
-	  for (regno = 0; regno <= LAST_LO_REGNUM; regno++)
+	  for (regno = LAST_LO_REGNUM; regno >= 0; regno--)
 	    {
 	      if (mask & (1 << regno))
 		{
 		  asm_fprintf (asm_out_file, "\tmov\t%r, %r\n", next_hi_reg,
 			       regno);
 
-		  for (next_hi_reg++; next_hi_reg < 13; next_hi_reg++)
+		  for (next_hi_reg--; next_hi_reg > LAST_LO_REGNUM;
+		       next_hi_reg--)
 		    if (live_regs_mask & (1 << next_hi_reg))
 		      break;
 		}
@@ -25354,10 +25389,20 @@ thumb1_expand_prologue (void)
 	  break;
 
       /* Here we need to mask out registers used for passing arguments
-	 even if they can be pushed.  This is to avoid using them to stash the high
-	 registers.  Such kind of stash may clobber the use of arguments.  */
+	 even if they can be pushed.  This is to avoid using them to
+	 stash the high registers.  Such kind of stash may clobber the
+	 use of arguments.  */
       pushable_regs = l_mask & (~arg_regs_mask);
-      if (lr_needs_saving)
+      pushable_regs |= thumb1_prologue_unused_call_clobbered_lo_regs ();
+
+      /* Normally, LR can be used as a scratch register once it has been
+	 saved; but if the function examines its own return address then
+	 the value is still live and we need to avoid using it.  */
+      bool return_addr_live
+	= REGNO_REG_SET_P (df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN (cfun)),
+			   LR_REGNUM);
+
+      if (lr_needs_saving || return_addr_live)
 	pushable_regs &= ~(1 << LR_REGNUM);
 
       if (pushable_regs == 0)
@@ -25398,6 +25443,11 @@ thumb1_expand_prologue (void)
 	      push_mask |= 1 << LR_REGNUM;
 	      real_regs_mask |= 1 << LR_REGNUM;
 	      lr_needs_saving = false;
+	      /* If the return address is not live at this point, we
+		 can add LR to the list of registers that we can use
+		 for pushes.  */
+	      if (!return_addr_live)
+		pushable_regs |= 1 << LR_REGNUM;
 	    }
 
 	  insn = thumb1_emit_multi_reg_push (push_mask, real_regs_mask);
diff --git a/gcc/testsuite/gcc.target/arm/pr88167-1.c b/gcc/testsuite/gcc.target/arm/pr88167-1.c
new file mode 100644
index 00000000000..517a86d6e4b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr88167-1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_thumb1_ok } */
+/* { dg-options "-O2 -mthumb" }  */
+
+void *retaddr;
+
+void foo (void) {
+  retaddr = __builtin_return_address (0);
+
+  /* Used for enforcing registers stacking.  */
+  asm volatile ("" : : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
+			 "r8", "r9", "r10", "r11", "r12");
+}
+
+/* { dg-final { scan-assembler-not "mov\tlr," } } */
diff --git a/gcc/testsuite/gcc.target/arm/pr88167-2.c b/gcc/testsuite/gcc.target/arm/pr88167-2.c
new file mode 100644
index 00000000000..6a303345eb9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr88167-2.c
@@ -0,0 +1,18 @@
+/* { dg-do run } */
+/* { dg-options "-O2" }  */
+/* { dg-skip-if "" { ! { arm_thumb1 } } } */
+
+int __attribute__((noclone, noinline))
+foo (int a, long long b) {
+  /* Used for enforcing registers stacking.  */
+  asm volatile ("" : : : "r0", "r1", "r2", "r3",
+			 "r8", "r9", "r10", "r11", "r12");
+  return (int) b;
+}
+
+int main ()
+{
+  if (foo (1, 0x1000000000000003LL) != 3)
+    __builtin_abort ();
+  __builtin_exit (0);
+}

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-05-08 14:39 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-18  9:32 [PATCH, arm][PR88167] Fix __builtin_return_address returns invalid address Mihail Ionescu
2018-12-18 12:54 ` Mihail Ionescu
2019-04-26  9:26   ` [PATCH, arm, ping][PR88167] " Mihail Ionescu
2019-05-08 14:39   ` [PATCH, arm][PR88167] " Richard Earnshaw (lists)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).