public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH][ARM][thumb1] Reduce lr save for leaf function with non-far jump
@ 2012-12-20  9:53 Joey Ye
  2013-01-05  7:42 ` Joey Ye
  2013-04-11  9:47 ` Ramana Radhakrishnan
  0 siblings, 2 replies; 7+ messages in thread
From: Joey Ye @ 2012-12-20  9:53 UTC (permalink / raw)
  To: gcc-patches, Ramana Radhakrishnan; +Cc: Joey Ye

Current GCC thumb1 has an annoying problem that always assuming far branch.
So it forces to save lr, even when unnecessarily. The most extreme case
complained by partner is:

// compiled with "-mthumb -mcpu=cortex-m0 -Os".
void foo() { for (;;); }
=>
foo:
	push	{lr}  // Crazy!!!
.L2:
	b	.L2

The reason is that thumb1 far jump is only resolved in the very late pass
"shorten_branch". Prologue/epilogue pass doesn't actually know a branch is
far or not from its attribute. It has to conservatively save/restore lr
whenever there is a branch.

This patch tries to fix it with a simple heuristic, i.e., using function
size to decide if a far jump will likely be used. Function size information
is meaningful in prologue/epilogue pass. The heuristic uses following check
to decide if lr should be saved for far jump:

function_size * 3 >= 2048 // yes: save lr for possible far jump. No: don't
save lr for far jump

The scheme has an issue: if some corner case does break above condition,
there is no chance to fix-up but to ICE. But the heuristic condition is very
conservative. It is base on the worse normal condition that each instruction
is associated with a 4 byte literal ( (2+4)/2=3, blooming size by 3 times ).
I can't think of a real case to trigger the ICE. So I think it should work.

Other approaches than the heuristic scheme are too expensive to implement
for this small size/performance issue. I did explored some but none of them
persuaded myself.

Tests passed:
* build libgcc, libstdc++, newlib, libm
* make check-gcc with cpu=cortex-m0
* Small and extreme test cases

ChangeLog:

2012-12-20  Joey Ye  <joey.ye@arm.com>

	* config/arm/arm.c(thumb1_final_prescan_insn): 
	Assert lr save for real far jump.
	(thumb_far_jump_used_p): Count instruction size and set 
     far_jump_used.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 327ef22..ad79451 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -21790,6 +21857,11 @@ thumb1_final_prescan_insn (rtx insn)
       else if (conds != CONDS_NOCOND)
 	cfun->machine->thumb1_cc_insn = NULL_RTX;
     }
+
+    /* Check if unexpected far jump is used.  */
+    if (cfun->machine->lr_save_eliminated
+        && get_attr_far_jump (insn) == FAR_JUMP_YES)
+      internal_error("Unexpected thumb1 far jump");
 }
 
 int
@@ -21815,6 +21887,8 @@ static int
 thumb_far_jump_used_p (void)
 {
   rtx insn;
+  bool far_jump = false;
+  unsigned int func_size = 0;
 
   /* This test is only important for leaf functions.  */
   /* assert (!leaf_function_p ()); */
@@ -21870,6 +21944,26 @@ thumb_far_jump_used_p (void)
 	  && get_attr_far_jump (insn) == FAR_JUMP_YES
 	  )
 	{
+	  far_jump = true;
+	}
+      func_size += get_attr_length (insn);
+    }
+
+  /* Attribute far_jump will always be true for thumb1 before
shorten_branch
+     pass. So checking far_jump attribute before shorten_branch isn't much
+     useful.
+     
+     Following heuristic tries to estimate more accruately if a far jump
may 
+     finally be used. The heuristic is very conservative as there is no
chance
+     to roll-back the decision of not to use far jump.
+
+     Thumb1 long branch offset is -2048 to 2046. The worst case is each
2-byte
+     insn is assiociated with a 4 byte constant pool. Using function size 
+     2048/3 as the threshold is conservative enough.  */
+  if (far_jump)
+    {
+      if ((func_size * 3) >= 2048)
+        {
 	  /* Record the fact that we have decided that
 	     the function does use far jumps.  */
 	  cfun->machine->far_jump_used = 1;





^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH][ARM][thumb1] Reduce lr save for leaf function with non-far jump
  2012-12-20  9:53 [PATCH][ARM][thumb1] Reduce lr save for leaf function with non-far jump Joey Ye
@ 2013-01-05  7:42 ` Joey Ye
  2013-01-15  8:02   ` Joey Ye
  2013-04-11  9:32   ` Joey Ye
  2013-04-11  9:47 ` Ramana Radhakrishnan
  1 sibling, 2 replies; 7+ messages in thread
From: Joey Ye @ 2013-01-05  7:42 UTC (permalink / raw)
  To: Ramana Radhakrishnan; +Cc: Joey Ye, gcc-patches

Ping

> -----Original Message-----
> From: Joey Ye
> Sent: Thursday, December 20, 2012 17:53
> To: gcc-patches@gcc.gnu.org; Ramana Radhakrishnan
> Cc: Joey Ye
> Subject: [PATCH][ARM][thumb1] Reduce lr save for leaf function with non-
> far jump
> 
> Current GCC thumb1 has an annoying problem that always assuming far
> branch.
> So it forces to save lr, even when unnecessarily. The most extreme case
> complained by partner is:
> 
> // compiled with "-mthumb -mcpu=cortex-m0 -Os".
> void foo() { for (;;); }
> =>
> foo:
> 	push	{lr}  // Crazy!!!
> .L2:
> 	b	.L2
> 
> The reason is that thumb1 far jump is only resolved in the very late
> pass
> "shorten_branch". Prologue/epilogue pass doesn't actually know a branch
> is
> far or not from its attribute. It has to conservatively save/restore lr
> whenever there is a branch.
> 
> This patch tries to fix it with a simple heuristic, i.e., using function
> size to decide if a far jump will likely be used. Function size
> information
> is meaningful in prologue/epilogue pass. The heuristic uses following
> check
> to decide if lr should be saved for far jump:
> 
> function_size * 3 >= 2048 // yes: save lr for possible far jump. No:
> don't
> save lr for far jump
> 
> The scheme has an issue: if some corner case does break above condition,
> there is no chance to fix-up but to ICE. But the heuristic condition is
> very
> conservative. It is base on the worse normal condition that each
> instruction
> is associated with a 4 byte literal ( (2+4)/2=3, blooming size by 3
> times ).
> I can't think of a real case to trigger the ICE. So I think it should
> work.
> 
> Other approaches than the heuristic scheme are too expensive to
> implement
> for this small size/performance issue. I did explored some but none of
> them
> persuaded myself.
> 
> Tests passed:
> * build libgcc, libstdc++, newlib, libm
> * make check-gcc with cpu=cortex-m0
> * Small and extreme test cases
> 
> ChangeLog:
> 
> 2012-12-20  Joey Ye  <joey.ye@arm.com>
> 
> 	* config/arm/arm.c(thumb1_final_prescan_insn):
> 	Assert lr save for real far jump.
> 	(thumb_far_jump_used_p): Count instruction size and set
>      far_jump_used.
> 
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 327ef22..ad79451 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -21790,6 +21857,11 @@ thumb1_final_prescan_insn (rtx insn)
>        else if (conds != CONDS_NOCOND)
>  	cfun->machine->thumb1_cc_insn = NULL_RTX;
>      }
> +
> +    /* Check if unexpected far jump is used.  */
> +    if (cfun->machine->lr_save_eliminated
> +        && get_attr_far_jump (insn) == FAR_JUMP_YES)
> +      internal_error("Unexpected thumb1 far jump");
>  }
> 
>  int
> @@ -21815,6 +21887,8 @@ static int
>  thumb_far_jump_used_p (void)
>  {
>    rtx insn;
> +  bool far_jump = false;
> +  unsigned int func_size = 0;
> 
>    /* This test is only important for leaf functions.  */
>    /* assert (!leaf_function_p ()); */
> @@ -21870,6 +21944,26 @@ thumb_far_jump_used_p (void)
>  	  && get_attr_far_jump (insn) == FAR_JUMP_YES
>  	  )
>  	{
> +	  far_jump = true;
> +	}
> +      func_size += get_attr_length (insn);
> +    }
> +
> +  /* Attribute far_jump will always be true for thumb1 before
> shorten_branch
> +     pass. So checking far_jump attribute before shorten_branch isn't
> much
> +     useful.
> +
> +     Following heuristic tries to estimate more accruately if a far
> jump
> may
> +     finally be used. The heuristic is very conservative as there is no
> chance
> +     to roll-back the decision of not to use far jump.
> +
> +     Thumb1 long branch offset is -2048 to 2046. The worst case is each
> 2-byte
> +     insn is assiociated with a 4 byte constant pool. Using function
> size
> +     2048/3 as the threshold is conservative enough.  */
> +  if (far_jump)
> +    {
> +      if ((func_size * 3) >= 2048)
> +        {
>  	  /* Record the fact that we have decided that
>  	     the function does use far jumps.  */
>  	  cfun->machine->far_jump_used = 1;
> 
> 
> 




^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH][ARM][thumb1] Reduce lr save for leaf function with non-far jump
  2013-01-05  7:42 ` Joey Ye
@ 2013-01-15  8:02   ` Joey Ye
  2013-04-11  9:32   ` Joey Ye
  1 sibling, 0 replies; 7+ messages in thread
From: Joey Ye @ 2013-01-15  8:02 UTC (permalink / raw)
  To: Ramana Radhakrishnan; +Cc: gcc-patches

Ping^2

> -----Original Message-----
> From: Joey Ye
> Sent: Saturday, January 05, 2013 15:41
> To: Ramana Radhakrishnan
> Cc: Joey Ye; gcc-patches@gcc.gnu.org
> Subject: RE: [PATCH][ARM][thumb1] Reduce lr save for leaf function with
> non-far jump
> 
> Ping
> 
> > -----Original Message-----
> > From: Joey Ye
> > Sent: Thursday, December 20, 2012 17:53
> > To: gcc-patches@gcc.gnu.org; Ramana Radhakrishnan
> > Cc: Joey Ye
> > Subject: [PATCH][ARM][thumb1] Reduce lr save for leaf function with
> non-
> > far jump
> >
> > Current GCC thumb1 has an annoying problem that always assuming far
> > branch.
> > So it forces to save lr, even when unnecessarily. The most extreme
> case
> > complained by partner is:
> >
> > // compiled with "-mthumb -mcpu=cortex-m0 -Os".
> > void foo() { for (;;); }
> > =>
> > foo:
> > 	push	{lr}  // Crazy!!!
> > .L2:
> > 	b	.L2
> >
> > The reason is that thumb1 far jump is only resolved in the very late
> > pass
> > "shorten_branch". Prologue/epilogue pass doesn't actually know a
> branch
> > is
> > far or not from its attribute. It has to conservatively save/restore
> lr
> > whenever there is a branch.
> >
> > This patch tries to fix it with a simple heuristic, i.e., using
> function
> > size to decide if a far jump will likely be used. Function size
> > information
> > is meaningful in prologue/epilogue pass. The heuristic uses following
> > check
> > to decide if lr should be saved for far jump:
> >
> > function_size * 3 >= 2048 // yes: save lr for possible far jump. No:
> > don't
> > save lr for far jump
> >
> > The scheme has an issue: if some corner case does break above
> condition,
> > there is no chance to fix-up but to ICE. But the heuristic condition
> is
> > very
> > conservative. It is base on the worse normal condition that each
> > instruction
> > is associated with a 4 byte literal ( (2+4)/2=3, blooming size by 3
> > times ).
> > I can't think of a real case to trigger the ICE. So I think it should
> > work.
> >
> > Other approaches than the heuristic scheme are too expensive to
> > implement
> > for this small size/performance issue. I did explored some but none of
> > them
> > persuaded myself.
> >
> > Tests passed:
> > * build libgcc, libstdc++, newlib, libm
> > * make check-gcc with cpu=cortex-m0
> > * Small and extreme test cases
> >
> > ChangeLog:
> >
> > 2012-12-20  Joey Ye  <joey.ye@arm.com>
> >
> > 	* config/arm/arm.c(thumb1_final_prescan_insn):
> > 	Assert lr save for real far jump.
> > 	(thumb_far_jump_used_p): Count instruction size and set
> >      far_jump_used.
> >
> > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> > index 327ef22..ad79451 100644
> > --- a/gcc/config/arm/arm.c
> > +++ b/gcc/config/arm/arm.c
> > @@ -21790,6 +21857,11 @@ thumb1_final_prescan_insn (rtx insn)
> >        else if (conds != CONDS_NOCOND)
> >  	cfun->machine->thumb1_cc_insn = NULL_RTX;
> >      }
> > +
> > +    /* Check if unexpected far jump is used.  */
> > +    if (cfun->machine->lr_save_eliminated
> > +        && get_attr_far_jump (insn) == FAR_JUMP_YES)
> > +      internal_error("Unexpected thumb1 far jump");
> >  }
> >
> >  int
> > @@ -21815,6 +21887,8 @@ static int
> >  thumb_far_jump_used_p (void)
> >  {
> >    rtx insn;
> > +  bool far_jump = false;
> > +  unsigned int func_size = 0;
> >
> >    /* This test is only important for leaf functions.  */
> >    /* assert (!leaf_function_p ()); */
> > @@ -21870,6 +21944,26 @@ thumb_far_jump_used_p (void)
> >  	  && get_attr_far_jump (insn) == FAR_JUMP_YES
> >  	  )
> >  	{
> > +	  far_jump = true;
> > +	}
> > +      func_size += get_attr_length (insn);
> > +    }
> > +
> > +  /* Attribute far_jump will always be true for thumb1 before
> > shorten_branch
> > +     pass. So checking far_jump attribute before shorten_branch isn't
> > much
> > +     useful.
> > +
> > +     Following heuristic tries to estimate more accruately if a far
> > jump
> > may
> > +     finally be used. The heuristic is very conservative as there is
> no
> > chance
> > +     to roll-back the decision of not to use far jump.
> > +
> > +     Thumb1 long branch offset is -2048 to 2046. The worst case is
> each
> > 2-byte
> > +     insn is assiociated with a 4 byte constant pool. Using function
> > size
> > +     2048/3 as the threshold is conservative enough.  */
> > +  if (far_jump)
> > +    {
> > +      if ((func_size * 3) >= 2048)
> > +        {
> >  	  /* Record the fact that we have decided that
> >  	     the function does use far jumps.  */
> >  	  cfun->machine->far_jump_used = 1;
> >
> >
> >
> 
> 




^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH][ARM][thumb1] Reduce lr save for leaf function with non-far jump
  2013-01-05  7:42 ` Joey Ye
  2013-01-15  8:02   ` Joey Ye
@ 2013-04-11  9:32   ` Joey Ye
  1 sibling, 0 replies; 7+ messages in thread
From: Joey Ye @ 2013-04-11  9:32 UTC (permalink / raw)
  To: Ramana Radhakrishnan; +Cc: gcc-patches

Ping ^ 2

> -----Original Message-----
> From: Joey Ye
> Sent: Saturday, January 05, 2013 3:41 PM
> To: Ramana Radhakrishnan
> Cc: Joey Ye; gcc-patches@gcc.gnu.org
> Subject: RE: [PATCH][ARM][thumb1] Reduce lr save for leaf function with
> non-far jump
> 
> Ping
> 
> > -----Original Message-----
> > From: Joey Ye
> > Sent: Thursday, December 20, 2012 17:53
> > To: gcc-patches@gcc.gnu.org; Ramana Radhakrishnan
> > Cc: Joey Ye
> > Subject: [PATCH][ARM][thumb1] Reduce lr save for leaf function with
> > non- far jump
> >
> > Current GCC thumb1 has an annoying problem that always assuming far
> > branch.
> > So it forces to save lr, even when unnecessarily. The most extreme
> > case complained by partner is:
> >
> > // compiled with "-mthumb -mcpu=cortex-m0 -Os".
> > void foo() { for (;;); }
> > =>
> > foo:
> > 	push	{lr}  // Crazy!!!
> > .L2:
> > 	b	.L2
> >
> > The reason is that thumb1 far jump is only resolved in the very late
> > pass "shorten_branch". Prologue/epilogue pass doesn't actually know a
> > branch is far or not from its attribute. It has to conservatively
> > save/restore lr whenever there is a branch.
> >
> > This patch tries to fix it with a simple heuristic, i.e., using
> > function size to decide if a far jump will likely be used. Function
> > size information is meaningful in prologue/epilogue pass. The
> > heuristic uses following check to decide if lr should be saved for far
> > jump:
> >
> > function_size * 3 >= 2048 // yes: save lr for possible far jump. No:
> > don't
> > save lr for far jump
> >
> > The scheme has an issue: if some corner case does break above
> > condition, there is no chance to fix-up but to ICE. But the heuristic
> > condition is very conservative. It is base on the worse normal
> > condition that each instruction is associated with a 4 byte literal (
> > (2+4)/2=3, blooming size by 3 times ).
> > I can't think of a real case to trigger the ICE. So I think it should
> > work.
> >
> > Other approaches than the heuristic scheme are too expensive to
> > implement for this small size/performance issue. I did explored some
> > but none of them persuaded myself.
> >
> > Tests passed:
> > * build libgcc, libstdc++, newlib, libm
> > * make check-gcc with cpu=cortex-m0
> > * Small and extreme test cases
> >
> > ChangeLog:
> >
> > 2012-12-20  Joey Ye  <joey.ye@arm.com>
> >
> > 	* config/arm/arm.c(thumb1_final_prescan_insn):
> > 	Assert lr save for real far jump.
> > 	(thumb_far_jump_used_p): Count instruction size and set
> >      far_jump_used.
> >
> > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index
> > 327ef22..ad79451 100644
> > --- a/gcc/config/arm/arm.c
> > +++ b/gcc/config/arm/arm.c
> > @@ -21790,6 +21857,11 @@ thumb1_final_prescan_insn (rtx insn)
> >        else if (conds != CONDS_NOCOND)
> >  	cfun->machine->thumb1_cc_insn = NULL_RTX;
> >      }
> > +
> > +    /* Check if unexpected far jump is used.  */
> > +    if (cfun->machine->lr_save_eliminated
> > +        && get_attr_far_jump (insn) == FAR_JUMP_YES)
> > +      internal_error("Unexpected thumb1 far jump");
> >  }
> >
> >  int
> > @@ -21815,6 +21887,8 @@ static int
> >  thumb_far_jump_used_p (void)
> >  {
> >    rtx insn;
> > +  bool far_jump = false;
> > +  unsigned int func_size = 0;
> >
> >    /* This test is only important for leaf functions.  */
> >    /* assert (!leaf_function_p ()); */ @@ -21870,6 +21944,26 @@
> > thumb_far_jump_used_p (void)
> >  	  && get_attr_far_jump (insn) == FAR_JUMP_YES
> >  	  )
> >  	{
> > +	  far_jump = true;
> > +	}
> > +      func_size += get_attr_length (insn);
> > +    }
> > +
> > +  /* Attribute far_jump will always be true for thumb1 before
> > shorten_branch
> > +     pass. So checking far_jump attribute before shorten_branch isn't
> > much
> > +     useful.
> > +
> > +     Following heuristic tries to estimate more accruately if a far
> > jump
> > may
> > +     finally be used. The heuristic is very conservative as there is
> > + no
> > chance
> > +     to roll-back the decision of not to use far jump.
> > +
> > +     Thumb1 long branch offset is -2048 to 2046. The worst case is
> > + each
> > 2-byte
> > +     insn is assiociated with a 4 byte constant pool. Using function
> > size
> > +     2048/3 as the threshold is conservative enough.  */  if
> > + (far_jump)
> > +    {
> > +      if ((func_size * 3) >= 2048)
> > +        {
> >  	  /* Record the fact that we have decided that
> >  	     the function does use far jumps.  */
> >  	  cfun->machine->far_jump_used = 1;
> >
> >
> >
> 
> 




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH][ARM][thumb1] Reduce lr save for leaf function with non-far jump
  2012-12-20  9:53 [PATCH][ARM][thumb1] Reduce lr save for leaf function with non-far jump Joey Ye
  2013-01-05  7:42 ` Joey Ye
@ 2013-04-11  9:47 ` Ramana Radhakrishnan
  2013-04-15 10:05   ` Joey Ye
  1 sibling, 1 reply; 7+ messages in thread
From: Ramana Radhakrishnan @ 2013-04-11  9:47 UTC (permalink / raw)
  To: Joey Ye; +Cc: gcc-patches

On 12/20/12 09:53, Joey Ye wrote:
> Current GCC thumb1 has an annoying problem that always assuming far branch.
> So it forces to save lr, even when unnecessarily. The most extreme case
> complained by partner is:
>
> // compiled with "-mthumb -mcpu=cortex-m0 -Os".
> void foo() { for (;;); }
> =>
> foo:
> 	push	{lr}  // Crazy!!!
> .L2:
> 	b	.L2
>
> The reason is that thumb1 far jump is only resolved in the very late pass
> "shorten_branch". Prologue/epilogue pass doesn't actually know a branch is
> far or not from its attribute. It has to conservatively save/restore lr
> whenever there is a branch.
>
> This patch tries to fix it with a simple heuristic, i.e., using function
> size to decide if a far jump will likely be used. Function size information
> is meaningful in prologue/epilogue pass. The heuristic uses following check
> to decide if lr should be saved for far jump:
>
> function_size * 3 >= 2048 // yes: save lr for possible far jump. No: don't
> save lr for far jump
>
> The scheme has an issue: if some corner case does break above condition,
> there is no chance to fix-up but to ICE. But the heuristic condition is very
> conservative. It is base on the worse normal condition that each instruction
> is associated with a 4 byte literal ( (2+4)/2=3, blooming size by 3 times ).
> I can't think of a real case to trigger the ICE. So I think it should work.
>
> Other approaches than the heuristic scheme are too expensive to implement
> for this small size/performance issue. I did explored some but none of them
> persuaded myself.
>
> Tests passed:
> * build libgcc, libstdc++, newlib, libm
> * make check-gcc with cpu=cortex-m0
> * Small and extreme test cases
>
> ChangeLog:
>
> 2012-12-20  Joey Ye  <joey.ye@arm.com>
>
> 	* config/arm/arm.c(thumb1_final_prescan_insn):
> 	Assert lr save for real far jump.
> 	(thumb_far_jump_used_p): Count instruction size and set
>       far_jump_used.
>
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 327ef22..ad79451 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -21790,6 +21857,11 @@ thumb1_final_prescan_insn (rtx insn)
>         else if (conds != CONDS_NOCOND)
>   	cfun->machine->thumb1_cc_insn = NULL_RTX;
>       }
> +
> +    /* Check if unexpected far jump is used.  */
> +    if (cfun->machine->lr_save_eliminated
> +        && get_attr_far_jump (insn) == FAR_JUMP_YES)
> +      internal_error("Unexpected thumb1 far jump");
>   }
>
>   int
> @@ -21815,6 +21887,8 @@ static int
>   thumb_far_jump_used_p (void)
>   {
>     rtx insn;
> +  bool far_jump = false;
> +  unsigned int func_size = 0;
>
>     /* This test is only important for leaf functions.  */
>     /* assert (!leaf_function_p ()); */
> @@ -21870,6 +21944,26 @@ thumb_far_jump_used_p (void)
>   	  && get_attr_far_jump (insn) == FAR_JUMP_YES
>   	  )
>   	{
> +	  far_jump = true;
> +	}
> +      func_size += get_attr_length (insn);
> +    }
> +
> +  /* Attribute far_jump will always be true for thumb1 before
> shorten_branch
> +     pass. So checking far_jump attribute before shorten_branch isn't much
> +     useful.
> +
> +     Following heuristic tries to estimate more accruately if a far jump
> may
> +     finally be used. The heuristic is very conservative as there is no
> chance
> +     to roll-back the decision of not to use far jump.
> +
> +     Thumb1 long branch offset is -2048 to 2046. The worst case is each
> 2-byte
> +     insn is assiociated with a 4 byte constant pool. Using function size
> +     2048/3 as the threshold is conservative enough.  */
> +  if (far_jump)
> +    {
> +      if ((func_size * 3) >= 2048)
> +        {
>   	  /* Record the fact that we have decided that
>   	     the function does use far jumps.  */
>   	  cfun->machine->far_jump_used = 1;
>
>
>

Check for 80 character line length in the comments above - I can never 
tell if it is my mail client or yours.

Otherwise ok if no regressions..

regards
Ramana
>


^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH][ARM][thumb1] Reduce lr save for leaf function with non-far jump
  2013-04-11  9:47 ` Ramana Radhakrishnan
@ 2013-04-15 10:05   ` Joey Ye
  0 siblings, 0 replies; 7+ messages in thread
From: Joey Ye @ 2013-04-15 10:05 UTC (permalink / raw)
  To: Ramana Radhakrishnan; +Cc: gcc-patches



> -----Original Message-----
> From: Ramana Radhakrishnan
> Sent: Thursday, April 11, 2013 4:40 PM
> To: Joey Ye
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH][ARM][thumb1] Reduce lr save for leaf function with
> non-far jump
> 
> On 12/20/12 09:53, Joey Ye wrote:
> > Current GCC thumb1 has an annoying problem that always assuming far
> branch.
> > So it forces to save lr, even when unnecessarily. The most extreme
> > case complained by partner is:
> >
> > // compiled with "-mthumb -mcpu=cortex-m0 -Os".
> > void foo() { for (;;); }
> > =>
> > foo:
> > 	push	{lr}  // Crazy!!!
> > .L2:
> > 	b	.L2
> >
> > The reason is that thumb1 far jump is only resolved in the very late
> > pass "shorten_branch". Prologue/epilogue pass doesn't actually know a
> > branch is far or not from its attribute. It has to conservatively
> > save/restore lr whenever there is a branch.
> >
> > This patch tries to fix it with a simple heuristic, i.e., using
> > function size to decide if a far jump will likely be used. Function
> > size information is meaningful in prologue/epilogue pass. The
> > heuristic uses following check to decide if lr should be saved for far
jump:
> >
> > function_size * 3 >= 2048 // yes: save lr for possible far jump. No:
> > don't save lr for far jump
> >
> > The scheme has an issue: if some corner case does break above
> > condition, there is no chance to fix-up but to ICE. But the heuristic
> > condition is very conservative. It is base on the worse normal
> > condition that each instruction is associated with a 4 byte literal (
(2+4)/2=3,
> blooming size by 3 times ).
> > I can't think of a real case to trigger the ICE. So I think it should
work.
> >
> > Other approaches than the heuristic scheme are too expensive to
> > implement for this small size/performance issue. I did explored some
> > but none of them persuaded myself.
> >
> > Tests passed:
> > * build libgcc, libstdc++, newlib, libm
> > * make check-gcc with cpu=cortex-m0
> > * Small and extreme test cases
> >
> > ChangeLog:
> >
> > 2012-12-20  Joey Ye  <joey.ye@arm.com>
> >
> > 	* config/arm/arm.c(thumb1_final_prescan_insn):
> > 	Assert lr save for real far jump.
> > 	(thumb_far_jump_used_p): Count instruction size and set
> >       far_jump_used.
> >
> > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index
> > 327ef22..ad79451 100644
> > --- a/gcc/config/arm/arm.c
> > +++ b/gcc/config/arm/arm.c
> > @@ -21790,6 +21857,11 @@ thumb1_final_prescan_insn (rtx insn)
> >         else if (conds != CONDS_NOCOND)
> >   	cfun->machine->thumb1_cc_insn = NULL_RTX;
> >       }
> > +
> > +    /* Check if unexpected far jump is used.  */
> > +    if (cfun->machine->lr_save_eliminated
> > +        && get_attr_far_jump (insn) == FAR_JUMP_YES)
> > +      internal_error("Unexpected thumb1 far jump");
> >   }
> >
> >   int
> > @@ -21815,6 +21887,8 @@ static int
> >   thumb_far_jump_used_p (void)
> >   {
> >     rtx insn;
> > +  bool far_jump = false;
> > +  unsigned int func_size = 0;
> >
> >     /* This test is only important for leaf functions.  */
> >     /* assert (!leaf_function_p ()); */ @@ -21870,6 +21944,26 @@
> > thumb_far_jump_used_p (void)
> >   	  && get_attr_far_jump (insn) == FAR_JUMP_YES
> >   	  )
> >   	{
> > +	  far_jump = true;
> > +	}
> > +      func_size += get_attr_length (insn);
> > +    }
> > +
> > +  /* Attribute far_jump will always be true for thumb1 before
> > shorten_branch
> > +     pass. So checking far_jump attribute before shorten_branch isn't
much
> > +     useful.
> > +
> > +     Following heuristic tries to estimate more accurately if a far
> > + jump
> > may
> > +     finally be used. The heuristic is very conservative as there is
> > + no
> > chance
> > +     to roll-back the decision of not to use far jump.
> > +
> > +     Thumb1 long branch offset is -2048 to 2046. The worst case is
> > + each
> > 2-byte
> > +     insn is associated with a 4 byte constant pool. Using function
size
> > +     2048/3 as the threshold is conservative enough.  */  if
> > + (far_jump)
> > +    {
> > +      if ((func_size * 3) >= 2048)
> > +        {
> >   	  /* Record the fact that we have decided that
> >   	     the function does use far jumps.  */
> >   	  cfun->machine->far_jump_used = 1;
> >
> >
> >
> 
> Check for 80 character line length in the comments above - I can never
tell if
> it is my mail client or yours.
Further shorten the lines.
> 
> Otherwise ok if no regressions..
Make check targeting Cortex-M0/M3 with qemu. No regression.

Committed as 197956

- Joey



^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH][ARM][thumb1] Reduce lr save for leaf function with non-far jump
@ 2012-12-20  6:47 Joey Ye
  0 siblings, 0 replies; 7+ messages in thread
From: Joey Ye @ 2012-12-20  6:47 UTC (permalink / raw)
  To: gcc-patches

Current GCC thumb1 has an annoying problem that always assuming far branch.
So it forces to save lr, even when unnecessarily. The most extreme case
complained by partner is:

// compiled with "-mthumb -mcpu=cortex-m0 -Os".
void foo() { for (;;); }
=>
foo:
	push	{lr}  // Crazy!!!
.L2:
	b	.L2

The reason is that thumb1 far jump is only resolved in the very late pass
"shorten_branch". Prologue/epilogue pass doesn't actually know a branch is
far or not from its attribute. It has to conservatively save/restore lr
whenever there is a branch.

This patch tries to fix it with a simple heuristic, i.e., using function
size to decide if a far jump will likely be used. Function size information
is meaningful in prologue/epilogue pass. The heuristic uses following check
to decide if lr should be saved for far jump:

function_size * 3 >= 2048 // yes: save lr for possible far jump. No: don't
save lr for far jump

The scheme has an issue: if some corner case does break above condition,
there is no chance to fix-up but to ICE. But the heuristic condition is very
conservative. It is base on the worse normal condition that each instruction
is associated with a 4 byte literal ( (2+4)/2=3, blooming size by 3 times ).
I can't think of a real case to trigger the ICE. So I think it should work.

Other approaches than the heuristic scheme are too expensive to implement
for this small size/performance issue. I did explored some but none of them
persuaded myself.

Tests passed:
* build libgcc, libstdc++, newlib, libm
* make check-gcc with cpu=cortex-m0
* Small and extreme test cases

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 327ef22..ad79451 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -21790,6 +21857,11 @@ thumb1_final_prescan_insn (rtx insn)
       else if (conds != CONDS_NOCOND)
 	cfun->machine->thumb1_cc_insn = NULL_RTX;
     }
+
+    /* Check if unexpected far jump is used.  */
+    if (cfun->machine->lr_save_eliminated
+        && get_attr_far_jump (insn) == FAR_JUMP_YES)
+      internal_error("Unexpected thumb1 far jump");
 }
 
 int
@@ -21815,6 +21887,8 @@ static int
 thumb_far_jump_used_p (void)
 {
   rtx insn;
+  bool far_jump = false;
+  unsigned int func_size = 0;
 
   /* This test is only important for leaf functions.  */
   /* assert (!leaf_function_p ()); */
@@ -21870,6 +21944,26 @@ thumb_far_jump_used_p (void)
 	  && get_attr_far_jump (insn) == FAR_JUMP_YES
 	  )
 	{
+	  far_jump = true;
+	}
+      func_size += get_attr_length (insn);
+    }
+
+  /* Attribute far_jump will always be true for thumb1 before
shorten_branch
+     pass. So checking far_jump attribute before shorten_branch isn't much
+     useful.
+     
+     Following heuristic tries to estimate more accruately if a far jump
may 
+     finally be used. The heuristic is very conservative as there is no
chance
+     to roll-back the decision of not to use far jump.
+
+     Thumb1 long branch offset is -2048 to 2046. The worst case is each
2-byte
+     insn is assiociated with a 4 byte constant pool. Using function size 
+     2048/3 as the threshold is conservative enough.  */
+  if (far_jump)
+    {
+      if ((func_size * 3) >= 2048)
+        {
 	  /* Record the fact that we have decided that
 	     the function does use far jumps.  */
 	  cfun->machine->far_jump_used = 1;





^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2013-04-15  7:46 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-12-20  9:53 [PATCH][ARM][thumb1] Reduce lr save for leaf function with non-far jump Joey Ye
2013-01-05  7:42 ` Joey Ye
2013-01-15  8:02   ` Joey Ye
2013-04-11  9:32   ` Joey Ye
2013-04-11  9:47 ` Ramana Radhakrishnan
2013-04-15 10:05   ` Joey Ye
  -- strict thread matches above, loose matches on Subject: below --
2012-12-20  6:47 Joey Ye

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).