[PATCH] Disable loop2_invariant for -Os

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH] Disable loop2_invariant for -Os
@ 2012-06-27  8:48 Zhenqiang Chen
  0 siblings, 0 replies; 9+ messages in thread
From: Zhenqiang Chen @ 2012-06-27  8:48 UTC (permalink / raw)
  To: gcc-patches

Hi,

In general, invariant motion itself can not reduce code size. But it will
change the liverange of the invariant, which might lead to more spilling.
The patch disables loop2_invariant when optimizing for size.

I measured the code size benefit for four targets based on CSiBE benchmark:

ARM: 0.33%
MIPS: 1.15%
PPC: 0.24%
X86: 0.45%

Is it OK for trunk?

Thanks!
-Zhenqiang

ChangeLog:
2012-06-27  Zhenqiang Chen <zhenqiang.chen@arm.com>

	* loop-init.c (gate_rtl_move_loop_invariants): Disable
loop2_invariant
	when optimizing function for size.

diff --git a/gcc/loop-init.c b/gcc/loop-init.c
index 03f8f61..5d8cf73 100644
--- a/gcc/loop-init.c
+++ b/gcc/loop-init.c
@@ -273,6 +273,12 @@ struct rtl_opt_pass pass_rtl_loop_done =
 static bool
 gate_rtl_move_loop_invariants (void)
 {
+  /* In general, invariant motion can not reduce code size. But it will
+     change the liverange of the invariant, which increases the register
+     pressure and might lead to more spilling.  */
+  if (optimize_function_for_size_p (cfun))
+    return false;
+
   return flag_move_loop_invariants;
 }



^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH] Disable loop2_invariant for -Os
  2012-07-03  9:32         ` Richard Guenther
@ 2012-07-09  8:40           ` Zhenqiang Chen
  0 siblings, 0 replies; 9+ messages in thread
From: Zhenqiang Chen @ 2012-07-09  8:40 UTC (permalink / raw)
  To: 'Richard Guenther'; +Cc: gcc-patches

>>
>> 1) If -fira_loop_pressure is enabled, it reduces ~24% invariant motions in my
>tests. But it does not help on total code size. Seams there is issue to update the
>"regs_needed" after moving an invariant out of the loop (My benchmark logs
>show ~73% cases have more than one invariants moved).
>>
>> During tracing, I found that move an integer constant out of the loop does not
>increase regs_needed. Function "get_pressure_class_and_nregs (rtx insn, int
>*nregs)" computes the "regs_needed".
>>
>>    *nregs
>>       = ira_reg_class_max_nregs[pressure_class][GET_MODE (SET_SRC
>> (set))];
>>
>> In ARM, the insn to set an integer is like
>>      (set (reg:SI 183)
>>         (const_int 32 [0x20])) inv1.c:64 182 {*thumb1_movsi_insn}
>>      (nil))
>> GET_MODE (SET_SRC (set)) is VOIDMode and
>ira_reg_class_max_nregs[pressure_class][VOIDMode] is 0. In one of my test
>cases, it moves 4 integer constants out of the loop, which leads to spilling.
>>
>> According to the algorithm in "calculate_loop_reg_pressure", moving an
>> invariant out of the loop should impact on the register pressure. So I
>> try to add the following code
>>
>>   if (! (*nregs))
>>     *nregs = ira_reg_class_max_nregs[pressure_class][GET_MODE (reg)];
>>
>> Logs show it reduces another 32% invariant motions. But the code size is still
>far from disabling the pass. Logs show -fira_loop_pressure impact other passes
>in addition to loop2_invariant (The result of "-fira_loop_pressure
>-fno-move-loop-invariants" is different from the result of
>"-fno-move-loop-invariants").
>>
>> 2) By default -fira_loop_pressure is not enabled for -Os, the logic to compute
>"regs_used" seams not sound. The following codes is from function
>"find_invariants_to_move"
>>     {
>>       unsigned int n_regs = DF_REG_SIZE (df);
>>
>>       regs_used = 2;
>>
>>       for (i = 0; i < n_regs; i++)
>>         {
>>           if (!DF_REGNO_FIRST_DEF (i) && DF_REGNO_LAST_USE (i))
>>             {
>>               /* This is a value that is used but not changed inside loop.
>*/
>>               regs_used++;
>>             }
>>         }
>>     }
>> * There is no loop related inform in the code.
>> * Benchmark logs show the condition (!DF_REGNO_FIRST_DEF (i) &&
>DF_REGNO_LAST_USE (i)) is never true.
>
>Still there is code that tries to deal with -Os.  Simply disabling the pass makes
>that logic pointless.

If -fira-loop-pressure is not enabled, function estimate_reg_pressure_cost (cfgloopanal.c) is used to estimate the cost. At the beginning of the function, it checks

/* If we have enough registers, we should use them and not restrict
     the transformations unnecessarily.  */
  if (regs_needed + target_res_regs <= available_regs)
    return 0;

Here are the CSiBE benchmark logs before "if (...)" for ARM/MIPS/PPC/X86.

     available_regs target_res_regs regs_needed
ARM : 9                 3              2
MIPS: 10/26             3              2
PPC : 18/29             3              2
X86 : 6/15              3              2

regs_needed++ after invariant motion. The size_cost of the first several invariant (available_regs - target_res_regs(3) - regs_needed(2)) motions are always 0. So I prefer to disable the pass if -fira-loop-pressure is not enabled.

>Thus, please try to fix the code that is there to deal with -Os (a target may opt to
>enable -fira-loop-pressure by default for -Os).

Yes. Targets need tune to enable -fira-loop-pressure.

For -fira-loop-pressure, CSiBE logs show MIPS and PPC have a little improvement and X86 has a little regression compared with -fira-loop-pressure is not enabled.
If fira-loop-pressure is enabled, the cost check bases on

  if ((int) new_regs[pressure_class]
      + (int) regs_needed[pressure_class]
      + LOOP_DATA (curr_loop)->max_reg_pressure[pressure_class]
      + IRA_LOOP_RESERVED_REGS
      > ira_available_class_regs[pressure_class])

But a reg is available does not mean it can be used in any instruction. e.g. For ARM Cortex-M0, only few instructions can use r8-r15. (r8-r11, r13-r15 are already excluded in the available_regs). Logs show the result is much better if r12 is also excluded.

Thanks!
-Zhenqiang



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Disable loop2_invariant for -Os
       [not found]       ` <4ff2ad57.8aa0d80a.5910.ffffb0d6SMTPIN_ADDED@mx.google.com>
@ 2012-07-03  9:32         ` Richard Guenther
  2012-07-09  8:40           ` Zhenqiang Chen
  0 siblings, 1 reply; 9+ messages in thread
From: Richard Guenther @ 2012-07-03  9:32 UTC (permalink / raw)
  To: Zhenqiang Chen; +Cc: gcc-patches

On Tue, Jul 3, 2012 at 10:29 AM, Zhenqiang Chen <zhenqiang.chen@arm.com> wrote:
>>-----Original Message-----
>>From: Richard Guenther [mailto:richard.guenther@gmail.com]
>>Sent: 2012年6月28日 17:24
>>To: Zhenqiang Chen
>>Cc: gcc-patches@gcc.gnu.org
>>Subject: Re: [PATCH] Disable loop2_invariant for -Os
>>
>>On Thu, Jun 28, 2012 at 10:33 AM, Zhenqiang Chen <zhenqiang.chen@arm.com>
>>wrote:
>>>>> diff --git a/gcc/loop-init.c b/gcc/loop-init.c index
>>>>> 03f8f61..5d8cf73
>>>>> 100644
>>>>> --- a/gcc/loop-init.c
>>>>> +++ b/gcc/loop-init.c
>>>>> @@ -273,6 +273,12 @@ struct rtl_opt_pass pass_rtl_loop_done =
>>>>>  static bool
>>>>>  gate_rtl_move_loop_invariants (void)
>>>>>  {
>>>>> +  /* In general, invariant motion can not reduce code size. But it
>>>>> + will
>>>>> +     change the liverange of the invariant, which increases the
>>>>> + register
>>>>> +     pressure and might lead to more spilling.  */
>>>>> +  if (optimize_function_for_size_p (cfun))
>>>>> +    return false;
>>>>> +
>>>>
>>>>Can you do this per loop instead?  Using optimize_loop_nest_for_size_p?
>>>
>>> Update it according to the comments.
>>>
>>> Thanks!
>>> -Zhenqiang
>>>
>>> diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c index
>>> f8405dd..b0e84a7 100644
>>> --- a/gcc/loop-invariant.c
>>> +++ b/gcc/loop-invariant.c
>>> @@ -1931,7 +1931,8 @@ move_loop_invariants (void)
>>>       curr_loop = loop;
>>>       /* move_single_loop_invariants for very large loops
>>>         is time consuming and might need a lot of memory.  */
>>> -      if (loop->num_nodes <= (unsigned)
>>> LOOP_INVARIANT_MAX_BBS_IN_LOOP)
>>> +      if (loop->num_nodes <= (unsigned)
>>> + LOOP_INVARIANT_MAX_BBS_IN_LOOP
>>> +         && ! optimize_loop_nest_for_size_p (loop))
>>>        move_single_loop_invariants (loop);
>>
>>Wait - move_single_loop_invariants itself already uses
>>optimize_loop_for_speed_p.
>>And looking down it seems to have support for tracking spill cost (eventually only
>>with -fira-loop-pressure) - please work out why this support is not working for
>>you.
>
> 1) If -fira_loop_pressure is enabled, it reduces ~24% invariant motions in my tests. But it does not help on total code size. Seams there is issue to update the "regs_needed" after moving an invariant out of the loop (My benchmark logs show ~73% cases have more than one invariants moved).
>
> During tracing, I found that move an integer constant out of the loop does not increase regs_needed. Function "get_pressure_class_and_nregs (rtx insn, int *nregs)" computes the "regs_needed".
>
>    *nregs
>       = ira_reg_class_max_nregs[pressure_class][GET_MODE (SET_SRC (set))];
>
> In ARM, the insn to set an integer is like
>      (set (reg:SI 183)
>         (const_int 32 [0x20])) inv1.c:64 182 {*thumb1_movsi_insn}
>      (nil))
> GET_MODE (SET_SRC (set)) is VOIDMode and ira_reg_class_max_nregs[pressure_class][VOIDMode] is 0. In one of my test cases, it moves 4 integer constants out of the loop, which leads to spilling.
>
> According to the algorithm in "calculate_loop_reg_pressure", moving an invariant out of the loop should impact on the register pressure. So I try to add the following code
>
>   if (! (*nregs))
>     *nregs = ira_reg_class_max_nregs[pressure_class][GET_MODE (reg)];
>
> Logs show it reduces another 32% invariant motions. But the code size is still far from disabling the pass. Logs show -fira_loop_pressure impact other passes in addition to loop2_invariant (The result of "-fira_loop_pressure -fno-move-loop-invariants" is different from the result of "-fno-move-loop-invariants").
>
> 2) By default -fira_loop_pressure is not enabled for -Os, the logic to compute "regs_used" seams not sound. The following codes is from function "find_invariants_to_move"
>     {
>       unsigned int n_regs = DF_REG_SIZE (df);
>
>       regs_used = 2;
>
>       for (i = 0; i < n_regs; i++)
>         {
>           if (!DF_REGNO_FIRST_DEF (i) && DF_REGNO_LAST_USE (i))
>             {
>               /* This is a value that is used but not changed inside loop.  */
>               regs_used++;
>             }
>         }
>     }
> * There is no loop related inform in the code.
> * Benchmark logs show the condition (!DF_REGNO_FIRST_DEF (i) && DF_REGNO_LAST_USE (i)) is never true.

Still there is code that tries to deal with -Os.  Simply disabling the pass
makes that logic pointless.

Thus, please try to fix the code that is there to deal with -Os (a target may
opt to enable -fira-loop-pressure by default for -Os).

Thanks,
Richard.

> Thanks!
> -Zhenqiang
>
>
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH] Disable loop2_invariant for -Os
  2012-06-28  9:38     ` Richard Guenther
@ 2012-07-03  8:29       ` Zhenqiang Chen
       [not found]       ` <4ff2ad57.8aa0d80a.5910.ffffb0d6SMTPIN_ADDED@mx.google.com>
  1 sibling, 0 replies; 9+ messages in thread
From: Zhenqiang Chen @ 2012-07-03  8:29 UTC (permalink / raw)
  To: 'Richard Guenther'; +Cc: gcc-patches

>-----Original Message-----
>From: Richard Guenther [mailto:richard.guenther@gmail.com]
>Sent: 2012年6月28日 17:24
>To: Zhenqiang Chen
>Cc: gcc-patches@gcc.gnu.org
>Subject: Re: [PATCH] Disable loop2_invariant for -Os
>
>On Thu, Jun 28, 2012 at 10:33 AM, Zhenqiang Chen <zhenqiang.chen@arm.com>
>wrote:
>>>> diff --git a/gcc/loop-init.c b/gcc/loop-init.c index
>>>> 03f8f61..5d8cf73
>>>> 100644
>>>> --- a/gcc/loop-init.c
>>>> +++ b/gcc/loop-init.c
>>>> @@ -273,6 +273,12 @@ struct rtl_opt_pass pass_rtl_loop_done =
>>>>  static bool
>>>>  gate_rtl_move_loop_invariants (void)
>>>>  {
>>>> +  /* In general, invariant motion can not reduce code size. But it
>>>> + will
>>>> +     change the liverange of the invariant, which increases the
>>>> + register
>>>> +     pressure and might lead to more spilling.  */
>>>> +  if (optimize_function_for_size_p (cfun))
>>>> +    return false;
>>>> +
>>>
>>>Can you do this per loop instead?  Using optimize_loop_nest_for_size_p?
>>
>> Update it according to the comments.
>>
>> Thanks!
>> -Zhenqiang
>>
>> diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c index
>> f8405dd..b0e84a7 100644
>> --- a/gcc/loop-invariant.c
>> +++ b/gcc/loop-invariant.c
>> @@ -1931,7 +1931,8 @@ move_loop_invariants (void)
>>       curr_loop = loop;
>>       /* move_single_loop_invariants for very large loops
>>         is time consuming and might need a lot of memory.  */
>> -      if (loop->num_nodes <= (unsigned)
>> LOOP_INVARIANT_MAX_BBS_IN_LOOP)
>> +      if (loop->num_nodes <= (unsigned)
>> + LOOP_INVARIANT_MAX_BBS_IN_LOOP
>> +         && ! optimize_loop_nest_for_size_p (loop))
>>        move_single_loop_invariants (loop);
>
>Wait - move_single_loop_invariants itself already uses
>optimize_loop_for_speed_p.
>And looking down it seems to have support for tracking spill cost (eventually only
>with -fira-loop-pressure) - please work out why this support is not working for
>you.

1) If -fira_loop_pressure is enabled, it reduces ~24% invariant motions in my tests. But it does not help on total code size. Seams there is issue to update the "regs_needed" after moving an invariant out of the loop (My benchmark logs show ~73% cases have more than one invariants moved).

During tracing, I found that move an integer constant out of the loop does not increase regs_needed. Function "get_pressure_class_and_nregs (rtx insn, int *nregs)" computes the "regs_needed".

   *nregs
      = ira_reg_class_max_nregs[pressure_class][GET_MODE (SET_SRC (set))];

In ARM, the insn to set an integer is like
     (set (reg:SI 183)
        (const_int 32 [0x20])) inv1.c:64 182 {*thumb1_movsi_insn}
     (nil))
GET_MODE (SET_SRC (set)) is VOIDMode and ira_reg_class_max_nregs[pressure_class][VOIDMode] is 0. In one of my test cases, it moves 4 integer constants out of the loop, which leads to spilling.

According to the algorithm in "calculate_loop_reg_pressure", moving an invariant out of the loop should impact on the register pressure. So I try to add the following code

  if (! (*nregs))
    *nregs = ira_reg_class_max_nregs[pressure_class][GET_MODE (reg)];

Logs show it reduces another 32% invariant motions. But the code size is still far from disabling the pass. Logs show -fira_loop_pressure impact other passes in addition to loop2_invariant (The result of "-fira_loop_pressure -fno-move-loop-invariants" is different from the result of "-fno-move-loop-invariants").

2) By default -fira_loop_pressure is not enabled for -Os, the logic to compute "regs_used" seams not sound. The following codes is from function "find_invariants_to_move"
    {
      unsigned int n_regs = DF_REG_SIZE (df);

      regs_used = 2;

      for (i = 0; i < n_regs; i++)
        {
          if (!DF_REGNO_FIRST_DEF (i) && DF_REGNO_LAST_USE (i))
            {
              /* This is a value that is used but not changed inside loop.  */
              regs_used++;
            }
        }
    }
* There is no loop related inform in the code.
* Benchmark logs show the condition (!DF_REGNO_FIRST_DEF (i) && DF_REGNO_LAST_USE (i)) is never true.

Thanks!
-Zhenqiang



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Disable loop2_invariant for -Os
       [not found]   ` <4fec169c.e908b40a.26d1.ffffc3f1SMTPIN_ADDED@mx.google.com>
@ 2012-06-28  9:38     ` Richard Guenther
  2012-07-03  8:29       ` Zhenqiang Chen
       [not found]       ` <4ff2ad57.8aa0d80a.5910.ffffb0d6SMTPIN_ADDED@mx.google.com>
  0 siblings, 2 replies; 9+ messages in thread
From: Richard Guenther @ 2012-06-28  9:38 UTC (permalink / raw)
  To: Zhenqiang Chen; +Cc: gcc-patches

On Thu, Jun 28, 2012 at 10:33 AM, Zhenqiang Chen <zhenqiang.chen@arm.com> wrote:
>>> diff --git a/gcc/loop-init.c b/gcc/loop-init.c index 03f8f61..5d8cf73
>>> 100644
>>> --- a/gcc/loop-init.c
>>> +++ b/gcc/loop-init.c
>>> @@ -273,6 +273,12 @@ struct rtl_opt_pass pass_rtl_loop_done =
>>>  static bool
>>>  gate_rtl_move_loop_invariants (void)
>>>  {
>>> +  /* In general, invariant motion can not reduce code size. But it
>>> + will
>>> +     change the liverange of the invariant, which increases the
>>> + register
>>> +     pressure and might lead to more spilling.  */
>>> +  if (optimize_function_for_size_p (cfun))
>>> +    return false;
>>> +
>>
>>Can you do this per loop instead?  Using optimize_loop_nest_for_size_p?
>
> Update it according to the comments.
>
> Thanks!
> -Zhenqiang
>
> diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
> index f8405dd..b0e84a7 100644
> --- a/gcc/loop-invariant.c
> +++ b/gcc/loop-invariant.c
> @@ -1931,7 +1931,8 @@ move_loop_invariants (void)
>       curr_loop = loop;
>       /* move_single_loop_invariants for very large loops
>         is time consuming and might need a lot of memory.  */
> -      if (loop->num_nodes <= (unsigned) LOOP_INVARIANT_MAX_BBS_IN_LOOP)
> +      if (loop->num_nodes <= (unsigned) LOOP_INVARIANT_MAX_BBS_IN_LOOP
> +         && ! optimize_loop_nest_for_size_p (loop))
>        move_single_loop_invariants (loop);

Wait - move_single_loop_invariants itself already uses
optimize_loop_for_speed_p.
And looking down it seems to have support for tracking spill cost (eventually
only with -fira-loop-pressure) - please work out why this support is not working
for you.

Richard.

>     }
>
> ChangeLog:
> 2012-06-28  Zhenqiang Chen <zhenqiang.chen@arm.com>
>
>        * loop-invariant.c (move_loop_invariants): Skip
>        move_single_loop_invariants when optimizing loop for size
>
>
>
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH] Disable loop2_invariant for -Os
  2012-06-27  9:53 ` Steven Bosscher
@ 2012-06-28  9:05   ` Zhenqiang Chen
  0 siblings, 0 replies; 9+ messages in thread
From: Zhenqiang Chen @ 2012-06-28  9:05 UTC (permalink / raw)
  To: 'Steven Bosscher'; +Cc: gcc-patches

>-----Original Message-----
>From: Steven Bosscher [mailto:stevenb.gcc@gmail.com]
>Sent: 2012年6月27日 16:54
>To: Zhenqiang Chen
>Cc: gcc-patches@gcc.gnu.org
>Subject: Re: [PATCH] Disable loop2_invariant for -Os
>
>On Wed, Jun 27, 2012 at 10:40 AM, Zhenqiang Chen
><zhenqiang.chen@arm.com> wrote:
>> Hi,
>>
>> In general, invariant motion itself can not reduce code size. But it
>> will change the liverange of the invariant, which might lead to more
spilling.
>
>This may be true for ARM but it's not true in general. Sometimes
loop-invariant

Benchmark tests show it also benefits MIPS, PPC and X86 for code size.

>address arithmetic, that is not exposed in GIMPLE, is profitable to hoist
out of
>the loop. See e.g. PR41026 (for which I still have a patch in the queue).

>If this goes in anyway, please mention PR39837 in your ChangeLog entry.

It can not handle the case.

Thanks!
-Zhenqiang



^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH] Disable loop2_invariant for -Os
  2012-06-27  9:09 ` Richard Guenther
@ 2012-06-28  9:03   ` Zhenqiang Chen
       [not found]   ` <4fec169c.e908b40a.26d1.ffffc3f1SMTPIN_ADDED@mx.google.com>
  1 sibling, 0 replies; 9+ messages in thread
From: Zhenqiang Chen @ 2012-06-28  9:03 UTC (permalink / raw)
  To: 'Richard Guenther'; +Cc: gcc-patches

>> diff --git a/gcc/loop-init.c b/gcc/loop-init.c index 03f8f61..5d8cf73
>> 100644
>> --- a/gcc/loop-init.c
>> +++ b/gcc/loop-init.c
>> @@ -273,6 +273,12 @@ struct rtl_opt_pass pass_rtl_loop_done =
>>  static bool
>>  gate_rtl_move_loop_invariants (void)
>>  {
>> +  /* In general, invariant motion can not reduce code size. But it
>> + will
>> +     change the liverange of the invariant, which increases the
>> + register
>> +     pressure and might lead to more spilling.  */
>> +  if (optimize_function_for_size_p (cfun))
>> +    return false;
>> +
>
>Can you do this per loop instead?  Using optimize_loop_nest_for_size_p?

Update it according to the comments.

Thanks!
-Zhenqiang

diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index f8405dd..b0e84a7 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -1931,7 +1931,8 @@ move_loop_invariants (void)
       curr_loop = loop;
       /* move_single_loop_invariants for very large loops
 	 is time consuming and might need a lot of memory.  */
-      if (loop->num_nodes <= (unsigned) LOOP_INVARIANT_MAX_BBS_IN_LOOP)
+      if (loop->num_nodes <= (unsigned) LOOP_INVARIANT_MAX_BBS_IN_LOOP
+	  && ! optimize_loop_nest_for_size_p (loop))
 	move_single_loop_invariants (loop);
     }

ChangeLog:
2012-06-28  Zhenqiang Chen <zhenqiang.chen@arm.com>

	* loop-invariant.c (move_loop_invariants): Skip
	move_single_loop_invariants when optimizing loop for size




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Disable loop2_invariant for -Os
       [not found] <4feac6fa.e288440a.73b3.ffffa5c7SMTPIN_ADDED@mx.google.com>
@ 2012-06-27  9:53 ` Steven Bosscher
  2012-06-28  9:05   ` Zhenqiang Chen
  0 siblings, 1 reply; 9+ messages in thread
From: Steven Bosscher @ 2012-06-27  9:53 UTC (permalink / raw)
  To: Zhenqiang Chen; +Cc: gcc-patches

On Wed, Jun 27, 2012 at 10:40 AM, Zhenqiang Chen <zhenqiang.chen@arm.com> wrote:
> Hi,
>
> In general, invariant motion itself can not reduce code size. But it will
> change the liverange of the invariant, which might lead to more spilling.

This may be true for ARM but it's not true in general. Sometimes
loop-invariant address arithmetic, that is not exposed in GIMPLE, is
profitable to hoist out of the loop. See e.g. PR41026 (for which I
still have a patch in the queue).

If this goes in anyway, please mention PR39837 in your ChangeLog entry.

Ciao!
Steven

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Disable loop2_invariant for -Os
       [not found] <4feac6f5.4abd440a.074d.ffffcfbaSMTPIN_ADDED@mx.google.com>
@ 2012-06-27  9:09 ` Richard Guenther
  2012-06-28  9:03   ` Zhenqiang Chen
       [not found]   ` <4fec169c.e908b40a.26d1.ffffc3f1SMTPIN_ADDED@mx.google.com>
  0 siblings, 2 replies; 9+ messages in thread
From: Richard Guenther @ 2012-06-27  9:09 UTC (permalink / raw)
  To: Zhenqiang Chen; +Cc: gcc-patches

On Wed, Jun 27, 2012 at 10:40 AM, Zhenqiang Chen <zhenqiang.chen@arm.com> wrote:
> Hi,
>
> In general, invariant motion itself can not reduce code size.

It can expose CSE opportunities across loops though.

> But it will
> change the liverange of the invariant, which might lead to more spilling.

"might" - indeed.  I wonder what the trade-off is here ... but given that you
leave tree loop invariant motion enabled it might not make much of a difference.

Still as this is mostly a spilling issue it looks odd to do that generally.  In
fact you could improve things by only disabling motion when that increases
register lifetime - it can after all reduce overall register lifetime:

for (;;)
  inv = inv1 + inv2;
  ... use inv;

to

inv = inv1 + inv2;
for (;;)
  ... use inv;

has register lifetime reduced.

Or at least like I suggest below.

> The patch disables loop2_invariant when optimizing for size.
>
> I measured the code size benefit for four targets based on CSiBE benchmark:
>
> ARM: 0.33%
> MIPS: 1.15%
> PPC: 0.24%
> X86: 0.45%
>
> Is it OK for trunk?
>
> Thanks!
> -Zhenqiang
>
> ChangeLog:
> 2012-06-27  Zhenqiang Chen <zhenqiang.chen@arm.com>
>
>        * loop-init.c (gate_rtl_move_loop_invariants): Disable
> loop2_invariant
>        when optimizing function for size.
>
> diff --git a/gcc/loop-init.c b/gcc/loop-init.c
> index 03f8f61..5d8cf73 100644
> --- a/gcc/loop-init.c
> +++ b/gcc/loop-init.c
> @@ -273,6 +273,12 @@ struct rtl_opt_pass pass_rtl_loop_done =
>  static bool
>  gate_rtl_move_loop_invariants (void)
>  {
> +  /* In general, invariant motion can not reduce code size. But it will
> +     change the liverange of the invariant, which increases the register
> +     pressure and might lead to more spilling.  */
> +  if (optimize_function_for_size_p (cfun))
> +    return false;
> +

Can you do this per loop instead?  Using optimize_loop_nest_for_size_p?

Thanks,
Richard.

>   return flag_move_loop_invariants;
>  }
>
>
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2012-07-09  8:40 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-06-27  8:48 [PATCH] Disable loop2_invariant for -Os Zhenqiang Chen
     [not found] <4feac6f5.4abd440a.074d.ffffcfbaSMTPIN_ADDED@mx.google.com>
2012-06-27  9:09 ` Richard Guenther
2012-06-28  9:03   ` Zhenqiang Chen
     [not found]   ` <4fec169c.e908b40a.26d1.ffffc3f1SMTPIN_ADDED@mx.google.com>
2012-06-28  9:38     ` Richard Guenther
2012-07-03  8:29       ` Zhenqiang Chen
     [not found]       ` <4ff2ad57.8aa0d80a.5910.ffffb0d6SMTPIN_ADDED@mx.google.com>
2012-07-03  9:32         ` Richard Guenther
2012-07-09  8:40           ` Zhenqiang Chen
     [not found] <4feac6fa.e288440a.73b3.ffffa5c7SMTPIN_ADDED@mx.google.com>
2012-06-27  9:53 ` Steven Bosscher
2012-06-28  9:05   ` Zhenqiang Chen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).