public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH, 5.x/6.x/7.x] Be more conservative in early inliner if FDO is enabled
@ 2016-09-10  6:41 Yuan, Pengfei
  2016-09-14 10:35 ` Richard Biener
  0 siblings, 1 reply; 23+ messages in thread
From: Yuan, Pengfei @ 2016-09-10  6:41 UTC (permalink / raw)
  To: gcc-patches; +Cc: richard.guenther, hubicka

Hi,

Previously I have sent a patch on profile based option tuning:
https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01377.html

According to Richard Biener's advice, I try investigating where the code size
reduction comes from. After analyzing the dumped IL, I figure out that it is
related to function inlining. Some cold functions are inlined regardless of
profile feedback, which increases code size.

The problem is with the early inliner. In want_early_inline_function_p, if the
estimated edge growth > 0, want_inline depends on maybe_hot_p, which usually
returns true unless optimize_size, since profile feedback is not available at
this point. Some functions which may be cold according to profile feedback are
inlined regardlessly, resulting in code size increase.

At first, I come up with a solution that preloads some profile info before
pass_early_inline. But it fails with numerous coverage-mismatch errors in
pass_ipa_tree_profile. Therefore, the proposed patch prevents early inlining
with positive code size growth if FDO is enabled.

Experiment results are as follows:

Setup
  Hardware             Core i7-4770, 32GB RAM
  OS                   Debian sid amd64
  Compiler             GCC 5.4.1 20160907
  Firefox source       mozilla-central, cset 91c2b9d5c135
  Training workload    css3test.com, html5test.com, Octane benchmark

Vanilla GCC
  Code size (.text of libxul.so)    48708873
  Octane benchmark (score)          35828   36618   35847
  Kraken benchmark (time)           939.4ms 964.0ms 951.8ms

Patched GCC
  Code size (.text of libxul.so)    44686265
  Octane benchmark (score)          36103   35740   35611
  Kraken benchmark (time)           928.9ms 949.1ms 938.7ms

There is over 8% reduction in code size, while no obvious difference in
performance. The experiment is conducted with GCC 5. There is segmentation
fault when starting Firefox instrumented by GCC 6. GCC 7 encounters ICE when
building Firefox.

Regards,

Yuan, Pengfei


gcc/ChangeLog:
	* ipa-inline.c (want_early_inline_function_p): Be more conservative
	if FDO is enabled.


diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
index 7097cf3..8266f97 100644
--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -628,6 +628,20 @@ want_early_inline_function_p (struct cgraph_edge *e)
 
       if (growth <= 0)
 	;
+      /* Profile feedback is not available at this point.
+	 Be more conservative if FDO is enabled.  */
+      else if ((profile_arc_flag && !flag_test_coverage)
+	       || (flag_branch_probabilities && !flag_auto_profile))
+	{
+	  if (dump_file)
+	    fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
+		     "FDO is enabled and code would grow by %i\n",
+		     xstrdup_for_dump (e->caller->name ()),
+		     e->caller->order,
+		     xstrdup_for_dump (callee->name ()), callee->order,
+		     growth);
+	  want_inline = false;
+	}
       else if (!e->maybe_hot_p ()
 	       && growth > 0)
 	{

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH, 5.x/6.x/7.x] Be more conservative in early inliner if FDO is enabled
  2016-09-10  6:41 [PATCH, 5.x/6.x/7.x] Be more conservative in early inliner if FDO is enabled Yuan, Pengfei
@ 2016-09-14 10:35 ` Richard Biener
  2016-09-15  3:17   ` Yuan, Pengfei
  0 siblings, 1 reply; 23+ messages in thread
From: Richard Biener @ 2016-09-14 10:35 UTC (permalink / raw)
  To: Yuan, Pengfei; +Cc: GCC Patches, Jan Hubicka

On Sat, Sep 10, 2016 at 8:04 AM, Yuan, Pengfei <ypf@pku.edu.cn> wrote:
> Hi,
>
> Previously I have sent a patch on profile based option tuning:
> https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01377.html
>
> According to Richard Biener's advice, I try investigating where the code size
> reduction comes from. After analyzing the dumped IL, I figure out that it is
> related to function inlining. Some cold functions are inlined regardless of
> profile feedback, which increases code size.
>
> The problem is with the early inliner. In want_early_inline_function_p, if the
> estimated edge growth > 0, want_inline depends on maybe_hot_p, which usually
> returns true unless optimize_size, since profile feedback is not available at
> this point. Some functions which may be cold according to profile feedback are
> inlined regardlessly, resulting in code size increase.
>
> At first, I come up with a solution that preloads some profile info before
> pass_early_inline. But it fails with numerous coverage-mismatch errors in
> pass_ipa_tree_profile. Therefore, the proposed patch prevents early inlining
> with positive code size growth if FDO is enabled.
>
> Experiment results are as follows:
>
> Setup
>   Hardware             Core i7-4770, 32GB RAM
>   OS                   Debian sid amd64
>   Compiler             GCC 5.4.1 20160907
>   Firefox source       mozilla-central, cset 91c2b9d5c135
>   Training workload    css3test.com, html5test.com, Octane benchmark
>
> Vanilla GCC
>   Code size (.text of libxul.so)    48708873
>   Octane benchmark (score)          35828   36618   35847
>   Kraken benchmark (time)           939.4ms 964.0ms 951.8ms
>
> Patched GCC
>   Code size (.text of libxul.so)    44686265
>   Octane benchmark (score)          36103   35740   35611
>   Kraken benchmark (time)           928.9ms 949.1ms 938.7ms
>
> There is over 8% reduction in code size, while no obvious difference in
> performance. The experiment is conducted with GCC 5. There is segmentation
> fault when starting Firefox instrumented by GCC 6. GCC 7 encounters ICE when
> building Firefox.

I think the approach is reasonable though it might still have
interesting effects on
optimization for very small growth.  So for further experimenting it
would be nice
to have a separate PARAM_EARLY_FDO_INLINING_INSNS or maybe simply
adjust the PARAM_EARLY_INLINING_INSNS default accordingly when FDO is
enabled?

I'll let Honza also double-check the condition detecting FDO (it looks
like we should
have some abstraction for that).

Thanks,
Richard.

> Regards,
>
> Yuan, Pengfei
>
>
> gcc/ChangeLog:
>         * ipa-inline.c (want_early_inline_function_p): Be more conservative
>         if FDO is enabled.
>
>
> diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
> index 7097cf3..8266f97 100644
> --- a/gcc/ipa-inline.c
> +++ b/gcc/ipa-inline.c
> @@ -628,6 +628,20 @@ want_early_inline_function_p (struct cgraph_edge *e)
>
>        if (growth <= 0)
>         ;
> +      /* Profile feedback is not available at this point.
> +        Be more conservative if FDO is enabled.  */
> +      else if ((profile_arc_flag && !flag_test_coverage)
> +              || (flag_branch_probabilities && !flag_auto_profile))
> +       {
> +         if (dump_file)
> +           fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
> +                    "FDO is enabled and code would grow by %i\n",
> +                    xstrdup_for_dump (e->caller->name ()),
> +                    e->caller->order,
> +                    xstrdup_for_dump (callee->name ()), callee->order,
> +                    growth);
> +         want_inline = false;
> +       }
>        else if (!e->maybe_hot_p ()
>                && growth > 0)
>         {
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH, 5.x/6.x/7.x] Be more conservative in early inliner if FDO is enabled
  2016-09-14 10:35 ` Richard Biener
@ 2016-09-15  3:17   ` Yuan, Pengfei
  2016-09-15  8:44     ` Richard Biener
  0 siblings, 1 reply; 23+ messages in thread
From: Yuan, Pengfei @ 2016-09-15  3:17 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches, Jan Hubicka

> I think the approach is reasonable though it might still have
> interesting effects on
> optimization for very small growth.  So for further experimenting it
> would be nice

Test it on SPEC CPU with FDO enabled?

> to have a separate PARAM_EARLY_FDO_INLINING_INSNS or maybe simply
> adjust the PARAM_EARLY_INLINING_INSNS default accordingly when FDO is
> enabled?

Setting PARAM_EARLY_INLINING_INSNS to 0 when FDO is enabled should be
equivalent to my patch.

Regards,
Yuan, Pengfei
 
> I'll let Honza also double-check the condition detecting FDO (it looks
> like we should
> have some abstraction for that).
> 
> Thanks,
> Richard.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH, 5.x/6.x/7.x] Be more conservative in early inliner if FDO is enabled
  2016-09-15  3:17   ` Yuan, Pengfei
@ 2016-09-15  8:44     ` Richard Biener
  2016-09-15 10:08       ` Yuan, Pengfei
  2016-09-16  5:52       ` Yuan, Pengfei
  0 siblings, 2 replies; 23+ messages in thread
From: Richard Biener @ 2016-09-15  8:44 UTC (permalink / raw)
  To: Yuan, Pengfei; +Cc: GCC Patches, Jan Hubicka

On Thu, Sep 15, 2016 at 2:21 AM, Yuan, Pengfei <ypf@pku.edu.cn> wrote:
>> I think the approach is reasonable though it might still have
>> interesting effects on
>> optimization for very small growth.  So for further experimenting it
>> would be nice
>
> Test it on SPEC CPU with FDO enabled?
>
>> to have a separate PARAM_EARLY_FDO_INLINING_INSNS or maybe simply
>> adjust the PARAM_EARLY_INLINING_INSNS default accordingly when FDO is
>> enabled?
>
> Setting PARAM_EARLY_INLINING_INSNS to 0 when FDO is enabled should be
> equivalent to my patch.

Yes.  This means it's easy to experiment with other values than zero.  Basically
early inlining is supposed to remove abstraction penalty to

 a) reduce FDO instrumentation overhead
 b) get more realistic size estimates for the inliner

a) was particularly important back in time for tramp3d, reducing
profiling runtime
1000-fold.  b) is generally important.

PARAM_EARLY_INLINING_INSNS is supposed to be a reasonable value to
get abstraction removed but IIRC we increased it quite a bit to also get more
early optimization (to get more accurate inliner estimates).

What I am saying is that reducing PARAM_EARLY_INLINING_INSNS makes
sense for FDO but reducing it to zero is probably a bit much.

Can you do your measurements with values between zero and the current
default of 14 (wow, 14 ... didn't know it's _that_ high currently ;)).
What's the
value that crosses the boundary of diminishing returns regarding to code-size
improvements for you?

Richard.

> Regards,
> Yuan, Pengfei
>
>> I'll let Honza also double-check the condition detecting FDO (it looks
>> like we should
>> have some abstraction for that).
>>
>> Thanks,
>> Richard.
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH, 5.x/6.x/7.x] Be more conservative in early inliner if FDO is enabled
  2016-09-15  8:44     ` Richard Biener
@ 2016-09-15 10:08       ` Yuan, Pengfei
  2016-09-16  5:52       ` Yuan, Pengfei
  1 sibling, 0 replies; 23+ messages in thread
From: Yuan, Pengfei @ 2016-09-15 10:08 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches, Jan Hubicka

> Yes.  This means it's easy to experiment with other values than zero.  Basically
> early inlining is supposed to remove abstraction penalty to
> 
>  a) reduce FDO instrumentation overhead
>  b) get more realistic size estimates for the inliner
> 
> a) was particularly important back in time for tramp3d, reducing
> profiling runtime
> 1000-fold.  b) is generally important.
> 
> PARAM_EARLY_INLINING_INSNS is supposed to be a reasonable value to
> get abstraction removed but IIRC we increased it quite a bit to also get more
> early optimization (to get more accurate inliner estimates).
> 
> What I am saying is that reducing PARAM_EARLY_INLINING_INSNS makes
> sense for FDO but reducing it to zero is probably a bit much.
> 
> Can you do your measurements with values between zero and the current
> default of 14 (wow, 14 ... didn't know it's _that_ high currently ;)).
> What's the
> value that crosses the boundary of diminishing returns regarding to code-size
> improvements for you?
> 
> Richard.
> 

I see. This procedure may take some time since profile data can not be reused.

Yuan, Pengfei

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH, 5.x/6.x/7.x] Be more conservative in early inliner if FDO is enabled
  2016-09-15  8:44     ` Richard Biener
  2016-09-15 10:08       ` Yuan, Pengfei
@ 2016-09-16  5:52       ` Yuan, Pengfei
  2016-09-16  8:09         ` Jan Hubicka
  1 sibling, 1 reply; 23+ messages in thread
From: Yuan, Pengfei @ 2016-09-16  5:52 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches, Jan Hubicka

> > Setting PARAM_EARLY_INLINING_INSNS to 0 when FDO is enabled should be
> > equivalent to my patch.
> 
> Yes.  This means it's easy to experiment with other values than zero.  Basically
> early inlining is supposed to remove abstraction penalty to
> 
>  a) reduce FDO instrumentation overhead
>  b) get more realistic size estimates for the inliner
> 
> a) was particularly important back in time for tramp3d, reducing
> profiling runtime
> 1000-fold.  b) is generally important.
> 
> PARAM_EARLY_INLINING_INSNS is supposed to be a reasonable value to
> get abstraction removed but IIRC we increased it quite a bit to also get more
> early optimization (to get more accurate inliner estimates).
> 
> What I am saying is that reducing PARAM_EARLY_INLINING_INSNS makes
> sense for FDO but reducing it to zero is probably a bit much.
> 
> Can you do your measurements with values between zero and the current
> default of 14 (wow, 14 ... didn't know it's _that_ high currently ;)).
> What's the
> value that crosses the boundary of diminishing returns regarding to code-size
> improvements for you?
> 
> Richard.

Here are the results:

Param  Size (GCC5)        Time (GCC5)  Time (GCC7)
0      44686265 (-8.26%)  58.772s      66.332s
1      45692793 (-6.19%)  40.684s      39.220s
2      45556185 (-6.47%)  35.292s      34.328s
3      46251049 (-5.05%)  28.820s      27.136s
4      47028873 (-3.45%)  24.616s      22.200s
5      47495641 (-2.49%)  20.160s      17.800s
6      47520153 (-2.44%)  16.444s      15.656s
14     48708873           5.620s       5.556s

Param: value of PARAM_EARLY_INLINING_INSNS
Size:  code size (.text) of optimized libxul.so
Time:  execution time of instrumented tramp3d (-n 25)

To balance between size reduction of optimized binary and speed penalty
of instrumented binary, I set param=6 as baseline and compare:

Param  Size score  Time score  Total
0      3.39        -3.57       -0.18
1      2.54        -2.47        0.07
2      2.65        -2.15        0.50
3      2.07        -1.75        0.32
4      1.41        -1.50       -0.09
5      1.02        -1.23       -0.21
6      1.00        -1.00        0.00
14     0.00        -0.34       -0.34

Therefore, I think param=2 is the best choice.

Is the attached patch OK?

Regards,
Yuan, Pengfei


gcc/ChangeLog
	* opts.c (finish_options): Adjust PARAM_EARLY_INLINING_INSNS
	when FDO is enabled.


diff --git a/gcc/opts.c b/gcc/opts.c
index 39c190d..b59c700 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -826,8 +826,14 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
       maybe_set_param_value (PARAM_STACK_FRAME_GROWTH, 40,
                             opts->x_param_values, opts_set->x_param_values);
     }

+  /* Adjust PARAM_EARLY_INLINING_INSNS when FDO is enabled.  */
+  if ((opts->x_profile_arc_flag && !opts->x_flag_test_coverage)
+      || (opts->x_flag_branch_probabilities && !opts->x_flag_auto_profile))
+    maybe_set_param_value (PARAM_EARLY_INLINING_INSNS, 2,
+                          opts->x_param_values, opts_set->x_param_values);
+
   if (opts->x_flag_lto)
     {
 #ifdef ENABLE_LTO
       opts->x_flag_generate_lto = 1;

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH, 5.x/6.x/7.x] Be more conservative in early inliner if FDO is enabled
  2016-09-16  5:52       ` Yuan, Pengfei
@ 2016-09-16  8:09         ` Jan Hubicka
  2016-09-16  9:01           ` Yuan, Pengfei
  0 siblings, 1 reply; 23+ messages in thread
From: Jan Hubicka @ 2016-09-16  8:09 UTC (permalink / raw)
  To: Yuan, Pengfei; +Cc: Richard Biener, GCC Patches, Jan Hubicka

> 
> Here are the results:
> 
> Param  Size (GCC5)        Time (GCC5)  Time (GCC7)
> 0      44686265 (-8.26%)  58.772s      66.332s
> 1      45692793 (-6.19%)  40.684s      39.220s
> 2      45556185 (-6.47%)  35.292s      34.328s
> 3      46251049 (-5.05%)  28.820s      27.136s
> 4      47028873 (-3.45%)  24.616s      22.200s
> 5      47495641 (-2.49%)  20.160s      17.800s
> 6      47520153 (-2.44%)  16.444s      15.656s
> 14     48708873           5.620s       5.556s

Thanks for data! I meant to run the benchmark myself, but had little time to do
it over past week becuase of traveling and was also wondering what to do given
that spec is rather poor benchmark in this area. Tramp3d is biassed but we are
in stage1 and can fine tune latter. I am debugging the libxul crashes in FDO
binary now, so we can re-run talos.
> 
> Param: value of PARAM_EARLY_INLINING_INSNS
> Size:  code size (.text) of optimized libxul.so
> Time:  execution time of instrumented tramp3d (-n 25)
> 
> To balance between size reduction of optimized binary and speed penalty
> of instrumented binary, I set param=6 as baseline and compare:
> 
> Param  Size score  Time score  Total
> 0      3.39        -3.57       -0.18
> 1      2.54        -2.47        0.07
> 2      2.65        -2.15        0.50
> 3      2.07        -1.75        0.32
> 4      1.41        -1.50       -0.09
> 5      1.02        -1.23       -0.21
> 6      1.00        -1.00        0.00
> 14     0.00        -0.34       -0.34
> 
> Therefore, I think param=2 is the best choice.
> 
> Is the attached patch OK?

Setting param to 2 looks fine

> gcc/ChangeLog
> 	* opts.c (finish_options): Adjust PARAM_EARLY_INLINING_INSNS
> 	when FDO is enabled.
> 
> 
> diff --git a/gcc/opts.c b/gcc/opts.c
> index 39c190d..b59c700 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -826,8 +826,14 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
>        maybe_set_param_value (PARAM_STACK_FRAME_GROWTH, 40,
>                              opts->x_param_values, opts_set->x_param_values);
>      }
> 
> +  /* Adjust PARAM_EARLY_INLINING_INSNS when FDO is enabled.  */
> +  if ((opts->x_profile_arc_flag && !opts->x_flag_test_coverage)
> +      || (opts->x_flag_branch_probabilities && !opts->x_flag_auto_profile))
> +    maybe_set_param_value (PARAM_EARLY_INLINING_INSNS, 2,
> +                          opts->x_param_values, opts_set->x_param_values);
> +

I would actually preffer to have PARAM_EARLY_ININING_INSNS_FEEDBACK.  We
already have TRACER_DYNAMIC_COVERAGE_FEEDBACK and other params.  The reason is
that profile is not a global property of program. It may or may not be
available for given function, while params are global.

Even at compile time profile may be selectively missing for example for
COMDATs that did not win in the linking process.

There is also need to update the documentation.
Thanks for the work!
Honza
>    if (opts->x_flag_lto)
>      {
>  #ifdef ENABLE_LTO
>        opts->x_flag_generate_lto = 1;

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH, 5.x/6.x/7.x] Be more conservative in early inliner if FDO is enabled
  2016-09-16  8:09         ` Jan Hubicka
@ 2016-09-16  9:01           ` Yuan, Pengfei
  2016-09-16  9:21             ` Richard Biener
  0 siblings, 1 reply; 23+ messages in thread
From: Yuan, Pengfei @ 2016-09-16  9:01 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Biener, GCC Patches

> > Here are the results:
> > 
> > Param  Size (GCC5)        Time (GCC5)  Time (GCC7)
> > 0      44686265 (-8.26%)  58.772s      66.332s
> > 1      45692793 (-6.19%)  40.684s      39.220s
> > 2      45556185 (-6.47%)  35.292s      34.328s
> > 3      46251049 (-5.05%)  28.820s      27.136s
> > 4      47028873 (-3.45%)  24.616s      22.200s
> > 5      47495641 (-2.49%)  20.160s      17.800s
> > 6      47520153 (-2.44%)  16.444s      15.656s
> > 14     48708873           5.620s       5.556s
> 
> Thanks for data! I meant to run the benchmark myself, but had little time to do
> it over past week becuase of traveling and was also wondering what to do given
> that spec is rather poor benchmark in this area. Tramp3d is biassed but we are
> in stage1 and can fine tune latter. I am debugging the libxul crashes in FDO
> binary now, so we can re-run talos.

It seems to be memory corruption. The content of a pointer becomes 0xe5e5...e5.
I have tried Firefox 48.0.2, 49b10 and mozilla-central with GCC 6. Same error.

> > Param: value of PARAM_EARLY_INLINING_INSNS
> > Size:  code size (.text) of optimized libxul.so
> > Time:  execution time of instrumented tramp3d (-n 25)
> > 
> > To balance between size reduction of optimized binary and speed penalty
> > of instrumented binary, I set param=6 as baseline and compare:
> > 
> > Param  Size score  Time score  Total
> > 0      3.39        -3.57       -0.18
> > 1      2.54        -2.47        0.07
> > 2      2.65        -2.15        0.50
> > 3      2.07        -1.75        0.32
> > 4      1.41        -1.50       -0.09
> > 5      1.02        -1.23       -0.21
> > 6      1.00        -1.00        0.00
> > 14     0.00        -0.34       -0.34
> > 
> > Therefore, I think param=2 is the best choice.
> > 
> > Is the attached patch OK?
> 
> Setting param to 2 looks fine
> 
> > gcc/ChangeLog
> > 	* opts.c (finish_options): Adjust PARAM_EARLY_INLINING_INSNS
> > 	when FDO is enabled.
> > 
> > 
> > diff --git a/gcc/opts.c b/gcc/opts.c
> > index 39c190d..b59c700 100644
> > --- a/gcc/opts.c
> > +++ b/gcc/opts.c
> > @@ -826,8 +826,14 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
> >        maybe_set_param_value (PARAM_STACK_FRAME_GROWTH, 40,
> >                              opts->x_param_values, opts_set->x_param_values);
> >      }
> > 
> > +  /* Adjust PARAM_EARLY_INLINING_INSNS when FDO is enabled.  */
> > +  if ((opts->x_profile_arc_flag && !opts->x_flag_test_coverage)
> > +      || (opts->x_flag_branch_probabilities && !opts->x_flag_auto_profile))
> > +    maybe_set_param_value (PARAM_EARLY_INLINING_INSNS, 2,
> > +                          opts->x_param_values, opts_set->x_param_values);
> > +
> 
> I would actually preffer to have PARAM_EARLY_ININING_INSNS_FEEDBACK.  We
> already have TRACER_DYNAMIC_COVERAGE_FEEDBACK and other params.  The reason is
> that profile is not a global property of program. It may or may not be
> available for given function, while params are global.

Whether profile is available for specific functions is not important here because
profile is loaded after pass_early_inline. Therefore I think setting the param
globally is fine.

> Even at compile time profile may be selectively missing for example for
> COMDATs that did not win in the linking process.
> 
> There is also need to update the documentation.
> Thanks for the work!
> Honza

Here is the new patch.

Regards,
Yuan, Pengfei


gcc/ChangeLog
	* doc/invoke.texi (early-inlining-insns): Value is adjusted
	when FDO is enabled.
	* opts.c (finish_options): Adjust PARAM_EARLY_INLINING_INSNS
	when FDO is enabled.


diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 1ca4dcc..880a28c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -10272,9 +10272,11 @@ The default value is 10.
 
 @item early-inlining-insns
 Specify growth that the early inliner can make.  In effect it increases
 the amount of inlining for code having a large abstraction penalty.
-The default value is 14.
+The default value is 14.  When feedback-directed optimizations is enabled (by
+@option{-fprofile-generate} or @option{-fprofile-use}), the value is adjusted
+to 2.
 
 @item max-early-inliner-iterations
 Limit of iterations of the early inliner.  This basically bounds
 the number of nested indirect calls the early inliner can resolve.
diff --git a/gcc/opts.c b/gcc/opts.c
index 39c190d..b59c700 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -826,8 +826,14 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
       maybe_set_param_value (PARAM_STACK_FRAME_GROWTH, 40,
 			     opts->x_param_values, opts_set->x_param_values);
     }
 
+  /* Adjust PARAM_EARLY_INLINING_INSNS when FDO is enabled.  */
+  if ((opts->x_profile_arc_flag && !opts->x_flag_test_coverage)
+      || (opts->x_flag_branch_probabilities && !opts->x_flag_auto_profile))
+    maybe_set_param_value (PARAM_EARLY_INLINING_INSNS, 2,
+			   opts->x_param_values, opts_set->x_param_values);
+
   if (opts->x_flag_lto)
     {
 #ifdef ENABLE_LTO
       opts->x_flag_generate_lto = 1;

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH, 5.x/6.x/7.x] Be more conservative in early inliner if FDO is enabled
  2016-09-16  9:01           ` Yuan, Pengfei
@ 2016-09-16  9:21             ` Richard Biener
  2016-09-16  9:37               ` Jan Hubicka
  0 siblings, 1 reply; 23+ messages in thread
From: Richard Biener @ 2016-09-16  9:21 UTC (permalink / raw)
  To: Yuan, Pengfei; +Cc: Jan Hubicka, GCC Patches

On Fri, Sep 16, 2016 at 10:50 AM, Yuan, Pengfei <ypf@pku.edu.cn> wrote:
>> > Here are the results:
>> >
>> > Param  Size (GCC5)        Time (GCC5)  Time (GCC7)
>> > 0      44686265 (-8.26%)  58.772s      66.332s
>> > 1      45692793 (-6.19%)  40.684s      39.220s
>> > 2      45556185 (-6.47%)  35.292s      34.328s
>> > 3      46251049 (-5.05%)  28.820s      27.136s
>> > 4      47028873 (-3.45%)  24.616s      22.200s
>> > 5      47495641 (-2.49%)  20.160s      17.800s
>> > 6      47520153 (-2.44%)  16.444s      15.656s
>> > 14     48708873           5.620s       5.556s
>>
>> Thanks for data! I meant to run the benchmark myself, but had little time to do
>> it over past week becuase of traveling and was also wondering what to do given
>> that spec is rather poor benchmark in this area. Tramp3d is biassed but we are
>> in stage1 and can fine tune latter. I am debugging the libxul crashes in FDO
>> binary now, so we can re-run talos.
>
> It seems to be memory corruption. The content of a pointer becomes 0xe5e5...e5.
> I have tried Firefox 48.0.2, 49b10 and mozilla-central with GCC 6. Same error.
>
>> > Param: value of PARAM_EARLY_INLINING_INSNS
>> > Size:  code size (.text) of optimized libxul.so
>> > Time:  execution time of instrumented tramp3d (-n 25)
>> >
>> > To balance between size reduction of optimized binary and speed penalty
>> > of instrumented binary, I set param=6 as baseline and compare:
>> >
>> > Param  Size score  Time score  Total
>> > 0      3.39        -3.57       -0.18
>> > 1      2.54        -2.47        0.07
>> > 2      2.65        -2.15        0.50
>> > 3      2.07        -1.75        0.32
>> > 4      1.41        -1.50       -0.09
>> > 5      1.02        -1.23       -0.21
>> > 6      1.00        -1.00        0.00
>> > 14     0.00        -0.34       -0.34
>> >
>> > Therefore, I think param=2 is the best choice.
>> >
>> > Is the attached patch OK?
>>
>> Setting param to 2 looks fine
>>
>> > gcc/ChangeLog
>> >     * opts.c (finish_options): Adjust PARAM_EARLY_INLINING_INSNS
>> >     when FDO is enabled.
>> >
>> >
>> > diff --git a/gcc/opts.c b/gcc/opts.c
>> > index 39c190d..b59c700 100644
>> > --- a/gcc/opts.c
>> > +++ b/gcc/opts.c
>> > @@ -826,8 +826,14 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
>> >        maybe_set_param_value (PARAM_STACK_FRAME_GROWTH, 40,
>> >                              opts->x_param_values, opts_set->x_param_values);
>> >      }
>> >
>> > +  /* Adjust PARAM_EARLY_INLINING_INSNS when FDO is enabled.  */
>> > +  if ((opts->x_profile_arc_flag && !opts->x_flag_test_coverage)
>> > +      || (opts->x_flag_branch_probabilities && !opts->x_flag_auto_profile))
>> > +    maybe_set_param_value (PARAM_EARLY_INLINING_INSNS, 2,
>> > +                          opts->x_param_values, opts_set->x_param_values);
>> > +
>>
>> I would actually preffer to have PARAM_EARLY_ININING_INSNS_FEEDBACK.  We
>> already have TRACER_DYNAMIC_COVERAGE_FEEDBACK and other params.  The reason is
>> that profile is not a global property of program. It may or may not be
>> available for given function, while params are global.
>
> Whether profile is available for specific functions is not important here because
> profile is loaded after pass_early_inline. Therefore I think setting the param
> globally is fine.

I also like a new param better as it avoids a new magic constant and
makes playing with
it easier (your patch removes the ability to do statistics like you did via the
early-inlining-insns parameter by forcing it to two).

Thanks,
Richard.

>> Even at compile time profile may be selectively missing for example for
>> COMDATs that did not win in the linking process.
>>
>> There is also need to update the documentation.
>> Thanks for the work!
>> Honza
>
> Here is the new patch.
>
> Regards,
> Yuan, Pengfei
>
>
> gcc/ChangeLog
>         * doc/invoke.texi (early-inlining-insns): Value is adjusted
>         when FDO is enabled.
>         * opts.c (finish_options): Adjust PARAM_EARLY_INLINING_INSNS
>         when FDO is enabled.
>
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 1ca4dcc..880a28c 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -10272,9 +10272,11 @@ The default value is 10.
>
>  @item early-inlining-insns
>  Specify growth that the early inliner can make.  In effect it increases
>  the amount of inlining for code having a large abstraction penalty.
> -The default value is 14.
> +The default value is 14.  When feedback-directed optimizations is enabled (by
> +@option{-fprofile-generate} or @option{-fprofile-use}), the value is adjusted
> +to 2.
>
>  @item max-early-inliner-iterations
>  Limit of iterations of the early inliner.  This basically bounds
>  the number of nested indirect calls the early inliner can resolve.
> diff --git a/gcc/opts.c b/gcc/opts.c
> index 39c190d..b59c700 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -826,8 +826,14 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
>        maybe_set_param_value (PARAM_STACK_FRAME_GROWTH, 40,
>                              opts->x_param_values, opts_set->x_param_values);
>      }
>
> +  /* Adjust PARAM_EARLY_INLINING_INSNS when FDO is enabled.  */
> +  if ((opts->x_profile_arc_flag && !opts->x_flag_test_coverage)
> +      || (opts->x_flag_branch_probabilities && !opts->x_flag_auto_profile))
> +    maybe_set_param_value (PARAM_EARLY_INLINING_INSNS, 2,
> +                          opts->x_param_values, opts_set->x_param_values);
> +
>    if (opts->x_flag_lto)
>      {
>  #ifdef ENABLE_LTO
>        opts->x_flag_generate_lto = 1;
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH, 5.x/6.x/7.x] Be more conservative in early inliner if FDO is enabled
  2016-09-16  9:21             ` Richard Biener
@ 2016-09-16  9:37               ` Jan Hubicka
  2016-09-16 12:00                 ` Yuan, Pengfei
  0 siblings, 1 reply; 23+ messages in thread
From: Jan Hubicka @ 2016-09-16  9:37 UTC (permalink / raw)
  To: Richard Biener; +Cc: Yuan, Pengfei, Jan Hubicka, GCC Patches

> On Fri, Sep 16, 2016 at 10:50 AM, Yuan, Pengfei <ypf@pku.edu.cn> wrote:
> >> > Here are the results:
> >> >
> >> > Param  Size (GCC5)        Time (GCC5)  Time (GCC7)
> >> > 0      44686265 (-8.26%)  58.772s      66.332s
> >> > 1      45692793 (-6.19%)  40.684s      39.220s
> >> > 2      45556185 (-6.47%)  35.292s      34.328s
> >> > 3      46251049 (-5.05%)  28.820s      27.136s
> >> > 4      47028873 (-3.45%)  24.616s      22.200s
> >> > 5      47495641 (-2.49%)  20.160s      17.800s
> >> > 6      47520153 (-2.44%)  16.444s      15.656s
> >> > 14     48708873           5.620s       5.556s
> >>
> >> Thanks for data! I meant to run the benchmark myself, but had little time to do
> >> it over past week becuase of traveling and was also wondering what to do given
> >> that spec is rather poor benchmark in this area. Tramp3d is biassed but we are
> >> in stage1 and can fine tune latter. I am debugging the libxul crashes in FDO
> >> binary now, so we can re-run talos.
> >
> > It seems to be memory corruption. The content of a pointer becomes 0xe5e5...e5.
> > I have tried Firefox 48.0.2, 49b10 and mozilla-central with GCC 6. Same error.
> >
> >> > Param: value of PARAM_EARLY_INLINING_INSNS
> >> > Size:  code size (.text) of optimized libxul.so
> >> > Time:  execution time of instrumented tramp3d (-n 25)
> >> >
> >> > To balance between size reduction of optimized binary and speed penalty
> >> > of instrumented binary, I set param=6 as baseline and compare:
> >> >
> >> > Param  Size score  Time score  Total
> >> > 0      3.39        -3.57       -0.18
> >> > 1      2.54        -2.47        0.07
> >> > 2      2.65        -2.15        0.50
> >> > 3      2.07        -1.75        0.32
> >> > 4      1.41        -1.50       -0.09
> >> > 5      1.02        -1.23       -0.21
> >> > 6      1.00        -1.00        0.00
> >> > 14     0.00        -0.34       -0.34
> >> >
> >> > Therefore, I think param=2 is the best choice.
> >> >
> >> > Is the attached patch OK?
> >>
> >> Setting param to 2 looks fine
> >>
> >> > gcc/ChangeLog
> >> >     * opts.c (finish_options): Adjust PARAM_EARLY_INLINING_INSNS
> >> >     when FDO is enabled.
> >> >
> >> >
> >> > diff --git a/gcc/opts.c b/gcc/opts.c
> >> > index 39c190d..b59c700 100644
> >> > --- a/gcc/opts.c
> >> > +++ b/gcc/opts.c
> >> > @@ -826,8 +826,14 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
> >> >        maybe_set_param_value (PARAM_STACK_FRAME_GROWTH, 40,
> >> >                              opts->x_param_values, opts_set->x_param_values);
> >> >      }
> >> >
> >> > +  /* Adjust PARAM_EARLY_INLINING_INSNS when FDO is enabled.  */
> >> > +  if ((opts->x_profile_arc_flag && !opts->x_flag_test_coverage)
> >> > +      || (opts->x_flag_branch_probabilities && !opts->x_flag_auto_profile))
> >> > +    maybe_set_param_value (PARAM_EARLY_INLINING_INSNS, 2,
> >> > +                          opts->x_param_values, opts_set->x_param_values);
> >> > +
> >>
> >> I would actually preffer to have PARAM_EARLY_ININING_INSNS_FEEDBACK.  We
> >> already have TRACER_DYNAMIC_COVERAGE_FEEDBACK and other params.  The reason is
> >> that profile is not a global property of program. It may or may not be
> >> available for given function, while params are global.
> >
> > Whether profile is available for specific functions is not important here because
> > profile is loaded after pass_early_inline. Therefore I think setting the param
> > globally is fine.
> 
> I also like a new param better as it avoids a new magic constant and
> makes playing with
> it easier (your patch removes the ability to do statistics like you did via the
> early-inlining-insns parameter by forcing it to two).

Hmm, you are right that you do not know if this particular function will get
profile (forgot about that).  Still, please use two params - it is more
consistent with what we have now and we may make it profile specific in
future..

Honza
> 
> Thanks,
> Richard.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH, 5.x/6.x/7.x] Be more conservative in early inliner if FDO is enabled
  2016-09-16  9:37               ` Jan Hubicka
@ 2016-09-16 12:00                 ` Yuan, Pengfei
  2016-09-16 12:56                   ` Jan Hubicka
                                     ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Yuan, Pengfei @ 2016-09-16 12:00 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Biener, GCC Patches

> > I also like a new param better as it avoids a new magic constant and
> > makes playing with
> > it easier (your patch removes the ability to do statistics like you did via the
> > early-inlining-insns parameter by forcing it to two).
> 
> Hmm, you are right that you do not know if this particular function will get
> profile (forgot about that).  Still, please use two params - it is more
> consistent with what we have now and we may make it profile specific in
> future..
> 
> Honza
> > 
> > Thanks,
> > Richard.

A new patch for trunk is attached.

Regards,
Yuan, Pengfei


2016-09-16  Yuan Pengfei  <ypf@pku.edu.cn>

	* doc/invoke.texi (--param early-inlining-insns-feedback): New.
	* ipa-inline.c (want_early_inline_function_p): Use
	PARAM_EARLY_INLINING_INSNS_FEEDBACK when FDO is enabled.
	* params.def (PARAM_EARLY_INLINING_INSNS_FEEDBACK): Define.
	(PARAM_EARLY_INLINING_INSNS): Change help string accordingly.


diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 8eb5eff..6e7659a 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9124,12 +9124,18 @@ given call expression.  This parameter limits inlining only to call expressions
 whose probability exceeds the given threshold (in percents).
 The default value is 10.
 
 @item early-inlining-insns
+@itemx early-inlining-insns-feedback
 Specify growth that the early inliner can make.  In effect it increases
 the amount of inlining for code having a large abstraction penalty.
 The default value is 14.
 
+The @option{early-inlining-insns-feedback} parameter is used only when
+profile feedback-directed optimizations are enabled (by
+@option{-fprofile-generate} or @option{-fprofile-use}).
+The default value is 2.
+
 @item max-early-inliner-iterations
 Limit of iterations of the early inliner.  This basically bounds
 the number of nested indirect calls the early inliner can resolve.
 Deeper chains are still handled by late inlining.
diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
index 5c9366a..e028c08 100644
--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -594,10 +594,17 @@ want_early_inline_function_p (struct cgraph_edge *e)
     }
   else
     {
       int growth = estimate_edge_growth (e);
+      int growth_limit;
       int n;
 
+      if ((profile_arc_flag && !flag_test_coverage)
+	  || (flag_branch_probabilities && !flag_auto_profile))
+	growth_limit = PARAM_VALUE (PARAM_EARLY_INLINING_INSNS_FEEDBACK);
+      else
+	growth_limit = PARAM_VALUE (PARAM_EARLY_INLINING_INSNS);
+
       if (growth <= 0)
 	;
       else if (!e->maybe_hot_p ()
 	       && growth > 0)
@@ -610,9 +617,9 @@ want_early_inline_function_p (struct cgraph_edge *e)
 		     xstrdup_for_dump (callee->name ()), callee->order,
 		     growth);
 	  want_inline = false;
 	}
-      else if (growth > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
+      else if (growth > growth_limit)
 	{
 	  if (dump_file)
 	    fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
 		     "growth %i exceeds --param early-inlining-insns\n",
@@ -622,9 +629,9 @@ want_early_inline_function_p (struct cgraph_edge *e)
 		     growth);
 	  want_inline = false;
 	}
       else if ((n = num_calls (callee)) != 0
-	       && growth * (n + 1) > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
+	       && growth * (n + 1) > growth_limit)
 	{
 	  if (dump_file)
 	    fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
 		     "growth %i exceeds --param early-inlining-insns "
diff --git a/gcc/params.def b/gcc/params.def
index 79b7dd4..91ea513 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -199,12 +199,20 @@ DEFPARAM(PARAM_INLINE_UNIT_GROWTH,
 DEFPARAM(PARAM_IPCP_UNIT_GROWTH,
 	 "ipcp-unit-growth",
 	 "How much can given compilation unit grow because of the interprocedural constant propagation (in percent).",
 	 10, 0, 0)
-DEFPARAM(PARAM_EARLY_INLINING_INSNS,
-	 "early-inlining-insns",
-	 "Maximal estimated growth of function body caused by early inlining of single call.",
-	 14, 0, 0)
+DEFPARAM (PARAM_EARLY_INLINING_INSNS_FEEDBACK,
+	  "early-inlining-insns-feedback",
+	  "Maximal estimated growth of function body caused by early "
+	  "inlining of single call.  Used when profile feedback-directed "
+	  "optimizations are enabled.",
+	  2, 0, 0)
+DEFPARAM (PARAM_EARLY_INLINING_INSNS,
+	  "early-inlining-insns",
+	  "Maximal estimated growth of function body caused by early "
+	  "inlining of single call.  Used when profile feedback-directed "
+	  "optimizations are not enabled.",
+	  14, 0, 0)
 DEFPARAM(PARAM_LARGE_STACK_FRAME,
 	 "large-stack-frame",
 	 "The size of stack frame to be considered large.",
 	 256, 0, 0)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH, 5.x/6.x/7.x] Be more conservative in early inliner if FDO is enabled
  2016-09-16 12:00                 ` Yuan, Pengfei
@ 2016-09-16 12:56                   ` Jan Hubicka
  2016-09-20  8:53                     ` Yuan, Pengfei
  2016-09-20 11:44                     ` Richard Biener
  2016-09-26  4:03                   ` Yuan, Pengfei
  2016-10-10  2:23                   ` PING: [PATCH] " Yuan, Pengfei
  2 siblings, 2 replies; 23+ messages in thread
From: Jan Hubicka @ 2016-09-16 12:56 UTC (permalink / raw)
  To: Yuan, Pengfei; +Cc: Jan Hubicka, Richard Biener, GCC Patches

> > > I also like a new param better as it avoids a new magic constant and
> > > makes playing with
> > > it easier (your patch removes the ability to do statistics like you did via the
> > > early-inlining-insns parameter by forcing it to two).
> > 
> > Hmm, you are right that you do not know if this particular function will get
> > profile (forgot about that).  Still, please use two params - it is more
> > consistent with what we have now and we may make it profile specific in
> > future..
> > 
> > Honza
> > > 
> > > Thanks,
> > > Richard.
> 
> A new patch for trunk is attached.
> 
> Regards,
> Yuan, Pengfei
> 
> 
> 2016-09-16  Yuan Pengfei  <ypf@pku.edu.cn>
> 
> 	* doc/invoke.texi (--param early-inlining-insns-feedback): New.
> 	* ipa-inline.c (want_early_inline_function_p): Use
> 	PARAM_EARLY_INLINING_INSNS_FEEDBACK when FDO is enabled.
> 	* params.def (PARAM_EARLY_INLINING_INSNS_FEEDBACK): Define.
> 	(PARAM_EARLY_INLINING_INSNS): Change help string accordingly.

OK,
thanks

Honza

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Re: [PATCH, 5.x/6.x/7.x] Be more conservative in early inliner if FDO is enabled
  2016-09-16 12:56                   ` Jan Hubicka
@ 2016-09-20  8:53                     ` Yuan, Pengfei
  2016-09-20 11:44                     ` Richard Biener
  1 sibling, 0 replies; 23+ messages in thread
From: Yuan, Pengfei @ 2016-09-20  8:53 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Biener, GCC Patches

> > > > I also like a new param better as it avoids a new magic constant and
> > > > makes playing with
> > > > it easier (your patch removes the ability to do statistics like you did via the
> > > > early-inlining-insns parameter by forcing it to two).
> > > 
> > > Hmm, you are right that you do not know if this particular function will get
> > > profile (forgot about that).  Still, please use two params - it is more
> > > consistent with what we have now and we may make it profile specific in
> > > future..
> > > 
> > > Honza
> > > > 
> > > > Thanks,
> > > > Richard.
> > 
> > A new patch for trunk is attached.
> > 
> > Regards,
> > Yuan, Pengfei
> > 
> > 
> > 2016-09-16  Yuan Pengfei  <ypf@pku.edu.cn>
> > 
> > 	* doc/invoke.texi (--param early-inlining-insns-feedback): New.
> > 	* ipa-inline.c (want_early_inline_function_p): Use
> > 	PARAM_EARLY_INLINING_INSNS_FEEDBACK when FDO is enabled.
> > 	* params.def (PARAM_EARLY_INLINING_INSNS_FEEDBACK): Define.
> > 	(PARAM_EARLY_INLINING_INSNS): Change help string accordingly.
> 
> OK,
> thanks
> 
> Honza

Do I have to sign an FSF copyright assignment to get this patch applied?

Regards,
Yuan, Pengfei

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH, 5.x/6.x/7.x] Be more conservative in early inliner if FDO is enabled
  2016-09-16 12:56                   ` Jan Hubicka
  2016-09-20  8:53                     ` Yuan, Pengfei
@ 2016-09-20 11:44                     ` Richard Biener
  2016-09-20 11:58                       ` Richard Biener
  1 sibling, 1 reply; 23+ messages in thread
From: Richard Biener @ 2016-09-20 11:44 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Yuan, Pengfei, GCC Patches

On Fri, Sep 16, 2016 at 2:00 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> > > I also like a new param better as it avoids a new magic constant and
>> > > makes playing with
>> > > it easier (your patch removes the ability to do statistics like you did via the
>> > > early-inlining-insns parameter by forcing it to two).
>> >
>> > Hmm, you are right that you do not know if this particular function will get
>> > profile (forgot about that).  Still, please use two params - it is more
>> > consistent with what we have now and we may make it profile specific in
>> > future..
>> >
>> > Honza
>> > >
>> > > Thanks,
>> > > Richard.
>>
>> A new patch for trunk is attached.
>>
>> Regards,
>> Yuan, Pengfei
>>
>>
>> 2016-09-16  Yuan Pengfei  <ypf@pku.edu.cn>
>>
>>       * doc/invoke.texi (--param early-inlining-insns-feedback): New.
>>       * ipa-inline.c (want_early_inline_function_p): Use
>>       PARAM_EARLY_INLINING_INSNS_FEEDBACK when FDO is enabled.
>>       * params.def (PARAM_EARLY_INLINING_INSNS_FEEDBACK): Define.
>>       (PARAM_EARLY_INLINING_INSNS): Change help string accordingly.

Btw, It occurs to me that then win in code-size might be purely due to the
smaller base value for the TU size we use to compute the maximum unit
growth with ... any idea how to improve it on this side?  Say, computing
the TU size before early optimization (uh, probably not ...)

That said, the inliner always completely fills its budged, that is, increase
the unit by max-unit-growth?

Richard.

> OK,
> thanks
>
> Honza

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH, 5.x/6.x/7.x] Be more conservative in early inliner if FDO is enabled
  2016-09-20 11:44                     ` Richard Biener
@ 2016-09-20 11:58                       ` Richard Biener
  2016-09-20 12:09                         ` Yuan, Pengfei
  2016-09-21 12:44                         ` Yuan, Pengfei
  0 siblings, 2 replies; 23+ messages in thread
From: Richard Biener @ 2016-09-20 11:58 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Yuan, Pengfei, GCC Patches

On Tue, Sep 20, 2016 at 1:31 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Fri, Sep 16, 2016 at 2:00 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>>> > > I also like a new param better as it avoids a new magic constant and
>>> > > makes playing with
>>> > > it easier (your patch removes the ability to do statistics like you did via the
>>> > > early-inlining-insns parameter by forcing it to two).
>>> >
>>> > Hmm, you are right that you do not know if this particular function will get
>>> > profile (forgot about that).  Still, please use two params - it is more
>>> > consistent with what we have now and we may make it profile specific in
>>> > future..
>>> >
>>> > Honza
>>> > >
>>> > > Thanks,
>>> > > Richard.
>>>
>>> A new patch for trunk is attached.
>>>
>>> Regards,
>>> Yuan, Pengfei
>>>
>>>
>>> 2016-09-16  Yuan Pengfei  <ypf@pku.edu.cn>
>>>
>>>       * doc/invoke.texi (--param early-inlining-insns-feedback): New.
>>>       * ipa-inline.c (want_early_inline_function_p): Use
>>>       PARAM_EARLY_INLINING_INSNS_FEEDBACK when FDO is enabled.
>>>       * params.def (PARAM_EARLY_INLINING_INSNS_FEEDBACK): Define.
>>>       (PARAM_EARLY_INLINING_INSNS): Change help string accordingly.
>
> Btw, It occurs to me that then win in code-size might be purely due to the
> smaller base value for the TU size we use to compute the maximum unit
> growth with ... any idea how to improve it on this side?  Say, computing
> the TU size before early optimization (uh, probably not ...)
>
> That said, the inliner always completely fills its budged, that is, increase
> the unit by max-unit-growth?

What I'm trying to say is that rather than limiting early inlining we should
maybe decrease inline-unit-growth when FDO is in effect?  Because we
can better control where the inlining goes.  If there is over 8% reduction
in size benchmarking (unpatched) compiler on Firefox with FDO and
--param inline-unit-growth=12 might show if the results are the same.

Richard.

> Richard.
>
>> OK,
>> thanks
>>
>> Honza

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH, 5.x/6.x/7.x] Be more conservative in early inliner if FDO is enabled
  2016-09-20 11:58                       ` Richard Biener
@ 2016-09-20 12:09                         ` Yuan, Pengfei
  2016-09-20 12:19                           ` Richard Biener
  2016-09-21 12:44                         ` Yuan, Pengfei
  1 sibling, 1 reply; 23+ messages in thread
From: Yuan, Pengfei @ 2016-09-20 12:09 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jan Hubicka, GCC Patches

> > Btw, It occurs to me that then win in code-size might be purely due to the
> > smaller base value for the TU size we use to compute the maximum unit
> > growth with ... any idea how to improve it on this side?  Say, computing
> > the TU size before early optimization (uh, probably not ...)
> >
> > That said, the inliner always completely fills its budged, that is, increase
> > the unit by max-unit-growth?
> 
> What I'm trying to say is that rather than limiting early inlining we should
> maybe decrease inline-unit-growth when FDO is in effect?  Because we
> can better control where the inlining goes.  If there is over 8% reduction
> in size benchmarking (unpatched) compiler on Firefox with FDO and
> --param inline-unit-growth=12 might show if the results are the same.
> 
> Richard.
> 
> > Richard.

AFAIK, param inline-unit-growth only affects pass_ipa_inline. However, the code
size reduction achieved with my patch is from pass_early_inline.

Yuan, Pengfei

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH, 5.x/6.x/7.x] Be more conservative in early inliner if FDO is enabled
  2016-09-20 12:09                         ` Yuan, Pengfei
@ 2016-09-20 12:19                           ` Richard Biener
  2016-09-20 13:10                             ` Yuan, Pengfei
  0 siblings, 1 reply; 23+ messages in thread
From: Richard Biener @ 2016-09-20 12:19 UTC (permalink / raw)
  To: Yuan, Pengfei; +Cc: Jan Hubicka, GCC Patches

On Tue, Sep 20, 2016 at 1:57 PM, Yuan, Pengfei <ypf@pku.edu.cn> wrote:
>> > Btw, It occurs to me that then win in code-size might be purely due to the
>> > smaller base value for the TU size we use to compute the maximum unit
>> > growth with ... any idea how to improve it on this side?  Say, computing
>> > the TU size before early optimization (uh, probably not ...)
>> >
>> > That said, the inliner always completely fills its budged, that is, increase
>> > the unit by max-unit-growth?
>>
>> What I'm trying to say is that rather than limiting early inlining we should
>> maybe decrease inline-unit-growth when FDO is in effect?  Because we
>> can better control where the inlining goes.  If there is over 8% reduction
>> in size benchmarking (unpatched) compiler on Firefox with FDO and
>> --param inline-unit-growth=12 might show if the results are the same.
>>
>> Richard.
>>
>> > Richard.
>
> AFAIK, param inline-unit-growth only affects pass_ipa_inline. However, the code
> size reduction achieved with my patch is from pass_early_inline.

Any inlining you don't do at early inlining time will a) decrease the
size inline-unit-growth
is computed on b) just delays inlining to IPA inlining (with the now
more constrained size budget).

Richard.

> Yuan, Pengfei

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Re: [PATCH, 5.x/6.x/7.x] Be more conservative in early inliner if FDO is enabled
  2016-09-20 12:19                           ` Richard Biener
@ 2016-09-20 13:10                             ` Yuan, Pengfei
  0 siblings, 0 replies; 23+ messages in thread
From: Yuan, Pengfei @ 2016-09-20 13:10 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jan Hubicka, GCC Patches

> >> > Btw, It occurs to me that then win in code-size might be purely due to the
> >> > smaller base value for the TU size we use to compute the maximum unit
> >> > growth with ... any idea how to improve it on this side?  Say, computing
> >> > the TU size before early optimization (uh, probably not ...)
> >> >
> >> > That said, the inliner always completely fills its budged, that is, increase
> >> > the unit by max-unit-growth?
> >>
> >> What I'm trying to say is that rather than limiting early inlining we should
> >> maybe decrease inline-unit-growth when FDO is in effect?  Because we
> >> can better control where the inlining goes.  If there is over 8% reduction
> >> in size benchmarking (unpatched) compiler on Firefox with FDO and
> >> --param inline-unit-growth=12 might show if the results are the same.
> >>
> >> Richard.
> >>
> >> > Richard.
> >
> > AFAIK, param inline-unit-growth only affects pass_ipa_inline. However, the code
> > size reduction achieved with my patch is from pass_early_inline.
> 
> Any inlining you don't do at early inlining time will a) decrease the
> size inline-unit-growth
> is computed on b) just delays inlining to IPA inlining (with the now
> more constrained size budget).

Delaying inlining from early inlining to IPA inlining can save size because profile
feedback is effective at IPA inlining and inlining of cold functions can be avoided.

Otherwise, inlining done at early inlining can not be canceled later.

Yuan, Pengfei

> Richard.
> 
> > Yuan, Pengfei

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH, 5.x/6.x/7.x] Be more conservative in early inliner if FDO is enabled
  2016-09-20 11:58                       ` Richard Biener
  2016-09-20 12:09                         ` Yuan, Pengfei
@ 2016-09-21 12:44                         ` Yuan, Pengfei
  1 sibling, 0 replies; 23+ messages in thread
From: Yuan, Pengfei @ 2016-09-21 12:44 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jan Hubicka, GCC Patches

> > Btw, It occurs to me that then win in code-size might be purely due to the
> > smaller base value for the TU size we use to compute the maximum unit
> > growth with ... any idea how to improve it on this side?  Say, computing
> > the TU size before early optimization (uh, probably not ...)
> >
> > That said, the inliner always completely fills its budged, that is, increase
> > the unit by max-unit-growth?
> 
> What I'm trying to say is that rather than limiting early inlining we should
> maybe decrease inline-unit-growth when FDO is in effect?  Because we
> can better control where the inlining goes.  If there is over 8% reduction
> in size benchmarking (unpatched) compiler on Firefox with FDO and
> --param inline-unit-growth=12 might show if the results are the same.

FYI, with --param inline-unit-growth=12 --param early-inlining-insns=14,
the code size reduction is only 1.1%.

Yuan, Pengfei

> Richard.
> 
> > Richard.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH, 5.x/6.x/7.x] Be more conservative in early inliner if FDO is enabled
  2016-09-16 12:00                 ` Yuan, Pengfei
  2016-09-16 12:56                   ` Jan Hubicka
@ 2016-09-26  4:03                   ` Yuan, Pengfei
  2016-10-10  2:23                   ` PING: [PATCH] " Yuan, Pengfei
  2 siblings, 0 replies; 23+ messages in thread
From: Yuan, Pengfei @ 2016-09-26  4:03 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Biener, GCC Patches

Hi,

May I ask if there is any decision?

Regards,
Yuan, Pengfei

> > > I also like a new param better as it avoids a new magic constant and
> > > makes playing with
> > > it easier (your patch removes the ability to do statistics like you did via the
> > > early-inlining-insns parameter by forcing it to two).
> > 
> > Hmm, you are right that you do not know if this particular function will get
> > profile (forgot about that).  Still, please use two params - it is more
> > consistent with what we have now and we may make it profile specific in
> > future..
> > 
> > Honza
> > > 
> > > Thanks,
> > > Richard.
> 
> A new patch for trunk is attached.
> 
> Regards,
> Yuan, Pengfei
> 
> 
> 2016-09-16  Yuan Pengfei  <ypf@pku.edu.cn>
> 
> 	* doc/invoke.texi (--param early-inlining-insns-feedback): New.
> 	* ipa-inline.c (want_early_inline_function_p): Use
> 	PARAM_EARLY_INLINING_INSNS_FEEDBACK when FDO is enabled.
> 	* params.def (PARAM_EARLY_INLINING_INSNS_FEEDBACK): Define.
> 	(PARAM_EARLY_INLINING_INSNS): Change help string accordingly.
> 
> 
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 8eb5eff..6e7659a 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -9124,12 +9124,18 @@ given call expression.  This parameter limits inlining only to call expressions
>  whose probability exceeds the given threshold (in percents).
>  The default value is 10.
>  
>  @item early-inlining-insns
> +@itemx early-inlining-insns-feedback
>  Specify growth that the early inliner can make.  In effect it increases
>  the amount of inlining for code having a large abstraction penalty.
>  The default value is 14.
>  
> +The @option{early-inlining-insns-feedback} parameter is used only when
> +profile feedback-directed optimizations are enabled (by
> +@option{-fprofile-generate} or @option{-fprofile-use}).
> +The default value is 2.
> +
>  @item max-early-inliner-iterations
>  Limit of iterations of the early inliner.  This basically bounds
>  the number of nested indirect calls the early inliner can resolve.
>  Deeper chains are still handled by late inlining.
> diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
> index 5c9366a..e028c08 100644
> --- a/gcc/ipa-inline.c
> +++ b/gcc/ipa-inline.c
> @@ -594,10 +594,17 @@ want_early_inline_function_p (struct cgraph_edge *e)
>      }
>    else
>      {
>        int growth = estimate_edge_growth (e);
> +      int growth_limit;
>        int n;
>  
> +      if ((profile_arc_flag && !flag_test_coverage)
> +	  || (flag_branch_probabilities && !flag_auto_profile))
> +	growth_limit = PARAM_VALUE (PARAM_EARLY_INLINING_INSNS_FEEDBACK);
> +      else
> +	growth_limit = PARAM_VALUE (PARAM_EARLY_INLINING_INSNS);
> +
>        if (growth <= 0)
>  	;
>        else if (!e->maybe_hot_p ()
>  	       && growth > 0)
> @@ -610,9 +617,9 @@ want_early_inline_function_p (struct cgraph_edge *e)
>  		     xstrdup_for_dump (callee->name ()), callee->order,
>  		     growth);
>  	  want_inline = false;
>  	}
> -      else if (growth > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
> +      else if (growth > growth_limit)
>  	{
>  	  if (dump_file)
>  	    fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
>  		     "growth %i exceeds --param early-inlining-insns\n",
> @@ -622,9 +629,9 @@ want_early_inline_function_p (struct cgraph_edge *e)
>  		     growth);
>  	  want_inline = false;
>  	}
>        else if ((n = num_calls (callee)) != 0
> -	       && growth * (n + 1) > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
> +	       && growth * (n + 1) > growth_limit)
>  	{
>  	  if (dump_file)
>  	    fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
>  		     "growth %i exceeds --param early-inlining-insns "
> diff --git a/gcc/params.def b/gcc/params.def
> index 79b7dd4..91ea513 100644
> --- a/gcc/params.def
> +++ b/gcc/params.def
> @@ -199,12 +199,20 @@ DEFPARAM(PARAM_INLINE_UNIT_GROWTH,
>  DEFPARAM(PARAM_IPCP_UNIT_GROWTH,
>  	 "ipcp-unit-growth",
>  	 "How much can given compilation unit grow because of the interprocedural constant propagation (in percent).",
>  	 10, 0, 0)
> -DEFPARAM(PARAM_EARLY_INLINING_INSNS,
> -	 "early-inlining-insns",
> -	 "Maximal estimated growth of function body caused by early inlining of single call.",
> -	 14, 0, 0)
> +DEFPARAM (PARAM_EARLY_INLINING_INSNS_FEEDBACK,
> +	  "early-inlining-insns-feedback",
> +	  "Maximal estimated growth of function body caused by early "
> +	  "inlining of single call.  Used when profile feedback-directed "
> +	  "optimizations are enabled.",
> +	  2, 0, 0)
> +DEFPARAM (PARAM_EARLY_INLINING_INSNS,
> +	  "early-inlining-insns",
> +	  "Maximal estimated growth of function body caused by early "
> +	  "inlining of single call.  Used when profile feedback-directed "
> +	  "optimizations are not enabled.",
> +	  14, 0, 0)
>  DEFPARAM(PARAM_LARGE_STACK_FRAME,
>  	 "large-stack-frame",
>  	 "The size of stack frame to be considered large.",
>  	 256, 0, 0)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* PING: [PATCH] Be more conservative in early inliner if FDO is enabled
  2016-09-16 12:00                 ` Yuan, Pengfei
  2016-09-16 12:56                   ` Jan Hubicka
  2016-09-26  4:03                   ` Yuan, Pengfei
@ 2016-10-10  2:23                   ` Yuan, Pengfei
  2016-10-10  9:55                     ` Richard Biener
  2 siblings, 1 reply; 23+ messages in thread
From: Yuan, Pengfei @ 2016-10-10  2:23 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Biener, GCC Patches

Hi,

What is the decision on this patch?
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg01041.html

Regards,
Yuan, Pengfei

> A new patch for trunk is attached.
> 
> Regards,
> Yuan, Pengfei
> 
> 
> 2016-09-16  Yuan Pengfei  <ypf@pku.edu.cn>
> 
> 	* doc/invoke.texi (--param early-inlining-insns-feedback): New.
> 	* ipa-inline.c (want_early_inline_function_p): Use
> 	PARAM_EARLY_INLINING_INSNS_FEEDBACK when FDO is enabled.
> 	* params.def (PARAM_EARLY_INLINING_INSNS_FEEDBACK): Define.
> 	(PARAM_EARLY_INLINING_INSNS): Change help string accordingly.
> 
> 
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 8eb5eff..6e7659a 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -9124,12 +9124,18 @@ given call expression.  This parameter limits inlining only to call expressions
>  whose probability exceeds the given threshold (in percents).
>  The default value is 10.
>  
>  @item early-inlining-insns
> +@itemx early-inlining-insns-feedback
>  Specify growth that the early inliner can make.  In effect it increases
>  the amount of inlining for code having a large abstraction penalty.
>  The default value is 14.
>  
> +The @option{early-inlining-insns-feedback} parameter is used only when
> +profile feedback-directed optimizations are enabled (by
> +@option{-fprofile-generate} or @option{-fprofile-use}).
> +The default value is 2.
> +
>  @item max-early-inliner-iterations
>  Limit of iterations of the early inliner.  This basically bounds
>  the number of nested indirect calls the early inliner can resolve.
>  Deeper chains are still handled by late inlining.
> diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
> index 5c9366a..e028c08 100644
> --- a/gcc/ipa-inline.c
> +++ b/gcc/ipa-inline.c
> @@ -594,10 +594,17 @@ want_early_inline_function_p (struct cgraph_edge *e)
>      }
>    else
>      {
>        int growth = estimate_edge_growth (e);
> +      int growth_limit;
>        int n;
>  
> +      if ((profile_arc_flag && !flag_test_coverage)
> +	  || (flag_branch_probabilities && !flag_auto_profile))
> +	growth_limit = PARAM_VALUE (PARAM_EARLY_INLINING_INSNS_FEEDBACK);
> +      else
> +	growth_limit = PARAM_VALUE (PARAM_EARLY_INLINING_INSNS);
> +
>        if (growth <= 0)
>  	;
>        else if (!e->maybe_hot_p ()
>  	       && growth > 0)
> @@ -610,9 +617,9 @@ want_early_inline_function_p (struct cgraph_edge *e)
>  		     xstrdup_for_dump (callee->name ()), callee->order,
>  		     growth);
>  	  want_inline = false;
>  	}
> -      else if (growth > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
> +      else if (growth > growth_limit)
>  	{
>  	  if (dump_file)
>  	    fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
>  		     "growth %i exceeds --param early-inlining-insns\n",
> @@ -622,9 +629,9 @@ want_early_inline_function_p (struct cgraph_edge *e)
>  		     growth);
>  	  want_inline = false;
>  	}
>        else if ((n = num_calls (callee)) != 0
> -	       && growth * (n + 1) > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
> +	       && growth * (n + 1) > growth_limit)
>  	{
>  	  if (dump_file)
>  	    fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
>  		     "growth %i exceeds --param early-inlining-insns "
> diff --git a/gcc/params.def b/gcc/params.def
> index 79b7dd4..91ea513 100644
> --- a/gcc/params.def
> +++ b/gcc/params.def
> @@ -199,12 +199,20 @@ DEFPARAM(PARAM_INLINE_UNIT_GROWTH,
>  DEFPARAM(PARAM_IPCP_UNIT_GROWTH,
>  	 "ipcp-unit-growth",
>  	 "How much can given compilation unit grow because of the interprocedural constant propagation (in percent).",
>  	 10, 0, 0)
> -DEFPARAM(PARAM_EARLY_INLINING_INSNS,
> -	 "early-inlining-insns",
> -	 "Maximal estimated growth of function body caused by early inlining of single call.",
> -	 14, 0, 0)
> +DEFPARAM (PARAM_EARLY_INLINING_INSNS_FEEDBACK,
> +	  "early-inlining-insns-feedback",
> +	  "Maximal estimated growth of function body caused by early "
> +	  "inlining of single call.  Used when profile feedback-directed "
> +	  "optimizations are enabled.",
> +	  2, 0, 0)
> +DEFPARAM (PARAM_EARLY_INLINING_INSNS,
> +	  "early-inlining-insns",
> +	  "Maximal estimated growth of function body caused by early "
> +	  "inlining of single call.  Used when profile feedback-directed "
> +	  "optimizations are not enabled.",
> +	  14, 0, 0)
>  DEFPARAM(PARAM_LARGE_STACK_FRAME,
>  	 "large-stack-frame",
>  	 "The size of stack frame to be considered large.",
>  	 256, 0, 0)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: PING: [PATCH] Be more conservative in early inliner if FDO is enabled
  2016-10-10  2:23                   ` PING: [PATCH] " Yuan, Pengfei
@ 2016-10-10  9:55                     ` Richard Biener
  2016-10-10 10:52                       ` Yuan, Pengfei
  0 siblings, 1 reply; 23+ messages in thread
From: Richard Biener @ 2016-10-10  9:55 UTC (permalink / raw)
  To: Yuan, Pengfei; +Cc: Jan Hubicka, GCC Patches

On Mon, Oct 10, 2016 at 4:23 AM, Yuan, Pengfei <ypf@pku.edu.cn> wrote:
> Hi,
>
> What is the decision on this patch?
> https://gcc.gnu.org/ml/gcc-patches/2016-09/msg01041.html

Honza approved the patch already.

Richard.

> Regards,
> Yuan, Pengfei
>
>> A new patch for trunk is attached.
>>
>> Regards,
>> Yuan, Pengfei
>>
>>
>> 2016-09-16  Yuan Pengfei  <ypf@pku.edu.cn>
>>
>>       * doc/invoke.texi (--param early-inlining-insns-feedback): New.
>>       * ipa-inline.c (want_early_inline_function_p): Use
>>       PARAM_EARLY_INLINING_INSNS_FEEDBACK when FDO is enabled.
>>       * params.def (PARAM_EARLY_INLINING_INSNS_FEEDBACK): Define.
>>       (PARAM_EARLY_INLINING_INSNS): Change help string accordingly.
>>
>>
>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> index 8eb5eff..6e7659a 100644
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -9124,12 +9124,18 @@ given call expression.  This parameter limits inlining only to call expressions
>>  whose probability exceeds the given threshold (in percents).
>>  The default value is 10.
>>
>>  @item early-inlining-insns
>> +@itemx early-inlining-insns-feedback
>>  Specify growth that the early inliner can make.  In effect it increases
>>  the amount of inlining for code having a large abstraction penalty.
>>  The default value is 14.
>>
>> +The @option{early-inlining-insns-feedback} parameter is used only when
>> +profile feedback-directed optimizations are enabled (by
>> +@option{-fprofile-generate} or @option{-fprofile-use}).
>> +The default value is 2.
>> +
>>  @item max-early-inliner-iterations
>>  Limit of iterations of the early inliner.  This basically bounds
>>  the number of nested indirect calls the early inliner can resolve.
>>  Deeper chains are still handled by late inlining.
>> diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
>> index 5c9366a..e028c08 100644
>> --- a/gcc/ipa-inline.c
>> +++ b/gcc/ipa-inline.c
>> @@ -594,10 +594,17 @@ want_early_inline_function_p (struct cgraph_edge *e)
>>      }
>>    else
>>      {
>>        int growth = estimate_edge_growth (e);
>> +      int growth_limit;
>>        int n;
>>
>> +      if ((profile_arc_flag && !flag_test_coverage)
>> +       || (flag_branch_probabilities && !flag_auto_profile))
>> +     growth_limit = PARAM_VALUE (PARAM_EARLY_INLINING_INSNS_FEEDBACK);
>> +      else
>> +     growth_limit = PARAM_VALUE (PARAM_EARLY_INLINING_INSNS);
>> +
>>        if (growth <= 0)
>>       ;
>>        else if (!e->maybe_hot_p ()
>>              && growth > 0)
>> @@ -610,9 +617,9 @@ want_early_inline_function_p (struct cgraph_edge *e)
>>                    xstrdup_for_dump (callee->name ()), callee->order,
>>                    growth);
>>         want_inline = false;
>>       }
>> -      else if (growth > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
>> +      else if (growth > growth_limit)
>>       {
>>         if (dump_file)
>>           fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
>>                    "growth %i exceeds --param early-inlining-insns\n",
>> @@ -622,9 +629,9 @@ want_early_inline_function_p (struct cgraph_edge *e)
>>                    growth);
>>         want_inline = false;
>>       }
>>        else if ((n = num_calls (callee)) != 0
>> -            && growth * (n + 1) > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
>> +            && growth * (n + 1) > growth_limit)
>>       {
>>         if (dump_file)
>>           fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
>>                    "growth %i exceeds --param early-inlining-insns "
>> diff --git a/gcc/params.def b/gcc/params.def
>> index 79b7dd4..91ea513 100644
>> --- a/gcc/params.def
>> +++ b/gcc/params.def
>> @@ -199,12 +199,20 @@ DEFPARAM(PARAM_INLINE_UNIT_GROWTH,
>>  DEFPARAM(PARAM_IPCP_UNIT_GROWTH,
>>        "ipcp-unit-growth",
>>        "How much can given compilation unit grow because of the interprocedural constant propagation (in percent).",
>>        10, 0, 0)
>> -DEFPARAM(PARAM_EARLY_INLINING_INSNS,
>> -      "early-inlining-insns",
>> -      "Maximal estimated growth of function body caused by early inlining of single call.",
>> -      14, 0, 0)
>> +DEFPARAM (PARAM_EARLY_INLINING_INSNS_FEEDBACK,
>> +       "early-inlining-insns-feedback",
>> +       "Maximal estimated growth of function body caused by early "
>> +       "inlining of single call.  Used when profile feedback-directed "
>> +       "optimizations are enabled.",
>> +       2, 0, 0)
>> +DEFPARAM (PARAM_EARLY_INLINING_INSNS,
>> +       "early-inlining-insns",
>> +       "Maximal estimated growth of function body caused by early "
>> +       "inlining of single call.  Used when profile feedback-directed "
>> +       "optimizations are not enabled.",
>> +       14, 0, 0)
>>  DEFPARAM(PARAM_LARGE_STACK_FRAME,
>>        "large-stack-frame",
>>        "The size of stack frame to be considered large.",
>>        256, 0, 0)
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: PING: [PATCH] Be more conservative in early inliner if FDO is enabled
  2016-10-10  9:55                     ` Richard Biener
@ 2016-10-10 10:52                       ` Yuan, Pengfei
  0 siblings, 0 replies; 23+ messages in thread
From: Yuan, Pengfei @ 2016-10-10 10:52 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jan Hubicka, GCC Patches

> On Mon, Oct 10, 2016 at 4:23 AM, Yuan, Pengfei <ypf@pku.edu.cn> wrote:
> > Hi,
> >
> > What is the decision on this patch?
> > https://gcc.gnu.org/ml/gcc-patches/2016-09/msg01041.html
> 
> Honza approved the patch already.
> 
> Richard.

Do I need to sign a copyright assignment for the patch?
Moreover, I do not have the permission to commit it.

Regards,
Yuan, Pengfei

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2016-10-10 10:52 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-10  6:41 [PATCH, 5.x/6.x/7.x] Be more conservative in early inliner if FDO is enabled Yuan, Pengfei
2016-09-14 10:35 ` Richard Biener
2016-09-15  3:17   ` Yuan, Pengfei
2016-09-15  8:44     ` Richard Biener
2016-09-15 10:08       ` Yuan, Pengfei
2016-09-16  5:52       ` Yuan, Pengfei
2016-09-16  8:09         ` Jan Hubicka
2016-09-16  9:01           ` Yuan, Pengfei
2016-09-16  9:21             ` Richard Biener
2016-09-16  9:37               ` Jan Hubicka
2016-09-16 12:00                 ` Yuan, Pengfei
2016-09-16 12:56                   ` Jan Hubicka
2016-09-20  8:53                     ` Yuan, Pengfei
2016-09-20 11:44                     ` Richard Biener
2016-09-20 11:58                       ` Richard Biener
2016-09-20 12:09                         ` Yuan, Pengfei
2016-09-20 12:19                           ` Richard Biener
2016-09-20 13:10                             ` Yuan, Pengfei
2016-09-21 12:44                         ` Yuan, Pengfei
2016-09-26  4:03                   ` Yuan, Pengfei
2016-10-10  2:23                   ` PING: [PATCH] " Yuan, Pengfei
2016-10-10  9:55                     ` Richard Biener
2016-10-10 10:52                       ` Yuan, Pengfei

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).