Re: [PATCH] Add capability to run several iterations of early optimizations

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Maxim Kuvyrkov <maxim@codesourcery.com>
To: Richard Guenther <richard.guenther@gmail.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>, Matt <matt@use.net>
Subject: Re: [PATCH] Add capability to run several iterations of early optimizations
Date: Fri, 28 Oct 2011 23:07:00 -0000	[thread overview]
Message-ID: <AE6C3BCE-4EE9-478A-BABA-0D52CB7C3003@codesourcery.com> (raw)
In-Reply-To: <CAFiYyc1AHg7u0VA2XiZ50ozY6=BaPQx7Dct9-7M1rUzdgNj9Lw@mail.gmail.com>

On 28/10/2011, at 11:43 PM, Richard Guenther wrote:

> On Fri, Oct 28, 2011 at 1:05 AM, Maxim Kuvyrkov <maxim@codesourcery.com> wrote:
>> 
>> OK for trunk?
> 
> diff --git a/gcc/cgraph.c b/gcc/cgraph.c
> index f056d3d..4738b28 100644
> --- a/gcc/cgraph.c
> +++ b/gcc/cgraph.c
> @@ -2416,7 +2416,7 @@ cgraph_add_new_function (tree fndecl, bool lowered)
>            tree_lowering_passes (fndecl);
>            bitmap_obstack_initialize (NULL);
>            if (!gimple_in_ssa_p (DECL_STRUCT_FUNCTION (fndecl)))
> -             execute_pass_list (pass_early_local_passes.pass.sub);
> +             execute_early_local_passes_for_current_function ();
>            bitmap_obstack_release (NULL);
>            pop_cfun ();
>            current_function_decl = NULL;
> @@ -2441,7 +2441,7 @@ cgraph_add_new_function (tree fndecl, bool lowered)
>        gimple_register_cfg_hooks ();
>        bitmap_obstack_initialize (NULL);
>        if (!gimple_in_ssa_p (DECL_STRUCT_FUNCTION (fndecl)))
> -         execute_pass_list (pass_early_local_passes.pass.sub);
> +         execute_early_local_passes_for_current_function ();
> 
> I think these should only execute the lowering pieces of early local passes,
> let me see if that's properly split ...
> 
> @@ -255,7 +255,7 @@ cgraph_process_new_functions (void)
>              /* When not optimizing, be sure we run early local passes anyway
>                 to expand OMP.  */
>              || !optimize)
> -           execute_pass_list (pass_early_local_passes.pass.sub);
> +           execute_early_local_passes_for_current_function ();
> 
> similar.

In the case of tree_lowering_passes, I agree.  But with the calls to early_local_passes from cgraph_add_new_function/cgraph_process_new_functions we ought to run the full list of optimizations to make bodies of new functions ready for inlining to existing functions.  Note, cgraph_add_new_function() explicitly calls tree_lowering_passes and then invokes early_local_passes.

What do you think?  If we want to pursue this, it should be in a separate patch anyway.

> 
> About all this -suffix stuff, I'd like to have the iterations simply re-use
> the existing dump-files, thus statically sub-divide pass_early_local_passes
> like

I considered this when I was implementing and debugging iterative support, and it is better to keep dumps for main and iterative parts separate.  These are two different IPA passes, and their sub-passes should use separate dumps.

Given that the set of sub-passes of early_local_passes_main is substantially different from the set of sub-passes of early_local_passes_iter (the former has a bunch of init/expand/lower passes in it), mixing the dumps will be mighty confusing.  Without knowing the exact ordering of passes one will not be able to construct a chronological picture of optimizations.  Separating dumps of sub-passes of different IPA passes simplifies understanding of optimization ordering.

> 
> NEXT_PASS (pass_early_local_lowering_passes);
>  {
>      NEXT_PASS (pass_fixup_cfg);
>      NEXT_PASS (pass_init_datastructures);
>      NEXT_PASS (pass_expand_omp);
> 
>      NEXT_PASS (pass_referenced_vars);
>      NEXT_PASS (pass_build_ssa);
>      NEXT_PASS (pass_lower_vector);
>      NEXT_PASS (pass_early_warn_uninitialized);
>  }
> /* The following you maybe iterate.  */
> NEXT_PASS (pass_early_local_optimizations);
>  {
>      NEXT_PASS (pass_rebuild_cgraph_edges);
>      NEXT_PASS (pass_inline_parameters);
>      NEXT_PASS (pass_early_inline);
>      NEXT_PASS (pass_all_early_optimizations);
>        {
>          struct opt_pass **p = &pass_all_early_optimizations.pass.sub;
>          NEXT_PASS (pass_remove_cgraph_callee_edges);
>          NEXT_PASS (pass_rename_ssa_copies);
>          NEXT_PASS (pass_ccp);
>          NEXT_PASS (pass_forwprop);
>          /* pass_build_ealias is a dummy pass that ensures that we
>             execute TODO_rebuild_alias at this point.  Re-building
>             alias information also rewrites no longer addressed
>             locals into SSA form if possible.  */
>          NEXT_PASS (pass_build_ealias);
>          NEXT_PASS (pass_sra_early);
>          NEXT_PASS (pass_fre);
>          NEXT_PASS (pass_copy_prop);
>          NEXT_PASS (pass_merge_phi);
>          NEXT_PASS (pass_cd_dce);
>          NEXT_PASS (pass_early_ipa_sra);
>          NEXT_PASS (pass_tail_recursion);
>          NEXT_PASS (pass_convert_switch);
>          NEXT_PASS (pass_cleanup_eh);
>          NEXT_PASS (pass_profile);
>          NEXT_PASS (pass_local_pure_const);
>       }
>   }
> NEXT_PASS (pass_late_early_local_optimizations)
>  {
>          /* Split functions creates parts that are not run through
>             early optimizations again.  It is thus good idea to do this
>             late.  */
>          NEXT_PASS (pass_split_functions);
>      NEXT_PASS (pass_release_ssa_names);
>      NEXT_PASS (pass_rebuild_cgraph_edges);
>      NEXT_PASS (pass_inline_parameters);
>  }

This simple division into 3 stages will not work.  On one hand, one needs to add fixup/cleanup passes at the beginning and end of every SIMPLE_IPA pass; without those cgraph_verify() gets upset very easily.  On the other hand, when iterative optimizations are not enabled, and there is no need to decouple main and late parts, these additional fixup/cleanup passes would be unnecessary overhead.

Also, in the above division you effectively make 3 separate SIMPLE_IPA passes, and I don't know what the effect of that will be.

The pass formation and scheduling in the patch I posted may not look as pretty as what you describe above, but it is reasonably clean and easy to maintain.  One needs to change only passes in the main part, and those changes will be automatically picked up in iterative and late parts.

> 
> +++ b/gcc/toplev.c
> @@ -1228,7 +1228,6 @@ general_init (const char *argv0)
>   /* This must be done after global_init_params but before argument
>      processing.  */
>   init_ggc_heuristics();
> -  init_optimization_passes ();
>   statistics_early_init ();
>   finish_params ();
> }
> @@ -1989,6 +1988,8 @@ toplev_main (int argc, char **argv)
>                  save_decoded_options, save_decoded_options_count,
>                  UNKNOWN_LOCATION, global_dc);
> 
> +  init_optimization_passes ();
> +
>   handle_common_deferred_options ();
> 
>   init_local_tick ();
> 
> any reason for this?

Yes, this moves init_optimization_passes() after processing of --param options.  This is needed to be able to use PARAM_VALUE in passes initialization.

> 
> +  /* Don't recurse or wonder on to the next pass when running
> +     execute_ipa_pass_list below.  */
> +  current->execute = NULL;
> +  current->next = NULL;
> 
> that looks awkward ;)  Shouldn't instead
> 
> +    execute_ipa_pass_list (current);
> 
> not do what it does?  Thus, split that into two pieces, one that we
> can iterate w/o the fiddling with current->?

The choice is between the above and duplicating post-processing logic of execute_ipa_pass_list (i.e., call to cgraph_process_new_functions) in the iteration loop.  I agree that the above is a tad awkward, but it is cleaner than copy-paste of the final part of execute_ipa_pass_list.

> 
> I like this variant a lot better than the last one - still it lacks any
> analysis-based justification for iteration (see my reply to Matt on
> what I discussed with Honza).

Yes, having a way to tell whether a function have significantly changed would be awesome.  My approach here would be to make inline_parameters output feedback of how much the size/time metrics have changed for a function since previous run.  If the change is above X%, then queue functions callers for more optimizations.  Similarly, Martin's rebuild_cgraph_edges_and_devirt (when that goes into trunk) could queue new direct callees and current function for another iteration if new direct edges were resolved.

But such analysis is way out of scope of this patch, which adds infrastructure for iterative optimizations to be possible and reliable.

>  Thus, I don't think we want to
> merge this in its current form or in this stage1.

What is the benefit of pushing this to a later release?  If anything, merging the support for iterative optimizations now will allow us to consider adding the wonderful smartness to it later.  In the meantime, substituting that smartness with a knob is still a great alternative.

Thank you,

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics

next prev parent reply	other threads:[~2011-10-28 22:22 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-10-12  8:14 Maxim Kuvyrkov
2011-10-12 12:15 ` Richard Guenther
2011-10-18  3:00   ` Maxim Kuvyrkov
2011-10-18  9:09     ` Richard Guenther
2011-10-27 23:29       ` Maxim Kuvyrkov
2011-10-28 11:12         ` Richard Guenther
2011-10-28 23:07           ` Maxim Kuvyrkov [this message]
2011-10-29  0:10             ` Matt
2011-11-01 20:48               ` Martin Jambor
2011-11-01 21:33               ` Richard Guenther
2011-11-08  7:23                 ` Maxim Kuvyrkov
2011-11-08 11:18                   ` Richard Guenther
2011-10-27 22:47 Matt
2011-10-28 10:01 ` Richard Guenther
2011-10-28 22:30   ` Matt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=AE6C3BCE-4EE9-478A-BABA-0D52CB7C3003@codesourcery.com \
    --to=maxim@codesourcery.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=matt@use.net \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).