Re: Inliner heuristics facelift

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* Re: Inliner heuristics facelift
@ 2011-04-17 12:07 Dominique Dhumieres
  2011-04-17 13:35 ` Jan Hubicka
  2011-04-17 14:29 ` Jan Hubicka
  0 siblings, 2 replies; 9+ messages in thread
From: Dominique Dhumieres @ 2011-04-17 12:07 UTC (permalink / raw)
  To: hubicka; +Cc: richard.guenther, gcc-patches

Honza,

Your patch seems to make --param max-inline-insns-auto= ineffective:

gfc -Ofast -funroll-loops -ftree-loop-linear -fomit-frame-pointer --param max-inline-insns-auto=4000 -fwhole-program -fstack-arrays ac.f90
8.105u 0.005s 0:08.11 99.8%	0+0k 0+5io 0pf+0w

while the timing was ~2.7s with a 125 threshold before the patch.

Cheers,

Dominique

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Inliner heuristics facelift
  2011-04-17 12:07 Inliner heuristics facelift Dominique Dhumieres
@ 2011-04-17 13:35 ` Jan Hubicka
  2011-04-17 14:58   ` Dominique Dhumieres
  2011-04-17 14:29 ` Jan Hubicka
  1 sibling, 1 reply; 9+ messages in thread
From: Jan Hubicka @ 2011-04-17 13:35 UTC (permalink / raw)
  To: Dominique Dhumieres; +Cc: hubicka, richard.guenther, gcc-patches

> Honza,
> 
> Your patch seems to make --param max-inline-insns-auto= ineffective:
> 
> gfc -Ofast -funroll-loops -ftree-loop-linear -fomit-frame-pointer --param max-inline-insns-auto=4000 -fwhole-program -fstack-arrays ac.f90
> 8.105u 0.005s 0:08.11 99.8%	0+0k 0+5io 0pf+0w
> 
> while the timing was ~2.7s with a 125 threshold before the patch.

For me, I get everything inlined except for:
  not inlinable: main/5 -> MAIN__/4, --param large-stack-frame-growth limit reached
and indeed in my program there is only main and MAIN function.  You may try if
increasing large-stack-frame-grwoth makes the regression go away, but in a way

I doubt so, since main() is:
main (integer(kind=4) argc, character(kind=1) * * argv)
{ 
  static integer(kind=4) options.90[8] = {68, 511, 0, 0, 0, 1, 0, 1};

<bb 2>:
  _gfortran_set_args (argc_1(D), argv_2(D));
  _gfortran_set_options (8, &options.90[0]);
  MAIN__ ();
  return 0;

}
and inlining into it is not most fruitable thing anyway.

--param max-inline-insns-auto needed for me is also way bellow 125.  If I believe
right, the function is estimated to 94 with 14 of benefit, so it should be about
80.

Any idea what makes the regression for you? Can you confirm that you also get
everything inlined? I can't easilly reproduce your result since I don't have
graphite enabled setup.

I think I should add -Winline-all that will report
reasons for not inlining functions that are not declared inline, too, so these
things are easier to look at.

Honza
> 
> Cheers,
> 
> Dominique

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Inliner heuristics facelift
  2011-04-17 13:35 ` Jan Hubicka
@ 2011-04-17 14:58   ` Dominique Dhumieres
  2011-04-17 15:29     ` Jan Hubicka
  0 siblings, 1 reply; 9+ messages in thread
From: Dominique Dhumieres @ 2011-04-17 14:58 UTC (permalink / raw)
  To: hubicka, dominiq; +Cc: richard.guenther, hubicka, gcc-patches

> For me, I get everything inlined except for:

Oops! looking at my mail, I realized that I did not timed fatigue, but ac.
With the patch the only difference for fatigue is that the threshold for
full inlining decreased from 125 to 113.

Sorry for the noise.

Dominique

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Inliner heuristics facelift
  2011-04-17 14:58   ` Dominique Dhumieres
@ 2011-04-17 15:29     ` Jan Hubicka
  0 siblings, 0 replies; 9+ messages in thread
From: Jan Hubicka @ 2011-04-17 15:29 UTC (permalink / raw)
  To: Dominique Dhumieres; +Cc: hubicka, richard.guenther, gcc-patches

> > For me, I get everything inlined except for:
> 
> Oops! looking at my mail, I realized that I did not timed fatigue, but ac.
> With the patch the only difference for fatigue is that the threshold for
> full inlining decreased from 125 to 113.

See, we make progress. 10 more patches like this and you can set threshold to 0 ;)

This is expected behaviour - the patch changes interpretation of inline bounds
from limits on function sizes on limits of estimated function sizes after
inlining. Those 12 instructions are accounting function call overhead.

Honza
> 
> Sorry for the noise.
> 
> Dominique

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Inliner heuristics facelift
  2011-04-17 12:07 Inliner heuristics facelift Dominique Dhumieres
  2011-04-17 13:35 ` Jan Hubicka
@ 2011-04-17 14:29 ` Jan Hubicka
  2011-04-19 13:40   ` H.J. Lu
  1 sibling, 1 reply; 9+ messages in thread
From: Jan Hubicka @ 2011-04-17 14:29 UTC (permalink / raw)
  To: Dominique Dhumieres; +Cc: hubicka, richard.guenther, gcc-patches

Hi,
there is now tracking PR48636 for fortran inlining issues.  I would suggest
reporting problems like this there or PR that blocks it instead of sending it
to different threads so we have track of them.  It is quite important to get
some understanding what sort of issues we have here.  We might want to collect
some sort of inlining benchmark like we have for C++.  For the moment I think
the polyhedron we have is working well, but just having a summary what
benchmarks needs what --param tweaks and why (i.e. what is important to inline)
would be useful.

While final testing before comming I noticed that last minue gcc_assert added
to can_inline checking that gimple stmt and edge cannot_inline_flags are in sync
(they must be, otherwise we would do invalid optimizations at IPA) is failing
on one of C++ testcases.

The problem is code in tree-inline that incorrectly handles the flag on
functions that do have mismatching EH personalities.

This is final version of patch I comitted that fixes this by moving all the logic
scattered across tree-inline.c into can_inline_edge_p where it belongs and adding
corresponding CIF codes.  It also turned out that our all_set_cannot_inline are
not quite up to date after devirtualization and other transforms.  As first cut
i decided to move the computation into inline summary computation and I think
i can remove it from gimplifier and value profiler in followup patch.

Bootstrapped/regtested x86_64-linux, comitted.

Index: ChangeLog
===================================================================
*** ChangeLog	(revision 172608)
--- ChangeLog	(working copy)
***************
*** 1,3 ****
--- 1,74 ----
+ 2011-04-17  Jan Hubicka  <jh@suse.cz>
+ 
+ 	* lto-symtab.c (lto_cgraph_replace_node): When call statement is
+ 	present, also set gimple_call_set_cannot_inline.
+ 	* ipa-inline.c: Update toplevel comment.
+ 	(MAX_TIME): Remove.
+ 	(cgraph_clone_inlined_nodes): Fix linebreaks.
+ 	(cgraph_check_inline_limits): Restructure to ...
+ 	(caller_growth_limits): ... this one; be more tolerant
+ 	on growth in nested inline chains; add explanatory comment;
+ 	fix stack accounting thinko introduced by previous patch.
+ 	(cgraph_default_inline_p): Remove.
+ 	(report_inline_failed_reason): New function.
+ 	(can_inline_edge_p): New function.
+ 	(can_early_inline_edge_p): New function.
+ 	(leaf_node_p): Move upwards in file.
+ 	(want_early_inline_function_p): New function.
+ 	(want_inline_small_function_p): New function.
+ 	(want_inline_self_recursive_call_p): New function.
+ 	(cgraph_edge_badness): Rename to ...
+ 	(edge_badness) ... this one; fix linebreaks.
+ 	(update_edge_key): Update call of edge_baddness; add
+ 	detailed dump about queue updates.
+ 	(update_caller_keys): Use can_inline_edge_p and
+ 	want_inline_small_function_p.
+ 	(cgraph_decide_recursive_inlining): Rename to...
+ 	(recursive_inlining): Use can_inline_edge_p and
+ 	want_inline_self_recursive_call_p; simplify and
+ 	remove no longer valid FIXME.
+ 	(cgraph_set_inline_failed): Remove.
+ 	(add_new_edges_to_heap): Use can_inline_edge_p and
+ 	want_inline_small_function_p.
+ 	(cgraph_decide_inlining_of_small_functions): Rename to ...
+ 	(inline_small_functions): ... this one; cleanup; use
+ 	can/want predicates; cleanup debug ouput; work edges
+ 	till fibheap is exhausted and do not stop once unit
+ 	growth is reached; remove later loop processing remaining
+ 	edges.
+ 	(cgraph_flatten): Rename to ...
+ 	(flatten_function): ... this one; use can_inline_edge_p
+ 	and can_early_inline_edge_p predicates.
+ 	(cgraph_decide_inlining): Rename to ...
+ 	(ipa_inline): ... this one; remove unreachable nodes before
+ 	inlining functions called once; simplify the pass.
+ 	(cgraph_perform_always_inlining): Rename to ...
+ 	(inline_always_inline_functions): ... this one; use
+ 	DECL_DISREGARD_INLINE_LIMITS; use can_inline_edge_p
+ 	predicate
+ 	(cgraph_decide_inlining_incrementally): Rename to ...
+ 	(early_inline_small_functions): ... this one; simplify
+ 	using new predicates; cleanup; make dumps prettier.
+ 	(cgraph_early_inlining): Rename to ...
+ 	(early_inliner): newer inline regular functions into always-inlines;
+ 	fix updating of call stmt summaries.
+ 	(pass_early_inline): Update for new names.
+ 	(inline_transform): Fix formating.
+ 	(gate_cgraph_decide_inlining): Rename to ...
+ 	(pass_ipa_inline): ... this one.
+ 	* ipa-inline.h (inline_summary): Remove disregard_inline_limits.
+ 	* ipa-inline-analysis.c (dump_inline_summary): Update.
+ 	(compute_inline_parameters): Do not compute disregard_inline_limits;
+ 	look for mismatching arguments.
+ 	(estimate_growth): Fix handlig of non-trivial self recursion.
+ 	(inline_read_summary): Do not read info->disregard_inline_limits.
+ 	(inline_write_summary): Do not write info->disregard_inline_limits.
+ 	* tree-inline.c (inline_forbidden_into_p, tree_can_inline_p): Remove and
+ 	move all checks into can_inline_edge_p predicate; re-enable code comparing
+ 	optimization levels.
+ 	(expand_call_inline): Do not test inline_forbidden_into_p.
+ 	* Makefile.in (ipa-inline.o): Update arguments.
+ 
  2011-04-17  Revital Eres  <revital.eres@linaro.org>
  
  	* ddg.c (free_ddg_all_sccs): Free sccs field in struct ddg_all_sccs.
Index: testsuite/gcc.dg/winline-5.c
===================================================================
*** testsuite/gcc.dg/winline-5.c	(revision 172597)
--- testsuite/gcc.dg/winline-5.c	(working copy)
*************** inline int q(void) /* { dg-warning "inli
*** 15,29 ****
  	big();
  	big();
  }
- inline int q1(void)
- {
- 	big();
- 	big();
- 	big();
- }
  int t (void)
  {
-  /* We allow one inlining over limit.  */
- 	q1();
  	return q ();		 /* { dg-warning "called from here" } */
  }
--- 15,21 ----
Index: testsuite/ChangeLog
===================================================================
*** testsuite/ChangeLog	(revision 172597)
--- testsuite/ChangeLog	(working copy)
***************
*** 1,3 ****
--- 1,7 ----
+ 2011-04-17  Jan Hubicka  <jh@suse.cz>
+ 
+ 	* gcc.dg/winline-5.c: Update testcase.
+ 
  2011-04-16  Uros Bizjak  <ubizjak@gmail.com>
  
  	* gcc.target/i386/avx256-unaligned-load-1.c: Update scan patterns.
Index: cif-code.def
===================================================================
*** cif-code.def	(revision 172597)
--- cif-code.def	(working copy)
*************** DEFCIFCODE(OPTIMIZING_FOR_SIZE,
*** 79,84 ****
--- 79,86 ----
  
  /* Inlining failed because of mismatched options or arguments.  */
  DEFCIFCODE(TARGET_OPTION_MISMATCH, N_("target specific option mismatch"))
+ DEFCIFCODE(TARGET_OPTIMIZATION_MISMATCH,
+ 	   N_("optimization level attribute mismatch"))
  DEFCIFCODE(MISMATCHED_ARGUMENTS, N_("mismatched arguments"))
  
  /* Call was originally indirect.  */
*************** DEFCIFCODE(ORIGINALLY_INDIRECT_CALL,
*** 89,92 ****
--- 91,108 ----
  DEFCIFCODE(INDIRECT_UNKNOWN_CALL,
  	   N_("indirect function call with a yet undetermined callee"))
  
+ /* We can't inline different EH personalities together.  */
+ DEFCIFCODE(EH_PERSONALITY,
+ 	   N_("excepion handling personality mismatch"))
+ 
+ /* Don't inline if the callee can throw non-call exceptions but the
+    caller cannot.  */
+ DEFCIFCODE(NON_CALL_EXCEPTIONS,
+ 	   N_("excepion handling personality mismatch"))
+ 
+ /* Don't inline if the callee can throw non-call exceptions but the
+    caller cannot.  */
+ DEFCIFCODE(OPTIMIZATION_MISMATCH,
+ 	   N_("optimization mode mismatch"))
+ 
  DEFCIFCODE(OVERWRITABLE, N_("function body can be overwriten at linktime"))
Index: ipa-inline.c
===================================================================
*** ipa-inline.c	(revision 172597)
--- ipa-inline.c	(working copy)
*************** along with GCC; see the file COPYING3.  
*** 21,93 ****
  
  /*  Inlining decision heuristics
  
!     We separate inlining decisions from the inliner itself and store it
!     inside callgraph as so called inline plan.  Refer to cgraph.c
!     documentation about particular representation of inline plans in the
!     callgraph.
! 
!     There are three major parts of this file:
! 
!     cgraph_mark_inline_edge implementation
! 
!       This function allows to mark given call inline and performs necessary
!       modifications of cgraph (production of the clones and updating overall
!       statistics)
  
      inlining heuristics limits
  
!       These functions allow to check that particular inlining is allowed
!       by the limits specified by user (allowed function growth, overall unit
!       growth and so on).
  
      inlining heuristics
  
!       This is implementation of IPA pass aiming to get as much of benefit
!       from inlining obeying the limits checked above.
  
!       The implementation of particular heuristics is separated from
!       the rest of code to make it easier to replace it with more complicated
!       implementation in the future.  The rest of inlining code acts as a
!       library aimed to modify the callgraph and verify that the parameters
!       on code size growth fits.
! 
!       To mark given call inline, use cgraph_mark_inline function, the
!       verification is performed by cgraph_default_inline_p and
!       cgraph_check_inline_limits.
! 
!       The heuristics implements simple knapsack style algorithm ordering
!       all functions by their "profitability" (estimated by code size growth)
!       and inlining them in priority order.
! 
!       cgraph_decide_inlining implements heuristics taking whole callgraph
!       into account, while cgraph_decide_inlining_incrementally considers
!       only one function at a time and is used by early inliner.
! 
!    The inliner itself is split into two passes:
! 
!    pass_early_inlining
! 
!      Simple local inlining pass inlining callees into current function.  This
!      pass makes no global whole compilation unit analysis and this when allowed
!      to do inlining expanding code size it might result in unbounded growth of
!      whole unit.
! 
!      The pass is run during conversion into SSA form.  Only functions already
!      converted into SSA form are inlined, so the conversion must happen in
!      topological order on the callgraph (that is maintained by pass manager).
!      The functions after inlining are early optimized so the early inliner sees
!      unoptimized function itself, but all considered callees are already
!      optimized allowing it to unfold abstraction penalty on C++ effectively and
!      cheaply.
! 
!    pass_ipa_inline
! 
!      This is the main pass implementing simple greedy algorithm to do inlining
!      of small functions that results in overall growth of compilation unit and
!      inlining of functions called once.  The pass compute just so called inline
!      plan (representation of inlining to be done in callgraph) and unlike early
!      inlining it is not performing the inlining itself.
!  */
  
  #include "config.h"
  #include "system.h"
--- 21,106 ----
  
  /*  Inlining decision heuristics
  
!     The implementation of inliner is organized as follows:
! 
!     Transformation of callgraph to represent inlining decisions.
! 
!       The inline decisions are stored in callgraph in "inline plan" and
!       all applied later.
! 
!       To mark given call inline, use cgraph_mark_inline function.
!       The function marks the edge inlinable and, if neccesary, produces
!       virtual clone in the callgraph representing the new copy of callee's
!       function body.
! 
!       The inline plan is applied on given function body by inline_transform. 
  
      inlining heuristics limits
  
!       can_inline_edge_p allow to check that particular inlining is allowed
!       by the limits specified by user (allowed function growth, growth and so
!       on).
! 
!       Functions are inlined when it is obvious the result is profitable (such
!       as functions called once or when inlining reduce code size).
!       In addition to that we perform inlining of small functions and recursive
!       inlining.
  
      inlining heuristics
  
!        The inliner itself is split into two passes:
! 
!        pass_early_inlining
  
! 	 Simple local inlining pass inlining callees into current function.
! 	 This pass makes no use of whole unit analysis and thus it can do only
! 	 very simple decisions based on local properties.
! 
! 	 The strength of the pass is that it is run in topological order
! 	 (reverse postorder) on the callgraph. Functions are converted into SSA
! 	 form just before this pass and optimized subsequently. As a result, the
! 	 callees of the function seen by the early inliner was already optimized
! 	 and results of early inlining adds a lot of optimization oppurtunities
! 	 for the local optimization.
! 
! 	 The pass handle the obvious inlining decisions within the copmilation
! 	 unit - inlining auto inline functions, inlining for size and
! 	 flattening.
! 
! 	 main strength of the pass is the ability to eliminate abstraction
! 	 penalty in C++ code (via combination of inlining and early
! 	 optimization) and thus improve quality of analysis done by real IPA
! 	 optimizers.
! 
! 	 Because of lack of whole unit knowledge, the pass can not really make
! 	 good code size/performance tradeoffs.  It however does very simple
! 	 speculative inlining allowing code size to grow by
! 	 EARLY_INLINING_INSNS when calee is leaf function.  In this case the
! 	 optimizations perfomed later are very likely to eliminate the cost.
! 
!        pass_ipa_inline
! 
! 	 This is the real inliner able to handle inlining with whole program
! 	 knowledge. It performs following steps:
! 
! 	 1) inlining of small functions.  This is implemented by greedy
! 	 algorithm ordering all inlinable cgraph edges by their badness and
! 	 inlining them in this order as long as inline limits allows doing so.
! 
! 	 This heuristics is not very good on inlining recursive calls. Recursive
! 	 calls can be inlined with results similar to loop unrolling. To do so,
! 	 special purpose recursive inliner is executed on function when
! 	 recursive edge is met as viable candidate.
! 
! 	 2) Unreachable functions are removed from callgraph.  Inlining leads
! 	 to devirtualization and other modification of callgraph so functions
! 	 may become unreachable during the process. Also functions declared as
! 	 extern inline or virtual functions are removed, since after inlining
! 	 we no longer need the offline bodies.
! 
! 	 3) Functions called once and not exported from the unit are inlined.
! 	 This should almost always lead to reduction of code size by eliminating
! 	 the need for offline copy of the function.  */
  
  #include "config.h"
  #include "system.h"
*************** along with GCC; see the file COPYING3.  
*** 105,122 ****
  #include "fibheap.h"
  #include "intl.h"
  #include "tree-pass.h"
- #include "hashtab.h"
  #include "coverage.h"
  #include "ggc.h"
- #include "tree-flow.h"
  #include "rtl.h"
  #include "ipa-prop.h"
  #include "except.h"
  #include "ipa-inline.h"
  
- #define MAX_TIME 1000000000
- 
- 
  /* Statistics we collect about inlining algorithm.  */
  static int ncalls_inlined;
  static int nfunctions_inlined;
--- 118,132 ----
  #include "fibheap.h"
  #include "intl.h"
  #include "tree-pass.h"
  #include "coverage.h"
  #include "ggc.h"
  #include "rtl.h"
+ #include "tree-flow.h"
  #include "ipa-prop.h"
  #include "except.h"
+ #include "target.h"
  #include "ipa-inline.h"
  
  /* Statistics we collect about inlining algorithm.  */
  static int ncalls_inlined;
  static int nfunctions_inlined;
*************** cgraph_clone_inlined_nodes (struct cgrap
*** 163,173 ****
        /* We may eliminate the need for out-of-line copy to be output.
  	 In that case just go ahead and re-use it.  */
        if (!e->callee->callers->next_caller
! 	  /* Recursive inlining never wants the master clone to be overwritten.  */
  	  && update_original
! 	  /* FIXME: When address is taken of DECL_EXTERNAL function we still can remove its
! 	     offline copy, but we would need to keep unanalyzed node in the callgraph so
! 	     references can point to it.  */
  	  && !e->callee->address_taken
  	  && cgraph_can_remove_if_no_direct_calls_p (e->callee)
  	  /* Inlining might enable more devirtualizing, so we want to remove
--- 173,184 ----
        /* We may eliminate the need for out-of-line copy to be output.
  	 In that case just go ahead and re-use it.  */
        if (!e->callee->callers->next_caller
! 	  /* Recursive inlining never wants the master clone to
! 	     be overwritten.  */
  	  && update_original
! 	  /* FIXME: When address is taken of DECL_EXTERNAL function we still
! 	     can remove its offline copy, but we would need to keep unanalyzed
! 	     node in the callgraph so references can point to it.  */
  	  && !e->callee->address_taken
  	  && cgraph_can_remove_if_no_direct_calls_p (e->callee)
  	  /* Inlining might enable more devirtualizing, so we want to remove
*************** cgraph_clone_inlined_nodes (struct cgrap
*** 175,181 ****
  	     Lacking may edges in callgraph we just preserve them post
  	     inlining.  */
  	  && (!DECL_VIRTUAL_P (e->callee->decl)
! 	      || (!DECL_COMDAT (e->callee->decl) && !DECL_EXTERNAL (e->callee->decl)))
  	  /* Don't reuse if more than one function shares a comdat group.
  	     If the other function(s) are needed, we need to emit even
  	     this function out of line.  */
--- 186,193 ----
  	     Lacking may edges in callgraph we just preserve them post
  	     inlining.  */
  	  && (!DECL_VIRTUAL_P (e->callee->decl)
! 	      || (!DECL_COMDAT (e->callee->decl)
! 		  && !DECL_EXTERNAL (e->callee->decl)))
  	  /* Don't reuse if more than one function shares a comdat group.
  	     If the other function(s) are needed, we need to emit even
  	     this function out of line.  */
*************** cgraph_clone_inlined_nodes (struct cgrap
*** 214,220 ****
        + caller_info->estimated_self_stack_size;
    peak = callee_info->stack_frame_offset
        + callee_info->estimated_self_stack_size;
!   if (inline_summary (e->callee->global.inlined_to)->estimated_stack_size < peak)
      inline_summary (e->callee->global.inlined_to)->estimated_stack_size = peak;
    cgraph_propagate_frequency (e->callee);
  
--- 226,233 ----
        + caller_info->estimated_self_stack_size;
    peak = callee_info->stack_frame_offset
        + callee_info->estimated_self_stack_size;
!   if (inline_summary (e->callee->global.inlined_to)->estimated_stack_size
!       < peak)
      inline_summary (e->callee->global.inlined_to)->estimated_stack_size = peak;
    cgraph_propagate_frequency (e->callee);
  
*************** cgraph_mark_inline_edge (struct cgraph_e
*** 272,304 ****
      return false;
  }
  
! /* Return false when inlining edge E is not good idea
!    as it would cause too large growth of the callers function body
!    or stack frame size.  *REASON if non-NULL is updated if the
!    inlining is not a good idea.  */
  
  static bool
! cgraph_check_inline_limits (struct cgraph_edge *e,
! 			    cgraph_inline_failed_t *reason)
  {
    struct cgraph_node *to = e->caller;
    struct cgraph_node *what = e->callee;
    int newsize;
!   int limit;
!   HOST_WIDE_INT stack_size_limit, inlined_stack;
!   struct inline_summary *info, *what_info;
! 
!   if (to->global.inlined_to)
!     to = to->global.inlined_to;
  
-   info = inline_summary (to);
    what_info = inline_summary (what);
  
!   /* When inlining large function body called once into small function,
!      take the inlined function as base for limiting the growth.  */
!   if (info->self_size > what_info->self_size)
!     limit = info->self_size;
!   else
      limit = what_info->self_size;
  
    limit += limit * PARAM_VALUE (PARAM_LARGE_FUNCTION_GROWTH) / 100;
--- 285,336 ----
      return false;
  }
  
! /* Return false when inlining edge E would lead to violating
!    limits on function unit growth or stack usage growth.  
! 
!    The relative function body growth limit is present generally
!    to avoid problems with non-linear behaviour of the compiler.
!    To allow inlining huge functions into tiny wrapper, the limit
!    is always based on the bigger of the two functions considered.
! 
!    For stack growth limits we always base the growth in stack usage
!    of the callers.  We want to prevent applications from segfaulting
!    on stack overflow when functions with huge stack frames gets
!    inlined. */
  
  static bool
! caller_growth_limits (struct cgraph_edge *e)
  {
    struct cgraph_node *to = e->caller;
    struct cgraph_node *what = e->callee;
    int newsize;
!   int limit = 0;
!   HOST_WIDE_INT stack_size_limit = 0, inlined_stack;
!   struct inline_summary *info, *what_info, *outer_info = inline_summary (to);
! 
!   /* Look for function e->caller is inlined to.  While doing
!      so work out the largest function body on the way.  As
!      described above, we want to base our function growth
!      limits based on that.  Not on the self size of the
!      outer function, not on the self size of inline code
!      we immediately inline to.  This is the most relaxed
!      interpretation of the rule "do not grow large functions
!      too much in order to prevent compiler from exploding".  */
!   do
!     {
!       info = inline_summary (to);
!       if (limit < info->self_size)
! 	limit = info->self_size;
!       if (stack_size_limit < info->estimated_self_stack_size)
! 	stack_size_limit = info->estimated_self_stack_size;
!       if (to->global.inlined_to)
!         to = to->callers->caller;
!     }
!   while (to->global.inlined_to);
  
    what_info = inline_summary (what);
  
!   if (limit < what_info->self_size)
      limit = what_info->self_size;
  
    limit += limit * PARAM_VALUE (PARAM_LARGE_FUNCTION_GROWTH) / 100;
*************** cgraph_check_inline_limits (struct cgrap
*** 310,388 ****
        && newsize > PARAM_VALUE (PARAM_LARGE_FUNCTION_INSNS)
        && newsize > limit)
      {
!       if (reason)
!         *reason = CIF_LARGE_FUNCTION_GROWTH_LIMIT;
        return false;
      }
  
!   stack_size_limit = info->estimated_self_stack_size;
  
!   stack_size_limit += stack_size_limit * PARAM_VALUE (PARAM_STACK_FRAME_GROWTH) / 100;
! 
!   inlined_stack = (info->stack_frame_offset
! 		   + info->estimated_self_stack_size
  		   + what_info->estimated_stack_size);
!   if (inlined_stack  > stack_size_limit
        && inlined_stack > PARAM_VALUE (PARAM_LARGE_STACK_FRAME))
      {
!       if (reason)
!         *reason = CIF_LARGE_STACK_FRAME_GROWTH_LIMIT;
        return false;
      }
    return true;
  }
  
! /* Return true when function N is small enough to be inlined.  */
  
  static bool
! cgraph_default_inline_p (struct cgraph_node *n, cgraph_inline_failed_t *reason)
  {
!   tree decl = n->decl;
!   struct inline_summary *info = inline_summary (n);
  
!   if (info->disregard_inline_limits)
!     return true;
  
!   if (!flag_inline_small_functions && !DECL_DECLARED_INLINE_P (decl))
      {
!       if (reason)
! 	*reason = CIF_FUNCTION_NOT_INLINE_CANDIDATE;
        return false;
      }
!   if (!n->analyzed)
      {
!       if (reason)
! 	*reason = CIF_BODY_NOT_AVAILABLE;
        return false;
      }
!   if (cgraph_function_body_availability (n) <= AVAIL_OVERWRITABLE)
      {
!       if (reason)
! 	*reason = CIF_OVERWRITABLE;
        return false;
      }
  
  
!   if (DECL_DECLARED_INLINE_P (decl))
      {
!       if (info->size >= MAX_INLINE_INSNS_SINGLE)
  	{
! 	  if (reason)
! 	    *reason = CIF_MAX_INLINE_INSNS_SINGLE_LIMIT;
! 	  return false;
  	}
      }
    else
      {
!       if (info->size >= MAX_INLINE_INSNS_AUTO)
  	{
! 	  if (reason)
! 	    *reason = CIF_MAX_INLINE_INSNS_AUTO_LIMIT;
! 	  return false;
  	}
      }
  
!   return true;
  }
  
  /* A cost model driving the inlining heuristics in a way so the edges with
--- 342,762 ----
        && newsize > PARAM_VALUE (PARAM_LARGE_FUNCTION_INSNS)
        && newsize > limit)
      {
!       e->inline_failed = CIF_LARGE_FUNCTION_GROWTH_LIMIT;
        return false;
      }
  
!   /* FIXME: Stack size limit often prevents inlining in fortran programs
!      due to large i/o datastructures used by the fortran frontend.
!      We ought to ignore this limit when we know that the edge is executed
!      on every invocation of the caller (i.e. its call statement dominates
!      exit block).  We do not track this information, yet.  */
!   stack_size_limit += (stack_size_limit
! 		       * PARAM_VALUE (PARAM_STACK_FRAME_GROWTH) / 100);
  
!   inlined_stack = (outer_info->stack_frame_offset
! 		   + outer_info->estimated_self_stack_size
  		   + what_info->estimated_stack_size);
!   /* Check new stack consumption with stack consumption at the place
!      stack is used.  */
!   if (inlined_stack > stack_size_limit
!       /* If function already has large stack usage from sibbling
! 	 inline call, we can inline, too.
! 	 This bit overoptimistically assume that we are good at stack
! 	 packing.  */
!       && inlined_stack > info->estimated_stack_size
        && inlined_stack > PARAM_VALUE (PARAM_LARGE_STACK_FRAME))
      {
!       e->inline_failed = CIF_LARGE_STACK_FRAME_GROWTH_LIMIT;
        return false;
      }
    return true;
  }
  
! /* Dump info about why inlining has failed.  */
! 
! static void
! report_inline_failed_reason (struct cgraph_edge *e)
! {
!   if (dump_file)
!     {
!       fprintf (dump_file, "  not inlinable: %s/%i -> %s/%i, %s\n",
! 	       cgraph_node_name (e->caller), e->caller->uid,
! 	       cgraph_node_name (e->callee), e->callee->uid,
! 	       cgraph_inline_failed_string (e->inline_failed));
!     }
! }
! 
! /* Decide if we can inline the edge and possibly update
!    inline_failed reason.  
!    We check whether inlining is possible at all and whether
!    caller growth limits allow doing so.  
! 
!    if REPORT is true, output reason to the dump file.  */
  
  static bool
! can_inline_edge_p (struct cgraph_edge *e, bool report)
  {
!   bool inlinable = true;
!   tree caller_tree = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (e->caller->decl);
!   tree callee_tree = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (e->callee->decl);
  
!   gcc_assert (e->inline_failed);
  
!   if (!e->callee->analyzed)
!     {
!       e->inline_failed = CIF_BODY_NOT_AVAILABLE;
!       inlinable = false;
!     }
!   else if (!inline_summary (e->callee)->inlinable)
      {
!       e->inline_failed = CIF_FUNCTION_NOT_INLINABLE;
!       inlinable = false;
!     }
!   else if (cgraph_function_body_availability (e->callee) <= AVAIL_OVERWRITABLE)
!     {
!       e->inline_failed = CIF_OVERWRITABLE;
        return false;
      }
!   else if (e->call_stmt_cannot_inline_p)
!     {
!       e->inline_failed = CIF_MISMATCHED_ARGUMENTS;
!       inlinable = false;
!     }
!   /* Don't inline if the functions have different EH personalities.  */
!   else if (DECL_FUNCTION_PERSONALITY (e->caller->decl)
! 	   && DECL_FUNCTION_PERSONALITY (e->callee->decl)
! 	   && (DECL_FUNCTION_PERSONALITY (e->caller->decl)
! 	       != DECL_FUNCTION_PERSONALITY (e->callee->decl)))
!     {
!       e->inline_failed = CIF_EH_PERSONALITY;
!       inlinable = false;
!     }
!   /* Don't inline if the callee can throw non-call exceptions but the
!      caller cannot.
!      FIXME: this is obviously wrong for LTO where STRUCT_FUNCTION is missing.
!      Move the flag into cgraph node or mirror it in the inline summary.  */
!   else if (DECL_STRUCT_FUNCTION (e->callee->decl)
! 	   && DECL_STRUCT_FUNCTION (e->callee->decl)->can_throw_non_call_exceptions
! 	   && !(DECL_STRUCT_FUNCTION (e->caller->decl)
! 	        && DECL_STRUCT_FUNCTION (e->caller->decl)->can_throw_non_call_exceptions))
!     {
!       e->inline_failed = CIF_NON_CALL_EXCEPTIONS;
!       inlinable = false;
!     }
!   /* Check compatibility of target optimizatio noptions.  */
!   else if (!targetm.target_option.can_inline_p (e->caller->decl,
! 						e->callee->decl))
!     {
!       e->inline_failed = CIF_TARGET_OPTION_MISMATCH;
!       inlinable = false;
!     }
!   /* Check if caller growth allows the inlining.  */
!   else if (!DECL_DISREGARD_INLINE_LIMITS (e->callee->decl)
!            && !caller_growth_limits (e))
!     inlinable = false;
!   /* Don't inline a function with a higher optimization level than the
!      caller.  FIXME: this is really just tip of iceberg of handling
!      optimization attribute.  */
!   else if (caller_tree != callee_tree)
!     {
!       struct cl_optimization *caller_opt
! 	= TREE_OPTIMIZATION ((caller_tree)
! 			     ? caller_tree
! 			     : optimization_default_node);
! 
!       struct cl_optimization *callee_opt
! 	= TREE_OPTIMIZATION ((callee_tree)
! 			     ? callee_tree
! 			     : optimization_default_node);
! 
!       if ((caller_opt->x_optimize > callee_opt->x_optimize)
! 	  || (caller_opt->x_optimize_size != callee_opt->x_optimize_size))
! 	{
!           e->inline_failed = CIF_TARGET_OPTIMIZATION_MISMATCH;
! 	  inlinable = false;
! 	}
!     }
! 
!   /* Be sure that the cannot_inline_p flag is up to date.  */
!   gcc_checking_assert (!e->call_stmt
! 		       || (gimple_call_cannot_inline_p (e->call_stmt)
! 		           == e->call_stmt_cannot_inline_p)
! 		       /* In -flto-partition=none mode we really keep things out of
! 			  sync because call_stmt_cannot_inline_p is set at cgraph
! 			  merging when function bodies are not there yet.  */
! 		       || (in_lto_p && !gimple_call_cannot_inline_p (e->call_stmt)));
!   if (!inlinable && report)
!     report_inline_failed_reason (e);
!   return inlinable;
! }
! 
! 
! /* Return true if the edge E is inlinable during early inlining.  */
! 
! static bool
! can_early_inline_edge_p (struct cgraph_edge *e)
! {
!   /* Early inliner might get called at WPA stage when IPA pass adds new
!      function.  In this case we can not really do any of early inlining
!      because function bodies are missing.  */
!   if (!gimple_has_body_p (e->callee->decl))
      {
!       e->inline_failed = CIF_BODY_NOT_AVAILABLE;
        return false;
      }
!   /* In early inliner some of callees may not be in SSA form yet
!      (i.e. the callgraph is cyclic and we did not process
!      the callee by early inliner, yet).  We don't have CIF code for this
!      case; later we will re-do the decision in the real inliner.  */
!   if (!gimple_in_ssa_p (DECL_STRUCT_FUNCTION (e->caller->decl))
!       || !gimple_in_ssa_p (DECL_STRUCT_FUNCTION (e->callee->decl)))
      {
!       if (dump_file)
! 	fprintf (dump_file, "  edge not inlinable: not in SSA form\n");
        return false;
      }
+   if (!can_inline_edge_p (e, true))
+     return false;
+   return true;
+ }
+ 
  
+ /* Return true when N is leaf function.  Accept cheap builtins
+    in leaf functions.  */
  
! static bool
! leaf_node_p (struct cgraph_node *n)
! {
!   struct cgraph_edge *e;
!   for (e = n->callees; e; e = e->next_callee)
!     if (!is_inexpensive_builtin (e->callee->decl))
!       return false;
!   return true;
! }
! 
! 
! /* Return true if we are interested in inlining small function.  */
! 
! static bool
! want_early_inline_function_p (struct cgraph_edge *e)
! {
!   bool want_inline = true;
! 
!   if (DECL_DISREGARD_INLINE_LIMITS (e->callee->decl))
!     ;
!   else if (!DECL_DECLARED_INLINE_P (e->callee->decl)
! 	   && !flag_inline_small_functions)
!     {
!       e->inline_failed = CIF_FUNCTION_NOT_INLINE_CANDIDATE;
!       report_inline_failed_reason (e);
!       want_inline = false;
!     }
!   else
      {
!       int growth = estimate_edge_growth (e);
!       if (growth <= 0)
! 	;
!       else if (!cgraph_maybe_hot_edge_p (e)
! 	       && growth > 0)
! 	{
! 	  if (dump_file)
! 	    fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
! 		     "call is cold and code would grow by %i\n",
! 		     cgraph_node_name (e->caller), e->caller->uid,
! 		     cgraph_node_name (e->callee), e->callee->uid,
! 		     growth);
! 	  want_inline = false;
! 	}
!       else if (!leaf_node_p (e->callee)
! 	       && growth > 0)
  	{
! 	  if (dump_file)
! 	    fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
! 		     "callee is not leaf and code would grow by %i\n",
! 		     cgraph_node_name (e->caller), e->caller->uid,
! 		     cgraph_node_name (e->callee), e->callee->uid,
! 		     growth);
! 	  want_inline = false;
  	}
+       else if (growth > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
+ 	{
+ 	  if (dump_file)
+ 	    fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
+ 		     "growth %i exceeds --param early-inlining-insns\n",
+ 		     cgraph_node_name (e->caller), e->caller->uid,
+ 		     cgraph_node_name (e->callee), e->callee->uid,
+ 		     growth);
+ 	  want_inline = false;
+ 	}
+     }
+   return want_inline;
+ }
+ 
+ /* Return true if we are interested in inlining small function.
+    When REPORT is true, report reason to dump file.  */
+ 
+ static bool
+ want_inline_small_function_p (struct cgraph_edge *e, bool report)
+ {
+   bool want_inline = true;
+ 
+   if (DECL_DISREGARD_INLINE_LIMITS (e->callee->decl))
+     ;
+   else if (!DECL_DECLARED_INLINE_P (e->callee->decl)
+ 	   && !flag_inline_small_functions)
+     {
+       e->inline_failed = CIF_FUNCTION_NOT_INLINE_CANDIDATE;
+       want_inline = false;
      }
    else
      {
!       int growth = estimate_edge_growth (e);
! 
!       if (growth <= 0)
! 	;
!       else if (DECL_DECLARED_INLINE_P (e->callee->decl)
! 	       && growth >= MAX_INLINE_INSNS_SINGLE)
! 	{
!           e->inline_failed = CIF_MAX_INLINE_INSNS_SINGLE_LIMIT;
! 	  want_inline = false;
! 	}
!       else if (!DECL_DECLARED_INLINE_P (e->callee->decl)
! 	       && !flag_inline_functions)
! 	{
!           e->inline_failed = CIF_NOT_DECLARED_INLINED;
! 	  want_inline = false;
! 	}
!       else if (!DECL_DECLARED_INLINE_P (e->callee->decl)
! 	       && growth >= MAX_INLINE_INSNS_AUTO)
! 	{
!           e->inline_failed = CIF_MAX_INLINE_INSNS_AUTO_LIMIT;
! 	  want_inline = false;
! 	}
!       else if (!cgraph_maybe_hot_edge_p (e)
! 	       && estimate_growth (e->callee) > 0)
  	{
!           e->inline_failed = CIF_UNLIKELY_CALL;
! 	  want_inline = false;
  	}
      }
+   if (!want_inline && report)
+     report_inline_failed_reason (e);
+   return want_inline;
+ }
  
! /* EDGE is self recursive edge.
!    We hand two cases - when function A is inlining into itself
!    or when function A is being inlined into another inliner copy of function
!    A within function B.  
! 
!    In first case OUTER_NODE points to the toplevel copy of A, while
!    in the second case OUTER_NODE points to the outermost copy of A in B.
! 
!    In both cases we want to be extra selective since
!    inlining the call will just introduce new recursive calls to appear.  */
! static bool
! want_inline_self_recursive_call_p (struct cgraph_edge *edge,
! 				   struct cgraph_node *outer_node,
! 				   bool peeling,
! 				   int depth)
! {
!   char const *reason = NULL;
!   bool want_inline = true;
!   int caller_freq = CGRAPH_FREQ_BASE;
!   int max_depth = PARAM_VALUE (PARAM_MAX_INLINE_RECURSIVE_DEPTH_AUTO);
! 
!   if (DECL_DECLARED_INLINE_P (edge->callee->decl))
!     max_depth = PARAM_VALUE (PARAM_MAX_INLINE_RECURSIVE_DEPTH);
! 
!   if (!cgraph_maybe_hot_edge_p (edge))
!     {
!       reason = "recursive call is cold";
!       want_inline = false;
!     }
!   else if (max_count && !outer_node->count)
!     {
!       reason = "not executed in profile";
!       want_inline = false;
!     }
!   else if (depth > max_depth)
!     {
!       reason = "--param max-inline-recursive-depth exceeded.";
!       want_inline = false;
!     }
! 
!   if (outer_node->global.inlined_to)
!     caller_freq = outer_node->callers->frequency;
! 
!   if (!want_inline)
!     ;
!   /* Inlining of self recursive function into copy of itself within other function
!      is transformation similar to loop peeling.
! 
!      Peeling is profitable if we can inline enough copies to make probablility
!      of actual call to the self recursive function very small.  Be sure that
!      the probability of recursion is small.
! 
!      We ensure that the frequency of recusing is at most 1 - (1/max_depth).
!      This way the expected number of recusion is at most max_depth.  */
!   else if (peeling)
!     {
!       int max_prob = CGRAPH_FREQ_BASE - ((CGRAPH_FREQ_BASE + max_depth - 1)
! 					 / max_depth);
!       int i;
!       for (i = 1; i < depth; i++)
! 	max_prob = max_prob * max_prob / CGRAPH_FREQ_BASE;
!       if (max_count
! 	  && (edge->count * CGRAPH_FREQ_BASE / outer_node->count
! 	      >= max_prob))
! 	{
! 	  reason = "profile of recursive call is too large";
! 	  want_inline = false;
! 	}
!       if (!max_count
! 	  && (edge->frequency * CGRAPH_FREQ_BASE / caller_freq
! 	      >= max_prob))
! 	{
! 	  reason = "frequency of recursive call is too large";
! 	  want_inline = false;
! 	}
!     }
!   /* Recusive inlining, i.e. equivalent of unrolling, is profitable if recusion
!      depth is large.  We reduce function call overhead and increase chances that
!      things fit in hardware return predictor.
! 
!      Recursive inlining might however increase cost of stack frame setup
!      actually slowing down functions whose recursion tree is wide rather than
!      deep.
! 
!      Deciding reliably on when to do recursive inlining withthout profile feedback
!      is tricky.  For now we disable recursive inlining when probability of self
!      recursion is low. 
! 
!      Recursive inlining of self recursive call within loop also results in large loop
!      depths that generally optimize badly.  We may want to throttle down inlining
!      in those cases.  In particular this seems to happen in one of libstdc++ rb tree
!      methods.  */
!   else
!     {
!       if (max_count
! 	  && (edge->count * 100 / outer_node->count
! 	      <= PARAM_VALUE (PARAM_MIN_INLINE_RECURSIVE_PROBABILITY)))
! 	{
! 	  reason = "profile of recursive call is too small";
! 	  want_inline = false;
! 	}
!       else if (!max_count
! 	       && (edge->frequency * 100 / caller_freq
! 	           <= PARAM_VALUE (PARAM_MIN_INLINE_RECURSIVE_PROBABILITY)))
! 	{
! 	  reason = "frequency of recursive call is too small";
! 	  want_inline = false;
! 	}
!     }
!   if (!want_inline && dump_file)
!     fprintf (dump_file, "   not inlining recursively: %s\n", reason);
!   return want_inline;
  }
  
  /* A cost model driving the inlining heuristics in a way so the edges with
*************** cgraph_default_inline_p (struct cgraph_n
*** 392,404 ****
     of the function or function body size.  */
  
  static int
! cgraph_edge_badness (struct cgraph_edge *edge, bool dump)
  {
    gcov_type badness;
    int growth;
    struct inline_summary *callee_info = inline_summary (edge->callee);
  
!   if (callee_info->disregard_inline_limits)
      return INT_MIN;
  
    growth = estimate_edge_growth (edge);
--- 766,778 ----
     of the function or function body size.  */
  
  static int
! edge_badness (struct cgraph_edge *edge, bool dump)
  {
    gcov_type badness;
    int growth;
    struct inline_summary *callee_info = inline_summary (edge->callee);
  
!   if (DECL_DISREGARD_INLINE_LIMITS (edge->callee->decl))
      return INT_MIN;
  
    growth = estimate_edge_growth (edge);
*************** cgraph_edge_badness (struct cgraph_edge 
*** 488,494 ****
  	  fprintf (dump_file,
  		   "      %i: guessed profile. frequency %i, overall growth %i,"
  		   " benefit %i%%, divisor %i\n",
! 		   (int) badness, edge->frequency, growth_for_all, benefitperc, div);
  	}
      }
    /* When function local profile is not available or it does not give
--- 862,869 ----
  	  fprintf (dump_file,
  		   "      %i: guessed profile. frequency %i, overall growth %i,"
  		   " benefit %i%%, divisor %i\n",
! 		   (int) badness, edge->frequency, growth_for_all,
! 		   benefitperc, div);
  	}
      }
    /* When function local profile is not available or it does not give
*************** cgraph_edge_badness (struct cgraph_edge 
*** 523,532 ****
  }
  
  /* Recompute badness of EDGE and update its key in HEAP if needed.  */
! static void
  update_edge_key (fibheap_t heap, struct cgraph_edge *edge)
  {
!   int badness = cgraph_edge_badness (edge, false);
    if (edge->aux)
      {
        fibnode_t n = (fibnode_t) edge->aux;
--- 898,907 ----
  }
  
  /* Recompute badness of EDGE and update its key in HEAP if needed.  */
! static inline void
  update_edge_key (fibheap_t heap, struct cgraph_edge *edge)
  {
!   int badness = edge_badness (edge, false);
    if (edge->aux)
      {
        fibnode_t n = (fibnode_t) edge->aux;
*************** update_edge_key (fibheap_t heap, struct 
*** 539,549 ****
        if (badness < n->key)
  	{
  	  fibheap_replace_key (heap, n, badness);
  	  gcc_checking_assert (n->key == badness);
  	}
      }
    else
!     edge->aux = fibheap_insert (heap, badness, edge);
  }
  
  /* Recompute heap nodes for each of caller edge.  */
--- 914,943 ----
        if (badness < n->key)
  	{
  	  fibheap_replace_key (heap, n, badness);
+ 	  if (dump_file && (dump_flags & TDF_DETAILS))
+ 	    {
+ 	      fprintf (dump_file,
+ 		       "  decreasing badness %s/%i -> %s/%i, %i to %i\n",
+ 		       cgraph_node_name (edge->caller), edge->caller->uid,
+ 		       cgraph_node_name (edge->callee), edge->callee->uid,
+ 		       (int)n->key,
+ 		       badness);
+ 	    }
  	  gcc_checking_assert (n->key == badness);
  	}
      }
    else
!     {
!        if (dump_file && (dump_flags & TDF_DETAILS))
! 	 {
! 	   fprintf (dump_file,
! 		    "  enqueuing call %s/%i -> %s/%i, badness %i\n",
! 		    cgraph_node_name (edge->caller), edge->caller->uid,
! 		    cgraph_node_name (edge->callee), edge->callee->uid,
! 		    badness);
! 	 }
!       edge->aux = fibheap_insert (heap, badness, edge);
!     }
  }
  
  /* Recompute heap nodes for each of caller edge.  */
*************** update_caller_keys (fibheap_t heap, stru
*** 553,559 ****
  		    bitmap updated_nodes)
  {
    struct cgraph_edge *edge;
-   cgraph_inline_failed_t failed_reason;
  
    if (!inline_summary (node)->inlinable
        || cgraph_function_body_availability (node) <= AVAIL_OVERWRITABLE
--- 947,952 ----
*************** update_caller_keys (fibheap_t heap, stru
*** 569,591 ****
        break;
    if (!edge)
      return;
!   /* Prune out edges we won't inline into anymore.  */
!   if (!cgraph_default_inline_p (node, &failed_reason))
!     {
!       for (; edge; edge = edge->next_caller)
! 	if (edge->aux)
  	  {
  	    fibheap_delete_node (heap, (fibnode_t) edge->aux);
  	    edge->aux = NULL;
- 	    if (edge->inline_failed)
- 	      edge->inline_failed = failed_reason;
  	  }
!       return;
!     }
! 
!   for (; edge; edge = edge->next_caller)
!     if (edge->inline_failed)
!       update_edge_key (heap, edge);
  }
  
  /* Recompute heap nodes for each uninlined call.
--- 962,981 ----
        break;
    if (!edge)
      return;
! 
!   for (; edge; edge = edge->next_caller)
!     if (edge->inline_failed)
!       {
! 	if (can_inline_edge_p (edge, false)
! 	    && want_inline_small_function_p (edge, false))
!           update_edge_key (heap, edge);
! 	else if (edge->aux)
  	  {
+ 	    report_inline_failed_reason (edge);
  	    fibheap_delete_node (heap, (fibnode_t) edge->aux);
  	    edge->aux = NULL;
  	  }
!       }
  }
  
  /* Recompute heap nodes for each uninlined call.
*************** update_callee_keys (fibheap_t heap, stru
*** 613,624 ****
  	    && !bitmap_bit_p (updated_nodes, e->callee->uid))
  	  {
  	    inline_summary (node)->estimated_growth = INT_MIN;
! 	    /* If function becomes uninlinable, we need to remove it from the heap.  */
! 	    if (!cgraph_default_inline_p (e->callee, &e->inline_failed))
! 	      update_caller_keys (heap, e->callee, updated_nodes);
! 	    else
! 	    /* Otherwise update just edge E.  */
! 	      update_edge_key (heap, e);
  	  }
  	if (e->next_callee)
  	  e = e->next_callee;
--- 1003,1009 ----
  	    && !bitmap_bit_p (updated_nodes, e->callee->uid))
  	  {
  	    inline_summary (node)->estimated_growth = INT_MIN;
! 	    update_edge_key (heap, e);
  	  }
  	if (e->next_callee)
  	  e = e->next_callee;
*************** lookup_recursive_calls (struct cgraph_no
*** 702,717 ****
     is NULL.  */
  
  static bool
! cgraph_decide_recursive_inlining (struct cgraph_edge *edge,
! 				  VEC (cgraph_edge_p, heap) **new_edges)
  {
    int limit = PARAM_VALUE (PARAM_MAX_INLINE_INSNS_RECURSIVE_AUTO);
-   int max_depth = PARAM_VALUE (PARAM_MAX_INLINE_RECURSIVE_DEPTH_AUTO);
-   int probability = PARAM_VALUE (PARAM_MIN_INLINE_RECURSIVE_PROBABILITY);
    fibheap_t heap;
    struct cgraph_node *node;
    struct cgraph_edge *e;
!   struct cgraph_node *master_clone, *next;
    int depth = 0;
    int n = 0;
  
--- 1087,1100 ----
     is NULL.  */
  
  static bool
! recursive_inlining (struct cgraph_edge *edge,
! 		    VEC (cgraph_edge_p, heap) **new_edges)
  {
    int limit = PARAM_VALUE (PARAM_MAX_INLINE_INSNS_RECURSIVE_AUTO);
    fibheap_t heap;
    struct cgraph_node *node;
    struct cgraph_edge *e;
!   struct cgraph_node *master_clone = NULL, *next;
    int depth = 0;
    int n = 0;
  
*************** cgraph_decide_recursive_inlining (struct
*** 719,743 ****
    if (node->global.inlined_to)
      node = node->global.inlined_to;
  
-   /* It does not make sense to recursively inline always-inline functions
-      as we are going to sorry() on the remaining calls anyway.  */
-   if (inline_summary (node)->disregard_inline_limits
-       && lookup_attribute ("always_inline", DECL_ATTRIBUTES (node->decl)))
-     return false;
- 
-   if (optimize_function_for_size_p (DECL_STRUCT_FUNCTION (node->decl))
-       || (!flag_inline_functions && !DECL_DECLARED_INLINE_P (node->decl)))
-     return false;
- 
    if (DECL_DECLARED_INLINE_P (node->decl))
!     {
!       limit = PARAM_VALUE (PARAM_MAX_INLINE_INSNS_RECURSIVE);
!       max_depth = PARAM_VALUE (PARAM_MAX_INLINE_RECURSIVE_DEPTH);
!     }
  
    /* Make sure that function is small enough to be considered for inlining.  */
!   if (!max_depth
!       || estimate_size_after_inlining (node, edge)  >= limit)
      return false;
    heap = fibheap_new ();
    lookup_recursive_calls (node, node, heap);
--- 1102,1112 ----
    if (node->global.inlined_to)
      node = node->global.inlined_to;
  
    if (DECL_DECLARED_INLINE_P (node->decl))
!     limit = PARAM_VALUE (PARAM_MAX_INLINE_INSNS_RECURSIVE);
  
    /* Make sure that function is small enough to be considered for inlining.  */
!   if (estimate_size_after_inlining (node, edge)  >= limit)
      return false;
    heap = fibheap_new ();
    lookup_recursive_calls (node, node, heap);
*************** cgraph_decide_recursive_inlining (struct
*** 752,765 ****
  	     "  Performing recursive inlining on %s\n",
  	     cgraph_node_name (node));
  
-   /* We need original clone to copy around.  */
-   master_clone = cgraph_clone_node (node, node->decl,
- 				    node->count, CGRAPH_FREQ_BASE, 1,
-   				    false, NULL);
-   for (e = master_clone->callees; e; e = e->next_callee)
-     if (!e->inline_failed)
-       cgraph_clone_inlined_nodes (e, true, false);
- 
    /* Do the inlining and update list of recursive call during process.  */
    while (!fibheap_empty (heap))
      {
--- 1121,1126 ----
*************** cgraph_decide_recursive_inlining (struct
*** 770,804 ****
        if (estimate_size_after_inlining (node, curr) > limit)
  	break;
  
        depth = 1;
        for (cnode = curr->caller;
  	   cnode->global.inlined_to; cnode = cnode->callers->caller)
  	if (node->decl == curr->callee->decl)
  	  depth++;
-       if (depth > max_depth)
- 	{
-           if (dump_file)
- 	    fprintf (dump_file,
- 		     "   maximal depth reached\n");
- 	  continue;
- 	}
  
!       if (max_count)
! 	{
!           if (!cgraph_maybe_hot_edge_p (curr))
! 	    {
! 	      if (dump_file)
! 		fprintf (dump_file, "   Not inlining cold call\n");
! 	      continue;
! 	    }
!           if (curr->count * 100 / node->count < probability)
! 	    {
! 	      if (dump_file)
! 		fprintf (dump_file,
! 			 "   Probability of edge is too small\n");
! 	      continue;
! 	    }
! 	}
  
        if (dump_file)
  	{
--- 1131,1147 ----
        if (estimate_size_after_inlining (node, curr) > limit)
  	break;
  
+       if (!can_inline_edge_p (curr, true))
+ 	continue;
+ 
        depth = 1;
        for (cnode = curr->caller;
  	   cnode->global.inlined_to; cnode = cnode->callers->caller)
  	if (node->decl == curr->callee->decl)
  	  depth++;
  
!       if (!want_inline_self_recursive_call_p (curr, node, false, depth))
! 	continue;
  
        if (dump_file)
  	{
*************** cgraph_decide_recursive_inlining (struct
*** 811,828 ****
  	    }
  	  fprintf (dump_file, "\n");
  	}
        cgraph_redirect_edge_callee (curr, master_clone);
        cgraph_mark_inline_edge (curr, false, new_edges);
        lookup_recursive_calls (node, curr->callee, heap);
        n++;
      }
    if (!fibheap_empty (heap) && dump_file)
      fprintf (dump_file, "    Recursive inlining growth limit met.\n");
- 
    fibheap_delete (heap);
    if (dump_file)
      fprintf (dump_file,
! 	     "\n   Inlined %i times, body grown from size %i to %i, time %i to %i\n", n,
  	     inline_summary (master_clone)->size, inline_summary (node)->size,
  	     inline_summary (master_clone)->time, inline_summary (node)->time);
  
--- 1154,1187 ----
  	    }
  	  fprintf (dump_file, "\n");
  	}
+       if (!master_clone)
+ 	{
+ 	  /* We need original clone to copy around.  */
+ 	  master_clone = cgraph_clone_node (node, node->decl,
+ 					    node->count, CGRAPH_FREQ_BASE, 1,
+ 					    false, NULL);
+ 	  for (e = master_clone->callees; e; e = e->next_callee)
+ 	    if (!e->inline_failed)
+ 	      cgraph_clone_inlined_nodes (e, true, false);
+ 	}
+ 
        cgraph_redirect_edge_callee (curr, master_clone);
        cgraph_mark_inline_edge (curr, false, new_edges);
        lookup_recursive_calls (node, curr->callee, heap);
        n++;
      }
+ 
    if (!fibheap_empty (heap) && dump_file)
      fprintf (dump_file, "    Recursive inlining growth limit met.\n");
    fibheap_delete (heap);
+ 
+   if (!master_clone)
+     return false;
+ 
    if (dump_file)
      fprintf (dump_file,
! 	     "\n   Inlined %i times, "
! 	     "body grown from size %i to %i, time %i to %i\n", n,
  	     inline_summary (master_clone)->size, inline_summary (node)->size,
  	     inline_summary (master_clone)->time, inline_summary (node)->time);
  
*************** cgraph_decide_recursive_inlining (struct
*** 837,863 ****
  	cgraph_remove_node (node);
      }
    cgraph_remove_node (master_clone);
!   /* FIXME: Recursive inlining actually reduces number of calls of the
!      function.  At this place we should probably walk the function and
!      inline clones and compensate the counts accordingly.  This probably
!      doesn't matter much in practice.  */
!   return n > 0;
! }
! 
! /* Set inline_failed for all callers of given function to REASON.  */
! 
! static void
! cgraph_set_inline_failed (struct cgraph_node *node,
! 			  cgraph_inline_failed_t reason)
! {
!   struct cgraph_edge *e;
! 
!   if (dump_file)
!     fprintf (dump_file, "Inlining failed: %s\n",
! 	     cgraph_inline_failed_string (reason));
!   for (e = node->callers; e; e = e->next_caller)
!     if (e->inline_failed)
!       e->inline_failed = reason;
  }
  
  /* Given whole compilation unit estimate of INSNS, compute how large we can
--- 1196,1202 ----
  	cgraph_remove_node (node);
      }
    cgraph_remove_node (master_clone);
!   return true;
  }
  
  /* Given whole compilation unit estimate of INSNS, compute how large we can
*************** add_new_edges_to_heap (fibheap_t heap, V
*** 884,891 ****
        gcc_assert (!edge->aux);
        if (inline_summary (edge->callee)->inlinable
  	  && edge->inline_failed
! 	  && cgraph_default_inline_p (edge->callee, &edge->inline_failed))
!         edge->aux = fibheap_insert (heap, cgraph_edge_badness (edge, false), edge);
      }
  }
  
--- 1223,1231 ----
        gcc_assert (!edge->aux);
        if (inline_summary (edge->callee)->inlinable
  	  && edge->inline_failed
! 	  && can_inline_edge_p (edge, true)
! 	  && want_inline_small_function_p (edge, true))
!         edge->aux = fibheap_insert (heap, edge_badness (edge, false), edge);
      }
  }
  
*************** add_new_edges_to_heap (fibheap_t heap, V
*** 898,908 ****
     to be passed to cgraph_inlined_into and cgraph_inlined_callees.  */
  
  static void
! cgraph_decide_inlining_of_small_functions (void)
  {
    struct cgraph_node *node;
    struct cgraph_edge *edge;
-   cgraph_inline_failed_t failed_reason;
    fibheap_t heap = fibheap_new ();
    bitmap updated_nodes = BITMAP_ALLOC (NULL);
    int min_size, max_size;
--- 1238,1247 ----
     to be passed to cgraph_inlined_into and cgraph_inlined_callees.  */
  
  static void
! inline_small_functions (void)
  {
    struct cgraph_node *node;
    struct cgraph_edge *edge;
    fibheap_t heap = fibheap_new ();
    bitmap updated_nodes = BITMAP_ALLOC (NULL);
    int min_size, max_size;
*************** cgraph_decide_inlining_of_small_function
*** 921,951 ****
        {
  	struct inline_summary *info = inline_summary (node);
  
- 	if (!info->inlinable || !node->callers)
- 	  {
- 	    struct cgraph_edge *e;
- 	    for (e = node->callers; e; e = e->next_caller)
- 	      {
- 		gcc_assert (e->inline_failed);
- 		e->inline_failed = CIF_FUNCTION_NOT_INLINABLE;
- 	      }
- 	    continue;
- 	  }
  	if (dump_file)
! 	  fprintf (dump_file, "Considering inline candidate %s.\n", cgraph_node_name (node));
  
  	info->estimated_growth = INT_MIN;
- 	if (!cgraph_default_inline_p (node, &failed_reason))
- 	  {
- 	    cgraph_set_inline_failed (node, failed_reason);
- 	    continue;
- 	  }
  
  	for (edge = node->callers; edge; edge = edge->next_caller)
! 	  if (edge->inline_failed)
  	    {
  	      gcc_assert (!edge->aux);
! 	      edge->aux = fibheap_insert (heap, cgraph_edge_badness (edge, false), edge);
  	    }
        }
  
--- 1260,1279 ----
        {
  	struct inline_summary *info = inline_summary (node);
  
  	if (dump_file)
! 	  fprintf (dump_file, "Enqueueing calls of %s/%i.\n",
! 		   cgraph_node_name (node), node->uid);
  
  	info->estimated_growth = INT_MIN;
  
  	for (edge = node->callers; edge; edge = edge->next_caller)
! 	  if (edge->inline_failed
! 	      && can_inline_edge_p (edge, true)
! 	      && want_inline_small_function_p (edge, true)
! 	      && edge->inline_failed)
  	    {
  	      gcc_assert (!edge->aux);
! 	      update_edge_key (heap, edge);
  	    }
        }
  
*************** cgraph_decide_inlining_of_small_function
*** 960,966 ****
        int badness = fibheap_min_key (heap);
        int current_badness;
        int growth;
-       cgraph_inline_failed_t not_good = CIF_OK;
  
        edge = (struct cgraph_edge *) fibheap_extract_min (heap);
        gcc_assert (edge->aux);
--- 1288,1293 ----
*************** cgraph_decide_inlining_of_small_function
*** 971,983 ****
        /* When updating the edge costs, we only decrease badness in the keys.
  	 When the badness increase, we keep the heap as it is and re-insert
  	 key now.  */
!       current_badness = cgraph_edge_badness (edge, false);
        gcc_assert (current_badness >= badness);
        if (current_badness != badness)
  	{
  	  edge->aux = fibheap_insert (heap, current_badness, edge);
  	  continue;
  	}
        
        callee = edge->callee;
        growth = estimate_edge_growth (edge);
--- 1298,1313 ----
        /* When updating the edge costs, we only decrease badness in the keys.
  	 When the badness increase, we keep the heap as it is and re-insert
  	 key now.  */
!       current_badness = edge_badness (edge, false);
        gcc_assert (current_badness >= badness);
        if (current_badness != badness)
  	{
  	  edge->aux = fibheap_insert (heap, current_badness, edge);
  	  continue;
  	}
+ 
+       if (!can_inline_edge_p (edge, true))
+ 	continue;
        
        callee = edge->callee;
        growth = estimate_edge_growth (edge);
*************** cgraph_decide_inlining_of_small_function
*** 989,1069 ****
  		   inline_summary (edge->callee)->size);
  	  fprintf (dump_file,
  		   " to be inlined into %s in %s:%i\n"
! 		   " Estimated growth after inlined into all callees is %+i insns.\n"
  		   " Estimated badness is %i, frequency %.2f.\n",
  		   cgraph_node_name (edge->caller),
  		   flag_wpa ? "unknown"
  		   : gimple_filename ((const_gimple) edge->call_stmt),
! 		   flag_wpa ? -1 : gimple_lineno ((const_gimple) edge->call_stmt),
  		   estimate_growth (edge->callee),
  		   badness,
  		   edge->frequency / (double)CGRAPH_FREQ_BASE);
  	  if (edge->count)
! 	    fprintf (dump_file," Called "HOST_WIDEST_INT_PRINT_DEC"x\n", edge->count);
  	  if (dump_flags & TDF_DETAILS)
! 	    cgraph_edge_badness (edge, true);
  	}
  
!       /* When not having profile info ready we don't weight by any way the
!          position of call in procedure itself.  This means if call of
! 	 function A from function B seems profitable to inline, the recursive
! 	 call of function A in inline copy of A in B will look profitable too
! 	 and we end up inlining until reaching maximal function growth.  This
! 	 is not good idea so prohibit the recursive inlining.
! 
! 	 ??? When the frequencies are taken into account we might not need this
! 	 restriction.
! 
! 	 We need to be careful here, in some testcases, e.g. directives.c in
! 	 libcpp, we can estimate self recursive function to have negative growth
! 	 for inlining completely.
! 	 */
!       if (!edge->count)
  	{
! 	  where = edge->caller;
! 	  while (where->global.inlined_to)
! 	    {
! 	      if (where->decl == edge->callee->decl)
! 		break;
! 	      where = where->callers->caller;
! 	    }
! 	  if (where->global.inlined_to)
! 	    {
! 	      edge->inline_failed
! 		= (inline_summary (edge->callee)->disregard_inline_limits
! 		   ? CIF_RECURSIVE_INLINING : CIF_UNSPECIFIED);
! 	      if (dump_file)
! 		fprintf (dump_file, " inline_failed:Recursive inlining performed only for function itself.\n");
! 	      continue;
! 	    }
! 	}
! 
!       if (inline_summary (edge->callee)->disregard_inline_limits)
! 	;
!       else if (!cgraph_maybe_hot_edge_p (edge))
!  	not_good = CIF_UNLIKELY_CALL;
!       else if (!flag_inline_functions
! 	  && !DECL_DECLARED_INLINE_P (edge->callee->decl))
!  	not_good = CIF_NOT_DECLARED_INLINED;
!       else if (optimize_function_for_size_p (DECL_STRUCT_FUNCTION(edge->caller->decl)))
!  	not_good = CIF_OPTIMIZING_FOR_SIZE;
!       if (not_good && growth > 0 && estimate_growth (edge->callee) > 0)
! 	{
! 	  edge->inline_failed = not_good;
! 	  if (dump_file)
! 	    fprintf (dump_file, " inline_failed:%s.\n",
! 		     cgraph_inline_failed_string (edge->inline_failed));
  	  continue;
  	}
!       if (!cgraph_default_inline_p (edge->callee, &edge->inline_failed))
! 	{
! 	  if (dump_file)
! 	    fprintf (dump_file, " inline_failed:%s.\n",
! 		     cgraph_inline_failed_string (edge->inline_failed));
! 	  continue;
! 	}
!       if (!tree_can_inline_p (edge)
! 	  || edge->call_stmt_cannot_inline_p)
  	{
  	  if (dump_file)
  	    fprintf (dump_file, " inline_failed:%s.\n",
--- 1319,1350 ----
  		   inline_summary (edge->callee)->size);
  	  fprintf (dump_file,
  		   " to be inlined into %s in %s:%i\n"
! 		   " Estimated growth after inlined into all is %+i insns.\n"
  		   " Estimated badness is %i, frequency %.2f.\n",
  		   cgraph_node_name (edge->caller),
  		   flag_wpa ? "unknown"
  		   : gimple_filename ((const_gimple) edge->call_stmt),
! 		   flag_wpa ? -1
! 		   : gimple_lineno ((const_gimple) edge->call_stmt),
  		   estimate_growth (edge->callee),
  		   badness,
  		   edge->frequency / (double)CGRAPH_FREQ_BASE);
  	  if (edge->count)
! 	    fprintf (dump_file," Called "HOST_WIDEST_INT_PRINT_DEC"x\n",
! 		     edge->count);
  	  if (dump_flags & TDF_DETAILS)
! 	    edge_badness (edge, true);
  	}
  
!       if (overall_size + growth > max_size
! 	  && !DECL_DISREGARD_INLINE_LIMITS (edge->callee->decl))
  	{
! 	  edge->inline_failed = CIF_INLINE_UNIT_GROWTH_LIMIT;
! 	  report_inline_failed_reason (edge);
  	  continue;
  	}
! 
!       if (!want_inline_small_function_p (edge, true))
  	{
  	  if (dump_file)
  	    fprintf (dump_file, " inline_failed:%s.\n",
*************** cgraph_decide_inlining_of_small_function
*** 1075,1083 ****
  	  where = edge->caller;
  	  if (where->global.inlined_to)
  	    where = where->global.inlined_to;
! 	  if (!cgraph_decide_recursive_inlining (edge,
! 						 flag_indirect_inlining
! 						 ? &new_indirect_edges : NULL))
  	    {
  	      edge->inline_failed = CIF_RECURSIVE_INLINING;
  	      continue;
--- 1356,1364 ----
  	  where = edge->caller;
  	  if (where->global.inlined_to)
  	    where = where->global.inlined_to;
! 	  if (!recursive_inlining (edge,
! 				   flag_indirect_inlining
! 				   ? &new_indirect_edges : NULL))
  	    {
  	      edge->inline_failed = CIF_RECURSIVE_INLINING;
  	      continue;
*************** cgraph_decide_inlining_of_small_function
*** 1089,1102 ****
        else
  	{
  	  struct cgraph_node *callee;
! 	  if (!cgraph_check_inline_limits (edge, &edge->inline_failed))
  	    {
! 	      if (dump_file)
! 		fprintf (dump_file, " Not inlining into %s:%s.\n",
! 			 cgraph_node_name (edge->caller),
! 			 cgraph_inline_failed_string (edge->inline_failed));
  	      continue;
  	    }
  	  callee = edge->callee;
  	  gcc_checking_assert (!callee->global.inlined_to);
  	  cgraph_mark_inline_edge (edge, true, &new_indirect_edges);
--- 1370,1402 ----
        else
  	{
  	  struct cgraph_node *callee;
! 	  struct cgraph_node *outer_node = NULL;
! 	  int depth = 0;
! 
! 	  /* Consider the case where self recursive function A is inlined into B.
! 	     This is desired optimization in some cases, since it leads to effect
! 	     similar of loop peeling and we might completely optimize out the
! 	     recursive call.  However we must be extra selective.  */
! 
! 	  where = edge->caller;
! 	  while (where->global.inlined_to)
  	    {
! 	      if (where->decl == edge->callee->decl)
! 		outer_node = where, depth++;
! 	      where = where->callers->caller;
! 	    }
! 	  if (outer_node
! 	      && !want_inline_self_recursive_call_p (edge, outer_node,
! 						     true, depth))
! 	    {
! 	      edge->inline_failed
! 		= (DECL_DISREGARD_INLINE_LIMITS (edge->callee->decl)
! 		   ? CIF_RECURSIVE_INLINING : CIF_UNSPECIFIED);
  	      continue;
  	    }
+ 	  else if (depth && dump_file)
+ 	    fprintf (dump_file, " Peeling recursion with depth %i\n", depth);
+ 
  	  callee = edge->callee;
  	  gcc_checking_assert (!callee->global.inlined_to);
  	  cgraph_mark_inline_edge (edge, true, &new_indirect_edges);
*************** cgraph_decide_inlining_of_small_function
*** 1148,1190 ****
  	    fprintf (dump_file, "New minimal size reached: %i\n", min_size);
  	}
      }
-   while (!fibheap_empty (heap))
-     {
-       int badness = fibheap_min_key (heap);
- 
-       edge = (struct cgraph_edge *) fibheap_extract_min (heap);
-       gcc_assert (edge->aux);
-       edge->aux = NULL;
-       if (!edge->inline_failed)
- 	continue;
- #ifdef ENABLE_CHECKING
-       gcc_assert (cgraph_edge_badness (edge, false) >= badness);
- #endif
-       if (dump_file)
- 	{
- 	  fprintf (dump_file,
- 		   "\nSkipping %s with %i size\n",
- 		   cgraph_node_name (edge->callee),
- 		   inline_summary (edge->callee)->size);
- 	  fprintf (dump_file,
- 		   " called by %s in %s:%i\n"
- 		   " Estimated growth after inlined into all callees is %+i insns.\n"
- 		   " Estimated badness is %i, frequency %.2f.\n",
- 		   cgraph_node_name (edge->caller),
- 		   flag_wpa ? "unknown"
- 		   : gimple_filename ((const_gimple) edge->call_stmt),
- 		   flag_wpa ? -1 : gimple_lineno ((const_gimple) edge->call_stmt),
- 		   estimate_growth (edge->callee),
- 		   badness,
- 		   edge->frequency / (double)CGRAPH_FREQ_BASE);
- 	  if (edge->count)
- 	    fprintf (dump_file," Called "HOST_WIDEST_INT_PRINT_DEC"x\n", edge->count);
- 	  if (dump_flags & TDF_DETAILS)
- 	    cgraph_edge_badness (edge, true);
- 	}
-       if (!inline_summary (edge->callee)->disregard_inline_limits && edge->inline_failed)
- 	edge->inline_failed = CIF_INLINE_UNIT_GROWTH_LIMIT;
-     }
  
    if (new_indirect_edges)
      VEC_free (cgraph_edge_p, heap, new_indirect_edges);
--- 1448,1453 ----
*************** cgraph_decide_inlining_of_small_function
*** 1195,1201 ****
  /* Flatten NODE from the IPA inliner.  */
  
  static void
! cgraph_flatten (struct cgraph_node *node)
  {
    struct cgraph_edge *e;
  
--- 1458,1464 ----
  /* Flatten NODE from the IPA inliner.  */
  
  static void
! flatten_function (struct cgraph_node *node)
  {
    struct cgraph_edge *e;
  
*************** cgraph_flatten (struct cgraph_node *node
*** 1208,1229 ****
      {
        struct cgraph_node *orig_callee;
  
-       if (e->call_stmt_cannot_inline_p)
- 	{
- 	  if (dump_file)
- 	    fprintf (dump_file, "Not inlining: %s",
- 		     cgraph_inline_failed_string (e->inline_failed));
- 	  continue;
- 	}
- 
-       if (!e->callee->analyzed)
- 	{
- 	  if (dump_file)
- 	    fprintf (dump_file,
- 		     "Not inlining: Function body not available.\n");
- 	  continue;
- 	}
- 
        /* We've hit cycle?  It is time to give up.  */
        if (e->callee->aux)
  	{
--- 1471,1476 ----
*************** cgraph_flatten (struct cgraph_node *node
*** 1240,1249 ****
  	 it in order to fully flatten the leaves.  */
        if (!e->inline_failed)
  	{
! 	  cgraph_flatten (e->callee);
  	  continue;
  	}
  
        if (cgraph_edge_recursive_p (e))
  	{
  	  if (dump_file)
--- 1487,1504 ----
  	 it in order to fully flatten the leaves.  */
        if (!e->inline_failed)
  	{
! 	  flatten_function (e->callee);
  	  continue;
  	}
  
+       /* Flatten attribute needs to be processed during late inlining. For
+ 	 extra code quality we however do flattening during early optimization,
+ 	 too.  */
+       if (cgraph_state != CGRAPH_STATE_IPA_SSA
+ 	  ? !can_inline_edge_p (e, true)
+ 	  : !can_early_inline_edge_p (e))
+ 	continue;
+ 
        if (cgraph_edge_recursive_p (e))
  	{
  	  if (dump_file)
*************** cgraph_flatten (struct cgraph_node *node
*** 1251,1264 ****
  	  continue;
  	}
  
-       if (!tree_can_inline_p (e))
- 	{
- 	  if (dump_file)
- 	    fprintf (dump_file, "Not inlining: %s",
- 		     cgraph_inline_failed_string (e->inline_failed));
- 	  continue;
- 	}
- 
        if (gimple_in_ssa_p (DECL_STRUCT_FUNCTION (node->decl))
  	  != gimple_in_ssa_p (DECL_STRUCT_FUNCTION (e->callee->decl)))
  	{
--- 1506,1511 ----
*************** cgraph_flatten (struct cgraph_node *node
*** 1277,1283 ****
        cgraph_mark_inline_edge (e, true, NULL);
        if (e->callee != orig_callee)
  	orig_callee->aux = (void *) node;
!       cgraph_flatten (e->callee);
        if (e->callee != orig_callee)
  	orig_callee->aux = NULL;
      }
--- 1524,1530 ----
        cgraph_mark_inline_edge (e, true, NULL);
        if (e->callee != orig_callee)
  	orig_callee->aux = (void *) node;
!       flatten_function (e->callee);
        if (e->callee != orig_callee)
  	orig_callee->aux = NULL;
      }
*************** cgraph_flatten (struct cgraph_node *node
*** 1289,1295 ****
     expenses on updating data structures.  */
  
  static unsigned int
! cgraph_decide_inlining (void)
  {
    struct cgraph_node *node;
    int nnodes;
--- 1536,1542 ----
     expenses on updating data structures.  */
  
  static unsigned int
! ipa_inline (void)
  {
    struct cgraph_node *node;
    int nnodes;
*************** cgraph_decide_inlining (void)
*** 1366,1406 ****
  	  if (dump_file)
  	    fprintf (dump_file,
  		     "Flattening %s\n", cgraph_node_name (node));
! 	  cgraph_flatten (node);
  	}
      }
  
!   cgraph_decide_inlining_of_small_functions ();
  
    if (flag_inline_functions_called_once)
      {
        if (dump_file)
  	fprintf (dump_file, "\nDeciding on functions called once:\n");
  
        /* And finally decide what functions are called once.  */
!       for (i = nnodes - 1; i >= 0; i--)
  	{
- 	  node = order[i];
- 
  	  if (node->callers
  	      && !node->callers->next_caller
  	      && !node->global.inlined_to
- 	      && cgraph_will_be_removed_from_program_if_no_direct_calls (node)
- 	      && inline_summary (node)->inlinable
- 	      && cgraph_function_body_availability (node) >= AVAIL_AVAILABLE
  	      && node->callers->inline_failed
  	      && node->callers->caller != node
  	      && node->callers->caller->global.inlined_to != node
! 	      && !node->callers->call_stmt_cannot_inline_p
! 	      && tree_can_inline_p (node->callers)
! 	      && !DECL_EXTERNAL (node->decl))
  	    {
! 	      cgraph_inline_failed_t reason;
  	      old_size = overall_size;
  	      if (dump_file)
  		{
  		  fprintf (dump_file,
! 			   "\nConsidering %s size %i.\n",
  			   cgraph_node_name (node), inline_summary (node)->size);
  		  fprintf (dump_file,
  			   " Called once from %s %i insns.\n",
--- 1613,1656 ----
  	  if (dump_file)
  	    fprintf (dump_file,
  		     "Flattening %s\n", cgraph_node_name (node));
! 	  flatten_function (node);
  	}
      }
  
!   inline_small_functions ();
!   cgraph_remove_unreachable_nodes (true, dump_file);
!   free (order);
  
+   /* We already perform some inlining of functions called once during
+      inlining small functions above.  After unreachable nodes are removed,
+      we still might do a quick check that nothing new is found.  */
    if (flag_inline_functions_called_once)
      {
        if (dump_file)
  	fprintf (dump_file, "\nDeciding on functions called once:\n");
  
        /* And finally decide what functions are called once.  */
!       for (node = cgraph_nodes; node; node = node->next)
  	{
  	  if (node->callers
  	      && !node->callers->next_caller
  	      && !node->global.inlined_to
  	      && node->callers->inline_failed
  	      && node->callers->caller != node
  	      && node->callers->caller->global.inlined_to != node
! 	      && cgraph_will_be_removed_from_program_if_no_direct_calls (node)
! 	      && inline_summary (node)->inlinable
! 	      && cgraph_function_body_availability (node) >= AVAIL_AVAILABLE
! 	      && !DECL_EXTERNAL (node->decl)
! 	      && can_inline_edge_p (node->callers, true))
  	    {
! 	      struct cgraph_node *caller = node->callers->caller;
! 
  	      old_size = overall_size;
  	      if (dump_file)
  		{
  		  fprintf (dump_file,
! 			   "\nInlining %s size %i.\n",
  			   cgraph_node_name (node), inline_summary (node)->size);
  		  fprintf (dump_file,
  			   " Called once from %s %i insns.\n",
*************** cgraph_decide_inlining (void)
*** 1408,1432 ****
  			   inline_summary (node->callers->caller)->size);
  		}
  
! 	      if (cgraph_check_inline_limits (node->callers, &reason))
! 		{
! 		  struct cgraph_node *caller = node->callers->caller;
! 		  cgraph_mark_inline_edge (node->callers, true, NULL);
! 		  if (dump_file)
! 		    fprintf (dump_file,
! 			     " Inlined into %s which now has %i size"
! 			     " for a net change of %+i size.\n",
! 			     cgraph_node_name (caller),
! 			     inline_summary (caller)->size,
! 			     overall_size - old_size);
! 		}
! 	      else
! 		{
! 		  if (dump_file)
! 		    fprintf (dump_file,
! 			     " Not inlining: %s.\n",
!                              cgraph_inline_failed_string (reason));
! 		}
  	    }
  	}
      }
--- 1658,1671 ----
  			   inline_summary (node->callers->caller)->size);
  		}
  
! 	      cgraph_mark_inline_edge (node->callers, true, NULL);
! 	      if (dump_file)
! 		fprintf (dump_file,
! 			 " Inlined into %s which now has %i size"
! 			 " for a net change of %+i size.\n",
! 			 cgraph_node_name (caller),
! 			 inline_summary (caller)->size,
! 			 overall_size - old_size);
  	    }
  	}
      }
*************** cgraph_decide_inlining (void)
*** 1441,1532 ****
  	     "size %i turned to %i size.\n\n",
  	     ncalls_inlined, nfunctions_inlined, initial_size,
  	     overall_size);
-   free (order);
    /* In WPA we use inline summaries for partitioning process.  */
    if (!flag_wpa)
      inline_free_summary ();
    return 0;
  }
  
- /* Return true when N is leaf function.  Accept cheap builtins
-    in leaf functions.  */
- 
- static bool
- leaf_node_p (struct cgraph_node *n)
- {
-   struct cgraph_edge *e;
-   for (e = n->callees; e; e = e->next_callee)
-     if (!is_inexpensive_builtin (e->callee->decl))
-       return false;
-   return true;
- }
- 
- /* Return true if the edge E is inlinable during early inlining.  */
- 
- static bool
- cgraph_edge_early_inlinable_p (struct cgraph_edge *e, FILE *file)
- {
-   if (!inline_summary (e->callee)->inlinable)
-     {
-       if (file)
- 	fprintf (file, "Not inlining: Function not inlinable.\n");
-       return false;
-     }
-   if (!e->callee->analyzed)
-     {
-       if (file)
- 	fprintf (file, "Not inlining: Function body not available.\n");
-       return false;
-     }
-   if (!tree_can_inline_p (e)
-       || e->call_stmt_cannot_inline_p)
-     {
-       if (file)
- 	fprintf (file, "Not inlining: %s.\n",
- 		 cgraph_inline_failed_string (e->inline_failed));
-       return false;
-     }
-   if (!gimple_in_ssa_p (DECL_STRUCT_FUNCTION (e->caller->decl))
-       || !gimple_in_ssa_p (DECL_STRUCT_FUNCTION (e->callee->decl)))
-     {
-       if (file)
- 	fprintf (file, "Not inlining: not in SSA form.\n");
-       return false;
-     }
-   return true;
- }
- 
  /* Inline always-inline function calls in NODE.  */
  
  static bool
! cgraph_perform_always_inlining (struct cgraph_node *node)
  {
    struct cgraph_edge *e;
    bool inlined = false;
  
    for (e = node->callees; e; e = e->next_callee)
      {
!       if (!inline_summary (e->callee)->disregard_inline_limits)
  	continue;
  
-       if (dump_file)
- 	fprintf (dump_file,
- 		 "Considering always-inline candidate %s.\n",
- 		 cgraph_node_name (e->callee));
- 
        if (cgraph_edge_recursive_p (e))
  	{
  	  if (dump_file)
! 	    fprintf (dump_file, "Not inlining: recursive call.\n");
  	  e->inline_failed = CIF_RECURSIVE_INLINING;
  	  continue;
  	}
  
!       if (!cgraph_edge_early_inlinable_p (e, dump_file))
  	continue;
  
        if (dump_file)
! 	fprintf (dump_file, " Inlining %s into %s.\n",
  		 cgraph_node_name (e->callee),
  		 cgraph_node_name (e->caller));
        cgraph_mark_inline_edge (e, true, NULL);
--- 1680,1718 ----
  	     "size %i turned to %i size.\n\n",
  	     ncalls_inlined, nfunctions_inlined, initial_size,
  	     overall_size);
    /* In WPA we use inline summaries for partitioning process.  */
    if (!flag_wpa)
      inline_free_summary ();
    return 0;
  }
  
  /* Inline always-inline function calls in NODE.  */
  
  static bool
! inline_always_inline_functions (struct cgraph_node *node)
  {
    struct cgraph_edge *e;
    bool inlined = false;
  
    for (e = node->callees; e; e = e->next_callee)
      {
!       if (!DECL_DISREGARD_INLINE_LIMITS (e->callee->decl))
  	continue;
  
        if (cgraph_edge_recursive_p (e))
  	{
  	  if (dump_file)
! 	    fprintf (dump_file, "  Not inlining recursive call to %s.\n",
! 		     cgraph_node_name (e->callee));
  	  e->inline_failed = CIF_RECURSIVE_INLINING;
  	  continue;
  	}
  
!       if (!can_early_inline_edge_p (e))
  	continue;
  
        if (dump_file)
! 	fprintf (dump_file, "  Inlining %s into %s (always_inline).\n",
  		 cgraph_node_name (e->callee),
  		 cgraph_node_name (e->caller));
        cgraph_mark_inline_edge (e, true, NULL);
*************** cgraph_perform_always_inlining (struct c
*** 1540,1563 ****
     expenses on updating data structures.  */
  
  static bool
! cgraph_decide_inlining_incrementally (struct cgraph_node *node)
  {
    struct cgraph_edge *e;
    bool inlined = false;
-   cgraph_inline_failed_t failed_reason;
- 
-   /* Never inline regular functions into always-inline functions
-      during incremental inlining.  */
-   if (inline_summary (node)->disregard_inline_limits)
-     return false;
  
    for (e = node->callees; e; e = e->next_callee)
      {
-       int allowed_growth = 0;
- 
        if (!inline_summary (e->callee)->inlinable
! 	  || !e->inline_failed
! 	  || inline_summary (e->callee)->disregard_inline_limits)
  	continue;
  
        /* Do not consider functions not declared inline.  */
--- 1726,1740 ----
     expenses on updating data structures.  */
  
  static bool
! early_inline_small_functions (struct cgraph_node *node)
  {
    struct cgraph_edge *e;
    bool inlined = false;
  
    for (e = node->callees; e; e = e->next_callee)
      {
        if (!inline_summary (e->callee)->inlinable
! 	  || !e->inline_failed)
  	continue;
  
        /* Do not consider functions not declared inline.  */
*************** cgraph_decide_inlining_incrementally (st
*** 1570,1616 ****
  	fprintf (dump_file, "Considering inline candidate %s.\n",
  		 cgraph_node_name (e->callee));
  
        if (cgraph_edge_recursive_p (e))
  	{
  	  if (dump_file)
! 	    fprintf (dump_file, "Not inlining: recursive call.\n");
  	  continue;
  	}
  
!       if (!cgraph_edge_early_inlinable_p (e, dump_file))
  	continue;
  
!       if (cgraph_maybe_hot_edge_p (e) && leaf_node_p (e->callee)
! 	  && optimize_function_for_speed_p (cfun))
! 	allowed_growth = PARAM_VALUE (PARAM_EARLY_INLINING_INSNS);
! 
!       /* When the function body would grow and inlining the function
! 	 won't eliminate the need for offline copy of the function,
! 	 don't inline.  */
!       if (estimate_edge_growth (e) > allowed_growth)
! 	{
! 	  if (dump_file)
! 	    fprintf (dump_file,
! 		     "Not inlining: code size would grow by %i.\n",
! 		     estimate_edge_growth (e));
! 	  continue;
! 	}
!       if (!cgraph_check_inline_limits (e, &e->inline_failed))
! 	{
! 	  if (dump_file)
! 	    fprintf (dump_file, "Not inlining: %s.\n",
! 		     cgraph_inline_failed_string (e->inline_failed));
! 	  continue;
! 	}
!       if (cgraph_default_inline_p (e->callee, &failed_reason))
! 	{
! 	  if (dump_file)
! 	    fprintf (dump_file, " Inlining %s into %s.\n",
! 		     cgraph_node_name (e->callee),
! 		     cgraph_node_name (e->caller));
! 	  cgraph_mark_inline_edge (e, true, NULL);
! 	  inlined = true;
! 	}
      }
  
    return inlined;
--- 1747,1771 ----
  	fprintf (dump_file, "Considering inline candidate %s.\n",
  		 cgraph_node_name (e->callee));
  
+       if (!can_early_inline_edge_p (e))
+ 	continue;
+ 
        if (cgraph_edge_recursive_p (e))
  	{
  	  if (dump_file)
! 	    fprintf (dump_file, "  Not inlining: recursive call.\n");
  	  continue;
  	}
  
!       if (!want_early_inline_function_p (e))
  	continue;
  
!       if (dump_file)
! 	fprintf (dump_file, " Inlining %s into %s.\n",
! 		 cgraph_node_name (e->callee),
! 		 cgraph_node_name (e->caller));
!       cgraph_mark_inline_edge (e, true, NULL);
!       inlined = true;
      }
  
    return inlined;
*************** static GTY ((length ("nnodes"))) struct 
*** 1626,1632 ****
     passes to be somewhat more effective and avoids some code duplication in
     later real inlining pass for testcases with very many function calls.  */
  static unsigned int
! cgraph_early_inlining (void)
  {
    struct cgraph_node *node = cgraph_get_node (current_function_decl);
    struct cgraph_edge *edge;
--- 1781,1787 ----
     passes to be somewhat more effective and avoids some code duplication in
     later real inlining pass for testcases with very many function calls.  */
  static unsigned int
! early_inliner (void)
  {
    struct cgraph_node *node = cgraph_get_node (current_function_decl);
    struct cgraph_edge *edge;
*************** cgraph_early_inlining (void)
*** 1643,1653 ****
  
    /* Even when not optimizing or not inlining inline always-inline
       functions.  */
!   inlined = cgraph_perform_always_inlining (node);
  
    if (!optimize
        || flag_no_inline
!       || !flag_early_inlining)
      ;
    else if (lookup_attribute ("flatten",
  			     DECL_ATTRIBUTES (node->decl)) != NULL)
--- 1798,1817 ----
  
    /* Even when not optimizing or not inlining inline always-inline
       functions.  */
!   inlined = inline_always_inline_functions (node);
  
    if (!optimize
        || flag_no_inline
!       || !flag_early_inlining
!       /* Never inline regular functions into always-inline functions
! 	 during incremental inlining.  This sucks as functions calling
! 	 always inline functions will get less optimized, but at the
! 	 same time inlining of functions calling always inline
! 	 functoin into an always inline function might introduce
! 	 cycles of edges to be always inlined in the callgraph.
! 
! 	 We might want to be smarter and just avoid this type of inlining.  */
!       || DECL_DISREGARD_INLINE_LIMITS (node->decl))
      ;
    else if (lookup_attribute ("flatten",
  			     DECL_ATTRIBUTES (node->decl)) != NULL)
*************** cgraph_early_inlining (void)
*** 1657,1663 ****
        if (dump_file)
  	fprintf (dump_file,
  		 "Flattening %s\n", cgraph_node_name (node));
!       cgraph_flatten (node);
        inlined = true;
      }
    else
--- 1821,1827 ----
        if (dump_file)
  	fprintf (dump_file,
  		 "Flattening %s\n", cgraph_node_name (node));
!       flatten_function (node);
        inlined = true;
      }
    else
*************** cgraph_early_inlining (void)
*** 1665,1674 ****
        /* We iterate incremental inlining to get trivial cases of indirect
  	 inlining.  */
        while (iterations < PARAM_VALUE (PARAM_EARLY_INLINER_MAX_ITERATIONS)
! 	     && cgraph_decide_inlining_incrementally (node))
  	{
  	  timevar_push (TV_INTEGRATION);
  	  todo |= optimize_inline_calls (current_function_decl);
  	  timevar_pop (TV_INTEGRATION);
  	  iterations++;
  	  inlined = false;
--- 1829,1850 ----
        /* We iterate incremental inlining to get trivial cases of indirect
  	 inlining.  */
        while (iterations < PARAM_VALUE (PARAM_EARLY_INLINER_MAX_ITERATIONS)
! 	     && early_inline_small_functions (node))
  	{
  	  timevar_push (TV_INTEGRATION);
  	  todo |= optimize_inline_calls (current_function_decl);
+ 
+ 	  /* Technically we ought to recompute inline parameters so the new
+  	     iteration of early inliner works as expected.  We however have
+ 	     values approximately right and thus we only need to update edge
+ 	     info that might be cleared out for newly discovered edges.  */
+ 	  for (edge = node->callees; edge; edge = edge->next_callee)
+ 	    {
+ 	      edge->call_stmt_size
+ 		= estimate_num_insns (edge->call_stmt, &eni_size_weights);
+ 	      edge->call_stmt_time
+ 		= estimate_num_insns (edge->call_stmt, &eni_time_weights);
+ 	    }
  	  timevar_pop (TV_INTEGRATION);
  	  iterations++;
  	  inlined = false;
*************** cgraph_early_inlining (void)
*** 1681,1699 ****
      {
        timevar_push (TV_INTEGRATION);
        todo |= optimize_inline_calls (current_function_decl);
- 
-       /* Technically we ought to recompute inline parameters so the new iteration of
- 	 early inliner works as expected.  We however have values approximately right
- 	 and thus we only need to update edge info that might be cleared out for
- 	 newly discovered edges.  */
-       for (edge = node->callees; edge; edge = edge->next_callee)
- 	{
- 	  edge->call_stmt_size
- 	    = estimate_num_insns (edge->call_stmt, &eni_size_weights);
- 	  edge->call_stmt_time
- 	    = estimate_num_insns (edge->call_stmt, &eni_time_weights);
- 	}
- 
        timevar_pop (TV_INTEGRATION);
      }
  
--- 1857,1862 ----
*************** struct gimple_opt_pass pass_early_inline
*** 1708,1714 ****
    GIMPLE_PASS,
    "einline",	 			/* name */
    NULL,					/* gate */
!   cgraph_early_inlining,		/* execute */
    NULL,					/* sub */
    NULL,					/* next */
    0,					/* static_pass_number */
--- 1871,1877 ----
    GIMPLE_PASS,
    "einline",	 			/* name */
    NULL,					/* gate */
!   early_inliner,			/* execute */
    NULL,					/* sub */
    NULL,					/* next */
    0,					/* static_pass_number */
*************** inline_transform (struct cgraph_node *no
*** 1730,1737 ****
    struct cgraph_edge *e;
    bool inline_p = false;
  
!   /* FIXME: Currently the pass manager is adding inline transform more than once to some
!      clones.  This needs revisiting after WPA cleanups.  */
    if (cfun->after_inlining)
      return 0;
  
--- 1893,1900 ----
    struct cgraph_edge *e;
    bool inline_p = false;
  
!   /* FIXME: Currently the pass manager is adding inline transform more than
!      once to some clones.  This needs revisiting after WPA cleanups.  */
    if (cfun->after_inlining)
      return 0;
  
*************** inline_transform (struct cgraph_node *no
*** 1762,1768 ****
     happens during early inlining.  */
  
  static bool
! gate_cgraph_decide_inlining (void)
  {
    /* ???  We'd like to skip this if not optimizing or not inlining as
       all always-inline functions have been processed by early
--- 1925,1931 ----
     happens during early inlining.  */
  
  static bool
! gate_ipa_inline (void)
  {
    /* ???  We'd like to skip this if not optimizing or not inlining as
       all always-inline functions have been processed by early
*************** struct ipa_opt_pass_d pass_ipa_inline =
*** 1777,1784 ****
   {
    IPA_PASS,
    "inline",				/* name */
!   gate_cgraph_decide_inlining,		/* gate */
!   cgraph_decide_inlining,		/* execute */
    NULL,					/* sub */
    NULL,					/* next */
    0,					/* static_pass_number */
--- 1940,1947 ----
   {
    IPA_PASS,
    "inline",				/* name */
!   gate_ipa_inline,			/* gate */
!   ipa_inline,				/* execute */
    NULL,					/* sub */
    NULL,					/* next */
    0,					/* static_pass_number */
Index: ipa-inline.h
===================================================================
*** ipa-inline.h	(revision 172597)
--- ipa-inline.h	(working copy)
*************** struct inline_summary
*** 41,48 ****
    /* False when there something makes versioning impossible.
       Currently computed and used only by ipa-cp.  */
    unsigned versionable : 1;
-   /* True when function should be inlined independently on its size.  */
-   unsigned disregard_inline_limits : 1;
  
    /* Information about function that will result after applying all the
       inline decisions present in the callgraph.  Generally kept up to
--- 41,46 ----
Index: ipa-inline-analysis.c
===================================================================
*** ipa-inline-analysis.c	(revision 172597)
--- ipa-inline-analysis.c	(working copy)
*************** dump_inline_summary (FILE *f, struct cgr
*** 131,137 ****
        struct inline_summary *s = inline_summary (node);
        fprintf (f, "Inline summary for %s/%i", cgraph_node_name (node),
  	       node->uid);
!       if (s->disregard_inline_limits)
  	fprintf (f, " always_inline");
        if (s->inlinable)
  	fprintf (f, " inlinable");
--- 131,137 ----
        struct inline_summary *s = inline_summary (node);
        fprintf (f, "Inline summary for %s/%i", cgraph_node_name (node),
  	       node->uid);
!       if (DECL_DISREGARD_INLINE_LIMITS (node->decl))
  	fprintf (f, " always_inline");
        if (s->inlinable)
  	fprintf (f, " inlinable");
*************** dump_inline_summary (FILE *f, struct cgr
*** 142,148 ****
        fprintf (f, "  global time:     %i\n", s->time);
        fprintf (f, "  self size:       %i, benefit: %i\n",
  	       s->self_size, s->size_inlining_benefit);
!       fprintf (f, "  global size:     %i", s->size);
        fprintf (f, "  self stack:      %i\n",
  	       (int)s->estimated_self_stack_size);
        fprintf (f, "  global stack:    %i\n\n",
--- 142,148 ----
        fprintf (f, "  global time:     %i\n", s->time);
        fprintf (f, "  self size:       %i, benefit: %i\n",
  	       s->self_size, s->size_inlining_benefit);
!       fprintf (f, "  global size:     %i\n", s->size);
        fprintf (f, "  self stack:      %i\n",
  	       (int)s->estimated_self_stack_size);
        fprintf (f, "  global stack:    %i\n\n",
*************** estimate_function_body_sizes (struct cgr
*** 303,308 ****
--- 303,319 ----
  	      struct cgraph_edge *edge = cgraph_edge (node, stmt);
  	      edge->call_stmt_size = this_size;
  	      edge->call_stmt_time = this_time;
+ 
+ 	      /* Do not inline calls where we cannot triviall work around mismatches
+ 		 in argument or return types.  */
+ 	      if (edge->callee
+ 		  && !gimple_check_call_matching_types (stmt, edge->callee->decl))
+ 		{
+ 		  edge->call_stmt_cannot_inline_p = true;
+ 		  gimple_call_set_cannot_inline (stmt, true);
+ 		}
+ 	      else
+ 		gcc_assert (!gimple_call_cannot_inline_p (stmt));
  	    }
  
  	  this_time *= freq;
*************** compute_inline_parameters (struct cgraph
*** 364,371 ****
  
    /* Can this function be inlined at all?  */
    info->inlinable = tree_inlinable_function_p (node->decl);
-   if (!info->inlinable)
-     info->disregard_inline_limits = 0;
  
    /* Inlinable functions always can change signature.  */
    if (info->inlinable)
--- 375,380 ----
*************** compute_inline_parameters (struct cgraph
*** 388,395 ****
    info->estimated_growth = INT_MIN;
    info->stack_frame_offset = 0;
    info->estimated_stack_size = info->estimated_self_stack_size;
-   info->disregard_inline_limits
-     = DECL_DISREGARD_INLINE_LIMITS (node->decl);
  }
  
  
--- 397,402 ----
*************** estimate_growth (struct cgraph_node *nod
*** 483,507 ****
  
    for (e = node->callers; e; e = e->next_caller)
      {
!       if (e->caller == node)
          self_recursive = true;
!       if (e->inline_failed)
! 	growth += estimate_edge_growth (e);
      }
  
!   /* ??? Wrong for non-trivially self recursive functions or cases where
!      we decide to not inline for different reasons, but it is not big deal
!      as in that case we will keep the body around, but we will also avoid
!      some inlining.  */
!   if (cgraph_will_be_removed_from_program_if_no_direct_calls (node)
!       && !DECL_EXTERNAL (node->decl) && !self_recursive)
!     growth -= info->size;
!   /* COMDAT functions are very often not shared across multiple units since they
!      come from various template instantiations.  Take this into account.  */
!   else  if (DECL_COMDAT (node->decl) && !self_recursive
! 	    && cgraph_can_remove_if_no_direct_calls_p (node))
!     growth -= (info->size
! 	       * (100 - PARAM_VALUE (PARAM_COMDAT_SHARING_PROBABILITY)) + 50) / 100;
  
    info->estimated_growth = growth;
    return growth;
--- 490,523 ----
  
    for (e = node->callers; e; e = e->next_caller)
      {
!       gcc_checking_assert (e->inline_failed);
! 
!       if (e->caller == node
! 	  || (e->caller->global.inlined_to
! 	      && e->caller->global.inlined_to == node))
          self_recursive = true;
!       growth += estimate_edge_growth (e);
      }
+      
  
!   /* For self recursive functions the growth estimation really should be
!      infinity.  We don't want to return very large values because the growth
!      plays various roles in badness computation fractions.  Be sure to not
!      return zero or negative growths. */
!   if (self_recursive)
!     growth = growth < info->size ? info->size : growth;
!   else
!     {
!       if (cgraph_will_be_removed_from_program_if_no_direct_calls (node)
! 	  && !DECL_EXTERNAL (node->decl))
! 	growth -= info->size;
!       /* COMDAT functions are very often not shared across multiple units since they
! 	 come from various template instantiations.  Take this into account.  */
!       else  if (DECL_COMDAT (node->decl)
! 		&& cgraph_can_remove_if_no_direct_calls_p (node))
! 	growth -= (info->size
! 		   * (100 - PARAM_VALUE (PARAM_COMDAT_SHARING_PROBABILITY)) + 50) / 100;
!     }
  
    info->estimated_growth = growth;
    return growth;
*************** inline_read_summary (void)
*** 621,627 ****
  	      bp = lto_input_bitpack (ib);
  	      info->inlinable = bp_unpack_value (&bp, 1);
  	      info->versionable = bp_unpack_value (&bp, 1);
- 	      info->disregard_inline_limits = bp_unpack_value (&bp, 1);
  	    }
  
  	  lto_destroy_simple_input_block (file_data,
--- 637,642 ----
*************** inline_write_summary (cgraph_node_set se
*** 688,694 ****
  	  bp = bitpack_create (ob->main_stream);
  	  bp_pack_value (&bp, info->inlinable, 1);
  	  bp_pack_value (&bp, info->versionable, 1);
- 	  bp_pack_value (&bp, info->disregard_inline_limits, 1);
  	  lto_output_bitpack (&bp);
  	}
      }
--- 703,708 ----
Index: tree-inline.c
===================================================================
*** tree-inline.c	(revision 172597)
--- tree-inline.c	(working copy)
*************** inline_forbidden_p (tree fndecl)
*** 3138,3166 ****
    return forbidden_p;
  }
  
- /* Return true if CALLEE cannot be inlined into CALLER.  */
- 
- static bool
- inline_forbidden_into_p (tree caller, tree callee)
- {
-   /* Don't inline if the functions have different EH personalities.  */
-   if (DECL_FUNCTION_PERSONALITY (caller)
-       && DECL_FUNCTION_PERSONALITY (callee)
-       && (DECL_FUNCTION_PERSONALITY (caller)
- 	  != DECL_FUNCTION_PERSONALITY (callee)))
-     return true;
- 
-   /* Don't inline if the callee can throw non-call exceptions but the
-      caller cannot.  */
-   if (DECL_STRUCT_FUNCTION (callee)
-       && DECL_STRUCT_FUNCTION (callee)->can_throw_non_call_exceptions
-       && !(DECL_STRUCT_FUNCTION (caller)
- 	   && DECL_STRUCT_FUNCTION (caller)->can_throw_non_call_exceptions))
-     return true;
- 
-   return false;
- }
- 
  /* Returns nonzero if FN is a function that does not have any
     fundamental inline blocking properties.  */
  
--- 3138,3143 ----
*************** expand_call_inline (basic_block bb, gimp
*** 3750,3759 ****
        && gimple_has_body_p (DECL_ABSTRACT_ORIGIN (fn)))
      fn = DECL_ABSTRACT_ORIGIN (fn);
  
-   /* First check that inlining isn't simply forbidden in this case.  */
-   if (inline_forbidden_into_p (cg_edge->caller->decl, cg_edge->callee->decl))
-     goto egress;
- 
    /* Don't try to inline functions that are not well-suited to inlining.  */
    if (!cgraph_inline_p (cg_edge, &reason))
      {
--- 3727,3732 ----
*************** build_duplicate_type (tree type)
*** 5298,5372 ****
  
    return type;
  }
- 
- /* Return whether it is safe to inline a function because it used different
-    target specific options or call site actual types mismatch parameter types.
-    E is the call edge to be checked.  */
- bool
- tree_can_inline_p (struct cgraph_edge *e)
- {
- #if 0
-   /* This causes a regression in SPEC in that it prevents a cold function from
-      inlining a hot function.  Perhaps this should only apply to functions
-      that the user declares hot/cold/optimize explicitly.  */
- 
-   /* Don't inline a function with a higher optimization level than the
-      caller, or with different space constraints (hot/cold functions).  */
-   tree caller_tree = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (caller);
-   tree callee_tree = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (callee);
- 
-   if (caller_tree != callee_tree)
-     {
-       struct cl_optimization *caller_opt
- 	= TREE_OPTIMIZATION ((caller_tree)
- 			     ? caller_tree
- 			     : optimization_default_node);
- 
-       struct cl_optimization *callee_opt
- 	= TREE_OPTIMIZATION ((callee_tree)
- 			     ? callee_tree
- 			     : optimization_default_node);
- 
-       if ((caller_opt->optimize > callee_opt->optimize)
- 	  || (caller_opt->optimize_size != callee_opt->optimize_size))
- 	return false;
-     }
- #endif
-   tree caller, callee;
- 
-   caller = e->caller->decl;
-   callee = e->callee->decl;
- 
-   /* First check that inlining isn't simply forbidden in this case.  */
-   if (inline_forbidden_into_p (caller, callee))
-     {
-       e->inline_failed = CIF_UNSPECIFIED;
-       if (e->call_stmt)
- 	gimple_call_set_cannot_inline (e->call_stmt, true);
-       return false;
-     }
- 
-   /* Allow the backend to decide if inlining is ok.  */
-   if (!targetm.target_option.can_inline_p (caller, callee))
-     {
-       e->inline_failed = CIF_TARGET_OPTION_MISMATCH;
-       if (e->call_stmt)
- 	gimple_call_set_cannot_inline (e->call_stmt, true);
-       e->call_stmt_cannot_inline_p = true;
-       return false;
-     }
- 
-   /* Do not inline calls where we cannot triviall work around mismatches
-      in argument or return types.  */
-   if (e->call_stmt
-       && !gimple_check_call_matching_types (e->call_stmt, callee))
-     {
-       e->inline_failed = CIF_MISMATCHED_ARGUMENTS;
-       if (e->call_stmt)
- 	gimple_call_set_cannot_inline (e->call_stmt, true);
-       e->call_stmt_cannot_inline_p = true;
-       return false;
-     }
- 
-   return true;
- }
--- 5271,5273 ----
Index: tree-inline.h
===================================================================
*** tree-inline.h	(revision 172597)
--- tree-inline.h	(working copy)
*************** int estimate_num_insns (gimple, eni_weig
*** 182,188 ****
  int estimate_num_insns_fn (tree, eni_weights *);
  int count_insns_seq (gimple_seq, eni_weights *);
  bool tree_versionable_function_p (tree);
- bool tree_can_inline_p (struct cgraph_edge *e);
  
  extern tree remap_decl (tree decl, copy_body_data *id);
  extern tree remap_type (tree type, copy_body_data *id);
--- 182,187 ----
Index: Makefile.in
===================================================================
*** Makefile.in	(revision 172597)
--- Makefile.in	(working copy)
*************** matrix-reorg.o : matrix-reorg.c $(CONFIG
*** 3028,3035 ****
  ipa-inline.o : ipa-inline.c gt-ipa-inline.h $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
     $(TREE_H) langhooks.h $(TREE_INLINE_H) $(FLAGS_H) $(CGRAPH_H) intl.h \
     $(DIAGNOSTIC_H) $(FIBHEAP_H) $(PARAMS_H) $(TIMEVAR_H) $(TREE_PASS_H) \
!    $(HASHTAB_H) $(COVERAGE_H) $(GGC_H) $(TREE_FLOW_H) $(RTL_H) $(IPA_PROP_H) \
!    $(EXCEPT_H) gimple-pretty-print.h ipa-inline.h
  ipa-inline-analysis.o : ipa-inline-analysis.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
     $(TREE_H) langhooks.h $(TREE_INLINE_H) $(FLAGS_H) $(CGRAPH_H) intl.h \
     $(DIAGNOSTIC_H) $(PARAMS_H) $(TIMEVAR_H) $(TREE_PASS_H) \
--- 3028,3035 ----
  ipa-inline.o : ipa-inline.c gt-ipa-inline.h $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
     $(TREE_H) langhooks.h $(TREE_INLINE_H) $(FLAGS_H) $(CGRAPH_H) intl.h \
     $(DIAGNOSTIC_H) $(FIBHEAP_H) $(PARAMS_H) $(TIMEVAR_H) $(TREE_PASS_H) \
!    $(COVERAGE_H) $(GGC_H) $(TREE_FLOW_H) $(RTL_H) $(IPA_PROP_H) \
!    $(EXCEPT_H) gimple-pretty-print.h ipa-inline.h $(TARGET_H)
  ipa-inline-analysis.o : ipa-inline-analysis.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
     $(TREE_H) langhooks.h $(TREE_INLINE_H) $(FLAGS_H) $(CGRAPH_H) intl.h \
     $(DIAGNOSTIC_H) $(PARAMS_H) $(TIMEVAR_H) $(TREE_PASS_H) \

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Inliner heuristics facelift
  2011-04-17 14:29 ` Jan Hubicka
@ 2011-04-19 13:40   ` H.J. Lu
  0 siblings, 0 replies; 9+ messages in thread
From: H.J. Lu @ 2011-04-19 13:40 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Dominique Dhumieres, richard.guenther, gcc-patches

On Sun, Apr 17, 2011 at 7:14 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
> Hi,
> there is now tracking PR48636 for fortran inlining issues.  I would suggest
> reporting problems like this there or PR that blocks it instead of sending it
> to different threads so we have track of them.  It is quite important to get
> some understanding what sort of issues we have here.  We might want to collect
> some sort of inlining benchmark like we have for C++.  For the moment I think
> the polyhedron we have is working well, but just having a summary what
> benchmarks needs what --param tweaks and why (i.e. what is important to inline)
> would be useful.
>
> While final testing before comming I noticed that last minue gcc_assert added
> to can_inline checking that gimple stmt and edge cannot_inline_flags are in sync
> (they must be, otherwise we would do invalid optimizations at IPA) is failing
> on one of C++ testcases.
>
> The problem is code in tree-inline that incorrectly handles the flag on
> functions that do have mismatching EH personalities.
>
> This is final version of patch I comitted that fixes this by moving all the logic
> scattered across tree-inline.c into can_inline_edge_p where it belongs and adding
> corresponding CIF codes.  It also turned out that our all_set_cannot_inline are
> not quite up to date after devirtualization and other transforms.  As first cut
> i decided to move the computation into inline summary computation and I think
> i can remove it from gimplifier and value profiler in followup patch.
>
> Bootstrapped/regtested x86_64-linux, comitted.
>
> Index: ChangeLog
> ===================================================================
> *** ChangeLog   (revision 172608)
> --- ChangeLog   (working copy)
> ***************
> *** 1,3 ****
> --- 1,74 ----
> + 2011-04-17  Jan Hubicka  <jh@suse.cz>
> +
> +       * lto-symtab.c (lto_cgraph_replace_node): When call statement is
> +       present, also set gimple_call_set_cannot_inline.
> +       * ipa-inline.c: Update toplevel comment.
> +       (MAX_TIME): Remove.
> +       (cgraph_clone_inlined_nodes): Fix linebreaks.
> +       (cgraph_check_inline_limits): Restructure to ...
> +       (caller_growth_limits): ... this one; be more tolerant
> +       on growth in nested inline chains; add explanatory comment;
> +       fix stack accounting thinko introduced by previous patch.
> +       (cgraph_default_inline_p): Remove.
> +       (report_inline_failed_reason): New function.
> +       (can_inline_edge_p): New function.
> +       (can_early_inline_edge_p): New function.
> +       (leaf_node_p): Move upwards in file.
> +       (want_early_inline_function_p): New function.
> +       (want_inline_small_function_p): New function.
> +       (want_inline_self_recursive_call_p): New function.
> +       (cgraph_edge_badness): Rename to ...
> +       (edge_badness) ... this one; fix linebreaks.
> +       (update_edge_key): Update call of edge_baddness; add
> +       detailed dump about queue updates.
> +       (update_caller_keys): Use can_inline_edge_p and
> +       want_inline_small_function_p.
> +       (cgraph_decide_recursive_inlining): Rename to...
> +       (recursive_inlining): Use can_inline_edge_p and
> +       want_inline_self_recursive_call_p; simplify and
> +       remove no longer valid FIXME.
> +       (cgraph_set_inline_failed): Remove.
> +       (add_new_edges_to_heap): Use can_inline_edge_p and
> +       want_inline_small_function_p.
> +       (cgraph_decide_inlining_of_small_functions): Rename to ...
> +       (inline_small_functions): ... this one; cleanup; use
> +       can/want predicates; cleanup debug ouput; work edges
> +       till fibheap is exhausted and do not stop once unit
> +       growth is reached; remove later loop processing remaining
> +       edges.
> +       (cgraph_flatten): Rename to ...
> +       (flatten_function): ... this one; use can_inline_edge_p
> +       and can_early_inline_edge_p predicates.
> +       (cgraph_decide_inlining): Rename to ...
> +       (ipa_inline): ... this one; remove unreachable nodes before
> +       inlining functions called once; simplify the pass.
> +       (cgraph_perform_always_inlining): Rename to ...
> +       (inline_always_inline_functions): ... this one; use
> +       DECL_DISREGARD_INLINE_LIMITS; use can_inline_edge_p
> +       predicate
> +       (cgraph_decide_inlining_incrementally): Rename to ...
> +       (early_inline_small_functions): ... this one; simplify
> +       using new predicates; cleanup; make dumps prettier.
> +       (cgraph_early_inlining): Rename to ...
> +       (early_inliner): newer inline regular functions into always-inlines;
> +       fix updating of call stmt summaries.
> +       (pass_early_inline): Update for new names.
> +       (inline_transform): Fix formating.
> +       (gate_cgraph_decide_inlining): Rename to ...
> +       (pass_ipa_inline): ... this one.
> +       * ipa-inline.h (inline_summary): Remove disregard_inline_limits.
> +       * ipa-inline-analysis.c (dump_inline_summary): Update.
> +       (compute_inline_parameters): Do not compute disregard_inline_limits;
> +       look for mismatching arguments.
> +       (estimate_growth): Fix handlig of non-trivial self recursion.
> +       (inline_read_summary): Do not read info->disregard_inline_limits.
> +       (inline_write_summary): Do not write info->disregard_inline_limits.
> +       * tree-inline.c (inline_forbidden_into_p, tree_can_inline_p): Remove and
> +       move all checks into can_inline_edge_p predicate; re-enable code comparing
> +       optimization levels.
> +       (expand_call_inline): Do not test inline_forbidden_into_p.
> +       * Makefile.in (ipa-inline.o): Update arguments.
> +

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48683

-- 
H.J.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Inliner heuristics facelift
@ 2011-04-17 10:54 Jan Hubicka
  2011-04-19 18:37 ` Xinliang David Li
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Hubicka @ 2011-04-17 10:54 UTC (permalink / raw)
  To: gcc-patches

Hi,
cgraph inliner implementation is 8 years old now.  It has growh from simple
C-only inliner preventing unbounded code size explossion seen on earlier tree
inlining heuristics over "lets solve problem with abstraction penalty in
tramp3d" into "lets do something useful on Mozilla".

In this patch I am attempting to retrospect some design into the code that
since original one got lost in tons of various additions and migration from
different cgraph implementations.

While I think the overhall scheme of having way to transform cgraph&functions
separated from heuritics and heuristics implemented as a greedy algorithm doing
as much of useful inlining as allowed by the limits is OK, the implementation
got clumpsy.

Main idea of the patch is to move all the various check scattered across the
code into can_inline_edge_p predicate that test if inlining is at all possible.
In addition each of different inliner heuristics (always inline, flatteing,
early inlining, inlining of small functions, recursive inlining and inlining of
functions called once) has bit different rules on when it thinks inlining is
good idea.  For always inline and flattening it is trivial. For other there are
now separated want_* predicates doing all the checks at one place.

The patch also cleanups the dump files and makes function naming more regular,
especially by dropping nonsential cgraph prefixes coming from dark times when
cgraph and inliner was in single source file. 

While working on the patch I contied number of minor issues that are fixed,
since it was easier than go with immitating previous behaviour and adding
FIXMEs.

In particular

  1) Long standing TODO in estimate_growth concerning miscomputation of self
     recursive functions is gone.  It was trivial ever since the cgraph code
     was revamped to inline plans with clones.
  2) Small function inliner no longer inserts only functoins passing
     cgraph_default_inline_p nto the queue.  The feasibility of small function
     inlining is edge specific now and accounts estimated benefits into
     --param large-function-insns-auto and -single limits.
  3) There was bug in cgraph_check_inline_limits concerning stack usage
     computation for nested functions, probably introduced by my previous patch.
     It is fixed now
  4) cgraph_check_inline_limits was specially allowing inlining large function
     into small by setting the base of limit to be bigger caller or callee self
     size.  This is very important since often large functions should be inlined
     into tiny wrappers and benefit from the extra context.

     It is however also sort shighted, since when large function gets inlined into
     tiny wrapper, later we migh prevent inlining small functions into the large functions
     since in next iteration the size will be maximum of the tiny wrapper and small
     function.
     Fixed by looking for largest function in the inline stack and symetrically in
     the case of
  5) All of the recursive inlining tests was implemented before callgraph edge
     frequencies was added and not updated since then. We now compute the decision
     based on both.  It is still not doing as well as I would like, I will try
     to reconsider this later.
  6) Inlining functions called once was missing oppurtunities because at inliner
     time we keep in callgraph offline copies of extern inline functions and some
     virtual functions for future inlining.  Calls from these functions prevent
     inliner to notice that function is called once.  Fixed by removing unreachable
     calls before functions called once are processed.

Bootstrapped/regtested x86_64-linux, also tested on tramp3d to behave sanely
and on nightly tester (frescobaldi).
Will commit it later today.

Honza

	* lto-symtab.c (lto_cgraph_replace_node): When call statement is
	present, also set gimple_call_set_cannot_inline.
	* ipa-inline.c: Update toplevel comment.
	(MAX_TIME): Remove.
	(cgraph_clone_inlined_nodes): Fix linebreaks.
	(cgraph_check_inline_limits): Restructure to ...
	(caller_growth_limits): ... this one; be more tolerant
	on growth in nested inline chains; add explanatory comment;
	fix stack accounting thinko introduced by previous patch.
	(cgraph_default_inline_p): Remove.
	(report_inline_failed_reason): New function.
	(can_inline_edge_p): New function.
	(can_early_inline_edge_p): New function.
	(leaf_node_p): Move upwards in file.
	(want_early_inline_function_p): New function.
	(want_inline_small_function_p): New function.
	(want_inline_self_recursive_call_p): New function.
	(cgraph_edge_badness): Rename to ...
	(edge_badness) ... this one; fix linebreaks.
	(update_edge_key): Update call of edge_baddness; add
	detailed dump about queue updates.
	(update_caller_keys): Use can_inline_edge_p and
	want_inline_small_function_p.
	(cgraph_decide_recursive_inlining): Rename to...
	(recursive_inlining): Use can_inline_edge_p and
	want_inline_self_recursive_call_p; simplify and
	remove no longer valid FIXME.
	(cgraph_set_inline_failed): Remove.
	(add_new_edges_to_heap): Use can_inline_edge_p and
	want_inline_small_function_p.
	(cgraph_decide_inlining_of_small_functions): Rename to ...
	(inline_small_functions): ... this one; cleanup; use
	can/want predicates; cleanup debug ouput; work edges
	till fibheap is exhausted and do not stop once unit
	growth is reached; remove later loop processing remaining
	edges.
	(cgraph_flatten): Rename to ...
	(flatten_function): ... this one; use can_inline_edge_p
	and can_early_inline_edge_p predicates.
	(cgraph_decide_inlining): Rename to ...
	(ipa_inline): ... this one; remove unreachable nodes before
	inlining functions called once; simplify the pass.
	(cgraph_perform_always_inlining): Rename to ...
	(inline_always_inline_functions): ... this one; use
	DECL_DISREGARD_INLINE_LIMITS; use can_inline_edge_p
	predicate
	(cgraph_decide_inlining_incrementally): Rename to ...
	(early_inline_small_functions): ... this one; simplify
	using new predicates; cleanup; make dumps prettier.
	(cgraph_early_inlining): Rename to ...
	(early_inliner): newer inline regular functions into always-inlines;
	fix updating of call stmt summaries.
	(pass_early_inline): Update for new names.
	(inline_transform): Fix formating.
	(gate_cgraph_decide_inlining): Rename to ...
	(pass_ipa_inline): ... this one.
	* ipa-inline.h (inline_summary): Remove disregard_inline_limits.
	* ipa-inline-analysis.c (dump_inline_summary): Update.
	(compute_inline_parameters): Do not compute disregard_inline_limits.
	(estimate_growth): Fix handlig of non-trivial self recursion.
	(inline_read_summary): Do not read info->disregard_inline_limits.
	(inline_write_summary): Do not write info->disregard_inline_limits.

	* gcc.dg/winline-5.c: Update testcase.

Index: testsuite/gcc.dg/winline-5.c
===================================================================
*** testsuite/gcc.dg/winline-5.c	(revision 172597)
--- testsuite/gcc.dg/winline-5.c	(working copy)
*************** inline int q(void) /* { dg-warning "inli
*** 15,29 ****
  	big();
  	big();
  }
- inline int q1(void)
- {
- 	big();
- 	big();
- 	big();
- }
  int t (void)
  {
-  /* We allow one inlining over limit.  */
- 	q1();
  	return q ();		 /* { dg-warning "called from here" } */
  }
--- 15,21 ----
Index: ipa-inline.c
===================================================================
*** ipa-inline.c	(revision 172597)
--- ipa-inline.c	(working copy)
*************** along with GCC; see the file COPYING3.  
*** 21,93 ****
  
  /*  Inlining decision heuristics
  
!     We separate inlining decisions from the inliner itself and store it
!     inside callgraph as so called inline plan.  Refer to cgraph.c
!     documentation about particular representation of inline plans in the
!     callgraph.
! 
!     There are three major parts of this file:
! 
!     cgraph_mark_inline_edge implementation
! 
!       This function allows to mark given call inline and performs necessary
!       modifications of cgraph (production of the clones and updating overall
!       statistics)
  
      inlining heuristics limits
  
!       These functions allow to check that particular inlining is allowed
!       by the limits specified by user (allowed function growth, overall unit
!       growth and so on).
  
      inlining heuristics
  
!       This is implementation of IPA pass aiming to get as much of benefit
!       from inlining obeying the limits checked above.
  
!       The implementation of particular heuristics is separated from
!       the rest of code to make it easier to replace it with more complicated
!       implementation in the future.  The rest of inlining code acts as a
!       library aimed to modify the callgraph and verify that the parameters
!       on code size growth fits.
! 
!       To mark given call inline, use cgraph_mark_inline function, the
!       verification is performed by cgraph_default_inline_p and
!       cgraph_check_inline_limits.
! 
!       The heuristics implements simple knapsack style algorithm ordering
!       all functions by their "profitability" (estimated by code size growth)
!       and inlining them in priority order.
! 
!       cgraph_decide_inlining implements heuristics taking whole callgraph
!       into account, while cgraph_decide_inlining_incrementally considers
!       only one function at a time and is used by early inliner.
! 
!    The inliner itself is split into two passes:
! 
!    pass_early_inlining
! 
!      Simple local inlining pass inlining callees into current function.  This
!      pass makes no global whole compilation unit analysis and this when allowed
!      to do inlining expanding code size it might result in unbounded growth of
!      whole unit.
! 
!      The pass is run during conversion into SSA form.  Only functions already
!      converted into SSA form are inlined, so the conversion must happen in
!      topological order on the callgraph (that is maintained by pass manager).
!      The functions after inlining are early optimized so the early inliner sees
!      unoptimized function itself, but all considered callees are already
!      optimized allowing it to unfold abstraction penalty on C++ effectively and
!      cheaply.
! 
!    pass_ipa_inline
! 
!      This is the main pass implementing simple greedy algorithm to do inlining
!      of small functions that results in overall growth of compilation unit and
!      inlining of functions called once.  The pass compute just so called inline
!      plan (representation of inlining to be done in callgraph) and unlike early
!      inlining it is not performing the inlining itself.
!  */
  
  #include "config.h"
  #include "system.h"
--- 21,106 ----
  
  /*  Inlining decision heuristics
  
!     The implementation of inliner is organized as follows:
! 
!     Transformation of callgraph to represent inlining decisions.
! 
!       The inline decisions are stored in callgraph in "inline plan" and
!       all applied later.
! 
!       To mark given call inline, use cgraph_mark_inline function.
!       The function marks the edge inlinable and, if neccesary, produces
!       virtual clone in the callgraph representing the new copy of callee's
!       function body.
! 
!       The inline plan is applied on given function body by inline_transform. 
  
      inlining heuristics limits
  
!       can_inline_edge_p allow to check that particular inlining is allowed
!       by the limits specified by user (allowed function growth, growth and so
!       on).
! 
!       Functions are inlined when it is obvious the result is profitable (such
!       as functions called once or when inlining reduce code size).
!       In addition to that we perform inlining of small functions and recursive
!       inlining.
  
      inlining heuristics
  
!        The inliner itself is split into two passes:
! 
!        pass_early_inlining
  
! 	 Simple local inlining pass inlining callees into current function.
! 	 This pass makes no use of whole unit analysis and thus it can do only
! 	 very simple decisions based on local properties.
! 
! 	 The strength of the pass is that it is run in topological order
! 	 (reverse postorder) on the callgraph. Functions are converted into SSA
! 	 form just before this pass and optimized subsequently. As a result, the
! 	 callees of the function seen by the early inliner was already optimized
! 	 and results of early inlining adds a lot of optimization oppurtunities
! 	 for the local optimization.
! 
! 	 The pass handle the obvious inlining decisions within the copmilation
! 	 unit - inlining auto inline functions, inlining for size and
! 	 flattening.
! 
! 	 main strength of the pass is the ability to eliminate abstraction
! 	 penalty in C++ code (via combination of inlining and early
! 	 optimization) and thus improve quality of analysis done by real IPA
! 	 optimizers.
! 
! 	 Because of lack of whole unit knowledge, the pass can not really make
! 	 good code size/performance tradeoffs.  It however does very simple
! 	 speculative inlining allowing code size to grow by
! 	 EARLY_INLINING_INSNS when calee is leaf function.  In this case the
! 	 optimizations perfomed later are very likely to eliminate the cost.
! 
!        pass_ipa_inline
! 
! 	 This is the real inliner able to handle inlining with whole program
! 	 knowledge. It performs following steps:
! 
! 	 1) inlining of small functions.  This is implemented by greedy
! 	 algorithm ordering all inlinable cgraph edges by their badness and
! 	 inlining them in this order as long as inline limits allows doing so.
! 
! 	 This heuristics is not very good on inlining recursive calls. Recursive
! 	 calls can be inlined with results similar to loop unrolling. To do so,
! 	 special purpose recursive inliner is executed on function when
! 	 recursive edge is met as viable candidate.
! 
! 	 2) Unreachable functions are removed from callgraph.  Inlining leads
! 	 to devirtualization and other modification of callgraph so functions
! 	 may become unreachable during the process. Also functions declared as
! 	 extern inline or virtual functions are removed, since after inlining
! 	 we no longer need the offline bodies.
! 
! 	 3) Functions called once and not exported from the unit are inlined.
! 	 This should almost always lead to reduction of code size by eliminating
! 	 the need for offline copy of the function.  */
  
  #include "config.h"
  #include "system.h"
*************** along with GCC; see the file COPYING3.  
*** 114,122 ****
  #include "except.h"
  #include "ipa-inline.h"
  
- #define MAX_TIME 1000000000
- 
- 
  /* Statistics we collect about inlining algorithm.  */
  static int ncalls_inlined;
  static int nfunctions_inlined;
--- 127,132 ----
*************** cgraph_clone_inlined_nodes (struct cgrap
*** 163,173 ****
        /* We may eliminate the need for out-of-line copy to be output.
  	 In that case just go ahead and re-use it.  */
        if (!e->callee->callers->next_caller
! 	  /* Recursive inlining never wants the master clone to be overwritten.  */
  	  && update_original
! 	  /* FIXME: When address is taken of DECL_EXTERNAL function we still can remove its
! 	     offline copy, but we would need to keep unanalyzed node in the callgraph so
! 	     references can point to it.  */
  	  && !e->callee->address_taken
  	  && cgraph_can_remove_if_no_direct_calls_p (e->callee)
  	  /* Inlining might enable more devirtualizing, so we want to remove
--- 173,184 ----
        /* We may eliminate the need for out-of-line copy to be output.
  	 In that case just go ahead and re-use it.  */
        if (!e->callee->callers->next_caller
! 	  /* Recursive inlining never wants the master clone to
! 	     be overwritten.  */
  	  && update_original
! 	  /* FIXME: When address is taken of DECL_EXTERNAL function we still
! 	     can remove its offline copy, but we would need to keep unanalyzed
! 	     node in the callgraph so references can point to it.  */
  	  && !e->callee->address_taken
  	  && cgraph_can_remove_if_no_direct_calls_p (e->callee)
  	  /* Inlining might enable more devirtualizing, so we want to remove
*************** cgraph_clone_inlined_nodes (struct cgrap
*** 175,181 ****
  	     Lacking may edges in callgraph we just preserve them post
  	     inlining.  */
  	  && (!DECL_VIRTUAL_P (e->callee->decl)
! 	      || (!DECL_COMDAT (e->callee->decl) && !DECL_EXTERNAL (e->callee->decl)))
  	  /* Don't reuse if more than one function shares a comdat group.
  	     If the other function(s) are needed, we need to emit even
  	     this function out of line.  */
--- 186,193 ----
  	     Lacking may edges in callgraph we just preserve them post
  	     inlining.  */
  	  && (!DECL_VIRTUAL_P (e->callee->decl)
! 	      || (!DECL_COMDAT (e->callee->decl)
! 		  && !DECL_EXTERNAL (e->callee->decl)))
  	  /* Don't reuse if more than one function shares a comdat group.
  	     If the other function(s) are needed, we need to emit even
  	     this function out of line.  */
*************** cgraph_clone_inlined_nodes (struct cgrap
*** 214,220 ****
        + caller_info->estimated_self_stack_size;
    peak = callee_info->stack_frame_offset
        + callee_info->estimated_self_stack_size;
!   if (inline_summary (e->callee->global.inlined_to)->estimated_stack_size < peak)
      inline_summary (e->callee->global.inlined_to)->estimated_stack_size = peak;
    cgraph_propagate_frequency (e->callee);
  
--- 226,233 ----
        + caller_info->estimated_self_stack_size;
    peak = callee_info->stack_frame_offset
        + callee_info->estimated_self_stack_size;
!   if (inline_summary (e->callee->global.inlined_to)->estimated_stack_size
!       < peak)
      inline_summary (e->callee->global.inlined_to)->estimated_stack_size = peak;
    cgraph_propagate_frequency (e->callee);
  
*************** cgraph_mark_inline_edge (struct cgraph_e
*** 272,304 ****
      return false;
  }
  
! /* Return false when inlining edge E is not good idea
!    as it would cause too large growth of the callers function body
!    or stack frame size.  *REASON if non-NULL is updated if the
!    inlining is not a good idea.  */
  
  static bool
! cgraph_check_inline_limits (struct cgraph_edge *e,
! 			    cgraph_inline_failed_t *reason)
  {
    struct cgraph_node *to = e->caller;
    struct cgraph_node *what = e->callee;
    int newsize;
!   int limit;
!   HOST_WIDE_INT stack_size_limit, inlined_stack;
!   struct inline_summary *info, *what_info;
! 
!   if (to->global.inlined_to)
!     to = to->global.inlined_to;
  
-   info = inline_summary (to);
    what_info = inline_summary (what);
  
!   /* When inlining large function body called once into small function,
!      take the inlined function as base for limiting the growth.  */
!   if (info->self_size > what_info->self_size)
!     limit = info->self_size;
!   else
      limit = what_info->self_size;
  
    limit += limit * PARAM_VALUE (PARAM_LARGE_FUNCTION_GROWTH) / 100;
--- 285,336 ----
      return false;
  }
  
! /* Return false when inlining edge E would lead to violating
!    limits on function unit growth or stack usage growth.  
! 
!    The relative function body growth limit is present generally
!    to avoid problems with non-linear behaviour of the compiler.
!    To allow inlining huge functions into tiny wrapper, the limit
!    is always based on the bigger of the two functions considered.
! 
!    For stack growth limits we always base the growth in stack usage
!    of the callers.  We want to prevent applications from segfaulting
!    on stack overflow when functions with huge stack frames gets
!    inlined. */
  
  static bool
! caller_growth_limits (struct cgraph_edge *e)
  {
    struct cgraph_node *to = e->caller;
    struct cgraph_node *what = e->callee;
    int newsize;
!   int limit = 0;
!   HOST_WIDE_INT stack_size_limit = 0, inlined_stack;
!   struct inline_summary *info, *what_info, *outer_info = inline_summary (to);
! 
!   /* Look for function e->caller is inlined to.  While doing
!      so work out the largest function body on the way.  As
!      described above, we want to base our function growth
!      limits based on that.  Not on the self size of the
!      outer function, not on the self size of inline code
!      we immediately inline to.  This is the most relaxed
!      interpretation of the rule "do not grow large functions
!      too much in order to prevent compiler from exploding".  */
!   do
!     {
!       info = inline_summary (to);
!       if (limit < info->self_size)
! 	limit = info->self_size;
!       if (stack_size_limit < info->estimated_self_stack_size)
! 	stack_size_limit = info->estimated_self_stack_size;
!       if (to->global.inlined_to)
!         to = to->callers->caller;
!     }
!   while (to->global.inlined_to);
  
    what_info = inline_summary (what);
  
!   if (limit < what_info->self_size)
      limit = what_info->self_size;
  
    limit += limit * PARAM_VALUE (PARAM_LARGE_FUNCTION_GROWTH) / 100;
*************** cgraph_check_inline_limits (struct cgrap
*** 310,388 ****
        && newsize > PARAM_VALUE (PARAM_LARGE_FUNCTION_INSNS)
        && newsize > limit)
      {
!       if (reason)
!         *reason = CIF_LARGE_FUNCTION_GROWTH_LIMIT;
        return false;
      }
  
!   stack_size_limit = info->estimated_self_stack_size;
! 
!   stack_size_limit += stack_size_limit * PARAM_VALUE (PARAM_STACK_FRAME_GROWTH) / 100;
  
!   inlined_stack = (info->stack_frame_offset
! 		   + info->estimated_self_stack_size
  		   + what_info->estimated_stack_size);
!   if (inlined_stack  > stack_size_limit
        && inlined_stack > PARAM_VALUE (PARAM_LARGE_STACK_FRAME))
      {
!       if (reason)
!         *reason = CIF_LARGE_STACK_FRAME_GROWTH_LIMIT;
        return false;
      }
    return true;
  }
  
! /* Return true when function N is small enough to be inlined.  */
  
  static bool
! cgraph_default_inline_p (struct cgraph_node *n, cgraph_inline_failed_t *reason)
  {
!   tree decl = n->decl;
!   struct inline_summary *info = inline_summary (n);
  
!   if (info->disregard_inline_limits)
!     return true;
  
!   if (!flag_inline_small_functions && !DECL_DECLARED_INLINE_P (decl))
      {
!       if (reason)
! 	*reason = CIF_FUNCTION_NOT_INLINE_CANDIDATE;
        return false;
      }
!   if (!n->analyzed)
      {
!       if (reason)
! 	*reason = CIF_BODY_NOT_AVAILABLE;
        return false;
      }
!   if (cgraph_function_body_availability (n) <= AVAIL_OVERWRITABLE)
      {
!       if (reason)
! 	*reason = CIF_OVERWRITABLE;
        return false;
      }
  
  
!   if (DECL_DECLARED_INLINE_P (decl))
      {
!       if (info->size >= MAX_INLINE_INSNS_SINGLE)
  	{
! 	  if (reason)
! 	    *reason = CIF_MAX_INLINE_INSNS_SINGLE_LIMIT;
! 	  return false;
  	}
      }
    else
      {
!       if (info->size >= MAX_INLINE_INSNS_AUTO)
  	{
! 	  if (reason)
! 	    *reason = CIF_MAX_INLINE_INSNS_AUTO_LIMIT;
! 	  return false;
  	}
      }
  
!   return true;
  }
  
  /* A cost model driving the inlining heuristics in a way so the edges with
--- 342,711 ----
        && newsize > PARAM_VALUE (PARAM_LARGE_FUNCTION_INSNS)
        && newsize > limit)
      {
!       e->inline_failed = CIF_LARGE_FUNCTION_GROWTH_LIMIT;
        return false;
      }
  
!   /* FIXME: Stack size limit often prevents inlining in fortran programs
!      due to large i/o datastructures used by the fortran frontend.
!      We ought to ignore this limit when we know that the edge is executed
!      on every invocation of the caller (i.e. its call statement dominates
!      exit block).  We do not track this information, yet.  */
!   stack_size_limit += (stack_size_limit
! 		       * PARAM_VALUE (PARAM_STACK_FRAME_GROWTH) / 100);
  
!   inlined_stack = (outer_info->stack_frame_offset
! 		   + outer_info->estimated_self_stack_size
  		   + what_info->estimated_stack_size);
!   /* Check new stack consumption with stack consumption at the place
!      stack is used.  */
!   if (inlined_stack > stack_size_limit
!       /* If function already has large stack usage from sibbling
! 	 inline call, we can inline, too.
! 	 This bit overoptimistically assume that we are good at stack
! 	 packing.  */
!       && inlined_stack > info->estimated_stack_size
        && inlined_stack > PARAM_VALUE (PARAM_LARGE_STACK_FRAME))
      {
!       e->inline_failed = CIF_LARGE_STACK_FRAME_GROWTH_LIMIT;
        return false;
      }
    return true;
  }
  
! /* Dump info about why inlining has failed.  */
! 
! static void
! report_inline_failed_reason (struct cgraph_edge *e)
! {
!   if (dump_file)
!     {
!       fprintf (dump_file, "  not inlinable: %s/%i -> %s/%i, %s\n",
! 	       cgraph_node_name (e->caller), e->caller->uid,
! 	       cgraph_node_name (e->callee), e->callee->uid,
! 	       cgraph_inline_failed_string (e->inline_failed));
!     }
! }
! 
! /* Decide if we can inline the edge and possibly update
!    inline_failed reason.  
!    We check whether inlining is possible at all and whether
!    caller growth limits allow doing so.  
! 
!    if REPORT is true, output reason to dump file*/
  
  static bool
! can_inline_edge_p (struct cgraph_edge *e, bool report)
  {
!   bool inlinable = true;
  
!   gcc_assert (e->inline_failed);
  
!   if (!e->callee->analyzed)
!     {
!       e->inline_failed = CIF_BODY_NOT_AVAILABLE;
!       inlinable = false;
!     }
!   else if (!inline_summary (e->callee)->inlinable)
!     {
!       e->inline_failed = CIF_FUNCTION_NOT_INLINABLE;
!       inlinable = false;
!     }
!   else if (cgraph_function_body_availability (e->callee) <= AVAIL_OVERWRITABLE)
      {
!       e->inline_failed = CIF_OVERWRITABLE;
        return false;
      }
!   else if (e->call_stmt_cannot_inline_p)
!     {
!       e->inline_failed = CIF_MISMATCHED_ARGUMENTS;
!       inlinable = false;
!     }
!   else if (!tree_can_inline_p (e))
!     inlinable = false;
!   else if (!DECL_DISREGARD_INLINE_LIMITS (e->callee->decl)
!            && !caller_growth_limits (e))
!     inlinable = false;
! 
!   /* Be sure that the cannot_inline_p flag is up to date.  */
!   gcc_checking_assert (!e->call_stmt
! 		       || (gimple_call_cannot_inline_p (e->call_stmt)
! 		           == e->call_stmt_cannot_inline_p)
! 		       /* In -flto-partition=none mode we really keep things out of
! 			  sync because call_stmt_cannot_inline_p is set at cgraph
! 			  merging when function bodies are not there yet.  */
! 		       || (in_lto_p && !gimple_call_cannot_inline_p (e->call_stmt)));
!   if (!inlinable && report)
!     report_inline_failed_reason (e);
!   return inlinable;
! }
! 
! 
! /* Return true if the edge E is inlinable during early inlining.  */
! 
! static bool
! can_early_inline_edge_p (struct cgraph_edge *e)
! {
!   /* Early inliner might get called at WPA stage when IPA pass adds new
!      function.  In this case we can not really do any of early inlining
!      because function bodies are missing.  */
!   if (!gimple_has_body_p (e->callee->decl))
      {
!       e->inline_failed = CIF_BODY_NOT_AVAILABLE;
        return false;
      }
!   /* In early inliner some of callees may not be in SSA form yet
!      (i.e. the callgraph is cyclic and we did not process
!      the callee by early inliner, yet).  We don't have CIF code for this
!      case; later we will re-do the decision in the real inliner.  */
!   if (!gimple_in_ssa_p (DECL_STRUCT_FUNCTION (e->caller->decl))
!       || !gimple_in_ssa_p (DECL_STRUCT_FUNCTION (e->callee->decl)))
      {
!       if (dump_file)
! 	fprintf (dump_file, "  edge not inlinable: not in SSA form\n");
        return false;
      }
+   if (!can_inline_edge_p (e, true))
+     return false;
+   return true;
+ }
+ 
  
+ /* Return true when N is leaf function.  Accept cheap builtins
+    in leaf functions.  */
+ 
+ static bool
+ leaf_node_p (struct cgraph_node *n)
+ {
+   struct cgraph_edge *e;
+   for (e = n->callees; e; e = e->next_callee)
+     if (!is_inexpensive_builtin (e->callee->decl))
+       return false;
+   return true;
+ }
  
! 
! /* Return true if we are interested in inlining small function.  */
! 
! static bool
! want_early_inline_function_p (struct cgraph_edge *e)
! {
!   bool want_inline = true;
! 
!   if (DECL_DISREGARD_INLINE_LIMITS (e->callee->decl))
!     ;
!   else if (!DECL_DECLARED_INLINE_P (e->callee->decl)
! 	   && !flag_inline_small_functions)
      {
!       e->inline_failed = CIF_FUNCTION_NOT_INLINE_CANDIDATE;
!       report_inline_failed_reason (e);
!       want_inline = false;
!     }
!   else
!     {
!       int growth = estimate_edge_growth (e);
!       if (growth <= 0)
! 	;
!       else if (!cgraph_maybe_hot_edge_p (e)
! 	       && growth > 0)
  	{
! 	  if (dump_file)
! 	    fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
! 		     "call is cold and code would grow by %i\n",
! 		     cgraph_node_name (e->caller), e->caller->uid,
! 		     cgraph_node_name (e->callee), e->callee->uid,
! 		     growth);
! 	  want_inline = false;
  	}
+       else if (!leaf_node_p (e->callee)
+ 	       && growth > 0)
+ 	{
+ 	  if (dump_file)
+ 	    fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
+ 		     "callee is not leaf and code would grow by %i\n",
+ 		     cgraph_node_name (e->caller), e->caller->uid,
+ 		     cgraph_node_name (e->callee), e->callee->uid,
+ 		     growth);
+ 	  want_inline = false;
+ 	}
+       else if (growth > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
+ 	{
+ 	  if (dump_file)
+ 	    fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
+ 		     "growth %i exceeds --param early-inlining-insns\n",
+ 		     cgraph_node_name (e->caller), e->caller->uid,
+ 		     cgraph_node_name (e->callee), e->callee->uid,
+ 		     growth);
+ 	  want_inline = false;
+ 	}
+     }
+   return want_inline;
+ }
+ 
+ /* Return true if we are interested in inlining small function.
+    When REPORT is true, report reason to dump file.  */
+ 
+ static bool
+ want_inline_small_function_p (struct cgraph_edge *e, bool report)
+ {
+   bool want_inline = true;
+ 
+   if (DECL_DISREGARD_INLINE_LIMITS (e->callee->decl))
+     ;
+   else if (!DECL_DECLARED_INLINE_P (e->callee->decl)
+ 	   && !flag_inline_small_functions)
+     {
+       e->inline_failed = CIF_FUNCTION_NOT_INLINE_CANDIDATE;
+       want_inline = false;
      }
    else
      {
!       int growth = estimate_edge_growth (e);
! 
!       if (growth <= 0)
! 	;
!       else if (DECL_DECLARED_INLINE_P (e->callee->decl)
! 	       && growth >= MAX_INLINE_INSNS_SINGLE)
  	{
!           e->inline_failed = CIF_MAX_INLINE_INSNS_SINGLE_LIMIT;
! 	  want_inline = false;
! 	}
!       else if (!DECL_DECLARED_INLINE_P (e->callee->decl)
! 	       && !flag_inline_functions)
! 	{
!           e->inline_failed = CIF_NOT_DECLARED_INLINED;
! 	  want_inline = false;
! 	}
!       else if (!DECL_DECLARED_INLINE_P (e->callee->decl)
! 	       && growth >= MAX_INLINE_INSNS_AUTO)
! 	{
!           e->inline_failed = CIF_MAX_INLINE_INSNS_AUTO_LIMIT;
! 	  want_inline = false;
! 	}
!       else if (!cgraph_maybe_hot_edge_p (e)
! 	       && estimate_growth (e->callee) > 0)
! 	{
!           e->inline_failed = CIF_UNLIKELY_CALL;
! 	  want_inline = false;
  	}
      }
+   if (!want_inline && report)
+     report_inline_failed_reason (e);
+   return want_inline;
+ }
  
! /* EDGE is self recursive edge.
!    We hand two cases - when function A is inlining into itself
!    or when function A is being inlined into another inliner copy of function
!    A within function B.  
! 
!    In first case OUTER_NODE points to the toplevel copy of A, while
!    in the second case OUTER_NODE points to the outermost copy of A in B.
! 
!    In both cases we want to be extra selective since
!    inlining the call will just introduce new recursive calls to appear.  */
! static bool
! want_inline_self_recursive_call_p (struct cgraph_edge *edge,
! 				   struct cgraph_node *outer_node,
! 				   bool peeling,
! 				   int depth)
! {
!   char const *reason = NULL;
!   bool want_inline = true;
!   int caller_freq = CGRAPH_FREQ_BASE;
!   int max_depth = PARAM_VALUE (PARAM_MAX_INLINE_RECURSIVE_DEPTH_AUTO);
! 
!   if (DECL_DECLARED_INLINE_P (edge->callee->decl))
!     max_depth = PARAM_VALUE (PARAM_MAX_INLINE_RECURSIVE_DEPTH);
! 
!   if (!cgraph_maybe_hot_edge_p (edge))
!     {
!       reason = "recursive call is cold";
!       want_inline = false;
!     }
!   else if (max_count && !outer_node->count)
!     {
!       reason = "not executed in profile";
!       want_inline = false;
!     }
!   else if (depth > max_depth)
!     {
!       reason = "--param max-inline-recursive-depth exceeded.";
!       want_inline = false;
!     }
! 
!   if (outer_node->global.inlined_to)
!     caller_freq = outer_node->callers->frequency;
! 
!   if (!want_inline)
!     ;
!   /* Inlining of self recursive function into copy of itself within other function
!      is transformation similar to loop peeling.
! 
!      Peeling is profitable if we can inline enough copies to make probablility
!      of actual call to the self recursive function very small.  Be sure that
!      the probability of recursion is small.
! 
!      We ensure that the frequency of recusing is at most 1 - (1/max_depth).
!      This way the expected number of recusion is at most max_depth.  */
!   else if (peeling)
!     {
!       int max_prob = CGRAPH_FREQ_BASE - ((CGRAPH_FREQ_BASE + max_depth - 1)
! 					 / max_depth);
!       int i;
!       for (i = 1; i < depth; i++)
! 	max_prob = max_prob * max_prob / CGRAPH_FREQ_BASE;
!       if (max_count
! 	  && (edge->count * CGRAPH_FREQ_BASE / outer_node->count
! 	      >= max_prob))
! 	{
! 	  reason = "profile of recursive call is too large";
! 	  want_inline = false;
! 	}
!       if (!max_count
! 	  && (edge->frequency * CGRAPH_FREQ_BASE / caller_freq
! 	      >= max_prob))
! 	{
! 	  reason = "frequency of recursive call is too large";
! 	  want_inline = false;
! 	}
!     }
!   /* Recusive inlining, i.e. equivalent of unrolling, is profitable if recusion
!      depth is large.  We reduce function call overhead and increase chances that
!      things fit in hardware return predictor.
! 
!      Recursive inlining might however increase cost of stack frame setup
!      actually slowing down functions whose recursion tree is wide rather than
!      deep.
! 
!      Deciding reliably on when to do recursive inlining withthout profile feedback
!      is tricky.  For now we disable recursive inlining when probability of self
!      recursion is low. 
! 
!      Recursive inlining of self recursive call within loop also results in large loop
!      depths that generally optimize badly.  We may want to throttle down inlining
!      in those cases.  In particular this seems to happen in one of libstdc++ rb tree
!      methods.  */
!   else
!     {
!       if (max_count
! 	  && (edge->count * 100 / outer_node->count
! 	      <= PARAM_VALUE (PARAM_MIN_INLINE_RECURSIVE_PROBABILITY)))
! 	{
! 	  reason = "profile of recursive call is too small";
! 	  want_inline = false;
! 	}
!       else if (!max_count
! 	       && (edge->frequency * 100 / caller_freq
! 	           <= PARAM_VALUE (PARAM_MIN_INLINE_RECURSIVE_PROBABILITY)))
! 	{
! 	  reason = "frequency of recursive call is too small";
! 	  want_inline = false;
! 	}
!     }
!   if (!want_inline && dump_file)
!     fprintf (dump_file, "   not inlining recursively: %s\n", reason);
!   return want_inline;
  }
  
  /* A cost model driving the inlining heuristics in a way so the edges with
*************** cgraph_default_inline_p (struct cgraph_n
*** 392,404 ****
     of the function or function body size.  */
  
  static int
! cgraph_edge_badness (struct cgraph_edge *edge, bool dump)
  {
    gcov_type badness;
    int growth;
    struct inline_summary *callee_info = inline_summary (edge->callee);
  
!   if (callee_info->disregard_inline_limits)
      return INT_MIN;
  
    growth = estimate_edge_growth (edge);
--- 715,727 ----
     of the function or function body size.  */
  
  static int
! edge_badness (struct cgraph_edge *edge, bool dump)
  {
    gcov_type badness;
    int growth;
    struct inline_summary *callee_info = inline_summary (edge->callee);
  
!   if (DECL_DISREGARD_INLINE_LIMITS (edge->callee->decl))
      return INT_MIN;
  
    growth = estimate_edge_growth (edge);
*************** cgraph_edge_badness (struct cgraph_edge 
*** 488,494 ****
  	  fprintf (dump_file,
  		   "      %i: guessed profile. frequency %i, overall growth %i,"
  		   " benefit %i%%, divisor %i\n",
! 		   (int) badness, edge->frequency, growth_for_all, benefitperc, div);
  	}
      }
    /* When function local profile is not available or it does not give
--- 811,818 ----
  	  fprintf (dump_file,
  		   "      %i: guessed profile. frequency %i, overall growth %i,"
  		   " benefit %i%%, divisor %i\n",
! 		   (int) badness, edge->frequency, growth_for_all,
! 		   benefitperc, div);
  	}
      }
    /* When function local profile is not available or it does not give
*************** cgraph_edge_badness (struct cgraph_edge 
*** 523,532 ****
  }
  
  /* Recompute badness of EDGE and update its key in HEAP if needed.  */
! static void
  update_edge_key (fibheap_t heap, struct cgraph_edge *edge)
  {
!   int badness = cgraph_edge_badness (edge, false);
    if (edge->aux)
      {
        fibnode_t n = (fibnode_t) edge->aux;
--- 847,856 ----
  }
  
  /* Recompute badness of EDGE and update its key in HEAP if needed.  */
! static inline void
  update_edge_key (fibheap_t heap, struct cgraph_edge *edge)
  {
!   int badness = edge_badness (edge, false);
    if (edge->aux)
      {
        fibnode_t n = (fibnode_t) edge->aux;
*************** update_edge_key (fibheap_t heap, struct 
*** 539,549 ****
        if (badness < n->key)
  	{
  	  fibheap_replace_key (heap, n, badness);
  	  gcc_checking_assert (n->key == badness);
  	}
      }
    else
!     edge->aux = fibheap_insert (heap, badness, edge);
  }
  
  /* Recompute heap nodes for each of caller edge.  */
--- 863,892 ----
        if (badness < n->key)
  	{
  	  fibheap_replace_key (heap, n, badness);
+ 	  if (dump_file && (dump_flags & TDF_DETAILS))
+ 	    {
+ 	      fprintf (dump_file,
+ 		       "  decreasing badness %s/%i -> %s/%i, %i to %i\n",
+ 		       cgraph_node_name (edge->caller), edge->caller->uid,
+ 		       cgraph_node_name (edge->callee), edge->callee->uid,
+ 		       (int)n->key,
+ 		       badness);
+ 	    }
  	  gcc_checking_assert (n->key == badness);
  	}
      }
    else
!     {
!        if (dump_file && (dump_flags & TDF_DETAILS))
! 	 {
! 	   fprintf (dump_file,
! 		    "  enqueuing call %s/%i -> %s/%i, badness %i\n",
! 		    cgraph_node_name (edge->caller), edge->caller->uid,
! 		    cgraph_node_name (edge->callee), edge->callee->uid,
! 		    badness);
! 	 }
!       edge->aux = fibheap_insert (heap, badness, edge);
!     }
  }
  
  /* Recompute heap nodes for each of caller edge.  */
*************** update_caller_keys (fibheap_t heap, stru
*** 553,559 ****
  		    bitmap updated_nodes)
  {
    struct cgraph_edge *edge;
-   cgraph_inline_failed_t failed_reason;
  
    if (!inline_summary (node)->inlinable
        || cgraph_function_body_availability (node) <= AVAIL_OVERWRITABLE
--- 896,901 ----
*************** update_caller_keys (fibheap_t heap, stru
*** 569,591 ****
        break;
    if (!edge)
      return;
!   /* Prune out edges we won't inline into anymore.  */
!   if (!cgraph_default_inline_p (node, &failed_reason))
!     {
!       for (; edge; edge = edge->next_caller)
! 	if (edge->aux)
  	  {
  	    fibheap_delete_node (heap, (fibnode_t) edge->aux);
  	    edge->aux = NULL;
- 	    if (edge->inline_failed)
- 	      edge->inline_failed = failed_reason;
  	  }
!       return;
!     }
! 
!   for (; edge; edge = edge->next_caller)
!     if (edge->inline_failed)
!       update_edge_key (heap, edge);
  }
  
  /* Recompute heap nodes for each uninlined call.
--- 911,930 ----
        break;
    if (!edge)
      return;
! 
!   for (; edge; edge = edge->next_caller)
!     if (edge->inline_failed)
!       {
! 	if (can_inline_edge_p (edge, false)
! 	    && want_inline_small_function_p (edge, false))
!           update_edge_key (heap, edge);
! 	else if (edge->aux)
  	  {
+ 	    report_inline_failed_reason (edge);
  	    fibheap_delete_node (heap, (fibnode_t) edge->aux);
  	    edge->aux = NULL;
  	  }
!       }
  }
  
  /* Recompute heap nodes for each uninlined call.
*************** update_callee_keys (fibheap_t heap, stru
*** 613,624 ****
  	    && !bitmap_bit_p (updated_nodes, e->callee->uid))
  	  {
  	    inline_summary (node)->estimated_growth = INT_MIN;
! 	    /* If function becomes uninlinable, we need to remove it from the heap.  */
! 	    if (!cgraph_default_inline_p (e->callee, &e->inline_failed))
! 	      update_caller_keys (heap, e->callee, updated_nodes);
! 	    else
! 	    /* Otherwise update just edge E.  */
! 	      update_edge_key (heap, e);
  	  }
  	if (e->next_callee)
  	  e = e->next_callee;
--- 952,958 ----
  	    && !bitmap_bit_p (updated_nodes, e->callee->uid))
  	  {
  	    inline_summary (node)->estimated_growth = INT_MIN;
! 	    update_edge_key (heap, e);
  	  }
  	if (e->next_callee)
  	  e = e->next_callee;
*************** lookup_recursive_calls (struct cgraph_no
*** 702,717 ****
     is NULL.  */
  
  static bool
! cgraph_decide_recursive_inlining (struct cgraph_edge *edge,
! 				  VEC (cgraph_edge_p, heap) **new_edges)
  {
    int limit = PARAM_VALUE (PARAM_MAX_INLINE_INSNS_RECURSIVE_AUTO);
-   int max_depth = PARAM_VALUE (PARAM_MAX_INLINE_RECURSIVE_DEPTH_AUTO);
-   int probability = PARAM_VALUE (PARAM_MIN_INLINE_RECURSIVE_PROBABILITY);
    fibheap_t heap;
    struct cgraph_node *node;
    struct cgraph_edge *e;
!   struct cgraph_node *master_clone, *next;
    int depth = 0;
    int n = 0;
  
--- 1036,1049 ----
     is NULL.  */
  
  static bool
! recursive_inlining (struct cgraph_edge *edge,
! 		    VEC (cgraph_edge_p, heap) **new_edges)
  {
    int limit = PARAM_VALUE (PARAM_MAX_INLINE_INSNS_RECURSIVE_AUTO);
    fibheap_t heap;
    struct cgraph_node *node;
    struct cgraph_edge *e;
!   struct cgraph_node *master_clone = NULL, *next;
    int depth = 0;
    int n = 0;
  
*************** cgraph_decide_recursive_inlining (struct
*** 719,743 ****
    if (node->global.inlined_to)
      node = node->global.inlined_to;
  
-   /* It does not make sense to recursively inline always-inline functions
-      as we are going to sorry() on the remaining calls anyway.  */
-   if (inline_summary (node)->disregard_inline_limits
-       && lookup_attribute ("always_inline", DECL_ATTRIBUTES (node->decl)))
-     return false;
- 
-   if (optimize_function_for_size_p (DECL_STRUCT_FUNCTION (node->decl))
-       || (!flag_inline_functions && !DECL_DECLARED_INLINE_P (node->decl)))
-     return false;
- 
    if (DECL_DECLARED_INLINE_P (node->decl))
!     {
!       limit = PARAM_VALUE (PARAM_MAX_INLINE_INSNS_RECURSIVE);
!       max_depth = PARAM_VALUE (PARAM_MAX_INLINE_RECURSIVE_DEPTH);
!     }
  
    /* Make sure that function is small enough to be considered for inlining.  */
!   if (!max_depth
!       || estimate_size_after_inlining (node, edge)  >= limit)
      return false;
    heap = fibheap_new ();
    lookup_recursive_calls (node, node, heap);
--- 1051,1061 ----
    if (node->global.inlined_to)
      node = node->global.inlined_to;
  
    if (DECL_DECLARED_INLINE_P (node->decl))
!     limit = PARAM_VALUE (PARAM_MAX_INLINE_INSNS_RECURSIVE);
  
    /* Make sure that function is small enough to be considered for inlining.  */
!   if (estimate_size_after_inlining (node, edge)  >= limit)
      return false;
    heap = fibheap_new ();
    lookup_recursive_calls (node, node, heap);
*************** cgraph_decide_recursive_inlining (struct
*** 752,765 ****
  	     "  Performing recursive inlining on %s\n",
  	     cgraph_node_name (node));
  
-   /* We need original clone to copy around.  */
-   master_clone = cgraph_clone_node (node, node->decl,
- 				    node->count, CGRAPH_FREQ_BASE, 1,
-   				    false, NULL);
-   for (e = master_clone->callees; e; e = e->next_callee)
-     if (!e->inline_failed)
-       cgraph_clone_inlined_nodes (e, true, false);
- 
    /* Do the inlining and update list of recursive call during process.  */
    while (!fibheap_empty (heap))
      {
--- 1070,1075 ----
*************** cgraph_decide_recursive_inlining (struct
*** 770,804 ****
        if (estimate_size_after_inlining (node, curr) > limit)
  	break;
  
        depth = 1;
        for (cnode = curr->caller;
  	   cnode->global.inlined_to; cnode = cnode->callers->caller)
  	if (node->decl == curr->callee->decl)
  	  depth++;
-       if (depth > max_depth)
- 	{
-           if (dump_file)
- 	    fprintf (dump_file,
- 		     "   maximal depth reached\n");
- 	  continue;
- 	}
  
!       if (max_count)
! 	{
!           if (!cgraph_maybe_hot_edge_p (curr))
! 	    {
! 	      if (dump_file)
! 		fprintf (dump_file, "   Not inlining cold call\n");
! 	      continue;
! 	    }
!           if (curr->count * 100 / node->count < probability)
! 	    {
! 	      if (dump_file)
! 		fprintf (dump_file,
! 			 "   Probability of edge is too small\n");
! 	      continue;
! 	    }
! 	}
  
        if (dump_file)
  	{
--- 1080,1096 ----
        if (estimate_size_after_inlining (node, curr) > limit)
  	break;
  
+       if (!can_inline_edge_p (curr, true))
+ 	continue;
+ 
        depth = 1;
        for (cnode = curr->caller;
  	   cnode->global.inlined_to; cnode = cnode->callers->caller)
  	if (node->decl == curr->callee->decl)
  	  depth++;
  
!       if (!want_inline_self_recursive_call_p (curr, node, false, depth))
! 	continue;
  
        if (dump_file)
  	{
*************** cgraph_decide_recursive_inlining (struct
*** 811,828 ****
  	    }
  	  fprintf (dump_file, "\n");
  	}
        cgraph_redirect_edge_callee (curr, master_clone);
        cgraph_mark_inline_edge (curr, false, new_edges);
        lookup_recursive_calls (node, curr->callee, heap);
        n++;
      }
    if (!fibheap_empty (heap) && dump_file)
      fprintf (dump_file, "    Recursive inlining growth limit met.\n");
- 
    fibheap_delete (heap);
    if (dump_file)
      fprintf (dump_file,
! 	     "\n   Inlined %i times, body grown from size %i to %i, time %i to %i\n", n,
  	     inline_summary (master_clone)->size, inline_summary (node)->size,
  	     inline_summary (master_clone)->time, inline_summary (node)->time);
  
--- 1103,1136 ----
  	    }
  	  fprintf (dump_file, "\n");
  	}
+       if (!master_clone)
+ 	{
+ 	  /* We need original clone to copy around.  */
+ 	  master_clone = cgraph_clone_node (node, node->decl,
+ 					    node->count, CGRAPH_FREQ_BASE, 1,
+ 					    false, NULL);
+ 	  for (e = master_clone->callees; e; e = e->next_callee)
+ 	    if (!e->inline_failed)
+ 	      cgraph_clone_inlined_nodes (e, true, false);
+ 	}
+ 
        cgraph_redirect_edge_callee (curr, master_clone);
        cgraph_mark_inline_edge (curr, false, new_edges);
        lookup_recursive_calls (node, curr->callee, heap);
        n++;
      }
+ 
    if (!fibheap_empty (heap) && dump_file)
      fprintf (dump_file, "    Recursive inlining growth limit met.\n");
    fibheap_delete (heap);
+ 
+   if (!master_clone)
+     return false;
+ 
    if (dump_file)
      fprintf (dump_file,
! 	     "\n   Inlined %i times, "
! 	     "body grown from size %i to %i, time %i to %i\n", n,
  	     inline_summary (master_clone)->size, inline_summary (node)->size,
  	     inline_summary (master_clone)->time, inline_summary (node)->time);
  
*************** cgraph_decide_recursive_inlining (struct
*** 837,863 ****
  	cgraph_remove_node (node);
      }
    cgraph_remove_node (master_clone);
!   /* FIXME: Recursive inlining actually reduces number of calls of the
!      function.  At this place we should probably walk the function and
!      inline clones and compensate the counts accordingly.  This probably
!      doesn't matter much in practice.  */
!   return n > 0;
! }
! 
! /* Set inline_failed for all callers of given function to REASON.  */
! 
! static void
! cgraph_set_inline_failed (struct cgraph_node *node,
! 			  cgraph_inline_failed_t reason)
! {
!   struct cgraph_edge *e;
! 
!   if (dump_file)
!     fprintf (dump_file, "Inlining failed: %s\n",
! 	     cgraph_inline_failed_string (reason));
!   for (e = node->callers; e; e = e->next_caller)
!     if (e->inline_failed)
!       e->inline_failed = reason;
  }
  
  /* Given whole compilation unit estimate of INSNS, compute how large we can
--- 1145,1151 ----
  	cgraph_remove_node (node);
      }
    cgraph_remove_node (master_clone);
!   return true;
  }
  
  /* Given whole compilation unit estimate of INSNS, compute how large we can
*************** add_new_edges_to_heap (fibheap_t heap, V
*** 884,891 ****
        gcc_assert (!edge->aux);
        if (inline_summary (edge->callee)->inlinable
  	  && edge->inline_failed
! 	  && cgraph_default_inline_p (edge->callee, &edge->inline_failed))
!         edge->aux = fibheap_insert (heap, cgraph_edge_badness (edge, false), edge);
      }
  }
  
--- 1172,1180 ----
        gcc_assert (!edge->aux);
        if (inline_summary (edge->callee)->inlinable
  	  && edge->inline_failed
! 	  && can_inline_edge_p (edge, true)
! 	  && want_inline_small_function_p (edge, true))
!         edge->aux = fibheap_insert (heap, edge_badness (edge, false), edge);
      }
  }
  
*************** add_new_edges_to_heap (fibheap_t heap, V
*** 898,908 ****
     to be passed to cgraph_inlined_into and cgraph_inlined_callees.  */
  
  static void
! cgraph_decide_inlining_of_small_functions (void)
  {
    struct cgraph_node *node;
    struct cgraph_edge *edge;
-   cgraph_inline_failed_t failed_reason;
    fibheap_t heap = fibheap_new ();
    bitmap updated_nodes = BITMAP_ALLOC (NULL);
    int min_size, max_size;
--- 1187,1196 ----
     to be passed to cgraph_inlined_into and cgraph_inlined_callees.  */
  
  static void
! inline_small_functions (void)
  {
    struct cgraph_node *node;
    struct cgraph_edge *edge;
    fibheap_t heap = fibheap_new ();
    bitmap updated_nodes = BITMAP_ALLOC (NULL);
    int min_size, max_size;
*************** cgraph_decide_inlining_of_small_function
*** 921,951 ****
        {
  	struct inline_summary *info = inline_summary (node);
  
- 	if (!info->inlinable || !node->callers)
- 	  {
- 	    struct cgraph_edge *e;
- 	    for (e = node->callers; e; e = e->next_caller)
- 	      {
- 		gcc_assert (e->inline_failed);
- 		e->inline_failed = CIF_FUNCTION_NOT_INLINABLE;
- 	      }
- 	    continue;
- 	  }
  	if (dump_file)
! 	  fprintf (dump_file, "Considering inline candidate %s.\n", cgraph_node_name (node));
  
  	info->estimated_growth = INT_MIN;
- 	if (!cgraph_default_inline_p (node, &failed_reason))
- 	  {
- 	    cgraph_set_inline_failed (node, failed_reason);
- 	    continue;
- 	  }
  
  	for (edge = node->callers; edge; edge = edge->next_caller)
! 	  if (edge->inline_failed)
  	    {
  	      gcc_assert (!edge->aux);
! 	      edge->aux = fibheap_insert (heap, cgraph_edge_badness (edge, false), edge);
  	    }
        }
  
--- 1209,1228 ----
        {
  	struct inline_summary *info = inline_summary (node);
  
  	if (dump_file)
! 	  fprintf (dump_file, "Enqueueing calls of %s/%i.\n",
! 		   cgraph_node_name (node), node->uid);
  
  	info->estimated_growth = INT_MIN;
  
  	for (edge = node->callers; edge; edge = edge->next_caller)
! 	  if (edge->inline_failed
! 	      && can_inline_edge_p (edge, true)
! 	      && want_inline_small_function_p (edge, true)
! 	      && edge->inline_failed)
  	    {
  	      gcc_assert (!edge->aux);
! 	      update_edge_key (heap, edge);
  	    }
        }
  
*************** cgraph_decide_inlining_of_small_function
*** 960,966 ****
        int badness = fibheap_min_key (heap);
        int current_badness;
        int growth;
-       cgraph_inline_failed_t not_good = CIF_OK;
  
        edge = (struct cgraph_edge *) fibheap_extract_min (heap);
        gcc_assert (edge->aux);
--- 1237,1242 ----
*************** cgraph_decide_inlining_of_small_function
*** 971,983 ****
        /* When updating the edge costs, we only decrease badness in the keys.
  	 When the badness increase, we keep the heap as it is and re-insert
  	 key now.  */
!       current_badness = cgraph_edge_badness (edge, false);
        gcc_assert (current_badness >= badness);
        if (current_badness != badness)
  	{
  	  edge->aux = fibheap_insert (heap, current_badness, edge);
  	  continue;
  	}
        
        callee = edge->callee;
        growth = estimate_edge_growth (edge);
--- 1247,1262 ----
        /* When updating the edge costs, we only decrease badness in the keys.
  	 When the badness increase, we keep the heap as it is and re-insert
  	 key now.  */
!       current_badness = edge_badness (edge, false);
        gcc_assert (current_badness >= badness);
        if (current_badness != badness)
  	{
  	  edge->aux = fibheap_insert (heap, current_badness, edge);
  	  continue;
  	}
+ 
+       if (!can_inline_edge_p (edge, true))
+ 	continue;
        
        callee = edge->callee;
        growth = estimate_edge_growth (edge);
*************** cgraph_decide_inlining_of_small_function
*** 989,1069 ****
  		   inline_summary (edge->callee)->size);
  	  fprintf (dump_file,
  		   " to be inlined into %s in %s:%i\n"
! 		   " Estimated growth after inlined into all callees is %+i insns.\n"
  		   " Estimated badness is %i, frequency %.2f.\n",
  		   cgraph_node_name (edge->caller),
  		   flag_wpa ? "unknown"
  		   : gimple_filename ((const_gimple) edge->call_stmt),
! 		   flag_wpa ? -1 : gimple_lineno ((const_gimple) edge->call_stmt),
  		   estimate_growth (edge->callee),
  		   badness,
  		   edge->frequency / (double)CGRAPH_FREQ_BASE);
  	  if (edge->count)
! 	    fprintf (dump_file," Called "HOST_WIDEST_INT_PRINT_DEC"x\n", edge->count);
  	  if (dump_flags & TDF_DETAILS)
! 	    cgraph_edge_badness (edge, true);
  	}
  
!       /* When not having profile info ready we don't weight by any way the
!          position of call in procedure itself.  This means if call of
! 	 function A from function B seems profitable to inline, the recursive
! 	 call of function A in inline copy of A in B will look profitable too
! 	 and we end up inlining until reaching maximal function growth.  This
! 	 is not good idea so prohibit the recursive inlining.
! 
! 	 ??? When the frequencies are taken into account we might not need this
! 	 restriction.
! 
! 	 We need to be careful here, in some testcases, e.g. directives.c in
! 	 libcpp, we can estimate self recursive function to have negative growth
! 	 for inlining completely.
! 	 */
!       if (!edge->count)
  	{
! 	  where = edge->caller;
! 	  while (where->global.inlined_to)
! 	    {
! 	      if (where->decl == edge->callee->decl)
! 		break;
! 	      where = where->callers->caller;
! 	    }
! 	  if (where->global.inlined_to)
! 	    {
! 	      edge->inline_failed
! 		= (inline_summary (edge->callee)->disregard_inline_limits
! 		   ? CIF_RECURSIVE_INLINING : CIF_UNSPECIFIED);
! 	      if (dump_file)
! 		fprintf (dump_file, " inline_failed:Recursive inlining performed only for function itself.\n");
! 	      continue;
! 	    }
! 	}
! 
!       if (inline_summary (edge->callee)->disregard_inline_limits)
! 	;
!       else if (!cgraph_maybe_hot_edge_p (edge))
!  	not_good = CIF_UNLIKELY_CALL;
!       else if (!flag_inline_functions
! 	  && !DECL_DECLARED_INLINE_P (edge->callee->decl))
!  	not_good = CIF_NOT_DECLARED_INLINED;
!       else if (optimize_function_for_size_p (DECL_STRUCT_FUNCTION(edge->caller->decl)))
!  	not_good = CIF_OPTIMIZING_FOR_SIZE;
!       if (not_good && growth > 0 && estimate_growth (edge->callee) > 0)
! 	{
! 	  edge->inline_failed = not_good;
! 	  if (dump_file)
! 	    fprintf (dump_file, " inline_failed:%s.\n",
! 		     cgraph_inline_failed_string (edge->inline_failed));
  	  continue;
  	}
!       if (!cgraph_default_inline_p (edge->callee, &edge->inline_failed))
! 	{
! 	  if (dump_file)
! 	    fprintf (dump_file, " inline_failed:%s.\n",
! 		     cgraph_inline_failed_string (edge->inline_failed));
! 	  continue;
! 	}
!       if (!tree_can_inline_p (edge)
! 	  || edge->call_stmt_cannot_inline_p)
  	{
  	  if (dump_file)
  	    fprintf (dump_file, " inline_failed:%s.\n",
--- 1268,1299 ----
  		   inline_summary (edge->callee)->size);
  	  fprintf (dump_file,
  		   " to be inlined into %s in %s:%i\n"
! 		   " Estimated growth after inlined into all is %+i insns.\n"
  		   " Estimated badness is %i, frequency %.2f.\n",
  		   cgraph_node_name (edge->caller),
  		   flag_wpa ? "unknown"
  		   : gimple_filename ((const_gimple) edge->call_stmt),
! 		   flag_wpa ? -1
! 		   : gimple_lineno ((const_gimple) edge->call_stmt),
  		   estimate_growth (edge->callee),
  		   badness,
  		   edge->frequency / (double)CGRAPH_FREQ_BASE);
  	  if (edge->count)
! 	    fprintf (dump_file," Called "HOST_WIDEST_INT_PRINT_DEC"x\n",
! 		     edge->count);
  	  if (dump_flags & TDF_DETAILS)
! 	    edge_badness (edge, true);
  	}
  
!       if (overall_size + growth > max_size
! 	  && !DECL_DISREGARD_INLINE_LIMITS (edge->callee->decl))
  	{
! 	  edge->inline_failed = CIF_INLINE_UNIT_GROWTH_LIMIT;
! 	  report_inline_failed_reason (edge);
  	  continue;
  	}
! 
!       if (!want_inline_small_function_p (edge, true))
  	{
  	  if (dump_file)
  	    fprintf (dump_file, " inline_failed:%s.\n",
*************** cgraph_decide_inlining_of_small_function
*** 1075,1083 ****
  	  where = edge->caller;
  	  if (where->global.inlined_to)
  	    where = where->global.inlined_to;
! 	  if (!cgraph_decide_recursive_inlining (edge,
! 						 flag_indirect_inlining
! 						 ? &new_indirect_edges : NULL))
  	    {
  	      edge->inline_failed = CIF_RECURSIVE_INLINING;
  	      continue;
--- 1305,1313 ----
  	  where = edge->caller;
  	  if (where->global.inlined_to)
  	    where = where->global.inlined_to;
! 	  if (!recursive_inlining (edge,
! 				   flag_indirect_inlining
! 				   ? &new_indirect_edges : NULL))
  	    {
  	      edge->inline_failed = CIF_RECURSIVE_INLINING;
  	      continue;
*************** cgraph_decide_inlining_of_small_function
*** 1089,1102 ****
        else
  	{
  	  struct cgraph_node *callee;
! 	  if (!cgraph_check_inline_limits (edge, &edge->inline_failed))
  	    {
! 	      if (dump_file)
! 		fprintf (dump_file, " Not inlining into %s:%s.\n",
! 			 cgraph_node_name (edge->caller),
! 			 cgraph_inline_failed_string (edge->inline_failed));
  	      continue;
  	    }
  	  callee = edge->callee;
  	  gcc_checking_assert (!callee->global.inlined_to);
  	  cgraph_mark_inline_edge (edge, true, &new_indirect_edges);
--- 1319,1351 ----
        else
  	{
  	  struct cgraph_node *callee;
! 	  struct cgraph_node *outer_node = NULL;
! 	  int depth = 0;
! 
! 	  /* Consider the case where self recursive function A is inlined into B.
! 	     This is desired optimization in some cases, since it leads to effect
! 	     similar of loop peeling and we might completely optimize out the
! 	     recursive call.  However we must be extra selective.  */
! 
! 	  where = edge->caller;
! 	  while (where->global.inlined_to)
  	    {
! 	      if (where->decl == edge->callee->decl)
! 		outer_node = where, depth++;
! 	      where = where->callers->caller;
! 	    }
! 	  if (outer_node
! 	      && !want_inline_self_recursive_call_p (edge, outer_node,
! 						     true, depth))
! 	    {
! 	      edge->inline_failed
! 		= (DECL_DISREGARD_INLINE_LIMITS (edge->callee->decl)
! 		   ? CIF_RECURSIVE_INLINING : CIF_UNSPECIFIED);
  	      continue;
  	    }
+ 	  else if (depth && dump_file)
+ 	    fprintf (dump_file, " Peeling recursion with depth %i\n", depth);
+ 
  	  callee = edge->callee;
  	  gcc_checking_assert (!callee->global.inlined_to);
  	  cgraph_mark_inline_edge (edge, true, &new_indirect_edges);
*************** cgraph_decide_inlining_of_small_function
*** 1148,1190 ****
  	    fprintf (dump_file, "New minimal size reached: %i\n", min_size);
  	}
      }
-   while (!fibheap_empty (heap))
-     {
-       int badness = fibheap_min_key (heap);
- 
-       edge = (struct cgraph_edge *) fibheap_extract_min (heap);
-       gcc_assert (edge->aux);
-       edge->aux = NULL;
-       if (!edge->inline_failed)
- 	continue;
- #ifdef ENABLE_CHECKING
-       gcc_assert (cgraph_edge_badness (edge, false) >= badness);
- #endif
-       if (dump_file)
- 	{
- 	  fprintf (dump_file,
- 		   "\nSkipping %s with %i size\n",
- 		   cgraph_node_name (edge->callee),
- 		   inline_summary (edge->callee)->size);
- 	  fprintf (dump_file,
- 		   " called by %s in %s:%i\n"
- 		   " Estimated growth after inlined into all callees is %+i insns.\n"
- 		   " Estimated badness is %i, frequency %.2f.\n",
- 		   cgraph_node_name (edge->caller),
- 		   flag_wpa ? "unknown"
- 		   : gimple_filename ((const_gimple) edge->call_stmt),
- 		   flag_wpa ? -1 : gimple_lineno ((const_gimple) edge->call_stmt),
- 		   estimate_growth (edge->callee),
- 		   badness,
- 		   edge->frequency / (double)CGRAPH_FREQ_BASE);
- 	  if (edge->count)
- 	    fprintf (dump_file," Called "HOST_WIDEST_INT_PRINT_DEC"x\n", edge->count);
- 	  if (dump_flags & TDF_DETAILS)
- 	    cgraph_edge_badness (edge, true);
- 	}
-       if (!inline_summary (edge->callee)->disregard_inline_limits && edge->inline_failed)
- 	edge->inline_failed = CIF_INLINE_UNIT_GROWTH_LIMIT;
-     }
  
    if (new_indirect_edges)
      VEC_free (cgraph_edge_p, heap, new_indirect_edges);
--- 1397,1402 ----
*************** cgraph_decide_inlining_of_small_function
*** 1195,1201 ****
  /* Flatten NODE from the IPA inliner.  */
  
  static void
! cgraph_flatten (struct cgraph_node *node)
  {
    struct cgraph_edge *e;
  
--- 1407,1413 ----
  /* Flatten NODE from the IPA inliner.  */
  
  static void
! flatten_function (struct cgraph_node *node)
  {
    struct cgraph_edge *e;
  
*************** cgraph_flatten (struct cgraph_node *node
*** 1208,1229 ****
      {
        struct cgraph_node *orig_callee;
  
-       if (e->call_stmt_cannot_inline_p)
- 	{
- 	  if (dump_file)
- 	    fprintf (dump_file, "Not inlining: %s",
- 		     cgraph_inline_failed_string (e->inline_failed));
- 	  continue;
- 	}
- 
-       if (!e->callee->analyzed)
- 	{
- 	  if (dump_file)
- 	    fprintf (dump_file,
- 		     "Not inlining: Function body not available.\n");
- 	  continue;
- 	}
- 
        /* We've hit cycle?  It is time to give up.  */
        if (e->callee->aux)
  	{
--- 1420,1425 ----
*************** cgraph_flatten (struct cgraph_node *node
*** 1240,1249 ****
  	 it in order to fully flatten the leaves.  */
        if (!e->inline_failed)
  	{
! 	  cgraph_flatten (e->callee);
  	  continue;
  	}
  
        if (cgraph_edge_recursive_p (e))
  	{
  	  if (dump_file)
--- 1436,1453 ----
  	 it in order to fully flatten the leaves.  */
        if (!e->inline_failed)
  	{
! 	  flatten_function (e->callee);
  	  continue;
  	}
  
+       /* Flatten attribute needs to be processed during late inlining. For
+ 	 extra code quality we however do flattening during early optimization,
+ 	 too.  */
+       if (cgraph_state != CGRAPH_STATE_IPA_SSA
+ 	  ? !can_inline_edge_p (e, true)
+ 	  : !can_early_inline_edge_p (e))
+ 	continue;
+ 
        if (cgraph_edge_recursive_p (e))
  	{
  	  if (dump_file)
*************** cgraph_flatten (struct cgraph_node *node
*** 1251,1264 ****
  	  continue;
  	}
  
-       if (!tree_can_inline_p (e))
- 	{
- 	  if (dump_file)
- 	    fprintf (dump_file, "Not inlining: %s",
- 		     cgraph_inline_failed_string (e->inline_failed));
- 	  continue;
- 	}
- 
        if (gimple_in_ssa_p (DECL_STRUCT_FUNCTION (node->decl))
  	  != gimple_in_ssa_p (DECL_STRUCT_FUNCTION (e->callee->decl)))
  	{
--- 1455,1460 ----
*************** cgraph_flatten (struct cgraph_node *node
*** 1277,1283 ****
        cgraph_mark_inline_edge (e, true, NULL);
        if (e->callee != orig_callee)
  	orig_callee->aux = (void *) node;
!       cgraph_flatten (e->callee);
        if (e->callee != orig_callee)
  	orig_callee->aux = NULL;
      }
--- 1473,1479 ----
        cgraph_mark_inline_edge (e, true, NULL);
        if (e->callee != orig_callee)
  	orig_callee->aux = (void *) node;
!       flatten_function (e->callee);
        if (e->callee != orig_callee)
  	orig_callee->aux = NULL;
      }
*************** cgraph_flatten (struct cgraph_node *node
*** 1289,1295 ****
     expenses on updating data structures.  */
  
  static unsigned int
! cgraph_decide_inlining (void)
  {
    struct cgraph_node *node;
    int nnodes;
--- 1485,1491 ----
     expenses on updating data structures.  */
  
  static unsigned int
! ipa_inline (void)
  {
    struct cgraph_node *node;
    int nnodes;
*************** cgraph_decide_inlining (void)
*** 1366,1406 ****
  	  if (dump_file)
  	    fprintf (dump_file,
  		     "Flattening %s\n", cgraph_node_name (node));
! 	  cgraph_flatten (node);
  	}
      }
  
!   cgraph_decide_inlining_of_small_functions ();
  
    if (flag_inline_functions_called_once)
      {
        if (dump_file)
  	fprintf (dump_file, "\nDeciding on functions called once:\n");
  
        /* And finally decide what functions are called once.  */
!       for (i = nnodes - 1; i >= 0; i--)
  	{
- 	  node = order[i];
- 
  	  if (node->callers
  	      && !node->callers->next_caller
  	      && !node->global.inlined_to
- 	      && cgraph_will_be_removed_from_program_if_no_direct_calls (node)
- 	      && inline_summary (node)->inlinable
- 	      && cgraph_function_body_availability (node) >= AVAIL_AVAILABLE
  	      && node->callers->inline_failed
  	      && node->callers->caller != node
  	      && node->callers->caller->global.inlined_to != node
! 	      && !node->callers->call_stmt_cannot_inline_p
! 	      && tree_can_inline_p (node->callers)
! 	      && !DECL_EXTERNAL (node->decl))
  	    {
! 	      cgraph_inline_failed_t reason;
  	      old_size = overall_size;
  	      if (dump_file)
  		{
  		  fprintf (dump_file,
! 			   "\nConsidering %s size %i.\n",
  			   cgraph_node_name (node), inline_summary (node)->size);
  		  fprintf (dump_file,
  			   " Called once from %s %i insns.\n",
--- 1562,1605 ----
  	  if (dump_file)
  	    fprintf (dump_file,
  		     "Flattening %s\n", cgraph_node_name (node));
! 	  flatten_function (node);
  	}
      }
  
!   inline_small_functions ();
!   cgraph_remove_unreachable_nodes (true, dump_file);
!   free (order);
  
+   /* We already perform some inlining of functions called once during
+      inlining small functions above.  After unreachable nodes are removed,
+      we still might do a quick check that nothing new is found.  */
    if (flag_inline_functions_called_once)
      {
        if (dump_file)
  	fprintf (dump_file, "\nDeciding on functions called once:\n");
  
        /* And finally decide what functions are called once.  */
!       for (node = cgraph_nodes; node; node = node->next)
  	{
  	  if (node->callers
  	      && !node->callers->next_caller
  	      && !node->global.inlined_to
  	      && node->callers->inline_failed
  	      && node->callers->caller != node
  	      && node->callers->caller->global.inlined_to != node
! 	      && cgraph_will_be_removed_from_program_if_no_direct_calls (node)
! 	      && inline_summary (node)->inlinable
! 	      && cgraph_function_body_availability (node) >= AVAIL_AVAILABLE
! 	      && !DECL_EXTERNAL (node->decl)
! 	      && can_inline_edge_p (node->callers, true))
  	    {
! 	      struct cgraph_node *caller = node->callers->caller;
! 
  	      old_size = overall_size;
  	      if (dump_file)
  		{
  		  fprintf (dump_file,
! 			   "\nInlining %s size %i.\n",
  			   cgraph_node_name (node), inline_summary (node)->size);
  		  fprintf (dump_file,
  			   " Called once from %s %i insns.\n",
*************** cgraph_decide_inlining (void)
*** 1408,1432 ****
  			   inline_summary (node->callers->caller)->size);
  		}
  
! 	      if (cgraph_check_inline_limits (node->callers, &reason))
! 		{
! 		  struct cgraph_node *caller = node->callers->caller;
! 		  cgraph_mark_inline_edge (node->callers, true, NULL);
! 		  if (dump_file)
! 		    fprintf (dump_file,
! 			     " Inlined into %s which now has %i size"
! 			     " for a net change of %+i size.\n",
! 			     cgraph_node_name (caller),
! 			     inline_summary (caller)->size,
! 			     overall_size - old_size);
! 		}
! 	      else
! 		{
! 		  if (dump_file)
! 		    fprintf (dump_file,
! 			     " Not inlining: %s.\n",
!                              cgraph_inline_failed_string (reason));
! 		}
  	    }
  	}
      }
--- 1607,1620 ----
  			   inline_summary (node->callers->caller)->size);
  		}
  
! 	      cgraph_mark_inline_edge (node->callers, true, NULL);
! 	      if (dump_file)
! 		fprintf (dump_file,
! 			 " Inlined into %s which now has %i size"
! 			 " for a net change of %+i size.\n",
! 			 cgraph_node_name (caller),
! 			 inline_summary (caller)->size,
! 			 overall_size - old_size);
  	    }
  	}
      }
*************** cgraph_decide_inlining (void)
*** 1441,1532 ****
  	     "size %i turned to %i size.\n\n",
  	     ncalls_inlined, nfunctions_inlined, initial_size,
  	     overall_size);
-   free (order);
    /* In WPA we use inline summaries for partitioning process.  */
    if (!flag_wpa)
      inline_free_summary ();
    return 0;
  }
  
- /* Return true when N is leaf function.  Accept cheap builtins
-    in leaf functions.  */
- 
- static bool
- leaf_node_p (struct cgraph_node *n)
- {
-   struct cgraph_edge *e;
-   for (e = n->callees; e; e = e->next_callee)
-     if (!is_inexpensive_builtin (e->callee->decl))
-       return false;
-   return true;
- }
- 
- /* Return true if the edge E is inlinable during early inlining.  */
- 
- static bool
- cgraph_edge_early_inlinable_p (struct cgraph_edge *e, FILE *file)
- {
-   if (!inline_summary (e->callee)->inlinable)
-     {
-       if (file)
- 	fprintf (file, "Not inlining: Function not inlinable.\n");
-       return false;
-     }
-   if (!e->callee->analyzed)
-     {
-       if (file)
- 	fprintf (file, "Not inlining: Function body not available.\n");
-       return false;
-     }
-   if (!tree_can_inline_p (e)
-       || e->call_stmt_cannot_inline_p)
-     {
-       if (file)
- 	fprintf (file, "Not inlining: %s.\n",
- 		 cgraph_inline_failed_string (e->inline_failed));
-       return false;
-     }
-   if (!gimple_in_ssa_p (DECL_STRUCT_FUNCTION (e->caller->decl))
-       || !gimple_in_ssa_p (DECL_STRUCT_FUNCTION (e->callee->decl)))
-     {
-       if (file)
- 	fprintf (file, "Not inlining: not in SSA form.\n");
-       return false;
-     }
-   return true;
- }
- 
  /* Inline always-inline function calls in NODE.  */
  
  static bool
! cgraph_perform_always_inlining (struct cgraph_node *node)
  {
    struct cgraph_edge *e;
    bool inlined = false;
  
    for (e = node->callees; e; e = e->next_callee)
      {
!       if (!inline_summary (e->callee)->disregard_inline_limits)
  	continue;
  
-       if (dump_file)
- 	fprintf (dump_file,
- 		 "Considering always-inline candidate %s.\n",
- 		 cgraph_node_name (e->callee));
- 
        if (cgraph_edge_recursive_p (e))
  	{
  	  if (dump_file)
! 	    fprintf (dump_file, "Not inlining: recursive call.\n");
  	  e->inline_failed = CIF_RECURSIVE_INLINING;
  	  continue;
  	}
  
!       if (!cgraph_edge_early_inlinable_p (e, dump_file))
  	continue;
  
        if (dump_file)
! 	fprintf (dump_file, " Inlining %s into %s.\n",
  		 cgraph_node_name (e->callee),
  		 cgraph_node_name (e->caller));
        cgraph_mark_inline_edge (e, true, NULL);
--- 1629,1667 ----
  	     "size %i turned to %i size.\n\n",
  	     ncalls_inlined, nfunctions_inlined, initial_size,
  	     overall_size);
    /* In WPA we use inline summaries for partitioning process.  */
    if (!flag_wpa)
      inline_free_summary ();
    return 0;
  }
  
  /* Inline always-inline function calls in NODE.  */
  
  static bool
! inline_always_inline_functions (struct cgraph_node *node)
  {
    struct cgraph_edge *e;
    bool inlined = false;
  
    for (e = node->callees; e; e = e->next_callee)
      {
!       if (!DECL_DISREGARD_INLINE_LIMITS (e->callee->decl))
  	continue;
  
        if (cgraph_edge_recursive_p (e))
  	{
  	  if (dump_file)
! 	    fprintf (dump_file, "  Not inlining recursive call to %s.\n",
! 		     cgraph_node_name (e->callee));
  	  e->inline_failed = CIF_RECURSIVE_INLINING;
  	  continue;
  	}
  
!       if (!can_early_inline_edge_p (e))
  	continue;
  
        if (dump_file)
! 	fprintf (dump_file, "  Inlining %s into %s (always_inline).\n",
  		 cgraph_node_name (e->callee),
  		 cgraph_node_name (e->caller));
        cgraph_mark_inline_edge (e, true, NULL);
*************** cgraph_perform_always_inlining (struct c
*** 1540,1563 ****
     expenses on updating data structures.  */
  
  static bool
! cgraph_decide_inlining_incrementally (struct cgraph_node *node)
  {
    struct cgraph_edge *e;
    bool inlined = false;
-   cgraph_inline_failed_t failed_reason;
- 
-   /* Never inline regular functions into always-inline functions
-      during incremental inlining.  */
-   if (inline_summary (node)->disregard_inline_limits)
-     return false;
  
    for (e = node->callees; e; e = e->next_callee)
      {
-       int allowed_growth = 0;
- 
        if (!inline_summary (e->callee)->inlinable
! 	  || !e->inline_failed
! 	  || inline_summary (e->callee)->disregard_inline_limits)
  	continue;
  
        /* Do not consider functions not declared inline.  */
--- 1675,1689 ----
     expenses on updating data structures.  */
  
  static bool
! early_inline_small_functions (struct cgraph_node *node)
  {
    struct cgraph_edge *e;
    bool inlined = false;
  
    for (e = node->callees; e; e = e->next_callee)
      {
        if (!inline_summary (e->callee)->inlinable
! 	  || !e->inline_failed)
  	continue;
  
        /* Do not consider functions not declared inline.  */
*************** cgraph_decide_inlining_incrementally (st
*** 1570,1616 ****
  	fprintf (dump_file, "Considering inline candidate %s.\n",
  		 cgraph_node_name (e->callee));
  
        if (cgraph_edge_recursive_p (e))
  	{
  	  if (dump_file)
! 	    fprintf (dump_file, "Not inlining: recursive call.\n");
  	  continue;
  	}
  
!       if (!cgraph_edge_early_inlinable_p (e, dump_file))
  	continue;
  
!       if (cgraph_maybe_hot_edge_p (e) && leaf_node_p (e->callee)
! 	  && optimize_function_for_speed_p (cfun))
! 	allowed_growth = PARAM_VALUE (PARAM_EARLY_INLINING_INSNS);
! 
!       /* When the function body would grow and inlining the function
! 	 won't eliminate the need for offline copy of the function,
! 	 don't inline.  */
!       if (estimate_edge_growth (e) > allowed_growth)
! 	{
! 	  if (dump_file)
! 	    fprintf (dump_file,
! 		     "Not inlining: code size would grow by %i.\n",
! 		     estimate_edge_growth (e));
! 	  continue;
! 	}
!       if (!cgraph_check_inline_limits (e, &e->inline_failed))
! 	{
! 	  if (dump_file)
! 	    fprintf (dump_file, "Not inlining: %s.\n",
! 		     cgraph_inline_failed_string (e->inline_failed));
! 	  continue;
! 	}
!       if (cgraph_default_inline_p (e->callee, &failed_reason))
! 	{
! 	  if (dump_file)
! 	    fprintf (dump_file, " Inlining %s into %s.\n",
! 		     cgraph_node_name (e->callee),
! 		     cgraph_node_name (e->caller));
! 	  cgraph_mark_inline_edge (e, true, NULL);
! 	  inlined = true;
! 	}
      }
  
    return inlined;
--- 1696,1720 ----
  	fprintf (dump_file, "Considering inline candidate %s.\n",
  		 cgraph_node_name (e->callee));
  
+       if (!can_early_inline_edge_p (e))
+ 	continue;
+ 
        if (cgraph_edge_recursive_p (e))
  	{
  	  if (dump_file)
! 	    fprintf (dump_file, "  Not inlining: recursive call.\n");
  	  continue;
  	}
  
!       if (!want_early_inline_function_p (e))
  	continue;
  
!       if (dump_file)
! 	fprintf (dump_file, " Inlining %s into %s.\n",
! 		 cgraph_node_name (e->callee),
! 		 cgraph_node_name (e->caller));
!       cgraph_mark_inline_edge (e, true, NULL);
!       inlined = true;
      }
  
    return inlined;
*************** static GTY ((length ("nnodes"))) struct 
*** 1626,1632 ****
     passes to be somewhat more effective and avoids some code duplication in
     later real inlining pass for testcases with very many function calls.  */
  static unsigned int
! cgraph_early_inlining (void)
  {
    struct cgraph_node *node = cgraph_get_node (current_function_decl);
    struct cgraph_edge *edge;
--- 1730,1736 ----
     passes to be somewhat more effective and avoids some code duplication in
     later real inlining pass for testcases with very many function calls.  */
  static unsigned int
! early_inliner (void)
  {
    struct cgraph_node *node = cgraph_get_node (current_function_decl);
    struct cgraph_edge *edge;
*************** cgraph_early_inlining (void)
*** 1643,1653 ****
  
    /* Even when not optimizing or not inlining inline always-inline
       functions.  */
!   inlined = cgraph_perform_always_inlining (node);
  
    if (!optimize
        || flag_no_inline
!       || !flag_early_inlining)
      ;
    else if (lookup_attribute ("flatten",
  			     DECL_ATTRIBUTES (node->decl)) != NULL)
--- 1747,1766 ----
  
    /* Even when not optimizing or not inlining inline always-inline
       functions.  */
!   inlined = inline_always_inline_functions (node);
  
    if (!optimize
        || flag_no_inline
!       || !flag_early_inlining
!       /* Never inline regular functions into always-inline functions
! 	 during incremental inlining.  This sucks as functions calling
! 	 always inline functions will get less optimized, but at the
! 	 same time inlining of functions calling always inline
! 	 functoin into an always inline function might introduce
! 	 cycles of edges to be always inlined in the callgraph.
! 
! 	 We might want to be smarter and just avoid this type of inlining.  */
!       || DECL_DISREGARD_INLINE_LIMITS (node->decl))
      ;
    else if (lookup_attribute ("flatten",
  			     DECL_ATTRIBUTES (node->decl)) != NULL)
*************** cgraph_early_inlining (void)
*** 1657,1663 ****
        if (dump_file)
  	fprintf (dump_file,
  		 "Flattening %s\n", cgraph_node_name (node));
!       cgraph_flatten (node);
        inlined = true;
      }
    else
--- 1770,1776 ----
        if (dump_file)
  	fprintf (dump_file,
  		 "Flattening %s\n", cgraph_node_name (node));
!       flatten_function (node);
        inlined = true;
      }
    else
*************** cgraph_early_inlining (void)
*** 1665,1674 ****
        /* We iterate incremental inlining to get trivial cases of indirect
  	 inlining.  */
        while (iterations < PARAM_VALUE (PARAM_EARLY_INLINER_MAX_ITERATIONS)
! 	     && cgraph_decide_inlining_incrementally (node))
  	{
  	  timevar_push (TV_INTEGRATION);
  	  todo |= optimize_inline_calls (current_function_decl);
  	  timevar_pop (TV_INTEGRATION);
  	  iterations++;
  	  inlined = false;
--- 1778,1799 ----
        /* We iterate incremental inlining to get trivial cases of indirect
  	 inlining.  */
        while (iterations < PARAM_VALUE (PARAM_EARLY_INLINER_MAX_ITERATIONS)
! 	     && early_inline_small_functions (node))
  	{
  	  timevar_push (TV_INTEGRATION);
  	  todo |= optimize_inline_calls (current_function_decl);
+ 
+ 	  /* Technically we ought to recompute inline parameters so the new
+  	     iteration of early inliner works as expected.  We however have
+ 	     values approximately right and thus we only need to update edge
+ 	     info that might be cleared out for newly discovered edges.  */
+ 	  for (edge = node->callees; edge; edge = edge->next_callee)
+ 	    {
+ 	      edge->call_stmt_size
+ 		= estimate_num_insns (edge->call_stmt, &eni_size_weights);
+ 	      edge->call_stmt_time
+ 		= estimate_num_insns (edge->call_stmt, &eni_time_weights);
+ 	    }
  	  timevar_pop (TV_INTEGRATION);
  	  iterations++;
  	  inlined = false;
*************** cgraph_early_inlining (void)
*** 1681,1699 ****
      {
        timevar_push (TV_INTEGRATION);
        todo |= optimize_inline_calls (current_function_decl);
- 
-       /* Technically we ought to recompute inline parameters so the new iteration of
- 	 early inliner works as expected.  We however have values approximately right
- 	 and thus we only need to update edge info that might be cleared out for
- 	 newly discovered edges.  */
-       for (edge = node->callees; edge; edge = edge->next_callee)
- 	{
- 	  edge->call_stmt_size
- 	    = estimate_num_insns (edge->call_stmt, &eni_size_weights);
- 	  edge->call_stmt_time
- 	    = estimate_num_insns (edge->call_stmt, &eni_time_weights);
- 	}
- 
        timevar_pop (TV_INTEGRATION);
      }
  
--- 1806,1811 ----
*************** struct gimple_opt_pass pass_early_inline
*** 1708,1714 ****
    GIMPLE_PASS,
    "einline",	 			/* name */
    NULL,					/* gate */
!   cgraph_early_inlining,		/* execute */
    NULL,					/* sub */
    NULL,					/* next */
    0,					/* static_pass_number */
--- 1820,1826 ----
    GIMPLE_PASS,
    "einline",	 			/* name */
    NULL,					/* gate */
!   early_inliner,			/* execute */
    NULL,					/* sub */
    NULL,					/* next */
    0,					/* static_pass_number */
*************** inline_transform (struct cgraph_node *no
*** 1730,1737 ****
    struct cgraph_edge *e;
    bool inline_p = false;
  
!   /* FIXME: Currently the pass manager is adding inline transform more than once to some
!      clones.  This needs revisiting after WPA cleanups.  */
    if (cfun->after_inlining)
      return 0;
  
--- 1842,1849 ----
    struct cgraph_edge *e;
    bool inline_p = false;
  
!   /* FIXME: Currently the pass manager is adding inline transform more than
!      once to some clones.  This needs revisiting after WPA cleanups.  */
    if (cfun->after_inlining)
      return 0;
  
*************** inline_transform (struct cgraph_node *no
*** 1762,1768 ****
     happens during early inlining.  */
  
  static bool
! gate_cgraph_decide_inlining (void)
  {
    /* ???  We'd like to skip this if not optimizing or not inlining as
       all always-inline functions have been processed by early
--- 1874,1880 ----
     happens during early inlining.  */
  
  static bool
! gate_ipa_inline (void)
  {
    /* ???  We'd like to skip this if not optimizing or not inlining as
       all always-inline functions have been processed by early
*************** struct ipa_opt_pass_d pass_ipa_inline =
*** 1777,1784 ****
   {
    IPA_PASS,
    "inline",				/* name */
!   gate_cgraph_decide_inlining,		/* gate */
!   cgraph_decide_inlining,		/* execute */
    NULL,					/* sub */
    NULL,					/* next */
    0,					/* static_pass_number */
--- 1889,1896 ----
   {
    IPA_PASS,
    "inline",				/* name */
!   gate_ipa_inline,			/* gate */
!   ipa_inline,				/* execute */
    NULL,					/* sub */
    NULL,					/* next */
    0,					/* static_pass_number */
Index: ipa-inline.h
===================================================================
*** ipa-inline.h	(revision 172597)
--- ipa-inline.h	(working copy)
*************** struct inline_summary
*** 41,48 ****
    /* False when there something makes versioning impossible.
       Currently computed and used only by ipa-cp.  */
    unsigned versionable : 1;
-   /* True when function should be inlined independently on its size.  */
-   unsigned disregard_inline_limits : 1;
  
    /* Information about function that will result after applying all the
       inline decisions present in the callgraph.  Generally kept up to
--- 41,46 ----
Index: ipa-inline-analysis.c
===================================================================
*** ipa-inline-analysis.c	(revision 172597)
--- ipa-inline-analysis.c	(working copy)
*************** dump_inline_summary (FILE *f, struct cgr
*** 131,137 ****
        struct inline_summary *s = inline_summary (node);
        fprintf (f, "Inline summary for %s/%i", cgraph_node_name (node),
  	       node->uid);
!       if (s->disregard_inline_limits)
  	fprintf (f, " always_inline");
        if (s->inlinable)
  	fprintf (f, " inlinable");
--- 131,137 ----
        struct inline_summary *s = inline_summary (node);
        fprintf (f, "Inline summary for %s/%i", cgraph_node_name (node),
  	       node->uid);
!       if (DECL_DISREGARD_INLINE_LIMITS (node->decl))
  	fprintf (f, " always_inline");
        if (s->inlinable)
  	fprintf (f, " inlinable");
*************** dump_inline_summary (FILE *f, struct cgr
*** 142,148 ****
        fprintf (f, "  global time:     %i\n", s->time);
        fprintf (f, "  self size:       %i, benefit: %i\n",
  	       s->self_size, s->size_inlining_benefit);
!       fprintf (f, "  global size:     %i", s->size);
        fprintf (f, "  self stack:      %i\n",
  	       (int)s->estimated_self_stack_size);
        fprintf (f, "  global stack:    %i\n\n",
--- 142,148 ----
        fprintf (f, "  global time:     %i\n", s->time);
        fprintf (f, "  self size:       %i, benefit: %i\n",
  	       s->self_size, s->size_inlining_benefit);
!       fprintf (f, "  global size:     %i\n", s->size);
        fprintf (f, "  self stack:      %i\n",
  	       (int)s->estimated_self_stack_size);
        fprintf (f, "  global stack:    %i\n\n",
*************** compute_inline_parameters (struct cgraph
*** 364,371 ****
  
    /* Can this function be inlined at all?  */
    info->inlinable = tree_inlinable_function_p (node->decl);
-   if (!info->inlinable)
-     info->disregard_inline_limits = 0;
  
    /* Inlinable functions always can change signature.  */
    if (info->inlinable)
--- 364,369 ----
*************** compute_inline_parameters (struct cgraph
*** 388,395 ****
    info->estimated_growth = INT_MIN;
    info->stack_frame_offset = 0;
    info->estimated_stack_size = info->estimated_self_stack_size;
-   info->disregard_inline_limits
-     = DECL_DISREGARD_INLINE_LIMITS (node->decl);
  }
  
  
--- 386,391 ----
*************** estimate_growth (struct cgraph_node *nod
*** 483,507 ****
  
    for (e = node->callers; e; e = e->next_caller)
      {
!       if (e->caller == node)
          self_recursive = true;
!       if (e->inline_failed)
! 	growth += estimate_edge_growth (e);
      }
  
!   /* ??? Wrong for non-trivially self recursive functions or cases where
!      we decide to not inline for different reasons, but it is not big deal
!      as in that case we will keep the body around, but we will also avoid
!      some inlining.  */
!   if (cgraph_will_be_removed_from_program_if_no_direct_calls (node)
!       && !DECL_EXTERNAL (node->decl) && !self_recursive)
!     growth -= info->size;
!   /* COMDAT functions are very often not shared across multiple units since they
!      come from various template instantiations.  Take this into account.  */
!   else  if (DECL_COMDAT (node->decl) && !self_recursive
! 	    && cgraph_can_remove_if_no_direct_calls_p (node))
!     growth -= (info->size
! 	       * (100 - PARAM_VALUE (PARAM_COMDAT_SHARING_PROBABILITY)) + 50) / 100;
  
    info->estimated_growth = growth;
    return growth;
--- 479,512 ----
  
    for (e = node->callers; e; e = e->next_caller)
      {
!       gcc_checking_assert (e->inline_failed);
! 
!       if (e->caller == node
! 	  || (e->caller->global.inlined_to
! 	      && e->caller->global.inlined_to == node))
          self_recursive = true;
!       growth += estimate_edge_growth (e);
      }
+      
  
!   /* For self recursive functions the growth estimation really should be
!      infinity.  We don't want to return very large values because the growth
!      plays various roles in badness computation fractions.  Be sure to not
!      return zero or negative growths. */
!   if (self_recursive)
!     growth = growth < info->size ? info->size : growth;
!   else
!     {
!       if (cgraph_will_be_removed_from_program_if_no_direct_calls (node)
! 	  && !DECL_EXTERNAL (node->decl))
! 	growth -= info->size;
!       /* COMDAT functions are very often not shared across multiple units since they
! 	 come from various template instantiations.  Take this into account.  */
!       else  if (DECL_COMDAT (node->decl)
! 		&& cgraph_can_remove_if_no_direct_calls_p (node))
! 	growth -= (info->size
! 		   * (100 - PARAM_VALUE (PARAM_COMDAT_SHARING_PROBABILITY)) + 50) / 100;
!     }
  
    info->estimated_growth = growth;
    return growth;
*************** inline_read_summary (void)
*** 621,627 ****
  	      bp = lto_input_bitpack (ib);
  	      info->inlinable = bp_unpack_value (&bp, 1);
  	      info->versionable = bp_unpack_value (&bp, 1);
- 	      info->disregard_inline_limits = bp_unpack_value (&bp, 1);
  	    }
  
  	  lto_destroy_simple_input_block (file_data,
--- 626,631 ----
*************** inline_write_summary (cgraph_node_set se
*** 688,694 ****
  	  bp = bitpack_create (ob->main_stream);
  	  bp_pack_value (&bp, info->inlinable, 1);
  	  bp_pack_value (&bp, info->versionable, 1);
- 	  bp_pack_value (&bp, info->disregard_inline_limits, 1);
  	  lto_output_bitpack (&bp);
  	}
      }
--- 692,697 ----

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Inliner heuristics facelift
  2011-04-17 10:54 Jan Hubicka
@ 2011-04-19 18:37 ` Xinliang David Li
  2011-04-19 19:24   ` H.J. Lu
  0 siblings, 1 reply; 9+ messages in thread
From: Xinliang David Li @ 2011-04-19 18:37 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches

Is change tested this with SPEC?  5 or 6 SPEC2k programs ICE with the
following error when built with FDO.

.... internal compiler error: in inline_small_functions, at ipa-inline.c:1361
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.

David

On Sun, Apr 17, 2011 at 3:24 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
> Hi,
> cgraph inliner implementation is 8 years old now.  It has growh from simple
> C-only inliner preventing unbounded code size explossion seen on earlier tree
> inlining heuristics over "lets solve problem with abstraction penalty in
> tramp3d" into "lets do something useful on Mozilla".
>
> In this patch I am attempting to retrospect some design into the code that
> since original one got lost in tons of various additions and migration from
> different cgraph implementations.
>
> While I think the overhall scheme of having way to transform cgraph&functions
> separated from heuritics and heuristics implemented as a greedy algorithm doing
> as much of useful inlining as allowed by the limits is OK, the implementation
> got clumpsy.
>
> Main idea of the patch is to move all the various check scattered across the
> code into can_inline_edge_p predicate that test if inlining is at all possible.
> In addition each of different inliner heuristics (always inline, flatteing,
> early inlining, inlining of small functions, recursive inlining and inlining of
> functions called once) has bit different rules on when it thinks inlining is
> good idea.  For always inline and flattening it is trivial. For other there are
> now separated want_* predicates doing all the checks at one place.
>
> The patch also cleanups the dump files and makes function naming more regular,
> especially by dropping nonsential cgraph prefixes coming from dark times when
> cgraph and inliner was in single source file.
>
> While working on the patch I contied number of minor issues that are fixed,
> since it was easier than go with immitating previous behaviour and adding
> FIXMEs.
>
> In particular
>
>  1) Long standing TODO in estimate_growth concerning miscomputation of self
>     recursive functions is gone.  It was trivial ever since the cgraph code
>     was revamped to inline plans with clones.
>  2) Small function inliner no longer inserts only functoins passing
>     cgraph_default_inline_p nto the queue.  The feasibility of small function
>     inlining is edge specific now and accounts estimated benefits into
>     --param large-function-insns-auto and -single limits.
>  3) There was bug in cgraph_check_inline_limits concerning stack usage
>     computation for nested functions, probably introduced by my previous patch.
>     It is fixed now
>  4) cgraph_check_inline_limits was specially allowing inlining large function
>     into small by setting the base of limit to be bigger caller or callee self
>     size.  This is very important since often large functions should be inlined
>     into tiny wrappers and benefit from the extra context.
>
>     It is however also sort shighted, since when large function gets inlined into
>     tiny wrapper, later we migh prevent inlining small functions into the large functions
>     since in next iteration the size will be maximum of the tiny wrapper and small
>     function.
>     Fixed by looking for largest function in the inline stack and symetrically in
>     the case of
>  5) All of the recursive inlining tests was implemented before callgraph edge
>     frequencies was added and not updated since then. We now compute the decision
>     based on both.  It is still not doing as well as I would like, I will try
>     to reconsider this later.
>  6) Inlining functions called once was missing oppurtunities because at inliner
>     time we keep in callgraph offline copies of extern inline functions and some
>     virtual functions for future inlining.  Calls from these functions prevent
>     inliner to notice that function is called once.  Fixed by removing unreachable
>     calls before functions called once are processed.
>
> Bootstrapped/regtested x86_64-linux, also tested on tramp3d to behave sanely
> and on nightly tester (frescobaldi).
> Will commit it later today.
>
> Honza
>
>        * lto-symtab.c (lto_cgraph_replace_node): When call statement is
>        present, also set gimple_call_set_cannot_inline.
>        * ipa-inline.c: Update toplevel comment.
>        (MAX_TIME): Remove.
>        (cgraph_clone_inlined_nodes): Fix linebreaks.
>        (cgraph_check_inline_limits): Restructure to ...
>        (caller_growth_limits): ... this one; be more tolerant
>        on growth in nested inline chains; add explanatory comment;
>        fix stack accounting thinko introduced by previous patch.
>        (cgraph_default_inline_p): Remove.
>        (report_inline_failed_reason): New function.
>        (can_inline_edge_p): New function.
>        (can_early_inline_edge_p): New function.
>        (leaf_node_p): Move upwards in file.
>        (want_early_inline_function_p): New function.
>        (want_inline_small_function_p): New function.
>        (want_inline_self_recursive_call_p): New function.
>        (cgraph_edge_badness): Rename to ...
>        (edge_badness) ... this one; fix linebreaks.
>        (update_edge_key): Update call of edge_baddness; add
>        detailed dump about queue updates.
>        (update_caller_keys): Use can_inline_edge_p and
>        want_inline_small_function_p.
>        (cgraph_decide_recursive_inlining): Rename to...
>        (recursive_inlining): Use can_inline_edge_p and
>        want_inline_self_recursive_call_p; simplify and
>        remove no longer valid FIXME.
>        (cgraph_set_inline_failed): Remove.
>        (add_new_edges_to_heap): Use can_inline_edge_p and
>        want_inline_small_function_p.
>        (cgraph_decide_inlining_of_small_functions): Rename to ...
>        (inline_small_functions): ... this one; cleanup; use
>        can/want predicates; cleanup debug ouput; work edges
>        till fibheap is exhausted and do not stop once unit
>        growth is reached; remove later loop processing remaining
>        edges.
>        (cgraph_flatten): Rename to ...
>        (flatten_function): ... this one; use can_inline_edge_p
>        and can_early_inline_edge_p predicates.
>        (cgraph_decide_inlining): Rename to ...
>        (ipa_inline): ... this one; remove unreachable nodes before
>        inlining functions called once; simplify the pass.
>        (cgraph_perform_always_inlining): Rename to ...
>        (inline_always_inline_functions): ... this one; use
>        DECL_DISREGARD_INLINE_LIMITS; use can_inline_edge_p
>        predicate
>        (cgraph_decide_inlining_incrementally): Rename to ...
>        (early_inline_small_functions): ... this one; simplify
>        using new predicates; cleanup; make dumps prettier.
>        (cgraph_early_inlining): Rename to ...
>        (early_inliner): newer inline regular functions into always-inlines;
>        fix updating of call stmt summaries.
>        (pass_early_inline): Update for new names.
>        (inline_transform): Fix formating.
>        (gate_cgraph_decide_inlining): Rename to ...
>        (pass_ipa_inline): ... this one.
>        * ipa-inline.h (inline_summary): Remove disregard_inline_limits.
>        * ipa-inline-analysis.c (dump_inline_summary): Update.
>        (compute_inline_parameters): Do not compute disregard_inline_limits.
>        (estimate_growth): Fix handlig of non-trivial self recursion.
>        (inline_read_summary): Do not read info->disregard_inline_limits.
>        (inline_write_summary): Do not write info->disregard_inline_limits.
>
>        * gcc.dg/winline-5.c: Update testcase.
>
> Index: testsuite/gcc.dg/winline-5.c
> ===================================================================
> *** testsuite/gcc.dg/winline-5.c        (revision 172597)
> --- testsuite/gcc.dg/winline-5.c        (working copy)
> *************** inline int q(void) /* { dg-warning "inli
> *** 15,29 ****
>        big();
>        big();
>  }
> - inline int q1(void)
> - {
> -       big();
> -       big();
> -       big();
> - }
>  int t (void)
>  {
> -  /* We allow one inlining over limit.  */
> -       q1();
>        return q ();             /* { dg-warning "called from here" } */
>  }
> --- 15,21 ----
> Index: ipa-inline.c
> ===================================================================
> *** ipa-inline.c        (revision 172597)
> --- ipa-inline.c        (working copy)
> *************** along with GCC; see the file COPYING3.
> *** 21,93 ****
>
>  /*  Inlining decision heuristics
>
> !     We separate inlining decisions from the inliner itself and store it
> !     inside callgraph as so called inline plan.  Refer to cgraph.c
> !     documentation about particular representation of inline plans in the
> !     callgraph.
> !
> !     There are three major parts of this file:
> !
> !     cgraph_mark_inline_edge implementation
> !
> !       This function allows to mark given call inline and performs necessary
> !       modifications of cgraph (production of the clones and updating overall
> !       statistics)
>
>      inlining heuristics limits
>
> !       These functions allow to check that particular inlining is allowed
> !       by the limits specified by user (allowed function growth, overall unit
> !       growth and so on).
>
>      inlining heuristics
>
> !       This is implementation of IPA pass aiming to get as much of benefit
> !       from inlining obeying the limits checked above.
>
> !       The implementation of particular heuristics is separated from
> !       the rest of code to make it easier to replace it with more complicated
> !       implementation in the future.  The rest of inlining code acts as a
> !       library aimed to modify the callgraph and verify that the parameters
> !       on code size growth fits.
> !
> !       To mark given call inline, use cgraph_mark_inline function, the
> !       verification is performed by cgraph_default_inline_p and
> !       cgraph_check_inline_limits.
> !
> !       The heuristics implements simple knapsack style algorithm ordering
> !       all functions by their "profitability" (estimated by code size growth)
> !       and inlining them in priority order.
> !
> !       cgraph_decide_inlining implements heuristics taking whole callgraph
> !       into account, while cgraph_decide_inlining_incrementally considers
> !       only one function at a time and is used by early inliner.
> !
> !    The inliner itself is split into two passes:
> !
> !    pass_early_inlining
> !
> !      Simple local inlining pass inlining callees into current function.  This
> !      pass makes no global whole compilation unit analysis and this when allowed
> !      to do inlining expanding code size it might result in unbounded growth of
> !      whole unit.
> !
> !      The pass is run during conversion into SSA form.  Only functions already
> !      converted into SSA form are inlined, so the conversion must happen in
> !      topological order on the callgraph (that is maintained by pass manager).
> !      The functions after inlining are early optimized so the early inliner sees
> !      unoptimized function itself, but all considered callees are already
> !      optimized allowing it to unfold abstraction penalty on C++ effectively and
> !      cheaply.
> !
> !    pass_ipa_inline
> !
> !      This is the main pass implementing simple greedy algorithm to do inlining
> !      of small functions that results in overall growth of compilation unit and
> !      inlining of functions called once.  The pass compute just so called inline
> !      plan (representation of inlining to be done in callgraph) and unlike early
> !      inlining it is not performing the inlining itself.
> !  */
>
>  #include "config.h"
>  #include "system.h"
> --- 21,106 ----
>
>  /*  Inlining decision heuristics
>
> !     The implementation of inliner is organized as follows:
> !
> !     Transformation of callgraph to represent inlining decisions.
> !
> !       The inline decisions are stored in callgraph in "inline plan" and
> !       all applied later.
> !
> !       To mark given call inline, use cgraph_mark_inline function.
> !       The function marks the edge inlinable and, if neccesary, produces
> !       virtual clone in the callgraph representing the new copy of callee's
> !       function body.
> !
> !       The inline plan is applied on given function body by inline_transform.
>
>      inlining heuristics limits
>
> !       can_inline_edge_p allow to check that particular inlining is allowed
> !       by the limits specified by user (allowed function growth, growth and so
> !       on).
> !
> !       Functions are inlined when it is obvious the result is profitable (such
> !       as functions called once or when inlining reduce code size).
> !       In addition to that we perform inlining of small functions and recursive
> !       inlining.
>
>      inlining heuristics
>
> !        The inliner itself is split into two passes:
> !
> !        pass_early_inlining
>
> !        Simple local inlining pass inlining callees into current function.
> !        This pass makes no use of whole unit analysis and thus it can do only
> !        very simple decisions based on local properties.
> !
> !        The strength of the pass is that it is run in topological order
> !        (reverse postorder) on the callgraph. Functions are converted into SSA
> !        form just before this pass and optimized subsequently. As a result, the
> !        callees of the function seen by the early inliner was already optimized
> !        and results of early inlining adds a lot of optimization oppurtunities
> !        for the local optimization.
> !
> !        The pass handle the obvious inlining decisions within the copmilation
> !        unit - inlining auto inline functions, inlining for size and
> !        flattening.
> !
> !        main strength of the pass is the ability to eliminate abstraction
> !        penalty in C++ code (via combination of inlining and early
> !        optimization) and thus improve quality of analysis done by real IPA
> !        optimizers.
> !
> !        Because of lack of whole unit knowledge, the pass can not really make
> !        good code size/performance tradeoffs.  It however does very simple
> !        speculative inlining allowing code size to grow by
> !        EARLY_INLINING_INSNS when calee is leaf function.  In this case the
> !        optimizations perfomed later are very likely to eliminate the cost.
> !
> !        pass_ipa_inline
> !
> !        This is the real inliner able to handle inlining with whole program
> !        knowledge. It performs following steps:
> !
> !        1) inlining of small functions.  This is implemented by greedy
> !        algorithm ordering all inlinable cgraph edges by their badness and
> !        inlining them in this order as long as inline limits allows doing so.
> !
> !        This heuristics is not very good on inlining recursive calls. Recursive
> !        calls can be inlined with results similar to loop unrolling. To do so,
> !        special purpose recursive inliner is executed on function when
> !        recursive edge is met as viable candidate.
> !
> !        2) Unreachable functions are removed from callgraph.  Inlining leads
> !        to devirtualization and other modification of callgraph so functions
> !        may become unreachable during the process. Also functions declared as
> !        extern inline or virtual functions are removed, since after inlining
> !        we no longer need the offline bodies.
> !
> !        3) Functions called once and not exported from the unit are inlined.
> !        This should almost always lead to reduction of code size by eliminating
> !        the need for offline copy of the function.  */
>
>  #include "config.h"
>  #include "system.h"
> *************** along with GCC; see the file COPYING3.
> *** 114,122 ****
>  #include "except.h"
>  #include "ipa-inline.h"
>
> - #define MAX_TIME 1000000000
> -
> -
>  /* Statistics we collect about inlining algorithm.  */
>  static int ncalls_inlined;
>  static int nfunctions_inlined;
> --- 127,132 ----
> *************** cgraph_clone_inlined_nodes (struct cgrap
> *** 163,173 ****
>        /* We may eliminate the need for out-of-line copy to be output.
>         In that case just go ahead and re-use it.  */
>        if (!e->callee->callers->next_caller
> !         /* Recursive inlining never wants the master clone to be overwritten.  */
>          && update_original
> !         /* FIXME: When address is taken of DECL_EXTERNAL function we still can remove its
> !            offline copy, but we would need to keep unanalyzed node in the callgraph so
> !            references can point to it.  */
>          && !e->callee->address_taken
>          && cgraph_can_remove_if_no_direct_calls_p (e->callee)
>          /* Inlining might enable more devirtualizing, so we want to remove
> --- 173,184 ----
>        /* We may eliminate the need for out-of-line copy to be output.
>         In that case just go ahead and re-use it.  */
>        if (!e->callee->callers->next_caller
> !         /* Recursive inlining never wants the master clone to
> !            be overwritten.  */
>          && update_original
> !         /* FIXME: When address is taken of DECL_EXTERNAL function we still
> !            can remove its offline copy, but we would need to keep unanalyzed
> !            node in the callgraph so references can point to it.  */
>          && !e->callee->address_taken
>          && cgraph_can_remove_if_no_direct_calls_p (e->callee)
>          /* Inlining might enable more devirtualizing, so we want to remove
> *************** cgraph_clone_inlined_nodes (struct cgrap
> *** 175,181 ****
>             Lacking may edges in callgraph we just preserve them post
>             inlining.  */
>          && (!DECL_VIRTUAL_P (e->callee->decl)
> !             || (!DECL_COMDAT (e->callee->decl) && !DECL_EXTERNAL (e->callee->decl)))
>          /* Don't reuse if more than one function shares a comdat group.
>             If the other function(s) are needed, we need to emit even
>             this function out of line.  */
> --- 186,193 ----
>             Lacking may edges in callgraph we just preserve them post
>             inlining.  */
>          && (!DECL_VIRTUAL_P (e->callee->decl)
> !             || (!DECL_COMDAT (e->callee->decl)
> !                 && !DECL_EXTERNAL (e->callee->decl)))
>          /* Don't reuse if more than one function shares a comdat group.
>             If the other function(s) are needed, we need to emit even
>             this function out of line.  */
> *************** cgraph_clone_inlined_nodes (struct cgrap
> *** 214,220 ****
>        + caller_info->estimated_self_stack_size;
>    peak = callee_info->stack_frame_offset
>        + callee_info->estimated_self_stack_size;
> !   if (inline_summary (e->callee->global.inlined_to)->estimated_stack_size < peak)
>      inline_summary (e->callee->global.inlined_to)->estimated_stack_size = peak;
>    cgraph_propagate_frequency (e->callee);
>
> --- 226,233 ----
>        + caller_info->estimated_self_stack_size;
>    peak = callee_info->stack_frame_offset
>        + callee_info->estimated_self_stack_size;
> !   if (inline_summary (e->callee->global.inlined_to)->estimated_stack_size
> !       < peak)
>      inline_summary (e->callee->global.inlined_to)->estimated_stack_size = peak;
>    cgraph_propagate_frequency (e->callee);
>
> *************** cgraph_mark_inline_edge (struct cgraph_e
> *** 272,304 ****
>      return false;
>  }
>
> ! /* Return false when inlining edge E is not good idea
> !    as it would cause too large growth of the callers function body
> !    or stack frame size.  *REASON if non-NULL is updated if the
> !    inlining is not a good idea.  */
>
>  static bool
> ! cgraph_check_inline_limits (struct cgraph_edge *e,
> !                           cgraph_inline_failed_t *reason)
>  {
>    struct cgraph_node *to = e->caller;
>    struct cgraph_node *what = e->callee;
>    int newsize;
> !   int limit;
> !   HOST_WIDE_INT stack_size_limit, inlined_stack;
> !   struct inline_summary *info, *what_info;
> !
> !   if (to->global.inlined_to)
> !     to = to->global.inlined_to;
>
> -   info = inline_summary (to);
>    what_info = inline_summary (what);
>
> !   /* When inlining large function body called once into small function,
> !      take the inlined function as base for limiting the growth.  */
> !   if (info->self_size > what_info->self_size)
> !     limit = info->self_size;
> !   else
>      limit = what_info->self_size;
>
>    limit += limit * PARAM_VALUE (PARAM_LARGE_FUNCTION_GROWTH) / 100;
> --- 285,336 ----
>      return false;
>  }
>
> ! /* Return false when inlining edge E would lead to violating
> !    limits on function unit growth or stack usage growth.
> !
> !    The relative function body growth limit is present generally
> !    to avoid problems with non-linear behaviour of the compiler.
> !    To allow inlining huge functions into tiny wrapper, the limit
> !    is always based on the bigger of the two functions considered.
> !
> !    For stack growth limits we always base the growth in stack usage
> !    of the callers.  We want to prevent applications from segfaulting
> !    on stack overflow when functions with huge stack frames gets
> !    inlined. */
>
>  static bool
> ! caller_growth_limits (struct cgraph_edge *e)
>  {
>    struct cgraph_node *to = e->caller;
>    struct cgraph_node *what = e->callee;
>    int newsize;
> !   int limit = 0;
> !   HOST_WIDE_INT stack_size_limit = 0, inlined_stack;
> !   struct inline_summary *info, *what_info, *outer_info = inline_summary (to);
> !
> !   /* Look for function e->caller is inlined to.  While doing
> !      so work out the largest function body on the way.  As
> !      described above, we want to base our function growth
> !      limits based on that.  Not on the self size of the
> !      outer function, not on the self size of inline code
> !      we immediately inline to.  This is the most relaxed
> !      interpretation of the rule "do not grow large functions
> !      too much in order to prevent compiler from exploding".  */
> !   do
> !     {
> !       info = inline_summary (to);
> !       if (limit < info->self_size)
> !       limit = info->self_size;
> !       if (stack_size_limit < info->estimated_self_stack_size)
> !       stack_size_limit = info->estimated_self_stack_size;
> !       if (to->global.inlined_to)
> !         to = to->callers->caller;
> !     }
> !   while (to->global.inlined_to);
>
>    what_info = inline_summary (what);
>
> !   if (limit < what_info->self_size)
>      limit = what_info->self_size;
>
>    limit += limit * PARAM_VALUE (PARAM_LARGE_FUNCTION_GROWTH) / 100;
> *************** cgraph_check_inline_limits (struct cgrap
> *** 310,388 ****
>        && newsize > PARAM_VALUE (PARAM_LARGE_FUNCTION_INSNS)
>        && newsize > limit)
>      {
> !       if (reason)
> !         *reason = CIF_LARGE_FUNCTION_GROWTH_LIMIT;
>        return false;
>      }
>
> !   stack_size_limit = info->estimated_self_stack_size;
> !
> !   stack_size_limit += stack_size_limit * PARAM_VALUE (PARAM_STACK_FRAME_GROWTH) / 100;
>
> !   inlined_stack = (info->stack_frame_offset
> !                  + info->estimated_self_stack_size
>                   + what_info->estimated_stack_size);
> !   if (inlined_stack  > stack_size_limit
>        && inlined_stack > PARAM_VALUE (PARAM_LARGE_STACK_FRAME))
>      {
> !       if (reason)
> !         *reason = CIF_LARGE_STACK_FRAME_GROWTH_LIMIT;
>        return false;
>      }
>    return true;
>  }
>
> ! /* Return true when function N is small enough to be inlined.  */
>
>  static bool
> ! cgraph_default_inline_p (struct cgraph_node *n, cgraph_inline_failed_t *reason)
>  {
> !   tree decl = n->decl;
> !   struct inline_summary *info = inline_summary (n);
>
> !   if (info->disregard_inline_limits)
> !     return true;
>
> !   if (!flag_inline_small_functions && !DECL_DECLARED_INLINE_P (decl))
>      {
> !       if (reason)
> !       *reason = CIF_FUNCTION_NOT_INLINE_CANDIDATE;
>        return false;
>      }
> !   if (!n->analyzed)
>      {
> !       if (reason)
> !       *reason = CIF_BODY_NOT_AVAILABLE;
>        return false;
>      }
> !   if (cgraph_function_body_availability (n) <= AVAIL_OVERWRITABLE)
>      {
> !       if (reason)
> !       *reason = CIF_OVERWRITABLE;
>        return false;
>      }
>
>
> !   if (DECL_DECLARED_INLINE_P (decl))
>      {
> !       if (info->size >= MAX_INLINE_INSNS_SINGLE)
>        {
> !         if (reason)
> !           *reason = CIF_MAX_INLINE_INSNS_SINGLE_LIMIT;
> !         return false;
>        }
>      }
>    else
>      {
> !       if (info->size >= MAX_INLINE_INSNS_AUTO)
>        {
> !         if (reason)
> !           *reason = CIF_MAX_INLINE_INSNS_AUTO_LIMIT;
> !         return false;
>        }
>      }
>
> !   return true;
>  }
>
>  /* A cost model driving the inlining heuristics in a way so the edges with
> --- 342,711 ----
>        && newsize > PARAM_VALUE (PARAM_LARGE_FUNCTION_INSNS)
>        && newsize > limit)
>      {
> !       e->inline_failed = CIF_LARGE_FUNCTION_GROWTH_LIMIT;
>        return false;
>      }
>
> !   /* FIXME: Stack size limit often prevents inlining in fortran programs
> !      due to large i/o datastructures used by the fortran frontend.
> !      We ought to ignore this limit when we know that the edge is executed
> !      on every invocation of the caller (i.e. its call statement dominates
> !      exit block).  We do not track this information, yet.  */
> !   stack_size_limit += (stack_size_limit
> !                      * PARAM_VALUE (PARAM_STACK_FRAME_GROWTH) / 100);
>
> !   inlined_stack = (outer_info->stack_frame_offset
> !                  + outer_info->estimated_self_stack_size
>                   + what_info->estimated_stack_size);
> !   /* Check new stack consumption with stack consumption at the place
> !      stack is used.  */
> !   if (inlined_stack > stack_size_limit
> !       /* If function already has large stack usage from sibbling
> !        inline call, we can inline, too.
> !        This bit overoptimistically assume that we are good at stack
> !        packing.  */
> !       && inlined_stack > info->estimated_stack_size
>        && inlined_stack > PARAM_VALUE (PARAM_LARGE_STACK_FRAME))
>      {
> !       e->inline_failed = CIF_LARGE_STACK_FRAME_GROWTH_LIMIT;
>        return false;
>      }
>    return true;
>  }
>
> ! /* Dump info about why inlining has failed.  */
> !
> ! static void
> ! report_inline_failed_reason (struct cgraph_edge *e)
> ! {
> !   if (dump_file)
> !     {
> !       fprintf (dump_file, "  not inlinable: %s/%i -> %s/%i, %s\n",
> !              cgraph_node_name (e->caller), e->caller->uid,
> !              cgraph_node_name (e->callee), e->callee->uid,
> !              cgraph_inline_failed_string (e->inline_failed));
> !     }
> ! }
> !
> ! /* Decide if we can inline the edge and possibly update
> !    inline_failed reason.
> !    We check whether inlining is possible at all and whether
> !    caller growth limits allow doing so.
> !
> !    if REPORT is true, output reason to dump file*/
>
>  static bool
> ! can_inline_edge_p (struct cgraph_edge *e, bool report)
>  {
> !   bool inlinable = true;
>
> !   gcc_assert (e->inline_failed);
>
> !   if (!e->callee->analyzed)
> !     {
> !       e->inline_failed = CIF_BODY_NOT_AVAILABLE;
> !       inlinable = false;
> !     }
> !   else if (!inline_summary (e->callee)->inlinable)
> !     {
> !       e->inline_failed = CIF_FUNCTION_NOT_INLINABLE;
> !       inlinable = false;
> !     }
> !   else if (cgraph_function_body_availability (e->callee) <= AVAIL_OVERWRITABLE)
>      {
> !       e->inline_failed = CIF_OVERWRITABLE;
>        return false;
>      }
> !   else if (e->call_stmt_cannot_inline_p)
> !     {
> !       e->inline_failed = CIF_MISMATCHED_ARGUMENTS;
> !       inlinable = false;
> !     }
> !   else if (!tree_can_inline_p (e))
> !     inlinable = false;
> !   else if (!DECL_DISREGARD_INLINE_LIMITS (e->callee->decl)
> !            && !caller_growth_limits (e))
> !     inlinable = false;
> !
> !   /* Be sure that the cannot_inline_p flag is up to date.  */
> !   gcc_checking_assert (!e->call_stmt
> !                      || (gimple_call_cannot_inline_p (e->call_stmt)
> !                          == e->call_stmt_cannot_inline_p)
> !                      /* In -flto-partition=none mode we really keep things out of
> !                         sync because call_stmt_cannot_inline_p is set at cgraph
> !                         merging when function bodies are not there yet.  */
> !                      || (in_lto_p && !gimple_call_cannot_inline_p (e->call_stmt)));
> !   if (!inlinable && report)
> !     report_inline_failed_reason (e);
> !   return inlinable;
> ! }
> !
> !
> ! /* Return true if the edge E is inlinable during early inlining.  */
> !
> ! static bool
> ! can_early_inline_edge_p (struct cgraph_edge *e)
> ! {
> !   /* Early inliner might get called at WPA stage when IPA pass adds new
> !      function.  In this case we can not really do any of early inlining
> !      because function bodies are missing.  */
> !   if (!gimple_has_body_p (e->callee->decl))
>      {
> !       e->inline_failed = CIF_BODY_NOT_AVAILABLE;
>        return false;
>      }
> !   /* In early inliner some of callees may not be in SSA form yet
> !      (i.e. the callgraph is cyclic and we did not process
> !      the callee by early inliner, yet).  We don't have CIF code for this
> !      case; later we will re-do the decision in the real inliner.  */
> !   if (!gimple_in_ssa_p (DECL_STRUCT_FUNCTION (e->caller->decl))
> !       || !gimple_in_ssa_p (DECL_STRUCT_FUNCTION (e->callee->decl)))
>      {
> !       if (dump_file)
> !       fprintf (dump_file, "  edge not inlinable: not in SSA form\n");
>        return false;
>      }
> +   if (!can_inline_edge_p (e, true))
> +     return false;
> +   return true;
> + }
> +
>
> + /* Return true when N is leaf function.  Accept cheap builtins
> +    in leaf functions.  */
> +
> + static bool
> + leaf_node_p (struct cgraph_node *n)
> + {
> +   struct cgraph_edge *e;
> +   for (e = n->callees; e; e = e->next_callee)
> +     if (!is_inexpensive_builtin (e->callee->decl))
> +       return false;
> +   return true;
> + }
>
> !
> ! /* Return true if we are interested in inlining small function.  */
> !
> ! static bool
> ! want_early_inline_function_p (struct cgraph_edge *e)
> ! {
> !   bool want_inline = true;
> !
> !   if (DECL_DISREGARD_INLINE_LIMITS (e->callee->decl))
> !     ;
> !   else if (!DECL_DECLARED_INLINE_P (e->callee->decl)
> !          && !flag_inline_small_functions)
>      {
> !       e->inline_failed = CIF_FUNCTION_NOT_INLINE_CANDIDATE;
> !       report_inline_failed_reason (e);
> !       want_inline = false;
> !     }
> !   else
> !     {
> !       int growth = estimate_edge_growth (e);
> !       if (growth <= 0)
> !       ;
> !       else if (!cgraph_maybe_hot_edge_p (e)
> !              && growth > 0)
>        {
> !         if (dump_file)
> !           fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
> !                    "call is cold and code would grow by %i\n",
> !                    cgraph_node_name (e->caller), e->caller->uid,
> !                    cgraph_node_name (e->callee), e->callee->uid,
> !                    growth);
> !         want_inline = false;
>        }
> +       else if (!leaf_node_p (e->callee)
> +              && growth > 0)
> +       {
> +         if (dump_file)
> +           fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
> +                    "callee is not leaf and code would grow by %i\n",
> +                    cgraph_node_name (e->caller), e->caller->uid,
> +                    cgraph_node_name (e->callee), e->callee->uid,
> +                    growth);
> +         want_inline = false;
> +       }
> +       else if (growth > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
> +       {
> +         if (dump_file)
> +           fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
> +                    "growth %i exceeds --param early-inlining-insns\n",
> +                    cgraph_node_name (e->caller), e->caller->uid,
> +                    cgraph_node_name (e->callee), e->callee->uid,
> +                    growth);
> +         want_inline = false;
> +       }
> +     }
> +   return want_inline;
> + }
> +
> + /* Return true if we are interested in inlining small function.
> +    When REPORT is true, report reason to dump file.  */
> +
> + static bool
> + want_inline_small_function_p (struct cgraph_edge *e, bool report)
> + {
> +   bool want_inline = true;
> +
> +   if (DECL_DISREGARD_INLINE_LIMITS (e->callee->decl))
> +     ;
> +   else if (!DECL_DECLARED_INLINE_P (e->callee->decl)
> +          && !flag_inline_small_functions)
> +     {
> +       e->inline_failed = CIF_FUNCTION_NOT_INLINE_CANDIDATE;
> +       want_inline = false;
>      }
>    else
>      {
> !       int growth = estimate_edge_growth (e);
> !
> !       if (growth <= 0)
> !       ;
> !       else if (DECL_DECLARED_INLINE_P (e->callee->decl)
> !              && growth >= MAX_INLINE_INSNS_SINGLE)
>        {
> !           e->inline_failed = CIF_MAX_INLINE_INSNS_SINGLE_LIMIT;
> !         want_inline = false;
> !       }
> !       else if (!DECL_DECLARED_INLINE_P (e->callee->decl)
> !              && !flag_inline_functions)
> !       {
> !           e->inline_failed = CIF_NOT_DECLARED_INLINED;
> !         want_inline = false;
> !       }
> !       else if (!DECL_DECLARED_INLINE_P (e->callee->decl)
> !              && growth >= MAX_INLINE_INSNS_AUTO)
> !       {
> !           e->inline_failed = CIF_MAX_INLINE_INSNS_AUTO_LIMIT;
> !         want_inline = false;
> !       }
> !       else if (!cgraph_maybe_hot_edge_p (e)
> !              && estimate_growth (e->callee) > 0)
> !       {
> !           e->inline_failed = CIF_UNLIKELY_CALL;
> !         want_inline = false;
>        }
>      }
> +   if (!want_inline && report)
> +     report_inline_failed_reason (e);
> +   return want_inline;
> + }
>
> ! /* EDGE is self recursive edge.
> !    We hand two cases - when function A is inlining into itself
> !    or when function A is being inlined into another inliner copy of function
> !    A within function B.
> !
> !    In first case OUTER_NODE points to the toplevel copy of A, while
> !    in the second case OUTER_NODE points to the outermost copy of A in B.
> !
> !    In both cases we want to be extra selective since
> !    inlining the call will just introduce new recursive calls to appear.  */
> ! static bool
> ! want_inline_self_recursive_call_p (struct cgraph_edge *edge,
> !                                  struct cgraph_node *outer_node,
> !                                  bool peeling,
> !                                  int depth)
> ! {
> !   char const *reason = NULL;
> !   bool want_inline = true;
> !   int caller_freq = CGRAPH_FREQ_BASE;
> !   int max_depth = PARAM_VALUE (PARAM_MAX_INLINE_RECURSIVE_DEPTH_AUTO);
> !
> !   if (DECL_DECLARED_INLINE_P (edge->callee->decl))
> !     max_depth = PARAM_VALUE (PARAM_MAX_INLINE_RECURSIVE_DEPTH);
> !
> !   if (!cgraph_maybe_hot_edge_p (edge))
> !     {
> !       reason = "recursive call is cold";
> !       want_inline = false;
> !     }
> !   else if (max_count && !outer_node->count)
> !     {
> !       reason = "not executed in profile";
> !       want_inline = false;
> !     }
> !   else if (depth > max_depth)
> !     {
> !       reason = "--param max-inline-recursive-depth exceeded.";
> !       want_inline = false;
> !     }
> !
> !   if (outer_node->global.inlined_to)
> !     caller_freq = outer_node->callers->frequency;
> !
> !   if (!want_inline)
> !     ;
> !   /* Inlining of self recursive function into copy of itself within other function
> !      is transformation similar to loop peeling.
> !
> !      Peeling is profitable if we can inline enough copies to make probablility
> !      of actual call to the self recursive function very small.  Be sure that
> !      the probability of recursion is small.
> !
> !      We ensure that the frequency of recusing is at most 1 - (1/max_depth).
> !      This way the expected number of recusion is at most max_depth.  */
> !   else if (peeling)
> !     {
> !       int max_prob = CGRAPH_FREQ_BASE - ((CGRAPH_FREQ_BASE + max_depth - 1)
> !                                        / max_depth);
> !       int i;
> !       for (i = 1; i < depth; i++)
> !       max_prob = max_prob * max_prob / CGRAPH_FREQ_BASE;
> !       if (max_count
> !         && (edge->count * CGRAPH_FREQ_BASE / outer_node->count
> !             >= max_prob))
> !       {
> !         reason = "profile of recursive call is too large";
> !         want_inline = false;
> !       }
> !       if (!max_count
> !         && (edge->frequency * CGRAPH_FREQ_BASE / caller_freq
> !             >= max_prob))
> !       {
> !         reason = "frequency of recursive call is too large";
> !         want_inline = false;
> !       }
> !     }
> !   /* Recusive inlining, i.e. equivalent of unrolling, is profitable if recusion
> !      depth is large.  We reduce function call overhead and increase chances that
> !      things fit in hardware return predictor.
> !
> !      Recursive inlining might however increase cost of stack frame setup
> !      actually slowing down functions whose recursion tree is wide rather than
> !      deep.
> !
> !      Deciding reliably on when to do recursive inlining withthout profile feedback
> !      is tricky.  For now we disable recursive inlining when probability of self
> !      recursion is low.
> !
> !      Recursive inlining of self recursive call within loop also results in large loop
> !      depths that generally optimize badly.  We may want to throttle down inlining
> !      in those cases.  In particular this seems to happen in one of libstdc++ rb tree
> !      methods.  */
> !   else
> !     {
> !       if (max_count
> !         && (edge->count * 100 / outer_node->count
> !             <= PARAM_VALUE (PARAM_MIN_INLINE_RECURSIVE_PROBABILITY)))
> !       {
> !         reason = "profile of recursive call is too small";
> !         want_inline = false;
> !       }
> !       else if (!max_count
> !              && (edge->frequency * 100 / caller_freq
> !                  <= PARAM_VALUE (PARAM_MIN_INLINE_RECURSIVE_PROBABILITY)))
> !       {
> !         reason = "frequency of recursive call is too small";
> !         want_inline = false;
> !       }
> !     }
> !   if (!want_inline && dump_file)
> !     fprintf (dump_file, "   not inlining recursively: %s\n", reason);
> !   return want_inline;
>  }
>
>  /* A cost model driving the inlining heuristics in a way so the edges with
> *************** cgraph_default_inline_p (struct cgraph_n
> *** 392,404 ****
>     of the function or function body size.  */
>
>  static int
> ! cgraph_edge_badness (struct cgraph_edge *edge, bool dump)
>  {
>    gcov_type badness;
>    int growth;
>    struct inline_summary *callee_info = inline_summary (edge->callee);
>
> !   if (callee_info->disregard_inline_limits)
>      return INT_MIN;
>
>    growth = estimate_edge_growth (edge);
> --- 715,727 ----
>     of the function or function body size.  */
>
>  static int
> ! edge_badness (struct cgraph_edge *edge, bool dump)
>  {
>    gcov_type badness;
>    int growth;
>    struct inline_summary *callee_info = inline_summary (edge->callee);
>
> !   if (DECL_DISREGARD_INLINE_LIMITS (edge->callee->decl))
>      return INT_MIN;
>
>    growth = estimate_edge_growth (edge);
> *************** cgraph_edge_badness (struct cgraph_edge
> *** 488,494 ****
>          fprintf (dump_file,
>                   "      %i: guessed profile. frequency %i, overall growth %i,"
>                   " benefit %i%%, divisor %i\n",
> !                  (int) badness, edge->frequency, growth_for_all, benefitperc, div);
>        }
>      }
>    /* When function local profile is not available or it does not give
> --- 811,818 ----
>          fprintf (dump_file,
>                   "      %i: guessed profile. frequency %i, overall growth %i,"
>                   " benefit %i%%, divisor %i\n",
> !                  (int) badness, edge->frequency, growth_for_all,
> !                  benefitperc, div);
>        }
>      }
>    /* When function local profile is not available or it does not give
> *************** cgraph_edge_badness (struct cgraph_edge
> *** 523,532 ****
>  }
>
>  /* Recompute badness of EDGE and update its key in HEAP if needed.  */
> ! static void
>  update_edge_key (fibheap_t heap, struct cgraph_edge *edge)
>  {
> !   int badness = cgraph_edge_badness (edge, false);
>    if (edge->aux)
>      {
>        fibnode_t n = (fibnode_t) edge->aux;
> --- 847,856 ----
>  }
>
>  /* Recompute badness of EDGE and update its key in HEAP if needed.  */
> ! static inline void
>  update_edge_key (fibheap_t heap, struct cgraph_edge *edge)
>  {
> !   int badness = edge_badness (edge, false);
>    if (edge->aux)
>      {
>        fibnode_t n = (fibnode_t) edge->aux;
> *************** update_edge_key (fibheap_t heap, struct
> *** 539,549 ****
>        if (badness < n->key)
>        {
>          fibheap_replace_key (heap, n, badness);
>          gcc_checking_assert (n->key == badness);
>        }
>      }
>    else
> !     edge->aux = fibheap_insert (heap, badness, edge);
>  }
>
>  /* Recompute heap nodes for each of caller edge.  */
> --- 863,892 ----
>        if (badness < n->key)
>        {
>          fibheap_replace_key (heap, n, badness);
> +         if (dump_file && (dump_flags & TDF_DETAILS))
> +           {
> +             fprintf (dump_file,
> +                      "  decreasing badness %s/%i -> %s/%i, %i to %i\n",
> +                      cgraph_node_name (edge->caller), edge->caller->uid,
> +                      cgraph_node_name (edge->callee), edge->callee->uid,
> +                      (int)n->key,
> +                      badness);
> +           }
>          gcc_checking_assert (n->key == badness);
>        }
>      }
>    else
> !     {
> !        if (dump_file && (dump_flags & TDF_DETAILS))
> !        {
> !          fprintf (dump_file,
> !                   "  enqueuing call %s/%i -> %s/%i, badness %i\n",
> !                   cgraph_node_name (edge->caller), edge->caller->uid,
> !                   cgraph_node_name (edge->callee), edge->callee->uid,
> !                   badness);
> !        }
> !       edge->aux = fibheap_insert (heap, badness, edge);
> !     }
>  }
>
>  /* Recompute heap nodes for each of caller edge.  */
> *************** update_caller_keys (fibheap_t heap, stru
> *** 553,559 ****
>                    bitmap updated_nodes)
>  {
>    struct cgraph_edge *edge;
> -   cgraph_inline_failed_t failed_reason;
>
>    if (!inline_summary (node)->inlinable
>        || cgraph_function_body_availability (node) <= AVAIL_OVERWRITABLE
> --- 896,901 ----
> *************** update_caller_keys (fibheap_t heap, stru
> *** 569,591 ****
>        break;
>    if (!edge)
>      return;
> !   /* Prune out edges we won't inline into anymore.  */
> !   if (!cgraph_default_inline_p (node, &failed_reason))
> !     {
> !       for (; edge; edge = edge->next_caller)
> !       if (edge->aux)
>          {
>            fibheap_delete_node (heap, (fibnode_t) edge->aux);
>            edge->aux = NULL;
> -           if (edge->inline_failed)
> -             edge->inline_failed = failed_reason;
>          }
> !       return;
> !     }
> !
> !   for (; edge; edge = edge->next_caller)
> !     if (edge->inline_failed)
> !       update_edge_key (heap, edge);
>  }
>
>  /* Recompute heap nodes for each uninlined call.
> --- 911,930 ----
>        break;
>    if (!edge)
>      return;
> !
> !   for (; edge; edge = edge->next_caller)
> !     if (edge->inline_failed)
> !       {
> !       if (can_inline_edge_p (edge, false)
> !           && want_inline_small_function_p (edge, false))
> !           update_edge_key (heap, edge);
> !       else if (edge->aux)
>          {
> +           report_inline_failed_reason (edge);
>            fibheap_delete_node (heap, (fibnode_t) edge->aux);
>            edge->aux = NULL;
>          }
> !       }
>  }
>
>  /* Recompute heap nodes for each uninlined call.
> *************** update_callee_keys (fibheap_t heap, stru
> *** 613,624 ****
>            && !bitmap_bit_p (updated_nodes, e->callee->uid))
>          {
>            inline_summary (node)->estimated_growth = INT_MIN;
> !           /* If function becomes uninlinable, we need to remove it from the heap.  */
> !           if (!cgraph_default_inline_p (e->callee, &e->inline_failed))
> !             update_caller_keys (heap, e->callee, updated_nodes);
> !           else
> !           /* Otherwise update just edge E.  */
> !             update_edge_key (heap, e);
>          }
>        if (e->next_callee)
>          e = e->next_callee;
> --- 952,958 ----
>            && !bitmap_bit_p (updated_nodes, e->callee->uid))
>          {
>            inline_summary (node)->estimated_growth = INT_MIN;
> !           update_edge_key (heap, e);
>          }
>        if (e->next_callee)
>          e = e->next_callee;
> *************** lookup_recursive_calls (struct cgraph_no
> *** 702,717 ****
>     is NULL.  */
>
>  static bool
> ! cgraph_decide_recursive_inlining (struct cgraph_edge *edge,
> !                                 VEC (cgraph_edge_p, heap) **new_edges)
>  {
>    int limit = PARAM_VALUE (PARAM_MAX_INLINE_INSNS_RECURSIVE_AUTO);
> -   int max_depth = PARAM_VALUE (PARAM_MAX_INLINE_RECURSIVE_DEPTH_AUTO);
> -   int probability = PARAM_VALUE (PARAM_MIN_INLINE_RECURSIVE_PROBABILITY);
>    fibheap_t heap;
>    struct cgraph_node *node;
>    struct cgraph_edge *e;
> !   struct cgraph_node *master_clone, *next;
>    int depth = 0;
>    int n = 0;
>
> --- 1036,1049 ----
>     is NULL.  */
>
>  static bool
> ! recursive_inlining (struct cgraph_edge *edge,
> !                   VEC (cgraph_edge_p, heap) **new_edges)
>  {
>    int limit = PARAM_VALUE (PARAM_MAX_INLINE_INSNS_RECURSIVE_AUTO);
>    fibheap_t heap;
>    struct cgraph_node *node;
>    struct cgraph_edge *e;
> !   struct cgraph_node *master_clone = NULL, *next;
>    int depth = 0;
>    int n = 0;
>
> *************** cgraph_decide_recursive_inlining (struct
> *** 719,743 ****
>    if (node->global.inlined_to)
>      node = node->global.inlined_to;
>
> -   /* It does not make sense to recursively inline always-inline functions
> -      as we are going to sorry() on the remaining calls anyway.  */
> -   if (inline_summary (node)->disregard_inline_limits
> -       && lookup_attribute ("always_inline", DECL_ATTRIBUTES (node->decl)))
> -     return false;
> -
> -   if (optimize_function_for_size_p (DECL_STRUCT_FUNCTION (node->decl))
> -       || (!flag_inline_functions && !DECL_DECLARED_INLINE_P (node->decl)))
> -     return false;
> -
>    if (DECL_DECLARED_INLINE_P (node->decl))
> !     {
> !       limit = PARAM_VALUE (PARAM_MAX_INLINE_INSNS_RECURSIVE);
> !       max_depth = PARAM_VALUE (PARAM_MAX_INLINE_RECURSIVE_DEPTH);
> !     }
>
>    /* Make sure that function is small enough to be considered for inlining.  */
> !   if (!max_depth
> !       || estimate_size_after_inlining (node, edge)  >= limit)
>      return false;
>    heap = fibheap_new ();
>    lookup_recursive_calls (node, node, heap);
> --- 1051,1061 ----
>    if (node->global.inlined_to)
>      node = node->global.inlined_to;
>
>    if (DECL_DECLARED_INLINE_P (node->decl))
> !     limit = PARAM_VALUE (PARAM_MAX_INLINE_INSNS_RECURSIVE);
>
>    /* Make sure that function is small enough to be considered for inlining.  */
> !   if (estimate_size_after_inlining (node, edge)  >= limit)
>      return false;
>    heap = fibheap_new ();
>    lookup_recursive_calls (node, node, heap);
> *************** cgraph_decide_recursive_inlining (struct
> *** 752,765 ****
>             "  Performing recursive inlining on %s\n",
>             cgraph_node_name (node));
>
> -   /* We need original clone to copy around.  */
> -   master_clone = cgraph_clone_node (node, node->decl,
> -                                   node->count, CGRAPH_FREQ_BASE, 1,
> -                                   false, NULL);
> -   for (e = master_clone->callees; e; e = e->next_callee)
> -     if (!e->inline_failed)
> -       cgraph_clone_inlined_nodes (e, true, false);
> -
>    /* Do the inlining and update list of recursive call during process.  */
>    while (!fibheap_empty (heap))
>      {
> --- 1070,1075 ----
> *************** cgraph_decide_recursive_inlining (struct
> *** 770,804 ****
>        if (estimate_size_after_inlining (node, curr) > limit)
>        break;
>
>        depth = 1;
>        for (cnode = curr->caller;
>           cnode->global.inlined_to; cnode = cnode->callers->caller)
>        if (node->decl == curr->callee->decl)
>          depth++;
> -       if (depth > max_depth)
> -       {
> -           if (dump_file)
> -           fprintf (dump_file,
> -                    "   maximal depth reached\n");
> -         continue;
> -       }
>
> !       if (max_count)
> !       {
> !           if (!cgraph_maybe_hot_edge_p (curr))
> !           {
> !             if (dump_file)
> !               fprintf (dump_file, "   Not inlining cold call\n");
> !             continue;
> !           }
> !           if (curr->count * 100 / node->count < probability)
> !           {
> !             if (dump_file)
> !               fprintf (dump_file,
> !                        "   Probability of edge is too small\n");
> !             continue;
> !           }
> !       }
>
>        if (dump_file)
>        {
> --- 1080,1096 ----
>        if (estimate_size_after_inlining (node, curr) > limit)
>        break;
>
> +       if (!can_inline_edge_p (curr, true))
> +       continue;
> +
>        depth = 1;
>        for (cnode = curr->caller;
>           cnode->global.inlined_to; cnode = cnode->callers->caller)
>        if (node->decl == curr->callee->decl)
>          depth++;
>
> !       if (!want_inline_self_recursive_call_p (curr, node, false, depth))
> !       continue;
>
>        if (dump_file)
>        {
> *************** cgraph_decide_recursive_inlining (struct
> *** 811,828 ****
>            }
>          fprintf (dump_file, "\n");
>        }
>        cgraph_redirect_edge_callee (curr, master_clone);
>        cgraph_mark_inline_edge (curr, false, new_edges);
>        lookup_recursive_calls (node, curr->callee, heap);
>        n++;
>      }
>    if (!fibheap_empty (heap) && dump_file)
>      fprintf (dump_file, "    Recursive inlining growth limit met.\n");
> -
>    fibheap_delete (heap);
>    if (dump_file)
>      fprintf (dump_file,
> !            "\n   Inlined %i times, body grown from size %i to %i, time %i to %i\n", n,
>             inline_summary (master_clone)->size, inline_summary (node)->size,
>             inline_summary (master_clone)->time, inline_summary (node)->time);
>
> --- 1103,1136 ----
>            }
>          fprintf (dump_file, "\n");
>        }
> +       if (!master_clone)
> +       {
> +         /* We need original clone to copy around.  */
> +         master_clone = cgraph_clone_node (node, node->decl,
> +                                           node->count, CGRAPH_FREQ_BASE, 1,
> +                                           false, NULL);
> +         for (e = master_clone->callees; e; e = e->next_callee)
> +           if (!e->inline_failed)
> +             cgraph_clone_inlined_nodes (e, true, false);
> +       }
> +
>        cgraph_redirect_edge_callee (curr, master_clone);
>        cgraph_mark_inline_edge (curr, false, new_edges);
>        lookup_recursive_calls (node, curr->callee, heap);
>        n++;
>      }
> +
>    if (!fibheap_empty (heap) && dump_file)
>      fprintf (dump_file, "    Recursive inlining growth limit met.\n");
>    fibheap_delete (heap);
> +
> +   if (!master_clone)
> +     return false;
> +
>    if (dump_file)
>      fprintf (dump_file,
> !            "\n   Inlined %i times, "
> !            "body grown from size %i to %i, time %i to %i\n", n,
>             inline_summary (master_clone)->size, inline_summary (node)->size,
>             inline_summary (master_clone)->time, inline_summary (node)->time);
>
> *************** cgraph_decide_recursive_inlining (struct
> *** 837,863 ****
>        cgraph_remove_node (node);
>      }
>    cgraph_remove_node (master_clone);
> !   /* FIXME: Recursive inlining actually reduces number of calls of the
> !      function.  At this place we should probably walk the function and
> !      inline clones and compensate the counts accordingly.  This probably
> !      doesn't matter much in practice.  */
> !   return n > 0;
> ! }
> !
> ! /* Set inline_failed for all callers of given function to REASON.  */
> !
> ! static void
> ! cgraph_set_inline_failed (struct cgraph_node *node,
> !                         cgraph_inline_failed_t reason)
> ! {
> !   struct cgraph_edge *e;
> !
> !   if (dump_file)
> !     fprintf (dump_file, "Inlining failed: %s\n",
> !            cgraph_inline_failed_string (reason));
> !   for (e = node->callers; e; e = e->next_caller)
> !     if (e->inline_failed)
> !       e->inline_failed = reason;
>  }
>
>  /* Given whole compilation unit estimate of INSNS, compute how large we can
> --- 1145,1151 ----
>        cgraph_remove_node (node);
>      }
>    cgraph_remove_node (master_clone);
> !   return true;
>  }
>
>  /* Given whole compilation unit estimate of INSNS, compute how large we can
> *************** add_new_edges_to_heap (fibheap_t heap, V
> *** 884,891 ****
>        gcc_assert (!edge->aux);
>        if (inline_summary (edge->callee)->inlinable
>          && edge->inline_failed
> !         && cgraph_default_inline_p (edge->callee, &edge->inline_failed))
> !         edge->aux = fibheap_insert (heap, cgraph_edge_badness (edge, false), edge);
>      }
>  }
>
> --- 1172,1180 ----
>        gcc_assert (!edge->aux);
>        if (inline_summary (edge->callee)->inlinable
>          && edge->inline_failed
> !         && can_inline_edge_p (edge, true)
> !         && want_inline_small_function_p (edge, true))
> !         edge->aux = fibheap_insert (heap, edge_badness (edge, false), edge);
>      }
>  }
>
> *************** add_new_edges_to_heap (fibheap_t heap, V
> *** 898,908 ****
>     to be passed to cgraph_inlined_into and cgraph_inlined_callees.  */
>
>  static void
> ! cgraph_decide_inlining_of_small_functions (void)
>  {
>    struct cgraph_node *node;
>    struct cgraph_edge *edge;
> -   cgraph_inline_failed_t failed_reason;
>    fibheap_t heap = fibheap_new ();
>    bitmap updated_nodes = BITMAP_ALLOC (NULL);
>    int min_size, max_size;
> --- 1187,1196 ----
>     to be passed to cgraph_inlined_into and cgraph_inlined_callees.  */
>
>  static void
> ! inline_small_functions (void)
>  {
>    struct cgraph_node *node;
>    struct cgraph_edge *edge;
>    fibheap_t heap = fibheap_new ();
>    bitmap updated_nodes = BITMAP_ALLOC (NULL);
>    int min_size, max_size;
> *************** cgraph_decide_inlining_of_small_function
> *** 921,951 ****
>        {
>        struct inline_summary *info = inline_summary (node);
>
> -       if (!info->inlinable || !node->callers)
> -         {
> -           struct cgraph_edge *e;
> -           for (e = node->callers; e; e = e->next_caller)
> -             {
> -               gcc_assert (e->inline_failed);
> -               e->inline_failed = CIF_FUNCTION_NOT_INLINABLE;
> -             }
> -           continue;
> -         }
>        if (dump_file)
> !         fprintf (dump_file, "Considering inline candidate %s.\n", cgraph_node_name (node));
>
>        info->estimated_growth = INT_MIN;
> -       if (!cgraph_default_inline_p (node, &failed_reason))
> -         {
> -           cgraph_set_inline_failed (node, failed_reason);
> -           continue;
> -         }
>
>        for (edge = node->callers; edge; edge = edge->next_caller)
> !         if (edge->inline_failed)
>            {
>              gcc_assert (!edge->aux);
> !             edge->aux = fibheap_insert (heap, cgraph_edge_badness (edge, false), edge);
>            }
>        }
>
> --- 1209,1228 ----
>        {
>        struct inline_summary *info = inline_summary (node);
>
>        if (dump_file)
> !         fprintf (dump_file, "Enqueueing calls of %s/%i.\n",
> !                  cgraph_node_name (node), node->uid);
>
>        info->estimated_growth = INT_MIN;
>
>        for (edge = node->callers; edge; edge = edge->next_caller)
> !         if (edge->inline_failed
> !             && can_inline_edge_p (edge, true)
> !             && want_inline_small_function_p (edge, true)
> !             && edge->inline_failed)
>            {
>              gcc_assert (!edge->aux);
> !             update_edge_key (heap, edge);
>            }
>        }
>
> *************** cgraph_decide_inlining_of_small_function
> *** 960,966 ****
>        int badness = fibheap_min_key (heap);
>        int current_badness;
>        int growth;
> -       cgraph_inline_failed_t not_good = CIF_OK;
>
>        edge = (struct cgraph_edge *) fibheap_extract_min (heap);
>        gcc_assert (edge->aux);
> --- 1237,1242 ----
> *************** cgraph_decide_inlining_of_small_function
> *** 971,983 ****
>        /* When updating the edge costs, we only decrease badness in the keys.
>         When the badness increase, we keep the heap as it is and re-insert
>         key now.  */
> !       current_badness = cgraph_edge_badness (edge, false);
>        gcc_assert (current_badness >= badness);
>        if (current_badness != badness)
>        {
>          edge->aux = fibheap_insert (heap, current_badness, edge);
>          continue;
>        }
>
>        callee = edge->callee;
>        growth = estimate_edge_growth (edge);
> --- 1247,1262 ----
>        /* When updating the edge costs, we only decrease badness in the keys.
>         When the badness increase, we keep the heap as it is and re-insert
>         key now.  */
> !       current_badness = edge_badness (edge, false);
>        gcc_assert (current_badness >= badness);
>        if (current_badness != badness)
>        {
>          edge->aux = fibheap_insert (heap, current_badness, edge);
>          continue;
>        }
> +
> +       if (!can_inline_edge_p (edge, true))
> +       continue;
>
>        callee = edge->callee;
>        growth = estimate_edge_growth (edge);
> *************** cgraph_decide_inlining_of_small_function
> *** 989,1069 ****
>                   inline_summary (edge->callee)->size);
>          fprintf (dump_file,
>                   " to be inlined into %s in %s:%i\n"
> !                  " Estimated growth after inlined into all callees is %+i insns.\n"
>                   " Estimated badness is %i, frequency %.2f.\n",
>                   cgraph_node_name (edge->caller),
>                   flag_wpa ? "unknown"
>                   : gimple_filename ((const_gimple) edge->call_stmt),
> !                  flag_wpa ? -1 : gimple_lineno ((const_gimple) edge->call_stmt),
>                   estimate_growth (edge->callee),
>                   badness,
>                   edge->frequency / (double)CGRAPH_FREQ_BASE);
>          if (edge->count)
> !           fprintf (dump_file," Called "HOST_WIDEST_INT_PRINT_DEC"x\n", edge->count);
>          if (dump_flags & TDF_DETAILS)
> !           cgraph_edge_badness (edge, true);
>        }
>
> !       /* When not having profile info ready we don't weight by any way the
> !          position of call in procedure itself.  This means if call of
> !        function A from function B seems profitable to inline, the recursive
> !        call of function A in inline copy of A in B will look profitable too
> !        and we end up inlining until reaching maximal function growth.  This
> !        is not good idea so prohibit the recursive inlining.
> !
> !        ??? When the frequencies are taken into account we might not need this
> !        restriction.
> !
> !        We need to be careful here, in some testcases, e.g. directives.c in
> !        libcpp, we can estimate self recursive function to have negative growth
> !        for inlining completely.
> !        */
> !       if (!edge->count)
>        {
> !         where = edge->caller;
> !         while (where->global.inlined_to)
> !           {
> !             if (where->decl == edge->callee->decl)
> !               break;
> !             where = where->callers->caller;
> !           }
> !         if (where->global.inlined_to)
> !           {
> !             edge->inline_failed
> !               = (inline_summary (edge->callee)->disregard_inline_limits
> !                  ? CIF_RECURSIVE_INLINING : CIF_UNSPECIFIED);
> !             if (dump_file)
> !               fprintf (dump_file, " inline_failed:Recursive inlining performed only for function itself.\n");
> !             continue;
> !           }
> !       }
> !
> !       if (inline_summary (edge->callee)->disregard_inline_limits)
> !       ;
> !       else if (!cgraph_maybe_hot_edge_p (edge))
> !       not_good = CIF_UNLIKELY_CALL;
> !       else if (!flag_inline_functions
> !         && !DECL_DECLARED_INLINE_P (edge->callee->decl))
> !       not_good = CIF_NOT_DECLARED_INLINED;
> !       else if (optimize_function_for_size_p (DECL_STRUCT_FUNCTION(edge->caller->decl)))
> !       not_good = CIF_OPTIMIZING_FOR_SIZE;
> !       if (not_good && growth > 0 && estimate_growth (edge->callee) > 0)
> !       {
> !         edge->inline_failed = not_good;
> !         if (dump_file)
> !           fprintf (dump_file, " inline_failed:%s.\n",
> !                    cgraph_inline_failed_string (edge->inline_failed));
>          continue;
>        }
> !       if (!cgraph_default_inline_p (edge->callee, &edge->inline_failed))
> !       {
> !         if (dump_file)
> !           fprintf (dump_file, " inline_failed:%s.\n",
> !                    cgraph_inline_failed_string (edge->inline_failed));
> !         continue;
> !       }
> !       if (!tree_can_inline_p (edge)
> !         || edge->call_stmt_cannot_inline_p)
>        {
>          if (dump_file)
>            fprintf (dump_file, " inline_failed:%s.\n",
> --- 1268,1299 ----
>                   inline_summary (edge->callee)->size);
>          fprintf (dump_file,
>                   " to be inlined into %s in %s:%i\n"
> !                  " Estimated growth after inlined into all is %+i insns.\n"
>                   " Estimated badness is %i, frequency %.2f.\n",
>                   cgraph_node_name (edge->caller),
>                   flag_wpa ? "unknown"
>                   : gimple_filename ((const_gimple) edge->call_stmt),
> !                  flag_wpa ? -1
> !                  : gimple_lineno ((const_gimple) edge->call_stmt),
>                   estimate_growth (edge->callee),
>                   badness,
>                   edge->frequency / (double)CGRAPH_FREQ_BASE);
>          if (edge->count)
> !           fprintf (dump_file," Called "HOST_WIDEST_INT_PRINT_DEC"x\n",
> !                    edge->count);
>          if (dump_flags & TDF_DETAILS)
> !           edge_badness (edge, true);
>        }
>
> !       if (overall_size + growth > max_size
> !         && !DECL_DISREGARD_INLINE_LIMITS (edge->callee->decl))
>        {
> !         edge->inline_failed = CIF_INLINE_UNIT_GROWTH_LIMIT;
> !         report_inline_failed_reason (edge);
>          continue;
>        }
> !
> !       if (!want_inline_small_function_p (edge, true))
>        {
>          if (dump_file)
>            fprintf (dump_file, " inline_failed:%s.\n",
> *************** cgraph_decide_inlining_of_small_function
> *** 1075,1083 ****
>          where = edge->caller;
>          if (where->global.inlined_to)
>            where = where->global.inlined_to;
> !         if (!cgraph_decide_recursive_inlining (edge,
> !                                                flag_indirect_inlining
> !                                                ? &new_indirect_edges : NULL))
>            {
>              edge->inline_failed = CIF_RECURSIVE_INLINING;
>              continue;
> --- 1305,1313 ----
>          where = edge->caller;
>          if (where->global.inlined_to)
>            where = where->global.inlined_to;
> !         if (!recursive_inlining (edge,
> !                                  flag_indirect_inlining
> !                                  ? &new_indirect_edges : NULL))
>            {
>              edge->inline_failed = CIF_RECURSIVE_INLINING;
>              continue;
> *************** cgraph_decide_inlining_of_small_function
> *** 1089,1102 ****
>        else
>        {
>          struct cgraph_node *callee;
> !         if (!cgraph_check_inline_limits (edge, &edge->inline_failed))
>            {
> !             if (dump_file)
> !               fprintf (dump_file, " Not inlining into %s:%s.\n",
> !                        cgraph_node_name (edge->caller),
> !                        cgraph_inline_failed_string (edge->inline_failed));
>              continue;
>            }
>          callee = edge->callee;
>          gcc_checking_assert (!callee->global.inlined_to);
>          cgraph_mark_inline_edge (edge, true, &new_indirect_edges);
> --- 1319,1351 ----
>        else
>        {
>          struct cgraph_node *callee;
> !         struct cgraph_node *outer_node = NULL;
> !         int depth = 0;
> !
> !         /* Consider the case where self recursive function A is inlined into B.
> !            This is desired optimization in some cases, since it leads to effect
> !            similar of loop peeling and we might completely optimize out the
> !            recursive call.  However we must be extra selective.  */
> !
> !         where = edge->caller;
> !         while (where->global.inlined_to)
>            {
> !             if (where->decl == edge->callee->decl)
> !               outer_node = where, depth++;
> !             where = where->callers->caller;
> !           }
> !         if (outer_node
> !             && !want_inline_self_recursive_call_p (edge, outer_node,
> !                                                    true, depth))
> !           {
> !             edge->inline_failed
> !               = (DECL_DISREGARD_INLINE_LIMITS (edge->callee->decl)
> !                  ? CIF_RECURSIVE_INLINING : CIF_UNSPECIFIED);
>              continue;
>            }
> +         else if (depth && dump_file)
> +           fprintf (dump_file, " Peeling recursion with depth %i\n", depth);
> +
>          callee = edge->callee;
>          gcc_checking_assert (!callee->global.inlined_to);
>          cgraph_mark_inline_edge (edge, true, &new_indirect_edges);
> *************** cgraph_decide_inlining_of_small_function
> *** 1148,1190 ****
>            fprintf (dump_file, "New minimal size reached: %i\n", min_size);
>        }
>      }
> -   while (!fibheap_empty (heap))
> -     {
> -       int badness = fibheap_min_key (heap);
> -
> -       edge = (struct cgraph_edge *) fibheap_extract_min (heap);
> -       gcc_assert (edge->aux);
> -       edge->aux = NULL;
> -       if (!edge->inline_failed)
> -       continue;
> - #ifdef ENABLE_CHECKING
> -       gcc_assert (cgraph_edge_badness (edge, false) >= badness);
> - #endif
> -       if (dump_file)
> -       {
> -         fprintf (dump_file,
> -                  "\nSkipping %s with %i size\n",
> -                  cgraph_node_name (edge->callee),
> -                  inline_summary (edge->callee)->size);
> -         fprintf (dump_file,
> -                  " called by %s in %s:%i\n"
> -                  " Estimated growth after inlined into all callees is %+i insns.\n"
> -                  " Estimated badness is %i, frequency %.2f.\n",
> -                  cgraph_node_name (edge->caller),
> -                  flag_wpa ? "unknown"
> -                  : gimple_filename ((const_gimple) edge->call_stmt),
> -                  flag_wpa ? -1 : gimple_lineno ((const_gimple) edge->call_stmt),
> -                  estimate_growth (edge->callee),
> -                  badness,
> -                  edge->frequency / (double)CGRAPH_FREQ_BASE);
> -         if (edge->count)
> -           fprintf (dump_file," Called "HOST_WIDEST_INT_PRINT_DEC"x\n", edge->count);
> -         if (dump_flags & TDF_DETAILS)
> -           cgraph_edge_badness (edge, true);
> -       }
> -       if (!inline_summary (edge->callee)->disregard_inline_limits && edge->inline_failed)
> -       edge->inline_failed = CIF_INLINE_UNIT_GROWTH_LIMIT;
> -     }
>
>    if (new_indirect_edges)
>      VEC_free (cgraph_edge_p, heap, new_indirect_edges);
> --- 1397,1402 ----
> *************** cgraph_decide_inlining_of_small_function
> *** 1195,1201 ****
>  /* Flatten NODE from the IPA inliner.  */
>
>  static void
> ! cgraph_flatten (struct cgraph_node *node)
>  {
>    struct cgraph_edge *e;
>
> --- 1407,1413 ----
>  /* Flatten NODE from the IPA inliner.  */
>
>  static void
> ! flatten_function (struct cgraph_node *node)
>  {
>    struct cgraph_edge *e;
>
> *************** cgraph_flatten (struct cgraph_node *node
> *** 1208,1229 ****
>      {
>        struct cgraph_node *orig_callee;
>
> -       if (e->call_stmt_cannot_inline_p)
> -       {
> -         if (dump_file)
> -           fprintf (dump_file, "Not inlining: %s",
> -                    cgraph_inline_failed_string (e->inline_failed));
> -         continue;
> -       }
> -
> -       if (!e->callee->analyzed)
> -       {
> -         if (dump_file)
> -           fprintf (dump_file,
> -                    "Not inlining: Function body not available.\n");
> -         continue;
> -       }
> -
>        /* We've hit cycle?  It is time to give up.  */
>        if (e->callee->aux)
>        {
> --- 1420,1425 ----
> *************** cgraph_flatten (struct cgraph_node *node
> *** 1240,1249 ****
>         it in order to fully flatten the leaves.  */
>        if (!e->inline_failed)
>        {
> !         cgraph_flatten (e->callee);
>          continue;
>        }
>
>        if (cgraph_edge_recursive_p (e))
>        {
>          if (dump_file)
> --- 1436,1453 ----
>         it in order to fully flatten the leaves.  */
>        if (!e->inline_failed)
>        {
> !         flatten_function (e->callee);
>          continue;
>        }
>
> +       /* Flatten attribute needs to be processed during late inlining. For
> +        extra code quality we however do flattening during early optimization,
> +        too.  */
> +       if (cgraph_state != CGRAPH_STATE_IPA_SSA
> +         ? !can_inline_edge_p (e, true)
> +         : !can_early_inline_edge_p (e))
> +       continue;
> +
>        if (cgraph_edge_recursive_p (e))
>        {
>          if (dump_file)
> *************** cgraph_flatten (struct cgraph_node *node
> *** 1251,1264 ****
>          continue;
>        }
>
> -       if (!tree_can_inline_p (e))
> -       {
> -         if (dump_file)
> -           fprintf (dump_file, "Not inlining: %s",
> -                    cgraph_inline_failed_string (e->inline_failed));
> -         continue;
> -       }
> -
>        if (gimple_in_ssa_p (DECL_STRUCT_FUNCTION (node->decl))
>          != gimple_in_ssa_p (DECL_STRUCT_FUNCTION (e->callee->decl)))
>        {
> --- 1455,1460 ----
> *************** cgraph_flatten (struct cgraph_node *node
> *** 1277,1283 ****
>        cgraph_mark_inline_edge (e, true, NULL);
>        if (e->callee != orig_callee)
>        orig_callee->aux = (void *) node;
> !       cgraph_flatten (e->callee);
>        if (e->callee != orig_callee)
>        orig_callee->aux = NULL;
>      }
> --- 1473,1479 ----
>        cgraph_mark_inline_edge (e, true, NULL);
>        if (e->callee != orig_callee)
>        orig_callee->aux = (void *) node;
> !       flatten_function (e->callee);
>        if (e->callee != orig_callee)
>        orig_callee->aux = NULL;
>      }
> *************** cgraph_flatten (struct cgraph_node *node
> *** 1289,1295 ****
>     expenses on updating data structures.  */
>
>  static unsigned int
> ! cgraph_decide_inlining (void)
>  {
>    struct cgraph_node *node;
>    int nnodes;
> --- 1485,1491 ----
>     expenses on updating data structures.  */
>
>  static unsigned int
> ! ipa_inline (void)
>  {
>    struct cgraph_node *node;
>    int nnodes;
> *************** cgraph_decide_inlining (void)
> *** 1366,1406 ****
>          if (dump_file)
>            fprintf (dump_file,
>                     "Flattening %s\n", cgraph_node_name (node));
> !         cgraph_flatten (node);
>        }
>      }
>
> !   cgraph_decide_inlining_of_small_functions ();
>
>    if (flag_inline_functions_called_once)
>      {
>        if (dump_file)
>        fprintf (dump_file, "\nDeciding on functions called once:\n");
>
>        /* And finally decide what functions are called once.  */
> !       for (i = nnodes - 1; i >= 0; i--)
>        {
> -         node = order[i];
> -
>          if (node->callers
>              && !node->callers->next_caller
>              && !node->global.inlined_to
> -             && cgraph_will_be_removed_from_program_if_no_direct_calls (node)
> -             && inline_summary (node)->inlinable
> -             && cgraph_function_body_availability (node) >= AVAIL_AVAILABLE
>              && node->callers->inline_failed
>              && node->callers->caller != node
>              && node->callers->caller->global.inlined_to != node
> !             && !node->callers->call_stmt_cannot_inline_p
> !             && tree_can_inline_p (node->callers)
> !             && !DECL_EXTERNAL (node->decl))
>            {
> !             cgraph_inline_failed_t reason;
>              old_size = overall_size;
>              if (dump_file)
>                {
>                  fprintf (dump_file,
> !                          "\nConsidering %s size %i.\n",
>                           cgraph_node_name (node), inline_summary (node)->size);
>                  fprintf (dump_file,
>                           " Called once from %s %i insns.\n",
> --- 1562,1605 ----
>          if (dump_file)
>            fprintf (dump_file,
>                     "Flattening %s\n", cgraph_node_name (node));
> !         flatten_function (node);
>        }
>      }
>
> !   inline_small_functions ();
> !   cgraph_remove_unreachable_nodes (true, dump_file);
> !   free (order);
>
> +   /* We already perform some inlining of functions called once during
> +      inlining small functions above.  After unreachable nodes are removed,
> +      we still might do a quick check that nothing new is found.  */
>    if (flag_inline_functions_called_once)
>      {
>        if (dump_file)
>        fprintf (dump_file, "\nDeciding on functions called once:\n");
>
>        /* And finally decide what functions are called once.  */
> !       for (node = cgraph_nodes; node; node = node->next)
>        {
>          if (node->callers
>              && !node->callers->next_caller
>              && !node->global.inlined_to
>              && node->callers->inline_failed
>              && node->callers->caller != node
>              && node->callers->caller->global.inlined_to != node
> !             && cgraph_will_be_removed_from_program_if_no_direct_calls (node)
> !             && inline_summary (node)->inlinable
> !             && cgraph_function_body_availability (node) >= AVAIL_AVAILABLE
> !             && !DECL_EXTERNAL (node->decl)
> !             && can_inline_edge_p (node->callers, true))
>            {
> !             struct cgraph_node *caller = node->callers->caller;
> !
>              old_size = overall_size;
>              if (dump_file)
>                {
>                  fprintf (dump_file,
> !                          "\nInlining %s size %i.\n",
>                           cgraph_node_name (node), inline_summary (node)->size);
>                  fprintf (dump_file,
>                           " Called once from %s %i insns.\n",
> *************** cgraph_decide_inlining (void)
> *** 1408,1432 ****
>                           inline_summary (node->callers->caller)->size);
>                }
>
> !             if (cgraph_check_inline_limits (node->callers, &reason))
> !               {
> !                 struct cgraph_node *caller = node->callers->caller;
> !                 cgraph_mark_inline_edge (node->callers, true, NULL);
> !                 if (dump_file)
> !                   fprintf (dump_file,
> !                            " Inlined into %s which now has %i size"
> !                            " for a net change of %+i size.\n",
> !                            cgraph_node_name (caller),
> !                            inline_summary (caller)->size,
> !                            overall_size - old_size);
> !               }
> !             else
> !               {
> !                 if (dump_file)
> !                   fprintf (dump_file,
> !                            " Not inlining: %s.\n",
> !                              cgraph_inline_failed_string (reason));
> !               }
>            }
>        }
>      }
> --- 1607,1620 ----
>                           inline_summary (node->callers->caller)->size);
>                }
>
> !             cgraph_mark_inline_edge (node->callers, true, NULL);
> !             if (dump_file)
> !               fprintf (dump_file,
> !                        " Inlined into %s which now has %i size"
> !                        " for a net change of %+i size.\n",
> !                        cgraph_node_name (caller),
> !                        inline_summary (caller)->size,
> !                        overall_size - old_size);
>            }
>        }
>      }
> *************** cgraph_decide_inlining (void)
> *** 1441,1532 ****
>             "size %i turned to %i size.\n\n",
>             ncalls_inlined, nfunctions_inlined, initial_size,
>             overall_size);
> -   free (order);
>    /* In WPA we use inline summaries for partitioning process.  */
>    if (!flag_wpa)
>      inline_free_summary ();
>    return 0;
>  }
>
> - /* Return true when N is leaf function.  Accept cheap builtins
> -    in leaf functions.  */
> -
> - static bool
> - leaf_node_p (struct cgraph_node *n)
> - {
> -   struct cgraph_edge *e;
> -   for (e = n->callees; e; e = e->next_callee)
> -     if (!is_inexpensive_builtin (e->callee->decl))
> -       return false;
> -   return true;
> - }
> -
> - /* Return true if the edge E is inlinable during early inlining.  */
> -
> - static bool
> - cgraph_edge_early_inlinable_p (struct cgraph_edge *e, FILE *file)
> - {
> -   if (!inline_summary (e->callee)->inlinable)
> -     {
> -       if (file)
> -       fprintf (file, "Not inlining: Function not inlinable.\n");
> -       return false;
> -     }
> -   if (!e->callee->analyzed)
> -     {
> -       if (file)
> -       fprintf (file, "Not inlining: Function body not available.\n");
> -       return false;
> -     }
> -   if (!tree_can_inline_p (e)
> -       || e->call_stmt_cannot_inline_p)
> -     {
> -       if (file)
> -       fprintf (file, "Not inlining: %s.\n",
> -                cgraph_inline_failed_string (e->inline_failed));
> -       return false;
> -     }
> -   if (!gimple_in_ssa_p (DECL_STRUCT_FUNCTION (e->caller->decl))
> -       || !gimple_in_ssa_p (DECL_STRUCT_FUNCTION (e->callee->decl)))
> -     {
> -       if (file)
> -       fprintf (file, "Not inlining: not in SSA form.\n");
> -       return false;
> -     }
> -   return true;
> - }
> -
>  /* Inline always-inline function calls in NODE.  */
>
>  static bool
> ! cgraph_perform_always_inlining (struct cgraph_node *node)
>  {
>    struct cgraph_edge *e;
>    bool inlined = false;
>
>    for (e = node->callees; e; e = e->next_callee)
>      {
> !       if (!inline_summary (e->callee)->disregard_inline_limits)
>        continue;
>
> -       if (dump_file)
> -       fprintf (dump_file,
> -                "Considering always-inline candidate %s.\n",
> -                cgraph_node_name (e->callee));
> -
>        if (cgraph_edge_recursive_p (e))
>        {
>          if (dump_file)
> !           fprintf (dump_file, "Not inlining: recursive call.\n");
>          e->inline_failed = CIF_RECURSIVE_INLINING;
>          continue;
>        }
>
> !       if (!cgraph_edge_early_inlinable_p (e, dump_file))
>        continue;
>
>        if (dump_file)
> !       fprintf (dump_file, " Inlining %s into %s.\n",
>                 cgraph_node_name (e->callee),
>                 cgraph_node_name (e->caller));
>        cgraph_mark_inline_edge (e, true, NULL);
> --- 1629,1667 ----
>             "size %i turned to %i size.\n\n",
>             ncalls_inlined, nfunctions_inlined, initial_size,
>             overall_size);
>    /* In WPA we use inline summaries for partitioning process.  */
>    if (!flag_wpa)
>      inline_free_summary ();
>    return 0;
>  }
>
>  /* Inline always-inline function calls in NODE.  */
>
>  static bool
> ! inline_always_inline_functions (struct cgraph_node *node)
>  {
>    struct cgraph_edge *e;
>    bool inlined = false;
>
>    for (e = node->callees; e; e = e->next_callee)
>      {
> !       if (!DECL_DISREGARD_INLINE_LIMITS (e->callee->decl))
>        continue;
>
>        if (cgraph_edge_recursive_p (e))
>        {
>          if (dump_file)
> !           fprintf (dump_file, "  Not inlining recursive call to %s.\n",
> !                    cgraph_node_name (e->callee));
>          e->inline_failed = CIF_RECURSIVE_INLINING;
>          continue;
>        }
>
> !       if (!can_early_inline_edge_p (e))
>        continue;
>
>        if (dump_file)
> !       fprintf (dump_file, "  Inlining %s into %s (always_inline).\n",
>                 cgraph_node_name (e->callee),
>                 cgraph_node_name (e->caller));
>        cgraph_mark_inline_edge (e, true, NULL);
> *************** cgraph_perform_always_inlining (struct c
> *** 1540,1563 ****
>     expenses on updating data structures.  */
>
>  static bool
> ! cgraph_decide_inlining_incrementally (struct cgraph_node *node)
>  {
>    struct cgraph_edge *e;
>    bool inlined = false;
> -   cgraph_inline_failed_t failed_reason;
> -
> -   /* Never inline regular functions into always-inline functions
> -      during incremental inlining.  */
> -   if (inline_summary (node)->disregard_inline_limits)
> -     return false;
>
>    for (e = node->callees; e; e = e->next_callee)
>      {
> -       int allowed_growth = 0;
> -
>        if (!inline_summary (e->callee)->inlinable
> !         || !e->inline_failed
> !         || inline_summary (e->callee)->disregard_inline_limits)
>        continue;
>
>        /* Do not consider functions not declared inline.  */
> --- 1675,1689 ----
>     expenses on updating data structures.  */
>
>  static bool
> ! early_inline_small_functions (struct cgraph_node *node)
>  {
>    struct cgraph_edge *e;
>    bool inlined = false;
>
>    for (e = node->callees; e; e = e->next_callee)
>      {
>        if (!inline_summary (e->callee)->inlinable
> !         || !e->inline_failed)
>        continue;
>
>        /* Do not consider functions not declared inline.  */
> *************** cgraph_decide_inlining_incrementally (st
> *** 1570,1616 ****
>        fprintf (dump_file, "Considering inline candidate %s.\n",
>                 cgraph_node_name (e->callee));
>
>        if (cgraph_edge_recursive_p (e))
>        {
>          if (dump_file)
> !           fprintf (dump_file, "Not inlining: recursive call.\n");
>          continue;
>        }
>
> !       if (!cgraph_edge_early_inlinable_p (e, dump_file))
>        continue;
>
> !       if (cgraph_maybe_hot_edge_p (e) && leaf_node_p (e->callee)
> !         && optimize_function_for_speed_p (cfun))
> !       allowed_growth = PARAM_VALUE (PARAM_EARLY_INLINING_INSNS);
> !
> !       /* When the function body would grow and inlining the function
> !        won't eliminate the need for offline copy of the function,
> !        don't inline.  */
> !       if (estimate_edge_growth (e) > allowed_growth)
> !       {
> !         if (dump_file)
> !           fprintf (dump_file,
> !                    "Not inlining: code size would grow by %i.\n",
> !                    estimate_edge_growth (e));
> !         continue;
> !       }
> !       if (!cgraph_check_inline_limits (e, &e->inline_failed))
> !       {
> !         if (dump_file)
> !           fprintf (dump_file, "Not inlining: %s.\n",
> !                    cgraph_inline_failed_string (e->inline_failed));
> !         continue;
> !       }
> !       if (cgraph_default_inline_p (e->callee, &failed_reason))
> !       {
> !         if (dump_file)
> !           fprintf (dump_file, " Inlining %s into %s.\n",
> !                    cgraph_node_name (e->callee),
> !                    cgraph_node_name (e->caller));
> !         cgraph_mark_inline_edge (e, true, NULL);
> !         inlined = true;
> !       }
>      }
>
>    return inlined;
> --- 1696,1720 ----
>        fprintf (dump_file, "Considering inline candidate %s.\n",
>                 cgraph_node_name (e->callee));
>
> +       if (!can_early_inline_edge_p (e))
> +       continue;
> +
>        if (cgraph_edge_recursive_p (e))
>        {
>          if (dump_file)
> !           fprintf (dump_file, "  Not inlining: recursive call.\n");
>          continue;
>        }
>
> !       if (!want_early_inline_function_p (e))
>        continue;
>
> !       if (dump_file)
> !       fprintf (dump_file, " Inlining %s into %s.\n",
> !                cgraph_node_name (e->callee),
> !                cgraph_node_name (e->caller));
> !       cgraph_mark_inline_edge (e, true, NULL);
> !       inlined = true;
>      }
>
>    return inlined;
> *************** static GTY ((length ("nnodes"))) struct
> *** 1626,1632 ****
>     passes to be somewhat more effective and avoids some code duplication in
>     later real inlining pass for testcases with very many function calls.  */
>  static unsigned int
> ! cgraph_early_inlining (void)
>  {
>    struct cgraph_node *node = cgraph_get_node (current_function_decl);
>    struct cgraph_edge *edge;
> --- 1730,1736 ----
>     passes to be somewhat more effective and avoids some code duplication in
>     later real inlining pass for testcases with very many function calls.  */
>  static unsigned int
> ! early_inliner (void)
>  {
>    struct cgraph_node *node = cgraph_get_node (current_function_decl);
>    struct cgraph_edge *edge;
> *************** cgraph_early_inlining (void)
> *** 1643,1653 ****
>
>    /* Even when not optimizing or not inlining inline always-inline
>       functions.  */
> !   inlined = cgraph_perform_always_inlining (node);
>
>    if (!optimize
>        || flag_no_inline
> !       || !flag_early_inlining)
>      ;
>    else if (lookup_attribute ("flatten",
>                             DECL_ATTRIBUTES (node->decl)) != NULL)
> --- 1747,1766 ----
>
>    /* Even when not optimizing or not inlining inline always-inline
>       functions.  */
> !   inlined = inline_always_inline_functions (node);
>
>    if (!optimize
>        || flag_no_inline
> !       || !flag_early_inlining
> !       /* Never inline regular functions into always-inline functions
> !        during incremental inlining.  This sucks as functions calling
> !        always inline functions will get less optimized, but at the
> !        same time inlining of functions calling always inline
> !        functoin into an always inline function might introduce
> !        cycles of edges to be always inlined in the callgraph.
> !
> !        We might want to be smarter and just avoid this type of inlining.  */
> !       || DECL_DISREGARD_INLINE_LIMITS (node->decl))
>      ;
>    else if (lookup_attribute ("flatten",
>                             DECL_ATTRIBUTES (node->decl)) != NULL)
> *************** cgraph_early_inlining (void)
> *** 1657,1663 ****
>        if (dump_file)
>        fprintf (dump_file,
>                 "Flattening %s\n", cgraph_node_name (node));
> !       cgraph_flatten (node);
>        inlined = true;
>      }
>    else
> --- 1770,1776 ----
>        if (dump_file)
>        fprintf (dump_file,
>                 "Flattening %s\n", cgraph_node_name (node));
> !       flatten_function (node);
>        inlined = true;
>      }
>    else
> *************** cgraph_early_inlining (void)
> *** 1665,1674 ****
>        /* We iterate incremental inlining to get trivial cases of indirect
>         inlining.  */
>        while (iterations < PARAM_VALUE (PARAM_EARLY_INLINER_MAX_ITERATIONS)
> !            && cgraph_decide_inlining_incrementally (node))
>        {
>          timevar_push (TV_INTEGRATION);
>          todo |= optimize_inline_calls (current_function_decl);
>          timevar_pop (TV_INTEGRATION);
>          iterations++;
>          inlined = false;
> --- 1778,1799 ----
>        /* We iterate incremental inlining to get trivial cases of indirect
>         inlining.  */
>        while (iterations < PARAM_VALUE (PARAM_EARLY_INLINER_MAX_ITERATIONS)
> !            && early_inline_small_functions (node))
>        {
>          timevar_push (TV_INTEGRATION);
>          todo |= optimize_inline_calls (current_function_decl);
> +
> +         /* Technically we ought to recompute inline parameters so the new
> +            iteration of early inliner works as expected.  We however have
> +            values approximately right and thus we only need to update edge
> +            info that might be cleared out for newly discovered edges.  */
> +         for (edge = node->callees; edge; edge = edge->next_callee)
> +           {
> +             edge->call_stmt_size
> +               = estimate_num_insns (edge->call_stmt, &eni_size_weights);
> +             edge->call_stmt_time
> +               = estimate_num_insns (edge->call_stmt, &eni_time_weights);
> +           }
>          timevar_pop (TV_INTEGRATION);
>          iterations++;
>          inlined = false;
> *************** cgraph_early_inlining (void)
> *** 1681,1699 ****
>      {
>        timevar_push (TV_INTEGRATION);
>        todo |= optimize_inline_calls (current_function_decl);
> -
> -       /* Technically we ought to recompute inline parameters so the new iteration of
> -        early inliner works as expected.  We however have values approximately right
> -        and thus we only need to update edge info that might be cleared out for
> -        newly discovered edges.  */
> -       for (edge = node->callees; edge; edge = edge->next_callee)
> -       {
> -         edge->call_stmt_size
> -           = estimate_num_insns (edge->call_stmt, &eni_size_weights);
> -         edge->call_stmt_time
> -           = estimate_num_insns (edge->call_stmt, &eni_time_weights);
> -       }
> -
>        timevar_pop (TV_INTEGRATION);
>      }
>
> --- 1806,1811 ----
> *************** struct gimple_opt_pass pass_early_inline
> *** 1708,1714 ****
>    GIMPLE_PASS,
>    "einline",                          /* name */
>    NULL,                                       /* gate */
> !   cgraph_early_inlining,              /* execute */
>    NULL,                                       /* sub */
>    NULL,                                       /* next */
>    0,                                  /* static_pass_number */
> --- 1820,1826 ----
>    GIMPLE_PASS,
>    "einline",                          /* name */
>    NULL,                                       /* gate */
> !   early_inliner,                      /* execute */
>    NULL,                                       /* sub */
>    NULL,                                       /* next */
>    0,                                  /* static_pass_number */
> *************** inline_transform (struct cgraph_node *no
> *** 1730,1737 ****
>    struct cgraph_edge *e;
>    bool inline_p = false;
>
> !   /* FIXME: Currently the pass manager is adding inline transform more than once to some
> !      clones.  This needs revisiting after WPA cleanups.  */
>    if (cfun->after_inlining)
>      return 0;
>
> --- 1842,1849 ----
>    struct cgraph_edge *e;
>    bool inline_p = false;
>
> !   /* FIXME: Currently the pass manager is adding inline transform more than
> !      once to some clones.  This needs revisiting after WPA cleanups.  */
>    if (cfun->after_inlining)
>      return 0;
>
> *************** inline_transform (struct cgraph_node *no
> *** 1762,1768 ****
>     happens during early inlining.  */
>
>  static bool
> ! gate_cgraph_decide_inlining (void)
>  {
>    /* ???  We'd like to skip this if not optimizing or not inlining as
>       all always-inline functions have been processed by early
> --- 1874,1880 ----
>     happens during early inlining.  */
>
>  static bool
> ! gate_ipa_inline (void)
>  {
>    /* ???  We'd like to skip this if not optimizing or not inlining as
>       all always-inline functions have been processed by early
> *************** struct ipa_opt_pass_d pass_ipa_inline =
> *** 1777,1784 ****
>   {
>    IPA_PASS,
>    "inline",                           /* name */
> !   gate_cgraph_decide_inlining,                /* gate */
> !   cgraph_decide_inlining,             /* execute */
>    NULL,                                       /* sub */
>    NULL,                                       /* next */
>    0,                                  /* static_pass_number */
> --- 1889,1896 ----
>   {
>    IPA_PASS,
>    "inline",                           /* name */
> !   gate_ipa_inline,                    /* gate */
> !   ipa_inline,                         /* execute */
>    NULL,                                       /* sub */
>    NULL,                                       /* next */
>    0,                                  /* static_pass_number */
> Index: ipa-inline.h
> ===================================================================
> *** ipa-inline.h        (revision 172597)
> --- ipa-inline.h        (working copy)
> *************** struct inline_summary
> *** 41,48 ****
>    /* False when there something makes versioning impossible.
>       Currently computed and used only by ipa-cp.  */
>    unsigned versionable : 1;
> -   /* True when function should be inlined independently on its size.  */
> -   unsigned disregard_inline_limits : 1;
>
>    /* Information about function that will result after applying all the
>       inline decisions present in the callgraph.  Generally kept up to
> --- 41,46 ----
> Index: ipa-inline-analysis.c
> ===================================================================
> *** ipa-inline-analysis.c       (revision 172597)
> --- ipa-inline-analysis.c       (working copy)
> *************** dump_inline_summary (FILE *f, struct cgr
> *** 131,137 ****
>        struct inline_summary *s = inline_summary (node);
>        fprintf (f, "Inline summary for %s/%i", cgraph_node_name (node),
>               node->uid);
> !       if (s->disregard_inline_limits)
>        fprintf (f, " always_inline");
>        if (s->inlinable)
>        fprintf (f, " inlinable");
> --- 131,137 ----
>        struct inline_summary *s = inline_summary (node);
>        fprintf (f, "Inline summary for %s/%i", cgraph_node_name (node),
>               node->uid);
> !       if (DECL_DISREGARD_INLINE_LIMITS (node->decl))
>        fprintf (f, " always_inline");
>        if (s->inlinable)
>        fprintf (f, " inlinable");
> *************** dump_inline_summary (FILE *f, struct cgr
> *** 142,148 ****
>        fprintf (f, "  global time:     %i\n", s->time);
>        fprintf (f, "  self size:       %i, benefit: %i\n",
>               s->self_size, s->size_inlining_benefit);
> !       fprintf (f, "  global size:     %i", s->size);
>        fprintf (f, "  self stack:      %i\n",
>               (int)s->estimated_self_stack_size);
>        fprintf (f, "  global stack:    %i\n\n",
> --- 142,148 ----
>        fprintf (f, "  global time:     %i\n", s->time);
>        fprintf (f, "  self size:       %i, benefit: %i\n",
>               s->self_size, s->size_inlining_benefit);
> !       fprintf (f, "  global size:     %i\n", s->size);
>        fprintf (f, "  self stack:      %i\n",
>               (int)s->estimated_self_stack_size);
>        fprintf (f, "  global stack:    %i\n\n",
> *************** compute_inline_parameters (struct cgraph
> *** 364,371 ****
>
>    /* Can this function be inlined at all?  */
>    info->inlinable = tree_inlinable_function_p (node->decl);
> -   if (!info->inlinable)
> -     info->disregard_inline_limits = 0;
>
>    /* Inlinable functions always can change signature.  */
>    if (info->inlinable)
> --- 364,369 ----
> *************** compute_inline_parameters (struct cgraph
> *** 388,395 ****
>    info->estimated_growth = INT_MIN;
>    info->stack_frame_offset = 0;
>    info->estimated_stack_size = info->estimated_self_stack_size;
> -   info->disregard_inline_limits
> -     = DECL_DISREGARD_INLINE_LIMITS (node->decl);
>  }
>
>
> --- 386,391 ----
> *************** estimate_growth (struct cgraph_node *nod
> *** 483,507 ****
>
>    for (e = node->callers; e; e = e->next_caller)
>      {
> !       if (e->caller == node)
>          self_recursive = true;
> !       if (e->inline_failed)
> !       growth += estimate_edge_growth (e);
>      }
>
> !   /* ??? Wrong for non-trivially self recursive functions or cases where
> !      we decide to not inline for different reasons, but it is not big deal
> !      as in that case we will keep the body around, but we will also avoid
> !      some inlining.  */
> !   if (cgraph_will_be_removed_from_program_if_no_direct_calls (node)
> !       && !DECL_EXTERNAL (node->decl) && !self_recursive)
> !     growth -= info->size;
> !   /* COMDAT functions are very often not shared across multiple units since they
> !      come from various template instantiations.  Take this into account.  */
> !   else  if (DECL_COMDAT (node->decl) && !self_recursive
> !           && cgraph_can_remove_if_no_direct_calls_p (node))
> !     growth -= (info->size
> !              * (100 - PARAM_VALUE (PARAM_COMDAT_SHARING_PROBABILITY)) + 50) / 100;
>
>    info->estimated_growth = growth;
>    return growth;
> --- 479,512 ----
>
>    for (e = node->callers; e; e = e->next_caller)
>      {
> !       gcc_checking_assert (e->inline_failed);
> !
> !       if (e->caller == node
> !         || (e->caller->global.inlined_to
> !             && e->caller->global.inlined_to == node))
>          self_recursive = true;
> !       growth += estimate_edge_growth (e);
>      }
> +
>
> !   /* For self recursive functions the growth estimation really should be
> !      infinity.  We don't want to return very large values because the growth
> !      plays various roles in badness computation fractions.  Be sure to not
> !      return zero or negative growths. */
> !   if (self_recursive)
> !     growth = growth < info->size ? info->size : growth;
> !   else
> !     {
> !       if (cgraph_will_be_removed_from_program_if_no_direct_calls (node)
> !         && !DECL_EXTERNAL (node->decl))
> !       growth -= info->size;
> !       /* COMDAT functions are very often not shared across multiple units since they
> !        come from various template instantiations.  Take this into account.  */
> !       else  if (DECL_COMDAT (node->decl)
> !               && cgraph_can_remove_if_no_direct_calls_p (node))
> !       growth -= (info->size
> !                  * (100 - PARAM_VALUE (PARAM_COMDAT_SHARING_PROBABILITY)) + 50) / 100;
> !     }
>
>    info->estimated_growth = growth;
>    return growth;
> *************** inline_read_summary (void)
> *** 621,627 ****
>              bp = lto_input_bitpack (ib);
>              info->inlinable = bp_unpack_value (&bp, 1);
>              info->versionable = bp_unpack_value (&bp, 1);
> -             info->disregard_inline_limits = bp_unpack_value (&bp, 1);
>            }
>
>          lto_destroy_simple_input_block (file_data,
> --- 626,631 ----
> *************** inline_write_summary (cgraph_node_set se
> *** 688,694 ****
>          bp = bitpack_create (ob->main_stream);
>          bp_pack_value (&bp, info->inlinable, 1);
>          bp_pack_value (&bp, info->versionable, 1);
> -         bp_pack_value (&bp, info->disregard_inline_limits, 1);
>          lto_output_bitpack (&bp);
>        }
>      }
> --- 692,697 ----
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Inliner heuristics facelift
  2011-04-19 18:37 ` Xinliang David Li
@ 2011-04-19 19:24   ` H.J. Lu
  0 siblings, 0 replies; 9+ messages in thread
From: H.J. Lu @ 2011-04-19 19:24 UTC (permalink / raw)
  To: Xinliang David Li; +Cc: Jan Hubicka, gcc-patches

On Tue, Apr 19, 2011 at 11:30 AM, Xinliang David Li <davidxl@google.com> wrote:
> Is change tested this with SPEC?  5 or 6 SPEC2k programs ICE with the
> following error when built with FDO.
>
> .... internal compiler error: in inline_small_functions, at ipa-inline.c:1361
> Please submit a full bug report,
> with preprocessed source if appropriate.
> See <http://gcc.gnu.org/bugs.html> for instructions.
>

This is caused by a different patch:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48671

-- 
H.J.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2011-04-19 18:37 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-04-17 12:07 Inliner heuristics facelift Dominique Dhumieres
2011-04-17 13:35 ` Jan Hubicka
2011-04-17 14:58   ` Dominique Dhumieres
2011-04-17 15:29     ` Jan Hubicka
2011-04-17 14:29 ` Jan Hubicka
2011-04-19 13:40   ` H.J. Lu
  -- strict thread matches above, loose matches on Subject: below --
2011-04-17 10:54 Jan Hubicka
2011-04-19 18:37 ` Xinliang David Li
2011-04-19 19:24   ` H.J. Lu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).