Extend ipa-inline-analysis to recognize functoin/loop invariant parameters

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* Extend ipa-inline-analysis to recognize functoin/loop invariant parameters
@ 2011-09-23 20:34 Jan Hubicka
  2011-09-26 18:54 ` H.J. Lu
  0 siblings, 1 reply; 2+ messages in thread
From: Jan Hubicka @ 2011-09-23 20:34 UTC (permalink / raw)
  To: gcc-patches

Hi,
this patch extends inliner's predicate infrastructure to track optimization
oppurtunities that depends on properties of call site parameters that are
not readilly available (and do not belong to) jump functions.

For this new inline_param_summary is introduced that is filled in
estimate_function_body_sizes used at predicate evaulation time and maintained
over inlining and cloning.

I now use it to track probability that parameter will change.  Probability
of 0 means invaraint values, while probabilities in range 1...REG_BR_PROB_BASE
means that parameter will be recomputed in between function calls with
the known probability.

The motivation for it comes from testcases like c-ray raytracer, where function
raysphere has two parameters (ray and sphere). It is called from loop that for
given ray iterates across all spheres in the scene.
Large benefits from inlining raysphere are due to fact that good portion of the
function is loop invariant and thus it is subsequentely lifted out of the loop
by PRE.

The patch splits NOT_CONSTANT predicate into NOT_CONSTANT and CHANGED.
NOT_CONSTANT is true always when parameters is not a know IP invaraint (in
sense of jump functions), while CHANGED is true when parameter change
probaiblity is >= 1 (i.e. it is not function level invariant). NOT_CONSTANT is
used when evaulating builtin_constant_p that is never true for loop invariant.
CHANGED is used when deciding whether given statment will be executed per
invocation of the particula rfunction call.  Moreover predicates can now be
evaulated into execution probablity that correspond to the probability of
parameter change.

As implemented now, the patch is good enough to analyze c-ray and work out that
raysphere is cool to inline, it however won't inline it because of the hard
size limits.

Some guesswork is obviously needed. Most magic is in param_change_prob and the
way it handles non-SSA parameters where it is quite non-obvious how much of code
motion will be possible after inlining.  Also predicate_probability can really
only evaulate simple predicated like (op0 changed) into op0's change probability.
It makes simple estiamtes so (op0 changed || op1 changed) is maximum of op0 and op1
change probability while (op0 changed && op1 changed) is minimum.
The second is quite rare, so this should work relatively well in practice.

There are other simple things I would like to add incrementally. I.e. tracking
whether parameter is address of automatic variable (and thus inlining likely
enables SRA).  This should help C++ inlining and move us closer to be able to
analyze fortran testcases.  Fortran is particularly difficult because most of
stuff is passed by reference and we will probably need to introduce jump functions
for values passed in aggregates and values passed by reference to understand
things like array descriptors.

The patch has little effect on spec2000 or our C++ benchmark suite at least
with current inline limits. Most tests are small enough so we usully do all
inlining allowed and thus saner priority queue makes little difference.  It
however helps some of Mozilla inner loops and increase amount of inlining done
there by about 8% while keeping same code size.

Bootstrapped/regtsted x86_64-linux.

Honza

	* gcc.dg/ipa/inline-1.c: new testcase.
	* gcc.dg/ipa/inline-2.c: new testcase.
	* gcc.dg/ipa/inline-3.c: new testcase.
	* gcc.dg/ipa/inline-4.c: new testcase.

	* ipa-inline-transform.c (inline_call): Add comment.
	* ipa-inline.h (inline_param_summary): New structure and vector.
	(struct inline_edge_summary): Add param field.
	* ipa-inline-analysis.c (CHANGED): New constant.
	(add_clause): Handle CHANGED and NOT_CONSTANT.
	(predicate_probability): New function.
	(dump_condition): Dump CHANGED predicate.
	(evaluate_conditions_for_known_args): Handle ERROR_MARK as marker
	of unknown function wide invariant.
	(evaluate_conditions_for_edge): Handle change probabilities.
	(inline_edge_duplication_hook): Copy param summaries.
	(inline_edge_removal_hook): Free param summaries.
	(dump_inline_edge_summary): Fix dumping of indirect edges and callee sizes;
	dump param summaries.
	(will_be_nonconstant_predicate): Use CHANGED predicate.
	(record_modified_bb_info): New structure.
	(record_modified): New function.
	(param_change_prob): New function.
	(estimate_function_body_sizes): Compute param summaries.
	(estimate_edge_size_and_time): Add probability argument.
	(estimate_node_size_and_time): Add inline_param_summary argument;
	handle predicate probabilities.
	(remap_predicate): Fix formating.
	(remap_edge_change_prob): New function.
	(remap_edge_summaries): Rename from ...; use remap_edge_change_prob.
	(remap_edge_predicates): ... this one.
	(inline_merge_summary): Remap edge summaries; handle predicate probabilities;
	remove param summaries after we are done.
	(do_estimate_edge_time): Update.
	(do_estimate_edge_growth): Update.
	(read_inline_edge_summary): Read param info.
	(inline_read_summary): Fix formating.
	(write_inline_edge_summary): Write param summaries.
Index: ipa-inline-transform.c
===================================================================
*** ipa-inline-transform.c	(revision 179093)
--- ipa-inline-transform.c	(working copy)
*************** inline_call (struct cgraph_edge *e, bool
*** 248,253 ****
--- 248,255 ----
      *overall_size += new_size - old_size;
    ncalls_inlined++;
  
+   /* This must happen after inline_merge_summary that rely on jump
+      functions of callee to not be updated.  */
    if (optimize)
      return ipa_propagate_indirect_call_infos (curr, new_edges);
    else
Index: testsuite/gcc.dg/ipa/inline-1.c
===================================================================
*** testsuite/gcc.dg/ipa/inline-1.c	(revision 0)
--- testsuite/gcc.dg/ipa/inline-1.c	(revision 0)
***************
*** 0 ****
--- 1,37 ----
+ /* Verify that analysis of function parameters works as expected.  */
+ /* { dg-do compile } */
+ /* { dg-options "-O3 -c -fdump-ipa-inline"  } */
+ struct bah {int a,b,c,d,e;};
+ static struct bah bah3={2,3,4,5,6};
+ const static struct bah bah4={2,3,4,5,6};
+ void test (int, struct bah *, struct bah, struct bah, int, struct bah, struct bah, struct bah);
+ void foo (int invariant, struct bah invariant2)
+ {
+   int i;
+   struct bah bah2={1,2,3,4,5};
+   struct bah bah5={1,2,3,4,5};
+   for (i = 0; i<10; i++)
+     {
+       bah5.a=i;
+       test (i, &bah2, bah2, bah3, invariant, invariant2, bah4, bah5);
+     }
+ }
+ /* op0 change on every invocation.  */
+ /* op1 is function invariant.  */
+ /* { dg-final { scan-ipa-dump-not "op0 is compile time invariant"  "inline"  } } */
+ /* { dg-final { scan-ipa-dump-not "op0 change"  "inline"  } } */
+ /* { dg-final { scan-ipa-dump "op1 is compile time invariant"  "inline"  } } */
+ /* op2 is invariant within loop (we make assumption that function call does not afect it.). */
+ /* { dg-final { scan-ipa-dump "op2 change 10.000000. of time"  "inline"  } } */
+ /* op3 is invariant within loop (we make assumption that function call does not afect it.). */
+ /* { dg-final { scan-ipa-dump "op3 change 10.000000. of time"  "inline"  } } */
+ /* op4 is invariant within loop.  */
+ /* { dg-final { scan-ipa-dump "op4 change 10.000000. of time"  "inline"  } } */
+ /* op5 is invariant within loop.  */
+ /* { dg-final { scan-ipa-dump "op5 change 10.000000. of time"  "inline"  } } */
+ /* op6 is compile time invariant.  */
+ /* { dg-final { scan-ipa-dump "op6 is compile time invariant"  "inline"  } } */
+ /* op7 change.  */
+ /* { dg-final { scan-ipa-dump-not "op7 is compile time invariant"  "inline"  } } */
+ /* { dg-final { scan-ipa-dump-not "op7 change"  "inline"  } } */
+ /* { dg-final { cleanup-ipa-dump "inline" } } */
Index: testsuite/gcc.dg/ipa/inline-2.c
===================================================================
*** testsuite/gcc.dg/ipa/inline-2.c	(revision 0)
--- testsuite/gcc.dg/ipa/inline-2.c	(revision 0)
***************
*** 0 ****
--- 1,33 ----
+ /* Verify that logic combining probabilities works as expected.  */
+ /* { dg-do compile } */
+ /* { dg-options "-O3 -c -fdump-ipa-inline -fno-early-inlining"  } */
+ 
+ struct bah {int a,b,d;};
+ 
+ __attribute__ ((noinline))
+ void test(int a,int b,int c,int d,int e)
+ {
+   test3(a,b,c,d,e);
+ }
+ inline
+ static void bar (int parm1, int parm2)
+ {
+   int i;
+   for (i = 0; i<10; i++)
+     {
+       test (0,0,parm1,parm2,i);
+     }
+ }
+ void foo (int invariant)
+ {
+   int i;
+   for (i = 0; i<10; i++)
+     {
+       bar (i, invariant);
+     }
+ }
+ /* After inlining bar into foo, op2 is invariant within inner loop.  */
+ /* { dg-final { scan-ipa-dump "op2 change 10.000000. of time"  "inline"  } } */
+ /* After inlining bar into foo, op3 is invariant within both loops.  */
+ /* { dg-final { scan-ipa-dump "op3 change 1.000000. of time"  "inline"  } } */
+ /* { dg-final { cleanup-ipa-dump "inline" } } */
Index: testsuite/gcc.dg/ipa/inline-3.c
===================================================================
*** testsuite/gcc.dg/ipa/inline-3.c	(revision 0)
--- testsuite/gcc.dg/ipa/inline-3.c	(revision 0)
***************
*** 0 ****
--- 1,25 ----
+ /* Verify that do_work is detected as being loop invariant.  */
+ /* { dg-do compile } */
+ /* { dg-options "-O3 -c -fdump-ipa-inline-details -fno-early-inlining"  } */
+ 
+ struct bah {int a,b,d;};
+ 
+ static int do_work (struct bah s)
+ {
+   return s.a*s.b/s.d;
+ }
+ int foo (int invariant)
+ {
+   int i;
+   struct bah s = {invariant,invariant,invariant};
+   int sum = 0;
+   for (i = 0; i<10; i++)
+     {
+       sum += do_work (s);
+     }
+   return sum;
+ }
+ 
+ 
+ /* { dg-final { scan-ipa-dump "Scaling time by probability:0.100000"  "inline"  } } */
+ /* { dg-final { cleanup-ipa-dump "inline" } } */
Index: testsuite/gcc.dg/ipa/inline-4.c
===================================================================
*** testsuite/gcc.dg/ipa/inline-4.c	(revision 0)
--- testsuite/gcc.dg/ipa/inline-4.c	(revision 0)
***************
*** 0 ****
--- 1,29 ----
+ /* { dg-do compile } */
+ /* { dg-options "-Os -c -fdump-ipa-inline -fno-early-inlining -fno-partial-inlining -fno-ipa-cp"  } */
+ 
+ void do_something (int shall_i_work)
+ {
+   if (shall_i_work)
+     {
+       work_hard ();
+       work_hard ();
+       work_hard ();
+       work_hard ();
+       work_hard ();
+       work_hard ();
+       work_hard ();
+       work_hard ();
+     }
+ }
+ int foo (int invariant)
+ {
+   do_something (0);
+   do_something (1);
+ }
+ 
+ 
+ /* We should inline do_something(0),  but not do_something (1).  */
+ /* { dg-final { scan-ipa-dump "Inlined 1 calls, eliminated 0 functions"  "inline"  } } */
+ /* Call to work_hard should be detected as optimized out.  */
+ /* { dg-final { scan-ipa-dump-times "predicate: .false." 8 "inline"  } } */
+ /* { dg-final { cleanup-ipa-dump "inline" } } */
Index: ipa-inline.h
===================================================================
*** ipa-inline.h	(revision 179093)
--- ipa-inline.h	(working copy)
*************** struct GTY(()) inline_summary
*** 104,114 ****
--- 104,131 ----
    VEC(size_time_entry,gc) *entry;
  };
  
+ 
  typedef struct inline_summary inline_summary_t;
  DEF_VEC_O(inline_summary_t);
  DEF_VEC_ALLOC_O(inline_summary_t,gc);
  extern GTY(()) VEC(inline_summary_t,gc) *inline_summary_vec;
  
+ /* Information kept about parameter of call site.  */
+ struct inline_param_summary
+ {
+   /* REG_BR_PROB_BASE based probability that parameter will change in between
+      two invocation of the calls.
+      I.e. loop invariant parameters
+      REG_BR_PROB_BASE/estimated_iterations and regular
+      parameters REG_BR_PROB_BASE.
+ 
+      Value 0 is reserved for compile time invariants. */
+   int change_prob;
+ };
+ typedef struct inline_param_summary inline_param_summary_t;
+ DEF_VEC_O(inline_param_summary_t);
+ DEF_VEC_ALLOC_O(inline_param_summary_t,heap);
+ 
  /* Information kept about callgraph edges.  */
  struct inline_edge_summary
  {
*************** struct inline_edge_summary
*** 118,123 ****
--- 135,144 ----
    /* Depth of loop nest, 0 means no nesting.  */
    unsigned short int loop_depth;
    struct predicate *predicate;
+   /* Array indexed by parameters.
+      0 means that parameter change all the time, REG_BR_PROB_BASE means
+      that parameter is constant.  */
+   VEC (inline_param_summary_t, heap) *param;
  };
  
  typedef struct inline_edge_summary inline_edge_summary_t;
Index: ipa-inline-analysis.c
===================================================================
*** ipa-inline-analysis.c	(revision 179093)
--- ipa-inline-analysis.c	(working copy)
*************** enum predicate_conditions
*** 108,113 ****
--- 108,118 ----
  /* Special condition code we use to represent test that operand is compile time
     constant.  */
  #define IS_NOT_CONSTANT ERROR_MARK
+ /* Special condition code we use to represent test that operand is not changed
+    across invocation of the function.  When operand IS_NOT_CONSTANT it is always
+    CHANGED, however i.e. loop invariants can be NOT_CHANGED given percentage
+    of executions even when they are not compile time constants.  */
+ #define CHANGED IDENTIFIER_NODE
  
  /* Holders of ipa cgraph hooks: */
  static struct cgraph_node_hook_list *function_insertion_hook_holder;
*************** add_clause (conditions conditions, struc
*** 287,308 ****
    /* Look for clauses that are obviously true.  I.e.
       op0 == 5 || op0 != 5.  */
    for (c1 = predicate_first_dynamic_condition; c1 < NUM_CONDITIONS; c1++)
!     for (c2 = c1 + 1; c2 <= NUM_CONDITIONS; c2++)
!       if ((clause & (1 << c1))
! 	  && (clause & (1 << c2)))
! 	{
! 	  condition *cc1 = VEC_index (condition,
! 				      conditions,
! 				      c1 - predicate_first_dynamic_condition);
! 	  condition *cc2 = VEC_index (condition,
! 				      conditions,
! 				      c2 - predicate_first_dynamic_condition);
! 	  if (cc1->operand_num == cc2->operand_num
! 	      && cc1->val == cc2->val
! 	      && cc1->code == invert_tree_comparison (cc2->code,
! 						      HONOR_NANS (TYPE_MODE (TREE_TYPE (cc1->val)))))
! 	    return;
! 	}
  	
  
    /* We run out of variants.  Be conservative in positive direction.  */
--- 292,328 ----
    /* Look for clauses that are obviously true.  I.e.
       op0 == 5 || op0 != 5.  */
    for (c1 = predicate_first_dynamic_condition; c1 < NUM_CONDITIONS; c1++)
!     {
!       condition *cc1;
!       if (!(clause & (1 << c1)))
! 	continue;
!       cc1 = VEC_index (condition,
! 		       conditions,
! 		       c1 - predicate_first_dynamic_condition);
!       /* We have no way to represent !CHANGED and !IS_NOT_CONSTANT
! 	 and thus there is no point for looking for them.  */
!       if (cc1->code == CHANGED
! 	  || cc1->code == IS_NOT_CONSTANT)
! 	continue;
!       for (c2 = c1 + 1; c2 <= NUM_CONDITIONS; c2++)
! 	if (clause & (1 << c2))
! 	  {
! 	    condition *cc1 = VEC_index (condition,
! 					conditions,
! 					c1 - predicate_first_dynamic_condition);
! 	    condition *cc2 = VEC_index (condition,
! 					conditions,
! 					c2 - predicate_first_dynamic_condition);
! 	    if (cc1->operand_num == cc2->operand_num
! 		&& cc1->val == cc2->val
! 		&& cc2->code != IS_NOT_CONSTANT
! 		&& cc2->code != CHANGED
! 		&& cc1->code == invert_tree_comparison 
! 		    (cc2->code,
! 		     HONOR_NANS (TYPE_MODE (TREE_TYPE (cc1->val)))))
! 	      return;
! 	  }
!     }
  	
  
    /* We run out of variants.  Be conservative in positive direction.  */
*************** evaluate_predicate (struct predicate *p,
*** 420,425 ****
--- 440,506 ----
    return true;
  }
  
+ /* Return the probability in range 0...REG_BR_PROB_BASE that the predicated
+    instruction will be recomputed per invocation of the inlined call.  */
+ 
+ static int
+ predicate_probability (conditions conds,
+ 		       struct predicate *p, clause_t possible_truths,
+ 		       VEC (inline_param_summary_t, heap) *inline_param_summary)
+ {
+   int i;
+   int combined_prob = REG_BR_PROB_BASE;
+ 
+   /* True remains true.  */
+   if (true_predicate_p (p))
+     return REG_BR_PROB_BASE;
+ 
+   if (false_predicate_p (p))
+     return 0;
+ 
+   gcc_assert (!(possible_truths & (1 << predicate_false_condition)));
+ 
+   /* See if we can find clause we can disprove.  */
+   for (i = 0; p->clause[i]; i++)
+     {
+       gcc_checking_assert (i < MAX_CLAUSES);
+       if (!(p->clause[i] & possible_truths))
+ 	return 0;
+       else
+ 	{
+ 	  int this_prob = 0;
+ 	  int i2;
+ 	  if (!inline_param_summary)
+ 	    return REG_BR_PROB_BASE;
+ 	  for (i2 = 0; i2 < NUM_CONDITIONS; i2++)
+ 	    if ((p->clause[i] & possible_truths) & (1 << i2))
+ 	      {
+ 		if (i2 >= predicate_first_dynamic_condition)
+ 		  {
+ 		    condition *c = VEC_index
+ 				    (condition, conds,
+ 				     i2 - predicate_first_dynamic_condition);
+ 		    if (c->code == CHANGED)
+ 		      {
+ 			int iprob = VEC_index (inline_param_summary_t,
+ 					       inline_param_summary,
+ 					       c->operand_num)->change_prob;
+ 			this_prob = MAX (this_prob, iprob);
+ 		      }
+ 		    else
+ 		      this_prob = REG_BR_PROB_BASE;
+ 		   }
+ 		 else
+ 		   this_prob = REG_BR_PROB_BASE;
+ 	      }
+ 	  combined_prob = MIN (this_prob, combined_prob);
+ 	  if (!combined_prob)
+             return 0;
+ 	}
+     }
+   return combined_prob;
+ }
+ 
  
  /* Dump conditional COND.  */
  
*************** dump_condition (FILE *f, conditions cond
*** 433,445 ****
      fprintf (f, "not inlined");
    else
      {
!       c = VEC_index (condition, conditions, cond - predicate_first_dynamic_condition);
        fprintf (f, "op%i", c->operand_num);
        if (c->code == IS_NOT_CONSTANT)
  	{
  	  fprintf (f, " not constant");
  	  return;
  	}
        fprintf (f, " %s ", op_symbol_code (c->code));
        print_generic_expr (f, c->val, 1);
      }
--- 514,532 ----
      fprintf (f, "not inlined");
    else
      {
!       c = VEC_index (condition, conditions,
! 		     cond - predicate_first_dynamic_condition);
        fprintf (f, "op%i", c->operand_num);
        if (c->code == IS_NOT_CONSTANT)
  	{
  	  fprintf (f, " not constant");
  	  return;
  	}
+       if (c->code == CHANGED)
+ 	{
+ 	  fprintf (f, " changed");
+ 	  return;
+ 	}
        fprintf (f, " %s ", op_symbol_code (c->code));
        print_generic_expr (f, c->val, 1);
      }
*************** edge_set_predicate (struct cgraph_edge *
*** 571,577 ****
  
  /* KNOWN_VALS is partial mapping of parameters of NODE to constant values.
     Return clause of possible truths. When INLINE_P is true, assume that
!    we are inlining.  */
  
  static clause_t
  evaluate_conditions_for_known_args (struct cgraph_node *node,
--- 658,666 ----
  
  /* KNOWN_VALS is partial mapping of parameters of NODE to constant values.
     Return clause of possible truths. When INLINE_P is true, assume that
!    we are inlining. 
! 
!    ERROR_MARK means compile time invariant.  */
  
  static clause_t
  evaluate_conditions_for_known_args (struct cgraph_node *node,
*************** evaluate_conditions_for_known_args (stru
*** 596,607 ****
        else
  	val = NULL;
  
        if (!val)
  	{
  	  clause |= 1 << (i + predicate_first_dynamic_condition);
  	  continue;
  	}
!       if (c->code == IS_NOT_CONSTANT)
  	continue;
        res = fold_binary_to_constant (c->code, boolean_type_node, val, c->val);
        if (res
--- 685,699 ----
        else
  	val = NULL;
  
+       if (val == error_mark_node && c->code != CHANGED)
+ 	val = NULL;
+ 
        if (!val)
  	{
  	  clause |= 1 << (i + predicate_first_dynamic_condition);
  	  continue;
  	}
!       if (c->code == IS_NOT_CONSTANT || c->code == CHANGED)
  	continue;
        res = fold_binary_to_constant (c->code, boolean_type_node, val, c->val);
        if (res
*************** evaluate_conditions_for_edge (struct cgr
*** 627,632 ****
--- 719,725 ----
      {
        struct ipa_node_params *parms_info;
        struct ipa_edge_args *args = IPA_EDGE_REF (e);
+       struct inline_edge_summary *es = inline_edge_summary (e);
        int i, count = ipa_get_cs_argument_count (args);
        VEC (tree, heap) *known_vals = NULL;
  
*************** evaluate_conditions_for_edge (struct cgr
*** 643,648 ****
--- 736,746 ----
  					 ipa_get_ith_jump_func (args, i));
  	  if (cst)
  	    VEC_replace (tree, known_vals, i, cst);
+ 	  else if (inline_p
+ 		   && !VEC_index (inline_param_summary_t,
+ 				  es->param,
+ 				  i)->change_prob)
+ 	    VEC_replace (tree, known_vals, i, error_mark_node);
  	}
        clause = evaluate_conditions_for_known_args (callee,
  						   inline_p, known_vals);
*************** inline_edge_duplication_hook (struct cgr
*** 898,903 ****
--- 996,1002 ----
  	  sizeof (struct inline_edge_summary));
    info->predicate = NULL;
    edge_set_predicate (dst, srcinfo->predicate);
+   info->param = VEC_copy (inline_param_summary_t, heap, srcinfo->param);
  }
  
  
*************** inline_edge_removal_hook (struct cgraph_
*** 908,917 ****
  {
    if (edge_growth_cache)
      reset_edge_growth_cache (edge);
!   if (edge->uid < (int)VEC_length (inline_edge_summary_t, inline_edge_summary_vec))
      {
        edge_set_predicate (edge, NULL);
!       memset (inline_edge_summary (edge), 0, sizeof (struct inline_edge_summary));
      }
  }
  
--- 1007,1020 ----
  {
    if (edge_growth_cache)
      reset_edge_growth_cache (edge);
!   if (edge->uid
!       < (int)VEC_length (inline_edge_summary_t, inline_edge_summary_vec))
      {
        edge_set_predicate (edge, NULL);
!       VEC_free (inline_param_summary_t, heap,
! 	        inline_edge_summary (edge)->param);
!       memset (inline_edge_summary (edge), 0,
! 	      sizeof (struct inline_edge_summary));
      }
  }
  
*************** dump_inline_edge_summary (FILE * f, int 
*** 953,958 ****
--- 1056,1063 ----
      {
        struct inline_edge_summary *es = inline_edge_summary (edge);
        struct cgraph_node *callee = cgraph_function_or_thunk_node (edge->callee, NULL);
+       int i;
+ 
        fprintf (f, "%*s%s/%i %s\n%*s  loop depth:%2i freq:%4i size:%2i time: %2i callee size:%2i stack:%2i",
  	       indent, "", cgraph_node_name (callee),
  	       callee->uid, 
*************** dump_inline_edge_summary (FILE * f, int 
*** 963,970 ****
                 edge->frequency,
  	       es->call_stmt_size,
  	       es->call_stmt_time,
! 	       (int)inline_summary (callee)->size,
  	       (int)inline_summary (callee)->estimated_stack_size);
        if (es->predicate)
  	{
  	  fprintf (f, " predicate: ");
--- 1068,1076 ----
                 edge->frequency,
  	       es->call_stmt_size,
  	       es->call_stmt_time,
! 	       (int)inline_summary (callee)->size / INLINE_SIZE_SCALE,
  	       (int)inline_summary (callee)->estimated_stack_size);
+ 
        if (es->predicate)
  	{
  	  fprintf (f, " predicate: ");
*************** dump_inline_edge_summary (FILE * f, int 
*** 972,980 ****
  	}
        else
  	  fprintf (f, "\n");
        if (!edge->inline_failed)
  	{
!           fprintf (f, "%*sStack frame offset %i, callee self size %i, callee size %i\n",
  		   indent+2, "",
  		   (int)inline_summary (callee)->stack_frame_offset,
  		   (int)inline_summary (callee)->estimated_self_stack_size,
--- 1078,1101 ----
  	}
        else
  	  fprintf (f, "\n");
+       if (es->param)
+         for (i = 0; i < (int)VEC_length (inline_param_summary_t, es->param);
+ 	     i++)
+ 	  {
+ 	    int prob = VEC_index (inline_param_summary_t,
+ 				  es->param, i)->change_prob;
+ 
+ 	    if (!prob)
+ 	      fprintf (f, "%*s op%i is compile time invariant\n",
+ 		       indent + 2, "", i);
+ 	    else if (prob != REG_BR_PROB_BASE)
+ 	      fprintf (f, "%*s op%i change %f%% of time\n", indent + 2, "", i,
+ 		       prob * 100.0 / REG_BR_PROB_BASE);
+ 	  }
        if (!edge->inline_failed)
  	{
!           fprintf (f, "%*sStack frame offset %i, callee self size %i,"
! 		   " callee size %i\n",
  		   indent+2, "",
  		   (int)inline_summary (callee)->stack_frame_offset,
  		   (int)inline_summary (callee)->estimated_self_stack_size,
*************** dump_inline_edge_summary (FILE * f, int 
*** 986,992 ****
      {
        struct inline_edge_summary *es = inline_edge_summary (edge);
        fprintf (f, "%*sindirect call loop depth:%2i freq:%4i size:%2i"
! 	       " time: %2i\n",
  	       indent, "",
  	       es->loop_depth,	
                 edge->frequency,
--- 1107,1113 ----
      {
        struct inline_edge_summary *es = inline_edge_summary (edge);
        fprintf (f, "%*sindirect call loop depth:%2i freq:%4i size:%2i"
! 	       " time: %2i",
  	       indent, "",
  	       es->loop_depth,	
                 edge->frequency,
*************** will_be_nonconstant_predicate (struct ip
*** 1504,1510 ****
  		    (stmt, get_base_address (gimple_assign_rhs1 (stmt)));
        p = add_condition (summary,
  			 ipa_get_param_decl_index (info, parm),
! 			 IS_NOT_CONSTANT, NULL);
        op_non_const = or_predicates (summary->conds, &p, &op_non_const);
      }
    FOR_EACH_SSA_TREE_OPERAND (use, stmt, iter, SSA_OP_USE)
--- 1625,1631 ----
  		    (stmt, get_base_address (gimple_assign_rhs1 (stmt)));
        p = add_condition (summary,
  			 ipa_get_param_decl_index (info, parm),
! 			 CHANGED, NULL);
        op_non_const = or_predicates (summary->conds, &p, &op_non_const);
      }
    FOR_EACH_SSA_TREE_OPERAND (use, stmt, iter, SSA_OP_USE)
*************** will_be_nonconstant_predicate (struct ip
*** 1513,1519 ****
        if (parm && ipa_get_param_decl_index (info, parm) >= 0)
  	p = add_condition (summary,
  			   ipa_get_param_decl_index (info, parm),
! 			   IS_NOT_CONSTANT, NULL);
        else
  	p = *VEC_index (predicate_t, nonconstant_names,
  			SSA_NAME_VERSION (use));
--- 1634,1640 ----
        if (parm && ipa_get_param_decl_index (info, parm) >= 0)
  	p = add_condition (summary,
  			   ipa_get_param_decl_index (info, parm),
! 			   CHANGED, NULL);
        else
  	p = *VEC_index (predicate_t, nonconstant_names,
  			SSA_NAME_VERSION (use));
*************** will_be_nonconstant_predicate (struct ip
*** 1526,1531 ****
--- 1647,1762 ----
    return op_non_const;
  }
  
+ struct record_modified_bb_info
+ {
+   bitmap bb_set;
+   gimple stmt;
+ };
+ 
+ /* Callback of walk_aliased_vdefs.  Records basic blocks where the value may be
+    set except for info->stmt.  */
+ 
+ static bool
+ record_modified (ao_ref *ao ATTRIBUTE_UNUSED, tree vdef,
+ 	         void *data)
+ {
+   struct record_modified_bb_info *info = (struct record_modified_bb_info *) data;
+   if (SSA_NAME_DEF_STMT (vdef) == info->stmt)
+     return false;
+   bitmap_set_bit (info->bb_set,
+ 		  SSA_NAME_IS_DEFAULT_DEF (vdef)
+ 		  ? ENTRY_BLOCK_PTR->index : gimple_bb (SSA_NAME_DEF_STMT (vdef))->index);
+   return false;
+ }
+ 
+ /* Return probability (based on REG_BR_PROB_BASE) that I-th parameter of STMT
+    will change since last invocation of STMT. 
+ 
+    Value 0 is reserved for compile time invariants.
+    For common parameters it is REG_BR_PROB_BASE.  For loop invariants it
+    ought to be REG_BR_PROB_BASE / estimated_iters.  */
+ 
+ static int
+ param_change_prob (gimple stmt, int i)
+ {
+   tree op = gimple_call_arg (stmt, i);
+   basic_block bb = gimple_bb (stmt);
+   tree base;
+ 
+   if (is_gimple_min_invariant (op))
+     return 0;
+   /* We would have to do non-trivial analysis to really work out what
+      is the probability of value to change (i.e. when init statement
+      is in a sibling loop of the call). 
+ 
+      We do an conservative estimate: when call is executed N times more often
+      than the statement defining value, we take the frequency 1/N.  */
+   if (TREE_CODE (op) == SSA_NAME)
+     {
+       int init_freq;
+ 
+       if (!bb->frequency)
+ 	return REG_BR_PROB_BASE;
+ 
+       if (SSA_NAME_IS_DEFAULT_DEF (op))
+ 	init_freq = ENTRY_BLOCK_PTR->frequency;
+       else
+ 	init_freq = gimple_bb (SSA_NAME_DEF_STMT (op))->frequency;
+ 
+       if (!init_freq)
+ 	init_freq = 1;
+       if (init_freq < bb->frequency)
+         return MAX ((init_freq * REG_BR_PROB_BASE +
+ 		    bb->frequency / 2) / bb->frequency, 1);
+       else
+         return REG_BR_PROB_BASE;
+     }
+ 
+   base = get_base_address (op);
+   if (base)
+     {
+       ao_ref refd;
+       int max;
+       struct record_modified_bb_info info;
+       bitmap_iterator bi;
+       unsigned index;
+ 
+       if (const_value_known_p (base))
+ 	return 0;
+       if (!bb->frequency)
+ 	return REG_BR_PROB_BASE;
+       ao_ref_init (&refd, op);
+       info.stmt = stmt;
+       info.bb_set = BITMAP_ALLOC (NULL);
+       walk_aliased_vdefs (&refd, gimple_vuse (stmt), record_modified, &info,
+ 			  NULL);
+       if (bitmap_bit_p (info.bb_set, bb->index))
+ 	{
+           BITMAP_FREE (info.bb_set);
+ 	  return REG_BR_PROB_BASE;
+ 	}
+ 
+       /* Assume that every memory is initialized at entry.
+ 	 TODO: Can we easilly determine if value is always defined
+ 	 and thus we may skip entry block?  */
+       if (ENTRY_BLOCK_PTR->frequency)
+ 	max = ENTRY_BLOCK_PTR->frequency;
+       else
+ 	max = 1;
+ 
+       EXECUTE_IF_SET_IN_BITMAP (info.bb_set, 0, index, bi)
+ 	max = MIN (max, BASIC_BLOCK (index)->frequency);
+       
+       BITMAP_FREE (info.bb_set);
+       if (max < bb->frequency)
+         return MAX ((max * REG_BR_PROB_BASE +
+ 		     bb->frequency / 2) / bb->frequency, 1);
+       else
+         return REG_BR_PROB_BASE;
+     }
+   return REG_BR_PROB_BASE;
+ }
+ 
  
  /* Compute function body size parameters for NODE.
     When EARLY is true, we compute only simple summaries without
*************** estimate_function_body_sizes (struct cgr
*** 1626,1632 ****
  		{
  		  struct predicate false_p = false_predicate ();
  		  VEC_replace (predicate_t, nonconstant_names,
! 			       SSA_NAME_VERSION (gimple_call_lhs (stmt)), &false_p);
  		}
  
  	      es->call_stmt_size = this_size;
--- 1857,1880 ----
  		{
  		  struct predicate false_p = false_predicate ();
  		  VEC_replace (predicate_t, nonconstant_names,
! 			       SSA_NAME_VERSION (gimple_call_lhs (stmt)),
! 			       &false_p);
! 		}
! 	      if (ipa_node_params_vector)
! 		{
! 	          int count = gimple_call_num_args (stmt);
! 		  int i;
! 
! 		  if (count)
! 		    VEC_safe_grow_cleared (inline_param_summary_t, heap,
! 					   es->param, count);
! 		  for (i = 0; i < count; i++)
! 		    {
! 		      int prob = param_change_prob (stmt, i);
! 		      gcc_assert (prob >= 0 && prob <= REG_BR_PROB_BASE);
! 		      VEC_index (inline_param_summary_t,
! 				 es->param, i)->change_prob = prob;
! 		    }
  		}
  
  	      es->call_stmt_size = this_size;
*************** struct gimple_opt_pass pass_inline_param
*** 1842,1852 ****
  /* Increase SIZE and TIME for size and time needed to handle edge E.  */
  
  static void
! estimate_edge_size_and_time (struct cgraph_edge *e, int *size, int *time)
  {
    struct inline_edge_summary *es = inline_edge_summary (e);
    *size += es->call_stmt_size * INLINE_SIZE_SCALE;
!   *time += (es->call_stmt_time
  	    * e->frequency * (INLINE_TIME_SCALE / CGRAPH_FREQ_BASE));
    if (*time > MAX_TIME * INLINE_TIME_SCALE)
      *time = MAX_TIME * INLINE_TIME_SCALE;
--- 2090,2101 ----
  /* Increase SIZE and TIME for size and time needed to handle edge E.  */
  
  static void
! estimate_edge_size_and_time (struct cgraph_edge *e, int *size, int *time,
! 			     int prob)
  {
    struct inline_edge_summary *es = inline_edge_summary (e);
    *size += es->call_stmt_size * INLINE_SIZE_SCALE;
!   *time += (es->call_stmt_time * prob / REG_BR_PROB_BASE
  	    * e->frequency * (INLINE_TIME_SCALE / CGRAPH_FREQ_BASE));
    if (*time > MAX_TIME * INLINE_TIME_SCALE)
      *time = MAX_TIME * INLINE_TIME_SCALE;
*************** estimate_calls_size_and_time (struct cgr
*** 1866,1872 ****
        if (!es->predicate || evaluate_predicate (es->predicate, possible_truths))
  	{
  	  if (e->inline_failed)
! 	    estimate_edge_size_and_time (e, size, time);
  	  else
  	    estimate_calls_size_and_time (e->callee, size, time,
  					  possible_truths);
--- 2115,2125 ----
        if (!es->predicate || evaluate_predicate (es->predicate, possible_truths))
  	{
  	  if (e->inline_failed)
! 	    {
! 	      /* Predicates of calls shall not use NOT_CHANGED codes,
! 		 sowe do not need to compute probabilities.  */
! 	      estimate_edge_size_and_time (e, size, time, REG_BR_PROB_BASE);
! 	    }
  	  else
  	    estimate_calls_size_and_time (e->callee, size, time,
  					  possible_truths);
*************** estimate_calls_size_and_time (struct cgr
*** 1877,1883 ****
      {
        struct inline_edge_summary *es = inline_edge_summary (e);
        if (!es->predicate || evaluate_predicate (es->predicate, possible_truths))
!         estimate_edge_size_and_time (e, size, time);
      }
  }
  
--- 2130,2136 ----
      {
        struct inline_edge_summary *es = inline_edge_summary (e);
        if (!es->predicate || evaluate_predicate (es->predicate, possible_truths))
!         estimate_edge_size_and_time (e, size, time, REG_BR_PROB_BASE);
      }
  }
  
*************** estimate_calls_size_and_time (struct cgr
*** 1888,1894 ****
  static void
  estimate_node_size_and_time (struct cgraph_node *node,
  			     clause_t possible_truths,
! 		       	     int *ret_size, int *ret_time)
  {
    struct inline_summary *info = inline_summary (node);
    size_time_entry *e;
--- 2141,2149 ----
  static void
  estimate_node_size_and_time (struct cgraph_node *node,
  			     clause_t possible_truths,
! 		       	     int *ret_size, int *ret_time,
! 			     VEC (inline_param_summary_t, heap)
! 			       *inline_param_summary)
  {
    struct inline_summary *info = inline_summary (node);
    size_time_entry *e;
*************** estimate_node_size_and_time (struct cgra
*** 1918,1924 ****
  
    for (i = 0; VEC_iterate (size_time_entry, info->entry, i, e); i++)
      if (evaluate_predicate (&e->predicate, possible_truths))
!       time += e->time, size += e->size;
  
    if (time > MAX_TIME * INLINE_TIME_SCALE)
      time = MAX_TIME * INLINE_TIME_SCALE;
--- 2173,2192 ----
  
    for (i = 0; VEC_iterate (size_time_entry, info->entry, i, e); i++)
      if (evaluate_predicate (&e->predicate, possible_truths))
!       {
! 	size += e->size;
! 	if (!inline_param_summary)
! 	  time += e->time;
! 	else
! 	  {
! 	    int prob = predicate_probability (info->conds,
! 					      &e->predicate,
! 					      possible_truths,
! 					      inline_param_summary);
! 	    time += e->time * prob / REG_BR_PROB_BASE;
! 	  }
! 					         
!       }
  
    if (time > MAX_TIME * INLINE_TIME_SCALE)
      time = MAX_TIME * INLINE_TIME_SCALE;
*************** estimate_ipcp_clone_size_and_time (struc
*** 1951,1971 ****
    clause_t clause;
  
    clause = evaluate_conditions_for_known_args (node, false, known_vals);
!   estimate_node_size_and_time (node, clause, ret_size, ret_time);
  }
  
  
! /* Translate all conditions from callee representation into caller representation and
!    symbolically evaluate predicate P into new predicate.
  
!    INFO is inline_summary of function we are adding predicate into, CALLEE_INFO is summary
!    of function predicate P is from. OPERAND_MAP is array giving callee formal IDs the
!    caller formal IDs. POSSSIBLE_TRUTHS is clausule of all callee conditions that
!    may be true in caller context. TOPLEV_PREDICATE is predicate under which callee
!    is executed.  */
  
  static struct predicate
! remap_predicate (struct inline_summary *info, struct inline_summary *callee_info,
  		 struct predicate *p,
  		 VEC (int, heap) *operand_map,
  		 clause_t possible_truths,
--- 2219,2241 ----
    clause_t clause;
  
    clause = evaluate_conditions_for_known_args (node, false, known_vals);
!   estimate_node_size_and_time (node, clause, ret_size, ret_time,
! 			       NULL);
  }
  
  
! /* Translate all conditions from callee representation into caller
!    representation and symbolically evaluate predicate P into new predicate.
  
!    INFO is inline_summary of function we are adding predicate into,
!    CALLEE_INFO is summary of function predicate P is from. OPERAND_MAP is
!    array giving callee formal IDs the caller formal IDs. POSSSIBLE_TRUTHS is
!    clausule of all callee conditions that may be true in caller context.
!    TOPLEV_PREDICATE is predicate under which callee is executed.  */
  
  static struct predicate
! remap_predicate (struct inline_summary *info,
! 		 struct inline_summary *callee_info,
  		 struct predicate *p,
  		 VEC (int, heap) *operand_map,
  		 clause_t possible_truths,
*************** inline_update_callee_summaries (struct c
*** 2057,2068 ****
      inline_edge_summary (e)->loop_depth += depth;
  }
  
  
! /* Remap predicates of callees of NODE.  Rest of arguments match
!    remap_predicate.  */
  
  static void
! remap_edge_predicates (struct cgraph_node *node,
  		       struct inline_summary *info,
  		       struct inline_summary *callee_info,
  		       VEC (int, heap) *operand_map,
--- 2327,2384 ----
      inline_edge_summary (e)->loop_depth += depth;
  }
  
+ /* Update change_prob of EDGE after INLINED_EDGE has been inlined.
+    When functoin A is inlined in B and A calls C with parameter that
+    changes with probability PROB1 and C is known to be passthroug
+    of argument if B that change with probability PROB2, the probability
+    of change is now PROB1*PROB2.  */
+ 
+ static void
+ remap_edge_change_prob (struct cgraph_edge *inlined_edge,
+ 			struct cgraph_edge *edge)
+ {
+   if (ipa_node_params_vector)
+     {
+       int i;
+       struct ipa_edge_args *args = IPA_EDGE_REF (edge);
+       struct inline_edge_summary *es = inline_edge_summary (edge);
+       struct inline_edge_summary *inlined_es
+ 				    = inline_edge_summary (inlined_edge);
+ 
+       for (i = 0; i < ipa_get_cs_argument_count (args); i++)
+ 	{
+ 	  struct ipa_jump_func *jfunc = ipa_get_ith_jump_func (args, i);
+ 	  if (jfunc->type == IPA_JF_PASS_THROUGH)
+ 	    {
+ 	      int prob1 = VEC_index (inline_param_summary_t,
+ 				     es->param, i)->change_prob;
+ 	      int prob2 = VEC_index
+ 			     (inline_param_summary_t,
+ 			     inlined_es->param,
+ 			     jfunc->value.pass_through.formal_id)->change_prob;
+ 	      int prob = ((prob1 * prob2 + REG_BR_PROB_BASE / 2)
+ 			  / REG_BR_PROB_BASE);
+ 
+ 	      if (prob1 && prob2 && !prob)
+ 		prob = 1;
+ 
+ 	      VEC_index (inline_param_summary_t,
+ 			 es->param, i)->change_prob = prob;
+ 	    }
+ 	}
+   }
+ }
+ 
+ /* Update edge summaries of NODE after INLINED_EDGE has been inlined.
  
!    Remap predicates of callees of NODE.  Rest of arguments match
!    remap_predicate.
! 
!    Also update change probabilities.  */
  
  static void
! remap_edge_summaries  (struct cgraph_edge *inlined_edge,
! 		       struct cgraph_node *node,
  		       struct inline_summary *info,
  		       struct inline_summary *callee_info,
  		       VEC (int, heap) *operand_map,
*************** remap_edge_predicates (struct cgraph_nod
*** 2074,2090 ****
      {
        struct inline_edge_summary *es = inline_edge_summary (e);
        struct predicate p;
        if (e->inline_failed)
  	{
  	  if (es->predicate)
  	    {
  	      p = remap_predicate (info, callee_info,
  				   es->predicate, operand_map, possible_truths,
  				   toplev_predicate);
  	      edge_set_predicate (e, &p);
! 	      /* TODO: We should remove the edge for code that will be optimized out,
! 		 but we need to keep verifiers and tree-inline happy.
! 		 Make it cold for now.  */
  	      if (false_predicate_p (&p))
  		{
  		  e->count = 0;
--- 2390,2409 ----
      {
        struct inline_edge_summary *es = inline_edge_summary (e);
        struct predicate p;
+ 
        if (e->inline_failed)
  	{
+ 	  remap_edge_change_prob (inlined_edge, e);
+ 
  	  if (es->predicate)
  	    {
  	      p = remap_predicate (info, callee_info,
  				   es->predicate, operand_map, possible_truths,
  				   toplev_predicate);
  	      edge_set_predicate (e, &p);
! 	      /* TODO: We should remove the edge for code that will be
! 		 optimized out, but we need to keep verifiers and tree-inline
! 		 happy.  Make it cold for now.  */
  	      if (false_predicate_p (&p))
  		{
  		  e->count = 0;
*************** remap_edge_predicates (struct cgraph_nod
*** 2095,2115 ****
  	    edge_set_predicate (e, toplev_predicate);
  	}
        else
! 	remap_edge_predicates (e->callee, info, callee_info, operand_map,
! 			       possible_truths, toplev_predicate);
      }
    for (e = node->indirect_calls; e; e = e->next_callee)
      {
        struct inline_edge_summary *es = inline_edge_summary (e);
        struct predicate p;
        if (es->predicate)
  	{
  	  p = remap_predicate (info, callee_info,
  			       es->predicate, operand_map, possible_truths,
  			       toplev_predicate);
  	  edge_set_predicate (e, &p);
! 	  /* TODO: We should remove the edge for code that will be optimized out,
! 	     but we need to keep verifiers and tree-inline happy.
  	     Make it cold for now.  */
  	  if (false_predicate_p (&p))
  	    {
--- 2414,2436 ----
  	    edge_set_predicate (e, toplev_predicate);
  	}
        else
! 	remap_edge_summaries (inlined_edge, e->callee, info, callee_info,
! 			      operand_map, possible_truths, toplev_predicate);
      }
    for (e = node->indirect_calls; e; e = e->next_callee)
      {
        struct inline_edge_summary *es = inline_edge_summary (e);
        struct predicate p;
+ 
+       remap_edge_change_prob (inlined_edge, e);
        if (es->predicate)
  	{
  	  p = remap_predicate (info, callee_info,
  			       es->predicate, operand_map, possible_truths,
  			       toplev_predicate);
  	  edge_set_predicate (e, &p);
! 	  /* TODO: We should remove the edge for code that will be optimized
! 	     out, but we need to keep verifiers and tree-inline happy.
  	     Make it cold for now.  */
  	  if (false_predicate_p (&p))
  	    {
*************** inline_merge_summary (struct cgraph_edge
*** 2171,2184 ****
        struct predicate p = remap_predicate (info, callee_info,
  					    &e->predicate, operand_map, clause,
  					    &toplev_predicate);
!       gcov_type add_time = ((gcov_type)e->time * edge->frequency
! 			    + CGRAPH_FREQ_BASE / 2) / CGRAPH_FREQ_BASE;
!       if (add_time > MAX_TIME)
! 	add_time = MAX_TIME;
!       account_size_time (info, e->size, add_time, &p);
      }
!   remap_edge_predicates (edge->callee, info, callee_info, operand_map,
! 			 clause, &toplev_predicate);
    info->size = 0;
    info->time = 0;
    for (i = 0; VEC_iterate (size_time_entry, info->entry, i, e); i++)
--- 2492,2518 ----
        struct predicate p = remap_predicate (info, callee_info,
  					    &e->predicate, operand_map, clause,
  					    &toplev_predicate);
!       if (!false_predicate_p (&p))
! 	{
! 	  gcov_type add_time = ((gcov_type)e->time * edge->frequency
! 				+ CGRAPH_FREQ_BASE / 2) / CGRAPH_FREQ_BASE;
! 	  int prob = predicate_probability (callee_info->conds,
! 					    &e->predicate,
! 					    clause, es->param);
! 	  add_time = add_time * prob / REG_BR_PROB_BASE;
! 	  if (add_time > MAX_TIME * INLINE_TIME_SCALE)
! 	    add_time = MAX_TIME * INLINE_TIME_SCALE;
! 	  if (prob != REG_BR_PROB_BASE
! 	      && dump_file && (dump_flags & TDF_DETAILS))
! 	    {
! 	      fprintf (dump_file, "\t\tScaling time by probability:%f\n",
! 		       (double)prob / REG_BR_PROB_BASE);
! 	    }
! 	  account_size_time (info, e->size, add_time, &p);
! 	}
      }
!   remap_edge_summaries (edge, edge->callee, info, callee_info, operand_map,
! 			clause, &toplev_predicate);
    info->size = 0;
    info->time = 0;
    for (i = 0; VEC_iterate (size_time_entry, info->entry, i, e); i++)
*************** inline_merge_summary (struct cgraph_edge
*** 2191,2196 ****
--- 2525,2532 ----
  
    /* We do not maintain predicates of inlined edges, free it.  */
    edge_set_predicate (edge, &true_p);
+   /* Similarly remove param summaries.  */
+   VEC_free (inline_param_summary_t, heap, es->param);
  
    info->time = (info->time + INLINE_TIME_SCALE / 2) / INLINE_TIME_SCALE;
    info->size = (info->size + INLINE_SIZE_SCALE / 2) / INLINE_SIZE_SCALE;
*************** do_estimate_edge_time (struct cgraph_edg
*** 2213,2221 ****
    struct inline_edge_summary *es = inline_edge_summary (edge);
  
    gcc_checking_assert (edge->inline_failed);
!   estimate_node_size_and_time (cgraph_function_or_thunk_node (edge->callee, NULL),
  			       evaluate_conditions_for_edge (edge, true),
! 			       &size, &time);
  
    ret = (((gcov_type)time
  	   - es->call_stmt_time) * edge->frequency
--- 2549,2558 ----
    struct inline_edge_summary *es = inline_edge_summary (edge);
  
    gcc_checking_assert (edge->inline_failed);
!   estimate_node_size_and_time (cgraph_function_or_thunk_node (edge->callee,
! 							      NULL),
  			       evaluate_conditions_for_edge (edge, true),
! 			       &size, &time, es->param);
  
    ret = (((gcov_type)time
  	   - es->call_stmt_time) * edge->frequency
*************** do_estimate_edge_growth (struct cgraph_e
*** 2267,2273 ****
    gcc_checking_assert (edge->inline_failed);
    estimate_node_size_and_time (callee,
  			       evaluate_conditions_for_edge (edge, true),
! 			       &size, NULL);
    gcc_checking_assert (inline_edge_summary (edge)->call_stmt_size);
    return size - inline_edge_summary (edge)->call_stmt_size;
  }
--- 2604,2610 ----
    gcc_checking_assert (edge->inline_failed);
    estimate_node_size_and_time (callee,
  			       evaluate_conditions_for_edge (edge, true),
! 			       &size, NULL, NULL);
    gcc_checking_assert (inline_edge_summary (edge)->call_stmt_size);
    return size - inline_edge_summary (edge)->call_stmt_size;
  }
*************** read_inline_edge_summary (struct lto_inp
*** 2475,2486 ****
--- 2812,2832 ----
  {
    struct inline_edge_summary *es = inline_edge_summary (e);
    struct predicate p;
+   int length, i;
  
    es->call_stmt_size = streamer_read_uhwi (ib);
    es->call_stmt_time = streamer_read_uhwi (ib);
    es->loop_depth = streamer_read_uhwi (ib);
    p = read_predicate (ib);
    edge_set_predicate (e, &p);
+   length = streamer_read_uhwi (ib);
+   if (length)
+     {
+       VEC_safe_grow_cleared (inline_param_summary_t, heap, es->param, length);
+       for (i = 0; i < length; i++)
+ 	VEC_index (inline_param_summary_t, es->param, i)->change_prob
+ 	  = streamer_read_uhwi (ib);
+     }
  }
  
  
*************** inline_read_summary (void)
*** 2579,2591 ****
    while ((file_data = file_data_vec[j++]))
      {
        size_t len;
!       const char *data = lto_get_section_data (file_data, LTO_section_inline_summary, NULL, &len);
        if (data)
          inline_read_section (file_data, data, len);
        else
! 	/* Fatal error here.  We do not want to support compiling ltrans units with
! 	   different version of compiler or different flags than the WPA unit, so
! 	   this should never happen.  */
  	fatal_error ("ipa inline summary is missing in input file");
      }
    if (optimize)
--- 2925,2939 ----
    while ((file_data = file_data_vec[j++]))
      {
        size_t len;
!       const char *data = lto_get_section_data (file_data,
! 					       LTO_section_inline_summary,
! 					       NULL, &len);
        if (data)
          inline_read_section (file_data, data, len);
        else
! 	/* Fatal error here.  We do not want to support compiling ltrans units
! 	   with different version of compiler or different flags than the WPA
! 	   unit, so this should never happen.  */
  	fatal_error ("ipa inline summary is missing in input file");
      }
    if (optimize)
*************** static void
*** 2621,2630 ****
--- 2969,2984 ----
  write_inline_edge_summary (struct output_block *ob, struct cgraph_edge *e)
  {
    struct inline_edge_summary *es = inline_edge_summary (e);
+   int i;
+ 
    streamer_write_uhwi (ob, es->call_stmt_size);
    streamer_write_uhwi (ob, es->call_stmt_time);
    streamer_write_uhwi (ob, es->loop_depth);
    write_predicate (ob, es->predicate);
+   streamer_write_uhwi (ob, VEC_length (inline_param_summary_t, es->param));
+   for (i = 0; i < (int)VEC_length (inline_param_summary_t, es->param); i++)
+     streamer_write_uhwi (ob, VEC_index (inline_param_summary_t,
+ 				        es->param, i)->change_prob);
  }
  
  

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: Extend ipa-inline-analysis to recognize functoin/loop invariant parameters
  2011-09-23 20:34 Extend ipa-inline-analysis to recognize functoin/loop invariant parameters Jan Hubicka
@ 2011-09-26 18:54 ` H.J. Lu
  0 siblings, 0 replies; 2+ messages in thread
From: H.J. Lu @ 2011-09-26 18:54 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches

On Fri, Sep 23, 2011 at 10:34 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
> Hi,
> this patch extends inliner's predicate infrastructure to track optimization
> oppurtunities that depends on properties of call site parameters that are
> not readilly available (and do not belong to) jump functions.
>
> For this new inline_param_summary is introduced that is filled in
> estimate_function_body_sizes used at predicate evaulation time and maintained
> over inlining and cloning.
>
> I now use it to track probability that parameter will change.  Probability
> of 0 means invaraint values, while probabilities in range 1...REG_BR_PROB_BASE
> means that parameter will be recomputed in between function calls with
> the known probability.
>
> The motivation for it comes from testcases like c-ray raytracer, where function
> raysphere has two parameters (ray and sphere). It is called from loop that for
> given ray iterates across all spheres in the scene.
> Large benefits from inlining raysphere are due to fact that good portion of the
> function is loop invariant and thus it is subsequentely lifted out of the loop
> by PRE.
>
> The patch splits NOT_CONSTANT predicate into NOT_CONSTANT and CHANGED.
> NOT_CONSTANT is true always when parameters is not a know IP invaraint (in
> sense of jump functions), while CHANGED is true when parameter change
> probaiblity is >= 1 (i.e. it is not function level invariant). NOT_CONSTANT is
> used when evaulating builtin_constant_p that is never true for loop invariant.
> CHANGED is used when deciding whether given statment will be executed per
> invocation of the particula rfunction call.  Moreover predicates can now be
> evaulated into execution probablity that correspond to the probability of
> parameter change.
>
> As implemented now, the patch is good enough to analyze c-ray and work out that
> raysphere is cool to inline, it however won't inline it because of the hard
> size limits.
>
> Some guesswork is obviously needed. Most magic is in param_change_prob and the
> way it handles non-SSA parameters where it is quite non-obvious how much of code
> motion will be possible after inlining.  Also predicate_probability can really
> only evaulate simple predicated like (op0 changed) into op0's change probability.
> It makes simple estiamtes so (op0 changed || op1 changed) is maximum of op0 and op1
> change probability while (op0 changed && op1 changed) is minimum.
> The second is quite rare, so this should work relatively well in practice.
>
> There are other simple things I would like to add incrementally. I.e. tracking
> whether parameter is address of automatic variable (and thus inlining likely
> enables SRA).  This should help C++ inlining and move us closer to be able to
> analyze fortran testcases.  Fortran is particularly difficult because most of
> stuff is passed by reference and we will probably need to introduce jump functions
> for values passed in aggregates and values passed by reference to understand
> things like array descriptors.
>
> The patch has little effect on spec2000 or our C++ benchmark suite at least
> with current inline limits. Most tests are small enough so we usully do all
> inlining allowed and thus saner priority queue makes little difference.  It
> however helps some of Mozilla inner loops and increase amount of inlining done
> there by about 8% while keeping same code size.
>
> Bootstrapped/regtsted x86_64-linux.
>
> Honza
>
>        * gcc.dg/ipa/inline-1.c: new testcase.
>        * gcc.dg/ipa/inline-2.c: new testcase.
>        * gcc.dg/ipa/inline-3.c: new testcase.
>        * gcc.dg/ipa/inline-4.c: new testcase.
>
>        * ipa-inline-transform.c (inline_call): Add comment.
>        * ipa-inline.h (inline_param_summary): New structure and vector.
>        (struct inline_edge_summary): Add param field.
>        * ipa-inline-analysis.c (CHANGED): New constant.
>        (add_clause): Handle CHANGED and NOT_CONSTANT.
>        (predicate_probability): New function.
>        (dump_condition): Dump CHANGED predicate.
>        (evaluate_conditions_for_known_args): Handle ERROR_MARK as marker
>        of unknown function wide invariant.
>        (evaluate_conditions_for_edge): Handle change probabilities.
>        (inline_edge_duplication_hook): Copy param summaries.
>        (inline_edge_removal_hook): Free param summaries.
>        (dump_inline_edge_summary): Fix dumping of indirect edges and callee sizes;
>        dump param summaries.
>        (will_be_nonconstant_predicate): Use CHANGED predicate.
>        (record_modified_bb_info): New structure.
>        (record_modified): New function.
>        (param_change_prob): New function.
>        (estimate_function_body_sizes): Compute param summaries.
>        (estimate_edge_size_and_time): Add probability argument.
>        (estimate_node_size_and_time): Add inline_param_summary argument;
>        handle predicate probabilities.
>        (remap_predicate): Fix formating.
>        (remap_edge_change_prob): New function.
>        (remap_edge_summaries): Rename from ...; use remap_edge_change_prob.
>        (remap_edge_predicates): ... this one.
>        (inline_merge_summary): Remap edge summaries; handle predicate probabilities;
>        remove param summaries after we are done.
>        (do_estimate_edge_time): Update.
>        (do_estimate_edge_growth): Update.
>        (read_inline_edge_summary): Read param info.
>        (inline_read_summary): Fix formating.
>        (write_inline_edge_summary): Write param summaries.

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50528

-- 
H.J.

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2011-09-26 17:47 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-09-23 20:34 Extend ipa-inline-analysis to recognize functoin/loop invariant parameters Jan Hubicka
2011-09-26 18:54 ` H.J. Lu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).