public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Continue stmt branch prediction
@ 2008-02-09 20:59 Jan Hubicka
  2008-02-10 14:00 ` Diego Novillo
                   ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Jan Hubicka @ 2008-02-09 20:59 UTC (permalink / raw)
  To: gcc-patches, dnovillo


Hi,
this patch adds a continue branch predictor that was dropped by introduction of
gimplifier (that replaced continue statement by goto).

The idea is that continue statements forms a loops that are usually not looping,
as proper loops are written via the loop construct.  On SPECint2000 average loop
tends to loop 7 times, while average loop constructed via continue loops less
than once.

The prediction is implemented via high level predictors that used also to predict,
for example, that explicit goto statements are not taken (this is helping kernel) 
and can be used for other stuff in other languages too.
I will re-introduce the other predictors by followups later, this patch was
developed while looking into the gzip regression as predicting continue formed
loop to loop a lot is one of problems there:

statement
  if (test1 || test2 || test3 || test4)
    continue;

is in mainline handled as 4 nested loops, each predicted to loop roughly 4 times, 
making all the rest of function cold.

Since the patch itself is not solving the regression fully (it makes improvement
on prelbmk 64bit peak, 920->980, and gzip 64bit, 750->770) but irronically it causes
regression at -O3 score of gzip 32bit by exposing other problems.  So I am holding
it for 4.4.

The branch prediction bits I can approve myself, but I would welcome comment on
introducing PREDICT_EXPR.  This was discussed while ago on GCC summit.  (the
original patch was done at CFG expansion times, but since we was about to run
out of tree codes I holded it and then forgot about the issue completely).

Bootstrapped/regtested i686-linux.

Honza

	* tree-pretty-print.c: Include predict.h.
	(dump_generic_node): Dump predictor.
	* tree.h (PREDICT_EXPR_OUTCOME, PREDICT_EXPR_PREDICTION): Update.
	* tree-gimple.c (is_gimple_stmt): Add PREDICT_EXPR.
	* gimple-low.c (lower_stmt): Likewise.
	* expr.c (expand_expr_real): Likewise.
	* predict.c (tree_bb_level_predictions): Use PREDICT_EXPRs and remove
	them.
	(build_predict_expr, build_predict_expr): New.
	* predict.h (predictor_name, build_predict_expr): Update.
	* c-typeck.c (c_finish_bc_stmt): Add prediction.
	* gimplify.c (gimplify_expr): Add PREDICT_EXPR.
	* predict.def (PRED_CONTINUE): Update hitrate.
	* tree.def (PREDICT_EXPR): Define.
	* tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Mark PREDICT_EXPR;
	do not handle BIND_EXPR.
	* tree-inline.c (estimate_num_insns_1): PREDICT_EXPR is free.
	* tree-cfg.c (verify_gimple_stmt): PREDICT_EXPR is valid.
	* tree-ssa-operands.c (get_expr_operands): PREDICT_EXPR takes no operands.


Index: tree-pretty-print.c
===================================================================
*** tree-pretty-print.c	(revision 132144)
--- tree-pretty-print.c	(working copy)
*************** along with GCC; see the file COPYING3.  
*** 35,40 ****
--- 35,41 ----
  #include "tree-pass.h"
  #include "fixed-value.h"
  #include "value-prof.h"
+ #include "predict.h"
  
  /* Local functions, macros and variables.  */
  static int op_prio (const_tree);
*************** dump_generic_node (pretty_printer *buffe
*** 1586,1591 ****
--- 1587,1602 ----
        is_expr = false;
        break;
  
+     case PREDICT_EXPR:
+       pp_string (buffer, "// predicted ");
+       if (PREDICT_EXPR_OUTCOME (node))
+         pp_string (buffer, "likely by ");
+       else
+         pp_string (buffer, "unlikely by ");
+       pp_string (buffer, predictor_name (PREDICT_EXPR_PREDICTOR (node)));
+       pp_string (buffer, " predictor.");
+       break;
+ 
      case RETURN_EXPR:
        pp_string (buffer, "return");
        op0 = TREE_OPERAND (node, 0);
Index: tree.h
===================================================================
*** tree.h	(revision 132144)
--- tree.h	(working copy)
*************** struct gimple_stmt GTY(())
*** 436,441 ****
--- 436,442 ----
  	   expression.
         CALL_EXPR_TAILCALL in CALL_EXPR
         CASE_LOW_SEEN in CASE_LABEL_EXPR
+        RETURN_EXPR_OUTCOME in RETURN_EXPR
  
     static_flag:
  
*************** extern void omp_clause_range_check_faile
*** 1160,1165 ****
--- 1161,1171 ----
  #define CASE_LOW_SEEN(NODE) \
    (CASE_LABEL_EXPR_CHECK (NODE)->base.addressable_flag)
  
+ #define PREDICT_EXPR_OUTCOME(NODE) \
+   (PREDICT_EXPR_CHECK(NODE)->base.addressable_flag)
+ #define PREDICT_EXPR_PREDICTOR(NODE) \
+   ((enum br_predictor)tree_low_cst (TREE_OPERAND (PREDICT_EXPR_CHECK (NODE), 0), 0))
+ 
  /* In a VAR_DECL, nonzero means allocate static storage.
     In a FUNCTION_DECL, nonzero if function has been defined.
     In a CONSTRUCTOR, nonzero means allocate static storage.
Index: tree-gimple.c
===================================================================
*** tree-gimple.c	(revision 132144)
--- tree-gimple.c	(working copy)
*************** is_gimple_stmt (tree t)
*** 248,253 ****
--- 248,254 ----
  
      case CALL_EXPR:
      case GIMPLE_MODIFY_STMT:
+     case PREDICT_EXPR:
        /* These are valid regardless of their type.  */
        return true;
  
Index: gimple-low.c
===================================================================
*** gimple-low.c	(revision 132144)
--- gimple-low.c	(working copy)
*************** lower_stmt (tree_stmt_iterator *tsi, str
*** 239,244 ****
--- 239,245 ----
      case NOP_EXPR:
      case ASM_EXPR:
      case GOTO_EXPR:
+     case PREDICT_EXPR:
      case LABEL_EXPR:
      case SWITCH_EXPR:
      case CHANGE_DYNAMIC_TYPE_EXPR:
Index: expr.c
===================================================================
*** expr.c	(revision 132144)
--- expr.c	(working copy)
*************** expand_expr_real (tree exp, rtx target, 
*** 7053,7058 ****
--- 7053,7059 ----
  
    /* Handle ERROR_MARK before anybody tries to access its type.  */
    if (TREE_CODE (exp) == ERROR_MARK
+       || TREE_CODE (exp) == PREDICT_EXPR
        || (!GIMPLE_TUPLE_P (exp) && TREE_CODE (TREE_TYPE (exp)) == ERROR_MARK))
      {
        ret = CONST0_RTX (tmode);
Index: predict.c
===================================================================
*** predict.c	(revision 132144)
--- predict.c	(working copy)
*************** static void
*** 1286,1310 ****
  tree_bb_level_predictions (void)
  {
    basic_block bb;
-   int *heads;
  
!   heads = XCNEWVEC (int, last_basic_block);
!   heads[ENTRY_BLOCK_PTR->next_bb->index] = last_basic_block;
! 
!   apply_return_prediction (heads);
  
    FOR_EACH_BB (bb)
      {
        block_stmt_iterator bsi = bsi_last (bb);
  
!       for (bsi = bsi_start (bb); !bsi_end_p (bsi); bsi_next (&bsi))
  	{
  	  tree stmt = bsi_stmt (bsi);
  	  tree decl;
  	  switch (TREE_CODE (stmt))
  	    {
  	      case GIMPLE_MODIFY_STMT:
--- 1290,1307 ----
  tree_bb_level_predictions (void)
  {
    basic_block bb;
  
!   apply_return_prediction ();
  
!   heads = XCNEWVEC (int, last_basic_block);
!   heads[ENTRY_BLOCK_PTR->next_bb->index] = last_basic_block;
! 
!   apply_return_prediction (heads);
    FOR_EACH_BB (bb)
      {
        block_stmt_iterator bsi = bsi_last (bb);
  
!       for (bsi = bsi_start (bb); !bsi_end_p (bsi);)
  	{
  	  tree stmt = bsi_stmt (bsi);
  	  tree decl;
+ 	  bool next = false;
  	  switch (TREE_CODE (stmt))
  	    {
  	      case GIMPLE_MODIFY_STMT:
*************** tree_bb_level_predictions (void)
*** 1313,1336 ****
  	      case CALL_EXPR:
  call_expr:;
  		if (call_expr_flags (stmt) & ECF_NORETURN)
! 		  predict_paths_leading_to (bb, heads, PRED_NORETURN,
  		      			    NOT_TAKEN);
  		decl = get_callee_fndecl (stmt);
  		if (decl
  		    && lookup_attribute ("cold",
  					 DECL_ATTRIBUTES (decl)))
! 		  predict_paths_leading_to (bb, heads, PRED_COLD_FUNCTION,
  		      			    NOT_TAKEN);
  		break;
  	      default:
  		break;
  	    }
  	}
      }
  
    free (heads);
  }
  
  #ifdef ENABLE_CHECKING
--- 1314,1341 ----
  	      case CALL_EXPR:
  call_expr:;
  		if (call_expr_flags (stmt) & ECF_NORETURN)
! 		  predict_paths_leading_to (bb, heads, PRED_NORETURN,
  		      			    NOT_TAKEN);
  		decl = get_callee_fndecl (stmt);
  		if (decl
  		    && lookup_attribute ("cold",
  					 DECL_ATTRIBUTES (decl)))
! 		  predict_paths_leading_to (bb, heads, PRED_COLD_FUNCTION,
  		      			    NOT_TAKEN);
  		break;
+ 	      case PREDICT_EXPR:
+ 		predict_paths_leading_to (bb, heads, PREDICT_EXPR_PREDICTOR (stmt),
+ 		      			  PREDICT_EXPR_OUTCOME (stmt));
+ 		bsi_remove (&bsi, true);
+ 		next = true;
+ 		break;
  	      default:
  		break;
  	    }
+ 	  if (!next)
+ 	    bsi_next (&bsi);
  	}
      }
  }
  
  #ifdef ENABLE_CHECKING
*************** gate_estimate_probability (void)
*** 1937,1942 ****
--- 1930,1950 ----
    return flag_guess_branch_prob;
  }
  
+ /* Build PREDICT_EXPR.  */
+ tree
+ build_predict_expr (enum br_predictor predictor, enum prediction taken)
+ {
+   tree t = build1 (PREDICT_EXPR, NULL_TREE, build_int_cst (NULL, predictor));
+   PREDICT_EXPR_OUTCOME (t) = taken;
+   return t;
+ }
+ 
+ const char *
+ predictor_name (enum br_predictor predictor)
+ {
+   return predictor_info[predictor].name;
+ }
+ 
  struct tree_opt_pass pass_profile = 
  {
    "profile",				/* name */
Index: predict.h
===================================================================
*** predict.h	(revision 132144)
--- predict.h	(working copy)
*************** enum prediction
*** 38,42 ****
--- 38,44 ----
  extern void predict_insn_def (rtx, enum br_predictor, enum prediction);
  extern int counts_to_freqs (void);
  extern void estimate_bb_frequencies (void);
+ extern const char *predictor_name (enum br_predictor);
+ extern tree build_predict_expr (enum br_predictor, enum prediction);
  
  #endif  /* GCC_PREDICT_H */
Index: c-typeck.c
===================================================================
*** c-typeck.c	(revision 132144)
--- c-typeck.c	(working copy)
*************** c_finish_bc_stmt (tree *label_p, bool is
*** 7500,7505 ****
--- 7500,7508 ----
    if (skip)
      return NULL_TREE;
  
+   if (!is_break)
+     add_stmt (build_predict_expr (PRED_CONTINUE, NOT_TAKEN));
+ 
    return add_stmt (build1 (GOTO_EXPR, void_type_node, label));
  }
  
Index: gimplify.c
===================================================================
*** gimplify.c	(revision 132144)
--- gimplify.c	(working copy)
*************** gimplify_expr (tree *expr_p, tree *pre_p
*** 5833,5838 ****
--- 5833,5842 ----
  				 NULL, is_gimple_val, fb_rvalue);
  	  break;
  
+ 	  /* Predictions are always gimplified.  */
+ 	case PREDICT_EXPR:
+ 	  goto out;
+ 
  	case LABEL_EXPR:
  	  ret = GS_ALL_DONE;
  	  gcc_assert (decl_function_context (LABEL_EXPR_LABEL (*expr_p))
Index: predict.def
===================================================================
*** predict.def	(revision 132144)
--- predict.def	(working copy)
*************** DEF_PREDICTOR (PRED_LOOP_ITERATIONS_GUES
*** 66,72 ****
  	       PROB_ALWAYS, PRED_FLAG_FIRST_MATCH)
  
  /* Branch containing goto is probably not taken.  */
! DEF_PREDICTOR (PRED_CONTINUE, "continue", HITRATE (56), 0)
  
  /* Branch to basic block containing call marked by noreturn attribute.  */
  DEF_PREDICTOR (PRED_NORETURN, "noreturn call", HITRATE (99),
--- 66,72 ----
  	       PROB_ALWAYS, PRED_FLAG_FIRST_MATCH)
  
  /* Branch containing goto is probably not taken.  */
! DEF_PREDICTOR (PRED_CONTINUE, "continue", HITRATE (50), 0)
  
  /* Branch to basic block containing call marked by noreturn attribute.  */
  DEF_PREDICTOR (PRED_NORETURN, "noreturn call", HITRATE (99),
Index: tree.def
===================================================================
*** tree.def	(revision 132144)
--- tree.def	(working copy)
*************** DEFTREECODE (VEC_EXTRACT_ODD_EXPR, "vec_
*** 1165,1170 ****
--- 1165,1176 ----
  DEFTREECODE (VEC_INTERLEAVE_HIGH_EXPR, "vec_interleavehigh_expr", tcc_binary, 2)
  DEFTREECODE (VEC_INTERLEAVE_LOW_EXPR, "vec_interleavelow_expr", tcc_binary, 2)
  
+ /* PREDICT_EXPR.  Specify hint for branch prediction.  The
+    PREDICT_EXPR_PREDICTOR specify predictor and PREDICT_EXPR_OUTCOME the
+    outcome (0 for not taken and 1 for taken).  Once the profile is guessed
+    all conditional branches leading to execution paths executing the
+    PREDICT_EXPR will get predicted by the specified predictor.  */
+ DEFTREECODE (PREDICT_EXPR, "predict_expr", tcc_unary, 1)
  /*
  Local variables:
  mode:c
Index: tree-ssa-dce.c
===================================================================
*** tree-ssa-dce.c	(revision 132144)
--- tree-ssa-dce.c	(working copy)
*************** mark_stmt_if_obviously_necessary (tree s
*** 277,283 ****
       can then remove the block and labels.  */
    switch (TREE_CODE (stmt))
      {
!     case BIND_EXPR:
      case LABEL_EXPR:
      case CASE_LABEL_EXPR:
        mark_stmt_necessary (stmt, false);
--- 277,283 ----
       can then remove the block and labels.  */
    switch (TREE_CODE (stmt))
      {
!     case PREDICT_EXPR:
      case LABEL_EXPR:
      case CASE_LABEL_EXPR:
        mark_stmt_necessary (stmt, false);
Index: tree-inline.c
===================================================================
*** tree-inline.c	(revision 132144)
--- tree-inline.c	(working copy)
*************** estimate_num_insns_1 (tree *tp, int *wal
*** 2265,2270 ****
--- 2265,2271 ----
      case COMPLEX_CST:
      case VECTOR_CST:
      case STRING_CST:
+     case PREDICT_EXPR:
        *walk_subtrees = 0;
        return NULL;
  
Index: tree-cfg.c
===================================================================
*** tree-cfg.c	(revision 132144)
--- tree-cfg.c	(working copy)
*************** verify_gimple_stmt (tree stmt)
*** 4065,4070 ****
--- 4065,4071 ----
      case NOP_EXPR:
      case CHANGE_DYNAMIC_TYPE_EXPR:
      case ASM_EXPR:
+     case PREDICT_EXPR:
        return false;
  
      default:
Index: tree-ssa-operands.c
===================================================================
*** tree-ssa-operands.c	(revision 132144)
--- tree-ssa-operands.c	(working copy)
*************** get_expr_operands (tree stmt, tree *expr
*** 2376,2381 ****
--- 2376,2382 ----
      case OMP_RETURN:
      case OMP_SECTION:
      case OMP_SECTIONS_SWITCH:
+     case PREDICT_EXPR:
        /* Expressions that make no memory references.  */
        return;
  

^ permalink raw reply	[flat|nested] 27+ messages in thread
* Re: Continue stmt branch prediction
@ 2008-02-20 22:53 Bradley Lucier
  0 siblings, 0 replies; 27+ messages in thread
From: Bradley Lucier @ 2008-02-20 22:53 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Bradley Lucier, gcc-patches

Re:

>> What are the other predictors?  It might help to understand it  
>> better if we had a better overview of everything PREDICT_EXPR was  
>> going to do.
>
>
> I was mostly looking into following stuff:
>   1) the continue bit
>   2) explicit goto statements are usually not executed
>   3) default path of C switch statement on emum with all named fields
>   explicit is unlikely
>   4) expanded bounds checks (ie arrays tends to be in bounds. We  
> used to
>   have PRED_MUDLFAP here but it was dropped and Java has similar  
> logic)
>   5) expanded openMP constructs
>   6) higher level constructs from non-C frontends we have.  There are
>   cases where we produce control flow expanding stuff like fortran
>   vectors where we have unlikely paths through. Though I didn't look
>   into great detail into this yet.


Honza:

Computer-generated C code as the back end of compilers for other  
languages may implement function calls as gotos (the callee, not the  
caller, controls the argument stack in Scheme, for example, so The  
Gambit Scheme->C compiler generate C code where function calls are  
just goto's with arguments).  It appears that heuristics such as 2  
may make such code perform worse (especially as incorrectly-predicted  
jumps have such a high cost on some processors).

Are there predictors already implemented in gcc based on other  
things, such as control flow graphs?

If gcc does adopt heuristics such as these, I hope they can be  
documented so that people automatically generating C code can either  
exploit them or avoid them as necessary.  (An example is that the  
Gambit Scheme compiler has been generating SSA-style floating-point  
code for years because the original code generator produced code that  
gcc's register allocator didn't handle very well.)

Brad

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2008-03-19 12:57 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-02-09 20:59 Continue stmt branch prediction Jan Hubicka
2008-02-10 14:00 ` Diego Novillo
2008-02-10 15:49   ` Jan Hubicka
2008-02-11  2:51     ` Diego Novillo
2008-02-11  8:21       ` Jan Hubicka
2008-02-11 22:42 ` Andrew MacLeod
2008-02-14 15:31   ` Jan Hubicka
2008-02-14 15:38     ` Richard Guenther
2008-02-14 17:49       ` Jan Hubicka
2008-02-14 18:27         ` Andrew MacLeod
2008-02-14 19:12           ` Jan Hubicka
2008-02-14 15:45     ` Andrew MacLeod
2008-02-14 18:11       ` Jan Hubicka
2008-03-05 18:33   ` Jan Hubicka
2008-03-05 22:10     ` Diego Novillo
2008-03-05 22:28       ` Jan Hubicka
2008-03-12 17:54         ` Diego Novillo
2008-03-13  1:38           ` Jan Hubicka
2008-03-13  1:41             ` Jan Hubicka
2008-03-13  9:21             ` Richard Guenther
2008-03-13 14:12             ` Diego Novillo
2008-03-19  6:33 ` H.J. Lu
2008-03-19 11:25   ` Jan Hubicka
2008-03-19 11:37     ` Richard Guenther
2008-03-19 11:45       ` Jan Hubicka
2008-03-19 13:21         ` Richard Guenther
2008-02-20 22:53 Bradley Lucier

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).