Re: [PATCH v3 2/3] Add predict_doloop_p target hook

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: "Kewen.Lin" <linkw@linux.ibm.com>
To: Richard Biener <rguenther@suse.de>
Cc: gcc-patches@gcc.gnu.org, segher@kernel.crashing.org,
	       wschmidt@linux.ibm.com, bin.cheng@linux.alibaba.com,
	jakub@redhat.com
Subject: Re: [PATCH v3 2/3] Add predict_doloop_p target hook
Date: Tue, 21 May 2019 05:28:00 -0000	[thread overview]
Message-ID: <5d155218-1231-f1bc-8faa-3851b94bd469@linux.ibm.com> (raw)
In-Reply-To: <alpine.LSU.2.20.1905201122310.10704@zhemvz.fhfr.qr>

on 2019/5/20 ä¸‹åˆ5:28, Richard Biener wrote:
> On Thu, 16 May 2019, linkw@linux.ibm.com wrote:
> 
>> From: Kewen Lin <linkw@linux.ibm.com>
>>
>> Hi,
>>
>> Previous version link:
>> https://gcc.gnu.org/ml/gcc-patches/2019-05/msg00654.html
>>
>> Comparing with the previous version, I moved the generic
>> parts of rs6000 target hook to IVOPTs.  But I still kept
>> the target hook as previous which checks some target
>> specific criteria like innermost, max iteration counts
>> etc, and checks for invalid stmt in loop.  The reason
>> I decided not to move this part to generic is they are
>> not generic enough.
>>
>> 1) For the part of target specific criteria, if we want
>> to put it in generic, we can call the hook
>> targetm.can_use_doloop_p, which requires us to
>> prepare those four parameters, but each target only needs
>> one or two parameters, it means we will evaluate some
>> things which aren't required for that target.  So I'd like
>> to leave this part to target hook.
>> 2) For the other part of target invalid stmt check, as the
>> hook invalid_within_doloop grep data shows, no all targets
>> need to check whether invalid instructions exist in doloop.
>> If we scan all stmts as generic, it can waste time for those
>> targets which don't need to check.  Besides, the scope of
>> the current check on SWITCH in rs6000 hook is wide, later
>> if we want it more exact, we may need to check more stmts
>> instead of single.  To let target hook scan the BBs/stmts
>> by itself is also more flexible.
>>
>> Bootstrapped and regression testing ongoing on powerpc64le.
>>
>> Any more comments?
>>
>> -\f
>> +/* Check whether there are some instructions preventing doloop transformation
>> +   inside loop body, mainly for instructions which are possible to kill CTR.
>> +
>> +   Return true if some invalid insn exits, otherwise return false.  */
>> +
>> +static bool
>> +invalid_insn_for_doloop_p (struct loop *loop)
>> +{
>> +  basic_block *body = get_loop_body (loop);
> 
> You leak memory here.  I still think all of this function should
> be in generic code for now.
> 

Sorry, forgot to reply the part on "this function should be in generic code".
I'll update this, as Segher suggested, make this part as another hook, call 
this hook in generic code.  Any targets which don't require it can disable
it with null re-defintion.


Thanks,
Kewen 

>> +  gimple_stmt_iterator gsi;
>> +
>> +  for (unsigned i = 0; i < loop->num_nodes; i++)
>> +    for (gsi = gsi_start_bb (body[i]); !gsi_end_p (gsi); gsi_next (&gsi))
>> +      {
>> +	gimple *stmt = gsi_stmt (gsi);
>> +	if (is_gimple_call (stmt) && !gimple_call_internal_p (stmt)
>> +	    && !is_inexpensive_builtin (gimple_call_fndecl (stmt)))
>> +	  {
>> +	    if (dump_file && (dump_flags & TDF_DETAILS))
>> +	      fprintf (dump_file,
>> +		       "predict doloop failure due to finding call.\n");
>> +	    return true;
>> +	  }
>> +	if (computed_goto_p (stmt))
>> +	  {
>> +	    if (dump_file && (dump_flags & TDF_DETAILS))
>> +	      fprintf (dump_file, "predict doloop failure due to"
>> +				  "finding computed jump.\n");
>> +	    return true;
>> +	  }
>> +
>> +	/* TODO: Now this hook is expected to be called in ivopts, which is
>> +	   before switchlower1/switchlower2.  Checking for SWITCH at this point
>> +	   will eliminate some good candidates.  But since there are only a few
>> +	   cases having _a_ switch statement in the innermost loop, it can be a
>> +	   low priority enhancement.  */
>> +	if (gimple_code (stmt) == GIMPLE_SWITCH)
>> +	  {
>> +	    if (dump_file && (dump_flags & TDF_DETAILS))
>> +	      fprintf (dump_file,
>> +		       "predict doloop failure due to finding switch.\n");
>> +	    return true;
>> +	  }
>> +      }
>> +
>> +  return false;
>> +}
>> +
>> +/* Predict whether the given loop in gimple will be transformed in the RTL
>> +   doloop_optimize pass.  This is for rs6000 target specific.  */
>> +
>> +static bool
>> +rs6000_predict_doloop_p (struct loop *loop)
>> +{
>> +  gcc_assert (loop);
>> +
>> +  /* On rs6000, targetm.can_use_doloop_p is actually
>> +     can_use_doloop_if_innermost.  Just ensure it's innermost.  */
> 
> So the point is that ppc only has a single counter register, right?
> 
> So the better way would be to expose that via a target hook somehow.
> Or simply restrict IVOPTs processing to innermost loops for now.
> 
>> +  if (loop->inner != NULL)
>> +    {
>> +      if (dump_file && (dump_flags & TDF_DETAILS))
>> +	fprintf (dump_file, "predict doloop failure due to"
>> +			    "no innermost.\n");
>> +      return false;
>> +    }
>> +
>> +  /* Similar to doloop_optimize targetm.invalid_within_doloop, check for
>> +     CALL, tablejump, computed_jump.  */
>> +  if (invalid_insn_for_doloop_p (loop))
>> +    return false;
>> +
>> +  return true;
>> +}
>> +
>>  struct gcc_target targetm = TARGET_INITIALIZER;
>>  
>>  #include "gt-rs6000.h"
>> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
>> index 8c8978b..e595b8b 100644
>> --- a/gcc/doc/tm.texi
>> +++ b/gcc/doc/tm.texi
>> @@ -11603,6 +11603,14 @@ function version at run-time for a given set of function versions.
>>  body must be generated.
>>  @end deftypefn
>>  
>> +@deftypefn {Target Hook} bool TARGET_PREDICT_DOLOOP_P (struct loop *@var{loop})
>> +Return true if we can predict it is possible to use low-overhead loops
>> +for a particular loop.  The parameter @var{loop} is a pointer to the loop
>> +which is going to be checked.  This target hook is required only when the
>> +target supports low-overhead loops, and will help some earlier middle-end
>> +passes to make some decisions.
>> +@end deftypefn
>> +
>>  @deftypefn {Target Hook} bool TARGET_CAN_USE_DOLOOP_P (const widest_int @var{&iterations}, const widest_int @var{&iterations_max}, unsigned int @var{loop_depth}, bool @var{entered_at_top})
>>  Return true if it is possible to use low-overhead loops (@code{doloop_end}
>>  and @code{doloop_begin}) for a particular loop.  @var{iterations} gives the
>> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
>> index fe1194e..e047734 100644
>> --- a/gcc/doc/tm.texi.in
>> +++ b/gcc/doc/tm.texi.in
>> @@ -7942,6 +7942,8 @@ to by @var{ce_info}.
>>  
>>  @hook TARGET_GENERATE_VERSION_DISPATCHER_BODY
>>  
>> +@hook TARGET_PREDICT_DOLOOP_P
>> +
>>  @hook TARGET_CAN_USE_DOLOOP_P
>>  
>>  @hook TARGET_INVALID_WITHIN_DOLOOP
>> diff --git a/gcc/target.def b/gcc/target.def
>> index 66cee07..0a80de3 100644
>> --- a/gcc/target.def
>> +++ b/gcc/target.def
>> @@ -4227,6 +4227,15 @@ DEFHOOK
>>  rtx, (machine_mode mode, rtx result, rtx val, rtx failval),
>>   default_speculation_safe_value)
>>   
>> +DEFHOOK
>> +(predict_doloop_p,
>> + "Return true if we can predict it is possible to use low-overhead loops\n\
>> +for a particular loop.  The parameter @var{loop} is a pointer to the loop\n\
>> +which is going to be checked.  This target hook is required only when the\n\
>> +target supports low-overhead loops, and will help some earlier middle-end\n\
>> +passes to make some decisions.",
>> + bool, (struct loop *loop),
>> + default_predict_doloop_p)
>>  
>>  DEFHOOK
>>  (can_use_doloop_p,
>> diff --git a/gcc/targhooks.c b/gcc/targhooks.c
>> index 318f7e9..56ed4cc 100644
>> --- a/gcc/targhooks.c
>> +++ b/gcc/targhooks.c
>> @@ -643,6 +643,19 @@ default_has_ifunc_p (void)
>>    return HAVE_GNU_INDIRECT_FUNCTION;
>>  }
>>  
>> +/* True if we can predict this loop is possible to be transformed to a
>> +   low-overhead loop, otherwise returns false.
>> +
>> +   By default, false is returned, as this hook's applicability should be
>> +   verified for each target.  Target maintainers should re-define the hook
>> +   if the target can take advantage of it.  */
>> +
>> +bool
>> +default_predict_doloop_p (struct loop *loop ATTRIBUTE_UNUSED)
>> +{
>> +  return false;
>> +}
>> +
>>  /* NULL if INSN insn is valid within a low-overhead loop, otherwise returns
>>     an error message.
>>  
>> diff --git a/gcc/targhooks.h b/gcc/targhooks.h
>> index 5943627..70bfb17 100644
>> --- a/gcc/targhooks.h
>> +++ b/gcc/targhooks.h
>> @@ -85,6 +85,7 @@ extern bool default_fixed_point_supported_p (void);
>>  
>>  extern bool default_has_ifunc_p (void);
>>  
>> +extern bool default_predict_doloop_p (struct loop *);
>>  extern const char * default_invalid_within_doloop (const rtx_insn *);
>>  
>>  extern tree default_builtin_vectorized_function (unsigned int, tree, tree);
>> diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
>> index a44b4cb..ed7f2a5 100644
>> --- a/gcc/tree-ssa-loop-ivopts.c
>> +++ b/gcc/tree-ssa-loop-ivopts.c
>> @@ -3776,6 +3776,111 @@ prepare_decl_rtl (tree *expr_p, int *ws, void *data)
>>    return NULL_TREE;
>>  }
>>  
>> +/* Check whether number of iteration computation is too costly for doloop
>> +   transformation.  It expands the gimple sequence to equivalent RTL insn
>> +   sequence, then evaluate the cost.
>> +
>> +   Return true if it's costly, otherwise return false.  */
>> +
>> +static bool
>> +costly_iter_for_doloop_p (struct loop *loop, tree niters)
>> +{
> 
> I already said this but I expect IVOPTs to already cost initial
> value computation of new IVs.  I also doubt it's worth
> going into RTL details here, estimate_num_insns should be
> enough, like via using expression_expensive_p from SCEV analysis
> which is already used by IVOPTs.
> 
> You try to be perfect here but end up adding more
> magic numbers (PARAM_MAX_ITERATIONS_COMPUTATION_COST).
> 
>> +  tree type = TREE_TYPE (niters);
>> +  unsigned cost = 0;
>> +  bool speed = optimize_loop_for_speed_p (loop);
>> +  int regno = LAST_VIRTUAL_REGISTER + 1;
>> +  walk_tree (&niters, prepare_decl_rtl, &regno, NULL);
>> +  start_sequence ();
>> +  expand_expr (niters, NULL_RTX, TYPE_MODE (type), EXPAND_NORMAL);
>> +  rtx_insn *seq = get_insns ();
>> +  end_sequence ();
>> +
>> +  for (; seq; seq = NEXT_INSN (seq))
>> +    {
>> +      if (!INSN_P (seq))
>> +	continue;
>> +      rtx body = PATTERN (seq);
>> +      if (GET_CODE (body) == SET)
>> +	{
>> +	  rtx set_val = XEXP (body, 1);
>> +	  enum rtx_code code = GET_CODE (set_val);
>> +	  enum rtx_class cls = GET_RTX_CLASS (code);
>> +	  /* For now, we only consider these two RTX classes, to match what we
>> +	 get in doloop_optimize, excluding operations like zero/sign extend.  */
>> +	  if (cls == RTX_BIN_ARITH || cls == RTX_COMM_ARITH)
>> +	    cost += set_src_cost (set_val, GET_MODE (set_val), speed);
>> +	}
>> +    }
>> +  unsigned max_cost
>> +    = COSTS_N_INSNS (PARAM_VALUE (PARAM_MAX_ITERATIONS_COMPUTATION_COST));
>> +  if (cost > max_cost)
>> +    return true;
>> +
>> +  return false;
>> +}
>> +
>> +/* Predict whether the given loop will be transformed in the RTL
>> +   doloop_optimize pass.  Attempt to duplicate as many doloop_optimize checks
>> +   as possible.  This is only for target independent checks, see
>> +   targetm.predict_doloop_p for the target dependent ones.
>> +
>> +   Some RTL specific checks seems unable to be checked in gimple, if any new
>> +   checks or easy checks _are_ missing here, please add them.  */
>> +
>> +static bool
>> +generic_predict_doloop_p (struct ivopts_data *data)
>> +{
>> +  struct loop *loop = data->current_loop;
>> +
>> +  /* Call target hook for target dependent checks.  */
>> +  if (!targetm.predict_doloop_p (loop))
>> +    {
>> +      if (dump_file && (dump_flags & TDF_DETAILS))
>> +	fprintf (dump_file, "predict doloop failure due to"
>> +			    "target specific checks.\n");
>> +      return false;
>> +    }
>> +
>> +  /* Similar to doloop_optimize, check iteration description to know it's
>> +     suitable or not.  */
>> +  edge exit = loop_latch_edge (loop);
>> +  struct tree_niter_desc *niter_desc = niter_for_exit (data, exit);
>> +  if (niter_desc == NULL)
>> +    {
>> +      if (dump_file && (dump_flags & TDF_DETAILS))
>> +	fprintf (dump_file, "predict doloop failure due to"
>> +			    "unexpected niters.\n");
>> +      return false;
>> +    }
>> +
>> +  /* Similar to doloop_optimize, check whether iteration count too small
>> +     and not profitable.  */
>> +  HOST_WIDE_INT est_niter = get_estimated_loop_iterations_int (loop);
>> +  if (est_niter == -1)
>> +    est_niter = get_likely_max_loop_iterations_int (loop);
>> +  if (est_niter >= 0 && est_niter < 3)
>> +    {
>> +      if (dump_file && (dump_flags & TDF_DETAILS))
>> +	fprintf (dump_file,
>> +		 "predict doloop failure due to"
>> +		 "too few iterations (%u).\n",
>> +		 (unsigned int) est_niter);
>> +      return false;
>> +    }
>> +
>> +  /* Similar to doloop_optimize, check whether number of iterations too costly
>> +     to compute.  */
>> +  if (costly_iter_for_doloop_p (loop, niter_desc->niter))
>> +    {
>> +      if (dump_file && (dump_flags & TDF_DETAILS))
>> +	fprintf (dump_file, "predict doloop failure due to"
>> +			    "costly niter computation.\n");
>> +      return false;
>> +    }
>> +
>> +  return true;
>> +}
>> +
>>  /* Determines cost of the computation of EXPR.  */
>>  
>>  static unsigned
>>
>

     prev parent reply	other threads:[~2019-05-21  5:28 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-17  3:37 linkw
2019-05-17  5:31 ` Kugan Vivekanandarajah
2019-05-17  6:15   ` Kewen.Lin
2019-05-17  6:57   ` Segher Boessenkool
2019-05-17  6:49 ` Segher Boessenkool
2019-05-17  7:20   ` Kewen.Lin
2019-05-20  9:28 ` Richard Biener
2019-05-20 10:24   ` Segher Boessenkool
2019-05-20 14:44     ` Jeff Law
2019-05-20 16:38       ` Segher Boessenkool
2019-05-21  6:03         ` Kewen.Lin
2019-05-21 10:20           ` Richard Biener
2019-05-21 12:42             ` Segher Boessenkool
2019-05-21 15:22             ` Jeff Law
2019-05-22  2:07             ` Kewen.Lin
2019-06-11  2:39             ` Kewen.Lin
2019-06-11 12:17               ` Richard Biener
2019-06-13  5:50                 ` [PATCH v4 " Kewen.Lin
2019-06-14 21:53                   ` Segher Boessenkool
2019-06-14 23:46                     ` Bill Schmidt
2019-06-17  2:08                       ` Kewen.Lin
2019-06-17  8:51                         ` Richard Biener
2019-06-17  9:39                           ` Kewen.Lin
2019-06-17  9:59                           ` Segher Boessenkool
2019-06-17 12:08                             ` Richard Biener
2019-06-17 13:39                               ` Kewen.Lin
2019-06-17 13:44                                 ` Richard Biener
2019-06-17 14:24                                   ` Kewen.Lin
2019-05-21 11:09           ` [PATCH v3 " Segher Boessenkool
2019-05-21  9:58         ` Richard Biener
2019-05-21  5:50       ` Kewen.Lin
2019-05-21  6:32         ` Bin.Cheng
2019-05-21 10:36         ` Segher Boessenkool
2019-05-21  3:48   ` Kewen.Lin
2019-05-21  5:28   ` Kewen.Lin [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5d155218-1231-f1bc-8faa-3851b94bd469@linux.ibm.com \
    --to=linkw@linux.ibm.com \
    --cc=bin.cheng@linux.alibaba.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=rguenther@suse.de \
    --cc=segher@kernel.crashing.org \
    --cc=wschmidt@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).