public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: "Kewen.Lin" <linkw@linux.ibm.com>
To: Richard Biener <rguenther@suse.de>
Cc: gcc-patches@gcc.gnu.org, segher@kernel.crashing.org,
	       wschmidt@linux.ibm.com, bin.cheng@linux.alibaba.com,
	jakub@redhat.com
Subject: Re: [PATCH v3 2/3] Add predict_doloop_p target hook
Date: Tue, 21 May 2019 03:48:00 -0000	[thread overview]
Message-ID: <e8c1e96f-b811-5d5c-0bb1-395bdef93850@linux.ibm.com> (raw)
In-Reply-To: <alpine.LSU.2.20.1905201122310.10704@zhemvz.fhfr.qr>

on 2019/5/20 下午5:28, Richard Biener wrote:
> On Thu, 16 May 2019, linkw@linux.ibm.com wrote:
> 
>> From: Kewen Lin <linkw@linux.ibm.com>
>>
>> Hi,
>>
>> Previous version link:
>> https://gcc.gnu.org/ml/gcc-patches/2019-05/msg00654.html
>>
>> Comparing with the previous version, I moved the generic
>> parts of rs6000 target hook to IVOPTs.  But I still kept
>> the target hook as previous which checks some target
>> specific criteria like innermost, max iteration counts
>> etc, and checks for invalid stmt in loop.  The reason
>> I decided not to move this part to generic is they are
>> not generic enough.
>>
>> 1) For the part of target specific criteria, if we want
>> to put it in generic, we can call the hook
>> targetm.can_use_doloop_p, which requires us to
>> prepare those four parameters, but each target only needs
>> one or two parameters, it means we will evaluate some
>> things which aren't required for that target.  So I'd like
>> to leave this part to target hook.
>> 2) For the other part of target invalid stmt check, as the
>> hook invalid_within_doloop grep data shows, no all targets
>> need to check whether invalid instructions exist in doloop.
>> If we scan all stmts as generic, it can waste time for those
>> targets which don't need to check.  Besides, the scope of
>> the current check on SWITCH in rs6000 hook is wide, later
>> if we want it more exact, we may need to check more stmts
>> instead of single.  To let target hook scan the BBs/stmts
>> by itself is also more flexible.
>>
>> Bootstrapped and regression testing ongoing on powerpc64le.
>>
>> Any more comments?
>>
>>  
>> -\f
>> +/* Check whether there are some instructions preventing doloop transformation
>> +   inside loop body, mainly for instructions which are possible to kill CTR.
>> +
>> +   Return true if some invalid insn exits, otherwise return false.  */
>> +
>> +static bool
>> +invalid_insn_for_doloop_p (struct loop *loop)
>> +{
>> +  basic_block *body = get_loop_body (loop);
> 
> You leak memory here.  I still think all of this function should
> be in generic code for now.
> 

Thanks a lot for catching this!  I didn't notice I should free it by myself.

>> +  gimple_stmt_iterator gsi;
>> +
>> +  for (unsigned i = 0; i < loop->num_nodes; i++)
>> +    for (gsi = gsi_start_bb (body[i]); !gsi_end_p (gsi); gsi_next (&gsi))
>> +      {
>> +	gimple *stmt = gsi_stmt (gsi);
>> +	if (is_gimple_call (stmt) && !gimple_call_internal_p (stmt)
>> +	    && !is_inexpensive_builtin (gimple_call_fndecl (stmt)))
>> +	  {
>> +	    if (dump_file && (dump_flags & TDF_DETAILS))
>> +	      fprintf (dump_file,
>> +		       "predict doloop failure due to finding call.\n");
>> +	    return true;
>> +	  }
>> +	if (computed_goto_p (stmt))
>> +	  {
>> +	    if (dump_file && (dump_flags & TDF_DETAILS))
>> +	      fprintf (dump_file, "predict doloop failure due to"
>> +				  "finding computed jump.\n");
>> +	    return true;
>> +	  }
>> +
>> +	/* TODO: Now this hook is expected to be called in ivopts, which is
>> +	   before switchlower1/switchlower2.  Checking for SWITCH at this point
>> +	   will eliminate some good candidates.  But since there are only a few
>> +	   cases having _a_ switch statement in the innermost loop, it can be a
>> +	   low priority enhancement.  */
>> +	if (gimple_code (stmt) == GIMPLE_SWITCH)
>> +	  {
>> +	    if (dump_file && (dump_flags & TDF_DETAILS))
>> +	      fprintf (dump_file,
>> +		       "predict doloop failure due to finding switch.\n");
>> +	    return true;
>> +	  }
>> +      }
>> +
>> +  return false;
>> +}
>> +
>> +/* Predict whether the given loop in gimple will be transformed in the RTL
>> +   doloop_optimize pass.  This is for rs6000 target specific.  */
>> +
>> +static bool
>> +rs6000_predict_doloop_p (struct loop *loop)
>> +{
>> +  gcc_assert (loop);
>> +
>> +  /* On rs6000, targetm.can_use_doloop_p is actually
>> +     can_use_doloop_if_innermost.  Just ensure it's innermost.  */
> 
> So the point is that ppc only has a single counter register, right?
> 
> So the better way would be to expose that via a target hook somehow.
> Or simply restrict IVOPTs processing to innermost loops for now.
> 
It's to answer whether the target supports doloop or not.  I guess
to check innermost is stricter and simpler.  There is another target 
hook invalid_within_doloop to check counter register clobber.

>> +  if (loop->inner != NULL)
>> +    {
>> +      if (dump_file && (dump_flags & TDF_DETAILS))
>> +	fprintf (dump_file, "predict doloop failure due to"
>> +			    "no innermost.\n");
>> +      return false;
>> +    }
>> +
>> +  /* Similar to doloop_optimize targetm.invalid_within_doloop, check for
>> +     CALL, tablejump, computed_jump.  */
>> +  if (invalid_insn_for_doloop_p (loop))
>> +    return false;
>> +
>> +  return true;
>> +}
>> +
>>  struct gcc_target targetm = TARGET_INITIALIZER;
>>  
>>  #include "gt-rs6000.h"
>> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
>> index 8c8978b..e595b8b 100644
>> --- a/gcc/doc/tm.texi
>> +++ b/gcc/doc/tm.texi
>> @@ -11603,6 +11603,14 @@ function version at run-time for a given set of function versions.
>>  body must be generated.
>>  @end deftypefn
>>  
>> +@deftypefn {Target Hook} bool TARGET_PREDICT_DOLOOP_P (struct loop *@var{loop})
>> +Return true if we can predict it is possible to use low-overhead loops
>> +for a particular loop.  The parameter @var{loop} is a pointer to the loop
>> +which is going to be checked.  This target hook is required only when the
>> +target supports low-overhead loops, and will help some earlier middle-end
>> +passes to make some decisions.
>> +@end deftypefn
>> +
>>  @deftypefn {Target Hook} bool TARGET_CAN_USE_DOLOOP_P (const widest_int @var{&iterations}, const widest_int @var{&iterations_max}, unsigned int @var{loop_depth}, bool @var{entered_at_top})
>>  Return true if it is possible to use low-overhead loops (@code{doloop_end}
>>  and @code{doloop_begin}) for a particular loop.  @var{iterations} gives the
>> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
>> index fe1194e..e047734 100644
>> --- a/gcc/doc/tm.texi.in
>> +++ b/gcc/doc/tm.texi.in
>> @@ -7942,6 +7942,8 @@ to by @var{ce_info}.
>>  
>>  @hook TARGET_GENERATE_VERSION_DISPATCHER_BODY
>>  
>> +@hook TARGET_PREDICT_DOLOOP_P
>> +
>>  @hook TARGET_CAN_USE_DOLOOP_P
>>  
>>  @hook TARGET_INVALID_WITHIN_DOLOOP
>> diff --git a/gcc/target.def b/gcc/target.def
>> index 66cee07..0a80de3 100644
>> --- a/gcc/target.def
>> +++ b/gcc/target.def
>> @@ -4227,6 +4227,15 @@ DEFHOOK
>>  rtx, (machine_mode mode, rtx result, rtx val, rtx failval),
>>   default_speculation_safe_value)
>>   
>> +DEFHOOK
>> +(predict_doloop_p,
>> + "Return true if we can predict it is possible to use low-overhead loops\n\
>> +for a particular loop.  The parameter @var{loop} is a pointer to the loop\n\
>> +which is going to be checked.  This target hook is required only when the\n\
>> +target supports low-overhead loops, and will help some earlier middle-end\n\
>> +passes to make some decisions.",
>> + bool, (struct loop *loop),
>> + default_predict_doloop_p)
>>  
>>  DEFHOOK
>>  (can_use_doloop_p,
>> diff --git a/gcc/targhooks.c b/gcc/targhooks.c
>> index 318f7e9..56ed4cc 100644
>> --- a/gcc/targhooks.c
>> +++ b/gcc/targhooks.c
>> @@ -643,6 +643,19 @@ default_has_ifunc_p (void)
>>    return HAVE_GNU_INDIRECT_FUNCTION;
>>  }
>>  
>> +/* True if we can predict this loop is possible to be transformed to a
>> +   low-overhead loop, otherwise returns false.
>> +
>> +   By default, false is returned, as this hook's applicability should be
>> +   verified for each target.  Target maintainers should re-define the hook
>> +   if the target can take advantage of it.  */
>> +
>> +bool
>> +default_predict_doloop_p (struct loop *loop ATTRIBUTE_UNUSED)
>> +{
>> +  return false;
>> +}
>> +
>>  /* NULL if INSN insn is valid within a low-overhead loop, otherwise returns
>>     an error message.
>>  
>> diff --git a/gcc/targhooks.h b/gcc/targhooks.h
>> index 5943627..70bfb17 100644
>> --- a/gcc/targhooks.h
>> +++ b/gcc/targhooks.h
>> @@ -85,6 +85,7 @@ extern bool default_fixed_point_supported_p (void);
>>  
>>  extern bool default_has_ifunc_p (void);
>>  
>> +extern bool default_predict_doloop_p (struct loop *);
>>  extern const char * default_invalid_within_doloop (const rtx_insn *);
>>  
>>  extern tree default_builtin_vectorized_function (unsigned int, tree, tree);
>> diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
>> index a44b4cb..ed7f2a5 100644
>> --- a/gcc/tree-ssa-loop-ivopts.c
>> +++ b/gcc/tree-ssa-loop-ivopts.c
>> @@ -3776,6 +3776,111 @@ prepare_decl_rtl (tree *expr_p, int *ws, void *data)
>>    return NULL_TREE;
>>  }
>>  
>> +/* Check whether number of iteration computation is too costly for doloop
>> +   transformation.  It expands the gimple sequence to equivalent RTL insn
>> +   sequence, then evaluate the cost.
>> +
>> +   Return true if it's costly, otherwise return false.  */
>> +
>> +static bool
>> +costly_iter_for_doloop_p (struct loop *loop, tree niters)
>> +{
> 
> I already said this but I expect IVOPTs to already cost initial
> value computation of new IVs.  I also doubt it's worth
> going into RTL details here, estimate_num_insns should be
> enough, like via using expression_expensive_p from SCEV analysis
> which is already used by IVOPTs.
> 
> You try to be perfect here but end up adding more
> magic numbers (PARAM_MAX_ITERATIONS_COMPUTATION_COST).
> 

Yes, I noticed there is something used for cost computation.  But it can not 
be used for this niter cost directly.  I tried to make similar thing like 
what we have in doloop_optimize.  The magic number is also from 
doloop_optimize.  I'm fine to use those existing ways, I don't have data for
whether it's not important to be exact.


Thanks
Kewen

>> +  tree type = TREE_TYPE (niters);
>> +  unsigned cost = 0;
>> +  bool speed = optimize_loop_for_speed_p (loop);
>> +  int regno = LAST_VIRTUAL_REGISTER + 1;
>> +  walk_tree (&niters, prepare_decl_rtl, &regno, NULL);
>> +  start_sequence ();
>> +  expand_expr (niters, NULL_RTX, TYPE_MODE (type), EXPAND_NORMAL);
>> +  rtx_insn *seq = get_insns ();
>> +  end_sequence ();
>> +
>> +  for (; seq; seq = NEXT_INSN (seq))
>> +    {
>> +      if (!INSN_P (seq))
>> +	continue;
>> +      rtx body = PATTERN (seq);
>> +      if (GET_CODE (body) == SET)
>> +	{
>> +	  rtx set_val = XEXP (body, 1);
>> +	  enum rtx_code code = GET_CODE (set_val);
>> +	  enum rtx_class cls = GET_RTX_CLASS (code);
>> +	  /* For now, we only consider these two RTX classes, to match what we
>> +	 get in doloop_optimize, excluding operations like zero/sign extend.  */
>> +	  if (cls == RTX_BIN_ARITH || cls == RTX_COMM_ARITH)
>> +	    cost += set_src_cost (set_val, GET_MODE (set_val), speed);
>> +	}
>> +    }
>> +  unsigned max_cost
>> +    = COSTS_N_INSNS (PARAM_VALUE (PARAM_MAX_ITERATIONS_COMPUTATION_COST));
>> +  if (cost > max_cost)
>> +    return true;
>> +
>> +  return false;
>> +}
>> +
>> +/* Predict whether the given loop will be transformed in the RTL
>> +   doloop_optimize pass.  Attempt to duplicate as many doloop_optimize checks
>> +   as possible.  This is only for target independent checks, see
>> +   targetm.predict_doloop_p for the target dependent ones.
>> +
>> +   Some RTL specific checks seems unable to be checked in gimple, if any new
>> +   checks or easy checks _are_ missing here, please add them.  */
>> +
>> +static bool
>> +generic_predict_doloop_p (struct ivopts_data *data)
>> +{
>> +  struct loop *loop = data->current_loop;
>> +
>> +  /* Call target hook for target dependent checks.  */
>> +  if (!targetm.predict_doloop_p (loop))
>> +    {
>> +      if (dump_file && (dump_flags & TDF_DETAILS))
>> +	fprintf (dump_file, "predict doloop failure due to"
>> +			    "target specific checks.\n");
>> +      return false;
>> +    }
>> +
>> +  /* Similar to doloop_optimize, check iteration description to know it's
>> +     suitable or not.  */
>> +  edge exit = loop_latch_edge (loop);
>> +  struct tree_niter_desc *niter_desc = niter_for_exit (data, exit);
>> +  if (niter_desc == NULL)
>> +    {
>> +      if (dump_file && (dump_flags & TDF_DETAILS))
>> +	fprintf (dump_file, "predict doloop failure due to"
>> +			    "unexpected niters.\n");
>> +      return false;
>> +    }
>> +
>> +  /* Similar to doloop_optimize, check whether iteration count too small
>> +     and not profitable.  */
>> +  HOST_WIDE_INT est_niter = get_estimated_loop_iterations_int (loop);
>> +  if (est_niter == -1)
>> +    est_niter = get_likely_max_loop_iterations_int (loop);
>> +  if (est_niter >= 0 && est_niter < 3)
>> +    {
>> +      if (dump_file && (dump_flags & TDF_DETAILS))
>> +	fprintf (dump_file,
>> +		 "predict doloop failure due to"
>> +		 "too few iterations (%u).\n",
>> +		 (unsigned int) est_niter);
>> +      return false;
>> +    }
>> +
>> +  /* Similar to doloop_optimize, check whether number of iterations too costly
>> +     to compute.  */
>> +  if (costly_iter_for_doloop_p (loop, niter_desc->niter))
>> +    {
>> +      if (dump_file && (dump_flags & TDF_DETAILS))
>> +	fprintf (dump_file, "predict doloop failure due to"
>> +			    "costly niter computation.\n");
>> +      return false;
>> +    }
>> +
>> +  return true;
>> +}
>> +
>>  /* Determines cost of the computation of EXPR.  */
>>  
>>  static unsigned
>>
> 

  parent reply	other threads:[~2019-05-21  3:48 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-17  3:37 linkw
2019-05-17  5:31 ` Kugan Vivekanandarajah
2019-05-17  6:15   ` Kewen.Lin
2019-05-17  6:57   ` Segher Boessenkool
2019-05-17  6:49 ` Segher Boessenkool
2019-05-17  7:20   ` Kewen.Lin
2019-05-20  9:28 ` Richard Biener
2019-05-20 10:24   ` Segher Boessenkool
2019-05-20 14:44     ` Jeff Law
2019-05-20 16:38       ` Segher Boessenkool
2019-05-21  6:03         ` Kewen.Lin
2019-05-21 10:20           ` Richard Biener
2019-05-21 12:42             ` Segher Boessenkool
2019-05-21 15:22             ` Jeff Law
2019-05-22  2:07             ` Kewen.Lin
2019-06-11  2:39             ` Kewen.Lin
2019-06-11 12:17               ` Richard Biener
2019-06-13  5:50                 ` [PATCH v4 " Kewen.Lin
2019-06-14 21:53                   ` Segher Boessenkool
2019-06-14 23:46                     ` Bill Schmidt
2019-06-17  2:08                       ` Kewen.Lin
2019-06-17  8:51                         ` Richard Biener
2019-06-17  9:39                           ` Kewen.Lin
2019-06-17  9:59                           ` Segher Boessenkool
2019-06-17 12:08                             ` Richard Biener
2019-06-17 13:39                               ` Kewen.Lin
2019-06-17 13:44                                 ` Richard Biener
2019-06-17 14:24                                   ` Kewen.Lin
2019-05-21 11:09           ` [PATCH v3 " Segher Boessenkool
2019-05-21  9:58         ` Richard Biener
2019-05-21  5:50       ` Kewen.Lin
2019-05-21  6:32         ` Bin.Cheng
2019-05-21 10:36         ` Segher Boessenkool
2019-05-21  3:48   ` Kewen.Lin [this message]
2019-05-21  5:28   ` Kewen.Lin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e8c1e96f-b811-5d5c-0bb1-395bdef93850@linux.ibm.com \
    --to=linkw@linux.ibm.com \
    --cc=bin.cheng@linux.alibaba.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=rguenther@suse.de \
    --cc=segher@kernel.crashing.org \
    --cc=wschmidt@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).