RE: [PATCH 2/3]middle-end: updated niters analysis to handle multiple exits.

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Tamar Christina <Tamar.Christina@arm.com>
To: Richard Biener <rguenther@suse.de>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
	nd <nd@arm.com>, "jlaw@ventanamicro.com" <jlaw@ventanamicro.com>
Subject: RE: [PATCH 2/3]middle-end: updated niters analysis to handle multiple exits.
Date: Wed, 11 Oct 2023 10:54:44 +0000	[thread overview]
Message-ID: <VI1PR08MB5325E118576DC368F1A5D451FFCCA@VI1PR08MB5325.eurprd08.prod.outlook.com> (raw)
In-Reply-To: <nycvar.YFH.7.77.849.2310101105020.5561@jbgna.fhfr.qr>

> -----Original Message-----
> From: Richard Biener <rguenther@suse.de>
> Sent: Tuesday, October 10, 2023 12:14 PM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; jlaw@ventanamicro.com
> Subject: Re: [PATCH 2/3]middle-end: updated niters analysis to handle
> multiple exits.
> 
> On Mon, 2 Oct 2023, Tamar Christina wrote:
> 
> > Hi All,
> >
> > This second part updates niters analysis to be able to analyze any
> > number of exits.  If we have multiple exits we determine the main exit
> > by finding the first counting IV.
> >
> > The change allows the vectorizer to pass analysis for multiple loops,
> > but we later gracefully reject them.  It does however allow us to test
> > if the exit handling is using the right exit everywhere.
> >
> > Additionally since we analyze all exits, we now return all conditions
> > for them and determine which condition belongs to the main exit.
> >
> > The main condition is needed because the vectorizer needs to ignore
> > the main IV condition during vectorization as it will replace it during codegen.
> >
> > To track versioned loops we extend the contract between ifcvt and the
> > vectorizer to store the exit number in aux so that we can match it up again
> during peeling.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-linux-gnu,
> > and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > 	* tree-if-conv.cc (tree_if_conversion): Record exits in aux.
> > 	* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
> Use
> > 	it.
> > 	* tree-vect-loop.cc (vect_get_loop_niters): Determine main exit.
> > 	(vec_init_loop_exit_info): Extend analysis when multiple exits.
> > 	(vect_analyze_loop_form): Record conds and determine main cond.
> > 	(vect_create_loop_vinfo): Extend bookkeeping of conds.
> > 	(vect_analyze_loop): Release conds.
> > 	* tree-vectorizer.h (LOOP_VINFO_LOOP_CONDS,
> > 	LOOP_VINFO_LOOP_IV_COND):  New.
> > 	(struct vect_loop_form_info): Add conds, alt_loop_conds;
> > 	(struct loop_vec_info): Add conds, loop_iv_cond.
> >
> > --- inline copy of patch --
> > diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc index
> >
> 799f071965e5c41eb352b5530cf1d9c7ecf7bf25..3dc2290467797ebbfcef55
> 903531
> > b22829f4fdbd 100644
> > --- a/gcc/tree-if-conv.cc
> > +++ b/gcc/tree-if-conv.cc
> > @@ -3795,6 +3795,13 @@ tree_if_conversion (class loop *loop,
> vec<gimple *> *preds)
> >      }
> >    if (need_to_ifcvt)
> >      {
> > +      /* Before we rewrite edges we'll record their original position in the
> > +	 edge map such that we can map the edges between the ifcvt and the
> > +	 non-ifcvt loop during peeling.  */
> > +      uintptr_t idx = 0;
> > +      for (edge exit : get_loop_exit_edges (loop))
> > +	exit->aux = (void*)idx++;
> > +
> >        /* Now all statements are if-convertible.  Combine all the basic
> >  	 blocks into one huge basic block doing the if-conversion
> >  	 on-the-fly.  */
> > diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> > index
> >
> e06717272aafc6d31cbdcb94840ac25de616da6d..77f8e668bcc8beca99ba4
> 052e1b1
> > 2e0d17300262 100644
> > --- a/gcc/tree-vect-loop-manip.cc
> > +++ b/gcc/tree-vect-loop-manip.cc
> > @@ -1470,6 +1470,18 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class
> loop *loop, edge loop_exit,
> >        scalar_loop = loop;
> >        scalar_exit = loop_exit;
> >      }
> > +  else if (scalar_loop == loop)
> > +    scalar_exit = loop_exit;
> > +  else
> > +    {
> > +      /* Loop has been version, match exits up using the aux index.  */
> > +      for (edge exit : get_loop_exit_edges (scalar_loop))
> > +	if (exit->aux == loop_exit->aux)
> > +	  {
> > +	    scalar_exit	= exit;
> > +	    break;
> > +	  }
> > +    }
> >
> >    bbs = XNEWVEC (basic_block, scalar_loop->num_nodes + 1);
> >    pbbs = bbs + 1;
> > @@ -1501,6 +1513,8 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class
> loop *loop, edge loop_exit,
> >    exit = loop_exit;
> >    basic_block new_preheader = new_bbs[0];
> >
> > +  /* Record the new loop exit information.  new_loop doesn't have SCEV
> data and
> > +     so we must initialize the exit information.  */
> >    if (new_e)
> >      *new_e = new_exit;
> >
> > diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc index
> >
> 6e60d84143626a8e1d801bb580f4dcebc73c7ba7..f1caa5f207d3b13da58c3
> a313b11
> > d1ef98374349 100644
> > --- a/gcc/tree-vect-loop.cc
> > +++ b/gcc/tree-vect-loop.cc
> > @@ -851,79 +851,106 @@ vect_fixup_scalar_cycles_with_patterns
> (loop_vec_info loop_vinfo)
> >     in NUMBER_OF_ITERATIONSM1.  Place the condition under which the
> >     niter information holds in ASSUMPTIONS.
> >
> > -   Return the loop exit condition.  */
> > +   Return the loop exit conditions.  */
> >
> >
> > -static gcond *
> > -vect_get_loop_niters (class loop *loop, edge exit, tree *assumptions,
> > +static vec<gcond *>
> > +vect_get_loop_niters (class loop *loop, tree *assumptions, const_edge
> > +main_exit,
> >  		      tree *number_of_iterations, tree
> *number_of_iterationsm1)
> 
> Any reason you swap exit and main_exit?  IMHO the input better pairs with
> the other input 'loop'.
> 

No, I think I was just rearranging thing to fit more on a line.  I'll put them next
to their exits.

> 
> >  {
> > +  auto_vec<edge> exits = get_loop_exit_edges (loop);  vec<gcond *>
> > + conds;  conds.create (exits.length ());
> >    class tree_niter_desc niter_desc;
> >    tree niter_assumptions, niter, may_be_zero;
> > -  gcond *cond = get_loop_exit_condition (loop);
> >
> >    *assumptions = boolean_true_node;
> >    *number_of_iterationsm1 = chrec_dont_know;
> >    *number_of_iterations = chrec_dont_know;
> > +
> >    DUMP_VECT_SCOPE ("get_loop_niters");
> >
> > -  if (!exit)
> > -    return cond;
> > +  if (exits.is_empty ())
> > +    return conds;
> > +
> > +  if (dump_enabled_p ())
> > +    dump_printf_loc (MSG_NOTE, vect_location, "Loop has %d exits.\n",
> > +		     exits.length ());
> > +
> > +  edge exit;
> > +  unsigned int i;
> > +  FOR_EACH_VEC_ELT (exits, i, exit)
> > +    {
> > +      gcond *cond = get_loop_exit_condition (exit);
> > +      if (cond)
> > +	conds.safe_push (cond);
> > +
> > +      if (dump_enabled_p ())
> > +	dump_printf_loc (MSG_NOTE, vect_location, "Analyzing exit %d...\n",
> > +i);
> >
> > -  may_be_zero = NULL_TREE;
> > -  if (!number_of_iterations_exit_assumptions (loop, exit, &niter_desc,
> NULL)
> > -      || chrec_contains_undetermined (niter_desc.niter))
> > -    return cond;
> > +      may_be_zero = NULL_TREE;
> > +      if (!number_of_iterations_exit_assumptions (loop, exit, &niter_desc,
> NULL)
> > +          || chrec_contains_undetermined (niter_desc.niter))
> > +	continue;
> >
> > -  niter_assumptions = niter_desc.assumptions;
> > -  may_be_zero = niter_desc.may_be_zero;
> > -  niter = niter_desc.niter;
> > +      niter_assumptions = niter_desc.assumptions;
> > +      may_be_zero = niter_desc.may_be_zero;
> > +      niter = niter_desc.niter;
> >
> > -  if (may_be_zero && integer_zerop (may_be_zero))
> > -    may_be_zero = NULL_TREE;
> > +      if (may_be_zero && integer_zerop (may_be_zero))
> > +	may_be_zero = NULL_TREE;
> >
> > -  if (may_be_zero)
> > -    {
> > -      if (COMPARISON_CLASS_P (may_be_zero))
> > +      if (may_be_zero)
> >  	{
> > -	  /* Try to combine may_be_zero with assumptions, this can simplify
> > -	     computation of niter expression.  */
> > -	  if (niter_assumptions && !integer_nonzerop (niter_assumptions))
> > -	    niter_assumptions = fold_build2 (TRUTH_AND_EXPR,
> boolean_type_node,
> > -					     niter_assumptions,
> > -					     fold_build1 (TRUTH_NOT_EXPR,
> > -							  boolean_type_node,
> > -							  may_be_zero));
> > +	  if (COMPARISON_CLASS_P (may_be_zero))
> > +	    {
> > +	      /* Try to combine may_be_zero with assumptions, this can simplify
> > +		 computation of niter expression.  */
> > +	      if (niter_assumptions && !integer_nonzerop (niter_assumptions))
> > +		niter_assumptions = fold_build2 (TRUTH_AND_EXPR,
> boolean_type_node,
> > +						 niter_assumptions,
> > +						 fold_build1
> (TRUTH_NOT_EXPR,
> > +
> boolean_type_node,
> > +							      may_be_zero));
> > +	      else
> > +		niter = fold_build3 (COND_EXPR, TREE_TYPE (niter),
> may_be_zero,
> > +				     build_int_cst (TREE_TYPE (niter), 0),
> > +				     rewrite_to_non_trapping_overflow (niter));
> > +
> > +	      may_be_zero = NULL_TREE;
> > +	    }
> > +	  else if (integer_nonzerop (may_be_zero) && exit == main_exit)
> > +	    {
> > +	      *number_of_iterationsm1 = build_int_cst (TREE_TYPE (niter), 0);
> > +	      *number_of_iterations = build_int_cst (TREE_TYPE (niter), 1);
> > +	      continue;
> > +	    }
> >  	  else
> > -	    niter = fold_build3 (COND_EXPR, TREE_TYPE (niter), may_be_zero,
> > -				 build_int_cst (TREE_TYPE (niter), 0),
> > -				 rewrite_to_non_trapping_overflow (niter));
> > +	    continue;
> > +       }
> >
> > -	  may_be_zero = NULL_TREE;
> > -	}
> > -      else if (integer_nonzerop (may_be_zero))
> > +      /* Loop assumptions are based off the normal exit.  */
> > +      if (exit == main_exit)
> 
> It's a bit hard to follow in patch form but I wonder why you even analyze the
> number of iterations of the non-main exits riskying possibly clobbering the
> *number_* outputs which we later assume to be for the main exit?
> 

My original goal here was that if we can't analyze the other exits, we probably
can't vectorize them. So I don't really need the results but I thought it useful to
check.  I can skip them.

Thanks,
Tamar

> >  	{
> > -	  *number_of_iterationsm1 = build_int_cst (TREE_TYPE (niter), 0);
> > -	  *number_of_iterations = build_int_cst (TREE_TYPE (niter), 1);
> > -	  return cond;
> > +	  *assumptions = niter_assumptions;
> > +	  *number_of_iterationsm1 = niter;
> > +
> > +	  /* We want the number of loop header executions which is the
> number
> > +	     of latch executions plus one.
> > +	     ???  For UINT_MAX latch executions this number overflows to zero
> > +	     for loops like do { n++; } while (n != 0);  */
> > +	  if (niter && !chrec_contains_undetermined (niter))
> > +	    niter = fold_build2 (PLUS_EXPR, TREE_TYPE (niter),
> > +				 unshare_expr (niter),
> > +				 build_int_cst (TREE_TYPE (niter), 1));
> > +	  *number_of_iterations = niter;
> >  	}
> > -      else
> > -	return cond;
> >      }
> >
> > -  *assumptions = niter_assumptions;
> > -  *number_of_iterationsm1 = niter;
> > -
> > -  /* We want the number of loop header executions which is the number
> > -     of latch executions plus one.
> > -     ???  For UINT_MAX latch executions this number overflows to zero
> > -     for loops like do { n++; } while (n != 0);  */
> > -  if (niter && !chrec_contains_undetermined (niter))
> > -    niter = fold_build2 (PLUS_EXPR, TREE_TYPE (niter), unshare_expr (niter),
> > -			  build_int_cst (TREE_TYPE (niter), 1));
> > -  *number_of_iterations = niter;
> > +  if (dump_enabled_p ())
> > +    dump_printf_loc (MSG_NOTE, vect_location, "All loop exits
> > + successfully analyzed.\n");
> >
> > -  return cond;
> > +  return conds;
> >  }
> >
> >  /*  Determine the main loop exit for the vectorizer.  */ @@ -936,8
> > +963,25 @@ vec_init_loop_exit_info (class loop *loop)
> >    auto_vec<edge> exits = get_loop_exit_edges (loop);
> >    if (exits.length () == 1)
> >      return exits[0];
> > -  else
> > -    return NULL;
> > +
> > +  /* If we have multiple exits we only support counting IV at the moment.
> Analyze
> > +     all exits and return one */
> > +  class tree_niter_desc niter_desc;
> > +  edge candidate = NULL;
> > +  for (edge exit : exits)
> > +    {
> > +      if (!get_loop_exit_condition (exit))
> > +	continue;
> > +
> > +      if (number_of_iterations_exit_assumptions (loop, exit, &niter_desc,
> NULL)
> > +	  && !chrec_contains_undetermined (niter_desc.niter))
> > +	{
> > +	  if (!niter_desc.may_be_zero || !candidate)
> > +	    candidate = exit;
> > +	}
> > +    }
> > +
> > +  return candidate;
> >  }
> >
> >  /* Function bb_in_loop_p
> > @@ -1788,21 +1832,31 @@ vect_analyze_loop_form (class loop *loop,
> vect_loop_form_info *info)
> >  				   "not vectorized: latch block not empty.\n");
> >
> >    /* Make sure the exit is not abnormal.  */
> > -  edge e = single_exit (loop);
> > -  if (e->flags & EDGE_ABNORMAL)
> > +  if (exit_e->flags & EDGE_ABNORMAL)
> >      return opt_result::failure_at (vect_location,
> >  				   "not vectorized:"
> >  				   " abnormal loop exit edge.\n");
> >
> > -  info->loop_cond
> > -    = vect_get_loop_niters (loop, e, &info->assumptions,
> > +  info->conds
> > +    = vect_get_loop_niters (loop, &info->assumptions, exit_e,
> >  			    &info->number_of_iterations,
> >  			    &info->number_of_iterationsm1);
> > -  if (!info->loop_cond)
> > +
> > +  if (info->conds.is_empty ())
> >      return opt_result::failure_at
> >        (vect_location,
> >         "not vectorized: complicated exit condition.\n");
> >
> > +  /* Determine what the primary and alternate exit conds are.  */
> > +  info->alt_loop_conds.create (info->conds.length () - 1);
> > +  for (gcond *cond : info->conds)
> > +    {
> > +      if (exit_e->src != gimple_bb (cond))
> > +	info->alt_loop_conds.quick_push (cond);
> > +      else
> > +	info->loop_cond = cond;
> > +    }
> > +
> 
> IMHO it would be simpler to have the primary exit condition in
> info->conds[0] and the rest after that?  That avoids having two
> arrays and one scalar in vect_loop_form_info.
> 
> >    if (integer_zerop (info->assumptions)
> >        || !info->number_of_iterations
> >        || chrec_contains_undetermined (info->number_of_iterations)) @@
> > -1847,8 +1901,13 @@ vect_create_loop_vinfo (class loop *loop,
> vec_info_shared *shared,
> >    if (!integer_onep (info->assumptions) && !main_loop_info)
> >      LOOP_VINFO_NITERS_ASSUMPTIONS (loop_vinfo) = info->assumptions;
> >
> > -  stmt_vec_info loop_cond_info = loop_vinfo->lookup_stmt
> > (info->loop_cond);
> > -  STMT_VINFO_TYPE (loop_cond_info) = loop_exit_ctrl_vec_info_type;
> > +  for (gcond *cond : info->conds)
> > +    {
> > +      stmt_vec_info loop_cond_info = loop_vinfo->lookup_stmt (cond);
> > +      STMT_VINFO_TYPE (loop_cond_info) = loop_exit_ctrl_vec_info_type;
> > +    }
> > +  LOOP_VINFO_LOOP_CONDS (loop_vinfo).safe_splice
> > + (info->alt_loop_conds);  LOOP_VINFO_LOOP_IV_COND (loop_vinfo) =
> > + info->loop_cond;
> >    LOOP_VINFO_IV_EXIT (loop_vinfo) = info->loop_exit;
> >
> > @@ -3594,7 +3653,11 @@ vect_analyze_loop (class loop *loop,
> vec_info_shared *shared)
> >  			 && LOOP_VINFO_PEELING_FOR_NITER
> (first_loop_vinfo)
> >  			 && !loop->simduid);
> >    if (!vect_epilogues)
> > -    return first_loop_vinfo;
> > +    {
> > +      loop_form_info.conds.release ();
> > +      loop_form_info.alt_loop_conds.release ();
> > +      return first_loop_vinfo;
> > +    }
> 
> I think there's 'inner' where you leak these.  Maybe use auto_vec<> in
> vect_loop_form_info instead?
> 
> Otherwise looks OK.
> 
> Thanks,
> Richard.
> 
> >    /* Now analyze first_loop_vinfo for epilogue vectorization.  */
> >    poly_uint64 lowest_th = LOOP_VINFO_VERSIONING_THRESHOLD
> > (first_loop_vinfo); @@ -3694,6 +3757,9 @@ vect_analyze_loop (class loop
> *loop, vec_info_shared *shared)
> >  			   (first_loop_vinfo->epilogue_vinfos[0]-
> >vector_mode));
> >      }
> >
> > +  loop_form_info.conds.release ();
> > +  loop_form_info.alt_loop_conds.release ();
> > +
> >    return first_loop_vinfo;
> >  }
> >
> > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h index
> >
> afa7a8e30891c782a0e5e3740ecc4377f5a31e54..55b6771b271d5072fa132
> 7d595e1
> > dddb112cfdf6 100644
> > --- a/gcc/tree-vectorizer.h
> > +++ b/gcc/tree-vectorizer.h
> > @@ -882,6 +882,12 @@ public:
> >       we need to peel off iterations at the end to form an epilogue loop.  */
> >    bool peeling_for_niter;
> >
> > +  /* List of loop additional IV conditionals found in the loop.  */
> > + auto_vec<gcond *> conds;
> > +
> > +  /* Main loop IV cond.  */
> > +  gcond* loop_iv_cond;
> > +
> >    /* True if there are no loop carried data dependencies in the loop.
> >       If loop->safelen <= 1, then this is always true, either the loop
> >       didn't have any loop carried data dependencies, or the loop is
> > being @@ -984,6 +990,8 @@ public:
> >  #define LOOP_VINFO_REDUCTION_CHAINS(L)     (L)->reduction_chains
> >  #define LOOP_VINFO_PEELING_FOR_GAPS(L)     (L)->peeling_for_gaps
> >  #define LOOP_VINFO_PEELING_FOR_NITER(L)    (L)->peeling_for_niter
> > +#define LOOP_VINFO_LOOP_CONDS(L)           (L)->conds
> > +#define LOOP_VINFO_LOOP_IV_COND(L)         (L)->loop_iv_cond
> >  #define LOOP_VINFO_NO_DATA_DEPENDENCIES(L) (L)-
> >no_data_dependencies
> >  #define LOOP_VINFO_SCALAR_LOOP(L)	   (L)->scalar_loop
> >  #define LOOP_VINFO_SCALAR_LOOP_SCALING(L)  (L)->scalar_loop_scaling
> > @@ -2373,7 +2381,9 @@ struct vect_loop_form_info
> >    tree number_of_iterations;
> >    tree number_of_iterationsm1;
> >    tree assumptions;
> > +  vec<gcond *> conds;
> >    gcond *loop_cond;
> > +  vec<gcond *> alt_loop_conds;
> >    gcond *inner_loop_cond;
> >    edge loop_exit;
> >  };
> >
> >
> >
> >
> >
> 
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> Nuernberg)

next prev parent reply	other threads:[~2023-10-11 10:54 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-02  7:41 [PATCH 1/3]middle-end: Refactor vectorizer loop conditionals and separate out IV to new variables Tamar Christina
2023-10-02  7:41 ` [PATCH 2/3]middle-end: updated niters analysis to handle multiple exits Tamar Christina
2023-10-10 11:13   ` Richard Biener
2023-10-11 10:54     ` Tamar Christina [this message]
2023-10-11 12:08       ` Richard Biener
2023-10-02  7:42 ` [PATCH 3/3]middle-end: maintain LCSSA throughout loop peeling Tamar Christina
2023-10-10 12:59   ` Richard Biener
2023-10-11 11:16     ` Tamar Christina
2023-10-11 12:09       ` Richard Biener
2023-10-09 13:35 ` [PATCH 1/3]middle-end: Refactor vectorizer loop conditionals and separate out IV to new variables Richard Biener
2023-10-11 10:45   ` Tamar Christina
2023-10-11 12:07     ` Richard Biener

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=VI1PR08MB5325E118576DC368F1A5D451FFCCA@VI1PR08MB5325.eurprd08.prod.outlook.com \
    --to=tamar.christina@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jlaw@ventanamicro.com \
    --cc=nd@arm.com \
    --cc=rguenther@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).