From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Wed, 11 Oct 2023 12:07:09 +0000 (UTC)
From: Richard Biener <rguenther@suse.de>
To: Tamar Christina <Tamar.Christina@arm.com>
cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>, nd <nd@arm.com>, 
    "jlaw@ventanamicro.com" <jlaw@ventanamicro.com>
Subject: RE: [PATCH 1/3]middle-end: Refactor vectorizer loop conditionals
 and separate out IV to new variables
In-Reply-To:  <VI1PR08MB532513DF9E9FB32293146F09FFCCA@VI1PR08MB5325.eurprd08.prod.outlook.com>
Message-ID: <nycvar.YFH.7.77.849.2310111206320.10643@jbgna.fhfr.qr>
References: <patch-17789-tamar@arm.com> <nycvar.YFH.7.77.849.2310091323350.5561@jbgna.fhfr.qr>  <VI1PR08MB532513DF9E9FB32293146F09FFCCA@VI1PR08MB5325.eurprd08.prod.outlook.com>
User-Agent: Alpine 2.22 (LSU 394 2020-01-19)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
List-Id: <gcc-patches.gcc.gnu.org>

On Wed, 11 Oct 2023, Tamar Christina wrote:

> > > @@ -2664,7 +2679,7 @@ slpeel_update_phi_nodes_for_loops
> > (loop_vec_info loop_vinfo,
> > >       for correct vectorization of live stmts.  */
> > >    if (loop == first)
> > >      {
> > > -      basic_block orig_exit = single_exit (second)->dest;
> > > +      basic_block orig_exit = second_loop_e->dest;
> > >        for (gsi_orig = gsi_start_phis (orig_exit);
> > >  	   !gsi_end_p (gsi_orig); gsi_next (&gsi_orig))
> > >  	{
> > > @@ -2673,13 +2688,14 @@ slpeel_update_phi_nodes_for_loops
> > (loop_vec_info loop_vinfo,
> > >  	  if (TREE_CODE (orig_arg) != SSA_NAME || virtual_operand_p
> > (orig_arg))
> > >  	    continue;
> > >
> > > +	  const_edge exit_e = LOOP_VINFO_IV_EXIT (loop_vinfo);
> > >  	  /* Already created in the above loop.   */
> > > -	  if (find_guard_arg (first, second, orig_phi))
> > > +	  if (find_guard_arg (first, second, exit_e, orig_phi))
> > >  	    continue;
> > >
> > >  	  tree new_res = copy_ssa_name (orig_arg);
> > >  	  gphi *lcphi = create_phi_node (new_res, between_bb);
> > > -	  add_phi_arg (lcphi, orig_arg, single_exit (first),
> > UNKNOWN_LOCATION);
> > > +	  add_phi_arg (lcphi, orig_arg, first_loop_e, UNKNOWN_LOCATION);
> > >  	}
> > >      }
> > >  }
> > > @@ -2847,7 +2863,8 @@ slpeel_update_phi_nodes_for_guard2 (class loop
> > *loop, class loop *epilog,
> > >        if (!merge_arg)
> > >  	merge_arg = old_arg;
> > >
> > > -      tree guard_arg = find_guard_arg (loop, epilog, update_phi);
> > > +      tree guard_arg
> > > +	= find_guard_arg (loop, epilog, single_exit (loop), update_phi);
> > 
> > missed adjustment?  you are introducing a single_exit call here ...
> > 
> 
> It's a very temporary one that gets removed in patch 3/3 when I start
> passing the rest of the edges down explicitly. It allowed me to split the
> patches a bit more.

OK, fine.
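
For readers following the series, here is a minimal sketch of the refactoring pattern under discussion: a helper that rediscovers the exit itself (and so only works for single-exit loops) versus one that receives the controlling exit edge from its caller. This is not code from the patch. The `toy_*` types and both `exit_dest_*` helpers are hypothetical stand-ins, not GCC's real `edge`, `basic_block`, or `class loop`.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for CFG types; hypothetical, not the real GCC definitions.
struct toy_block { int index; };
struct toy_edge { toy_block *src; toy_block *dest; };
struct toy_loop { std::vector<toy_edge *> exits; };

// Old style: the callee looks up the exit on its own.  This only has an
// answer when the loop has exactly one exit.
toy_block *
exit_dest_implicit (const toy_loop &loop)
{
  if (loop.exits.size () != 1)
    return nullptr;
  return loop.exits[0]->dest;
}

// New style: the caller has already chosen the controlling exit and
// passes it down, so the callee no longer cares how many exits exist.
toy_block *
exit_dest_explicit (const toy_loop &, const toy_edge *exit_e)
{
  assert (exit_e != nullptr);
  return exit_e->dest;
}
```

With two exits, `exit_dest_implicit` has nothing to return, while `exit_dest_explicit` still works for whichever edge the caller selected. That is why a leftover `single_exit (loop)` call in the middle of the series is only acceptable as a temporary step.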

> > >        /* If the var is live after loop but not a reduction, we simply
> > >  	 use the old arg.  */
> > >        if (!guard_arg)
> > > @@ -3201,27 +3218,37 @@ vect_do_peeling (loop_vec_info loop_vinfo,
> > tree niters, tree nitersm1,
> > >      }
> > >
> > >    if (vect_epilogues)
> > > -    /* Make sure to set the epilogue's epilogue scalar loop, such that we can
> > > -       use the original scalar loop as remaining epilogue if necessary.  */
> > > -    LOOP_VINFO_SCALAR_LOOP (epilogue_vinfo)
> > > -      = LOOP_VINFO_SCALAR_LOOP (loop_vinfo);
> > > +    {
> > > +      /* Make sure to set the epilogue's epilogue scalar loop, such that we can
> > > +	 use the original scalar loop as remaining epilogue if necessary.  */
> > > +      LOOP_VINFO_SCALAR_LOOP (epilogue_vinfo)
> > > +	= LOOP_VINFO_SCALAR_LOOP (loop_vinfo);
> > > +      LOOP_VINFO_SCALAR_IV_EXIT (epilogue_vinfo)
> > > +	= LOOP_VINFO_SCALAR_IV_EXIT (loop_vinfo);
> > > +    }
> > >
> > >    if (prolog_peeling)
> > >      {
> > >        e = loop_preheader_edge (loop);
> > > -      gcc_checking_assert (slpeel_can_duplicate_loop_p (loop, e));
> > > +      edge exit_e = LOOP_VINFO_IV_EXIT (loop_vinfo);
> > > +      gcc_checking_assert (slpeel_can_duplicate_loop_p (loop, exit_e, e));
> > >
> > >        /* Peel prolog and put it on preheader edge of loop.  */
> > > -      prolog = slpeel_tree_duplicate_loop_to_edge_cfg (loop, scalar_loop, e);
> > > +      edge scalar_e = LOOP_VINFO_SCALAR_IV_EXIT (loop_vinfo);
> > > +      edge prolog_e = NULL;
> > > +      prolog = slpeel_tree_duplicate_loop_to_edge_cfg (loop, exit_e,
> > > +						       scalar_loop, scalar_e,
> > > +						       e, &prolog_e);
> > >        gcc_assert (prolog);
> > >        prolog->force_vectorize = false;
> > > -      slpeel_update_phi_nodes_for_loops (loop_vinfo, prolog, loop, true);
> > > +      slpeel_update_phi_nodes_for_loops (loop_vinfo, prolog, prolog_e, loop,
> > > +					 exit_e, true);
> > >        first_loop = prolog;
> > >        reset_original_copy_tables ();
> > >
> > >        /* Update the number of iterations for prolog loop.  */
> > >        tree step_prolog = build_one_cst (TREE_TYPE (niters_prolog));
> > > -      vect_set_loop_condition (prolog, NULL, niters_prolog,
> > > +      vect_set_loop_condition (prolog, prolog_e, loop_vinfo, niters_prolog,
> > >  			       step_prolog, NULL_TREE, false);
> > >
> > >        /* Skip the prolog loop.  */
> > > @@ -3275,8 +3302,8 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree
> > > niters, tree nitersm1,
> > >
> > >    if (epilog_peeling)
> > >      {
> > > -      e = single_exit (loop);
> > > -      gcc_checking_assert (slpeel_can_duplicate_loop_p (loop, e));
> > > +      e = LOOP_VINFO_IV_EXIT (loop_vinfo);
> > > +      gcc_checking_assert (slpeel_can_duplicate_loop_p (loop, e, e));
> > >
> > >        /* Peel epilog and put it on exit edge of loop.  If we are vectorizing
> > >  	 said epilog then we should use a copy of the main loop as a
> > > starting @@ -3285,12 +3312,18 @@ vect_do_peeling (loop_vec_info
> > loop_vinfo, tree niters, tree nitersm1,
> > >  	 If we are not vectorizing the epilog then we should use the scalar loop
> > >  	 as the transformations mentioned above make less or no sense when
> > not
> > >  	 vectorizing.  */
> > > +      edge scalar_e = LOOP_VINFO_SCALAR_IV_EXIT (loop_vinfo);
> > >        epilog = vect_epilogues ? get_loop_copy (loop) : scalar_loop;
> > > -      epilog = slpeel_tree_duplicate_loop_to_edge_cfg (loop, epilog, e);
> > > +      edge epilog_e = vect_epilogues ? e : scalar_e;
> > > +      edge new_epilog_e = NULL;
> > > +      epilog = slpeel_tree_duplicate_loop_to_edge_cfg (loop, e, epilog,
> > > +						       epilog_e, e,
> > > +						       &new_epilog_e);
> > > +      LOOP_VINFO_EPILOGUE_IV_EXIT (loop_vinfo) = new_epilog_e;
> > >        gcc_assert (epilog);
> > > -
> > >        epilog->force_vectorize = false;
> > > -      slpeel_update_phi_nodes_for_loops (loop_vinfo, loop, epilog, false);
> > > +      slpeel_update_phi_nodes_for_loops (loop_vinfo, loop, e, epilog,
> > > +					 new_epilog_e, false);
> > >        bb_before_epilog = loop_preheader_edge (epilog)->src;
> > >
> > >        /* Scalar version loop may be preferred.  In this case, add
> > > guard @@ -3374,16 +3407,16 @@ vect_do_peeling (loop_vec_info
> > loop_vinfo, tree niters, tree nitersm1,
> > >  	{
> > >  	  guard_cond = fold_build2 (EQ_EXPR, boolean_type_node,
> > >  				    niters, niters_vector_mult_vf);
> > > -	  guard_bb = single_exit (loop)->dest;
> > > -	  guard_to = split_edge (single_exit (epilog));
> > > +	  guard_bb = LOOP_VINFO_IV_EXIT (loop_vinfo)->dest;
> > > +	  edge epilog_e = LOOP_VINFO_EPILOGUE_IV_EXIT (loop_vinfo);
> > > +	  guard_to = split_edge (epilog_e);
> > >  	  guard_e = slpeel_add_loop_guard (guard_bb, guard_cond, guard_to,
> > >  					   skip_vector ? anchor : guard_bb,
> > >  					   prob_epilog.invert (),
> > >  					   irred_flag);
> > >  	  if (vect_epilogues)
> > >  	    epilogue_vinfo->skip_this_loop_edge = guard_e;
> > > -	  slpeel_update_phi_nodes_for_guard2 (loop, epilog, guard_e,
> > > -					      single_exit (epilog));
> > > +	  slpeel_update_phi_nodes_for_guard2 (loop, epilog, guard_e, epilog_e);
> > >  	  /* Only need to handle basic block before epilog loop if it's not
> > >  	     the guard_bb, which is the case when skip_vector is true.  */
> > >  	  if (guard_bb != bb_before_epilog)
> > > @@ -3416,6 +3449,8 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree
> > niters, tree nitersm1,
> > >      {
> > >        epilog->aux = epilogue_vinfo;
> > >        LOOP_VINFO_LOOP (epilogue_vinfo) = epilog;
> > > +      LOOP_VINFO_IV_EXIT (epilogue_vinfo)
> > > +	= LOOP_VINFO_EPILOGUE_IV_EXIT (loop_vinfo);
> > >
> > >        loop_constraint_clear (epilog, LOOP_C_INFINITE);
> > >
> > > diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc index
> > >
> > 23c6e8259e7b133cd7acc6bcf0bad26423e9993a..6e60d84143626a8e1d80
> > 1bb580f4
> > > dcebc73c7ba7 100644
> > > --- a/gcc/tree-vect-loop.cc
> > > +++ b/gcc/tree-vect-loop.cc
> > > @@ -855,10 +855,9 @@ vect_fixup_scalar_cycles_with_patterns
> > > (loop_vec_info loop_vinfo)
> > >
> > >
> > >  static gcond *
> > > -vect_get_loop_niters (class loop *loop, tree *assumptions,
> > > +vect_get_loop_niters (class loop *loop, edge exit, tree *assumptions,
> > >  		      tree *number_of_iterations, tree *number_of_iterationsm1)
> > >  {
> > > -  edge exit = single_exit (loop);
> > >    class tree_niter_desc niter_desc;
> > >    tree niter_assumptions, niter, may_be_zero;
> > >    gcond *cond = get_loop_exit_condition (loop);
> > > @@ -927,6 +926,20 @@ vect_get_loop_niters (class loop *loop, tree *assumptions,
> > >    return cond;
> > >  }
> > >
> > > +/*  Determine the main loop exit for the vectorizer.  */
> > > +
> > > +edge
> > 
> > can't this be 'static'?
> 
> No, since it's used by set_uid_loop_bbs, which sets the scalar loop it gets from get_loop.
> 
> If I understand correctly, the loop expected here is the ifcvt loop? If that's the
> case I may be able to match it up through ->aux again, but since set_uid_loop_bbs
> isn't called often I figure I can just re-analyze.

I see.
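
As an illustration of the "just re-analyze" approach (a sketch only, not the patch itself): the selection logic quoted further down can be modelled as a small function that two independent callers invoke, instead of caching the result and matching it up later. All `model_*` names are hypothetical. Only the rule "exactly one exit is the main exit, otherwise fail" is taken from the quoted `vec_init_loop_exit_info`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model types; not GCC's edge, class loop, or loop_vec_info.
struct model_edge { int src_index; int dest_index; };
struct model_loop { std::vector<model_edge> exits; };
struct model_vinfo
{
  const model_edge *iv_exit = nullptr;
  const model_edge *scalar_iv_exit = nullptr;
};

// Same rule as the quoted function: a loop with exactly one exit has a
// main exit; anything else is reported as "could not determine".
const model_edge *
model_init_loop_exit_info (const model_loop &loop)
{
  if (loop.exits.size () == 1)
    return &loop.exits[0];
  return nullptr;
}

// First caller: loop-form analysis, which gives up without a main exit.
bool
model_analyze_loop_form (const model_loop &loop, model_vinfo &vinfo)
{
  const model_edge *exit_e = model_init_loop_exit_info (loop);
  if (!exit_e)
    return false;
  vinfo.iv_exit = exit_e;
  return true;
}

// Second caller: recomputes the exit for the scalar loop on demand,
// which is cheap as long as this path runs rarely.
void
model_set_scalar_loop (const model_loop &scalar_loop, model_vinfo &vinfo)
{
  vinfo.scalar_iv_exit = model_init_loop_exit_info (scalar_loop);
}
```

Because the second caller sits in a different file in the real tree, the function cannot have internal linkage, which is the point of the answer above.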

Richard.

> Regards,
> Tamar
> 
> > 
> > > +vec_init_loop_exit_info (class loop *loop)
> > > +{
> > > +  /* Before we begin we must first determine which exit is the main one and
> > > +     which are auxiliary exits.  */
> > > +  auto_vec<edge> exits = get_loop_exit_edges (loop);
> > > +  if (exits.length () == 1)
> > > +    return exits[0];
> > > +  else
> > > +    return NULL;
> > > +}
> > > +
> > >  /* Function bb_in_loop_p
> > >
> > >     Used as predicate for dfs order traversal of the loop bbs.  */ @@
> > > -987,7 +1000,10 @@ _loop_vec_info::_loop_vec_info (class loop *loop_in,
> > vec_info_shared *shared)
> > >      has_mask_store (false),
> > >      scalar_loop_scaling (profile_probability::uninitialized ()),
> > >      scalar_loop (NULL),
> > > -    orig_loop_info (NULL)
> > > +    orig_loop_info (NULL),
> > > +    vec_loop_iv (NULL),
> > > +    vec_epilogue_loop_iv (NULL),
> > > +    scalar_loop_iv (NULL)
> > >  {
> > >    /* CHECKME: We want to visit all BBs before their successors (except for
> > >       latch blocks, for which this assertion wouldn't hold).  In the
> > > simple @@ -1646,6 +1662,18 @@ vect_analyze_loop_form (class loop
> > > *loop, vect_loop_form_info *info)  {
> > >    DUMP_VECT_SCOPE ("vect_analyze_loop_form");
> > >
> > > +  edge exit_e = vec_init_loop_exit_info (loop);
> > > +  if (!exit_e)
> > > +    return opt_result::failure_at (vect_location,
> > > +				   "not vectorized:"
> > > +				   " could not determine main exit from"
> > > +				   " loop with multiple exits.\n");
> > > +  info->loop_exit = exit_e;
> > > +  if (dump_enabled_p ())
> > > +      dump_printf_loc (MSG_NOTE, vect_location,
> > > +		       "using as main loop exit: %d -> %d [AUX: %p]\n",
> > > +		       exit_e->src->index, exit_e->dest->index, exit_e->aux);
> > > +
> > >    /* Different restrictions apply when we are considering an inner-most loop,
> > >       vs. an outer (nested) loop.
> > >       (FORNOW. May want to relax some of these restrictions in the
> > > future).  */ @@ -1767,7 +1795,7 @@ vect_analyze_loop_form (class loop
> > *loop, vect_loop_form_info *info)
> > >  				   " abnormal loop exit edge.\n");
> > >
> > >    info->loop_cond
> > > -    = vect_get_loop_niters (loop, &info->assumptions,
> > > +    = vect_get_loop_niters (loop, e, &info->assumptions,
> > >  			    &info->number_of_iterations,
> > >  			    &info->number_of_iterationsm1);
> > >    if (!info->loop_cond)
> > > @@ -1821,6 +1849,9 @@ vect_create_loop_vinfo (class loop *loop,
> > > vec_info_shared *shared,
> > >
> > >    stmt_vec_info loop_cond_info = loop_vinfo->lookup_stmt (info->loop_cond);
> > >    STMT_VINFO_TYPE (loop_cond_info) = loop_exit_ctrl_vec_info_type;
> > > +
> > > +  LOOP_VINFO_IV_EXIT (loop_vinfo) = info->loop_exit;
> > > +
> > >    if (info->inner_loop_cond)
> > >      {
> > >        stmt_vec_info inner_loop_cond_info @@ -3063,9 +3094,9 @@
> > > start_over:
> > >        if (dump_enabled_p ())
> > >          dump_printf_loc (MSG_NOTE, vect_location, "epilog loop required\n");
> > >        if (!vect_can_advance_ivs_p (loop_vinfo)
> > > -	  || !slpeel_can_duplicate_loop_p (LOOP_VINFO_LOOP (loop_vinfo),
> > > -					   single_exit (LOOP_VINFO_LOOP
> > > -							 (loop_vinfo))))
> > > +	  || !slpeel_can_duplicate_loop_p (loop,
> > > +					   LOOP_VINFO_IV_EXIT (loop_vinfo),
> > > +					   LOOP_VINFO_IV_EXIT (loop_vinfo)))
> > >          {
> > >  	  ok = opt_result::failure_at (vect_location,
> > >  				       "not vectorized: can't create required "
> > > @@ -6002,7 +6033,7 @@ vect_create_epilog_for_reduction (loop_vec_info
> > loop_vinfo,
> > >           Store them in NEW_PHIS.  */
> > >    if (double_reduc)
> > >      loop = outer_loop;
> > > -  exit_bb = single_exit (loop)->dest;
> > > +  exit_bb = LOOP_VINFO_IV_EXIT (loop_vinfo)->dest;
> > >    exit_gsi = gsi_after_labels (exit_bb);
> > >    reduc_inputs.create (slp_node ? vec_num : ncopies);
> > >    for (unsigned i = 0; i < vec_num; i++) @@ -6018,7 +6049,7 @@
> > > vect_create_epilog_for_reduction (loop_vec_info loop_vinfo,
> > >  	  phi = create_phi_node (new_def, exit_bb);
> > >  	  if (j)
> > >  	    def = gimple_get_lhs (STMT_VINFO_VEC_STMTS (rdef_info)[j]);
> > > -	  SET_PHI_ARG_DEF (phi, single_exit (loop)->dest_idx, def);
> > > +	  SET_PHI_ARG_DEF (phi, LOOP_VINFO_IV_EXIT (loop_vinfo)->dest_idx, def);
> > >  	  new_def = gimple_convert (&stmts, vectype, new_def);
> > >  	  reduc_inputs.quick_push (new_def);
> > >  	}
> > > @@ -10416,12 +10447,12 @@ vectorizable_live_operation (vec_info
> > *vinfo, stmt_vec_info stmt_info,
> > >  	   lhs' = new_tree;  */
> > >
> > >        class loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
> > > -      basic_block exit_bb = single_exit (loop)->dest;
> > > +      basic_block exit_bb = LOOP_VINFO_IV_EXIT (loop_vinfo)->dest;
> > >        gcc_assert (single_pred_p (exit_bb));
> > >
> > >        tree vec_lhs_phi = copy_ssa_name (vec_lhs);
> > >        gimple *phi = create_phi_node (vec_lhs_phi, exit_bb);
> > > -      SET_PHI_ARG_DEF (phi, single_exit (loop)->dest_idx, vec_lhs);
> > > +      SET_PHI_ARG_DEF (phi, LOOP_VINFO_IV_EXIT (loop_vinfo)->dest_idx, vec_lhs);
> > >
> > >        gimple_seq stmts = NULL;
> > >        tree new_tree;
> > > @@ -10965,7 +10996,7 @@ vect_get_loop_len (loop_vec_info loop_vinfo,
> > gimple_stmt_iterator *gsi,
> > >     profile.  */
> > >
> > >  static void
> > > -scale_profile_for_vect_loop (class loop *loop, unsigned vf, bool flat)
> > > +scale_profile_for_vect_loop (class loop *loop, edge exit_e, unsigned vf, bool flat)
> > >  {
> > >    /* For flat profiles do not scale down proportionally by VF and only
> > >       cap by known iteration count bounds.  */ @@ -10980,7 +11011,6 @@
> > > scale_profile_for_vect_loop (class loop *loop, unsigned vf, bool flat)
> > >        return;
> > >      }
> > >    /* Loop body executes VF fewer times and exit increases VF times.
> > > */
> > > -  edge exit_e = single_exit (loop);
> > >    profile_count entry_count = loop_preheader_edge (loop)->count ();
> > >
> > >    /* If we have unreliable loop profile avoid dropping entry @@
> > > -11350,7 +11380,7 @@ vect_transform_loop (loop_vec_info loop_vinfo,
> > > gimple *loop_vectorized_call)
> > >
> > >    /* Make sure there exists a single-predecessor exit bb.  Do this before
> > >       versioning.   */
> > > -  edge e = single_exit (loop);
> > > +  edge e = LOOP_VINFO_IV_EXIT (loop_vinfo);
> > >    if (! single_pred_p (e->dest))
> > >      {
> > >        split_loop_exit_edge (e, true); @@ -11376,7 +11406,7 @@
> > > vect_transform_loop (loop_vec_info loop_vinfo, gimple
> > *loop_vectorized_call)
> > >       loop closed PHI nodes on the exit.  */
> > >    if (LOOP_VINFO_SCALAR_LOOP (loop_vinfo))
> > >      {
> > > -      e = single_exit (LOOP_VINFO_SCALAR_LOOP (loop_vinfo));
> > > +      e = LOOP_VINFO_SCALAR_IV_EXIT (loop_vinfo);
> > >        if (! single_pred_p (e->dest))
> > >  	{
> > >  	  split_loop_exit_edge (e, true);
> > > @@ -11625,8 +11655,9 @@ vect_transform_loop (loop_vec_info
> > loop_vinfo, gimple *loop_vectorized_call)
> > >       a zero NITERS becomes a nonzero NITERS_VECTOR.  */
> > >    if (integer_onep (step_vector))
> > >      niters_no_overflow = true;
> > > -  vect_set_loop_condition (loop, loop_vinfo, niters_vector, step_vector,
> > > -			   niters_vector_mult_vf, !niters_no_overflow);
> > > +  vect_set_loop_condition (loop, LOOP_VINFO_IV_EXIT (loop_vinfo),
> > loop_vinfo,
> > > +			   niters_vector, step_vector, niters_vector_mult_vf,
> > > +			   !niters_no_overflow);
> > >
> > >    unsigned int assumed_vf = vect_vf_for_cost (loop_vinfo);
> > >
> > > @@ -11699,7 +11730,8 @@ vect_transform_loop (loop_vec_info
> > loop_vinfo, gimple *loop_vectorized_call)
> > >  			  assumed_vf) - 1
> > >  	 : wi::udiv_floor (loop->nb_iterations_estimate + bias_for_assumed,
> > >  			   assumed_vf) - 1);
> > > -  scale_profile_for_vect_loop (loop, assumed_vf, flat);
> > > +  scale_profile_for_vect_loop (loop, LOOP_VINFO_IV_EXIT (loop_vinfo),
> > > +			       assumed_vf, flat);
> > >
> > >    if (dump_enabled_p ())
> > >      {
> > > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h index
> > >
> > f1d0cd79961abb095bc79d3b59a81930f0337e59..afa7a8e30891c782a0e5e
> > 3740ecc
> > > 4377f5a31e54 100644
> > > --- a/gcc/tree-vectorizer.h
> > > +++ b/gcc/tree-vectorizer.h
> > > @@ -919,10 +919,24 @@ public:
> > >       analysis.  */
> > >    vec<_loop_vec_info *> epilogue_vinfos;
> > >
> > > +  /* The controlling loop IV for the current loop when vectorizing.  This IV
> > > +     controls the natural exits of the loop.  */
> > > +  edge vec_loop_iv;
> > > +
> > > +  /* The controlling loop IV for the epilogue loop when vectorizing.  This IV
> > > +     controls the natural exits of the loop.  */
> > > +  edge vec_epilogue_loop_iv;
> > > +
> > > +  /* The controlling loop IV for the scalar loop being vectorized.  This IV
> > > +     controls the natural exits of the loop.  */
> > > +  edge scalar_loop_iv;
> > 
> > All of the above sound as if they were IVs, but the access macros have _EXIT at
> > the end; can you name the fields that way as well?
> > 
> > Otherwise looks good to me.
> > 
> > Feel free to push approved patches of the series, no need to wait until
> > everything is approved.
> > 
> > Thanks,
> > Richard.
> > 
> > >  } *loop_vec_info;
> > >
> > >  /* Access Functions.  */
> > >  #define LOOP_VINFO_LOOP(L)                 (L)->loop
> > > +#define LOOP_VINFO_IV_EXIT(L)              (L)->vec_loop_iv
> > > +#define LOOP_VINFO_EPILOGUE_IV_EXIT(L)     (L)->vec_epilogue_loop_iv
> > > +#define LOOP_VINFO_SCALAR_IV_EXIT(L)       (L)->scalar_loop_iv
> > >  #define LOOP_VINFO_BBS(L)                  (L)->bbs
> > >  #define LOOP_VINFO_NITERSM1(L)             (L)->num_itersm1
> > >  #define LOOP_VINFO_NITERS(L)               (L)->num_iters
> > > @@ -2155,11 +2169,13 @@ class auto_purge_vect_location
> > >
> > >  /* Simple loop peeling and versioning utilities for vectorizer's purposes -
> > >     in tree-vect-loop-manip.cc.  */
> > > -extern void vect_set_loop_condition (class loop *, loop_vec_info,
> > > +extern void vect_set_loop_condition (class loop *, edge, loop_vec_info,
> > >  				     tree, tree, tree, bool);
> > > -extern bool slpeel_can_duplicate_loop_p (const class loop *, const_edge);
> > > -class loop *slpeel_tree_duplicate_loop_to_edge_cfg (class loop *,
> > > -						     class loop *, edge);
> > > +extern bool slpeel_can_duplicate_loop_p (const class loop *, const_edge,
> > > +					 const_edge);
> > > +class loop *slpeel_tree_duplicate_loop_to_edge_cfg (class loop *, edge,
> > > +						    class loop *, edge,
> > > +						    edge, edge *);
> > >  class loop *vect_loop_versioning (loop_vec_info, gimple *);  extern
> > > class loop *vect_do_peeling (loop_vec_info, tree, tree,
> > >  				    tree *, tree *, tree *, int, bool, bool, @@ -
> > 2169,6 +2185,7
> > > @@ extern void vect_prepare_for_masked_peels (loop_vec_info);  extern
> > > dump_user_location_t find_loop_location (class loop *);  extern bool
> > > vect_can_advance_ivs_p (loop_vec_info);  extern void
> > > vect_update_inits_of_drs (loop_vec_info, tree, tree_code);
> > > +extern edge vec_init_loop_exit_info (class loop *);
> > >
> > >  /* In tree-vect-stmts.cc.  */
> > >  extern tree get_related_vectype_for_scalar_type (machine_mode, tree,
> > > @@ -2358,6 +2375,7 @@ struct vect_loop_form_info
> > >    tree assumptions;
> > >    gcond *loop_cond;
> > >    gcond *inner_loop_cond;
> > > +  edge loop_exit;
> > >  };
> > >  extern opt_result vect_analyze_loop_form (class loop *,
> > > vect_loop_form_info *);  extern loop_vec_info vect_create_loop_vinfo
> > > (class loop *, vec_info_shared *, diff --git a/gcc/tree-vectorizer.cc
> > > b/gcc/tree-vectorizer.cc index
> > >
> > a048e9d89178a37455bd7b83ab0f2a238a4ce69e..d97e2b54c25ac6037893
> > 5392aa7b
> > > 73476efed74b 100644
> > > --- a/gcc/tree-vectorizer.cc
> > > +++ b/gcc/tree-vectorizer.cc
> > > @@ -943,6 +943,8 @@ set_uid_loop_bbs (loop_vec_info loop_vinfo,
> > gimple *loop_vectorized_call,
> > >    class loop *scalar_loop = get_loop (fun, tree_to_shwi (arg));
> > >
> > >    LOOP_VINFO_SCALAR_LOOP (loop_vinfo) = scalar_loop;
> > > +  LOOP_VINFO_SCALAR_IV_EXIT (loop_vinfo)
> > > +    = vec_init_loop_exit_info (scalar_loop);
> > >    gcc_checking_assert (vect_loop_vectorized_call (scalar_loop)
> > >  		       == loop_vectorized_call);
> > >    /* If we are going to vectorize outer loop, prevent vectorization

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)