From: Richard Biener <rguenther@suse.de>
To: Tamar Christina <tamar.christina@arm.com>
Cc: gcc-patches@gcc.gnu.org, nd@arm.com, jlaw@ventanamicro.com
Subject: Re: [PATCH 2/3]middle-end: updated niters analysis to handle multiple exits.
Date: Tue, 10 Oct 2023 11:13:58 +0000 (UTC) [thread overview]
Message-ID: <nycvar.YFH.7.77.849.2310101105020.5561@jbgna.fhfr.qr> (raw)
In-Reply-To: <ZRp0Q7ovA3FweWk4@arm.com>
On Mon, 2 Oct 2023, Tamar Christina wrote:
> Hi All,
>
> This second part updates niters analysis to be able to analyze any number of
> exits. If we have multiple exits we determine the main exit by finding the
> first counting IV.
>
> The change allows the vectorizer to pass analysis for multiple loops, but we
> later gracefully reject them. It does however allow us to test if the exit
> handling is using the right exit everywhere.
>
> Additionally since we analyze all exits, we now return all conditions for them
> and determine which condition belongs to the main exit.
>
> The main condition is needed because the vectorizer needs to ignore the main IV
> condition during vectorization as it will replace it during codegen.
>
> To track versioned loops we extend the contract between ifcvt and the vectorizer
> to store the exit number in aux so that we can match it up again during peeling.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-linux-gnu, and
> no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> * tree-if-conv.cc (tree_if_conversion): Record exits in aux.
> * tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg): Use
> it.
> * tree-vect-loop.cc (vect_get_loop_niters): Determine main exit.
> (vec_init_loop_exit_info): Extend analysis when multiple exits.
> (vect_analyze_loop_form): Record conds and determine main cond.
> (vect_create_loop_vinfo): Extend bookkeeping of conds.
> (vect_analyze_loop): Release conds.
> * tree-vectorizer.h (LOOP_VINFO_LOOP_CONDS,
> LOOP_VINFO_LOOP_IV_COND): New.
> (struct vect_loop_form_info): Add conds, alt_loop_conds;
> (struct loop_vec_info): Add conds, loop_iv_cond.
>
> --- inline copy of patch --
> diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc
> index 799f071965e5c41eb352b5530cf1d9c7ecf7bf25..3dc2290467797ebbfcef55903531b22829f4fdbd 100644
> --- a/gcc/tree-if-conv.cc
> +++ b/gcc/tree-if-conv.cc
> @@ -3795,6 +3795,13 @@ tree_if_conversion (class loop *loop, vec<gimple *> *preds)
> }
> if (need_to_ifcvt)
> {
> + /* Before we rewrite edges we'll record their original position in the
> + edge map such that we can map the edges between the ifcvt and the
> + non-ifcvt loop during peeling. */
> + uintptr_t idx = 0;
> + for (edge exit : get_loop_exit_edges (loop))
> + exit->aux = (void*)idx++;
> +
> /* Now all statements are if-convertible. Combine all the basic
> blocks into one huge basic block doing the if-conversion
> on-the-fly. */
> diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> index e06717272aafc6d31cbdcb94840ac25de616da6d..77f8e668bcc8beca99ba4052e1b12e0d17300262 100644
> --- a/gcc/tree-vect-loop-manip.cc
> +++ b/gcc/tree-vect-loop-manip.cc
> @@ -1470,6 +1470,18 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
> scalar_loop = loop;
> scalar_exit = loop_exit;
> }
> + else if (scalar_loop == loop)
> + scalar_exit = loop_exit;
> + else
> + {
> + /* Loop has been version, match exits up using the aux index. */
> + for (edge exit : get_loop_exit_edges (scalar_loop))
> + if (exit->aux == loop_exit->aux)
> + {
> + scalar_exit = exit;
> + break;
> + }
> + }
>
> bbs = XNEWVEC (basic_block, scalar_loop->num_nodes + 1);
> pbbs = bbs + 1;
> @@ -1501,6 +1513,8 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
> exit = loop_exit;
> basic_block new_preheader = new_bbs[0];
>
> + /* Record the new loop exit information. new_loop doesn't have SCEV data and
> + so we must initialize the exit information. */
> if (new_e)
> *new_e = new_exit;
>
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 6e60d84143626a8e1d801bb580f4dcebc73c7ba7..f1caa5f207d3b13da58c3a313b11d1ef98374349 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -851,79 +851,106 @@ vect_fixup_scalar_cycles_with_patterns (loop_vec_info loop_vinfo)
> in NUMBER_OF_ITERATIONSM1. Place the condition under which the
> niter information holds in ASSUMPTIONS.
>
> - Return the loop exit condition. */
> + Return the loop exit conditions. */
>
>
> -static gcond *
> -vect_get_loop_niters (class loop *loop, edge exit, tree *assumptions,
> +static vec<gcond *>
> +vect_get_loop_niters (class loop *loop, tree *assumptions, const_edge main_exit,
> tree *number_of_iterations, tree *number_of_iterationsm1)
Any reason you swap exit and main_exit? IMHO the input better pairs with
the other input 'loop'.
> {
> + auto_vec<edge> exits = get_loop_exit_edges (loop);
> + vec<gcond *> conds;
> + conds.create (exits.length ());
> class tree_niter_desc niter_desc;
> tree niter_assumptions, niter, may_be_zero;
> - gcond *cond = get_loop_exit_condition (loop);
>
> *assumptions = boolean_true_node;
> *number_of_iterationsm1 = chrec_dont_know;
> *number_of_iterations = chrec_dont_know;
> +
> DUMP_VECT_SCOPE ("get_loop_niters");
>
> - if (!exit)
> - return cond;
> + if (exits.is_empty ())
> + return conds;
> +
> + if (dump_enabled_p ())
> + dump_printf_loc (MSG_NOTE, vect_location, "Loop has %d exits.\n",
> + exits.length ());
> +
> + edge exit;
> + unsigned int i;
> + FOR_EACH_VEC_ELT (exits, i, exit)
> + {
> + gcond *cond = get_loop_exit_condition (exit);
> + if (cond)
> + conds.safe_push (cond);
> +
> + if (dump_enabled_p ())
> + dump_printf_loc (MSG_NOTE, vect_location, "Analyzing exit %d...\n", i);
>
> - may_be_zero = NULL_TREE;
> - if (!number_of_iterations_exit_assumptions (loop, exit, &niter_desc, NULL)
> - || chrec_contains_undetermined (niter_desc.niter))
> - return cond;
> + may_be_zero = NULL_TREE;
> + if (!number_of_iterations_exit_assumptions (loop, exit, &niter_desc, NULL)
> + || chrec_contains_undetermined (niter_desc.niter))
> + continue;
>
> - niter_assumptions = niter_desc.assumptions;
> - may_be_zero = niter_desc.may_be_zero;
> - niter = niter_desc.niter;
> + niter_assumptions = niter_desc.assumptions;
> + may_be_zero = niter_desc.may_be_zero;
> + niter = niter_desc.niter;
>
> - if (may_be_zero && integer_zerop (may_be_zero))
> - may_be_zero = NULL_TREE;
> + if (may_be_zero && integer_zerop (may_be_zero))
> + may_be_zero = NULL_TREE;
>
> - if (may_be_zero)
> - {
> - if (COMPARISON_CLASS_P (may_be_zero))
> + if (may_be_zero)
> {
> - /* Try to combine may_be_zero with assumptions, this can simplify
> - computation of niter expression. */
> - if (niter_assumptions && !integer_nonzerop (niter_assumptions))
> - niter_assumptions = fold_build2 (TRUTH_AND_EXPR, boolean_type_node,
> - niter_assumptions,
> - fold_build1 (TRUTH_NOT_EXPR,
> - boolean_type_node,
> - may_be_zero));
> + if (COMPARISON_CLASS_P (may_be_zero))
> + {
> + /* Try to combine may_be_zero with assumptions, this can simplify
> + computation of niter expression. */
> + if (niter_assumptions && !integer_nonzerop (niter_assumptions))
> + niter_assumptions = fold_build2 (TRUTH_AND_EXPR, boolean_type_node,
> + niter_assumptions,
> + fold_build1 (TRUTH_NOT_EXPR,
> + boolean_type_node,
> + may_be_zero));
> + else
> + niter = fold_build3 (COND_EXPR, TREE_TYPE (niter), may_be_zero,
> + build_int_cst (TREE_TYPE (niter), 0),
> + rewrite_to_non_trapping_overflow (niter));
> +
> + may_be_zero = NULL_TREE;
> + }
> + else if (integer_nonzerop (may_be_zero) && exit == main_exit)
> + {
> + *number_of_iterationsm1 = build_int_cst (TREE_TYPE (niter), 0);
> + *number_of_iterations = build_int_cst (TREE_TYPE (niter), 1);
> + continue;
> + }
> else
> - niter = fold_build3 (COND_EXPR, TREE_TYPE (niter), may_be_zero,
> - build_int_cst (TREE_TYPE (niter), 0),
> - rewrite_to_non_trapping_overflow (niter));
> + continue;
> + }
>
> - may_be_zero = NULL_TREE;
> - }
> - else if (integer_nonzerop (may_be_zero))
> + /* Loop assumptions are based off the normal exit. */
> + if (exit == main_exit)
It's a bit hard to follow in patch form but I wonder why you even
analyze the number of iterations of the non-main exits riskying
possibly clobbering the *number_* outputs which we later assume
to be for the main exit?
> {
> - *number_of_iterationsm1 = build_int_cst (TREE_TYPE (niter), 0);
> - *number_of_iterations = build_int_cst (TREE_TYPE (niter), 1);
> - return cond;
> + *assumptions = niter_assumptions;
> + *number_of_iterationsm1 = niter;
> +
> + /* We want the number of loop header executions which is the number
> + of latch executions plus one.
> + ??? For UINT_MAX latch executions this number overflows to zero
> + for loops like do { n++; } while (n != 0); */
> + if (niter && !chrec_contains_undetermined (niter))
> + niter = fold_build2 (PLUS_EXPR, TREE_TYPE (niter),
> + unshare_expr (niter),
> + build_int_cst (TREE_TYPE (niter), 1));
> + *number_of_iterations = niter;
> }
> - else
> - return cond;
> }
>
> - *assumptions = niter_assumptions;
> - *number_of_iterationsm1 = niter;
> -
> - /* We want the number of loop header executions which is the number
> - of latch executions plus one.
> - ??? For UINT_MAX latch executions this number overflows to zero
> - for loops like do { n++; } while (n != 0); */
> - if (niter && !chrec_contains_undetermined (niter))
> - niter = fold_build2 (PLUS_EXPR, TREE_TYPE (niter), unshare_expr (niter),
> - build_int_cst (TREE_TYPE (niter), 1));
> - *number_of_iterations = niter;
> + if (dump_enabled_p ())
> + dump_printf_loc (MSG_NOTE, vect_location, "All loop exits successfully analyzed.\n");
>
> - return cond;
> + return conds;
> }
>
> /* Determine the main loop exit for the vectorizer. */
> @@ -936,8 +963,25 @@ vec_init_loop_exit_info (class loop *loop)
> auto_vec<edge> exits = get_loop_exit_edges (loop);
> if (exits.length () == 1)
> return exits[0];
> - else
> - return NULL;
> +
> + /* If we have multiple exits we only support counting IV at the moment. Analyze
> + all exits and return one */
> + class tree_niter_desc niter_desc;
> + edge candidate = NULL;
> + for (edge exit : exits)
> + {
> + if (!get_loop_exit_condition (exit))
> + continue;
> +
> + if (number_of_iterations_exit_assumptions (loop, exit, &niter_desc, NULL)
> + && !chrec_contains_undetermined (niter_desc.niter))
> + {
> + if (!niter_desc.may_be_zero || !candidate)
> + candidate = exit;
> + }
> + }
> +
> + return candidate;
> }
>
> /* Function bb_in_loop_p
> @@ -1788,21 +1832,31 @@ vect_analyze_loop_form (class loop *loop, vect_loop_form_info *info)
> "not vectorized: latch block not empty.\n");
>
> /* Make sure the exit is not abnormal. */
> - edge e = single_exit (loop);
> - if (e->flags & EDGE_ABNORMAL)
> + if (exit_e->flags & EDGE_ABNORMAL)
> return opt_result::failure_at (vect_location,
> "not vectorized:"
> " abnormal loop exit edge.\n");
>
> - info->loop_cond
> - = vect_get_loop_niters (loop, e, &info->assumptions,
> + info->conds
> + = vect_get_loop_niters (loop, &info->assumptions, exit_e,
> &info->number_of_iterations,
> &info->number_of_iterationsm1);
> - if (!info->loop_cond)
> +
> + if (info->conds.is_empty ())
> return opt_result::failure_at
> (vect_location,
> "not vectorized: complicated exit condition.\n");
>
> + /* Determine what the primary and alternate exit conds are. */
> + info->alt_loop_conds.create (info->conds.length () - 1);
> + for (gcond *cond : info->conds)
> + {
> + if (exit_e->src != gimple_bb (cond))
> + info->alt_loop_conds.quick_push (cond);
> + else
> + info->loop_cond = cond;
> + }
> +
IMHO it would be simpler to have the primary exit condition in
info->conds[0] and the rest after that? That avoids having two
arrays and one scalar in vect_loop_form_info.
> if (integer_zerop (info->assumptions)
> || !info->number_of_iterations
> || chrec_contains_undetermined (info->number_of_iterations))
> @@ -1847,8 +1901,13 @@ vect_create_loop_vinfo (class loop *loop, vec_info_shared *shared,
> if (!integer_onep (info->assumptions) && !main_loop_info)
> LOOP_VINFO_NITERS_ASSUMPTIONS (loop_vinfo) = info->assumptions;
>
> - stmt_vec_info loop_cond_info = loop_vinfo->lookup_stmt (info->loop_cond);
> - STMT_VINFO_TYPE (loop_cond_info) = loop_exit_ctrl_vec_info_type;
> + for (gcond *cond : info->conds)
> + {
> + stmt_vec_info loop_cond_info = loop_vinfo->lookup_stmt (cond);
> + STMT_VINFO_TYPE (loop_cond_info) = loop_exit_ctrl_vec_info_type;
> + }
> + LOOP_VINFO_LOOP_CONDS (loop_vinfo).safe_splice (info->alt_loop_conds);
> + LOOP_VINFO_LOOP_IV_COND (loop_vinfo) = info->loop_cond;
> LOOP_VINFO_IV_EXIT (loop_vinfo) = info->loop_exit;
>
> @@ -3594,7 +3653,11 @@ vect_analyze_loop (class loop *loop, vec_info_shared *shared)
> && LOOP_VINFO_PEELING_FOR_NITER (first_loop_vinfo)
> && !loop->simduid);
> if (!vect_epilogues)
> - return first_loop_vinfo;
> + {
> + loop_form_info.conds.release ();
> + loop_form_info.alt_loop_conds.release ();
> + return first_loop_vinfo;
> + }
I think there's 'inner' where you leak these. Maybe use auto_vec<>
in vect_loop_form_info instead?
Otherwise looks OK.
Thanks,
Richard.
> /* Now analyze first_loop_vinfo for epilogue vectorization. */
> poly_uint64 lowest_th = LOOP_VINFO_VERSIONING_THRESHOLD (first_loop_vinfo);
> @@ -3694,6 +3757,9 @@ vect_analyze_loop (class loop *loop, vec_info_shared *shared)
> (first_loop_vinfo->epilogue_vinfos[0]->vector_mode));
> }
>
> + loop_form_info.conds.release ();
> + loop_form_info.alt_loop_conds.release ();
> +
> return first_loop_vinfo;
> }
>
> diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> index afa7a8e30891c782a0e5e3740ecc4377f5a31e54..55b6771b271d5072fa1327d595e1dddb112cfdf6 100644
> --- a/gcc/tree-vectorizer.h
> +++ b/gcc/tree-vectorizer.h
> @@ -882,6 +882,12 @@ public:
> we need to peel off iterations at the end to form an epilogue loop. */
> bool peeling_for_niter;
>
> + /* List of loop additional IV conditionals found in the loop. */
> + auto_vec<gcond *> conds;
> +
> + /* Main loop IV cond. */
> + gcond* loop_iv_cond;
> +
> /* True if there are no loop carried data dependencies in the loop.
> If loop->safelen <= 1, then this is always true, either the loop
> didn't have any loop carried data dependencies, or the loop is being
> @@ -984,6 +990,8 @@ public:
> #define LOOP_VINFO_REDUCTION_CHAINS(L) (L)->reduction_chains
> #define LOOP_VINFO_PEELING_FOR_GAPS(L) (L)->peeling_for_gaps
> #define LOOP_VINFO_PEELING_FOR_NITER(L) (L)->peeling_for_niter
> +#define LOOP_VINFO_LOOP_CONDS(L) (L)->conds
> +#define LOOP_VINFO_LOOP_IV_COND(L) (L)->loop_iv_cond
> #define LOOP_VINFO_NO_DATA_DEPENDENCIES(L) (L)->no_data_dependencies
> #define LOOP_VINFO_SCALAR_LOOP(L) (L)->scalar_loop
> #define LOOP_VINFO_SCALAR_LOOP_SCALING(L) (L)->scalar_loop_scaling
> @@ -2373,7 +2381,9 @@ struct vect_loop_form_info
> tree number_of_iterations;
> tree number_of_iterationsm1;
> tree assumptions;
> + vec<gcond *> conds;
> gcond *loop_cond;
> + vec<gcond *> alt_loop_conds;
> gcond *inner_loop_cond;
> edge loop_exit;
> };
>
>
>
>
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
next prev parent reply other threads:[~2023-10-10 11:13 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-02 7:41 [PATCH 1/3]middle-end: Refactor vectorizer loop conditionals and separate out IV to new variables Tamar Christina
2023-10-02 7:41 ` [PATCH 2/3]middle-end: updated niters analysis to handle multiple exits Tamar Christina
2023-10-10 11:13 ` Richard Biener [this message]
2023-10-11 10:54 ` Tamar Christina
2023-10-11 12:08 ` Richard Biener
2023-10-02 7:42 ` [PATCH 3/3]middle-end: maintain LCSSA throughout loop peeling Tamar Christina
2023-10-10 12:59 ` Richard Biener
2023-10-11 11:16 ` Tamar Christina
2023-10-11 12:09 ` Richard Biener
2023-10-09 13:35 ` [PATCH 1/3]middle-end: Refactor vectorizer loop conditionals and separate out IV to new variables Richard Biener
2023-10-11 10:45 ` Tamar Christina
2023-10-11 12:07 ` Richard Biener
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=nycvar.YFH.7.77.849.2310101105020.5561@jbgna.fhfr.qr \
--to=rguenther@suse.de \
--cc=gcc-patches@gcc.gnu.org \
--cc=jlaw@ventanamicro.com \
--cc=nd@arm.com \
--cc=tamar.christina@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).