From: Richard Biener <richard.guenther@gmail.com>
To: Andrea Corallo <andrea.corallo@arm.com>,
"Kewen.Lin" <linkw@linux.ibm.com>,
GCC Patches <gcc-patches@gcc.gnu.org>,
Segher Boessenkool <segher@kernel.crashing.org>,
Bill Schmidt <wschmidt@linux.ibm.com>,
Richard Biener <richard.guenther@gmail.com>,
Richard Sandiford <richard.sandiford@arm.com>
Subject: Re: [PATCH] vect: Fix epilogue loop handling of partial vectors
Date: Wed, 23 Sep 2020 09:58:59 +0200 [thread overview]
Message-ID: <CAFiYyc0Tsq7eUA7JpJ-exBJrmnM3XfzT0+xF2KjFVRqqQNtOkw@mail.gmail.com> (raw)
In-Reply-To: <mptft79op6k.fsf_-_@arm.com>
On Tue, Sep 22, 2020 at 4:34 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Richard Sandiford <richard.sandiford@arm.com> writes:
> > I'll try to have a patch ready tomorrow morning European time.
>
> Well, I totally failed to hit that deadline. When testing on Power,
> I saw a couple of extra failures, but I now think they're improvements
> rather than regressions. See the point about single-iteration
> epilogues below.
>
> ---
>
> This patch fixes the fallout that Kewen reported on Power after
> the recent change to avoid unnecessary use of partial vectors.
> As Kewen said, the problem is that vect_analyze_loop_2 doesn't
> know how many epilogue iterations there will be, and so it
> cannot make a final decision about whether the number of
> iterations forces an epilogue loop to use partial vectors.
>
> This is similar to the current situation for peeling: we don't know
> during initial analysis whether an epilogue loop will itself require
> peeling. Instead we decide that during vect_do_peeling, where the
> final number of epilogue loop iterations is known.
>
> The patch takes a similar approach for the decision about whether
> to use partial vectors. As the comments in the patch say, the
> idea is that vect_analyze_loop_2 should make peeling and partial-
> vector decisions based on the assumption that the loop_vinfo will
> be used as the main loop, while vect_do_peeling should make them
> in the knowledge that the loop_vinfo will be used as an epilogue loop.
>
> This allows the same analysis to be used for both cases, which we
> rely on for implementing VECT_COMPARE_COSTS; see the big comment
> in vect_analyze_loop for details.
>
> I hope the patch makes the (mostly preexisting) structure a bit
> more obvious. It isn't what anyone would design from scratch,
> but that's the nature of working with a mature vector framework.
Indeed :/
> Arranging things this way means that vect_verify_full_masking
> and vect_verify_loop_lens now become part of the “can” rather
> than “will” test for partial vectors.
>
> Also, while splitting out the logic that handles epilogues with
> constant iterations, I added a check to make sure that we don't
> try to use partial vectors to vectorise a single-scalar loop.
> This required some changes to the Power tests.
>
> Tested on aarch64-linux-gnu, arm-linux-gnueabi, x86_64-linux-gnu and
> powerpc64le-linux-gnu. OK to install?
Thanks for the nice work.
OK.
Richard.
> Richard
>
>
> gcc/
> * tree-vectorizer.h (determine_peel_for_niter): Delete in favor of...
> (vect_determine_partial_vectors_and_peeling): ...this new function.
> * tree-vect-loop-manip.c (vect_update_epilogue_niters): New function.
> Reject using vector epilogue loops for single iterations. Install
> the constant number of epilogue loop iterations in the associated
> loop_vinfo. Rely on vect_determine_partial_vectors_and_peeling
> to do the main part of the test.
> (vect_do_peeling): Use vect_update_epilogue_niters to handle
> epilogue loops with a known number of iterations. Skip recomputing
> the number of iterations later in that case. Otherwise, use
> vect_determine_partial_vectors_and_peeling to decide whether the
> epilogue loop needs to use partial vectors or peeling.
> * tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Set the
> default can_use_partial_vectors_p to false if partial-vector-usage=0.
> (determine_peel_for_niter): Remove in favor of...
> (vect_determine_partial_vectors_and_peeling): ...this new function,
> split out from...
> (vect_analyze_loop_2): ...here. Reflect the vect_verify_full_masking
> and vect_verify_loop_lens results in CAN_USE_PARTIAL_VECTORS_P
> rather than USING_PARTIAL_VECTORS_P.
>
> gcc/testsuite/
> * gcc.target/powerpc/p9-vec-length-epil-1.c: Do not expect the
> single-iteration epilogues of the 64-bit loops to be vectorized.
> * gcc.target/powerpc/p9-vec-length-epil-7.c: Likewise.
> * gcc.target/powerpc/p9-vec-length-epil-8.c: Likewise.
> ---
> .../gcc.target/powerpc/p9-vec-length-epil-1.c | 4 +-
> .../gcc.target/powerpc/p9-vec-length-epil-7.c | 2 +-
> .../gcc.target/powerpc/p9-vec-length-epil-8.c | 4 +-
> gcc/tree-vect-loop-manip.c | 83 +++++---
> gcc/tree-vect-loop.c | 196 ++++++++++++------
> gcc/tree-vectorizer.h | 3 +-
> 6 files changed, 192 insertions(+), 100 deletions(-)
>
> diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> index 9dffc5570e5..b7fa6bc8d2f 100644
> --- a/gcc/tree-vectorizer.h
> +++ b/gcc/tree-vectorizer.h
> @@ -1967,7 +1967,8 @@ extern tree vect_create_addr_base_for_vector_ref (vec_info *,
> extern widest_int vect_iv_limit_for_partial_vectors (loop_vec_info loop_vinfo);
> bool vect_rgroup_iv_might_wrap_p (loop_vec_info, rgroup_controls *);
> /* Used in tree-vect-loop-manip.c */
> -extern void determine_peel_for_niter (loop_vec_info);
> +extern opt_result vect_determine_partial_vectors_and_peeling (loop_vec_info,
> + bool);
> /* Used in gimple-loop-interchange.c and tree-parloops.c. */
> extern bool check_reduction_path (dump_user_location_t, loop_p, gphi *, tree,
> enum tree_code);
> diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
> index 47cfa6f4061..7cf00e6eed4 100644
> --- a/gcc/tree-vect-loop-manip.c
> +++ b/gcc/tree-vect-loop-manip.c
> @@ -2386,6 +2386,34 @@ slpeel_update_phi_nodes_for_lcssa (class loop *epilog)
> rename_use_op (PHI_ARG_DEF_PTR_FROM_EDGE (gsi.phi (), e));
> }
>
> +/* EPILOGUE_VINFO is an epilogue loop that we now know would need to
> + iterate exactly CONST_NITERS times. Make a final decision about
> + whether the epilogue loop should be used, returning true if so. */
> +
> +static bool
> +vect_update_epilogue_niters (loop_vec_info epilogue_vinfo,
> + unsigned HOST_WIDE_INT const_niters)
> +{
> + /* Avoid wrap-around when computing const_niters - 1. Also reject
> + using an epilogue loop for a single scalar iteration, even if
> + we could in principle implement that using partial vectors. */
> + unsigned int gap_niters = LOOP_VINFO_PEELING_FOR_GAPS (epilogue_vinfo);
> + if (const_niters <= gap_niters + 1)
> + return false;
> +
> + /* Install the number of iterations. */
> + tree niters_type = TREE_TYPE (LOOP_VINFO_NITERS (epilogue_vinfo));
> + tree niters_tree = build_int_cst (niters_type, const_niters);
> + tree nitersm1_tree = build_int_cst (niters_type, const_niters - 1);
> +
> + LOOP_VINFO_NITERS (epilogue_vinfo) = niters_tree;
> + LOOP_VINFO_NITERSM1 (epilogue_vinfo) = nitersm1_tree;
> +
> + /* Decide what to do if the number of epilogue iterations is not
> + a multiple of the epilogue loop's vectorization factor. */
> + return vect_determine_partial_vectors_and_peeling (epilogue_vinfo, true);
> +}
> +
> /* Function vect_do_peeling.
>
> Input:
> @@ -2493,6 +2521,7 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
> int estimated_vf;
> int prolog_peeling = 0;
> bool vect_epilogues = loop_vinfo->epilogue_vinfos.length () > 0;
> + bool vect_epilogues_updated_niters = false;
> /* We currently do not support prolog peeling if the target alignment is not
> known at compile time. 'vect_gen_prolog_loop_niters' depends on the
> target alignment being constant. */
> @@ -2601,8 +2630,7 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
> if (vect_epilogues
> && LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
> && prolog_peeling >= 0
> - && known_eq (vf, lowest_vf)
> - && !LOOP_VINFO_USING_PARTIAL_VECTORS_P (epilogue_vinfo))
> + && known_eq (vf, lowest_vf))
> {
> unsigned HOST_WIDE_INT eiters
> = (LOOP_VINFO_INT_NITERS (loop_vinfo)
> @@ -2612,13 +2640,7 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
> eiters
> = eiters % lowest_vf + LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo);
>
> - unsigned int ratio;
> - unsigned int epilogue_gaps
> - = LOOP_VINFO_PEELING_FOR_GAPS (epilogue_vinfo);
> - while (!(constant_multiple_p
> - (GET_MODE_SIZE (loop_vinfo->vector_mode),
> - GET_MODE_SIZE (epilogue_vinfo->vector_mode), &ratio)
> - && eiters >= lowest_vf / ratio + epilogue_gaps))
> + while (!vect_update_epilogue_niters (epilogue_vinfo, eiters))
> {
> delete epilogue_vinfo;
> epilogue_vinfo = NULL;
> @@ -2629,8 +2651,8 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
> }
> epilogue_vinfo = loop_vinfo->epilogue_vinfos[0];
> loop_vinfo->epilogue_vinfos.ordered_remove (0);
> - epilogue_gaps = LOOP_VINFO_PEELING_FOR_GAPS (epilogue_vinfo);
> }
> + vect_epilogues_updated_niters = true;
> }
> /* Prolog loop may be skipped. */
> bool skip_prolog = (prolog_peeling != 0);
> @@ -2928,7 +2950,9 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
> skip_e edge. */
> if (skip_vector)
> {
> - gcc_assert (update_e != NULL && skip_e != NULL);
> + gcc_assert (update_e != NULL
> + && skip_e != NULL
> + && !vect_epilogues_updated_niters);
> gphi *new_phi = create_phi_node (make_ssa_name (TREE_TYPE (niters)),
> update_e->dest);
> tree new_ssa = make_ssa_name (TREE_TYPE (niters));
> @@ -2953,25 +2977,32 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
> niters = PHI_RESULT (new_phi);
> }
>
> - /* Subtract the number of iterations performed by the vectorized loop
> - from the number of total iterations. */
> - tree epilogue_niters = fold_build2 (MINUS_EXPR, TREE_TYPE (niters),
> - before_loop_niters,
> - niters);
> -
> - LOOP_VINFO_NITERS (epilogue_vinfo) = epilogue_niters;
> - LOOP_VINFO_NITERSM1 (epilogue_vinfo)
> - = fold_build2 (MINUS_EXPR, TREE_TYPE (epilogue_niters),
> - epilogue_niters,
> - build_one_cst (TREE_TYPE (epilogue_niters)));
> -
> /* Set ADVANCE to the number of iterations performed by the previous
> loop and its prologue. */
> *advance = niters;
>
> - /* Redo the peeling for niter analysis as the NITERs and alignment
> - may have been updated to take the main loop into account. */
> - determine_peel_for_niter (epilogue_vinfo);
> + if (!vect_epilogues_updated_niters)
> + {
> + /* Subtract the number of iterations performed by the vectorized loop
> + from the number of total iterations. */
> + tree epilogue_niters = fold_build2 (MINUS_EXPR, TREE_TYPE (niters),
> + before_loop_niters,
> + niters);
> +
> + LOOP_VINFO_NITERS (epilogue_vinfo) = epilogue_niters;
> + LOOP_VINFO_NITERSM1 (epilogue_vinfo)
> + = fold_build2 (MINUS_EXPR, TREE_TYPE (epilogue_niters),
> + epilogue_niters,
> + build_one_cst (TREE_TYPE (epilogue_niters)));
> +
> + /* Decide what to do if the number of epilogue iterations is not
> + a multiple of the epilogue loop's vectorization factor.
> + We should have rejected the loop during the analysis phase
> + if this fails. */
> + if (!vect_determine_partial_vectors_and_peeling (epilogue_vinfo,
> + true))
> + gcc_unreachable ();
> + }
> }
>
> adjust_vec.release ();
> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
> index b1a6e1508c7..bb9d87fd200 100644
> --- a/gcc/tree-vect-loop.c
> +++ b/gcc/tree-vect-loop.c
> @@ -814,7 +814,7 @@ _loop_vec_info::_loop_vec_info (class loop *loop_in, vec_info_shared *shared)
> vec_outside_cost (0),
> vec_inside_cost (0),
> vectorizable (false),
> - can_use_partial_vectors_p (true),
> + can_use_partial_vectors_p (param_vect_partial_vector_usage != 0),
> using_partial_vectors_p (false),
> epil_using_partial_vectors_p (false),
> peeling_for_gaps (false),
> @@ -2003,22 +2003,123 @@ vect_dissolve_slp_only_groups (loop_vec_info loop_vinfo)
> }
> }
>
> +/* Determine if operating on full vectors for LOOP_VINFO might leave
> + some scalar iterations still to do. If so, decide how we should
> + handle those scalar iterations. The possibilities are:
>
> -/* Decides whether we need to create an epilogue loop to handle
> - remaining scalar iterations and sets PEELING_FOR_NITERS accordingly. */
> + (1) Make LOOP_VINFO operate on partial vectors instead of full vectors.
> + In this case:
>
> -void
> -determine_peel_for_niter (loop_vec_info loop_vinfo)
> + LOOP_VINFO_USING_PARTIAL_VECTORS_P == true
> + LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P == false
> + LOOP_VINFO_PEELING_FOR_NITER == false
> +
> + (2) Make LOOP_VINFO operate on full vectors and use an epilogue loop
> + to handle the remaining scalar iterations. In this case:
> +
> + LOOP_VINFO_USING_PARTIAL_VECTORS_P == false
> + LOOP_VINFO_PEELING_FOR_NITER == true
> +
> + There are two choices:
> +
> + (2a) Consider vectorizing the epilogue loop at the same VF as the
> + main loop, but using partial vectors instead of full vectors.
> + In this case:
> +
> + LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P == true
> +
> + (2b) Consider vectorizing the epilogue loop at lower VFs only.
> + In this case:
> +
> + LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P == false
> +
> + When FOR_EPILOGUE_P is true, make this determination based on the
> + assumption that LOOP_VINFO is an epilogue loop, otherwise make it
> + based on the assumption that LOOP_VINFO is the main loop. The caller
> + has made sure that the number of iterations is set appropriately for
> + this value of FOR_EPILOGUE_P. */
> +
> +opt_result
> +vect_determine_partial_vectors_and_peeling (loop_vec_info loop_vinfo,
> + bool for_epilogue_p)
> {
> - LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo) = false;
> + /* Determine whether there would be any scalar iterations left over. */
> + bool need_peeling_or_partial_vectors_p
> + = vect_need_peeling_or_partial_vectors_p (loop_vinfo);
>
> - if (LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo))
> - /* The main loop handles all iterations. */
> - LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo) = false;
> - else if (vect_need_peeling_or_partial_vectors_p (loop_vinfo))
> - LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo) = true;
> -}
> + /* Decide whether to vectorize the loop with partial vectors. */
> + LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo) = false;
> + LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (loop_vinfo) = false;
> + if (LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)
> + && need_peeling_or_partial_vectors_p)
> + {
> + /* For partial-vector-usage=1, try to push the handling of partial
> + vectors to the epilogue, with the main loop continuing to operate
> + on full vectors.
> +
> + ??? We could then end up failing to use partial vectors if we
> + decide to peel iterations into a prologue, and if the main loop
> + then ends up processing fewer than VF iterations. */
> + if (param_vect_partial_vector_usage == 1
> + && !LOOP_VINFO_EPILOGUE_P (loop_vinfo)
> + && !vect_known_niters_smaller_than_vf (loop_vinfo))
> + LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (loop_vinfo) = true;
> + else
> + LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo) = true;
> + }
> +
> + if (dump_enabled_p ())
> + {
> + if (LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo))
> + dump_printf_loc (MSG_NOTE, vect_location,
> + "operating on partial vectors%s.\n",
> + for_epilogue_p ? " for epilogue loop" : "");
> + else
> + dump_printf_loc (MSG_NOTE, vect_location,
> + "operating only on full vectors%s.\n",
> + for_epilogue_p ? " for epilogue loop" : "");
> + }
>
> + if (for_epilogue_p)
> + {
> + loop_vec_info orig_loop_vinfo = LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo);
> + gcc_assert (orig_loop_vinfo);
> + if (!LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo))
> + gcc_assert (known_lt (LOOP_VINFO_VECT_FACTOR (loop_vinfo),
> + LOOP_VINFO_VECT_FACTOR (orig_loop_vinfo)));
> + }
> +
> + if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
> + && !LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo))
> + {
> + /* Check that the loop processes at least one full vector. */
> + poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
> + tree scalar_niters = LOOP_VINFO_NITERS (loop_vinfo);
> + if (known_lt (wi::to_widest (scalar_niters), vf))
> + return opt_result::failure_at (vect_location,
> + "loop does not have enough iterations"
> + " to support vectorization.\n");
> +
> + /* If we need to peel an extra epilogue iteration to handle data
> + accesses with gaps, check that there are enough scalar iterations
> + available.
> +
> + The check above is redundant with this one when peeling for gaps,
> + but the distinction is useful for diagnostics. */
> + tree scalar_nitersm1 = LOOP_VINFO_NITERSM1 (loop_vinfo);
> + if (LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)
> + && known_lt (wi::to_widest (scalar_nitersm1), vf))
> + return opt_result::failure_at (vect_location,
> + "loop does not have enough iterations"
> + " to support peeling for gaps.\n");
> + }
> +
> + LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo)
> + = (!LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo)
> + && need_peeling_or_partial_vectors_p);
> +
> + return opt_result::success ();
> +}
>
> /* Function vect_analyze_loop_2.
>
> @@ -2272,72 +2373,32 @@ start_over:
> LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;
> }
>
> - /* Decide whether to vectorize a loop with partial vectors for
> - this vectorization factor. */
> - if (LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo))
> - {
> - /* Don't use partial vectors if we don't need to peel the loop. */
> - if (param_vect_partial_vector_usage == 0
> - || !vect_need_peeling_or_partial_vectors_p (loop_vinfo))
> - LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo) = false;
> - else if (vect_verify_full_masking (loop_vinfo)
> - || vect_verify_loop_lens (loop_vinfo))
> - {
> - /* The epilogue and other known niters less than VF
> - cases can still use vector access with length fully. */
> - if (param_vect_partial_vector_usage == 1
> - && !LOOP_VINFO_EPILOGUE_P (loop_vinfo)
> - && !vect_known_niters_smaller_than_vf (loop_vinfo))
> - {
> - LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo) = false;
> - LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (loop_vinfo) = true;
> - }
> - else
> - LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo) = true;
> - }
> - else
> - LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo) = false;
> - }
> - else
> - LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo) = false;
> -
> - if (dump_enabled_p ())
> - {
> - if (LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo))
> - dump_printf_loc (MSG_NOTE, vect_location,
> - "operating on partial vectors.\n");
> - else
> - dump_printf_loc (MSG_NOTE, vect_location,
> - "operating only on full vectors.\n");
> - }
> -
> - /* If epilog loop is required because of data accesses with gaps,
> - one additional iteration needs to be peeled. Check if there is
> - enough iterations for vectorization. */
> - if (LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)
> - && LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
> - && !LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo))
> - {
> - poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
> - tree scalar_niters = LOOP_VINFO_NITERSM1 (loop_vinfo);
> -
> - if (known_lt (wi::to_widest (scalar_niters), vf))
> - return opt_result::failure_at (vect_location,
> - "loop has no enough iterations to"
> - " support peeling for gaps.\n");
> - }
> + /* If we still have the option of using partial vectors,
> + check whether we can generate the necessary loop controls. */
> + if (LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)
> + && !vect_verify_full_masking (loop_vinfo)
> + && !vect_verify_loop_lens (loop_vinfo))
> + LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;
>
> /* If we're vectorizing an epilogue loop, the vectorized loop either needs
> to be able to handle fewer than VF scalars, or needs to have a lower VF
> than the main loop. */
> if (LOOP_VINFO_EPILOGUE_P (loop_vinfo)
> - && !LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo)
> + && !LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)
> && maybe_ge (LOOP_VINFO_VECT_FACTOR (loop_vinfo),
> LOOP_VINFO_VECT_FACTOR (orig_loop_vinfo)))
> return opt_result::failure_at (vect_location,
> "Vectorization factor too high for"
> " epilogue loop.\n");
>
> + /* Decide whether this loop_vinfo should use partial vectors or peeling,
> + assuming that the loop will be used as a main loop. We will redo
> + this analysis later if we instead decide to use the loop as an
> + epilogue loop. */
> + ok = vect_determine_partial_vectors_and_peeling (loop_vinfo, false);
> + if (!ok)
> + return ok;
> +
> /* Check the costings of the loop make vectorizing worthwhile. */
> res = vect_analyze_loop_costing (loop_vinfo);
> if (res < 0)
> @@ -2350,7 +2411,6 @@ start_over:
> return opt_result::failure_at (vect_location,
> "Loop costings not worthwhile.\n");
>
> - determine_peel_for_niter (loop_vinfo);
> /* If an epilogue loop is required make sure we can create one. */
> if (LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)
> || LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo))
> diff --git a/gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-1.c b/gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-1.c
> index ebb2f45c917..d248f091b52 100644
> --- a/gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-1.c
> +++ b/gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-1.c
> @@ -10,6 +10,6 @@
>
> /* { dg-final { scan-assembler-times {\mlxvx?\M} 20 } } */
> /* { dg-final { scan-assembler-times {\mstxvx?\M} 10 } } */
> -/* { dg-final { scan-assembler-times {\mlxvl\M} 20 } } */
> -/* { dg-final { scan-assembler-times {\mstxvl\M} 10 } } */
> +/* { dg-final { scan-assembler-times {\mlxvl\M} 14 } } */
> +/* { dg-final { scan-assembler-times {\mstxvl\M} 7 } } */
>
> diff --git a/gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-7.c b/gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-7.c
> index 9d403287923..a27ee347ca1 100644
> --- a/gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-7.c
> +++ b/gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-7.c
> @@ -8,4 +8,4 @@
>
> #include "p9-vec-length-7.h"
>
> -/* { dg-final { scan-assembler-times {\mstxvl\M} 10 } } */
> +/* { dg-final { scan-assembler-times {\mstxvl\M} 7 } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-8.c b/gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-8.c
> index 6b54a29efaa..961df0d5646 100644
> --- a/gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-8.c
> +++ b/gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-8.c
> @@ -8,5 +8,5 @@
>
> #include "p9-vec-length-8.h"
>
> -/* { dg-final { scan-assembler-times {\mlxvl\M} 30 } } */
> -/* { dg-final { scan-assembler-times {\mstxvl\M} 10 } } */
> +/* { dg-final { scan-assembler-times {\mlxvl\M} 21 } } */
> +/* { dg-final { scan-assembler-times {\mstxvl\M} 7 } } */
prev parent reply other threads:[~2020-09-23 7:59 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-09-18 2:37 [PATCH] vect/test: Don't check for epilogue loop [PR97075] Kewen.Lin
2020-09-18 15:46 ` Segher Boessenkool
2020-09-18 18:17 ` Richard Sandiford
2020-09-20 14:03 ` Kewen.Lin
2020-09-21 6:50 ` Richard Sandiford
2020-09-22 3:23 ` Kewen.Lin
2020-09-21 7:29 ` Andrea Corallo
2020-09-21 8:32 ` Richard Sandiford
2020-09-22 14:34 ` [PATCH] vect: Fix epilogue loop handling of partial vectors Richard Sandiford
2020-09-23 3:07 ` Kewen.Lin
2020-09-23 11:33 ` Richard Sandiford
2020-09-24 2:35 ` Kewen.Lin
2020-09-23 7:58 ` Richard Biener [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAFiYyc0Tsq7eUA7JpJ-exBJrmnM3XfzT0+xF2KjFVRqqQNtOkw@mail.gmail.com \
--to=richard.guenther@gmail.com \
--cc=andrea.corallo@arm.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=linkw@linux.ibm.com \
--cc=richard.sandiford@arm.com \
--cc=segher@kernel.crashing.org \
--cc=wschmidt@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).