From: Richard Biener <rguenther@suse.de>
To: "Andre Vieira (lists)" <andre.simoesdiasvieira@arm.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH 1/2][vect]PR 88915: Vectorize epilogues when versioning loops
Date: Fri, 11 Oct 2019 13:01:00 -0000 [thread overview]
Message-ID: <nycvar.YFH.7.76.1910111408390.5566@zhemvz.fhfr.qr> (raw)
In-Reply-To: <97dbaa36-107b-e078-05a5-0e8dc3700e9a@arm.com>
[-- Attachment #1: Type: text/plain, Size: 10923 bytes --]
On Thu, 10 Oct 2019, Andre Vieira (lists) wrote:
> Hi,
>
> After all the discussions and respins I now believe this patch is close to
> what we envisioned.
>
> This patch achieves two things when vect-epilogues-nomask=1:
> 1) It analyzes the original loop for each supported vector size and saves this
> analysis per loop, as well as the vector sizes we know we can vectorize the
> loop for.
> 2) When loop versioning it uses the 'skip_vector' code path to vectorize the
> epilogue, and uses the lowest versioning threshold between the main and
> epilogue's.
>
> As side effects of this patch I also changed ensure_base_align to only update
> the alignment if the new alignment is lower than the current one. This
> function already did that if the object was a symbol, now it behaves this way
> for any object.
>
> I bootstrapped this patch with both vect-epilogues-nomask turned on and off on
> x86_64 (AVX512) and aarch64. Regression tests looked good.
>
> Is this OK for trunk?
+
+ /* Keep track of vector sizes we know we can vectorize the epilogue
with. */
+ vector_sizes epilogue_vsizes;
};
please don't enlarge struct loop, instead track this somewhere
in the vectorizer (in loop_vinfo? I see you already have
epilogue_vinfos there - so the loop_vinfo simply lacks
convenient access to the vector_size?) I don't see any
use that could be trivially adjusted to look at a loop_vinfo
member instead.
For the vect_update_inits_of_drs this means that we'd possibly
do less CSE. Not sure if really an issue.
You use LOOP_VINFO_EPILOGUE_P sometimes and sometimes
LOOP_VINFO_ORIG_LOOP_INFO, please change predicates to
LOOP_VINFO_EPILOGUE_P.
@@ -2466,15 +2461,62 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree
niters, tree nitersm1,
else
niters_prolog = build_int_cst (type, 0);
+ loop_vec_info epilogue_vinfo = NULL;
+ if (vect_epilogues)
+ {
...
+ vect_epilogues = false;
+ }
+
I don't understand what all this does - it clearly needs a comment.
Maybe the overall comment of the function should be amended with
an overview of how we handle [multiple] epilogue loop vectorization?
+
+ if (epilogue_any_upper_bound && prolog_peeling >= 0)
+ {
+ epilog->any_upper_bound = true;
+ epilog->nb_iterations_upper_bound = eiters + 1;
+ }
+
comment missing. How can prolog_peeling be < 0? We likely
didn't set the upper bound because we don't know it in the
case we skipped the vector loop (skip_vector)? So make sure
to not introduce wrong-code issues here - maybe do this
optimization as followup?
class loop *
-vect_loop_versioning (loop_vec_info loop_vinfo,
- unsigned int th, bool check_profitability,
- poly_uint64 versioning_threshold)
+vect_loop_versioning (loop_vec_info loop_vinfo)
{
class loop *loop = LOOP_VINFO_LOOP (loop_vinfo), *nloop;
class loop *scalar_loop = LOOP_VINFO_SCALAR_LOOP (loop_vinfo);
@@ -2988,10 +3151,15 @@ vect_loop_versioning (loop_vec_info loop_vinfo,
bool version_align = LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT
(loop_vinfo);
bool version_alias = LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo);
bool version_niter = LOOP_REQUIRES_VERSIONING_FOR_NITERS (loop_vinfo);
+ poly_uint64 versioning_threshold
+ = LOOP_VINFO_VERSIONING_THRESHOLD (loop_vinfo);
tree version_simd_if_cond
= LOOP_REQUIRES_VERSIONING_FOR_SIMD_IF_COND (loop_vinfo);
+ unsigned th = LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo);
- if (check_profitability)
+ if (th >= vect_vf_for_cost (loop_vinfo)
+ && !LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
+ && !ordered_p (th, versioning_threshold))
cond_expr = fold_build2 (GE_EXPR, boolean_type_node,
scalar_loop_iters,
build_int_cst (TREE_TYPE (scalar_loop_iters),
th - 1));
split out this refactoring - preapproved.
@@ -1726,7 +1729,13 @@ vect_analyze_loop_costing (loop_vec_info
loop_vinfo)
return 0;
}
- HOST_WIDE_INT estimated_niter = estimated_stmt_executions_int (loop);
+ HOST_WIDE_INT estimated_niter = -1;
+
+ if (LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo))
+ estimated_niter
+ = vect_vf_for_cost (LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo)) - 1;
+ if (estimated_niter == -1)
+ estimated_niter = estimated_stmt_executions_int (loop);
if (estimated_niter == -1)
estimated_niter = likely_max_stmt_executions_int (loop);
if (estimated_niter != -1
it's clearer if the old code is completely in a else {} path
even though vect_vf_for_cost - 1 should never be -1.
+/* Decides whether we need to create an epilogue loop to handle
+ remaining scalar iterations and sets PEELING_FOR_NITERS accordingly.
*/
+
+void
+determine_peel_for_niter (loop_vec_info loop_vinfo)
+{
+
extra vertical space
+ unsigned HOST_WIDE_INT const_vf;
+ HOST_WIDE_INT max_niter
if it's a 1:1 copy outlined then split it out - preapproved
(so further reviews get smaller patches ;)) I'd add a
LOOP_VINFO_PEELING_FOR_NITER () = false as final else
since that's what we do by default?
- if (LOOP_REQUIRES_VERSIONING (loop_vinfo))
+ if (LOOP_REQUIRES_VERSIONING (loop_vinfo)
+ || ((orig_loop_vinfo = LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo))
+ && LOOP_REQUIRES_VERSIONING (orig_loop_vinfo)))
not sure why we need to do this for epilouges?
+
+ /* Use the same condition as vect_transform_loop to decide when to
use
+ the cost to determine a versioning threshold. */
+ if (th >= vect_vf_for_cost (loop_vinfo)
+ && !LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
+ && ordered_p (th, niters_th))
+ niters_th = ordered_max (poly_uint64 (th), niters_th);
that's an independent change, right? Please split out, it's
pre-approved if it tests OK separately.
+static tree
+replace_ops (tree op, hash_map<tree, tree> &mapping)
+{
I'm quite sure I've seen such beast elsewhere ;) simplify_replace_tree
comes up first (not a 1:1 match but hints at a possible tree
sharing issue in your variant).
@@ -8497,11 +8588,11 @@ vect_transform_loop (loop_vec_info loop_vinfo)
if (th >= vect_vf_for_cost (loop_vinfo)
&& !LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
{
- if (dump_enabled_p ())
- dump_printf_loc (MSG_NOTE, vect_location,
- "Profitability threshold is %d loop
iterations.\n",
- th);
- check_profitability = true;
+ if (dump_enabled_p ())
+ dump_printf_loc (MSG_NOTE, vect_location,
+ "Profitability threshold is %d loop
iterations.\n",
+ th);
+ check_profitability = true;
}
/* Make sure there exists a single-predecessor exit bb. Do this before
obvious (separate)
+ tree advance;
epilogue = vect_do_peeling (loop_vinfo, niters, nitersm1,
&niters_vector,
&step_vector, &niters_vector_mult_vf, th,
- check_profitability, niters_no_overflow);
+ check_profitability, niters_no_overflow,
+ &advance);
+
+ if (epilogue)
+ {
+ basic_block *orig_bbs = get_loop_body (loop);
+ loop_vec_info epilogue_vinfo = loop_vec_info_for_loop (epilogue);
...
please record this in vect_do_peeling itself and store the
orig_stmts/drs/etc. in the epilogue loop_vinfo and ...
+ /* We are done vectorizing the main loop, so now we update the
epilogues
+ stmt_vec_info's. At the same time we set the gimple UID of each
+ statement in the epilogue, as these are used to look them up in
the
+ epilogues loop_vec_info later. We also keep track of what
...
split this out to a new function. I wonder why you need to record
the DRs, are they not available via ->datarefs and lookup_dr ()?
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index
601a6f55fbff388c89f88d994e790aebf2bf960e..201549da6c0cbae0797a23ae1b8967b9895505e9
100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -6288,7 +6288,7 @@ ensure_base_align (dr_vec_info *dr_info)
if (decl_in_symtab_p (base_decl))
symtab_node::get (base_decl)->increase_alignment (align_base_to);
- else
+ else if (DECL_ALIGN (base_decl) < align_base_to)
{
SET_DECL_ALIGN (base_decl, align_base_to);
DECL_USER_ALIGN (base_decl) = 1;
split out - preapproved.
Still have to go over the main loop doing the analysis/transform.
Thanks, it looks really promising (albeit exepectedly ugly due to
the data rewriting).
Richard.
> gcc/ChangeLog:
> 2019-10-10 Andre Vieira <andre.simoesdiasvieira@arm.com>
>
> PR 88915
> * cfgloop.h (loop): Add epilogue_vsizes member.
> * cfgloop.c (flow_loop_free): Release epilogue_vsizes.
> (alloc_loop): Initialize epilogue_vsizes.
> * gentype.c (main): Add poly_uint64 type and vector_sizes to
> generator.
> * tree-vect-loop.c (vect_get_loop_niters): Make externally visible.
> (_loop_vec_info): Initialize epilogue_vinfos.
> (~_loop_vec_info): Release epilogue_vinfos.
> (vect_analyze_loop_costing): Use knowledge of main VF to estimate
> number of iterations of epilogue.
> (determine_peel_for_niter): New. Outlined code to re-use in two
> places.
> (vect_analyze_loop_2): Adapt to analyse main loop for all supported
> vector sizes when vect-epilogues-nomask=1. Also keep track of lowest
> versioning threshold needed for main loop.
> (vect_analyze_loop): Likewise.
> (replace_ops): New helper function.
> (vect_transform_loop): When vectorizing epilogues re-use analysis done
> on main loop and update necessary information.
> * tree-vect-loop-manip.c (vect_update_inits_of_drs): No longer insert
> stmts on loop preheader edge.
> (vect_do_peeling): Enable skip-vectors when doing loop versioning if
> we decided to vectorize epilogues. Update epilogues NITERS and
> construct ADVANCE to update epilogues data references where needed.
> (vect_loop_versioning): Moved decision to check_profitability
> based on cost model.
> * tree-vect-stmts.c (ensure_base_align): Only update alignment
> if new alignment is lower.
> * tree-vectorizer.h (_loop_vec_info): Add epilogue_vinfos member.
> (vect_loop_versioning, vect_do_peeling, vect_get_loop_niters,
> vect_update_inits_of_drs, determine_peel_for_niter,
> vect_analyze_loop): Add or update declarations.
> * tree-vectorizer.c (try_vectorize_loop_1): Make sure to use already
> create loop_vec_info's for epilogues when available. Otherwise analyse
> epilogue separately.
>
>
>
> Cheers,
> Andre
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 247165 (AG München)
next prev parent reply other threads:[~2019-10-11 12:57 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-08-23 17:17 Andre Vieira (lists)
2019-08-26 13:17 ` Richard Biener
2019-09-18 11:11 ` Andre Vieira (lists)
2019-10-08 13:16 ` Andre Vieira (lists)
2019-10-09 9:06 ` Richard Biener
2019-10-10 14:01 ` Andre Vieira (lists)
2019-10-11 13:01 ` Richard Biener [this message]
2019-10-22 12:53 ` Andre Vieira (lists)
2019-10-22 14:05 ` Richard Biener
2019-10-25 16:18 ` Andre Vieira (lists)
2019-10-28 13:16 ` Richard Biener
2019-10-22 18:07 ` Richard Sandiford
2019-10-25 16:20 ` Andre Vieira (lists)
2019-10-25 16:27 ` Andre Vieira (lists)
2019-10-28 14:18 ` Richard Biener
2019-10-28 18:37 ` Andre Vieira (lists)
2019-10-29 12:12 ` Richard Biener
2019-10-29 13:17 ` [committed][vect]PR 88915: Vectorize epilogues when versioning loops (was Re: [PATCH 1/2][vect]PR 88915: Vectorize epilogues when versioning loops) Andre Vieira (lists)
2019-10-29 13:45 ` Richard Biener
2019-09-04 15:35 ` [PATCH 1/2][vect]PR 88915: Vectorize epilogues when versioning loops Jeff Law
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=nycvar.YFH.7.76.1910111408390.5566@zhemvz.fhfr.qr \
--to=rguenther@suse.de \
--cc=andre.simoesdiasvieira@arm.com \
--cc=gcc-patches@gcc.gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).