From: Tamar Christina <Tamar.Christina@arm.com>
To: Richard Biener <rguenther@suse.de>,
"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: RE: [PATCH] tree-optimization/113026 - avoid vector epilog in more cases
Date: Mon, 8 Jan 2024 13:22:31 +0000 [thread overview]
Message-ID: <VI1PR08MB5325E87A32912B281E67F202FF6B2@VI1PR08MB5325.eurprd08.prod.outlook.com> (raw)
In-Reply-To: <e99559bc-d4aa-45cf-9893-418ca3a366e8@DB5PEPF00014B9B.eurprd02.prod.outlook.com>
> -----Original Message-----
> From: Richard Biener <rguenther@suse.de>
> Sent: Monday, January 8, 2024 11:29 AM
> To: gcc-patches@gcc.gnu.org
> Cc: Tamar Christina <Tamar.Christina@arm.com>
> Subject: [PATCH] tree-optimization/113026 - avoid vector epilog in more cases
>
> The following avoids creating a niter peeling epilog more consistently,
> matching what peeling later uses for the skip_vector condition, in
> particular when versioning is required which then also ensures the
> vector loop is entered unless the epilog is vectorized. This should
> ideally match LOOP_VINFO_VERSIONING_THRESHOLD which is only computed
> later, some refactoring could make that better matching.
>
> The patch also makes sure to adjust the upper bound of the epilogues
> when we do not have a skip edge around the vector loop.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu. Tamar, does
> that look OK wrt early-breaks?
Yeah the value looks correct, I did find a few cases where the niters should actually be
higher for skip_vector, namely when of the breaks forces ncopies > 1 and we have a
break condition that requires all values to be true to continue.
The code is not wrong in that case, just executes a completely useless vector iters.
But that's unrelated, this looks correct because it means bound_scalar is not set, in
which case there's no difference between one and multiple exits.
Thanks,
Tamar
>
> Thanks,
> Richard.
>
> PR tree-optimization/113026
> * tree-vect-loop.cc (vect_need_peeling_or_partial_vectors_p):
> Avoid an epilog in more cases.
> * tree-vect-loop-manip.cc (vect_do_peeling): Adjust the
> epilogues niter upper bounds and estimates.
>
> * gcc.dg/torture/pr113026-1.c: New testcase.
> * gcc.dg/torture/pr113026-2.c: Likewise.
> ---
> gcc/testsuite/gcc.dg/torture/pr113026-1.c | 11 ++++++++
> gcc/testsuite/gcc.dg/torture/pr113026-2.c | 18 +++++++++++++
> gcc/tree-vect-loop-manip.cc | 32 +++++++++++++++++++++++
> gcc/tree-vect-loop.cc | 6 ++++-
> 4 files changed, 66 insertions(+), 1 deletion(-)
> create mode 100644 gcc/testsuite/gcc.dg/torture/pr113026-1.c
> create mode 100644 gcc/testsuite/gcc.dg/torture/pr113026-2.c
>
> diff --git a/gcc/testsuite/gcc.dg/torture/pr113026-1.c
> b/gcc/testsuite/gcc.dg/torture/pr113026-1.c
> new file mode 100644
> index 00000000000..56dfef3b36c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr113026-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-Wall" } */
> +
> +char dst[16];
> +
> +void
> +foo (char *src, long n)
> +{
> + for (long i = 0; i < n; i++)
> + dst[i] = src[i]; /* { dg-bogus "" } */
> +}
> diff --git a/gcc/testsuite/gcc.dg/torture/pr113026-2.c
> b/gcc/testsuite/gcc.dg/torture/pr113026-2.c
> new file mode 100644
> index 00000000000..b9d5857a403
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr113026-2.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-Wall" } */
> +
> +char dst1[17];
> +void
> +foo1 (char *src, long n)
> +{
> + for (long i = 0; i < n; i++)
> + dst1[i] = src[i]; /* { dg-bogus "" } */
> +}
> +
> +char dst2[18];
> +void
> +foo2 (char *src, long n)
> +{
> + for (long i = 0; i < n; i++)
> + dst2[i] = src[i]; /* { dg-bogus "" } */
> +}
> diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> index 9330183bfb9..927f76a0947 100644
> --- a/gcc/tree-vect-loop-manip.cc
> +++ b/gcc/tree-vect-loop-manip.cc
> @@ -3364,6 +3364,38 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree
> niters, tree nitersm1,
> bb_before_epilog->count = single_pred_edge (bb_before_epilog)->count
> ();
> bb_before_epilog = loop_preheader_edge (epilog)->src;
> }
> + else
> + {
> + /* When we do not have a loop-around edge to the epilog we know
> + the vector loop covered at least VF scalar iterations unless
> + we have early breaks and the epilog will cover at most
> + VF - 1 + gap peeling iterations.
> + Update any known upper bound with this knowledge. */
> + if (! LOOP_VINFO_EARLY_BREAKS (loop_vinfo))
> + {
> + if (epilog->any_upper_bound)
> + epilog->nb_iterations_upper_bound -= lowest_vf;
> + if (epilog->any_likely_upper_bound)
> + epilog->nb_iterations_likely_upper_bound -= lowest_vf;
> + if (epilog->any_estimate)
> + epilog->nb_iterations_estimate -= lowest_vf;
> + }
> + unsigned HOST_WIDE_INT const_vf;
> + if (vf.is_constant (&const_vf))
> + {
> + const_vf += LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo) - 1;
> + if (epilog->any_upper_bound)
> + epilog->nb_iterations_upper_bound
> + = wi::umin (epilog->nb_iterations_upper_bound, const_vf);
> + if (epilog->any_likely_upper_bound)
> + epilog->nb_iterations_likely_upper_bound
> + = wi::umin (epilog->nb_iterations_likely_upper_bound,
> + const_vf);
> + if (epilog->any_estimate)
> + epilog->nb_iterations_estimate
> + = wi::umin (epilog->nb_iterations_estimate, const_vf);
> + }
> + }
>
> /* If loop is peeled for non-zero constant times, now niters refers to
> orig_niters - prolog_peeling, it won't overflow even the orig_niters
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index a06771611ac..9dd573ef125 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -1261,7 +1261,11 @@ vect_need_peeling_or_partial_vectors_p
> (loop_vec_info loop_vinfo)
> the epilogue is unnecessary. */
> && (!LOOP_REQUIRES_VERSIONING (loop_vinfo)
> || ((unsigned HOST_WIDE_INT) max_niter
> - > (th / const_vf) * const_vf))))
> + /* We'd like to use LOOP_VINFO_VERSIONING_THRESHOLD
> + but that's only computed later based on our result.
> + The following is the most conservative approximation. */
> + > (std::max ((unsigned HOST_WIDE_INT) th,
> + const_vf) / const_vf) * const_vf))))
> return true;
>
> return false;
> --
> 2.35.3
next parent reply other threads:[~2024-01-08 13:22 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <e99559bc-d4aa-45cf-9893-418ca3a366e8@DB5PEPF00014B9B.eurprd02.prod.outlook.com>
2024-01-08 13:22 ` Tamar Christina [this message]
2024-01-08 11:29 Richard Biener
-- strict thread matches above, loose matches on Subject: below --
2023-12-15 11:28 Richard Biener
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=VI1PR08MB5325E87A32912B281E67F202FF6B2@VI1PR08MB5325.eurprd08.prod.outlook.com \
--to=tamar.christina@arm.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=rguenther@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).