public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Tamar Christina <Tamar.Christina@arm.com>
To: Richard Biener <rguenther@suse.de>,
	"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: RE: [PATCH] tree-optimization/113026 - avoid vector epilog in more cases
Date: Mon, 8 Jan 2024 13:22:31 +0000	[thread overview]
Message-ID: <VI1PR08MB5325E87A32912B281E67F202FF6B2@VI1PR08MB5325.eurprd08.prod.outlook.com> (raw)
In-Reply-To: <e99559bc-d4aa-45cf-9893-418ca3a366e8@DB5PEPF00014B9B.eurprd02.prod.outlook.com>

> -----Original Message-----
> From: Richard Biener <rguenther@suse.de>
> Sent: Monday, January 8, 2024 11:29 AM
> To: gcc-patches@gcc.gnu.org
> Cc: Tamar Christina <Tamar.Christina@arm.com>
> Subject: [PATCH] tree-optimization/113026 - avoid vector epilog in more cases
> 
> The following avoids creating a niter peeling epilog more consistently,
> matching what peeling later uses for the skip_vector condition, in
> particular when versioning is required which then also ensures the
> vector loop is entered unless the epilog is vectorized.  This should
> ideally match LOOP_VINFO_VERSIONING_THRESHOLD which is only computed
> later, some refactoring could make that better matching.
> 
> The patch also makes sure to adjust the upper bound of the epilogues
> when we do not have a skip edge around the vector loop.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu.  Tamar, does
> that look OK wrt early-breaks?

Yeah the value looks correct, I did find a few cases where the niters should actually be
higher for skip_vector, namely when of the breaks forces ncopies > 1 and we have a
break condition that requires all values to be true to continue.

The code is not wrong in that case, just executes a completely useless vector iters.

But that's unrelated, this looks correct because it means bound_scalar is not set, in
which case there's no difference between one and multiple exits.

Thanks,
Tamar

> 
> Thanks,
> Richard.
> 
> 	PR tree-optimization/113026
> 	* tree-vect-loop.cc (vect_need_peeling_or_partial_vectors_p):
> 	Avoid an epilog in more cases.
> 	* tree-vect-loop-manip.cc (vect_do_peeling): Adjust the
> 	epilogues niter upper bounds and estimates.
> 
> 	* gcc.dg/torture/pr113026-1.c: New testcase.
> 	* gcc.dg/torture/pr113026-2.c: Likewise.
> ---
>  gcc/testsuite/gcc.dg/torture/pr113026-1.c | 11 ++++++++
>  gcc/testsuite/gcc.dg/torture/pr113026-2.c | 18 +++++++++++++
>  gcc/tree-vect-loop-manip.cc               | 32 +++++++++++++++++++++++
>  gcc/tree-vect-loop.cc                     |  6 ++++-
>  4 files changed, 66 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/torture/pr113026-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/torture/pr113026-2.c
> 
> diff --git a/gcc/testsuite/gcc.dg/torture/pr113026-1.c
> b/gcc/testsuite/gcc.dg/torture/pr113026-1.c
> new file mode 100644
> index 00000000000..56dfef3b36c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr113026-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-Wall" } */
> +
> +char dst[16];
> +
> +void
> +foo (char *src, long n)
> +{
> +  for (long i = 0; i < n; i++)
> +    dst[i] = src[i]; /* { dg-bogus "" } */
> +}
> diff --git a/gcc/testsuite/gcc.dg/torture/pr113026-2.c
> b/gcc/testsuite/gcc.dg/torture/pr113026-2.c
> new file mode 100644
> index 00000000000..b9d5857a403
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr113026-2.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-Wall" } */
> +
> +char dst1[17];
> +void
> +foo1 (char *src, long n)
> +{
> +  for (long i = 0; i < n; i++)
> +    dst1[i] = src[i]; /* { dg-bogus "" } */
> +}
> +
> +char dst2[18];
> +void
> +foo2 (char *src, long n)
> +{
> +  for (long i = 0; i < n; i++)
> +    dst2[i] = src[i]; /* { dg-bogus "" } */
> +}
> diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> index 9330183bfb9..927f76a0947 100644
> --- a/gcc/tree-vect-loop-manip.cc
> +++ b/gcc/tree-vect-loop-manip.cc
> @@ -3364,6 +3364,38 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree
> niters, tree nitersm1,
>  	    bb_before_epilog->count = single_pred_edge (bb_before_epilog)->count
> ();
>  	  bb_before_epilog = loop_preheader_edge (epilog)->src;
>  	}
> +      else
> +	{
> +	  /* When we do not have a loop-around edge to the epilog we know
> +	     the vector loop covered at least VF scalar iterations unless
> +	     we have early breaks and the epilog will cover at most
> +	     VF - 1 + gap peeling iterations.
> +	     Update any known upper bound with this knowledge.  */
> +	  if (! LOOP_VINFO_EARLY_BREAKS (loop_vinfo))
> +	    {
> +	      if (epilog->any_upper_bound)
> +		epilog->nb_iterations_upper_bound -= lowest_vf;
> +	      if (epilog->any_likely_upper_bound)
> +		epilog->nb_iterations_likely_upper_bound -= lowest_vf;
> +	      if (epilog->any_estimate)
> +		epilog->nb_iterations_estimate -= lowest_vf;
> +	    }
> +	  unsigned HOST_WIDE_INT const_vf;
> +	  if (vf.is_constant (&const_vf))
> +	    {
> +	      const_vf += LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo) - 1;
> +	      if (epilog->any_upper_bound)
> +		epilog->nb_iterations_upper_bound
> +		  = wi::umin (epilog->nb_iterations_upper_bound, const_vf);
> +	      if (epilog->any_likely_upper_bound)
> +		epilog->nb_iterations_likely_upper_bound
> +		  = wi::umin (epilog->nb_iterations_likely_upper_bound,
> +			      const_vf);
> +	      if (epilog->any_estimate)
> +		epilog->nb_iterations_estimate
> +		  = wi::umin (epilog->nb_iterations_estimate, const_vf);
> +	    }
> +	}
> 
>        /* If loop is peeled for non-zero constant times, now niters refers to
>  	 orig_niters - prolog_peeling, it won't overflow even the orig_niters
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index a06771611ac..9dd573ef125 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -1261,7 +1261,11 @@ vect_need_peeling_or_partial_vectors_p
> (loop_vec_info loop_vinfo)
>  	     the epilogue is unnecessary.  */
>  	  && (!LOOP_REQUIRES_VERSIONING (loop_vinfo)
>  	      || ((unsigned HOST_WIDE_INT) max_niter
> -		  > (th / const_vf) * const_vf))))
> +		  /* We'd like to use LOOP_VINFO_VERSIONING_THRESHOLD
> +		     but that's only computed later based on our result.
> +		     The following is the most conservative approximation.  */
> +		  > (std::max ((unsigned HOST_WIDE_INT) th,
> +			       const_vf) / const_vf) * const_vf))))
>      return true;
> 
>    return false;
> --
> 2.35.3

       reply	other threads:[~2024-01-08 13:22 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <e99559bc-d4aa-45cf-9893-418ca3a366e8@DB5PEPF00014B9B.eurprd02.prod.outlook.com>
2024-01-08 13:22 ` Tamar Christina [this message]
2024-01-08 11:29 Richard Biener
  -- strict thread matches above, loose matches on Subject: below --
2023-12-15 11:28 Richard Biener

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=VI1PR08MB5325E87A32912B281E67F202FF6B2@VI1PR08MB5325.eurprd08.prod.outlook.com \
    --to=tamar.christina@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=rguenther@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).