public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <rguenther@suse.de>
To: Juzhe-Zhong <juzhe.zhong@rivai.ai>
Cc: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com
Subject: Re: [PATCH V2] Middle-end: Fix bug of induction variable vectorization for RVV
Date: Fri, 10 Nov 2023 13:35:00 +0000 (UTC)	[thread overview]
Message-ID: <nycvar.YFH.7.77.849.2311101334530.8772@jbgna.fhfr.qr> (raw)
In-Reply-To: <20231110122011.3626658-1-juzhe.zhong@rivai.ai>

On Fri, 10 Nov 2023, Juzhe-Zhong wrote:

> PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112438
> 
> 1. Since SELECT_VL result is not necessary always VF in non-final iteration.
> 
> Current GIMPLE IR is wrong:
> 
> # vect_vec_iv_.8_22 = PHI <_21(4), { 0, 1, 2, ... }(3)>
> ...
> _35 = .SELECT_VL (ivtmp_33, VF);
> _21 = vect_vec_iv_.8_22 + { VF, ... };
> 
> E.g. Consider the total iterations N = 6, the VF = 4.
> Since SELECT_VL output is defined as not always to be VF in non-final iteration
> which needs to depend on hardware implementation.
> 
> Suppose we have a RVV CPU core with vsetvl doing even distribution workload optimization.
> It may process 3 elements at the 1st iteration and 3 elements at the last iteration.
> Then the induction variable here: _21 = vect_vec_iv_.8_22 + { POLY_INT_CST [4, 4], ... }; 
> is wrong which is adding VF, which is 4, actually, we didn't process 4 elements.
> 
> It should be adding 3 elements which is the result of SELECT_VL.
> So, here the correct IR should be:
> 
>   _36 = .SELECT_VL (ivtmp_34, VF);
>   _22 = (int) _36;
>   vect_cst__21 = [vec_duplicate_expr] _22;
> 
> 2. This issue only happens on non-SLP vectorization single rgroup since:
>    
>      if (LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo))
>     {
>       tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo);
>       if (direct_internal_fn_supported_p (IFN_SELECT_VL, iv_type,
> 					  OPTIMIZE_FOR_SPEED)
> 	  && LOOP_VINFO_LENS (loop_vinfo).length () == 1
> 	  && LOOP_VINFO_LENS (loop_vinfo)[0].factor == 1 && !slp
> 	  && (!LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
> 	      || !LOOP_VINFO_VECT_FACTOR (loop_vinfo).is_constant ()))
> 	LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo) = true;
>     }
> 
> 3. This issue doesn't appears on nested loop no matter LOOP_VINFO_USING_SELECT_VL_P is true or false.
> 
> Since:
> 
>   # vect_vec_iv_.6_5 = PHI <_19(3), { 0, ... }(5)>
>   # vect_diff_15.7_20 = PHI <vect_diff_9.8_22(3), vect_diff_18.5_11(5)>
>   _19 = vect_vec_iv_.6_5 + { 1, ... };
>   vect_diff_9.8_22 = .COND_LEN_ADD ({ -1, ... }, vect_vec_iv_.6_5, vect_diff_15.7_20, vect_diff_15.7_20, _28, 0);
>   ivtmp_1 = ivtmp_4 + 4294967295;
>   ....
>   <bb 5> [local count: 6549826]:
>   # vect_diff_18.5_11 = PHI <vect_diff_9.8_22(4), { 0, ... }(2)>
>   # ivtmp_26 = PHI <ivtmp_27(4), 40(2)>
>   _28 = .SELECT_VL (ivtmp_26, POLY_INT_CST [4, 4]);
>   goto <bb 3>; [100.00%]
> 
> Note the induction variable IR: _21 = vect_vec_iv_.8_22 + { POLY_INT_CST [4, 4], ... }; update induction variable
> independent on VF (or don't care about how many elements are processed in the iteration).
> 
> The update is loop invariant. So it won't be the problem even if LOOP_VINFO_USING_SELECT_VL_P is true.
>    
> Testing passed, Ok for trunk ?

OK.

Richard.

> 	PR tree-optimization/112438
> 
> gcc/ChangeLog:
> 
> 	* tree-vect-loop.cc (vectorizable_induction):
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.target/riscv/rvv/autovec/pr112438.c: New test.
> 
> ---
>  .../gcc.target/riscv/rvv/autovec/pr112438.c   | 33 +++++++++++++++++++
>  gcc/tree-vect-loop.cc                         | 30 ++++++++++++++++-
>  2 files changed, 62 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112438.c
> 
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112438.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112438.c
> new file mode 100644
> index 00000000000..51f90df38a0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112438.c
> @@ -0,0 +1,33 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-vect-cost-model -ffast-math -fdump-tree-optimized-details" } */
> +
> +void
> +foo (int n, int *__restrict in, int *__restrict out)
> +{
> +  for (int i = 0; i < n; i += 1)
> +    {
> +      out[i] = in[i] + i;
> +    }
> +}
> +
> +void
> +foo2 (int n, float * __restrict in, 
> +float * __restrict out)
> +{
> +  for (int i = 0; i < n; i += 1)
> +    {
> +      out[i] = in[i] + i;
> +    }
> +}
> +
> +void
> +foo3 (int n, float * __restrict in, 
> +float * __restrict out, float x)
> +{
> +  for (int i = 0; i < n; i += 1)
> +    {
> +      out[i] = in[i] + i* i;
> +    }
> +}
> +
> +/* We don't want to see vect_vec_iv_.21_25 + { POLY_INT_CST [4, 4], ... }.  */
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 8abc1937d74..b152072c969 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -10306,10 +10306,36 @@ vectorizable_induction (loop_vec_info loop_vinfo,
>  
>  
>    /* Create the vector that holds the step of the induction.  */
> +  gimple_stmt_iterator *step_iv_si = NULL;
>    if (nested_in_vect_loop)
>      /* iv_loop is nested in the loop to be vectorized. Generate:
>         vec_step = [S, S, S, S]  */
>      new_name = step_expr;
> +  else if (LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo))
> +    {
> +      /* When we're using loop_len produced by SELEC_VL, the non-final
> +	 iterations are not always processing VF elements.  So vectorize
> +	 induction variable instead of
> +
> +	   _21 = vect_vec_iv_.6_22 + { VF, ... };
> +
> +	 We should generate:
> +
> +	   _35 = .SELECT_VL (ivtmp_33, VF);
> +	   vect_cst__22 = [vec_duplicate_expr] _35;
> +	   _21 = vect_vec_iv_.6_22 + vect_cst__22;  */
> +      gcc_assert (!slp_node);
> +      gimple_seq seq = NULL;
> +      vec_loop_lens *lens = &LOOP_VINFO_LENS (loop_vinfo);
> +      tree len = vect_get_loop_len (loop_vinfo, NULL, lens, 1, vectype, 0, 0);
> +      expr = force_gimple_operand (fold_convert (TREE_TYPE (step_expr),
> +						 unshare_expr (len)),
> +				   &seq, true, NULL_TREE);
> +      new_name = gimple_build (&seq, MULT_EXPR, TREE_TYPE (step_expr), expr,
> +			       step_expr);
> +      gsi_insert_seq_before (&si, seq, GSI_SAME_STMT);
> +      step_iv_si = &si;
> +    }
>    else
>      {
>        /* iv_loop is the loop to be vectorized. Generate:
> @@ -10336,7 +10362,7 @@ vectorizable_induction (loop_vec_info loop_vinfo,
>  	      || TREE_CODE (new_name) == SSA_NAME);
>    new_vec = build_vector_from_val (step_vectype, t);
>    vec_step = vect_init_vector (loop_vinfo, stmt_info,
> -			       new_vec, step_vectype, NULL);
> +			       new_vec, step_vectype, step_iv_si);
>  
>  
>    /* Create the following def-use cycle:
> @@ -10382,6 +10408,8 @@ vectorizable_induction (loop_vec_info loop_vinfo,
>        gimple_seq seq = NULL;
>        /* FORNOW. This restriction should be relaxed.  */
>        gcc_assert (!nested_in_vect_loop);
> +      /* We expect LOOP_VINFO_USING_SELECT_VL_P to be false if ncopies > 1.  */
> +      gcc_assert (!LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo));
>  
>        /* Create the vector that holds the step of the induction.  */
>        if (SCALAR_FLOAT_TYPE_P (TREE_TYPE (step_expr)))
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

      reply	other threads:[~2023-11-10 13:35 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-10 12:20 Juzhe-Zhong
2023-11-10 13:35 ` Richard Biener [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=nycvar.YFH.7.77.849.2311101334530.8772@jbgna.fhfr.qr \
    --to=rguenther@suse.de \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=juzhe.zhong@rivai.ai \
    --cc=richard.sandiford@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).