[PATCH 1/3] Use cvt_op to save intermediate type operand instead of "subtle" vec


* [PATCH 1/3] Use cvt_op to save intermediate type operand instead of "subtle" vec_dest.
@ 2023-06-26  1:31 liuhongt
  2023-06-26  1:31 ` [PATCH 2/3] Don't use intermediate type for FIX_TRUNC_EXPR when ftrapping-math liuhongt
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: liuhongt @ 2023-06-26  1:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: richard.sandiford, rguenther

When there are multiple operands in vec_oprnds0, vec_dest is
overwritten with vectype_out after the first iteration, but the
multi_step_cvt case expects cvt_type for the intermediate statement.
This caused an ICE in verify_gimple_in_cfg.
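
To illustrate the failure mode outside GCC, here is a minimal C model.
It is not the vectorizer code: the enum values and function names are
hypothetical stand-ins, and only the control flow mirrors the old and
new shape of the loop in case NONE.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the two vector types involved.  */
enum vtype { CVT_TYPE, VECTYPE_OUT };

/* Old shape: vec_dest is reassigned inside the loop, so only the
   first operand gets the intermediate type.  */
void
old_loop (enum vtype *cvt_lhs, size_t n_operands)
{
  enum vtype vec_dest = CVT_TYPE;
  for (size_t i = 0; i < n_operands; i++)
    {
      cvt_lhs[i] = vec_dest;	/* Type of the intermediate conversion.  */
      vec_dest = VECTYPE_OUT;	/* Clobbers it for later iterations.  */
    }
}

/* New shape: the intermediate type is saved in cvt_op before the
   loop, so every operand sees it.  */
void
new_loop (enum vtype *cvt_lhs, size_t n_operands)
{
  enum vtype vec_dest = CVT_TYPE;
  enum vtype cvt_op = vec_dest;
  vec_dest = VECTYPE_OUT;
  for (size_t i = 0; i < n_operands; i++)
    cvt_lhs[i] = cvt_op;
  (void) vec_dest;		/* Would be used for the final statement.  */
}
```

With two operands, old_loop records CVT_TYPE and then VECTYPE_OUT,
while new_loop records CVT_TYPE for both.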

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,} and aarch64-linux-gnu.
Ok for trunk?

gcc/ChangeLog:

	PR tree-optimization/110371
	PR tree-optimization/110018
	* tree-vect-stmts.cc (vectorizable_conversion): Use cvt_op to
	save intermediate type operand instead of "subtle" vec_dest
	for case NONE.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/pr110371.c: New test.
---
 gcc/testsuite/gcc.target/aarch64/pr110371.c | 20 ++++++++++++++++++++
 gcc/tree-vect-stmts.cc                      | 14 ++++++++++----
 2 files changed, 30 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr110371.c

diff --git a/gcc/testsuite/gcc.target/aarch64/pr110371.c b/gcc/testsuite/gcc.target/aarch64/pr110371.c
new file mode 100644
index 00000000000..444e514e04f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr110371.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+typedef struct dest
+{
+  double m[3][3];
+} dest;
+
+typedef struct src
+{
+  int m[3][3];
+} src;
+
+void
+foo (dest *a, src* s)
+{
+  for (int i = 0; i != 3; i++)
+    for (int j = 0; j != 3; j++)
+      a->m[i][j] = s->m[i][j];
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 85d1f3ae52c..1748555a625 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -5044,7 +5044,7 @@ vectorizable_conversion (vec_info *vinfo,
 			 gimple **vec_stmt, slp_tree slp_node,
 			 stmt_vector_for_cost *cost_vec)
 {
-  tree vec_dest;
+  tree vec_dest, cvt_op = NULL_TREE;
   tree scalar_dest;
   tree op0, op1 = NULL_TREE;
   loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo);
@@ -5568,6 +5568,13 @@ vectorizable_conversion (vec_info *vinfo,
     case NONE:
       vect_get_vec_defs (vinfo, stmt_info, slp_node, ncopies,
 			 op0, &vec_oprnds0);
+      /* vec_dest is intermediate type operand when multi_step_cvt.  */
+      if (multi_step_cvt)
+	{
+	  cvt_op = vec_dest;
+	  vec_dest = vec_dsts[0];
+	}
+
       FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0)
 	{
 	  /* Arguments are ready, create the new vector stmt.  */
@@ -5575,12 +5582,11 @@ vectorizable_conversion (vec_info *vinfo,
 	  if (multi_step_cvt)
 	    {
 	      gcc_assert (multi_step_cvt == 1);
-	      new_stmt = vect_gimple_build (vec_dest, codecvt1, vop0);
-	      new_temp = make_ssa_name (vec_dest, new_stmt);
+	      new_stmt = vect_gimple_build (cvt_op, codecvt1, vop0);
+	      new_temp = make_ssa_name (cvt_op, new_stmt);
 	      gimple_assign_set_lhs (new_stmt, new_temp);
 	      vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 	      vop0 = new_temp;
-	      vec_dest = vec_dsts[0];
 	    }
 	  new_stmt = vect_gimple_build (vec_dest, code1, vop0);
 	  new_temp = make_ssa_name (vec_dest, new_stmt);
-- 
2.39.1.388.g2fc9e9ca3c



* [PATCH 2/3] Don't use intermediate type for FIX_TRUNC_EXPR when ftrapping-math.
  2023-06-26  1:31 [PATCH 1/3] Use cvt_op to save intermediate type operand instead of "subtle" vec_dest liuhongt
@ 2023-06-26  1:31 ` liuhongt
  2023-06-26  7:24   ` Richard Biener
  2023-06-26  1:31 ` [PATCH 3/3] [aarch64] Adjust testcase to match assembly output after r14-2007 liuhongt
  2023-06-26  7:46 ` [PATCH 1/3] Use cvt_op to save intermediate type operand instead of "subtle" vec_dest Richard Biener
  2 siblings, 1 reply; 7+ messages in thread
From: liuhongt @ 2023-06-26  1:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: richard.sandiford, rguenther

> > Hmm, good question.  GENERIC has a direct truncation to unsigned char
> > for example, the C standard generally says if the integral part cannot
> > be represented then the behavior is undefined.  So I think we should be
> > safe here (0x1.0p32 doesn't fit an int).
>
> We should be following Annex F (unspecified value plus "invalid" exception
> for out-of-range floating-to-integer conversions rather than undefined
> behavior).  But we don't achieve that very well at present (see bug 93806
> comments 27-29 for examples of how such conversions produce wobbly
> values).

That means guarding this with !flag_trapping_math is the appropriate
thing to do.
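
A small C model of why the intermediate type matters under
-ftrapping-math. It assumes, purely for illustration, a double
converted to int32_t either directly or through an int64_t
intermediate; the function names are hypothetical and the range checks
only model the Annex F rule that an out-of-range floating-to-integer
conversion raises "invalid".

```c
#include <stdbool.h>

/* True if X, truncated toward zero, is representable in int32_t.
   NaN compares false and is therefore out of range.  */
static bool
fits_int32 (double x)
{
  return x > -2147483649.0 && x < 2147483648.0;
}

/* Likewise for int64_t; both bounds are exact powers of two.  */
static bool
fits_int64 (double x)
{
  return x >= -0x1p63 && x < 0x1p63;
}

/* double -> int32_t in one step: "invalid" whenever X does not fit.  */
bool
direct_raises_invalid (double x)
{
  return !fits_int32 (x);
}

/* double -> int64_t -> int32_t: only the first step is a
   floating-point conversion, so only it can raise "invalid".  The
   integer narrowing that follows raises nothing.  */
bool
two_step_raises_invalid (double x)
{
  return !fits_int64 (x);
}
```

For 0x1.0p32 the direct conversion is out of range, but the two-step
form is not, so the exception would be lost.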

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,} and aarch64-linux-gnu.
Ok for trunk?

gcc/ChangeLog:

	PR tree-optimization/110371
	PR tree-optimization/110018
	* tree-vect-stmts.cc (vectorizable_conversion): Don't use
	intermediate type for FIX_TRUNC_EXPR when ftrapping-math.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr110018-1.c: Add -fno-trapping-math to dg-options.
	* gcc.target/i386/pr110018-2.c: Ditto.
---
 gcc/testsuite/gcc.target/i386/pr110018-1.c | 2 +-
 gcc/testsuite/gcc.target/i386/pr110018-2.c | 2 +-
 gcc/tree-vect-stmts.cc                     | 3 ++-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/i386/pr110018-1.c b/gcc/testsuite/gcc.target/i386/pr110018-1.c
index b6a3be7b7a2..24eeca60f6f 100644
--- a/gcc/testsuite/gcc.target/i386/pr110018-1.c
+++ b/gcc/testsuite/gcc.target/i386/pr110018-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-mavx512fp16 -mavx512vl -O2 -mavx512dq" } */
+/* { dg-options "-mavx512fp16 -mavx512vl -O2 -mavx512dq -fno-trapping-math" } */
 /* { dg-final { scan-assembler-times {(?n)vcvttp[dsh]2[dqw]} 5 } } */
 /* { dg-final { scan-assembler-times {(?n)vcvt[dqw]*2p[dsh]} 5 } } */
 
diff --git a/gcc/testsuite/gcc.target/i386/pr110018-2.c b/gcc/testsuite/gcc.target/i386/pr110018-2.c
index a663e074698..9a2d9e17894 100644
--- a/gcc/testsuite/gcc.target/i386/pr110018-2.c
+++ b/gcc/testsuite/gcc.target/i386/pr110018-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-mavx512fp16 -mavx512vl -O2 -mavx512dq" } */
+/* { dg-options "-mavx512fp16 -mavx512vl -O2 -mavx512dq -fno-trapping-math" } */
 /* { dg-final { scan-assembler-times {(?n)vcvttp[dsh]2[dqw]} 5 } } */
 /* { dg-final { scan-assembler-times {(?n)vcvt[dqw]*2p[dsh]} 5 } } */
 
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 1748555a625..bf61461939b 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -5263,7 +5263,8 @@ vectorizable_conversion (vec_info *vinfo,
       if ((code == FLOAT_EXPR
 	   && GET_MODE_SIZE (lhs_mode) > GET_MODE_SIZE (rhs_mode))
 	  || (code == FIX_TRUNC_EXPR
-	      && GET_MODE_SIZE (rhs_mode) > GET_MODE_SIZE (lhs_mode)))
+	      && GET_MODE_SIZE (rhs_mode) > GET_MODE_SIZE (lhs_mode)
+	      && !flag_trapping_math))
 	{
 	  bool float_expr_p = code == FLOAT_EXPR;
 	  scalar_mode imode = float_expr_p ? rhs_mode : lhs_mode;
-- 
2.39.1.388.g2fc9e9ca3c



* [PATCH 3/3] [aarch64] Adjust testcase to match assembly output after r14-2007.
  2023-06-26  1:31 [PATCH 1/3] Use cvt_op to save intermediate type operand instead of "subtle" vec_dest liuhongt
  2023-06-26  1:31 ` [PATCH 2/3] Don't use intermediate type for FIX_TRUNC_EXPR when ftrapping-math liuhongt
@ 2023-06-26  1:31 ` liuhongt
  2023-06-26  1:32   ` Hongtao Liu
  2023-06-26  8:21   ` Richard Sandiford
  2023-06-26  7:46 ` [PATCH 1/3] Use cvt_op to save intermediate type operand instead of "subtle" vec_dest Richard Biener
  2 siblings, 2 replies; 7+ messages in thread
From: liuhongt @ 2023-06-26  1:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: richard.sandiford, rguenther

The new assembly looks better than the original, so I adjusted those testcases.
Ok for trunk?

gcc/testsuite/ChangeLog:

	PR tree-optimization/110371
	PR tree-optimization/110018
	* gcc.target/aarch64/sve/unpack_fcvt_signed_1.c: Scan scvtf +
	sxtw instead of scvtf + zip1 + zip2.
	* gcc.target/aarch64/sve/unpack_fcvt_unsigned_1.c: Scan scvtf +
	uxtw instead of ucvtf + zip1 + zip2.
---
 gcc/testsuite/gcc.target/aarch64/sve/unpack_fcvt_signed_1.c | 6 +++---
 .../gcc.target/aarch64/sve/unpack_fcvt_unsigned_1.c         | 5 ++---
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/sve/unpack_fcvt_signed_1.c b/gcc/testsuite/gcc.target/aarch64/sve/unpack_fcvt_signed_1.c
index 0f96dc2ff00..5edc288ce35 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/unpack_fcvt_signed_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/unpack_fcvt_signed_1.c
@@ -10,6 +10,6 @@ unpack_double_int_plus8 (double *d, int32_t *s, int size)
     d[i] = s[i] + 8;
 }
 
-/* { dg-final { scan-assembler-times {\tzip1\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tzip2\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tscvtf\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.s\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tscvtf\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tsxtw\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 1 } } */
+
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/unpack_fcvt_unsigned_1.c b/gcc/testsuite/gcc.target/aarch64/sve/unpack_fcvt_unsigned_1.c
index 70465f91eba..ecd72176177 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/unpack_fcvt_unsigned_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/unpack_fcvt_unsigned_1.c
@@ -10,6 +10,5 @@ unpack_double_int_plus9 (double *d, uint32_t *s, int size)
     d[i] = (double) (s[i] + 9);
 }
 
-/* { dg-final { scan-assembler-times {\tzip1\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tzip2\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tucvtf\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.s\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tscvtf\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tuxtw\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 1 } } */
-- 
2.39.1.388.g2fc9e9ca3c



* Re: [PATCH 3/3] [aarch64] Adjust testcase to match assembly output after r14-2007.
  2023-06-26  1:31 ` [PATCH 3/3] [aarch64] Adjust testcase to match assembly output after r14-2007 liuhongt
@ 2023-06-26  1:32   ` Hongtao Liu
  2023-06-26  8:21   ` Richard Sandiford
  1 sibling, 0 replies; 7+ messages in thread
From: Hongtao Liu @ 2023-06-26  1:32 UTC (permalink / raw)
  To: richard.sandiford; +Cc: gcc-patches, thiago.bauermann

On Mon, Jun 26, 2023 at 9:31 AM liuhongt via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> The new assembly looks better than original one, so I adjust those testcases.
> Ok for trunk?
>
> gcc/testsuite/ChangeLog:
>
>         PR tree-optimization/110371
>         PR tree-optimization/110018
>         * gcc.target/aarch64/sve/unpack_fcvt_signed_1.c: Scan scvt +
>         sxtw instead of scvt + zip1 + zip2.
>         * gcc.target/aarch64/sve/unpack_fcvt_unsigned_1.c: Scan scvt +
>         uxtw instead of ucvtf + zip1 + zip2.
> ---
>  gcc/testsuite/gcc.target/aarch64/sve/unpack_fcvt_signed_1.c | 6 +++---
>  .../gcc.target/aarch64/sve/unpack_fcvt_unsigned_1.c         | 5 ++---
>  2 files changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/unpack_fcvt_signed_1.c b/gcc/testsuite/gcc.target/aarch64/sve/unpack_fcvt_signed_1.c
> index 0f96dc2ff00..5edc288ce35 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/unpack_fcvt_signed_1.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/unpack_fcvt_signed_1.c
> @@ -10,6 +10,6 @@ unpack_double_int_plus8 (double *d, int32_t *s, int size)
>      d[i] = s[i] + 8;
>  }
>
> -/* { dg-final { scan-assembler-times {\tzip1\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
> -/* { dg-final { scan-assembler-times {\tzip2\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
> -/* { dg-final { scan-assembler-times {\tscvtf\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.s\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tscvtf\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tsxtw\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 1 } } */
> +
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/unpack_fcvt_unsigned_1.c b/gcc/testsuite/gcc.target/aarch64/sve/unpack_fcvt_unsigned_1.c
> index 70465f91eba..ecd72176177 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/unpack_fcvt_unsigned_1.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/unpack_fcvt_unsigned_1.c
> @@ -10,6 +10,5 @@ unpack_double_int_plus9 (double *d, uint32_t *s, int size)
>      d[i] = (double) (s[i] + 9);
>  }
>
> -/* { dg-final { scan-assembler-times {\tzip1\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
> -/* { dg-final { scan-assembler-times {\tzip2\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
> -/* { dg-final { scan-assembler-times {\tucvtf\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.s\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tscvtf\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tuxtw\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 1 } } */
> --
> 2.39.1.388.g2fc9e9ca3c
>


--
BR,
Hongtao


* Re: [PATCH 2/3] Don't use intermediate type for FIX_TRUNC_EXPR when ftrapping-math.
  2023-06-26  1:31 ` [PATCH 2/3] Don't use intermediate type for FIX_TRUNC_EXPR when ftrapping-math liuhongt
@ 2023-06-26  7:24   ` Richard Biener
  0 siblings, 0 replies; 7+ messages in thread
From: Richard Biener @ 2023-06-26  7:24 UTC (permalink / raw)
  To: liuhongt; +Cc: gcc-patches, richard.sandiford, rguenther

On Mon, Jun 26, 2023 at 3:31 AM liuhongt via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> > > Hmm, good question.  GENERIC has a direct truncation to unsigned char
> > > for example, the C standard generally says if the integral part cannot
> > > be represented then the behavior is undefined.  So I think we should be
> > > safe here (0x1.0p32 doesn't fit an int).
> >
> > We should be following Annex F (unspecified value plus "invalid" exception
> > for out-of-range floating-to-integer conversions rather than undefined
> > behavior).  But we don't achieve that very well at present (see bug 93806
> > comments 27-29 for examples of how such conversions produce wobbly
> > values).
>
> That would mean guarding this with !flag_trapping_math would be the appropriate
> thing to do.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,} and aarch64-linux-gnu.
> Ok for trunk?

OK.

Thanks,
Richard.



* Re: [PATCH 1/3] Use cvt_op to save intermediate type operand instead of "subtle" vec_dest.
  2023-06-26  1:31 [PATCH 1/3] Use cvt_op to save intermediate type operand instead of "subtle" vec_dest liuhongt
  2023-06-26  1:31 ` [PATCH 2/3] Don't use intermediate type for FIX_TRUNC_EXPR when ftrapping-math liuhongt
  2023-06-26  1:31 ` [PATCH 3/3] [aarch64] Adjust testcase to match assembly output after r14-2007 liuhongt
@ 2023-06-26  7:46 ` Richard Biener
  2 siblings, 0 replies; 7+ messages in thread
From: Richard Biener @ 2023-06-26  7:46 UTC (permalink / raw)
  To: liuhongt; +Cc: gcc-patches, richard.sandiford

On Mon, 26 Jun 2023, liuhongt wrote:

> When there're multiple operands in vec_oprnds0, vec_dest will be
> overwrited to vectype_out, but in multi_step_cvt case, cvt_type is
> expected. It caused an ICE when verify_gimple_in_cfg.
> 
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,} and aarch64-linux-gnu.
> Ok for trunk?

OK.

Richard.


-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)


* Re: [PATCH 3/3] [aarch64] Adjust testcase to match assembly output after r14-2007.
  2023-06-26  1:31 ` [PATCH 3/3] [aarch64] Adjust testcase to match assembly output after r14-2007 liuhongt
  2023-06-26  1:32   ` Hongtao Liu
@ 2023-06-26  8:21   ` Richard Sandiford
  1 sibling, 0 replies; 7+ messages in thread
From: Richard Sandiford @ 2023-06-26  8:21 UTC (permalink / raw)
  To: liuhongt; +Cc: gcc-patches, rguenther

liuhongt <hongtao.liu@intel.com> writes:
> The new assembly looks better than original one, so I adjust those testcases.

The new loops are shorter, but they process only half the amount of data
per iteration.

The problem is that the new vectoriser code generates multiple statements
but only costs one.  I'll post a fix soon.
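
As a rough model of the throughput point, assuming 32-bit source
elements, 64-bit results and a vector length of vl_bits (the function
names are hypothetical, and this ignores everything else in the loop):
the old sequence unpacks a full vector of 32-bit elements into two
vectors of doubles, while the new one works on a single vector of
64-bit containers.

```c
#include <stddef.h>

/* Old sequence (zip1 + zip2 + two scvtf): one full vector of 32-bit
   elements is consumed per iteration.  */
size_t
elems_per_iter_unpacked (size_t vl_bits)
{
  return vl_bits / 32;
}

/* New sequence (sxtw/uxtw + one scvtf): elements live in 64-bit
   containers, so only half as many fit per iteration.  */
size_t
elems_per_iter_extended (size_t vl_bits)
{
  return vl_bits / 64;
}

/* Iterations needed for N elements; PER_ITER must be nonzero.  */
size_t
iterations (size_t n, size_t per_iter)
{
  return (n + per_iter - 1) / per_iter;
}
```

For a 256-bit vector and 1024 elements this gives 128 iterations for
the old sequence and 256 for the new one.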

Thanks,
Richard


