Re: [PATCH] [PR target/97194] [AVX2] Support variable index vec_set.

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Hongtao Liu <crazylht@gmail.com>
To: GCC Patches <gcc-patches@gcc.gnu.org>,
	Kirill Yukhin <kirill.yukhin@gmail.com>
Subject: Re: [PATCH] [PR target/97194] [AVX2] Support variable index vec_set.
Date: Tue, 27 Oct 2020 15:51:53 +0800	[thread overview]
Message-ID: <CAMZc-bzXUGMBbRReERhvyJj_nESSFopxrcwC=gwQgMtpKaMPnA@mail.gmail.com> (raw)
In-Reply-To: <CAFiYyc1mNhhmSs3wEkbpd1pJg8VTn0K3U0+P5dKsm_dOqMXB3g@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 5438 bytes --]

ping^1

On Tue, Oct 20, 2020 at 3:36 PM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Tue, Oct 20, 2020 at 4:35 AM Hongtao Liu <crazylht@gmail.com> wrote:
> >
> > On Mon, Oct 19, 2020 at 5:55 PM Richard Biener
> > <richard.guenther@gmail.com> wrote:
> > >
> > > On Mon, Oct 19, 2020 at 11:37 AM Hongtao Liu <crazylht@gmail.com> wrote:
> > > >
> > > > On Mon, Oct 19, 2020 at 5:07 PM Richard Biener
> > > > <richard.guenther@gmail.com> wrote:
> > > > >
> > > > > On Mon, Oct 19, 2020 at 10:21 AM Hongtao Liu <crazylht@gmail.com> wrote:
> > > > > >
> > > > > > Hi:
> > > > > >   It's implemented as below:
> > > > > > V setg (V v, int idx, T val)
> > > > > >
> > > > > > {
> > > > > >   V idxv = (V){idx, idx, idx, idx, idx, idx, idx, idx};
> > > > > >   V valv = (V){val, val, val, val, val, val, val, val};
> > > > > >   V mask = ((V){0, 1, 2, 3, 4, 5, 6, 7} == idxv);
> > > > > >   v = (v & ~mask) | (valv & mask);
> > > > > >   return v;
> > > > > > }
> > > > > >
> > > > > > Bootstrap is fine, regression test for i386/x86-64 backend is ok.
> > > > > > Ok for trunk?
> > > > >
> > > > > Hmm, I guess you're trying to keep the code for !AVX512BW simple
> > > > > but isn't just splitting the compare into
> > > > >
> > > > >  clow = {0, 1, 2, 3 ... } == idxv
> > > > >  chigh = {16, 17, 18, ... } == idxv;
> > > > >  cmp = {clow, chigh}
> > > > >
> > > >
> > > > We also don't have 512-bits byte/word blend instructions without
> > > > TARGET_AVX512W, so how to use 512-bits cmp?
> > >
> > > Oh, I see.  Guess two back-to-back vpternlog could emulate
> >
> > Yes, we can have something like vpternlogd %zmm0, %zmm1, %zmm2, 0xD8,
> > but since we don't have 512-bits bytes/word broadcast instruction,
> > It would need 2 broadcast and 1 vec_concat to get 1 512-bits vector.
> > it wouldn't save many instructions compared to my version(as below).
> >
> > ---
> >         leal    -16(%rsi), %eax
> >         vmovd   %edi, %xmm2
> >         vmovdqa .LC0(%rip), %ymm4
> >         vextracti64x4   $0x1, %zmm0, %ymm3
> >         vmovd   %eax, %xmm1
> >         vpbroadcastw    %xmm2, %ymm2
> >         vpbroadcastw    %xmm1, %ymm1
> >         vpcmpeqw        %ymm4, %ymm1, %ymm1
> >         vpblendvb       %ymm1, %ymm2, %ymm3, %ymm3
> >         vmovd   %esi, %xmm1
> >         vpbroadcastw    %xmm1, %ymm1
> >         vpcmpeqw        %ymm4, %ymm1, %ymm1
> >         vpblendvb       %ymm1, %ymm2, %ymm0, %ymm0
> >         vinserti64x4    $0x1, %ymm3, %zmm0, %zmm0
> > ---
> >
> > > the blend?  Not sure if important - I recall only knl didn't have bw?
> > >
> >
> > Yes, after(including) SKX, all avx512 targets will support AVX512BW.
> > And i don't think performance for V32HI/V64QI without AVX512BW is important.
>
> True.
>
> I have no further comments on the patch then - it still needs i386 maintainer
> approval though.
>
> Thanks,
> Richard.
>
> >
> > > > cut from i386-expand.c:
> > > > in ix86_expand_sse_movcc
> > > >  3682    case E_V64QImode:
> > > >  3683      gen = gen_avx512bw_blendmv64qi; ---> TARGET_AVX512BW needed
> > > >  3684      break;
> > > >  3685    case E_V32HImode:
> > > >  3686      gen = gen_avx512bw_blendmv32hi; --> TARGET_AVX512BW needed
> > > >  3687      break;
> > > >  3688    case E_V16SImode:
> > > >  3689      gen = gen_avx512f_blendmv16si;
> > > >  3690      break;
> > > >  3691    case E_V8DImode:
> > > >  3692      gen = gen_avx512f_blendmv8di;
> > > >  3693      break;
> > > >  3694    case E_V8DFmode:
> > > >
> > > > > faster, smaller and eventually even easier during expansion?
> > > > >
> > > > > +  gcc_assert (ix86_expand_vector_init_duplicate (false, mode, valv, val));
> > > > > +  gcc_assert (ix86_expand_vector_init_duplicate (false, cmp_mode,
> > > > > idxv, idx_tmp));
> > > > >
> > > > > side-effects in gcc_assert is considered bad style, use
> > > > >
> > > > >   ok = ix86_expand_vector_init_duplicate (false, mode, valv, val);
> > > > >   gcc_assert (ok);
> > > > >
> > > > > +  vec[5] = constv;
> > > > > +  ix86_expand_int_vcond (vec);
> > > > >
> > > > > this also returns a bool you probably should assert true.
> > > > >
> > > >
> > > > Yes, will change.
> > > >
> > > > > Otherwise thanks for tackling this.
> > > > >
> > > > > Richard.
> > > > >
> > > > > > gcc/ChangeLog:
> > > > > >
> > > > > >         PR target/97194
> > > > > >         * config/i386/i386-expand.c (ix86_expand_vector_set_var): New function.
> > > > > >         * config/i386/i386-protos.h (ix86_expand_vector_set_var): New Decl.
> > > > > >         * config/i386/predicates.md (vec_setm_operand): New predicate,
> > > > > >         true for const_int_operand or register_operand under TARGET_AVX2.
> > > > > >         * config/i386/sse.md (vec_set<mode>): Support both constant
> > > > > >         and variable index vec_set.
> > > > > >
> > > > > > gcc/testsuite/ChangeLog:
> > > > > >
> > > > > >         * gcc.target/i386/avx2-vec-set-1.c: New test.
> > > > > >         * gcc.target/i386/avx2-vec-set-2.c: New test.
> > > > > >         * gcc.target/i386/avx512bw-vec-set-1.c: New test.
> > > > > >         * gcc.target/i386/avx512bw-vec-set-2.c: New test.
> > > > > >         * gcc.target/i386/avx512f-vec-set-2.c: New test.
> > > > > >         * gcc.target/i386/avx512vl-vec-set-2.c: New test.
> > > > > >
> > > > > > --
> > > > > > BR,
> > > > > > Hongtao
> > > >
> > > >
> > > >
> > > > --
> > > > BR,
> > > > Hongtao
> >
> >
> >
> > --
> > BR,
> > Hongtao



-- 
BR,
Hongtao

[-- Attachment #2: 0001-Support-variable-index-vec_set-V2.patch --]
[-- Type: text/x-patch, Size: 15186 bytes --]

From 3b24e66d861c38292da43a55f5b2a51d0f89d043 Mon Sep 17 00:00:00 2001
From: liuhongt <hongtao.liu@intel.com>
Date: Mon, 19 Oct 2020 16:04:39 +0800
Subject: [PATCH] Support variable index vec_set.

gcc/ChangeLog:

	PR target/97194
	* config/i386/i386-expand.c (ix86_expand_vector_set_var): New function.
	* config/i386/i386-protos.h (ix86_expand_vector_set_var): New Decl.
	* config/i386/predicates.md (vec_setm_operand): New predicate,
	true for const_int_operand or register_operand under TARGET_AVX2.
	* config/i386/sse.md (vec_set<mode>): Support both constant
	and variable index vec_set.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx2-vec-set-1.c: New test.
	* gcc.target/i386/avx2-vec-set-2.c: New test.
	* gcc.target/i386/avx512bw-vec-set-1.c: New test.
	* gcc.target/i386/avx512bw-vec-set-2.c: New test.
	* gcc.target/i386/avx512f-vec-set-2.c: New test.
	* gcc.target/i386/avx512vl-vec-set-2.c: New test.
---
 gcc/config/i386/i386-expand.c                 | 106 ++++++++++++++++++
 gcc/config/i386/i386-protos.h                 |   1 +
 gcc/config/i386/predicates.md                 |   6 +
 gcc/config/i386/sse.md                        |   9 +-
 .../gcc.target/i386/avx2-vec-set-1.c          |  49 ++++++++
 .../gcc.target/i386/avx2-vec-set-2.c          |  50 +++++++++
 .../gcc.target/i386/avx512bw-vec-set-1.c      |  20 ++++
 .../gcc.target/i386/avx512bw-vec-set-2.c      |  44 ++++++++
 .../gcc.target/i386/avx512f-vec-set-2.c       |  42 +++++++
 .../gcc.target/i386/avx512vl-vec-set-2.c      |  55 +++++++++
 10 files changed, 379 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/avx2-vec-set-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx2-vec-set-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512bw-vec-set-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512bw-vec-set-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512f-vec-set-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512vl-vec-set-2.c

diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index 3e8afe683dc..fe449ac855e 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -14235,6 +14235,112 @@ ix86_expand_vector_init (bool mmx_ok, rtx target, rtx vals)
   ix86_expand_vector_init_general (mmx_ok, mode, target, vals);
 }
 
+/* Implemented as
+   V setg (V v, int idx, T val)
+   {
+     V idxv = (V){idx, idx, idx, idx, idx, idx, idx, idx};
+     V valv = (V){val, val, val, val, val, val, val, val};
+     V mask = ((V){0, 1, 2, 3, 4, 5, 6, 7} == idxv);
+     v = (v & ~mask) | (valv & mask);
+     return v;
+   }.  */
+void
+ix86_expand_vector_set_var (rtx target, rtx val, rtx idx)
+{
+  rtx vec[64];
+  machine_mode mode = GET_MODE (target);
+  machine_mode cmp_mode = mode;
+  int n_elts = GET_MODE_NUNITS (mode);
+  rtx valv,idxv,constv,idx_tmp;
+  bool ok = false;
+
+  /* 512-bits vector byte/word broadcast and comparison only available
+     under TARGET_AVX512BW, break 512-bits vector into two 256-bits vector
+     when without TARGET_AVX512BW.  */
+  if ((mode == V32HImode || mode == V64QImode) && !TARGET_AVX512BW)
+    {
+      gcc_assert (TARGET_AVX512F);
+      rtx vhi, vlo, idx_hi;
+      machine_mode half_mode;
+      rtx (*extract_hi)(rtx, rtx);
+      rtx (*extract_lo)(rtx, rtx);
+
+      if (mode == V32HImode)
+	{
+	  half_mode = V16HImode;
+	  extract_hi = gen_vec_extract_hi_v32hi;
+	  extract_lo = gen_vec_extract_lo_v32hi;
+	}
+      else
+	{
+	  half_mode = V32QImode;
+	  extract_hi = gen_vec_extract_hi_v64qi;
+	  extract_lo = gen_vec_extract_lo_v64qi;
+	}
+
+      vhi = gen_reg_rtx (half_mode);
+      vlo = gen_reg_rtx (half_mode);
+      idx_hi = gen_reg_rtx (GET_MODE (idx));
+      emit_insn (extract_hi (vhi, target));
+      emit_insn (extract_lo (vlo, target));
+      vec[0] = idx_hi;
+      vec[1] = idx;
+      vec[2] = GEN_INT (n_elts/2);
+      ix86_expand_binary_operator (MINUS, GET_MODE (idx), vec);
+      ix86_expand_vector_set_var (vhi, val, idx_hi);
+      ix86_expand_vector_set_var (vlo, val, idx);
+      emit_insn (gen_rtx_SET (target, gen_rtx_VEC_CONCAT (mode, vlo, vhi)));
+      return;
+    }
+
+  if (FLOAT_MODE_P (GET_MODE_INNER (mode)))
+    {
+      switch (mode)
+	{
+	case E_V2DFmode:
+	  cmp_mode = V2DImode;
+	  break;
+	case E_V4DFmode:
+	  cmp_mode = V4DImode;
+	  break;
+	case E_V8DFmode:
+	  cmp_mode = V8DImode;
+	  break;
+	case E_V4SFmode:
+	  cmp_mode = V4SImode;
+	  break;
+	case E_V8SFmode:
+	  cmp_mode = V8SImode;
+	  break;
+	case E_V16SFmode:
+	  cmp_mode = V16SImode;
+	  break;
+	default:
+	  gcc_unreachable ();
+	}
+    }
+
+  for (int i = 0; i != n_elts; i++)
+    vec[i] = GEN_INT (i);
+  constv = gen_rtx_CONST_VECTOR (cmp_mode, gen_rtvec_v (n_elts, vec));
+  valv = gen_reg_rtx (mode);
+  idxv = gen_reg_rtx (cmp_mode);
+  idx_tmp = convert_to_mode (GET_MODE_INNER (cmp_mode), idx, 1);
+
+  ok = ix86_expand_vector_init_duplicate (false, mode, valv, val);
+  gcc_assert (ok);
+  ok = ix86_expand_vector_init_duplicate (false, cmp_mode, idxv, idx_tmp);
+  gcc_assert (ok);
+  vec[0] = target;
+  vec[1] = valv;
+  vec[2] = target;
+  vec[3] = gen_rtx_EQ (mode, idxv, constv);
+  vec[4] = idxv;
+  vec[5] = constv;
+  ok = ix86_expand_int_vcond (vec);
+  gcc_assert (ok);
+}
+
 void
 ix86_expand_vector_set (bool mmx_ok, rtx target, rtx val, int elt)
 {
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index c5b700efd0e..7a1dc3d4d64 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -242,6 +242,7 @@ extern rtx ix86_rewrite_tls_address (rtx);
 
 extern void ix86_expand_vector_init (bool, rtx, rtx);
 extern void ix86_expand_vector_set (bool, rtx, rtx, int);
+extern void ix86_expand_vector_set_var (rtx, rtx, rtx);
 extern void ix86_expand_vector_extract (bool, rtx, rtx, int);
 extern void ix86_expand_reduc (rtx (*)(rtx, rtx, rtx), rtx, rtx);
 
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index be57cdaf768..d2a2eece02b 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -1023,6 +1023,12 @@ (define_predicate "incdec_operand"
   return op == const1_rtx || op == constm1_rtx;
 })
 
+;; True for registers, or const_int_operand, used to vec_setm expander.
+(define_predicate "vec_setm_operand"
+  (ior (and (match_operand 0 "register_operand")
+	    (match_test "TARGET_AVX2"))
+       (match_code "const_int")))
+
 ;; True for registers, or 1 or -1.  Used to optimize double-word shifts.
 (define_predicate "reg_or_pm1_operand"
   (ior (match_operand 0 "register_operand")
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 52635f6bc08..35a64b7b333 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -8085,11 +8085,14 @@ (define_insn "vec_setv2df_0"
 (define_expand "vec_set<mode>"
   [(match_operand:V 0 "register_operand")
    (match_operand:<ssescalarmode> 1 "register_operand")
-   (match_operand 2 "const_int_operand")]
+   (match_operand 2 "vec_setm_operand")]
   "TARGET_SSE"
 {
-  ix86_expand_vector_set (false, operands[0], operands[1],
-			  INTVAL (operands[2]));
+  if (CONST_INT_P (operands[2]))
+    ix86_expand_vector_set (false, operands[0], operands[1],
+			    INTVAL (operands[2]));
+  else
+    ix86_expand_vector_set_var (operands[0], operands[1], operands[2]);
   DONE;
 })
 
diff --git a/gcc/testsuite/gcc.target/i386/avx2-vec-set-1.c b/gcc/testsuite/gcc.target/i386/avx2-vec-set-1.c
new file mode 100644
index 00000000000..4c16ec5dfc4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx2-vec-set-1.c
@@ -0,0 +1,49 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx2 -O2 -mno-avx512f" } */
+/* { dg-final { scan-assembler-times {(?n)vpcmpeq[bwdq]} 12 } } */
+/* { dg-final { scan-assembler-times {(?n)vp?blendv} 12 } } */
+
+typedef char v32qi __attribute__ ((vector_size (32)));
+typedef char v16qi __attribute__ ((vector_size (16)));
+
+typedef short v16hi __attribute__ ((vector_size (32)));
+typedef short v8hi __attribute__ ((vector_size (16)));
+
+typedef int v8si __attribute__ ((vector_size (32)));
+typedef int v4si __attribute__ ((vector_size (16)));
+
+typedef long long v4di __attribute__ ((vector_size (32)));
+typedef long long v2di __attribute__ ((vector_size (16)));
+
+typedef float v8sf __attribute__ ((vector_size (32)));
+typedef float v4sf __attribute__ ((vector_size (16)));
+
+typedef double v4df __attribute__ ((vector_size (32)));
+typedef double v2df __attribute__ ((vector_size (16)));
+
+#define FOO(VTYPE, TYPE)			\
+  VTYPE						\
+  __attribute__ ((noipa))			\
+  foo_##VTYPE (VTYPE a, TYPE b, unsigned int c)	\
+  {						\
+    a[c] = b;					\
+    return a;					\
+  }						\
+
+FOO (v16qi, char);
+FOO (v32qi, char);
+
+FOO (v8hi, short);
+FOO (v16hi, short);
+
+FOO (v4si, int);
+FOO (v8si, int);
+
+FOO (v2di, long long);
+FOO (v4di, long long);
+
+FOO (v4sf, float);
+FOO (v8sf, float);
+
+FOO (v2df, double);
+FOO (v4df, double);
diff --git a/gcc/testsuite/gcc.target/i386/avx2-vec-set-2.c b/gcc/testsuite/gcc.target/i386/avx2-vec-set-2.c
new file mode 100644
index 00000000000..9086ef406f1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx2-vec-set-2.c
@@ -0,0 +1,50 @@
+/* { dg-do run } */
+/* { dg-require-effective-target avx2 } */
+/* { dg-options "-O2 -mavx2" } */
+
+
+#ifndef CHECK
+#define CHECK "avx2-check.h"
+#endif
+
+#ifndef TEST
+#define TEST avx2_test
+#endif
+
+#include CHECK
+
+#include "avx2-vec-set-1.c"
+
+#define CALC_TEST(vtype, type, N, idx)				\
+do								\
+  {								\
+    int i,val = idx * idx - idx * 3 + 16;			\
+    type res[N],exp[N];						\
+    vtype resv;							\
+    for (i = 0; i < N; i++)					\
+      {								\
+	res[i] = i * i - i * 3 + 15;				\
+	exp[i] = res[i];					\
+      }								\
+    exp[idx] = val;						\
+    resv = foo_##vtype (*(vtype *)&res[0], val, idx);		\
+    for (i = 0; i < N; i++)					\
+      {								\
+	if (resv[i] != exp[i])					\
+	  abort ();						\
+      }								\
+  }								\
+while (0)
+
+static void
+TEST (void)
+{
+  CALC_TEST (v32qi, char, 32, 17);
+  CALC_TEST (v16qi, char, 16, 5);
+  CALC_TEST (v16hi, short, 16, 9);
+  CALC_TEST (v8hi, short, 8, 6);
+  CALC_TEST (v8si, int, 8, 3);
+  CALC_TEST (v4si, int, 4, 2);
+  CALC_TEST (v4di, long long, 4, 1);
+  CALC_TEST (v2di, long long, 2, 0);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512bw-vec-set-1.c b/gcc/testsuite/gcc.target/i386/avx512bw-vec-set-1.c
new file mode 100644
index 00000000000..5cfbc85732e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512bw-vec-set-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512vl -mavx512bw -O2" } */
+/* { dg-final { scan-assembler-times {(?n)(?:vp?broadcast|vmovddup)} 36 } } */
+/* { dg-final { scan-assembler-times {(?n)vpcmp[bwdq][ \t]+\$0} 18 } } */
+
+typedef char v64qi __attribute__ ((vector_size (64)));
+typedef short v32hi __attribute__ ((vector_size (64)));
+typedef int v16si __attribute__ ((vector_size (64)));
+typedef long long v8di __attribute__ ((vector_size (64)));
+typedef float v16sf __attribute__ ((vector_size (64)));
+typedef double v8df __attribute__ ((vector_size (64)));
+
+#include "avx2-vec-set-1.c"
+
+FOO (v64qi, char);
+FOO (v32hi, short);
+FOO (v16si, int);
+FOO (v8di, long long);
+FOO (v16sf, float);
+FOO (v8df, double);
diff --git a/gcc/testsuite/gcc.target/i386/avx512bw-vec-set-2.c b/gcc/testsuite/gcc.target/i386/avx512bw-vec-set-2.c
new file mode 100644
index 00000000000..22e64183ebd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512bw-vec-set-2.c
@@ -0,0 +1,44 @@
+/* { dg-do run } */
+/* { dg-require-effective-target avx512bw } */
+/* { dg-options "-O2 -mavx512bw" } */
+
+
+#ifndef CHECK
+#define CHECK "avx512f-check.h"
+#endif
+
+#define AVX512BW
+
+#include CHECK
+
+#include "avx512bw-vec-set-1.c"
+
+#define CALC_TEST(vtype, type, N, idx)				\
+do								\
+  {								\
+    int i,val = idx * idx - idx * 3 + 16;			\
+    type res[N],exp[N];						\
+    vtype resv;							\
+    for (i = 0; i < N; i++)					\
+      {								\
+	res[i] = i * i - i * 3 + 15;				\
+	exp[i] = res[i];					\
+      }								\
+    exp[idx] = val;						\
+    resv = foo_##vtype (*(vtype *)&res[0], val, idx);		\
+    for (i = 0; i < N; i++)					\
+      {								\
+	if (resv[i] != exp[i])					\
+	  abort ();						\
+      }								\
+  }								\
+while (0)
+
+static void
+test_512 (void)
+{
+  CALC_TEST (v64qi, char, 64, 50);
+  CALC_TEST (v32hi, short, 32, 30);
+  CALC_TEST (v16si, int, 16, 15);
+  CALC_TEST (v8di, long long, 8, 7);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vec-set-2.c b/gcc/testsuite/gcc.target/i386/avx512f-vec-set-2.c
new file mode 100644
index 00000000000..8f2aa03ec11
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512f-vec-set-2.c
@@ -0,0 +1,42 @@
+/* { dg-do run } */
+/* { dg-require-effective-target avx512f } */
+/* { dg-options "-O2 -mavx512f -mno-avx512bw" } */
+
+
+#ifndef CHECK
+#define CHECK "avx512f-check.h"
+#endif
+
+#define AVX512F
+
+#include CHECK
+
+#include "avx512bw-vec-set-1.c"
+
+#define CALC_TEST(vtype, type, N, idx)				\
+do								\
+  {								\
+    int i,val = idx * idx - idx * 3 + 16;			\
+    type res[N],exp[N];						\
+    vtype resv;							\
+    for (i = 0; i < N; i++)					\
+      {								\
+	res[i] = i * i - i * 3 + 15;				\
+	exp[i] = res[i];					\
+      }								\
+    exp[idx] = val;						\
+    resv = foo_##vtype (*(vtype *)&res[0], val, idx);		\
+    for (i = 0; i < N; i++)					\
+      {								\
+	if (resv[i] != exp[i])					\
+	  abort ();						\
+      }								\
+  }								\
+while (0)
+
+static void
+test_512 (void)
+{
+  CALC_TEST (v64qi, char, 64, 50);
+  CALC_TEST (v32hi, short, 32, 30);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-vec-set-2.c b/gcc/testsuite/gcc.target/i386/avx512vl-vec-set-2.c
new file mode 100644
index 00000000000..4f327427a64
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512vl-vec-set-2.c
@@ -0,0 +1,55 @@
+/* { dg-do run } */
+/* { dg-require-effective-target avx512bw } */
+/* { dg-require-effective-target avx512vl } */
+/* { dg-options "-O2 -mavx512bw -mavx512vl" } */
+
+
+#ifndef CHECK
+#define CHECK "avx512f-check.h"
+#endif
+
+#define AVX512VL
+#define AVX512BW
+
+#include CHECK
+
+#include "avx512bw-vec-set-1.c"
+
+#define CALC_TEST(vtype, type, N, idx)				\
+do								\
+  {								\
+    int i,val = idx * idx - idx * 3 + 16;			\
+    type res[N],exp[N];						\
+    vtype resv;							\
+    for (i = 0; i < N; i++)					\
+      {								\
+	res[i] = i * i - i * 3 + 15;				\
+	exp[i] = res[i];					\
+      }								\
+    exp[idx] = val;						\
+    resv = foo_##vtype (*(vtype *)&res[0], val, idx);		\
+    for (i = 0; i < N; i++)					\
+      {								\
+	if (resv[i] != exp[i])					\
+	  abort ();						\
+      }								\
+  }								\
+while (0)
+
+static void
+test_256 (void)
+{
+  CALC_TEST (v32qi, char, 32, 17);
+  CALC_TEST (v16hi, short, 16, 9);
+  CALC_TEST (v8si, int, 8, 3);
+  CALC_TEST (v4di, long long, 4, 1);
+}
+
+static void
+test_128 (void)
+{
+  CALC_TEST (v16qi, char, 16, 5);
+  CALC_TEST (v8hi, short, 8, 6);
+  CALC_TEST (v4si, int, 4, 2);
+  CALC_TEST (v2di, long long, 2, 0);
+}
-- 
2.18.1

next prev parent reply	other threads:[~2020-10-27  7:50 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-19  8:23 Hongtao Liu
2020-10-19  9:07 ` Richard Biener
2020-10-19  9:39   ` Hongtao Liu
2020-10-19  9:55     ` Richard Biener
2020-10-20  2:37       ` Hongtao Liu
2020-10-20  7:36         ` Richard Biener
2020-10-27  7:51           ` Hongtao Liu [this message]
2020-11-11  8:03             ` Hongtao Liu
2020-11-25 19:15               ` Jeff Law
2020-11-26  1:45                 ` Hongtao Liu
2020-11-11  8:45 Uros Bizjak
2020-11-12  2:06 ` Hongtao Liu
2020-11-12  8:20   ` Uros Bizjak
2020-11-12  9:12     ` Hongtao Liu
2020-11-12  9:15       ` Hongtao Liu
2020-11-12  9:25         ` Hongtao Liu
2020-11-12 13:59           ` Richard Biener
2020-11-12 17:51             ` Uros Bizjak
2020-11-12 18:26               ` Uros Bizjak
2020-11-12 18:34                 ` Uros Bizjak
2020-11-16 11:57             ` Uros Bizjak
2020-11-19 10:54               ` Richard Sandiford
2020-11-16 10:16       ` Uros Bizjak

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAMZc-bzXUGMBbRReERhvyJj_nESSFopxrcwC=gwQgMtpKaMPnA@mail.gmail.com' \
    --to=crazylht@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=kirill.yukhin@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).