From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by sourceware.org (Postfix) with ESMTPS id 853C23858034 for ; Thu, 1 Jul 2021 06:17:32 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 853C23858034 X-IronPort-AV: E=McAfee;i="6200,9189,10031"; a="272334072" X-IronPort-AV: E=Sophos;i="5.83,313,1616482800"; d="scan'208";a="272334072" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Jun 2021 23:17:31 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.83,313,1616482800"; d="scan'208";a="409038750" Received: from scymds01.sc.intel.com ([10.148.94.138]) by orsmga003.jf.intel.com with ESMTP; 30 Jun 2021 23:17:31 -0700 Received: from shliclel320.sh.intel.com (shliclel320.sh.intel.com [10.239.236.50]) by scymds01.sc.intel.com with ESMTP id 1616Gmf4031625; Wed, 30 Jun 2021 23:17:30 -0700 From: liuhongt To: gcc-patches@gcc.gnu.org Cc: crazylht@gmail.com, hjl.tools@gmail.com, ubizjak@gmail.com, jakub@redhat.com Subject: [PATCH 25/62] AVX512FP16: Add testcase for vmovsh/vmovw. Date: Thu, 1 Jul 2021 14:16:11 +0800 Message-Id: <20210701061648.9447-26-hongtao.liu@intel.com> X-Mailer: git-send-email 2.18.1 In-Reply-To: <20210701061648.9447-1-hongtao.liu@intel.com> References: <20210701061648.9447-1-hongtao.liu@intel.com> X-Spam-Status: No, score=-12.2 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_SHORT, SCC_5_SHORT_WORD_LINES, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 01 Jul 2021 06:17:34 -0000 gcc/testsuite/ChangeLog: * gcc.target/i386/avx512fp16-vmovsh-1a.c: New test. * gcc.target/i386/avx512fp16-vmovsh-1b.c: Ditto. * gcc.target/i386/avx512fp16-vmovw-1a.c: Ditto. * gcc.target/i386/avx512fp16-vmovw-1b.c: Ditto. * gcc.target/i386/avx512fp16-vmovw-2a.c: Ditto. * gcc.target/i386/avx512fp16-vmovw-2b.c: Ditto. * gcc.target/i386/avx512fp16-vmovw-3a.c: Ditto. * gcc.target/i386/avx512fp16-vmovw-3b.c: Ditto. * gcc.target/i386/avx512fp16-vmovw-4a.c: Ditto. * gcc.target/i386/avx512fp16-vmovw-4b.c: Ditto. --- .../gcc.target/i386/avx512fp16-vmovsh-1a.c | 26 ++++ .../gcc.target/i386/avx512fp16-vmovsh-1b.c | 115 ++++++++++++++++++ .../gcc.target/i386/avx512fp16-vmovw-1a.c | 15 +++ .../gcc.target/i386/avx512fp16-vmovw-1b.c | 27 ++++ .../gcc.target/i386/avx512fp16-vmovw-2a.c | 21 ++++ .../gcc.target/i386/avx512fp16-vmovw-2b.c | 53 ++++++++ .../gcc.target/i386/avx512fp16-vmovw-3a.c | 23 ++++ .../gcc.target/i386/avx512fp16-vmovw-3b.c | 52 ++++++++ .../gcc.target/i386/avx512fp16-vmovw-4a.c | 27 ++++ .../gcc.target/i386/avx512fp16-vmovw-4b.c | 52 ++++++++ 10 files changed, 411 insertions(+) create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-vmovsh-1a.c create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-vmovsh-1b.c create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-1a.c create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-1b.c create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-2a.c create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-2b.c create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-3a.c create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-3b.c create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-4a.c create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-4b.c diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16-vmovsh-1a.c b/gcc/testsuite/gcc.target/i386/avx512fp16-vmovsh-1a.c new file mode 100644 index 00000000000..e35be10fcd0 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx512fp16-vmovsh-1a.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx512fp16 -O2" } */ +/* { dg-final { scan-assembler-times "vmovsh\[ \\t\]+%xmm\[0-9\]+\[^\n\r\]*%\[er\]ax+\[^\n\r]*\{%k\[0-9\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vmovsh\[ \\t\]+\[^\n\r\]*%\[er\]ax+\[^\n\r]*%xmm\[0-9\]+\{%k\[0-9\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vmovsh\[ \\t\]+\[^\n\r\]*%\[er\]ax+\[^\n\r]*%xmm\[0-9\]+\{%k\[0-9\]\}\{z\}\[^\n\r]*(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vmovsh\[ \\t\]+%xmm\[0-9\]+\[^\n\r]*%xmm\[0-9\]+\[^\n\r]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vmovsh\[ \\t\]+%xmm\[0-9\]+\[^\n\r]*%xmm\[0-9\]+\[^\n\r]*%xmm\[0-9\]+\{%k\[0-9\]\}\[^z\n\r]*(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vmovsh\[ \\t\]+%xmm\[0-9\]+\[^\n\r]*%xmm\[0-9\]+\[^\n\r]*%xmm\[0-9\]+\{%k\[0-9\]\}\{z\}\[^\n\r]*(?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +extern _Float16 const* p; +volatile __m128h x1, x2, res; +volatile __mmask8 m8; + +void +avx512f_test (void) +{ + x2 = _mm_mask_load_sh (x1, m8, p); + x2 = _mm_maskz_load_sh (m8, p); + _mm_mask_store_sh (p, m8, x1); + + res = _mm_move_sh (x1, x2); + res = _mm_mask_move_sh (res, m8, x1, x2); + res = _mm_maskz_move_sh (m8, x1, x2); +} diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16-vmovsh-1b.c b/gcc/testsuite/gcc.target/i386/avx512fp16-vmovsh-1b.c new file mode 100644 index 00000000000..cea224a62e6 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx512fp16-vmovsh-1b.c @@ -0,0 +1,115 @@ +/* { dg-do run { target avx512fp16 } } */ +/* { dg-options "-O2 -mavx512fp16 -mavx512dq" } */ + +#define AVX512FP16 +#include "avx512fp16-helper.h" + +void NOINLINE +emulate_mov2_load_sh(V512 * dest, V512 op1, + __mmask8 k, int zero_mask) +{ + V512 v1, v2, v3, v4, v5, v6, v7, v8; + int i; + + unpack_ph_2twops(op1, &v1, &v2); + unpack_ph_2twops(*dest, &v7, &v8); + + if ((k&1) || !k) + v5.f32[0] = v1.f32[0]; + else if (zero_mask) + v5.f32[0] = 0; + else + v5.f32[0] = v7.f32[0]; //remains unchanged + + for (i = 1; i < 8; i++) + v5.f32[i] = 0; + + *dest = pack_twops_2ph(v5, v6); +} + +void NOINLINE +emulate_mov3_load_sh(V512 * dest, V512 op1, V512 op2, + __mmask8 k, int zero_mask) +{ + V512 v1, v2, v3, v4, v5, v6, v7, v8; + int i; + + unpack_ph_2twops(op1, &v1, &v2); + unpack_ph_2twops(op2, &v3, &v4); + unpack_ph_2twops(*dest, &v7, &v8); + + if ((k&1) || !k) + v5.f32[0] = v3.f32[0]; + else if (zero_mask) + v5.f32[0] = 0; + else + v5.f32[0] = v7.f32[0]; //remains unchanged + + for (i = 1; i < 8; i++) + v5.f32[i] = v1.f32[i]; + + *dest = pack_twops_2ph(v5, v6); +} + +void NOINLINE +emulate_mov2_store_sh(V512 * dest, V512 op1, __mmask8 k) +{ + V512 v1, v2, v3, v4, v5, v6, v7, v8; + int i; + + unpack_ph_2twops(op1, &v1, &v2); + unpack_ph_2twops(*dest, &v7, &v8); + + if ((k&1) || !k) + v5.f32[0] = v1.f32[0]; + else + v5.f32[0] = v7.f32[0]; //remains unchanged + + *dest = pack_twops_2ph(v5, v6); +} + +void +test_512 (void) +{ + V512 res; + V512 exp; + + init_src(); + + // no mask + emulate_mov2_load_sh (&exp, src1, 0x0, 0); + res.xmmh[0] = _mm_load_sh((const void *)&(src1.u16[0])); + check_results(&res, &exp, 8, "_mm_load_sh"); + + // with mask and mask bit is set + emulate_mov2_load_sh (&exp, src1, 0x1, 0); + res.xmmh[0] = _mm_mask_load_sh(res.xmmh[0], 0x1, (const void *)&(src1.u16[0])); + check_results(&res, &exp, 8, "_mm__mask_load_sh"); + + // with zero-mask + emulate_mov2_load_sh (&exp, src1, 0x0, 1); + res.xmmh[0] = _mm_maskz_load_sh(0x1, (const void *)&(src1.u16[0])); + check_results(&res, &exp, 8, "_mm_maskz_load_sh"); + + emulate_mov3_load_sh (&exp, src1, src2, 0x1, 0); + res.xmmh[0] = _mm_mask_move_sh(res.xmmh[0], 0x1, src1.xmmh[0], src2.xmmh[0]); + check_results(&res, &exp, 8, "_mm_mask_move_sh"); + + emulate_mov3_load_sh (&exp, src1, src2, 0x1, 1); + res.xmmh[0] = _mm_maskz_move_sh(0x1, src1.xmmh[0], src2.xmmh[0]); + check_results(&res, &exp, 8, "_mm_maskz_move_sh"); + + // no mask + emulate_mov2_store_sh (&exp, src1, 0x0); + _mm_store_sh((void *)&(res.u16[0]), src1.xmmh[0]); + check_results(&exp, &res, 1, "_mm_store_sh"); + + // with mask + emulate_mov2_store_sh (&exp, src1, 0x1); + _mm_mask_store_sh((void *)&(res.u16[0]), 0x1, src1.xmmh[0]); + check_results(&exp, &res, 1, "_mm_mask_store_sh"); + + if (n_errs != 0) { + abort (); + } +} diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-1a.c b/gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-1a.c new file mode 100644 index 00000000000..177802c6dcb --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-1a.c @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx512fp16 -O2" } */ +/* { dg-final { scan-assembler-times "vmovw\[^-]" 1 } } */ +/* { dg-final { scan-assembler-times "vpextrw" 1 } } */ +#include + +volatile __m128i x1; +volatile short x2; + +void extern +avx512f_test (void) +{ + x1 = _mm_cvtsi16_si128 (x2); + x2 = _mm_cvtsi128_si16 (x1); +} diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-1b.c b/gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-1b.c new file mode 100644 index 00000000000..a96007d6fd8 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-1b.c @@ -0,0 +1,27 @@ +/* { dg-do run {target avx512fp16} } */ +/* { dg-options "-O2 -mavx512fp16" } */ + +static void do_test (void); + +#define DO_TEST do_test +#define AVX512FP16 +#include "avx512-check.h" + +static void +do_test (void) +{ + union128i_w u; + short b = 128; + short e[8] = {0,0,0,0,0,0,0,0}; + + u.x = _mm_cvtsi16_si128 (b); + + e[0] = b; + + if (check_union128i_w (u, e)) + abort (); + u.a[0] = 123; + b = _mm_cvtsi128_si16 (u.x); + if (u.a[0] != b) + abort(); +} diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-2a.c b/gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-2a.c new file mode 100644 index 00000000000..efa24e5523c --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-2a.c @@ -0,0 +1,21 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mavx512fp16" } */ + +typedef short __v8hi __attribute__ ((__vector_size__ (16))); +typedef long long __m128i __attribute__ ((__vector_size__ (16), __may_alias__)); + +__m128i +__attribute__ ((noinline, noclone)) +foo1 (short x) +{ + return __extension__ (__m128i)(__v8hi) { x, 0, 0, 0, 0, 0, 0, 0 }; +} + +__m128i +__attribute__ ((noinline, noclone)) +foo2 (short *x) +{ + return __extension__ (__m128i)(__v8hi) { *x, 0, 0, 0, 0, 0, 0, 0 }; +} + +/* { dg-final { scan-assembler-times "vmovw\[^-\n\r]*xmm0" 2 } } */ diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-2b.c b/gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-2b.c new file mode 100644 index 00000000000..b680a16945f --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-2b.c @@ -0,0 +1,53 @@ +/* { dg-do run { target avx512fp16 } } */ +/* { dg-options "-O2 -mavx512fp16" } */ + +#include + +static void do_test (void); + +#define DO_TEST do_test +#define AVX512FP16 +#include "avx512-check.h" +#include "avx512fp16-vmovw-2a.c" + +__m128i +__attribute__ ((noinline,noclone)) +foo3 (__m128i x) +{ + return foo1 (((__v8hi) x)[0]); +} + +static void +do_test (void) +{ + short x; + union128i_w u = { -1, -1,}; + union128i_w exp = { 0, 0}; + __m128i v; + union128i_w a; + + x = 25; + exp.a[0] = x; + memset (&v, -1, sizeof (v)); + v = foo1 (x); + a.x = v; + if (check_union128i_w (a, exp.a)) + abort (); + + x = 33; + exp.a[0] = x; + memset (&v, -1, sizeof (v)); + v = foo2 (&x); + a.x = v; + if (check_union128i_w (a, exp.a)) + abort (); + + x = -33; + u.a[0] = x; + exp.a[0] = x; + memset (&v, -1, sizeof (v)); + v = foo3 (u.x); + a.x = v; + if (check_union128i_w (a, exp.a)) + abort (); +} diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-3a.c b/gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-3a.c new file mode 100644 index 00000000000..c60310710a4 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-3a.c @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mavx512fp16" } */ + +typedef short __v16hi __attribute__ ((__vector_size__ (32))); +typedef long long __m256i __attribute__ ((__vector_size__ (32), __may_alias__)); + +__m256i +__attribute__ ((noinline, noclone)) +foo1 (short x) +{ + return __extension__ (__m256i)(__v16hi) { x, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0 }; +} + +__m256i +__attribute__ ((noinline, noclone)) +foo2 (short *x) +{ + return __extension__ (__m256i)(__v16hi) { *x, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0 }; +} + +/* { dg-final { scan-assembler-times "vmovw\[^-\n\r]*xmm0" 2 } } */ diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-3b.c b/gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-3b.c new file mode 100644 index 00000000000..13c1f6518f2 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-3b.c @@ -0,0 +1,52 @@ +/* { dg-do run { target avx512fp16 } } */ +/* { dg-options "-O2 -mavx512fp16" } */ + +#include + +static void do_test (void); + +#define DO_TEST do_test +#define AVX512FP16 +#include "avx512-check.h" +#include "avx512fp16-vmovw-3a.c" + +__m256i +__attribute__ ((noinline,noclone)) +foo3 (__m256i x) +{ + return foo1 (((__v16hi) x)[0]); +} + +static void +do_test (void) +{ + short x; + union256i_w u = { -1, -1, -1, -1 }; + union256i_w exp = { 0, 0, 0, 0 }; + + __m256i v; + union256i_w a; + exp.a[0] = x; + memset (&v, -1, sizeof (v)); + v = foo1 (x); + a.x = v; + if (check_union256i_w (a, exp.a)) + abort (); + + x = 33; + exp.a[0] = x; + memset (&v, -1, sizeof (v)); + v = foo2 (&x); + a.x = v; + if (check_union256i_w (a, exp.a)) + abort (); + + x = -23; + u.a[0] = x; + exp.a[0] = x; + memset (&v, -1, sizeof (v)); + v = foo3 (u.x); + a.x = v; + if (check_union256i_w (a, exp.a)) + abort (); +} diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-4a.c b/gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-4a.c new file mode 100644 index 00000000000..2ba198dd7fc --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-4a.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mavx512fp16" } */ + +typedef short __v32hi __attribute__ ((__vector_size__ (64))); +typedef long long __m512i __attribute__ ((__vector_size__ (64), __may_alias__)); + +__m512i +__attribute__ ((noinline, noclone)) +foo1 (short x) +{ + return __extension__ (__m512i)(__v32hi) { x, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0 }; +} + +__m512i +__attribute__ ((noinline, noclone)) +foo2 (short *x) +{ + return __extension__ (__m512i)(__v32hi) { *x, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0 }; +} + +/* { dg-final { scan-assembler-times "vmovw\[^-\n\r]*xmm0" 2 } } */ diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-4b.c b/gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-4b.c new file mode 100644 index 00000000000..ec6477b793f --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx512fp16-vmovw-4b.c @@ -0,0 +1,52 @@ +/* { dg-do run { target avx512fp16 } } */ +/* { dg-options "-O2 -mavx512fp16" } */ + +#include + +static void do_test (void); + +#define DO_TEST do_test +#define AVX512FP16 +#include "avx512-check.h" +#include "avx512fp16-vmovw-4a.c" + +__m512i +__attribute__ ((noinline,noclone)) +foo3 (__m512i x) +{ + return foo1 (((__v32hi) x)[0]); +} + +static void +do_test (void) +{ + short x = 25; + union512i_w u = { -1, -1, -1, -1, -1, -1, -1, -1 }; + union512i_w exp = { 0, 0, 0, 0, 0, 0, 0, 0 }; + + __m512i v; + union512i_w a; + exp.a[0] = x; + memset (&v, -1, sizeof (v)); + v = foo1 (x); + a.x = v; + if (check_union512i_w (a, exp.a)) + abort (); + + x = 55; + exp.a[0] = x; + memset (&v, -1, sizeof (v)); + v = foo2 (&x); + a.x = v; + if (check_union512i_w (a, exp.a)) + abort (); + + x = 33; + u.a[0] = x; + exp.a[0] = x; + memset (&v, -1, sizeof (v)); + v = foo3 (u.x); + a.x = v; + if (check_union512i_w (a, exp.a)) + abort (); +} -- 2.18.1