public inbox for gcc-patches@gcc.gnu.org
* [patch][i386, AVX] Adding missing mask[z]_sqrt_round_s[d,s] intrinsics
From: Makhotina, Olga @ 2017-11-21 13:48 UTC
  To: 'gcc-patches@gcc.gnu.org'
  Cc: 'Kirill Yukhin', Makhotina, Olga, Peryt, Sebastian

[-- Attachment #1: Type: text/plain, Size: 2719 bytes --]

Hi,

This patch adds missing intrinsics for _mm_mask[z]_sqrt_round_[sd,ss].

21.11.2017 Olga Makhotina  <olga.makhotina@intel.com>

gcc/
              * config/i386/avx512fintrin.h (_mm_mask_sqrt_round_sd,
              _mm_maskz_sqrt_round_sd, _mm_mask_sqrt_round_ss,
              _mm_maskz_sqrt_round_ss): New intrinsics.
              (__builtin_ia32_sqrtsd_round, __builtin_ia32_sqrtss_round): Remove.
              (__builtin_ia32_sqrtsd_mask_round,
              __builtin_ia32_sqrtss_mask_round): New builtins.
              * config/i386/i386-builtin.def (__builtin_ia32_sqrtsd_round,
              __builtin_ia32_sqrtss_round): Remove.
              (__builtin_ia32_sqrtsd_mask_round,
              __builtin_ia32_sqrtss_mask_round): New builtins.
              * config/i386/sse.md (vmsqrt<mode>2<round_name>): Renamed to ...
              (vmsqrt<mode>2<mask_scalar_name><round_scalar_name>): ... this.
              ((match_operand:VF_128 1 "vector_operand" 
              "xBm,<round_constraint>")): Changed to ...
              ((match_operand:VF_128 1 "vector_operand" 
              "xBm,<round_scalar_constraint>")): ... this.
              (vsqrt<ssescalarmodesuffix>\t{<round_op3>%1, %2, %0|
              %0, %2, %<iptr>1<round_op3>}): Changed to ...
              (vsqrt<ssescalarmodesuffix>\t{<round_scalar_mask_op3>%1, %2, 
              %0<mask_scalar_operand3>|%0<mask_scalar_operand3>, %2, 
              %<iptr>1<round_scalar_mask_op3>}): ... this.
              ((set_attr "prefix" "<round_prefix>")): Changed to ...
              ((set_attr "prefix" "<round_scalar_prefix>")): ... this.

21.11.2017 Olga Makhotina  <olga.makhotina@intel.com>

gcc/testsuite/
              * gcc.target/i386/avx512f-vsqrtsd-1.c (_mm_mask_sqrt_round_sd,
              _mm_maskz_sqrt_round_sd): Test new intrinsics.
              * gcc.target/i386/avx512f-vsqrtsd-2.c (_mm_sqrt_round_sd,
              _mm_mask_sqrt_round_sd, _mm_maskz_sqrt_round_sd): Test new intrinsics.
              * gcc.target/i386/avx512f-vsqrtss-1.c (_mm_mask_sqrt_round_ss,
              _mm_maskz_sqrt_round_ss): Test new intrinsics.
              * gcc.target/i386/avx512f-vsqrtss-2.c (_mm_sqrt_round_ss,
              _mm_mask_sqrt_round_ss,      _mm_maskz_sqrt_round_ss): Test new intrinsics.
              * gcc.target/i386/avx-1.c (__builtin_ia32_sqrtsd_round,
              __builtin_ia32_sqrtss_round): Remove builtins.
              (__builtin_ia32_sqrtsd_mask_round,
              __builtin_ia32_sqrtss_mask_round): Test new builtins.
              * gcc.target/i386/sse-13.c: Ditto.
              * gcc.target/i386/sse-23.c: Ditto.

Is it ok for trunk?

Thanks,
Olga


[-- Attachment #2: 0001-sqrt.patch --]
[-- Type: application/octet-stream, Size: 17307 bytes --]

From 77d2762f269acd976e9e96a19d78b9083b6ddfa2 Mon Sep 17 00:00:00 2001
From: Olga Makhotina <olga.makhotina@intel.com>
Date: Mon, 20 Nov 2017 17:08:02 +0300
Subject: [PATCH] sqrt

---
 gcc/config/i386/avx512fintrin.h                   | 84 ++++++++++++++++++++---
 gcc/config/i386/i386-builtin.def                  |  4 +-
 gcc/config/i386/sse.md                            |  8 +--
 gcc/testsuite/gcc.target/i386/avx-1.c             |  4 +-
 gcc/testsuite/gcc.target/i386/avx512f-vsqrtsd-1.c |  5 ++
 gcc/testsuite/gcc.target/i386/avx512f-vsqrtsd-2.c | 62 +++++++++++++++++
 gcc/testsuite/gcc.target/i386/avx512f-vsqrtss-1.c |  6 ++
 gcc/testsuite/gcc.target/i386/avx512f-vsqrtss-2.c | 63 +++++++++++++++++
 gcc/testsuite/gcc.target/i386/sse-13.c            |  4 +-
 gcc/testsuite/gcc.target/i386/sse-23.c            |  4 +-
 10 files changed, 222 insertions(+), 22 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512f-vsqrtsd-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512f-vsqrtss-2.c

diff --git a/gcc/config/i386/avx512fintrin.h b/gcc/config/i386/avx512fintrin.h
index 5dc5fae..1bd35bf 100644
--- a/gcc/config/i386/avx512fintrin.h
+++ b/gcc/config/i386/avx512fintrin.h
@@ -1955,18 +1955,66 @@ extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_sqrt_round_sd (__m128d __A, __m128d __B, const int __R)
 {
-  return (__m128d) __builtin_ia32_sqrtsd_round ((__v2df) __B,
-						(__v2df) __A,
-						__R);
+  return (__m128d) __builtin_ia32_sqrtsd_mask_round ((__v2df) __B,
+						     (__v2df) __A,
+						     (__v2df)
+						     _mm_setzero_pd (),
+						     (__mmask8) -1, __R);
+}
+
+extern __inline __m128d
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_sqrt_round_sd (__m128d __W, __mmask8 __U, __m128d __A, __m128d __B,
+			const int __R)
+{
+  return (__m128d) __builtin_ia32_sqrtsd_mask_round ((__v2df) __B,
+						     (__v2df) __A,
+						     (__v2df) __W,
+						     (__mmask8) __U, __R);
+}
+
+extern __inline __m128d
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskz_sqrt_round_sd (__mmask8 __U, __m128d __A, __m128d __B, const int __R)
+{
+  return (__m128d) __builtin_ia32_sqrtsd_mask_round ((__v2df) __B,
+						     (__v2df) __A,
+						     (__v2df)
+						     _mm_setzero_pd (),
+						     (__mmask8) __U, __R);
 }
 
 extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_sqrt_round_ss (__m128 __A, __m128 __B, const int __R)
 {
-  return (__m128) __builtin_ia32_sqrtss_round ((__v4sf) __B,
-					       (__v4sf) __A,
-					       __R);
+  return (__m128) __builtin_ia32_sqrtss_mask_round ((__v4sf) __B,
+						    (__v4sf) __A,
+						    (__v4sf)
+						    _mm_setzero_ps (),
+						    (__mmask8) -1, __R);
+}
+
+extern __inline __m128
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_sqrt_round_ss (__m128 __W, __mmask8 __U, __m128 __A, __m128 __B,
+			const int __R)
+{
+  return (__m128) __builtin_ia32_sqrtss_mask_round ((__v4sf) __B,
+						    (__v4sf) __A,
+						    (__v4sf) __W,
+						    (__mmask8) __U, __R);
+}
+
+extern __inline __m128
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskz_sqrt_round_ss (__mmask8 __U, __m128 __A, __m128 __B, const int __R)
+{
+  return (__m128) __builtin_ia32_sqrtss_mask_round ((__v4sf) __B,
+						    (__v4sf) __A,
+						    (__v4sf)
+						    _mm_setzero_ps (),
+						    (__mmask8) __U, __R);
 }
 #else
 #define _mm512_sqrt_round_pd(A, C)            \
@@ -1987,11 +2035,27 @@ _mm_sqrt_round_ss (__m128 __A, __m128 __B, const int __R)
 #define _mm512_maskz_sqrt_round_ps(U, A, C)   \
     (__m512)__builtin_ia32_sqrtps512_mask(A, (__v16sf)_mm512_setzero_ps(), U, C)
 
-#define _mm_sqrt_round_sd(A, B, C)            \
-    (__m128d)__builtin_ia32_sqrtsd_round(A, B, C)
+#define _mm_sqrt_round_sd(A, B, C)	      \
+    (__m128d)__builtin_ia32_sqrtsd_mask_round (B, A, \
+	(__v2df) _mm_setzero_pd (), -1, C)
+
+#define _mm_mask_sqrt_round_sd(W, U, A, B, C) \
+    (__m128d)__builtin_ia32_sqrtsd_mask_round (B, A, W, U, C)
+
+#define _mm_maskz_sqrt_round_sd(U, A, B, C)   \
+    (__m128d)__builtin_ia32_sqrtsd_mask_round (B, A, \
+	(__v2df) _mm_setzero_pd (), U, C)
+
+#define _mm_sqrt_round_ss(A, B, C)	      \
+    (__m128)__builtin_ia32_sqrtss_mask_round (B, A, \
+	(__v4sf) _mm_setzero_ps (), -1, C)
+
+#define _mm_mask_sqrt_round_ss(W, U, A, B, C) \
+    (__m128)__builtin_ia32_sqrtss_mask_round (B, A, W, U, C)
 
-#define _mm_sqrt_round_ss(A, B, C)            \
-    (__m128)__builtin_ia32_sqrtss_round(A, B, C)
+#define _mm_maskz_sqrt_round_ss(U, A, B, C)   \
+    (__m128)__builtin_ia32_sqrtss_mask_round (B, A, \
+	(__v4sf) _mm_setzero_ps (), U, C)
 #endif
 
 extern __inline __m512i
diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
index e46a6ab..f7eb6e7 100644
--- a/gcc/config/i386/i386-builtin.def
+++ b/gcc/config/i386/i386-builtin.def
@@ -2500,8 +2500,8 @@ BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_avx512f_vmscalefv2df_round, "__builtin_
 BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_avx512f_vmscalefv4sf_round, "__builtin_ia32_scalefss_round", IX86_BUILTIN_SCALEFSS, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_avx512f_sqrtv8df2_mask_round, "__builtin_ia32_sqrtpd512_mask", IX86_BUILTIN_SQRTPD512_MASK, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_QI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_avx512f_sqrtv16sf2_mask_round, "__builtin_ia32_sqrtps512_mask", IX86_BUILTIN_SQRTPS512_MASK, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_HI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_sse2_vmsqrtv2df2_round, "__builtin_ia32_sqrtsd_round", IX86_BUILTIN_SQRTSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_sse_vmsqrtv4sf2_round, "__builtin_ia32_sqrtss_round", IX86_BUILTIN_SQRTSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
+BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_sse2_vmsqrtv2df2_mask_round, "__builtin_ia32_sqrtsd_mask_round", IX86_BUILTIN_SQRTSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
+BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_sse_vmsqrtv4sf2_mask_round, "__builtin_ia32_sqrtss_mask_round", IX86_BUILTIN_SQRTSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_subv8df3_mask_round, "__builtin_ia32_subpd512_mask", IX86_BUILTIN_SUBPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_subv16sf3_mask_round, "__builtin_ia32_subps512_mask", IX86_BUILTIN_SUBPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_sse2_vmsubv2df3_round, "__builtin_ia32_subsd_round", IX86_BUILTIN_SUBSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 7f17231..5d2b83c 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -1820,21 +1820,21 @@
    (set_attr "prefix" "maybe_vex")
    (set_attr "mode" "<MODE>")])
 
-(define_insn "<sse>_vmsqrt<mode>2<round_name>"
+(define_insn "<sse>_vmsqrt<mode>2<mask_scalar_name><round_scalar_name>"
   [(set (match_operand:VF_128 0 "register_operand" "=x,v")
 	(vec_merge:VF_128
 	  (sqrt:VF_128
-	    (match_operand:VF_128 1 "vector_operand" "xBm,<round_constraint>"))
+	    (match_operand:VF_128 1 "vector_operand" "xBm,<round_scalar_constraint>"))
 	  (match_operand:VF_128 2 "register_operand" "0,v")
 	  (const_int 1)))]
   "TARGET_SSE"
   "@
    sqrt<ssescalarmodesuffix>\t{%1, %0|%0, %<iptr>1}
-   vsqrt<ssescalarmodesuffix>\t{<round_op3>%1, %2, %0|%0, %2, %<iptr>1<round_op3>}"
+   vsqrt<ssescalarmodesuffix>\t{<round_scalar_mask_op3>%1, %2, %0<mask_scalar_operand3>|%0<mask_scalar_operand3>, %2, %<iptr>1<round_scalar_mask_op3>}"
   [(set_attr "isa" "noavx,avx")
    (set_attr "type" "sse")
    (set_attr "atom_sse_attr" "sqrt")
-   (set_attr "prefix" "<round_prefix>")
+   (set_attr "prefix" "<round_scalar_prefix>")
    (set_attr "btver2_sse_attr" "sqrt")
    (set_attr "mode" "<ssescalarmode>")])
 
diff --git a/gcc/testsuite/gcc.target/i386/avx-1.c b/gcc/testsuite/gcc.target/i386/avx-1.c
index 1133a83..c27e482 100644
--- a/gcc/testsuite/gcc.target/i386/avx-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx-1.c
@@ -305,8 +305,8 @@
 #define __builtin_ia32_shufps512_mask(A, B, F, D, E) __builtin_ia32_shufps512_mask(A, B, 1, D, E)
 #define __builtin_ia32_sqrtpd512_mask(A, B, C, D) __builtin_ia32_sqrtpd512_mask(A, B, C, 8)
 #define __builtin_ia32_sqrtps512_mask(A, B, C, D) __builtin_ia32_sqrtps512_mask(A, B, C, 8)
-#define __builtin_ia32_sqrtss_round(A, B, C) __builtin_ia32_sqrtss_round(A, B, 8)
-#define __builtin_ia32_sqrtsd_round(A, B, C) __builtin_ia32_sqrtsd_round(A, B, 8)
+#define __builtin_ia32_sqrtss_mask_round(A, B, C, D, E) __builtin_ia32_sqrtss_mask_round(A, B, C, D, 8)
+#define __builtin_ia32_sqrtsd_mask_round(A, B, C, D, E) __builtin_ia32_sqrtsd_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_subpd512_mask(A, B, C, D, E) __builtin_ia32_subpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_subps512_mask(A, B, C, D, E) __builtin_ia32_subps512_mask(A, B, C, D, 8)
 #define __builtin_ia32_subsd_round(A, B, C) __builtin_ia32_subsd_round(A, B, 8)
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vsqrtsd-1.c b/gcc/testsuite/gcc.target/i386/avx512f-vsqrtsd-1.c
index c0559c0..a7d7af9 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-vsqrtsd-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-vsqrtsd-1.c
@@ -1,13 +1,18 @@
 /* { dg-do compile } */
 /* { dg-options "-mavx512f -O2" } */
 /* { dg-final { scan-assembler-times "vsqrtsd\[ \\t\]+\[^\n\]*\{rn-sae\}\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vsqrtsd\[ \\t\]+\[^\n\]*\{rd-sae\}\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vsqrtsd\[ \\t\]+\[^\n\]*\{rz-sae\}\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
 
 #include <immintrin.h>
 
 volatile __m128d x1, x2;
+volatile __mmask8 m;
 
 void extern
 avx512f_test (void)
 {
   x1 = _mm_sqrt_round_sd (x1, x2, _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
+  x1 = _mm_mask_sqrt_round_sd (x1, m, x1, x2, _MM_FROUND_TO_NEG_INF | _MM_FROUND_NO_EXC);
+  x1 = _mm_maskz_sqrt_round_sd (m, x1, x2, _MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC);
 }
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vsqrtsd-2.c b/gcc/testsuite/gcc.target/i386/avx512f-vsqrtsd-2.c
new file mode 100644
index 0000000..49ca7ee
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512f-vsqrtsd-2.c
@@ -0,0 +1,62 @@
+/* { dg-do run } */
+/* { dg-options "-mavx512f -O2" } */
+/* { dg-require-effective-target avx512f } */
+
+#include <math.h>
+#include "avx512f-check.h"
+
+#define SIZE (128 / 64)
+#include "avx512f-mask-type.h"
+
+static void
+compute_sqrtsd (double *s1, double *s2, double *r)
+{
+  r[0] = sqrt (s2[0]);
+  r[1] = s1[1];
+}
+
+static void
+avx512f_test (void)
+{
+  union128d res1, res2, res3;
+  union128d s1, s2;
+  double res_ref[SIZE];
+  MASK_TYPE mask = MASK_VALUE;
+  int i;
+
+  for (i = 0; i < SIZE; i++)
+    {
+      s1.a[i] = 11.5 * (i + 1);
+      s2.a[i] = 10.5 * (i + 1);
+      res_ref[i] = 9.5 * (i + 1);
+      res1.a[i] = DEFAULT_VALUE;
+      res2.a[i] = DEFAULT_VALUE;
+      res3.a[i] = DEFAULT_VALUE;
+    }
+
+  res1.x = _mm_sqrt_round_sd (s1.x, s2.x,
+		_MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
+  res2.x = _mm_mask_sqrt_round_sd (s1.x, mask, s1.x, s2.x,
+		_MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
+  res3.x = _mm_maskz_sqrt_round_sd (mask, s1.x, s2.x,
+		_MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
+
+  compute_sqrtsd (s1.a, s2.a, res_ref);
+
+  if (check_union128d (res1, res_ref))
+    abort ();
+
+  MASK_MERGE (d) (res_ref, mask, 1);
+
+  if (check_union128d (res2, res_ref))
+    abort ();
+
+  MASK_ZERO (d) (res_ref, mask, 1);
+
+  if (check_union128d (res3, res_ref))
+    abort ();
+}
+
+
+
+
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vsqrtss-1.c b/gcc/testsuite/gcc.target/i386/avx512f-vsqrtss-1.c
index e43b4a1..103ff30 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-vsqrtss-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-vsqrtss-1.c
@@ -1,13 +1,19 @@
 /* { dg-do compile } */
 /* { dg-options "-mavx512f -O2" } */
 /* { dg-final { scan-assembler-times "vsqrtss\[ \\t\]+\[^\n\]*\{rn-sae\}\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vsqrtss\[ \\t\]+\[^\n\]*\{rd-sae\}\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vsqrtss\[ \\t\]+\[^\n\]*\{rz-sae\}\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+
 
 #include <immintrin.h>
 
 volatile __m128 x1, x2;
+volatile __mmask8 m;
 
 void extern
 avx512f_test (void)
 {
   x1 = _mm_sqrt_round_ss (x1, x2, _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
+  x1 = _mm_mask_sqrt_round_ss (x1, m, x1, x2, _MM_FROUND_TO_NEG_INF | _MM_FROUND_NO_EXC);
+  x1 = _mm_maskz_sqrt_round_ss (m, x1, x2, _MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC);
 }
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vsqrtss-2.c b/gcc/testsuite/gcc.target/i386/avx512f-vsqrtss-2.c
new file mode 100644
index 0000000..90f88be
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512f-vsqrtss-2.c
@@ -0,0 +1,63 @@
+/* { dg-do run } */
+/* { dg-options "-mavx512f -O2" } */
+/* { dg-require-effective-target avx512f } */
+
+#include <math.h>
+#include "avx512f-check.h"
+
+#define SIZE (128 / 32)
+#include "avx512f-mask-type.h"
+
+static void
+compute_sqrtss (float *s1, float *s2, float *r)
+{
+  r[0] = sqrt (s2[0]);
+  int i;
+  for (i = 1; i < SIZE; i++)
+    {
+      r[i] = s1[i];
+    }
+}
+
+static void
+avx512f_test (void)
+{
+  union128 res1, res2, res3;
+  union128 s1, s2;
+  float res_ref[SIZE];
+  MASK_TYPE mask = MASK_VALUE;
+  int i;
+
+  for (i = 0; i < SIZE; i++)
+    {
+      s1.a[i] = 11.5 * (i + 1);
+      s2.a[i] = 10.5 * (i + 1);
+      res_ref[i] = 9.5 * (i + 1);
+      res1.a[i] = DEFAULT_VALUE;
+      res2.a[i] = DEFAULT_VALUE;
+      res3.a[i] = DEFAULT_VALUE;
+    }
+
+  res1.x = _mm_sqrt_round_ss (s1.x, s2.x,
+                _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
+  res2.x = _mm_mask_sqrt_round_ss (s1.x, mask, s1.x, s2.x,
+                _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
+  res3.x = _mm_maskz_sqrt_round_ss (mask, s1.x, s2.x,
+                _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
+
+  compute_sqrtss (s1.a, s2.a, res_ref);
+
+  if (check_union128 (res1, res_ref))
+    abort ();
+
+  MASK_MERGE () (res_ref, mask, 1);
+
+  if (check_union128 (res2, res_ref))
+    abort ();
+
+  MASK_ZERO () (res_ref, mask, 1);
+
+  if (check_union128 (res3, res_ref))
+    abort ();
+}
+
diff --git a/gcc/testsuite/gcc.target/i386/sse-13.c b/gcc/testsuite/gcc.target/i386/sse-13.c
index 9bdc73f..3740bc0 100644
--- a/gcc/testsuite/gcc.target/i386/sse-13.c
+++ b/gcc/testsuite/gcc.target/i386/sse-13.c
@@ -322,8 +322,8 @@
 #define __builtin_ia32_shufps512_mask(A, B, F, D, E) __builtin_ia32_shufps512_mask(A, B, 1, D, E)
 #define __builtin_ia32_sqrtpd512_mask(A, B, C, D) __builtin_ia32_sqrtpd512_mask(A, B, C, 8)
 #define __builtin_ia32_sqrtps512_mask(A, B, C, D) __builtin_ia32_sqrtps512_mask(A, B, C, 8)
-#define __builtin_ia32_sqrtss_round(A, B, C) __builtin_ia32_sqrtss_round(A, B, 8)
-#define __builtin_ia32_sqrtsd_round(A, B, C) __builtin_ia32_sqrtsd_round(A, B, 8)
+#define __builtin_ia32_sqrtss_mask_round(A, B, C, D, E) __builtin_ia32_sqrtss_mask_round(A, B, C, D, 8)
+#define __builtin_ia32_sqrtsd_mask_round(A, B, C, D, E) __builtin_ia32_sqrtsd_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_subpd512_mask(A, B, C, D, E) __builtin_ia32_subpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_subps512_mask(A, B, C, D, E) __builtin_ia32_subps512_mask(A, B, C, D, 8)
 #define __builtin_ia32_subsd_round(A, B, C) __builtin_ia32_subsd_round(A, B, 8)
diff --git a/gcc/testsuite/gcc.target/i386/sse-23.c b/gcc/testsuite/gcc.target/i386/sse-23.c
index 66c25c7..36116ae 100644
--- a/gcc/testsuite/gcc.target/i386/sse-23.c
+++ b/gcc/testsuite/gcc.target/i386/sse-23.c
@@ -323,8 +323,8 @@
 #define __builtin_ia32_shufps512_mask(A, B, F, D, E) __builtin_ia32_shufps512_mask(A, B, 1, D, E)
 #define __builtin_ia32_sqrtpd512_mask(A, B, C, D) __builtin_ia32_sqrtpd512_mask(A, B, C, 8)
 #define __builtin_ia32_sqrtps512_mask(A, B, C, D) __builtin_ia32_sqrtps512_mask(A, B, C, 8)
-#define __builtin_ia32_sqrtss_round(A, B, C) __builtin_ia32_sqrtss_round(A, B, 8)
-#define __builtin_ia32_sqrtsd_round(A, B, C) __builtin_ia32_sqrtsd_round(A, B, 8)
+#define __builtin_ia32_sqrtss_mask_round(A, B, C, D, E) __builtin_ia32_sqrtss_mask_round(A, B, C, D, 8)
+#define __builtin_ia32_sqrtsd_mask_round(A, B, C, D, E) __builtin_ia32_sqrtsd_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_subpd512_mask(A, B, C, D, E) __builtin_ia32_subpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_subps512_mask(A, B, C, D, E) __builtin_ia32_subps512_mask(A, B, C, D, 8)
 #define __builtin_ia32_subsd_round(A, B, C) __builtin_ia32_subsd_round(A, B, 8)
-- 
2.5.5



* Re: [patch][i386, AVX] Adding missing mask[z]_sqrt_round_s[d,s] intrinsics
From: Kirill Yukhin @ 2018-02-12  5:46 UTC
  To: Makhotina, Olga; +Cc: 'gcc-patches@gcc.gnu.org', Peryt, Sebastian

Hello Olga,

On 21 Nov 12:46, Makhotina, Olga wrote:
> Hi,
> 
> This patch adds missing intrinsics for _mm_mask[z]_sqrt_round_[sd,ss].
> 
> 21.11.2017 Olga Makhotina  <olga.makhotina@intel.com>
> 
> gcc/
>               * config/i386/avx512fintrin.h (_mm_mask_sqrt_round_sd,
>               _mm_maskz_sqrt_round_sd, _mm_mask_sqrt_round_ss,
>               _mm_maskz_sqrt_round_ss): New intrinsics.
>               (__builtin_ia32_sqrtsd_round, __builtin_ia32_sqrtss_round): Remove.
>               (__builtin_ia32_sqrtsd_mask_round,
>               __builtin_ia32_sqrtss_mask_round): New builtins.
>               * config/i386/i386-builtin.def (__builtin_ia32_sqrtsd_round,
>               __builtin_ia32_sqrtss_round): Remove.
>               (__builtin_ia32_sqrtsd_mask_round,
>               __builtin_ia32_sqrtss_mask_round): New builtins.
>               * config/i386/sse.md (vmsqrt<mode>2<round_name>): Renamed to ...
>               (vmsqrt<mode>2<mask_scalar_name><round_scalar_name>): ... this.
>               ((match_operand:VF_128 1 "vector_operand" 
>               "xBm,<round_constraint>")): Changed to ...
>               ((match_operand:VF_128 1 "vector_operand" 
>               "xBm,<round_scalar_constraint>")): ... this.
>               (vsqrt<ssescalarmodesuffix>\t{<round_op3>%1, %2, %0|
>               %0, %2, %<iptr>1<round_op3>}): Changed to ...
>               (vsqrt<ssescalarmodesuffix>\t{<round_scalar_mask_op3>%1, %2, 
>               %0<mask_scalar_operand3>|%0<mask_scalar_operand3>, %2, 
>               %<iptr>1<round_scalar_mask_op3>}): ... this.
>               ((set_attr "prefix" "<round_prefix>")): Changed to ...
>               ((set_attr "prefix" "<round_scalar_prefix>")): ... this.
> 
> 21.11.2017 Olga Makhotina  <olga.makhotina@intel.com>
> 
> gcc/testsuite/
>               * gcc.target/i386/avx512f-vsqrtsd-1.c (_mm_mask_sqrt_round_sd,
>               _mm_maskz_sqrt_round_sd): Test new intrinsics.
>               * gcc.target/i386/avx512f-vsqrtsd-2.c (_mm_sqrt_round_sd,
>               _mm_mask_sqrt_round_sd, _mm_maskz_sqrt_round_sd): Test new intrinsics.
>               * gcc.target/i386/avx512f-vsqrtss-1.c (_mm_mask_sqrt_round_ss,
>               _mm_maskz_sqrt_round_ss): Test new intrinsics.
>               * gcc.target/i386/avx512f-vsqrtss-2.c (_mm_sqrt_round_ss,
>               _mm_mask_sqrt_round_ss,      _mm_maskz_sqrt_round_ss): Test new intrinsics.
>               * gcc.target/i386/avx-1.c (__builtin_ia32_sqrtsd_round,
>               __builtin_ia32_sqrtss_round): Remove builtins.
>               (__builtin_ia32_sqrtsd_mask_round,
>               __builtin_ia32_sqrtss_mask_round): Test new builtins.
>               * gcc.target/i386/sse-13.c: Ditto.
>               * gcc.target/i386/sse-23.c: Ditto.
> 
> Is it ok for trunk?
The patch itself is OK for trunk. I've checked it in.

One nit: could you please format ChangeLog entries more carefully:
at most 80 characters per line, and a tab instead of leading spaces.
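
As a rough illustration of the two rules above, a check could look like the following. This is a minimal sketch, not an existing GCC tool: the helper names are hypothetical, and it assumes a tab advances to the next multiple of 8 columns and that only the indented body lines of an entry are being checked (not the date/author header line).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: display width of LINE, with each tab advancing
   to the next multiple of 8 columns (an assumption of this sketch).  */
static size_t
display_width (const char *line)
{
  size_t col = 0;
  for (; *line != '\0' && *line != '\n'; line++)
    col = (*line == '\t') ? (col / 8 + 1) * 8 : col + 1;
  return col;
}

/* Return true if a ChangeLog body line starts with a tab and is at
   most 80 columns wide.  Blank lines are accepted.  */
static bool
changelog_body_line_ok (const char *line)
{
  if (line[0] == '\0' || line[0] == '\n')
    return true;
  return line[0] == '\t' && display_width (line) <= 80;
}
```

Under these assumptions, an entry line indented with fourteen spaces, as in the submission quoted above, would be rejected, while the same text indented with a single tab and wrapped before column 80 would pass.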

--
Thanks, K

> 
> Thanks,
> Olga
> 

