public inbox for gcc-patches@gcc.gnu.org
* [patch][i386, AVX] Adding missing mask[z]_scalef_round_s[d,s] intrinsics
From: Makhotina, Olga @ 2017-11-14  9:57 UTC (permalink / raw)
  To: gcc-patches; +Cc: Uros Bizjak, Kirill Yukhin, Makhotina, Olga, Peryt, Sebastian

Hi,

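This patch adds the missing _mm_mask_scalef_round_sd, _mm_maskz_scalef_round_sd,
_mm_mask_scalef_round_ss and _mm_maskz_scalef_round_ss intrinsics. The unmasked
scalar builtins are replaced by masked ones, and the existing scalar
_mm_scalef_* intrinsics now call those with an all-ones mask.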

2017-11-14  Olga Makhotina  <olga.makhotina@intel.com>

gcc/
	* config/i386/avx512fintrin.h (_mm_mask_scalef_round_sd,
	_mm_maskz_scalef_round_sd, _mm_mask_scalef_round_ss,
	_mm_maskz_scalef_round_ss): New intrinsics.
	(_mm_scalef_round_sd, _mm_scalef_round_ss, _mm_scalef_sd,
	_mm_scalef_ss): Use __builtin_ia32_scalefsd_mask_round and
	__builtin_ia32_scalefss_mask_round.
	* config/i386/i386-builtin.def (__builtin_ia32_scalefsd_round,
	__builtin_ia32_scalefss_round): Remove.
	(__builtin_ia32_scalefsd_mask_round,
	__builtin_ia32_scalefss_mask_round): New builtins.
	* config/i386/sse.md (vmscalef<mode><round_name>): Renamed to ...
	(vmscalef<mode><mask_scalar_name><round_scalar_name>): ... this.
	((match_operand:VF_128 2 "<round_nimm_predicate>"
	"<round_constraint>")): Changed to ...
	((match_operand:VF_128 2 "<round_scalar_nimm_predicate>"
	"<round_scalar_constraint>")): ... this.
	("vscalef<ssescalarmodesuffix>\t{<round_op3>%2, %1, %0|
	%0, %1, %2<round_op3>}"): Changed to ...
	("vscalef<ssescalarmodesuffix>\t{<round_scalar_mask_op3>%2, %1,
	%0<mask_scalar_operand3>|%0<mask_scalar_operand3>, %1,
	%2<round_scalar_mask_op3>}"): ... this.
	* config/i386/subst.md (round_scalar_nimm_predicate): New.

2017-11-14  Olga Makhotina  <olga.makhotina@intel.com>

gcc/testsuite/
	* gcc.target/i386/avx512f-vscalefsd-1.c (_mm_mask_scalef_round_sd,
	_mm_maskz_scalef_round_sd): Test new intrinsics.
	* gcc.target/i386/avx512f-vscalefsd-2.c (_mm_scalef_round_sd,
	_mm_mask_scalef_round_sd, _mm_maskz_scalef_round_sd): Test new
	intrinsics.
	* gcc.target/i386/avx512f-vscalefss-1.c (_mm_mask_scalef_round_ss,
	_mm_maskz_scalef_round_ss): Test new intrinsics.
	* gcc.target/i386/avx512f-vscalefss-2.c (_mm_scalef_round_ss,
	_mm_mask_scalef_round_ss, _mm_maskz_scalef_round_ss): Test new
	intrinsics.
	* gcc.target/i386/avx-1.c (__builtin_ia32_scalefsd_round,
	__builtin_ia32_scalefss_round): Remove builtin.
	(__builtin_ia32_scalefsd_mask_round,
	__builtin_ia32_scalefss_mask_round): Test new builtin.
	* gcc.target/i386/sse-13.c: Ditto.
	* gcc.target/i386/sse-23.c: Ditto.

Is it ok for trunk?
 
Thanks,
Olga
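
For readers who do not have the instruction semantics at hand, the masking behaviour exposed by the new intrinsics can be modelled in plain C. This is only a sketch for finite, moderately sized inputs: the `vec2d` type and the `model_*` names are hypothetical stand-ins rather than anything in GCC, special cases (NaN, infinities, denormals) and the rounding argument are ignored, and the low lane is computed as a * 2^floor(b) with the upper lane copied from a.

```c
#include <assert.h>

/* Hypothetical two-lane stand-in for __m128d; lane[0] is the low element. */
typedef struct { double lane[2]; } vec2d;

/* floor() for small finite values, written without libm. */
static int floor_to_int(double b)
{
    assert(b > -1000.0 && b < 1000.0);  /* keeps the cast well defined */
    int e = (int)b;                     /* truncates toward zero */
    if ((double)e > b)
        e--;                            /* adjust negative non-integers */
    return e;
}

/* Exact power of two for the small exponents accepted above. */
static double pow2i(int e)
{
    double p = 1.0;
    while (e > 0) { p *= 2.0; e--; }
    while (e < 0) { p *= 0.5; e++; }
    return p;
}

/* Unmasked model: low lane scaled, upper lane copied from a. */
static vec2d model_scalef_sd(vec2d a, vec2d b)
{
    vec2d r = a;
    r.lane[0] = a.lane[0] * pow2i(floor_to_int(b.lane[0]));
    return r;
}

/* Merge masking: with mask bit 0 clear, the low lane comes from w. */
static vec2d model_mask_scalef_sd(vec2d w, unsigned char k, vec2d a, vec2d b)
{
    vec2d r = model_scalef_sd(a, b);
    if (!(k & 1))
        r.lane[0] = w.lane[0];
    return r;
}

/* Zero masking: with mask bit 0 clear, the low lane is zeroed. */
static vec2d model_maskz_scalef_sd(unsigned char k, vec2d a, vec2d b)
{
    vec2d r = model_scalef_sd(a, b);
    if (!(k & 1))
        r.lane[0] = 0.0;
    return r;
}
```

Only bit 0 of the mask matters for the scalar forms, which is why the unmasked intrinsics in the patch can pass an all-ones mask and a zero vector as the merge source.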


* RE: [patch][i386, AVX] Adding missing mask[z]_scalef_round_s[d,s] intrinsics
From: Makhotina, Olga @ 2017-11-14 10:09 UTC (permalink / raw)
  To: gcc-patches; +Cc: Uros Bizjak, Kirill Yukhin, Peryt, Sebastian

[-- Attachment #1: Type: text/plain, Size: 2617 bytes --]

Hi,

The attachment was accidentally left off the previous message. Resending with the patch attached.

Thanks,
Olga

[-- Attachment #2: 0001-mask-z-_scalef_round_s-s-d.patch --]
[-- Type: application/octet-stream, Size: 19352 bytes --]

From 457d58a02c2f4a446a9114743824d869e2b987f4 Mon Sep 17 00:00:00 2001
From: Olga Makhotina <olga.makhotina@intel.com>
Date: Tue, 31 Oct 2017 11:52:20 +0100
Subject: [PATCH] mask[z]_scalef_round_s[s/d]

	modified:   gcc/config/i386/avx512fintrin.h
	modified:   gcc/config/i386/i386-builtin.def
	modified:   gcc/config/i386/sse.md
	modified:   gcc/config/i386/subst.md
	modified:   gcc/testsuite/gcc.target/i386/avx-1.c
	modified:   gcc/testsuite/gcc.target/i386/avx512f-vscalefsd-1.c
	modified:   gcc/testsuite/gcc.target/i386/avx512f-vscalefsd-2.c
	modified:   gcc/testsuite/gcc.target/i386/avx512f-vscalefss-1.c
	modified:   gcc/testsuite/gcc.target/i386/avx512f-vscalefss-2.c
	modified:   gcc/testsuite/gcc.target/i386/sse-13.c
	modified:   gcc/testsuite/gcc.target/i386/sse-23.c

---
 gcc/config/i386/avx512fintrin.h                    | 85 ++++++++++++++++++----
 gcc/config/i386/i386-builtin.def                   |  4 +-
 gcc/config/i386/sse.md                             |  6 +-
 gcc/config/i386/subst.md                           |  1 +
 gcc/testsuite/gcc.target/i386/avx-1.c              |  4 +-
 .../gcc.target/i386/avx512f-vscalefsd-1.c          |  7 ++
 .../gcc.target/i386/avx512f-vscalefsd-2.c          | 28 ++++++-
 .../gcc.target/i386/avx512f-vscalefss-1.c          |  6 ++
 .../gcc.target/i386/avx512f-vscalefss-2.c          | 28 ++++++-
 gcc/testsuite/gcc.target/i386/sse-13.c             |  4 +-
 gcc/testsuite/gcc.target/i386/sse-23.c             |  4 +-
 11 files changed, 150 insertions(+), 27 deletions(-)

diff --git a/gcc/config/i386/avx512fintrin.h b/gcc/config/i386/avx512fintrin.h
index 72f57f7..957ae84 100644
--- a/gcc/config/i386/avx512fintrin.h
+++ b/gcc/config/i386/avx512fintrin.h
@@ -3039,18 +3039,67 @@ extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_scalef_round_sd (__m128d __A, __m128d __B, const int __R)
 {
-  return (__m128d) __builtin_ia32_scalefsd_round ((__v2df) __A,
-						  (__v2df) __B,
-						  __R);
+  return (__m128d) __builtin_ia32_scalefsd_mask_round ((__v2df) __A,
+						       (__v2df) __B,
+						       (__v2df)
+						       _mm_setzero_pd (),
+						       (__mmask8) -1, __R);
+}
+
+extern __inline __m128d
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_scalef_round_sd (__m128d __W, __mmask8 __U, __m128d __A, __m128d __B,
+			  const int __R)
+{
+  return (__m128d) __builtin_ia32_scalefsd_mask_round ((__v2df) __A,
+						       (__v2df) __B,
+						       (__v2df) __W,
+						       (__mmask8) __U, __R);
+}
+
+extern __inline __m128d
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskz_scalef_round_sd (__mmask8 __U, __m128d __A, __m128d __B,
+			   const int __R)
+{
+  return (__m128d) __builtin_ia32_scalefsd_mask_round ((__v2df) __A,
+						       (__v2df) __B,
+						       (__v2df)
+						       _mm_setzero_pd (),
+						       (__mmask8) __U, __R);
 }
 
 extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_scalef_round_ss (__m128 __A, __m128 __B, const int __R)
 {
-  return (__m128) __builtin_ia32_scalefss_round ((__v4sf) __A,
-						 (__v4sf) __B,
-						 __R);
+  return (__m128) __builtin_ia32_scalefss_mask_round ((__v4sf) __A,
+						      (__v4sf) __B,
+						      (__v4sf)
+						      _mm_setzero_ps (),
+						      (__mmask8) -1, __R);
+}
+
+extern __inline __m128
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_scalef_round_ss (__m128 __W, __mmask8 __U, __m128 __A, __m128 __B,
+			  const int __R)
+{
+  return (__m128) __builtin_ia32_scalefss_mask_round ((__v4sf) __A,
+						      (__v4sf) __B,
+						      (__v4sf) __W,
+						      (__mmask8) __U, __R);
+}
+
+extern __inline __m128
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskz_scalef_round_ss (__mmask8 __U, __m128 __A, __m128 __B, const int __R)
+{
+  return (__m128) __builtin_ia32_scalefss_mask_round ((__v4sf) __A,
+						      (__v4sf) __B,
+						      (__v4sf)
+						      _mm_setzero_ps (),
+						      (__mmask8) __U, __R);
 }
 #else
 #define _mm512_scalef_round_pd(A, B, C)            \
@@ -3072,10 +3121,12 @@ _mm_scalef_round_ss (__m128 __A, __m128 __B, const int __R)
     (__m512)__builtin_ia32_scalefps512_mask(A, B, (__v16sf)_mm512_setzero_ps(), U, C)
 
 #define _mm_scalef_round_sd(A, B, C)            \
-    (__m128d)__builtin_ia32_scalefsd_round(A, B, C)
+    (__m128d)__builtin_ia32_scalefsd_mask_round (A, B, \
+	(__v2df)_mm_setzero_pd (), -1, C)
 
 #define _mm_scalef_round_ss(A, B, C)            \
-    (__m128)__builtin_ia32_scalefss_round(A, B, C)
+    (__m128)__builtin_ia32_scalefss_mask_round (A, B, \
+	(__v4sf)_mm_setzero_ps (), -1, C)
 #endif
 
 #ifdef __OPTIMIZE__
@@ -12118,18 +12169,24 @@ extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_scalef_sd (__m128d __A, __m128d __B)
 {
-  return (__m128d) __builtin_ia32_scalefsd_round ((__v2df) __A,
-						  (__v2df) __B,
-						  _MM_FROUND_CUR_DIRECTION);
+  return (__m128d) __builtin_ia32_scalefsd_mask_round ((__v2df) __A,
+						    (__v2df) __B,
+						    (__v2df)
+						    _mm_setzero_pd (),
+						    (__mmask8) -1,
+						    _MM_FROUND_CUR_DIRECTION);
 }
 
 extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_scalef_ss (__m128 __A, __m128 __B)
 {
-  return (__m128) __builtin_ia32_scalefss_round ((__v4sf) __A,
-						 (__v4sf) __B,
-						 _MM_FROUND_CUR_DIRECTION);
+  return (__m128) __builtin_ia32_scalefss_mask_round ((__v4sf) __A,
+						   (__v4sf) __B,
+						   (__v4sf)
+						   _mm_setzero_ps (),
+						   (__mmask8) -1,
+						   _MM_FROUND_CUR_DIRECTION);
 }
 
 extern __inline __m512d
diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
index 4666a4e..3612eea 100644
--- a/gcc/config/i386/i386-builtin.def
+++ b/gcc/config/i386/i386-builtin.def
@@ -2482,8 +2482,8 @@ BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_avx512f_rndscalev2df_round, "__builtin_
 BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_avx512f_rndscalev4sf_round, "__builtin_ia32_rndscaless_round", IX86_BUILTIN_RNDSCALESS, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_avx512f_scalefv8df_mask_round, "__builtin_ia32_scalefpd512_mask", IX86_BUILTIN_SCALEFPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_avx512f_scalefv16sf_mask_round, "__builtin_ia32_scalefps512_mask", IX86_BUILTIN_SCALEFPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_avx512f_vmscalefv2df_round, "__builtin_ia32_scalefsd_round", IX86_BUILTIN_SCALEFSD, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_avx512f_vmscalefv4sf_round, "__builtin_ia32_scalefss_round", IX86_BUILTIN_SCALEFSS, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
+BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_avx512f_vmscalefv2df_mask_round, "__builtin_ia32_scalefsd_mask_round", IX86_BUILTIN_SCALEFSD, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
+BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_avx512f_vmscalefv4sf_mask_round, "__builtin_ia32_scalefss_mask_round", IX86_BUILTIN_SCALEFSS, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_avx512f_sqrtv8df2_mask_round, "__builtin_ia32_sqrtpd512_mask", IX86_BUILTIN_SQRTPD512_MASK, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_QI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_avx512f_sqrtv16sf2_mask_round, "__builtin_ia32_sqrtps512_mask", IX86_BUILTIN_SQRTPS512_MASK, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_HI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_sse2_vmsqrtv2df2_round, "__builtin_ia32_sqrtsd_round", IX86_BUILTIN_SQRTSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 19b2c69..2f65975 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -8241,17 +8241,17 @@
   operands[1] = adjust_address (operands[1], DFmode, INTVAL (operands[2]) * 8);
 })
 
-(define_insn "avx512f_vmscalef<mode><round_name>"
+(define_insn "avx512f_vmscalef<mode><mask_scalar_name><round_scalar_name>"
   [(set (match_operand:VF_128 0 "register_operand" "=v")
 	(vec_merge:VF_128
 	  (unspec:VF_128
 	    [(match_operand:VF_128 1 "register_operand" "v")
-	     (match_operand:VF_128 2 "<round_nimm_predicate>" "<round_constraint>")]
+	     (match_operand:VF_128 2 "<round_scalar_nimm_predicate>" "<round_scalar_constraint>")]
 	    UNSPEC_SCALEF)
 	  (match_dup 1)
 	  (const_int 1)))]
   "TARGET_AVX512F"
-  "vscalef<ssescalarmodesuffix>\t{<round_op3>%2, %1, %0|%0, %1, %2<round_op3>}"
+  "vscalef<ssescalarmodesuffix>\t{<round_scalar_mask_op3>%2, %1, %0<mask_scalar_operand3>|%0<mask_scalar_operand3>, %1, %2<round_scalar_mask_op3>}"
   [(set_attr "prefix" "evex")
    (set_attr "mode"  "<ssescalarmode>")])
 
diff --git a/gcc/config/i386/subst.md b/gcc/config/i386/subst.md
index a318a8d..af3f3ef 100644
--- a/gcc/config/i386/subst.md
+++ b/gcc/config/i386/subst.md
@@ -262,6 +262,7 @@
 (define_subst_attr "round_scalar_mask_op3" "round_scalar" "" "<round_scalar_mask_operand3>")
 (define_subst_attr "round_scalar_constraint" "round_scalar" "vm" "v")
 (define_subst_attr "round_scalar_prefix" "round_scalar" "vex" "evex")
+(define_subst_attr "round_scalar_nimm_predicate" "round_scalar" "vector_operand" "register_operand")
 
 (define_subst "round_scalar"
   [(set (match_operand:SUBST_V 0)
diff --git a/gcc/testsuite/gcc.target/i386/avx-1.c b/gcc/testsuite/gcc.target/i386/avx-1.c
index d03625b..b5db8f1 100644
--- a/gcc/testsuite/gcc.target/i386/avx-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx-1.c
@@ -287,8 +287,8 @@
 #define __builtin_ia32_rndscaless_round(A, B, C, D) __builtin_ia32_rndscaless_round(A, B, 1, 4)
 #define __builtin_ia32_scalefpd512_mask(A, B, C, D, E) __builtin_ia32_scalefpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_scalefps512_mask(A, B, C, D, E) __builtin_ia32_scalefps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_scalefsd_round(A, B, C) __builtin_ia32_scalefsd_round(A, B, 8)
-#define __builtin_ia32_scalefss_round(A, B, C) __builtin_ia32_scalefss_round(A, B, 8)
+#define __builtin_ia32_scalefsd_mask_round(A, B, C, D, E) __builtin_ia32_scalefsd_mask_round(A, B, C, D, 8)
+#define __builtin_ia32_scalefss_mask_round(A, B, C, D, E) __builtin_ia32_scalefss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_scatterdiv8df(A, B, C, D, F) __builtin_ia32_scatterdiv8df(A, B, C, D, 8)
 #define __builtin_ia32_scatterdiv8di(A, B, C, D, F) __builtin_ia32_scatterdiv8di(A, B, C, D, 8)
 #define __builtin_ia32_scatterdiv16sf(A, B, C, D, F) __builtin_ia32_scatterdiv16sf(A, B, C, D, 8)
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vscalefsd-1.c b/gcc/testsuite/gcc.target/i386/avx512f-vscalefsd-1.c
index c883192..09bc5c6 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-vscalefsd-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-vscalefsd-1.c
@@ -1,14 +1,21 @@
 /* { dg-do compile } */
 /* { dg-options "-mavx512f -O2" } */
 /* { dg-final { scan-assembler-times "vscalefsd\[ \\t\]+\[^\n\]*\{rn-sae\}\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vscalefsd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vscalefsd\[ \\t\]+\[^\n\]*\{rd-sae\}\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vscalefsd\[ \\t\]+\[^\n\]*\{rz-sae\}\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+
 
 #include <immintrin.h>
 
 volatile __m128d x;
+volatile __mmask8 m;
 
 void extern
 avx512f_test (void)
 {
   x = _mm_scalef_sd (x, x);
   x = _mm_scalef_round_sd (x, x, _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
+  x = _mm_mask_scalef_round_sd (x, m, x, x, _MM_FROUND_TO_NEG_INF | _MM_FROUND_NO_EXC);
+  x = _mm_maskz_scalef_round_sd (m, x, x, _MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC);
 }
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vscalefsd-2.c b/gcc/testsuite/gcc.target/i386/avx512f-vscalefsd-2.c
index 28738f7..0609b01 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-vscalefsd-2.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-vscalefsd-2.c
@@ -6,6 +6,7 @@
 #include "avx512f-check.h"
 
 #define SIZE (128 / 64)
+#include "avx512f-mask-type.h"
 
 static void
 compute_scalefsd (double *s1, double *s2, double *r)
@@ -17,20 +18,45 @@ compute_scalefsd (double *s1, double *s2, double *r)
 void static
 avx512f_test (void)
 {
-  union128d res1, s1, s2;
+  union128d res1, res2, res3, res4;
+  union128d s1, s2;
   double res_ref[SIZE];
+  MASK_TYPE mask = MASK_VALUE;
   int i;
 
   for (i = 0; i < SIZE; i++)
     {
       s1.a[i] = 11.5 * (i + 1);
       s2.a[i] = 10.5 * (i + 1);
+      res_ref[i] = 9.5 * (i + 1);
+      res1.a[i] = DEFAULT_VALUE;
+      res2.a[i] = DEFAULT_VALUE;
+      res3.a[i] = DEFAULT_VALUE;
+      res4.a[i] = DEFAULT_VALUE;
     }
 
   res1.x = _mm_scalef_sd (s1.x, s2.x);
+  res2.x = _mm_scalef_round_sd (s1.x, s2.x,
+              _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
+  res3.x = _mm_mask_scalef_round_sd (s1.x, mask, s1.x, s2.x,
+              _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
+  res4.x = _mm_maskz_scalef_round_sd (mask, s1.x, s2.x,
+              _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
 
   compute_scalefsd (s1.a, s2.a, res_ref);
 
   if (check_union128d (res1, res_ref))
     abort ();
+  if (check_union128d (res2, res_ref))
+    abort ();
+
+  MASK_MERGE (d) (res_ref, mask, 1);
+
+  if (check_union128d (res3, res_ref))
+    abort ();
+
+  MASK_ZERO (d) (res_ref, mask, 1);
+
+  if (check_union128d (res4, res_ref))
+    abort ();
 }
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vscalefss-1.c b/gcc/testsuite/gcc.target/i386/avx512f-vscalefss-1.c
index f59525f..d1af336 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-vscalefss-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-vscalefss-1.c
@@ -1,14 +1,20 @@
 /* { dg-do compile } */
 /* { dg-options "-mavx512f -O2" } */
 /* { dg-final { scan-assembler-times "vscalefss\[ \\t\]+\[^\n\]*\{rn-sae\}\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vscalefss\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vscalefss\[ \\t\]+\[^\n\]*\{ru-sae\}\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vscalefss\[ \\t\]+\[^\n\]*\{rz-sae\}\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
 
 #include <immintrin.h>
 
 volatile __m128 x;
+volatile __mmask8 m;
 
 void extern
 avx512f_test (void)
 {
   x = _mm_scalef_ss (x, x);
   x = _mm_scalef_round_ss (x, x, _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
+  x = _mm_mask_scalef_round_ss (x, m, x, x, _MM_FROUND_TO_POS_INF | _MM_FROUND_NO_EXC);
+  x = _mm_maskz_scalef_round_ss (m, x, x, _MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC);
 }
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vscalefss-2.c b/gcc/testsuite/gcc.target/i386/avx512f-vscalefss-2.c
index 9356184..f0501bf 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-vscalefss-2.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-vscalefss-2.c
@@ -6,6 +6,7 @@
 #include "avx512f-check.h"
 
 #define SIZE (128 / 32)
+#include "avx512f-mask-type.h"
 
 static void
 compute_scalefss (float *s1, float *s2, float *r)
@@ -19,20 +20,45 @@ compute_scalefss (float *s1, float *s2, float *r)
 static void
 avx512f_test (void)
 {
-  union128 res1, s1, s2;
+  union128 res1, res2, res3, res4;
+  union128 s1, s2;
   float res_ref[SIZE];
+  MASK_TYPE mask = MASK_VALUE;
   int i;
 
   for (i = 0; i < SIZE; i++)
     {
       s1.a[i] = 11.5 * (i + 1);
       s2.a[i] = 10.5 * (i + 1);
+      res_ref[i] = 9.5 * (i + 1);
+      res1.a[i] = DEFAULT_VALUE;
+      res2.a[i] = DEFAULT_VALUE;
+      res3.a[i] = DEFAULT_VALUE;
+      res4.a[i] = DEFAULT_VALUE;
     }
 
   res1.x = _mm_scalef_ss (s1.x, s2.x);
+  res2.x = _mm_scalef_round_ss (s1.x, s2.x,
+              _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
+  res3.x = _mm_mask_scalef_round_ss (s1.x, mask, s1.x, s2.x,
+              _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
+  res4.x = _mm_maskz_scalef_round_ss (mask, s1.x, s2.x,
+              _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
 
   compute_scalefss (s1.a, s2.a, res_ref);
 
   if (check_union128 (res1, res_ref))
     abort ();
+  if (check_union128 (res2, res_ref))
+    abort ();
+
+  MASK_MERGE () (res_ref, mask, 1);
+
+  if (check_union128 (res3, res_ref))
+    abort ();
+
+  MASK_ZERO () (res_ref, mask, 1);
+
+  if (check_union128 (res4, res_ref))
+    abort ();
 }
diff --git a/gcc/testsuite/gcc.target/i386/sse-13.c b/gcc/testsuite/gcc.target/i386/sse-13.c
index 7ab2223..0642888 100644
--- a/gcc/testsuite/gcc.target/i386/sse-13.c
+++ b/gcc/testsuite/gcc.target/i386/sse-13.c
@@ -304,8 +304,8 @@
 #define __builtin_ia32_rndscaless_round(A, B, C, D) __builtin_ia32_rndscaless_round(A, B, 1, 4)
 #define __builtin_ia32_scalefpd512_mask(A, B, C, D, E) __builtin_ia32_scalefpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_scalefps512_mask(A, B, C, D, E) __builtin_ia32_scalefps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_scalefsd_round(A, B, C) __builtin_ia32_scalefsd_round(A, B, 8)
-#define __builtin_ia32_scalefss_round(A, B, C) __builtin_ia32_scalefss_round(A, B, 8)
+#define __builtin_ia32_scalefsd_mask_round(A, B, C, D, E) __builtin_ia32_scalefsd_mask_round(A, B, C, D, 8)
+#define __builtin_ia32_scalefss_mask_round(A, B, C, D, E) __builtin_ia32_scalefss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_scatterdiv8df(A, B, C, D, F) __builtin_ia32_scatterdiv8df(A, B, C, D, 8)
 #define __builtin_ia32_scatterdiv8di(A, B, C, D, F) __builtin_ia32_scatterdiv8di(A, B, C, D, 8)
 #define __builtin_ia32_scatterdiv16sf(A, B, C, D, F) __builtin_ia32_scatterdiv16sf(A, B, C, D, 8)
diff --git a/gcc/testsuite/gcc.target/i386/sse-23.c b/gcc/testsuite/gcc.target/i386/sse-23.c
index 3a90e54..298a30f 100644
--- a/gcc/testsuite/gcc.target/i386/sse-23.c
+++ b/gcc/testsuite/gcc.target/i386/sse-23.c
@@ -305,8 +305,8 @@
 #define __builtin_ia32_rndscaless_round(A, B, C, D) __builtin_ia32_rndscaless_round(A, B, 1, 4)
 #define __builtin_ia32_scalefpd512_mask(A, B, C, D, E) __builtin_ia32_scalefpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_scalefps512_mask(A, B, C, D, E) __builtin_ia32_scalefps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_scalefsd_round(A, B, C) __builtin_ia32_scalefsd_round(A, B, 8)
-#define __builtin_ia32_scalefss_round(A, B, C) __builtin_ia32_scalefss_round(A, B, 8)
+#define __builtin_ia32_scalefsd_mask_round(A, B, C, D, E) __builtin_ia32_scalefsd_mask_round(A, B, C, D, 8)
+#define __builtin_ia32_scalefss_mask_round(A, B, C, D, E) __builtin_ia32_scalefss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_scatterdiv8df(A, B, C, D, F) __builtin_ia32_scatterdiv8df(A, B, C, D, 8)
 #define __builtin_ia32_scatterdiv8di(A, B, C, D, F) __builtin_ia32_scatterdiv8di(A, B, C, D, 8)
 #define __builtin_ia32_scatterdiv16sf(A, B, C, D, F) __builtin_ia32_scatterdiv16sf(A, B, C, D, 8)
-- 
1.8.3.1
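
The rounding argument used throughout the tests is an immediate built by OR-ing a direction with the suppress-exceptions bit, and the scan-assembler patterns above look for the matching `{rn-sae}`, `{rd-sae}`, `{ru-sae}` and `{rz-sae}` operands. The sketch below mirrors the values of the `_MM_FROUND_*` macros under hypothetical `RC_*` names so that it builds without `<immintrin.h>`; `rounding_suffix` and `has_embedded_rounding` are illustrative helpers, not part of GCC, and only the combinations exercised in this patch are covered.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names; the values mirror the _MM_FROUND_* macros. */
enum {
    RC_TO_NEAREST_INT = 0x00,
    RC_TO_NEG_INF     = 0x01,
    RC_TO_POS_INF     = 0x02,
    RC_TO_ZERO        = 0x03,
    RC_CUR_DIRECTION  = 0x04,
    RC_NO_EXC         = 0x08
};

/* Map a rounding immediate to the assembler operand the tests scan for.
   Returns "" for the current-direction form (no embedded rounding) and
   NULL for combinations this sketch does not cover. */
static const char *rounding_suffix(int imm)
{
    if (imm == RC_CUR_DIRECTION)
        return "";
    if ((imm & ~0x03) != RC_NO_EXC)
        return NULL;
    switch (imm & 0x03) {
    case RC_TO_NEAREST_INT: return "{rn-sae}";
    case RC_TO_NEG_INF:     return "{rd-sae}";
    case RC_TO_POS_INF:     return "{ru-sae}";
    default:                return "{rz-sae}";
    }
}

/* Nonzero when a supported immediate selects an explicit rounding mode. */
static int has_embedded_rounding(int imm)
{
    const char *s = rounding_suffix(imm);
    assert(s != NULL);  /* callers must pass a covered immediate */
    return strcmp(s, "") != 0;
}
```

Under this mapping, the literal 8 that the avx-1.c, sse-13.c and sse-23.c macros substitute for the last builtin argument is round-to-nearest with exceptions suppressed.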



* Re: [patch][i386, AVX] Adding missing mask[z]_scalef_round_s[d,s] intrinsics
From: Kirill Yukhin @ 2018-02-12  6:10 UTC (permalink / raw)
  To: Makhotina, Olga; +Cc: gcc-patches, Uros Bizjak, Peryt, Sebastian

Hello Olga,

On 14 Nov 09:56, Makhotina, Olga wrote:
> Is it ok for trunk?
Your patch is OK for trunk. I've checked it in.
(I removed trailing spaces from the ChangeLog entries and wrapped them at 80 characters.)

--
Thanks, K



