public inbox for gcc-patches@gcc.gnu.org
* [PATCH] Remove redundant builtins for avx512f scalar instructions.
@ 2019-12-24 14:24 Hongyu Wang
  2020-01-14 20:52 ` Jeff Law
  2020-11-13  5:42 ` Jeff Law
  0 siblings, 2 replies; 8+ messages in thread
From: Hongyu Wang @ 2019-12-24 14:24 UTC
  To: jakub, gcc-patches; +Cc: crazylht, hjl.tools

[-- Attachment #1: Type: text/plain, Size: 2864 bytes --]

Hi:
  For AVX512F scalar instructions, the existing builtins of the form
__builtin_ia32_*{sd,ss}_round can be replaced by
__builtin_ia32_*{sd,ss}_mask_round with the mask parameter set to -1.
This patch makes that replacement and removes the now-redundant
builtins.
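
  As a rough illustration of why the unmasked builtins are redundant,
here is a small portable model in plain C. It is only a sketch and is
not part of the patch: vec2d, model_addsd, model_addsd_mask and
all_ones_mask_matches_unmasked are hypothetical names, the rounding
argument is left out, and nothing here calls the real builtins. It
assumes the usual masked-scalar behaviour: lane 0 takes the computed
result when mask bit 0 is set and the pass-through operand otherwise,
and lane 1 is copied from the first source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-lane stand-in for __m128d; not a GCC type. */
typedef struct { double lane[2]; } vec2d;

/* Model of an unmasked scalar add: lane 0 = a + b, lane 1 copied from a. */
static vec2d model_addsd(vec2d a, vec2d b)
{
    vec2d r;
    r.lane[0] = a.lane[0] + b.lane[0];
    r.lane[1] = a.lane[1];
    return r;
}

/* Model of a merge-masked scalar add: lane 0 is a + b when mask bit 0
   is set, otherwise it is taken from src; lane 1 is copied from a. */
static vec2d model_addsd_mask(vec2d a, vec2d b, vec2d src, uint8_t mask)
{
    vec2d r;
    r.lane[0] = (mask & 1) ? a.lane[0] + b.lane[0] : src.lane[0];
    r.lane[1] = a.lane[1];
    return r;
}

/* Returns 1 when the masked model with an all-ones mask agrees with the
   unmasked model, whatever the pass-through operand contains. */
static int all_ones_mask_matches_unmasked(void)
{
    vec2d a = { { 1.5, 7.0 } };
    vec2d b = { { 2.25, 9.0 } };
    vec2d junk = { { -100.0, -200.0 } };   /* plays the "undefined" operand */

    vec2d plain = model_addsd(a, b);
    vec2d masked = model_addsd_mask(a, b, junk, (uint8_t) -1);

    assert(masked.lane[0] == plain.lane[0]);
    assert(masked.lane[1] == plain.lane[1]);
    return masked.lane[0] == plain.lane[0] && masked.lane[1] == plain.lane[1];
}
```

  Under this model the pass-through operand is never read when bit 0 of
the mask is set, which is why an undefined value can be passed there.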

  Bootstrapped OK; make check passes for the i386 target.
  OK for trunk?

Changelog

gcc/
        * config/i386/avx512fintrin.h
        (_mm_add_round_sd, _mm_add_round_ss): Use
        __builtin_ia32_adds?_mask_round builtins instead of
        __builtin_ia32_adds?_round.
        (_mm_sub_round_sd, _mm_sub_round_ss,
        _mm_mul_round_sd, _mm_mul_round_ss,
        _mm_div_round_sd, _mm_div_round_ss,
        _mm_getexp_sd, _mm_getexp_ss,
        _mm_getexp_round_sd, _mm_getexp_round_ss,
        _mm_getmant_sd, _mm_getmant_ss,
        _mm_getmant_round_sd, _mm_getmant_round_ss,
        _mm_max_round_sd, _mm_max_round_ss,
        _mm_min_round_sd, _mm_min_round_ss,
        _mm_fmadd_round_sd, _mm_fmadd_round_ss,
        _mm_fmsub_round_sd, _mm_fmsub_round_ss,
        _mm_fnmadd_round_sd, _mm_fnmadd_round_ss,
        _mm_fnmsub_round_sd, _mm_fnmsub_round_ss): Likewise.
        * config/i386/i386-builtin.def
        (__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
        __builtin_ia32_subsd_round, __builtin_ia32_subss_round,
        __builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
        __builtin_ia32_divsd_round, __builtin_ia32_divss_round,
        __builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
        __builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
        __builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
        __builtin_ia32_minsd_round, __builtin_ia32_minss_round,
        __builtin_ia32_vfmaddsd3_round,
        __builtin_ia32_vfmaddss3_round): Remove.
        * config/i386/i386-expand.c
        (ix86_expand_round_builtin): Remove corresponding case.

gcc/testsuite/
        * lib/target-supports.exp
        (check_effective_target_avx512f): Use
        __builtin_ia32_getmantsd_mask_round builtins instead of
        __builtin_ia32_getmantsd_round.
        * gcc.target/i386/avx-1.c
        (__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
        __builtin_ia32_subsd_round, __builtin_ia32_subss_round,
        __builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
        __builtin_ia32_divsd_round, __builtin_ia32_divss_round,
        __builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
        __builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
        __builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
        __builtin_ia32_minsd_round, __builtin_ia32_minss_round,
        __builtin_ia32_vfmaddsd3_round,
        __builtin_ia32_vfmaddss3_round): Remove.
        * gcc.target/i386/sse-13.c: Ditto.
        * gcc.target/i386/sse-23.c: Ditto.


Regards,
Hongyu Wang

[-- Attachment #2: 0001-Remove-redundant-round-builtins-for-avx512f-scalar-i.patch --]
[-- Type: text/x-patch, Size: 71267 bytes --]

From 9cc4928aad5770c53ff580f5c996092cdaf2f9ba Mon Sep 17 00:00:00 2001
From: hongyuw1 <hongyuw1@gitlab.devtools.intel.com>
Date: Wed, 18 Dec 2019 14:52:54 +0000
Subject: [PATCH] Remove redundant round builtins for avx512f scalar
 instructions

Changelog

gcc/
	* config/i386/avx512fintrin.h
	(_mm_add_round_sd, _mm_add_round_ss): Use
	__builtin_ia32_adds?_mask_round builtins instead of
	__builtin_ia32_adds?_round.
	(_mm_sub_round_sd, _mm_sub_round_ss,
	_mm_mul_round_sd, _mm_mul_round_ss,
	_mm_div_round_sd, _mm_div_round_ss,
	_mm_getexp_sd, _mm_getexp_ss,
	_mm_getexp_round_sd, _mm_getexp_round_ss,
	_mm_getmant_sd, _mm_getmant_ss,
	_mm_getmant_round_sd, _mm_getmant_round_ss,
	_mm_max_round_sd, _mm_max_round_ss,
	_mm_min_round_sd, _mm_min_round_ss,
	_mm_fmadd_round_sd, _mm_fmadd_round_ss,
	_mm_fmsub_round_sd, _mm_fmsub_round_ss,
	_mm_fnmadd_round_sd, _mm_fnmadd_round_ss,
	_mm_fnmsub_round_sd, _mm_fnmsub_round_ss): Likewise.
	* config/i386/i386-builtin.def
	(__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
	__builtin_ia32_subsd_round, __builtin_ia32_subss_round,
	__builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
	__builtin_ia32_divsd_round, __builtin_ia32_divss_round,
	__builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
	__builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
	__builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
	__builtin_ia32_minsd_round, __builtin_ia32_minss_round,
	__builtin_ia32_vfmaddsd3_round,
	__builtin_ia32_vfmaddss3_round): Remove.
	* config/i386/i386-expand.c
	(ix86_expand_round_builtin): Remove corresponding case.

gcc/testsuite/
	* lib/target-supports.exp
	(check_effective_target_avx512f): Use
	__builtin_ia32_getmantsd_mask_round builtins instead of
	__builtin_ia32_getmantsd_round.
	* gcc.target/i386/avx-1.c
	(__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
	__builtin_ia32_subsd_round, __builtin_ia32_subss_round,
	__builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
	__builtin_ia32_divsd_round, __builtin_ia32_divss_round,
	__builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
	__builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
	__builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
	__builtin_ia32_minsd_round, __builtin_ia32_minss_round,
	__builtin_ia32_vfmaddsd3_round,
	__builtin_ia32_vfmaddss3_round): Remove.
	* gcc.target/i386/sse-13.c: Ditto.
	* gcc.target/i386/sse-23.c: Ditto.
---
 gcc/config/i386/avx512fintrin.h        | 584 +++++++++++++++++--------
 gcc/config/i386/i386-builtin.def       |  18 -
 gcc/config/i386/i386-expand.c          |   7 -
 gcc/testsuite/gcc.target/i386/avx-1.c  |  18 -
 gcc/testsuite/gcc.target/i386/sse-13.c |  18 -
 gcc/testsuite/gcc.target/i386/sse-23.c |  16 -
 gcc/testsuite/lib/target-supports.exp  |   2 +-
 7 files changed, 404 insertions(+), 259 deletions(-)

diff --git a/gcc/config/i386/avx512fintrin.h b/gcc/config/i386/avx512fintrin.h
index 1d08f01a841..cdb4c948496 100644
--- a/gcc/config/i386/avx512fintrin.h
+++ b/gcc/config/i386/avx512fintrin.h
@@ -1481,9 +1481,12 @@ extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_add_round_sd (__m128d __A, __m128d __B, const int __R)
 {
-  return (__m128d) __builtin_ia32_addsd_round ((__v2df) __A,
-					       (__v2df) __B,
-					       __R);
+  return (__m128d) __builtin_ia32_addsd_mask_round ((__v2df) __A,
+						    (__v2df) __B,
+						    (__v2df)
+						    _mm_undefined_pd (),
+						    (__mmask8) -1,
+						    __R);
 }
 
 extern __inline __m128d
@@ -1513,9 +1516,12 @@ extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_add_round_ss (__m128 __A, __m128 __B, const int __R)
 {
-  return (__m128) __builtin_ia32_addss_round ((__v4sf) __A,
-					      (__v4sf) __B,
-					      __R);
+  return (__m128) __builtin_ia32_addss_mask_round ((__v4sf) __A,
+						   (__v4sf) __B,
+						   (__v4sf)
+						   _mm_undefined_ps (),
+						   (__mmask8) -1,
+						   __R);
 }
 
 extern __inline __m128
@@ -1545,9 +1551,12 @@ extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_sub_round_sd (__m128d __A, __m128d __B, const int __R)
 {
-  return (__m128d) __builtin_ia32_subsd_round ((__v2df) __A,
-					       (__v2df) __B,
-					       __R);
+  return (__m128d) __builtin_ia32_subsd_mask_round ((__v2df) __A,
+						    (__v2df) __B,
+						    (__v2df)
+						    _mm_undefined_pd (),
+						    (__mmask8) -1,
+						    __R);
 }
 
 extern __inline __m128d
@@ -1577,9 +1586,12 @@ extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_sub_round_ss (__m128 __A, __m128 __B, const int __R)
 {
-  return (__m128) __builtin_ia32_subss_round ((__v4sf) __A,
-					      (__v4sf) __B,
-					      __R);
+  return (__m128) __builtin_ia32_subss_mask_round ((__v4sf) __A,
+						   (__v4sf) __B,
+						   (__v4sf)
+						   _mm_undefined_ps (),
+						   (__mmask8) -1,
+						   __R);
 }
 
 extern __inline __m128
@@ -1606,8 +1618,14 @@ _mm_maskz_sub_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
 }
 
 #else
-#define _mm_add_round_sd(A, B, C)            \
-    (__m128d)__builtin_ia32_addsd_round(A, B, C)
+#define _mm_add_round_sd(A, B, C)				\
+  ((__m128d)							\
+   __builtin_ia32_addsd_mask_round ((__v2df) (__m128d) (A),	\
+				    (__v2df) (__m128d) (B),	\
+				    (__v2df) (__m128d)		\
+				    _mm_undefined_pd (),	\
+				    (__mmask8) (-1),		\
+				    (int) (C)))
 
 #define _mm_mask_add_round_sd(W, U, A, B, C) \
     (__m128d)__builtin_ia32_addsd_mask_round(A, B, W, U, C)
@@ -1615,8 +1633,14 @@ _mm_maskz_sub_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
 #define _mm_maskz_add_round_sd(U, A, B, C)   \
     (__m128d)__builtin_ia32_addsd_mask_round(A, B, (__v2df)_mm_setzero_pd(), U, C)
 
-#define _mm_add_round_ss(A, B, C)            \
-    (__m128)__builtin_ia32_addss_round(A, B, C)
+#define _mm_add_round_ss(A, B, C)				\
+  ((__m128)							\
+   __builtin_ia32_addss_mask_round ((__v4sf) (__m128) (A),	\
+				    (__v4sf) (__m128) (B),	\
+				    (__v4sf) (__m128)		\
+				    _mm_undefined_ps (),	\
+				    (__mmask8) (-1),		\
+				    (int) (C)))
 
 #define _mm_mask_add_round_ss(W, U, A, B, C) \
     (__m128)__builtin_ia32_addss_mask_round(A, B, W, U, C)
@@ -1624,8 +1648,14 @@ _mm_maskz_sub_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
 #define _mm_maskz_add_round_ss(U, A, B, C)   \
     (__m128)__builtin_ia32_addss_mask_round(A, B, (__v4sf)_mm_setzero_ps(), U, C)
 
-#define _mm_sub_round_sd(A, B, C)            \
-    (__m128d)__builtin_ia32_subsd_round(A, B, C)
+#define _mm_sub_round_sd(A, B, C)				\
+  ((__m128d)							\
+   __builtin_ia32_subsd_mask_round ((__v2df) (__m128d) (A),	\
+				    (__v2df) (__m128d) (B),	\
+				    (__v2df) (__m128d)		\
+				    _mm_undefined_pd (),	\
+				    (__mmask8) (-1),		\
+				    (int) (C)))
 
 #define _mm_mask_sub_round_sd(W, U, A, B, C) \
     (__m128d)__builtin_ia32_subsd_mask_round(A, B, W, U, C)
@@ -1633,8 +1663,14 @@ _mm_maskz_sub_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
 #define _mm_maskz_sub_round_sd(U, A, B, C)   \
     (__m128d)__builtin_ia32_subsd_mask_round(A, B, (__v2df)_mm_setzero_pd(), U, C)
 
-#define _mm_sub_round_ss(A, B, C)            \
-    (__m128)__builtin_ia32_subss_round(A, B, C)
+#define _mm_sub_round_ss(A, B, C)				\
+  ((__m128)							\
+   __builtin_ia32_subss_mask_round ((__v4sf) (__m128) (A),	\
+				    (__v4sf) (__m128) (B),	\
+				    (__v4sf) (__m128)		\
+				    _mm_undefined_ps (),	\
+				    (__mmask8) (-1),		\
+				    (int) (C)))
 
 #define _mm_mask_sub_round_ss(W, U, A, B, C) \
     (__m128)__builtin_ia32_subss_mask_round(A, B, W, U, C)
@@ -2730,9 +2766,12 @@ extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_mul_round_sd (__m128d __A, __m128d __B, const int __R)
 {
-  return (__m128d) __builtin_ia32_mulsd_round ((__v2df) __A,
-					       (__v2df) __B,
-					       __R);
+  return (__m128d) __builtin_ia32_mulsd_mask_round ((__v2df) __A,
+						    (__v2df) __B,
+						    (__v2df)
+						    _mm_undefined_pd (),
+						    (__mmask8) -1,
+						    __R);
 }
 
 extern __inline __m128d
@@ -2762,9 +2801,12 @@ extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_mul_round_ss (__m128 __A, __m128 __B, const int __R)
 {
-  return (__m128) __builtin_ia32_mulss_round ((__v4sf) __A,
-					      (__v4sf) __B,
-					      __R);
+  return (__m128) __builtin_ia32_mulss_mask_round ((__v4sf) __A,
+						   (__v4sf) __B,
+						   (__v4sf)
+						   _mm_undefined_ps (),
+						   (__mmask8) -1,
+						   __R);
 }
 
 extern __inline __m128
@@ -2794,9 +2836,12 @@ extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_div_round_sd (__m128d __A, __m128d __B, const int __R)
 {
-  return (__m128d) __builtin_ia32_divsd_round ((__v2df) __A,
-					       (__v2df) __B,
-					       __R);
+  return (__m128d) __builtin_ia32_divsd_mask_round ((__v2df) __A,
+						    (__v2df) __B,
+						    (__v2df)
+						    _mm_undefined_pd (),
+						    (__mmask8) -1,
+						    __R);
 }
 
 extern __inline __m128d
@@ -2826,9 +2871,12 @@ extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_div_round_ss (__m128 __A, __m128 __B, const int __R)
 {
-  return (__m128) __builtin_ia32_divss_round ((__v4sf) __A,
-					      (__v4sf) __B,
-					      __R);
+  return (__m128) __builtin_ia32_divss_mask_round ((__v4sf) __A,
+						   (__v4sf) __B,
+						   (__v4sf)
+						   _mm_undefined_ps (),
+						   (__mmask8) -1,
+						   __R);
 }
 
 extern __inline __m128
@@ -2891,8 +2939,14 @@ _mm_maskz_div_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
 #define _mm512_maskz_div_round_ps(U, A, B, C)   \
     (__m512)__builtin_ia32_divps512_mask(A, B, (__v16sf)_mm512_setzero_ps(), U, C)
 
-#define _mm_mul_round_sd(A, B, C)            \
-    (__m128d)__builtin_ia32_mulsd_round(A, B, C)
+#define _mm_mul_round_sd(A, B, C)				\
+  ((__m128d)							\
+   __builtin_ia32_mulsd_mask_round ((__v2df) (__m128d) (A),	\
+				    (__v2df) (__m128d) (B),	\
+				    (__v2df) (__m128d)		\
+				    _mm_undefined_pd (),	\
+				    (__mmask8) (-1),		\
+				    (int) (C)))
 
 #define _mm_mask_mul_round_sd(W, U, A, B, C) \
     (__m128d)__builtin_ia32_mulsd_mask_round(A, B, W, U, C)
@@ -2900,8 +2954,14 @@ _mm_maskz_div_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
 #define _mm_maskz_mul_round_sd(U, A, B, C)   \
     (__m128d)__builtin_ia32_mulsd_mask_round(A, B, (__v2df)_mm_setzero_pd(), U, C)
 
-#define _mm_mul_round_ss(A, B, C)            \
-    (__m128)__builtin_ia32_mulss_round(A, B, C)
+#define _mm_mul_round_ss(A, B, C)				\
+  ((__m128)							\
+   __builtin_ia32_mulss_mask_round ((__v4sf) (__m128) (A),	\
+				    (__v4sf) (__m128) (B),	\
+				    (__v4sf) (__m128)		\
+				    _mm_undefined_ps (),	\
+				    (__mmask8) -1,		\
+				    (int) (C)))
 
 #define _mm_mask_mul_round_ss(W, U, A, B, C) \
     (__m128)__builtin_ia32_mulss_mask_round(A, B, W, U, C)
@@ -2909,8 +2969,14 @@ _mm_maskz_div_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
 #define _mm_maskz_mul_round_ss(U, A, B, C)   \
     (__m128)__builtin_ia32_mulss_mask_round(A, B, (__v4sf)_mm_setzero_ps(), U, C)
 
-#define _mm_div_round_sd(A, B, C)            \
-    (__m128d)__builtin_ia32_divsd_round(A, B, C)
+#define _mm_div_round_sd(A, B, C)				\
+  ((__m128d)							\
+   __builtin_ia32_divsd_mask_round ((__v2df) (__m128d) (A),	\
+				    (__v2df) (__m128d) (B),	\
+				    (__v2df) (__m128d)		\
+				    _mm_undefined_pd (),	\
+				    (__mmask8) (-1),		\
+				    (int) (C)))
 
 #define _mm_mask_div_round_sd(W, U, A, B, C) \
     (__m128d)__builtin_ia32_divsd_mask_round(A, B, W, U, C)
@@ -2918,8 +2984,14 @@ _mm_maskz_div_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
 #define _mm_maskz_div_round_sd(U, A, B, C)   \
     (__m128d)__builtin_ia32_divsd_mask_round(A, B, (__v2df)_mm_setzero_pd(), U, C)
 
-#define _mm_div_round_ss(A, B, C)            \
-    (__m128)__builtin_ia32_divss_round(A, B, C)
+#define _mm_div_round_ss(A, B, C)				\
+  ((__m128)							\
+   __builtin_ia32_divss_mask_round ((__v4sf) (__m128) (A),	\
+				    (__v4sf) (__m128) (B),	\
+				    (__v4sf) (__m128)		\
+				    _mm_undefined_ps (),	\
+				    (__mmask8) -1,		\
+				    (int) (C)))
 
 #define _mm_mask_div_round_ss(W, U, A, B, C) \
     (__m128)__builtin_ia32_divss_mask_round(A, B, W, U, C)
@@ -8703,9 +8775,12 @@ extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_getexp_round_ss (__m128 __A, __m128 __B, const int __R)
 {
-  return (__m128) __builtin_ia32_getexpss128_round ((__v4sf) __A,
-						    (__v4sf) __B,
-						    __R);
+  return (__m128) __builtin_ia32_getexpss_mask_round ((__v4sf) __A,
+						      (__v4sf) __B,
+						      (__v4sf)
+						      _mm_undefined_ps (),
+						      (__mmask8) -1,
+						      __R);
 }
 
 extern __inline __m128
@@ -8735,9 +8810,12 @@ extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_getexp_round_sd (__m128d __A, __m128d __B, const int __R)
 {
-  return (__m128d) __builtin_ia32_getexpsd128_round ((__v2df) __A,
-						     (__v2df) __B,
-						     __R);
+  return (__m128d) __builtin_ia32_getexpsd_mask_round ((__v2df) __A,
+						       (__v2df) __B,
+						       (__v2df)
+						       _mm_undefined_pd (),
+						       (__mmask8) -1,
+						       __R);
 }
 
 extern __inline __m128d
@@ -8901,10 +8979,13 @@ _mm_getmant_round_sd (__m128d __A, __m128d __B,
 		      _MM_MANTISSA_NORM_ENUM __C,
 		      _MM_MANTISSA_SIGN_ENUM __D, const int __R)
 {
-  return (__m128d) __builtin_ia32_getmantsd_round ((__v2df) __A,
-						  (__v2df) __B,
-						  (__D << 2) | __C,
-						   __R);
+  return (__m128d) __builtin_ia32_getmantsd_mask_round ((__v2df) __A,
+							(__v2df) __B,
+							(__D << 2) | __C,
+							(__v2df)
+							_mm_undefined_pd (),
+							(__mmask8) -1,
+							__R);
 }
 
 extern __inline __m128d
@@ -8940,10 +9021,13 @@ _mm_getmant_round_ss (__m128 __A, __m128 __B,
 		      _MM_MANTISSA_NORM_ENUM __C,
 		      _MM_MANTISSA_SIGN_ENUM __D, const int __R)
 {
-  return (__m128) __builtin_ia32_getmantss_round ((__v4sf) __A,
-						  (__v4sf) __B,
-						  (__D << 2) | __C,
-						  __R);
+  return (__m128) __builtin_ia32_getmantss_mask_round ((__v4sf) __A,
+						       (__v4sf) __B,
+						       (__D << 2) | __C,
+						       (__v4sf)
+						       _mm_undefined_ps (),
+						       (__mmask8) -1,
+						       __R);
 }
 
 extern __inline __m128
@@ -9014,11 +9098,15 @@ _mm_maskz_getmant_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
                                              (__v16sf)(__m512)_mm512_setzero_ps(),  \
                                              (__mmask16)(U),\
 					     (R)))
-#define _mm_getmant_round_sd(X, Y, C, D, R)                                                  \
-  ((__m128d)__builtin_ia32_getmantsd_round ((__v2df)(__m128d)(X),                    \
-					    (__v2df)(__m128d)(Y),	\
-					    (int)(((D)<<2) | (C)),	\
-					    (R)))
+#define _mm_getmant_round_sd(X, Y, C, D, R)			\
+  ((__m128d)							\
+   __builtin_ia32_getmantsd_mask_round ((__v2df) (__m128d) (X),\
+					(__v2df) (__m128d) (Y),	\
+					(int) (((D)<<2) | (C)),	\
+					(__v2df) (__m128d)	\
+					_mm_undefined_pd (),	\
+					(__mmask8) (-1),	\
+					(int) (R)))
 
 #define _mm_mask_getmant_round_sd(W, U, X, Y, C, D, R)                                       \
   ((__m128d)__builtin_ia32_getmantsd_mask_round ((__v2df)(__m128d)(X),                  \
@@ -9036,11 +9124,15 @@ _mm_maskz_getmant_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
                                              (__mmask8)(U),\
 					     (R)))
 
-#define _mm_getmant_round_ss(X, Y, C, D, R)                                                  \
-  ((__m128)__builtin_ia32_getmantss_round ((__v4sf)(__m128)(X),                      \
-					   (__v4sf)(__m128)(Y),		\
-					   (int)(((D)<<2) | (C)),	\
-					   (R)))
+#define _mm_getmant_round_ss(X, Y, C, D, R)			\
+  ((__m128)							\
+   __builtin_ia32_getmantss_mask_round ((__v4sf) (__m128) (X),	\
+					(__v4sf) (__m128) (Y),	\
+					(int) (((D)<<2) | (C)),	\
+					(__v4sf) (__m128)	\
+					_mm_undefined_ps (),	\
+					(__mmask8) (-1),	\
+					(int) (R)))
 
 #define _mm_mask_getmant_round_ss(W, U, X, Y, C, D, R)                                       \
   ((__m128)__builtin_ia32_getmantss_mask_round ((__v4sf)(__m128)(X),                  \
@@ -9058,8 +9150,14 @@ _mm_maskz_getmant_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
                                              (__mmask8)(U),\
 					     (R)))
 
-#define _mm_getexp_round_ss(A, B, R)						      \
-  ((__m128)__builtin_ia32_getexpss128_round((__v4sf)(__m128)(A), (__v4sf)(__m128)(B), R))
+#define _mm_getexp_round_ss(A, B, R)				\
+  ((__m128)							\
+   __builtin_ia32_getexpss_mask_round ((__v4sf) (__m128) (A),	\
+				       (__v4sf) (__m128) (B),	\
+				       (__v4sf) (__m128)	\
+				       _mm_undefined_ps (),	\
+				       (__mmask8) (-1),		\
+				       (int) (R)))
 
 #define _mm_mask_getexp_round_ss(W, U, A, B, C) \
     (__m128)__builtin_ia32_getexpss_mask_round(A, B, W, U, C)
@@ -9067,8 +9165,14 @@ _mm_maskz_getmant_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
 #define _mm_maskz_getexp_round_ss(U, A, B, C)   \
     (__m128)__builtin_ia32_getexpss_mask_round(A, B, (__v4sf)_mm_setzero_ps(), U, C)
 
-#define _mm_getexp_round_sd(A, B, R)						       \
-  ((__m128d)__builtin_ia32_getexpsd128_round((__v2df)(__m128d)(A), (__v2df)(__m128d)(B), R))
+#define _mm_getexp_round_sd(A, B, R)				\
+  ((__m128d)							\
+   __builtin_ia32_getexpsd_mask_round ((__v2df) (__m128d) (A),	\
+				       (__v2df) (__m128d) (B),	\
+				       (__v2df) (__m128d)	\
+				       _mm_undefined_pd (),	\
+				       (__mmask8) (-1),		\
+				       (int) (R)))
 
 #define _mm_mask_getexp_round_sd(W, U, A, B, C) \
     (__m128d)__builtin_ia32_getexpsd_mask_round(A, B, W, U, C)
@@ -11392,9 +11496,12 @@ extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_max_round_sd (__m128d __A, __m128d __B, const int __R)
 {
-  return (__m128d) __builtin_ia32_maxsd_round ((__v2df) __A,
-					       (__v2df) __B,
-					       __R);
+  return (__m128d) __builtin_ia32_maxsd_mask_round ((__v2df) __A,
+						    (__v2df) __B,
+						    (__v2df)
+						    _mm_undefined_pd (),
+						    (__mmask8) -1,
+						    __R);
 }
 
 extern __inline __m128d
@@ -11424,9 +11531,12 @@ extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_max_round_ss (__m128 __A, __m128 __B, const int __R)
 {
-  return (__m128) __builtin_ia32_maxss_round ((__v4sf) __A,
-					      (__v4sf) __B,
-					      __R);
+  return (__m128) __builtin_ia32_maxss_mask_round ((__v4sf) __A,
+						   (__v4sf) __B,
+						   (__v4sf)
+						   _mm_undefined_ps (),
+						   (__mmask8) -1,
+						   __R);
 }
 
 extern __inline __m128
@@ -11456,9 +11566,12 @@ extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_min_round_sd (__m128d __A, __m128d __B, const int __R)
 {
-  return (__m128d) __builtin_ia32_minsd_round ((__v2df) __A,
-					       (__v2df) __B,
-					       __R);
+  return (__m128d) __builtin_ia32_minsd_mask_round ((__v2df) __A,
+						    (__v2df) __B,
+						    (__v2df)
+						    _mm_undefined_pd (),
+						    (__mmask8) -1,
+						    __R);
 }
 
 extern __inline __m128d
@@ -11488,9 +11601,12 @@ extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_min_round_ss (__m128 __A, __m128 __B, const int __R)
 {
-  return (__m128) __builtin_ia32_minss_round ((__v4sf) __A,
-					      (__v4sf) __B,
-					      __R);
+  return (__m128) __builtin_ia32_minss_mask_round ((__v4sf) __A,
+						   (__v4sf) __B,
+						   (__v4sf)
+						   _mm_undefined_ps (),
+						   (__mmask8) -1,
+						   __R);
 }
 
 extern __inline __m128
@@ -11517,8 +11633,14 @@ _mm_maskz_min_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
 }
 
 #else
-#define _mm_max_round_sd(A, B, C)            \
-    (__m128d)__builtin_ia32_maxsd_round(A, B, C)
+#define _mm_max_round_sd(A, B, C)				\
+  ((__m128d)							\
+   __builtin_ia32_maxsd_mask_round((__v2df) (__m128d) (A),	\
+				   (__v2df) (__m128d) (B),	\
+				   (__v2df) (__m128d)		\
+				   _mm_undefined_pd (),		\
+				   (__mmask8) (-1),		\
+				   (int) (C)))
 
 #define _mm_mask_max_round_sd(W, U, A, B, C) \
     (__m128d)__builtin_ia32_maxsd_mask_round(A, B, W, U, C)
@@ -11526,8 +11648,14 @@ _mm_maskz_min_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
 #define _mm_maskz_max_round_sd(U, A, B, C)   \
     (__m128d)__builtin_ia32_maxsd_mask_round(A, B, (__v2df)_mm_setzero_pd(), U, C)
 
-#define _mm_max_round_ss(A, B, C)            \
-    (__m128)__builtin_ia32_maxss_round(A, B, C)
+#define _mm_max_round_ss(A, B, C)				\
+  ((__m128)							\
+   __builtin_ia32_maxss_mask_round ((__v4sf) (__m128) (A),	\
+				    (__v4sf) (__m128) (B),	\
+				    (__v4sf) (__m128)		\
+				    _mm_undefined_ps (),	\
+				    (__mmask8) (-1),		\
+				    (int)(C)))
 
 #define _mm_mask_max_round_ss(W, U, A, B, C) \
     (__m128)__builtin_ia32_maxss_mask_round(A, B, W, U, C)
@@ -11535,8 +11663,14 @@ _mm_maskz_min_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
 #define _mm_maskz_max_round_ss(U, A, B, C)   \
     (__m128)__builtin_ia32_maxss_mask_round(A, B, (__v4sf)_mm_setzero_ps(), U, C)
 
-#define _mm_min_round_sd(A, B, C)            \
-    (__m128d)__builtin_ia32_minsd_round(A, B, C)
+#define _mm_min_round_sd(A, B, C)				\
+  ((__m128d)							\
+   __builtin_ia32_minsd_mask_round ((__v2df) (__m128d) (A),	\
+				    (__v2df) (__m128d) (B),	\
+				    (__v2df) (__m128d)		\
+				    _mm_undefined_pd (),	\
+				    (__mmask8) (-1),		\
+				    (int) (C)))
 
 #define _mm_mask_min_round_sd(W, U, A, B, C) \
     (__m128d)__builtin_ia32_minsd_mask_round(A, B, W, U, C)
@@ -11544,8 +11678,14 @@ _mm_maskz_min_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
 #define _mm_maskz_min_round_sd(U, A, B, C)   \
     (__m128d)__builtin_ia32_minsd_mask_round(A, B, (__v2df)_mm_setzero_pd(), U, C)
 
-#define _mm_min_round_ss(A, B, C)            \
-    (__m128)__builtin_ia32_minss_round(A, B, C)
+#define _mm_min_round_ss(A, B, C)				\
+  ((__m128)							\
+   __builtin_ia32_minss_mask_round ((__v4sf) (__m128) (A),	\
+				    (__v4sf) (__m128) (B),	\
+				    (__v4sf) (__m128)		\
+				    _mm_undefined_ps (),	\
+				    (__mmask8) (-1),		\
+				    (int) (C)))
 
 #define _mm_mask_min_round_ss(W, U, A, B, C) \
     (__m128)__builtin_ia32_minss_mask_round(A, B, W, U, C)
@@ -11596,105 +11736,153 @@ extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_fmadd_round_sd (__m128d __W, __m128d __A, __m128d __B, const int __R)
 {
-  return (__m128d) __builtin_ia32_vfmaddsd3_round ((__v2df) __W,
-						   (__v2df) __A,
-						   (__v2df) __B,
-						   __R);
+  return (__m128d) __builtin_ia32_vfmaddsd3_mask ((__v2df) __W,
+						  (__v2df) __A,
+						  (__v2df) __B,
+						  (__mmask8) -1,
+						  __R);
 }
 
 extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_fmadd_round_ss (__m128 __W, __m128 __A, __m128 __B, const int __R)
 {
-  return (__m128) __builtin_ia32_vfmaddss3_round ((__v4sf) __W,
-						  (__v4sf) __A,
-						  (__v4sf) __B,
-						  __R);
+  return (__m128) __builtin_ia32_vfmaddss3_mask ((__v4sf) __W,
+						 (__v4sf) __A,
+						 (__v4sf) __B,
+						 (__mmask8) -1,
+						 __R);
 }
 
 extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_fmsub_round_sd (__m128d __W, __m128d __A, __m128d __B, const int __R)
 {
-  return (__m128d) __builtin_ia32_vfmaddsd3_round ((__v2df) __W,
-						   (__v2df) __A,
-						   -(__v2df) __B,
-						   __R);
-}
+  return (__m128d) __builtin_ia32_vfmaddsd3_mask ((__v2df) __W,
+						  (__v2df) __A,
+						  -(__v2df) __B,
+						  (__mmask8) -1,
+						  __R);
+ }
 
 extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_fmsub_round_ss (__m128 __W, __m128 __A, __m128 __B, const int __R)
 {
-  return (__m128) __builtin_ia32_vfmaddss3_round ((__v4sf) __W,
-						  (__v4sf) __A,
-						  -(__v4sf) __B,
-						  __R);
+  return (__m128) __builtin_ia32_vfmaddss3_mask ((__v4sf) __W,
+						 (__v4sf) __A,
+						 -(__v4sf) __B,
+						 (__mmask8) -1,
+						 __R);
 }
 
 extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_fnmadd_round_sd (__m128d __W, __m128d __A, __m128d __B, const int __R)
 {
-  return (__m128d) __builtin_ia32_vfmaddsd3_round ((__v2df) __W,
-						   -(__v2df) __A,
-						   (__v2df) __B,
-						   __R);
+  return (__m128d) __builtin_ia32_vfmaddsd3_mask ((__v2df) __W,
+						  -(__v2df) __A,
+						  (__v2df) __B,
+						  (__mmask8) -1,
+						  __R);
 }
 
 extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_fnmadd_round_ss (__m128 __W, __m128 __A, __m128 __B, const int __R)
 {
-  return (__m128) __builtin_ia32_vfmaddss3_round ((__v4sf) __W,
-						  -(__v4sf) __A,
-						  (__v4sf) __B,
-						  __R);
-}
+  return (__m128) __builtin_ia32_vfmaddss3_mask ((__v4sf) __W,
+						 -(__v4sf) __A,
+						 (__v4sf) __B,
+						 (__mmask8) -1,
+						 __R);
+ }
 
 extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_fnmsub_round_sd (__m128d __W, __m128d __A, __m128d __B, const int __R)
 {
-  return (__m128d) __builtin_ia32_vfmaddsd3_round ((__v2df) __W,
-						   -(__v2df) __A,
-						   -(__v2df) __B,
-						   __R);
-}
+  return (__m128d) __builtin_ia32_vfmaddsd3_mask ((__v2df) __W,
+						  -(__v2df) __A,
+						  -(__v2df) __B,
+						  (__mmask8) -1,
+						  __R);
+ }
 
 extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_fnmsub_round_ss (__m128 __W, __m128 __A, __m128 __B, const int __R)
 {
-  return (__m128) __builtin_ia32_vfmaddss3_round ((__v4sf) __W,
-						  -(__v4sf) __A,
-						  -(__v4sf) __B,
-						  __R);
+  return (__m128) __builtin_ia32_vfmaddss3_mask ((__v4sf) __W,
+						 -(__v4sf) __A,
+						 -(__v4sf) __B,
+						 (__mmask8) -1,
+						 __R);
 }
 #else
-#define _mm_fmadd_round_sd(A, B, C, R)            \
-    (__m128d)__builtin_ia32_vfmaddsd3_round(A, B, C, R)
-
-#define _mm_fmadd_round_ss(A, B, C, R)            \
-    (__m128)__builtin_ia32_vfmaddss3_round(A, B, C, R)
-
-#define _mm_fmsub_round_sd(A, B, C, R)            \
-    (__m128d)__builtin_ia32_vfmaddsd3_round(A, B, -(C), R)
-
-#define _mm_fmsub_round_ss(A, B, C, R)            \
-    (__m128)__builtin_ia32_vfmaddss3_round(A, B, -(C), R)
-
-#define _mm_fnmadd_round_sd(A, B, C, R)            \
-    (__m128d)__builtin_ia32_vfmaddsd3_round(A, -(B), C, R)
-
-#define _mm_fnmadd_round_ss(A, B, C, R)            \
-   (__m128)__builtin_ia32_vfmaddss3_round(A, -(B), C, R)
-
-#define _mm_fnmsub_round_sd(A, B, C, R)            \
-    (__m128d)__builtin_ia32_vfmaddsd3_round(A, -(B), -(C), R)
-
-#define _mm_fnmsub_round_ss(A, B, C, R)            \
-    (__m128)__builtin_ia32_vfmaddss3_round(A, -(B), -(C), R)
+#define _mm_fmadd_round_sd(A, B, C, R)				\
+  ((__m128d)							\
+   __builtin_ia32_vfmaddsd3_mask ((__v2df) (__m128d) (A),	\
+				  (__v2df) (__m128d) (B),	\
+				  (__v2df) (__m128d) (C),	\
+				  (__mmask8) (-1),		\
+				  (int) (R)))
+
+#define _mm_fmadd_round_ss(A, B, C, R)				\
+  ((__m128)							\
+   __builtin_ia32_vfmaddss3_mask ((__v4sf) (__m128) (A),	\
+				  (__v4sf) (__m128) (B),	\
+				  (__v4sf) (__m128) (C),	\
+				  (__mmask8) (-1),		\
+				  (int) (R)))
+
+#define _mm_fmsub_round_sd(A, B, C, R)				\
+  ((__m128d)							\
+   __builtin_ia32_vfmaddsd3_mask ((__v2df) (__m128d) (A),	\
+				  (__v2df) (__m128d) (B),	\
+				  (__v2df) (__m128d) (-(C)),	\
+				  (__mmask8) (-1),		\
+				  (int) (R)))
+
+#define _mm_fmsub_round_ss(A, B, C, R)				\
+  ((__m128)							\
+   __builtin_ia32_vfmaddss3_mask ((__v4sf) (__m128) (A),	\
+				  (__v4sf) (__m128) (B),	\
+				  (__v4sf) (__m128) (-(C)),	\
+				  (__mmask8) (-1),		\
+				  (int) (R)))
+
+#define _mm_fnmadd_round_sd(A, B, C, R)				\
+  ((__m128d)							\
+   __builtin_ia32_vfmaddsd3_mask ((__v2df) (__m128d) (A),	\
+				  (__v2df) (__m128d) (-(B)),	\
+				  (__v2df) (__m128d) (C),	\
+				  (__mmask8) (-1),		\
+				  (int) (R)))
+
+#define _mm_fnmadd_round_ss(A, B, C, R)				\
+  ((__m128)							\
+   __builtin_ia32_vfmaddss3_mask ((__v4sf) (__m128) (A),	\
+				  (__v4sf) (__m128) (-(B)),	\
+				  (__v4sf) (__m128) (C),	\
+				  (__mmask8) (-1),		\
+				  (int) (R)))
+
+#define _mm_fnmsub_round_sd(A, B, C, R)				\
+  ((__m128d)							\
+   __builtin_ia32_vfmaddsd3_mask ((__v2df) (__m128d) (A),	\
+				  (__v2df) (__m128d) (-(B)),	\
+				  (__v2df) (__m128d) (-(C)),	\
+				  (__mmask8) (-1),		\
+				  (int) (R)))
+
+#define _mm_fnmsub_round_ss(A, B, C, R)				\
+  ((__m128)							\
+   __builtin_ia32_vfmaddss3_mask ((__v4sf) (__m128) (A),	\
+				  (__v4sf) (__m128) (-(B)),	\
+				  (__v4sf) (__m128) (-(C)),	\
+				  (__mmask8) (-1),		\
+				  (int) (R)))
 #endif
 
 extern __inline __m128d
@@ -14504,20 +14692,24 @@ extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_getexp_ss (__m128 __A, __m128 __B)
 {
-  return (__m128) __builtin_ia32_getexpss128_round ((__v4sf) __A,
-						    (__v4sf) __B,
-						    _MM_FROUND_CUR_DIRECTION);
+  return (__m128) __builtin_ia32_getexpss_mask_round ((__v4sf) __A,
+						      (__v4sf) __B,
+						      (__v4sf)
+						      _mm_undefined_ps (),
+						      (__mmask8) -1,
+						      _MM_FROUND_CUR_DIRECTION);
 }
 
 extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_mask_getexp_ss (__m128 __W, __mmask8 __U, __m128 __A, __m128 __B)
 {
-  return (__m128) __builtin_ia32_getexpss_mask_round ((__v4sf) __A,
-						(__v4sf) __B,
-						(__v4sf) __W,
-						(__mmask8) __U,
-						_MM_FROUND_CUR_DIRECTION);
+  return (__m128)
+    __builtin_ia32_getexpss_mask_round ((__v4sf) __A,
+					(__v4sf) __B,
+					(__v4sf) __W,
+					(__mmask8) __U,
+					_MM_FROUND_CUR_DIRECTION);
 }
 
 extern __inline __m128
@@ -14536,9 +14728,13 @@ extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_getexp_sd (__m128d __A, __m128d __B)
 {
-  return (__m128d) __builtin_ia32_getexpsd128_round ((__v2df) __A,
-						     (__v2df) __B,
-						     _MM_FROUND_CUR_DIRECTION);
+  return (__m128d)
+    __builtin_ia32_getexpsd_mask_round ((__v2df) __A,
+					(__v2df) __B,
+					(__v2df)
+					_mm_undefined_pd (),
+					(__mmask8) -1,
+					_MM_FROUND_CUR_DIRECTION);
 }
 
 extern __inline __m128d
@@ -14641,10 +14837,14 @@ __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_getmant_sd (__m128d __A, __m128d __B, _MM_MANTISSA_NORM_ENUM __C,
 		_MM_MANTISSA_SIGN_ENUM __D)
 {
-  return (__m128d) __builtin_ia32_getmantsd_round ((__v2df) __A,
-						   (__v2df) __B,
-						   (__D << 2) | __C,
-						   _MM_FROUND_CUR_DIRECTION);
+  return (__m128d)
+    __builtin_ia32_getmantsd_mask_round ((__v2df) __A,
+					 (__v2df) __B,
+					 (__D << 2) | __C,
+					 (__v2df)
+					 _mm_undefined_pd (),
+					 (__mmask8) -1,
+					 _MM_FROUND_CUR_DIRECTION);
 }
 
 extern __inline __m128d
@@ -14679,10 +14879,14 @@ __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_getmant_ss (__m128 __A, __m128 __B, _MM_MANTISSA_NORM_ENUM __C,
 		_MM_MANTISSA_SIGN_ENUM __D)
 {
-  return (__m128) __builtin_ia32_getmantss_round ((__v4sf) __A,
-						  (__v4sf) __B,
-						  (__D << 2) | __C,
-						  _MM_FROUND_CUR_DIRECTION);
+  return (__m128)
+    __builtin_ia32_getmantss_mask_round ((__v4sf) __A,
+					 (__v4sf) __B,
+					 (__D << 2) | __C,
+					 (__v4sf)
+					 _mm_undefined_ps (),
+					 (__mmask8) -1,
+					 _MM_FROUND_CUR_DIRECTION);
 }
 
 extern __inline __m128
@@ -14753,11 +14957,15 @@ _mm_maskz_getmant_ss (__mmask8 __U, __m128 __A, __m128 __B,
                                              (__v16sf)_mm512_setzero_ps(),          \
                                              (__mmask16)(U),\
 					     _MM_FROUND_CUR_DIRECTION))
-#define _mm_getmant_sd(X, Y, C, D)                                                  \
-  ((__m128d)__builtin_ia32_getmantsd_round ((__v2df)(__m128d)(X),                    \
-                                           (__v2df)(__m128d)(Y),                    \
-                                           (int)(((D)<<2) | (C)),                   \
-					   _MM_FROUND_CUR_DIRECTION))
+#define _mm_getmant_sd(X, Y, C, D)					\
+  ((__m128d)								\
+   __builtin_ia32_getmantsd_mask_round ((__v2df) (__m128d) (X),	\
+					(__v2df) (__m128d) (Y),		\
+					(int) (((D)<<2) | (C)),		\
+					(__v2df) (__m128d)		\
+					_mm_undefined_pd (),		\
+					(__mmask8) (-1),		\
+					_MM_FROUND_CUR_DIRECTION))
 
 #define _mm_mask_getmant_sd(W, U, X, Y, C, D)                                       \
   ((__m128d)__builtin_ia32_getmantsd_mask_round ((__v2df)(__m128d)(X),                 \
@@ -14775,11 +14983,15 @@ _mm_maskz_getmant_ss (__mmask8 __U, __m128 __A, __m128 __B,
                                               (__mmask8)(U),\
 					      _MM_FROUND_CUR_DIRECTION))
 
-#define _mm_getmant_ss(X, Y, C, D)                                                  \
-  ((__m128)__builtin_ia32_getmantss_round ((__v4sf)(__m128)(X),                      \
-                                          (__v4sf)(__m128)(Y),                      \
-                                          (int)(((D)<<2) | (C)),                    \
-					  _MM_FROUND_CUR_DIRECTION))
+#define _mm_getmant_ss(X, Y, C, D)					\
+  ((__m128)								\
+   __builtin_ia32_getmantss_mask_round ((__v4sf) (__m128) (X),		\
+					(__v4sf) (__m128) (Y),		\
+					(int) (((D)<<2) | (C)),		\
+					(__v4sf) (__m128)		\
+					_mm_undefined_ps (),		\
+					(__mmask8) (-1),		\
+					_MM_FROUND_CUR_DIRECTION))
 
 #define _mm_mask_getmant_ss(W, U, X, Y, C, D)                                       \
   ((__m128)__builtin_ia32_getmantss_mask_round ((__v4sf)(__m128)(X),                 \
@@ -14797,9 +15009,14 @@ _mm_maskz_getmant_ss (__mmask8 __U, __m128 __A, __m128 __B,
                                               (__mmask8)(U),\
 					      _MM_FROUND_CUR_DIRECTION))
 
-#define _mm_getexp_ss(A, B)						      \
-  ((__m128)__builtin_ia32_getexpss128_round((__v4sf)(__m128)(A), (__v4sf)(__m128)(B),  \
-					   _MM_FROUND_CUR_DIRECTION))
+#define _mm_getexp_ss(A, B)						\
+  ((__m128)								\
+   __builtin_ia32_getexpss_mask_round ((__v4sf) (__m128) (A),		\
+				       (__v4sf) (__m128) (B),		\
+				       (__v4sf) (__m128)		\
+				       _mm_undefined_ps (),		\
+				       (__mmask8) (-1),			\
+				       _MM_FROUND_CUR_DIRECTION))
 
 #define _mm_mask_getexp_ss(W, U, A, B) \
     (__m128)__builtin_ia32_getexpss_mask_round(A, B, W, U,\
@@ -14809,9 +15026,14 @@ _mm_maskz_getmant_ss (__mmask8 __U, __m128 __A, __m128 __B,
     (__m128)__builtin_ia32_getexpss_mask_round(A, B, (__v4sf)_mm_setzero_ps(), U,\
 					      _MM_FROUND_CUR_DIRECTION)
 
-#define _mm_getexp_sd(A, B)						       \
-  ((__m128d)__builtin_ia32_getexpsd128_round((__v2df)(__m128d)(A), (__v2df)(__m128d)(B),\
-					    _MM_FROUND_CUR_DIRECTION))
+#define _mm_getexp_sd(A, B)						\
+  ((__m128d)								\
+   __builtin_ia32_getexpsd_mask_round ((__v2df) (__m128d) (A),		\
+				       (__v2df) (__m128d) (B),		\
+				       (__v2df) (__m128d)		\
+				       _mm_undefined_pd (),		\
+				       (__mmask8) (-1),			\
+				       _MM_FROUND_CUR_DIRECTION))
 
 #define _mm_mask_getexp_sd(W, U, A, B) \
     (__m128d)__builtin_ia32_getexpsd_mask_round(A, B, W, U,\
diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
index a6500f9d9b5..c2039e9d112 100644
--- a/gcc/config/i386/i386-builtin.def
+++ b/gcc/config/i386/i386-builtin.def
@@ -2751,9 +2751,7 @@ BDESC_END (ARGS, ROUND_ARGS)
 BDESC_FIRST (round_args, ROUND_ARGS,
        OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_addv8df3_mask_round, "__builtin_ia32_addpd512_mask", IX86_BUILTIN_ADDPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_addv16sf3_mask_round, "__builtin_ia32_addps512_mask", IX86_BUILTIN_ADDPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmaddv2df3_round, "__builtin_ia32_addsd_round", IX86_BUILTIN_ADDSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmaddv2df3_mask_round, "__builtin_ia32_addsd_mask_round", IX86_BUILTIN_ADDSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmaddv4sf3_round, "__builtin_ia32_addss_round", IX86_BUILTIN_ADDSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmaddv4sf3_mask_round, "__builtin_ia32_addss_mask_round", IX86_BUILTIN_ADDSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_cmpv8df3_mask_round, "__builtin_ia32_cmppd512_mask", IX86_BUILTIN_CMPPD512, UNKNOWN, (int) UQI_FTYPE_V8DF_V8DF_INT_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_cmpv16sf3_mask_round, "__builtin_ia32_cmpps512_mask", IX86_BUILTIN_CMPPS512, UNKNOWN, (int) UHI_FTYPE_V16SF_V16SF_INT_UHI_INT)
@@ -2784,9 +2782,7 @@ BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_cvtusi2ss32_round, "__builtin_ia32_c
 BDESC (OPTION_MASK_ISA_AVX512F | OPTION_MASK_ISA_64BIT, 0, CODE_FOR_cvtusi2ss64_round, "__builtin_ia32_cvtusi2ss64", IX86_BUILTIN_CVTUSI2SS64, UNKNOWN, (int) V4SF_FTYPE_V4SF_UINT64_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_divv8df3_mask_round, "__builtin_ia32_divpd512_mask", IX86_BUILTIN_DIVPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_divv16sf3_mask_round, "__builtin_ia32_divps512_mask", IX86_BUILTIN_DIVPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmdivv2df3_round, "__builtin_ia32_divsd_round", IX86_BUILTIN_DIVSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmdivv2df3_mask_round, "__builtin_ia32_divsd_mask_round", IX86_BUILTIN_DIVSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmdivv4sf3_round, "__builtin_ia32_divss_round", IX86_BUILTIN_DIVSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmdivv4sf3_mask_round, "__builtin_ia32_divss_mask_round", IX86_BUILTIN_DIVSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_fixupimmv8df_mask_round, "__builtin_ia32_fixupimmpd512_mask", IX86_BUILTIN_FIXUPIMMPD512_MASK, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DI_INT_QI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_fixupimmv8df_maskz_round, "__builtin_ia32_fixupimmpd512_maskz", IX86_BUILTIN_FIXUPIMMPD512_MASKZ, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DI_INT_QI_INT)
@@ -2798,33 +2794,23 @@ BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_sfixupimmv4sf_mask_round, "_
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_sfixupimmv4sf_maskz_round, "__builtin_ia32_fixupimmss_maskz", IX86_BUILTIN_FIXUPIMMSS128_MASKZ, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SI_INT_QI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_getexpv8df_mask_round, "__builtin_ia32_getexppd512_mask", IX86_BUILTIN_GETEXPPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_QI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_getexpv16sf_mask_round, "__builtin_ia32_getexpps512_mask", IX86_BUILTIN_GETEXPPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_HI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_sgetexpv2df_round, "__builtin_ia32_getexpsd128_round", IX86_BUILTIN_GETEXPSD128, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_sgetexpv2df_mask_round, "__builtin_ia32_getexpsd_mask_round", IX86_BUILTIN_GETEXPSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_sgetexpv4sf_round, "__builtin_ia32_getexpss128_round", IX86_BUILTIN_GETEXPSS128, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_sgetexpv4sf_mask_round, "__builtin_ia32_getexpss_mask_round", IX86_BUILTIN_GETEXPSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_getmantv8df_mask_round, "__builtin_ia32_getmantpd512_mask", IX86_BUILTIN_GETMANTPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_INT_V8DF_QI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_getmantv16sf_mask_round, "__builtin_ia32_getmantps512_mask", IX86_BUILTIN_GETMANTPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_INT_V16SF_HI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vgetmantv2df_round, "__builtin_ia32_getmantsd_round", IX86_BUILTIN_GETMANTSD128, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vgetmantv2df_mask_round, "__builtin_ia32_getmantsd_mask_round", IX86_BUILTIN_GETMANTSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT_V2DF_UQI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vgetmantv4sf_round, "__builtin_ia32_getmantss_round", IX86_BUILTIN_GETMANTSS128, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vgetmantv4sf_mask_round, "__builtin_ia32_getmantss_mask_round", IX86_BUILTIN_GETMANTSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT_V4SF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_smaxv8df3_mask_round, "__builtin_ia32_maxpd512_mask", IX86_BUILTIN_MAXPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_smaxv16sf3_mask_round, "__builtin_ia32_maxps512_mask", IX86_BUILTIN_MAXPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsmaxv2df3_round, "__builtin_ia32_maxsd_round", IX86_BUILTIN_MAXSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsmaxv2df3_mask_round, "__builtin_ia32_maxsd_mask_round", IX86_BUILTIN_MAXSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsmaxv4sf3_round, "__builtin_ia32_maxss_round", IX86_BUILTIN_MAXSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsmaxv4sf3_mask_round, "__builtin_ia32_maxss_mask_round", IX86_BUILTIN_MAXSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sminv8df3_mask_round, "__builtin_ia32_minpd512_mask", IX86_BUILTIN_MINPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sminv16sf3_mask_round, "__builtin_ia32_minps512_mask", IX86_BUILTIN_MINPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsminv2df3_round, "__builtin_ia32_minsd_round", IX86_BUILTIN_MINSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsminv2df3_mask_round, "__builtin_ia32_minsd_mask_round", IX86_BUILTIN_MINSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsminv4sf3_round, "__builtin_ia32_minss_round", IX86_BUILTIN_MINSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsminv4sf3_mask_round, "__builtin_ia32_minss_mask_round", IX86_BUILTIN_MINSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_mulv8df3_mask_round, "__builtin_ia32_mulpd512_mask", IX86_BUILTIN_MULPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_mulv16sf3_mask_round, "__builtin_ia32_mulps512_mask", IX86_BUILTIN_MULPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmmulv2df3_round, "__builtin_ia32_mulsd_round", IX86_BUILTIN_MULSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmmulv2df3_mask_round, "__builtin_ia32_mulsd_mask_round", IX86_BUILTIN_MULSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmmulv4sf3_round, "__builtin_ia32_mulss_round", IX86_BUILTIN_MULSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmmulv4sf3_mask_round, "__builtin_ia32_mulss_mask_round", IX86_BUILTIN_MULSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_rndscalev8df_mask_round, "__builtin_ia32_rndscalepd_mask", IX86_BUILTIN_RNDSCALEPD, UNKNOWN, (int) V8DF_FTYPE_V8DF_INT_V8DF_QI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_rndscalev16sf_mask_round, "__builtin_ia32_rndscaleps_mask", IX86_BUILTIN_RNDSCALEPS, UNKNOWN, (int) V16SF_FTYPE_V16SF_INT_V16SF_HI_INT)
@@ -2840,9 +2826,7 @@ BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsqrtv2df2_mask_round, "__buil
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsqrtv4sf2_mask_round, "__builtin_ia32_sqrtss_mask_round", IX86_BUILTIN_SQRTSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_subv8df3_mask_round, "__builtin_ia32_subpd512_mask", IX86_BUILTIN_SUBPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_subv16sf3_mask_round, "__builtin_ia32_subps512_mask", IX86_BUILTIN_SUBPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsubv2df3_round, "__builtin_ia32_subsd_round", IX86_BUILTIN_SUBSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsubv2df3_mask_round, "__builtin_ia32_subsd_mask_round", IX86_BUILTIN_SUBSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsubv4sf3_round, "__builtin_ia32_subss_round", IX86_BUILTIN_SUBSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsubv4sf3_mask_round, "__builtin_ia32_subss_mask_round", IX86_BUILTIN_SUBSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_cvtsd2si_round, "__builtin_ia32_vcvtsd2si32", IX86_BUILTIN_VCVTSD2SI32, UNKNOWN, (int) INT_FTYPE_V2DF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F | OPTION_MASK_ISA_64BIT, 0, CODE_FOR_sse2_cvtsd2siq_round, "__builtin_ia32_vcvtsd2si64", IX86_BUILTIN_VCVTSD2SI64, UNKNOWN, (int) INT64_FTYPE_V2DF_INT)
@@ -2866,8 +2850,6 @@ BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_fmadd_v8df_maskz_round, "__b
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_fmadd_v16sf_mask_round, "__builtin_ia32_vfmaddps512_mask", IX86_BUILTIN_VFMADDPS512_MASK, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_fmadd_v16sf_mask3_round, "__builtin_ia32_vfmaddps512_mask3", IX86_BUILTIN_VFMADDPS512_MASK3, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_fmadd_v16sf_maskz_round, "__builtin_ia32_vfmaddps512_maskz", IX86_BUILTIN_VFMADDPS512_MASKZ, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_fmai_vmfmadd_v2df_round, "__builtin_ia32_vfmaddsd3_round", IX86_BUILTIN_VFMADDSD3_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_fmai_vmfmadd_v4sf_round, "__builtin_ia32_vfmaddss3_round", IX86_BUILTIN_VFMADDSS3_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vmfmadd_v2df_mask_round, "__builtin_ia32_vfmaddsd3_mask", IX86_BUILTIN_VFMADDSD3_MASK, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vmfmadd_v2df_mask3_round, "__builtin_ia32_vfmaddsd3_mask3", IX86_BUILTIN_VFMADDSD3_MASK3, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vmfmadd_v2df_maskz_round, "__builtin_ia32_vfmaddsd3_maskz", IX86_BUILTIN_VFMADDSD3_MASKZ, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index cbf4eb7b487..66bf9be5bd4 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -10193,13 +10193,6 @@ ix86_expand_round_builtin (const struct builtin_description *d,
     case V16SI_FTYPE_V16SF_V16SI_HI_INT:
     case V8DF_FTYPE_V8SF_V8DF_QI_INT:
     case V16SF_FTYPE_V16HI_V16SF_HI_INT:
-    case V2DF_FTYPE_V2DF_V2DF_V2DF_INT:
-    case V4SF_FTYPE_V4SF_V4SF_V4SF_INT:
-      nargs = 4;
-      break;
-    case V4SF_FTYPE_V4SF_V4SF_INT_INT:
-    case V2DF_FTYPE_V2DF_V2DF_INT_INT:
-      nargs_constant = 2;
       nargs = 4;
       break;
     case INT_FTYPE_V4SF_V4SF_INT_INT:
diff --git a/gcc/testsuite/gcc.target/i386/avx-1.c b/gcc/testsuite/gcc.target/i386/avx-1.c
index 3600a7abe91..0e00bfbbb5e 100644
--- a/gcc/testsuite/gcc.target/i386/avx-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx-1.c
@@ -172,9 +172,7 @@
 #define __builtin_ia32_kshiftrihi(A, B) __builtin_ia32_kshiftrihi(A, 8)
 #define __builtin_ia32_addpd512_mask(A, B, C, D, E) __builtin_ia32_addpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_addps512_mask(A, B, C, D, E) __builtin_ia32_addps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_addsd_round(A, B, C) __builtin_ia32_addsd_round(A, B, 8)
 #define __builtin_ia32_addsd_mask_round(A, B, C, D, E) __builtin_ia32_addsd_mask_round(A, B, C, D, 8)
-#define __builtin_ia32_addss_round(A, B, C) __builtin_ia32_addss_round(A, B, 8)
 #define __builtin_ia32_addss_mask_round(A, B, C, D, E) __builtin_ia32_addss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_alignd512_mask(A, B, F, D, E) __builtin_ia32_alignd512_mask(A, B, 1, D, E)
 #define __builtin_ia32_alignq512_mask(A, B, F, D, E) __builtin_ia32_alignq512_mask(A, B, 1, D, E)
@@ -206,9 +204,7 @@
 #define __builtin_ia32_cvtusi2ss64(A, B, C) __builtin_ia32_cvtusi2ss64(A, B, 8)
 #define __builtin_ia32_divpd512_mask(A, B, C, D, E) __builtin_ia32_divpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_divps512_mask(A, B, C, D, E) __builtin_ia32_divps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_divsd_round(A, B, C) __builtin_ia32_divsd_round(A, B, 8)
 #define __builtin_ia32_divsd_mask_round(A, B, C, D, E) __builtin_ia32_divsd_mask_round(A, B, C, D, 8)
-#define __builtin_ia32_divss_round(A, B, C) __builtin_ia32_divss_round(A, B, 8)
 #define __builtin_ia32_divss_mask_round(A, B, C, D, E) __builtin_ia32_divss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_extractf32x4_mask(A, E, C, D) __builtin_ia32_extractf32x4_mask(A, 1, C, D)
 #define __builtin_ia32_extractf64x4_mask(A, E, C, D) __builtin_ia32_extractf64x4_mask(A, 1, C, D)
@@ -232,15 +228,11 @@
 #define __builtin_ia32_gathersiv8di(A, B, C, D, F) __builtin_ia32_gathersiv8di(A, B, C, D, 8)
 #define __builtin_ia32_getexppd512_mask(A, B, C, D) __builtin_ia32_getexppd512_mask(A, B, C, 8)
 #define __builtin_ia32_getexpps512_mask(A, B, C, D) __builtin_ia32_getexpps512_mask(A, B, C, 8)
-#define __builtin_ia32_getexpsd128_round(A, B, C) __builtin_ia32_getexpsd128_round(A, B, 4)
 #define __builtin_ia32_getexpsd_mask_round(A, B, C, D, E) __builtin_ia32_getexpsd_mask_round(A, B, C, D, 4)
-#define __builtin_ia32_getexpss128_round(A, B, C) __builtin_ia32_getexpss128_round(A, B, 4)
 #define __builtin_ia32_getexpss_mask_round(A, B, C, D, E) __builtin_ia32_getexpss_mask_round(A, B, C, D, 4)
 #define __builtin_ia32_getmantpd512_mask(A, F, C, D, E) __builtin_ia32_getmantpd512_mask(A, 1, C, D, 8)
 #define __builtin_ia32_getmantps512_mask(A, F, C, D, E) __builtin_ia32_getmantps512_mask(A, 1, C, D, 8)
-#define __builtin_ia32_getmantsd_round(A, B, C, D) __builtin_ia32_getmantsd_round(A, B, 1, 4)
 #define __builtin_ia32_getmantsd_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantsd_mask_round(A, B, 1, W, U, 4)
-#define __builtin_ia32_getmantss_round(A, B, C, D) __builtin_ia32_getmantss_round(A, B, 1, 4)
 #define __builtin_ia32_getmantss_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantss_mask_round(A, B, 1, W, U, 4)
 #define __builtin_ia32_insertf32x4_mask(A, B, F, D, E) __builtin_ia32_insertf32x4_mask(A, B, 1, D, E)
 #define __builtin_ia32_insertf64x4_mask(A, B, F, D, E) __builtin_ia32_insertf64x4_mask(A, B, 1, D, E)
@@ -248,21 +240,15 @@
 #define __builtin_ia32_inserti64x4_mask(A, B, F, D, E) __builtin_ia32_inserti64x4_mask(A, B, 1, D, E)
 #define __builtin_ia32_maxpd512_mask(A, B, C, D, E) __builtin_ia32_maxpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_maxps512_mask(A, B, C, D, E) __builtin_ia32_maxps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_maxsd_round(A, B, C) __builtin_ia32_maxsd_round(A, B, 4)
 #define __builtin_ia32_maxsd_mask_round(A, B, C, D, E) __builtin_ia32_maxsd_mask_round(A, B, C, D, 8)
-#define __builtin_ia32_maxss_round(A, B, C) __builtin_ia32_maxss_round(A, B, 4)
 #define __builtin_ia32_maxss_mask_round(A, B, C, D, E) __builtin_ia32_maxss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_minpd512_mask(A, B, C, D, E) __builtin_ia32_minpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_minps512_mask(A, B, C, D, E) __builtin_ia32_minps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_minsd_round(A, B, C) __builtin_ia32_minsd_round(A, B, 4)
 #define __builtin_ia32_minsd_mask_round(A, B, C, D, E) __builtin_ia32_minsd_mask_round(A, B, C, D, 4)
-#define __builtin_ia32_minss_round(A, B, C) __builtin_ia32_minss_round(A, B, 4)
 #define __builtin_ia32_minss_mask_round(A, B, C, D, E) __builtin_ia32_minss_mask_round(A, B, C, D, 4)
 #define __builtin_ia32_mulpd512_mask(A, B, C, D, E) __builtin_ia32_mulpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_mulps512_mask(A, B, C, D, E) __builtin_ia32_mulps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_mulsd_round(A, B, C) __builtin_ia32_mulsd_round(A, B, 8)
 #define __builtin_ia32_mulsd_mask_round(A, B, C, D, E) __builtin_ia32_mulsd_mask_round(A, B, C, D, 8)
-#define __builtin_ia32_mulss_round(A, B, C) __builtin_ia32_mulss_round(A, B, 8)
 #define __builtin_ia32_mulss_mask_round(A, B, C, D, E) __builtin_ia32_mulss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_permdf512_mask(A, E, C, D) __builtin_ia32_permdf512_mask(A, 1, C, D)
 #define __builtin_ia32_permdi512_mask(A, E, C, D) __builtin_ia32_permdi512_mask(A, 1, C, D)
@@ -309,9 +295,7 @@
 #define __builtin_ia32_sqrtsd_mask_round(A, B, C, D, E) __builtin_ia32_sqrtsd_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_subpd512_mask(A, B, C, D, E) __builtin_ia32_subpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_subps512_mask(A, B, C, D, E) __builtin_ia32_subps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_subsd_round(A, B, C) __builtin_ia32_subsd_round(A, B, 8)
 #define __builtin_ia32_subsd_mask_round(A, B, C, D, E) __builtin_ia32_subsd_mask_round(A, B, C, D, 8)
-#define __builtin_ia32_subss_round(A, B, C) __builtin_ia32_subss_round(A, B, 8)
 #define __builtin_ia32_subss_mask_round(A, B, C, D, E) __builtin_ia32_subss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_ucmpd512_mask(A, B, E, D) __builtin_ia32_ucmpd512_mask(A, B, 1, D)
 #define __builtin_ia32_ucmpq512_mask(A, B, E, D) __builtin_ia32_ucmpq512_mask(A, B, 1, D)
@@ -341,8 +325,6 @@
 #define __builtin_ia32_vfmaddps512_mask(A, B, C, D, E) __builtin_ia32_vfmaddps512_mask(A, B, C, D, 8)
 #define __builtin_ia32_vfmaddps512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddps512_mask3(A, B, C, D, 8)
 #define __builtin_ia32_vfmaddps512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddps512_maskz(A, B, C, D, 8)
-#define __builtin_ia32_vfmaddsd3_round(A, B, C, D) __builtin_ia32_vfmaddsd3_round(A, B, C, 8)
-#define __builtin_ia32_vfmaddss3_round(A, B, C, D) __builtin_ia32_vfmaddss3_round(A, B, C, 8)
 #define __builtin_ia32_vfmaddsubpd512_mask(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_vfmaddsubpd512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_mask3(A, B, C, D, 8)
 #define __builtin_ia32_vfmaddsubpd512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_maskz(A, B, C, D, 8)
diff --git a/gcc/testsuite/gcc.target/i386/sse-13.c b/gcc/testsuite/gcc.target/i386/sse-13.c
index 45c1c285c57..fdb7852f0b3 100644
--- a/gcc/testsuite/gcc.target/i386/sse-13.c
+++ b/gcc/testsuite/gcc.target/i386/sse-13.c
@@ -189,9 +189,7 @@
 #define __builtin_ia32_kshiftrihi(A, B) __builtin_ia32_kshiftrihi(A, 8)
 #define __builtin_ia32_addpd512_mask(A, B, C, D, E) __builtin_ia32_addpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_addps512_mask(A, B, C, D, E) __builtin_ia32_addps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_addsd_round(A, B, C) __builtin_ia32_addsd_round(A, B, 8)
 #define __builtin_ia32_addsd_mask_round(A, B, C, D, E) __builtin_ia32_addsd_mask_round(A, B, C, D, 8)
-#define __builtin_ia32_addss_round(A, B, C) __builtin_ia32_addss_round(A, B, 8)
 #define __builtin_ia32_addss_mask_round(A, B, C, D, E) __builtin_ia32_addss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_alignd512_mask(A, B, F, D, E) __builtin_ia32_alignd512_mask(A, B, 1, D, E)
 #define __builtin_ia32_alignq512_mask(A, B, F, D, E) __builtin_ia32_alignq512_mask(A, B, 1, D, E)
@@ -223,9 +221,7 @@
 #define __builtin_ia32_cvtusi2ss64(A, B, C) __builtin_ia32_cvtusi2ss64(A, B, 8)
 #define __builtin_ia32_divpd512_mask(A, B, C, D, E) __builtin_ia32_divpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_divps512_mask(A, B, C, D, E) __builtin_ia32_divps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_divsd_round(A, B, C) __builtin_ia32_divsd_round(A, B, 8)
 #define __builtin_ia32_divsd_mask_round(A, B, C, D, E) __builtin_ia32_divsd_mask_round(A, B, C, D, 8)
-#define __builtin_ia32_divss_round(A, B, C) __builtin_ia32_divss_round(A, B, 8)
 #define __builtin_ia32_divss_mask_round(A, B, C, D, E) __builtin_ia32_divss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_extractf32x4_mask(A, E, C, D) __builtin_ia32_extractf32x4_mask(A, 1, C, D)
 #define __builtin_ia32_extractf64x4_mask(A, E, C, D) __builtin_ia32_extractf64x4_mask(A, 1, C, D)
@@ -249,15 +245,11 @@
 #define __builtin_ia32_gathersiv8di(A, B, C, D, F) __builtin_ia32_gathersiv8di(A, B, C, D, 8)
 #define __builtin_ia32_getexppd512_mask(A, B, C, D) __builtin_ia32_getexppd512_mask(A, B, C, 8)
 #define __builtin_ia32_getexpps512_mask(A, B, C, D) __builtin_ia32_getexpps512_mask(A, B, C, 8)
-#define __builtin_ia32_getexpsd128_round(A, B, C) __builtin_ia32_getexpsd128_round(A, B, 4)
 #define __builtin_ia32_getexpsd_mask_round(A, B, C, D, E) __builtin_ia32_getexpsd_mask_round(A, B, C, D, 4)
-#define __builtin_ia32_getexpss128_round(A, B, C) __builtin_ia32_getexpss128_round(A, B, 4)
 #define __builtin_ia32_getexpss_mask_round(A, B, C, D, E) __builtin_ia32_getexpss_mask_round(A, B, C, D, 4)
 #define __builtin_ia32_getmantpd512_mask(A, F, C, D, E) __builtin_ia32_getmantpd512_mask(A, 1, C, D, 8)
 #define __builtin_ia32_getmantps512_mask(A, F, C, D, E) __builtin_ia32_getmantps512_mask(A, 1, C, D, 8)
-#define __builtin_ia32_getmantsd_round(A, B, C, D) __builtin_ia32_getmantsd_round(A, B, 1, 4)
 #define __builtin_ia32_getmantsd_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantsd_mask_round(A, B, 1, W, U, 4)
-#define __builtin_ia32_getmantss_round(A, B, C, D) __builtin_ia32_getmantss_round(A, B, 1, 4)
 #define __builtin_ia32_getmantss_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantss_mask_round(A, B, 1, W, U, 4)
 #define __builtin_ia32_insertf32x4_mask(A, B, F, D, E) __builtin_ia32_insertf32x4_mask(A, B, 1, D, E)
 #define __builtin_ia32_insertf64x4_mask(A, B, F, D, E) __builtin_ia32_insertf64x4_mask(A, B, 1, D, E)
@@ -265,21 +257,15 @@
 #define __builtin_ia32_inserti64x4_mask(A, B, F, D, E) __builtin_ia32_inserti64x4_mask(A, B, 1, D, E)
 #define __builtin_ia32_maxpd512_mask(A, B, C, D, E) __builtin_ia32_maxpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_maxps512_mask(A, B, C, D, E) __builtin_ia32_maxps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_maxsd_round(A, B, C) __builtin_ia32_maxsd_round(A, B, 4)
 #define __builtin_ia32_maxsd_mask_round(A, B, C, D, E) __builtin_ia32_maxsd_mask_round(A, B, C, D, 4)
-#define __builtin_ia32_maxss_round(A, B, C) __builtin_ia32_maxss_round(A, B, 4)
 #define __builtin_ia32_maxss_mask_round(A, B, C, D, E) __builtin_ia32_maxss_mask_round(A, B, C, D, 4)
 #define __builtin_ia32_minpd512_mask(A, B, C, D, E) __builtin_ia32_minpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_minps512_mask(A, B, C, D, E) __builtin_ia32_minps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_minsd_round(A, B, C) __builtin_ia32_minsd_round(A, B, 4)
 #define __builtin_ia32_minsd_mask_round(A, B, C, D, E) __builtin_ia32_minsd_mask_round(A, B, C, D, 4)
-#define __builtin_ia32_minss_round(A, B, C) __builtin_ia32_minss_round(A, B, 4)
 #define __builtin_ia32_minss_mask_round(A, B, C, D, E) __builtin_ia32_minss_mask_round(A, B, C, D, 4)
 #define __builtin_ia32_mulpd512_mask(A, B, C, D, E) __builtin_ia32_mulpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_mulps512_mask(A, B, C, D, E) __builtin_ia32_mulps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_mulsd_round(A, B, C) __builtin_ia32_mulsd_round(A, B, 8)
 #define __builtin_ia32_mulsd_mask_round(A, B, C, D, E) __builtin_ia32_mulsd_mask_round(A, B, C, D, 8)
-#define __builtin_ia32_mulss_round(A, B, C) __builtin_ia32_mulss_round(A, B, 8)
 #define __builtin_ia32_mulss_mask_round(A, B, C, D, E) __builtin_ia32_mulss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_permdf512_mask(A, E, C, D) __builtin_ia32_permdf512_mask(A, 1, C, D)
 #define __builtin_ia32_permdi512_mask(A, E, C, D) __builtin_ia32_permdi512_mask(A, 1, C, D)
@@ -326,9 +312,7 @@
 #define __builtin_ia32_sqrtsd_mask_round(A, B, C, D, E) __builtin_ia32_sqrtsd_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_subpd512_mask(A, B, C, D, E) __builtin_ia32_subpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_subps512_mask(A, B, C, D, E) __builtin_ia32_subps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_subsd_round(A, B, C) __builtin_ia32_subsd_round(A, B, 8)
 #define __builtin_ia32_subsd_mask_round(A, B, C, D, E) __builtin_ia32_subsd_mask_round(A, B, C, D, 8)
-#define __builtin_ia32_subss_round(A, B, C) __builtin_ia32_subss_round(A, B, 8)
 #define __builtin_ia32_subss_mask_round(A, B, C, D, E) __builtin_ia32_subss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_ucmpd512_mask(A, B, E, D) __builtin_ia32_ucmpd512_mask(A, B, 1, D)
 #define __builtin_ia32_ucmpq512_mask(A, B, E, D) __builtin_ia32_ucmpq512_mask(A, B, 1, D)
@@ -358,8 +342,6 @@
 #define __builtin_ia32_vfmaddps512_mask(A, B, C, D, E) __builtin_ia32_vfmaddps512_mask(A, B, C, D, 8)
 #define __builtin_ia32_vfmaddps512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddps512_mask3(A, B, C, D, 8)
 #define __builtin_ia32_vfmaddps512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddps512_maskz(A, B, C, D, 8)
-#define __builtin_ia32_vfmaddsd3_round(A, B, C, D) __builtin_ia32_vfmaddsd3_round(A, B, C, 8)
-#define __builtin_ia32_vfmaddss3_round(A, B, C, D) __builtin_ia32_vfmaddss3_round(A, B, C, 8)
 #define __builtin_ia32_vfmaddsubpd512_mask(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_vfmaddsubpd512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_mask3(A, B, C, D, 8)
 #define __builtin_ia32_vfmaddsubpd512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_maskz(A, B, C, D, 8)
diff --git a/gcc/testsuite/gcc.target/i386/sse-23.c b/gcc/testsuite/gcc.target/i386/sse-23.c
index e98c7693ef7..cb98cc63e6b 100644
--- a/gcc/testsuite/gcc.target/i386/sse-23.c
+++ b/gcc/testsuite/gcc.target/i386/sse-23.c
@@ -191,9 +191,7 @@
 #define __builtin_ia32_kshiftrihi(A, B) __builtin_ia32_kshiftrihi(A, 8)
 #define __builtin_ia32_addpd512_mask(A, B, C, D, E) __builtin_ia32_addpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_addps512_mask(A, B, C, D, E) __builtin_ia32_addps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_addsd_round(A, B, C) __builtin_ia32_addsd_round(A, B, 8)
 #define __builtin_ia32_addsd_mask_round(A, B, C, D, E) __builtin_ia32_addsd_mask_round(A, B, C, D, 8)
-#define __builtin_ia32_addss_round(A, B, C) __builtin_ia32_addss_round(A, B, 8)
 #define __builtin_ia32_addss_mask_round(A, B, C, D, E) __builtin_ia32_addss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_alignd512_mask(A, B, F, D, E) __builtin_ia32_alignd512_mask(A, B, 1, D, E)
 #define __builtin_ia32_alignq512_mask(A, B, F, D, E) __builtin_ia32_alignq512_mask(A, B, 1, D, E)
@@ -225,9 +223,7 @@
 #define __builtin_ia32_cvtusi2ss64(A, B, C) __builtin_ia32_cvtusi2ss64(A, B, 8)
 #define __builtin_ia32_divpd512_mask(A, B, C, D, E) __builtin_ia32_divpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_divps512_mask(A, B, C, D, E) __builtin_ia32_divps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_divsd_round(A, B, C) __builtin_ia32_divsd_round(A, B, 8)
 #define __builtin_ia32_divsd_mask_round(A, B, C, D, E) __builtin_ia32_divsd_mask_round(A, B, C, D, 8)
-#define __builtin_ia32_divss_round(A, B, C) __builtin_ia32_divss_round(A, B, 8)
 #define __builtin_ia32_divss_mask_round(A, B, C, D, E) __builtin_ia32_divss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_extractf32x4_mask(A, E, C, D) __builtin_ia32_extractf32x4_mask(A, 1, C, D)
 #define __builtin_ia32_extractf64x4_mask(A, E, C, D) __builtin_ia32_extractf64x4_mask(A, 1, C, D)
@@ -251,15 +247,11 @@
 #define __builtin_ia32_gathersiv8di(A, B, C, D, F) __builtin_ia32_gathersiv8di(A, B, C, D, 8)
 #define __builtin_ia32_getexppd512_mask(A, B, C, D) __builtin_ia32_getexppd512_mask(A, B, C, 8)
 #define __builtin_ia32_getexpps512_mask(A, B, C, D) __builtin_ia32_getexpps512_mask(A, B, C, 8)
-#define __builtin_ia32_getexpsd128_round(A, B, C) __builtin_ia32_getexpsd128_round(A, B, 4)
 #define __builtin_ia32_getexpsd_mask_round(A, B, C, D, E) __builtin_ia32_getexpsd_mask_round(A, B, C, D, 4)
-#define __builtin_ia32_getexpss128_round(A, B, C) __builtin_ia32_getexpss128_round(A, B, 4)
 #define __builtin_ia32_getexpss_mask_round(A, B, C, D, E) __builtin_ia32_getexpss_mask_round(A, B, C, D, 4)
 #define __builtin_ia32_getmantpd512_mask(A, F, C, D, E) __builtin_ia32_getmantpd512_mask(A, 1, C, D, 8)
 #define __builtin_ia32_getmantps512_mask(A, F, C, D, E) __builtin_ia32_getmantps512_mask(A, 1, C, D, 8)
-#define __builtin_ia32_getmantsd_round(A, B, C, D) __builtin_ia32_getmantsd_round(A, B, 1, 4)
 #define __builtin_ia32_getmantsd_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantsd_mask_round(A, B, 1, W, U, 4)
-#define __builtin_ia32_getmantss_round(A, B, C, D) __builtin_ia32_getmantss_round(A, B, 1, 4)
 #define __builtin_ia32_getmantss_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantss_mask_round(A, B, 1, W, U, 4)
 #define __builtin_ia32_insertf32x4_mask(A, B, F, D, E) __builtin_ia32_insertf32x4_mask(A, B, 1, D, E)
 #define __builtin_ia32_insertf64x4_mask(A, B, F, D, E) __builtin_ia32_insertf64x4_mask(A, B, 1, D, E)
@@ -267,9 +259,7 @@
 #define __builtin_ia32_inserti64x4_mask(A, B, F, D, E) __builtin_ia32_inserti64x4_mask(A, B, 1, D, E)
 #define __builtin_ia32_maxpd512_mask(A, B, C, D, E) __builtin_ia32_maxpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_maxps512_mask(A, B, C, D, E) __builtin_ia32_maxps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_maxsd_round(A, B, C) __builtin_ia32_maxsd_round(A, B, 4)
 #define __builtin_ia32_maxsd_mask_round(A, B, C, D, E) __builtin_ia32_maxsd_mask_round(A, B, C, D, 4)
-#define __builtin_ia32_maxss_round(A, B, C) __builtin_ia32_maxss_round(A, B, 4)
 #define __builtin_ia32_maxss_mask_round(A, B, C, D, E) __builtin_ia32_maxss_mask_round(A, B, C, D, 4)
 #define __builtin_ia32_minpd512_mask(A, B, C, D, E) __builtin_ia32_minpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_minps512_mask(A, B, C, D, E) __builtin_ia32_minps512_mask(A, B, C, D, 8)
@@ -279,9 +269,7 @@
 #define __builtin_ia32_minss_mask_round(A, B, C, D, E) __builtin_ia32_minss_mask_round(A, B, C, D, 4)
 #define __builtin_ia32_mulpd512_mask(A, B, C, D, E) __builtin_ia32_mulpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_mulps512_mask(A, B, C, D, E) __builtin_ia32_mulps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_mulsd_round(A, B, C) __builtin_ia32_mulsd_round(A, B, 8)
 #define __builtin_ia32_mulsd_mask_round(A, B, C, D, E) __builtin_ia32_mulsd_mask_round(A, B, C, D, 8)
-#define __builtin_ia32_mulss_round(A, B, C) __builtin_ia32_mulss_round(A, B, 8)
 #define __builtin_ia32_mulss_mask_round(A, B, C, D, E) __builtin_ia32_mulss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_permdf512_mask(A, E, C, D) __builtin_ia32_permdf512_mask(A, 1, C, D)
 #define __builtin_ia32_permdi512_mask(A, E, C, D) __builtin_ia32_permdi512_mask(A, 1, C, D)
@@ -328,9 +316,7 @@
 #define __builtin_ia32_sqrtsd_mask_round(A, B, C, D, E) __builtin_ia32_sqrtsd_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_subpd512_mask(A, B, C, D, E) __builtin_ia32_subpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_subps512_mask(A, B, C, D, E) __builtin_ia32_subps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_subsd_round(A, B, C) __builtin_ia32_subsd_round(A, B, 8)
 #define __builtin_ia32_subsd_mask_round(A, B, C, D, E) __builtin_ia32_subsd_mask_round(A, B, C, D, 8)
-#define __builtin_ia32_subss_round(A, B, C) __builtin_ia32_subss_round(A, B, 8)
 #define __builtin_ia32_subss_mask_round(A, B, C, D, E) __builtin_ia32_subss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_ucmpd512_mask(A, B, E, D) __builtin_ia32_ucmpd512_mask(A, B, 1, D)
 #define __builtin_ia32_ucmpq512_mask(A, B, E, D) __builtin_ia32_ucmpq512_mask(A, B, 1, D)
@@ -360,8 +346,6 @@
 #define __builtin_ia32_vfmaddps512_mask(A, B, C, D, E) __builtin_ia32_vfmaddps512_mask(A, B, C, D, 8)
 #define __builtin_ia32_vfmaddps512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddps512_mask3(A, B, C, D, 8)
 #define __builtin_ia32_vfmaddps512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddps512_maskz(A, B, C, D, 8)
-#define __builtin_ia32_vfmaddsd3_round(A, B, C, D) __builtin_ia32_vfmaddsd3_round(A, B, C, 8)
-#define __builtin_ia32_vfmaddss3_round(A, B, C, D) __builtin_ia32_vfmaddss3_round(A, B, C, 8)
 #define __builtin_ia32_vfmaddsubpd512_mask(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_vfmaddsubpd512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_mask3(A, B, C, D, 8)
 #define __builtin_ia32_vfmaddsubpd512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_maskz(A, B, C, D, 8)
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 98f1141a8a4..e102b15ce54 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -7786,7 +7786,7 @@ proc check_effective_target_avx512f { } {
 
 	__m128d _mm128_getmant (__m128d a)
 	{
-	  return __builtin_ia32_getmantsd_round (a, a, 0, 8);
+	  return __builtin_ia32_getmantsd_mask_round (a, a, 0, a, 1, 8);
 	}
     } "-O2 -mavx512f" ]
 }
-- 
2.19.1


Thread overview: 8+ messages
2019-12-24 14:24 [PATCH] Remove redundant builtins for avx512f scalar instructions Hongyu Wang
2020-01-14 20:52 ` Jeff Law
2020-01-15  2:55   ` Hongyu Wang
2020-11-13  5:42 ` Jeff Law
2020-11-13  6:21   ` Hongyu Wang
2020-11-30 16:23     ` Jeff Law
2020-11-30 16:26       ` Jakub Jelinek
2020-11-30 20:51         ` Jeff Law