public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Remove redundant builtins for avx512f scalar instructions.
@ 2019-12-24 14:24 Hongyu Wang
  2020-01-14 20:52 ` Jeff Law
  2020-11-13  5:42 ` Jeff Law
  0 siblings, 2 replies; 8+ messages in thread
From: Hongyu Wang @ 2019-12-24 14:24 UTC (permalink / raw)
  To: jakub, gcc-patches; +Cc: crazylht, hjl.tools

[-- Attachment #1: Type: text/plain, Size: 2864 bytes --]

Hi:
  For avx512f scalar instructions, current builtin function like
__builtin_ia32_*{sd,ss}_round can be replaced by
__builtin_ia32_*{sd,ss}_mask_round with mask parameter set to -1. This
patch did the replacement and remove the corresponding redundant
builtins.

  Bootstrap is ok, make-check ok for i386 target.
  Ok for trunk?

Changelog

gcc/
        * config/i386/avx512fintrin.h
        (_mm_add_round_sd, _mm_add_round_ss): Use
         __builtin_ia32_adds?_mask_round builtins instead of
        __builtin_ia32_adds?_round.
        (_mm_sub_round_sd, _mm_sub_round_ss,
        _mm_mul_round_sd, _mm_mul_round_ss,
        _mm_div_round_sd, _mm_div_round_ss,
        _mm_getexp_sd, _mm_getexp_ss,
        _mm_getexp_round_sd, _mm_getexp_round_ss,
        _mm_getmant_sd, _mm_getmant_ss,
        _mm_getmant_round_sd, _mm_getmant_round_ss,
        _mm_max_round_sd, _mm_max_round_ss,
        _mm_min_round_sd, _mm_min_round_ss,
        _mm_fmadd_round_sd, _mm_fmadd_round_ss,
        _mm_fmsub_round_sd, _mm_fmsub_round_ss,
        _mm_fnmadd_round_sd, _mm_fnmadd_round_ss,
        _mm_fnmsub_round_sd, _mm_fnmsub_round_ss): Likewise.
        * config/i386/i386-builtin.def
        (__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
        __builtin_ia32_subsd_round, __builtin_ia32_subss_round,
        __builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
        __builtin_ia32_divsd_round, __builtin_ia32_divss_round,
        __builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
        __builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
        __builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
        __builtin_ia32_minsd_round, __builtin_ia32_minss_round,
        __builtin_ia32_vfmaddsd3_round,
        __builtin_ia32_vfmaddss3_round): Remove.
        * config/i386/i386-expand.c
        (ix86_expand_round_builtin): Remove corresponding case.

gcc/testsuite/
        * lib/target-supports.exp
        (check_effective_target_avx512f): Use
        __builtin_ia32_getmantsd_mask_round builtins instead of
        __builtin_ia32_getmantsd_round.
        *gcc.target/i386/avx-1.c
        (__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
        __builtin_ia32_subsd_round, __builtin_ia32_subss_round,
        __builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
        __builtin_ia32_divsd_round, __builtin_ia32_divss_round,
        __builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
        __builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
        __builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
        __builtin_ia32_minsd_round, __builtin_ia32_minss_round,
        __builtin_ia32_vfmaddsd3_round,
        __builtin_ia32_vfmaddss3_round): Remove.
        *gcc.target/i386/sse-13.c: Ditto.
        *gcc.target/i386/sse-23.c: Ditto.


Regards,
Hongyu Wang

[-- Attachment #2: 0001-Remove-redundant-round-builtins-for-avx512f-scalar-i.patch --]
[-- Type: text/x-patch, Size: 71267 bytes --]

From 9cc4928aad5770c53ff580f5c996092cdaf2f9ba Mon Sep 17 00:00:00 2001
From: hongyuw1 <hongyuw1@gitlab.devtools.intel.com>
Date: Wed, 18 Dec 2019 14:52:54 +0000
Subject: [PATCH] Remove redundant round builtins for avx512f scalar
 instructions

Changelog

gcc/
	* config/i386/avx512fintrin.h
	(_mm_add_round_sd, _mm_add_round_ss): Use
	 __builtin_ia32_adds?_mask_round builtins instead of
	__builtin_ia32_adds?_round.
	(_mm_sub_round_sd, _mm_sub_round_ss,
	_mm_mul_round_sd, _mm_mul_round_ss,
	_mm_div_round_sd, _mm_div_round_ss,
	_mm_getexp_sd, _mm_getexp_ss,
	_mm_getexp_round_sd, _mm_getexp_round_ss,
	_mm_getmant_sd, _mm_getmant_ss,
	_mm_getmant_round_sd, _mm_getmant_round_ss,
	_mm_max_round_sd, _mm_max_round_ss,
	_mm_min_round_sd, _mm_min_round_ss,
	_mm_fmadd_round_sd, _mm_fmadd_round_ss,
	_mm_fmsub_round_sd, _mm_fmsub_round_ss,
	_mm_fnmadd_round_sd, _mm_fnmadd_round_ss,
	_mm_fnmsub_round_sd, _mm_fnmsub_round_ss): Likewise.
	* config/i386/i386-builtin.def
	(__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
	__builtin_ia32_subsd_round, __builtin_ia32_subss_round,
	__builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
	__builtin_ia32_divsd_round, __builtin_ia32_divss_round,
	__builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
	__builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
	__builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
	__builtin_ia32_minsd_round, __builtin_ia32_minss_round,
	__builtin_ia32_vfmaddsd3_round,
	__builtin_ia32_vfmaddss3_round): Remove.
	* config/i386/i386-expand.c
	(ix86_expand_round_builtin): Remove corresponding case.

gcc/testsuite/
	* lib/target-supports.exp
	(check_effective_target_avx512f): Use
	__builtin_ia32_getmantsd_mask_round builtins instead of
	__builtin_ia32_getmantsd_round.
	*gcc.target/i386/avx-1.c
	(__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
	__builtin_ia32_subsd_round, __builtin_ia32_subss_round,
	__builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
	__builtin_ia32_divsd_round, __builtin_ia32_divss_round,
	__builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
	__builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
	__builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
	__builtin_ia32_minsd_round, __builtin_ia32_minss_round,
	__builtin_ia32_vfmaddsd3_round,
	__builtin_ia32_vfmaddss3_round): Remove.
	*gcc.target/i386/sse-13.c: Ditto.
	*gcc.target/i386/sse-23.c: Ditto.
---
 gcc/config/i386/avx512fintrin.h        | 584 +++++++++++++++++--------
 gcc/config/i386/i386-builtin.def       |  18 -
 gcc/config/i386/i386-expand.c          |   7 -
 gcc/testsuite/gcc.target/i386/avx-1.c  |  18 -
 gcc/testsuite/gcc.target/i386/sse-13.c |  18 -
 gcc/testsuite/gcc.target/i386/sse-23.c |  16 -
 gcc/testsuite/lib/target-supports.exp  |   2 +-
 7 files changed, 404 insertions(+), 259 deletions(-)

diff --git a/gcc/config/i386/avx512fintrin.h b/gcc/config/i386/avx512fintrin.h
index 1d08f01a841..cdb4c948496 100644
--- a/gcc/config/i386/avx512fintrin.h
+++ b/gcc/config/i386/avx512fintrin.h
@@ -1481,9 +1481,12 @@ extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_add_round_sd (__m128d __A, __m128d __B, const int __R)
 {
-  return (__m128d) __builtin_ia32_addsd_round ((__v2df) __A,
-					       (__v2df) __B,
-					       __R);
+  return (__m128d) __builtin_ia32_addsd_mask_round ((__v2df) __A,
+						    (__v2df) __B,
+						    (__v2df)
+						    _mm_undefined_pd (),
+						    (__mmask8) -1,
+						    __R);
 }
 
 extern __inline __m128d
@@ -1513,9 +1516,12 @@ extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_add_round_ss (__m128 __A, __m128 __B, const int __R)
 {
-  return (__m128) __builtin_ia32_addss_round ((__v4sf) __A,
-					      (__v4sf) __B,
-					      __R);
+  return (__m128) __builtin_ia32_addss_mask_round ((__v4sf) __A,
+						   (__v4sf) __B,
+						   (__v4sf)
+						   _mm_undefined_ps (),
+						   (__mmask8) -1,
+						   __R);
 }
 
 extern __inline __m128
@@ -1545,9 +1551,12 @@ extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_sub_round_sd (__m128d __A, __m128d __B, const int __R)
 {
-  return (__m128d) __builtin_ia32_subsd_round ((__v2df) __A,
-					       (__v2df) __B,
-					       __R);
+  return (__m128d) __builtin_ia32_subsd_mask_round ((__v2df) __A,
+						    (__v2df) __B,
+						    (__v2df)
+						    _mm_undefined_pd (),
+						    (__mmask8) -1,
+						    __R);
 }
 
 extern __inline __m128d
@@ -1577,9 +1586,12 @@ extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_sub_round_ss (__m128 __A, __m128 __B, const int __R)
 {
-  return (__m128) __builtin_ia32_subss_round ((__v4sf) __A,
-					      (__v4sf) __B,
-					      __R);
+  return (__m128) __builtin_ia32_subss_mask_round ((__v4sf) __A,
+						   (__v4sf) __B,
+						   (__v4sf)
+						   _mm_undefined_ps (),
+						   (__mmask8) -1,
+						   __R);
 }
 
 extern __inline __m128
@@ -1606,8 +1618,14 @@ _mm_maskz_sub_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
 }
 
 #else
-#define _mm_add_round_sd(A, B, C)            \
-    (__m128d)__builtin_ia32_addsd_round(A, B, C)
+#define _mm_add_round_sd(A, B, C)				\
+  ((__m128d)							\
+   __builtin_ia32_addsd_mask_round ((__v2df) (__m128d) (A),	\
+				    (__v2df) (__m128d) (B),	\
+				    (__v2df) (__m128d)		\
+				    _mm_undefined_pd (),	\
+				    (__mmask8) (-1),		\
+				    (int) (C)))
 
 #define _mm_mask_add_round_sd(W, U, A, B, C) \
     (__m128d)__builtin_ia32_addsd_mask_round(A, B, W, U, C)
@@ -1615,8 +1633,14 @@ _mm_maskz_sub_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
 #define _mm_maskz_add_round_sd(U, A, B, C)   \
     (__m128d)__builtin_ia32_addsd_mask_round(A, B, (__v2df)_mm_setzero_pd(), U, C)
 
-#define _mm_add_round_ss(A, B, C)            \
-    (__m128)__builtin_ia32_addss_round(A, B, C)
+#define _mm_add_round_ss(A, B, C)				\
+  ((__m128)							\
+   __builtin_ia32_addss_mask_round ((__v4sf) (__m128) (A),	\
+				    (__v4sf) (__m128) (B),	\
+				    (__v4sf) (__m128)		\
+				    _mm_undefined_ps (),	\
+				    (__mmask8) (-1),		\
+				    (int) (C)))
 
 #define _mm_mask_add_round_ss(W, U, A, B, C) \
     (__m128)__builtin_ia32_addss_mask_round(A, B, W, U, C)
@@ -1624,8 +1648,14 @@ _mm_maskz_sub_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
 #define _mm_maskz_add_round_ss(U, A, B, C)   \
     (__m128)__builtin_ia32_addss_mask_round(A, B, (__v4sf)_mm_setzero_ps(), U, C)
 
-#define _mm_sub_round_sd(A, B, C)            \
-    (__m128d)__builtin_ia32_subsd_round(A, B, C)
+#define _mm_sub_round_sd(A, B, C)				\
+  ((__m128d)							\
+   __builtin_ia32_subsd_mask_round ((__v2df) (__m128d) (A),	\
+				    (__v2df) (__m128d) (B),	\
+				    (__v2df) (__m128d)		\
+				    _mm_undefined_pd (),	\
+				    (__mmask8) (-1),		\
+				    (int) (C)))
 
 #define _mm_mask_sub_round_sd(W, U, A, B, C) \
     (__m128d)__builtin_ia32_subsd_mask_round(A, B, W, U, C)
@@ -1633,8 +1663,14 @@ _mm_maskz_sub_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
 #define _mm_maskz_sub_round_sd(U, A, B, C)   \
     (__m128d)__builtin_ia32_subsd_mask_round(A, B, (__v2df)_mm_setzero_pd(), U, C)
 
-#define _mm_sub_round_ss(A, B, C)            \
-    (__m128)__builtin_ia32_subss_round(A, B, C)
+#define _mm_sub_round_ss(A, B, C)				\
+  ((__m128)							\
+   __builtin_ia32_subss_mask_round ((__v4sf) (__m128) (A),	\
+				    (__v4sf) (__m128) (B),	\
+				    (__v4sf) (__m128)		\
+				    _mm_undefined_ps (),	\
+				    (__mmask8) (-1),		\
+				    (int) (C)))
 
 #define _mm_mask_sub_round_ss(W, U, A, B, C) \
     (__m128)__builtin_ia32_subss_mask_round(A, B, W, U, C)
@@ -2730,9 +2766,12 @@ extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_mul_round_sd (__m128d __A, __m128d __B, const int __R)
 {
-  return (__m128d) __builtin_ia32_mulsd_round ((__v2df) __A,
-					       (__v2df) __B,
-					       __R);
+  return (__m128d) __builtin_ia32_mulsd_mask_round ((__v2df) __A,
+						    (__v2df) __B,
+						    (__v2df)
+						    _mm_undefined_pd (),
+						    (__mmask8) -1,
+						    __R);
 }
 
 extern __inline __m128d
@@ -2762,9 +2801,12 @@ extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_mul_round_ss (__m128 __A, __m128 __B, const int __R)
 {
-  return (__m128) __builtin_ia32_mulss_round ((__v4sf) __A,
-					      (__v4sf) __B,
-					      __R);
+  return (__m128) __builtin_ia32_mulss_mask_round ((__v4sf) __A,
+						   (__v4sf) __B,
+						   (__v4sf)
+						   _mm_undefined_pd (),
+						   (__mmask8) -1,
+						   __R);
 }
 
 extern __inline __m128
@@ -2794,9 +2836,12 @@ extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_div_round_sd (__m128d __A, __m128d __B, const int __R)
 {
-  return (__m128d) __builtin_ia32_divsd_round ((__v2df) __A,
-					       (__v2df) __B,
-					       __R);
+  return (__m128d) __builtin_ia32_divsd_mask_round ((__v2df) __A,
+						    (__v2df) __B,
+						    (__v2df)
+						    _mm_undefined_pd (),
+						    (__mmask8) -1,
+						    __R);
 }
 
 extern __inline __m128d
@@ -2826,9 +2871,12 @@ extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_div_round_ss (__m128 __A, __m128 __B, const int __R)
 {
-  return (__m128) __builtin_ia32_divss_round ((__v4sf) __A,
-					      (__v4sf) __B,
-					      __R);
+  return (__m128) __builtin_ia32_divss_mask_round ((__v4sf) __A,
+						   (__v4sf) __B,
+						   (__v4sf)
+						   _mm_undefined_pd (),
+						   (__mmask8) -1,
+						   __R);
 }
 
 extern __inline __m128
@@ -2891,8 +2939,14 @@ _mm_maskz_div_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
 #define _mm512_maskz_div_round_ps(U, A, B, C)   \
     (__m512)__builtin_ia32_divps512_mask(A, B, (__v16sf)_mm512_setzero_ps(), U, C)
 
-#define _mm_mul_round_sd(A, B, C)            \
-    (__m128d)__builtin_ia32_mulsd_round(A, B, C)
+#define _mm_mul_round_sd(A, B, C)				\
+  ((__m128d)							\
+   __builtin_ia32_mulsd_mask_round ((__v2df) (__m128d) (A),	\
+				    (__v2df) (__m128d) (B),	\
+				    (__v2df) (__m128d)		\
+				    _mm_undefined_pd (),	\
+				    (__mmask8) (-1),		\
+				    (int) (C)))
 
 #define _mm_mask_mul_round_sd(W, U, A, B, C) \
     (__m128d)__builtin_ia32_mulsd_mask_round(A, B, W, U, C)
@@ -2900,8 +2954,14 @@ _mm_maskz_div_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
 #define _mm_maskz_mul_round_sd(U, A, B, C)   \
     (__m128d)__builtin_ia32_mulsd_mask_round(A, B, (__v2df)_mm_setzero_pd(), U, C)
 
-#define _mm_mul_round_ss(A, B, C)            \
-    (__m128)__builtin_ia32_mulss_round(A, B, C)
+#define _mm_mul_round_ss(A, B, C)				\
+  ((__m128)							\
+   __builtin_ia32_mulss_mask_round ((__v4sf) (__m128) (A),	\
+				    (__v4sf) (__m128) (B),	\
+				    (__v4sf) (__m128)		\
+				    _mm_undefined_ps (),	\
+				    (__mmask8) -1,		\
+				    (int) (C)))
 
 #define _mm_mask_mul_round_ss(W, U, A, B, C) \
     (__m128)__builtin_ia32_mulss_mask_round(A, B, W, U, C)
@@ -2909,8 +2969,14 @@ _mm_maskz_div_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
 #define _mm_maskz_mul_round_ss(U, A, B, C)   \
     (__m128)__builtin_ia32_mulss_mask_round(A, B, (__v4sf)_mm_setzero_ps(), U, C)
 
-#define _mm_div_round_sd(A, B, C)            \
-    (__m128d)__builtin_ia32_divsd_round(A, B, C)
+#define _mm_div_round_sd(A, B, C)				\
+  ((__m128d)							\
+   __builtin_ia32_divsd_mask_round ((__v2df) (__m128d) (A),	\
+				    (__v2df) (__m128d) (B),	\
+				    (__v2df) (__m128d)		\
+				    _mm_undefined_pd (),	\
+				    (__mmask8) (-1),		\
+				    (int) (C)))
 
 #define _mm_mask_div_round_sd(W, U, A, B, C) \
     (__m128d)__builtin_ia32_divsd_mask_round(A, B, W, U, C)
@@ -2918,8 +2984,14 @@ _mm_maskz_div_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
 #define _mm_maskz_div_round_sd(U, A, B, C)   \
     (__m128d)__builtin_ia32_divsd_mask_round(A, B, (__v2df)_mm_setzero_pd(), U, C)
 
-#define _mm_div_round_ss(A, B, C)            \
-    (__m128)__builtin_ia32_divss_round(A, B, C)
+#define _mm_div_round_ss(A, B, C)				\
+  ((__m128)							\
+   __builtin_ia32_divss_mask_round ((__v4sf) (__m128) (A),	\
+				    (__v4sf) (__m128) (B),	\
+				    (__v4sf) (__m128)		\
+				    _mm_undefined_ps (),	\
+				    (__mmask8) -1,		\
+				    (int) (C)))
 
 #define _mm_mask_div_round_ss(W, U, A, B, C) \
     (__m128)__builtin_ia32_divss_mask_round(A, B, W, U, C)
@@ -8703,9 +8775,12 @@ extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_getexp_round_ss (__m128 __A, __m128 __B, const int __R)
 {
-  return (__m128) __builtin_ia32_getexpss128_round ((__v4sf) __A,
-						    (__v4sf) __B,
-						    __R);
+  return (__m128) __builtin_ia32_getexpss_mask_round ((__v4sf) __A,
+						      (__v4sf) __B,
+						      (__v4sf)
+						      _mm_undefined_ps (),
+						      (__mmask8) -1,
+						      __R);
 }
 
 extern __inline __m128
@@ -8735,9 +8810,12 @@ extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_getexp_round_sd (__m128d __A, __m128d __B, const int __R)
 {
-  return (__m128d) __builtin_ia32_getexpsd128_round ((__v2df) __A,
-						     (__v2df) __B,
-						     __R);
+  return (__m128d) __builtin_ia32_getexpsd_mask_round ((__v2df) __A,
+						       (__v2df) __B,
+						       (__v2df)
+						       _mm_undefined_pd (),
+						       (__mmask8) -1,
+						       __R);
 }
 
 extern __inline __m128d
@@ -8901,10 +8979,13 @@ _mm_getmant_round_sd (__m128d __A, __m128d __B,
 		      _MM_MANTISSA_NORM_ENUM __C,
 		      _MM_MANTISSA_SIGN_ENUM __D, const int __R)
 {
-  return (__m128d) __builtin_ia32_getmantsd_round ((__v2df) __A,
-						  (__v2df) __B,
-						  (__D << 2) | __C,
-						   __R);
+  return (__m128d) __builtin_ia32_getmantsd_mask_round ((__v2df) __A,
+							(__v2df) __B,
+							(__D << 2) | __C,
+							(__v2df)
+							_mm_undefined_pd (),
+							(__mmask8) -1,
+							__R);
 }
 
 extern __inline __m128d
@@ -8940,10 +9021,13 @@ _mm_getmant_round_ss (__m128 __A, __m128 __B,
 		      _MM_MANTISSA_NORM_ENUM __C,
 		      _MM_MANTISSA_SIGN_ENUM __D, const int __R)
 {
-  return (__m128) __builtin_ia32_getmantss_round ((__v4sf) __A,
-						  (__v4sf) __B,
-						  (__D << 2) | __C,
-						  __R);
+  return (__m128) __builtin_ia32_getmantss_mask_round ((__v4sf) __A,
+						       (__v4sf) __B,
+						       (__D << 2) | __C,
+						       (__v4sf)
+						       _mm_undefined_ps (),
+						       (__mmask8) -1,
+						       __R);
 }
 
 extern __inline __m128
@@ -9014,11 +9098,15 @@ _mm_maskz_getmant_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
                                              (__v16sf)(__m512)_mm512_setzero_ps(),  \
                                              (__mmask16)(U),\
 					     (R)))
-#define _mm_getmant_round_sd(X, Y, C, D, R)                                                  \
-  ((__m128d)__builtin_ia32_getmantsd_round ((__v2df)(__m128d)(X),                    \
-					    (__v2df)(__m128d)(Y),	\
-					    (int)(((D)<<2) | (C)),	\
-					    (R)))
+#define _mm_getmant_round_sd(X, Y, C, D, R)			\
+  ((__m128d)							\
+   __builtin_ia32_getmantsd_mask_round ((__v2df) (__m128d) (X),\
+					(__v2df) (__m128d) (Y),	\
+					(int) (((D)<<2) | (C)),	\
+					(__v2df) (__m128d)	\
+					_mm_undefined_pd (),	\
+					(__mmask8) (-1),	\
+					(int) (R)))
 
 #define _mm_mask_getmant_round_sd(W, U, X, Y, C, D, R)                                       \
   ((__m128d)__builtin_ia32_getmantsd_mask_round ((__v2df)(__m128d)(X),                  \
@@ -9036,11 +9124,15 @@ _mm_maskz_getmant_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
                                              (__mmask8)(U),\
 					     (R)))
 
-#define _mm_getmant_round_ss(X, Y, C, D, R)                                                  \
-  ((__m128)__builtin_ia32_getmantss_round ((__v4sf)(__m128)(X),                      \
-					   (__v4sf)(__m128)(Y),		\
-					   (int)(((D)<<2) | (C)),	\
-					   (R)))
+#define _mm_getmant_round_ss(X, Y, C, D, R)			\
+  ((__m128)							\
+   __builtin_ia32_getmantss_mask_round ((__v4sf) (__m128) (X),	\
+					(__v4sf) (__m128) (Y),	\
+					(int) (((D)<<2) | (C)),	\
+					(__v4sf) (__m128)	\
+					_mm_undefined_ps (),	\
+					(__mmask8) (-1),	\
+					(int) (R)))
 
 #define _mm_mask_getmant_round_ss(W, U, X, Y, C, D, R)                                       \
   ((__m128)__builtin_ia32_getmantss_mask_round ((__v4sf)(__m128)(X),                  \
@@ -9058,8 +9150,14 @@ _mm_maskz_getmant_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
                                              (__mmask8)(U),\
 					     (R)))
 
-#define _mm_getexp_round_ss(A, B, R)						      \
-  ((__m128)__builtin_ia32_getexpss128_round((__v4sf)(__m128)(A), (__v4sf)(__m128)(B), R))
+#define _mm_getexp_round_ss(A, B, R)				\
+  ((__m128)							\
+   __builtin_ia32_getexpss_mask_round ((__v4sf) (__m128) (A),	\
+				       (__v4sf) (__m128) (B),	\
+				       (__v4sf) (__m128)	\
+				       _mm_undefined_ps (),	\
+				       (__mmask8) (-1),		\
+				       (int) (R)))
 
 #define _mm_mask_getexp_round_ss(W, U, A, B, C) \
     (__m128)__builtin_ia32_getexpss_mask_round(A, B, W, U, C)
@@ -9067,8 +9165,14 @@ _mm_maskz_getmant_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
 #define _mm_maskz_getexp_round_ss(U, A, B, C)   \
     (__m128)__builtin_ia32_getexpss_mask_round(A, B, (__v4sf)_mm_setzero_ps(), U, C)
 
-#define _mm_getexp_round_sd(A, B, R)						       \
-  ((__m128d)__builtin_ia32_getexpsd128_round((__v2df)(__m128d)(A), (__v2df)(__m128d)(B), R))
+#define _mm_getexp_round_sd(A, B, R)				\
+  ((__m128d)							\
+   __builtin_ia32_getexpsd_mask_round ((__v2df) (__m128d) (A),	\
+				       (__v2df) (__m128d) (B),	\
+				       (__v2df) (__m128d)	\
+				       _mm_undefined_pd (),	\
+				       (__mmask8) (-1),		\
+				       (int) (R)))
 
 #define _mm_mask_getexp_round_sd(W, U, A, B, C) \
     (__m128d)__builtin_ia32_getexpsd_mask_round(A, B, W, U, C)
@@ -11392,9 +11496,12 @@ extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_max_round_sd (__m128d __A, __m128d __B, const int __R)
 {
-  return (__m128d) __builtin_ia32_maxsd_round ((__v2df) __A,
-					       (__v2df) __B,
-					       __R);
+  return (__m128d) __builtin_ia32_maxsd_mask_round ((__v2df) __A,
+						    (__v2df) __B,
+						    (__v2df)
+						    _mm_undefined_pd (),
+						    (__mmask8) -1,
+						    __R);
 }
 
 extern __inline __m128d
@@ -11424,9 +11531,12 @@ extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_max_round_ss (__m128 __A, __m128 __B, const int __R)
 {
-  return (__m128) __builtin_ia32_maxss_round ((__v4sf) __A,
-					      (__v4sf) __B,
-					      __R);
+  return (__m128) __builtin_ia32_maxss_mask_round ((__v4sf) __A,
+						   (__v4sf) __B,
+						   (__v4sf)
+						   _mm_undefined_ps (),
+						   (__mmask8) -1,
+						   __R);
 }
 
 extern __inline __m128
@@ -11456,9 +11566,12 @@ extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_min_round_sd (__m128d __A, __m128d __B, const int __R)
 {
-  return (__m128d) __builtin_ia32_minsd_round ((__v2df) __A,
-					       (__v2df) __B,
-					       __R);
+  return (__m128d) __builtin_ia32_minsd_mask_round ((__v2df) __A,
+						    (__v2df) __B,
+						    (__v2df)
+						    _mm_undefined_pd (),
+						    (__mmask8) -1,
+						    __R);
 }
 
 extern __inline __m128d
@@ -11488,9 +11601,12 @@ extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_min_round_ss (__m128 __A, __m128 __B, const int __R)
 {
-  return (__m128) __builtin_ia32_minss_round ((__v4sf) __A,
-					      (__v4sf) __B,
-					      __R);
+  return (__m128) __builtin_ia32_minss_mask_round ((__v4sf) __A,
+						   (__v4sf) __B,
+						   (__v4sf)
+						   _mm_undefined_ps (),
+						   (__mmask8) -1,
+						   __R);
 }
 
 extern __inline __m128
@@ -11517,8 +11633,14 @@ _mm_maskz_min_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
 }
 
 #else
-#define _mm_max_round_sd(A, B, C)            \
-    (__m128d)__builtin_ia32_maxsd_round(A, B, C)
+#define _mm_max_round_sd(A, B, C)				\
+  ((__m128d)							\
+   __builtin_ia32_maxsd_mask_round((__v2df) (__m128d) (A),	\
+				   (__v2df) (__m128d) (B),	\
+				   (__v2df) (__m128d)		\
+				   _mm_undefined_pd (),		\
+				   (__mmask8) (-1),		\
+				   (int) (C)))
 
 #define _mm_mask_max_round_sd(W, U, A, B, C) \
     (__m128d)__builtin_ia32_maxsd_mask_round(A, B, W, U, C)
@@ -11526,8 +11648,14 @@ _mm_maskz_min_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
 #define _mm_maskz_max_round_sd(U, A, B, C)   \
     (__m128d)__builtin_ia32_maxsd_mask_round(A, B, (__v2df)_mm_setzero_pd(), U, C)
 
-#define _mm_max_round_ss(A, B, C)            \
-    (__m128)__builtin_ia32_maxss_round(A, B, C)
+#define _mm_max_round_ss(A, B, C)				\
+  ((__m128)							\
+   __builtin_ia32_maxss_mask_round ((__v4sf) (__m128) (A),	\
+				    (__v4sf) (__m128) (B),	\
+				    (__v4sf) (__m128)		\
+				    _mm_undefined_ps (),	\
+				    (__mmask8) (-1),		\
+				    (int)(C)))
 
 #define _mm_mask_max_round_ss(W, U, A, B, C) \
     (__m128)__builtin_ia32_maxss_mask_round(A, B, W, U, C)
@@ -11535,8 +11663,14 @@ _mm_maskz_min_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
 #define _mm_maskz_max_round_ss(U, A, B, C)   \
     (__m128)__builtin_ia32_maxss_mask_round(A, B, (__v4sf)_mm_setzero_ps(), U, C)
 
-#define _mm_min_round_sd(A, B, C)            \
-    (__m128d)__builtin_ia32_minsd_round(A, B, C)
+#define _mm_min_round_sd(A, B, C)				\
+  ((__m128d)							\
+   __builtin_ia32_minsd_mask_round ((__v2df) (__m128d) (A),	\
+				    (__v2df) (__m128d) (B),	\
+				    (__v2df) (__m128d)		\
+				    _mm_undefined_pd (),	\
+				    (__mmask8) (-1),		\
+				    (int) (C)))
 
 #define _mm_mask_min_round_sd(W, U, A, B, C) \
     (__m128d)__builtin_ia32_minsd_mask_round(A, B, W, U, C)
@@ -11544,8 +11678,14 @@ _mm_maskz_min_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
 #define _mm_maskz_min_round_sd(U, A, B, C)   \
     (__m128d)__builtin_ia32_minsd_mask_round(A, B, (__v2df)_mm_setzero_pd(), U, C)
 
-#define _mm_min_round_ss(A, B, C)            \
-    (__m128)__builtin_ia32_minss_round(A, B, C)
+#define _mm_min_round_ss(A, B, C)				\
+  ((__m128)							\
+   __builtin_ia32_minss_mask_round ((__v4sf) (__m128) (A),	\
+				    (__v4sf) (__m128) (B),	\
+				    (__v4sf) (__m128)		\
+				    _mm_undefined_ps (),	\
+				    (__mmask8) (-1),		\
+				    (int) (C)))
 
 #define _mm_mask_min_round_ss(W, U, A, B, C) \
     (__m128)__builtin_ia32_minss_mask_round(A, B, W, U, C)
@@ -11596,105 +11736,153 @@ extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_fmadd_round_sd (__m128d __W, __m128d __A, __m128d __B, const int __R)
 {
-  return (__m128d) __builtin_ia32_vfmaddsd3_round ((__v2df) __W,
-						   (__v2df) __A,
-						   (__v2df) __B,
-						   __R);
+  return (__m128d) __builtin_ia32_vfmaddsd3_mask ((__v2df) __W,
+						  (__v2df) __A,
+						  (__v2df) __B,
+						  (__mmask8) -1,
+						  __R);
 }
 
 extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_fmadd_round_ss (__m128 __W, __m128 __A, __m128 __B, const int __R)
 {
-  return (__m128) __builtin_ia32_vfmaddss3_round ((__v4sf) __W,
-						  (__v4sf) __A,
-						  (__v4sf) __B,
-						  __R);
+  return (__m128) __builtin_ia32_vfmaddss3_mask ((__v4sf) __W,
+						 (__v4sf) __A,
+						 (__v4sf) __B,
+						 (__mmask8) -1,
+						 __R);
 }
 
 extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_fmsub_round_sd (__m128d __W, __m128d __A, __m128d __B, const int __R)
 {
-  return (__m128d) __builtin_ia32_vfmaddsd3_round ((__v2df) __W,
-						   (__v2df) __A,
-						   -(__v2df) __B,
-						   __R);
-}
+  return (__m128d) __builtin_ia32_vfmaddsd3_mask ((__v2df) __W,
+						  (__v2df) __A,
+						  -(__v2df) __B,
+						  (__mmask8) -1,
+						  __R);
+ }
 
 extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_fmsub_round_ss (__m128 __W, __m128 __A, __m128 __B, const int __R)
 {
-  return (__m128) __builtin_ia32_vfmaddss3_round ((__v4sf) __W,
-						  (__v4sf) __A,
-						  -(__v4sf) __B,
-						  __R);
+  return (__m128) __builtin_ia32_vfmaddss3_mask ((__v4sf) __W,
+						 (__v4sf) __A,
+						 -(__v4sf) __B,
+						 (__mmask8) -1,
+						 __R);
 }
 
 extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_fnmadd_round_sd (__m128d __W, __m128d __A, __m128d __B, const int __R)
 {
-  return (__m128d) __builtin_ia32_vfmaddsd3_round ((__v2df) __W,
-						   -(__v2df) __A,
-						   (__v2df) __B,
-						   __R);
+  return (__m128d) __builtin_ia32_vfmaddsd3_mask ((__v2df) __W,
+						  -(__v2df) __A,
+						  (__v2df) __B,
+						  (__mmask8) -1,
+						  __R);
 }
 
 extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_fnmadd_round_ss (__m128 __W, __m128 __A, __m128 __B, const int __R)
 {
-  return (__m128) __builtin_ia32_vfmaddss3_round ((__v4sf) __W,
-						  -(__v4sf) __A,
-						  (__v4sf) __B,
-						  __R);
-}
+  return (__m128) __builtin_ia32_vfmaddss3_mask ((__v4sf) __W,
+						 -(__v4sf) __A,
+						 (__v4sf) __B,
+						 (__mmask8) -1,
+						 __R);
+ }
 
 extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_fnmsub_round_sd (__m128d __W, __m128d __A, __m128d __B, const int __R)
 {
-  return (__m128d) __builtin_ia32_vfmaddsd3_round ((__v2df) __W,
-						   -(__v2df) __A,
-						   -(__v2df) __B,
-						   __R);
-}
+  return (__m128d) __builtin_ia32_vfmaddsd3_mask ((__v2df) __W,
+						  -(__v2df) __A,
+						  -(__v2df) __B,
+						  (__mmask8) -1,
+						  __R);
+ }
 
 extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_fnmsub_round_ss (__m128 __W, __m128 __A, __m128 __B, const int __R)
 {
-  return (__m128) __builtin_ia32_vfmaddss3_round ((__v4sf) __W,
-						  -(__v4sf) __A,
-						  -(__v4sf) __B,
-						  __R);
+  return (__m128) __builtin_ia32_vfmaddss3_mask ((__v4sf) __W,
+						 -(__v4sf) __A,
+						 -(__v4sf) __B,
+						 (__mmask8) -1,
+						 __R);
 }
 #else
-#define _mm_fmadd_round_sd(A, B, C, R)            \
-    (__m128d)__builtin_ia32_vfmaddsd3_round(A, B, C, R)
-
-#define _mm_fmadd_round_ss(A, B, C, R)            \
-    (__m128)__builtin_ia32_vfmaddss3_round(A, B, C, R)
-
-#define _mm_fmsub_round_sd(A, B, C, R)            \
-    (__m128d)__builtin_ia32_vfmaddsd3_round(A, B, -(C), R)
-
-#define _mm_fmsub_round_ss(A, B, C, R)            \
-    (__m128)__builtin_ia32_vfmaddss3_round(A, B, -(C), R)
-
-#define _mm_fnmadd_round_sd(A, B, C, R)            \
-    (__m128d)__builtin_ia32_vfmaddsd3_round(A, -(B), C, R)
-
-#define _mm_fnmadd_round_ss(A, B, C, R)            \
-   (__m128)__builtin_ia32_vfmaddss3_round(A, -(B), C, R)
-
-#define _mm_fnmsub_round_sd(A, B, C, R)            \
-    (__m128d)__builtin_ia32_vfmaddsd3_round(A, -(B), -(C), R)
-
-#define _mm_fnmsub_round_ss(A, B, C, R)            \
-    (__m128)__builtin_ia32_vfmaddss3_round(A, -(B), -(C), R)
+#define _mm_fmadd_round_sd(A, B, C, R)				\
+  ((__m128d)							\
+   __builtin_ia32_vfmaddsd3_mask ((__v2df) (__m128d) (A),	\
+				  (__v2df) (__m128d) (B),	\
+				  (__v2df) (__m128d) (C),	\
+				  (__mmask8) (-1),		\
+				  (int) (R)))
+
+#define _mm_fmadd_round_ss(A, B, C, R)				\
+  ((__m128)							\
+   __builtin_ia32_vfmaddss3_mask ((__v4sf) (__m128) (A),	\
+				  (__v4sf) (__m128) (B),	\
+				  (__v4sf) (__m128) (C),	\
+				  (__mmask8) (-1),		\
+				  (int) (R)))
+
+#define _mm_fmsub_round_sd(A, B, C, R)				\
+  ((__m128d)							\
+   __builtin_ia32_vfmaddsd3_mask ((__v2df) (__m128d) (A),	\
+				  (__v2df) (__m128d) (B),	\
+				  (__v2df) (__m128d) (-(C)),	\
+				  (__mmask8) (-1),		\
+				  (int) (R)))
+
+#define _mm_fmsub_round_ss(A, B, C, R)				\
+  ((__m128)							\
+   __builtin_ia32_vfmaddss3_mask ((__v4sf) (__m128) (A),	\
+				  (__v4sf) (__m128) (B),	\
+				  (__v4sf) (__m128) (-(C)),	\
+				  (__mmask8) (-1),		\
+				  (int) (R)))
+
+#define _mm_fnmadd_round_sd(A, B, C, R)				\
+  ((__m128d)							\
+   __builtin_ia32_vfmaddsd3_mask ((__v2df) (__m128d) (A),	\
+				  (__v2df) (__m128d) (-(B)),	\
+				  (__v2df) (__m128d) (C),	\
+				  (__mmask8) (-1),		\
+				  (int) (R)))
+
+#define _mm_fnmadd_round_ss(A, B, C, R)				\
+  ((__m128)							\
+   __builtin_ia32_vfmaddss3_mask ((__v4sf) (__m128) (A),	\
+				  (__v4sf) (__m128) (-(B)),	\
+				  (__v4sf) (__m128) (C),	\
+				  (__mmask8) (-1),		\
+				  (int) (R)))
+
+#define _mm_fnmsub_round_sd(A, B, C, R)				\
+  ((__m128d)							\
+   __builtin_ia32_vfmaddsd3_mask ((__v2df) (__m128d) (A),	\
+				  (__v2df) (__m128d) (-(B)),	\
+				  (__v2df) (__m128d) (-(C)),	\
+				  (__mmask8) (-1),		\
+				  (int) (R)))
+
+#define _mm_fnmsub_round_ss(A, B, C, R)				\
+  ((__m128)							\
+   __builtin_ia32_vfmaddss3_mask ((__v4sf) (__m128) (A),	\
+				  (__v4sf) (__m128) (-(B)),	\
+				  (__v4sf) (__m128) (-(C)),	\
+				  (__mmask8) (-1),		\
+				  (int) (R)))
 #endif
 
 extern __inline __m128d
@@ -14504,20 +14692,24 @@ extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_getexp_ss (__m128 __A, __m128 __B)
 {
-  return (__m128) __builtin_ia32_getexpss128_round ((__v4sf) __A,
-						    (__v4sf) __B,
-						    _MM_FROUND_CUR_DIRECTION);
+  return (__m128) __builtin_ia32_getexpss_mask_round ((__v4sf) __A,
+						      (__v4sf) __B,
+						      (__v4sf)
+						      _mm_undefined_ps (),
+						      (__mmask8) -1,
+						      _MM_FROUND_CUR_DIRECTION);
 }
 
 extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_mask_getexp_ss (__m128 __W, __mmask8 __U, __m128 __A, __m128 __B)
 {
-  return (__m128) __builtin_ia32_getexpss_mask_round ((__v4sf) __A,
-						(__v4sf) __B,
-						(__v4sf) __W,
-						(__mmask8) __U,
-						_MM_FROUND_CUR_DIRECTION);
+  return (__m128)
+    __builtin_ia32_getexpss_mask_round ((__v4sf) __A,
+					(__v4sf) __B,
+					(__v4sf) __W,
+					(__mmask8) __U,
+					_MM_FROUND_CUR_DIRECTION);
 }
 
 extern __inline __m128
@@ -14536,9 +14728,13 @@ extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_getexp_sd (__m128d __A, __m128d __B)
 {
-  return (__m128d) __builtin_ia32_getexpsd128_round ((__v2df) __A,
-						     (__v2df) __B,
-						     _MM_FROUND_CUR_DIRECTION);
+  return (__m128d)
+    __builtin_ia32_getexpsd_mask_round ((__v2df) __A,
+					(__v2df) __B,
+					(__v2df)
+					_mm_undefined_pd (),
+					(__mmask8) -1,
+					_MM_FROUND_CUR_DIRECTION);
 }
 
 extern __inline __m128d
@@ -14641,10 +14837,14 @@ __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_getmant_sd (__m128d __A, __m128d __B, _MM_MANTISSA_NORM_ENUM __C,
 		_MM_MANTISSA_SIGN_ENUM __D)
 {
-  return (__m128d) __builtin_ia32_getmantsd_round ((__v2df) __A,
-						   (__v2df) __B,
-						   (__D << 2) | __C,
-						   _MM_FROUND_CUR_DIRECTION);
+  return (__m128d)
+    __builtin_ia32_getmantsd_mask_round ((__v2df) __A,
+					 (__v2df) __B,
+					 (__D << 2) | __C,
+					 (__v2df)
+					 _mm_undefined_pd (),
+					 (__mmask8) -1,
+					 _MM_FROUND_CUR_DIRECTION);
 }
 
 extern __inline __m128d
@@ -14679,10 +14879,14 @@ __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_getmant_ss (__m128 __A, __m128 __B, _MM_MANTISSA_NORM_ENUM __C,
 		_MM_MANTISSA_SIGN_ENUM __D)
 {
-  return (__m128) __builtin_ia32_getmantss_round ((__v4sf) __A,
-						  (__v4sf) __B,
-						  (__D << 2) | __C,
-						  _MM_FROUND_CUR_DIRECTION);
+  return (__m128)
+    __builtin_ia32_getmantss_mask_round ((__v4sf) __A,
+					 (__v4sf) __B,
+					 (__D << 2) | __C,
+					 (__v4sf)
+					 _mm_undefined_ps (),
+					 (__mmask8) -1,
+					 _MM_FROUND_CUR_DIRECTION);
 }
 
 extern __inline __m128
@@ -14753,11 +14957,15 @@ _mm_maskz_getmant_ss (__mmask8 __U, __m128 __A, __m128 __B,
                                              (__v16sf)_mm512_setzero_ps(),          \
                                              (__mmask16)(U),\
 					     _MM_FROUND_CUR_DIRECTION))
-#define _mm_getmant_sd(X, Y, C, D)                                                  \
-  ((__m128d)__builtin_ia32_getmantsd_round ((__v2df)(__m128d)(X),                    \
-                                           (__v2df)(__m128d)(Y),                    \
-                                           (int)(((D)<<2) | (C)),                   \
-					   _MM_FROUND_CUR_DIRECTION))
+#define _mm_getmant_sd(X, Y, C, D)					\
+  ((__m128d)								\
+   __builtin_ia32_getmantsd_mask_round ((__v2df) (__m128d) (X),	\
+					(__v2df) (__m128d) (Y),		\
+					(int) (((D)<<2) | (C)),		\
+					(__v2df) (__m128d)		\
+					_mm_undefined_pd (),		\
+					(__mmask8) (-1),		\
+					_MM_FROUND_CUR_DIRECTION))
 
 #define _mm_mask_getmant_sd(W, U, X, Y, C, D)                                       \
   ((__m128d)__builtin_ia32_getmantsd_mask_round ((__v2df)(__m128d)(X),                 \
@@ -14775,11 +14983,15 @@ _mm_maskz_getmant_ss (__mmask8 __U, __m128 __A, __m128 __B,
                                               (__mmask8)(U),\
 					      _MM_FROUND_CUR_DIRECTION))
 
-#define _mm_getmant_ss(X, Y, C, D)                                                  \
-  ((__m128)__builtin_ia32_getmantss_round ((__v4sf)(__m128)(X),                      \
-                                          (__v4sf)(__m128)(Y),                      \
-                                          (int)(((D)<<2) | (C)),                    \
-					  _MM_FROUND_CUR_DIRECTION))
+#define _mm_getmant_ss(X, Y, C, D)					\
+  ((__m128)								\
+   __builtin_ia32_getmantss_mask_round ((__v4sf) (__m128) (X),		\
+					(__v4sf) (__m128) (Y),		\
+					(int) (((D)<<2) | (C)),		\
+					(__v4sf) (__m128)		\
+					_mm_undefined_ps (),		\
+					(__mmask8) (-1),		\
+					_MM_FROUND_CUR_DIRECTION))
 
 #define _mm_mask_getmant_ss(W, U, X, Y, C, D)                                       \
   ((__m128)__builtin_ia32_getmantss_mask_round ((__v4sf)(__m128)(X),                 \
@@ -14797,9 +15009,14 @@ _mm_maskz_getmant_ss (__mmask8 __U, __m128 __A, __m128 __B,
                                               (__mmask8)(U),\
 					      _MM_FROUND_CUR_DIRECTION))
 
-#define _mm_getexp_ss(A, B)						      \
-  ((__m128)__builtin_ia32_getexpss128_round((__v4sf)(__m128)(A), (__v4sf)(__m128)(B),  \
-					   _MM_FROUND_CUR_DIRECTION))
+#define _mm_getexp_ss(A, B)						\
+  ((__m128)								\
+   __builtin_ia32_getexpss_mask_round ((__v4sf) (__m128) (A),		\
+				       (__v4sf) (__m128) (B),		\
+				       (__v4sf) (__m128)		\
+				       _mm_undefined_ps (),		\
+				       (__mmask8) (-1),			\
+				       _MM_FROUND_CUR_DIRECTION))
 
 #define _mm_mask_getexp_ss(W, U, A, B) \
     (__m128)__builtin_ia32_getexpss_mask_round(A, B, W, U,\
@@ -14809,9 +15026,14 @@ _mm_maskz_getmant_ss (__mmask8 __U, __m128 __A, __m128 __B,
     (__m128)__builtin_ia32_getexpss_mask_round(A, B, (__v4sf)_mm_setzero_ps(), U,\
 					      _MM_FROUND_CUR_DIRECTION)
 
-#define _mm_getexp_sd(A, B)						       \
-  ((__m128d)__builtin_ia32_getexpsd128_round((__v2df)(__m128d)(A), (__v2df)(__m128d)(B),\
-					    _MM_FROUND_CUR_DIRECTION))
+#define _mm_getexp_sd(A, B)						\
+  ((__m128d)								\
+   __builtin_ia32_getexpsd_mask_round ((__v2df) (__m128d) (A),		\
+				       (__v2df) (__m128d)(B),		\
+				       (__v2df) (__m128d)		\
+				       _mm_undefined_pd (),		\
+				       (__mmask8) (-1),			\
+				       _MM_FROUND_CUR_DIRECTION))
 
 #define _mm_mask_getexp_sd(W, U, A, B) \
     (__m128d)__builtin_ia32_getexpsd_mask_round(A, B, W, U,\
diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
index a6500f9d9b5..c2039e9d112 100644
--- a/gcc/config/i386/i386-builtin.def
+++ b/gcc/config/i386/i386-builtin.def
@@ -2751,9 +2751,7 @@ BDESC_END (ARGS, ROUND_ARGS)
 BDESC_FIRST (round_args, ROUND_ARGS,
        OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_addv8df3_mask_round, "__builtin_ia32_addpd512_mask", IX86_BUILTIN_ADDPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_addv16sf3_mask_round, "__builtin_ia32_addps512_mask", IX86_BUILTIN_ADDPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmaddv2df3_round, "__builtin_ia32_addsd_round", IX86_BUILTIN_ADDSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmaddv2df3_mask_round, "__builtin_ia32_addsd_mask_round", IX86_BUILTIN_ADDSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmaddv4sf3_round, "__builtin_ia32_addss_round", IX86_BUILTIN_ADDSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmaddv4sf3_mask_round, "__builtin_ia32_addss_mask_round", IX86_BUILTIN_ADDSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_cmpv8df3_mask_round, "__builtin_ia32_cmppd512_mask", IX86_BUILTIN_CMPPD512, UNKNOWN, (int) UQI_FTYPE_V8DF_V8DF_INT_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_cmpv16sf3_mask_round, "__builtin_ia32_cmpps512_mask", IX86_BUILTIN_CMPPS512, UNKNOWN, (int) UHI_FTYPE_V16SF_V16SF_INT_UHI_INT)
@@ -2784,9 +2782,7 @@ BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_cvtusi2ss32_round, "__builtin_ia32_c
 BDESC (OPTION_MASK_ISA_AVX512F | OPTION_MASK_ISA_64BIT, 0, CODE_FOR_cvtusi2ss64_round, "__builtin_ia32_cvtusi2ss64", IX86_BUILTIN_CVTUSI2SS64, UNKNOWN, (int) V4SF_FTYPE_V4SF_UINT64_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_divv8df3_mask_round, "__builtin_ia32_divpd512_mask", IX86_BUILTIN_DIVPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_divv16sf3_mask_round, "__builtin_ia32_divps512_mask", IX86_BUILTIN_DIVPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmdivv2df3_round, "__builtin_ia32_divsd_round", IX86_BUILTIN_DIVSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmdivv2df3_mask_round, "__builtin_ia32_divsd_mask_round", IX86_BUILTIN_DIVSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmdivv4sf3_round, "__builtin_ia32_divss_round", IX86_BUILTIN_DIVSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmdivv4sf3_mask_round, "__builtin_ia32_divss_mask_round", IX86_BUILTIN_DIVSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_fixupimmv8df_mask_round, "__builtin_ia32_fixupimmpd512_mask", IX86_BUILTIN_FIXUPIMMPD512_MASK, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DI_INT_QI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_fixupimmv8df_maskz_round, "__builtin_ia32_fixupimmpd512_maskz", IX86_BUILTIN_FIXUPIMMPD512_MASKZ, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DI_INT_QI_INT)
@@ -2798,33 +2794,23 @@ BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_sfixupimmv4sf_mask_round, "_
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_sfixupimmv4sf_maskz_round, "__builtin_ia32_fixupimmss_maskz", IX86_BUILTIN_FIXUPIMMSS128_MASKZ, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SI_INT_QI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_getexpv8df_mask_round, "__builtin_ia32_getexppd512_mask", IX86_BUILTIN_GETEXPPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_QI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_getexpv16sf_mask_round, "__builtin_ia32_getexpps512_mask", IX86_BUILTIN_GETEXPPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_HI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_sgetexpv2df_round, "__builtin_ia32_getexpsd128_round", IX86_BUILTIN_GETEXPSD128, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_sgetexpv2df_mask_round, "__builtin_ia32_getexpsd_mask_round", IX86_BUILTIN_GETEXPSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_sgetexpv4sf_round, "__builtin_ia32_getexpss128_round", IX86_BUILTIN_GETEXPSS128, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_sgetexpv4sf_mask_round, "__builtin_ia32_getexpss_mask_round", IX86_BUILTIN_GETEXPSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_getmantv8df_mask_round, "__builtin_ia32_getmantpd512_mask", IX86_BUILTIN_GETMANTPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_INT_V8DF_QI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_getmantv16sf_mask_round, "__builtin_ia32_getmantps512_mask", IX86_BUILTIN_GETMANTPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_INT_V16SF_HI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vgetmantv2df_round, "__builtin_ia32_getmantsd_round", IX86_BUILTIN_GETMANTSD128, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vgetmantv2df_mask_round, "__builtin_ia32_getmantsd_mask_round", IX86_BUILTIN_GETMANTSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT_V2DF_UQI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vgetmantv4sf_round, "__builtin_ia32_getmantss_round", IX86_BUILTIN_GETMANTSS128, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vgetmantv4sf_mask_round, "__builtin_ia32_getmantss_mask_round", IX86_BUILTIN_GETMANTSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT_V4SF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_smaxv8df3_mask_round, "__builtin_ia32_maxpd512_mask", IX86_BUILTIN_MAXPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_smaxv16sf3_mask_round, "__builtin_ia32_maxps512_mask", IX86_BUILTIN_MAXPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsmaxv2df3_round, "__builtin_ia32_maxsd_round", IX86_BUILTIN_MAXSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsmaxv2df3_mask_round, "__builtin_ia32_maxsd_mask_round", IX86_BUILTIN_MAXSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsmaxv4sf3_round, "__builtin_ia32_maxss_round", IX86_BUILTIN_MAXSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsmaxv4sf3_mask_round, "__builtin_ia32_maxss_mask_round", IX86_BUILTIN_MAXSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sminv8df3_mask_round, "__builtin_ia32_minpd512_mask", IX86_BUILTIN_MINPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sminv16sf3_mask_round, "__builtin_ia32_minps512_mask", IX86_BUILTIN_MINPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsminv2df3_round, "__builtin_ia32_minsd_round", IX86_BUILTIN_MINSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsminv2df3_mask_round, "__builtin_ia32_minsd_mask_round", IX86_BUILTIN_MINSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsminv4sf3_round, "__builtin_ia32_minss_round", IX86_BUILTIN_MINSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsminv4sf3_mask_round, "__builtin_ia32_minss_mask_round", IX86_BUILTIN_MINSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_mulv8df3_mask_round, "__builtin_ia32_mulpd512_mask", IX86_BUILTIN_MULPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_mulv16sf3_mask_round, "__builtin_ia32_mulps512_mask", IX86_BUILTIN_MULPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmmulv2df3_round, "__builtin_ia32_mulsd_round", IX86_BUILTIN_MULSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmmulv2df3_mask_round, "__builtin_ia32_mulsd_mask_round", IX86_BUILTIN_MULSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmmulv4sf3_round, "__builtin_ia32_mulss_round", IX86_BUILTIN_MULSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmmulv4sf3_mask_round, "__builtin_ia32_mulss_mask_round", IX86_BUILTIN_MULSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_rndscalev8df_mask_round, "__builtin_ia32_rndscalepd_mask", IX86_BUILTIN_RNDSCALEPD, UNKNOWN, (int) V8DF_FTYPE_V8DF_INT_V8DF_QI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_rndscalev16sf_mask_round, "__builtin_ia32_rndscaleps_mask", IX86_BUILTIN_RNDSCALEPS, UNKNOWN, (int) V16SF_FTYPE_V16SF_INT_V16SF_HI_INT)
@@ -2840,9 +2826,7 @@ BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsqrtv2df2_mask_round, "__buil
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsqrtv4sf2_mask_round, "__builtin_ia32_sqrtss_mask_round", IX86_BUILTIN_SQRTSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_subv8df3_mask_round, "__builtin_ia32_subpd512_mask", IX86_BUILTIN_SUBPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_subv16sf3_mask_round, "__builtin_ia32_subps512_mask", IX86_BUILTIN_SUBPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsubv2df3_round, "__builtin_ia32_subsd_round", IX86_BUILTIN_SUBSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsubv2df3_mask_round, "__builtin_ia32_subsd_mask_round", IX86_BUILTIN_SUBSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsubv4sf3_round, "__builtin_ia32_subss_round", IX86_BUILTIN_SUBSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsubv4sf3_mask_round, "__builtin_ia32_subss_mask_round", IX86_BUILTIN_SUBSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_cvtsd2si_round, "__builtin_ia32_vcvtsd2si32", IX86_BUILTIN_VCVTSD2SI32, UNKNOWN, (int) INT_FTYPE_V2DF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F | OPTION_MASK_ISA_64BIT, 0, CODE_FOR_sse2_cvtsd2siq_round, "__builtin_ia32_vcvtsd2si64", IX86_BUILTIN_VCVTSD2SI64, UNKNOWN, (int) INT64_FTYPE_V2DF_INT)
@@ -2866,8 +2850,6 @@ BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_fmadd_v8df_maskz_round, "__b
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_fmadd_v16sf_mask_round, "__builtin_ia32_vfmaddps512_mask", IX86_BUILTIN_VFMADDPS512_MASK, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_fmadd_v16sf_mask3_round, "__builtin_ia32_vfmaddps512_mask3", IX86_BUILTIN_VFMADDPS512_MASK3, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_fmadd_v16sf_maskz_round, "__builtin_ia32_vfmaddps512_maskz", IX86_BUILTIN_VFMADDPS512_MASKZ, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_fmai_vmfmadd_v2df_round, "__builtin_ia32_vfmaddsd3_round", IX86_BUILTIN_VFMADDSD3_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_INT)
-BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_fmai_vmfmadd_v4sf_round, "__builtin_ia32_vfmaddss3_round", IX86_BUILTIN_VFMADDSS3_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vmfmadd_v2df_mask_round, "__builtin_ia32_vfmaddsd3_mask", IX86_BUILTIN_VFMADDSD3_MASK, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vmfmadd_v2df_mask3_round, "__builtin_ia32_vfmaddsd3_mask3", IX86_BUILTIN_VFMADDSD3_MASK3, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vmfmadd_v2df_maskz_round, "__builtin_ia32_vfmaddsd3_maskz", IX86_BUILTIN_VFMADDSD3_MASKZ, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index cbf4eb7b487..66bf9be5bd4 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -10193,13 +10193,6 @@ ix86_expand_round_builtin (const struct builtin_description *d,
     case V16SI_FTYPE_V16SF_V16SI_HI_INT:
     case V8DF_FTYPE_V8SF_V8DF_QI_INT:
     case V16SF_FTYPE_V16HI_V16SF_HI_INT:
-    case V2DF_FTYPE_V2DF_V2DF_V2DF_INT:
-    case V4SF_FTYPE_V4SF_V4SF_V4SF_INT:
-      nargs = 4;
-      break;
-    case V4SF_FTYPE_V4SF_V4SF_INT_INT:
-    case V2DF_FTYPE_V2DF_V2DF_INT_INT:
-      nargs_constant = 2;
       nargs = 4;
       break;
     case INT_FTYPE_V4SF_V4SF_INT_INT:
diff --git a/gcc/testsuite/gcc.target/i386/avx-1.c b/gcc/testsuite/gcc.target/i386/avx-1.c
index 3600a7abe91..0e00bfbbb5e 100644
--- a/gcc/testsuite/gcc.target/i386/avx-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx-1.c
@@ -172,9 +172,7 @@
 #define __builtin_ia32_kshiftrihi(A, B) __builtin_ia32_kshiftrihi(A, 8)
 #define __builtin_ia32_addpd512_mask(A, B, C, D, E) __builtin_ia32_addpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_addps512_mask(A, B, C, D, E) __builtin_ia32_addps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_addsd_round(A, B, C) __builtin_ia32_addsd_round(A, B, 8)
 #define __builtin_ia32_addsd_mask_round(A, B, C, D, E) __builtin_ia32_addsd_mask_round(A, B, C, D, 8)
-#define __builtin_ia32_addss_round(A, B, C) __builtin_ia32_addss_round(A, B, 8)
 #define __builtin_ia32_addss_mask_round(A, B, C, D, E) __builtin_ia32_addss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_alignd512_mask(A, B, F, D, E) __builtin_ia32_alignd512_mask(A, B, 1, D, E)
 #define __builtin_ia32_alignq512_mask(A, B, F, D, E) __builtin_ia32_alignq512_mask(A, B, 1, D, E)
@@ -206,9 +204,7 @@
 #define __builtin_ia32_cvtusi2ss64(A, B, C) __builtin_ia32_cvtusi2ss64(A, B, 8)
 #define __builtin_ia32_divpd512_mask(A, B, C, D, E) __builtin_ia32_divpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_divps512_mask(A, B, C, D, E) __builtin_ia32_divps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_divsd_round(A, B, C) __builtin_ia32_divsd_round(A, B, 8)
 #define __builtin_ia32_divsd_mask_round(A, B, C, D, E) __builtin_ia32_divsd_mask_round(A, B, C, D, 8)
-#define __builtin_ia32_divss_round(A, B, C) __builtin_ia32_divss_round(A, B, 8)
 #define __builtin_ia32_divss_mask_round(A, B, C, D, E) __builtin_ia32_divss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_extractf32x4_mask(A, E, C, D) __builtin_ia32_extractf32x4_mask(A, 1, C, D)
 #define __builtin_ia32_extractf64x4_mask(A, E, C, D) __builtin_ia32_extractf64x4_mask(A, 1, C, D)
@@ -232,15 +228,11 @@
 #define __builtin_ia32_gathersiv8di(A, B, C, D, F) __builtin_ia32_gathersiv8di(A, B, C, D, 8)
 #define __builtin_ia32_getexppd512_mask(A, B, C, D) __builtin_ia32_getexppd512_mask(A, B, C, 8)
 #define __builtin_ia32_getexpps512_mask(A, B, C, D) __builtin_ia32_getexpps512_mask(A, B, C, 8)
-#define __builtin_ia32_getexpsd128_round(A, B, C) __builtin_ia32_getexpsd128_round(A, B, 4)
 #define __builtin_ia32_getexpsd_mask_round(A, B, C, D, E) __builtin_ia32_getexpsd_mask_round(A, B, C, D, 4)
-#define __builtin_ia32_getexpss128_round(A, B, C) __builtin_ia32_getexpss128_round(A, B, 4)
 #define __builtin_ia32_getexpss_mask_round(A, B, C, D, E) __builtin_ia32_getexpss_mask_round(A, B, C, D, 4)
 #define __builtin_ia32_getmantpd512_mask(A, F, C, D, E) __builtin_ia32_getmantpd512_mask(A, 1, C, D, 8)
 #define __builtin_ia32_getmantps512_mask(A, F, C, D, E) __builtin_ia32_getmantps512_mask(A, 1, C, D, 8)
-#define __builtin_ia32_getmantsd_round(A, B, C, D) __builtin_ia32_getmantsd_round(A, B, 1, 4)
 #define __builtin_ia32_getmantsd_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantsd_mask_round(A, B, 1, W, U, 4)
-#define __builtin_ia32_getmantss_round(A, B, C, D) __builtin_ia32_getmantss_round(A, B, 1, 4)
 #define __builtin_ia32_getmantss_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantss_mask_round(A, B, 1, W, U, 4)
 #define __builtin_ia32_insertf32x4_mask(A, B, F, D, E) __builtin_ia32_insertf32x4_mask(A, B, 1, D, E)
 #define __builtin_ia32_insertf64x4_mask(A, B, F, D, E) __builtin_ia32_insertf64x4_mask(A, B, 1, D, E)
@@ -248,21 +240,15 @@
 #define __builtin_ia32_inserti64x4_mask(A, B, F, D, E) __builtin_ia32_inserti64x4_mask(A, B, 1, D, E)
 #define __builtin_ia32_maxpd512_mask(A, B, C, D, E) __builtin_ia32_maxpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_maxps512_mask(A, B, C, D, E) __builtin_ia32_maxps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_maxsd_round(A, B, C) __builtin_ia32_maxsd_round(A, B, 4)
 #define __builtin_ia32_maxsd_mask_round(A, B, C, D, E) __builtin_ia32_maxsd_mask_round(A, B, C, D, 8)
-#define __builtin_ia32_maxss_round(A, B, C) __builtin_ia32_maxss_round(A, B, 4)
 #define __builtin_ia32_maxss_mask_round(A, B, C, D, E) __builtin_ia32_maxss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_minpd512_mask(A, B, C, D, E) __builtin_ia32_minpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_minps512_mask(A, B, C, D, E) __builtin_ia32_minps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_minsd_round(A, B, C) __builtin_ia32_minsd_round(A, B, 4)
 #define __builtin_ia32_minsd_mask_round(A, B, C, D, E) __builtin_ia32_minsd_mask_round(A, B, C, D, 4)
-#define __builtin_ia32_minss_round(A, B, C) __builtin_ia32_minss_round(A, B, 4)
 #define __builtin_ia32_minss_mask_round(A, B, C, D, E) __builtin_ia32_minss_mask_round(A, B, C, D, 4)
 #define __builtin_ia32_mulpd512_mask(A, B, C, D, E) __builtin_ia32_mulpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_mulps512_mask(A, B, C, D, E) __builtin_ia32_mulps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_mulsd_round(A, B, C) __builtin_ia32_mulsd_round(A, B, 8)
 #define __builtin_ia32_mulsd_mask_round(A, B, C, D, E) __builtin_ia32_mulsd_mask_round(A, B, C, D, 8)
-#define __builtin_ia32_mulss_round(A, B, C) __builtin_ia32_mulss_round(A, B, 8)
 #define __builtin_ia32_mulss_mask_round(A, B, C, D, E) __builtin_ia32_mulss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_permdf512_mask(A, E, C, D) __builtin_ia32_permdf512_mask(A, 1, C, D)
 #define __builtin_ia32_permdi512_mask(A, E, C, D) __builtin_ia32_permdi512_mask(A, 1, C, D)
@@ -309,9 +295,7 @@
 #define __builtin_ia32_sqrtsd_mask_round(A, B, C, D, E) __builtin_ia32_sqrtsd_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_subpd512_mask(A, B, C, D, E) __builtin_ia32_subpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_subps512_mask(A, B, C, D, E) __builtin_ia32_subps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_subsd_round(A, B, C) __builtin_ia32_subsd_round(A, B, 8)
 #define __builtin_ia32_subsd_mask_round(A, B, C, D, E) __builtin_ia32_subsd_mask_round(A, B, C, D, 8)
-#define __builtin_ia32_subss_round(A, B, C) __builtin_ia32_subss_round(A, B, 8)
 #define __builtin_ia32_subss_mask_round(A, B, C, D, E) __builtin_ia32_subss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_ucmpd512_mask(A, B, E, D) __builtin_ia32_ucmpd512_mask(A, B, 1, D)
 #define __builtin_ia32_ucmpq512_mask(A, B, E, D) __builtin_ia32_ucmpq512_mask(A, B, 1, D)
@@ -341,8 +325,6 @@
 #define __builtin_ia32_vfmaddps512_mask(A, B, C, D, E) __builtin_ia32_vfmaddps512_mask(A, B, C, D, 8)
 #define __builtin_ia32_vfmaddps512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddps512_mask3(A, B, C, D, 8)
 #define __builtin_ia32_vfmaddps512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddps512_maskz(A, B, C, D, 8)
-#define __builtin_ia32_vfmaddsd3_round(A, B, C, D) __builtin_ia32_vfmaddsd3_round(A, B, C, 8)
-#define __builtin_ia32_vfmaddss3_round(A, B, C, D) __builtin_ia32_vfmaddss3_round(A, B, C, 8)
 #define __builtin_ia32_vfmaddsubpd512_mask(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_vfmaddsubpd512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_mask3(A, B, C, D, 8)
 #define __builtin_ia32_vfmaddsubpd512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_maskz(A, B, C, D, 8)
diff --git a/gcc/testsuite/gcc.target/i386/sse-13.c b/gcc/testsuite/gcc.target/i386/sse-13.c
index 45c1c285c57..fdb7852f0b3 100644
--- a/gcc/testsuite/gcc.target/i386/sse-13.c
+++ b/gcc/testsuite/gcc.target/i386/sse-13.c
@@ -189,9 +189,7 @@
 #define __builtin_ia32_kshiftrihi(A, B) __builtin_ia32_kshiftrihi(A, 8)
 #define __builtin_ia32_addpd512_mask(A, B, C, D, E) __builtin_ia32_addpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_addps512_mask(A, B, C, D, E) __builtin_ia32_addps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_addsd_round(A, B, C) __builtin_ia32_addsd_round(A, B, 8)
 #define __builtin_ia32_addsd_mask_round(A, B, C, D, E) __builtin_ia32_addsd_mask_round(A, B, C, D, 8)
-#define __builtin_ia32_addss_round(A, B, C) __builtin_ia32_addss_round(A, B, 8)
 #define __builtin_ia32_addss_mask_round(A, B, C, D, E) __builtin_ia32_addss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_alignd512_mask(A, B, F, D, E) __builtin_ia32_alignd512_mask(A, B, 1, D, E)
 #define __builtin_ia32_alignq512_mask(A, B, F, D, E) __builtin_ia32_alignq512_mask(A, B, 1, D, E)
@@ -223,9 +221,7 @@
 #define __builtin_ia32_cvtusi2ss64(A, B, C) __builtin_ia32_cvtusi2ss64(A, B, 8)
 #define __builtin_ia32_divpd512_mask(A, B, C, D, E) __builtin_ia32_divpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_divps512_mask(A, B, C, D, E) __builtin_ia32_divps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_divsd_round(A, B, C) __builtin_ia32_divsd_round(A, B, 8)
 #define __builtin_ia32_divsd_mask_round(A, B, C, D, E) __builtin_ia32_divsd_mask_round(A, B, C, D, 8)
-#define __builtin_ia32_divss_round(A, B, C) __builtin_ia32_divss_round(A, B, 8)
 #define __builtin_ia32_divss_mask_round(A, B, C, D, E) __builtin_ia32_divss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_extractf32x4_mask(A, E, C, D) __builtin_ia32_extractf32x4_mask(A, 1, C, D)
 #define __builtin_ia32_extractf64x4_mask(A, E, C, D) __builtin_ia32_extractf64x4_mask(A, 1, C, D)
@@ -249,15 +245,11 @@
 #define __builtin_ia32_gathersiv8di(A, B, C, D, F) __builtin_ia32_gathersiv8di(A, B, C, D, 8)
 #define __builtin_ia32_getexppd512_mask(A, B, C, D) __builtin_ia32_getexppd512_mask(A, B, C, 8)
 #define __builtin_ia32_getexpps512_mask(A, B, C, D) __builtin_ia32_getexpps512_mask(A, B, C, 8)
-#define __builtin_ia32_getexpsd128_round(A, B, C) __builtin_ia32_getexpsd128_round(A, B, 4)
 #define __builtin_ia32_getexpsd_mask_round(A, B, C, D, E) __builtin_ia32_getexpsd_mask_round(A, B, C, D, 4)
-#define __builtin_ia32_getexpss128_round(A, B, C) __builtin_ia32_getexpss128_round(A, B, 4)
 #define __builtin_ia32_getexpss_mask_round(A, B, C, D, E) __builtin_ia32_getexpss_mask_round(A, B, C, D, 4)
 #define __builtin_ia32_getmantpd512_mask(A, F, C, D, E) __builtin_ia32_getmantpd512_mask(A, 1, C, D, 8)
 #define __builtin_ia32_getmantps512_mask(A, F, C, D, E) __builtin_ia32_getmantps512_mask(A, 1, C, D, 8)
-#define __builtin_ia32_getmantsd_round(A, B, C, D) __builtin_ia32_getmantsd_round(A, B, 1, 4)
 #define __builtin_ia32_getmantsd_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantsd_mask_round(A, B, 1, W, U, 4)
-#define __builtin_ia32_getmantss_round(A, B, C, D) __builtin_ia32_getmantss_round(A, B, 1, 4)
 #define __builtin_ia32_getmantss_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantss_mask_round(A, B, 1, W, U, 4)
 #define __builtin_ia32_insertf32x4_mask(A, B, F, D, E) __builtin_ia32_insertf32x4_mask(A, B, 1, D, E)
 #define __builtin_ia32_insertf64x4_mask(A, B, F, D, E) __builtin_ia32_insertf64x4_mask(A, B, 1, D, E)
@@ -265,21 +257,15 @@
 #define __builtin_ia32_inserti64x4_mask(A, B, F, D, E) __builtin_ia32_inserti64x4_mask(A, B, 1, D, E)
 #define __builtin_ia32_maxpd512_mask(A, B, C, D, E) __builtin_ia32_maxpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_maxps512_mask(A, B, C, D, E) __builtin_ia32_maxps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_maxsd_round(A, B, C) __builtin_ia32_maxsd_round(A, B, 4)
 #define __builtin_ia32_maxsd_mask_round(A, B, C, D, E) __builtin_ia32_maxsd_mask_round(A, B, C, D, 4)
-#define __builtin_ia32_maxss_round(A, B, C) __builtin_ia32_maxss_round(A, B, 4)
 #define __builtin_ia32_maxss_mask_round(A, B, C, D, E) __builtin_ia32_maxss_mask_round(A, B, C, D, 4)
 #define __builtin_ia32_minpd512_mask(A, B, C, D, E) __builtin_ia32_minpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_minps512_mask(A, B, C, D, E) __builtin_ia32_minps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_minsd_round(A, B, C) __builtin_ia32_minsd_round(A, B, 4)
 #define __builtin_ia32_minsd_mask_round(A, B, C, D, E) __builtin_ia32_minsd_mask_round(A, B, C, D, 4)
-#define __builtin_ia32_minss_round(A, B, C) __builtin_ia32_minss_round(A, B, 4)
 #define __builtin_ia32_minss_mask_round(A, B, C, D, E) __builtin_ia32_minss_mask_round(A, B, C, D, 4)
 #define __builtin_ia32_mulpd512_mask(A, B, C, D, E) __builtin_ia32_mulpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_mulps512_mask(A, B, C, D, E) __builtin_ia32_mulps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_mulsd_round(A, B, C) __builtin_ia32_mulsd_round(A, B, 8)
 #define __builtin_ia32_mulsd_mask_round(A, B, C, D, E) __builtin_ia32_mulsd_mask_round(A, B, C, D, 8)
-#define __builtin_ia32_mulss_round(A, B, C) __builtin_ia32_mulss_round(A, B, 8)
 #define __builtin_ia32_mulss_mask_round(A, B, C, D, E) __builtin_ia32_mulss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_permdf512_mask(A, E, C, D) __builtin_ia32_permdf512_mask(A, 1, C, D)
 #define __builtin_ia32_permdi512_mask(A, E, C, D) __builtin_ia32_permdi512_mask(A, 1, C, D)
@@ -326,9 +312,7 @@
 #define __builtin_ia32_sqrtsd_mask_round(A, B, C, D, E) __builtin_ia32_sqrtsd_mask_round(A, B, C, E, 8)
 #define __builtin_ia32_subpd512_mask(A, B, C, D, E) __builtin_ia32_subpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_subps512_mask(A, B, C, D, E) __builtin_ia32_subps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_subsd_round(A, B, C) __builtin_ia32_subsd_round(A, B, 8)
 #define __builtin_ia32_subsd_mask_round(A, B, C, D, E) __builtin_ia32_subsd_mask_round(A, B, C, D, 8)
-#define __builtin_ia32_subss_round(A, B, C) __builtin_ia32_subss_round(A, B, 8)
 #define __builtin_ia32_subss_mask_round(A, B, C, D, E) __builtin_ia32_subss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_ucmpd512_mask(A, B, E, D) __builtin_ia32_ucmpd512_mask(A, B, 1, D)
 #define __builtin_ia32_ucmpq512_mask(A, B, E, D) __builtin_ia32_ucmpq512_mask(A, B, 1, D)
@@ -358,8 +342,6 @@
 #define __builtin_ia32_vfmaddps512_mask(A, B, C, D, E) __builtin_ia32_vfmaddps512_mask(A, B, C, D, 8)
 #define __builtin_ia32_vfmaddps512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddps512_mask3(A, B, C, D, 8)
 #define __builtin_ia32_vfmaddps512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddps512_maskz(A, B, C, D, 8)
-#define __builtin_ia32_vfmaddsd3_round(A, B, C, D) __builtin_ia32_vfmaddsd3_round(A, B, C, 8)
-#define __builtin_ia32_vfmaddss3_round(A, B, C, D) __builtin_ia32_vfmaddss3_round(A, B, C, 8)
 #define __builtin_ia32_vfmaddsubpd512_mask(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_vfmaddsubpd512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_mask3(A, B, C, D, 8)
 #define __builtin_ia32_vfmaddsubpd512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_maskz(A, B, C, D, 8)
diff --git a/gcc/testsuite/gcc.target/i386/sse-23.c b/gcc/testsuite/gcc.target/i386/sse-23.c
index e98c7693ef7..cb98cc63e6b 100644
--- a/gcc/testsuite/gcc.target/i386/sse-23.c
+++ b/gcc/testsuite/gcc.target/i386/sse-23.c
@@ -191,9 +191,7 @@
 #define __builtin_ia32_kshiftrihi(A, B) __builtin_ia32_kshiftrihi(A, 8)
 #define __builtin_ia32_addpd512_mask(A, B, C, D, E) __builtin_ia32_addpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_addps512_mask(A, B, C, D, E) __builtin_ia32_addps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_addsd_round(A, B, C) __builtin_ia32_addsd_round(A, B, 8)
 #define __builtin_ia32_addsd_mask_round(A, B, C, D, E) __builtin_ia32_addsd_mask_round(A, B, C, D, 8)
-#define __builtin_ia32_addss_round(A, B, C) __builtin_ia32_addss_round(A, B, 8)
 #define __builtin_ia32_addss_mask_round(A, B, C, D, E) __builtin_ia32_addss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_alignd512_mask(A, B, F, D, E) __builtin_ia32_alignd512_mask(A, B, 1, D, E)
 #define __builtin_ia32_alignq512_mask(A, B, F, D, E) __builtin_ia32_alignq512_mask(A, B, 1, D, E)
@@ -225,9 +223,7 @@
 #define __builtin_ia32_cvtusi2ss64(A, B, C) __builtin_ia32_cvtusi2ss64(A, B, 8)
 #define __builtin_ia32_divpd512_mask(A, B, C, D, E) __builtin_ia32_divpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_divps512_mask(A, B, C, D, E) __builtin_ia32_divps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_divsd_round(A, B, C) __builtin_ia32_divsd_round(A, B, 8)
 #define __builtin_ia32_divsd_mask_round(A, B, C, D, E) __builtin_ia32_divsd_mask_round(A, B, C, D, 8)
-#define __builtin_ia32_divss_round(A, B, C) __builtin_ia32_divss_round(A, B, 8)
 #define __builtin_ia32_divss_mask_round(A, B, C, D, E) __builtin_ia32_divss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_extractf32x4_mask(A, E, C, D) __builtin_ia32_extractf32x4_mask(A, 1, C, D)
 #define __builtin_ia32_extractf64x4_mask(A, E, C, D) __builtin_ia32_extractf64x4_mask(A, 1, C, D)
@@ -251,15 +247,11 @@
 #define __builtin_ia32_gathersiv8di(A, B, C, D, F) __builtin_ia32_gathersiv8di(A, B, C, D, 8)
 #define __builtin_ia32_getexppd512_mask(A, B, C, D) __builtin_ia32_getexppd512_mask(A, B, C, 8)
 #define __builtin_ia32_getexpps512_mask(A, B, C, D) __builtin_ia32_getexpps512_mask(A, B, C, 8)
-#define __builtin_ia32_getexpsd128_round(A, B, C) __builtin_ia32_getexpsd128_round(A, B, 4)
 #define __builtin_ia32_getexpsd_mask_round(A, B, C, D, E) __builtin_ia32_getexpsd_mask_round(A, B, C, D, 4)
-#define __builtin_ia32_getexpss128_round(A, B, C) __builtin_ia32_getexpss128_round(A, B, 4)
 #define __builtin_ia32_getexpss_mask_round(A, B, C, D, E) __builtin_ia32_getexpss_mask_round(A, B, C, D, 4)
 #define __builtin_ia32_getmantpd512_mask(A, F, C, D, E) __builtin_ia32_getmantpd512_mask(A, 1, C, D, 8)
 #define __builtin_ia32_getmantps512_mask(A, F, C, D, E) __builtin_ia32_getmantps512_mask(A, 1, C, D, 8)
-#define __builtin_ia32_getmantsd_round(A, B, C, D) __builtin_ia32_getmantsd_round(A, B, 1, 4)
 #define __builtin_ia32_getmantsd_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantsd_mask_round(A, B, 1, W, U, 4)
-#define __builtin_ia32_getmantss_round(A, B, C, D) __builtin_ia32_getmantss_round(A, B, 1, 4)
 #define __builtin_ia32_getmantss_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantss_mask_round(A, B, 1, W, U, 4)
 #define __builtin_ia32_insertf32x4_mask(A, B, F, D, E) __builtin_ia32_insertf32x4_mask(A, B, 1, D, E)
 #define __builtin_ia32_insertf64x4_mask(A, B, F, D, E) __builtin_ia32_insertf64x4_mask(A, B, 1, D, E)
@@ -267,9 +259,7 @@
 #define __builtin_ia32_inserti64x4_mask(A, B, F, D, E) __builtin_ia32_inserti64x4_mask(A, B, 1, D, E)
 #define __builtin_ia32_maxpd512_mask(A, B, C, D, E) __builtin_ia32_maxpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_maxps512_mask(A, B, C, D, E) __builtin_ia32_maxps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_maxsd_round(A, B, C) __builtin_ia32_maxsd_round(A, B, 4)
 #define __builtin_ia32_maxsd_mask_round(A, B, C, D, E) __builtin_ia32_maxsd_mask_round(A, B, C, D, 4)
-#define __builtin_ia32_maxss_round(A, B, C) __builtin_ia32_maxss_round(A, B, 4)
 #define __builtin_ia32_maxss_mask_round(A, B, C, D, E) __builtin_ia32_maxss_mask_round(A, B, C, D, 4)
 #define __builtin_ia32_minpd512_mask(A, B, C, D, E) __builtin_ia32_minpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_minps512_mask(A, B, C, D, E) __builtin_ia32_minps512_mask(A, B, C, D, 8)
@@ -279,9 +269,7 @@
 #define __builtin_ia32_minss_mask_round(A, B, C, D, E) __builtin_ia32_minss_mask_round(A, B, C, D, 4)
 #define __builtin_ia32_mulpd512_mask(A, B, C, D, E) __builtin_ia32_mulpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_mulps512_mask(A, B, C, D, E) __builtin_ia32_mulps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_mulsd_round(A, B, C) __builtin_ia32_mulsd_round(A, B, 8)
 #define __builtin_ia32_mulsd_mask_round(A, B, C, D, E) __builtin_ia32_mulsd_mask_round(A, B, C, D, 8)
-#define __builtin_ia32_mulss_round(A, B, C) __builtin_ia32_mulss_round(A, B, 8)
 #define __builtin_ia32_mulss_mask_round(A, B, C, D, E) __builtin_ia32_mulss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_permdf512_mask(A, E, C, D) __builtin_ia32_permdf512_mask(A, 1, C, D)
 #define __builtin_ia32_permdi512_mask(A, E, C, D) __builtin_ia32_permdi512_mask(A, 1, C, D)
@@ -328,9 +316,7 @@
 #define __builtin_ia32_sqrtsd_mask_round(A, B, C, D, E) __builtin_ia32_sqrtsd_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_subpd512_mask(A, B, C, D, E) __builtin_ia32_subpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_subps512_mask(A, B, C, D, E) __builtin_ia32_subps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_subsd_round(A, B, C) __builtin_ia32_subsd_round(A, B, 8)
 #define __builtin_ia32_subsd_mask_round(A, B, C, D, E) __builtin_ia32_subsd_mask_round(A, B, C, D, 8)
-#define __builtin_ia32_subss_round(A, B, C) __builtin_ia32_subss_round(A, B, 8)
 #define __builtin_ia32_subss_mask_round(A, B, C, D, E) __builtin_ia32_subss_mask_round(A, B, C, D, 8)
 #define __builtin_ia32_ucmpd512_mask(A, B, E, D) __builtin_ia32_ucmpd512_mask(A, B, 1, D)
 #define __builtin_ia32_ucmpq512_mask(A, B, E, D) __builtin_ia32_ucmpq512_mask(A, B, 1, D)
@@ -360,8 +346,6 @@
 #define __builtin_ia32_vfmaddps512_mask(A, B, C, D, E) __builtin_ia32_vfmaddps512_mask(A, B, C, D, 8)
 #define __builtin_ia32_vfmaddps512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddps512_mask3(A, B, C, D, 8)
 #define __builtin_ia32_vfmaddps512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddps512_maskz(A, B, C, D, 8)
-#define __builtin_ia32_vfmaddsd3_round(A, B, C, D) __builtin_ia32_vfmaddsd3_round(A, B, C, 8)
-#define __builtin_ia32_vfmaddss3_round(A, B, C, D) __builtin_ia32_vfmaddss3_round(A, B, C, 8)
 #define __builtin_ia32_vfmaddsubpd512_mask(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_vfmaddsubpd512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_mask3(A, B, C, D, 8)
 #define __builtin_ia32_vfmaddsubpd512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_maskz(A, B, C, D, 8)
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 98f1141a8a4..e102b15ce54 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -7786,7 +7786,7 @@ proc check_effective_target_avx512f { } {
 
 	__m128d _mm128_getmant (__m128d a)
 	{
-	  return __builtin_ia32_getmantsd_round (a, a, 0, 8);
+	  return __builtin_ia32_getmantsd_mask_round (a, a, 0, a, 1, 8);
 	}
     } "-O2 -mavx512f" ]
 }
-- 
2.19.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Remove redundant builtins for avx512f scalar instructions.
  2019-12-24 14:24 [PATCH] Remove redundant builtins for avx512f scalar instructions Hongyu Wang
@ 2020-01-14 20:52 ` Jeff Law
  2020-01-15  2:55   ` Hongyu Wang
  2020-11-13  5:42 ` Jeff Law
  1 sibling, 1 reply; 8+ messages in thread
From: Jeff Law @ 2020-01-14 20:52 UTC (permalink / raw)
  To: Hongyu Wang, jakub, gcc-patches; +Cc: crazylht, hjl.tools

On Tue, 2019-12-24 at 13:31 +0800, Hongyu Wang wrote:
> Hi:
>   For avx512f scalar instructions, current builtin function like
> __builtin_ia32_*{sd,ss}_round can be replaced by
> __builtin_ia32_*{sd,ss}_mask_round with mask parameter set to -1. This
> patch did the replacement and remove the corresponding redundant
> builtins.
> 
>   Bootstrap is ok, make-check ok for i386 target.
>   Ok for trunk?
> 
> Changelog
> 
> gcc/
>         * config/i386/avx512fintrin.h
>         (_mm_add_round_sd, _mm_add_round_ss): Use
>          __builtin_ia32_adds?_mask_round builtins instead of
>         __builtin_ia32_adds?_round.
>         (_mm_sub_round_sd, _mm_sub_round_ss,
>         _mm_mul_round_sd, _mm_mul_round_ss,
>         _mm_div_round_sd, _mm_div_round_ss,
>         _mm_getexp_sd, _mm_getexp_ss,
>         _mm_getexp_round_sd, _mm_getexp_round_ss,
>         _mm_getmant_sd, _mm_getmant_ss,
>         _mm_getmant_round_sd, _mm_getmant_round_ss,
>         _mm_max_round_sd, _mm_max_round_ss,
>         _mm_min_round_sd, _mm_min_round_ss,
>         _mm_fmadd_round_sd, _mm_fmadd_round_ss,
>         _mm_fmsub_round_sd, _mm_fmsub_round_ss,
>         _mm_fnmadd_round_sd, _mm_fnmadd_round_ss,
>         _mm_fnmsub_round_sd, _mm_fnmsub_round_ss): Likewise.
>         * config/i386/i386-builtin.def
>         (__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
>         __builtin_ia32_subsd_round, __builtin_ia32_subss_round,
>         __builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
>         __builtin_ia32_divsd_round, __builtin_ia32_divss_round,
>         __builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
>         __builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
>         __builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
>         __builtin_ia32_minsd_round, __builtin_ia32_minss_round,
>         __builtin_ia32_vfmaddsd3_round,
>         __builtin_ia32_vfmaddss3_round): Remove.
>         * config/i386/i386-expand.c
>         (ix86_expand_round_builtin): Remove corresponding case.
This doesn't really look like a bugfix to me.  Can it wait for gcc-11?

jeff
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Remove redundant builtins for avx512f scalar instructions.
  2020-01-14 20:52 ` Jeff Law
@ 2020-01-15  2:55   ` Hongyu Wang
  0 siblings, 0 replies; 8+ messages in thread
From: Hongyu Wang @ 2020-01-15  2:55 UTC (permalink / raw)
  To: law; +Cc: jakub, gcc-patches, crazylht, hjl.tools

For sure.

Jeff Law <law@redhat.com> 于2020年1月15日周三 上午4:48写道:
>
> On Tue, 2019-12-24 at 13:31 +0800, Hongyu Wang wrote:
> > Hi:
> >   For avx512f scalar instructions, current builtin function like
> > __builtin_ia32_*{sd,ss}_round can be replaced by
> > __builtin_ia32_*{sd,ss}_mask_round with mask parameter set to -1. This
> > patch did the replacement and remove the corresponding redundant
> > builtins.
> >
> >   Bootstrap is ok, make-check ok for i386 target.
> >   Ok for trunk?
> >
> > Changelog
> >
> > gcc/
> >         * config/i386/avx512fintrin.h
> >         (_mm_add_round_sd, _mm_add_round_ss): Use
> >          __builtin_ia32_adds?_mask_round builtins instead of
> >         __builtin_ia32_adds?_round.
> >         (_mm_sub_round_sd, _mm_sub_round_ss,
> >         _mm_mul_round_sd, _mm_mul_round_ss,
> >         _mm_div_round_sd, _mm_div_round_ss,
> >         _mm_getexp_sd, _mm_getexp_ss,
> >         _mm_getexp_round_sd, _mm_getexp_round_ss,
> >         _mm_getmant_sd, _mm_getmant_ss,
> >         _mm_getmant_round_sd, _mm_getmant_round_ss,
> >         _mm_max_round_sd, _mm_max_round_ss,
> >         _mm_min_round_sd, _mm_min_round_ss,
> >         _mm_fmadd_round_sd, _mm_fmadd_round_ss,
> >         _mm_fmsub_round_sd, _mm_fmsub_round_ss,
> >         _mm_fnmadd_round_sd, _mm_fnmadd_round_ss,
> >         _mm_fnmsub_round_sd, _mm_fnmsub_round_ss): Likewise.
> >         * config/i386/i386-builtin.def
> >         (__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
> >         __builtin_ia32_subsd_round, __builtin_ia32_subss_round,
> >         __builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
> >         __builtin_ia32_divsd_round, __builtin_ia32_divss_round,
> >         __builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
> >         __builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
> >         __builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
> >         __builtin_ia32_minsd_round, __builtin_ia32_minss_round,
> >         __builtin_ia32_vfmaddsd3_round,
> >         __builtin_ia32_vfmaddss3_round): Remove.
> >         * config/i386/i386-expand.c
> >         (ix86_expand_round_builtin): Remove corresponding case.
> This doesn't really look like a bugfix to me.  Can it wait for gcc-11?
>
> jeff
> >
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Remove redundant builtins for avx512f scalar instructions.
  2019-12-24 14:24 [PATCH] Remove redundant builtins for avx512f scalar instructions Hongyu Wang
  2020-01-14 20:52 ` Jeff Law
@ 2020-11-13  5:42 ` Jeff Law
  2020-11-13  6:21   ` Hongyu Wang
  1 sibling, 1 reply; 8+ messages in thread
From: Jeff Law @ 2020-11-13  5:42 UTC (permalink / raw)
  To: Hongyu Wang, jakub, gcc-patches


On 12/23/19 10:31 PM, Hongyu Wang wrote:
> Hi:
>   For avx512f scalar instructions, current builtin function like
> __builtin_ia32_*{sd,ss}_round can be replaced by
> __builtin_ia32_*{sd,ss}_mask_round with mask parameter set to -1. This
> patch did the replacement and remove the corresponding redundant
> builtins.
>
>   Bootstrap is ok, make-check ok for i386 target.
>   Ok for trunk?
>
> Changelog
>
> gcc/
>         * config/i386/avx512fintrin.h
>         (_mm_add_round_sd, _mm_add_round_ss): Use
>          __builtin_ia32_adds?_mask_round builtins instead of
>         __builtin_ia32_adds?_round.
>         (_mm_sub_round_sd, _mm_sub_round_ss,
>         _mm_mul_round_sd, _mm_mul_round_ss,
>         _mm_div_round_sd, _mm_div_round_ss,
>         _mm_getexp_sd, _mm_getexp_ss,
>         _mm_getexp_round_sd, _mm_getexp_round_ss,
>         _mm_getmant_sd, _mm_getmant_ss,
>         _mm_getmant_round_sd, _mm_getmant_round_ss,
>         _mm_max_round_sd, _mm_max_round_ss,
>         _mm_min_round_sd, _mm_min_round_ss,
>         _mm_fmadd_round_sd, _mm_fmadd_round_ss,
>         _mm_fmsub_round_sd, _mm_fmsub_round_ss,
>         _mm_fnmadd_round_sd, _mm_fnmadd_round_ss,
>         _mm_fnmsub_round_sd, _mm_fnmsub_round_ss): Likewise.
>         * config/i386/i386-builtin.def
>         (__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
>         __builtin_ia32_subsd_round, __builtin_ia32_subss_round,
>         __builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
>         __builtin_ia32_divsd_round, __builtin_ia32_divss_round,
>         __builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
>         __builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
>         __builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
>         __builtin_ia32_minsd_round, __builtin_ia32_minss_round,
>         __builtin_ia32_vfmaddsd3_round,
>         __builtin_ia32_vfmaddss3_round): Remove.
>         * config/i386/i386-expand.c
>         (ix86_expand_round_builtin): Remove corresponding case.
>
> gcc/testsuite/
>         * lib/target-supports.exp
>         (check_effective_target_avx512f): Use
>         __builtin_ia32_getmantsd_mask_round builtins instead of
>         __builtin_ia32_getmantsd_round.
>         *gcc.target/i386/avx-1.c
>         (__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
>         __builtin_ia32_subsd_round, __builtin_ia32_subss_round,
>         __builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
>         __builtin_ia32_divsd_round, __builtin_ia32_divss_round,
>         __builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
>         __builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
>         __builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
>         __builtin_ia32_minsd_round, __builtin_ia32_minss_round,
>         __builtin_ia32_vfmaddsd3_round,
>         __builtin_ia32_vfmaddss3_round): Remove.
>         *gcc.target/i386/sse-13.c: Ditto.
>         *gcc.target/i386/sse-23.c: Ditto.

So I like the idea of simplifying the implementation of some of the
intrinsics when we can, but ISTM that removing existing intrinsics would
be a mistake since end-users could be using them in their code.   I'd
think we'd want to keep the existing APIs, even if we change the
implementation under the hood.


Thoughts?


jeff


> Hongyu Wang
>
> 0001-Remove-redundant-round-builtins-for-avx512f-scalar-i.patch
>
> From 9cc4928aad5770c53ff580f5c996092cdaf2f9ba Mon Sep 17 00:00:00 2001
> From: hongyuw1 <hongyuw1@gitlab.devtools.intel.com>
> Date: Wed, 18 Dec 2019 14:52:54 +0000
> Subject: [PATCH] Remove redundant round builtins for avx512f scalar
>  instructions
>
> Changelog
>
> gcc/
> 	* config/i386/avx512fintrin.h
> 	(_mm_add_round_sd, _mm_add_round_ss): Use
> 	 __builtin_ia32_adds?_mask_round builtins instead of
> 	__builtin_ia32_adds?_round.
> 	(_mm_sub_round_sd, _mm_sub_round_ss,
> 	_mm_mul_round_sd, _mm_mul_round_ss,
> 	_mm_div_round_sd, _mm_div_round_ss,
> 	_mm_getexp_sd, _mm_getexp_ss,
> 	_mm_getexp_round_sd, _mm_getexp_round_ss,
> 	_mm_getmant_sd, _mm_getmant_ss,
> 	_mm_getmant_round_sd, _mm_getmant_round_ss,
> 	_mm_max_round_sd, _mm_max_round_ss,
> 	_mm_min_round_sd, _mm_min_round_ss,
> 	_mm_fmadd_round_sd, _mm_fmadd_round_ss,
> 	_mm_fmsub_round_sd, _mm_fmsub_round_ss,
> 	_mm_fnmadd_round_sd, _mm_fnmadd_round_ss,
> 	_mm_fnmsub_round_sd, _mm_fnmsub_round_ss): Likewise.
> 	* config/i386/i386-builtin.def
> 	(__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
> 	__builtin_ia32_subsd_round, __builtin_ia32_subss_round,
> 	__builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
> 	__builtin_ia32_divsd_round, __builtin_ia32_divss_round,
> 	__builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
> 	__builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
> 	__builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
> 	__builtin_ia32_minsd_round, __builtin_ia32_minss_round,
> 	__builtin_ia32_vfmaddsd3_round,
> 	__builtin_ia32_vfmaddss3_round): Remove.
> 	* config/i386/i386-expand.c
> 	(ix86_expand_round_builtin): Remove corresponding case.
>
> gcc/testsuite/
> 	* lib/target-supports.exp
> 	(check_effective_target_avx512f): Use
> 	__builtin_ia32_getmantsd_mask_round builtins instead of
> 	__builtin_ia32_getmantsd_round.
> 	*gcc.target/i386/avx-1.c
> 	(__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
> 	__builtin_ia32_subsd_round, __builtin_ia32_subss_round,
> 	__builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
> 	__builtin_ia32_divsd_round, __builtin_ia32_divss_round,
> 	__builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
> 	__builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
> 	__builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
> 	__builtin_ia32_minsd_round, __builtin_ia32_minss_round,
> 	__builtin_ia32_vfmaddsd3_round,
> 	__builtin_ia32_vfmaddss3_round): Remove.
> 	*gcc.target/i386/sse-13.c: Ditto.
> 	*gcc.target/i386/sse-23.c: Ditto.
> ---
>  gcc/config/i386/avx512fintrin.h        | 584 +++++++++++++++++--------
>  gcc/config/i386/i386-builtin.def       |  18 -
>  gcc/config/i386/i386-expand.c          |   7 -
>  gcc/testsuite/gcc.target/i386/avx-1.c  |  18 -
>  gcc/testsuite/gcc.target/i386/sse-13.c |  18 -
>  gcc/testsuite/gcc.target/i386/sse-23.c |  16 -
>  gcc/testsuite/lib/target-supports.exp  |   2 +-
>  7 files changed, 404 insertions(+), 259 deletions(-)
>
> diff --git a/gcc/config/i386/avx512fintrin.h b/gcc/config/i386/avx512fintrin.h
> index 1d08f01a841..cdb4c948496 100644
> --- a/gcc/config/i386/avx512fintrin.h
> +++ b/gcc/config/i386/avx512fintrin.h
> @@ -1481,9 +1481,12 @@ extern __inline __m128d
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_add_round_sd (__m128d __A, __m128d __B, const int __R)
>  {
> -  return (__m128d) __builtin_ia32_addsd_round ((__v2df) __A,
> -					       (__v2df) __B,
> -					       __R);
> +  return (__m128d) __builtin_ia32_addsd_mask_round ((__v2df) __A,
> +						    (__v2df) __B,
> +						    (__v2df)
> +						    _mm_undefined_pd (),
> +						    (__mmask8) -1,
> +						    __R);
>  }
>  
>  extern __inline __m128d
> @@ -1513,9 +1516,12 @@ extern __inline __m128
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_add_round_ss (__m128 __A, __m128 __B, const int __R)
>  {
> -  return (__m128) __builtin_ia32_addss_round ((__v4sf) __A,
> -					      (__v4sf) __B,
> -					      __R);
> +  return (__m128) __builtin_ia32_addss_mask_round ((__v4sf) __A,
> +						   (__v4sf) __B,
> +						   (__v4sf)
> +						   _mm_undefined_ps (),
> +						   (__mmask8) -1,
> +						   __R);
>  }
>  
>  extern __inline __m128
> @@ -1545,9 +1551,12 @@ extern __inline __m128d
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_sub_round_sd (__m128d __A, __m128d __B, const int __R)
>  {
> -  return (__m128d) __builtin_ia32_subsd_round ((__v2df) __A,
> -					       (__v2df) __B,
> -					       __R);
> +  return (__m128d) __builtin_ia32_subsd_mask_round ((__v2df) __A,
> +						    (__v2df) __B,
> +						    (__v2df)
> +						    _mm_undefined_pd (),
> +						    (__mmask8) -1,
> +						    __R);
>  }
>  
>  extern __inline __m128d
> @@ -1577,9 +1586,12 @@ extern __inline __m128
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_sub_round_ss (__m128 __A, __m128 __B, const int __R)
>  {
> -  return (__m128) __builtin_ia32_subss_round ((__v4sf) __A,
> -					      (__v4sf) __B,
> -					      __R);
> +  return (__m128) __builtin_ia32_subss_mask_round ((__v4sf) __A,
> +						   (__v4sf) __B,
> +						   (__v4sf)
> +						   _mm_undefined_ps (),
> +						   (__mmask8) -1,
> +						   __R);
>  }
>  
>  extern __inline __m128
> @@ -1606,8 +1618,14 @@ _mm_maskz_sub_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>  }
>  
>  #else
> -#define _mm_add_round_sd(A, B, C)            \
> -    (__m128d)__builtin_ia32_addsd_round(A, B, C)
> +#define _mm_add_round_sd(A, B, C)				\
> +  ((__m128d)							\
> +   __builtin_ia32_addsd_mask_round ((__v2df) (__m128d) (A),	\
> +				    (__v2df) (__m128d) (B),	\
> +				    (__v2df) (__m128d)		\
> +				    _mm_undefined_pd (),	\
> +				    (__mmask8) (-1),		\
> +				    (int) (C)))
>  
>  #define _mm_mask_add_round_sd(W, U, A, B, C) \
>      (__m128d)__builtin_ia32_addsd_mask_round(A, B, W, U, C)
> @@ -1615,8 +1633,14 @@ _mm_maskz_sub_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>  #define _mm_maskz_add_round_sd(U, A, B, C)   \
>      (__m128d)__builtin_ia32_addsd_mask_round(A, B, (__v2df)_mm_setzero_pd(), U, C)
>  
> -#define _mm_add_round_ss(A, B, C)            \
> -    (__m128)__builtin_ia32_addss_round(A, B, C)
> +#define _mm_add_round_ss(A, B, C)				\
> +  ((__m128)							\
> +   __builtin_ia32_addss_mask_round ((__v4sf) (__m128) (A),	\
> +				    (__v4sf) (__m128) (B),	\
> +				    (__v4sf) (__m128)		\
> +				    _mm_undefined_ps (),	\
> +				    (__mmask8) (-1),		\
> +				    (int) (C)))
>  
>  #define _mm_mask_add_round_ss(W, U, A, B, C) \
>      (__m128)__builtin_ia32_addss_mask_round(A, B, W, U, C)
> @@ -1624,8 +1648,14 @@ _mm_maskz_sub_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>  #define _mm_maskz_add_round_ss(U, A, B, C)   \
>      (__m128)__builtin_ia32_addss_mask_round(A, B, (__v4sf)_mm_setzero_ps(), U, C)
>  
> -#define _mm_sub_round_sd(A, B, C)            \
> -    (__m128d)__builtin_ia32_subsd_round(A, B, C)
> +#define _mm_sub_round_sd(A, B, C)				\
> +  ((__m128d)							\
> +   __builtin_ia32_subsd_mask_round ((__v2df) (__m128d) (A),	\
> +				    (__v2df) (__m128d) (B),	\
> +				    (__v2df) (__m128d)		\
> +				    _mm_undefined_pd (),	\
> +				    (__mmask8) (-1),		\
> +				    (int) (C)))
>  
>  #define _mm_mask_sub_round_sd(W, U, A, B, C) \
>      (__m128d)__builtin_ia32_subsd_mask_round(A, B, W, U, C)
> @@ -1633,8 +1663,14 @@ _mm_maskz_sub_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>  #define _mm_maskz_sub_round_sd(U, A, B, C)   \
>      (__m128d)__builtin_ia32_subsd_mask_round(A, B, (__v2df)_mm_setzero_pd(), U, C)
>  
> -#define _mm_sub_round_ss(A, B, C)            \
> -    (__m128)__builtin_ia32_subss_round(A, B, C)
> +#define _mm_sub_round_ss(A, B, C)				\
> +  ((__m128)							\
> +   __builtin_ia32_subss_mask_round ((__v4sf) (__m128) (A),	\
> +				    (__v4sf) (__m128) (B),	\
> +				    (__v4sf) (__m128)		\
> +				    _mm_undefined_ps (),	\
> +				    (__mmask8) (-1),		\
> +				    (int) (C)))
>  
>  #define _mm_mask_sub_round_ss(W, U, A, B, C) \
>      (__m128)__builtin_ia32_subss_mask_round(A, B, W, U, C)
> @@ -2730,9 +2766,12 @@ extern __inline __m128d
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_mul_round_sd (__m128d __A, __m128d __B, const int __R)
>  {
> -  return (__m128d) __builtin_ia32_mulsd_round ((__v2df) __A,
> -					       (__v2df) __B,
> -					       __R);
> +  return (__m128d) __builtin_ia32_mulsd_mask_round ((__v2df) __A,
> +						    (__v2df) __B,
> +						    (__v2df)
> +						    _mm_undefined_pd (),
> +						    (__mmask8) -1,
> +						    __R);
>  }
>  
>  extern __inline __m128d
> @@ -2762,9 +2801,12 @@ extern __inline __m128
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_mul_round_ss (__m128 __A, __m128 __B, const int __R)
>  {
> -  return (__m128) __builtin_ia32_mulss_round ((__v4sf) __A,
> -					      (__v4sf) __B,
> -					      __R);
> +  return (__m128) __builtin_ia32_mulss_mask_round ((__v4sf) __A,
> +						   (__v4sf) __B,
> +						   (__v4sf)
> +						   _mm_undefined_pd (),
> +						   (__mmask8) -1,
> +						   __R);
>  }
>  
>  extern __inline __m128
> @@ -2794,9 +2836,12 @@ extern __inline __m128d
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_div_round_sd (__m128d __A, __m128d __B, const int __R)
>  {
> -  return (__m128d) __builtin_ia32_divsd_round ((__v2df) __A,
> -					       (__v2df) __B,
> -					       __R);
> +  return (__m128d) __builtin_ia32_divsd_mask_round ((__v2df) __A,
> +						    (__v2df) __B,
> +						    (__v2df)
> +						    _mm_undefined_pd (),
> +						    (__mmask8) -1,
> +						    __R);
>  }
>  
>  extern __inline __m128d
> @@ -2826,9 +2871,12 @@ extern __inline __m128
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_div_round_ss (__m128 __A, __m128 __B, const int __R)
>  {
> -  return (__m128) __builtin_ia32_divss_round ((__v4sf) __A,
> -					      (__v4sf) __B,
> -					      __R);
> +  return (__m128) __builtin_ia32_divss_mask_round ((__v4sf) __A,
> +						   (__v4sf) __B,
> +						   (__v4sf)
> +						   _mm_undefined_pd (),
> +						   (__mmask8) -1,
> +						   __R);
>  }
>  
>  extern __inline __m128
> @@ -2891,8 +2939,14 @@ _mm_maskz_div_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>  #define _mm512_maskz_div_round_ps(U, A, B, C)   \
>      (__m512)__builtin_ia32_divps512_mask(A, B, (__v16sf)_mm512_setzero_ps(), U, C)
>  
> -#define _mm_mul_round_sd(A, B, C)            \
> -    (__m128d)__builtin_ia32_mulsd_round(A, B, C)
> +#define _mm_mul_round_sd(A, B, C)				\
> +  ((__m128d)							\
> +   __builtin_ia32_mulsd_mask_round ((__v2df) (__m128d) (A),	\
> +				    (__v2df) (__m128d) (B),	\
> +				    (__v2df) (__m128d)		\
> +				    _mm_undefined_pd (),	\
> +				    (__mmask8) (-1),		\
> +				    (int) (C)))
>  
>  #define _mm_mask_mul_round_sd(W, U, A, B, C) \
>      (__m128d)__builtin_ia32_mulsd_mask_round(A, B, W, U, C)
> @@ -2900,8 +2954,14 @@ _mm_maskz_div_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>  #define _mm_maskz_mul_round_sd(U, A, B, C)   \
>      (__m128d)__builtin_ia32_mulsd_mask_round(A, B, (__v2df)_mm_setzero_pd(), U, C)
>  
> -#define _mm_mul_round_ss(A, B, C)            \
> -    (__m128)__builtin_ia32_mulss_round(A, B, C)
> +#define _mm_mul_round_ss(A, B, C)				\
> +  ((__m128)							\
> +   __builtin_ia32_mulss_mask_round ((__v4sf) (__m128) (A),	\
> +				    (__v4sf) (__m128) (B),	\
> +				    (__v4sf) (__m128)		\
> +				    _mm_undefined_ps (),	\
> +				    (__mmask8) -1,		\
> +				    (int) (C)))
>  
>  #define _mm_mask_mul_round_ss(W, U, A, B, C) \
>      (__m128)__builtin_ia32_mulss_mask_round(A, B, W, U, C)
> @@ -2909,8 +2969,14 @@ _mm_maskz_div_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>  #define _mm_maskz_mul_round_ss(U, A, B, C)   \
>      (__m128)__builtin_ia32_mulss_mask_round(A, B, (__v4sf)_mm_setzero_ps(), U, C)
>  
> -#define _mm_div_round_sd(A, B, C)            \
> -    (__m128d)__builtin_ia32_divsd_round(A, B, C)
> +#define _mm_div_round_sd(A, B, C)				\
> +  ((__m128d)							\
> +   __builtin_ia32_divsd_mask_round ((__v2df) (__m128d) (A),	\
> +				    (__v2df) (__m128d) (B),	\
> +				    (__v2df) (__m128d)		\
> +				    _mm_undefined_pd (),	\
> +				    (__mmask8) (-1),		\
> +				    (int) (C)))
>  
>  #define _mm_mask_div_round_sd(W, U, A, B, C) \
>      (__m128d)__builtin_ia32_divsd_mask_round(A, B, W, U, C)
> @@ -2918,8 +2984,14 @@ _mm_maskz_div_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>  #define _mm_maskz_div_round_sd(U, A, B, C)   \
>      (__m128d)__builtin_ia32_divsd_mask_round(A, B, (__v2df)_mm_setzero_pd(), U, C)
>  
> -#define _mm_div_round_ss(A, B, C)            \
> -    (__m128)__builtin_ia32_divss_round(A, B, C)
> +#define _mm_div_round_ss(A, B, C)				\
> +  ((__m128)							\
> +   __builtin_ia32_divss_mask_round ((__v4sf) (__m128) (A),	\
> +				    (__v4sf) (__m128) (B),	\
> +				    (__v4sf) (__m128)		\
> +				    _mm_undefined_ps (),	\
> +				    (__mmask8) -1,		\
> +				    (int) (C)))
>  
>  #define _mm_mask_div_round_ss(W, U, A, B, C) \
>      (__m128)__builtin_ia32_divss_mask_round(A, B, W, U, C)
> @@ -8703,9 +8775,12 @@ extern __inline __m128
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_getexp_round_ss (__m128 __A, __m128 __B, const int __R)
>  {
> -  return (__m128) __builtin_ia32_getexpss128_round ((__v4sf) __A,
> -						    (__v4sf) __B,
> -						    __R);
> +  return (__m128) __builtin_ia32_getexpss_mask_round ((__v4sf) __A,
> +						      (__v4sf) __B,
> +						      (__v4sf)
> +						      _mm_undefined_ps (),
> +						      (__mmask8) -1,
> +						      __R);
>  }
>  
>  extern __inline __m128
> @@ -8735,9 +8810,12 @@ extern __inline __m128d
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_getexp_round_sd (__m128d __A, __m128d __B, const int __R)
>  {
> -  return (__m128d) __builtin_ia32_getexpsd128_round ((__v2df) __A,
> -						     (__v2df) __B,
> -						     __R);
> +  return (__m128d) __builtin_ia32_getexpsd_mask_round ((__v2df) __A,
> +						       (__v2df) __B,
> +						       (__v2df)
> +						       _mm_undefined_pd (),
> +						       (__mmask8) -1,
> +						       __R);
>  }
>  
>  extern __inline __m128d
> @@ -8901,10 +8979,13 @@ _mm_getmant_round_sd (__m128d __A, __m128d __B,
>  		      _MM_MANTISSA_NORM_ENUM __C,
>  		      _MM_MANTISSA_SIGN_ENUM __D, const int __R)
>  {
> -  return (__m128d) __builtin_ia32_getmantsd_round ((__v2df) __A,
> -						  (__v2df) __B,
> -						  (__D << 2) | __C,
> -						   __R);
> +  return (__m128d) __builtin_ia32_getmantsd_mask_round ((__v2df) __A,
> +							(__v2df) __B,
> +							(__D << 2) | __C,
> +							(__v2df)
> +							_mm_undefined_pd (),
> +							(__mmask8) -1,
> +							__R);
>  }
>  
>  extern __inline __m128d
> @@ -8940,10 +9021,13 @@ _mm_getmant_round_ss (__m128 __A, __m128 __B,
>  		      _MM_MANTISSA_NORM_ENUM __C,
>  		      _MM_MANTISSA_SIGN_ENUM __D, const int __R)
>  {
> -  return (__m128) __builtin_ia32_getmantss_round ((__v4sf) __A,
> -						  (__v4sf) __B,
> -						  (__D << 2) | __C,
> -						  __R);
> +  return (__m128) __builtin_ia32_getmantss_mask_round ((__v4sf) __A,
> +						       (__v4sf) __B,
> +						       (__D << 2) | __C,
> +						       (__v4sf)
> +						       _mm_undefined_ps (),
> +						       (__mmask8) -1,
> +						       __R);
>  }
>  
>  extern __inline __m128
> @@ -9014,11 +9098,15 @@ _mm_maskz_getmant_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>                                               (__v16sf)(__m512)_mm512_setzero_ps(),  \
>                                               (__mmask16)(U),\
>  					     (R)))
> -#define _mm_getmant_round_sd(X, Y, C, D, R)                                                  \
> -  ((__m128d)__builtin_ia32_getmantsd_round ((__v2df)(__m128d)(X),                    \
> -					    (__v2df)(__m128d)(Y),	\
> -					    (int)(((D)<<2) | (C)),	\
> -					    (R)))
> +#define _mm_getmant_round_sd(X, Y, C, D, R)			\
> +  ((__m128d)							\
> +   __builtin_ia32_getmantsd_mask_round ((__v2df) (__m128d) (X),\
> +					(__v2df) (__m128d) (Y),	\
> +					(int) (((D)<<2) | (C)),	\
> +					(__v2df) (__m128d)	\
> +					_mm_undefined_pd (),	\
> +					(__mmask8) (-1),	\
> +					(int) (R)))
>  
>  #define _mm_mask_getmant_round_sd(W, U, X, Y, C, D, R)                                       \
>    ((__m128d)__builtin_ia32_getmantsd_mask_round ((__v2df)(__m128d)(X),                  \
> @@ -9036,11 +9124,15 @@ _mm_maskz_getmant_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>                                               (__mmask8)(U),\
>  					     (R)))
>  
> -#define _mm_getmant_round_ss(X, Y, C, D, R)                                                  \
> -  ((__m128)__builtin_ia32_getmantss_round ((__v4sf)(__m128)(X),                      \
> -					   (__v4sf)(__m128)(Y),		\
> -					   (int)(((D)<<2) | (C)),	\
> -					   (R)))
> +#define _mm_getmant_round_ss(X, Y, C, D, R)			\
> +  ((__m128)							\
> +   __builtin_ia32_getmantss_mask_round ((__v4sf) (__m128) (X),	\
> +					(__v4sf) (__m128) (Y),	\
> +					(int) (((D)<<2) | (C)),	\
> +					(__v4sf) (__m128)	\
> +					_mm_undefined_ps (),	\
> +					(__mmask8) (-1),	\
> +					(int) (R)))
>  
>  #define _mm_mask_getmant_round_ss(W, U, X, Y, C, D, R)                                       \
>    ((__m128)__builtin_ia32_getmantss_mask_round ((__v4sf)(__m128)(X),                  \
> @@ -9058,8 +9150,14 @@ _mm_maskz_getmant_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>                                               (__mmask8)(U),\
>  					     (R)))
>  
> -#define _mm_getexp_round_ss(A, B, R)						      \
> -  ((__m128)__builtin_ia32_getexpss128_round((__v4sf)(__m128)(A), (__v4sf)(__m128)(B), R))
> +#define _mm_getexp_round_ss(A, B, R)				\
> +  ((__m128)							\
> +   __builtin_ia32_getexpss_mask_round ((__v4sf) (__m128) (A),	\
> +				       (__v4sf) (__m128) (B),	\
> +				       (__v4sf) (__m128)	\
> +				       _mm_undefined_ps (),	\
> +				       (__mmask8) (-1),		\
> +				       (int) (R)))
>  
>  #define _mm_mask_getexp_round_ss(W, U, A, B, C) \
>      (__m128)__builtin_ia32_getexpss_mask_round(A, B, W, U, C)
> @@ -9067,8 +9165,14 @@ _mm_maskz_getmant_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>  #define _mm_maskz_getexp_round_ss(U, A, B, C)   \
>      (__m128)__builtin_ia32_getexpss_mask_round(A, B, (__v4sf)_mm_setzero_ps(), U, C)
>  
> -#define _mm_getexp_round_sd(A, B, R)						       \
> -  ((__m128d)__builtin_ia32_getexpsd128_round((__v2df)(__m128d)(A), (__v2df)(__m128d)(B), R))
> +#define _mm_getexp_round_sd(A, B, R)				\
> +  ((__m128d)							\
> +   __builtin_ia32_getexpsd_mask_round ((__v2df) (__m128d) (A),	\
> +				       (__v2df) (__m128d) (B),	\
> +				       (__v2df) (__m128d)	\
> +				       _mm_undefined_pd (),	\
> +				       (__mmask8) (-1),		\
> +				       (int) (R)))
>  
>  #define _mm_mask_getexp_round_sd(W, U, A, B, C) \
>      (__m128d)__builtin_ia32_getexpsd_mask_round(A, B, W, U, C)
> @@ -11392,9 +11496,12 @@ extern __inline __m128d
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_max_round_sd (__m128d __A, __m128d __B, const int __R)
>  {
> -  return (__m128d) __builtin_ia32_maxsd_round ((__v2df) __A,
> -					       (__v2df) __B,
> -					       __R);
> +  return (__m128d) __builtin_ia32_maxsd_mask_round ((__v2df) __A,
> +						    (__v2df) __B,
> +						    (__v2df)
> +						    _mm_undefined_pd (),
> +						    (__mmask8) -1,
> +						    __R);
>  }
>  
>  extern __inline __m128d
> @@ -11424,9 +11531,12 @@ extern __inline __m128
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_max_round_ss (__m128 __A, __m128 __B, const int __R)
>  {
> -  return (__m128) __builtin_ia32_maxss_round ((__v4sf) __A,
> -					      (__v4sf) __B,
> -					      __R);
> +  return (__m128) __builtin_ia32_maxss_mask_round ((__v4sf) __A,
> +						   (__v4sf) __B,
> +						   (__v4sf)
> +						   _mm_undefined_ps (),
> +						   (__mmask8) -1,
> +						   __R);
>  }
>  
>  extern __inline __m128
> @@ -11456,9 +11566,12 @@ extern __inline __m128d
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_min_round_sd (__m128d __A, __m128d __B, const int __R)
>  {
> -  return (__m128d) __builtin_ia32_minsd_round ((__v2df) __A,
> -					       (__v2df) __B,
> -					       __R);
> +  return (__m128d) __builtin_ia32_minsd_mask_round ((__v2df) __A,
> +						    (__v2df) __B,
> +						    (__v2df)
> +						    _mm_undefined_pd (),
> +						    (__mmask8) -1,
> +						    __R);
>  }
>  
>  extern __inline __m128d
> @@ -11488,9 +11601,12 @@ extern __inline __m128
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_min_round_ss (__m128 __A, __m128 __B, const int __R)
>  {
> -  return (__m128) __builtin_ia32_minss_round ((__v4sf) __A,
> -					      (__v4sf) __B,
> -					      __R);
> +  return (__m128) __builtin_ia32_minss_mask_round ((__v4sf) __A,
> +						   (__v4sf) __B,
> +						   (__v4sf)
> +						   _mm_undefined_ps (),
> +						   (__mmask8) -1,
> +						   __R);
>  }
>  
>  extern __inline __m128
> @@ -11517,8 +11633,14 @@ _mm_maskz_min_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>  }
>  
>  #else
> -#define _mm_max_round_sd(A, B, C)            \
> -    (__m128d)__builtin_ia32_maxsd_round(A, B, C)
> +#define _mm_max_round_sd(A, B, C)				\
> +  ((__m128d)							\
> +   __builtin_ia32_maxsd_mask_round((__v2df) (__m128d) (A),	\
> +				   (__v2df) (__m128d) (B),	\
> +				   (__v2df) (__m128d)		\
> +				   _mm_undefined_pd (),		\
> +				   (__mmask8) (-1),		\
> +				   (int) (C)))
>  
>  #define _mm_mask_max_round_sd(W, U, A, B, C) \
>      (__m128d)__builtin_ia32_maxsd_mask_round(A, B, W, U, C)
> @@ -11526,8 +11648,14 @@ _mm_maskz_min_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>  #define _mm_maskz_max_round_sd(U, A, B, C)   \
>      (__m128d)__builtin_ia32_maxsd_mask_round(A, B, (__v2df)_mm_setzero_pd(), U, C)
>  
> -#define _mm_max_round_ss(A, B, C)            \
> -    (__m128)__builtin_ia32_maxss_round(A, B, C)
> +#define _mm_max_round_ss(A, B, C)				\
> +  ((__m128)							\
> +   __builtin_ia32_maxss_mask_round ((__v4sf) (__m128) (A),	\
> +				    (__v4sf) (__m128) (B),	\
> +				    (__v4sf) (__m128)		\
> +				    _mm_undefined_ps (),	\
> +				    (__mmask8) (-1),		\
> +				    (int)(C)))
>  
>  #define _mm_mask_max_round_ss(W, U, A, B, C) \
>      (__m128)__builtin_ia32_maxss_mask_round(A, B, W, U, C)
> @@ -11535,8 +11663,14 @@ _mm_maskz_min_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>  #define _mm_maskz_max_round_ss(U, A, B, C)   \
>      (__m128)__builtin_ia32_maxss_mask_round(A, B, (__v4sf)_mm_setzero_ps(), U, C)
>  
> -#define _mm_min_round_sd(A, B, C)            \
> -    (__m128d)__builtin_ia32_minsd_round(A, B, C)
> +#define _mm_min_round_sd(A, B, C)				\
> +  ((__m128d)							\
> +   __builtin_ia32_minsd_mask_round ((__v2df) (__m128d) (A),	\
> +				    (__v2df) (__m128d) (B),	\
> +				    (__v2df) (__m128d)		\
> +				    _mm_undefined_pd (),	\
> +				    (__mmask8) (-1),		\
> +				    (int) (C)))
>  
>  #define _mm_mask_min_round_sd(W, U, A, B, C) \
>      (__m128d)__builtin_ia32_minsd_mask_round(A, B, W, U, C)
> @@ -11544,8 +11678,14 @@ _mm_maskz_min_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>  #define _mm_maskz_min_round_sd(U, A, B, C)   \
>      (__m128d)__builtin_ia32_minsd_mask_round(A, B, (__v2df)_mm_setzero_pd(), U, C)
>  
> -#define _mm_min_round_ss(A, B, C)            \
> -    (__m128)__builtin_ia32_minss_round(A, B, C)
> +#define _mm_min_round_ss(A, B, C)				\
> +  ((__m128)							\
> +   __builtin_ia32_minss_mask_round ((__v4sf) (__m128) (A),	\
> +				    (__v4sf) (__m128) (B),	\
> +				    (__v4sf) (__m128)		\
> +				    _mm_undefined_ps (),	\
> +				    (__mmask8) (-1),		\
> +				    (int) (C)))
>  
>  #define _mm_mask_min_round_ss(W, U, A, B, C) \
>      (__m128)__builtin_ia32_minss_mask_round(A, B, W, U, C)
> @@ -11596,105 +11736,153 @@ extern __inline __m128d
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_fmadd_round_sd (__m128d __W, __m128d __A, __m128d __B, const int __R)
>  {
> -  return (__m128d) __builtin_ia32_vfmaddsd3_round ((__v2df) __W,
> -						   (__v2df) __A,
> -						   (__v2df) __B,
> -						   __R);
> +  return (__m128d) __builtin_ia32_vfmaddsd3_mask ((__v2df) __W,
> +						  (__v2df) __A,
> +						  (__v2df) __B,
> +						  (__mmask8) -1,
> +						  __R);
>  }
>  
>  extern __inline __m128
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_fmadd_round_ss (__m128 __W, __m128 __A, __m128 __B, const int __R)
>  {
> -  return (__m128) __builtin_ia32_vfmaddss3_round ((__v4sf) __W,
> -						  (__v4sf) __A,
> -						  (__v4sf) __B,
> -						  __R);
> +  return (__m128) __builtin_ia32_vfmaddss3_mask ((__v4sf) __W,
> +						 (__v4sf) __A,
> +						 (__v4sf) __B,
> +						 (__mmask8) -1,
> +						 __R);
>  }
>  
>  extern __inline __m128d
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_fmsub_round_sd (__m128d __W, __m128d __A, __m128d __B, const int __R)
>  {
> -  return (__m128d) __builtin_ia32_vfmaddsd3_round ((__v2df) __W,
> -						   (__v2df) __A,
> -						   -(__v2df) __B,
> -						   __R);
> -}
> +  return (__m128d) __builtin_ia32_vfmaddsd3_mask ((__v2df) __W,
> +						  (__v2df) __A,
> +						  -(__v2df) __B,
> +						  (__mmask8) -1,
> +						  __R);
> + }
>  
>  extern __inline __m128
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_fmsub_round_ss (__m128 __W, __m128 __A, __m128 __B, const int __R)
>  {
> -  return (__m128) __builtin_ia32_vfmaddss3_round ((__v4sf) __W,
> -						  (__v4sf) __A,
> -						  -(__v4sf) __B,
> -						  __R);
> +  return (__m128) __builtin_ia32_vfmaddss3_mask ((__v4sf) __W,
> +						 (__v4sf) __A,
> +						 -(__v4sf) __B,
> +						 (__mmask8) -1,
> +						 __R);
>  }
>  
>  extern __inline __m128d
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_fnmadd_round_sd (__m128d __W, __m128d __A, __m128d __B, const int __R)
>  {
> -  return (__m128d) __builtin_ia32_vfmaddsd3_round ((__v2df) __W,
> -						   -(__v2df) __A,
> -						   (__v2df) __B,
> -						   __R);
> +  return (__m128d) __builtin_ia32_vfmaddsd3_mask ((__v2df) __W,
> +						  -(__v2df) __A,
> +						  (__v2df) __B,
> +						  (__mmask8) -1,
> +						  __R);
>  }
>  
>  extern __inline __m128
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_fnmadd_round_ss (__m128 __W, __m128 __A, __m128 __B, const int __R)
>  {
> -  return (__m128) __builtin_ia32_vfmaddss3_round ((__v4sf) __W,
> -						  -(__v4sf) __A,
> -						  (__v4sf) __B,
> -						  __R);
> -}
> +  return (__m128) __builtin_ia32_vfmaddss3_mask ((__v4sf) __W,
> +						 -(__v4sf) __A,
> +						 (__v4sf) __B,
> +						 (__mmask8) -1,
> +						 __R);
> + }
>  
>  extern __inline __m128d
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_fnmsub_round_sd (__m128d __W, __m128d __A, __m128d __B, const int __R)
>  {
> -  return (__m128d) __builtin_ia32_vfmaddsd3_round ((__v2df) __W,
> -						   -(__v2df) __A,
> -						   -(__v2df) __B,
> -						   __R);
> -}
> +  return (__m128d) __builtin_ia32_vfmaddsd3_mask ((__v2df) __W,
> +						  -(__v2df) __A,
> +						  -(__v2df) __B,
> +						  (__mmask8) -1,
> +						  __R);
> + }
>  
>  extern __inline __m128
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_fnmsub_round_ss (__m128 __W, __m128 __A, __m128 __B, const int __R)
>  {
> -  return (__m128) __builtin_ia32_vfmaddss3_round ((__v4sf) __W,
> -						  -(__v4sf) __A,
> -						  -(__v4sf) __B,
> -						  __R);
> +  return (__m128) __builtin_ia32_vfmaddss3_mask ((__v4sf) __W,
> +						 -(__v4sf) __A,
> +						 -(__v4sf) __B,
> +						 (__mmask8) -1,
> +						 __R);
>  }
>  #else
> -#define _mm_fmadd_round_sd(A, B, C, R)            \
> -    (__m128d)__builtin_ia32_vfmaddsd3_round(A, B, C, R)
> -
> -#define _mm_fmadd_round_ss(A, B, C, R)            \
> -    (__m128)__builtin_ia32_vfmaddss3_round(A, B, C, R)
> -
> -#define _mm_fmsub_round_sd(A, B, C, R)            \
> -    (__m128d)__builtin_ia32_vfmaddsd3_round(A, B, -(C), R)
> -
> -#define _mm_fmsub_round_ss(A, B, C, R)            \
> -    (__m128)__builtin_ia32_vfmaddss3_round(A, B, -(C), R)
> -
> -#define _mm_fnmadd_round_sd(A, B, C, R)            \
> -    (__m128d)__builtin_ia32_vfmaddsd3_round(A, -(B), C, R)
> -
> -#define _mm_fnmadd_round_ss(A, B, C, R)            \
> -   (__m128)__builtin_ia32_vfmaddss3_round(A, -(B), C, R)
> -
> -#define _mm_fnmsub_round_sd(A, B, C, R)            \
> -    (__m128d)__builtin_ia32_vfmaddsd3_round(A, -(B), -(C), R)
> -
> -#define _mm_fnmsub_round_ss(A, B, C, R)            \
> -    (__m128)__builtin_ia32_vfmaddss3_round(A, -(B), -(C), R)
> +#define _mm_fmadd_round_sd(A, B, C, R)				\
> +  ((__m128d)							\
> +   __builtin_ia32_vfmaddsd3_mask ((__v2df) (__m128d) (A),	\
> +				  (__v2df) (__m128d) (B),	\
> +				  (__v2df) (__m128d) (C),	\
> +				  (__mmask8) (-1),		\
> +				  (int) (R)))
> +
> +#define _mm_fmadd_round_ss(A, B, C, R)				\
> +  ((__m128)							\
> +   __builtin_ia32_vfmaddss3_mask ((__v4sf) (__m128) (A),	\
> +				  (__v4sf) (__m128) (B),	\
> +				  (__v4sf) (__m128) (C),	\
> +				  (__mmask8) (-1),		\
> +				  (int) (R)))
> +
> +#define _mm_fmsub_round_sd(A, B, C, R)				\
> +  ((__m128d)							\
> +   __builtin_ia32_vfmaddsd3_mask ((__v2df) (__m128d) (A),	\
> +				  (__v2df) (__m128d) (B),	\
> +				  (__v2df) (__m128d) (-(C)),	\
> +				  (__mmask8) (-1),		\
> +				  (int) (R)))
> +
> +#define _mm_fmsub_round_ss(A, B, C, R)				\
> +  ((__m128)							\
> +   __builtin_ia32_vfmaddss3_mask ((__v4sf) (__m128) (A),	\
> +				  (__v4sf) (__m128) (B),	\
> +				  (__v4sf) (__m128) (-(C)),	\
> +				  (__mmask8) (-1),		\
> +				  (int) (R)))
> +
> +#define _mm_fnmadd_round_sd(A, B, C, R)				\
> +  ((__m128d)							\
> +   __builtin_ia32_vfmaddsd3_mask ((__v2df) (__m128d) (A),	\
> +				  (__v2df) (__m128d) (-(B)),	\
> +				  (__v2df) (__m128d) (C),	\
> +				  (__mmask8) (-1),		\
> +				  (int) (R)))
> +
> +#define _mm_fnmadd_round_ss(A, B, C, R)				\
> +  ((__m128)							\
> +   __builtin_ia32_vfmaddss3_mask ((__v4sf) (__m128) (A),	\
> +				  (__v4sf) (__m128) (-(B)),	\
> +				  (__v4sf) (__m128) (C),	\
> +				  (__mmask8) (-1),		\
> +				  (int) (R)))
> +
> +#define _mm_fnmsub_round_sd(A, B, C, R)				\
> +  ((__m128d)							\
> +   __builtin_ia32_vfmaddsd3_mask ((__v2df) (__m128d) (A),	\
> +				  (__v2df) (__m128d) (-(B)),	\
> +				  (__v2df) (__m128d) (-(C)),	\
> +				  (__mmask8) (-1),		\
> +				  (int) (R)))
> +
> +#define _mm_fnmsub_round_ss(A, B, C, R)				\
> +  ((__m128)							\
> +   __builtin_ia32_vfmaddss3_mask ((__v4sf) (__m128) (A),	\
> +				  (__v4sf) (__m128) (-(B)),	\
> +				  (__v4sf) (__m128) (-(C)),	\
> +				  (__mmask8) (-1),		\
> +				  (int) (R)))
>  #endif
>  
>  extern __inline __m128d
> @@ -14504,20 +14692,24 @@ extern __inline __m128
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_getexp_ss (__m128 __A, __m128 __B)
>  {
> -  return (__m128) __builtin_ia32_getexpss128_round ((__v4sf) __A,
> -						    (__v4sf) __B,
> -						    _MM_FROUND_CUR_DIRECTION);
> +  return (__m128) __builtin_ia32_getexpss_mask_round ((__v4sf) __A,
> +						      (__v4sf) __B,
> +						      (__v4sf)
> +						      _mm_undefined_ps (),
> +						      (__mmask8) -1,
> +						      _MM_FROUND_CUR_DIRECTION);
>  }
>  
>  extern __inline __m128
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_mask_getexp_ss (__m128 __W, __mmask8 __U, __m128 __A, __m128 __B)
>  {
> -  return (__m128) __builtin_ia32_getexpss_mask_round ((__v4sf) __A,
> -						(__v4sf) __B,
> -						(__v4sf) __W,
> -						(__mmask8) __U,
> -						_MM_FROUND_CUR_DIRECTION);
> +  return (__m128)
> +    __builtin_ia32_getexpss_mask_round ((__v4sf) __A,
> +					(__v4sf) __B,
> +					(__v4sf) __W,
> +					(__mmask8) __U,
> +					_MM_FROUND_CUR_DIRECTION);
>  }
>  
>  extern __inline __m128
> @@ -14536,9 +14728,13 @@ extern __inline __m128d
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_getexp_sd (__m128d __A, __m128d __B)
>  {
> -  return (__m128d) __builtin_ia32_getexpsd128_round ((__v2df) __A,
> -						     (__v2df) __B,
> -						     _MM_FROUND_CUR_DIRECTION);
> +  return (__m128d)
> +    __builtin_ia32_getexpsd_mask_round ((__v2df) __A,
> +					(__v2df) __B,
> +					(__v2df)
> +					_mm_undefined_pd (),
> +					(__mmask8) -1,
> +					_MM_FROUND_CUR_DIRECTION);
>  }
>  
>  extern __inline __m128d
> @@ -14641,10 +14837,14 @@ __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_getmant_sd (__m128d __A, __m128d __B, _MM_MANTISSA_NORM_ENUM __C,
>  		_MM_MANTISSA_SIGN_ENUM __D)
>  {
> -  return (__m128d) __builtin_ia32_getmantsd_round ((__v2df) __A,
> -						   (__v2df) __B,
> -						   (__D << 2) | __C,
> -						   _MM_FROUND_CUR_DIRECTION);
> +  return (__m128d)
> +    __builtin_ia32_getmantsd_mask_round ((__v2df) __A,
> +					 (__v2df) __B,
> +					 (__D << 2) | __C,
> +					 (__v2df)
> +					 _mm_undefined_pd (),
> +					 (__mmask8) -1,
> +					 _MM_FROUND_CUR_DIRECTION);
>  }
>  
>  extern __inline __m128d
> @@ -14679,10 +14879,14 @@ __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_getmant_ss (__m128 __A, __m128 __B, _MM_MANTISSA_NORM_ENUM __C,
>  		_MM_MANTISSA_SIGN_ENUM __D)
>  {
> -  return (__m128) __builtin_ia32_getmantss_round ((__v4sf) __A,
> -						  (__v4sf) __B,
> -						  (__D << 2) | __C,
> -						  _MM_FROUND_CUR_DIRECTION);
> +  return (__m128)
> +    __builtin_ia32_getmantss_mask_round ((__v4sf) __A,
> +					 (__v4sf) __B,
> +					 (__D << 2) | __C,
> +					 (__v4sf)
> +					 _mm_undefined_ps (),
> +					 (__mmask8) -1,
> +					 _MM_FROUND_CUR_DIRECTION);
>  }
>  
>  extern __inline __m128
> @@ -14753,11 +14957,15 @@ _mm_maskz_getmant_ss (__mmask8 __U, __m128 __A, __m128 __B,
>                                               (__v16sf)_mm512_setzero_ps(),          \
>                                               (__mmask16)(U),\
>  					     _MM_FROUND_CUR_DIRECTION))
> -#define _mm_getmant_sd(X, Y, C, D)                                                  \
> -  ((__m128d)__builtin_ia32_getmantsd_round ((__v2df)(__m128d)(X),                    \
> -                                           (__v2df)(__m128d)(Y),                    \
> -                                           (int)(((D)<<2) | (C)),                   \
> -					   _MM_FROUND_CUR_DIRECTION))
> +#define _mm_getmant_sd(X, Y, C, D)					\
> +  ((__m128d)								\
> +   __builtin_ia32_getmantsd_mask_round ((__v2df) (__m128d) (X),	\
> +					(__v2df) (__m128d) (Y),		\
> +					(int) (((D)<<2) | (C)),		\
> +					(__v2df) (__m128d)		\
> +					_mm_undefined_pd (),		\
> +					(__mmask8) (-1),		\
> +					_MM_FROUND_CUR_DIRECTION))
>  
>  #define _mm_mask_getmant_sd(W, U, X, Y, C, D)                                       \
>    ((__m128d)__builtin_ia32_getmantsd_mask_round ((__v2df)(__m128d)(X),                 \
> @@ -14775,11 +14983,15 @@ _mm_maskz_getmant_ss (__mmask8 __U, __m128 __A, __m128 __B,
>                                                (__mmask8)(U),\
>  					      _MM_FROUND_CUR_DIRECTION))
>  
> -#define _mm_getmant_ss(X, Y, C, D)                                                  \
> -  ((__m128)__builtin_ia32_getmantss_round ((__v4sf)(__m128)(X),                      \
> -                                          (__v4sf)(__m128)(Y),                      \
> -                                          (int)(((D)<<2) | (C)),                    \
> -					  _MM_FROUND_CUR_DIRECTION))
> +#define _mm_getmant_ss(X, Y, C, D)					\
> +  ((__m128)								\
> +   __builtin_ia32_getmantss_mask_round ((__v4sf) (__m128) (X),		\
> +					(__v4sf) (__m128) (Y),		\
> +					(int) (((D)<<2) | (C)),		\
> +					(__v4sf) (__m128)		\
> +					_mm_undefined_ps (),		\
> +					(__mmask8) (-1),		\
> +					_MM_FROUND_CUR_DIRECTION))
>  
>  #define _mm_mask_getmant_ss(W, U, X, Y, C, D)                                       \
>    ((__m128)__builtin_ia32_getmantss_mask_round ((__v4sf)(__m128)(X),                 \
> @@ -14797,9 +15009,14 @@ _mm_maskz_getmant_ss (__mmask8 __U, __m128 __A, __m128 __B,
>                                                (__mmask8)(U),\
>  					      _MM_FROUND_CUR_DIRECTION))
>  
> -#define _mm_getexp_ss(A, B)						      \
> -  ((__m128)__builtin_ia32_getexpss128_round((__v4sf)(__m128)(A), (__v4sf)(__m128)(B),  \
> -					   _MM_FROUND_CUR_DIRECTION))
> +#define _mm_getexp_ss(A, B)						\
> +  ((__m128)								\
> +   __builtin_ia32_getexpss_mask_round ((__v4sf) (__m128) (A),		\
> +				       (__v4sf) (__m128) (B),		\
> +				       (__v4sf) (__m128)		\
> +				       _mm_undefined_ps (),		\
> +				       (__mmask8) (-1),			\
> +				       _MM_FROUND_CUR_DIRECTION))
>  
>  #define _mm_mask_getexp_ss(W, U, A, B) \
>      (__m128)__builtin_ia32_getexpss_mask_round(A, B, W, U,\
> @@ -14809,9 +15026,14 @@ _mm_maskz_getmant_ss (__mmask8 __U, __m128 __A, __m128 __B,
>      (__m128)__builtin_ia32_getexpss_mask_round(A, B, (__v4sf)_mm_setzero_ps(), U,\
>  					      _MM_FROUND_CUR_DIRECTION)
>  
> -#define _mm_getexp_sd(A, B)						       \
> -  ((__m128d)__builtin_ia32_getexpsd128_round((__v2df)(__m128d)(A), (__v2df)(__m128d)(B),\
> -					    _MM_FROUND_CUR_DIRECTION))
> +#define _mm_getexp_sd(A, B)						\
> +  ((__m128d)								\
> +   __builtin_ia32_getexpsd_mask_round ((__v2df) (__m128d) (A),		\
> +				       (__v2df) (__m128d)(B),		\
> +				       (__v2df) (__m128d)		\
> +				       _mm_undefined_pd (),		\
> +				       (__mmask8) (-1),			\
> +				       _MM_FROUND_CUR_DIRECTION))
>  
>  #define _mm_mask_getexp_sd(W, U, A, B) \
>      (__m128d)__builtin_ia32_getexpsd_mask_round(A, B, W, U,\
> diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
> index a6500f9d9b5..c2039e9d112 100644
> --- a/gcc/config/i386/i386-builtin.def
> +++ b/gcc/config/i386/i386-builtin.def
> @@ -2751,9 +2751,7 @@ BDESC_END (ARGS, ROUND_ARGS)
>  BDESC_FIRST (round_args, ROUND_ARGS,
>         OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_addv8df3_mask_round, "__builtin_ia32_addpd512_mask", IX86_BUILTIN_ADDPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_addv16sf3_mask_round, "__builtin_ia32_addps512_mask", IX86_BUILTIN_ADDPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmaddv2df3_round, "__builtin_ia32_addsd_round", IX86_BUILTIN_ADDSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmaddv2df3_mask_round, "__builtin_ia32_addsd_mask_round", IX86_BUILTIN_ADDSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmaddv4sf3_round, "__builtin_ia32_addss_round", IX86_BUILTIN_ADDSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmaddv4sf3_mask_round, "__builtin_ia32_addss_mask_round", IX86_BUILTIN_ADDSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_cmpv8df3_mask_round, "__builtin_ia32_cmppd512_mask", IX86_BUILTIN_CMPPD512, UNKNOWN, (int) UQI_FTYPE_V8DF_V8DF_INT_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_cmpv16sf3_mask_round, "__builtin_ia32_cmpps512_mask", IX86_BUILTIN_CMPPS512, UNKNOWN, (int) UHI_FTYPE_V16SF_V16SF_INT_UHI_INT)
> @@ -2784,9 +2782,7 @@ BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_cvtusi2ss32_round, "__builtin_ia32_c
>  BDESC (OPTION_MASK_ISA_AVX512F | OPTION_MASK_ISA_64BIT, 0, CODE_FOR_cvtusi2ss64_round, "__builtin_ia32_cvtusi2ss64", IX86_BUILTIN_CVTUSI2SS64, UNKNOWN, (int) V4SF_FTYPE_V4SF_UINT64_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_divv8df3_mask_round, "__builtin_ia32_divpd512_mask", IX86_BUILTIN_DIVPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_divv16sf3_mask_round, "__builtin_ia32_divps512_mask", IX86_BUILTIN_DIVPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmdivv2df3_round, "__builtin_ia32_divsd_round", IX86_BUILTIN_DIVSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmdivv2df3_mask_round, "__builtin_ia32_divsd_mask_round", IX86_BUILTIN_DIVSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmdivv4sf3_round, "__builtin_ia32_divss_round", IX86_BUILTIN_DIVSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmdivv4sf3_mask_round, "__builtin_ia32_divss_mask_round", IX86_BUILTIN_DIVSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_fixupimmv8df_mask_round, "__builtin_ia32_fixupimmpd512_mask", IX86_BUILTIN_FIXUPIMMPD512_MASK, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DI_INT_QI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_fixupimmv8df_maskz_round, "__builtin_ia32_fixupimmpd512_maskz", IX86_BUILTIN_FIXUPIMMPD512_MASKZ, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DI_INT_QI_INT)
> @@ -2798,33 +2794,23 @@ BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_sfixupimmv4sf_mask_round, "_
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_sfixupimmv4sf_maskz_round, "__builtin_ia32_fixupimmss_maskz", IX86_BUILTIN_FIXUPIMMSS128_MASKZ, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SI_INT_QI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_getexpv8df_mask_round, "__builtin_ia32_getexppd512_mask", IX86_BUILTIN_GETEXPPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_QI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_getexpv16sf_mask_round, "__builtin_ia32_getexpps512_mask", IX86_BUILTIN_GETEXPPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_HI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_sgetexpv2df_round, "__builtin_ia32_getexpsd128_round", IX86_BUILTIN_GETEXPSD128, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_sgetexpv2df_mask_round, "__builtin_ia32_getexpsd_mask_round", IX86_BUILTIN_GETEXPSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_sgetexpv4sf_round, "__builtin_ia32_getexpss128_round", IX86_BUILTIN_GETEXPSS128, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_sgetexpv4sf_mask_round, "__builtin_ia32_getexpss_mask_round", IX86_BUILTIN_GETEXPSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_getmantv8df_mask_round, "__builtin_ia32_getmantpd512_mask", IX86_BUILTIN_GETMANTPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_INT_V8DF_QI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_getmantv16sf_mask_round, "__builtin_ia32_getmantps512_mask", IX86_BUILTIN_GETMANTPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_INT_V16SF_HI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vgetmantv2df_round, "__builtin_ia32_getmantsd_round", IX86_BUILTIN_GETMANTSD128, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vgetmantv2df_mask_round, "__builtin_ia32_getmantsd_mask_round", IX86_BUILTIN_GETMANTSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT_V2DF_UQI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vgetmantv4sf_round, "__builtin_ia32_getmantss_round", IX86_BUILTIN_GETMANTSS128, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vgetmantv4sf_mask_round, "__builtin_ia32_getmantss_mask_round", IX86_BUILTIN_GETMANTSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT_V4SF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_smaxv8df3_mask_round, "__builtin_ia32_maxpd512_mask", IX86_BUILTIN_MAXPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_smaxv16sf3_mask_round, "__builtin_ia32_maxps512_mask", IX86_BUILTIN_MAXPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsmaxv2df3_round, "__builtin_ia32_maxsd_round", IX86_BUILTIN_MAXSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsmaxv2df3_mask_round, "__builtin_ia32_maxsd_mask_round", IX86_BUILTIN_MAXSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsmaxv4sf3_round, "__builtin_ia32_maxss_round", IX86_BUILTIN_MAXSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsmaxv4sf3_mask_round, "__builtin_ia32_maxss_mask_round", IX86_BUILTIN_MAXSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sminv8df3_mask_round, "__builtin_ia32_minpd512_mask", IX86_BUILTIN_MINPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sminv16sf3_mask_round, "__builtin_ia32_minps512_mask", IX86_BUILTIN_MINPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsminv2df3_round, "__builtin_ia32_minsd_round", IX86_BUILTIN_MINSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsminv2df3_mask_round, "__builtin_ia32_minsd_mask_round", IX86_BUILTIN_MINSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsminv4sf3_round, "__builtin_ia32_minss_round", IX86_BUILTIN_MINSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsminv4sf3_mask_round, "__builtin_ia32_minss_mask_round", IX86_BUILTIN_MINSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_mulv8df3_mask_round, "__builtin_ia32_mulpd512_mask", IX86_BUILTIN_MULPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_mulv16sf3_mask_round, "__builtin_ia32_mulps512_mask", IX86_BUILTIN_MULPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmmulv2df3_round, "__builtin_ia32_mulsd_round", IX86_BUILTIN_MULSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmmulv2df3_mask_round, "__builtin_ia32_mulsd_mask_round", IX86_BUILTIN_MULSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmmulv4sf3_round, "__builtin_ia32_mulss_round", IX86_BUILTIN_MULSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmmulv4sf3_mask_round, "__builtin_ia32_mulss_mask_round", IX86_BUILTIN_MULSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_rndscalev8df_mask_round, "__builtin_ia32_rndscalepd_mask", IX86_BUILTIN_RNDSCALEPD, UNKNOWN, (int) V8DF_FTYPE_V8DF_INT_V8DF_QI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_rndscalev16sf_mask_round, "__builtin_ia32_rndscaleps_mask", IX86_BUILTIN_RNDSCALEPS, UNKNOWN, (int) V16SF_FTYPE_V16SF_INT_V16SF_HI_INT)
> @@ -2840,9 +2826,7 @@ BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsqrtv2df2_mask_round, "__buil
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsqrtv4sf2_mask_round, "__builtin_ia32_sqrtss_mask_round", IX86_BUILTIN_SQRTSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_subv8df3_mask_round, "__builtin_ia32_subpd512_mask", IX86_BUILTIN_SUBPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_subv16sf3_mask_round, "__builtin_ia32_subps512_mask", IX86_BUILTIN_SUBPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsubv2df3_round, "__builtin_ia32_subsd_round", IX86_BUILTIN_SUBSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsubv2df3_mask_round, "__builtin_ia32_subsd_mask_round", IX86_BUILTIN_SUBSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsubv4sf3_round, "__builtin_ia32_subss_round", IX86_BUILTIN_SUBSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsubv4sf3_mask_round, "__builtin_ia32_subss_mask_round", IX86_BUILTIN_SUBSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_cvtsd2si_round, "__builtin_ia32_vcvtsd2si32", IX86_BUILTIN_VCVTSD2SI32, UNKNOWN, (int) INT_FTYPE_V2DF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F | OPTION_MASK_ISA_64BIT, 0, CODE_FOR_sse2_cvtsd2siq_round, "__builtin_ia32_vcvtsd2si64", IX86_BUILTIN_VCVTSD2SI64, UNKNOWN, (int) INT64_FTYPE_V2DF_INT)
> @@ -2866,8 +2850,6 @@ BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_fmadd_v8df_maskz_round, "__b
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_fmadd_v16sf_mask_round, "__builtin_ia32_vfmaddps512_mask", IX86_BUILTIN_VFMADDPS512_MASK, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_fmadd_v16sf_mask3_round, "__builtin_ia32_vfmaddps512_mask3", IX86_BUILTIN_VFMADDPS512_MASK3, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_fmadd_v16sf_maskz_round, "__builtin_ia32_vfmaddps512_maskz", IX86_BUILTIN_VFMADDPS512_MASKZ, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_fmai_vmfmadd_v2df_round, "__builtin_ia32_vfmaddsd3_round", IX86_BUILTIN_VFMADDSD3_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_fmai_vmfmadd_v4sf_round, "__builtin_ia32_vfmaddss3_round", IX86_BUILTIN_VFMADDSS3_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vmfmadd_v2df_mask_round, "__builtin_ia32_vfmaddsd3_mask", IX86_BUILTIN_VFMADDSD3_MASK, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vmfmadd_v2df_mask3_round, "__builtin_ia32_vfmaddsd3_mask3", IX86_BUILTIN_VFMADDSD3_MASK3, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vmfmadd_v2df_maskz_round, "__builtin_ia32_vfmaddsd3_maskz", IX86_BUILTIN_VFMADDSD3_MASKZ, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
> diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
> index cbf4eb7b487..66bf9be5bd4 100644
> --- a/gcc/config/i386/i386-expand.c
> +++ b/gcc/config/i386/i386-expand.c
> @@ -10193,13 +10193,6 @@ ix86_expand_round_builtin (const struct builtin_description *d,
>      case V16SI_FTYPE_V16SF_V16SI_HI_INT:
>      case V8DF_FTYPE_V8SF_V8DF_QI_INT:
>      case V16SF_FTYPE_V16HI_V16SF_HI_INT:
> -    case V2DF_FTYPE_V2DF_V2DF_V2DF_INT:
> -    case V4SF_FTYPE_V4SF_V4SF_V4SF_INT:
> -      nargs = 4;
> -      break;
> -    case V4SF_FTYPE_V4SF_V4SF_INT_INT:
> -    case V2DF_FTYPE_V2DF_V2DF_INT_INT:
> -      nargs_constant = 2;
>        nargs = 4;
>        break;
>      case INT_FTYPE_V4SF_V4SF_INT_INT:
> diff --git a/gcc/testsuite/gcc.target/i386/avx-1.c b/gcc/testsuite/gcc.target/i386/avx-1.c
> index 3600a7abe91..0e00bfbbb5e 100644
> --- a/gcc/testsuite/gcc.target/i386/avx-1.c
> +++ b/gcc/testsuite/gcc.target/i386/avx-1.c
> @@ -172,9 +172,7 @@
>  #define __builtin_ia32_kshiftrihi(A, B) __builtin_ia32_kshiftrihi(A, 8)
>  #define __builtin_ia32_addpd512_mask(A, B, C, D, E) __builtin_ia32_addpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_addps512_mask(A, B, C, D, E) __builtin_ia32_addps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_addsd_round(A, B, C) __builtin_ia32_addsd_round(A, B, 8)
>  #define __builtin_ia32_addsd_mask_round(A, B, C, D, E) __builtin_ia32_addsd_mask_round(A, B, C, D, 8)
> -#define __builtin_ia32_addss_round(A, B, C) __builtin_ia32_addss_round(A, B, 8)
>  #define __builtin_ia32_addss_mask_round(A, B, C, D, E) __builtin_ia32_addss_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_alignd512_mask(A, B, F, D, E) __builtin_ia32_alignd512_mask(A, B, 1, D, E)
>  #define __builtin_ia32_alignq512_mask(A, B, F, D, E) __builtin_ia32_alignq512_mask(A, B, 1, D, E)
> @@ -206,9 +204,7 @@
>  #define __builtin_ia32_cvtusi2ss64(A, B, C) __builtin_ia32_cvtusi2ss64(A, B, 8)
>  #define __builtin_ia32_divpd512_mask(A, B, C, D, E) __builtin_ia32_divpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_divps512_mask(A, B, C, D, E) __builtin_ia32_divps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_divsd_round(A, B, C) __builtin_ia32_divsd_round(A, B, 8)
>  #define __builtin_ia32_divsd_mask_round(A, B, C, D, E) __builtin_ia32_divsd_mask_round(A, B, C, D, 8)
> -#define __builtin_ia32_divss_round(A, B, C) __builtin_ia32_divss_round(A, B, 8)
>  #define __builtin_ia32_divss_mask_round(A, B, C, D, E) __builtin_ia32_divss_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_extractf32x4_mask(A, E, C, D) __builtin_ia32_extractf32x4_mask(A, 1, C, D)
>  #define __builtin_ia32_extractf64x4_mask(A, E, C, D) __builtin_ia32_extractf64x4_mask(A, 1, C, D)
> @@ -232,15 +228,11 @@
>  #define __builtin_ia32_gathersiv8di(A, B, C, D, F) __builtin_ia32_gathersiv8di(A, B, C, D, 8)
>  #define __builtin_ia32_getexppd512_mask(A, B, C, D) __builtin_ia32_getexppd512_mask(A, B, C, 8)
>  #define __builtin_ia32_getexpps512_mask(A, B, C, D) __builtin_ia32_getexpps512_mask(A, B, C, 8)
> -#define __builtin_ia32_getexpsd128_round(A, B, C) __builtin_ia32_getexpsd128_round(A, B, 4)
>  #define __builtin_ia32_getexpsd_mask_round(A, B, C, D, E) __builtin_ia32_getexpsd_mask_round(A, B, C, D, 4)
> -#define __builtin_ia32_getexpss128_round(A, B, C) __builtin_ia32_getexpss128_round(A, B, 4)
>  #define __builtin_ia32_getexpss_mask_round(A, B, C, D, E) __builtin_ia32_getexpss_mask_round(A, B, C, D, 4)
>  #define __builtin_ia32_getmantpd512_mask(A, F, C, D, E) __builtin_ia32_getmantpd512_mask(A, 1, C, D, 8)
>  #define __builtin_ia32_getmantps512_mask(A, F, C, D, E) __builtin_ia32_getmantps512_mask(A, 1, C, D, 8)
> -#define __builtin_ia32_getmantsd_round(A, B, C, D) __builtin_ia32_getmantsd_round(A, B, 1, 4)
>  #define __builtin_ia32_getmantsd_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantsd_mask_round(A, B, 1, W, U, 4)
> -#define __builtin_ia32_getmantss_round(A, B, C, D) __builtin_ia32_getmantss_round(A, B, 1, 4)
>  #define __builtin_ia32_getmantss_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantss_mask_round(A, B, 1, W, U, 4)
>  #define __builtin_ia32_insertf32x4_mask(A, B, F, D, E) __builtin_ia32_insertf32x4_mask(A, B, 1, D, E)
>  #define __builtin_ia32_insertf64x4_mask(A, B, F, D, E) __builtin_ia32_insertf64x4_mask(A, B, 1, D, E)
> @@ -248,21 +240,15 @@
>  #define __builtin_ia32_inserti64x4_mask(A, B, F, D, E) __builtin_ia32_inserti64x4_mask(A, B, 1, D, E)
>  #define __builtin_ia32_maxpd512_mask(A, B, C, D, E) __builtin_ia32_maxpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_maxps512_mask(A, B, C, D, E) __builtin_ia32_maxps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_maxsd_round(A, B, C) __builtin_ia32_maxsd_round(A, B, 4)
>  #define __builtin_ia32_maxsd_mask_round(A, B, C, D, E) __builtin_ia32_maxsd_mask_round(A, B, C, D, 8)
> -#define __builtin_ia32_maxss_round(A, B, C) __builtin_ia32_maxss_round(A, B, 4)
>  #define __builtin_ia32_maxss_mask_round(A, B, C, D, E) __builtin_ia32_maxss_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_minpd512_mask(A, B, C, D, E) __builtin_ia32_minpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_minps512_mask(A, B, C, D, E) __builtin_ia32_minps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_minsd_round(A, B, C) __builtin_ia32_minsd_round(A, B, 4)
>  #define __builtin_ia32_minsd_mask_round(A, B, C, D, E) __builtin_ia32_minsd_mask_round(A, B, C, D, 4)
> -#define __builtin_ia32_minss_round(A, B, C) __builtin_ia32_minss_round(A, B, 4)
>  #define __builtin_ia32_minss_mask_round(A, B, C, D, E) __builtin_ia32_minss_mask_round(A, B, C, D, 4)
>  #define __builtin_ia32_mulpd512_mask(A, B, C, D, E) __builtin_ia32_mulpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_mulps512_mask(A, B, C, D, E) __builtin_ia32_mulps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_mulsd_round(A, B, C) __builtin_ia32_mulsd_round(A, B, 8)
>  #define __builtin_ia32_mulsd_mask_round(A, B, C, D, E) __builtin_ia32_mulsd_mask_round(A, B, C, D, 8)
> -#define __builtin_ia32_mulss_round(A, B, C) __builtin_ia32_mulss_round(A, B, 8)
>  #define __builtin_ia32_mulss_mask_round(A, B, C, D, E) __builtin_ia32_mulss_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_permdf512_mask(A, E, C, D) __builtin_ia32_permdf512_mask(A, 1, C, D)
>  #define __builtin_ia32_permdi512_mask(A, E, C, D) __builtin_ia32_permdi512_mask(A, 1, C, D)
> @@ -309,9 +295,7 @@
>  #define __builtin_ia32_sqrtsd_mask_round(A, B, C, D, E) __builtin_ia32_sqrtsd_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_subpd512_mask(A, B, C, D, E) __builtin_ia32_subpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_subps512_mask(A, B, C, D, E) __builtin_ia32_subps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_subsd_round(A, B, C) __builtin_ia32_subsd_round(A, B, 8)
>  #define __builtin_ia32_subsd_mask_round(A, B, C, D, E) __builtin_ia32_subsd_mask_round(A, B, C, D, 8)
> -#define __builtin_ia32_subss_round(A, B, C) __builtin_ia32_subss_round(A, B, 8)
>  #define __builtin_ia32_subss_mask_round(A, B, C, D, E) __builtin_ia32_subss_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_ucmpd512_mask(A, B, E, D) __builtin_ia32_ucmpd512_mask(A, B, 1, D)
>  #define __builtin_ia32_ucmpq512_mask(A, B, E, D) __builtin_ia32_ucmpq512_mask(A, B, 1, D)
> @@ -341,8 +325,6 @@
>  #define __builtin_ia32_vfmaddps512_mask(A, B, C, D, E) __builtin_ia32_vfmaddps512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_vfmaddps512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddps512_mask3(A, B, C, D, 8)
>  #define __builtin_ia32_vfmaddps512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddps512_maskz(A, B, C, D, 8)
> -#define __builtin_ia32_vfmaddsd3_round(A, B, C, D) __builtin_ia32_vfmaddsd3_round(A, B, C, 8)
> -#define __builtin_ia32_vfmaddss3_round(A, B, C, D) __builtin_ia32_vfmaddss3_round(A, B, C, 8)
>  #define __builtin_ia32_vfmaddsubpd512_mask(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_vfmaddsubpd512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_mask3(A, B, C, D, 8)
>  #define __builtin_ia32_vfmaddsubpd512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_maskz(A, B, C, D, 8)
> diff --git a/gcc/testsuite/gcc.target/i386/sse-13.c b/gcc/testsuite/gcc.target/i386/sse-13.c
> index 45c1c285c57..fdb7852f0b3 100644
> --- a/gcc/testsuite/gcc.target/i386/sse-13.c
> +++ b/gcc/testsuite/gcc.target/i386/sse-13.c
> @@ -189,9 +189,7 @@
>  #define __builtin_ia32_kshiftrihi(A, B) __builtin_ia32_kshiftrihi(A, 8)
>  #define __builtin_ia32_addpd512_mask(A, B, C, D, E) __builtin_ia32_addpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_addps512_mask(A, B, C, D, E) __builtin_ia32_addps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_addsd_round(A, B, C) __builtin_ia32_addsd_round(A, B, 8)
>  #define __builtin_ia32_addsd_mask_round(A, B, C, D, E) __builtin_ia32_addsd_mask_round(A, B, C, D, 8)
> -#define __builtin_ia32_addss_round(A, B, C) __builtin_ia32_addss_round(A, B, 8)
>  #define __builtin_ia32_addss_mask_round(A, B, C, D, E) __builtin_ia32_addss_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_alignd512_mask(A, B, F, D, E) __builtin_ia32_alignd512_mask(A, B, 1, D, E)
>  #define __builtin_ia32_alignq512_mask(A, B, F, D, E) __builtin_ia32_alignq512_mask(A, B, 1, D, E)
> @@ -223,9 +221,7 @@
>  #define __builtin_ia32_cvtusi2ss64(A, B, C) __builtin_ia32_cvtusi2ss64(A, B, 8)
>  #define __builtin_ia32_divpd512_mask(A, B, C, D, E) __builtin_ia32_divpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_divps512_mask(A, B, C, D, E) __builtin_ia32_divps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_divsd_round(A, B, C) __builtin_ia32_divsd_round(A, B, 8)
>  #define __builtin_ia32_divsd_mask_round(A, B, C, D, E) __builtin_ia32_divsd_mask_round(A, B, C, D, 8)
> -#define __builtin_ia32_divss_round(A, B, C) __builtin_ia32_divss_round(A, B, 8)
>  #define __builtin_ia32_divss_mask_round(A, B, C, D, E) __builtin_ia32_divss_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_extractf32x4_mask(A, E, C, D) __builtin_ia32_extractf32x4_mask(A, 1, C, D)
>  #define __builtin_ia32_extractf64x4_mask(A, E, C, D) __builtin_ia32_extractf64x4_mask(A, 1, C, D)
> @@ -249,15 +245,11 @@
>  #define __builtin_ia32_gathersiv8di(A, B, C, D, F) __builtin_ia32_gathersiv8di(A, B, C, D, 8)
>  #define __builtin_ia32_getexppd512_mask(A, B, C, D) __builtin_ia32_getexppd512_mask(A, B, C, 8)
>  #define __builtin_ia32_getexpps512_mask(A, B, C, D) __builtin_ia32_getexpps512_mask(A, B, C, 8)
> -#define __builtin_ia32_getexpsd128_round(A, B, C) __builtin_ia32_getexpsd128_round(A, B, 4)
>  #define __builtin_ia32_getexpsd_mask_round(A, B, C, D, E) __builtin_ia32_getexpsd_mask_round(A, B, C, D, 4)
> -#define __builtin_ia32_getexpss128_round(A, B, C) __builtin_ia32_getexpss128_round(A, B, 4)
>  #define __builtin_ia32_getexpss_mask_round(A, B, C, D, E) __builtin_ia32_getexpss_mask_round(A, B, C, D, 4)
>  #define __builtin_ia32_getmantpd512_mask(A, F, C, D, E) __builtin_ia32_getmantpd512_mask(A, 1, C, D, 8)
>  #define __builtin_ia32_getmantps512_mask(A, F, C, D, E) __builtin_ia32_getmantps512_mask(A, 1, C, D, 8)
> -#define __builtin_ia32_getmantsd_round(A, B, C, D) __builtin_ia32_getmantsd_round(A, B, 1, 4)
>  #define __builtin_ia32_getmantsd_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantsd_mask_round(A, B, 1, W, U, 4)
> -#define __builtin_ia32_getmantss_round(A, B, C, D) __builtin_ia32_getmantss_round(A, B, 1, 4)
>  #define __builtin_ia32_getmantss_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantss_mask_round(A, B, 1, W, U, 4)
>  #define __builtin_ia32_insertf32x4_mask(A, B, F, D, E) __builtin_ia32_insertf32x4_mask(A, B, 1, D, E)
>  #define __builtin_ia32_insertf64x4_mask(A, B, F, D, E) __builtin_ia32_insertf64x4_mask(A, B, 1, D, E)
> @@ -265,21 +257,15 @@
>  #define __builtin_ia32_inserti64x4_mask(A, B, F, D, E) __builtin_ia32_inserti64x4_mask(A, B, 1, D, E)
>  #define __builtin_ia32_maxpd512_mask(A, B, C, D, E) __builtin_ia32_maxpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_maxps512_mask(A, B, C, D, E) __builtin_ia32_maxps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_maxsd_round(A, B, C) __builtin_ia32_maxsd_round(A, B, 4)
>  #define __builtin_ia32_maxsd_mask_round(A, B, C, D, E) __builtin_ia32_maxsd_mask_round(A, B, C, D, 4)
> -#define __builtin_ia32_maxss_round(A, B, C) __builtin_ia32_maxss_round(A, B, 4)
>  #define __builtin_ia32_maxss_mask_round(A, B, C, D, E) __builtin_ia32_maxss_mask_round(A, B, C, D, 4)
>  #define __builtin_ia32_minpd512_mask(A, B, C, D, E) __builtin_ia32_minpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_minps512_mask(A, B, C, D, E) __builtin_ia32_minps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_minsd_round(A, B, C) __builtin_ia32_minsd_round(A, B, 4)
>  #define __builtin_ia32_minsd_mask_round(A, B, C, D, E) __builtin_ia32_minsd_mask_round(A, B, C, D, 4)
> -#define __builtin_ia32_minss_round(A, B, C) __builtin_ia32_minss_round(A, B, 4)
>  #define __builtin_ia32_minss_mask_round(A, B, C, D, E) __builtin_ia32_minss_mask_round(A, B, C, D, 4)
>  #define __builtin_ia32_mulpd512_mask(A, B, C, D, E) __builtin_ia32_mulpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_mulps512_mask(A, B, C, D, E) __builtin_ia32_mulps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_mulsd_round(A, B, C) __builtin_ia32_mulsd_round(A, B, 8)
>  #define __builtin_ia32_mulsd_mask_round(A, B, C, D, E) __builtin_ia32_mulsd_mask_round(A, B, C, D, 8)
> -#define __builtin_ia32_mulss_round(A, B, C) __builtin_ia32_mulss_round(A, B, 8)
>  #define __builtin_ia32_mulss_mask_round(A, B, C, D, E) __builtin_ia32_mulss_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_permdf512_mask(A, E, C, D) __builtin_ia32_permdf512_mask(A, 1, C, D)
>  #define __builtin_ia32_permdi512_mask(A, E, C, D) __builtin_ia32_permdi512_mask(A, 1, C, D)
> @@ -326,9 +312,7 @@
>  #define __builtin_ia32_sqrtsd_mask_round(A, B, C, D, E) __builtin_ia32_sqrtsd_mask_round(A, B, C, E, 8)
>  #define __builtin_ia32_subpd512_mask(A, B, C, D, E) __builtin_ia32_subpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_subps512_mask(A, B, C, D, E) __builtin_ia32_subps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_subsd_round(A, B, C) __builtin_ia32_subsd_round(A, B, 8)
>  #define __builtin_ia32_subsd_mask_round(A, B, C, D, E) __builtin_ia32_subsd_mask_round(A, B, C, D, 8)
> -#define __builtin_ia32_subss_round(A, B, C) __builtin_ia32_subss_round(A, B, 8)
>  #define __builtin_ia32_subss_mask_round(A, B, C, D, E) __builtin_ia32_subss_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_ucmpd512_mask(A, B, E, D) __builtin_ia32_ucmpd512_mask(A, B, 1, D)
>  #define __builtin_ia32_ucmpq512_mask(A, B, E, D) __builtin_ia32_ucmpq512_mask(A, B, 1, D)
> @@ -358,8 +342,6 @@
>  #define __builtin_ia32_vfmaddps512_mask(A, B, C, D, E) __builtin_ia32_vfmaddps512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_vfmaddps512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddps512_mask3(A, B, C, D, 8)
>  #define __builtin_ia32_vfmaddps512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddps512_maskz(A, B, C, D, 8)
> -#define __builtin_ia32_vfmaddsd3_round(A, B, C, D) __builtin_ia32_vfmaddsd3_round(A, B, C, 8)
> -#define __builtin_ia32_vfmaddss3_round(A, B, C, D) __builtin_ia32_vfmaddss3_round(A, B, C, 8)
>  #define __builtin_ia32_vfmaddsubpd512_mask(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_vfmaddsubpd512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_mask3(A, B, C, D, 8)
>  #define __builtin_ia32_vfmaddsubpd512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_maskz(A, B, C, D, 8)
> diff --git a/gcc/testsuite/gcc.target/i386/sse-23.c b/gcc/testsuite/gcc.target/i386/sse-23.c
> index e98c7693ef7..cb98cc63e6b 100644
> --- a/gcc/testsuite/gcc.target/i386/sse-23.c
> +++ b/gcc/testsuite/gcc.target/i386/sse-23.c
> @@ -191,9 +191,7 @@
>  #define __builtin_ia32_kshiftrihi(A, B) __builtin_ia32_kshiftrihi(A, 8)
>  #define __builtin_ia32_addpd512_mask(A, B, C, D, E) __builtin_ia32_addpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_addps512_mask(A, B, C, D, E) __builtin_ia32_addps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_addsd_round(A, B, C) __builtin_ia32_addsd_round(A, B, 8)
>  #define __builtin_ia32_addsd_mask_round(A, B, C, D, E) __builtin_ia32_addsd_mask_round(A, B, C, D, 8)
> -#define __builtin_ia32_addss_round(A, B, C) __builtin_ia32_addss_round(A, B, 8)
>  #define __builtin_ia32_addss_mask_round(A, B, C, D, E) __builtin_ia32_addss_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_alignd512_mask(A, B, F, D, E) __builtin_ia32_alignd512_mask(A, B, 1, D, E)
>  #define __builtin_ia32_alignq512_mask(A, B, F, D, E) __builtin_ia32_alignq512_mask(A, B, 1, D, E)
> @@ -225,9 +223,7 @@
>  #define __builtin_ia32_cvtusi2ss64(A, B, C) __builtin_ia32_cvtusi2ss64(A, B, 8)
>  #define __builtin_ia32_divpd512_mask(A, B, C, D, E) __builtin_ia32_divpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_divps512_mask(A, B, C, D, E) __builtin_ia32_divps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_divsd_round(A, B, C) __builtin_ia32_divsd_round(A, B, 8)
>  #define __builtin_ia32_divsd_mask_round(A, B, C, D, E) __builtin_ia32_divsd_mask_round(A, B, C, D, 8)
> -#define __builtin_ia32_divss_round(A, B, C) __builtin_ia32_divss_round(A, B, 8)
>  #define __builtin_ia32_divss_mask_round(A, B, C, D, E) __builtin_ia32_divss_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_extractf32x4_mask(A, E, C, D) __builtin_ia32_extractf32x4_mask(A, 1, C, D)
>  #define __builtin_ia32_extractf64x4_mask(A, E, C, D) __builtin_ia32_extractf64x4_mask(A, 1, C, D)
> @@ -251,15 +247,11 @@
>  #define __builtin_ia32_gathersiv8di(A, B, C, D, F) __builtin_ia32_gathersiv8di(A, B, C, D, 8)
>  #define __builtin_ia32_getexppd512_mask(A, B, C, D) __builtin_ia32_getexppd512_mask(A, B, C, 8)
>  #define __builtin_ia32_getexpps512_mask(A, B, C, D) __builtin_ia32_getexpps512_mask(A, B, C, 8)
> -#define __builtin_ia32_getexpsd128_round(A, B, C) __builtin_ia32_getexpsd128_round(A, B, 4)
>  #define __builtin_ia32_getexpsd_mask_round(A, B, C, D, E) __builtin_ia32_getexpsd_mask_round(A, B, C, D, 4)
> -#define __builtin_ia32_getexpss128_round(A, B, C) __builtin_ia32_getexpss128_round(A, B, 4)
>  #define __builtin_ia32_getexpss_mask_round(A, B, C, D, E) __builtin_ia32_getexpss_mask_round(A, B, C, D, 4)
>  #define __builtin_ia32_getmantpd512_mask(A, F, C, D, E) __builtin_ia32_getmantpd512_mask(A, 1, C, D, 8)
>  #define __builtin_ia32_getmantps512_mask(A, F, C, D, E) __builtin_ia32_getmantps512_mask(A, 1, C, D, 8)
> -#define __builtin_ia32_getmantsd_round(A, B, C, D) __builtin_ia32_getmantsd_round(A, B, 1, 4)
>  #define __builtin_ia32_getmantsd_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantsd_mask_round(A, B, 1, W, U, 4)
> -#define __builtin_ia32_getmantss_round(A, B, C, D) __builtin_ia32_getmantss_round(A, B, 1, 4)
>  #define __builtin_ia32_getmantss_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantss_mask_round(A, B, 1, W, U, 4)
>  #define __builtin_ia32_insertf32x4_mask(A, B, F, D, E) __builtin_ia32_insertf32x4_mask(A, B, 1, D, E)
>  #define __builtin_ia32_insertf64x4_mask(A, B, F, D, E) __builtin_ia32_insertf64x4_mask(A, B, 1, D, E)
> @@ -267,9 +259,7 @@
>  #define __builtin_ia32_inserti64x4_mask(A, B, F, D, E) __builtin_ia32_inserti64x4_mask(A, B, 1, D, E)
>  #define __builtin_ia32_maxpd512_mask(A, B, C, D, E) __builtin_ia32_maxpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_maxps512_mask(A, B, C, D, E) __builtin_ia32_maxps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_maxsd_round(A, B, C) __builtin_ia32_maxsd_round(A, B, 4)
>  #define __builtin_ia32_maxsd_mask_round(A, B, C, D, E) __builtin_ia32_maxsd_mask_round(A, B, C, D, 4)
> -#define __builtin_ia32_maxss_round(A, B, C) __builtin_ia32_maxss_round(A, B, 4)
>  #define __builtin_ia32_maxss_mask_round(A, B, C, D, E) __builtin_ia32_maxss_mask_round(A, B, C, D, 4)
>  #define __builtin_ia32_minpd512_mask(A, B, C, D, E) __builtin_ia32_minpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_minps512_mask(A, B, C, D, E) __builtin_ia32_minps512_mask(A, B, C, D, 8)
> @@ -279,9 +269,7 @@
>  #define __builtin_ia32_minss_mask_round(A, B, C, D, E) __builtin_ia32_minss_mask_round(A, B, C, D, 4)
>  #define __builtin_ia32_mulpd512_mask(A, B, C, D, E) __builtin_ia32_mulpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_mulps512_mask(A, B, C, D, E) __builtin_ia32_mulps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_mulsd_round(A, B, C) __builtin_ia32_mulsd_round(A, B, 8)
>  #define __builtin_ia32_mulsd_mask_round(A, B, C, D, E) __builtin_ia32_mulsd_mask_round(A, B, C, D, 8)
> -#define __builtin_ia32_mulss_round(A, B, C) __builtin_ia32_mulss_round(A, B, 8)
>  #define __builtin_ia32_mulss_mask_round(A, B, C, D, E) __builtin_ia32_mulss_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_permdf512_mask(A, E, C, D) __builtin_ia32_permdf512_mask(A, 1, C, D)
>  #define __builtin_ia32_permdi512_mask(A, E, C, D) __builtin_ia32_permdi512_mask(A, 1, C, D)
> @@ -328,9 +316,7 @@
>  #define __builtin_ia32_sqrtsd_mask_round(A, B, C, D, E) __builtin_ia32_sqrtsd_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_subpd512_mask(A, B, C, D, E) __builtin_ia32_subpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_subps512_mask(A, B, C, D, E) __builtin_ia32_subps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_subsd_round(A, B, C) __builtin_ia32_subsd_round(A, B, 8)
>  #define __builtin_ia32_subsd_mask_round(A, B, C, D, E) __builtin_ia32_subsd_mask_round(A, B, C, D, 8)
> -#define __builtin_ia32_subss_round(A, B, C) __builtin_ia32_subss_round(A, B, 8)
>  #define __builtin_ia32_subss_mask_round(A, B, C, D, E) __builtin_ia32_subss_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_ucmpd512_mask(A, B, E, D) __builtin_ia32_ucmpd512_mask(A, B, 1, D)
>  #define __builtin_ia32_ucmpq512_mask(A, B, E, D) __builtin_ia32_ucmpq512_mask(A, B, 1, D)
> @@ -360,8 +346,6 @@
>  #define __builtin_ia32_vfmaddps512_mask(A, B, C, D, E) __builtin_ia32_vfmaddps512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_vfmaddps512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddps512_mask3(A, B, C, D, 8)
>  #define __builtin_ia32_vfmaddps512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddps512_maskz(A, B, C, D, 8)
> -#define __builtin_ia32_vfmaddsd3_round(A, B, C, D) __builtin_ia32_vfmaddsd3_round(A, B, C, 8)
> -#define __builtin_ia32_vfmaddss3_round(A, B, C, D) __builtin_ia32_vfmaddss3_round(A, B, C, 8)
>  #define __builtin_ia32_vfmaddsubpd512_mask(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_vfmaddsubpd512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_mask3(A, B, C, D, 8)
>  #define __builtin_ia32_vfmaddsubpd512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_maskz(A, B, C, D, 8)
> diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
> index 98f1141a8a4..e102b15ce54 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -7786,7 +7786,7 @@ proc check_effective_target_avx512f { } {
>  
>  	__m128d _mm128_getmant (__m128d a)
>  	{
> -	  return __builtin_ia32_getmantsd_round (a, a, 0, 8);
> +	  return __builtin_ia32_getmantsd_mask_round (a, a, 0, a, 1, 8);
>  	}
>      } "-O2 -mavx512f" ]
>  }

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Remove redundant builtins for avx512f scalar instructions.
  2020-11-13  5:42 ` Jeff Law
@ 2020-11-13  6:21   ` Hongyu Wang
  2020-11-30 16:23     ` Jeff Law
  0 siblings, 1 reply; 8+ messages in thread
From: Hongyu Wang @ 2020-11-13  6:21 UTC (permalink / raw)
  To: Jeff Law; +Cc: Jakub Jelinek, GCC Patches, Hongtao Liu, H.J. Lu

Hi

Thanks for reminding me about this patch. I didn't remove any existing
intrinsics, just remove redundant builtin functions that end-users
would not likely to use.

Also I'm OK to keep current implementation, in case there might be
someone using the builtin directly.

Jeff Law <law@redhat.com> 于2020年11月13日周五 下午1:43写道:
>
>
> On 12/23/19 10:31 PM, Hongyu Wang wrote:
>
> Hi:
>   For avx512f scalar instructions, current builtin function like
> __builtin_ia32_*{sd,ss}_round can be replaced by
> __builtin_ia32_*{sd,ss}_mask_round with mask parameter set to -1. This
> patch did the replacement and remove the corresponding redundant
> builtins.
>
>   Bootstrap is ok, make-check ok for i386 target.
>   Ok for trunk?
>
> Changelog
>
> gcc/
>         * config/i386/avx512fintrin.h
>         (_mm_add_round_sd, _mm_add_round_ss): Use
>          __builtin_ia32_adds?_mask_round builtins instead of
>         __builtin_ia32_adds?_round.
>         (_mm_sub_round_sd, _mm_sub_round_ss,
>         _mm_mul_round_sd, _mm_mul_round_ss,
>         _mm_div_round_sd, _mm_div_round_ss,
>         _mm_getexp_sd, _mm_getexp_ss,
>         _mm_getexp_round_sd, _mm_getexp_round_ss,
>         _mm_getmant_sd, _mm_getmant_ss,
>         _mm_getmant_round_sd, _mm_getmant_round_ss,
>         _mm_max_round_sd, _mm_max_round_ss,
>         _mm_min_round_sd, _mm_min_round_ss,
>         _mm_fmadd_round_sd, _mm_fmadd_round_ss,
>         _mm_fmsub_round_sd, _mm_fmsub_round_ss,
>         _mm_fnmadd_round_sd, _mm_fnmadd_round_ss,
>         _mm_fnmsub_round_sd, _mm_fnmsub_round_ss): Likewise.
>         * config/i386/i386-builtin.def
>         (__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
>         __builtin_ia32_subsd_round, __builtin_ia32_subss_round,
>         __builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
>         __builtin_ia32_divsd_round, __builtin_ia32_divss_round,
>         __builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
>         __builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
>         __builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
>         __builtin_ia32_minsd_round, __builtin_ia32_minss_round,
>         __builtin_ia32_vfmaddsd3_round,
>         __builtin_ia32_vfmaddss3_round): Remove.
>         * config/i386/i386-expand.c
>         (ix86_expand_round_builtin): Remove corresponding case.
>
> gcc/testsuite/
>         * lib/target-supports.exp
>         (check_effective_target_avx512f): Use
>         __builtin_ia32_getmantsd_mask_round builtins instead of
>         __builtin_ia32_getmantsd_round.
>         *gcc.target/i386/avx-1.c
>         (__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
>         __builtin_ia32_subsd_round, __builtin_ia32_subss_round,
>         __builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
>         __builtin_ia32_divsd_round, __builtin_ia32_divss_round,
>         __builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
>         __builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
>         __builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
>         __builtin_ia32_minsd_round, __builtin_ia32_minss_round,
>         __builtin_ia32_vfmaddsd3_round,
>         __builtin_ia32_vfmaddss3_round): Remove.
>         *gcc.target/i386/sse-13.c: Ditto.
>         *gcc.target/i386/sse-23.c: Ditto.
>
> So I like the idea of simplifying the implementation of some of the intrinsics when we can, but ISTM that removing existing intrinsics would be a mistake since end-users could be using them in their code.   I'd think we'd want to keep the existing APIs, even if we change the implementation under the hood.
>
>
> Thoughts?
>
>
> jeff
>
>
> Hongyu Wang
>
>
> 0001-Remove-redundant-round-builtins-for-avx512f-scalar-i.patch
>
> From 9cc4928aad5770c53ff580f5c996092cdaf2f9ba Mon Sep 17 00:00:00 2001
> From: hongyuw1 <hongyuw1@gitlab.devtools.intel.com>
> Date: Wed, 18 Dec 2019 14:52:54 +0000
> Subject: [PATCH] Remove redundant round builtins for avx512f scalar
>  instructions
>
> Changelog
>
> gcc/
> * config/i386/avx512fintrin.h
> (_mm_add_round_sd, _mm_add_round_ss): Use
> __builtin_ia32_adds?_mask_round builtins instead of
> __builtin_ia32_adds?_round.
> (_mm_sub_round_sd, _mm_sub_round_ss,
> _mm_mul_round_sd, _mm_mul_round_ss,
> _mm_div_round_sd, _mm_div_round_ss,
> _mm_getexp_sd, _mm_getexp_ss,
> _mm_getexp_round_sd, _mm_getexp_round_ss,
> _mm_getmant_sd, _mm_getmant_ss,
> _mm_getmant_round_sd, _mm_getmant_round_ss,
> _mm_max_round_sd, _mm_max_round_ss,
> _mm_min_round_sd, _mm_min_round_ss,
> _mm_fmadd_round_sd, _mm_fmadd_round_ss,
> _mm_fmsub_round_sd, _mm_fmsub_round_ss,
> _mm_fnmadd_round_sd, _mm_fnmadd_round_ss,
> _mm_fnmsub_round_sd, _mm_fnmsub_round_ss): Likewise.
> * config/i386/i386-builtin.def
> (__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
> __builtin_ia32_subsd_round, __builtin_ia32_subss_round,
> __builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
> __builtin_ia32_divsd_round, __builtin_ia32_divss_round,
> __builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
> __builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
> __builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
> __builtin_ia32_minsd_round, __builtin_ia32_minss_round,
> __builtin_ia32_vfmaddsd3_round,
> __builtin_ia32_vfmaddss3_round): Remove.
> * config/i386/i386-expand.c
> (ix86_expand_round_builtin): Remove corresponding case.
>
> gcc/testsuite/
> * lib/target-supports.exp
> (check_effective_target_avx512f): Use
> __builtin_ia32_getmantsd_mask_round builtins instead of
> __builtin_ia32_getmantsd_round.
> *gcc.target/i386/avx-1.c
> (__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
> __builtin_ia32_subsd_round, __builtin_ia32_subss_round,
> __builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
> __builtin_ia32_divsd_round, __builtin_ia32_divss_round,
> __builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
> __builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
> __builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
> __builtin_ia32_minsd_round, __builtin_ia32_minss_round,
> __builtin_ia32_vfmaddsd3_round,
> __builtin_ia32_vfmaddss3_round): Remove.
> *gcc.target/i386/sse-13.c: Ditto.
> *gcc.target/i386/sse-23.c: Ditto.
> ---
>  gcc/config/i386/avx512fintrin.h        | 584 +++++++++++++++++--------
>  gcc/config/i386/i386-builtin.def       |  18 -
>  gcc/config/i386/i386-expand.c          |   7 -
>  gcc/testsuite/gcc.target/i386/avx-1.c  |  18 -
>  gcc/testsuite/gcc.target/i386/sse-13.c |  18 -
>  gcc/testsuite/gcc.target/i386/sse-23.c |  16 -
>  gcc/testsuite/lib/target-supports.exp  |   2 +-
>  7 files changed, 404 insertions(+), 259 deletions(-)
>
> diff --git a/gcc/config/i386/avx512fintrin.h b/gcc/config/i386/avx512fintrin.h
> index 1d08f01a841..cdb4c948496 100644
> --- a/gcc/config/i386/avx512fintrin.h
> +++ b/gcc/config/i386/avx512fintrin.h
> @@ -1481,9 +1481,12 @@ extern __inline __m128d
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_add_round_sd (__m128d __A, __m128d __B, const int __R)
>  {
> -  return (__m128d) __builtin_ia32_addsd_round ((__v2df) __A,
> -       (__v2df) __B,
> -       __R);
> +  return (__m128d) __builtin_ia32_addsd_mask_round ((__v2df) __A,
> +    (__v2df) __B,
> +    (__v2df)
> +    _mm_undefined_pd (),
> +    (__mmask8) -1,
> +    __R);
>  }
>
>  extern __inline __m128d
> @@ -1513,9 +1516,12 @@ extern __inline __m128
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_add_round_ss (__m128 __A, __m128 __B, const int __R)
>  {
> -  return (__m128) __builtin_ia32_addss_round ((__v4sf) __A,
> -      (__v4sf) __B,
> -      __R);
> +  return (__m128) __builtin_ia32_addss_mask_round ((__v4sf) __A,
> +   (__v4sf) __B,
> +   (__v4sf)
> +   _mm_undefined_ps (),
> +   (__mmask8) -1,
> +   __R);
>  }
>
>  extern __inline __m128
> @@ -1545,9 +1551,12 @@ extern __inline __m128d
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_sub_round_sd (__m128d __A, __m128d __B, const int __R)
>  {
> -  return (__m128d) __builtin_ia32_subsd_round ((__v2df) __A,
> -       (__v2df) __B,
> -       __R);
> +  return (__m128d) __builtin_ia32_subsd_mask_round ((__v2df) __A,
> +    (__v2df) __B,
> +    (__v2df)
> +    _mm_undefined_pd (),
> +    (__mmask8) -1,
> +    __R);
>  }
>
>  extern __inline __m128d
> @@ -1577,9 +1586,12 @@ extern __inline __m128
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_sub_round_ss (__m128 __A, __m128 __B, const int __R)
>  {
> -  return (__m128) __builtin_ia32_subss_round ((__v4sf) __A,
> -      (__v4sf) __B,
> -      __R);
> +  return (__m128) __builtin_ia32_subss_mask_round ((__v4sf) __A,
> +   (__v4sf) __B,
> +   (__v4sf)
> +   _mm_undefined_ps (),
> +   (__mmask8) -1,
> +   __R);
>  }
>
>  extern __inline __m128
> @@ -1606,8 +1618,14 @@ _mm_maskz_sub_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>  }
>
>  #else
> -#define _mm_add_round_sd(A, B, C)            \
> -    (__m128d)__builtin_ia32_addsd_round(A, B, C)
> +#define _mm_add_round_sd(A, B, C) \
> +  ((__m128d) \
> +   __builtin_ia32_addsd_mask_round ((__v2df) (__m128d) (A), \
> +    (__v2df) (__m128d) (B), \
> +    (__v2df) (__m128d) \
> +    _mm_undefined_pd (), \
> +    (__mmask8) (-1), \
> +    (int) (C)))
>
>  #define _mm_mask_add_round_sd(W, U, A, B, C) \
>      (__m128d)__builtin_ia32_addsd_mask_round(A, B, W, U, C)
> @@ -1615,8 +1633,14 @@ _mm_maskz_sub_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>  #define _mm_maskz_add_round_sd(U, A, B, C)   \
>      (__m128d)__builtin_ia32_addsd_mask_round(A, B, (__v2df)_mm_setzero_pd(), U, C)
>
> -#define _mm_add_round_ss(A, B, C)            \
> -    (__m128)__builtin_ia32_addss_round(A, B, C)
> +#define _mm_add_round_ss(A, B, C) \
> +  ((__m128) \
> +   __builtin_ia32_addss_mask_round ((__v4sf) (__m128) (A), \
> +    (__v4sf) (__m128) (B), \
> +    (__v4sf) (__m128) \
> +    _mm_undefined_ps (), \
> +    (__mmask8) (-1), \
> +    (int) (C)))
>
>  #define _mm_mask_add_round_ss(W, U, A, B, C) \
>      (__m128)__builtin_ia32_addss_mask_round(A, B, W, U, C)
> @@ -1624,8 +1648,14 @@ _mm_maskz_sub_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>  #define _mm_maskz_add_round_ss(U, A, B, C)   \
>      (__m128)__builtin_ia32_addss_mask_round(A, B, (__v4sf)_mm_setzero_ps(), U, C)
>
> -#define _mm_sub_round_sd(A, B, C)            \
> -    (__m128d)__builtin_ia32_subsd_round(A, B, C)
> +#define _mm_sub_round_sd(A, B, C) \
> +  ((__m128d) \
> +   __builtin_ia32_subsd_mask_round ((__v2df) (__m128d) (A), \
> +    (__v2df) (__m128d) (B), \
> +    (__v2df) (__m128d) \
> +    _mm_undefined_pd (), \
> +    (__mmask8) (-1), \
> +    (int) (C)))
>
>  #define _mm_mask_sub_round_sd(W, U, A, B, C) \
>      (__m128d)__builtin_ia32_subsd_mask_round(A, B, W, U, C)
> @@ -1633,8 +1663,14 @@ _mm_maskz_sub_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>  #define _mm_maskz_sub_round_sd(U, A, B, C)   \
>      (__m128d)__builtin_ia32_subsd_mask_round(A, B, (__v2df)_mm_setzero_pd(), U, C)
>
> -#define _mm_sub_round_ss(A, B, C)            \
> -    (__m128)__builtin_ia32_subss_round(A, B, C)
> +#define _mm_sub_round_ss(A, B, C) \
> +  ((__m128) \
> +   __builtin_ia32_subss_mask_round ((__v4sf) (__m128) (A), \
> +    (__v4sf) (__m128) (B), \
> +    (__v4sf) (__m128) \
> +    _mm_undefined_ps (), \
> +    (__mmask8) (-1), \
> +    (int) (C)))
>
>  #define _mm_mask_sub_round_ss(W, U, A, B, C) \
>      (__m128)__builtin_ia32_subss_mask_round(A, B, W, U, C)
> @@ -2730,9 +2766,12 @@ extern __inline __m128d
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_mul_round_sd (__m128d __A, __m128d __B, const int __R)
>  {
> -  return (__m128d) __builtin_ia32_mulsd_round ((__v2df) __A,
> -       (__v2df) __B,
> -       __R);
> +  return (__m128d) __builtin_ia32_mulsd_mask_round ((__v2df) __A,
> +    (__v2df) __B,
> +    (__v2df)
> +    _mm_undefined_pd (),
> +    (__mmask8) -1,
> +    __R);
>  }
>
>  extern __inline __m128d
> @@ -2762,9 +2801,12 @@ extern __inline __m128
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_mul_round_ss (__m128 __A, __m128 __B, const int __R)
>  {
> -  return (__m128) __builtin_ia32_mulss_round ((__v4sf) __A,
> -      (__v4sf) __B,
> -      __R);
> +  return (__m128) __builtin_ia32_mulss_mask_round ((__v4sf) __A,
> +   (__v4sf) __B,
> +   (__v4sf)
> +   _mm_undefined_pd (),
> +   (__mmask8) -1,
> +   __R);
>  }
>
>  extern __inline __m128
> @@ -2794,9 +2836,12 @@ extern __inline __m128d
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_div_round_sd (__m128d __A, __m128d __B, const int __R)
>  {
> -  return (__m128d) __builtin_ia32_divsd_round ((__v2df) __A,
> -       (__v2df) __B,
> -       __R);
> +  return (__m128d) __builtin_ia32_divsd_mask_round ((__v2df) __A,
> +    (__v2df) __B,
> +    (__v2df)
> +    _mm_undefined_pd (),
> +    (__mmask8) -1,
> +    __R);
>  }
>
>  extern __inline __m128d
> @@ -2826,9 +2871,12 @@ extern __inline __m128
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_div_round_ss (__m128 __A, __m128 __B, const int __R)
>  {
> -  return (__m128) __builtin_ia32_divss_round ((__v4sf) __A,
> -      (__v4sf) __B,
> -      __R);
> +  return (__m128) __builtin_ia32_divss_mask_round ((__v4sf) __A,
> +   (__v4sf) __B,
> +   (__v4sf)
> +   _mm_undefined_pd (),
> +   (__mmask8) -1,
> +   __R);
>  }
>
>  extern __inline __m128
> @@ -2891,8 +2939,14 @@ _mm_maskz_div_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>  #define _mm512_maskz_div_round_ps(U, A, B, C)   \
>      (__m512)__builtin_ia32_divps512_mask(A, B, (__v16sf)_mm512_setzero_ps(), U, C)
>
> -#define _mm_mul_round_sd(A, B, C)            \
> -    (__m128d)__builtin_ia32_mulsd_round(A, B, C)
> +#define _mm_mul_round_sd(A, B, C) \
> +  ((__m128d) \
> +   __builtin_ia32_mulsd_mask_round ((__v2df) (__m128d) (A), \
> +    (__v2df) (__m128d) (B), \
> +    (__v2df) (__m128d) \
> +    _mm_undefined_pd (), \
> +    (__mmask8) (-1), \
> +    (int) (C)))
>
>  #define _mm_mask_mul_round_sd(W, U, A, B, C) \
>      (__m128d)__builtin_ia32_mulsd_mask_round(A, B, W, U, C)
> @@ -2900,8 +2954,14 @@ _mm_maskz_div_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>  #define _mm_maskz_mul_round_sd(U, A, B, C)   \
>      (__m128d)__builtin_ia32_mulsd_mask_round(A, B, (__v2df)_mm_setzero_pd(), U, C)
>
> -#define _mm_mul_round_ss(A, B, C)            \
> -    (__m128)__builtin_ia32_mulss_round(A, B, C)
> +#define _mm_mul_round_ss(A, B, C) \
> +  ((__m128) \
> +   __builtin_ia32_mulss_mask_round ((__v4sf) (__m128) (A), \
> +    (__v4sf) (__m128) (B), \
> +    (__v4sf) (__m128) \
> +    _mm_undefined_ps (), \
> +    (__mmask8) -1, \
> +    (int) (C)))
>
>  #define _mm_mask_mul_round_ss(W, U, A, B, C) \
>      (__m128)__builtin_ia32_mulss_mask_round(A, B, W, U, C)
> @@ -2909,8 +2969,14 @@ _mm_maskz_div_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>  #define _mm_maskz_mul_round_ss(U, A, B, C)   \
>      (__m128)__builtin_ia32_mulss_mask_round(A, B, (__v4sf)_mm_setzero_ps(), U, C)
>
> -#define _mm_div_round_sd(A, B, C)            \
> -    (__m128d)__builtin_ia32_divsd_round(A, B, C)
> +#define _mm_div_round_sd(A, B, C) \
> +  ((__m128d) \
> +   __builtin_ia32_divsd_mask_round ((__v2df) (__m128d) (A), \
> +    (__v2df) (__m128d) (B), \
> +    (__v2df) (__m128d) \
> +    _mm_undefined_pd (), \
> +    (__mmask8) (-1), \
> +    (int) (C)))
>
>  #define _mm_mask_div_round_sd(W, U, A, B, C) \
>      (__m128d)__builtin_ia32_divsd_mask_round(A, B, W, U, C)
> @@ -2918,8 +2984,14 @@ _mm_maskz_div_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>  #define _mm_maskz_div_round_sd(U, A, B, C)   \
>      (__m128d)__builtin_ia32_divsd_mask_round(A, B, (__v2df)_mm_setzero_pd(), U, C)
>
> -#define _mm_div_round_ss(A, B, C)            \
> -    (__m128)__builtin_ia32_divss_round(A, B, C)
> +#define _mm_div_round_ss(A, B, C) \
> +  ((__m128) \
> +   __builtin_ia32_divss_mask_round ((__v4sf) (__m128) (A), \
> +    (__v4sf) (__m128) (B), \
> +    (__v4sf) (__m128) \
> +    _mm_undefined_ps (), \
> +    (__mmask8) -1, \
> +    (int) (C)))
>
>  #define _mm_mask_div_round_ss(W, U, A, B, C) \
>      (__m128)__builtin_ia32_divss_mask_round(A, B, W, U, C)
> @@ -8703,9 +8775,12 @@ extern __inline __m128
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_getexp_round_ss (__m128 __A, __m128 __B, const int __R)
>  {
> -  return (__m128) __builtin_ia32_getexpss128_round ((__v4sf) __A,
> -    (__v4sf) __B,
> -    __R);
> +  return (__m128) __builtin_ia32_getexpss_mask_round ((__v4sf) __A,
> +      (__v4sf) __B,
> +      (__v4sf)
> +      _mm_undefined_ps (),
> +      (__mmask8) -1,
> +      __R);
>  }
>
>  extern __inline __m128
> @@ -8735,9 +8810,12 @@ extern __inline __m128d
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_getexp_round_sd (__m128d __A, __m128d __B, const int __R)
>  {
> -  return (__m128d) __builtin_ia32_getexpsd128_round ((__v2df) __A,
> -     (__v2df) __B,
> -     __R);
> +  return (__m128d) __builtin_ia32_getexpsd_mask_round ((__v2df) __A,
> +       (__v2df) __B,
> +       (__v2df)
> +       _mm_undefined_pd (),
> +       (__mmask8) -1,
> +       __R);
>  }
>
>  extern __inline __m128d
> @@ -8901,10 +8979,13 @@ _mm_getmant_round_sd (__m128d __A, __m128d __B,
>        _MM_MANTISSA_NORM_ENUM __C,
>        _MM_MANTISSA_SIGN_ENUM __D, const int __R)
>  {
> -  return (__m128d) __builtin_ia32_getmantsd_round ((__v2df) __A,
> -  (__v2df) __B,
> -  (__D << 2) | __C,
> -   __R);
> +  return (__m128d) __builtin_ia32_getmantsd_mask_round ((__v2df) __A,
> + (__v2df) __B,
> + (__D << 2) | __C,
> + (__v2df)
> + _mm_undefined_pd (),
> + (__mmask8) -1,
> + __R);
>  }
>
>  extern __inline __m128d
> @@ -8940,10 +9021,13 @@ _mm_getmant_round_ss (__m128 __A, __m128 __B,
>        _MM_MANTISSA_NORM_ENUM __C,
>        _MM_MANTISSA_SIGN_ENUM __D, const int __R)
>  {
> -  return (__m128) __builtin_ia32_getmantss_round ((__v4sf) __A,
> -  (__v4sf) __B,
> -  (__D << 2) | __C,
> -  __R);
> +  return (__m128) __builtin_ia32_getmantss_mask_round ((__v4sf) __A,
> +       (__v4sf) __B,
> +       (__D << 2) | __C,
> +       (__v4sf)
> +       _mm_undefined_ps (),
> +       (__mmask8) -1,
> +       __R);
>  }
>
>  extern __inline __m128
> @@ -9014,11 +9098,15 @@ _mm_maskz_getmant_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>                                               (__v16sf)(__m512)_mm512_setzero_ps(),  \
>                                               (__mmask16)(U),\
>       (R)))
> -#define _mm_getmant_round_sd(X, Y, C, D, R)                                                  \
> -  ((__m128d)__builtin_ia32_getmantsd_round ((__v2df)(__m128d)(X),                    \
> -    (__v2df)(__m128d)(Y), \
> -    (int)(((D)<<2) | (C)), \
> -    (R)))
> +#define _mm_getmant_round_sd(X, Y, C, D, R) \
> +  ((__m128d) \
> +   __builtin_ia32_getmantsd_mask_round ((__v2df) (__m128d) (X),\
> + (__v2df) (__m128d) (Y), \
> + (int) (((D)<<2) | (C)), \
> + (__v2df) (__m128d) \
> + _mm_undefined_pd (), \
> + (__mmask8) (-1), \
> + (int) (R)))
>
>  #define _mm_mask_getmant_round_sd(W, U, X, Y, C, D, R)                                       \
>    ((__m128d)__builtin_ia32_getmantsd_mask_round ((__v2df)(__m128d)(X),                  \
> @@ -9036,11 +9124,15 @@ _mm_maskz_getmant_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>                                               (__mmask8)(U),\
>       (R)))
>
> -#define _mm_getmant_round_ss(X, Y, C, D, R)                                                  \
> -  ((__m128)__builtin_ia32_getmantss_round ((__v4sf)(__m128)(X),                      \
> -   (__v4sf)(__m128)(Y), \
> -   (int)(((D)<<2) | (C)), \
> -   (R)))
> +#define _mm_getmant_round_ss(X, Y, C, D, R) \
> +  ((__m128) \
> +   __builtin_ia32_getmantss_mask_round ((__v4sf) (__m128) (X), \
> + (__v4sf) (__m128) (Y), \
> + (int) (((D)<<2) | (C)), \
> + (__v4sf) (__m128) \
> + _mm_undefined_ps (), \
> + (__mmask8) (-1), \
> + (int) (R)))
>
>  #define _mm_mask_getmant_round_ss(W, U, X, Y, C, D, R)                                       \
>    ((__m128)__builtin_ia32_getmantss_mask_round ((__v4sf)(__m128)(X),                  \
> @@ -9058,8 +9150,14 @@ _mm_maskz_getmant_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>                                               (__mmask8)(U),\
>       (R)))
>
> -#define _mm_getexp_round_ss(A, B, R)      \
> -  ((__m128)__builtin_ia32_getexpss128_round((__v4sf)(__m128)(A), (__v4sf)(__m128)(B), R))
> +#define _mm_getexp_round_ss(A, B, R) \
> +  ((__m128) \
> +   __builtin_ia32_getexpss_mask_round ((__v4sf) (__m128) (A), \
> +       (__v4sf) (__m128) (B), \
> +       (__v4sf) (__m128) \
> +       _mm_undefined_ps (), \
> +       (__mmask8) (-1), \
> +       (int) (R)))
>
>  #define _mm_mask_getexp_round_ss(W, U, A, B, C) \
>      (__m128)__builtin_ia32_getexpss_mask_round(A, B, W, U, C)
> @@ -9067,8 +9165,14 @@ _mm_maskz_getmant_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>  #define _mm_maskz_getexp_round_ss(U, A, B, C)   \
>      (__m128)__builtin_ia32_getexpss_mask_round(A, B, (__v4sf)_mm_setzero_ps(), U, C)
>
> -#define _mm_getexp_round_sd(A, B, R)       \
> -  ((__m128d)__builtin_ia32_getexpsd128_round((__v2df)(__m128d)(A), (__v2df)(__m128d)(B), R))
> +#define _mm_getexp_round_sd(A, B, R) \
> +  ((__m128d) \
> +   __builtin_ia32_getexpsd_mask_round ((__v2df) (__m128d) (A), \
> +       (__v2df) (__m128d) (B), \
> +       (__v2df) (__m128d) \
> +       _mm_undefined_pd (), \
> +       (__mmask8) (-1), \
> +       (int) (R)))
>
>  #define _mm_mask_getexp_round_sd(W, U, A, B, C) \
>      (__m128d)__builtin_ia32_getexpsd_mask_round(A, B, W, U, C)
> @@ -11392,9 +11496,12 @@ extern __inline __m128d
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_max_round_sd (__m128d __A, __m128d __B, const int __R)
>  {
> -  return (__m128d) __builtin_ia32_maxsd_round ((__v2df) __A,
> -       (__v2df) __B,
> -       __R);
> +  return (__m128d) __builtin_ia32_maxsd_mask_round ((__v2df) __A,
> +    (__v2df) __B,
> +    (__v2df)
> +    _mm_undefined_pd (),
> +    (__mmask8) -1,
> +    __R);
>  }
>
>  extern __inline __m128d
> @@ -11424,9 +11531,12 @@ extern __inline __m128
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_max_round_ss (__m128 __A, __m128 __B, const int __R)
>  {
> -  return (__m128) __builtin_ia32_maxss_round ((__v4sf) __A,
> -      (__v4sf) __B,
> -      __R);
> +  return (__m128) __builtin_ia32_maxss_mask_round ((__v4sf) __A,
> +   (__v4sf) __B,
> +   (__v4sf)
> +   _mm_undefined_ps (),
> +   (__mmask8) -1,
> +   __R);
>  }
>
>  extern __inline __m128
> @@ -11456,9 +11566,12 @@ extern __inline __m128d
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_min_round_sd (__m128d __A, __m128d __B, const int __R)
>  {
> -  return (__m128d) __builtin_ia32_minsd_round ((__v2df) __A,
> -       (__v2df) __B,
> -       __R);
> +  return (__m128d) __builtin_ia32_minsd_mask_round ((__v2df) __A,
> +    (__v2df) __B,
> +    (__v2df)
> +    _mm_undefined_pd (),
> +    (__mmask8) -1,
> +    __R);
>  }
>
>  extern __inline __m128d
> @@ -11488,9 +11601,12 @@ extern __inline __m128
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_min_round_ss (__m128 __A, __m128 __B, const int __R)
>  {
> -  return (__m128) __builtin_ia32_minss_round ((__v4sf) __A,
> -      (__v4sf) __B,
> -      __R);
> +  return (__m128) __builtin_ia32_minss_mask_round ((__v4sf) __A,
> +   (__v4sf) __B,
> +   (__v4sf)
> +   _mm_undefined_ps (),
> +   (__mmask8) -1,
> +   __R);
>  }
>
>  extern __inline __m128
> @@ -11517,8 +11633,14 @@ _mm_maskz_min_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>  }
>
>  #else
> -#define _mm_max_round_sd(A, B, C)            \
> -    (__m128d)__builtin_ia32_maxsd_round(A, B, C)
> +#define _mm_max_round_sd(A, B, C) \
> +  ((__m128d) \
> +   __builtin_ia32_maxsd_mask_round((__v2df) (__m128d) (A), \
> +   (__v2df) (__m128d) (B), \
> +   (__v2df) (__m128d) \
> +   _mm_undefined_pd (), \
> +   (__mmask8) (-1), \
> +   (int) (C)))
>
>  #define _mm_mask_max_round_sd(W, U, A, B, C) \
>      (__m128d)__builtin_ia32_maxsd_mask_round(A, B, W, U, C)
> @@ -11526,8 +11648,14 @@ _mm_maskz_min_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>  #define _mm_maskz_max_round_sd(U, A, B, C)   \
>      (__m128d)__builtin_ia32_maxsd_mask_round(A, B, (__v2df)_mm_setzero_pd(), U, C)
>
> -#define _mm_max_round_ss(A, B, C)            \
> -    (__m128)__builtin_ia32_maxss_round(A, B, C)
> +#define _mm_max_round_ss(A, B, C) \
> +  ((__m128) \
> +   __builtin_ia32_maxss_mask_round ((__v4sf) (__m128) (A), \
> +    (__v4sf) (__m128) (B), \
> +    (__v4sf) (__m128) \
> +    _mm_undefined_ps (), \
> +    (__mmask8) (-1), \
> +    (int)(C)))
>
>  #define _mm_mask_max_round_ss(W, U, A, B, C) \
>      (__m128)__builtin_ia32_maxss_mask_round(A, B, W, U, C)
> @@ -11535,8 +11663,14 @@ _mm_maskz_min_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>  #define _mm_maskz_max_round_ss(U, A, B, C)   \
>      (__m128)__builtin_ia32_maxss_mask_round(A, B, (__v4sf)_mm_setzero_ps(), U, C)
>
> -#define _mm_min_round_sd(A, B, C)            \
> -    (__m128d)__builtin_ia32_minsd_round(A, B, C)
> +#define _mm_min_round_sd(A, B, C) \
> +  ((__m128d) \
> +   __builtin_ia32_minsd_mask_round ((__v2df) (__m128d) (A), \
> +    (__v2df) (__m128d) (B), \
> +    (__v2df) (__m128d) \
> +    _mm_undefined_pd (), \
> +    (__mmask8) (-1), \
> +    (int) (C)))
>
>  #define _mm_mask_min_round_sd(W, U, A, B, C) \
>      (__m128d)__builtin_ia32_minsd_mask_round(A, B, W, U, C)
> @@ -11544,8 +11678,14 @@ _mm_maskz_min_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
>  #define _mm_maskz_min_round_sd(U, A, B, C)   \
>      (__m128d)__builtin_ia32_minsd_mask_round(A, B, (__v2df)_mm_setzero_pd(), U, C)
>
> -#define _mm_min_round_ss(A, B, C)            \
> -    (__m128)__builtin_ia32_minss_round(A, B, C)
> +#define _mm_min_round_ss(A, B, C) \
> +  ((__m128) \
> +   __builtin_ia32_minss_mask_round ((__v4sf) (__m128) (A), \
> +    (__v4sf) (__m128) (B), \
> +    (__v4sf) (__m128) \
> +    _mm_undefined_ps (), \
> +    (__mmask8) (-1), \
> +    (int) (C)))
>
>  #define _mm_mask_min_round_ss(W, U, A, B, C) \
>      (__m128)__builtin_ia32_minss_mask_round(A, B, W, U, C)
> @@ -11596,105 +11736,153 @@ extern __inline __m128d
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_fmadd_round_sd (__m128d __W, __m128d __A, __m128d __B, const int __R)
>  {
> -  return (__m128d) __builtin_ia32_vfmaddsd3_round ((__v2df) __W,
> -   (__v2df) __A,
> -   (__v2df) __B,
> -   __R);
> +  return (__m128d) __builtin_ia32_vfmaddsd3_mask ((__v2df) __W,
> +  (__v2df) __A,
> +  (__v2df) __B,
> +  (__mmask8) -1,
> +  __R);
>  }
>
>  extern __inline __m128
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_fmadd_round_ss (__m128 __W, __m128 __A, __m128 __B, const int __R)
>  {
> -  return (__m128) __builtin_ia32_vfmaddss3_round ((__v4sf) __W,
> -  (__v4sf) __A,
> -  (__v4sf) __B,
> -  __R);
> +  return (__m128) __builtin_ia32_vfmaddss3_mask ((__v4sf) __W,
> + (__v4sf) __A,
> + (__v4sf) __B,
> + (__mmask8) -1,
> + __R);
>  }
>
>  extern __inline __m128d
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_fmsub_round_sd (__m128d __W, __m128d __A, __m128d __B, const int __R)
>  {
> -  return (__m128d) __builtin_ia32_vfmaddsd3_round ((__v2df) __W,
> -   (__v2df) __A,
> -   -(__v2df) __B,
> -   __R);
> -}
> +  return (__m128d) __builtin_ia32_vfmaddsd3_mask ((__v2df) __W,
> +  (__v2df) __A,
> +  -(__v2df) __B,
> +  (__mmask8) -1,
> +  __R);
> + }
>
>  extern __inline __m128
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_fmsub_round_ss (__m128 __W, __m128 __A, __m128 __B, const int __R)
>  {
> -  return (__m128) __builtin_ia32_vfmaddss3_round ((__v4sf) __W,
> -  (__v4sf) __A,
> -  -(__v4sf) __B,
> -  __R);
> +  return (__m128) __builtin_ia32_vfmaddss3_mask ((__v4sf) __W,
> + (__v4sf) __A,
> + -(__v4sf) __B,
> + (__mmask8) -1,
> + __R);
>  }
>
>  extern __inline __m128d
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_fnmadd_round_sd (__m128d __W, __m128d __A, __m128d __B, const int __R)
>  {
> -  return (__m128d) __builtin_ia32_vfmaddsd3_round ((__v2df) __W,
> -   -(__v2df) __A,
> -   (__v2df) __B,
> -   __R);
> +  return (__m128d) __builtin_ia32_vfmaddsd3_mask ((__v2df) __W,
> +  -(__v2df) __A,
> +  (__v2df) __B,
> +  (__mmask8) -1,
> +  __R);
>  }
>
>  extern __inline __m128
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_fnmadd_round_ss (__m128 __W, __m128 __A, __m128 __B, const int __R)
>  {
> -  return (__m128) __builtin_ia32_vfmaddss3_round ((__v4sf) __W,
> -  -(__v4sf) __A,
> -  (__v4sf) __B,
> -  __R);
> -}
> +  return (__m128) __builtin_ia32_vfmaddss3_mask ((__v4sf) __W,
> + -(__v4sf) __A,
> + (__v4sf) __B,
> + (__mmask8) -1,
> + __R);
> + }
>
>  extern __inline __m128d
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_fnmsub_round_sd (__m128d __W, __m128d __A, __m128d __B, const int __R)
>  {
> -  return (__m128d) __builtin_ia32_vfmaddsd3_round ((__v2df) __W,
> -   -(__v2df) __A,
> -   -(__v2df) __B,
> -   __R);
> -}
> +  return (__m128d) __builtin_ia32_vfmaddsd3_mask ((__v2df) __W,
> +  -(__v2df) __A,
> +  -(__v2df) __B,
> +  (__mmask8) -1,
> +  __R);
> + }
>
>  extern __inline __m128
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_fnmsub_round_ss (__m128 __W, __m128 __A, __m128 __B, const int __R)
>  {
> -  return (__m128) __builtin_ia32_vfmaddss3_round ((__v4sf) __W,
> -  -(__v4sf) __A,
> -  -(__v4sf) __B,
> -  __R);
> +  return (__m128) __builtin_ia32_vfmaddss3_mask ((__v4sf) __W,
> + -(__v4sf) __A,
> + -(__v4sf) __B,
> + (__mmask8) -1,
> + __R);
>  }
>  #else
> -#define _mm_fmadd_round_sd(A, B, C, R)            \
> -    (__m128d)__builtin_ia32_vfmaddsd3_round(A, B, C, R)
> -
> -#define _mm_fmadd_round_ss(A, B, C, R)            \
> -    (__m128)__builtin_ia32_vfmaddss3_round(A, B, C, R)
> -
> -#define _mm_fmsub_round_sd(A, B, C, R)            \
> -    (__m128d)__builtin_ia32_vfmaddsd3_round(A, B, -(C), R)
> -
> -#define _mm_fmsub_round_ss(A, B, C, R)            \
> -    (__m128)__builtin_ia32_vfmaddss3_round(A, B, -(C), R)
> -
> -#define _mm_fnmadd_round_sd(A, B, C, R)            \
> -    (__m128d)__builtin_ia32_vfmaddsd3_round(A, -(B), C, R)
> -
> -#define _mm_fnmadd_round_ss(A, B, C, R)            \
> -   (__m128)__builtin_ia32_vfmaddss3_round(A, -(B), C, R)
> -
> -#define _mm_fnmsub_round_sd(A, B, C, R)            \
> -    (__m128d)__builtin_ia32_vfmaddsd3_round(A, -(B), -(C), R)
> -
> -#define _mm_fnmsub_round_ss(A, B, C, R)            \
> -    (__m128)__builtin_ia32_vfmaddss3_round(A, -(B), -(C), R)
> +#define _mm_fmadd_round_sd(A, B, C, R) \
> +  ((__m128d) \
> +   __builtin_ia32_vfmaddsd3_mask ((__v2df) (__m128d) (A), \
> +  (__v2df) (__m128d) (B), \
> +  (__v2df) (__m128d) (C), \
> +  (__mmask8) (-1), \
> +  (int) (R)))
> +
> +#define _mm_fmadd_round_ss(A, B, C, R) \
> +  ((__m128) \
> +   __builtin_ia32_vfmaddss3_mask ((__v4sf) (__m128) (A), \
> +  (__v4sf) (__m128) (B), \
> +  (__v4sf) (__m128) (C), \
> +  (__mmask8) (-1), \
> +  (int) (R)))
> +
> +#define _mm_fmsub_round_sd(A, B, C, R) \
> +  ((__m128d) \
> +   __builtin_ia32_vfmaddsd3_mask ((__v2df) (__m128d) (A), \
> +  (__v2df) (__m128d) (B), \
> +  (__v2df) (__m128d) (-(C)), \
> +  (__mmask8) (-1), \
> +  (int) (R)))
> +
> +#define _mm_fmsub_round_ss(A, B, C, R) \
> +  ((__m128) \
> +   __builtin_ia32_vfmaddss3_mask ((__v4sf) (__m128) (A), \
> +  (__v4sf) (__m128) (B), \
> +  (__v4sf) (__m128) (-(C)), \
> +  (__mmask8) (-1), \
> +  (int) (R)))
> +
> +#define _mm_fnmadd_round_sd(A, B, C, R) \
> +  ((__m128d) \
> +   __builtin_ia32_vfmaddsd3_mask ((__v2df) (__m128d) (A), \
> +  (__v2df) (__m128d) (-(B)), \
> +  (__v2df) (__m128d) (C), \
> +  (__mmask8) (-1), \
> +  (int) (R)))
> +
> +#define _mm_fnmadd_round_ss(A, B, C, R) \
> +  ((__m128) \
> +   __builtin_ia32_vfmaddss3_mask ((__v4sf) (__m128) (A), \
> +  (__v4sf) (__m128) (-(B)), \
> +  (__v4sf) (__m128) (C), \
> +  (__mmask8) (-1), \
> +  (int) (R)))
> +
> +#define _mm_fnmsub_round_sd(A, B, C, R) \
> +  ((__m128d) \
> +   __builtin_ia32_vfmaddsd3_mask ((__v2df) (__m128d) (A), \
> +  (__v2df) (__m128d) (-(B)), \
> +  (__v2df) (__m128d) (-(C)), \
> +  (__mmask8) (-1), \
> +  (int) (R)))
> +
> +#define _mm_fnmsub_round_ss(A, B, C, R) \
> +  ((__m128) \
> +   __builtin_ia32_vfmaddss3_mask ((__v4sf) (__m128) (A), \
> +  (__v4sf) (__m128) (-(B)), \
> +  (__v4sf) (__m128) (-(C)), \
> +  (__mmask8) (-1), \
> +  (int) (R)))
>  #endif
>
>  extern __inline __m128d
> @@ -14504,20 +14692,24 @@ extern __inline __m128
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_getexp_ss (__m128 __A, __m128 __B)
>  {
> -  return (__m128) __builtin_ia32_getexpss128_round ((__v4sf) __A,
> -    (__v4sf) __B,
> -    _MM_FROUND_CUR_DIRECTION);
> +  return (__m128) __builtin_ia32_getexpss_mask_round ((__v4sf) __A,
> +      (__v4sf) __B,
> +      (__v4sf)
> +      _mm_undefined_ps (),
> +      (__mmask8) -1,
> +      _MM_FROUND_CUR_DIRECTION);
>  }
>
>  extern __inline __m128
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_mask_getexp_ss (__m128 __W, __mmask8 __U, __m128 __A, __m128 __B)
>  {
> -  return (__m128) __builtin_ia32_getexpss_mask_round ((__v4sf) __A,
> - (__v4sf) __B,
> - (__v4sf) __W,
> - (__mmask8) __U,
> - _MM_FROUND_CUR_DIRECTION);
> +  return (__m128)
> +    __builtin_ia32_getexpss_mask_round ((__v4sf) __A,
> + (__v4sf) __B,
> + (__v4sf) __W,
> + (__mmask8) __U,
> + _MM_FROUND_CUR_DIRECTION);
>  }
>
>  extern __inline __m128
> @@ -14536,9 +14728,13 @@ extern __inline __m128d
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_getexp_sd (__m128d __A, __m128d __B)
>  {
> -  return (__m128d) __builtin_ia32_getexpsd128_round ((__v2df) __A,
> -     (__v2df) __B,
> -     _MM_FROUND_CUR_DIRECTION);
> +  return (__m128d)
> +    __builtin_ia32_getexpsd_mask_round ((__v2df) __A,
> + (__v2df) __B,
> + (__v2df)
> + _mm_undefined_pd (),
> + (__mmask8) -1,
> + _MM_FROUND_CUR_DIRECTION);
>  }
>
>  extern __inline __m128d
> @@ -14641,10 +14837,14 @@ __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_getmant_sd (__m128d __A, __m128d __B, _MM_MANTISSA_NORM_ENUM __C,
>   _MM_MANTISSA_SIGN_ENUM __D)
>  {
> -  return (__m128d) __builtin_ia32_getmantsd_round ((__v2df) __A,
> -   (__v2df) __B,
> -   (__D << 2) | __C,
> -   _MM_FROUND_CUR_DIRECTION);
> +  return (__m128d)
> +    __builtin_ia32_getmantsd_mask_round ((__v2df) __A,
> + (__v2df) __B,
> + (__D << 2) | __C,
> + (__v2df)
> + _mm_undefined_pd (),
> + (__mmask8) -1,
> + _MM_FROUND_CUR_DIRECTION);
>  }
>
>  extern __inline __m128d
> @@ -14679,10 +14879,14 @@ __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_getmant_ss (__m128 __A, __m128 __B, _MM_MANTISSA_NORM_ENUM __C,
>   _MM_MANTISSA_SIGN_ENUM __D)
>  {
> -  return (__m128) __builtin_ia32_getmantss_round ((__v4sf) __A,
> -  (__v4sf) __B,
> -  (__D << 2) | __C,
> -  _MM_FROUND_CUR_DIRECTION);
> +  return (__m128)
> +    __builtin_ia32_getmantss_mask_round ((__v4sf) __A,
> + (__v4sf) __B,
> + (__D << 2) | __C,
> + (__v4sf)
> + _mm_undefined_ps (),
> + (__mmask8) -1,
> + _MM_FROUND_CUR_DIRECTION);
>  }
>
>  extern __inline __m128
> @@ -14753,11 +14957,15 @@ _mm_maskz_getmant_ss (__mmask8 __U, __m128 __A, __m128 __B,
>                                               (__v16sf)_mm512_setzero_ps(),          \
>                                               (__mmask16)(U),\
>       _MM_FROUND_CUR_DIRECTION))
> -#define _mm_getmant_sd(X, Y, C, D)                                                  \
> -  ((__m128d)__builtin_ia32_getmantsd_round ((__v2df)(__m128d)(X),                    \
> -                                           (__v2df)(__m128d)(Y),                    \
> -                                           (int)(((D)<<2) | (C)),                   \
> -   _MM_FROUND_CUR_DIRECTION))
> +#define _mm_getmant_sd(X, Y, C, D) \
> +  ((__m128d) \
> +   __builtin_ia32_getmantsd_mask_round ((__v2df) (__m128d) (X), \
> + (__v2df) (__m128d) (Y), \
> + (int) (((D)<<2) | (C)), \
> + (__v2df) (__m128d) \
> + _mm_undefined_pd (), \
> + (__mmask8) (-1), \
> + _MM_FROUND_CUR_DIRECTION))
>
>  #define _mm_mask_getmant_sd(W, U, X, Y, C, D)                                       \
>    ((__m128d)__builtin_ia32_getmantsd_mask_round ((__v2df)(__m128d)(X),                 \
> @@ -14775,11 +14983,15 @@ _mm_maskz_getmant_ss (__mmask8 __U, __m128 __A, __m128 __B,
>                                                (__mmask8)(U),\
>        _MM_FROUND_CUR_DIRECTION))
>
> -#define _mm_getmant_ss(X, Y, C, D)                                                  \
> -  ((__m128)__builtin_ia32_getmantss_round ((__v4sf)(__m128)(X),                      \
> -                                          (__v4sf)(__m128)(Y),                      \
> -                                          (int)(((D)<<2) | (C)),                    \
> -  _MM_FROUND_CUR_DIRECTION))
> +#define _mm_getmant_ss(X, Y, C, D) \
> +  ((__m128) \
> +   __builtin_ia32_getmantss_mask_round ((__v4sf) (__m128) (X), \
> + (__v4sf) (__m128) (Y), \
> + (int) (((D)<<2) | (C)), \
> + (__v4sf) (__m128) \
> + _mm_undefined_ps (), \
> + (__mmask8) (-1), \
> + _MM_FROUND_CUR_DIRECTION))
>
>  #define _mm_mask_getmant_ss(W, U, X, Y, C, D)                                       \
>    ((__m128)__builtin_ia32_getmantss_mask_round ((__v4sf)(__m128)(X),                 \
> @@ -14797,9 +15009,14 @@ _mm_maskz_getmant_ss (__mmask8 __U, __m128 __A, __m128 __B,
>                                                (__mmask8)(U),\
>        _MM_FROUND_CUR_DIRECTION))
>
> -#define _mm_getexp_ss(A, B)      \
> -  ((__m128)__builtin_ia32_getexpss128_round((__v4sf)(__m128)(A), (__v4sf)(__m128)(B),  \
> -   _MM_FROUND_CUR_DIRECTION))
> +#define _mm_getexp_ss(A, B) \
> +  ((__m128) \
> +   __builtin_ia32_getexpss_mask_round ((__v4sf) (__m128) (A), \
> +       (__v4sf) (__m128) (B), \
> +       (__v4sf) (__m128) \
> +       _mm_undefined_ps (), \
> +       (__mmask8) (-1), \
> +       _MM_FROUND_CUR_DIRECTION))
>
>  #define _mm_mask_getexp_ss(W, U, A, B) \
>      (__m128)__builtin_ia32_getexpss_mask_round(A, B, W, U,\
> @@ -14809,9 +15026,14 @@ _mm_maskz_getmant_ss (__mmask8 __U, __m128 __A, __m128 __B,
>      (__m128)__builtin_ia32_getexpss_mask_round(A, B, (__v4sf)_mm_setzero_ps(), U,\
>        _MM_FROUND_CUR_DIRECTION)
>
> -#define _mm_getexp_sd(A, B)       \
> -  ((__m128d)__builtin_ia32_getexpsd128_round((__v2df)(__m128d)(A), (__v2df)(__m128d)(B),\
> -    _MM_FROUND_CUR_DIRECTION))
> +#define _mm_getexp_sd(A, B) \
> +  ((__m128d) \
> +   __builtin_ia32_getexpsd_mask_round ((__v2df) (__m128d) (A), \
> +       (__v2df) (__m128d)(B), \
> +       (__v2df) (__m128d) \
> +       _mm_undefined_pd (), \
> +       (__mmask8) (-1), \
> +       _MM_FROUND_CUR_DIRECTION))
>
>  #define _mm_mask_getexp_sd(W, U, A, B) \
>      (__m128d)__builtin_ia32_getexpsd_mask_round(A, B, W, U,\
> diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
> index a6500f9d9b5..c2039e9d112 100644
> --- a/gcc/config/i386/i386-builtin.def
> +++ b/gcc/config/i386/i386-builtin.def
> @@ -2751,9 +2751,7 @@ BDESC_END (ARGS, ROUND_ARGS)
>  BDESC_FIRST (round_args, ROUND_ARGS,
>         OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_addv8df3_mask_round, "__builtin_ia32_addpd512_mask", IX86_BUILTIN_ADDPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_addv16sf3_mask_round, "__builtin_ia32_addps512_mask", IX86_BUILTIN_ADDPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmaddv2df3_round, "__builtin_ia32_addsd_round", IX86_BUILTIN_ADDSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmaddv2df3_mask_round, "__builtin_ia32_addsd_mask_round", IX86_BUILTIN_ADDSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmaddv4sf3_round, "__builtin_ia32_addss_round", IX86_BUILTIN_ADDSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmaddv4sf3_mask_round, "__builtin_ia32_addss_mask_round", IX86_BUILTIN_ADDSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_cmpv8df3_mask_round, "__builtin_ia32_cmppd512_mask", IX86_BUILTIN_CMPPD512, UNKNOWN, (int) UQI_FTYPE_V8DF_V8DF_INT_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_cmpv16sf3_mask_round, "__builtin_ia32_cmpps512_mask", IX86_BUILTIN_CMPPS512, UNKNOWN, (int) UHI_FTYPE_V16SF_V16SF_INT_UHI_INT)
> @@ -2784,9 +2782,7 @@ BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_cvtusi2ss32_round, "__builtin_ia32_c
>  BDESC (OPTION_MASK_ISA_AVX512F | OPTION_MASK_ISA_64BIT, 0, CODE_FOR_cvtusi2ss64_round, "__builtin_ia32_cvtusi2ss64", IX86_BUILTIN_CVTUSI2SS64, UNKNOWN, (int) V4SF_FTYPE_V4SF_UINT64_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_divv8df3_mask_round, "__builtin_ia32_divpd512_mask", IX86_BUILTIN_DIVPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_divv16sf3_mask_round, "__builtin_ia32_divps512_mask", IX86_BUILTIN_DIVPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmdivv2df3_round, "__builtin_ia32_divsd_round", IX86_BUILTIN_DIVSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmdivv2df3_mask_round, "__builtin_ia32_divsd_mask_round", IX86_BUILTIN_DIVSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmdivv4sf3_round, "__builtin_ia32_divss_round", IX86_BUILTIN_DIVSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmdivv4sf3_mask_round, "__builtin_ia32_divss_mask_round", IX86_BUILTIN_DIVSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_fixupimmv8df_mask_round, "__builtin_ia32_fixupimmpd512_mask", IX86_BUILTIN_FIXUPIMMPD512_MASK, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DI_INT_QI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_fixupimmv8df_maskz_round, "__builtin_ia32_fixupimmpd512_maskz", IX86_BUILTIN_FIXUPIMMPD512_MASKZ, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DI_INT_QI_INT)
> @@ -2798,33 +2794,23 @@ BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_sfixupimmv4sf_mask_round, "_
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_sfixupimmv4sf_maskz_round, "__builtin_ia32_fixupimmss_maskz", IX86_BUILTIN_FIXUPIMMSS128_MASKZ, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SI_INT_QI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_getexpv8df_mask_round, "__builtin_ia32_getexppd512_mask", IX86_BUILTIN_GETEXPPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_QI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_getexpv16sf_mask_round, "__builtin_ia32_getexpps512_mask", IX86_BUILTIN_GETEXPPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_HI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_sgetexpv2df_round, "__builtin_ia32_getexpsd128_round", IX86_BUILTIN_GETEXPSD128, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_sgetexpv2df_mask_round, "__builtin_ia32_getexpsd_mask_round", IX86_BUILTIN_GETEXPSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_sgetexpv4sf_round, "__builtin_ia32_getexpss128_round", IX86_BUILTIN_GETEXPSS128, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_sgetexpv4sf_mask_round, "__builtin_ia32_getexpss_mask_round", IX86_BUILTIN_GETEXPSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_getmantv8df_mask_round, "__builtin_ia32_getmantpd512_mask", IX86_BUILTIN_GETMANTPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_INT_V8DF_QI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_getmantv16sf_mask_round, "__builtin_ia32_getmantps512_mask", IX86_BUILTIN_GETMANTPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_INT_V16SF_HI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vgetmantv2df_round, "__builtin_ia32_getmantsd_round", IX86_BUILTIN_GETMANTSD128, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vgetmantv2df_mask_round, "__builtin_ia32_getmantsd_mask_round", IX86_BUILTIN_GETMANTSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT_V2DF_UQI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vgetmantv4sf_round, "__builtin_ia32_getmantss_round", IX86_BUILTIN_GETMANTSS128, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vgetmantv4sf_mask_round, "__builtin_ia32_getmantss_mask_round", IX86_BUILTIN_GETMANTSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT_V4SF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_smaxv8df3_mask_round, "__builtin_ia32_maxpd512_mask", IX86_BUILTIN_MAXPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_smaxv16sf3_mask_round, "__builtin_ia32_maxps512_mask", IX86_BUILTIN_MAXPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsmaxv2df3_round, "__builtin_ia32_maxsd_round", IX86_BUILTIN_MAXSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsmaxv2df3_mask_round, "__builtin_ia32_maxsd_mask_round", IX86_BUILTIN_MAXSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsmaxv4sf3_round, "__builtin_ia32_maxss_round", IX86_BUILTIN_MAXSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsmaxv4sf3_mask_round, "__builtin_ia32_maxss_mask_round", IX86_BUILTIN_MAXSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sminv8df3_mask_round, "__builtin_ia32_minpd512_mask", IX86_BUILTIN_MINPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sminv16sf3_mask_round, "__builtin_ia32_minps512_mask", IX86_BUILTIN_MINPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsminv2df3_round, "__builtin_ia32_minsd_round", IX86_BUILTIN_MINSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsminv2df3_mask_round, "__builtin_ia32_minsd_mask_round", IX86_BUILTIN_MINSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsminv4sf3_round, "__builtin_ia32_minss_round", IX86_BUILTIN_MINSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsminv4sf3_mask_round, "__builtin_ia32_minss_mask_round", IX86_BUILTIN_MINSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_mulv8df3_mask_round, "__builtin_ia32_mulpd512_mask", IX86_BUILTIN_MULPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_mulv16sf3_mask_round, "__builtin_ia32_mulps512_mask", IX86_BUILTIN_MULPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmmulv2df3_round, "__builtin_ia32_mulsd_round", IX86_BUILTIN_MULSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmmulv2df3_mask_round, "__builtin_ia32_mulsd_mask_round", IX86_BUILTIN_MULSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmmulv4sf3_round, "__builtin_ia32_mulss_round", IX86_BUILTIN_MULSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmmulv4sf3_mask_round, "__builtin_ia32_mulss_mask_round", IX86_BUILTIN_MULSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_rndscalev8df_mask_round, "__builtin_ia32_rndscalepd_mask", IX86_BUILTIN_RNDSCALEPD, UNKNOWN, (int) V8DF_FTYPE_V8DF_INT_V8DF_QI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_rndscalev16sf_mask_round, "__builtin_ia32_rndscaleps_mask", IX86_BUILTIN_RNDSCALEPS, UNKNOWN, (int) V16SF_FTYPE_V16SF_INT_V16SF_HI_INT)
> @@ -2840,9 +2826,7 @@ BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsqrtv2df2_mask_round, "__buil
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsqrtv4sf2_mask_round, "__builtin_ia32_sqrtss_mask_round", IX86_BUILTIN_SQRTSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_subv8df3_mask_round, "__builtin_ia32_subpd512_mask", IX86_BUILTIN_SUBPD512, UNKNOWN, (int) V8DF_FTYPE_V8DF_V8DF_V8DF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_subv16sf3_mask_round, "__builtin_ia32_subps512_mask", IX86_BUILTIN_SUBPS512, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsubv2df3_round, "__builtin_ia32_subsd_round", IX86_BUILTIN_SUBSD_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_vmsubv2df3_mask_round, "__builtin_ia32_subsd_mask_round", IX86_BUILTIN_SUBSD_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsubv4sf3_round, "__builtin_ia32_subss_round", IX86_BUILTIN_SUBSS_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse_vmsubv4sf3_mask_round, "__builtin_ia32_subss_mask_round", IX86_BUILTIN_SUBSS_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_sse2_cvtsd2si_round, "__builtin_ia32_vcvtsd2si32", IX86_BUILTIN_VCVTSD2SI32, UNKNOWN, (int) INT_FTYPE_V2DF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F | OPTION_MASK_ISA_64BIT, 0, CODE_FOR_sse2_cvtsd2siq_round, "__builtin_ia32_vcvtsd2si64", IX86_BUILTIN_VCVTSD2SI64, UNKNOWN, (int) INT64_FTYPE_V2DF_INT)
> @@ -2866,8 +2850,6 @@ BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_fmadd_v8df_maskz_round, "__b
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_fmadd_v16sf_mask_round, "__builtin_ia32_vfmaddps512_mask", IX86_BUILTIN_VFMADDPS512_MASK, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_fmadd_v16sf_mask3_round, "__builtin_ia32_vfmaddps512_mask3", IX86_BUILTIN_VFMADDPS512_MASK3, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_fmadd_v16sf_maskz_round, "__builtin_ia32_vfmaddps512_maskz", IX86_BUILTIN_VFMADDPS512_MASKZ, UNKNOWN, (int) V16SF_FTYPE_V16SF_V16SF_V16SF_HI_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_fmai_vmfmadd_v2df_round, "__builtin_ia32_vfmaddsd3_round", IX86_BUILTIN_VFMADDSD3_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_INT)
> -BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_fmai_vmfmadd_v4sf_round, "__builtin_ia32_vfmaddss3_round", IX86_BUILTIN_VFMADDSS3_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vmfmadd_v2df_mask_round, "__builtin_ia32_vfmaddsd3_mask", IX86_BUILTIN_VFMADDSD3_MASK, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vmfmadd_v2df_mask3_round, "__builtin_ia32_vfmaddsd3_mask3", IX86_BUILTIN_VFMADDSD3_MASK3, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
>  BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_vmfmadd_v2df_maskz_round, "__builtin_ia32_vfmaddsd3_maskz", IX86_BUILTIN_VFMADDSD3_MASKZ, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT)
> diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
> index cbf4eb7b487..66bf9be5bd4 100644
> --- a/gcc/config/i386/i386-expand.c
> +++ b/gcc/config/i386/i386-expand.c
> @@ -10193,13 +10193,6 @@ ix86_expand_round_builtin (const struct builtin_description *d,
>      case V16SI_FTYPE_V16SF_V16SI_HI_INT:
>      case V8DF_FTYPE_V8SF_V8DF_QI_INT:
>      case V16SF_FTYPE_V16HI_V16SF_HI_INT:
> -    case V2DF_FTYPE_V2DF_V2DF_V2DF_INT:
> -    case V4SF_FTYPE_V4SF_V4SF_V4SF_INT:
> -      nargs = 4;
> -      break;
> -    case V4SF_FTYPE_V4SF_V4SF_INT_INT:
> -    case V2DF_FTYPE_V2DF_V2DF_INT_INT:
> -      nargs_constant = 2;
>        nargs = 4;
>        break;
>      case INT_FTYPE_V4SF_V4SF_INT_INT:
> diff --git a/gcc/testsuite/gcc.target/i386/avx-1.c b/gcc/testsuite/gcc.target/i386/avx-1.c
> index 3600a7abe91..0e00bfbbb5e 100644
> --- a/gcc/testsuite/gcc.target/i386/avx-1.c
> +++ b/gcc/testsuite/gcc.target/i386/avx-1.c
> @@ -172,9 +172,7 @@
>  #define __builtin_ia32_kshiftrihi(A, B) __builtin_ia32_kshiftrihi(A, 8)
>  #define __builtin_ia32_addpd512_mask(A, B, C, D, E) __builtin_ia32_addpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_addps512_mask(A, B, C, D, E) __builtin_ia32_addps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_addsd_round(A, B, C) __builtin_ia32_addsd_round(A, B, 8)
>  #define __builtin_ia32_addsd_mask_round(A, B, C, D, E) __builtin_ia32_addsd_mask_round(A, B, C, D, 8)
> -#define __builtin_ia32_addss_round(A, B, C) __builtin_ia32_addss_round(A, B, 8)
>  #define __builtin_ia32_addss_mask_round(A, B, C, D, E) __builtin_ia32_addss_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_alignd512_mask(A, B, F, D, E) __builtin_ia32_alignd512_mask(A, B, 1, D, E)
>  #define __builtin_ia32_alignq512_mask(A, B, F, D, E) __builtin_ia32_alignq512_mask(A, B, 1, D, E)
> @@ -206,9 +204,7 @@
>  #define __builtin_ia32_cvtusi2ss64(A, B, C) __builtin_ia32_cvtusi2ss64(A, B, 8)
>  #define __builtin_ia32_divpd512_mask(A, B, C, D, E) __builtin_ia32_divpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_divps512_mask(A, B, C, D, E) __builtin_ia32_divps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_divsd_round(A, B, C) __builtin_ia32_divsd_round(A, B, 8)
>  #define __builtin_ia32_divsd_mask_round(A, B, C, D, E) __builtin_ia32_divsd_mask_round(A, B, C, D, 8)
> -#define __builtin_ia32_divss_round(A, B, C) __builtin_ia32_divss_round(A, B, 8)
>  #define __builtin_ia32_divss_mask_round(A, B, C, D, E) __builtin_ia32_divss_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_extractf32x4_mask(A, E, C, D) __builtin_ia32_extractf32x4_mask(A, 1, C, D)
>  #define __builtin_ia32_extractf64x4_mask(A, E, C, D) __builtin_ia32_extractf64x4_mask(A, 1, C, D)
> @@ -232,15 +228,11 @@
>  #define __builtin_ia32_gathersiv8di(A, B, C, D, F) __builtin_ia32_gathersiv8di(A, B, C, D, 8)
>  #define __builtin_ia32_getexppd512_mask(A, B, C, D) __builtin_ia32_getexppd512_mask(A, B, C, 8)
>  #define __builtin_ia32_getexpps512_mask(A, B, C, D) __builtin_ia32_getexpps512_mask(A, B, C, 8)
> -#define __builtin_ia32_getexpsd128_round(A, B, C) __builtin_ia32_getexpsd128_round(A, B, 4)
>  #define __builtin_ia32_getexpsd_mask_round(A, B, C, D, E) __builtin_ia32_getexpsd_mask_round(A, B, C, D, 4)
> -#define __builtin_ia32_getexpss128_round(A, B, C) __builtin_ia32_getexpss128_round(A, B, 4)
>  #define __builtin_ia32_getexpss_mask_round(A, B, C, D, E) __builtin_ia32_getexpss_mask_round(A, B, C, D, 4)
>  #define __builtin_ia32_getmantpd512_mask(A, F, C, D, E) __builtin_ia32_getmantpd512_mask(A, 1, C, D, 8)
>  #define __builtin_ia32_getmantps512_mask(A, F, C, D, E) __builtin_ia32_getmantps512_mask(A, 1, C, D, 8)
> -#define __builtin_ia32_getmantsd_round(A, B, C, D) __builtin_ia32_getmantsd_round(A, B, 1, 4)
>  #define __builtin_ia32_getmantsd_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantsd_mask_round(A, B, 1, W, U, 4)
> -#define __builtin_ia32_getmantss_round(A, B, C, D) __builtin_ia32_getmantss_round(A, B, 1, 4)
>  #define __builtin_ia32_getmantss_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantss_mask_round(A, B, 1, W, U, 4)
>  #define __builtin_ia32_insertf32x4_mask(A, B, F, D, E) __builtin_ia32_insertf32x4_mask(A, B, 1, D, E)
>  #define __builtin_ia32_insertf64x4_mask(A, B, F, D, E) __builtin_ia32_insertf64x4_mask(A, B, 1, D, E)
> @@ -248,21 +240,15 @@
>  #define __builtin_ia32_inserti64x4_mask(A, B, F, D, E) __builtin_ia32_inserti64x4_mask(A, B, 1, D, E)
>  #define __builtin_ia32_maxpd512_mask(A, B, C, D, E) __builtin_ia32_maxpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_maxps512_mask(A, B, C, D, E) __builtin_ia32_maxps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_maxsd_round(A, B, C) __builtin_ia32_maxsd_round(A, B, 4)
>  #define __builtin_ia32_maxsd_mask_round(A, B, C, D, E) __builtin_ia32_maxsd_mask_round(A, B, C, D, 8)
> -#define __builtin_ia32_maxss_round(A, B, C) __builtin_ia32_maxss_round(A, B, 4)
>  #define __builtin_ia32_maxss_mask_round(A, B, C, D, E) __builtin_ia32_maxss_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_minpd512_mask(A, B, C, D, E) __builtin_ia32_minpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_minps512_mask(A, B, C, D, E) __builtin_ia32_minps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_minsd_round(A, B, C) __builtin_ia32_minsd_round(A, B, 4)
>  #define __builtin_ia32_minsd_mask_round(A, B, C, D, E) __builtin_ia32_minsd_mask_round(A, B, C, D, 4)
> -#define __builtin_ia32_minss_round(A, B, C) __builtin_ia32_minss_round(A, B, 4)
>  #define __builtin_ia32_minss_mask_round(A, B, C, D, E) __builtin_ia32_minss_mask_round(A, B, C, D, 4)
>  #define __builtin_ia32_mulpd512_mask(A, B, C, D, E) __builtin_ia32_mulpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_mulps512_mask(A, B, C, D, E) __builtin_ia32_mulps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_mulsd_round(A, B, C) __builtin_ia32_mulsd_round(A, B, 8)
>  #define __builtin_ia32_mulsd_mask_round(A, B, C, D, E) __builtin_ia32_mulsd_mask_round(A, B, C, D, 8)
> -#define __builtin_ia32_mulss_round(A, B, C) __builtin_ia32_mulss_round(A, B, 8)
>  #define __builtin_ia32_mulss_mask_round(A, B, C, D, E) __builtin_ia32_mulss_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_permdf512_mask(A, E, C, D) __builtin_ia32_permdf512_mask(A, 1, C, D)
>  #define __builtin_ia32_permdi512_mask(A, E, C, D) __builtin_ia32_permdi512_mask(A, 1, C, D)
> @@ -309,9 +295,7 @@
>  #define __builtin_ia32_sqrtsd_mask_round(A, B, C, D, E) __builtin_ia32_sqrtsd_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_subpd512_mask(A, B, C, D, E) __builtin_ia32_subpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_subps512_mask(A, B, C, D, E) __builtin_ia32_subps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_subsd_round(A, B, C) __builtin_ia32_subsd_round(A, B, 8)
>  #define __builtin_ia32_subsd_mask_round(A, B, C, D, E) __builtin_ia32_subsd_mask_round(A, B, C, D, 8)
> -#define __builtin_ia32_subss_round(A, B, C) __builtin_ia32_subss_round(A, B, 8)
>  #define __builtin_ia32_subss_mask_round(A, B, C, D, E) __builtin_ia32_subss_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_ucmpd512_mask(A, B, E, D) __builtin_ia32_ucmpd512_mask(A, B, 1, D)
>  #define __builtin_ia32_ucmpq512_mask(A, B, E, D) __builtin_ia32_ucmpq512_mask(A, B, 1, D)
> @@ -341,8 +325,6 @@
>  #define __builtin_ia32_vfmaddps512_mask(A, B, C, D, E) __builtin_ia32_vfmaddps512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_vfmaddps512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddps512_mask3(A, B, C, D, 8)
>  #define __builtin_ia32_vfmaddps512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddps512_maskz(A, B, C, D, 8)
> -#define __builtin_ia32_vfmaddsd3_round(A, B, C, D) __builtin_ia32_vfmaddsd3_round(A, B, C, 8)
> -#define __builtin_ia32_vfmaddss3_round(A, B, C, D) __builtin_ia32_vfmaddss3_round(A, B, C, 8)
>  #define __builtin_ia32_vfmaddsubpd512_mask(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_vfmaddsubpd512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_mask3(A, B, C, D, 8)
>  #define __builtin_ia32_vfmaddsubpd512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_maskz(A, B, C, D, 8)
> diff --git a/gcc/testsuite/gcc.target/i386/sse-13.c b/gcc/testsuite/gcc.target/i386/sse-13.c
> index 45c1c285c57..fdb7852f0b3 100644
> --- a/gcc/testsuite/gcc.target/i386/sse-13.c
> +++ b/gcc/testsuite/gcc.target/i386/sse-13.c
> @@ -189,9 +189,7 @@
>  #define __builtin_ia32_kshiftrihi(A, B) __builtin_ia32_kshiftrihi(A, 8)
>  #define __builtin_ia32_addpd512_mask(A, B, C, D, E) __builtin_ia32_addpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_addps512_mask(A, B, C, D, E) __builtin_ia32_addps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_addsd_round(A, B, C) __builtin_ia32_addsd_round(A, B, 8)
>  #define __builtin_ia32_addsd_mask_round(A, B, C, D, E) __builtin_ia32_addsd_mask_round(A, B, C, D, 8)
> -#define __builtin_ia32_addss_round(A, B, C) __builtin_ia32_addss_round(A, B, 8)
>  #define __builtin_ia32_addss_mask_round(A, B, C, D, E) __builtin_ia32_addss_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_alignd512_mask(A, B, F, D, E) __builtin_ia32_alignd512_mask(A, B, 1, D, E)
>  #define __builtin_ia32_alignq512_mask(A, B, F, D, E) __builtin_ia32_alignq512_mask(A, B, 1, D, E)
> @@ -223,9 +221,7 @@
>  #define __builtin_ia32_cvtusi2ss64(A, B, C) __builtin_ia32_cvtusi2ss64(A, B, 8)
>  #define __builtin_ia32_divpd512_mask(A, B, C, D, E) __builtin_ia32_divpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_divps512_mask(A, B, C, D, E) __builtin_ia32_divps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_divsd_round(A, B, C) __builtin_ia32_divsd_round(A, B, 8)
>  #define __builtin_ia32_divsd_mask_round(A, B, C, D, E) __builtin_ia32_divsd_mask_round(A, B, C, D, 8)
> -#define __builtin_ia32_divss_round(A, B, C) __builtin_ia32_divss_round(A, B, 8)
>  #define __builtin_ia32_divss_mask_round(A, B, C, D, E) __builtin_ia32_divss_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_extractf32x4_mask(A, E, C, D) __builtin_ia32_extractf32x4_mask(A, 1, C, D)
>  #define __builtin_ia32_extractf64x4_mask(A, E, C, D) __builtin_ia32_extractf64x4_mask(A, 1, C, D)
> @@ -249,15 +245,11 @@
>  #define __builtin_ia32_gathersiv8di(A, B, C, D, F) __builtin_ia32_gathersiv8di(A, B, C, D, 8)
>  #define __builtin_ia32_getexppd512_mask(A, B, C, D) __builtin_ia32_getexppd512_mask(A, B, C, 8)
>  #define __builtin_ia32_getexpps512_mask(A, B, C, D) __builtin_ia32_getexpps512_mask(A, B, C, 8)
> -#define __builtin_ia32_getexpsd128_round(A, B, C) __builtin_ia32_getexpsd128_round(A, B, 4)
>  #define __builtin_ia32_getexpsd_mask_round(A, B, C, D, E) __builtin_ia32_getexpsd_mask_round(A, B, C, D, 4)
> -#define __builtin_ia32_getexpss128_round(A, B, C) __builtin_ia32_getexpss128_round(A, B, 4)
>  #define __builtin_ia32_getexpss_mask_round(A, B, C, D, E) __builtin_ia32_getexpss_mask_round(A, B, C, D, 4)
>  #define __builtin_ia32_getmantpd512_mask(A, F, C, D, E) __builtin_ia32_getmantpd512_mask(A, 1, C, D, 8)
>  #define __builtin_ia32_getmantps512_mask(A, F, C, D, E) __builtin_ia32_getmantps512_mask(A, 1, C, D, 8)
> -#define __builtin_ia32_getmantsd_round(A, B, C, D) __builtin_ia32_getmantsd_round(A, B, 1, 4)
>  #define __builtin_ia32_getmantsd_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantsd_mask_round(A, B, 1, W, U, 4)
> -#define __builtin_ia32_getmantss_round(A, B, C, D) __builtin_ia32_getmantss_round(A, B, 1, 4)
>  #define __builtin_ia32_getmantss_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantss_mask_round(A, B, 1, W, U, 4)
>  #define __builtin_ia32_insertf32x4_mask(A, B, F, D, E) __builtin_ia32_insertf32x4_mask(A, B, 1, D, E)
>  #define __builtin_ia32_insertf64x4_mask(A, B, F, D, E) __builtin_ia32_insertf64x4_mask(A, B, 1, D, E)
> @@ -265,21 +257,15 @@
>  #define __builtin_ia32_inserti64x4_mask(A, B, F, D, E) __builtin_ia32_inserti64x4_mask(A, B, 1, D, E)
>  #define __builtin_ia32_maxpd512_mask(A, B, C, D, E) __builtin_ia32_maxpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_maxps512_mask(A, B, C, D, E) __builtin_ia32_maxps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_maxsd_round(A, B, C) __builtin_ia32_maxsd_round(A, B, 4)
>  #define __builtin_ia32_maxsd_mask_round(A, B, C, D, E) __builtin_ia32_maxsd_mask_round(A, B, C, D, 4)
> -#define __builtin_ia32_maxss_round(A, B, C) __builtin_ia32_maxss_round(A, B, 4)
>  #define __builtin_ia32_maxss_mask_round(A, B, C, D, E) __builtin_ia32_maxss_mask_round(A, B, C, D, 4)
>  #define __builtin_ia32_minpd512_mask(A, B, C, D, E) __builtin_ia32_minpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_minps512_mask(A, B, C, D, E) __builtin_ia32_minps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_minsd_round(A, B, C) __builtin_ia32_minsd_round(A, B, 4)
>  #define __builtin_ia32_minsd_mask_round(A, B, C, D, E) __builtin_ia32_minsd_mask_round(A, B, C, D, 4)
> -#define __builtin_ia32_minss_round(A, B, C) __builtin_ia32_minss_round(A, B, 4)
>  #define __builtin_ia32_minss_mask_round(A, B, C, D, E) __builtin_ia32_minss_mask_round(A, B, C, D, 4)
>  #define __builtin_ia32_mulpd512_mask(A, B, C, D, E) __builtin_ia32_mulpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_mulps512_mask(A, B, C, D, E) __builtin_ia32_mulps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_mulsd_round(A, B, C) __builtin_ia32_mulsd_round(A, B, 8)
>  #define __builtin_ia32_mulsd_mask_round(A, B, C, D, E) __builtin_ia32_mulsd_mask_round(A, B, C, D, 8)
> -#define __builtin_ia32_mulss_round(A, B, C) __builtin_ia32_mulss_round(A, B, 8)
>  #define __builtin_ia32_mulss_mask_round(A, B, C, D, E) __builtin_ia32_mulss_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_permdf512_mask(A, E, C, D) __builtin_ia32_permdf512_mask(A, 1, C, D)
>  #define __builtin_ia32_permdi512_mask(A, E, C, D) __builtin_ia32_permdi512_mask(A, 1, C, D)
> @@ -326,9 +312,7 @@
>  #define __builtin_ia32_sqrtsd_mask_round(A, B, C, D, E) __builtin_ia32_sqrtsd_mask_round(A, B, C, E, 8)
>  #define __builtin_ia32_subpd512_mask(A, B, C, D, E) __builtin_ia32_subpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_subps512_mask(A, B, C, D, E) __builtin_ia32_subps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_subsd_round(A, B, C) __builtin_ia32_subsd_round(A, B, 8)
>  #define __builtin_ia32_subsd_mask_round(A, B, C, D, E) __builtin_ia32_subsd_mask_round(A, B, C, D, 8)
> -#define __builtin_ia32_subss_round(A, B, C) __builtin_ia32_subss_round(A, B, 8)
>  #define __builtin_ia32_subss_mask_round(A, B, C, D, E) __builtin_ia32_subss_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_ucmpd512_mask(A, B, E, D) __builtin_ia32_ucmpd512_mask(A, B, 1, D)
>  #define __builtin_ia32_ucmpq512_mask(A, B, E, D) __builtin_ia32_ucmpq512_mask(A, B, 1, D)
> @@ -358,8 +342,6 @@
>  #define __builtin_ia32_vfmaddps512_mask(A, B, C, D, E) __builtin_ia32_vfmaddps512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_vfmaddps512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddps512_mask3(A, B, C, D, 8)
>  #define __builtin_ia32_vfmaddps512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddps512_maskz(A, B, C, D, 8)
> -#define __builtin_ia32_vfmaddsd3_round(A, B, C, D) __builtin_ia32_vfmaddsd3_round(A, B, C, 8)
> -#define __builtin_ia32_vfmaddss3_round(A, B, C, D) __builtin_ia32_vfmaddss3_round(A, B, C, 8)
>  #define __builtin_ia32_vfmaddsubpd512_mask(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_vfmaddsubpd512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_mask3(A, B, C, D, 8)
>  #define __builtin_ia32_vfmaddsubpd512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_maskz(A, B, C, D, 8)
> diff --git a/gcc/testsuite/gcc.target/i386/sse-23.c b/gcc/testsuite/gcc.target/i386/sse-23.c
> index e98c7693ef7..cb98cc63e6b 100644
> --- a/gcc/testsuite/gcc.target/i386/sse-23.c
> +++ b/gcc/testsuite/gcc.target/i386/sse-23.c
> @@ -191,9 +191,7 @@
>  #define __builtin_ia32_kshiftrihi(A, B) __builtin_ia32_kshiftrihi(A, 8)
>  #define __builtin_ia32_addpd512_mask(A, B, C, D, E) __builtin_ia32_addpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_addps512_mask(A, B, C, D, E) __builtin_ia32_addps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_addsd_round(A, B, C) __builtin_ia32_addsd_round(A, B, 8)
>  #define __builtin_ia32_addsd_mask_round(A, B, C, D, E) __builtin_ia32_addsd_mask_round(A, B, C, D, 8)
> -#define __builtin_ia32_addss_round(A, B, C) __builtin_ia32_addss_round(A, B, 8)
>  #define __builtin_ia32_addss_mask_round(A, B, C, D, E) __builtin_ia32_addss_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_alignd512_mask(A, B, F, D, E) __builtin_ia32_alignd512_mask(A, B, 1, D, E)
>  #define __builtin_ia32_alignq512_mask(A, B, F, D, E) __builtin_ia32_alignq512_mask(A, B, 1, D, E)
> @@ -225,9 +223,7 @@
>  #define __builtin_ia32_cvtusi2ss64(A, B, C) __builtin_ia32_cvtusi2ss64(A, B, 8)
>  #define __builtin_ia32_divpd512_mask(A, B, C, D, E) __builtin_ia32_divpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_divps512_mask(A, B, C, D, E) __builtin_ia32_divps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_divsd_round(A, B, C) __builtin_ia32_divsd_round(A, B, 8)
>  #define __builtin_ia32_divsd_mask_round(A, B, C, D, E) __builtin_ia32_divsd_mask_round(A, B, C, D, 8)
> -#define __builtin_ia32_divss_round(A, B, C) __builtin_ia32_divss_round(A, B, 8)
>  #define __builtin_ia32_divss_mask_round(A, B, C, D, E) __builtin_ia32_divss_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_extractf32x4_mask(A, E, C, D) __builtin_ia32_extractf32x4_mask(A, 1, C, D)
>  #define __builtin_ia32_extractf64x4_mask(A, E, C, D) __builtin_ia32_extractf64x4_mask(A, 1, C, D)
> @@ -251,15 +247,11 @@
>  #define __builtin_ia32_gathersiv8di(A, B, C, D, F) __builtin_ia32_gathersiv8di(A, B, C, D, 8)
>  #define __builtin_ia32_getexppd512_mask(A, B, C, D) __builtin_ia32_getexppd512_mask(A, B, C, 8)
>  #define __builtin_ia32_getexpps512_mask(A, B, C, D) __builtin_ia32_getexpps512_mask(A, B, C, 8)
> -#define __builtin_ia32_getexpsd128_round(A, B, C) __builtin_ia32_getexpsd128_round(A, B, 4)
>  #define __builtin_ia32_getexpsd_mask_round(A, B, C, D, E) __builtin_ia32_getexpsd_mask_round(A, B, C, D, 4)
> -#define __builtin_ia32_getexpss128_round(A, B, C) __builtin_ia32_getexpss128_round(A, B, 4)
>  #define __builtin_ia32_getexpss_mask_round(A, B, C, D, E) __builtin_ia32_getexpss_mask_round(A, B, C, D, 4)
>  #define __builtin_ia32_getmantpd512_mask(A, F, C, D, E) __builtin_ia32_getmantpd512_mask(A, 1, C, D, 8)
>  #define __builtin_ia32_getmantps512_mask(A, F, C, D, E) __builtin_ia32_getmantps512_mask(A, 1, C, D, 8)
> -#define __builtin_ia32_getmantsd_round(A, B, C, D) __builtin_ia32_getmantsd_round(A, B, 1, 4)
>  #define __builtin_ia32_getmantsd_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantsd_mask_round(A, B, 1, W, U, 4)
> -#define __builtin_ia32_getmantss_round(A, B, C, D) __builtin_ia32_getmantss_round(A, B, 1, 4)
>  #define __builtin_ia32_getmantss_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantss_mask_round(A, B, 1, W, U, 4)
>  #define __builtin_ia32_insertf32x4_mask(A, B, F, D, E) __builtin_ia32_insertf32x4_mask(A, B, 1, D, E)
>  #define __builtin_ia32_insertf64x4_mask(A, B, F, D, E) __builtin_ia32_insertf64x4_mask(A, B, 1, D, E)
> @@ -267,9 +259,7 @@
>  #define __builtin_ia32_inserti64x4_mask(A, B, F, D, E) __builtin_ia32_inserti64x4_mask(A, B, 1, D, E)
>  #define __builtin_ia32_maxpd512_mask(A, B, C, D, E) __builtin_ia32_maxpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_maxps512_mask(A, B, C, D, E) __builtin_ia32_maxps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_maxsd_round(A, B, C) __builtin_ia32_maxsd_round(A, B, 4)
>  #define __builtin_ia32_maxsd_mask_round(A, B, C, D, E) __builtin_ia32_maxsd_mask_round(A, B, C, D, 4)
> -#define __builtin_ia32_maxss_round(A, B, C) __builtin_ia32_maxss_round(A, B, 4)
>  #define __builtin_ia32_maxss_mask_round(A, B, C, D, E) __builtin_ia32_maxss_mask_round(A, B, C, D, 4)
>  #define __builtin_ia32_minpd512_mask(A, B, C, D, E) __builtin_ia32_minpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_minps512_mask(A, B, C, D, E) __builtin_ia32_minps512_mask(A, B, C, D, 8)
> @@ -279,9 +269,7 @@
>  #define __builtin_ia32_minss_mask_round(A, B, C, D, E) __builtin_ia32_minss_mask_round(A, B, C, D, 4)
>  #define __builtin_ia32_mulpd512_mask(A, B, C, D, E) __builtin_ia32_mulpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_mulps512_mask(A, B, C, D, E) __builtin_ia32_mulps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_mulsd_round(A, B, C) __builtin_ia32_mulsd_round(A, B, 8)
>  #define __builtin_ia32_mulsd_mask_round(A, B, C, D, E) __builtin_ia32_mulsd_mask_round(A, B, C, D, 8)
> -#define __builtin_ia32_mulss_round(A, B, C) __builtin_ia32_mulss_round(A, B, 8)
>  #define __builtin_ia32_mulss_mask_round(A, B, C, D, E) __builtin_ia32_mulss_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_permdf512_mask(A, E, C, D) __builtin_ia32_permdf512_mask(A, 1, C, D)
>  #define __builtin_ia32_permdi512_mask(A, E, C, D) __builtin_ia32_permdi512_mask(A, 1, C, D)
> @@ -328,9 +316,7 @@
>  #define __builtin_ia32_sqrtsd_mask_round(A, B, C, D, E) __builtin_ia32_sqrtsd_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_subpd512_mask(A, B, C, D, E) __builtin_ia32_subpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_subps512_mask(A, B, C, D, E) __builtin_ia32_subps512_mask(A, B, C, D, 8)
> -#define __builtin_ia32_subsd_round(A, B, C) __builtin_ia32_subsd_round(A, B, 8)
>  #define __builtin_ia32_subsd_mask_round(A, B, C, D, E) __builtin_ia32_subsd_mask_round(A, B, C, D, 8)
> -#define __builtin_ia32_subss_round(A, B, C) __builtin_ia32_subss_round(A, B, 8)
>  #define __builtin_ia32_subss_mask_round(A, B, C, D, E) __builtin_ia32_subss_mask_round(A, B, C, D, 8)
>  #define __builtin_ia32_ucmpd512_mask(A, B, E, D) __builtin_ia32_ucmpd512_mask(A, B, 1, D)
>  #define __builtin_ia32_ucmpq512_mask(A, B, E, D) __builtin_ia32_ucmpq512_mask(A, B, 1, D)
> @@ -360,8 +346,6 @@
>  #define __builtin_ia32_vfmaddps512_mask(A, B, C, D, E) __builtin_ia32_vfmaddps512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_vfmaddps512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddps512_mask3(A, B, C, D, 8)
>  #define __builtin_ia32_vfmaddps512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddps512_maskz(A, B, C, D, 8)
> -#define __builtin_ia32_vfmaddsd3_round(A, B, C, D) __builtin_ia32_vfmaddsd3_round(A, B, C, 8)
> -#define __builtin_ia32_vfmaddss3_round(A, B, C, D) __builtin_ia32_vfmaddss3_round(A, B, C, 8)
>  #define __builtin_ia32_vfmaddsubpd512_mask(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_mask(A, B, C, D, 8)
>  #define __builtin_ia32_vfmaddsubpd512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_mask3(A, B, C, D, 8)
>  #define __builtin_ia32_vfmaddsubpd512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddsubpd512_maskz(A, B, C, D, 8)
> diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
> index 98f1141a8a4..e102b15ce54 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -7786,7 +7786,7 @@ proc check_effective_target_avx512f { } {
>
>   __m128d _mm128_getmant (__m128d a)
>   {
> -  return __builtin_ia32_getmantsd_round (a, a, 0, 8);
> +  return __builtin_ia32_getmantsd_mask_round (a, a, 0, a, 1, 8);
>   }
>      } "-O2 -mavx512f" ]
>  }

-- 
Regards,

Hongyu, Wang

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Remove redundant builtins for avx512f scalar instructions.
  2020-11-13  6:21   ` Hongyu Wang
@ 2020-11-30 16:23     ` Jeff Law
  2020-11-30 16:26       ` Jakub Jelinek
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff Law @ 2020-11-30 16:23 UTC (permalink / raw)
  To: Hongyu Wang; +Cc: Jakub Jelinek, GCC Patches, Hongtao Liu, H.J. Lu



On 11/12/20 11:21 PM, Hongyu Wang wrote:
> Hi
>
> Thanks for reminding me about this patch. I didn't remove any existing
> intrinsics, just remove redundant builtin functions that end-users
> would not likely to use.
>
> Also I'm OK to keep current implementation, in case there might be
> someone using the builtin directly.
That seems wise -- we can't reasonably predict if users are using those
builtins directly. 

So if we can clean things up and keep the redundant builtins that seems
best.  Or just leave things as-is. 

The other possibility would be to deprecate the redundant builtins this
release and remove them in gcc-12.  I haven't looked at how difficult
that might be, but the idea here would be to give users a warning if
they use those builtins directly and enough time to resolve the issue
before we remove them.

jeff


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Remove redundant builtins for avx512f scalar instructions.
  2020-11-30 16:23     ` Jeff Law
@ 2020-11-30 16:26       ` Jakub Jelinek
  2020-11-30 20:51         ` Jeff Law
  0 siblings, 1 reply; 8+ messages in thread
From: Jakub Jelinek @ 2020-11-30 16:26 UTC (permalink / raw)
  To: Jeff Law; +Cc: Hongyu Wang, GCC Patches, Hongtao Liu, H.J. Lu

On Mon, Nov 30, 2020 at 09:23:15AM -0700, Jeff Law wrote:
> 
> 
> On 11/12/20 11:21 PM, Hongyu Wang wrote:
> > Hi
> >
> > Thanks for reminding me about this patch. I didn't remove any existing
> > intrinsics, just remove redundant builtin functions that end-users
> > would not likely to use.
> >
> > Also I'm OK to keep current implementation, in case there might be
> > someone using the builtin directly.
> That seems wise -- we can't reasonably predict if users are using those
> builtins directly. 
> 
> So if we can clean things up and keep the redundant builtins that seems
> best.  Or just leave things as-is. 
> 
> The other possibility would be to deprecate the redundant builtins this
> release and remove them in gcc-12.  I haven't looked at how difficult
> that might be, but the idea here would be to give users a warning if
> they use those builtins directly and enough time to resolve the issue
> before we remove them.

In the past we've removed the builtins without any warning, we state all the
time that the builtins used to implement the intrinsics themselves aren't
supported, only the intrinsics declared in the headers.

	Jakub


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Remove redundant builtins for avx512f scalar instructions.
  2020-11-30 16:26       ` Jakub Jelinek
@ 2020-11-30 20:51         ` Jeff Law
  0 siblings, 0 replies; 8+ messages in thread
From: Jeff Law @ 2020-11-30 20:51 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Hongyu Wang, GCC Patches, Hongtao Liu, H.J. Lu



On 11/30/20 9:26 AM, Jakub Jelinek wrote:
> On Mon, Nov 30, 2020 at 09:23:15AM -0700, Jeff Law wrote:
>>
>> On 11/12/20 11:21 PM, Hongyu Wang wrote:
>>> Hi
>>>
>>> Thanks for reminding me about this patch. I didn't remove any existing
>>> intrinsics, just remove redundant builtin functions that end-users
>>> would not likely to use.
>>>
>>> Also I'm OK to keep current implementation, in case there might be
>>> someone using the builtin directly.
>> That seems wise -- we can't reasonably predict if users are using those
>> builtins directly. 
>>
>> So if we can clean things up and keep the redundant builtins that seems
>> best.  Or just leave things as-is. 
>>
>> The other possibility would be to deprecate the redundant builtins this
>> release and remove them in gcc-12.  I haven't looked at how difficult
>> that might be, but the idea here would be to give users a warning if
>> they use those builtins directly and enough time to resolve the issue
>> before we remove them.
> In the past we've removed the builtins without any warning, we state all the
> time that the builtins used to implement the intrinsics themselves aren't
> supported, only the intrinsics declared in the headers.
In that case, I don't mind removing those builtins.
jeff


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-11-30 20:51 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-24 14:24 [PATCH] Remove redundant builtins for avx512f scalar instructions Hongyu Wang
2020-01-14 20:52 ` Jeff Law
2020-01-15  2:55   ` Hongyu Wang
2020-11-13  5:42 ` Jeff Law
2020-11-13  6:21   ` Hongyu Wang
2020-11-30 16:23     ` Jeff Law
2020-11-30 16:26       ` Jakub Jelinek
2020-11-30 20:51         ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).