public inbox for gcc-patches@gcc.gnu.org
* [PATCH 0/2] Support AVX10.1 for AVX512DQ intrins
@ 2023-08-17  6:55 Haochen Jiang
  2023-08-17  6:55 ` [PATCH 1/2] " Haochen Jiang
  2023-08-17  6:55 ` [PATCH 2/2] " Haochen Jiang
  0 siblings, 2 replies; 3+ messages in thread
From: Haochen Jiang @ 2023-08-17  6:55 UTC (permalink / raw)
  To: gcc-patches; +Cc: hongtao.liu, ubizjak

Hi all,

I have just checked in the first nine AVX10.1 patches, after waiting
one day since Hongtao's ok.

These two patches add the AVX512DQ scalar intrins to AVX10.1.
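
For illustration, here is a minimal usage sketch (not part of the
patches themselves); it assumes a compiler with this series applied
and the -mavx10.1 option from the already-committed AVX10.1 patches:

  /* Compile with -mavx10.1 (or, as before, -mavx512dq).  */
  #include <immintrin.h>

  __m128d
  use_reduce_sd (__m128d a, __m128d b)
  {
    /* Low lane: b[0] minus b[0] rounded to integer (imm8 0);
       upper lane copied from a.  */
    return _mm_reduce_sd (a, b, 0);
  }

  __mmask8
  use_kadd_mask8 (__mmask8 a, __mmask8 b)
  {
    /* 8-bit mask add (kaddb), previously gated on AVX512DQ only.  */
    return _kadd_mask8 (a, b);
  }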

Regtested on x86_64-pc-linux-gnu. Ok for trunk?

Also, we propose to commit the patches step by step over the following
weeks so that we can see whether there are any unexpected regressions
on trunk.

The schedule is as follows:

This week: AVX512DQ 128/256/scalar intrins

Week 8/21-8/27: AVX512BW 128/256/scalar intrins

Week 8/28-9/3: AVX512VL, AVX512F, AVX512BF16, AVX512VBMI, AVX512VNNI,
AVX512IFMA, AVX512VBMI2, AVX512BITALG, AVX512VPOPCNTDQ 128/256/scalar
intrins

Week 9/4-9/10: AVX512FP16 128/256/scalar intrins and the GFNI,
VPCLMULQDQ, VAES AVX10 checks

Week 9/11-9/17: AVX512 optimization migration to AVX10.1

Thx,
Haochen





* [PATCH 1/2] Support AVX10.1 for AVX512DQ intrins
  2023-08-17  6:55 [PATCH 0/2] Support AVX10.1 for AVX512DQ intrins Haochen Jiang
@ 2023-08-17  6:55 ` Haochen Jiang
  2023-08-17  6:55 ` [PATCH 2/2] " Haochen Jiang
  1 sibling, 0 replies; 3+ messages in thread
From: Haochen Jiang @ 2023-08-17  6:55 UTC (permalink / raw)
  To: gcc-patches; +Cc: hongtao.liu, ubizjak

gcc/ChangeLog:

	* config.gcc: Add avx512dqavx10_1intrin.h.
	* config/i386/avx512dqintrin.h: Move avx10_1 related intrins
	to new intrin file.
	* config/i386/i386-builtin.def (BDESC):
	Add OPTION_MASK_ISA2_AVX10_1.
	* config/i386/i386.md (x64_avx512dq): Rename to
	x64_avx10_1_or_avx512dq. Add TARGET_AVX10_1.
	(*movqi_internal): Add TARGET_AVX10_1.
	* config/i386/immintrin.h: Add avx512dqavx10_1intrin.h.
	* config/i386/sse.md (SWI1248_AVX512BWDQ): Add
	TARGET_AVX10_1 and TARGET_AVX512F.
	(SWI1248_AVX512BW): Ditto.
	(SWI1248_AVX512BWDQ2): Ditto.
	(kmov<mskmodesuffix>): Remove TARGET_AVX512F check.
	(k<code><mode>): Remove TARGET_AVX512F check. Add
	TARGET_AVX10_1.
	(kandn<mode>): Ditto.
	(kxnor<mode>): Ditto.
	(knot<mode>): Ditto.
	(kadd<mode>): Remove TARGET_AVX512F check.
	(k<code><mode>): Ditto.
	(ktest<mode>): Ditto.
	(kortest<mode>): Ditto.
	(reduces<mode><mask_scalar_name><round_saeonly_scalar_name>):
	Add TARGET_AVX10_1.
	(pinsr_evex_isa): Change avx512dq to avx10_1_or_avx512dq.
	(*vec_extractv4si): Ditto.
	(*vec_extractv4si_zext): Ditto.
	(*vec_concatv2si_sse4_1): Ditto.
	(*vec_extractv2di_1): Change x64_avx512dq to
	x64_avx10_1_or_avx512dq.
	(vec_concatv2di): Ditto.
	(avx512dq_ranges<mode><mask_scalar_name><round_saeonly_scalar_name>):
	Add TARGET_AVX10_1.
	(avx512dq_vmfpclass<mode><mask_scalar_merge_name>): Ditto.
	* config/i386/subst.md (mask_scalar): Ditto.
	(round_saeonly_scalar): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/sse-26.c: Skip avx512dqavx10_1intrin.h.
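
As a hedged sketch (not part of this patch), the dual gating above is
expected to make the 8-bit kmask intrins usable under -mavx10.1 alone,
as well as under -mavx512dq as before; a compile-only testcase along
these lines (name and options are illustrative):

  /* { dg-do compile } */
  /* { dg-options "-mavx10.1" } */

  #include <immintrin.h>

  __mmask8
  foo (__mmask8 a, __mmask8 b)
  {
    /* knotb + kandnb, no longer requiring -mavx512dq.  */
    return _kandn_mask8 (_knot_mask8 (a), b);
  }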
---
 gcc/config.gcc                          |   9 +-
 gcc/config/i386/avx512dqavx10_1intrin.h | 634 ++++++++++++++++++++++++
 gcc/config/i386/avx512dqintrin.h        | 602 ----------------------
 gcc/config/i386/i386-builtin.def        |  50 +-
 gcc/config/i386/i386.md                 |   8 +-
 gcc/config/i386/immintrin.h             |   2 +
 gcc/config/i386/sse.md                  |  63 +--
 gcc/config/i386/subst.md                |   4 +-
 gcc/testsuite/gcc.target/i386/sse-26.c  |   1 +
 9 files changed, 706 insertions(+), 667 deletions(-)
 create mode 100644 gcc/config/i386/avx512dqavx10_1intrin.h

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 415e0e1ebc5..9b1be5350cd 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -415,10 +415,11 @@ i[34567]86-*-* | x86_64-*-*)
 		       adxintrin.h fxsrintrin.h xsaveintrin.h xsaveoptintrin.h
 		       avx512cdintrin.h avx512erintrin.h avx512pfintrin.h
 		       shaintrin.h clflushoptintrin.h xsavecintrin.h
-		       xsavesintrin.h avx512dqintrin.h avx512bwintrin.h
-		       avx512vlintrin.h avx512vlbwintrin.h avx512vldqintrin.h
-		       avx512ifmaintrin.h avx512ifmavlintrin.h avx512vbmiintrin.h
-		       avx512vbmivlintrin.h avx5124fmapsintrin.h avx5124vnniwintrin.h
+		       xsavesintrin.h avx512dqintrin.h avx512dqavx10_1intrin.h
+		       avx512bwintrin.h avx512vlintrin.h avx512vlbwintrin.h
+		       avx512vldqintrin.h avx512ifmaintrin.h avx512ifmavlintrin.h
+		       avx512vbmiintrin.h avx512vbmivlintrin.h
+		       avx5124fmapsintrin.h avx5124vnniwintrin.h
 		       avx512vpopcntdqintrin.h clwbintrin.h mwaitxintrin.h
 		       clzerointrin.h pkuintrin.h sgxintrin.h cetintrin.h
 		       gfniintrin.h cet.h avx512vbmi2intrin.h
diff --git a/gcc/config/i386/avx512dqavx10_1intrin.h b/gcc/config/i386/avx512dqavx10_1intrin.h
new file mode 100644
index 00000000000..4621f24863b
--- /dev/null
+++ b/gcc/config/i386/avx512dqavx10_1intrin.h
@@ -0,0 +1,634 @@
+/* Copyright (C) 2023 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _IMMINTRIN_H_INCLUDED
+#error "Never use <avx512dqavx10_1intrin.h> directly; include <immintrin.h> instead."
+#endif
+
+#ifndef _AVX512DQAVX10_1INTRIN_H_INCLUDED
+#define _AVX512DQAVX10_1INTRIN_H_INCLUDED
+
+extern __inline unsigned char
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_ktest_mask8_u8  (__mmask8 __A,  __mmask8 __B, unsigned char *__CF)
+{
+  *__CF = (unsigned char) __builtin_ia32_ktestcqi (__A, __B);
+  return (unsigned char) __builtin_ia32_ktestzqi (__A, __B);
+}
+
+extern __inline unsigned char
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_ktestz_mask8_u8 (__mmask8 __A, __mmask8 __B)
+{
+  return (unsigned char) __builtin_ia32_ktestzqi (__A, __B);
+}
+
+extern __inline unsigned char
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_ktestc_mask8_u8 (__mmask8 __A, __mmask8 __B)
+{
+  return (unsigned char) __builtin_ia32_ktestcqi (__A, __B);
+}
+
+extern __inline unsigned char
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_ktest_mask16_u8  (__mmask16 __A,  __mmask16 __B, unsigned char *__CF)
+{
+  *__CF = (unsigned char) __builtin_ia32_ktestchi (__A, __B);
+  return (unsigned char) __builtin_ia32_ktestzhi (__A, __B);
+}
+
+extern __inline unsigned char
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_ktestz_mask16_u8 (__mmask16 __A, __mmask16 __B)
+{
+  return (unsigned char) __builtin_ia32_ktestzhi (__A, __B);
+}
+
+extern __inline unsigned char
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_ktestc_mask16_u8 (__mmask16 __A, __mmask16 __B)
+{
+  return (unsigned char) __builtin_ia32_ktestchi (__A, __B);
+}
+
+extern __inline unsigned char
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kortest_mask8_u8  (__mmask8 __A,  __mmask8 __B, unsigned char *__CF)
+{
+  *__CF = (unsigned char) __builtin_ia32_kortestcqi (__A, __B);
+  return (unsigned char) __builtin_ia32_kortestzqi (__A, __B);
+}
+
+extern __inline unsigned char
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kortestz_mask8_u8 (__mmask8 __A, __mmask8 __B)
+{
+  return (unsigned char) __builtin_ia32_kortestzqi (__A, __B);
+}
+
+extern __inline unsigned char
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kortestc_mask8_u8 (__mmask8 __A, __mmask8 __B)
+{
+  return (unsigned char) __builtin_ia32_kortestcqi (__A, __B);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kadd_mask8 (__mmask8 __A, __mmask8 __B)
+{
+  return (__mmask8) __builtin_ia32_kaddqi ((__mmask8) __A, (__mmask8) __B);
+}
+
+extern __inline __mmask16
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kadd_mask16 (__mmask16 __A, __mmask16 __B)
+{
+  return (__mmask16) __builtin_ia32_kaddhi ((__mmask16) __A, (__mmask16) __B);
+}
+
+extern __inline unsigned int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_cvtmask8_u32 (__mmask8 __A)
+{
+  return (unsigned int) __builtin_ia32_kmovb ((__mmask8 ) __A);
+}
+	
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_cvtu32_mask8 (unsigned int __A)
+{
+  return (__mmask8) __builtin_ia32_kmovb ((__mmask8) __A);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_load_mask8 (__mmask8 *__A)
+{
+  return (__mmask8) __builtin_ia32_kmovb (*(__mmask8 *) __A);
+}
+
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_store_mask8 (__mmask8 *__A, __mmask8 __B)
+{
+  *(__mmask8 *) __A = __builtin_ia32_kmovb (__B);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_knot_mask8 (__mmask8 __A)
+{
+  return (__mmask8) __builtin_ia32_knotqi ((__mmask8) __A);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kor_mask8 (__mmask8 __A, __mmask8 __B)
+{
+  return (__mmask8) __builtin_ia32_korqi ((__mmask8) __A, (__mmask8) __B);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kxnor_mask8 (__mmask8 __A, __mmask8 __B)
+{
+  return (__mmask8) __builtin_ia32_kxnorqi ((__mmask8) __A, (__mmask8) __B);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kxor_mask8 (__mmask8 __A, __mmask8 __B)
+{
+  return (__mmask8) __builtin_ia32_kxorqi ((__mmask8) __A, (__mmask8) __B);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kand_mask8 (__mmask8 __A, __mmask8 __B)
+{
+  return (__mmask8) __builtin_ia32_kandqi ((__mmask8) __A, (__mmask8) __B);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kandn_mask8 (__mmask8 __A, __mmask8 __B)
+{
+  return (__mmask8) __builtin_ia32_kandnqi ((__mmask8) __A, (__mmask8) __B);
+}
+
+#ifdef __OPTIMIZE__
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kshiftli_mask8 (__mmask8 __A, unsigned int __B)
+{
+  return (__mmask8) __builtin_ia32_kshiftliqi ((__mmask8) __A, (__mmask8) __B);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kshiftri_mask8 (__mmask8 __A, unsigned int __B)
+{
+  return (__mmask8) __builtin_ia32_kshiftriqi ((__mmask8) __A, (__mmask8) __B);
+}
+
+extern __inline __m128d
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_reduce_sd (__m128d __A, __m128d __B, int __C)
+{
+  return (__m128d) __builtin_ia32_reducesd_mask ((__v2df) __A,
+						 (__v2df) __B, __C,
+						 (__v2df) _mm_setzero_pd (),
+						 (__mmask8) -1);
+}
+
+extern __inline __m128d
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_reduce_round_sd (__m128d __A, __m128d __B, int __C, const int __R)
+{
+  return (__m128d) __builtin_ia32_reducesd_mask_round ((__v2df) __A,
+						       (__v2df) __B, __C,
+						       (__v2df)
+						       _mm_setzero_pd (),
+						       (__mmask8) -1, __R);
+}
+
+extern __inline __m128d
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_reduce_sd (__m128d __W,  __mmask8 __U, __m128d __A,
+		    __m128d __B, int __C)
+{
+  return (__m128d) __builtin_ia32_reducesd_mask ((__v2df) __A,
+						 (__v2df) __B, __C,
+						 (__v2df) __W,
+						 (__mmask8) __U);
+}
+
+extern __inline __m128d
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_reduce_round_sd (__m128d __W,  __mmask8 __U, __m128d __A,
+			  __m128d __B, int __C, const int __R)
+{
+  return (__m128d) __builtin_ia32_reducesd_mask_round ((__v2df) __A,
+						       (__v2df) __B, __C,
+						       (__v2df) __W,
+						       __U, __R);
+}
+
+extern __inline __m128d
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskz_reduce_sd (__mmask8 __U, __m128d __A, __m128d __B, int __C)
+{
+  return (__m128d) __builtin_ia32_reducesd_mask ((__v2df) __A,
+						 (__v2df) __B, __C,
+						 (__v2df) _mm_setzero_pd (),
+						 (__mmask8) __U);
+}
+
+extern __inline __m128d
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskz_reduce_round_sd (__mmask8 __U, __m128d __A, __m128d __B,
+			   int __C, const int __R)
+{
+  return (__m128d) __builtin_ia32_reducesd_mask_round ((__v2df) __A,
+						       (__v2df) __B, __C,
+						       (__v2df)
+						       _mm_setzero_pd (),
+						       __U, __R);
+}
+
+extern __inline __m128
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_reduce_ss (__m128 __A, __m128 __B, int __C)
+{
+  return (__m128) __builtin_ia32_reducess_mask ((__v4sf) __A,
+						(__v4sf) __B, __C,
+						(__v4sf) _mm_setzero_ps (),
+						(__mmask8) -1);
+}
+
+extern __inline __m128
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_reduce_round_ss (__m128 __A, __m128 __B, int __C, const int __R)
+{
+  return (__m128) __builtin_ia32_reducess_mask_round ((__v4sf) __A,
+						      (__v4sf) __B, __C,
+						      (__v4sf)
+						      _mm_setzero_ps (),
+						      (__mmask8) -1, __R);
+}
+
+extern __inline __m128
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_reduce_ss (__m128 __W,  __mmask8 __U, __m128 __A,
+		    __m128 __B, int __C)
+{
+  return (__m128) __builtin_ia32_reducess_mask ((__v4sf) __A,
+						(__v4sf) __B, __C,
+						(__v4sf) __W,
+						(__mmask8) __U);
+}
+
+extern __inline __m128
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_reduce_round_ss (__m128 __W,  __mmask8 __U, __m128 __A,
+			  __m128 __B, int __C, const int __R)
+{
+  return (__m128) __builtin_ia32_reducess_mask_round ((__v4sf) __A,
+						      (__v4sf) __B, __C,
+						      (__v4sf) __W,
+						      __U, __R);
+}
+
+extern __inline __m128
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskz_reduce_ss (__mmask8 __U, __m128 __A, __m128 __B, int __C)
+{
+  return (__m128) __builtin_ia32_reducess_mask ((__v4sf) __A,
+						(__v4sf) __B, __C,
+						(__v4sf) _mm_setzero_ps (),
+						(__mmask8) __U);
+}
+
+extern __inline __m128
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskz_reduce_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
+			   int __C, const int __R)
+{
+  return (__m128) __builtin_ia32_reducess_mask_round ((__v4sf) __A,
+						      (__v4sf) __B, __C,
+						      (__v4sf)
+						      _mm_setzero_ps (),
+						      __U, __R);
+}
+
+extern __inline __m128d
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_range_sd (__m128d __A, __m128d __B, int __C)
+{
+  return (__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df) __A,
+						   (__v2df) __B, __C,
+						   (__v2df)
+						   _mm_setzero_pd (),
+						   (__mmask8) -1,
+						   _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m128d
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_range_sd (__m128d __W, __mmask8 __U, __m128d __A, __m128d __B, int __C)
+{
+  return (__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df) __A,
+						   (__v2df) __B, __C,
+						   (__v2df) __W,
+						   (__mmask8) __U,
+						   _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m128d
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskz_range_sd (__mmask8 __U, __m128d __A, __m128d __B, int __C)
+{
+  return (__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df) __A,
+						   (__v2df) __B, __C,
+						   (__v2df)
+						   _mm_setzero_pd (),
+						   (__mmask8) __U,
+						   _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m128
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_range_ss (__m128 __A, __m128 __B, int __C)
+{
+  return (__m128) __builtin_ia32_rangess128_mask_round ((__v4sf) __A,
+						  (__v4sf) __B, __C,
+						  (__v4sf)
+						  _mm_setzero_ps (),
+						  (__mmask8) -1,
+						  _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m128
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_range_ss (__m128 __W, __mmask8 __U, __m128 __A, __m128 __B, int __C)
+{
+  return (__m128) __builtin_ia32_rangess128_mask_round ((__v4sf) __A,
+						  (__v4sf) __B, __C,
+						  (__v4sf) __W,
+						  (__mmask8) __U,
+						  _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m128
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskz_range_ss (__mmask8 __U, __m128 __A, __m128 __B, int __C)
+{
+  return (__m128) __builtin_ia32_rangess128_mask_round ((__v4sf) __A,
+						  (__v4sf) __B, __C,
+						  (__v4sf)
+						  _mm_setzero_ps (),
+						  (__mmask8) __U,
+						  _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m128d
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_range_round_sd (__m128d __A, __m128d __B, int __C, const int __R)
+{
+  return (__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df) __A,
+						   (__v2df) __B, __C,
+						   (__v2df)
+						   _mm_setzero_pd (),
+						   (__mmask8) -1, __R);
+}
+
+extern __inline __m128d
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_range_round_sd (__m128d __W, __mmask8 __U, __m128d __A, __m128d __B,
+			 int __C, const int __R)
+{
+  return (__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df) __A,
+						   (__v2df) __B, __C,
+						   (__v2df) __W,
+						   (__mmask8) __U, __R);
+}
+
+extern __inline __m128d
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskz_range_round_sd (__mmask8 __U, __m128d __A, __m128d __B, int __C,
+			  const int __R)
+{
+  return (__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df) __A,
+						   (__v2df) __B, __C,
+						   (__v2df)
+						   _mm_setzero_pd (),
+						   (__mmask8) __U, __R);
+}
+
+extern __inline __m128
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_range_round_ss (__m128 __A, __m128 __B, int __C, const int __R)
+{
+  return (__m128) __builtin_ia32_rangess128_mask_round ((__v4sf) __A,
+						  (__v4sf) __B, __C,
+						  (__v4sf)
+						  _mm_setzero_ps (),
+						  (__mmask8) -1, __R);
+}
+
+extern __inline __m128
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_range_round_ss (__m128 __W, __mmask8 __U, __m128 __A, __m128 __B,
+			 int __C, const int __R)
+{
+  return (__m128) __builtin_ia32_rangess128_mask_round ((__v4sf) __A,
+						  (__v4sf) __B, __C,
+						  (__v4sf) __W,
+						  (__mmask8) __U, __R);
+}
+
+extern __inline __m128
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskz_range_round_ss (__mmask8 __U, __m128 __A, __m128 __B, int __C,
+			  const int __R)
+{
+  return (__m128) __builtin_ia32_rangess128_mask_round ((__v4sf) __A,
+						  (__v4sf) __B, __C,
+						  (__v4sf)
+						  _mm_setzero_ps (),
+						  (__mmask8) __U, __R);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_fpclass_ss_mask (__m128 __A, const int __imm)
+{
+  return (__mmask8) __builtin_ia32_fpclassss_mask ((__v4sf) __A, __imm,
+						   (__mmask8) -1);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_fpclass_sd_mask (__m128d __A, const int __imm)
+{
+  return (__mmask8) __builtin_ia32_fpclasssd_mask ((__v2df) __A, __imm,
+						   (__mmask8) -1);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_fpclass_ss_mask (__mmask8 __U, __m128 __A, const int __imm)
+{
+  return (__mmask8) __builtin_ia32_fpclassss_mask ((__v4sf) __A, __imm, __U);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_fpclass_sd_mask (__mmask8 __U, __m128d __A, const int __imm)
+{
+  return (__mmask8) __builtin_ia32_fpclasssd_mask ((__v2df) __A, __imm, __U);
+}
+
+#else
+#define _kshiftli_mask8(X, Y)                                           \
+  ((__mmask8) __builtin_ia32_kshiftliqi ((__mmask8)(X), (__mmask8)(Y)))
+
+#define _kshiftri_mask8(X, Y)                                           \
+  ((__mmask8) __builtin_ia32_kshiftriqi ((__mmask8)(X), (__mmask8)(Y)))
+
+#define _mm_range_sd(A, B, C)						 \
+  ((__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df)(__m128d)(A), \
+    (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (), 	 \
+    (__mmask8) -1, _MM_FROUND_CUR_DIRECTION))
+
+#define _mm_mask_range_sd(W, U, A, B, C)				 \
+  ((__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df)(__m128d)(A), \
+    (__v2df)(__m128d)(B), (int)(C), (__v2df)(__m128d)(W), 		 \
+    (__mmask8)(U), _MM_FROUND_CUR_DIRECTION))
+
+#define _mm_maskz_range_sd(U, A, B, C)					 \
+  ((__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df)(__m128d)(A), \
+    (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (), 	 \
+    (__mmask8)(U), _MM_FROUND_CUR_DIRECTION))
+
+#define _mm_range_ss(A, B, C)						\
+  ((__m128) __builtin_ia32_rangess128_mask_round ((__v4sf)(__m128)(A),	\
+    (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (),		\
+    (__mmask8) -1, _MM_FROUND_CUR_DIRECTION))
+
+#define _mm_mask_range_ss(W, U, A, B, C)				\
+  ((__m128) __builtin_ia32_rangess128_mask_round ((__v4sf)(__m128)(A),	\
+    (__v4sf)(__m128)(B), (int)(C), (__v4sf)(__m128)(W),			\
+    (__mmask8)(U), _MM_FROUND_CUR_DIRECTION))
+
+#define _mm_maskz_range_ss(U, A, B, C)					\
+  ((__m128) __builtin_ia32_rangess128_mask_round ((__v4sf)(__m128)(A),	\
+    (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (),		\
+    (__mmask8)(U), _MM_FROUND_CUR_DIRECTION))
+
+#define _mm_range_round_sd(A, B, C, R)					 \
+  ((__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df)(__m128d)(A), \
+    (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (),		 \
+    (__mmask8) -1, (R)))
+
+#define _mm_mask_range_round_sd(W, U, A, B, C, R)			 \
+  ((__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df)(__m128d)(A), \
+    (__v2df)(__m128d)(B), (int)(C), (__v2df)(__m128d)(W),		 \
+    (__mmask8)(U), (R)))
+
+#define _mm_maskz_range_round_sd(U, A, B, C, R)				 \
+  ((__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df)(__m128d)(A), \
+    (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (),		 \
+    (__mmask8)(U), (R)))
+
+#define _mm_range_round_ss(A, B, C, R)					\
+  ((__m128) __builtin_ia32_rangess128_mask_round ((__v4sf)(__m128)(A),	\
+    (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (),		\
+    (__mmask8) -1, (R)))
+
+#define _mm_mask_range_round_ss(W, U, A, B, C, R)			\
+  ((__m128) __builtin_ia32_rangess128_mask_round ((__v4sf)(__m128)(A),	\
+    (__v4sf)(__m128)(B), (int)(C), (__v4sf)(__m128)(W),			\
+    (__mmask8)(U), (R)))
+
+#define _mm_maskz_range_round_ss(U, A, B, C, R)				\
+  ((__m128) __builtin_ia32_rangess128_mask_round ((__v4sf)(__m128)(A),	\
+    (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (),		\
+    (__mmask8)(U), (R)))
+
+#define _mm_fpclass_ss_mask(X, C)					\
+  ((__mmask8) __builtin_ia32_fpclassss_mask ((__v4sf) (__m128) (X),	\
+					     (int) (C), (__mmask8) (-1)))
+
+#define _mm_fpclass_sd_mask(X, C)					\
+  ((__mmask8) __builtin_ia32_fpclasssd_mask ((__v2df) (__m128d) (X),	\
+					     (int) (C), (__mmask8) (-1)))
+
+#define _mm_mask_fpclass_ss_mask(U, X, C)				\
+  ((__mmask8) __builtin_ia32_fpclassss_mask ((__v4sf) (__m128) (X),	\
+					     (int) (C), (__mmask8) (U)))
+
+#define _mm_mask_fpclass_sd_mask(U, X, C)				\
+  ((__mmask8) __builtin_ia32_fpclasssd_mask ((__v2df) (__m128d) (X),	\
+					     (int) (C), (__mmask8) (U)))
+#define _mm_reduce_sd(A, B, C)						\
+  ((__m128d) __builtin_ia32_reducesd_mask ((__v2df)(__m128d)(A),	\
+    (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (),		\
+    (__mmask8)-1))
+
+#define _mm_mask_reduce_sd(W, U, A, B, C)				\
+  ((__m128d) __builtin_ia32_reducesd_mask ((__v2df)(__m128d)(A),	\
+    (__v2df)(__m128d)(B), (int)(C), (__v2df)(__m128d)(W), (__mmask8)(U)))
+
+#define _mm_maskz_reduce_sd(U, A, B, C)					\
+  ((__m128d) __builtin_ia32_reducesd_mask ((__v2df)(__m128d)(A),	\
+    (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (),		\
+    (__mmask8)(U)))
+
+#define _mm_reduce_round_sd(A, B, C, R)				       \
+  ((__m128d) __builtin_ia32_reducesd_mask_round ((__v2df)(__m128d)(A), \
+    (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (), (__mmask8)-1, (int)(R)))
+
+#define _mm_mask_reduce_round_sd(W, U, A, B, C, R)		       \
+  ((__m128d) __builtin_ia32_reducesd_mask_round ((__v2df)(__m128d)(A), \
+    (__v2df)(__m128d)(B), (int)(C), (__v2df)(__m128d)(W),	       \
+    (__mmask8)(U), (int)(R)))
+
+#define _mm_maskz_reduce_round_sd(U, A, B, C, R)		       \
+  ((__m128d) __builtin_ia32_reducesd_mask_round ((__v2df)(__m128d)(A), \
+    (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (),	       \
+    (__mmask8)(U), (int)(R)))
+
+#define _mm_reduce_ss(A, B, C)						\
+  ((__m128) __builtin_ia32_reducess_mask ((__v4sf)(__m128)(A),		\
+    (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (),		\
+    (__mmask8)-1))
+
+#define _mm_mask_reduce_ss(W, U, A, B, C)				\
+  ((__m128) __builtin_ia32_reducess_mask ((__v4sf)(__m128)(A),		\
+    (__v4sf)(__m128)(B), (int)(C), (__v4sf)(__m128)(W), (__mmask8)(U)))
+
+#define _mm_maskz_reduce_ss(U, A, B, C)					\
+  ((__m128) __builtin_ia32_reducess_mask ((__v4sf)(__m128)(A),		\
+    (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (),		\
+    (__mmask8)(U)))
+
+#define _mm_reduce_round_ss(A, B, C, R)				       \
+  ((__m128) __builtin_ia32_reducess_mask_round ((__v4sf)(__m128)(A),   \
+    (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (), (__mmask8)-1, (int)(R)))
+
+#define _mm_mask_reduce_round_ss(W, U, A, B, C, R)		       \
+  ((__m128) __builtin_ia32_reducess_mask_round ((__v4sf)(__m128)(A),   \
+    (__v4sf)(__m128)(B), (int)(C), (__v4sf)(__m128)(W),		       \
+    (__mmask8)(U), (int)(R)))
+
+#define _mm_maskz_reduce_round_ss(U, A, B, C, R)		       \
+  ((__m128) __builtin_ia32_reducess_mask_round ((__v4sf)(__m128)(A),   \
+    (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (),	       \
+    (__mmask8)(U), (int)(R)))
+
+#endif
+
+#endif /* _AVX512DQAVX10_1INTRIN_H_INCLUDED */
diff --git a/gcc/config/i386/avx512dqintrin.h b/gcc/config/i386/avx512dqintrin.h
index 93900a0b5c7..64321e47131 100644
--- a/gcc/config/i386/avx512dqintrin.h
+++ b/gcc/config/i386/avx512dqintrin.h
@@ -34,156 +34,6 @@
 #define __DISABLE_AVX512DQ__
 #endif /* __AVX512DQ__ */
 
-extern __inline unsigned char
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_ktest_mask8_u8  (__mmask8 __A,  __mmask8 __B, unsigned char *__CF)
-{
-  *__CF = (unsigned char) __builtin_ia32_ktestcqi (__A, __B);
-  return (unsigned char) __builtin_ia32_ktestzqi (__A, __B);
-}
-
-extern __inline unsigned char
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_ktestz_mask8_u8 (__mmask8 __A, __mmask8 __B)
-{
-  return (unsigned char) __builtin_ia32_ktestzqi (__A, __B);
-}
-
-extern __inline unsigned char
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_ktestc_mask8_u8 (__mmask8 __A, __mmask8 __B)
-{
-  return (unsigned char) __builtin_ia32_ktestcqi (__A, __B);
-}
-
-extern __inline unsigned char
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_ktest_mask16_u8  (__mmask16 __A,  __mmask16 __B, unsigned char *__CF)
-{
-  *__CF = (unsigned char) __builtin_ia32_ktestchi (__A, __B);
-  return (unsigned char) __builtin_ia32_ktestzhi (__A, __B);
-}
-
-extern __inline unsigned char
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_ktestz_mask16_u8 (__mmask16 __A, __mmask16 __B)
-{
-  return (unsigned char) __builtin_ia32_ktestzhi (__A, __B);
-}
-
-extern __inline unsigned char
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_ktestc_mask16_u8 (__mmask16 __A, __mmask16 __B)
-{
-  return (unsigned char) __builtin_ia32_ktestchi (__A, __B);
-}
-
-extern __inline unsigned char
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_kortest_mask8_u8  (__mmask8 __A,  __mmask8 __B, unsigned char *__CF)
-{
-  *__CF = (unsigned char) __builtin_ia32_kortestcqi (__A, __B);
-  return (unsigned char) __builtin_ia32_kortestzqi (__A, __B);
-}
-
-extern __inline unsigned char
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_kortestz_mask8_u8 (__mmask8 __A, __mmask8 __B)
-{
-  return (unsigned char) __builtin_ia32_kortestzqi (__A, __B);
-}
-
-extern __inline unsigned char
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_kortestc_mask8_u8 (__mmask8 __A, __mmask8 __B)
-{
-  return (unsigned char) __builtin_ia32_kortestcqi (__A, __B);
-}
-
-extern __inline __mmask8
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_kadd_mask8 (__mmask8 __A, __mmask8 __B)
-{
-  return (__mmask8) __builtin_ia32_kaddqi ((__mmask8) __A, (__mmask8) __B);
-}
-
-extern __inline __mmask16
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_kadd_mask16 (__mmask16 __A, __mmask16 __B)
-{
-  return (__mmask16) __builtin_ia32_kaddhi ((__mmask16) __A, (__mmask16) __B);
-}
-
-extern __inline unsigned int
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_cvtmask8_u32 (__mmask8 __A)
-{
-  return (unsigned int) __builtin_ia32_kmovb ((__mmask8 ) __A);
-}
-	
-extern __inline __mmask8
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_cvtu32_mask8 (unsigned int __A)
-{
-  return (__mmask8) __builtin_ia32_kmovb ((__mmask8) __A);
-}
-
-extern __inline __mmask8
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_load_mask8 (__mmask8 *__A)
-{
-  return (__mmask8) __builtin_ia32_kmovb (*(__mmask8 *) __A);
-}
-
-extern __inline void
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_store_mask8 (__mmask8 *__A, __mmask8 __B)
-{
-  *(__mmask8 *) __A = __builtin_ia32_kmovb (__B);
-}
-
-extern __inline __mmask8
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_knot_mask8 (__mmask8 __A)
-{
-  return (__mmask8) __builtin_ia32_knotqi ((__mmask8) __A);
-}
-
-extern __inline __mmask8
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_kor_mask8 (__mmask8 __A, __mmask8 __B)
-{
-  return (__mmask8) __builtin_ia32_korqi ((__mmask8) __A, (__mmask8) __B);
-}
-
-extern __inline __mmask8
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_kxnor_mask8 (__mmask8 __A, __mmask8 __B)
-{
-  return (__mmask8) __builtin_ia32_kxnorqi ((__mmask8) __A, (__mmask8) __B);
-}
-
-extern __inline __mmask8
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_kxor_mask8 (__mmask8 __A, __mmask8 __B)
-{
-  return (__mmask8) __builtin_ia32_kxorqi ((__mmask8) __A, (__mmask8) __B);
-}
-
-extern __inline __mmask8
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_kand_mask8 (__mmask8 __A, __mmask8 __B)
-{
-  return (__mmask8) __builtin_ia32_kandqi ((__mmask8) __A, (__mmask8) __B);
-}
-
-extern __inline __mmask8
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_kandn_mask8 (__mmask8 __A, __mmask8 __B)
-{
-  return (__mmask8) __builtin_ia32_kandnqi ((__mmask8) __A, (__mmask8) __B);
-}
-
 extern __inline __m512d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_broadcast_f64x2 (__m128d __A)
@@ -1070,20 +920,6 @@ _mm512_maskz_cvtepu64_pd (__mmask8 __U, __m512i __A)
 }
 
 #ifdef __OPTIMIZE__
-extern __inline __mmask8
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_kshiftli_mask8 (__mmask8 __A, unsigned int __B)
-{
-  return (__mmask8) __builtin_ia32_kshiftliqi ((__mmask8) __A, (__mmask8) __B);
-}
-
-extern __inline __mmask8
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_kshiftri_mask8 (__mmask8 __A, unsigned int __B)
-{
-  return (__mmask8) __builtin_ia32_kshiftriqi ((__mmask8) __A, (__mmask8) __B);
-}
-
 extern __inline __m512d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_range_pd (__m512d __A, __m512d __B, int __C)
@@ -1156,305 +992,6 @@ _mm512_maskz_range_ps (__mmask16 __U, __m512 __A, __m512 __B, int __C)
 						  _MM_FROUND_CUR_DIRECTION);
 }
 
-extern __inline __m128d
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_reduce_sd (__m128d __A, __m128d __B, int __C)
-{
-  return (__m128d) __builtin_ia32_reducesd_mask ((__v2df) __A,
-						 (__v2df) __B, __C,
-						 (__v2df) _mm_setzero_pd (),
-						 (__mmask8) -1);
-}
-
-extern __inline __m128d
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_reduce_round_sd (__m128d __A, __m128d __B, int __C, const int __R)
-{
-  return (__m128d) __builtin_ia32_reducesd_mask_round ((__v2df) __A,
-						       (__v2df) __B, __C,
-						       (__v2df)
-						       _mm_setzero_pd (),
-						       (__mmask8) -1, __R);
-}
-
-extern __inline __m128d
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_mask_reduce_sd (__m128d __W,  __mmask8 __U, __m128d __A,
-		    __m128d __B, int __C)
-{
-  return (__m128d) __builtin_ia32_reducesd_mask ((__v2df) __A,
-						 (__v2df) __B, __C,
-						 (__v2df) __W,
-						 (__mmask8) __U);
-}
-
-extern __inline __m128d
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_mask_reduce_round_sd (__m128d __W,  __mmask8 __U, __m128d __A,
-			  __m128d __B, int __C, const int __R)
-{
-  return (__m128d) __builtin_ia32_reducesd_mask_round ((__v2df) __A,
-						       (__v2df) __B, __C,
-						       (__v2df) __W,
-						       __U, __R);
-}
-
-extern __inline __m128d
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_maskz_reduce_sd (__mmask8 __U, __m128d __A, __m128d __B, int __C)
-{
-  return (__m128d) __builtin_ia32_reducesd_mask ((__v2df) __A,
-						 (__v2df) __B, __C,
-						 (__v2df) _mm_setzero_pd (),
-						 (__mmask8) __U);
-}
-
-extern __inline __m128d
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_maskz_reduce_round_sd (__mmask8 __U, __m128d __A, __m128d __B,
-			   int __C, const int __R)
-{
-  return (__m128d) __builtin_ia32_reducesd_mask_round ((__v2df) __A,
-						       (__v2df) __B, __C,
-						       (__v2df)
-						       _mm_setzero_pd (),
-						       __U, __R);
-}
-
-extern __inline __m128
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_reduce_ss (__m128 __A, __m128 __B, int __C)
-{
-  return (__m128) __builtin_ia32_reducess_mask ((__v4sf) __A,
-						(__v4sf) __B, __C,
-						(__v4sf) _mm_setzero_ps (),
-						(__mmask8) -1);
-}
-
-extern __inline __m128
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_reduce_round_ss (__m128 __A, __m128 __B, int __C, const int __R)
-{
-  return (__m128) __builtin_ia32_reducess_mask_round ((__v4sf) __A,
-						      (__v4sf) __B, __C,
-						      (__v4sf)
-						      _mm_setzero_ps (),
-						      (__mmask8) -1, __R);
-}
-
-extern __inline __m128
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_mask_reduce_ss (__m128 __W,  __mmask8 __U, __m128 __A,
-		    __m128 __B, int __C)
-{
-  return (__m128) __builtin_ia32_reducess_mask ((__v4sf) __A,
-						(__v4sf) __B, __C,
-						(__v4sf) __W,
-						(__mmask8) __U);
-}
-
-extern __inline __m128
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_mask_reduce_round_ss (__m128 __W,  __mmask8 __U, __m128 __A,
-			  __m128 __B, int __C, const int __R)
-{
-  return (__m128) __builtin_ia32_reducess_mask_round ((__v4sf) __A,
-						      (__v4sf) __B, __C,
-						      (__v4sf) __W,
-						      __U, __R);
-}
-
-extern __inline __m128
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_maskz_reduce_ss (__mmask8 __U, __m128 __A, __m128 __B, int __C)
-{
-  return (__m128) __builtin_ia32_reducess_mask ((__v4sf) __A,
-						(__v4sf) __B, __C,
-						(__v4sf) _mm_setzero_ps (),
-						(__mmask8) __U);
-}
-
-extern __inline __m128
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_maskz_reduce_round_ss (__mmask8 __U, __m128 __A, __m128 __B,
-			   int __C, const int __R)
-{
-  return (__m128) __builtin_ia32_reducess_mask_round ((__v4sf) __A,
-						      (__v4sf) __B, __C,
-						      (__v4sf)
-						      _mm_setzero_ps (),
-						      __U, __R);
-}
-
-extern __inline __m128d
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_range_sd (__m128d __A, __m128d __B, int __C)
-{
-  return (__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df) __A,
-						   (__v2df) __B, __C,
-						   (__v2df)
-						   _mm_setzero_pd (),
-						   (__mmask8) -1,
-						   _MM_FROUND_CUR_DIRECTION);
-}
-
-extern __inline __m128d
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_mask_range_sd (__m128d __W, __mmask8 __U, __m128d __A, __m128d __B, int __C)
-{
-  return (__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df) __A,
-						   (__v2df) __B, __C,
-						   (__v2df) __W,
-						   (__mmask8) __U,
-						   _MM_FROUND_CUR_DIRECTION);
-}
-
-extern __inline __m128d
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_maskz_range_sd (__mmask8 __U, __m128d __A, __m128d __B, int __C)
-{
-  return (__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df) __A,
-						   (__v2df) __B, __C,
-						   (__v2df)
-						   _mm_setzero_pd (),
-						   (__mmask8) __U,
-						   _MM_FROUND_CUR_DIRECTION);
-}
-
-extern __inline __m128
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_range_ss (__m128 __A, __m128 __B, int __C)
-{
-  return (__m128) __builtin_ia32_rangess128_mask_round ((__v4sf) __A,
-						  (__v4sf) __B, __C,
-						  (__v4sf)
-						  _mm_setzero_ps (),
-						  (__mmask8) -1,
-						  _MM_FROUND_CUR_DIRECTION);
-}
-
-extern __inline __m128
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_mask_range_ss (__m128 __W, __mmask8 __U, __m128 __A, __m128 __B, int __C)
-{
-  return (__m128) __builtin_ia32_rangess128_mask_round ((__v4sf) __A,
-						  (__v4sf) __B, __C,
-						  (__v4sf) __W,
-						  (__mmask8) __U,
-						  _MM_FROUND_CUR_DIRECTION);
-}
-
-
-extern __inline __m128
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_maskz_range_ss (__mmask8 __U, __m128 __A, __m128 __B, int __C)
-{
-  return (__m128) __builtin_ia32_rangess128_mask_round ((__v4sf) __A,
-						  (__v4sf) __B, __C,
-						  (__v4sf)
-						  _mm_setzero_ps (),
-						  (__mmask8) __U,
-						  _MM_FROUND_CUR_DIRECTION);
-}
-
-extern __inline __m128d
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_range_round_sd (__m128d __A, __m128d __B, int __C, const int __R)
-{
-  return (__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df) __A,
-						   (__v2df) __B, __C,
-						   (__v2df)
-						   _mm_setzero_pd (),
-						   (__mmask8) -1, __R);
-}
-
-extern __inline __m128d
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_mask_range_round_sd (__m128d __W, __mmask8 __U, __m128d __A, __m128d __B,
-			 int __C, const int __R)
-{
-  return (__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df) __A,
-						   (__v2df) __B, __C,
-						   (__v2df) __W,
-						   (__mmask8) __U, __R);
-}
-
-extern __inline __m128d
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_maskz_range_round_sd (__mmask8 __U, __m128d __A, __m128d __B, int __C,
-			  const int __R)
-{
-  return (__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df) __A,
-						   (__v2df) __B, __C,
-						   (__v2df)
-						   _mm_setzero_pd (),
-						   (__mmask8) __U, __R);
-}
-
-extern __inline __m128
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_range_round_ss (__m128 __A, __m128 __B, int __C, const int __R)
-{
-  return (__m128) __builtin_ia32_rangess128_mask_round ((__v4sf) __A,
-						  (__v4sf) __B, __C,
-						  (__v4sf)
-						  _mm_setzero_ps (),
-						  (__mmask8) -1, __R);
-}
-
-extern __inline __m128
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_mask_range_round_ss (__m128 __W, __mmask8 __U, __m128 __A, __m128 __B,
-			 int __C, const int __R)
-{
-  return (__m128) __builtin_ia32_rangess128_mask_round ((__v4sf) __A,
-						  (__v4sf) __B, __C,
-						  (__v4sf) __W,
-						  (__mmask8) __U, __R);
-}
-
-extern __inline __m128
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_maskz_range_round_ss (__mmask8 __U, __m128 __A, __m128 __B, int __C,
-			  const int __R)
-{
-  return (__m128) __builtin_ia32_rangess128_mask_round ((__v4sf) __A,
-						  (__v4sf) __B, __C,
-						  (__v4sf)
-						  _mm_setzero_ps (),
-						  (__mmask8) __U, __R);
-}
-
-extern __inline __mmask8
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_fpclass_ss_mask (__m128 __A, const int __imm)
-{
-  return (__mmask8) __builtin_ia32_fpclassss_mask ((__v4sf) __A, __imm,
-						   (__mmask8) -1);
-}
-
-extern __inline __mmask8
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_fpclass_sd_mask (__m128d __A, const int __imm)
-{
-  return (__mmask8) __builtin_ia32_fpclasssd_mask ((__v2df) __A, __imm,
-						   (__mmask8) -1);
-}
-
-extern __inline __mmask8
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_mask_fpclass_ss_mask (__mmask8 __U, __m128 __A, const int __imm)
-{
-  return (__mmask8) __builtin_ia32_fpclassss_mask ((__v4sf) __A, __imm, __U);
-}
-
-extern __inline __mmask8
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_mask_fpclass_sd_mask (__mmask8 __U, __m128d __A, const int __imm)
-{
-  return (__mmask8) __builtin_ia32_fpclasssd_mask ((__v2df) __A, __imm, __U);
-}
-
 extern __inline __m512i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_cvtt_roundpd_epi64 (__m512d __A, const int __R)
@@ -2395,72 +1932,6 @@ _mm512_fpclass_ps_mask (__m512 __A, const int __imm)
 }
 
 #else
-#define _kshiftli_mask8(X, Y)						\
-  ((__mmask8) __builtin_ia32_kshiftliqi ((__mmask8)(X), (__mmask8)(Y)))
-
-#define _kshiftri_mask8(X, Y)						\
-  ((__mmask8) __builtin_ia32_kshiftriqi ((__mmask8)(X), (__mmask8)(Y)))
-
-#define _mm_range_sd(A, B, C)						 \
-  ((__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df)(__m128d)(A), \
-    (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (), 	 \
-    (__mmask8) -1, _MM_FROUND_CUR_DIRECTION))
-
-#define _mm_mask_range_sd(W, U, A, B, C)				 \
-  ((__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df)(__m128d)(A), \
-    (__v2df)(__m128d)(B), (int)(C), (__v2df)(__m128d)(W), 		 \
-    (__mmask8)(U), _MM_FROUND_CUR_DIRECTION))
-
-#define _mm_maskz_range_sd(U, A, B, C)					 \
-  ((__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df)(__m128d)(A), \
-    (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (), 	 \
-    (__mmask8)(U), _MM_FROUND_CUR_DIRECTION))
-
-#define _mm_range_ss(A, B, C)						\
-  ((__m128) __builtin_ia32_rangess128_mask_round ((__v4sf)(__m128)(A),	\
-    (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (),		\
-    (__mmask8) -1, _MM_FROUND_CUR_DIRECTION))
-
-#define _mm_mask_range_ss(W, U, A, B, C)				\
-  ((__m128) __builtin_ia32_rangess128_mask_round ((__v4sf)(__m128)(A),	\
-    (__v4sf)(__m128)(B), (int)(C), (__v4sf)(__m128)(W),			\
-    (__mmask8)(U), _MM_FROUND_CUR_DIRECTION))
-
-#define _mm_maskz_range_ss(U, A, B, C)					\
-  ((__m128) __builtin_ia32_rangess128_mask_round ((__v4sf)(__m128)(A),	\
-    (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (),		\
-    (__mmask8)(U), _MM_FROUND_CUR_DIRECTION))
-
-#define _mm_range_round_sd(A, B, C, R)					 \
-  ((__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df)(__m128d)(A), \
-    (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (),		 \
-    (__mmask8) -1, (R)))
-
-#define _mm_mask_range_round_sd(W, U, A, B, C, R)			 \
-  ((__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df)(__m128d)(A), \
-    (__v2df)(__m128d)(B), (int)(C), (__v2df)(__m128d)(W),		 \
-    (__mmask8)(U), (R)))
-
-#define _mm_maskz_range_round_sd(U, A, B, C, R)				 \
-  ((__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df)(__m128d)(A), \
-    (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (),		 \
-    (__mmask8)(U), (R)))
-
-#define _mm_range_round_ss(A, B, C, R)					\
-  ((__m128) __builtin_ia32_rangess128_mask_round ((__v4sf)(__m128)(A),	\
-    (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (),		\
-    (__mmask8) -1, (R)))
-
-#define _mm_mask_range_round_ss(W, U, A, B, C, R)			\
-  ((__m128) __builtin_ia32_rangess128_mask_round ((__v4sf)(__m128)(A),	\
-    (__v4sf)(__m128)(B), (int)(C), (__v4sf)(__m128)(W),			\
-    (__mmask8)(U), (R)))
-
-#define _mm_maskz_range_round_ss(U, A, B, C, R)				\
-  ((__m128) __builtin_ia32_rangess128_mask_round ((__v4sf)(__m128)(A),	\
-    (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (),		\
-    (__mmask8)(U), (R)))
-
 #define _mm512_cvtt_roundpd_epi64(A, B)		    \
   ((__m512i)__builtin_ia32_cvttpd2qq512_mask ((A), (__v8di)		\
 					      _mm512_setzero_si512 (),	\
@@ -2792,22 +2263,6 @@ _mm512_fpclass_ps_mask (__m512 __A, const int __imm)
     (__v16si)(__m512i)_mm512_setzero_si512 (),\
     (__mmask16)(U)))
 
-#define _mm_fpclass_ss_mask(X, C)					\
-  ((__mmask8) __builtin_ia32_fpclassss_mask ((__v4sf) (__m128) (X),	\
-					     (int) (C), (__mmask8) (-1))) \
-
-#define _mm_fpclass_sd_mask(X, C)					\
-  ((__mmask8) __builtin_ia32_fpclasssd_mask ((__v2df) (__m128d) (X),	\
-					     (int) (C), (__mmask8) (-1))) \
-
-#define _mm_mask_fpclass_ss_mask(X, C, U)				\
-  ((__mmask8) __builtin_ia32_fpclassss_mask ((__v4sf) (__m128) (X),	\
-					     (int) (C), (__mmask8) (U)))
-
-#define _mm_mask_fpclass_sd_mask(X, C, U)				\
-  ((__mmask8) __builtin_ia32_fpclasssd_mask ((__v2df) (__m128d) (X),	\
-					     (int) (C), (__mmask8) (U)))
-
 #define _mm512_mask_fpclass_pd_mask(u, X, C)                            \
   ((__mmask8) __builtin_ia32_fpclasspd512_mask ((__v8df) (__m512d) (X), \
 						(int) (C), (__mmask8)(u)))
@@ -2824,63 +2279,6 @@ _mm512_fpclass_ps_mask (__m512 __A, const int __imm)
   ((__mmask16) __builtin_ia32_fpclassps512_mask ((__v16sf) (__m512) (x),\
 						 (int) (c),(__mmask16)-1))
 
-#define _mm_reduce_sd(A, B, C)						\
-  ((__m128d) __builtin_ia32_reducesd_mask ((__v2df)(__m128d)(A),	\
-    (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (),		\
-    (__mmask8)-1))
-
-#define _mm_mask_reduce_sd(W, U, A, B, C)				\
-  ((__m128d) __builtin_ia32_reducesd_mask ((__v2df)(__m128d)(A),	\
-    (__v2df)(__m128d)(B), (int)(C), (__v2df)(__m128d)(W), (__mmask8)(U)))
-
-#define _mm_maskz_reduce_sd(U, A, B, C)					\
-  ((__m128d) __builtin_ia32_reducesd_mask ((__v2df)(__m128d)(A),	\
-    (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (),		\
-    (__mmask8)(U)))
-
-#define _mm_reduce_round_sd(A, B, C, R)				       \
-  ((__m128d) __builtin_ia32_reducesd_round ((__v2df)(__m128d)(A),      \
-    (__v2df)(__m128d)(B), (int)(C), (__mmask8)(U), (int)(R)))
-
-#define _mm_mask_reduce_round_sd(W, U, A, B, C, R)		       \
-  ((__m128d) __builtin_ia32_reducesd_mask_round ((__v2df)(__m128d)(A), \
-    (__v2df)(__m128d)(B), (int)(C), (__v2df)(__m128d)(W),	       \
-    (__mmask8)(U), (int)(R)))
-
-#define _mm_maskz_reduce_round_sd(U, A, B, C, R)		       \
-  ((__m128d) __builtin_ia32_reducesd_mask_round ((__v2df)(__m128d)(A), \
-    (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (),	       \
-    (__mmask8)(U), (int)(R)))
-
-#define _mm_reduce_ss(A, B, C)						\
-  ((__m128) __builtin_ia32_reducess_mask ((__v4sf)(__m128)(A),		\
-    (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (),		\
-    (__mmask8)-1))
-
-#define _mm_mask_reduce_ss(W, U, A, B, C)				\
-  ((__m128) __builtin_ia32_reducess_mask ((__v4sf)(__m128)(A),		\
-    (__v4sf)(__m128)(B), (int)(C), (__v4sf)(__m128)(W), (__mmask8)(U)))
-
-#define _mm_maskz_reduce_ss(U, A, B, C)					\
-  ((__m128) __builtin_ia32_reducess_mask ((__v4sf)(__m128)(A),		\
-    (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (),		\
-    (__mmask8)(U)))
-
-#define _mm_reduce_round_ss(A, B, C, R)				       \
-  ((__m128) __builtin_ia32_reducess_round ((__v4sf)(__m128)(A),	       \
-    (__v4sf)(__m128)(B), (int)(C), (__mmask8)(U), (int)(R)))
-
-#define _mm_mask_reduce_round_ss(W, U, A, B, C, R)		       \
-  ((__m128) __builtin_ia32_reducess_mask_round ((__v4sf)(__m128)(A),   \
-    (__v4sf)(__m128)(B), (int)(C), (__v4sf)(__m128)(W),		       \
-    (__mmask8)(U), (int)(R)))
-
-#define _mm_maskz_reduce_round_ss(U, A, B, C, R)		       \
-  ((__m128) __builtin_ia32_reducesd_mask_round ((__v4sf)(__m128)(A),   \
-    (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (),	       \
-    (__mmask8)(U), (int)(R)))
-
-
 #endif
 
 #ifdef __DISABLE_AVX512DQ__
diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
index 34768552e78..7bbe9b2bb01 100644
--- a/gcc/config/i386/i386-builtin.def
+++ b/gcc/config/i386/i386-builtin.def
@@ -1587,40 +1587,40 @@ BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_roundpd_vec_pack_sfix512, "_
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_roundpd_vec_pack_sfix512, "__builtin_ia32_ceilpd_vec_pack_sfix512", IX86_BUILTIN_CEILPD_VEC_PACK_SFIX512, (enum rtx_code) ROUND_CEIL, (int) V16SI_FTYPE_V8DF_V8DF_ROUND)
 
 /* Mask arithmetic operations */
-BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_kashiftqi, "__builtin_ia32_kshiftliqi", IX86_BUILTIN_KSHIFTLI8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI_CONST)
+BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_kashiftqi, "__builtin_ia32_kshiftliqi", IX86_BUILTIN_KSHIFTLI8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI_CONST)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_kashifthi, "__builtin_ia32_kshiftlihi", IX86_BUILTIN_KSHIFTLI16, UNKNOWN, (int) UHI_FTYPE_UHI_UQI)
 BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kashiftsi, "__builtin_ia32_kshiftlisi", IX86_BUILTIN_KSHIFTLI32, UNKNOWN, (int) USI_FTYPE_USI_UQI)
 BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kashiftdi, "__builtin_ia32_kshiftlidi", IX86_BUILTIN_KSHIFTLI64, UNKNOWN, (int) UDI_FTYPE_UDI_UQI)
-BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_klshiftrtqi, "__builtin_ia32_kshiftriqi", IX86_BUILTIN_KSHIFTRI8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI_CONST)
+BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_klshiftrtqi, "__builtin_ia32_kshiftriqi", IX86_BUILTIN_KSHIFTRI8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI_CONST)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_klshiftrthi, "__builtin_ia32_kshiftrihi", IX86_BUILTIN_KSHIFTRI16, UNKNOWN, (int) UHI_FTYPE_UHI_UQI)
 BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_klshiftrtsi, "__builtin_ia32_kshiftrisi", IX86_BUILTIN_KSHIFTRI32, UNKNOWN, (int) USI_FTYPE_USI_UQI)
 BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_klshiftrtdi, "__builtin_ia32_kshiftridi", IX86_BUILTIN_KSHIFTRI64, UNKNOWN, (int) UDI_FTYPE_UDI_UQI)
-BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_kandqi, "__builtin_ia32_kandqi", IX86_BUILTIN_KAND8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI)
+BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_kandqi, "__builtin_ia32_kandqi", IX86_BUILTIN_KAND8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_kandhi, "__builtin_ia32_kandhi", IX86_BUILTIN_KAND16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI)
 BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kandsi, "__builtin_ia32_kandsi", IX86_BUILTIN_KAND32, UNKNOWN, (int) USI_FTYPE_USI_USI)
 BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kanddi, "__builtin_ia32_kanddi", IX86_BUILTIN_KAND64, UNKNOWN, (int) UDI_FTYPE_UDI_UDI)
-BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_kandnqi, "__builtin_ia32_kandnqi", IX86_BUILTIN_KANDN8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI)
+BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_kandnqi, "__builtin_ia32_kandnqi", IX86_BUILTIN_KANDN8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_kandnhi, "__builtin_ia32_kandnhi", IX86_BUILTIN_KANDN16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI)
 BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kandnsi, "__builtin_ia32_kandnsi", IX86_BUILTIN_KANDN32, UNKNOWN, (int) USI_FTYPE_USI_USI)
 BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kandndi, "__builtin_ia32_kandndi", IX86_BUILTIN_KANDN64, UNKNOWN, (int) UDI_FTYPE_UDI_UDI)
-BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_knotqi, "__builtin_ia32_knotqi", IX86_BUILTIN_KNOT8, UNKNOWN, (int) UQI_FTYPE_UQI)
+BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_knotqi, "__builtin_ia32_knotqi", IX86_BUILTIN_KNOT8, UNKNOWN, (int) UQI_FTYPE_UQI)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_knothi, "__builtin_ia32_knothi", IX86_BUILTIN_KNOT16, UNKNOWN, (int) UHI_FTYPE_UHI)
 BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_knotsi, "__builtin_ia32_knotsi", IX86_BUILTIN_KNOT32, UNKNOWN, (int) USI_FTYPE_USI)
 BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_knotdi, "__builtin_ia32_knotdi", IX86_BUILTIN_KNOT64, UNKNOWN, (int) UDI_FTYPE_UDI)
-BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_kiorqi, "__builtin_ia32_korqi", IX86_BUILTIN_KOR8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI)
+BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_kiorqi, "__builtin_ia32_korqi", IX86_BUILTIN_KOR8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_kiorhi, "__builtin_ia32_korhi", IX86_BUILTIN_KOR16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI)
 BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kiorsi, "__builtin_ia32_korsi", IX86_BUILTIN_KOR32, UNKNOWN, (int) USI_FTYPE_USI_USI)
 BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kiordi, "__builtin_ia32_kordi", IX86_BUILTIN_KOR64, UNKNOWN, (int) UDI_FTYPE_UDI_UDI)
-BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_ktestqi, "__builtin_ia32_ktestcqi", IX86_BUILTIN_KTESTC8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI)
-BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_ktestqi, "__builtin_ia32_ktestzqi", IX86_BUILTIN_KTESTZ8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI)
-BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_ktesthi, "__builtin_ia32_ktestchi", IX86_BUILTIN_KTESTC16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI)
-BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_ktesthi, "__builtin_ia32_ktestzhi", IX86_BUILTIN_KTESTZ16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI)
+BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_ktestqi, "__builtin_ia32_ktestcqi", IX86_BUILTIN_KTESTC8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI)
+BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_ktestqi, "__builtin_ia32_ktestzqi", IX86_BUILTIN_KTESTZ8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI)
+BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_ktesthi, "__builtin_ia32_ktestchi", IX86_BUILTIN_KTESTC16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI)
+BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_ktesthi, "__builtin_ia32_ktestzhi", IX86_BUILTIN_KTESTZ16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI)
 BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_ktestsi, "__builtin_ia32_ktestcsi", IX86_BUILTIN_KTESTC32, UNKNOWN, (int) USI_FTYPE_USI_USI)
 BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_ktestsi, "__builtin_ia32_ktestzsi", IX86_BUILTIN_KTESTZ32, UNKNOWN, (int) USI_FTYPE_USI_USI)
 BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_ktestdi, "__builtin_ia32_ktestcdi", IX86_BUILTIN_KTESTC64, UNKNOWN, (int) UDI_FTYPE_UDI_UDI)
 BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_ktestdi, "__builtin_ia32_ktestzdi", IX86_BUILTIN_KTESTZ64, UNKNOWN, (int) UDI_FTYPE_UDI_UDI)
-BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_kortestqi, "__builtin_ia32_kortestcqi", IX86_BUILTIN_KORTESTC8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI)
-BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_kortestqi, "__builtin_ia32_kortestzqi", IX86_BUILTIN_KORTESTZ8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI)
+BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_kortestqi, "__builtin_ia32_kortestcqi", IX86_BUILTIN_KORTESTC8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI)
+BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_kortestqi, "__builtin_ia32_kortestzqi", IX86_BUILTIN_KORTESTZ8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_kortesthi, "__builtin_ia32_kortestchi", IX86_BUILTIN_KORTESTC16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_kortesthi, "__builtin_ia32_kortestzhi", IX86_BUILTIN_KORTESTZ16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI)
 BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kortestsi, "__builtin_ia32_kortestcsi", IX86_BUILTIN_KORTESTC32, UNKNOWN, (int) USI_FTYPE_USI_USI)
@@ -1629,20 +1629,20 @@ BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kortestdi, "__builtin_ia32_kortestc
 BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kortestdi, "__builtin_ia32_kortestzdi", IX86_BUILTIN_KORTESTZ64, UNKNOWN, (int) UDI_FTYPE_UDI_UDI)
 
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_kunpckhi, "__builtin_ia32_kunpckhi", IX86_BUILTIN_KUNPCKBW, UNKNOWN, (int) UHI_FTYPE_UHI_UHI)
-BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_kxnorqi, "__builtin_ia32_kxnorqi", IX86_BUILTIN_KXNOR8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI)
+BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_kxnorqi, "__builtin_ia32_kxnorqi", IX86_BUILTIN_KXNOR8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_kxnorhi, "__builtin_ia32_kxnorhi", IX86_BUILTIN_KXNOR16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI)
 BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kxnorsi, "__builtin_ia32_kxnorsi", IX86_BUILTIN_KXNOR32, UNKNOWN, (int) USI_FTYPE_USI_USI)
 BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kxnordi, "__builtin_ia32_kxnordi", IX86_BUILTIN_KXNOR64, UNKNOWN, (int) UDI_FTYPE_UDI_UDI)
-BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_kxorqi, "__builtin_ia32_kxorqi", IX86_BUILTIN_KXOR8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI)
+BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_kxorqi, "__builtin_ia32_kxorqi", IX86_BUILTIN_KXOR8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_kxorhi, "__builtin_ia32_kxorhi", IX86_BUILTIN_KXOR16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI)
 BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kxorsi, "__builtin_ia32_kxorsi", IX86_BUILTIN_KXOR32, UNKNOWN, (int) USI_FTYPE_USI_USI)
 BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kxordi, "__builtin_ia32_kxordi", IX86_BUILTIN_KXOR64, UNKNOWN, (int) UDI_FTYPE_UDI_UDI)
-BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_kmovb, "__builtin_ia32_kmovb", IX86_BUILTIN_KMOV8, UNKNOWN, (int) UQI_FTYPE_UQI)
+BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_kmovb, "__builtin_ia32_kmovb", IX86_BUILTIN_KMOV8, UNKNOWN, (int) UQI_FTYPE_UQI)
 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_kmovw, "__builtin_ia32_kmovw", IX86_BUILTIN_KMOV16, UNKNOWN, (int) UHI_FTYPE_UHI)
 BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kmovd, "__builtin_ia32_kmovd", IX86_BUILTIN_KMOV32, UNKNOWN, (int) USI_FTYPE_USI)
 BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kmovq, "__builtin_ia32_kmovq", IX86_BUILTIN_KMOV64, UNKNOWN, (int) UDI_FTYPE_UDI)
-BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_kaddqi, "__builtin_ia32_kaddqi", IX86_BUILTIN_KADD8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI)
-BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_kaddhi, "__builtin_ia32_kaddhi", IX86_BUILTIN_KADD16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI)
+BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_kaddqi, "__builtin_ia32_kaddqi", IX86_BUILTIN_KADD8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI)
+BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_kaddhi, "__builtin_ia32_kaddhi", IX86_BUILTIN_KADD16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI)
 BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kaddsi, "__builtin_ia32_kaddsi", IX86_BUILTIN_KADD32, UNKNOWN, (int) USI_FTYPE_USI_USI)
 BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kadddi, "__builtin_ia32_kadddi", IX86_BUILTIN_KADD64, UNKNOWN, (int) UDI_FTYPE_UDI_UDI)
 
@@ -1814,8 +1814,8 @@ BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX
 BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_reducepv2df_mask, "__builtin_ia32_reducepd128_mask", IX86_BUILTIN_REDUCEPD128_MASK, UNKNOWN, (int) V2DF_FTYPE_V2DF_INT_V2DF_UQI)
 BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_reducepv8sf_mask, "__builtin_ia32_reduceps256_mask", IX86_BUILTIN_REDUCEPS256_MASK, UNKNOWN, (int) V8SF_FTYPE_V8SF_INT_V8SF_UQI)
 BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_reducepv4sf_mask, "__builtin_ia32_reduceps128_mask", IX86_BUILTIN_REDUCEPS128_MASK, UNKNOWN, (int) V4SF_FTYPE_V4SF_INT_V4SF_UQI)
-BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_reducesv2df_mask, "__builtin_ia32_reducesd_mask", IX86_BUILTIN_REDUCESD128_MASK, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT_V2DF_UQI)
-BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_reducesv4sf_mask, "__builtin_ia32_reducess_mask", IX86_BUILTIN_REDUCESS128_MASK, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT_V4SF_UQI)
+BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_reducesv2df_mask, "__builtin_ia32_reducesd_mask", IX86_BUILTIN_REDUCESD128_MASK, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT_V2DF_UQI)
+BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_reducesv4sf_mask, "__builtin_ia32_reducess_mask", IX86_BUILTIN_REDUCESS128_MASK, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT_V4SF_UQI)
 BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_permvarv16hi_mask, "__builtin_ia32_permvarhi256_mask", IX86_BUILTIN_VPERMVARHI256_MASK, UNKNOWN, (int) V16HI_FTYPE_V16HI_V16HI_V16HI_UHI)
 BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_permvarv8hi_mask, "__builtin_ia32_permvarhi128_mask", IX86_BUILTIN_VPERMVARHI128_MASK, UNKNOWN, (int) V8HI_FTYPE_V8HI_V8HI_V8HI_UQI)
 BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_vpermt2varv16hi3_mask, "__builtin_ia32_vpermt2varhi256_mask", IX86_BUILTIN_VPERMT2VARHI256, UNKNOWN, (int) V16HI_FTYPE_V16HI_V16HI_V16HI_UHI)
@@ -2186,10 +2186,10 @@ BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_rorv4si_mask, "__builtin_i
 BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_rolv4si_mask, "__builtin_ia32_prold128_mask", IX86_BUILTIN_PROLD128, UNKNOWN, (int) V4SI_FTYPE_V4SI_INT_V4SI_UQI)
 BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_fpclassv4df_mask, "__builtin_ia32_fpclasspd256_mask", IX86_BUILTIN_FPCLASSPD256, UNKNOWN, (int) QI_FTYPE_V4DF_INT_UQI)
 BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_fpclassv2df_mask, "__builtin_ia32_fpclasspd128_mask", IX86_BUILTIN_FPCLASSPD128, UNKNOWN, (int) QI_FTYPE_V2DF_INT_UQI)
-BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_avx512dq_vmfpclassv2df_mask, "__builtin_ia32_fpclasssd_mask", IX86_BUILTIN_FPCLASSSD_MASK, UNKNOWN, (int) QI_FTYPE_V2DF_INT_UQI)
+BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_vmfpclassv2df_mask, "__builtin_ia32_fpclasssd_mask", IX86_BUILTIN_FPCLASSSD_MASK, UNKNOWN, (int) QI_FTYPE_V2DF_INT_UQI)
 BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_fpclassv8sf_mask, "__builtin_ia32_fpclassps256_mask", IX86_BUILTIN_FPCLASSPS256, UNKNOWN, (int) QI_FTYPE_V8SF_INT_UQI)
 BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_fpclassv4sf_mask, "__builtin_ia32_fpclassps128_mask", IX86_BUILTIN_FPCLASSPS128, UNKNOWN, (int) QI_FTYPE_V4SF_INT_UQI)
-BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_avx512dq_vmfpclassv4sf_mask, "__builtin_ia32_fpclassss_mask", IX86_BUILTIN_FPCLASSSS_MASK, UNKNOWN, (int) QI_FTYPE_V4SF_INT_UQI)
+BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_vmfpclassv4sf_mask, "__builtin_ia32_fpclassss_mask", IX86_BUILTIN_FPCLASSSS_MASK, UNKNOWN, (int) QI_FTYPE_V4SF_INT_UQI)
 BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_cvtb2maskv16qi, "__builtin_ia32_cvtb2mask128", IX86_BUILTIN_CVTB2MASK128, UNKNOWN, (int) UHI_FTYPE_V16QI)
 BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_cvtb2maskv32qi, "__builtin_ia32_cvtb2mask256", IX86_BUILTIN_CVTB2MASK256, UNKNOWN, (int) USI_FTYPE_V32QI)
 BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_cvtw2maskv8hi, "__builtin_ia32_cvtw2mask128", IX86_BUILTIN_CVTW2MASK128, UNKNOWN, (int) UQI_FTYPE_V8HI)
@@ -3209,10 +3209,10 @@ BDESC (OPTION_MASK_ISA_AVX512ER, 0, CODE_FOR_avx512er_vmrsqrt28v4sf_mask_round,
 /* AVX512DQ.  */
 BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_reducepv8df_mask_round, "__builtin_ia32_reducepd512_mask_round", IX86_BUILTIN_REDUCEPD512_MASK_ROUND, UNKNOWN, (int) V8DF_FTYPE_V8DF_INT_V8DF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_reducepv16sf_mask_round, "__builtin_ia32_reduceps512_mask_round", IX86_BUILTIN_REDUCEPS512_MASK_ROUND, UNKNOWN, (int) V16SF_FTYPE_V16SF_INT_V16SF_UHI_INT)
-BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_reducesv2df_mask_round, "__builtin_ia32_reducesd_mask_round", IX86_BUILTIN_REDUCESD128_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT_V2DF_UQI_INT)
-BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_reducesv4sf_mask_round, "__builtin_ia32_reducess_mask_round", IX86_BUILTIN_REDUCESS128_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT_V4SF_UQI_INT)
-BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_avx512dq_rangesv2df_mask_round, "__builtin_ia32_rangesd128_mask_round", IX86_BUILTIN_RANGESD128, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT_V2DF_UQI_INT)
-BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_avx512dq_rangesv4sf_mask_round, "__builtin_ia32_rangess128_mask_round", IX86_BUILTIN_RANGESS128, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT_V4SF_UQI_INT)
+BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_reducesv2df_mask_round, "__builtin_ia32_reducesd_mask_round", IX86_BUILTIN_REDUCESD128_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT_V2DF_UQI_INT)
+BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_reducesv4sf_mask_round, "__builtin_ia32_reducess_mask_round", IX86_BUILTIN_REDUCESS128_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT_V4SF_UQI_INT)
+BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_rangesv2df_mask_round, "__builtin_ia32_rangesd128_mask_round", IX86_BUILTIN_RANGESD128, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT_V2DF_UQI_INT)
+BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_rangesv4sf_mask_round, "__builtin_ia32_rangess128_mask_round", IX86_BUILTIN_RANGESS128, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT_V4SF_UQI_INT)
 BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_fix_notruncv8dfv8di2_mask_round, "__builtin_ia32_cvtpd2qq512_mask", IX86_BUILTIN_CVTPD2QQ512, UNKNOWN, (int) V8DI_FTYPE_V8DF_V8DI_QI_INT)
 BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_avx512dq_cvtps2qqv8di_mask_round, "__builtin_ia32_cvtps2qq512_mask", IX86_BUILTIN_CVTPS2QQ512, UNKNOWN, (int) V8DI_FTYPE_V8SF_V8DI_QI_INT)
 BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_fixuns_notruncv8dfv8di2_mask_round, "__builtin_ia32_cvtpd2uqq512_mask", IX86_BUILTIN_CVTPD2UQQ512, UNKNOWN, (int) V8DI_FTYPE_V8DF_V8DI_QI_INT)
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 108f4af8552..3f1ce3dae21 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -533,7 +533,7 @@
 
 ;; Used to control the "enabled" attribute on a per-instruction basis.
 (define_attr "isa" "base,x64,nox64,x64_sse2,x64_sse4,x64_sse4_noavx,
-		    x64_avx,x64_avx512bw,x64_avx512dq,aes,
+		    x64_avx,x64_avx512bw,x64_avx10_1_or_avx512dq,aes,
 		    sse_noavx,sse2,sse2_noavx,sse3,sse3_noavx,sse4,sse4_noavx,
 		    avx,noavx,avx2,noavx2,bmi,bmi2,fma4,fma,avx512f,noavx512f,
 		    avx512bw,noavx512bw,avx512dq,noavx512dq,fma_or_avx512vl,
@@ -875,8 +875,8 @@
 	   (symbol_ref "TARGET_64BIT && TARGET_AVX")
 	 (eq_attr "isa" "x64_avx512bw")
 	   (symbol_ref "TARGET_64BIT && TARGET_AVX512BW")
-	 (eq_attr "isa" "x64_avx512dq")
-	   (symbol_ref "TARGET_64BIT && TARGET_AVX512DQ")
+	 (eq_attr "isa" "x64_avx10_1_or_avx512dq")
+	   (symbol_ref "TARGET_64BIT && (TARGET_AVX512DQ || TARGET_AVX10_1)")
 	 (eq_attr "isa" "aes") (symbol_ref "TARGET_AES")
 	 (eq_attr "isa" "sse_noavx")
 	   (symbol_ref "TARGET_SSE && !TARGET_AVX")
@@ -3114,7 +3114,7 @@
 	     (eq_attr "alternative" "8")
 	       (const_string "QI")
 	     (and (eq_attr "alternative" "9,10,11,14")
-		  (not (match_test "TARGET_AVX512DQ")))
+		  (not (match_test "TARGET_AVX512DQ || TARGET_AVX10_1")))
 	       (const_string "HI")
 	     (eq_attr "type" "imovx")
 	       (const_string "SI")
diff --git a/gcc/config/i386/immintrin.h b/gcc/config/i386/immintrin.h
index 29b4dbbda24..83ff756ac76 100644
--- a/gcc/config/i386/immintrin.h
+++ b/gcc/config/i386/immintrin.h
@@ -64,6 +64,8 @@
 
 #include <avx512bwintrin.h>
 
+#include <avx512dqavx10_1intrin.h>
+
 #include <avx512dqintrin.h>
 
 #include <avx512vlbwintrin.h>
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 6784a8c5369..d4a5bca932f 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -1972,23 +1972,25 @@
 
 ;; All integer modes with AVX512BW/DQ.
 (define_mode_iterator SWI1248_AVX512BWDQ
-  [(QI "TARGET_AVX512DQ") HI (SI "TARGET_AVX512BW") (DI "TARGET_AVX512BW")])
+  [(QI "TARGET_AVX512DQ || TARGET_AVX10_1") (HI "TARGET_AVX512F")
+   (SI "TARGET_AVX512BW") (DI "TARGET_AVX512BW")])
 
 ;; All integer modes with AVX512BW, where HImode operation
 ;; can be used instead of QImode.
 (define_mode_iterator SWI1248_AVX512BW
-  [QI HI (SI "TARGET_AVX512BW") (DI "TARGET_AVX512BW")])
+  [(QI "TARGET_AVX512F || TARGET_AVX10_1") (HI "TARGET_AVX512F")
+   (SI "TARGET_AVX512BW") (DI "TARGET_AVX512BW")])
 
 ;; All integer modes with AVX512BW/DQ, even HImode requires DQ.
 (define_mode_iterator SWI1248_AVX512BWDQ2
-  [(QI "TARGET_AVX512DQ") (HI "TARGET_AVX512DQ")
+  [(QI "TARGET_AVX512DQ || TARGET_AVX10_1")
+   (HI "TARGET_AVX512DQ || TARGET_AVX10_1")
    (SI "TARGET_AVX512BW") (DI "TARGET_AVX512BW")])
 
 (define_expand "kmov<mskmodesuffix>"
   [(set (match_operand:SWI1248_AVX512BWDQ 0 "nonimmediate_operand")
 	(match_operand:SWI1248_AVX512BWDQ 1 "nonimmediate_operand"))]
-  "TARGET_AVX512F
-   && !(MEM_P (operands[0]) && MEM_P (operands[1]))")
+  "!(MEM_P (operands[0]) && MEM_P (operands[1]))")
 
 (define_insn "k<code><mode>"
   [(set (match_operand:SWI1248_AVX512BW 0 "register_operand" "=k")
@@ -1996,7 +1998,7 @@
 	  (match_operand:SWI1248_AVX512BW 1 "register_operand" "k")
 	  (match_operand:SWI1248_AVX512BW 2 "register_operand" "k")))
    (unspec [(const_int 0)] UNSPEC_MASKOP)]
-  "TARGET_AVX512F"
+  ""
 {
   if (get_attr_mode (insn) == MODE_HI)
     return "k<logic>w\t{%2, %1, %0|%0, %1, %2}";
@@ -2007,7 +2009,7 @@
    (set_attr "prefix" "vex")
    (set (attr "mode")
      (cond [(and (match_test "<MODE>mode == QImode")
-		 (not (match_test "TARGET_AVX512DQ")))
+		 (not (match_test "TARGET_AVX512DQ || TARGET_AVX10_1")))
 	       (const_string "HI")
 	   ]
 	   (const_string "<MODE>")))])
@@ -2018,7 +2020,7 @@
 	  (match_operand:SWI1248_AVX512BW 1 "mask_reg_operand")
 	  (match_operand:SWI1248_AVX512BW 2 "mask_reg_operand")))
    (clobber (reg:CC FLAGS_REG))]
-  "TARGET_AVX512F && reload_completed"
+  "reload_completed"
   [(parallel
      [(set (match_dup 0)
 	   (any_logic:SWI1248_AVX512BW (match_dup 1) (match_dup 2)))
@@ -2031,7 +2033,7 @@
 	    (match_operand:SWI1248_AVX512BW 1 "register_operand" "k"))
 	  (match_operand:SWI1248_AVX512BW 2 "register_operand" "k")))
    (unspec [(const_int 0)] UNSPEC_MASKOP)]
-  "TARGET_AVX512F"
+  ""
 {
   if (get_attr_mode (insn) == MODE_HI)
     return "kandnw\t{%2, %1, %0|%0, %1, %2}";
@@ -2042,7 +2044,7 @@
    (set_attr "prefix" "vex")
    (set (attr "mode")
      (cond [(and (match_test "<MODE>mode == QImode")
-		 (not (match_test "TARGET_AVX512DQ")))
+		 (not (match_test "TARGET_AVX512DQ || TARGET_AVX10_1")))
 	      (const_string "HI")
 	   ]
 	   (const_string "<MODE>")))])
@@ -2054,7 +2056,7 @@
 	    (match_operand:SWI1248_AVX512BW 1 "mask_reg_operand"))
 	  (match_operand:SWI1248_AVX512BW 2 "mask_reg_operand")))
    (clobber (reg:CC FLAGS_REG))]
-  "TARGET_AVX512F && reload_completed"
+  "reload_completed"
   [(parallel
      [(set (match_dup 0)
 	   (and:SWI1248_AVX512BW
@@ -2069,7 +2071,7 @@
 	    (match_operand:SWI1248_AVX512BW 1 "register_operand" "k")
 	    (match_operand:SWI1248_AVX512BW 2 "register_operand" "k"))))
    (unspec [(const_int 0)] UNSPEC_MASKOP)]
-  "TARGET_AVX512F"
+  ""
 {
   if (get_attr_mode (insn) == MODE_HI)
     return "kxnorw\t{%2, %1, %0|%0, %1, %2}";
@@ -2080,7 +2082,7 @@
    (set_attr "prefix" "vex")
    (set (attr "mode")
      (cond [(and (match_test "<MODE>mode == QImode")
-		 (not (match_test "TARGET_AVX512DQ")))
+		 (not (match_test "TARGET_AVX512DQ || TARGET_AVX10_1")))
 	      (const_string "HI")
 	   ]
 	   (const_string "<MODE>")))])
@@ -2090,7 +2092,7 @@
 	(not:SWI1248_AVX512BW
 	  (match_operand:SWI1248_AVX512BW 1 "register_operand" "k")))
    (unspec [(const_int 0)] UNSPEC_MASKOP)]
-  "TARGET_AVX512F"
+  ""
 {
   if (get_attr_mode (insn) == MODE_HI)
     return "knotw\t{%1, %0|%0, %1}";
@@ -2101,7 +2103,7 @@
    (set_attr "prefix" "vex")
    (set (attr "mode")
      (cond [(and (match_test "<MODE>mode == QImode")
-		 (not (match_test "TARGET_AVX512DQ")))
+		 (not (match_test "TARGET_AVX512DQ || TARGET_AVX10_1")))
 	       (const_string "HI")
 	   ]
 	   (const_string "<MODE>")))])
@@ -2110,7 +2112,7 @@
   [(set (match_operand:SWI1248_AVX512BW 0 "mask_reg_operand")
 	(not:SWI1248_AVX512BW
 	  (match_operand:SWI1248_AVX512BW 1 "mask_reg_operand")))]
-  "TARGET_AVX512F && reload_completed"
+  "reload_completed"
   [(parallel
      [(set (match_dup 0)
 	   (not:SWI1248_AVX512BW (match_dup 1)))
@@ -2144,7 +2146,7 @@
 	  (match_operand:SWI1248_AVX512BWDQ2 1 "register_operand" "k")
 	  (match_operand:SWI1248_AVX512BWDQ2 2 "register_operand" "k")))
    (unspec [(const_int 0)] UNSPEC_MASKOP)]
-  "TARGET_AVX512F"
+  ""
   "kadd<mskmodesuffix>\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "msklog")
    (set_attr "prefix" "vex")
@@ -2159,7 +2161,7 @@
 	  (match_operand:SWI1248_AVX512BWDQ 1 "register_operand" "k")
 	  (match_operand 2 "const_0_to_255_operand")))
    (unspec [(const_int 0)] UNSPEC_MASKOP)]
-  "TARGET_AVX512F"
+  ""
   "k<mshift><mskmodesuffix>\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "msklog")
    (set_attr "prefix" "vex")
@@ -2171,7 +2173,7 @@
 	  (match_operand:SWI1248_AVX512BW 1 "mask_reg_operand")
 	  (match_operand 2 "const_int_operand")))
    (clobber (reg:CC FLAGS_REG))]
-  "TARGET_AVX512F && reload_completed"
+  "reload_completed"
   [(parallel
      [(set (match_dup 0)
 	   (any_lshift:SWI1248_AVX512BW
@@ -2185,7 +2187,7 @@
 	  [(match_operand:SWI1248_AVX512BWDQ2 0 "register_operand" "k")
 	   (match_operand:SWI1248_AVX512BWDQ2 1 "register_operand" "k")]
 	  UNSPEC_KTEST))]
-  "TARGET_AVX512F"
+  ""
   "ktest<mskmodesuffix>\t{%1, %0|%0, %1}"
   [(set_attr "mode" "<MODE>")
    (set_attr "type" "msklog")
@@ -2197,7 +2199,7 @@
 	  [(match_operand:SWI1248_AVX512BWDQ 0 "register_operand" "k")
 	   (match_operand:SWI1248_AVX512BWDQ 1 "register_operand" "k")]
 	  UNSPEC_KORTEST))]
-  "TARGET_AVX512F"
+  ""
   "kortest<mskmodesuffix>\t{%1, %0|%0, %1}"
   [(set_attr "mode" "<MODE>")
    (set_attr "type" "msklog")
@@ -3565,7 +3567,8 @@
 	    UNSPEC_REDUCE)
 	  (match_dup 1)
 	  (const_int 1)))]
-  "TARGET_AVX512DQ || (VALID_AVX512FP16_REG_MODE (<MODE>mode))"
+  "TARGET_AVX512DQ || (VALID_AVX512FP16_REG_MODE (<MODE>mode))
+  || TARGET_AVX10_1"
   "vreduce<ssescalarmodesuffix>\t{%3, <round_saeonly_scalar_mask_op4>%2, %1, %0<mask_scalar_operand4>|%0<mask_scalar_operand4>, %1, %<iptr>2<round_saeonly_scalar_mask_op4>, %3}"
   [(set_attr "type" "sse")
    (set_attr "prefix" "evex")
@@ -18897,7 +18900,7 @@
 
 (define_mode_attr pinsr_evex_isa
   [(V16QI "avx512bw") (V8HI "avx512bw") (V8HF "avx512bw")
-   (V8BF "avx512bw") (V4SI "avx512dq") (V2DI "avx512dq")])
+   (V8BF "avx512bw") (V4SI "avx10_1_or_avx512dq") (V2DI "avx10_1_or_avx512dq")])
 
 ;; sse4_1_pinsrd must come before sse2_loadld since it is preferred.
 (define_insn "<sse2p4_1>_pinsr<ssemodesuffix>"
@@ -20276,7 +20279,7 @@
       gcc_unreachable ();
     }
 }
-  [(set_attr "isa" "*,avx512dq,noavx,noavx,avx")
+  [(set_attr "isa" "*,avx10_1_or_avx512dq,noavx,noavx,avx")
    (set_attr "type" "sselog1,sselog1,sseishft1,sseishft1,sseishft1")
    (set (attr "prefix_extra")
      (if_then_else (eq_attr "alternative" "0,1")
@@ -20294,7 +20297,7 @@
 	    (parallel [(match_operand:SI 2 "const_0_to_3_operand")]))))]
   "TARGET_64BIT && TARGET_SSE4_1"
   "%vpextrd\t{%2, %1, %k0|%k0, %1, %2}"
-  [(set_attr "isa" "*,avx512dq")
+  [(set_attr "isa" "*,avx10_1_or_avx512dq")
    (set_attr "type" "sselog1")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
@@ -20343,7 +20346,7 @@
      (cond [(eq_attr "alternative" "0")
 	      (const_string "x64_sse4")
 	    (eq_attr "alternative" "1")
-	      (const_string "x64_avx512dq")
+	      (const_string "x64_avx10_1_or_avx512dq")
 	    (eq_attr "alternative" "3")
 	      (const_string "sse2_noavx")
 	    (eq_attr "alternative" "4")
@@ -20509,7 +20512,7 @@
    %vmovd\t{%1, %0|%0, %1}
    punpckldq\t{%2, %0|%0, %2}
    movd\t{%1, %0|%0, %1}"
-  [(set_attr "isa" "noavx,noavx,avx,avx512dq,noavx,noavx,avx,*,*,*")
+  [(set_attr "isa" "noavx,noavx,avx,avx10_1_or_avx512dq,noavx,noavx,avx,*,*,*")
    (set (attr "mmx_isa")
      (if_then_else (eq_attr "alternative" "8,9")
 		   (const_string "native")
@@ -20665,7 +20668,7 @@
 	    (eq_attr "alternative" "2")
 	      (const_string "x64_avx")
 	    (eq_attr "alternative" "3")
-	      (const_string "x64_avx512dq")
+	      (const_string "x64_avx10_1_or_avx512dq")
 	    (eq_attr "alternative" "4")
 	      (const_string "sse2_noavx")
 	    (eq_attr "alternative" "5,8")
@@ -28600,7 +28603,7 @@
 	    UNSPEC_RANGE)
 	  (match_dup 1)
 	  (const_int 1)))]
-  "TARGET_AVX512DQ"
+  "TARGET_AVX512DQ || TARGET_AVX10_1"
 {
   if (TARGET_DEST_FALSE_DEP_FOR_GLC
       && <mask_scalar4_dest_false_dep_for_glc_cond>
@@ -28634,7 +28637,7 @@
              (match_operand 2 "const_0_to_255_operand")]
 	    UNSPEC_FPCLASS)
 	  (const_int 1)))]
-   "TARGET_AVX512DQ || VALID_AVX512FP16_REG_MODE(<MODE>mode)"
+   "TARGET_AVX512DQ || VALID_AVX512FP16_REG_MODE(<MODE>mode) || TARGET_AVX10_1"
    "vfpclass<ssescalarmodesuffix>\t{%2, %1, %0<mask_scalar_merge_operand3>|%0<mask_scalar_merge_operand3>, %1, %2}";
   [(set_attr "type" "sse")
    (set_attr "length_immediate" "1")
diff --git a/gcc/config/i386/subst.md b/gcc/config/i386/subst.md
index fe923458ab8..a8b3081df70 100644
--- a/gcc/config/i386/subst.md
+++ b/gcc/config/i386/subst.md
@@ -353,7 +353,7 @@
 	  (match_operand:SUBST_V 1)
 	  (match_operand:SUBST_V 2)
 	  (const_int 1)))]
-  "TARGET_AVX512F"
+  "TARGET_AVX512F || TARGET_AVX10_1"
   [(set (match_dup 0)
 	(vec_merge:SUBST_V
 	  (vec_merge:SUBST_V
@@ -460,7 +460,7 @@
           (match_operand:SUBST_V 1)
           (match_operand:SUBST_V 2)
           (const_int 1)))]
-  "TARGET_AVX512F"
+  "TARGET_AVX512F || TARGET_AVX10_1"
   [(set (match_dup 0)
 	(unspec:SUBST_V [
 	     (vec_merge:SUBST_V
diff --git a/gcc/testsuite/gcc.target/i386/sse-26.c b/gcc/testsuite/gcc.target/i386/sse-26.c
index 89db33b8b8c..d67b6056954 100644
--- a/gcc/testsuite/gcc.target/i386/sse-26.c
+++ b/gcc/testsuite/gcc.target/i386/sse-26.c
@@ -7,5 +7,6 @@
    intrinsics. */
 
 #define _AVX512VLDQINTRIN_H_INCLUDED
+#define _AVX512DQAVX10_1INTRIN_H_INCLUDED
 
 #include "sse-13.c"
-- 
2.31.1
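
A quick way to see the effect of this patch: with the gating changes
above, the AVX512DQ 8-bit mask intrinsics are accepted under plain
-mavx10.1, with no -mavx512dq on the command line.  A minimal sketch,
not part of the patch itself (the function name is illustrative); it
mirrors the style of the new tests in patch 2/2 and is meant to be
compiled with `gcc -O2 -mavx10.1`:

  #include <immintrin.h>

  void
  mask8_demo (void)
  {
    __mmask8 k1 = _cvtu32_mask8 (0x5a);
    __mmask8 k2 = _kshiftli_mask8 (k1, 1);
    __mmask8 k3 = _kxor_mask8 (k1, k2);
    /* The "+k" constraint forces k3 into a mask register, so the
       kshiftlb/kxorb patterns are used instead of GPR code -- the
       same trick the new tests rely on.  */
    asm volatile ("" : "+k" (k3));
  }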



* [PATCH 2/2] Support AVX10.1 for AVX512DQ intrins
  2023-08-17  6:55 [PATCH 0/2] Support AVX10.1 for AVX512DQ intrins Haochen Jiang
  2023-08-17  6:55 ` [PATCH 1/2] " Haochen Jiang
@ 2023-08-17  6:55 ` Haochen Jiang
  1 sibling, 0 replies; 3+ messages in thread
From: Haochen Jiang @ 2023-08-17  6:55 UTC (permalink / raw)
  To: gcc-patches; +Cc: hongtao.liu, ubizjak

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx10_1-kaddb-1.c: New test.
	* gcc.target/i386/avx10_1-kaddw-1.c: Ditto.
	* gcc.target/i386/avx10_1-kandb-1.c: Ditto.
	* gcc.target/i386/avx10_1-kandnb-1.c: Ditto.
	* gcc.target/i386/avx10_1-kmovb-1.c: Ditto.
	* gcc.target/i386/avx10_1-kmovb-2.c: Ditto.
	* gcc.target/i386/avx10_1-kmovb-3.c: Ditto.
	* gcc.target/i386/avx10_1-kmovb-4.c: Ditto.
	* gcc.target/i386/avx10_1-knotb-1.c: Ditto.
	* gcc.target/i386/avx10_1-korb-1.c: Ditto.
	* gcc.target/i386/avx10_1-kortestb-1.c: Ditto.
	* gcc.target/i386/avx10_1-kshiftlb-1.c: Ditto.
	* gcc.target/i386/avx10_1-kshiftrb-1.c: Ditto.
	* gcc.target/i386/avx10_1-ktestb-1.c: Ditto.
	* gcc.target/i386/avx10_1-ktestw-1.c: Ditto.
	* gcc.target/i386/avx10_1-kxnorb-1.c: Ditto.
	* gcc.target/i386/avx10_1-kxorb-1.c: Ditto.
	* gcc.target/i386/avx10_1-vfpclasssd-1.c: New test.
	* gcc.target/i386/avx10_1-vfpclassss-1.c: Ditto.
	* gcc.target/i386/avx10_1-vpextr-1.c: Ditto.
	* gcc.target/i386/avx10_1-vpinsr-1.c: Ditto.
	* gcc.target/i386/avx10_1-vrangesd-1.c: Ditto.
	* gcc.target/i386/avx10_1-vrangess-1.c: Ditto.
	* gcc.target/i386/avx10_1-vreducesd-1.c: Ditto.
	* gcc.target/i386/avx10_1-vreducess-1.c: Ditto.
---
 .../gcc.target/i386/avx10_1-kaddb-1.c         | 12 +++++
 .../gcc.target/i386/avx10_1-kaddw-1.c         | 12 +++++
 .../gcc.target/i386/avx10_1-kandb-1.c         | 16 ++++++
 .../gcc.target/i386/avx10_1-kandnb-1.c        | 16 ++++++
 .../gcc.target/i386/avx10_1-kmovb-1.c         | 15 ++++++
 .../gcc.target/i386/avx10_1-kmovb-2.c         | 16 ++++++
 .../gcc.target/i386/avx10_1-kmovb-3.c         | 17 ++++++
 .../gcc.target/i386/avx10_1-kmovb-4.c         | 15 ++++++
 .../gcc.target/i386/avx10_1-knotb-1.c         | 15 ++++++
 .../gcc.target/i386/avx10_1-korb-1.c          | 16 ++++++
 .../gcc.target/i386/avx10_1-kortestb-1.c      | 16 ++++++
 .../gcc.target/i386/avx10_1-kshiftlb-1.c      | 16 ++++++
 .../gcc.target/i386/avx10_1-kshiftrb-1.c      | 16 ++++++
 .../gcc.target/i386/avx10_1-ktestb-1.c        | 16 ++++++
 .../gcc.target/i386/avx10_1-ktestw-1.c        | 16 ++++++
 .../gcc.target/i386/avx10_1-kxnorb-1.c        | 16 ++++++
 .../gcc.target/i386/avx10_1-kxorb-1.c         | 16 ++++++
 .../gcc.target/i386/avx10_1-vfpclasssd-1.c    | 16 ++++++
 .../gcc.target/i386/avx10_1-vfpclassss-1.c    | 16 ++++++
 .../gcc.target/i386/avx10_1-vpextr-1.c        | 53 +++++++++++++++++++
 .../gcc.target/i386/avx10_1-vpinsr-1.c        | 33 ++++++++++++
 .../gcc.target/i386/avx10_1-vrangesd-1.c      | 26 +++++++++
 .../gcc.target/i386/avx10_1-vrangess-1.c      | 25 +++++++++
 .../gcc.target/i386/avx10_1-vreducesd-1.c     | 31 +++++++++++
 .../gcc.target/i386/avx10_1-vreducess-1.c     | 30 +++++++++++
 25 files changed, 492 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-kaddb-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-kaddw-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-kandb-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-kandnb-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-kmovb-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-kmovb-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-kmovb-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-kmovb-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-knotb-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-korb-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-kortestb-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-kshiftlb-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-kshiftrb-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-ktestb-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-ktestw-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-kxnorb-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-kxorb-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vfpclasssd-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vfpclassss-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vpextr-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vpinsr-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vrangesd-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vrangess-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vreducesd-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vreducess-1.c

diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-kaddb-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-kaddb-1.c
new file mode 100644
index 00000000000..6da7b497722
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx10_1-kaddb-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx10.1 -O2" } */
+/* { dg-final { scan-assembler-times "kaddb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx10_1_test ()
+{
+  __mmask8 k = _kadd_mask8 (11, 12);
+  asm volatile ("" : "+k" (k));
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-kaddw-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-kaddw-1.c
new file mode 100644
index 00000000000..033b7005d71
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx10_1-kaddw-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx10.1 -O2" } */
+/* { dg-final { scan-assembler-times "kaddw\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx10_1_test ()
+{
+  __mmask16 k = _kadd_mask16 (11, 12);
+  asm volatile ("" : "+k" (k));
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-kandb-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-kandb-1.c
new file mode 100644
index 00000000000..5510a982c97
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx10_1-kandb-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx10.1 -O2" } */
+/* { dg-final { scan-assembler-times "kandb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx10_1_test ()
+{
+  volatile __mmask8 k1, k2, k3;
+
+  __asm__( "kmovb %1, %0" : "=k" (k1) : "r" (1) );
+  __asm__( "kmovb %1, %0" : "=k" (k2) : "r" (2) );
+
+  k3 = _kand_mask8 (k1, k2);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-kandnb-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-kandnb-1.c
new file mode 100644
index 00000000000..e57078074e0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx10_1-kandnb-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx10.1 -O2" } */
+/* { dg-final { scan-assembler-times "kandnb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx10_1_test ()
+{
+  volatile __mmask8 k1, k2, k3;
+
+  __asm__( "kmovb %1, %0" : "=k" (k1) : "r" (1) );
+  __asm__( "kmovb %1, %0" : "=k" (k2) : "r" (2) );
+
+  k3 = _kandn_mask8 (k1, k2);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-kmovb-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-kmovb-1.c
new file mode 100644
index 00000000000..15b9d9a5daa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx10_1-kmovb-1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx10.1 -O2" } */
+/* { dg-final { scan-assembler-times "kmovb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+volatile __mmask8 k1;
+
+void
+avx10_1_test ()
+{
+  __mmask8 k = _cvtu32_mask8 (11);
+
+  asm volatile ("" : "+k" (k));
+  k1 = k;
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-kmovb-2.c b/gcc/testsuite/gcc.target/i386/avx10_1-kmovb-2.c
new file mode 100644
index 00000000000..e4f73f0870e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx10_1-kmovb-2.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx10.1 -O2" } */
+/* { dg-final { scan-assembler-times "kmovb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+volatile __mmask8 k1;
+
+void
+avx10_1_test ()
+{
+  __mmask8 k0 = 11; 
+  __mmask8 k = _load_mask8 (&k0);
+
+  asm volatile ("" : "+k" (k));
+  k1 = k;
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-kmovb-3.c b/gcc/testsuite/gcc.target/i386/avx10_1-kmovb-3.c
new file mode 100644
index 00000000000..47d4d1aafe2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx10_1-kmovb-3.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx10.1 -O2" } */
+/* { dg-final { scan-assembler-times "kmovb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+volatile __mmask8 k1 = 11;
+
+void
+avx10_1_test ()
+{
+  __mmask8 k0, k;
+ 
+  _store_mask8 (&k, k1);
+
+  asm volatile ("" : "+k" (k));
+  k0 = k;
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-kmovb-4.c b/gcc/testsuite/gcc.target/i386/avx10_1-kmovb-4.c
new file mode 100644
index 00000000000..79effebdfc7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx10_1-kmovb-4.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx10.1 -O2" } */
+/* { dg-final { scan-assembler-times "kmovb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+volatile unsigned int i;
+
+void
+avx10_1_test ()
+{
+  __mmask8 k = 11;
+
+  asm volatile ("" : "+k" (k));
+  i = _cvtmask8_u32 (k);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-knotb-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-knotb-1.c
new file mode 100644
index 00000000000..4a353bcd921
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx10_1-knotb-1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx10.1 -O2" } */
+/* { dg-final { scan-assembler-times "knotb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx10_1_test ()
+{
+  volatile __mmask8 k1, k2;
+
+  __asm__( "kmovb %1, %0" : "=k" (k1) : "r" (45) );
+
+  k2 = _knot_mask8 (k1);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-korb-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-korb-1.c
new file mode 100644
index 00000000000..c912bec1482
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx10_1-korb-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx10.1 -O2" } */
+/* { dg-final { scan-assembler-times "korb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx10_1_test ()
+{
+  volatile __mmask8 k1, k2, k3;
+
+  __asm__( "kmovb %1, %0" : "=k" (k1) : "r" (1) );
+  __asm__( "kmovb %1, %0" : "=k" (k2) : "r" (2) );
+
+  k3 = _kor_mask8 (k1, k2);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-kortestb-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-kortestb-1.c
new file mode 100644
index 00000000000..9c8783f0bc5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx10_1-kortestb-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O0 -mavx10.1" } */
+/* { dg-final { scan-assembler-times "kortestb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 2 } } */
+
+#include <immintrin.h>
+
+void
+avx10_1_test () {
+  volatile __mmask8 k1;
+  __mmask8 k2;
+
+  volatile unsigned char r __attribute__((unused));	
+
+  r = _kortestc_mask8_u8(k1, k2);
+  r = _kortestz_mask8_u8(k1, k2);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-kshiftlb-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-kshiftlb-1.c
new file mode 100644
index 00000000000..54e8cfd98a9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx10_1-kshiftlb-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx10.1 -O2" } */
+/* { dg-final { scan-assembler-times "kshiftlb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx10_1_test ()
+{
+  volatile __mmask8 k1, k2;
+  unsigned int i = 5;
+
+  __asm__( "kmovb %1, %0" : "=k" (k1) : "r" (1) );
+
+  k2 = _kshiftli_mask8 (k1, i);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-kshiftrb-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-kshiftrb-1.c
new file mode 100644
index 00000000000..625007fded0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx10_1-kshiftrb-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx10.1 -O2" } */
+/* { dg-final { scan-assembler-times "kshiftrb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx10_1_test ()
+{
+  volatile __mmask8 k1, k2;
+  unsigned int i = 5;
+
+  __asm__( "kmovb %1, %0" : "=k" (k1) : "r" (1) );
+
+  k2 = _kshiftri_mask8 (k1, i);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-ktestb-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-ktestb-1.c
new file mode 100644
index 00000000000..5f4fe298bd6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx10_1-ktestb-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O0 -mavx10.1" } */
+/* { dg-final { scan-assembler-times "ktestb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 2 } } */
+
+#include <immintrin.h>
+
+void
+avx10_1_test () {
+  volatile __mmask8 k1;
+  __mmask8 k2;
+
+  volatile unsigned char r __attribute__((unused));	
+
+  r = _ktestc_mask8_u8(k1, k2);
+  r = _ktestz_mask8_u8(k1, k2);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-ktestw-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-ktestw-1.c
new file mode 100644
index 00000000000..c606abfb12b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx10_1-ktestw-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O0 -mavx10.1" } */
+/* { dg-final { scan-assembler-times "ktestw\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 2 } } */
+
+#include <immintrin.h>
+
+void
+avx10_1_test () {
+  volatile __mmask16 k1;
+  __mmask16 k2;
+
+  volatile unsigned char r __attribute__((unused));	
+
+  r = _ktestc_mask16_u8(k1, k2);
+  r = _ktestz_mask16_u8(k1, k2);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-kxnorb-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-kxnorb-1.c
new file mode 100644
index 00000000000..3abe56974bf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx10_1-kxnorb-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx10.1 -O2" } */
+/* { dg-final { scan-assembler-times "kxnorb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx10_1_test ()
+{
+  volatile __mmask8 k1, k2, k3;
+
+  __asm__( "kmovb %1, %0" : "=k" (k1) : "r" (1) );
+  __asm__( "kmovb %1, %0" : "=k" (k2) : "r" (2) );
+
+  k3 = _kxnor_mask8 (k1, k2);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-kxorb-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-kxorb-1.c
new file mode 100644
index 00000000000..a39604f038b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx10_1-kxorb-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx10.1 -O2" } */
+/* { dg-final { scan-assembler-times "kxorb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx10_1_test ()
+{
+  volatile __mmask8 k1, k2, k3;
+
+  __asm__( "kmovb %1, %0" : "=k" (k1) : "r" (1) );
+  __asm__( "kmovb %1, %0" : "=k" (k2) : "r" (2) );
+
+  k3 = _kxor_mask8 (k1, k2);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vfpclasssd-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vfpclasssd-1.c
new file mode 100644
index 00000000000..dbfbe421889
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx10_1-vfpclasssd-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx10.1 -O2" } */
+/* { dg-final { scan-assembler-times "vfpclasssd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n^k\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vfpclasssd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n^k\]*%k\[0-7\]\{%k\[0-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+volatile __m128d x128;
+volatile __mmask8 m8;
+
+void extern
+avx10_1_test (void)
+{
+  m8 = _mm_fpclass_sd_mask (x128, 13);
+  m8 = _mm_mask_fpclass_sd_mask (m8, x128, 13);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vfpclassss-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vfpclassss-1.c
new file mode 100644
index 00000000000..20fd6d3c87b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx10_1-vfpclassss-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx10.1 -O2" } */
+/* { dg-final { scan-assembler-times "vfpclassss\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n^k\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vfpclassss\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n^k\]*%k\[0-7\]\{%k\[0-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+volatile __m128 x128;
+volatile __mmask8 m8;
+
+void extern
+avx10_1_test (void)
+{
+  m8 = _mm_fpclass_ss_mask (x128, 13);
+  m8 = _mm_mask_fpclass_ss_mask (m8, x128, 13);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vpextr-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vpextr-1.c
new file mode 100644
index 00000000000..32fa2efa696
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx10_1-vpextr-1.c
@@ -0,0 +1,53 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mavx10.1" } */
+
+typedef int v4si __attribute__((vector_size (16)));
+typedef long long v2di __attribute__((vector_size (16)));
+
+unsigned int
+f1 (v4si a)
+{
+  register v4si c __asm ("xmm16") = a;
+  asm volatile ("" : "+v" (c));
+  v4si d = c;
+  return ((unsigned int *) &d)[3];
+}
+
+unsigned long long
+f2 (v2di a)
+{
+  register v2di c __asm ("xmm16") = a;
+  asm volatile ("" : "+v" (c));
+  v2di d = c;
+  return ((unsigned long long *) &d)[1];
+}
+
+unsigned long long
+f3 (v4si a)
+{
+  register v4si c __asm ("xmm16") = a;
+  asm volatile ("" : "+v" (c));
+  v4si d = c;
+  return ((unsigned int *) &d)[3];
+}
+
+void
+f4 (v4si a, unsigned int *p)
+{
+  register v4si c __asm ("xmm16") = a;
+  asm volatile ("" : "+v" (c));
+  v4si d = c;
+  *p = ((unsigned int *) &d)[3];
+}
+
+void
+f5 (v2di a, unsigned long long *p)
+{
+  register v2di c __asm ("xmm16") = a;
+  asm volatile ("" : "+v" (c));
+  v2di d = c;
+  *p = ((unsigned long long *) &d)[1];
+}
+
+/* { dg-final { scan-assembler-times "vpextrd\[^\n\r]*xmm16" 3 } } */
+/* { dg-final { scan-assembler-times "vpextrq\[^\n\r]*xmm16" 2 } } */
diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vpinsr-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vpinsr-1.c
new file mode 100644
index 00000000000..e473ddb64fc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx10_1-vpinsr-1.c
@@ -0,0 +1,33 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mavx10.1" } */
+
+typedef int v4si __attribute__((vector_size (16)));
+typedef long long v2di __attribute__((vector_size (16)));
+
+v4si
+f1 (v4si a, int b)
+{
+  register v4si c __asm ("xmm16") = a;
+  asm volatile ("" : "+v" (c));
+  v4si d = c;
+  ((int *) &d)[3] = b;
+  c = d;
+  asm volatile ("" : "+v" (c));
+  return c;
+}
+
+/* { dg-final { scan-assembler "vpinsrd\[^\n\r]*xmm16" } } */
+
+v2di
+f2 (v2di a, long long b)
+{
+  register v2di c __asm ("xmm16") = a;
+  asm volatile ("" : "+v" (c));
+  v2di d = c;
+  ((long long *) &d)[1] = b;
+  c = d;
+  asm volatile ("" : "+v" (c));
+  return c;
+}
+
+/* { dg-final { scan-assembler "vpinsrq\[^\n\r]*xmm16" } } */
diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vrangesd-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vrangesd-1.c
new file mode 100644
index 00000000000..4a388643a52
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx10_1-vrangesd-1.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx10.1 -O2" } */
+/* { dg-final { scan-assembler-times "vrangesd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vrangesd\[ \\t\]+\[^\$\n\]*\\$\[^\{\n\]*\{sae\}\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vrangesd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vrangesd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vrangesd\[ \\t\]+\[^\{\n\]*\{sae\}\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vrangesd\[ \\t\]+\[^\{\n\]*\{sae\}\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+
+
+#include <immintrin.h>
+
+volatile __m128d x1, x2;
+volatile __mmask8 m;
+
+void extern
+avx10_1_test (void)
+{
+  x1 = _mm_range_sd (x1, x2, 3);
+  x1 = _mm_mask_range_sd (x1, m, x1, x2, 3);
+  x1 = _mm_maskz_range_sd (m, x1, x2, 3);
+
+  x1 = _mm_range_round_sd (x1, x2, 3, _MM_FROUND_NO_EXC);
+  x1 = _mm_mask_range_round_sd (x1, m, x1, x2, 3, _MM_FROUND_NO_EXC);
+  x1 = _mm_maskz_range_round_sd (m, x1, x2, 3, _MM_FROUND_NO_EXC);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vrangess-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vrangess-1.c
new file mode 100644
index 00000000000..f704ab95056
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx10_1-vrangess-1.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx10.1 -O2" } */
+/* { dg-final { scan-assembler-times "vrangess\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vrangess\[ \\t\]+\[^\$\n\]*\\$\[^\{\n\]*\{sae\}\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vrangess\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vrangess\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vrangess\[ \\t\]+\[^\{\n\]*\{sae\}\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vrangess\[ \\t\]+\[^\{\n\]*\{sae\}\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+volatile __m128 x1, x2;
+volatile __mmask8 m;
+
+void extern
+avx10_1_test (void)
+{
+  x1 = _mm_range_ss (x1, x2, 1);
+  x1 = _mm_mask_range_ss (x1, m, x1, x2, 1);
+  x1 = _mm_maskz_range_ss (m, x1, x2, 1);
+
+  x1 = _mm_range_round_ss (x1, x2, 1, _MM_FROUND_NO_EXC);
+  x1 = _mm_mask_range_round_ss (x1, m, x1, x2, 1, _MM_FROUND_NO_EXC);
+  x1 = _mm_maskz_range_round_ss (m, x1, x2, 1, _MM_FROUND_NO_EXC);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vreducesd-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vreducesd-1.c
new file mode 100644
index 00000000000..5953466c372
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx10_1-vreducesd-1.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx10.1 -O2" } */
+/* { dg-final { scan-assembler-times "vreducesd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+
+/* { dg-final { scan-assembler-times "vreducesd\[ \\t\]+\[^\{\n\]*\{sae\}\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)"  1 } } */
+/* { dg-final { scan-assembler-times "vreducesd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)"  1 } } */
+/* { dg-final { scan-assembler-times "vreducesd\[ \\t\]+\[^\{\n\]*\{sae\}\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vreducesd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vreducesd\[ \\t\]+\[^\{\n\]*\{sae\}\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vreducesd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+
+
+#include <immintrin.h>
+
+#define IMM 123
+
+volatile __m128d x1, x2, xx1, xx2;
+volatile __mmask8 m;
+
+void extern
+avx10_1_test (void)
+{
+  xx1 = _mm_reduce_round_sd (xx1, xx2, IMM, _MM_FROUND_NO_EXC);
+  x1 = _mm_reduce_sd (x1, x2, IMM);
+
+  xx1 = _mm_mask_reduce_round_sd(xx1, m, xx1, xx2, IMM, _MM_FROUND_NO_EXC);
+  x1 = _mm_mask_reduce_sd(x1, m, x1, x2, IMM);
+
+  xx1 = _mm_maskz_reduce_round_sd(m, xx1, xx2, IMM, _MM_FROUND_NO_EXC);
+  x1 = _mm_maskz_reduce_sd(m, x1, x2, IMM);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vreducess-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vreducess-1.c
new file mode 100644
index 00000000000..edd7ec07923
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx10_1-vreducess-1.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx10.1 -O2" } */
+/* { dg-final { scan-assembler-times "vreducess\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+
+/* { dg-final { scan-assembler-times "vreducess\[ \\t\]+\[^\{\n\]*\{sae\}\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)"  1 } } */
+/* { dg-final { scan-assembler-times "vreducess\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)"  1 } } */
+/* { dg-final { scan-assembler-times "vreducess\[ \\t\]+\[^\{\n\]*\{sae\}\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vreducess\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vreducess\[ \\t\]+\[^\{\n\]*\{sae\}\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vreducess\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+#define IMM 123
+
+volatile __m128 x1, x2, xx1, xx2;
+volatile __mmask8 m;
+
+void extern
+avx10_1_test (void)
+{
+  xx1 = _mm_reduce_round_ss (xx1, xx2, IMM, _MM_FROUND_NO_EXC);
+  x1 = _mm_reduce_ss (x1, x2, IMM);
+
+  xx1 = _mm_mask_reduce_round_ss (xx1, m, xx1, xx2, IMM, _MM_FROUND_NO_EXC);
+  x1 = _mm_mask_reduce_ss (x1, m, x1, x2, IMM);
+ 
+  xx1 = _mm_maskz_reduce_round_ss (m, xx1, xx2, IMM, _MM_FROUND_NO_EXC);
+  x1 = _mm_maskz_reduce_ss (m, x1, x2, IMM);
+}
-- 
2.31.1
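
Beyond checking instruction selection, the new tests double as usage
documentation for the scalar intrinsics this series makes available
under AVX10.1.  A compact sketch, not part of the patch (the function
name is illustrative), combining the scalar range and reduce
intrinsics; compile with `gcc -O2 -mavx10.1`:

  #include <immintrin.h>

  __m128d
  range_then_reduce (__m128d a, __m128d b, __mmask8 m)
  {
    /* vrangesd with SAE, then a masked vreducesd; both intrinsics
       were previously gated on -mavx512dq.  */
    __m128d r = _mm_range_round_sd (a, b, 3, _MM_FROUND_NO_EXC);
    return _mm_mask_reduce_sd (r, m, a, b, 123);
  }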


