From: liuhongt <hongtao.liu@intel.com>
To: gcc-patches@gcc.gnu.org
Cc: crazylht@gmail.com, hjl.tools@gmail.com, ubizjak@gmail.com,
jakub@redhat.com
Subject: [PATCH 42/62] AVX512FP16: Add FP16 fma instructions.
Date: Thu, 1 Jul 2021 14:16:28 +0800 [thread overview]
Message-ID: <20210701061648.9447-43-hongtao.liu@intel.com> (raw)
In-Reply-To: <20210701061648.9447-1-hongtao.liu@intel.com>
Add vfmadd[132,213,231]ph/vfnmadd[132,213,231]ph/vfmsub[132,213,231]ph/
vfnmsub[132,213,231]ph.
gcc/ChangeLog:
* config/i386/avx512fp16intrin.h (_mm512_mask_fmadd_ph):
New intrinsic.
(_mm512_mask3_fmadd_ph): Likewise.
(_mm512_maskz_fmadd_ph): Likewise.
(_mm512_fmadd_round_ph): Likewise.
(_mm512_mask_fmadd_round_ph): Likewise.
(_mm512_mask3_fmadd_round_ph): Likewise.
(_mm512_maskz_fmadd_round_ph): Likewise.
(_mm512_fnmadd_ph): Likewise.
(_mm512_mask_fnmadd_ph): Likewise.
(_mm512_mask3_fnmadd_ph): Likewise.
(_mm512_maskz_fnmadd_ph): Likewise.
(_mm512_fnmadd_round_ph): Likewise.
(_mm512_mask_fnmadd_round_ph): Likewise.
(_mm512_mask3_fnmadd_round_ph): Likewise.
(_mm512_maskz_fnmadd_round_ph): Likewise.
(_mm512_fmsub_ph): Likewise.
(_mm512_mask_fmsub_ph): Likewise.
(_mm512_mask3_fmsub_ph): Likewise.
(_mm512_maskz_fmsub_ph): Likewise.
(_mm512_fmsub_round_ph): Likewise.
(_mm512_mask_fmsub_round_ph): Likewise.
(_mm512_mask3_fmsub_round_ph): Likewise.
(_mm512_maskz_fmsub_round_ph): Likewise.
(_mm512_fnmsub_ph): Likewise.
(_mm512_mask_fnmsub_ph): Likewise.
(_mm512_mask3_fnmsub_ph): Likewise.
(_mm512_maskz_fnmsub_ph): Likewise.
(_mm512_fnmsub_round_ph): Likewise.
(_mm512_mask_fnmsub_round_ph): Likewise.
(_mm512_mask3_fnmsub_round_ph): Likewise.
(_mm512_maskz_fnmsub_round_ph): Likewise.
* config/i386/avx512fp16vlintrin.h (_mm256_fmadd_ph):
New intrinsic.
(_mm256_mask_fmadd_ph): Likewise.
(_mm256_mask3_fmadd_ph): Likewise.
(_mm256_maskz_fmadd_ph): Likewise.
(_mm_fmadd_ph): Likewise.
(_mm_mask_fmadd_ph): Likewise.
(_mm_mask3_fmadd_ph): Likewise.
(_mm_maskz_fmadd_ph): Likewise.
(_mm256_fnmadd_ph): Likewise.
(_mm256_mask_fnmadd_ph): Likewise.
(_mm256_mask3_fnmadd_ph): Likewise.
(_mm256_maskz_fnmadd_ph): Likewise.
(_mm_fnmadd_ph): Likewise.
(_mm_mask_fnmadd_ph): Likewise.
(_mm_mask3_fnmadd_ph): Likewise.
(_mm_maskz_fnmadd_ph): Likewise.
(_mm256_fmsub_ph): Likewise.
(_mm256_mask_fmsub_ph): Likewise.
(_mm256_mask3_fmsub_ph): Likewise.
(_mm256_maskz_fmsub_ph): Likewise.
(_mm_fmsub_ph): Likewise.
(_mm_mask_fmsub_ph): Likewise.
(_mm_mask3_fmsub_ph): Likewise.
(_mm_maskz_fmsub_ph): Likewise.
(_mm256_fnmsub_ph): Likewise.
(_mm256_mask_fnmsub_ph): Likewise.
(_mm256_mask3_fnmsub_ph): Likewise.
(_mm256_maskz_fnmsub_ph): Likewise.
(_mm_fnmsub_ph): Likewise.
(_mm_mask_fnmsub_ph): Likewise.
(_mm_mask3_fnmsub_ph): Likewise.
(_mm_maskz_fnmsub_ph): Likewise.
* config/i386/i386-builtin.def: Add corresponding new builtins.
* config/i386/sse.md (avx512bcst): Add HF vector modes.
(<avx512>_fmadd_<mode>_maskz<round_expand_name>): Adjust to
support HF vector modes.
(<sd_mask_codefor>fma_fmadd_<mode><sd_maskz_name><round_name>):
Ditto.
(*<sd_mask_codefor>fma_fmadd_<mode><sd_maskz_name>_bcst_1): Ditto.
(*<sd_mask_codefor>fma_fmadd_<mode><sd_maskz_name>_bcst_2): Ditto.
(*<sd_mask_codefor>fma_fmadd_<mode><sd_maskz_name>_bcst_3): Ditto.
(<avx512>_fmadd_<mode>_mask<round_name>): Ditto.
(<avx512>_fmadd_<mode>_mask3<round_name>): Ditto.
(<avx512>_fmsub_<mode>_maskz<round_expand_name>): Ditto.
(<sd_mask_codefor>fma_fmsub_<mode><sd_maskz_name><round_name>):
Ditto.
(*<sd_mask_codefor>fma_fmsub_<mode><sd_maskz_name>_bcst_1): Ditto.
(*<sd_mask_codefor>fma_fmsub_<mode><sd_maskz_name>_bcst_2): Ditto.
(*<sd_mask_codefor>fma_fmsub_<mode><sd_maskz_name>_bcst_3): Ditto.
(<avx512>_fmsub_<mode>_mask<round_name>): Ditto.
(<avx512>_fmsub_<mode>_mask3<round_name>): Ditto.
(<sd_mask_codefor>fma_fnmadd_<mode><sd_maskz_name><round_name>):
Ditto.
(*<sd_mask_codefor>fma_fnmadd_<mode><sd_maskz_name>_bcst_1): Ditto.
(*<sd_mask_codefor>fma_fnmadd_<mode><sd_maskz_name>_bcst_2): Ditto.
(*<sd_mask_codefor>fma_fnmadd_<mode><sd_maskz_name>_bcst_3): Ditto.
(<avx512>_fnmadd_<mode>_mask<round_name>): Ditto.
(<avx512>_fnmadd_<mode>_mask3<round_name>): Ditto.
(<avx512>_fnmsub_<mode>_maskz<round_expand_name>): Ditto.
(<sd_mask_codefor>fma_fnmsub_<mode><sd_maskz_name><round_name>):
Ditto.
(*<sd_mask_codefor>fma_fnmsub_<mode><sd_maskz_name>_bcst_1): Ditto.
(*<sd_mask_codefor>fma_fnmsub_<mode><sd_maskz_name>_bcst_2): Ditto.
(*<sd_mask_codefor>fma_fnmsub_<mode><sd_maskz_name>_bcst_3): Ditto.
(<avx512>_fnmsub_<mode>_mask<round_name>): Ditto.
(<avx512>_fnmsub_<mode>_mask3<round_name>): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx-1.c: Add test for new builtins.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/sse-14.c: Add test fot new intrinsics.
* gcc.target/i386/sse-22.c: Ditto.
---
gcc/config/i386/avx512fp16intrin.h | 432 +++++++++++++++++++++++++
gcc/config/i386/avx512fp16vlintrin.h | 364 +++++++++++++++++++++
gcc/config/i386/i386-builtin.def | 36 +++
gcc/config/i386/sse.md | 196 +++++------
gcc/testsuite/gcc.target/i386/avx-1.c | 12 +
gcc/testsuite/gcc.target/i386/sse-13.c | 12 +
gcc/testsuite/gcc.target/i386/sse-14.c | 16 +
gcc/testsuite/gcc.target/i386/sse-22.c | 16 +
gcc/testsuite/gcc.target/i386/sse-23.c | 12 +
9 files changed, 999 insertions(+), 97 deletions(-)
diff --git a/gcc/config/i386/avx512fp16intrin.h b/gcc/config/i386/avx512fp16intrin.h
index 4092663b504..f246bab5159 100644
--- a/gcc/config/i386/avx512fp16intrin.h
+++ b/gcc/config/i386/avx512fp16intrin.h
@@ -5265,6 +5265,438 @@ _mm512_maskz_fmsubadd_round_ph (__mmask32 __U, __m512h __A, __m512h __B,
#endif /* __OPTIMIZE__ */
+/* Intrinsics vfmadd[132,213,231]ph. */
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+ _mm512_fmadd_ph (__m512h __A, __m512h __B, __m512h __C)
+{
+ return (__m512h)
+ __builtin_ia32_vfmaddph512_mask ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) -1,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_fmadd_ph (__m512h __A, __mmask32 __U, __m512h __B, __m512h __C)
+{
+ return (__m512h)
+ __builtin_ia32_vfmaddph512_mask ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask3_fmadd_ph (__m512h __A, __m512h __B, __m512h __C, __mmask32 __U)
+{
+ return (__m512h)
+ __builtin_ia32_vfmaddph512_mask3 ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_maskz_fmadd_ph (__mmask32 __U, __m512h __A, __m512h __B, __m512h __C)
+{
+ return (__m512h)
+ __builtin_ia32_vfmaddph512_maskz ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+#ifdef __OPTIMIZE__
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_fmadd_round_ph (__m512h __A, __m512h __B, __m512h __C, const int __R)
+{
+ return (__m512h) __builtin_ia32_vfmaddph512_mask ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) -1, __R);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_fmadd_round_ph (__m512h __A, __mmask32 __U, __m512h __B,
+ __m512h __C, const int __R)
+{
+ return (__m512h) __builtin_ia32_vfmaddph512_mask ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) __U, __R);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask3_fmadd_round_ph (__m512h __A, __m512h __B, __m512h __C,
+ __mmask32 __U, const int __R)
+{
+ return (__m512h) __builtin_ia32_vfmaddph512_mask3 ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) __U, __R);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_maskz_fmadd_round_ph (__mmask32 __U, __m512h __A, __m512h __B,
+ __m512h __C, const int __R)
+{
+ return (__m512h) __builtin_ia32_vfmaddph512_maskz ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) __U, __R);
+}
+
+#else
+#define _mm512_fmadd_round_ph(A, B, C, R) \
+ ((__m512h)__builtin_ia32_vfmaddph512_mask ((A), (B), (C), -1, (R)))
+
+#define _mm512_mask_fmadd_round_ph(A, U, B, C, R) \
+ ((__m512h)__builtin_ia32_vfmaddph512_mask ((A), (B), (C), (U), (R)))
+
+#define _mm512_mask3_fmadd_round_ph(A, B, C, U, R) \
+ ((__m512h)__builtin_ia32_vfmaddph512_mask3 ((A), (B), (C), (U), (R)))
+
+#define _mm512_maskz_fmadd_round_ph(U, A, B, C, R) \
+ ((__m512h)__builtin_ia32_vfmaddph512_maskz ((A), (B), (C), (U), (R)))
+
+#endif /* __OPTIMIZE__ */
+
+/* Intrinsics vfnmadd[132,213,231]ph. */
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_fnmadd_ph (__m512h __A, __m512h __B, __m512h __C)
+{
+ return (__m512h)
+ __builtin_ia32_vfnmaddph512_mask ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) -1,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_fnmadd_ph (__m512h __A, __mmask32 __U, __m512h __B, __m512h __C)
+{
+ return (__m512h)
+ __builtin_ia32_vfnmaddph512_mask ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask3_fnmadd_ph (__m512h __A, __m512h __B, __m512h __C, __mmask32 __U)
+{
+ return (__m512h)
+ __builtin_ia32_vfnmaddph512_mask3 ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_maskz_fnmadd_ph (__mmask32 __U, __m512h __A, __m512h __B, __m512h __C)
+{
+ return (__m512h)
+ __builtin_ia32_vfnmaddph512_maskz ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+#ifdef __OPTIMIZE__
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_fnmadd_round_ph (__m512h __A, __m512h __B, __m512h __C, const int __R)
+{
+ return (__m512h) __builtin_ia32_vfnmaddph512_mask ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) -1, __R);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_fnmadd_round_ph (__m512h __A, __mmask32 __U, __m512h __B,
+ __m512h __C, const int __R)
+{
+ return (__m512h) __builtin_ia32_vfnmaddph512_mask ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) __U, __R);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask3_fnmadd_round_ph (__m512h __A, __m512h __B, __m512h __C,
+ __mmask32 __U, const int __R)
+{
+ return (__m512h) __builtin_ia32_vfnmaddph512_mask3 ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) __U, __R);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_maskz_fnmadd_round_ph (__mmask32 __U, __m512h __A, __m512h __B,
+ __m512h __C, const int __R)
+{
+ return (__m512h) __builtin_ia32_vfnmaddph512_maskz ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) __U, __R);
+}
+
+#else
+#define _mm512_fnmadd_round_ph(A, B, C, R) \
+ ((__m512h)__builtin_ia32_vfnmaddph512_mask ((A), (B), (C), -1, (R)))
+
+#define _mm512_mask_fnmadd_round_ph(A, U, B, C, R) \
+ ((__m512h)__builtin_ia32_vfnmaddph512_mask ((A), (B), (C), (U), (R)))
+
+#define _mm512_mask3_fnmadd_round_ph(A, B, C, U, R) \
+ ((__m512h)__builtin_ia32_vfnmaddph512_mask3 ((A), (B), (C), (U), (R)))
+
+#define _mm512_maskz_fnmadd_round_ph(U, A, B, C, R) \
+ ((__m512h)__builtin_ia32_vfnmaddph512_maskz ((A), (B), (C), (U), (R)))
+
+#endif /* __OPTIMIZE__ */
+
+/* Intrinsics vfmsub[132,213,231]ph. */
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_fmsub_ph (__m512h __A, __m512h __B, __m512h __C)
+{
+ return (__m512h)
+ __builtin_ia32_vfmsubph512_mask ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) -1,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_fmsub_ph (__m512h __A, __mmask32 __U, __m512h __B, __m512h __C)
+{
+ return (__m512h)
+ __builtin_ia32_vfmsubph512_mask ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask3_fmsub_ph (__m512h __A, __m512h __B, __m512h __C, __mmask32 __U)
+{
+ return (__m512h)
+ __builtin_ia32_vfmsubph512_mask3 ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_maskz_fmsub_ph (__mmask32 __U, __m512h __A, __m512h __B, __m512h __C)
+{
+ return (__m512h)
+ __builtin_ia32_vfmsubph512_maskz ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+#ifdef __OPTIMIZE__
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_fmsub_round_ph (__m512h __A, __m512h __B, __m512h __C, const int __R)
+{
+ return (__m512h) __builtin_ia32_vfmsubph512_mask ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) -1, __R);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_fmsub_round_ph (__m512h __A, __mmask32 __U, __m512h __B,
+ __m512h __C, const int __R)
+{
+ return (__m512h) __builtin_ia32_vfmsubph512_mask ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) __U, __R);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask3_fmsub_round_ph (__m512h __A, __m512h __B, __m512h __C,
+ __mmask32 __U, const int __R)
+{
+ return (__m512h) __builtin_ia32_vfmsubph512_mask3 ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) __U, __R);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_maskz_fmsub_round_ph (__mmask32 __U, __m512h __A, __m512h __B,
+ __m512h __C, const int __R)
+{
+ return (__m512h) __builtin_ia32_vfmsubph512_maskz ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) __U, __R);
+}
+
+#else
+#define _mm512_fmsub_round_ph(A, B, C, R) \
+ ((__m512h)__builtin_ia32_vfmsubph512_mask((A), (B), (C), -1, (R)))
+
+#define _mm512_mask_fmsub_round_ph(A, U, B, C, R) \
+ ((__m512h)__builtin_ia32_vfmsubph512_mask((A), (B), (C), (U), (R)))
+
+#define _mm512_mask3_fmsub_round_ph(A, B, C, U, R) \
+ ((__m512h)__builtin_ia32_vfmsubph512_mask3((A), (B), (C), (U), (R)))
+
+#define _mm512_maskz_fmsub_round_ph(U, A, B, C, R) \
+ ((__m512h)__builtin_ia32_vfmsubph512_maskz((A), (B), (C), (U), (R)))
+
+#endif /* __OPTIMIZE__ */
+
+/* Intrinsics vfnmsub[132,213,231]ph. */
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_fnmsub_ph (__m512h __A, __m512h __B, __m512h __C)
+{
+ return (__m512h)
+ __builtin_ia32_vfnmsubph512_mask ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) -1,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_fnmsub_ph (__m512h __A, __mmask32 __U, __m512h __B, __m512h __C)
+{
+ return (__m512h)
+ __builtin_ia32_vfnmsubph512_mask ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask3_fnmsub_ph (__m512h __A, __m512h __B, __m512h __C, __mmask32 __U)
+{
+ return (__m512h)
+ __builtin_ia32_vfnmsubph512_mask3 ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_maskz_fnmsub_ph (__mmask32 __U, __m512h __A, __m512h __B, __m512h __C)
+{
+ return (__m512h)
+ __builtin_ia32_vfnmsubph512_maskz ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+#ifdef __OPTIMIZE__
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_fnmsub_round_ph (__m512h __A, __m512h __B, __m512h __C, const int __R)
+{
+ return (__m512h) __builtin_ia32_vfnmsubph512_mask ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) -1, __R);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_fnmsub_round_ph (__m512h __A, __mmask32 __U, __m512h __B,
+ __m512h __C, const int __R)
+{
+ return (__m512h) __builtin_ia32_vfnmsubph512_mask ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) __U, __R);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask3_fnmsub_round_ph (__m512h __A, __m512h __B, __m512h __C,
+ __mmask32 __U, const int __R)
+{
+ return (__m512h) __builtin_ia32_vfnmsubph512_mask3 ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) __U, __R);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_maskz_fnmsub_round_ph (__mmask32 __U, __m512h __A, __m512h __B,
+ __m512h __C, const int __R)
+{
+ return (__m512h) __builtin_ia32_vfnmsubph512_maskz ((__v32hf) __A,
+ (__v32hf) __B,
+ (__v32hf) __C,
+ (__mmask32) __U, __R);
+}
+
+#else
+#define _mm512_fnmsub_round_ph(A, B, C, R) \
+ ((__m512h)__builtin_ia32_vfnmsubph512_mask ((A), (B), (C), -1, (R)))
+
+#define _mm512_mask_fnmsub_round_ph(A, U, B, C, R) \
+ ((__m512h)__builtin_ia32_vfnmsubph512_mask ((A), (B), (C), (U), (R)))
+
+#define _mm512_mask3_fnmsub_round_ph(A, B, C, U, R) \
+ ((__m512h)__builtin_ia32_vfnmsubph512_mask3 ((A), (B), (C), (U), (R)))
+
+#define _mm512_maskz_fnmsub_round_ph(U, A, B, C, R) \
+ ((__m512h)__builtin_ia32_vfnmsubph512_maskz ((A), (B), (C), (U), (R)))
+
+#endif /* __OPTIMIZE__ */
+
#ifdef __DISABLE_AVX512FP16__
#undef __DISABLE_AVX512FP16__
#pragma GCC pop_options
diff --git a/gcc/config/i386/avx512fp16vlintrin.h b/gcc/config/i386/avx512fp16vlintrin.h
index 8825fae52aa..bba98f105ac 100644
--- a/gcc/config/i386/avx512fp16vlintrin.h
+++ b/gcc/config/i386/avx512fp16vlintrin.h
@@ -2451,6 +2451,370 @@ _mm_maskz_fmsubadd_ph (__mmask8 __U, __m128h __A, __m128h __B,
__U);
}
+/* Intrinsics vfmadd[132,213,231]ph. */
+extern __inline __m256h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_fmadd_ph (__m256h __A, __m256h __B, __m256h __C)
+{
+ return (__m256h) __builtin_ia32_vfmaddph256_mask ((__v16hf) __A,
+ (__v16hf) __B,
+ (__v16hf) __C,
+ (__mmask16) -1);
+}
+
+extern __inline __m256h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_mask_fmadd_ph (__m256h __A, __mmask16 __U, __m256h __B,
+ __m256h __C)
+{
+ return (__m256h) __builtin_ia32_vfmaddph256_mask ((__v16hf) __A,
+ (__v16hf) __B,
+ (__v16hf) __C,
+ (__mmask16) __U);
+}
+
+extern __inline __m256h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_mask3_fmadd_ph (__m256h __A, __m256h __B, __m256h __C,
+ __mmask16 __U)
+{
+ return (__m256h) __builtin_ia32_vfmaddph256_mask3 ((__v16hf) __A,
+ (__v16hf) __B,
+ (__v16hf) __C,
+ (__mmask16)
+ __U);
+}
+
+extern __inline __m256h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_maskz_fmadd_ph (__mmask16 __U, __m256h __A, __m256h __B,
+ __m256h __C)
+{
+ return (__m256h) __builtin_ia32_vfmaddph256_maskz ((__v16hf) __A,
+ (__v16hf) __B,
+ (__v16hf) __C,
+ (__mmask16)
+ __U);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_fmadd_ph (__m128h __A, __m128h __B, __m128h __C)
+{
+ return (__m128h) __builtin_ia32_vfmaddph128_mask ((__v8hf) __A,
+ (__v8hf) __B,
+ (__v8hf) __C,
+ (__mmask8) -1);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_fmadd_ph (__m128h __A, __mmask8 __U, __m128h __B,
+ __m128h __C)
+{
+ return (__m128h) __builtin_ia32_vfmaddph128_mask ((__v8hf) __A,
+ (__v8hf) __B,
+ (__v8hf) __C,
+ (__mmask8) __U);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask3_fmadd_ph (__m128h __A, __m128h __B, __m128h __C,
+ __mmask8 __U)
+{
+ return (__m128h) __builtin_ia32_vfmaddph128_mask3 ((__v8hf) __A,
+ (__v8hf) __B,
+ (__v8hf) __C,
+ (__mmask8)
+ __U);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskz_fmadd_ph (__mmask8 __U, __m128h __A, __m128h __B,
+ __m128h __C)
+{
+ return (__m128h) __builtin_ia32_vfmaddph128_maskz ((__v8hf) __A,
+ (__v8hf) __B,
+ (__v8hf) __C,
+ (__mmask8)
+ __U);
+}
+
+/* Intrinsics vfnmadd[132,213,231]ph. */
+extern __inline __m256h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_fnmadd_ph (__m256h __A, __m256h __B, __m256h __C)
+{
+ return (__m256h) __builtin_ia32_vfnmaddph256_mask ((__v16hf) __A,
+ (__v16hf) __B,
+ (__v16hf) __C,
+ (__mmask16) -1);
+}
+
+extern __inline __m256h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_mask_fnmadd_ph (__m256h __A, __mmask16 __U, __m256h __B,
+ __m256h __C)
+{
+ return (__m256h) __builtin_ia32_vfnmaddph256_mask ((__v16hf) __A,
+ (__v16hf) __B,
+ (__v16hf) __C,
+ (__mmask16) __U);
+}
+
+extern __inline __m256h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_mask3_fnmadd_ph (__m256h __A, __m256h __B, __m256h __C,
+ __mmask16 __U)
+{
+ return (__m256h) __builtin_ia32_vfnmaddph256_mask3 ((__v16hf) __A,
+ (__v16hf) __B,
+ (__v16hf) __C,
+ (__mmask16)
+ __U);
+}
+
+extern __inline __m256h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_maskz_fnmadd_ph (__mmask16 __U, __m256h __A, __m256h __B,
+ __m256h __C)
+{
+ return (__m256h) __builtin_ia32_vfnmaddph256_maskz ((__v16hf) __A,
+ (__v16hf) __B,
+ (__v16hf) __C,
+ (__mmask16)
+ __U);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_fnmadd_ph (__m128h __A, __m128h __B, __m128h __C)
+{
+ return (__m128h) __builtin_ia32_vfnmaddph128_mask ((__v8hf) __A,
+ (__v8hf) __B,
+ (__v8hf) __C,
+ (__mmask8) -1);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_fnmadd_ph (__m128h __A, __mmask8 __U, __m128h __B,
+ __m128h __C)
+{
+ return (__m128h) __builtin_ia32_vfnmaddph128_mask ((__v8hf) __A,
+ (__v8hf) __B,
+ (__v8hf) __C,
+ (__mmask8) __U);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask3_fnmadd_ph (__m128h __A, __m128h __B, __m128h __C,
+ __mmask8 __U)
+{
+ return (__m128h) __builtin_ia32_vfnmaddph128_mask3 ((__v8hf) __A,
+ (__v8hf) __B,
+ (__v8hf) __C,
+ (__mmask8)
+ __U);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskz_fnmadd_ph (__mmask8 __U, __m128h __A, __m128h __B,
+ __m128h __C)
+{
+ return (__m128h) __builtin_ia32_vfnmaddph128_maskz ((__v8hf) __A,
+ (__v8hf) __B,
+ (__v8hf) __C,
+ (__mmask8)
+ __U);
+}
+
+/* Intrinsics vfmsub[132,213,231]ph. */
+extern __inline __m256h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_fmsub_ph (__m256h __A, __m256h __B, __m256h __C)
+{
+ return (__m256h) __builtin_ia32_vfmsubph256_mask ((__v16hf) __A,
+ (__v16hf) __B,
+ (__v16hf) __C,
+ (__mmask16) -1);
+}
+
+extern __inline __m256h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_mask_fmsub_ph (__m256h __A, __mmask16 __U, __m256h __B,
+ __m256h __C)
+{
+ return (__m256h) __builtin_ia32_vfmsubph256_mask ((__v16hf) __A,
+ (__v16hf) __B,
+ (__v16hf) __C,
+ (__mmask16) __U);
+}
+
+extern __inline __m256h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_mask3_fmsub_ph (__m256h __A, __m256h __B, __m256h __C,
+ __mmask16 __U)
+{
+ return (__m256h) __builtin_ia32_vfmsubph256_mask3 ((__v16hf) __A,
+ (__v16hf) __B,
+ (__v16hf) __C,
+ (__mmask16)
+ __U);
+}
+
+extern __inline __m256h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_maskz_fmsub_ph (__mmask16 __U, __m256h __A, __m256h __B,
+ __m256h __C)
+{
+ return (__m256h) __builtin_ia32_vfmsubph256_maskz ((__v16hf) __A,
+ (__v16hf) __B,
+ (__v16hf) __C,
+ (__mmask16)
+ __U);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_fmsub_ph (__m128h __A, __m128h __B, __m128h __C)
+{
+ return (__m128h) __builtin_ia32_vfmsubph128_mask ((__v8hf) __A,
+ (__v8hf) __B,
+ (__v8hf) __C,
+ (__mmask8) -1);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_fmsub_ph (__m128h __A, __mmask8 __U, __m128h __B,
+ __m128h __C)
+{
+ return (__m128h) __builtin_ia32_vfmsubph128_mask ((__v8hf) __A,
+ (__v8hf) __B,
+ (__v8hf) __C,
+ (__mmask8) __U);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask3_fmsub_ph (__m128h __A, __m128h __B, __m128h __C,
+ __mmask8 __U)
+{
+ return (__m128h) __builtin_ia32_vfmsubph128_mask3 ((__v8hf) __A,
+ (__v8hf) __B,
+ (__v8hf) __C,
+ (__mmask8)
+ __U);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskz_fmsub_ph (__mmask8 __U, __m128h __A, __m128h __B,
+ __m128h __C)
+{
+ return (__m128h) __builtin_ia32_vfmsubph128_maskz ((__v8hf) __A,
+ (__v8hf) __B,
+ (__v8hf) __C,
+ (__mmask8)
+ __U);
+}
+
+/* Intrinsics vfnmsub[132,213,231]ph. */
+extern __inline __m256h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_fnmsub_ph (__m256h __A, __m256h __B, __m256h __C)
+{
+ return (__m256h) __builtin_ia32_vfnmsubph256_mask ((__v16hf) __A,
+ (__v16hf) __B,
+ (__v16hf) __C,
+ (__mmask16) -1);
+}
+
+extern __inline __m256h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_mask_fnmsub_ph (__m256h __A, __mmask16 __U, __m256h __B,
+ __m256h __C)
+{
+ return (__m256h) __builtin_ia32_vfnmsubph256_mask ((__v16hf) __A,
+ (__v16hf) __B,
+ (__v16hf) __C,
+ (__mmask16) __U);
+}
+
+extern __inline __m256h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_mask3_fnmsub_ph (__m256h __A, __m256h __B, __m256h __C,
+ __mmask16 __U)
+{
+ return (__m256h) __builtin_ia32_vfnmsubph256_mask3 ((__v16hf) __A,
+ (__v16hf) __B,
+ (__v16hf) __C,
+ (__mmask16)
+ __U);
+}
+
+extern __inline __m256h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_maskz_fnmsub_ph (__mmask16 __U, __m256h __A, __m256h __B,
+ __m256h __C)
+{
+ return (__m256h) __builtin_ia32_vfnmsubph256_maskz ((__v16hf) __A,
+ (__v16hf) __B,
+ (__v16hf) __C,
+ (__mmask16)
+ __U);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_fnmsub_ph (__m128h __A, __m128h __B, __m128h __C)
+{
+ return (__m128h) __builtin_ia32_vfnmsubph128_mask ((__v8hf) __A,
+ (__v8hf) __B,
+ (__v8hf) __C,
+ (__mmask8) -1);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_fnmsub_ph (__m128h __A, __mmask8 __U, __m128h __B,
+ __m128h __C)
+{
+ return (__m128h) __builtin_ia32_vfnmsubph128_mask ((__v8hf) __A,
+ (__v8hf) __B,
+ (__v8hf) __C,
+ (__mmask8) __U);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask3_fnmsub_ph (__m128h __A, __m128h __B, __m128h __C,
+ __mmask8 __U)
+{
+ return (__m128h) __builtin_ia32_vfnmsubph128_mask3 ((__v8hf) __A,
+ (__v8hf) __B,
+ (__v8hf) __C,
+ (__mmask8)
+ __U);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskz_fnmsub_ph (__mmask8 __U, __m128h __A, __m128h __B,
+ __m128h __C)
+{
+ return (__m128h) __builtin_ia32_vfnmsubph128_maskz ((__v8hf) __A,
+ (__v8hf) __B,
+ (__v8hf) __C,
+ (__mmask8)
+ __U);
+}
+
#ifdef __DISABLE_AVX512FP16VL__
#undef __DISABLE_AVX512FP16VL__
#pragma GCC pop_options
diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
index 42bba719ec3..cf0259843cc 100644
--- a/gcc/config/i386/i386-builtin.def
+++ b/gcc/config/i386/i386-builtin.def
@@ -2887,6 +2887,30 @@ BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512vl_
BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512fp16_fmsubadd_v8hf_mask, "__builtin_ia32_vfmsubaddph128_mask", IX86_BUILTIN_VFMSUBADDPH128_MASK, UNKNOWN, (int) V8HF_FTYPE_V8HF_V8HF_V8HF_UQI)
BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512fp16_fmsubadd_v8hf_mask3, "__builtin_ia32_vfmsubaddph128_mask3", IX86_BUILTIN_VFMSUBADDPH128_MASK3, UNKNOWN, (int) V8HF_FTYPE_V8HF_V8HF_V8HF_UQI)
BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512fp16_fmsubadd_v8hf_maskz, "__builtin_ia32_vfmsubaddph128_maskz", IX86_BUILTIN_VFMSUBADDPH128_MASKZ, UNKNOWN, (int) V8HF_FTYPE_V8HF_V8HF_V8HF_UQI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512vl_fmadd_v16hf_mask, "__builtin_ia32_vfmaddph256_mask", IX86_BUILTIN_VFMADDPH256_MASK, UNKNOWN, (int) V16HF_FTYPE_V16HF_V16HF_V16HF_UHI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512vl_fmadd_v16hf_mask3, "__builtin_ia32_vfmaddph256_mask3", IX86_BUILTIN_VFMADDPH256_MASK3, UNKNOWN, (int) V16HF_FTYPE_V16HF_V16HF_V16HF_UHI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512vl_fmadd_v16hf_maskz, "__builtin_ia32_vfmaddph256_maskz", IX86_BUILTIN_VFMADDPH256_MASKZ, UNKNOWN, (int) V16HF_FTYPE_V16HF_V16HF_V16HF_UHI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512fp16_fmadd_v8hf_mask, "__builtin_ia32_vfmaddph128_mask", IX86_BUILTIN_VFMADDPH128_MASK, UNKNOWN, (int) V8HF_FTYPE_V8HF_V8HF_V8HF_UQI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512fp16_fmadd_v8hf_mask3, "__builtin_ia32_vfmaddph128_mask3", IX86_BUILTIN_VFMADDPH128_MASK3, UNKNOWN, (int) V8HF_FTYPE_V8HF_V8HF_V8HF_UQI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512fp16_fmadd_v8hf_maskz, "__builtin_ia32_vfmaddph128_maskz", IX86_BUILTIN_VFMADDPH128_MASKZ, UNKNOWN, (int) V8HF_FTYPE_V8HF_V8HF_V8HF_UQI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512vl_fnmadd_v16hf_mask, "__builtin_ia32_vfnmaddph256_mask", IX86_BUILTIN_VFNMADDPH256_MASK, UNKNOWN, (int) V16HF_FTYPE_V16HF_V16HF_V16HF_UHI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512vl_fnmadd_v16hf_mask3, "__builtin_ia32_vfnmaddph256_mask3", IX86_BUILTIN_VFNMADDPH256_MASK3, UNKNOWN, (int) V16HF_FTYPE_V16HF_V16HF_V16HF_UHI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512vl_fnmadd_v16hf_maskz, "__builtin_ia32_vfnmaddph256_maskz", IX86_BUILTIN_VFNMADDPH256_MASKZ, UNKNOWN, (int) V16HF_FTYPE_V16HF_V16HF_V16HF_UHI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512fp16_fnmadd_v8hf_mask, "__builtin_ia32_vfnmaddph128_mask", IX86_BUILTIN_VFNMADDPH128_MASK, UNKNOWN, (int) V8HF_FTYPE_V8HF_V8HF_V8HF_UQI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512fp16_fnmadd_v8hf_mask3, "__builtin_ia32_vfnmaddph128_mask3", IX86_BUILTIN_VFNMADDPH128_MASK3, UNKNOWN, (int) V8HF_FTYPE_V8HF_V8HF_V8HF_UQI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512fp16_fnmadd_v8hf_maskz, "__builtin_ia32_vfnmaddph128_maskz", IX86_BUILTIN_VFNMADDPH128_MASKZ, UNKNOWN, (int) V8HF_FTYPE_V8HF_V8HF_V8HF_UQI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512vl_fmsub_v16hf_mask, "__builtin_ia32_vfmsubph256_mask", IX86_BUILTIN_VFMSUBPH256_MASK, UNKNOWN, (int) V16HF_FTYPE_V16HF_V16HF_V16HF_UHI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512vl_fmsub_v16hf_mask3, "__builtin_ia32_vfmsubph256_mask3", IX86_BUILTIN_VFMSUBPH256_MASK3, UNKNOWN, (int) V16HF_FTYPE_V16HF_V16HF_V16HF_UHI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512vl_fmsub_v16hf_maskz, "__builtin_ia32_vfmsubph256_maskz", IX86_BUILTIN_VFMSUBPH256_MASKZ, UNKNOWN, (int) V16HF_FTYPE_V16HF_V16HF_V16HF_UHI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512fp16_fmsub_v8hf_mask, "__builtin_ia32_vfmsubph128_mask", IX86_BUILTIN_VFMSUBPH128_MASK, UNKNOWN, (int) V8HF_FTYPE_V8HF_V8HF_V8HF_UQI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512fp16_fmsub_v8hf_mask3, "__builtin_ia32_vfmsubph128_mask3", IX86_BUILTIN_VFMSUBPH128_MASK3, UNKNOWN, (int) V8HF_FTYPE_V8HF_V8HF_V8HF_UQI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512fp16_fmsub_v8hf_maskz, "__builtin_ia32_vfmsubph128_maskz", IX86_BUILTIN_VFMSUBPH128_MASKZ, UNKNOWN, (int) V8HF_FTYPE_V8HF_V8HF_V8HF_UQI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512vl_fnmsub_v16hf_mask, "__builtin_ia32_vfnmsubph256_mask", IX86_BUILTIN_VFNMSUBPH256_MASK, UNKNOWN, (int) V16HF_FTYPE_V16HF_V16HF_V16HF_UHI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512vl_fnmsub_v16hf_mask3, "__builtin_ia32_vfnmsubph256_mask3", IX86_BUILTIN_VFNMSUBPH256_MASK3, UNKNOWN, (int) V16HF_FTYPE_V16HF_V16HF_V16HF_UHI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512vl_fnmsub_v16hf_maskz, "__builtin_ia32_vfnmsubph256_maskz", IX86_BUILTIN_VFNMSUBPH256_MASKZ, UNKNOWN, (int) V16HF_FTYPE_V16HF_V16HF_V16HF_UHI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512fp16_fnmsub_v8hf_mask, "__builtin_ia32_vfnmsubph128_mask", IX86_BUILTIN_VFNMSUBPH128_MASK, UNKNOWN, (int) V8HF_FTYPE_V8HF_V8HF_V8HF_UQI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512fp16_fnmsub_v8hf_mask3, "__builtin_ia32_vfnmsubph128_mask3", IX86_BUILTIN_VFNMSUBPH128_MASK3, UNKNOWN, (int) V8HF_FTYPE_V8HF_V8HF_V8HF_UQI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512fp16_fnmsub_v8hf_maskz, "__builtin_ia32_vfnmsubph128_maskz", IX86_BUILTIN_VFNMSUBPH128_MASKZ, UNKNOWN, (int) V8HF_FTYPE_V8HF_V8HF_V8HF_UQI)
/* Builtins with rounding support. */
BDESC_END (ARGS, ROUND_ARGS)
@@ -3158,6 +3182,18 @@ BDESC (0, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512bw_fmaddsub_v32hf_maskz_ro
BDESC (0, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512bw_fmsubadd_v32hf_mask_round, "__builtin_ia32_vfmsubaddph512_mask", IX86_BUILTIN_VFMSUBADDPH512_MASK, UNKNOWN, (int) V32HF_FTYPE_V32HF_V32HF_V32HF_USI_INT)
BDESC (0, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512bw_fmsubadd_v32hf_mask3_round, "__builtin_ia32_vfmsubaddph512_mask3", IX86_BUILTIN_VFMSUBADDPH512_MASK3, UNKNOWN, (int) V32HF_FTYPE_V32HF_V32HF_V32HF_USI_INT)
BDESC (0, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512bw_fmsubadd_v32hf_maskz_round, "__builtin_ia32_vfmsubaddph512_maskz", IX86_BUILTIN_VFMSUBADDPH512_MASKZ, UNKNOWN, (int) V32HF_FTYPE_V32HF_V32HF_V32HF_USI_INT)
+BDESC (0, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512bw_fmadd_v32hf_mask_round, "__builtin_ia32_vfmaddph512_mask", IX86_BUILTIN_VFMADDPH512_MASK, UNKNOWN, (int) V32HF_FTYPE_V32HF_V32HF_V32HF_USI_INT)
+BDESC (0, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512bw_fmadd_v32hf_mask3_round, "__builtin_ia32_vfmaddph512_mask3", IX86_BUILTIN_VFMADDPH512_MASK3, UNKNOWN, (int) V32HF_FTYPE_V32HF_V32HF_V32HF_USI_INT)
+BDESC (0, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512bw_fmadd_v32hf_maskz_round, "__builtin_ia32_vfmaddph512_maskz", IX86_BUILTIN_VFMADDPH512_MASKZ, UNKNOWN, (int) V32HF_FTYPE_V32HF_V32HF_V32HF_USI_INT)
+BDESC (0, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512bw_fnmadd_v32hf_mask_round, "__builtin_ia32_vfnmaddph512_mask", IX86_BUILTIN_VFNMADDPH512_MASK, UNKNOWN, (int) V32HF_FTYPE_V32HF_V32HF_V32HF_USI_INT)
+BDESC (0, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512bw_fnmadd_v32hf_mask3_round, "__builtin_ia32_vfnmaddph512_mask3", IX86_BUILTIN_VFNMADDPH512_MASK3, UNKNOWN, (int) V32HF_FTYPE_V32HF_V32HF_V32HF_USI_INT)
+BDESC (0, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512bw_fnmadd_v32hf_maskz_round, "__builtin_ia32_vfnmaddph512_maskz", IX86_BUILTIN_VFNMADDPH512_MASKZ, UNKNOWN, (int) V32HF_FTYPE_V32HF_V32HF_V32HF_USI_INT)
+BDESC (0, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512bw_fmsub_v32hf_mask_round, "__builtin_ia32_vfmsubph512_mask", IX86_BUILTIN_VFMSUBPH512_MASK, UNKNOWN, (int) V32HF_FTYPE_V32HF_V32HF_V32HF_USI_INT)
+BDESC (0, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512bw_fmsub_v32hf_mask3_round, "__builtin_ia32_vfmsubph512_mask3", IX86_BUILTIN_VFMSUBPH512_MASK3, UNKNOWN, (int) V32HF_FTYPE_V32HF_V32HF_V32HF_USI_INT)
+BDESC (0, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512bw_fmsub_v32hf_maskz_round, "__builtin_ia32_vfmsubph512_maskz", IX86_BUILTIN_VFMSUBPH512_MASKZ, UNKNOWN, (int) V32HF_FTYPE_V32HF_V32HF_V32HF_USI_INT)
+BDESC (0, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512bw_fnmsub_v32hf_mask_round, "__builtin_ia32_vfnmsubph512_mask", IX86_BUILTIN_VFNMSUBPH512_MASK, UNKNOWN, (int) V32HF_FTYPE_V32HF_V32HF_V32HF_USI_INT)
+BDESC (0, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512bw_fnmsub_v32hf_mask3_round, "__builtin_ia32_vfnmsubph512_mask3", IX86_BUILTIN_VFNMSUBPH512_MASK3, UNKNOWN, (int) V32HF_FTYPE_V32HF_V32HF_V32HF_USI_INT)
+BDESC (0, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512bw_fnmsub_v32hf_maskz_round, "__builtin_ia32_vfnmsubph512_maskz", IX86_BUILTIN_VFNMSUBPH512_MASKZ, UNKNOWN, (int) V32HF_FTYPE_V32HF_V32HF_V32HF_USI_INT)
BDESC_END (ROUND_ARGS, MULTI_ARG)
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 847684e232e..fdcc0515228 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -825,7 +825,9 @@ (define_mode_attr avx512bcst
(V16SI "%{1to16%}") (V8DI "%{1to8%}")
(V4SF "%{1to4%}") (V2DF "%{1to2%}")
(V8SF "%{1to8%}") (V4DF "%{1to4%}")
- (V16SF "%{1to16%}") (V8DF "%{1to8%}")])
+ (V16SF "%{1to16%}") (V8DF "%{1to8%}")
+ (V8HF "%{1to8%}") (V16HF "%{1to16%}")
+ (V32HF "%{1to32%}")])
;; Mapping from float mode to required SSE level
(define_mode_attr sse
@@ -4507,10 +4509,10 @@ (define_expand "fma4i_fnmsub_<mode>"
(match_operand:FMAMODE_AVX512 3 "nonimmediate_operand"))))])
(define_expand "<avx512>_fmadd_<mode>_maskz<round_expand_name>"
- [(match_operand:VF_AVX512VL 0 "register_operand")
- (match_operand:VF_AVX512VL 1 "<round_expand_nimm_predicate>")
- (match_operand:VF_AVX512VL 2 "<round_expand_nimm_predicate>")
- (match_operand:VF_AVX512VL 3 "<round_expand_nimm_predicate>")
+ [(match_operand:VFH_AVX512VL 0 "register_operand")
+ (match_operand:VFH_AVX512VL 1 "<round_expand_nimm_predicate>")
+ (match_operand:VFH_AVX512VL 2 "<round_expand_nimm_predicate>")
+ (match_operand:VFH_AVX512VL 3 "<round_expand_nimm_predicate>")
(match_operand:<avx512fmaskmode> 4 "register_operand")]
"TARGET_AVX512F && <round_mode512bit_condition>"
{
@@ -4550,11 +4552,11 @@ (define_mode_iterator VFH_SF_AVX512VL
DF V8DF (V4DF "TARGET_AVX512VL") (V2DF "TARGET_AVX512VL")])
(define_insn "<sd_mask_codefor>fma_fmadd_<mode><sd_maskz_name><round_name>"
- [(set (match_operand:VF_SF_AVX512VL 0 "register_operand" "=v,v,v")
- (fma:VF_SF_AVX512VL
- (match_operand:VF_SF_AVX512VL 1 "<bcst_round_nimm_predicate>" "%0,0,v")
- (match_operand:VF_SF_AVX512VL 2 "<bcst_round_nimm_predicate>" "<bcst_round_constraint>,v,<bcst_round_constraint>")
- (match_operand:VF_SF_AVX512VL 3 "<bcst_round_nimm_predicate>" "v,<bcst_round_constraint>,0")))]
+ [(set (match_operand:VFH_SF_AVX512VL 0 "register_operand" "=v,v,v")
+ (fma:VFH_SF_AVX512VL
+ (match_operand:VFH_SF_AVX512VL 1 "<bcst_round_nimm_predicate>" "%0,0,v")
+ (match_operand:VFH_SF_AVX512VL 2 "<bcst_round_nimm_predicate>" "<bcst_round_constraint>,v,<bcst_round_constraint>")
+ (match_operand:VFH_SF_AVX512VL 3 "<bcst_round_nimm_predicate>" "v,<bcst_round_constraint>,0")))]
"TARGET_AVX512F && <sd_mask_mode512bit_condition> && <round_mode512bit_condition>"
"@
vfmadd132<ssemodesuffix>\t{<round_sd_mask_op4>%2, %3, %0<sd_mask_op4>|%0<sd_mask_op4>, %3, %2<round_sd_mask_op4>}
@@ -4564,12 +4566,12 @@ (define_insn "<sd_mask_codefor>fma_fmadd_<mode><sd_maskz_name><round_name>"
(set_attr "mode" "<MODE>")])
(define_insn "<avx512>_fmadd_<mode>_mask<round_name>"
- [(set (match_operand:VF_AVX512VL 0 "register_operand" "=v,v")
- (vec_merge:VF_AVX512VL
- (fma:VF_AVX512VL
- (match_operand:VF_AVX512VL 1 "register_operand" "0,0")
- (match_operand:VF_AVX512VL 2 "<round_nimm_predicate>" "<round_constraint>,v")
- (match_operand:VF_AVX512VL 3 "<round_nimm_predicate>" "v,<round_constraint>"))
+ [(set (match_operand:VFH_AVX512VL 0 "register_operand" "=v,v")
+ (vec_merge:VFH_AVX512VL
+ (fma:VFH_AVX512VL
+ (match_operand:VFH_AVX512VL 1 "register_operand" "0,0")
+ (match_operand:VFH_AVX512VL 2 "<round_nimm_predicate>" "<round_constraint>,v")
+ (match_operand:VFH_AVX512VL 3 "<round_nimm_predicate>" "v,<round_constraint>"))
(match_dup 1)
(match_operand:<avx512fmaskmode> 4 "register_operand" "Yk,Yk")))]
"TARGET_AVX512F && <round_mode512bit_condition>"
@@ -4580,12 +4582,12 @@ (define_insn "<avx512>_fmadd_<mode>_mask<round_name>"
(set_attr "mode" "<MODE>")])
(define_insn "<avx512>_fmadd_<mode>_mask3<round_name>"
- [(set (match_operand:VF_AVX512VL 0 "register_operand" "=v")
- (vec_merge:VF_AVX512VL
- (fma:VF_AVX512VL
- (match_operand:VF_AVX512VL 1 "<round_nimm_predicate>" "%v")
- (match_operand:VF_AVX512VL 2 "<round_nimm_predicate>" "<round_constraint>")
- (match_operand:VF_AVX512VL 3 "register_operand" "0"))
+ [(set (match_operand:VFH_AVX512VL 0 "register_operand" "=v")
+ (vec_merge:VFH_AVX512VL
+ (fma:VFH_AVX512VL
+ (match_operand:VFH_AVX512VL 1 "<round_nimm_predicate>" "%v")
+ (match_operand:VFH_AVX512VL 2 "<round_nimm_predicate>" "<round_constraint>")
+ (match_operand:VFH_AVX512VL 3 "register_operand" "0"))
(match_dup 3)
(match_operand:<avx512fmaskmode> 4 "register_operand" "Yk")))]
"TARGET_AVX512F"
@@ -4612,10 +4614,10 @@ (define_insn "*fma_fmsub_<mode>"
(set_attr "mode" "<MODE>")])
(define_expand "<avx512>_fmsub_<mode>_maskz<round_expand_name>"
- [(match_operand:VF_AVX512VL 0 "register_operand")
- (match_operand:VF_AVX512VL 1 "<round_expand_nimm_predicate>")
- (match_operand:VF_AVX512VL 2 "<round_expand_nimm_predicate>")
- (match_operand:VF_AVX512VL 3 "<round_expand_nimm_predicate>")
+ [(match_operand:VFH_AVX512VL 0 "register_operand")
+ (match_operand:VFH_AVX512VL 1 "<round_expand_nimm_predicate>")
+ (match_operand:VFH_AVX512VL 2 "<round_expand_nimm_predicate>")
+ (match_operand:VFH_AVX512VL 3 "<round_expand_nimm_predicate>")
(match_operand:<avx512fmaskmode> 4 "register_operand")]
"TARGET_AVX512F && <round_mode512bit_condition>"
{
@@ -4626,12 +4628,12 @@ (define_expand "<avx512>_fmsub_<mode>_maskz<round_expand_name>"
})
(define_insn "<sd_mask_codefor>fma_fmsub_<mode><sd_maskz_name><round_name>"
- [(set (match_operand:VF_SF_AVX512VL 0 "register_operand" "=v,v,v")
- (fma:VF_SF_AVX512VL
- (match_operand:VF_SF_AVX512VL 1 "<bcst_round_nimm_predicate>" "%0,0,v")
- (match_operand:VF_SF_AVX512VL 2 "<bcst_round_nimm_predicate>" "<bcst_round_constraint>,v,<bcst_round_constraint>")
- (neg:VF_SF_AVX512VL
- (match_operand:VF_SF_AVX512VL 3 "<bcst_round_nimm_predicate>" "v,<bcst_round_constraint>,0"))))]
+ [(set (match_operand:VFH_SF_AVX512VL 0 "register_operand" "=v,v,v")
+ (fma:VFH_SF_AVX512VL
+ (match_operand:VFH_SF_AVX512VL 1 "<bcst_round_nimm_predicate>" "%0,0,v")
+ (match_operand:VFH_SF_AVX512VL 2 "<bcst_round_nimm_predicate>" "<bcst_round_constraint>,v,<bcst_round_constraint>")
+ (neg:VFH_SF_AVX512VL
+ (match_operand:VFH_SF_AVX512VL 3 "<bcst_round_nimm_predicate>" "v,<bcst_round_constraint>,0"))))]
"TARGET_AVX512F && <sd_mask_mode512bit_condition> && <round_mode512bit_condition>"
"@
vfmsub132<ssemodesuffix>\t{<round_sd_mask_op4>%2, %3, %0<sd_mask_op4>|%0<sd_mask_op4>, %3, %2<round_sd_mask_op4>}
@@ -4641,13 +4643,13 @@ (define_insn "<sd_mask_codefor>fma_fmsub_<mode><sd_maskz_name><round_name>"
(set_attr "mode" "<MODE>")])
(define_insn "<avx512>_fmsub_<mode>_mask<round_name>"
- [(set (match_operand:VF_AVX512VL 0 "register_operand" "=v,v")
- (vec_merge:VF_AVX512VL
- (fma:VF_AVX512VL
- (match_operand:VF_AVX512VL 1 "register_operand" "0,0")
- (match_operand:VF_AVX512VL 2 "<round_nimm_predicate>" "<round_constraint>,v")
- (neg:VF_AVX512VL
- (match_operand:VF_AVX512VL 3 "<round_nimm_predicate>" "v,<round_constraint>")))
+ [(set (match_operand:VFH_AVX512VL 0 "register_operand" "=v,v")
+ (vec_merge:VFH_AVX512VL
+ (fma:VFH_AVX512VL
+ (match_operand:VFH_AVX512VL 1 "register_operand" "0,0")
+ (match_operand:VFH_AVX512VL 2 "<round_nimm_predicate>" "<round_constraint>,v")
+ (neg:VFH_AVX512VL
+ (match_operand:VFH_AVX512VL 3 "<round_nimm_predicate>" "v,<round_constraint>")))
(match_dup 1)
(match_operand:<avx512fmaskmode> 4 "register_operand" "Yk,Yk")))]
"TARGET_AVX512F"
@@ -4658,13 +4660,13 @@ (define_insn "<avx512>_fmsub_<mode>_mask<round_name>"
(set_attr "mode" "<MODE>")])
(define_insn "<avx512>_fmsub_<mode>_mask3<round_name>"
- [(set (match_operand:VF_AVX512VL 0 "register_operand" "=v")
- (vec_merge:VF_AVX512VL
- (fma:VF_AVX512VL
- (match_operand:VF_AVX512VL 1 "<round_nimm_predicate>" "%v")
- (match_operand:VF_AVX512VL 2 "<round_nimm_predicate>" "<round_constraint>")
- (neg:VF_AVX512VL
- (match_operand:VF_AVX512VL 3 "register_operand" "0")))
+ [(set (match_operand:VFH_AVX512VL 0 "register_operand" "=v")
+ (vec_merge:VFH_AVX512VL
+ (fma:VFH_AVX512VL
+ (match_operand:VFH_AVX512VL 1 "<round_nimm_predicate>" "%v")
+ (match_operand:VFH_AVX512VL 2 "<round_nimm_predicate>" "<round_constraint>")
+ (neg:VFH_AVX512VL
+ (match_operand:VFH_AVX512VL 3 "register_operand" "0")))
(match_dup 3)
(match_operand:<avx512fmaskmode> 4 "register_operand" "Yk")))]
"TARGET_AVX512F && <round_mode512bit_condition>"
@@ -4691,10 +4693,10 @@ (define_insn "*fma_fnmadd_<mode>"
(set_attr "mode" "<MODE>")])
(define_expand "<avx512>_fnmadd_<mode>_maskz<round_expand_name>"
- [(match_operand:VF_AVX512VL 0 "register_operand")
- (match_operand:VF_AVX512VL 1 "<round_expand_nimm_predicate>")
- (match_operand:VF_AVX512VL 2 "<round_expand_nimm_predicate>")
- (match_operand:VF_AVX512VL 3 "<round_expand_nimm_predicate>")
+ [(match_operand:VFH_AVX512VL 0 "register_operand")
+ (match_operand:VFH_AVX512VL 1 "<round_expand_nimm_predicate>")
+ (match_operand:VFH_AVX512VL 2 "<round_expand_nimm_predicate>")
+ (match_operand:VFH_AVX512VL 3 "<round_expand_nimm_predicate>")
(match_operand:<avx512fmaskmode> 4 "register_operand")]
"TARGET_AVX512F && <round_mode512bit_condition>"
{
@@ -4705,12 +4707,12 @@ (define_expand "<avx512>_fnmadd_<mode>_maskz<round_expand_name>"
})
(define_insn "<sd_mask_codefor>fma_fnmadd_<mode><sd_maskz_name><round_name>"
- [(set (match_operand:VF_SF_AVX512VL 0 "register_operand" "=v,v,v")
- (fma:VF_SF_AVX512VL
- (neg:VF_SF_AVX512VL
- (match_operand:VF_SF_AVX512VL 1 "<bcst_round_nimm_predicate>" "%0,0,v"))
- (match_operand:VF_SF_AVX512VL 2 "<bcst_round_nimm_predicate>" "<bcst_round_constraint>,v,<bcst_round_constraint>")
- (match_operand:VF_SF_AVX512VL 3 "<bcst_round_nimm_predicate>" "v,<bcst_round_constraint>,0")))]
+ [(set (match_operand:VFH_SF_AVX512VL 0 "register_operand" "=v,v,v")
+ (fma:VFH_SF_AVX512VL
+ (neg:VFH_SF_AVX512VL
+ (match_operand:VFH_SF_AVX512VL 1 "<bcst_round_nimm_predicate>" "%0,0,v"))
+ (match_operand:VFH_SF_AVX512VL 2 "<bcst_round_nimm_predicate>" "<bcst_round_constraint>,v,<bcst_round_constraint>")
+ (match_operand:VFH_SF_AVX512VL 3 "<bcst_round_nimm_predicate>" "v,<bcst_round_constraint>,0")))]
"TARGET_AVX512F && <sd_mask_mode512bit_condition> && <round_mode512bit_condition>"
"@
vfnmadd132<ssemodesuffix>\t{<round_sd_mask_op4>%2, %3, %0<sd_mask_op4>|%0<sd_mask_op4>, %3, %2<round_sd_mask_op4>}
@@ -4720,13 +4722,13 @@ (define_insn "<sd_mask_codefor>fma_fnmadd_<mode><sd_maskz_name><round_name>"
(set_attr "mode" "<MODE>")])
(define_insn "<avx512>_fnmadd_<mode>_mask<round_name>"
- [(set (match_operand:VF_AVX512VL 0 "register_operand" "=v,v")
- (vec_merge:VF_AVX512VL
- (fma:VF_AVX512VL
- (neg:VF_AVX512VL
- (match_operand:VF_AVX512VL 1 "register_operand" "0,0"))
- (match_operand:VF_AVX512VL 2 "<round_nimm_predicate>" "<round_constraint>,v")
- (match_operand:VF_AVX512VL 3 "<round_nimm_predicate>" "v,<round_constraint>"))
+ [(set (match_operand:VFH_AVX512VL 0 "register_operand" "=v,v")
+ (vec_merge:VFH_AVX512VL
+ (fma:VFH_AVX512VL
+ (neg:VFH_AVX512VL
+ (match_operand:VFH_AVX512VL 1 "register_operand" "0,0"))
+ (match_operand:VFH_AVX512VL 2 "<round_nimm_predicate>" "<round_constraint>,v")
+ (match_operand:VFH_AVX512VL 3 "<round_nimm_predicate>" "v,<round_constraint>"))
(match_dup 1)
(match_operand:<avx512fmaskmode> 4 "register_operand" "Yk,Yk")))]
"TARGET_AVX512F && <round_mode512bit_condition>"
@@ -4737,13 +4739,13 @@ (define_insn "<avx512>_fnmadd_<mode>_mask<round_name>"
(set_attr "mode" "<MODE>")])
(define_insn "<avx512>_fnmadd_<mode>_mask3<round_name>"
- [(set (match_operand:VF_AVX512VL 0 "register_operand" "=v")
- (vec_merge:VF_AVX512VL
- (fma:VF_AVX512VL
- (neg:VF_AVX512VL
- (match_operand:VF_AVX512VL 1 "<round_nimm_predicate>" "%v"))
- (match_operand:VF_AVX512VL 2 "<round_nimm_predicate>" "<round_constraint>")
- (match_operand:VF_AVX512VL 3 "register_operand" "0"))
+ [(set (match_operand:VFH_AVX512VL 0 "register_operand" "=v")
+ (vec_merge:VFH_AVX512VL
+ (fma:VFH_AVX512VL
+ (neg:VFH_AVX512VL
+ (match_operand:VFH_AVX512VL 1 "<round_nimm_predicate>" "%v"))
+ (match_operand:VFH_AVX512VL 2 "<round_nimm_predicate>" "<round_constraint>")
+ (match_operand:VFH_AVX512VL 3 "register_operand" "0"))
(match_dup 3)
(match_operand:<avx512fmaskmode> 4 "register_operand" "Yk")))]
"TARGET_AVX512F && <round_mode512bit_condition>"
@@ -4771,10 +4773,10 @@ (define_insn "*fma_fnmsub_<mode>"
(set_attr "mode" "<MODE>")])
(define_expand "<avx512>_fnmsub_<mode>_maskz<round_expand_name>"
- [(match_operand:VF_AVX512VL 0 "register_operand")
- (match_operand:VF_AVX512VL 1 "<round_expand_nimm_predicate>")
- (match_operand:VF_AVX512VL 2 "<round_expand_nimm_predicate>")
- (match_operand:VF_AVX512VL 3 "<round_expand_nimm_predicate>")
+ [(match_operand:VFH_AVX512VL 0 "register_operand")
+ (match_operand:VFH_AVX512VL 1 "<round_expand_nimm_predicate>")
+ (match_operand:VFH_AVX512VL 2 "<round_expand_nimm_predicate>")
+ (match_operand:VFH_AVX512VL 3 "<round_expand_nimm_predicate>")
(match_operand:<avx512fmaskmode> 4 "register_operand")]
"TARGET_AVX512F && <round_mode512bit_condition>"
{
@@ -4785,13 +4787,13 @@ (define_expand "<avx512>_fnmsub_<mode>_maskz<round_expand_name>"
})
(define_insn "<sd_mask_codefor>fma_fnmsub_<mode><sd_maskz_name><round_name>"
- [(set (match_operand:VF_SF_AVX512VL 0 "register_operand" "=v,v,v")
- (fma:VF_SF_AVX512VL
- (neg:VF_SF_AVX512VL
- (match_operand:VF_SF_AVX512VL 1 "<bcst_round_nimm_predicate>" "%0,0,v"))
- (match_operand:VF_SF_AVX512VL 2 "<bcst_round_nimm_predicate>" "<bcst_round_constraint>,v,<bcst_round_constraint>")
- (neg:VF_SF_AVX512VL
- (match_operand:VF_SF_AVX512VL 3 "<bcst_round_nimm_predicate>" "v,<bcst_round_constraint>,0"))))]
+ [(set (match_operand:VFH_SF_AVX512VL 0 "register_operand" "=v,v,v")
+ (fma:VFH_SF_AVX512VL
+ (neg:VFH_SF_AVX512VL
+ (match_operand:VFH_SF_AVX512VL 1 "<bcst_round_nimm_predicate>" "%0,0,v"))
+ (match_operand:VFH_SF_AVX512VL 2 "<bcst_round_nimm_predicate>" "<bcst_round_constraint>,v,<bcst_round_constraint>")
+ (neg:VFH_SF_AVX512VL
+ (match_operand:VFH_SF_AVX512VL 3 "<bcst_round_nimm_predicate>" "v,<bcst_round_constraint>,0"))))]
"TARGET_AVX512F && <sd_mask_mode512bit_condition> && <round_mode512bit_condition>"
"@
vfnmsub132<ssemodesuffix>\t{<round_sd_mask_op4>%2, %3, %0<sd_mask_op4>|%0<sd_mask_op4>, %3, %2<round_sd_mask_op4>}
@@ -4801,14 +4803,14 @@ (define_insn "<sd_mask_codefor>fma_fnmsub_<mode><sd_maskz_name><round_name>"
(set_attr "mode" "<MODE>")])
(define_insn "<avx512>_fnmsub_<mode>_mask<round_name>"
- [(set (match_operand:VF_AVX512VL 0 "register_operand" "=v,v")
- (vec_merge:VF_AVX512VL
- (fma:VF_AVX512VL
- (neg:VF_AVX512VL
- (match_operand:VF_AVX512VL 1 "register_operand" "0,0"))
- (match_operand:VF_AVX512VL 2 "<round_nimm_predicate>" "<round_constraint>,v")
- (neg:VF_AVX512VL
- (match_operand:VF_AVX512VL 3 "<round_nimm_predicate>" "v,<round_constraint>")))
+ [(set (match_operand:VFH_AVX512VL 0 "register_operand" "=v,v")
+ (vec_merge:VFH_AVX512VL
+ (fma:VFH_AVX512VL
+ (neg:VFH_AVX512VL
+ (match_operand:VFH_AVX512VL 1 "register_operand" "0,0"))
+ (match_operand:VFH_AVX512VL 2 "<round_nimm_predicate>" "<round_constraint>,v")
+ (neg:VFH_AVX512VL
+ (match_operand:VFH_AVX512VL 3 "<round_nimm_predicate>" "v,<round_constraint>")))
(match_dup 1)
(match_operand:<avx512fmaskmode> 4 "register_operand" "Yk,Yk")))]
"TARGET_AVX512F && <round_mode512bit_condition>"
@@ -4819,14 +4821,14 @@ (define_insn "<avx512>_fnmsub_<mode>_mask<round_name>"
(set_attr "mode" "<MODE>")])
(define_insn "<avx512>_fnmsub_<mode>_mask3<round_name>"
- [(set (match_operand:VF_AVX512VL 0 "register_operand" "=v")
- (vec_merge:VF_AVX512VL
- (fma:VF_AVX512VL
- (neg:VF_AVX512VL
- (match_operand:VF_AVX512VL 1 "<round_nimm_predicate>" "%v"))
- (match_operand:VF_AVX512VL 2 "<round_nimm_predicate>" "<round_constraint>")
- (neg:VF_AVX512VL
- (match_operand:VF_AVX512VL 3 "register_operand" "0")))
+ [(set (match_operand:VFH_AVX512VL 0 "register_operand" "=v")
+ (vec_merge:VFH_AVX512VL
+ (fma:VFH_AVX512VL
+ (neg:VFH_AVX512VL
+ (match_operand:VFH_AVX512VL 1 "<round_nimm_predicate>" "%v"))
+ (match_operand:VFH_AVX512VL 2 "<round_nimm_predicate>" "<round_constraint>")
+ (neg:VFH_AVX512VL
+ (match_operand:VFH_AVX512VL 3 "register_operand" "0")))
(match_dup 3)
(match_operand:<avx512fmaskmode> 4 "register_operand" "Yk")))]
"TARGET_AVX512F"
diff --git a/gcc/testsuite/gcc.target/i386/avx-1.c b/gcc/testsuite/gcc.target/i386/avx-1.c
index 51a0cf2fe87..d2ab16538d8 100644
--- a/gcc/testsuite/gcc.target/i386/avx-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx-1.c
@@ -763,6 +763,18 @@
#define __builtin_ia32_vfmsubaddph512_mask(A, B, C, D, E) __builtin_ia32_vfmsubaddph512_mask(A, B, C, D, 8)
#define __builtin_ia32_vfmsubaddph512_mask3(A, B, C, D, E) __builtin_ia32_vfmsubaddph512_mask3(A, B, C, D, 8)
#define __builtin_ia32_vfmsubaddph512_maskz(A, B, C, D, E) __builtin_ia32_vfmsubaddph512_maskz(A, B, C, D, 8)
+#define __builtin_ia32_vfmaddph512_mask(A, B, C, D, E) __builtin_ia32_vfmaddph512_mask(A, B, C, D, 8)
+#define __builtin_ia32_vfmaddph512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddph512_mask3(A, B, C, D, 8)
+#define __builtin_ia32_vfmaddph512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddph512_maskz(A, B, C, D, 8)
+#define __builtin_ia32_vfnmaddph512_mask(A, B, C, D, E) __builtin_ia32_vfnmaddph512_mask(A, B, C, D, 8)
+#define __builtin_ia32_vfnmaddph512_mask3(A, B, C, D, E) __builtin_ia32_vfnmaddph512_mask3(A, B, C, D, 8)
+#define __builtin_ia32_vfnmaddph512_maskz(A, B, C, D, E) __builtin_ia32_vfnmaddph512_maskz(A, B, C, D, 8)
+#define __builtin_ia32_vfmsubph512_mask(A, B, C, D, E) __builtin_ia32_vfmsubph512_mask(A, B, C, D, 8)
+#define __builtin_ia32_vfmsubph512_mask3(A, B, C, D, E) __builtin_ia32_vfmsubph512_mask3(A, B, C, D, 8)
+#define __builtin_ia32_vfmsubph512_maskz(A, B, C, D, E) __builtin_ia32_vfmsubph512_maskz(A, B, C, D, 8)
+#define __builtin_ia32_vfnmsubph512_mask(A, B, C, D, E) __builtin_ia32_vfnmsubph512_mask(A, B, C, D, 8)
+#define __builtin_ia32_vfnmsubph512_mask3(A, B, C, D, E) __builtin_ia32_vfnmsubph512_mask3(A, B, C, D, 8)
+#define __builtin_ia32_vfnmsubph512_maskz(A, B, C, D, E) __builtin_ia32_vfnmsubph512_maskz(A, B, C, D, 8)
/* avx512fp16vlintrin.h */
#define __builtin_ia32_vcmpph_v8hf_mask(A, B, C, D) __builtin_ia32_vcmpph_v8hf_mask(A, B, 1, D)
diff --git a/gcc/testsuite/gcc.target/i386/sse-13.c b/gcc/testsuite/gcc.target/i386/sse-13.c
index a53f4653908..49c72f6fcef 100644
--- a/gcc/testsuite/gcc.target/i386/sse-13.c
+++ b/gcc/testsuite/gcc.target/i386/sse-13.c
@@ -780,6 +780,18 @@
#define __builtin_ia32_vfmsubaddph512_mask(A, B, C, D, E) __builtin_ia32_vfmsubaddph512_mask(A, B, C, D, 8)
#define __builtin_ia32_vfmsubaddph512_mask3(A, B, C, D, E) __builtin_ia32_vfmsubaddph512_mask3(A, B, C, D, 8)
#define __builtin_ia32_vfmsubaddph512_maskz(A, B, C, D, E) __builtin_ia32_vfmsubaddph512_maskz(A, B, C, D, 8)
+#define __builtin_ia32_vfmaddph512_mask(A, B, C, D, E) __builtin_ia32_vfmaddph512_mask(A, B, C, D, 8)
+#define __builtin_ia32_vfmaddph512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddph512_mask3(A, B, C, D, 8)
+#define __builtin_ia32_vfmaddph512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddph512_maskz(A, B, C, D, 8)
+#define __builtin_ia32_vfnmaddph512_mask(A, B, C, D, E) __builtin_ia32_vfnmaddph512_mask(A, B, C, D, 8)
+#define __builtin_ia32_vfnmaddph512_mask3(A, B, C, D, E) __builtin_ia32_vfnmaddph512_mask3(A, B, C, D, 8)
+#define __builtin_ia32_vfnmaddph512_maskz(A, B, C, D, E) __builtin_ia32_vfnmaddph512_maskz(A, B, C, D, 8)
+#define __builtin_ia32_vfmsubph512_mask(A, B, C, D, E) __builtin_ia32_vfmsubph512_mask(A, B, C, D, 8)
+#define __builtin_ia32_vfmsubph512_mask3(A, B, C, D, E) __builtin_ia32_vfmsubph512_mask3(A, B, C, D, 8)
+#define __builtin_ia32_vfmsubph512_maskz(A, B, C, D, E) __builtin_ia32_vfmsubph512_maskz(A, B, C, D, 8)
+#define __builtin_ia32_vfnmsubph512_mask(A, B, C, D, E) __builtin_ia32_vfnmsubph512_mask(A, B, C, D, 8)
+#define __builtin_ia32_vfnmsubph512_mask3(A, B, C, D, E) __builtin_ia32_vfnmsubph512_mask3(A, B, C, D, 8)
+#define __builtin_ia32_vfnmsubph512_maskz(A, B, C, D, E) __builtin_ia32_vfnmsubph512_maskz(A, B, C, D, 8)
/* avx512fp16vlintrin.h */
#define __builtin_ia32_vcmpph_v8hf_mask(A, B, C, D) __builtin_ia32_vcmpph_v8hf_mask(A, B, 1, D)
diff --git a/gcc/testsuite/gcc.target/i386/sse-14.c b/gcc/testsuite/gcc.target/i386/sse-14.c
index 48895e0dd0d..9151e50afd2 100644
--- a/gcc/testsuite/gcc.target/i386/sse-14.c
+++ b/gcc/testsuite/gcc.target/i386/sse-14.c
@@ -838,6 +838,10 @@ test_3 (_mm_maskz_cvt_roundss_sh, __m128h, __mmask8, __m128h, __m128, 8)
test_3 (_mm_maskz_cvt_roundsd_sh, __m128h, __mmask8, __m128h, __m128d, 8)
test_3 (_mm512_fmaddsub_round_ph, __m512h, __m512h, __m512h, __m512h, 9)
test_3 (_mm512_fmsubadd_round_ph, __m512h, __m512h, __m512h, __m512h, 9)
+test_3 (_mm512_fmadd_round_ph, __m512h, __m512h, __m512h, __m512h, 9)
+test_3 (_mm512_fnmadd_round_ph, __m512h, __m512h, __m512h, __m512h, 9)
+test_3 (_mm512_fmsub_round_ph, __m512h, __m512h, __m512h, __m512h, 9)
+test_3 (_mm512_fnmsub_round_ph, __m512h, __m512h, __m512h, __m512h, 9)
test_3x (_mm512_mask_cmp_round_ph_mask, __mmask32, __mmask32, __m512h, __m512h, 1, 8)
test_3x (_mm_mask_cmp_round_sh_mask, __mmask8, __mmask8, __m128h, __m128h, 1, 8)
test_3x (_mm512_mask_reduce_round_ph, __m512h, __m512h, __mmask32, __m512h, 123, 8)
@@ -876,6 +880,18 @@ test_4 (_mm512_maskz_fmaddsub_round_ph, __m512h, __mmask32, __m512h, __m512h, __
test_4 (_mm512_mask3_fmsubadd_round_ph, __m512h, __m512h, __m512h, __m512h, __mmask32, 9)
test_4 (_mm512_mask_fmsubadd_round_ph, __m512h, __m512h, __mmask32, __m512h, __m512h, 9)
test_4 (_mm512_maskz_fmsubadd_round_ph, __m512h, __mmask32, __m512h, __m512h, __m512h, 9)
+test_4 (_mm512_mask_fmadd_round_ph, __m512h, __m512h, __mmask32, __m512h, __m512h, 9)
+test_4 (_mm512_mask3_fmadd_round_ph, __m512h, __m512h, __m512h, __m512h, __mmask32, 9)
+test_4 (_mm512_maskz_fmadd_round_ph, __m512h, __mmask32, __m512h, __m512h, __m512h, 9)
+test_4 (_mm512_mask_fnmadd_round_ph, __m512h, __m512h, __mmask32, __m512h, __m512h, 9)
+test_4 (_mm512_mask3_fnmadd_round_ph, __m512h, __m512h, __m512h, __m512h, __mmask32, 9)
+test_4 (_mm512_maskz_fnmadd_round_ph, __m512h, __mmask32, __m512h, __m512h, __m512h, 9)
+test_4 (_mm512_mask_fmsub_round_ph, __m512h, __m512h, __mmask32, __m512h, __m512h, 9)
+test_4 (_mm512_mask3_fmsub_round_ph, __m512h, __m512h, __m512h, __m512h, __mmask32, 9)
+test_4 (_mm512_maskz_fmsub_round_ph, __m512h, __mmask32, __m512h, __m512h, __m512h, 9)
+test_4 (_mm512_mask_fnmsub_round_ph, __m512h, __m512h, __mmask32, __m512h, __m512h, 9)
+test_4 (_mm512_mask3_fnmsub_round_ph, __m512h, __m512h, __m512h, __m512h, __mmask32, 9)
+test_4 (_mm512_maskz_fnmsub_round_ph, __m512h, __mmask32, __m512h, __m512h, __m512h, 9)
test_4x (_mm_mask_reduce_round_sh, __m128h, __m128h, __mmask8, __m128h, __m128h, 123, 8)
test_4x (_mm_mask_roundscale_round_sh, __m128h, __m128h, __mmask8, __m128h, __m128h, 123, 8)
test_4x (_mm_mask_getmant_sh, __m128h, __m128h, __mmask8, __m128h, __m128h, 1, 1)
diff --git a/gcc/testsuite/gcc.target/i386/sse-22.c b/gcc/testsuite/gcc.target/i386/sse-22.c
index bc530da388b..892b6334ae2 100644
--- a/gcc/testsuite/gcc.target/i386/sse-22.c
+++ b/gcc/testsuite/gcc.target/i386/sse-22.c
@@ -941,6 +941,10 @@ test_3 (_mm_maskz_cvt_roundss_sh, __m128h, __mmask8, __m128h, __m128, 8)
test_3 (_mm_maskz_cvt_roundsd_sh, __m128h, __mmask8, __m128h, __m128d, 8)
test_3 (_mm512_fmaddsub_round_ph, __m512h, __m512h, __m512h, __m512h, 9)
test_3 (_mm512_fmsubadd_round_ph, __m512h, __m512h, __m512h, __m512h, 9)
+test_3 (_mm512_fmadd_round_ph, __m512h, __m512h, __m512h, __m512h, 9)
+test_3 (_mm512_fnmadd_round_ph, __m512h, __m512h, __m512h, __m512h, 9)
+test_3 (_mm512_fmsub_round_ph, __m512h, __m512h, __m512h, __m512h, 9)
+test_3 (_mm512_fnmsub_round_ph, __m512h, __m512h, __m512h, __m512h, 9)
test_3x (_mm512_mask_cmp_round_ph_mask, __mmask32, __mmask32, __m512h, __m512h, 1, 8)
test_3x (_mm_mask_cmp_round_sh_mask, __mmask8, __mmask8, __m128h, __m128h, 1, 8)
test_3x (_mm512_mask_reduce_round_ph, __m512h, __m512h, __mmask32, __m512h, 123, 8)
@@ -978,6 +982,18 @@ test_4 (_mm512_maskz_fmaddsub_round_ph, __m512h, __mmask32, __m512h, __m512h, __
test_4 (_mm512_mask3_fmsubadd_round_ph, __m512h, __m512h, __m512h, __m512h, __mmask32, 9)
test_4 (_mm512_mask_fmsubadd_round_ph, __m512h, __m512h, __mmask32, __m512h, __m512h, 9)
test_4 (_mm512_maskz_fmsubadd_round_ph, __m512h, __mmask32, __m512h, __m512h, __m512h, 9)
+test_4 (_mm512_mask_fmadd_round_ph, __m512h, __m512h, __mmask32, __m512h, __m512h, 9)
+test_4 (_mm512_mask3_fmadd_round_ph, __m512h, __m512h, __m512h, __m512h, __mmask32, 9)
+test_4 (_mm512_maskz_fmadd_round_ph, __m512h, __mmask32, __m512h, __m512h, __m512h, 9)
+test_4 (_mm512_mask_fnmadd_round_ph, __m512h, __m512h, __mmask32, __m512h, __m512h, 9)
+test_4 (_mm512_mask3_fnmadd_round_ph, __m512h, __m512h, __m512h, __m512h, __mmask32, 9)
+test_4 (_mm512_maskz_fnmadd_round_ph, __m512h, __mmask32, __m512h, __m512h, __m512h, 9)
+test_4 (_mm512_mask_fmsub_round_ph, __m512h, __m512h, __mmask32, __m512h, __m512h, 9)
+test_4 (_mm512_mask3_fmsub_round_ph, __m512h, __m512h, __m512h, __m512h, __mmask32, 9)
+test_4 (_mm512_maskz_fmsub_round_ph, __m512h, __mmask32, __m512h, __m512h, __m512h, 9)
+test_4 (_mm512_mask_fnmsub_round_ph, __m512h, __m512h, __mmask32, __m512h, __m512h, 9)
+test_4 (_mm512_mask3_fnmsub_round_ph, __m512h, __m512h, __m512h, __m512h, __mmask32, 9)
+test_4 (_mm512_maskz_fnmsub_round_ph, __m512h, __mmask32, __m512h, __m512h, __m512h, 9)
test_4x (_mm_mask_reduce_round_sh, __m128h, __m128h, __mmask8, __m128h, __m128h, 123, 8)
test_4x (_mm_mask_roundscale_round_sh, __m128h, __m128h, __mmask8, __m128h, __m128h, 123, 8)
test_4x (_mm_mask_getmant_sh, __m128h, __m128h, __mmask8, __m128h, __m128h, 1, 1)
diff --git a/gcc/testsuite/gcc.target/i386/sse-23.c b/gcc/testsuite/gcc.target/i386/sse-23.c
index df43931ca97..447b83829f3 100644
--- a/gcc/testsuite/gcc.target/i386/sse-23.c
+++ b/gcc/testsuite/gcc.target/i386/sse-23.c
@@ -781,6 +781,18 @@
#define __builtin_ia32_vfmsubaddph512_mask(A, B, C, D, E) __builtin_ia32_vfmsubaddph512_mask(A, B, C, D, 8)
#define __builtin_ia32_vfmsubaddph512_mask3(A, B, C, D, E) __builtin_ia32_vfmsubaddph512_mask3(A, B, C, D, 8)
#define __builtin_ia32_vfmsubaddph512_maskz(A, B, C, D, E) __builtin_ia32_vfmsubaddph512_maskz(A, B, C, D, 8)
+#define __builtin_ia32_vfmaddph512_mask(A, B, C, D, E) __builtin_ia32_vfmaddph512_mask(A, B, C, D, 8)
+#define __builtin_ia32_vfmaddph512_mask3(A, B, C, D, E) __builtin_ia32_vfmaddph512_mask3(A, B, C, D, 8)
+#define __builtin_ia32_vfmaddph512_maskz(A, B, C, D, E) __builtin_ia32_vfmaddph512_maskz(A, B, C, D, 8)
+#define __builtin_ia32_vfnmaddph512_mask(A, B, C, D, E) __builtin_ia32_vfnmaddph512_mask(A, B, C, D, 8)
+#define __builtin_ia32_vfnmaddph512_mask3(A, B, C, D, E) __builtin_ia32_vfnmaddph512_mask3(A, B, C, D, 8)
+#define __builtin_ia32_vfnmaddph512_maskz(A, B, C, D, E) __builtin_ia32_vfnmaddph512_maskz(A, B, C, D, 8)
+#define __builtin_ia32_vfmsubph512_mask(A, B, C, D, E) __builtin_ia32_vfmsubph512_mask(A, B, C, D, 8)
+#define __builtin_ia32_vfmsubph512_mask3(A, B, C, D, E) __builtin_ia32_vfmsubph512_mask3(A, B, C, D, 8)
+#define __builtin_ia32_vfmsubph512_maskz(A, B, C, D, E) __builtin_ia32_vfmsubph512_maskz(A, B, C, D, 8)
+#define __builtin_ia32_vfnmsubph512_mask(A, B, C, D, E) __builtin_ia32_vfnmsubph512_mask(A, B, C, D, 8)
+#define __builtin_ia32_vfnmsubph512_mask3(A, B, C, D, E) __builtin_ia32_vfnmsubph512_mask3(A, B, C, D, 8)
+#define __builtin_ia32_vfnmsubph512_maskz(A, B, C, D, E) __builtin_ia32_vfnmsubph512_maskz(A, B, C, D, 8)
/* avx512fp16vlintrin.h */
#define __builtin_ia32_vcmpph_v8hf_mask(A, B, C, D) __builtin_ia32_vcmpph_v8hf_mask(A, B, 1, D)
--
2.18.1
next prev parent reply other threads:[~2021-07-01 6:18 UTC|newest]
Thread overview: 85+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-07-01 6:15 [PATCH 00/62] Support all AVX512FP16 intrinsics liuhongt
2021-07-01 6:15 ` [PATCH 01/62] AVX512FP16: Support vector init/broadcast for FP16 liuhongt
2021-07-01 6:15 ` [PATCH 02/62] AVX512FP16: Add testcase for vector init and broadcast intrinsics liuhongt
2021-07-01 6:15 ` [PATCH 03/62] AVX512FP16: Fix HF vector passing in variable arguments liuhongt
2021-07-01 6:15 ` [PATCH 04/62] AVX512FP16: Add ABI tests for xmm liuhongt
2021-07-01 6:15 ` [PATCH 05/62] AVX512FP16: Add ABI test for ymm liuhongt
2021-07-01 6:15 ` [PATCH 06/62] AVX512FP16: Add abi test for zmm liuhongt
2021-07-01 6:15 ` [PATCH 07/62] AVX512FP16: Add vaddph/vsubph/vdivph/vmulph liuhongt
2021-09-09 7:48 ` Hongtao Liu
2021-07-01 6:15 ` [PATCH 08/62] AVX512FP16: Add testcase for vaddph/vsubph/vmulph/vdivph liuhongt
2021-07-01 6:15 ` [PATCH 09/62] AVX512FP16: Enable _Float16 autovectorization liuhongt
2021-09-10 7:03 ` Hongtao Liu
2021-07-01 6:15 ` [PATCH 10/62] AVX512FP16: Add vaddsh/vsubsh/vmulsh/vdivsh liuhongt
2021-07-01 6:15 ` [PATCH 11/62] AVX512FP16: Add testcase for vaddsh/vsubsh/vmulsh/vdivsh liuhongt
2021-07-01 6:15 ` [PATCH 12/62] AVX512FP16: Add vmaxph/vminph/vmaxsh/vminsh liuhongt
2021-07-01 6:15 ` [PATCH 13/62] AVX512FP16: Add testcase for vmaxph/vmaxsh/vminph/vminsh liuhongt
2021-07-01 6:16 ` [PATCH 14/62] AVX512FP16: Add vcmpph/vcmpsh/vcomish/vucomish liuhongt
2021-07-01 6:16 ` [PATCH 15/62] AVX512FP16: Add testcase for vcmpph/vcmpsh/vcomish/vucomish liuhongt
2021-07-01 6:16 ` [PATCH 16/62] AVX512FP16: Add vsqrtph/vrsqrtph/vsqrtsh/vrsqrtsh liuhongt
2021-09-14 3:50 ` Hongtao Liu
2021-07-01 6:16 ` [PATCH 17/62] AVX512FP16: Add testcase for vsqrtph/vsqrtsh/vrsqrtph/vrsqrtsh liuhongt
2021-07-01 6:16 ` [PATCH 18/62] AVX512FP16: Add vrcpph/vrcpsh/vscalefph/vscalefsh liuhongt
2021-07-01 6:16 ` [PATCH 19/62] AVX512FP16: Add testcase for vrcpph/vrcpsh/vscalefph/vscalefsh liuhongt
2021-07-01 6:16 ` [PATCH 20/62] AVX512FP16: Add vreduceph/vreducesh/vrndscaleph/vrndscalesh liuhongt
2021-07-01 6:16 ` [PATCH 21/62] AVX512FP16: Add testcase for vreduceph/vreducesh/vrndscaleph/vrndscalesh liuhongt
2021-07-01 6:16 ` [PATCH 22/62] AVX512FP16: Add fpclass/getexp/getmant instructions liuhongt
2021-07-01 6:16 ` [PATCH 23/62] AVX512FP16: Add testcase for fpclass/getmant/getexp instructions liuhongt
2021-07-01 6:16 ` [PATCH 24/62] AVX512FP16: Add vmovw/vmovsh liuhongt
2021-09-16 5:08 ` Hongtao Liu
2021-07-01 6:16 ` [PATCH 25/62] AVX512FP16: Add testcase for vmovsh/vmovw liuhongt
2021-07-01 6:16 ` [PATCH 26/62] AVX512FP16: Add vcvtph2dq/vcvtph2qq/vcvtph2w/vcvtph2uw/vcvtph2uqq/vcvtph2udq liuhongt
2021-07-01 6:16 ` [PATCH 27/62] AVX512FP16: Add testcase for vcvtph2w/vcvtph2uw/vcvtph2dq/vcvtph2udq/vcvtph2qq/vcvtph2uqq liuhongt
2021-07-01 6:16 ` [PATCH 28/62] AVX512FP16: Add vcvtuw2ph/vcvtw2ph/vcvtdq2ph/vcvtudq2ph/vcvtqq2ph/vcvtuqq2ph liuhongt
2021-07-01 6:16 ` [PATCH 29/62] AVX512FP16: Add testcase for vcvtw2ph/vcvtuw2ph/vcvtdq2ph/vcvtudq2ph/vcvtqq2ph/vcvtuqq2ph liuhongt
2021-07-01 6:16 ` [PATCH 30/62] AVX512FP16: Add vcvtsh2si/vcvtsh2usi/vcvtsi2sh/vcvtusi2sh liuhongt
2021-09-17 8:07 ` Hongtao Liu
2021-07-01 6:16 ` [PATCH 31/62] AVX512FP16: Add testcase for vcvtsh2si/vcvtsh2usi/vcvtsi2sh/vcvtusi2sh liuhongt
2021-07-01 6:16 ` [PATCH 32/62] AVX512FP16: Add vcvttph2w/vcvttph2uw/vcvttph2dq/vcvttph2qq/vcvttph2udq/vcvttph2uqq liuhongt
2021-07-01 6:16 ` [PATCH 33/62] AVX512FP16: Add testcase for vcvttph2w/vcvttph2uw/vcvttph2dq/vcvttph2udq/vcvttph2qq/vcvttph2uqq liuhongt
2021-07-01 6:16 ` [PATCH 34/62] AVX512FP16: Add vcvttsh2si/vcvttsh2usi liuhongt
2021-07-01 6:16 ` [PATCH 35/62] AVX512FP16: Add vcvtph2pd/vcvtph2psx/vcvtpd2ph/vcvtps2phx liuhongt
2021-07-01 6:16 ` [PATCH 36/62] AVX512FP16: Add testcase for vcvtph2pd/vcvtph2psx/vcvtpd2ph/vcvtps2phx liuhongt
2021-07-01 6:16 ` [PATCH 37/62] AVX512FP16: Add vcvtsh2ss/vcvtsh2sd/vcvtss2sh/vcvtsd2sh liuhongt
2021-07-01 6:16 ` [PATCH 38/62] AVX512FP16: Add testcase for vcvtsh2sd/vcvtsh2ss/vcvtsd2sh/vcvtss2sh liuhongt
2021-07-01 6:16 ` [PATCH 39/62] AVX512FP16: Add intrinsics for casting between vector float16 and vector float32/float64/integer liuhongt
2021-07-01 6:16 ` [PATCH 40/62] AVX512FP16: Add vfmaddsub[132, 213, 231]ph/vfmsubadd[132, 213, 231]ph liuhongt
2021-09-18 7:04 ` Hongtao Liu
2021-07-01 6:16 ` [PATCH 41/62] AVX512FP16: Add testcase for " liuhongt
2021-07-01 6:16 ` liuhongt [this message]
2021-07-01 6:16 ` [PATCH 43/62] AVX512FP16: Add testcase for fma instructions liuhongt
2021-07-01 6:16 ` [PATCH 44/62] AVX512FP16: Add scalar/vector bitwise operations, including liuhongt
2021-07-23 5:13 ` Hongtao Liu
2021-07-26 2:25 ` Hongtao Liu
2021-07-01 6:16 ` [PATCH 45/62] AVX512FP16: Add testcase for fp16 bitwise operations liuhongt
2021-07-01 6:16 ` [PATCH 46/62] AVX512FP16: Enable FP16 mask load/store liuhongt
2021-07-01 6:16 ` [PATCH 47/62] AVX512FP16: Add scalar fma instructions liuhongt
2021-07-01 6:16 ` [PATCH 48/62] AVX512FP16: Add testcase for scalar FMA instructions liuhongt
2021-07-01 6:16 ` [PATCH 49/62] AVX512FP16: Add vfcmaddcph/vfmaddcph/vfcmulcph/vfmulcph liuhongt
2021-09-22 4:38 ` Hongtao Liu
2021-07-01 6:16 ` [PATCH 50/62] AVX512FP16: Add testcases for vfcmaddcph/vfmaddcph/vfcmulcph/vfmulcph liuhongt
2021-07-01 6:16 ` [PATCH 51/62] AVX512FP16: Add vfcmaddcsh/vfmaddcsh/vfcmulcsh/vfmulcsh liuhongt
2021-07-01 6:16 ` [PATCH 52/62] AVX512FP16: Add testcases for vfcmaddcsh/vfmaddcsh/vfcmulcsh/vfmulcsh liuhongt
2021-07-01 6:16 ` [PATCH 53/62] AVX512FP16: Add expander for sqrthf2 liuhongt
2021-07-23 5:12 ` Hongtao Liu
2021-07-01 6:16 ` [PATCH 54/62] AVX512FP16: Add expander for ceil/floor/trunc/roundeven liuhongt
2021-07-01 6:16 ` [PATCH 55/62] AVX512FP16: Add expander for cstorehf4 liuhongt
2021-07-01 6:16 ` [PATCH 56/62] AVX512FP16: Optimize (_Float16) sqrtf ((float) f16) to sqrtf16 (f16) liuhongt
2021-07-01 9:50 ` Richard Biener
2021-07-01 10:23 ` Hongtao Liu
2021-07-01 12:43 ` Richard Biener
2021-07-01 21:48 ` Joseph Myers
2021-07-02 7:38 ` Richard Biener
2021-07-01 21:17 ` Joseph Myers
2021-07-01 6:16 ` [PATCH 57/62] AVX512FP16: Add expander for fmahf4 liuhongt
2021-07-01 6:16 ` [PATCH 58/62] AVX512FP16: Optimize for code like (_Float16) __builtin_ceif ((float) f16) liuhongt
2021-07-01 9:52 ` Richard Biener
2021-07-01 21:26 ` Joseph Myers
2021-07-02 7:36 ` Richard Biener
2021-07-02 11:46 ` Bernhard Reutner-Fischer
2021-07-04 5:17 ` Hongtao Liu
2021-07-01 6:16 ` [PATCH 59/62] AVX512FP16: Support load/store/abs intrinsics liuhongt
2021-09-22 10:30 ` Hongtao Liu
2021-07-01 6:16 ` [PATCH 60/62] AVX512FP16: Add reduce operators(add/mul/min/max) liuhongt
2021-07-01 6:16 ` [PATCH 61/62] AVX512FP16: Add complex conjugation intrinsic instructions liuhongt
2021-07-01 6:16 ` [PATCH 62/62] AVX512FP16: Add permutation and mask blend intrinsics liuhongt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210701061648.9447-43-hongtao.liu@intel.com \
--to=hongtao.liu@intel.com \
--cc=crazylht@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=hjl.tools@gmail.com \
--cc=jakub@redhat.com \
--cc=ubizjak@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).