From: Andrew Senkevich <andrew.n.senkevich@gmail.com>
To: Uros Bizjak <ubizjak@gmail.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>, "H.J. Lu" <hjl.tools@gmail.com>
Subject: Re: [PATCH] Add AVX512 k-mask intrinsics
Date: Mon, 05 Dec 2016 14:59:00 -0000 [thread overview]
Message-ID: <CAMXFM3uBecoF3o_AP5_qPTU5pAEH2N7yO3N0veDmD=BiAjBtOA@mail.gmail.com> (raw)
In-Reply-To: <CAFULd4Zj1Fj0HUSEC_V71Vnutmw0=OU5HSKJFeOCdAY1rWn++g@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 7918 bytes --]
2016-12-02 21:31 GMT+03:00 Uros Bizjak <ubizjak@gmail.com>:
> On Fri, Dec 2, 2016 at 6:44 PM, Andrew Senkevich
> <andrew.n.senkevich@gmail.com> wrote:
>> 2016-11-11 22:14 GMT+03:00 Uros Bizjak <ubizjak@gmail.com>:
>>> On Fri, Nov 11, 2016 at 7:23 PM, Andrew Senkevich
>>> <andrew.n.senkevich@gmail.com> wrote:
>>>> 2016-11-11 20:56 GMT+03:00 Uros Bizjak <ubizjak@gmail.com>:
>>>>> On Fri, Nov 11, 2016 at 6:50 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
>>>>>> On Fri, Nov 11, 2016 at 6:38 PM, Andrew Senkevich
>>>>>> <andrew.n.senkevich@gmail.com> wrote:
>>>>>>> 2016-11-11 17:34 GMT+03:00 Uros Bizjak <ubizjak@gmail.com>:
>>>>>>>> Some quick remarks:
>>>>>>>>
>>>>>>>> +(define_insn "kmovb"
>>>>>>>> + [(set (match_operand:QI 0 "nonimmediate_operand" "=k,k")
>>>>>>>> + (unspec:QI
>>>>>>>> + [(match_operand:QI 1 "nonimmediate_operand" "r,km")]
>>>>>>>> + UNSPEC_KMOV))]
>>>>>>>> + "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512DQ"
>>>>>>>> + "@
>>>>>>>> + kmovb\t{%k1, %0|%0, %k1}
>>>>>>>> + kmovb\t{%1, %0|%0, %1}";
>>>>>>>> + [(set_attr "mode" "QI")
>>>>>>>> + (set_attr "type" "mskmov")
>>>>>>>> + (set_attr "prefix" "vex")])
>>>>>>>> +
>>>>>>>> +(define_insn "kmovd"
>>>>>>>> + [(set (match_operand:SI 0 "nonimmediate_operand" "=k,k")
>>>>>>>> + (unspec:SI
>>>>>>>> + [(match_operand:SI 1 "nonimmediate_operand" "r,km")]
>>>>>>>> + UNSPEC_KMOV))]
>>>>>>>> + "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
>>>>>>>> + "@
>>>>>>>> + kmovd\t{%k1, %0|%0, %k1}
>>>>>>>> + kmovd\t{%1, %0|%0, %1}";
>>>>>>>> + [(set_attr "mode" "SI")
>>>>>>>> + (set_attr "type" "mskmov")
>>>>>>>> + (set_attr "prefix" "vex")])
>>>>>>>> +
>>>>>>>> +(define_insn "kmovq"
>>>>>>>> + [(set (match_operand:DI 0 "nonimmediate_operand" "=k,k,km")
>>>>>>>> + (unspec:DI
>>>>>>>> + [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")]
>>>>>>>> + UNSPEC_KMOV))]
>>>>>>>> + "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
>>>>>>>> + "@
>>>>>>>> + kmovq\t{%k1, %0|%0, %k1}
>>>>>>>> + kmovq\t{%1, %0|%0, %1}
>>>>>>>> + kmovq\t{%1, %0|%0, %1}";
>>>>>>>> + [(set_attr "mode" "DI")
>>>>>>>> + (set_attr "type" "mskmov")
>>>>>>>> + (set_attr "prefix" "vex")])
>>>>>>>>
>>>>>>>> - kmovd (and existing kmovw) should be using register_operand for
>>>>>>>> opreand 0. In this case, there is no need for MEM_P checks at all.
>>>>>>>> - In the insn constraint, pease check TARGET_AVX before checking MEM_P.
>>>>>>>> - please put these definitions above corresponding *mov??_internal patterns.
>>>>>>>
>>>>>>> Do you mean put below *mov??_internal patterns? Attached corrected such way.
>>>>>>
>>>>>> No, please put kmovq near *movdi_internal, kmovd near *movsi_internal,
>>>>>> etc. It doesn't matter if they are above or below their respective
>>>>>> *mov??_internal patterns, as long as they are positioned in some
>>>>>> consistent way. IOW, new patterns shouldn't be grouped together, as is
>>>>>> the case with your patch.
>>>>>
>>>>> +(define_insn "kmovb"
>>>>> + [(set (match_operand:QI 0 "register_operand" "=k,k")
>>>>> + (unspec:QI
>>>>> + [(match_operand:QI 1 "nonimmediate_operand" "r,km")]
>>>>> + UNSPEC_KMOV))]
>>>>> + "TARGET_AVX512DQ && !MEM_P (operands[1])"
>>>>>
>>>>> There is no need for !MEM_P, this will prevent memory operand, which
>>>>> is allowed by constraint "m".
>>>>>
>>>>> +(define_insn "kmovq"
>>>>> + [(set (match_operand:DI 0 "register_operand" "=k,k,km")
>>>>> + (unspec:DI
>>>>> + [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")]
>>>>> + UNSPEC_KMOV))]
>>>>> + "TARGET_AVX512BW && !MEM_P (operands[1])"
>>>>>
>>>>> Operand 0 should have "nonimmediate_operand" predicate. And here you
>>>>> need && !(MEM_P (op0) && MEM_P (op1)) in insn constraint to prevent
>>>>> mem->mem moves.
>>>>
>>>> Changed according your comments and attached.
>>>
>>> Still not good.
>>>
>>> +(define_insn "kmovd"
>>> + [(set (match_operand:SI 0 "register_operand" "=k,k")
>>> + (unspec:SI
>>> + [(match_operand:SI 1 "nonimmediate_operand" "r,km")]
>>> + UNSPEC_KMOV))]
>>> + "TARGET_AVX512BW && !MEM_P (operands[1])"
>>>
>>> Remove !MEM_P in the above pattern.
>>>
>>> (define_insn "kmovw"
>>> - [(set (match_operand:HI 0 "nonimmediate_operand" "=k,k")
>>> + [(set (match_operand:HI 0 "register_operand" "=k,k")
>>> (unspec:HI
>>> [(match_operand:HI 1 "nonimmediate_operand" "r,km")]
>>> UNSPEC_KMOV))]
>>> - "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512F"
>>> + "TARGET_AVX512F && !MEM_P (operands[1])"
>>>
>>> Also remove !MEM_P here.
>>>
>>> +(define_insn "kadd<mode>"
>>> + [(set (match_operand:SWI1248x 0 "register_operand" "=r,&r,!k")
>>> + (plus:SWI1248x
>>> + (not:SWI1248x
>>> + (match_operand:SWI1248x 1 "register_operand" "r,0,k"))
>>> + (match_operand:SWI1248x 2 "register_operand" "r,r,k")))
>>> + (clobber (reg:CC FLAGS_REG))]
>>> + "TARGET_AVX512F"
>>> +{
>>> + switch (which_alternative)
>>> + {
>>> + case 0:
>>> + return "add\t{%k2, %k1, %k0|%k0, %k1, %k2}";
>>> + case 1:
>>> + return "#";
>>> + case 2:
>>> + if (TARGET_AVX512BW && <MODE>mode == DImode)
>>> + return "kaddq\t{%2, %1, %0|%0, %1, %2}";
>>> + else if (TARGET_AVX512BW && <MODE>mode == SImode)
>>> + return "kaddd\t{%2, %1, %0|%0, %1, %2}";
>>> + else if (TARGET_AVX512DQ && <MODE>mode == QImode)
>>> + return "kaddb\t{%2, %1, %0|%0, %1, %2}";
>>> + else
>>> + return "kaddw\t{%2, %1, %0|%0, %1, %2}";
>>> +
>>>
>>> The above pattern is wrong. Is there really a NOT RTX present,
>>> implying effectively a kaddn?
>>>
>>> If this is plain add, then you need to change other add patterns, see
>>> how logic patterns are amended with "k" constraint, added pattern
>>> should look like *k<logic><mode> pattern.
>>>
>>> (define_insn "kandn<mode>"
>>> - [(set (match_operand:SWI12 0 "register_operand" "=r,&r,!k")
>>> - (and:SWI12
>>> - (not:SWI12
>>> - (match_operand:SWI12 1 "register_operand" "r,0,k"))
>>> - (match_operand:SWI12 2 "register_operand" "r,r,k")))
>>> + [(set (match_operand:SWI1248x 0 "register_operand" "=r,&r,!k")
>>> + (and:SWI1248x
>>> + (not:SWI1248x
>>> + (match_operand:SWI1248x 1 "register_operand" "r,0,k"))
>>> + (match_operand:SWI1248x 2 "register_operand" "r,r,k")))
>>> (clobber (reg:CC FLAGS_REG))]
>>> "TARGET_AVX512F"
>>> {
>>> @@ -8319,10 +8358,50 @@
>>> case 1:
>>> return "#";
>>> case 2:
>>> - if (TARGET_AVX512DQ && <MODE>mode == QImode)
>>> + if (TARGET_AVX512BW && <MODE>mode == DImode)
>>> + return "kandnq\t{%2, %1, %0|%0, %1, %2}";
>>> + else if (TARGET_AVX512BW && <MODE>mode == SImode)
>>> + return "kandnd\t{%2, %1, %0|%0, %1, %2}";
>>> + else if (TARGET_AVX512DQ && <MODE>mode == QImode)
>>> return "kandnb\t{%2, %1, %0|%0, %1, %2}";
>>> else
>>> return "kandnw\t{%2, %1, %0|%0, %1, %2}";
>>>
>>> The above should use SWI1248_AVX512BW mode iterator, see
>>> *k<logic><mode> pattern.
>>
>> I split this patch after last updates in md files, here is the first
>> part which doesn't change md files.
>> Regtested on x86_64-linux-gnu. Is this part ok?
>
> There is no point to scan for kmovX insn in e.g.:
>
> +/* { dg-final { scan-assembler-times "kmovq" 2 } } */
> +
> +#include <immintrin.h>
> +
> +void
> +avx512bw_test ()
> +{
> + __mmask64 k1, k2, k3;
> + volatile __m512i x = _mm512_setzero_si512 ();
> +
> + __asm__( "kmovq %1, %0" : "=k" (k1) : "r" (1) );
> + __asm__( "kmovq %1, %0" : "=k" (k2) : "r" (2) );
>
> since you emit it from inline asm.
>
> Please remove these pointles kmovX scan-asm-times directives from the
> testcases, and please also remove it from avx512f-kandnw-1.c
> testcase.
>
> The patch is OK with this change.
Attached fixed with updated ChangeLogs.
HJ, could you commit please?
--
WBR,
Andrew
[-- Attachment #2: avx512-kmask-intrin-part1_v2.patch --]
[-- Type: application/octet-stream, Size: 32933 bytes --]
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 1ace8b0..02d560d
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,25 @@
+2016-12-05 Andrew Senkevich <andrew.senkevich@intel.com>
+
+ * config/i386/avx512bwintrin.h: Add new k-mask intrinsics.
+ * config/i386/avx512dqintrin.h: Ditto.
+ * config/i386/avx512fintrin.h: Ditto.
+ * config/i386/i386-builtin-types.def (UCHAR_FTYPE_UQI_UQI_PUCHAR,
+ UCHAR_FTYPE_UHI_UHI_PUCHAR, UCHAR_FTYPE_USI_USI_PUCHAR,
+ UCHAR_FTYPE_UDI_UDI_PUCHAR, UCHAR_FTYPE_UQI_UQI, UCHAR_FTYPE_UHI_UHI,
+ UCHAR_FTYPE_USI_USI, UCHAR_FTYPE_UDI_UDI, UQI_FTYPE_UQI_INT,
+ UHI_FTYPE_UHI_INT, USI_FTYPE_USI_INT, UDI_FTYPE_UDI_INT,
+ UQI_FTYPE_UQI, USI_FTYPE_USI, UDI_FTYPE_UDI, UQI_FTYPE_UQI_UQI): New
+ function types.
+ * config/i386/i386-builtin.def (__builtin_ia32_knotqi,
+ __builtin_ia32_knotsi, __builtin_ia32_knotdi,
+ __builtin_ia32_korqi, __builtin_ia32_korsi, __builtin_ia32_kordi,
+ __builtin_ia32_kxnorqi, __builtin_ia32_kxnorsi,
+ __builtin_ia32_kxnordi, __builtin_ia32_kxorqi, __builtin_ia32_kxorsi,
+ __builtin_ia32_kxordi, __builtin_ia32_kandqi,
+ __builtin_ia32_kandsi, __builtin_ia32_kanddi, __builtin_ia32_kandnqi,
+ __builtin_ia32_kandnsi, __builtin_ia32_kandndi): New.
+ * config/i386/i386.c (ix86_expand_args_builtin): Handle new types.
+
2016-12-05 Segher Boessenkool <segher@kernel.crashing.org>
* combine.c: Revert r243162.
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index d9edb52..3b0a8fa
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,28 @@
+2016-12-05 Andrew Senkevich <andrew.senkevich@intel.com>
+
+ * gcc.target/i386/avx512bw-kandd-1.c: New.
+ * gcc.target/i386/avx512bw-kandnd-1.c: Ditto.
+ * gcc.target/i386/avx512bw-kandnq-1.c: Ditto.
+ * gcc.target/i386/avx512bw-kandq-1.c: Ditto.
+ * gcc.target/i386/avx512bw-knotd-1.c: Ditto.
+ * gcc.target/i386/avx512bw-knotq-1.c: Ditto.
+ * gcc.target/i386/avx512bw-kord-1.c: Ditto.
+ * gcc.target/i386/avx512bw-korq-1.c: Ditto.
+ * gcc.target/i386/avx512bw-kunpckdq-3.c: Ditto.
+ * gcc.target/i386/avx512bw-kunpckwd-3.c: Ditto.
+ * gcc.target/i386/avx512bw-kxnord-1.c: Ditto.
+ * gcc.target/i386/avx512bw-kxnorq-1.c: Ditto.
+ * gcc.target/i386/avx512bw-kxord-1.c: Ditto.
+ * gcc.target/i386/avx512bw-kxorq-1.c: Ditto.
+ * gcc.target/i386/avx512dq-kandb-1.c: Ditto.
+ * gcc.target/i386/avx512dq-kandnb-1.c: Ditto.
+ * gcc.target/i386/avx512dq-knotb-1.c: Ditto.
+ * gcc.target/i386/avx512dq-korb-1.c: Ditto.
+ * gcc.target/i386/avx512dq-kxnorb-1.c: Ditto.
+ * gcc.target/i386/avx512dq-kxorb-1.c: Ditto.
+ * gcc.target/i386/avx512f-kunpckbw-3.c: Ditto.
+ * gcc.target/i386/avx512f-kandnw-1.c: Removed unneeded check.
+
2016-12-05 Paolo Bonzini <bonzini@gnu.org>
* gcc.dg/fold-and-lshift.c, gcc.dg/fold-and-rshift-1.c,
diff --git a/gcc/config/i386/avx512bwintrin.h b/gcc/config/i386/avx512bwintrin.h
index 4069802..9e6e0ce 100644
--- a/gcc/config/i386/avx512bwintrin.h
+++ b/gcc/config/i386/avx512bwintrin.h
@@ -40,6 +40,90 @@ typedef char __v64qi __attribute__ ((__vector_size__ (64)));
typedef unsigned long long __mmask64;
+extern __inline __mmask32
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_knot_mask32 (__mmask32 __A)
+{
+ return (__mmask32) __builtin_ia32_knotsi ((__mmask32) __A);
+}
+
+extern __inline __mmask64
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_knot_mask64 (__mmask64 __A)
+{
+ return (__mmask64) __builtin_ia32_knotdi ((__mmask64) __A);
+}
+
+extern __inline __mmask32
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kor_mask32 (__mmask32 __A, __mmask32 __B)
+{
+ return (__mmask32) __builtin_ia32_korsi ((__mmask32) __A, (__mmask32) __B);
+}
+
+extern __inline __mmask64
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kor_mask64 (__mmask64 __A, __mmask64 __B)
+{
+ return (__mmask64) __builtin_ia32_kordi ((__mmask64) __A, (__mmask64) __B);
+}
+
+extern __inline __mmask32
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kxnor_mask32 (__mmask32 __A, __mmask32 __B)
+{
+ return (__mmask32) __builtin_ia32_kxnorsi ((__mmask32) __A, (__mmask32) __B);
+}
+
+extern __inline __mmask64
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kxnor_mask64 (__mmask64 __A, __mmask64 __B)
+{
+ return (__mmask64) __builtin_ia32_kxnordi ((__mmask64) __A, (__mmask64) __B);
+}
+
+extern __inline __mmask32
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kxor_mask32 (__mmask32 __A, __mmask32 __B)
+{
+ return (__mmask32) __builtin_ia32_kxorsi ((__mmask32) __A, (__mmask32) __B);
+}
+
+extern __inline __mmask64
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kxor_mask64 (__mmask64 __A, __mmask64 __B)
+{
+ return (__mmask64) __builtin_ia32_kxordi ((__mmask64) __A, (__mmask64) __B);
+}
+
+extern __inline __mmask32
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kand_mask32 (__mmask32 __A, __mmask32 __B)
+{
+ return (__mmask32) __builtin_ia32_kandsi ((__mmask32) __A, (__mmask32) __B);
+}
+
+extern __inline __mmask64
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kand_mask64 (__mmask64 __A, __mmask64 __B)
+{
+ return (__mmask64) __builtin_ia32_kanddi ((__mmask64) __A, (__mmask64) __B);
+}
+
+extern __inline __mmask32
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kandn_mask32 (__mmask32 __A, __mmask32 __B)
+{
+ return (__mmask32) __builtin_ia32_kandnsi ((__mmask32) __A, (__mmask32) __B);
+}
+
+extern __inline __mmask64
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kandn_mask64 (__mmask64 __A, __mmask64 __B)
+{
+ return (__mmask64) __builtin_ia32_kandndi ((__mmask64) __A, (__mmask64) __B);
+}
+
extern __inline __m512i
__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
_mm512_mask_mov_epi16 (__m512i __W, __mmask32 __U, __m512i __A)
@@ -114,6 +198,14 @@ _mm512_kunpackw (__mmask32 __A, __mmask32 __B)
(__mmask32) __B);
}
+extern __inline __mmask32
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kunpackw_mask32 (__mmask16 __A, __mmask16 __B)
+{
+ return (__mmask32) __builtin_ia32_kunpcksi ((__mmask32) __A,
+ (__mmask32) __B);
+}
+
extern __inline __mmask64
__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
_mm512_kunpackd (__mmask64 __A, __mmask64 __B)
@@ -122,6 +214,14 @@ _mm512_kunpackd (__mmask64 __A, __mmask64 __B)
(__mmask64) __B);
}
+extern __inline __mmask64
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kunpackd_mask64 (__mmask32 __A, __mmask32 __B)
+{
+ return (__mmask64) __builtin_ia32_kunpckdi ((__mmask64) __A,
+ (__mmask64) __B);
+}
+
extern __inline __m512i
__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
_mm512_mask_loadu_epi8 (__m512i __W, __mmask64 __U, void const *__P)
diff --git a/gcc/config/i386/avx512dqintrin.h b/gcc/config/i386/avx512dqintrin.h
index 4b954f9..d2405c3 100644
--- a/gcc/config/i386/avx512dqintrin.h
+++ b/gcc/config/i386/avx512dqintrin.h
@@ -34,6 +34,48 @@
#define __DISABLE_AVX512DQ__
#endif /* __AVX512DQ__ */
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_knot_mask8 (__mmask8 __A)
+{
+ return (__mmask8) __builtin_ia32_knotqi ((__mmask8) __A);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kor_mask8 (__mmask8 __A, __mmask8 __B)
+{
+ return (__mmask8) __builtin_ia32_korqi ((__mmask8) __A, (__mmask8) __B);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kxnor_mask8 (__mmask8 __A, __mmask8 __B)
+{
+ return (__mmask8) __builtin_ia32_kxnorqi ((__mmask8) __A, (__mmask8) __B);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kxor_mask8 (__mmask8 __A, __mmask8 __B)
+{
+ return (__mmask8) __builtin_ia32_kxorqi ((__mmask8) __A, (__mmask8) __B);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kand_mask8 (__mmask8 __A, __mmask8 __B)
+{
+ return (__mmask8) __builtin_ia32_kandqi ((__mmask8) __A, (__mmask8) __B);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kandn_mask8 (__mmask8 __A, __mmask8 __B)
+{
+ return (__mmask8) __builtin_ia32_kandnqi ((__mmask8) __A, (__mmask8) __B);
+}
+
extern __inline __m512d
__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
_mm512_broadcast_f64x2 (__m128d __A)
diff --git a/gcc/config/i386/avx512fintrin.h b/gcc/config/i386/avx512fintrin.h
index 2372c83..ab1704b 100644
--- a/gcc/config/i386/avx512fintrin.h
+++ b/gcc/config/i386/avx512fintrin.h
@@ -9977,6 +9977,13 @@ _mm512_maskz_expandloadu_epi32 (__mmask16 __U, void const *__P)
}
/* Mask arithmetic operations */
+#define _kand_mask16 _mm512_kand
+#define _kandn_mask16 _mm512_kandn
+#define _knot_mask16 _mm512_knot
+#define _kor_mask16 _mm512_kor
+#define _kxnor_mask16 _mm512_kxnor
+#define _kxor_mask16 _mm512_kxor
+
extern __inline __mmask16
__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
_mm512_kand (__mmask16 __A, __mmask16 __B)
@@ -9988,7 +9995,8 @@ extern __inline __mmask16
__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
_mm512_kandn (__mmask16 __A, __mmask16 __B)
{
- return (__mmask16) __builtin_ia32_kandnhi ((__mmask16) __A, (__mmask16) __B);
+ return (__mmask16) __builtin_ia32_kandnhi ((__mmask16) __A,
+ (__mmask16) __B);
}
extern __inline __mmask16
@@ -10042,6 +10050,13 @@ _mm512_kunpackb (__mmask16 __A, __mmask16 __B)
return (__mmask16) __builtin_ia32_kunpckhi ((__mmask16) __A, (__mmask16) __B);
}
+extern __inline __mmask16
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kunpackb_mask16 (__mmask8 __A, __mmask8 __B)
+{
+ return (__mmask16) __builtin_ia32_kunpckhi ((__mmask16) __A, (__mmask16) __B);
+}
+
#ifdef __OPTIMIZE__
extern __inline __m512i
__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
diff --git a/gcc/config/i386/i386-builtin-types.def b/gcc/config/i386/i386-builtin-types.def
index 4a38c12..6e938eb 100644
--- a/gcc/config/i386/i386-builtin-types.def
+++ b/gcc/config/i386/i386-builtin-types.def
@@ -139,6 +139,12 @@ DEF_POINTER_TYPE (PLONGLONG, LONGLONG)
DEF_POINTER_TYPE (PULONGLONG, ULONGLONG)
DEF_POINTER_TYPE (PUNSIGNED, UNSIGNED)
+DEF_POINTER_TYPE (PUQI, UQI)
+DEF_POINTER_TYPE (PUHI, UHI)
+DEF_POINTER_TYPE (PUSI, USI)
+DEF_POINTER_TYPE (PUDI, UDI)
+DEF_POINTER_TYPE (PUCHAR, UCHAR)
+
DEF_POINTER_TYPE (PV2SI, V2SI)
DEF_POINTER_TYPE (PV2DF, V2DF)
DEF_POINTER_TYPE (PV2DI, V2DI)
@@ -536,7 +542,28 @@ DEF_FUNCTION_TYPE (V16SI, V16SI, V16SI, V16SI, V16SI, V16SI, PCV4SI)
# Instructions returning mask
+DEF_FUNCTION_TYPE (UCHAR, UQI, UQI, PUCHAR)
+DEF_FUNCTION_TYPE (UCHAR, UQI, UQI)
+DEF_FUNCTION_TYPE (UCHAR, UHI, UHI, PUCHAR)
+DEF_FUNCTION_TYPE (UCHAR, UHI, UHI)
+DEF_FUNCTION_TYPE (UCHAR, USI, USI, PUCHAR)
+DEF_FUNCTION_TYPE (UCHAR, USI, USI)
+DEF_FUNCTION_TYPE (UCHAR, UDI, UDI, PUCHAR)
+DEF_FUNCTION_TYPE (UCHAR, UDI, UDI)
+
+DEF_FUNCTION_TYPE (USI, UQI)
+DEF_FUNCTION_TYPE (USI, UHI)
+DEF_FUNCTION_TYPE (UQI, USI)
+DEF_FUNCTION_TYPE (UHI, USI)
+
+DEF_FUNCTION_TYPE (UQI, UQI, INT)
+DEF_FUNCTION_TYPE (UHI, UHI, INT)
+DEF_FUNCTION_TYPE (USI, USI, INT)
+DEF_FUNCTION_TYPE (UDI, UDI, INT)
+DEF_FUNCTION_TYPE (UQI, UQI)
DEF_FUNCTION_TYPE (UHI, UHI)
+DEF_FUNCTION_TYPE (USI, USI)
+DEF_FUNCTION_TYPE (UDI, UDI)
DEF_FUNCTION_TYPE (UHI, V16QI)
DEF_FUNCTION_TYPE (USI, V32QI)
DEF_FUNCTION_TYPE (UDI, V64QI)
@@ -549,6 +576,7 @@ DEF_FUNCTION_TYPE (UHI, V16SI)
DEF_FUNCTION_TYPE (UQI, V2DI)
DEF_FUNCTION_TYPE (UQI, V4DI)
DEF_FUNCTION_TYPE (UQI, V8DI)
+DEF_FUNCTION_TYPE (UQI, UQI, UQI)
DEF_FUNCTION_TYPE (UHI, UHI, UHI)
DEF_FUNCTION_TYPE (USI, USI, USI)
DEF_FUNCTION_TYPE (UDI, UDI, UDI)
diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
index a9c272a..83a5089 100644
--- a/gcc/config/i386/i386-builtin.def
+++ b/gcc/config/i386/i386-builtin.def
@@ -1436,15 +1436,33 @@ BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_avx512f_roundpd_vec_pack_sfix512, "__bu
BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_avx512f_roundpd_vec_pack_sfix512, "__builtin_ia32_ceilpd_vec_pack_sfix512", IX86_BUILTIN_CEILPD_VEC_PACK_SFIX512, (enum rtx_code) ROUND_CEIL, (int) V16SI_FTYPE_V8DF_V8DF_ROUND)
/* Mask arithmetic operations */
+BDESC (OPTION_MASK_ISA_AVX512DQ, CODE_FOR_kandqi, "__builtin_ia32_kandqi", IX86_BUILTIN_KAND8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI)
BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_kandhi, "__builtin_ia32_kandhi", IX86_BUILTIN_KAND16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI)
+BDESC (OPTION_MASK_ISA_AVX512BW, CODE_FOR_kandsi, "__builtin_ia32_kandsi", IX86_BUILTIN_KAND32, UNKNOWN, (int) USI_FTYPE_USI_USI)
+BDESC (OPTION_MASK_ISA_AVX512BW, CODE_FOR_kanddi, "__builtin_ia32_kanddi", IX86_BUILTIN_KAND64, UNKNOWN, (int) UDI_FTYPE_UDI_UDI)
+BDESC (OPTION_MASK_ISA_AVX512DQ, CODE_FOR_kandnqi, "__builtin_ia32_kandnqi", IX86_BUILTIN_KANDN8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI)
BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_kandnhi, "__builtin_ia32_kandnhi", IX86_BUILTIN_KANDN16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI)
+BDESC (OPTION_MASK_ISA_AVX512BW, CODE_FOR_kandnsi, "__builtin_ia32_kandnsi", IX86_BUILTIN_KANDN32, UNKNOWN, (int) USI_FTYPE_USI_USI)
+BDESC (OPTION_MASK_ISA_AVX512BW, CODE_FOR_kandndi, "__builtin_ia32_kandndi", IX86_BUILTIN_KANDN64, UNKNOWN, (int) UDI_FTYPE_UDI_UDI)
+BDESC (OPTION_MASK_ISA_AVX512DQ, CODE_FOR_knotqi, "__builtin_ia32_knotqi", IX86_BUILTIN_KNOT8, UNKNOWN, (int) UQI_FTYPE_UQI)
BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_knothi, "__builtin_ia32_knothi", IX86_BUILTIN_KNOT16, UNKNOWN, (int) UHI_FTYPE_UHI)
+BDESC (OPTION_MASK_ISA_AVX512BW, CODE_FOR_knotsi, "__builtin_ia32_knotsi", IX86_BUILTIN_KNOT32, UNKNOWN, (int) USI_FTYPE_USI)
+BDESC (OPTION_MASK_ISA_AVX512BW, CODE_FOR_knotdi, "__builtin_ia32_knotdi", IX86_BUILTIN_KNOT64, UNKNOWN, (int) UDI_FTYPE_UDI)
+BDESC (OPTION_MASK_ISA_AVX512DQ, CODE_FOR_kiorqi, "__builtin_ia32_korqi", IX86_BUILTIN_KOR8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI)
BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_kiorhi, "__builtin_ia32_korhi", IX86_BUILTIN_KOR16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI)
+BDESC (OPTION_MASK_ISA_AVX512BW, CODE_FOR_kiorsi, "__builtin_ia32_korsi", IX86_BUILTIN_KOR32, UNKNOWN, (int) USI_FTYPE_USI_USI)
+BDESC (OPTION_MASK_ISA_AVX512BW, CODE_FOR_kiordi, "__builtin_ia32_kordi", IX86_BUILTIN_KOR64, UNKNOWN, (int) UDI_FTYPE_UDI_UDI)
BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_kortestchi, "__builtin_ia32_kortestchi", IX86_BUILTIN_KORTESTC16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI)
BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_kortestzhi, "__builtin_ia32_kortestzhi", IX86_BUILTIN_KORTESTZ16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI)
BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_kunpckhi, "__builtin_ia32_kunpckhi", IX86_BUILTIN_KUNPCKBW, UNKNOWN, (int) UHI_FTYPE_UHI_UHI)
+BDESC (OPTION_MASK_ISA_AVX512DQ, CODE_FOR_kxnorqi, "__builtin_ia32_kxnorqi", IX86_BUILTIN_KXNOR8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI)
BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_kxnorhi, "__builtin_ia32_kxnorhi", IX86_BUILTIN_KXNOR16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI)
+BDESC (OPTION_MASK_ISA_AVX512BW, CODE_FOR_kxnorsi, "__builtin_ia32_kxnorsi", IX86_BUILTIN_KXNOR32, UNKNOWN, (int) USI_FTYPE_USI_USI)
+BDESC (OPTION_MASK_ISA_AVX512BW, CODE_FOR_kxnordi, "__builtin_ia32_kxnordi", IX86_BUILTIN_KXNOR64, UNKNOWN, (int) UDI_FTYPE_UDI_UDI)
+BDESC (OPTION_MASK_ISA_AVX512DQ, CODE_FOR_kxorqi, "__builtin_ia32_kxorqi", IX86_BUILTIN_KXOR8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI)
BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_kxorhi, "__builtin_ia32_kxorhi", IX86_BUILTIN_KXOR16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI)
+BDESC (OPTION_MASK_ISA_AVX512BW, CODE_FOR_kxorsi, "__builtin_ia32_kxorsi", IX86_BUILTIN_KXOR32, UNKNOWN, (int) USI_FTYPE_USI_USI)
+BDESC (OPTION_MASK_ISA_AVX512BW, CODE_FOR_kxordi, "__builtin_ia32_kxordi", IX86_BUILTIN_KXOR64, UNKNOWN, (int) UDI_FTYPE_UDI_UDI)
BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_kmovw, "__builtin_ia32_kmov16", IX86_BUILTIN_KMOV16, UNKNOWN, (int) UHI_FTYPE_UHI)
/* SHA */
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 41717da..003439f 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -34842,7 +34842,12 @@ ix86_expand_args_builtin (const struct builtin_description *d,
case V4DI_FTYPE_V8HI:
case V4DI_FTYPE_V4SI:
case V4DI_FTYPE_V2DI:
+ case UQI_FTYPE_UQI:
case UHI_FTYPE_UHI:
+ case USI_FTYPE_USI:
+ case USI_FTYPE_UQI:
+ case USI_FTYPE_UHI:
+ case UDI_FTYPE_UDI:
case UHI_FTYPE_V16QI:
case USI_FTYPE_V32QI:
case UDI_FTYPE_V64QI:
@@ -34976,6 +34981,7 @@ ix86_expand_args_builtin (const struct builtin_description *d,
case UINT_FTYPE_UINT_UCHAR:
case UINT16_FTYPE_UINT16_INT:
case UINT8_FTYPE_UINT8_INT:
+ case UQI_FTYPE_UQI_UQI:
case UHI_FTYPE_UHI_UHI:
case USI_FTYPE_USI_USI:
case UDI_FTYPE_UDI_UDI:
@@ -35023,6 +35029,10 @@ ix86_expand_args_builtin (const struct builtin_description *d,
case V4DI_FTYPE_V8DI_INT:
case QI_FTYPE_V4SF_INT:
case QI_FTYPE_V2DF_INT:
+ case UQI_FTYPE_UQI_INT:
+ case UHI_FTYPE_UHI_INT:
+ case USI_FTYPE_USI_INT:
+ case UDI_FTYPE_UDI_INT:
nargs = 2;
nargs_constant = 1;
break;
diff --git a/gcc/testsuite/gcc.target/i386/avx512bw-kandd-1.c b/gcc/testsuite/gcc.target/i386/avx512bw-kandd-1.c
new file mode 100644
index 0000000..2a934f5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512bw-kandd-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -O2" } */
+/* { dg-final { scan-assembler-times "kandd\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx512bw_test ()
+{
+ __mmask32 k1, k2, k3;
+ volatile __m512i x = _mm512_setzero_epi32();
+
+ __asm__( "kmovd %1, %0" : "=k" (k1) : "r" (1) );
+ __asm__( "kmovd %1, %0" : "=k" (k2) : "r" (2) );
+
+ k3 = _kand_mask32 (k1, k2);
+ x = _mm512_mask_add_epi16 (x, k3, x, x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512bw-kandnd-1.c b/gcc/testsuite/gcc.target/i386/avx512bw-kandnd-1.c
new file mode 100644
index 0000000..69cbe04
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512bw-kandnd-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -O2" } */
+/* { dg-final { scan-assembler-times "kandnd\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx512bw_test ()
+{
+ __mmask32 k1, k2, k3;
+ volatile __m512i x = _mm512_setzero_si512 ();
+
+ __asm__( "kmovd %1, %0" : "=k" (k1) : "r" (1) );
+ __asm__( "kmovd %1, %0" : "=k" (k2) : "r" (2) );
+
+ k3 = _kandn_mask32 (k1, k2);
+ x = _mm512_mask_add_epi16 (x, k3, x, x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512bw-kandnq-1.c b/gcc/testsuite/gcc.target/i386/avx512bw-kandnq-1.c
new file mode 100644
index 0000000..e8b7a5f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512bw-kandnq-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -O2" } */
+/* { dg-final { scan-assembler-times "kandnq\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx512bw_test ()
+{
+ __mmask64 k1, k2, k3;
+ volatile __m512i x = _mm512_setzero_si512 ();
+
+ __asm__( "kmovq %1, %0" : "=k" (k1) : "r" (1) );
+ __asm__( "kmovq %1, %0" : "=k" (k2) : "r" (2) );
+
+ k3 = _kandn_mask64 (k1, k2);
+ x = _mm512_mask_add_epi8 (x, k3, x, x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512bw-kandq-1.c b/gcc/testsuite/gcc.target/i386/avx512bw-kandq-1.c
new file mode 100644
index 0000000..a1aaed6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512bw-kandq-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -O2" } */
+/* { dg-final { scan-assembler-times "kandq\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx512bw_test ()
+{
+ __mmask64 k1, k2, k3;
+ volatile __m512i x = _mm512_setzero_epi32();
+
+ __asm__( "kmovq %1, %0" : "=k" (k1) : "r" (1) );
+ __asm__( "kmovq %1, %0" : "=k" (k2) : "r" (2) );
+
+ k3 = _kand_mask64 (k1, k2);
+ x = _mm512_mask_add_epi8 (x, k3, x, x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512bw-knotd-1.c b/gcc/testsuite/gcc.target/i386/avx512bw-knotd-1.c
new file mode 100644
index 0000000..8a7e033
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512bw-knotd-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -O2" } */
+/* { dg-final { scan-assembler-times "knotd\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx512bw_test ()
+{
+ __mmask32 k1, k2;
+ volatile __m512i x = _mm512_setzero_si512 ();
+
+ __asm__( "kmovd %1, %0" : "=k" (k1) : "r" (45) );
+
+ k2 = _knot_mask32 (k1);
+ x = _mm512_mask_add_epi16 (x, k1, x, x);
+ x = _mm512_mask_add_epi16 (x, k2, x, x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512bw-knotq-1.c b/gcc/testsuite/gcc.target/i386/avx512bw-knotq-1.c
new file mode 100644
index 0000000..deb6579
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512bw-knotq-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -O2" } */
+/* { dg-final { scan-assembler-times "knotq\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx512bw_test ()
+{
+ __mmask64 k1, k2;
+ volatile __m512i x = _mm512_setzero_si512 ();
+
+ __asm__( "kmovq %1, %0" : "=k" (k1) : "r" (45) );
+
+ k2 = _knot_mask64 (k1);
+ x = _mm512_mask_add_epi8 (x, k1, x, x);
+ x = _mm512_mask_add_epi8 (x, k2, x, x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512bw-kord-1.c b/gcc/testsuite/gcc.target/i386/avx512bw-kord-1.c
new file mode 100644
index 0000000..4c35a81
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512bw-kord-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -O2" } */
+/* { dg-final { scan-assembler-times "kord\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx512bw_test ()
+{
+ __mmask32 k1, k2, k3;
+ volatile __m512i x = _mm512_setzero_si512 ();
+
+ __asm__( "kmovd %1, %0" : "=k" (k1) : "r" (1) );
+ __asm__( "kmovd %1, %0" : "=k" (k2) : "r" (2) );
+
+ k3 = _kor_mask32 (k1, k2);
+ x = _mm512_mask_add_epi16 (x, k3, x, x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512bw-korq-1.c b/gcc/testsuite/gcc.target/i386/avx512bw-korq-1.c
new file mode 100644
index 0000000..89753f0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512bw-korq-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -O2" } */
+/* { dg-final { scan-assembler-times "korq\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx512bw_test ()
+{
+ __mmask64 k1, k2, k3;
+ volatile __m512i x = _mm512_setzero_si512 ();
+
+ __asm__( "kmovq %1, %0" : "=k" (k1) : "r" (1) );
+ __asm__( "kmovq %1, %0" : "=k" (k2) : "r" (2) );
+
+ k3 = _kor_mask64 (k1, k2);
+ x = _mm512_mask_add_epi8 (x, k3, x, x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512bw-kunpckdq-3.c b/gcc/testsuite/gcc.target/i386/avx512bw-kunpckdq-3.c
new file mode 100644
index 0000000..951260f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512bw-kunpckdq-3.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -O2" } */
+/* { dg-final { scan-assembler-times "kunpckdq\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx512bw_test () {
+ volatile __mmask64 k3;
+ __mmask32 k1, k2;
+
+ __asm__( "kmovd %1, %0" : "=k" (k1) : "r" (1) );
+ __asm__( "kmovd %1, %0" : "=k" (k2) : "r" (2) );
+
+ k3 = _kunpackd_mask64 (k1, k2);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512bw-kunpckwd-3.c b/gcc/testsuite/gcc.target/i386/avx512bw-kunpckwd-3.c
new file mode 100644
index 0000000..c68ad8c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512bw-kunpckwd-3.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -O2" } */
+/* { dg-final { scan-assembler-times "kunpckwd\[ \\t\]+\[^\{\n\]*%k\[1-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx512bw_test () {
+ volatile __mmask32 k3;
+ __mmask16 k1, k2;
+
+ __asm__( "kmovw %1, %0" : "=k" (k1) : "r" (1) );
+ __asm__( "kmovw %1, %0" : "=k" (k2) : "r" (2) );
+
+ k3 = _kunpackw_mask32 (k1, k2);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512bw-kxnord-1.c b/gcc/testsuite/gcc.target/i386/avx512bw-kxnord-1.c
new file mode 100644
index 0000000..d93d61e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512bw-kxnord-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -O2" } */
+/* { dg-final { scan-assembler-times "kxnord\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx512bw_test ()
+{
+ __mmask32 k1, k2, k3;
+ volatile __m512i x = _mm512_setzero_si512 ();
+
+ __asm__( "kmovd %1, %0" : "=k" (k1) : "r" (1) );
+ __asm__( "kmovd %1, %0" : "=k" (k2) : "r" (2) );
+
+ k3 = _kxnor_mask32 (k1, k2);
+ x = _mm512_mask_add_epi16 (x, k3, x, x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512bw-kxnorq-1.c b/gcc/testsuite/gcc.target/i386/avx512bw-kxnorq-1.c
new file mode 100644
index 0000000..ba72e1f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512bw-kxnorq-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -O2" } */
+/* { dg-final { scan-assembler-times "kxnorq\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx512bw_test ()
+{
+ __mmask64 k1, k2, k3;
+ volatile __m512i x = _mm512_setzero_si512 ();
+
+ __asm__( "kmovq %1, %0" : "=k" (k1) : "r" (1) );
+ __asm__( "kmovq %1, %0" : "=k" (k2) : "r" (2) );
+
+ k3 = _kxnor_mask64 (k1, k2);
+ x = _mm512_mask_add_epi8 (x, k3, x, x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512bw-kxord-1.c b/gcc/testsuite/gcc.target/i386/avx512bw-kxord-1.c
new file mode 100644
index 0000000..97ea291
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512bw-kxord-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -O2" } */
+/* { dg-final { scan-assembler-times "kxord\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx512bw_test ()
+{
+ __mmask32 k1, k2, k3;
+ volatile __m512i x = _mm512_setzero_si512 ();
+
+ __asm__( "kmovd %1, %0" : "=k" (k1) : "r" (1) );
+ __asm__( "kmovd %1, %0" : "=k" (k2) : "r" (2) );
+
+ k3 = _kxor_mask32 (k1, k2);
+ x = _mm512_mask_add_epi16 (x, k3, x, x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512bw-kxorq-1.c b/gcc/testsuite/gcc.target/i386/avx512bw-kxorq-1.c
new file mode 100644
index 0000000..abf4280
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512bw-kxorq-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -O2" } */
+/* { dg-final { scan-assembler-times "kxorq\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx512bw_test ()
+{
+ __mmask64 k1, k2, k3;
+ volatile __m512i x = _mm512_setzero_si512 ();
+
+ __asm__( "kmovq %1, %0" : "=k" (k1) : "r" (1) );
+ __asm__( "kmovq %1, %0" : "=k" (k2) : "r" (2) );
+
+ k3 = _kxor_mask64 (k1, k2);
+ x = _mm512_mask_add_epi8 (x, k3, x, x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512dq-kandb-1.c b/gcc/testsuite/gcc.target/i386/avx512dq-kandb-1.c
new file mode 100644
index 0000000..b5b5367
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512dq-kandb-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512dq -O2" } */
+/* { dg-final { scan-assembler-times "kandb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx512dq_test ()
+{
+ __mmask8 k1, k2, k3;
+ volatile __m512i x = _mm512_setzero_epi32();
+
+ __asm__( "kmovb %1, %0" : "=k" (k1) : "r" (1) );
+ __asm__( "kmovb %1, %0" : "=k" (k2) : "r" (2) );
+
+ k3 = _kand_mask8 (k1, k2);
+ x = _mm512_mask_add_epi64 (x, k3, x, x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512dq-kandnb-1.c b/gcc/testsuite/gcc.target/i386/avx512dq-kandnb-1.c
new file mode 100644
index 0000000..a0e96fd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512dq-kandnb-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512dq -O2" } */
+/* { dg-final { scan-assembler-times "kandnb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx512dq_test ()
+{
+ __mmask8 k1, k2, k3;
+ volatile __m512d x = _mm512_setzero_pd();
+
+ __asm__( "kmovb %1, %0" : "=k" (k1) : "r" (1) );
+ __asm__( "kmovb %1, %0" : "=k" (k2) : "r" (2) );
+
+ k3 = _kandn_mask8 (k1, k2);
+ x = _mm512_mask_add_pd (x, k3, x, x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512dq-knotb-1.c b/gcc/testsuite/gcc.target/i386/avx512dq-knotb-1.c
new file mode 100644
index 0000000..03bbf83
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512dq-knotb-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512dq -O2" } */
+/* { dg-final { scan-assembler-times "knotb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx512dq_test ()
+{
+ __mmask8 k1, k2;
+ volatile __m512d x = _mm512_setzero_pd();
+
+ __asm__( "kmovb %1, %0" : "=k" (k1) : "r" (45) );
+
+ k2 = _knot_mask8 (k1);
+ x = _mm512_mask_add_pd (x, k1, x, x);
+ x = _mm512_mask_add_pd (x, k2, x, x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512dq-korb-1.c b/gcc/testsuite/gcc.target/i386/avx512dq-korb-1.c
new file mode 100644
index 0000000..7717aee
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512dq-korb-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512dq -O2" } */
+/* { dg-final { scan-assembler-times "korb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx512dq_test ()
+{
+ __mmask8 k1, k2, k3;
+ volatile __m512d x = _mm512_setzero_pd();
+
+ __asm__( "kmovb %1, %0" : "=k" (k1) : "r" (1) );
+ __asm__( "kmovb %1, %0" : "=k" (k2) : "r" (2) );
+
+ k3 = _kor_mask8 (k1, k2);
+ x = _mm512_mask_add_pd (x, k3, x, x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512dq-kxnorb-1.c b/gcc/testsuite/gcc.target/i386/avx512dq-kxnorb-1.c
new file mode 100644
index 0000000..faa974f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512dq-kxnorb-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512dq -O2" } */
+/* { dg-final { scan-assembler-times "kxnorb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx512dq_test ()
+{
+ __mmask8 k1, k2, k3;
+ volatile __m512d x = _mm512_setzero_pd();
+
+ __asm__( "kmovb %1, %0" : "=k" (k1) : "r" (1) );
+ __asm__( "kmovb %1, %0" : "=k" (k2) : "r" (2) );
+
+ k3 = _kxnor_mask8 (k1, k2);
+ x = _mm512_mask_add_pd (x, k3, x, x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512dq-kxorb-1.c b/gcc/testsuite/gcc.target/i386/avx512dq-kxorb-1.c
new file mode 100644
index 0000000..a21830b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512dq-kxorb-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512dq -O2" } */
+/* { dg-final { scan-assembler-times "kxorb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx512dq_test ()
+{
+ __mmask8 k1, k2, k3;
+ volatile __m512d x = _mm512_setzero_pd();
+
+ __asm__( "kmovb %1, %0" : "=k" (k1) : "r" (1) );
+ __asm__( "kmovb %1, %0" : "=k" (k2) : "r" (2) );
+
+ k3 = _kxor_mask8 (k1, k2);
+ x = _mm512_mask_add_pd (x, k3, x, x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-kandnw-1.c b/gcc/testsuite/gcc.target/i386/avx512f-kandnw-1.c
index 727a589..17b7b29 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-kandnw-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-kandnw-1.c
@@ -1,7 +1,6 @@
/* { dg-do compile } */
/* { dg-options "-mavx512f -O2" } */
/* { dg-final { scan-assembler-times "kandnw\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "kmovw" 2 } } */
#include <immintrin.h>
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-kunpckbw-3.c b/gcc/testsuite/gcc.target/i386/avx512f-kunpckbw-3.c
new file mode 100644
index 0000000..2061f0a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512f-kunpckbw-3.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512f -O2" } */
+/* { dg-final { scan-assembler-times "kunpckbw\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+void
+avx512f_test () {
+ __mmask8 k1, k2;
+ __mmask16 k3;
+ volatile __m512 x = _mm512_setzero_ps();
+
+ __asm__( "kmovb %1, %0" : "=k" (k1) : "r" (1) );
+ __asm__( "kmovb %1, %0" : "=k" (k2) : "r" (2) );
+
+ k3 = _kunpackb_mask16 (k1, k2);
+ x = _mm512_mask_add_ps (x, k3, x, x);
+}
next prev parent reply other threads:[~2016-12-05 14:59 UTC|newest]
Thread overview: 48+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-11-11 14:34 Uros Bizjak
2016-11-11 17:39 ` Andrew Senkevich
2016-11-11 17:50 ` Uros Bizjak
2016-11-11 17:56 ` Uros Bizjak
2016-11-11 18:23 ` Andrew Senkevich
2016-11-11 19:14 ` Uros Bizjak
2016-12-02 17:45 ` Andrew Senkevich
2016-12-02 18:31 ` Uros Bizjak
2016-12-05 14:59 ` Andrew Senkevich [this message]
2016-12-05 17:19 ` H.J. Lu
2016-12-14 19:33 ` Andrew Senkevich
2016-12-14 20:35 ` Uros Bizjak
[not found] ` <CAMXFM3vC-3bMgQaQ2bnjDU7oQMPdvhurzgOFftZHqzNXAw=WgA@mail.gmail.com>
2016-12-15 16:51 ` Uros Bizjak
2016-12-15 19:04 ` Andrew Senkevich
2016-12-16 12:45 ` Uros Bizjak
2017-01-16 22:30 ` Andrew Senkevich
2017-01-16 22:55 ` Jakub Jelinek
2017-01-17 11:05 ` Andrew Senkevich
2017-01-17 11:06 ` Uros Bizjak
2017-01-17 12:30 ` Kirill Yukhin
2017-01-17 13:03 ` Andrew Senkevich
2017-01-17 13:51 ` Jakub Jelinek
2017-01-18 12:48 ` Andrew Senkevich
2017-01-18 21:45 ` Uros Bizjak
2017-01-19 10:46 ` Kirill Yukhin
2017-01-19 16:45 ` Andrew Senkevich
2017-01-19 18:04 ` Kirill Yukhin
2017-01-20 13:41 ` Andrew Senkevich
2017-01-20 13:47 ` Uros Bizjak
2017-01-20 17:26 ` Kirill Yukhin
2017-01-20 20:07 ` Andrew Senkevich
2017-01-21 8:25 ` Richard Biener
2017-01-23 11:33 ` Kirill Yukhin
2017-01-26 9:38 ` Thomas Schwinge
2017-01-26 10:04 ` Uros Bizjak
2017-01-26 10:51 ` Kirill Yukhin
2017-01-26 10:54 ` Jakub Jelinek
2017-01-26 10:55 ` Uros Bizjak
2017-01-26 11:04 ` Jakub Jelinek
2017-01-26 11:18 ` Uros Bizjak
2017-01-26 11:53 ` Thomas Schwinge
2017-01-26 12:04 ` Kirill Yukhin
2017-01-26 12:17 ` Jakub Jelinek
2017-01-26 12:23 ` Kirill Yukhin
2017-01-17 8:12 ` Uros Bizjak
-- strict thread matches above, loose matches on Subject: below --
2016-11-11 14:14 Andrew Senkevich
2016-11-11 15:26 ` Marc Glisse
2016-11-11 18:28 ` Andrew Senkevich
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAMXFM3uBecoF3o_AP5_qPTU5pAEH2N7yO3N0veDmD=BiAjBtOA@mail.gmail.com' \
--to=andrew.n.senkevich@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=hjl.tools@gmail.com \
--cc=ubizjak@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).