public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: [PATCH][x86_64] Enable AVX512 VPOPCNTD/VPOPCNTQ instructions
@ 2016-12-27 13:40 Uros Bizjak
  2016-12-27 13:43 ` Andrew Senkevich
  0 siblings, 1 reply; 12+ messages in thread
From: Uros Bizjak @ 2016-12-27 13:40 UTC (permalink / raw)
  To: gcc-patches; +Cc: Andrew Senkevich

Hello!

> this patch enables AVX512 VPOPCNTD/VPOPCNTQ instructions recently
> added in Instruction Set Extensions
> (https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf).

@@ -265,6 +268,9 @@
 (define_mode_iterator VF_512
   [V16SF V8DF])

+(define_mode_iterator VI_AVX512F
+  [V16SI V8DI])

Please name this iterator VI_512.

@@ -19881,3 +19887,44 @@
    [(set_attr ("type") ("ssemuladd"))
     (set_attr ("prefix") ("evex"))
     (set_attr ("mode") ("TI"))])
+
+(define_insn "vpopcount<mode>"
+  [(set (match_operand:VI_AVX512F 0 "register_operand" "=v, v")
+ (popcount:VI_AVX512F
+  (match_operand:VI_AVX512F 1 "nonimmediate_operand" "v, m")))]
+  "TARGET_AVX512VPOPCNTDQ"
+  "vpopcnt<ssemodesuffix>\t{%1, %0|%0, %1}")
+
+(define_insn "vpopcountv16si_mask"
+  [(set (match_operand:V16SI 0 "register_operand" "=v, v")
+ (unspec:V16SI
+  [(match_operand:V16SI 1 "nonimmediate_operand" "v, m")
+   (match_operand:HI 2 "register_operand" "Yk, Yk")
+   (match_operand:V16SI 3 "nonimmediate_operand" "0, 0")] UNSPEC_VPOPCNTDQ))]
+  "TARGET_AVX512VPOPCNTDQ"
+  "vpopcntd\t{%1, %0%{%2%}|%{%2%}%0, %1}")
+
+(define_insn "vpopcountv16si_maskz"
+  [(set (match_operand:V16SI 0 "register_operand" "=v, v")
+ (unspec:V16SI
+  [(match_operand:HI 1 "register_operand" "Yk, Yk")
+   (match_operand:V16SI 2 "nonimmediate_operand" "v, m")] UNSPEC_VPOPCNTDQ))]
+  "TARGET_AVX512VPOPCNTDQ"
+  "vpopcntd\t{%2, %0%{%1%}%{z%}|%{%1%}%{z%}%0, %2}")
+
+(define_insn "vpopcountv8di_mask"
+  [(set (match_operand:V8DI 0 "register_operand" "=v, v")
+ (unspec:V8DI
+  [(match_operand:V8DI 1 "nonimmediate_operand" "v, m")
+   (match_operand:QI 2 "register_operand" "Yk, Yk")
+   (match_operand:V8DI 3 "nonimmediate_operand" "0, 0")] UNSPEC_VPOPCNTDQ))]
+  "TARGET_AVX512VPOPCNTDQ"
+  "vpopcntq\t{%1, %0%{%2%}|%{%2%}%0, %1}")
+
+(define_insn "vpopcountv8di_maskz"
+  [(set (match_operand:V8DI 0 "register_operand" "=v, v")
+ (unspec:V8DI
+  [(match_operand:QI 1 "register_operand" "Yk, Yk")
+   (match_operand:V8DI 2 "nonimmediate_operand" "v, m")] UNSPEC_VPOPCNTDQ))]
+  "TARGET_AVX512VPOPCNTDQ"
+  "vpopcntq\t{%2, %0%{%1%}%{z%}|%{%1%}%{z%}%0, %2}")

You should use (=v,vm) or (=v,vm,Yk) constraints. No need for separate
alternatives when they are handled in the same way.

Also, insn patterns with mask operands can use mode iterators. Use
avx512fmaskmode mode attribute for mask operand.

OTOH, the above patterns should probably use define_subst
infrastructure. Please see config/i386/subst.md for available
substitutions.

Uros.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][x86_64] Enable AVX512 VPOPCNTD/VPOPCNTQ instructions
  2016-12-27 13:40 [PATCH][x86_64] Enable AVX512 VPOPCNTD/VPOPCNTQ instructions Uros Bizjak
@ 2016-12-27 13:43 ` Andrew Senkevich
  2016-12-27 13:50   ` Uros Bizjak
  0 siblings, 1 reply; 12+ messages in thread
From: Andrew Senkevich @ 2016-12-27 13:43 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: gcc-patches

2016-12-27 16:35 GMT+03:00 Uros Bizjak <ubizjak@gmail.com>:
> Hello!
>
>> this patch enables AVX512 VPOPCNTD/VPOPCNTQ instructions recently
>> added in Instruction Set Extensions
>> (https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf).
>
> @@ -265,6 +268,9 @@
>  (define_mode_iterator VF_512
>    [V16SF V8DF])
>
> +(define_mode_iterator VI_AVX512F
> +  [V16SI V8DI])
>
> Please name this iterator VI_512.

But there are already VI_512 :)

;; All 512bit vector integer modes
(define_mode_iterator VI_512
  [(V64QI "TARGET_AVX512BW")
   (V32HI "TARGET_AVX512BW")
   V16SI V8DI])


--
WBR,
Andrew

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][x86_64] Enable AVX512 VPOPCNTD/VPOPCNTQ instructions
  2016-12-27 13:43 ` Andrew Senkevich
@ 2016-12-27 13:50   ` Uros Bizjak
  2016-12-27 14:13     ` Uros Bizjak
  0 siblings, 1 reply; 12+ messages in thread
From: Uros Bizjak @ 2016-12-27 13:50 UTC (permalink / raw)
  To: Andrew Senkevich; +Cc: gcc-patches

On Tue, Dec 27, 2016 at 2:40 PM, Andrew Senkevich
<andrew.n.senkevich@gmail.com> wrote:
> 2016-12-27 16:35 GMT+03:00 Uros Bizjak <ubizjak@gmail.com>:
>> Hello!
>>
>>> this patch enables AVX512 VPOPCNTD/VPOPCNTQ instructions recently
>>> added in Instruction Set Extensions
>>> (https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf).
>>
>> @@ -265,6 +268,9 @@
>>  (define_mode_iterator VF_512
>>    [V16SF V8DF])
>>
>> +(define_mode_iterator VI_AVX512F
>> +  [V16SI V8DI])
>>
>> Please name this iterator VI_512.
>
> But there are already VI_512 :)
>
> ;; All 512bit vector integer modes
> (define_mode_iterator VI_512
>   [(V64QI "TARGET_AVX512BW")
>    (V32HI "TARGET_AVX512BW")
>    V16SI V8DI])

Eh, this one is duplicate of VI_AVX512BW and should be removed.

Uros.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][x86_64] Enable AVX512 VPOPCNTD/VPOPCNTQ instructions
  2016-12-27 13:50   ` Uros Bizjak
@ 2016-12-27 14:13     ` Uros Bizjak
  2017-01-10 10:05       ` Kirill Yukhin
  0 siblings, 1 reply; 12+ messages in thread
From: Uros Bizjak @ 2016-12-27 14:13 UTC (permalink / raw)
  To: Andrew Senkevich; +Cc: gcc-patches

On Tue, Dec 27, 2016 at 2:47 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Tue, Dec 27, 2016 at 2:40 PM, Andrew Senkevich
> <andrew.n.senkevich@gmail.com> wrote:
>> 2016-12-27 16:35 GMT+03:00 Uros Bizjak <ubizjak@gmail.com>:
>>> Hello!
>>>
>>>> this patch enables AVX512 VPOPCNTD/VPOPCNTQ instructions recently
>>>> added in Instruction Set Extensions
>>>> (https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf).
>>>
>>> @@ -265,6 +268,9 @@
>>>  (define_mode_iterator VF_512
>>>    [V16SF V8DF])
>>>
>>> +(define_mode_iterator VI_AVX512F
>>> +  [V16SI V8DI])
>>>
>>> Please name this iterator VI_512.
>>
>> But there are already VI_512 :)
>>
>> ;; All 512bit vector integer modes
>> (define_mode_iterator VI_512
>>   [(V64QI "TARGET_AVX512BW")
>>    (V32HI "TARGET_AVX512BW")
>>    V16SI V8DI])
>
> Eh, this one is duplicate of VI_AVX512BW and should be removed.

Actually, you should use existing VI48_512 mode iterator.

Uros.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][x86_64] Enable AVX512 VPOPCNTD/VPOPCNTQ instructions
  2016-12-27 14:13     ` Uros Bizjak
@ 2017-01-10 10:05       ` Kirill Yukhin
  2017-01-10 10:22         ` Andrew Senkevich
  0 siblings, 1 reply; 12+ messages in thread
From: Kirill Yukhin @ 2017-01-10 10:05 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: Andrew Senkevich, gcc-patches

Hi,
In addition to Uroš's inputs:
> diff --git a/gcc/config/i386/avx512vpopcntdqintrin.h
> b/gcc/config/i386/avx512vpopcntdqintrin.h
> new file mode 100644
> index 0000000..28305f6
> --- /dev/null
> +++ b/gcc/config/i386/avx512vpopcntdqintrin.h
> @@ -0,0 +1,90 @@
> +/* Copyright (C) 2016 Free Software Foundation, Inc.
Pls, fix year.

Pattern should perfectly fit into subst infra.

On 27 Dec 14:59, Uros Bizjak wrote:
> On Tue, Dec 27, 2016 at 2:47 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
> > On Tue, Dec 27, 2016 at 2:40 PM, Andrew Senkevich
> > <andrew.n.senkevich@gmail.com> wrote:
> >> 2016-12-27 16:35 GMT+03:00 Uros Bizjak <ubizjak@gmail.com>:
> >>> Hello!
> >>>
> >>>> this patch enables AVX512 VPOPCNTD/VPOPCNTQ instructions recently
> >>>> added in Instruction Set Extensions
> >>>> (https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf).
> >>>
> >>> @@ -265,6 +268,9 @@
> >>>  (define_mode_iterator VF_512
> >>>    [V16SF V8DF])
> >>>
> >>> +(define_mode_iterator VI_AVX512F
> >>> +  [V16SI V8DI])
> >>>
> >>> Please name this iterator VI_512.
> >>
> >> But there are already VI_512 :)
> >>
> >> ;; All 512bit vector integer modes
> >> (define_mode_iterator VI_512
> >>   [(V64QI "TARGET_AVX512BW")
> >>    (V32HI "TARGET_AVX512BW")
> >>    V16SI V8DI])
> >
> > Eh, this one is duplicate of VI_AVX512BW and should be removed.
>
> Actually, you should use existing VI48_512 mode iterator.
>
> Uros.

--
Thanks, K

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][x86_64] Enable AVX512 VPOPCNTD/VPOPCNTQ instructions
  2017-01-10 10:05       ` Kirill Yukhin
@ 2017-01-10 10:22         ` Andrew Senkevich
  2017-01-10 10:31           ` Uros Bizjak
  0 siblings, 1 reply; 12+ messages in thread
From: Andrew Senkevich @ 2017-01-10 10:22 UTC (permalink / raw)
  To: Kirill Yukhin; +Cc: Uros Bizjak, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 549 bytes --]

2017-01-10 13:04 GMT+03:00 Kirill Yukhin <kirill.yukhin@gmail.com>:
> Hi,
> In addition to Uroš's inputs:
>> diff --git a/gcc/config/i386/avx512vpopcntdqintrin.h
>> b/gcc/config/i386/avx512vpopcntdqintrin.h
>> new file mode 100644
>> index 0000000..28305f6
>> --- /dev/null
>> +++ b/gcc/config/i386/avx512vpopcntdqintrin.h
>> @@ -0,0 +1,90 @@
>> +/* Copyright (C) 2016 Free Software Foundation, Inc.
> Pls, fix year.
>
> Pattern should perfectly fit into subst infra.

Indeed, patch attached.
Changelogs will be fixed accordingly.

[-- Attachment #2: avx512vpopcntdq_v2.patch --]
[-- Type: application/octet-stream, Size: 30001 bytes --]

diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index d1f82fd..4152ef8 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -78,6 +78,7 @@ along with GCC; see the file COPYING3.  If not see
   (OPTION_MASK_ISA_AVX512VBMI | OPTION_MASK_ISA_AVX512BW_SET)
 #define OPTION_MASK_ISA_AVX5124FMAPS_SET OPTION_MASK_ISA_AVX5124FMAPS
 #define OPTION_MASK_ISA_AVX5124VNNIW_SET OPTION_MASK_ISA_AVX5124VNNIW
+#define OPTION_MASK_ISA_AVX512VPOPCNTDQ_SET OPTION_MASK_ISA_AVX512VPOPCNTDQ
 #define OPTION_MASK_ISA_RTM_SET OPTION_MASK_ISA_RTM
 #define OPTION_MASK_ISA_PRFCHW_SET OPTION_MASK_ISA_PRFCHW
 #define OPTION_MASK_ISA_RDSEED_SET OPTION_MASK_ISA_RDSEED
@@ -183,6 +184,7 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA_AVX512VBMI_UNSET OPTION_MASK_ISA_AVX512VBMI
 #define OPTION_MASK_ISA_AVX5124FMAPS_UNSET OPTION_MASK_ISA_AVX5124FMAPS
 #define OPTION_MASK_ISA_AVX5124VNNIW_UNSET OPTION_MASK_ISA_AVX5124VNNIW
+#define OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET OPTION_MASK_ISA_AVX512VPOPCNTDQ
 #define OPTION_MASK_ISA_RTM_UNSET OPTION_MASK_ISA_RTM
 #define OPTION_MASK_ISA_PRFCHW_UNSET OPTION_MASK_ISA_PRFCHW
 #define OPTION_MASK_ISA_RDSEED_UNSET OPTION_MASK_ISA_RDSEED
@@ -409,6 +411,8 @@ ix86_handle_option (struct gcc_options *opts,
 	  opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_AVX5124FMAPS_UNSET;
 	  opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA_AVX5124VNNIW_UNSET;
 	  opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_AVX5124VNNIW_UNSET;
+	  opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET;
+	  opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET;
 	}
       return true;
 
@@ -481,6 +485,21 @@ ix86_handle_option (struct gcc_options *opts,
 	}
       return true;
 
+    case OPT_mavx512vpopcntdq:
+      if (value)
+	{
+	  opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA_AVX512VPOPCNTDQ_SET;
+	  opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_AVX512VPOPCNTDQ_SET;
+	  opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AVX512F_SET;
+	  opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512F_SET;
+	}
+      else
+	{
+	  opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET;
+	  opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET;
+	}
+      return true;
+
     case OPT_mavx512dq:
       if (value)
 	{
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 7c27546..bb25d54 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -375,7 +375,8 @@ i[34567]86-*-*)
 		       avx512vlintrin.h avx512vlbwintrin.h avx512vldqintrin.h
 		       avx512ifmaintrin.h avx512ifmavlintrin.h avx512vbmiintrin.h
 		       avx512vbmivlintrin.h avx5124fmapsintrin.h avx5124vnniwintrin.h
-		       clwbintrin.h mwaitxintrin.h clzerointrin.h pkuintrin.h"
+		       avx512vpopcntdqintrin.h clwbintrin.h mwaitxintrin.h
+		       clzerointrin.h pkuintrin.h"
 	;;
 x86_64-*-*)
 	cpu_type=i386
@@ -397,7 +398,8 @@ x86_64-*-*)
 		       avx512vlintrin.h avx512vlbwintrin.h avx512vldqintrin.h
 		       avx512ifmaintrin.h avx512ifmavlintrin.h avx512vbmiintrin.h
 		       avx512vbmivlintrin.h avx5124fmapsintrin.h avx5124vnniwintrin.h
-		       clwbintrin.h mwaitxintrin.h clzerointrin.h pkuintrin.h"
+		       avx512vpopcntdqintrin.h clwbintrin.h mwaitxintrin.h
+		       clzerointrin.h pkuintrin.h"
 	;;
 ia64-*-*)
 	extra_headers=ia64intrin.h
diff --git a/gcc/config/i386/avx512vpopcntdqintrin.h b/gcc/config/i386/avx512vpopcntdqintrin.h
new file mode 100644
index 0000000..9b0bc1b
--- /dev/null
+++ b/gcc/config/i386/avx512vpopcntdqintrin.h
@@ -0,0 +1,94 @@
+/* Copyright (C) 2017 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#if !defined _IMMINTRIN_H_INCLUDED
+# error "Never use <avx512vpopcntdqintrin.h> directly; include <x86intrin.h> instead."
+#endif
+
+#ifndef _AVX512VPOPCNTDQINTRIN_H_INCLUDED
+#define _AVX512VPOPCNTDQINTRIN_H_INCLUDED
+
+#ifndef __AVX512VPOPCNTDQ__
+#pragma GCC push_options
+#pragma GCC target("avx512vpopcntdq")
+#define __DISABLE_AVX512VPOPCNTDQ__
+#endif /* __AVX512VPOPCNTDQ__ */
+
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_popcnt_epi32 (__m512i __A)
+{
+  return (__m512i) __builtin_ia32_vpopcountd_v16si ((__v16si) __A);
+}
+
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_popcnt_epi32 (__m512i __A, __mmask16 __U, __m512i __B)
+{
+  return (__m512i) __builtin_ia32_vpopcountd_v16si_mask ((__v16si) __A,
+							 (__v16si) __B,
+							 (__mmask16) __U);
+}
+
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_maskz_popcnt_epi32 (__mmask16 __U, __m512i __A)
+{
+  return (__m512i) __builtin_ia32_vpopcountd_v16si_mask ((__v16si) __A,
+							 (__v16si)
+							 _mm512_setzero_si512 (),
+							 (__mmask16) __U);
+}
+
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_popcnt_epi64 (__m512i __A)
+{
+  return (__m512i) __builtin_ia32_vpopcountq_v8di ((__v8di) __A);
+}
+
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_popcnt_epi64 (__m512i __A, __mmask8 __U, __m512i __B)
+{
+  return (__m512i) __builtin_ia32_vpopcountq_v8di_mask ((__v8di) __A,
+							(__v8di) __B,
+							(__mmask8) __U);
+}
+
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_maskz_popcnt_epi64 (__mmask8 __U, __m512i __A)
+{
+  return (__m512i) __builtin_ia32_vpopcountq_v8di_mask ((__v8di) __A,
+							(__v8di)
+							_mm512_setzero_si512 (),
+							(__mmask8) __U);
+}
+
+#ifdef __DISABLE_AVX512VPOPCNTDQ__
+#undef __DISABLE_AVX512VPOPCNTDQ__
+#pragma GCC pop_options
+#endif /* __DISABLE_AVX512VPOPCNTDQ__ */
+
+#endif /* _AVX512VPOPCNTDQINTRIN_H_INCLUDED */
diff --git a/gcc/config/i386/cpuid.h b/gcc/config/i386/cpuid.h
index fdd7e15..4bdc19e 100644
--- a/gcc/config/i386/cpuid.h
+++ b/gcc/config/i386/cpuid.h
@@ -54,6 +54,7 @@
 #define bit_SSE4a	(1 << 6)
 #define bit_PRFCHW	(1 << 8)
 #define bit_XOP         (1 << 11)
+#define bit_AVX512VPOPCNTDQ	(1 << 14)
 #define bit_LWP 	(1 << 15)
 #define bit_FMA4        (1 << 16)
 #define bit_TBM         (1 << 21)
diff --git a/gcc/config/i386/i386-builtin-types.def b/gcc/config/i386/i386-builtin-types.def
index 6e938eb..18b3d4c 100644
--- a/gcc/config/i386/i386-builtin-types.def
+++ b/gcc/config/i386/i386-builtin-types.def
@@ -305,9 +305,11 @@ DEF_FUNCTION_TYPE (V8DF, V2DF)
 DEF_FUNCTION_TYPE (V16SI, V4SI)
 DEF_FUNCTION_TYPE (V16SI, V8SI)
 DEF_FUNCTION_TYPE (V16SI, V16SF)
+DEF_FUNCTION_TYPE (V16SI, V16SI)
 DEF_FUNCTION_TYPE (V16SI, V16SI, V16SI, UHI)
 DEF_FUNCTION_TYPE (V8DI, V8DI, V8DI, UQI)
 DEF_FUNCTION_TYPE (V8DI, PV8DI)
+DEF_FUNCTION_TYPE (V8DI, V8DI)
 
 DEF_FUNCTION_TYPE (DI, V2DI, INT)
 DEF_FUNCTION_TYPE (DOUBLE, V2DF, INT)
diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
index 48063d1..c351335 100644
--- a/gcc/config/i386/i386-builtin.def
+++ b/gcc/config/i386/i386-builtin.def
@@ -2527,6 +2527,10 @@ BDESC (OPTION_MASK_ISA_AVX5124VNNIW, CODE_FOR_avx5124vnniw_vp4dpwssd, "__builtin
 BDESC (OPTION_MASK_ISA_AVX5124VNNIW, CODE_FOR_avx5124vnniw_vp4dpwssd_mask, "__builtin_ia32_vp4dpwssd_mask", IX86_BUILTIN_4DPWSSD_MASK, UNKNOWN, (int) V16SI_FTYPE_V16SI_V16SI_V16SI_V16SI_V16SI_PCV4SI_V16SI_UHI)
 BDESC (OPTION_MASK_ISA_AVX5124VNNIW, CODE_FOR_avx5124vnniw_vp4dpwssds, "__builtin_ia32_vp4dpwssds", IX86_BUILTIN_4DPWSSDS, UNKNOWN, (int) V16SI_FTYPE_V16SI_V16SI_V16SI_V16SI_V16SI_PCV4SI)
 BDESC (OPTION_MASK_ISA_AVX5124VNNIW, CODE_FOR_avx5124vnniw_vp4dpwssds_mask, "__builtin_ia32_vp4dpwssds_mask", IX86_BUILTIN_4DPWSSDS_MASK, UNKNOWN, (int) V16SI_FTYPE_V16SI_V16SI_V16SI_V16SI_V16SI_PCV4SI_V16SI_UHI)
+BDESC (OPTION_MASK_ISA_AVX512VPOPCNTDQ, CODE_FOR_vpopcountv16si, "__builtin_ia32_vpopcountd_v16si", IX86_BUILTIN_VPOPCOUNTDV16SI, UNKNOWN, (int) V16SI_FTYPE_V16SI)
+BDESC (OPTION_MASK_ISA_AVX512VPOPCNTDQ, CODE_FOR_vpopcountv16si_mask, "__builtin_ia32_vpopcountd_v16si_mask", IX86_BUILTIN_VPOPCOUNTDV16SI_MASK, UNKNOWN, (int) V16SI_FTYPE_V16SI_V16SI_UHI)
+BDESC (OPTION_MASK_ISA_AVX512VPOPCNTDQ, CODE_FOR_vpopcountv8di, "__builtin_ia32_vpopcountq_v8di", IX86_BUILTIN_VPOPCOUNTQV8DI, UNKNOWN, (int) V8DI_FTYPE_V8DI)
+BDESC (OPTION_MASK_ISA_AVX512VPOPCNTDQ, CODE_FOR_vpopcountv8di_mask, "__builtin_ia32_vpopcountq_v8di_mask", IX86_BUILTIN_VPOPCOUNTQV8DI_MASK, UNKNOWN, (int) V8DI_FTYPE_V8DI_V8DI_UQI)
 
 BDESC_END (ARGS2, MPX)
 
diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c
index f633a2e..855ff79 100644
--- a/gcc/config/i386/i386-c.c
+++ b/gcc/config/i386/i386-c.c
@@ -380,6 +380,8 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
     def_or_undef (parse_in, "__AVX5124VNNIW__");
   if (isa_flag2 & OPTION_MASK_ISA_AVX5124FMAPS)
     def_or_undef (parse_in, "__AVX5124FMAPS__");
+  if (isa_flag2 & OPTION_MASK_ISA_AVX512VPOPCNTDQ)
+    def_or_undef (parse_in, "__AVX512VPOPCNTDQ__");
   if (isa_flag & OPTION_MASK_ISA_FMA)
     def_or_undef (parse_in, "__FMA__");
   if (isa_flag & OPTION_MASK_ISA_RTM)
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index b173b89..e03dadd 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -4320,6 +4320,7 @@ ix86_target_string (HOST_WIDE_INT isa, HOST_WIDE_INT isa2, int flags,
   {
     { "-mavx5124vnniw", OPTION_MASK_ISA_AVX5124VNNIW },
     { "-mavx5124fmaps", OPTION_MASK_ISA_AVX5124FMAPS },
+    { "-mavx512vpopcntdq", OPTION_MASK_ISA_AVX512VPOPCNTDQ },
   };
   /* Flag options.  */
   static struct ix86_target_opts flag_opts[] =
@@ -4919,6 +4920,7 @@ ix86_option_override_internal (bool main_args_p,
 #define PTA_PKU		(HOST_WIDE_INT_1 << 59)
 #define PTA_AVX5124VNNIW	(HOST_WIDE_INT_1 << 60)
 #define PTA_AVX5124FMAPS	(HOST_WIDE_INT_1 << 61)
+#define PTA_AVX512VPOPCNTDQ	(HOST_WIDE_INT_1 << 62)
 
 #define PTA_CORE2 \
   (PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSSE3 \
@@ -5581,6 +5583,9 @@ ix86_option_override_internal (bool main_args_p,
 	if (processor_alias_table[i].flags & PTA_AVX5124FMAPS
 	    && !(opts->x_ix86_isa_flags2_explicit & OPTION_MASK_ISA_AVX5124FMAPS))
 	  opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA_AVX5124FMAPS;
+	if (processor_alias_table[i].flags & PTA_AVX512VPOPCNTDQ
+	    && !(opts->x_ix86_isa_flags2_explicit & OPTION_MASK_ISA_AVX512VPOPCNTDQ))
+	  opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA_AVX512VPOPCNTDQ;
 
 	if (processor_alias_table[i].flags & (PTA_PREFETCH_SSE | PTA_SSE))
 	  x86_prefetch_sse = true;
@@ -6625,6 +6630,7 @@ ix86_valid_target_attribute_inner_p (tree args, char *p_strings[],
     IX86_ATTR_ISA ("avx512vl",	OPT_mavx512vl),
     IX86_ATTR_ISA ("avx5124fmaps",	OPT_mavx5124fmaps),
     IX86_ATTR_ISA ("avx5124vnniw",	OPT_mavx5124vnniw),
+    IX86_ATTR_ISA ("avx512vpopcntdq",	OPT_mavx512vpopcntdq),
     IX86_ATTR_ISA ("mmx",	OPT_mmmx),
     IX86_ATTR_ISA ("pclmul",	OPT_mpclmul),
     IX86_ATTR_ISA ("popcnt",	OPT_mpopcnt),
@@ -33300,6 +33306,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
     F_AVX512IFMA,
     F_AVX5124VNNIW,
     F_AVX5124FMAPS,
+    F_AVX512VPOPCNTDQ,
     F_MAX
   };
 
@@ -33414,6 +33421,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
       {"avx512ifma",F_AVX512IFMA},
       {"avx5124vnniw",F_AVX5124VNNIW},
       {"avx5124fmaps",F_AVX5124FMAPS},
+      {"avx512vpopcntdq",F_AVX512VPOPCNTDQ},
     };
 
   tree __processor_model_type = build_processor_model_struct ();
@@ -34891,8 +34899,10 @@ ix86_expand_args_builtin (const struct builtin_description *d,
     case V16SF_FTYPE_V4SF:
     case V16SI_FTYPE_V4SI:
     case V16SI_FTYPE_V16SF:
+    case V16SI_FTYPE_V16SI:
     case V16SF_FTYPE_V16SF:
     case V8DI_FTYPE_UQI:
+    case V8DI_FTYPE_V8DI:
     case V8DF_FTYPE_V4DF:
     case V8DF_FTYPE_V2DF:
     case V8DF_FTYPE_V8DF:
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index e6f9a75..a7d5f96 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -85,6 +85,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define TARGET_AVX5124FMAPS_P(x) TARGET_ISA_AVX5124FMAPS_P(x)
 #define TARGET_AVX5124VNNIW	TARGET_ISA_AVX5124VNNIW
 #define TARGET_AVX5124VNNIW_P(x) TARGET_ISA_AVX5124VNNIW_P(x)
+#define TARGET_AVX512VPOPCNTDQ	TARGET_ISA_AVX512VPOPCNTDQ
+#define TARGET_AVX512VPOPCNTDQ_P(x) TARGET_ISA_AVX512VPOPCNTDQ_P(x)
 #define TARGET_FMA	TARGET_ISA_FMA
 #define TARGET_FMA_P(x)	TARGET_ISA_FMA_P(x)
 #define TARGET_SSE4A	TARGET_ISA_SSE4A
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index 530f46d..11948a8 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -705,6 +705,10 @@ mavx5124vnniw
 Target Report Mask(ISA_AVX5124VNNIW) Var(ix86_isa_flags2) Save
 Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX512F and AVX5124VNNIW built-in functions and code generation.
 
+mavx512vpopcntdq
+Target Report Mask(ISA_AVX512VPOPCNTDQ) Var(ix86_isa_flags2) Save
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX512F and AVX512VPOPCNTDQ built-in functions and code generation.
+
 mfma
 Target Report Mask(ISA_FMA) Var(ix86_isa_flags) Save
 Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX and FMA built-in functions and code generation.
diff --git a/gcc/config/i386/immintrin.h b/gcc/config/i386/immintrin.h
index 2436496..80dfefe 100644
--- a/gcc/config/i386/immintrin.h
+++ b/gcc/config/i386/immintrin.h
@@ -72,6 +72,8 @@
 
 #include <avx5124vnniwintrin.h>
 
+#include <avx512vpopcntdqintrin.h>
+
 #include <shaintrin.h>
 
 #include <lzcntintrin.h>
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 32b4901..f754994 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -19875,3 +19875,10 @@
    [(set_attr ("type") ("ssemuladd"))
     (set_attr ("prefix") ("evex"))
     (set_attr ("mode") ("TI"))])
+
+(define_insn "vpopcount<mode><mask_name>"
+  [(set (match_operand:VI48_512 0 "register_operand" "=v")
+	(popcount:VI48_512
+          (match_operand:VI48_512 1 "nonimmediate_operand" "vm")))]
+  "TARGET_AVX512VPOPCNTDQ"
+  "vpopcnt<ssemodesuffix>\t{%1, %0<mask_operand2>|%0<mask_operand2>, %1}")
diff --git a/gcc/testsuite/g++.dg/other/i386-2.C b/gcc/testsuite/g++.dg/other/i386-2.C
index 701051d..ad9fb7c 100644
--- a/gcc/testsuite/g++.dg/other/i386-2.C
+++ b/gcc/testsuite/g++.dg/other/i386-2.C
@@ -1,11 +1,11 @@
 /* { dg-do compile { target i?86-*-* x86_64-*-* } } */
-/* { dg-options "-O -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt  -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx5124fmaps -mavx5124vnniw -mclwb -mmwaitx -mclzero -mpku" } */
+/* { dg-options "-O -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt  -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku" } */
 
 /* Test that {,x,e,p,t,s,w,a,b,i}mmintrin.h, mm3dnow.h, fma4intrin.h,
    xopintrin.h, abmintrin.h, bmiintrin.h, tbmintrin.h, lwpintrin.h,
    popcntintrin.h, fmaintrin.h, pkuintrin.h, avx5124fmapsintrin.h,
-   avx5124vnniwintrin.h and mm_malloc.h.h are usable with
-   -O -pedantic-errors.  */
+   avx5124vnniwintrin.h, avx512vpopcntdqintrin.h and mm_malloc.h.h are usable
+   with -O -pedantic-errors.  */
 
 #include <x86intrin.h>
 
diff --git a/gcc/testsuite/g++.dg/other/i386-3.C b/gcc/testsuite/g++.dg/other/i386-3.C
index cd8f217..084a1bb 100644
--- a/gcc/testsuite/g++.dg/other/i386-3.C
+++ b/gcc/testsuite/g++.dg/other/i386-3.C
@@ -1,10 +1,10 @@
 /* { dg-do compile { target i?86-*-* x86_64-*-* } } */
-/* { dg-options "-O -fkeep-inline-functions -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx5124fmaps -mavx5124vnniw -mclwb -mmwaitx -mclzero -mpku" } */
+/* { dg-options "-O -fkeep-inline-functions -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku" } */
 
 /* Test that {,x,e,p,t,s,w,a,b,i}mmintrin.h, mm3dnow.h, fma4intrin.h,
    xopintrin.h, abmintrin.h, bmiintrin.h, tbmintrin.h, lwpintrin.h,
    popcntintrin.h, fmaintrin.h, pkuintrin.h, avx5124fmapsintrin.h,
-   avx5124vnniwintrin.h and mm_malloc.h are usable with
-   -O -fkeep-inline-functions.  */
+   avx5124vnniwintrin.h, avx512vpopcntdqintrin.h and mm_malloc.h are
+   usable with -O -fkeep-inline-functions.  */
 
 #include <x86intrin.h>
diff --git a/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntd.c b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntd.c
new file mode 100644
index 0000000..c55a05a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntd.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512vpopcntdq" } */
+/* { dg-final { scan-assembler-times "vpopcntd\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+(?:\n|\[ \\t\]+#)"  1 } } */
+/* { dg-final { scan-assembler-times "vpopcntd\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)"  1 } } */
+/* { dg-final { scan-assembler-times "vpopcntd\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)"  1 } } */
+
+#include <x86intrin.h>
+
+extern __m512i z, z1;
+
+int foo ()
+{
+  __mmask16 msk;
+  __m512i c = _mm512_popcnt_epi32 (z);
+  asm volatile ("" : "+v" (c));
+  c = _mm512_mask_popcnt_epi32 (z, msk, z1);
+  asm volatile ("" : "+v" (c));
+  c = _mm512_maskz_popcnt_epi32 (msk, z);
+  asm volatile ("" : "+v" (c));
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntq.c b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntq.c
new file mode 100644
index 0000000..2698ec3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntq.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512vpopcntdq" } */
+/* { dg-final { scan-assembler-times "vpopcntq\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+(?:\n|\[ \\t\]+#)"  1 } } */
+/* { dg-final { scan-assembler-times "vpopcntq\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)"  1 } } */
+/* { dg-final { scan-assembler-times "vpopcntq\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)"  1 } } */
+
+#include <x86intrin.h>
+
+extern __m512i z, z1;
+
+int foo ()
+{
+  __mmask8 msk; 
+  __m512i c = _mm512_popcnt_epi64 (z);
+  asm volatile ("" : "+v" (c));
+  c = _mm512_mask_popcnt_epi64 (z, msk, z1);
+  asm volatile ("" : "+v" (c));
+  c = _mm512_maskz_popcnt_epi64 (msk, z);  
+  asm volatile ("" : "+v" (c));
+}
diff --git a/gcc/testsuite/gcc.target/i386/builtin_target.c b/gcc/testsuite/gcc.target/i386/builtin_target.c
index c620a74..e50695c 100644
--- a/gcc/testsuite/gcc.target/i386/builtin_target.c
+++ b/gcc/testsuite/gcc.target/i386/builtin_target.c
@@ -217,6 +217,8 @@ check_features (unsigned int ecx, unsigned int edx,
 	assert (__builtin_cpu_supports ("avx5124vnniw"));
       if (edx & bit_AVX5124FMAPS)
 	assert (__builtin_cpu_supports ("avx5124fmaps"));
+      if (ecx & bit_AVX512VPOPCNTDQ)
+	assert (__builtin_cpu_supports ("avx512vpopcntdq"));
     }
 }
 
@@ -319,6 +321,8 @@ quick_check ()
 
   assert (__builtin_cpu_supports ("avx5124fmaps") >= 0);
 
+  assert (__builtin_cpu_supports ("avx512vpopcntdq") >= 0);
+
   /* Check CPU type.  */
   assert (__builtin_cpu_is ("amd") >= 0);
 
diff --git a/gcc/testsuite/gcc.target/i386/funcspec-56.inc b/gcc/testsuite/gcc.target/i386/funcspec-56.inc
index 9334e9e..c999080 100644
--- a/gcc/testsuite/gcc.target/i386/funcspec-56.inc
+++ b/gcc/testsuite/gcc.target/i386/funcspec-56.inc
@@ -30,6 +30,7 @@ extern void test_avx512pf(void)			__attribute__((__target__("avx512pf")));
 extern void test_avx512cd(void)			__attribute__((__target__("avx512cd")));
 extern void test_avx5124fmaps(void)             __attribute__((__target__("avx5124fmaps")));
 extern void test_avx5124vnniw(void)             __attribute__((__target__("avx5124vnniw")));
+extern void test_avx512vpopcntdq(void)		__attribute__((__target__("avx512vpopcntdq")));
 extern void test_bmi (void)			__attribute__((__target__("bmi")));
 extern void test_bmi2 (void)			__attribute__((__target__("bmi2")));
 
@@ -63,6 +64,7 @@ extern void test_bo_avx512pf(void)		__attribute__((__target__("no-avx512pf")));
 extern void test_no_avx512cd(void)		__attribute__((__target__("no-avx512cd")));
 extern void test_no_avx5124fmaps(void)          __attribute__((__target__("no-avx5124fmaps")));
 extern void test_no_avx5124vnniw(void)          __attribute__((__target__("no-avx5124vnniw")));
+extern void test_no_avx512vpopcntdq(void)	__attribute__((__target__("no-avx512vpopcntdq")));
 extern void test_no_bmi (void)			__attribute__((__target__("no-bmi")));
 extern void test_no_bmi2 (void)			__attribute__((__target__("no-bmi2")));
 
diff --git a/gcc/testsuite/gcc.target/i386/sse-12.c b/gcc/testsuite/gcc.target/i386/sse-12.c
index 3e8417b..19ff785 100644
--- a/gcc/testsuite/gcc.target/i386/sse-12.c
+++ b/gcc/testsuite/gcc.target/i386/sse-12.c
@@ -3,7 +3,7 @@
    popcntintrin.h and mm_malloc.h are usable
    with -O -std=c89 -pedantic-errors.  */
 /* { dg-do compile } */
-/* { dg-options "-O -std=c89 -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512bw -mavx512dq -mavx512vl -mavx512vbmi -mavx512ifma -mavx5124fmaps -mavx5124vnniw -mclwb -mmwaitx -mclzero -mpku" } */
+/* { dg-options "-O -std=c89 -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512bw -mavx512dq -mavx512vl -mavx512vbmi -mavx512ifma -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku" } */
 
 #include <x86intrin.h>
 
diff --git a/gcc/testsuite/gcc.target/i386/sse-13.c b/gcc/testsuite/gcc.target/i386/sse-13.c
index 67f3b93..350e2ed 100644
--- a/gcc/testsuite/gcc.target/i386/sse-13.c
+++ b/gcc/testsuite/gcc.target/i386/sse-13.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512ifma -mavx5124fmaps -mavx5124vnniw -mclwb -mmwaitx -mclzero -mpku" } */
+/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512ifma -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku" } */
 /* { dg-add-options bind_pic_locally } */
 
 #include <mm_malloc.h>
diff --git a/gcc/testsuite/gcc.target/i386/sse-22.c b/gcc/testsuite/gcc.target/i386/sse-22.c
index 44d48fd..85f9119 100644
--- a/gcc/testsuite/gcc.target/i386/sse-22.c
+++ b/gcc/testsuite/gcc.target/i386/sse-22.c
@@ -9,8 +9,8 @@
    are defined as inline functions in {,x,e,p,t,s,w,a,b,i}mmintrin.h,
    mm3dnow.h, fma4intrin.h, xopintrin.h, abmintrin.h, bmiintrin.h,
    tbmintrin.h, lwpintrin.h, popcntintrin.h, fmaintrin.h,
-   avx5124fmapsintrin.h, avx5124vnniwintrin.h and mm_malloc.h 
-   that reference the proper builtin functions.
+   avx5124fmapsintrin.h, avx5124vnniwintrin.h, avx512vpopcntdqintrin.h and
+   mm_malloc.h that reference the proper builtin functions.
 
    Defining away "extern" and "__inline" results in all of them being
    compiled as proper functions.  */
@@ -101,7 +101,7 @@
 
 
 #ifndef DIFFERENT_PRAGMAS
-#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,avx512vl,avx512bw,avx512dq,avx512vbmi,avx512ifma,avx5124fmaps,avx5124vnniw")
+#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,avx512vl,avx512bw,avx512dq,avx512vbmi,avx512ifma,avx5124fmaps,avx5124vnniw,avx512vpopcntdq")
 #endif
 
 /* Following intrinsics require immediate arguments.  They
@@ -218,7 +218,7 @@ test_4 (_mm_cmpestrz, int, __m128i, int, __m128i, int, 1)
 
 /* immintrin.h (AVX/AVX2/RDRND/FSGSBASE/F16C/RTM/AVX512F/SHA) */
 #ifdef DIFFERENT_PRAGMAS
-#pragma GCC target ("avx,avx2,rdrnd,fsgsbase,f16c,rtm,avx512f,avx512er,avx512cd,avx512pf,sha,avx512vl,avx512bw,avx512dq,avx512ifma,avx512vbmi,avx5124fmaps,avx5124vnniw")
+#pragma GCC target ("avx,avx2,rdrnd,fsgsbase,f16c,rtm,avx512f,avx512er,avx512cd,avx512pf,sha,avx512vl,avx512bw,avx512dq,avx512ifma,avx512vbmi,avx5124fmaps,avx5124vnniw,avx512vpopcntdq")
 #endif
 #include <immintrin.h>
 test_1 (_cvtss_sh, unsigned short, float, 1)
diff --git a/gcc/testsuite/gcc.target/i386/sse-23.c b/gcc/testsuite/gcc.target/i386/sse-23.c
index 61f1b00..3fc1f75 100644
--- a/gcc/testsuite/gcc.target/i386/sse-23.c
+++ b/gcc/testsuite/gcc.target/i386/sse-23.c
@@ -8,8 +8,8 @@
    are defined as inline functions in {,x,e,p,t,s,w,a,b,i}mmintrin.h,
    mm3dnow.h, fma4intrin.h, xopintrin.h, abmintrin.h, bmiintrin.h,
    tbmintrin.h, lwpintrin.h, popcntintrin.h, fmaintrin.h,
-   avx5124fmapsintrin.h, avx5124vnniwintrin.h and mm_malloc.h 
-   that reference the proper builtin functions.
+   avx5124fmapsintrin.h, avx5124vnniwintrin.h, avx512vpopcntdqintrin.h
+   and mm_malloc.h that reference the proper builtin functions.
 
    Defining away "extern" and "__inline" results in all of them being
    compiled as proper functions.  */
@@ -595,6 +595,6 @@
 #define __builtin_ia32_extracti64x2_256_mask(A, E, C, D) __builtin_ia32_extracti64x2_256_mask(A, 1, C, D)
 #define __builtin_ia32_extractf64x2_256_mask(A, E, C, D) __builtin_ia32_extractf64x2_256_mask(A, 1, C, D)
 
-#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,fma,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,xsavec,xsaves,clflushopt,avx512bw,avx512dq,avx512vl,avx512vbmi,avx512ifma,avx5124fmaps,avx5124vnniw,clwb,mwaitx,clzero,pku")
+#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,fma,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,xsavec,xsaves,clflushopt,avx512bw,avx512dq,avx512vl,avx512vbmi,avx512ifma,avx5124fmaps,avx5124vnniw,avx512vpopcntdq,clwb,mwaitx,clzero,pku")
 
 #include <x86intrin.h>
diff --git a/libgcc/config/i386/cpuinfo.c b/libgcc/config/i386/cpuinfo.c
index 9e9156b..737d1aa 100644
--- a/libgcc/config/i386/cpuinfo.c
+++ b/libgcc/config/i386/cpuinfo.c
@@ -277,6 +277,8 @@ get_available_features (unsigned int ecx, unsigned int edx,
 	features |= (1 << FEATURE_AVX5124VNNIW);
       if (edx & bit_AVX5124FMAPS)
 	features |= (1 << FEATURE_AVX5124FMAPS);
+      if (ecx & bit_AVX512VPOPCNTDQ)
+	features |= (1 << FEATURE_AVX512VPOPCNTDQ);
     }
 
   unsigned int ext_level;
diff --git a/libgcc/config/i386/cpuinfo.h b/libgcc/config/i386/cpuinfo.h
index ceb09d2..872b45e 100644
--- a/libgcc/config/i386/cpuinfo.h
+++ b/libgcc/config/i386/cpuinfo.h
@@ -104,7 +104,8 @@ enum processor_features
   FEATURE_AVX512VBMI,
   FEATURE_AVX512IFMA,
   FEATURE_AVX5124VNNIW,
-  FEATURE_AVX5124FMAPS
+  FEATURE_AVX5124FMAPS,
+  FEATURE_AVX512VPOPCNTDQ
 };
 
 extern struct __processor_model

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][x86_64] Enable AVX512 VPOPCNTD/VPOPCNTQ instructions
  2017-01-10 10:22         ` Andrew Senkevich
@ 2017-01-10 10:31           ` Uros Bizjak
  2017-01-10 12:01             ` Andrew Senkevich
  0 siblings, 1 reply; 12+ messages in thread
From: Uros Bizjak @ 2017-01-10 10:31 UTC (permalink / raw)
  To: Andrew Senkevich; +Cc: Kirill Yukhin, gcc-patches

On Tue, Jan 10, 2017 at 11:21 AM, Andrew Senkevich
<andrew.n.senkevich@gmail.com> wrote:
> 2017-01-10 13:04 GMT+03:00 Kirill Yukhin <kirill.yukhin@gmail.com>:
>> Hi,
>> In addition to Uroš's inputs:
>>> diff --git a/gcc/config/i386/avx512vpopcntdqintrin.h
>>> b/gcc/config/i386/avx512vpopcntdqintrin.h
>>> new file mode 100644
>>> index 0000000..28305f6
>>> --- /dev/null
>>> +++ b/gcc/config/i386/avx512vpopcntdqintrin.h
>>> @@ -0,0 +1,90 @@
>>> +/* Copyright (C) 2016 Free Software Foundation, Inc.
>> Pls, fix year.
>>
>> Pattern should perfectly fit into subst infra.
>
> Indeed, patch attached.
> Changelogs will be fixed accordingly.

Patch is OK for mainline.

Thanks,
Uros.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][x86_64] Enable AVX512 VPOPCNTD/VPOPCNTQ instructions
  2017-01-10 10:31           ` Uros Bizjak
@ 2017-01-10 12:01             ` Andrew Senkevich
  2017-01-10 12:58               ` Kirill Yukhin
  0 siblings, 1 reply; 12+ messages in thread
From: Andrew Senkevich @ 2017-01-10 12:01 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: Kirill Yukhin, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 878 bytes --]

2017-01-10 13:31 GMT+03:00 Uros Bizjak <ubizjak@gmail.com>:
> On Tue, Jan 10, 2017 at 11:21 AM, Andrew Senkevich
> <andrew.n.senkevich@gmail.com> wrote:
>> 2017-01-10 13:04 GMT+03:00 Kirill Yukhin <kirill.yukhin@gmail.com>:
>>> Hi,
>>> In addition to Uroš's inputs:
>>>> diff --git a/gcc/config/i386/avx512vpopcntdqintrin.h
>>>> b/gcc/config/i386/avx512vpopcntdqintrin.h
>>>> new file mode 100644
>>>> index 0000000..28305f6
>>>> --- /dev/null
>>>> +++ b/gcc/config/i386/avx512vpopcntdqintrin.h
>>>> @@ -0,0 +1,90 @@
>>>> +/* Copyright (C) 2016 Free Software Foundation, Inc.
>>> Pls, fix year.
>>>
>>> Pattern should perfectly fit into subst infra.
>>
>> Indeed, patch attached.
>> Changelogs will be fixed accordingly.
>
> Patch is OK for mainline.

Thanks!

Attached with updated ChangeLogs.
Kirill, could you commit please?


--
WBR,
Andrew

[-- Attachment #2: avx512vpopcntdq_v3.patch --]
[-- Type: application/octet-stream, Size: 32465 bytes --]

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 4878272..ba1dfa9 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,24 @@
+2017-01-10  Andrew Senkevich  <andrew.senkevich@intel.com>
+
+	* common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX512VPOPCNTDQ_SET,
+	OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET): New.
+	* config.gcc: Add avx512vpopcntdqintrin.h.
+	* config/i386/avx512vpopcntdqintrin.h: New.
+	* config/i386/cpuid.h (bit_AVX512VPOPCNTDQ): New.
+	* config/i386/i386-builtin-types.def: Add new types.
+	* config/i386/i386-builtin.def (__builtin_ia32_vpopcountd_v16si,
+	__builtin_ia32_vpopcountd_v16si_mask, __builtin_ia32_vpopcountq_v8di,
+	__builtin_ia32_vpopcountq_v8di_mask): New.
+	* config/i386/i386-c.c (ix86_target_macros_internal): Define
+	__AVX512VPOPCNTDQ__.
+	* config/i386/i386.c (ix86_target_string): Add -mavx512vpopcntdq.
+	(PTA_AVX512VPOPCNTDQ): Define.
+	* config/i386/i386.h (TARGET_AVX512VPOPCNTDQ,
+	TARGET_AVX512VPOPCNTDQ_P): Define.
+	* config/i386/i386.opt: Add mavx512vpopcntdq.
+	* config/i386/immintrin.h: Include avx512vpopcntdqintrin.h.
+	* config/i386/sse.md (define_insn "vpopcount<mode><mask_name>"): New.
+
 2017-01-01  Jan Hubicka  <hubicka@ucw.cz>
 
 	PR middle-end/77484
diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index d1f82fd..4152ef8 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -78,6 +78,7 @@ along with GCC; see the file COPYING3.  If not see
   (OPTION_MASK_ISA_AVX512VBMI | OPTION_MASK_ISA_AVX512BW_SET)
 #define OPTION_MASK_ISA_AVX5124FMAPS_SET OPTION_MASK_ISA_AVX5124FMAPS
 #define OPTION_MASK_ISA_AVX5124VNNIW_SET OPTION_MASK_ISA_AVX5124VNNIW
+#define OPTION_MASK_ISA_AVX512VPOPCNTDQ_SET OPTION_MASK_ISA_AVX512VPOPCNTDQ
 #define OPTION_MASK_ISA_RTM_SET OPTION_MASK_ISA_RTM
 #define OPTION_MASK_ISA_PRFCHW_SET OPTION_MASK_ISA_PRFCHW
 #define OPTION_MASK_ISA_RDSEED_SET OPTION_MASK_ISA_RDSEED
@@ -183,6 +184,7 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA_AVX512VBMI_UNSET OPTION_MASK_ISA_AVX512VBMI
 #define OPTION_MASK_ISA_AVX5124FMAPS_UNSET OPTION_MASK_ISA_AVX5124FMAPS
 #define OPTION_MASK_ISA_AVX5124VNNIW_UNSET OPTION_MASK_ISA_AVX5124VNNIW
+#define OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET OPTION_MASK_ISA_AVX512VPOPCNTDQ
 #define OPTION_MASK_ISA_RTM_UNSET OPTION_MASK_ISA_RTM
 #define OPTION_MASK_ISA_PRFCHW_UNSET OPTION_MASK_ISA_PRFCHW
 #define OPTION_MASK_ISA_RDSEED_UNSET OPTION_MASK_ISA_RDSEED
@@ -409,6 +411,8 @@ ix86_handle_option (struct gcc_options *opts,
 	  opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_AVX5124FMAPS_UNSET;
 	  opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA_AVX5124VNNIW_UNSET;
 	  opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_AVX5124VNNIW_UNSET;
+	  opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET;
+	  opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET;
 	}
       return true;
 
@@ -481,6 +485,21 @@ ix86_handle_option (struct gcc_options *opts,
 	}
       return true;
 
+    case OPT_mavx512vpopcntdq:
+      if (value)
+	{
+	  opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA_AVX512VPOPCNTDQ_SET;
+	  opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_AVX512VPOPCNTDQ_SET;
+	  opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AVX512F_SET;
+	  opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512F_SET;
+	}
+      else
+	{
+	  opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET;
+	  opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET;
+	}
+      return true;
+
     case OPT_mavx512dq:
       if (value)
 	{
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 7c27546..bb25d54 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -375,7 +375,8 @@ i[34567]86-*-*)
 		       avx512vlintrin.h avx512vlbwintrin.h avx512vldqintrin.h
 		       avx512ifmaintrin.h avx512ifmavlintrin.h avx512vbmiintrin.h
 		       avx512vbmivlintrin.h avx5124fmapsintrin.h avx5124vnniwintrin.h
-		       clwbintrin.h mwaitxintrin.h clzerointrin.h pkuintrin.h"
+		       avx512vpopcntdqintrin.h clwbintrin.h mwaitxintrin.h
+		       clzerointrin.h pkuintrin.h"
 	;;
 x86_64-*-*)
 	cpu_type=i386
@@ -397,7 +398,8 @@ x86_64-*-*)
 		       avx512vlintrin.h avx512vlbwintrin.h avx512vldqintrin.h
 		       avx512ifmaintrin.h avx512ifmavlintrin.h avx512vbmiintrin.h
 		       avx512vbmivlintrin.h avx5124fmapsintrin.h avx5124vnniwintrin.h
-		       clwbintrin.h mwaitxintrin.h clzerointrin.h pkuintrin.h"
+		       avx512vpopcntdqintrin.h clwbintrin.h mwaitxintrin.h
+		       clzerointrin.h pkuintrin.h"
 	;;
 ia64-*-*)
 	extra_headers=ia64intrin.h
diff --git a/gcc/config/i386/avx512vpopcntdqintrin.h b/gcc/config/i386/avx512vpopcntdqintrin.h
new file mode 100644
index 0000000..9b0bc1b
--- /dev/null
+++ b/gcc/config/i386/avx512vpopcntdqintrin.h
@@ -0,0 +1,94 @@
+/* Copyright (C) 2017 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#if !defined _IMMINTRIN_H_INCLUDED
+# error "Never use <avx512vpopcntdqintrin.h> directly; include <x86intrin.h> instead."
+#endif
+
+#ifndef _AVX512VPOPCNTDQINTRIN_H_INCLUDED
+#define _AVX512VPOPCNTDQINTRIN_H_INCLUDED
+
+#ifndef __AVX512VPOPCNTDQ__
+#pragma GCC push_options
+#pragma GCC target("avx512vpopcntdq")
+#define __DISABLE_AVX512VPOPCNTDQ__
+#endif /* __AVX512VPOPCNTDQ__ */
+
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_popcnt_epi32 (__m512i __A)
+{
+  return (__m512i) __builtin_ia32_vpopcountd_v16si ((__v16si) __A);
+}
+
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_popcnt_epi32 (__m512i __A, __mmask16 __U, __m512i __B)
+{
+  return (__m512i) __builtin_ia32_vpopcountd_v16si_mask ((__v16si) __A,
+							 (__v16si) __B,
+							 (__mmask16) __U);
+}
+
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_maskz_popcnt_epi32 (__mmask16 __U, __m512i __A)
+{
+  return (__m512i) __builtin_ia32_vpopcountd_v16si_mask ((__v16si) __A,
+							 (__v16si)
+							 _mm512_setzero_si512 (),
+							 (__mmask16) __U);
+}
+
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_popcnt_epi64 (__m512i __A)
+{
+  return (__m512i) __builtin_ia32_vpopcountq_v8di ((__v8di) __A);
+}
+
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_popcnt_epi64 (__m512i __A, __mmask8 __U, __m512i __B)
+{
+  return (__m512i) __builtin_ia32_vpopcountq_v8di_mask ((__v8di) __A,
+							(__v8di) __B,
+							(__mmask8) __U);
+}
+
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_maskz_popcnt_epi64 (__mmask8 __U, __m512i __A)
+{
+  return (__m512i) __builtin_ia32_vpopcountq_v8di_mask ((__v8di) __A,
+							(__v8di)
+							_mm512_setzero_si512 (),
+							(__mmask8) __U);
+}
+
+#ifdef __DISABLE_AVX512VPOPCNTDQ__
+#undef __DISABLE_AVX512VPOPCNTDQ__
+#pragma GCC pop_options
+#endif /* __DISABLE_AVX512VPOPCNTDQ__ */
+
+#endif /* _AVX512VPOPCNTDQINTRIN_H_INCLUDED */
diff --git a/gcc/config/i386/cpuid.h b/gcc/config/i386/cpuid.h
index fdd7e15..4bdc19e 100644
--- a/gcc/config/i386/cpuid.h
+++ b/gcc/config/i386/cpuid.h
@@ -54,6 +54,7 @@
 #define bit_SSE4a	(1 << 6)
 #define bit_PRFCHW	(1 << 8)
 #define bit_XOP         (1 << 11)
+#define bit_AVX512VPOPCNTDQ	(1 << 14)
 #define bit_LWP 	(1 << 15)
 #define bit_FMA4        (1 << 16)
 #define bit_TBM         (1 << 21)
diff --git a/gcc/config/i386/i386-builtin-types.def b/gcc/config/i386/i386-builtin-types.def
index 6e938eb..18b3d4c 100644
--- a/gcc/config/i386/i386-builtin-types.def
+++ b/gcc/config/i386/i386-builtin-types.def
@@ -305,9 +305,11 @@ DEF_FUNCTION_TYPE (V8DF, V2DF)
 DEF_FUNCTION_TYPE (V16SI, V4SI)
 DEF_FUNCTION_TYPE (V16SI, V8SI)
 DEF_FUNCTION_TYPE (V16SI, V16SF)
+DEF_FUNCTION_TYPE (V16SI, V16SI)
 DEF_FUNCTION_TYPE (V16SI, V16SI, V16SI, UHI)
 DEF_FUNCTION_TYPE (V8DI, V8DI, V8DI, UQI)
 DEF_FUNCTION_TYPE (V8DI, PV8DI)
+DEF_FUNCTION_TYPE (V8DI, V8DI)
 
 DEF_FUNCTION_TYPE (DI, V2DI, INT)
 DEF_FUNCTION_TYPE (DOUBLE, V2DF, INT)
diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
index 48063d1..c351335 100644
--- a/gcc/config/i386/i386-builtin.def
+++ b/gcc/config/i386/i386-builtin.def
@@ -2527,6 +2527,10 @@ BDESC (OPTION_MASK_ISA_AVX5124VNNIW, CODE_FOR_avx5124vnniw_vp4dpwssd, "__builtin
 BDESC (OPTION_MASK_ISA_AVX5124VNNIW, CODE_FOR_avx5124vnniw_vp4dpwssd_mask, "__builtin_ia32_vp4dpwssd_mask", IX86_BUILTIN_4DPWSSD_MASK, UNKNOWN, (int) V16SI_FTYPE_V16SI_V16SI_V16SI_V16SI_V16SI_PCV4SI_V16SI_UHI)
 BDESC (OPTION_MASK_ISA_AVX5124VNNIW, CODE_FOR_avx5124vnniw_vp4dpwssds, "__builtin_ia32_vp4dpwssds", IX86_BUILTIN_4DPWSSDS, UNKNOWN, (int) V16SI_FTYPE_V16SI_V16SI_V16SI_V16SI_V16SI_PCV4SI)
 BDESC (OPTION_MASK_ISA_AVX5124VNNIW, CODE_FOR_avx5124vnniw_vp4dpwssds_mask, "__builtin_ia32_vp4dpwssds_mask", IX86_BUILTIN_4DPWSSDS_MASK, UNKNOWN, (int) V16SI_FTYPE_V16SI_V16SI_V16SI_V16SI_V16SI_PCV4SI_V16SI_UHI)
+BDESC (OPTION_MASK_ISA_AVX512VPOPCNTDQ, CODE_FOR_vpopcountv16si, "__builtin_ia32_vpopcountd_v16si", IX86_BUILTIN_VPOPCOUNTDV16SI, UNKNOWN, (int) V16SI_FTYPE_V16SI)
+BDESC (OPTION_MASK_ISA_AVX512VPOPCNTDQ, CODE_FOR_vpopcountv16si_mask, "__builtin_ia32_vpopcountd_v16si_mask", IX86_BUILTIN_VPOPCOUNTDV16SI_MASK, UNKNOWN, (int) V16SI_FTYPE_V16SI_V16SI_UHI)
+BDESC (OPTION_MASK_ISA_AVX512VPOPCNTDQ, CODE_FOR_vpopcountv8di, "__builtin_ia32_vpopcountq_v8di", IX86_BUILTIN_VPOPCOUNTQV8DI, UNKNOWN, (int) V8DI_FTYPE_V8DI)
+BDESC (OPTION_MASK_ISA_AVX512VPOPCNTDQ, CODE_FOR_vpopcountv8di_mask, "__builtin_ia32_vpopcountq_v8di_mask", IX86_BUILTIN_VPOPCOUNTQV8DI_MASK, UNKNOWN, (int) V8DI_FTYPE_V8DI_V8DI_UQI)
 
 BDESC_END (ARGS2, MPX)
 
diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c
index f633a2e..855ff79 100644
--- a/gcc/config/i386/i386-c.c
+++ b/gcc/config/i386/i386-c.c
@@ -380,6 +380,8 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
     def_or_undef (parse_in, "__AVX5124VNNIW__");
   if (isa_flag2 & OPTION_MASK_ISA_AVX5124FMAPS)
     def_or_undef (parse_in, "__AVX5124FMAPS__");
+  if (isa_flag2 & OPTION_MASK_ISA_AVX512VPOPCNTDQ)
+    def_or_undef (parse_in, "__AVX512VPOPCNTDQ__");
   if (isa_flag & OPTION_MASK_ISA_FMA)
     def_or_undef (parse_in, "__FMA__");
   if (isa_flag & OPTION_MASK_ISA_RTM)
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index b173b89..e03dadd 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -4320,6 +4320,7 @@ ix86_target_string (HOST_WIDE_INT isa, HOST_WIDE_INT isa2, int flags,
   {
     { "-mavx5124vnniw", OPTION_MASK_ISA_AVX5124VNNIW },
     { "-mavx5124fmaps", OPTION_MASK_ISA_AVX5124FMAPS },
+    { "-mavx512vpopcntdq", OPTION_MASK_ISA_AVX512VPOPCNTDQ },
   };
   /* Flag options.  */
   static struct ix86_target_opts flag_opts[] =
@@ -4919,6 +4920,7 @@ ix86_option_override_internal (bool main_args_p,
 #define PTA_PKU		(HOST_WIDE_INT_1 << 59)
 #define PTA_AVX5124VNNIW	(HOST_WIDE_INT_1 << 60)
 #define PTA_AVX5124FMAPS	(HOST_WIDE_INT_1 << 61)
+#define PTA_AVX512VPOPCNTDQ	(HOST_WIDE_INT_1 << 62)
 
 #define PTA_CORE2 \
   (PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSSE3 \
@@ -5581,6 +5583,9 @@ ix86_option_override_internal (bool main_args_p,
 	if (processor_alias_table[i].flags & PTA_AVX5124FMAPS
 	    && !(opts->x_ix86_isa_flags2_explicit & OPTION_MASK_ISA_AVX5124FMAPS))
 	  opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA_AVX5124FMAPS;
+	if (processor_alias_table[i].flags & PTA_AVX512VPOPCNTDQ
+	    && !(opts->x_ix86_isa_flags2_explicit & OPTION_MASK_ISA_AVX512VPOPCNTDQ))
+	  opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA_AVX512VPOPCNTDQ;
 
 	if (processor_alias_table[i].flags & (PTA_PREFETCH_SSE | PTA_SSE))
 	  x86_prefetch_sse = true;
@@ -6625,6 +6630,7 @@ ix86_valid_target_attribute_inner_p (tree args, char *p_strings[],
     IX86_ATTR_ISA ("avx512vl",	OPT_mavx512vl),
     IX86_ATTR_ISA ("avx5124fmaps",	OPT_mavx5124fmaps),
     IX86_ATTR_ISA ("avx5124vnniw",	OPT_mavx5124vnniw),
+    IX86_ATTR_ISA ("avx512vpopcntdq",	OPT_mavx512vpopcntdq),
     IX86_ATTR_ISA ("mmx",	OPT_mmmx),
     IX86_ATTR_ISA ("pclmul",	OPT_mpclmul),
     IX86_ATTR_ISA ("popcnt",	OPT_mpopcnt),
@@ -33300,6 +33306,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
     F_AVX512IFMA,
     F_AVX5124VNNIW,
     F_AVX5124FMAPS,
+    F_AVX512VPOPCNTDQ,
     F_MAX
   };
 
@@ -33414,6 +33421,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
       {"avx512ifma",F_AVX512IFMA},
       {"avx5124vnniw",F_AVX5124VNNIW},
       {"avx5124fmaps",F_AVX5124FMAPS},
+      {"avx512vpopcntdq",F_AVX512VPOPCNTDQ},
     };
 
   tree __processor_model_type = build_processor_model_struct ();
@@ -34891,8 +34899,10 @@ ix86_expand_args_builtin (const struct builtin_description *d,
     case V16SF_FTYPE_V4SF:
     case V16SI_FTYPE_V4SI:
     case V16SI_FTYPE_V16SF:
+    case V16SI_FTYPE_V16SI:
     case V16SF_FTYPE_V16SF:
     case V8DI_FTYPE_UQI:
+    case V8DI_FTYPE_V8DI:
     case V8DF_FTYPE_V4DF:
     case V8DF_FTYPE_V2DF:
     case V8DF_FTYPE_V8DF:
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index e6f9a75..a7d5f96 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -85,6 +85,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define TARGET_AVX5124FMAPS_P(x) TARGET_ISA_AVX5124FMAPS_P(x)
 #define TARGET_AVX5124VNNIW	TARGET_ISA_AVX5124VNNIW
 #define TARGET_AVX5124VNNIW_P(x) TARGET_ISA_AVX5124VNNIW_P(x)
+#define TARGET_AVX512VPOPCNTDQ	TARGET_ISA_AVX512VPOPCNTDQ
+#define TARGET_AVX512VPOPCNTDQ_P(x) TARGET_ISA_AVX512VPOPCNTDQ_P(x)
 #define TARGET_FMA	TARGET_ISA_FMA
 #define TARGET_FMA_P(x)	TARGET_ISA_FMA_P(x)
 #define TARGET_SSE4A	TARGET_ISA_SSE4A
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index 530f46d..11948a8 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -705,6 +705,10 @@ mavx5124vnniw
 Target Report Mask(ISA_AVX5124VNNIW) Var(ix86_isa_flags2) Save
 Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX512F and AVX5124VNNIW built-in functions and code generation.
 
+mavx512vpopcntdq
+Target Report Mask(ISA_AVX512VPOPCNTDQ) Var(ix86_isa_flags2) Save
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX512F and AVX512VPOPCNTDQ built-in functions and code generation.
+
 mfma
 Target Report Mask(ISA_FMA) Var(ix86_isa_flags) Save
 Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX and FMA built-in functions and code generation.
diff --git a/gcc/config/i386/immintrin.h b/gcc/config/i386/immintrin.h
index 2436496..80dfefe 100644
--- a/gcc/config/i386/immintrin.h
+++ b/gcc/config/i386/immintrin.h
@@ -72,6 +72,8 @@
 
 #include <avx5124vnniwintrin.h>
 
+#include <avx512vpopcntdqintrin.h>
+
 #include <shaintrin.h>
 
 #include <lzcntintrin.h>
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 32b4901..f754994 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -19875,3 +19875,10 @@
    [(set_attr ("type") ("ssemuladd"))
     (set_attr ("prefix") ("evex"))
     (set_attr ("mode") ("TI"))])
+
+(define_insn "vpopcount<mode><mask_name>"
+  [(set (match_operand:VI48_512 0 "register_operand" "=v")
+	(popcount:VI48_512
+          (match_operand:VI48_512 1 "nonimmediate_operand" "vm")))]
+  "TARGET_AVX512VPOPCNTDQ"
+  "vpopcnt<ssemodesuffix>\t{%1, %0<mask_operand2>|%0<mask_operand2>, %1}")
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 1054e20..c00016d 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,16 @@
+2017-01-10  Andrew Senkevich  <andrew.senkevich@intel.com>
+
+	* g++.dg/other/i386-2.C: Add -mavx512vpopcntdq.
+	* g++.dg/other/i386-3.C: Ditto.
+	* gcc.target/i386/sse-12.c: Ditto.
+	* gcc.target/i386/sse-13.c: Ditto.
+	* gcc.target/i386/sse-22.c: Ditto.
+	* gcc.target/i386/sse-23.c: Ditto.
+	* gcc.target/i386/builtin_target.c: Handle new option.
+	* gcc.target/i386/funcspec-56.inc: Test new attributes.
+	* gcc.target/i386/avx512vpopcntdq-vpopcntd.c: New test.
+	* gcc.target/i386/avx512vpopcntdq-vpopcntq.c: Ditto.
+
 2017-01-09  Martin Sebor  <msebor@redhat.com>
 
 	PR testsuite/79036
diff --git a/gcc/testsuite/g++.dg/other/i386-2.C b/gcc/testsuite/g++.dg/other/i386-2.C
index 701051d..ad9fb7c 100644
--- a/gcc/testsuite/g++.dg/other/i386-2.C
+++ b/gcc/testsuite/g++.dg/other/i386-2.C
@@ -1,11 +1,11 @@
 /* { dg-do compile { target i?86-*-* x86_64-*-* } } */
-/* { dg-options "-O -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt  -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx5124fmaps -mavx5124vnniw -mclwb -mmwaitx -mclzero -mpku" } */
+/* { dg-options "-O -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt  -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku" } */
 
 /* Test that {,x,e,p,t,s,w,a,b,i}mmintrin.h, mm3dnow.h, fma4intrin.h,
    xopintrin.h, abmintrin.h, bmiintrin.h, tbmintrin.h, lwpintrin.h,
    popcntintrin.h, fmaintrin.h, pkuintrin.h, avx5124fmapsintrin.h,
-   avx5124vnniwintrin.h and mm_malloc.h.h are usable with
-   -O -pedantic-errors.  */
+   avx5124vnniwintrin.h, avx512vpopcntdqintrin.h and mm_malloc.h.h are usable
+   with -O -pedantic-errors.  */
 
 #include <x86intrin.h>
 
diff --git a/gcc/testsuite/g++.dg/other/i386-3.C b/gcc/testsuite/g++.dg/other/i386-3.C
index cd8f217..084a1bb 100644
--- a/gcc/testsuite/g++.dg/other/i386-3.C
+++ b/gcc/testsuite/g++.dg/other/i386-3.C
@@ -1,10 +1,10 @@
 /* { dg-do compile { target i?86-*-* x86_64-*-* } } */
-/* { dg-options "-O -fkeep-inline-functions -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx5124fmaps -mavx5124vnniw -mclwb -mmwaitx -mclzero -mpku" } */
+/* { dg-options "-O -fkeep-inline-functions -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku" } */
 
 /* Test that {,x,e,p,t,s,w,a,b,i}mmintrin.h, mm3dnow.h, fma4intrin.h,
    xopintrin.h, abmintrin.h, bmiintrin.h, tbmintrin.h, lwpintrin.h,
    popcntintrin.h, fmaintrin.h, pkuintrin.h, avx5124fmapsintrin.h,
-   avx5124vnniwintrin.h and mm_malloc.h are usable with
-   -O -fkeep-inline-functions.  */
+   avx5124vnniwintrin.h, avx512vpopcntdqintrin.h and mm_malloc.h are
+   usable with -O -fkeep-inline-functions.  */
 
 #include <x86intrin.h>
diff --git a/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntd.c b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntd.c
new file mode 100644
index 0000000..c55a05a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntd.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512vpopcntdq" } */
+/* { dg-final { scan-assembler-times "vpopcntd\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+(?:\n|\[ \\t\]+#)"  1 } } */
+/* { dg-final { scan-assembler-times "vpopcntd\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)"  1 } } */
+/* { dg-final { scan-assembler-times "vpopcntd\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)"  1 } } */
+
+#include <x86intrin.h>
+
+extern __m512i z, z1;
+
+int foo ()
+{
+  __mmask16 msk;
+  __m512i c = _mm512_popcnt_epi32 (z);
+  asm volatile ("" : "+v" (c));
+  c = _mm512_mask_popcnt_epi32 (z, msk, z1);
+  asm volatile ("" : "+v" (c));
+  c = _mm512_maskz_popcnt_epi32 (msk, z);
+  asm volatile ("" : "+v" (c));
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntq.c b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntq.c
new file mode 100644
index 0000000..2698ec3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntq.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512vpopcntdq" } */
+/* { dg-final { scan-assembler-times "vpopcntq\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+(?:\n|\[ \\t\]+#)"  1 } } */
+/* { dg-final { scan-assembler-times "vpopcntq\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)"  1 } } */
+/* { dg-final { scan-assembler-times "vpopcntq\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)"  1 } } */
+
+#include <x86intrin.h>
+
+extern __m512i z, z1;
+
+int foo ()
+{
+  __mmask8 msk; 
+  __m512i c = _mm512_popcnt_epi64 (z);
+  asm volatile ("" : "+v" (c));
+  c = _mm512_mask_popcnt_epi64 (z, msk, z1);
+  asm volatile ("" : "+v" (c));
+  c = _mm512_maskz_popcnt_epi64 (msk, z);  
+  asm volatile ("" : "+v" (c));
+}
diff --git a/gcc/testsuite/gcc.target/i386/builtin_target.c b/gcc/testsuite/gcc.target/i386/builtin_target.c
index c620a74..e50695c 100644
--- a/gcc/testsuite/gcc.target/i386/builtin_target.c
+++ b/gcc/testsuite/gcc.target/i386/builtin_target.c
@@ -217,6 +217,8 @@ check_features (unsigned int ecx, unsigned int edx,
 	assert (__builtin_cpu_supports ("avx5124vnniw"));
       if (edx & bit_AVX5124FMAPS)
 	assert (__builtin_cpu_supports ("avx5124fmaps"));
+      if (ecx & bit_AVX512VPOPCNTDQ)
+	assert (__builtin_cpu_supports ("avx512vpopcntdq"));
     }
 }
 
@@ -319,6 +321,8 @@ quick_check ()
 
   assert (__builtin_cpu_supports ("avx5124fmaps") >= 0);
 
+  assert (__builtin_cpu_supports ("avx512vpopcntdq") >= 0);
+
   /* Check CPU type.  */
   assert (__builtin_cpu_is ("amd") >= 0);
 
diff --git a/gcc/testsuite/gcc.target/i386/funcspec-56.inc b/gcc/testsuite/gcc.target/i386/funcspec-56.inc
index 9334e9e..c999080 100644
--- a/gcc/testsuite/gcc.target/i386/funcspec-56.inc
+++ b/gcc/testsuite/gcc.target/i386/funcspec-56.inc
@@ -30,6 +30,7 @@ extern void test_avx512pf(void)			__attribute__((__target__("avx512pf")));
 extern void test_avx512cd(void)			__attribute__((__target__("avx512cd")));
 extern void test_avx5124fmaps(void)             __attribute__((__target__("avx5124fmaps")));
 extern void test_avx5124vnniw(void)             __attribute__((__target__("avx5124vnniw")));
+extern void test_avx512vpopcntdq(void)		__attribute__((__target__("avx512vpopcntdq")));
 extern void test_bmi (void)			__attribute__((__target__("bmi")));
 extern void test_bmi2 (void)			__attribute__((__target__("bmi2")));
 
@@ -63,6 +64,7 @@ extern void test_bo_avx512pf(void)		__attribute__((__target__("no-avx512pf")));
 extern void test_no_avx512cd(void)		__attribute__((__target__("no-avx512cd")));
 extern void test_no_avx5124fmaps(void)          __attribute__((__target__("no-avx5124fmaps")));
 extern void test_no_avx5124vnniw(void)          __attribute__((__target__("no-avx5124vnniw")));
+extern void test_no_avx512vpopcntdq(void)	__attribute__((__target__("no-avx512vpopcntdq")));
 extern void test_no_bmi (void)			__attribute__((__target__("no-bmi")));
 extern void test_no_bmi2 (void)			__attribute__((__target__("no-bmi2")));
 
diff --git a/gcc/testsuite/gcc.target/i386/sse-12.c b/gcc/testsuite/gcc.target/i386/sse-12.c
index 3e8417b..19ff785 100644
--- a/gcc/testsuite/gcc.target/i386/sse-12.c
+++ b/gcc/testsuite/gcc.target/i386/sse-12.c
@@ -3,7 +3,7 @@
    popcntintrin.h and mm_malloc.h are usable
    with -O -std=c89 -pedantic-errors.  */
 /* { dg-do compile } */
-/* { dg-options "-O -std=c89 -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512bw -mavx512dq -mavx512vl -mavx512vbmi -mavx512ifma -mavx5124fmaps -mavx5124vnniw -mclwb -mmwaitx -mclzero -mpku" } */
+/* { dg-options "-O -std=c89 -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512bw -mavx512dq -mavx512vl -mavx512vbmi -mavx512ifma -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku" } */
 
 #include <x86intrin.h>
 
diff --git a/gcc/testsuite/gcc.target/i386/sse-13.c b/gcc/testsuite/gcc.target/i386/sse-13.c
index 67f3b93..350e2ed 100644
--- a/gcc/testsuite/gcc.target/i386/sse-13.c
+++ b/gcc/testsuite/gcc.target/i386/sse-13.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512ifma -mavx5124fmaps -mavx5124vnniw -mclwb -mmwaitx -mclzero -mpku" } */
+/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512ifma -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku" } */
 /* { dg-add-options bind_pic_locally } */
 
 #include <mm_malloc.h>
diff --git a/gcc/testsuite/gcc.target/i386/sse-22.c b/gcc/testsuite/gcc.target/i386/sse-22.c
index 44d48fd..85f9119 100644
--- a/gcc/testsuite/gcc.target/i386/sse-22.c
+++ b/gcc/testsuite/gcc.target/i386/sse-22.c
@@ -9,8 +9,8 @@
    are defined as inline functions in {,x,e,p,t,s,w,a,b,i}mmintrin.h,
    mm3dnow.h, fma4intrin.h, xopintrin.h, abmintrin.h, bmiintrin.h,
    tbmintrin.h, lwpintrin.h, popcntintrin.h, fmaintrin.h,
-   avx5124fmapsintrin.h, avx5124vnniwintrin.h and mm_malloc.h 
-   that reference the proper builtin functions.
+   avx5124fmapsintrin.h, avx5124vnniwintrin.h, avx512vpopcntdqintrin.h and
+   mm_malloc.h that reference the proper builtin functions.
 
    Defining away "extern" and "__inline" results in all of them being
    compiled as proper functions.  */
@@ -101,7 +101,7 @@
 
 
 #ifndef DIFFERENT_PRAGMAS
-#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,avx512vl,avx512bw,avx512dq,avx512vbmi,avx512ifma,avx5124fmaps,avx5124vnniw")
+#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,avx512vl,avx512bw,avx512dq,avx512vbmi,avx512ifma,avx5124fmaps,avx5124vnniw,avx512vpopcntdq")
 #endif
 
 /* Following intrinsics require immediate arguments.  They
@@ -218,7 +218,7 @@ test_4 (_mm_cmpestrz, int, __m128i, int, __m128i, int, 1)
 
 /* immintrin.h (AVX/AVX2/RDRND/FSGSBASE/F16C/RTM/AVX512F/SHA) */
 #ifdef DIFFERENT_PRAGMAS
-#pragma GCC target ("avx,avx2,rdrnd,fsgsbase,f16c,rtm,avx512f,avx512er,avx512cd,avx512pf,sha,avx512vl,avx512bw,avx512dq,avx512ifma,avx512vbmi,avx5124fmaps,avx5124vnniw")
+#pragma GCC target ("avx,avx2,rdrnd,fsgsbase,f16c,rtm,avx512f,avx512er,avx512cd,avx512pf,sha,avx512vl,avx512bw,avx512dq,avx512ifma,avx512vbmi,avx5124fmaps,avx5124vnniw,avx512vpopcntdq")
 #endif
 #include <immintrin.h>
 test_1 (_cvtss_sh, unsigned short, float, 1)
diff --git a/gcc/testsuite/gcc.target/i386/sse-23.c b/gcc/testsuite/gcc.target/i386/sse-23.c
index 61f1b00..3fc1f75 100644
--- a/gcc/testsuite/gcc.target/i386/sse-23.c
+++ b/gcc/testsuite/gcc.target/i386/sse-23.c
@@ -8,8 +8,8 @@
    are defined as inline functions in {,x,e,p,t,s,w,a,b,i}mmintrin.h,
    mm3dnow.h, fma4intrin.h, xopintrin.h, abmintrin.h, bmiintrin.h,
    tbmintrin.h, lwpintrin.h, popcntintrin.h, fmaintrin.h,
-   avx5124fmapsintrin.h, avx5124vnniwintrin.h and mm_malloc.h 
-   that reference the proper builtin functions.
+   avx5124fmapsintrin.h, avx5124vnniwintrin.h, avx512vpopcntdqintrin.h
+   and mm_malloc.h that reference the proper builtin functions.
 
    Defining away "extern" and "__inline" results in all of them being
    compiled as proper functions.  */
@@ -595,6 +595,6 @@
 #define __builtin_ia32_extracti64x2_256_mask(A, E, C, D) __builtin_ia32_extracti64x2_256_mask(A, 1, C, D)
 #define __builtin_ia32_extractf64x2_256_mask(A, E, C, D) __builtin_ia32_extractf64x2_256_mask(A, 1, C, D)
 
-#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,fma,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,xsavec,xsaves,clflushopt,avx512bw,avx512dq,avx512vl,avx512vbmi,avx512ifma,avx5124fmaps,avx5124vnniw,clwb,mwaitx,clzero,pku")
+#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,fma,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,xsavec,xsaves,clflushopt,avx512bw,avx512dq,avx512vl,avx512vbmi,avx512ifma,avx5124fmaps,avx5124vnniw,avx512vpopcntdq,clwb,mwaitx,clzero,pku")
 
 #include <x86intrin.h>
diff --git a/libgcc/ChangeLog b/libgcc/ChangeLog
index f0eb567..87db42d 100644
--- a/libgcc/ChangeLog
+++ b/libgcc/ChangeLog
@@ -1,3 +1,10 @@
+2017-01-10  Andrew Senkevich  <andrew.senkevich@intel.com>
+
+	* config/i386/cpuinfo.h (processor_features): Add
+	FEATURE_AVX512VPOPCNTDQ.
+	* config/i386/cpuinfo.c (get_available_features): Habdle new
+	feature.
+
 2017-01-04  Joseph Myers  <joseph@codesourcery.com>
 
 	* config/mips/sfp-machine.h (_FP_CHOOSENAN): Always preserve NaN
diff --git a/libgcc/config/i386/cpuinfo.c b/libgcc/config/i386/cpuinfo.c
index 9e9156b..737d1aa 100644
--- a/libgcc/config/i386/cpuinfo.c
+++ b/libgcc/config/i386/cpuinfo.c
@@ -277,6 +277,8 @@ get_available_features (unsigned int ecx, unsigned int edx,
 	features |= (1 << FEATURE_AVX5124VNNIW);
       if (edx & bit_AVX5124FMAPS)
 	features |= (1 << FEATURE_AVX5124FMAPS);
+      if (ecx & bit_AVX512VPOPCNTDQ)
+	features |= (1 << FEATURE_AVX512VPOPCNTDQ);
     }
 
   unsigned int ext_level;
diff --git a/libgcc/config/i386/cpuinfo.h b/libgcc/config/i386/cpuinfo.h
index ceb09d2..872b45e 100644
--- a/libgcc/config/i386/cpuinfo.h
+++ b/libgcc/config/i386/cpuinfo.h
@@ -104,7 +104,8 @@ enum processor_features
   FEATURE_AVX512VBMI,
   FEATURE_AVX512IFMA,
   FEATURE_AVX5124VNNIW,
-  FEATURE_AVX5124FMAPS
+  FEATURE_AVX5124FMAPS,
+  FEATURE_AVX512VPOPCNTDQ
 };
 
 extern struct __processor_model

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][x86_64] Enable AVX512 VPOPCNTD/VPOPCNTQ instructions
  2017-01-10 12:01             ` Andrew Senkevich
@ 2017-01-10 12:58               ` Kirill Yukhin
  2017-02-24 20:42                 ` Andrew Senkevich
  0 siblings, 1 reply; 12+ messages in thread
From: Kirill Yukhin @ 2017-01-10 12:58 UTC (permalink / raw)
  To: Andrew Senkevich; +Cc: Uros Bizjak, gcc-patches

On 10 Jan 15:00, Andrew Senkevich wrote:
> 2017-01-10 13:31 GMT+03:00 Uros Bizjak <ubizjak@gmail.com>:
> > On Tue, Jan 10, 2017 at 11:21 AM, Andrew Senkevich
> > <andrew.n.senkevich@gmail.com> wrote:
> >> 2017-01-10 13:04 GMT+03:00 Kirill Yukhin <kirill.yukhin@gmail.com>:
> >>> Hi,
> >>> In addition to Uroš's inputs:
> >>>> diff --git a/gcc/config/i386/avx512vpopcntdqintrin.h
> >>>> b/gcc/config/i386/avx512vpopcntdqintrin.h
> >>>> new file mode 100644
> >>>> index 0000000..28305f6
> >>>> --- /dev/null
> >>>> +++ b/gcc/config/i386/avx512vpopcntdqintrin.h
> >>>> @@ -0,0 +1,90 @@
> >>>> +/* Copyright (C) 2016 Free Software Foundation, Inc.
> >>> Pls, fix year.
> >>>
> >>> Pattern should perfectly fit into subst infra.
> >>
> >> Indeed, patch attached.
> >> Changelogs will be fixed accordingly.
> >
> > Patch is OK for mainline.
>
> Thanks!
>
> Attached with updated ChangeLogs.
> Kirill, could you commit please?
Done.

Also, could you pls implement runtime test cases for new intrinsics.

--
Thanks, K

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][x86_64] Enable AVX512 VPOPCNTD/VPOPCNTQ instructions
  2017-01-10 12:58               ` Kirill Yukhin
@ 2017-02-24 20:42                 ` Andrew Senkevich
  2017-02-25 11:47                   ` Uros Bizjak
  0 siblings, 1 reply; 12+ messages in thread
From: Andrew Senkevich @ 2017-02-24 20:42 UTC (permalink / raw)
  To: Kirill Yukhin; +Cc: Uros Bizjak, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1556 bytes --]

2017-01-10 13:58 GMT+01:00 Kirill Yukhin <kirill.yukhin@gmail.com>:
> On 10 Jan 15:00, Andrew Senkevich wrote:
>> 2017-01-10 13:31 GMT+03:00 Uros Bizjak <ubizjak@gmail.com>:
>> > On Tue, Jan 10, 2017 at 11:21 AM, Andrew Senkevich
>> > <andrew.n.senkevich@gmail.com> wrote:
>> >> 2017-01-10 13:04 GMT+03:00 Kirill Yukhin <kirill.yukhin@gmail.com>:
>> >>> Hi,
>> >>> In addition to Uroš's inputs:
>> >>>> diff --git a/gcc/config/i386/avx512vpopcntdqintrin.h
>> >>>> b/gcc/config/i386/avx512vpopcntdqintrin.h
>> >>>> new file mode 100644
>> >>>> index 0000000..28305f6
>> >>>> --- /dev/null
>> >>>> +++ b/gcc/config/i386/avx512vpopcntdqintrin.h
>> >>>> @@ -0,0 +1,90 @@
>> >>>> +/* Copyright (C) 2016 Free Software Foundation, Inc.
>> >>> Pls, fix year.
>> >>>
>> >>> Pattern should perfectly fit into subst infra.
>> >>
>> >> Indeed, patch attached.
>> >> Changelogs will be fixed accordingly.
>> >
>> > Patch is OK for mainline.
>>
>> Thanks!
>>
>> Attached with updated ChangeLogs.
>> Kirill, could you commit please?
> Done.
>
> Also, could you pls implement runtime test cases for new intrinsics.

Hi,

those tests are attached, are they Ok?
ChangLog:

gcc/testsuite/

    * gcc.target/i386/avx512vpopcntdq-check.h: New.
    * gcc.target/i386/avx512vpopcntdq-vpopcntd-1.c: Ditto.
    * gcc.target/i386/avx512vpopcntdq-vpopcntq-1.c: Ditto.
    * gcc.target/i386/avx512f-helper.h: Add avx512vpopcntdq-check.h.
    * gcc.target/i386/i386.exp (check_effective_target_avx512vpopcntdq): New.


--
WBR,
Andrew

[-- Attachment #2: avx512vpopcntdq_rt_tests.patch --]
[-- Type: application/octet-stream, Size: 5959 bytes --]

diff --git a/gcc/testsuite/gcc.target/i386/avx512f-helper.h b/gcc/testsuite/gcc.target/i386/avx512f-helper.h
index 6aca0d6..ef4661a 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-helper.h
+++ b/gcc/testsuite/gcc.target/i386/avx512f-helper.h
@@ -26,6 +26,8 @@
 #include "avx5124fmaps-check.h"
 #elif defined (AVX5124VNNIW) && !defined (AVX512VL)
 #include "avx5124vnniw-check.h"
+#elif defined (AVX512VPOPCNTDQ) && !defined (AVX512VL)
+#include "avx512vpopcntdq-check.h"
 #elif defined (AVX512VL)
 #include "avx512vl-check.h"
 #endif
@@ -144,6 +146,9 @@ avx5124fmaps_test (void) { test_512 (); }
 #elif defined (AVX5124VNNIW) && !defined (AVX512VL)
 void
 avx5124vnniw_test (void) { test_512 (); }
+#elif defined (AVX512VPOPCNTDQ) && !defined (AVX512VL)
+void
+avx512vpopcntdq_test (void) { test_512 (); }
 #elif defined (AVX512VL)
 void
 avx512vl_test (void) { test_256 (); test_128 (); }
diff --git a/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-check.h b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-check.h
new file mode 100644
index 0000000..179548b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-check.h
@@ -0,0 +1,47 @@
+#include <stdlib.h>
+#include "cpuid.h"
+#include "m512-check.h"
+#include "avx512f-os-support.h"
+
+static void avx512vpopcntdq_test (void);
+
+static void __attribute__ ((noinline)) do_test (void)
+{
+  avx512vpopcntdq_test ();
+}
+
+int
+main ()
+{
+  unsigned int eax, ebx, ecx, edx;
+
+  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
+    return 0;
+
+  /* Run AVX512_VPOPCNTDQ test only if host has the support.  */
+  if ((ecx & bit_OSXSAVE) == (bit_OSXSAVE))
+    {
+      if (__get_cpuid_max (0, NULL) < 7)
+	return 0;
+
+      __cpuid_count (7, 0, eax, ebx, ecx, edx);
+
+      if ((avx512f_os_support ()) && ((ecx & bit_AVX512VPOPCNTDQ) == bit_AVX512VPOPCNTDQ))
+	{
+	  do_test ();
+#ifdef DEBUG
+	  printf ("PASSED\n");
+#endif
+	  return 0;
+	}
+#ifdef DEBUG
+      printf ("SKIPPED\n");
+#endif
+    }
+#ifdef DEBUG
+  else
+    printf ("SKIPPED\n");
+#endif
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntd-1.c b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntd-1.c
new file mode 100644
index 0000000..d9faf0a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntd-1.c
@@ -0,0 +1,57 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -mavx512vpopcntdq" } */
+/* { dg-require-effective-target avx512vpopcntdq } */
+
+#define AVX512VPOPCNTDQ
+#include "avx512f-helper.h"
+
+#define SIZE (AVX512F_LEN / 32)
+
+#include "avx512f-mask-type.h"
+
+#define TYPE int
+
+static int
+compute_popcnt (TYPE v)
+{
+  int ret;
+  int i;
+
+ ret = 0;
+ for (i = 0; i < sizeof(v) * 8; i++)
+   if ((v & ((TYPE)1 << (TYPE) i)))
+     ret++;
+
+ return ret;
+}
+
+void
+TEST (void)
+{
+  UNION_TYPE (AVX512F_LEN, i_d) res1, res2, res3, src, src0;
+  MASK_TYPE mask = MASK_VALUE;
+  TYPE res_ref[SIZE];
+  src.x = _mm512_set1_epi8 (0x3D);
+  int i;
+
+  for (i = 0; i < SIZE; i++)
+  {
+    res_ref[i] = compute_popcnt (src.a[i]);
+    src0.a[i] = DEFAULT_VALUE;
+  }
+
+  res1.x = INTRINSIC (_popcnt_epi32)       (src.x);
+  res2.x = INTRINSIC (_mask_popcnt_epi32)  (src.x, mask, src0.x);
+  res3.x = INTRINSIC (_maskz_popcnt_epi32) (mask, src.x);
+
+  if (UNION_CHECK (AVX512F_LEN, i_d) (res1, res_ref))
+    abort ();
+
+  MASK_MERGE (i_d) (res_ref, mask, SIZE);
+  if (UNION_CHECK (AVX512F_LEN, i_d) (res2, res_ref))
+    abort ();
+
+  MASK_ZERO (i_d) (res_ref, mask, SIZE);
+  if (UNION_CHECK (AVX512F_LEN, i_d) (res3, res_ref))
+    abort ();
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntq-1.c b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntq-1.c
new file mode 100644
index 0000000..5a62821
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntq-1.c
@@ -0,0 +1,57 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -mavx512vpopcntdq" } */
+/* { dg-require-effective-target avx512vpopcntdq } */
+
+#define AVX512VPOPCNTDQ
+#include "avx512f-helper.h"
+
+#define SIZE (AVX512F_LEN / 64)
+
+#include "avx512f-mask-type.h"
+
+#define TYPE long long
+
+static int
+compute_popcnt (TYPE v)
+{
+  int ret;
+  int i;
+
+ ret = 0;
+ for (i = 0; i < sizeof(v) * 8; i++)
+   if ((v & ((TYPE)1 << (TYPE) i)))
+     ret++;
+
+ return ret;
+}
+
+void
+TEST (void)
+{
+  UNION_TYPE (AVX512F_LEN, i_q) res1, res2, res3, src, src0;
+  MASK_TYPE mask = MASK_VALUE;
+  TYPE res_ref[SIZE];
+  src.x = _mm512_set1_epi8 (0x3D);
+  int i;
+
+  for (i = 0; i < SIZE; i++)
+  {
+    res_ref[i] = compute_popcnt (src.a[i]);
+    src0.a[i] = DEFAULT_VALUE;
+  }
+
+  res1.x = INTRINSIC (_popcnt_epi64)       (src.x);
+  res2.x = INTRINSIC (_mask_popcnt_epi64)  (src.x, mask, src0.x);
+  res3.x = INTRINSIC (_maskz_popcnt_epi64) (mask, src.x);
+
+  if (UNION_CHECK (AVX512F_LEN, i_q) (res1, res_ref))
+    abort ();
+
+  MASK_MERGE (i_q) (res_ref, mask, SIZE);
+  if (UNION_CHECK (AVX512F_LEN, i_q) (res2, res_ref))
+    abort ();
+
+  MASK_ZERO (i_q) (res_ref, mask, SIZE);
+  if (UNION_CHECK (AVX512F_LEN, i_q) (res3, res_ref))
+    abort ();
+}
diff --git a/gcc/testsuite/gcc.target/i386/i386.exp b/gcc/testsuite/gcc.target/i386/i386.exp
index d06c0d9..b335e9d 100644
--- a/gcc/testsuite/gcc.target/i386/i386.exp
+++ b/gcc/testsuite/gcc.target/i386/i386.exp
@@ -408,6 +408,20 @@ proc check_effective_target_avx5124vnniw { } {
     } "-mavx5124vnniw" ]
 }
 
+
+# Return 1 if avx512_vpopcntdq instructions can be compiled.
+proc check_effective_target_avx512vpopcntdq { } {
+    return [check_no_compiler_messages avx512vpopcntdq object {
+        typedef int __v16si __attribute__ ((__vector_size__ (64)));
+
+        __v16si
+        _mm512_popcnt_epi32 (__v16si __A)
+        {
+            return (__v16si) __builtin_ia32_vpopcountd_v16si ((__v16si) __A);
+        }
+    } "-mavx512vpopcntdq" ]
+}
+
 # If a testcase doesn't have special options, use these.
 global DEFAULT_CFLAGS
 if ![info exists DEFAULT_CFLAGS] then {

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][x86_64] Enable AVX512 VPOPCNTD/VPOPCNTQ instructions
  2017-02-24 20:42                 ` Andrew Senkevich
@ 2017-02-25 11:47                   ` Uros Bizjak
  0 siblings, 0 replies; 12+ messages in thread
From: Uros Bizjak @ 2017-02-25 11:47 UTC (permalink / raw)
  To: Andrew Senkevich; +Cc: Kirill Yukhin, gcc-patches

On Fri, Feb 24, 2017 at 8:29 PM, Andrew Senkevich
<andrew.n.senkevich@gmail.com> wrote:

>> Also, could you pls implement runtime test cases for new intrinsics.
>
> Hi,
>
> those tests are attached, are they Ok?
> ChangLog:
>
> gcc/testsuite/
>
>     * gcc.target/i386/avx512vpopcntdq-check.h: New.
>     * gcc.target/i386/avx512vpopcntdq-vpopcntd-1.c: Ditto.
>     * gcc.target/i386/avx512vpopcntdq-vpopcntq-1.c: Ditto.
>     * gcc.target/i386/avx512f-helper.h: Add avx512vpopcntdq-check.h.
>     * gcc.target/i386/i386.exp (check_effective_target_avx512vpopcntdq): New.

OK.

Thanks,
Uros.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH][x86_64] Enable AVX512 VPOPCNTD/VPOPCNTQ instructions
@ 2016-12-22 16:38 Andrew Senkevich
  0 siblings, 0 replies; 12+ messages in thread
From: Andrew Senkevich @ 2016-12-22 16:38 UTC (permalink / raw)
  To: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 36486 bytes --]

Hi,

this patch enables AVX512 VPOPCNTD/VPOPCNTQ instructions recently
added in Instruction Set Extensions
(https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf).

gcc/
    * common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX512VPOPCNTDQ_SET,
    OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET): New.
    * config.gcc: Add avx512vpopcntdqintrin.h.
    * config/i386/avx512vpopcntdqintrin.h: New.
    * config/i386/cpuid.h (bit_AVX512VPOPCNTDQ): New.
    * config/i386/i386-builtin-types.def: Add new types.
    * config/i386/i386-builtin.def (__builtin_ia32_vpopcountd_v16si,
    __builtin_ia32_vpopcountd_v16si_mask,
    __builtin_ia32_vpopcountd_v16si_maskz, __builtin_ia32_vpopcountq_v8di,
    __builtin_ia32_vpopcountq_v8di_mask,
    __builtin_ia32_vpopcountq_v8di_maskz): New.
    * config/i386/i386-c.c (ix86_target_macros_internal): Define
    __AVX512VPOPCNTDQ__.
    * config/i386/i386.c (ix86_target_string): Add -mavx512vpopcntdq.
    (PTA_AVX512VPOPCNTDQ): Define.
    * config/i386/i386.h (TARGET_AVX512VPOPCNTDQ,
    TARGET_AVX512VPOPCNTDQ_P): Define.
    * config/i386/i386.opt: Add mavx512vpopcntdq.
    * config/i386/immintrin.h: Include avx512vpopcntdqintrin.h.
    * config/i386/sse.md (unspec): Add UNSPEC_VPOPCNTDQ.
    (define_insn "vpopcount<mode>"): New.
    (define_insn "vpopcountv16si_mask"): Ditto.
    (define_insn "vpopcountv16si_maskz"): Ditto.
    (define_insn "vpopcountv8di_mask"): Ditto.
    (define_insn "vpopcountv8di_maskz"): Ditto.
    (define_mode_iterator VI_AVX512F): Ditto.

gcc/testsuite/
    * g++.dg/other/i386-2.C: Add -mavx512vpopcntdq.
    * g++.dg/other/i386-3.C: Ditto.
    * gcc.target/i386/sse-12.c: Ditto.
    * gcc.target/i386/sse-13.c: Ditto.
    * gcc.target/i386/sse-22.c: Ditto.
    * gcc.target/i386/sse-23.c: Ditto.
    * gcc.target/i386/builtin_target.c: Handle new option.
    * gcc.target/i386/funcspec-56.inc: Test new attributes.
    * gcc.target/i386/avx512vpopcntdq-vpopcntd.c: New test.
    * gcc.target/i386/avx512vpopcntdq-vpopcntq.c: Ditto.

libgcc/
    * config/i386/cpuinfo.h (processor_features): Add
    FEATURE_AVX512VPOPCNTDQ.
    * config/i386/cpuinfo.c (get_available_features): Habdle new
    feature.


diff --git a/gcc/common/config/i386/i386-common.c
b/gcc/common/config/i386/i386-common.c
index 98224f5..a425af5 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -78,6 +78,7 @@ along with GCC; see the file COPYING3.  If not see
   (OPTION_MASK_ISA_AVX512VBMI | OPTION_MASK_ISA_AVX512BW_SET)
 #define OPTION_MASK_ISA_AVX5124FMAPS_SET OPTION_MASK_ISA_AVX5124FMAPS
 #define OPTION_MASK_ISA_AVX5124VNNIW_SET OPTION_MASK_ISA_AVX5124VNNIW
+#define OPTION_MASK_ISA_AVX512VPOPCNTDQ_SET OPTION_MASK_ISA_AVX512VPOPCNTDQ
 #define OPTION_MASK_ISA_RTM_SET OPTION_MASK_ISA_RTM
 #define OPTION_MASK_ISA_PRFCHW_SET OPTION_MASK_ISA_PRFCHW
 #define OPTION_MASK_ISA_RDSEED_SET OPTION_MASK_ISA_RDSEED
@@ -183,6 +184,7 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA_AVX512VBMI_UNSET OPTION_MASK_ISA_AVX512VBMI
 #define OPTION_MASK_ISA_AVX5124FMAPS_UNSET OPTION_MASK_ISA_AVX5124FMAPS
 #define OPTION_MASK_ISA_AVX5124VNNIW_UNSET OPTION_MASK_ISA_AVX5124VNNIW
+#define OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET OPTION_MASK_ISA_AVX512VPOPCNTDQ
 #define OPTION_MASK_ISA_RTM_UNSET OPTION_MASK_ISA_RTM
 #define OPTION_MASK_ISA_PRFCHW_UNSET OPTION_MASK_ISA_PRFCHW
 #define OPTION_MASK_ISA_RDSEED_UNSET OPTION_MASK_ISA_RDSEED
@@ -409,6 +411,8 @@ ix86_handle_option (struct gcc_options *opts,
   opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_AVX5124FMAPS_UNSET;
   opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA_AVX5124VNNIW_UNSET;
   opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_AVX5124VNNIW_UNSET;
+  opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET;
+  opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET;
  }
       return true;

@@ -481,6 +485,21 @@ ix86_handle_option (struct gcc_options *opts,
  }
       return true;

+    case OPT_mavx512vpopcntdq:
+      if (value)
+ {
+  opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA_AVX512VPOPCNTDQ_SET;
+  opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_AVX512VPOPCNTDQ_SET;
+  opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AVX512F_SET;
+  opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512F_SET;
+ }
+      else
+ {
+  opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET;
+  opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET;
+ }
+      return true;
+
     case OPT_mavx512dq:
       if (value)
  {
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 7afbc54..f9e9399 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -375,7 +375,8 @@ i[34567]86-*-*)
        avx512vlintrin.h avx512vlbwintrin.h avx512vldqintrin.h
        avx512ifmaintrin.h avx512ifmavlintrin.h avx512vbmiintrin.h
        avx512vbmivlintrin.h avx5124fmapsintrin.h avx5124vnniwintrin.h
-       clwbintrin.h mwaitxintrin.h clzerointrin.h pkuintrin.h"
+       avx512vpopcntdqintrin.h clwbintrin.h mwaitxintrin.h
+       clzerointrin.h pkuintrin.h"
  ;;
 x86_64-*-*)
  cpu_type=i386
@@ -397,7 +398,8 @@ x86_64-*-*)
        avx512vlintrin.h avx512vlbwintrin.h avx512vldqintrin.h
        avx512ifmaintrin.h avx512ifmavlintrin.h avx512vbmiintrin.h
        avx512vbmivlintrin.h avx5124fmapsintrin.h avx5124vnniwintrin.h
-       clwbintrin.h mwaitxintrin.h clzerointrin.h pkuintrin.h"
+       avx512vpopcntdqintrin.h clwbintrin.h mwaitxintrin.h
+       clzerointrin.h pkuintrin.h"
  ;;
 ia64-*-*)
  extra_headers=ia64intrin.h
diff --git a/gcc/config/i386/avx512vpopcntdqintrin.h
b/gcc/config/i386/avx512vpopcntdqintrin.h
new file mode 100644
index 0000000..28305f6
--- /dev/null
+++ b/gcc/config/i386/avx512vpopcntdqintrin.h
@@ -0,0 +1,90 @@
+/* Copyright (C) 2016 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#if !defined _IMMINTRIN_H_INCLUDED
+# error "Never use <avx512vpopcntdqintrin.h> directly; include
<x86intrin.h> instead."
+#endif
+
+#ifndef _AVX512VPOPCNTDQINTRIN_H_INCLUDED
+#define _AVX512VPOPCNTDQINTRIN_H_INCLUDED
+
+#ifndef __AVX512VPOPCNTDQ__
+#pragma GCC push_options
+#pragma GCC target("avx512vpopcntdq")
+#define __DISABLE_AVX512VPOPCNTDQ__
+#endif /* __AVX512VPOPCNTDQ__ */
+
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_popcnt_epi32 (__m512i __A)
+{
+  return (__m512i) __builtin_ia32_vpopcountd_v16si ((__v16si) __A);
+}
+
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_popcnt_epi32 (__m512i __A, __mmask16 __U, __m512i __B)
+{
+  return (__m512i) __builtin_ia32_vpopcountd_v16si_mask ((__v16si) __A,
+ (__mmask16) __U,
+ (__v16si) __B);
+}
+
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_maskz_popcnt_epi32 (__mmask16 __U, __m512i __A)
+{
+  return (__m512i) __builtin_ia32_vpopcountd_v16si_maskz ((__mmask16) __U,
+  (__v16si) __A);
+}
+
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_popcnt_epi64 (__m512i __A)
+{
+  return (__m512i) __builtin_ia32_vpopcountq_v8di ((__v8di) __A);
+}
+
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_popcnt_epi64 (__m512i __A, __mmask8 __U, __m512i __B)
+{
+  return (__m512i) __builtin_ia32_vpopcountq_v8di_mask ((__v8di) __A,
+ (__mmask8) __U,
+ (__v8di) __B);
+}
+
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_maskz_popcnt_epi64 (__mmask8 __U, __m512i __A)
+{
+  return (__m512i) __builtin_ia32_vpopcountq_v8di_maskz ((__mmask8) __U,
+ (__v8di) __A);
+}
+
+#ifdef __DISABLE_AVX512VPOPCNTDQ__
+#undef __DISABLE_AVX512VPOPCNTDQ__
+#pragma GCC pop_options
+#endif /* __DISABLE_AVX512VPOPCNTDQ__ */
+
+#endif /* _AVX512VPOPCNTDQINTRIN_H_INCLUDED */
diff --git a/gcc/config/i386/cpuid.h b/gcc/config/i386/cpuid.h
index abe7c62..d094b78 100644
--- a/gcc/config/i386/cpuid.h
+++ b/gcc/config/i386/cpuid.h
@@ -54,6 +54,7 @@
 #define bit_SSE4a (1 << 6)
 #define bit_PRFCHW (1 << 8)
 #define bit_XOP         (1 << 11)
+#define bit_AVX512VPOPCNTDQ (1 << 14)
 #define bit_LWP (1 << 15)
 #define bit_FMA4        (1 << 16)
 #define bit_TBM         (1 << 21)
diff --git a/gcc/config/i386/i386-builtin-types.def
b/gcc/config/i386/i386-builtin-types.def
index 6e938eb..6b3ced9 100644
--- a/gcc/config/i386/i386-builtin-types.def
+++ b/gcc/config/i386/i386-builtin-types.def
@@ -305,9 +305,15 @@ DEF_FUNCTION_TYPE (V8DF, V2DF)
 DEF_FUNCTION_TYPE (V16SI, V4SI)
 DEF_FUNCTION_TYPE (V16SI, V8SI)
 DEF_FUNCTION_TYPE (V16SI, V16SF)
+DEF_FUNCTION_TYPE (V16SI, V16SI)
+DEF_FUNCTION_TYPE (V16SI, UHI, V16SI)
+DEF_FUNCTION_TYPE (V16SI, V16SI, UHI, V16SI)
 DEF_FUNCTION_TYPE (V16SI, V16SI, V16SI, UHI)
 DEF_FUNCTION_TYPE (V8DI, V8DI, V8DI, UQI)
 DEF_FUNCTION_TYPE (V8DI, PV8DI)
+DEF_FUNCTION_TYPE (V8DI, V8DI)
+DEF_FUNCTION_TYPE (V8DI, UQI, V8DI)
+DEF_FUNCTION_TYPE (V8DI, V8DI, UQI, V8DI)

 DEF_FUNCTION_TYPE (DI, V2DI, INT)
 DEF_FUNCTION_TYPE (DOUBLE, V2DF, INT)
@@ -486,6 +492,7 @@ DEF_FUNCTION_TYPE (V16SI, V16SI, INT)
 DEF_FUNCTION_TYPE (V16SI, V16SI, V4SI, V16SI, UHI)
 DEF_FUNCTION_TYPE (V16SI, V16SI, INT, V16SI, UHI)
 DEF_FUNCTION_TYPE (V8SI, PCV8SI, V8SI)
+DEF_FUNCTION_TYPE (V4DI, V4DI)
 DEF_FUNCTION_TYPE (V4DI, V4DI, V4DI)
 DEF_FUNCTION_TYPE (V16SI, V8DF, V8DF)
 DEF_FUNCTION_TYPE (V8DI, V8DI, V8DI, INT, V8DI, UQI)
diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
index 7d86008..2e58a26 100644
--- a/gcc/config/i386/i386-builtin.def
+++ b/gcc/config/i386/i386-builtin.def
@@ -2527,6 +2527,12 @@ BDESC (OPTION_MASK_ISA_AVX5124VNNIW,
CODE_FOR_avx5124vnniw_vp4dpwssd, "__builtin
 BDESC (OPTION_MASK_ISA_AVX5124VNNIW,
CODE_FOR_avx5124vnniw_vp4dpwssd_mask, "__builtin_ia32_vp4dpwssd_mask",
IX86_BUILTIN_4DPWSSD_MASK, UNKNOWN, (int)
V16SI_FTYPE_V16SI_V16SI_V16SI_V16SI_V16SI_PCV4SI_V16SI_UHI)
 BDESC (OPTION_MASK_ISA_AVX5124VNNIW,
CODE_FOR_avx5124vnniw_vp4dpwssds, "__builtin_ia32_vp4dpwssds",
IX86_BUILTIN_4DPWSSDS, UNKNOWN, (int)
V16SI_FTYPE_V16SI_V16SI_V16SI_V16SI_V16SI_PCV4SI)
 BDESC (OPTION_MASK_ISA_AVX5124VNNIW,
CODE_FOR_avx5124vnniw_vp4dpwssds_mask,
"__builtin_ia32_vp4dpwssds_mask", IX86_BUILTIN_4DPWSSDS_MASK, UNKNOWN,
(int) V16SI_FTYPE_V16SI_V16SI_V16SI_V16SI_V16SI_PCV4SI_V16SI_UHI)
+BDESC (OPTION_MASK_ISA_AVX512VPOPCNTDQ, CODE_FOR_vpopcountv16si,
"__builtin_ia32_vpopcountd_v16si", IX86_BUILTIN_VPOPCOUNTDV16SI,
UNKNOWN, (int) V16SI_FTYPE_V16SI)
+BDESC (OPTION_MASK_ISA_AVX512VPOPCNTDQ, CODE_FOR_vpopcountv16si_mask,
"__builtin_ia32_vpopcountd_v16si_mask",
IX86_BUILTIN_VPOPCOUNTDV16SI_MASK, UNKNOWN, (int)
V16SI_FTYPE_V16SI_UHI_V16SI)
+BDESC (OPTION_MASK_ISA_AVX512VPOPCNTDQ,
CODE_FOR_vpopcountv16si_maskz,
"__builtin_ia32_vpopcountd_v16si_maskz",
IX86_BUILTIN_VPOPCOUNTDV16SI_MASKZ, UNKNOWN, (int)
V16SI_FTYPE_UHI_V16SI)
+BDESC (OPTION_MASK_ISA_AVX512VPOPCNTDQ, CODE_FOR_vpopcountv8di,
"__builtin_ia32_vpopcountq_v8di", IX86_BUILTIN_VPOPCOUNTQV8DI,
UNKNOWN, (int) V8DI_FTYPE_V8DI)
+BDESC (OPTION_MASK_ISA_AVX512VPOPCNTDQ, CODE_FOR_vpopcountv8di_mask,
"__builtin_ia32_vpopcountq_v8di_mask",
IX86_BUILTIN_VPOPCOUNTQV8DI_MASK, UNKNOWN, (int)
V8DI_FTYPE_V8DI_UQI_V8DI)
+BDESC (OPTION_MASK_ISA_AVX512VPOPCNTDQ, CODE_FOR_vpopcountv8di_maskz,
"__builtin_ia32_vpopcountq_v8di_maskz",
IX86_BUILTIN_VPOPCOUNTQV8DI_MASKZ, UNKNOWN, (int) V8DI_FTYPE_UQI_V8DI)

 BDESC_END (ARGS2, MPX)

diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c
index 6e56c83..8a91e39 100644
--- a/gcc/config/i386/i386-c.c
+++ b/gcc/config/i386/i386-c.c
@@ -380,6 +380,8 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
     def_or_undef (parse_in, "__AVX5124VNNIW__");
   if (isa_flag2 & OPTION_MASK_ISA_AVX5124FMAPS)
     def_or_undef (parse_in, "__AVX5124FMAPS__");
+  if (isa_flag2 & OPTION_MASK_ISA_AVX512VPOPCNTDQ)
+    def_or_undef (parse_in, "__AVX512VPOPCNTDQ__");
   if (isa_flag & OPTION_MASK_ISA_FMA)
     def_or_undef (parse_in, "__FMA__");
   if (isa_flag & OPTION_MASK_ISA_RTM)
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 792e8ec..164b911 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -4320,6 +4320,7 @@ ix86_target_string (HOST_WIDE_INT isa,
HOST_WIDE_INT isa2, int flags,
   {
     { "-mavx5124vnniw", OPTION_MASK_ISA_AVX5124VNNIW },
     { "-mavx5124fmaps", OPTION_MASK_ISA_AVX5124FMAPS },
+    { "-mavx512vpopcntdq", OPTION_MASK_ISA_AVX512VPOPCNTDQ },
   };
   /* Flag options.  */
   static struct ix86_target_opts flag_opts[] =
@@ -4919,6 +4920,7 @@ ix86_option_override_internal (bool main_args_p,
 #define PTA_PKU (HOST_WIDE_INT_1 << 59)
 #define PTA_AVX5124VNNIW (HOST_WIDE_INT_1 << 60)
 #define PTA_AVX5124FMAPS (HOST_WIDE_INT_1 << 61)
+#define PTA_AVX512VPOPCNTDQ (HOST_WIDE_INT_1 << 62)

 #define PTA_CORE2 \
   (PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSSE3 \
@@ -5581,6 +5583,9 @@ ix86_option_override_internal (bool main_args_p,
  if (processor_alias_table[i].flags & PTA_AVX5124FMAPS
     && !(opts->x_ix86_isa_flags2_explicit & OPTION_MASK_ISA_AVX5124FMAPS))
   opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA_AVX5124FMAPS;
+ if (processor_alias_table[i].flags & PTA_AVX512VPOPCNTDQ
+    && !(opts->x_ix86_isa_flags2_explicit & OPTION_MASK_ISA_AVX512VPOPCNTDQ))
+  opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA_AVX512VPOPCNTDQ;

  if (processor_alias_table[i].flags & (PTA_PREFETCH_SSE | PTA_SSE))
   x86_prefetch_sse = true;
@@ -6625,6 +6630,7 @@ ix86_valid_target_attribute_inner_p (tree args,
char *p_strings[],
     IX86_ATTR_ISA ("avx512vl", OPT_mavx512vl),
     IX86_ATTR_ISA ("avx5124fmaps", OPT_mavx5124fmaps),
     IX86_ATTR_ISA ("avx5124vnniw", OPT_mavx5124vnniw),
+    IX86_ATTR_ISA ("avx512vpopcntdq", OPT_mavx512vpopcntdq),
     IX86_ATTR_ISA ("mmx", OPT_mmmx),
     IX86_ATTR_ISA ("pclmul", OPT_mpclmul),
     IX86_ATTR_ISA ("popcnt", OPT_mpopcnt),
@@ -33300,6 +33306,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
     F_AVX512IFMA,
     F_AVX5124VNNIW,
     F_AVX5124FMAPS,
+    F_AVX512VPOPCNTDQ,
     F_MAX
   };

@@ -33414,6 +33421,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
       {"avx512ifma",F_AVX512IFMA},
       {"avx5124vnniw",F_AVX5124VNNIW},
       {"avx5124fmaps",F_AVX5124FMAPS},
+      {"avx512vpopcntdq",F_AVX512VPOPCNTDQ},
     };

   tree __processor_model_type = build_processor_model_struct ();
@@ -34885,14 +34893,17 @@ ix86_expand_args_builtin (const struct
builtin_description *d,
     case V16SI_FTYPE_UHI:
     case V2DI_FTYPE_UQI:
     case V4DI_FTYPE_UQI:
+    case V4DI_FTYPE_V4DI:
     case V16SI_FTYPE_INT:
     case V16SF_FTYPE_V8SF:
     case V16SI_FTYPE_V8SI:
     case V16SF_FTYPE_V4SF:
     case V16SI_FTYPE_V4SI:
     case V16SI_FTYPE_V16SF:
+    case V16SI_FTYPE_V16SI:
     case V16SF_FTYPE_V16SF:
     case V8DI_FTYPE_UQI:
+    case V8DI_FTYPE_V8DI:
     case V8DF_FTYPE_V4DF:
     case V8DF_FTYPE_V2DF:
     case V8DF_FTYPE_V8DF:
@@ -34997,7 +35008,9 @@ ix86_expand_args_builtin (const struct
builtin_description *d,
     case UHI_FTYPE_UHI_UHI:
     case USI_FTYPE_USI_USI:
     case UDI_FTYPE_UDI_UDI:
+    case V8DI_FTYPE_UQI_V8DI:
     case V16SI_FTYPE_V8DF_V8DF:
+    case V16SI_FTYPE_UHI_V16SI:
       nargs = 2;
       break;
     case V2DI_FTYPE_V2DI_INT_CONVERT:
@@ -35203,6 +35216,11 @@ ix86_expand_args_builtin (const struct
builtin_description *d,
       nargs = 3;
       nargs_constant = 1;
       break;
+    case V8DI_FTYPE_V8DI_UQI_V8DI:
+    case V16SI_FTYPE_V16SI_UHI_V16SI:
+      nargs = 3;
+      mask_pos = 1;
+      break;
     case V4DI_FTYPE_V4DI_V4DI_INT_CONVERT:
       nargs = 3;
       rmode = V4DImode;
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 5f5368d..748de25 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -85,6 +85,8 @@ see the files COPYING3 and COPYING.RUNTIME
respectively.  If not, see
 #define TARGET_AVX5124FMAPS_P(x) TARGET_ISA_AVX5124FMAPS_P(x)
 #define TARGET_AVX5124VNNIW TARGET_ISA_AVX5124VNNIW
 #define TARGET_AVX5124VNNIW_P(x) TARGET_ISA_AVX5124VNNIW_P(x)
+#define TARGET_AVX512VPOPCNTDQ TARGET_ISA_AVX512VPOPCNTDQ
+#define TARGET_AVX512VPOPCNTDQ_P(x) TARGET_ISA_AVX512VPOPCNTDQ_P(x)
 #define TARGET_FMA TARGET_ISA_FMA
 #define TARGET_FMA_P(x) TARGET_ISA_FMA_P(x)
 #define TARGET_SSE4A TARGET_ISA_SSE4A
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index 390412a..b914287 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -705,6 +705,10 @@ mavx5124vnniw
 Target Report Mask(ISA_AVX5124VNNIW) Var(ix86_isa_flags2) Save
 Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2,
AVX512F and AVX5124VNNIW built-in functions and code generation.

+mavx512vpopcntdq
+Target Report Mask(ISA_AVX512VPOPCNTDQ) Var(ix86_isa_flags2) Save
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2,
AVX512F and AVX512VPOPCNTDQ built-in functions and code generation.
+
 mfma
 Target Report Mask(ISA_FMA) Var(ix86_isa_flags) Save
 Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX and FMA
built-in functions and code generation.
diff --git a/gcc/config/i386/immintrin.h b/gcc/config/i386/immintrin.h
index 3fd3c9c..0692580 100644
--- a/gcc/config/i386/immintrin.h
+++ b/gcc/config/i386/immintrin.h
@@ -72,6 +72,8 @@

 #include <avx5124vnniwintrin.h>

+#include <avx512vpopcntdqintrin.h>
+
 #include <shaintrin.h>

 #include <lzcntintrin.h>
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 4c9bdec..6b2a638 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -155,6 +155,9 @@
   UNSPEC_VP4FNMADD
   UNSPEC_VP4DPWSSD
   UNSPEC_VP4DPWSSDS
+
+  ;; For VPOPCOUNTDQ support
+  UNSPEC_VPOPCNTDQ
 ])

 (define_c_enum "unspecv" [
@@ -265,6 +268,9 @@
 (define_mode_iterator VF_512
   [V16SF V8DF])

+(define_mode_iterator VI_AVX512F
+  [V16SI V8DI])
+
 (define_mode_iterator VI48_AVX512VL
   [V16SI (V8SI  "TARGET_AVX512VL") (V4SI  "TARGET_AVX512VL")
    V8DI  (V4DI  "TARGET_AVX512VL") (V2DI  "TARGET_AVX512VL")])
@@ -19881,3 +19887,44 @@
    [(set_attr ("type") ("ssemuladd"))
     (set_attr ("prefix") ("evex"))
     (set_attr ("mode") ("TI"))])
+
+(define_insn "vpopcount<mode>"
+  [(set (match_operand:VI_AVX512F 0 "register_operand" "=v, v")
+ (popcount:VI_AVX512F
+  (match_operand:VI_AVX512F 1 "nonimmediate_operand" "v, m")))]
+  "TARGET_AVX512VPOPCNTDQ"
+  "vpopcnt<ssemodesuffix>\t{%1, %0|%0, %1}")
+
+(define_insn "vpopcountv16si_mask"
+  [(set (match_operand:V16SI 0 "register_operand" "=v, v")
+ (unspec:V16SI
+  [(match_operand:V16SI 1 "nonimmediate_operand" "v, m")
+   (match_operand:HI 2 "register_operand" "Yk, Yk")
+   (match_operand:V16SI 3 "nonimmediate_operand" "0, 0")] UNSPEC_VPOPCNTDQ))]
+  "TARGET_AVX512VPOPCNTDQ"
+  "vpopcntd\t{%1, %0%{%2%}|%{%2%}%0, %1}")
+
+(define_insn "vpopcountv16si_maskz"
+  [(set (match_operand:V16SI 0 "register_operand" "=v, v")
+ (unspec:V16SI
+  [(match_operand:HI 1 "register_operand" "Yk, Yk")
+   (match_operand:V16SI 2 "nonimmediate_operand" "v, m")] UNSPEC_VPOPCNTDQ))]
+  "TARGET_AVX512VPOPCNTDQ"
+  "vpopcntd\t{%2, %0%{%1%}%{z%}|%{%1%}%{z%}%0, %2}")
+
+(define_insn "vpopcountv8di_mask"
+  [(set (match_operand:V8DI 0 "register_operand" "=v, v")
+ (unspec:V8DI
+  [(match_operand:V8DI 1 "nonimmediate_operand" "v, m")
+   (match_operand:QI 2 "register_operand" "Yk, Yk")
+   (match_operand:V8DI 3 "nonimmediate_operand" "0, 0")] UNSPEC_VPOPCNTDQ))]
+  "TARGET_AVX512VPOPCNTDQ"
+  "vpopcntq\t{%1, %0%{%2%}|%{%2%}%0, %1}")
+
+(define_insn "vpopcountv8di_maskz"
+  [(set (match_operand:V8DI 0 "register_operand" "=v, v")
+ (unspec:V8DI
+  [(match_operand:QI 1 "register_operand" "Yk, Yk")
+   (match_operand:V8DI 2 "nonimmediate_operand" "v, m")] UNSPEC_VPOPCNTDQ))]
+  "TARGET_AVX512VPOPCNTDQ"
+  "vpopcntq\t{%2, %0%{%1%}%{z%}|%{%1%}%{z%}%0, %2}")
diff --git a/gcc/testsuite/g++.dg/other/i386-2.C
b/gcc/testsuite/g++.dg/other/i386-2.C
index 701051d..ad9fb7c 100644
--- a/gcc/testsuite/g++.dg/other/i386-2.C
+++ b/gcc/testsuite/g++.dg/other/i386-2.C
@@ -1,11 +1,11 @@
 /* { dg-do compile { target i?86-*-* x86_64-*-* } } */
-/* { dg-options "-O -pedantic-errors -march=k8 -msse4a -m3dnow -mavx
-mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2
-mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw
-madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf
-msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt  -mavx512dq
-mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx5124fmaps
-mavx5124vnniw -mclwb -mmwaitx -mclzero -mpku" } */
+/* { dg-options "-O -pedantic-errors -march=k8 -msse4a -m3dnow -mavx
-mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2
-mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw
-madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf
-msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt  -mavx512dq
-mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx5124fmaps
-mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku" } */

 /* Test that {,x,e,p,t,s,w,a,b,i}mmintrin.h, mm3dnow.h, fma4intrin.h,
    xopintrin.h, abmintrin.h, bmiintrin.h, tbmintrin.h, lwpintrin.h,
    popcntintrin.h, fmaintrin.h, pkuintrin.h, avx5124fmapsintrin.h,
-   avx5124vnniwintrin.h and mm_malloc.h.h are usable with
-   -O -pedantic-errors.  */
+   avx5124vnniwintrin.h, avx512vpopcntdqintrin.h and mm_malloc.h.h are usable
+   with -O -pedantic-errors.  */

 #include <x86intrin.h>

diff --git a/gcc/testsuite/g++.dg/other/i386-3.C
b/gcc/testsuite/g++.dg/other/i386-3.C
index cd8f217..084a1bb 100644
--- a/gcc/testsuite/g++.dg/other/i386-3.C
+++ b/gcc/testsuite/g++.dg/other/i386-3.C
@@ -1,10 +1,10 @@
 /* { dg-do compile { target i?86-*-* x86_64-*-* } } */
-/* { dg-options "-O -fkeep-inline-functions -march=k8 -msse4a -m3dnow
-mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi
-mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed
-mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd
-mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt
-mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi
-mavx5124fmaps -mavx5124vnniw -mclwb -mmwaitx -mclzero -mpku" } */
+/* { dg-options "-O -fkeep-inline-functions -march=k8 -msse4a -m3dnow
-mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi
-mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed
-mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd
-mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt
-mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi
-mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx
-mclzero -mpku" } */

 /* Test that {,x,e,p,t,s,w,a,b,i}mmintrin.h, mm3dnow.h, fma4intrin.h,
    xopintrin.h, abmintrin.h, bmiintrin.h, tbmintrin.h, lwpintrin.h,
    popcntintrin.h, fmaintrin.h, pkuintrin.h, avx5124fmapsintrin.h,
-   avx5124vnniwintrin.h and mm_malloc.h are usable with
-   -O -fkeep-inline-functions.  */
+   avx5124vnniwintrin.h, avx512vpopcntdqintrin.h and mm_malloc.h are
+   usable with -O -fkeep-inline-functions.  */

 #include <x86intrin.h>
diff --git a/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntd.c
b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntd.c
new file mode 100644
index 0000000..c55a05a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntd.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512vpopcntdq" } */
+/* { dg-final { scan-assembler-times "vpopcntd\[
\\t\]+\[^\{\n\]*%zmm\[0-9\]+(?:\n|\[ \\t\]+#)"  1 } } */
+/* { dg-final { scan-assembler-times "vpopcntd\[
\\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)"  1 } } */
+/* { dg-final { scan-assembler-times "vpopcntd\[
\\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)"  1 }
} */
+
+#include <x86intrin.h>
+
+extern __m512i z, z1;
+
+int foo ()
+{
+  __mmask16 msk;
+  __m512i c = _mm512_popcnt_epi32 (z);
+  asm volatile ("" : "+v" (c));
+  c = _mm512_mask_popcnt_epi32 (z, msk, z1);
+  asm volatile ("" : "+v" (c));
+  c = _mm512_maskz_popcnt_epi32 (msk, z);
+  asm volatile ("" : "+v" (c));
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntq.c
b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntq.c
new file mode 100644
index 0000000..2698ec3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntq.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512vpopcntdq" } */
+/* { dg-final { scan-assembler-times "vpopcntq\[
\\t\]+\[^\{\n\]*%zmm\[0-9\]+(?:\n|\[ \\t\]+#)"  1 } } */
+/* { dg-final { scan-assembler-times "vpopcntq\[
\\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)"  1 } } */
+/* { dg-final { scan-assembler-times "vpopcntq\[
\\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)"  1 }
} */
+
+#include <x86intrin.h>
+
+extern __m512i z, z1;
+
+int foo ()
+{
+  __mmask8 msk;
+  __m512i c = _mm512_popcnt_epi64 (z);
+  asm volatile ("" : "+v" (c));
+  c = _mm512_mask_popcnt_epi64 (z, msk, z1);
+  asm volatile ("" : "+v" (c));
+  c = _mm512_maskz_popcnt_epi64 (msk, z);
+  asm volatile ("" : "+v" (c));
+}
diff --git a/gcc/testsuite/gcc.target/i386/builtin_target.c
b/gcc/testsuite/gcc.target/i386/builtin_target.c
index c620a74..e50695c 100644
--- a/gcc/testsuite/gcc.target/i386/builtin_target.c
+++ b/gcc/testsuite/gcc.target/i386/builtin_target.c
@@ -217,6 +217,8 @@ check_features (unsigned int ecx, unsigned int edx,
  assert (__builtin_cpu_supports ("avx5124vnniw"));
       if (edx & bit_AVX5124FMAPS)
  assert (__builtin_cpu_supports ("avx5124fmaps"));
+      if (ecx & bit_AVX512VPOPCNTDQ)
+ assert (__builtin_cpu_supports ("avx512vpopcntdq"));
     }
 }

@@ -319,6 +321,8 @@ quick_check ()

   assert (__builtin_cpu_supports ("avx5124fmaps") >= 0);

+  assert (__builtin_cpu_supports ("avx512vpopcntdq") >= 0);
+
   /* Check CPU type.  */
   assert (__builtin_cpu_is ("amd") >= 0);

diff --git a/gcc/testsuite/gcc.target/i386/funcspec-56.inc
b/gcc/testsuite/gcc.target/i386/funcspec-56.inc
index 9334e9e..c999080 100644
--- a/gcc/testsuite/gcc.target/i386/funcspec-56.inc
+++ b/gcc/testsuite/gcc.target/i386/funcspec-56.inc
@@ -30,6 +30,7 @@ extern void test_avx512pf(void)
__attribute__((__target__("avx512pf")));
 extern void test_avx512cd(void) __attribute__((__target__("avx512cd")));
 extern void test_avx5124fmaps(void)
__attribute__((__target__("avx5124fmaps")));
 extern void test_avx5124vnniw(void)
__attribute__((__target__("avx5124vnniw")));
+extern void test_avx512vpopcntdq(void)
__attribute__((__target__("avx512vpopcntdq")));
 extern void test_bmi (void) __attribute__((__target__("bmi")));
 extern void test_bmi2 (void) __attribute__((__target__("bmi2")));

@@ -63,6 +64,7 @@ extern void test_bo_avx512pf(void)
__attribute__((__target__("no-avx512pf")));
 extern void test_no_avx512cd(void) __attribute__((__target__("no-avx512cd")));
 extern void test_no_avx5124fmaps(void)
__attribute__((__target__("no-avx5124fmaps")));
 extern void test_no_avx5124vnniw(void)
__attribute__((__target__("no-avx5124vnniw")));
+extern void test_no_avx512vpopcntdq(void)
__attribute__((__target__("no-avx512vpopcntdq")));
 extern void test_no_bmi (void) __attribute__((__target__("no-bmi")));
 extern void test_no_bmi2 (void) __attribute__((__target__("no-bmi2")));

diff --git a/gcc/testsuite/gcc.target/i386/sse-12.c
b/gcc/testsuite/gcc.target/i386/sse-12.c
index 3e8417b..19ff785 100644
--- a/gcc/testsuite/gcc.target/i386/sse-12.c
+++ b/gcc/testsuite/gcc.target/i386/sse-12.c
@@ -3,7 +3,7 @@
    popcntintrin.h and mm_malloc.h are usable
    with -O -std=c89 -pedantic-errors.  */
 /* { dg-do compile } */
-/* { dg-options "-O -std=c89 -pedantic-errors -march=k8 -msse4a
-m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm
-mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm
-mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er
-mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves
-mclflushopt -mavx512bw -mavx512dq -mavx512vl -mavx512vbmi
-mavx512ifma -mavx5124fmaps -mavx5124vnniw -mclwb -mmwaitx -mclzero
-mpku" } */
+/* { dg-options "-O -std=c89 -pedantic-errors -march=k8 -msse4a
-m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm
-mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm
-mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er
-mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves
-mclflushopt -mavx512bw -mavx512dq -mavx512vl -mavx512vbmi
-mavx512ifma -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb
-mmwaitx -mclzero -mpku" } */

 #include <x86intrin.h>

diff --git a/gcc/testsuite/gcc.target/i386/sse-13.c
b/gcc/testsuite/gcc.target/i386/sse-13.c
index 67f3b93..350e2ed 100644
--- a/gcc/testsuite/gcc.target/i386/sse-13.c
+++ b/gcc/testsuite/gcc.target/i386/sse-13.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8
-msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt
-mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma
-mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er
-mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves
-mclflushopt -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi
-mavx512ifma -mavx5124fmaps -mavx5124vnniw -mclwb -mmwaitx -mclzero
-mpku" } */
+/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8
-msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt
-mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma
-mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er
-mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves
-mclflushopt -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi
-mavx512ifma -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb
-mmwaitx -mclzero -mpku" } */
 /* { dg-add-options bind_pic_locally } */

 #include <mm_malloc.h>
diff --git a/gcc/testsuite/gcc.target/i386/sse-22.c
b/gcc/testsuite/gcc.target/i386/sse-22.c
index 44d48fd..85f9119 100644
--- a/gcc/testsuite/gcc.target/i386/sse-22.c
+++ b/gcc/testsuite/gcc.target/i386/sse-22.c
@@ -9,8 +9,8 @@
    are defined as inline functions in {,x,e,p,t,s,w,a,b,i}mmintrin.h,
    mm3dnow.h, fma4intrin.h, xopintrin.h, abmintrin.h, bmiintrin.h,
    tbmintrin.h, lwpintrin.h, popcntintrin.h, fmaintrin.h,
-   avx5124fmapsintrin.h, avx5124vnniwintrin.h and mm_malloc.h
-   that reference the proper builtin functions.
+   avx5124fmapsintrin.h, avx5124vnniwintrin.h, avx512vpopcntdqintrin.h and
+   mm_malloc.h that reference the proper builtin functions.

    Defining away "extern" and "__inline" results in all of them being
    compiled as proper functions.  */
@@ -101,7 +101,7 @@


 #ifndef DIFFERENT_PRAGMAS
-#pragma GCC target
("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,avx512vl,avx512bw,avx512dq,avx512vbmi,avx512ifma,avx5124fmaps,avx5124vnniw")
+#pragma GCC target
("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,avx512vl,avx512bw,avx512dq,avx512vbmi,avx512ifma,avx5124fmaps,avx5124vnniw,avx512vpopcntdq")
 #endif

 /* Following intrinsics require immediate arguments.  They
@@ -218,7 +218,7 @@ test_4 (_mm_cmpestrz, int, __m128i, int, __m128i, int, 1)

 /* immintrin.h (AVX/AVX2/RDRND/FSGSBASE/F16C/RTM/AVX512F/SHA) */
 #ifdef DIFFERENT_PRAGMAS
-#pragma GCC target
("avx,avx2,rdrnd,fsgsbase,f16c,rtm,avx512f,avx512er,avx512cd,avx512pf,sha,avx512vl,avx512bw,avx512dq,avx512ifma,avx512vbmi,avx5124fmaps,avx5124vnniw")
+#pragma GCC target
("avx,avx2,rdrnd,fsgsbase,f16c,rtm,avx512f,avx512er,avx512cd,avx512pf,sha,avx512vl,avx512bw,avx512dq,avx512ifma,avx512vbmi,avx5124fmaps,avx5124vnniw,avx512vpopcntdq")
 #endif
 #include <immintrin.h>
 test_1 (_cvtss_sh, unsigned short, float, 1)
diff --git a/gcc/testsuite/gcc.target/i386/sse-23.c
b/gcc/testsuite/gcc.target/i386/sse-23.c
index 61f1b00..3fc1f75 100644
--- a/gcc/testsuite/gcc.target/i386/sse-23.c
+++ b/gcc/testsuite/gcc.target/i386/sse-23.c
@@ -8,8 +8,8 @@
    are defined as inline functions in {,x,e,p,t,s,w,a,b,i}mmintrin.h,
    mm3dnow.h, fma4intrin.h, xopintrin.h, abmintrin.h, bmiintrin.h,
    tbmintrin.h, lwpintrin.h, popcntintrin.h, fmaintrin.h,
-   avx5124fmapsintrin.h, avx5124vnniwintrin.h and mm_malloc.h
-   that reference the proper builtin functions.
+   avx5124fmapsintrin.h, avx5124vnniwintrin.h, avx512vpopcntdqintrin.h
+   and mm_malloc.h that reference the proper builtin functions.

    Defining away "extern" and "__inline" results in all of them being
    compiled as proper functions.  */
@@ -595,6 +595,6 @@
 #define __builtin_ia32_extracti64x2_256_mask(A, E, C, D)
__builtin_ia32_extracti64x2_256_mask(A, 1, C, D)
 #define __builtin_ia32_extractf64x2_256_mask(A, E, C, D)
__builtin_ia32_extractf64x2_256_mask(A, 1, C, D)

-#pragma GCC target
("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,fma,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,xsavec,xsaves,clflushopt,avx512bw,avx512dq,avx512vl,avx512vbmi,avx512ifma,avx5124fmaps,avx5124vnniw,clwb,mwaitx,clzero,pku")
+#pragma GCC target
("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,fma,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,xsavec,xsaves,clflushopt,avx512bw,avx512dq,avx512vl,avx512vbmi,avx512ifma,avx5124fmaps,avx5124vnniw,avx512vpopcntdq,clwb,mwaitx,clzero,pku")

 #include <x86intrin.h>
diff --git a/libgcc/config/i386/cpuinfo.c b/libgcc/config/i386/cpuinfo.c
index 9f30cb8..93b9307 100644
--- a/libgcc/config/i386/cpuinfo.c
+++ b/libgcc/config/i386/cpuinfo.c
@@ -277,6 +277,8 @@ get_available_features (unsigned int ecx, unsigned int edx,
  features |= (1 << FEATURE_AVX5124VNNIW);
       if (edx & bit_AVX5124FMAPS)
  features |= (1 << FEATURE_AVX5124FMAPS);
+      if (ecx & bit_AVX512VPOPCNTDQ)
+ features |= (1 << FEATURE_AVX512VPOPCNTDQ);
     }

   unsigned int ext_level;
diff --git a/libgcc/config/i386/cpuinfo.h b/libgcc/config/i386/cpuinfo.h
index cf848e6..49d0909 100644
--- a/libgcc/config/i386/cpuinfo.h
+++ b/libgcc/config/i386/cpuinfo.h
@@ -104,7 +104,8 @@ enum processor_features
   FEATURE_AVX512VBMI,
   FEATURE_AVX512IFMA,
   FEATURE_AVX5124VNNIW,
-  FEATURE_AVX5124FMAPS
+  FEATURE_AVX5124FMAPS,
+  FEATURE_AVX512VPOPCNTDQ
 };

 extern struct __processor_model


Is this patch Ok?


--
WBR,
Andrew

[-- Attachment #2: avx512vpopcntdq.patch --]
[-- Type: application/octet-stream, Size: 33520 bytes --]

diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index 98224f5..a425af5 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -78,6 +78,7 @@ along with GCC; see the file COPYING3.  If not see
   (OPTION_MASK_ISA_AVX512VBMI | OPTION_MASK_ISA_AVX512BW_SET)
 #define OPTION_MASK_ISA_AVX5124FMAPS_SET OPTION_MASK_ISA_AVX5124FMAPS
 #define OPTION_MASK_ISA_AVX5124VNNIW_SET OPTION_MASK_ISA_AVX5124VNNIW
+#define OPTION_MASK_ISA_AVX512VPOPCNTDQ_SET OPTION_MASK_ISA_AVX512VPOPCNTDQ
 #define OPTION_MASK_ISA_RTM_SET OPTION_MASK_ISA_RTM
 #define OPTION_MASK_ISA_PRFCHW_SET OPTION_MASK_ISA_PRFCHW
 #define OPTION_MASK_ISA_RDSEED_SET OPTION_MASK_ISA_RDSEED
@@ -183,6 +184,7 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA_AVX512VBMI_UNSET OPTION_MASK_ISA_AVX512VBMI
 #define OPTION_MASK_ISA_AVX5124FMAPS_UNSET OPTION_MASK_ISA_AVX5124FMAPS
 #define OPTION_MASK_ISA_AVX5124VNNIW_UNSET OPTION_MASK_ISA_AVX5124VNNIW
+#define OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET OPTION_MASK_ISA_AVX512VPOPCNTDQ
 #define OPTION_MASK_ISA_RTM_UNSET OPTION_MASK_ISA_RTM
 #define OPTION_MASK_ISA_PRFCHW_UNSET OPTION_MASK_ISA_PRFCHW
 #define OPTION_MASK_ISA_RDSEED_UNSET OPTION_MASK_ISA_RDSEED
@@ -409,6 +411,8 @@ ix86_handle_option (struct gcc_options *opts,
 	  opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_AVX5124FMAPS_UNSET;
 	  opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA_AVX5124VNNIW_UNSET;
 	  opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_AVX5124VNNIW_UNSET;
+	  opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET;
+	  opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET;
 	}
       return true;
 
@@ -481,6 +485,21 @@ ix86_handle_option (struct gcc_options *opts,
 	}
       return true;
 
+    case OPT_mavx512vpopcntdq:
+      if (value)
+	{
+	  opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA_AVX512VPOPCNTDQ_SET;
+	  opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_AVX512VPOPCNTDQ_SET;
+	  opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AVX512F_SET;
+	  opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512F_SET;
+	}
+      else
+	{
+	  opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET;
+	  opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET;
+	}
+      return true;
+
     case OPT_mavx512dq:
       if (value)
 	{
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 7afbc54..f9e9399 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -375,7 +375,8 @@ i[34567]86-*-*)
 		       avx512vlintrin.h avx512vlbwintrin.h avx512vldqintrin.h
 		       avx512ifmaintrin.h avx512ifmavlintrin.h avx512vbmiintrin.h
 		       avx512vbmivlintrin.h avx5124fmapsintrin.h avx5124vnniwintrin.h
-		       clwbintrin.h mwaitxintrin.h clzerointrin.h pkuintrin.h"
+		       avx512vpopcntdqintrin.h clwbintrin.h mwaitxintrin.h
+		       clzerointrin.h pkuintrin.h"
 	;;
 x86_64-*-*)
 	cpu_type=i386
@@ -397,7 +398,8 @@ x86_64-*-*)
 		       avx512vlintrin.h avx512vlbwintrin.h avx512vldqintrin.h
 		       avx512ifmaintrin.h avx512ifmavlintrin.h avx512vbmiintrin.h
 		       avx512vbmivlintrin.h avx5124fmapsintrin.h avx5124vnniwintrin.h
-		       clwbintrin.h mwaitxintrin.h clzerointrin.h pkuintrin.h"
+		       avx512vpopcntdqintrin.h clwbintrin.h mwaitxintrin.h
+		       clzerointrin.h pkuintrin.h"
 	;;
 ia64-*-*)
 	extra_headers=ia64intrin.h
diff --git a/gcc/config/i386/avx512vpopcntdqintrin.h b/gcc/config/i386/avx512vpopcntdqintrin.h
new file mode 100644
index 0000000..28305f6
--- /dev/null
+++ b/gcc/config/i386/avx512vpopcntdqintrin.h
@@ -0,0 +1,90 @@
+/* Copyright (C) 2016 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#if !defined _IMMINTRIN_H_INCLUDED
+# error "Never use <avx512vpopcntdqintrin.h> directly; include <x86intrin.h> instead."
+#endif
+
+#ifndef _AVX512VPOPCNTDQINTRIN_H_INCLUDED
+#define _AVX512VPOPCNTDQINTRIN_H_INCLUDED
+
+#ifndef __AVX512VPOPCNTDQ__
+#pragma GCC push_options
+#pragma GCC target("avx512vpopcntdq")
+#define __DISABLE_AVX512VPOPCNTDQ__
+#endif /* __AVX512VPOPCNTDQ__ */
+
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_popcnt_epi32 (__m512i __A)
+{
+  return (__m512i) __builtin_ia32_vpopcountd_v16si ((__v16si) __A);
+}
+
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_popcnt_epi32 (__m512i __A, __mmask16 __U, __m512i __B)
+{
+  return (__m512i) __builtin_ia32_vpopcountd_v16si_mask ((__v16si) __A,
+							 (__mmask16) __U,
+							 (__v16si) __B);
+}
+
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_maskz_popcnt_epi32 (__mmask16 __U, __m512i __A)
+{
+  return (__m512i) __builtin_ia32_vpopcountd_v16si_maskz ((__mmask16) __U,
+							  (__v16si) __A);
+}
+
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_popcnt_epi64 (__m512i __A)
+{
+  return (__m512i) __builtin_ia32_vpopcountq_v8di ((__v8di) __A);
+}
+
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_popcnt_epi64 (__m512i __A, __mmask8 __U, __m512i __B)
+{
+  return (__m512i) __builtin_ia32_vpopcountq_v8di_mask ((__v8di) __A,
+							(__mmask8) __U,
+							(__v8di) __B);
+}
+
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_maskz_popcnt_epi64 (__mmask8 __U, __m512i __A)
+{
+  return (__m512i) __builtin_ia32_vpopcountq_v8di_maskz ((__mmask8) __U,
+							 (__v8di) __A);
+}
+
+#ifdef __DISABLE_AVX512VPOPCNTDQ__
+#undef __DISABLE_AVX512VPOPCNTDQ__
+#pragma GCC pop_options
+#endif /* __DISABLE_AVX512VPOPCNTDQ__ */
+
+#endif /* _AVX512VPOPCNTDQINTRIN_H_INCLUDED */
diff --git a/gcc/config/i386/cpuid.h b/gcc/config/i386/cpuid.h
index abe7c62..d094b78 100644
--- a/gcc/config/i386/cpuid.h
+++ b/gcc/config/i386/cpuid.h
@@ -54,6 +54,7 @@
 #define bit_SSE4a	(1 << 6)
 #define bit_PRFCHW	(1 << 8)
 #define bit_XOP         (1 << 11)
+#define bit_AVX512VPOPCNTDQ	(1 << 14)
 #define bit_LWP 	(1 << 15)
 #define bit_FMA4        (1 << 16)
 #define bit_TBM         (1 << 21)
diff --git a/gcc/config/i386/i386-builtin-types.def b/gcc/config/i386/i386-builtin-types.def
index 6e938eb..6b3ced9 100644
--- a/gcc/config/i386/i386-builtin-types.def
+++ b/gcc/config/i386/i386-builtin-types.def
@@ -305,9 +305,15 @@ DEF_FUNCTION_TYPE (V8DF, V2DF)
 DEF_FUNCTION_TYPE (V16SI, V4SI)
 DEF_FUNCTION_TYPE (V16SI, V8SI)
 DEF_FUNCTION_TYPE (V16SI, V16SF)
+DEF_FUNCTION_TYPE (V16SI, V16SI)
+DEF_FUNCTION_TYPE (V16SI, UHI, V16SI)
+DEF_FUNCTION_TYPE (V16SI, V16SI, UHI, V16SI)
 DEF_FUNCTION_TYPE (V16SI, V16SI, V16SI, UHI)
 DEF_FUNCTION_TYPE (V8DI, V8DI, V8DI, UQI)
 DEF_FUNCTION_TYPE (V8DI, PV8DI)
+DEF_FUNCTION_TYPE (V8DI, V8DI)
+DEF_FUNCTION_TYPE (V8DI, UQI, V8DI)
+DEF_FUNCTION_TYPE (V8DI, V8DI, UQI, V8DI)
 
 DEF_FUNCTION_TYPE (DI, V2DI, INT)
 DEF_FUNCTION_TYPE (DOUBLE, V2DF, INT)
@@ -486,6 +492,7 @@ DEF_FUNCTION_TYPE (V16SI, V16SI, INT)
 DEF_FUNCTION_TYPE (V16SI, V16SI, V4SI, V16SI, UHI)
 DEF_FUNCTION_TYPE (V16SI, V16SI, INT, V16SI, UHI)
 DEF_FUNCTION_TYPE (V8SI, PCV8SI, V8SI)
+DEF_FUNCTION_TYPE (V4DI, V4DI)
 DEF_FUNCTION_TYPE (V4DI, V4DI, V4DI)
 DEF_FUNCTION_TYPE (V16SI, V8DF, V8DF)
 DEF_FUNCTION_TYPE (V8DI, V8DI, V8DI, INT, V8DI, UQI)
diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
index 7d86008..2e58a26 100644
--- a/gcc/config/i386/i386-builtin.def
+++ b/gcc/config/i386/i386-builtin.def
@@ -2527,6 +2527,12 @@ BDESC (OPTION_MASK_ISA_AVX5124VNNIW, CODE_FOR_avx5124vnniw_vp4dpwssd, "__builtin
 BDESC (OPTION_MASK_ISA_AVX5124VNNIW, CODE_FOR_avx5124vnniw_vp4dpwssd_mask, "__builtin_ia32_vp4dpwssd_mask", IX86_BUILTIN_4DPWSSD_MASK, UNKNOWN, (int) V16SI_FTYPE_V16SI_V16SI_V16SI_V16SI_V16SI_PCV4SI_V16SI_UHI)
 BDESC (OPTION_MASK_ISA_AVX5124VNNIW, CODE_FOR_avx5124vnniw_vp4dpwssds, "__builtin_ia32_vp4dpwssds", IX86_BUILTIN_4DPWSSDS, UNKNOWN, (int) V16SI_FTYPE_V16SI_V16SI_V16SI_V16SI_V16SI_PCV4SI)
 BDESC (OPTION_MASK_ISA_AVX5124VNNIW, CODE_FOR_avx5124vnniw_vp4dpwssds_mask, "__builtin_ia32_vp4dpwssds_mask", IX86_BUILTIN_4DPWSSDS_MASK, UNKNOWN, (int) V16SI_FTYPE_V16SI_V16SI_V16SI_V16SI_V16SI_PCV4SI_V16SI_UHI)
+BDESC (OPTION_MASK_ISA_AVX512VPOPCNTDQ, CODE_FOR_vpopcountv16si, "__builtin_ia32_vpopcountd_v16si", IX86_BUILTIN_VPOPCOUNTDV16SI, UNKNOWN, (int) V16SI_FTYPE_V16SI)
+BDESC (OPTION_MASK_ISA_AVX512VPOPCNTDQ, CODE_FOR_vpopcountv16si_mask, "__builtin_ia32_vpopcountd_v16si_mask", IX86_BUILTIN_VPOPCOUNTDV16SI_MASK, UNKNOWN, (int) V16SI_FTYPE_V16SI_UHI_V16SI)
+BDESC (OPTION_MASK_ISA_AVX512VPOPCNTDQ, CODE_FOR_vpopcountv16si_maskz, "__builtin_ia32_vpopcountd_v16si_maskz", IX86_BUILTIN_VPOPCOUNTDV16SI_MASKZ, UNKNOWN, (int) V16SI_FTYPE_UHI_V16SI)
+BDESC (OPTION_MASK_ISA_AVX512VPOPCNTDQ, CODE_FOR_vpopcountv8di, "__builtin_ia32_vpopcountq_v8di", IX86_BUILTIN_VPOPCOUNTQV8DI, UNKNOWN, (int) V8DI_FTYPE_V8DI)
+BDESC (OPTION_MASK_ISA_AVX512VPOPCNTDQ, CODE_FOR_vpopcountv8di_mask, "__builtin_ia32_vpopcountq_v8di_mask", IX86_BUILTIN_VPOPCOUNTQV8DI_MASK, UNKNOWN, (int) V8DI_FTYPE_V8DI_UQI_V8DI)
+BDESC (OPTION_MASK_ISA_AVX512VPOPCNTDQ, CODE_FOR_vpopcountv8di_maskz, "__builtin_ia32_vpopcountq_v8di_maskz", IX86_BUILTIN_VPOPCOUNTQV8DI_MASKZ, UNKNOWN, (int) V8DI_FTYPE_UQI_V8DI)
 
 BDESC_END (ARGS2, MPX)
 
diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c
index 6e56c83..8a91e39 100644
--- a/gcc/config/i386/i386-c.c
+++ b/gcc/config/i386/i386-c.c
@@ -380,6 +380,8 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
     def_or_undef (parse_in, "__AVX5124VNNIW__");
   if (isa_flag2 & OPTION_MASK_ISA_AVX5124FMAPS)
     def_or_undef (parse_in, "__AVX5124FMAPS__");
+  if (isa_flag2 & OPTION_MASK_ISA_AVX512VPOPCNTDQ)
+    def_or_undef (parse_in, "__AVX512VPOPCNTDQ__");
   if (isa_flag & OPTION_MASK_ISA_FMA)
     def_or_undef (parse_in, "__FMA__");
   if (isa_flag & OPTION_MASK_ISA_RTM)
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 792e8ec..164b911 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -4320,6 +4320,7 @@ ix86_target_string (HOST_WIDE_INT isa, HOST_WIDE_INT isa2, int flags,
   {
     { "-mavx5124vnniw", OPTION_MASK_ISA_AVX5124VNNIW },
     { "-mavx5124fmaps", OPTION_MASK_ISA_AVX5124FMAPS },
+    { "-mavx512vpopcntdq", OPTION_MASK_ISA_AVX512VPOPCNTDQ },
   };
   /* Flag options.  */
   static struct ix86_target_opts flag_opts[] =
@@ -4919,6 +4920,7 @@ ix86_option_override_internal (bool main_args_p,
 #define PTA_PKU		(HOST_WIDE_INT_1 << 59)
 #define PTA_AVX5124VNNIW	(HOST_WIDE_INT_1 << 60)
 #define PTA_AVX5124FMAPS	(HOST_WIDE_INT_1 << 61)
+#define PTA_AVX512VPOPCNTDQ	(HOST_WIDE_INT_1 << 62)
 
 #define PTA_CORE2 \
   (PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSSE3 \
@@ -5581,6 +5583,9 @@ ix86_option_override_internal (bool main_args_p,
 	if (processor_alias_table[i].flags & PTA_AVX5124FMAPS
 	    && !(opts->x_ix86_isa_flags2_explicit & OPTION_MASK_ISA_AVX5124FMAPS))
 	  opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA_AVX5124FMAPS;
+	if (processor_alias_table[i].flags & PTA_AVX512VPOPCNTDQ
+	    && !(opts->x_ix86_isa_flags2_explicit & OPTION_MASK_ISA_AVX512VPOPCNTDQ))
+	  opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA_AVX512VPOPCNTDQ;
 
 	if (processor_alias_table[i].flags & (PTA_PREFETCH_SSE | PTA_SSE))
 	  x86_prefetch_sse = true;
@@ -6625,6 +6630,7 @@ ix86_valid_target_attribute_inner_p (tree args, char *p_strings[],
     IX86_ATTR_ISA ("avx512vl",	OPT_mavx512vl),
     IX86_ATTR_ISA ("avx5124fmaps",	OPT_mavx5124fmaps),
     IX86_ATTR_ISA ("avx5124vnniw",	OPT_mavx5124vnniw),
+    IX86_ATTR_ISA ("avx512vpopcntdq",	OPT_mavx512vpopcntdq),
     IX86_ATTR_ISA ("mmx",	OPT_mmmx),
     IX86_ATTR_ISA ("pclmul",	OPT_mpclmul),
     IX86_ATTR_ISA ("popcnt",	OPT_mpopcnt),
@@ -33300,6 +33306,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
     F_AVX512IFMA,
     F_AVX5124VNNIW,
     F_AVX5124FMAPS,
+    F_AVX512VPOPCNTDQ,
     F_MAX
   };
 
@@ -33414,6 +33421,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
       {"avx512ifma",F_AVX512IFMA},
       {"avx5124vnniw",F_AVX5124VNNIW},
       {"avx5124fmaps",F_AVX5124FMAPS},
+      {"avx512vpopcntdq",F_AVX512VPOPCNTDQ},
     };
 
   tree __processor_model_type = build_processor_model_struct ();
@@ -34885,14 +34893,17 @@ ix86_expand_args_builtin (const struct builtin_description *d,
     case V16SI_FTYPE_UHI:
     case V2DI_FTYPE_UQI:
     case V4DI_FTYPE_UQI:
+    case V4DI_FTYPE_V4DI:
     case V16SI_FTYPE_INT:
     case V16SF_FTYPE_V8SF:
     case V16SI_FTYPE_V8SI:
     case V16SF_FTYPE_V4SF:
     case V16SI_FTYPE_V4SI:
     case V16SI_FTYPE_V16SF:
+    case V16SI_FTYPE_V16SI:
     case V16SF_FTYPE_V16SF:
     case V8DI_FTYPE_UQI:
+    case V8DI_FTYPE_V8DI:
     case V8DF_FTYPE_V4DF:
     case V8DF_FTYPE_V2DF:
     case V8DF_FTYPE_V8DF:
@@ -34997,7 +35008,9 @@ ix86_expand_args_builtin (const struct builtin_description *d,
     case UHI_FTYPE_UHI_UHI:
     case USI_FTYPE_USI_USI:
     case UDI_FTYPE_UDI_UDI:
+    case V8DI_FTYPE_UQI_V8DI:
     case V16SI_FTYPE_V8DF_V8DF:
+    case V16SI_FTYPE_UHI_V16SI:
       nargs = 2;
       break;
     case V2DI_FTYPE_V2DI_INT_CONVERT:
@@ -35203,6 +35216,11 @@ ix86_expand_args_builtin (const struct builtin_description *d,
       nargs = 3;
       nargs_constant = 1;
       break;
+    case V8DI_FTYPE_V8DI_UQI_V8DI:
+    case V16SI_FTYPE_V16SI_UHI_V16SI:
+      nargs = 3;
+      mask_pos = 1;
+      break;
     case V4DI_FTYPE_V4DI_V4DI_INT_CONVERT:
       nargs = 3;
       rmode = V4DImode;
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 5f5368d..748de25 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -85,6 +85,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define TARGET_AVX5124FMAPS_P(x) TARGET_ISA_AVX5124FMAPS_P(x)
 #define TARGET_AVX5124VNNIW	TARGET_ISA_AVX5124VNNIW
 #define TARGET_AVX5124VNNIW_P(x) TARGET_ISA_AVX5124VNNIW_P(x)
+#define TARGET_AVX512VPOPCNTDQ	TARGET_ISA_AVX512VPOPCNTDQ
+#define TARGET_AVX512VPOPCNTDQ_P(x) TARGET_ISA_AVX512VPOPCNTDQ_P(x)
 #define TARGET_FMA	TARGET_ISA_FMA
 #define TARGET_FMA_P(x)	TARGET_ISA_FMA_P(x)
 #define TARGET_SSE4A	TARGET_ISA_SSE4A
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index 390412a..b914287 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -705,6 +705,10 @@ mavx5124vnniw
 Target Report Mask(ISA_AVX5124VNNIW) Var(ix86_isa_flags2) Save
 Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX512F and AVX5124VNNIW built-in functions and code generation.
 
+mavx512vpopcntdq
+Target Report Mask(ISA_AVX512VPOPCNTDQ) Var(ix86_isa_flags2) Save
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX512F and AVX512VPOPCNTDQ built-in functions and code generation.
+
 mfma
 Target Report Mask(ISA_FMA) Var(ix86_isa_flags) Save
 Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX and FMA built-in functions and code generation.
diff --git a/gcc/config/i386/immintrin.h b/gcc/config/i386/immintrin.h
index 3fd3c9c..0692580 100644
--- a/gcc/config/i386/immintrin.h
+++ b/gcc/config/i386/immintrin.h
@@ -72,6 +72,8 @@
 
 #include <avx5124vnniwintrin.h>
 
+#include <avx512vpopcntdqintrin.h>
+
 #include <shaintrin.h>
 
 #include <lzcntintrin.h>
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 4c9bdec..6b2a638 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -155,6 +155,9 @@
   UNSPEC_VP4FNMADD
   UNSPEC_VP4DPWSSD
   UNSPEC_VP4DPWSSDS
+
+  ;; For VPOPCOUNTDQ support
+  UNSPEC_VPOPCNTDQ
 ])
 
 (define_c_enum "unspecv" [
@@ -265,6 +268,9 @@
 (define_mode_iterator VF_512
   [V16SF V8DF])
 
+(define_mode_iterator VI_AVX512F
+  [V16SI V8DI])
+
 (define_mode_iterator VI48_AVX512VL
   [V16SI (V8SI  "TARGET_AVX512VL") (V4SI  "TARGET_AVX512VL")
    V8DI  (V4DI  "TARGET_AVX512VL") (V2DI  "TARGET_AVX512VL")])
@@ -19881,3 +19887,44 @@
    [(set_attr ("type") ("ssemuladd"))
     (set_attr ("prefix") ("evex"))
     (set_attr ("mode") ("TI"))])
+
+(define_insn "vpopcount<mode>"
+  [(set (match_operand:VI_AVX512F 0 "register_operand" "=v, v")
+	(popcount:VI_AVX512F
+	  (match_operand:VI_AVX512F 1 "nonimmediate_operand" "v, m")))]
+  "TARGET_AVX512VPOPCNTDQ"
+  "vpopcnt<ssemodesuffix>\t{%1, %0|%0, %1}")
+
+(define_insn "vpopcountv16si_mask"
+  [(set (match_operand:V16SI 0 "register_operand" "=v, v")
+	(unspec:V16SI
+	  [(match_operand:V16SI 1 "nonimmediate_operand" "v, m")
+	   (match_operand:HI 2 "register_operand" "Yk, Yk")
+	   (match_operand:V16SI 3 "nonimmediate_operand" "0, 0")] UNSPEC_VPOPCNTDQ))]
+  "TARGET_AVX512VPOPCNTDQ"
+  "vpopcntd\t{%1, %0%{%2%}|%{%2%}%0, %1}")
+
+(define_insn "vpopcountv16si_maskz"
+  [(set (match_operand:V16SI 0 "register_operand" "=v, v")
+	(unspec:V16SI
+	  [(match_operand:HI 1 "register_operand" "Yk, Yk")
+	   (match_operand:V16SI 2 "nonimmediate_operand" "v, m")] UNSPEC_VPOPCNTDQ))]
+  "TARGET_AVX512VPOPCNTDQ"
+  "vpopcntd\t{%2, %0%{%1%}%{z%}|%{%1%}%{z%}%0, %2}")
+
+(define_insn "vpopcountv8di_mask"
+  [(set (match_operand:V8DI 0 "register_operand" "=v, v")
+	(unspec:V8DI
+	  [(match_operand:V8DI 1 "nonimmediate_operand" "v, m")
+	   (match_operand:QI 2 "register_operand" "Yk, Yk")
+	   (match_operand:V8DI 3 "nonimmediate_operand" "0, 0")] UNSPEC_VPOPCNTDQ))]
+  "TARGET_AVX512VPOPCNTDQ"
+  "vpopcntq\t{%1, %0%{%2%}|%{%2%}%0, %1}")
+
+(define_insn "vpopcountv8di_maskz"
+  [(set (match_operand:V8DI 0 "register_operand" "=v, v")
+	(unspec:V8DI
+	  [(match_operand:QI 1 "register_operand" "Yk, Yk")
+	   (match_operand:V8DI 2 "nonimmediate_operand" "v, m")] UNSPEC_VPOPCNTDQ))]
+  "TARGET_AVX512VPOPCNTDQ"
+  "vpopcntq\t{%2, %0%{%1%}%{z%}|%{%1%}%{z%}%0, %2}")
diff --git a/gcc/testsuite/g++.dg/other/i386-2.C b/gcc/testsuite/g++.dg/other/i386-2.C
index 701051d..ad9fb7c 100644
--- a/gcc/testsuite/g++.dg/other/i386-2.C
+++ b/gcc/testsuite/g++.dg/other/i386-2.C
@@ -1,11 +1,11 @@
 /* { dg-do compile { target i?86-*-* x86_64-*-* } } */
-/* { dg-options "-O -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt  -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx5124fmaps -mavx5124vnniw -mclwb -mmwaitx -mclzero -mpku" } */
+/* { dg-options "-O -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt  -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku" } */
 
 /* Test that {,x,e,p,t,s,w,a,b,i}mmintrin.h, mm3dnow.h, fma4intrin.h,
    xopintrin.h, abmintrin.h, bmiintrin.h, tbmintrin.h, lwpintrin.h,
    popcntintrin.h, fmaintrin.h, pkuintrin.h, avx5124fmapsintrin.h,
-   avx5124vnniwintrin.h and mm_malloc.h.h are usable with
-   -O -pedantic-errors.  */
+   avx5124vnniwintrin.h, avx512vpopcntdqintrin.h and mm_malloc.h.h are usable
+   with -O -pedantic-errors.  */
 
 #include <x86intrin.h>
 
diff --git a/gcc/testsuite/g++.dg/other/i386-3.C b/gcc/testsuite/g++.dg/other/i386-3.C
index cd8f217..084a1bb 100644
--- a/gcc/testsuite/g++.dg/other/i386-3.C
+++ b/gcc/testsuite/g++.dg/other/i386-3.C
@@ -1,10 +1,10 @@
 /* { dg-do compile { target i?86-*-* x86_64-*-* } } */
-/* { dg-options "-O -fkeep-inline-functions -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx5124fmaps -mavx5124vnniw -mclwb -mmwaitx -mclzero -mpku" } */
+/* { dg-options "-O -fkeep-inline-functions -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku" } */
 
 /* Test that {,x,e,p,t,s,w,a,b,i}mmintrin.h, mm3dnow.h, fma4intrin.h,
    xopintrin.h, abmintrin.h, bmiintrin.h, tbmintrin.h, lwpintrin.h,
    popcntintrin.h, fmaintrin.h, pkuintrin.h, avx5124fmapsintrin.h,
-   avx5124vnniwintrin.h and mm_malloc.h are usable with
-   -O -fkeep-inline-functions.  */
+   avx5124vnniwintrin.h, avx512vpopcntdqintrin.h and mm_malloc.h are
+   usable with -O -fkeep-inline-functions.  */
 
 #include <x86intrin.h>
diff --git a/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntd.c b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntd.c
new file mode 100644
index 0000000..c55a05a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntd.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512vpopcntdq" } */
+/* { dg-final { scan-assembler-times "vpopcntd\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+(?:\n|\[ \\t\]+#)"  1 } } */
+/* { dg-final { scan-assembler-times "vpopcntd\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)"  1 } } */
+/* { dg-final { scan-assembler-times "vpopcntd\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)"  1 } } */
+
+#include <x86intrin.h>
+
+extern __m512i z, z1;
+
+int foo ()
+{
+  __mmask16 msk;
+  __m512i c = _mm512_popcnt_epi32 (z);
+  asm volatile ("" : "+v" (c));
+  c = _mm512_mask_popcnt_epi32 (z, msk, z1);
+  asm volatile ("" : "+v" (c));
+  c = _mm512_maskz_popcnt_epi32 (msk, z);
+  asm volatile ("" : "+v" (c));
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntq.c b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntq.c
new file mode 100644
index 0000000..2698ec3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntq.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512vpopcntdq" } */
+/* { dg-final { scan-assembler-times "vpopcntq\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+(?:\n|\[ \\t\]+#)"  1 } } */
+/* { dg-final { scan-assembler-times "vpopcntq\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)"  1 } } */
+/* { dg-final { scan-assembler-times "vpopcntq\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)"  1 } } */
+
+#include <x86intrin.h>
+
+extern __m512i z, z1;
+
+int foo ()
+{
+  __mmask8 msk; 
+  __m512i c = _mm512_popcnt_epi64 (z);
+  asm volatile ("" : "+v" (c));
+  c = _mm512_mask_popcnt_epi64 (z, msk, z1);
+  asm volatile ("" : "+v" (c));
+  c = _mm512_maskz_popcnt_epi64 (msk, z);  
+  asm volatile ("" : "+v" (c));
+}
diff --git a/gcc/testsuite/gcc.target/i386/builtin_target.c b/gcc/testsuite/gcc.target/i386/builtin_target.c
index c620a74..e50695c 100644
--- a/gcc/testsuite/gcc.target/i386/builtin_target.c
+++ b/gcc/testsuite/gcc.target/i386/builtin_target.c
@@ -217,6 +217,8 @@ check_features (unsigned int ecx, unsigned int edx,
 	assert (__builtin_cpu_supports ("avx5124vnniw"));
       if (edx & bit_AVX5124FMAPS)
 	assert (__builtin_cpu_supports ("avx5124fmaps"));
+      if (ecx & bit_AVX512VPOPCNTDQ)
+	assert (__builtin_cpu_supports ("avx512vpopcntdq"));
     }
 }
 
@@ -319,6 +321,8 @@ quick_check ()
 
   assert (__builtin_cpu_supports ("avx5124fmaps") >= 0);
 
+  assert (__builtin_cpu_supports ("avx512vpopcntdq") >= 0);
+
   /* Check CPU type.  */
   assert (__builtin_cpu_is ("amd") >= 0);
 
diff --git a/gcc/testsuite/gcc.target/i386/funcspec-56.inc b/gcc/testsuite/gcc.target/i386/funcspec-56.inc
index 9334e9e..c999080 100644
--- a/gcc/testsuite/gcc.target/i386/funcspec-56.inc
+++ b/gcc/testsuite/gcc.target/i386/funcspec-56.inc
@@ -30,6 +30,7 @@ extern void test_avx512pf(void)			__attribute__((__target__("avx512pf")));
 extern void test_avx512cd(void)			__attribute__((__target__("avx512cd")));
 extern void test_avx5124fmaps(void)             __attribute__((__target__("avx5124fmaps")));
 extern void test_avx5124vnniw(void)             __attribute__((__target__("avx5124vnniw")));
+extern void test_avx512vpopcntdq(void)		__attribute__((__target__("avx512vpopcntdq")));
 extern void test_bmi (void)			__attribute__((__target__("bmi")));
 extern void test_bmi2 (void)			__attribute__((__target__("bmi2")));
 
@@ -63,6 +64,7 @@ extern void test_bo_avx512pf(void)		__attribute__((__target__("no-avx512pf")));
 extern void test_no_avx512cd(void)		__attribute__((__target__("no-avx512cd")));
 extern void test_no_avx5124fmaps(void)          __attribute__((__target__("no-avx5124fmaps")));
 extern void test_no_avx5124vnniw(void)          __attribute__((__target__("no-avx5124vnniw")));
+extern void test_no_avx512vpopcntdq(void)	__attribute__((__target__("no-avx512vpopcntdq")));
 extern void test_no_bmi (void)			__attribute__((__target__("no-bmi")));
 extern void test_no_bmi2 (void)			__attribute__((__target__("no-bmi2")));
 
diff --git a/gcc/testsuite/gcc.target/i386/sse-12.c b/gcc/testsuite/gcc.target/i386/sse-12.c
index 3e8417b..19ff785 100644
--- a/gcc/testsuite/gcc.target/i386/sse-12.c
+++ b/gcc/testsuite/gcc.target/i386/sse-12.c
@@ -3,7 +3,7 @@
    popcntintrin.h and mm_malloc.h are usable
    with -O -std=c89 -pedantic-errors.  */
 /* { dg-do compile } */
-/* { dg-options "-O -std=c89 -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512bw -mavx512dq -mavx512vl -mavx512vbmi -mavx512ifma -mavx5124fmaps -mavx5124vnniw -mclwb -mmwaitx -mclzero -mpku" } */
+/* { dg-options "-O -std=c89 -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512bw -mavx512dq -mavx512vl -mavx512vbmi -mavx512ifma -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku" } */
 
 #include <x86intrin.h>
 
diff --git a/gcc/testsuite/gcc.target/i386/sse-13.c b/gcc/testsuite/gcc.target/i386/sse-13.c
index 67f3b93..350e2ed 100644
--- a/gcc/testsuite/gcc.target/i386/sse-13.c
+++ b/gcc/testsuite/gcc.target/i386/sse-13.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512ifma -mavx5124fmaps -mavx5124vnniw -mclwb -mmwaitx -mclzero -mpku" } */
+/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512ifma -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku" } */
 /* { dg-add-options bind_pic_locally } */
 
 #include <mm_malloc.h>
diff --git a/gcc/testsuite/gcc.target/i386/sse-22.c b/gcc/testsuite/gcc.target/i386/sse-22.c
index 44d48fd..85f9119 100644
--- a/gcc/testsuite/gcc.target/i386/sse-22.c
+++ b/gcc/testsuite/gcc.target/i386/sse-22.c
@@ -9,8 +9,8 @@
    are defined as inline functions in {,x,e,p,t,s,w,a,b,i}mmintrin.h,
    mm3dnow.h, fma4intrin.h, xopintrin.h, abmintrin.h, bmiintrin.h,
    tbmintrin.h, lwpintrin.h, popcntintrin.h, fmaintrin.h,
-   avx5124fmapsintrin.h, avx5124vnniwintrin.h and mm_malloc.h 
-   that reference the proper builtin functions.
+   avx5124fmapsintrin.h, avx5124vnniwintrin.h, avx512vpopcntdqintrin.h and
+   mm_malloc.h that reference the proper builtin functions.
 
    Defining away "extern" and "__inline" results in all of them being
    compiled as proper functions.  */
@@ -101,7 +101,7 @@
 
 
 #ifndef DIFFERENT_PRAGMAS
-#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,avx512vl,avx512bw,avx512dq,avx512vbmi,avx512ifma,avx5124fmaps,avx5124vnniw")
+#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,avx512vl,avx512bw,avx512dq,avx512vbmi,avx512ifma,avx5124fmaps,avx5124vnniw,avx512vpopcntdq")
 #endif
 
 /* Following intrinsics require immediate arguments.  They
@@ -218,7 +218,7 @@ test_4 (_mm_cmpestrz, int, __m128i, int, __m128i, int, 1)
 
 /* immintrin.h (AVX/AVX2/RDRND/FSGSBASE/F16C/RTM/AVX512F/SHA) */
 #ifdef DIFFERENT_PRAGMAS
-#pragma GCC target ("avx,avx2,rdrnd,fsgsbase,f16c,rtm,avx512f,avx512er,avx512cd,avx512pf,sha,avx512vl,avx512bw,avx512dq,avx512ifma,avx512vbmi,avx5124fmaps,avx5124vnniw")
+#pragma GCC target ("avx,avx2,rdrnd,fsgsbase,f16c,rtm,avx512f,avx512er,avx512cd,avx512pf,sha,avx512vl,avx512bw,avx512dq,avx512ifma,avx512vbmi,avx5124fmaps,avx5124vnniw,avx512vpopcntdq")
 #endif
 #include <immintrin.h>
 test_1 (_cvtss_sh, unsigned short, float, 1)
diff --git a/gcc/testsuite/gcc.target/i386/sse-23.c b/gcc/testsuite/gcc.target/i386/sse-23.c
index 61f1b00..3fc1f75 100644
--- a/gcc/testsuite/gcc.target/i386/sse-23.c
+++ b/gcc/testsuite/gcc.target/i386/sse-23.c
@@ -8,8 +8,8 @@
    are defined as inline functions in {,x,e,p,t,s,w,a,b,i}mmintrin.h,
    mm3dnow.h, fma4intrin.h, xopintrin.h, abmintrin.h, bmiintrin.h,
    tbmintrin.h, lwpintrin.h, popcntintrin.h, fmaintrin.h,
-   avx5124fmapsintrin.h, avx5124vnniwintrin.h and mm_malloc.h 
-   that reference the proper builtin functions.
+   avx5124fmapsintrin.h, avx5124vnniwintrin.h, avx512vpopcntdqintrin.h
+   and mm_malloc.h that reference the proper builtin functions.
 
    Defining away "extern" and "__inline" results in all of them being
    compiled as proper functions.  */
@@ -595,6 +595,6 @@
 #define __builtin_ia32_extracti64x2_256_mask(A, E, C, D) __builtin_ia32_extracti64x2_256_mask(A, 1, C, D)
 #define __builtin_ia32_extractf64x2_256_mask(A, E, C, D) __builtin_ia32_extractf64x2_256_mask(A, 1, C, D)
 
-#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,fma,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,xsavec,xsaves,clflushopt,avx512bw,avx512dq,avx512vl,avx512vbmi,avx512ifma,avx5124fmaps,avx5124vnniw,clwb,mwaitx,clzero,pku")
+#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,fma,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,xsavec,xsaves,clflushopt,avx512bw,avx512dq,avx512vl,avx512vbmi,avx512ifma,avx5124fmaps,avx5124vnniw,avx512vpopcntdq,clwb,mwaitx,clzero,pku")
 
 #include <x86intrin.h>
diff --git a/libgcc/config/i386/cpuinfo.c b/libgcc/config/i386/cpuinfo.c
index 9f30cb8..93b9307 100644
--- a/libgcc/config/i386/cpuinfo.c
+++ b/libgcc/config/i386/cpuinfo.c
@@ -277,6 +277,8 @@ get_available_features (unsigned int ecx, unsigned int edx,
 	features |= (1 << FEATURE_AVX5124VNNIW);
       if (edx & bit_AVX5124FMAPS)
 	features |= (1 << FEATURE_AVX5124FMAPS);
+      if (ecx & bit_AVX512VPOPCNTDQ)
+	features |= (1 << FEATURE_AVX512VPOPCNTDQ);
     }
 
   unsigned int ext_level;
diff --git a/libgcc/config/i386/cpuinfo.h b/libgcc/config/i386/cpuinfo.h
index cf848e6..49d0909 100644
--- a/libgcc/config/i386/cpuinfo.h
+++ b/libgcc/config/i386/cpuinfo.h
@@ -104,7 +104,8 @@ enum processor_features
   FEATURE_AVX512VBMI,
   FEATURE_AVX512IFMA,
   FEATURE_AVX5124VNNIW,
-  FEATURE_AVX5124FMAPS
+  FEATURE_AVX5124FMAPS,
+  FEATURE_AVX512VPOPCNTDQ
 };
 
 extern struct __processor_model

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2017-02-25  9:52 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-27 13:40 [PATCH][x86_64] Enable AVX512 VPOPCNTD/VPOPCNTQ instructions Uros Bizjak
2016-12-27 13:43 ` Andrew Senkevich
2016-12-27 13:50   ` Uros Bizjak
2016-12-27 14:13     ` Uros Bizjak
2017-01-10 10:05       ` Kirill Yukhin
2017-01-10 10:22         ` Andrew Senkevich
2017-01-10 10:31           ` Uros Bizjak
2017-01-10 12:01             ` Andrew Senkevich
2017-01-10 12:58               ` Kirill Yukhin
2017-02-24 20:42                 ` Andrew Senkevich
2017-02-25 11:47                   ` Uros Bizjak
  -- strict thread matches above, loose matches on Subject: below --
2016-12-22 16:38 Andrew Senkevich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).