public inbox for binutils@sourceware.org
* [PATCH 0/5] x86: AVX10.1 (alternative attempt)
@ 2023-08-25 12:43 Jan Beulich
  2023-08-25 12:44 ` [PATCH 1/5] x86: correct source used for two non-AVX512 VEXWIG tests Jan Beulich
                   ` (4 more replies)
  0 siblings, 5 replies; 27+ messages in thread
From: Jan Beulich @ 2023-08-25 12:43 UTC
  To: Binutils; +Cc: H.J. Lu, Jiang, Haochen

... and accompanied by a little bit of prereq tidying.

1: correct source used for two non-AVX512 VEXWIG tests
2: rename CpuPCLMUL
3: support AVX10.1/512
4: unindent most of set_cpu_arch()
5: support AVX10.1 vector size restrictions

Jan

* [PATCH 1/5] x86: correct source used for two non-AVX512 VEXWIG tests
  2023-08-25 12:43 [PATCH 0/5] x86: AVX10.1 (alternative attempt) Jan Beulich
@ 2023-08-25 12:44 ` Jan Beulich
  2023-08-25 12:45 ` [PATCH 2/5] x86: rename CpuPCLMUL Jan Beulich
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 27+ messages in thread
From: Jan Beulich @ 2023-08-25 12:44 UTC
  To: Binutils; +Cc: H.J. Lu, Jiang, Haochen

These tests shouldn't use the AVX512VL sources; they are meant to cover
the plain (non-AVX512) VEX forms. The expectations need to change
accordingly.

--- a/gas/testsuite/gas/i386/vaes-wig1.d
+++ b/gas/testsuite/gas/i386/vaes-wig1.d
@@ -1,7 +1,7 @@
 #as: -mvexwig=1
 #objdump: -dw
 #name: i386 AVX/VAES wig insns
-#source: avx512vl_vaes.s
+#source: vaes.s
 
 .*: +file format .*
 
@@ -9,100 +9,24 @@
 Disassembly of section \.text:
 
 00000000 <_start>:
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d1 de f4[ 	]*vaesdec %xmm4,%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d1 de b4 f4 c0 1d fe ff[ 	]*vaesdec -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d1 de b2 f0 07 00 00[ 	]*vaesdec 0x7f0\(%edx\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d5 de f4[ 	]*vaesdec %ymm4,%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d5 de b4 f4 c0 1d fe ff[ 	]*vaesdec -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d5 de b2 e0 0f 00 00[ 	]*vaesdec 0xfe0\(%edx\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d1 df f4[ 	]*vaesdeclast %xmm4,%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d1 df b4 f4 c0 1d fe ff[ 	]*vaesdeclast -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d1 df b2 f0 07 00 00[ 	]*vaesdeclast 0x7f0\(%edx\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d5 df f4[ 	]*vaesdeclast %ymm4,%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d5 df b4 f4 c0 1d fe ff[ 	]*vaesdeclast -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d5 df b2 e0 0f 00 00[ 	]*vaesdeclast 0xfe0\(%edx\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d1 dc f4[ 	]*vaesenc %xmm4,%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d1 dc b4 f4 c0 1d fe ff[ 	]*vaesenc -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d1 dc b2 f0 07 00 00[ 	]*vaesenc 0x7f0\(%edx\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d5 dc f4[ 	]*vaesenc %ymm4,%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d5 dc b4 f4 c0 1d fe ff[ 	]*vaesenc -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d5 dc b2 e0 0f 00 00[ 	]*vaesenc 0xfe0\(%edx\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d1 dd f4[ 	]*vaesenclast %xmm4,%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d1 dd b4 f4 c0 1d fe ff[ 	]*vaesenclast -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d1 dd b2 f0 07 00 00[ 	]*vaesenclast 0x7f0\(%edx\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d5 dd f4[ 	]*vaesenclast %ymm4,%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d5 dd b4 f4 c0 1d fe ff[ 	]*vaesenclast -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d5 dd b2 e0 0f 00 00[ 	]*vaesenclast 0xfe0\(%edx\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 08 de f4[ 	]*\{evex\} vaesdec %xmm4,%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 08 de b4 f4 c0 1d fe ff[ 	]*\{evex\} vaesdec -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 08 de 72 7f[ 	]*\{evex\} vaesdec 0x7f0\(%edx\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 28 de f4[ 	]*\{evex\} vaesdec %ymm4,%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 28 de b4 f4 c0 1d fe ff[ 	]*\{evex\} vaesdec -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 28 de 72 7f[ 	]*\{evex\} vaesdec 0xfe0\(%edx\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 08 df f4[ 	]*\{evex\} vaesdeclast %xmm4,%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 08 df b4 f4 c0 1d fe ff[ 	]*\{evex\} vaesdeclast -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 08 df 72 7f[ 	]*\{evex\} vaesdeclast 0x7f0\(%edx\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 28 df f4[ 	]*\{evex\} vaesdeclast %ymm4,%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 28 df b4 f4 c0 1d fe ff[ 	]*\{evex\} vaesdeclast -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 28 df 72 7f[ 	]*\{evex\} vaesdeclast 0xfe0\(%edx\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 08 dc f4[ 	]*\{evex\} vaesenc %xmm4,%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 08 dc b4 f4 c0 1d fe ff[ 	]*\{evex\} vaesenc -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 08 dc 72 7f[ 	]*\{evex\} vaesenc 0x7f0\(%edx\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 28 dc f4[ 	]*\{evex\} vaesenc %ymm4,%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 28 dc b4 f4 c0 1d fe ff[ 	]*\{evex\} vaesenc -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 28 dc 72 7f[ 	]*\{evex\} vaesenc 0xfe0\(%edx\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 08 dd f4[ 	]*\{evex\} vaesenclast %xmm4,%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 08 dd b4 f4 c0 1d fe ff[ 	]*\{evex\} vaesenclast -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 08 dd 72 7f[ 	]*\{evex\} vaesenclast 0x7f0\(%edx\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 28 dd f4[ 	]*\{evex\} vaesenclast %ymm4,%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 28 dd b4 f4 c0 1d fe ff[ 	]*\{evex\} vaesenclast -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 28 dd 72 7f[ 	]*\{evex\} vaesenclast 0xfe0\(%edx\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d1 de f4[ 	]*vaesdec %xmm4,%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d1 de b4 f4 c0 1d fe ff[ 	]*vaesdec -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d1 de b2 f0 07 00 00[ 	]*vaesdec 0x7f0\(%edx\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d5 de f4[ 	]*vaesdec %ymm4,%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d5 de b4 f4 c0 1d fe ff[ 	]*vaesdec -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d5 de b2 e0 0f 00 00[ 	]*vaesdec 0xfe0\(%edx\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d1 df f4[ 	]*vaesdeclast %xmm4,%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d1 df b4 f4 c0 1d fe ff[ 	]*vaesdeclast -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d1 df b2 f0 07 00 00[ 	]*vaesdeclast 0x7f0\(%edx\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d5 df f4[ 	]*vaesdeclast %ymm4,%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d5 df b4 f4 c0 1d fe ff[ 	]*vaesdeclast -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d5 df b2 e0 0f 00 00[ 	]*vaesdeclast 0xfe0\(%edx\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d1 dc f4[ 	]*vaesenc %xmm4,%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d1 dc b4 f4 c0 1d fe ff[ 	]*vaesenc -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d1 dc b2 f0 07 00 00[ 	]*vaesenc 0x7f0\(%edx\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d5 dc f4[ 	]*vaesenc %ymm4,%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d5 dc b4 f4 c0 1d fe ff[ 	]*vaesenc -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d5 dc b2 e0 0f 00 00[ 	]*vaesenc 0xfe0\(%edx\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d1 dd f4[ 	]*vaesenclast %xmm4,%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d1 dd b4 f4 c0 1d fe ff[ 	]*vaesenclast -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d1 dd b2 f0 07 00 00[ 	]*vaesenclast 0x7f0\(%edx\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d5 dd f4[ 	]*vaesenclast %ymm4,%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d5 dd b4 f4 c0 1d fe ff[ 	]*vaesenclast -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e2 d5 dd b2 e0 0f 00 00[ 	]*vaesenclast 0xfe0\(%edx\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 08 de f4[ 	]*\{evex\} vaesdec %xmm4,%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 08 de b4 f4 c0 1d fe ff[ 	]*\{evex\} vaesdec -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 08 de 72 7f[ 	]*\{evex\} vaesdec 0x7f0\(%edx\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 28 de f4[ 	]*\{evex\} vaesdec %ymm4,%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 28 de b4 f4 c0 1d fe ff[ 	]*\{evex\} vaesdec -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 28 de 72 7f[ 	]*\{evex\} vaesdec 0xfe0\(%edx\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 08 df f4[ 	]*\{evex\} vaesdeclast %xmm4,%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 08 df b4 f4 c0 1d fe ff[ 	]*\{evex\} vaesdeclast -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 08 df 72 7f[ 	]*\{evex\} vaesdeclast 0x7f0\(%edx\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 28 df f4[ 	]*\{evex\} vaesdeclast %ymm4,%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 28 df b4 f4 c0 1d fe ff[ 	]*\{evex\} vaesdeclast -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 28 df 72 7f[ 	]*\{evex\} vaesdeclast 0xfe0\(%edx\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 08 dc f4[ 	]*\{evex\} vaesenc %xmm4,%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 08 dc b4 f4 c0 1d fe ff[ 	]*\{evex\} vaesenc -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 08 dc 72 7f[ 	]*\{evex\} vaesenc 0x7f0\(%edx\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 28 dc f4[ 	]*\{evex\} vaesenc %ymm4,%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 28 dc b4 f4 c0 1d fe ff[ 	]*\{evex\} vaesenc -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 28 dc 72 7f[ 	]*\{evex\} vaesenc 0xfe0\(%edx\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 08 dd f4[ 	]*\{evex\} vaesenclast %xmm4,%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 08 dd b4 f4 c0 1d fe ff[ 	]*\{evex\} vaesenclast -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 08 dd 72 7f[ 	]*\{evex\} vaesenclast 0x7f0\(%edx\),%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 28 dd f4[ 	]*\{evex\} vaesenclast %ymm4,%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 28 dd b4 f4 c0 1d fe ff[ 	]*\{evex\} vaesenclast -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*62 f2 55 28 dd 72 7f[ 	]*\{evex\} vaesenclast 0xfe0\(%edx\),%ymm5,%ymm6
+[ 	]*[a-f0-9]+:	c4 e2 cd dc d4       	vaesenc %ymm4,%ymm6,%ymm2
+[ 	]*[a-f0-9]+:	c4 e2 cd dc 39       	vaesenc \(%ecx\),%ymm6,%ymm7
+[ 	]*[a-f0-9]+:	c4 e2 cd dd d4       	vaesenclast %ymm4,%ymm6,%ymm2
+[ 	]*[a-f0-9]+:	c4 e2 cd dd 39       	vaesenclast \(%ecx\),%ymm6,%ymm7
+[ 	]*[a-f0-9]+:	c4 e2 cd de d4       	vaesdec %ymm4,%ymm6,%ymm2
+[ 	]*[a-f0-9]+:	c4 e2 cd de 39       	vaesdec \(%ecx\),%ymm6,%ymm7
+[ 	]*[a-f0-9]+:	c4 e2 cd df d4       	vaesdeclast %ymm4,%ymm6,%ymm2
+[ 	]*[a-f0-9]+:	c4 e2 cd df 39       	vaesdeclast \(%ecx\),%ymm6,%ymm7
+[ 	]*[a-f0-9]+:	c4 e2 cd dc d4       	vaesenc %ymm4,%ymm6,%ymm2
+[ 	]*[a-f0-9]+:	c4 e2 cd dc 39       	vaesenc \(%ecx\),%ymm6,%ymm7
+[ 	]*[a-f0-9]+:	c4 e2 cd dc 39       	vaesenc \(%ecx\),%ymm6,%ymm7
+[ 	]*[a-f0-9]+:	c4 e2 cd dd d4       	vaesenclast %ymm4,%ymm6,%ymm2
+[ 	]*[a-f0-9]+:	c4 e2 cd dd 39       	vaesenclast \(%ecx\),%ymm6,%ymm7
+[ 	]*[a-f0-9]+:	c4 e2 cd dd 39       	vaesenclast \(%ecx\),%ymm6,%ymm7
+[ 	]*[a-f0-9]+:	c4 e2 cd de d4       	vaesdec %ymm4,%ymm6,%ymm2
+[ 	]*[a-f0-9]+:	c4 e2 cd de 39       	vaesdec \(%ecx\),%ymm6,%ymm7
+[ 	]*[a-f0-9]+:	c4 e2 cd de 39       	vaesdec \(%ecx\),%ymm6,%ymm7
+[ 	]*[a-f0-9]+:	c4 e2 cd df d4       	vaesdeclast %ymm4,%ymm6,%ymm2
+[ 	]*[a-f0-9]+:	c4 e2 cd df 39       	vaesdeclast \(%ecx\),%ymm6,%ymm7
+[ 	]*[a-f0-9]+:	c4 e2 cd df 39       	vaesdeclast \(%ecx\),%ymm6,%ymm7
 #pass
--- a/gas/testsuite/gas/i386/vpclmulqdq-wig1.d
+++ b/gas/testsuite/gas/i386/vpclmulqdq-wig1.d
@@ -1,7 +1,7 @@
 #as: -mvexwig=1
 #objdump: -dw
 #name: i386 AVX/VPCLMULQDQ wig insns
-#source: avx512vl_vpclmulqdq.s
+#source: vpclmulqdq.s
 
 .*: +file format .*
 
@@ -9,36 +9,14 @@
 Disassembly of section \.text:
 
 00000000 <_start>:
-[ 	]*[a-f0-9]+:[ 	]*c4 e3 e9 44 da ab[ 	]*vpclmulqdq \$0xab,%xmm2,%xmm2,%xmm3
-[ 	]*[a-f0-9]+:[ 	]*c4 e3 e9 44 9c f4 c0 1d fe ff 7b[ 	]*vpclmulqdq \$0x7b,-0x1e240\(%esp,%esi,8\),%xmm2,%xmm3
-[ 	]*[a-f0-9]+:[ 	]*c4 e3 e9 44 9a f0 07 00 00 7b[ 	]*vpclmulqdq \$0x7b,0x7f0\(%edx\),%xmm2,%xmm3
-[ 	]*[a-f0-9]+:[ 	]*c4 e3 d5 44 e1 ab[ 	]*vpclmulqdq \$0xab,%ymm1,%ymm5,%ymm4
-[ 	]*[a-f0-9]+:[ 	]*c4 e3 d5 44 a4 f4 c0 1d fe ff 7b[ 	]*vpclmulqdq \$0x7b,-0x1e240\(%esp,%esi,8\),%ymm5,%ymm4
-[ 	]*[a-f0-9]+:[ 	]*c4 e3 d5 44 a2 e0 0f 00 00 7b[ 	]*vpclmulqdq \$0x7b,0xfe0\(%edx\),%ymm5,%ymm4
-[ 	]*[a-f0-9]+:[ 	]*62 f3 6d 08 44 da ab[ 	]*\{evex\} vpclmulqdq \$0xab,%xmm2,%xmm2,%xmm3
-[ 	]*[a-f0-9]+:[ 	]*62 f3 6d 08 44 9c f4 c0 1d fe ff 7b[ 	]*\{evex\} vpclmulqdq \$0x7b,-0x1e240\(%esp,%esi,8\),%xmm2,%xmm3
-[ 	]*[a-f0-9]+:[ 	]*62 f3 6d 08 44 5a 7f 7b[ 	]*\{evex\} vpclmulqdq \$0x7b,0x7f0\(%edx\),%xmm2,%xmm3
-[ 	]*[a-f0-9]+:[ 	]*62 f3 55 28 44 e1 ab[ 	]*\{evex\} vpclmulqdq \$0xab,%ymm1,%ymm5,%ymm4
-[ 	]*[a-f0-9]+:[ 	]*62 f3 55 28 44 a4 f4 c0 1d fe ff 7b[ 	]*\{evex\} vpclmulqdq \$0x7b,-0x1e240\(%esp,%esi,8\),%ymm5,%ymm4
-[ 	]*[a-f0-9]+:[ 	]*62 f3 55 28 44 62 7f 7b[ 	]*\{evex\} vpclmulqdq \$0x7b,0xfe0\(%edx\),%ymm5,%ymm4
-[ 	]*[a-f0-9]+:[ 	]*62 f3 65 08 44 e2 11[ 	]*\{evex\} vpclmulhqhqdq %xmm2,%xmm3,%xmm4
-[ 	]*[a-f0-9]+:[ 	]*62 f3 5d 08 44 eb 01[ 	]*\{evex\} vpclmulhqlqdq %xmm3,%xmm4,%xmm5
-[ 	]*[a-f0-9]+:[ 	]*62 f3 55 08 44 f4 10[ 	]*\{evex\} vpclmullqhqdq %xmm4,%xmm5,%xmm6
-[ 	]*[a-f0-9]+:[ 	]*62 f3 4d 08 44 fd 00[ 	]*\{evex\} vpclmullqlqdq %xmm5,%xmm6,%xmm7
-[ 	]*[a-f0-9]+:[ 	]*62 f3 6d 28 44 d9 11[ 	]*\{evex\} vpclmulhqhqdq %ymm1,%ymm2,%ymm3
-[ 	]*[a-f0-9]+:[ 	]*62 f3 65 28 44 e2 01[ 	]*\{evex\} vpclmulhqlqdq %ymm2,%ymm3,%ymm4
-[ 	]*[a-f0-9]+:[ 	]*62 f3 5d 28 44 eb 10[ 	]*\{evex\} vpclmullqhqdq %ymm3,%ymm4,%ymm5
-[ 	]*[a-f0-9]+:[ 	]*62 f3 55 28 44 f4 00[ 	]*\{evex\} vpclmullqlqdq %ymm4,%ymm5,%ymm6
-[ 	]*[a-f0-9]+:[ 	]*c4 e3 d1 44 db ab[ 	]*vpclmulqdq \$0xab,%xmm3,%xmm5,%xmm3
-[ 	]*[a-f0-9]+:[ 	]*c4 e3 d1 44 9c f4 c0 1d fe ff 7b[ 	]*vpclmulqdq \$0x7b,-0x1e240\(%esp,%esi,8\),%xmm5,%xmm3
-[ 	]*[a-f0-9]+:[ 	]*c4 e3 d1 44 9a f0 07 00 00 7b[ 	]*vpclmulqdq \$0x7b,0x7f0\(%edx\),%xmm5,%xmm3
-[ 	]*[a-f0-9]+:[ 	]*c4 e3 ed 44 d2 ab[ 	]*vpclmulqdq \$0xab,%ymm2,%ymm2,%ymm2
-[ 	]*[a-f0-9]+:[ 	]*c4 e3 ed 44 94 f4 c0 1d fe ff 7b[ 	]*vpclmulqdq \$0x7b,-0x1e240\(%esp,%esi,8\),%ymm2,%ymm2
-[ 	]*[a-f0-9]+:[ 	]*c4 e3 ed 44 92 e0 0f 00 00 7b[ 	]*vpclmulqdq \$0x7b,0xfe0\(%edx\),%ymm2,%ymm2
-[ 	]*[a-f0-9]+:[ 	]*62 f3 55 08 44 db ab[ 	]*\{evex\} vpclmulqdq \$0xab,%xmm3,%xmm5,%xmm3
-[ 	]*[a-f0-9]+:[ 	]*62 f3 55 08 44 9c f4 c0 1d fe ff 7b[ 	]*\{evex\} vpclmulqdq \$0x7b,-0x1e240\(%esp,%esi,8\),%xmm5,%xmm3
-[ 	]*[a-f0-9]+:[ 	]*62 f3 55 08 44 5a 7f 7b[ 	]*\{evex\} vpclmulqdq \$0x7b,0x7f0\(%edx\),%xmm5,%xmm3
-[ 	]*[a-f0-9]+:[ 	]*62 f3 6d 28 44 d2 ab[ 	]*\{evex\} vpclmulqdq \$0xab,%ymm2,%ymm2,%ymm2
-[ 	]*[a-f0-9]+:[ 	]*62 f3 6d 28 44 94 f4 c0 1d fe ff 7b[ 	]*\{evex\} vpclmulqdq \$0x7b,-0x1e240\(%esp,%esi,8\),%ymm2,%ymm2
-[ 	]*[a-f0-9]+:[ 	]*62 f3 6d 28 44 52 7f 7b[ 	]*\{evex\} vpclmulqdq \$0x7b,0xfe0\(%edx\),%ymm2,%ymm2
+[ 	]*[a-f0-9]+:[ 	]*c4 e3 d5 44 f4 ab[ 	]*vpclmulqdq \$0xab,%ymm4,%ymm5,%ymm6
+[ 	]*[a-f0-9]+:[ 	]*c4 e3 d5 44 b4 f4 c0 1d fe ff 7b[ 	]*vpclmulqdq \$0x7b,-0x1e240\(%esp,%esi,8\),%ymm5,%ymm6
+[ 	]*[a-f0-9]+:[ 	]*c4 e3 d5 44 b2 e0 0f 00 00 7b[ 	]*vpclmulqdq \$0x7b,0xfe0\(%edx\),%ymm5,%ymm6
+[ 	]*[a-f0-9]+:[ 	]*c4 e3 ed 44 d9 11[ 	]*vpclmulhqhqdq %ymm1,%ymm2,%ymm3
+[ 	]*[a-f0-9]+:[ 	]*c4 e3 e5 44 e2 01[ 	]*vpclmulhqlqdq %ymm2,%ymm3,%ymm4
+[ 	]*[a-f0-9]+:[ 	]*c4 e3 dd 44 eb 10[ 	]*vpclmullqhqdq %ymm3,%ymm4,%ymm5
+[ 	]*[a-f0-9]+:[ 	]*c4 e3 d5 44 f4 00[ 	]*vpclmullqlqdq %ymm4,%ymm5,%ymm6
+[ 	]*[a-f0-9]+:[ 	]*c4 e3 d5 44 f4 ab[ 	]*vpclmulqdq \$0xab,%ymm4,%ymm5,%ymm6
+[ 	]*[a-f0-9]+:[ 	]*c4 e3 d5 44 b4 f4 c0 1d fe ff 7b[ 	]*vpclmulqdq \$0x7b,-0x1e240\(%esp,%esi,8\),%ymm5,%ymm6
+[ 	]*[a-f0-9]+:[ 	]*c4 e3 d5 44 b2 e0 0f 00 00 7b[ 	]*vpclmulqdq \$0x7b,0xfe0\(%edx\),%ymm5,%ymm6
 #pass


* [PATCH 2/5] x86: rename CpuPCLMUL
  2023-08-25 12:43 [PATCH 0/5] x86: AVX10.1 (alternative attempt) Jan Beulich
  2023-08-25 12:44 ` [PATCH 1/5] x86: correct source used for two non-AVX512 VEXWIG tests Jan Beulich
@ 2023-08-25 12:45 ` Jan Beulich
  2023-08-25 12:46 ` [PATCH 3/5] x86: support AVX10.1/512 Jan Beulich
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 27+ messages in thread
From: Jan Beulich @ 2023-08-25 12:45 UTC
  To: Binutils; +Cc: H.J. Lu, Jiang, Haochen

The name we use internally is in line with neither the SDM nor
CpuVPCLMULQDQ. Add the missing suffix, but leave user-facing names
alone.
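
To illustrate why the rename is invisible to users, here is a minimal,
self-contained sketch of a name table that maps user-facing extension
names onto internal identifiers. It is not the actual cpu_arch[] /
SUBARCH() machinery; the types, the table and lookup_arch_name() are
hypothetical names made up for this illustration. The point is only that
both "pclmul" and "clmul" keep resolving to the same (renamed) internal
feature.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical internal feature identifiers (illustration only, loosely
   modelled on the idea of enum i386_cpu).  */
enum cpu_feature
{
  CPU_FEATURE_NONE = 0,
  CPU_FEATURE_AES,
  CPU_FEATURE_PCLMULQDQ	/* Internal name, now carrying the full suffix.  */
};

/* One entry per user-facing name.  */
struct arch_name
{
  const char *name;		/* What the user writes, e.g. in ".arch .pclmul".  */
  enum cpu_feature feature;	/* Internal identifier it resolves to.  */
};

/* The user-facing strings stay as they were; only the right-hand side
   identifier was renamed.  */
static const struct arch_name arch_names[] =
{
  { "aes", CPU_FEATURE_AES },
  { "pclmul", CPU_FEATURE_PCLMULQDQ },
  { "clmul", CPU_FEATURE_PCLMULQDQ },
};

/* Return the internal feature for NAME, or CPU_FEATURE_NONE if NAME is
   not in the table.  */
enum cpu_feature
lookup_arch_name (const char *name)
{
  for (size_t i = 0; i < sizeof (arch_names) / sizeof (arch_names[0]); i++)
    if (strcmp (arch_names[i].name, name) == 0)
      return arch_names[i].feature;
  return CPU_FEATURE_NONE;
}
```

In this sketch, lookup_arch_name ("pclmul") and lookup_arch_name ("clmul")
return the same value before and after the rename, which is the property
the patch relies on.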

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -1053,8 +1053,8 @@ static const arch_entry cpu_arch[] =
   SUBARCH (xsavec, XSAVEC, ANY_XSAVEC, false),
   SUBARCH (xsaves, XSAVES, ANY_XSAVES, false),
   SUBARCH (aes, AES, ANY_AES, false),
-  SUBARCH (pclmul, PCLMUL, ANY_PCLMUL, false),
-  SUBARCH (clmul, PCLMUL, ANY_PCLMUL, true),
+  SUBARCH (pclmul, PCLMULQDQ, ANY_PCLMULQDQ, false),
+  SUBARCH (clmul, PCLMULQDQ, ANY_PCLMULQDQ, true),
   SUBARCH (fsgsbase, FSGSBASE, FSGSBASE, false),
   SUBARCH (rdrnd, RDRND, RDRND, false),
   SUBARCH (f16c, F16C, ANY_F16C, false),
@@ -1911,7 +1911,7 @@ cpu_flags_match (const insn_template *t)
 		      || (sse2avx && !i.prefix[DATA_PREFIX]))
 		  && (!x.bitfield.cpuaes || cpu.bitfield.cpuaes)
 		  && (!x.bitfield.cpugfni || cpu.bitfield.cpugfni)
-		  && (!x.bitfield.cpupclmul || cpu.bitfield.cpupclmul))
+		  && (!x.bitfield.cpupclmulqdq || cpu.bitfield.cpupclmulqdq))
 		match |= CPU_FLAGS_ARCH_MATCH;
 	    }
 	  else if (x.bitfield.cpuavx512f)
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -79,7 +79,7 @@ static const dependency isa_dependencies
   { "AMDFAM10",
     "K8|FISTTP|SSE4A|ABM|MONITOR" },
   { "BDVER1",
-    "GENERIC64|FISTTP|Rdtscp|MONITOR|CX16|LAHF_SAHF|XOP|ABM|LWP|SVME|AES|PCLMUL|PRFCHW" },
+    "GENERIC64|FISTTP|Rdtscp|MONITOR|CX16|LAHF_SAHF|XOP|ABM|LWP|SVME|AES|PCLMULQDQ|PRFCHW" },
   { "BDVER2",
     "BDVER1|FMA|BMI|TBM|F16C" },
   { "BDVER3",
@@ -87,7 +87,7 @@ static const dependency isa_dependencies
   { "BDVER4",
     "BDVER3|AVX2|Movbe|BMI2|RdRnd|MWAITX" },
   { "ZNVER1",
-    "GENERIC64|FISTTP|Rdtscp|MONITOR|CX16|LAHF_SAHF|AVX2|SSE4A|ABM|SVME|AES|PCLMUL|PRFCHW|FMA|BMI|F16C|Xsaveopt|FSGSBase|Movbe|BMI2|RdRnd|ADX|RdSeed|SMAP|SHA|XSAVEC|XSAVES|ClflushOpt|CLZERO|MWAITX" },
+    "GENERIC64|FISTTP|Rdtscp|MONITOR|CX16|LAHF_SAHF|AVX2|SSE4A|ABM|SVME|AES|PCLMULQDQ|PRFCHW|FMA|BMI|F16C|Xsaveopt|FSGSBase|Movbe|BMI2|RdRnd|ADX|RdSeed|SMAP|SHA|XSAVEC|XSAVES|ClflushOpt|CLZERO|MWAITX" },
   { "ZNVER2",
     "ZNVER1|CLWB|RDPID|RDPRU|MCOMMIT|WBNOINVD" },
   { "ZNVER3",
@@ -97,7 +97,7 @@ static const dependency isa_dependencies
   { "BTVER1",
     "GENERIC64|FISTTP|MONITOR|CX16|LAHF_SAHF|Rdtscp|SSSE3|SSE4A|ABM|PRFCHW|Clflush|FISTTP|SVME" },
   { "BTVER2",
-    "BTVER1|AVX|BMI|F16C|AES|PCLMUL|Movbe|Xsaveopt|PRFCHW" },
+    "BTVER1|AVX|BMI|F16C|AES|PCLMULQDQ|Movbe|Xsaveopt|PRFCHW" },
   { "286",
     "186" },
   { "386",
@@ -132,7 +132,7 @@ static const dependency isa_dependencies
     "XSAVE" },
   { "AES",
     "SSE2" },
-  { "PCLMUL",
+  { "PCLMULQDQ",
     "SSE2" },
   { "FMA",
     "AVX" },
@@ -315,7 +315,7 @@ static bitfield cpu_flags[] =
   BITFIELD (Xsave),
   BITFIELD (Xsaveopt),
   BITFIELD (AES),
-  BITFIELD (PCLMUL),
+  BITFIELD (PCLMULQDQ),
   BITFIELD (FMA),
   BITFIELD (FMA4),
   BITFIELD (XOP),
--- a/opcodes/i386-init.h
+++ b/opcodes/i386-init.h
@@ -408,7 +408,7 @@
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
       0, 0, 0, 0, 0 } }
 
-#define CPU_PCLMUL_FLAGS \
+#define CPU_PCLMULQDQ_FLAGS \
   { { 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, \
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, \
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
@@ -1968,7 +1968,7 @@
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
       0, 0, 0, 0, 0 } }
 
-#define CPU_ANY_PCLMUL_FLAGS \
+#define CPU_ANY_PCLMULQDQ_FLAGS \
   { { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, \
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -107,8 +107,8 @@ enum i386_cpu
   CpuXsaveopt,
   /* AES support required */
   CpuAES,
-  /* PCLMUL support required */
-  CpuPCLMUL,
+  /* PCLMULQDQ support required */
+  CpuPCLMULQDQ,
   /* FMA support required */
   CpuFMA,
   /* FMA4 support required */
@@ -415,7 +415,7 @@ typedef union i386_cpu_flags
       unsigned int cpuxsave:1;
       unsigned int cpuxsaveopt:1;
       unsigned int cpuaes:1;
-      unsigned int cpupclmul:1;
+      unsigned int cpupclmulqdq:1;
       unsigned int cpufma:1;
       unsigned int cpufma4:1;
       unsigned int cpuxop:1;
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -1448,15 +1448,15 @@ vaesdeclast, 0x66df, VAES, Modrm|Vex256|
 vaesenc, 0x66dc, VAES, Modrm|Vex256|Space0F38|VexVVVV|VexWIG|NoSuf, { RegYMM|Unspecified|BaseIndex, RegYMM, RegYMM }
 vaesenclast, 0x66dd, VAES, Modrm|Vex256|Space0F38|VexVVVV|VexWIG|NoSuf, { RegYMM|Unspecified|BaseIndex, RegYMM, RegYMM }
 
-// PCLMUL
+// PCLMULQDQ
 
 <pclmul:cpu:attr, $avx:AVX|:Vex128|VexW0|SSE2AVX|VexVVVV, $sse::>
 
-pclmulqdq<pclmul>, 0x660f3a44, <pclmul:cpu>PCLMUL, Modrm|<pclmul:attr>|NoSuf, { Imm8|Imm8S, RegXMM|Unspecified|BaseIndex, RegXMM }
-pclmullqlqdq<pclmul>, 0x660f3a44/0x00, <pclmul:cpu>PCLMUL, Modrm|<pclmul:attr>|NoSuf|ImmExt, { RegXMM|Unspecified|BaseIndex, RegXMM }
-pclmulhqlqdq<pclmul>, 0x660f3a44/0x01, <pclmul:cpu>PCLMUL, Modrm|<pclmul:attr>|NoSuf|ImmExt, { RegXMM|Unspecified|BaseIndex, RegXMM }
-pclmullqhqdq<pclmul>, 0x660f3a44/0x10, <pclmul:cpu>PCLMUL, Modrm|<pclmul:attr>|NoSuf|ImmExt, { RegXMM|Unspecified|BaseIndex, RegXMM }
-pclmulhqhqdq<pclmul>, 0x660f3a44/0x11, <pclmul:cpu>PCLMUL, Modrm|<pclmul:attr>|NoSuf|ImmExt, { RegXMM|Unspecified|BaseIndex, RegXMM }
+pclmulqdq<pclmul>, 0x660f3a44, <pclmul:cpu>PCLMULQDQ, Modrm|<pclmul:attr>|NoSuf, { Imm8|Imm8S, RegXMM|Unspecified|BaseIndex, RegXMM }
+pclmullqlqdq<pclmul>, 0x660f3a44/0x00, <pclmul:cpu>PCLMULQDQ, Modrm|<pclmul:attr>|NoSuf|ImmExt, { RegXMM|Unspecified|BaseIndex, RegXMM }
+pclmulhqlqdq<pclmul>, 0x660f3a44/0x01, <pclmul:cpu>PCLMULQDQ, Modrm|<pclmul:attr>|NoSuf|ImmExt, { RegXMM|Unspecified|BaseIndex, RegXMM }
+pclmullqhqdq<pclmul>, 0x660f3a44/0x10, <pclmul:cpu>PCLMULQDQ, Modrm|<pclmul:attr>|NoSuf|ImmExt, { RegXMM|Unspecified|BaseIndex, RegXMM }
+pclmulhqhqdq<pclmul>, 0x660f3a44/0x11, <pclmul:cpu>PCLMULQDQ, Modrm|<pclmul:attr>|NoSuf|ImmExt, { RegXMM|Unspecified|BaseIndex, RegXMM }
 
 // GFNI
 
@@ -1771,13 +1771,13 @@ vaesenclast, 0x66dd, AVX|AES, Modrm|Vex|
 vaesimc, 0x66db, AVX|AES, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM }
 vaeskeygenassist, 0x66df, AVX|AES, Modrm|Vex|Space0F3A|VexWIG|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegXMM }
 
-// PCLMUL + AVX
+// PCLMULQDQ + AVX
 
-vpclmulqdq, 0x6644, AVX|PCLMUL, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vpclmullqlqdq, 0x6644/0x00, AVX|PCLMUL, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vpclmulhqlqdq, 0x6644/0x01, AVX|PCLMUL, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vpclmullqhqdq, 0x6644/0x10, AVX|PCLMUL, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vpclmulhqhqdq, 0x6644/0x11, AVX|PCLMUL, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vpclmulqdq, 0x6644, AVX|PCLMULQDQ, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vpclmullqlqdq, 0x6644/0x00, AVX|PCLMULQDQ, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vpclmulhqlqdq, 0x6644/0x01, AVX|PCLMULQDQ, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vpclmullqhqdq, 0x6644/0x10, AVX|PCLMULQDQ, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vpclmulhqhqdq, 0x6644/0x11, AVX|PCLMULQDQ, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 
 // GFNI + AVX
 


* [PATCH 3/5] x86: support AVX10.1/512
  2023-08-25 12:43 [PATCH 0/5] x86: AVX10.1 (alternative attempt) Jan Beulich
  2023-08-25 12:44 ` [PATCH 1/5] x86: correct source used for two non-AVX512 VEXWIG tests Jan Beulich
  2023-08-25 12:45 ` [PATCH 2/5] x86: rename CpuPCLMUL Jan Beulich
@ 2023-08-25 12:46 ` Jan Beulich
  2023-08-28  2:34   ` Jiang, Haochen
  2023-08-25 12:47 ` [PATCH 4/5] x86: unindent most of set_cpu_arch() Jan Beulich
  2023-08-25 12:47 ` [PATCH 5/5] x86: support AVX10.1 vector size restrictions Jan Beulich
  4 siblings, 1 reply; 27+ messages in thread
From: Jan Beulich @ 2023-08-25 12:46 UTC
  To: Binutils; +Cc: H.J. Lu, Jiang, Haochen

Since this is merely a re-branding of certain AVX512* features, there's
little code to be added. As to VAES and VPCLMULQDQ, please see the code
comment.

The main aspect here is the new testcases. To be able to re-use some of
the existing testcases, several of them need their start symbols
adjusted. Note that 256- and 128-bit tests need adding here as well, as
these need to work right away. Subsequently they'll gain vector length
constraints.

Since it was missing and is wanted here, also add an AVX512VL+VPOPCNTDQ
test.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -1156,6 +1156,7 @@ static const arch_entry cpu_arch[] =
   SUBARCH (sm3, SM3, ANY_SM3, false),
   SUBARCH (sm4, SM4, ANY_SM4, false),
   SUBARCH (pbndkb, PBNDKB, PBNDKB, false),
+  SUBARCH (avx10.1, AVX10_1, ANY_AVX512F, false),
 };
 
 #undef SUBARCH
--- a/gas/doc/c-i386.texi
+++ b/gas/doc/c-i386.texi
@@ -212,6 +212,7 @@ accept various extension mnemonics.  For
 @code{sm3},
 @code{sm4},
 @code{pbndkb},
+@code{avx10.1},
 @code{amx_int8},
 @code{amx_bf16},
 @code{amx_fp16},
@@ -1636,7 +1637,7 @@ supported on the CPU specified.  The cho
 @item @samp{.avx512vbmi} @tab @samp{.avx512_4fmaps} @tab @samp{.avx512_4vnniw}
 @item @samp{.avx512_vpopcntdq} @tab @samp{.avx512_vbmi2} @tab @samp{.avx512_vnni}
 @item @samp{.avx512_bitalg} @tab @samp{.avx512_bf16} @tab @samp{.avx512_vp2intersect}
-@item @samp{.tdx} @tab @samp{.avx_vnni}  @tab @samp{.avx512_fp16}
+@item @samp{.tdx} @tab @samp{.avx_vnni} @tab @samp{.avx512_fp16} @tab @samp{.avx10.1}
 @item @samp{.clwb} @tab @samp{.rdpid} @tab @samp{.ptwrite} @tab @samp{.ibt}
 @item @samp{.prefetchi} @tab @samp{.avx_ifma} @tab @samp{.avx_vnni_int8}
 @item @samp{.cmpccxadd} @tab @samp{.wrmsrns} @tab @samp{.msrlist}
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-256-1.d
@@ -0,0 +1,4 @@
+#as: -I${srcdir}/$subdir
+#objdump: -dw
+#name: AVX10.1/256 (part 1)
+#dump: avx512f_vl.d
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-256-1.s
@@ -0,0 +1,3 @@
+	.arch generic32
+	.arch .avx10.1
+	.include "avx512f_vl.s"
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-256-2.d
@@ -0,0 +1,4 @@
+#as: -I${srcdir}/$subdir
+#objdump: -dw
+#name: AVX10.1/256 (part 2)
+#dump: avx512bw_vl.d
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-256-2.s
@@ -0,0 +1,3 @@
+	.arch generic32
+	.arch .avx10.1
+	.include "avx512bw_vl.s"
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-256-3.d
@@ -0,0 +1,4 @@
+#as: -I${srcdir}/$subdir
+#objdump: -dw
+#name: AVX10.1/256 (part 3)
+#dump: avx512dq_vl.d
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-256-3.s
@@ -0,0 +1,3 @@
+	.arch generic32
+	.arch .avx10.1
+	.include "avx512dq_vl.s"
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-256-4.d
@@ -0,0 +1,4 @@
+#as: -I${srcdir}/$subdir
+#objdump: -dw
+#name: AVX10.1/256 (part 4)
+#dump: avx512_fp16_vl.d
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-256-4.s
@@ -0,0 +1,3 @@
+	.arch generic32
+	.arch .avx10.1
+	.include "avx512_fp16_vl.s"
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-256-5.d
@@ -0,0 +1,1077 @@
+#as: -I${srcdir}/$subdir
+#objdump: -dw
+#name: AVX10.1/256 (part 5)
+
+.*: +file format .*
+
+
+Disassembly of section \.text:
+
+0+ <bitalg>:
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8f ec[ 	]*vpshufbitqmb %xmm4,%xmm5,%k5\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8f ac f4 c0 1d fe ff[ 	]*vpshufbitqmb -0x1e240\(%esp,%esi,8\),%xmm5,%k5\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8f 6a 7f[ 	]*vpshufbitqmb 0x7f0\(%edx\),%xmm5,%k5\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 8f ec[ 	]*vpshufbitqmb %ymm4,%ymm5,%k5\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 8f ac f4 c0 1d fe ff[ 	]*vpshufbitqmb -0x1e240\(%esp,%esi,8\),%ymm5,%k5\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 8f 6a 7f[ 	]*vpshufbitqmb 0xfe0\(%edx\),%ymm5,%k5\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 54 f5[ 	]*vpopcntb %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 8f 54 f5[ 	]*vpopcntb %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 54 b4 f4 c0 1d fe ff[ 	]*vpopcntb -0x1e240\(%esp,%esi,8\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 54 72 7f[ 	]*vpopcntb 0x7f0\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 54 f5[ 	]*vpopcntb %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d af 54 f5[ 	]*vpopcntb %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 54 b4 f4 c0 1d fe ff[ 	]*vpopcntb -0x1e240\(%esp,%esi,8\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 54 72 7f[ 	]*vpopcntb 0xfe0\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 54 f5[ 	]*vpopcntw %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 8f 54 f5[ 	]*vpopcntw %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 54 b4 f4 c0 1d fe ff[ 	]*vpopcntw -0x1e240\(%esp,%esi,8\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 54 72 7f[ 	]*vpopcntw 0x7f0\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 54 f5[ 	]*vpopcntw %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd af 54 f5[ 	]*vpopcntw %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 54 b4 f4 c0 1d fe ff[ 	]*vpopcntw -0x1e240\(%esp,%esi,8\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 54 72 7f[ 	]*vpopcntw 0xfe0\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 55 f5[ 	]*vpopcntd %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 8f 55 f5[ 	]*vpopcntd %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 55 b4 f4 c0 1d fe ff[ 	]*vpopcntd -0x1e240\(%esp,%esi,8\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 55 72 7f[ 	]*vpopcntd 0x7f0\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 1f 55 72 7f[ 	]*vpopcntd 0x1fc\(%edx\)\{1to4\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 55 f5[ 	]*vpopcntd %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d af 55 f5[ 	]*vpopcntd %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 55 b4 f4 c0 1d fe ff[ 	]*vpopcntd -0x1e240\(%esp,%esi,8\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 55 72 7f[ 	]*vpopcntd 0xfe0\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 3f 55 72 7f[ 	]*vpopcntd 0x1fc\(%edx\)\{1to8\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 55 f5[ 	]*vpopcntq %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 8f 55 f5[ 	]*vpopcntq %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 55 b4 f4 c0 1d fe ff[ 	]*vpopcntq -0x1e240\(%esp,%esi,8\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 55 72 7f[ 	]*vpopcntq 0x7f0\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 1f 55 72 7f[ 	]*vpopcntq 0x3f8\(%edx\)\{1to2\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 55 f5[ 	]*vpopcntq %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd af 55 f5[ 	]*vpopcntq %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 55 b4 f4 c0 1d fe ff[ 	]*vpopcntq -0x1e240\(%esp,%esi,8\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 55 72 7f[ 	]*vpopcntq 0xfe0\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 3f 55 72 7f[ 	]*vpopcntq 0x3f8\(%edx\)\{1to4\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8f ec[ 	]*vpshufbitqmb %xmm4,%xmm5,%k5\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8f ac f4 c0 1d fe ff[ 	]*vpshufbitqmb -0x1e240\(%esp,%esi,8\),%xmm5,%k5\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8f 6a 7f[ 	]*vpshufbitqmb 0x7f0\(%edx\),%xmm5,%k5\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 8f ec[ 	]*vpshufbitqmb %ymm4,%ymm5,%k5\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 8f ac f4 c0 1d fe ff[ 	]*vpshufbitqmb -0x1e240\(%esp,%esi,8\),%ymm5,%k5\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 8f 6a 7f[ 	]*vpshufbitqmb 0xfe0\(%edx\),%ymm5,%k5\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 54 f5[ 	]*vpopcntb %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 8f 54 f5[ 	]*vpopcntb %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 54 b4 f4 c0 1d fe ff[ 	]*vpopcntb -0x1e240\(%esp,%esi,8\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 54 72 7f[ 	]*vpopcntb 0x7f0\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 54 f5[ 	]*vpopcntb %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d af 54 f5[ 	]*vpopcntb %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 54 b4 f4 c0 1d fe ff[ 	]*vpopcntb -0x1e240\(%esp,%esi,8\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 54 72 7f[ 	]*vpopcntb 0xfe0\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 54 f5[ 	]*vpopcntw %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 8f 54 f5[ 	]*vpopcntw %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 54 b4 f4 c0 1d fe ff[ 	]*vpopcntw -0x1e240\(%esp,%esi,8\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 54 72 7f[ 	]*vpopcntw 0x7f0\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 54 f5[ 	]*vpopcntw %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd af 54 f5[ 	]*vpopcntw %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 54 b4 f4 c0 1d fe ff[ 	]*vpopcntw -0x1e240\(%esp,%esi,8\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 54 72 7f[ 	]*vpopcntw 0xfe0\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 55 f5[ 	]*vpopcntd %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 8f 55 f5[ 	]*vpopcntd %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 55 b4 f4 c0 1d fe ff[ 	]*vpopcntd -0x1e240\(%esp,%esi,8\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 55 72 7f[ 	]*vpopcntd 0x7f0\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 1f 55 72 7f[ 	]*vpopcntd 0x1fc\(%edx\)\{1to4\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 1f 55 32[ 	]*vpopcntd \(%edx\)\{1to4\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 55 f5[ 	]*vpopcntd %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d af 55 f5[ 	]*vpopcntd %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 55 b4 f4 c0 1d fe ff[ 	]*vpopcntd -0x1e240\(%esp,%esi,8\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 55 72 7f[ 	]*vpopcntd 0xfe0\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 3f 55 72 7f[ 	]*vpopcntd 0x1fc\(%edx\)\{1to8\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 3f 55 32[ 	]*vpopcntd \(%edx\)\{1to8\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 55 f5[ 	]*vpopcntq %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 8f 55 f5[ 	]*vpopcntq %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 55 b4 f4 c0 1d fe ff[ 	]*vpopcntq -0x1e240\(%esp,%esi,8\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 55 72 7f[ 	]*vpopcntq 0x7f0\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 1f 55 72 7f[ 	]*vpopcntq 0x3f8\(%edx\)\{1to2\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 1f 55 32[ 	]*vpopcntq \(%edx\)\{1to2\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 55 f5[ 	]*vpopcntq %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd af 55 f5[ 	]*vpopcntq %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 55 b4 f4 c0 1d fe ff[ 	]*vpopcntq -0x1e240\(%esp,%esi,8\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 55 72 7f[ 	]*vpopcntq 0xfe0\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 3f 55 72 7f[ 	]*vpopcntq 0x3f8\(%edx\)\{1to4\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 3f 55 32[ 	]*vpopcntq \(%edx\)\{1to4\},%ymm6\{%k7\}
+
+0+[a-f0-9]+ <cd>:
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f c4 f5[ 	]*vpconflictd %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 8f c4 f5[ 	]*vpconflictd %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f c4 31[ 	]*vpconflictd \(%ecx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f c4 b4 f4 c0 1d fe ff[ 	]*vpconflictd -0x1e240\(%esp,%esi,8\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 1f c4 30[ 	]*vpconflictd \(%eax\)\{1to4\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f c4 72 7f[ 	]*vpconflictd 0x7f0\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f c4 b2 00 08 00 00[ 	]*vpconflictd 0x800\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f c4 72 80[ 	]*vpconflictd -0x800\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f c4 b2 f0 f7 ff ff[ 	]*vpconflictd -0x810\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 1f c4 72 7f[ 	]*vpconflictd 0x1fc\(%edx\)\{1to4\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 1f c4 b2 00 02 00 00[ 	]*vpconflictd 0x200\(%edx\)\{1to4\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 1f c4 72 80[ 	]*vpconflictd -0x200\(%edx\)\{1to4\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 1f c4 b2 fc fd ff ff[ 	]*vpconflictd -0x204\(%edx\)\{1to4\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f c4 f5[ 	]*vpconflictd %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d af c4 f5[ 	]*vpconflictd %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f c4 31[ 	]*vpconflictd \(%ecx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f c4 b4 f4 c0 1d fe ff[ 	]*vpconflictd -0x1e240\(%esp,%esi,8\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 3f c4 30[ 	]*vpconflictd \(%eax\)\{1to8\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f c4 72 7f[ 	]*vpconflictd 0xfe0\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f c4 b2 00 10 00 00[ 	]*vpconflictd 0x1000\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f c4 72 80[ 	]*vpconflictd -0x1000\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f c4 b2 e0 ef ff ff[ 	]*vpconflictd -0x1020\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 3f c4 72 7f[ 	]*vpconflictd 0x1fc\(%edx\)\{1to8\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 3f c4 b2 00 02 00 00[ 	]*vpconflictd 0x200\(%edx\)\{1to8\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 3f c4 72 80[ 	]*vpconflictd -0x200\(%edx\)\{1to8\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 3f c4 b2 fc fd ff ff[ 	]*vpconflictd -0x204\(%edx\)\{1to8\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f c4 f5[ 	]*vpconflictq %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 8f c4 f5[ 	]*vpconflictq %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f c4 31[ 	]*vpconflictq \(%ecx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f c4 b4 f4 c0 1d fe ff[ 	]*vpconflictq -0x1e240\(%esp,%esi,8\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 1f c4 30[ 	]*vpconflictq \(%eax\)\{1to2\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f c4 72 7f[ 	]*vpconflictq 0x7f0\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f c4 b2 00 08 00 00[ 	]*vpconflictq 0x800\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f c4 72 80[ 	]*vpconflictq -0x800\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f c4 b2 f0 f7 ff ff[ 	]*vpconflictq -0x810\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 1f c4 72 7f[ 	]*vpconflictq 0x3f8\(%edx\)\{1to2\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 1f c4 b2 00 04 00 00[ 	]*vpconflictq 0x400\(%edx\)\{1to2\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 1f c4 72 80[ 	]*vpconflictq -0x400\(%edx\)\{1to2\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 1f c4 b2 f8 fb ff ff[ 	]*vpconflictq -0x408\(%edx\)\{1to2\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f c4 f5[ 	]*vpconflictq %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd af c4 f5[ 	]*vpconflictq %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f c4 31[ 	]*vpconflictq \(%ecx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f c4 b4 f4 c0 1d fe ff[ 	]*vpconflictq -0x1e240\(%esp,%esi,8\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 3f c4 30[ 	]*vpconflictq \(%eax\)\{1to4\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f c4 72 7f[ 	]*vpconflictq 0xfe0\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f c4 b2 00 10 00 00[ 	]*vpconflictq 0x1000\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f c4 72 80[ 	]*vpconflictq -0x1000\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f c4 b2 e0 ef ff ff[ 	]*vpconflictq -0x1020\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 3f c4 72 7f[ 	]*vpconflictq 0x3f8\(%edx\)\{1to4\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 3f c4 b2 00 04 00 00[ 	]*vpconflictq 0x400\(%edx\)\{1to4\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 3f c4 72 80[ 	]*vpconflictq -0x400\(%edx\)\{1to4\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 3f c4 b2 f8 fb ff ff[ 	]*vpconflictq -0x408\(%edx\)\{1to4\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 44 f5[ 	]*vplzcntd %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 8f 44 f5[ 	]*vplzcntd %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 44 31[ 	]*vplzcntd \(%ecx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 44 b4 f4 c0 1d fe ff[ 	]*vplzcntd -0x1e240\(%esp,%esi,8\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 1f 44 30[ 	]*vplzcntd \(%eax\)\{1to4\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 44 72 7f[ 	]*vplzcntd 0x7f0\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 44 b2 00 08 00 00[ 	]*vplzcntd 0x800\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 44 72 80[ 	]*vplzcntd -0x800\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 44 b2 f0 f7 ff ff[ 	]*vplzcntd -0x810\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 1f 44 72 7f[ 	]*vplzcntd 0x1fc\(%edx\)\{1to4\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 1f 44 b2 00 02 00 00[ 	]*vplzcntd 0x200\(%edx\)\{1to4\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 1f 44 72 80[ 	]*vplzcntd -0x200\(%edx\)\{1to4\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 1f 44 b2 fc fd ff ff[ 	]*vplzcntd -0x204\(%edx\)\{1to4\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 44 f5[ 	]*vplzcntd %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d af 44 f5[ 	]*vplzcntd %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 44 31[ 	]*vplzcntd \(%ecx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 44 b4 f4 c0 1d fe ff[ 	]*vplzcntd -0x1e240\(%esp,%esi,8\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 3f 44 30[ 	]*vplzcntd \(%eax\)\{1to8\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 44 72 7f[ 	]*vplzcntd 0xfe0\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 44 b2 00 10 00 00[ 	]*vplzcntd 0x1000\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 44 72 80[ 	]*vplzcntd -0x1000\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 44 b2 e0 ef ff ff[ 	]*vplzcntd -0x1020\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 3f 44 72 7f[ 	]*vplzcntd 0x1fc\(%edx\)\{1to8\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 3f 44 b2 00 02 00 00[ 	]*vplzcntd 0x200\(%edx\)\{1to8\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 3f 44 72 80[ 	]*vplzcntd -0x200\(%edx\)\{1to8\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 3f 44 b2 fc fd ff ff[ 	]*vplzcntd -0x204\(%edx\)\{1to8\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 44 f5[ 	]*vplzcntq %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 8f 44 f5[ 	]*vplzcntq %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 44 31[ 	]*vplzcntq \(%ecx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 44 b4 f4 c0 1d fe ff[ 	]*vplzcntq -0x1e240\(%esp,%esi,8\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 1f 44 30[ 	]*vplzcntq \(%eax\)\{1to2\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 44 72 7f[ 	]*vplzcntq 0x7f0\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 44 b2 00 08 00 00[ 	]*vplzcntq 0x800\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 44 72 80[ 	]*vplzcntq -0x800\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 44 b2 f0 f7 ff ff[ 	]*vplzcntq -0x810\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 1f 44 72 7f[ 	]*vplzcntq 0x3f8\(%edx\)\{1to2\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 1f 44 b2 00 04 00 00[ 	]*vplzcntq 0x400\(%edx\)\{1to2\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 1f 44 72 80[ 	]*vplzcntq -0x400\(%edx\)\{1to2\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 1f 44 b2 f8 fb ff ff[ 	]*vplzcntq -0x408\(%edx\)\{1to2\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 44 f5[ 	]*vplzcntq %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd af 44 f5[ 	]*vplzcntq %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 44 31[ 	]*vplzcntq \(%ecx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 44 b4 f4 c0 1d fe ff[ 	]*vplzcntq -0x1e240\(%esp,%esi,8\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 3f 44 30[ 	]*vplzcntq \(%eax\)\{1to4\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 44 72 7f[ 	]*vplzcntq 0xfe0\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 44 b2 00 10 00 00[ 	]*vplzcntq 0x1000\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 44 72 80[ 	]*vplzcntq -0x1000\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 44 b2 e0 ef ff ff[ 	]*vplzcntq -0x1020\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 3f 44 72 7f[ 	]*vplzcntq 0x3f8\(%edx\)\{1to4\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 3f 44 b2 00 04 00 00[ 	]*vplzcntq 0x400\(%edx\)\{1to4\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 3f 44 72 80[ 	]*vplzcntq -0x400\(%edx\)\{1to4\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 3f 44 b2 f8 fb ff ff[ 	]*vplzcntq -0x408\(%edx\)\{1to4\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7e 08 3a f6[ 	]*vpbroadcastmw2d %k6,%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7e 28 3a f6[ 	]*vpbroadcastmw2d %k6,%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fe 08 2a f6[ 	]*vpbroadcastmb2q %k6,%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fe 28 2a f6[ 	]*vpbroadcastmb2q %k6,%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f c4 f5[ 	]*vpconflictd %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 8f c4 f5[ 	]*vpconflictd %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f c4 31[ 	]*vpconflictd \(%ecx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f c4 b4 f4 c0 1d fe ff[ 	]*vpconflictd -0x1e240\(%esp,%esi,8\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 1f c4 30[ 	]*vpconflictd \(%eax\)\{1to4\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f c4 72 7f[ 	]*vpconflictd 0x7f0\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f c4 b2 00 08 00 00[ 	]*vpconflictd 0x800\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f c4 72 80[ 	]*vpconflictd -0x800\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f c4 b2 f0 f7 ff ff[ 	]*vpconflictd -0x810\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 1f c4 72 7f[ 	]*vpconflictd 0x1fc\(%edx\)\{1to4\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 1f c4 b2 00 02 00 00[ 	]*vpconflictd 0x200\(%edx\)\{1to4\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 1f c4 72 80[ 	]*vpconflictd -0x200\(%edx\)\{1to4\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 1f c4 b2 fc fd ff ff[ 	]*vpconflictd -0x204\(%edx\)\{1to4\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f c4 f5[ 	]*vpconflictd %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d af c4 f5[ 	]*vpconflictd %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f c4 31[ 	]*vpconflictd \(%ecx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f c4 b4 f4 c0 1d fe ff[ 	]*vpconflictd -0x1e240\(%esp,%esi,8\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 3f c4 30[ 	]*vpconflictd \(%eax\)\{1to8\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f c4 72 7f[ 	]*vpconflictd 0xfe0\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f c4 b2 00 10 00 00[ 	]*vpconflictd 0x1000\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f c4 72 80[ 	]*vpconflictd -0x1000\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f c4 b2 e0 ef ff ff[ 	]*vpconflictd -0x1020\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 3f c4 72 7f[ 	]*vpconflictd 0x1fc\(%edx\)\{1to8\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 3f c4 b2 00 02 00 00[ 	]*vpconflictd 0x200\(%edx\)\{1to8\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 3f c4 72 80[ 	]*vpconflictd -0x200\(%edx\)\{1to8\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 3f c4 b2 fc fd ff ff[ 	]*vpconflictd -0x204\(%edx\)\{1to8\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f c4 f5[ 	]*vpconflictq %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 8f c4 f5[ 	]*vpconflictq %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f c4 31[ 	]*vpconflictq \(%ecx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f c4 b4 f4 c0 1d fe ff[ 	]*vpconflictq -0x1e240\(%esp,%esi,8\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 1f c4 30[ 	]*vpconflictq \(%eax\)\{1to2\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f c4 72 7f[ 	]*vpconflictq 0x7f0\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f c4 b2 00 08 00 00[ 	]*vpconflictq 0x800\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f c4 72 80[ 	]*vpconflictq -0x800\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f c4 b2 f0 f7 ff ff[ 	]*vpconflictq -0x810\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 1f c4 72 7f[ 	]*vpconflictq 0x3f8\(%edx\)\{1to2\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 1f c4 b2 00 04 00 00[ 	]*vpconflictq 0x400\(%edx\)\{1to2\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 1f c4 72 80[ 	]*vpconflictq -0x400\(%edx\)\{1to2\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 1f c4 b2 f8 fb ff ff[ 	]*vpconflictq -0x408\(%edx\)\{1to2\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f c4 f5[ 	]*vpconflictq %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd af c4 f5[ 	]*vpconflictq %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f c4 31[ 	]*vpconflictq \(%ecx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f c4 b4 f4 c0 1d fe ff[ 	]*vpconflictq -0x1e240\(%esp,%esi,8\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 3f c4 30[ 	]*vpconflictq \(%eax\)\{1to4\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f c4 72 7f[ 	]*vpconflictq 0xfe0\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f c4 b2 00 10 00 00[ 	]*vpconflictq 0x1000\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f c4 72 80[ 	]*vpconflictq -0x1000\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f c4 b2 e0 ef ff ff[ 	]*vpconflictq -0x1020\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 3f c4 72 7f[ 	]*vpconflictq 0x3f8\(%edx\)\{1to4\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 3f c4 b2 00 04 00 00[ 	]*vpconflictq 0x400\(%edx\)\{1to4\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 3f c4 72 80[ 	]*vpconflictq -0x400\(%edx\)\{1to4\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 3f c4 b2 f8 fb ff ff[ 	]*vpconflictq -0x408\(%edx\)\{1to4\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 44 f5[ 	]*vplzcntd %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 8f 44 f5[ 	]*vplzcntd %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 44 31[ 	]*vplzcntd \(%ecx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 44 b4 f4 c0 1d fe ff[ 	]*vplzcntd -0x1e240\(%esp,%esi,8\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 1f 44 30[ 	]*vplzcntd \(%eax\)\{1to4\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 44 72 7f[ 	]*vplzcntd 0x7f0\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 44 b2 00 08 00 00[ 	]*vplzcntd 0x800\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 44 72 80[ 	]*vplzcntd -0x800\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 44 b2 f0 f7 ff ff[ 	]*vplzcntd -0x810\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 1f 44 72 7f[ 	]*vplzcntd 0x1fc\(%edx\)\{1to4\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 1f 44 b2 00 02 00 00[ 	]*vplzcntd 0x200\(%edx\)\{1to4\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 1f 44 72 80[ 	]*vplzcntd -0x200\(%edx\)\{1to4\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 1f 44 b2 fc fd ff ff[ 	]*vplzcntd -0x204\(%edx\)\{1to4\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 44 f5[ 	]*vplzcntd %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d af 44 f5[ 	]*vplzcntd %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 44 31[ 	]*vplzcntd \(%ecx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 44 b4 f4 c0 1d fe ff[ 	]*vplzcntd -0x1e240\(%esp,%esi,8\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 3f 44 30[ 	]*vplzcntd \(%eax\)\{1to8\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 44 72 7f[ 	]*vplzcntd 0xfe0\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 44 b2 00 10 00 00[ 	]*vplzcntd 0x1000\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 44 72 80[ 	]*vplzcntd -0x1000\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 44 b2 e0 ef ff ff[ 	]*vplzcntd -0x1020\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 3f 44 72 7f[ 	]*vplzcntd 0x1fc\(%edx\)\{1to8\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 3f 44 b2 00 02 00 00[ 	]*vplzcntd 0x200\(%edx\)\{1to8\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 3f 44 72 80[ 	]*vplzcntd -0x200\(%edx\)\{1to8\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 3f 44 b2 fc fd ff ff[ 	]*vplzcntd -0x204\(%edx\)\{1to8\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 44 f5[ 	]*vplzcntq %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 8f 44 f5[ 	]*vplzcntq %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 44 31[ 	]*vplzcntq \(%ecx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 44 b4 f4 c0 1d fe ff[ 	]*vplzcntq -0x1e240\(%esp,%esi,8\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 1f 44 30[ 	]*vplzcntq \(%eax\)\{1to2\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 44 72 7f[ 	]*vplzcntq 0x7f0\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 44 b2 00 08 00 00[ 	]*vplzcntq 0x800\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 44 72 80[ 	]*vplzcntq -0x800\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 44 b2 f0 f7 ff ff[ 	]*vplzcntq -0x810\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 1f 44 72 7f[ 	]*vplzcntq 0x3f8\(%edx\)\{1to2\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 1f 44 b2 00 04 00 00[ 	]*vplzcntq 0x400\(%edx\)\{1to2\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 1f 44 72 80[ 	]*vplzcntq -0x400\(%edx\)\{1to2\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 1f 44 b2 f8 fb ff ff[ 	]*vplzcntq -0x408\(%edx\)\{1to2\},%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 44 f5[ 	]*vplzcntq %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd af 44 f5[ 	]*vplzcntq %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 44 31[ 	]*vplzcntq \(%ecx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 44 b4 f4 c0 1d fe ff[ 	]*vplzcntq -0x1e240\(%esp,%esi,8\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 3f 44 30[ 	]*vplzcntq \(%eax\)\{1to4\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 44 72 7f[ 	]*vplzcntq 0xfe0\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 44 b2 00 10 00 00[ 	]*vplzcntq 0x1000\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 44 72 80[ 	]*vplzcntq -0x1000\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 44 b2 e0 ef ff ff[ 	]*vplzcntq -0x1020\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 3f 44 72 7f[ 	]*vplzcntq 0x3f8\(%edx\)\{1to4\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 3f 44 b2 00 04 00 00[ 	]*vplzcntq 0x400\(%edx\)\{1to4\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 3f 44 72 80[ 	]*vplzcntq -0x400\(%edx\)\{1to4\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 3f 44 b2 f8 fb ff ff[ 	]*vplzcntq -0x408\(%edx\)\{1to4\},%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7e 08 3a f6[ 	]*vpbroadcastmw2d %k6,%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7e 28 3a f6[ 	]*vpbroadcastmw2d %k6,%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fe 08 2a f6[ 	]*vpbroadcastmb2q %k6,%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fe 28 2a f6[ 	]*vpbroadcastmb2q %k6,%ymm6
+
+0+[a-f0-9]+ <ifma>:
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b4 f4[ 	]*vpmadd52luq %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 8f b4 f4[ 	]*vpmadd52luq %xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b4 31[ 	]*vpmadd52luq \(%ecx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b4 b4 f4 c0 1d fe ff[ 	]*vpmadd52luq -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f b4 30[ 	]*vpmadd52luq \(%eax\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b4 72 7f[ 	]*vpmadd52luq 0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b4 b2 00 08 00 00[ 	]*vpmadd52luq 0x800\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b4 72 80[ 	]*vpmadd52luq -0x800\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b4 b2 f0 f7 ff ff[ 	]*vpmadd52luq -0x810\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f b4 72 7f[ 	]*vpmadd52luq 0x3f8\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f b4 b2 00 04 00 00[ 	]*vpmadd52luq 0x400\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f b4 72 80[ 	]*vpmadd52luq -0x400\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f b4 b2 f8 fb ff ff[ 	]*vpmadd52luq -0x408\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b4 f4[ 	]*vpmadd52luq %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 af b4 f4[ 	]*vpmadd52luq %ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b4 31[ 	]*vpmadd52luq \(%ecx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b4 b4 f4 c0 1d fe ff[ 	]*vpmadd52luq -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f b4 30[ 	]*vpmadd52luq \(%eax\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b4 72 7f[ 	]*vpmadd52luq 0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b4 b2 00 10 00 00[ 	]*vpmadd52luq 0x1000\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b4 72 80[ 	]*vpmadd52luq -0x1000\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b4 b2 e0 ef ff ff[ 	]*vpmadd52luq -0x1020\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f b4 72 7f[ 	]*vpmadd52luq 0x3f8\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f b4 b2 00 04 00 00[ 	]*vpmadd52luq 0x400\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f b4 72 80[ 	]*vpmadd52luq -0x400\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f b4 b2 f8 fb ff ff[ 	]*vpmadd52luq -0x408\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b5 f4[ 	]*vpmadd52huq %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 8f b5 f4[ 	]*vpmadd52huq %xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b5 31[ 	]*vpmadd52huq \(%ecx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b5 b4 f4 c0 1d fe ff[ 	]*vpmadd52huq -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f b5 30[ 	]*vpmadd52huq \(%eax\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b5 72 7f[ 	]*vpmadd52huq 0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b5 b2 00 08 00 00[ 	]*vpmadd52huq 0x800\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b5 72 80[ 	]*vpmadd52huq -0x800\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b5 b2 f0 f7 ff ff[ 	]*vpmadd52huq -0x810\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f b5 72 7f[ 	]*vpmadd52huq 0x3f8\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f b5 b2 00 04 00 00[ 	]*vpmadd52huq 0x400\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f b5 72 80[ 	]*vpmadd52huq -0x400\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f b5 b2 f8 fb ff ff[ 	]*vpmadd52huq -0x408\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b5 f4[ 	]*vpmadd52huq %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 af b5 f4[ 	]*vpmadd52huq %ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b5 31[ 	]*vpmadd52huq \(%ecx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b5 b4 f4 c0 1d fe ff[ 	]*vpmadd52huq -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f b5 30[ 	]*vpmadd52huq \(%eax\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b5 72 7f[ 	]*vpmadd52huq 0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b5 b2 00 10 00 00[ 	]*vpmadd52huq 0x1000\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b5 72 80[ 	]*vpmadd52huq -0x1000\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b5 b2 e0 ef ff ff[ 	]*vpmadd52huq -0x1020\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f b5 72 7f[ 	]*vpmadd52huq 0x3f8\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f b5 b2 00 04 00 00[ 	]*vpmadd52huq 0x400\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f b5 72 80[ 	]*vpmadd52huq -0x400\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f b5 b2 f8 fb ff ff[ 	]*vpmadd52huq -0x408\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b4 f4[ 	]*vpmadd52luq %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 8f b4 f4[ 	]*vpmadd52luq %xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b4 31[ 	]*vpmadd52luq \(%ecx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b4 b4 f4 c0 1d fe ff[ 	]*vpmadd52luq -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f b4 30[ 	]*vpmadd52luq \(%eax\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b4 72 7f[ 	]*vpmadd52luq 0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b4 b2 00 08 00 00[ 	]*vpmadd52luq 0x800\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b4 72 80[ 	]*vpmadd52luq -0x800\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b4 b2 f0 f7 ff ff[ 	]*vpmadd52luq -0x810\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f b4 72 7f[ 	]*vpmadd52luq 0x3f8\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f b4 b2 00 04 00 00[ 	]*vpmadd52luq 0x400\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f b4 72 80[ 	]*vpmadd52luq -0x400\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f b4 b2 f8 fb ff ff[ 	]*vpmadd52luq -0x408\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b4 f4[ 	]*vpmadd52luq %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 af b4 f4[ 	]*vpmadd52luq %ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b4 31[ 	]*vpmadd52luq \(%ecx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b4 b4 f4 c0 1d fe ff[ 	]*vpmadd52luq -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f b4 30[ 	]*vpmadd52luq \(%eax\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b4 72 7f[ 	]*vpmadd52luq 0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b4 b2 00 10 00 00[ 	]*vpmadd52luq 0x1000\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b4 72 80[ 	]*vpmadd52luq -0x1000\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b4 b2 e0 ef ff ff[ 	]*vpmadd52luq -0x1020\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f b4 72 7f[ 	]*vpmadd52luq 0x3f8\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f b4 b2 00 04 00 00[ 	]*vpmadd52luq 0x400\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f b4 72 80[ 	]*vpmadd52luq -0x400\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f b4 b2 f8 fb ff ff[ 	]*vpmadd52luq -0x408\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b5 f4[ 	]*vpmadd52huq %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 8f b5 f4[ 	]*vpmadd52huq %xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b5 31[ 	]*vpmadd52huq \(%ecx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b5 b4 f4 c0 1d fe ff[ 	]*vpmadd52huq -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f b5 30[ 	]*vpmadd52huq \(%eax\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b5 72 7f[ 	]*vpmadd52huq 0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b5 b2 00 08 00 00[ 	]*vpmadd52huq 0x800\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b5 72 80[ 	]*vpmadd52huq -0x800\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b5 b2 f0 f7 ff ff[ 	]*vpmadd52huq -0x810\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f b5 72 7f[ 	]*vpmadd52huq 0x3f8\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f b5 b2 00 04 00 00[ 	]*vpmadd52huq 0x400\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f b5 72 80[ 	]*vpmadd52huq -0x400\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f b5 b2 f8 fb ff ff[ 	]*vpmadd52huq -0x408\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b5 f4[ 	]*vpmadd52huq %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 af b5 f4[ 	]*vpmadd52huq %ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b5 31[ 	]*vpmadd52huq \(%ecx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b5 b4 f4 c0 1d fe ff[ 	]*vpmadd52huq -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f b5 30[ 	]*vpmadd52huq \(%eax\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b5 72 7f[ 	]*vpmadd52huq 0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b5 b2 00 10 00 00[ 	]*vpmadd52huq 0x1000\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b5 72 80[ 	]*vpmadd52huq -0x1000\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f b5 b2 e0 ef ff ff[ 	]*vpmadd52huq -0x1020\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f b5 72 7f[ 	]*vpmadd52huq 0x3f8\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f b5 b2 00 04 00 00[ 	]*vpmadd52huq 0x400\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f b5 72 80[ 	]*vpmadd52huq -0x400\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f b5 b2 f8 fb ff ff[ 	]*vpmadd52huq -0x408\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+
+0+[a-f0-9]+ <vbmi>:
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8d f4[ 	]*vpermb %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 8f 8d f4[ 	]*vpermb %xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8d 31[ 	]*vpermb \(%ecx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8d b4 f4 c0 1d fe ff[ 	]*vpermb -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8d 72 7f[ 	]*vpermb 0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8d b2 00 08 00 00[ 	]*vpermb 0x800\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8d 72 80[ 	]*vpermb -0x800\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8d b2 f0 f7 ff ff[ 	]*vpermb -0x810\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 8d f4[ 	]*vpermb %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 af 8d f4[ 	]*vpermb %ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 8d 31[ 	]*vpermb \(%ecx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 8d b4 f4 c0 1d fe ff[ 	]*vpermb -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 8d 72 7f[ 	]*vpermb 0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 8d b2 00 10 00 00[ 	]*vpermb 0x1000\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 8d 72 80[ 	]*vpermb -0x1000\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 8d b2 e0 ef ff ff[ 	]*vpermb -0x1020\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 75 f4[ 	]*vpermi2b %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 8f 75 f4[ 	]*vpermi2b %xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 75 31[ 	]*vpermi2b \(%ecx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 75 b4 f4 c0 1d fe ff[ 	]*vpermi2b -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 75 72 7f[ 	]*vpermi2b 0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 75 b2 00 08 00 00[ 	]*vpermi2b 0x800\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 75 72 80[ 	]*vpermi2b -0x800\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 75 b2 f0 f7 ff ff[ 	]*vpermi2b -0x810\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 75 f4[ 	]*vpermi2b %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 af 75 f4[ 	]*vpermi2b %ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 75 31[ 	]*vpermi2b \(%ecx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 75 b4 f4 c0 1d fe ff[ 	]*vpermi2b -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 75 72 7f[ 	]*vpermi2b 0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 75 b2 00 10 00 00[ 	]*vpermi2b 0x1000\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 75 72 80[ 	]*vpermi2b -0x1000\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 75 b2 e0 ef ff ff[ 	]*vpermi2b -0x1020\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 7d f4[ 	]*vpermt2b %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 8f 7d f4[ 	]*vpermt2b %xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 7d 31[ 	]*vpermt2b \(%ecx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 7d b4 f4 c0 1d fe ff[ 	]*vpermt2b -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 7d 72 7f[ 	]*vpermt2b 0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 7d b2 00 08 00 00[ 	]*vpermt2b 0x800\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 7d 72 80[ 	]*vpermt2b -0x800\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 7d b2 f0 f7 ff ff[ 	]*vpermt2b -0x810\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 7d f4[ 	]*vpermt2b %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 af 7d f4[ 	]*vpermt2b %ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 7d 31[ 	]*vpermt2b \(%ecx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 7d b4 f4 c0 1d fe ff[ 	]*vpermt2b -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 7d 72 7f[ 	]*vpermt2b 0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 7d b2 00 10 00 00[ 	]*vpermt2b 0x1000\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 7d 72 80[ 	]*vpermt2b -0x1000\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 7d b2 e0 ef ff ff[ 	]*vpermt2b -0x1020\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 83 f4[ 	]*vpmultishiftqb %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 8f 83 f4[ 	]*vpmultishiftqb %xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 83 31[ 	]*vpmultishiftqb \(%ecx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 83 b4 f4 c0 1d fe ff[ 	]*vpmultishiftqb -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f 83 30[ 	]*vpmultishiftqb \(%eax\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 83 72 7f[ 	]*vpmultishiftqb 0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 83 b2 00 08 00 00[ 	]*vpmultishiftqb 0x800\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 83 72 80[ 	]*vpmultishiftqb -0x800\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 83 b2 f0 f7 ff ff[ 	]*vpmultishiftqb -0x810\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f 83 72 7f[ 	]*vpmultishiftqb 0x3f8\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f 83 b2 00 04 00 00[ 	]*vpmultishiftqb 0x400\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f 83 72 80[ 	]*vpmultishiftqb -0x400\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f 83 b2 f8 fb ff ff[ 	]*vpmultishiftqb -0x408\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 83 f4[ 	]*vpmultishiftqb %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 af 83 f4[ 	]*vpmultishiftqb %ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 83 31[ 	]*vpmultishiftqb \(%ecx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 83 b4 f4 c0 1d fe ff[ 	]*vpmultishiftqb -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f 83 30[ 	]*vpmultishiftqb \(%eax\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 83 72 7f[ 	]*vpmultishiftqb 0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 83 b2 00 10 00 00[ 	]*vpmultishiftqb 0x1000\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 83 72 80[ 	]*vpmultishiftqb -0x1000\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 83 b2 e0 ef ff ff[ 	]*vpmultishiftqb -0x1020\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f 83 72 7f[ 	]*vpmultishiftqb 0x3f8\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f 83 b2 00 04 00 00[ 	]*vpmultishiftqb 0x400\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f 83 72 80[ 	]*vpmultishiftqb -0x400\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f 83 b2 f8 fb ff ff[ 	]*vpmultishiftqb -0x408\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8d f4[ 	]*vpermb %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 8f 8d f4[ 	]*vpermb %xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8d 31[ 	]*vpermb \(%ecx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8d b4 f4 c0 1d fe ff[ 	]*vpermb -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8d 72 7f[ 	]*vpermb 0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8d b2 00 08 00 00[ 	]*vpermb 0x800\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8d 72 80[ 	]*vpermb -0x800\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8d b2 f0 f7 ff ff[ 	]*vpermb -0x810\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 8d f4[ 	]*vpermb %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 af 8d f4[ 	]*vpermb %ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 8d 31[ 	]*vpermb \(%ecx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 8d b4 f4 c0 1d fe ff[ 	]*vpermb -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 8d 72 7f[ 	]*vpermb 0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 8d b2 00 10 00 00[ 	]*vpermb 0x1000\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 8d 72 80[ 	]*vpermb -0x1000\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 8d b2 e0 ef ff ff[ 	]*vpermb -0x1020\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 75 f4[ 	]*vpermi2b %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 8f 75 f4[ 	]*vpermi2b %xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 75 31[ 	]*vpermi2b \(%ecx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 75 b4 f4 c0 1d fe ff[ 	]*vpermi2b -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 75 72 7f[ 	]*vpermi2b 0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 75 b2 00 08 00 00[ 	]*vpermi2b 0x800\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 75 72 80[ 	]*vpermi2b -0x800\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 75 b2 f0 f7 ff ff[ 	]*vpermi2b -0x810\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 75 f4[ 	]*vpermi2b %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 af 75 f4[ 	]*vpermi2b %ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 75 31[ 	]*vpermi2b \(%ecx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 75 b4 f4 c0 1d fe ff[ 	]*vpermi2b -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 75 72 7f[ 	]*vpermi2b 0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 75 b2 00 10 00 00[ 	]*vpermi2b 0x1000\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 75 72 80[ 	]*vpermi2b -0x1000\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 75 b2 e0 ef ff ff[ 	]*vpermi2b -0x1020\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 7d f4[ 	]*vpermt2b %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 8f 7d f4[ 	]*vpermt2b %xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 7d 31[ 	]*vpermt2b \(%ecx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 7d b4 f4 c0 1d fe ff[ 	]*vpermt2b -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 7d 72 7f[ 	]*vpermt2b 0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 7d b2 00 08 00 00[ 	]*vpermt2b 0x800\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 7d 72 80[ 	]*vpermt2b -0x800\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 7d b2 f0 f7 ff ff[ 	]*vpermt2b -0x810\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 7d f4[ 	]*vpermt2b %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 af 7d f4[ 	]*vpermt2b %ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 7d 31[ 	]*vpermt2b \(%ecx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 7d b4 f4 c0 1d fe ff[ 	]*vpermt2b -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 7d 72 7f[ 	]*vpermt2b 0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 7d b2 00 10 00 00[ 	]*vpermt2b 0x1000\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 7d 72 80[ 	]*vpermt2b -0x1000\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 7d b2 e0 ef ff ff[ 	]*vpermt2b -0x1020\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 83 f4[ 	]*vpmultishiftqb %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 8f 83 f4[ 	]*vpmultishiftqb %xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 83 31[ 	]*vpmultishiftqb \(%ecx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 83 b4 f4 c0 1d fe ff[ 	]*vpmultishiftqb -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f 83 30[ 	]*vpmultishiftqb \(%eax\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 83 72 7f[ 	]*vpmultishiftqb 0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 83 b2 00 08 00 00[ 	]*vpmultishiftqb 0x800\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 83 72 80[ 	]*vpmultishiftqb -0x800\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 83 b2 f0 f7 ff ff[ 	]*vpmultishiftqb -0x810\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f 83 72 7f[ 	]*vpmultishiftqb 0x3f8\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f 83 b2 00 04 00 00[ 	]*vpmultishiftqb 0x400\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f 83 72 80[ 	]*vpmultishiftqb -0x400\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f 83 b2 f8 fb ff ff[ 	]*vpmultishiftqb -0x408\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 83 f4[ 	]*vpmultishiftqb %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 af 83 f4[ 	]*vpmultishiftqb %ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 83 31[ 	]*vpmultishiftqb \(%ecx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 83 b4 f4 c0 1d fe ff[ 	]*vpmultishiftqb -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f 83 30[ 	]*vpmultishiftqb \(%eax\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 83 72 7f[ 	]*vpmultishiftqb 0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 83 b2 00 10 00 00[ 	]*vpmultishiftqb 0x1000\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 83 72 80[ 	]*vpmultishiftqb -0x1000\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 83 b2 e0 ef ff ff[ 	]*vpmultishiftqb -0x1020\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f 83 72 7f[ 	]*vpmultishiftqb 0x3f8\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f 83 b2 00 04 00 00[ 	]*vpmultishiftqb 0x400\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f 83 72 80[ 	]*vpmultishiftqb -0x400\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f 83 b2 f8 fb ff ff[ 	]*vpmultishiftqb -0x408\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+
+0+[a-f0-9]+ <vbmi2>:
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 63 b4 f4 c0 1d fe ff[ 	]*vpcompressb %xmm6,-0x1e240\(%esp,%esi,8\)\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 63 72 7e[ 	]*vpcompressb %xmm6,0x7e\(%edx\)\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 63 b4 f4 c0 1d fe ff[ 	]*vpcompressb %ymm6,-0x1e240\(%esp,%esi,8\)\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 63 72 7e[ 	]*vpcompressb %ymm6,0x7e\(%edx\)\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 63 ee[ 	]*vpcompressb %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 8f 63 ee[ 	]*vpcompressb %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 63 ee[ 	]*vpcompressb %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d af 63 ee[ 	]*vpcompressb %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 63 b4 f4 c0 1d fe ff[ 	]*vpcompressw %xmm6,-0x1e240\(%esp,%esi,8\)\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 63 72 40[ 	]*vpcompressw %xmm6,0x80\(%edx\)\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 63 b4 f4 c0 1d fe ff[ 	]*vpcompressw %ymm6,-0x1e240\(%esp,%esi,8\)\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 63 72 40[ 	]*vpcompressw %ymm6,0x80\(%edx\)\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 63 ee[ 	]*vpcompressw %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 8f 63 ee[ 	]*vpcompressw %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 63 ee[ 	]*vpcompressw %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd af 63 ee[ 	]*vpcompressw %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 8f 62 31[ 	]*vpexpandb \(%ecx\),%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 62 b4 f4 c0 1d fe ff[ 	]*vpexpandb -0x1e240\(%esp,%esi,8\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 62 72 7e[ 	]*vpexpandb 0x7e\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d af 62 31[ 	]*vpexpandb \(%ecx\),%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 62 b4 f4 c0 1d fe ff[ 	]*vpexpandb -0x1e240\(%esp,%esi,8\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 62 72 7e[ 	]*vpexpandb 0x7e\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 62 f5[ 	]*vpexpandb %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 8f 62 f5[ 	]*vpexpandb %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 62 f5[ 	]*vpexpandb %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d af 62 f5[ 	]*vpexpandb %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 8f 62 31[ 	]*vpexpandw \(%ecx\),%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 62 b4 f4 c0 1d fe ff[ 	]*vpexpandw -0x1e240\(%esp,%esi,8\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 62 72 40[ 	]*vpexpandw 0x80\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd af 62 31[ 	]*vpexpandw \(%ecx\),%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 62 b4 f4 c0 1d fe ff[ 	]*vpexpandw -0x1e240\(%esp,%esi,8\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 62 72 40[ 	]*vpexpandw 0x80\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 62 f5[ 	]*vpexpandw %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 8f 62 f5[ 	]*vpexpandw %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 62 f5[ 	]*vpexpandw %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd af 62 f5[ 	]*vpexpandw %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 70 f4[ 	]*vpshldvw %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 8f 70 f4[ 	]*vpshldvw %xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 70 b4 f4 c0 1d fe ff[ 	]*vpshldvw -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 70 72 7f[ 	]*vpshldvw 0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 70 f4[ 	]*vpshldvw %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 af 70 f4[ 	]*vpshldvw %ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 70 b4 f4 c0 1d fe ff[ 	]*vpshldvw -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 70 72 7f[ 	]*vpshldvw 0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 71 f4[ 	]*vpshldvd %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 8f 71 f4[ 	]*vpshldvd %xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 71 b4 f4 c0 1d fe ff[ 	]*vpshldvd -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 71 72 7f[ 	]*vpshldvd 0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 1f 71 72 7f[ 	]*vpshldvd 0x1fc\(%edx\)\{1to4\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 71 f4[ 	]*vpshldvd %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 af 71 f4[ 	]*vpshldvd %ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 71 b4 f4 c0 1d fe ff[ 	]*vpshldvd -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 71 72 7f[ 	]*vpshldvd 0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 3f 71 72 7f[ 	]*vpshldvd 0x1fc\(%edx\)\{1to8\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 71 f4[ 	]*vpshldvq %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 8f 71 f4[ 	]*vpshldvq %xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 71 b4 f4 c0 1d fe ff[ 	]*vpshldvq -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 71 72 7f[ 	]*vpshldvq 0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f 71 72 7f[ 	]*vpshldvq 0x3f8\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 71 f4[ 	]*vpshldvq %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 af 71 f4[ 	]*vpshldvq %ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 71 b4 f4 c0 1d fe ff[ 	]*vpshldvq -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 71 72 7f[ 	]*vpshldvq 0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f 71 72 7f[ 	]*vpshldvq 0x3f8\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 72 f4[ 	]*vpshrdvw %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 8f 72 f4[ 	]*vpshrdvw %xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 72 b4 f4 c0 1d fe ff[ 	]*vpshrdvw -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 72 72 7f[ 	]*vpshrdvw 0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 72 f4[ 	]*vpshrdvw %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 af 72 f4[ 	]*vpshrdvw %ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 72 b4 f4 c0 1d fe ff[ 	]*vpshrdvw -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 72 72 7f[ 	]*vpshrdvw 0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 73 f4[ 	]*vpshrdvd %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 8f 73 f4[ 	]*vpshrdvd %xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 73 b4 f4 c0 1d fe ff[ 	]*vpshrdvd -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 73 72 7f[ 	]*vpshrdvd 0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 1f 73 72 7f[ 	]*vpshrdvd 0x1fc\(%edx\)\{1to4\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 73 f4[ 	]*vpshrdvd %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 af 73 f4[ 	]*vpshrdvd %ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 73 b4 f4 c0 1d fe ff[ 	]*vpshrdvd -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 73 72 7f[ 	]*vpshrdvd 0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 3f 73 72 7f[ 	]*vpshrdvd 0x1fc\(%edx\)\{1to8\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 73 f4[ 	]*vpshrdvq %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 8f 73 f4[ 	]*vpshrdvq %xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 73 b4 f4 c0 1d fe ff[ 	]*vpshrdvq -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 73 72 7f[ 	]*vpshrdvq 0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f 73 72 7f[ 	]*vpshrdvq 0x3f8\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 73 f4[ 	]*vpshrdvq %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 af 73 f4[ 	]*vpshrdvq %ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 73 b4 f4 c0 1d fe ff[ 	]*vpshrdvq -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 73 72 7f[ 	]*vpshrdvq 0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f 73 72 7f[ 	]*vpshrdvq 0x3f8\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 0f 70 f4 ab[ 	]*vpshldw \$0xab,%xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 8f 70 f4 ab[ 	]*vpshldw \$0xab,%xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 0f 70 b4 f4 c0 1d fe ff 7b[ 	]*vpshldw \$0x7b,-0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 0f 70 72 7f 7b[ 	]*vpshldw \$0x7b,0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 2f 70 f4 ab[ 	]*vpshldw \$0xab,%ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 af 70 f4 ab[ 	]*vpshldw \$0xab,%ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 2f 70 b4 f4 c0 1d fe ff 7b[ 	]*vpshldw \$0x7b,-0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 2f 70 72 7f 7b[ 	]*vpshldw \$0x7b,0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 0f 71 f4 ab[ 	]*vpshldd \$0xab,%xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 8f 71 f4 ab[ 	]*vpshldd \$0xab,%xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 0f 71 b4 f4 c0 1d fe ff 7b[ 	]*vpshldd \$0x7b,-0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 0f 71 72 7f 7b[ 	]*vpshldd \$0x7b,0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 1f 71 72 7f 7b[ 	]*vpshldd \$0x7b,0x1fc\(%edx\)\{1to4\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 2f 71 f4 ab[ 	]*vpshldd \$0xab,%ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 af 71 f4 ab[ 	]*vpshldd \$0xab,%ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 2f 71 b4 f4 c0 1d fe ff 7b[ 	]*vpshldd \$0x7b,-0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 2f 71 72 7f 7b[ 	]*vpshldd \$0x7b,0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 3f 71 72 7f 7b[ 	]*vpshldd \$0x7b,0x1fc\(%edx\)\{1to8\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 0f 71 f4 ab[ 	]*vpshldq \$0xab,%xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 8f 71 f4 ab[ 	]*vpshldq \$0xab,%xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 0f 71 b4 f4 c0 1d fe ff 7b[ 	]*vpshldq \$0x7b,-0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 0f 71 72 7f 7b[ 	]*vpshldq \$0x7b,0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 1f 71 72 7f 7b[ 	]*vpshldq \$0x7b,0x3f8\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 2f 71 f4 ab[ 	]*vpshldq \$0xab,%ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 af 71 f4 ab[ 	]*vpshldq \$0xab,%ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 2f 71 b4 f4 c0 1d fe ff 7b[ 	]*vpshldq \$0x7b,-0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 2f 71 72 7f 7b[ 	]*vpshldq \$0x7b,0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 3f 71 72 7f 7b[ 	]*vpshldq \$0x7b,0x3f8\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 0f 72 f4 ab[ 	]*vpshrdw \$0xab,%xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 8f 72 f4 ab[ 	]*vpshrdw \$0xab,%xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 0f 72 b4 f4 c0 1d fe ff 7b[ 	]*vpshrdw \$0x7b,-0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 0f 72 72 7f 7b[ 	]*vpshrdw \$0x7b,0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 2f 72 f4 ab[ 	]*vpshrdw \$0xab,%ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 af 72 f4 ab[ 	]*vpshrdw \$0xab,%ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 2f 72 b4 f4 c0 1d fe ff 7b[ 	]*vpshrdw \$0x7b,-0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 2f 72 72 7f 7b[ 	]*vpshrdw \$0x7b,0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 0f 73 f4 ab[ 	]*vpshrdd \$0xab,%xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 8f 73 f4 ab[ 	]*vpshrdd \$0xab,%xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 0f 73 b4 f4 c0 1d fe ff 7b[ 	]*vpshrdd \$0x7b,-0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 0f 73 72 7f 7b[ 	]*vpshrdd \$0x7b,0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 1f 73 72 7f 7b[ 	]*vpshrdd \$0x7b,0x1fc\(%edx\)\{1to4\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 2f 73 f4 ab[ 	]*vpshrdd \$0xab,%ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 af 73 f4 ab[ 	]*vpshrdd \$0xab,%ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 2f 73 b4 f4 c0 1d fe ff 7b[ 	]*vpshrdd \$0x7b,-0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 2f 73 72 7f 7b[ 	]*vpshrdd \$0x7b,0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 3f 73 72 7f 7b[ 	]*vpshrdd \$0x7b,0x1fc\(%edx\)\{1to8\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 0f 73 f4 ab[ 	]*vpshrdq \$0xab,%xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 8f 73 f4 ab[ 	]*vpshrdq \$0xab,%xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 0f 73 b4 f4 c0 1d fe ff 7b[ 	]*vpshrdq \$0x7b,-0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 0f 73 72 7f 7b[ 	]*vpshrdq \$0x7b,0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 1f 73 72 7f 7b[ 	]*vpshrdq \$0x7b,0x3f8\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 2f 73 f4 ab[ 	]*vpshrdq \$0xab,%ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 af 73 f4 ab[ 	]*vpshrdq \$0xab,%ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 2f 73 b4 f4 c0 1d fe ff 7b[ 	]*vpshrdq \$0x7b,-0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 2f 73 72 7f 7b[ 	]*vpshrdq \$0x7b,0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 3f 73 72 7f 7b[ 	]*vpshrdq \$0x7b,0x3f8\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 63 b4 f4 c0 1d fe ff[ 	]*vpcompressb %xmm6,-0x1e240\(%esp,%esi,8\)\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 63 72 7e[ 	]*vpcompressb %xmm6,0x7e\(%edx\)\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 63 b4 f4 c0 1d fe ff[ 	]*vpcompressb %ymm6,-0x1e240\(%esp,%esi,8\)\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 63 72 7e[ 	]*vpcompressb %ymm6,0x7e\(%edx\)\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 63 ee[ 	]*vpcompressb %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 8f 63 ee[ 	]*vpcompressb %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 63 ee[ 	]*vpcompressb %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d af 63 ee[ 	]*vpcompressb %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 63 b4 f4 c0 1d fe ff[ 	]*vpcompressw %xmm6,-0x1e240\(%esp,%esi,8\)\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 63 72 40[ 	]*vpcompressw %xmm6,0x80\(%edx\)\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 63 b4 f4 c0 1d fe ff[ 	]*vpcompressw %ymm6,-0x1e240\(%esp,%esi,8\)\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 63 72 40[ 	]*vpcompressw %ymm6,0x80\(%edx\)\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 63 ee[ 	]*vpcompressw %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 8f 63 ee[ 	]*vpcompressw %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 63 ee[ 	]*vpcompressw %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd af 63 ee[ 	]*vpcompressw %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 8f 62 31[ 	]*vpexpandb \(%ecx\),%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 62 b4 f4 c0 1d fe ff[ 	]*vpexpandb -0x1e240\(%esp,%esi,8\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 62 72 7e[ 	]*vpexpandb 0x7e\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d af 62 31[ 	]*vpexpandb \(%ecx\),%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 62 b4 f4 c0 1d fe ff[ 	]*vpexpandb -0x1e240\(%esp,%esi,8\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 62 72 7e[ 	]*vpexpandb 0x7e\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 62 f5[ 	]*vpexpandb %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 8f 62 f5[ 	]*vpexpandb %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 62 f5[ 	]*vpexpandb %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d af 62 f5[ 	]*vpexpandb %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 8f 62 31[ 	]*vpexpandw \(%ecx\),%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 62 b4 f4 c0 1d fe ff[ 	]*vpexpandw -0x1e240\(%esp,%esi,8\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 62 72 40[ 	]*vpexpandw 0x80\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd af 62 31[ 	]*vpexpandw \(%ecx\),%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 62 b4 f4 c0 1d fe ff[ 	]*vpexpandw -0x1e240\(%esp,%esi,8\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 62 72 40[ 	]*vpexpandw 0x80\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 62 f5[ 	]*vpexpandw %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 8f 62 f5[ 	]*vpexpandw %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 62 f5[ 	]*vpexpandw %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd af 62 f5[ 	]*vpexpandw %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 70 f4[ 	]*vpshldvw %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 8f 70 f4[ 	]*vpshldvw %xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 70 b4 f4 c0 1d fe ff[ 	]*vpshldvw -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 70 72 7f[ 	]*vpshldvw 0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 70 f4[ 	]*vpshldvw %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 af 70 f4[ 	]*vpshldvw %ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 70 b4 f4 c0 1d fe ff[ 	]*vpshldvw -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 70 72 7f[ 	]*vpshldvw 0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 71 f4[ 	]*vpshldvd %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 8f 71 f4[ 	]*vpshldvd %xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 71 b4 f4 c0 1d fe ff[ 	]*vpshldvd -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 71 72 7f[ 	]*vpshldvd 0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 1f 71 72 7f[ 	]*vpshldvd 0x1fc\(%edx\)\{1to4\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 71 f4[ 	]*vpshldvd %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 af 71 f4[ 	]*vpshldvd %ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 71 b4 f4 c0 1d fe ff[ 	]*vpshldvd -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 71 72 7f[ 	]*vpshldvd 0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 3f 71 72 7f[ 	]*vpshldvd 0x1fc\(%edx\)\{1to8\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 71 f4[ 	]*vpshldvq %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 8f 71 f4[ 	]*vpshldvq %xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 71 b4 f4 c0 1d fe ff[ 	]*vpshldvq -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 71 72 7f[ 	]*vpshldvq 0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f 71 72 7f[ 	]*vpshldvq 0x3f8\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 71 f4[ 	]*vpshldvq %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 af 71 f4[ 	]*vpshldvq %ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 71 b4 f4 c0 1d fe ff[ 	]*vpshldvq -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 71 72 7f[ 	]*vpshldvq 0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f 71 72 7f[ 	]*vpshldvq 0x3f8\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 72 f4[ 	]*vpshrdvw %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 8f 72 f4[ 	]*vpshrdvw %xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 72 b4 f4 c0 1d fe ff[ 	]*vpshrdvw -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 72 72 7f[ 	]*vpshrdvw 0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 72 f4[ 	]*vpshrdvw %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 af 72 f4[ 	]*vpshrdvw %ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 72 b4 f4 c0 1d fe ff[ 	]*vpshrdvw -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 72 72 7f[ 	]*vpshrdvw 0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 73 f4[ 	]*vpshrdvd %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 8f 73 f4[ 	]*vpshrdvd %xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 73 b4 f4 c0 1d fe ff[ 	]*vpshrdvd -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 73 72 7f[ 	]*vpshrdvd 0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 1f 73 72 7f[ 	]*vpshrdvd 0x1fc\(%edx\)\{1to4\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 73 f4[ 	]*vpshrdvd %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 af 73 f4[ 	]*vpshrdvd %ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 73 b4 f4 c0 1d fe ff[ 	]*vpshrdvd -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 2f 73 72 7f[ 	]*vpshrdvd 0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 3f 73 72 7f[ 	]*vpshrdvd 0x1fc\(%edx\)\{1to8\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 73 f4[ 	]*vpshrdvq %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 8f 73 f4[ 	]*vpshrdvq %xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 73 b4 f4 c0 1d fe ff[ 	]*vpshrdvq -0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f 73 72 7f[ 	]*vpshrdvq 0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 1f 73 72 7f[ 	]*vpshrdvq 0x3f8\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 73 f4[ 	]*vpshrdvq %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 af 73 f4[ 	]*vpshrdvq %ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 73 b4 f4 c0 1d fe ff[ 	]*vpshrdvq -0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 2f 73 72 7f[ 	]*vpshrdvq 0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 3f 73 72 7f[ 	]*vpshrdvq 0x3f8\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 0f 70 f4 ab[ 	]*vpshldw \$0xab,%xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 8f 70 f4 ab[ 	]*vpshldw \$0xab,%xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 0f 70 b4 f4 c0 1d fe ff 7b[ 	]*vpshldw \$0x7b,-0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 0f 70 72 7f 7b[ 	]*vpshldw \$0x7b,0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 2f 70 f4 ab[ 	]*vpshldw \$0xab,%ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 af 70 f4 ab[ 	]*vpshldw \$0xab,%ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 2f 70 b4 f4 c0 1d fe ff 7b[ 	]*vpshldw \$0x7b,-0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 2f 70 72 7f 7b[ 	]*vpshldw \$0x7b,0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 0f 71 f4 ab[ 	]*vpshldd \$0xab,%xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 8f 71 f4 ab[ 	]*vpshldd \$0xab,%xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 0f 71 b4 f4 c0 1d fe ff 7b[ 	]*vpshldd \$0x7b,-0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 0f 71 72 7f 7b[ 	]*vpshldd \$0x7b,0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 1f 71 72 7f 7b[ 	]*vpshldd \$0x7b,0x1fc\(%edx\)\{1to4\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 2f 71 f4 ab[ 	]*vpshldd \$0xab,%ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 af 71 f4 ab[ 	]*vpshldd \$0xab,%ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 2f 71 b4 f4 c0 1d fe ff 7b[ 	]*vpshldd \$0x7b,-0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 2f 71 72 7f 7b[ 	]*vpshldd \$0x7b,0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 3f 71 72 7f 7b[ 	]*vpshldd \$0x7b,0x1fc\(%edx\)\{1to8\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 0f 71 f4 ab[ 	]*vpshldq \$0xab,%xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 8f 71 f4 ab[ 	]*vpshldq \$0xab,%xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 0f 71 b4 f4 c0 1d fe ff 7b[ 	]*vpshldq \$0x7b,-0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 0f 71 72 7f 7b[ 	]*vpshldq \$0x7b,0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 1f 71 72 7f 7b[ 	]*vpshldq \$0x7b,0x3f8\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 2f 71 f4 ab[ 	]*vpshldq \$0xab,%ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 af 71 f4 ab[ 	]*vpshldq \$0xab,%ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 2f 71 b4 f4 c0 1d fe ff 7b[ 	]*vpshldq \$0x7b,-0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 2f 71 72 7f 7b[ 	]*vpshldq \$0x7b,0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 3f 71 72 7f 7b[ 	]*vpshldq \$0x7b,0x3f8\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 0f 72 f4 ab[ 	]*vpshrdw \$0xab,%xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 8f 72 f4 ab[ 	]*vpshrdw \$0xab,%xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 0f 72 b4 f4 c0 1d fe ff 7b[ 	]*vpshrdw \$0x7b,-0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 0f 72 72 7f 7b[ 	]*vpshrdw \$0x7b,0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 2f 72 f4 ab[ 	]*vpshrdw \$0xab,%ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 af 72 f4 ab[ 	]*vpshrdw \$0xab,%ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 2f 72 b4 f4 c0 1d fe ff 7b[ 	]*vpshrdw \$0x7b,-0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 2f 72 72 7f 7b[ 	]*vpshrdw \$0x7b,0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 0f 73 f4 ab[ 	]*vpshrdd \$0xab,%xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 8f 73 f4 ab[ 	]*vpshrdd \$0xab,%xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 0f 73 b4 f4 c0 1d fe ff 7b[ 	]*vpshrdd \$0x7b,-0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 0f 73 72 7f 7b[ 	]*vpshrdd \$0x7b,0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 1f 73 72 7f 7b[ 	]*vpshrdd \$0x7b,0x1fc\(%edx\)\{1to4\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 2f 73 f4 ab[ 	]*vpshrdd \$0xab,%ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 af 73 f4 ab[ 	]*vpshrdd \$0xab,%ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 2f 73 b4 f4 c0 1d fe ff 7b[ 	]*vpshrdd \$0x7b,-0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 2f 73 72 7f 7b[ 	]*vpshrdd \$0x7b,0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 3f 73 72 7f 7b[ 	]*vpshrdd \$0x7b,0x1fc\(%edx\)\{1to8\},%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 0f 73 f4 ab[ 	]*vpshrdq \$0xab,%xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 8f 73 f4 ab[ 	]*vpshrdq \$0xab,%xmm4,%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 0f 73 b4 f4 c0 1d fe ff 7b[ 	]*vpshrdq \$0x7b,-0x1e240\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 0f 73 72 7f 7b[ 	]*vpshrdq \$0x7b,0x7f0\(%edx\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 1f 73 72 7f 7b[ 	]*vpshrdq \$0x7b,0x3f8\(%edx\)\{1to2\},%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 2f 73 f4 ab[ 	]*vpshrdq \$0xab,%ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 af 73 f4 ab[ 	]*vpshrdq \$0xab,%ymm4,%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 2f 73 b4 f4 c0 1d fe ff 7b[ 	]*vpshrdq \$0x7b,-0x1e240\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 2f 73 72 7f 7b[ 	]*vpshrdq \$0x7b,0xfe0\(%edx\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 3f 73 72 7f 7b[ 	]*vpshrdq \$0x7b,0x3f8\(%edx\)\{1to4\},%ymm5,%ymm6\{%k7\}
+
+0+[a-f0-9]+ <vnni>:
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 0b 52 d2[ 	]*vpdpwssd %xmm2,%xmm4,%xmm2\{%k3\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 8b 52 d2[ 	]*vpdpwssd %xmm2,%xmm4,%xmm2\{%k3\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 09 52 94 f4 c0 1d fe ff[ 	]*vpdpwssd -0x1e240\(%esp,%esi,8\),%xmm4,%xmm2\{%k1\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 09 52 52 7f[ 	]*vpdpwssd 0x7f0\(%edx\),%xmm4,%xmm2\{%k1\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 19 52 52 7f[ 	]*vpdpwssd 0x1fc\(%edx\)\{1to4\},%xmm4,%xmm2\{%k1\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 29 52 d9[ 	]*vpdpwssd %ymm1,%ymm3,%ymm3\{%k1\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 a9 52 d9[ 	]*vpdpwssd %ymm1,%ymm3,%ymm3\{%k1\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 2c 52 9c f4 c0 1d fe ff[ 	]*vpdpwssd -0x1e240\(%esp,%esi,8\),%ymm3,%ymm3\{%k4\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 2c 52 5a 7f[ 	]*vpdpwssd 0xfe0\(%edx\),%ymm3,%ymm3\{%k4\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 3c 52 5a 7f[ 	]*vpdpwssd 0x1fc\(%edx\)\{1to8\},%ymm3,%ymm3\{%k4\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 09 53 d1[ 	]*vpdpwssds %xmm1,%xmm4,%xmm2\{%k1\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 89 53 d1[ 	]*vpdpwssds %xmm1,%xmm4,%xmm2\{%k1\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 0c 53 94 f4 c0 1d fe ff[ 	]*vpdpwssds -0x1e240\(%esp,%esi,8\),%xmm4,%xmm2\{%k4\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 0c 53 52 7f[ 	]*vpdpwssds 0x7f0\(%edx\),%xmm4,%xmm2\{%k4\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 1c 53 52 7f[ 	]*vpdpwssds 0x1fc\(%edx\)\{1to4\},%xmm4,%xmm2\{%k4\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 75 2f 53 e4[ 	]*vpdpwssds %ymm4,%ymm1,%ymm4\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 75 af 53 e4[ 	]*vpdpwssds %ymm4,%ymm1,%ymm4\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 75 2b 53 a4 f4 c0 1d fe ff[ 	]*vpdpwssds -0x1e240\(%esp,%esi,8\),%ymm1,%ymm4\{%k3\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 75 2b 53 62 7f[ 	]*vpdpwssds 0xfe0\(%edx\),%ymm1,%ymm4\{%k3\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 75 3b 53 62 7f[ 	]*vpdpwssds 0x1fc\(%edx\)\{1to8\},%ymm1,%ymm4\{%k3\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 0c 50 d1[ 	]*vpdpbusd %xmm1,%xmm3,%xmm2\{%k4\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 8c 50 d1[ 	]*vpdpbusd %xmm1,%xmm3,%xmm2\{%k4\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 0a 50 94 f4 c0 1d fe ff[ 	]*vpdpbusd -0x1e240\(%esp,%esi,8\),%xmm3,%xmm2\{%k2\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 0a 50 52 7f[ 	]*vpdpbusd 0x7f0\(%edx\),%xmm3,%xmm2\{%k2\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 1a 50 52 7f[ 	]*vpdpbusd 0x1fc\(%edx\)\{1to4\},%xmm3,%xmm2\{%k2\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 2d 50 d2[ 	]*vpdpbusd %ymm2,%ymm2,%ymm2\{%k5\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d ad 50 d2[ 	]*vpdpbusd %ymm2,%ymm2,%ymm2\{%k5\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 2f 50 94 f4 c0 1d fe ff[ 	]*vpdpbusd -0x1e240\(%esp,%esi,8\),%ymm2,%ymm2\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 2f 50 52 7f[ 	]*vpdpbusd 0xfe0\(%edx\),%ymm2,%ymm2\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 3f 50 52 7f[ 	]*vpdpbusd 0x1fc\(%edx\)\{1to8\},%ymm2,%ymm2\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 0e 51 f4[ 	]*vpdpbusds %xmm4,%xmm2,%xmm6\{%k6\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 8e 51 f4[ 	]*vpdpbusds %xmm4,%xmm2,%xmm6\{%k6\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 0c 51 b4 f4 c0 1d fe ff[ 	]*vpdpbusds -0x1e240\(%esp,%esi,8\),%xmm2,%xmm6\{%k4\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 0c 51 72 7f[ 	]*vpdpbusds 0x7f0\(%edx\),%xmm2,%xmm6\{%k4\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 1c 51 72 7f[ 	]*vpdpbusds 0x1fc\(%edx\)\{1to4\},%xmm2,%xmm6\{%k4\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 2f 51 e1[ 	]*vpdpbusds %ymm1,%ymm3,%ymm4\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 af 51 e1[ 	]*vpdpbusds %ymm1,%ymm3,%ymm4\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 29 51 a4 f4 c0 1d fe ff[ 	]*vpdpbusds -0x1e240\(%esp,%esi,8\),%ymm3,%ymm4\{%k1\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 29 51 62 7f[ 	]*vpdpbusds 0xfe0\(%edx\),%ymm3,%ymm4\{%k1\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 39 51 62 7f[ 	]*vpdpbusds 0x1fc\(%edx\)\{1to8\},%ymm3,%ymm4\{%k1\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 09 52 ea[ 	]*vpdpwssd %xmm2,%xmm2,%xmm5\{%k1\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 89 52 ea[ 	]*vpdpwssd %xmm2,%xmm2,%xmm5\{%k1\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 0e 52 ac f4 c0 1d fe ff[ 	]*vpdpwssd -0x1e240\(%esp,%esi,8\),%xmm2,%xmm5\{%k6\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 0e 52 6a 7f[ 	]*vpdpwssd 0x7f0\(%edx\),%xmm2,%xmm5\{%k6\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 1e 52 6a 7f[ 	]*vpdpwssd 0x1fc\(%edx\)\{1to4\},%xmm2,%xmm5\{%k6\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 2f 52 cc[ 	]*vpdpwssd %ymm4,%ymm2,%ymm1\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d af 52 cc[ 	]*vpdpwssd %ymm4,%ymm2,%ymm1\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 2e 52 8c f4 c0 1d fe ff[ 	]*vpdpwssd -0x1e240\(%esp,%esi,8\),%ymm2,%ymm1\{%k6\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 2e 52 4a 7f[ 	]*vpdpwssd 0xfe0\(%edx\),%ymm2,%ymm1\{%k6\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 3e 52 4a 7f[ 	]*vpdpwssd 0x1fc\(%edx\)\{1to8\},%ymm2,%ymm1\{%k6\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 0a 53 c9[ 	]*vpdpwssds %xmm1,%xmm4,%xmm1\{%k2\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 8a 53 c9[ 	]*vpdpwssds %xmm1,%xmm4,%xmm1\{%k2\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 0e 53 8c f4 c0 1d fe ff[ 	]*vpdpwssds -0x1e240\(%esp,%esi,8\),%xmm4,%xmm1\{%k6\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 0e 53 4a 7f[ 	]*vpdpwssds 0x7f0\(%edx\),%xmm4,%xmm1\{%k6\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 1e 53 4a 7f[ 	]*vpdpwssds 0x1fc\(%edx\)\{1to4\},%xmm4,%xmm1\{%k6\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 2c 53 dc[ 	]*vpdpwssds %ymm4,%ymm2,%ymm3\{%k4\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d ac 53 dc[ 	]*vpdpwssds %ymm4,%ymm2,%ymm3\{%k4\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 2d 53 9c f4 c0 1d fe ff[ 	]*vpdpwssds -0x1e240\(%esp,%esi,8\),%ymm2,%ymm3\{%k5\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 2d 53 5a 7f[ 	]*vpdpwssds 0xfe0\(%edx\),%ymm2,%ymm3\{%k5\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 3d 53 5a 7f[ 	]*vpdpwssds 0x1fc\(%edx\)\{1to8\},%ymm2,%ymm3\{%k5\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 0f 50 dc[ 	]*vpdpbusd %xmm4,%xmm4,%xmm3\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 8f 50 dc[ 	]*vpdpbusd %xmm4,%xmm4,%xmm3\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 09 50 9c f4 c0 1d fe ff[ 	]*vpdpbusd -0x1e240\(%esp,%esi,8\),%xmm4,%xmm3\{%k1\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 09 50 5a 7f[ 	]*vpdpbusd 0x7f0\(%edx\),%xmm4,%xmm3\{%k1\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 19 50 5a 7f[ 	]*vpdpbusd 0x1fc\(%edx\)\{1to4\},%xmm4,%xmm3\{%k1\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 2d 50 f4[ 	]*vpdpbusd %ymm4,%ymm2,%ymm6\{%k5\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d ad 50 f4[ 	]*vpdpbusd %ymm4,%ymm2,%ymm6\{%k5\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 2d 50 b4 f4 c0 1d fe ff[ 	]*vpdpbusd -0x1e240\(%esp,%esi,8\),%ymm2,%ymm6\{%k5\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 2d 50 72 7f[ 	]*vpdpbusd 0xfe0\(%edx\),%ymm2,%ymm6\{%k5\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 3d 50 72 7f[ 	]*vpdpbusd 0x1fc\(%edx\)\{1to8\},%ymm2,%ymm6\{%k5\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 0d 51 dc[ 	]*vpdpbusds %xmm4,%xmm3,%xmm3\{%k5\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 8d 51 dc[ 	]*vpdpbusds %xmm4,%xmm3,%xmm3\{%k5\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 0c 51 9c f4 c0 1d fe ff[ 	]*vpdpbusds -0x1e240\(%esp,%esi,8\),%xmm3,%xmm3\{%k4\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 0c 51 5a 7f[ 	]*vpdpbusds 0x7f0\(%edx\),%xmm3,%xmm3\{%k4\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 1c 51 5a 7f[ 	]*vpdpbusds 0x1fc\(%edx\)\{1to4\},%xmm3,%xmm3\{%k4\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 2c 51 d4[ 	]*vpdpbusds %ymm4,%ymm3,%ymm2\{%k4\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 ac 51 d4[ 	]*vpdpbusds %ymm4,%ymm3,%ymm2\{%k4\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 29 51 94 f4 c0 1d fe ff[ 	]*vpdpbusds -0x1e240\(%esp,%esi,8\),%ymm3,%ymm2\{%k1\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 29 51 52 7f[ 	]*vpdpbusds 0xfe0\(%edx\),%ymm3,%ymm2\{%k1\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 39 51 52 7f[ 	]*vpdpbusds 0x1fc\(%edx\)\{1to8\},%ymm3,%ymm2\{%k1\}
+
+0+[a-f0-9]+ <bf16>:
+[ 	]*[a-f0-9]+:	62 f2 57 28 72 f4    	vcvtne2ps2bf16 %ymm4,%ymm5,%ymm6
+[ 	]*[a-f0-9]+:	62 f2 57 08 72 f4    	vcvtne2ps2bf16 %xmm4,%xmm5,%xmm6
+[ 	]*[a-f0-9]+:	62 f2 57 2f 72 b4 f4 00 00 00 10 	vcvtne2ps2bf16 0x10000000\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:	62 f2 57 38 72 31    	vcvtne2ps2bf16 \(%ecx\)\{1to8\},%ymm5,%ymm6
+[ 	]*[a-f0-9]+:	62 f2 57 28 72 71 7f 	vcvtne2ps2bf16 0xfe0\(%ecx\),%ymm5,%ymm6
+[ 	]*[a-f0-9]+:	62 f2 57 bf 72 b2 00 f0 ff ff 	vcvtne2ps2bf16 -0x1000\(%edx\)\{1to8\},%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:	62 f2 57 0f 72 b4 f4 00 00 00 10 	vcvtne2ps2bf16 0x10000000\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:	62 f2 57 18 72 31    	vcvtne2ps2bf16 \(%ecx\)\{1to4\},%xmm5,%xmm6
+[ 	]*[a-f0-9]+:	62 f2 57 08 72 71 7f 	vcvtne2ps2bf16 0x7f0\(%ecx\),%xmm5,%xmm6
+[ 	]*[a-f0-9]+:	62 f2 57 9f 72 b2 00 f8 ff ff 	vcvtne2ps2bf16 -0x800\(%edx\)\{1to4\},%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:	62 f2 7e 08 72 f5    	vcvtneps2bf16 %xmm5,%xmm6
+[ 	]*[a-f0-9]+:	62 f2 7e 28 72 f5    	vcvtneps2bf16 %ymm5,%xmm6
+[ 	]*[a-f0-9]+:	62 f2 7e 0f 72 b4 f4 00 00 00 10 	vcvtneps2bf16x 0x10000000\(%esp,%esi,8\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:	62 f2 7e 18 72 31    	vcvtneps2bf16 \(%ecx\)\{1to4\},%xmm6
+[ 	]*[a-f0-9]+:	62 f2 7e 18 72 31    	vcvtneps2bf16 \(%ecx\)\{1to4\},%xmm6
+[ 	]*[a-f0-9]+:	62 f2 7e 08 72 71 7f 	vcvtneps2bf16x 0x7f0\(%ecx\),%xmm6
+[ 	]*[a-f0-9]+:	62 f2 7e 9f 72 b2 00 f8 ff ff 	vcvtneps2bf16 -0x800\(%edx\)\{1to4\},%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:	62 f2 7e 38 72 31    	vcvtneps2bf16 \(%ecx\)\{1to8\},%xmm6
+[ 	]*[a-f0-9]+:	62 f2 7e 38 72 31    	vcvtneps2bf16 \(%ecx\)\{1to8\},%xmm6
+[ 	]*[a-f0-9]+:	62 f2 7e 28 72 71 7f 	vcvtneps2bf16y 0xfe0\(%ecx\),%xmm6
+[ 	]*[a-f0-9]+:	62 f2 7e bf 72 b2 00 f0 ff ff 	vcvtneps2bf16 -0x1000\(%edx\)\{1to8\},%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:	62 f2 56 28 52 f4    	vdpbf16ps %ymm4,%ymm5,%ymm6
+[ 	]*[a-f0-9]+:	62 f2 56 08 52 f4    	vdpbf16ps %xmm4,%xmm5,%xmm6
+[ 	]*[a-f0-9]+:	62 f2 56 2f 52 b4 f4 00 00 00 10 	vdpbf16ps 0x10000000\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:	62 f2 56 38 52 31    	vdpbf16ps \(%ecx\)\{1to8\},%ymm5,%ymm6
+[ 	]*[a-f0-9]+:	62 f2 56 28 52 71 7f 	vdpbf16ps 0xfe0\(%ecx\),%ymm5,%ymm6
+[ 	]*[a-f0-9]+:	62 f2 56 bf 52 b2 00 f0 ff ff 	vdpbf16ps -0x1000\(%edx\)\{1to8\},%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:	62 f2 56 0f 52 b4 f4 00 00 00 10 	vdpbf16ps 0x10000000\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:	62 f2 56 18 52 31    	vdpbf16ps \(%ecx\)\{1to4\},%xmm5,%xmm6
+[ 	]*[a-f0-9]+:	62 f2 56 08 52 71 7f 	vdpbf16ps 0x7f0\(%ecx\),%xmm5,%xmm6
+[ 	]*[a-f0-9]+:	62 f2 56 9f 52 b2 00 f8 ff ff 	vdpbf16ps -0x800\(%edx\)\{1to4\},%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:	62 f2 57 28 72 f4    	vcvtne2ps2bf16 %ymm4,%ymm5,%ymm6
+[ 	]*[a-f0-9]+:	62 f2 57 08 72 f4    	vcvtne2ps2bf16 %xmm4,%xmm5,%xmm6
+[ 	]*[a-f0-9]+:	62 f2 57 2f 72 b4 f4 00 00 00 10 	vcvtne2ps2bf16 0x10000000\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:	62 f2 57 38 72 31    	vcvtne2ps2bf16 \(%ecx\)\{1to8\},%ymm5,%ymm6
+[ 	]*[a-f0-9]+:	62 f2 57 28 72 71 7f 	vcvtne2ps2bf16 0xfe0\(%ecx\),%ymm5,%ymm6
+[ 	]*[a-f0-9]+:	62 f2 57 bf 72 b2 00 f0 ff ff 	vcvtne2ps2bf16 -0x1000\(%edx\)\{1to8\},%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:	62 f2 57 0f 72 b4 f4 00 00 00 10 	vcvtne2ps2bf16 0x10000000\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:	62 f2 57 18 72 31    	vcvtne2ps2bf16 \(%ecx\)\{1to4\},%xmm5,%xmm6
+[ 	]*[a-f0-9]+:	62 f2 57 08 72 71 7f 	vcvtne2ps2bf16 0x7f0\(%ecx\),%xmm5,%xmm6
+[ 	]*[a-f0-9]+:	62 f2 57 9f 72 b2 00 f8 ff ff 	vcvtne2ps2bf16 -0x800\(%edx\)\{1to4\},%xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:	62 f2 7e 08 72 f5    	vcvtneps2bf16 %xmm5,%xmm6
+[ 	]*[a-f0-9]+:	62 f2 7e 28 72 f5    	vcvtneps2bf16 %ymm5,%xmm6
+[ 	]*[a-f0-9]+:	62 f2 7e 0f 72 b4 f4 00 00 00 10 	vcvtneps2bf16x 0x10000000\(%esp,%esi,8\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:	62 f2 7e 18 72 31    	vcvtneps2bf16 \(%ecx\)\{1to4\},%xmm6
+[ 	]*[a-f0-9]+:	62 f2 7e 18 72 31    	vcvtneps2bf16 \(%ecx\)\{1to4\},%xmm6
+[ 	]*[a-f0-9]+:	62 f2 7e 08 72 71 7f 	vcvtneps2bf16x 0x7f0\(%ecx\),%xmm6
+[ 	]*[a-f0-9]+:	62 f2 7e 9f 72 b2 00 f8 ff ff 	vcvtneps2bf16 -0x800\(%edx\)\{1to4\},%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:	62 f2 7e 38 72 31    	vcvtneps2bf16 \(%ecx\)\{1to8\},%xmm6
+[ 	]*[a-f0-9]+:	62 f2 7e 38 72 31    	vcvtneps2bf16 \(%ecx\)\{1to8\},%xmm6
+[ 	]*[a-f0-9]+:	62 f2 7e 28 72 71 7f 	vcvtneps2bf16y 0xfe0\(%ecx\),%xmm6
+[ 	]*[a-f0-9]+:	62 f2 7e bf 72 b2 00 f0 ff ff 	vcvtneps2bf16 -0x1000\(%edx\)\{1to8\},%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:	62 f2 56 28 52 f4    	vdpbf16ps %ymm4,%ymm5,%ymm6
+[ 	]*[a-f0-9]+:	62 f2 56 08 52 f4    	vdpbf16ps %xmm4,%xmm5,%xmm6
+[ 	]*[a-f0-9]+:	62 f2 56 2f 52 b4 f4 00 00 00 10 	vdpbf16ps 0x10000000\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:	62 f2 56 38 52 31    	vdpbf16ps \(%ecx\)\{1to8\},%ymm5,%ymm6
+[ 	]*[a-f0-9]+:	62 f2 56 28 52 71 7f 	vdpbf16ps 0xfe0\(%ecx\),%ymm5,%ymm6
+[ 	]*[a-f0-9]+:	62 f2 56 bf 52 b2 00 f0 ff ff 	vdpbf16ps -0x1000\(%edx\)\{1to8\},%ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:	62 f2 56 0f 52 b4 f4 00 00 00 10 	vdpbf16ps 0x10000000\(%esp,%esi,8\),%xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:	62 f2 56 18 52 31    	vdpbf16ps \(%ecx\)\{1to4\},%xmm5,%xmm6
+[ 	]*[a-f0-9]+:	62 f2 56 08 52 71 7f 	vdpbf16ps 0x7f0\(%ecx\),%xmm5,%xmm6
+[ 	]*[a-f0-9]+:	62 f2 56 9f 52 b2 00 f8 ff ff 	vdpbf16ps -0x800\(%edx\)\{1to4\},%xmm5,%xmm6\{%k7\}\{z\}
+
+0+[a-f0-9]+ <vpopcnt>:
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 28 55 f5[ 	]*vpopcntd %ymm5,%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 55 f5[ 	]*vpopcntd %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d af 55 f5[ 	]*vpopcntd %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 28 55 31[ 	]*vpopcntd \(%ecx\),%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 28 55 b4 f4 c0 1d fe ff[ 	]*vpopcntd -0x1e240\(%esp,%esi,8\),%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 38 55 30[ 	]*vpopcntd \(%eax\)\{1to8\},%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 28 55 72 7f[ 	]*vpopcntd 0xfe0\(%edx\),%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 28 55 b2 00 10 00 00[ 	]*vpopcntd 0x1000\(%edx\),%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 28 55 72 80[ 	]*vpopcntd -0x1000\(%edx\),%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 28 55 b2 e0 ef ff ff[ 	]*vpopcntd -0x1020\(%edx\),%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 38 55 72 7f[ 	]*vpopcntd 0x1fc\(%edx\)\{1to8\},%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 38 55 b2 00 02 00 00[ 	]*vpopcntd 0x200\(%edx\)\{1to8\},%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 38 55 72 80[ 	]*vpopcntd -0x200\(%edx\)\{1to8\},%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 38 55 b2 fc fd ff ff[ 	]*vpopcntd -0x204\(%edx\)\{1to8\},%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 28 55 f5[ 	]*vpopcntq %ymm5,%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 55 f5[ 	]*vpopcntq %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd af 55 f5[ 	]*vpopcntq %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 28 55 31[ 	]*vpopcntq \(%ecx\),%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 28 55 b4 f4 c0 1d fe ff[ 	]*vpopcntq -0x1e240\(%esp,%esi,8\),%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 38 55 30[ 	]*vpopcntq \(%eax\)\{1to4\},%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 28 55 72 7f[ 	]*vpopcntq 0xfe0\(%edx\),%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 28 55 b2 00 10 00 00[ 	]*vpopcntq 0x1000\(%edx\),%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 28 55 72 80[ 	]*vpopcntq -0x1000\(%edx\),%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 28 55 b2 e0 ef ff ff[ 	]*vpopcntq -0x1020\(%edx\),%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 38 55 72 7f[ 	]*vpopcntq 0x3f8\(%edx\)\{1to4\},%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 38 55 b2 00 04 00 00[ 	]*vpopcntq 0x400\(%edx\)\{1to4\},%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 38 55 72 80[ 	]*vpopcntq -0x400\(%edx\)\{1to4\},%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 38 55 b2 f8 fb ff ff[ 	]*vpopcntq -0x408\(%edx\)\{1to4\},%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 08 55 f5[ 	]*vpopcntd %xmm5,%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 55 f5[ 	]*vpopcntd %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 8f 55 f5[ 	]*vpopcntd %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 08 55 31[ 	]*vpopcntd \(%ecx\),%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 08 55 b4 f4 c0 1d fe ff[ 	]*vpopcntd -0x1e240\(%esp,%esi,8\),%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 18 55 30[ 	]*vpopcntd \(%eax\)\{1to4\},%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 18 55 30[ 	]*vpopcntd \(%eax\)\{1to4\},%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 08 55 72 7f[ 	]*vpopcntd 0x7f0\(%edx\),%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 08 55 b2 00 08 00 00[ 	]*vpopcntd 0x800\(%edx\),%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 08 55 72 80[ 	]*vpopcntd -0x800\(%edx\),%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 08 55 b2 f0 f7 ff ff[ 	]*vpopcntd -0x810\(%edx\),%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 18 55 72 7f[ 	]*vpopcntd 0x1fc\(%edx\)\{1to4\},%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 18 55 b2 00 02 00 00[ 	]*vpopcntd 0x200\(%edx\)\{1to4\},%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 18 55 72 80[ 	]*vpopcntd -0x200\(%edx\)\{1to4\},%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 18 55 b2 fc fd ff ff[ 	]*vpopcntd -0x204\(%edx\)\{1to4\},%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 08 55 f5[ 	]*vpopcntq %xmm5,%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 55 f5[ 	]*vpopcntq %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 8f 55 f5[ 	]*vpopcntq %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 08 55 31[ 	]*vpopcntq \(%ecx\),%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 08 55 b4 f4 c0 1d fe ff[ 	]*vpopcntq -0x1e240\(%esp,%esi,8\),%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 18 55 30[ 	]*vpopcntq \(%eax\)\{1to2\},%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 18 55 30[ 	]*vpopcntq \(%eax\)\{1to2\},%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 08 55 72 7f[ 	]*vpopcntq 0x7f0\(%edx\),%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 08 55 b2 00 08 00 00[ 	]*vpopcntq 0x800\(%edx\),%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 08 55 72 80[ 	]*vpopcntq -0x800\(%edx\),%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 08 55 b2 f0 f7 ff ff[ 	]*vpopcntq -0x810\(%edx\),%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 18 55 72 7f[ 	]*vpopcntq 0x3f8\(%edx\)\{1to2\},%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 18 55 b2 00 04 00 00[ 	]*vpopcntq 0x400\(%edx\)\{1to2\},%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 18 55 72 80[ 	]*vpopcntq -0x400\(%edx\)\{1to2\},%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 18 55 b2 f8 fb ff ff[ 	]*vpopcntq -0x408\(%edx\)\{1to2\},%xmm6
+#pass
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-256-5.s
@@ -0,0 +1,25 @@
+	.arch generic32
+	.arch .avx10.1
+
+	.include "avx512bitalg_vl.s"
+
+	.att_syntax prefix
+	.include "avx512cd_vl.s"
+
+	.att_syntax prefix
+	.include "avx512ifma_vl.s"
+
+	.att_syntax prefix
+	.include "avx512vbmi_vl.s"
+
+	.att_syntax prefix
+	.include "avx512vbmi2_vl.s"
+
+	.att_syntax prefix
+	.include "avx512vnni_vl.s"
+
+	.att_syntax prefix
+	.include "avx512_bf16_vl.s"
+
+	.att_syntax prefix
+	.include "avx512_vpopcntdq_vl.s"
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-256-gfni.d
@@ -0,0 +1,4 @@
+#as: -I${srcdir}/$subdir
+#objdump: -dw
+#name: AVX10.1/256 + GFNI
+#dump: avx512vl_gfni.d
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-256-gfni.s
@@ -0,0 +1,4 @@
+	.arch generic32
+	.arch .avx10.1
+	.arch .gfni
+	.include "avx512vl_gfni.s"
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-256-vaes.d
@@ -0,0 +1,4 @@
+#as: -I${srcdir}/$subdir
+#objdump: -dw
+#name: AVX10.1/256 + VAES
+#dump: avx512vl_vaes.d
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-256-vaes.s
@@ -0,0 +1,4 @@
+	.arch generic32
+	.arch .avx10.1
+	.arch .vaes
+	.include "avx512vl_vaes.s"
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-256-vpclmulqdq.d
@@ -0,0 +1,4 @@
+#as: -I${srcdir}/$subdir
+#objdump: -dw
+#name: AVX10.1/256 + VPCLMULQDQ
+#dump: avx512vl_vpclmulqdq.d
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-256-vpclmulqdq.s
@@ -0,0 +1,4 @@
+	.arch generic32
+	.arch .avx10.1
+	.arch .vpclmulqdq
+	.include "avx512vl_vpclmulqdq.s"
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-512-1.d
@@ -0,0 +1,4 @@
+#as: -I${srcdir}/$subdir
+#objdump: -dw
+#name: AVX10.1/512 (part 1)
+#dump: avx512f.d
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-512-1.s
@@ -0,0 +1,3 @@
+	.arch generic32
+	.arch .avx10.1
+	.include "avx512f.s"
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-512-2.d
@@ -0,0 +1,4 @@
+#as: -I${srcdir}/$subdir
+#objdump: -dw
+#name: AVX10.1/512 (part 2)
+#dump: avx512bw.d
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-512-2.s
@@ -0,0 +1,3 @@
+	.arch generic32
+	.arch .avx10.1
+	.include "avx512bw.s"
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-512-3.d
@@ -0,0 +1,4 @@
+#as: -I${srcdir}/$subdir
+#objdump: -dw
+#name: AVX10.1/512 (part 3)
+#dump: avx512dq.d
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-512-3.s
@@ -0,0 +1,3 @@
+	.arch generic32
+	.arch .avx10.1
+	.include "avx512dq.s"
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-512-4.d
@@ -0,0 +1,4 @@
+#as: -I${srcdir}/$subdir
+#objdump: -dw
+#name: AVX10.1/512 (part 4)
+#dump: avx512_fp16.d
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-512-4.s
@@ -0,0 +1,3 @@
+	.arch generic32
+	.arch .avx10.1
+	.include "avx512_fp16.s"
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-512-5.d
@@ -0,0 +1,654 @@
+#as: -I${srcdir}/$subdir
+#objdump: -dw
+#name: AVX10.1/512 (part 5)
+
+.*: +file format .*
+
+
+Disassembly of section \.text:
+
+0+ <bitalg>:
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 8f ec[ 	]*vpshufbitqmb %zmm4,%zmm5,%k5
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 4f 8f ec[ 	]*vpshufbitqmb %zmm4,%zmm5,%k5\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 8f ac f4 c0 1d fe ff[ 	]*vpshufbitqmb -0x1e240\(%esp,%esi,8\),%zmm5,%k5
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 8f 6a 7f[ 	]*vpshufbitqmb 0x1fc0\(%edx\),%zmm5,%k5
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 54 f5[ 	]*vpopcntb %zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 4f 54 f5[ 	]*vpopcntb %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d cf 54 f5[ 	]*vpopcntb %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 54 b4 f4 c0 1d fe ff[ 	]*vpopcntb -0x1e240\(%esp,%esi,8\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 54 72 7f[ 	]*vpopcntb 0x1fc0\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 54 f5[ 	]*vpopcntw %zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 4f 54 f5[ 	]*vpopcntw %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd cf 54 f5[ 	]*vpopcntw %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 54 b4 f4 c0 1d fe ff[ 	]*vpopcntw -0x1e240\(%esp,%esi,8\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 54 72 7f[ 	]*vpopcntw 0x1fc0\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 55 f5[ 	]*vpopcntd %zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 4f 55 f5[ 	]*vpopcntd %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d cf 55 f5[ 	]*vpopcntd %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 55 b4 f4 c0 1d fe ff[ 	]*vpopcntd -0x1e240\(%esp,%esi,8\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 55 72 7f[ 	]*vpopcntd 0x1fc0\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 58 55 72 7f[ 	]*vpopcntd 0x1fc\(%edx\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 55 f5[ 	]*vpopcntq %zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 4f 55 f5[ 	]*vpopcntq %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd cf 55 f5[ 	]*vpopcntq %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 55 b4 f4 c0 1d fe ff[ 	]*vpopcntq -0x1e240\(%esp,%esi,8\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 55 72 7f[ 	]*vpopcntq 0x1fc0\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 58 55 72 7f[ 	]*vpopcntq 0x3f8\(%edx\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 8f ec[ 	]*vpshufbitqmb %zmm4,%zmm5,%k5
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 4f 8f ec[ 	]*vpshufbitqmb %zmm4,%zmm5,%k5\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 8f ac f4 c0 1d fe ff[ 	]*vpshufbitqmb -0x1e240\(%esp,%esi,8\),%zmm5,%k5
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 8f 6a 7f[ 	]*vpshufbitqmb 0x1fc0\(%edx\),%zmm5,%k5
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 54 f5[ 	]*vpopcntb %zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 4f 54 f5[ 	]*vpopcntb %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d cf 54 f5[ 	]*vpopcntb %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 54 b4 f4 c0 1d fe ff[ 	]*vpopcntb -0x1e240\(%esp,%esi,8\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 54 72 7f[ 	]*vpopcntb 0x1fc0\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 54 f5[ 	]*vpopcntw %zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 4f 54 f5[ 	]*vpopcntw %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd cf 54 f5[ 	]*vpopcntw %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 54 b4 f4 c0 1d fe ff[ 	]*vpopcntw -0x1e240\(%esp,%esi,8\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 54 72 7f[ 	]*vpopcntw 0x1fc0\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 55 f5[ 	]*vpopcntd %zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 4f 55 f5[ 	]*vpopcntd %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d cf 55 f5[ 	]*vpopcntd %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 55 b4 f4 c0 1d fe ff[ 	]*vpopcntd -0x1e240\(%esp,%esi,8\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 55 72 7f[ 	]*vpopcntd 0x1fc0\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 58 55 72 7f[ 	]*vpopcntd 0x1fc\(%edx\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 55 f5[ 	]*vpopcntq %zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 4f 55 f5[ 	]*vpopcntq %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd cf 55 f5[ 	]*vpopcntq %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 55 b4 f4 c0 1d fe ff[ 	]*vpopcntq -0x1e240\(%esp,%esi,8\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 55 72 7f[ 	]*vpopcntq 0x1fc0\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 58 55 72 7f[ 	]*vpopcntq 0x3f8\(%edx\)\{1to8\},%zmm6
+
+0+[a-f0-9]+ <cd>:
+[ 	]*[a-f0-9]+:	62 f2 7d 48 c4 f5    	vpconflictd %zmm5,%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 4f c4 f5    	vpconflictd %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:	62 f2 7d cf c4 f5    	vpconflictd %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:	62 f2 7d 48 c4 31    	vpconflictd \(%ecx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 48 c4 b4 f4 c0 1d fe ff 	vpconflictd -0x1e240\(%esp,%esi,8\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 58 c4 30    	vpconflictd \(%eax\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 48 c4 72 7f 	vpconflictd 0x1fc0\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 48 c4 b2 00 20 00 00 	vpconflictd 0x2000\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 48 c4 72 80 	vpconflictd -0x2000\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 48 c4 b2 c0 df ff ff 	vpconflictd -0x2040\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 58 c4 72 7f 	vpconflictd 0x1fc\(%edx\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 58 c4 b2 00 02 00 00 	vpconflictd 0x200\(%edx\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 58 c4 72 80 	vpconflictd -0x200\(%edx\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 58 c4 b2 fc fd ff ff 	vpconflictd -0x204\(%edx\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 48 c4 f5    	vpconflictq %zmm5,%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 4f c4 f5    	vpconflictq %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:	62 f2 fd cf c4 f5    	vpconflictq %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:	62 f2 fd 48 c4 31    	vpconflictq \(%ecx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 48 c4 b4 f4 c0 1d fe ff 	vpconflictq -0x1e240\(%esp,%esi,8\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 58 c4 30    	vpconflictq \(%eax\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 48 c4 72 7f 	vpconflictq 0x1fc0\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 48 c4 b2 00 20 00 00 	vpconflictq 0x2000\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 48 c4 72 80 	vpconflictq -0x2000\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 48 c4 b2 c0 df ff ff 	vpconflictq -0x2040\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 58 c4 72 7f 	vpconflictq 0x3f8\(%edx\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 58 c4 b2 00 04 00 00 	vpconflictq 0x400\(%edx\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 58 c4 72 80 	vpconflictq -0x400\(%edx\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 58 c4 b2 f8 fb ff ff 	vpconflictq -0x408\(%edx\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 48 44 f5    	vplzcntd %zmm5,%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 4f 44 f5    	vplzcntd %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:	62 f2 7d cf 44 f5    	vplzcntd %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:	62 f2 7d 48 44 31    	vplzcntd \(%ecx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 48 44 b4 f4 c0 1d fe ff 	vplzcntd -0x1e240\(%esp,%esi,8\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 58 44 30    	vplzcntd \(%eax\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 48 44 72 7f 	vplzcntd 0x1fc0\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 48 44 b2 00 20 00 00 	vplzcntd 0x2000\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 48 44 72 80 	vplzcntd -0x2000\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 48 44 b2 c0 df ff ff 	vplzcntd -0x2040\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 58 44 72 7f 	vplzcntd 0x1fc\(%edx\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 58 44 b2 00 02 00 00 	vplzcntd 0x200\(%edx\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 58 44 72 80 	vplzcntd -0x200\(%edx\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 58 44 b2 fc fd ff ff 	vplzcntd -0x204\(%edx\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 48 44 f5    	vplzcntq %zmm5,%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 4f 44 f5    	vplzcntq %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:	62 f2 fd cf 44 f5    	vplzcntq %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:	62 f2 fd 48 44 31    	vplzcntq \(%ecx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 48 44 b4 f4 c0 1d fe ff 	vplzcntq -0x1e240\(%esp,%esi,8\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 58 44 30    	vplzcntq \(%eax\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 48 44 72 7f 	vplzcntq 0x1fc0\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 48 44 b2 00 20 00 00 	vplzcntq 0x2000\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 48 44 72 80 	vplzcntq -0x2000\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 48 44 b2 c0 df ff ff 	vplzcntq -0x2040\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 58 44 72 7f 	vplzcntq 0x3f8\(%edx\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 58 44 b2 00 04 00 00 	vplzcntq 0x400\(%edx\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 58 44 72 80 	vplzcntq -0x400\(%edx\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 58 44 b2 f8 fb ff ff 	vplzcntq -0x408\(%edx\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7e 48 3a f6    	vpbroadcastmw2d %k6,%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fe 48 2a f6    	vpbroadcastmb2q %k6,%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 48 c4 f5    	vpconflictd %zmm5,%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 4f c4 f5    	vpconflictd %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:	62 f2 7d cf c4 f5    	vpconflictd %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:	62 f2 7d 48 c4 31    	vpconflictd \(%ecx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 48 c4 b4 f4 c0 1d fe ff 	vpconflictd -0x1e240\(%esp,%esi,8\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 58 c4 30    	vpconflictd \(%eax\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 48 c4 72 7f 	vpconflictd 0x1fc0\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 48 c4 b2 00 20 00 00 	vpconflictd 0x2000\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 48 c4 72 80 	vpconflictd -0x2000\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 48 c4 b2 c0 df ff ff 	vpconflictd -0x2040\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 58 c4 72 7f 	vpconflictd 0x1fc\(%edx\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 58 c4 b2 00 02 00 00 	vpconflictd 0x200\(%edx\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 58 c4 72 80 	vpconflictd -0x200\(%edx\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 58 c4 b2 fc fd ff ff 	vpconflictd -0x204\(%edx\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 48 c4 f5    	vpconflictq %zmm5,%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 4f c4 f5    	vpconflictq %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:	62 f2 fd cf c4 f5    	vpconflictq %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:	62 f2 fd 48 c4 31    	vpconflictq \(%ecx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 48 c4 b4 f4 c0 1d fe ff 	vpconflictq -0x1e240\(%esp,%esi,8\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 58 c4 30    	vpconflictq \(%eax\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 48 c4 72 7f 	vpconflictq 0x1fc0\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 48 c4 b2 00 20 00 00 	vpconflictq 0x2000\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 48 c4 72 80 	vpconflictq -0x2000\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 48 c4 b2 c0 df ff ff 	vpconflictq -0x2040\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 58 c4 72 7f 	vpconflictq 0x3f8\(%edx\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 58 c4 b2 00 04 00 00 	vpconflictq 0x400\(%edx\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 58 c4 72 80 	vpconflictq -0x400\(%edx\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 58 c4 b2 f8 fb ff ff 	vpconflictq -0x408\(%edx\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 48 44 f5    	vplzcntd %zmm5,%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 4f 44 f5    	vplzcntd %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:	62 f2 7d cf 44 f5    	vplzcntd %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:	62 f2 7d 48 44 31    	vplzcntd \(%ecx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 48 44 b4 f4 c0 1d fe ff 	vplzcntd -0x1e240\(%esp,%esi,8\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 58 44 30    	vplzcntd \(%eax\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 48 44 72 7f 	vplzcntd 0x1fc0\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 48 44 b2 00 20 00 00 	vplzcntd 0x2000\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 48 44 72 80 	vplzcntd -0x2000\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 48 44 b2 c0 df ff ff 	vplzcntd -0x2040\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 58 44 72 7f 	vplzcntd 0x1fc\(%edx\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 58 44 b2 00 02 00 00 	vplzcntd 0x200\(%edx\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 58 44 72 80 	vplzcntd -0x200\(%edx\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7d 58 44 b2 fc fd ff ff 	vplzcntd -0x204\(%edx\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 48 44 f5    	vplzcntq %zmm5,%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 4f 44 f5    	vplzcntq %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:	62 f2 fd cf 44 f5    	vplzcntq %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:	62 f2 fd 48 44 31    	vplzcntq \(%ecx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 48 44 b4 f4 c0 1d fe ff 	vplzcntq -0x1e240\(%esp,%esi,8\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 58 44 30    	vplzcntq \(%eax\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 48 44 72 7f 	vplzcntq 0x1fc0\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 48 44 b2 00 20 00 00 	vplzcntq 0x2000\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 48 44 72 80 	vplzcntq -0x2000\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 48 44 b2 c0 df ff ff 	vplzcntq -0x2040\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 58 44 72 7f 	vplzcntq 0x3f8\(%edx\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 58 44 b2 00 04 00 00 	vplzcntq 0x400\(%edx\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 58 44 72 80 	vplzcntq -0x400\(%edx\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fd 58 44 b2 f8 fb ff ff 	vplzcntq -0x408\(%edx\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:	62 f2 7e 48 3a f6    	vpbroadcastmw2d %k6,%zmm6
+[ 	]*[a-f0-9]+:	62 f2 fe 48 2a f6    	vpbroadcastmb2q %k6,%zmm6
+
+0+[a-f0-9]+ <ifma>:
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b4 f4[ 	]*vpmadd52luq %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 4f b4 f4[ 	]*vpmadd52luq %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 cf b4 f4[ 	]*vpmadd52luq %zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b4 31[ 	]*vpmadd52luq \(%ecx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b4 b4 f4 c0 1d fe ff[ 	]*vpmadd52luq -0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 b4 30[ 	]*vpmadd52luq \(%eax\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b4 72 7f[ 	]*vpmadd52luq 0x1fc0\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b4 b2 00 20 00 00[ 	]*vpmadd52luq 0x2000\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b4 72 80[ 	]*vpmadd52luq -0x2000\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b4 b2 c0 df ff ff[ 	]*vpmadd52luq -0x2040\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 b4 72 7f[ 	]*vpmadd52luq 0x3f8\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 b4 b2 00 04 00 00[ 	]*vpmadd52luq 0x400\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 b4 72 80[ 	]*vpmadd52luq -0x400\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 b4 b2 f8 fb ff ff[ 	]*vpmadd52luq -0x408\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b5 f4[ 	]*vpmadd52huq %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 4f b5 f4[ 	]*vpmadd52huq %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 cf b5 f4[ 	]*vpmadd52huq %zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b5 31[ 	]*vpmadd52huq \(%ecx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b5 b4 f4 c0 1d fe ff[ 	]*vpmadd52huq -0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 b5 30[ 	]*vpmadd52huq \(%eax\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b5 72 7f[ 	]*vpmadd52huq 0x1fc0\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b5 b2 00 20 00 00[ 	]*vpmadd52huq 0x2000\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b5 72 80[ 	]*vpmadd52huq -0x2000\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b5 b2 c0 df ff ff[ 	]*vpmadd52huq -0x2040\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 b5 72 7f[ 	]*vpmadd52huq 0x3f8\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 b5 b2 00 04 00 00[ 	]*vpmadd52huq 0x400\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 b5 72 80[ 	]*vpmadd52huq -0x400\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 b5 b2 f8 fb ff ff[ 	]*vpmadd52huq -0x408\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b4 f4[ 	]*vpmadd52luq %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 4f b4 f4[ 	]*vpmadd52luq %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 cf b4 f4[ 	]*vpmadd52luq %zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b4 31[ 	]*vpmadd52luq \(%ecx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b4 b4 f4 c0 1d fe ff[ 	]*vpmadd52luq -0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 b4 30[ 	]*vpmadd52luq \(%eax\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b4 72 7f[ 	]*vpmadd52luq 0x1fc0\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b4 b2 00 20 00 00[ 	]*vpmadd52luq 0x2000\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b4 72 80[ 	]*vpmadd52luq -0x2000\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b4 b2 c0 df ff ff[ 	]*vpmadd52luq -0x2040\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 b4 72 7f[ 	]*vpmadd52luq 0x3f8\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 b4 b2 00 04 00 00[ 	]*vpmadd52luq 0x400\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 b4 72 80[ 	]*vpmadd52luq -0x400\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 b4 b2 f8 fb ff ff[ 	]*vpmadd52luq -0x408\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b5 f4[ 	]*vpmadd52huq %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 4f b5 f4[ 	]*vpmadd52huq %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 cf b5 f4[ 	]*vpmadd52huq %zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b5 31[ 	]*vpmadd52huq \(%ecx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b5 b4 f4 c0 1d fe ff[ 	]*vpmadd52huq -0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 b5 30[ 	]*vpmadd52huq \(%eax\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b5 72 7f[ 	]*vpmadd52huq 0x1fc0\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b5 b2 00 20 00 00[ 	]*vpmadd52huq 0x2000\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b5 72 80[ 	]*vpmadd52huq -0x2000\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b5 b2 c0 df ff ff[ 	]*vpmadd52huq -0x2040\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 b5 72 7f[ 	]*vpmadd52huq 0x3f8\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 b5 b2 00 04 00 00[ 	]*vpmadd52huq 0x400\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 b5 72 80[ 	]*vpmadd52huq -0x400\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 b5 b2 f8 fb ff ff[ 	]*vpmadd52huq -0x408\(%edx\)\{1to8\},%zmm5,%zmm6
+
+0+[a-f0-9]+ <vbmi>:
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 8d f4[ 	]*vpermb %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 4f 8d f4[ 	]*vpermb %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 cf 8d f4[ 	]*vpermb %zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 8d 31[ 	]*vpermb \(%ecx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 8d b4 f4 c0 1d fe ff[ 	]*vpermb -0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 8d 72 7f[ 	]*vpermb 0x1fc0\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 8d b2 00 20 00 00[ 	]*vpermb 0x2000\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 8d 72 80[ 	]*vpermb -0x2000\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 8d b2 c0 df ff ff[ 	]*vpermb -0x2040\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 75 f4[ 	]*vpermi2b %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 4f 75 f4[ 	]*vpermi2b %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 cf 75 f4[ 	]*vpermi2b %zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 75 31[ 	]*vpermi2b \(%ecx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 75 b4 f4 c0 1d fe ff[ 	]*vpermi2b -0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 75 72 7f[ 	]*vpermi2b 0x1fc0\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 75 b2 00 20 00 00[ 	]*vpermi2b 0x2000\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 75 72 80[ 	]*vpermi2b -0x2000\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 75 b2 c0 df ff ff[ 	]*vpermi2b -0x2040\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 7d f4[ 	]*vpermt2b %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 4f 7d f4[ 	]*vpermt2b %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 cf 7d f4[ 	]*vpermt2b %zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 7d 31[ 	]*vpermt2b \(%ecx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 7d b4 f4 c0 1d fe ff[ 	]*vpermt2b -0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 7d 72 7f[ 	]*vpermt2b 0x1fc0\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 7d b2 00 20 00 00[ 	]*vpermt2b 0x2000\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 7d 72 80[ 	]*vpermt2b -0x2000\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 7d b2 c0 df ff ff[ 	]*vpermt2b -0x2040\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 83 f4[ 	]*vpmultishiftqb %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 4f 83 f4[ 	]*vpmultishiftqb %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 cf 83 f4[ 	]*vpmultishiftqb %zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 83 31[ 	]*vpmultishiftqb \(%ecx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 83 b4 f4 c0 1d fe ff[ 	]*vpmultishiftqb -0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 83 30[ 	]*vpmultishiftqb \(%eax\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 83 72 7f[ 	]*vpmultishiftqb 0x1fc0\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 83 b2 00 20 00 00[ 	]*vpmultishiftqb 0x2000\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 83 72 80[ 	]*vpmultishiftqb -0x2000\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 83 b2 c0 df ff ff[ 	]*vpmultishiftqb -0x2040\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 83 72 7f[ 	]*vpmultishiftqb 0x3f8\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 83 b2 00 04 00 00[ 	]*vpmultishiftqb 0x400\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 83 72 80[ 	]*vpmultishiftqb -0x400\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 83 b2 f8 fb ff ff[ 	]*vpmultishiftqb -0x408\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 8d f4[ 	]*vpermb %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 4f 8d f4[ 	]*vpermb %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 cf 8d f4[ 	]*vpermb %zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 8d 31[ 	]*vpermb \(%ecx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 8d b4 f4 c0 1d fe ff[ 	]*vpermb -0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 8d 72 7f[ 	]*vpermb 0x1fc0\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 8d b2 00 20 00 00[ 	]*vpermb 0x2000\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 8d 72 80[ 	]*vpermb -0x2000\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 8d b2 c0 df ff ff[ 	]*vpermb -0x2040\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 75 f4[ 	]*vpermi2b %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 4f 75 f4[ 	]*vpermi2b %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 cf 75 f4[ 	]*vpermi2b %zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 75 31[ 	]*vpermi2b \(%ecx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 75 b4 f4 c0 1d fe ff[ 	]*vpermi2b -0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 75 72 7f[ 	]*vpermi2b 0x1fc0\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 75 b2 00 20 00 00[ 	]*vpermi2b 0x2000\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 75 72 80[ 	]*vpermi2b -0x2000\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 75 b2 c0 df ff ff[ 	]*vpermi2b -0x2040\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 7d f4[ 	]*vpermt2b %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 4f 7d f4[ 	]*vpermt2b %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 cf 7d f4[ 	]*vpermt2b %zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 7d 31[ 	]*vpermt2b \(%ecx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 7d b4 f4 c0 1d fe ff[ 	]*vpermt2b -0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 7d 72 7f[ 	]*vpermt2b 0x1fc0\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 7d b2 00 20 00 00[ 	]*vpermt2b 0x2000\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 7d 72 80[ 	]*vpermt2b -0x2000\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 7d b2 c0 df ff ff[ 	]*vpermt2b -0x2040\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 83 f4[ 	]*vpmultishiftqb %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 4f 83 f4[ 	]*vpmultishiftqb %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 cf 83 f4[ 	]*vpmultishiftqb %zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 83 31[ 	]*vpmultishiftqb \(%ecx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 83 b4 f4 c0 1d fe ff[ 	]*vpmultishiftqb -0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 83 30[ 	]*vpmultishiftqb \(%eax\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 83 72 7f[ 	]*vpmultishiftqb 0x1fc0\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 83 b2 00 20 00 00[ 	]*vpmultishiftqb 0x2000\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 83 72 80[ 	]*vpmultishiftqb -0x2000\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 83 b2 c0 df ff ff[ 	]*vpmultishiftqb -0x2040\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 83 72 7f[ 	]*vpmultishiftqb 0x3f8\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 83 b2 00 04 00 00[ 	]*vpmultishiftqb 0x400\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 83 72 80[ 	]*vpmultishiftqb -0x400\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 83 b2 f8 fb ff ff[ 	]*vpmultishiftqb -0x408\(%edx\)\{1to8\},%zmm5,%zmm6
+
+0+[a-f0-9]+ <vbmi2>:
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 4f 63 31[ 	]*vpcompressb %zmm6,\(%ecx\)\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 63 b4 f4 c0 1d fe ff[ 	]*vpcompressb %zmm6,-0x1e240\(%esp,%esi,8\)
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 63 72 7e[ 	]*vpcompressb %zmm6,0x7e\(%edx\)
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 63 ee[ 	]*vpcompressb %zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 4f 63 ee[ 	]*vpcompressb %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d cf 63 ee[ 	]*vpcompressb %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 4f 63 31[ 	]*vpcompressw %zmm6,\(%ecx\)\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 63 b4 f4 c0 1d fe ff[ 	]*vpcompressw %zmm6,-0x1e240\(%esp,%esi,8\)
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 63 72 40[ 	]*vpcompressw %zmm6,0x80\(%edx\)
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 63 ee[ 	]*vpcompressw %zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 4f 63 ee[ 	]*vpcompressw %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd cf 63 ee[ 	]*vpcompressw %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 4f 62 31[ 	]*vpexpandb \(%ecx\),%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d cf 62 31[ 	]*vpexpandb \(%ecx\),%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 62 b4 f4 c0 1d fe ff[ 	]*vpexpandb -0x1e240\(%esp,%esi,8\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 62 72 7e[ 	]*vpexpandb 0x7e\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 62 f5[ 	]*vpexpandb %zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 4f 62 f5[ 	]*vpexpandb %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d cf 62 f5[ 	]*vpexpandb %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 4f 62 31[ 	]*vpexpandw \(%ecx\),%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd cf 62 31[ 	]*vpexpandw \(%ecx\),%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 62 b4 f4 c0 1d fe ff[ 	]*vpexpandw -0x1e240\(%esp,%esi,8\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 62 72 40[ 	]*vpexpandw 0x80\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 62 f5[ 	]*vpexpandw %zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 4f 62 f5[ 	]*vpexpandw %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd cf 62 f5[ 	]*vpexpandw %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 70 f4[ 	]*vpshldvw %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 4f 70 f4[ 	]*vpshldvw %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 cf 70 f4[ 	]*vpshldvw %zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 70 b4 f4 c0 1d fe ff[ 	]*vpshldvw -0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 70 72 02[ 	]*vpshldvw 0x80\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 71 f4[ 	]*vpshldvd %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 4f 71 f4[ 	]*vpshldvd %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 cf 71 f4[ 	]*vpshldvd %zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 71 b4 f4 c0 1d fe ff[ 	]*vpshldvd -0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 71 72 02[ 	]*vpshldvd 0x80\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 58 71 72 7f[ 	]*vpshldvd 0x1fc\(%edx\)\{1to16\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 71 f4[ 	]*vpshldvq %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 4f 71 f4[ 	]*vpshldvq %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 cf 71 f4[ 	]*vpshldvq %zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 71 b4 f4 c0 1d fe ff[ 	]*vpshldvq -0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 71 72 02[ 	]*vpshldvq 0x80\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 71 72 7f[ 	]*vpshldvq 0x3f8\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 72 f4[ 	]*vpshrdvw %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 4f 72 f4[ 	]*vpshrdvw %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 cf 72 f4[ 	]*vpshrdvw %zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 72 b4 f4 c0 1d fe ff[ 	]*vpshrdvw -0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 72 72 02[ 	]*vpshrdvw 0x80\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 73 f4[ 	]*vpshrdvd %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 4f 73 f4[ 	]*vpshrdvd %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 cf 73 f4[ 	]*vpshrdvd %zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 73 b4 f4 c0 1d fe ff[ 	]*vpshrdvd -0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 73 72 02[ 	]*vpshrdvd 0x80\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 58 73 72 7f[ 	]*vpshrdvd 0x1fc\(%edx\)\{1to16\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 73 f4[ 	]*vpshrdvq %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 4f 73 f4[ 	]*vpshrdvq %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 cf 73 f4[ 	]*vpshrdvq %zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 73 b4 f4 c0 1d fe ff[ 	]*vpshrdvq -0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 73 72 02[ 	]*vpshrdvq 0x80\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 73 72 7f[ 	]*vpshrdvq 0x3f8\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 4f 70 f4 ab[ 	]*vpshldw \$0xab,%zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 cf 70 f4 ab[ 	]*vpshldw \$0xab,%zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 48 70 f4 7b[ 	]*vpshldw \$0x7b,%zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 48 70 b4 f4 c0 1d fe ff 7b[ 	]*vpshldw \$0x7b,-0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 48 70 72 02 7b[ 	]*vpshldw \$0x7b,0x80\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 4f 71 f4 ab[ 	]*vpshldd \$0xab,%zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 cf 71 f4 ab[ 	]*vpshldd \$0xab,%zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 48 71 f4 7b[ 	]*vpshldd \$0x7b,%zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 48 71 b4 f4 c0 1d fe ff 7b[ 	]*vpshldd \$0x7b,-0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 48 71 72 02 7b[ 	]*vpshldd \$0x7b,0x80\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 58 71 72 7f 7b[ 	]*vpshldd \$0x7b,0x1fc\(%edx\)\{1to16\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 4f 71 f4 ab[ 	]*vpshldq \$0xab,%zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 cf 71 f4 ab[ 	]*vpshldq \$0xab,%zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 48 71 b4 f4 c0 1d fe ff 7b[ 	]*vpshldq \$0x7b,-0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 48 71 72 02 7b[ 	]*vpshldq \$0x7b,0x80\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 58 71 72 7f 7b[ 	]*vpshldq \$0x7b,0x3f8\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 48 72 f4 ab[ 	]*vpshrdw \$0xab,%zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 4f 72 f4 ab[ 	]*vpshrdw \$0xab,%zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 cf 72 f4 ab[ 	]*vpshrdw \$0xab,%zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 48 72 b4 f4 c0 1d fe ff 7b[ 	]*vpshrdw \$0x7b,-0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 48 72 72 02 7b[ 	]*vpshrdw \$0x7b,0x80\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 48 73 f4 ab[ 	]*vpshrdd \$0xab,%zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 4f 73 f4 ab[ 	]*vpshrdd \$0xab,%zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 cf 73 f4 ab[ 	]*vpshrdd \$0xab,%zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 48 73 b4 f4 c0 1d fe ff 7b[ 	]*vpshrdd \$0x7b,-0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 48 73 72 02 7b[ 	]*vpshrdd \$0x7b,0x80\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 58 73 72 7f 7b[ 	]*vpshrdd \$0x7b,0x1fc\(%edx\)\{1to16\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 4f 73 f4 ab[ 	]*vpshrdq \$0xab,%zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 cf 73 f4 ab[ 	]*vpshrdq \$0xab,%zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 48 73 f4 7b[ 	]*vpshrdq \$0x7b,%zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 48 73 b4 f4 c0 1d fe ff 7b[ 	]*vpshrdq \$0x7b,-0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 48 73 72 02 7b[ 	]*vpshrdq \$0x7b,0x80\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 58 73 72 7f 7b[ 	]*vpshrdq \$0x7b,0x3f8\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 4f 63 31[ 	]*vpcompressb %zmm6,\(%ecx\)\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 63 b4 f4 c0 1d fe ff[ 	]*vpcompressb %zmm6,-0x1e240\(%esp,%esi,8\)
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 63 72 7e[ 	]*vpcompressb %zmm6,0x7e\(%edx\)
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 63 ee[ 	]*vpcompressb %zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 4f 63 ee[ 	]*vpcompressb %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d cf 63 ee[ 	]*vpcompressb %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 4f 63 31[ 	]*vpcompressw %zmm6,\(%ecx\)\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 63 b4 f4 c0 1d fe ff[ 	]*vpcompressw %zmm6,-0x1e240\(%esp,%esi,8\)
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 63 72 40[ 	]*vpcompressw %zmm6,0x80\(%edx\)
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 63 ee[ 	]*vpcompressw %zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 4f 63 ee[ 	]*vpcompressw %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd cf 63 ee[ 	]*vpcompressw %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 4f 62 31[ 	]*vpexpandb \(%ecx\),%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d cf 62 31[ 	]*vpexpandb \(%ecx\),%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 62 b4 f4 c0 1d fe ff[ 	]*vpexpandb -0x1e240\(%esp,%esi,8\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 62 72 7e[ 	]*vpexpandb 0x7e\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 62 f5[ 	]*vpexpandb %zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 4f 62 f5[ 	]*vpexpandb %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d cf 62 f5[ 	]*vpexpandb %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 4f 62 31[ 	]*vpexpandw \(%ecx\),%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd cf 62 31[ 	]*vpexpandw \(%ecx\),%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 62 b4 f4 c0 1d fe ff[ 	]*vpexpandw -0x1e240\(%esp,%esi,8\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 62 72 40[ 	]*vpexpandw 0x80\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 62 f5[ 	]*vpexpandw %zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 4f 62 f5[ 	]*vpexpandw %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd cf 62 f5[ 	]*vpexpandw %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 70 f4[ 	]*vpshldvw %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 4f 70 f4[ 	]*vpshldvw %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 cf 70 f4[ 	]*vpshldvw %zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 70 b4 f4 c0 1d fe ff[ 	]*vpshldvw -0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 70 72 02[ 	]*vpshldvw 0x80\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 71 f4[ 	]*vpshldvd %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 4f 71 f4[ 	]*vpshldvd %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 cf 71 f4[ 	]*vpshldvd %zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 71 b4 f4 c0 1d fe ff[ 	]*vpshldvd -0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 71 72 02[ 	]*vpshldvd 0x80\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 58 71 72 7f[ 	]*vpshldvd 0x1fc\(%edx\)\{1to16\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 71 f4[ 	]*vpshldvq %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 4f 71 f4[ 	]*vpshldvq %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 cf 71 f4[ 	]*vpshldvq %zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 71 b4 f4 c0 1d fe ff[ 	]*vpshldvq -0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 71 72 02[ 	]*vpshldvq 0x80\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 71 72 7f[ 	]*vpshldvq 0x3f8\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 72 f4[ 	]*vpshrdvw %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 4f 72 f4[ 	]*vpshrdvw %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 cf 72 f4[ 	]*vpshrdvw %zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 72 b4 f4 c0 1d fe ff[ 	]*vpshrdvw -0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 72 72 02[ 	]*vpshrdvw 0x80\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 73 f4[ 	]*vpshrdvd %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 4f 73 f4[ 	]*vpshrdvd %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 cf 73 f4[ 	]*vpshrdvd %zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 73 b4 f4 c0 1d fe ff[ 	]*vpshrdvd -0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 73 72 02[ 	]*vpshrdvd 0x80\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 58 73 72 7f[ 	]*vpshrdvd 0x1fc\(%edx\)\{1to16\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 73 f4[ 	]*vpshrdvq %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 4f 73 f4[ 	]*vpshrdvq %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 cf 73 f4[ 	]*vpshrdvq %zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 73 b4 f4 c0 1d fe ff[ 	]*vpshrdvq -0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 73 72 02[ 	]*vpshrdvq 0x80\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 d5 58 73 72 7f[ 	]*vpshrdvq 0x3f8\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 48 70 f4 ab[ 	]*vpshldw \$0xab,%zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 4f 70 f4 ab[ 	]*vpshldw \$0xab,%zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 cf 70 f4 ab[ 	]*vpshldw \$0xab,%zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 48 70 b4 f4 c0 1d fe ff 7b[ 	]*vpshldw \$0x7b,-0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 48 70 72 02 7b[ 	]*vpshldw \$0x7b,0x80\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 48 71 f4 ab[ 	]*vpshldd \$0xab,%zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 4f 71 f4 ab[ 	]*vpshldd \$0xab,%zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 cf 71 f4 ab[ 	]*vpshldd \$0xab,%zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 48 71 b4 f4 c0 1d fe ff 7b[ 	]*vpshldd \$0x7b,-0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 48 71 72 02 7b[ 	]*vpshldd \$0x7b,0x80\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 58 71 72 7f 7b[ 	]*vpshldd \$0x7b,0x1fc\(%edx\)\{1to16\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 48 71 f4 ab[ 	]*vpshldq \$0xab,%zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 4f 71 f4 ab[ 	]*vpshldq \$0xab,%zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 cf 71 f4 ab[ 	]*vpshldq \$0xab,%zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 48 71 b4 f4 c0 1d fe ff 7b[ 	]*vpshldq \$0x7b,-0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 48 71 72 02 7b[ 	]*vpshldq \$0x7b,0x80\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 58 71 72 7f 7b[ 	]*vpshldq \$0x7b,0x3f8\(%edx\)\{1to8\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 48 72 f4 ab[ 	]*vpshrdw \$0xab,%zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 4f 72 f4 ab[ 	]*vpshrdw \$0xab,%zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 cf 72 f4 ab[ 	]*vpshrdw \$0xab,%zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 48 72 b4 f4 c0 1d fe ff 7b[ 	]*vpshrdw \$0x7b,-0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 48 72 72 02 7b[ 	]*vpshrdw \$0x7b,0x80\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 48 73 f4 ab[ 	]*vpshrdd \$0xab,%zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 4f 73 f4 ab[ 	]*vpshrdd \$0xab,%zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 cf 73 f4 ab[ 	]*vpshrdd \$0xab,%zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 48 73 b4 f4 c0 1d fe ff 7b[ 	]*vpshrdd \$0x7b,-0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 48 73 72 02 7b[ 	]*vpshrdd \$0x7b,0x80\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 55 58 73 72 7f 7b[ 	]*vpshrdd \$0x7b,0x1fc\(%edx\)\{1to16\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 48 73 f4 ab[ 	]*vpshrdq \$0xab,%zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 4f 73 f4 ab[ 	]*vpshrdq \$0xab,%zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 cf 73 f4 ab[ 	]*vpshrdq \$0xab,%zmm4,%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 48 73 b4 f4 c0 1d fe ff 7b[ 	]*vpshrdq \$0x7b,-0x1e240\(%esp,%esi,8\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 48 73 72 02 7b[ 	]*vpshrdq \$0x7b,0x80\(%edx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f3 d5 58 73 72 7f 7b[ 	]*vpshrdq \$0x7b,0x3f8\(%edx\)\{1to8\},%zmm5,%zmm6
+
+0+[a-f0-9]+ <vnni>:
+[ 	]*[a-f0-9]+:[ 	]*62 f2 75 48 52 e3[ 	]*vpdpwssd %zmm3,%zmm1,%zmm4
+[ 	]*[a-f0-9]+:[ 	]*62 f2 75 49 52 e3[ 	]*vpdpwssd %zmm3,%zmm1,%zmm4\{%k1\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 75 c9 52 e3[ 	]*vpdpwssd %zmm3,%zmm1,%zmm4\{%k1\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 75 48 52 a4 f4 c0 1d fe ff[ 	]*vpdpwssd -0x1e240\(%esp,%esi,8\),%zmm1,%zmm4
+[ 	]*[a-f0-9]+:[ 	]*62 f2 75 48 52 62 7f[ 	]*vpdpwssd 0x1fc0\(%edx\),%zmm1,%zmm4
+[ 	]*[a-f0-9]+:[ 	]*62 f2 75 58 52 62 7f[ 	]*vpdpwssd 0x1fc\(%edx\)\{1to16\},%zmm1,%zmm4
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 53 d4[ 	]*vpdpwssds %zmm4,%zmm5,%zmm2
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 4e 53 d4[ 	]*vpdpwssds %zmm4,%zmm5,%zmm2\{%k6\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 ce 53 d4[ 	]*vpdpwssds %zmm4,%zmm5,%zmm2\{%k6\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 53 94 f4 c0 1d fe ff[ 	]*vpdpwssds -0x1e240\(%esp,%esi,8\),%zmm5,%zmm2
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 53 52 7f[ 	]*vpdpwssds 0x1fc0\(%edx\),%zmm5,%zmm2
+[ 	]*[a-f0-9]+:[ 	]*62 f2 55 58 53 52 7f[ 	]*vpdpwssds 0x1fc\(%edx\)\{1to16\},%zmm5,%zmm2
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 48 50 eb[ 	]*vpdpbusd %zmm3,%zmm2,%zmm5
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 49 50 eb[ 	]*vpdpbusd %zmm3,%zmm2,%zmm5\{%k1\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d c9 50 eb[ 	]*vpdpbusd %zmm3,%zmm2,%zmm5\{%k1\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 48 50 ac f4 c0 1d fe ff[ 	]*vpdpbusd -0x1e240\(%esp,%esi,8\),%zmm2,%zmm5
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 48 50 6a 7f[ 	]*vpdpbusd 0x1fc0\(%edx\),%zmm2,%zmm5
+[ 	]*[a-f0-9]+:[ 	]*62 f2 6d 58 50 6a 7f[ 	]*vpdpbusd 0x1fc\(%edx\)\{1to16\},%zmm2,%zmm5
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 48 51 e9[ 	]*vpdpbusds %zmm1,%zmm3,%zmm5
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 4a 51 e9[ 	]*vpdpbusds %zmm1,%zmm3,%zmm5\{%k2\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 ca 51 e9[ 	]*vpdpbusds %zmm1,%zmm3,%zmm5\{%k2\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 48 51 ac f4 c0 1d fe ff[ 	]*vpdpbusds -0x1e240\(%esp,%esi,8\),%zmm3,%zmm5
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 48 51 6a 7f[ 	]*vpdpbusds 0x1fc0\(%edx\),%zmm3,%zmm5
+[ 	]*[a-f0-9]+:[ 	]*62 f2 65 58 51 6a 7f[ 	]*vpdpbusds 0x1fc\(%edx\)\{1to16\},%zmm3,%zmm5
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 48 52 d9[ 	]*vpdpwssd %zmm1,%zmm4,%zmm3
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 4b 52 d9[ 	]*vpdpwssd %zmm1,%zmm4,%zmm3\{%k3\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d cb 52 d9[ 	]*vpdpwssd %zmm1,%zmm4,%zmm3\{%k3\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 48 52 9c f4 c0 1d fe ff[ 	]*vpdpwssd -0x1e240\(%esp,%esi,8\),%zmm4,%zmm3
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 48 52 5a 7f[ 	]*vpdpwssd 0x1fc0\(%edx\),%zmm4,%zmm3
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 58 52 5a 7f[ 	]*vpdpwssd 0x1fc\(%edx\)\{1to16\},%zmm4,%zmm3
+[ 	]*[a-f0-9]+:[ 	]*62 f2 75 48 53 da[ 	]*vpdpwssds %zmm2,%zmm1,%zmm3
+[ 	]*[a-f0-9]+:[ 	]*62 f2 75 4f 53 da[ 	]*vpdpwssds %zmm2,%zmm1,%zmm3\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 75 cf 53 da[ 	]*vpdpwssds %zmm2,%zmm1,%zmm3\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 75 48 53 9c f4 c0 1d fe ff[ 	]*vpdpwssds -0x1e240\(%esp,%esi,8\),%zmm1,%zmm3
+[ 	]*[a-f0-9]+:[ 	]*62 f2 75 48 53 5a 7f[ 	]*vpdpwssds 0x1fc0\(%edx\),%zmm1,%zmm3
+[ 	]*[a-f0-9]+:[ 	]*62 f2 75 58 53 5a 7f[ 	]*vpdpwssds 0x1fc\(%edx\)\{1to16\},%zmm1,%zmm3
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 48 50 d9[ 	]*vpdpbusd %zmm1,%zmm4,%zmm3
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 4e 50 d9[ 	]*vpdpbusd %zmm1,%zmm4,%zmm3\{%k6\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d ce 50 d9[ 	]*vpdpbusd %zmm1,%zmm4,%zmm3\{%k6\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 48 50 9c f4 c0 1d fe ff[ 	]*vpdpbusd -0x1e240\(%esp,%esi,8\),%zmm4,%zmm3
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 48 50 5a 7f[ 	]*vpdpbusd 0x1fc0\(%edx\),%zmm4,%zmm3
+[ 	]*[a-f0-9]+:[ 	]*62 f2 5d 58 50 5a 7f[ 	]*vpdpbusd 0x1fc\(%edx\)\{1to16\},%zmm4,%zmm3
+[ 	]*[a-f0-9]+:[ 	]*62 f2 75 48 51 c9[ 	]*vpdpbusds %zmm1,%zmm1,%zmm1
+[ 	]*[a-f0-9]+:[ 	]*62 f2 75 49 51 c9[ 	]*vpdpbusds %zmm1,%zmm1,%zmm1\{%k1\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 75 c9 51 c9[ 	]*vpdpbusds %zmm1,%zmm1,%zmm1\{%k1\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 75 48 51 8c f4 c0 1d fe ff[ 	]*vpdpbusds -0x1e240\(%esp,%esi,8\),%zmm1,%zmm1
+[ 	]*[a-f0-9]+:[ 	]*62 f2 75 48 51 4a 7f[ 	]*vpdpbusds 0x1fc0\(%edx\),%zmm1,%zmm1
+[ 	]*[a-f0-9]+:[ 	]*62 f2 75 58 51 4a 7f[ 	]*vpdpbusds 0x1fc\(%edx\)\{1to16\},%zmm1,%zmm1
+
+0+[a-f0-9]+ <bf16>:
+[ 	]*[a-f0-9]+:	62 f2 57 48 72 f4    	vcvtne2ps2bf16 %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:	62 f2 57 4f 72 b4 f4 00 00 00 10 	vcvtne2ps2bf16 0x10000000\(%esp,%esi,8\),%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:	62 f2 57 58 72 31    	vcvtne2ps2bf16 \(%ecx\)\{1to16\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:	62 f2 57 48 72 71 7f 	vcvtne2ps2bf16 0x1fc0\(%ecx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:	62 f2 57 df 72 b2 00 e0 ff ff 	vcvtne2ps2bf16 -0x2000\(%edx\)\{1to16\},%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:	62 f2 7e 48 72 f5    	vcvtneps2bf16 %zmm5,%ymm6
+[ 	]*[a-f0-9]+:	62 f2 7e 4f 72 b4 f4 00 00 00 10 	vcvtneps2bf16 0x10000000\(%esp,%esi,8\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:	62 f2 7e 58 72 31    	vcvtneps2bf16 \(%ecx\)\{1to16\},%ymm6
+[ 	]*[a-f0-9]+:	62 f2 7e 48 72 71 7f 	vcvtneps2bf16 0x1fc0\(%ecx\),%ymm6
+[ 	]*[a-f0-9]+:	62 f2 7e df 72 b2 00 e0 ff ff 	vcvtneps2bf16 -0x2000\(%edx\)\{1to16\},%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:	62 f2 56 48 52 f4    	vdpbf16ps %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:	62 f2 56 4f 52 b4 f4 00 00 00 10 	vdpbf16ps 0x10000000\(%esp,%esi,8\),%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:	62 f2 56 58 52 31    	vdpbf16ps \(%ecx\)\{1to16\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:	62 f2 56 48 52 71 7f 	vdpbf16ps 0x1fc0\(%ecx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:	62 f2 56 df 52 b2 00 e0 ff ff 	vdpbf16ps -0x2000\(%edx\)\{1to16\},%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:	62 f2 57 48 72 f4    	vcvtne2ps2bf16 %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:	62 f2 57 4f 72 b4 f4 00 00 00 10 	vcvtne2ps2bf16 0x10000000\(%esp,%esi,8\),%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:	62 f2 57 58 72 31    	vcvtne2ps2bf16 \(%ecx\)\{1to16\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:	62 f2 57 48 72 71 7f 	vcvtne2ps2bf16 0x1fc0\(%ecx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:	62 f2 57 df 72 b2 00 e0 ff ff 	vcvtne2ps2bf16 -0x2000\(%edx\)\{1to16\},%zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:	62 f2 7e 48 72 f5    	vcvtneps2bf16 %zmm5,%ymm6
+[ 	]*[a-f0-9]+:	62 f2 7e 4f 72 b4 f4 00 00 00 10 	vcvtneps2bf16 0x10000000\(%esp,%esi,8\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:	62 f2 7e 58 72 31    	vcvtneps2bf16 \(%ecx\)\{1to16\},%ymm6
+[ 	]*[a-f0-9]+:	62 f2 7e 48 72 71 7f 	vcvtneps2bf16 0x1fc0\(%ecx\),%ymm6
+[ 	]*[a-f0-9]+:	62 f2 7e df 72 b2 00 e0 ff ff 	vcvtneps2bf16 -0x2000\(%edx\)\{1to16\},%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:	62 f2 56 48 52 f4    	vdpbf16ps %zmm4,%zmm5,%zmm6
+[ 	]*[a-f0-9]+:	62 f2 56 4f 52 b4 f4 00 00 00 10 	vdpbf16ps 0x10000000\(%esp,%esi,8\),%zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:	62 f2 56 58 52 31    	vdpbf16ps \(%ecx\)\{1to16\},%zmm5,%zmm6
+[ 	]*[a-f0-9]+:	62 f2 56 48 52 71 7f 	vdpbf16ps 0x1fc0\(%ecx\),%zmm5,%zmm6
+[ 	]*[a-f0-9]+:	62 f2 56 df 52 b2 00 e0 ff ff 	vdpbf16ps -0x2000\(%edx\)\{1to16\},%zmm5,%zmm6\{%k7\}\{z\}
+
+0+[a-f0-9]+ <vpopcnt>:
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 55 f5[ 	]*vpopcntd %zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 4f 55 f5[ 	]*vpopcntd %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d cf 55 f5[ 	]*vpopcntd %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 55 31[ 	]*vpopcntd \(%ecx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 55 b4 f4 c0 1d fe ff[ 	]*vpopcntd -0x1e240\(%esp,%esi,8\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 58 55 30[ 	]*vpopcntd \(%eax\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 55 72 7f[ 	]*vpopcntd 0x1fc0\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 55 b2 00 20 00 00[ 	]*vpopcntd 0x2000\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 55 72 80[ 	]*vpopcntd -0x2000\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 55 b2 c0 df ff ff[ 	]*vpopcntd -0x2040\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 58 55 72 7f[ 	]*vpopcntd 0x1fc\(%edx\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 58 55 b2 00 02 00 00[ 	]*vpopcntd 0x200\(%edx\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 58 55 72 80[ 	]*vpopcntd -0x200\(%edx\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 58 55 b2 fc fd ff ff[ 	]*vpopcntd -0x204\(%edx\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 55 f5[ 	]*vpopcntq %zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 4f 55 f5[ 	]*vpopcntq %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd cf 55 f5[ 	]*vpopcntq %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 55 31[ 	]*vpopcntq \(%ecx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 55 b4 f4 c0 1d fe ff[ 	]*vpopcntq -0x1e240\(%esp,%esi,8\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 58 55 30[ 	]*vpopcntq \(%eax\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 55 72 7f[ 	]*vpopcntq 0x1fc0\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 55 b2 00 20 00 00[ 	]*vpopcntq 0x2000\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 55 72 80[ 	]*vpopcntq -0x2000\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 55 b2 c0 df ff ff[ 	]*vpopcntq -0x2040\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 58 55 72 7f[ 	]*vpopcntq 0x3f8\(%edx\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 58 55 b2 00 04 00 00[ 	]*vpopcntq 0x400\(%edx\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 58 55 72 80[ 	]*vpopcntq -0x400\(%edx\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 58 55 b2 f8 fb ff ff[ 	]*vpopcntq -0x408\(%edx\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 55 f5[ 	]*vpopcntd %zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 4f 55 f5[ 	]*vpopcntd %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d cf 55 f5[ 	]*vpopcntd %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 55 31[ 	]*vpopcntd \(%ecx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 55 b4 f4 c0 1d fe ff[ 	]*vpopcntd -0x1e240\(%esp,%esi,8\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 58 55 30[ 	]*vpopcntd \(%eax\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 58 55 30[ 	]*vpopcntd \(%eax\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 55 72 7f[ 	]*vpopcntd 0x1fc0\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 55 b2 00 20 00 00[ 	]*vpopcntd 0x2000\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 55 72 80[ 	]*vpopcntd -0x2000\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 55 b2 c0 df ff ff[ 	]*vpopcntd -0x2040\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 58 55 72 7f[ 	]*vpopcntd 0x1fc\(%edx\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 58 55 b2 00 02 00 00[ 	]*vpopcntd 0x200\(%edx\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 58 55 72 80[ 	]*vpopcntd -0x200\(%edx\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 58 55 b2 fc fd ff ff[ 	]*vpopcntd -0x204\(%edx\)\{1to16\},%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 55 f5[ 	]*vpopcntq %zmm5,%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 4f 55 f5[ 	]*vpopcntq %zmm5,%zmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd cf 55 f5[ 	]*vpopcntq %zmm5,%zmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 55 31[ 	]*vpopcntq \(%ecx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 55 b4 f4 c0 1d fe ff[ 	]*vpopcntq -0x1e240\(%esp,%esi,8\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 58 55 30[ 	]*vpopcntq \(%eax\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 58 55 30[ 	]*vpopcntq \(%eax\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 55 72 7f[ 	]*vpopcntq 0x1fc0\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 55 b2 00 20 00 00[ 	]*vpopcntq 0x2000\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 55 72 80[ 	]*vpopcntq -0x2000\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 48 55 b2 c0 df ff ff[ 	]*vpopcntq -0x2040\(%edx\),%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 58 55 72 7f[ 	]*vpopcntq 0x3f8\(%edx\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 58 55 b2 00 04 00 00[ 	]*vpopcntq 0x400\(%edx\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 58 55 72 80[ 	]*vpopcntq -0x400\(%edx\)\{1to8\},%zmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 58 55 b2 f8 fb ff ff[ 	]*vpopcntq -0x408\(%edx\)\{1to8\},%zmm6
+#pass
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-512-5.s
@@ -0,0 +1,25 @@
+	.arch generic32
+	.arch .avx10.1
+
+	.include "avx512bitalg.s"
+
+	.att_syntax prefix
+	.include "avx512cd.s"
+
+	.att_syntax prefix
+	.include "avx512ifma.s"
+
+	.att_syntax prefix
+	.include "avx512vbmi.s"
+
+	.att_syntax prefix
+	.include "avx512vbmi2.s"
+
+	.att_syntax prefix
+	.include "avx512vnni.s"
+
+	.att_syntax prefix
+	.include "avx512_bf16.s"
+
+	.att_syntax prefix
+	.include "avx512_vpopcntdq.s"
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-512-gfni.d
@@ -0,0 +1,4 @@
+#as: -I${srcdir}/$subdir
+#objdump: -dw
+#name: AVX10.1/512 + GFNI
+#dump: avx512f_gfni.d
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-512-gfni.s
@@ -0,0 +1,4 @@
+	.arch generic32
+	.arch .avx10.1
+	.arch .gfni
+	.include "avx512f_gfni.s"
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-512-vaes.d
@@ -0,0 +1,4 @@
+#as: -I${srcdir}/$subdir
+#objdump: -dw
+#name: AVX10.1/512 + VAES
+#dump: avx512f_vaes.d
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-512-vaes.s
@@ -0,0 +1,4 @@
+	.arch generic32
+	.arch .avx10.1
+	.arch .vaes
+	.include "avx512f_vaes.s"
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-512-vpclmulqdq.d
@@ -0,0 +1,4 @@
+#as: -I${srcdir}/$subdir
+#objdump: -dw
+#name: AVX10.1/512 + VPCLMULQDQ
+#dump: avx512f_vpclmulqdq.d
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10.1-512-vpclmulqdq.s
@@ -0,0 +1,4 @@
+	.arch generic32
+	.arch .avx10.1
+	.arch .vpclmulqdq
+	.include "avx512f_vpclmulqdq.s"
--- a/gas/testsuite/gas/i386/avx512_bf16.d
+++ b/gas/testsuite/gas/i386/avx512_bf16.d
@@ -8,7 +8,7 @@
 
 Disassembly of section .text:
 
-0+ <_start>:
+0+ <bf16>:
 [ 	]*[a-f0-9]+:	62 f2 57 48 72 f4    	vcvtne2ps2bf16 %zmm4,%zmm5,%zmm6
 [ 	]*[a-f0-9]+:	62 f2 57 4f 72 b4 f4 00 00 00 10 	vcvtne2ps2bf16 0x10000000\(%esp,%esi,8\),%zmm5,%zmm6\{%k7\}
 [ 	]*[a-f0-9]+:	62 f2 57 58 72 31    	vcvtne2ps2bf16 \(%ecx\)\{1to16\},%zmm5,%zmm6
--- a/gas/testsuite/gas/i386/avx512_bf16.s
+++ b/gas/testsuite/gas/i386/avx512_bf16.s
@@ -2,7 +2,7 @@
 
 	.allow_index_reg
 	.text
-_start:
+bf16:
 	vcvtne2ps2bf16	%zmm4, %zmm5, %zmm6	 #AVX512_BF16
 	vcvtne2ps2bf16	0x10000000(%esp, %esi, 8), %zmm5, %zmm6{%k7}	 #AVX512_BF16 MASK_ENABLING
 	vcvtne2ps2bf16	(%ecx){1to16}, %zmm5, %zmm6	 #AVX512_BF16 BROADCAST_EN
--- a/gas/testsuite/gas/i386/avx512_bf16_vl.d
+++ b/gas/testsuite/gas/i386/avx512_bf16_vl.d
@@ -8,7 +8,7 @@
 
 Disassembly of section .text:
 
-0+ <_start>:
+0+ <bf16>:
 [ 	]*[a-f0-9]+:	62 f2 57 28 72 f4    	vcvtne2ps2bf16 %ymm4,%ymm5,%ymm6
 [ 	]*[a-f0-9]+:	62 f2 57 08 72 f4    	vcvtne2ps2bf16 %xmm4,%xmm5,%xmm6
 [ 	]*[a-f0-9]+:	62 f2 57 2f 72 b4 f4 00 00 00 10 	vcvtne2ps2bf16 0x10000000\(%esp,%esi,8\),%ymm5,%ymm6\{%k7\}
--- a/gas/testsuite/gas/i386/avx512_bf16_vl.s
+++ b/gas/testsuite/gas/i386/avx512_bf16_vl.s
@@ -2,7 +2,7 @@
 
 	.allow_index_reg
 	.text
-_start:
+bf16:
 	vcvtne2ps2bf16	%ymm4, %ymm5, %ymm6	 #AVX512{BF16,VL}
 	vcvtne2ps2bf16	%xmm4, %xmm5, %xmm6	 #AVX512{BF16,VL}
 	vcvtne2ps2bf16	0x10000000(%esp, %esi, 8), %ymm5, %ymm6{%k7}	 #AVX512{BF16,VL} MASK_ENABLING
--- a/gas/testsuite/gas/i386/avx512_vpopcntdq.d
+++ b/gas/testsuite/gas/i386/avx512_vpopcntdq.d
@@ -8,7 +8,7 @@
 
 Disassembly of section \.text:
 
-00000000 <_start>:
+00000000 <vpopcnt>:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 55 f5[ 	]*vpopcntd %zmm5,%zmm6
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 4f 55 f5[ 	]*vpopcntd %zmm5,%zmm6\{%k7\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d cf 55 f5[ 	]*vpopcntd %zmm5,%zmm6\{%k7\}\{z\}
--- a/gas/testsuite/gas/i386/avx512_vpopcntdq.s
+++ b/gas/testsuite/gas/i386/avx512_vpopcntdq.s
@@ -2,7 +2,7 @@
 
 	.allow_index_reg
 	.text
-_start:
+vpopcnt:
 	vpopcntd	%zmm5, %zmm6	 # AVX512_VPOPCNTDQ
 	vpopcntd	%zmm5, %zmm6{%k7}	 # AVX512_VPOPCNTDQ
 	vpopcntd	%zmm5, %zmm6{%k7}{z}	 # AVX512_VPOPCNTDQ
--- a/gas/testsuite/gas/i386/avx512_vpopcntdq-intel.d
+++ b/gas/testsuite/gas/i386/avx512_vpopcntdq-intel.d
@@ -8,7 +8,7 @@
 
 Disassembly of section \.text:
 
-00000000 <_start>:
+00000000 <vpopcnt>:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 55 f5[ 	]*vpopcntd zmm6,zmm5
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 4f 55 f5[ 	]*vpopcntd zmm6\{k7\},zmm5
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d cf 55 f5[ 	]*vpopcntd zmm6\{k7\}\{z\},zmm5
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx512_vpopcntdq_vl.d
@@ -0,0 +1,69 @@
+#objdump: -dw
+#name: i386 AVX512VL+VPOPCNTDQ insns
+#source: avx512_vpopcntdq_vl.s
+
+.*: +file format .*
+
+
+Disassembly of section \.text:
+
+00000000 <vpopcnt>:
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 28 55 f5[ 	]*vpopcntd %ymm5,%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 55 f5[ 	]*vpopcntd %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d af 55 f5[ 	]*vpopcntd %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 28 55 31[ 	]*vpopcntd \(%ecx\),%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 28 55 b4 f4 c0 1d fe ff[ 	]*vpopcntd -0x1e240\(%esp,%esi,8\),%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 38 55 30[ 	]*vpopcntd \(%eax\)\{1to8\},%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 28 55 72 7f[ 	]*vpopcntd 0xfe0\(%edx\),%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 28 55 b2 00 10 00 00[ 	]*vpopcntd 0x1000\(%edx\),%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 28 55 72 80[ 	]*vpopcntd -0x1000\(%edx\),%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 28 55 b2 e0 ef ff ff[ 	]*vpopcntd -0x1020\(%edx\),%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 38 55 72 7f[ 	]*vpopcntd 0x1fc\(%edx\)\{1to8\},%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 38 55 b2 00 02 00 00[ 	]*vpopcntd 0x200\(%edx\)\{1to8\},%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 38 55 72 80[ 	]*vpopcntd -0x200\(%edx\)\{1to8\},%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 38 55 b2 fc fd ff ff[ 	]*vpopcntd -0x204\(%edx\)\{1to8\},%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 28 55 f5[ 	]*vpopcntq %ymm5,%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 55 f5[ 	]*vpopcntq %ymm5,%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd af 55 f5[ 	]*vpopcntq %ymm5,%ymm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 28 55 31[ 	]*vpopcntq \(%ecx\),%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 28 55 b4 f4 c0 1d fe ff[ 	]*vpopcntq -0x1e240\(%esp,%esi,8\),%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 38 55 30[ 	]*vpopcntq \(%eax\)\{1to4\},%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 28 55 72 7f[ 	]*vpopcntq 0xfe0\(%edx\),%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 28 55 b2 00 10 00 00[ 	]*vpopcntq 0x1000\(%edx\),%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 28 55 72 80[ 	]*vpopcntq -0x1000\(%edx\),%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 28 55 b2 e0 ef ff ff[ 	]*vpopcntq -0x1020\(%edx\),%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 38 55 72 7f[ 	]*vpopcntq 0x3f8\(%edx\)\{1to4\},%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 38 55 b2 00 04 00 00[ 	]*vpopcntq 0x400\(%edx\)\{1to4\},%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 38 55 72 80[ 	]*vpopcntq -0x400\(%edx\)\{1to4\},%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 38 55 b2 f8 fb ff ff[ 	]*vpopcntq -0x408\(%edx\)\{1to4\},%ymm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 08 55 f5[ 	]*vpopcntd %xmm5,%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 55 f5[ 	]*vpopcntd %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 8f 55 f5[ 	]*vpopcntd %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 08 55 31[ 	]*vpopcntd \(%ecx\),%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 08 55 b4 f4 c0 1d fe ff[ 	]*vpopcntd -0x1e240\(%esp,%esi,8\),%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 18 55 30[ 	]*vpopcntd \(%eax\)\{1to4\},%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 18 55 30[ 	]*vpopcntd \(%eax\)\{1to4\},%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 08 55 72 7f[ 	]*vpopcntd 0x7f0\(%edx\),%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 08 55 b2 00 08 00 00[ 	]*vpopcntd 0x800\(%edx\),%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 08 55 72 80[ 	]*vpopcntd -0x800\(%edx\),%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 08 55 b2 f0 f7 ff ff[ 	]*vpopcntd -0x810\(%edx\),%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 18 55 72 7f[ 	]*vpopcntd 0x1fc\(%edx\)\{1to4\},%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 18 55 b2 00 02 00 00[ 	]*vpopcntd 0x200\(%edx\)\{1to4\},%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 18 55 72 80[ 	]*vpopcntd -0x200\(%edx\)\{1to4\},%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 7d 18 55 b2 fc fd ff ff[ 	]*vpopcntd -0x204\(%edx\)\{1to4\},%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 08 55 f5[ 	]*vpopcntq %xmm5,%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 0f 55 f5[ 	]*vpopcntq %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 8f 55 f5[ 	]*vpopcntq %xmm5,%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 08 55 31[ 	]*vpopcntq \(%ecx\),%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 08 55 b4 f4 c0 1d fe ff[ 	]*vpopcntq -0x1e240\(%esp,%esi,8\),%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 18 55 30[ 	]*vpopcntq \(%eax\)\{1to2\},%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 18 55 30[ 	]*vpopcntq \(%eax\)\{1to2\},%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 08 55 72 7f[ 	]*vpopcntq 0x7f0\(%edx\),%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 08 55 b2 00 08 00 00[ 	]*vpopcntq 0x800\(%edx\),%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 08 55 72 80[ 	]*vpopcntq -0x800\(%edx\),%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 08 55 b2 f0 f7 ff ff[ 	]*vpopcntq -0x810\(%edx\),%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 18 55 72 7f[ 	]*vpopcntq 0x3f8\(%edx\)\{1to2\},%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 18 55 b2 00 04 00 00[ 	]*vpopcntq 0x400\(%edx\)\{1to2\},%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 18 55 72 80[ 	]*vpopcntq -0x400\(%edx\)\{1to2\},%xmm6
+[ 	]*[a-f0-9]+:[ 	]*62 f2 fd 18 55 b2 f8 fb ff ff[ 	]*vpopcntq -0x408\(%edx\)\{1to2\},%xmm6
+#pass
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx512_vpopcntdq_vl.s
@@ -0,0 +1,64 @@
+# Check 32bit AVX512VL+VPOPCNTDQ instructions
+
+	.text
+vpopcnt:
+	vpopcntd	%ymm5, %ymm6
+	vpopcntd	%ymm5, %ymm6{%k7}
+	vpopcntd	%ymm5, %ymm6{%k7}{z}
+	vpopcntd	(%ecx), %ymm6
+	vpopcntd	-123456(%esp,%esi,8), %ymm6
+	vpopcntd	(%eax){1to8}, %ymm6
+	vpopcntd	4064(%edx), %ymm6	 #  Disp8
+	vpopcntd	4096(%edx), %ymm6
+	vpopcntd	-4096(%edx), %ymm6	 #  Disp8
+	vpopcntd	-4128(%edx), %ymm6
+	vpopcntd	508(%edx){1to8}, %ymm6	 #  Disp8
+	vpopcntd	512(%edx){1to8}, %ymm6
+	vpopcntd	-512(%edx){1to8}, %ymm6	 #  Disp8
+	vpopcntd	-516(%edx){1to8}, %ymm6
+	vpopcntq	%ymm5, %ymm6
+	vpopcntq	%ymm5, %ymm6{%k7}
+	vpopcntq	%ymm5, %ymm6{%k7}{z}
+	vpopcntq	(%ecx), %ymm6
+	vpopcntq	-123456(%esp,%esi,8), %ymm6
+	vpopcntq	(%eax){1to4}, %ymm6
+	vpopcntq	4064(%edx), %ymm6	 #  Disp8
+	vpopcntq	4096(%edx), %ymm6
+	vpopcntq	-4096(%edx), %ymm6	 #  Disp8
+	vpopcntq	-4128(%edx), %ymm6
+	vpopcntq	1016(%edx){1to4}, %ymm6	 #  Disp8
+	vpopcntq	1024(%edx){1to4}, %ymm6
+	vpopcntq	-1024(%edx){1to4}, %ymm6	 #  Disp8
+	vpopcntq	-1032(%edx){1to4}, %ymm6
+
+	.intel_syntax noprefix
+	vpopcntd	xmm6, xmm5
+	vpopcntd	xmm6{k7}, xmm5
+	vpopcntd	xmm6{k7}{z}, xmm5
+	vpopcntd	xmm6, XMMWORD PTR [ecx]
+	vpopcntd	xmm6, XMMWORD PTR [esp+esi*8-123456]
+	vpopcntd	xmm6, [eax]{1to4}
+	vpopcntd	xmm6, DWORD BCST [eax]
+	vpopcntd	xmm6, XMMWORD PTR [edx+2032]	 #  Disp8
+	vpopcntd	xmm6, XMMWORD PTR [edx+2048]
+	vpopcntd	xmm6, XMMWORD PTR [edx-2048]	 #  Disp8
+	vpopcntd	xmm6, XMMWORD PTR [edx-2064]
+	vpopcntd	xmm6, [edx+508]{1to4}	 #  Disp8
+	vpopcntd	xmm6, [edx+512]{1to4}
+	vpopcntd	xmm6, [edx-512]{1to4}	 #  Disp8
+	vpopcntd	xmm6, [edx-516]{1to4}
+	vpopcntq	xmm6, xmm5
+	vpopcntq	xmm6{k7}, xmm5
+	vpopcntq	xmm6{k7}{z}, xmm5
+	vpopcntq	xmm6, XMMWORD PTR [ecx]
+	vpopcntq	xmm6, XMMWORD PTR [esp+esi*8-123456]
+	vpopcntq	xmm6, [eax]{1to2}
+	vpopcntq	xmm6, QWORD BCST [eax]
+	vpopcntq	xmm6, XMMWORD PTR [edx+2032]	 #  Disp8
+	vpopcntq	xmm6, XMMWORD PTR [edx+2048]
+	vpopcntq	xmm6, XMMWORD PTR [edx-2048]	 #  Disp8
+	vpopcntq	xmm6, XMMWORD PTR [edx-2064]
+	vpopcntq	xmm6, [edx+1016]{1to2}	 #  Disp8
+	vpopcntq	xmm6, [edx+1024]{1to2}
+	vpopcntq	xmm6, [edx-1024]{1to2}	 #  Disp8
+	vpopcntq	xmm6, [edx-1032]{1to2}
--- a/gas/testsuite/gas/i386/avx512bitalg-intel.d
+++ b/gas/testsuite/gas/i386/avx512bitalg-intel.d
@@ -8,7 +8,7 @@
 
 Disassembly of section \.text:
 
-00000000 <_start>:
+00000000 <bitalg>:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 8f ec[ 	]*vpshufbitqmb k5,zmm5,zmm4
 [ 	]*[a-f0-9]+:[ 	]*62 f2 55 4f 8f ec[ 	]*vpshufbitqmb k5\{k7\},zmm5,zmm4
 [ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 8f ac f4 c0 1d fe ff[ 	]*vpshufbitqmb k5,zmm5,ZMMWORD PTR \[esp\+esi\*8-0x1e240\]
--- a/gas/testsuite/gas/i386/avx512bitalg.d
+++ b/gas/testsuite/gas/i386/avx512bitalg.d
@@ -8,7 +8,7 @@
 
 Disassembly of section \.text:
 
-00000000 <_start>:
+00000000 <bitalg>:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 8f ec[ 	]*vpshufbitqmb %zmm4,%zmm5,%k5
 [ 	]*[a-f0-9]+:[ 	]*62 f2 55 4f 8f ec[ 	]*vpshufbitqmb %zmm4,%zmm5,%k5\{%k7\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 8f ac f4 c0 1d fe ff[ 	]*vpshufbitqmb -0x1e240\(%esp,%esi,8\),%zmm5,%k5
--- a/gas/testsuite/gas/i386/avx512bitalg.s
+++ b/gas/testsuite/gas/i386/avx512bitalg.s
@@ -2,7 +2,7 @@
 
 	.allow_index_reg
 	.text
-_start:
+bitalg:
 	vpshufbitqmb	%zmm4, %zmm5, %k5	 # AVX512BITALG
 	vpshufbitqmb	%zmm4, %zmm5, %k5{%k7}	 # AVX512BITALG
 	vpshufbitqmb	-123456(%esp,%esi,8), %zmm5, %k5	 # AVX512BITALG
--- a/gas/testsuite/gas/i386/avx512bitalg_vl-intel.d
+++ b/gas/testsuite/gas/i386/avx512bitalg_vl-intel.d
@@ -8,7 +8,7 @@
 
 Disassembly of section \.text:
 
-00000000 <_start>:
+00000000 <bitalg>:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8f ec[ 	]*vpshufbitqmb k5\{k7\},xmm5,xmm4
 [ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8f ac f4 c0 1d fe ff[ 	]*vpshufbitqmb k5\{k7\},xmm5,XMMWORD PTR \[esp\+esi\*8-0x1e240\]
 [ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8f 6a 7f[ 	]*vpshufbitqmb k5\{k7\},xmm5,XMMWORD PTR \[edx\+0x7f0\]
--- a/gas/testsuite/gas/i386/avx512bitalg_vl.d
+++ b/gas/testsuite/gas/i386/avx512bitalg_vl.d
@@ -8,7 +8,7 @@
 
 Disassembly of section \.text:
 
-00000000 <_start>:
+00000000 <bitalg>:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8f ec[ 	]*vpshufbitqmb %xmm4,%xmm5,%k5\{%k7\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8f ac f4 c0 1d fe ff[ 	]*vpshufbitqmb -0x1e240\(%esp,%esi,8\),%xmm5,%k5\{%k7\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8f 6a 7f[ 	]*vpshufbitqmb 0x7f0\(%edx\),%xmm5,%k5\{%k7\}
--- a/gas/testsuite/gas/i386/avx512bitalg_vl.s
+++ b/gas/testsuite/gas/i386/avx512bitalg_vl.s
@@ -2,7 +2,7 @@
 
 	.allow_index_reg
 	.text
-_start:
+bitalg:
 	vpshufbitqmb	%xmm4, %xmm5, %k5{%k7}	 # AVX512{BITALG,VL}
 	vpshufbitqmb	-123456(%esp,%esi,8), %xmm5, %k5{%k7}	 # AVX512{BITALG,VL}
 	vpshufbitqmb	2032(%edx), %xmm5, %k5{%k7}	 # AVX512{BITALG,VL} Disp8
--- a/gas/testsuite/gas/i386/avx512cd-intel.d
+++ b/gas/testsuite/gas/i386/avx512cd-intel.d
@@ -8,7 +8,7 @@
 
 Disassembly of section .text:
 
-0+ <_start>:
+0+ <cd>:
 [ 	]*[a-f0-9]+:	62 f2 7d 48 c4 f5    	vpconflictd zmm6,zmm5
 [ 	]*[a-f0-9]+:	62 f2 7d 4f c4 f5    	vpconflictd zmm6\{k7\},zmm5
 [ 	]*[a-f0-9]+:	62 f2 7d cf c4 f5    	vpconflictd zmm6\{k7\}\{z\},zmm5
--- a/gas/testsuite/gas/i386/avx512cd.d
+++ b/gas/testsuite/gas/i386/avx512cd.d
@@ -7,7 +7,7 @@
 
 Disassembly of section .text:
 
-0+ <_start>:
+0+ <cd>:
 [ 	]*[a-f0-9]+:	62 f2 7d 48 c4 f5    	vpconflictd %zmm5,%zmm6
 [ 	]*[a-f0-9]+:	62 f2 7d 4f c4 f5    	vpconflictd %zmm5,%zmm6\{%k7\}
 [ 	]*[a-f0-9]+:	62 f2 7d cf c4 f5    	vpconflictd %zmm5,%zmm6\{%k7\}\{z\}
--- a/gas/testsuite/gas/i386/avx512cd.s
+++ b/gas/testsuite/gas/i386/avx512cd.s
@@ -2,7 +2,7 @@
 
 	.allow_index_reg
 	.text
-_start:
+cd:
 
 	vpconflictd	%zmm5, %zmm6	 # AVX512CD
 	vpconflictd	%zmm5, %zmm6{%k7}	 # AVX512CD
--- a/gas/testsuite/gas/i386/avx512cd_vl-intel.d
+++ b/gas/testsuite/gas/i386/avx512cd_vl-intel.d
@@ -8,7 +8,7 @@
 
 Disassembly of section \.text:
 
-00000000 <_start>:
+00000000 <cd>:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f c4 f5[ 	]*vpconflictd xmm6\{k7\},xmm5
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 8f c4 f5[ 	]*vpconflictd xmm6\{k7\}\{z\},xmm5
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f c4 31[ 	]*vpconflictd xmm6\{k7\},XMMWORD PTR \[ecx\]
--- a/gas/testsuite/gas/i386/avx512cd_vl.d
+++ b/gas/testsuite/gas/i386/avx512cd_vl.d
@@ -8,7 +8,7 @@
 
 Disassembly of section \.text:
 
-00000000 <_start>:
+00000000 <cd>:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f c4 f5[ 	]*vpconflictd %xmm5,%xmm6\{%k7\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 8f c4 f5[ 	]*vpconflictd %xmm5,%xmm6\{%k7\}\{z\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f c4 31[ 	]*vpconflictd \(%ecx\),%xmm6\{%k7\}
--- a/gas/testsuite/gas/i386/avx512cd_vl.s
+++ b/gas/testsuite/gas/i386/avx512cd_vl.s
@@ -2,7 +2,7 @@
 
 	.allow_index_reg
 	.text
-_start:
+cd:
 	vpconflictd	%xmm5, %xmm6{%k7}	 # AVX512{CD,VL}
 	vpconflictd	%xmm5, %xmm6{%k7}{z}	 # AVX512{CD,VL}
 	vpconflictd	(%ecx), %xmm6{%k7}	 # AVX512{CD,VL}
--- a/gas/testsuite/gas/i386/avx512ifma-intel.d
+++ b/gas/testsuite/gas/i386/avx512ifma-intel.d
@@ -8,7 +8,7 @@
 
 Disassembly of section \.text:
 
-00000000 <_start>:
+00000000 <ifma>:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b4 f4[ 	]*vpmadd52luq zmm6,zmm5,zmm4
 [ 	]*[a-f0-9]+:[ 	]*62 f2 d5 4f b4 f4[ 	]*vpmadd52luq zmm6\{k7\},zmm5,zmm4
 [ 	]*[a-f0-9]+:[ 	]*62 f2 d5 cf b4 f4[ 	]*vpmadd52luq zmm6\{k7\}\{z\},zmm5,zmm4
--- a/gas/testsuite/gas/i386/avx512ifma.d
+++ b/gas/testsuite/gas/i386/avx512ifma.d
@@ -8,7 +8,7 @@
 
 Disassembly of section \.text:
 
-00000000 <_start>:
+00000000 <ifma>:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 d5 48 b4 f4[ 	]*vpmadd52luq %zmm4,%zmm5,%zmm6
 [ 	]*[a-f0-9]+:[ 	]*62 f2 d5 4f b4 f4[ 	]*vpmadd52luq %zmm4,%zmm5,%zmm6\{%k7\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 d5 cf b4 f4[ 	]*vpmadd52luq %zmm4,%zmm5,%zmm6\{%k7\}\{z\}
--- a/gas/testsuite/gas/i386/avx512ifma.s
+++ b/gas/testsuite/gas/i386/avx512ifma.s
@@ -2,7 +2,7 @@
 
 	.allow_index_reg
 	.text
-_start:
+ifma:
 	vpmadd52luq	%zmm4, %zmm5, %zmm6	 # AVX512IFMA
 	vpmadd52luq	%zmm4, %zmm5, %zmm6{%k7}	 # AVX512IFMA
 	vpmadd52luq	%zmm4, %zmm5, %zmm6{%k7}{z}	 # AVX512IFMA
--- a/gas/testsuite/gas/i386/avx512ifma_vl-intel.d
+++ b/gas/testsuite/gas/i386/avx512ifma_vl-intel.d
@@ -8,7 +8,7 @@
 
 Disassembly of section \.text:
 
-00000000 <_start>:
+00000000 <ifma>:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b4 f4[ 	]*vpmadd52luq xmm6\{k7\},xmm5,xmm4
 [ 	]*[a-f0-9]+:[ 	]*62 f2 d5 8f b4 f4[ 	]*vpmadd52luq xmm6\{k7\}\{z\},xmm5,xmm4
 [ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b4 31[ 	]*vpmadd52luq xmm6\{k7\},xmm5,XMMWORD PTR \[ecx\]
--- a/gas/testsuite/gas/i386/avx512ifma_vl.d
+++ b/gas/testsuite/gas/i386/avx512ifma_vl.d
@@ -8,7 +8,7 @@
 
 Disassembly of section \.text:
 
-00000000 <_start>:
+00000000 <ifma>:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b4 f4[ 	]*vpmadd52luq %xmm4,%xmm5,%xmm6\{%k7\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 d5 8f b4 f4[ 	]*vpmadd52luq %xmm4,%xmm5,%xmm6\{%k7\}\{z\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 d5 0f b4 31[ 	]*vpmadd52luq \(%ecx\),%xmm5,%xmm6\{%k7\}
--- a/gas/testsuite/gas/i386/avx512ifma_vl.s
+++ b/gas/testsuite/gas/i386/avx512ifma_vl.s
@@ -2,7 +2,7 @@
 
 	.allow_index_reg
 	.text
-_start:
+ifma:
 	vpmadd52luq	%xmm4, %xmm5, %xmm6{%k7}	 # AVX512{IFMA,VL}
 	vpmadd52luq	%xmm4, %xmm5, %xmm6{%k7}{z}	 # AVX512{IFMA,VL}
 	vpmadd52luq	(%ecx), %xmm5, %xmm6{%k7}	 # AVX512{IFMA,VL}
--- a/gas/testsuite/gas/i386/avx512vbmi-intel.d
+++ b/gas/testsuite/gas/i386/avx512vbmi-intel.d
@@ -8,7 +8,7 @@
 
 Disassembly of section \.text:
 
-00000000 <_start>:
+00000000 <vbmi>:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 8d f4[ 	]*vpermb zmm6,zmm5,zmm4
 [ 	]*[a-f0-9]+:[ 	]*62 f2 55 4f 8d f4[ 	]*vpermb zmm6\{k7\},zmm5,zmm4
 [ 	]*[a-f0-9]+:[ 	]*62 f2 55 cf 8d f4[ 	]*vpermb zmm6\{k7\}\{z\},zmm5,zmm4
--- a/gas/testsuite/gas/i386/avx512vbmi.d
+++ b/gas/testsuite/gas/i386/avx512vbmi.d
@@ -8,7 +8,7 @@
 
 Disassembly of section \.text:
 
-00000000 <_start>:
+00000000 <vbmi>:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 55 48 8d f4[ 	]*vpermb %zmm4,%zmm5,%zmm6
 [ 	]*[a-f0-9]+:[ 	]*62 f2 55 4f 8d f4[ 	]*vpermb %zmm4,%zmm5,%zmm6\{%k7\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 55 cf 8d f4[ 	]*vpermb %zmm4,%zmm5,%zmm6\{%k7\}\{z\}
--- a/gas/testsuite/gas/i386/avx512vbmi.s
+++ b/gas/testsuite/gas/i386/avx512vbmi.s
@@ -2,7 +2,7 @@
 
 	.allow_index_reg
 	.text
-_start:
+vbmi:
 	vpermb	%zmm4, %zmm5, %zmm6	 # AVX512VBMI
 	vpermb	%zmm4, %zmm5, %zmm6{%k7}	 # AVX512VBMI
 	vpermb	%zmm4, %zmm5, %zmm6{%k7}{z}	 # AVX512VBMI
--- a/gas/testsuite/gas/i386/avx512vbmi2-intel.d
+++ b/gas/testsuite/gas/i386/avx512vbmi2-intel.d
@@ -8,7 +8,7 @@
 
 Disassembly of section \.text:
 
-00000000 <_start>:
+00000000 <vbmi2>:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 4f 63 31[ 	]*vpcompressb ZMMWORD PTR \[ecx\]\{k7\},zmm6
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 63 b4 f4 c0 1d fe ff[ 	]*vpcompressb ZMMWORD PTR \[esp\+esi\*8-0x1e240\],zmm6
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 63 72 7e[ 	]*vpcompressb ZMMWORD PTR \[edx\+0x7e\],zmm6
--- a/gas/testsuite/gas/i386/avx512vbmi2.d
+++ b/gas/testsuite/gas/i386/avx512vbmi2.d
@@ -8,7 +8,7 @@
 
 Disassembly of section \.text:
 
-00000000 <_start>:
+00000000 <vbmi2>:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 4f 63 31[ 	]*vpcompressb %zmm6,\(%ecx\)\{%k7\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 63 b4 f4 c0 1d fe ff[ 	]*vpcompressb %zmm6,-0x1e240\(%esp,%esi,8\)
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 48 63 72 7e[ 	]*vpcompressb %zmm6,0x7e\(%edx\)
--- a/gas/testsuite/gas/i386/avx512vbmi2.s
+++ b/gas/testsuite/gas/i386/avx512vbmi2.s
@@ -2,7 +2,7 @@
 
 	.allow_index_reg
 	.text
-_start:
+vbmi2:
 	vpcompressb	%zmm6, (%ecx){%k7}	 # AVX512VBMI2
 	vpcompressb	%zmm6, -123456(%esp,%esi,8)	 # AVX512VBMI2
 	vpcompressb	%zmm6, 126(%edx)	 # AVX512VBMI2 Disp8
--- a/gas/testsuite/gas/i386/avx512vbmi2_vl-intel.d
+++ b/gas/testsuite/gas/i386/avx512vbmi2_vl-intel.d
@@ -8,7 +8,7 @@
 
 Disassembly of section \.text:
 
-00000000 <_start>:
+00000000 <vbmi2>:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 63 b4 f4 c0 1d fe ff[ 	]*vpcompressb XMMWORD PTR \[esp\+esi\*8-0x1e240\]\{k7\},xmm6
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 63 72 7e[ 	]*vpcompressb XMMWORD PTR \[edx\+0x7e\]\{k7\},xmm6
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 63 b4 f4 c0 1d fe ff[ 	]*vpcompressb YMMWORD PTR \[esp\+esi\*8-0x1e240\]\{k7\},ymm6
--- a/gas/testsuite/gas/i386/avx512vbmi2_vl.d
+++ b/gas/testsuite/gas/i386/avx512vbmi2_vl.d
@@ -8,7 +8,7 @@
 
 Disassembly of section \.text:
 
-00000000 <_start>:
+00000000 <vbmi2>:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 63 b4 f4 c0 1d fe ff[ 	]*vpcompressb %xmm6,-0x1e240\(%esp,%esi,8\)\{%k7\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 0f 63 72 7e[ 	]*vpcompressb %xmm6,0x7e\(%edx\)\{%k7\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 63 b4 f4 c0 1d fe ff[ 	]*vpcompressb %ymm6,-0x1e240\(%esp,%esi,8\)\{%k7\}
--- a/gas/testsuite/gas/i386/avx512vbmi2_vl.s
+++ b/gas/testsuite/gas/i386/avx512vbmi2_vl.s
@@ -2,7 +2,7 @@
 
 	.allow_index_reg
 	.text
-_start:
+vbmi2:
 	vpcompressb	%xmm6, -123456(%esp,%esi,8){%k7}	 # AVX512{VBMI2,VL}
 	vpcompressb	%xmm6, 126(%edx){%k7}	 # AVX512{VBMI2,VL} Disp8
 	vpcompressb	%ymm6, -123456(%esp,%esi,8){%k7}	 # AVX512{VBMI2,VL}
--- a/gas/testsuite/gas/i386/avx512vbmi_vl-intel.d
+++ b/gas/testsuite/gas/i386/avx512vbmi_vl-intel.d
@@ -8,7 +8,7 @@
 
 Disassembly of section \.text:
 
-00000000 <_start>:
+00000000 <vbmi>:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8d f4[ 	]*vpermb xmm6\{k7\},xmm5,xmm4
 [ 	]*[a-f0-9]+:[ 	]*62 f2 55 8f 8d f4[ 	]*vpermb xmm6\{k7\}\{z\},xmm5,xmm4
 [ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8d 31[ 	]*vpermb xmm6\{k7\},xmm5,XMMWORD PTR \[ecx\]
--- a/gas/testsuite/gas/i386/avx512vbmi_vl.d
+++ b/gas/testsuite/gas/i386/avx512vbmi_vl.d
@@ -8,7 +8,7 @@
 
 Disassembly of section \.text:
 
-00000000 <_start>:
+00000000 <vbmi>:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8d f4[ 	]*vpermb %xmm4,%xmm5,%xmm6\{%k7\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 55 8f 8d f4[ 	]*vpermb %xmm4,%xmm5,%xmm6\{%k7\}\{z\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 55 0f 8d 31[ 	]*vpermb \(%ecx\),%xmm5,%xmm6\{%k7\}
--- a/gas/testsuite/gas/i386/avx512vbmi_vl.s
+++ b/gas/testsuite/gas/i386/avx512vbmi_vl.s
@@ -2,7 +2,7 @@
 
 	.allow_index_reg
 	.text
-_start:
+vbmi:
 	vpermb	%xmm4, %xmm5, %xmm6{%k7}	 # AVX512{VBMI,VL}
 	vpermb	%xmm4, %xmm5, %xmm6{%k7}{z}	 # AVX512{VBMI,VL}
 	vpermb	(%ecx), %xmm5, %xmm6{%k7}	 # AVX512{VBMI,VL}
--- a/gas/testsuite/gas/i386/avx512vnni-intel.d
+++ b/gas/testsuite/gas/i386/avx512vnni-intel.d
@@ -8,7 +8,7 @@
 
 Disassembly of section \.text:
 
-00000000 <_start>:
+00000000 <vnni>:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 75 48 52 e3[ 	]*vpdpwssd zmm4,zmm1,zmm3
 [ 	]*[a-f0-9]+:[ 	]*62 f2 75 49 52 e3[ 	]*vpdpwssd zmm4\{k1\},zmm1,zmm3
 [ 	]*[a-f0-9]+:[ 	]*62 f2 75 c9 52 e3[ 	]*vpdpwssd zmm4\{k1\}\{z\},zmm1,zmm3
--- a/gas/testsuite/gas/i386/avx512vnni.d
+++ b/gas/testsuite/gas/i386/avx512vnni.d
@@ -8,7 +8,7 @@
 
 Disassembly of section \.text:
 
-00000000 <_start>:
+00000000 <vnni>:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 75 48 52 e3[ 	]*vpdpwssd %zmm3,%zmm1,%zmm4
 [ 	]*[a-f0-9]+:[ 	]*62 f2 75 49 52 e3[ 	]*vpdpwssd %zmm3,%zmm1,%zmm4\{%k1\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 75 c9 52 e3[ 	]*vpdpwssd %zmm3,%zmm1,%zmm4\{%k1\}\{z\}
--- a/gas/testsuite/gas/i386/avx512vnni.s
+++ b/gas/testsuite/gas/i386/avx512vnni.s
@@ -2,7 +2,7 @@
 
 	.allow_index_reg
 	.text
-_start:
+vnni:
 	vpdpwssd	%zmm3, %zmm1, %zmm4	 # AVX512VNNI
 	vpdpwssd	%zmm3, %zmm1, %zmm4{%k1}	 # AVX512VNNI
 	vpdpwssd	%zmm3, %zmm1, %zmm4{%k1}{z}	 # AVX512VNNI
--- a/gas/testsuite/gas/i386/avx512vnni_vl-intel.d
+++ b/gas/testsuite/gas/i386/avx512vnni_vl-intel.d
@@ -8,7 +8,7 @@
 
 Disassembly of section \.text:
 
-00000000 <_start>:
+00000000 <vnni>:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 5d 0b 52 d2[ 	]*vpdpwssd xmm2\{k3\},xmm4,xmm2
 [ 	]*[a-f0-9]+:[ 	]*62 f2 5d 8b 52 d2[ 	]*vpdpwssd xmm2\{k3\}\{z\},xmm4,xmm2
 [ 	]*[a-f0-9]+:[ 	]*62 f2 5d 09 52 94 f4 c0 1d fe ff[ 	]*vpdpwssd xmm2\{k1\},xmm4,XMMWORD PTR \[esp\+esi\*8-0x1e240\]
--- a/gas/testsuite/gas/i386/avx512vnni_vl.d
+++ b/gas/testsuite/gas/i386/avx512vnni_vl.d
@@ -8,7 +8,7 @@
 
 Disassembly of section \.text:
 
-00000000 <_start>:
+00000000 <vnni>:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 5d 0b 52 d2[ 	]*vpdpwssd %xmm2,%xmm4,%xmm2\{%k3\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 5d 8b 52 d2[ 	]*vpdpwssd %xmm2,%xmm4,%xmm2\{%k3\}\{z\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 5d 09 52 94 f4 c0 1d fe ff[ 	]*vpdpwssd -0x1e240\(%esp,%esi,8\),%xmm4,%xmm2\{%k1\}
--- a/gas/testsuite/gas/i386/avx512vnni_vl.s
+++ b/gas/testsuite/gas/i386/avx512vnni_vl.s
@@ -2,7 +2,7 @@
 
 	.allow_index_reg
 	.text
-_start:
+vnni:
 	vpdpwssd	%xmm2, %xmm4, %xmm2{%k3}	 # AVX512{VNNI,VL}
 	vpdpwssd	%xmm2, %xmm4, %xmm2{%k3}{z}	 # AVX512{VNNI,VL}
 	vpdpwssd	-123456(%esp,%esi,8), %xmm4, %xmm2{%k1}	 # AVX512{VNNI,VL}
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -449,6 +449,7 @@ if [gas_32_check] then {
     run_dump_test "avx512_4vnniw-intel"
     run_dump_test "avx512_vpopcntdq"
     run_dump_test "avx512_vpopcntdq-intel"
+    run_dump_test "avx512_vpopcntdq_vl"
     run_dump_test "avx512vbmi2"
     run_dump_test "avx512vbmi2-intel"
     run_dump_test "avx512vbmi2_vl"
@@ -562,6 +563,22 @@ if [gas_32_check] then {
     run_list_test "avx512vl-2" "-al"
     run_list_test "avx512vl-ambig"
     run_list_test "avx512vl-plain" "-al"
+    run_dump_test "avx10.1-512-1"
+    run_dump_test "avx10.1-512-2"
+    run_dump_test "avx10.1-512-3"
+    run_dump_test "avx10.1-512-4"
+    run_dump_test "avx10.1-512-5"
+    run_dump_test "avx10.1-512-gfni"
+    run_dump_test "avx10.1-512-vaes"
+    run_dump_test "avx10.1-512-vpclmulqdq"
+    run_dump_test "avx10.1-256-1"
+    run_dump_test "avx10.1-256-2"
+    run_dump_test "avx10.1-256-3"
+    run_dump_test "avx10.1-256-4"
+    run_dump_test "avx10.1-256-5"
+    run_dump_test "avx10.1-256-gfni"
+    run_dump_test "avx10.1-256-vaes"
+    run_dump_test "avx10.1-256-vpclmulqdq"
     run_dump_test "fpu-bad"
     run_dump_test "cet"
     run_dump_test "cet-intel"
--- a/gas/testsuite/gas/i386/noavx512-1.l
+++ b/gas/testsuite/gas/i386/noavx512-1.l
@@ -39,6 +39,26 @@
 .*:25: Error: .*not supported.*
 .*:26: Error: .*not supported.*
 .*:27: Error: .*not supported.*
+.*:8: Error: .*bad register name.*
+.*:9: Error: .*unknown vector operation.*
+.*:10: Error: .*unknown vector operation.*
+.*:11: Error: .*not supported.*
+.*:12: Error: .*not supported.*
+.*:13: Error: .*not supported.*
+.*:14: Error: .*not supported.*
+.*:15: Error: .*not supported.*
+.*:16: Error: .*not supported.*
+.*:17: Error: .*not supported.*
+.*:18: Error: .*bad register name.*
+.*:19: Error: .*unknown vector operation.*
+.*:20: Error: .*unknown vector operation.*
+.*:21: Error: .*bad register name.*
+.*:22: Error: .*unknown vector operation.*
+.*:23: Error: .*unknown vector operation.*
+.*:24: Error: .*not supported.*
+.*:25: Error: .*not supported.*
+.*:26: Error: .*not supported.*
+.*:27: Error: .*not supported.*
 #...
 [ 	]*[0-9]+[ 	]+\# Test \.arch \.noavx512XX
 [ 	]*[0-9]+[ 	]+\.text
@@ -383,6 +403,29 @@
 [ 	]*[0-9]+[ 	]+>  vpabsb %zmm5,%zmm6\{%k7\}
 [ 	]*[0-9]+[ 	]+>  vpabsb %xmm5,%xmm6\{%k7\}
 [ 	]*[0-9]+[ 	]+>  vpabsb %ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpconflictd %zmm5,%zmm6
+[ 	]*[0-9]+[ 	]+>  vpconflictd %xmm5,%xmm6
+[ 	]*[0-9]+[ 	]+>  vpconflictd %ymm5,%ymm6
+[ 	]*[0-9]+[ 	]+>  vcvtpd2qq \(%ecx\),%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vcvtpd2qq \(%ecx\),%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vcvtpd2qq \(%ecx\),%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vexp2ps %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vaddpd %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vaddpd %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vaddpd %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpmadd52luq %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpmadd52luq %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpmadd52luq %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vgatherpf0dpd 23\(%ebp,%ymm7,8\)\{%k1\}
+[ 	]*[0-9]+[ 	]+>  vpermb %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpermb %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpermb %ymm4,%ymm5,%ymm6\{%k7\}
+#...
+[ 	]*[0-9]+[ 	]+>  \.arch default
+[ 	]*[0-9]+[ 	]+>  \.arch \.noavx10\.1
+[ 	]*[0-9]+[ 	]+>  vpabsb %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpabsb %xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpabsb %ymm5,%ymm6\{%k7\}
 [ 	]*[0-9]+[ 	]+>  vpconflictd %zmm5,%zmm6
 [ 	]*[0-9]+[ 	]+>  vpconflictd %xmm5,%xmm6
 [ 	]*[0-9]+[ 	]+>  vpconflictd %ymm5,%ymm6
--- a/gas/testsuite/gas/i386/noavx512-1.s
+++ b/gas/testsuite/gas/i386/noavx512-1.s
@@ -1,7 +1,7 @@
 # Test .arch .noavx512XX
 	.text
 
-	.irp isa, default, .noavx512bw, .noavx512cd, .noavx512dq, .noavx512er, .noavx512ifma, .noavx512pf, .noavx512vbmi, .noavx512f
+	.irp isa, default, .noavx512bw, .noavx512cd, .noavx512dq, .noavx512er, .noavx512ifma, .noavx512pf, .noavx512vbmi, .noavx512f, .noavx10.1
 
 	.arch default
 	.arch \isa
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -232,6 +232,9 @@ static const dependency isa_dependencies
     "AVX2" },
   { "VPCLMULQDQ",
     "AVX2" },
+  { "AVX10_1",
+    "AVX512VL|AVX512DQ|AVX512CD|AVX512VBMI|AVX512_VBMI2|AVX512IFMA"
+    "|AVX512_VNNI|AVX512_BF16|AVX512_FP16|AVX512_VPOPCNTDQ|AVX512_BITALG" },
   { "SEV_ES",
     "SVME" },
   { "SNP",
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -1762,22 +1762,28 @@ vpgatherq<dq>, 0x6691, AVX2, Modrm|Vex12
 vpgatherqd, 0x6691, AVX2, Modrm|Vex256|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf|VecSIB256, { RegXMM, Dword|Unspecified|BaseIndex, RegXMM }
 vpgatherqq, 0x6691, AVX2, Modrm|Vex256|Space0F38|VexVVVV|VexW1|SwapSources|NoSuf|VecSIB256, { RegYMM, Qword|Unspecified|BaseIndex, RegYMM }
 
+// AVX10 is somewhat special wrt VAES and VPCLMULQDQ: The latter two imply
+// 256-bit VEX encodings, but surely the 128-bit forms are also supported.
+// Short of making AES a prereq of VAES (and PCLMULQDQ for VPCLMULQDQ) in
+// i386-gen, add otherwise redundant secondary templates here.
+<avx10:cpu, $avx:AVX|, $avx10:AVX512F|V>
+
 // AES + AVX
 
-vaesdec, 0x66de, AVX|AES, Modrm|Vex|Space0F38|VexVVVV|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vaesdeclast, 0x66df, AVX|AES, Modrm|Vex|Space0F38|VexVVVV|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vaesenc, 0x66dc, AVX|AES, Modrm|Vex|Space0F38|VexVVVV|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vaesenclast, 0x66dd, AVX|AES, Modrm|Vex|Space0F38|VexVVVV|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vaesimc, 0x66db, AVX|AES, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM }
+vaesdec<avx10>, 0x66de, <avx10:cpu>AES, Modrm|Vex|Space0F38|VexVVVV|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vaesdeclast<avx10>, 0x66df, <avx10:cpu>AES, Modrm|Vex|Space0F38|VexVVVV|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vaesenc<avx10>, 0x66dc, <avx10:cpu>AES, Modrm|Vex|Space0F38|VexVVVV|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vaesenclast<avx10>, 0x66dd, <avx10:cpu>AES, Modrm|Vex|Space0F38|VexVVVV|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vaesimc<avx10>, 0x66db, AVX|AES, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM }
 vaeskeygenassist, 0x66df, AVX|AES, Modrm|Vex|Space0F3A|VexWIG|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegXMM }
 
 // PCLMULQDQ + AVX
 
-vpclmulqdq, 0x6644, AVX|PCLMULQDQ, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vpclmullqlqdq, 0x6644/0x00, AVX|PCLMULQDQ, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vpclmulhqlqdq, 0x6644/0x01, AVX|PCLMULQDQ, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vpclmullqhqdq, 0x6644/0x10, AVX|PCLMULQDQ, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vpclmulhqhqdq, 0x6644/0x11, AVX|PCLMULQDQ, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vpclmulqdq<avx10>, 0x6644, <avx10:cpu>PCLMULQDQ, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vpclmullqlqdq<avx10>, 0x6644/0x00, <avx10:cpu>PCLMULQDQ, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vpclmulhqlqdq<avx10>, 0x6644/0x01, <avx10:cpu>PCLMULQDQ, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vpclmullqhqdq<avx10>, 0x6644/0x10, <avx10:cpu>PCLMULQDQ, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vpclmulhqhqdq<avx10>, 0x6644/0x11, <avx10:cpu>PCLMULQDQ, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 
 // GFNI + AVX
 



* [PATCH 4/5] x86: unindent most of set_cpu_arch()
  2023-08-25 12:43 [PATCH 0/5] x86: AVX10.1 (alternative attempt) Jan Beulich
                   ` (2 preceding siblings ...)
  2023-08-25 12:46 ` [PATCH 3/5] x86: support AVX10.1/512 Jan Beulich
@ 2023-08-25 12:47 ` Jan Beulich
  2023-08-25 12:47 ` [PATCH 5/5] x86: support AVX10.1 vector size restrictions Jan Beulich
  4 siblings, 0 replies; 27+ messages in thread
From: Jan Beulich @ 2023-08-25 12:47 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu, Jiang, Haochen

Inverting the initial if()'s condition allows moving the bulk of the
function out by one level, improving readability at least a bit. While
doing that, also pull the push/pop handling up first, so that "else if"
after "return" isn't needed anymore; the order in which special cases
are checked doesn't really matter.
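
For readers unfamiliar with the pattern, the restructuring described above
can be sketched in isolation as follows. This is only an illustration:
handle_arch_arg() and its return values are hypothetical and do not appear
in tc-i386.c; only the control-flow shape mirrors the change.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a directive handler; not the gas code.
   Returns -1 for a missing argument, 1 for "push", 2 for "pop",
   and 0 for anything else.  */
static int
handle_arch_arg (const char *arg)
{
  /* Error case first: bail out early instead of wrapping everything
     below in an if() body.  */
  if (arg == NULL || *arg == '\0')
    {
      fprintf (stderr, "missing cpu architecture\n");
      return -1;
    }

  /* Special cases next.  Each one ends in "return", so no "else if"
     chain is needed and their relative order does not matter.  */
  if (strcmp (arg, "push") == 0)
    return 1;

  if (strcmp (arg, "pop") == 0)
    return 2;

  /* The bulk of the function now sits at the outermost level.  */
  return 0;
}
```

The early return for the error case is what removes one level of
indentation from everything that follows.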

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -2794,29 +2794,134 @@ set_cpu_arch (int dummy ATTRIBUTE_UNUSED
     bool no_cond_jump_promotion;
   } arch_stack_entry;
   static const arch_stack_entry *arch_stack_top;
+  char *s;
+  int e;
+  const char *string;
+  unsigned int j = 0;
+  i386_cpu_flags flags;
 
   SKIP_WHITESPACE ();
 
-  if (!is_end_of_line[(unsigned char) *input_line_pointer])
+  if (is_end_of_line[(unsigned char) *input_line_pointer])
     {
-      char *s;
-      int e = get_symbol_name (&s);
-      const char *string = s;
-      unsigned int j = 0;
-      i386_cpu_flags flags;
+      as_bad (_("missing cpu architecture"));
+      input_line_pointer++;
+      return;
+    }
+
+  e = get_symbol_name (&s);
+  string = s;
+
+  if (strcmp (string, "push") == 0)
+    {
+      arch_stack_entry *top = XNEW (arch_stack_entry);
+
+      top->name = cpu_arch_name;
+      if (cpu_sub_arch_name)
+	top->sub_name = xstrdup (cpu_sub_arch_name);
+      else
+	top->sub_name = NULL;
+      top->flags = cpu_arch_flags;
+      top->isa = cpu_arch_isa;
+      top->isa_flags = cpu_arch_isa_flags;
+      top->flag_code = flag_code;
+      top->stackop_size = stackop_size;
+      top->no_cond_jump_promotion = no_cond_jump_promotion;
+
+      top->prev = arch_stack_top;
+      arch_stack_top = top;
+
+      (void) restore_line_pointer (e);
+      demand_empty_rest_of_line ();
+      return;
+    }
+
+  if (strcmp (string, "pop") == 0)
+    {
+      const arch_stack_entry *top = arch_stack_top;
+
+      if (!top)
+	as_bad (_(".arch stack is empty"));
+      else if (top->flag_code != flag_code
+	       || top->stackop_size != stackop_size)
+	{
+	  static const unsigned int bits[] = {
+	    [CODE_16BIT] = 16,
+	    [CODE_32BIT] = 32,
+	    [CODE_64BIT] = 64,
+	  };
+
+	  as_bad (_("this `.arch pop' requires `.code%u%s' to be in effect"),
+		  bits[top->flag_code],
+		  top->stackop_size == LONG_MNEM_SUFFIX ? "gcc" : "");
+	}
+      else
+	{
+	  arch_stack_top = top->prev;
+
+	  cpu_arch_name = top->name;
+	  free (cpu_sub_arch_name);
+	  cpu_sub_arch_name = top->sub_name;
+	  cpu_arch_flags = top->flags;
+	  cpu_arch_isa = top->isa;
+	  cpu_arch_isa_flags = top->isa_flags;
+	  no_cond_jump_promotion = top->no_cond_jump_promotion;
+
+	  XDELETE (top);
+	}
+
+      (void) restore_line_pointer (e);
+      demand_empty_rest_of_line ();
+      return;
+    }
 
-      if (strcmp (string, "default") == 0)
+  if (strcmp (string, "default") == 0)
+    {
+      if (strcmp (default_arch, "iamcu") == 0)
+	string = default_arch;
+      else
 	{
-	  if (strcmp (default_arch, "iamcu") == 0)
-	    string = default_arch;
+	  static const i386_cpu_flags cpu_unknown_flags = CPU_UNKNOWN_FLAGS;
+
+	  cpu_arch_name = NULL;
+	  free (cpu_sub_arch_name);
+	  cpu_sub_arch_name = NULL;
+	  cpu_arch_flags = cpu_unknown_flags;
+	  if (flag_code == CODE_64BIT)
+	    {
+	      cpu_arch_flags.bitfield.cpu64 = 1;
+	      cpu_arch_flags.bitfield.cpuno64 = 0;
+	    }
 	  else
 	    {
-	      static const i386_cpu_flags cpu_unknown_flags = CPU_UNKNOWN_FLAGS;
+	      cpu_arch_flags.bitfield.cpu64 = 0;
+	      cpu_arch_flags.bitfield.cpuno64 = 1;
+	    }
+	  cpu_arch_isa = PROCESSOR_UNKNOWN;
+	  cpu_arch_isa_flags = cpu_arch[flag_code == CODE_64BIT].enable;
+	  if (!cpu_arch_tune_set)
+	    {
+	      cpu_arch_tune = cpu_arch_isa;
+	      cpu_arch_tune_flags = cpu_arch_isa_flags;
+	    }
 
-	      cpu_arch_name = NULL;
+	  j = ARRAY_SIZE (cpu_arch) + 1;
+	}
+    }
+
+  for (; j < ARRAY_SIZE (cpu_arch); j++)
+    {
+      if (strcmp (string + (*string == '.'), cpu_arch[j].name) == 0
+	  && (*string == '.') == (cpu_arch[j].type == PROCESSOR_NONE))
+	{
+	  if (*string != '.')
+	    {
+	      check_cpu_arch_compatible (string, cpu_arch[j].enable);
+
+	      cpu_arch_name = cpu_arch[j].name;
 	      free (cpu_sub_arch_name);
 	      cpu_sub_arch_name = NULL;
-	      cpu_arch_flags = cpu_unknown_flags;
+	      cpu_arch_flags = cpu_arch[j].enable;
 	      if (flag_code == CODE_64BIT)
 		{
 		  cpu_arch_flags.bitfield.cpu64 = 1;
@@ -2827,173 +2932,71 @@ set_cpu_arch (int dummy ATTRIBUTE_UNUSED
 		  cpu_arch_flags.bitfield.cpu64 = 0;
 		  cpu_arch_flags.bitfield.cpuno64 = 1;
 		}
-	      cpu_arch_isa = PROCESSOR_UNKNOWN;
-	      cpu_arch_isa_flags = cpu_arch[flag_code == CODE_64BIT].enable;
+	      cpu_arch_isa = cpu_arch[j].type;
+	      cpu_arch_isa_flags = cpu_arch[j].enable;
 	      if (!cpu_arch_tune_set)
 		{
 		  cpu_arch_tune = cpu_arch_isa;
 		  cpu_arch_tune_flags = cpu_arch_isa_flags;
 		}
-
-	      j = ARRAY_SIZE (cpu_arch) + 1;
+	      pre_386_16bit_warned = false;
+	      break;
 	    }
-	}
-      else if (strcmp (string, "push") == 0)
-	{
-	  arch_stack_entry *top = XNEW (arch_stack_entry);
 
-	  top->name = cpu_arch_name;
-	  if (cpu_sub_arch_name)
-	    top->sub_name = xstrdup (cpu_sub_arch_name);
-	  else
-	    top->sub_name = NULL;
-	  top->flags = cpu_arch_flags;
-	  top->isa = cpu_arch_isa;
-	  top->isa_flags = cpu_arch_isa_flags;
-	  top->flag_code = flag_code;
-	  top->stackop_size = stackop_size;
-	  top->no_cond_jump_promotion = no_cond_jump_promotion;
+	  if (cpu_flags_all_zero (&cpu_arch[j].enable))
+	    continue;
 
-	  top->prev = arch_stack_top;
-	  arch_stack_top = top;
+	  flags = cpu_flags_or (cpu_arch_flags, cpu_arch[j].enable);
 
-	  (void) restore_line_pointer (e);
-	  demand_empty_rest_of_line ();
-	  return;
-	}
-      else if (strcmp (string, "pop") == 0)
-	{
-	  const arch_stack_entry *top = arch_stack_top;
-
-	  if (!top)
-	    as_bad (_(".arch stack is empty"));
-	  else if (top->flag_code != flag_code
-		   || top->stackop_size != stackop_size)
+	  if (!cpu_flags_equal (&flags, &cpu_arch_flags))
 	    {
-	      static const unsigned int bits[] = {
-	        [CODE_16BIT] = 16,
-	        [CODE_32BIT] = 32,
-	        [CODE_64BIT] = 64,
-	      };
-
-	      as_bad (_("this `.arch pop' requires `.code%u%s' to be in effect"),
-		      bits[top->flag_code],
-		      top->stackop_size == LONG_MNEM_SUFFIX ? "gcc" : "");
+	      extend_cpu_sub_arch_name (string + 1);
+	      cpu_arch_flags = flags;
+	      cpu_arch_isa_flags = flags;
 	    }
 	  else
-	    {
-	      arch_stack_top = top->prev;
-
-	      cpu_arch_name = top->name;
-	      free (cpu_sub_arch_name);
-	      cpu_sub_arch_name = top->sub_name;
-	      cpu_arch_flags = top->flags;
-	      cpu_arch_isa = top->isa;
-	      cpu_arch_isa_flags = top->isa_flags;
-	      no_cond_jump_promotion = top->no_cond_jump_promotion;
-
-	      XDELETE (top);
-	    }
+	    cpu_arch_isa_flags
+	      = cpu_flags_or (cpu_arch_isa_flags, cpu_arch[j].enable);
 
 	  (void) restore_line_pointer (e);
 	  demand_empty_rest_of_line ();
 	  return;
 	}
+    }
 
-      for (; j < ARRAY_SIZE (cpu_arch); j++)
-	{
-	  if (strcmp (string + (*string == '.'), cpu_arch[j].name) == 0
-	     && (*string == '.') == (cpu_arch[j].type == PROCESSOR_NONE))
-	    {
-	      if (*string != '.')
-		{
-		  check_cpu_arch_compatible (string, cpu_arch[j].enable);
-
-		  cpu_arch_name = cpu_arch[j].name;
-		  free (cpu_sub_arch_name);
-		  cpu_sub_arch_name = NULL;
-		  cpu_arch_flags = cpu_arch[j].enable;
-		  if (flag_code == CODE_64BIT)
-		    {
-		      cpu_arch_flags.bitfield.cpu64 = 1;
-		      cpu_arch_flags.bitfield.cpuno64 = 0;
-		    }
-		  else
-		    {
-		      cpu_arch_flags.bitfield.cpu64 = 0;
-		      cpu_arch_flags.bitfield.cpuno64 = 1;
-		    }
-		  cpu_arch_isa = cpu_arch[j].type;
-		  cpu_arch_isa_flags = cpu_arch[j].enable;
-		  if (!cpu_arch_tune_set)
-		    {
-		      cpu_arch_tune = cpu_arch_isa;
-		      cpu_arch_tune_flags = cpu_arch_isa_flags;
-		    }
-		  pre_386_16bit_warned = false;
-		  break;
-		}
-
-	      if (cpu_flags_all_zero (&cpu_arch[j].enable))
-	        continue;
-
-	      flags = cpu_flags_or (cpu_arch_flags,
-				    cpu_arch[j].enable);
-
-	      if (!cpu_flags_equal (&flags, &cpu_arch_flags))
-		{
-		  extend_cpu_sub_arch_name (string + 1);
-		  cpu_arch_flags = flags;
-		  cpu_arch_isa_flags = flags;
-		}
-	      else
-		cpu_arch_isa_flags
-		  = cpu_flags_or (cpu_arch_isa_flags,
-				  cpu_arch[j].enable);
-	      (void) restore_line_pointer (e);
-	      demand_empty_rest_of_line ();
-	      return;
-	    }
-	}
-
-      if (startswith (string, ".no") && j >= ARRAY_SIZE (cpu_arch))
-	{
-	  /* Disable an ISA extension.  */
-	  for (j = 0; j < ARRAY_SIZE (cpu_arch); j++)
-	    if (cpu_arch[j].type == PROCESSOR_NONE
-	        && strcmp (string + 3, cpu_arch[j].name) == 0)
+  if (startswith (string, ".no") && j >= ARRAY_SIZE (cpu_arch))
+    {
+      /* Disable an ISA extension.  */
+      for (j = 0; j < ARRAY_SIZE (cpu_arch); j++)
+	if (cpu_arch[j].type == PROCESSOR_NONE
+	    && strcmp (string + 3, cpu_arch[j].name) == 0)
+	  {
+	    flags = cpu_flags_and_not (cpu_arch_flags, cpu_arch[j].disable);
+	    if (!cpu_flags_equal (&flags, &cpu_arch_flags))
 	      {
-		flags = cpu_flags_and_not (cpu_arch_flags,
-					   cpu_arch[j].disable);
-		if (!cpu_flags_equal (&flags, &cpu_arch_flags))
-		  {
-		    extend_cpu_sub_arch_name (string + 1);
-		    cpu_arch_flags = flags;
-		    cpu_arch_isa_flags = flags;
-		  }
-		(void) restore_line_pointer (e);
-		demand_empty_rest_of_line ();
-		return;
+		extend_cpu_sub_arch_name (string + 1);
+		cpu_arch_flags = flags;
+		cpu_arch_isa_flags = flags;
 	      }
-	}
 
-      if (j == ARRAY_SIZE (cpu_arch))
-	as_bad (_("no such architecture: `%s'"), string);
-
-      *input_line_pointer = e;
+	    (void) restore_line_pointer (e);
+	    demand_empty_rest_of_line ();
+	    return;
+	  }
     }
-  else
-    as_bad (_("missing cpu architecture"));
+
+  if (j == ARRAY_SIZE (cpu_arch))
+    as_bad (_("no such architecture: `%s'"), string);
+
+  *input_line_pointer = e;
 
   no_cond_jump_promotion = 0;
   if (*input_line_pointer == ','
       && !is_end_of_line[(unsigned char) input_line_pointer[1]])
     {
-      char *string;
-      char e;
-
       ++input_line_pointer;
-      e = get_symbol_name (&string);
+      e = get_symbol_name (&s);
+      string = s;
 
       if (strcmp (string, "nojumps") == 0)
 	no_cond_jump_promotion = 1;



* [PATCH 5/5] x86: support AVX10.1 vector size restrictions
  2023-08-25 12:43 [PATCH 0/5] x86: AVX10.1 (alternative attempt) Jan Beulich
                   ` (3 preceding siblings ...)
  2023-08-25 12:47 ` [PATCH 4/5] x86: unindent most of set_cpu_arch() Jan Beulich
@ 2023-08-25 12:47 ` Jan Beulich
  2023-08-29 16:26   ` H.J. Lu
  4 siblings, 1 reply; 27+ messages in thread
From: Jan Beulich @ 2023-08-25 12:47 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu, Jiang, Haochen

Recognize "/<number>" suffixes on both -march=+avx10.1 and the
corresponding .arch directive, setting an upper bound on the vector size
that insns may use. Such a restriction can be reset by setting a new base
architecture, by using a suffix-less form, by disabling AVX10, or by
enabling any other VEX/EVEX-based vector extension.

While for most insns we can suppress their use with too wide operands
via registers becoming unavailable (or in Intel syntax memory operand
size specifiers not being recognized), mask register insns have to have
their minimum required vector size specified in a new attribute. (Of
course this new attribute could also be used on other insns.)

Note that .insn continues to be permitted to emit EVEX{512,256} (and
VEX256) encodings regardless of any vector size restriction in place.
Of course these can't be expressed using zmm (or ymm) operands then,
but need to use the EVEX.512.* forms (broadcast forms may be usable right
now, but this may go away, so it shouldn't be relied upon). This is why
no assertions should be added to build_{e,}vex_prefix().
---
It is unclear whether Vsz is a good name for the new attribute: The spec
leaves open how 256-bit embedded rounding is going to be expressed. Yet
that may require some similar attribute ...

--- a/gas/NEWS
+++ b/gas/NEWS
@@ -1,5 +1,7 @@
 -*- text -*-
 
+* Add support for Intel AVX10.1.
+
 * Add support for Intel PBNDKB instructions.
 
 * Add support for Intel SM4 instructions.
--- a/gas/doc/c-i386.texi
+++ b/gas/doc/c-i386.texi
@@ -213,6 +213,9 @@ accept various extension mnemonics.  For
 @code{sm4},
 @code{pbndkb},
 @code{avx10.1},
+@code{avx10.1/512},
+@code{avx10.1/256},
+@code{avx10.1/128},
 @code{amx_int8},
 @code{amx_bf16},
 @code{amx_fp16},
@@ -267,7 +270,11 @@ accept various extension mnemonics.  For
 @code{svme} and
 @code{padlock}.
 Note that these extension mnemonics can be prefixed with @code{no} to revoke
-the respective (and any dependent) functionality.
+the respective (and any dependent) functionality.  Note further that the
+suffixes permitted on @code{-march=avx10.<N>} enforce a vector length
+restriction, i.e. despite these otherwise being "enabling" options, using
+these suffixes will disable all insns with wider vector or mask register
+operands.
 
 When the @code{.arch} directive is used with @option{-march}, the
 @code{.arch} directive will take precedent.
@@ -1673,6 +1680,12 @@ an unconditional jump to the target.
 
 Note that the sub-architecture specifiers (starting with a dot) can be prefixed
 with @code{no} to revoke the respective (and any dependent) functionality.
+Note further that @samp{.avx10.<N>} can be suffixed with a vector length
+restriction (@samp{/256} or @samp{/128}, with @samp{/512} simply restoring the
+default).  Despite these otherwise being "enabling" specifiers, using these
+suffixes will disable all insns with wider vector or mask register operands.
+On SVR4-derived platforms, the separator character @samp{/} can be replaced by
+@samp{:}.
 
 Following the CPU architecture (but not a sub-architecture, which are those
 starting with a dot), you may specify @samp{jumps} or @samp{nojumps} to
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -131,6 +131,7 @@ typedef struct
   unsigned int len:8;		/* arch string length */
   bool skip:1;			/* show_arch should skip this. */
   enum processor_type type;	/* arch type */
+  enum { vsz_none, vsz_set, vsz_reset } vsz; /* vector size control */
   i386_cpu_flags enable;		/* cpu feature enable flags */
   i386_cpu_flags disable;	/* cpu feature disable flags */
 }
@@ -841,6 +842,10 @@ static unsigned int sse2avx;
 /* Encode aligned vector move as unaligned vector move.  */
 static unsigned int use_unaligned_vector_move;
 
+/* Maximum permitted vector size. */
+#define VSZ_DEFAULT VSZ512
+static unsigned int vector_size = VSZ_DEFAULT;
+
 /* Encode scalar AVX instructions with specific vector length.  */
 static enum
   {
@@ -969,11 +974,14 @@ const relax_typeS md_relax_table[] =
 };
 
 #define ARCH(n, t, f, s) \
-  { STRING_COMMA_LEN (#n), s, PROCESSOR_ ## t, CPU_ ## f ## _FLAGS, \
+  { STRING_COMMA_LEN (#n), s, PROCESSOR_ ## t, vsz_none, CPU_ ## f ## _FLAGS, \
     CPU_NONE_FLAGS }
 #define SUBARCH(n, e, d, s) \
-  { STRING_COMMA_LEN (#n), s, PROCESSOR_NONE, CPU_ ## e ## _FLAGS, \
+  { STRING_COMMA_LEN (#n), s, PROCESSOR_NONE, vsz_none, CPU_ ## e ## _FLAGS, \
     CPU_ ## d ## _FLAGS }
+#define VECARCH(n, e, d, v) \
+  { STRING_COMMA_LEN (#n), false, PROCESSOR_NONE, vsz_ ## v, \
+    CPU_ ## e ## _FLAGS, CPU_ ## d ## _FLAGS }
 
 static const arch_entry cpu_arch[] =
 {
@@ -1035,15 +1043,15 @@ static const arch_entry cpu_arch[] =
   SUBARCH (sse4.1, SSE4_1, ANY_SSE4_1, false),
   SUBARCH (sse4.2, SSE4_2, ANY_SSE4_2, false),
   SUBARCH (sse4, SSE4_2, ANY_SSE4_1, false),
-  SUBARCH (avx, AVX, ANY_AVX, false),
-  SUBARCH (avx2, AVX2, ANY_AVX2, false),
-  SUBARCH (avx512f, AVX512F, ANY_AVX512F, false),
-  SUBARCH (avx512cd, AVX512CD, ANY_AVX512CD, false),
-  SUBARCH (avx512er, AVX512ER, ANY_AVX512ER, false),
-  SUBARCH (avx512pf, AVX512PF, ANY_AVX512PF, false),
-  SUBARCH (avx512dq, AVX512DQ, ANY_AVX512DQ, false),
-  SUBARCH (avx512bw, AVX512BW, ANY_AVX512BW, false),
-  SUBARCH (avx512vl, AVX512VL, ANY_AVX512VL, false),
+  VECARCH (avx, AVX, ANY_AVX, reset),
+  VECARCH (avx2, AVX2, ANY_AVX2, reset),
+  VECARCH (avx512f, AVX512F, ANY_AVX512F, reset),
+  VECARCH (avx512cd, AVX512CD, ANY_AVX512CD, reset),
+  VECARCH (avx512er, AVX512ER, ANY_AVX512ER, reset),
+  VECARCH (avx512pf, AVX512PF, ANY_AVX512PF, reset),
+  VECARCH (avx512dq, AVX512DQ, ANY_AVX512DQ, reset),
+  VECARCH (avx512bw, AVX512BW, ANY_AVX512BW, reset),
+  VECARCH (avx512vl, AVX512VL, ANY_AVX512VL, reset),
   SUBARCH (monitor, MONITOR, MONITOR, false),
   SUBARCH (vmx, VMX, ANY_VMX, false),
   SUBARCH (vmfunc, VMFUNC, ANY_VMFUNC, false),
@@ -1095,15 +1103,15 @@ static const arch_entry cpu_arch[] =
   SUBARCH (prefetchwt1, PREFETCHWT1, PREFETCHWT1, false),
   SUBARCH (se1, SE1, SE1, false),
   SUBARCH (clwb, CLWB, CLWB, false),
-  SUBARCH (avx512ifma, AVX512IFMA, ANY_AVX512IFMA, false),
-  SUBARCH (avx512vbmi, AVX512VBMI, ANY_AVX512VBMI, false),
-  SUBARCH (avx512_4fmaps, AVX512_4FMAPS, ANY_AVX512_4FMAPS, false),
-  SUBARCH (avx512_4vnniw, AVX512_4VNNIW, ANY_AVX512_4VNNIW, false),
-  SUBARCH (avx512_vpopcntdq, AVX512_VPOPCNTDQ, ANY_AVX512_VPOPCNTDQ, false),
-  SUBARCH (avx512_vbmi2, AVX512_VBMI2, ANY_AVX512_VBMI2, false),
-  SUBARCH (avx512_vnni, AVX512_VNNI, ANY_AVX512_VNNI, false),
-  SUBARCH (avx512_bitalg, AVX512_BITALG, ANY_AVX512_BITALG, false),
-  SUBARCH (avx_vnni, AVX_VNNI, ANY_AVX_VNNI, false),
+  VECARCH (avx512ifma, AVX512IFMA, ANY_AVX512IFMA, reset),
+  VECARCH (avx512vbmi, AVX512VBMI, ANY_AVX512VBMI, reset),
+  VECARCH (avx512_4fmaps, AVX512_4FMAPS, ANY_AVX512_4FMAPS, reset),
+  VECARCH (avx512_4vnniw, AVX512_4VNNIW, ANY_AVX512_4VNNIW, reset),
+  VECARCH (avx512_vpopcntdq, AVX512_VPOPCNTDQ, ANY_AVX512_VPOPCNTDQ, reset),
+  VECARCH (avx512_vbmi2, AVX512_VBMI2, ANY_AVX512_VBMI2, reset),
+  VECARCH (avx512_vnni, AVX512_VNNI, ANY_AVX512_VNNI, reset),
+  VECARCH (avx512_bitalg, AVX512_BITALG, ANY_AVX512_BITALG, reset),
+  VECARCH (avx_vnni, AVX_VNNI, ANY_AVX_VNNI, reset),
   SUBARCH (clzero, CLZERO, CLZERO, false),
   SUBARCH (mwaitx, MWAITX, MWAITX, false),
   SUBARCH (ospke, OSPKE, ANY_OSPKE, false),
@@ -1112,8 +1120,8 @@ static const arch_entry cpu_arch[] =
   SUBARCH (ibt, IBT, IBT, false),
   SUBARCH (shstk, SHSTK, SHSTK, false),
   SUBARCH (gfni, GFNI, ANY_GFNI, false),
-  SUBARCH (vaes, VAES, ANY_VAES, false),
-  SUBARCH (vpclmulqdq, VPCLMULQDQ, ANY_VPCLMULQDQ, false),
+  VECARCH (vaes, VAES, ANY_VAES, reset),
+  VECARCH (vpclmulqdq, VPCLMULQDQ, ANY_VPCLMULQDQ, reset),
   SUBARCH (wbnoinvd, WBNOINVD, WBNOINVD, false),
   SUBARCH (pconfig, PCONFIG, PCONFIG, false),
   SUBARCH (waitpkg, WAITPKG, WAITPKG, false),
@@ -1125,9 +1133,9 @@ static const arch_entry cpu_arch[] =
   SUBARCH (amx_tile, AMX_TILE, ANY_AMX_TILE, false),
   SUBARCH (movdiri, MOVDIRI, MOVDIRI, false),
   SUBARCH (movdir64b, MOVDIR64B, MOVDIR64B, false),
-  SUBARCH (avx512_bf16, AVX512_BF16, ANY_AVX512_BF16, false),
-  SUBARCH (avx512_vp2intersect, AVX512_VP2INTERSECT,
-	   ANY_AVX512_VP2INTERSECT, false),
+  VECARCH (avx512_bf16, AVX512_BF16, ANY_AVX512_BF16, reset),
+  VECARCH (avx512_vp2intersect, AVX512_VP2INTERSECT,
+	   ANY_AVX512_VP2INTERSECT, reset),
   SUBARCH (tdx, TDX, TDX, false),
   SUBARCH (enqcmd, ENQCMD, ENQCMD, false),
   SUBARCH (serialize, SERIALIZE, SERIALIZE, false),
@@ -1139,24 +1147,24 @@ static const arch_entry cpu_arch[] =
   SUBARCH (widekl, WIDEKL, ANY_WIDEKL, false),
   SUBARCH (uintr, UINTR, UINTR, false),
   SUBARCH (hreset, HRESET, HRESET, false),
-  SUBARCH (avx512_fp16, AVX512_FP16, ANY_AVX512_FP16, false),
+  VECARCH (avx512_fp16, AVX512_FP16, ANY_AVX512_FP16, reset),
   SUBARCH (prefetchi, PREFETCHI, PREFETCHI, false),
-  SUBARCH (avx_ifma, AVX_IFMA, ANY_AVX_IFMA, false),
-  SUBARCH (avx_vnni_int8, AVX_VNNI_INT8, ANY_AVX_VNNI_INT8, false),
+  VECARCH (avx_ifma, AVX_IFMA, ANY_AVX_IFMA, reset),
+  VECARCH (avx_vnni_int8, AVX_VNNI_INT8, ANY_AVX_VNNI_INT8, reset),
   SUBARCH (cmpccxadd, CMPCCXADD, CMPCCXADD, false),
   SUBARCH (wrmsrns, WRMSRNS, WRMSRNS, false),
   SUBARCH (msrlist, MSRLIST, MSRLIST, false),
-  SUBARCH (avx_ne_convert, AVX_NE_CONVERT, ANY_AVX_NE_CONVERT, false),
+  VECARCH (avx_ne_convert, AVX_NE_CONVERT, ANY_AVX_NE_CONVERT, reset),
   SUBARCH (rao_int, RAO_INT, RAO_INT, false),
   SUBARCH (rmpquery, RMPQUERY, ANY_RMPQUERY, false),
   SUBARCH (fred, FRED, ANY_FRED, false),
   SUBARCH (lkgs, LKGS, ANY_LKGS, false),
-  SUBARCH (avx_vnni_int16, AVX_VNNI_INT16, ANY_AVX_VNNI_INT16, false),
-  SUBARCH (sha512, SHA512, ANY_SHA512, false),
-  SUBARCH (sm3, SM3, ANY_SM3, false),
-  SUBARCH (sm4, SM4, ANY_SM4, false),
+  VECARCH (avx_vnni_int16, AVX_VNNI_INT16, ANY_AVX_VNNI_INT16, reset),
+  VECARCH (sha512, SHA512, ANY_SHA512, reset),
+  VECARCH (sm3, SM3, ANY_SM3, reset),
+  VECARCH (sm4, SM4, ANY_SM4, reset),
   SUBARCH (pbndkb, PBNDKB, PBNDKB, false),
-  SUBARCH (avx10.1, AVX10_1, ANY_AVX512F, false),
+  VECARCH (avx10.1, AVX10_1, ANY_AVX512F, set),
 };
 
 #undef SUBARCH
@@ -2790,6 +2798,7 @@ set_cpu_arch (int dummy ATTRIBUTE_UNUSED
     i386_cpu_flags isa_flags;
     enum processor_type isa;
     enum flag_code flag_code;
+    unsigned int vector_size;
     char stackop_size;
     bool no_cond_jump_promotion;
   } arch_stack_entry;
@@ -2825,6 +2834,7 @@ set_cpu_arch (int dummy ATTRIBUTE_UNUSED
       top->isa = cpu_arch_isa;
       top->isa_flags = cpu_arch_isa_flags;
       top->flag_code = flag_code;
+      top->vector_size = vector_size;
       top->stackop_size = stackop_size;
       top->no_cond_jump_promotion = no_cond_jump_promotion;
 
@@ -2865,6 +2875,7 @@ set_cpu_arch (int dummy ATTRIBUTE_UNUSED
 	  cpu_arch_flags = top->flags;
 	  cpu_arch_isa = top->isa;
 	  cpu_arch_isa_flags = top->isa_flags;
+	  vector_size = top->vector_size;
 	  no_cond_jump_promotion = top->no_cond_jump_promotion;
 
 	  XDELETE (top);
@@ -2905,6 +2916,8 @@ set_cpu_arch (int dummy ATTRIBUTE_UNUSED
 	      cpu_arch_tune_flags = cpu_arch_isa_flags;
 	    }
 
+	  vector_size = VSZ_DEFAULT;
+
 	  j = ARRAY_SIZE (cpu_arch) + 1;
 	}
     }
@@ -2939,6 +2952,9 @@ set_cpu_arch (int dummy ATTRIBUTE_UNUSED
 		  cpu_arch_tune = cpu_arch_isa;
 		  cpu_arch_tune_flags = cpu_arch_isa_flags;
 		}
+
+	      vector_size = VSZ_DEFAULT;
+
 	      pre_386_16bit_warned = false;
 	      break;
 	    }
@@ -2959,6 +2975,38 @@ set_cpu_arch (int dummy ATTRIBUTE_UNUSED
 	      = cpu_flags_or (cpu_arch_isa_flags, cpu_arch[j].enable);
 
 	  (void) restore_line_pointer (e);
+
+	  switch (cpu_arch[j].vsz)
+	    {
+	    default:
+	      break;
+
+	    case vsz_set:
+#ifdef SVR4_COMMENT_CHARS
+	      if (*input_line_pointer == ':' || *input_line_pointer == '/')
+#else
+	      if (*input_line_pointer == '/')
+#endif
+		{
+		  ++input_line_pointer;
+		  switch (get_absolute_expression ())
+		    {
+		    case 512: vector_size = VSZ512; break;
+		    case 256: vector_size = VSZ256; break;
+		    case 128: vector_size = VSZ128; break;
+		    default:
+		      as_bad (_("Unrecognized vector size specifier"));
+		      ignore_rest_of_line ();
+		      return;
+		    }
+		  break;
+		}
+		/* Fall through.  */
+	    case vsz_reset:
+	      vector_size = VSZ_DEFAULT;
+	      break;
+	    }
+
 	  demand_empty_rest_of_line ();
 	  return;
 	}
@@ -2979,6 +3027,9 @@ set_cpu_arch (int dummy ATTRIBUTE_UNUSED
 		cpu_arch_isa_flags = flags;
 	      }
 
+	    if (cpu_arch[j].vsz == vsz_set)
+	      vector_size = VSZ_DEFAULT;
+
 	    (void) restore_line_pointer (e);
 	    demand_empty_rest_of_line ();
 	    return;
@@ -5431,9 +5482,11 @@ md_assemble (char *line)
 	case RegSIMD:
 	  if (i.tm.operand_types[j].bitfield.tmmword)
 	    i.xstate |= xstate_tmm;
-	  else if (i.tm.operand_types[j].bitfield.zmmword)
+	  else if (i.tm.operand_types[j].bitfield.zmmword
+		   && vector_size >= VSZ512)
 	    i.xstate |= xstate_zmm;
-	  else if (i.tm.operand_types[j].bitfield.ymmword)
+	  else if (i.tm.operand_types[j].bitfield.ymmword
+		   && vector_size >= VSZ256)
 	    i.xstate |= xstate_ymm;
 	  else if (i.tm.operand_types[j].bitfield.xmmword)
 	    i.xstate |= xstate_xmm;
@@ -6606,9 +6659,13 @@ check_VecOperands (const insn_template *
 	  type.bitfield.xmmword = 1;
 	  break;
 	case 32:
+	  if (vector_size < VSZ256)
+	    goto bad_broadcast;
 	  type.bitfield.ymmword = 1;
 	  break;
 	case 64:
+	  if (vector_size < VSZ512)
+	    goto bad_broadcast;
 	  type.bitfield.zmmword = 1;
 	  break;
 	default:
@@ -6820,6 +6877,19 @@ VEX_check_encoding (const insn_template
       return 1;
     }
 
+  /* Vector size restrictions.  */
+  if ((vector_size < VSZ512
+       && (t->opcode_modifier.evex == EVEX512
+	   || t->opcode_modifier.vsz >= VSZ512))
+      || (vector_size < VSZ256
+	  && (t->opcode_modifier.evex == EVEX256
+	      || t->opcode_modifier.vex == VEX256
+	      || t->opcode_modifier.vsz >= VSZ256)))
+    {
+      i.error = unsupported;
+      return 1;
+    }
+
   if (i.vec_encoding == vex_encoding_evex)
     {
       /* This instruction must be encoded with EVEX prefix.  */
@@ -7608,8 +7678,27 @@ process_suffix (void)
 
 	  for (op = 0; op < i.tm.operands; ++op)
 	    {
-	      if (is_evex_encoding (&i.tm)
-		  && !cpu_arch_flags.bitfield.cpuavx512vl)
+	      if (vector_size < VSZ512)
+		{
+		  i.tm.operand_types[op].bitfield.zmmword = 0;
+		  if (vector_size < VSZ256)
+		    {
+		      i.tm.operand_types[op].bitfield.ymmword = 0;
+		      if (i.tm.operand_types[op].bitfield.xmmword
+			  && (i.tm.opcode_modifier.evex == EVEXDYN
+			      || (!i.tm.opcode_modifier.evex
+				  && is_evex_encoding (&i.tm))))
+			i.tm.opcode_modifier.evex = EVEX128;
+		    }
+		  else if (i.tm.operand_types[op].bitfield.ymmword
+			   && !i.tm.operand_types[op].bitfield.xmmword
+			   && (i.tm.opcode_modifier.evex == EVEXDYN
+			       || (!i.tm.opcode_modifier.evex
+				   && is_evex_encoding (&i.tm))))
+		    i.tm.opcode_modifier.evex = EVEX256;
+		}
+	      else if (is_evex_encoding (&i.tm)
+		       && !cpu_arch_flags.bitfield.cpuavx512vl)
 		{
 		  if (i.tm.operand_types[op].bitfield.ymmword)
 		    i.tm.operand_types[op].bitfield.xmmword = 0;
@@ -13857,6 +13946,12 @@ static bool check_register (const reg_en
 	}
     }
 
+  if (vector_size < VSZ512 && r->reg_type.bitfield.zmmword)
+    return false;
+
+  if (vector_size < VSZ256 && r->reg_type.bitfield.ymmword)
+    return false;
+
   if (r->reg_type.bitfield.tmmword
       && (!cpu_arch_flags.bitfield.cpuamx_tile
           || flag_code != CODE_64BIT))
@@ -14355,13 +14450,21 @@ md_parse_option (int c, const char *arg)
 	arch++;
       do
 	{
+	  char *vsz;
+
 	  if (*arch == '.')
 	    as_fatal (_("invalid -march= option: `%s'"), arg);
 	  next = strchr (arch, '+');
 	  if (next)
 	    *next++ = '\0';
+	  vsz = strchr (arch, '/');
+	  if (vsz)
+	    *vsz++ = '\0';
 	  for (j = 0; j < ARRAY_SIZE (cpu_arch); j++)
 	    {
+	      if (vsz && cpu_arch[j].vsz != vsz_set)
+		continue;
+
 	      if (arch == saved && cpu_arch[j].type != PROCESSOR_NONE
 	          && strcmp (arch, cpu_arch[j].name) == 0)
 		{
@@ -14380,6 +14483,7 @@ md_parse_option (int c, const char *arg)
 		      cpu_arch_tune = cpu_arch_isa;
 		      cpu_arch_tune_flags = cpu_arch_isa_flags;
 		    }
+		  vector_size = VSZ_DEFAULT;
 		  break;
 		}
 	      else if (cpu_arch[j].type == PROCESSOR_NONE
@@ -14402,6 +14506,37 @@ md_parse_option (int c, const char *arg)
 		    cpu_arch_isa_flags
 		      = cpu_flags_or (cpu_arch_isa_flags,
 				      cpu_arch[j].enable);
+
+		  switch (cpu_arch[j].vsz)
+		    {
+		    default:
+		      break;
+
+		    case vsz_set:
+		      if (vsz)
+			{
+			  char *end;
+			  unsigned long val = strtoul (vsz, &end, 0);
+
+			  if (*end)
+			    val = 0;
+			  switch (val)
+			    {
+			    case 512: vector_size = VSZ512; break;
+			    case 256: vector_size = VSZ256; break;
+			    case 128: vector_size = VSZ128; break;
+			    default:
+			      as_warn (_("Unrecognized vector size specifier ignored"));
+			      break;
+			    }
+			  break;
+			}
+			/* Fall through.  */
+		    case vsz_reset:
+		      vector_size = VSZ_DEFAULT;
+		      break;
+		    }
+
 		  break;
 		}
 	    }
@@ -14423,6 +14558,8 @@ md_parse_option (int c, const char *arg)
 			cpu_arch_flags = flags;
 			cpu_arch_isa_flags = flags;
 		      }
+		    if (cpu_arch[j].vsz == vsz_set)
+		      vector_size = VSZ_DEFAULT;
 		    break;
 		  }
 	    }
--- a/gas/config/tc-i386-intel.c
+++ b/gas/config/tc-i386-intel.c
@@ -768,10 +768,20 @@ i386_intel_operand (char *operand_string
 	  break;
 
 	case O_ymmword_ptr:
+	  if (vector_size < VSZ256)
+	    {
+	      as_bad (_("256-bit operands disabled"));
+	      return 0;
+	    }
 	  i.types[this_operand].bitfield.ymmword = 1;
 	  break;
 
 	case O_zmmword_ptr:
+	  if (vector_size < VSZ512)
+	    {
+	      as_bad (_("512-bit operands disabled"));
+	      return 0;
+	    }
 	  i.types[this_operand].bitfield.zmmword = 1;
 	  break;
 
--- a/gas/testsuite/gas/i386/avx10.1-256-1.d
+++ b/gas/testsuite/gas/i386/avx10.1-256-1.d
@@ -1,4 +1,4 @@
-#as: -I${srcdir}/$subdir
+#as: --divide -I${srcdir}/$subdir
 #objdump: -dw
 #name: AVX10.1/256 (part 1)
 #dump: avx512f_vl.d
--- a/gas/testsuite/gas/i386/avx10.1-256-1.s
+++ b/gas/testsuite/gas/i386/avx10.1-256-1.s
@@ -1,3 +1,3 @@
 	.arch generic32
-	.arch .avx10.1
+	.arch .avx10.1/256
 	.include "avx512f_vl.s"
--- a/gas/testsuite/gas/i386/avx10.1-256-2.d
+++ b/gas/testsuite/gas/i386/avx10.1-256-2.d
@@ -1,4 +1,4 @@
-#as: -I${srcdir}/$subdir
+#as: --divide -I${srcdir}/$subdir
 #objdump: -dw
 #name: AVX10.1/256 (part 2)
 #dump: avx512bw_vl.d
--- a/gas/testsuite/gas/i386/avx10.1-256-2.s
+++ b/gas/testsuite/gas/i386/avx10.1-256-2.s
@@ -1,3 +1,3 @@
 	.arch generic32
-	.arch .avx10.1
+	.arch .avx10.1/256
 	.include "avx512bw_vl.s"
--- a/gas/testsuite/gas/i386/avx10.1-256-3.d
+++ b/gas/testsuite/gas/i386/avx10.1-256-3.d
@@ -1,4 +1,4 @@
-#as: -I${srcdir}/$subdir
+#as: --divide -I${srcdir}/$subdir
 #objdump: -dw
 #name: AVX10.1/256 (part 3)
 #dump: avx512dq_vl.d
--- a/gas/testsuite/gas/i386/avx10.1-256-3.s
+++ b/gas/testsuite/gas/i386/avx10.1-256-3.s
@@ -1,3 +1,3 @@
 	.arch generic32
-	.arch .avx10.1
+	.arch .avx10.1/256
 	.include "avx512dq_vl.s"
--- a/gas/testsuite/gas/i386/avx10.1-256-4.d
+++ b/gas/testsuite/gas/i386/avx10.1-256-4.d
@@ -1,4 +1,4 @@
-#as: -I${srcdir}/$subdir
+#as: --divide -I${srcdir}/$subdir
 #objdump: -dw
 #name: AVX10.1/256 (part 4)
 #dump: avx512_fp16_vl.d
--- a/gas/testsuite/gas/i386/avx10.1-256-4.s
+++ b/gas/testsuite/gas/i386/avx10.1-256-4.s
@@ -1,3 +1,3 @@
 	.arch generic32
-	.arch .avx10.1
+	.arch .avx10.1/256
 	.include "avx512_fp16_vl.s"
--- a/gas/testsuite/gas/i386/avx10.1-256-5.d
+++ b/gas/testsuite/gas/i386/avx10.1-256-5.d
@@ -1,4 +1,4 @@
-#as: -I${srcdir}/$subdir
+#as: --divide -I${srcdir}/$subdir
 #objdump: -dw
 #name: AVX10.1/256 (part 5)
 
--- a/gas/testsuite/gas/i386/avx10.1-256-5.s
+++ b/gas/testsuite/gas/i386/avx10.1-256-5.s
@@ -1,5 +1,5 @@
 	.arch generic32
-	.arch .avx10.1
+	.arch .avx10.1/256
 
 	.include "avx512bitalg_vl.s"
 
--- a/gas/testsuite/gas/i386/avx10.1-256-gfni.d
+++ b/gas/testsuite/gas/i386/avx10.1-256-gfni.d
@@ -1,4 +1,4 @@
-#as: -I${srcdir}/$subdir
+#as: --divide -I${srcdir}/$subdir
 #objdump: -dw
 #name: AVX10.1/256 + GFNI
 #dump: avx512vl_gfni.d
--- a/gas/testsuite/gas/i386/avx10.1-256-gfni.s
+++ b/gas/testsuite/gas/i386/avx10.1-256-gfni.s
@@ -1,4 +1,4 @@
 	.arch generic32
-	.arch .avx10.1
+	.arch .avx10.1/256
 	.arch .gfni
 	.include "avx512vl_gfni.s"
--- a/gas/testsuite/gas/i386/avx10.1-256-vaes.d
+++ b/gas/testsuite/gas/i386/avx10.1-256-vaes.d
@@ -1,4 +1,4 @@
-#as: -I${srcdir}/$subdir
+#as: --divide -I${srcdir}/$subdir
 #objdump: -dw
 #name: AVX10.1/256 + VAES
 #dump: avx512vl_vaes.d
--- a/gas/testsuite/gas/i386/avx10.1-256-vaes.s
+++ b/gas/testsuite/gas/i386/avx10.1-256-vaes.s
@@ -1,4 +1,4 @@
 	.arch generic32
-	.arch .avx10.1
+	.arch .avx10.1/256
 	.arch .vaes
 	.include "avx512vl_vaes.s"
--- a/gas/testsuite/gas/i386/avx10.1-256-vpclmulqdq.d
+++ b/gas/testsuite/gas/i386/avx10.1-256-vpclmulqdq.d
@@ -1,4 +1,4 @@
-#as: -I${srcdir}/$subdir
+#as: --divide -I${srcdir}/$subdir
 #objdump: -dw
 #name: AVX10.1/256 + VPLCMULQDQ
 #dump: avx512vl_vpclmulqdq.d
--- a/gas/testsuite/gas/i386/avx10.1-256-vpclmulqdq.s
+++ b/gas/testsuite/gas/i386/avx10.1-256-vpclmulqdq.s
@@ -1,4 +1,4 @@
 	.arch generic32
-	.arch .avx10.1
+	.arch .avx10.1/256
 	.arch .vpclmulqdq
 	.include "avx512vl_vpclmulqdq.s"
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10-vsz.l
@@ -0,0 +1,294 @@
+.*: Assembler messages:
+.*:7: Warning: default
+.*:46: Error: ambiguous operand size for `vcvtpd2ph'
+.*:47: Error: ambiguous operand size for `vcvtpd2ps'
+.*:48: Error: ambiguous operand size for `vfpclassps'
+.*:7: Warning: \.avx10\.1/256
+.*:12: Error: unsupported instruction `kunpckdq'
+.*:16: Error: unsupported instruction `kaddq'
+.*:17: Error: unsupported instruction `kandq'
+.*:18: Error: unsupported instruction `kandnq'
+.*:19: Error: unsupported instruction `kmovq'
+.*:20: Error: unsupported instruction `kmovq'
+.*:21: Error: unsupported instruction `kmovq'
+.*:22: Error: unsupported instruction `knotq'
+.*:23: Error: unsupported instruction `korq'
+.*:24: Error: unsupported instruction `kortestq'
+.*:25: Error: unsupported instruction `kshiftlq'
+.*:26: Error: unsupported instruction `kshiftrq'
+.*:27: Error: unsupported instruction `ktestq'
+.*:28: Error: unsupported instruction `kxorq'
+.*:29: Error: unsupported instruction `kxnorq'
+.*:32: Error: unsupported instruction `vcvtpd2phz'
+.*:34: Error: unsupported broadcast for `vcvtpd2ph'
+.*:37: Error: unsupported instruction `vcvtpd2ps'
+.*:43: Error: unsupported instruction `vfpclasspsz'
+.*:46: Error: ambiguous operand size for `vcvtpd2ph'
+.*:47: Error: ambiguous operand size for `vcvtpd2ps'
+.*:48: Error: ambiguous operand size for `vfpclassps'
+.*:7: Warning: \.avx10\.1/128
+.*:10: Error: unsupported instruction `kmovd'
+.*:11: Error: unsupported instruction `kmovd'
+.*:12: Error: unsupported instruction `kunpckdq'
+.*:13: Error: unsupported instruction `kunpckwd'
+.*:16: Error: unsupported instruction `kaddd'
+.*:17: Error: unsupported instruction `kandd'
+.*:18: Error: unsupported instruction `kandnd'
+.*:19: Error: unsupported instruction `kmovd'
+.*:20: Error: unsupported instruction `kmovd'
+.*:21: Error: unsupported instruction `kmovd'
+.*:22: Error: unsupported instruction `knotd'
+.*:23: Error: unsupported instruction `kord'
+.*:24: Error: unsupported instruction `kortestd'
+.*:25: Error: unsupported instruction `kshiftld'
+.*:26: Error: unsupported instruction `kshiftrd'
+.*:27: Error: unsupported instruction `ktestd'
+.*:28: Error: unsupported instruction `kxord'
+.*:29: Error: unsupported instruction `kxnord'
+.*:16: Error: unsupported instruction `kaddq'
+.*:17: Error: unsupported instruction `kandq'
+.*:18: Error: unsupported instruction `kandnq'
+.*:19: Error: unsupported instruction `kmovq'
+.*:20: Error: unsupported instruction `kmovq'
+.*:21: Error: unsupported instruction `kmovq'
+.*:22: Error: unsupported instruction `knotq'
+.*:23: Error: unsupported instruction `korq'
+.*:24: Error: unsupported instruction `kortestq'
+.*:25: Error: unsupported instruction `kshiftlq'
+.*:26: Error: unsupported instruction `kshiftrq'
+.*:27: Error: unsupported instruction `ktestq'
+.*:28: Error: unsupported instruction `kxorq'
+.*:29: Error: unsupported instruction `kxnorq'
+.*:32: Error: unsupported instruction `vcvtpd2phz'
+.*:33: Error: unsupported instruction `vcvtpd2phy'
+.*:34: Error: unsupported broadcast for `vcvtpd2ph'
+.*:35: Error: unsupported broadcast for `vcvtpd2ph'
+.*:37: Error: .*
+.*:38: Error: unsupported instruction `vcvtpd2psy'
+.*:39: Error: unsupported instruction `vcvtpd2psy'
+.*:40: Error: unsupported broadcast for `vcvtpd2ps'
+.*:42: Error: unsupported instruction `vfpclasspsy'
+.*:43: Error: unsupported instruction `vfpclasspsz'
+.*:7: Warning: \.avx10\.1
+.*:46: Error: ambiguous operand size for `vcvtpd2ph'
+.*:47: Error: ambiguous operand size for `vcvtpd2ps'
+.*:48: Error: ambiguous operand size for `vfpclassps'
+#...
+[ 	]*[0-9]+[ 	]+>  \.arch generic32
+[ 	]*[0-9]+[ 	]+>  \.arch default
+[ 	]*[0-9]+[ 	]+\?+ C5FB93D1[ 	]+>  kmovd %k1,%edx
+[ 	]*[0-9]+[ 	]+\?+ C5FB92D1[ 	]+>  kmovd %ecx,%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E1EC4B D9[ 	]+>  kunpckdq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C5EC4BD9[ 	]+>  kunpckwd %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+> *
+[ 	]*[0-9]+[ 	]+>  \.irp sz,d,q
+[ 	]*[0-9]+[ 	]+\?+ C4E1ED4A D9[ 	]+>>  kaddd %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1ED41 D9[ 	]+>>  kandd %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1ED42 D9[ 	]+>>  kandnd %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1F990 D1[ 	]+>>  kmovd %k1,%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E1F991 0A[ 	]+>>  kmovd %k1,\(%edx\)
+[ 	]*[0-9]+[ 	]+\?+ C4E1F990 11[ 	]+>>  kmovd \(%ecx\),%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E1F944 D1[ 	]+>>  knotd %k1,%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E1ED45 D9[ 	]+>>  kord %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1F998 D1[ 	]+>>  kortestd %k1,%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E37933 DA01[ 	]+>>  kshiftld \$1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E37931 DA01[ 	]+>>  kshiftrd \$1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1F999 D1[ 	]+>>  ktestd %k1,%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E1ED47 D9[ 	]+>>  kxord %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1ED46 D9[ 	]+>>  kxnord %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1EC4A D9[ 	]+>>  kaddq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1EC41 D9[ 	]+>>  kandq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1EC42 D9[ 	]+>>  kandnq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1F890 D1[ 	]+>>  kmovq %k1,%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E1F891 0A[ 	]+>>  kmovq %k1,\(%edx\)
+[ 	]*[0-9]+[ 	]+\?+ C4E1F890 11[ 	]+>>  kmovq \(%ecx\),%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E1F844 D1[ 	]+>>  knotq %k1,%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E1EC45 D9[ 	]+>>  korq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1F898 D1[ 	]+>>  kortestq %k1,%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E3F933 DA01[ 	]+>>  kshiftlq \$1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E3F931 DA01[ 	]+>>  kshiftrq \$1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1F899 D1[ 	]+>>  ktestq %k1,%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E1EC47 D9[ 	]+>>  kxorq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1EC46 D9[ 	]+>>  kxnorq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+> *
+[ 	]*[0-9]+[ 	]+\?+ 62F5FD48 5A00[ 	]+>  vcvtpd2phz \(%eax\),%xmm0
+[ 	]*[0-9]+[ 	]+\?+ 62F5FD28 5A00[ 	]+>  vcvtpd2phy \(%eax\),%xmm0
+[ 	]*[0-9]+[ 	]+\?+ 62F5FD58 5A00[ 	]+>  vcvtpd2ph \(%eax\)\{1to8\},%xmm0
+[ 	]*[0-9]+[ 	]+\?+ 62F5FD38 5A00[ 	]+>  vcvtpd2ph \(%eax\)\{1to4\},%xmm0
+[ 	]*[0-9]+[ 	]+> *
+[ 	]*[0-9]+[ 	]+\?+ 62F1FD48 5A00[ 	]+>  vcvtpd2ps \(%eax\),%ymm0
+[ 	]*[0-9]+[ 	]+\?+ C5FD5A00[ 	]+>  vcvtpd2psy \(%eax\),%xmm0
+[ 	]*[0-9]+[ 	]+\?+ 62F1FD29 5A00[ 	]+>  vcvtpd2psy \(%eax\),%xmm0\{%k1\}
+[ 	]*[0-9]+[ 	]+\?+ 62F1FD38 5A00[ 	]+>  vcvtpd2ps \(%eax\)\{1to4\},%xmm0
+[ 	]*[0-9]+[ 	]+> *
+[ 	]*[0-9]+[ 	]+\?+ 62F37D28 660000[ 	]+>  vfpclasspsy \$0,\(%eax\),%k0
+[ 	]*[0-9]+[ 	]+\?+ 62F37D48 660000[ 	]+>  vfpclasspsz \$0,\(%eax\),%k0
+[ 	]*[0-9]+[ 	]+> *
+[ 	]*[0-9]+[ 	]+>  \.intel_syntax noprefix
+[ 	]*[0-9]+[ 	]+>  vcvtpd2ph xmm0,\[eax\]
+[ 	]*[0-9]+[ 	]+>  vcvtpd2ps xmm0,\[eax\]
+[ 	]*[0-9]+[ 	]+>  vfpclassps k0,\[eax\],0
+#...
+[ 	]*[0-9]+[ 	]+>  \.arch generic32
+[ 	]*[0-9]+[ 	]+>  \.arch \.avx10\.1/256
+[ 	]*[0-9]+[ 	]+\?+ C5FB93D1[ 	]+>  kmovd %k1,%edx
+[ 	]*[0-9]+[ 	]+\?+ C5FB92D1[ 	]+>  kmovd %ecx,%k2
+[ 	]*[0-9]+[ 	]+>  kunpckdq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C5EC4BD9[ 	]+>  kunpckwd %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+> *
+[ 	]*[0-9]+[ 	]+>  \.irp sz,d,q
+[ 	]*[0-9]+[ 	]+\?+ C4E1ED4A D9[ 	]+>>  kaddd %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1ED41 D9[ 	]+>>  kandd %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1ED42 D9[ 	]+>>  kandnd %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1F990 D1[ 	]+>>  kmovd %k1,%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E1F991 0A[ 	]+>>  kmovd %k1,\(%edx\)
+[ 	]*[0-9]+[ 	]+\?+ C4E1F990 11[ 	]+>>  kmovd \(%ecx\),%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E1F944 D1[ 	]+>>  knotd %k1,%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E1ED45 D9[ 	]+>>  kord %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1F998 D1[ 	]+>>  kortestd %k1,%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E37933 DA01[ 	]+>>  kshiftld \$1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E37931 DA01[ 	]+>>  kshiftrd \$1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1F999 D1[ 	]+>>  ktestd %k1,%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E1ED47 D9[ 	]+>>  kxord %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1ED46 D9[ 	]+>>  kxnord %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+>>  kaddq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+>>  kandq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+>>  kandnq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+>>  kmovq %k1,%k2
+[ 	]*[0-9]+[ 	]+>>  kmovq %k1,\(%edx\)
+[ 	]*[0-9]+[ 	]+>>  kmovq \(%ecx\),%k2
+[ 	]*[0-9]+[ 	]+>>  knotq %k1,%k2
+[ 	]*[0-9]+[ 	]+>>  korq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+>>  kortestq %k1,%k2
+[ 	]*[0-9]+[ 	]+>>  kshiftlq \$1,%k2,%k3
+[ 	]*[0-9]+[ 	]+>>  kshiftrq \$1,%k2,%k3
+[ 	]*[0-9]+[ 	]+>>  ktestq %k1,%k2
+[ 	]*[0-9]+[ 	]+>>  kxorq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+>>  kxnorq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+> *
+[ 	]*[0-9]+[ 	]+>  vcvtpd2phz \(%eax\),%xmm0
+[ 	]*[0-9]+[ 	]+\?+ 62F5FD28 5A00[ 	]+>  vcvtpd2phy \(%eax\),%xmm0
+[ 	]*[0-9]+[ 	]+>  vcvtpd2ph \(%eax\)\{1to8\},%xmm0
+[ 	]*[0-9]+[ 	]+\?+ 62F5FD38 5A00[ 	]+>  vcvtpd2ph \(%eax\)\{1to4\},%xmm0
+[ 	]*[0-9]+[ 	]+> *
+[ 	]*[0-9]+[ 	]+>  vcvtpd2ps \(%eax\),%ymm0
+[ 	]*[0-9]+[ 	]+\?+ C5FD5A00[ 	]+>  vcvtpd2psy \(%eax\),%xmm0
+[ 	]*[0-9]+[ 	]+\?+ 62F1FD29 5A00[ 	]+>  vcvtpd2psy \(%eax\),%xmm0\{%k1\}
+[ 	]*[0-9]+[ 	]+\?+ 62F1FD38 5A00[ 	]+>  vcvtpd2ps \(%eax\)\{1to4\},%xmm0
+[ 	]*[0-9]+[ 	]+> *
+[ 	]*[0-9]+[ 	]+\?+ 62F37D28 660000[ 	]+>  vfpclasspsy \$0,\(%eax\),%k0
+[ 	]*[0-9]+[ 	]+>  vfpclasspsz \$0,\(%eax\),%k0
+[ 	]*[0-9]+[ 	]+> *
+[ 	]*[0-9]+[ 	]+>  \.intel_syntax noprefix
+[ 	]*[0-9]+[ 	]+>  vcvtpd2ph xmm0,\[eax\]
+[ 	]*[0-9]+[ 	]+>  vcvtpd2ps xmm0,\[eax\]
+[ 	]*[0-9]+[ 	]+>  vfpclassps k0,\[eax\],0
+#...
+[ 	]*[0-9]+[ 	]+>  \.arch generic32
+[ 	]*[0-9]+[ 	]+>  \.arch \.avx10\.1/128
+[ 	]*[0-9]+[ 	]+>  kmovd %k1,%edx
+[ 	]*[0-9]+[ 	]+>  kmovd %ecx,%k2
+[ 	]*[0-9]+[ 	]+>  kunpckdq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+>  kunpckwd %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+> *
+[ 	]*[0-9]+[ 	]+>  \.irp sz,d,q
+[ 	]*[0-9]+[ 	]+>>  kaddd %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+>>  kandd %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+>>  kandnd %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+>>  kmovd %k1,%k2
+[ 	]*[0-9]+[ 	]+>>  kmovd %k1,\(%edx\)
+[ 	]*[0-9]+[ 	]+>>  kmovd \(%ecx\),%k2
+[ 	]*[0-9]+[ 	]+>>  knotd %k1,%k2
+[ 	]*[0-9]+[ 	]+>>  kord %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+>>  kortestd %k1,%k2
+[ 	]*[0-9]+[ 	]+>>  kshiftld \$1,%k2,%k3
+[ 	]*[0-9]+[ 	]+>>  kshiftrd \$1,%k2,%k3
+[ 	]*[0-9]+[ 	]+>>  ktestd %k1,%k2
+[ 	]*[0-9]+[ 	]+>>  kxord %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+>>  kxnord %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+>>  kaddq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+>>  kandq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+>>  kandnq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+>>  kmovq %k1,%k2
+[ 	]*[0-9]+[ 	]+>>  kmovq %k1,\(%edx\)
+[ 	]*[0-9]+[ 	]+>>  kmovq \(%ecx\),%k2
+[ 	]*[0-9]+[ 	]+>>  knotq %k1,%k2
+[ 	]*[0-9]+[ 	]+>>  korq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+>>  kortestq %k1,%k2
+[ 	]*[0-9]+[ 	]+>>  kshiftlq \$1,%k2,%k3
+[ 	]*[0-9]+[ 	]+>>  kshiftrq \$1,%k2,%k3
+[ 	]*[0-9]+[ 	]+>>  ktestq %k1,%k2
+[ 	]*[0-9]+[ 	]+>>  kxorq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+>>  kxnorq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+> *
+[ 	]*[0-9]+[ 	]+>  vcvtpd2phz \(%eax\),%xmm0
+[ 	]*[0-9]+[ 	]+>  vcvtpd2phy \(%eax\),%xmm0
+[ 	]*[0-9]+[ 	]+>  vcvtpd2ph \(%eax\)\{1to8\},%xmm0
+[ 	]*[0-9]+[ 	]+>  vcvtpd2ph \(%eax\)\{1to4\},%xmm0
+[ 	]*[0-9]+[ 	]+> *
+[ 	]*[0-9]+[ 	]+>  vcvtpd2ps \(%eax\),%ymm0
+[ 	]*[0-9]+[ 	]+>  vcvtpd2psy \(%eax\),%xmm0
+[ 	]*[0-9]+[ 	]+>  vcvtpd2psy \(%eax\),%xmm0\{%k1\}
+[ 	]*[0-9]+[ 	]+>  vcvtpd2ps \(%eax\)\{1to4\},%xmm0
+[ 	]*[0-9]+[ 	]+> *
+[ 	]*[0-9]+[ 	]+>  vfpclasspsy \$0,\(%eax\),%k0
+[ 	]*[0-9]+[ 	]+>  vfpclasspsz \$0,\(%eax\),%k0
+[ 	]*[0-9]+[ 	]+> *
+[ 	]*[0-9]+[ 	]+>  \.intel_syntax noprefix
+[ 	]*[0-9]+[ 	]+\?+ 62F5FD08 5A00[ 	]+>  vcvtpd2ph xmm0,\[eax\]
+[ 	]*[0-9]+[ 	]+\?+ C5F95A00[ 	]+>  vcvtpd2ps xmm0,\[eax\]
+[ 	]*[0-9]+[ 	]+\?+ 62F37D08 660000[ 	]+>  vfpclassps k0,\[eax\],0
+#...
+[ 	]*[0-9]+[ 	]+\?+ C5FB93D1[ 	]+>  kmovd %k1,%edx
+[ 	]*[0-9]+[ 	]+\?+ C5FB92D1[ 	]+>  kmovd %ecx,%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E1EC4B D9[ 	]+>  kunpckdq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C5EC4BD9[ 	]+>  kunpckwd %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+> *
+[ 	]*[0-9]+[ 	]+>  \.irp sz,d,q
+[ 	]*[0-9]+[ 	]+\?+ C4E1ED4A D9[ 	]+>>  kaddd %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1ED41 D9[ 	]+>>  kandd %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1ED42 D9[ 	]+>>  kandnd %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1F990 D1[ 	]+>>  kmovd %k1,%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E1F991 0A[ 	]+>>  kmovd %k1,\(%edx\)
+[ 	]*[0-9]+[ 	]+\?+ C4E1F990 11[ 	]+>>  kmovd \(%ecx\),%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E1F944 D1[ 	]+>>  knotd %k1,%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E1ED45 D9[ 	]+>>  kord %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1F998 D1[ 	]+>>  kortestd %k1,%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E37933 DA01[ 	]+>>  kshiftld \$1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E37931 DA01[ 	]+>>  kshiftrd \$1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1F999 D1[ 	]+>>  ktestd %k1,%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E1ED47 D9[ 	]+>>  kxord %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1ED46 D9[ 	]+>>  kxnord %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1EC4A D9[ 	]+>>  kaddq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1EC41 D9[ 	]+>>  kandq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1EC42 D9[ 	]+>>  kandnq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1F890 D1[ 	]+>>  kmovq %k1,%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E1F891 0A[ 	]+>>  kmovq %k1,\(%edx\)
+[ 	]*[0-9]+[ 	]+\?+ C4E1F890 11[ 	]+>>  kmovq \(%ecx\),%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E1F844 D1[ 	]+>>  knotq %k1,%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E1EC45 D9[ 	]+>>  korq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1F898 D1[ 	]+>>  kortestq %k1,%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E3F933 DA01[ 	]+>>  kshiftlq \$1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E3F931 DA01[ 	]+>>  kshiftrq \$1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1F899 D1[ 	]+>>  ktestq %k1,%k2
+[ 	]*[0-9]+[ 	]+\?+ C4E1EC47 D9[ 	]+>>  kxorq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+\?+ C4E1EC46 D9[ 	]+>>  kxnorq %k1,%k2,%k3
+[ 	]*[0-9]+[ 	]+> *
+[ 	]*[0-9]+[ 	]+\?+ 62F5FD48 5A00[ 	]+>  vcvtpd2phz \(%eax\),%xmm0
+[ 	]*[0-9]+[ 	]+\?+ 62F5FD28 5A00[ 	]+>  vcvtpd2phy \(%eax\),%xmm0
+[ 	]*[0-9]+[ 	]+\?+ 62F5FD58 5A00[ 	]+>  vcvtpd2ph \(%eax\)\{1to8\},%xmm0
+[ 	]*[0-9]+[ 	]+\?+ 62F5FD38 5A00[ 	]+>  vcvtpd2ph \(%eax\)\{1to4\},%xmm0
+[ 	]*[0-9]+[ 	]+> *
+[ 	]*[0-9]+[ 	]+\?+ 62F1FD48 5A00[ 	]+>  vcvtpd2ps \(%eax\),%ymm0
+[ 	]*[0-9]+[ 	]+\?+ C5FD5A00[ 	]+>  vcvtpd2psy \(%eax\),%xmm0
+[ 	]*[0-9]+[ 	]+\?+ 62F1FD29 5A00[ 	]+>  vcvtpd2psy \(%eax\),%xmm0\{%k1\}
+[ 	]*[0-9]+[ 	]+\?+ 62F1FD38 5A00[ 	]+>  vcvtpd2ps \(%eax\)\{1to4\},%xmm0
+[ 	]*[0-9]+[ 	]+> *
+[ 	]*[0-9]+[ 	]+\?+ 62F37D28 660000[ 	]+>  vfpclasspsy \$0,\(%eax\),%k0
+[ 	]*[0-9]+[ 	]+\?+ 62F37D48 660000[ 	]+>  vfpclasspsz \$0,\(%eax\),%k0
+[ 	]*[0-9]+[ 	]+> *
+[ 	]*[0-9]+[ 	]+>  \.intel_syntax noprefix
+[ 	]*[0-9]+[ 	]+>  vcvtpd2ph xmm0,\[eax\]
+[ 	]*[0-9]+[ 	]+>  vcvtpd2ps xmm0,\[eax\]
+[ 	]*[0-9]+[ 	]+>  vfpclassps k0,\[eax\],0
+#pass
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx10-vsz.s
@@ -0,0 +1,52 @@
+# Test AVX10 vector size restriction
+	.text
+
+	.irp isa, default, .avx10.1/256, .avx10.1/128, .avx10.1
+
+	.att_syntax prefix
+	.warning "\isa"
+	.arch generic32
+	.arch \isa
+	kmovd		%k1, %edx
+	kmovd		%ecx, %k2
+	kunpckdq	%k1, %k2, %k3
+	kunpckwd	%k1, %k2, %k3
+
+	.irp sz, d, q
+	kadd\sz		%k1, %k2, %k3
+	kand\sz		%k1, %k2, %k3
+	kandn\sz	%k1, %k2, %k3
+	kmov\sz		%k1, %k2
+	kmov\sz		%k1, (%edx)
+	kmov\sz		(%ecx), %k2
+	knot\sz		%k1, %k2
+	kor\sz		%k1, %k2, %k3
+	kortest\sz	%k1, %k2
+	kshiftl\sz	$1, %k2, %k3
+	kshiftr\sz	$1, %k2, %k3
+	ktest\sz	%k1, %k2
+	kxor\sz		%k1, %k2, %k3
+	kxnor\sz	%k1, %k2, %k3
+	.endr
+
+	vcvtpd2phz	(%eax), %xmm0
+	vcvtpd2phy	(%eax), %xmm0
+	vcvtpd2ph	(%eax){1to8}, %xmm0
+	vcvtpd2ph	(%eax){1to4}, %xmm0
+
+	vcvtpd2ps	(%eax), %ymm0
+	vcvtpd2psy	(%eax), %xmm0
+	vcvtpd2psy	(%eax), %xmm0{%k1}
+	vcvtpd2ps	(%eax){1to4}, %xmm0
+
+	vfpclasspsy	$0, (%eax), %k0
+	vfpclasspsz	$0, (%eax), %k0
+
+	.intel_syntax noprefix
+	vcvtpd2ph	xmm0, [eax]
+	vcvtpd2ps	xmm0, [eax]
+	vfpclassps	k0, [eax], 0
+
+	.endr
+
+	.p2align 4
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -238,7 +238,7 @@ if [gas_32_check] then {
     run_list_test "noavx-3" "-al"
     run_dump_test "noavx-4"
     run_list_test "noavx512-1" "-almn"
-    run_list_test "noavx512-2" "-al"
+    run_list_test "noavx512-2" "-almn --divide"
     run_dump_test "noextreg"
     run_dump_test "xmmhi32"
     run_dump_test "xsave"
@@ -579,6 +579,7 @@ if [gas_32_check] then {
     run_dump_test "avx10.1-256-gfni"
     run_dump_test "avx10.1-256-vaes"
     run_dump_test "avx10.1-256-vpclmulqdq"
+    run_list_test "avx10-vsz" "-almn --divide --listing-lhs-width=2"
     run_dump_test "fpu-bad"
     run_dump_test "cet"
     run_dump_test "cet-intel"
--- a/gas/testsuite/gas/i386/noavx512-2.l
+++ b/gas/testsuite/gas/i386/noavx512-2.l
@@ -1,106 +1,195 @@
 .*: Assembler messages:
-.*:26: Error: .*unsupported masking.*
-.*:27: Error: .*unsupported masking.*
-.*:29: Error: .*unsupported instruction.*
-.*:30: Error: .*unsupported instruction.*
-.*:32: Error: .*unsupported instruction.*
-.*:33: Error: .*unsupported instruction.*
-.*:36: Error: .*unsupported masking.*
-.*:37: Error: .*unsupported masking.*
-.*:39: Error: .*unsupported instruction.*
-.*:40: Error: .*unsupported instruction.*
-.*:43: Error: .*unsupported instruction.*
-.*:44: Error: .*unsupported instruction.*
-GAS LISTING .*
-#...
-[ 	]*1[ 	]+\# Test \.arch \.noavx512vl
-[ 	]*2[ 	]+\.text
-[ 	]*3[ 	]+\?\?\?\? 62F27D4F 		vpabsb %zmm5, %zmm6\{%k7\}		\# AVX512BW
-[ 	]*3[ 	]+1CF5
-[ 	]*4[ 	]+\?\?\?\? 62F27D0F 		vpabsb %xmm5, %xmm6\{%k7\}		\# AVX512BW \+ AVX512VL
-[ 	]*4[ 	]+1CF5
-[ 	]*5[ 	]+\?\?\?\? 62F27D2F 		vpabsb %ymm5, %ymm6\{%k7\}		\# AVX512BW \+ AVX512VL
-[ 	]*5[ 	]+1CF5
-[ 	]*6[ 	]+\?\?\?\? 62F27D48 		vpconflictd %zmm5, %zmm6		\# AVX412CD
-[ 	]*6[ 	]+C4F5
-[ 	]*7[ 	]+\?\?\?\? 62F27D08 		vpconflictd %xmm5, %xmm6		\# AVX412CD \+ AVX512VL
-[ 	]*7[ 	]+C4F5
-[ 	]*8[ 	]+\?\?\?\? 62F27D28 		vpconflictd %ymm5, %ymm6		\# AVX412CD \+ AVX512VL
-[ 	]*8[ 	]+C4F5
-[ 	]*9[ 	]+\?\?\?\? 62F1FD4F 		vcvtpd2qq \(%ecx\), %zmm6\{%k7\}		\# AVX512DQ
-[ 	]*9[ 	]+7B31
-[ 	]*10[ 	]+\?\?\?\? 62F1FD0F 		vcvtpd2qq \(%ecx\), %xmm6\{%k7\}		\# AVX512DQ \+ AVX512VL
-[ 	]*10[ 	]+7B31
-[ 	]*11[ 	]+\?\?\?\? 62F1FD2F 		vcvtpd2qq \(%ecx\), %ymm6\{%k7\}		\# AVX512DQ \+ AVX512VL
-[ 	]*11[ 	]+7B31
-[ 	]*12[ 	]+\?\?\?\? 62F27D4F 		vexp2ps %zmm5, %zmm6\{%k7\}		\# AVX512ER
-[ 	]*12[ 	]+C8F5
-[ 	]*13[ 	]+\?\?\?\? 62F1D54F 		vaddpd %zmm4, %zmm5, %zmm6\{%k7\}		\# AVX512F
-[ 	]*13[ 	]+58F4
-[ 	]*14[ 	]+\?\?\?\? 62F1D50F 		vaddpd %xmm4, %xmm5, %xmm6\{%k7\}		\# AVX512F \+ AVX512VL
-[ 	]*14[ 	]+58F4
-[ 	]*15[ 	]+\?\?\?\? 62F1D52F 		vaddpd %ymm4, %ymm5, %ymm6\{%k7\}		\# AVX512F \+ AVX512VL
-[ 	]*15[ 	]+58F4
-[ 	]*16[ 	]+\?\?\?\? 62F2D54F 		vpmadd52luq %zmm4, %zmm5, %zmm6\{%k7\}	\# AVX512IFMA
-[ 	]*16[ 	]+B4F4
-[ 	]*17[ 	]+\?\?\?\? 62F2D50F 		vpmadd52luq %xmm4, %xmm5, %xmm6\{%k7\}	\# AVX512IFMA \+ AVX512VL
-[ 	]*17[ 	]+B4F4
-[ 	]*18[ 	]+\?\?\?\? 62F2D52F 		vpmadd52luq %ymm4, %ymm5, %ymm6\{%k7\}	\# AVX512IFMA \+ AVX512VL
-[ 	]*18[ 	]+B4F4
-[ 	]*19[ 	]+\?\?\?\? 62F2FD49 		vgatherpf0dpd 23\(%ebp,%ymm7,8\)\{%k1\}	\# AVX512PF
-[ 	]*19[ 	]+C68CFD17 
-[ 	]*19[ 	]+000000
-[ 	]*20[ 	]+\?\?\?\? 62F2554F 		vpermb %zmm4, %zmm5, %zmm6\{%k7\}		\# AVX512VBMI
-[ 	]*20[ 	]+8DF4
-[ 	]*21[ 	]+\?\?\?\? 62F2550F 		vpermb %xmm4, %xmm5, %xmm6\{%k7\}		\# AVX512VBMI \+ AVX512VL
-[ 	]*21[ 	]+8DF4
-[ 	]*22[ 	]+\?\?\?\? 62F2552F 		vpermb %ymm4, %ymm5, %ymm6\{%k7\}		\# AVX512VBMI \+ AVX512VL
-[ 	]*22[ 	]+8DF4
-[ 	]*23[ 	]+
-[ 	]*24[ 	]+\.arch \.noavx512vl
-[ 	]*25[ 	]+\?\?\?\? 62F27D4F 		vpabsb %zmm5, %zmm6\{%k7\}		\# AVX512BW
-[ 	]*25[ 	]+1CF5
-[ 	]*26[ 	]+vpabsb %xmm5, %xmm6\{%k7\}		\# AVX512BW \+ AVX512VL
-[ 	]*27[ 	]+vpabsb %ymm5, %ymm6\{%k7\}		\# AVX512BW \+ AVX512VL
-[ 	]*28[ 	]+\?\?\?\? 62F27D48 		vpconflictd %zmm5, %zmm6		\# AVX412CD
-[ 	]*28[ 	]+C4F5
-[ 	]*29[ 	]+vpconflictd %xmm5, %xmm6		\# AVX412CD \+ AVX512VL
-[ 	]*30[ 	]+vpconflictd %ymm5, %ymm6		\# AVX412CD \+ AVX512VL
-[ 	]*31[ 	]+\?\?\?\? 62F1FD4F 		vcvtpd2qq \(%ecx\), %zmm6\{%k7\}		\# AVX512DQ
-[ 	]*31[ 	]+7B31
-[ 	]*32[ 	]+vcvtpd2qq \(%ecx\), %xmm6\{%k7\}		\# AVX512DQ \+ AVX512VL
-[ 	]*33[ 	]+vcvtpd2qq \(%ecx\), %ymm6\{%k7\}		\# AVX512DQ \+ AVX512VL
-\fGAS LISTING .*
-
-
-[ 	]*34[ 	]+\?\?\?\? 62F27D4F 		vexp2ps %zmm5, %zmm6\{%k7\}		\# AVX512ER
-[ 	]*34[ 	]+C8F5
-[ 	]*35[ 	]+\?\?\?\? 62F1D54F 		vaddpd %zmm4, %zmm5, %zmm6\{%k7\}		\# AVX512F
-[ 	]*35[ 	]+58F4
-[ 	]*36[ 	]+vaddpd %xmm4, %xmm5, %xmm6\{%k7\}		\# AVX512F \+ AVX512VL
-[ 	]*37[ 	]+vaddpd %ymm4, %ymm5, %ymm6\{%k7\}		\# AVX512F \+ AVX512VL
-[ 	]*38[ 	]+\?\?\?\? 62F2D54F 		vpmadd52luq %zmm4, %zmm5, %zmm6\{%k7\}	\# AVX512IFMA
-[ 	]*38[ 	]+B4F4
-[ 	]*39[ 	]+vpmadd52luq %xmm4, %xmm5, %xmm6\{%k7\}	\# AVX512IFMA \+ AVX512VL
-[ 	]*40[ 	]+vpmadd52luq %ymm4, %ymm5, %ymm6\{%k7\}	\# AVX512IFMA \+ AVX512VL
-[ 	]*41[ 	]+\?\?\?\? 62F2FD49 		vgatherpf0dpd 23\(%ebp,%ymm7,8\)\{%k1\}	\# AVX512PF
-[ 	]*41[ 	]+C68CFD17 
-[ 	]*41[ 	]+000000
-[ 	]*42[ 	]+\?\?\?\? 62F2554F 		vpermb %zmm4, %zmm5, %zmm6\{%k7\}		\# AVX512VBMI
-[ 	]*42[ 	]+8DF4
-[ 	]*43[ 	]+vpermb %xmm4, %xmm5, %xmm6\{%k7\}		\# AVX512VBMI \+ AVX512VL
-[ 	]*44[ 	]+vpermb %ymm4, %ymm5, %ymm6\{%k7\}		\# AVX512VBMI \+ AVX512VL
-[ 	]*45[ 	]+
-[ 	]*46[ 	]+\?\?\?\? C4E2791C 		vpabsb %xmm5, %xmm6
-[ 	]*46[ 	]+F5
-[ 	]*47[ 	]+\?\?\?\? C4E27D1C 		vpabsb %ymm5, %ymm6
-[ 	]*47[ 	]+F5
-[ 	]*48[ 	]+\?\?\?\? C5D158F4 		vaddpd %xmm4, %xmm5, %xmm6
-[ 	]*49[ 	]+\?\?\?\? C5D558F4 		vaddpd %ymm4, %ymm5, %ymm6
-[ 	]*50[ 	]+\?\?\?\? 660F381C 		pabsb %xmm5, %xmm6
-[ 	]*50[ 	]+F5
-[ 	]*51[ 	]+\?\?\?\? 660F58F4 		addpd %xmm4, %xmm6
-[ 	]*52[ 	]+
+.*:8: Error: .*bad register name `%zmm.*
+.*:11: Error: .*bad register name `%zmm.*
+.*:14: Error: .*bad register name `%zmm.*
+.*:17: Error: .*bad register name `%zmm.*
+.*:18: Error: .*bad register name `%zmm.*
+.*:21: Error: .*bad register name `%zmm.*
+.*:24: Error: .*unsupported instruction.*
+.*:25: Error: .*bad register name `%zmm.*
+.*:8: Error: .*bad register name `%zmm.*
+.*:10: Error: .*bad register name `%ymm.*
+.*:11: Error: .*bad register name `%zmm.*
+.*:13: Error: .*bad register name `%ymm.*
+.*:14: Error: .*bad register name `%zmm.*
+.*:16: Error: .*bad register name `%ymm.*
+.*:17: Error: .*bad register name `%zmm.*
+.*:18: Error: .*bad register name `%zmm.*
+.*:20: Error: .*bad register name `%ymm.*
+.*:21: Error: .*bad register name `%zmm.*
+.*:23: Error: .*bad register name `%ymm.*
+.*:24: Error: .*
+.*:25: Error: .*bad register name `%zmm.*
+.*:27: Error: .*bad register name `%ymm.*
+.*:9: Error: .*unsupported masking.*
+.*:10: Error: .*unsupported masking.*
+.*:12: Error: .*unsupported instruction.*
+.*:13: Error: .*unsupported instruction.*
+.*:15: Error: .*unsupported instruction.*
+.*:16: Error: .*unsupported instruction.*
+.*:19: Error: .*unsupported masking.*
+.*:20: Error: .*unsupported masking.*
+.*:22: Error: .*unsupported instruction.*
+.*:23: Error: .*unsupported instruction.*
+.*:26: Error: .*unsupported instruction.*
+.*:27: Error: .*unsupported instruction.*
+#...
+[ 	]*[0-9]+[ 	]+\# Test \.arch \.noavx512vl
+[ 	]*[0-9]+[ 	]+\.text
+[ 	]*[0-9]+[ 	]*
+[ 	]*[0-9]+[ 	]+\.irp isa, default, .*
+#...
+[ 	]*[0-9]+[ 	]+\.endr
+#...
+[ 	]*[0-9]+[ 	]+>  \.arch default
+[ 	]*[0-9]+[ 	]+>  \.arch default
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D4F 	>  vpabsb %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D0F 	>  vpabsb %xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D2F 	>  vpabsb %ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D48 	>  vpconflictd %zmm5,%zmm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D08 	>  vpconflictd %xmm5,%xmm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D28 	>  vpconflictd %ymm5,%ymm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD4F 	>  vcvtpd2qq \(%ecx\),%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD0F 	>  vcvtpd2qq \(%ecx\),%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD2F 	>  vcvtpd2qq \(%ecx\),%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D4F 	>  vexp2ps %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+C8F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D54F 	>  vaddpd %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D50F 	>  vaddpd %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D52F 	>  vaddpd %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D54F 	>  vpmadd52luq %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D50F 	>  vpmadd52luq %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D52F 	>  vpmadd52luq %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2FD49 	>  vgatherpf0dpd 23\(%ebp,%ymm7,8\)\{%k1\}
+[ 	]*[0-9]+[ 	]+C68CFD17 *
+[ 	]*[0-9]+[ 	]+000000
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2554F 	>  vpermb %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2550F 	>  vpermb %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2552F 	>  vpermb %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+#...
+[ 	]*[0-9]+[ 	]+>  \.arch default
+[ 	]*[0-9]+[ 	]+>  \.arch \.avx10\.1/256
+[ 	]*[0-9]+[ 	]+>  vpabsb %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D0F 	>  vpabsb %xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D2F 	>  vpabsb %ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+>  vpconflictd %zmm5,%zmm6
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D08 	>  vpconflictd %xmm5,%xmm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D28 	>  vpconflictd %ymm5,%ymm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+>  vcvtpd2qq \(%ecx\),%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD0F 	>  vcvtpd2qq \(%ecx\),%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD2F 	>  vcvtpd2qq \(%ecx\),%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+>  vexp2ps %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vaddpd %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D50F 	>  vaddpd %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D52F 	>  vaddpd %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+>  vpmadd52luq %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D50F 	>  vpmadd52luq %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D52F 	>  vpmadd52luq %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+>  vgatherpf0dpd 23\(%ebp,%ymm7,8\)\{%k1\}
+[ 	]*[0-9]+[ 	]+>  vpermb %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2550F 	>  vpermb %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2552F 	>  vpermb %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+#...
+[ 	]*[0-9]+[ 	]+>  \.arch default
+[ 	]*[0-9]+[ 	]+>  \.arch \.avx10\.1/128
+[ 	]*[0-9]+[ 	]+>  vpabsb %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D0F 	>  vpabsb %xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+>  vpabsb %ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpconflictd %zmm5,%zmm6
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D08 	>  vpconflictd %xmm5,%xmm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+>  vpconflictd %ymm5,%ymm6
+[ 	]*[0-9]+[ 	]+>  vcvtpd2qq \(%ecx\),%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD0F 	>  vcvtpd2qq \(%ecx\),%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+>  vcvtpd2qq \(%ecx\),%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vexp2ps %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vaddpd %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D50F 	>  vaddpd %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+>  vaddpd %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpmadd52luq %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D50F 	>  vpmadd52luq %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+>  vpmadd52luq %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vgatherpf0dpd 23\(%ebp,%ymm7,8\)\{%k1\}
+[ 	]*[0-9]+[ 	]+>  vpermb %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2550F 	>  vpermb %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+[ 	]*[0-9]+[ 	]+>  vpermb %ymm4,%ymm5,%ymm6\{%k7\}
+#...
+[ 	]*[0-9]+[ 	]+>  \.arch \.noavx512vl
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D4F 	>  vpabsb %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+>  vpabsb %xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpabsb %ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D48 	>  vpconflictd %zmm5,%zmm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+>  vpconflictd %xmm5,%xmm6
+[ 	]*[0-9]+[ 	]+>  vpconflictd %ymm5,%ymm6
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD4F 	>  vcvtpd2qq \(%ecx\),%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+>  vcvtpd2qq \(%ecx\),%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vcvtpd2qq \(%ecx\),%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D4F 	>  vexp2ps %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+C8F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D54F 	>  vaddpd %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+>  vaddpd %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vaddpd %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D54F 	>  vpmadd52luq %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+>  vpmadd52luq %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpmadd52luq %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2FD49 	>  vgatherpf0dpd 23\(%ebp,%ymm7,8\)\{%k1\}
+[ 	]*[0-9]+[ 	]+C68CFD17 *
+[ 	]*[0-9]+[ 	]+000000
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2554F 	>  vpermb %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+[ 	]*[0-9]+[ 	]+>  vpermb %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpermb %ymm4,%ymm5,%ymm6\{%k7\}
+#...
+[ 	]*[0-9]+[ 	]+\?\?\?\? C4E2791C 		vpabsb %xmm5, %xmm6
+[ 	]*[0-9]+[ 	]+F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? C4E27D1C 		vpabsb %ymm5, %ymm6
+[ 	]*[0-9]+[ 	]+F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? C5D158F4 		vaddpd %xmm4, %xmm5, %xmm6
+[ 	]*[0-9]+[ 	]+\?\?\?\? C5D558F4 		vaddpd %ymm4, %ymm5, %ymm6
+[ 	]*[0-9]+[ 	]+\?\?\?\? 660F381C 		pabsb %xmm5, %xmm6
+[ 	]*[0-9]+[ 	]+F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 660F58F4 		addpd %xmm4, %xmm6
+[ 	]*[0-9]+[ 	]+
 [ 	]*[1-9][0-9]*[ 	]+\.intel_syntax noprefix
 [ 	]*[1-9][0-9]*[ 	]+\?\?\?\? 62F3FD48 		vfpclasspd k0, \[eax], 0
 [ 	]*[1-9][0-9]*[ 	]+660000
--- a/gas/testsuite/gas/i386/noavx512-2.s
+++ b/gas/testsuite/gas/i386/noavx512-2.s
@@ -1,27 +1,10 @@
 # Test .arch .noavx512vl
 	.text
-	vpabsb %zmm5, %zmm6{%k7}		# AVX512BW
-	vpabsb %xmm5, %xmm6{%k7}		# AVX512BW + AVX512VL
-	vpabsb %ymm5, %ymm6{%k7}		# AVX512BW + AVX512VL
-	vpconflictd %zmm5, %zmm6		# AVX412CD
-	vpconflictd %xmm5, %xmm6		# AVX412CD + AVX512VL
-	vpconflictd %ymm5, %ymm6		# AVX412CD + AVX512VL
-	vcvtpd2qq (%ecx), %zmm6{%k7}		# AVX512DQ
-	vcvtpd2qq (%ecx), %xmm6{%k7}		# AVX512DQ + AVX512VL
-	vcvtpd2qq (%ecx), %ymm6{%k7}		# AVX512DQ + AVX512VL
-	vexp2ps %zmm5, %zmm6{%k7}		# AVX512ER
-	vaddpd %zmm4, %zmm5, %zmm6{%k7}		# AVX512F
-	vaddpd %xmm4, %xmm5, %xmm6{%k7}		# AVX512F + AVX512VL
-	vaddpd %ymm4, %ymm5, %ymm6{%k7}		# AVX512F + AVX512VL
-	vpmadd52luq %zmm4, %zmm5, %zmm6{%k7}	# AVX512IFMA
-	vpmadd52luq %xmm4, %xmm5, %xmm6{%k7}	# AVX512IFMA + AVX512VL
-	vpmadd52luq %ymm4, %ymm5, %ymm6{%k7}	# AVX512IFMA + AVX512VL
-	vgatherpf0dpd 23(%ebp,%ymm7,8){%k1}	# AVX512PF
-	vpermb %zmm4, %zmm5, %zmm6{%k7}		# AVX512VBMI
-	vpermb %xmm4, %xmm5, %xmm6{%k7}		# AVX512VBMI + AVX512VL
-	vpermb %ymm4, %ymm5, %ymm6{%k7}		# AVX512VBMI + AVX512VL
 
-	.arch .noavx512vl
+	.irp isa, default, .avx10.1/256, .avx10.1/128, .noavx512vl
+
+	.arch default
+	.arch \isa
 	vpabsb %zmm5, %zmm6{%k7}		# AVX512BW
 	vpabsb %xmm5, %xmm6{%k7}		# AVX512BW + AVX512VL
 	vpabsb %ymm5, %ymm6{%k7}		# AVX512BW + AVX512VL
@@ -43,6 +26,8 @@
 	vpermb %xmm4, %xmm5, %xmm6{%k7}		# AVX512VBMI + AVX512VL
 	vpermb %ymm4, %ymm5, %ymm6{%k7}		# AVX512VBMI + AVX512VL
 
+	.endr
+
 	vpabsb %xmm5, %xmm6
 	vpabsb %ymm5, %ymm6
 	vaddpd %xmm4, %xmm5, %xmm6
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -464,6 +464,7 @@ static bitfield opcode_modifiers[] =
   BITFIELD (StaticRounding),
   BITFIELD (SAE),
   BITFIELD (Disp8MemShift),
+  BITFIELD (Vsz),
   BITFIELD (Optimize),
   BITFIELD (ATTMnemonic),
   BITFIELD (ATTSyntax),
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -716,6 +716,16 @@ enum
 #define DISP8_SHIFT_VL 7
   Disp8MemShift,
 
+  /* insn has vector size restrictions, requiring a minimum of:
+	0: 128 bits.
+	1: 256 bits.
+	2: 512 bits.
+   */
+#define VSZ128                 0 /* Not to be used in templates.  */
+#define VSZ256                 1
+#define VSZ512                 2
+  Vsz,
+
   /* Support encoding optimization.  */
   Optimize,
 
@@ -776,6 +786,7 @@ typedef struct i386_opcode_modifier
   unsigned int staticrounding:1;
   unsigned int sae:1;
   unsigned int disp8memshift:3;
+  unsigned int vsz:3;
   unsigned int optimize:1;
   unsigned int attmnemonic:1;
   unsigned int attsyntax:1;
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -131,6 +131,9 @@
 #define EVexLIG EVex=EVEXLIG
 #define EVexDYN EVex=EVEXDYN
 
+#define Vsz256 Vsz=VSZ256
+#define Vsz512 Vsz=VSZ512
+
 // The EVEX purpose of StaticRounding appears only together with SAE. Re-use
 // the bit to mark commutative VEX encodings where swapping the source
 // operands may allow to switch from 3-byte to 2-byte VEX encoding.
@@ -987,9 +990,9 @@ pause, 0xf390, i186, NoSuf, {}
     b:0:VexW0:Byte:AVX512DQ:66:AVX512VBMI, +
     w:1:VexW1:Word:AVX512F::AVX512BW>
 
-<dq:opc:vexw:vexw64:elem:cpu64:gpr:kpfx, +
-    d:0:VexW0::Dword::Reg32:66, +
-    q:1:VexW1:VexW1:Qword:x64:Reg64:>
+<dq:opc:vexw:vexw64:elem:cpu64:gpr:kpfx:kvsz, +
+    d:0:VexW0::Dword::Reg32:66:Vsz256, +
+    q:1:VexW1:VexW1:Qword:x64:Reg64::Vsz512>
 
 emms, 0xf77, MMX, NoSuf, {}
 // These really shouldn't allow for Reg64 (movq is the right mnemonic for
@@ -2623,22 +2626,22 @@ vpmovzxwq, 0x6634, AVX512F|AVX512VL, Mod
 
 // AVX512BW instructions.
 
-kadd<dq>, 0x<dq:kpfx>4a, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf, { RegMask, RegMask, RegMask }
-kand<dq>, 0x<dq:kpfx>41, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf, { RegMask, RegMask, RegMask }
-kandn<dq>, 0x<dq:kpfx>42, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf|Optimize, { RegMask, RegMask, RegMask }
-kmov<dq>, 0x<dq:kpfx>90, AVX512BW, Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask|<dq:elem>|Unspecified|BaseIndex, RegMask }
-kmov<dq>, 0x<dq:kpfx>91, AVX512BW, Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask, <dq:elem>|Unspecified|BaseIndex }
-kmov<dq>, 0xf292, AVX512BW, D|Modrm|Vex128|Space0F|<dq:vexw64>|NoSuf, { <dq:gpr>, RegMask }
-knot<dq>, 0x<dq:kpfx>44, AVX512BW, Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask, RegMask }
-kor<dq>, 0x<dq:kpfx>45, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf, { RegMask, RegMask, RegMask }
-kortest<dq>, 0x<dq:kpfx>98, AVX512BW, Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask, RegMask }
-ktest<dq>, 0x<dq:kpfx>99, AVX512BW, Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask, RegMask }
-kxnor<dq>, 0x<dq:kpfx>46, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf, { RegMask, RegMask, RegMask }
-kxor<dq>, 0x<dq:kpfx>47, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf|Optimize, { RegMask, RegMask, RegMask }
-kunpckdq, 0x4B, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf, { RegMask, RegMask, RegMask }
-kunpckwd, 0x4B, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW0|NoSuf, { RegMask, RegMask, RegMask }
-kshiftl<dq>, 0x6633, AVX512BW, Modrm|Vex128|Space0F3A|<dq:vexw>|NoSuf, { Imm8, RegMask, RegMask }
-kshiftr<dq>, 0x6631, AVX512BW, Modrm|Vex128|Space0F3A|<dq:vexw>|NoSuf, { Imm8, RegMask, RegMask }
+kadd<dq>, 0x<dq:kpfx>4a, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|<dq:kvsz>|NoSuf, { RegMask, RegMask, RegMask }
+kand<dq>, 0x<dq:kpfx>41, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|<dq:kvsz>|NoSuf, { RegMask, RegMask, RegMask }
+kandn<dq>, 0x<dq:kpfx>42, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|<dq:kvsz>|NoSuf|Optimize, { RegMask, RegMask, RegMask }
+kmov<dq>, 0x<dq:kpfx>90, AVX512BW, Modrm|Vex128|Space0F|VexW1|<dq:kvsz>|NoSuf, { RegMask|<dq:elem>|Unspecified|BaseIndex, RegMask }
+kmov<dq>, 0x<dq:kpfx>91, AVX512BW, Modrm|Vex128|Space0F|VexW1|<dq:kvsz>|NoSuf, { RegMask, <dq:elem>|Unspecified|BaseIndex }
+kmov<dq>, 0xf292, AVX512BW, D|Modrm|Vex128|Space0F|<dq:vexw64>|<dq:kvsz>|NoSuf, { <dq:gpr>, RegMask }
+knot<dq>, 0x<dq:kpfx>44, AVX512BW, Modrm|Vex128|Space0F|VexW1|<dq:kvsz>|NoSuf, { RegMask, RegMask }
+kor<dq>, 0x<dq:kpfx>45, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|<dq:kvsz>|NoSuf, { RegMask, RegMask, RegMask }
+kortest<dq>, 0x<dq:kpfx>98, AVX512BW, Modrm|Vex128|Space0F|VexW1|<dq:kvsz>|NoSuf, { RegMask, RegMask }
+ktest<dq>, 0x<dq:kpfx>99, AVX512BW, Modrm|Vex128|Space0F|VexW1|<dq:kvsz>|NoSuf, { RegMask, RegMask }
+kxnor<dq>, 0x<dq:kpfx>46, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|<dq:kvsz>|NoSuf, { RegMask, RegMask, RegMask }
+kxor<dq>, 0x<dq:kpfx>47, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|<dq:kvsz>|NoSuf|Optimize, { RegMask, RegMask, RegMask }
+kunpckdq, 0x4B, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|Vsz512|NoSuf, { RegMask, RegMask, RegMask }
+kunpckwd, 0x4B, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW0|Vsz256|NoSuf, { RegMask, RegMask, RegMask }
+kshiftl<dq>, 0x6633, AVX512BW, Modrm|Vex128|Space0F3A|<dq:vexw>|<dq:kvsz>|NoSuf, { Imm8, RegMask, RegMask }
+kshiftr<dq>, 0x6631, AVX512BW, Modrm|Vex128|Space0F3A|<dq:vexw>|<dq:kvsz>|NoSuf, { Imm8, RegMask, RegMask }
 
 vdbpsadbw, 0x6642, AVX512BW, Modrm|Masking|Space0F3A|VexVVVV|VexW0|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 


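A minimal sketch of how the new Vsz attribute is meant to be read, assuming
the assembler compares a template's required minimum against the vector
size enabled via .arch. The helper name vsz_template_usable() is
hypothetical and not part of this patch; only the VSZ* values are taken
from the i386-opc.h hunk above.

```c
#include <stdbool.h>

/* Same values as the VSZ* defines the patch adds to opcodes/i386-opc.h.  */
enum vsz { VSZ128 = 0, VSZ256 = 1, VSZ512 = 2 };

/* Hypothetical helper, not part of the patch: a template is usable only
   if the minimum vector size it requires does not exceed the vector size
   currently enabled.  */
static bool
vsz_template_usable (enum vsz required, enum vsz enabled)
{
  return required <= enabled;
}
```

Under this reading, a Vsz256 template such as kmovd is not usable with
.arch .avx10.1/128, which is consistent with kmovd having no encoding in the
/128 part of the avx10-vsz listing, while a Vsz512 template such as kmovq
needs the full 512-bit vector size.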
^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 3/5] x86: support AVX10.1/512
  2023-08-25 12:46 ` [PATCH 3/5] x86: support AVX10.1/512 Jan Beulich
@ 2023-08-28  2:34   ` Jiang, Haochen
  2023-08-28  6:45     ` Jan Beulich
  0 siblings, 1 reply; 27+ messages in thread
From: Jiang, Haochen @ 2023-08-28  2:34 UTC (permalink / raw)
  To: Beulich, Jan, Binutils; +Cc: H.J. Lu

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Friday, August 25, 2023 8:47 PM
> To: Binutils <binutils@sourceware.org>
> Cc: H.J. Lu <hjl.tools@gmail.com>; Jiang, Haochen <haochen.jiang@intel.com>
> Subject: [PATCH 3/5] x86: support AVX10.1/512
> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -1762,22 +1762,28 @@ vpgatherq<dq>, 0x6691, AVX2, Modrm|Vex12
>  vpgatherqd, 0x6691, AVX2,
> Modrm|Vex256|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf|VecSIB256,
> { RegXMM, Dword|Unspecified|BaseIndex, RegXMM }
>  vpgatherqq, 0x6691, AVX2,
> Modrm|Vex256|Space0F38|VexVVVV|VexW1|SwapSources|NoSuf|VecSIB256,
> { RegYMM, Qword|Unspecified|BaseIndex, RegYMM }
> 
> +// AVX10 is somewhat special wrt VAES and VPCLMULQDQ: The latter two imply
> +// 256-bit VEX encodings, but surely the 128-bit forms are also supported.
> +// Short of making AES a prereq of VAES (and PCLMULQDQ for VPCLMULQDQ) in
> +// i386-gen, add otherwise redundant secondary templates here.
> +<avx10:cpu, $avx:AVX|, $avx10:AVX512F|V>
> +

Should it be AVX512VL instead of AVX512F here? I am not sure whether anything
elsewhere in the assembler checks for AVX512VL when an xmm register is used
with EVEX encoding.

Thx,
Haochen


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/5] x86: support AVX10.1/512
  2023-08-28  2:34   ` Jiang, Haochen
@ 2023-08-28  6:45     ` Jan Beulich
  2023-08-28  6:59       ` Jiang, Haochen
  0 siblings, 1 reply; 27+ messages in thread
From: Jan Beulich @ 2023-08-28  6:45 UTC (permalink / raw)
  To: Jiang, Haochen; +Cc: H.J. Lu, Binutils

On 28.08.2023 04:34, Jiang, Haochen wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: Friday, August 25, 2023 8:47 PM
>> To: Binutils <binutils@sourceware.org>
>> Cc: H.J. Lu <hjl.tools@gmail.com>; Jiang, Haochen <haochen.jiang@intel.com>
>> Subject: [PATCH 3/5] x86: support AVX10.1/512
>> --- a/opcodes/i386-opc.tbl
>> +++ b/opcodes/i386-opc.tbl
>> @@ -1762,22 +1762,28 @@ vpgatherq<dq>, 0x6691, AVX2, Modrm|Vex12
>>  vpgatherqd, 0x6691, AVX2,
>> Modrm|Vex256|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf|VecSIB256,
>> { RegXMM, Dword|Unspecified|BaseIndex, RegXMM }
>>  vpgatherqq, 0x6691, AVX2,
>> Modrm|Vex256|Space0F38|VexVVVV|VexW1|SwapSources|NoSuf|VecSIB256,
>> { RegYMM, Qword|Unspecified|BaseIndex, RegYMM }
>>
>> +// AVX10 is somewhat special wrt VAES and VPCLMULQDQ: The latter two imply
>> +// 256-bit VEX encodings, but surely the 128-bit forms are also supported.
>> +// Short of making AES a prereq of VAES (and PCLMULQDQ for VPCLMULQDQ) in
>> +// i386-gen, add otherwise redundant secondary templates here.
>> +<avx10:cpu, $avx:AVX|, $avx10:AVX512F|V>
>> +
> 
> Should it be AVX512VL instead of AVX512F here? I am not sure whether anything
> elsewhere in the assembler checks for AVX512VL when an xmm register is used
> with EVEX encoding.

Oh, yes, I think you're right. These aren't templates allowing for multiple
operand sizes. Thanks for spotting.

But of course the question remains on whether this is needed in the first
place. Personally I'd favor making AES a prereq of VAES (and PCLMULQDQ one
of VPCLMULQDQ); if we decided to go that route, I'd make that change a
separate prereq patch. Yet before that a clarification in the AVX10 spec
(or even in the SDM itself) would of course be quite helpful, albeit of
course we have some leeway here, as there's some level of discretion that
we have as to what exact effects -march= / .arch have.
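
To illustrate what making AES a prereq of VAES (and PCLMULQDQ one of
VPCLMULQDQ) would amount to, here is a minimal sketch, assuming a simple
bitmask model of ISA features. The FEAT_* names, the deps[] table and
enable_feature() are all hypothetical; this is not how i386-gen.c is
actually written.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only.  */
enum
{
  FEAT_AES        = 1u << 0,
  FEAT_PCLMULQDQ  = 1u << 1,
  FEAT_VAES       = 1u << 2,
  FEAT_VPCLMULQDQ = 1u << 3
};

/* Each entry says: enabling FEATURE also enables PREREQS.  */
struct feat_dep
{
  uint32_t feature;
  uint32_t prereqs;
};

static const struct feat_dep deps[] =
{
  { FEAT_VAES,       FEAT_AES },
  { FEAT_VPCLMULQDQ, FEAT_PCLMULQDQ },
};

/* Enable FEATURE on top of ENABLED, pulling in prerequisites until
   nothing changes anymore.  */
static uint32_t
enable_feature (uint32_t enabled, uint32_t feature)
{
  enabled |= feature;
  for (;;)
    {
      uint32_t before = enabled;

      for (size_t i = 0; i < sizeof (deps) / sizeof (deps[0]); i++)
	if (enabled & deps[i].feature)
	  enabled |= deps[i].prereqs;

      if (enabled == before)
	return enabled;
    }
}
```

With such a table, enabling VAES alone would also make the 128-bit AES forms
available, so the redundant secondary templates would not be needed.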

Jan

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 3/5] x86: support AVX10.1/512
  2023-08-28  6:45     ` Jan Beulich
@ 2023-08-28  6:59       ` Jiang, Haochen
  2023-08-28  7:09         ` Jan Beulich
  0 siblings, 1 reply; 27+ messages in thread
From: Jiang, Haochen @ 2023-08-28  6:59 UTC (permalink / raw)
  To: Beulich, Jan; +Cc: H.J. Lu, Binutils

> But of course the question remains on whether this is needed in the first
> place. Personally I'd favor making AES a prereq of VAES (and PCLMULQDQ one
> of VPCLMULQDQ); if we decided to go that route, I'd make that change a

For PCLMULQDQ and VPCLMULQDQ, the implication should be OK. Actually, I
added that in GCC in April, but I forgot to check Binutils at that time:

https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=4246611d1915f1664c01f286dbeb946dd06e2a4d

For AES/VAES, I was conservative at that time because not all of the AES insns
are included in VAES:

https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=24a8acc1662c37003a7b54814bf840019fec2190

But I suppose the reason why AESIMC/AESKEYGENASSIST are not included might
be that one key is 128 bits long, so generating/transforming two at the same
time and putting them into a ymm register is not really needed.

I am OK either way; implying it would keep things simple.

Thx,
Haochen

> separate prereq patch. Yet before that a clarification in the AVX10 spec
> (or even in the SDM itself) would of course be quite helpful, albeit of
> course we have some leeway here, as there's some level of discretion that
> we have as to what exact effects -march= / .arch have.
> 
> Jan

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/5] x86: support AVX10.1/512
  2023-08-28  6:59       ` Jiang, Haochen
@ 2023-08-28  7:09         ` Jan Beulich
  2023-08-29 16:18           ` H.J. Lu
  0 siblings, 1 reply; 27+ messages in thread
From: Jan Beulich @ 2023-08-28  7:09 UTC (permalink / raw)
  To: Jiang, Haochen; +Cc: H.J. Lu, Binutils

On 28.08.2023 08:59, Jiang, Haochen wrote:
>> But of course the question remains on whether this is needed in the first
>> place. Personally I'd favor making AES a prereq of VAES (and PCLMULQDQ one
>> of VPCLMULQDQ); if we decided to go that route, I'd make that change a
> 
> For PCLMULQDQ and VPCLMULQDQ, the imply should be ok. Actually, I have
> added that in GCC in April, but I forgot to check Binutils at that time:
> 
> https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=4246611d1915f1664c01f286dbeb946dd06e2a4d
> 
> For AES/VAES, I am conservative on that at that time because not all the insts are
> included in VAES:
> 
> https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=24a8acc1662c37003a7b54814bf840019fec2190
> 
> But I suppose the reason why AESIMC/AESKEYGENASSIST are not included might
> be one key length is 128 bit while generating/transforming two at the same time
> and putting them into ymm register is not quite needed.

Plus VAES is going to be of less use without those two insns, so assuming
AES to also be there if VAES is ought to be sufficiently "natural". Yet
as said, it would be quite nice if the SDM could make such "connections"
explicit.

> I am both ok on whether to imply or not since it would be simple if implied.

Okay, I'll tentatively add a new prereq patch then, but before posting v2
I'd like to give H.J. the opportunity to voice an opinion.

Jan

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/5] x86: support AVX10.1/512
  2023-08-28  7:09         ` Jan Beulich
@ 2023-08-29 16:18           ` H.J. Lu
  2023-08-30  1:10             ` Jiang, Haochen
  0 siblings, 1 reply; 27+ messages in thread
From: H.J. Lu @ 2023-08-29 16:18 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Jiang, Haochen, Binutils

On Mon, Aug 28, 2023 at 12:09 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 28.08.2023 08:59, Jiang, Haochen wrote:
> >> But of course the question remains on whether this is needed in the first
> >> place. Personally I'd favor making AES a prereq of VAES (and PCLMULQDQ one
> >> of VPCLMULQDQ); if we decided to go that route, I'd make that change a
> >
> > For PCLMULQDQ and VPCLMULQDQ, the imply should be ok. Actually, I have
> > added that in GCC in April, but I forgot to check Binutils at that time:
> >
> > https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=4246611d1915f1664c01f286dbeb946dd06e2a4d
> >
> > For AES/VAES, I am conservative on that at that time because not all the insts are
> > included in VAES:
> >
> > https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=24a8acc1662c37003a7b54814bf840019fec2190
> >
> > But I suppose the reason why AESIMC/AESKEYGENASSIST are not included might
> > be one key length is 128 bit while generating/transforming two at the same time
> > and putting them into ymm register is not quite needed.
>
> Plus VAES is going to be of less use without those two insns, so assuming
> AES to also be there if VAES is ought to be sufficiently "natural". Yet
> as said, it would be quite nice if the SDM could make such "connections"
> explicit.
>
> > I am both ok on whether to imply or not since it would be simple if implied.
>
> Okay, I'll tentatively add a new prereq patch then, but before posting v2
> I'd like to give H.J. the opportunity to voice an opinion.
>

I don't think we need to make AVX10 imply AES and VPCLMULQDQ
in the assembler.  The AES and PCLMULQDQ changes aren't needed.


-- 
H.J.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 5/5] x86: support AVX10.1 vector size restrictions
  2023-08-25 12:47 ` [PATCH 5/5] x86: support AVX10.1 vector size restrictions Jan Beulich
@ 2023-08-29 16:26   ` H.J. Lu
  2023-08-30  7:57     ` Jan Beulich
  0 siblings, 1 reply; 27+ messages in thread
From: H.J. Lu @ 2023-08-29 16:26 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Binutils, Jiang, Haochen

On Fri, Aug 25, 2023 at 5:48 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> Recognize "/<number>" suffixes on both -march=+avx10.1 and the
> corresponding .arch directive, setting an upper bound on the vector size
> that insns may use. Such a restriction can be reset by setting a new base
> architecture, by using a suffix-less form, by disabling AVX10, or by
> enabling any other VEX/EVEX-based vector extension.
>
> While for most insns we can suppress their use with too wide operands
> via registers becoming unavailable (or in Intel syntax memory operand
> size specifiers not being recognized), mask register insns have to have
> their minimum required vector size specified in a new attribute. (Of
> course this new attribute could also be used on other insns.)
>
> Note that .insn continues to be permitted to emit EVEX{512,256} (and
> VEX256 ones) encodings regardless of vector size restrictions in place.
> Of course these can't be expressed using zmm (or ymm) operands then,
> but need using the EVEX.512.* forms (broadcast forms may be usable right
> now, but this may go away so shouldn't be relied upon). This is why no
> assertions should be added to build_{e,}vex_prefix().
> ---
> It is unclear whether Vsz is a good name for the new attribute: The spec
> leaves open how 256-bit embedded rounding is going to be expressed. Yet
> that may require some similar attribute ...
>
> --- a/gas/NEWS
> +++ b/gas/NEWS
> @@ -1,5 +1,7 @@
>  -*- text -*-
>
> +* Add support for Intel AVX10.1.
> +
>  * Add support for Intel PBNDKB instructions.
>
>  * Add support for Intel SM4 instructions.
> --- a/gas/doc/c-i386.texi
> +++ b/gas/doc/c-i386.texi
> @@ -213,6 +213,9 @@ accept various extension mnemonics.  For
>  @code{sm4},
>  @code{pbndkb},
>  @code{avx10.1},
> +@code{avx10.1/512},
> +@code{avx10.1/256},
> +@code{avx10.1/128},
>  @code{amx_int8},
>  @code{amx_bf16},
>  @code{amx_fp16},
> @@ -267,7 +270,11 @@ accept various extension mnemonics.  For
>  @code{svme} and
>  @code{padlock}.
>  Note that these extension mnemonics can be prefixed with @code{no} to revoke
> -the respective (and any dependent) functionality.
> +the respective (and any dependent) functionality.  Note further that the
> +suffixes permitted on @code{-march=avx10.<N>} enforce a vector length
> +restriction, i.e. despite these otherwise being "enabling" options, using
> +these suffixes will disable all insns with wider vector or mask register
> +operands.
>
>  When the @code{.arch} directive is used with @option{-march}, the
>  @code{.arch} directive will take precedent.
> @@ -1673,6 +1680,12 @@ an unconditional jump to the target.
>
>  Note that the sub-architecture specifiers (starting with a dot) can be prefixed
>  with @code{no} to revoke the respective (and any dependent) functionality.
> +Note further that @samp{.avx10.<N>} can be suffixed with a vector length
> +restriction (@samp{/256} or @samp{/128}, with @samp{/512} simply restoring the
> +default).  Despite these otherwise being "enabling" specifiers, using these
> +suffixes will disable all insns with wider vector or mask register operands.
> +On SVR4-derived platforms, the separator character @samp{/} can be replaced by
> +@samp{:}.
>
>  Following the CPU architecture (but not a sub-architecture, which are those
>  starting with a dot), you may specify @samp{jumps} or @samp{nojumps} to

Although the CPUID bits in the AVX10 spec may leave the impression that 128-bit,
256-bit and 512-bit vectors may be enabled independently, it also says:

A “converged” version of Intel AVX10 with maximum vector lengths of 256
bits and 32-bit opmask registers will be supported across all Intel processors,
while 512-bit vector registers and 64-bit opmasks will continue to be supported
on some P-core processors.

Adding avx10.1/128 isn't necessary.

-- 
H.J.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 3/5] x86: support AVX10.1/512
  2023-08-29 16:18           ` H.J. Lu
@ 2023-08-30  1:10             ` Jiang, Haochen
  2023-08-30  7:47               ` Jan Beulich
  0 siblings, 1 reply; 27+ messages in thread
From: Jiang, Haochen @ 2023-08-30  1:10 UTC (permalink / raw)
  To: H.J. Lu, Beulich, Jan; +Cc: Binutils

> I don't think we need to make AVX10 to imply AES and VPCLMULQDQ in
> assembler.  The AES and PCLMULQDQ changes aren't needed.
> 

AVX10 should not imply them. However, I suppose we are talking about
the relation between AES/VAES and PCLMULQDQ/VPCLMULQDQ.

Thx,
Haochen

> 
> --
> H.J.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/5] x86: support AVX10.1/512
  2023-08-30  1:10             ` Jiang, Haochen
@ 2023-08-30  7:47               ` Jan Beulich
  2023-08-30 15:28                 ` H.J. Lu
  0 siblings, 1 reply; 27+ messages in thread
From: Jan Beulich @ 2023-08-30  7:47 UTC (permalink / raw)
  To: Jiang, Haochen, H.J. Lu; +Cc: Binutils

On 30.08.2023 03:10, Jiang, Haochen wrote:
>> I don't think we need to make AVX10 to imply AES and VPCLMULQDQ in
>> assembler.  The AES and PCLMULQDQ changes aren't needed.
>>
> 
> AVX10 should not imply them. However, I suppose we are talking about
> the relation between AES/VAES and PCLMULQDQ/VPCLMULQDQ.

Correct. Some change needs to be done; the question is whether as
posted, or whether with a prereq patch adding those two connections
(with, as said before, me intending to go the latter route unless I
hear otherwise).

Jan

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 5/5] x86: support AVX10.1 vector size restrictions
  2023-08-29 16:26   ` H.J. Lu
@ 2023-08-30  7:57     ` Jan Beulich
  2023-08-30 15:25       ` H.J. Lu
  0 siblings, 1 reply; 27+ messages in thread
From: Jan Beulich @ 2023-08-30  7:57 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Binutils, Jiang, Haochen

On 29.08.2023 18:26, H.J. Lu wrote:
> On Fri, Aug 25, 2023 at 5:48 AM Jan Beulich <jbeulich@suse.com> wrote:
>> @@ -1673,6 +1680,12 @@ an unconditional jump to the target.
>>
>>  Note that the sub-architecture specifiers (starting with a dot) can be prefixed
>>  with @code{no} to revoke the respective (and any dependent) functionality.
>> +Note further that @samp{.avx10.<N>} can be suffixed with a vector length
>> +restriction (@samp{/256} or @samp{/128}, with @samp{/512} simply restoring the
>> +default).  Despite these otherwise being "enabling" specifiers, using these
>> +suffixes will disable all insns with wider vector or mask register operands.
>> +On SVR4-derived platforms, the separator character @samp{/} can be replaced by
>> +@samp{:}.
>>
>>  Following the CPU architecture (but not a sub-architecture, which are those
>>  starting with a dot), you may specify @samp{jumps} or @samp{nojumps} to
> 
> Although CPUID bits in AVX10 spec may leave an impression that 128-bit,
> 256-bit and 512-bit vectors may be enabled independently.  But it also says
> 
> A “converged” version of Intel AVX10 with maximum vector lengths of 256
> bits and 32-bit opmask registers will be supported across all Intel processors,
> while 512-bit vector registers and 64-bit opmasks will continue to be supported
> on some P-core processors.
> 
> Adding avx10.1/128 isn't necessary.

I agree it isn't necessary, but as expressed before I view it as desirable.
Apart from the sentence you quoted the spec later also says "There are
currently no plans to support an Intel AVX10/128 implementation." For my
choice of also supporting the 128-bit restriction I'd like to put emphasis
on "currently". I think I said before that emulation environments (qemu,
sde to name just two well-known examples) are free to implement such
further restricted ISAs without then becoming out-of-spec.

Plus supporting this mode right away has made me make certain adjustments
in what I'd call more clean a way, which I view as desirable as well.
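
To illustrate the suffix syntax described in the documentation hunk quoted
above, here is a minimal sketch in Python rather than the assembler's own C.
The function name, the extra_sep parameter and the tuple return value are
made up for illustration and are not what the patch itself does; only the
accepted suffixes and the ':' alternative are taken from the quoted text.

```python
# Illustrative only: names and structure are hypothetical, not from the patch.
VALID_LIMITS = (128, 256, 512)  # per the quoted doc, /512 restores the default


def parse_avx10_spec(spec, extra_sep=None):
    """Split e.g. 'avx10.1/256' into (extension name, vector size limit)."""
    # '/' is always accepted; some platforms may accept an alternative
    # separator (the quoted doc mentions ':' on SVR4-derived ones).
    separators = ["/"] + ([extra_sep] if extra_sep else [])
    for sep in separators:
        name, found, suffix = spec.partition(sep)
        if found:
            if not suffix.isdigit() or int(suffix) not in VALID_LIMITS:
                raise ValueError(f"bad vector size suffix in {spec!r}")
            return name, int(suffix)
    # The suffix-less form means no restriction, i.e. the 512-bit default.
    return spec, 512
```

For example, parse_avx10_spec("avx10.1/256") yields ("avx10.1", 256), while
the suffix-less "avx10.1" yields ("avx10.1", 512).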

Jan

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 5/5] x86: support AVX10.1 vector size restrictions
  2023-08-30  7:57     ` Jan Beulich
@ 2023-08-30 15:25       ` H.J. Lu
  2023-08-30 16:16         ` Jan Beulich
  0 siblings, 1 reply; 27+ messages in thread
From: H.J. Lu @ 2023-08-30 15:25 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Binutils, Jiang, Haochen

On Wed, Aug 30, 2023 at 12:57 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 29.08.2023 18:26, H.J. Lu wrote:
> > On Fri, Aug 25, 2023 at 5:48 AM Jan Beulich <jbeulich@suse.com> wrote:
> >> @@ -1673,6 +1680,12 @@ an unconditional jump to the target.
> >>
> >>  Note that the sub-architecture specifiers (starting with a dot) can be prefixed
> >>  with @code{no} to revoke the respective (and any dependent) functionality.
> >> +Note further that @samp{.avx10.<N>} can be suffixed with a vector length
> >> +restriction (@samp{/256} or @samp{/128}, with @samp{/512} simply restoring the
> >> +default).  Despite these otherwise being "enabling" specifiers, using these
> >> +suffixes will disable all insns with wider vector or mask register operands.
> >> +On SVR4-derived platforms, the separator character @samp{/} can be replaced by
> >> +@samp{:}.
> >>
> >>  Following the CPU architecture (but not a sub-architecture, which are those
> >>  starting with a dot), you may specify @samp{jumps} or @samp{nojumps} to
> >
> > Although CPUID bits in AVX10 spec may leave an impression that 128-bit,
> > 256-bit and 512-bit vectors may be enabled independently.  But it also says
> >
> > A “converged” version of Intel AVX10 with maximum vector lengths of 256
> > bits and 32-bit opmask registers will be supported across all Intel processors,
> > while 512-bit vector registers and 64-bit opmasks will continue to be supported
> > on some P-core processors.
> >
> > Adding avx10.1/128 isn't necessary.
>
> I agree it isn't necessary, but as expressed before I view it as desirable.
> Apart from the sentence you quoted the spec later also says "There are
> currently no plans to support an Intel AVX10/128 implementation." For my
> choice of also supporting the 128-bit restriction I'd like to put emphasis
> on "currently". I think I said before that emulation environments (qemu,
> sde to name just two well-known examples) are free to implement such
> further restricted ISAs without then becoming out-of-spec.
>
> Plus supporting this mode right away has made me make certain adjustments
> in what I'd call more clean a way, which I view as desirable as well.
>

Since the AVX10 spec doesn't specify whether mask registers should be limited
to 16 bits for AVX10/128, doing it in the assembler is premature.


-- 
H.J.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/5] x86: support AVX10.1/512
  2023-08-30  7:47               ` Jan Beulich
@ 2023-08-30 15:28                 ` H.J. Lu
  2023-09-01  8:41                   ` Jan Beulich
  0 siblings, 1 reply; 27+ messages in thread
From: H.J. Lu @ 2023-08-30 15:28 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Jiang, Haochen, Binutils

On Wed, Aug 30, 2023 at 12:48 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 30.08.2023 03:10, Jiang, Haochen wrote:
> >> I don't think we need to make AVX10 to imply AES and VPCLMULQDQ in
> >> assembler.  The AES and PCLMULQDQ changes aren't needed.
> >>
> >
> > AVX10 should not imply them. However, I suppose we are talking about
> > the relation between AES/VAES and PCLMULQDQ/VPCLMULQDQ.
>
> Correct. Some change needs to be done; the question is whether as
> posted, or whether with a prereq patch adding those two connections
> (with, as said before, me intending to go the latter route unless I
> hear otherwise).
>

I agree.  VAES should imply AES and VPCLMULQDQ should imply
PCLMULQDQ.
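
A rough sketch of what such "implies" relations mean for enabling and
revoking features, written in Python for brevity.  The table and function
names are hypothetical and contain only the two connections discussed here;
this is not the assembler's actual data structure.

```python
# Hypothetical prerequisite table: only the two relations from this thread.
PREREQS = {
    "vaes": {"aes"},
    "vpclmulqdq": {"pclmulqdq"},
}


def enable(enabled, feature):
    """Return the enabled set plus feature and all transitive prereqs."""
    result = set(enabled)
    todo = [feature]
    while todo:
        f = todo.pop()
        if f not in result:
            result.add(f)
            todo.extend(PREREQS.get(f, ()))
    return result


def disable(enabled, feature):
    """Remove feature and everything that transitively depends on it."""
    result = set(enabled)
    todo = [feature]
    while todo:
        f = todo.pop()
        result.discard(f)
        # Anything still enabled that lists f as a prereq has to go as well.
        todo.extend(d for d, reqs in PREREQS.items()
                    if f in reqs and d in result)
    return result
```

With this table, enabling "vaes" also enables "aes", and revoking "aes"
also revokes "vaes", matching the documented "no" prefix semantics of
revoking any dependent functionality.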

-- 
H.J.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 5/5] x86: support AVX10.1 vector size restrictions
  2023-08-30 15:25       ` H.J. Lu
@ 2023-08-30 16:16         ` Jan Beulich
  2023-08-30 18:00           ` H.J. Lu
  0 siblings, 1 reply; 27+ messages in thread
From: Jan Beulich @ 2023-08-30 16:16 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Binutils, Jiang, Haochen

On 30.08.2023 17:25, H.J. Lu wrote:
> On Wed, Aug 30, 2023 at 12:57 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 29.08.2023 18:26, H.J. Lu wrote:
>>> On Fri, Aug 25, 2023 at 5:48 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>> @@ -1673,6 +1680,12 @@ an unconditional jump to the target.
>>>>
>>>>  Note that the sub-architecture specifiers (starting with a dot) can be prefixed
>>>>  with @code{no} to revoke the respective (and any dependent) functionality.
>>>> +Note further that @samp{.avx10.<N>} can be suffixed with a vector length
>>>> +restriction (@samp{/256} or @samp{/128}, with @samp{/512} simply restoring the
>>>> +default).  Despite these otherwise being "enabling" specifiers, using these
>>>> +suffixes will disable all insns with wider vector or mask register operands.
>>>> +On SVR4-derived platforms, the separator character @samp{/} can be replaced by
>>>> +@samp{:}.
>>>>
>>>>  Following the CPU architecture (but not a sub-architecture, which are those
>>>>  starting with a dot), you may specify @samp{jumps} or @samp{nojumps} to
>>>
>>> Although CPUID bits in AVX10 spec may leave an impression that 128-bit,
>>> 256-bit and 512-bit vectors may be enabled independently.  But it also says
>>>
>>> A “converged” version of Intel AVX10 with maximum vector lengths of 256
>>> bits and 32-bit opmask registers will be supported across all Intel processors,
>>> while 512-bit vector registers and 64-bit opmasks will continue to be supported
>>> on some P-core processors.
>>>
>>> Adding avx10.1/128 isn't necessary.
>>
>> I agree it isn't necessary, but as expressed before I view it as desirable.
>> Apart from the sentence you quoted the spec later also says "There are
>> currently no plans to support an Intel AVX10/128 implementation." For my
>> choice of also supporting the 128-bit restriction I'd like to put emphasis
>> on "currently". I think I said before that emulation environments (qemu,
>> sde to name just two well-known examples) are free to implement such
>> further restricted ISAs without then becoming out-of-spec.
>>
>> Plus supporting this mode right away has made me make certain adjustments
>> in what I'd call more clean a way, which I view as desirable as well.
> 
> Since AVX10 spec doesn't specify if mask registers should be limited to
> 16 bits for AVX10/128, doing it in assembler is premature.

It's hard to see why they would remain wider. All the more so since they
were only 16 bits wide in AVX512F.

Plus of course nobody needs to use the options to enforce the 128-bit
limit. The way I've coded it, it matches what the specification says.

Jan

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 5/5] x86: support AVX10.1 vector size restrictions
  2023-08-30 16:16         ` Jan Beulich
@ 2023-08-30 18:00           ` H.J. Lu
  2023-08-31  5:56             ` Jiang, Haochen
  0 siblings, 1 reply; 27+ messages in thread
From: H.J. Lu @ 2023-08-30 18:00 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Binutils, Jiang, Haochen

On Wed, Aug 30, 2023 at 9:16 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 30.08.2023 17:25, H.J. Lu wrote:
> > On Wed, Aug 30, 2023 at 12:57 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 29.08.2023 18:26, H.J. Lu wrote:
> >>> On Fri, Aug 25, 2023 at 5:48 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>> @@ -1673,6 +1680,12 @@ an unconditional jump to the target.
> >>>>
> >>>>  Note that the sub-architecture specifiers (starting with a dot) can be prefixed
> >>>>  with @code{no} to revoke the respective (and any dependent) functionality.
> >>>> +Note further that @samp{.avx10.<N>} can be suffixed with a vector length
> >>>> +restriction (@samp{/256} or @samp{/128}, with @samp{/512} simply restoring the
> >>>> +default).  Despite these otherwise being "enabling" specifiers, using these
> >>>> +suffixes will disable all insns with wider vector or mask register operands.
> >>>> +On SVR4-derived platforms, the separator character @samp{/} can be replaced by
> >>>> +@samp{:}.
> >>>>
> >>>>  Following the CPU architecture (but not a sub-architecture, which are those
> >>>>  starting with a dot), you may specify @samp{jumps} or @samp{nojumps} to
> >>>
> >>> Although CPUID bits in AVX10 spec may leave an impression that 128-bit,
> >>> 256-bit and 512-bit vectors may be enabled independently.  But it also says
> >>>
> >>> A “converged” version of Intel AVX10 with maximum vector lengths of 256
> >>> bits and 32-bit opmask registers will be supported across all Intel processors,
> >>> while 512-bit vector registers and 64-bit opmasks will continue to be supported
> >>> on some P-core processors.
> >>>
> >>> Adding avx10.1/128 isn't necessary.
> >>
> >> I agree it isn't necessary, but as expressed before I view it as desirable.
> >> Apart from the sentence you quoted the spec later also says "There are
> >> currently no plans to support an Intel AVX10/128 implementation." For my
> >> choice of also supporting the 128-bit restriction I'd like to put emphasis
> >> on "currently". I think I said before that emulation environments (qemu,
> >> sde to name just two well-known examples) are free to implement such
> >> further restricted ISAs without then becoming out-of-spec.
> >>
> >> Plus supporting this mode right away has made me make certain adjustments
> >> in what I'd call more clean a way, which I view as desirable as well.
> >
> > Since AVX10 spec doesn't specify if mask registers should be limited to
> > 16 bits for AVX10/128, doing it in assembler is premature.
>
> It's hard to see why they would remain wider. The more that they were 16
> bits only in AVX512F.
>
> Plus of course nobody needs to use the options to enforce the 128-bit
> limit. The way I've coded it, it matches what the specification says.
>

The AVX10 spec only says:

Quadword opmask instructions will only be supported on processors
supporting vector lengths of 512 bits.

It doesn't say anything about 32-bit masks.  A 32-bit mask can be useful
even with 16-byte vectors.

-- 
H.J.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 5/5] x86: support AVX10.1 vector size restrictions
  2023-08-30 18:00           ` H.J. Lu
@ 2023-08-31  5:56             ` Jiang, Haochen
  2023-08-31  7:18               ` Jan Beulich
  0 siblings, 1 reply; 27+ messages in thread
From: Jiang, Haochen @ 2023-08-31  5:56 UTC (permalink / raw)
  To: H.J. Lu, Beulich, Jan; +Cc: Binutils

> > >> I agree it isn't necessary, but as expressed before I view it as desirable.
> > >> Apart from the sentence you quoted the spec later also says "There are
> > >> currently no plans to support an Intel AVX10/128 implementation." For my
> > >> choice of also supporting the 128-bit restriction I'd like to put emphasis
> > >> on "currently". I think I said before that emulation environments (qemu,
> > >> sde to name just two well-known examples) are free to implement such
> > >> further restricted ISAs without then becoming out-of-spec.
> > >>
> > >> Plus supporting this mode right away has made me make certain adjustments
> > >> in what I'd call more clean a way, which I view as desirable as well.
> > >
> > > Since AVX10 spec doesn't specify if mask registers should be limited to
> > > 16 bits for AVX10/128, doing it in assembler is premature.
> >
> > It's hard to see why they would remain wider. The more that they were 16
> > bits only in AVX512F.
> >
> > Plus of course nobody needs to use the options to enforce the 128-bit
> > limit. The way I've coded it, it matches what the specification says.
> >
> 
> AVX10 spec only has
> 
> Quadword opmask instructions will only be supported on processors
> supporting vector lengths of 512 bits.
> 
> It doesn't say anything about 32-bit mask.   32-bit mask can be useful
> even with 16 byte vector.
> 
> --
> H.J.

The concern from my side is that, in an extreme case of register pressure,
the compiler might need to spill a 32-bit register to a 32-bit mask
register.

Another minor concern is that if there eventually is an AVX10/128 (although
I do not see that happening) and we made the wrong choice here, it will take
some more time to correct the assembler on the user side, by which
I mean on real OSes.

However, I suppose I am OK either way on whether to allow 32-bit masks, since
AVX10/128 is nowhere near and for now it is toy code to play with.
We could be somewhat conservative at first by allowing only 16-bit
mask registers. Also, the code change is quite easy, so there is not much
worry about changing that later.

Thx,
Haochen

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 5/5] x86: support AVX10.1 vector size restrictions
  2023-08-31  5:56             ` Jiang, Haochen
@ 2023-08-31  7:18               ` Jan Beulich
  2023-09-01  6:21                 ` Jiang, Haochen
  0 siblings, 1 reply; 27+ messages in thread
From: Jan Beulich @ 2023-08-31  7:18 UTC (permalink / raw)
  To: Jiang, Haochen, H.J. Lu; +Cc: Binutils

On 31.08.2023 07:56, Jiang, Haochen wrote:
>>>>> I agree it isn't necessary, but as expressed before I view it as desirable.
>>>>> Apart from the sentence you quoted the spec later also says "There are
>>>>> currently no plans to support an Intel AVX10/128 implementation." For my
>>>>> choice of also supporting the 128-bit restriction I'd like to put emphasis
>>>>> on "currently". I think I said before that emulation environments (qemu,
>>>>> sde to name just two well-known examples) are free to implement such
>>>>> further restricted ISAs without then becoming out-of-spec.
>>>>>
>>>>> Plus supporting this mode right away has made me make certain adjustments
>>>>> in what I'd call more clean a way, which I view as desirable as well.
>>>>
>>>> Since AVX10 spec doesn't specify if mask registers should be limited to
>>>> 16 bits for AVX10/128, doing it in assembler is premature.
>>>
>>> It's hard to see why they would remain wider. The more that they were 16
>>> bits only in AVX512F.
>>>
>>> Plus of course nobody needs to use the options to enforce the 128-bit
>>> limit. The way I've coded it, it matches what the specification says.
>>>
>>
>> AVX10 spec only has
>>
>> Quadword opmask instructions will only be supported on processors
>> supporting vector lengths of 512 bits.
>>
>> It doesn't say anything about 32-bit mask.   32-bit mask can be useful
>> even with 16 byte vector.

How's that any different for 64-bit mask with 32-byte vector?

> The concern from my side is if there is an extreme case that overloads
> registers, we might need to spill 32-bit register to 32-bit mask register
> in the compiler.

How's that any different for spilling of 64-bit registers?

> Another minor concern is if there is finally a AVX10/128, although I do
> not see that could happen, if we get a wrong choice here, it will take
> some more time to correct the final assembler on the user side, which
> I mean on the real OS.
> 
> However, I suppose both ok for me whether to allow 32-bit mask since
> AVX10/128 is nowhere near in the future and it is a toy code to play with.
> We could be some kind of conservative at first by just allowing 16-bit
> mask register. Also, the code change is quite easy and no much worry on
> changing that.

Exactly: I'd rather be overly restrictive initially (which people can
easily work around by using .arch suitably around individual insns)
rather than being too permissive and then failing to flag mistakes.
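
To make the "overly restrictive" choice concrete, a minimal sketch in Python
with made-up names.  The 512 and 256 rows follow the spec wording quoted
earlier in the thread; the 128 row is the conservative assumption being
debated here and is not something the spec states.

```python
# Widest opmask width permitted under each vector size limit.
# The 128 -> 16 entry is an assumption, not taken from the AVX10 spec.
MAX_MASK_BITS = {512: 64, 256: 32, 128: 16}


def mask_insn_allowed(mask_bits, vector_limit=512):
    """Return True if a mask insn of the given width passes the limit."""
    return mask_bits <= MAX_MASK_BITS[vector_limit]
```

Under this table a 32-bit mask insn would be rejected with a 128-bit limit,
which is the case that would need to be worked around via .arch.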

Jan

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 5/5] x86: support AVX10.1 vector size restrictions
  2023-08-31  7:18               ` Jan Beulich
@ 2023-09-01  6:21                 ` Jiang, Haochen
  0 siblings, 0 replies; 27+ messages in thread
From: Jiang, Haochen @ 2023-09-01  6:21 UTC (permalink / raw)
  To: Beulich, Jan, H.J. Lu; +Cc: Binutils

> >>
> >> AVX10 spec only has
> >>
> >> Quadword opmask instructions will only be supported on processors
> >> supporting vector lengths of 512 bits.
> >>
> >> It doesn't say anything about 32-bit mask.   32-bit mask can be useful
> >> even with 16 byte vector.
> 
> How's that any different for 64-bit mask with 32-byte vector?
> 
> > The concern from my side is if there is an extreme case that overloads
> > registers, we might need to spill 32-bit register to 32-bit mask
> > register in the compiler.
> 
> How's that any different for spilling of 64-bit registers?
>

In GCC, in 256-bit mode, we will ban them. But since there is no definition of
a 128-bit mode in GCC (and there may not be one for a long time), that leaves
some room for undefined behavior.

Actually, the cost of that spill is really high, so it should be a minor
concern, not a blocking issue.

Thx,
Haochen


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/5] x86: support AVX10.1/512
  2023-08-30 15:28                 ` H.J. Lu
@ 2023-09-01  8:41                   ` Jan Beulich
  2023-09-01  8:52                     ` Jiang, Haochen
  0 siblings, 1 reply; 27+ messages in thread
From: Jan Beulich @ 2023-09-01  8:41 UTC (permalink / raw)
  To: H.J. Lu, Jiang, Haochen; +Cc: Binutils

On 30.08.2023 17:28, H.J. Lu wrote:
> On Wed, Aug 30, 2023 at 12:48 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 30.08.2023 03:10, Jiang, Haochen wrote:
>>>> I don't think we need to make AVX10 to imply AES and VPCLMULQDQ in
>>>> assembler.  The AES and PCLMULQDQ changes aren't needed.
>>>>
>>>
>>> AVX10 should not imply them. However, I suppose we are talking about
>>> the relation between AES/VAES and PCLMULQDQ/VPCLMULQDQ.
>>
>> Correct. Some change needs to be done; the question is whether as
>> posted, or whether with a prereq patch adding those two connections
>> (with, as said before, me intending to go the latter route unless I
>> hear otherwise).
> 
> I agree.  VAES should imply AES and VPCLMULQDQ should imply
> PCLMULQDQ.

Actually there's something similar with AVX10 itself: AVX512F includes
equivalents right away of what comes under separate extensions for AVX:
F16C and FMA. AVX10, otoh, is presently specified to only guarantee
AVX and AVX2. Does that mean that VEX-encoded vfm{add,sub}* and ps<->ph
conversion insns aren't guaranteed to also be available? Doesn't seem
logical to me, so I'm inclined to make FMA and F16C prereqs of AVX10.1
as well (or alternatively of AVX512F, but I think this would have
undesirable effects). AVX2 isn't an explicit prereq only because it
already is one of AVX512F.

Jan

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 3/5] x86: support AVX10.1/512
  2023-09-01  8:41                   ` Jan Beulich
@ 2023-09-01  8:52                     ` Jiang, Haochen
  2023-09-01  9:57                       ` Jan Beulich
  0 siblings, 1 reply; 27+ messages in thread
From: Jiang, Haochen @ 2023-09-01  8:52 UTC (permalink / raw)
  To: Beulich, Jan, H.J. Lu; +Cc: Binutils



> >>> AVX10 should not imply them. However, I suppose we are talking about
> >>> the relation between AES/VAES and PCLMULQDQ/VPCLMULQDQ.
> >>
> >> Correct. Some change needs to be done; the question is whether as
> >> posted, or whether with a prereq patch adding those two connections
> >> (with, as said before, me intending to go the latter route unless I
> >> hear otherwise).
> >
> > I agree.  VAES should imply AES and VPCLMULQDQ should imply
> > PCLMULQDQ.
> 
> Actually there's something similar with AVX10 itself: AVX512F includes
> equivalents right away of what comes under separate extensions for AVX:
> F16C and FMA. AVX10, otoh, is presently specified to only guarantee
> AVX and AVX2. Does that mean that VEX-encoded vfm{add,sub}* and ps<->ph
> conversion insns aren't guaranteed to also be available? Doesn't seem
> logical to me, so I'm inclined to make FMA and F16C prereqs of AVX10.1
> as well (or alternatively of AVX512F, but I think this would have
> undesirable effects). AVX2 isn't an explicit prereq only because it
> already is one of AVX512F.

I suppose AVX10 should only enable EVEX encodings; it has nothing
to do with VEX encodings.

For those independent VEX ISAs, if AVX512F does not enable them, neither
should AVX10.

Actually, it is not only F16C and FMA: ISAs like AVX-VNNI and AVX-IFMA
are also not enabled under AVX10.

Thx,
Haochen

> 
> Jan

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/5] x86: support AVX10.1/512
  2023-09-01  8:52                     ` Jiang, Haochen
@ 2023-09-01  9:57                       ` Jan Beulich
  2023-09-05  7:04                         ` Jiang, Haochen
  0 siblings, 1 reply; 27+ messages in thread
From: Jan Beulich @ 2023-09-01  9:57 UTC (permalink / raw)
  To: Jiang, Haochen; +Cc: Binutils, H.J. Lu

On 01.09.2023 10:52, Jiang, Haochen wrote:
> 
> 
>>>>> AVX10 should not imply them. However, I suppose we are talking about
>>>>> the relation between AES/VAES and PCLMULQDQ/VPCLMULQDQ.
>>>>
>>>> Correct. Some change needs to be done; the question is whether as
>>>> posted, or whether with a prereq patch adding those two connections
>>>> (with, as said before, me intending to go the latter route unless I
>>>> hear otherwise).
>>>
>>> I agree.  VAES should imply AES and VPCLMULQDQ should imply
>>> PCLMULQDQ.
>>
>> Actually there's something similar with AVX10 itself: AVX512F includes
>> equivalents right away of what comes under separate extensions for AVX:
>> F16C and FMA. AVX10, otoh, is presently specified to only guarantee
>> AVX and AVX2. Does that mean that VEX-encoded vfm{add,sub}* and ps<-ph
>> conversion insns aren't guaranteed to also be available? Doesn't seem
>> logical to me, so I'm inclined to make FMA and F16C prereqs of AVX10.1
>> as well (or alternatively of AVX512F, but I think this would have
>> undesirable effects). AVX2 isn't an explicit prereq only because it
>> already is one of AVX512F.
> 
> I suppose AVX10 should only enable EVEX encoding,  they have nothing
> to do with the VEX encoding.
> 
> For those independent VEX ISAs, if AVX512F is not enabling it, AVX10 neither.
> 
> Actually, not only F16C and FMA, under AVX10, ISAs like AVX-VNNI, AVX-IFMA
> are also not enabled.

The difference from the AVX-* ones you mention is important here: AVX-VNNI
(taking that as an example) isn't a feature that had equivalent EVEX
encodings added right in AVX512F. So I'd like to ask that you reconsider
what you said. Also think about what the compiler does (which doesn't
emit .arch directives to limit the usable ISA extensions) when just
-mavx512vl is passed to it: VEX-encoded vfm{add,sub}* would then still
result (to prevent that, the compiler would additionally need to emit
{evex} pseudo-prefixes). IOW in the compiler there is such an implication
already anyway.
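
As an illustration of the implication described above, the following is a
minimal C sketch of how an assembler might choose between the VEX and EVEX
forms of a 128-/256-bit vfm{add,sub}* insn that uses only xmm0-15/ymm0-15
and no masking. It is a simplified model, not code from gas: the names
(struct features, pick_fma_encoding) are hypothetical, and only the FMA,
AVX512F and AVX512VL features are modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; these are not gas identifiers.  */
enum encoding { ENC_NONE, ENC_VEX, ENC_EVEX };

struct features {
  bool fma;       /* VEX-encoded FMA insns usable */
  bool avx512f;   /* EVEX-encoded scalar/512-bit forms usable */
  bool avx512vl;  /* EVEX-encoded 128-/256-bit forms usable */
};

/* Pick the encoding for a 128-/256-bit FMA insn with plain operands.
   force_evex models an explicit {evex} pseudo-prefix.  */
static enum encoding
pick_fma_encoding (struct features f, bool force_evex)
{
  /* Assumption of this sketch: AVX512VL is never on without AVX512F.  */
  assert (!f.avx512vl || f.avx512f);

  bool evex_ok = f.avx512f && f.avx512vl;

  if (force_evex)
    return evex_ok ? ENC_EVEX : ENC_NONE;
  if (f.fma)
    return ENC_VEX;           /* shorter form wins when it is usable */
  return evex_ok ? ENC_EVEX : ENC_NONE;
}
```

With all three features usable (the situation when the compiler emits no
.arch), the plain mnemonic yields the VEX form in this model; the EVEX form
results only with force_evex set or with FMA disabled.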

Jan

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 3/5] x86: support AVX10.1/512
  2023-09-01  9:57                       ` Jan Beulich
@ 2023-09-05  7:04                         ` Jiang, Haochen
  2023-09-05  7:25                           ` Jan Beulich
  0 siblings, 1 reply; 27+ messages in thread
From: Jiang, Haochen @ 2023-09-05  7:04 UTC (permalink / raw)
  To: Beulich, Jan; +Cc: Binutils, H.J. Lu

> >> Actually there's something similar with AVX10 itself: AVX512F includes
> >> equivalents right away of what comes under separate extensions for AVX:
> >> F16C and FMA. AVX10, otoh, is presently specified to only guarantee
> >> AVX and AVX2. Does that mean that VEX-encoded vfm{add,sub}* and ps<-ph
> >> conversion insns aren't guaranteed to also be available? Doesn't seem
> >> logical to me, so I'm inclined to make FMA and F16C prereqs of AVX10.1
> >> as well (or alternatively of AVX512F, but I think this would have
> >> undesirable effects). AVX2 isn't an explicit prereq only because it
> >> already is one of AVX512F.
> >
> > I suppose AVX10 should only enable EVEX encoding,  they have nothing
> > to do with the VEX encoding.
> >
> > For those independent VEX ISAs, if AVX512F is not enabling it, AVX10 neither.
> >
> > Actually, not only F16C and FMA, under AVX10, ISAs like AVX-VNNI, AVX-IFMA
> > are also not enabled.
> 
> The difference to the AVX-* ones you mention is important here: AVX-VNNI
> (taking that as example) isn't a feature that had equivalent EVEX
> encodings added right in AVX512F. So I'd like to ask that you re-consider

I see your point since here we are just focusing on features introduced in
AVX512F. But I still would like to mention AVX-VNNI below just for discussion.

> what you said. Also think about what the compiler does (which doesn't
> emit .arch directives to limit the usable ISA extensions) when just
> -mavx512vl is passed to it: VEX-encoded vfm{add,sub}* would then still be
> resulting (to prevent that, the compiler would need to further emit {evex}
> pseudo-prefixes). IOW in the compiler there is such an implication already
> anyway.

For FMA, GCC has the following comment:

;; The standard names for scalar FMA are only available with SSE math enabled.
;; CPUID bit AVX512F enables evex encoded scalar and 512-bit fma.  It doesn't
;; care about FMA bit, so we enable fma for TARGET_AVX512F even when TARGET_FMA
;; and TARGET_FMA4 are both false.
;; TODO: In theory AVX512F does not automatically imply FMA, and without FMA
;; one must force the EVEX encoding of the fma insns.  Ideally we'd improve
;; GAS to allow proper prefix selection.  However, for the moment all hardware
;; that supports AVX512F also supports FMA so we can ignore this for now.

Although the pattern is split between FMA/FMA4 and AVX512F, the code itself won't actually
emit an {evex} prefix on the mnemonic when only AVX512F is enabled, since there is no real
hardware for which codegen would need to do so.

For F16C, the pattern is not even split, so the scenario is the same as for FMA/FMA4.

Therefore, I suppose it could be OK for AVX10 to imply FMA/F16C in gas for simplicity. But
let's wait for H.J.'s opinion on that.
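
The prerequisite question can be sketched as a small dependency closure.
The C snippet below is a hypothetical model, not the dependency tables gas
actually uses: the F_* bits, direct_prereqs() and enable_feature() are
made-up names, and the avx10_implies_fma_f16c flag selects between the two
behaviours under discussion.

```c
#include <assert.h>

/* Hypothetical feature bits; not the Cpu* identifiers used by gas.  */
enum {
  F_AVX     = 1 << 0,
  F_AVX2    = 1 << 1,
  F_FMA     = 1 << 2,
  F_F16C    = 1 << 3,
  F_AVX512F = 1 << 4,
  F_AVX10_1 = 1 << 5,
  F_COUNT   = 6
};

/* Direct prerequisites of the feature with the given bit index.  */
static unsigned int
direct_prereqs (unsigned int bit_index, int avx10_implies_fma_f16c)
{
  switch (1u << bit_index)
    {
    case F_AVX2:    return F_AVX;
    case F_FMA:     return F_AVX;
    case F_F16C:    return F_AVX;
    case F_AVX512F: return F_AVX2;
    case F_AVX10_1:
      return F_AVX512F | (avx10_implies_fma_f16c ? F_FMA | F_F16C : 0);
    default:        return 0;
    }
}

/* Enable FEATURE in SET and pull in prerequisites transitively.  */
static unsigned int
enable_feature (unsigned int set, unsigned int feature,
                int avx10_implies_fma_f16c)
{
  assert (feature != 0);
  set |= feature;
  for (;;)
    {
      unsigned int next = set;
      for (unsigned int i = 0; i < F_COUNT; i++)
        if (set & (1u << i))
          next |= direct_prereqs (i, avx10_implies_fma_f16c);
      if (next == set)
        return set;   /* fixed point reached */
      set = next;
    }
}
```

In this model AVX2 never needs listing for AVX10.1, since it is reached via
AVX512F; FMA and F16C are reached only when the flag is set.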

As for AVX-VNNI, it was introduced in Sapphire Rapids, i.e. before AVX10.1 was introduced
(Granite Rapids), which means that hardware having AVX10.1 will always have AVX-VNNI as
well. So there might be a chance to imply AVX-VNNI from AVX10.1 in the compiler,
but we could have that discussion after everything in AVX10.1 is settled in the community.

Thx,
Haochen

> 
> Jan

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/5] x86: support AVX10.1/512
  2023-09-05  7:04                         ` Jiang, Haochen
@ 2023-09-05  7:25                           ` Jan Beulich
  0 siblings, 0 replies; 27+ messages in thread
From: Jan Beulich @ 2023-09-05  7:25 UTC (permalink / raw)
  To: Jiang, Haochen; +Cc: Binutils, H.J. Lu

On 05.09.2023 09:04, Jiang, Haochen wrote:
>>>> Actually there's something similar with AVX10 itself: AVX512F includes
>>>> equivalents right away of what comes under separate extensions for AVX:
>>>> F16C and FMA. AVX10, otoh, is presently specified to only guarantee
>>>> AVX and AVX2. Does that mean that VEX-encoded vfm{add,sub}* and ps<-ph
>>>> conversion insns aren't guaranteed to also be available? Doesn't seem
>>>> logical to me, so I'm inclined to make FMA and F16C prereqs of AVX10.1
>>>> as well (or alternatively of AVX512F, but I think this would have
>>>> undesirable effects). AVX2 isn't an explicit prereq only because it
>>>> already is one of AVX512F.
>>>
>>> I suppose AVX10 should only enable EVEX encoding,  they have nothing
>>> to do with the VEX encoding.
>>>
>>> For those independent VEX ISAs, if AVX512F is not enabling it, AVX10 neither.
>>>
>>> Actually, not only F16C and FMA, under AVX10, ISAs like AVX-VNNI, AVX-IFMA
>>> are also not enabled.
>>
>> The difference to the AVX-* ones you mention is important here: AVX-VNNI
>> (taking that as example) isn't a feature that had equivalent EVEX
>> encodings added right in AVX512F. So I'd like to ask that you re-consider
> 
> I see your point since here we are just focusing on features introduced in
> AVX512F. But I still would like to mention AVX-VNNI below just for discussion.
> 
>> what you said. Also think about what the compiler does (which doesn't
>> emit .arch directives to limit the usable ISA extensions) when just
>> -mavx512vl is passed to it: VEX-encoded vfm{add,sub}* would then still be
>> resulting (to prevent that, the compiler would need to further emit {evex}
>> pseudo-prefixes). IOW in the compiler there is such an implication already
>> anyway.
> 
> For FMA, in GCC, we have such comment on that:
> 
> ;; The standard names for scalar FMA are only available with SSE math enabled.
> ;; CPUID bit AVX512F enables evex encoded scalar and 512-bit fma.  It doesn't
> ;; care about FMA bit, so we enable fma for TARGET_AVX512F even when TARGET_FMA
> ;; and TARGET_FMA4 are both false.
> ;; TODO: In theory AVX512F does not automatically imply FMA, and without FMA
> ;; one must force the EVEX encoding of the fma insns.  Ideally we'd improve
> ;; GAS to allow proper prefix selection.  However, for the moment all hardware
> ;; that supports AVX512F also supports FMA so we can ignore this for now.

Interesting. I wonder what gas improvement is being thought about here, when
gcc doesn't emit .arch.

> Although splitting the pattern between FMA/FMA4 and AVX512F, the code itself actually
> won't emit an {evex} prefix in mnemonic if there is only AVX512F since there is no true
> hardware for codegen to do so.
> 
> For F16C, the pattern is even not split, so the scenario is the same as FMA/FMA4.
> 
> Therefore, I suppose it could be ok for AVX10 to imply FMA/F16C in gas for simplicity. But
> let's wait for H.J.'s opinion on that.

Okay, I'll submit v2 then with this just as a remark for the time being.
Luckily in the follow-on work where I ran into this I now no longer depend
on there being such an explicit connection. (Whether what I'm doing there
is acceptable will need to be seen.)

> For AVX-VNNI issue, it is introduced in Sapphire Rapids, which is before AVX10.1 introduction
> (Granite Rapids), which means that on the hardware we will always have AVX-VNNI while
> AVX10.1 is there. So there might be a chance to imply AVX-VNNI in AVX10.1 in compiler,
> but we could put that discussion after everything in AVX10.1 is set in community.

Hmm, yes. An implication of making it another prereq is that with AVX10.1
explicitly enabled, VEX encodings then ought to be preferred over the EVEX
ones (for being shorter), except when Disp8-scaling helps shorten a memory
reference. That'll for sure require extra code in tc-i386.c, so would likely
want to be a separate patch then. (Actually I think we should already do so
anyway when AVX-VNNI is explicitly enabled.)
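
The size trade-off mentioned above can be made concrete with a little
arithmetic. The C sketch below is a rough model, assuming a 3-byte VEX
prefix (as needed for the 0F38 opcode map), a 4-byte EVEX prefix, one opcode
byte, one ModR/M byte, no SIB byte, and a base register that permits
omitting a zero displacement. The helper names are hypothetical and this is
not how tc-i386.c computes sizes.

```c
#include <assert.h>
#include <stddef.h>

/* Displacement bytes needed when an 8-bit displacement is scaled by
   SCALE (1 for VEX, the Disp8 scaling factor N for EVEX).  */
static size_t
disp_bytes (long disp, long scale)
{
  assert (scale > 0);
  if (disp == 0)
    return 0;
  if (disp % scale == 0 && disp / scale >= -128 && disp / scale <= 127)
    return 1;
  return 4;
}

/* Prefix + opcode + ModR/M + displacement.  */
static size_t
insn_len (size_t prefix_len, long disp, long scale)
{
  return prefix_len + 1 + 1 + disp_bytes (disp, scale);
}

static size_t
vex_len (long disp)
{
  return insn_len (3, disp, 1);
}

static size_t
evex_len (long disp, long n)
{
  return insn_len (4, disp, n);
}
```

For a 16-byte memory operand (N = 16 under these assumptions), a
displacement of 0x40 gives 6 bytes for VEX versus 7 for EVEX, while 0x100
gives 9 versus 7, since 0x100 no longer fits a plain Disp8 but 0x100/16
does.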

I'd then further raise the same question towards AVX-IFMA.

Jan

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2023-09-05  7:25 UTC | newest]

Thread overview: 27+ messages
2023-08-25 12:43 [PATCH 0/5] x86: AVX10.1 (alternative attempt) Jan Beulich
2023-08-25 12:44 ` [PATCH 1/5] x86: correct source used for two non-AVX512 VEXWIG tests Jan Beulich
2023-08-25 12:45 ` [PATCH 2/5] x86: rename CpuPCLMUL Jan Beulich
2023-08-25 12:46 ` [PATCH 3/5] x86: support AVX10.1/512 Jan Beulich
2023-08-28  2:34   ` Jiang, Haochen
2023-08-28  6:45     ` Jan Beulich
2023-08-28  6:59       ` Jiang, Haochen
2023-08-28  7:09         ` Jan Beulich
2023-08-29 16:18           ` H.J. Lu
2023-08-30  1:10             ` Jiang, Haochen
2023-08-30  7:47               ` Jan Beulich
2023-08-30 15:28                 ` H.J. Lu
2023-09-01  8:41                   ` Jan Beulich
2023-09-01  8:52                     ` Jiang, Haochen
2023-09-01  9:57                       ` Jan Beulich
2023-09-05  7:04                         ` Jiang, Haochen
2023-09-05  7:25                           ` Jan Beulich
2023-08-25 12:47 ` [PATCH 4/5] x86: unindent most of set_cpu_arch() Jan Beulich
2023-08-25 12:47 ` [PATCH 5/5] x86: support AVX10.1 vector size restrictions Jan Beulich
2023-08-29 16:26   ` H.J. Lu
2023-08-30  7:57     ` Jan Beulich
2023-08-30 15:25       ` H.J. Lu
2023-08-30 16:16         ` Jan Beulich
2023-08-30 18:00           ` H.J. Lu
2023-08-31  5:56             ` Jiang, Haochen
2023-08-31  7:18               ` Jan Beulich
2023-09-01  6:21                 ` Jiang, Haochen
