public inbox for binutils@sourceware.org
* [PATCH 00/10] x86: re-work ISA extension dependency handling
@ 2022-12-19  8:31 Jan Beulich
  2022-12-19 10:44 ` [PATCH 01/10] " Jan Beulich
                   ` (10 more replies)
  0 siblings, 11 replies; 14+ messages in thread
From: Jan Beulich @ 2022-12-19  8:31 UTC
  To: Binutils; +Cc: H.J. Lu

Getting both forward and reverse ISA dependencies right / consistent has
been a permanent source of mistakes, my own included. Reduce what needs
specifying manually to just the direct forward dependencies. On top of
that, a number of dependencies weren't put in place at all.

01: re-work ISA extension dependency handling
02: correct what gets disabled by certain ".arch .no*"
03: correct SSE dependencies
04: add dependencies on AVX2
05: rework noavx512-1 testcase
06: correct dependencies of a few AVX512 sub-features
07: correct XSAVE* dependencies
08: add dependencies on VMX
09: add dependencies on SVME
10: correct/improve TSX controls

Jan


* [PATCH 01/10] x86: re-work ISA extension dependency handling
  2022-12-19  8:31 [PATCH 00/10] x86: re-work ISA extension dependency handling Jan Beulich
@ 2022-12-19 10:44 ` Jan Beulich
  2022-12-19 10:45 ` [PATCH 02/10] x86: correct what gets disabled by certain ".arch .no*" Jan Beulich
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Jan Beulich @ 2022-12-19 10:44 UTC
  To: Binutils; +Cc: H.J. Lu

Getting both forward and reverse ISA dependencies right / consistent has
been a permanent source of mistakes. Reduce what needs specifying
manually to just the direct forward dependencies. Transitive forward
dependencies as well as reverse ones are now derived and hence cannot go
out of sync anymore (at least in the vast majority of cases; a few
special cases still need to be taken care of manually). In the course of
this, several CPU_ANY_*_FLAGS disappear, requiring adjustment to the
assembler's cpu_arch[].
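
To illustrate the idea of deriving everything from direct forward
dependencies, here is a minimal sketch. It is not the code of this patch
(which works on name strings and a reverse dependency matrix); the enum,
the toy table, and both function names are hypothetical.

```c
#include <assert.h>

/* Toy feature set; the real generator covers every Cpu* bit.  */
enum isa { I_SSE, I_SSE2, I_SSE3, I_AES, I_COUNT };

/* Only direct prerequisites are written down by hand.  */
static const unsigned direct_deps[I_COUNT] = {
  [I_SSE]  = 0,
  [I_SSE2] = 1u << I_SSE,
  [I_SSE3] = 1u << I_SSE2,
  [I_AES]  = 1u << I_SSE2,
};

/* F plus everything F transitively requires (what enabling F turns on).  */
static unsigned
forward_closure (enum isa f)
{
  unsigned mask, prev;

  assert (f < I_COUNT);
  mask = 1u << f;
  do
    {
      prev = mask;
      for (int i = 0; i < I_COUNT; i++)
	if (mask & (1u << i))
	  mask |= direct_deps[i];
    }
  while (mask != prev);
  return mask;
}

/* F plus everything that transitively requires F (what disabling F
   has to turn off).  Derived, never specified by hand.  */
static unsigned
reverse_closure (enum isa f)
{
  unsigned mask = 0;

  for (int i = 0; i < I_COUNT; i++)
    if (forward_closure ((enum isa) i) & (1u << f))
      mask |= 1u << i;
  return mask;
}
```

In this toy model, forward_closure (I_AES) covers SSE, SSE2, and AES,
while reverse_closure (I_SSE2) covers SSE2, SSE3, and AES. Both follow
from the same four table rows, so they cannot disagree.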

Note that to retain the correct reverse dependency of AVX512F wrt
AVX512-VP2INTERSECT, the latter has the previously missing AVX512F
prereq added.

Note further that auxiliary table entries (marked by a colon-separated
modifier) are introduced in addition to the ones resulting from
converting the old table. They avoid adding the following undue prereqs:
* ATHLON, K8, and AMDFAM10 gaining CMOV and FXSR,
* IAMCU gaining 387.
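
As a rough sketch of how such an entry name can be split: the patch
looks up the dependency list under the full name (e.g. "686:min") but
sets the flag bit of the base name ("686"). The helpers below are
hypothetical and only show the name handling; the patch itself uses
strchr() on an xstrdup()ed copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the part of NAME preceding any ':' modifier into BUF (of SIZE
   bytes, truncating if necessary) and return BUF.  */
static const char *
base_isa_name (const char *name, char *buf, size_t size)
{
  size_t len = strcspn (name, ":");

  assert (size > 0);
  if (len >= size)
    len = size - 1;
  memcpy (buf, name, len);
  buf[len] = '\0';
  return buf;
}

/* Nonzero if NAME denotes an auxiliary entry, i.e. carries a modifier.  */
static int
is_auxiliary_entry (const char *name)
{
  return strchr (name, ':') != NULL;
}
```

With the table in this patch, "686:min" then stands for the 686 bit with
only "586|687" as prereqs, and "586:nofpu" for the 586 bit without 387.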

To maintain forward-only dependencies between AVX (XOP) and SSE* (SSE4a)
(i.e. "nosse" not disabling AVX), reverse dependency tracking is
artificially suppressed.
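
The suppression can be pictured with the following toy model. It is a
sketch under simplifying assumptions (four features, bitmask table,
hypothetical names), not the generator's actual code: while walking the
forward dependencies of a feature, the "owner" on whose behalf reverse
edges get recorded is dropped when stepping from AVX to an SSE feature.

```c
#include <assert.h>
#include <stdbool.h>

enum feat { F_SSE4_2, F_XSAVE, F_AVX, F_AVX2, F_COUNT, F_NONE = F_COUNT };

/* Direct forward dependencies only (toy subset).  */
static const unsigned direct[F_COUNT] = {
  [F_AVX]  = (1u << F_SSE4_2) | (1u << F_XSAVE),
  [F_AVX2] = 1u << F_AVX,
};

/* reverse_deps[f][o] set means: disabling f must also disable o.  */
static bool reverse_deps[F_COUNT][F_COUNT];

static void
record (enum feat f, enum feat owner)
{
  assert (f < F_COUNT);
  if (owner != F_NONE && owner != f)
    reverse_deps[f][owner] = true;

  for (int d = 0; d < F_COUNT; d++)
    if (direct[f] & (1u << d))
      {
	/* No AVX -> SSE reverse dependency: forget the owner here.  */
	bool suppress = f == F_AVX && d == F_SSE4_2;

	record ((enum feat) d, suppress ? F_NONE : owner);
      }
}

static void
build_reverse_deps (void)
{
  for (int f = 0; f < F_COUNT; f++)
    record ((enum feat) f, (enum feat) f);
}
```

After build_reverse_deps (), disabling XSAVE in this model takes AVX and
AVX2 along, whereas disabling SSE4.2 leaves both untouched.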

As a side effect, disabling SSE or SSE2 will now also disable AES,
PCLMUL, and SHA (the respective elements were missing from
CPU_ANY_SSE2_FLAGS).
---
An option would be to generate

#define CPU_ANY_XYZ_FLAGS CPU_XYZ_FLAGS

for all XYZ which don't otherwise have CPU_ANY_XYZ_FLAGS generated. This
would allow cpu_arch[] to become more uniform. But maybe this would, if
desired in the first place, better be a separate, follow-on change.
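
For illustration only, emitting such an alias could look roughly like
the sketch below. The function name and the fixed-size buffer are
assumptions made for this example; nothing like it is part of the patch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Format "#define CPU_ANY_<NAME>_FLAGS CPU_<NAME>_FLAGS" for a feature
   which has no dependents, upper-casing NAME on the way.  Returns what
   snprintf() returns.  */
static int
format_any_alias (char *buf, size_t size, const char *name)
{
  char upper[64];
  size_t i, len = strlen (name);

  assert (len < sizeof (upper));
  for (i = 0; i < len; i++)
    upper[i] = (char) toupper ((unsigned char) name[i]);
  upper[len] = '\0';

  return snprintf (buf, size, "#define CPU_ANY_%s_FLAGS CPU_%s_FLAGS\n",
		   upper, upper);
}
```

With such aliases in place, every SUBARCH() line could pass ANY_XYZ as
its third argument, regardless of whether XYZ has dependents.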

I wonder whether isa_dependencies[] wouldn't better be in a separate
input file (e.g. i386-isa.tbl), just like the other inputs are also
separate files. If we wanted to switch, perhaps again better as a
follow-on change.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -964,13 +964,13 @@ static const arch_entry cpu_arch[] =
   ARCH (generic32, GENERIC32, GENERIC32, false),
   ARCH (generic64, GENERIC64, GENERIC64, false),
   ARCH (i8086, UNKNOWN, NONE, false),
-  ARCH (i186, UNKNOWN, I186, false),
-  ARCH (i286, UNKNOWN, I286, false),
-  ARCH (i386, I386, I386, false),
-  ARCH (i486, I486, I486, false),
-  ARCH (i586, PENTIUM, I586, false),
-  ARCH (i686, PENTIUMPRO, I686, false),
-  ARCH (pentium, PENTIUM, I586, false),
+  ARCH (i186, UNKNOWN, 186, false),
+  ARCH (i286, UNKNOWN, 286, false),
+  ARCH (i386, I386, 386, false),
+  ARCH (i486, I486, 486, false),
+  ARCH (i586, PENTIUM, 586, false),
+  ARCH (i686, PENTIUMPRO, 686, false),
+  ARCH (pentium, PENTIUM, 586, false),
   ARCH (pentiumpro, PENTIUMPRO, PENTIUMPRO, false),
   ARCH (pentiumii, PENTIUMPRO, P2, false),
   ARCH (pentiumiii, PENTIUMPRO, P3, false),
@@ -1001,13 +1001,13 @@ static const arch_entry cpu_arch[] =
   ARCH (btver1, BT, BTVER1, false),
   ARCH (btver2, BT, BTVER2, false),
 
-  SUBARCH (8087, 8087, ANY_X87, false),
-  SUBARCH (87, NONE, ANY_X87, false), /* Disable only!  */
+  SUBARCH (8087, 8087, ANY_8087, false),
+  SUBARCH (87, NONE, ANY_8087, false), /* Disable only!  */
   SUBARCH (287, 287, ANY_287, false),
   SUBARCH (387, 387, ANY_387, false),
   SUBARCH (687, 687, ANY_687, false),
-  SUBARCH (cmov, CMOV, ANY_CMOV, false),
-  SUBARCH (fxsr, FXSR, ANY_FXSR, false),
+  SUBARCH (cmov, CMOV, CMOV, false),
+  SUBARCH (fxsr, FXSR, FXSR, false),
   SUBARCH (mmx, MMX, ANY_MMX, false),
   SUBARCH (sse, SSE, ANY_SSE, false),
   SUBARCH (sse2, SSE2, ANY_SSE2, false),
@@ -1088,8 +1088,8 @@ static const arch_entry cpu_arch[] =
   SUBARCH (ospke, OSPKE, OSPKE, false),
   SUBARCH (rdpid, RDPID, RDPID, false),
   SUBARCH (ptwrite, PTWRITE, PTWRITE, false),
-  SUBARCH (ibt, IBT, ANY_IBT, false),
-  SUBARCH (shstk, SHSTK, ANY_SHSTK, false),
+  SUBARCH (ibt, IBT, IBT, false),
+  SUBARCH (shstk, SHSTK, SHSTK, false),
   SUBARCH (gfni, GFNI, GFNI, false),
   SUBARCH (vaes, VAES, VAES, false),
   SUBARCH (vpclmulqdq, VPCLMULQDQ, VPCLMULQDQ, false),
@@ -1101,31 +1101,31 @@ static const arch_entry cpu_arch[] =
   SUBARCH (amx_bf16, AMX_BF16, ANY_AMX_BF16, false),
   SUBARCH (amx_fp16, AMX_FP16, AMX_FP16, false),
   SUBARCH (amx_tile, AMX_TILE, ANY_AMX_TILE, false),
-  SUBARCH (movdiri, MOVDIRI, ANY_MOVDIRI, false),
-  SUBARCH (movdir64b, MOVDIR64B, ANY_MOVDIR64B, false),
+  SUBARCH (movdiri, MOVDIRI, MOVDIRI, false),
+  SUBARCH (movdir64b, MOVDIR64B, MOVDIR64B, false),
   SUBARCH (avx512_bf16, AVX512_BF16, ANY_AVX512_BF16, false),
   SUBARCH (avx512_vp2intersect, AVX512_VP2INTERSECT,
 	   ANY_AVX512_VP2INTERSECT, false),
-  SUBARCH (tdx, TDX, ANY_TDX, false),
-  SUBARCH (enqcmd, ENQCMD, ANY_ENQCMD, false),
-  SUBARCH (serialize, SERIALIZE, ANY_SERIALIZE, false),
+  SUBARCH (tdx, TDX, TDX, false),
+  SUBARCH (enqcmd, ENQCMD, ENQCMD, false),
+  SUBARCH (serialize, SERIALIZE, SERIALIZE, false),
   SUBARCH (rdpru, RDPRU, RDPRU, false),
   SUBARCH (mcommit, MCOMMIT, MCOMMIT, false),
   SUBARCH (sev_es, SEV_ES, SEV_ES, false),
-  SUBARCH (tsxldtrk, TSXLDTRK, ANY_TSXLDTRK, false),
-  SUBARCH (kl, KL, ANY_KL, false),
-  SUBARCH (widekl, WIDEKL, ANY_WIDEKL, false),
-  SUBARCH (uintr, UINTR, ANY_UINTR, false),
-  SUBARCH (hreset, HRESET, ANY_HRESET, false),
+  SUBARCH (tsxldtrk, TSXLDTRK, TSXLDTRK, false),
+  SUBARCH (kl, KL, KL, false),
+  SUBARCH (widekl, WIDEKL, WIDEKL, false),
+  SUBARCH (uintr, UINTR, UINTR, false),
+  SUBARCH (hreset, HRESET, HRESET, false),
   SUBARCH (avx512_fp16, AVX512_FP16, ANY_AVX512_FP16, false),
   SUBARCH (prefetchi, PREFETCHI, PREFETCHI, false),
   SUBARCH (avx_ifma, AVX_IFMA, ANY_AVX_IFMA, false),
   SUBARCH (avx_vnni_int8, AVX_VNNI_INT8, ANY_AVX_VNNI_INT8, false),
-  SUBARCH (cmpccxadd, CMPCCXADD, ANY_CMPCCXADD, false),
-  SUBARCH (wrmsrns, WRMSRNS, ANY_WRMSRNS, false),
-  SUBARCH (msrlist, MSRLIST, ANY_MSRLIST, false),
+  SUBARCH (cmpccxadd, CMPCCXADD, CMPCCXADD, false),
+  SUBARCH (wrmsrns, WRMSRNS, WRMSRNS, false),
+  SUBARCH (msrlist, MSRLIST, MSRLIST, false),
   SUBARCH (avx_ne_convert, AVX_NE_CONVERT, ANY_AVX_NE_CONVERT, false),
-  SUBARCH (rao_int, RAO_INT, ANY_RAO_INT, false),
+  SUBARCH (rao_int, RAO_INT, RAO_INT, false),
   SUBARCH (rmpquery, RMPQUERY, RMPQUERY, false),
 };
 
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -18,6 +18,7 @@
    MA 02110-1301, USA.  */
 
 #include "sysdep.h"
+#include <stdbool.h>
 #include <stdio.h>
 #include <errno.h>
 #include "getopt.h"
@@ -37,446 +38,192 @@
 static const char *program_name = NULL;
 static int debug = 0;
 
-typedef struct initializer
+typedef struct dependency
 {
   const char *name;
-  const char *init;
-} initializer;
+  /* Note: Only direct dependencies should be enumerated.  */
+  const char *deps;
+} dependency;
 
-static initializer cpu_flag_init[] =
+static const dependency isa_dependencies[] =
 {
-  { "CPU_UNKNOWN_FLAGS",
+  { "UNKNOWN",
     "~IAMCU" },
-  { "CPU_GENERIC32_FLAGS",
-    "186|286|386" },
-  { "CPU_GENERIC64_FLAGS",
-    "CPU_PENTIUMPRO_FLAGS|Clflush|SYSCALL|CPU_MMX_FLAGS|CPU_SSE2_FLAGS|LM" },
-  { "CPU_NONE_FLAGS",
-   "0" },
-  { "CPU_I186_FLAGS",
+  { "GENERIC32",
+    "386" },
+  { "GENERIC64",
+    "PENTIUMPRO|Clflush|SYSCALL|MMX|SSE2|LM" },
+  { "NONE",
+    "0" },
+  { "PENTIUMPRO",
+    "686|Nop" },
+  { "P2",
+    "PENTIUMPRO|MMX" },
+  { "P3",
+    "P2|SSE" },
+  { "P4",
+    "P3|Clflush|SSE2" },
+  { "NOCONA",
+    "GENERIC64|FISTTP|SSE3|CX16" },
+  { "CORE",
+    "P4|FISTTP|SSE3|CX16" },
+  { "CORE2",
+    "NOCONA|SSSE3" },
+  { "COREI7",
+    "CORE2|SSE4_2|Rdtscp" },
+  { "K6",
+    "186|286|386|486|586|SYSCALL|387|MMX" },
+  { "K6_2",
+    "K6|3dnow" },
+  { "ATHLON",
+    "K6_2|686:min|687|Nop|3dnowA" },
+  { "K8",
+    "ATHLON|Rdtscp|SSE2|LM" },
+  { "AMDFAM10",
+    "K8|FISTTP|SSE4A|ABM" },
+  { "BDVER1",
+    "GENERIC64|FISTTP|Rdtscp|CX16|XOP|ABM|LWP|SVME|AES|PCLMUL|PRFCHW" },
+  { "BDVER2",
+    "BDVER1|FMA|BMI|TBM|F16C" },
+  { "BDVER3",
+    "BDVER2|Xsaveopt|FSGSBase" },
+  { "BDVER4",
+    "BDVER3|AVX2|Movbe|BMI2|RdRnd|MWAITX" },
+  { "ZNVER1",
+    "GENERIC64|FISTTP|Rdtscp|CX16|AVX2|SSE4A|ABM|SVME|AES|PCLMUL|PRFCHW|FMA|BMI|F16C|Xsaveopt|FSGSBase|Movbe|BMI2|RdRnd|ADX|RdSeed|SMAP|SHA|XSAVEC|XSAVES|ClflushOpt|CLZERO|MWAITX" },
+  { "ZNVER2",
+    "ZNVER1|CLWB|RDPID|RDPRU|MCOMMIT|WBNOINVD" },
+  { "ZNVER3",
+    "ZNVER2|INVLPGB|TLBSYNC|VAES|VPCLMULQDQ|INVPCID|SNP|OSPKE" },
+  { "ZNVER4",
+    "ZNVER3|AVX512F|AVX512DQ|AVX512IFMA|AVX512CD|AVX512BW|AVX512VL|AVX512_BF16|AVX512VBMI|AVX512_VBMI2|AVX512_VNNI|AVX512_BITALG|AVX512_VPOPCNTDQ|GFNI|RMPQUERY" },
+  { "BTVER1",
+    "GENERIC64|FISTTP|CX16|Rdtscp|SSSE3|SSE4A|ABM|PRFCHW|CX16|Clflush|FISTTP|SVME" },
+  { "BTVER2",
+    "BTVER1|AVX|BMI|F16C|AES|PCLMUL|Movbe|Xsaveopt|PRFCHW" },
+  { "286",
     "186" },
-  { "CPU_I286_FLAGS",
-    "CPU_I186_FLAGS|286" },
-  { "CPU_I386_FLAGS",
-    "CPU_I286_FLAGS|386" },
-  { "CPU_I486_FLAGS",
-    "CPU_I386_FLAGS|486" },
-  { "CPU_I586_FLAGS",
-    "CPU_I486_FLAGS|387|586" },
-  { "CPU_I686_FLAGS",
-    "CPU_I586_FLAGS|686|687|CMOV|FXSR" },
-  { "CPU_PENTIUMPRO_FLAGS",
-    "CPU_I686_FLAGS|Nop" },
-  { "CPU_P2_FLAGS",
-    "CPU_PENTIUMPRO_FLAGS|CPU_MMX_FLAGS" },
-  { "CPU_P3_FLAGS",
-    "CPU_P2_FLAGS|CPU_SSE_FLAGS" },
-  { "CPU_P4_FLAGS",
-    "CPU_P3_FLAGS|Clflush|CPU_SSE2_FLAGS" },
-  { "CPU_NOCONA_FLAGS",
-    "CPU_GENERIC64_FLAGS|FISTTP|CPU_SSE3_FLAGS|CX16" },
-  { "CPU_CORE_FLAGS",
-    "CPU_P4_FLAGS|FISTTP|CPU_SSE3_FLAGS|CX16" },
-  { "CPU_CORE2_FLAGS",
-    "CPU_NOCONA_FLAGS|CPU_SSSE3_FLAGS" },
-  { "CPU_COREI7_FLAGS",
-    "CPU_CORE2_FLAGS|CPU_SSE4_2_FLAGS|Rdtscp" },
-  { "CPU_K6_FLAGS",
-    "186|286|386|486|586|SYSCALL|387|CPU_MMX_FLAGS" },
-  { "CPU_K6_2_FLAGS",
-    "CPU_K6_FLAGS|3dnow" },
-  { "CPU_ATHLON_FLAGS",
-    "CPU_K6_2_FLAGS|686|687|Nop|3dnowA" },
-  { "CPU_K8_FLAGS",
-    "CPU_ATHLON_FLAGS|Rdtscp|CPU_SSE2_FLAGS|LM" },
-  { "CPU_AMDFAM10_FLAGS",
-    "CPU_K8_FLAGS|FISTTP|CPU_SSE4A_FLAGS|LZCNT|POPCNT" },
-  { "CPU_BDVER1_FLAGS",
-    "CPU_GENERIC64_FLAGS|FISTTP|Rdtscp|CX16|CPU_XOP_FLAGS|LZCNT|POPCNT|LWP|SVME|AES|PCLMUL|PRFCHW" },
-  { "CPU_BDVER2_FLAGS",
-    "CPU_BDVER1_FLAGS|FMA|BMI|TBM|F16C" },
-  { "CPU_BDVER3_FLAGS",
-    "CPU_BDVER2_FLAGS|Xsaveopt|FSGSBase" },
-  { "CPU_BDVER4_FLAGS",
-    "CPU_BDVER3_FLAGS|AVX2|Movbe|BMI2|RdRnd|MWAITX" },
-  { "CPU_ZNVER1_FLAGS",
-    "CPU_GENERIC64_FLAGS|FISTTP|Rdtscp|CX16|CPU_AVX2_FLAGS|SSE4A|LZCNT|POPCNT|SVME|AES|PCLMUL|PRFCHW|FMA|BMI|F16C|Xsaveopt|FSGSBase|Movbe|BMI2|RdRnd|ADX|RdSeed|SMAP|SHA|XSAVEC|XSAVES|ClflushOpt|CLZERO|MWAITX" },
-  { "CPU_ZNVER2_FLAGS",
-    "CPU_ZNVER1_FLAGS|CLWB|RDPID|RDPRU|MCOMMIT|WBNOINVD" },
-  { "CPU_ZNVER3_FLAGS",
-    "CPU_ZNVER2_FLAGS|INVLPGB|TLBSYNC|VAES|VPCLMULQDQ|INVPCID|SNP|OSPKE" },
-  { "CPU_ZNVER4_FLAGS",
-    "CPU_ZNVER3_FLAGS|AVX512F|AVX512DQ|AVX512IFMA|AVX512CD|AVX512BW|AVX512VL|AVX512_BF16|AVX512VBMI|AVX512_VBMI2|AVX512_VNNI|AVX512_BITALG|AVX512_VPOPCNTDQ|GFNI|RMPQUERY" },
-  { "CPU_BTVER1_FLAGS",
-    "CPU_GENERIC64_FLAGS|FISTTP|CX16|Rdtscp|CPU_SSSE3_FLAGS|SSE4A|LZCNT|POPCNT|PRFCHW|CX16|Clflush|FISTTP|SVME" },
-  { "CPU_BTVER2_FLAGS",
-    "CPU_BTVER1_FLAGS|CPU_AVX_FLAGS|BMI|F16C|AES|PCLMUL|Movbe|Xsaveopt|PRFCHW" },
-  { "CPU_8087_FLAGS",
-    "8087" },
-  { "CPU_287_FLAGS",
-    "287" },
-  { "CPU_387_FLAGS",
+  { "386",
+    "286" },
+  { "486",
+    "386" },
+  { "586",
+    "486|387" },
+  { "586:nofpu",
+    "486" },
+  { "686",
+    "586|687|CMOV|FXSR" },
+  { "686:min",
+    "586|687" },
+  { "687",
     "387" },
-  { "CPU_687_FLAGS",
-    "CPU_387_FLAGS|687" },
-  { "CPU_CMOV_FLAGS",
-    "CMOV" },
-  { "CPU_FXSR_FLAGS",
-    "FXSR" },
-  { "CPU_CLFLUSH_FLAGS",
-    "Clflush" },
-  { "CPU_NOP_FLAGS",
-    "Nop" },
-  { "CPU_SYSCALL_FLAGS",
-    "SYSCALL" },
-  { "CPU_MMX_FLAGS",
-    "MMX" },
-  { "CPU_SSE_FLAGS",
+  { "FISTTP",
+    "687" },
+  { "SSE2",
     "SSE" },
-  { "CPU_SSE2_FLAGS",
-    "CPU_SSE_FLAGS|SSE2" },
-  { "CPU_SSE3_FLAGS",
-    "CPU_SSE2_FLAGS|SSE3" },
-  { "CPU_SSSE3_FLAGS",
-    "CPU_SSE3_FLAGS|SSSE3" },
-  { "CPU_SSE4_1_FLAGS",
-    "CPU_SSSE3_FLAGS|SSE4_1" },
-  { "CPU_SSE4_2_FLAGS",
-    "CPU_SSE4_1_FLAGS|SSE4_2|POPCNT" },
-  { "CPU_VMX_FLAGS",
-    "VMX" },
-  { "CPU_SMX_FLAGS",
-    "SMX" },
-  { "CPU_XSAVE_FLAGS",
-    "Xsave" },
-  { "CPU_XSAVEOPT_FLAGS",
-    "CPU_XSAVE_FLAGS|Xsaveopt" },
-  { "CPU_AES_FLAGS",
-    "CPU_SSE2_FLAGS|AES" },
-  { "CPU_PCLMUL_FLAGS",
-    "CPU_SSE2_FLAGS|PCLMUL" },
-  { "CPU_FMA_FLAGS",
-    "CPU_AVX_FLAGS|FMA" },
-  { "CPU_FMA4_FLAGS",
-    "CPU_AVX_FLAGS|FMA4" },
-  { "CPU_XOP_FLAGS",
-    "CPU_SSE4A_FLAGS|CPU_FMA4_FLAGS|XOP" },
-  { "CPU_LWP_FLAGS",
-    "CPU_XSAVE_FLAGS|LWP" },
-  { "CPU_BMI_FLAGS",
-    "BMI" },
-  { "CPU_TBM_FLAGS",
-    "TBM" },
-  { "CPU_MOVBE_FLAGS",
-    "Movbe" },
-  { "CPU_CX16_FLAGS",
-    "CX16" },
-  { "CPU_RDTSCP_FLAGS",
-    "Rdtscp" },
-  { "CPU_EPT_FLAGS",
-    "EPT" },
-  { "CPU_FSGSBASE_FLAGS",
-    "FSGSBase" },
-  { "CPU_RDRND_FLAGS",
-    "RdRnd" },
-  { "CPU_F16C_FLAGS",
-    "CPU_AVX_FLAGS|F16C" },
-  { "CPU_BMI2_FLAGS",
-    "BMI2" },
-  { "CPU_LZCNT_FLAGS",
-    "LZCNT" },
-  { "CPU_POPCNT_FLAGS",
-    "POPCNT" },
-  { "CPU_HLE_FLAGS",
-    "HLE" },
-  { "CPU_RTM_FLAGS",
-    "RTM" },
-  { "CPU_INVPCID_FLAGS",
-    "INVPCID" },
-  { "CPU_VMFUNC_FLAGS",
-    "VMFUNC" },
-  { "CPU_3DNOW_FLAGS",
-    "CPU_MMX_FLAGS|3dnow" },
-  { "CPU_3DNOWA_FLAGS",
-    "CPU_3DNOW_FLAGS|3dnowA" },
-  { "CPU_PADLOCK_FLAGS",
-    "PadLock" },
-  { "CPU_SVME_FLAGS",
-    "SVME" },
-  { "CPU_SSE4A_FLAGS",
-    "CPU_SSE3_FLAGS|SSE4a" },
-  { "CPU_ABM_FLAGS",
+  { "SSE3",
+    "SSE2" },
+  { "SSSE3",
+    "SSE3" },
+  { "SSE4_1",
+    "SSSE3" },
+  { "SSE4_2",
+    "SSE4_1|POPCNT" },
+  { "Xsaveopt",
+    "XSAVE" },
+  { "AES",
+    "SSE2" },
+  { "PCLMUL",
+    "SSE2" },
+  { "FMA",
+    "AVX" },
+  { "FMA4",
+    "AVX" },
+  { "XOP",
+    "SSE4A|FMA4" },
+  { "LWP",
+    "XSAVE" },
+  { "F16C",
+    "AVX" },
+  { "3dnow",
+    "MMX" },
+  { "3dnowA",
+    "3dnow" },
+  { "SSE4a",
+    "SSE3" },
+  { "ABM",
     "LZCNT|POPCNT" },
-  { "CPU_AVX_FLAGS",
-    "CPU_SSE4_2_FLAGS|CPU_XSAVE_FLAGS|AVX" },
-  { "CPU_AVX2_FLAGS",
-    "CPU_AVX_FLAGS|AVX2" },
-  { "CPU_AVX_VNNI_FLAGS",
-    "CPU_AVX2_FLAGS|AVX_VNNI" },
-  { "CPU_AVX512F_FLAGS",
-    "CPU_AVX2_FLAGS|AVX512F" },
-  { "CPU_AVX512CD_FLAGS",
-    "CPU_AVX512F_FLAGS|AVX512CD" },
-  { "CPU_AVX512ER_FLAGS",
-    "CPU_AVX512F_FLAGS|AVX512ER" },
-  { "CPU_AVX512PF_FLAGS",
-    "CPU_AVX512F_FLAGS|AVX512PF" },
-  { "CPU_AVX512DQ_FLAGS",
-    "CPU_AVX512F_FLAGS|AVX512DQ" },
-  { "CPU_AVX512BW_FLAGS",
-    "CPU_AVX512F_FLAGS|AVX512BW" },
-  { "CPU_AVX512VL_FLAGS",
-    "CPU_AVX512F_FLAGS|AVX512VL" },
-  { "CPU_AVX512IFMA_FLAGS",
-    "CPU_AVX512F_FLAGS|AVX512IFMA" },
-  { "CPU_AVX512VBMI_FLAGS",
-    "CPU_AVX512F_FLAGS|AVX512VBMI" },
-  { "CPU_AVX512_4FMAPS_FLAGS",
-    "CPU_AVX512F_FLAGS|AVX512_4FMAPS" },
-  { "CPU_AVX512_4VNNIW_FLAGS",
-    "CPU_AVX512F_FLAGS|AVX512_4VNNIW" },
-  { "CPU_AVX512_VPOPCNTDQ_FLAGS",
-    "CPU_AVX512F_FLAGS|AVX512_VPOPCNTDQ" },
-  { "CPU_AVX512_VBMI2_FLAGS",
-    "CPU_AVX512F_FLAGS|AVX512_VBMI2" },
-  { "CPU_AVX512_VNNI_FLAGS",
-    "CPU_AVX512F_FLAGS|AVX512_VNNI" },
-  { "CPU_AVX512_BITALG_FLAGS",
-    "CPU_AVX512F_FLAGS|AVX512_BITALG" },
-  { "CPU_AVX512_BF16_FLAGS",
-    "CPU_AVX512F_FLAGS|AVX512_BF16" },
-  { "CPU_AVX512_FP16_FLAGS",
-    "CPU_AVX512BW_FLAGS|AVX512_FP16" },
-  { "CPU_PREFETCHI_FLAGS",
-    "PREFETCHI"},
-  { "CPU_AVX_IFMA_FLAGS",
-    "CPU_AVX2_FLAGS|AVX_IFMA" },
-  { "CPU_AVX_VNNI_INT8_FLAGS",
-    "CPU_AVX2_FLAGS|AVX_VNNI_INT8" },
-  { "CPU_CMPCCXADD_FLAGS",
-    "CMPCCXADD" },
-  { "CPU_WRMSRNS_FLAGS",
-    "WRMSRNS" },
-  { "CPU_MSRLIST_FLAGS",
-    "MSRLIST" },
-  { "CPU_AVX_NE_CONVERT_FLAGS",
-    "CPU_AVX2_FLAGS|AVX_NE_CONVERT" },
-  { "CPU_RAO_INT_FLAGS",
-    "RAO_INT" },
-  { "CPU_IAMCU_FLAGS",
-    "186|286|386|486|586|IAMCU" },
-  { "CPU_ADX_FLAGS",
-    "ADX" },
-  { "CPU_RDSEED_FLAGS",
-    "RdSeed" },
-  { "CPU_PRFCHW_FLAGS",
-    "PRFCHW" },
-  { "CPU_SMAP_FLAGS",
-    "SMAP" },
-  { "CPU_MPX_FLAGS",
-    "CPU_XSAVE_FLAGS|MPX" },
-  { "CPU_SHA_FLAGS",
-    "CPU_SSE2_FLAGS|SHA" },
-  { "CPU_CLFLUSHOPT_FLAGS",
-    "ClflushOpt" },
-  { "CPU_XSAVES_FLAGS",
-    "CPU_XSAVE_FLAGS|XSAVES" },
-  { "CPU_XSAVEC_FLAGS",
-    "CPU_XSAVE_FLAGS|XSAVEC" },
-  { "CPU_PREFETCHWT1_FLAGS",
-    "PREFETCHWT1" },
-  { "CPU_SE1_FLAGS",
-    "SE1" },
-  { "CPU_CLWB_FLAGS",
-    "CLWB" },
-  { "CPU_CLZERO_FLAGS",
-    "CLZERO" },
-  { "CPU_MWAITX_FLAGS",
-    "MWAITX" },
-  { "CPU_OSPKE_FLAGS",
-    "CPU_XSAVE_FLAGS|OSPKE" },
-  { "CPU_RDPID_FLAGS",
-    "RDPID" },
-  { "CPU_PTWRITE_FLAGS",
-    "PTWRITE" },
-  { "CPU_IBT_FLAGS",
-    "IBT" },
-  { "CPU_SHSTK_FLAGS",
-    "SHSTK" },
-  { "CPU_GFNI_FLAGS",
-    "GFNI" },
-  { "CPU_VAES_FLAGS",
-    "VAES" },
-  { "CPU_VPCLMULQDQ_FLAGS",
-    "VPCLMULQDQ" },
-  { "CPU_WBNOINVD_FLAGS",
-    "WBNOINVD" },
-  { "CPU_PCONFIG_FLAGS",
-    "PCONFIG" },
-  { "CPU_WAITPKG_FLAGS",
-    "WAITPKG" },
-  { "CPU_UINTR_FLAGS",
-    "UINTR" },
-  { "CPU_CLDEMOTE_FLAGS",
-    "CLDEMOTE" },
-  { "CPU_AMX_INT8_FLAGS",
-    "CPU_AMX_TILE_FLAGS|AMX_INT8" },
-  { "CPU_AMX_BF16_FLAGS",
-    "CPU_AMX_TILE_FLAGS|AMX_BF16" },
-  { "CPU_AMX_FP16_FLAGS",
-    "CPU_AMX_TILE_FLAGS|AMX_FP16" },
-  { "CPU_AMX_TILE_FLAGS",
+  { "AVX",
+    "SSE4_2|XSAVE" },
+  { "AVX2",
+    "AVX" },
+  { "AVX_VNNI",
+    "AVX2" },
+  { "AVX_IFMA",
+    "AVX2" },
+  { "AVX_VNNI_INT8",
+    "AVX2" },
+  { "AVX_NE_CONVERT",
+    "AVX2" },
+  { "AVX512F",
+    "AVX2" },
+  { "AVX512CD",
+    "AVX512F" },
+  { "AVX512ER",
+    "AVX512F" },
+  { "AVX512PF",
+    "AVX512F" },
+  { "AVX512DQ",
+    "AVX512F" },
+  { "AVX512BW",
+    "AVX512F" },
+  { "AVX512VL",
+    "AVX512F" },
+  { "AVX512IFMA",
+    "AVX512F" },
+  { "AVX512VBMI",
+    "AVX512F" },
+  { "AVX512_4FMAPS",
+    "AVX512F" },
+  { "AVX512_4VNNIW",
+    "AVX512F" },
+  { "AVX512_VPOPCNTDQ",
+    "AVX512F" },
+  { "AVX512_VBMI2",
+    "AVX512F" },
+  { "AVX512_VNNI",
+    "AVX512F" },
+  { "AVX512_BITALG",
+    "AVX512F" },
+  { "AVX512_VP2INTERSECT",
+    "AVX512F" },
+  { "AVX512_BF16",
+    "AVX512F" },
+  { "AVX512_FP16",
+    "AVX512BW" },
+  { "IAMCU",
+    "586:nofpu" },
+  { "MPX",
+    "XSAVE" },
+  { "SHA",
+    "SSE2" },
+  { "XSAVES",
+    "XSAVE" },
+  { "XSAVEC",
+    "XSAVE" },
+  { "OSPKE",
+    "XSAVE" },
+  { "AMX_INT8",
+    "AMX_TILE" },
+  { "AMX_BF16",
+    "AMX_TILE" },
+  { "AMX_FP16",
     "AMX_TILE" },
-  { "CPU_MOVDIRI_FLAGS",
-    "MOVDIRI" },
-  { "CPU_MOVDIR64B_FLAGS",
-    "MOVDIR64B" },
-  { "CPU_ENQCMD_FLAGS",
-    "ENQCMD" },
-  { "CPU_SERIALIZE_FLAGS",
-    "SERIALIZE" },
-  { "CPU_AVX512_VP2INTERSECT_FLAGS",
-    "AVX512_VP2INTERSECT" },
-  { "CPU_TDX_FLAGS",
-    "TDX" },
-  { "CPU_RDPRU_FLAGS",
-    "RDPRU" },
-  { "CPU_MCOMMIT_FLAGS",
-    "MCOMMIT" },
-  { "CPU_SEV_ES_FLAGS",
-    "SEV_ES" },
-  { "CPU_TSXLDTRK_FLAGS",
-    "TSXLDTRK"},
-  { "CPU_KL_FLAGS",
-    "KL" },
-  { "CPU_WIDEKL_FLAGS",
-    "WideKL" },
-  { "CPU_HRESET_FLAGS",
-    "HRESET"},
-  { "CPU_INVLPGB_FLAGS",
-    "INVLPGB" },
-  { "CPU_TLBSYNC_FLAGS",
-    "TLBSYNC" },
-  { "CPU_SNP_FLAGS",
-    "SNP" },
-  { "CPU_RMPQUERY_FLAGS",
-    "RMPQUERY" },
-  { "CPU_ANY_X87_FLAGS",
-    "CPU_ANY_287_FLAGS|8087" },
-  { "CPU_ANY_287_FLAGS",
-    "CPU_ANY_387_FLAGS|287" },
-  { "CPU_ANY_387_FLAGS",
-    "CPU_ANY_687_FLAGS|387" },
-  { "CPU_ANY_687_FLAGS",
-    "687|FISTTP" },
-  { "CPU_ANY_CMOV_FLAGS",
-    "CMOV" },
-  { "CPU_ANY_FXSR_FLAGS",
-    "FXSR" },
-  { "CPU_ANY_MMX_FLAGS",
-    "CPU_3DNOWA_FLAGS" },
-  { "CPU_ANY_SSE_FLAGS",
-    "CPU_ANY_SSE2_FLAGS|SSE" },
-  { "CPU_ANY_SSE2_FLAGS",
-    "CPU_ANY_SSE3_FLAGS|SSE2" },
-  { "CPU_ANY_SSE3_FLAGS",
-    "CPU_ANY_SSSE3_FLAGS|SSE3|SSE4a" },
-  { "CPU_ANY_SSSE3_FLAGS",
-    "CPU_ANY_SSE4_1_FLAGS|SSSE3" },
-  { "CPU_ANY_SSE4_1_FLAGS",
-    "CPU_ANY_SSE4_2_FLAGS|SSE4_1" },
-  { "CPU_ANY_SSE4_2_FLAGS",
-    "SSE4_2" },
-  { "CPU_ANY_SSE4A_FLAGS",
-    "SSE4a" },
-  { "CPU_ANY_AVX_FLAGS",
-    "CPU_ANY_AVX2_FLAGS|F16C|FMA|FMA4|XOP|AVX" },
-  { "CPU_ANY_AVX2_FLAGS",
-    "CPU_ANY_AVX512F_FLAGS|AVX2|AVX_VNNI|AVX_IFMA|AVX_VNNI_INT8|AVX_NE_CONVERT" },
-  { "CPU_ANY_AVX512F_FLAGS",
-    "AVX512F|AVX512CD|AVX512ER|AVX512PF|AVX512DQ|CPU_ANY_AVX512BW_FLAGS|AVX512VL|AVX512IFMA|AVX512VBMI|AVX512_4FMAPS|AVX512_4VNNIW|AVX512_VPOPCNTDQ|AVX512_VBMI2|AVX512_VNNI|AVX512_BITALG|AVX512_BF16|AVX512_VP2INTERSECT" },
-  { "CPU_ANY_AVX512CD_FLAGS",
-    "AVX512CD" },
-  { "CPU_ANY_AVX512ER_FLAGS",
-    "AVX512ER" },
-  { "CPU_ANY_AVX512PF_FLAGS",
-    "AVX512PF" },
-  { "CPU_ANY_AVX512DQ_FLAGS",
-    "AVX512DQ" },
-  { "CPU_ANY_AVX512BW_FLAGS",
-    "AVX512BW|CPU_ANY_AVX512_FP16_FLAGS" },
-  { "CPU_ANY_AVX512VL_FLAGS",
-    "AVX512VL" },
-  { "CPU_ANY_AVX512IFMA_FLAGS",
-    "AVX512IFMA" },
-  { "CPU_ANY_AVX512VBMI_FLAGS",
-    "AVX512VBMI" },
-  { "CPU_ANY_AVX512_4FMAPS_FLAGS",
-    "AVX512_4FMAPS" },
-  { "CPU_ANY_AVX512_4VNNIW_FLAGS",
-    "AVX512_4VNNIW" },
-  { "CPU_ANY_AVX512_VPOPCNTDQ_FLAGS",
-    "AVX512_VPOPCNTDQ" },
-  { "CPU_ANY_IBT_FLAGS",
-    "IBT" },
-  { "CPU_ANY_SHSTK_FLAGS",
-    "SHSTK" },
-  { "CPU_ANY_AVX512_VBMI2_FLAGS",
-    "AVX512_VBMI2" },
-  { "CPU_ANY_AVX512_VNNI_FLAGS",
-    "AVX512_VNNI" },
-  { "CPU_ANY_AVX512_BITALG_FLAGS",
-    "AVX512_BITALG" },
-  { "CPU_ANY_AVX512_BF16_FLAGS",
-    "AVX512_BF16" },
-  { "CPU_ANY_AMX_INT8_FLAGS",
-    "AMX_INT8" },
-  { "CPU_ANY_AMX_BF16_FLAGS",
-    "AMX_BF16" },
-  { "CPU_ANY_AMX_TILE_FLAGS",
-    "AMX_TILE|AMX_INT8|AMX_BF16|AMX_FP16" },
-  { "CPU_ANY_AVX_VNNI_FLAGS",
-    "AVX_VNNI" },
-  { "CPU_ANY_MOVDIRI_FLAGS",
-    "MOVDIRI" },
-  { "CPU_ANY_UINTR_FLAGS",
-    "UINTR" },
-  { "CPU_ANY_MOVDIR64B_FLAGS",
-    "MOVDIR64B" },
-  { "CPU_ANY_ENQCMD_FLAGS",
-    "ENQCMD" },
-  { "CPU_ANY_SERIALIZE_FLAGS",
-    "SERIALIZE" },
-  { "CPU_ANY_AVX512_VP2INTERSECT_FLAGS",
-    "AVX512_VP2INTERSECT" },
-  { "CPU_ANY_TDX_FLAGS",
-    "TDX" },
-  { "CPU_ANY_TSXLDTRK_FLAGS",
-    "TSXLDTRK" },
-  { "CPU_ANY_KL_FLAGS",
-    "KL|WideKL" },
-  { "CPU_ANY_WIDEKL_FLAGS",
-    "WideKL" },
-  { "CPU_ANY_HRESET_FLAGS",
-    "HRESET" },
-  { "CPU_ANY_AVX512_FP16_FLAGS",
-    "AVX512_FP16" },
-  { "CPU_ANY_AVX_IFMA_FLAGS",
-    "AVX_IFMA" },
-  { "CPU_ANY_AVX_VNNI_INT8_FLAGS",
-    "AVX_VNNI_INT8" },
-  { "CPU_ANY_CMPCCXADD_FLAGS",
-    "CMPCCXADD" },
-  { "CPU_ANY_WRMSRNS_FLAGS",
-    "WRMSRNS" },
-  { "CPU_ANY_MSRLIST_FLAGS",
-    "MSRLIST" },
-  { "CPU_ANY_AVX_NE_CONVERT_FLAGS",
-    "AVX_NE_CONVERT" },
-  { "CPU_ANY_RAO_INT_FLAGS",
-    "RAO_INT"},
 };
 
+/* This array is populated as process_i386_initializers() walks cpu_flags[].  */
+static unsigned char isa_reverse_deps[Cpu64][Cpu64];
+
 typedef struct bitfield
 {
   int position;
@@ -867,32 +614,6 @@ next_field (char *str, char sep, char **
 
 static void set_bitfield (char *, bitfield *, int, unsigned int, int);
 
-static int
-set_bitfield_from_cpu_flag_init (char *f, bitfield *array, unsigned int size,
-				 int lineno)
-{
-  char *str, *next, *last;
-  unsigned int i;
-
-  for (i = 0; i < ARRAY_SIZE (cpu_flag_init); i++)
-    if (strcmp (cpu_flag_init[i].name, f) == 0)
-      {
-	/* Turn on selective bits.  */
-	char *init = xstrdup (cpu_flag_init[i].init);
-	last = init + strlen (init);
-	for (next = init; next && next < last; )
-	  {
-	    str = next_field (next, '|', &next, last);
-	    if (str)
-	      set_bitfield (str, array, 1, size, lineno);
-	  }
-	free (init);
-	return 0;
-      }
-
-  return -1;
-}
-
 static void
 set_bitfield (char *f, bitfield *array, int value,
 	      unsigned int size, int lineno)
@@ -933,10 +654,6 @@ set_bitfield (char *f, bitfield *array,
 	}
     }
 
-  /* Handle CPU_XXX_FLAGS.  */
-  if (value == 1 && !set_bitfield_from_cpu_flag_init (f, array, size, lineno))
-    return;
-
   if (lineno != -1)
     fail (_("%s: %d: unknown bitfield: %s\n"), filename, lineno, f);
   else
@@ -944,6 +661,73 @@ set_bitfield (char *f, bitfield *array,
 }
 
 static void
+add_isa_dependencies (bitfield *flags, const char *f, int value,
+		      unsigned int reverse)
+{
+  unsigned int i;
+  char *str = NULL;
+  const char *isa = f;
+  bool is_isa = false, is_avx = false;
+
+  /* Need to find base entry for references to auxiliary ones.  */
+  if (strchr (f, ':'))
+    {
+      str = xstrdup (f);
+      *strchr (str, ':') = '\0';
+      isa = str;
+    }
+  for (i = 0; i < Cpu64; ++i)
+    if (strcasecmp (flags[i].name, isa) == 0)
+      {
+	flags[i].value = value;
+	if (reverse < ARRAY_SIZE (isa_reverse_deps[0])
+	    /* Don't record the feature itself here.  */
+	    && reverse != i
+	    /* Don't record base architectures.  */
+	    && reverse > Cpu686)
+	  isa_reverse_deps[i][reverse] = 1;
+	is_isa = true;
+	if (i == CpuAVX || i == CpuXOP)
+	  is_avx = true;
+	break;
+      }
+  free (str);
+
+  /* Do not turn off dependencies.  */
+  if (is_isa && !value)
+    return;
+
+  for (i = 0; i < ARRAY_SIZE (isa_dependencies); ++i)
+    if (strcasecmp (isa_dependencies[i].name, f) == 0)
+      {
+	char *deps = xstrdup (isa_dependencies[i].deps);
+	char *next = deps;
+	char *last = deps + strlen (deps);
+
+	for (; next && next < last; )
+	  {
+	    char *str = next_field (next, '|', &next, last);
+
+	    /* No AVX/XOP -> SSE reverse dependencies.  */
+	    if (is_avx && strncmp (str, "SSE", 3) == 0)
+	      add_isa_dependencies (flags, str, value, CpuMax);
+	    else
+	      add_isa_dependencies (flags, str, value, reverse);
+	  }
+	free (deps);
+
+	/* ISA extensions with dependencies need CPU_ANY_*_FLAGS emitted.  */
+	if (reverse < ARRAY_SIZE (isa_reverse_deps[0]))
+	  isa_reverse_deps[reverse][reverse] = 1;
+
+	return;
+      }
+
+  if (!is_isa)
+    fail (_("unknown bitfield: %s\n"), f);
+}
+
+static void
 output_cpu_flags (FILE *table, bitfield *flags, unsigned int size,
 		  int macro, const char *comma, const char *indent)
 {
@@ -975,18 +759,27 @@ output_cpu_flags (FILE *table, bitfield
 }
 
 static void
-process_i386_cpu_flag (FILE *table, char *flag, int macro,
+process_i386_cpu_flag (FILE *table, char *flag,
+		       const char *name,
 		       const char *comma, const char *indent,
-		       int lineno)
+		       int lineno, unsigned int reverse)
 {
   char *str, *next = flag, *last;
   unsigned int i;
   int value = 1;
+  bool is_isa = false;
   bitfield flags [ARRAY_SIZE (cpu_flags)];
 
   /* Copy the default cpu flags.  */
   memcpy (flags, cpu_flags, sizeof (cpu_flags));
 
+  if (flag == NULL)
+    {
+      for (i = 0; i < ARRAY_SIZE (isa_reverse_deps[0]); ++i)
+	flags[i].value = isa_reverse_deps[reverse][i];
+      goto output;
+    }
+
   if (flag[0] == '~')
     {
       last = flag + strlen (flag);
@@ -1013,19 +806,54 @@ process_i386_cpu_flag (FILE *table, char
       value = 0;
     }
 
+  if (name != NULL && value != 0)
+    {
+      for (i = 0; i < ARRAY_SIZE (flags); i++)
+	if (strcasecmp (flags[i].name, name) == 0)
+	  {
+	    add_isa_dependencies (flags, name, 1, reverse);
+	    is_isa = true;
+	    break;
+	  }
+    }
+
   if (strcmp (flag, "0"))
     {
+      if (is_isa)
+	return;
+
       /* Turn on/off selective bits.  */
       last = flag + strlen (flag);
       for (; next && next < last; )
 	{
 	  str = next_field (next, '|', &next, last);
-	  if (str)
+	  if (name == NULL)
 	    set_bitfield (str, flags, value, ARRAY_SIZE (flags), lineno);
+	  else
+	    add_isa_dependencies (flags, str, value, reverse);
 	}
     }
 
-  output_cpu_flags (table, flags, ARRAY_SIZE (flags), macro,
+ output:
+  if (name != NULL)
+    {
+      size_t len = strlen (name);
+      char *upper = xmalloc (len + 1);
+
+      for (i = 0; i < len; ++i)
+	{
+	  /* Don't emit #define-s for auxiliary entries.  */
+	  if (name[i] == ':')
+	    return;
+	  upper[i] = TOUPPER (name[i]);
+	}
+      upper[i] = '\0';
+      fprintf (table, "\n#define CPU_%s%s_FLAGS \\\n",
+	       flag != NULL ? "": "ANY_", upper);
+      free (upper);
+    }
+
+  output_cpu_flags (table, flags, ARRAY_SIZE (flags), name != NULL,
 		    comma, indent);
 }
 
@@ -1396,7 +1224,7 @@ output_i386_opcode (FILE *table, const c
   process_i386_opcode_modifier (table, opcode_modifier, space, prefix,
 				operand_types, lineno);
 
-  process_i386_cpu_flag (table, cpu_flags, 0, ",", "    ", lineno);
+  process_i386_cpu_flag (table, cpu_flags, NULL, ",", "    ", lineno, CpuMax);
 
   fprintf (table, "    { ");
 
@@ -1935,7 +1763,6 @@ process_i386_initializers (void)
 {
   unsigned int i;
   FILE *fp = fopen ("i386-init.h", "w");
-  char *init;
 
   if (fp == NULL)
     fail (_("can't create i386-init.h, errno = %s\n"),
@@ -1943,12 +1770,44 @@ process_i386_initializers (void)
 
   process_copyright (fp);
 
-  for (i = 0; i < ARRAY_SIZE (cpu_flag_init); i++)
+  for (i = 0; i < Cpu64; i++)
+    process_i386_cpu_flag (fp, "0", cpu_flags[i].name, "", "  ", -1, i);
+
+  for (i = 0; i < ARRAY_SIZE (isa_dependencies); i++)
     {
-      fprintf (fp, "\n#define %s \\\n", cpu_flag_init[i].name);
-      init = xstrdup (cpu_flag_init[i].init);
-      process_i386_cpu_flag (fp, init, 1, "", "  ", -1);
-      free (init);
+      char *deps = xstrdup (isa_dependencies[i].deps);
+
+      process_i386_cpu_flag (fp, deps, isa_dependencies[i].name,
+			     "", "  ", -1, CpuMax);
+      free (deps);
+    }
+
+  /* Early x87 is somewhat special: Both 287 and 387 not only add new insns
+     but also remove some.  Hence 8087 isn't a prereq to 287, and 287 isn't
+     one to 387.  We want the reverse to be true though: Disabling 8087 also
+     is to disable 287+ and later; disabling 287 also means disabling 387+.  */
+  memcpy (isa_reverse_deps[Cpu287], isa_reverse_deps[Cpu387],
+          sizeof (isa_reverse_deps[0]));
+  isa_reverse_deps[Cpu287][Cpu387] = 1;
+  memcpy (isa_reverse_deps[Cpu8087], isa_reverse_deps[Cpu287],
+          sizeof (isa_reverse_deps[0]));
+  isa_reverse_deps[Cpu8087][Cpu287] = 1;
+
+  /* While we treat POPCNT as a prereq to SSE4.2, its disabling should not
+     lead to disabling of anything else.  */
+  memset (isa_reverse_deps[CpuPOPCNT], 0, sizeof (isa_reverse_deps[0]));
+
+  for (i = Cpu686 + 1; i < ARRAY_SIZE (isa_reverse_deps); i++)
+    {
+      size_t len;
+      char *upper;
+
+      if (memchr(isa_reverse_deps[i], 1,
+	  ARRAY_SIZE (isa_reverse_deps[0])) == NULL)
+	continue;
+
+      isa_reverse_deps[i][i] = 1;
+      process_i386_cpu_flag (fp, NULL, cpu_flags[i].name, "", "  ", -1, i);
     }
 
   fprintf (fp, "\n");



* [PATCH 02/10] x86: correct what gets disabled by certain ".arch .no*"
  2022-12-19  8:31 [PATCH 00/10] x86: re-work ISA extension dependency handling Jan Beulich
  2022-12-19 10:44 ` [PATCH 01/10] " Jan Beulich
@ 2022-12-19 10:45 ` Jan Beulich
  2022-12-19 10:45 ` [PATCH 03/10] x86: correct SSE dependencies Jan Beulich
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Jan Beulich @ 2022-12-19 10:45 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

Features with prereqs as well as features with dependents cannot re-use
CPU_*_MASK for the 3rd argument of SUBARCH() - they need to use
CPU_ANY_*_MASK in order to avoid disabling too many (when there are
prereqs) and/or too few (when there are dependents) features.

Generally any CPU_ANY_*_MASK which exists should not remain unused.
Exceptions are
- FISTTP which has no corresponding entry in cpu_arch[],
- IAMCU which is a base architecture and hence uses ARCH(), not
  SUBARCH() (only extensions can be disabled, but unlike for Cpu*86 it
  would be a little more clumsy to suppress generating of the #define).

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -1029,21 +1029,21 @@ static const arch_entry cpu_arch[] =
   SUBARCH (vmx, VMX, VMX, false),
   SUBARCH (vmfunc, VMFUNC, VMFUNC, false),
   SUBARCH (smx, SMX, SMX, false),
-  SUBARCH (xsave, XSAVE, XSAVE, false),
-  SUBARCH (xsaveopt, XSAVEOPT, XSAVEOPT, false),
-  SUBARCH (xsavec, XSAVEC, XSAVEC, false),
-  SUBARCH (xsaves, XSAVES, XSAVES, false),
-  SUBARCH (aes, AES, AES, false),
-  SUBARCH (pclmul, PCLMUL, PCLMUL, false),
-  SUBARCH (clmul, PCLMUL, PCLMUL, true),
+  SUBARCH (xsave, XSAVE, ANY_XSAVE, false),
+  SUBARCH (xsaveopt, XSAVEOPT, ANY_XSAVEOPT, false),
+  SUBARCH (xsavec, XSAVEC, ANY_XSAVEC, false),
+  SUBARCH (xsaves, XSAVES, ANY_XSAVES, false),
+  SUBARCH (aes, AES, ANY_AES, false),
+  SUBARCH (pclmul, PCLMUL, ANY_PCLMUL, false),
+  SUBARCH (clmul, PCLMUL, ANY_PCLMUL, true),
   SUBARCH (fsgsbase, FSGSBASE, FSGSBASE, false),
   SUBARCH (rdrnd, RDRND, RDRND, false),
-  SUBARCH (f16c, F16C, F16C, false),
+  SUBARCH (f16c, F16C, ANY_F16C, false),
   SUBARCH (bmi2, BMI2, BMI2, false),
-  SUBARCH (fma, FMA, FMA, false),
-  SUBARCH (fma4, FMA4, FMA4, false),
-  SUBARCH (xop, XOP, XOP, false),
-  SUBARCH (lwp, LWP, LWP, false),
+  SUBARCH (fma, FMA, ANY_FMA, false),
+  SUBARCH (fma4, FMA4, ANY_FMA4, false),
+  SUBARCH (xop, XOP, ANY_XOP, false),
+  SUBARCH (lwp, LWP, ANY_LWP, false),
   SUBARCH (movbe, MOVBE, MOVBE, false),
   SUBARCH (cx16, CX16, CX16, false),
   SUBARCH (ept, EPT, EPT, false),
@@ -1056,8 +1056,8 @@ static const arch_entry cpu_arch[] =
   SUBARCH (nop, NOP, NOP, false),
   SUBARCH (syscall, SYSCALL, SYSCALL, false),
   SUBARCH (rdtscp, RDTSCP, RDTSCP, false),
-  SUBARCH (3dnow, 3DNOW, 3DNOW, false),
-  SUBARCH (3dnowa, 3DNOWA, 3DNOWA, false),
+  SUBARCH (3dnow, 3DNOW, ANY_3DNOW, false),
+  SUBARCH (3dnowa, 3DNOWA, ANY_3DNOWA, false),
   SUBARCH (padlock, PADLOCK, PADLOCK, false),
   SUBARCH (pacifica, SVME, SVME, true),
   SUBARCH (svme, SVME, SVME, false),
@@ -1068,8 +1068,8 @@ static const arch_entry cpu_arch[] =
   SUBARCH (rdseed, RDSEED, RDSEED, false),
   SUBARCH (prfchw, PRFCHW, PRFCHW, false),
   SUBARCH (smap, SMAP, SMAP, false),
-  SUBARCH (mpx, MPX, MPX, false),
-  SUBARCH (sha, SHA, SHA, false),
+  SUBARCH (mpx, MPX, ANY_MPX, false),
+  SUBARCH (sha, SHA, ANY_SHA, false),
   SUBARCH (clflushopt, CLFLUSHOPT, CLFLUSHOPT, false),
   SUBARCH (prefetchwt1, PREFETCHWT1, PREFETCHWT1, false),
   SUBARCH (se1, SE1, SE1, false),
@@ -1085,7 +1085,7 @@ static const arch_entry cpu_arch[] =
   SUBARCH (avx_vnni, AVX_VNNI, ANY_AVX_VNNI, false),
   SUBARCH (clzero, CLZERO, CLZERO, false),
   SUBARCH (mwaitx, MWAITX, MWAITX, false),
-  SUBARCH (ospke, OSPKE, OSPKE, false),
+  SUBARCH (ospke, OSPKE, ANY_OSPKE, false),
   SUBARCH (rdpid, RDPID, RDPID, false),
   SUBARCH (ptwrite, PTWRITE, PTWRITE, false),
   SUBARCH (ibt, IBT, IBT, false),
@@ -1099,7 +1099,7 @@ static const arch_entry cpu_arch[] =
   SUBARCH (cldemote, CLDEMOTE, CLDEMOTE, false),
   SUBARCH (amx_int8, AMX_INT8, ANY_AMX_INT8, false),
   SUBARCH (amx_bf16, AMX_BF16, ANY_AMX_BF16, false),
-  SUBARCH (amx_fp16, AMX_FP16, AMX_FP16, false),
+  SUBARCH (amx_fp16, AMX_FP16, ANY_AMX_FP16, false),
   SUBARCH (amx_tile, AMX_TILE, ANY_AMX_TILE, false),
   SUBARCH (movdiri, MOVDIRI, MOVDIRI, false),
   SUBARCH (movdir64b, MOVDIR64B, MOVDIR64B, false),



* [PATCH 03/10] x86: correct SSE dependencies
  2022-12-19  8:31 [PATCH 00/10] x86: re-work ISA extension dependency handling Jan Beulich
  2022-12-19 10:44 ` [PATCH 01/10] " Jan Beulich
  2022-12-19 10:45 ` [PATCH 02/10] x86: correct what gets disabled by certain ".arch .no*" Jan Beulich
@ 2022-12-19 10:45 ` Jan Beulich
  2022-12-19 10:45 ` [PATCH 04/10] x86: add dependencies on AVX2 Jan Beulich
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Jan Beulich @ 2022-12-19 10:45 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

SSE itself takes FXSR as a prereq. Like AES, PCLMUL, and SHA, both GFNI
and KL take SSE2 as a prereq, for operating on packed integers. And
while correcting KL, also record it as a prereq to WIDEKL.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -1007,7 +1007,7 @@ static const arch_entry cpu_arch[] =
   SUBARCH (387, 387, ANY_387, false),
   SUBARCH (687, 687, ANY_687, false),
   SUBARCH (cmov, CMOV, CMOV, false),
-  SUBARCH (fxsr, FXSR, FXSR, false),
+  SUBARCH (fxsr, FXSR, ANY_FXSR, false),
   SUBARCH (mmx, MMX, ANY_MMX, false),
   SUBARCH (sse, SSE, ANY_SSE, false),
   SUBARCH (sse2, SSE2, ANY_SSE2, false),
@@ -1090,7 +1090,7 @@ static const arch_entry cpu_arch[] =
   SUBARCH (ptwrite, PTWRITE, PTWRITE, false),
   SUBARCH (ibt, IBT, IBT, false),
   SUBARCH (shstk, SHSTK, SHSTK, false),
-  SUBARCH (gfni, GFNI, GFNI, false),
+  SUBARCH (gfni, GFNI, ANY_GFNI, false),
   SUBARCH (vaes, VAES, VAES, false),
   SUBARCH (vpclmulqdq, VPCLMULQDQ, VPCLMULQDQ, false),
   SUBARCH (wbnoinvd, WBNOINVD, WBNOINVD, false),
@@ -1113,8 +1113,8 @@ static const arch_entry cpu_arch[] =
   SUBARCH (mcommit, MCOMMIT, MCOMMIT, false),
   SUBARCH (sev_es, SEV_ES, SEV_ES, false),
   SUBARCH (tsxldtrk, TSXLDTRK, TSXLDTRK, false),
-  SUBARCH (kl, KL, KL, false),
-  SUBARCH (widekl, WIDEKL, WIDEKL, false),
+  SUBARCH (kl, KL, ANY_KL, false),
+  SUBARCH (widekl, WIDEKL, ANY_WIDEKL, false),
   SUBARCH (uintr, UINTR, UINTR, false),
   SUBARCH (hreset, HRESET, HRESET, false),
   SUBARCH (avx512_fp16, AVX512_FP16, ANY_AVX512_FP16, false),
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -119,6 +119,8 @@ static const dependency isa_dependencies
     "387" },
   { "FISTTP",
     "687" },
+  { "SSE",
+    "FXSR" },
   { "SSE2",
     "SSE" },
   { "SSE3",
@@ -213,12 +215,18 @@ static const dependency isa_dependencies
     "XSAVE" },
   { "OSPKE",
     "XSAVE" },
+  { "GFNI",
+    "SSE2" },
   { "AMX_INT8",
     "AMX_TILE" },
   { "AMX_BF16",
     "AMX_TILE" },
   { "AMX_FP16",
     "AMX_TILE" },
+  { "KL",
+    "SSE2" },
+  { "WIDEKL",
+    "KL" },
 };
 
 /* This array is populated as process_i386_initializers() walks cpu_flags[].  */



* [PATCH 04/10] x86: add dependencies on AVX2
  2022-12-19  8:31 [PATCH 00/10] x86: re-work ISA extension dependency handling Jan Beulich
                   ` (2 preceding siblings ...)
  2022-12-19 10:45 ` [PATCH 03/10] x86: correct SSE dependencies Jan Beulich
@ 2022-12-19 10:45 ` Jan Beulich
  2022-12-19 10:46 ` [PATCH 05/10] x86: rework noavx512-1 testcase Jan Beulich
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Jan Beulich @ 2022-12-19 10:45 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

Like AVX-VNNI, both VAES and VPCLMULQDQ take AVX2 as a prereq, for
operating on up to 256-bit packed integer vectors.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -1091,8 +1091,8 @@ static const arch_entry cpu_arch[] =
   SUBARCH (ibt, IBT, IBT, false),
   SUBARCH (shstk, SHSTK, SHSTK, false),
   SUBARCH (gfni, GFNI, ANY_GFNI, false),
-  SUBARCH (vaes, VAES, VAES, false),
-  SUBARCH (vpclmulqdq, VPCLMULQDQ, VPCLMULQDQ, false),
+  SUBARCH (vaes, VAES, ANY_VAES, false),
+  SUBARCH (vpclmulqdq, VPCLMULQDQ, ANY_VPCLMULQDQ, false),
   SUBARCH (wbnoinvd, WBNOINVD, WBNOINVD, false),
   SUBARCH (pconfig, PCONFIG, PCONFIG, false),
   SUBARCH (waitpkg, WAITPKG, WAITPKG, false),
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -217,6 +217,10 @@ static const dependency isa_dependencies
     "XSAVE" },
   { "GFNI",
     "SSE2" },
+  { "VAES",
+    "AVX2" },
+  { "VPCLMULQDQ",
+    "AVX2" },
   { "AMX_INT8",
     "AMX_TILE" },
   { "AMX_BF16",



* [PATCH 05/10] x86: rework noavx512-1 testcase
  2022-12-19  8:31 [PATCH 00/10] x86: re-work ISA extension dependency handling Jan Beulich
                   ` (3 preceding siblings ...)
  2022-12-19 10:45 ` [PATCH 04/10] x86: add dependencies on AVX2 Jan Beulich
@ 2022-12-19 10:46 ` Jan Beulich
  2022-12-19 10:46 ` [PATCH 06/10] x86: correct dependencies of a few AVX512 sub-features Jan Beulich
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Jan Beulich @ 2022-12-19 10:46 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

So far the set of ".noavx512*" has been accumulating, which isn't ideal.
In particular this hides issues with dependencies between features.
Switch back to the default ISA before disabling a particular subset.
Furthermore limit redundancy by wrapping the repeated block of insns in
an .irp.

--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -233,7 +233,7 @@ if [gas_32_check] then {
     run_list_test "noavx-2" "-march=+noavx -al"
     run_list_test "noavx-3" "-al"
     run_dump_test "noavx-4"
-    run_list_test "noavx512-1" "-al"
+    run_list_test "noavx512-1" "-almn"
     run_list_test "noavx512-2" "-al"
     run_dump_test "noextreg"
     run_dump_test "xmmhi32"
--- a/gas/testsuite/gas/i386/noavx512-1.l
+++ b/gas/testsuite/gas/i386/noavx512-1.l
@@ -1,416 +1,413 @@
 .*: Assembler messages:
-.*:25: Error: .*operand size mismatch.*
-.*:26: Error: .*unsupported masking.*
-.*:27: Error: .*unsupported masking.*
-.*:47: Error: .*operand size mismatch.*
-.*:48: Error: .*unsupported masking.*
-.*:49: Error: .*unsupported masking.*
-.*:50: Error: .*not supported.*
-.*:51: Error: .*not supported.*
-.*:52: Error: .*not supported.*
-.*:69: Error: .*operand size mismatch.*
-.*:70: Error: .*unsupported masking.*
-.*:71: Error: .*unsupported masking.*
-.*:72: Error: .*not supported.*
-.*:73: Error: .*not supported.*
-.*:74: Error: .*not supported.*
-.*:75: Error: .*not supported.*
-.*:76: Error: .*not supported.*
-.*:77: Error: .*not supported.*
-.*:91: Error: .*operand size mismatch.*
-.*:92: Error: .*unsupported masking.*
-.*:93: Error: .*unsupported masking.*
-.*:94: Error: .*not supported.*
-.*:95: Error: .*not supported.*
-.*:96: Error: .*not supported.*
-.*:97: Error: .*not supported.*
-.*:98: Error: .*not supported.*
-.*:99: Error: .*not supported.*
-.*:100: Error: .*not supported.*
-.*:113: Error: .*operand size mismatch.*
-.*:114: Error: .*unsupported masking.*
-.*:115: Error: .*unsupported masking.*
-.*:116: Error: .*not supported.*
-.*:117: Error: .*not supported.*
-.*:118: Error: .*not supported.*
-.*:119: Error: .*not supported.*
-.*:120: Error: .*not supported.*
-.*:121: Error: .*not supported.*
-.*:122: Error: .*not supported.*
-.*:126: Error: .*operand .*
-.*:127: Error: .*unsupported .*
-.*:128: Error: .*unsupported .*
-.*:135: Error: .*operand size mismatch.*
-.*:136: Error: .*unsupported masking.*
-.*:137: Error: .*unsupported masking.*
-.*:138: Error: .*not supported.*
-.*:139: Error: .*not supported.*
-.*:140: Error: .*not supported.*
-.*:141: Error: .*not supported.*
-.*:142: Error: .*not supported.*
-.*:143: Error: .*not supported.*
-.*:144: Error: .*not supported.*
-.*:148: Error: .*operand .*
-.*:149: Error: .*unsupported .*
-.*:150: Error: .*unsupported .*
-.*:151: Error: .*not supported.*
-.*:157: Error: .*operand size mismatch.*
-.*:158: Error: .*unsupported masking.*
-.*:159: Error: .*unsupported masking.*
-.*:160: Error: .*not supported.*
-.*:161: Error: .*not supported.*
-.*:162: Error: .*not supported.*
-.*:163: Error: .*not supported.*
-.*:164: Error: .*not supported.*
-.*:165: Error: .*not supported.*
-.*:166: Error: .*not supported.*
-.*:170: Error: .*operand .*
-.*:171: Error: .*unsupported .*
-.*:172: Error: .*unsupported .*
-.*:173: Error: .*not supported.*
-.*:174: Error: .*not supported.*
-.*:175: Error: .*not supported.*
-.*:176: Error: .*not supported.*
-.*:179: Error: .*bad register name.*
-.*:180: Error: .*unknown vector operation.*
-.*:181: Error: .*unknown vector operation.*
-.*:182: Error: .*not supported.*
-.*:183: Error: .*not supported.*
-.*:184: Error: .*not supported.*
-.*:185: Error: .*not supported.*
-.*:186: Error: .*not supported.*
-.*:187: Error: .*not supported.*
-.*:188: Error: .*not supported.*
-.*:189: Error: .*bad register name.*
-.*:190: Error: .*unknown vector operation.*
-.*:191: Error: .*unknown vector operation.*
-.*:192: Error: .*bad register name.*
-.*:193: Error: .*unknown vector operation.*
-.*:194: Error: .*unknown vector operation.*
-.*:195: Error: .*not supported.*
-.*:196: Error: .*not supported.*
-.*:197: Error: .*not supported.*
-.*:198: Error: .*not supported.*
-GAS LISTING .*
-#...
-[ 	]*1[ 	]+\# Test \.arch \.noavx512XX
-[ 	]*2[ 	]+\.text
-[ 	]*3[ 	]+\?\?\?\? 62F27D4F 		vpabsb %zmm5, %zmm6\{%k7\}		\# AVX512BW
-[ 	]*3[ 	]+1CF5
-[ 	]*4[ 	]+\?\?\?\? 62F27D0F 		vpabsb %xmm5, %xmm6\{%k7\}		\# AVX512BW \+ AVX512VL
-[ 	]*4[ 	]+1CF5
-[ 	]*5[ 	]+\?\?\?\? 62F27D2F 		vpabsb %ymm5, %ymm6\{%k7\}		\# AVX512BW \+ AVX512VL
-[ 	]*5[ 	]+1CF5
-[ 	]*6[ 	]+\?\?\?\? 62F27D48 		vpconflictd %zmm5, %zmm6		\# AVX412CD
-[ 	]*6[ 	]+C4F5
-[ 	]*7[ 	]+\?\?\?\? 62F27D08 		vpconflictd %xmm5, %xmm6		\# AVX412CD \+ AVX512VL
-[ 	]*7[ 	]+C4F5
-[ 	]*8[ 	]+\?\?\?\? 62F27D28 		vpconflictd %ymm5, %ymm6		\# AVX412CD \+ AVX512VL
-[ 	]*8[ 	]+C4F5
-[ 	]*9[ 	]+\?\?\?\? 62F1FD4F 		vcvtpd2qq \(%ecx\), %zmm6\{%k7\}		\# AVX512DQ
-[ 	]*9[ 	]+7B31
-[ 	]*10[ 	]+\?\?\?\? 62F1FD0F 		vcvtpd2qq \(%ecx\), %xmm6\{%k7\}		\# AVX512DQ \+ AVX512VL
-[ 	]*10[ 	]+7B31
-[ 	]*11[ 	]+\?\?\?\? 62F1FD2F 		vcvtpd2qq \(%ecx\), %ymm6\{%k7\}		\# AVX512DQ \+ AVX512VL
-[ 	]*11[ 	]+7B31
-[ 	]*12[ 	]+\?\?\?\? 62F27D4F 		vexp2ps %zmm5, %zmm6\{%k7\}		\# AVX512ER
-[ 	]*12[ 	]+C8F5
-[ 	]*13[ 	]+\?\?\?\? 62F1D54F 		vaddpd %zmm4, %zmm5, %zmm6\{%k7\}		\# AVX512F
-[ 	]*13[ 	]+58F4
-[ 	]*14[ 	]+\?\?\?\? 62F1D50F 		vaddpd %xmm4, %xmm5, %xmm6\{%k7\}		\# AVX512F \+ AVX512VL
-[ 	]*14[ 	]+58F4
-[ 	]*15[ 	]+\?\?\?\? 62F1D52F 		vaddpd %ymm4, %ymm5, %ymm6\{%k7\}		\# AVX512F \+ AVX512VL
-[ 	]*15[ 	]+58F4
-[ 	]*16[ 	]+\?\?\?\? 62F2D54F 		vpmadd52luq %zmm4, %zmm5, %zmm6\{%k7\}	\# AVX512IFMA
-[ 	]*16[ 	]+B4F4
-[ 	]*17[ 	]+\?\?\?\? 62F2D50F 		vpmadd52luq %xmm4, %xmm5, %xmm6\{%k7\}	\# AVX512IFMA \+ AVX512VL
-[ 	]*17[ 	]+B4F4
-[ 	]*18[ 	]+\?\?\?\? 62F2D52F 		vpmadd52luq %ymm4, %ymm5, %ymm6\{%k7\}	\# AVX512IFMA \+ AVX512VL
-[ 	]*18[ 	]+B4F4
-[ 	]*19[ 	]+\?\?\?\? 62F2FD49 		vgatherpf0dpd 23\(%ebp,%ymm7,8\)\{%k1\}	\# AVX512PF
-[ 	]*19[ 	]+C68CFD17 
-[ 	]*19[ 	]+000000
-[ 	]*20[ 	]+\?\?\?\? 62F2554F 		vpermb %zmm4, %zmm5, %zmm6\{%k7\}		\# AVX512VBMI
-[ 	]*20[ 	]+8DF4
-[ 	]*21[ 	]+\?\?\?\? 62F2550F 		vpermb %xmm4, %xmm5, %xmm6\{%k7\}		\# AVX512VBMI \+ AVX512VL
-[ 	]*21[ 	]+8DF4
-[ 	]*22[ 	]+\?\?\?\? 62F2552F 		vpermb %ymm4, %ymm5, %ymm6\{%k7\}		\# AVX512VBMI \+ AVX512VL
-[ 	]*22[ 	]+8DF4
-[ 	]*23[ 	]+
-[ 	]*24[ 	]+\.arch \.noavx512bw
-[ 	]*25[ 	]+vpabsb %zmm5, %zmm6\{%k7\}		\# AVX512BW
-[ 	]*26[ 	]+vpabsb %xmm5, %xmm6\{%k7\}		\# AVX512BW \+ AVX512VL
-[ 	]*27[ 	]+vpabsb %ymm5, %ymm6\{%k7\}		\# AVX512BW \+ AVX512VL
-[ 	]*28[ 	]+\?\?\?\? 62F27D48 		vpconflictd %zmm5, %zmm6		\# AVX412CD
-[ 	]*28[ 	]+C4F5
-[ 	]*29[ 	]+\?\?\?\? 62F27D08 		vpconflictd %xmm5, %xmm6		\# AVX412CD \+ AVX512VL
-[ 	]*29[ 	]+C4F5
-[ 	]*30[ 	]+\?\?\?\? 62F27D28 		vpconflictd %ymm5, %ymm6		\# AVX412CD \+ AVX512VL
-[ 	]*30[ 	]+C4F5
-[ 	]*31[ 	]+\?\?\?\? 62F1FD4F 		vcvtpd2qq \(%ecx\), %zmm6\{%k7\}		\# AVX512DQ
-[ 	]*31[ 	]+7B31
-[ 	]*32[ 	]+\?\?\?\? 62F1FD0F 		vcvtpd2qq \(%ecx\), %xmm6\{%k7\}		\# AVX512DQ \+ AVX512VL
-\fGAS LISTING .*
-
-
-[ 	]*32[ 	]+7B31
-[ 	]*33[ 	]+\?\?\?\? 62F1FD2F 		vcvtpd2qq \(%ecx\), %ymm6\{%k7\}		\# AVX512DQ \+ AVX512VL
-[ 	]*33[ 	]+7B31
-[ 	]*34[ 	]+\?\?\?\? 62F27D4F 		vexp2ps %zmm5, %zmm6\{%k7\}		\# AVX512ER
-[ 	]*34[ 	]+C8F5
-[ 	]*35[ 	]+\?\?\?\? 62F1D54F 		vaddpd %zmm4, %zmm5, %zmm6\{%k7\}		\# AVX512F
-[ 	]*35[ 	]+58F4
-[ 	]*36[ 	]+\?\?\?\? 62F1D50F 		vaddpd %xmm4, %xmm5, %xmm6\{%k7\}		\# AVX512F \+ AVX512VL
-[ 	]*36[ 	]+58F4
-[ 	]*37[ 	]+\?\?\?\? 62F1D52F 		vaddpd %ymm4, %ymm5, %ymm6\{%k7\}		\# AVX512F \+ AVX512VL
-[ 	]*37[ 	]+58F4
-[ 	]*38[ 	]+\?\?\?\? 62F2D54F 		vpmadd52luq %zmm4, %zmm5, %zmm6\{%k7\}	\# AVX512IFMA
-[ 	]*38[ 	]+B4F4
-[ 	]*39[ 	]+\?\?\?\? 62F2D50F 		vpmadd52luq %xmm4, %xmm5, %xmm6\{%k7\}	\# AVX512IFMA \+ AVX512VL
-[ 	]*39[ 	]+B4F4
-[ 	]*40[ 	]+\?\?\?\? 62F2D52F 		vpmadd52luq %ymm4, %ymm5, %ymm6\{%k7\}	\# AVX512IFMA \+ AVX512VL
-[ 	]*40[ 	]+B4F4
-[ 	]*41[ 	]+\?\?\?\? 62F2FD49 		vgatherpf0dpd 23\(%ebp,%ymm7,8\)\{%k1\}	\# AVX512PF
-[ 	]*41[ 	]+C68CFD17 
-[ 	]*41[ 	]+000000
-[ 	]*42[ 	]+\?\?\?\? 62F2554F 		vpermb %zmm4, %zmm5, %zmm6\{%k7\}		\# AVX512VBMI
-[ 	]*42[ 	]+8DF4
-[ 	]*43[ 	]+\?\?\?\? 62F2550F 		vpermb %xmm4, %xmm5, %xmm6\{%k7\}		\# AVX512VBMI \+ AVX512VL
-[ 	]*43[ 	]+8DF4
-[ 	]*44[ 	]+\?\?\?\? 62F2552F 		vpermb %ymm4, %ymm5, %ymm6\{%k7\}		\# AVX512VBMI \+ AVX512VL
-[ 	]*44[ 	]+8DF4
-[ 	]*45[ 	]+
-[ 	]*46[ 	]+\.arch \.noavx512cd
-[ 	]*47[ 	]+vpabsb %zmm5, %zmm6\{%k7\}		\# AVX512BW
-[ 	]*48[ 	]+vpabsb %xmm5, %xmm6\{%k7\}		\# AVX512BW \+ AVX512VL
-[ 	]*49[ 	]+vpabsb %ymm5, %ymm6\{%k7\}		\# AVX512BW \+ AVX512VL
-[ 	]*50[ 	]+vpconflictd %zmm5, %zmm6		\# AVX412CD
-[ 	]*51[ 	]+vpconflictd %xmm5, %xmm6		\# AVX412CD \+ AVX512VL
-[ 	]*52[ 	]+vpconflictd %ymm5, %ymm6		\# AVX412CD \+ AVX512VL
-[ 	]*53[ 	]+\?\?\?\? 62F1FD4F 		vcvtpd2qq \(%ecx\), %zmm6\{%k7\}		\# AVX512DQ
-[ 	]*53[ 	]+7B31
-[ 	]*54[ 	]+\?\?\?\? 62F1FD0F 		vcvtpd2qq \(%ecx\), %xmm6\{%k7\}		\# AVX512DQ \+ AVX512VL
-[ 	]*54[ 	]+7B31
-[ 	]*55[ 	]+\?\?\?\? 62F1FD2F 		vcvtpd2qq \(%ecx\), %ymm6\{%k7\}		\# AVX512DQ \+ AVX512VL
-[ 	]*55[ 	]+7B31
-[ 	]*56[ 	]+\?\?\?\? 62F27D4F 		vexp2ps %zmm5, %zmm6\{%k7\}		\# AVX512ER
-[ 	]*56[ 	]+C8F5
-[ 	]*57[ 	]+\?\?\?\? 62F1D54F 		vaddpd %zmm4, %zmm5, %zmm6\{%k7\}		\# AVX512F
-[ 	]*57[ 	]+58F4
-[ 	]*58[ 	]+\?\?\?\? 62F1D50F 		vaddpd %xmm4, %xmm5, %xmm6\{%k7\}		\# AVX512F \+ AVX512VL
-[ 	]*58[ 	]+58F4
-[ 	]*59[ 	]+\?\?\?\? 62F1D52F 		vaddpd %ymm4, %ymm5, %ymm6\{%k7\}		\# AVX512F \+ AVX512VL
-[ 	]*59[ 	]+58F4
-[ 	]*60[ 	]+\?\?\?\? 62F2D54F 		vpmadd52luq %zmm4, %zmm5, %zmm6\{%k7\}	\# AVX512IFMA
-[ 	]*60[ 	]+B4F4
-[ 	]*61[ 	]+\?\?\?\? 62F2D50F 		vpmadd52luq %xmm4, %xmm5, %xmm6\{%k7\}	\# AVX512IFMA \+ AVX512VL
-[ 	]*61[ 	]+B4F4
-[ 	]*62[ 	]+\?\?\?\? 62F2D52F 		vpmadd52luq %ymm4, %ymm5, %ymm6\{%k7\}	\# AVX512IFMA \+ AVX512VL
-[ 	]*62[ 	]+B4F4
-[ 	]*63[ 	]+\?\?\?\? 62F2FD49 		vgatherpf0dpd 23\(%ebp,%ymm7,8\)\{%k1\}	\# AVX512PF
-[ 	]*63[ 	]+C68CFD17 
-[ 	]*63[ 	]+000000
-\fGAS LISTING .*
-
-
-[ 	]*64[ 	]+\?\?\?\? 62F2554F 		vpermb %zmm4, %zmm5, %zmm6\{%k7\}		\# AVX512VBMI
-[ 	]*64[ 	]+8DF4
-[ 	]*65[ 	]+\?\?\?\? 62F2550F 		vpermb %xmm4, %xmm5, %xmm6\{%k7\}		\# AVX512VBMI \+ AVX512VL
-[ 	]*65[ 	]+8DF4
-[ 	]*66[ 	]+\?\?\?\? 62F2552F 		vpermb %ymm4, %ymm5, %ymm6\{%k7\}		\# AVX512VBMI \+ AVX512VL
-[ 	]*66[ 	]+8DF4
-[ 	]*67[ 	]+
-[ 	]*68[ 	]+\.arch \.noavx512dq
-[ 	]*69[ 	]+vpabsb %zmm5, %zmm6\{%k7\}		\# AVX512BW
-[ 	]*70[ 	]+vpabsb %xmm5, %xmm6\{%k7\}		\# AVX512BW \+ AVX512VL
-[ 	]*71[ 	]+vpabsb %ymm5, %ymm6\{%k7\}		\# AVX512BW \+ AVX512VL
-[ 	]*72[ 	]+vpconflictd %zmm5, %zmm6		\# AVX412CD
-[ 	]*73[ 	]+vpconflictd %xmm5, %xmm6		\# AVX412CD \+ AVX512VL
-[ 	]*74[ 	]+vpconflictd %ymm5, %ymm6		\# AVX412CD \+ AVX512VL
-[ 	]*75[ 	]+vcvtpd2qq \(%ecx\), %zmm6\{%k7\}		\# AVX512DQ
-[ 	]*76[ 	]+vcvtpd2qq \(%ecx\), %xmm6\{%k7\}		\# AVX512DQ \+ AVX512VL
-[ 	]*77[ 	]+vcvtpd2qq \(%ecx\), %ymm6\{%k7\}		\# AVX512DQ \+ AVX512VL
-[ 	]*78[ 	]+\?\?\?\? 62F27D4F 		vexp2ps %zmm5, %zmm6\{%k7\}		\# AVX512ER
-[ 	]*78[ 	]+C8F5
-[ 	]*79[ 	]+\?\?\?\? 62F1D54F 		vaddpd %zmm4, %zmm5, %zmm6\{%k7\}		\# AVX512F
-[ 	]*79[ 	]+58F4
-[ 	]*80[ 	]+\?\?\?\? 62F1D50F 		vaddpd %xmm4, %xmm5, %xmm6\{%k7\}		\# AVX512F \+ AVX512VL
-[ 	]*80[ 	]+58F4
-[ 	]*81[ 	]+\?\?\?\? 62F1D52F 		vaddpd %ymm4, %ymm5, %ymm6\{%k7\}		\# AVX512F \+ AVX512VL
-[ 	]*81[ 	]+58F4
-[ 	]*82[ 	]+\?\?\?\? 62F2D54F 		vpmadd52luq %zmm4, %zmm5, %zmm6\{%k7\}	\# AVX512IFMA
-[ 	]*82[ 	]+B4F4
-[ 	]*83[ 	]+\?\?\?\? 62F2D50F 		vpmadd52luq %xmm4, %xmm5, %xmm6\{%k7\}	\# AVX512IFMA \+ AVX512VL
-[ 	]*83[ 	]+B4F4
-[ 	]*84[ 	]+\?\?\?\? 62F2D52F 		vpmadd52luq %ymm4, %ymm5, %ymm6\{%k7\}	\# AVX512IFMA \+ AVX512VL
-[ 	]*84[ 	]+B4F4
-[ 	]*85[ 	]+\?\?\?\? 62F2FD49 		vgatherpf0dpd 23\(%ebp,%ymm7,8\)\{%k1\}	\# AVX512PF
-[ 	]*85[ 	]+C68CFD17 
-[ 	]*85[ 	]+000000
-[ 	]*86[ 	]+\?\?\?\? 62F2554F 		vpermb %zmm4, %zmm5, %zmm6\{%k7\}		\# AVX512VBMI
-[ 	]*86[ 	]+8DF4
-[ 	]*87[ 	]+\?\?\?\? 62F2550F 		vpermb %xmm4, %xmm5, %xmm6\{%k7\}		\# AVX512VBMI \+ AVX512VL
-[ 	]*87[ 	]+8DF4
-[ 	]*88[ 	]+\?\?\?\? 62F2552F 		vpermb %ymm4, %ymm5, %ymm6\{%k7\}		\# AVX512VBMI \+ AVX512VL
-[ 	]*88[ 	]+8DF4
-[ 	]*89[ 	]+
-[ 	]*90[ 	]+\.arch \.noavx512er
-[ 	]*91[ 	]+vpabsb %zmm5, %zmm6\{%k7\}		\# AVX512BW
-[ 	]*92[ 	]+vpabsb %xmm5, %xmm6\{%k7\}		\# AVX512BW \+ AVX512VL
-[ 	]*93[ 	]+vpabsb %ymm5, %ymm6\{%k7\}		\# AVX512BW \+ AVX512VL
-[ 	]*94[ 	]+vpconflictd %zmm5, %zmm6		\# AVX412CD
-[ 	]*95[ 	]+vpconflictd %xmm5, %xmm6		\# AVX412CD \+ AVX512VL
-[ 	]*96[ 	]+vpconflictd %ymm5, %ymm6		\# AVX412CD \+ AVX512VL
-[ 	]*97[ 	]+vcvtpd2qq \(%ecx\), %zmm6\{%k7\}		\# AVX512DQ
-[ 	]*98[ 	]+vcvtpd2qq \(%ecx\), %xmm6\{%k7\}		\# AVX512DQ \+ AVX512VL
-[ 	]*99[ 	]+vcvtpd2qq \(%ecx\), %ymm6\{%k7\}		\# AVX512DQ \+ AVX512VL
-[ 	]*100[ 	]+vexp2ps %zmm5, %zmm6\{%k7\}		\# AVX512ER
-[ 	]*101[ 	]+\?\?\?\? 62F1D54F 		vaddpd %zmm4, %zmm5, %zmm6\{%k7\}		\# AVX512F
-[ 	]*101[ 	]+58F4
-[ 	]*102[ 	]+\?\?\?\? 62F1D50F 		vaddpd %xmm4, %xmm5, %xmm6\{%k7\}		\# AVX512F \+ AVX512VL
-[ 	]*102[ 	]+58F4
-[ 	]*103[ 	]+\?\?\?\? 62F1D52F 		vaddpd %ymm4, %ymm5, %ymm6\{%k7\}		\# AVX512F \+ AVX512VL
-\fGAS LISTING .*
-
-
-[ 	]*103[ 	]+58F4
-[ 	]*104[ 	]+\?\?\?\? 62F2D54F 		vpmadd52luq %zmm4, %zmm5, %zmm6\{%k7\}	\# AVX512IFMA
-[ 	]*104[ 	]+B4F4
-[ 	]*105[ 	]+\?\?\?\? 62F2D50F 		vpmadd52luq %xmm4, %xmm5, %xmm6\{%k7\}	\# AVX512IFMA \+ AVX512VL
-[ 	]*105[ 	]+B4F4
-[ 	]*106[ 	]+\?\?\?\? 62F2D52F 		vpmadd52luq %ymm4, %ymm5, %ymm6\{%k7\}	\# AVX512IFMA \+ AVX512VL
-[ 	]*106[ 	]+B4F4
-[ 	]*107[ 	]+\?\?\?\? 62F2FD49 		vgatherpf0dpd 23\(%ebp,%ymm7,8\)\{%k1\}	\# AVX512PF
-[ 	]*107[ 	]+C68CFD17 
-[ 	]*107[ 	]+000000
-[ 	]*108[ 	]+\?\?\?\? 62F2554F 		vpermb %zmm4, %zmm5, %zmm6\{%k7\}		\# AVX512VBMI
-[ 	]*108[ 	]+8DF4
-[ 	]*109[ 	]+\?\?\?\? 62F2550F 		vpermb %xmm4, %xmm5, %xmm6\{%k7\}		\# AVX512VBMI \+ AVX512VL
-[ 	]*109[ 	]+8DF4
-[ 	]*110[ 	]+\?\?\?\? 62F2552F 		vpermb %ymm4, %ymm5, %ymm6\{%k7\}		\# AVX512VBMI \+ AVX512VL
-[ 	]*110[ 	]+8DF4
-[ 	]*111[ 	]+
-[ 	]*112[ 	]+\.arch \.noavx512ifma
-[ 	]*113[ 	]+vpabsb %zmm5, %zmm6\{%k7\}		\# AVX512BW
-[ 	]*114[ 	]+vpabsb %xmm5, %xmm6\{%k7\}		\# AVX512BW \+ AVX512VL
-[ 	]*115[ 	]+vpabsb %ymm5, %ymm6\{%k7\}		\# AVX512BW \+ AVX512VL
-[ 	]*116[ 	]+vpconflictd %zmm5, %zmm6		\# AVX412CD
-[ 	]*117[ 	]+vpconflictd %xmm5, %xmm6		\# AVX412CD \+ AVX512VL
-[ 	]*118[ 	]+vpconflictd %ymm5, %ymm6		\# AVX412CD \+ AVX512VL
-[ 	]*119[ 	]+vcvtpd2qq \(%ecx\), %zmm6\{%k7\}		\# AVX512DQ
-[ 	]*120[ 	]+vcvtpd2qq \(%ecx\), %xmm6\{%k7\}		\# AVX512DQ \+ AVX512VL
-[ 	]*121[ 	]+vcvtpd2qq \(%ecx\), %ymm6\{%k7\}		\# AVX512DQ \+ AVX512VL
-[ 	]*122[ 	]+vexp2ps %zmm5, %zmm6\{%k7\}		\# AVX512ER
-[ 	]*123[ 	]+\?\?\?\? 62F1D54F 		vaddpd %zmm4, %zmm5, %zmm6\{%k7\}		\# AVX512F
-[ 	]*123[ 	]+58F4
-[ 	]*124[ 	]+\?\?\?\? 62F1D50F 		vaddpd %xmm4, %xmm5, %xmm6\{%k7\}		\# AVX512F \+ AVX512VL
-[ 	]*124[ 	]+58F4
-[ 	]*125[ 	]+\?\?\?\? 62F1D52F 		vaddpd %ymm4, %ymm5, %ymm6\{%k7\}		\# AVX512F \+ AVX512VL
-[ 	]*125[ 	]+58F4
-[ 	]*126[ 	]+vpmadd52luq %zmm4, %zmm5, %zmm6\{%k7\}	\# AVX512IFMA
-[ 	]*127[ 	]+vpmadd52luq %xmm4, %xmm5, %xmm6\{%k7\}	\# AVX512IFMA \+ AVX512VL
-[ 	]*128[ 	]+vpmadd52luq %ymm4, %ymm5, %ymm6\{%k7\}	\# AVX512IFMA \+ AVX512VL
-[ 	]*129[ 	]+\?\?\?\? 62F2FD49 		vgatherpf0dpd 23\(%ebp,%ymm7,8\)\{%k1\}	\# AVX512PF
-[ 	]*129[ 	]+C68CFD17 
-[ 	]*129[ 	]+000000
-[ 	]*130[ 	]+\?\?\?\? 62F2554F 		vpermb %zmm4, %zmm5, %zmm6\{%k7\}		\# AVX512VBMI
-[ 	]*130[ 	]+8DF4
-[ 	]*131[ 	]+\?\?\?\? 62F2550F 		vpermb %xmm4, %xmm5, %xmm6\{%k7\}		\# AVX512VBMI \+ AVX512VL
-[ 	]*131[ 	]+8DF4
-[ 	]*132[ 	]+\?\?\?\? 62F2552F 		vpermb %ymm4, %ymm5, %ymm6\{%k7\}		\# AVX512VBMI \+ AVX512VL
-[ 	]*132[ 	]+8DF4
-[ 	]*133[ 	]+
-[ 	]*134[ 	]+\.arch \.noavx512pf
-[ 	]*135[ 	]+vpabsb %zmm5, %zmm6\{%k7\}		\# AVX512BW
-[ 	]*136[ 	]+vpabsb %xmm5, %xmm6\{%k7\}		\# AVX512BW \+ AVX512VL
-[ 	]*137[ 	]+vpabsb %ymm5, %ymm6\{%k7\}		\# AVX512BW \+ AVX512VL
-[ 	]*138[ 	]+vpconflictd %zmm5, %zmm6		\# AVX412CD
-[ 	]*139[ 	]+vpconflictd %xmm5, %xmm6		\# AVX412CD \+ AVX512VL
-[ 	]*140[ 	]+vpconflictd %ymm5, %ymm6		\# AVX412CD \+ AVX512VL
-[ 	]*141[ 	]+vcvtpd2qq \(%ecx\), %zmm6\{%k7\}		\# AVX512DQ
-[ 	]*142[ 	]+vcvtpd2qq \(%ecx\), %xmm6\{%k7\}		\# AVX512DQ \+ AVX512VL
-[ 	]*143[ 	]+vcvtpd2qq \(%ecx\), %ymm6\{%k7\}		\# AVX512DQ \+ AVX512VL
-\fGAS LISTING .*
-
-
-[ 	]*144[ 	]+vexp2ps %zmm5, %zmm6\{%k7\}		\# AVX512ER
-[ 	]*145[ 	]+\?\?\?\? 62F1D54F 		vaddpd %zmm4, %zmm5, %zmm6\{%k7\}		\# AVX512F
-[ 	]*145[ 	]+58F4
-[ 	]*146[ 	]+\?\?\?\? 62F1D50F 		vaddpd %xmm4, %xmm5, %xmm6\{%k7\}		\# AVX512F \+ AVX512VL
-[ 	]*146[ 	]+58F4
-[ 	]*147[ 	]+\?\?\?\? 62F1D52F 		vaddpd %ymm4, %ymm5, %ymm6\{%k7\}		\# AVX512F \+ AVX512VL
-[ 	]*147[ 	]+58F4
-[ 	]*148[ 	]+vpmadd52luq %zmm4, %zmm5, %zmm6\{%k7\}	\# AVX512IFMA
-[ 	]*149[ 	]+vpmadd52luq %xmm4, %xmm5, %xmm6\{%k7\}	\# AVX512IFMA \+ AVX512VL
-[ 	]*150[ 	]+vpmadd52luq %ymm4, %ymm5, %ymm6\{%k7\}	\# AVX512IFMA \+ AVX512VL
-[ 	]*151[ 	]+vgatherpf0dpd 23\(%ebp,%ymm7,8\)\{%k1\}	\# AVX512PF
-[ 	]*152[ 	]+\?\?\?\? 62F2554F 		vpermb %zmm4, %zmm5, %zmm6\{%k7\}		\# AVX512VBMI
-[ 	]*152[ 	]+8DF4
-[ 	]*153[ 	]+\?\?\?\? 62F2550F 		vpermb %xmm4, %xmm5, %xmm6\{%k7\}		\# AVX512VBMI \+ AVX512VL
-[ 	]*153[ 	]+8DF4
-[ 	]*154[ 	]+\?\?\?\? 62F2552F 		vpermb %ymm4, %ymm5, %ymm6\{%k7\}		\# AVX512VBMI \+ AVX512VL
-[ 	]*154[ 	]+8DF4
-[ 	]*155[ 	]+
-[ 	]*156[ 	]+\.arch \.noavx512vbmi
-[ 	]*157[ 	]+vpabsb %zmm5, %zmm6\{%k7\}		\# AVX512BW
-[ 	]*158[ 	]+vpabsb %xmm5, %xmm6\{%k7\}		\# AVX512BW \+ AVX512VL
-[ 	]*159[ 	]+vpabsb %ymm5, %ymm6\{%k7\}		\# AVX512BW \+ AVX512VL
-[ 	]*160[ 	]+vpconflictd %zmm5, %zmm6		\# AVX412CD
-[ 	]*161[ 	]+vpconflictd %xmm5, %xmm6		\# AVX412CD \+ AVX512VL
-[ 	]*162[ 	]+vpconflictd %ymm5, %ymm6		\# AVX412CD \+ AVX512VL
-[ 	]*163[ 	]+vcvtpd2qq \(%ecx\), %zmm6\{%k7\}		\# AVX512DQ
-[ 	]*164[ 	]+vcvtpd2qq \(%ecx\), %xmm6\{%k7\}		\# AVX512DQ \+ AVX512VL
-[ 	]*165[ 	]+vcvtpd2qq \(%ecx\), %ymm6\{%k7\}		\# AVX512DQ \+ AVX512VL
-[ 	]*166[ 	]+vexp2ps %zmm5, %zmm6\{%k7\}		\# AVX512ER
-[ 	]*167[ 	]+\?\?\?\? 62F1D54F 		vaddpd %zmm4, %zmm5, %zmm6\{%k7\}		\# AVX512F
-[ 	]*167[ 	]+58F4
-[ 	]*168[ 	]+\?\?\?\? 62F1D50F 		vaddpd %xmm4, %xmm5, %xmm6\{%k7\}		\# AVX512F \+ AVX512VL
-[ 	]*168[ 	]+58F4
-[ 	]*169[ 	]+\?\?\?\? 62F1D52F 		vaddpd %ymm4, %ymm5, %ymm6\{%k7\}		\# AVX512F \+ AVX512VL
-[ 	]*169[ 	]+58F4
-[ 	]*170[ 	]+vpmadd52luq %zmm4, %zmm5, %zmm6\{%k7\}	\# AVX512IFMA
-[ 	]*171[ 	]+vpmadd52luq %xmm4, %xmm5, %xmm6\{%k7\}	\# AVX512IFMA \+ AVX512VL
-[ 	]*172[ 	]+vpmadd52luq %ymm4, %ymm5, %ymm6\{%k7\}	\# AVX512IFMA \+ AVX512VL
-[ 	]*173[ 	]+vgatherpf0dpd 23\(%ebp,%ymm7,8\)\{%k1\}	\# AVX512PF
-[ 	]*174[ 	]+vpermb %zmm4, %zmm5, %zmm6\{%k7\}		\# AVX512VBMI
-[ 	]*175[ 	]+vpermb %xmm4, %xmm5, %xmm6\{%k7\}		\# AVX512VBMI \+ AVX512VL
-[ 	]*176[ 	]+vpermb %ymm4, %ymm5, %ymm6\{%k7\}		\# AVX512VBMI \+ AVX512VL
-[ 	]*177[ 	]+
-[ 	]*178[ 	]+\.arch \.noavx512f
-[ 	]*179[ 	]+vpabsb %zmm5, %zmm6\{%k7\}		\# AVX512BW
-[ 	]*180[ 	]+vpabsb %xmm5, %xmm6\{%k7\}		\# AVX512BW \+ AVX512VL
-[ 	]*181[ 	]+vpabsb %ymm5, %ymm6\{%k7\}		\# AVX512BW \+ AVX512VL
-[ 	]*182[ 	]+vpconflictd %zmm5, %zmm6		\# AVX412CD
-[ 	]*183[ 	]+vpconflictd %xmm5, %xmm6		\# AVX412CD \+ AVX512VL
-[ 	]*184[ 	]+vpconflictd %ymm5, %ymm6		\# AVX412CD \+ AVX512VL
-[ 	]*185[ 	]+vcvtpd2qq \(%ecx\), %zmm6\{%k7\}		\# AVX512DQ
-[ 	]*186[ 	]+vcvtpd2qq \(%ecx\), %xmm6\{%k7\}		\# AVX512DQ \+ AVX512VL
-[ 	]*187[ 	]+vcvtpd2qq \(%ecx\), %ymm6\{%k7\}		\# AVX512DQ \+ AVX512VL
-[ 	]*188[ 	]+vexp2ps %zmm5, %zmm6\{%k7\}		\# AVX512ER
-[ 	]*189[ 	]+vaddpd %zmm4, %zmm5, %zmm6\{%k7\}		\# AVX512F
-[ 	]*190[ 	]+vaddpd %xmm4, %xmm5, %xmm6\{%k7\}		\# AVX512F \+ AVX512VL
-[ 	]*191[ 	]+vaddpd %ymm4, %ymm5, %ymm6\{%k7\}		\# AVX512F \+ AVX512VL
-\fGAS LISTING .*
-
-
-[ 	]*192[ 	]+vpmadd52luq %zmm4, %zmm5, %zmm6\{%k7\}	\# AVX512IFMA
-[ 	]*193[ 	]+vpmadd52luq %xmm4, %xmm5, %xmm6\{%k7\}	\# AVX512IFMA \+ AVX512VL
-[ 	]*194[ 	]+vpmadd52luq %ymm4, %ymm5, %ymm6\{%k7\}	\# AVX512IFMA \+ AVX512VL
-[ 	]*195[ 	]+vgatherpf0dpd 23\(%ebp,%ymm7,8\)\{%k1\}	\# AVX512PF
-[ 	]*196[ 	]+vpermb %zmm4, %zmm5, %zmm6\{%k7\}		\# AVX512VBMI
-[ 	]*197[ 	]+vpermb %xmm4, %xmm5, %xmm6\{%k7\}		\# AVX512VBMI \+ AVX512VL
-[ 	]*198[ 	]+vpermb %ymm4, %ymm5, %ymm6\{%k7\}		\# AVX512VBMI \+ AVX512VL
-[ 	]*199[ 	]+
-[ 	]*200[ 	]+\?\?\?\? C4E2791C 		vpabsb %xmm5, %xmm6
-[ 	]*200[ 	]+F5
-[ 	]*201[ 	]+\?\?\?\? C4E27D1C 		vpabsb %ymm5, %ymm6
-[ 	]*201[ 	]+F5
-[ 	]*202[ 	]+\?\?\?\? C5D158F4 		vaddpd %xmm4, %xmm5, %xmm6
-[ 	]*203[ 	]+\?\?\?\? C5D558F4 		vaddpd %ymm4, %ymm5, %ymm6
-[ 	]*204[ 	]+\?\?\?\? 660F381C 		pabsb %xmm5, %xmm6
-[ 	]*204[ 	]+F5
-[ 	]*205[ 	]+\?\?\?\? 660F58F4 		addpd %xmm4, %xmm6
-[ 	]*206[ 	]+
-[ 	]*207[ 	]+\?\?\?\? 0F1F8000 		\.p2align 4
-[ 	]*207[ 	]+000000
+.*:8: Error: .*operand size mismatch.*
+.*:9: Error: .*unsupported masking.*
+.*:10: Error: .*unsupported masking.*
+.*:11: Error: .*not supported.*
+.*:12: Error: .*not supported.*
+.*:13: Error: .*not supported.*
+.*:14: Error: .*not supported.*
+.*:15: Error: .*not supported.*
+.*:16: Error: .*not supported.*
+.*:17: Error: .*not supported.*
+.*:21: Error: .*operand.*mismatch.*
+.*:22: Error: .*unsupported masking.*
+.*:23: Error: .*unsupported masking.*
+.*:24: Error: .*not supported.*
+.*:25: Error: .*not supported.*
+.*:26: Error: .*not supported.*
+.*:27: Error: .*not supported.*
+.*:8: Error: .*bad register name.*
+.*:9: Error: .*unknown vector operation.*
+.*:10: Error: .*unknown vector operation.*
+.*:11: Error: .*not supported.*
+.*:12: Error: .*not supported.*
+.*:13: Error: .*not supported.*
+.*:14: Error: .*not supported.*
+.*:15: Error: .*not supported.*
+.*:16: Error: .*not supported.*
+.*:17: Error: .*not supported.*
+.*:18: Error: .*bad register name.*
+.*:19: Error: .*unknown vector operation.*
+.*:20: Error: .*unknown vector operation.*
+.*:21: Error: .*bad register name.*
+.*:22: Error: .*unknown vector operation.*
+.*:23: Error: .*unknown vector operation.*
+.*:24: Error: .*not supported.*
+.*:25: Error: .*not supported.*
+.*:26: Error: .*not supported.*
+.*:27: Error: .*not supported.*
+#...
+[ 	]*[0-9]+[ 	]+\# Test \.arch \.noavx512XX
+[ 	]*[0-9]+[ 	]+\.text
+[ 	]*[0-9]+[ 	]*
+[ 	]*[0-9]+[ 	]+\.irp isa, default, .*
+#...
+[ 	]*[0-9]+[ 	]+>  \.arch default
+[ 	]*[0-9]+[ 	]+>  \.arch default
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D4F 	>  vpabsb %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D0F 	>  vpabsb %xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D2F 	>  vpabsb %ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D48 	>  vpconflictd %zmm5,%zmm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D08 	>  vpconflictd %xmm5,%xmm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D28 	>  vpconflictd %ymm5,%ymm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD4F 	>  vcvtpd2qq \(%ecx\),%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD0F 	>  vcvtpd2qq \(%ecx\),%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD2F 	>  vcvtpd2qq \(%ecx\),%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D4F 	>  vexp2ps %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+C8F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D54F 	>  vaddpd %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D50F 	>  vaddpd %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D52F 	>  vaddpd %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D54F 	>  vpmadd52luq %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D50F 	>  vpmadd52luq %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D52F 	>  vpmadd52luq %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2FD49 	>  vgatherpf0dpd 23\(%ebp,%ymm7,8\)\{%k1\}
+[ 	]*[0-9]+[ 	]+C68CFD17 *
+[ 	]*[0-9]+[ 	]+000000
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2554F 	>  vpermb %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2550F 	>  vpermb %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2552F 	>  vpermb %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+#...
+[ 	]*[0-9]+[ 	]+>  \.arch default
+[ 	]*[0-9]+[ 	]+>  \.arch \.noavx512bw
+[ 	]*[0-9]+[ 	]+>  vpabsb %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpabsb %xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpabsb %ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D48 	>  vpconflictd %zmm5,%zmm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D08 	>  vpconflictd %xmm5,%xmm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D28 	>  vpconflictd %ymm5,%ymm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD4F 	>  vcvtpd2qq \(%ecx\),%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD0F 	>  vcvtpd2qq \(%ecx\),%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD2F 	>  vcvtpd2qq \(%ecx\),%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D4F 	>  vexp2ps %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+C8F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D54F 	>  vaddpd %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D50F 	>  vaddpd %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D52F 	>  vaddpd %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D54F 	>  vpmadd52luq %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D50F 	>  vpmadd52luq %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D52F 	>  vpmadd52luq %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2FD49 	>  vgatherpf0dpd 23\(%ebp,%ymm7,8\)\{%k1\}
+[ 	]*[0-9]+[ 	]+C68CFD17 *
+[ 	]*[0-9]+[ 	]+000000
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2554F 	>  vpermb %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2550F 	>  vpermb %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2552F 	>  vpermb %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+#...
+[ 	]*[0-9]+[ 	]+>  \.arch default
+[ 	]*[0-9]+[ 	]+>  \.arch \.noavx512cd
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D4F 	>  vpabsb %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D0F 	>  vpabsb %xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D2F 	>  vpabsb %ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+>  vpconflictd %zmm5,%zmm6
+[ 	]*[0-9]+[ 	]+>  vpconflictd %xmm5,%xmm6
+[ 	]*[0-9]+[ 	]+>  vpconflictd %ymm5,%ymm6
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD4F 	>  vcvtpd2qq \(%ecx\),%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD0F 	>  vcvtpd2qq \(%ecx\),%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD2F 	>  vcvtpd2qq \(%ecx\),%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D4F 	>  vexp2ps %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+C8F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D54F 	>  vaddpd %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D50F 	>  vaddpd %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D52F 	>  vaddpd %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D54F 	>  vpmadd52luq %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D50F 	>  vpmadd52luq %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D52F 	>  vpmadd52luq %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2FD49 	>  vgatherpf0dpd 23\(%ebp,%ymm7,8\)\{%k1\}
+[ 	]*[0-9]+[ 	]+C68CFD17 *
+[ 	]*[0-9]+[ 	]+000000
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2554F 	>  vpermb %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2550F 	>  vpermb %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2552F 	>  vpermb %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+#...
+[ 	]*[0-9]+[ 	]+>  \.arch default
+[ 	]*[0-9]+[ 	]+>  \.arch \.noavx512dq
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D4F 	>  vpabsb %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D0F 	>  vpabsb %xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D2F 	>  vpabsb %ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D48 	>  vpconflictd %zmm5,%zmm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D08 	>  vpconflictd %xmm5,%xmm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D28 	>  vpconflictd %ymm5,%ymm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+>  vcvtpd2qq \(%ecx\),%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vcvtpd2qq \(%ecx\),%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vcvtpd2qq \(%ecx\),%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D4F 	>  vexp2ps %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+C8F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D54F 	>  vaddpd %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D50F 	>  vaddpd %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D52F 	>  vaddpd %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D54F 	>  vpmadd52luq %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D50F 	>  vpmadd52luq %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D52F 	>  vpmadd52luq %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2FD49 	>  vgatherpf0dpd 23\(%ebp,%ymm7,8\)\{%k1\}
+[ 	]*[0-9]+[ 	]+C68CFD17 *
+[ 	]*[0-9]+[ 	]+000000
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2554F 	>  vpermb %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2550F 	>  vpermb %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2552F 	>  vpermb %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+#...
+[ 	]*[0-9]+[ 	]+>  \.arch default
+[ 	]*[0-9]+[ 	]+>  \.arch \.noavx512er
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D4F 	>  vpabsb %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D0F 	>  vpabsb %xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D2F 	>  vpabsb %ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D48 	>  vpconflictd %zmm5,%zmm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D08 	>  vpconflictd %xmm5,%xmm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D28 	>  vpconflictd %ymm5,%ymm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD4F 	>  vcvtpd2qq \(%ecx\),%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD0F 	>  vcvtpd2qq \(%ecx\),%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD2F 	>  vcvtpd2qq \(%ecx\),%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+>  vexp2ps %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D54F 	>  vaddpd %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D50F 	>  vaddpd %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D52F 	>  vaddpd %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D54F 	>  vpmadd52luq %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D50F 	>  vpmadd52luq %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D52F 	>  vpmadd52luq %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2FD49 	>  vgatherpf0dpd 23\(%ebp,%ymm7,8\)\{%k1\}
+[ 	]*[0-9]+[ 	]+C68CFD17 *
+[ 	]*[0-9]+[ 	]+000000
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2554F 	>  vpermb %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2550F 	>  vpermb %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2552F 	>  vpermb %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+#...
+[ 	]*[0-9]+[ 	]+>  \.arch default
+[ 	]*[0-9]+[ 	]+>  \.arch \.noavx512ifma
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D4F 	>  vpabsb %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D0F 	>  vpabsb %xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D2F 	>  vpabsb %ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D48 	>  vpconflictd %zmm5,%zmm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D08 	>  vpconflictd %xmm5,%xmm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D28 	>  vpconflictd %ymm5,%ymm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD4F 	>  vcvtpd2qq \(%ecx\),%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD0F 	>  vcvtpd2qq \(%ecx\),%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD2F 	>  vcvtpd2qq \(%ecx\),%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D4F 	>  vexp2ps %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+C8F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D54F 	>  vaddpd %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D50F 	>  vaddpd %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D52F 	>  vaddpd %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+>  vpmadd52luq %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpmadd52luq %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpmadd52luq %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2FD49 	>  vgatherpf0dpd 23\(%ebp,%ymm7,8\)\{%k1\}
+[ 	]*[0-9]+[ 	]+C68CFD17 *
+[ 	]*[0-9]+[ 	]+000000
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2554F 	>  vpermb %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2550F 	>  vpermb %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2552F 	>  vpermb %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+#...
+[ 	]*[0-9]+[ 	]+>  \.arch default
+[ 	]*[0-9]+[ 	]+>  \.arch \.noavx512pf
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D4F 	>  vpabsb %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D0F 	>  vpabsb %xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D2F 	>  vpabsb %ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D48 	>  vpconflictd %zmm5,%zmm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D08 	>  vpconflictd %xmm5,%xmm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D28 	>  vpconflictd %ymm5,%ymm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD4F 	>  vcvtpd2qq \(%ecx\),%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD0F 	>  vcvtpd2qq \(%ecx\),%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD2F 	>  vcvtpd2qq \(%ecx\),%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D4F 	>  vexp2ps %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+C8F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D54F 	>  vaddpd %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D50F 	>  vaddpd %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D52F 	>  vaddpd %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D54F 	>  vpmadd52luq %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D50F 	>  vpmadd52luq %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D52F 	>  vpmadd52luq %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+>  vgatherpf0dpd 23\(%ebp,%ymm7,8\)\{%k1\}
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2554F 	>  vpermb %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2550F 	>  vpermb %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2552F 	>  vpermb %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+8DF4
+#...
+[ 	]*[0-9]+[ 	]+>  \.arch default
+[ 	]*[0-9]+[ 	]+>  \.arch \.noavx512vbmi
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D4F 	>  vpabsb %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D0F 	>  vpabsb %xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D2F 	>  vpabsb %ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+1CF5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D48 	>  vpconflictd %zmm5,%zmm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D08 	>  vpconflictd %xmm5,%xmm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D28 	>  vpconflictd %ymm5,%ymm6
+[ 	]*[0-9]+[ 	]+C4F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD4F 	>  vcvtpd2qq \(%ecx\),%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD0F 	>  vcvtpd2qq \(%ecx\),%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1FD2F 	>  vcvtpd2qq \(%ecx\),%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+7B31
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F27D4F 	>  vexp2ps %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+C8F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D54F 	>  vaddpd %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D50F 	>  vaddpd %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F1D52F 	>  vaddpd %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+58F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D54F 	>  vpmadd52luq %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D50F 	>  vpmadd52luq %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2D52F 	>  vpmadd52luq %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+B4F4
+[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2FD49 	>  vgatherpf0dpd 23\(%ebp,%ymm7,8\)\{%k1\}
+[ 	]*[0-9]+[ 	]+C68CFD17 *
+[ 	]*[0-9]+[ 	]+000000
+[ 	]*[0-9]+[ 	]+>  vpermb %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpermb %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpermb %ymm4,%ymm5,%ymm6\{%k7\}
+#...
+[ 	]*[0-9]+[ 	]+>  \.arch default
+[ 	]*[0-9]+[ 	]+>  \.arch \.noavx512f
+[ 	]*[0-9]+[ 	]+>  vpabsb %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpabsb %xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpabsb %ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpconflictd %zmm5,%zmm6
+[ 	]*[0-9]+[ 	]+>  vpconflictd %xmm5,%xmm6
+[ 	]*[0-9]+[ 	]+>  vpconflictd %ymm5,%ymm6
+[ 	]*[0-9]+[ 	]+>  vcvtpd2qq \(%ecx\),%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vcvtpd2qq \(%ecx\),%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vcvtpd2qq \(%ecx\),%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vexp2ps %zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vaddpd %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vaddpd %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vaddpd %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpmadd52luq %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpmadd52luq %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpmadd52luq %ymm4,%ymm5,%ymm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vgatherpf0dpd 23\(%ebp,%ymm7,8\)\{%k1\}
+[ 	]*[0-9]+[ 	]+>  vpermb %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpermb %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpermb %ymm4,%ymm5,%ymm6\{%k7\}
+#...
+[ 	]*[0-9]+[ 	]+\.endr
+[ 	]*[0-9]+[ 	]*
+[ 	]*[0-9]+[ 	]+\?\?\?\? C4E2791C 		vpabsb %xmm5, %xmm6
+[ 	]*[0-9]+[ 	]+F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? C4E27D1C 		vpabsb %ymm5, %ymm6
+[ 	]*[0-9]+[ 	]+F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? C5D158F4 		vaddpd %xmm4, %xmm5, %xmm6
+[ 	]*[0-9]+[ 	]+\?\?\?\? C5D558F4 		vaddpd %ymm4, %ymm5, %ymm6
+[ 	]*[0-9]+[ 	]+\?\?\?\? 660F381C 		pabsb %xmm5, %xmm6
+[ 	]*[0-9]+[ 	]+F5
+[ 	]*[0-9]+[ 	]+\?\?\?\? 660F58F4 		addpd %xmm4, %xmm6
 #pass
--- a/gas/testsuite/gas/i386/noavx512-1.s
+++ b/gas/testsuite/gas/i386/noavx512-1.s
@@ -1,49 +1,10 @@
 # Test .arch .noavx512XX
 	.text
-	vpabsb %zmm5, %zmm6{%k7}		# AVX512BW
-	vpabsb %xmm5, %xmm6{%k7}		# AVX512BW + AVX512VL
-	vpabsb %ymm5, %ymm6{%k7}		# AVX512BW + AVX512VL
-	vpconflictd %zmm5, %zmm6		# AVX412CD
-	vpconflictd %xmm5, %xmm6		# AVX412CD + AVX512VL
-	vpconflictd %ymm5, %ymm6		# AVX412CD + AVX512VL
-	vcvtpd2qq (%ecx), %zmm6{%k7}		# AVX512DQ
-	vcvtpd2qq (%ecx), %xmm6{%k7}		# AVX512DQ + AVX512VL
-	vcvtpd2qq (%ecx), %ymm6{%k7}		# AVX512DQ + AVX512VL
-	vexp2ps %zmm5, %zmm6{%k7}		# AVX512ER
-	vaddpd %zmm4, %zmm5, %zmm6{%k7}		# AVX512F
-	vaddpd %xmm4, %xmm5, %xmm6{%k7}		# AVX512F + AVX512VL
-	vaddpd %ymm4, %ymm5, %ymm6{%k7}		# AVX512F + AVX512VL
-	vpmadd52luq %zmm4, %zmm5, %zmm6{%k7}	# AVX512IFMA
-	vpmadd52luq %xmm4, %xmm5, %xmm6{%k7}	# AVX512IFMA + AVX512VL
-	vpmadd52luq %ymm4, %ymm5, %ymm6{%k7}	# AVX512IFMA + AVX512VL
-	vgatherpf0dpd 23(%ebp,%ymm7,8){%k1}	# AVX512PF
-	vpermb %zmm4, %zmm5, %zmm6{%k7}		# AVX512VBMI
-	vpermb %xmm4, %xmm5, %xmm6{%k7}		# AVX512VBMI + AVX512VL
-	vpermb %ymm4, %ymm5, %ymm6{%k7}		# AVX512VBMI + AVX512VL
 
-	.arch .noavx512bw
-	vpabsb %zmm5, %zmm6{%k7}		# AVX512BW
-	vpabsb %xmm5, %xmm6{%k7}		# AVX512BW + AVX512VL
-	vpabsb %ymm5, %ymm6{%k7}		# AVX512BW + AVX512VL
-	vpconflictd %zmm5, %zmm6		# AVX412CD
-	vpconflictd %xmm5, %xmm6		# AVX412CD + AVX512VL
-	vpconflictd %ymm5, %ymm6		# AVX412CD + AVX512VL
-	vcvtpd2qq (%ecx), %zmm6{%k7}		# AVX512DQ
-	vcvtpd2qq (%ecx), %xmm6{%k7}		# AVX512DQ + AVX512VL
-	vcvtpd2qq (%ecx), %ymm6{%k7}		# AVX512DQ + AVX512VL
-	vexp2ps %zmm5, %zmm6{%k7}		# AVX512ER
-	vaddpd %zmm4, %zmm5, %zmm6{%k7}		# AVX512F
-	vaddpd %xmm4, %xmm5, %xmm6{%k7}		# AVX512F + AVX512VL
-	vaddpd %ymm4, %ymm5, %ymm6{%k7}		# AVX512F + AVX512VL
-	vpmadd52luq %zmm4, %zmm5, %zmm6{%k7}	# AVX512IFMA
-	vpmadd52luq %xmm4, %xmm5, %xmm6{%k7}	# AVX512IFMA + AVX512VL
-	vpmadd52luq %ymm4, %ymm5, %ymm6{%k7}	# AVX512IFMA + AVX512VL
-	vgatherpf0dpd 23(%ebp,%ymm7,8){%k1}	# AVX512PF
-	vpermb %zmm4, %zmm5, %zmm6{%k7}		# AVX512VBMI
-	vpermb %xmm4, %xmm5, %xmm6{%k7}		# AVX512VBMI + AVX512VL
-	vpermb %ymm4, %ymm5, %ymm6{%k7}		# AVX512VBMI + AVX512VL
+	.irp isa, default, .noavx512bw, .noavx512cd, .noavx512dq, .noavx512er, .noavx512ifma, .noavx512pf, .noavx512vbmi, .noavx512f
 
-	.arch .noavx512cd
+	.arch default
+	.arch \isa
 	vpabsb %zmm5, %zmm6{%k7}		# AVX512BW
 	vpabsb %xmm5, %xmm6{%k7}		# AVX512BW + AVX512VL
 	vpabsb %ymm5, %ymm6{%k7}		# AVX512BW + AVX512VL
@@ -65,137 +26,7 @@
 	vpermb %xmm4, %xmm5, %xmm6{%k7}		# AVX512VBMI + AVX512VL
 	vpermb %ymm4, %ymm5, %ymm6{%k7}		# AVX512VBMI + AVX512VL
 
-	.arch .noavx512dq
-	vpabsb %zmm5, %zmm6{%k7}		# AVX512BW
-	vpabsb %xmm5, %xmm6{%k7}		# AVX512BW + AVX512VL
-	vpabsb %ymm5, %ymm6{%k7}		# AVX512BW + AVX512VL
-	vpconflictd %zmm5, %zmm6		# AVX412CD
-	vpconflictd %xmm5, %xmm6		# AVX412CD + AVX512VL
-	vpconflictd %ymm5, %ymm6		# AVX412CD + AVX512VL
-	vcvtpd2qq (%ecx), %zmm6{%k7}		# AVX512DQ
-	vcvtpd2qq (%ecx), %xmm6{%k7}		# AVX512DQ + AVX512VL
-	vcvtpd2qq (%ecx), %ymm6{%k7}		# AVX512DQ + AVX512VL
-	vexp2ps %zmm5, %zmm6{%k7}		# AVX512ER
-	vaddpd %zmm4, %zmm5, %zmm6{%k7}		# AVX512F
-	vaddpd %xmm4, %xmm5, %xmm6{%k7}		# AVX512F + AVX512VL
-	vaddpd %ymm4, %ymm5, %ymm6{%k7}		# AVX512F + AVX512VL
-	vpmadd52luq %zmm4, %zmm5, %zmm6{%k7}	# AVX512IFMA
-	vpmadd52luq %xmm4, %xmm5, %xmm6{%k7}	# AVX512IFMA + AVX512VL
-	vpmadd52luq %ymm4, %ymm5, %ymm6{%k7}	# AVX512IFMA + AVX512VL
-	vgatherpf0dpd 23(%ebp,%ymm7,8){%k1}	# AVX512PF
-	vpermb %zmm4, %zmm5, %zmm6{%k7}		# AVX512VBMI
-	vpermb %xmm4, %xmm5, %xmm6{%k7}		# AVX512VBMI + AVX512VL
-	vpermb %ymm4, %ymm5, %ymm6{%k7}		# AVX512VBMI + AVX512VL
-
-	.arch .noavx512er
-	vpabsb %zmm5, %zmm6{%k7}		# AVX512BW
-	vpabsb %xmm5, %xmm6{%k7}		# AVX512BW + AVX512VL
-	vpabsb %ymm5, %ymm6{%k7}		# AVX512BW + AVX512VL
-	vpconflictd %zmm5, %zmm6		# AVX412CD
-	vpconflictd %xmm5, %xmm6		# AVX412CD + AVX512VL
-	vpconflictd %ymm5, %ymm6		# AVX412CD + AVX512VL
-	vcvtpd2qq (%ecx), %zmm6{%k7}		# AVX512DQ
-	vcvtpd2qq (%ecx), %xmm6{%k7}		# AVX512DQ + AVX512VL
-	vcvtpd2qq (%ecx), %ymm6{%k7}		# AVX512DQ + AVX512VL
-	vexp2ps %zmm5, %zmm6{%k7}		# AVX512ER
-	vaddpd %zmm4, %zmm5, %zmm6{%k7}		# AVX512F
-	vaddpd %xmm4, %xmm5, %xmm6{%k7}		# AVX512F + AVX512VL
-	vaddpd %ymm4, %ymm5, %ymm6{%k7}		# AVX512F + AVX512VL
-	vpmadd52luq %zmm4, %zmm5, %zmm6{%k7}	# AVX512IFMA
-	vpmadd52luq %xmm4, %xmm5, %xmm6{%k7}	# AVX512IFMA + AVX512VL
-	vpmadd52luq %ymm4, %ymm5, %ymm6{%k7}	# AVX512IFMA + AVX512VL
-	vgatherpf0dpd 23(%ebp,%ymm7,8){%k1}	# AVX512PF
-	vpermb %zmm4, %zmm5, %zmm6{%k7}		# AVX512VBMI
-	vpermb %xmm4, %xmm5, %xmm6{%k7}		# AVX512VBMI + AVX512VL
-	vpermb %ymm4, %ymm5, %ymm6{%k7}		# AVX512VBMI + AVX512VL
-
-	.arch .noavx512ifma
-	vpabsb %zmm5, %zmm6{%k7}		# AVX512BW
-	vpabsb %xmm5, %xmm6{%k7}		# AVX512BW + AVX512VL
-	vpabsb %ymm5, %ymm6{%k7}		# AVX512BW + AVX512VL
-	vpconflictd %zmm5, %zmm6		# AVX412CD
-	vpconflictd %xmm5, %xmm6		# AVX412CD + AVX512VL
-	vpconflictd %ymm5, %ymm6		# AVX412CD + AVX512VL
-	vcvtpd2qq (%ecx), %zmm6{%k7}		# AVX512DQ
-	vcvtpd2qq (%ecx), %xmm6{%k7}		# AVX512DQ + AVX512VL
-	vcvtpd2qq (%ecx), %ymm6{%k7}		# AVX512DQ + AVX512VL
-	vexp2ps %zmm5, %zmm6{%k7}		# AVX512ER
-	vaddpd %zmm4, %zmm5, %zmm6{%k7}		# AVX512F
-	vaddpd %xmm4, %xmm5, %xmm6{%k7}		# AVX512F + AVX512VL
-	vaddpd %ymm4, %ymm5, %ymm6{%k7}		# AVX512F + AVX512VL
-	vpmadd52luq %zmm4, %zmm5, %zmm6{%k7}	# AVX512IFMA
-	vpmadd52luq %xmm4, %xmm5, %xmm6{%k7}	# AVX512IFMA + AVX512VL
-	vpmadd52luq %ymm4, %ymm5, %ymm6{%k7}	# AVX512IFMA + AVX512VL
-	vgatherpf0dpd 23(%ebp,%ymm7,8){%k1}	# AVX512PF
-	vpermb %zmm4, %zmm5, %zmm6{%k7}		# AVX512VBMI
-	vpermb %xmm4, %xmm5, %xmm6{%k7}		# AVX512VBMI + AVX512VL
-	vpermb %ymm4, %ymm5, %ymm6{%k7}		# AVX512VBMI + AVX512VL
-
-	.arch .noavx512pf
-	vpabsb %zmm5, %zmm6{%k7}		# AVX512BW
-	vpabsb %xmm5, %xmm6{%k7}		# AVX512BW + AVX512VL
-	vpabsb %ymm5, %ymm6{%k7}		# AVX512BW + AVX512VL
-	vpconflictd %zmm5, %zmm6		# AVX412CD
-	vpconflictd %xmm5, %xmm6		# AVX412CD + AVX512VL
-	vpconflictd %ymm5, %ymm6		# AVX412CD + AVX512VL
-	vcvtpd2qq (%ecx), %zmm6{%k7}		# AVX512DQ
-	vcvtpd2qq (%ecx), %xmm6{%k7}		# AVX512DQ + AVX512VL
-	vcvtpd2qq (%ecx), %ymm6{%k7}		# AVX512DQ + AVX512VL
-	vexp2ps %zmm5, %zmm6{%k7}		# AVX512ER
-	vaddpd %zmm4, %zmm5, %zmm6{%k7}		# AVX512F
-	vaddpd %xmm4, %xmm5, %xmm6{%k7}		# AVX512F + AVX512VL
-	vaddpd %ymm4, %ymm5, %ymm6{%k7}		# AVX512F + AVX512VL
-	vpmadd52luq %zmm4, %zmm5, %zmm6{%k7}	# AVX512IFMA
-	vpmadd52luq %xmm4, %xmm5, %xmm6{%k7}	# AVX512IFMA + AVX512VL
-	vpmadd52luq %ymm4, %ymm5, %ymm6{%k7}	# AVX512IFMA + AVX512VL
-	vgatherpf0dpd 23(%ebp,%ymm7,8){%k1}	# AVX512PF
-	vpermb %zmm4, %zmm5, %zmm6{%k7}		# AVX512VBMI
-	vpermb %xmm4, %xmm5, %xmm6{%k7}		# AVX512VBMI + AVX512VL
-	vpermb %ymm4, %ymm5, %ymm6{%k7}		# AVX512VBMI + AVX512VL
-
-	.arch .noavx512vbmi
-	vpabsb %zmm5, %zmm6{%k7}		# AVX512BW
-	vpabsb %xmm5, %xmm6{%k7}		# AVX512BW + AVX512VL
-	vpabsb %ymm5, %ymm6{%k7}		# AVX512BW + AVX512VL
-	vpconflictd %zmm5, %zmm6		# AVX412CD
-	vpconflictd %xmm5, %xmm6		# AVX412CD + AVX512VL
-	vpconflictd %ymm5, %ymm6		# AVX412CD + AVX512VL
-	vcvtpd2qq (%ecx), %zmm6{%k7}		# AVX512DQ
-	vcvtpd2qq (%ecx), %xmm6{%k7}		# AVX512DQ + AVX512VL
-	vcvtpd2qq (%ecx), %ymm6{%k7}		# AVX512DQ + AVX512VL
-	vexp2ps %zmm5, %zmm6{%k7}		# AVX512ER
-	vaddpd %zmm4, %zmm5, %zmm6{%k7}		# AVX512F
-	vaddpd %xmm4, %xmm5, %xmm6{%k7}		# AVX512F + AVX512VL
-	vaddpd %ymm4, %ymm5, %ymm6{%k7}		# AVX512F + AVX512VL
-	vpmadd52luq %zmm4, %zmm5, %zmm6{%k7}	# AVX512IFMA
-	vpmadd52luq %xmm4, %xmm5, %xmm6{%k7}	# AVX512IFMA + AVX512VL
-	vpmadd52luq %ymm4, %ymm5, %ymm6{%k7}	# AVX512IFMA + AVX512VL
-	vgatherpf0dpd 23(%ebp,%ymm7,8){%k1}	# AVX512PF
-	vpermb %zmm4, %zmm5, %zmm6{%k7}		# AVX512VBMI
-	vpermb %xmm4, %xmm5, %xmm6{%k7}		# AVX512VBMI + AVX512VL
-	vpermb %ymm4, %ymm5, %ymm6{%k7}		# AVX512VBMI + AVX512VL
-
-	.arch .noavx512f
-	vpabsb %zmm5, %zmm6{%k7}		# AVX512BW
-	vpabsb %xmm5, %xmm6{%k7}		# AVX512BW + AVX512VL
-	vpabsb %ymm5, %ymm6{%k7}		# AVX512BW + AVX512VL
-	vpconflictd %zmm5, %zmm6		# AVX412CD
-	vpconflictd %xmm5, %xmm6		# AVX412CD + AVX512VL
-	vpconflictd %ymm5, %ymm6		# AVX412CD + AVX512VL
-	vcvtpd2qq (%ecx), %zmm6{%k7}		# AVX512DQ
-	vcvtpd2qq (%ecx), %xmm6{%k7}		# AVX512DQ + AVX512VL
-	vcvtpd2qq (%ecx), %ymm6{%k7}		# AVX512DQ + AVX512VL
-	vexp2ps %zmm5, %zmm6{%k7}		# AVX512ER
-	vaddpd %zmm4, %zmm5, %zmm6{%k7}		# AVX512F
-	vaddpd %xmm4, %xmm5, %xmm6{%k7}		# AVX512F + AVX512VL
-	vaddpd %ymm4, %ymm5, %ymm6{%k7}		# AVX512F + AVX512VL
-	vpmadd52luq %zmm4, %zmm5, %zmm6{%k7}	# AVX512IFMA
-	vpmadd52luq %xmm4, %xmm5, %xmm6{%k7}	# AVX512IFMA + AVX512VL
-	vpmadd52luq %ymm4, %ymm5, %ymm6{%k7}	# AVX512IFMA + AVX512VL
-	vgatherpf0dpd 23(%ebp,%ymm7,8){%k1}	# AVX512PF
-	vpermb %zmm4, %zmm5, %zmm6{%k7}		# AVX512VBMI
-	vpermb %xmm4, %xmm5, %xmm6{%k7}		# AVX512VBMI + AVX512VL
-	vpermb %ymm4, %ymm5, %ymm6{%k7}		# AVX512VBMI + AVX512VL
+	.endr
 
 	vpabsb %xmm5, %xmm6
 	vpabsb %ymm5, %ymm6



* [PATCH 06/10] x86: correct dependencies of a few AVX512 sub-features
  2022-12-19  8:31 [PATCH 00/10] x86: re-work ISA extension dependency handling Jan Beulich
                   ` (4 preceding siblings ...)
  2022-12-19 10:46 ` [PATCH 05/10] x86: rework noavx512-1 testcase Jan Beulich
@ 2022-12-19 10:46 ` Jan Beulich
  2022-12-19 10:47 ` [PATCH 07/10] x86: correct XSAVE* dependencies Jan Beulich
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Jan Beulich @ 2022-12-19 10:46 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

Like AVX512-FP16, several other extensions require mask registers wider
than 16 bits. They therefore take AVX512BW as a prereq, not (just)
AVX512F. This in turn exposes wrong expectations in the noavx512-1
testcase.
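
For illustration only (not part of the patch): a minimal C sketch of how
reverse dependencies could be derived from direct forward ones, so that
".arch .noavx512bw" also turns off AVX512VBMI. The struct, the trimmed
table and the helpers depends_on() / disabled_by() are hypothetical and
assume a single prerequisite per entry; the actual generator code in
opcodes/i386-gen.c is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Direct forward dependencies only.  Trimmed, hypothetical table that
   mirrors a few isa_dependencies[] entries as they stand after this
   patch.  */
struct dep { const char *isa; const char *needs; };

static const struct dep deps[] = {
  { "AVX512BW",    "AVX512F"  },
  { "AVX512IFMA",  "AVX512F"  },
  { "AVX512VBMI",  "AVX512BW" },
  { "AVX512_FP16", "AVX512BW" },
};
#define NDEPS (sizeof deps / sizeof deps[0])

/* Hypothetical helper: does ISA depend on BASE, directly or
   transitively?  Assumes at most one table entry per ISA.  */
static bool
depends_on (const char *isa, const char *base)
{
  assert (isa != NULL && base != NULL);
  for (size_t i = 0; i < NDEPS; i++)
    if (strcmp (deps[i].isa, isa) == 0)
      return strcmp (deps[i].needs, base) == 0
	     || depends_on (deps[i].needs, base);
  return false;
}

/* Hypothetical helper: would ".arch .no<base>" switch ISA off?  */
static bool
disabled_by (const char *isa, const char *base)
{
  return strcmp (isa, base) == 0 || depends_on (isa, base);
}
```

With this table disabled_by ("AVX512VBMI", "AVX512BW") is true, matching
the new vpermb expectations under ".arch .noavx512bw", while AVX512IFMA
remains unaffected by that directive.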

--- a/gas/testsuite/gas/i386/noavx512-1.l
+++ b/gas/testsuite/gas/i386/noavx512-1.l
@@ -2,6 +2,9 @@
 .*:8: Error: .*operand size mismatch.*
 .*:9: Error: .*unsupported masking.*
 .*:10: Error: .*unsupported masking.*
+.*:25: Error: .*not supported.*
+.*:26: Error: .*not supported.*
+.*:27: Error: .*not supported.*
 .*:11: Error: .*not supported.*
 .*:12: Error: .*not supported.*
 .*:13: Error: .*not supported.*
@@ -120,12 +123,9 @@
 [ 	]*[0-9]+[ 	]+\?\?\?\? 62F2FD49 	>  vgatherpf0dpd 23\(%ebp,%ymm7,8\)\{%k1\}
 [ 	]*[0-9]+[ 	]+C68CFD17 *
 [ 	]*[0-9]+[ 	]+000000
-[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2554F 	>  vpermb %zmm4,%zmm5,%zmm6\{%k7\}
-[ 	]*[0-9]+[ 	]+8DF4
-[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2550F 	>  vpermb %xmm4,%xmm5,%xmm6\{%k7\}
-[ 	]*[0-9]+[ 	]+8DF4
-[ 	]*[0-9]+[ 	]+\?\?\?\? 62F2552F 	>  vpermb %ymm4,%ymm5,%ymm6\{%k7\}
-[ 	]*[0-9]+[ 	]+8DF4
+[ 	]*[0-9]+[ 	]+>  vpermb %zmm4,%zmm5,%zmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpermb %xmm4,%xmm5,%xmm6\{%k7\}
+[ 	]*[0-9]+[ 	]+>  vpermb %ymm4,%ymm5,%ymm6\{%k7\}
 #...
 [ 	]*[0-9]+[ 	]+>  \.arch default
 [ 	]*[0-9]+[ 	]+>  \.arch \.noavx512cd
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -184,7 +184,7 @@ static const dependency isa_dependencies
   { "AVX512IFMA",
     "AVX512F" },
   { "AVX512VBMI",
-    "AVX512F" },
+    "AVX512BW" },
   { "AVX512_4FMAPS",
     "AVX512F" },
   { "AVX512_4VNNIW",
@@ -192,15 +192,15 @@ static const dependency isa_dependencies
   { "AVX512_VPOPCNTDQ",
     "AVX512F" },
   { "AVX512_VBMI2",
-    "AVX512F" },
+    "AVX512BW" },
   { "AVX512_VNNI",
     "AVX512F" },
   { "AVX512_BITALG",
-    "AVX512F" },
+    "AVX512BW" },
   { "AVX512_VP2INTERSECT",
     "AVX512F" },
   { "AVX512_BF16",
-    "AVX512F" },
+    "AVX512BW" },
   { "AVX512_FP16",
     "AVX512BW" },
   { "IAMCU",



* [PATCH 07/10] x86: correct XSAVE* dependencies
  2022-12-19  8:31 [PATCH 00/10] x86: re-work ISA extension dependency handling Jan Beulich
                   ` (5 preceding siblings ...)
  2022-12-19 10:46 ` [PATCH 06/10] x86: correct dependencies of a few AVX512 sub-features Jan Beulich
@ 2022-12-19 10:47 ` Jan Beulich
  2022-12-19 10:47 ` [PATCH 08/10] x86: add dependencies on VMX Jan Beulich
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Jan Beulich @ 2022-12-19 10:47 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

Like various other features, AMX-TILE takes XSAVE as a prereq.

XSAVES, unconditionally using compacted format, in turn effectively
takes XSAVEC as a prereq (an SDM clarification to this effect is in the
works).
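
For illustration only (not part of the patch): a small C sketch of the
forward direction, i.e. what enabling one extension pulls in once only
direct prerequisites are listed. The table is a hypothetical excerpt
limited to entries visible in this patch, and direct_dep() /
enable_closure() are made-up helper names that assume one prerequisite
per extension.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct dep { const char *isa; const char *needs; };

/* Hypothetical excerpt: direct prerequisites after this patch.  */
static const struct dep deps[] = {
  { "XSAVES",   "XSAVEC"   },
  { "XSAVEC",   "XSAVE"    },
  { "AMX_TILE", "XSAVE"    },
  { "AMX_INT8", "AMX_TILE" },
};
#define NDEPS (sizeof deps / sizeof deps[0])

/* Return the direct prerequisite of ISA, or NULL if it has none.  */
static const char *
direct_dep (const char *isa)
{
  for (size_t i = 0; i < NDEPS; i++)
    if (strcmp (deps[i].isa, isa) == 0)
      return deps[i].needs;
  return NULL;
}

/* Store ISA followed by its transitive prerequisites in OUT (at most
   MAX entries) and return how many were stored.  */
static size_t
enable_closure (const char *isa, const char *out[], size_t max)
{
  size_t n = 0;

  assert (isa != NULL && out != NULL);
  for (const char *cur = isa; cur != NULL && n < max; cur = direct_dep (cur))
    out[n++] = cur;
  return n;
}
```

Starting from "XSAVES" the walk yields XSAVES, XSAVEC, XSAVE in that
order; before this patch the XSAVEC step would have been skipped.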

--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -210,7 +210,7 @@ static const dependency isa_dependencies
   { "SHA",
     "SSE2" },
   { "XSAVES",
-    "XSAVE" },
+    "XSAVEC" },
   { "XSAVEC",
     "XSAVE" },
   { "OSPKE",
@@ -221,6 +221,8 @@ static const dependency isa_dependencies
     "AVX2" },
   { "VPCLMULQDQ",
     "AVX2" },
+  { "AMX_TILE",
+    "XSAVE" },
   { "AMX_INT8",
     "AMX_TILE" },
   { "AMX_BF16",



* [PATCH 08/10] x86: add dependencies on VMX
  2022-12-19  8:31 [PATCH 00/10] x86: re-work ISA extension dependency handling Jan Beulich
                   ` (6 preceding siblings ...)
  2022-12-19 10:47 ` [PATCH 07/10] x86: correct XSAVE* dependencies Jan Beulich
@ 2022-12-19 10:47 ` Jan Beulich
  2022-12-19 10:48 ` [PATCH 09/10] x86: add dependencies on SVME Jan Beulich
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Jan Beulich @ 2022-12-19 10:47 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

Both EPT and VMFUNC are extensions to VMX.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -1026,8 +1026,8 @@ static const arch_entry cpu_arch[] =
   SUBARCH (avx512dq, AVX512DQ, ANY_AVX512DQ, false),
   SUBARCH (avx512bw, AVX512BW, ANY_AVX512BW, false),
   SUBARCH (avx512vl, AVX512VL, ANY_AVX512VL, false),
-  SUBARCH (vmx, VMX, VMX, false),
-  SUBARCH (vmfunc, VMFUNC, VMFUNC, false),
+  SUBARCH (vmx, VMX, ANY_VMX, false),
+  SUBARCH (vmfunc, VMFUNC, ANY_VMFUNC, false),
   SUBARCH (smx, SMX, SMX, false),
   SUBARCH (xsave, XSAVE, ANY_XSAVE, false),
   SUBARCH (xsaveopt, XSAVEOPT, ANY_XSAVEOPT, false),
@@ -1046,7 +1046,7 @@ static const arch_entry cpu_arch[] =
   SUBARCH (lwp, LWP, ANY_LWP, false),
   SUBARCH (movbe, MOVBE, MOVBE, false),
   SUBARCH (cx16, CX16, CX16, false),
-  SUBARCH (ept, EPT, EPT, false),
+  SUBARCH (ept, EPT, ANY_EPT, false),
   SUBARCH (lzcnt, LZCNT, LZCNT, false),
   SUBARCH (popcnt, POPCNT, POPCNT, false),
   SUBARCH (hle, HLE, HLE, false),
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -205,6 +205,10 @@ static const dependency isa_dependencies
     "AVX512BW" },
   { "IAMCU",
     "586:nofpu" },
+  { "EPT",
+    "VMX" },
+  { "VMFUNC",
+    "VMX" },
   { "MPX",
     "XSAVE" },
   { "SHA",



* [PATCH 09/10] x86: add dependencies on SVME
  2022-12-19  8:31 [PATCH 00/10] x86: re-work ISA extension dependency handling Jan Beulich
                   ` (7 preceding siblings ...)
  2022-12-19 10:47 ` [PATCH 08/10] x86: add dependencies on VMX Jan Beulich
@ 2022-12-19 10:48 ` Jan Beulich
  2022-12-19 10:48 ` [PATCH 10/10] x86: correct/improve TSX controls Jan Beulich
  2022-12-20  2:25 ` [PATCH 00/10] x86: re-work ISA extension dependency handling H.J. Lu
  10 siblings, 0 replies; 14+ messages in thread
From: Jan Beulich @ 2022-12-19 10:48 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

SEV-ES is an extension to SVME. SNP in turn is an extension to SEV-ES,
and RMPQUERY in turn is an extension to SNP.

Note that cpu_arch[] has no SNP entry, so CPU_ANY_SNP_FLAGS remains
unused (just like CPU_SNP_FLAGS already is).

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -1059,8 +1059,8 @@ static const arch_entry cpu_arch[] =
   SUBARCH (3dnow, 3DNOW, ANY_3DNOW, false),
   SUBARCH (3dnowa, 3DNOWA, ANY_3DNOWA, false),
   SUBARCH (padlock, PADLOCK, PADLOCK, false),
-  SUBARCH (pacifica, SVME, SVME, true),
-  SUBARCH (svme, SVME, SVME, false),
+  SUBARCH (pacifica, SVME, ANY_SVME, true),
+  SUBARCH (svme, SVME, ANY_SVME, false),
   SUBARCH (abm, ABM, ABM, false),
   SUBARCH (bmi, BMI, BMI, false),
   SUBARCH (tbm, TBM, TBM, false),
@@ -1111,7 +1111,7 @@ static const arch_entry cpu_arch[] =
   SUBARCH (serialize, SERIALIZE, SERIALIZE, false),
   SUBARCH (rdpru, RDPRU, RDPRU, false),
   SUBARCH (mcommit, MCOMMIT, MCOMMIT, false),
-  SUBARCH (sev_es, SEV_ES, SEV_ES, false),
+  SUBARCH (sev_es, SEV_ES, ANY_SEV_ES, false),
   SUBARCH (tsxldtrk, TSXLDTRK, TSXLDTRK, false),
   SUBARCH (kl, KL, ANY_KL, false),
   SUBARCH (widekl, WIDEKL, ANY_WIDEKL, false),
@@ -1126,7 +1126,7 @@ static const arch_entry cpu_arch[] =
   SUBARCH (msrlist, MSRLIST, MSRLIST, false),
   SUBARCH (avx_ne_convert, AVX_NE_CONVERT, ANY_AVX_NE_CONVERT, false),
   SUBARCH (rao_int, RAO_INT, RAO_INT, false),
-  SUBARCH (rmpquery, RMPQUERY, RMPQUERY, false),
+  SUBARCH (rmpquery, RMPQUERY, ANY_RMPQUERY, false),
 };
 
 #undef SUBARCH
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -225,6 +225,12 @@ static const dependency isa_dependencies
     "AVX2" },
   { "VPCLMULQDQ",
     "AVX2" },
+  { "SEV_ES",
+    "SVME" },
+  { "SNP",
+    "SEV_ES" },
+  { "RMPQUERY",
+    "SNP" },
   { "AMX_TILE",
     "XSAVE" },
   { "AMX_INT8",



* [PATCH 10/10] x86: correct/improve TSX controls
From: Jan Beulich @ 2022-12-19 10:48 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

TSXLDTRK has RTM as a prerequisite. Additionally introduce an umbrella
"tsx" extension option covering both RTM and HLE, paralleling the "abm"
one we already have.
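
As a rough illustration of the intended semantics, here is a small C
sketch using plain bit masks. Everything in it is hypothetical (the F_*,
EN_* and DIS_* names, arch_enable, arch_disable, tsx_selfcheck); the
assembler itself works on its i386_cpu_flags structure instead. It also
assumes that "tsx" has no bit of its own and simply stands for RTM plus
HLE, and that disabling RTM is meant to take TSXLDTRK along.

```c
#include <assert.h>

/* Hypothetical one-bit-per-feature encoding, for illustration only.  */
enum {
  F_RTM      = 1u << 0,
  F_HLE      = 1u << 1,
  F_TSXLDTRK = 1u << 2
};

/* Assumption: the "tsx" umbrella is just RTM and HLE together.  */
#define EN_TSX       (F_RTM | F_HLE)
/* TSXLDTRK needs RTM, so enabling it pulls RTM in as well.  */
#define EN_TSXLDTRK  (F_TSXLDTRK | F_RTM)
/* Assumption: disabling RTM also removes what depends on it.  */
#define DIS_RTM      (F_RTM | F_TSXLDTRK)

/* Model of ".arch .<ext>": add the given bits to the current set.  */
unsigned int
arch_enable (unsigned int cur, unsigned int set)
{
  return cur | set;
}

/* Model of ".arch .no<ext>": remove the given bits from the current set.  */
unsigned int
arch_disable (unsigned int cur, unsigned int set)
{
  return cur & ~set;
}

/* Usage example.  */
void
tsx_selfcheck (void)
{
  unsigned int cur = 0;

  cur = arch_enable (cur, EN_TSX);        /* RTM and HLE are now on.  */
  assert (cur == (F_RTM | F_HLE));

  cur = arch_enable (cur, EN_TSXLDTRK);   /* Adds TSXLDTRK; RTM already on.  */
  assert (cur == (F_RTM | F_HLE | F_TSXLDTRK));

  cur = arch_disable (cur, DIS_RTM);      /* RTM and TSXLDTRK go, HLE stays.  */
  assert (cur == F_HLE);
}
```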

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -1050,7 +1050,8 @@ static const arch_entry cpu_arch[] =
   SUBARCH (lzcnt, LZCNT, LZCNT, false),
   SUBARCH (popcnt, POPCNT, POPCNT, false),
   SUBARCH (hle, HLE, HLE, false),
-  SUBARCH (rtm, RTM, RTM, false),
+  SUBARCH (rtm, RTM, ANY_RTM, false),
+  SUBARCH (tsx, TSX, TSX, false),
   SUBARCH (invpcid, INVPCID, INVPCID, false),
   SUBARCH (clflush, CLFLUSH, CLFLUSH, false),
   SUBARCH (nop, NOP, NOP, false),
@@ -1112,7 +1113,7 @@ static const arch_entry cpu_arch[] =
   SUBARCH (rdpru, RDPRU, RDPRU, false),
   SUBARCH (mcommit, MCOMMIT, MCOMMIT, false),
   SUBARCH (sev_es, SEV_ES, ANY_SEV_ES, false),
-  SUBARCH (tsxldtrk, TSXLDTRK, TSXLDTRK, false),
+  SUBARCH (tsxldtrk, TSXLDTRK, ANY_TSXLDTRK, false),
   SUBARCH (kl, KL, ANY_KL, false),
   SUBARCH (widekl, WIDEKL, ANY_WIDEKL, false),
   SUBARCH (uintr, UINTR, UINTR, false),
--- a/gas/doc/c-i386.texi
+++ b/gas/doc/c-i386.texi
@@ -227,6 +227,7 @@ accept various extension mnemonics.  For
 @code{popcnt},
 @code{hle},
 @code{rtm},
+@code{tsx},
 @code{invpcid},
 @code{clflush},
 @code{mwaitx},
@@ -1485,8 +1486,8 @@ supported on the CPU specified.  The cho
 @item @samp{.aes} @tab @samp{.pclmul} @tab @samp{.fma} @tab @samp{.fsgsbase}
 @item @samp{.rdrnd} @tab @samp{.f16c} @tab @samp{.avx2} @tab @samp{.bmi2}
 @item @samp{.lzcnt} @tab @samp{.popcnt} @tab @samp{.invpcid} @tab @samp{.vmfunc}
-@item @samp{.hle}
-@item @samp{.rtm} @tab @samp{.adx} @tab @samp{.rdseed} @tab @samp{.prfchw}
+@item @samp{.hle} @tab @samp{.rtm} @tab @samp{.tsx}
+@item @samp{.adx} @tab @samp{.rdseed} @tab @samp{.prfchw}
 @item @samp{.smap} @tab @samp{.mpx} @tab @samp{.sha} @tab @samp{.prefetchwt1}
 @item @samp{.clflushopt} @tab @samp{.xsavec} @tab @samp{.xsaves} @tab @samp{.se1}
 @item @samp{.avx512f} @tab @samp{.avx512cd} @tab @samp{.avx512er} @tab @samp{.avx512pf}
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -231,6 +231,10 @@ static const dependency isa_dependencies
     "SEV_ES" },
   { "RMPQUERY",
     "SNP" },
+  { "TSX",
+    "RTM|HLE" },
+  { "TSXLDTRK",
+    "RTM" },
   { "AMX_TILE",
     "XSAVE" },
   { "AMX_INT8",



* Re: [PATCH 00/10] x86: re-work ISA extension dependency handling
From: H.J. Lu @ 2022-12-20  2:25 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Binutils

On Mon, Dec 19, 2022 at 12:31 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> Getting both forward and reverse ISA dependencies right / consistent has
> been a permanent source of mistakes, myself included. Reduce what needs
> specifying manually to just the direct forward dependencies. Plus a
> number of dependencies weren't put in place at all.
>
> 01: re-work ISA extension dependency handling
> 02: correct what gets disabled by certain ".arch .no*"
> 03: correct SSE dependencies
> 04: add dependencies on AVX2
> 05: rework noavx512-1 testcase
> 06: correct dependencies of a few AVX512 sub-features
> 07: correct XSAVE* dependencies
> 08: add dependencies on VMX
> 09: add dependencies on SVME
> 10: correct/improve TSX controls
>
> Jan

If a CPUID feature, like X, implies another CPUID feature, Y,
disabling X shouldn't disable Y.  Will this patch set still support
this without adding CpuX to all Y instructions?

-- 
H.J.


* Re: [PATCH 00/10] x86: re-work ISA extension dependency handling
From: Jan Beulich @ 2022-12-20  8:09 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Binutils

On 20.12.2022 03:25, H.J. Lu wrote:
> On Mon, Dec 19, 2022 at 12:31 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> Getting both forward and reverse ISA dependencies right / consistent has
>> been a permanent source of mistakes, myself included. Reduce what needs
>> specifying manually to just the direct forward dependencies. Plus a
>> number of dependencies weren't put in place at all.
>>
>> 01: re-work ISA extension dependency handling
>> 02: correct what gets disabled by certain ".arch .no*"
>> 03: correct SSE dependencies
>> 04: add dependencies on AVX2
>> 05: rework noavx512-1 testcase
>> 06: correct dependencies of a few AVX512 sub-features
>> 07: correct XSAVE* dependencies
>> 08: add dependencies on VMX
>> 09: add dependencies on SVME
>> 10: correct/improve TSX controls
> 
> If a CPUID feature, like X, implies another CPUID feature, Y,
> disable X shouldn't disable Y.  Will this patch set still support
> this without adding CpuX to all Y instructions?

This series doesn't alter behavior in this regard. That can also be seen
from there being no changes to the insn templates, nor to the respective
test cases. The one testcase the series does touch is altered separately
first, precisely to demonstrate that the assembler's behavior doesn't
change, except of course for the addition of previously missing
connections between ISA extensions. The meaning of CPU{,_ANY}_*_FLAGS
remains exactly the same; only the way they're calculated changes.

Jan


* [PATCH 00/10] x86: re-work ISA extension dependency handling
From: Jan Beulich @ 2022-12-19 10:35 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

Getting both forward and reverse ISA dependencies right / consistent has
been a permanent source of mistakes, myself included. Reduce what needs
specifying manually to just the direct forward dependencies. Plus a
number of dependencies weren't put in place at all.

01: re-work ISA extension dependency handling
02: correct what gets disabled by certain ".arch .no*"
03: correct SSE dependencies
04: add dependencies on AVX2
05: rework noavx512-1 testcase
06: correct dependencies of a few AVX512 sub-features
07: correct XSAVE* dependencies
08: add dependencies on VMX
09: add dependencies on SVME
10: correct/improve TSX controls

Jan

